## Research on Big Data Aggregation Methods Based on the MAP-REDUCE Model

A Study of the Data Aggregation Method Based on Map Reduce
Abstract: With the development of computer networks and the widespread availability of the Internet, data on the Internet is growing and accumulating at an unprecedented speed, and big data has entered our lives. Improving the efficiency of processing massive data has become a focus of the development of modern society. The Map Reduce parallel programming environment has been widely used in the field of big data processing. At the same time, academia has contributed a great deal to algorithms related to Map Reduce, effectively promoting its development. This paper focuses on data aggregation methods based on the Map Reduce model. It uses Map and Reduce functions to implement the count, grouping, and maximum and minimum operations, and puts forward an aggregation algorithm for each. These aggregation algorithms can improve computational efficiency and effectively reduce running time. To this end, the algorithms are designed around the features and advantages of Map Reduce for big data.
Key Words: Map Reduce; Aggregation algorithm; Count; Maximum or Minimum

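1. Research Status and Significance
1.1 Research Status and Significance
2. Introduction to Map Reduce
2.1 Mapping and Reducing with Map Reduce Functions
2.2 Reliability
2.3 Main Functions of Map Reduce
2.4 Advantages and Disadvantages of Map Reduce
3. Big Data Aggregation Operations
3.1 Count Operation
3.2 Grouping Operation
3.3 Maximum and Minimum Operations
4. Analysis of the Operations
4.1 Count Operation
4.2 Grouping Operation
4.3 Maximum and Minimum Operations
5. Experimental Procedure and Results
5.1 Implementing the Count Operation
5.2 Implementing the Grouping Operation
5.3 Implementing the Maximum and Minimum Operations
6. Conclusion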

1. Research Status and Significance
1.1 Research Status and Significance
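Map Reduce is a map-and-reduce computation for aggregating large-scale data. The main ideas of Map and Reduce are borrowed from functional programming languages. They make it convenient for programmers to complete tasks in a distributed parallel programming environment and to run their own programs on a distributed system. The computation is implemented by specifying a Map function, which maps each group of input data to a set of new key-value pairs, and a Reduce function, which processes all of the mapped key-value pairs so that the pairs in each group share the same key.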
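
The Map and Reduce roles described above can be illustrated with a minimal single-process sketch. It is not the Hadoop API and is not taken from this paper's experiments; the names `run_map_reduce`, `map_words`, and `reduce_count` are hypothetical and chosen only for illustration. The sketch assumes that the Map function emits (key, value) pairs, that pairs sharing a key are collected into one group, and that the Reduce function is then applied once per group.

```python
from collections import defaultdict


def run_map_reduce(records, map_fn, reduce_fn):
    """Simulate Map Reduce in one process (illustrative only, not distributed)."""
    # Map phase: every input record is turned into zero or more (key, value) pairs.
    # Grouping step: values that share the same key are collected together.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Reduce phase: each group of values sharing a key is merged into one result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def map_words(line):
    """Hypothetical Map function: emit (word, 1) for every word in a line."""
    return [(word, 1) for word in line.split()]


def reduce_count(key, values):
    """Hypothetical Reduce function: sum the values collected for one key."""
    return sum(values)


lines = ["big data map reduce", "map reduce data"]
counts = run_map_reduce(lines, map_words, reduce_count)
print(counts)
```

Swapping the Reduce function changes the aggregation without touching the Map or grouping steps. For example, passing `lambda key, values: max(values)` as the Reduce function would yield a per-key maximum instead of a count.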
