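First, download the NCDC weather dataset and upload it to the Hadoop Distributed File System (HDFS). The following MapReduce program then finds the minimum temperature for each year: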

Mapper:

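```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // NCDC encodes a missing air temperature reading as +9999.
    private static final int MISSING_TEMPERATURE = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        if (line.length() < 92) {
            return; // skip truncated records that lack the temperature field
        }
        String year = line.substring(15, 19);
        int temperature;
        // Skip an explicit '+' sign before parsing; a '-' sign is kept.
        if (line.charAt(87) == '+') {
            temperature = Integer.parseInt(line.substring(88, 92));
        } else {
            temperature = Integer.parseInt(line.substring(87, 92));
        }
        if (temperature != MISSING_TEMPERATURE) {
            context.write(new Text(year), new IntWritable(temperature));
        }
    }
}
```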
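
The fixed-width offsets used by the mapper can be tried out without a Hadoop cluster. The following is a minimal sketch in plain Java, not part of the job itself: the class `NcdcFieldSketch` and the helper `sampleRecord` are hypothetical names, and the record it builds is synthetic (zero-filled except for the year and temperature fields), not real NCDC data.

```java
import java.util.Arrays;

final class NcdcFieldSketch {

    /** Builds a synthetic 93-character record; only the two fields set here are meaningful. */
    static String sampleRecord(String year, String signedTemperature) {
        char[] chars = new char[93];
        Arrays.fill(chars, '0');
        year.getChars(0, 4, chars, 15);              // indexes 15-18: year
        signedTemperature.getChars(0, 5, chars, 87); // indexes 87-91: sign plus four digits
        return new String(chars);
    }

    /** Same offsets as the mapper: the year occupies indexes 15 to 18. */
    static String parseYear(String line) {
        return line.substring(15, 19);
    }

    /** Same logic as the mapper: drop a leading '+', keep a leading '-'. */
    static int parseTemperature(String line) {
        int start = line.charAt(87) == '+' ? 88 : 87;
        return Integer.parseInt(line.substring(start, 92));
    }

    public static void main(String[] args) {
        String line = sampleRecord("1950", "-0011");
        // prints "1950 -11"
        System.out.println(parseYear(line) + " " + parseTemperature(line));
    }
}
```

A record built with `sampleRecord("1950", "+9999")` parses to 9999, which is the value the mapper filters out as missing.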

  

Reducer:

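```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class TemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Track the smallest reading seen for this year.
        int minTemperature = Integer.MAX_VALUE;
        for (IntWritable value : values) {
            minTemperature = Math.min(minTemperature, value.get());
        }
        context.write(key, new IntWritable(minTemperature));
    }
}
```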
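
To see what the reducer receives, it may help to simulate the grouping step locally. The sketch below is plain Java with no Hadoop dependency: `MinTemperatureSketch`, `groupByYear`, and `minOf` are hypothetical names, and the input pairs are made-up values standing in for mapper output. It groups the pairs by year, roughly as the shuffle phase does, and then applies the same minimum logic as the reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MinTemperatureSketch {

    /** Groups (year, temperature) pairs by year, sorted by key, similar to the shuffle phase. */
    static Map<String, List<Integer>> groupByYear(String[][] pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>())
                   .add(Integer.parseInt(pair[1]));
        }
        return grouped;
    }

    /** Mirrors the reducer body: the smallest value in one year's group. */
    static int minOf(Iterable<Integer> values) {
        int min = Integer.MAX_VALUE;
        for (int v : values) {
            min = Math.min(min, v);
        }
        return min;
    }

    public static void main(String[] args) {
        // Hypothetical mapper output, in arbitrary order.
        String[][] mapperOutput = {
            {"1950", "0"}, {"1949", "111"}, {"1950", "-11"}, {"1949", "78"}
        };
        for (Map.Entry<String, List<Integer>> e : groupByYear(mapperOutput).entrySet()) {
            System.out.println(e.getKey() + "\t" + minOf(e.getValue()));
        }
    }
}
```

Starting from `Integer.MAX_VALUE` is safe in the real job because Hadoop only calls `reduce` for keys that have at least one value.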

   
  

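Finally, run the MapReduce program with the following command:

```
hadoop jar <jar-path> <input-path> <output-path>
```

Here `<jar-path>` is the path to the JAR file containing the MapReduce program, `<input-path>` is the HDFS directory that holds the NCDC dataset, and `<output-path>` is the HDFS directory for the results. This form assumes the JAR's manifest declares the main (driver) class; otherwise, pass the driver class name right after the JAR path.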

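After the job completes successfully, the output directory contains the minimum temperature for each year. The values are integers in tenths of a degree Celsius, the unit used in the NCDC records.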
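
Reading the results back might look like the following sketch. It assumes the job uses Hadoop's default text output, where each line holds the key and the value separated by a tab, and that the values are in tenths of a degree Celsius. The class `OutputLineSketch`, the method `describe`, and the sample line are hypothetical and only illustrate the conversion.

```java
import java.util.Locale;

final class OutputLineSketch {

    /** Turns one "year<TAB>tenths" output line into "year: x.y C". */
    static String describe(String outputLine) {
        String[] fields = outputLine.split("\t");
        int tenths = Integer.parseInt(fields[1]);
        // Divide by 10.0 to convert tenths of a degree into degrees Celsius.
        return String.format(Locale.ROOT, "%s: %.1f C", fields[0], tenths / 10.0);
    }

    public static void main(String[] args) {
        // prints "1950: -1.1 C"
        System.out.println(describe("1950\t-11"));
    }
}
```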

