How to write mapreduce programming in spark by using java on user-defined javaPairRDD?

付雅丹 Tue, 07 Jul 2015 07:20:10 -0700

Hi, everyone!

I have <key, value> pairs of type <LongWritable, Text>, which I read with the
following code:


SparkConf conf = new SparkConf().setAppName("MapReduceFileInput");
JavaSparkContext sc = new JavaSparkContext(conf);
Configuration confHadoop = new Configuration();

// Read the file as (LongWritable, Text) pairs through the new Hadoop API.
JavaPairRDD<LongWritable, Text> sourceFile = sc.newAPIHadoopFile(
        "hdfs://cMaster:9000/wcinput/data.txt",
        DataInputFormat.class, LongWritable.class, Text.class, confHadoop);

Now I want to transform this JavaPairRDD from <LongWritable, Text> into
another <LongWritable, Text> in which the Text content is different. After
that, I want to write the Text values to HDFS, ordered by the LongWritable
key. But I don't know how to write MapReduce-style functions in Spark using
Java. Can someone help me?


Sincerely,
Missie.
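
For readers following the thread, here is a rough sketch of the three steps asked about above (rewrite each value, order by key, emit the values), written in plain Java so it runs without a cluster. Everything in it is an assumption made for illustration: the class name SortedTransformSketch, the helper record, and the upper-casing transform are hypothetical stand-ins, and the keys and values are modelled as Long and String rather than LongWritable and Text.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Plain-Java model of the pipeline: map each value, sort by key, emit values.
final class SortedTransformSketch {

    // Hypothetical per-record transformation; replace with the real logic.
    static String transform(String text) {
        return text.toUpperCase();
    }

    // Hypothetical helper that builds one (key, text) record.
    static Map.Entry<Long, String> record(long key, String text) {
        return new SimpleEntry<>(key, text);
    }

    static List<String> transformAndSort(List<Map.Entry<Long, String>> records) {
        // Step 1: rewrite the value of every pair and keep its key.
        List<Map.Entry<Long, String>> mapped = new ArrayList<>();
        for (Map.Entry<Long, String> r : records) {
            mapped.add(new SimpleEntry<>(r.getKey(), transform(r.getValue())));
        }
        // Step 2: order the pairs by their long key, ascending.
        mapped.sort(Map.Entry.comparingByKey());
        // Step 3: keep only the values, now in key order.
        List<String> values = new ArrayList<>();
        for (Map.Entry<Long, String> e : mapped) {
            values.add(e.getValue());
        }
        return values;
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, String>> records = Arrays.asList(
                record(20L, "second line"),
                record(0L, "first line"));
        transformAndSort(records).forEach(System.out::println);
    }
}
```

In Spark's Java API these steps would probably correspond to JavaPairRDD.mapToPair for step 1, sortByKey for step 2, and values() followed by saveAsTextFile for step 3. Since LongWritable and Text do not implement java.io.Serializable, it is likely necessary to convert them to Long and String inside the mapToPair step before sorting, at least with the default Java serializer.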
