Hi,

I am trying to understand Hadoop's map method compared to Spark's map, and I
noticed that Spark's map only deals with three things: 1) the input value, 2)
the output key, and 3) the output value. In Hadoop's map, however, there are
four: 1) input key, 2) input value, 3) output key, and 4) output value. Is
there any reason it was designed this way? Just trying to understand:

Hadoop:

public void map(K key, V val,
                OutputCollector<K, V> output, Reporter reporter)

--------


// Count each word in each batch
JavaPairDStream<String, Integer> pairs = words.mapToPair(
  new PairFunction<String, String, Integer>() {
    @Override
    public Tuple2<String, Integer> call(String s) throws Exception {
      return new Tuple2<String, Integer>(s, 1);
    }
  });
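One way to see the difference side by side, without pulling in either framework: the sketch below uses two made-up interfaces (`HadoopStyleMapper`, `SparkStylePairFunction` -- not the real Hadoop or Spark APIs) that only imitate the two call shapes. In the Hadoop shape, the input key (e.g. the line's byte offset) is part of the signature even when the mapper ignores it, and output goes through a collector-like sink, so one call can emit many pairs. In the Spark shape, `call` sees only the value and returns exactly one pair (a `Tuple2` in real Spark); multi-output cases go through `flatMap`/`flatMapToPair` instead.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Self-contained sketch: the interfaces mimic the two call shapes only.
public class MapShapeSketch {

    // Hadoop shape: map() receives input key AND value, and emits
    // zero or more output pairs through a collector-like sink.
    interface HadoopStyleMapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, List<Map.Entry<K2, V2>> output);
    }

    // Spark shape: call() receives only the input value and returns
    // exactly one output pair.
    interface SparkStylePairFunction<T, K2, V2> {
        Map.Entry<K2, V2> call(T value);
    }

    // Word count in the Hadoop shape: the offset key is ignored, and
    // one call emits one pair per word.
    static List<Map.Entry<String, Integer>> runHadoopStyle(String line) {
        HadoopStyleMapper<Long, String, String, Integer> mapper =
            (offset, value, output) -> {
                for (String word : value.split(" ")) {
                    output.add(new SimpleEntry<>(word, 1));
                }
            };
        List<Map.Entry<String, Integer>> collected = new ArrayList<>();
        mapper.map(0L, line, collected);
        return collected;
    }

    // Word count in the Spark shape: the stream is already split into
    // words before this step, so each call maps one word to one pair.
    static Map.Entry<String, Integer> runSparkStyle(String word) {
        SparkStylePairFunction<String, String, Integer> fn =
            w -> new SimpleEntry<>(w, 1);
        return fn.call(word);
    }

    public static void main(String[] args) {
        System.out.println(runHadoopStyle("to be or not to be"));
        System.out.println(runSparkStyle("be"));
    }
}
```

So the "missing" input key in the Spark version is because `mapToPair` is a 1-to-1 transform over the elements of the stream: whatever the element is (here, a word) is the whole input, and any key only comes into existence as part of the returned `Tuple2`.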
