Hi, I am trying to understand how Hadoop's map method compares to Spark's map, and I noticed that Spark's map (via mapToPair's PairFunction) is parameterized by only three types: 1) input value, 2) output key, 3) output value. Hadoop's map, however, has four: 1) input key, 2) input value, 3) output key, 4) output value. Is there a reason it was designed this way? Just trying to understand:
Hadoop:

    public void map(K key, V val, OutputCollector<K, V> output, Reporter reporter)

Spark:

    // Count each word in each batch
    JavaPairDStream<String, Integer> pairs = words.mapToPair(
        new PairFunction<String, String, Integer>() {
            @Override
            public Tuple2<String, Integer> call(String s) throws Exception {
                return new Tuple2<String, Integer>(s, 1);
            }
        });
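To make the comparison concrete, here is a stripped-down sketch in plain Java. These are NOT the real Hadoop or Spark interfaces, just toy interfaces with the same shapes: the Hadoop-style callback is handed both an input key and an input value (for TextInputFormat the key is the byte offset of the line), while the Spark-style pair function sees only the element itself, since RDD/DStream elements are plain values rather than key/value records.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.function.BiConsumer;

// Illustrative sketch only -- toy interfaces mimicking the two callback shapes,
// not the actual org.apache.hadoop or org.apache.spark APIs.
public class MapShapes {

    // Hadoop-style: the framework passes BOTH the input key and the input value,
    // plus a collector for emitting (output key, output value) pairs.
    interface HadoopStyleMapper<K1, V1, K2, V2> {
        void map(K1 inKey, V1 inVal, BiConsumer<K2, V2> collector);
    }

    // Spark-style: the function receives only the element (the "input value")
    // and returns one (output key, output value) pair.
    interface SparkStylePairFunction<T, K, V> {
        Map.Entry<K, V> call(T element);
    }

    public static void main(String[] args) {
        // Word count in the Hadoop shape: the byte-offset key is available
        // but simply ignored here, as most text-processing mappers do.
        HadoopStyleMapper<Long, String, String, Integer> hadoopStyle =
            (offset, line, collect) -> {
                for (String w : line.split(" ")) {
                    collect.accept(w, 1);
                }
            };

        // Word count in the Spark shape: there is no input key at all.
        SparkStylePairFunction<String, String, Integer> sparkStyle =
            word -> new SimpleEntry<>(word, 1);

        hadoopStyle.map(0L, "hello world",
            (k, v) -> System.out.println(k + "\t" + v));
        System.out.println(sparkStyle.call("hello"));
    }
}
```

Running this prints "hello\t1" and "world\t1" from the Hadoop-style mapper, and "hello=1" from the Spark-style function, which is what made me wonder why one API surfaces an input key and the other does not.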