Map each (key, value) pair into (key, Tuple2<key, value>) and process that. It would also be worth asking the Spark maintainers for versions of the keyed operations where the key is passed in as an argument; I run into these cases all the time.
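Once each value carries its own key, the single function you pass to reduceByKey can branch on that key. Here is a minimal sketch of such a combine step in plain Java (Spark boilerplate omitted; Map.Entry stands in for scala.Tuple2, and the "sum"/"max" keys are made-up examples):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.function.BinaryOperator;

public class KeyAwareCombine {
    // Stand-in for the Function2<Tuple2<K,V>, Tuple2<K,V>, Tuple2<K,V>>
    // you would hand to reduceByKey after re-keying with toKeyedTuples.
    // Both arguments carry the same key, so either one can drive the branch.
    static final BinaryOperator<Map.Entry<String, Integer>> COMBINE = (a, b) -> {
        if ("max".equals(a.getKey())) {
            // for "max" keys, keep the larger value
            return new SimpleEntry<>(a.getKey(), Math.max(a.getValue(), b.getValue()));
        }
        // for every other key, add the values
        return new SimpleEntry<>(a.getKey(), a.getValue() + b.getValue());
    };

    public static void main(String[] args) {
        System.out.println(COMBINE.apply(
                new SimpleEntry<>("sum", 1), new SimpleEntry<>("sum", 2)));
        System.out.println(COMBINE.apply(
                new SimpleEntry<>("max", 3), new SimpleEntry<>("max", 5)));
    }
}
```

Because reduceByKey only ever combines values that share a key, inspecting the key of either argument is safe.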
import java.util.Collections;

/**
 * Map a pair into a (key, (key, value)) pair to ensure that subsequent
 * processing has access to both the key and the value.
 * @param inp input pair RDD
 * @param <K> key type
 * @param <V> value type
 * @return output RDD whose value holds both the key and the value
 */
@Nonnull
public static <K extends Serializable, V extends Serializable> JavaPairRDD<K, Tuple2<K, V>> toKeyedTuples(@Nonnull JavaPairRDD<K, V> inp) {
    return inp.flatMapToPair(new PairFlatMapFunction<Tuple2<K, V>, K, Tuple2<K, V>>() {
        @Override
        public Iterable<Tuple2<K, Tuple2<K, V>>> call(final Tuple2<K, V> t) throws Exception {
            // call() must return an Iterable, so wrap the single output pair in a list
            return Collections.singletonList(
                    new Tuple2<K, Tuple2<K, V>>(t._1(), new Tuple2<K, V>(t._1(), t._2())));
        }
    });
}

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/ReduceByKey-but-with-different-functions-depending-on-key-tp19177p19198.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.