First use groupByKey(); this gives you a pair RDD of (key: K, values: ArrayBuffer[V]) tuples. Then call map() on that RDD with a function that applies a different operation depending on the key, which is available as part of each tuple.
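Outside of Spark, the dispatch idea looks like the plain-Java sketch below: a lookup table maps each key to its own reduce function, and the map step applies whichever function matches the key of each (key, values) group. The class name, the dispatch table, and the sum/max reducers are all hypothetical placeholders; in real Spark code the same function body would go inside the map() (or mapValues()) call on the grouped JavaPairRDD.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

public class KeyDispatchReduce {
    // Hypothetical dispatch table: one reduce function per key.
    static final Map<Integer, BinaryOperator<Integer>> REDUCERS = new HashMap<>();
    static {
        REDUCERS.put(0, Integer::sum); // key 0: sum the values
        REDUCERS.put(1, Math::max);    // key 1: keep the maximum
    }

    // Mimics map() over the (key, values) pairs produced by groupByKey():
    // each group is reduced with the function selected by its key.
    static Map<Integer, Integer> reduceByKeyWithDispatch(Map<Integer, List<Integer>> grouped) {
        Map<Integer, Integer> result = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : grouped.entrySet()) {
            BinaryOperator<Integer> f = REDUCERS.get(e.getKey());
            result.put(e.getKey(),
                       e.getValue().stream().reduce(f).orElseThrow(IllegalStateException::new));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> grouped = new HashMap<>();
        grouped.put(0, Arrays.asList(1, 2, 3)); // summed
        grouped.put(1, Arrays.asList(4, 9, 5)); // max taken
        System.out.println(reduceByKeyWithDispatch(grouped));
    }
}
```

Note that, unlike reduceByKey, this groupByKey-based approach materializes each group before reducing it, so it shuffles more data; it is the simplest way to get key-dependent behavior, though, since reduceByKey's merge function never sees the key.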
> On Nov 18, 2014, at 8:59 PM, jelgh <johannes.e...@gmail.com> wrote:
>
> Hello everyone,
>
> I'm new to Spark and I have the following problem:
>
> I have this large JavaRDD<MyClass> collection, which I group by
> creating a hashcode from some fields in MyClass:
>
> JavaRDD<MyClass> collection = ...;
> JavaPairRDD<Integer, Iterable<MyClass>> grouped =
>     collection.groupBy(...); // the group-function just creates a hashcode
>                              // from some fields in MyClass.
>
> Now I want to reduce the variable grouped. However, I want to reduce it with
> different functions depending on the key in the JavaPairRDD. So basically a
> reduceByKey but with multiple functions.
>
> The only solution I've come up with is filtering grouped once per reduce
> function and applying each function to its filtered subset. This feels kind
> of hackish, though.
>
> Is there a better way?
>
> Best regards,
> Johannes
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/ReduceByKey-but-with-different-functions-depending-on-key-tp19177.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> ---------------------------------------------------------------------