Thanks, pair_rdd.rdd.groupByKey()... did the trick.

On Wed, Aug 10, 2016 at 8:24 PM, Holden Karau <hol...@pigscanfly.ca> wrote:
> So it looks like (despite the name) pair_rdd is actually a Dataset - my
> guess is you might have a map on a dataset up above which used to return an
> RDD but now returns another dataset or an unexpected implicit conversion.
> Just add rdd() before the groupByKey call to push it into an RDD. That
> being said - groupByKey generally is an anti-pattern so please be careful
> with it.
>
> On Wed, Aug 10, 2016 at 8:07 PM, Arun Luthra <arun.lut...@gmail.com> wrote:
>
>> Here is the offending line:
>>
>> val some_rdd = pair_rdd.groupByKey().flatMap { case (mk: MyKey, md_iter: Iterable[MyData]) => {
>> ...
>>
>> [error] ********.scala:249: overloaded method value groupByKey with alternatives:
>> [error]   [K](func: org.apache.spark.api.java.function.MapFunction[(aaa.MyKey, aaa.MyData),K], encoder: org.apache.spark.sql.Encoder[K])org.apache.spark.sql.KeyValueGroupedDataset[K,(aaa.MyKey, aaa.MyData)]
>>   <and>
>> [error]   [K](func: ((aaa.MyKey, aaa.MyData)) => K)(implicit evidence$4: org.apache.spark.sql.Encoder[K])org.apache.spark.sql.KeyValueGroupedDataset[K,(aaa.MyKey, aaa.MyData)]
>> [error]  cannot be applied to ()
>> [error]     val some_rdd = pair_rdd.groupByKey().flatMap { case (mk: MyKey, hd_iter: Iterable[MyData]) => {
>> [error]                             ^
>> [warn] ************.scala:249: non-variable type argument aaa.MyData in type pattern Iterable[aaa.MyData] (the underlying of Iterable[aaa.MyData]) is unchecked since it is eliminated by erasure
>> [warn]     val some_rdd = pair_rdd.groupByKey().flatMap { case (mk: MyKey, hd_iter: Iterable[MyData]) => {
>> [warn]                                                              ^
>> [warn] one warning found
>>
>> I can't see any obvious API change... what is the problem?
>>
>> Thanks,
>> Arun
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
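
For readers finding this thread later, here is a minimal, self-contained sketch of the two APIs involved. The MyKey/MyData case classes, the sample data, and the object name are illustrative placeholders rather than the original code; the point is that Dataset.groupByKey requires a key-extraction function, while the RDD obtained via .rdd still offers the zero-argument groupByKey().

import org.apache.spark.sql.SparkSession

// Placeholder case classes standing in for the types in the thread above.
case class MyKey(id: Int)
case class MyData(value: String)

object GroupByKeyExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("groupByKey-example").master("local[*]").getOrCreate()
    import spark.implicits._

    // A Dataset of pairs, e.g. the result of a map over another Dataset.
    val pair_ds = Seq((MyKey(1), MyData("a")), (MyKey(1), MyData("b"))).toDS()

    // Dataset.groupByKey needs a key-extraction function, so a bare
    // groupByKey() does not compile -- that is the error quoted above.
    val grouped_ds = pair_ds.groupByKey(_._1)
    grouped_ds.count().show()

    // Dropping down to the RDD API restores the zero-argument
    // PairRDDFunctions.groupByKey() that the original code expected.
    val some_rdd = pair_ds.rdd.groupByKey().flatMap { case (mk, md_iter) =>
      md_iter.map(md => (mk, md.value))
    }
    some_rdd.collect().foreach(println)

    spark.stop()
  }
}

Holden's caution still applies on the RDD path: groupByKey shuffles every value for a key to one executor, so reduceByKey or aggregateByKey is usually preferable when the per-key result can be combined incrementally.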