So it looks like (despite the name) pair_rdd is actually a Dataset. My
guess is that a map somewhere above it, which used to return an RDD, now
returns another Dataset, or that an unexpected implicit conversion is
kicking in. Just add .rdd before the groupByKey call to drop back down to
an RDD — the Dataset groupByKey overloads both require a key-extractor
function, which is why the zero-argument call doesn't compile. That being
said, groupByKey is generally an anti-pattern, since it shuffles every
value for a key to a single executor, so please be careful with it.
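A minimal sketch of the fix (MyKey, MyData, and the body below are
stand-ins for your actual types and logic):

// Hypothetical stand-ins for your real types:
case class MyKey(id: Int)
case class MyData(v: String)

// pair_rdd is really a Dataset[(MyKey, MyData)]; .rdd drops down to the
// RDD API, where groupByKey() takes no arguments (via PairRDDFunctions):
val some_rdd = pair_rdd.rdd
  .groupByKey()
  .flatMap { case (mk, md_iter) =>
    // Leaving the type ascriptions off the pattern also silences the
    // "eliminated by erasure" warning, since the types are known statically.
    md_iter.map(md => (mk, md)) // placeholder body; yours goes here
  }

If all you need per key is an aggregate, reduceByKey or aggregateByKey
will do the same work without materializing the whole Iterable for each
key.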

On Wed, Aug 10, 2016 at 8:07 PM, Arun Luthra <arun.lut...@gmail.com> wrote:

> Here is the offending line:
>
> val some_rdd = pair_rdd.groupByKey().flatMap { case (mk: MyKey, md_iter: Iterable[MyData]) => {
> ...
>
>
> [error] ******** .scala:249: overloaded method value groupByKey with alternatives:
> [error]   [K](func: org.apache.spark.api.java.function.MapFunction[(aaa.MyKey, aaa.MyData),K], encoder: org.apache.spark.sql.Encoder[K])org.apache.spark.sql.KeyValueGroupedDataset[K,(aaa.MyKey, aaa.MyData)]
> [error]   <and>
> [error]   [K](func: ((aaa.MyKey, aaa.MyData)) => K)(implicit evidence$4: org.apache.spark.sql.Encoder[K])org.apache.spark.sql.KeyValueGroupedDataset[K,(aaa.MyKey, aaa.MyData)]
> [error]  cannot be applied to ()
> [error]     val some_rdd = pair_rdd.groupByKey().flatMap { case (mk: MyKey, hd_iter: Iterable[MyData]) => {
> [error]                             ^
> [warn] ************.scala:249: non-variable type argument aaa.MyData in type pattern Iterable[aaa.MyData] (the underlying of Iterable[aaa.MyData]) is unchecked since it is eliminated by erasure
> [warn]     val some_rdd = pair_rdd.groupByKey().flatMap { case (mk: MyKey, hd_iter: Iterable[MyData]) => {
> [warn]                                                                              ^
> [warn] one warning found
>
>
> I can't see any obvious API change... what is the problem?
>
> Thanks,
> Arun
>



-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
