Thanks, pair_rdd.rdd.groupByKey().... did the trick.
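For the archives, here is a minimal, self-contained sketch of the working pattern. All the names (MyKey, MyData, the sample rows) are placeholders standing in for the real job, not the actual code:

import org.apache.spark.sql.SparkSession

// Placeholder types standing in for the real key/value classes.
case class MyKey(id: Int)
case class MyData(value: String)

object GroupByKeyFix {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("groupByKey-fix")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Despite the name, this is a Dataset[(MyKey, MyData)], not an RDD.
    val pair_rdd = Seq(
      (MyKey(1), MyData("a")),
      (MyKey(1), MyData("b")),
      (MyKey(2), MyData("c"))
    ).toDS()

    // .rdd drops back to RDD[(MyKey, MyData)], which restores the old
    // PairRDDFunctions.groupByKey() and the (key, values) shape below.
    // Leaving the tuple pattern unannotated also sidesteps the erasure
    // warning from the compile output further down the thread.
    val some_rdd = pair_rdd.rdd.groupByKey().flatMap {
      case (mk, md_iter) => md_iter.map(md => (mk, md.value))
    }

    some_rdd.collect().foreach(println)
    spark.stop()
  }
}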

On Wed, Aug 10, 2016 at 8:24 PM, Holden Karau <hol...@pigscanfly.ca> wrote:

> So it looks like (despite the name) pair_rdd is actually a Dataset. My
> guess is you have a map on a Dataset up above which used to return an
> RDD but now returns another Dataset, or an unexpected implicit conversion
> kicked in. Just add .rdd before the groupByKey call to drop it back to an
> RDD. That being said, groupByKey is generally an anti-pattern (it shuffles
> every value for a key to a single executor), so please be careful with it.
>
> On Wed, Aug 10, 2016 at 8:07 PM, Arun Luthra <arun.lut...@gmail.com>
> wrote:
>
>> Here is the offending line:
>>
>> val some_rdd = pair_rdd.groupByKey().flatMap {
>>   case (mk: MyKey, md_iter: Iterable[MyData]) => {
>>     ...
>>
>>
>> [error] ********.scala:249: overloaded method value groupByKey with alternatives:
>> [error]   [K](func: org.apache.spark.api.java.function.MapFunction[(aaa.MyKey, aaa.MyData),K], encoder: org.apache.spark.sql.Encoder[K])org.apache.spark.sql.KeyValueGroupedDataset[K,(aaa.MyKey, aaa.MyData)]
>> [error]   <and>
>> [error]   [K](func: ((aaa.MyKey, aaa.MyData)) => K)(implicit evidence$4: org.apache.spark.sql.Encoder[K])org.apache.spark.sql.KeyValueGroupedDataset[K,(aaa.MyKey, aaa.MyData)]
>> [error]  cannot be applied to ()
>> [error]     val some_rdd = pair_rdd.groupByKey().flatMap { case (mk: MyKey, hd_iter: Iterable[MyData]) => {
>> [error]                                             ^
>> [warn] ************.scala:249: non-variable type argument aaa.MyData in type pattern Iterable[aaa.MyData] (the underlying of Iterable[aaa.MyData]) is unchecked since it is eliminated by erasure
>> [warn]     val some_rdd = pair_rdd.groupByKey().flatMap { case (mk: MyKey, hd_iter: Iterable[MyData]) => {
>> [warn]                                   ^
>> [warn] one warning found
>>
>>
>> I can't see any obvious API change... what is the problem?
>>
>> Thanks,
>> Arun
>>
>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>
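PS for anyone finding this later: as the compile error above shows, both Dataset.groupByKey overloads take a key function, so the zero-argument call simply has no matching signature on a Dataset. If you would rather stay in the Dataset API than drop to the RDD, a sketch like this (reusing the placeholder MyKey/MyData types, spark session, and implicits from the snippet at the top of the thread) would be the rough equivalent:

import spark.implicits._

val pair_ds = Seq(
  (MyKey(1), MyData("a")),
  (MyKey(1), MyData("b")),
  (MyKey(2), MyData("c"))
).toDS()

// groupByKey here needs an explicit key function (K = MyKey), and
// flatMapGroups plays the role of the RDD-style flatMap over groups.
val result = pair_ds
  .groupByKey { case (mk, _) => mk }
  .flatMapGroups { (mk, rows) =>
    rows.map { case (_, md) => (mk, md.value) }
  }

result.show()

And per Holden's caution: if the downstream logic is really an aggregation, reduceByKey on the RDD (or agg on the grouped Dataset) avoids materializing each key's full group at once.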
