Thanks Sean!
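
For anyone else who hits this before it is settled, here is a minimal sketch of the "iterator wrapped in an Iterable" workaround mentioned below. The class name `IteratorIterable` and the helper `once` are hypothetical names for illustration, not part of Spark. The sketch relies only on `java.lang.Iterable` having a single abstract method, so a Java 8 lambda can stand in for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class, not part of Spark.
class IteratorIterable {

    // Exposes an Iterator as an Iterable. The result is single-use:
    // every call to iterator() hands back the same underlying Iterator,
    // so a second traversal would see no elements.
    public static <T> Iterable<T> once(Iterator<T> it) {
        return () -> it;
    }

    public static void main(String[] args) {
        // Stand-in for a function body that naturally produces an Iterator.
        Iterator<String> words = Arrays.asList("a b c".split(" ")).iterator();

        List<String> out = new ArrayList<>();
        for (String w : once(words)) {
            out.add(w);
        }
        System.out.println(out); // prints [a, b, c]

        // Assumed usage inside Spark, given that flatMapValues expects a
        // function returning an Iterable as described in this thread:
        //   pairs.flatMapValues(v -> once(someIteratorFor(v)));
        // where someIteratorFor is a placeholder for your own code.
    }
}
```

Because the wrapper can only be traversed once, it is safe only when the caller iterates the result a single time; if that is in doubt, copying the elements into a List first is the more conservative choice.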

On Thu, Jan 19, 2017 at 6:09 AM, Sean Owen <so...@cloudera.com> wrote:

> Yes, confirmed that fixing it unfortunately causes trouble in Java 8. See
> https://issues.apache.org/jira/browse/SPARK-19287 for further discussion.
>
> On Wed, Jan 18, 2017 at 9:00 PM Sean Owen <so...@cloudera.com> wrote:
>
>> Hm. Unless I am also totally missing or forgetting something, I think
>> you're right. The equivalent in PairRDDFunctions.scala operates on a
>> function from T to TraversableOnce[U], and a TraversableOnce is most like
>> java.util.Iterator.
>>
>> You can work around it by wrapping it in a faked IteratorIterable.
>>
>> I think this is fixable in the API by deprecating this method and adding
>> a new one that takes a FlatMapFunction. We'd have to triple-check in a test
>> that this doesn't cause an API compatibility problem with respect to Java 8
>> lambdas, but if that's settled, I think this could be fixed without
>> breaking the API.
>>
>> On Wed, Jan 18, 2017 at 8:50 PM Asher Krim <ak...@hubspot.com> wrote:
>>
>> In the Spark 2 Java RDD API, the use of iterables was replaced with
>> iterators. I just encountered an inconsistency in `flatMapValues` that may
>> be a bug:
>>
>> `flatMapValues` (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala#L677) takes
>> a `FlatMapFunction` (https://github.com/apache/spark/blob/39e2bad6a866d27c3ca594d15e574a1da3ee84cc/core/src/main/java/org/apache/spark/api/java/function/FlatMapFunction.java)
>>
>> The problem is that `FlatMapFunction` was changed to return an iterator,
>> but `rdd.flatMapValues` still expects an iterable. Am I using these
>> constructs correctly? Is there a workaround other than converting the
>> iterator to an iterable outside of the function?
>>
>> Thanks,
>> --
>> Asher Krim
>> Senior Software Engineer
>>
>>


-- 
Asher Krim
Senior Software Engineer
