[ https://issues.apache.org/jira/browse/SPARK-28854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914981#comment-16914981 ]

Hao Yang Ang commented on SPARK-28854:
--------------------------------------

[~hyukjin.kwon] if xs is "consumed" at the first map, then wouldn't

sc.parallelize(Seq(1, 2, 3)).mapPartitions(xs => xs.map(2*).map(2*)).collect.foreach(println)

fail in the same way?

And why would zipping with an empty iterator fail at all? The error looks more like Iterator#hasNext returned true when Iterator#next had nothing left to produce.

What you have suggested is just another workaround. It remains a bug until zip is fixed, or until users can be prevented from writing the offending code at compile time, e.g. via a new iterator type that doesn't support zip.
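For reference, a minimal sketch of the same failure with a plain Scala Iterator, no Spark involved, on the assumption that this is what happens inside the partition: both sides of the zip pull from the same underlying source, so elements are consumed twice and the right side runs dry first.

```scala
object ZipSelf {
  def main(args: Array[String]): Unit = {
    val it = Iterator(1, 2, 3)
    // map is lazy, so it.map(_ * 2) still draws from `it` itself;
    // zipping it with `it` means every pair advances `it` twice.
    val zipped = it.map(_ * 2).zip(it)
    println(zipped.next()) // first pair interleaves elements: (2,2)
    // hasNext on both sides is still true (element 3 remains),
    // but the left side consumes it, and next() on the right throws
    // java.util.NoSuchElementException: next on empty iterator
    zipped.next()
  }
}
```

The first pair already shows the corruption, (2,2) instead of (2,1), before the exception is even reached.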

> Zipping iterators in mapPartitions will fail
> --------------------------------------------
>
>                 Key: SPARK-28854
>                 URL: https://issues.apache.org/jira/browse/SPARK-28854
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.3
>            Reporter: Hao Yang Ang
>            Priority: Minor
>
> {code}
> scala> sc.parallelize(Seq(1, 2, 3)).mapPartitions(xs => xs.map(2*).zip(xs)).collect.foreach(println)
> ...
> java.util.NoSuchElementException: next on empty iterator
> {code}
>  
>  
> Workaround - implement zip with mapping to tuple:
> {code}
> scala> sc.parallelize(Seq(1, 2, 3)).mapPartitions(xs => xs.map(x => (x * 2, x))).collect.foreach(println)
> (2,1)
> (4,2)
> (6,3)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
