[ https://issues.apache.org/jira/browse/SPARK-28854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914981#comment-16914981 ]
Hao Yang Ang commented on SPARK-28854:
--------------------------------------

[~hyukjin.kwon] If xs is "consumed" at the first map, then wouldn't

{code}
sc.parallelize(Seq(1, 2, 3)).mapPartitions(xs => xs.map(2*).map(2*)).collect.foreach(println)
{code}

fail the same way as well? And why would zipping with empty iterators fail? The error feels more like Iterator#hasNext was wrong about Iterator#next.

What you have suggested is just another workaround. It is still a bug until zip is fixed, or until users can be stopped from writing the offending code at compile time, e.g. with a new iterator type that does not support zip.

> Zipping iterators in mapPartitions will fail
> --------------------------------------------
>
>                 Key: SPARK-28854
>                 URL: https://issues.apache.org/jira/browse/SPARK-28854
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.3
>            Reporter: Hao Yang Ang
>            Priority: Minor
>
> {code}
> scala> sc.parallelize(Seq(1, 2, 3)).mapPartitions(xs => xs.map(2*).zip(xs)).collect.foreach(println)
> ...
> java.util.NoSuchElementException: next on empty iterator
> {code}
>
> Workaround - implement zip with mapping to tuple:
>
> {code}
> scala> sc.parallelize(Seq(1, 2, 3)).mapPartitions(xs => xs.map(x => (x * 2, x))).collect.foreach(println)
> (2,1)
> (4,2)
> (6,3)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
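For context, the single-pass behavior under discussion can be reproduced with plain Scala iterators, no Spark required. This is a minimal sketch of the assumed mechanism, not Spark's internals: `map` on a scala.collection.Iterator wraps the underlying iterator lazily rather than copying it, so both sides of the zip drain the same source. The exact failure point may vary by Scala version's zip implementation.

{code}
// A scala.collection.Iterator is single-pass.
val xs = Iterator(1, 2, 3)

// map does not copy xs; it returns a lazy view over the SAME underlying iterator.
val doubled = xs.map(_ * 2)

// Both sides of the zip now pull from xs, so advancing one side
// silently advances the other. hasNext on the pair can report true
// even though the next() calls will between them over-consume xs,
// which is how "next on empty iterator" escapes past hasNext.
doubled.zip(xs).foreach(println)
// Expect a java.util.NoSuchElementException partway through,
// rather than the three pairs a copied sequence would give.
{code}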