[ https://issues.apache.org/jira/browse/SPARK-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen updated SPARK-5361:
------------------------------
    Affects Version/s: 1.2.0

> python tuple not supported while converting PythonRDD back to JavaRDD
> ---------------------------------------------------------------------
>
>                 Key: SPARK-5361
>                 URL: https://issues.apache.org/jira/browse/SPARK-5361
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.2.0
>            Reporter: Winston Chen
>
> The existing `SerDeUtil.pythonToJava` implementation does not account for
> tuples: Pyrolite converts a `python tuple` to a `java Object[]`.
> So with the following data:
> {noformat}
> [
> (u'2', {u'director': u'David Lean', u'genres': (u'Adventure', u'Biography', u'Drama'), u'title': u'Lawrence of Arabia', u'year': 1962}),
> (u'7', {u'director': u'Andrew Dominik', u'genres': (u'Biography', u'Crime', u'Drama'), u'title': u'The Assassination of Jesse James by the Coward Robert Ford', u'year': 2007})
> ]
> {noformat}
> Exceptions happen at the `genres` part:
> {noformat}
> 15/01/16 10:28:31 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 7)
> java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to java.util.ArrayList
>         at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:157)
>         at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:153)
>         at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>         at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
> {noformat}
> There is already a pull request for this bug:
> https://github.com/apache/spark/pull/4146

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
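
The mismatch described in the report can be modeled in plain Python. This is only an illustrative sketch, not the code from the pull request: Python `list` stands in for `java.util.ArrayList`, Python `tuple` stands in for the `Object[]` that Pyrolite produces, and the helper names `strict_as_list`, `tolerant_as_list`, and `normalize` are hypothetical.

```python
# Illustrative model only. Assumptions: `list` plays the role of
# java.util.ArrayList and `tuple` plays the role of Java Object[].
# None of these helpers exist in Spark or Pyrolite.


def strict_as_list(obj):
    """Mimic an unconditional cast to ArrayList: anything but a list fails."""
    if not isinstance(obj, list):
        raise TypeError("%s cannot be cast to list" % type(obj).__name__)
    return obj


def tolerant_as_list(obj):
    """Accept both sequence shapes and hand back a list either way."""
    if isinstance(obj, tuple):
        return list(obj)
    return strict_as_list(obj)


def normalize(obj):
    """Recursively replace tuples with lists inside a nested record."""
    if isinstance(obj, (tuple, list)):
        return [normalize(item) for item in obj]
    if isinstance(obj, dict):
        return {key: normalize(value) for key, value in obj.items()}
    return obj


# First record from the report, written without the u'' prefixes.
record = ("2", {"director": "David Lean",
                "genres": ("Adventure", "Biography", "Drama"),
                "title": "Lawrence of Arabia",
                "year": 1962})
```

In this model, `strict_as_list(record[1]["genres"])` raises a `TypeError`, which loosely corresponds to the `ClassCastException` in the trace, while `tolerant_as_list` and `normalize` accept the tuple. Whether the actual fix handles the array case at the batch level or per nested value is not stated in the report, so the sketch should not be read as a description of it.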