[ https://issues.apache.org/jira/browse/SPARK-23840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422509#comment-16422509 ]
Hyukjin Kwon commented on SPARK-23840:
--------------------------------------

It would be nicer if we could have the error messages, since the issue is hard to reproduce and quite difficult to debug given only that information. FYI, the execution path would be roughly

    Python --py4j--> Spark Driver --> Spark Executor --> Python worker

and I can't check every code path :(

> PySpark error when converting a DataFrame to rdd
> ------------------------------------------------
>
>                 Key: SPARK-23840
>                 URL: https://issues.apache.org/jira/browse/SPARK-23840
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.3.0
>            Reporter: Uri Goren
>            Priority: Major
>
> I am running code in the `pyspark` shell on an `emr` cluster, and
> encountering an error I have never seen before...
> This line works:
> spark.read.parquet(s3_input).take(99)
> While this line causes an exception:
> spark.read.parquet(s3_input).rdd.take(99)
> With
>
> TypeError: 'int' object is not iterable
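
For anyone trying to supply the error messages requested in the comment, here is a minimal sketch of one way to capture the complete Python-side traceback rather than only its last line. The helper name `capture_traceback` is hypothetical and not part of PySpark. The stand-in `list(5)` is used only so the snippet runs without a cluster, and it happens to raise the same `TypeError` text as in the report. Whether the captured text also includes the executor-side trace depends on where the failure actually occurs, which is not known from the report.

```python
import traceback


def capture_traceback(action):
    """Run a zero-argument callable; return the full traceback text if it raises, else None."""
    try:
        action()
    except Exception:
        # format_exc() returns the exception type, message, and Python-side stack as a string.
        return traceback.format_exc()
    return None


# Stand-in for the failing call so the sketch runs anywhere:
# list(5) raises "TypeError: 'int' object is not iterable".
report = capture_traceback(lambda: list(5))
print(report)

# In the pyspark shell (assumes `spark` and `s3_input` are defined as in the report):
# print(capture_traceback(lambda: spark.read.parquet(s3_input).rdd.take(99)))
```

Pasting the printed text into the issue would show which frame raised the error, which helps narrow down the segment of the execution path involved.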