Josh Rosen created SPARK-5063:
---------------------------------

             Summary: Raise more helpful errors when SparkContext methods are called inside of transformations
                 Key: SPARK-5063
                 URL: https://issues.apache.org/jira/browse/SPARK-5063
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
            Reporter: Josh Rosen
            Assignee: Josh Rosen
Spark does not support nested RDDs or performing Spark actions inside of transformations; this usually leads to NullPointerExceptions (see SPARK-718 as one example). The confusing NPE is one of the most common sources of Spark questions on StackOverflow:

- https://stackoverflow.com/questions/13770218/call-of-distinct-and-map-together-throws-npe-in-spark-library/14130534#14130534
- https://stackoverflow.com/questions/23793117/nullpointerexception-in-scala-spark-appears-to-be-caused-be-collection-type/23793399#23793399
- https://stackoverflow.com/questions/25997558/graphx-ive-got-nullpointerexception-inside-mapvertices/26003674#26003674

(Those are just a sample of the ones that I've answered personally; there are many others.)

I think we should add some logic to detect these sorts of errors: we can use a DynamicVariable to check whether we're inside a task, and throw more useful errors when the RDD constructor or the SparkContext job submission methods are called from inside one.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
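The DynamicVariable-based detection proposed above could be sketched roughly as follows. This is a minimal illustration, not the actual patch: the object and method names (`TaskContextSketch`, `runTask`, `assertOnDriver`) are hypothetical, and a real implementation would hook into Spark's executor task-launch path rather than a standalone wrapper.

```scala
import scala.util.DynamicVariable

// Hypothetical sketch: a thread-local flag (scala.util.DynamicVariable)
// that is set for the duration of a task body.
object TaskContextSketch {
  private val insideTask = new DynamicVariable[Boolean](false)

  // The executor would wrap each task's execution like this, so the
  // flag is true on the task thread and false everywhere else.
  def runTask[T](body: => T): T = insideTask.withValue(true)(body)

  // Driver-only entry points (the RDD constructor, SparkContext.runJob,
  // etc.) would call this guard first to fail fast with a clear message
  // instead of a confusing NullPointerException later.
  def assertOnDriver(op: String): Unit = {
    if (insideTask.value) {
      throw new IllegalStateException(
        s"$op cannot be called from inside a task; " +
        "RDDs and Spark jobs can only be created and submitted by the driver.")
    }
  }
}

// On the driver the guard passes; inside a simulated task it throws.
TaskContextSketch.assertOnDriver("RDD constructor")
val caught =
  try {
    TaskContextSketch.runTask {
      TaskContextSketch.assertOnDriver("SparkContext.runJob")
      false
    }
  } catch { case _: IllegalStateException => true }
assert(caught)
```

Because DynamicVariable is thread-local, the flag would need to be propagated explicitly if a task spawns its own threads; the sketch ignores that complication.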