[ https://issues.apache.org/jira/browse/SPARK-18260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15637290#comment-15637290 ]
Apache Spark commented on SPARK-18260:
--------------------------------------

User 'brkyvz' has created a pull request for this issue:
https://github.com/apache/spark/pull/15771

> from_json can throw a better exception when it can't find the column or be
> nullSafe
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-18260
>                 URL: https://issues.apache.org/jira/browse/SPARK-18260
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Burak Yavuz
>            Priority: Blocker
>
> I got this exception:
> {code}
> SparkException: Job aborted due to stage failure: Task 0 in stage 13028.0
> failed 4 times, most recent failure: Lost task 0.3 in stage 13028.0 (TID
> 74170, 10.0.138.84, executor 2): java.lang.NullPointerException
>     at org.apache.spark.sql.catalyst.expressions.JsonToStruct.eval(jsonExpressions.scala:490)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown Source)
>     at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:71)
>     at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:71)
>     at org.apache.spark.sql.execution.FilterExec$$anonfun$17$$anonfun$apply$2.apply(basicPhysicalOperators.scala:211)
>     at org.apache.spark.sql.execution.FilterExec$$anonfun$17$$anonfun$apply$2.apply(basicPhysicalOperators.scala:210)
>     at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:804)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:804)
> {code}
> This 
was because the column that I called `from_json` on didn't exist for all
> of my rows. Either from_json should be null-safe, or it should fail with a
> better error message.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
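The null-safe behavior the reporter asks for can be illustrated outside Spark. The sketch below is plain Python, not Spark internals: `from_json_null_safe` and its `schema_keys` parameter are hypothetical names, used only to show the desired contract, i.e. a null input column yields null output instead of a NullPointerException:

```python
import json

def from_json_null_safe(value, schema_keys):
    """Parse a JSON string into the fields named by schema_keys.

    Null-safe: returns None when the input is None, rather than raising,
    which is the behavior this issue requests for from_json.
    """
    if value is None:
        return None  # null input -> null output, no exception
    try:
        parsed = json.loads(value)
    except ValueError:
        return None  # malformed JSON also yields null
    if not isinstance(parsed, dict):
        return None  # value does not match a struct-like schema
    # Missing keys become None, mirroring absent struct fields
    return {key: parsed.get(key) for key in schema_keys}
```

With this contract, rows whose column is null simply produce null structs, so a downstream filter or projection never sees an exception from the parse step.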