[jira] [Updated] (SPARK-18260) from_json can throw a better exception when it can't find the column or be nullSafe

2016-11-07 - Reynold Xin (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-18260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin updated SPARK-18260:

Issue Type: Sub-task  (was: Bug)
Parent: SPARK-18351

> from_json can throw a better exception when it can't find the column or be 
> nullSafe
> ---
>
> Key: SPARK-18260
> URL: https://issues.apache.org/jira/browse/SPARK-18260
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Burak Yavuz
>Assignee: Burak Yavuz
>Priority: Blocker
> Fix For: 2.1.0
>
>
> I got this exception:
> {code}
> SparkException: Job aborted due to stage failure: Task 0 in stage 13028.0 
> failed 4 times, most recent failure: Lost task 0.3 in stage 13028.0 (TID 
> 74170, 10.0.138.84, executor 2): java.lang.NullPointerException
>   at 
> org.apache.spark.sql.catalyst.expressions.JsonToStruct.eval(jsonExpressions.scala:490)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:71)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:71)
>   at 
> org.apache.spark.sql.execution.FilterExec$$anonfun$17$$anonfun$apply$2.apply(basicPhysicalOperators.scala:211)
>   at 
> org.apache.spark.sql.execution.FilterExec$$anonfun$17$$anonfun$apply$2.apply(basicPhysicalOperators.scala:210)
>   at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
>   at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:804)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:804)
> {code}
> This was because the column that I called `from_json` on didn't exist for all 
> of my rows. Either from_json should be null-safe, or it should fail with a 
> better error message.
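
Not part of the original report, but a minimal sketch of the scenario described above, assuming Spark 2.1+ where from_json is available in org.apache.spark.sql.functions: a DataFrame whose JSON column is null for some rows, so that evaluating from_json on those rows hits the NullPointerException raised in JsonToStruct.eval. The object name, column names, and schema below are illustrative only.

{code}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical repro, not taken from the ticket: one row carries a JSON
// payload, the other carries null, mimicking a column that is missing
// for some records.
object FromJsonNullRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("from_json-null-repro")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      (1, """{"a": "x"}"""),
      (2, null.asInstanceOf[String])
    ).toDF("id", "payload")

    val schema = StructType(Seq(StructField("a", StringType)))

    // On affected versions, evaluating from_json for the null row triggers the
    // NullPointerException in JsonToStruct.eval seen in the stack trace above;
    // a null-safe from_json would instead yield a null struct for that row.
    df.select($"id", from_json($"payload", schema).as("parsed")).show()

    spark.stop()
  }
}
{code}

Under the same assumptions, one defensive workaround until the behavior changes is to avoid handing from_json a null input, e.g. from_json(coalesce($"payload", lit("{}")), schema), or to filter out rows where the column is null before parsing.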






[jira] [Updated] (SPARK-18260) from_json can throw a better exception when it can't find the column or be nullSafe

2016-11-03 - Shixiong Zhu (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-18260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shixiong Zhu updated SPARK-18260:
-
Component/s: SQL

> from_json can throw a better exception when it can't find the column or be 
> nullSafe
> ---
>
> Key: SPARK-18260
> URL: https://issues.apache.org/jira/browse/SPARK-18260
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Burak Yavuz






[jira] [Updated] (SPARK-18260) from_json can throw a better exception when it can't find the column or be nullSafe

2016-11-03 - Michael Armbrust (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-18260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust updated SPARK-18260:
-
Target Version/s: 2.1.0
Priority: Blocker  (was: Major)

> from_json can throw a better exception when it can't find the column or be 
> nullSafe
> ---
>
> Key: SPARK-18260
> URL: https://issues.apache.org/jira/browse/SPARK-18260
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Burak Yavuz
>Priority: Blocker


