[jira] [Updated] (SPARK-5498) [SPARK-SQL] when the partition schema does not match table schema, it throws java.lang.ClassCastException and so on

2018-03-16 Thread Lijia Liu (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lijia Liu updated SPARK-5498:
-
Affects Version/s: 2.2.0
Fix Version/s: 3.0.0

> [SPARK-SQL] when the partition schema does not match table schema, it throws 
> java.lang.ClassCastException and so on
> -
>
> Key: SPARK-5498
> URL: https://issues.apache.org/jira/browse/SPARK-5498
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.0, 2.2.0
>Reporter: jeanlyn
>Assignee: jeanlyn
>Priority: Major
> Fix For: 1.4.0, 3.0.0
>
>
> When the partition schema does not match the table schema, an exception is 
> thrown while the task is running. For example, if we change the type of a 
> column from int to bigint with the SQL statement *ALTER TABLE 
> table_with_partition CHANGE COLUMN key key BIGINT* and then query partition 
> data that was stored before the change, we get the following exception:
> {noformat}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 27.0 failed 4 times, most recent failure: Lost task 0.3 in stage 27.0 (TID 30, BJHC-HADOOP-HERA-16950.jeanlyn.local): java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.MutableLong cannot be cast to org.apache.spark.sql.catalyst.expressions.MutableInt
> at org.apache.spark.sql.catalyst.expressions.SpecificMutableRow.setInt(SpecificMutableRow.scala:241)
> at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$13$$anonfun$apply$4.apply(TableReader.scala:286)
> at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$13$$anonfun$apply$4.apply(TableReader.scala:286)
> at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:322)
> at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:314)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
> at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
> at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
> at org.apache.spark.sql.execution.Limit$$anonfun$4.apply(basicOperators.scala:141)
> at org.apache.spark.sql.execution.Limit$$anonfun$4.apply(basicOperators.scala:141)
> at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
> at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
> at org.apache.spark.scheduler.Task.run(Task.scala:56)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Driver stacktrace:
> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
> at org.apa
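
For context, the reproduction the reporter describes amounts to the spark-shell sketch below. This is a minimal sketch, not the reporter's script: the `src` temp view and the modern SparkSession API are assumptions on my part; on the affected 1.2.x release the equivalent statements run through HiveContext, which is where the reported failure occurred.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("SPARK-5498-repro")
  .enableHiveSupport()   // Hive DDL such as CHANGE COLUMN needs Hive support
  .getOrCreate()
import spark.implicits._

// 1. Create a partitioned table whose data column `key` starts out as INT,
//    and write one partition with that schema.
spark.sql(
  """CREATE TABLE table_with_partition (key INT, value STRING)
    |PARTITIONED BY (ds STRING)""".stripMargin)
Seq((1, "a")).toDF("key", "value").createOrReplaceTempView("src")
spark.sql(
  """INSERT OVERWRITE TABLE table_with_partition PARTITION (ds = '1')
    |SELECT key, value FROM src""".stripMargin)

// 2. Widen the column in the table schema. The partition written above
//    still records `key` as INT, so partition and table schemas disagree.
spark.sql("ALTER TABLE table_with_partition CHANGE COLUMN key key BIGINT")

// 3. Scanning the pre-existing partition now fails at task time with
//    ClassCastException: MutableLong cannot be cast to MutableInt.
spark.sql("SELECT * FROM table_with_partition WHERE ds = '1'").show()
{code}

The cast error is consistent with the scan building its mutable row against the widened BIGINT table schema (MutableLong) while the partition deserializer still writes values via setInt for the partition's old INT schema.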

[jira] [Updated] (SPARK-5498) [SPARK-SQL] when the partition schema does not match table schema, it throws java.lang.ClassCastException and so on

2018-03-16 Thread Dongjoon Hyun (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-5498:
-
Affects Version/s: (was: 2.2.0)

> [SPARK-SQL] when the partition schema does not match table schema, it throws 
> java.lang.ClassCastException and so on
> -
>
> Key: SPARK-5498
> URL: https://issues.apache.org/jira/browse/SPARK-5498
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.0
>Reporter: jeanlyn
>Assignee: jeanlyn
>Priority: Major
> Fix For: 1.4.0
>

[jira] [Updated] (SPARK-5498) [SPARK-SQL] when the partition schema does not match table schema, it throws java.lang.ClassCastException and so on

2018-03-16 Thread Dongjoon Hyun (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-5498:
-
Fix Version/s: (was: 3.0.0)

> [SPARK-SQL] when the partition schema does not match table schema, it throws 
> java.lang.ClassCastException and so on
> -
>
> Key: SPARK-5498
> URL: https://issues.apache.org/jira/browse/SPARK-5498
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.0
>Reporter: jeanlyn
>Assignee: jeanlyn
>Priority: Major
> Fix For: 1.4.0
>

[jira] [Updated] (SPARK-5498) [SPARK-SQL] when the partition schema does not match table schema, it throws java.lang.ClassCastException and so on

2015-04-25 Thread Sean Owen (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-5498:
-
Assignee: jeanlyn

> [SPARK-SQL] when the partition schema does not match table schema, it throws 
> java.lang.ClassCastException and so on
> -
>
> Key: SPARK-5498
> URL: https://issues.apache.org/jira/browse/SPARK-5498
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.0
>Reporter: jeanlyn
>Assignee: jeanlyn
> Fix For: 1.4.0
>