[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type
[ https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652892#comment-14652892 ]

Cheng Hao commented on SPARK-7119:
----------------------------------

[~marmbrus] This is actually a bug fix, and it has blocked the BigBench testing for quite a long time. Since the code is ready (to be reviewed), can we add it back to the 1.5 target list?

> ScriptTransform doesn't consider the output data type
> -----------------------------------------------------
>
>                 Key: SPARK-7119
>                 URL: https://issues.apache.org/jira/browse/SPARK-7119
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0, 1.3.1, 1.4.0
>            Reporter: Cheng Hao
>            Priority: Critical
>
> {code:sql}
> from (from src select transform(key, value) using 'cat' as (thing1 int, thing2 string)) t select thing1 + 2;
> {code}
> {noformat}
> 15/04/24 00:58:55 ERROR CliDriver: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassCastException: org.apache.spark.sql.types.UTF8String cannot be cast to java.lang.Integer
>     at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
>     at scala.math.Numeric$IntIsIntegral$.plus(Numeric.scala:57)
>     at org.apache.spark.sql.catalyst.expressions.Add.eval(arithmetic.scala:127)
>     at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:118)
>     at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:68)
>     at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:52)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>     at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>     at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>     at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>     at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>     at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>     at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>     at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:819)
>     at org.apache.spark.rdd.RDD$$anonfun$17.apply(RDD.scala:819)
>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1618)
>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1618)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>     at org.apache.spark.scheduler.Task.run(Task.scala:64)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:209)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:722)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type
[ https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619672#comment-14619672 ]

Yi Zhou commented on SPARK-7119:
--------------------------------

This issue blocks Spark SQL queries that use scriptTransform, so hopefully it can be fixed in 1.5.0.
[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type
[ https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573889#comment-14573889 ]

zhichao-li commented on SPARK-7119:
-----------------------------------

This workaround query can be executed correctly and there's a simple fix for this issue by the way :)
[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type
[ https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573888#comment-14573888 ]

zhichao-li commented on SPARK-7119:
-----------------------------------

This workaround query can be executed correctly and there's a simple fix for this issue by the way :)
[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type
[ https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572368#comment-14572368 ]

Apache Spark commented on SPARK-7119:
-------------------------------------

User 'zhichao-li' has created a pull request for this issue:
https://github.com/apache/spark/pull/6638
[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type
[ https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562938#comment-14562938 ]

Yin Huai commented on SPARK-7119:
---------------------------------

[~chenghao] Can you try?
{{from (from src select transform(key, value) using 'cat' as (thing1 string, thing2 string)) t select cast(thing1 as int) + 2;}}
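For readers following the thread, here is a rough illustration of why the query Yin Huai suggests should behave differently from the reported one. A script such as 'cat' only ever emits text, so every output field comes back as a string, and a declared type like "thing1 int" only takes effect if something actually casts the value. The Python sketch below mimics that idea. It is not Spark's ScriptTransformation code: the names CASTS and transform and the sample rows are made up for this example, and it assumes a POSIX 'cat' on the PATH.

```python
import subprocess

# Hypothetical mapping from declared column type names to converters.
CASTS = {"int": int, "string": str}


def transform(rows, command, schema):
    """Pipe tab-separated rows through `command`, then cast each output
    field to the type declared in `schema` (a list of (name, type) pairs)."""
    stdin = "".join("\t".join(str(v) for v in row) + "\n" for row in rows)
    out = subprocess.run(
        command, input=stdin, capture_output=True, text=True, check=True
    ).stdout
    result = []
    for line in out.splitlines():
        fields = line.split("\t")  # every field is a str at this point
        result.append(
            tuple(
                CASTS[type_name](field)
                for (_name, type_name), field in zip(schema, fields)
            )
        )
    return result


# Sample rows standing in for the (key, value) columns of `src`.
rows = [(238, "val_238"), (86, "val_86")]
typed = transform(rows, ["cat"], [("thing1", "int"), ("thing2", "string")])
print([thing1 + 2 for thing1, _ in typed])
```

Without the cast step, thing1 would still be the string "238" and "thing1 + 2" would fail with a type error, which is the Python analogue of the ClassCastException in the report.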
[jira] [Commented] (SPARK-7119) ScriptTransform doesn't consider the output data type
[ https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510940#comment-14510940 ]

Apache Spark commented on SPARK-7119:
-------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/5688

> ScriptTransform doesn't consider the output data type
> -----------------------------------------------------
>
>                 Key: SPARK-7119
>                 URL: https://issues.apache.org/jira/browse/SPARK-7119
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0, 1.3.1, 1.4.0
>            Reporter: Cheng Hao
>            Priority: Blocker