[jira] [Commented] (SPARK-9255) Timestamp handling incorrect for Spark 1.4.1 on Linux
[ https://issues.apache.org/jira/browse/SPARK-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640048#comment-14640048 ]

Sean Owen commented on SPARK-9255:
----------------------------------

Can you clarify how you build and run your Spark cluster, and how you build your app? For example, are you building your app and possibly including Spark code in it?

Timestamp handling incorrect for Spark 1.4.1 on Linux
-----------------------------------------------------

Key: SPARK-9255
URL: https://issues.apache.org/jira/browse/SPARK-9255
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.4.1
Environment: Redhat Linux, Java 8.0 and Spark 1.4.1 release.
Reporter: Paul Wu
Attachments: timestamp_bug.zip

This is a very strange case involving timestamps. I can run the program on Windows using the dev pom.xml (1.4.1), or the 1.4.1 or 1.3.0 release downloaded from Apache, without issues. But when I run it on Redhat Linux with the Spark 1.4.1 release, either downloaded from Apache or built with Scala 2.11, it fails with the following error (the code I used is after this stack trace):

15/07/22 12:02:50 ERROR Executor 96: Exception in task 0.0 in stage 0.0 (TID 0)
java.util.concurrent.ExecutionException: scala.tools.reflect.ToolBoxError: reflective compilation has failed: value is not a member of TimestampType.this.InternalType
    at org.spark-project.guava.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
    at org.spark-project.guava.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
    at org.spark-project.guava.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at org.spark-project.guava.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
    at org.spark-project.guava.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410)
    at org.spark-project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2380)
    at org.spark-project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
    at org.spark-project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
    at org.spark-project.guava.cache.LocalCache.get(LocalCache.java:4000)
    at org.spark-project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
    at org.spark-project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:105)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:102)
    at org.apache.spark.sql.execution.SparkPlan.newMutableProjection(SparkPlan.scala:170)
    at org.apache.spark.sql.execution.GeneratedAggregate$$anonfun$9.apply(GeneratedAggregate.scala:261)
    at org.apache.spark.sql.execution.GeneratedAggregate$$anonfun$9.apply(GeneratedAggregate.scala:246)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: scala.tools.reflect.ToolBoxError: reflective compilation has failed: value is not a member of TimestampType.this.InternalType
    at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$ToolBoxGlobal.throwIfErrors(ToolBoxFactory.scala:316)
    at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$ToolBoxGlobal.wrapInPackageAndCompile(ToolBoxFactory.scala:198)
    at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$ToolBoxGlobal.compile(ToolBoxFactory.scala:252)
    at scala.tools.reflect.ToolBoxFactory$ToolBoxImpl$$anonfun$compile$2.apply(ToolBoxFactory.scala:429)
    at
[jira] [Commented] (SPARK-9255) Timestamp handling incorrect for Spark 1.4.1 on Linux
[ https://issues.apache.org/jira/browse/SPARK-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641175#comment-14641175 ]

Paul Wu commented on SPARK-9255:
--------------------------------

Related: https://issues.apache.org/jira/browse/SPARK-9058, but I doubt it is the same issue. If you look at the sample code, it has almost no aggregations at all. However, the fix for https://issues.apache.org/jira/browse/SPARK-9058 may also fix this issue. I guess you can test it.

Timestamp handling incorrect for Spark 1.4.1 on Linux
-----------------------------------------------------

Key: SPARK-9255
URL: https://issues.apache.org/jira/browse/SPARK-9255
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.4.0, 1.4.1
Environment: Redhat Linux, Java 8.0 and Spark 1.4.1 release.
Reporter: Paul Wu
Attachments: timestamp_bug2.zip, tstest

Updates: This issue is caused by the following config:

spark.sql.codegen true

If this parameter is set to false, the problem does not happen. The bug was introduced in 1.4.0. Releases 1.3.0 and 1.3.1 do not have this issue.
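As a rough sketch of the workaround described in the update above, code generation can be switched off in the Spark configuration. The property name spark.sql.codegen and the value false come from the update; the file location conf/spark-defaults.conf is the usual place for Spark defaults and is an assumption about this particular deployment.

```
# conf/spark-defaults.conf (assumed location; adjust for your installation)
# Disable Spark SQL runtime code generation, per the update above.
spark.sql.codegen    false
```

The same property can usually be set per job with `spark-submit --conf spark.sql.codegen=false`, or at runtime with `sqlContext.setConf("spark.sql.codegen", "false")`. Whether either route avoids the error in a given setup has not been verified here.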
[jira] [Commented] (SPARK-9255) Timestamp handling incorrect for Spark 1.4.1 on Linux
[ https://issues.apache.org/jira/browse/SPARK-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639949#comment-14639949 ]

Paul Wu commented on SPARK-9255:
--------------------------------

[~srowen] I don't think it is due to a version difference: the same code runs correctly on release 1.3.0 on Redhat Linux. This bug was introduced after 1.3.0.
[jira] [Commented] (SPARK-9255) Timestamp handling incorrect for Spark 1.4.1 on Linux
[ https://issues.apache.org/jira/browse/SPARK-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637600#comment-14637600 ]

Sean Owen commented on SPARK-9255:
----------------------------------

This looks like the kind of problem you get when you mix up code compiled against different Spark or Scala versions. It sounds from your description like that's what you're doing, too.