Re: SparkSQL exception on spark.sql.codegen
Hi Eric and Michael,

I ran into this problem with Spark 1.4.1 too. The error stack is:

java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$
    at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:180)
    at org.apache.spark.sql.execution.Filter.conditionEvaluator$lzycompute(basicOperators.scala:55)
    at org.apache.spark.sql.execution.Filter.conditionEvaluator(basicOperators.scala:55)
    at org.apache.spark.sql.execution.Filter$$anonfun$2.apply(basicOperators.scala:58)
    at org.apache.spark.sql.execution.Filter$$anonfun$2.apply(basicOperators.scala:57)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

java.lang.AssertionError: assertion failed: List(package expressions, package expressions)
    at scala.reflect.internal.Symbols$Symbol.suchThat(Symbols.scala:1678)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:44)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
    at scala.reflect.internal.Mirrors$RootsBase.staticModuleOrClass(Mirrors.scala:72)
    at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:161)
    at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:21)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$typecreator1$1.apply(CodeGenerator.scala:46)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231)
    at scala.reflect.api.TypeTags$class.typeOf(TypeTags.scala:335)
    at scala.reflect.api.Universe.typeOf(Universe.scala:59)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.<init>(CodeGenerator.scala:46)
    at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$.<init>(GeneratePredicate.scala:25)
    at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$.<clinit>(GeneratePredicate.scala)
    at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:180)
    at org.apache.spark.sql.execution.Filter.conditionEvaluator$lzycompute(basicOperators.scala:55)
    at org.apache.spark.sql.execution.Filter.conditionEvaluator(basicOperators.scala:55)
    at org.apache.spark.sql.execution.Filter$$anonfun$2.apply(basicOperators.scala:58)
    at org.apache.spark.sql.execution.Filter$$anonfun$2.apply(basicOperators.scala:57)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
Re: SparkSQL exception on spark.sql.codegen
Those are probably related. It looks like we are somehow not being thread-safe when initializing various parts of the Scala compiler. Since codegen is pretty experimental, we probably won't have the resources to investigate backporting a fix. However, if you can reproduce the problem in Spark 1.2, then please file a JIRA.

On Mon, Nov 17, 2014 at 9:37 PM, Eric Zhen zhpeng...@gmail.com wrote:
Yes, it always appears on a subset of the tasks in a stage (i.e. 100/100 (65 failed)), and sometimes causes the stage to fail. And there is another error that I'm not sure is related.
[stack trace trimmed]
On Tue, Nov 18, 2014 at 11:41 AM, Michael Armbrust mich...@databricks.com wrote:
Interesting, I believe we have run that query with version 1.1.0 with codegen turned on and not much has changed there. Is the error deterministic?
On Mon, Nov 17, 2014 at 7:04 PM, Eric Zhen zhpeng...@gmail.com wrote:
Hi Michael, we use Spark v1.1.1-rc1 with JDK 1.7.0_51 and Scala 2.10.4.
On Tue, Nov 18, 2014 at 7:09 AM, Michael Armbrust mich...@databricks.com wrote:
What version of Spark SQL?
On Sat, Nov 15, 2014 at 10:25 PM, Eric Zhen zhpeng...@gmail.com wrote:
Hi all, we ran SparkSQL on TPC-DS benchmark Q19 with spark.sql.codegen=true and got the exceptions below; has anyone else seen these before?
[stack trace trimmed]
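The pairing of java.lang.ExceptionInInitializerError with the later "NoClassDefFoundError: Could not initialize class GeneratePredicate$" is standard JVM behavior, which is why the two errors are likely related: when a class's static initializer throws, the JVM marks the class as failed, and every subsequent use of it on that JVM raises NoClassDefFoundError. A minimal sketch with hypothetical class names (not Spark's actual code):

```java
// Demonstrates JLS 12.4.2: a failed static initializer throws
// ExceptionInInitializerError once, then the class stays "erroneous"
// and later uses throw NoClassDefFoundError: Could not initialize class ...
public class InitFailureDemo {
    static class FlakyCodegen {                     // hypothetical stand-in for GeneratePredicate$
        static final Object GENERATOR = create();   // runs during class initialization
        static Object create() {
            // simulate the NullPointerException seen inside scala.reflect
            throw new NullPointerException("simulated init failure");
        }
    }

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= 2; attempt++) {
            try {
                Object g = FlakyCodegen.GENERATOR;  // first use triggers <clinit>
            } catch (Throwable t) {
                System.out.println("attempt " + attempt + ": " + t.getClass().getName());
            }
        }
        // prints:
        // attempt 1: java.lang.ExceptionInInitializerError
        // attempt 2: java.lang.NoClassDefFoundError
    }
}
```

So the NoClassDefFoundError on most tasks is probably just fallout from the first task that hit the real failure inside the Scala reflection code.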
Re: SparkSQL exception on spark.sql.codegen
Okay, thank you Michael.

On Wed, Nov 19, 2014 at 3:45 AM, Michael Armbrust mich...@databricks.com wrote:
Those are probably related. It looks like we are somehow not being thread safe when initializing various parts of the scala compiler. Since code gen is pretty experimental we probably won't have the resources to investigate backporting a fix. However, if you can reproduce the problem in Spark 1.2 then please file a JIRA.
[earlier quoted messages and stack traces trimmed]
Re: SparkSQL exception on spark.sql.codegen
What version of Spark SQL?

On Sat, Nov 15, 2014 at 10:25 PM, Eric Zhen zhpeng...@gmail.com wrote:
Hi all, we ran SparkSQL on TPC-DS benchmark Q19 with spark.sql.codegen=true and got the exceptions below; has anyone else seen these before?

java.lang.ExceptionInInitializerError
    at org.apache.spark.sql.execution.SparkPlan.newProjection(SparkPlan.scala:92)
    at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1$$anonfun$1.apply(Exchange.scala:51)
    at org.apache.spark.sql.execution.Exchange$$anonfun$execute$1$$anonfun$1.apply(Exchange.scala:48)
    at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
    at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:54)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.NullPointerException
    at scala.reflect.internal.Types$TypeRef.computeHashCode(Types.scala:2358)
    at scala.reflect.internal.Types$UniqueType.<init>(Types.scala:1304)
    at scala.reflect.internal.Types$TypeRef.<init>(Types.scala:2341)
    at scala.reflect.internal.Types$NoArgsTypeRef.<init>(Types.scala:2137)
    at scala.reflect.internal.Types$TypeRef$$anon$6.<init>(Types.scala:2544)
    at scala.reflect.internal.Types$TypeRef$.apply(Types.scala:2544)
    at scala.reflect.internal.Types$class.typeRef(Types.scala:3615)
    at scala.reflect.internal.SymbolTable.typeRef(SymbolTable.scala:13)
    at scala.reflect.internal.Symbols$TypeSymbol.newTypeRef(Symbols.scala:2752)
    at scala.reflect.internal.Symbols$TypeSymbol.typeConstructor(Symbols.scala:2806)
    at scala.reflect.internal.Symbols$SymbolContextApiImpl.toTypeConstructor(Symbols.scala:103)
    at scala.reflect.internal.Symbols$TypeSymbol.toTypeConstructor(Symbols.scala:2698)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$typecreator1$1.apply(CodeGenerator.scala:46)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231)
    at scala.reflect.api.TypeTags$class.typeOf(TypeTags.scala:335)
    at scala.reflect.api.Universe.typeOf(Universe.scala:59)
    at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.<init>(CodeGenerator.scala:46)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateProjection$.<init>(GenerateProjection.scala:29)
    at org.apache.spark.sql.catalyst.expressions.codegen.GenerateProjection$.<clinit>(GenerateProjection.scala)
    ... 15 more

-- Best Regards
Re: SparkSQL exception on spark.sql.codegen
Hi Michael, we use Spark v1.1.1-rc1 with JDK 1.7.0_51 and Scala 2.10.4.

On Tue, Nov 18, 2014 at 7:09 AM, Michael Armbrust mich...@databricks.com wrote:
What version of Spark SQL?
On Sat, Nov 15, 2014 at 10:25 PM, Eric Zhen zhpeng...@gmail.com wrote:
Hi all, we ran SparkSQL on TPC-DS benchmark Q19 with spark.sql.codegen=true and got the exceptions below; has anyone else seen these before?
[stack trace trimmed]

-- Best Regards
Re: SparkSQL exception on spark.sql.codegen
Yes, it always appears on a subset of the tasks in a stage (i.e. 100/100 (65 failed)), and sometimes causes the stage to fail. And there is another error that I'm not sure is related:

java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$
    at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.Filter.conditionEvaluator$lzycompute(basicOperators.scala:55)
    at org.apache.spark.sql.execution.Filter.conditionEvaluator(basicOperators.scala:55)
    at org.apache.spark.sql.execution.Filter$$anonfun$2.apply(basicOperators.scala:58)
    at org.apache.spark.sql.execution.Filter$$anonfun$2.apply(basicOperators.scala:57)
    at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
    at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:54)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

On Tue, Nov 18, 2014 at 11:41 AM, Michael Armbrust mich...@databricks.com wrote:
Interesting, I believe we have run that query with version 1.1.0 with codegen turned on and not much has changed there. Is the error deterministic?
On Mon, Nov 17, 2014 at 7:04 PM, Eric Zhen zhpeng...@gmail.com wrote:
Hi Michael, we use Spark v1.1.1-rc1 with JDK 1.7.0_51 and Scala 2.10.4.
On Tue, Nov 18, 2014 at 7:09 AM, Michael Armbrust mich...@databricks.com wrote:
What version of Spark SQL?
On Sat, Nov 15, 2014 at 10:25 PM, Eric Zhen zhpeng...@gmail.com wrote:
Hi all, we ran SparkSQL on TPC-DS benchmark Q19 with spark.sql.codegen=true and got the exceptions below; has anyone else seen these before?
[stack trace trimmed]
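The intermittent, per-task nature of the failures fits the thread-safety diagnosis above: several executor threads race into the same lazy initialization and some observe partially built state. For contrast, a sketch of a lazy-initialization pattern that the JVM does make thread-safe, the initialization-on-demand holder idiom (all names here are hypothetical, not Spark's actual code):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: the JVM runs a class's static initializer exactly once, under the
// class-initialization lock, so a holder class gives safe lazy construction
// even when many threads request the instance at the same time.
public class HolderDemo {
    static final class Predicate {
        final String compiled = "compiled-predicate"; // stands in for expensive generated code
    }

    private static final class PredicateHolder {
        static final Predicate INSTANCE = new Predicate(); // built once, on first use of get()
    }

    static Predicate get() { return PredicateHolder.INSTANCE; }

    public static void main(String[] args) throws InterruptedException {
        Set<Predicate> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> seen.add(get())); // all threads race into get()
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println("distinct instances: " + seen.size()); // prints: distinct instances: 1
    }
}
```

The catch in this thread is that the race is apparently inside the Scala reflection universe itself, not in a single lazily built object, so the fix is not this simple; the sketch only illustrates the guarantee that the failing code path lacked.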