[jira] [Commented] (SPARK-8443) GenerateMutableProjection Exceeds JVM Code Size Limits

2016-11-14 Thread Barry Becker (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664779#comment-15664779
 ] 

Barry Becker commented on SPARK-8443:
-

I see the same error in spark 1.6.3. Is there a workaround? Should this issue 
be reopened? I am working with a dataset with more than 200 columns. Before 
1.6.3 I could not even try this case because of SPARK-16664.
{code}
Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method 
"(Lorg/apache/spark/sql/catalyst/InternalRow;Lorg/apache/spark/sql/catalyst/InternalRow;)I"
 of class 
"org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" 
grows beyond 64 KB
at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
at org.codehaus.janino.CodeContext.write(CodeContext.java:836)
{code}

> GenerateMutableProjection Exceeds JVM Code Size Limits
> --
>
> Key: SPARK-8443
> URL: https://issues.apache.org/jira/browse/SPARK-8443
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.0
>Reporter: Sen Fang
>Assignee: Sen Fang
> Fix For: 1.5.0
>
>
> GenerateMutableProjection put all expressions columns into a single apply 
> function. When there are a lot of columns, the apply function code size 
> exceeds the 64kb limit, which is a hard limit on jvm that cannot change.
> This comes up when we were aggregating about 100 columns using codegen and 
> unsafe feature.
> I wrote an unit test that reproduces this issue. 
> https://github.com/saurfang/spark/blob/codegen_size_limit/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
> This test currently fails at 2048 expressions. It seems the master is more 
> tolerant than branch-1.4 about this because code is more concise.
> While the code on master has changed since branch-1.4, I am able to reproduce 
> the problem in master. For now I hacked my way in branch-1.4 to workaround 
> this problem by wrapping each expression with a separate function then call 
> those functions sequentially in apply. The proper way is probably check the 
> length of the projectCode and break it up as necessary. (This seems to be 
> easier in master actually since we are building code by string rather than 
> quasiquote)
> Let me know if anyone has additional thoughts on this, I'm happy to 
> contribute a pull request.
> Attaching stack trace produced by unit test
> {code}
> [info] - code size limit *** FAILED *** (7 seconds, 103 milliseconds)
> [info]   com.google.common.util.concurrent.UncheckedExecutionException: 
> org.codehaus.janino.JaninoRuntimeException: Code of method 
> "(Ljava/lang/Object;)Ljava/lang/Object;" of class "SC$SpecificProjection" 
> grows beyond 64 KB
> [info]   at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2263)
> [info]   at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
> [info]   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
> [info]   at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:285)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:50)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:48)
> [info]   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
> [info]   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
> [info]   at scala.collection.immutable.Range.foreach(Range.scala:141)
> [info]   at 
> scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
> [info]   at 
> scala.collection.AbstractTraversable.foldLeft(Traversable.scala:105)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply$mcV$sp(CodeGenerationSuite.scala:47)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
> [info]   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
> [info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
> [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
> [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:

[jira] [Commented] (SPARK-8443) GenerateMutableProjection Exceeds JVM Code Size Limits

2016-07-20 Thread Hayri Volkan Agun (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385667#comment-15385667
 ] 

Hayri Volkan Agun commented on SPARK-8443:
--

The same issue for large sql syntax with a lot of unions and iterations over 
dataframe at spark 1.6.2


Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method 
"(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass$SpecificUnsafeProjection;Lorg/apache/spark/sql/catalyst/InternalRow;)V"
 of class 
"org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection"
 grows beyond 64 KB
at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
at org.codehaus.janino.CodeContext.write(CodeContext.java:854)
at org.codehaus.janino.CodeContext.writeShort(CodeContext.java:959)
at 
org.codehaus.janino.UnitCompiler.writeConstantFieldrefInfo(UnitCompiler.java:10279)
at org.codehaus.janino.UnitCompiler.getfield(UnitCompiler.java:9946)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3322)
at org.codehaus.janino.UnitCompiler.access$8200(UnitCompiler.java:185)
at 
org.codehaus.janino.UnitCompiler$10.visitFieldAccess(UnitCompiler.java:3282)
at org.codehaus.janino.Java$FieldAccess.accept(Java.java:3232)
at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3290)
at 
org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4368)
at 
org.codehaus.janino.UnitCompiler.compileContext2(UnitCompiler.java:3190)
at org.codehaus.janino.UnitCompiler.access$5600(UnitCompiler.java:185)
at 
org.codehaus.janino.UnitCompiler$9.visitFieldAccess(UnitCompiler.java:3152)
at org.codehaus.janino.Java$FieldAccess.accept(Java.java:3232)
at 
org.codehaus.janino.UnitCompiler.compileContext(UnitCompiler.java:3160)
at 
org.codehaus.janino.UnitCompiler.compileContext2(UnitCompiler.java:3172)
at org.codehaus.janino.UnitCompiler.access$5400(UnitCompiler.java:185)
at 
org.codehaus.janino.UnitCompiler$9.visitAmbiguousName(UnitCompiler.java:3150)
at org.codehaus.janino.Java$AmbiguousName.accept(Java.java:3138)
at 
org.codehaus.janino.UnitCompiler.compileContext(UnitCompiler.java:3160)
at 
org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4367)
at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3975)
at org.codehaus.janino.UnitCompiler.access$6900(UnitCompiler.java:185)
at 
org.codehaus.janino.UnitCompiler$10.visitMethodInvocation(UnitCompiler.java:3263)
at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:3974)
at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3290)
at 
org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4368)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2662)
at org.codehaus.janino.UnitCompiler.access$4400(UnitCompiler.java:185)
at 
org.codehaus.janino.UnitCompiler$7.visitMethodInvocation(UnitCompiler.java:2627)
at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:3974)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2654)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1643)
at org.codehaus.janino.UnitCompiler.access$1100(UnitCompiler.java:185)
at 
org.codehaus.janino.UnitCompiler$4.visitExpressionStatement(UnitCompiler.java:936)
at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2097)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:958)
at 
org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1007)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:993)
at org.codehaus.janino.UnitCompiler.access$1000(UnitCompiler.java:185)
at org.codehaus.janino.UnitCompiler$4.visitBlock(UnitCompiler.java:935)
at org.codehaus.janino.Java$Block.accept(Java.java:2012)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:958)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:1742)
at org.codehaus.janino.UnitCompiler.access$1200(UnitCompiler.java:185)
at 
org.codehaus.janino.UnitCompiler$4.visitIfStatement(UnitCompiler.java:937)
at org.codehaus.janino.Java$IfStatement.accept(Java.java:2157)
at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:958)
at 
org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1007)
at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:993)
at org.codehaus.janino.UnitCompiler.access$1000(UnitCompiler.java:185)
at org.codehaus.janino.UnitCompiler$4.visitBlock(UnitCompiler.java:935)
at org.codehaus.janino.Java$Block.accept(Java.java:20

[jira] [Commented] (SPARK-8443) GenerateMutableProjection Exceeds JVM Code Size Limits

2015-06-28 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605028#comment-14605028
 ] 

Apache Spark commented on SPARK-8443:
-

User 'saurfang' has created a pull request for this issue:
https://github.com/apache/spark/pull/7076

> GenerateMutableProjection Exceeds JVM Code Size Limits
> --
>
> Key: SPARK-8443
> URL: https://issues.apache.org/jira/browse/SPARK-8443
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.0
>Reporter: Sen Fang
>
> GenerateMutableProjection put all expressions columns into a single apply 
> function. When there are a lot of columns, the apply function code size 
> exceeds the 64kb limit, which is a hard limit on jvm that cannot change.
> This comes up when we were aggregating about 100 columns using codegen and 
> unsafe feature.
> I wrote an unit test that reproduces this issue. 
> https://github.com/saurfang/spark/blob/codegen_size_limit/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
> This test currently fails at 2048 expressions. It seems the master is more 
> tolerant than branch-1.4 about this because code is more concise.
> While the code on master has changed since branch-1.4, I am able to reproduce 
> the problem in master. For now I hacked my way in branch-1.4 to workaround 
> this problem by wrapping each expression with a separate function then call 
> those functions sequentially in apply. The proper way is probably check the 
> length of the projectCode and break it up as necessary. (This seems to be 
> easier in master actually since we are building code by string rather than 
> quasiquote)
> Let me know if anyone has additional thoughts on this, I'm happy to 
> contribute a pull request.
> Attaching stack trace produced by unit test
> {code}
> [info] - code size limit *** FAILED *** (7 seconds, 103 milliseconds)
> [info]   com.google.common.util.concurrent.UncheckedExecutionException: 
> org.codehaus.janino.JaninoRuntimeException: Code of method 
> "(Ljava/lang/Object;)Ljava/lang/Object;" of class "SC$SpecificProjection" 
> grows beyond 64 KB
> [info]   at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2263)
> [info]   at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
> [info]   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
> [info]   at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:285)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:50)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(CodeGenerationSuite.scala:48)
> [info]   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
> [info]   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:144)
> [info]   at scala.collection.immutable.Range.foreach(Range.scala:141)
> [info]   at 
> scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
> [info]   at 
> scala.collection.AbstractTraversable.foldLeft(Traversable.scala:105)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply$mcV$sp(CodeGenerationSuite.scala:47)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
> [info]   at 
> org.apache.spark.sql.catalyst.expressions.CodeGenerationSuite$$anonfun$2.apply(CodeGenerationSuite.scala:47)
> [info]   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
> [info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
> [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
> [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
> [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42)
> [info]   at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
> [info]   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
> [info]   at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
> [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
> [info]   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
> [info]   at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
> [info]   at 
> org.scalate