[jira] [Comment Edited] (SPARK-29561) Large Case Statement Code Generation OOM

2019-10-29 Thread Michael Chen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962361#comment-16962361
 ] 

Michael Chen edited comment on SPARK-29561 at 10/29/19 7:59 PM:


If I increase the memory, it runs into the "generated code grows beyond 64 KB" 
exception, which disables whole-stage code generation for the plan.
But if I increase the number of branches or the complexity of the branches, it 
just runs into the OOM problem again.
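
For reference, a minimal sketch of the kind of query that reproduces this. This 
is not the attached SQL; the column name {{id}}, the branch count, and the 
output path are made up here, and a SparkSession named {{spark}} is assumed:

{code:scala}
// Build a CASE expression with thousands of branches, then force codegen by
// writing the result out (mirroring the FileFormatWriter path in the trace).
val branches = (1 to 3000)
  .map(i => s"WHEN id = $i THEN 'value_$i'")
  .mkString(" ")

val df = spark.range(1000000).toDF("id")
  .selectExpr(s"CASE $branches ELSE 'other' END AS result")

df.write.mode("overwrite").parquet("/tmp/spark29561-repro")
{code}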


was (Author: mikechen):
If I increase the memory, it runs into the "generated code grows beyond 64 KB" 
exception, which disables whole-stage code generation for the plan. So that is 
ok.
But if I increase the number of branches or the complexity of the branches, it 
just runs into the OOM problem again.

> Large Case Statement Code Generation OOM
> 
>
> Key: SPARK-29561
> URL: https://issues.apache.org/jira/browse/SPARK-29561
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Michael Chen
>Priority: Major
> Attachments: apacheSparkCase.sql
>
>
> Spark configuration:
> spark.driver.memory = 1g
> spark.master = "local"
> spark.deploy.mode = "client"
> Try to execute a case statement with 3000+ branches (the SQL statement is 
> added as the attachment apacheSparkCase.sql).
> Spark runs for a while before it OOMs:
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:182)
>   at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1320)
>   at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:178)
>   at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:73)
> 19/10/22 16:19:54 ERROR FileFormatWriter: Aborting job null.
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at java.util.HashMap.newNode(HashMap.java:1750)
>   at java.util.HashMap.putVal(HashMap.java:631)
>   at java.util.HashMap.putMapEntries(HashMap.java:515)
>   at java.util.HashMap.putAll(HashMap.java:785)
>   at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3345)
>   at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:212)
>   at org.codehaus.janino.UnitCompiler$8.visitLocalVariableDeclarationStatement(UnitCompiler.java:3230)
>   at org.codehaus.janino.UnitCompiler$8.visitLocalVariableDeclarationStatement(UnitCompiler.java:3198)
>   at org.codehaus.janino.Java$LocalVariableDeclarationStatement.accept(Java.java:3351)
>   at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3197)
>   at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3254)
>   at org.codehaus.janino.UnitCompiler.access$3900(UnitCompiler.java:212)
>   at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:3216)
>   at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:3198)
>   at org.codehaus.janino.Java$Block.accept(Java.java:2756)
>   at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3197)
>   at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3260)
>   at org.codehaus.janino.UnitCompiler.access$4000(UnitCompiler.java:212)
>   at org.codehaus.janino.UnitCompiler$8.visitDoStatement(UnitCompiler.java:3217)
>   at org.codehaus.janino.UnitCompiler$8.visitDoStatement(UnitCompiler.java:3198)
>   at org.codehaus.janino.Java$DoStatement.accept(Java.java:3304)
>   at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3197)
>   at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3186)
>   at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3009)
>   at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1336)
>   at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1309)
>   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:799)
>   at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:958)
>   at org.codehaus.janino.UnitCompiler.access$700(UnitCompiler.java:212)
>   at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:393)
>   at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:385)
>   at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1286)
> 19/10/22 16:19:54 ERROR Utils: throw uncaught fatal error in thread Spark Context Cleaner
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at 
> 

[jira] [Comment Edited] (SPARK-29561) Large Case Statement Code Generation OOM

2019-10-29 Thread Michael Chen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962361#comment-16962361
 ] 

Michael Chen edited comment on SPARK-29561 at 10/29/19 7:59 PM:


Yes, it works when I increase the driver memory. It just runs into the 
"generated code grows beyond 64 KB" exception and then disables whole-stage 
code generation for the plan.
But if I increase the number of branches or the complexity of the branches, it 
just runs into the OOM problem again.
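
For anyone hitting this, the relevant codegen fallback knobs can be set 
explicitly. A small sketch, assuming a SparkSession named {{spark}}; the 
config keys are real Spark SQL configs, but the values shown are only 
illustrative:

{code:scala}
// spark.sql.codegen.fallback (default true) lets Spark fall back to
// interpreted expression evaluation when generated code fails to compile,
// e.g. on the 64 KB method limit. Setting it false surfaces the compile
// error instead of silently falling back.
spark.conf.set("spark.sql.codegen.fallback", "true")

// Since Spark 2.3, whole-stage codegen is also skipped when a generated
// method's compiled bytecode exceeds spark.sql.codegen.hugeMethodLimit
// (default 65535); 8000 matches the JVM's JIT inlining threshold.
spark.conf.set("spark.sql.codegen.hugeMethodLimit", "8000")
{code}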


was (Author: mikechen):
If I increase the memory, it runs into the "generated code grows beyond 64 KB" 
exception, which disables whole-stage code generation for the plan.
But if I increase the number of branches or the complexity of the branches, it 
just runs into the OOM problem again.
