[ 
https://issues.apache.org/jira/browse/SPARK-13431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159649#comment-15159649
 ] 

Kazuaki Ishizaki commented on SPARK-13431:
------------------------------------------

I identified why this problem occurs only with Maven. The Maven Shade plugin 
increases the bytecode length of a method, because it rewrites the Java 
bytecode when it rebuilds the constant pool.

Here is the output of ``javap -c SparkSqlParser_ExpressionParser.class`` before 
applying the Shade plugin. The static initializer ``static{}`` uses the ``ldc`` 
instruction to access the constant pool at offsets 13, 18, and 23. Each ``ldc`` 
takes only two bytes. As a result, the bytecode length of this method is 
*less than 65536*.
{code}
public class 
org.apache.spark.sql.catalyst.parser.SparkSqlParser_ExpressionParser extends 
org.antlr.runtime.Parser {
...
  static {};
    Code:
       0: bipush        70
       2: anewarray     #1035               // class java/lang/String
       5: dup
       6: iconst_0
       7: ldc_w         #1036               // String ...
      10: aastore
      11: dup
      12: iconst_1
      13: ldc           #127                // String
      15: aastore
      16: dup
      17: iconst_2
      18: ldc           #127                // String
      20: aastore
      21: dup
      22: iconst_3
      23: ldc           #127                // String
      25: aastore
      ...
   59900: return
  }
}
{code}

After applying the Shade plugin, the static initializer ``static{}`` uses the 
``ldc_w`` instruction to access the constant pool at offsets 13, 19, and 25. 
Each ``ldc_w`` takes three bytes. As a result, the bytecode length of this 
method is *more than 65535*.
{code}
  static {};
    Code:
       0: bipush        70
       2: anewarray     #2965               // class java/lang/String
       5: dup
       6: iconst_0
       7: ldc_w         #5240               // String ...
      10: aastore
      11: dup
      12: iconst_1
      13: ldc_w         #2924               // String
      16: aastore
      17: dup
      18: iconst_2
      19: ldc_w         #2924               // String
      22: aastore
      23: dup
      24: iconst_3
      25: ldc_w         #2924               // String
      28: aastore
      ...
   65533: lconst_0
   65534: lastore
      ...

  }
}
{code}
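
To make the size difference concrete, here is a rough back-of-the-envelope sketch in Java. It assumes that widening ``ldc`` to ``ldc_w`` (one extra byte per instruction) is the only source of growth, and that the original method ends with the one-byte ``return`` at offset 59900 shown in the first listing. The class and method names are made up for illustration.

```java
final class LdcGrowthEstimate {
    // The JVM requires a method's code length to be less than 65536 bytes.
    static final int MAX_CODE_LENGTH = 65535;

    /** Code length after widening `widened` ldc instructions to ldc_w (+1 byte each). */
    static int lengthAfterWidening(int originalLength, int widened) {
        return originalLength + widened;
    }

    /** Smallest number of widened loads that pushes the method past the limit. */
    static int minWideningsToOverflow(int originalLength) {
        return Math.max(0, MAX_CODE_LENGTH - originalLength + 1);
    }

    public static void main(String[] args) {
        // Assumption: `return` at offset 59900 is the last instruction (1 byte).
        int before = 59900 + 1;
        int needed = minWideningsToOverflow(before);
        System.out.println("original length: " + before);
        System.out.println("widened loads needed to exceed the limit: " + needed);
        System.out.println("resulting length: " + lengthAfterWidening(before, needed));
    }
}
```

Under these assumptions, a few thousand widened string loads in the static initializer are enough to cross the limit.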

The Shade plugin seems to rebuild the constant pool, based on [this 
comment|http://svn.apache.org/viewvc/maven/plugins/tags/maven-shade-plugin-2.4.3/src/main/java/org/apache/maven/plugins/shade/DefaultShader.java?view=markup#l417].
 Because this class defines many Strings and therefore has a large constant 
pool, the rebuild may move entries to higher indices. ``ldc`` carries a 
one-byte index and can only reference entries up to 255 (such as #127 before 
shading), so entries with larger indices (such as #2924 after shading) must be 
loaded with ``ldc_w``. As a result, ``ldc`` is replaced with ``ldc_w`` and the 
Java bytecode becomes longer.
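
The choice between the two instructions can be sketched as below. This is only an illustration of the encoding rule for single-word constants such as Strings (``ldc`` has a one-byte operand, ``ldc_w`` a two-byte operand), not code from the Shade plugin or ASM. The class and method names are hypothetical.

```java
final class LoadConstantEncoding {
    // ldc encodes the constant pool index in one unsigned byte.
    static final int MAX_LDC_INDEX = 0xFF;
    // ldc_w encodes the index in two unsigned bytes.
    static final int MAX_POOL_INDEX = 0xFFFF;

    private static void checkIndex(int constantPoolIndex) {
        if (constantPoolIndex < 1 || constantPoolIndex > MAX_POOL_INDEX) {
            throw new IllegalArgumentException("invalid constant pool index: " + constantPoolIndex);
        }
    }

    /** Instruction needed to load a String constant at the given index. */
    static String opcodeFor(int constantPoolIndex) {
        checkIndex(constantPoolIndex);
        return constantPoolIndex <= MAX_LDC_INDEX ? "ldc" : "ldc_w";
    }

    /** Instruction size: one opcode byte plus a one- or two-byte index. */
    static int sizeInBytes(int constantPoolIndex) {
        checkIndex(constantPoolIndex);
        return constantPoolIndex <= MAX_LDC_INDEX ? 2 : 3;
    }

    public static void main(String[] args) {
        // Indices taken from the two javap listings: #127 before shading, #2924 after.
        for (int index : new int[] {127, 2924}) {
            System.out.println("#" + index + " -> " + opcodeFor(index)
                    + " (" + sizeInBytes(index) + " bytes)");
        }
    }
}
```

For the indices in the listings, #127 fits in ``ldc`` (2 bytes) while #2924 needs ``ldc_w`` (3 bytes), which matches the offsets advancing by 5 before shading and by 6 after.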

What should we do as a next step?
* Can we avoid this rebuild by an option?
* Can we create a pull request for shade plugin to avoid this?
* Can we use another plugin?
* Can we split ExpressionParser.g into smaller files?
* Other solutions?




> Maven build fails due to: Method code too large! in Catalyst
> ------------------------------------------------------------
>
>                 Key: SPARK-13431
>                 URL: https://issues.apache.org/jira/browse/SPARK-13431
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 2.0.0
>            Reporter: Stavros Kontopoulos
>            Priority: Blocker
>
> Cannot build the project when running the normal build commands, e.g.:
> {code}
> build/mvn -Phadoop-2.6 -Dhadoop.version=2.6.0  clean package
> ./make-distribution.sh --name test --tgz -Phadoop-2.6 
> {code}
> Integration builds are also failing: 
> https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.6/229/console
> https://ci.typesafe.com/job/mit-docker-test-zk-ref/12/console
> It looks like this is the commit that introduced the issue:
> https://github.com/apache/spark/commit/7925071280bfa1570435bde3e93492eaf2167d56



