[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139969#comment-16139969 ]
Otis Smart edited comment on SPARK-16845 at 8/24/17 12:31 PM:
--------------------------------------------------------------

Hello!

1. I encounter a similar issue (see the text below) on PySpark 2.2 (e.g., a DataFrame with ~50000 rows x 1100+ columns as input to the ".fit()" method of a CrossValidator() that wraps a Pipeline() containing StringIndexer(), VectorAssembler() and DecisionTreeClassifier()).

2. Was the aforementioned patch (i.e., the fix in https://github.com/apache/spark/pull/15480) not included in the latest release? What are the cause (source) of, and the solution to, this persistent issue, please?

py4j.protocol.Py4JJavaError: An error occurred while calling o9396.fit.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 38 in stage 18.0 failed 4 times, most recent failure: Lost task 38.3 in stage 18.0 (TID 1996, ip-10-0-14-83.ec2.internal, executor 4): java.util.concurrent.ExecutionException: java.lang.Exception: failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of method "compare(Lorg/apache/spark/sql/catalyst/InternalRow;Lorg/apache/spark/sql/catalyst/InternalRow;)I" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB
/* 001 */ public SpecificOrdering generate(Object[] references) {
/* 002 */   return new SpecificOrdering(references);
/* 003 */ }
/* 004 */
/* 005 */ class SpecificOrdering extends org.apache.spark.sql.catalyst.expressions.codegen.BaseOrdering {
/* 006 */
/* 007 */   private Object[] references;
/* 008 */
/* 009 */
/* 010 */   public SpecificOrdering(Object[] references) {
/* 011 */     this.references = references;
/* 012 */
/* 013 */   }
/* 014 */
/* 015 */
/* 016 */
/* 017 */   public int compare(InternalRow a, InternalRow b) {
/* 018 */     InternalRow i = null;  // Holds current row being evaluated.
/* 019 */
/* 020 */     i = a;
/* 021 */     boolean isNullA;
/* 022 */     double primitiveA;
/* 023 */     {
/* 024 */
/* 025 */       double value = i.getDouble(0);
/* 026 */       isNullA = false;
/* 027 */       primitiveA = value;
/* 028 */     }
/* 029 */     i = b;
/* 030 */     boolean isNullB;
/* 031 */     double primitiveB;
/* 032 */     {
/* 033 */
/* 034 */       double value = i.getDouble(0);
/* 035 */       isNullB = false;
/* 036 */       primitiveB = value;
/* 037 */     }
/* 038 */     if (isNullA && isNullB) {
/* 039 */       // Nothing
/* 040 */     } else if (isNullA) {
/* 041 */       return -1;
/* 042 */     } else if (isNullB) {
/* 043 */       return 1;
/* 044 */     } else {
/* 045 */       int comp = org.apache.spark.util.Utils.nanSafeCompareDoubles(primitiveA, primitiveB);
/* 046 */       if (comp != 0) {
/* 047 */         return comp;
/* 048 */       }
/* 049 */     }
/* 050 */
/* 051 */

> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering"
> grows beyond 64 KB
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-16845
>                 URL: https://issues.apache.org/jira/browse/SPARK-16845
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: hejie
>            Assignee: Liwei Lin
>             Fix For: 1.6.4, 2.0.3, 2.1.1, 2.2.0
>
>         Attachments: error.txt.zip
>
>
> I have a wide table (400 columns), when I try fitting the traindata on all
> columns, the fatal error occurs.
> ...
> 46 more
> Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method
> "(Lorg/apache/spark/sql/catalyst/InternalRow;Lorg/apache/spark/sql/catalyst/InternalRow;)I"
> of class
> "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering"
> grows beyond 64 KB
>         at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
>         at org.codehaus.janino.CodeContext.write(CodeContext.java:854)
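
For readers trying to follow the dump: the generated compare() method appears to repeat one null-check-plus-compare block per sort column, so on a very wide table its body can exceed the JVM's 64 KB limit on the bytecode of a single method. The following is only a rough Python sketch of that per-column logic and of the general idea of splitting the work into smaller helpers. It is not Spark's code and not necessarily what the patch does. The names Row, compare_column, compare_rows, compare_rows_chunked and chunk_size are hypothetical, and the final "return 0" is an assumption, since the dump above is cut off before the end of the method.

```python
import math
from typing import Optional, Sequence

# Hypothetical stand-in for InternalRow: one optional double per column.
Row = Sequence[Optional[float]]


def nan_safe_compare_doubles(x: float, y: float) -> int:
    """NaN compares equal to NaN and greater than any other value."""
    x_nan, y_nan = math.isnan(x), math.isnan(y)
    if (x_nan and y_nan) or x == y:
        return 0
    if x_nan:
        return 1
    if y_nan:
        return -1
    return 1 if x > y else -1


def compare_column(a: Row, b: Row, ordinal: int) -> int:
    """One per-column block as in the dump: nulls sort first, then a NaN-safe compare."""
    va, vb = a[ordinal], b[ordinal]
    if va is None and vb is None:
        return 0  # the "// Nothing" branch: fall through to the next column
    if va is None:
        return -1
    if vb is None:
        return 1
    return nan_safe_compare_doubles(va, vb)


def compare_rows(a: Row, b: Row) -> int:
    """Monolithic form: every column is handled inside one function body."""
    for ordinal in range(len(a)):
        comp = compare_column(a, b, ordinal)
        if comp != 0:
            return comp
    return 0  # assumption: rows equal on all columns compare as equal


def compare_rows_chunked(a: Row, b: Row, chunk_size: int = 100) -> int:
    """Split form: each group of columns is delegated to a smaller helper call."""

    def compare_chunk(start: int, stop: int) -> int:
        for ordinal in range(start, stop):
            comp = compare_column(a, b, ordinal)
            if comp != 0:
                return comp
        return 0

    for start in range(0, len(a), chunk_size):
        comp = compare_chunk(start, min(start + chunk_size, len(a)))
        if comp != 0:
            return comp
    return 0
```

Both forms return the same ordering. In generated Java the difference would matter because the 64 KB limit applies per method, so several small helpers can compile where one large method cannot.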