[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139969#comment-16139969 ]

Otis Smart edited comment on SPARK-16845 at 8/24/17 12:31 PM:
--------------------------------------------------------------

Hello!

1. I encounter a similar issue (see the stack trace below) on PySpark 2.2: a 
DataFrame with ~50,000 rows and 1,100+ columns passed to the ".fit()" method of 
a CrossValidator() wrapping a Pipeline() of StringIndexer(), VectorAssembler(), 
and DecisionTreeClassifier().

2. Was the aforementioned patch (https://github.com/apache/spark/pull/15480) 
not included in the latest release? What is the cause of this persistent issue, 
and is there a known solution or workaround, please?
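In case it helps others hitting the same 64 KB method limit: a mitigation that is often suggested for codegen-size failures (and which may or may not apply to this particular generated-ordering path; treat it as an experiment rather than a confirmed fix) is to disable whole-stage code generation via the `spark.sql.codegen.wholeStage` setting. The script name below is a placeholder for your own job:

```
spark-submit \
  --conf spark.sql.codegen.wholeStage=false \
  your_pipeline_script.py
```

This trades some execution speed for smaller generated methods, so it is only worth keeping if it actually makes the fit succeed.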

py4j.protocol.Py4JJavaError: An error occurred while calling o9396.fit.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 38 in 
stage 18.0 failed 4 times, most recent failure: Lost task 38.3 in stage 18.0 
(TID 1996, ip-10-0-14-83.ec2.internal, executor 4): 
java.util.concurrent.ExecutionException: java.lang.Exception: failed to 
compile: org.codehaus.janino.JaninoRuntimeException: Code of method 
"compare(Lorg/apache/spark/sql/catalyst/InternalRow;Lorg/apache/spark/sql/catalyst/InternalRow;)I"
 of class 
"org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" 
grows beyond 64 KB
/* 001 */ public SpecificOrdering generate(Object[] references) {
/* 002 */   return new SpecificOrdering(references);
/* 003 */ }
/* 004 */
/* 005 */ class SpecificOrdering extends 
org.apache.spark.sql.catalyst.expressions.codegen.BaseOrdering {
/* 006 */
/* 007 */   private Object[] references;
/* 008 */
/* 009 */
/* 010 */   public SpecificOrdering(Object[] references) {
/* 011 */     this.references = references;
/* 012 */
/* 013 */   }
/* 014 */
/* 015 */
/* 016 */
/* 017 */   public int compare(InternalRow a, InternalRow b) {
/* 018 */     InternalRow i = null;  // Holds current row being evaluated.
/* 019 */
/* 020 */     i = a;
/* 021 */     boolean isNullA;
/* 022 */     double primitiveA;
/* 023 */     {
/* 024 */
/* 025 */       double value = i.getDouble(0);
/* 026 */       isNullA = false;
/* 027 */       primitiveA = value;
/* 028 */     }
/* 029 */     i = b;
/* 030 */     boolean isNullB;
/* 031 */     double primitiveB;
/* 032 */     {
/* 033 */
/* 034 */       double value = i.getDouble(0);
/* 035 */       isNullB = false;
/* 036 */       primitiveB = value;
/* 037 */     }
/* 038 */     if (isNullA && isNullB) {
/* 039 */       // Nothing
/* 040 */     } else if (isNullA) {
/* 041 */       return -1;
/* 042 */     } else if (isNullB) {
/* 043 */       return 1;
/* 044 */     } else {
/* 045 */       int comp = 
org.apache.spark.util.Utils.nanSafeCompareDoubles(primitiveA, primitiveB);
/* 046 */       if (comp != 0) {
/* 047 */         return comp;
/* 048 */       }
/* 049 */     }
/* 050 */
/* 051 */



was (Author: otissmart):
Hello!

1. I encounter a similar issue (see below text) on Pyspark 2.2.

2. Was the aforementioned patch (aka 
fix(https://github.com/apache/spark/pull/15480) not included in the latest 
release; what are the reason and (source) of and solution to this persistent 
issue please?



> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" 
> grows beyond 64 KB
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-16845
>                 URL: https://issues.apache.org/jira/browse/SPARK-16845
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: hejie
>            Assignee: Liwei Lin
>             Fix For: 1.6.4, 2.0.3, 2.1.1, 2.2.0
>
>         Attachments: error.txt.zip
>
>
> I have a wide table (400 columns); when I try fitting the training data on all 
> columns, the fatal error occurs. 
>       ... 46 more
> Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method 
> "(Lorg/apache/spark/sql/catalyst/InternalRow;Lorg/apache/spark/sql/catalyst/InternalRow;)I"
>  of class 
> "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" 
> grows beyond 64 KB
>       at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
>       at org.codehaus.janino.CodeContext.write(CodeContext.java:854)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
