[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028783#comment-17028783 ]

Frederik Schreiber commented on SPARK-25094:

Should this issue be linked to SPARK-22510?

> proccesNext() failed to compile size is over 64kb
> -------------------------------------------------
>
>                 Key: SPARK-25094
>                 URL: https://issues.apache.org/jira/browse/SPARK-25094
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Izek Greenfield
>            Priority: Major
>         Attachments: generated_code.txt
>
>
> I have this tree:
> 2018-08-12T07:14:31,289 WARN [] org.apache.spark.sql.execution.WholeStageCodegenExec - Whole-stage codegen disabled for plan (id=1):
> *(1) Project [, ... 10 more fields]
> +- *(1) Filter NOT exposure_calc_method#10141 IN (UNSETTLED_TRANSACTIONS,FREE_DELIVERIES)
>    +- InMemoryTableScan [, ... 11 more fields], [NOT exposure_calc_method#10141 IN (UNSETTLED_TRANSACTIONS,FREE_DELIVERIES)]
>          +- InMemoryRelation [, ... 80 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                +- *(5) SortMergeJoin [unique_id#8506], [unique_id#8722], Inner
>                   :- *(2) Sort [unique_id#8506 ASC NULLS FIRST], false, 0
>                   :  +- Exchange(coordinator id: 1456511137) UnknownPartitioning(9), coordinator[target post-shuffle partition size: 67108864]
>                   :     +- *(1) Project [, ... 6 more fields]
>                   :        +- *(1) Filter (isnotnull(v#49) && isnotnull(run_id#52)) && (asof_date#48 <=> 17531)) && (run_id#52 = DATA_REG)) && (v#49 = DATA_REG)) && isnotnull(unique_id#39))
>                   :           +- InMemoryTableScan [, ... 6 more fields], [, ... 6 more fields]
>                   :                 +- InMemoryRelation [, ... 6 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                   :                       +- *(1) FileScan csv [,... 6 more fields] , ... 6 more fields
>                   +- *(4) Sort [unique_id#8722 ASC NULLS FIRST], false, 0
>                      +- Exchange(coordinator id: 1456511137) UnknownPartitioning(9), coordinator[target post-shuffle partition size: 67108864]
>                         +- *(3) Project [, ... 74 more fields]
>                            +- *(3) Filter (((isnotnull(v#51) && (asof_date#42 <=> 17531)) && (v#51 = DATA_REG)) && isnotnull(unique_id#54))
>                               +- InMemoryTableScan [, ... 74 more fields], [, ... 4 more fields]
>                                     +- InMemoryRelation [, ... 74 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                                           +- *(1) FileScan csv [,... 74 more fields] , ... 6 more fields
> Compiling "GeneratedClass": Code of method "processNext()V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1" grows beyond 64 KB
> and the generated code failed to compile.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578348#comment-16578348 ]

Marco Gaido commented on SPARK-25094:
-------------------------------------

[~igreenfi] as I mentioned to you, this is a known issue. You found a TODO because it is currently not possible to implement that TODO. There is an ongoing effort to make it happen, but it is a huge effort, so it will take time. Thanks.
[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578275#comment-16578275 ]

Izek Greenfield commented on SPARK-25094:
-----------------------------------------

Looking at the code, the problem is here:

    def splitExpressionsWithCurrentInputs(
        expressions: Seq[String],
        funcName: String = "apply",
        extraArguments: Seq[(String, String)] = Nil,
        returnType: String = "void",
        makeSplitFunction: String => String = identity,
        foldFunctions: Seq[String] => String = _.mkString("", ";\n", ";")): String = {
      // TODO: support whole stage codegen
      if (INPUT_ROW == null || currentVars != null) {
        expressions.mkString("\n")
      } else {
        splitExpressions(
          expressions,
          funcName,
          ("InternalRow", INPUT_ROW) +: extraArguments,
          returnType,
          makeSplitFunction,
          foldFunctions)
      }
    }

It is the TODO section: under whole-stage codegen the first branch is taken, so the expressions are joined inline instead of being split into separate functions.
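To make the difference between the two branches of the quoted method concrete, here is a minimal, Spark-free sketch of the general idea: either join all generated snippets into one method body, or pack them into several helper methods and emit calls to those helpers. It is not Spark's implementation. The object `SplitSketch`, its methods, the character-count threshold and the generated method signature are all assumptions made for illustration.

```scala
// Simplified, Spark-free sketch. Every name in this object is hypothetical.
object SplitSketch {

  /** Greedily packs snippets into chunks whose total length stays within maxLen. */
  def chunk(snippets: Seq[String], maxLen: Int): Vector[Vector[String]] =
    snippets.foldLeft(Vector.empty[Vector[String]]) { (acc, snippet) =>
      acc.lastOption match {
        // The snippet still fits into the chunk currently being filled.
        case Some(last) if last.map(_.length).sum + snippet.length <= maxLen =>
          acc.init :+ (last :+ snippet)
        // First snippet, or the current chunk is full: open a new chunk.
        case _ =>
          acc :+ Vector(snippet)
      }
    }

  /** Split path: returns the helper method sources and the body that calls them. */
  def split(snippets: Seq[String], funcName: String, maxLen: Int): (Seq[String], String) = {
    val helpers = chunk(snippets, maxLen).zipWithIndex.map { case (body, idx) =>
      val bodyText = body.mkString("\n")
      s"private void ${funcName}_$idx(InternalRow row) {\n$bodyText\n}"
    }
    val calls = helpers.indices.map(idx => s"${funcName}_$idx(row);").mkString("\n")
    (helpers, calls)
  }

  /** Inline path: every snippet ends up in the calling method's body. */
  def inline(snippets: Seq[String]): String = snippets.mkString("\n")
}
```

With the inline path, the size of the calling method grows with the number of expressions, which is how a single method such as processNext() can run into the JVM's 64 KB per-method limit. With the split path, the caller only contains one short call per helper.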
[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578127#comment-16578127 ]

Izek Greenfield commented on SPARK-25094:
-----------------------------------------

The code that creates this plan is very complex. I will try to reproduce it with simpler code. In the meantime, I can attach the generated code so you can see the problem: the code generator does not create functions and instead inlines the whole plan into the processNext method.

[^generated_code.txt]
[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578114#comment-16578114 ]

Hyukjin Kwon commented on SPARK-25094:
--------------------------------------

It's more about code generation. It would be nicer if we knew what input produces the output described in the JIRA. Otherwise I would rather resolve this as Cannot Reproduce, since strictly speaking no one knows how to reproduce it.
[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578037#comment-16578037 ]

Izek Greenfield commented on SPARK-25094:
-----------------------------------------

[~hyukjin.kwon] Is the full plan OK?
[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577763#comment-16577763 ]

Hyukjin Kwon commented on SPARK-25094:
--------------------------------------

[~igreenfi], mind adding a reproducer so that we can verify and resolve this later?
[jira] [Commented] (SPARK-25094) proccesNext() failed to compile size is over 64kb
[ https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577593#comment-16577593 ]

Marco Gaido commented on SPARK-25094:
-------------------------------------

This duplicates many existing reports. Unfortunately the problem has not been solved yet, so in this case whole-stage code generation is disabled for the query. There is an ongoing effort that should make a fix possible in the future, though.
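For readers who want to see where this behaviour is controlled, the sketch below sets the session-level switch `spark.sql.codegen.wholeStage`, which turns whole-stage code generation off up front instead of leaving Spark to attempt it and then disable it for the failing plan. It assumes a Spark SQL dependency on the classpath and a local master. The object and application names are made up for illustration. Nobody in the thread recommends this setting, so treat it as an illustration of the configuration surface rather than as advice.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; requires spark-sql on the classpath.
object CodegenSettingsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")                               // assumption: local run
      .appName("codegen-settings-sketch")               // hypothetical app name
      .config("spark.sql.codegen.wholeStage", "false")  // skip whole-stage codegen
      .getOrCreate()

    // Read the setting back to confirm what the session is using.
    println(spark.conf.get("spark.sql.codegen.wholeStage"))

    spark.stop()
  }
}
```

The same key can also be changed on a running session with `spark.conf.set`. Whether disabling whole-stage codegen is acceptable for a given workload is not something the thread addresses.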