[jira] [Commented] (SPARK-33172) Spark SQL CodeGenerator does not check for UserDefined type
[ https://issues.apache.org/jira/browse/SPARK-33172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17231841#comment-17231841 ]

Apache Spark commented on SPARK-33172:
--------------------------------------

User 'davidrabinowitz' has created a pull request for this issue:
https://github.com/apache/spark/pull/30372

> Spark SQL CodeGenerator does not check for UserDefined type
> -----------------------------------------------------------
>
>                 Key: SPARK-33172
>                 URL: https://issues.apache.org/jira/browse/SPARK-33172
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.4.7, 3.0.1
>            Reporter: David Rabinowitz
>            Priority: Minor
>
> The CodeGenerator takes the DataType given to {{getValueFromVector()}} as
> is and generates code based on that type. The generated code is not aware
> of the actual underlying type, and therefore cannot be compiled. For
> example, using a DataFrame with a Spark ML Vector (VectorUDT), the
> generated code is:
> {{InternalRow datasourcev2scan_value_2 = datasourcev2scan_isNull_2 ? null :
> (datasourcev2scan_mutableStateArray_2[2].getStruct(datasourcev2scan_rowIdx_0,
> 4));}}
> This leads to the following runtime error:
> {{20/10/14 13:20:51 ERROR CodeGenerator: failed to compile:
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line
> 153, Column 126: No applicable constructor/method found for actual parameters
> "int, int"; candidates are: "public
> org.apache.spark.sql.vectorized.ColumnarRow
> org.apache.spark.sql.vectorized.ColumnVector.getStruct(int)"}}
> {{ org.codehaus.commons.compiler.CompileException: File 'generated.java',
> Line 153, Column 126: No applicable constructor/method found for actual
> parameters "int, int"; candidates are: "public
> org.apache.spark.sql.vectorized.ColumnarRow
> org.apache.spark.sql.vectorized.ColumnVector.getStruct(int)"}}
> {{ at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12124)}}
> {{...}}
> Spark then enters an infinite loop of this error.
> The solution is quite simple: {{getValueFromVector()}} should match and
> handle UserDefinedType the same way {{CodeGenerator.javaType()}} does.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
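
The fix proposed in the issue description can be illustrated with a small standalone sketch. It is not Spark's code: the classes below are toy stand-ins, and the names `UdtUnwrapSketch`, `unwrap`, `getValue` and `getValueFromVectorUnchecked` are hypothetical. It also assumes that the two-argument `getStruct` call in the report comes from a row-style getter used as a fallback when the struct check does not see through the UDT wrapper; the issue text does not confirm that detail.

```java
public class UdtUnwrapSketch {
    // Toy stand-ins for Spark's DataType classes (hypothetical, not the real API).
    static class DataType {}
    static final class IntType extends DataType {}
    static final class StructType extends DataType {
        final int numFields;
        StructType(int numFields) { this.numFields = numFields; }
    }
    static final class UserDefinedType extends DataType {
        final DataType sqlType; // underlying storage type of the UDT
        UserDefinedType(DataType sqlType) { this.sqlType = sqlType; }
    }

    // Peel off UDT wrappers until the underlying storage type is reached.
    static DataType unwrap(DataType dt) {
        while (dt instanceof UserDefinedType) {
            dt = ((UserDefinedType) dt).sqlType;
        }
        return dt;
    }

    // Row-style getter: struct access takes (ordinal, numFields).
    static String getValue(String input, DataType dataType, String ordinal) {
        DataType dt = unwrap(dataType);
        if (dt instanceof IntType) {
            return input + ".getInt(" + ordinal + ")";
        }
        if (dt instanceof StructType) {
            return input + ".getStruct(" + ordinal + ", " + ((StructType) dt).numFields + ")";
        }
        throw new IllegalArgumentException("unsupported type in this sketch");
    }

    // Assumed shape of the bug: the struct check sees the UDT wrapper,
    // so the call falls through to the row-style getter.
    static String getValueFromVectorUnchecked(String vector, DataType dataType, String rowId) {
        if (dataType instanceof StructType) {
            return vector + ".getStruct(" + rowId + ")";
        }
        return getValue(vector, dataType, rowId);
    }

    // Sketch of the proposed fix: unwrap the UDT first, then dispatch.
    static String getValueFromVector(String vector, DataType dataType, String rowId) {
        DataType dt = unwrap(dataType);
        if (dt instanceof StructType) {
            return vector + ".getStruct(" + rowId + ")";
        }
        return getValue(vector, dt, rowId);
    }

    public static void main(String[] args) {
        // A UDT stored as a 4-field struct, mirroring the "4" in the reported code.
        DataType vectorLikeUdt = new UserDefinedType(new StructType(4));
        System.out.println(getValueFromVectorUnchecked("col", vectorLikeUdt, "rowIdx"));
        System.out.println(getValueFromVector("col", vectorLikeUdt, "rowIdx"));
    }
}
```

In this toy model the unchecked variant emits `col.getStruct(rowIdx, 4)`, the same two-argument shape the compiler rejects in the report, while the variant that unwraps the UDT first emits the single-argument `col.getStruct(rowIdx)`.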
[jira] [Commented] (SPARK-33172) Spark SQL CodeGenerator does not check for UserDefined type
[ https://issues.apache.org/jira/browse/SPARK-33172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215698#comment-17215698 ]

Apache Spark commented on SPARK-33172:
--------------------------------------

User 'davidrabinowitz' has created a pull request for this issue:
https://github.com/apache/spark/pull/30071