[ https://issues.apache.org/jira/browse/SPARK-15327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davies Liu resolved SPARK-15327.
--------------------------------
    Resolution: Fixed
 Fix Version/s: 2.0.0

Issue resolved by pull request 13235
[https://github.com/apache/spark/pull/13235]

> Catalyst code generation fails with complex data structure
> -----------------------------------------------------------
>
>                 Key: SPARK-15327
>                 URL: https://issues.apache.org/jira/browse/SPARK-15327
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Jurriaan Pruis
>            Assignee: Davies Liu
>             Fix For: 2.0.0
>
>         Attachments: full_exception.txt
>
>
> Spark code generation fails with the following error when loading Parquet files with a complex structure:
> {code}
> : java.util.concurrent.ExecutionException: java.lang.Exception: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 158, Column 16: Expression "scan_isNull" is not an rvalue
> {code}
> The generated code around line 158 looks like:
> {code}
> /* 153 */     this.scan_arrayWriter23 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeArrayWriter();
> /* 154 */     this.scan_rowWriter40 = new org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(scan_holder, 1);
> /* 155 */   }
> /* 156 */
> /* 157 */   private void scan_apply_0(InternalRow scan_row) {
> /* 158 */     if (scan_isNull) {
> /* 159 */       scan_rowWriter.setNullAt(0);
> /* 160 */     } else {
> /* 161 */       // Remember the current cursor so that we can calculate how many bytes are
> /* 162 */       // written later.
> /* 163 */       final int scan_tmpCursor = scan_holder.cursor;
> /* 164 */
> {code}
> How to reproduce (PySpark):
> {code}
> # Some complex structure
> json = '{"h": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", "count": 3}], "b": [{"e": "test", "count": 1}]}}, "d": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", "count": 3}], "b": [{"e": "test", "count": 1}]}}, "c": {"b": {"c": [{"e": "adfgd"}], "a": [{"count": 3}], "b": [{"e": "test", "count": 1}]}}, "a": {"b": {"c": [{"e": "adfgd"}], "a": [{"count": 3}], "b": [{"e": "test", "count": 1}]}}, "e": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", "count": 3}], "b": [{"e": "test", "count": 1}]}}, "g": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", "count": 3}], "b": [{"e": "test", "count": 1}]}}, "f": {"b": {"c": [{"e": "adfgd"}], "a": [{"e": "testing", "count": 3}], "b": [{"e": "test", "count": 1}]}}, "b": {"b": {"c": [{"e": "adfgd"}], "a": [{"count": 3}], "b": [{"e": "test", "count": 1}]}}}'
>
> # Write the structure to a Parquet file
> sqlContext.read.json(sparkContext.parallelize([json])).write.mode('overwrite').parquet('test')
>
> # Read it back from the Parquet file (this raises the exception above)
> sqlContext.read.parquet('test').collect()
> {code}
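The compile error suggests that scan_isNull is referenced inside the split-out scan_apply_0 method without being visible there, which points at the whole-stage code generation path. As a hedged workaround sketch only, not something stated in this ticket or in pull request 13235: if the failure is confined to whole-stage codegen, disabling that feature for the session may let the same read go through until a build containing the fix is available. This assumes the spark.sql.codegen.wholeStage setting is honored on the affected 2.0.0 build.

{code}
# Workaround sketch (assumption, not from this ticket): turn off whole-stage
# code generation so the Parquet scan avoids the failing generated code path.
sqlContext.setConf('spark.sql.codegen.wholeStage', 'false')

# The read that previously triggered the compile exception.
sqlContext.read.parquet('test').collect()

# Re-enable whole-stage code generation afterwards.
sqlContext.setConf('spark.sql.codegen.wholeStage', 'true')
{code}

This only sidesteps the generated scan code; the actual fix is the one merged via the pull request above.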