[ https://issues.apache.org/jira/browse/SPARK-40253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
yihangqiao updated SPARK-40253:
-------------------------------
    Issue Type: Bug  (was: Improvement)

> Data read exception in orc format
> ----------------------------------
>
>                 Key: SPARK-40253
>                 URL: https://issues.apache.org/jira/browse/SPARK-40253
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.3
>         Environment: os: centos7
> spark 2.4.3
> hive 1.2.1
> hadoop 2.7.2
>            Reporter: yihangqiao
>            Priority: Major
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When running batches with spark-sql and creating tables with the CREATE TABLE xxx AS SELECT syntax, the SELECT part may use a static value as a column's default (e.g. 0.00 AS column_name) without specifying the data type of that value. Because the data type is not explicitly specified, the type metadata for that field is incomplete in the written ORC file (the write itself succeeds). On read, however, any query whose column list includes this field fails to parse the ORC file with the following error:
> Caused by: java.io.EOFException: Read past end of RLE integer from compressed
> stream Stream for column 1 kind SECONDARY position: 0 length: 0 range: 0
> offset: 0 limit: 0
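> For illustration, a minimal sketch of the pattern described above (table and column names are hypothetical, Hive-style CTAS is assumed, and the CAST-based variant is only a suggested mitigation, not a confirmed fix):
>
>   -- Write: the untyped literal 0.00 leaves the column's type metadata
>   -- incomplete in the written ORC file, although the write itself succeeds.
>   CREATE TABLE target_table STORED AS ORC AS
>   SELECT id, 0.00 AS column_name FROM source_table;
>
>   -- Read: any query that includes column_name then fails with
>   -- "java.io.EOFException: Read past end of RLE integer from compressed stream".
>   SELECT column_name FROM target_table;
>
>   -- Suggested mitigation: give the literal an explicit type so the ORC writer
>   -- records complete column metadata.
>   CREATE TABLE target_table_typed STORED AS ORC AS
>   SELECT id, CAST(0.00 AS DECIMAL(10, 2)) AS column_name FROM source_table;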