[ 
https://issues.apache.org/jira/browse/SPARK-40253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yihangqiao updated SPARK-40253:
-------------------------------
    Issue Type: Bug  (was: Improvement)

>  Data read exception in ORC format
> ----------------------------------
>
>                 Key: SPARK-40253
>                 URL: https://issues.apache.org/jira/browse/SPARK-40253
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.3
>         Environment: os centos7
> spark 2.4.3
> hive 1.2.1
> hadoop 2.7.2
>            Reporter: yihangqiao
>            Priority: Major
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When running batches with spark-sql and using the create table xxx as select 
> syntax, the select part of the query uses a static value as a default value 
> (e.g. 0.00 as column_name) without specifying the data type of that value. 
> Because the data type is not explicitly specified, the type metadata for this 
> field in the written ORC file is missing (the write itself succeeds). On 
> read, however, any query whose column list includes this field fails to 
> parse the ORC file with the following error:
> Caused by: java.io.EOFException: Read past end of RLE integer from compressed 
> stream Stream for column 1 kind SECONDARY position: 0 length: 0 range: 0 
> offset: 0 limit: 0
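
A minimal sketch of the pattern described in the report (table and column
names are placeholders, and the CAST-based workaround is an assumption about
how to avoid the untyped literal, not something stated in the report):

```sql
-- Triggers the issue on the affected version: the literal 0.00 has no
-- explicit type, so the ORC writer records incomplete type metadata for
-- the column (hypothetical table/column names).
CREATE TABLE t_out AS
SELECT id, 0.00 AS amount FROM t_src;

-- Any read that includes `amount` then fails while decoding the ORC file:
--   java.io.EOFException: Read past end of RLE integer from compressed stream ...
SELECT amount FROM t_out;

-- Workaround sketch: give the literal an explicit type so the ORC
-- metadata is written completely.
CREATE TABLE t_out_fixed AS
SELECT id, CAST(0.00 AS DECIMAL(10, 2)) AS amount FROM t_src;
```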



--
This message was sent by Atlassian Jira
(v8.20.10#820010)