Re: [PR] ORC-1567: Remove orc-tools restriction on orc suffix name [orc]

via GitHub Thu, 04 Jan 2024 07:40:55 -0800


cxzl25 commented on PR #1722:
URL: https://github.com/apache/orc/pull/1722#issuecomment-1877310082


   If we don't use the ORC data source and fall back to the Hive 
implementation, the generated file names have no orc suffix.
   
   ```bash
   ./bin/spark-sql
   ```
   ```sql
   set spark.sql.hive.convertMetastoreCtas=false;
   create table tmp_orc stored as orcfile as select id from range(1);
   ```
   
   ```bash
   ls -al spark-warehouse/tmp_orc
   total 24
   drwxr-xr-x@ 6 csy  staff  192  1  4 23:34 .
   drwxr-xr-x@ 4 csy  staff  128  1  4 23:34 ..
   -rwxr-xr-x@ 1 csy  staff    8  1  4 23:34 .part-00000-5888ab4a-8773-4376-a320-0cd6f4df5889-c000.crc
   -rwxr-xr-x@ 1 csy  staff   12  1  4 23:34 .part-00011-5888ab4a-8773-4376-a320-0cd6f4df5889-c000.crc
   -rwxr-xr-x@ 1 csy  staff    0  1  4 23:34 part-00000-5888ab4a-8773-4376-a320-0cd6f4df5889-c000
   -rwxr-xr-x@ 1 csy  staff  202  1  4 23:34 part-00011-5888ab4a-8773-4376-a320-0cd6f4df5889-c000
   ```
   
   
   Can we do this: if the input is a single file, skip the orc suffix check? 
   That would match `FileDump`, which does not require the file to have an 
orc suffix.
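
To make the suggestion concrete, here is a minimal sketch of the rule I have in mind. It is written in Python for brevity and is not the orc-tools code; the function name `collect_inputs` and the `suffix` parameter are hypothetical, and it assumes that a directory input should still be filtered by suffix.

```python
import os


def collect_inputs(path, suffix=".orc"):
    """Return the files to read for `path` under the proposed rule.

    Hypothetical helper, not part of orc-tools.
    """
    if os.path.isfile(path):
        # A single, explicitly named file is accepted whatever its name.
        return [path]

    # For a directory, keep the existing behaviour: only files with the suffix.
    found = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            if name.endswith(suffix):
                found.append(os.path.join(root, name))
    return sorted(found)
```

With this rule, passing a single suffix-less part file such as the ones listed above would be accepted, while a directory scan would still skip the `.crc` files and anything else without the suffix.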


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
