cxzl25 commented on PR #1722: URL: https://github.com/apache/orc/pull/1722#issuecomment-1877310082
If we don't use Spark's ORC data source and use the Hive implementation instead, the generated file names have no `.orc` suffix.

```bash
./bin/spark-sql
```

```sql
set spark.sql.hive.convertMetastoreCtas=false;
create table tmp_orc stored as orcfile as select id from range(1);
```

```bash
ls -al spark-warehouse/tmp_orc
total 24
drwxr-xr-x@ 6 csy  staff  192  1  4 23:34 .
drwxr-xr-x@ 4 csy  staff  128  1  4 23:34 ..
-rwxr-xr-x@ 1 csy  staff    8  1  4 23:34 .part-00000-5888ab4a-8773-4376-a320-0cd6f4df5889-c000.crc
-rwxr-xr-x@ 1 csy  staff   12  1  4 23:34 .part-00011-5888ab4a-8773-4376-a320-0cd6f4df5889-c000.crc
-rwxr-xr-x@ 1 csy  staff    0  1  4 23:34 part-00000-5888ab4a-8773-4376-a320-0cd6f4df5889-c000
-rwxr-xr-x@ 1 csy  staff  202  1  4 23:34 part-00011-5888ab4a-8773-4376-a320-0cd6f4df5889-c000
```

Can we do the following: if the input is a single file, skip the `.orc` suffix check? This would match the behavior of `FileDump`, which does not require the file to have an `.orc` suffix.
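To make the suggestion concrete, here is a minimal sketch of the selection rule. It is not code from the PR or from Apache ORC: the class `OrcInputSelector` and the method `selectInputs` are hypothetical names, and it assumes "only one file" means the input path points at a regular file rather than a directory. It uses only `java.nio.file`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not part of Apache ORC.
public class OrcInputSelector {

    // Assumed rule: an explicitly named file is accepted whatever its name,
    // while a directory is still filtered by the ".orc" suffix.
    static List<Path> selectInputs(Path input) throws IOException {
        List<Path> result = new ArrayList<>();
        if (Files.isRegularFile(input)) {
            // Single-file input: skip the suffix check.
            result.add(input);
        } else if (Files.isDirectory(input)) {
            // Directory input: keep only regular files ending in ".orc".
            try (Stream<Path> children = Files.list(input)) {
                children.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".orc"))
                        .sorted()
                        .forEach(result::add);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tmp_orc");
        Path noSuffix = Files.createFile(dir.resolve("part-00011-c000"));
        Files.createFile(dir.resolve("part-00000.orc"));

        // The suffix-less file is accepted when passed directly.
        System.out.println(selectInputs(noSuffix));
        // The directory listing still keeps only "*.orc" files.
        System.out.println(selectInputs(dir));
    }
}
```

With this rule a directory written by the Hive path, like the one listed above, would still yield no inputs when passed as a directory; only naming a file directly bypasses the check.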
