bkietz commented on pull request #9685:
URL: https://github.com/apache/arrow/pull/9685#issuecomment-799485312


   If we want to support detection of compression, that would require a fairly
significant change to this PR. As written, compression is a property of the
FileFormat, which is not mutated (even during discovery). Thus we couldn't look
at (for example) the `.gz` extension on provided file sources and switch from
"CSV" to "gzipped CSV". Compression-as-FileFormat-property paints us into a
corner with respect to guessing compression.
   
   Adding discovery of file formats would give us a place to put this 
functionality, but that's a larger change and definitely out of scope here.
   
   If guessing compression ever becomes a priority, I'd recommend removing
compression-as-property and instead writing `Result<shared_ptr<InputStream>>
FileSource::OpenCompressed(optional<Compression::type> = {})` (without an
explicit compression type, it would guess which codec to use). This could
replace usage of `FileSource::Open` in `file_csv.cc:OpenReader`.
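
   To make the "explicit type wins, otherwise guess" behavior concrete, here is
a rough standalone sketch of the resolution step only. It does not use Arrow
headers: `CompressionType`, `GuessCompression` and `ResolveCompression` are
hypothetical names invented for illustration, and the extension table is an
assumption rather than anything this PR defines. The real `OpenCompressed`
would additionally wrap the opened stream in a decompressing `InputStream`.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for arrow::Compression::type (assumption, not the
// real enum); only a handful of codecs are listed.
enum class CompressionType { UNCOMPRESSED, GZIP, BZ2, ZSTD, LZ4 };

// Guess a codec from the final extension of a path. Anything unrecognized,
// or a path with no extension, is treated as uncompressed.
inline CompressionType GuessCompression(std::string_view path) {
  const auto dot = path.rfind('.');
  const auto slash = path.find_last_of("/\\");
  // No dot at all, or the last dot belongs to a directory name.
  if (dot == std::string_view::npos ||
      (slash != std::string_view::npos && dot < slash)) {
    return CompressionType::UNCOMPRESSED;
  }
  const std::string_view ext = path.substr(dot + 1);
  if (ext == "gz") return CompressionType::GZIP;
  if (ext == "bz2") return CompressionType::BZ2;
  if (ext == "zst") return CompressionType::ZSTD;
  if (ext == "lz4") return CompressionType::LZ4;
  return CompressionType::UNCOMPRESSED;
}

// Mirrors the optional parameter of the proposed OpenCompressed: an explicit
// codec is used as given, otherwise we fall back to guessing from the path.
inline CompressionType ResolveCompression(
    std::string_view path,
    std::optional<CompressionType> explicit_type = std::nullopt) {
  if (explicit_type.has_value()) return *explicit_type;
  return GuessCompression(path);
}
```

   With this shape, `ResolveCompression("data.csv.gz")` would pick gzip, while
a caller who knows better can still pass a codec explicitly. The point is that
the decision is made per file source at open time, so the FileFormat itself
never needs to be mutated.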


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

