AudriusButkevicius commented on PR #41821: URL: https://github.com/apache/arrow/pull/41821#issuecomment-2144413426
My intended use for this is to reduce strain on the filesystem when reading large datasets (many files) from a network-attached filesystem, by reading the metadata file instead of many separate files. Unfortunately, I also have a hard requirement for encryption, as the data is sensitive. It would be amazing if this worked with encrypted datasets, assuming the key is the same. I would also be OK with storing the metadata in plaintext, performing fragment filtering based on row-group stats, and then re-reading and decrypting the footers of only the chosen files.
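
To illustrate the fallback described in the comment (a plaintext index, filtering on row-group stats, then decrypting only the chosen footers), here is a minimal sketch in plain Python. It does not use the pyarrow API; `RowGroupStats`, `select_files`, the file names, and the integer min/max values are all hypothetical and exist only to show the pruning step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowGroupStats:
    """Min/max of one column in one row group (hypothetical plaintext index entry)."""
    path: str
    min_value: int
    max_value: int


def select_files(index, lower, upper):
    """Return paths with at least one row group whose range overlaps [lower, upper]."""
    selected = []
    for stats in index:
        overlaps = stats.min_value <= upper and stats.max_value >= lower
        if overlaps and stats.path not in selected:
            selected.append(stats.path)
    return selected


# Hypothetical plaintext index: one entry per row group, across three files.
index = [
    RowGroupStats("part-0.parquet", 0, 99),
    RowGroupStats("part-0.parquet", 100, 199),
    RowGroupStats("part-1.parquet", 200, 299),
    RowGroupStats("part-2.parquet", 300, 399),
]

# Only these files would then have their encrypted footers read and decrypted.
chosen = select_files(index, lower=150, upper=250)
print(chosen)
```

With this layout, a network filesystem would see one read for the index plus one footer read per selected file, rather than one footer read per file in the dataset.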
