mapleFU commented on issue #38245: URL: https://github.com/apache/arrow/issues/38245#issuecomment-1761548949
@rdbisme Not every 170 MB file will consume that much memory, but your case seems to match this pattern well. Suppose the uncompressed data for each column is `k` MiB. After compression it might shrink to `0.5k` MiB, and the footer might take another `0.05k` MiB. Now, when reading, because the file has only one page per column, the reader has to:

1. Read the raw data, which takes roughly `0.55k` MiB for the whole file.
2. Decompress the page, which produces about `1k` MiB of decompressed data.
3. Decode it to Arrow, which can allocate roughly another `1k` MiB.
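
As a rough illustration of the accounting above (this is just a sketch of the arithmetic in the comment, not code from Arrow; the `0.5` compression ratio and `0.05` footer ratio are the same illustrative assumptions used in the explanation):

```python
def estimated_peak_read_mib(uncompressed_mib: float,
                            compression_ratio: float = 0.5,
                            footer_ratio: float = 0.05) -> float:
    """Rough peak memory (MiB) when reading single-page Parquet columns."""
    compressed = uncompressed_mib * compression_ratio   # on-disk page bytes
    footer = uncompressed_mib * footer_ratio             # file metadata
    raw_read = compressed + footer                       # step 1: read bytes (~0.55k)
    decompressed = uncompressed_mib                      # step 2: decompress pages (~1k)
    decoded = uncompressed_mib                            # step 3: decode to Arrow (~1k)
    return raw_read + decompressed + decoded

# e.g. k = 100 MiB of uncompressed data -> roughly 255 MiB at peak
print(estimated_peak_read_mib(100.0))
```

So the peak footprint can be on the order of `2.5k` MiB or more, which lines up with the behaviour you are seeing.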
