[GitHub] [arrow] yli1994 commented on issue #14726: pq.read_table("parquet files path", memory_map=True) still consume large memory space(200G file cost 200G memory and slow)

GitBox Thu, 01 Dec 2022 18:37:11 -0800


yli1994 commented on issue #14726:
URL: https://github.com/apache/arrow/issues/14726#issuecomment-1334690052


   > If you want to reduce memory usage when reading a file, you should not
   > read it as an entire table, but as a sequence of batches. See here:
   > https://arrow.apache.org/docs/python/parquet.html#finer-grained-reading-and-writing
   
   LMDB (which uses memory mapping; I am not sure how LMDB's mmap differs from
Arrow's mmap) can read directly from disk. Thanks a lot for your answer!


