kubat-square-sense commented on issue #45236:
URL: https://github.com/apache/arrow/issues/45236#issuecomment-2589990638

   Maybe you had a different pyarrow build? 
   
   I get the same result when reproducing it with a Dockerfile:
   ```Dockerfile
   FROM python:3.12
   
   RUN pip install pyarrow==18.1.0 memray
   
   COPY test.parquet test.parquet
   
   RUN echo "import pyarrow.parquet as pq\ndata = pq.read_table('test.parquet')\nprint(data.nbytes / 1024**2)" > /test.py
   
   RUN memray run --native -o report.bin test.py 
   
   CMD memray stats report.bin
   ```
   
   
   `docker build --tag test_image .`
   `docker run test_image`
   
   Running the above with the current pyarrow release (18.1.0) yields:
   
         
         📏 Total allocations:
                 75319
         
         📦 Total memory allocated:
                 1.476GB
         
         📊 Histogram of allocation size:
                 min: 1.000B
                 --------------------------------------------
                 < 6.000B   :  4688 ▇▇▇▇▇
                 < 36.000B  : 22577 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
                 < 222.000B : 28911 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
                 < 1.319KB  : 16851 ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇
                 < 7.999KB  :  1397 ▇▇
                 < 48.503KB :   831 ▇
                 < 294.066KB:    34 ▇
                 < 1.741MB  :     7 ▇
                 < 10.556MB :     0 
                 <=64.000MB :    23 ▇
                 --------------------------------------------
                 max: 64.000MB
         
         📂 Allocator type distribution:
                  MALLOC: 71282
                  CALLOC: 3149
                  REALLOC: 838
                  MMAP: 50
         
         🥇 Top 5 largest allocating locations (by size):
                 - <stack trace unavailable> -> 1.378GB
                 - _call_with_frames_removed:<frozen importlib._bootstrap>:488 -> 84.912MB
                 - dedent:/usr/local/lib/python3.12/textwrap.py:436 -> 4.572MB
                 - sub:/usr/local/lib/python3.12/re/__init__.py:186 -> 3.808MB
                 - dedent:/usr/local/lib/python3.12/textwrap.py:435 -> 1.117MB
         
         🥇 Top 5 largest allocating locations (by number of allocations):
                 - <stack trace unavailable> -> 29539
                 - _call_with_frames_removed:<frozen importlib._bootstrap>:488 -> 17261
                 - <module>:test.py:3 -> 5130
                 - sub:/usr/local/lib/python3.12/re/__init__.py:186 -> 4586
                 - dedent:/usr/local/lib/python3.12/textwrap.py:436 -> 4382
         
   
   Changing to pyarrow==17.0.0 yields:
   
         📦 Total memory allocated:
                 219.388MB
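
To put the two totals side by side, here is a small sketch that converts the memray figures to bytes and computes their ratio. It assumes memray reports sizes in binary units (1 GB = 1024 MB); that is an assumption on my part, and `to_bytes` is just an illustrative helper, not part of memray or pyarrow. With decimal units the ratio would be slightly lower, but either way it is roughly a sevenfold difference.

```python
# Sizes are copied from the two memray reports above.
# Assumption: memray uses binary units (1 KB = 1024 B, and so on).
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def to_bytes(size: str) -> float:
    """Convert a memray-style size string such as '1.476GB' to bytes."""
    # Check two-letter suffixes first so 'GB' is not mistaken for 'B'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if size.endswith(suffix):
            return float(size[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognized size: {size!r}")


allocated_18_1_0 = to_bytes("1.476GB")    # pyarrow==18.1.0
allocated_17_0_0 = to_bytes("219.388MB")  # pyarrow==17.0.0

ratio = allocated_18_1_0 / allocated_17_0_0
print(f"18.1.0 allocated about {ratio:.1f}x as much as 17.0.0")
```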
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
