convoi opened a new issue, #36158:
URL: https://github.com/apache/arrow/issues/36158
### Describe the bug, including details regarding any error messages, version, and platform.
Fixed-size binary arrays with nulls allocate the full fixed-length size of memory for each null.

This demonstrates the issue:
```python
import pyarrow as pa

length = 1_000_000  # 1 million
nulls = 1000        # 1k nulls
x = [b'0' * length] + [None] * nulls  # 1 value of 1 MB, and 1000 nulls
arr = pa.array(x, pa.binary(length))
print(f"allocated: {pa.total_allocated_bytes()}")  # should print about 1 MB but prints 1001 * 1 MB
```
The same thing happens when I load Parquet data with fixed-size binary columns containing many nulls. In practice it makes those files unreadable, as memory consumption is off the charts.
### Component(s)
Python