zanmato1984 commented on issue #44513:
URL: https://github.com/apache/arrow/issues/44513#issuecomment-2550475588
Hi @kolfild26, I ran the case successfully on my local machine (M1 MBP with
32GB memory, arrow 18.1.0) but couldn't reproduce the issue.
My Python script:
```
import pandas
import pickle
import pyarrow


def main():
    print("pandas: {0}, pyarrow: {1}".format(pandas.__version__,
                                             pyarrow.__version__))
    with open('small.pkl', 'rb') as f:
        small = pickle.load(f)
    with open('large.pkl', 'rb') as f:
        large = pickle.load(f)
    print("small size: {0}, large size: {1}".format(small.num_rows,
                                                    large.num_rows))
    join = small.join(large,
                      keys=['ID_DEV_STYLECOLOR_SIZE', 'ID_DEPARTMENT',
                            'ID_COLLECTION'],
                      join_type='left outer')
    print("join size: {0}".format(join.num_rows))


if __name__ == "__main__":
    main()
```
Result:
```
python test.py
pandas: 2.2.3, pyarrow: 18.1.0
small size: 18201475, large size: 360449051
join size: 18201475
```
Did I miss something?