zanmato1984 commented on issue #44513:
URL: https://github.com/apache/arrow/issues/44513#issuecomment-2556438531
I can now confirm that the problem does exist.
By applying a filter and a sum to the join result, I found that my previous
non-segfault runs were false positives:
```
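import pyarrow.compute as pc

# `small` and `large` are the two input tables of the reproduction (defined elsewhere).
join = small.join(large, keys=['ID_DEV_STYLECOLOR_SIZE',
'ID_DEPARTMENT', 'ID_COLLECTION'], join_type='left outer')
print("join size: {0}".format(join.num_rows))
# Select the rows for one specific key pair from the join result.
cond = pc.and_(pc.equal(join['ID_DEV_STYLECOLOR_SIZE'], 88506230299),
pc.equal(join['ID_DEPARTMENT'], 16556030299))
filtered = join.filter(cond)
print("filtered")
print(filtered)
# Sum PL_VALUE over the whole join result (named `total` to avoid shadowing the builtin `sum`).
total = pc.sum(join['PL_VALUE'])
print("sum")
print(total)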
```
Result:
```
filtered: PL_VALUE: [[null]]
...
sum: 33609597 # Another run emits 33609997
```
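Since the sum differs from run to run, one way to make the check repeatable is to run the same computation several times and count the distinct results. The following is only a minimal sketch: `distinct_results`, `is_deterministic`, and `join_and_sum` are hypothetical helper names, not part of pyarrow, and the commented usage assumes the `small` and `large` tables from this issue are already loaded.

```python
from collections import Counter
from typing import Callable, Hashable


def distinct_results(run_once: Callable[[], Hashable], runs: int = 5) -> Counter:
    """Call run_once() `runs` times and count how often each result occurs."""
    return Counter(run_once() for _ in range(runs))


def is_deterministic(run_once: Callable[[], Hashable], runs: int = 5) -> bool:
    """Return True if every one of the `runs` calls produced the same result."""
    return len(distinct_results(run_once, runs)) == 1


# Hypothetical usage with the tables from this issue (requires pyarrow):
#
#   import pyarrow.compute as pc
#
#   keys = ['ID_DEV_STYLECOLOR_SIZE', 'ID_DEPARTMENT', 'ID_COLLECTION']
#
#   def join_and_sum():
#       joined = small.join(large, keys=keys, join_type='left outer')
#       return pc.sum(joined['PL_VALUE']).as_py()
#
#   print(distinct_results(join_and_sum))
```

More than one key in the returned counter would indicate that the join result is not stable across runs, even when no segfault occurs.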
I also happen to have access to an x86 Ubuntu desktop, on which I
reproduced the segfault.
I'm now digging into it.
Also, because some platforms silently return wrong answers, I'm marking
this issue as critical.
Thanks a lot @kolfild26 for helping me reproduce the issue!