jorisvandenbossche commented on PR #36846:
URL: https://github.com/apache/arrow/pull/36846#issuecomment-1735229268
Comparing the current branch with numpy promotion rules:
```python
import itertools
import pyarrow as pa
import numpy as np
types = ["uint8", "uint16", "uint32", "uint64", "int8", "int16", "int32",
"int64", "float16", "float32", "float64"]
for t1, t2 in itertools.combinations(types, 2):
np_result = np.result_type(t1, t2)
pa_result = pa.unify_schemas([pa.schema([('a', t1)]), pa.schema([('a',
t2)])], options="permissive").field('a').type
pa_result_np = np.dtype(pa_result.to_pandas_dtype())
if np_result != pa_result_np:
print(f"{t1} + {t2}\t:\t{np_result} (numpy), {pa_result_np} (arrow)")
```
```
uint8 + int8 : int16 (numpy), int8 (arrow)
uint16 + int8 : int32 (numpy), int16 (arrow)
uint16 + int16 : int32 (numpy), int16 (arrow)
uint32 + int8 : int64 (numpy), int32 (arrow)
uint32 + int16 : int64 (numpy), int32 (arrow)
uint32 + int32 : int64 (numpy), int32 (arrow)
uint64 + int8 : float64 (numpy), int64 (arrow)
uint64 + int16 : float64 (numpy), int64 (arrow)
uint64 + int32 : float64 (numpy), int64 (arrow)
uint64 + int64 : float64 (numpy), int64 (arrow)
```
That shows the issue of widening the bitwidth for int+uint as commented
above.
Numpy also seems to go to float when combining uint64 with any signed
integer (since uint64 values cannot all be held in int64). Not sure if we
should also do that ..
And comparing with our numeric kernels DispatchBest (implemented in
`CommonNumeric`, added in https://github.com/apache/arrow/pull/9294):
```python
for t1, t2 in itertools.combinations(types, 2):
result_merge = pa.unify_schemas([pa.schema([('a', t1)]),
pa.schema([('a', t2)])], options="permissive").field('a').type
try:
result_dispatch= pa.compute.add(pa.array([1, 2, 3], t1),
pa.array(np.array([1, 2, 3], t2), t2)).type
except:
continue
if result_merge != result_dispatch:
print(f"{t1} + {t2}\t:\t{result_merge} (MergeOptions),
{result_dispatch} (DispatchBest)")
```
```
uint8 + int8 : int8 (MergeOptions), int16 (DispatchBest)
uint16 + int8 : int16 (MergeOptions), int32 (DispatchBest)
uint16 + int16 : int16 (MergeOptions), int32 (DispatchBest)
uint32 + int8 : int32 (MergeOptions), int64 (DispatchBest)
uint32 + int16 : int32 (MergeOptions), int64 (DispatchBest)
uint32 + int32 : int32 (MergeOptions), int64 (DispatchBest)
uint32 + float32 : double (MergeOptions), float (DispatchBest)
uint64 + float32 : double (MergeOptions), float (DispatchBest)
int32 + float32 : double (MergeOptions), float (DispatchBest)
int64 + float32 : double (MergeOptions), float (DispatchBest)
```
So the numeric kernels already widen the uint+int case. And they do not
widen the float bitwidth in the uint+float case (that's maybe something to
change in DispatchBest in a follow-up).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]