jorgecarleitao commented on pull request #9376:
URL: https://github.com/apache/arrow/pull/9376#issuecomment-778723732
@velvia , @seddonm1 My opinion atm is that would not really help much from
the compute side, as we would still need to write separate logic for the 4
cases `(Array, scalar)`, `(scalar, Array)`, `(scalar, scalar)`, `(array,
array)`, which is exactly what this PR already proposes.
For example, a subtraction of two arrays goes along the lines of
```rust
let iter = lhs.values().zip(rhs.values()).map(|(r, l)| r - l);
let array = unsafe { PrimitiveArray<T>::from_trusted_len_iter(iter) };
// .values() is a slice
```
If we had an implementation of `PrimitiveScalarArray<T>`, we could change
`.values()` to be an iterator, but then we lose the benefits of contiguous
regions, as Rust can no longer assume that `values` is a contiguous memory
region (slices have that property, iterators do not). If both sides are
scalars, we want to to create a `PrimitiveScalarArray<T>`, so, again we need to
branch. In all cases, we need a match `(lhs, rhs)`. I think that DataFusion in
this respect is doing things right.
To change the Array trait we basically need a change in the arrow
specification so that all implementations agree on how to communicate such
information via IPC, ffi, etc, so, that is mailing list material :)
Note that this PR addresses @alamb 's concern by introducing a adapter that
people can use if they do not want to bother implementing the scalar variants.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]