scovich commented on PR #9234:
URL: https://github.com/apache/arrow-rs/pull/9234#issuecomment-3811891539

   Skimming down the list of methods:
   * Many call sites -- both internal and in user code -- use 
`Array::data_type` to drive casting decisions. Custom implementations of the 
`Array` trait risk causing panics in such code unless `Array::as_any` actually 
returns the concrete type that corresponds to that data type. See e.g. 
https://docs.rs/arrow/latest/arrow/array/macro.downcast_primitive_array.html
      * Corollary: Any type that attempts to `impl Array` as a "container" is 
utterly unusable as an actual array, because the original type can only be 
recovered if its `Array::as_any` returns a reference to the original type, 
and that is exactly what breaks `data_type`-driven downcasts.
   * `Array::into_data` and `Array::to_data` must produce a valid `ArrayData` 
(with a valid `ArrayData::data_type`), or risk causing panics and/or UB in 
downstream consumers such as `make_array`.
      * Corollary: `make_array` can never recover the original custom array 
type. It will instead recover whatever `ArrayData::data_type` indicated.
   * `Array::is_empty`, `Array::len`, and `Array::offset` must be accurate, 
even if this array is the result of `Array::slice`.
   * `Array::nulls` must be accurate (any entry wrongly marked non-null has an 
undefined value)
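
To make the first bullet concrete, here is a minimal sketch of the `data_type`/`as_any` contract. It deliberately does not use the arrow crate: `DataType`, `Array`, `Int32Array`, `Int64Array`, `Container`, and `sum` below are simplified stand-ins invented for illustration, not the real arrow-rs definitions. The consumer returns `None` at the spot where typical code would `unwrap`/`expect` and panic.

```rust
use std::any::Any;

// Simplified stand-ins for illustration only; NOT the real arrow-rs types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

trait Array {
    fn data_type(&self) -> DataType;
    fn as_any(&self) -> &dyn Any;
}

struct Int32Array(Vec<i32>);
struct Int64Array(Vec<i64>);

impl Array for Int32Array {
    fn data_type(&self) -> DataType { DataType::Int32 }
    fn as_any(&self) -> &dyn Any { self }
}

impl Array for Int64Array {
    fn data_type(&self) -> DataType { DataType::Int64 }
    fn as_any(&self) -> &dyn Any { self }
}

// Hypothetical "container": reports the inner data type, but `as_any`
// exposes the container itself rather than the built-in array.
struct Container {
    inner: Int32Array,
}

impl Array for Container {
    fn data_type(&self) -> DataType { self.inner.data_type() }
    fn as_any(&self) -> &dyn Any { self }
}

// Typical consumer: dispatch on `data_type`, then downcast via `as_any`.
// A failed downcast yields `None` here; code that unwraps would panic.
fn sum(array: &dyn Array) -> Option<i64> {
    match array.data_type() {
        DataType::Int32 => {
            let a = array.as_any().downcast_ref::<Int32Array>()?;
            Some(a.0.iter().map(|&v| i64::from(v)).sum())
        }
        DataType::Int64 => {
            let a = array.as_any().downcast_ref::<Int64Array>()?;
            Some(a.0.iter().sum())
        }
    }
}
```

In this model, `sum` works for both built-in arrays but fails for `Container`, even though `Container::data_type` says `Int32`.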
   
   Summing it all up -- any custom Array implementation must either be a 
complex analogue to `dyn Any` -- completely ignoring the normal Array API -- or 
must look and act _exactly_ like a newtype wrapper around one of the built-in 
Array types (e.g. `Arc<dyn Array>` can safely `impl Array`).
   - There seems to be a connection here with 
https://github.com/apache/arrow-rs/issues/8794, where `Array::as_any` as a 
replacement for `Array: Any` causes awkwardness in casting?
   
   The problem is that there's no way to tell which one you're dealing with 
for an arbitrary `&dyn Array`. The only way to be sure you have a usable 
`Array` is to round-trip it to `ArrayData` and back before attempting to work 
with it (assuming `Array::into_data` is implemented correctly).
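
A rough sketch of that round trip, again as a toy model rather than the real crate: `ArrayData`, `make_array`, and `to_data` here are simplified stand-ins that only mirror the names of their arrow-rs counterparts, and `Container` and `normalize` are hypothetical. It also shows the earlier corollary: the rebuilt array is whatever the data type indicates, never the custom type.

```rust
use std::any::Any;

// Toy model for illustration only; NOT the real arrow-rs types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
}

// Stand-in for `ArrayData`: a data type plus the raw values.
#[derive(Debug, Clone, PartialEq)]
struct ArrayData {
    data_type: DataType,
    values: Vec<i32>,
}

trait Array {
    fn as_any(&self) -> &dyn Any;
    fn to_data(&self) -> ArrayData;
}

struct Int32Array(Vec<i32>);

impl Array for Int32Array {
    fn as_any(&self) -> &dyn Any { self }
    fn to_data(&self) -> ArrayData {
        ArrayData { data_type: DataType::Int32, values: self.0.clone() }
    }
}

// Hypothetical custom array whose `as_any` exposes itself.
struct Container(Int32Array);

impl Array for Container {
    fn as_any(&self) -> &dyn Any { self }
    fn to_data(&self) -> ArrayData { self.0.to_data() }
}

// Stand-in for `make_array`: rebuilds a built-in array from the data type.
fn make_array(data: ArrayData) -> Box<dyn Array> {
    match data.data_type {
        DataType::Int32 => Box::new(Int32Array(data.values)),
    }
}

// Round trip: the result is always a built-in type, so downcasts driven by
// the data type succeed, but the custom type is gone.
fn normalize(array: &dyn Array) -> Box<dyn Array> {
    make_array(array.to_data())
}
```

The cost in this model is a copy of the values plus the loss of whatever extra state the custom type carried.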
   
   If we really wanted to support the use case of `dyn Array` as a stand-in for 
`dyn Any` (since `Array: !Any`), then we'd have to define a new method, a "raw" 
version of `Array::as_any`, that can be downcast to recover the true type, and 
possibly also an analogue to `Array::data_type` that could guide the use of 
that raw casting. That way, the custom array-as-pointer could maintain the 1:1 
correspondence between `Array::data_type` and `Array::as_any`, while still 
allowing the original type to be recovered. However, it seems like these are really 
just different use cases and we probably shouldn't try to conflate them in a 
single type.
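
For illustration, the "raw" accessor idea might look roughly like the sketch below. Everything in it is hypothetical: `as_any_raw` does not exist in arrow-rs, and `DataType`, `Array`, `Int32Array`, and `Container` are simplified stand-ins rather than the real definitions.

```rust
use std::any::Any;

// Toy model for illustration only; NOT the real arrow-rs types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
}

trait Array {
    fn data_type(&self) -> DataType;

    // Must downcast to the built-in type that `data_type` indicates.
    fn as_any(&self) -> &dyn Any;

    // Hypothetical "raw" accessor: downcasts to the true implementing type.
    // Built-in arrays keep the default, since both views are the same.
    fn as_any_raw(&self) -> &dyn Any {
        self.as_any()
    }
}

struct Int32Array(Vec<i32>);

impl Array for Int32Array {
    fn data_type(&self) -> DataType { DataType::Int32 }
    fn as_any(&self) -> &dyn Any { self }
}

// Hypothetical container carrying extra state next to a built-in array.
struct Container {
    inner: Int32Array,
    label: String,
}

impl Array for Container {
    fn data_type(&self) -> DataType { self.inner.data_type() }
    // Keeps the 1:1 correspondence with `data_type` by delegating.
    fn as_any(&self) -> &dyn Any { self.inner.as_any() }
    // Lets callers who know about `Container` recover it.
    fn as_any_raw(&self) -> &dyn Any { self }
}
```

Under these assumptions, generic code keeps working through `as_any`, while code that knows about `Container` can still reach it through `as_any_raw`.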


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
