Hi,

On Wed, Oct 27, 2021 at 7:57 PM Antoine Pitrou <anto...@python.org> wrote:
>
> This seems to assume that many or most arrays will have non-zero
> offsets.  Is this something that commonly happens in the Rust Arrow
> world?  In Arrow C++ I'm not sure non-zero offsets appear very frequently.
>
> Regards
>
> Antoine.

Regarding use cases, I can add two from our internal query engine:

- we store and cache complete columns in memory and then process them
in smaller, cache-optimized, batches, which are created by slicing the
full dataset.
- our query engine and data model use ListArrays a lot, and common
operations are aggregations of the values nested in these ListArrays.
For example, for a ListArray of Float64Array, we calculate a
Float64Array containing the sum of each nested array.
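To make both use cases concrete, here is a minimal sketch in plain Rust.
It does not use the arrow crate: ListF64, sums and batched_sums are
hypothetical names made up for illustration, and null handling is left
out entirely. It only shows why a kernel has to honor a non-zero
starting offset once the full column is processed in sliced batches.

```rust
/// Hypothetical, simplified stand-in for a ListArray of Float64Array:
/// list `i` covers `values[offsets[i]..offsets[i + 1]]`. No validity bitmap.
struct ListF64 {
    offsets: Vec<usize>, // length = number of lists + 1
    values: Vec<f64>,
}

impl ListF64 {
    /// Sums each nested list in the logical slice `start..start + len`.
    /// `start` plays the role of the array offset; it is often non-zero.
    fn sums(&self, start: usize, len: usize) -> Vec<f64> {
        self.offsets[start..=start + len]
            .windows(2)
            .map(|w| self.values[w[0]..w[1]].iter().sum::<f64>())
            .collect()
    }
}

/// Processes the cached column in batches of `batch_size` lists,
/// each batch being a slice with its own starting offset.
fn batched_sums(list: &ListF64, batch_size: usize) -> Vec<f64> {
    assert!(batch_size > 0, "batch_size must be positive");
    let n = list.offsets.len().saturating_sub(1);
    let mut out = Vec::with_capacity(n);
    let mut start = 0;
    while start < n {
        let len = batch_size.min(n - start);
        out.extend(list.sums(start, len));
        start += len;
    }
    out
}
```

Every batch after the first one is computed with a non-zero start, so
any code path that silently assumes an offset of 0 would return wrong
sums here.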

The first use case (slicing the cached columns into batches) initially
uncovered a lot of assumptions about offsets being 0. It is, however,
a bit of a special case: the offsets are usually multiples of 8, so
they can be pushed down into the buffers' base pointers, even for the
bit-packed validity and boolean buffers.
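A rough sketch of what "pushing down" means for a bit-packed buffer,
again in plain Rust with hypothetical helper names (push_down_offset
and get_bit are not arrow APIs). It assumes Arrow's least-significant-bit
numbering within each byte.

```rust
/// If the bit `offset` is a multiple of 8, the slice can start at a whole
/// byte, so the offset folds into the base pointer and becomes 0.
/// Returns None when the offset is not byte-aligned or is out of range.
fn push_down_offset(bits: &[u8], offset: usize) -> Option<&[u8]> {
    if offset % 8 == 0 {
        bits.get(offset / 8..)
    } else {
        None
    }
}

/// Reads bit `i` from a bit-packed buffer (LSB-first within each byte).
fn get_bit(bits: &[u8], i: usize) -> bool {
    (bits[i / 8] >> (i % 8)) & 1 == 1
}
```

With an offset of, say, 3, no such byte-aligned view exists, and the
kernel has to carry the bit offset along or copy and shift the buffer.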

Regards,
-- 
Jörn Horstmann | Senior Backend Engineer

www.signavio.com
Kurfürstenstraße 111, 10787 Berlin, Germany
