Not a Rust expert yet, but here are my 2 (or more) cents:

The alignment was chosen (as far as I know) so that separately
allocated buffers would not share the same cache line, which could
cause performance issues when the same cache line is accessed by
multiple threads. So aligning our buffers that way is nice for optimal
performance, but not strictly necessary.

The from_raw_parts/from_unowned functions don't seem to be really used
inside the arrow codebase itself. There is one usage for a constant
empty buffer (which get's special handling to not get deallocated) and
one in the From<[u8]> implementation. That one seems to allocate a
aligned memory itself, which might explain why `from_raw_parts` sets
the `owned` flag and will later deallocate the memory. Regarding the
deallocation, the `from_unowned` function should not need to require
an alignment since it does not deallocate the memory later.

There is however also a function `typed_data` on buffer, used by some
kernels, and that requires the alignment to match that of the
primitive type. And when creating a buffer we currently do not know
yet which kind of primitive types will be stored there later. So I
guess we need to still enforce an alignment of at least the largest
`ArrowPrimitiveType`. I would also assume that external data is
aligned to this native alignment, otherwise I'd see it as the
responsibility of the other system to create an aligned copy.

So in summary:

MutableBuffer and From<[u8]> should continue to allocate with cache
line aware alignment.
Buffer could only require an alignment of the maximum of mem::align_of
of all ArrowPrimitiveTypes

Regards,
Jörn

On Tue, Sep 22, 2020 at 7:17 PM Jorge Cardoso Leitão
<jorgecarlei...@gmail.com> wrote:
>
> Hi,
>
> I had some time to look at https://issues.apache.org/jira/browse/ARROW-10039,
> wrt to the alignment requirements that rust implementation currently
> imposes.
>
> The gist is that it is not that easy, and I would like to request some
> guidance.
>
> Some facts:
> 1. Our current implementation does not accept a pointer if that pointer is
> not memory aligned (we panic)
> 2. Our rust implementation's alignment is a static/const that depends on
> the architecture and therefore constant throughout the program
> 3. Rust alloc/dealloc both require an argument denoting memory alignment.
> 4. calling dealloc with the wrong alignment is undefined behavior
>
> 3-4 imply that removing our safeguard against unaligned memory (wrt to the
> constant alignment) leads to undefined behavior: we take ownership of a
> pointer with an alignment X != our alignment and when we try to free it, we
> enter undefined world.
>
> Some options:
>
> 1. do not support interoperability (status quo)
> 2. do not support pointer-based memory sharing (e.g. re-align the pointer
> with a memcopy)
> 3. do not support owned pointer-based memory sharing (i.e. never call
> dealloc on that data)
> 4. infer the alignment of the pointer, and keep track of the pointer's
> alignment as part of its metadata (e.g. on `BufferData`)
>
> Any ideas or preferences?
>
> Best,
> Jorge



-- 
Jörn Horstmann | Senior Backend Engineer

www.signavio.com
Kurfürstenstraße 111, 10787 Berlin, Germany







HRB 121584 B Charlottenburg District Court, VAT ID: DE265675123
Managing Directors: Dr. Gero Decker, Daniel Rosenthal

Reply via email to