Thanks a lot Antoine for the pointers. Much appreciated!

> Generally, it should not hurt to align allocations to 64 bytes anyway,
> since you are generally dealing with large enough data that the
> (small) memory overhead doesn't matter.

No, not in terms of performance. However, 64-byte alignment in Rust
requires maintaining a custom container and a custom allocator, and it
prevents interoperating with `std::vec::Vec` and the ecosystem built on
it, since `Vec<T>` allocates with the alignment of T (e.g. 4 bytes for
int32), not 64 bytes. For anyone interested, the background is in this
old PR [1] and in this one in arrow2 [2].
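
To make the alignment point concrete, here is a minimal sketch. It is not
code from arrow2 or from the PRs above; `is_aligned` and
`alloc_64_aligned_i32` are hypothetical helpers written only for this
illustration. It shows that a `Vec<i32>` is only guaranteed the alignment
of `i32`, while a 64-byte alignment has to be requested explicitly through
`std::alloc`, which is what forces a custom container.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::{align_of, size_of};

/// Returns true if `ptr` is a multiple of `align` bytes.
fn is_aligned(ptr: *const u8, align: usize) -> bool {
    (ptr as usize) % align == 0
}

/// Allocates room for `len` i32 values with 64-byte alignment, checks the
/// alignment of the returned pointer, and frees the memory again.
/// Hypothetical helper, for illustration only.
fn alloc_64_aligned_i32(len: usize) -> bool {
    assert!(len > 0, "`alloc` must not be called with a zero-sized layout");
    let layout = Layout::from_size_align(len * size_of::<i32>(), 64)
        .expect("size and alignment form a valid layout");
    unsafe {
        let ptr = alloc(layout);
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        let ok = is_aligned(ptr, 64);
        // Memory must be freed with the same layout it was allocated with,
        // which is why it cannot simply be handed over to a `Vec<i32>`.
        dealloc(ptr, layout);
        ok
    }
}

fn main() {
    let v: Vec<i32> = vec![0; 1024];
    // `Vec<i32>` only guarantees the alignment of `i32`.
    println!("align_of::<i32>() = {}", align_of::<i32>());
    // This may or may not be true, depending on the system allocator.
    println!(
        "Vec<i32> happens to be 64-byte aligned: {}",
        is_aligned(v.as_ptr() as *const u8, 64)
    );
    println!("explicit 64-byte allocation is aligned: {}", alloc_64_aligned_i32(1024));
}
```

The `dealloc` call is the crux: a buffer allocated with a 64-byte layout
cannot be turned into a `Vec<i32>`, because `Vec` would free it with the
layout of `i32`.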

Neither I, in micro-benchmarks, nor Ritchie from polars (a query engine),
in large-scale benchmarks, observed any difference on the architectures we
have available. This is not consistent with the emphasis we put on the
memory alignment discussion [3], and I am trying to understand the root
cause of this inconsistency.

By prefetching I mean implicit prefetching; no intrinsics are involved.
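
For reference, a minimal sketch of the two access patterns from the quoted
message below ("sum" and "random sum"). This is not the actual benchmark
code: the function names, the accumulator type, and the small linear
congruential generator used to pick indices are all assumptions made for
illustration, and timing is left out. Neither function uses prefetch
intrinsics, so any prefetching is done implicitly by the CPU.

```rust
/// Sequential access: sums every value in order.
fn sum(values: &[i32]) -> i64 {
    values.iter().map(|&v| v as i64).sum()
}

/// Random access: sums the values found at the given indices.
fn random_sum(values: &[i32], indices: &[usize]) -> i64 {
    indices.iter().map(|&i| values[i] as i64).sum()
}

/// Generates `count` pseudo-random indices in `0..len` with a simple
/// linear congruential generator (an assumption, chosen to avoid an
/// external crate; any index source would do).
fn pseudo_random_indices(count: usize, len: usize, seed: u64) -> Vec<usize> {
    assert!(len > 0, "cannot draw indices from an empty range");
    let mut state = seed;
    (0..count)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Use the upper bits, which are of better quality in an LCG.
            ((state >> 33) as usize) % len
        })
        .collect()
}

fn main() {
    // 2^10 elements, the smallest size mentioned in the quoted message.
    let values: Vec<i32> = (0..1 << 10).collect();
    let indices = pseudo_random_indices(values.len(), values.len(), 42);
    println!("sum = {}", sum(&values));
    println!("random sum = {}", random_sum(&values, &indices));
}
```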

Best,
Jorge

[1] https://github.com/apache/arrow/pull/8796
[2] https://github.com/jorgecarleitao/arrow2/pull/385
[3] https://arrow.apache.org/docs/format/Columnar.html#buffer-alignment-and-padding

On Mon, Sep 6, 2021 at 6:51 PM Antoine Pitrou <anto...@python.org> wrote:

>
> On 06/09/2021 at 19:45, Antoine Pitrou wrote:
> >
> >> Specifically, I performed two types of tests, a "random sum" where we
> >> compute the sum of the values taken at random indices, and "sum", where
> >> we sum all values of the array (buffer[1] of the primitive array), both
> >> for arrays ranging from 2^10 to 2^25 elements. I was expecting that, at
> >> least in the latter, prefetching would help, but I do not observe any
> >> difference.
> >
> > By prefetching, you mean explicit prefetching using intrinsics?
> > Modern CPUs are very good at implicit prefetching, they are able to
> > detect memory access patterns and optimize for them. Implicit
> > prefetching would only possibly help if your access pattern is
> > complicated (for example you're walking a chain of pointers).
>
> Oops: *explicit* prefetching would only possibly help... sorry.
>
> Regards
>
> Antoine.
>
>
> > If your
> > access is sequential, there is zero reason to prefetch explicitly
> > nowadays, AFAIK.
> >
> > Regards
> >
> > Antoine.
> >
> >
>
