Hey Druids,

There has generally been a lot of talk about moving away from ByteBuffer
and towards the DataSketches Memory package (
https://datasketches.apache.org/docs/Memory/MemoryPackage.html) or even
using Unsafe directly. Much of that discussion happened on
https://github.com/apache/druid/issues/3892.

Recently a patch was merged that added datasketches-memory as a dependency
of druid-processing: https://github.com/apache/druid/pull/9308. The
motivation was partly better performance and partly a nicer API (both
reasons are mentioned in #3892 as well).

JEP 370 is a potential long-term solution, but it seems to be a while away
from being ready: https://openjdk.java.net/jeps/370

I wanted to bring the larger discussion back up and see what people think
is a good path forward.

My suggestion is that we migrate the VectorAggregator interface to use
Memory, but keep BufferAggregator the way it is. That way, as we build out
support for vectorization (right now, only timeseries/groupby support it,
and only a few aggregators, but we should be building this out) we'll be
doing it with a nicer and potentially faster API. But we won't need to go
back and redo a bunch of old code, since we'll keep the non-vectorized code
paths the way they are. (And hopefully, one day, delete them all outright.)
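
As a rough illustration of the shape such a Memory-based aggregator could
take, here is a minimal sketch. Everything in it is an assumption made for
illustration: MemoryLike is a hypothetical stand-in for the DataSketches
Memory API (backed by a ByteBuffer only to keep the example self-contained),
and MemoryVectorAggregator is a simplified, hypothetical interface, not
Druid's actual VectorAggregator signature.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for a Memory-style API: absolute access with long offsets.
interface MemoryLike {
  long getLong(long offsetBytes);
  void putLong(long offsetBytes, long value);
}

// ByteBuffer-backed implementation, used here only so the sketch needs no dependencies.
final class ByteBufferMemory implements MemoryLike {
  private final ByteBuffer buf;

  ByteBufferMemory(ByteBuffer buf) {
    // duplicate() shares the underlying bytes but has independent position and order.
    this.buf = buf.duplicate().order(ByteOrder.nativeOrder());
  }

  @Override
  public long getLong(long offsetBytes) {
    return buf.getLong(Math.toIntExact(offsetBytes));
  }

  @Override
  public void putLong(long offsetBytes, long value) {
    buf.putLong(Math.toIntExact(offsetBytes), value);
  }
}

// Hypothetical, simplified Memory-based vector aggregator interface.
// For brevity the input vector is passed in directly.
interface MemoryVectorAggregator {
  void init(MemoryLike mem, long position);

  // Aggregates rows [startRow, endRow) of the vector into the state at position.
  void aggregate(MemoryLike mem, long position, long[] vector, int startRow, int endRow);

  long get(MemoryLike mem, long position);
}

// Example aggregator: a running long sum stored in 8 bytes at the given position.
final class LongSumMemoryVectorAggregator implements MemoryVectorAggregator {
  @Override
  public void init(MemoryLike mem, long position) {
    mem.putLong(position, 0L);
  }

  @Override
  public void aggregate(MemoryLike mem, long position, long[] vector, int startRow, int endRow) {
    long sum = 0L;
    for (int i = startRow; i < endRow; i++) {
      sum += vector[i];
    }
    mem.putLong(position, mem.getLong(position) + sum);
  }

  @Override
  public long get(MemoryLike mem, long position) {
    return mem.getLong(position);
  }
}
```

A non-vectorized aggregator written against ByteBuffer would be untouched by
this; only new vectorized aggregators would be written against the
Memory-style interface.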

Gian
