On Sun, Jul 30, 2023 at 11:17 AM Bruno Roustant <bruno.roust...@gmail.com>
wrote:

Interesting coincidence, I'm currently working on a learned index on sorted
> keys that can advantageously replace binary search.
> It is very compact (additional space of 2% of the sorted key array, e.g.
> 40KB for 200MB of keys), and it is between 2x to 3x faster than binary
> search for the rank/indexOf methods. By design there are nearly no
> branches: the index of a key is approximated by using hierarchical linear
> segment computation.
> The PGM-Index paper is there: https://pgm.di.unipi.it/
> And my implementation is here, just submitted in HPPC:
> https://github.com/carrotsearch/hppc/pull/39
>

Wow, this looks very relevant to Lucene!  Could this index be used for
faster implementation of our skip lists?  Even though they are static
(computed once at segment-write time) vs dynamic/online that these learned
indices are also able to handle, it looks like learned indices are still
better than simple binary search, and quite a bit more compact, for static
cases.

Our "best linear fit" approximation to compress monotonic longs
(DirectMonoticWriter/Reader) looks like a simple example of these learned
indices too.

Mike McCandless

http://blog.mikemccandless.com

Reply via email to