Oh I realized this super interesting read (thanks Rob!) was accidentally sent privately to me, so I'm reply-all'ing explicitly so everyone else gets to read this too :)
I especially love the charts and math at the end! Mike McCandless http://blog.mikemccandless.com On Thu, Jul 27, 2023 at 8:02 AM Rob Audenaerde <rob.audenae...@gmail.com> wrote: > Super interesting read! > > Btw. following the links of the internet I somehow ended up here, which is > a nice in-depth comparison on binary-search approaches: > > https://en.algorithmica.org/hpc/data-structures/binary-search/ > > > > On Thu, Jul 27, 2023 at 1:40 PM Michael McCandless < > luc...@mikemccandless.com> wrote: > >> Hi Team, >> >> At Amazon (customer facing product search team) we've been playing with / >> benchmarking Tantivy (exciting Rust search engine loosely inspired by >> Lucene: https://github.com/quickwit-oss/tantivy, created by Paul Masurel >> and developed now by Quickwit and the Tantivy open source dev community) vs >> Lucene, by building a benchmark that blends Tantivy's search-benchmark-game >> (https://github.com/quickwit-oss/search-benchmark-game) and Lucene's >> nightly benchmarks (luceneutil: https://github.com/mikemccand/luceneutil >> and https://home.apache.org/~mikemccand/lucenebench). >> >> It's great fun, and we would love more eyeballs to spot any remaining >> unfair aspects of the comparison (thank you Uwe and Adrien for catching >> things already!). We are trying hard for an apples to apples comparison: >> same (enwiki) corpus, same queries, confirming we get precisely the same >> top N hits and total hit counts, latest versions of both engines. >> >> Indeed, Tantivy is substantially (2-3X?) faster than Lucene for many >> queries, and we've been trying to understand why. >> >> Sometimes it is due to a more efficient algorithms, e.g. the count() API >> for pure disjunctive queries, which Adrien is now porting to Lucene (thank >> you!), showing sizable (~80% faster in one query) speedups: >> https://github.com/apache/lucene/issues/12358. Other times may be due >> to Rust's more efficient/immediate/Python-like GC, or direct access to SIMD >> (Lucene now has this for aKNN search -- big speedup -- and maybe soon for >> postings too?), and unsafe code, different skip data / block postings >> encoding, or ... >> >> Specifically, one of the fascinating Tantivy optimizations is the >> branchless binary search: https://quickwit.io/blog/search-a-sorted-block. >> Here's another blog post about it (implemented in C++): >> https://probablydance.com/2023/04/27/beautiful-branchless-binary-search/ >> >> The idea is to get rustc/gcc to compile down to x86-64's CMOVcc >> ("conditional move") instruction (I'm not sure if ARM has an equivalent? >> Maybe "conditional execution"?). The idea is to avoid a "true" branch of >> the instruction stream (costly pipeline flush on mis-prediction, which is >> likely in a binary search or priority queue context) by instead >> conditionally moving a value from one location to another (register or >> memory). Tantivy uses this for skipping through postings, in a single >> layer in-memory skip list structure (versus Lucene's on-disk >> (memory-mapped, by default) multi-layer skip list) -- see the above blog >> post. >> >> This made me wonder: does javac's hotspot compiler use CMOVcc? I see >> javac bug fixes like https://github.com/openjdk/mobile/commit/a03e9220 >> which seems to imply C2 does in fact compile to CMOVcc sometimes. So then >> I wondered whether a branchless binary search in Java is a possibility? >> Has anyone played with this? >> >> Before Robert gets too upset :) Even if we could build such a thing, the >> right place for such optimizations is likely the JDK itself (see the >> similar discussion about SIMD-optimized sorting: >> https://github.com/apache/lucene/issues/12399). Still, I'm curious if >> anyone has explored this, and maybe saw some performance gains from way up >> in javaland where we can barely see the bare metal shining beneath us :) >> >> Sorry for the long post! >> >> Mike McCandless >> >> http://blog.mikemccandless.com >> >