Hi Mike,
actually Hotspot is using CMOV. Some nodes after bytecode analysis are
converted to CMoveNode and it has implementations to generate machine
code (as far as i see) for x86, s390 and arm.
The generic code is here:
https://github.com/openjdk/jdk/blob/486c7844f902728ce580c3994f58e3e497834952/src/hotspot/share/opto/movenode.cpp
Actually it is used in some cases, but I did not find out when it uses
it to generate instructions from bytecode. It also has some code to
optimize the cmov away if the result is known before (or not, e.g. for
floats it does not do this).
I think best would be to ask on the hotspot compiler list on suggestions
how to write Java code to trigger the JVM to insert the CMOV.
Uwe
Am 27.07.2023 um 13:40 schrieb Michael McCandless:
Hi Team,
At Amazon (customer facing product search team) we've been playing
with / benchmarking Tantivy (exciting Rust search engine loosely
inspired by Lucene: https://github.com/quickwit-oss/tantivy, created
by Paul Masurel and developed now by Quickwit and the Tantivy open
source dev community) vs Lucene, by building a benchmark that blends
Tantivy's search-benchmark-game
(https://github.com/quickwit-oss/search-benchmark-game) and Lucene's
nightly benchmarks (luceneutil:
https://github.com/mikemccand/luceneutil and
https://home.apache.org/~mikemccand/lucenebench).
It's great fun, and we would love more eyeballs to spot any remaining
unfair aspects of the comparison (thank you Uwe and Adrien for
catching things already!). We are trying hard for an apples to apples
comparison: same (enwiki) corpus, same queries, confirming we get
precisely the same top N hits and total hit counts, latest versions of
both engines.
Indeed, Tantivy is substantially (2-3X?) faster than Lucene for many
queries, and we've been trying to understand why.
Sometimes it is due to a more efficient algorithms, e.g. the count()
API for pure disjunctive queries, which Adrien is now porting to
Lucene (thank you!), showing sizable (~80% faster in one query)
speedups: https://github.com/apache/lucene/issues/12358. Other times
may be due to Rust's more efficient/immediate/Python-like GC, or
direct access to SIMD (Lucene now has this for aKNN search -- big
speedup -- and maybe soon for postings too?), and unsafe code,
different skip data / block postings encoding, or ...
Specifically, one of the fascinating Tantivy optimizations is the
branchless binary search:
https://quickwit.io/blog/search-a-sorted-block. Here's another blog
post about it (implemented in C++):
https://probablydance.com/2023/04/27/beautiful-branchless-binary-search/
The idea is to get rustc/gcc to compile down to x86-64's CMOVcc
("conditional move") instruction (I'm not sure if ARM has an
equivalent? Maybe "conditional execution"?). The idea is to avoid a
"true" branch of the instruction stream (costly pipeline flush on
mis-prediction, which is likely in a binary search or priority queue
context) by instead conditionally moving a value from one location to
another (register or memory). Tantivy uses this for skipping through
postings, in a single layer in-memory skip list structure (versus
Lucene's on-disk (memory-mapped, by default) multi-layer skip list) --
see the above blog post.
This made me wonder: does javac's hotspot compiler use CMOVcc? I see
javac bug fixes like https://github.com/openjdk/mobile/commit/a03e9220
which seems to imply C2 does in fact compile to CMOVcc sometimes. So
then I wondered whether a branchless binary search in Java is a
possibility? Has anyone played with this?
Before Robert gets too upset :) Even if we could build such a thing,
the right place for such optimizations is likely the JDK itself (see
the similar discussion about SIMD-optimized sorting:
https://github.com/apache/lucene/issues/12399). Still, I'm curious if
anyone has explored this, and maybe saw some performance gains from
way up in javaland where we can barely see the bare metal shining
beneath us :)
Sorry for the long post!
Mike McCandless
http://blog.mikemccandless.com
--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:u...@thetaphi.de