Thanks for sharing, Adrien, this is really cool! It's neat that the relative gains of Java vs C are quite a bit less than they were ~11 years ago when I played with a much smaller subset of queries. Also, COUNT on disjunction queries with Lucene Cyborg got slower. What a feat, to port so much of our complex Search code to C!

Mike McCandless
http://blog.mikemccandless.com

On Mon, Jul 22, 2024 at 9:43 AM Adrien Grand <jpou...@gmail.com> wrote:

> Hello everyone,
>
> I recently stumbled on this paper after Ishan shared it on LinkedIn:
> https://github.com/0ctopus13prime/lucene-cyborg-paper/blob/main/LuceneCyborg_Hybrid_Search_Engine_Written_in_Java_and_C%2B%2B.pdf
> .
>
> This is quite impressive: this person did a high-fidelity rewrite of
> Lucene in C++: it can even read indexes created by Lucene as-is. Then they
> ran the Tantivy benchmark to compare performance with Lucene, Tantivy and
> PISA. There are many takeaways, this is an interesting read.
>
> --
> Adrien
>