I think the benchmark results will depend on the properties of the
indexed terms. For English Wikipedia (luceneutil) the average word
length is around 5 bytes, so this optimization may not do much.
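
To make that concrete, here is a rough sketch of how many full blocks a
block-wise hash loop gets to process for a given key length. It is only a
simplified model, not the actual Lucene or patch code, and the class and
method names (BlockCount, fullBlocks, tailBytes) are made up for
illustration.

```java
public class BlockCount {
    // Number of full blocks a block-wise hash loop would consume
    // for a key of `length` bytes (simplified model, not Lucene code).
    static int fullBlocks(int length, int blockSize) {
        return length / blockSize;
    }

    // Bytes left over for the tail handling after the main loop.
    static int tailBytes(int length, int blockSize) {
        return length % blockSize;
    }

    public static void main(String[] args) {
        // 5 is roughly the average term length mentioned above;
        // the other lengths are arbitrary comparison points.
        int[] lengths = {5, 8, 16, 64};
        for (int len : lengths) {
            System.out.printf(
                "len=%d: 4-byte loop %d iter + %d tail, 8-byte loop %d iter + %d tail%n",
                len,
                fullBlocks(len, 4), tailBytes(len, 4),
                fullBlocks(len, 8), tailBytes(len, 8));
        }
    }
}
```

For a 5-byte term the 8-byte loop never runs a full iteration and all 5
bytes go through the tail path, while the 4-byte loop runs once. The wider
read can only pay off when a meaningful share of terms is 8 bytes or longer.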

On Tue, Apr 25, 2023 at 1:58 AM Patrick Zhai <zhai7...@gmail.com> wrote:
>
> I did a quick run with your patch, but since I turned on CMS as well as
> TieredMergePolicy, I'm not sure how fair the comparison is. Here's the result:
> Candidate:
> Indexer: indexing done (890209 msec); total 33332620 docs
> Indexer: waitForMerges done (71622 msec)
> Indexer: finished (961877 msec)
> Baseline:
> Indexer: indexing done (909706 msec); total 33332620 docs
> Indexer: waitForMerges done (54775 msec)
> Indexer: finished (964528 msec)
>
> For more accurate comparison I guess it's better to use LogxxMergePolicy and 
> turn off CMS? If you want to run it yourself you can find the lines I quoted 
> from the log file.
>
> Patrick
>
> On Mon, Apr 24, 2023 at 12:34 PM Thomas Dullien 
> <thomas.dull...@elastic.co.invalid> wrote:
>>
>> Hey all,
>>
>> I've been experimenting with fixing some low-hanging performance fruit in 
>> the ElasticSearch codebase, and came across the fact that the MurmurHash 
>> implementation that is used by BytesRef.hashCode() is reading 4 bytes per
>> loop iteration (which is likely an artifact from 32-bit architectures, which 
>> are ever-less-important). I made a small fix to change the implementation to 
>> read 8 bytes per loop iteration; I expected a very small impact (2-3% CPU or 
>> so over an indexing run in ElasticSearch), but got a pretty nontrivial 
>> throughput improvement over a few indexing benchmarks.
>>
>> I tried running Lucene-only benchmarks, and succeeded in running the example 
>> from https://github.com/mikemccand/luceneutil - but I couldn't figure out 
>> how to run indexing benchmarks and how to interpret the results.
>>
>> Could someone help me in running the benchmarks for the attached patch?
>>
>> Cheers,
>> Thomas
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org

