Re: Patch to change murmurhash implementation slightly

Patrick Zhai Mon, 24 Apr 2023 22:58:03 -0700

I did a quick run with your patch, but since I had ConcurrentMergeScheduler
(CMS) as well as TieredMergePolicy turned on, I'm not sure how fair the
comparison is. Here are the results:
Candidate:
Indexer: indexing done (890209 msec); total 33332620 docs
Indexer: waitForMerges done (71622 msec)
Indexer: finished (961877 msec)
Baseline:
Indexer: indexing done (909706 msec); total 33332620 docs
Indexer: waitForMerges done (54775 msec)
Indexer: finished (964528 msec)
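
To put the numbers above side by side, here is a minimal sketch that computes the relative change of the candidate against the baseline for each phase. The class and method names are made up for illustration; only the timings come from the log lines quoted above. By this arithmetic the candidate's indexing phase is about 2.1% shorter, waitForMerges is about 31% longer, and the overall run is about 0.3% shorter.

```java
// Hypothetical helper, not part of luceneutil: compares the quoted timings.
final class IndexingRunDelta {

    /** Percent change of candidate relative to baseline; negative means the candidate took less time. */
    static double percentChange(long baselineMsec, long candidateMsec) {
        return (candidateMsec - baselineMsec) * 100.0 / baselineMsec;
    }

    public static void main(String[] args) {
        // Timings in msec, copied from the quoted log lines.
        String[] phases = {"indexing", "waitForMerges", "finished"};
        long[] baseline = {909706L, 54775L, 964528L};
        long[] candidate = {890209L, 71622L, 961877L};

        for (int i = 0; i < phases.length; i++) {
            System.out.printf("%-14s %+.2f%%%n",
                phases[i], percentChange(baseline[i], candidate[i]));
        }
    }
}
```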


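For a more accurate comparison, I guess it would be better to use one of the
Log*MergePolicy classes (LogByteSizeMergePolicy or LogDocMergePolicy) and turn
off CMS? If you want to run it yourself, the lines I quoted above come straight
from the log file, so you can find them there.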

Patrick

On Mon, Apr 24, 2023 at 12:34 PM Thomas Dullien
<[email protected]> wrote:

> Hey all,
>
> I've been experimenting with fixing some low-hanging performance fruit in
> the Elasticsearch codebase, and came across the fact that the MurmurHash
> implementation used by BytesRef.hashCode() reads 4 bytes per loop
> iteration (likely an artifact of 32-bit architectures, which are
> ever less important). I made a small fix to change the implementation to
> read 8 bytes per loop iteration. I expected a very small impact (2-3% CPU
> or so over an indexing run in Elasticsearch), but got a pretty nontrivial
> throughput improvement over a few indexing benchmarks.
>
> I tried running Lucene-only benchmarks, and succeeded in running the
> example from https://github.com/mikemccand/luceneutil - but I couldn't
> figure out how to run indexing benchmarks and how to interpret the results.
>
> Could someone help me in running the benchmarks for the attached patch?
>
> Cheers,
> Thomas
>
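
For context on the "4 bytes per loop iteration" point in the quoted message, here is a minimal sketch of the standard MurmurHash3 x86_32 algorithm. It is written from the public algorithm description, not copied from Lucene, and the class name is hypothetical. The 8-bytes-per-iteration variant from the attached patch is not reproduced here, since the patch itself is not part of this thread.

```java
// Illustrative sketch of the standard MurmurHash3 x86_32 algorithm.
// Not Lucene's code; the class name is made up for this example.
final class Murmur3Sketch {
    private static final int C1 = 0xcc9e2d51;
    private static final int C2 = 0x1b873593;

    static int murmurhash3_x86_32(byte[] data, int offset, int len, int seed) {
        int h1 = seed;
        // End of the last complete 4-byte block.
        int roundedEnd = offset + (len & 0xfffffffc);

        // Main loop: one little-endian 32-bit load (4 bytes) per iteration.
        for (int i = offset; i < roundedEnd; i += 4) {
            int k1 = (data[i] & 0xff)
                | ((data[i + 1] & 0xff) << 8)
                | ((data[i + 2] & 0xff) << 16)
                | ((data[i + 3] & 0xff) << 24);
            k1 *= C1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= C2;

            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // Tail: mix in the remaining 1 to 3 bytes, if any.
        int k1 = 0;
        switch (len & 0x03) {
            case 3:
                k1 = (data[roundedEnd + 2] & 0xff) << 16;
                // fall through
            case 2:
                k1 |= (data[roundedEnd + 1] & 0xff) << 8;
                // fall through
            case 1:
                k1 |= (data[roundedEnd] & 0xff);
                k1 *= C1;
                k1 = Integer.rotateLeft(k1, 15);
                k1 *= C2;
                h1 ^= k1;
                break;
            default:
                break;
        }

        // Finalization: fold in the length and avalanche the bits.
        h1 ^= len;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }
}
```

The loop body runs once per 4 input bytes, so a variant that consumes 8 bytes per iteration would halve the number of iterations on long inputs. Whether it produces the same hash values depends on how the patch mixes the two halves, which this sketch does not attempt to show.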
