rmuir commented on PR #13816:
URL: https://github.com/apache/lucene/pull/13816#issuecomment-2365228413
this ID is generated for safety, and we only calculate this **once per
segment** when initially flushing it. Doesn't even happen at merge. Really, it
should be a few insns under lock, shouldn't need to be fast. That's why there
isn't contention, your profiler is lying to you. Sorry, I don't buy 30ms.
And if somehow you are writing THAT many segments concurrently, then you
definitely need the safety (and we dont want thread visibility issues where 2
segments end out with the same ID).
If we want to improve it, IMO the best way is to remove custom BigDecimal
and xor/shift logic, but instead swap in the OpenJDK implementation. This code
was written before OpenJDK had decent PRNGs.
This will clean up the code AND improve the performance.
```java
// initialization
var factory = RandomGeneratorFactory.of("Xoroshiro128PlusPlus");
var generator = factory.create(seed);
// generation
byte[] result = new byte[ID_LENGTH];
synchronized(idLock) {
random.nextBytes(result);
}
return result;
```
Instead of two BigDecimal operations and an allocation under the lock, it
will just do some xors and shifts on two `long` variables behind the scenes.
And I think I would trust it more since code is maintained in the JDK.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]