mikemccand commented on issue #15552:
URL: https://github.com/apache/lucene/issues/15552#issuecomment-3751564845
> There are some relevant code in the AWS Java SDK v2 which seems to contain
> efforts to make the combining (reducing) routine faster.
Whoa, are these the methods that could combine two CRC32 checksums? It sure
looks like it ... but they are marked as `@SdkInternalApi` ... though the
comment says it was taken from `zlib`:
```
* The implementation of CRC combination was taken from the zlib source code
here:
* <a
href="https://github.com/luvit/zlib/blob/master/crc32.c">https://github.com/luvit/zlib/blob/master/crc32.c</a>
```
But:
> But we should be careful about exposing too much here? I feel like lucene
> should be free to change the polynomial if we want, for improved performance
> (e.g. switch to java.util.zip.CRC32C). It is hardware-accelerated via dedicated
> insns on all modern cpus.
Hmm that is a good point -- the choice of checksum algo/polynomial/matrices
is really an implementation detail -- if we expose this utility API it would
hamper (maybe?) what checksum algos we could use in the future. And one could
always do this concurrent checksum chunking outside of Lucene hardwired to
CRC32 and accept the risk that Lucene can freely change it at any time ... so
maybe we should not pursue this. Have we ever changed the checksummer? Has it
been CRC32 since the start (I think so?).
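For reference, here is a rough sketch of what such an outside-of-Lucene combine could look like: a plain Java port of the classic `crc32_combine` routine from zlib, hardwired to the CRC-32 polynomial that `java.util.zip.CRC32` uses. The class name `Crc32Combine` and its method names are hypothetical (this is not a Lucene API, nor the AWS SDK code), and it assumes each chunk was checksummed independently with `java.util.zip.CRC32`:
```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical helper (not a Lucene or AWS SDK API): Java port of zlib's crc32_combine. */
final class Crc32Combine {
  private static final int GF2_DIM = 32;        // a CRC-32 value is a 32-bit vector over GF(2)
  private static final long POLY = 0xEDB88320L; // reflected CRC-32 polynomial (java.util.zip.CRC32)

  /** Multiplies the GF(2) matrix {@code mat} by the bit vector {@code vec}. */
  private static long gf2MatrixTimes(long[] mat, long vec) {
    long sum = 0;
    for (int i = 0; vec != 0; vec >>>= 1, i++) {
      if ((vec & 1) != 0) sum ^= mat[i];
    }
    return sum;
  }

  /** Stores mat * mat into square. */
  private static void gf2MatrixSquare(long[] square, long[] mat) {
    for (int n = 0; n < GF2_DIM; n++) square[n] = gf2MatrixTimes(mat, mat[n]);
  }

  /**
   * Returns the CRC-32 of A followed by B, given crc1 = CRC-32(A), crc2 = CRC-32(B)
   * and len2 = length of B in bytes.
   */
  static long combine(long crc1, long crc2, long len2) {
    if (len2 <= 0) return crc1; // nothing appended
    crc1 &= 0xFFFFFFFFL;        // keep the working value within 32 bits

    long[] even = new long[GF2_DIM];
    long[] odd = new long[GF2_DIM];

    // Operator that advances the CRC register over one zero bit.
    odd[0] = POLY;
    long row = 1;
    for (int n = 1; n < GF2_DIM; n++) {
      odd[n] = row;
      row <<= 1;
    }
    gf2MatrixSquare(even, odd); // operator for two zero bits
    gf2MatrixSquare(odd, even); // operator for four zero bits

    // Apply len2 zero bytes to crc1; the first squaring below yields the one-zero-byte operator.
    do {
      gf2MatrixSquare(even, odd);
      if ((len2 & 1) != 0) crc1 = gf2MatrixTimes(even, crc1);
      len2 >>= 1;
      if (len2 == 0) break;
      gf2MatrixSquare(odd, even);
      if ((len2 & 1) != 0) crc1 = gf2MatrixTimes(odd, crc1);
      len2 >>= 1;
    } while (len2 != 0);

    return crc1 ^ crc2;
  }

  /** Convenience wrapper around java.util.zip.CRC32 for a single chunk. */
  static long crcOf(byte[] bytes) {
    CRC32 crc = new CRC32();
    crc.update(bytes, 0, bytes.length);
    return crc.getValue();
  }

  public static void main(String[] args) {
    byte[] a = "first chunk, ".getBytes(StandardCharsets.UTF_8);
    byte[] b = "second chunk".getBytes(StandardCharsets.UTF_8);

    // Checksum of the concatenation, computed sequentially.
    CRC32 whole = new CRC32();
    whole.update(a, 0, a.length);
    whole.update(b, 0, b.length);

    // Checksums of each chunk (could be computed concurrently), then combined.
    long combined = combine(crcOf(a), crcOf(b), b.length);
    System.out.println(combined == whole.getValue());
  }
}
```
Note that the polynomial constant and the matrix setup are exactly the parts that would have to change if the checksum ever moved to CRC32C, which is the coupling discussed above.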
Anyway I'm OK with WONTFIX -- thanks all for the discussion / pointers.