pmpailis commented on PR #13076:
URL: https://github.com/apache/lucene/pull/13076#issuecomment-1929150939
Thank you so much @rmuir & @uschindler for taking such a close look and for
running the benchmarks. 🙇 I went with the lookup table because it seemed to
give some improvement over `Integer.bitCount` on Neon (I hadn't tried
`VarHandle`, to be fair). I wasn't fond of the explicit lookup table either;
had we gone ahead with something like that, I was hoping to discuss a better
alternative (the vector-based results also look quite different).
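For readers following along, here is a minimal sketch of the two byte-at-a-time baselines being compared. The class and method names are hypothetical and this is not the code from the PR; it only assumes each variant XORs the two arrays byte by byte and counts the set bits.

```java
// Hypothetical sketch of the two scalar baselines; not the PR's actual code.
final class HammingBaselinesSketch {
  // Precomputed number of set bits for every possible unsigned byte value.
  private static final byte[] BIT_COUNTS = new byte[256];

  static {
    for (int i = 0; i < BIT_COUNTS.length; i++) {
      BIT_COUNTS[i] = (byte) Integer.bitCount(i);
    }
  }

  // Lookup-table variant: one table read per byte pair.
  static int lookupTable(byte[] a, byte[] b) {
    checkLengths(a, b);
    int distance = 0;
    for (int i = 0; i < a.length; i++) {
      distance += BIT_COUNTS[(a[i] ^ b[i]) & 0xFF];
    }
    return distance;
  }

  // Integer.bitCount variant: one popcount per byte pair.
  static int intBitCount(byte[] a, byte[] b) {
    checkLengths(a, b);
    int distance = 0;
    for (int i = 0; i < a.length; i++) {
      // Mask to 8 bits so sign extension does not add spurious set bits.
      distance += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
    }
    return distance;
  }

  private static void checkLengths(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("array lengths differ: " + a.length + " != " + b.length);
    }
  }
}
```

Both variants process a single byte per iteration, so any difference between them comes down to a table read versus a popcount.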
I added the changes to use `VarHandle` and re-ran the benchmarks. The
following results are from my local dev machine (Neon):
```
Benchmark                                             (size)   Mode  Cnt    Score    Error  Units
VectorUtilBenchmark.binaryHammingDistanceIntBitCount       1  thrpt   15  488.021 ±  4.800  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     128  thrpt   15    5.896 ±  0.038  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     207  thrpt   15    4.420 ±  0.065  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     256  thrpt   15    3.589 ±  0.032  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     300  thrpt   15    3.123 ±  0.040  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     512  thrpt   15    1.854 ±  0.017  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     702  thrpt   15    1.348 ±  0.045  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount    1024  thrpt   15    0.938 ±  0.015  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable       1  thrpt   15  502.334 ± 16.595  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     128  thrpt   15   18.142 ±  0.508  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     207  thrpt   15   11.611 ±  0.367  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     256  thrpt   15    9.426 ±  0.124  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     300  thrpt   15    7.932 ±  0.254  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     512  thrpt   15    4.762 ±  0.116  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     702  thrpt   15    3.532 ±  0.018  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable    1024  thrpt   15    2.425 ±  0.016  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle         1  thrpt   15  473.315 ±  5.442  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       128  thrpt   15   27.318 ±  0.152  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       207  thrpt   15   16.651 ±  0.540  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       256  thrpt   15   14.506 ±  0.046  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       300  thrpt   15   12.170 ±  0.023  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       512  thrpt   15    7.478 ±  0.020  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       702  thrpt   15    5.157 ±  0.314  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle      1024  thrpt   15    3.677 ±  0.085  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector            1  thrpt   15  491.316 ± 14.116  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector          128  thrpt   15   87.343 ±  2.689  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector          207  thrpt   15   43.176 ±  1.220  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector          256  thrpt   15   48.915 ±  0.477  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector          300  thrpt   15   34.555 ±  0.326  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector          512  thrpt   15   26.251 ±  0.284  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector          702  thrpt   15   17.679 ±  0.204  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector         1024  thrpt   15   13.717 ±  0.056  ops/us
```
I also ran the same experiments on a Xeon cloud instance, with the following
results:
```
Benchmark                                             (size)   Mode  Cnt    Score    Error  Units
VectorUtilBenchmark.binaryHammingDistanceIntBitCount       1  thrpt   15  407.490 ±  1.681  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     128  thrpt   15   13.283 ±  0.033  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     207  thrpt   15    8.201 ±  0.194  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     256  thrpt   15    6.775 ±  0.124  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     300  thrpt   15    5.658 ±  0.159  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     512  thrpt   15    3.488 ±  0.099  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount     702  thrpt   15    2.588 ±  0.046  ops/us
VectorUtilBenchmark.binaryHammingDistanceIntBitCount    1024  thrpt   15    1.866 ±  0.009  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable       1  thrpt   15  319.515 ±  0.776  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     128  thrpt   15   16.192 ±  0.222  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     207  thrpt   15    9.828 ±  0.057  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     256  thrpt   15    7.082 ±  0.044  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     300  thrpt   15    6.120 ±  0.090  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     512  thrpt   15    4.043 ±  0.058  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable     702  thrpt   15    2.625 ±  0.047  ops/us
VectorUtilBenchmark.binaryHammingDistanceLookupTable    1024  thrpt   15    1.954 ±  0.008  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle         1  thrpt   15  344.508 ±  1.039  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       128  thrpt   15  101.425 ±  1.319  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       207  thrpt   15   56.693 ±  6.604  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       256  thrpt   15   76.473 ±  0.201  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       300  thrpt   15   58.439 ±  1.204  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       512  thrpt   15   50.839 ±  1.050  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle       702  thrpt   15   42.945 ±  0.974  ops/us
VectorUtilBenchmark.binaryHammingDistanceVarHandle      1024  thrpt   15   38.331 ±  0.215  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector512         1  thrpt   15  281.455 ±  1.110  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector512       128  thrpt   15   31.618 ±  0.277  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector512       207  thrpt   15   19.928 ±  0.091  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector512       256  thrpt   15   16.684 ±  0.066  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector512       300  thrpt   15   11.351 ±  0.065  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector512       512  thrpt   15    8.520 ±  0.179  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector512       702  thrpt   15    5.596 ±  0.012  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector512      1024  thrpt   15    4.352 ±  0.021  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector256         1  thrpt   15  280.541 ±  3.963  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector256       128  thrpt   15   22.965 ±  0.386  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector256       207  thrpt   15   14.085 ±  0.278  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector256       256  thrpt   15   12.248 ±  0.180  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector256       300  thrpt   15   10.086 ±  0.220  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector256       512  thrpt   15    6.216 ±  0.022  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector256       702  thrpt   15    4.288 ±  0.064  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector256      1024  thrpt   15    3.164 ±  0.007  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector128         1  thrpt   15  281.373 ±  1.142  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector128       128  thrpt   15   27.610 ±  0.741  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector128       207  thrpt   15   16.567 ±  0.165  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector128       256  thrpt   15   14.946 ±  0.381  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector128       300  thrpt   15   11.887 ±  0.032  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector128       512  thrpt   15    7.735 ±  0.108  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector128       702  thrpt   15    5.430 ±  0.120  ops/us
VectorUtilBenchmark.binaryHammingDistanceVector128      1024  thrpt   15    3.870 ±  0.083  ops/us
```
On the Xeon, `VarHandle` clearly outperforms all other solutions at every
size larger than 1.
As suggested, I'll proceed with making this the main and only implementation
of Hamming distance, and remove both the Panama one and the leftovers from
the existing implementation (i.e. the lookup table).
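A minimal sketch of what a `VarHandle`-based Hamming distance can look like is below. It is an illustration under assumptions rather than the code in this PR: the class and method names are hypothetical, and it assumes the approach is to read the arrays as `long`s (then an `int`, then single bytes for the tail) and apply `Long.bitCount` to the XOR.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical sketch of a VarHandle-based Hamming distance; not the PR's code.
final class HammingVarHandleSketch {
  // Views over byte[] that read 8 or 4 bytes at a time. Byte order does not
  // affect the result, because popcount is independent of bit position.
  private static final VarHandle LONG_VIEW =
      MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);
  private static final VarHandle INT_VIEW =
      MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

  static int hammingDistance(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("array lengths differ: " + a.length + " != " + b.length);
    }
    int distance = 0;
    int i = 0;
    // Main loop: XOR and popcount 8 bytes per iteration.
    for (int upper = a.length & ~(Long.BYTES - 1); i < upper; i += Long.BYTES) {
      distance += Long.bitCount((long) LONG_VIEW.get(a, i) ^ (long) LONG_VIEW.get(b, i));
    }
    // At most one 4-byte step remains after the long loop.
    for (int upper = a.length & ~(Integer.BYTES - 1); i < upper; i += Integer.BYTES) {
      distance += Integer.bitCount((int) INT_VIEW.get(a, i) ^ (int) INT_VIEW.get(b, i));
    }
    // Tail: up to 3 remaining bytes, masked to avoid sign extension.
    for (; i < a.length; i++) {
      distance += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
    }
    return distance;
  }
}
```

Processing eight bytes per iteration is one plausible reason this variant pulls ahead of the byte-at-a-time loops as the array size grows.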