vsop-479 opened a new issue, #12788:
URL: https://github.com/apache/lucene/issues/12788
### Description
Is it worth unrolling or vectorizing the Math.max loop in
CompetitiveImpactAccumulator.addAll?
The scalar loop may already be auto-vectorized by the JIT, but the unrolled
version is still somewhat faster on my machine (Mac M2).
What I can't figure out is why the vectorized implementation performs the
worst. Could it be the CPU architecture? I also measured dotProduct, and there
the vectorized implementation is the slowest as well.
Max implementation | ns | diff vs. scalar
-- | -- | --
scalar | 26041 | -
unrolled | 23625 | +9.2%
vectorized | 888084 | -
# unrolled
````
for (int i = 0; i < 256; i += 8) {
  a[i] = Math.max(a[i], b[i]);
  a[i + 1] = Math.max(a[i + 1], b[i + 1]);
  a[i + 2] = Math.max(a[i + 2], b[i + 2]);
  a[i + 3] = Math.max(a[i + 3], b[i + 3]);
  a[i + 4] = Math.max(a[i + 4], b[i + 4]);
  a[i + 5] = Math.max(a[i + 5], b[i + 5]);
  a[i + 6] = Math.max(a[i + 6], b[i + 6]);
  a[i + 7] = Math.max(a[i + 7], b[i + 7]);
}
````
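The scalar baseline is not shown in this issue. As a rough sketch, assuming it is a plain element-wise loop, the two kernels could be checked against each other like this. The class name `MaxKernels`, the method names, and the scalar tail loop are illustrative assumptions, not code from Lucene.

```java
import java.util.Arrays;
import java.util.Random;

final class MaxKernels {
  // Assumed scalar baseline: one Math.max per element (not shown in the issue).
  static void scalarMax(int[] a, int[] b) {
    for (int i = 0; i < a.length; i++) {
      a[i] = Math.max(a[i], b[i]);
    }
  }

  // Unrolled by 8 as in the snippet above, plus a scalar tail (an addition here)
  // so that lengths that are not a multiple of 8 are handled too.
  static void unrolledMax(int[] a, int[] b) {
    int i = 0;
    int bound = a.length & ~7; // largest multiple of 8 that is <= a.length
    for (; i < bound; i += 8) {
      a[i] = Math.max(a[i], b[i]);
      a[i + 1] = Math.max(a[i + 1], b[i + 1]);
      a[i + 2] = Math.max(a[i + 2], b[i + 2]);
      a[i + 3] = Math.max(a[i + 3], b[i + 3]);
      a[i + 4] = Math.max(a[i + 4], b[i + 4]);
      a[i + 5] = Math.max(a[i + 5], b[i + 5]);
      a[i + 6] = Math.max(a[i + 6], b[i + 6]);
      a[i + 7] = Math.max(a[i + 7], b[i + 7]);
    }
    for (; i < a.length; i++) {
      a[i] = Math.max(a[i], b[i]);
    }
  }

  public static void main(String[] args) {
    Random random = new Random(42);
    int[] a1 = new int[259]; // deliberately not a multiple of 8
    int[] b = new int[259];
    for (int i = 0; i < a1.length; i++) {
      a1[i] = random.nextInt(1000);
      b[i] = random.nextInt(1000);
    }
    int[] a2 = a1.clone();
    scalarMax(a1, b);
    unrolledMax(a2, b);
    // Both kernels should produce identical arrays.
    System.out.println("kernels agree: " + Arrays.equals(a1, a2));
  }
}
```

Checking that both variants agree before timing them helps rule out the case where a faster variant is simply doing less work.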
# vectorized
````
for (; i < SPECIES.loopBound(a.length); i += SPECIES.length()) {
  IntVector vectorA = IntVector.fromArray(SPECIES, a, i);
  IntVector vectorB = IntVector.fromArray(SPECIES, b, i);
  IntVector vectorC = vectorA.max(vectorB);
  vectorC.intoArray(a, i);
}
````
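One possible explanation for the vectorized number, which this issue does not confirm, is that the loop was timed before the JIT had compiled it. The Vector API lives in the incubator module `jdk.incubator.vector` (enabled with `--add-modules jdk.incubator.vector`), and it tends to be very slow until the compiler replaces the vector calls with intrinsics. JMH is the usual tool for this kind of measurement. The following is only a minimal hand-rolled sketch of the warm-up idea, using plain Java so it runs without the incubator module. `WarmupTimer`, `Kernel`, and `averageNanos` are hypothetical names, and this is not the benchmark used above.

```java
final class WarmupTimer {
  // Hypothetical kernel shape: updates a in place using b.
  interface Kernel {
    void apply(int[] a, int[] b);
  }

  // Runs the kernel warmupRounds times untimed so the JIT can compile it,
  // then returns the average nanoseconds per call over measuredRounds calls.
  static long averageNanos(
      Kernel kernel, int[] a, int[] b, int warmupRounds, int measuredRounds) {
    for (int i = 0; i < warmupRounds; i++) {
      kernel.apply(a, b);
    }
    long start = System.nanoTime();
    for (int i = 0; i < measuredRounds; i++) {
      kernel.apply(a, b);
    }
    return (System.nanoTime() - start) / measuredRounds;
  }

  public static void main(String[] args) {
    int[] a = new int[256];
    int[] b = new int[256];
    for (int i = 0; i < a.length; i++) {
      a[i] = i;
      b[i] = 255 - i;
    }
    // Scalar max kernel; any other variant with the same shape can be passed instead.
    Kernel scalar = (x, y) -> {
      for (int i = 0; i < x.length; i++) {
        x[i] = Math.max(x[i], y[i]);
      }
    };
    long ns = averageNanos(scalar, a, b, 20_000, 20_000);
    System.out.println("scalar avg ns/call: " + ns);
  }
}
```

If the vectorized variant is still far slower after warm-up, the cause is more likely to be in how the species is chosen or how the code is compiled on that platform than in the timing method.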