vsop-479 opened a new issue, #12788:
URL: https://github.com/apache/lucene/issues/12788

   ### Description
   
   Is it worth unrolling or vectorizing the Math.max loop in 
CompetitiveImpactAccumulator.addAll?
   The scalar loop may already be auto-vectorized by the JIT, but the unrolled 
version is somewhat faster on my machine (Mac M2).
   What I can't figure out is why the vectorized implementation performs 
worst. Could it be the CPU architecture? I also measured dotProduct, and the 
vectorized implementation is the slowest there too.
   
   Max | ns | diff
   -- | -- | --
   scalar | 26041 | -
   unrolled | 23625 |  +9.2%
   vectorized | 888084 | about 34x slower
   
   # unrolled
   ````
   for (int i = 0; i < 256; i += 8) {
         a[i] = Math.max(a[i], b[i]);
         a[i + 1] = Math.max(a[i + 1], b[i + 1]);
         a[i + 2] = Math.max(a[i + 2], b[i + 2]);
         a[i + 3] = Math.max(a[i + 3], b[i + 3]);
         a[i + 4] = Math.max(a[i + 4], b[i + 4]);
         a[i + 5] = Math.max(a[i + 5], b[i + 5]);
         a[i + 6] = Math.max(a[i + 6], b[i + 6]);
         a[i + 7] = Math.max(a[i + 7], b[i + 7]);
       }
   ````
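
The unrolled loop above assumes a fixed length of 256, which is a multiple of 8. As a minimal sketch only (the class and method names `MaxUnrollSketch`, `maxScalar`, and `maxUnrolled` are hypothetical and not part of Lucene), the following adds a scalar tail so other lengths work, and checks the unrolled variant against the plain scalar loop before any timing is compared:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class for illustration; not Lucene code.
final class MaxUnrollSketch {

  // Baseline: element-wise max of b into a, one element per iteration.
  static void maxScalar(int[] a, int[] b) {
    for (int i = 0; i < a.length; i++) {
      a[i] = Math.max(a[i], b[i]);
    }
  }

  // Unrolled by 8, with a scalar tail for lengths that are not a multiple of 8.
  static void maxUnrolled(int[] a, int[] b) {
    int i = 0;
    int bound = a.length & ~7; // largest multiple of 8 that is <= a.length
    for (; i < bound; i += 8) {
      a[i] = Math.max(a[i], b[i]);
      a[i + 1] = Math.max(a[i + 1], b[i + 1]);
      a[i + 2] = Math.max(a[i + 2], b[i + 2]);
      a[i + 3] = Math.max(a[i + 3], b[i + 3]);
      a[i + 4] = Math.max(a[i + 4], b[i + 4]);
      a[i + 5] = Math.max(a[i + 5], b[i + 5]);
      a[i + 6] = Math.max(a[i + 6], b[i + 6]);
      a[i + 7] = Math.max(a[i + 7], b[i + 7]);
    }
    // Handle the remaining 0..7 elements one at a time.
    for (; i < a.length; i++) {
      a[i] = Math.max(a[i], b[i]);
    }
  }

  public static void main(String[] args) {
    Random random = new Random(42);
    // 259 is deliberately not a multiple of 8 so the tail loop is exercised.
    int[] a = random.ints(259, 0, 1000).toArray();
    int[] b = random.ints(259, 0, 1000).toArray();
    int[] expected = a.clone();
    int[] actual = a.clone();
    maxScalar(expected, b);
    maxUnrolled(actual, b);
    System.out.println("variants agree: " + Arrays.equals(expected, actual));
  }
}
```

This only checks correctness. It says nothing about speed, and timing numbers like those in the table are best collected with a harness that handles JIT warm-up.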
   
   # vectorized
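   ````
   // Assumes SPECIES is an IntVector species, e.g. IntVector.SPECIES_PREFERRED
   int i = 0;
   int bound = SPECIES.loopBound(a.length);
   for (; i < bound; i += SPECIES.length()) {
         IntVector vectorA = IntVector.fromArray(SPECIES, a, i);
         IntVector vectorB = IntVector.fromArray(SPECIES, b, i);
         IntVector vectorC = vectorA.max(vectorB);
         vectorC.intoArray(a, i);
       }
   // Scalar tail for the elements past the last full vector
   for (; i < a.length; i++) {
         a[i] = Math.max(a[i], b[i]);
       }
   ````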


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
