benwtrent commented on PR #13321:
URL: https://github.com/apache/lucene/pull/13321#issuecomment-2079695750

   OK, I ran this on one of Google's ARM machines (`Tau T2A machine series`) to make sure the ARM performance improvements still exist for int4 (and that it wasn't just some silly macOS thing):
   
   ```
   Benchmark                                       (size)   Mode  Cnt  Score   Error   Units
   VectorUtilBenchmark.binaryDotProductScalar        1024  thrpt   15  2.850 ± 0.002  ops/us
   VectorUtilBenchmark.binaryDotProductVector        1024  thrpt   15  2.771 ± 0.016  ops/us
   VectorUtilBenchmark.binaryHalfByteScalar          1024  thrpt   15  2.845 ± 0.009  ops/us
   VectorUtilBenchmark.binaryHalfByteScalarPacked    1024  thrpt   15  2.128 ± 0.003  ops/us
   VectorUtilBenchmark.binaryHalfByteVector          1024  thrpt   15  7.667 ± 0.007  ops/us
   VectorUtilBenchmark.binaryHalfByteVectorPacked    1024  thrpt   15  7.009 ± 0.025  ops/us
   ```
   
   Something else funny is that the int4 vector path (`binaryHalfByteVector`) runs at almost the same speed as `floatDotProductVector` on this hardware.
   
   The int4 micro-benchmarks are VERY close to float. This means the reduction in bytes read and parsed should make int4 much faster than float overall.
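
To make the "fewer bytes read" point concrete, here is a minimal scalar sketch of a packed int4 dot product. It is not Lucene's implementation: the class name `Int4DotSketch`, the method names, and the packing layout (value `i` in the high nibble, value `i + half` in the low nibble) are all assumptions chosen for illustration. With this kind of packing, a 1024-dimension vector takes 512 bytes, versus 1024 bytes for one int4 value per byte and 4096 bytes for `float`.

```java
// Illustrative sketch only; names and nibble layout are hypothetical, not Lucene's API.
final class Int4DotSketch {

    // Unpacked form: one int4 value (0..15) stored per byte.
    static int dotUnpacked(byte[] a, byte[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Packs two int4 values into each byte (assumed layout):
    // value i goes in the high nibble, value i + half in the low nibble.
    // Assumes an even-length input whose values are all in 0..15.
    static byte[] pack(byte[] unpacked) {
        int half = unpacked.length / 2;
        byte[] packed = new byte[half];
        for (int i = 0; i < half; i++) {
            packed[i] = (byte) ((unpacked[i] << 4) | (unpacked[i + half] & 0x0F));
        }
        return packed;
    }

    // Dot product of an unpacked query against a packed vector.
    // Only half as many bytes are read for the packed side.
    static int dotPacked(byte[] query, byte[] packed) {
        int half = packed.length;
        int sum = 0;
        for (int i = 0; i < half; i++) {
            int hi = (packed[i] & 0xFF) >>> 4; // mask first to avoid sign extension
            int lo = packed[i] & 0x0F;
            sum += query[i] * hi + query[i + half] * lo;
        }
        return sum;
    }
}
```

For example, `Int4DotSketch.dotPacked(q, Int4DotSketch.pack(d))` returns the same value as `Int4DotSketch.dotUnpacked(q, d)` for even-length inputs with values in 0..15. The extra shift and mask per byte is the unpacking cost that the `Packed` benchmark variants pay in exchange for the smaller footprint.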
   
   ```
   Benchmark                                  (size)   Mode  Cnt  Score   Error   Units
   VectorUtilBenchmark.floatDotProductScalar    1024  thrpt   15  2.476 ± 0.028  ops/us
   VectorUtilBenchmark.floatDotProductVector    1024  thrpt   75  8.703 ± 0.300  ops/us
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

