gf2121 commented on PR #14176:
URL: https://github.com/apache/lucene/pull/14176#issuecomment-2639090793
Thanks @iverase !

For the vectorized decoding, I benchmarked the decoding methods with JMH. The result on my M2 Mac:
```
Benchmark                            Mode  Cnt    Score   Error   Units
BKDCodecBenchmark.readInts16ForUtil  thrpt   5   94.529 ± 2.886  ops/ms
BKDCodecBenchmark.readInts16Vector   thrpt   5  194.320 ± 7.082  ops/ms
BKDCodecBenchmark.readInts24ForUtil  thrpt   5   93.435 ± 5.063  ops/ms
BKDCodecBenchmark.readInts24Legacy   thrpt   5   81.779 ± 1.390  ops/ms
BKDCodecBenchmark.readInts24Vector   thrpt   5  151.203 ± 0.460  ops/ms
```
It suggests that `readInts24ForUtil` and `readInts24Legacy` do not differ much, which is consistent with the previous luceneutil result:
> The previous result was obtained with taskRepeatCount=20. I find that the speedup disappeared when taskRepeatCount was increased to 50:
> ```
>             TaskQPS  baseline  StdDevQPS  my_modified_version   StdDev             Pct diff  p-value
>   TermDayOfYearSort    196.21     (8.7%)               194.85  (11.2%)  -0.7% ( -18% - 21%)    0.871
> CountFilteredIntNRQ     84.92    (13.1%)                84.84  (12.1%)  -0.1% ( -22% - 28%)    0.987
>              IntNRQ    137.14    (20.2%)               137.30  (18.4%)   0.1% ( -31% - 48%)    0.989
>      FilteredIntNRQ    134.41    (20.0%)               135.05  (18.1%)   0.5% ( -31% - 48%)    0.954
>          TermDTSort    196.18     (9.0%)               201.19   (9.0%)   2.6% ( -14% - 22%)    0.506
> ```
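
For reference, a decode microbenchmark like the one above is typically structured as follows. This is a minimal sketch only; the class name, block size, and decode body are simplified placeholders, not the actual `BKDCodecBenchmark`:

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

// Minimal JMH sketch of a decode microbenchmark (placeholder code, not the
// real BKDCodecBenchmark): decode one 512-value block per invocation and
// report throughput in ops/ms, matching the units in the table above.
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 3)
@Measurement(iterations = 5)
@Fork(1)
@State(Scope.Benchmark)
public class DecodeBenchmarkSketch {

  private static final int BLOCK_SIZE = 512; // one BKD leaf block

  private long[] encoded;
  private int[] decoded;

  @Setup
  public void setup() {
    encoded = new long[BLOCK_SIZE / 2];
    decoded = new int[BLOCK_SIZE];
    Random r = new Random(42);
    for (int i = 0; i < encoded.length; i++) {
      encoded[i] = r.nextLong();
    }
  }

  @Benchmark
  public int[] readIntsScalar() {
    // Placeholder scalar decode: unpack two ints per long via shift/mask.
    for (int i = 0; i < encoded.length; i++) {
      long l = encoded[i];
      decoded[2 * i] = (int) (l >>> 32);
      decoded[2 * i + 1] = (int) l;
    }
    // Return the array so JMH consumes it and the loop is not dead-code
    // eliminated.
    return decoded;
  }
}
```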
The vectorized decoding method using the Vector API seems to perform much better. I'll try to run luceneutil to confirm the end-to-end result. I'll keep this PR simple and leave the vectorized decoding optimization to another PR:
https://github.com/apache/lucene/pull/14203
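
For context, the general shape of Vector API decoding is a shift-and-mask over whole lanes at once. A rough sketch under my own assumptions (`shiftLongs` and its arguments are illustrative, not the code from the linked PR; needs `--add-modules jdk.incubator.vector`):

```java
import jdk.incubator.vector.LongVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Illustrative Vector API sketch (not the actual code in the PR above):
// compute dst[i] = (src[i] >>> shift) & mask, one vector of lanes at a time.
public class VectorDecodeSketch {

  private static final VectorSpecies<Long> SPECIES = LongVector.SPECIES_PREFERRED;

  static void shiftLongs(long[] src, long[] dst, int count, int shift, long mask) {
    int i = 0;
    for (int bound = SPECIES.loopBound(count); i < bound; i += SPECIES.length()) {
      LongVector.fromArray(SPECIES, src, i)
          .lanewise(VectorOperators.LSHR, shift)
          .and(mask)
          .intoArray(dst, i);
    }
    for (; i < count; i++) { // scalar tail for the remainder
      dst[i] = (src[i] >>> shift) & mask;
    }
  }
}
```

The win comes from doing one shift and one mask per vector of lanes instead of per value; the scalar tail handles counts that are not a multiple of the species length.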