costin commented on pull request #453:
URL: https://github.com/apache/lucene/pull/453#issuecomment-973999178


   I have tightened the implementation a bit: removed an extra field, added 
some constants, and followed the style of Packed64 more closely with regard 
to the conditionals.
   In addition, I updated the benchmark to differentiate between consecutive 
get/set and sparse get/set, where different parts of memory are being read.
   This has a big impact on VHLB (VarHandleLongByte) performance, which almost 
doubles in the sparse case, while the other implementations don't exhibit 
much difference, making them more consistent. For example, at bpv=23:
   
   ```
   Benchmark                                                (bpv)  (size)   Mode  Cnt      Score      Error  Units
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     23   10240  thrpt    3  16121.142 ±  836.602  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          23   10240  thrpt    3  28436.567 ±  771.609  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     23   10240  thrpt    3  42751.522 ± 2367.785  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          23   10240  thrpt    3  40369.377 ± 1737.285  ops/s
   Packed64Benchmark.packed64_Consecutive                      23   10240  thrpt    3  52004.882 ± 2006.942  ops/s
   Packed64Benchmark.packed64_Sparse                           23   10240  thrpt    3  44671.486 ± 1567.467  ops/s
   ```
   
   It might be that the sparse benchmark is not adequate (the operations 
happen from the outside in, which should give a slight advantage towards the 
middle due to data locality).
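
To make the two access patterns concrete, here is a minimal sketch of the index orders involved. It is not the benchmark code from the PR: the class name `AccessPatterns`, its methods, and the strict alternation between the low and high ends are assumptions made for illustration only.

```java
import java.util.Arrays;

// Hypothetical illustration of the two index orders; not taken from the PR.
public class AccessPatterns {

  /** Consecutive order: 0, 1, 2, ..., size - 1. */
  static int[] consecutiveOrder(int size) {
    int[] order = new int[size];
    for (int i = 0; i < size; i++) {
      order[i] = i;
    }
    return order;
  }

  /** Outside-in order (assumed alternation): 0, size - 1, 1, size - 2, ... */
  static int[] outsideInOrder(int size) {
    int[] order = new int[size];
    int lo = 0;
    int hi = size - 1;
    int k = 0;
    while (lo <= hi) {
      order[k++] = lo++;
      if (lo <= hi) {
        order[k++] = hi--;
      }
    }
    return order;
  }

  /** Sum of absolute index jumps between successive accesses. */
  static long totalStride(int[] order) {
    long sum = 0;
    for (int i = 1; i < order.length; i++) {
      sum += Math.abs(order[i] - order[i - 1]);
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(consecutiveOrder(6)));
    System.out.println(Arrays.toString(outsideInOrder(6)));
    System.out.println(totalStride(consecutiveOrder(6)));
    System.out.println(totalStride(outsideInOrder(6)));
  }
}
```

In this sketch the jump between successive indices shrinks as the two ends converge, so the last accesses land close together. That is the locality effect near the middle that the sparse benchmark may be picking up.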
   
   Below is the full benchmark:
   
   ```
   Benchmark                                                (bpv)  (size)   Mode  Cnt      Score       Error  Units
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive      1   10240  thrpt    3  38301.741 ±   655.443  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive      4   10240  thrpt    3  25817.622 ± 16839.449  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive      5   10240  thrpt    3  21086.108 ±  1738.288  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive      8   10240  thrpt    3  16114.364 ±  1042.073  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     11   10240  thrpt    3  16056.271 ±  2397.599  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     16   10240  thrpt    3  17125.401 ±  1574.413  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     23   10240  thrpt    3  16063.316 ±   453.476  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     25   10240  thrpt    3  16046.605 ±   315.952  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     31   10240  thrpt    3  16017.969 ±   894.789  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     32   10240  thrpt    3  19653.331 ±  1231.635  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     47   10240  thrpt    3  15992.127 ±  1909.132  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     59   10240  thrpt    3  17114.009 ±  6263.882  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     61   10240  thrpt    3  15822.847 ±  8200.733  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Consecutive     64   10240  thrpt    3  40686.026 ±  2245.477  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse           1   10240  thrpt    3  49795.721 ±  1016.448  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse           4   10240  thrpt    3  37455.051 ±  1059.990  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse           5   10240  thrpt    3  34629.635 ±   730.416  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse           8   10240  thrpt    3  28438.560 ±   364.251  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          11   10240  thrpt    3  28240.196 ±   752.703  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          16   10240  thrpt    3  29998.199 ±  1275.620  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          23   10240  thrpt    3  28481.596 ±  1537.821  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          25   10240  thrpt    3  28585.727 ±   948.670  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          31   10240  thrpt    3  28002.335 ±  1701.436  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          32   10240  thrpt    3  34116.362 ±   421.667  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          47   10240  thrpt    3  28258.341 ±  1065.642  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          59   10240  thrpt    3  29776.379 ±   469.763  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          61   10240  thrpt    3  28820.101 ±  3651.373  ops/s
   Packed64Benchmark.packed64VarHandleLongByte_Sparse          64   10240  thrpt    3  57477.947 ±  3698.974  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive      1   10240  thrpt    3  32689.162 ±  1387.629  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive      4   10240  thrpt    3  35393.931 ±  1447.491  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive      5   10240  thrpt    3  40258.860 ±  2152.352  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive      8   10240  thrpt    3  35385.111 ±   347.894  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     11   10240  thrpt    3  45596.088 ±   990.686  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     16   10240  thrpt    3  35012.112 ±  9437.142  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     23   10240  thrpt    3  47095.570 ±   905.039  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     25   10240  thrpt    3  31985.949 ±   590.707  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     31   10240  thrpt    3  42513.815 ±  1896.300  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     32   10240  thrpt    3  35779.268 ±   956.749  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     47   10240  thrpt    3  30137.376 ±   516.086  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     59   10240  thrpt    3  25869.023 ±  1372.035  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     61   10240  thrpt    3  24951.079 ±   345.293  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Consecutive     64   10240  thrpt    3  35496.562 ±  2744.344  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse           1   10240  thrpt    3  53103.799 ±   564.971  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse           4   10240  thrpt    3  53134.834 ±   763.798  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse           5   10240  thrpt    3  42840.502 ±   547.945  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse           8   10240  thrpt    3  53661.284 ±  2760.469  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          11   10240  thrpt    3  42976.618 ±  4369.925  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          16   10240  thrpt    3  53319.723 ±  8308.326  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          23   10240  thrpt    3  40894.483 ±   772.369  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          25   10240  thrpt    3  40646.463 ±  1482.981  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          31   10240  thrpt    3  40995.711 ±  1436.172  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          32   10240  thrpt    3  53884.020 ±  2006.964  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          47   10240  thrpt    3  29708.915 ±   657.222  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          59   10240  thrpt    3  35063.658 ±  1556.863  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          61   10240  thrpt    3  28871.592 ±   366.623  ops/s
   Packed64Benchmark.packed64VarHandleLongLong_Sparse          64   10240  thrpt    3  53333.751 ± 15511.237  ops/s
   Packed64Benchmark.packed64_Consecutive                       1   10240  thrpt    3  36595.316 ±  1213.323  ops/s
   Packed64Benchmark.packed64_Consecutive                       4   10240  thrpt    3  39777.028 ±   953.027  ops/s
   Packed64Benchmark.packed64_Consecutive                       5   10240  thrpt    3  46119.465 ±   211.767  ops/s
   Packed64Benchmark.packed64_Consecutive                       8   10240  thrpt    3  39886.892 ±  1574.036  ops/s
   Packed64Benchmark.packed64_Consecutive                      11   10240  thrpt    3  53222.462 ±  1847.024  ops/s
   Packed64Benchmark.packed64_Consecutive                      16   10240  thrpt    3  39897.330 ±  1012.499  ops/s
   Packed64Benchmark.packed64_Consecutive                      23   10240  thrpt    3  50924.631 ±  2607.771  ops/s
   Packed64Benchmark.packed64_Consecutive                      25   10240  thrpt    3  51179.396 ±  4118.732  ops/s
   Packed64Benchmark.packed64_Consecutive                      31   10240  thrpt    3  49350.911 ±  2652.142  ops/s
   Packed64Benchmark.packed64_Consecutive                      32   10240  thrpt    3  40245.046 ±  1506.183  ops/s
   Packed64Benchmark.packed64_Consecutive                      47   10240  thrpt    3  43794.173 ±  4789.862  ops/s
   Packed64Benchmark.packed64_Consecutive                      59   10240  thrpt    3  41248.048 ±  2533.885  ops/s
   Packed64Benchmark.packed64_Consecutive                      61   10240  thrpt    3  42538.675 ±  3462.201  ops/s
   Packed64Benchmark.packed64_Consecutive                      64   10240  thrpt    3  40091.319 ±   962.752  ops/s
   Packed64Benchmark.packed64_Sparse                            1   10240  thrpt    3  60621.693 ±  6289.038  ops/s
   Packed64Benchmark.packed64_Sparse                            4   10240  thrpt    3  63121.896 ±  5265.890  ops/s
   Packed64Benchmark.packed64_Sparse                            5   10240  thrpt    3  49445.705 ±  1348.639  ops/s
   Packed64Benchmark.packed64_Sparse                            8   10240  thrpt    3  63533.078 ±  5166.244  ops/s
   Packed64Benchmark.packed64_Sparse                           11   10240  thrpt    3  47624.192 ±  5238.701  ops/s
   Packed64Benchmark.packed64_Sparse                           16   10240  thrpt    3  64148.964 ±   820.543  ops/s
   Packed64Benchmark.packed64_Sparse                           23   10240  thrpt    3  44013.707 ±  3850.643  ops/s
   Packed64Benchmark.packed64_Sparse                           25   10240  thrpt    3  44837.906 ±  2231.909  ops/s
   Packed64Benchmark.packed64_Sparse                           31   10240  thrpt    3  44638.184 ±  1887.982  ops/s
   Packed64Benchmark.packed64_Sparse                           32   10240  thrpt    3  63831.488 ±  1325.146  ops/s
   Packed64Benchmark.packed64_Sparse                           47   10240  thrpt    3  41508.192 ±  3339.770  ops/s
   Packed64Benchmark.packed64_Sparse                           59   10240  thrpt    3  40849.655 ±  2445.001  ops/s
   Packed64Benchmark.packed64_Sparse                           61   10240  thrpt    3  36888.602 ±  1342.537  ops/s
   Packed64Benchmark.packed64_Sparse                           64   10240  thrpt    3  64386.395 ±  1694.784  ops/s
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


