mikemccand commented on PR #15779:
URL: https://github.com/apache/lucene/pull/15779#issuecomment-3997216238

   On the nightly benchy box (`beast3`, Ryzen Threadripper 3990X), before:
   
   ```
   38092046 terms loaded
   done shuffling
   Inserted 38092046 terms in 20841.91 ms, unique term 38092046
   Inserted 38092046 terms in 21000.09 ms, unique term 38092046
   Inserted 38092046 terms in 21635.49 ms, unique term 38092046
   Inserted 38092046 terms in 20560.37 ms, unique term 38092046
   
    Performance counter stats for '/usr/lib/jvm/java-25-openjdk/bin/java -cp .:lucene/core/build/classes/java/main25:lucene/core/build/classes/java/main BHT /lucenedata/enwiki/\
   allterms-20110115.txt':
   
                    0      context-switches:u               #      0.0 cs/sec  cs_per_second
                    0      cpu-migrations:u                 #      0.0 migrations/sec  migrations_per_second
              554,477      page-faults:u                    #   3849.9 faults/sec  page_faults_per_second
           144,022.41 msec task-clock:u                     #      1.6 CPUs  CPUs_utilized
        4,099,191,298      L1-dcache-load-misses:u          #      3.5 %  l1d_miss_rate            (20.04%)
           17,137,351      L1-icache-load-misses:u          #      0.2 %  l1i_miss_rate            (20.02%)
        1,287,102,957      branch-misses:u                  #      3.0 %  branch_miss_rate         (20.00%)
       42,486,918,231      branches:u                       #    295.0 M/sec  branch_frequency     (20.00%)
      556,536,771,158      cpu-cycles:u                     #      3.9 GHz  cycles_frequency       (30.03%)
      246,271,809,438      instructions:u                   #      0.4 instructions  insn_per_cycle  (30.05%)
       18,978,848,529      stalled-cycles-frontend:u        #     0.03 frontend_cycles_idle        (20.04%)
        1,060,428,248      dTLB-loads:u                     #     27.1 %  dtlb_miss_rate           (20.08%)
              245,693      iTLB-loads:u                     #    132.8 %  itlb_miss_rate           (20.06%)
   
         90.699653506 seconds time elapsed
   
        132.112774000 seconds user
         12.021535000 seconds sys
   ```
   
   After:
   
   ```
   38092046 terms loaded
   done shuffling
   Inserted 38092046 terms in 11263.41 ms, unique term 38092046
   Inserted 38092046 terms in 12925.52 ms, unique term 38092046
   Inserted 38092046 terms in 12718.04 ms, unique term 38092046
   Inserted 38092046 terms in 12635.16 ms, unique term 38092046
   
    Performance counter stats for '/usr/lib/jvm/java-25-openjdk/bin/java -cp .:lucene/core/build/classes/java/main25:lucene/core/build/classes/java/main BHT /lucenedata/enwiki/\
   allterms-20110115.txt':
   
                    0      context-switches:u               #      0.0 cs/sec  cs_per_second
                    0      cpu-migrations:u                 #      0.0 migrations/sec  migrations_per_second
               41,869      page-faults:u                    #    365.2 faults/sec  page_faults_per_second
           114,640.36 msec task-clock:u                     #      2.1 CPUs  CPUs_utilized
        3,491,553,643      L1-dcache-load-misses:u          #      2.7 %  l1d_miss_rate            (20.06%)
           15,855,892      L1-icache-load-misses:u          #      0.2 %  l1i_miss_rate            (20.08%)
        1,271,522,632      branch-misses:u                  #      2.6 %  branch_miss_rate         (20.09%)
       48,708,599,021      branches:u                       #    424.9 M/sec  branch_frequency     (20.08%)
      430,787,025,878      cpu-cycles:u                     #      3.8 GHz  cycles_frequency       (30.10%)
      285,622,442,684      instructions:u                   #      0.7 instructions  insn_per_cycle  (30.06%)
       18,208,623,146      stalled-cycles-frontend:u        #     0.04 frontend_cycles_idle        (20.05%)
          621,501,564      dTLB-loads:u                     #      3.9 %  dtlb_miss_rate           (20.04%)
              272,769      iTLB-loads:u                     #     53.7 %  itlb_miss_rate           (20.03%)
   
         55.867613269 seconds time elapsed
   
        102.551131000 seconds user
         12.034597000 seconds sys
   ```
   
   Nice!  Note the amazing drop in `dtlb_miss_rate` (27.1% -> 3.9%).  I think the dTLB is a cache the CPU keeps close for mapping virtual -> physical addresses, so the better locality pays off.
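
For a rough sense of scale, the numbers above can be reduced to a few ratios. This is only a back-of-the-envelope sketch: the values are copied from the two runs above, the variable names are my own, and with four iterations per run (and multiplexed counters) the results should be read as approximate.

```python
# Back-of-the-envelope ratios from the two perf runs above.
# All inputs are copied from the benchmark output; names are illustrative.

before_ms = [20841.91, 21000.09, 21635.49, 20560.37]
after_ms = [11263.41, 12925.52, 12718.04, 12635.16]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Speedup of the timed insert loop (mean of the four iterations).
insert_speedup = mean(before_ms) / mean(after_ms)

# Instructions per cycle, recomputed from the raw counters.
ipc_before = 246_271_809_438 / 556_536_771_158
ipc_after = 285_622_442_684 / 430_787_025_878

# Reduction in page faults and in end-to-end wall-clock time.
page_fault_ratio = 554_477 / 41_869
wall_speedup = 90.699653506 / 55.867613269

print(f"insert speedup:   {insert_speedup:.2f}x")
print(f"IPC before/after: {ipc_before:.2f} / {ipc_after:.2f}")
print(f"page faults:      {page_fault_ratio:.1f}x fewer")
print(f"wall clock:       {wall_speedup:.2f}x faster")
```

The insert loop comes out at roughly 1.7x faster, and the recomputed IPC matches the rounded 0.4 and 0.7 that `perf` reports, which fits the locality explanation: the "after" run executes more instructions in fewer cycles.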


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

