-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65303/#review196583
-----------------------------------------------------------


Ship it!





- Stephan Erb


On Jan. 31, 2018, 7:12 p.m., Bill Farner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65303/
> -----------------------------------------------------------
> 
> (Updated Jan. 31, 2018, 7:12 p.m.)
> 
> 
> Review request for Aurora and Jordan Ly.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Use `ArrayDeque` rather than `HashSet` for fetchTasks, and use imperative 
> style rather than functional.  I arrived at this result after running 
> benchmarks with some of the other usual suspects (`ArrayList`, `LinkedList`).
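
For readers without the diff at hand, the change described above can be sketched roughly as follows. This is a minimal illustration, not the actual `MemTaskStore` code: the class name `FetchTasksSketch`, both method names, and the use of a plain `Predicate` as the query filter are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for the store's fetch path; names are illustrative only.
final class FetchTasksSketch {
  private FetchTasksSketch() {}

  // Functional style: stream the candidates and collect matches into a HashSet.
  // Every match is hashed and wrapped in a hash-table entry.
  static <T> Set<T> fetchFunctional(Collection<T> tasks, Predicate<? super T> filter) {
    return tasks.stream().filter(filter).collect(Collectors.toCollection(HashSet::new));
  }

  // Imperative style: a plain loop appending matches to an ArrayDeque, which is
  // backed by a resizable array and allocates no per-element node or entry.
  // Note that ArrayDeque rejects null elements and does not de-duplicate.
  static <T> Collection<T> fetchImperative(Collection<T> tasks, Predicate<? super T> filter) {
    ArrayDeque<T> result = new ArrayDeque<>();
    for (T task : tasks) {
      if (filter.test(task)) {
        result.add(task);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> taskIds = Arrays.asList("task-a", "task-b", "other-c");
    Predicate<String> isTask = id -> id.startsWith("task-");
    System.out.println(fetchImperative(taskIds, isTask)); // [task-a, task-b]
    System.out.println(fetchFunctional(taskIds, isTask).size()); // 2
  }
}
```

The two variants return the same elements when the input has no duplicates; the imperative one simply skips the hashing and per-entry allocation, which is the kind of cost the heap profiler output is meant to expose.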
> 
> This patch also enables stack and heap profilers in jmh (more details 
> [here](http://hg.openjdk.java.net/codetools/jmh/file/25d8b2695bac/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_35_Profilers.java)),
>  providing insight into the heap impact of changes.  I started this change 
> with a heap profiler as the primary motivation, and ended up using it to 
> guide this improvement.
> 
> 
> Diffs
> -----
> 
>   build.gradle 64af7aefbe784d95df28f59606a0d17afb57c3a1 
>   src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java 9ec9865ae9a60fa2ab81832a2cf886b7b6b887cd 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java b59999ca9a5185e240ad729fefc6638476a4aecc 
> 
> 
> Diff: https://reviews.apache.org/r/65303/diff/2/
> 
> 
> Testing
> -------
> 
> Full benchmark summary for `TaskStoreBenchmarks` is at the bottom, but here 
> is an abridged version.  It shows that task fetch throughput improves by 
> roughly 2x across the board (modulo error margins), and per-operation heap 
> allocation drops by a comparable factor.  Overall GC time increases slightly 
> as captured here, but the stddev was anecdotally high across runs.  I chose 
> to present this output as a caveat and a discussion point.
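
As a quick sanity check on the "~2x" figure, the ratios can be computed directly from the point estimates in the abridged tables of this message. The sketch below is only arithmetic on those reported scores; the class and method names are made up for the example, and it deliberately ignores the error margins, which are wide.

```java
import java.util.Locale;

// Hypothetical helper; it only divides the scores reported in the tables.
final class SpeedupSketch {
  private SpeedupSketch() {}

  // Throughput gain: patched ops/s divided by original ops/s.
  static double throughputGain(double opsBefore, double opsAfter) {
    return opsAfter / opsBefore;
  }

  // Allocation reduction: original B/op divided by patched B/op.
  static double allocationReduction(double bytesBefore, double bytesAfter) {
    return bytesBefore / bytesAfter;
  }

  public static void main(String[] args) {
    String[] rows = {
      "FetchAll 10000", "FetchAll 50000", "FetchAll 100000",
      "IndexedFetchAndFilter 10000", "IndexedFetchAndFilter 50000",
      "IndexedFetchAndFilter 100000"
    };
    // Scores copied from the abridged "prior to" and "with this patch" tables.
    double[] opsBefore = {481.529, 78.652, 38.371, 296.557, 50.300, 17.637};
    double[] opsAfter = {1653.572, 210.454, 97.783, 500.740, 95.316, 41.572};
    double[] allocBefore = {
      334970.771, 3991107.524, 13487028.139, 655319.005, 6671548.381, 28100173.458
    };
    double[] allocAfter = {
      155426.052, 1457560.505, 5096464.582, 370760.068, 3389472.432, 12324183.188
    };
    for (int i = 0; i < rows.length; i++) {
      System.out.println(String.format(Locale.ROOT,
          "%-30s throughput x%.2f, allocation /%.2f",
          rows[i],
          throughputGain(opsBefore[i], opsAfter[i]),
          allocationReduction(allocBefore[i], allocAfter[i])));
    }
  }
}
```

On these point estimates the throughput gain ranges from roughly 1.7x to 3.4x and the allocation reduction from roughly 1.8x to 2.7x, so "~2x" is a fair summary only once the error margins are taken into account.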
> 
> If you scroll to the full output at the bottom, you will see some more 
> granular allocation data.  Please note that the `norm` stats are normalized 
> for the number of operations, which I find to be the most useful measure for 
> validating a change.  Quoting the jmh sample link above:
> ```quote
> It is often useful to look into non-normalized counters to see if the test is 
> allocation/GC-bound (figure the allocation pressure "ceiling" for your 
> configuration!), and normalized counters to see the more precise benchmark 
> behavior.
> ```
> 
> Prior to this patch:
> ```console
> Benchmark                                    (numTasks)         Score         Error   Units
> FetchAll.run                                      10000       481.529 ±     184.751   ops/s
> FetchAll.run:·gc.alloc.rate.norm                  10000    334970.771 ±   33544.960    B/op
> 
> FetchAll.run                                      50000        78.652 ±      20.869   ops/s
> FetchAll.run:·gc.alloc.rate.norm                  50000   3991107.524 ±  701585.657    B/op
> 
> FetchAll.run                                     100000        38.371 ±      11.710   ops/s
> FetchAll.run:·gc.alloc.rate.norm                 100000  13487028.139 ± 3369614.510    B/op
> 
> IndexedFetchAndFilter.run                         10000       296.557 ±     198.389   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm     10000    655319.005 ±   98138.360    B/op
> 
> IndexedFetchAndFilter.run                         50000        50.300 ±       5.818   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm     50000   6671548.381 ±  452020.849    B/op
> 
> IndexedFetchAndFilter.run                        100000        17.637 ±       3.739   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm    100000  28100173.458 ± 4486308.188    B/op
> ```
> 
> With this patch:
> ```console
> Benchmark                                    (numTasks)         Score         Error   Units
> FetchAll.run                                      10000      1653.572 ±     799.123   ops/s
> FetchAll.run:·gc.alloc.rate.norm                  10000    155426.052 ±   10345.657    B/op
> 
> FetchAll.run                                      50000       210.454 ±      54.340   ops/s
> FetchAll.run:·gc.alloc.rate.norm                  50000   1457560.505 ±  228631.547    B/op
> 
> FetchAll.run                                     100000        97.783 ±      42.130   ops/s
> FetchAll.run:·gc.alloc.rate.norm                 100000   5096464.582 ± 1792136.191    B/op
> 
> IndexedFetchAndFilter.run                         10000       500.740 ±     210.675   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm     10000    370760.068 ±   36813.071    B/op
> 
> IndexedFetchAndFilter.run                         50000        95.316 ±      23.084   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm     50000   3389472.432 ±  550602.162    B/op
> 
> IndexedFetchAndFilter.run                        100000        41.572 ±      26.747   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm    100000  12324183.188 ± 7537788.165    B/op
> ```
> 
> 
> **Full benchmark output**
> 
> Prior to this patch:
> ```console
> Benchmark                                                   (numTasks)         Score         Error   Units
> FetchAll.run                                                     10000       481.529 ±     184.751   ops/s
> FetchAll.run:·gc.alloc.rate                                      10000       148.678 ±      42.890  MB/sec
> FetchAll.run:·gc.alloc.rate.norm                                 10000    334970.771 ±   33544.960    B/op
> FetchAll.run:·gc.churn.PS_Eden_Space                             10000       146.991 ±     135.486  MB/sec
> FetchAll.run:·gc.churn.PS_Eden_Space.norm                        10000    332983.005 ±  347401.950    B/op
> FetchAll.run:·gc.churn.PS_Survivor_Space                         10000         0.804 ±       1.823  MB/sec
> FetchAll.run:·gc.churn.PS_Survivor_Space.norm                    10000      1784.147 ±    3904.546    B/op
> FetchAll.run:·gc.count                                           10000         9.000                counts
> FetchAll.run:·gc.time                                            10000       143.000                    ms
> 
> FetchAll.run                                                     50000        78.652 ±      20.869   ops/s
> FetchAll.run:·gc.alloc.rate                                      50000       250.771 ±      34.190  MB/sec
> FetchAll.run:·gc.alloc.rate.norm                                 50000   3991107.524 ±  701585.657    B/op
> FetchAll.run:·gc.churn.PS_Eden_Space                             50000       250.131 ±     144.214  MB/sec
> FetchAll.run:·gc.churn.PS_Eden_Space.norm                        50000   3999003.844 ± 2907196.744    B/op
> FetchAll.run:·gc.churn.PS_Old_Gen                                50000         6.937 ±      20.180  MB/sec
> FetchAll.run:·gc.churn.PS_Old_Gen.norm                           50000    111462.141 ±  322286.235    B/op
> FetchAll.run:·gc.churn.PS_Survivor_Space                         50000         6.056 ±       4.371  MB/sec
> FetchAll.run:·gc.churn.PS_Survivor_Space.norm                    50000     96534.909 ±   73072.098    B/op
> FetchAll.run:·gc.count                                           50000        22.000                counts
> FetchAll.run:·gc.time                                            50000      3222.000                    ms
> 
> FetchAll.run                                                    100000        38.371 ±      11.710   ops/s
> FetchAll.run:·gc.alloc.rate                                     100000       343.280 ±      63.923  MB/sec
> FetchAll.run:·gc.alloc.rate.norm                                100000  13487028.139 ± 3369614.510    B/op
> FetchAll.run:·gc.churn.PS_Eden_Space                            100000       343.804 ±     147.542  MB/sec
> FetchAll.run:·gc.churn.PS_Eden_Space.norm                       100000  13524848.537 ± 7132093.384    B/op
> FetchAll.run:·gc.churn.PS_Old_Gen                               100000         7.251 ±      26.847  MB/sec
> FetchAll.run:·gc.churn.PS_Old_Gen.norm                          100000    286256.200 ± 1043939.286    B/op
> FetchAll.run:·gc.churn.PS_Survivor_Space                        100000        11.448 ±      16.645  MB/sec
> FetchAll.run:·gc.churn.PS_Survivor_Space.norm                   100000    440924.671 ±  539369.420    B/op
> FetchAll.run:·gc.count                                          100000        53.000                counts
> FetchAll.run:·gc.time                                           100000      8664.000                    ms
> 
> IndexedFetchAndFilter.run                                        10000       296.557 ±     198.389   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate                         10000       178.657 ±      96.891  MB/sec
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm                    10000    655319.005 ±   98138.360    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space                10000       181.829 ±     115.598  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space.norm           10000    669894.533 ±  362265.228    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space            10000         1.017 ±       2.764  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space.norm       10000      3509.419 ±    8933.232    B/op
> IndexedFetchAndFilter.run:·gc.count                              10000        11.000                counts
> IndexedFetchAndFilter.run:·gc.time                               10000       174.000                    ms
> 
> IndexedFetchAndFilter.run                                        50000        50.300 ±       5.818   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate                         50000       271.042 ±      35.522  MB/sec
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm                    50000   6671548.381 ±  452020.849    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space                50000       278.006 ±     188.990  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space.norm           50000   6835542.988 ± 4208216.383    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Old_Gen                   50000         7.836 ±      22.513  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Old_Gen.norm              50000    194944.435 ±  557587.333    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space            50000         6.063 ±       2.432  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space.norm       50000    148960.731 ±   42282.391    B/op
> IndexedFetchAndFilter.run:·gc.count                              50000        24.000                counts
> IndexedFetchAndFilter.run:·gc.time                               50000      3059.000                    ms
> 
> IndexedFetchAndFilter.run                                       100000        17.637 ±       3.739   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate                        100000       336.740 ±      69.527  MB/sec
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm                   100000  28100173.458 ± 4486308.188    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space               100000       336.494 ±      88.830  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space.norm          100000  28063164.240 ± 4888826.638    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Old_Gen                  100000         8.028 ±      37.263  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Old_Gen.norm             100000    672808.968 ± 2924497.150    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space           100000        11.351 ±      17.881  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space.norm      100000    930977.737 ± 1252367.282    B/op
> IndexedFetchAndFilter.run:·gc.count                             100000        47.000                counts
> IndexedFetchAndFilter.run:·gc.time                              100000      7245.000                    ms
> ```
> 
> With this patch:
> ```console
> Benchmark                                                   (numTasks)         Score         Error   Units
> FetchAll.run                                                     10000      1653.572 ±     799.123   ops/s
> FetchAll.run:·gc.alloc.rate                                      10000       236.532 ±      98.709  MB/sec
> FetchAll.run:·gc.alloc.rate.norm                                 10000    155426.052 ±   10345.657    B/op
> FetchAll.run:·gc.churn.PS_Eden_Space                             10000       247.755 ±      55.490  MB/sec
> FetchAll.run:·gc.churn.PS_Eden_Space.norm                        10000    163873.606 ±   59092.580    B/op
> FetchAll.run:·gc.churn.PS_Survivor_Space                         10000         1.328 ±       1.540  MB/sec
> FetchAll.run:·gc.churn.PS_Survivor_Space.norm                    10000       883.684 ±    1120.393    B/op
> FetchAll.run:·gc.count                                           10000        18.000                counts
> FetchAll.run:·gc.time                                            10000       191.000                    ms
> 
> FetchAll.run                                                     50000       210.454 ±      54.340   ops/s
> FetchAll.run:·gc.alloc.rate                                      50000       248.216 ±      15.196  MB/sec
> FetchAll.run:·gc.alloc.rate.norm                                 50000   1457560.505 ±  228631.547    B/op
> FetchAll.run:·gc.churn.PS_Eden_Space                             50000       239.336 ±     174.541  MB/sec
> FetchAll.run:·gc.churn.PS_Eden_Space.norm                        50000   1409078.860 ± 1141224.117    B/op
> FetchAll.run:·gc.churn.PS_Old_Gen                                50000         6.504 ±      17.220  MB/sec
> FetchAll.run:·gc.churn.PS_Old_Gen.norm                           50000     38644.950 ±  105262.889    B/op
> FetchAll.run:·gc.churn.PS_Survivor_Space                         50000         5.994 ±       4.160  MB/sec
> FetchAll.run:·gc.churn.PS_Survivor_Space.norm                    50000     35246.411 ±   25958.915    B/op
> FetchAll.run:·gc.count                                           50000        21.000                counts
> FetchAll.run:·gc.time                                            50000      2875.000                    ms
> 
> FetchAll.run                                                    100000        97.783 ±      42.130   ops/s
> FetchAll.run:·gc.alloc.rate                                     100000       336.209 ±      80.094  MB/sec
> FetchAll.run:·gc.alloc.rate.norm                                100000   5096464.582 ± 1792136.191    B/op
> FetchAll.run:·gc.churn.PS_Eden_Space                            100000       342.190 ±     144.180  MB/sec
> FetchAll.run:·gc.churn.PS_Eden_Space.norm                       100000   5167420.986 ± 1634774.992    B/op
> FetchAll.run:·gc.churn.PS_Old_Gen                               100000        11.783 ±      36.073  MB/sec
> FetchAll.run:·gc.churn.PS_Old_Gen.norm                          100000    182947.872 ±  525172.467    B/op
> FetchAll.run:·gc.churn.PS_Survivor_Space                        100000        12.299 ±      13.795  MB/sec
> FetchAll.run:·gc.churn.PS_Survivor_Space.norm                   100000    184635.309 ±  199254.266    B/op
> FetchAll.run:·gc.count                                          100000        46.000                counts
> FetchAll.run:·gc.time                                           100000      7778.000                    ms
> 
> IndexedFetchAndFilter.run                                        10000       500.740 ±     210.675   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate                         10000       171.305 ±      57.968  MB/sec
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm                    10000    370760.068 ±   36813.071    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space                10000       176.084 ±     103.579  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space.norm           10000    387100.753 ±  376481.454    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space            10000         1.305 ±       1.866  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space.norm       10000      2812.059 ±    3518.689    B/op
> IndexedFetchAndFilter.run:·gc.count                              10000        11.000                counts
> IndexedFetchAndFilter.run:·gc.time                               10000       170.000                    ms
> 
> IndexedFetchAndFilter.run                                        50000        95.316 ±      23.084   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate                         50000       258.291 ±      30.111  MB/sec
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm                    50000   3389472.432 ±  550602.162    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space                50000       250.887 ±     148.296  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space.norm           50000   3308741.831 ± 2461004.974    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Old_Gen                   50000         5.218 ±      21.710  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Old_Gen.norm              50000     69254.269 ±  282577.478    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space            50000         5.803 ±       2.885  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space.norm       50000     76523.177 ±   51120.227    B/op
> IndexedFetchAndFilter.run:·gc.count                              50000        21.000                counts
> IndexedFetchAndFilter.run:·gc.time                               50000      2775.000                    ms
> 
> IndexedFetchAndFilter.run                                       100000        41.572 ±      26.747   ops/s
> IndexedFetchAndFilter.run:·gc.alloc.rate                        100000       331.638 ±      50.813  MB/sec
> IndexedFetchAndFilter.run:·gc.alloc.rate.norm                   100000  12324183.188 ± 7537788.165    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space               100000       333.474 ±     116.673  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Eden_Space.norm          100000  12357891.009 ± 7285356.875    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Old_Gen                  100000        10.296 ±      27.573  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Old_Gen.norm             100000    371782.085 ±  910072.098    B/op
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space           100000        11.815 ±      10.161  MB/sec
> IndexedFetchAndFilter.run:·gc.churn.PS_Survivor_Space.norm      100000    428555.780 ±  184610.507    B/op
> IndexedFetchAndFilter.run:·gc.count                             100000        49.000                counts
> IndexedFetchAndFilter.run:·gc.time                              100000      8602.000                    ms
> ```
> 
> 
> Thanks,
> 
> Bill Farner
> 
>
