-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65303/#review196107
-----------------------------------------------------------




src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java
Line 234 (original), 235 (patched)
<https://reviews.apache.org/r/65303/#comment275620>

    Have you considered passing the predicate filter in here? For index 
scans, this should help eliminate a large number of allocations.
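
To make the suggestion concrete, here is a minimal sketch of what passing the filter into an index scan could look like. It is not taken from `MemTaskStore`: the class, the `tasksByJob` index, and both method names are hypothetical, and tasks are reduced to plain string ids. The point is only that testing the predicate inside the scan avoids materializing every index hit before filtering.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for a store with one secondary index.
// None of these names come from MemTaskStore.
public class PredicatePushdownSketch {
  // Secondary index: job name -> ids of the tasks in that job.
  private final Map<String, List<String>> tasksByJob = new HashMap<>();

  public void add(String job, String taskId) {
    tasksByJob.computeIfAbsent(job, k -> new ArrayList<>()).add(taskId);
  }

  // Without pushdown: every index hit is copied into an intermediate list,
  // and the filter only runs in a second pass over that copy.
  public Collection<String> scanThenFilter(String job, Predicate<String> filter) {
    List<String> hits =
        new ArrayList<>(tasksByJob.getOrDefault(job, Collections.emptyList()));
    Collection<String> result = new ArrayDeque<>();
    for (String id : hits) {
      if (filter.test(id)) {
        result.add(id);
      }
    }
    return result;
  }

  // With pushdown: the filter is tested while walking the index, so only
  // matching ids are ever stored and no intermediate copy is allocated.
  public Collection<String> scanWithFilter(String job, Predicate<String> filter) {
    Collection<String> result = new ArrayDeque<>();
    for (String id : tasksByJob.getOrDefault(job, Collections.emptyList())) {
      if (filter.test(id)) {
        result.add(id);
      }
    }
    return result;
  }
}
```

Both methods return the same elements in the same order; the second simply skips the intermediate copy, which is where I would expect the allocation savings to come from on selective queries.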


- Stephan Erb


On Jan. 24, 2018, 1:32 a.m., Bill Farner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65303/
> -----------------------------------------------------------
> 
> (Updated Jan. 24, 2018, 1:32 a.m.)
> 
> 
> Review request for Aurora and Jordan Ly.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Use `ArrayDeque` rather than `HashSet` for fetchTasks, and use imperative 
> style rather than functional.  I arrived at this result after running 
> benchmarks with some of the other usual suspects (`ArrayList`, `LinkedList`).
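
For readers following along, a rough sketch of the kind of change described above. This is an illustration under assumptions, not the patched code: `FetchStyleSketch` and both methods are hypothetical, tasks are plain strings, and a `java.util.stream` pipeline stands in for whatever functional style the original used.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the MemTaskStore implementation.
public class FetchStyleSketch {
  // Functional style collecting into a HashSet. HashSet is backed by a
  // HashMap, so each stored element costs an entry node on top of the table.
  public static Set<String> fetchFunctional(List<String> tasks, Predicate<String> query) {
    return tasks.stream()
        .filter(query)
        .collect(Collectors.toCollection(HashSet::new));
  }

  // Imperative style collecting into an ArrayDeque, which stores elements
  // in a single resizable array and allocates no per-element node.
  public static Collection<String> fetchImperative(List<String> tasks, Predicate<String> query) {
    Collection<String> result = new ArrayDeque<>();
    for (String task : tasks) {
      if (query.test(task)) {
        result.add(task);
      }
    }
    return result;
  }
}
```

Two behavioral differences are worth keeping in mind when swapping the collections: `ArrayDeque` does not deduplicate and rejects `null` elements, while `HashSet` deduplicates and does not preserve encounter order.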
> 
> This patch also enables stack and heap profilers in jmh (more details 
> [here](http://hg.openjdk.java.net/codetools/jmh/file/25d8b2695bac/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_35_Profilers.java)),
>  providing insight into the heap impact of changes.  I started this change 
> with a heap profiler as the primary motivation, and ended up using it to 
> guide this improvement.
> 
> 
> Diffs
> -----
> 
>   build.gradle 64af7ae 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java 
> b59999c 
> 
> 
> Diff: https://reviews.apache.org/r/65303/diff/1/
> 
> 
> Testing
> -------
> 
> Full benchmark summary for `TaskStoreBenchmarks.MemFetchTasksBenchmark` is at 
> the bottom, but here is an abridged version.  It shows that task fetch 
> throughput improves by at least 2x at every size (about 2.7x to 3.5x), and 
> normalized heap allocation drops by about 2x to 3.2x.  Total GC time 
> increases by roughly 12% to 29% as captured here, but the stddev was 
> anecdotally high across runs.  I chose to present this output as a caveat 
> and a discussion point.
> 
> If you scroll to the full output at the bottom, you will see some more 
> granular allocation data.  Please note that the `norm` stats are normalized 
> for the number of operations, which I find to be the most useful measure for 
> validating a change.  Quoting the jmh sample link above:
> ```quote
> It is often useful to look into non-normalized counters to see if the test is 
> allocation/GC-bound (figure the allocation pressure "ceiling" for your 
> configuration!), and normalized counters to see the more precise benchmark 
> behavior.
> ```
> 
> Prior to this patch:
> ```console
> Benchmark                 (numTasks)    Score         Error   Units
> 
>                           10000      1066.632 ±     266.924   ops/s
> ·gc.alloc.rate.norm       10000    289227.205 ±    8888.051    B/op
> ·gc.count                 10000        24.000                counts
> ·gc.time                  10000       103.000                    ms
> 
>                           50000        84.444 ±      32.620   ops/s
> ·gc.alloc.rate.norm       50000   3831210.967 ±  840844.713    B/op
> ·gc.count                 50000        21.000                counts
> ·gc.time                  50000      1407.000                    ms
> 
>                          100000        38.645 ±      20.557   ops/s
> ·gc.alloc.rate.norm      100000  13555430.931 ± 6787344.701    B/op
> ·gc.count                100000        52.000                counts
> ·gc.time                 100000      3304.000                    ms
> ```
> 
> With this patch:
> ```console
> Benchmark               (numTasks)   Score         Error   Units
> 
>                          10000    2851.288 ±     481.472   ops/s
> ·gc.alloc.rate.norm      10000  145281.908 ±    2223.621    B/op
> ·gc.count                10000      39.000                counts
> ·gc.time                 10000     130.000                    ms
> 
>                          50000     297.380 ±      35.681   ops/s
> ·gc.alloc.rate.norm      50000 1183791.866 ±   77487.278    B/op
> ·gc.count                50000      25.000                counts
> ·gc.time                 50000    1821.000                    ms
> 
>                         100000     122.211 ±      81.618   ops/s
> ·gc.alloc.rate.norm     100000 4364450.973 ± 2856586.882    B/op
> ·gc.count               100000      52.000                counts
> ·gc.time                100000    3698.000                    ms
> ```
> 
> 
> **Full benchmark output**
> 
> Prior to this patch:
> ```console
> Benchmark                                                                     
>    (numTasks)   Mode  Cnt         Score         Error   Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                
>         10000  thrpt    5      1066.632 ±     266.924   ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                 
>         10000  thrpt    5       286.647 ±      62.371  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm            
>         10000  thrpt    5    289227.205 ±    8888.051    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space        
>         10000  thrpt    5       291.263 ±     159.266  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm   
>         10000  thrpt    5    294277.617 ±  166069.041    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space    
>         10000  thrpt    5         1.218 ±       1.029  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm
>        10000  thrpt    5      1220.540 ±     708.455    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                      
>         10000  thrpt    5        24.000                counts
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                       
>         10000  thrpt    5       103.000                    ms
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                         
>         10000  thrpt                NaN                   ---
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                
>         50000  thrpt    5        84.444 ±      32.620   ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                 
>         50000  thrpt    5       267.018 ±      27.389  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm            
>         50000  thrpt    5   3831210.967 ±  840844.713    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space        
>         50000  thrpt    5       258.565 ±     149.845  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm   
>         50000  thrpt    5   3707563.530 ± 2262218.319    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen           
>         50000  thrpt    5         4.487 ±      18.053  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen.norm      
>         50000  thrpt    5     63848.757 ±  264487.651    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space    
>         50000  thrpt    5         6.034 ±       3.651  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm
>        50000  thrpt    5     87385.381 ±   75159.508    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                      
>         50000  thrpt    5        21.000                counts
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                       
>         50000  thrpt    5      1407.000                    ms
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                         
>         50000  thrpt                NaN                   ---
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                
>        100000  thrpt    5        38.645 ±      20.557   ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                 
>        100000  thrpt    5       381.453 ±      63.491  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm            
>        100000  thrpt    5  13555430.931 ± 6787344.701    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space        
>        100000  thrpt    5       389.816 ±     123.320  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm   
>        100000  thrpt    5  13823571.735 ± 6642604.600    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen           
>        100000  thrpt    5         1.947 ±      16.766  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen.norm      
>        100000  thrpt    5     92330.241 ±  794991.221    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space    
>        100000  thrpt    5        11.934 ±      18.565  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm
>       100000  thrpt    5    414896.926 ±  551658.959    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                      
>        100000  thrpt    5        52.000                counts
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                       
>        100000  thrpt    5      3304.000                    ms
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                         
>        100000  thrpt                NaN                   ---
> ```
> 
> With this patch:
> ```console
> Benchmark                                                                     
>    (numTasks)   Mode  Cnt        Score         Error   Units
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                
>         10000  thrpt    5     2851.288 ±     481.472   ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                 
>         10000  thrpt    5      384.383 ±      58.697  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm            
>         10000  thrpt    5   145281.908 ±    2223.621    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space        
>         10000  thrpt    5      388.851 ±     114.120  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm   
>         10000  thrpt    5   147171.915 ±   50430.527    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space    
>         10000  thrpt    5        1.264 ±       0.980  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm
>        10000  thrpt    5      479.848 ±     420.881    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                      
>         10000  thrpt    5       39.000                counts
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                       
>         10000  thrpt    5      130.000                    ms
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                         
>         10000  thrpt               NaN                   ---
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                
>         50000  thrpt    5      297.380 ±      35.681   ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                 
>         50000  thrpt    5      288.839 ±      19.035  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm            
>         50000  thrpt    5  1183791.866 ±   77487.278    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space        
>         50000  thrpt    5      296.587 ±     125.148  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm   
>         50000  thrpt    5  1214497.578 ±  457975.153    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen           
>         50000  thrpt    5        6.942 ±      23.492  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen.norm      
>         50000  thrpt    5    28880.733 ±   99593.659    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space    
>         50000  thrpt    5        6.440 ±       3.887  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm
>        50000  thrpt    5    26354.762 ±   14876.857    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                      
>         50000  thrpt    5       25.000                counts
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                       
>         50000  thrpt    5     1821.000                    ms
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                         
>         50000  thrpt               NaN                   ---
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run                                
>        100000  thrpt    5      122.211 ±      81.618   ops/s
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate                 
>        100000  thrpt    5      377.099 ±      77.146  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.alloc.rate.norm            
>        100000  thrpt    5  4364450.973 ± 2856586.882    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space        
>        100000  thrpt    5      381.570 ±     119.260  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Eden_Space.norm   
>        100000  thrpt    5  4415115.428 ± 3000198.792    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen           
>        100000  thrpt    5        1.914 ±      16.479  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Old_Gen.norm      
>        100000  thrpt    5    31833.830 ±  274098.881    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space    
>        100000  thrpt    5       12.117 ±      20.931  MB/sec
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.churn.PS_Survivor_Space.norm
>       100000  thrpt    5   136001.918 ±  196459.666    B/op
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.count                      
>        100000  thrpt    5       52.000                counts
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·gc.time                       
>        100000  thrpt    5     3698.000                    ms
> TaskStoreBenchmarks.MemFetchTasksBenchmark.run:·stack                         
>        100000  thrpt               NaN                   ---
> ```
> 
> 
> Thanks,
> 
> Bill Farner
> 
>
