On Tue, 2018-11-27 at 16:52 -0800, Ryan Schmitt wrote:
> I suppose that's possible, but there are several other modes available
> that don't use sampling. If you're interested in throughput of heap
> allocation, you can use Mode.Throughput with the GC profiler (partial
> results shown):
>

Can we move on now?

---
Benchmark                                Mode  Cnt         Score  Error   Units
Testing6.index                          thrpt       97019963.567          ops/s
Testing6.index:·gc.alloc.rate           thrpt             ≈ 10⁻⁵         MB/sec
Testing6.index:·gc.alloc.rate.norm      thrpt             ≈ 10⁻⁷           B/op
Testing6.index:·gc.count                thrpt                ≈ 0         counts
Testing6.iterator                       thrpt       80554752.864          ops/s
Testing6.iterator:·gc.alloc.rate        thrpt             ≈ 10⁻⁵         MB/sec
Testing6.iterator:·gc.alloc.rate.norm   thrpt             ≈ 10⁻⁷           B/op
Testing6.iterator:·gc.count             thrpt                ≈ 0         counts
---

Oleg

> Benchmark                                 Mode  Cnt    Score   Error  Units
> MyBenchmark.index:·gc.alloc.rate.norm    thrpt    3  176.000 ± 0.001   B/op
> MyBenchmark.iterator:·gc.alloc.rate.norm thrpt    3  176.000 ± 0.001   B/op
>
> The numbers for Mode.AverageTime are pretty much the same; I don't
> think the two are different when using GCProfiler.
>
> I also wrote a version of the benchmark that allows the Iterator
> object to escape, thus defeating scalar replacement. This version does
> perform more allocation, as expected:
>
> Benchmark                                                         Mode  Cnt    Score   Error  Units
> MyBenchmark.index:·gc.alloc.rate.norm                            thrpt    3  176.000 ± 0.001   B/op
> MyBenchmark.iterator:·gc.alloc.rate.norm                         thrpt    3  176.000 ± 0.001   B/op
> MyBenchmark.iteratorWithoutScalarReplacement:·gc.alloc.rate.norm thrpt    3  208.000 ± 0.001   B/op
>
> I can also confirm that indexing performs worse than iterating when
> using a HeaderGroup backed by a LinkedList (containing nine elements):
>
> Benchmark                                     Mode  Cnt    Score   Error  Units
> MyBenchmark.index                             avgt    3  106.215 ± 1.261  ns/op
> MyBenchmark.iterator                          avgt    3   78.689 ± 8.851  ns/op
> MyBenchmark.iteratorWithoutScalarReplacement  avgt    3   80.600 ± 5.045  ns/op
>
> In other words, using an iterator (such as by way of a for-each loop)
> can improve time efficiency (due to algorithmic efficiency), with no
> effect on GC pressure (thanks to JIT optimizations). Just for grins, I
> re-ran the LinkedList benchmark using -Xint (which forces the entire
> JVM to run in interpreted mode) and observed a nearly two
> order-of-magnitude performance degradation:
>
> Benchmark                                     Mode  Cnt     Score     Error  Units
> MyBenchmark.index                             avgt    3  5088.854 ± 844.754  ns/op
> MyBenchmark.iterator                          avgt    3  4997.693 ± 583.512  ns/op
> MyBenchmark.iteratorWithoutScalarReplacement  avgt    3  4954.591 ± 361.937  ns/op
>
>
> On Tue, Nov 27, 2018 at 4:00 PM Gary Gregory <[email protected]>
> wrote:
> >
> > Can the variations be explained by GC pauses and other processes
> > using the CPU?
> >
> > Gary
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
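
The "algorithmic efficiency" point in the quoted message (an indexed loop over a linked list re-walks the list on every get, while an iterator visits each node once) can be illustrated without JMH. The following is only a rough sketch under stated assumptions: HopCount, CountingList and the hops counter are hypothetical names made up for this illustration, not code from the benchmark in this thread, and the list is singly linked for simplicity. java.util.LinkedList is doubly linked and starts from the nearer end, so its constant factor is smaller, but the shape of the cost is the same.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration only: counts how many `next` pointers are
// followed when summing a nine-element linked list by index vs. by iterator.
class HopCount {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    static final class CountingList implements Iterable<Integer> {
        Node head, tail;
        int size;
        long hops; // incremented every time a traversal follows a `next` pointer

        void add(int v) {
            Node n = new Node(v);
            if (head == null) { head = n; } else { tail.next = n; }
            tail = n;
            size++;
        }

        // Indexed access walks from the head each time: O(index) per call.
        int get(int index) {
            if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
            Node cur = head;
            for (int i = 0; i < index; i++) { cur = cur.next; hops++; }
            return cur.value;
        }

        // The iterator remembers its position, so each element costs one hop.
        @Override
        public Iterator<Integer> iterator() {
            return new Iterator<Integer>() {
                private Node cur = head;
                @Override public boolean hasNext() { return cur != null; }
                @Override public Integer next() {
                    if (cur == null) throw new NoSuchElementException();
                    int v = cur.value;
                    cur = cur.next;
                    hops++;
                    return v;
                }
            };
        }
    }

    // Nine elements, matching the list size mentioned in the quoted message.
    static CountingList nineElements() {
        CountingList list = new CountingList();
        for (int i = 1; i <= 9; i++) list.add(i);
        return list;
    }

    static long sumByIndex(CountingList list) {
        long sum = 0;
        for (int i = 0; i < list.size; i++) sum += list.get(i);
        return sum;
    }

    static long sumByIterator(CountingList list) {
        long sum = 0;
        for (int v : list) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        CountingList a = nineElements();
        sumByIndex(a);
        CountingList b = nineElements();
        sumByIterator(b);
        System.out.println("indexed hops:  " + a.hops); // 0+1+...+8 = 36
        System.out.println("iterator hops: " + b.hops); // 9
    }
}
```

Hop counts say nothing about allocation or JIT behaviour; for those, the GC profiler numbers quoted above are the relevant evidence.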
