-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52669/
-----------------------------------------------------------
Review request for Aurora, David McLaughlin, John Sirois, and Zameer Manji.


Repository: aurora


Description
-------

This experiment is inspired by David's comment: "I don’t think the storage engine matters. We just need to be able to offload it from the Scheduler JVM. The problem with H2 isn’t SQL or anything else, it’s the GC pressure."

Basic idea is to switch to another storage backend: "nioMemFS stores data outside of the VM's heap - useful for large memory DBs without incurring GC costs" (http://www.h2database.com/html/advanced.html)

Our micro-benchmarks look promising:

Current Master (on-heap db):

Benchmark                                             (instanceOverrides)  (instances)  (metadata)  (numTasks)   Mode  Cnt      Score       Error  Units
TaskStoreBenchmarks.DBFetchTasksBenchmark.run                         N/A          N/A         N/A       10000  thrpt    5  83399.727 ± 13513.406  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run                         N/A          N/A         N/A       50000  thrpt    5  38674.517 ± 26893.133  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run                         N/A          N/A         N/A      100000  thrpt    5      0.080 ±     0.037  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                        N/A          N/A         N/A       10000  thrpt    5    251.447 ±   234.791  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                        N/A          N/A         N/A       50000  thrpt    5     49.090 ±    43.262  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                        N/A          N/A         N/A      100000  thrpt    5     25.915 ±    11.143  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         1000         N/A         N/A  thrpt    5    106.155 ±    62.439  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         5000         N/A         N/A  thrpt    5     29.003 ±    24.196  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A        10000         N/A         N/A  thrpt    5     15.572 ±     8.836  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                      1          N/A         N/A         N/A  thrpt    5     26.939 ±    25.415  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                     10          N/A         N/A         N/A  thrpt    5     28.599 ±    26.182  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                    100          N/A         N/A         N/A  thrpt    5     22.560 ±     9.864  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                   1000          N/A         N/A         N/A  thrpt    5     17.537 ±     6.562  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A          10         N/A  thrpt    5     28.917 ±    12.473  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A         100         N/A  thrpt    5     23.867 ±     8.910  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A        1000         N/A  thrpt    5     25.711 ±    22.488  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A       10000         N/A  thrpt    5     11.045 ±     6.477  ops/s

This patch (off-heap db):

Benchmark                                             (instanceOverrides)  (instances)  (metadata)  (numTasks)   Mode  Cnt      Score       Error  Units
TaskStoreBenchmarks.DBFetchTasksBenchmark.run                         N/A          N/A         N/A       10000  thrpt    5  81368.994 ± 32366.724  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run                         N/A          N/A         N/A       50000  thrpt    5  68668.801 ± 10404.233  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run                         N/A          N/A         N/A      100000  thrpt    5  24713.776 ± 43833.452  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                        N/A          N/A         N/A       10000  thrpt    5    268.549 ±   301.280  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                        N/A          N/A         N/A       50000  thrpt    5     50.502 ±    21.999  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run                        N/A          N/A         N/A      100000  thrpt    5     26.802 ±    19.717  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         1000         N/A         N/A  thrpt    5    300.698 ±   420.377  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A         5000         N/A         N/A  thrpt    5     69.884 ±    75.162  ops/s
UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A        10000         N/A         N/A  thrpt    5     39.471 ±    27.366  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                      1          N/A         N/A         N/A  thrpt    5     73.923 ±   119.519  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                     10          N/A         N/A         N/A  thrpt    5     92.899 ±    70.455  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                    100          N/A         N/A         N/A  thrpt    5     63.936 ±    46.243  ops/s
UpdateStoreBenchmarks.JobInstructionsBenchmark.run                   1000          N/A         N/A         N/A  thrpt    5     38.806 ±    25.216  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A          10         N/A  thrpt    5     83.017 ±    72.351  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A         100         N/A  thrpt    5     72.640 ±    74.998  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A        1000         N/A  thrpt    5     72.037 ±    56.763  ops/s
UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A          N/A       10000         N/A  thrpt    5     31.657 ±     8.772  ops/s

Please be aware: The values seen here will not really carry over to real world usage. It would therefore be awesome if one of you could test this on a larger cluster!


Diffs
-----

  build.gradle 3cd083ccc72ed9dfb4adc491e9ab5a40d7762879
  src/main/java/org/apache/aurora/scheduler/storage/db/DbModule.java e7287cec28e7b8ca978c506bfe821f261bc0ac26

Diff: https://reviews.apache.org/r/52669/diff/


Testing
-------

./gradlew build
./gradlew jmh -Pbenchmarks='UpdateStoreBenchmarks.*|TaskStoreBenchmarks.*'


Thanks,

Stephan Erb
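

Illustration (not part of the diff)
-------

For readers unfamiliar with H2's in-memory file systems, the backend switch described above can be sketched at the JDBC URL level. The "mem" and "nioMemFS" prefixes are taken from the H2 documentation linked in the description. The class name, the enum, and the database name "aurora" are hypothetical and chosen for illustration only. The actual change in DbModule.java may build its URL differently.

```java
/**
 * Minimal sketch: on-heap vs. off-heap H2 in-memory JDBC URLs.
 * All names here are illustrative assumptions, not Aurora code.
 */
public final class H2UrlSketch {

  /** H2 URL prefixes for in-memory storage. */
  enum Backend {
    // Classic in-memory database; data lives on the JVM heap.
    ON_HEAP("mem"),
    // In-memory file system that stores data outside of the VM's heap.
    OFF_HEAP("nioMemFS");

    private final String prefix;

    Backend(String prefix) {
      this.prefix = prefix;
    }
  }

  private H2UrlSketch() {
    // Utility class, no instances.
  }

  /** Builds a JDBC URL of the form jdbc:h2:<prefix>:<dbName>. */
  static String jdbcUrl(Backend backend, String dbName) {
    return "jdbc:h2:" + backend.prefix + ":" + dbName;
  }

  public static void main(String[] args) {
    // "aurora" is a placeholder database name.
    System.out.println(jdbcUrl(Backend.ON_HEAP, "aurora"));
    System.out.println(jdbcUrl(Backend.OFF_HEAP, "aurora"));
  }
}
```

Only the URL prefix differs between the two variants, which is why a storage change of this kind can be small in terms of code while still moving the database contents out of reach of the garbage collector.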