[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366078#comment-17366078 ]

Hudson commented on HBASE-25998:
--------------------------------

Results for branch branch-2.4 [build #145 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/145/]: (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/145/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/145/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]
(/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/145/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.4/145/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(/) {color:green}+1 client integration test{color}

> Revisit synchronization in SyncFuture
> -------------------------------------
>
>                 Key: HBASE-25998
>                 URL: https://issues.apache.org/jira/browse/HBASE-25998
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, regionserver, wal
>    Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0
>            Reporter: Bharath Vissapragada
>            Assignee: Bharath Vissapragada
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 2.4.5
>
>         Attachments: monitor-overhead-1.png, monitor-overhead-2.png
>
>
> While working on HBASE-25984, I noticed some weird frames in the flame graphs
> around monitor entry/exit consuming a lot of CPU cycles (see the attached
> images). The synchronization there is too coarse-grained and sometimes
> unnecessary. I tried a simple patch that switched to reentrant-lock-based
> synchronization with a condition variable rather than a busy wait, and it
> showed 70-80% higher throughput in WAL PE. Seems too good to be true...
> (more details in the comments).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
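The idea the description sketches, replacing a coarse monitor and busy wait with a {{ReentrantLock}} plus {{Condition}} so waiters sleep until the sync completes, can be illustrated roughly as below. This is a hypothetical minimal sketch; the class name, fields, and method signatures are illustrative and do not reproduce the actual HBase patch.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical sketch of a SyncFuture-like object: the lock is held only
 * briefly to check/update state, and waiters block on a condition variable
 * instead of repeatedly contending for a monitor.
 */
public class SyncFutureSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition completed = lock.newCondition();
  private long doneTxid = -1; // -1 means "not yet completed"

  /** Called once the WAL sync finishes: record the txid and wake all waiters. */
  public void done(long txid) {
    lock.lock();
    try {
      doneTxid = txid;
      completed.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /** Handler threads block here until signalled, rather than spinning. */
  public long get(long timeoutMs) throws InterruptedException, TimeoutException {
    lock.lock();
    try {
      long nanosLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
      while (doneTxid < 0) {
        if (nanosLeft <= 0) {
          throw new TimeoutException("sync did not complete in time");
        }
        // awaitNanos returns the remaining time, handling spurious wakeups.
        nanosLeft = completed.awaitNanos(nanosLeft);
      }
      return doneTxid;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) throws Exception {
    SyncFutureSketch future = new SyncFutureSketch();
    new Thread(() -> future.done(42L)).start(); // simulated sync completion
    System.out.println(future.get(5_000)); // prints 42
  }
}
```

Under heavy contention this avoids the monitor entry/exit churn visible in the attached flame graphs, since blocked waiters are parked by the scheduler instead of repeatedly acquiring the lock to poll.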
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366000#comment-17366000 ]

Hudson commented on HBASE-25998:
--------------------------------

Results for branch branch-2 [build #280 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/280/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/280/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/280/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]
(/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/280/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/280/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(x) {color:red}-1 client integration test{color} -- Something went wrong with this stage, [check relevant console output|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/280//console].
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365977#comment-17365977 ]

Hudson commented on HBASE-25998:
--------------------------------

Results for branch branch-2.3 [build #240 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/240/]: (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/240/General_20Nightly_20Build_20Report/]
(/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/240/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]
(/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/240/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/240/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365692#comment-17365692 ]

Hudson commented on HBASE-25998:
--------------------------------

Results for branch master [build #326 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/General_20Nightly_20Build_20Report/]
(x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(x) {color:red}-1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/326/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365057#comment-17365057 ]

Bharath Vissapragada commented on HBASE-25998:
----------------------------------------------

FSHLog doesn't show much improvement in WALPE with the patch, so I believe that is reflected in the YCSB runs too. Unfortunately I'm not able to deploy a branch-2 cluster right now (without much effort) to get the async WAL numbers. I will update here once I have a cluster and some data.
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365017#comment-17365017 ]

Andrew Kyle Purtell commented on HBASE-25998:
---------------------------------------------

Unlike with WALPE, there's a lot going on in a real cluster test. The WAL is critical to performance but is only one of many factors. We would expect an improvement in WAL latency to be reflected in improved per-mutation operational latency. Your YCSB results are in line with that, even if the effect is not as impressive as in a microbenchmark.
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364520#comment-17364520 ]

Bharath Vissapragada commented on HBASE-25998:
----------------------------------------------

Thanks [~apurtell] for trying out the patch (and for the review). One interesting behavior here is that the big throughput difference is only obvious for the async WAL implementation; it's not clear to me why, perhaps there is a lot more contention in that implementation for some reason. I repeated the same set of tests with the branch-1/master-based FSHLog and the patch only performs slightly better (a few single-digit percentage points). This behavior was also confirmed in the YCSB runs on branch-1 (on a 3-node containerized EC2 cluster).

Without patch: branch-1/FSHLog (10M ingest only)
{noformat}
[OVERALL], RunTime(ms), 199938
[OVERALL], Throughput(ops/sec), 50015.50480649001
[TOTAL_GCS_PS_Scavenge], Count, 293
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 1222
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.611189468735308
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 34
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.017005271634206603
[TOTAL_GCs], Count, 294
[TOTAL_GC_TIME], Time(ms), 1256
[TOTAL_GC_TIME_%], Time(%), 0.6281947403695145
[CLEANUP], Operations, 512
[CLEANUP], AverageLatency(us), 41.0234375
[CLEANUP], MinLatency(us), 0
[CLEANUP], MaxLatency(us), 18527
[CLEANUP], 95thPercentileLatency(us), 13
[CLEANUP], 99thPercentileLatency(us), 37
[INSERT], Operations, 1000
[INSERT], AverageLatency(us), 5085.9494093
[INSERT], MinLatency(us), 1499
[INSERT], MaxLatency(us), 220927
[INSERT], 95thPercentileLatency(us), 6511
[INSERT], 99thPercentileLatency(us), 16655
[INSERT], Return=OK, 1000
{noformat}

With patch: branch-1/FSHLog (10M ingest only)
{noformat}
[OVERALL], RunTime(ms), 195064
[OVERALL], Throughput(ops/sec), 51265.2257720543
[TOTAL_GCS_PS_Scavenge], Count, 284
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 1184
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.6069802731411229
[TOTAL_GCS_PS_MarkSweep], Count, 1
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 33
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.01691752450477792
[TOTAL_GCs], Count, 285
[TOTAL_GC_TIME], Time(ms), 1217
[TOTAL_GC_TIME_%], Time(%), 0.6238977976459008
[CLEANUP], Operations, 512
[CLEANUP], AverageLatency(us), 45.783203125
[CLEANUP], MinLatency(us), 1
[CLEANUP], MaxLatency(us), 20591
[CLEANUP], 95thPercentileLatency(us), 14
[CLEANUP], 99thPercentileLatency(us), 37
[INSERT], Operations, 1000
[INSERT], AverageLatency(us), 4958.6662675
[INSERT], MinLatency(us), 1380
[INSERT], MaxLatency(us), 295935
[INSERT], 95thPercentileLatency(us), 6335
[INSERT], 99thPercentileLatency(us), 19071
[INSERT], Return=OK, 1000
{noformat}

Unfortunately, the tooling I have does not support branch-2/master (yet), so I can't repeat this YCSB run for the async WAL implementation, but if the WALPE runs are any indication, we should see a good enough throughput improvement.
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363977#comment-17363977 ]

Andrew Kyle Purtell commented on HBASE-25998:
---------------------------------------------

My results, on a MacBook Pro 2019, 2.3 GHz 8-core Intel Core i9.

Java:
{noformat}
openjdk version "11.0.8" 2020-07-14 LTS
OpenJDK Runtime Environment Zulu11.41+23-CA (build 11.0.8+10-LTS)
OpenJDK 64-Bit Server VM Zulu11.41+23-CA (build 11.0.8+10-LTS, mixed mode)
{noformat}

Current master at 555f8b46 ({{./bin/hbase org.apache.hadoop.hbase.wal.WALPerformanceEvaluation -threads 256 -iterations 10}}, first stats dump):
{noformat}
-- Histograms --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.latencyHistogram.nanos
  count = 2879583, min = 1875557, max = 246347480, mean = 2938092.83, stddev = 10908886.22
  median = 2190795.00, 75% <= 2373648.00, 95% <= 2833351.00, 98% <= 4978663.00, 99% <= 6457163.00, 99.9% <= 213634065.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncCountHistogram.countPerSync
  count = 28275, min = 52, max = 103, mean = 101.79, stddev = 3.18
  median = 102.00, 75% <= 102.00, 95% <= 102.00, 98% <= 102.00, 99% <= 103.00, 99.9% <= 103.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncHistogram.nanos-between-syncs
  count = 28276, min = 118014, max = 242926458, mean = 1179929.00, stddev = 7471679.91
  median = 867201.00, 75% <= 934459.00, 95% <= 1181470.00, 98% <= 1909398.00, 99% <= 3711500.00, 99.9% <= 6662930.00
-- Meters --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.appendMeter.bytes
  count = 1604304263, mean rate = 51688418.00 events/second
  1-minute rate = 43829916.60, 5-minute rate = 39579618.94, 15-minute rate = 38725509.54 events/second
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncMeter.syncs
  count = 28278, mean rate = 911.04 events/second
  1-minute rate = 772.23, 5-minute rate = 697.15, 15-minute rate = 682.06 events/second
{noformat}

With patch ({{./bin/hbase org.apache.hadoop.hbase.wal.WALPerformanceEvaluation -threads 256 -iterations 10}}, first stats dump):
{noformat}
-- Histograms --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.latencyHistogram.nanos
  count = 5113265, min = 879033, max = 202881049, mean = 1421741.40, stddev = 6905506.90
  median = 1063825.00, 75% <= 1215826.00, 95% <= 1843140.00, 98% <= 3479868.00, 99% <= 4076417.00, 99.9% <= 202881049.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncCountHistogram.countPerSync
  count = 50232, min = 52, max = 106, mean = 101.84, stddev = 2.92
  median = 102.00, 75% <= 102.00, 95% <= 102.00, 98% <= 103.00, 99% <= 103.00, 99.9% <= 103.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncHistogram.nanos-between-syncs
  count = 50233, min = 98682, max = 73959735, mean = 542249.37, stddev = 2083003.22
  median = 418651.00, 75% <= 487203.00, 95% <= 742249.00, 98% <= 1040476.00, 99% <= 1693894.00, 99.9% <= 3739216.00
-- Meters --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.appendMeter.bytes
  count = 2848677354, mean rate = 91435148.51 events/second
  1-minute rate = 79981952.53, 5-minute rate = 74153640.38, 15-minute rate = 72986075.52 events/second
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncMeter.syncs
  count = 50237, mean rate = 1612.44 events/second
  1-minute rate = 1410.28, 5-minute rate = 1307.50, 15-minute rate = 1286.92 events/second
{noformat}
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362670#comment-17362670 ]

Bharath Vissapragada commented on HBASE-25998:
----------------------------------------------

[~stack] Thanks for taking a look. Test runs seem fine; trying to do an end-to-end throughput test on a cluster, will report the results here.
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362233#comment-17362233 ]

Michael Stack commented on HBASE-25998:
---------------------------------------

The numbers look nice [~bharathv] (not in a place to try locally, OOO). Patch looks good. Your safety check passes?
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17361973#comment-17361973 ]

Bharath Vissapragada commented on HBASE-25998:
----------------------------------------------

Redid the experiments with JDK 11 (to account for any recent monitor performance enhancements) and I see similar numbers. Also, the numbers above are for {{-t 256}}, which implies heavy contention. It seems the patch performs well under heavy load and the gap narrows with fewer threads (which I guess is expected), but even with very low concurrency the patch seems to outperform the current state.
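For contrast, the busy-wait style the issue description mentions replacing looks roughly like the sketch below. This is a hypothetical illustration, not the actual pre-patch HBase code: each waiter repeatedly takes the monitor just to re-check a flag, which is why monitor entry/exit dominates under {{-t 256}} while costing little at low concurrency.

```java
/**
 * Hypothetical busy-wait completion check: every poll acquires the monitor,
 * so with many handler threads the lock is contended constantly even though
 * no useful work happens between polls.
 */
public class BusyWaitFuture {
  private long doneTxid = -1; // -1 means "not yet completed"

  public synchronized void done(long txid) {
    doneTxid = txid;
  }

  // Spins, re-acquiring the monitor on every iteration, instead of parking
  // on a condition variable until signalled.
  public long get() throws InterruptedException {
    while (true) {
      synchronized (this) {
        if (doneTxid >= 0) {
          return doneTxid;
        }
      }
      Thread.yield(); // give up the CPU briefly, then poll again
    }
  }

  public static void main(String[] args) throws Exception {
    BusyWaitFuture future = new BusyWaitFuture();
    new Thread(() -> future.done(1L)).start();
    System.out.println(future.get()); // prints 1
  }
}
```

This would explain the observed scaling: the condition-variable version parks waiters so the gap grows with thread count, and narrows when few threads contend.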
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17361867#comment-17361867 ]

Bharath Vissapragada commented on HBASE-25998:
----------------------------------------------

[~zhangduo] [~apurtell] [~stack] This might be of interest to you (draft patch up for review); the results seem too good to be true. If you don't mind, please try the patch locally in your environment (I just want to eliminate any noise from my end). PTAL.
[jira] [Commented] (HBASE-25998) Revisit synchronization in SyncFuture
[ https://issues.apache.org/jira/browse/HBASE-25998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17361864#comment-17361864 ]

Bharath Vissapragada commented on HBASE-25998:
----------------------------------------------

{noformat}
java -version
java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)
{noformat}

For the default WAL provider (async WAL):

Without patch:
{noformat}
-- Histograms --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.latencyHistogram.nanos
  count = 10271257, min = 2672827, max = 67700701, mean = 4084532.41, stddev = 6244597.80
  median = 3403047.00, 75% <= 3525394.00, 95% <= 3849268.00, 98% <= 4319378.00, 99% <= 61134500.00, 99.9% <= 67195663.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncCountHistogram.countPerSync
  count = 100888, min = 52, max = 103, mean = 101.91, stddev = 2.09
  median = 102.00, 75% <= 102.00, 95% <= 102.00, 98% <= 102.00, 99% <= 103.00, 99.9% <= 103.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncHistogram.nanos-between-syncs
  count = 100889, min = 119051, max = 62778058, mean = 1601305.10, stddev = 3626948.72
  median = 1361530.00, 75% <= 1407052.00, 95% <= 1523418.00, 98% <= 1765310.00, 99% <= 2839178.00, 99.9% <= 62778058.00
-- Meters --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.appendMeter.bytes
  count = 5721241096, mean rate = 37890589.06 events/second
  1-minute rate = 36390169.75, 5-minute rate = 33524039.88, 15-minute rate = 31915066.49 events/second
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncMeter.syncs
  count = 100889, mean rate = 668.16 events/second
  1-minute rate = 641.77, 5-minute rate = 590.37, 15-minute rate = 561.67 events/second
{noformat}

With patch:
{noformat}
-- Histograms --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.latencyHistogram.nanos
  count = 12927042, min = 943723, max = 60827209, mean = 1865217.32, stddev = 5384907.53
  median = 1323691.00, 75% <= 1443195.00, 95% <= 1765866.00, 98% <= 1921920.00, 99% <= 3144643.00, 99.9% <= 60827209.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncCountHistogram.countPerSync
  count = 126797, min = 52, max = 104, mean = 101.87, stddev = 2.54
  median = 102.00, 75% <= 102.00, 95% <= 102.00, 98% <= 103.00, 99% <= 103.00, 99.9% <= 103.00
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncHistogram.nanos-between-syncs
  count = 126798, min = 122666, max = 60703608, mean = 711847.31, stddev = 3174375.63
  median = 519092.00, 75% <= 570240.00, 95% <= 695175.00, 98% <= 754972.00, 99% <= 791139.00, 99.9% <= 59975393.00
-- Meters --
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.appendMeter.bytes
  count = 7200681555, mean rate = 79170095.16 events/second
  1-minute rate = 75109969.27, 5-minute rate = 66505621.40, 15-minute rate = 63719949.74 events/second
org.apache.hadoop.hbase.wal.WALPerformanceEvaluation.syncMeter.syncs
  count = 126800, mean rate = 1394.11 events/second
  1-minute rate = 1322.31, 5-minute rate = 1169.99, 15-minute rate = 1120.69 events/second
{noformat}