[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451594#comment-16451594 ] Chris Lohfink commented on CASSANDRA-14281: --- yeah i +1d the code on 18th, the benchmark was just followup me trying it out to see same improvement he did, which I didn't see on my laptop. Main feedback (was in IRC so not in comments, adding it now for posterity) that he fixed was parent metrics tracking children to prevent negative values after table drops > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Attachments: bench.png, bench2.png, benchmark.html, benchmark2.png > > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451572#comment-16451572 ] Jeff Jirsa commented on CASSANDRA-14281: Is https://github.com/apache/cassandra/pull/217 the final patch? Chris you're +1 (fully review, not just benchmark)? > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Attachments: bench.png, bench2.png, benchmark.html, benchmark2.png > > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441870#comment-16441870 ] Chris Lohfink commented on CASSANDRA-14281: --- fwiw tried this again on a beefier linux box instead of my laptop and saw the improvement: !benchmark2.png|width=799,height=417! > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Attachments: bench.png, bench2.png, benchmark2.png > > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16427938#comment-16427938 ] Michael Shuler commented on CASSANDRA-14281: !bench2.png! Converted the TIFF bench.png to PNG > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Attachments: bench.png, bench2.png > > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16427876#comment-16427876 ] Chris Lohfink commented on CASSANDRA-14281: --- I am +1 for code if that was a vague comment > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Attachments: image-2018-04-05-22-06-05-947.png > > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16427874#comment-16427874 ] Chris Lohfink commented on CASSANDRA-14281: --- I dont see any issues with code and thats an improvement certainly. I tried cassandra-stress locally and !image-2018-04-05-22-06-05-947.png|width=799,height=417! didnt get an improvement but it maybe different with a faster host (tried 8-128 threads in powers of 2) > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > Attachments: image-2018-04-05-22-06-05-947.png > > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426980#comment-16426980 ] ASF GitHub Bot commented on CASSANDRA-14281: GitHub user burmanm opened a pull request: https://github.com/apache/cassandra/pull/217 CASSANDRA-14281 Remove locking from DEHR and use CAS update operation in rescaling to improve performance Add method to combine two Snapshots Use LongAdder[] instead of AtomicLongArray to reduce contention Read metrics from children LatencyMetrics instances when needed, instead of writing to parent instances each time. Add benchmark for LatencyMetrics and Reservoir When child is released, it replicates previous status to parents You can merge this pull request into a Git repository by running: $ git pull https://github.com/burmanm/cassandra latency_metrics Alternatively you can review and apply these changes as the patch at: https://github.com/apache/cassandra/pull/217.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #217 commit 98ad2320e6c28bcfd7afc6a7f2bf25b640bc957a Author: Michael Burman Date: 2018-03-01T12:59:53Z Remove locking from DEHR and use CAS update operation in rescaling to improve performance Add method to combine two Snapshots Use LongAdder[] instead of AtomicLongArray to reduce contention Read metrics from children LatencyMetrics instances when needed, instead of writing to parent instances each time. Add benchmark for LatencyMetrics and Reservoir When child is released, it replicates previous status to parents > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426982#comment-16426982 ] Michael Burman commented on CASSANDRA-14281: https://github.com/apache/cassandra/pull/217 > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426970#comment-16426970 ] Michael Burman commented on CASSANDRA-14281: {code:java} trunk: op rate : 44,168 op/s [WRITE: 44,168 op/s] partition rate : 44,168 pk/s [WRITE: 44,168 pk/s] row rate : 44,168 row/s [WRITE: 44,168 row/s] latency mean : 2.9 ms [WRITE: 2.9 ms] latency median : 2.0 ms [WRITE: 2.0 ms] latency 95th percentile : 7.0 ms [WRITE: 7.0 ms] latency 99th percentile : 11.8 ms [WRITE: 11.8 ms] latency 99.9th percentile : 115.8 ms [WRITE: 115.8 ms] latency max : 191.9 ms [WRITE: 191.9 ms] latency_writes branch: op rate : 46,633 op/s [WRITE: 46,633 op/s] partition rate : 46,633 pk/s [WRITE: 46,633 pk/s] row rate : 46,633 row/s [WRITE: 46,633 row/s] latency mean : 2.7 ms [WRITE: 2.7 ms] latency median : 1.9 ms [WRITE: 1.9 ms] latency 95th percentile : 6.3 ms [WRITE: 6.3 ms] latency 99th percentile : 11.1 ms [WRITE: 11.1 ms] latency 99.9th percentile : 112.1 ms [WRITE: 112.1 ms] latency max : 212.5 ms [WRITE: 212.5 ms] {code} > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425728#comment-16425728 ] Chris Lohfink commented on CASSANDRA-14281: --- [~burmanm] can you make a attach a link to branch or make pr for the bot to do it (just to make it easier). also since this is a perf patch can you post the difference with cassandra-stress as full benchmark instead of just the jmh ones? > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16395183#comment-16395183 ] Joshua McKenzie commented on CASSANDRA-14281: - Bounced out of Ready to Commit but can't move back to PA. Just a heads up. > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392940#comment-16392940 ] Michael Burman commented on CASSANDRA-14281: No, I mean one update is copied to multiple histograms when on the hot path. Initialized in this method: [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/metrics/TableMetrics.java#L993] And applied for each update here: [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/metrics/TableMetrics.java#L1043] Instead of doing it only when metrics are requested (which is also more efficient process). > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392930#comment-16392930 ] Chris Lohfink commented on CASSANDRA-14281: --- what are the unnecessary multiple updates? You mean System.nanotime multiple times? > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14281) Improve LatencyMetrics performance by reducing write path processing
[ https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392655#comment-16392655 ] Michael Burman commented on CASSANDRA-14281: Additional note.. there's also same kind of unnecessary multiple updates in the read path. Including TableMetrics & TableHistogram's update. This patch obviously helps a little bit, since it reduces the histogram update time, but there's most likely stuff to reduce. Fixing those helps in the write path the: "metric.colUpdateTimeDeltaHistogram.update(Math.min(18165375903306L, timeDelta));" line, but these are very minor performance issues in the update path at the moment. Another ticket for read path? > Improve LatencyMetrics performance by reducing write path processing > > > Key: CASSANDRA-14281 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14281 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Burman >Assignee: Michael Burman >Priority: Major > > Currently for each write/read/rangequery/CAS touching the CFS we write a > latency metric which takes a lot of processing time (up to 66% of the total > processing time if the update was empty). > The way latencies are recorded is to use both a dropwizard "Timer" as well as > "Counter". Latter is used for totalLatency and the previous is decaying > metric for rates and certain percentile metrics. We then replicate all of > these CFS writes to the KeyspaceMetrics and globalWriteLatencies. > Instead of doing this on the write phase we should merge the metrics when > they're read. This is much less common occurrence and thus we save a lot of > CPU time in total. This also speeds up the write path. > Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each > update operation, which causes a contention if there are more than one thread > updating the histogram. This impacts scalability when using larger machines. > We should make it lock-free as much as possible and also avoid a single > CAS-update from blocking all the concurrent threads from making an update. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org