[ https://issues.apache.org/jira/browse/CASSANDRA-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131453#comment-16131453 ]
Jeff Jirsa edited comment on CASSANDRA-13756 at 8/29/17 5:43 AM:
-----------------------------------------------------------------

Shouldn't need a version for trunk, but [~jasobrown] if you can check me there to be sure that'd be nice (I think in the faster rewrite for trunk, we now [build|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/streamhist/StreamingTombstoneHistogramBuilder.java#L182-L186] a snapshot that is no longer modified on read).

|| branch || utest || dtest ||
| [3.0|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.0-13756] | [3.0 circle|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.0-13756] | [3.0 dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/221/] |
| [3.11|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.11-13756] | [3.11 circle|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.11-13756] | [3.11 dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/222/] |

was (Author: jjirsa):
Shouldn't need a version for trunk, but [~jasobrown] if you can check me there to be sure that'd be nice (I think in the faster rewrite for trunk, we now [build|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/streamhist/StreamingTombstoneHistogramBuilder.java#L182-L186] a snapshot that is no longer modified on read).

|| branch || utest || dtest ||
| [3.0|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.0-13756] | [3.0 circle|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.0-13756] | [3.0 dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/189/] |
| [3.11|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.11-13756] | [3.11 circle|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.11-13756] | [3.11 dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/190/] |

> StreamingHistogram is not thread safe
> -------------------------------------
>
>                 Key: CASSANDRA-13756
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13756
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: xiangzhou xia
>            Assignee: Jeff Jirsa
>             Fix For: 3.0.x, 3.11.x
>
>
> When we tested C*3 in a shadow cluster, we noticed that after a period of
> time several data nodes suddenly hit 100% CPU and stopped processing queries.
> After investigation, we found that threads were stuck in sum() in the
> StreamingHistogram class. These are JMX threads that expose the
> getTombStoneRatio metric (since JMX is kicked off every 3 seconds, there is
> a chance that multiple JMX threads access StreamingHistogram at the same
> time).
> After further investigation, we found that the optimization in
> CASSANDRA-13038 leads to a spool flush every time sum() is called. Since
> TreeMap is not thread safe, threads can get stuck when multiple threads call
> sum() at the same time.
> There are two approaches to solving this issue.
> The first is to add a lock to the flush in sum(), which introduces some
> extra overhead in StreamingHistogram.
> The second is to avoid having StreamingHistogram accessed by multiple
> threads. In our specific case, that means removing the metrics we added.
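
For illustration, the first approach described above (taking a lock around the flush in sum()) could look roughly like the sketch below. This is not the Cassandra StreamingHistogram code or the patch on the linked branches: the class name, fields, and the simplified integer sum() are hypothetical, and the real class also merges bins and interpolates. It only shows why the read path needs the same lock as the write path once a read mutates a TreeMap.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy histogram with a write spool that is flushed lazily on read.
 * Hypothetical names throughout; NOT Cassandra's StreamingHistogram.
 */
class LockedSpoolHistogram {
    // Pending updates that have not yet been merged into the bins.
    private final TreeMap<Long, Long> spool = new TreeMap<>();
    // Merged point -> count bins.
    private final TreeMap<Long, Long> bins = new TreeMap<>();

    /** Records count occurrences of point; guarded by the object monitor. */
    public synchronized void update(long point, long count) {
        spool.merge(point, count, Long::sum);
    }

    // Mutates both TreeMaps, so it must only run while the monitor is held.
    private void flush() {
        for (Map.Entry<Long, Long> e : spool.entrySet()) {
            bins.merge(e.getKey(), e.getValue(), Long::sum);
        }
        spool.clear();
    }

    /**
     * Returns the total count of points less than or equal to bound.
     * Because this "read" flushes the spool, it is synchronized as well:
     * two unsynchronized callers could otherwise restructure the TreeMaps
     * concurrently and corrupt them.
     */
    public synchronized long sum(long bound) {
        flush();
        long total = 0;
        for (long c : bins.headMap(bound, true).values()) {
            total += c;
        }
        return total;
    }
}
```

The cost is the one the reporter mentions: every sum() call now contends on the same monitor as update(). Building a snapshot that reads never modify, as the comment above describes for trunk, avoids that contention on the read path.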