[ https://issues.apache.org/jira/browse/CASSANDRA-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801479#comment-15801479 ]
Maxime Fouilleul commented on CASSANDRA-13096:
----------------------------------------------

It is indeed better with the whitelist:
{code}
whitelistObjectNames: ["java.lang:*","org.apache.cassandra.metrics:*","org.apache.cassandra.net:type=FailureDetector"]
{code}
It halved the scrape duration on a node (from ~7 sec to ~3 sec).

However, it does not explain why scraping takes so long on certain nodes and drops to "instantaneous" once the snapshots are cleared.

Thanks for your help.

> Snapshots slow down jmx scrapping
> ---------------------------------
>
>                 Key: CASSANDRA-13096
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13096
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Maxime Fouilleul
>         Attachments: CPU Load.png, Clear Snapshots.png, JMX Scrape Duration.png
>
>
> Hello,
> We are scraping the JMX metrics through a Prometheus exporter and we noticed
> that some nodes became very slow to respond (more than 20 seconds). After
> some investigation we did not find any hardware problem or overload on
> these "slow" nodes. It happens on different clusters, some with only a few
> gigabytes of data, and it does not seem to be related to a specific version
> either, as it happens on 2.1, 2.2 and 3.0 nodes.
> After several unsuccessful attempts, one of our ideas was to clear the
> snapshots remaining on one problematic node:
> {code}
> nodetool clearsnapshot
> {code}
> And the magic happened: as you can see in the attached diagrams, the second
> we cleared the snapshots, the CPU activity dropped immediately and the
> time to scrape the JMX metrics went from 20+ seconds to instantaneous.
> Can you enlighten us on this issue? Once again, it appears on all three of
> our versions (2.1, 2.2 and 3.0) and with different data volumes, and it is
> not systematically linked to the snapshots, as we have some nodes with the
> same snapshot volume that behave normally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)