[ https://issues.apache.org/jira/browse/CASSANDRA-17787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573871#comment-17573871 ]
Josh McKenzie commented on CASSANDRA-17787:
-------------------------------------------

bq. We should really not be doing it simultaneously for every table

There's an opportunity here to cap the command to N tables and error out if we find more than N. This seems like one of those "this is fine until it scales to a point where it's not", intersecting with "when you reach the point where things don't scale we should degrade gracefully and inform you".

> Full repair on a keyspace with a large amount of tables causes OOM
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-17787
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17787
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Brandon Williams
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x
>
>
> Running a nodetool repair -pr --full on a keyspace with a few hundred tables
> will cause a direct memory OOM, or lots of heap pressure with
> use_offheap_merkle_trees: false. Adjusting repair_session_space_in_mb does
> not seem to help. From an initial look at a heap dump, it appears the node is
> holding many _remote_ trees in memory.
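
The cap suggested in the comment (limit a repair command to N tables and error out beyond that) could look roughly like the sketch below. This is only an illustration of the idea, not Cassandra code: the class `RepairTableCap`, the method `checkTableCount`, and the limit `MAX_TABLES_PER_REPAIR` are all hypothetical names, and the value 100 is an arbitrary placeholder.

```java
import java.util.ArrayList;
import java.util.List;

public class RepairTableCap {
    // Hypothetical limit; not an actual Cassandra setting.
    static final int MAX_TABLES_PER_REPAIR = 100;

    // Returns the table list unchanged if it is within the cap,
    // otherwise fails fast with a message telling the operator what to do.
    static List<String> checkTableCount(String keyspace, List<String> tables, int maxTables) {
        if (tables.size() > maxTables) {
            throw new IllegalArgumentException(String.format(
                "Keyspace %s has %d tables, above the repair limit of %d; "
                    + "repair a subset of tables per command instead",
                keyspace, tables.size(), maxTables));
        }
        return tables;
    }

    public static void main(String[] args) {
        // Simulate a keyspace with a few hundred tables.
        List<String> tables = new ArrayList<>();
        for (int i = 0; i < 300; i++) {
            tables.add("t" + i);
        }
        try {
            checkTableCount("ks", tables, MAX_TABLES_PER_REPAIR);
        } catch (IllegalArgumentException e) {
            // Degrade gracefully: refuse the command and inform the operator.
            System.out.println("Refusing repair: " + e.getMessage());
        }
    }
}
```

Failing before any repair session starts keeps the node from building trees for every table at once, and the error message gives the operator a way forward.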