[ https://issues.apache.org/jira/browse/CASSANDRA-17787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812677#comment-17812677 ]
Andres de la Peña commented on CASSANDRA-17787:
-----------------------------------------------

CASSANDRA-19336 has found the same issue for repairs targeting more replicas than the replication factor. This can happen when the repair doesn't use {{--partitioner-range}} or when using virtual nodes. That ticket proposes a patch limiting the number of simultaneous repair jobs. Reducing the Merkle tree depth seems risky for the multi-node case reported here because of over-streaming. However, I think smaller trees might be usable for the vnodes case reported by CASSANDRA-19336, because many small ranges imply fewer rows per range.

> Full repair on a keyspace with a large amount of tables causes OOM
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-17787
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17787
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Brandon Williams
>            Priority: Normal
>              Labels: lhf
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> Running a nodetool repair -pr --full on a keyspace with a few hundred tables
> will cause a direct memory OOM, or lots of heap pressure with
> use_offheap_merkle_trees: false. Adjusting repair_session_space_in_mb does
> not seem to help. From an initial look at a heap dump, it appears the node is
> holding many _remote_ trees in memory.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
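The reproduction described in the quoted report can be sketched as the following command and configuration fragment (the keyspace name {{ks}} is a placeholder; both settings live in cassandra.yaml; per the report, neither setting avoids the problem):

```shell
# Full primary-range repair of every table in the keyspace "ks"
# (placeholder name); with a few hundred tables this is reported
# to cause a direct-memory OOM.
nodetool repair -pr --full ks

# cassandra.yaml knobs mentioned in the report:
#
#   use_offheap_merkle_trees: false   # moves the pressure from direct
#                                     # memory onto the heap instead
#   repair_session_space_in_mb: 128   # reported as not helping; the node
#                                     # still holds many remote Merkle
#                                     # trees in memory at once
```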