[ https://issues.apache.org/jira/browse/CASSANDRA-19336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andres de la Peña updated CASSANDRA-19336:
------------------------------------------
    Test and Documentation Plan:
||Patch||CI||
|[4.0|https://github.com/apache/cassandra/compare/cassandra-4.0...adelapena:19336-4.0]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/3423/workflows/6dd2bc40-d663-4c38-96d2-1a9d98b531da] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3423/workflows/d6255d7f-a238-4eb6-93f1-fe373ad567c5]|
|[4.1|https://github.com/apache/cassandra/compare/cassandra-4.1...adelapena:19336-4.1]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/3424/workflows/7e153df1-c7c3-453d-9003-e1cacaf0d9fb] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3424/workflows/68503730-3744-4d65-8484-658711d01bf6]|
|[5.0|https://github.com/apache/cassandra/pull/3073]|[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3417/workflows/9a3ef50e-1616-4bca-b9f9-275eb1ddf5fa] [j17|https://app.circleci.com/pipelines/github/adelapena/cassandra/3417/workflows/43c45c1b-7fa8-48e6-9137-1ed52594b03d]|
|[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:19336-trunk]|[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3425/workflows/0a657cb6-f749-4caa-97cd-ed6660736313] [j17|https://app.circleci.com/pipelines/github/adelapena/cassandra/3425/workflows/3c827700-b4a9-4860-94a9-a641dd06dfe1]|

  was:
||Patch||CI||
|[4.0|https://github.com/apache/cassandra/compare/trunk...adelapena:19336-4.0]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/3423/workflows/6dd2bc40-d663-4c38-96d2-1a9d98b531da] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3423/workflows/d6255d7f-a238-4eb6-93f1-fe373ad567c5]|
|[4.1|https://github.com/apache/cassandra/compare/trunk...adelapena:19336-4.1]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/3424/workflows/7e153df1-c7c3-453d-9003-e1cacaf0d9fb] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3424/workflows/68503730-3744-4d65-8484-658711d01bf6]|
|[5.0|https://github.com/apache/cassandra/pull/3073]|[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3417/workflows/9a3ef50e-1616-4bca-b9f9-275eb1ddf5fa] [j17|https://app.circleci.com/pipelines/github/adelapena/cassandra/3417/workflows/43c45c1b-7fa8-48e6-9137-1ed52594b03d]|
|[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:19336-trunk]|[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3425/workflows/0a657cb6-f749-4caa-97cd-ed6660736313] [j17|https://app.circleci.com/pipelines/github/adelapena/cassandra/3425/workflows/3c827700-b4a9-4860-94a9-a641dd06dfe1]|

> Repair causes out of memory
> ---------------------------
>
>                 Key: CASSANDRA-19336
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19336
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Andres de la Peña
>            Assignee: Andres de la Peña
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> CASSANDRA-14096 introduced {{repair_session_space}} as a limit on the memory
> used for Merkle tree calculations during repairs. This limit is applied to
> the set of Merkle trees built for a received validation request
> ({{VALIDATION_REQ}}), divided by the replication factor so as not to
> overwhelm the repair coordinator, which will have requested RF sets of Merkle
> trees. That way the repair coordinator should only use
> {{repair_session_space}} for the RF Merkle trees.
> However, a repair session without {{-pr}}/{{--partitioner-range}}
> will send RF*RF validation requests, because the repair coordinator node has
> RF-1 replicas and is also the replica of RF-1 nodes.
> Since all the requests are sent at the same time, at some point the repair
> coordinator can have up to RF*{{repair_session_space}} worth of Merkle trees
> if none of the validation responses is fully processed before the last
> response arrives.
> Even worse, if the cluster uses virtual nodes, many nodes can be replicas of
> the repair coordinator, and some nodes can be replicas of multiple token
> ranges. This means that the repair coordinator can send more than RF or
> RF*RF simultaneous validation requests.
> For example, in an 11-node cluster with RF=3 and 256 tokens, we have seen a
> repair session involving 44 groups of ranges to be repaired. This produces
> 44*3=132 validation requests contacting all the nodes in the cluster. When
> the responses for all these requests start to arrive at the coordinator, each
> containing up to {{repair_session_space}}/3 of Merkle trees, they accumulate
> faster than they are consumed, greatly exceeding {{repair_session_space}}
> and OOMing the node.
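
The arithmetic in the issue description can be checked with a short sketch. This is not code from Cassandra or from the patch: the function names and the 300 MiB value for {{repair_session_space}} are hypothetical, chosen only to make the numbers concrete. It assumes the worst case the description mentions, where every response carries the full {{repair_session_space}}/RF of Merkle trees and none is released before the last one arrives.

```python
def validation_requests(range_groups: int, replication_factor: int) -> int:
    """Requests sent when each group of ranges is validated on RF replicas."""
    return range_groups * replication_factor


def worst_case_tree_bytes(requests: int, replication_factor: int,
                          repair_session_space: int) -> int:
    """Upper bound on Merkle tree bytes held at once by the coordinator.

    Assumes every response carries repair_session_space / RF bytes of trees
    and that no response is consumed before the last one arrives.
    """
    per_response = repair_session_space // replication_factor
    return requests * per_response


# Hypothetical setting for illustration; the issue does not state the value.
REPAIR_SESSION_SPACE = 300 * 1024 * 1024  # bytes
RF = 3

# Case from the description: 44 groups of ranges with RF=3.
requests = validation_requests(range_groups=44, replication_factor=RF)
worst_case = worst_case_tree_bytes(requests, RF, REPAIR_SESSION_SPACE)
overshoot = worst_case / REPAIR_SESSION_SPACE

# Simpler case without vnodes: RF*RF requests, read here as RF ranges
# each validated on RF replicas.
basic_requests = validation_requests(range_groups=RF, replication_factor=RF)
basic_overshoot = (worst_case_tree_bytes(basic_requests, RF, REPAIR_SESSION_SPACE)
                   / REPAIR_SESSION_SPACE)

print(requests, overshoot)              # 132 44.0
print(basic_requests, basic_overshoot)  # 9 3.0
```

Under these assumptions the coordinator could hold up to 44 times {{repair_session_space}} in the reported case, against RF times (here 3 times) in the plain RF*RF case. The real peak depends on how quickly responses are consumed.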