[ https://issues.apache.org/jira/browse/CASSANDRA-19260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Petrov updated CASSANDRA-19260:
------------------------------------
    Resolution:     (was: Fixed)
        Status: Open  (was: Resolved)

> org.apache.cassandra.tcm.ClusterMetadataService#commit does not catch up when rejected
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19260
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19260
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Transactional Cluster Metadata
>            Reporter: David Capwell
>            Assignee: Alex Petrov
>            Priority: Normal
>             Fix For: 5.1
>
>         Attachments: ci_summary.html, ci_summary.json
>
>
> This was found in the cep-15-accord branch (CASSANDRA-18804). The test that
> found this was a simple benchmark test:
> 1) deploy a 6 node cluster
> 2) create a table
> 3) in parallel launch many Accord transactions
> When Accord gets a transaction, it needs to make sure the table is “managed”
> by Accord, which uses TCM for this bookkeeping; this is just a List<TableId>
> in ClusterMetadata. We found that we detect that the table isn’t managed, so
> we try to add it; we get a reject, and the TCM epoch has not moved forward!
> Debugging this, it looks like org.apache.cassandra.tcm.RemoteProcessor#commit
> is the root cause, as it only seems to try to catch up if there is a messaging
> error and not on a TCM rejection. Given that the caller to TCM is not able to
> find the epoch to “wait” on, I feel that this is a TCM issue: TCM normally
> tries to make sure successes and rejects are blocking, but in this one case it
> appears not to.
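
The behaviour described in the report can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions: `CommitCatchUpSketch`, `LogOwner`, `Node`, `Result`, and the `catchUpOnReject` flag are hypothetical names invented for illustration and are not Cassandra classes or the actual `RemoteProcessor#commit` code. It shows a node whose local epoch stays behind after a rejection unless the rejection path also catches up.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the behaviour described in the ticket.
// None of these types are Cassandra classes; names are for illustration only.
final class CommitCatchUpSketch {
    enum Kind { SUCCESS, REJECTED }

    /** Result of a commit attempt, carrying the epoch the log owner was at. */
    static final class Result {
        final Kind kind;
        final long epoch;
        Result(Kind kind, long epoch) { this.kind = kind; this.epoch = epoch; }
    }

    /** Stand-in for the node that owns the metadata log. */
    static final class LogOwner {
        private long epoch = 0;
        private final Set<String> managedTables = new HashSet<>();

        synchronized Result addManagedTable(String tableId) {
            if (!managedTables.add(tableId))
                return new Result(Kind.REJECTED, epoch); // already managed
            return new Result(Kind.SUCCESS, ++epoch);
        }
    }

    /** Stand-in for a node that commits through a remote log owner. */
    static final class Node {
        long localEpoch = 0;
        private final LogOwner owner;
        Node(LogOwner owner) { this.owner = owner; }

        Result commit(String tableId, boolean catchUpOnReject) {
            Result result = owner.addManagedTable(tableId);
            boolean shouldCatchUp = result.kind == Kind.SUCCESS
                                    || (result.kind == Kind.REJECTED && catchUpOnReject);
            // Without catch-up on rejection, the caller has no newer epoch to wait on.
            if (shouldCatchUp)
                localEpoch = Math.max(localEpoch, result.epoch);
            return result;
        }
    }

    public static void main(String[] args) {
        LogOwner owner = new LogOwner();
        Node a = new Node(owner);
        Node b = new Node(owner);
        a.commit("t1", true);   // first writer wins and advances the epoch
        b.commit("t1", false);  // rejected, no catch-up
        System.out.println("b epoch without catch-up: " + b.localEpoch);
        b.commit("t1", true);   // rejected, but catches up
        System.out.println("b epoch with catch-up: " + b.localEpoch);
    }
}
```

In this model the rejected node only learns that the table is already managed once its local epoch reaches the owner's epoch, which is the "epoch to wait on" the report says is missing.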