[ https://issues.apache.org/jira/browse/CASSANDRA-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcus Eriksson updated CASSANDRA-18791:
----------------------------------------
    Change Category: Operability
         Complexity: Normal
          Reviewers: Alex Petrov, Sam Tunnicliffe
             Status: Open  (was: Triage Needed)

> CEP-21 - Multiple TCM fixes for issues discovered by unit, integration and simulation testing
> ---------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18791
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18791
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Membership
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>
> Full branch: [https://github.com/krummas/cassandra/commits/marcuse/cep-21-tcm]
> Tests: [cci|https://app.circleci.com/pipelines/github/krummas/cassandra/885/workflows/7cc3e1a1-45b9-4069-bb02-2a46855b4dbe]
> - current status is:
>   unit tests: 35/12052 failures
>   jvm dtests: 23/1459 failures
>   python dtests: 110/1018 failures
> We will spend the next few weeks getting all the test targets down to 0.
> Summary of changes:
> [CEP-21] Python dtest fixes * maybe fix hintedhandoff test
> - [https://github.com/krummas/cassandra/commit/3da91c26fb]
> - [https://github.com/krummas/cassandra/commit/98e444adbb]
> [CEP-21] In-JVM DTest fixes
> - [https://github.com/krummas/cassandra/compare/6491a70041...c9126a8024]
> [CEP-21] Unit test fixes
> - [https://github.com/krummas/cassandra/compare/02be7aa71c...6491a70041]
> [CEP-21] Escape infinite local log loop on replica mis-configuration
> - [https://github.com/krummas/cassandra/commit/02be7aa71c]
>   Currently different replicas can have different configurations
>   (guardrails, for example). If a transformation is not applied on a
>   replica, that node gets stuck in an infinite loop. For now, escape that
>   loop until we have a better solution.
> [CEP-21] Fix batchlog consistency errors during epoch bumps
> - [https://github.com/krummas/cassandra/commit/c853bf864e]
> [CEP-21] Avoid using batches in distributed metadata log keyspace
> - [https://github.com/krummas/cassandra/commit/d4b4766e0b]
> [CEP-21] Fix table metadata serialization
> - [https://github.com/krummas/cassandra/commit/4056bab669]
> [CEP-21] add more metrics
> - [https://github.com/krummas/cassandra/commit/b6ccb559f5]
> [CEP-21] getHostIdForEndpoint return null if unknown endpoint
> - [https://github.com/krummas/cassandra/commit/b9243df05b]
> [CEP-21] CMS handling
> - [https://github.com/krummas/cassandra/commit/15eea30d43]
> - [https://github.com/krummas/cassandra/commit/d64da5f5e4]
> - [https://github.com/krummas/cassandra/commit/61deb52811]
> [CEP-21] Upgrade fixes
> - [https://github.com/krummas/cassandra/commit/33d186b4ce]
>   Properly set system.local host id on upgrade.
> - [https://github.com/krummas/cassandra/commit/b96bdc83e1]
>   If a replica misses the migration message, set the migration as
>   successful when it sees the first epoch bump.
> - [https://github.com/krummas/cassandra/commit/712828bc82]
>   Handle hints on upgrade - we change the hostid when enabling CMS, so
>   hints should be delivered before that.
> [CEP-21] Catchup/log fetching improvements
> - [https://github.com/krummas/cassandra/commit/31a183e236]
>   When an instance sees a message from a peer with a newer epoch, try to
>   catch up from that peer instead of the CMS, to reduce load on the CMS
>   nodes and to allow the cluster to quiesce if the CMS is down.
> - [https://github.com/krummas/cassandra/commit/8c6a4b35db]
>   We can get a snapshot when catching up; in this case the pending log
>   should first apply the snapshot and skip any previous entries.
> - [https://github.com/krummas/cassandra/commit/387853487f]
>   When deserializing a partition update, allow it if current epoch >=
>   serialized epoch.
> - [https://github.com/krummas/cassandra/commit/626d224716]
>   When we replay from a snapshot we might see a node as LEFT for the first
>   time (it was bootstrapped and left while we were down).
> [CEP-21] Require Paxos V2 for cluster metadata log operations
> - [https://github.com/krummas/cassandra/commit/2217f551a6]
>   TCM is required to use Paxos V2 because of the way the legacy paxos path
>   uses a keyspace's RF to assert whether there are enough available
>   replicas to perform the read before a CAS. It doesn't work properly with
>   meta strategy when adding CMS members.
> [CEP-21] Disaster recovery
> - [https://github.com/krummas/cassandra/commit/9011233604]
>   Allow an instance to dump its current cluster metadata, and force-boot
>   from it. Basically we need a way to force an instance to become the CMS,
>   in case the original CMS goes down.
> [CEP-21] Switch nodeId from uuid to int
> - [https://github.com/krummas/cassandra/commit/aea5500ae0]
> [CEP-21] Make CQLSSTableWriter exclusively a client utility
> - [https://github.com/krummas/cassandra/commit/0693b22297]
> [CEP-21] Support nodetool assassinate
> - [https://github.com/krummas/cassandra/commit/312a1c1b0e]
> [CEP-21] In progress sequence updates
> - [https://github.com/krummas/cassandra/commit/7f56e0e5b3]
>   Protection against out-of-order and repeated execution, sequence
>   rediscovery and reliability improvements.
> - [https://github.com/krummas/cassandra/commit/7ddb941d80]
>   DC and RF aware acks for multistep operations. Make progress barrier
>   consistency level configurable.
> [CEP-21] Enforce data ownership checks
> - [https://github.com/krummas/cassandra/commit/5c42fd098c]
>   Never accept operations for ranges we don't own.
> [CEP-21] Gossip fixes
> - [https://github.com/krummas/cassandra/commit/8572735e28]
>   Several gossip issues found during upgrade and load testing
> - [https://github.com/krummas/cassandra/commit/ed785cb414]
>   Avoid gossip deadlock when merging CM nodes to gossip
> - [https://github.com/krummas/cassandra/commit/e40c3a4ea]
>   Replaced endpoints should be evicted from gossip like in previous
>   versions.
> [CEP-21] Re-enable startup checks on non-test initialization
> - [https://github.com/krummas/cassandra/commit/63013ad366]
> [CEP-21] Unify streaming: make all operations use explicit ranges for streaming
> - [https://github.com/krummas/cassandra/commit/ccba2e84de]
>   All streaming operations now use a movement map describing what should
>   be streamed where.
> [CEP-21] Add vtable for metadata log
> - [https://github.com/krummas/cassandra/commit/9fa4d61e5a]
> [CEP-21] Add exception code to commit result if rejected
> - [https://github.com/krummas/cassandra/commit/7331e0842b]
> [CEP-21] Make cleanup safe to run during range movements
> - [https://github.com/krummas/cassandra/commit/6f990c118f]
> [CEP-21] ReplicaPlan recomputation and stillAppliesTo implementation for Paxos
> - [https://github.com/krummas/cassandra/commit/0e5cc6a4fd]
> [CEP-21] Update index status fixes post-rebase
> - [https://github.com/krummas/cassandra/commit/06fba6bbc0]
> [CEP-21] Create new auth tables, remove cidr constants for column names
> - [https://github.com/krummas/cassandra/commit/fef280dda6]
> [CEP-21] Schema fixes
> - [https://github.com/krummas/cassandra/commit/dd9a7e9752]
>   Schema cleanups, remove old schema pulling
> - [https://github.com/krummas/cassandra/commit/9f0538c4b3]
>   Don't include system_distributed in initial schema.
> - [https://github.com/krummas/cassandra/commit/4d5fce6884]
>   Simplify check for whether DROP COMPACT STORAGE is permitted
> - [https://github.com/krummas/cassandra/commit/0bb8efb8f0]
>   Don't invalidate prepared stmt cache on every schema change
> - [https://github.com/krummas/cassandra/commit/93517d9ee4]
>   Allow Schema.instance to be initialized empty for client apps
> - [https://github.com/krummas/cassandra/commit/f612d2cd3d]
>   Simplistic schema metadata diff
> - [https://github.com/krummas/cassandra/commit/42bc2dd5ee]
>   Don't warn about new system tables in StartupCheck
> - [https://github.com/krummas/cassandra/commit/b570e74bf3]
>   Exclude meta keyspace from TableMetrics::totalNonSystemTablesSize
> [CEP-21] Simulator updates
> - [https://github.com/krummas/cassandra/commit/7e368cfc3e]
>   Simulate NTS
> - [https://github.com/krummas/cassandra/commit/00a34d88c3]
>   Multi cms simulation, Deadlines for local processor, reworked retries
>   for local and remote processor
> - [https://github.com/krummas/cassandra/commit/e6dce927da]
>   Simulator harry integration
> - [https://github.com/krummas/cassandra/commit/f30bf25060]
>   Eclipse warn
> [CEP-21] Bootstrap fixes
> - [https://github.com/krummas/cassandra/commit/6eea8aad69]
>   ClusterMetadata::writePlacementAllSettled handles bootstrapping nodes
>   correctly
> - [https://github.com/krummas/cassandra/commit/ef9e9c6074]
>   Reenable write survey mode
> [CEP-21] Minor cleanups
> - [https://github.com/krummas/cassandra/commit/335d10c9d6]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)