[jira] [Updated] (CASSANDRA-15229) BufferPool Regression
[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhao Yang updated CASSANDRA-15229:
----------------------------------
        Source Control Link: https://github.com/apache/cassandra/pull/535
    Test and Documentation Plan: added unit test and tested performance. https://app.circleci.com/pipelines/github/jasonstack/cassandra/313/workflows/7b5205e2-21ee-46e8-931c-5b658cf49be5
    (was: added unit test and tested performance.)

> BufferPool Regression
> ---------------------
>
>                 Key: CASSANDRA-15229
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Caching
>            Reporter: Benedict Elliott Smith
>            Assignee: Zhao Yang
>            Priority: Normal
>             Fix For: 4.0, 4.0-beta
>
>         Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png, 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png, 15229-unsafe.png
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we need to either change our behaviour to handle uncorrelated lifetimes or use something else. This is particularly important with the default chunk size for compressed sstables being reduced. If we address the problem, we should also utilise the BufferPool for native transport connections like we do for internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve the BufferPool's behaviour when used for things with uncorrelated lifetimes, which essentially boils down to tracking those chunks that have not been freed and re-circulating them when we run out of completely free blocks. We should probably also permit instantiating separate {{BufferPool}}s, so that we can insulate internode messaging from the {{ChunkCache}}, or at least have separate memory bounds for each, and only share fully-freed chunks.
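The re-circulation idea described above - tracking chunks that have not been fully freed and handing them out again once no completely free chunks remain - can be sketched roughly as follows. This is an illustrative sketch only: the class and method names are invented and do not match Cassandra's actual BufferPool internals.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch (not Cassandra's real BufferPool): chunks that still
// have buffers outstanding are parked on a separate queue and re-circulated
// once the pool runs out of completely free chunks, instead of being
// stranded until every last buffer is released.
final class RecirculatingPool
{
    static final class Chunk
    {
        final int capacity;
        int freeBytes;

        Chunk(int capacity)
        {
            this.capacity = capacity;
            this.freeBytes = capacity;
        }

        boolean fullyFree()
        {
            return freeBytes == capacity;
        }
    }

    private final Deque<Chunk> fullyFree = new ArrayDeque<>();
    private final Deque<Chunk> partiallyFree = new ArrayDeque<>();

    // Called as buffers are returned; the chunk becomes a candidate for
    // reuse even when only part of its space has been released.
    void release(Chunk chunk, int bytesFreed)
    {
        chunk.freeBytes += bytesFreed;
        if (chunk.fullyFree())
            fullyFree.addLast(chunk);
        else
            partiallyFree.addLast(chunk);
    }

    // Prefer completely free chunks; fall back to re-circulating a
    // partially freed one rather than failing the allocation.
    Chunk acquire()
    {
        Chunk chunk = fullyFree.pollFirst();
        return chunk != null ? chunk : partiallyFree.pollFirst();
    }
}
```

A real implementation would also have to carve buffers out of the "holes" in a partially freed chunk, which is where the fragmentation concern in the ticket comes from.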
> With these improvements we can also safely increase the {{BufferPool}} chunk size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce the amount of global coordination and per-allocation overhead. We don't need 1KiB granularity for allocations, nor 16-byte granularity for tiny allocations.
> ----
> Since CASSANDRA-5863, the chunk cache has been implemented on top of the buffer pool. When a local pool is full, one of its chunks is evicted, and it is only put back into the global pool once all buffers in the evicted chunk have been released. But because of the chunk cache, buffers can be held for long periods of time, preventing an evicted chunk from being recycled even though most of the space in it is free.
> Two things need to be improved:
> 1. An evicted chunk with free space should be recycled to the global pool, even if it is not fully free. This is doable in 4.0.
> 2. Reduce fragmentation caused by different buffer sizes. With #1, a partially freed chunk becomes available for allocation, but the "holes" in it have different sizes. We should consider allocating a fixed buffer size, which is unlikely to fit in 4.0.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-16146:
----------------------------------------
    Reviewers: Blake Eggleston

> Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16146
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to {{NORMAL}} blindly. Therefore, re-enabling gossip (stopping and starting gossip) overrides the actual gossip state.
> This can happen in the following scenario:
> # Bootstrap fails. The gossip state remains {{BOOT}} / {{JOINING}} and execution exits StorageService#initServer.
> # The operator runs nodetool to stop and re-start gossip. The gossip state gets flipped to {{NORMAL}}.
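A minimal sketch of the fix direction described in the ticket: rather than stamping {{NORMAL}} unconditionally when gossip is re-enabled, check what state the node was actually in. The names below ({{GossipResumeGuard}}, {{resume}}) are hypothetical and are not the API of the actual patch.

```java
// Hypothetical sketch of the guard described in this ticket: re-enabling
// gossip should not blindly announce NORMAL. If the node never finished
// bootstrap (BOOT / JOINING), resuming gossip is refused instead of
// silently flipping the state. Not the actual StorageService code.
final class GossipResumeGuard
{
    enum State { BOOT, JOINING, NORMAL, SHUTDOWN }

    static State resume(State before)
    {
        // Only a node that previously reached NORMAL (or was cleanly
        // shut down from it) may re-announce NORMAL.
        if (before == State.NORMAL || before == State.SHUTDOWN)
            return State.NORMAL;
        throw new IllegalStateException(
            "Unable to start gossip because the node is not in NORMAL state (was " + before + ")");
    }
}
```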
[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203554#comment-17203554 ]

Yifan Cai commented on CASSANDRA-16146:
---------------------------------------

Updated the error message. Unit and jvm dtests passed. There is one test failure ({{test_closing_connections - thrift_hsha_test.TestThriftHSHA}}) from dtest. It does not look related and existed before this patch.
[https://app.circleci.com/pipelines/github/yifan-c/cassandra/111/workflows/ae86acf0-b416-4a42-92e8-cb845d5393a7]
[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517 ]

Benedict Elliott Smith edited comment on CASSANDRA-16139 at 9/28/20, 9:23 PM:
------------------------------------------------------------------------------

The token ring is problematic for us implementors; wrap-around is a minor headache, but much more important is how on earth you safely perform multiple overlapping range movements - it's basically impossible, as you don't know which will necessarily complete, and so do not know who will end up owning what. Overlapping range movements are bad even as a concept, and unique to the token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are explicitly assigned to are a more correct approach. Defining the membership of each key/token range explicitly prevents these complicated scenarios - a node joining can only possibly replicate those keys, and nothing any other node is doing will modify that. These ranges can be defined per keyspace to permit greater replication flexibility and, importantly, safe modification of the replication factor without new machinery.

Automatic healing (and automatic rebalancing under asymmetric resource consumption) is something that I would hope to be built atop these features, but it could in principle be built atop a token ring, just super painfully and probably with many bugs (like all of our range movements up to today).

Note that this work necessarily overlaps with safe schema changes; the two are intertwined. I'll leave other thoughts on the topic for another day - some time in the next 2-3 months I will publish my white paper on the topic.
> Safe Ring Membership Protocol
> -----------------------------
>
>                 Key: CASSANDRA-16139
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Gossip, Cluster/Membership
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership updates in Cassandra. This protocol will enable reliable concurrent ring membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring structure (such as joining, leaving the ring or changing token assignments) must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal, the other ring members determine whether the change is compatible with their local version of the ring, and if so, they promise to accept the change proposed by the initiator. Ring members do not accept a proposal if they have already promised to honor another proposal, to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the cohort, it commits the proposal by broadcasting the proposed ring delta via gossip. Upon receiving these changes, the other members of the cohort apply the delta to their local version of the ring and broadcast their newly computed version via gossip. The initiator concludes the ring membership update operation by checking that all nodes agree on the new proposed version.
> *ABORT:* A proposal not accepted by all members of the cohort may be automatically aborted by the initiator or manually via a command line tool.
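The ACCEPT and COMMIT rules above can be sketched as a tiny acceptor state machine: a ring member promises a proposal only if it has no conflicting outstanding promise and the proposal was made against the member's current ring version. This is a hedged illustration with invented names, not the ticket's implementation (a python pseudo-code of the real protocol is linked from the ticket).

```java
// Illustrative acceptor for the PROPOSE/ACCEPT/COMMIT steps described
// above. Hypothetical names; not the actual Cassandra patch.
final class RingMember
{
    long ringVersion = 0;
    Long promisedProposal = null;   // id of the proposal we promised, if any

    // ACCEPT: promise iff the proposal is compatible with our ring version
    // and we have not promised a different proposal.
    boolean accept(long proposalId, long proposedAgainstVersion)
    {
        if (promisedProposal != null && promisedProposal != proposalId)
            return false;                 // conflicting proposal in flight
        if (proposedAgainstVersion != ringVersion)
            return false;                 // proposer had a stale view of the ring
        promisedProposal = proposalId;
        return true;
    }

    // COMMIT: apply the delta (modelled here as a version bump) and clear
    // the promise so new proposals can be accepted.
    void commit(long proposalId)
    {
        if (promisedProposal == null || promisedProposal != proposalId)
            throw new IllegalStateException("commit without matching promise");
        ringVersion++;
        promisedProposal = null;
    }
}
```

An ABORT would simply clear {{promisedProposal}} without bumping the version.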
[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203523#comment-17203523 ]

Brandon Williams commented on CASSANDRA-16146:
----------------------------------------------

+1, with a minor bikeshed: it would be nice if the error message instead suggested shutting the node down, if that is what the operator really wants to do.
[cassandra-builds] branch master updated: ninja-fix: test-cdc and test-compression keep logs under a subdirectory
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git

The following commit(s) were added to refs/heads/master by this push:
     new 875d918  ninja-fix: test-cdc and test-compression keep logs under a subdirectory
875d918 is described below

commit 875d91841e5614291c5b6278edbc07a4f3174ba3
Author: Mick Semb Wever
AuthorDate: Mon Sep 28 22:58:08 2020 +0200

    ninja-fix: test-cdc and test-compression keep logs under a subdirectory
---
 jenkins-dsl/cassandra_job_dsl_seed.groovy | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/jenkins-dsl/cassandra_job_dsl_seed.groovy b/jenkins-dsl/cassandra_job_dsl_seed.groovy
index 9a1694a..7684f32 100644
--- a/jenkins-dsl/cassandra_job_dsl_seed.groovy
+++ b/jenkins-dsl/cassandra_job_dsl_seed.groovy
@@ -464,7 +464,7 @@ cassandraBranches.each {
             steps {
                 shell("""
                     ./cassandra-builds/build-scripts/cassandra-test.sh ${targetName} ;
-                    xz build/test/logs/*.log
+                    find build/test/logs -type f -name "*.log" | xargs xz -qq
                   """)
             }
         }
@@ -695,7 +695,7 @@ testTargets.each {
                 """)
                 shell("""
                     ./cassandra-builds/build-scripts/cassandra-test.sh ${targetName} ;
-                    xz build/test/logs/*.log
+                    find build/test/logs -type f -name "*.log" | xargs xz -qq
                 """)
             }
             publishers {
[jira] [Commented] (CASSANDRA-16139) Safe Ring Membership Protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517 ]

Benedict Elliott Smith commented on CASSANDRA-16139:
----------------------------------------------------

The token ring is problematic for us implementors; wrap around is a minor headache, but much more important is how on earth you safely perform multiple overlapping range movements - it's basically impossible, as you don't know which will necessarily complete, and so do not know who will end up owning the replica. Overlapping range movements are bad even as a concept, and unique to the token ring conceptualisation. Bounded ranges of ownership - whether as tokens or keys - that nodes are explicitly assigned to is the correct approach. Defining the membership of each key/token range explicitly prevents these complicated scenarios - a node joining can only possibly replicate these keys, and nothing any other node is doing will modify that. These can be defined per keyspace to permit greater replication flexibility, and importantly safe modifications to replication factor without new machinery. Automatic healing is something that I would hope to be built atop these features, and could in principle be built atop a token ring, just super painfully and probably with many bugs (like all of our range movements up to today). Note that this work necessarily overlaps with safe schema changes; the two are intertwined. I'll leave other thoughts on the topic for another day - some time in the next 2-3 months I will publish my white paper on the topic.

> Safe Ring Membership Protocol
> -----------------------------
>
>                 Key: CASSANDRA-16139
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Gossip, Cluster/Membership
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership updates in Cassandra.
> This protocol will enable reliable concurrent ring membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring structure (such as joining, leaving the ring or changing token assignments) must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal, the other ring members determine whether the change is compatible with their local version of the ring, and if so, they promise to accept the change proposed by the initiator. Ring members do not accept a proposal if they have already promised to honor another proposal, to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the cohort, it commits the proposal by broadcasting the proposed ring delta via gossip. Upon receiving these changes, the other members of the cohort apply the delta to their local version of the ring and broadcast their newly computed version via gossip. The initiator concludes the ring membership update operation by checking that all nodes agree on the new proposed version.
> *ABORT:* A proposal not accepted by all members of the cohort may be automatically aborted by the initiator or manually via a command line tool.
> For simplicity the protocol above requires that all nodes are up during the proposal step, but it should be possible to optimize it to require only a quorum of nodes up to perform ring changes.
> A python pseudo-code of the protocol is available [here|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-safe_ring_membership-py].
> With the abstraction above it becomes very simple to perform ring change operations:
> * [bootstrap|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-bootstrap-py]
> * [replace|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-replace-py]
> * [move|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-move-py]
> * [remove node|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-remove_node-py]
> * [remove token|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-remove_token-py]
> h4. Token Ring Data Structure
> The token ring data structure can be seen as a [Delta State Replicated Data Type|https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#State-based_CRDTs] (Delta CRDT) containing the state of all (virtual) nodes in the cluster, where updates to the ring are operations on this CRDT.
> Each member publishes its latest locally accepted state (delta state) via gossip, and the union of all delta states comprises the global state.
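The delta-CRDT view described above can be sketched as a map whose join keeps the highest-versioned state per (virtual) node; because the join is commutative and idempotent, members can apply gossiped deltas in any order and still converge. The types below are hypothetical illustrations, not Cassandra's actual gossip structures.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the delta-CRDT ring state: each member gossips a
// version-stamped delta for the nodes it knows about, and the global ring
// is the join (union) of all deltas, keeping the highest version per node.
// Hypothetical types, not Cassandra's real data structures.
final class RingCrdt
{
    static final class NodeState
    {
        final long version;
        final String tokens;

        NodeState(long version, String tokens)
        {
            this.version = version;
            this.tokens = tokens;
        }
    }

    final Map<String, NodeState> ring = new HashMap<>();

    // Join a delta state: last-writer-wins per node, decided by version.
    void merge(Map<String, NodeState> delta)
    {
        for (Map.Entry<String, NodeState> e : delta.entrySet())
            ring.merge(e.getKey(), e.getValue(),
                       (mine, theirs) -> theirs.version > mine.version ? theirs : mine);
    }
}
```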
[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517 ] Benedict Elliott Smith edited comment on CASSANDRA-16139 at 9/28/20, 8:49 PM: -- The token ring is problematic for us implementors; wrap around is a minor headache, but much more important is how on earth you safely perform multiple overlapping range movements - it's basically impossible, as you don't know which will necessarily complete, and so do not know who will end up owning what. Overlapping range movements even as a concept is bad, and unique to the token ring conceptualisation. Bounded ranges of ownership - whether as tokens or keys - that nodes are explicitly assigned tois the correct approach. Defining the membership of each key/token range explicitly prevents these complicated scenarios - a node joining can only possibly replicate these keys, and nothing any other node is doing will modify that. These can be defined per keyspace to permit greater replication flexibility, and importantly safe modifications to replication factor without new machinery. Automatic healing is something that I would hope to be built atop these features, but could in principle be built atop a token ring, just super painfully and probably with many bugs (like all of our range movements up to today). Note that this work necessarily overlaps with safe schema changes, the two are intertwined. I'll leave other thoughts on the topic for another day - some time in the next 2-3 months I will published my white paper on the topic. was (Author: benedict): The token ring is problematic for us implementors; wrap around is a minor headache, but much more important is how on earth you safely perform multiple overlapping range movements - it's basically impossible, as you don't know which will necessarily complete, and so do not know who will end up owning the replica. 
Overlapping range movements even as a concept is bad, and unique to the token ring conceptualisation. Bounded ranges of ownership - whether as tokens or keys - that nodes are explicitly assigned tois the correct approach. Defining the membership of each key/token range explicitly prevents these complicated scenarios - a node joining can only possibly replicate these keys, and nothing any other node is doing will modify that. These can be defined per keyspace to permit greater replication flexibility, and importantly safe modifications to replication factor without new machinery. Automatic healing is something that I would hope to be built atop these features, but could in principle be built atop a token ring, just super painfully and probably with many bugs (like all of our range movements up to today). Note that this work necessarily overlaps with safe schema changes, the two are intertwined. I'll leave other thoughts on the topic for another day - some time in the next 2-3 months I will published my white paper on the topic. > Safe Ring Membership Protocol > - > > Key: CASSANDRA-16139 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16139 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Gossip, Cluster/Membership >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Normal > > This ticket presents a practical protocol for performing safe ring membership > updates in Cassandra. This protocol will enable reliable concurrent ring > membership updates. > The proposed protocol is composed of the following macro-steps: > *PROPOSE:* An initiator node wanting to make updates to the current ring > structure (such as joining, leaving the ring or changing token assignments) > must propose the change to the other members of the ring (cohort). 
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the > change is compatible with their local version of the ring, and if so, they > promise to accept the change proposed by the initiator. The ring members do > not accept proposals if they had already promised to honor another proposal, > to avoid conflicting ring membership updates. > *COMMIT:* Once the initiator receives acceptances from all the nodes in the > cohort, it commits the proposal by broadcasting the proposed ring delta via > gossip. Upon receiving these changes, the other members of the cohort apply > the delta to their local version of the ring and broadcast their new computed > version via gossip. The initiator concludes the ring membership update > operation by checking that all nodes agree on the new proposed version. > *ABORT:* A proposal not accepted by all members of the cohort may be > automatically aborted by the initiator or manually via a command line tool. > For simplicity the
[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517 ] Benedict Elliott Smith edited comment on CASSANDRA-16139 at 9/28/20, 8:48 PM: -- The token ring is problematic for us implementors; wrap around is a minor headache, but much more important is how on earth you safely perform multiple overlapping range movements - it's basically impossible, as you don't know which will necessarily complete, and so do not know who will end up owning the replica. Overlapping range movements even as a concept is bad, and unique to the token ring conceptualisation. Bounded ranges of ownership - whether as tokens or keys - that nodes are explicitly assigned tois the correct approach. Defining the membership of each key/token range explicitly prevents these complicated scenarios - a node joining can only possibly replicate these keys, and nothing any other node is doing will modify that. These can be defined per keyspace to permit greater replication flexibility, and importantly safe modifications to replication factor without new machinery. Automatic healing is something that I would hope to be built atop these features, but could in principle be built atop a token ring, just super painfully and probably with many bugs (like all of our range movements up to today). Note that this work necessarily overlaps with safe schema changes, the two are intertwined. I'll leave other thoughts on the topic for another day - some time in the next 2-3 months I will published my white paper on the topic. was (Author: benedict): The token ring is problematic for us implementors; wrap around is a minor headache, but much more important is how on earth you safely perform multiple overlapping range movements - it's basically impossible, as you don't know which will necessarily complete, and so do not know who will end up owning the replica. 
Overlapping range movements even as a concept is bad, and unique to the token ring conceptualisation. Bounded ranges of ownership - whether as tokens or keys - that nodes are explicitly assigned tois the correct approach. Defining the membership of each key/token range explicitly prevents these complicated scenarios - a node joining can only possibly replicate these keys, and nothing any other node is doing will modify that. These can be defined per keyspace to permit greater replication flexibility, and importantly safe modifications to replication factor without new machinery. Automatic healing is something that I would hope to be built atop these features, and could in principle be built atop a token ring, just super painfully and probably with many bugs (like all of our range movements up to today). Note that this work necessarily overlaps with safe schema changes, the two are intertwined. I'll leave other thoughts on the topic for another day - some time in the next 2-3 months I will published my white paper on the topic. > Safe Ring Membership Protocol > - > > Key: CASSANDRA-16139 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16139 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Gossip, Cluster/Membership >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Normal > > This ticket presents a practical protocol for performing safe ring membership > updates in Cassandra. This protocol will enable reliable concurrent ring > membership updates. > The proposed protocol is composed of the following macro-steps: > *PROPOSE:* An initiator node wanting to make updates to the current ring > structure (such as joining, leaving the ring or changing token assignments) > must propose the change to the other members of the ring (cohort). 
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the > change is compatible with their local version of the ring, and if so, they > promise to accept the change proposed by the initiator. The ring members do > not accept proposals if they had already promised to honor another proposal, > to avoid conflicting ring membership updates. > *COMMIT:* Once the initiator receives acceptances from all the nodes in the > cohort, it commits the proposal by broadcasting the proposed ring delta via > gossip. Upon receiving these changes, the other members of the cohort apply > the delta to their local version of the ring and broadcast their new computed > version via gossip. The initiator concludes the ring membership update > operation by checking that all nodes agree on the new proposed version. > *ABORT:* A proposal not accepted by all members of the cohort may be > automatically aborted by the initiator or manually via a command line tool. > For simplicity the protocol above requires that all nodes are up during the > proposal step, but it should be possible to optimize it to require only a > quorum of nodes up to perform ring changes.
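The PROPOSE/ACCEPT/COMMIT/ABORT flow quoted above can be sketched in a few lines of Python. This is an illustrative model only; the class and function names below are invented for this sketch and are not taken from the linked gist or the Cassandra codebase:

```python
# Minimal sketch of the PROPOSE/ACCEPT/COMMIT/ABORT flow described above.
# All names here are illustrative, not the actual pseudo-code from the gist.

class RingMember:
    def __init__(self, version):
        self.version = version   # currently committed ring version
        self.promised = None     # version we promised to accept, if any

    def on_propose(self, current_version, proposed_version):
        """ACCEPT step: promise only if the initiator saw the latest ring
        and no conflicting promise is outstanding."""
        if current_version != self.version:
            return False         # initiator's view is stale: reject
        if self.promised not in (None, proposed_version):
            return False         # already promised a conflicting proposal
        self.promised = proposed_version
        return True

    def on_commit(self, proposed_version):
        """COMMIT step: apply the proposed delta and clear the promise."""
        self.version = proposed_version
        self.promised = None

def run_membership_update(initiator_version, proposed_version, cohort):
    """Initiator side: commit only if *all* cohort members promise,
    otherwise ABORT (here simply by returning False)."""
    if all(m.on_propose(initiator_version, proposed_version) for m in cohort):
        for m in cohort:
            m.on_commit(proposed_version)
        return True
    return False
```

For example, a proposal against ring version "b710" commits once every member promises; a second proposal still referencing the now-stale "b710" is rejected by the whole cohort.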
[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16146: -- Test and Documentation Plan: ci, jvm dtest Status: Patch Available (was: Open) > Node state incorrectly set to NORMAL after nodetool disablegossip and > enablegossip during bootstrap > --- > > Key: CASSANDRA-16146 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16146 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0-beta3 > > Time Spent: 10m > Remaining Estimate: 0h > > At high level, {{StorageService#setGossipTokens}} set the gossip state to > {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) > overrides the actual gossip state. > > It could happen in the below scenario. > # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and > code execution exits StorageService#initServer. > # Operator runs nodetool to stop and re-start gossip. The gossip state gets > flipped to {{NORMAL}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203515#comment-17203515 ] Yifan Cai commented on CASSANDRA-16146: --- PR (to 3.0): [https://github.com/apache/cassandra/pull/760] CI: [https://app.circleci.com/pipelines/github/yifan-c/cassandra/108/workflows/d4fc0b93-111e-4cbc-bd2c-c68e1a72fe09] The patch simply prevents calling stopGossip and startGossip when not in the normal mode. I will prepare the patch to the other branches (3.11 and trunk) once this one looks good. cc: [~bdeggleston] > Node state incorrectly set to NORMAL after nodetool disablegossip and > enablegossip during bootstrap > --- > > Key: CASSANDRA-16146 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16146 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0-beta3 > > Time Spent: 10m > Remaining Estimate: 0h > > At high level, {{StorageService#setGossipTokens}} set the gossip state to > {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) > overrides the actual gossip state. > > It could happen in the below scenario. > # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and > code execution exits StorageService#initServer. > # Operator runs nodetool to stop and re-start gossip. The gossip state gets > flipped to {{NORMAL}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
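The behaviour this patch addresses can be modelled in a short sketch. Note this is illustrative Python only; the actual fix is Java in {{StorageService}}, and the method names below are invented, not the real API:

```python
# Sketch of the CASSANDRA-16146 bug and fix (illustrative Python, not the
# actual Java patch): re-enabling gossip blindly announces NORMAL, so a
# node stuck in JOINING after a failed bootstrap gets flipped to NORMAL
# unless stop/start are guarded by a mode pre-check.

class GossipState:
    def __init__(self, mode="JOINING"):
        self.mode = mode
        self.enabled = True

    def start_gossip_unguarded(self):
        # Bug: setGossipTokens-style behaviour, sets NORMAL regardless
        # of the prior mode.
        self.enabled = True
        self.mode = "NORMAL"

    def stop_gossip(self):
        # Fix: refuse to toggle gossip unless the node is already NORMAL.
        if self.mode != "NORMAL":
            raise RuntimeError("refusing to stop gossip while in " + self.mode)
        self.enabled = False

    def start_gossip(self):
        if self.mode != "NORMAL":
            raise RuntimeError("refusing to start gossip while in " + self.mode)
        self.enabled = True
        self.mode = "NORMAL"
```

With the guard in place, an operator running disablegossip/enablegossip on a node still in JOINING gets an error instead of silently overwriting the real state.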
[jira] [Commented] (CASSANDRA-16139) Safe Ring Membership Protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203496#comment-17203496 ] Jeff Jirsa commented on CASSANDRA-16139: > Would you care to elaborate why? My high level goal here is to ensure we can > reliably add/remove/replace nodes to a cluster, and this seems to be > reasonably doable with consistent hashing as far as I understand. I'd love to > explore alternatives but I'd be interested in learning what requirements are > not fulfilled by the current architecture. Because it's a concept borrowed from the 2007 paper and never reconsidered and it has ALL SORTS of unpleasant failure realities, and we can do better in 2021. For example: why, when a single machine fails in a datacenter, and the rest of the hosts detect the failure, does the database do nothing to re-replicate that data, instead forcing a user to come along and run some magic commands that literally only a handful of people actually understand, when the database COULD do it all on its own without humans in the loop? Why would we rely on humans assigning tokens, anyway, or static token assignment, when the database can see imbalance, and could potentially deal with imbalance on its own? The whole existence of vnodes should have been a red flag that tokens as a distribution mechanism were flawed. Tokens are a simplistic concept that are easy to reason about but horrible to use. If we're rewriting it, please take the time to research how other distributed databases move data around when there's a hot shard or a lost shard, because it's a meaningful and critical missing part of Cassandra. 
> Safe Ring Membership Protocol > - > > Key: CASSANDRA-16139 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16139 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Gossip, Cluster/Membership >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Normal > > This ticket presents a practical protocol for performing safe ring membership > updates in Cassandra. This protocol will enable reliable concurrent ring > membership updates. > The proposed protocol is composed of the following macro-steps: > *PROPOSE:* An initiator node wanting to make updates to the current ring > structure (such as joining, leaving the ring or changing token assignments) > must propose the change to the other members of the ring (cohort). > *ACCEPT:* Upon receiving a proposal the other ring members determine if the > change is compatible with their local version of the ring, and if so, they > promise to accept the change proposed by the initiator. The ring members do > not accept proposals if they had already promised to honor another proposal, > to avoid conflicting ring membership updates. > *COMMIT:* Once the initiator receives acceptances from all the nodes in the > cohort, it commits the proposal by broadcasting the proposed ring delta via > gossip. Upon receiving these changes, the other members of the cohort apply > the delta to their local version of the ring and broadcast their new computed > version via gossip. The initiator concludes the ring membership update > operation by checking that all nodes agree on the new proposed version. > *ABORT:* A proposal not accepted by all members of the cohort may be > automatically aborted by the initiator or manually via a command line tool. > For simplicity the protocol above requires that all nodes are up during the > proposal step, but it should be possible to optimize it to require only a > quorum of nodes up to perform ring changes. 
> A python pseudo-code of the protocol is available > [here|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-safe_ring_membership-py]. > With the abstraction above it becomes very simple to perform ring change > operations: > * > [bootstrap|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-bootstrap-py] > * > [replace|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-replace-py] > * > [move|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-move-py] > * [remove > node|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-remove_node-py] > * [remove > token|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-remove_token-py] > h4. Token Ring Data Structure > The token ring data structure can be seen as a [Delta State Replicated Data > Type|https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#State-based_CRDTs] > (Delta CRDT) containing the state of all (virtual) nodes in the cluster > where updates to the ring are operations on this CRDT. > Each member publishes its latest local accepted state
[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203496#comment-17203496 ] Jeff Jirsa edited comment on CASSANDRA-16139 at 9/28/20, 8:09 PM: -- {quote}Would you care to elaborate why? My high level goal here is to ensure we can reliably add/remove/replace nodes to a cluster, and this seems to be reasonably doable with consistent hashing as far as I understand. I'd love to explore alternatives but I'd be interested in learning what requirements are not fulfilled by the current architecture.{quote} Because it's a concept borrowed from the 2007 paper and never reconsidered and it has ALL SORTS of unpleasant failure realities, and we can do better in 2021. For example: why, when a single machine fails in a datacenter, and the rest of the hosts detect the failure, does the database do nothing to re-replicate that data, instead forcing a user to come along and run some magic commands that literally only a handful of people actually understand, when the database COULD do it all on its own without humans in the loop? Why would we rely on humans assigning tokens, anyway, or static token assignment, when the database can see imbalance, and could potentially deal with imbalance on its own? The whole existence of vnodes should have been a red flag that tokens as a distribution mechanism were flawed. Tokens are a simplistic concept that are easy to reason about but horrible to use. If we're rewriting it, please take the time to research how other distributed databases move data around when there's a hot shard or a lost shard, because it's a meaningful and critical missing part of Cassandra. was (Author: jjirsa): > Would you care to elaborate why? My high level goal here is to ensure we can > reliably add/remove/replace nodes to a cluster, and this seems to be > reasonably doable with consistent hashing as far as I understand. 
I'd love to > explore alternatives but I'd be interested in learning what requirements are > not fulfilled by the current architecture. Because it's a concept borrowed from the 2007 paper and never reconsidered and it has ALL SORTS of unpleasant failure realities, and we can do better in 2021. For example: why, when a single machine fails in a datacenter, and the rest of the hosts detect the failure, does the database do nothing to re-replicate that data, instead forcing a user to come along and run some magic commands that literally only a handful of people actually understand, when the database COULD do it all on its own without humans in the loop? Why would we rely on humans assigning tokens, anyway, or static token assignment, when the database can see imbalance, and could potentially deal with imbalance on its own? The whole existence of vnodes should have been a red flag that tokens as a distribution mechanism were flawed. Tokens are a simplistic concept that are easy to reason about but horrible to use. If we're rewriting it, please take the time to research how other distributed databases move data around when there's a hot shard or a lost shard, because it's a meaningful and critical missing part of Cassandra. > Safe Ring Membership Protocol > - > > Key: CASSANDRA-16139 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16139 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Gossip, Cluster/Membership >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Normal > > This ticket presents a practical protocol for performing safe ring membership > updates in Cassandra. This protocol will enable reliable concurrent ring > membership updates. > The proposed protocol is composed of the following macro-steps: > *PROPOSE:* An initiator node wanting to make updates to the current ring > structure (such as joining, leaving the ring or changing token assignments) > must propose the change to the other members of the ring (cohort). 
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the > change is compatible with their local version of the ring, and if so, they > promise to accept the change proposed by the initiator. The ring members do > not accept proposals if they had already promised to honor another proposal, > to avoid conflicting ring membership updates. > *COMMIT:* Once the initiator receives acceptances from all the nodes in the > cohort, it commits the proposal by broadcasting the proposed ring delta via > gossip. Upon receiving these changes, the other members of the cohort apply > the delta to their local version of the ring and broadcast their new computed > version via gossip. The initiator concludes the ring membership update > operation by checking that all nodes agree on the new proposed version.
[jira] [Commented] (CASSANDRA-16139) Safe Ring Membership Protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203474#comment-17203474 ] Paulo Motta commented on CASSANDRA-16139: - Thanks for your comments Benedict and Jeff! Please find follow-up below. {quote}Any replacement should not be built upon Gossip (either in its current or an improved form) {quote} The proposed protocol uses gossip on 2 steps: a) before PROPOSE, to validate the initiator has the same ring version as the cohort; b) on COMMIT, to broadcast the ring membership update. Step a) is an optimization that prevents the initiator from proposing a new ring version if there's a current disagreement. Step b) adds resilience against initiator failure during commit at the expense of latency, but can easily be made synchronous to address that. I may be failing to see what's problematic about gossip here so I'll wait for your justification on why we should avoid it. {quote}Being able to operate with a quorum is probably a lot harder than with every node's involvement, so I'd suggest thinking about that sooner than later {quote} That's a valid point. I will focus on making this work with all nodes for now, since that's a fair assumption/requirement, and if we see necessity we can get back to this later. {quote}How do you guarantee that all participants in an operation have a consistent view of the ring for the purposes of that operation? {quote} content-based versioning. 
Example:
* Node A ring (version: *b710*): {code:json} { "previous": "5f36", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N"], "C": ["3:N", "7:N"]} }{code}
* Node B ring (version: *b710*): {code:json} { "previous": "5f36", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N"], "C": ["3:N", "7:N"]} }{code}
* Node C ring (version: *b710*): {code:json} { "previous": "5f36", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N"], "C": ["3:N", "7:N"]} }{code}
Suppose now nodes "D" and "E" want to join the ring with the same tokens "4" and "8" - only one of them should succeed. Each of them will read the current ring version *b710*. Each node will generate the following "proposed" ring version:
* Node D proposed ring (version: *6f69*): {code:json} { "previous": "b710", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N"], "C": ["3:N", "7:N"], "D": ["4:J", "8:J"]} } {code}
* Node E proposed ring (version: *8f88*): {code:json} { "previous": "b710", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N"], "C": ["3:N", "7:N"], "E": ["4:J", "8:J"]} } {code}
They will then each send a PROPOSE message with the following parameters to the cohort:
* NODE D: {code:java} PROPOSE(current_version="b710", proposed_version="6f69"){code}
* NODE E: {code:java} PROPOSE(current_version="b710", proposed_version="8f88"){code}
In this situation any one of 3 outcomes can happen:
* Neither E nor D gets a PROMISE from all nodes - no proposal succeeds.
* NODE D is able to get a promise from nodes A, B and C for version *6f69*.
* NODE E is able to get a promise from nodes A, B and C for version *8f88*.
Now let's say NODE D's proposal succeeds and the ring updates its version to *6f69*. Any proposal from NODE E referencing the previous ring version *b710* will be rejected by the cohort, so node E will be forced to update its version before submitting a new proposal. 
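A minimal Python sketch of the content-based versioning scheme described above, under the assumption that a version is a short hash over the ring's "previous" pointer plus its vnode map (the hashing details are invented here; digests like *b710* in the example are just illustrative values):

```python
# Sketch of content-based ring versioning (assumption: version = short
# hash over the previous-version pointer plus the vnode map).
import hashlib
import json

def ring_version(previous, vnodes):
    # Canonical serialization so every node derives the same digest
    # for the same ring content.
    blob = json.dumps({"previous": previous, "vnodes": vnodes}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()[:4]

def accepts_proposal(cohort_version, current_version, proposed_version, promised):
    """Cohort-side ACCEPT check: promise only if the initiator references
    the cohort's latest ring version and no conflicting promise is held."""
    if current_version != cohort_version:
        return False  # initiator's view of the ring is stale
    if promised is not None and promised != proposed_version:
        return False  # already promised a different proposal
    return True
```

With this, concurrent joiners D and E derive distinct proposed versions from the same base; once a member has promised D's version (or D has committed), E's proposal is rejected and E must re-read the ring first.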
{quote}If we're rewriting all of the membership/ownership code, we should definitely be thinking about a world that isn't based on tokens and hash tables. {quote} Would you care to elaborate why? My high level goal here is to ensure we can reliably add/remove/replace nodes to a cluster, and this seems to be reasonably doable with consistent hashing as far as I understand. I'd love to explore alternatives but I'd be interested in learning what requirements are not fulfilled by the current architecture. > Safe Ring Membership Protocol > - > > Key: CASSANDRA-16139 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16139 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Gossip, Cluster/Membership >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Normal > > This ticket presents a practical protocol for performing safe ring membership > updates in Cassandra. This protocol will enable reliable concurrent ring > membership updates. > The proposed protocol is composed of the following macro-steps: > *PROPOSE:* An initiator node wanting to make updates to the current ring > structure (such as joining, leaving the ring or changing token assignments) > must propose the change to the other members of the ring (cohort). > *ACCEPT:* Upon receiving a proposal the other ring members determine if the > change is compatible with their local version of the ring, and if so, they >
[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203429#comment-17203429 ] Yifan Cai commented on CASSANDRA-16146: --- Thanks [~brandon.williams] for commenting so quickly. bq. perhaps just not allow this outside of NORMAL Adding a pre-check before both starting and stopping gossip to make sure the current mode is NORMAL sounds good to me. > Node state incorrectly set to NORMAL after nodetool disablegossip and > enablegossip during bootstrap > --- > > Key: CASSANDRA-16146 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16146 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0-beta3 > > > At high level, {{StorageService#setGossipTokens}} set the gossip state to > {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) > overrides the actual gossip state. > > It could happen in the below scenario. > # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and > code execution exits StorageService#initServer. > # Operator runs nodetool to stop and re-start gossip. The gossip state gets > flipped to {{NORMAL}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16146: -- Fix Version/s: 4.0-beta3 3.11.x 3.0.x > Node state incorrectly set to NORMAL after nodetool disablegossip and > enablegossip during bootstrap > --- > > Key: CASSANDRA-16146 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16146 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0-beta3 > > > At high level, {{StorageService#setGossipTokens}} set the gossip state to > {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) > overrides the actual gossip state. > > It could happen in the below scenario. > # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and > code execution exits StorageService#initServer. > # Operator runs nodetool to stop and re-start gossip. The gossip state gets > flipped to {{NORMAL}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203409#comment-17203409 ] Brandon Williams commented on CASSANDRA-16146: -- bq. Operator runs nodetool to stop and re-start gossip. The gossip state gets flipped to NORMAL We should perhaps just not allow this outside of NORMAL, since in that case you probably want to just stop the node instead. > Node state incorrectly set to NORMAL after nodetool disablegossip and > enablegossip during bootstrap > --- > > Key: CASSANDRA-16146 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16146 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > At high level, {{StorageService#setGossipTokens}} set the gossip state to > {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) > overrides the actual gossip state. > > It could happen in the below scenario. > # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and > code execution exits StorageService#initServer. > # Operator runs nodetool to stop and re-start gossip. The gossip state gets > flipped to {{NORMAL}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16146: -- Description: At high level, {{StorageService#setGossipTokens}} set the gossip state to {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) overrides the actual gossip state. It could happen in the below scenario. # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and code execution exits StorageService#initServer. # Operator runs nodetool to stop and re-start gossip. The gossip state gets flipped to {{NORMAL}} was: {{At high level, {{StorageService#setGossipTokens}} set the gossip state to NORMAL blindly. Therefore, re-enabling gossip (stop and start gossip) overrides the actual gossip state.}} {color:#24292e}It could happen in the below scenario.{color} {color:#24292e} {color} # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and code execution exits StorageService#initServer. # Operator runs nodetool to stop and re-start gossip. The gossip state gets flipped to {{NORMAL}} > Node state incorrectly set to NORMAL after nodetool disablegossip and > enablegossip during bootstrap > --- > > Key: CASSANDRA-16146 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16146 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > At high level, {{StorageService#setGossipTokens}} set the gossip state to > {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) > overrides the actual gossip state. > > It could happen in the below scenario. > # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and > code execution exits StorageService#initServer. > # Operator runs nodetool to stop and re-start gossip. 
The gossip state gets > flipped to {{NORMAL}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16146: -- Bug Category: Parent values: Correctness(12982)Level 1 values: Consistency(12989) Complexity: Low Hanging Fruit Discovered By: Code Inspection Severity: Low Status: Open (was: Triage Needed) > Node state incorrectly set to NORMAL after nodetool disablegossip and > enablegossip during bootstrap > --- > > Key: CASSANDRA-16146 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16146 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > {{At high level, {{StorageService#setGossipTokens}} set the gossip state to > NORMAL blindly. Therefore, re-enabling gossip (stop and start gossip) > overrides the actual gossip state.}} > > {color:#24292e}It could happen in the below scenario.{color} > {color:#24292e} {color} # Bootstrap failed. The gossip state remains in > {{BOOT}} / {{JOINING}} and code execution exits StorageService#initServer. > # Operator runs nodetool to stop and re-start gossip. The gossip state gets > flipped to {{NORMAL}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
Yifan Cai created CASSANDRA-16146: - Summary: Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap Key: CASSANDRA-16146 URL: https://issues.apache.org/jira/browse/CASSANDRA-16146 Project: Cassandra Issue Type: Bug Components: Cluster/Gossip Reporter: Yifan Cai Assignee: Yifan Cai {{At high level, {{StorageService#setGossipTokens}} set the gossip state to NORMAL blindly. Therefore, re-enabling gossip (stop and start gossip) overrides the actual gossip state.}} {color:#24292e}It could happen in the below scenario.{color} {color:#24292e} {color} # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and code execution exits StorageService#initServer. # Operator runs nodetool to stop and re-start gossip. The gossip state gets flipped to {{NORMAL}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203377#comment-17203377 ] Benedict Elliott Smith edited comment on CASSANDRA-14793 at 9/28/20, 5:08 PM: -- Perhaps shut the server down for all writers and compaction, and serve only reads? I've not got a strong opinion about it though - hard to run safely in this context, so would seem fine to just admit defeat in this case. This does strengthen the argument for replicating system keyspace data to multiple disks, discussed above. was (Author: benedict): Perhaps shut the server down for all writers and compaction, and serve only reads? > Improve system table handling when losing a disk when using JBOD > > > Key: CASSANDRA-14793 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14793 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Marcus Eriksson >Assignee: Benjamin Lerer >Priority: Normal > Fix For: 4.0 > > Time Spent: 10m > Remaining Estimate: 0h > > We should improve the way we handle disk failures when losing a disk in a > JBOD setup > One way could be to pin the system tables to a special data directory. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203377#comment-17203377 ] Benedict Elliott Smith commented on CASSANDRA-14793: Perhaps shut the server down for all writers and compaction, and serve only reads? > Improve system table handling when losing a disk when using JBOD > > > Key: CASSANDRA-14793 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14793 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Marcus Eriksson >Assignee: Benjamin Lerer >Priority: Normal > Fix For: 4.0 > > Time Spent: 10m > Remaining Estimate: 0h > > We should improve the way we handle disk failures when losing a disk in a > JBOD setup > One way could be to pin the system tables to a special data directory. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203307#comment-17203307 ] Benjamin Lerer commented on CASSANDRA-14793: [~marcuse], [~benedict] How do you think we should handle the case where the {{disk_failure_policy}} is {{best_effort}} and the disk containing the system data is marked as {{unreadable}} or {{unwritable}} ? > Improve system table handling when losing a disk when using JBOD > > > Key: CASSANDRA-14793 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14793 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Marcus Eriksson >Assignee: Benjamin Lerer >Priority: Normal > Fix For: 4.0 > > Time Spent: 10m > Remaining Estimate: 0h > > We should improve the way we handle disk failures when losing a disk in a > JBOD setup > One way could be to pin the system tables to a special data directory. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan West updated CASSANDRA-15833: Authors: Jacek Lewandowski, Jordan West (was: Jacek Lewandowski) Since Version: 3.11.9 Source Control Link: https://github.com/apache/cassandra/commit/cf27558b1442e75e17e47071ecf92d1b3e5a0e36 Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed as https://github.com/apache/cassandra/commit/cf27558b1442e75e17e47071ecf92d1b3e5a0e36. Thanks for the input and review everyone! > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.0-beta > > Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated (d4eba9f -> c6ef476)
This is an automated email from the ASF dual-hosted git repository. jwest pushed a change to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git. from d4eba9f Abort repairs when getting a truncation request add cf27558 Don't attempt value skipping with mixed cluster new c6ef476 Merge branch 'cassandra-3.11' into trunk The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGES.txt| 1 + .../apache/cassandra/db/filter/ColumnFilter.java | 9 +++ src/java/org/apache/cassandra/gms/Gossiper.java| 4 +- .../distributed/upgrade/MixedModeReadTest.java | 92 ++ 4 files changed, 104 insertions(+), 2 deletions(-) create mode 100644 test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeReadTest.java - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] 01/01: Merge branch 'cassandra-3.11' into trunk
This is an automated email from the ASF dual-hosted git repository. jwest pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git commit c6ef476278ec783b77faa367e82d9b1ffabc Merge: d4eba9f cf27558 Author: Jordan West AuthorDate: Mon Sep 28 08:15:26 2020 -0700 Merge branch 'cassandra-3.11' into trunk CHANGES.txt| 1 + .../apache/cassandra/db/filter/ColumnFilter.java | 9 +++ src/java/org/apache/cassandra/gms/Gossiper.java| 4 +- .../distributed/upgrade/MixedModeReadTest.java | 92 ++ 4 files changed, 104 insertions(+), 2 deletions(-) diff --cc CHANGES.txt index 0215a71,3b47c33..190eebc --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,23 -1,6 +1,24 @@@ -3.11.9 - * Don't attempt value skipping with mixed version cluster (CASSANDRA-15833) +4.0-beta3 + * Abort repairs when getting a truncation request (CASSANDRA-15854) + * Remove bad assert when getting active compactions for an sstable (CASSANDRA-15457) * Avoid failing compactions with very large partitions (CASSANDRA-15164) + * Prevent NPE in StreamMessage in type lookup (CASSANDRA-16131) + * Avoid invalid state transition exception during incremental repair (CASSANDRA-16067) + * Allow zero padding in timestamp serialization (CASSANDRA-16105) + * Add byte array backed cells (CASSANDRA-15393) + * Correctly handle pending ranges with adjacent range movements (CASSANDRA-14801) + * Avoid adding locahost when streaming trivial ranges (CASSANDRA-16099) + * Add nodetool getfullquerylog (CASSANDRA-15988) + * Fix yaml format and alignment in tpstats (CASSANDRA-11402) + * Avoid trying to keep track of RTs for endpoints we won't write to during read repair (CASSANDRA-16084) + * When compaction gets interrupted, the exception should include the compactionId (CASSANDRA-15954) + * Make Table/Keyspace Metric Names Consistent With Each Other (CASSANDRA-15909) + * Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure (CASSANDRA-15861) + * NPE thrown 
while updating speculative execution time if keyspace is removed during task execution (CASSANDRA-15949) + * Show the progress of data streaming and index build (CASSANDRA-15406) +Merged from 3.11: ++ * Don't attempt value skipping with mixed version cluster (CASSANDRA-15833) + * Use IF NOT EXISTS for index and UDT create statements in snapshot schema files (CASSANDRA-13935) * Make sure LCS handles duplicate sstable added/removed notifications correctly (CASSANDRA-14103) Merged from 3.0: * Add flag to ignore unreplicated keyspaces during repair (CASSANDRA-15160) diff --cc src/java/org/apache/cassandra/db/filter/ColumnFilter.java index 30c3ed7,57ff729..c9d0a70 --- a/src/java/org/apache/cassandra/db/filter/ColumnFilter.java +++ b/src/java/org/apache/cassandra/db/filter/ColumnFilter.java @@@ -26,10 -23,12 +26,11 @@@ import com.google.common.collect.Iterat import com.google.common.collect.SortedSetMultimap; import com.google.common.collect.TreeMultimap; -import org.apache.cassandra.config.CFMetaData; import org.apache.cassandra.cql3.ColumnIdentifier; import org.apache.cassandra.db.*; +import org.apache.cassandra.db.rows.Cell; import org.apache.cassandra.db.rows.CellPath; -import org.apache.cassandra.config.ColumnDefinition; + import org.apache.cassandra.gms.Gossiper; import org.apache.cassandra.io.util.DataInputPlus; import org.apache.cassandra.io.util.DataOutputPlus; import org.apache.cassandra.net.MessagingService; @@@ -443,13 -349,14 +444,17 @@@ public class ColumnFilte { s = TreeMultimap.create(Comparator.naturalOrder(), Comparator.naturalOrder()); for (ColumnSubselection subSelection : subSelections) -s.put(subSelection.column().name, subSelection); +{ +if (fullySelectedComplexColumns == null || !fullySelectedComplexColumns.contains(subSelection.column())) +s.put(subSelection.column().name, subSelection); +} } + // see CASSANDRA-15833 -if (isFetchAll && Gossiper.instance.isAnyNodeOn30()) ++if (isFetchAll && Gossiper.instance.haveMajorVersion3Nodes()) + queried = 
null; + -return new ColumnFilter(isFetchAll, isFetchAll ? metadata.partitionColumns() : null, queried, s); +return new ColumnFilter(isFetchAll, metadata, queried, s); } } @@@ -616,15 -500,10 +621,19 @@@ } } + // See CASSANDRA-15833 + if (version <= MessagingService.VERSION_3014 && isFetchAll) + queried = null; + +// Same concern than in serialize/serializedSize: we should be wary of the change in meaning for isFetchAll. +// If we get a filter with isFetchAll from 3.0/3.x
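The behavioural core of the patch in this commit can be sketched outside Cassandra's real classes as a single guard. The names below (MixedModeGuard, effectiveQueried, haveMajorVersion3Nodes) are illustrative simplifications of the ColumnFilter change, not the actual API:

```java
import java.util.Set;

// Illustrative sketch of the CASSANDRA-15833 guard, not the real
// ColumnFilter code: when any pre-CASSANDRA-10657 (3.0/early-3.x) node may
// participate in a read, the coordinator must stop value skipping and fetch
// every column, so all replicas hash the same data for the digest.
public class MixedModeGuard {
    // Returning null plays the role of "no sub-selection: fetch all columns".
    static Set<String> effectiveQueried(boolean isFetchAll,
                                        boolean haveMajorVersion3Nodes,
                                        Set<String> queried) {
        if (isFetchAll && haveMajorVersion3Nodes)
            return null; // widen to all columns while old nodes are present
        return queried;
    }
}
```

With a guard of this shape, a homogeneous 4.0 cluster keeps the narrower column selection, while a mixed cluster falls back to fetching everything until the last pre-10657 node is upgraded.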
[cassandra] branch cassandra-3.11 updated (0f46c90 -> cf27558)
This is an automated email from the ASF dual-hosted git repository. jwest pushed a change to branch cassandra-3.11 in repository https://gitbox.apache.org/repos/asf/cassandra.git. from 0f46c90 Merge branch 'cassandra-3.0' into cassandra-3.11 add cf27558 Don't attempt value skipping with mixed cluster No new revisions were added by this update. Summary of changes: CHANGES.txt| 1 + .../apache/cassandra/db/filter/ColumnFilter.java | 9 ++ .../distributed/impl/AbstractCluster.java | 2 +- .../impl/DelegatingInvokableInstance.java | 7 +- .../distributed/upgrade/MixedModeReadTest.java | 102 + .../cassandra/db/filter/ColumnFilterTest.java | 3 - 6 files changed, 119 insertions(+), 5 deletions(-) create mode 100644 test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeReadTest.java - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15902) OOM because repair session thread not closed when terminating repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Dejanovski updated CASSANDRA-15902: - Reviewers: Alexander Dejanovski Status: Review In Progress (was: Patch Available) Starting testing and review. > OOM because repair session thread not closed when terminating repair > > > Key: CASSANDRA-15902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15902 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Swen Fuhrmann >Assignee: Swen Fuhrmann >Priority: Normal > Fix For: 3.0.x, 3.11.x > > Attachments: heap-mem-histo.txt, repair-terminated.txt > > > In our cluster, after a while some nodes slowly run out of memory. On > those nodes we observed that Cassandra Reaper terminates repairs with a JMX > call to {{StorageServiceMBean.forceTerminateAllRepairSessions()}} because > the timeout of 30 min is reached. > In the memory heap dump we see a lot of instances of > {{io.netty.util.concurrent.FastThreadLocalThread}} occupying most of the memory: > {noformat} > 119 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by > "sun.misc.Launcher$AppClassLoader @ 0x51a80" occupy 8.445.684.480 (93,96 > %) bytes. 
{noformat} > In the thread dump we see a lot of repair threads: > {noformat} > grep "Repair#" threaddump.txt | wc -l > 50 {noformat} > > The repair jobs are waiting for the validation to finish: > {noformat} > "Repair#152:1" #96170 daemon prio=5 os_prio=0 tid=0x12fc5000 > nid=0x542a waiting on condition [0x7f81ee414000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0007939bcfc8> (a > com.google.common.util.concurrent.AbstractFuture$Sync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > at > com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:137) > at > com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1509) > at org.apache.cassandra.repair.RepairJob.run(RepairJob.java:160) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) > at > org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$13/480490520.run(Unknown > Source) > at java.lang.Thread.run(Thread.java:748) {noformat} > > That's the line where the threads are stuck: > {noformat} > // Wait for validation to complete > Futures.getUnchecked(validations); {noformat} > > The call to 
{{StorageServiceMBean.forceTerminateAllRepairSessions()}} stops > the thread pool executor. It looks like futures which are in progress > will therefore never be completed, so the repair thread waits forever and > is never finished. > > Environment: > Cassandra version: 3.11.4 and 3.11.6 > Cassandra Reaper: 1.4.0 > JVM memory settings: > {noformat} > -Xms11771M -Xmx11771M -XX:+UseG1GC -XX:MaxGCPauseMillis=100 > -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=100M {noformat} > on another cluster with the same issue: > {noformat} > -Xms31744M -Xmx31744M -XX:+UseG1GC -XX:MaxGCPauseMillis=100 > -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=100M {noformat} > Java Runtime: > {noformat} > openjdk version "1.8.0_212" > OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03) > OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode) > {noformat} > > The same issue is described in this comment: > https://issues.apache.org/jira/browse/CASSANDRA-14355?focusedCommentId=16992973&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16992973 > As suggested in the comments I created this new specific ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005)
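The hang described above can be reproduced in miniature: stopping an executor does not complete futures for work that never ran, so an untimed get() parks forever. The following standalone sketch is not Cassandra code; the class and method names are invented for illustration, and a timed get() stands in for the untimed {{Futures.getUnchecked(validations)}} so the stuck state is observable instead of hanging:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Standalone sketch, not Cassandra code: a pending "validation" future is
// never completed once its executor is stopped, mirroring how
// forceTerminateAllRepairSessions() shuts down the pool while RepairJob
// blocks in Futures.getUnchecked(validations).
public class StuckRepairFutureDemo {
    public static boolean validationCompletes() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CompletableFuture<Void> validation = new CompletableFuture<>();
        // Simulate the JMX termination call: the pool dies, but nothing
        // ever calls validation.complete(...).
        pool.shutdownNow();
        try {
            // RepairJob effectively performs an untimed get() here; this
            // is where the leaked Repair# threads park forever.
            validation.get(100, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false; // the future can never complete
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validationCompletes()); // prints "false"
    }
}
```

An untimed get() in place of the timed one above would never return, which is exactly the leaked-thread state visible in the attached thread dump.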
[jira] [Commented] (CASSANDRA-15902) OOM because repair session thread not closed when terminating repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203277#comment-17203277 ] Alexander Dejanovski commented on CASSANDRA-15902: -- Hi [~moczarski], I'm aware of similar reports regarding repair sessions not being cleaned up correctly. I'll happily test this patch and perform a review. -- This message was sent by Atlassian Jira (v8.3.4#803005) -
[jira] [Assigned] (CASSANDRA-16038) Add a getter for InstanceConfig parameters - in-jvm-dtests-api
[ https://issues.apache.org/jira/browse/CASSANDRA-16038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova reassigned CASSANDRA-16038: --- Assignee: (was: Ekaterina Dimitrova) > Add a getter for InstanceConfig parameters - in-jvm-dtests-api > -- > > Key: CASSANDRA-16038 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16038 > Project: Cassandra > Issue Type: Task > Components: Test/dtest/java >Reporter: Ekaterina Dimitrova >Priority: Low > > In order to change the way config will be loaded (for reference > CASSANDRA-15234 ) a getter for the InstanceConfig parameters is needed > CC [~maedhroz] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15299) CASSANDRA-13304 follow-up: improve checksumming and compression in protocol v5-beta
[ https://issues.apache.org/jira/browse/CASSANDRA-15299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198465#comment-17198465 ] Sam Tunnicliffe edited comment on CASSANDRA-15299 at 9/28/20, 11:33 AM: Sorry it's been a while without any visible movement here, but I've just pushed some more commits to address the latest comments from [~ifesdjeen] and [~omichallat]. I've added some tests for protocol negotiation, correct handling of corrupt messages and resource management. {quote} * In CQLMessageHandler#processOneContainedMessage, when we can't acquire capacity and, subsequently, we're not passing the frame further down the line. Should we release the frame in this case, since usually we're releasing the source frame after flush.{quote} Done, though we only need to do this when {{throwOnOverload == true}} as otherwise we process the inflight request before applying backpressure. {quote} * ReusableBuffer is unused.{quote} Ah yes, removed. {quote} * Server has a few unused imports and eventExecutorGroup which is unused.{quote} Cleaned up the imports and removed eventExecutorGroup. {quote} * I'm not sure if we currently handle releasing corrupted frames.{quote} For self-contained frames, there's nothing to do here as no resources have been acquired before the corruption is detected, hence {{CorruptFrame::release}} is a no-op. For frames which are part of a large message, there may be some resource allocated before we discover corruption. This is ok though, as we consume the frame, supplying it to the large message state machine, which handles releasing the buffers of the previous frames (if any). I've added a test for this scenario which includes a check that everything allocated has been freed. {quote}Should we maybe make FrameSet auto-closeable and make sure we always release buffers in finally? I've also made a similar change to processItem which would add item to flushed to make sure it's released. That makes the flushed variable name not quite right though. {quote} I've pulled in some of your change to {{processItem}} as it removes some duplication around polling the queue. I've removed the condition in the {{finally}} of {{ImmediateFlusher}} though, since if we throw from {{processQueue}} then {{doneWork}} will be false anyway, but there may have been some items processed and waiting to flush. The trade-off is calling {{flushWrittenChannels}} even if there's no work to do, but that seems both cheap and unlikely, what do you think? As far as making {{FrameSet}} auto-closeable, I don't think that's feasible, given how they are created and accessed. I've tried to address one of your comments re: the memory management here by adding some comments. They're probably not yet enough, but let me know if they are helpful at all. {quote}We can (and probably should) open a separate ticket that could aim at performance improvements around native protocol. {quote} Agreed, I'd like to do some further perf testing, but the results from your initial tests make a follow-up ticket seem a reasonable option. {quote}I've noticed an issue when the client starts protocol negotiation with an unsupported version. {quote} Fixed, thanks. [~ifesdjeen], I haven't pulled in your burn test or changes to {{SimpleClient}} yet, I'll try to do that next week. I also haven't done any automated renaming yet, I'll hold off on that so as not to add to the cognitive burden until we're pretty much done with review. ||branch||CI|| |[15299-trunk|https://github.com/beobal/cassandra/tree/15299-trunk]|[circle|https://app.circleci.com/pipelines/github/beobal/cassandra?branch=15299-trunk]|
[jira] [Commented] (CASSANDRA-16128) Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o instead of archiving
[ https://issues.apache.org/jira/browse/CASSANDRA-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203141#comment-17203141 ] Michael Semb Wever commented on CASSANDRA-16128: Added compression to all the text files being uploaded to nightlies.a.o in [234186acfc461b75056c251a825ccbb42f4e4fb6|https://github.com/apache/cassandra-builds/commit/234186acfc461b75056c251a825ccbb42f4e4fb6] (thanks to [~Bereng] for the review) > Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o > instead of archiving > --- > > Key: CASSANDRA-16128 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16128 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta > > > Jenkins improvements > 1. Add the cassandra-website job into cassandra_job_dsl.seed.groovy (so we > don't lose it next time the Jenkins master is corrupted) > 2. Print the SHAs of the different git repos used during the build process. > Also store them in the .head files (so the pipeline can print them out too). > 3. Instead of archiving artefacts, ssh them to > https://nightlies.apache.org/cassandra/ > (Disk usage on agents is largely under control, but disk usage on master was > the new problem. The suspicion here is that the Cassandra-*-artifact jobs' artefacts > were the disk usage culprit, though we have no evidence to support it.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[cassandra-builds] branch master updated: In Jenkins, fix printing SHAs in pipeline summary, and compress text artifacts before uploading to nightlies.a.o
This is an automated email from the ASF dual-hosted git repository. mck pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git The following commit(s) were added to refs/heads/master by this push: new 234186a In Jenkins, fix printing SHAs in pipeline summary, and compress text artifacts before uploading to nightlies.a.o 234186a is described below commit 234186acfc461b75056c251a825ccbb42f4e4fb6 Author: Mick Semb Wever AuthorDate: Mon Sep 28 11:55:37 2020 +0200 In Jenkins, fix printing SHAs in pipeline summary, and compress text artifacts before uploading to nightlies.a.o patch by Mick Semb Wever; reviewed by Berenguer Blasi for CASSANDRA-16128 --- jenkins-dsl/cassandra_job_dsl_seed.groovy | 24 ++-- jenkins-dsl/cassandra_pipeline.groovy | 7 --- 2 files changed, 22 insertions(+), 9 deletions(-) diff --git a/jenkins-dsl/cassandra_job_dsl_seed.groovy b/jenkins-dsl/cassandra_job_dsl_seed.groovy index cc40ae3..9a1694a 100644 --- a/jenkins-dsl/cassandra_job_dsl_seed.groovy +++ b/jenkins-dsl/cassandra_job_dsl_seed.groovy @@ -301,7 +301,7 @@ matrixJob('Cassandra-template-dtest-matrix') { publishOverSsh { server('Nightlies') { transferSet { - sourceFiles("**/nosetests.xml,**/test_stdout.txt,**/ccm_logs.tar.xz") + sourceFiles("**/nosetests.xml,**/test_stdout.txt.xz,**/ccm_logs.tar.xz") remoteDirectory("cassandra/\${JOB_NAME}/\${BUILD_NUMBER}/") } } @@ -462,7 +462,10 @@ cassandraBranches.each { node / scm / branches / 'hudson.plugins.git.BranchSpec' / name(branchName) } steps { -shell("./cassandra-builds/build-scripts/cassandra-test.sh ${targetName}") +shell(""" +./cassandra-builds/build-scripts/cassandra-test.sh ${targetName} ; + xz build/test/logs/*.log + """) } } } @@ -496,7 +499,10 @@ cassandraBranches.each { node / scm / branches / 'hudson.plugins.git.BranchSpec' / name(branchName) } steps { -shell("sh ./cassandra-builds/docker/jenkins/jenkinscommand.sh apache ${branchName} https://github.com/apache/cassandra-dtest.git 
master ${buildsRepo} ${buildsBranch} ${dtestDockerImage} ${targetName} \${split}/${splits}") +shell(""" +sh ./cassandra-builds/docker/jenkins/jenkinscommand.sh apache ${branchName} https://github.com/apache/cassandra-dtest.git master ${buildsRepo} ${buildsBranch} ${dtestDockerImage} ${targetName} \${split}/${splits} ; +xz test_stdout.txt +""") } } } @@ -687,7 +693,10 @@ testTargets.each { echo "cassandra-builds at: `git -C cassandra-builds log -1 --pretty=format:'%h %an %ad %s'`" ; echo "Cassandra-devbranch-${targetName} cassandra: `git log -1 --pretty=format:'%h %an %ad %s'`" > Cassandra-devbranch-${targetName}.head ; """) -shell("./cassandra-builds/build-scripts/cassandra-test.sh ${targetName}") +shell(""" +./cassandra-builds/build-scripts/cassandra-test.sh ${targetName} ; +xz build/test/logs/*.log + """) } publishers { publishOverSsh { @@ -793,13 +802,16 @@ dtestTargets.each { echo "cassandra-builds at: `git -C cassandra-builds log -1 --pretty=format:'%h %an %ad %s'`" ; echo "Cassandra-devbranch-${targetName} cassandra: `git log -1 --pretty=format:'%h %an %ad %s'`" > Cassandra-devbranch-${targetName}.head ; """) -shell("sh ./cassandra-builds/docker/jenkins/jenkinscommand.sh \$REPO \$BRANCH \$DTEST_REPO \$DTEST_BRANCH ${buildsRepo} ${buildsBranch} \$DOCKER_IMAGE ${targetName} \${split}/${splits}") +shell(""" +sh ./cassandra-builds/docker/jenkins/jenkinscommand.sh \$REPO \$BRANCH \$DTEST_REPO \$DTEST_BRANCH ${buildsRepo} ${buildsBranch} \$DOCKER_IMAGE ${targetName} \${split}/${splits} ; +xz test_stdout.txt + """) } publishers { publishOverSsh { server('Nightlies') { transferSet { -sourceFiles("**/test_stdout.txt,**/ccm_logs.tar.xz") +sourceFiles("**/test_stdout.txt.xz,**/ccm_logs.tar.xz") remoteDirectory("cassandra/\${JOB_NAME}/\${BUILD_NUMBER}/") } } diff --git a/jenkins-dsl/cassandra_pipeline.groovy b/jenkins-dsl/cassandra_pipeline.groovy index d
[jira] [Assigned] (CASSANDRA-16048) Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and Value Columns
[ https://issues.apache.org/jira/browse/CASSANDRA-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov reassigned CASSANDRA-16048: --- Assignee: Jordan West (was: Alex Petrov) > Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and > Value Columns > -- > > Key: CASSANDRA-16048 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16048 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL >Reporter: Jordan West >Assignee: Jordan West >Priority: Normal > Fix For: 4.0-beta > > > Some compact storage tables, specifically those where the user has defined > both at least one clustering and the value column, can be safely handled in > 4.0 because besides the DENSE flag they are not materially different post 3.0 > and there is no visible change to the user facing schema after dropping > compact storage. We can detect this case and allow these tables to silently > drop the DENSE flag while still throwing a start-up error for COMPACT STORAGE > tables that don’t meet the criteria. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15811) Improve DROP COMPACT STORAGE
[ https://issues.apache.org/jira/browse/CASSANDRA-15811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov reassigned CASSANDRA-15811: --- Assignee: Marcus Eriksson (was: Alex Petrov) > Improve DROP COMPACT STORAGE > > > Key: CASSANDRA-15811 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15811 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Schema >Reporter: Alex Petrov >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.x, 3.11.x > > > DROP COMPACT STORAGE was introduced in CASSANDRA-10857 as one of the steps to > deprecate Thrift. However, the current semantics of dropping the compact storage > flag from tables reveal several columns that are usually empty (column1 and > value in the non-dense case, value for dense columns, and a column with an empty > name for super column families). Showing these columns can confuse > application developers, especially ones that have never used thrift and/or > made writes that assumed presence of those fields, and used compact storage > in 3.x because it has “compact” in the name. > There’s not much we can do in the super column family case, especially > considering there’s no way to create a super column family using CQL, but we > can improve the dense and non-dense cases. We can scan sstables and make sure > there are no signs of thrift writes in them, and if all sstables conform to > this rule, we can not only drop the flag, but also drop the columns that are > supposed to be hidden. However, this is both not very user-friendly and > probably not worth the development effort. > An alternative to scanning is to add {{FORCE DROP COMPACT}} syntax (or > something similar) that would just drop the columns unconditionally. It is likely > that people who were using compact storage with thrift know they were doing > that, so they'll usually use "regular" {{DROP COMPACT}}, without force, which > will simply reveal the columns as it does right now. 
> Since fixing CASSANDRA-15778, and allowing an EmptyType column to actually
> hold data[*], required removing empty-type validation, properly handling
> compact storage starts making more sense. We'll solve it either by not
> having the columns at all (and hence not caring about their values), or by
> keeping both the values _and_ the data, not requiring validation in this
> case. The EmptyType field will have to be handled differently, though.
> [*] it is possible to end up with sstables upgraded from 2.x, or written in
> 3.x before CASSANDRA-15373, which means not every upgraded 2.x or 3.x
> cluster is guaranteed to have empty values in this column, and this
> behaviour, even if undesired, might be relied upon by people.
> The open question is: CASSANDRA-15373 adds validation to EmptyType that
> disallows any non-empty value to be written to it, but we already allow
> creating such a table via CQL and still writing data into it with Thrift.
> This seems to have been unintended, but it might have become a feature
> people rely on. If we simply backport 15373 to 2.2 and 2.1, we'll change and
> break that behaviour. Given that no one complained in 3.0 and 3.11, this
> scenario is unlikely, though.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
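To make the "revealed columns" problem above concrete, here is a minimal CQL sketch. The keyspace and table names are hypothetical, and the {{FORCE DROP COMPACT}} statement at the end is only the syntax proposed in this ticket, not something any released Cassandra version accepts:

```sql
-- Non-dense compact table as described above; column1/value are the
-- internal default names Cassandra uses for compact storage tables.
CREATE TABLE ks.compact_example (
    key text PRIMARY KEY,
    val int
) WITH COMPACT STORAGE;

-- Current behaviour: dropping the flag reveals the usually-empty internal
-- columns (column1, value) in the user-facing schema.
ALTER TABLE ks.compact_example DROP COMPACT STORAGE;

-- Proposed (NOT implemented) syntax from this ticket, which would drop the
-- hidden columns unconditionally instead of revealing them:
-- ALTER TABLE ks.compact_example FORCE DROP COMPACT STORAGE;
```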
[jira] [Assigned] (CASSANDRA-15811) Improve DROP COMPACT STORAGE
     [ https://issues.apache.org/jira/browse/CASSANDRA-15811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Petrov reassigned CASSANDRA-15811:
---------------------------------------

    Assignee: Alex Petrov  (was: Marcus Eriksson)
[jira] [Assigned] (CASSANDRA-16048) Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and Value Columns
     [ https://issues.apache.org/jira/browse/CASSANDRA-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Petrov reassigned CASSANDRA-16048:
---------------------------------------

    Assignee: Alex Petrov  (was: Jordan West)

> Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and
> Value Columns
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16048
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16048
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Legacy/CQL
>            Reporter: Jordan West
>            Assignee: Alex Petrov
>            Priority: Normal
>             Fix For: 4.0-beta
>
> Some compact storage tables, specifically those where the user has defined
> both at least one clustering column and the value column, can be safely
> handled in 4.0: besides the DENSE flag they are not materially different
> post 3.0, and there is no visible change to the user-facing schema after
> dropping compact storage. We can detect this case and allow these tables to
> silently drop the DENSE flag, while still throwing a start-up error for
> COMPACT STORAGE tables that don't meet the criteria.
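A sketch of the distinction this ticket draws, using hypothetical table names (the names and keyspace are illustrative, not from the ticket):

```sql
-- "Safe" case per CASSANDRA-16048: the user defined a clustering column (ck)
-- and the value column (v), so apart from the DENSE flag this table is not
-- materially different from a regular CQL table post 3.0. Under the proposed
-- change, 4.0 could silently drop the DENSE flag for it at start-up.
CREATE TABLE ks.safe_dense (
    pk text,
    ck int,
    v text,
    PRIMARY KEY (pk, ck)
) WITH COMPACT STORAGE;

-- Not covered: a compact table without user-defined clustering and value
-- columns would still fail start-up in 4.0 until the operator runs
-- ALTER TABLE ... DROP COMPACT STORAGE on an earlier version.
CREATE TABLE ks.unsafe_compact (
    key text PRIMARY KEY,
    val int
) WITH COMPACT STORAGE;
```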
[cassandra-dtest] branch master updated: fix bad rebase, remove remove_perf_disable_shared_mem
This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git

The following commit(s) were added to refs/heads/master by this push:
     new 5890b5f  fix bad rebase, remove remove_perf_disable_shared_mem
5890b5f is described below

commit 5890b5fd76b6a0f5dd3dc9b464b5aa9fb592c7bd
Author: Marcus Eriksson
AuthorDate: Mon Sep 28 09:38:36 2020 +0200

    fix bad rebase, remove remove_perf_disable_shared_mem
---
 repair_tests/repair_test.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/repair_tests/repair_test.py b/repair_tests/repair_test.py
index 4b8f037..a33cd2f 100644
--- a/repair_tests/repair_test.py
+++ b/repair_tests/repair_test.py
@@ -15,7 +15,7 @@ from ccmlib.node import ToolError
 from dtest import FlakyRetryPolicy, Tester, create_ks, create_cf
 from tools.data import insert_c1c2, query_c1c2
-from tools.jmxutils import JolokiaAgent, make_mbean, remove_perf_disable_shared_mem
+from tools.jmxutils import JolokiaAgent, make_mbean

 since = pytest.mark.since
 logger = logging.getLogger(__name__)
@@ -948,7 +948,6 @@ class TestRepair(BaseRepairTest):
         cluster = self.cluster
         cluster.populate([3])
         node1, node2, node3 = cluster.nodelist()
-        remove_perf_disable_shared_mem(node1)  # for jmx
         cluster.start(wait_for_binary_proto=True)
         self.fixture_dtest_setup.ignore_log_patterns.extend(["Nothing to repair for"])
         session = self.patient_cql_connection(node1)