[jira] [Updated] (CASSANDRA-15229) BufferPool Regression

2020-09-28 Thread Zhao Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yang updated CASSANDRA-15229:
--
Source Control Link: https://github.com/apache/cassandra/pull/535
Test and Documentation Plan: 
added unit test and tested performance.

 

https://app.circleci.com/pipelines/github/jasonstack/cassandra/313/workflows/7b5205e2-21ee-46e8-931c-5b658cf49be5

  was:added unit test and tested performance.


> BufferPool Regression
> -
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict Elliott Smith
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
> Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png, 
> 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, 
> 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png, 
> 15229-unsafe.png
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}} instances, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.
> -
> Since CASSANDRA-5863, the chunk cache has been implemented on top of the buffer 
> pool. When a local pool is full, one of its chunks is evicted and is only put 
> back to the global pool when all buffers in the evicted chunk are released. But 
> because of the chunk cache, buffers can be held for long periods of time, 
> preventing the evicted chunk from being recycled even though most of its space 
> is free.
> Two things need to be improved:
> 1. An evicted chunk with free space should be recycled to the global pool, even 
> if it's not fully free. This is doable in 4.0.
> 2. Reduce the fragmentation caused by differing buffer sizes. With #1, a 
> partially freed chunk will be available for allocation, but the "holes" in it 
> have different sizes. We should consider allocating fixed buffer sizes, which 
> is unlikely to fit in 4.0.
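
For illustration only, the recirculation idea in improvement #1 could be sketched as 
below. This is not Cassandra's actual BufferPool code; the class, the slot 
granularity, and the 50% threshold are hypothetical, chosen just to show an 
evicted-but-partially-free chunk being returned to a global pool.

{code:java}
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Queue;

/**
 * Illustrative sketch only: models recycling a partially freed, evicted chunk
 * back to a global pool. Names and thresholds are hypothetical and do not
 * mirror Cassandra's BufferPool internals.
 */
public class RecirculatingChunk
{
    private static final int SLOT_SIZE = 4096;          // assumed allocation granularity
    private final BitSet usedSlots;
    private final int slotCount;
    private boolean evicted;

    public RecirculatingChunk(int chunkSizeBytes)
    {
        this.slotCount = chunkSizeBytes / SLOT_SIZE;
        this.usedSlots = new BitSet(slotCount);
    }

    public int allocateSlot()
    {
        int free = usedSlots.nextClearBit(0);
        if (free >= slotCount)
            return -1;                                   // chunk is full
        usedSlots.set(free);
        return free;
    }

    public void freeSlot(int slot)
    {
        usedSlots.clear(slot);
    }

    public void markEvicted()
    {
        evicted = true;
    }

    /** Improvement #1: an evicted chunk with "enough" free space is reusable,
     *  not only a fully freed one. The 50% threshold is an arbitrary example. */
    public boolean recyclableToGlobalPool()
    {
        return evicted && usedSlots.cardinality() <= slotCount / 2;
    }

    public static void main(String[] args)
    {
        Queue<RecirculatingChunk> globalPool = new ArrayDeque<>();
        RecirculatingChunk chunk = new RecirculatingChunk(64 * 1024);

        int a = chunk.allocateSlot();
        int b = chunk.allocateSlot();
        chunk.markEvicted();                             // local pool evicted the chunk
        chunk.freeSlot(a);                               // chunk cache released one buffer

        if (chunk.recyclableToGlobalPool())              // partially free: recirculate
            globalPool.add(chunk);

        System.out.println("recirculated=" + !globalPool.isEmpty()
                           + ", slot still held=" + b);
    }
}
{code}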






[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-16146:

Reviewers: Blake Eggleston

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.
> 
> This can happen in the following scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}






[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203554#comment-17203554
 ] 

Yifan Cai commented on CASSANDRA-16146:
---

Updated the error message. 

Unit and jvm dtests passed. One test failure ({{test_closing_connections - 
thrift_hsha_test.TestThriftHSHA}}) from dtest; it does not look related and 
existed before this patch. 

 
[https://app.circleci.com/pipelines/github/yifan-c/cassandra/111/workflows/ae86acf0-b416-4a42-92e8-cb845d5393a7]

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.
> 
> This can happen in the following scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}






[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol

2020-09-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-16139 at 9/28/20, 9:23 PM:
--

The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning 
what. Overlapping range movements even as a concept is bad, and unique to the 
token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is a more correct approach.  Defining the membership of 
each key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing (and automatic rebalancing under asymmetric resource 
consumption) is something that I would hope to be built atop these features, 
but could in principle be built atop a token ring, just super painfully and 
probably with many bugs (like all of our range movements up to today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.


was (Author: benedict):
The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning 
what. Overlapping range movements even as a concept is bad, and unique to the 
token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is a more correct approach.  Defining the membership of 
each key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing is something that I would hope to be built atop these 
features, but could in principle be built atop a token ring, just super 
painfully and probably with many bugs (like all of our range movements up to 
today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.

> Safe Ring Membership Protocol
> -
>
> Key: CASSANDRA-16139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip, Cluster/Membership
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership 
> updates in Cassandra. This protocol will enable reliable concurrent ring 
> membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring 
> structure (such as joining, leaving the ring or changing token assignments) 
> must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the 
> change is compatible with their local version of the ring, and if so, they 
> promise to accept the change proposed by the initiator. The ring members do 
> not accept proposals if they had already promised to honor another proposal, 
> to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the 
> cohort, it commits the proposal by broadcasting the proposed ring delta via 
> gossip. Upon receiving these changes, the other members of the cohort apply 
> the delta to their local version of the ring and broadcast their new computed 
> version via gossip. The initiator concludes the ring membership update 
> operation by checking that all nodes agree on the new proposed version.
> *ABORT:* A proposal not accepted by all members of the cohort may be 
> automatically aborted by the initiator or manually via a command line tool.

[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203523#comment-17203523
 ] 

Brandon Williams commented on CASSANDRA-16146:
--

+1, with a minor bikeshed: it would be nice if the error mentioned shutting the 
node down instead, if that is really what they want to do.

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.
> 
> This can happen in the following scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}






[cassandra-builds] branch master updated: ninja-fix: test-cdc and test-compression keep logs under a subdirectory

2020-09-28 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git


The following commit(s) were added to refs/heads/master by this push:
 new 875d918  ninja-fix: test-cdc and test-compression keep logs under a 
subdirectory
875d918 is described below

commit 875d91841e5614291c5b6278edbc07a4f3174ba3
Author: Mick Semb Wever 
AuthorDate: Mon Sep 28 22:58:08 2020 +0200

ninja-fix: test-cdc and test-compression keep logs under a subdirectory
---
 jenkins-dsl/cassandra_job_dsl_seed.groovy | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/jenkins-dsl/cassandra_job_dsl_seed.groovy 
b/jenkins-dsl/cassandra_job_dsl_seed.groovy
index 9a1694a..7684f32 100644
--- a/jenkins-dsl/cassandra_job_dsl_seed.groovy
+++ b/jenkins-dsl/cassandra_job_dsl_seed.groovy
@@ -464,7 +464,7 @@ cassandraBranches.each {
 steps {
 shell("""
 ./cassandra-builds/build-scripts/cassandra-test.sh 
${targetName} ;
- xz build/test/logs/*.log
+ find build/test/logs -type f -name "*.log" | 
xargs xz -qq
   """)
 }
 }
@@ -695,7 +695,7 @@ testTargets.each {
   """)
 shell("""
 ./cassandra-builds/build-scripts/cassandra-test.sh 
${targetName} ;
-xz build/test/logs/*.log
+find build/test/logs -type f -name "*.log" | xargs xz -qq
   """)
 }
 publishers {





[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol

2020-09-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-16139 at 9/28/20, 8:49 PM:
--

The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning 
what. Overlapping range movements even as a concept is bad, and unique to the 
token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is a more correct approach.  Defining the membership of 
each key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing is something that I would hope to be built atop these 
features, but could in principle be built atop a token ring, just super 
painfully and probably with many bugs (like all of our range movements up to 
today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.


was (Author: benedict):
The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning 
what. Overlapping range movements even as a concept is bad, and unique to the 
token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is the correct approach.  Defining the membership of each 
key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing is something that I would hope to be built atop these 
features, but could in principle be built atop a token ring, just super 
painfully and probably with many bugs (like all of our range movements up to 
today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.

> Safe Ring Membership Protocol
> -
>
> Key: CASSANDRA-16139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip, Cluster/Membership
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership 
> updates in Cassandra. This protocol will enable reliable concurrent ring 
> membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring 
> structure (such as joining, leaving the ring or changing token assignments) 
> must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the 
> change is compatible with their local version of the ring, and if so, they 
> promise to accept the change proposed by the initiator. The ring members do 
> not accept proposals if they had already promised to honor another proposal, 
> to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the 
> cohort, it commits the proposal by broadcasting the proposed ring delta via 
> gossip. Upon receiving these changes, the other members of the cohort apply 
> the delta to their local version of the ring and broadcast their new computed 
> version via gossip. The initiator concludes the ring membership update 
> operation by checking that all nodes agree on the new proposed version.
> *ABORT:* A proposal not accepted by all members of the cohort may be 
> automatically aborted by the initiator or manually via a command line tool.
> For simplicity the protocol above requires that all nodes are up during the 
> proposal step, but it should be possible to optimize it to require only a 
> quorum of nodes up to perform ring changes.

[jira] [Commented] (CASSANDRA-16139) Safe Ring Membership Protocol

2020-09-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517
 ] 

Benedict Elliott Smith commented on CASSANDRA-16139:


The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning the 
replica. Overlapping range movements even as a concept is bad, and unique to 
the token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is the correct approach.  Defining the membership of each 
key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing is something that I would hope to be built atop these 
features, and could in principle be built atop a token ring, just super 
painfully and probably with many bugs (like all of our range movements up to 
today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.

> Safe Ring Membership Protocol
> -
>
> Key: CASSANDRA-16139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip, Cluster/Membership
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership 
> updates in Cassandra. This protocol will enable reliable concurrent ring 
> membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring 
> structure (such as joining, leaving the ring or changing token assignments) 
> must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the 
> change is compatible with their local version of the ring, and if so, they 
> promise to accept the change proposed by the initiator. The ring members do 
> not accept proposals if they had already promised to honor another proposal, 
> to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the 
> cohort, it commits the proposal by broadcasting the proposed ring delta via 
> gossip. Upon receiving these changes, the other members of the cohort apply 
> the delta to their local version of the ring and broadcast their new computed 
> version via gossip. The initiator concludes the ring membership update 
> operation by checking that all nodes agree on the new proposed version.
> *ABORT:* A proposal not accepted by all members of the cohort may be 
> automatically aborted by the initiator or manually via a command line tool.
> For simplicity the protocol above requires that all nodes are up during the 
> proposal step, but it should be possible to optimize it to require only a 
> quorum of nodes up to perform ring changes.
> A python pseudo-code of the protocol is available 
> [here|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-safe_ring_membership-py].
> With the abstraction above it becomes very simple to perform ring change 
> operations:
>  * 
> [bootstrap|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-bootstrap-py]
>  * 
> [replace|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-replace-py]
>  * 
> [move|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-move-py]
>  * [remove 
> node|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-remove_node-py]
>  * [remove 
> token|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-remove_token-py]
> h4. Token Ring Data Structure
> The token ring data structure can be seen as a [Delta State Replicated Data 
> Type|https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#State-based_CRDTs]
>  (Delta CRDT) containing the state of all (virtual) nodes in the cluster 
> where updates to the ring are operations on this CRDT.
> Each member publishes its latest local accepted state (delta state) via 
> gossip and the union of all delta states comprise the glo

[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol

2020-09-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-16139 at 9/28/20, 8:49 PM:
--

The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning 
what. Overlapping range movements even as a concept is bad, and unique to the 
token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is the correct approach.  Defining the membership of each 
key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing is something that I would hope to be built atop these 
features, but could in principle be built atop a token ring, just super 
painfully and probably with many bugs (like all of our range movements up to 
today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.


was (Author: benedict):
The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning the 
replica. Overlapping range movements even as a concept is bad, and unique to 
the token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is the correct approach.  Defining the membership of each 
key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing is something that I would hope to be built atop these 
features, but could in principle be built atop a token ring, just super 
painfully and probably with many bugs (like all of our range movements up to 
today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.

> Safe Ring Membership Protocol
> -
>
> Key: CASSANDRA-16139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip, Cluster/Membership
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership 
> updates in Cassandra. This protocol will enable reliable concurrent ring 
> membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring 
> structure (such as joining, leaving the ring or changing token assignments) 
> must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the 
> change is compatible with their local version of the ring, and if so, they 
> promise to accept the change proposed by the initiator. The ring members do 
> not accept proposals if they had already promised to honor another proposal, 
> to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the 
> cohort, it commits the proposal by broadcasting the proposed ring delta via 
> gossip. Upon receiving these changes, the other members of the cohort apply 
> the delta to their local version of the ring and broadcast their new computed 
> version via gossip. The initiator concludes the ring membership update 
> operation by checking that all nodes agree on the new proposed version.
> *ABORT:* A proposal not accepted by all members of the cohort may be 
> automatically aborted by the initiator or manually via a command line tool.
> For simplicity the protocol above requires that all nodes are up during the 
> proposal step, but it should be possible to optimize it to require only a 
> quorum of nodes up to perform ring changes.

[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol

2020-09-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203517#comment-17203517
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-16139 at 9/28/20, 8:48 PM:
--

The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning the 
replica. Overlapping range movements even as a concept is bad, and unique to 
the token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is the correct approach.  Defining the membership of each 
key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing is something that I would hope to be built atop these 
features, but could in principle be built atop a token ring, just super 
painfully and probably with many bugs (like all of our range movements up to 
today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.


was (Author: benedict):
The token ring is problematic for us implementors; wrap around is a minor 
headache, but much more important is how on earth you safely perform multiple 
overlapping range movements - it's basically impossible, as you don't know 
which will necessarily complete, and so do not know who will end up owning the 
replica. Overlapping range movements even as a concept is bad, and unique to 
the token ring conceptualisation.

Bounded ranges of ownership - whether as tokens or keys - that nodes are 
explicitly assigned to is the correct approach.  Defining the membership of each 
key/token range explicitly prevents these complicated scenarios - a node 
joining can only possibly replicate these keys, and nothing any other node is 
doing will modify that.  These can be defined per keyspace to permit greater 
replication flexibility, and importantly safe modifications to replication 
factor without new machinery.

Automatic healing is something that I would hope to be built atop these 
features, and could in principle be built atop a token ring, just super 
painfully and probably with many bugs (like all of our range movements up to 
today).

Note that this work necessarily overlaps with safe schema changes, the two are 
intertwined.  I'll leave other thoughts on the topic for another day - some 
time in the next 2-3 months I will publish my white paper on the topic.

> Safe Ring Membership Protocol
> -
>
> Key: CASSANDRA-16139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip, Cluster/Membership
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership 
> updates in Cassandra. This protocol will enable reliable concurrent ring 
> membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring 
> structure (such as joining, leaving the ring or changing token assignments) 
> must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the 
> change is compatible with their local version of the ring, and if so, they 
> promise to accept the change proposed by the initiator. The ring members do 
> not accept proposals if they had already promised to honor another proposal, 
> to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the 
> cohort, it commits the proposal by broadcasting the proposed ring delta via 
> gossip. Upon receiving these changes, the other members of the cohort apply 
> the delta to their local version of the ring and broadcast their new computed 
> version via gossip. The initiator concludes the ring membership update 
> operation by checking that all nodes agree on the new proposed version.
> *ABORT:* A proposal not accepted by all members of the cohort may be 
> automatically aborted by the initiator or manually via a command line tool.
> For simplicity the protocol above requires that all nodes are up during the 
> proposal step, but it should be possible to optimize it to require only a 
> quorum of nodes up to perform ring changes.

[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16146:
--
Test and Documentation Plan: ci, jvm dtest
 Status: Patch Available  (was: Open)

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.
> 
> This can happen in the following scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}






[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203515#comment-17203515
 ] 

Yifan Cai commented on CASSANDRA-16146:
---

PR (to 3.0): [https://github.com/apache/cassandra/pull/760]

CI: 
[https://app.circleci.com/pipelines/github/yifan-c/cassandra/108/workflows/d4fc0b93-111e-4cbc-bd2c-c68e1a72fe09]

The patch simply prevents calling stopGossip and startGossip when not in the 
normal mode. 
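
For illustration, the guard described above might look roughly like the following 
sketch; it is not the actual patch, and the class, method names, mode enum and 
error message are all assumptions.

{code:java}
/**
 * Illustrative sketch of the guard discussed above; not the actual patch.
 * Names here are assumptions, not Cassandra's real StorageService API.
 */
public class GossipGuardSketch
{
    enum Mode { STARTING, JOINING, NORMAL, LEAVING, DECOMMISSIONED }

    private Mode operationMode = Mode.JOINING;   // e.g. a node whose bootstrap failed
    private boolean gossipRunning = true;

    /** Refuse to toggle gossip unless the node has reached NORMAL. */
    private void checkModeIsNormal()
    {
        if (operationMode != Mode.NORMAL)
            throw new IllegalStateException(
                "Refusing to start/stop gossip while in " + operationMode +
                "; restart or shut down the node instead.");
    }

    public void stopGossiping()
    {
        checkModeIsNormal();
        gossipRunning = false;
    }

    public void startGossiping()
    {
        checkModeIsNormal();
        // Without the guard, this path would blindly re-announce tokens as NORMAL.
        gossipRunning = true;
    }

    public static void main(String[] args)
    {
        GossipGuardSketch node = new GossipGuardSketch();
        try
        {
            node.stopGossiping();                // rejected: node never reached NORMAL
        }
        catch (IllegalStateException e)
        {
            System.out.println(e.getMessage());
        }
        System.out.println("gossip still running: " + node.gossipRunning);
    }
}
{code}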

I will prepare the patches for the other branches (3.11 and trunk) once this one 
looks good. 

cc: [~bdeggleston]

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.
> 
> This can happen in the following scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}






[jira] [Commented] (CASSANDRA-16139) Safe Ring Membership Protocol

2020-09-28 Thread Jeff Jirsa (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203496#comment-17203496
 ] 

Jeff Jirsa commented on CASSANDRA-16139:


> Would you care to elaborate why? My high level goal here is to ensure we can 
> reliably add/remove/replace nodes to a cluster, and this seems to be 
> reasonably doable with consistent hashing as far as I understand. I'd love to 
> explore alternatives but I'd be interested in learning what requirements are 
> not fulfilled by the current architecture.


Because it's a concept borrowed from the 2007 paper  and never reconsidered and 
it has ALL SORTS of unpleasant failure realities, and we can do better in 2021.

For example: why, when a single machine fails in a datacenter, and the rest of 
the hosts detect the failure, does the database do nothing to re-replicate that 
data, instead forcing a user to come along and run some magic commands that 
literally only a handful of people actually understand, when the database COULD 
do it all on its own without humans in the loop? Why would we rely on humans 
assigning tokens, anyway, or static token assignment, when the database can see 
imbalance, and could potentially deal with imbalance on its own? The whole 
existence of vnodes should have been a red flag that tokens as a distribution 
mechanism were flawed.  

Tokens are a simplistic concept that are easy to reason about but horrible to 
use. If we're rewriting it, please take the time to research how other 
distributed databases move data around when there's a hot shard or a lost 
shard, because it's a meaningful and critical missing part of Cassandra.  

> Safe Ring Membership Protocol
> -
>
> Key: CASSANDRA-16139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip, Cluster/Membership
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership 
> updates in Cassandra. This protocol will enable reliable concurrent ring 
> membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring 
> structure (such as joining, leaving the ring or changing token assignments) 
> must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the 
> change is compatible with their local version of the ring, and if so, they 
> promise to accept the change proposed by the initiator. The ring members do 
> not accept proposals if they had already promised to honor another proposal, 
> to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the 
> cohort, it commits the proposal by broadcasting the proposed ring delta via 
> gossip. Upon receiving these changes, the other members of the cohort apply 
> the delta to their local version of the ring and broadcast their new computed 
> version via gossip. The initiator concludes the ring membership update 
> operation by checking that all nodes agree on the new proposed version.
> *ABORT:* A proposal not accepted by all members of the cohort may be 
> automatically aborted by the initiator or manually via a command line tool.
> For simplicity the protocol above requires that all nodes are up during the 
> proposal step, but it should be possible to optimize it to require only a 
> quorum of nodes up to perform ring changes.
> A python pseudo-code of the protocol is available 
> [here|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-safe_ring_membership-py].
> With the abstraction above it becomes very simple to perform ring change 
> operations:
>  * 
> [bootstrap|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-bootstrap-py]
>  * 
> [replace|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-replace-py]
>  * 
> [move|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-move-py]
>  * [remove 
> node|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-remove_node-py]
>  * [remove 
> token|https://gist.github.com/pauloricardomg/1930c8cf645aa63387a57bb57f79a0f7#file-remove_token-py]
> h4. Token Ring Data Structure
> The token ring data structure can be seen as a [Delta State Replicated Data 
> Type|https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#State-based_CRDTs]
>  (Delta CRDT) containing the state of all (virtual) nodes in the cluster 
> where updates to the ring are operations on this CRDT.
> Each member publishes its latest local accepted state 

[jira] [Comment Edited] (CASSANDRA-16139) Safe Ring Membership Protocol

2020-09-28 Thread Jeff Jirsa (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203496#comment-17203496
 ] 

Jeff Jirsa edited comment on CASSANDRA-16139 at 9/28/20, 8:09 PM:
--

{quote}Would you care to elaborate why? My high level goal here is to ensure we 
can reliably add/remove/replace nodes to a cluster, and this seems to be 
reasonably doable with consistent hashing as far as I understand. I'd love to 
explore alternatives but I'd be interested in learning what requirements are 
not fulfilled by the current architecture.{quote}


Because it's a concept borrowed from the 2007 paper  and never reconsidered and 
it has ALL SORTS of unpleasant failure realities, and we can do better in 2021.

For example: why, when a single machine fails in a datacenter, and the rest of 
the hosts detect the failure, does the database do nothing to re-replicate that 
data, instead forcing a user to come along and run some magic commands that 
literally only a handful of people actually understand, when the database COULD 
do it all on its own without humans in the loop? Why would we rely on humans 
assigning tokens, anyway, or static token assignment, when the database can see 
imbalance, and could potentially deal with imbalance on its own? The whole 
existence of vnodes should have been a red flag that tokens as a distribution 
mechanism were flawed.  

Tokens are a simplistic concept that are easy to reason about but horrible to 
use. If we're rewriting it, please take the time to research how other 
distributed databases move data around when there's a hot shard or a lost 
shard, because it's a meaningful and critical missing part of Cassandra.  


was (Author: jjirsa):
> Would you care to elaborate why? My high level goal here is to ensure we can 
> reliably add/remove/replace nodes to a cluster, and this seems to be 
> reasonably doable with consistent hashing as far as I understand. I'd love to 
> explore alternatives but I'd be interested in learning what requirements are 
> not fulfilled by the current architecture.


Because it's a concept borrowed from the 2007 paper  and never reconsidered and 
it has ALL SORTS of unpleasant failure realities, and we can do better in 2021.

For example: why, when a single machine fails in a datacenter, and the rest of 
the hosts detect the failure, does the database do nothing to re-replicate that 
data, instead forcing a user to come along and run some magic commands that 
literally only a handful of people actually understand, when the database COULD 
do it all on its own without humans in the loop? Why would we rely on humans 
assigning tokens, anyway, or static token assignment, when the database can see 
imbalance, and could potentially deal with imbalance on its own? The whole 
existence of vnodes should have been a red flag that tokens as a distribution 
mechanism were flawed.  

Tokens are a simplistic concept that are easy to reason about but horrible to 
use. If we're rewriting it, please take the time to research how other 
distributed databases move data around when there's a hot shard or a lost 
shard, because it's a meaningful and critical missing part of Cassandra.  

> Safe Ring Membership Protocol
> -
>
> Key: CASSANDRA-16139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip, Cluster/Membership
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership 
> updates in Cassandra. This protocol will enable reliable concurrent ring 
> membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring 
> structure (such as joining, leaving the ring or changing token assignments) 
> must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the 
> change is compatible with their local version of the ring, and if so, they 
> promise to accept the change proposed by the initiator. The ring members do 
> not accept proposals if they had already promised to honor another proposal, 
> to avoid conflicting ring membership updates.
> *COMMIT:* Once the initiator receives acceptances from all the nodes in the 
> cohort, it commits the proposal by broadcasting the proposed ring delta via 
> gossip. Upon receiving these changes, the other members of the cohort apply 
> the delta to their local version of the ring and broadcast their new computed 
> version via gossip. The initiator concludes the ring membership update 
> operation by checking that all nodes agree on the new proposed version.

[jira] [Commented] (CASSANDRA-16139) Safe Ring Membership Protocol

2020-09-28 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203474#comment-17203474
 ] 

Paulo Motta commented on CASSANDRA-16139:
-

Thanks for your comments Benedict and Jeff! Please find follow-up below.
{quote}Any replacement should not be built upon Gossip (either in its current 
or an improved form)
{quote}
The proposed protocol uses gossip in two steps:
 a) before PROPOSE, to validate the initiator has the same ring version as the 
cohort;
 b) on COMMIT, to broadcast the ring membership update.

Step a) is an optimization that prevents the initiator from proposing a new 
ring version if there's a current disagreement. Step b) adds resilience against 
initiator failure during commit at the expense of latency, but can easily be 
made synchronous to address that.

I may be failing to see what's problematic about gossip here so I'll wait for 
your justification on why we should avoid it.
{quote}Being able to operate with a quorum is probably a lot harder than with 
every node's involvement, so I'd suggest thinking about that sooner than later
{quote}
That's a valid point. I will focus on making this work with all nodes for now, 
since that's a fair assumption/requirement, and if we see necessity we can get 
back to this later.
{quote}How do you guarantee that all participants in an operation have a 
consistent view of the ring for the purposes of that operation?
{quote}
Content-based versioning. For example:
 * Node A ring (version: *b710*):
{code:json}
{ "previous": "5f36", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N", "C": 
["3:N", "7:N"]} }{code}

 * Node B ring (version: *b710*):
{code:json}
{ "previous": "5f36", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N", "C": 
["3:N", "7:N"]} }{code}

 * Node C ring (version: *b710*):
{code:json}
{ "previous": "5f36", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N", "C": 
["3:N", "7:N"]} }{code}

Suppose now nodes "D" and "E" want to join the ring with the same tokens "4" 
and "8" - only one of them should succeed.

Each of them will read the current ring version *b710*. Each node will generate 
the following "proposed" ring version:
 * Node D proposed ring (version: *6f69*):
{code:json}
{ "previous": "b710", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N", "C": 
["3:N", "7:N"], "D": ["4:J", "8:J"]} } {code}

 * Node E proposed ring (version: *8f88*):
{code:json}
{ "previous": "b710", "vnodes": {"A": ["1:N", "5:N"], "B": ["2:N", "6:N", "C": 
["3:N", "7:N"], "E": ["4:J", "8:J"]} } {code}

They will then each send a PROPOSE message with the following parameters to the 
cohort:
 * NODE D:
{code:java}
PROPOSE(current_version="b710", proposed_version="6f69"){code}

 * NODE E:
{code:java}
PROPOSE(current_version="b710", proposed_version="8f88"){code}

In this situation, any one of the following three outcomes can happen:
 * Neither E nor D gets a PROMISE from all nodes - no proposal succeeds
 * NODE D is able to get a promise from nodes A, B and C for version *6f69*.
 * NODE E is able to get a promise from nodes A, B and C for version *8f88*.

Now let's say NODE D's proposal succeeds and the ring updates its version to 
*6f69*. Any proposal from NODE E referencing the previous ring version *b710* 
will be rejected by the cohort, so node E will be forced to update its version 
before submitting a new proposal.
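
A rough sketch of the content-based versioning check described above, assuming a 
version is simply a truncated hash over the previous version plus the vnode map; 
the helper names and the short hex version format are illustrative only, not part 
of the proposal's code.

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch of content-based ring versioning; names are assumptions. */
public class RingVersionSketch
{
    /** Version = truncated hash of (previous version + canonically ordered vnode map). */
    static String version(String previous, Map<String, String> vnodes) throws Exception
    {
        String canonical = previous + new TreeMap<>(vnodes);     // stable ordering
        byte[] digest = MessageDigest.getInstance("SHA-256")
                                     .digest(canonical.getBytes(StandardCharsets.UTF_8));
        return String.format("%02x%02x", digest[0], digest[1]);  // short form, like "b710"
    }

    /** Cohort-side check: only promise if the proposal is based on our current version. */
    static boolean promise(String localVersion, String proposalCurrentVersion)
    {
        return localVersion.equals(proposalCurrentVersion);
    }

    public static void main(String[] args) throws Exception
    {
        Map<String, String> ring = Map.of("A", "1:N,5:N", "B", "2:N,6:N", "C", "3:N,7:N");
        String current = version("5f36", ring);

        // Node D proposes a delta built on `current`; the cohort promises.
        System.out.println("promise D? " + promise(current, current));         // true

        // After D commits, the local version changes, so a proposal from E that
        // still references the old base version is rejected.
        Map<String, String> ringWithD = new TreeMap<>(ring);
        ringWithD.put("D", "4:J,8:J");
        String afterCommit = version(current, ringWithD);
        System.out.println("promise E (based on " + current + ")? "
                           + promise(afterCommit, current));                   // false: stale base
    }
}
{code}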
{quote}If we're rewriting all of the membership/ownership code, we should 
definitely be thinking about a world that isn't based on tokens and hash tables.
{quote}
Would you care to elaborate why? My high level goal here is to ensure we can 
reliably add/remove/replace nodes to a cluster, and this seems to be reasonably 
doable with consistent hashing as far as I understand. I'd love to explore 
alternatives but I'd be interested in learning what requirements are not 
fulfilled by the current architecture.

> Safe Ring Membership Protocol
> -
>
> Key: CASSANDRA-16139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Gossip, Cluster/Membership
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> This ticket presents a practical protocol for performing safe ring membership 
> updates in Cassandra. This protocol will enable reliable concurrent ring 
> membership updates.
> The proposed protocol is composed of the following macro-steps:
> *PROPOSE:* An initiator node wanting to make updates to the current ring 
> structure (such as joining, leaving the ring or changing token assignments) 
> must propose the change to the other members of the ring (cohort).
> *ACCEPT:* Upon receiving a proposal the other ring members determine if the 
> change is compatible with their local version of the ring, and if so, they 
> promise to accept the change proposed by the initiator.

[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203429#comment-17203429
 ] 

Yifan Cai commented on CASSANDRA-16146:
---

Thanks [~brandon.williams] for commenting so quickly. 

bq. perhaps just not allow this outside of NORMAL

Adding a pre-check before both starting and stopping gossip to make sure the 
current mode is NORMAL sounds good to me.

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.
> 
> This can happen in the following scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}






[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16146:
--
Fix Version/s: 4.0-beta3
   3.11.x
   3.0.x

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.
> 
> This can happen in the following scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}






[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203409#comment-17203409
 ] 

Brandon Williams commented on CASSANDRA-16146:
--

bq. Operator runs nodetool to stop and re-start gossip. The gossip state gets 
flipped to NORMAL

We should perhaps just not allow this outside of NORMAL, since in that case you 
probably want to just stop the node instead.

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.
> 
> This can happen in the following scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16146:
--
Description: 
At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
{{NORMAL}} blindly. Therefore, re-enabling gossip (stopping and starting gossip) 
overrides the actual gossip state.
  
This can happen in the following scenario:
# Bootstrap fails. The gossip state remains in {{BOOT}} / {{JOINING}} and code 
execution exits {{StorageService#initServer}}.
# The operator runs nodetool to stop and re-start gossip. The gossip state gets 
flipped to {{NORMAL}}.

  was:
{{At high level, {{StorageService#setGossipTokens}} set the gossip state to 
NORMAL blindly. Therefore, re-enabling gossip (stop and start gossip) overrides 
the actual gossip state.}}
 
{color:#24292e}It could happen in the below scenario.{color}
{color:#24292e} {color} # Bootstrap failed. The gossip state remains in 
{{BOOT}} / {{JOINING}} and code execution exits StorageService#initServer.
 # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
flipped to {{NORMAL}}


> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stopping and starting gossip) 
> overrides the actual gossip state.
>   
> This can happen in the following scenario:
> # Bootstrap fails. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits {{StorageService#initServer}}.
> # The operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16146:
--
 Bug Category: Parent values: Correctness(12982)Level 1 values: 
Consistency(12989)
   Complexity: Low Hanging Fruit
Discovered By: Code Inspection
 Severity: Low
   Status: Open  (was: Triage Needed)

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> {{At high level, {{StorageService#setGossipTokens}} set the gossip state to 
> NORMAL blindly. Therefore, re-enabling gossip (stop and start gossip) 
> overrides the actual gossip state.}}
>  
> {color:#24292e}It could happen in the below scenario.{color}
> {color:#24292e} {color} # Bootstrap failed. The gossip state remains in 
> {{BOOT}} / {{JOINING}} and code execution exits StorageService#initServer.
>  # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-28 Thread Yifan Cai (Jira)
Yifan Cai created CASSANDRA-16146:
-

 Summary: Node state incorrectly set to NORMAL after nodetool 
disablegossip and enablegossip during bootstrap
 Key: CASSANDRA-16146
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
 Project: Cassandra
  Issue Type: Bug
  Components: Cluster/Gossip
Reporter: Yifan Cai
Assignee: Yifan Cai


{{At high level, {{StorageService#setGossipTokens}} set the gossip state to 
NORMAL blindly. Therefore, re-enabling gossip (stop and start gossip) overrides 
the actual gossip state.}}
 
{color:#24292e}It could happen in the below scenario.{color}
{color:#24292e} {color} # Bootstrap failed. The gossip state remains in 
{{BOOT}} / {{JOINING}} and code execution exits StorageService#initServer.
 # Operator runs nodetool to stop and re-start gossip. The gossip state gets 
flipped to {{NORMAL}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD

2020-09-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203377#comment-17203377
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-14793 at 9/28/20, 5:08 PM:
--

Perhaps shut the server down for all writers and compaction, and serve only 
reads?  I've not got a strong opinion about it though - it's hard to run safely in 
this context, so it would seem fine to just admit defeat in this case.  This does 
strengthen the argument for replicating system keyspace data to multiple disks, 
as discussed above.


was (Author: benedict):
Perhaps shut the server down for all writers and compaction, and serve only 
reads?

> Improve system table handling when losing a disk when using JBOD
> 
>
> Key: CASSANDRA-14793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14793
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Marcus Eriksson
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should improve the way we handle disk failures when losing a disk in a 
> JBOD setup.
>  One way could be to pin the system tables to a special data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD

2020-09-28 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203377#comment-17203377
 ] 

Benedict Elliott Smith commented on CASSANDRA-14793:


Perhaps shut the server down for all writers and compaction, and serve only 
reads?

> Improve system table handling when losing a disk when using JBOD
> 
>
> Key: CASSANDRA-14793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14793
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Marcus Eriksson
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should improve the way we handle disk failures when losing a disk in a 
> JBOD setup.
>  One way could be to pin the system tables to a special data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD

2020-09-28 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203307#comment-17203307
 ] 

Benjamin Lerer commented on CASSANDRA-14793:


[~marcuse], [~benedict] How do you think we should handle the case where the 
{{disk_failure_policy}} is {{best_effort}} and the disk containing the system 
data is marked as {{unreadable}} or {{unwritable}}?
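
To make the trade-off concrete, here is a toy, self-contained sketch of the "keep 
serving reads, stop writers and compaction" option discussed above. None of these 
names are Cassandra APIs and this is not a proposed patch, only an illustration of 
the behaviour under discussion.

{noformat}
// Toy sketch of the "serve reads only" option for a lost system-data disk under
// best_effort; all names are hypothetical, not Cassandra APIs.
enum DiskState { HEALTHY, UNREADABLE, UNWRITABLE }

public class SystemDiskPolicyToy
{
    private volatile boolean writesEnabled = true;
    private volatile boolean compactionEnabled = true;

    void onSystemDiskFailure(DiskState state)
    {
        if (state == DiskState.HEALTHY)
            return;
        // Stop anything that needs to write system data; keep serving reads.
        writesEnabled = false;
        compactionEnabled = false;
    }

    boolean acceptWrite()      { return writesEnabled; }
    boolean acceptCompaction() { return compactionEnabled; }
    boolean acceptRead()       { return true; }

    public static void main(String[] args)
    {
        SystemDiskPolicyToy policy = new SystemDiskPolicyToy();
        policy.onSystemDiskFailure(DiskState.UNWRITABLE);
        System.out.println("writes=" + policy.acceptWrite() + " reads=" + policy.acceptRead());
    }
}
{noformat}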

> Improve system table handling when losing a disk when using JBOD
> 
>
> Key: CASSANDRA-14793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14793
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Marcus Eriksson
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should improve the way we handle disk failures when losing a disk in a 
> JBOD setup.
>  One way could be to pin the system tables to a special data directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657

2020-09-28 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-15833:

Authors: Jacek Lewandowski, Jordan West  (was: Jacek 
Lewandowski)
  Since Version: 3.11.9
Source Control Link: 
https://github.com/apache/cassandra/commit/cf27558b1442e75e17e47071ecf92d1b3e5a0e36
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed as 
https://github.com/apache/cassandra/commit/cf27558b1442e75e17e47071ecf92d1b3e5a0e36.
 Thanks for the input and review everyone!

> Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
> 
>
> Key: CASSANDRA-15833
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15833
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Jacek Lewandowski
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
> Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch
>
>
> CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. 
> This results in a digest mismatch when querying an incomplete set of columns 
> from a table at a consistency level that requires reaching instances running 
> pre-CASSANDRA-10657 code from nodes that include CASSANDRA-10657 (it was 
> introduced in Cassandra 3.4). 
> The fix is to bring back the previous behaviour until there are no instances 
> running a pre-CASSANDRA-10657 version. 
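
The trunk diff later in this digest shows the actual guard in {{ColumnFilter}} 
({{if (isFetchAll && Gossiper.instance.haveMajorVersion3Nodes()) queried = null;}}). 
Purely as a self-contained toy of that idea, under the simplifying assumption that 
"mixed cluster" just means "any 3.x node is still present":

{noformat}
// Toy illustration only (not Cassandra code): while any major-version-3 node is
// present, drop the "queried" restriction so the old fetch-all behaviour is kept
// and digests stay comparable across versions.
import java.util.List;

public class MixedModeColumnFilterToy
{
    static boolean haveMajorVersion3Nodes(List<String> releaseVersions)
    {
        return releaseVersions.stream().anyMatch(v -> v.startsWith("3."));
    }

    static List<String> effectiveQueried(boolean isFetchAll, List<String> queried, List<String> clusterVersions)
    {
        if (isFetchAll && haveMajorVersion3Nodes(clusterVersions))
            return null; // fall back to fetching and digesting all columns
        return queried;
    }

    public static void main(String[] args)
    {
        System.out.println(effectiveQueried(true, List.of("a", "b"), List.of("3.11.6", "4.0-beta3"))); // null
        System.out.println(effectiveQueried(true, List.of("a", "b"), List.of("4.0-beta3", "4.0-beta3"))); // [a, b]
    }
}
{noformat}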



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra] branch trunk updated (d4eba9f -> c6ef476)

2020-09-28 Thread jwest
This is an automated email from the ASF dual-hosted git repository.

jwest pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from d4eba9f  Abort repairs when getting a truncation request
 add cf27558  Don't attempt value skipping with mixed cluster
 new c6ef476  Merge branch 'cassandra-3.11' into trunk

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|  1 +
 .../apache/cassandra/db/filter/ColumnFilter.java   |  9 +++
 src/java/org/apache/cassandra/gms/Gossiper.java|  4 +-
 .../distributed/upgrade/MixedModeReadTest.java | 92 ++
 4 files changed, 104 insertions(+), 2 deletions(-)
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeReadTest.java


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra] 01/01: Merge branch 'cassandra-3.11' into trunk

2020-09-28 Thread jwest
This is an automated email from the ASF dual-hosted git repository.

jwest pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit c6ef476278ec783b77faa367e82d9b1ffabc
Merge: d4eba9f cf27558
Author: Jordan West 
AuthorDate: Mon Sep 28 08:15:26 2020 -0700

Merge branch 'cassandra-3.11' into trunk

 CHANGES.txt|  1 +
 .../apache/cassandra/db/filter/ColumnFilter.java   |  9 +++
 src/java/org/apache/cassandra/gms/Gossiper.java|  4 +-
 .../distributed/upgrade/MixedModeReadTest.java | 92 ++
 4 files changed, 104 insertions(+), 2 deletions(-)

diff --cc CHANGES.txt
index 0215a71,3b47c33..190eebc
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,23 -1,6 +1,24 @@@
 -3.11.9
 - * Don't attempt value skipping with mixed version cluster (CASSANDRA-15833)
 +4.0-beta3
 + * Abort repairs when getting a truncation request (CASSANDRA-15854)
 + * Remove bad assert when getting active compactions for an sstable 
(CASSANDRA-15457)
   * Avoid failing compactions with very large partitions (CASSANDRA-15164)
 + * Prevent NPE in StreamMessage in type lookup (CASSANDRA-16131)
 + * Avoid invalid state transition exception during incremental repair 
(CASSANDRA-16067)
 + * Allow zero padding in timestamp serialization (CASSANDRA-16105)
 + * Add byte array backed cells (CASSANDRA-15393)
 + * Correctly handle pending ranges with adjacent range movements 
(CASSANDRA-14801)
 + * Avoid adding locahost when streaming trivial ranges (CASSANDRA-16099)
 + * Add nodetool getfullquerylog (CASSANDRA-15988)
 + * Fix yaml format and alignment in tpstats (CASSANDRA-11402)
 + * Avoid trying to keep track of RTs for endpoints we won't write to during 
read repair (CASSANDRA-16084)
 + * When compaction gets interrupted, the exception should include the 
compactionId (CASSANDRA-15954)
 + * Make Table/Keyspace Metric Names Consistent With Each Other 
(CASSANDRA-15909)
 + * Mutating sstable component may race with entire-sstable-streaming(ZCS) 
causing checksum validation failure (CASSANDRA-15861)
 + * NPE thrown while updating speculative execution time if keyspace is 
removed during task execution (CASSANDRA-15949)
 + * Show the progress of data streaming and index build (CASSANDRA-15406)
 +Merged from 3.11:
++ * Don't attempt value skipping with mixed version cluster (CASSANDRA-15833)
 + * Use IF NOT EXISTS for index and UDT create statements in snapshot schema 
files (CASSANDRA-13935)
   * Make sure LCS handles duplicate sstable added/removed notifications 
correctly (CASSANDRA-14103)
  Merged from 3.0:
   * Add flag to ignore unreplicated keyspaces during repair (CASSANDRA-15160)
diff --cc src/java/org/apache/cassandra/db/filter/ColumnFilter.java
index 30c3ed7,57ff729..c9d0a70
--- a/src/java/org/apache/cassandra/db/filter/ColumnFilter.java
+++ b/src/java/org/apache/cassandra/db/filter/ColumnFilter.java
@@@ -26,10 -23,12 +26,11 @@@ import com.google.common.collect.Iterat
  import com.google.common.collect.SortedSetMultimap;
  import com.google.common.collect.TreeMultimap;
  
 -import org.apache.cassandra.config.CFMetaData;
  import org.apache.cassandra.cql3.ColumnIdentifier;
  import org.apache.cassandra.db.*;
 +import org.apache.cassandra.db.rows.Cell;
  import org.apache.cassandra.db.rows.CellPath;
 -import org.apache.cassandra.config.ColumnDefinition;
+ import org.apache.cassandra.gms.Gossiper;
  import org.apache.cassandra.io.util.DataInputPlus;
  import org.apache.cassandra.io.util.DataOutputPlus;
  import org.apache.cassandra.net.MessagingService;
@@@ -443,13 -349,14 +444,17 @@@ public class ColumnFilte
  {
  s = 
TreeMultimap.create(Comparator.naturalOrder(), 
Comparator.naturalOrder());
  for (ColumnSubselection subSelection : subSelections)
 -s.put(subSelection.column().name, subSelection);
 +{
 +if (fullySelectedComplexColumns == null || 
!fullySelectedComplexColumns.contains(subSelection.column()))
 +s.put(subSelection.column().name, subSelection);
 +}
  }
  
+ // see CASSANDRA-15833
 -if (isFetchAll && Gossiper.instance.isAnyNodeOn30())
++if (isFetchAll && Gossiper.instance.haveMajorVersion3Nodes())
+ queried = null;
+ 
 -return new ColumnFilter(isFetchAll, isFetchAll ? 
metadata.partitionColumns() : null, queried, s);
 +return new ColumnFilter(isFetchAll, metadata, queried, s);
  }
  }
  
@@@ -616,15 -500,10 +621,19 @@@
  }
  }
  
+ // See CASSANDRA-15833
+ if (version <= MessagingService.VERSION_3014 && isFetchAll)
+ queried = null;
+ 
 +// Same concern than in serialize/serializedSize: we should be 
wary of the change in meaning for isFetchAll.
 +// If we get a filter with isFetchAll from 3.0/3.x

[cassandra] branch cassandra-3.11 updated (0f46c90 -> cf27558)

2020-09-28 Thread jwest
This is an automated email from the ASF dual-hosted git repository.

jwest pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 0f46c90  Merge branch 'cassandra-3.0' into cassandra-3.11
 add cf27558  Don't attempt value skipping with mixed cluster

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt|   1 +
 .../apache/cassandra/db/filter/ColumnFilter.java   |   9 ++
 .../distributed/impl/AbstractCluster.java  |   2 +-
 .../impl/DelegatingInvokableInstance.java  |   7 +-
 .../distributed/upgrade/MixedModeReadTest.java | 102 +
 .../cassandra/db/filter/ColumnFilterTest.java  |   3 -
 6 files changed, 119 insertions(+), 5 deletions(-)
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/upgrade/MixedModeReadTest.java


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15902) OOM because repair session thread not closed when terminating repair

2020-09-28 Thread Alexander Dejanovski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Dejanovski updated CASSANDRA-15902:
-
Reviewers: Alexander Dejanovski, Alexander Dejanovski  (was: Alexander 
Dejanovski)
   Alexander Dejanovski, Alexander Dejanovski
   Status: Review In Progress  (was: Patch Available)

Starting testing and review.

> OOM because repair session thread not closed when terminating repair
> 
>
> Key: CASSANDRA-15902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15902
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Swen Fuhrmann
>Assignee: Swen Fuhrmann
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
> Attachments: heap-mem-histo.txt, repair-terminated.txt
>
>
> In our cluster, after a while some nodes slowly run out of memory. On 
> those nodes we observed that Cassandra Reaper terminates repairs with a JMX 
> call to {{StorageServiceMBean.forceTerminateAllRepairSessions()}} because 
> they reach the timeout of 30 min.
> In the heap dump we see that a lot of instances of 
> {{io.netty.util.concurrent.FastThreadLocalThread}} occupy most of the memory:
> {noformat}
> 119 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by 
> "sun.misc.Launcher$AppClassLoader @ 0x51a80" occupy 8.445.684.480 (93,96 
> %) bytes. {noformat}
> In the thread dump we see a lot of repair threads:
> {noformat}
> grep "Repair#" threaddump.txt | wc -l
>   50 {noformat}
>  
> The repair jobs are waiting for the validation to finish:
> {noformat}
> "Repair#152:1" #96170 daemon prio=5 os_prio=0 tid=0x12fc5000 
> nid=0x542a waiting on condition [0x7f81ee414000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007939bcfc8> (a 
> com.google.common.util.concurrent.AbstractFuture$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285)
> at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
> at 
> com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:137)
> at 
> com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1509)
> at org.apache.cassandra.repair.RepairJob.run(RepairJob.java:160)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$13/480490520.run(Unknown
>  Source)
> at java.lang.Thread.run(Thread.java:748) {noformat}
>  
> That's the line where the threads get stuck:
> {noformat}
> // Wait for validation to complete
> Futures.getUnchecked(validations); {noformat}
>  
> The call to {{StorageServiceMBean.forceTerminateAllRepairSessions()}} stops 
> the thread pool executor. It looks like futures which are in progress 
> will therefore never complete, so the repair thread waits forever and 
> never finishes.
>  
> Environment:
> Cassandra version: 3.11.4 and 3.11.6
> Cassandra Reaper: 1.4.0
> JVM memory settings:
> {noformat}
> -Xms11771M -Xmx11771M -XX:+UseG1GC -XX:MaxGCPauseMillis=100 
> -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=100M {noformat}
> on another cluster with same issue:
> {noformat}
> -Xms31744M -Xmx31744M -XX:+UseG1GC -XX:MaxGCPauseMillis=100 
> -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=100M {noformat}
> Java Runtime:
> {noformat}
> openjdk version "1.8.0_212"
> OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
> OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode) 
> {noformat}
>  
> The same issue described in this comment: 
> https://issues.apache.org/jira/browse/CASSANDRA-14355?focusedCommentId=16992973&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16992973
> As suggested in the comments I created this new specific ticket.
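
As a self-contained illustration of the failure mode described above (a toy, not the 
actual Cassandra code paths): if an executor is shut down while a submitted task is 
still queued, that task's future never completes, so an untimed {{get()}} (which is 
what {{Futures.getUnchecked(validations)}} amounts to) blocks forever, whereas a 
timed wait at least lets the caller move on.

{noformat}
import java.util.concurrent.*;

// Toy reproduction of the hang: a queued task is dropped by shutdownNow(), its
// future is never completed or cancelled, and a plain get() on it would wait forever.
public class RepairWaitToy
{
    public static void main(String[] args) throws Exception
    {
        ExecutorService validationExecutor = Executors.newSingleThreadExecutor();
        // Occupy the single worker thread so the next task stays queued.
        validationExecutor.submit(() -> { Thread.sleep(60_000); return null; });
        Future<?> pendingValidation = validationExecutor.submit(() -> null);

        // Analogous to force-terminating repair sessions: queued tasks are dropped.
        validationExecutor.shutdownNow();

        try
        {
            // A plain pendingValidation.get() would hang here forever.
            pendingValidation.get(2, TimeUnit.SECONDS);
        }
        catch (TimeoutException e)
        {
            System.out.println("validation future never completed after shutdownNow()");
        }
    }
}
{noformat}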



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CASSANDRA-15902) OOM because repair session thread not closed when terminating repair

2020-09-28 Thread Alexander Dejanovski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203277#comment-17203277
 ] 

Alexander Dejanovski commented on CASSANDRA-15902:
--

Hi [~moczarski],

I'm aware of similar reports regarding repair sessions not being cleaned up 
correctly.
I'll happily test this patch and perform a review.

> OOM because repair session thread not closed when terminating repair
> 
>
> Key: CASSANDRA-15902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15902
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Swen Fuhrmann
>Assignee: Swen Fuhrmann
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
> Attachments: heap-mem-histo.txt, repair-terminated.txt
>
>
> In our cluster, after a while some nodes slowly run out of memory. On 
> those nodes we observed that Cassandra Reaper terminates repairs with a JMX 
> call to {{StorageServiceMBean.forceTerminateAllRepairSessions()}} because 
> they reach the timeout of 30 min.
> In the heap dump we see that a lot of instances of 
> {{io.netty.util.concurrent.FastThreadLocalThread}} occupy most of the memory:
> {noformat}
> 119 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by 
> "sun.misc.Launcher$AppClassLoader @ 0x51a80" occupy 8.445.684.480 (93,96 
> %) bytes. {noformat}
> In the thread dump we see a lot of repair threads:
> {noformat}
> grep "Repair#" threaddump.txt | wc -l
>   50 {noformat}
>  
> The repair jobs are waiting for the validation to finish:
> {noformat}
> "Repair#152:1" #96170 daemon prio=5 os_prio=0 tid=0x12fc5000 
> nid=0x542a waiting on condition [0x7f81ee414000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007939bcfc8> (a 
> com.google.common.util.concurrent.AbstractFuture$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285)
> at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
> at 
> com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:137)
> at 
> com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1509)
> at org.apache.cassandra.repair.RepairJob.run(RepairJob.java:160)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$13/480490520.run(Unknown
>  Source)
> at java.lang.Thread.run(Thread.java:748) {noformat}
>  
> That's the line where the threads get stuck:
> {noformat}
> // Wait for validation to complete
> Futures.getUnchecked(validations); {noformat}
>  
> The call to {{StorageServiceMBean.forceTerminateAllRepairSessions()}} stops 
> the thread pool executor. It looks like futures which are in progress 
> will therefore never complete, so the repair thread waits forever and 
> never finishes.
>  
> Environment:
> Cassandra version: 3.11.4 and 3.11.6
> Cassandra Reaper: 1.4.0
> JVM memory settings:
> {noformat}
> -Xms11771M -Xmx11771M -XX:+UseG1GC -XX:MaxGCPauseMillis=100 
> -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=100M {noformat}
> on another cluster with same issue:
> {noformat}
> -Xms31744M -Xmx31744M -XX:+UseG1GC -XX:MaxGCPauseMillis=100 
> -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=100M {noformat}
> Java Runtime:
> {noformat}
> openjdk version "1.8.0_212"
> OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
> OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode) 
> {noformat}
>  
> The same issue described in this comment: 
> https://issues.apache.org/jira/browse/CASSANDRA-14355?focusedCommentId=16992973&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16992973
> As suggested in the comments I created this new specific ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-

[jira] [Assigned] (CASSANDRA-16038) Add a getter for InstanceConfig parameters - in-jvm-dtests-api

2020-09-28 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova reassigned CASSANDRA-16038:
---

Assignee: (was: Ekaterina Dimitrova)

> Add a getter for InstanceConfig parameters - in-jvm-dtests-api
> --
>
> Key: CASSANDRA-16038
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16038
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java
>Reporter: Ekaterina Dimitrova
>Priority: Low
>
> In order to change the way the config will be loaded (for reference, 
> CASSANDRA-15234), a getter for the InstanceConfig parameters is needed. 
> CC [~maedhroz]
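
For what it's worth, a trivial sketch of the kind of getter being asked for; the 
class and field names here are hypothetical and not the actual in-jvm dtest API.

{noformat}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; not the real InstanceConfig.
public class InstanceConfigToy
{
    private final Map<String, Object> params = new HashMap<>();

    public InstanceConfigToy set(String name, Object value)
    {
        params.put(name, value);
        return this;
    }

    // The requested addition: read access to a single parameter...
    public Object get(String name)
    {
        return params.get(name);
    }

    // ...and, if useful, an immutable view of all of them.
    public Map<String, Object> getParams()
    {
        return Collections.unmodifiableMap(params);
    }

    public static void main(String[] args)
    {
        InstanceConfigToy config = new InstanceConfigToy().set("num_tokens", 16);
        System.out.println(config.get("num_tokens")); // 16
    }
}
{noformat}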



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15299) CASSANDRA-13304 follow-up: improve checksumming and compression in protocol v5-beta

2020-09-28 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198465#comment-17198465
 ] 

Sam Tunnicliffe edited comment on CASSANDRA-15299 at 9/28/20, 11:33 AM:


Sorry it's been a while without any visible movement here, but I've just pushed 
some more commits to address the latest comments from [~ifesdjeen] and 
[~omichallat]. I've added some tests for protocol negotiation, correct handling 
of corrupt messages and resource management.
{quote} * In CQLMessageHandler#processOneContainedMessage, when we can't 
acquire capacity we subsequently don't pass the frame further down 
the line. Should we release the frame in this case, since usually we're 
releasing the source frame after flush?{quote}
Done, though we only need to do this when {{throwOnOverload == true}} as 
otherwise we process the inflight request before applying backpressure.
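
To make the resource-management point concrete, a generic toy of the pattern (all 
type and method names below are made up, not the netty or Cassandra API): when 
capacity cannot be acquired and the message will not travel further down the 
pipeline, the source frame has to be released at that point, because the usual 
release-after-flush will never happen.

{noformat}
// Generic toy of the pattern discussed above; all names are made up.
public class FrameReleaseToy
{
    static final class Frame
    {
        int refCount = 1;
        void release() { refCount--; }
    }

    private long capacity = 1024;
    private final boolean throwOnOverload = true;

    boolean tryAcquireCapacity(long bytes)
    {
        if (bytes > capacity)
            return false;
        capacity -= bytes;
        return true;
    }

    boolean processOneContainedMessage(Frame frame, long size)
    {
        if (!tryAcquireCapacity(size))
        {
            if (throwOnOverload)
            {
                // Nobody downstream will ever see this frame, so release it here
                // instead of relying on the usual release-after-flush.
                frame.release();
                return false;
            }
            // Otherwise: process the in-flight request first, then apply backpressure.
        }
        // ... hand the frame downstream; it is released after the response is flushed.
        return true;
    }

    public static void main(String[] args)
    {
        FrameReleaseToy handler = new FrameReleaseToy();
        Frame oversized = new Frame();
        handler.processOneContainedMessage(oversized, 4096);
        System.out.println("refCount after overload: " + oversized.refCount); // 0
    }
}
{noformat}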
{quote} * ReusableBuffer is unused.{quote}
Ah yes, removed
{quote} * Server has a few unused imports and eventExecutorGroup which is 
unused.{quote}
Cleaned up the imports and removed eventExecutorGroup
{quote} * I'm not sure if we currently handle releasing corrupted frames.{quote}
For self-contained frames, there's nothing to do here as no resources have been 
acquired before the corruption is detected, hence {{CorruptFrame::release}} is 
a no-op. For frames which are part of a large message, there may be some 
resource allocated before we discover corruption. This is ok though, as we 
consume the frame, supplying it to the large message state machine, which 
handles releasing the buffers of the previous frames (if any). I've added a 
test for this scenario which includes a check that everything allocated has 
been freed.
{quote}Should we maybe make FrameSet auto-closeable and make sure we always 
release buffers in finally? I've also made a similar change to processItem 
which would add item to flushed to make sure it's released. That makes flushed 
variable name not quite right though.
{quote}
I've pulled in some of your change to {{processItem}} as it removes some 
duplication around polling the queue. I've removed the condition in the 
{{finally}} of {{ImmediateFlusher}} though, since if we throw from 
{{processQueue}} then {{doneWork}} will be false anyway, but there may have 
been some items processed and waiting to flush. The trade-off is calling 
{{flushWrittenChannels}} even if there's no work to do, but that seems both 
cheap and unlikely; what do you think?
 As far as making {{FrameSet}} auto-closeable, I don't think that's feasible, 
given how they are created and accessed.  I've tried to address one of your 
comments re: the memory management here by adding some comments. They're 
probably not yet enough, but let me know if they are helpful at all.
{quote}We can (and probably should) open a separate ticket that could aim at 
performance improvements around native protocol.
{quote}
Agreed, I'd like to do some further perf testing, but the results from your 
initial tests makes a follow-up ticket seem a reasonable option.
{quote}I've noticed an issue when the client starts protocol negotiation with 
an unsupported version.
{quote}
Fixed, thanks.

[~ifesdjeen], I haven't pulled in your burn test or changes to {{SimpleClient}} 
yet, I'll try to do that next week. I also haven't done any automated renaming 
yet, I'll hold off on that so as not to add to the cognitive burden until we're 
pretty much done with review.
||branch||CI||
|[15299-trunk|https://github.com/beobal/cassandra/tree/15299-trunk]|[circle|https://app.circleci.com/pipelines/github/beobal/cassandra?branch=15299-trunk]|


was (Author: beobal):
Sorry it's been a while without any visible movement here, but I've just pushed 
some more commits to address the latest comments from [~ifesdjeen] and 
[~omichallat]. I've added some tests for protocol negotiation, correct handling 
of corrupt messages and resource management.
 
{quote} * In CQLMessageHandler#processOneContainedMessage, when we can't 
acquire capacity and, subsequently, we're not passing the frame further down 
the line. Shouold we release the frame in this case, since usually we're 
releasing the source frame after flush.
{quote}

Done, though we only need to do this when {{throwOnOverload == true}} as 
otherwise we process the inflight request before applying backpressure.

{quote} * ReusableBuffer is unused. {quote}

Ah yes, removed

{quote} * Server has a few unused imports and eventExecutorGroup which is 
unused.{quote}

Cleaned up the imports and removed eventExecutorGroup

{quote} * I'm not sure if we currently handle releasing corrupted frames.{quote}

For self-contained frames, there's nothing to do here as no resources have been 
acquired before the corruption is detected, hence {{CorruptFrame::release}} is 
a no-op. For frames which are part of a large message,

[jira] [Commented] (CASSANDRA-16128) Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o instead of archiving

2020-09-28 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203141#comment-17203141
 ] 

Michael Semb Wever commented on CASSANDRA-16128:


Added compression to all the text files being uploaded to nightlies.a.o in 
[234186acfc461b75056c251a825ccbb42f4e4fb6|https://github.com/apache/cassandra-builds/commit/234186acfc461b75056c251a825ccbb42f4e4fb6]
 (thanks to [~Bereng] for the review)

> Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o 
> instead of archiving
> ---
>
> Key: CASSANDRA-16128
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16128
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>
> Jenkins improvements
> 1. Add the cassandra-website job into cassandra_job_dsl.seed.groovy (so we 
> don't lose it next time the Jenkins master is corrupted)
> 2. Print the SHAs of the different git repos used during the build process. 
> Also store them in the .head files (so the pipeline can print them out too).
> 3. Instead of archiving artefacts, ssh them to 
> https://nightlies.apache.org/cassandra/
> (Disk usage on agents is largely under control, but disk usage on master was 
> the new problem. The suspicion here is that the Cassandra-*-artifact's artefacts 
> were the disk usage culprit, though we have no evidence to support it.)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra-builds] branch master updated: In Jenkins, fix printing SHAs in pipeline summary, and compress text artifacts before uploading to nightlies.a.o

2020-09-28 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git


The following commit(s) were added to refs/heads/master by this push:
 new 234186a  In Jenkins, fix printing SHAs in pipeline summary, and 
compress text artifacts before uploading to nightlies.a.o
234186a is described below

commit 234186acfc461b75056c251a825ccbb42f4e4fb6
Author: Mick Semb Wever 
AuthorDate: Mon Sep 28 11:55:37 2020 +0200

In Jenkins, fix printing SHAs in pipeline summary, and compress text 
artifacts before uploading to nightlies.a.o

 patch by Mick Semb Wever; reviewed by Berenguer Blasi for CASSANDRA-16128
---
 jenkins-dsl/cassandra_job_dsl_seed.groovy | 24 ++--
 jenkins-dsl/cassandra_pipeline.groovy |  7 ---
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/jenkins-dsl/cassandra_job_dsl_seed.groovy 
b/jenkins-dsl/cassandra_job_dsl_seed.groovy
index cc40ae3..9a1694a 100644
--- a/jenkins-dsl/cassandra_job_dsl_seed.groovy
+++ b/jenkins-dsl/cassandra_job_dsl_seed.groovy
@@ -301,7 +301,7 @@ matrixJob('Cassandra-template-dtest-matrix') {
 publishOverSsh {
 server('Nightlies') {
 transferSet {
-
sourceFiles("**/nosetests.xml,**/test_stdout.txt,**/ccm_logs.tar.xz")
+
sourceFiles("**/nosetests.xml,**/test_stdout.txt.xz,**/ccm_logs.tar.xz")
 remoteDirectory("cassandra/\${JOB_NAME}/\${BUILD_NUMBER}/")
 }
 }
@@ -462,7 +462,10 @@ cassandraBranches.each {
 node / scm / branches / 'hudson.plugins.git.BranchSpec' / 
name(branchName)
 }
 steps {
-shell("./cassandra-builds/build-scripts/cassandra-test.sh 
${targetName}")
+shell("""
+./cassandra-builds/build-scripts/cassandra-test.sh 
${targetName} ;
+ xz build/test/logs/*.log
+  """)
 }
 }
 }
@@ -496,7 +499,10 @@ cassandraBranches.each {
 node / scm / branches / 'hudson.plugins.git.BranchSpec' / 
name(branchName)
 }
 steps {
-shell("sh 
./cassandra-builds/docker/jenkins/jenkinscommand.sh apache ${branchName} 
https://github.com/apache/cassandra-dtest.git master ${buildsRepo} 
${buildsBranch} ${dtestDockerImage} ${targetName} \${split}/${splits}")
+shell("""
+sh ./cassandra-builds/docker/jenkins/jenkinscommand.sh 
apache ${branchName} https://github.com/apache/cassandra-dtest.git master 
${buildsRepo} ${buildsBranch} ${dtestDockerImage} ${targetName} 
\${split}/${splits} ;
+xz test_stdout.txt
+""")
 }
 }
 }
@@ -687,7 +693,10 @@ testTargets.each {
 echo "cassandra-builds at: `git -C cassandra-builds log -1 
--pretty=format:'%h %an %ad %s'`" ;
 echo "Cassandra-devbranch-${targetName} cassandra: `git 
log -1 --pretty=format:'%h %an %ad %s'`" > 
Cassandra-devbranch-${targetName}.head ;
   """)
-shell("./cassandra-builds/build-scripts/cassandra-test.sh 
${targetName}")
+shell("""
+./cassandra-builds/build-scripts/cassandra-test.sh 
${targetName} ;
+xz build/test/logs/*.log
+  """)
 }
 publishers {
 publishOverSsh {
@@ -793,13 +802,16 @@ dtestTargets.each {
 echo "cassandra-builds at: `git -C cassandra-builds log -1 
--pretty=format:'%h %an %ad %s'`" ;
 echo "Cassandra-devbranch-${targetName} cassandra: `git 
log -1 --pretty=format:'%h %an %ad %s'`" > 
Cassandra-devbranch-${targetName}.head ;
   """)
-shell("sh ./cassandra-builds/docker/jenkins/jenkinscommand.sh 
\$REPO \$BRANCH \$DTEST_REPO \$DTEST_BRANCH ${buildsRepo} ${buildsBranch} 
\$DOCKER_IMAGE ${targetName} \${split}/${splits}")
+shell("""
+sh ./cassandra-builds/docker/jenkins/jenkinscommand.sh \$REPO 
\$BRANCH \$DTEST_REPO \$DTEST_BRANCH ${buildsRepo} ${buildsBranch} 
\$DOCKER_IMAGE ${targetName} \${split}/${splits} ;
+xz test_stdout.txt
+  """)
 }
 publishers {
 publishOverSsh {
 server('Nightlies') {
 transferSet {
-sourceFiles("**/test_stdout.txt,**/ccm_logs.tar.xz")
+sourceFiles("**/test_stdout.txt.xz,**/ccm_logs.tar.xz")
 
remoteDirectory("cassandra/\${JOB_NAME}/\${BUILD_NUMBER}/")
 }
 }
diff --git a/jenkins-dsl/cassandra_pipeline.groovy 
b/jenkins-dsl/cassandra_pipeline.groovy
index d

[jira] [Assigned] (CASSANDRA-16048) Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and Value Columns

2020-09-28 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov reassigned CASSANDRA-16048:
---

Assignee: Jordan West  (was: Alex Petrov)

> Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and 
> Value Columns
> --
>
> Key: CASSANDRA-16048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16048
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Some compact storage tables, specifically those where the user has defined 
> both at least one clustering column and the value column, can be safely handled in 
> 4.0 because, besides the DENSE flag, they are not materially different post 3.0 
> and there is no visible change to the user-facing schema after dropping 
> compact storage. We can detect this case and allow these tables to silently 
> drop the DENSE flag while still throwing a start-up error for COMPACT STORAGE 
> tables that don’t meet the criteria. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-15811) Improve DROP COMPACT STORAGE

2020-09-28 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov reassigned CASSANDRA-15811:
---

Assignee: Marcus Eriksson  (was: Alex Petrov)

> Improve DROP COMPACT STORAGE
> 
>
> Key: CASSANDRA-15811
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15811
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Alex Petrov
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> DROP COMPACT STORAGE was introduced in CASSANDRA-10857 as one of the steps to 
> deprecate Thrift. However, current semantics of dropping compact storage 
> flags from tables reveal several columns that are usually empty (column1 and 
> value in the non-dense case, value for dense columns, and a column with an 
> empty name for super column families). Showing these columns can confuse 
> application developers, especially ones that have never used thrift and/or 
> made writes that assumed the presence of those fields, and used compact storage 
> in 3.x because it has “compact” in the name.
> There’s not much we can do in a super column family case, especially 
> considering there’s no way to create a supercolumn family using CQL, but we 
> can improve dense and non-dense cases. We can scan sstables and make sure 
> there are no signs of thrift writes in them, and if all sstables conform to 
> this rule, we can not only drop the flag, but also drop columns that are 
> supposed to be hidden. However, this is both not very user-friendly, and is 
> probably not worth development effort. 
> An alternative to scanning is to add {{FORCE DROP COMPACT}} syntax (or 
> something similar) that would just drop columns unconditionally. It is likely 
> that people who were using compact storage with thrift know they were doing 
> that, so they'll usually use “regular” {{DROP COMPACT}}, without force, which 
> will simply reveal the columns as it does right now.
> Since, in order to fix CASSANDRA-15778 and to allow an EmptyType column to 
> actually have data[*], we had to remove empty type validation, properly handling 
> compact storage starts making more sense; we'll solve it either by not having 
> the columns at all (hence not caring about their values), or by keeping both the 
> values _and_ the data without requiring validation in this case. The EmptyType 
> field will have to be handled differently though.
> [*] as it is possible to end up with sstables upgraded from 2.x or written in 
> 3.x before CASSANDRA-15373, which means not every 2.x upgraded or 3.x cluster 
> is guaranteed to have empty values in this column, and this behaviour, even 
> if undesired, might be used by people. 
> Open question is: CASSANDRA-15373 adds validation to EmptyType that disallows 
> any non-empty value to be written to it, but we already allow creating table 
> via CQL, and still write data into it with thrift. It seems to have been 
> unintended, but it might have become a feature people rely on. If we simply 
> back port 15373 to 2.2 and 2.1, we’ll change and break behaviour. Given 
> no-one complained in 3.0 and 3.11, this assumption is unlikely though. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-15811) Improve DROP COMPACT STORAGE

2020-09-28 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov reassigned CASSANDRA-15811:
---

Assignee: Alex Petrov  (was: Marcus Eriksson)

> Improve DROP COMPACT STORAGE
> 
>
> Key: CASSANDRA-15811
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15811
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> DROP COMPACT STORAGE was introduced in CASSANDRA-10857 as one of the steps to 
> deprecate Thrift. However, current semantics of dropping compact storage 
> flags from tables reveal several columns that are usually empty (column1 and 
> value in the non-dense case, value for dense columns, and a column with an 
> empty name for super column families). Showing these columns can confuse 
> application developers, especially ones that have never used thrift and/or 
> made writes that assumed the presence of those fields, and used compact storage 
> in 3.x because it has “compact” in the name.
> There’s not much we can do in a super column family case, especially 
> considering there’s no way to create a supercolumn family using CQL, but we 
> can improve dense and non-dense cases. We can scan sstables and make sure 
> there are no signs of thrift writes in them, and if all sstables conform to 
> this rule, we can not only drop the flag, but also drop columns that are 
> supposed to be hidden. However, this is both not very user-friendly, and is 
> probably not worth development effort. 
> An alternative to scanning is to add {{FORCE DROP COMPACT}} syntax (or 
> something similar) that would just drop columns unconditionally. It is likely 
> that people who were using compact storage with thrift know they were doing 
> that, so they'll usually use “regular” {{DROP COMPACT}}, without force, which 
> will simply reveal the columns as it does right now.
> Since, in order to fix CASSANDRA-15778 and to allow an EmptyType column to 
> actually have data[*], we had to remove empty type validation, properly handling 
> compact storage starts making more sense; we'll solve it either by not having 
> the columns at all (hence not caring about their values), or by keeping both the 
> values _and_ the data without requiring validation in this case. The EmptyType 
> field will have to be handled differently though.
> [*] as it is possible to end up with sstables upgraded from 2.x or written in 
> 3.x before CASSANDRA-15373, which means not every 2.x upgraded or 3.x cluster 
> is guaranteed to have empty values in this column, and this behaviour, even 
> if undesired, might be used by people. 
> Open question is: CASSANDRA-15373 adds validation to EmptyType that disallows 
> any non-empty value to be written to it, but we already allow creating table 
> via CQL, and still write data into it with thrift. It seems to have been 
> unintended, but it might have become a feature people rely on. If we simply 
> back port 15373 to 2.2 and 2.1, we’ll change and break behaviour. Given 
> no-one complained in 3.0 and 3.11, this assumption is unlikely though. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-16048) Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and Value Columns

2020-09-28 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov reassigned CASSANDRA-16048:
---

Assignee: Alex Petrov  (was: Jordan West)

> Safely Ignore Compact Storage Tables Where Users Have Defined Clustering and 
> Value Columns
> --
>
> Key: CASSANDRA-16048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16048
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/CQL
>Reporter: Jordan West
>Assignee: Alex Petrov
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Some compact storage tables, specifically those where the user has defined 
> both at least one clustering column and the value column, can be safely handled in 
> 4.0 because, besides the DENSE flag, they are not materially different post 3.0 
> and there is no visible change to the user-facing schema after dropping 
> compact storage. We can detect this case and allow these tables to silently 
> drop the DENSE flag while still throwing a start-up error for COMPACT STORAGE 
> tables that don’t meet the criteria. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra-dtest] branch master updated: fix bad rebase, remove remove_perf_disable_shared_mem

2020-09-28 Thread marcuse
This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git


The following commit(s) were added to refs/heads/master by this push:
 new 5890b5f  fix bad rebase, remove remove_perf_disable_shared_mem
5890b5f is described below

commit 5890b5fd76b6a0f5dd3dc9b464b5aa9fb592c7bd
Author: Marcus Eriksson 
AuthorDate: Mon Sep 28 09:38:36 2020 +0200

fix bad rebase, remove remove_perf_disable_shared_mem
---
 repair_tests/repair_test.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/repair_tests/repair_test.py b/repair_tests/repair_test.py
index 4b8f037..a33cd2f 100644
--- a/repair_tests/repair_test.py
+++ b/repair_tests/repair_test.py
@@ -15,7 +15,7 @@ from ccmlib.node import ToolError
 
 from dtest import FlakyRetryPolicy, Tester, create_ks, create_cf
 from tools.data import insert_c1c2, query_c1c2
-from tools.jmxutils import JolokiaAgent, make_mbean, 
remove_perf_disable_shared_mem
+from tools.jmxutils import JolokiaAgent, make_mbean
 
 since = pytest.mark.since
 logger = logging.getLogger(__name__)
@@ -948,7 +948,6 @@ class TestRepair(BaseRepairTest):
 cluster = self.cluster
 cluster.populate([3])
 node1, node2, node3 = cluster.nodelist()
-remove_perf_disable_shared_mem(node1) # for jmx
 cluster.start(wait_for_binary_proto=True)
 self.fixture_dtest_setup.ignore_log_patterns.extend(["Nothing to 
repair for"])
 session = self.patient_cql_connection(node1)


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org