[jira] [Updated] (CASSANDRA-15097) Avoid updating unchanged gossip state
[ https://issues.apache.org/jira/browse/CASSANDRA-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Zhuang updated CASSANDRA-15097:
---
Test and Documentation Plan: Unit tests pass, and the code is committed and running in Instagram's production environment.
Status: Patch Available (was: Open)

> Avoid updating unchanged gossip state
> -
>
> Key: CASSANDRA-15097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15097
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> A node may receive gossip states that are unchanged: the state might have been updated just after the node sent a GOSSIP_SYN, so it receives a state that is already up to date. If the heartbeat in the GOSSIP_ACK message is newer, the node unnecessarily re-applies the same state, which could be costly, for example when applying a token change.
> This is very likely to happen in a large cluster when a node starts up, as the first gossip message syncs the tokens of all endpoints, which can take some time (about 200 seconds in our case). During that time the node keeps gossiping with other nodes and receiving the full token states, which causes lots of pending gossip tasks.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15133) Node restart causes unnecessary token metadata update
[ https://issues.apache.org/jira/browse/CASSANDRA-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Zhuang updated CASSANDRA-15133:
---
Test and Documentation Plan: Unit tests pass, and the code is committed and running in Instagram's production environment.
Status: Patch Available (was: Open)

> Node restart causes unnecessary token metadata update
> -
>
> Key: CASSANDRA-15133
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15133
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Gossip, Cluster/Membership
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> Restarting a node causes a gossip generation update. When the message propagates through the cluster, every node blindly updates its local token metadata even though it has not changed. Updating token metadata is expensive for a large vnode cluster and unnecessarily invalidates the token metadata cache.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15135) SASI tokenizer options not validated before being added to schema
[ https://issues.apache.org/jira/browse/CASSANDRA-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent White updated CASSANDRA-15135:
--
Discovered By: Adhoc Test
Since Version: 3.4

> SASI tokenizer options not validated before being added to schema
> -
>
> Key: CASSANDRA-15135
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15135
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/SASI
> Reporter: Vincent White
> Priority: Normal
>
> If you attempt to create a SASI index with an illegal argument combination, the index is added to the schema tables before Cassandra tries to instantiate the tokenizer, which throws a RuntimeException. Since the index was written to the schema tables, Cassandra will hit the same exception and fail to start when it tries to load the schema on boot.
> The branch below includes a unit test to reproduce the issue.
> ||3.11||
> |[PoC|https://github.com/vincewhite/cassandra/commit/089547946d284ae3feb0d5620067b85b8fd66ebc]|

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15097) Avoid updating unchanged gossip state
[ https://issues.apache.org/jira/browse/CASSANDRA-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Zhuang updated CASSANDRA-15097:
---
Severity: Low
Complexity: Normal
Discovered By: User Report
Bug Category: Parent values: Degradation(12984)Level 1 values: Performance Bug/Regression(12997)
Status: Open (was: Triage Needed)

> Avoid updating unchanged gossip state
> -
>
> Key: CASSANDRA-15097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15097
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> A node may receive gossip states that are unchanged: the state might have been updated just after the node sent a GOSSIP_SYN, so it receives a state that is already up to date. If the heartbeat in the GOSSIP_ACK message is newer, the node unnecessarily re-applies the same state, which could be costly, for example when applying a token change.
> This is very likely to happen in a large cluster when a node starts up, as the first gossip message syncs the tokens of all endpoints, which can take some time (about 200 seconds in our case). During that time the node keeps gossiping with other nodes and receiving the full token states, which causes lots of pending gossip tasks.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
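The fix described above boils down to comparing the incoming state's version against the locally known one before applying it. A minimal sketch of that idea (the class and method names are hypothetical, not the actual Cassandra patch):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: skip re-applying a gossip state whose version is not
// newer than the version already recorded locally for that endpoint.
public class GossipStateFilter
{
    private final Map<String, Integer> localVersions = new ConcurrentHashMap<>();

    /** Returns true only when the incoming version supersedes the local one. */
    public boolean shouldApply(String endpoint, int remoteVersion)
    {
        Integer local = localVersions.get(endpoint);
        if (local != null && remoteVersion <= local)
            return false; // unchanged state: avoid a costly token-metadata update
        localVersions.put(endpoint, remoteVersion);
        return true;
    }
}
```

With a check like this, a GOSSIP_ACK carrying a state the node already has is dropped instead of triggering another expensive token update.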
[jira] [Updated] (CASSANDRA-15134) SASI index files not included in snapshots
[ https://issues.apache.org/jira/browse/CASSANDRA-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent White updated CASSANDRA-15134:
--
Severity: Normal (was: Low)

> SASI index files not included in snapshots
> --
>
> Key: CASSANDRA-15134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15134
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/SASI
> Reporter: Vincent White
> Assignee: Vincent White
> Priority: Normal
>
> Newly written SASI index files are not being included in snapshots. This is because the SASI index files are not added to the components ({{org.apache.cassandra.io.sstable.SSTable#components}}) list of newly written sstables.
> Although I don't believe anything except snapshots ever tries to reference the SASI index files from this location, on startup Cassandra does add the SASI index files (if they are found on disk) of existing sstables to their components list. In that case, sstables that existed on startup with SASI index files will have their SASI index files included in any snapshots.
>
> This patch updates the components list of newly written sstables once the index is built.
> ||3.11||Trunk||
> |[PoC|https://github.com/vincewhite/cassandra/commit/a641298ad03250d3e4c195e05a93aad56dff8ca7]|[PoC|https://github.com/vincewhite/cassandra/commit/1cfe46688380838e7106f14446658988cfe68137]|

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15098) Endpoints no longer owning tokens are not removed for vnode
[ https://issues.apache.org/jira/browse/CASSANDRA-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Zhuang updated CASSANDRA-15098:
---
Severity: Normal
Complexity: Normal
Discovered By: User Report
Bug Category: Parent values: Correctness(12982)Level 1 values: Persistent Corruption / Loss(12986)
Status: Open (was: Triage Needed)

> Endpoints no longer owning tokens are not removed for vnode
> ---
>
> Key: CASSANDRA-15098
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15098
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> The logic here to remove endpoints that no longer own tokens does not work for multiple tokens (vnodes):
> https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/StorageService.java#L2505
> And it is very expensive to copy the token metadata for every check.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15135) SASI tokenizer options not validated before being added to schema
Vincent White created CASSANDRA-15135:
-
Summary: SASI tokenizer options not validated before being added to schema
Key: CASSANDRA-15135
URL: https://issues.apache.org/jira/browse/CASSANDRA-15135
Project: Cassandra
Issue Type: Bug
Components: Feature/SASI
Reporter: Vincent White

If you attempt to create a SASI index with an illegal argument combination, the index is added to the schema tables before Cassandra tries to instantiate the tokenizer, which throws a RuntimeException. Since the index was written to the schema tables, Cassandra will hit the same exception and fail to start when it tries to load the schema on boot.

The branch below includes a unit test to reproduce the issue.

||3.11||
|[PoC|https://github.com/vincewhite/cassandra/commit/089547946d284ae3feb0d5620067b85b8fd66ebc]|

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15098) Endpoints no longer owning tokens are not removed for vnode
[ https://issues.apache.org/jira/browse/CASSANDRA-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Zhuang updated CASSANDRA-15098:
---
Test and Documentation Plan: Unit tests pass, and the code is committed and running in Instagram's production environment.
Status: Patch Available (was: Open)

> Endpoints no longer owning tokens are not removed for vnode
> ---
>
> Key: CASSANDRA-15098
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15098
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> The logic here to remove endpoints that no longer own tokens does not work for multiple tokens (vnodes):
> https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/StorageService.java#L2505
> And it is very expensive to copy the token metadata for every check.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15133) Node restart causes unnecessary token metadata update
[ https://issues.apache.org/jira/browse/CASSANDRA-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Zhuang updated CASSANDRA-15133:
---
Complexity: Low Hanging Fruit
Change Category: Performance
Status: Open (was: Triage Needed)

> Node restart causes unnecessary token metadata update
> -
>
> Key: CASSANDRA-15133
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15133
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Gossip, Cluster/Membership
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> Restarting a node causes a gossip generation update. When the message propagates through the cluster, every node blindly updates its local token metadata even though it has not changed. Updating token metadata is expensive for a large vnode cluster and unnecessarily invalidates the token metadata cache.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15133) Node restart causes unnecessary token metadata update
[ https://issues.apache.org/jira/browse/CASSANDRA-15133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844502#comment-16844502 ]

Jay Zhuang commented on CASSANDRA-15133:

Here is a proposed fix:
| [15133-trunk|https://github.com/cooldoger/cassandra/tree/15133-trunk] | [!https://circleci.com/gh/cooldoger/cassandra/tree/15133-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/15133-trunk] |

It also fixes an issue with removing {{movingEndpoint}} entries, as the following code never works: removing an item while looping over the collection will throw a `ConcurrentModificationException`:
{noformat}
for (Pair pair : movingEndpoints)
{
    if (pair.right.equals(endpoint))
    {
        movingEndpoints.remove(pair);
        break;
    }
}
{noformat}

> Node restart causes unnecessary token metadata update
> -
>
> Key: CASSANDRA-15133
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15133
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Gossip, Cluster/Membership
> Reporter: Jay Zhuang
> Assignee: Jay Zhuang
> Priority: Normal
>
> Restarting a node causes a gossip generation update. When the message propagates through the cluster, every node blindly updates its local token metadata even though it has not changed. Updating token metadata is expensive for a large vnode cluster and unnecessarily invalidates the token metadata cache.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
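The standard safe idiom for this pattern is to remove the element through the {{Iterator}} rather than mutating the collection inside an enhanced for-loop. A minimal sketch (the class and method names are illustrative, not the linked patch):

```java
import java.util.Iterator;
import java.util.List;

public class SafeRemoval
{
    // Removes the first element equal to target. Mutating through the
    // Iterator keeps it valid, so no ConcurrentModificationException is
    // possible regardless of the collection type.
    public static boolean removeFirstMatch(List<String> endpoints, String target)
    {
        for (Iterator<String> it = endpoints.iterator(); it.hasNext(); )
        {
            if (it.next().equals(target))
            {
                it.remove(); // safe: removal goes through the iterator
                return true;
            }
        }
        return false;
    }
}
```

The same shape applies to the {{movingEndpoints}} loop above: iterate explicitly and call {{it.remove()}} instead of {{movingEndpoints.remove(pair)}}.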
[jira] [Updated] (CASSANDRA-15134) SASI index files not included in snapshots
[ https://issues.apache.org/jira/browse/CASSANDRA-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent White updated CASSANDRA-15134:
--
Severity: Low
Discovered By: Adhoc Test
Since Version: 3.4

> SASI index files not included in snapshots
> --
>
> Key: CASSANDRA-15134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15134
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/SASI
> Reporter: Vincent White
> Assignee: Vincent White
> Priority: Low
>
> Newly written SASI index files are not being included in snapshots. This is because the SASI index files are not added to the components ({{org.apache.cassandra.io.sstable.SSTable#components}}) list of newly written sstables.
> Although I don't believe anything except snapshots ever tries to reference the SASI index files from this location, on startup Cassandra does add the SASI index files (if they are found on disk) of existing sstables to their components list. In that case, sstables that existed on startup with SASI index files will have their SASI index files included in any snapshots.
>
> This patch updates the components list of newly written sstables once the index is built.
> ||3.11||Trunk||
> |[PoC\|[https://github.com/vincewhite/cassandra/commit/a641298ad03250d3e4c195e05a93aad56dff8ca7]]|[PoC\|[https://github.com/vincewhite/cassandra/commit/1cfe46688380838e7106f14446658988cfe68137]]|

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15134) SASI index files not included in snapshots
[ https://issues.apache.org/jira/browse/CASSANDRA-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vincent White updated CASSANDRA-15134:
--
Description:
Newly written SASI index files are not being included in snapshots. This is because the SASI index files are not added to the components ({{org.apache.cassandra.io.sstable.SSTable#components}}) list of newly written sstables.

Although I don't believe anything except snapshots ever tries to reference the SASI index files from this location, on startup Cassandra does add the SASI index files (if they are found on disk) of existing sstables to their components list. In that case, sstables that existed on startup with SASI index files will have their SASI index files included in any snapshots.

This patch updates the components list of newly written sstables once the index is built.

||3.11||Trunk||
|[PoC|https://github.com/vincewhite/cassandra/commit/a641298ad03250d3e4c195e05a93aad56dff8ca7]|[PoC|https://github.com/vincewhite/cassandra/commit/1cfe46688380838e7106f14446658988cfe68137]|

was:
Newly written SASI index files are not being included in snapshots. This is because the SASI index files are not added to the components ({{org.apache.cassandra.io.sstable.SSTable#components}}) list of newly written sstables.

Although I don't believe anything except snapshots ever tries to reference the SASI index files from this location, on startup Cassandra does add the SASI index files (if they are found on disk) of existing sstables to their components list. In that case, sstables that existed on startup with SASI index files will have their SASI index files included in any snapshots.

This patch updates the components list of newly written sstables once the index is built.

||3.11||Trunk||
|[PoC\|[https://github.com/vincewhite/cassandra/commit/a641298ad03250d3e4c195e05a93aad56dff8ca7]]|[PoC\|[https://github.com/vincewhite/cassandra/commit/1cfe46688380838e7106f14446658988cfe68137]]|

> SASI index files not included in snapshots
> --
>
> Key: CASSANDRA-15134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15134
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/SASI
> Reporter: Vincent White
> Assignee: Vincent White
> Priority: Low
>
> Newly written SASI index files are not being included in snapshots. This is because the SASI index files are not added to the components ({{org.apache.cassandra.io.sstable.SSTable#components}}) list of newly written sstables.
> Although I don't believe anything except snapshots ever tries to reference the SASI index files from this location, on startup Cassandra does add the SASI index files (if they are found on disk) of existing sstables to their components list. In that case, sstables that existed on startup with SASI index files will have their SASI index files included in any snapshots.
>
> This patch updates the components list of newly written sstables once the index is built.
> ||3.11||Trunk||
> |[PoC|https://github.com/vincewhite/cassandra/commit/a641298ad03250d3e4c195e05a93aad56dff8ca7]|[PoC|https://github.com/vincewhite/cassandra/commit/1cfe46688380838e7106f14446658988cfe68137]|

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15134) SASI index files not included in snapshots
Vincent White created CASSANDRA-15134:
-
Summary: SASI index files not included in snapshots
Key: CASSANDRA-15134
URL: https://issues.apache.org/jira/browse/CASSANDRA-15134
Project: Cassandra
Issue Type: Bug
Components: Feature/SASI
Reporter: Vincent White
Assignee: Vincent White

Newly written SASI index files are not being included in snapshots. This is because the SASI index files are not added to the components ({{org.apache.cassandra.io.sstable.SSTable#components}}) list of newly written sstables.

Although I don't believe anything except snapshots ever tries to reference the SASI index files from this location, on startup Cassandra does add the SASI index files (if they are found on disk) of existing sstables to their components list. In that case, sstables that existed on startup with SASI index files will have their SASI index files included in any snapshots.

This patch updates the components list of newly written sstables once the index is built.

||3.11||Trunk||
|[PoC\|[https://github.com/vincewhite/cassandra/commit/a641298ad03250d3e4c195e05a93aad56dff8ca7]]|[PoC\|[https://github.com/vincewhite/cassandra/commit/1cfe46688380838e7106f14446658988cfe68137]]|

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15133) Node restart causes unnecessary token metadata update
Jay Zhuang created CASSANDRA-15133:
--
Summary: Node restart causes unnecessary token metadata update
Key: CASSANDRA-15133
URL: https://issues.apache.org/jira/browse/CASSANDRA-15133
Project: Cassandra
Issue Type: Improvement
Components: Cluster/Gossip, Cluster/Membership
Reporter: Jay Zhuang
Assignee: Jay Zhuang

Restarting a node causes a gossip generation update. When the message propagates through the cluster, every node blindly updates its local token metadata even though it has not changed. Updating token metadata is expensive for a large vnode cluster and unnecessarily invalidates the token metadata cache.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15132) one-way TLS authentication for client encryption is broken
[ https://issues.apache.org/jira/browse/CASSANDRA-15132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15132:
-
Status: Awaiting Feedback (was: Triage Needed)

Hi [~jsanda], thanks for reporting this issue and also for the patch. Other than an additional log entry, I don't think this breaks anything, does it?

> one-way TLS authentication for client encryption is broken
> --
>
> Key: CASSANDRA-15132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15132
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Encryption
> Reporter: John Sanda
> Priority: Normal
>
> CASSANDRA-14652 caused a regression for client/native transport encryption. It broke one-way TLS authentication, where only the client authenticates the coordinator node's certificate chain. This would be configured in cassandra.yaml as such:
> {noformat}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: my_keystore_password
>   optional: false
>   require_client_auth: false
> {noformat}
> With the changes in CASSANDRA-14652, ServerConnection.java assumes that there will always be a client certificate chain, which will not be the case with the above configuration.
> Here is the error that shows up in the logs:
> {noformat}
> ERROR [Native-Transport-Requests-1] 2019-05-17 18:20:20,016 ServerConnection.java:147 - Failed to get peer certificates for peer /127.0.0.1:50736
> javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
> at sun.security.ssl.SSLSessionImpl.getPeerCertificateChain(SSLSessionImpl.java:501) ~[na:1.8.0_202]
> at org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:143) [main/:na]
> at org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:127) [main/:na]
> at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:75) [main/:na]
> at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:566) [main/:na]
> at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [main/:na]
> at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_202]
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [main/:na]
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15132) one-way TLS authentication for client encryption is broken
[ https://issues.apache.org/jira/browse/CASSANDRA-15132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sanda updated CASSANDRA-15132:
---
Description:
CASSANDRA-14652 caused a regression for client/native transport encryption. It broke one-way TLS authentication, where only the client authenticates the coordinator node's certificate chain. This would be configured in cassandra.yaml as such:
{noformat}
client_encryption_options:
  enabled: true
  keystore: /path/to/keystore
  keystore_password: my_keystore_password
  optional: false
  require_client_auth: false
{noformat}
With the changes in CASSANDRA-14652, ServerConnection.java assumes that there will always be a client certificate chain, which will not be the case with the above configuration.

Here is the error that shows up in the logs:
{noformat}
ERROR [Native-Transport-Requests-1] 2019-05-17 18:20:20,016 ServerConnection.java:147 - Failed to get peer certificates for peer /127.0.0.1:50736
javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
	at sun.security.ssl.SSLSessionImpl.getPeerCertificateChain(SSLSessionImpl.java:501) ~[na:1.8.0_202]
	at org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:143) [main/:na]
	at org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:127) [main/:na]
	at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:75) [main/:na]
	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:566) [main/:na]
	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [main/:na]
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
	at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
	at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_202]
	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [main/:na]
{noformat}

was:
CASSANDRA-14652 caused a regression for client/native transport encryption. It broke one-way TLS authentication, where only the client authenticates the coordinator node's certificate chain. This would be configured in cassandra.yaml as such:
{noformat}
client_encryption_options:
  enabled: true
  keystore: /path/to/keystore
  keystore_password: my_keystore_password
  optional: false
  require_client_auth: false
{noformat}
With the changes in CASSANDRA-14652, ServerConnection.java assumes that there will always be a client certificate chain, which will not be the case with the above configuration.

> one-way TLS authentication for client encryption is broken
> --
>
> Key: CASSANDRA-15132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15132
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Encryption
> Reporter: John Sanda
> Priority: Normal
>
> CASSANDRA-14652 caused a regression for client/native transport encryption. It broke one-way TLS authentication, where only the client authenticates the coordinator node's certificate chain. This would be configured in cassandra.yaml as such:
> {noformat}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: my_keystore_password
>   optional: false
>   require_client_auth: false
> {noformat}
> With the changes in CASSANDRA-14652, ServerConnection.java assumes that there will always be a client certificate chain, which will not be the case with the above configuration.
> Here is the error that shows up in the logs:
> {noformat}
> ERROR [Native-Transport-Requests-1] 2019-05-17 18:20:20,016 ServerConnection.java:147 - Failed to get peer certificates for peer /127.0.0.1:50736
> javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
> at sun.security.ssl.SSLSessionImpl.getPeerCertificateChain(SSLSessionImpl.java:501) ~[na:1.8.0_202]
> at org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:143) [main/:na]
> at org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:127) [main/:na]
> at org.apache.cassandra.
[jira] [Comment Edited] (CASSANDRA-15105) Flaky unit test AuditLoggerTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844283#comment-16844283 ]

Sumanth Pasupuleti edited comment on CASSANDRA-15105 at 5/20/19 10:55 PM:
--
I've reviewed the patch. LGTM w.r.t. the fix for the {{AuditLoggerTest}} class. As I ran UTs multiple times with this patch, I noticed {{testExcludeSystemKeyspaces}} still [fails|https://circleci.com/gh/sumanth-pasupuleti/cassandra/508#tests/containers/14] due to events collected in {{InMemoryAuditLogger}}. I did a scrub across the UTs to make sure we disable the audit logger each time we enable it, and consequently made a change to {{StorageServiceServerTest}} on top of [~eperott]'s patch. From my several (10) [runs|https://circleci.com/gh/sumanth-pasupuleti/workflows/cassandra/tree/15105_trunk_UT] of UTs, the AuditLogger tests have been passing.

[Patch|https://github.com/apache/cassandra/pull/323] [Passing Tests|https://circleci.com/workflow-run/7a96f12c-c695-4ca8-8bf6-36108bdaa75c]

was (Author: sumanth.pasupuleti):
I've reviewed the patch. LGTM w.r.t. the fix for the {{AuditLoggerTest}} class. As I ran UTs multiple times with this patch, I noticed {{testExcludeSystemKeyspaces}} still [fails|https://circleci.com/gh/sumanth-pasupuleti/cassandra/508#tests/containers/14] due to events collected in {{InMemoryAuditLogger}}. I did a scrub across the UTs to make sure we disable the audit logger each time we enable it, and consequently made a change to {{StorageServiceServerTest}} on top of [~eperott]'s patch. From my several (10) runs of UTs, the AuditLogger tests have been passing.
[Patch|https://github.com/apache/cassandra/pull/323] [Passing Tests|https://circleci.com/workflow-run/7a96f12c-c695-4ca8-8bf6-36108bdaa75c]

> Flaky unit test AuditLoggerTest
> ---
>
> Key: CASSANDRA-15105
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15105
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/CQL
> Reporter: Per Otterström
> Assignee: Per Otterström
> Priority: Normal
> Fix For: 4.0
>
> Depending on execution order some tests will fail in the AuditLoggerTest class. Any test case that happens to execute after testExcludeSystemKeyspaces() will typically fail.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15132) one-way TLS authentication for client encryption is broken
[ https://issues.apache.org/jira/browse/CASSANDRA-15132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844299#comment-16844299 ]

John Sanda commented on CASSANDRA-15132:

I pushed a fix at https://github.com/jsanda/cassandra/tree/tls-client-auth-patch.

> one-way TLS authentication for client encryption is broken
> --
>
> Key: CASSANDRA-15132
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15132
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Encryption
> Reporter: John Sanda
> Priority: Normal
>
> CASSANDRA-14652 caused a regression for client/native transport encryption. It broke one-way TLS authentication, where only the client authenticates the coordinator node's certificate chain. This would be configured in cassandra.yaml as such:
> {noformat}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: my_keystore_password
>   optional: false
>   require_client_auth: false
> {noformat}
> With the changes in CASSANDRA-14652, ServerConnection.java assumes that there will always be a client certificate chain, which will not be the case with the above configuration.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15132) one-way TLS authentication for client encryption is broken
John Sanda created CASSANDRA-15132: -- Summary: one-way TLS authentication for client encryption is broken Key: CASSANDRA-15132 URL: https://issues.apache.org/jira/browse/CASSANDRA-15132 Project: Cassandra Issue Type: Bug Components: Feature/Encryption Reporter: John Sanda CASSANDRA-14652 caused a regression for client/native transport encryption. It broke one-way TLS authentication, where only the client authenticates the coordinator node's certificate chain. This would be configured in cassandra.yaml as follows: {noformat} client_encryption_options: enabled: true keystore: /path/to/keystore keystore_password: my_keystore_password optional: false require_client_auth: false {noformat} With the changes in CASSANDRA-14652, ServerConnection.java assumes that there will always be a client certificate chain, which will not be the case with the above configuration.
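The failure mode above can be sketched independently of Cassandra's actual ServerConnection code. In this hedged illustration (ClientAuthGuard and clientIdentity are hypothetical names, not Cassandra's API), the server only insists on a peer certificate chain when require_client_auth is true, mirroring the yaml shown in the report:

```java
import java.util.Optional;

// Hypothetical guard sketching the fix direction: with one-way TLS
// (require_client_auth: false) the server must tolerate an absent client
// certificate chain instead of assuming one is always present.
public class ClientAuthGuard
{
    static Optional<String> clientIdentity(boolean requireClientAuth, String[] peerChain)
    {
        if (peerChain == null || peerChain.length == 0)
        {
            if (requireClientAuth)
                throw new IllegalStateException("client did not present a certificate chain");
            return Optional.empty(); // one-way TLS: server-only authentication is fine
        }
        return Optional.of(peerChain[0]); // mutual TLS: identify the client by its leaf cert
    }

    public static void main(String[] args)
    {
        // one-way TLS: an absent chain must not fail
        assert !clientIdentity(false, null).isPresent();
        // mutual TLS: the leaf certificate identifies the client
        assert clientIdentity(true, new String[]{ "CN=client" }).get().equals("CN=client");
    }
}
```

In the real code path the chain would come from something like SSLSession.getPeerCertificates(), which throws SSLPeerUnverifiedException when the client did not authenticate; the point of the sketch is only that the absent-chain case must be an expected branch, not an error, when require_client_auth is false.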
[jira] [Updated] (CASSANDRA-15105) Flaky unit test AuditLoggerTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumanth Pasupuleti updated CASSANDRA-15105: --- Reviewers: Sumanth Pasupuleti, Vinay Chella (was: Vinay Chella) > Flaky unit test AuditLoggerTest > --- > > Key: CASSANDRA-15105 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15105 > Project: Cassandra > Issue Type: Bug > Components: Legacy/CQL >Reporter: Per Otterström >Assignee: Per Otterström >Priority: Normal > Fix For: 4.0 > > > Depending on execution order, some tests will fail in the AuditLoggerTest > class. Any test case that happens to execute after > testExcludeSystemKeyspaces() will typically fail.
[jira] [Commented] (CASSANDRA-15105) Flaky unit test AuditLoggerTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844283#comment-16844283 ] Sumanth Pasupuleti commented on CASSANDRA-15105: I've reviewed the patch. LGTM w.r.t. the fix for the {{AuditLoggerTest}} class. As I ran the UTs multiple times with this patch, I noticed {{testExcludeSystemKeyspaces}} still [fails|https://circleci.com/gh/sumanth-pasupuleti/cassandra/508#tests/containers/14] due to events collected in {{InMemoryAuditLogger}}. I did a scrub across the UTs to make sure we disable the audit logger each time we enable it, and consequently made a change to {{StorageServiceServerTest}} on top of [~eperott]'s patch. Across my several (10) runs of the UTs, the AuditLogger tests have been passing. [Patch|https://github.com/apache/cassandra/pull/323] [Passing Tests|https://circleci.com/workflow-run/7a96f12c-c695-4ca8-8bf6-36108bdaa75c]
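The fix pattern behind this scrub is plain test hygiene: any test that enables a global facility must disable it again, even when the test body throws, so no recorded events leak into the next test. A minimal sketch under that assumption (the enable/disable/event names here are illustrative stand-ins, not Cassandra's audit-log API):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative try/finally pattern: a test that flips global state
// (a stand-in for enabling the audit logger) must restore it even on
// failure, so later tests never observe leaked events.
public class AuditTestHygiene
{
    static final List<String> events = new ArrayList<>();
    static boolean enabled = false;

    static void enableAuditLog()  { enabled = true; }
    static void disableAuditLog() { enabled = false; events.clear(); }

    static void runIsolated(Runnable testBody)
    {
        enableAuditLog();
        try { testBody.run(); }
        finally { disableAuditLog(); } // runs whether the test passes or throws
    }

    public static void main(String[] args)
    {
        runIsolated(() -> events.add("SELECT * FROM ks.t"));
        // after the test, nothing leaks into the next one
        assert !enabled;
        assert events.isEmpty();
    }
}
```

In a JUnit suite the same guarantee would usually live in an @After/@AfterEach teardown rather than an explicit try/finally.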
[jira] [Updated] (CASSANDRA-14516) filter sstables by min/max clustering bounds during reads
[ https://issues.apache.org/jira/browse/CASSANDRA-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-14516: Resolution: Fixed Status: Resolved (was: Open) Thanks for looking into this, [~n.v.harikrishna]. Sorry for the false alarm. > filter sstables by min/max clustering bounds during reads > - > > Key: CASSANDRA-14516 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14516 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: Blake Eggleston >Assignee: Venkata Harikrishna Nukala >Priority: Normal > Fix For: 4.0 > > > In SinglePartitionReadCommand, we don't filter out sstables whose min/max > clustering bounds don't intersect with the clustering bounds being queried. > This causes us to do extra work on the read path.
[jira] [Commented] (CASSANDRA-15086) Illegal column names make legacy sstables unreadable in 3.0/3.x
[ https://issues.apache.org/jira/browse/CASSANDRA-15086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843808#comment-16843808 ] Sam Tunnicliffe commented on CASSANDRA-15086: - [~cam1982], it's not really a duplicate: in this issue the columns are illegal in 2.1 as well as 3.0+, but 3.0 has no mechanism to handle them at all (scrub cannot fix the sstables containing these cells). On the other hand, the issue in 15081 can be corrected by adding the missing column to the schema metadata in {{system_schema.dropped_columns}}. Once this is done, those tables can be read without issue, and even without the correct missing column defs, {{sstablescrub}} can recover the rest of the data. > Illegal column names make legacy sstables unreadable in 3.0/3.x > --- > > Key: CASSANDRA-15086 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15086 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Normal > Fix For: 3.0.19, 3.11.5 > > > CASSANDRA-10608 adds extra validation when decoding a bytebuffer representing > a legacy cellname. If the table is not COMPACT and the column name component > of the cellname refers to a primary key column, an IllegalArgumentException > is thrown. It looks like the original intent of 10608 was to prevent Thrift > writes from inserting these invalid cells, but the same code path is > exercised on the read path. The problem is that these kinds of cells may exist > in pre-3.0 sstables, either due to Thrift writes or through side loading of > externally generated SSTables. Following an upgrade to 3.0, these partitions > become unreadable, breaking both the read and compaction paths (and so also > upgradesstables). Scrub in 2.1 does not help here as it blindly reproduces > the invalid cells.
[jira] [Assigned] (CASSANDRA-15128) Cassandra does not support openjdk version "1.8.0_202"
[ https://issues.apache.org/jira/browse/CASSANDRA-15128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panneerselvam reassigned CASSANDRA-15128: - Assignee: Aleksey Yeschenko > Cassandra does not support openjdk version "1.8.0_202" > -- > > Key: CASSANDRA-15128 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15128 > Project: Cassandra > Issue Type: Bug > Components: Build >Reporter: Panneerselvam >Assignee: Aleksey Yeschenko >Priority: Normal > > I am trying to set up Apache Cassandra DB 3.11.4 on my Windows 8 > system and am getting the below error while starting the Cassandra.bat file. > Software installed: > * Cassandra 3.11.4 > * Java 1.8 > * Python 2.7 > It started working after installing HotSpot JDK 1.8. > Is OpenJDK 1.8 not supported, or is the issue only with this particular > version (1.8.0_202)? > > > {code:java} > Exception (java.lang.ExceptionInInitializerError) encountered during startup: > null > java.lang.ExceptionInInitializerError > at java.lang.J9VMInternals.ensureError(J9VMInternals.java:146) > at > java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:135) > at > org.apache.cassandra.utils.ObjectSizes.sizeOfReferenceArray(ObjectSizes.java:79) > at > org.apache.cassandra.utils.ObjectSizes.sizeOfArray(ObjectSizes.java:89) > at > org.apache.cassandra.utils.ObjectSizes.sizeOnHeapExcludingData(ObjectSizes.java:112) > at > org.apache.cassandra.db.AbstractBufferClusteringPrefix.unsharedHeapSizeExcludingData(AbstractBufferClusteringPrefix.java:70) > at > org.apache.cassandra.db.rows.BTreeRow.unsharedHeapSizeExcludingData(BTreeRow.java:450) > at > org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:336) > at > org.apache.cassandra.db.partitions.AtomicBTreePartition$RowUpdater.apply(AtomicBTreePartition.java:295) > at > org.apache.cassandra.utils.btree.BTree.buildInternal(BTree.java:139) > at org.apache.cassandra.utils.btree.BTree.build(BTree.java:121) > at 
org.apache.cassandra.utils.btree.BTree.update(BTree.java:178) > at > org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:156) > at org.apache.cassandra.db.Memtable.put(Memtable.java:282) > at > org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1352) > at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:626) > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:470) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:227) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:232) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:241) > at > org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:587) > at > org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(ModificationStatement.java:581) > at > org.apache.cassandra.cql3.QueryProcessor.executeOnceInternal(QueryProcessor.java:365) > at > org.apache.cassandra.db.SystemKeyspace.persistLocalMetadata(SystemKeyspace.java:520) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:221) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:620) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:732) > Caused by: java.lang.NumberFormatException: For input string: "openj9-0" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Integer.parseInt(Integer.java:580) > at java.lang.Integer.parseInt(Integer.java:615) > at > org.github.jamm.MemoryLayoutSpecification.getEffectiveMemoryLayoutSpecification(MemoryLayoutSpecification.java:190) > at > org.github.jamm.MemoryLayoutSpecification.<clinit>(MemoryLayoutSpecification.java:31) > {code} > > 
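The bottom frames of the trace show the actual culprit: jamm's MemoryLayoutSpecification feeds a java.vm.version token such as "openj9-0" (this is an OpenJ9 VM, per the J9VMInternals frames) straight into Integer.parseInt. A defensive parse that keeps only the leading digits and otherwise falls back would avoid the crash; this is a hedged sketch of that idea, not jamm's real code:

```java
// Illustrative defensive parse: extract the leading integer from a VM
// version token instead of calling Integer.parseInt on the raw string,
// which throws NumberFormatException for OpenJ9 values like "openj9-0".
public class VmVersionToken
{
    static int leadingInt(String token, int fallback)
    {
        int end = 0;
        while (end < token.length() && Character.isDigit(token.charAt(end)))
            end++; // stop at the first non-digit character
        return end == 0 ? fallback : Integer.parseInt(token.substring(0, end));
    }

    public static void main(String[] args)
    {
        assert leadingInt("202", -1) == 202;      // HotSpot-style update number
        assert leadingInt("202-b08", -1) == 202;  // trailing build suffix ignored
        assert leadingInt("openj9-0", -1) == -1;  // OpenJ9 token: fall back safely
    }
}
```

Whether the right fix is in jamm or in Cassandra's dependency pinning is a separate question; the sketch only shows why the parse fails on this VM.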
[jira] [Updated] (CASSANDRA-15131) NPE while force remove a node
[ https://issues.apache.org/jira/browse/CASSANDRA-15131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated CASSANDRA-15131: -- Description: Reproduce: # Start a three-node cluster (A, B, C): ./bin/cassandra -f # Shut down node A # On node B, remove node A: ./bin/nodetool removenode 2331c0c1-f799-4f35-9323-c57ad020732b # This process is too slow, so we force-remove A: ./bin/nodetool removenode force # An NPE occurs on the client: {code:java} RemovalStatus: Removing token (-9206149340638432876). Waiting for replication confirmation from [/10.3.1.11,/10.3.1.14]. error: null -- StackTrace -- java.lang.NullPointerException at org.apache.cassandra.gms.VersionedValue$VersionedValueFactory.removedNonlocal(VersionedValue.java:214) at org.apache.cassandra.gms.Gossiper.advertiseTokenRemoved(Gossiper.java:556) at org.apache.cassandra.service.StorageService.forceRemoveCompletion(StorageService.java:4353) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1471) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1312) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1404) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:832) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323) at sun.rmi.transport.Transport$1.run(Transport.java:200) at sun.rmi.transport.Transport$1.run(Transport.java:197) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:196) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$81(TCPTransport.java:683) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {code} Code analysis: 1. removeNode marks the node as leaving: {code:java} tokenMetadata.addLeavingEndpoint(endpoint); {code} 2. forceRemoveNode then steps into the removal path: {code:java} 1. 
if (!replicatingNodes.isEmpty() || !tokenMetadata.getLeavingEndpoints().isEmpty()) 2. { 3. logger.warn("Removal not confirmed for for {}", StringUtils.join(this.replicatingNodes, ",")); 4. for (InetAddress endpoint : tokenMetadata.getLeavingEndpoints()) 5. { 6. UUID hostId = tokenMetadata.getHostId(endpoint); 7. Gossiper.instance.advertiseTokenRemoved(endpoint, hostId); 8. excise(tokenMetadata.getTokens(endpoint), endpoint); 10 } 11 replicatingNodes.clear(); 12 removingNode = null; 13 } {code} 3. Code line #6 fetches the hostId, but if the in-flight removeNode completes at exactly this point it removes the host via *tokenMetadata.removeEndpoint(endpoint);*, so hostId is null. 4. Code line #7 then calls *hostId.toString()*, hence the NPE. The raw NPE does not explain what went wrong with the force-remove request, so we should fix it. We found this bug in 3.11.4; trunk has it as well. I will provide a patch soon.
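Steps 3 and 4 above point at a simple guard as the fix direction: when the concurrent removenode has already excised the endpoint, getHostId returns null, and the force path should skip (or report) that endpoint instead of dereferencing null. A minimal sketch with illustrative names, not the actual VersionedValueFactory/StorageService code:

```java
import java.util.UUID;

// Sketch of the fix direction for forceRemoveCompletion: hostId may be null
// when the in-flight removenode finished and already removed the endpoint,
// so guard before building the removal gossip value instead of letting
// hostId.toString() throw a NullPointerException.
public class ForceRemovalGuard
{
    static String removedNonlocal(UUID hostId, long expireTime)
    {
        if (hostId == null)
            return null; // endpoint already excised; nothing to advertise
        return "removed," + hostId + "," + expireTime;
    }

    public static void main(String[] args)
    {
        UUID id = UUID.fromString("2331c0c1-f799-4f35-9323-c57ad020732b");
        assert removedNonlocal(null, 42L) == null;               // race: skip quietly
        assert removedNonlocal(id, 42L).startsWith("removed,");  // normal force-remove
    }
}
```

An alternative to returning null would be surfacing a clear message ("node already removed") to the nodetool client, which addresses the "ugly NPE" complaint directly.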