[ https://issues.apache.org/jira/browse/CASSANDRA-16588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17322096#comment-17322096 ]
Sam Tunnicliffe commented on CASSANDRA-16588:
---------------------------------------------

This is a dupe of CASSANDRA-14155 and your diagnosis corresponds with Ariel's there. IDK why that stalled, but his proposal is essentially the same as yours:

bq. it seems to me that we should be able to ignore messages received during the shadow round that don't have the information we are looking for without erroring out.

Rather than checking on the number of epstates returned, I'd go for verifying that the states we do require for the collision check are present. I've pushed a couple of commits which do that, wdyt?

||patch||CI||Circle||
|[3.11|https://github.com/beobal/cassandra/tree/CASSANDRA-16588]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/661/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/661/pipeline]|[pipeline|https://app.circleci.com/pipelines/github/beobal/cassandra?branch=CASSANDRA-16588]|
|[trunk|https://github.com/beobal/cassandra/tree/CASSANDRA-16588-trunk]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/660/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/660/pipeline]|[pipeline|https://app.circleci.com/pipelines/github/beobal/cassandra?branch=CASSANDRA-16588-trunk]|

> NPE getting host_id in Gossiper.isSafeForStartup
> ------------------------------------------------
>
>                 Key: CASSANDRA-16588
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16588
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 3.11.x, 4.0-rc
>
>
> As seen here:
> https://ci-cassandra.apache.org/job/Cassandra-devbranch/604/testReport/junit/org.apache.cassandra.distributed.upgrade/MixedModeGossipTest/testStatusFieldShouldExistInOldVersionNodesEdgeCase/
> {noformat}
> java.lang.NullPointerException
>       at org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:952)
>       at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:657)
>       at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:933)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:784)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:729)
>       at org.apache.cassandra.distributed.impl.Instance.lambda$startup$10(Instance.java:541)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>       at java.lang.Thread.run(Thread.java:748)
> {noformat}
> I believe what is happening is a GossipDigestAck has been queued to ack the
> shutdown state from the node on the seed, but isn't actually sent until the
> node has restarted and gone into shadow. Since the ack contains the node's
> IP, it assumes a host_id will be there but since this is not an actual shadow
> response, it is not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
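
For illustration only, the idea in the comment (check that the states needed for the collision check are present, rather than counting epstates) might look roughly like the sketch below. This is not the code from the linked branches: `AppState`, `ShadowRoundCheck`, `REQUIRED`, and the choice of STATUS and HOST_ID as the required states are all hypothetical stand-ins that model gossip state with plain Java collections instead of Cassandra's own classes.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for gossip application states; not Cassandra's real enum.
enum AppState { STATUS, HOST_ID, TOKENS }

final class ShadowRoundCheck {
    // Assumption for this sketch: the collision check needs STATUS and HOST_ID.
    static final Set<AppState> REQUIRED = EnumSet.of(AppState.STATUS, AppState.HOST_ID);

    // True only if the endpoint's state map carries every required state.
    static boolean hasRequiredStates(Map<AppState, String> epState) {
        return epState != null && epState.keySet().containsAll(REQUIRED);
    }

    // Keep only the endpoints from an ack whose state is complete enough to use;
    // incomplete entries (e.g. a late ack of a shutdown state) are skipped, not fatal.
    static Map<String, Map<AppState, String>> usableStates(Map<String, Map<AppState, String>> ack) {
        Map<String, Map<AppState, String>> usable = new LinkedHashMap<>();
        for (Map.Entry<String, Map<AppState, String>> e : ack.entrySet()) {
            if (hasRequiredStates(e.getValue())) {
                usable.put(e.getKey(), e.getValue());
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        // A stale ack that only carries STATUS, as in the scenario described in the issue.
        Map<AppState, String> stale = new EnumMap<>(AppState.class);
        stale.put(AppState.STATUS, "shutdown");

        Map<String, Map<AppState, String>> ack = new LinkedHashMap<>();
        ack.put("127.0.0.1", stale);

        System.out.println("usable endpoints: " + usableStates(ack).keySet());
    }
}
```

With a filter along these lines, an ack that arrives during the shadow round without a host_id would be ignored for the purposes of the collision check instead of being dereferenced.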