[ 
https://issues.apache.org/jira/browse/CASSANDRA-16588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17322096#comment-17322096
 ] 

Sam Tunnicliffe commented on CASSANDRA-16588:
---------------------------------------------

This is a dupe of CASSANDRA-14155 and your diagnosis corresponds with Ariel's 
there. IDK why that stalled, but his proposal is essentially the same as yours:

bq. it seems to me that we should be able to ignore messages received during 
the shadow round that don't have the information we are looking for without 
erroring out.

Rather than checking on the number of epstates returned, I'd go for verifying 
that the states we do require for the collision check are present. I've pushed 
a couple of commits which do that, wdyt?
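
As a rough illustration of that idea (not the code in the linked branches), here is a minimal self-contained sketch. The class {{ShadowRoundCheck}}, the {{State}} enum and the choice of {{STATUS}} and {{HOST_ID}} as the required states are all assumptions made for the example; the real set of states is whatever the collision check actually reads.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the gossip types; none of these names come from Cassandra.
final class ShadowRoundCheck {
    // Simplified model of per-endpoint application states.
    enum State { STATUS, HOST_ID, TOKENS }

    // Assumption: the collision check needs these two states to be present.
    static final Set<State> REQUIRED = EnumSet.of(State.STATUS, State.HOST_ID);

    // True only if every required state is present with a non-null value.
    // A reply that fails this test is ignored rather than treated as an error.
    static boolean hasRequiredStates(Map<State, String> epState) {
        if (epState == null) {
            return false;
        }
        for (State s : REQUIRED) {
            if (epState.get(s) == null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A stale ack that only carries STATUS (no HOST_ID), as in the reported failure.
        Map<State, String> staleAck = new EnumMap<>(State.class);
        staleAck.put(State.STATUS, "shutdown");

        // A genuine shadow-round reply carrying everything the check needs.
        Map<State, String> shadowReply = new EnumMap<>(State.class);
        shadowReply.put(State.STATUS, "NORMAL");
        shadowReply.put(State.HOST_ID, "example-host-id");

        System.out.println("stale ack usable: " + hasRequiredStates(staleAck));
        System.out.println("shadow reply usable: " + hasRequiredStates(shadowReply));
    }
}
```

Checking for the specific states, rather than counting entries, means a reply with plenty of entries but no {{HOST_ID}} is still skipped.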

||patch||CI||Circle||
|[3.11|https://github.com/beobal/cassandra/tree/CASSANDRA-16588]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/661/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/661/pipeline]|[pipeline|https://app.circleci.com/pipelines/github/beobal/cassandra?branch=CASSANDRA-16588]|
|[trunk|https://github.com/beobal/cassandra/tree/CASSANDRA-16588-trunk]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/660/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/660/pipeline]|[pipeline|https://app.circleci.com/pipelines/github/beobal/cassandra?branch=CASSANDRA-16588-trunk]|



> NPE getting host_id in Gossiper.isSafeForStartup
> ------------------------------------------------
>
>                 Key: CASSANDRA-16588
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16588
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 3.11.x, 4.0-rc
>
>
> As seen here: 
> https://ci-cassandra.apache.org/job/Cassandra-devbranch/604/testReport/junit/org.apache.cassandra.distributed.upgrade/MixedModeGossipTest/testStatusFieldShouldExistInOldVersionNodesEdgeCase/
> {noformat}
> java.lang.NullPointerException
>       at org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:952)
>       at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:657)
>       at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:933)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:784)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:729)
>       at org.apache.cassandra.distributed.impl.Instance.lambda$startup$10(Instance.java:541)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>       at java.lang.Thread.run(Thread.java:748)
> {noformat}
> I believe what is happening is that a GossipDigestAck has been queued on the 
> seed to ack the node's shutdown state, but isn't actually sent until the 
> node has restarted and gone into shadow.  Since the ack contains the node's 
> IP, the startup check assumes a host_id will be there, but since this is not 
> an actual shadow response, it is not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

