[ 
https://issues.apache.org/jira/browse/CASSANDRA-16238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332802#comment-17332802
 ] 

David Capwell commented on CASSANDRA-16238:
-------------------------------------------

Review

* https://github.com/apache/cassandra/compare/trunk...driftx:CASSANDRA-16238#diff-99267a2170b04fd7dd24d6c6bf2ba1fc26d6dc896cd74f8c5bd56c476e2540e4R580
 - nit: you can call isEmpty rather than size

For the host replacement tests, I set that field low to help find issues, so if 
this case happens more frequently because of that, I am fine with removing it. 
That said, we do see this outside of these classes as well.

bq. .1 is detected for the first time via gossip, and as it is going through 
StorageService but before it is added to TokenMetadata, the gossiper's status 
check has begun running

If I understand you, the call to StorageService.onChange (which calls 
handleStateNormal) happens after the gossip status check, so the status check 
removes the state? GossipDigestAck2 should be handled in the gossip stage and 
eventually call applyNewStates to apply the state and trigger notifications. 
But doStatusCheck is called in the GossipTasks thread pool, and it checks 
isGossipOnlyMember, which returns true in this case (as the state isn't fully 
settled yet). At that point we schedule a task in the gossip stage to remove 
the endpoint, but by then isGossipOnlyMember(endpoint) == false.

If I understand you correctly, this is a race condition where we read data 
that is not fully committed, which feels like a bug (which is why it was set 
to 0 in the first place).  Do I understand you correctly, [~brandon.williams]?
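
To make the interleaving above concrete, here is a minimal, deterministic 
sketch of the check-then-act pattern. It is not Cassandra code and not the 
patch under review: the class GossipRaceSketch, its fields, and the 
survives helper are all hypothetical stand-ins, and the "gossip stage" is 
modeled as a plain queue drained by hand so the ordering is fixed. It 
assumes that a gossip-only member is one present in gossip state but absent 
from token metadata, and it shows one possible guard (re-checking inside the 
queued task), which may or may not be the right fix here.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

public class GossipRaceSketch {
    // Hypothetical stand-ins for gossip endpoint state and token metadata.
    final Set<String> gossipState = new HashSet<>();
    final Set<String> tokenOwners = new HashSet<>();
    // Models a single-threaded stage: tasks run later, in submission order.
    final Queue<Runnable> gossipStage = new ArrayDeque<>();

    boolean isGossipOnlyMember(String endpoint) {
        return gossipState.contains(endpoint) && !tokenOwners.contains(endpoint);
    }

    // Stand-in for the periodic status check: decides now, acts later.
    void statusCheck(String endpoint, boolean recheckInStage) {
        if (!isGossipOnlyMember(endpoint))
            return;
        gossipStage.add(() -> {
            // Without the re-check, the task acts on the stale decision.
            if (!recheckInStage || isGossipOnlyMember(endpoint))
                gossipState.remove(endpoint);
        });
    }

    void drainGossipStage() {
        Runnable task;
        while ((task = gossipStage.poll()) != null)
            task.run();
    }

    // Replays the interleaving and reports whether the endpoint is still known.
    static boolean survives(boolean recheckInStage) {
        GossipRaceSketch g = new GossipRaceSketch();
        String endpoint = "127.0.0.1";
        g.gossipState.add(endpoint);              // state applied via gossip
        g.statusCheck(endpoint, recheckInStage);  // sees a gossip-only member
        g.tokenOwners.add(endpoint);              // token metadata updated afterwards
        g.drainGossipStage();                     // queued removal finally runs
        return g.gossipState.contains(endpoint);
    }

    public static void main(String[] args) {
        System.out.println("no re-check, endpoint survives: " + survives(false));
        System.out.println("re-check, endpoint survives: " + survives(true));
    }
}
```

In this sketch the unguarded path drops an endpoint that has since become a 
token owner, while the guarded path leaves it alone.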

> Fix flaky jvm-dtests that fail with Unable to contact any seeds
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-16238
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16238
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/java
>            Reporter: David Capwell
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 4.0-rc
>
>         Attachments: 16238-archived-failures.txt
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/745/workflows/1c7e589e-b5af-4a56-b40a-43da424602c7/jobs/4231
> {code}
> test teardown failure
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [main] 2020-10-29 17:38:13,808 CassandraDaemon.java:817 - Exception encountered during startup
> java.lang.IllegalStateException: Unable to contact any seeds!
>       at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1601)
>       at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:931)
>       at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:892)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:699)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:635)
>       at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:407)
>       at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:671)
>       at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:795), 
> ERROR [main] 2020-10-29 17:38:13,808 CassandraDaemon.java:817 - Exception encountered during startup
> java.lang.IllegalStateException: Unable to contact any seeds!
>       at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1601)
>       at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:931)
>       at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:892)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:699)
>       at org.apache.cassandra.service.StorageService.initServer(StorageService.java:635)
>       at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:407)
>       at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:671)
>       at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:795)]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
