[ 
https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437015#comment-17437015
 ] 

David Capwell commented on CASSANDRA-17081:
-------------------------------------------

I posted the cause here:
https://issues.apache.org/jira/browse/CASSANDRA-17085?focusedCommentId=17436995&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17436995

Here is what I am seeing in 
bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state

{code}
        node3 = new_node(cluster)
        try:
            node3.start()
        except NodeError:
            pass  # node doesn't start as expected
        t.join()
        node1.start()
{code}

node1.start() checks the logs of every node that ccm considers alive, waiting 
for each one to report node1 as up.  node3 is dead (or dying) at that point, so 
it should not be included in the watch set.
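
To illustrate the failure mode, here is a toy model of that check. It is not 
ccm's actual code: FakeNode, watch_set and missing_up_message are hypothetical 
names made up for this sketch. The only detail taken from the failure is the 
'127.0.0.1:7000.* is now UP' pattern shown in the timeout message.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeNode:
    """Toy stand-in for a ccm node; NOT the real ccmlib.node.Node."""
    name: str
    running: bool = True  # what the test harness believes about the node
    log: List[str] = field(default_factory=list)

    def is_running(self) -> bool:
        return self.running


def watch_set(cluster, starting):
    """Nodes whose logs would be checked for the 'is now UP' message."""
    return [n for n in cluster if n is not starting and n.is_running()]


def missing_up_message(cluster, starting, address):
    """Names of watched nodes whose logs never report `address` as UP."""
    pattern = re.compile(re.escape(address) + r".* is now UP")
    return [n.name for n in watch_set(cluster, starting)
            if not any(pattern.search(line) for line in n.log)]


node1 = FakeNode("node1")
node2 = FakeNode("node2", log=["Node /127.0.0.1:7000 is now UP"])
# node3 failed to bootstrap and is shutting down, but still counts as alive.
node3 = FakeNode("node3", log=["Stopping gossiper"])
cluster = [node1, node2, node3]

print(missing_up_message(cluster, node1, "127.0.0.1:7000"))  # ['node3']
node3.running = False  # once node3 is known to be down it leaves the watch set
print(missing_up_message(cluster, node1, "127.0.0.1:7000"))  # []
```

In this model the wait can only succeed once node3 is no longer counted as 
alive, since a node that is shutting down will never log the UP message.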

I was able to reproduce the issue by limiting the environment to 2 cores.  I am 
trying a patch that force-stops node3 before starting node1, so that ccm does 
not check node3's logs.
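
A minimal sketch of that idea, not the actual patch: the helper name 
stop_if_still_running is hypothetical, and it assumes the node object exposes 
is_running() and stop(gently=False) the way ccm nodes are normally used in 
dtests.

```python
def stop_if_still_running(node):
    """Force-stop a node that was expected to fail during startup.

    Assumes `node` offers is_running() and stop(gently=False), as ccm nodes
    are commonly used in dtests. Returns True if a stop was issued.
    """
    if node.is_running():
        # gently=False asks for a hard kill instead of a graceful shutdown
        node.stop(gently=False)
        return True
    return False


# Intended placement in the test (commented out, since it needs a live cluster):
#     t.join()
#     stop_if_still_running(node3)  # take node3 out of the watch set
#     node1.start()
```

Stopping node3 explicitly removes the dependency on how quickly it finishes 
shutting down on its own, which is what seems to matter on a 2-core machine.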

> Fix test: 
> bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17081
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17081
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Josh McKenzie
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: NA
>
>
> Seeing in circle and locally on trunk:
> Looks like it's timing out waiting for the bootstrap to complete.
> {code:java}
> test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).
>         <class 'ccmlib.node.TimeoutError'>
>         28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
>  Tail: ...b336de0e72/nb-1-big-Data.db 
> ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 
> StorageService.java:483 - Stopping gossiper
>         [<TracebackEntry 
> /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:895>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:664>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:588>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:56>]
> test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the 
> required 1 times.
>         <class 'ccmlib.node.TimeoutError'>
>         28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: 
>  Tail: ...
>         [<TracebackEntry 
> /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:895>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:664>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:588>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:56>]
> {code}
>  




