[ 
https://issues.apache.org/jira/browse/CASSANDRA-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437020#comment-17437020
 ] 

David Capwell commented on CASSANDRA-17081:
-------------------------------------------

Here is the section of code that is failing in ccm:

https://github.com/riptano/ccm/blob/master/ccmlib/node.py#L890-L895

{code}
if common.is_int_not_bool(wait_other_notice):
    for node, mark in marks:
        node.watch_log_for_alive(self, from_mark=mark, timeout=wait_other_notice)
elif wait_other_notice:
    for node, mark in marks:
        node.watch_log_for_alive(self, from_mark=mark)
{code}

marks is defined as follows (this is where the issue is):

https://github.com/riptano/ccm/blob/master/ccmlib/node.py#L772-L775

{code}
if wait_other_notice:
    marks = [(node, node.mark_log()) for node in list(self.cluster.nodes.values()) if node.is_live()]
else:
    marks = []
{code}

In some cases the node.is_live() check returns true for node3 (the node that 
failed to start up), so ccm watches node3's log for node1 to show up. Since 
node3 is actually down, its log never reports node1 as UP, which leads to the 
timeout.
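
To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use ccm itself: FakeNode, reported_up, process_running, collect_marks and the strict flag are all hypothetical names invented for illustration, and the "strict" check is only one possible way to skip a dead node, not necessarily the fix that belongs in ccm.

```python
class FakeNode:
    """Hypothetical stand-in for a ccm node; not the real ccmlib.node.Node."""

    def __init__(self, name, reported_up, process_running):
        self.name = name
        self.reported_up = reported_up          # what the liveness check reports
        self.process_running = process_running  # whether the process really exists
        self.log_lines = []

    def is_live(self):
        # Models the failure mode: the reported status can be stale.
        return self.reported_up

    def mark_log(self):
        # Position in the log from which a watcher would start reading.
        return len(self.log_lines)


def collect_marks(nodes, wait_other_notice, strict=False):
    """Pick the nodes whose logs will be watched for the starting node."""
    if not wait_other_notice:
        return []
    marks = []
    for node in nodes:
        if not node.is_live():
            continue
        if strict and not node.process_running:
            continue  # assumed extra check: skip nodes that have no process
        marks.append((node, node.mark_log()))
    return marks


node1 = FakeNode("node1", reported_up=False, process_running=False)  # being started
node2 = FakeNode("node2", reported_up=True, process_running=True)
node3 = FakeNode("node3", reported_up=True, process_running=False)   # failed to start
nodes = [node1, node2, node3]

watched = [n.name for n, _ in collect_marks(nodes, wait_other_notice=True)]
watched_strict = [n.name for n, _ in collect_marks(nodes, True, strict=True)]
print(watched)         # ['node2', 'node3']
print(watched_strict)  # ['node2']
```

In the first case node3 ends up in the watch list even though nothing will ever be written to its log, which is the situation that produces the 120 second timeout in the report below.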

> Fix test: 
> bootstrap_test.py::TestBootstrap::test_bootstrap_with_reset_bootstrap_state
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17081
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17081
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Josh McKenzie
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: NA
>
>
> Seeing in circle and locally on trunk:
> Looks like it's timing out waiting for the bootstrap to complete.
> {code:java}
> test_bootstrap_with_reset_bootstrap_state failed (1 runs remaining out of 2).
>         <class 'ccmlib.node.TimeoutError'>
>         28 Oct 2021 19:03:53 [node3] after 120.39/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: ERROR [Stream-Deserializer-/127.0.0.1:7000-20b885c
>  Tail: ...b336de0e72/nb-1-big-Data.db 
> ERROR [Stream-Deserializer-/127.0.0.1:7000-29a7cdb5] 2021-10-28 15:01:36,578 
> StorageService.java:483 - Stopping gossiper
>         [<TracebackEntry 
> /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:895>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:664>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:588>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:56>]
> test_bootstrap_with_reset_bootstrap_state failed; it passed 0 out of the 
> required 1 times.
>         <class 'ccmlib.node.TimeoutError'>
>         28 Oct 2021 19:08:23 [node3] after 120.41/120 seconds Missing: 
> ['127.0.0.1:7000.* is now UP'] not found in system.log:
>  Head: 
>  Tail: ...
>         [<TracebackEntry 
> /Users/jmckenzie/src/cassandra-dtest/bootstrap_test.py:483>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:895>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:664>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:588>
> <TracebackEntry /Users/jmckenzie/src/ccm/ccmlib/node.py:56>]
> {code}
>  


