[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-17 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346172#comment-17346172
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

Absolutely, good catch! Thank you! 

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0, 4.0-rc, 4.x
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-17 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17345987#comment-17345987
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

[~e.dimitrova] I committed the dtest change as that is independent of the ccm 
change. The ccm change I noticed there are changes in ccm master not tagged as 
cassandra-test yet. Hence I am gathering info on whether this is intentional or 
not. Once I know that I will commit the ccm change, I hope that is ok with you 
but I am being extra cautious here. 

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-14 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344567#comment-17344567
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

When I mentioned the long tests, I didn't mean that this is the case here but I 
wanted to give an example of another class where there was different solution 
to timeout problems and we should verify those we see now why they appear and 
start from there.

Thank you, everything looks good to me!

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-14 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344383#comment-17344383
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

I did a last CI run and it 
[lgtm|https://app.circleci.com/pipelines/github/bereng/cassandra/287/workflows/7d1e15b7-9d2d-4347-80ee-ab8cb1166155/jobs/2723]

Timeouts imo are not related to having expanded the test so now it better fits 
into some other 'long' category as it was your case. I just think we're close 
to the timeouts but time will tell. We don't have to decide that now. If you're 
+1 I'll revert the requirements.txt change and then we can commit.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-13 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343844#comment-17343844
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

Assuming you tested 16667 with the new ccm logging and you see the same issue, 
I am +1.

Let's run the multiplexer again for both tests to confirm the issue is solved 
and commit. Thank you!
{quote}This lead me to speculate we probably have low generic timeouts now for 
jenkins hence my proposal to raise them.
{quote}
I suggest we open an umbrella ticket to look at those classes/tests. In another 
ticket the solution was to move two test classes to long run tests. While I see 
what you mean, the number of failures we see is still just a few tests compared 
and not enough justification for me to raise the time for the thousands of 
tests we have which will lead to significant CI run time increase in case of 
failures. 

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-12 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343713#comment-17343713
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

[~e.dimitrova] so if you peruse the latest runs you will notice 
[here|https://ci-cassandra.apache.org/job/Cassandra-4.0/38/testReport/junit/dtest-offheap.repair_tests.incremental_repair_test/TestIncRepair/test_multiple_repair/]
 and 
[here|https://ci-cassandra.apache.org/job/Cassandra-4.0/38/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_forwards_and_cleanup/]
 dtests timeouts on common operations like cql queries or node/cluster starts. 
Some for this test class but different test method, some for other tests. This 
lead me to speculate we probably have low generic timeouts now for jenkins 
hence my proposal to raise them. Of course we can do as you suggested and go 
one by one instead. I.e. I included a fix for CASSANDRA-16667 here for 
convinience.

So:
- I updated the [PR|https://github.com/apache/cassandra-dtest/pull/135] to it's 
final version + the 16667 fix
- CCM 
[PR|https://github.com/ekaterinadimitrova2/ccm/commit/60515b59cc59dd4ac500321f6b6b3a8818ed7f88]
 stays the same
- If you +1 I run a final CI and commit.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-12 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343286#comment-17343286
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

Bq. _The requirements.txt change I already mentioned it goes out. Good._
 My bad, I missed it, apologize

Bq. _On the timeout issue they seem to be coming up a bit more now. But we can 
go one at a time. Good._

The link points to a failure of this test but I looked at the last 5 runs in 
Jenkins and there are less than 10 failures per run, some of them already 
known. Did I miss something else you wanted to point out to me? Just double 
checking myself.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-12 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343261#comment-17343261
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

The requirements.txt change I already mentioned it goes out. Good.

On the timeout issue they seem to be coming up a bit more 
[now|https://ci-cassandra.apache.org/job/Cassandra-4.0/38/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_forwards_and_cleanup/].
 But we can go one at a time. Good.

Yep I can remove the other 'flaky' mentions in the class. Good

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-12 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343252#comment-17343252
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

Bq. Can you please confirm we're talking about the same flaky and this is ready 
to commit with?

I left a few comments in GitHub, most important, please, return the riptano/ccm 
repo. I used mine in requirements.txt only for test purposes. Thanks

Bq.  I am thinking of just generally raising for all tests those 90s to 120s or 
180s. Wdyt?

-1. Were have only a few new flakes and raising the timeout for the whole suite 
doesn't seem right to me. This will lead to even longer CI runs in case of 
failures. I suggest we first look at those few ones and the reason for 
flakiness and start from there.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-12 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343123#comment-17343123
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

Looking at jenkins CI failures and logs today I have the impression other 
tests, maybe bc of the recent docker parallelization changes, are starting to 
hit timeouts. I am thinking of just generally raising for all tests those 90s 
to 120s or 180s. Wdyt?

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-11 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343005#comment-17343005
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

The flaky was already 
[removed|https://github.com/apache/cassandra-dtest/pull/135/files#diff-bea9d3b0d863b1adc1bfe294ec6af0fb32c07b8869bde21c5f9c71f9bf4ad968L229]
 Can you please confirm we're talking about the same flaky and this is ready to 
commit with?
- CCM 
[PR|https://github.com/ekaterinadimitrova2/ccm/commit/60515b59cc59dd4ac500321f6b6b3a8818ed7f88]
- C* OSS [PR|https://github.com/apache/cassandra-dtest/pull/135/files] (minus 
the requirements.txt change)

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-11 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342721#comment-17342721
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

I just realized... please fix the flaky annotation, let's not leave it not 
serving its goal and only bringing confusion. Thanks!

 

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-10 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342258#comment-17342258
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

I fixed the CCM patch 
[here|https://github.com/ekaterinadimitrova2/ccm/commit/60515b59cc59dd4ac500321f6b6b3a8818ed7f88],
 I had a small mistake due to which the TAIL was always empty. Now it is fine, 
you can see where the method stopped reading from the log.

You can see in the latest 
[runs|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/808/workflows/737073ba-0e24-4a1f-9cdb-fffbef041a73/jobs/4568/tests#failed-test-0]
 I submitted with lower timeout, the message we look for is in the log but 
_watch_log_for_ method doesn't find the message; we just reached the timeout 
earlier before reaching the message in the log which we read line by line.

I am +1 on the raised timeout plus updated CCM debug message.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-10 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341874#comment-17341874
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

I pushed it on Friday with different timeout values and got a time when it 
fails but I wanted to try today with the final version of the multiplexer so I 
can easily track down the logs. ([~adelapena] added that option to easily find 
the logs you need)

I am fine to remove the ccm patch but at least rename correctly the tail to 
head. I will let you know if I find anything new in the logs

 

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-10 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341760#comment-17341760
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

Tail patch wasn't producing anything. But seems that just by increasing the 
timeout the thing runs 
[clean|https://app.circleci.com/pipelines/github/bereng/cassandra/282/workflows/793a0963-d319-4667-acaf-a74ec3e8b84f/jobs/2697].
 It still irks me we're not bottoming out on this thing... Proposed PR 
[here|https://github.com/apache/cassandra-dtest/pull/135] but I would remove 
the ccm from Ekaterina.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-10 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341720#comment-17341720
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

I went ahead and gave the multiplexer one more go. I got plenty of 
[failures|https://app.circleci.com/pipelines/github/bereng/cassandra/281/workflows/2afcd87f-ce10-4495-9153-a362e198d3cc/jobs/2691/artifacts]
 but the ones I could look at were legit. We can see 
[here|https://circle-production-customer-artifacts.s3.amazonaws.com/picard/5e9576564660060baefcf188/6098c61a1823b752f1333400-0-build/artifacts/dtest/stdout.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256=20210510T064321Z=host=60=AKIAJR3Q6CR467H7Z55A%2F20210510%2Fus-east-1%2Fs3%2Faws4_request=f108cc3b72f8eb9728c8c703b518ea0fed001a39ef1a8f3537be43c4e80b3396]
 the failure from {{2021-05-10 05:39:37,970}} it's indeed missing the message 
when you track down the log 
[file|https://circle-production-customer-artifacts.s3.amazonaws.com/picard/5e9576564660060baefcf188/6098c61a1823b752f1333400-0-build/artifacts/dtest_logs/1620625235394_test_move_backwards_and_cleanup%5B2-10%5D/node4.log?X-Amz-Algorithm=AWS4-HMAC-SHA256=20210510T064822Z=host=59=AKIAJR3Q6CR467H7Z55A%2F20210510%2Fus-east-1%2Fs3%2Faws4_request=6915219122fd8bd306dc0ae23ecac80f9ae1193254d97a2d51b9e04c9220c64c].
 So I am going to increase timeout and see if I can get some good failures.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-07 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340845#comment-17340845
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

Why wait if we can trigger it? give it one more try imo. It's just a couple 
clicks if I am not missing anything?

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-07 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340844#comment-17340844
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

I suggest we commit the debug patch and next time it appears if it appears we 
can think about it, then we will have the exact view what was read from the 
log. 
This is a test/infra issue and also it appears very rarely. It is also for 
experimental feature. WDYT? 

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-07 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340706#comment-17340706
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

[~e.dimitrova] I have been testing this with diff timeouts and they are either 
too short or too long so the failure can't be repro'd like that unfortunately. 
So I looked at your patch and I think it is the right thing to do indeed to get 
some more info. I can't explain why you didn't repro bc you have the same exact 
code I would have produced :shrug: Would you mind running it again? maybe this 
time we'll be lucky. Otherwise we'll ask Andres which is in good terms with the 
bug repro Gods and see if he can with your patch.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-05 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339996#comment-17339996
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

Hi [~e.dimitrova] thx for looking into this. I was buried yesterday and 
couldn't look much but I hope to do so today. Yes the 'flaky' annotation will 
either be removed or moved to '2' once we reach a conclusion.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-05 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339943#comment-17339943
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

That is so weird, I tried to push two times 1000 runs since last night, same 
config and same jobs on java 11 as [~adelapena] and I can't reproduce anything.

I would like to suggest this [tiny patch 
|https://github.com/ekaterinadimitrova2/ccm/commit/3f4ba3597f0809201f88dae403a15c33d12dd021]
 which in case it fails again will help us to see where it was. The _Tail_ 
added before is actually _Head_ so  I decided to add both.

I think having both would be good for debugging in such weird cases. WDYT?

Also, on the additional topic I brought, I think it is meaningless to keep  

_@flaky(max_runs=1)_, either we should consider there was a mistake and 2 was 
aimed or we can remove it to prevent further confusion. WDYT?

 

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-05 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339627#comment-17339627
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

I played with the new multiplexer yesterday, added some debug messages but my 
first try of 1000 runs didn't fail even once. I tried both java 8 and java 11. 
Started another run with more repetitions but it fails as the build is failing 
for some reason.

I will check again later today when the build is fixed

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-05 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339516#comment-17339516
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

The only failure I could track down from [~adelapena] multi run repros the 
problem

We can see 
[here|https://circle-production-customer-artifacts.s3.amazonaws.com/picard/52e11c3bf7d0f78c41388745/608de58867e0447553dcf233-1-build/artifacts/dtest/stdout.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256=20210505T081400Z=host=60=AKIAJR3Q6CR467H7Z55A%2F20210505%2Fus-east-1%2Fs3%2Faws4_request=e0c2ac533b07578c306d0500b9b9c618a04799610ae57e9294a0c50827bea00d]

bq. ccmlib.node.TimeoutError: 02 May 2021 00:01:09 [node4] after 90.12/90 
seconds Missing: ['Starting listening for CQL clients'] not found in system.log

it hasn't found it on 00:01:09 when it was already present at 00:00:27
bq. INFO  [main] 2021-05-02 00:00:27,098 PipelineConfigurator.java:125 - 
Starting listening for CQL clients on /127.0.0.4:9042 (unencrypted)...

So sthg doesn't add up somewhere indeed...


> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-04 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339392#comment-17339392
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

Thx for the clarification [~e.dimitrova]. That would mean there is around 20s 
delay between the log line being issued and actually making it into the log 
file. I will take a look at Andres multiplex run and see if I can see anything 
in there.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-04 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338953#comment-17338953
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

I am sorry [~bereng] , it seems I confused you. These are two different 
unrelated things I was talking about:
1) I saw the @flaky mark and I felt that it is worth it to mention it while we 
are looking into these tests as it seems useless now.

2) Indeed, the current timeout issue is a different problem. So what I meant is 
that the log file is processed line by line as far as I remember, if for some 
reason the processing was slow, that line where the message is might have not 
been reached yet until the 90 seconds pass.

I will try to add some logging and use the new multiplexer job from 
[~adelapena]. Thanks [~adelapena], indeed it helps. 

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-04 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338924#comment-17338924
 ] 

Andres de la Peña commented on CASSANDRA-16644:
---

If it's of any help, it seems that the CircleCi-based test multiplexer proposed 
in CASSANDRA-16625 has managed to reproduce the failure 54 times in [this 
run|https://app.circleci.com/pipelines/github/adelapena/cassandra/376/workflows/eb463f0a-ad93-488a-89a3-bb44fbc0c578/jobs/3270]
 with 1000 iterations.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-03 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338743#comment-17338743
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

Flaky or not, what I am worried about is the log perusing logic not having 
caught up on that message being present in the logs indeed. Either that or the 
log messages reach the log with a delay. But then it would mean there's a 20s 
delay somewhere which makes no sense. If that'd be the case adding flaky=2 
won't help here as that would be a general problem affecting all tests. So I 
having no sensible proposal I can only suggest we close until/if it resurfaces 
as investigating further won't take me anywhere imo. But maybe a new pair of 
eyes have better ideas wdyt?

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-03 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338718#comment-17338718
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

I feel that It is still a matter of timing and that the difference between the 
messages when they appeared in the logs doesn't mean that It took the same time 
for them to be processed, maybe?

I had some trouble with my local ccm and didn't manage to check this but you 
might want to try first reducing significantly the time out to verify my theory.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-05-03 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338593#comment-17338593
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16644:
-

I want to look at the test more but my first observation is the tests in that 
class are marked flaky with max number of runs 1.

At first I thought it is 1 run more if it fails but a quick check showed me 
just one run at all which makes flaky useless in this case.

So if nothing else and those should be really flaky (I didn't find a discussion 
around them in the original ticket where they were added), at least I would say 
run them twice or mark them with _@flaky(max_runs=2)_

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
> system.log:
> E   Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> ../venv/src/ccm/ccmlib/node.py:56: TimeoutError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16644) Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup

2021-04-30 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17337154#comment-17337154
 ] 

Berenguer Blasi commented on CASSANDRA-16644:
-

In order to follow the chain of events you need the logs 
[here|https://nightlies.apache.org/cassandra/trunk/Cassandra-trunk-dtest-novnode/403/Cassandra-trunk-dtest-novnode/label=cassandra-dtest,split=64/]

The test fails on

bq. ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 
seconds Missing: ['Starting listening for CQL clients'] not found in system.log:

But inspecting stdout logs we can see the node is started around 23:59:21

{noformat}
23:59:21,502 ccm INFO Starting node4 with 
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 java_version=8 
cassandra_version=4.0, install_dir=/home/cassandra/cassandra
00:00:19,524 cassandra.cluster INFO New Cassandra host  discovered
{noformat}

in node4 system logs we can see the actual message is really present in the log 
and the node has started:

{noformat}
INFO  [main] 2021-04-27 00:00:18,493 PipelineConfigurator.java:125 - Starting 
listening for CQL clients on /127.0.0.4:9042 (unencrypted)...
{noformat}

We seem to have a failure where a message present at 00:00:18 is not being 
detected and the failure is triggered at 00:01:00 which makes no sense. I have 
followed the code looking for holes in the log {{mark}} logic but I didn't see 
any obvious problems. The test seems pretty solid looking at jenkins history, 
it is a test problem not a code problem, we have no previous tickets on it, 
there doesn't seem to be anything else to investigate,... so I am proposing 
closing this one and reopening if it raises again. Then I would add some 
further debug to have some thread we start pulling.

> Flaky TestTransientReplicationRing.test_move_backwards_and_cleanup
> --
>
> Key: CASSANDRA-16644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16644
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
>
> Failure 
> [here|https://ci-cassandra.apache.org/job/Cassandra-trunk/472/testReport/junit/dtest-novnode.transient_replication_ring_test/TestTransientReplicationRing/test_move_backwards_and_cleanup/]
> {noformat}
> Error Message
> ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 90.11/90 seconds 
> Missing: ['Starting listening for CQL clients'] not found in system.log: 
> Tail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura
> Stacktrace
> self =  at 0x7f66c51ebd90>
> @flaky(max_runs=1)
> @pytest.mark.no_vnodes
> def test_move_backwards_and_cleanup(self):
> """Test moving a node backwards without moving past a neighbor 
> token"""
> move_token = '5'
> expected_after_move = [gen_expected(range(0, 6), range(31, 40)),
>gen_expected(range(0, 21, 2)),
>gen_expected(range(1, 6, 2), range(6, 31)),
>gen_expected(range(7, 20, 2), range(21, 40))]
> expected_after_repair = [gen_expected(range(0, 6), range(31, 40)),
>  gen_expected(range(0, 21)),
>  gen_expected(range(6, 31)),
>  gen_expected(range(21, 40))]
> >   self.move_test(move_token, expected_after_move, expected_after_repair)
> transient_replication_ring_test.py:333: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_ring_test.py:235: in move_test
> node4.start(wait_for_binary_proto=True)
> transient_replication_ring_test.py:50: in new_start
> return old_start(*args, **kwargs)
> ../venv/src/ccm/ccmlib/node.py:901: in start
> self.wait_for_binary_interface(from_mark=self.mark)
> ../venv/src/ccm/ccmlib/node.py:687: in wait_for_binary_interface
> self.watch_log_for("Starting listening for CQL clients", **kwargs)
> ../venv/src/ccm/ccmlib/node.py:588: in watch_log_for
> TimeoutError.raise_if_passed(start=start, timeout=timeout, node=self.name,
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> start = 1619481570.0236456, timeout = 90
> msg = "Missing: ['Starting listening for CQL clients'] not found in 
> system.log:\nTail: INFO  [main] 2021-04-26 23:59:22,590 YamlConfigura"
> node = 'node4'
> @staticmethod
> def raise_if_passed(start, timeout, msg, node=None):
> if start + timeout < time.time():
> >   raise TimeoutError.create(start, timeout, msg, node)
> E   ccmlib.node.TimeoutError: 27 Apr 2021 00:01:00 [node4] after 
> 90.11/90 seconds Missing: ['Starting listening for CQL clients'] not found in 
>