[ 
https://issues.apache.org/jira/browse/CASSANDRA-17806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17577384#comment-17577384
 ] 

Benjamin Lerer commented on CASSANDRA-17806:
--------------------------------------------

The patch looks good to me.

> Flaky test_rolling_upgrade
> --------------------------
>
>                 Key: CASSANDRA-17806
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17806
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Berenguer Blasi
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 3.0.x, 4.0, 4.1-beta, 4.2
>
>
> The fix on CASSANDRA-17140 needs to be extended into other places as it seems 
> it now fails only one in a billion but still we can fix that one.
> {noformat}
> Regression
> dtest-upgrade.upgrade_tests.upgrade_through_versions_test.TestProtoV3Upgrade_AllVersions_RandomPartitioner_EndsAt_3_11_X_HEAD.test_rolling_upgrade
>  (from Cassandra dtests)
> Failing for the past 1 build (Since
> #115 )
> Took 10 min.
> Failed 1 times in the last 9 runs. Flakiness: 12%, Stability: 88%
> Error Message
> RuntimeError: A subprocess has terminated early. Subprocess statuses: 
> Process-1 (is_alive: True), Process-2 (is_alive: False), attempting to 
> terminate remaining subprocesses now.
> Stacktrace
> self = 
> <upgrade_tests.upgrade_through_versions_test.TestProtoV3Upgrade_AllVersions_RandomPartitioner_EndsAt_3_11_X_HEAD
>  object at 0x7f4d313e4e50>
>     @pytest.mark.timeout(3000)
>     def test_rolling_upgrade(self):
>         """
>             Test rolling upgrade of the cluster, so we have mixed versions 
> part way through.
>             """
> >       self.upgrade_scenario(rolling=True)
> upgrade_tests/upgrade_through_versions_test.py:340: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> upgrade_tests/upgrade_through_versions_test.py:417: in upgrade_scenario
>     self._check_on_subprocs(self.fixture_dtest_setup.subprocs)
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> <upgrade_tests.upgrade_through_versions_test.TestProtoV3Upgrade_AllVersions_RandomPartitioner_EndsAt_3_11_X_HEAD
>  object at 0x7f4d313e4e50>
> subprocs = [<Process name='Process-1' pid=10867 parent=389 stopped 
> exitcode=-SIGKILL daemon>, <Process name='Process-2' pid=10881 parent=389 
> stopped exitcode=1 daemon>]
>     def _check_on_subprocs(self, subprocs):
>         """
>             Check on given subprocesses.
>     
>             If any are not alive, we'll go ahead and terminate any remaining 
> alive subprocesses since this test is going to fail.
>             """
>         subproc_statuses = [s.is_alive() for s in subprocs]
>         if not all(subproc_statuses):
>             message = "A subprocess has terminated early. Subprocess 
> statuses: "
>             for s in subprocs:
>                 message += "{name} (is_alive: {aliveness}), 
> ".format(name=s.name, aliveness=s.is_alive())
>             message += "attempting to terminate remaining subprocesses now."
>             self._terminate_subprocs()
> >           raise RuntimeError(message)
> E           RuntimeError: A subprocess has terminated early. Subprocess 
> statuses: Process-1 (is_alive: True), Process-2 (is_alive: False), attempting 
> to terminate remaining subprocesses now.
> upgrade_tests/upgrade_through_versions_test.py:475: RuntimeError
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to