[ https://issues.apache.org/jira/browse/CASSANDRA-17305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ekaterina Dimitrova updated CASSANDRA-17305: -------------------------------------------- Fix Version/s: 4.x > Test Failure: > dtest-upgrade.upgrade_tests.upgrade_through_versions_test.TestProtoV3Upgrade_AllVersions_RandomPartitioner_EndsAt_3_11_X_HEAD.test_parallel_upgrade > ----------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: CASSANDRA-17305 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17305 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python > Reporter: Josh McKenzie > Priority: Normal > Fix For: 4.x, 4.1-beta > > > Failed 2 times in the last 29 runs. Flakiness: 7%, Stability: 93% > Error Message > Failed: Timeout >900.0s > {code} > Stacktrace > self = > <upgrade_tests.upgrade_through_versions_test.TestProtoV3Upgrade_AllVersions_RandomPartitioner_EndsAt_3_11_X_HEAD > object at 0x7f19858d59d0> > def test_parallel_upgrade(self): > """ > Test upgrading cluster all at once (requires cluster downtime). > """ > > self.upgrade_scenario() > upgrade_tests/upgrade_through_versions_test.py:313: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > upgrade_tests/upgrade_through_versions_test.py:435: in upgrade_scenario > cluster.stop() > ../venv/lib/python3.8/site-packages/ccmlib/cluster.py:576: in stop > if not node.stop(wait=wait, signal_event=signal_event, **kwargs): > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = <ccmlib.node.Node object at 0x7f1985bba040>, wait = True > wait_other_notice = False, signal_event = <Signals.SIGTERM: 15>, kwargs = {} > still_running = True, wait_time_sec = 16, i = 4 > def stop(self, wait=True, wait_other_notice=False, > signal_event=signal.SIGTERM, **kwargs): > """ > Stop the node. > - wait: if True (the default), wait for the Cassandra process > to be > really dead. Otherwise return after having sent the kill > signal. > - wait_other_notice: return only when the other live nodes of > the > cluster have marked this node has dead. > - signal_event: Signal event to send to Cassandra; default is to > let Cassandra clean up and shut down properly (SIGTERM [15]) > - Optional: > + gently: Let Cassandra clean up and shut down properly; > unless > false perform a 'kill -9' which shuts down faster. > """ > if self.is_running(): > if wait_other_notice: > marks = [(node, node.mark_log()) for node in > list(self.cluster.nodes.values()) if node.is_live() and node is not self] > > if common.is_win(): > # Just taskkill the instance, don't bother trying to shut it > down gracefully. > # Node recovery should prevent data loss from hard shutdown. > # We have recurring issues with nodes not stopping / > releasing files in the CI > # environment so it makes more sense just to murder it hard > since there's > # really little downside. > > # We want the node to flush its data before shutdown as some > tests rely on small writes being present. > # The default Periodic sync at 10 ms may not have flushed > data yet, causing tests to fail. > # This is not a hard requirement, however, so we swallow any > exceptions this may throw and kill anyway. > if signal_event is signal.SIGTERM: > try: > self.flush() > except: > common.warning("Failed to flush node: {0} on > shutdown.".format(self.name)) > pass > > os.system("taskkill /F /PID " + str(self.pid)) > if self._find_pid_on_windows(): > common.warning("Failed to terminate node: {0} with pid: > {1}".format(self.name, self.pid)) > else: > # Determine if the signal event should be updated to keep API > compatibility > if 'gently' in kwargs and kwargs['gently'] is False: > signal_event = signal.SIGKILL > > os.kill(self.pid, signal_event) > > if wait_other_notice: > for node, mark in marks: > node.watch_log_for_death(self, from_mark=mark) > else: > time.sleep(.1) > > still_running = self.is_running() > if still_running and wait: > wait_time_sec = 1 > for i in xrange(0, 7): > # we'll double the wait time each try and cassandra should > # not take more than 1 minute to shutdown > > time.sleep(wait_time_sec) > E Failed: Timeout >900.0s > ../venv/lib/python3.8/site-packages/ccmlib/node.py:970: Failed > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org