[ https://issues.apache.org/jira/browse/CASSANDRA-17685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-17685:
-----------------------------------------
     Complexity: Normal
  Discovered By: User Report
       Severity: Normal
         Status: Open  (was: Triage Needed)

> Test failure: transient_replication_test.py::TestTransientReplicationRepairStreamEntireSSTable::test_optimized_primary_range_repair
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17685
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17685
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Andres de la Peña
>            Priority: Normal
>             Fix For: 4.1.x
>
>
> The Python dtest {{transient_replication_test.py::TestTransientReplicationRepairStreamEntireSSTable::test_optimized_primary_range_repair}} is flaky at least in {{cassandra-4.1}}, with a failure rate below 1%.
> I haven't seen the failure on Jenkins, only on [this CircleCI run|https://app.circleci.com/pipelines/github/adelapena/cassandra/1663/workflows/c63703e3-8c7a-42c6-981a-53cb59babe1f/jobs/17476] for CASSANDRA-17458.
> The failure can also be [reproduced in the multiplexer|https://app.circleci.com/pipelines/github/adelapena/cassandra/1666/workflows/6f925be1-c0df-4b2a-83e0-4612a46f32bd/jobs/17516], with 5 failures in 5000 iterations:
> {code:java}
> self = <transient_replication_test.TestTransientReplicationRepairStreamEntireSSTable object at 0x7f87951c77b8>
>     @pytest.mark.no_vnodes
>     def test_optimized_primary_range_repair(self):
>         """ optimized primary range incremental repair from full replica should remove data on node3 """
>         self._test_speculative_write_repair_cycle(primary_range=True,
>                                                   optimized_repair=True,
>                                                   repair_coordinator=self.node1,
> >                                                 expect_node3_data=False)
> transient_replication_test.py:523:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> transient_replication_test.py:473: in _test_speculative_write_repair_cycle
>     with tm(self.node1) as tm1, tm(self.node2) as tm2, tm(self.node3) as tm3:
> transient_replication_test.py:62: in __enter__
>     self.start()
> transient_replication_test.py:55: in start
>     self.jmx.start()
> tools/jmxutils.py:187: in start
>     subprocess.check_output(args, stderr=subprocess.STDOUT)
> /usr/lib/python3.6/subprocess.py:356: in check_output
>     **kwargs).stdout
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> input = None, timeout = None, check = True
> popenargs = (('/usr/lib/jvm/java-8-openjdk-amd64/bin/java', '-cp', '/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/cassandr...t/tools/../lib/jolokia-jvm-1.6.2-agent.jar', 'org.jolokia.jvmagent.client.AgentLauncher', '--host', '127.0.0.1', ...),)
> kwargs = {'stderr': -2, 'stdout': -1}
> process = <subprocess.Popen object at 0x7f877e977208>
> stdout = b"Couldn't start agent for PID 11637\nPossible reason could be that port '8778' is already occupied.\nPlease check the standard output of the target process for a detailed error message.\n"
> stderr = None, retcode = 1
>     def run(*popenargs, input=None, timeout=None, check=False, **kwargs):
>         """Run command with arguments and return a CompletedProcess instance.
>
>         The returned instance will have attributes args, returncode, stdout and
>         stderr. By default, stdout and stderr are not captured, and those attributes
>         will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.
>
>         If check is True and the exit code was non-zero, it raises a
>         CalledProcessError. The CalledProcessError object will have the return code
>         in the returncode attribute, and output & stderr attributes if those streams
>         were captured.
>
>         If timeout is given, and the process takes too long, a TimeoutExpired
>         exception will be raised.
>
>         There is an optional argument "input", allowing you to
>         pass a string to the subprocess's stdin. If you use this argument
>         you may not also use the Popen constructor's "stdin" argument, as
>         it will be used internally.
>
>         The other arguments are the same as for the Popen constructor.
>
>         If universal_newlines=True is passed, the "input" argument must be a
>         string and stdout/stderr in the returned object will be strings rather than
>         bytes.
>         """
>         if input is not None:
>             if 'stdin' in kwargs:
>                 raise ValueError('stdin and input arguments may not both be used.')
>             kwargs['stdin'] = PIPE
>
>         with Popen(*popenargs, **kwargs) as process:
>             try:
>                 stdout, stderr = process.communicate(input, timeout=timeout)
>             except TimeoutExpired:
>                 process.kill()
>                 stdout, stderr = process.communicate()
>                 raise TimeoutExpired(process.args, timeout, output=stdout,
>                                      stderr=stderr)
>             except:
>                 process.kill()
>                 process.wait()
>                 raise
>             retcode = process.poll()
>             if check and retcode:
>                 raise CalledProcessError(retcode, process.args,
> >                                        output=stdout, stderr=stderr)
> E           subprocess.CalledProcessError: Command '('/usr/lib/jvm/java-8-openjdk-amd64/bin/java', '-cp', '/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/cassandra/cassandra-dtest/tools/../lib/jolokia-jvm-1.6.2-agent.jar', 'org.jolokia.jvmagent.client.AgentLauncher', '--host', '127.0.0.1', 'start', '11637')' returned non-zero exit status 1.
> /usr/lib/python3.6/subprocess.py:438: CalledProcessError
> {code}
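The agent output captured in the traceback suggests that port 8778 may already have been occupied when the Jolokia agent was started. As a diagnostic aid only, here is a minimal sketch of a probe that checks whether a TCP port can currently be bound. The helper name port_is_free is hypothetical and is not part of cassandra-dtest or Jolokia; the host and port values are the ones shown in the traceback.

```python
import socket


def port_is_free(host, port):
    """Return True if a TCP socket can currently be bound to (host, port).

    Hypothetical diagnostic helper; not part of cassandra-dtest or Jolokia.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another socket already holds the port.
            return False
    return True


if __name__ == "__main__":
    # Host and port as they appear in the traceback.
    print(port_is_free("127.0.0.1", 8778))
```

A probe like this is inherently racy: the port can be taken between the check and the agent start, so it may help narrow down the cause of the flakiness but would not by itself prevent the failure.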