[ 
https://issues.apache.org/jira/browse/CASSANDRA-17685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17685:
-----------------------------------------
       Complexity: Normal
    Discovered By: User Report
         Severity: Normal
           Status: Open  (was: Triage Needed)

> Test failure: 
> transient_replication_test.py::TestTransientReplicationRepairStreamEntireSSTable::test_optimized_primary_range_repair
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17685
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17685
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/python
>            Reporter: Andres de la Peña
>            Priority: Normal
>             Fix For: 4.1.x
>
>
> The Python dtest 
> {{transient_replication_test.py::TestTransientReplicationRepairStreamEntireSSTable::test_optimized_primary_range_repair}}
>  is flaky at least in {{{}cassandra-4.1{}}}, with a flakiness < 1%.
> I haven't seen the failure on Jenkins but on [this CircleCI 
> run|https://app.circleci.com/pipelines/github/adelapena/cassandra/1663/workflows/c63703e3-8c7a-42c6-981a-53cb59babe1f/jobs/17476]
>  for CASSANDRA-17458.
> The failure can also be [reproduced in the 
> multiplexer|https://app.circleci.com/pipelines/github/adelapena/cassandra/1666/workflows/6f925be1-c0df-4b2a-83e0-4612a46f32bd/jobs/17516],
>  with 5 failures in 5000 iterations:
> {code:java}
> self = 
> <transient_replication_test.TestTransientReplicationRepairStreamEntireSSTable 
> object at 0x7f87951c77b8>
>     @pytest.mark.no_vnodes
>     def test_optimized_primary_range_repair(self):
>         """ optimized primary range incremental repair from full replica 
> should remove data on node3 """
>         self._test_speculative_write_repair_cycle(primary_range=True,
>                                                   optimized_repair=True,
>                                                   
> repair_coordinator=self.node1,
> >                                                 expect_node3_data=False)
> transient_replication_test.py:523: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> transient_replication_test.py:473: in _test_speculative_write_repair_cycle
>     with tm(self.node1) as tm1, tm(self.node2) as tm2, tm(self.node3) as tm3:
> transient_replication_test.py:62: in __enter__
>     self.start()
> transient_replication_test.py:55: in start
>     self.jmx.start()
> tools/jmxutils.py:187: in start
>     subprocess.check_output(args, stderr=subprocess.STDOUT)
> /usr/lib/python3.6/subprocess.py:356: in check_output
>     **kwargs).stdout
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> input = None, timeout = None, check = True
> popenargs = (('/usr/lib/jvm/java-8-openjdk-amd64/bin/java', '-cp', 
> '/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/cassandr...t/tools/../lib/jolokia-jvm-1.6.2-agent.jar',
>  'org.jolokia.jvmagent.client.AgentLauncher', '--host', '127.0.0.1', ...),)
> kwargs = {'stderr': -2, 'stdout': -1}
> process = <subprocess.Popen object at 0x7f877e977208>
> stdout = b"Couldn't start agent for PID 11637\nPossible reason could be that 
> port '8778' is already occupied.\nPlease check the standard output of the 
> target process for a detailed error message.\n"
> stderr = None, retcode = 1
>     def run(*popenargs, input=None, timeout=None, check=False, **kwargs):
>         """Run command with arguments and return a CompletedProcess instance.
>     
>         The returned instance will have attributes args, returncode, stdout 
> and
>         stderr. By default, stdout and stderr are not captured, and those 
> attributes
>         will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture 
> them.
>     
>         If check is True and the exit code was non-zero, it raises a
>         CalledProcessError. The CalledProcessError object will have the 
> return code
>         in the returncode attribute, and output & stderr attributes if those 
> streams
>         were captured.
>     
>         If timeout is given, and the process takes too long, a TimeoutExpired
>         exception will be raised.
>     
>         There is an optional argument "input", allowing you to
>         pass a string to the subprocess's stdin.  If you use this argument
>         you may not also use the Popen constructor's "stdin" argument, as
>         it will be used internally.
>     
>         The other arguments are the same as for the Popen constructor.
>     
>         If universal_newlines=True is passed, the "input" argument must be a
>         string and stdout/stderr in the returned object will be strings 
> rather than
>         bytes.
>         """
>         if input is not None:
>             if 'stdin' in kwargs:
>                 raise ValueError('stdin and input arguments may not both be 
> used.')
>             kwargs['stdin'] = PIPE
>     
>         with Popen(*popenargs, **kwargs) as process:
>             try:
>                 stdout, stderr = process.communicate(input, timeout=timeout)
>             except TimeoutExpired:
>                 process.kill()
>                 stdout, stderr = process.communicate()
>                 raise TimeoutExpired(process.args, timeout, output=stdout,
>                                      stderr=stderr)
>             except:
>                 process.kill()
>                 process.wait()
>                 raise
>             retcode = process.poll()
>             if check and retcode:
>                 raise CalledProcessError(retcode, process.args,
> >                                        output=stdout, stderr=stderr)
> E               subprocess.CalledProcessError: Command 
> '('/usr/lib/jvm/java-8-openjdk-amd64/bin/java', '-cp', 
> '/usr/lib/jvm/java-8-openjdk-amd64/lib/tools.jar:/home/cassandra/cassandra-dtest/tools/../lib/jolokia-jvm-1.6.2-agent.jar',
>  'org.jolokia.jvmagent.client.AgentLauncher', '--host', '127.0.0.1', 'start', 
> '11637')' returned non-zero exit status 1.
> /usr/lib/python3.6/subprocess.py:438: CalledProcessError
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to