[ https://issues.apache.org/jira/browse/KAFKA-17084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863442#comment-17863442 ]
Chia-Ping Tsai commented on KAFKA-17084:
----------------------------------------

It passes in my local environment:

{code:java}
docker exec ducker01 bash -c "cd /opt/kafka-dev && ducktape --cluster-file /opt/kafka-dev/tests/docker/build/cluster.json ./tests/kafkatest/tests/core/network_degrade_test.py "
/usr/local/lib/python3.9/dist-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated and will be removed in a future release
  "class": algorithms.Blowfish,
[INFO:2024-07-05 18:26:22,698]: starting test run with session id 2024-07-05--001...
[INFO:2024-07-05 18:26:22,698]: running 4 tests...
[INFO:2024-07-05 18:26:22,699]: Triggering test 1 of 4...
[INFO:2024-07-05 18:26:22,705]: RunnerClient: Loading test {'directory': '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': 'network_degrade_test.py', 'cls_name': 'NetworkDegradeTest', 'method_name': 'test_latency', 'injected_args': {'task_name': 'latency-100-rate-1000', 'device_name': 'eth0', 'latency_ms': 50, 'rate_limit_kbit': 1000}}
[INFO:2024-07-05 18:26:22,712]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000: on run 1/1
[INFO:2024-07-05 18:26:22,713]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000: Setting up...
[INFO:2024-07-05 18:26:33,111]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000: Running...
[INFO:2024-07-05 18:26:53,298]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000: Tearing down...
[INFO:2024-07-05 18:27:02,302]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000: PASS
[WARNING - 2024-07-05 18:27:02,302 - runner_client - log - lineno:294]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000: Test requested 5 nodes, used only 4
[WARNING:2024-07-05 18:27:02,303]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000: Test requested 5 nodes, used only 4
[INFO:2024-07-05 18:27:02,305]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000: Data: None
[INFO:2024-07-05 18:27:02,313]: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[INFO:2024-07-05 18:27:02,313]: Triggering test 2 of 4...
[INFO:2024-07-05 18:27:02,320]: RunnerClient: Loading test {'directory': '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': 'network_degrade_test.py', 'cls_name': 'NetworkDegradeTest', 'method_name': 'test_latency', 'injected_args': {'task_name': 'latency-100', 'device_name': 'eth0', 'latency_ms': 50, 'rate_limit_kbit': 0}}
[INFO:2024-07-05 18:27:02,323]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0: on run 1/1
[INFO:2024-07-05 18:27:02,324]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0: Setting up...
[INFO:2024-07-05 18:27:13,280]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0: Running...
[INFO:2024-07-05 18:27:33,398]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0: Tearing down...
[INFO:2024-07-05 18:27:42,431]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0: PASS
[WARNING - 2024-07-05 18:27:42,432 - runner_client - log - lineno:294]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0: Test requested 5 nodes, used only 4
[WARNING:2024-07-05 18:27:42,433]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0: Test requested 5 nodes, used only 4
[INFO:2024-07-05 18:27:42,435]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0: Data: None
[INFO:2024-07-05 18:27:42,455]: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[INFO:2024-07-05 18:27:42,455]: Triggering test 3 of 4...
[INFO:2024-07-05 18:27:42,467]: RunnerClient: Loading test {'directory': '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': 'network_degrade_test.py', 'cls_name': 'NetworkDegradeTest', 'method_name': 'test_rate', 'injected_args': {'task_name': 'rate-1000-latency-50', 'device_name': 'eth0', 'latency_ms': 50, 'rate_limit_kbit': 1000000}}
[INFO:2024-07-05 18:27:42,471]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000: on run 1/1
[INFO:2024-07-05 18:27:42,472]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000: Setting up...
[INFO:2024-07-05 18:27:53,139]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000: Running...
[INFO:2024-07-05 18:28:16,992]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000: Tearing down...
[INFO:2024-07-05 18:28:26,030]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000: PASS
[WARNING - 2024-07-05 18:28:26,030 - runner_client - log - lineno:294]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000: Test requested 5 nodes, used only 4
[WARNING:2024-07-05 18:28:26,031]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000: Test requested 5 nodes, used only 4
[INFO:2024-07-05 18:28:26,033]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000: Data: None
[INFO:2024-07-05 18:28:26,048]: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[INFO:2024-07-05 18:28:26,048]: Triggering test 4 of 4...
[INFO:2024-07-05 18:28:26,059]: RunnerClient: Loading test {'directory': '/opt/kafka-dev/tests/kafkatest/tests/core', 'file_name': 'network_degrade_test.py', 'cls_name': 'NetworkDegradeTest', 'method_name': 'test_rate', 'injected_args': {'task_name': 'rate-1000', 'device_name': 'eth0', 'latency_ms': 0, 'rate_limit_kbit': 1000000}}
[INFO:2024-07-05 18:28:26,061]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000: on run 1/1
[INFO:2024-07-05 18:28:26,062]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000: Setting up...
[INFO:2024-07-05 18:28:37,441]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000: Running...
[INFO:2024-07-05 18:29:02,550]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000: Tearing down...
[INFO:2024-07-05 18:29:11,484]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000: PASS
[WARNING - 2024-07-05 18:29:11,485 - runner_client - log - lineno:294]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000: Test requested 5 nodes, used only 4
[WARNING:2024-07-05 18:29:11,486]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000: Test requested 5 nodes, used only 4
[INFO:2024-07-05 18:29:11,488]: RunnerClient: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000: Data: None
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.11.4
session_id: 2024-07-05--001
run time: 2 minutes 48.803 seconds
tests run: 4
passed: 4
flaky: 0
failed: 0
ignored: 0
================================================================================
test_id: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000
status: PASS
run time: 39.590 seconds
--------------------------------------------------------------------------------
test_id: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0
status: PASS
run time: 40.108 seconds
--------------------------------------------------------------------------------
test_id: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000-latency-50.device_name=eth0.latency_ms=50.rate_limit_kbit=1000000
status: PASS
run time: 43.559 seconds
--------------------------------------------------------------------------------
test_id: kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_rate.task_name=rate-1000.device_name=eth0.latency_ms=0.rate_limit_kbit=1000000
status: PASS
run time: 45.423 seconds
--------------------------------------------------------------------------------
{code}

> Network Degrade Test fails in System Tests
> ------------------------------------------
>
>                 Key: KAFKA-17084
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17084
>             Project: Kafka
>          Issue Type: Bug
>          Components: system tests
>    Affects Versions: 3.8.0
>            Reporter: Josep Prat
>            Priority: Critical
>         Attachments: TEST-kafka.xml
>
>
> Tests for NetworkDegradeTest fail consistently on the 3.8 branch.
>
> Tests failing are:
>
> {noformat}
> Module: kafkatest.tests.core.network_degrade_test
> Class: NetworkDegradeTest
> Method: test_latency
> Arguments:
> {
>   "device_name": "eth0",
>   "latency_ms": 50,
>   "rate_limit_kbit": 1000,
>   "task_name": "latency-100-rate-1000"
> }
> {noformat}
>
> and
>
> {noformat}
> Module: kafkatest.tests.core.network_degrade_test
> Class: NetworkDegradeTest
> Method: test_latency
> Arguments:
> {
>   "device_name": "eth0",
>   "latency_ms": 50,
>   "rate_limit_kbit": 0,
>   "task_name": "latency-100"
> }
> {noformat}
>
> Failure for the first one is:
> {noformat}
> RemoteCommandError({'ssh_config': {'host': 'worker30', 'hostname':
> '10.140.34.105', 'user': 'ubuntu', 'port': 22, 'password': None,
> 'identityfile': '/home/semaphore/kafka-overlay/semaphore-muckrake.pem'},
> 'hostname': 'worker30', 'ssh_hostname': '10.140.34.105', 'user': 'ubuntu',
> 'externally_routable_ip': '10.140.34.105', '_logger': <Logger
> kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100-rate-1000.device_name=eth0.latency_ms=50.rate_limit_kbit=1000-1790
> (DEBUG)>, 'os': 'linux', '_ssh_client': <paramiko.client.SSHClient object at
> 0x7f17a237dc10>, '_sftp_client': <paramiko.sftp_client.SFTPClient object at
> 0x7f17a2393910>, '_custom_ssh_exception_checks': None}, 'ping -i 1 -c 20
> worker21', 1, b'')
> Traceback (most recent call last):
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/tests/runner_client.py",
> line 184, in _do_run
>     data = self.run_test()
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/tests/runner_client.py",
> line 262, in run_test
>     return self.test_context.function(self.test)
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/mark/_mark.py",
> line 433, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File
> "/home/semaphore/kafka-overlay/kafka/tests/kafkatest/tests/core/network_degrade_test.py",
> line 66, in test_latency
>     for line in zk0.account.ssh_capture("ping -i 1 -c 20 %s" %
> zk1.account.hostname):
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/cluster/remoteaccount.py",
> line 680, in next
>     return next(self.iter_obj)
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/cluster/remoteaccount.py",
> line 347, in output_generator
>     raise RemoteCommandError(self, cmd, exit_status, stderr.read())
> ducktape.cluster.remoteaccount.RemoteCommandError: ubuntu@worker30: Command
> 'ping -i 1 -c 20 worker21' returned non-zero exit status 1.{noformat}
> And for the second one is:
> {noformat}
> RemoteCommandError({'ssh_config': {'host': 'worker28', 'hostname':
> '10.140.41.79', 'user': 'ubuntu', 'port': 22, 'password': None,
> 'identityfile': '/home/semaphore/kafka-overlay/semaphore-muckrake.pem'},
> 'hostname': 'worker28', 'ssh_hostname': '10.140.41.79', 'user': 'ubuntu',
> 'externally_routable_ip': '10.140.41.79', '_logger': <Logger
> kafkatest.tests.core.network_degrade_test.NetworkDegradeTest.test_latency.task_name=latency-100.device_name=eth0.latency_ms=50.rate_limit_kbit=0-1791
> (DEBUG)>, 'os': 'linux', '_ssh_client': <paramiko.client.SSHClient object at
> 0x7f17a1c7b7c0>, '_sftp_client': <paramiko.sftp_client.SFTPClient object at
> 0x7f17a1c7b2b0>, '_custom_ssh_exception_checks': None}, 'ping -i 1 -c 20
> worker27', 1, b'')
> Traceback (most recent call last):
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/tests/runner_client.py",
> line 184, in _do_run
>     data = self.run_test()
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/tests/runner_client.py",
> line 262, in run_test
>     return self.test_context.function(self.test)
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/mark/_mark.py",
> line 433, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File
> "/home/semaphore/kafka-overlay/kafka/tests/kafkatest/tests/core/network_degrade_test.py",
> line 66, in test_latency
>     for line in zk0.account.ssh_capture("ping -i 1 -c 20 %s" %
> zk1.account.hostname):
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/cluster/remoteaccount.py",
> line 680, in next
>     return next(self.iter_obj)
>   File
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/cluster/remoteaccount.py",
> line 347, in output_generator
>     raise RemoteCommandError(self, cmd, exit_status, stderr.read())
> ducktape.cluster.remoteaccount.RemoteCommandError: ubuntu@worker28: Command
> 'ping -i 1 -c 20 worker27' returned non-zero exit status 1.{noformat}
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)