[ https://issues.apache.org/jira/browse/KAFKA-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897274#comment-16897274 ]
Rajini Sivaram commented on KAFKA-8688:
---------------------------------------

PR: https://github.com/apache/kafka/pull/7102

> Upgrade system tests fail due to data loss with older message format
> --------------------------------------------------------------------
>
>                 Key: KAFKA-8688
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8688
>             Project: Kafka
>          Issue Type: Bug
>          Components: system tests
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>            Priority: Major
>
> System test failure for TestUpgrade/test_upgrade: from_kafka_version=0.9.0.1, to_message_format_version=0.9.0.1, compression_types=.lz4
> {code:java}
> 3 acked message did not make it to the Consumer. They are: [33906, 33900, 33903]. The first 3 missing messages were validated to ensure they are in Kafka's data files. 3 were missing. This suggests data loss. Here are some of the messages not found in the data files: [33906, 33900, 33903]
> Traceback (most recent call last):
>   File "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py", line 132, in run
>     data = self.run_test()
>   File "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py", line 189, in run_test
>     return self.test_context.function(self.test)
>   File "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/mark/_mark.py", line 428, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/core/upgrade_test.py", line 136, in test_upgrade
>     self.run_produce_consume_validate(core_test_action=lambda: self.perform_upgrade(from_kafka_version,
>   File "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 112, in run_produce_consume_validate
>     self.validate()
>   File "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 135, in validate
>     assert succeeded, error_msg
> AssertionError: 3 acked message did not make it to the Consumer. They are: [33906, 33900, 33903]. The first 3 missing messages were validated to ensure they are in Kafka's data files. 3 were missing. This suggests data loss. Here are some of the messages not found in the data files: [33906, 33900, 33903]
> {code}
> Logs show:
> # Broker 1 is the leader of the partition
> # Broker 2 successfully fetches from offset 10947 and processes the response
> # Broker 2 sends a fetch request to broker 1 for offset 10950
> # Broker 1 sets its HW to 10950 and acknowledges produce requests up to the HW
> # Broker 2 is elected leader
> # Broker 2 truncates to its local HW of 10947, so the 3 acknowledged messages are lost
> This data loss is a known issue that was fixed under KIP-101. But since it can still happen with older message formats, we should update the upgrade tests to cope with some data loss.