[ 
https://issues.apache.org/jira/browse/KAFKA-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-8688.
-----------------------------------
       Resolution: Fixed
         Reviewer: Ismael Juma
    Fix Version/s: 2.4.0

> Upgrade system tests fail due to data loss with older message format
> --------------------------------------------------------------------
>
>                 Key: KAFKA-8688
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8688
>             Project: Kafka
>          Issue Type: Bug
>          Components: system tests
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>            Priority: Major
>             Fix For: 2.4.0
>
>
> System test failure for TestUpgrade/test_upgrade: from_kafka_version=0.9.0.1, 
> to_message_format_version=0.9.0.1, compression_types=.lz4
> {code:java}
> 3 acked message did not make it to the Consumer. They are: [33906, 33900, 
> 33903]. The first 3 missing messages were validated to ensure they are in 
> Kafka's data files. 3 were missing. This suggests data loss. Here are some of 
> the messages not found in the data files: [33906, 33900, 33903]
> Traceback (most recent call last):
>   File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py",
>  line 132, in run
>     data = self.run_test()
>   File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py",
>  line 189, in run_test
>     return self.test_context.function(self.test)
>   File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/mark/_mark.py",
>  line 428, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/core/upgrade_test.py",
>  line 136, in test_upgrade
>     self.run_produce_consume_validate(core_test_action=lambda: 
> self.perform_upgrade(from_kafka_version,
>   File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 112, in run_produce_consume_validate
>     self.validate()
>   File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/produce_consume_validate.py",
>  line 135, in validate
>     assert succeeded, error_msg
> AssertionError: 3 acked message did not make it to the Consumer. They are: 
> [33906, 33900, 33903]. The first 3 missing messages were validated to ensure 
> they are in Kafka's data files. 3 were missing. This suggests data loss. Here 
> are some of the messages not found in the data files: [33906, 33900, 33903]
> {code}
> Logs show:
>  # Broker 1 is leader of partition
>  # Broker 2 successfully fetches from offset 10947 and processes request
>  # Broker 2 sends fetch request to broker 1 for offset 10950
>  # Broker 1 sets its HW to 10950 and acknowledges produce requests up to the HW
>  # Broker 2 is elected leader
>  # Broker 2 truncates to its local HW of 10947 - 3 messages are lost
> This data loss is a known issue that was fixed under KIP-101. But since it 
> can still happen with older message formats, we should update the upgrade 
> tests to cope with some data loss.
>   
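
The sequence in the log steps above can be replayed with a small toy model. This is only an illustrative sketch, not Kafka code: the `Replica` class and `replay_scenario` function are hypothetical names, the model assumes both replicas start in agreement at offset 10947, and it assumes a replica truncates its log to its own high watermark (HW) on the leadership change, as the steps describe.

```python
# Toy model of the truncation sequence described in the issue.
# Replica and replay_scenario are hypothetical names, not Kafka APIs.

class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []            # list index == message offset
        self.high_watermark = 0  # offsets below this count as committed

    @property
    def log_end_offset(self):
        return len(self.log)


def replay_scenario():
    leader = Replica("broker-1")
    follower = Replica("broker-2")

    # Assumed starting point: both replicas hold offsets 0..10946, HW = 10947.
    leader.log = list(range(10947))
    follower.log = list(range(10947))
    leader.high_watermark = follower.high_watermark = 10947

    # A producer appends three messages to the leader (offsets 10947..10949).
    leader.log.extend([10947, 10948, 10949])

    # Broker 2 fetches from offset 10947. It copies the new messages, but the
    # response carries the leader's HW at that moment, which is still 10947.
    fetch_offset = follower.log_end_offset
    follower.log.extend(leader.log[fetch_offset:])
    follower.high_watermark = min(leader.high_watermark,
                                  follower.log_end_offset)

    # Broker 2's next fetch (offset 10950) tells the leader it has caught up.
    # Broker 1 advances its HW to 10950 and acknowledges the produce requests.
    next_fetch_offset = follower.log_end_offset
    leader.high_watermark = min(leader.log_end_offset, next_fetch_offset)
    acked = list(range(10947, leader.high_watermark))

    # Broker 2 is elected leader before it learns the new HW, and truncates
    # its log to its local HW of 10947.
    del follower.log[follower.high_watermark:]
    lost = [offset for offset in acked if offset >= follower.log_end_offset]
    return leader.high_watermark, follower.high_watermark, lost


if __name__ == "__main__":
    leader_hw, follower_hw, lost = replay_scenario()
    print("leader HW:", leader_hw)
    print("follower HW:", follower_hw)
    print("acked offsets lost:", lost)
```

In this model the three acknowledged offsets are dropped because the follower's HW lags the leader's by one fetch round trip. The offsets here are log positions and are unrelated to the message values reported in the assertion error.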



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
