sanjiv marathe created KAFKA-8739:
-------------------------------------

             Summary: rejoining broker fails to sanity check existing log 
segments
                 Key: KAFKA-8739
                 URL: https://issues.apache.org/jira/browse/KAFKA-8739
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 2.3.0
            Reporter: sanjiv marathe


Kafka claims it can be used as a storage system, but the following scenario 
shows otherwise.
 # Consider a topic with a single partition and replication factor 2 on two 
brokers, A and B, with A as the leader.
 # Broker B fails due to disk sector errors. The sysadmin fixes the issue and 
brings B back up after a few minutes, but a few log segments are lost or corrupted.
 # Broker B catches up on the messages it missed by fetching them from the 
leader A, but does not detect the damage to its earlier log segments.
 # Broker A fails and B becomes the leader.
 # A new consumer requests messages from the beginning. Broker B cannot deliver 
the messages that belonged to the corrupted log segments.
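
The scenario above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Kafka code: each log is a plain dict from offset to message, and the hypothetical helper `catch_up` copies only offsets at or beyond the follower's log end, which is the behavior the report describes.

```python
# Toy model only: a "log" is a dict mapping offset -> message.
# catch_up and unreadable_offsets are hypothetical helpers, not Kafka APIs.

def catch_up(follower_log, leader_log):
    """Copy only the offsets at or beyond the follower's log end offset."""
    log_end = max(follower_log, default=-1) + 1
    for offset in sorted(leader_log):
        if offset >= log_end:
            follower_log[offset] = leader_log[offset]
    return follower_log

def unreadable_offsets(log, upto):
    """Return the offsets in [0, upto) that cannot be served from this log."""
    return [o for o in range(upto) if o not in log]

# Step 1: A is the leader, B is a fully caught-up follower.
leader_a = {o: f"msg-{o}" for o in range(10)}
follower_b = dict(leader_a)

# Step 2: B loses an older segment (offsets 3..5) while it is down,
# and A keeps accepting new messages in the meantime.
for o in (3, 4, 5):
    del follower_b[o]
for o in range(10, 15):
    leader_a[o] = f"msg-{o}"

# Step 3: B catches up from its log end, so the hole is never noticed.
catch_up(follower_b, leader_a)

# Steps 4 and 5: B becomes the leader and a consumer reads from the beginning.
missing = unreadable_offsets(follower_b, 15)
print(missing)  # [3, 4, 5]
```

In this model the follower is fully caught up at the tail of the log, yet the gap in the middle only surfaces once a consumer reads from offset 0.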

Suggested solution

Immediately after coming up, a broker should run a sanity check on its log 
segments, e.g. a CRC check. Any corrupted or lost segments should be refetched 
from the leader.
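
A rough sketch of what such a startup check could look like is given below. The segment layout here (a 4-byte length, a 4-byte CRC32, then the payload) is an assumption made up for illustration and is not Kafka's on-disk format; `zlib.crc32` stands in for whatever checksum the broker actually uses, and `first_corrupt_position` is a hypothetical helper name.

```python
import struct
import zlib

# Assumed toy record layout: big-endian payload length, CRC32 of payload, payload.
HEADER = struct.Struct(">II")

def encode_record(payload: bytes) -> bytes:
    """Serialize one record in the toy layout."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload

def first_corrupt_position(segment: bytes):
    """Return the byte position of the first record that fails the check,
    or None if every record in the segment verifies."""
    pos = 0
    while pos < len(segment):
        if pos + HEADER.size > len(segment):
            return pos  # header is truncated
        length, expected_crc = HEADER.unpack_from(segment, pos)
        start = pos + HEADER.size
        payload = segment[start:start + length]
        if len(payload) != length or zlib.crc32(payload) != expected_crc:
            return pos  # payload is truncated or its checksum does not match
        pos = start + length
    return None

good = b"".join(encode_record(m) for m in (b"alpha", b"beta", b"gamma"))

# Simulate a sector error by flipping one payload byte in the second record.
damaged = bytearray(good)
damaged[HEADER.size + len(b"alpha") + HEADER.size] ^= 0xFF

print(first_corrupt_position(good))            # None
print(first_corrupt_position(bytes(damaged)))  # 13
```

A broker following the suggestion would treat everything from the first failing position onward as untrusted and refetch that range from the leader, rather than resuming from its log end.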



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
