[
https://issues.apache.org/jira/browse/KAFKA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402740#comment-15402740
]
Jun Rao commented on KAFKA-4009:
--------------------------------
[~aganesan], thanks for reporting this. In your test, did the producer use
acks=all? Also, did N1 detect the corruption during log recovery when
restarting the broker, or while appending to the log?
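(For reference, acks=all is just a producer config on the 0.9 Java producer; a
minimal sketch, with the broker addresses and record contents as placeholders:)

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AcksAllProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "N1:9092,N2:9092,N3:9092"); // placeholder brokers
        props.put("acks", "all"); // wait for acknowledgement from the full ISR
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);
        // send() returns a Future; get() blocks until the broker acknowledges the write
        producer.send(new ProducerRecord<>("my-topic1", "key", "value")).get();
        producer.close();
    }
}

With min.insync.replicas=3, acks=all would mean each acknowledged write was on
all three replicas before the acknowledgement, which is why the acks setting
matters for whether the lost data was considered committed.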
> Data corruption or EIO leads to data loss
> -----------------------------------------
>
> Key: KAFKA-4009
> URL: https://issues.apache.org/jira/browse/KAFKA-4009
> Project: Kafka
> Issue Type: Bug
> Components: log
> Affects Versions: 0.9.0.0
> Reporter: Aishwarya Ganesan
>
> I have a 3-node Kafka cluster (N1, N2, and N3) with
> log.flush.interval.messages=1, min.insync.replicas=3 and
> unclean.leader.election.enable=false, and a single ZooKeeper node. My workload
> inserts a few messages, and on completion of the workload the
> recovery-point-offset-checkpoint file reflects the latest offset of the
> committed messages.
> I have a small testing tool that drives distributed applications into corner
> cases by simulating error conditions such as EIO, ENOSPC, and EDQUOT that can
> be encountered in modern file systems such as ext4. The tool also simulates
> on-disk silent data corruption.
> When I introduce silent data corruption on a node (say N1) in the ISR, Kafka
> detects the corruption using the checksum and discards the log entries from
> that point onwards. Even though N1 has lost log entries (while the
> recovery-point-offset-checkpoint file on N1 still indicates the latest
> offsets), N1 is allowed to become the leader because it is in the ISR. Also,
> the other nodes N2 and N3 crash with the following log message:
> FATAL [ReplicaFetcherThread-0-1], Halting because log truncation is not
> allowed for topic my-topic1, Current leader 1's latest offset 0 is less than
> replica 3's latest offset 1 (kafka.server.ReplicaFetcherThread)
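> (To illustrate the detection step: at the record level, corruption is caught
> by recomputing the checksum over the stored bytes and comparing it with the
> checksum written alongside the record. A conceptual sketch in Java, not
> Kafka's actual log code:)
>
> import java.util.zip.CRC32;
>
> public class ChecksumCheck {
>     // Recompute the CRC over the record bytes and compare it with the value
>     // stored when the entry was written. Per the behavior described above,
>     // the log is ignored from the first mismatching entry onwards.
>     public static boolean isCorrupted(byte[] recordBytes, long storedCrc) {
>         CRC32 crc = new CRC32();
>         crc.update(recordBytes);
>         return crc.getValue() != storedCrc;
>     }
> }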
> The end result is that a silent data corruption leads to data loss: querying
> the cluster returns only the messages before the corrupted entry. Note that
> the cluster at this point consists only of N1. This situation could have been
> avoided if N1, the node that had to discard the log entry, had not been
> allowed to become the leader. This scenario would not happen with
> majority-based leader election, because the other nodes (N2 or N3) would have
> denied N1 their vote, as N1's log is incomplete compared to theirs.
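> (As a concrete way to observe the loss described above, one can consume the
> topic from the beginning; with only N1 left, everything from the corrupted
> entry onwards is missing. A minimal sketch with the 0.9 Java consumer; the
> broker address, group id, and topic name are placeholders:)
>
> import java.util.Arrays;
> import java.util.Properties;
> import org.apache.kafka.clients.consumer.ConsumerRecord;
> import org.apache.kafka.clients.consumer.ConsumerRecords;
> import org.apache.kafka.clients.consumer.KafkaConsumer;
>
> public class ReadBack {
>     public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "N1:9092");   // placeholder broker
>         props.put("group.id", "verify-loss");        // placeholder group id
>         props.put("auto.offset.reset", "earliest");  // read from the start of the log
>         props.put("key.deserializer",
>                   "org.apache.kafka.common.serialization.StringDeserializer");
>         props.put("value.deserializer",
>                   "org.apache.kafka.common.serialization.StringDeserializer");
>
>         KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
>         consumer.subscribe(Arrays.asList("my-topic1"));
>         // Only the messages before the corrupted entry come back once N1 is
>         // the sole remaining broker.
>         ConsumerRecords<String, String> records = consumer.poll(10000);
>         for (ConsumerRecord<String, String> r : records)
>             System.out.println(r.offset() + ": " + r.value());
>         consumer.close();
>     }
> }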
> If this scenario happens on any of the followers, the follower discards the
> corrupted entry, re-fetches the data from the leader, and there is no data
> loss.
> Encountering an EIO from the file system for a particular block has the same
> consequence: data loss when querying the cluster, while the remaining two
> nodes crash. An EIO on read can be returned for a variety of reasons,
> including a latent sector error on one or more sectors of the disk.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)