[ https://issues.apache.org/jira/browse/KAFKA-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569563#comment-17569563 ]
Jun Rao edited comment on KAFKA-13700 at 7/21/22 5:04 PM:
----------------------------------------------------------

Interesting. So the CRC validation passes with DumpLogSegments, but fails in ReplicaFetcherThread? This is a bit weird since the validation is the same in both places. One possibility is that there is some issue in the network. Is the error in ReplicaFetcherThread persistent or transient? As Divij asked earlier, does this error occur in other topic partitions?


was (Author: junrao):
Interesting. So the CRC validation passes with DumpLogSegments, but fails in ReplicaFetcherThread? This is a bit weird since the validation is the same in both places. Is the error in ReplicaFetcherThread persistent or transient?

> Kafka reporting CorruptRecordException exception
> ------------------------------------------------
>
>                 Key: KAFKA-13700
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13700
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.4.1
>         Environment: ubuntu 16.04
> kafka 2.4
>            Reporter: Uday Bhaskar
>            Priority: Critical
>
> In our Kafka cluster, a couple of partitions in __consumer_offsets and 1
> regular topic are hitting a data corruption issue while replicas try to read
> from the leader. Similar messages appear for other partitions as well.
>
> [2022-02-28 21:57:29,941] ERROR [ReplicaFetcher replicaId=6, leaderId=1,
> fetcherId=2] Found invalid messages during fetch for partition
> __consumer_offsets-10 offset 108845487 (kafka.server.ReplicaFetcherThread)
> org.apache.kafka.common.errors.CorruptRecordException: Record is corrupt
> (stored crc = 1524235439) in topic partition __consumer_offsets-10
>
> Another topic partition with the same error:
> [2022-02-28 22:17:00,235] ERROR [ReplicaFetcher replicaId=6, leaderId=1,
> fetcherId=0] Found invalid messages during fetch for partition
> px-11351-xxxxxx-a56c642-0 offset 11746872 (kafka.server.ReplicaFetcherThread)
> org.apache.kafka.common.errors.CorruptRecordException: Record is corrupt
> (stored crc = 475179617) in topic partition px-11351-xxxxxx-a56c642-0.
>
> I have checked all infrastructure (disk, network, and system) for errors
> and found nothing. I am not sure why this is happening or how to
> troubleshoot it.
>
> Below is the output for the message from DumpLogSegments:
>
> $ /opt/ns/kafka/bin/kafka-run-class.sh kafka.tools.DumpLogSegments
> --verify-index-only --deep-iteration --files ./00000000000011324034.log |
> grep 11746872
> baseOffset: 11746872 lastOffset: 11746872 count: 1 baseSequence: 50278
> lastSequence: 50278 producerId: 17035 producerEpoch: 0 partitionLeaderEpoch:
> 8 isTransactional: false isControl: false position: 252530345 CreateTime:
> 1645886348240 size: 647 magic: 2 compresscodec: SNAPPY crc: 475179617
> isvalid: true
> | offset: 11746872 CreateTime: 1645886348240 keysize: 54 valuesize: 637
> sequence: 50278 headerKeys: []

--
This message was sent by Atlassian Jira
(v8.20.10#820010)