[ https://issues.apache.org/jira/browse/KAFKA-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769071#comment-17769071 ]
Divij Vaidya commented on KAFKA-15169:
--------------------------------------

Hey Arpit,

Asserting the sanity of the index (or of any file on disk) is an expensive operation. Hence, we have to strike a balance between when we assert sanity and when we trust that the file on disk is not corrupted. For logs, we perform a CRC checksum while storing data on disk, and after that the assumption is that files on disk will not get corrupted, i.e. we consider transfer over the network a possible culprit for corruption, but we don't consider that a file sitting on disk will get corrupted.

Extending the same reasoning to this cache: when we fetch the index files from the remote store, they may be corrupted, so we perform a sanity check, but once they are stored on disk, we assume that they will not be corrupted.

The case you mention assumes that a file sitting on disk may get corrupted, but that is a risk we choose to accept in Kafka, given the tradeoff mentioned above. Hence, the case you mentioned is an acceptable risk by design.

> Add tests for RemoteIndexCache
> ------------------------------
>
>                 Key: KAFKA-15169
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15169
>             Project: Kafka
>          Issue Type: Test
>            Reporter: Satish Duggana
>            Assignee: Arpit Goyal
>            Priority: Major
>              Labels: KIP-405
>             Fix For: 3.7.0
>
>
> Follow-up from
> https://github.com/apache/kafka/pull/13275#discussion_r1257490978
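
To make the policy in the comment concrete, here is a minimal sketch of a "verify on fetch, trust on disk" cache. It is not Kafka's RemoteIndexCache: the class VerifyOnFetchCache, the RemoteStore interface, and its expectedCrc method are hypothetical names invented for illustration, and a CRC32 comparison stands in for whatever sanity check the real index classes perform.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical sketch of the tradeoff described above; not Kafka code.
final class VerifyOnFetchCache {

    // Hypothetical stand-in for a remote store that also knows the
    // checksum recorded when the file was uploaded.
    interface RemoteStore {
        byte[] fetch(String key) throws IOException;
        long expectedCrc(String key);
    }

    private final Path cacheDir;
    private final RemoteStore remote;

    VerifyOnFetchCache(Path cacheDir, RemoteStore remote) {
        this.cacheDir = cacheDir;
        this.remote = remote;
    }

    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    // Assumes key is a plain file name with no path separators.
    byte[] get(String key) throws IOException {
        Path local = cacheDir.resolve(key);
        if (Files.exists(local)) {
            // Already on disk: trusted as-is, no second sanity check.
            // On-disk corruption is the risk accepted by design.
            return Files.readAllBytes(local);
        }
        // Came over the network: this is the only place we validate.
        byte[] data = remote.fetch(key);
        if (crc32(data) != remote.expectedCrc(key)) {
            throw new IOException("Sanity check failed for " + key);
        }
        Files.write(local, data);
        return data;
    }
}
```

With this shape, the expensive check runs once per remote fetch rather than on every read. A file that is modified on disk after it was cached would be returned unchecked, which corresponds to the case raised in the question.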