[ 
https://issues.apache.org/jira/browse/KAFKA-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17331194#comment-17331194
 ] 

Jose Armando Garcia Sancio commented on KAFKA-10800:
----------------------------------------------------

{quote}I think it's easy to understand 3, but I'm curious why we would need to 
do 1 and 2. I guess there is some benefit or restriction that I haven't 
realized? Thanks!{quote}

While operating this feature, users may end up with the contents of a snapshot 
on disk but without the original file name. In that case we may want to 
identify the snapshot id for that file. We can do that if every batch has 
`endOffset - 1` as its baseOffset and carries the `epoch`. Actually, this may 
not be easy to do with the current code since we use `BatchAccumulator`. I am 
okay with doing this as part of another PR.
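
As a rough illustration of the idea above, here is a minimal Java sketch of 
recovering a snapshot id from a single batch header. It assumes the convention 
proposed in this comment (every batch has `endOffset - 1` as its baseOffset and 
carries the snapshot `epoch`). `RecoveredSnapshotId` and `SnapshotIdRecovery` 
are hypothetical names used only for illustration; they are not existing Kafka 
classes.

```java
// Hypothetical value type for a snapshot id; not Kafka's actual class.
final class RecoveredSnapshotId {
    final long endOffset; // exclusive end offset covered by the snapshot
    final int epoch;      // epoch stamped on the snapshot's batches

    RecoveredSnapshotId(long endOffset, int epoch) {
        this.endOffset = endOffset;
        this.epoch = epoch;
    }

    @Override
    public String toString() {
        return "(endOffset=" + endOffset + ", epoch=" + epoch + ")";
    }
}

// Hypothetical helper: rebuilds the snapshot id from one batch header.
final class SnapshotIdRecovery {
    private SnapshotIdRecovery() {}

    // Assumes the proposed convention: baseOffset == endOffset - 1.
    static RecoveredSnapshotId fromBatchHeader(long baseOffset, int batchEpoch) {
        if (batchEpoch < 0) {
            throw new IllegalArgumentException("Negative epoch: " + batchEpoch);
        }
        // Invert the convention to get the exclusive end offset.
        return new RecoveredSnapshotId(Math.addExact(baseOffset, 1L), batchEpoch);
    }
}
```

Under that assumption, reading the header of any one batch in the file would 
be enough to reconstruct the id, regardless of what the file is named.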

> Validate the snapshot id when the state machine creates a snapshot
> ------------------------------------------------------------------
>
>                 Key: KAFKA-10800
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10800
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: replication
>            Reporter: Jose Armando Garcia Sancio
>            Assignee: Haoran Xuan
>            Priority: Major
>
> When the state machine attempts to create a snapshot writer we should 
> validate that the following is true:
>  # The end offset and epoch of the snapshot are less than the high-watermark.
>  # The end offset and epoch of the snapshot are valid based on the leader 
> epoch cache.
> Note that this validation should not be performed when the raft client 
> creates the snapshot writer because in that case the local log is out of date 
> and the follower should trust the snapshot id sent by the partition leader.
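
The two checks in the issue description could be sketched in Java roughly as 
follows. This is only an illustration under stated assumptions: the leader 
epoch cache is modeled as a map from epoch to the first offset written in that 
epoch, the snapshot end offset and the high-watermark are both treated as 
exclusive bounds (so an end offset equal to the high-watermark is accepted, 
although the description says "less than"), and the snapshot epoch is taken to 
be the epoch of the last record it covers. `SnapshotIdValidator` and `isValid` 
are hypothetical names, not Kafka APIs.

```java
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical validator; a sketch, not the raft client's implementation.
final class SnapshotIdValidator {
    private SnapshotIdValidator() {}

    // epochStartOffsets: epoch -> first offset written in that epoch (assumed model
    // of the leader epoch cache, with start offsets increasing with the epoch).
    static boolean isValid(long endOffset, int epoch, long highWatermark,
                           NavigableMap<Integer, Long> epochStartOffsets) {
        // Check 1: the snapshot must not extend past the high-watermark.
        if (endOffset > highWatermark) {
            return false;
        }
        // Check 2: the epoch must match the epoch of the last covered record.
        long lastOffset = endOffset - 1;
        for (Map.Entry<Integer, Long> e : epochStartOffsets.descendingMap().entrySet()) {
            if (e.getValue() <= lastOffset) {
                // Newest epoch that started at or before lastOffset owns that record.
                return e.getKey() == epoch;
            }
        }
        // No epoch covers lastOffset (for example an empty cache or endOffset == 0).
        return false;
    }
}
```

In line with the note in the description, a check like this would apply only 
when the state machine creates the snapshot writer, not when the raft client 
creates it for a snapshot fetched from the leader.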



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
