smjn opened a new pull request, #17580:
URL: https://github.com/apache/kafka/pull/17580

   In the `ShareCoordinatorShard.readState` method, we store the `leaderEpoch` 
from the request directly in the `leaderEpochMap` timeline hashmap if it is 
the highest `leaderEpoch` seen so far for a specific share partition. This is 
incorrect, however. 
   
   The coordinator runtime only allows updating timeline data structures via 
the `replay` method. The `replay` method is called by the runtime whenever it 
persists some records into the topic it is managing.
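
   The constraint above can be illustrated with a toy model. This is a sketch 
only: `ToyRuntimeSketch`, `ToyRecord`, `scheduleWrite` and `read` are 
hypothetical names and do not mirror the real coordinator runtime API, and the 
real runtime's ordering, snapshots and failure handling are not modelled. The 
point it shows is that a write operation only proposes records, and `replay` is 
the single place where state changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Toy model only: all names are hypothetical, not Kafka's actual classes.
class ToyRuntimeSketch {
    /** A key/value pair standing in for a record in the managed topic. */
    static final class ToyRecord {
        final String key;
        final int value;

        ToyRecord(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    // Stands in for the topic the runtime manages.
    private final List<ToyRecord> log = new ArrayList<>();
    // Stands in for a timeline hashmap such as leaderEpochMap.
    private final Map<String, Integer> state = new HashMap<>();

    /** A write operation only proposes records; it never touches state itself. */
    void scheduleWrite(Supplier<List<ToyRecord>> operation) {
        for (ToyRecord record : operation.get()) {
            log.add(record);
            replay(record);
        }
    }

    /** The single place where state is mutated. */
    private void replay(ToyRecord record) {
        state.put(record.key, record.value);
    }

    /** A read operation only observes state; it returns null if the key is unknown. */
    Integer read(String key) {
        return state.get(key);
    }

    int logSize() {
        return log.size();
    }
}
```

   In this model a read path has no way to change `state`, which is the 
property the original `readState` code violated.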
   
   To remedy this, the PR adds code to the `ShareCoordinatorService.readState` 
method that issues a `runtime.scheduleWriteOperation` call if the incoming read 
state request holds a valid `leaderEpoch` value (not -1). At that point we 
cannot tell whether the `leaderEpoch` in the read request is the highest so far 
for the given share partition: the timeline data structures are housed in 
`ShareCoordinatorShard`, so they cannot be looked up from the 
`ShareCoordinatorService` class, and a lookup also needs an offset, which we 
won't have.
   
   Hence, the new code issues a `scheduleWriteOperation` call if `leaderEpoch` 
is not -1, and the supplied callback method 
`ShareCoordinatorShard.writeLeaderEpoch` (also part of this PR) generates a 
record if the `leaderEpoch` is the highest seen so far, no record if it is the 
same as the last seen, or an error if the epoch is old. A separate method is 
needed because we want to look only at `leaderEpoch` and ignore the other data 
fields. We also want the optimization of not generating a record if the 
`leaderEpoch` is -1 or equal to the highest seen so far.
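
   The three outcomes described above can be sketched as a pure decision 
function. This is an illustration, not the code in the PR: the class 
`WriteLeaderEpochSketch`, the `Outcome` enum and the `decide` signature are 
hypothetical, and treating a share partition with no stored epoch as "generate 
a record" is an assumption.

```java
import java.util.OptionalInt;

// Sketch only: hypothetical names, not the actual ShareCoordinatorShard code.
class WriteLeaderEpochSketch {
    static final int NO_LEADER_EPOCH = -1;

    enum Outcome { NEW_RECORD, NO_RECORD, ERROR_OLD_EPOCH }

    /**
     * Decides what the write callback should produce, looking only at the
     * leader epoch and ignoring every other field of the request.
     */
    static Outcome decide(int requestEpoch, OptionalInt highestSeen) {
        if (requestEpoch == NO_LEADER_EPOCH) {
            return Outcome.NO_RECORD;          // nothing valid to persist
        }
        if (!highestSeen.isPresent()) {
            return Outcome.NEW_RECORD;         // assumption: first epoch seen is persisted
        }
        int highest = highestSeen.getAsInt();
        if (requestEpoch > highest) {
            return Outcome.NEW_RECORD;         // new highest epoch
        }
        if (requestEpoch == highest) {
            return Outcome.NO_RECORD;          // same as last seen, skip the write
        }
        return Outcome.ERROR_OLD_EPOCH;        // stale epoch
    }

    public static void main(String[] args) {
        System.out.println(decide(5, OptionalInt.of(3)));
        System.out.println(decide(3, OptionalInt.of(3)));
        System.out.println(decide(2, OptionalInt.of(3)));
    }
}
```

   Keeping the decision free of side effects fits the rule that the timeline 
hashmap itself is only updated later, in `replay`, once a record has been 
generated.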
   
   Once the write completes, `ShareCoordinatorService.readState` issues the 
`runtime.scheduleReadOperation` call, the same as before.
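
   The write-then-read ordering can be sketched with `CompletableFuture` 
chaining. This is a minimal illustration under stated assumptions: the two 
suppliers are hypothetical stand-ins for the `runtime.scheduleWriteOperation` 
and `runtime.scheduleReadOperation` calls, and letting a failed write propagate 
as an exception is a simplification (the actual service may map it to an error 
response instead).

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Sketch only: the suppliers stand in for the runtime's schedule calls.
class ReadStateFlowSketch {
    static final int NO_LEADER_EPOCH = -1;

    static <T> CompletableFuture<T> readState(
            int leaderEpoch,
            Supplier<CompletableFuture<Void>> scheduleWrite,
            Supplier<CompletableFuture<T>> scheduleRead) {
        CompletableFuture<Void> writeStage;
        if (leaderEpoch == NO_LEADER_EPOCH) {
            // No valid epoch in the request, so skip the phantom write.
            writeStage = CompletableFuture.completedFuture(null);
        } else {
            writeStage = scheduleWrite.get();
        }
        // The read is scheduled only after the write stage completes normally.
        // If the write fails, the exception propagates and the read supplier
        // is never invoked.
        return writeStage.thenCompose(ignored -> scheduleRead.get());
    }

    public static void main(String[] args) {
        String result = ReadStateFlowSketch.<String>readState(
                7,
                () -> CompletableFuture.completedFuture(null),
                () -> CompletableFuture.completedFuture("state"))
            .join();
        System.out.println(result);
    }
}
```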
   
   **TLDR**: We will issue a phantom write call (not explicitly issued by a 
caller) in `ShareCoordinatorService.readState` if the read request contains a 
valid `leaderEpoch`. Based on the response to the write, a standard read call 
will be scheduled. This way we will be able to persist the `leaderEpoch` 
correctly in the topic and timeline data structures.

