[ https://issues.apache.org/jira/browse/KAFKA-12958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369935#comment-17369935 ]
HaiyuanZhao edited comment on KAFKA-12958 at 6/26/21, 6:16 PM:
---------------------------------------------------------------

Hi, [~jagsancio]

I added an invariant that notified leaders are never asked to load snapshots. However, with this invariant in place the test case canRecoverAfterAllNodesKilled fails, and the failure is easy to reproduce. The details follow.

*New Invariant*

*Run Result*

The handleSnapshot call stack follows. It indicates that a new leader may catch up by loading a snapshot if its listener is lagging. The fireSnapshot call was introduced by KAFKA-12154 (commit 6203bf8). I am not sure whether this is expected. Could you please take a look?

!image-2021-06-27-02-09-25-296.png!

!image-2021-06-27-02-15-23-760.png!

> Add simulation invariant for leadership and snapshot
> ----------------------------------------------------
>
>                 Key: KAFKA-12958
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12958
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Jose Armando Garcia Sancio
>            Assignee: HaiyuanZhao
>            Priority: Major
>         Attachments: image-2021-06-27-02-09-25-296.png,
> image-2021-06-27-02-15-23-760.png
>
>
> During the simulation we should add an invariant that notified leaders are
> never asked to load snapshots.
> The state machine always sees the following sequence of callback calls:
>
> Leaders see:
> ...
> handleLeaderChange state machine is notified of leadership
> handleSnapshot is never called
>
> Non-leaders see:
> ...
> handleLeaderChange state machine is notified that it is not the leader
> handleSnapshot is called 0 or more times

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
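
The callback sequence described in the issue can be expressed as a small per-node check. The following is only a minimal sketch under stated assumptions: the class `LeaderSnapshotInvariant` and its methods `onLeaderChange` and `onSnapshot` are hypothetical names that mirror the handleLeaderChange and handleSnapshot callbacks, and they are not the simulation test's actual code or part of Kafka's API.

```java
import java.util.OptionalInt;

/**
 * Hypothetical invariant checker for one simulated node.
 * All names here are illustrative assumptions, not Kafka APIs.
 */
final class LeaderSnapshotInvariant {
    private final int localId;
    private boolean notifiedLeader = false;
    private int snapshotsSinceLeaderChange = 0;

    LeaderSnapshotInvariant(int localId) {
        this.localId = localId;
    }

    /** Mirrors handleLeaderChange: record whether this node was told it is the leader. */
    void onLeaderChange(OptionalInt leaderId) {
        notifiedLeader = leaderId.isPresent() && leaderId.getAsInt() == localId;
        snapshotsSinceLeaderChange = 0;
    }

    /** Mirrors handleSnapshot: allowed only while the node is not a notified leader. */
    void onSnapshot() {
        if (notifiedLeader) {
            throw new IllegalStateException(
                "Node " + localId + " was asked to load a snapshot after being notified of leadership");
        }
        snapshotsSinceLeaderChange++;
    }

    int snapshotsSinceLeaderChange() {
        return snapshotsSinceLeaderChange;
    }

    boolean isNotifiedLeader() {
        return notifiedLeader;
    }
}
```

In this sketch a non-leader may call `onSnapshot()` any number of times, while a call made after the node has been notified of its own leadership throws. That second case corresponds to the situation reported in the comment above for canRecoverAfterAllNodesKilled.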