Jose Armando Garcia Sancio created KAFKA-13073:
--------------------------------------------------
Summary: Simulation test fails due to inconsistency in MockLog's implementation
Key: KAFKA-13073
URL: https://issues.apache.org/jira/browse/KAFKA-13073
Project: Kafka
Issue Type: Bug
Components: controller, replication
Affects Versions: 3.0.0
Reporter: Jose Armando Garcia Sancio
Assignee: Jose Armando Garcia Sancio
Fix For: 3.0.0

We are getting the following error on trunk:
{code:java}
RaftEventSimulationTest > canRecoverAfterAllNodesKilled STANDARD_OUT
timestamp = 2021-07-12T16:26:55.663, RaftEventSimulationTest:canRecoverAfterAllNodesKilled =
  java.lang.RuntimeException:
    Uncaught exception during poll of node 1

                              |-------------------jqwik-------------------
tries = 25                    | # of calls to property
checks = 25                   | # of not rejected calls
generation = RANDOMIZED       | parameters are randomly generated
after-failure = PREVIOUS_SEED | use the previous seed
when-fixed-seed = ALLOW       | fixing the random seed is allowed
edge-cases#mode = MIXIN       | edge cases are mixed in
edge-cases#total = 108        | # of all combined edge cases
edge-cases#tried = 4          | # of edge cases tried in current run
seed = 8079861963960994566    | random seed to reproduce generated values

Sample
------
  arg0: 4002
  arg1: 2
  arg2: 4
{code}

I think there are a couple of issues here:
# The {{ListenerContext}} for {{KafkaRaftClient}} uses the value returned by {{ReplicatedLog::startOffset()}} to determine the log start and when to load a snapshot, while the {{MockLog}} implementation uses {{logStartOffset}}, which could be a different value.
# {{MockLog}} doesn't implement {{ReplicatedLog::maybeClean}}, so the log start offset is always 0.
# The snapshot id validation in {{createNewSnapshot}} for both {{MockLog}} and {{KafkaMetadataLog}} throws an exception when the snapshot id is less than the log start offset.

Solutions:

To fix the error quoted above we only need to fix item 3, but I think we should fix all of the issues enumerated in this Jira.

For 1., we should change the {{MockLog}} implementation so that it uses {{startOffset}} both externally and internally.

For 2., I will file another issue to track this implementation.

For 3., I think this validation is too strict. It is safe to simply ignore any attempt by the state machine to create a snapshot with an id less than the log start offset. We should return an {{Optional.empty()}} when the snapshot id is less than the log start offset. This tells the user that it doesn't need to generate a snapshot for that offset.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
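
A minimal sketch of the relaxed validation proposed for item 3, assuming nothing about the real Kafka classes: {{SketchLog}} and {{SnapshotId}} are hypothetical stand-ins, and the method returns the id itself where the real {{createNewSnapshot}} would return whatever writer type it uses. It also reads the start offset through a single accessor, in the spirit of item 1.

```java
import java.util.Optional;

// Hypothetical stand-in for a snapshot id (end offset plus epoch).
// This is NOT Kafka's own class; it exists only for illustration.
final class SnapshotId {
    final long offset;
    final int epoch;

    SnapshotId(long offset, int epoch) {
        this.offset = offset;
        this.epoch = epoch;
    }
}

// Hypothetical, heavily simplified log that only tracks its start offset.
final class SketchLog {
    private final long startOffset;

    SketchLog(long startOffset) {
        this.startOffset = startOffset;
    }

    // Single accessor used both by callers and by the log's own checks,
    // so the two can never disagree about where the log starts.
    long startOffset() {
        return startOffset;
    }

    // Returns Optional.empty() instead of throwing when the requested
    // snapshot id is below the log start offset: the caller does not need
    // to generate a snapshot for that offset. The returned SnapshotId is a
    // placeholder for a real snapshot writer (assumption).
    Optional<SnapshotId> createNewSnapshot(SnapshotId id) {
        if (id.offset < startOffset()) {
            return Optional.empty();
        }
        return Optional.of(id);
    }
}
```

With a log whose start offset is 10, a request for a snapshot at offset 5 yields an empty {{Optional}}, while a request at offset 10 or later yields a value. The caller can then skip snapshot generation without handling an exception.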