Jose Armando Garcia Sancio created KAFKA-13073:
--------------------------------------------------
Summary: Simulation test fails due to inconsistency in MockLog's
implementation
Key: KAFKA-13073
URL: https://issues.apache.org/jira/browse/KAFKA-13073
Project: Kafka
Issue Type: Bug
Components: controller, replication
Affects Versions: 3.0.0
Reporter: Jose Armando Garcia Sancio
Assignee: Jose Armando Garcia Sancio
Fix For: 3.0.0
We are getting the following error on trunk:
{code:java}
RaftEventSimulationTest > canRecoverAfterAllNodesKilled STANDARD_OUT
timestamp = 2021-07-12T16:26:55.663,
RaftEventSimulationTest:canRecoverAfterAllNodesKilled =
java.lang.RuntimeException:
Uncaught exception during poll of node 1
|-------------------jqwik-------------------
tries = 25 | # of calls to property
checks = 25 | # of not rejected calls
generation = RANDOMIZED | parameters are randomly generated
after-failure = PREVIOUS_SEED | use the previous seed
when-fixed-seed = ALLOW | fixing the random seed is allowed
edge-cases#mode = MIXIN | edge cases are mixed in
edge-cases#total = 108 | # of all combined edge cases
edge-cases#tried = 4 | # of edge cases tried in current run
seed = 8079861963960994566 | random seed to reproduce generated values
Sample
------
arg0: 4002
arg1: 2
arg2: 4{code}
I think there are a couple of issues here:
# The {{ListenerContext}} for {{KafkaRaftClient}} uses the value returned by
{{ReplicatedLog::startOffset()}} to determine the log start offset and when to
load a snapshot, while the {{MockLog}} implementation uses {{logStartOffset}},
which could be a different value.
# {{MockLog}} doesn't implement {{ReplicatedLog::maybeClean}} so the log start
offset is always 0.
# The snapshot id validation for {{MockLog}} and {{KafkaMetadataLog}}'s
{{createNewSnapshot}} throws an exception when the snapshot id is less than the
log start offset.
Solutions:
To fix the error quoted above we only need to fix bullet point 3, but I think
we should fix all of the issues enumerated in this Jira.
For 1. we should change the {{MockLog}} implementation so that it uses
{{startOffset}} both externally and internally.
For 2. I will file another issue to track this implementation.
For 3. I think this validation is too strict. I think it is safe to simply
ignore any attempt by the state machine to create a snapshot with an id less
than the log start offset. We should return an {{Optional.empty()}} when the
snapshot id is less than the log start offset. This tells the user that it
doesn't need to generate a snapshot for that offset.
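A minimal sketch of the behavior proposed for 1. and 3. It is not the actual {{MockLog}} or {{KafkaMetadataLog}} code: {{SketchLog}}, {{advanceStartOffset}} and the {{Optional<Long>}} return type (standing in for a snapshot writer) are hypothetical names used only for illustration.

```java
import java.util.Optional;

// Hypothetical, simplified stand-in for a replicated log; not a Kafka class.
class SketchLog {
    // Single source of truth for the log start offset.
    private long startOffset = 0L;

    // Used both externally and internally, as proposed for 1.
    public long startOffset() {
        return startOffset;
    }

    // Hypothetical helper that simulates log cleaning moving the start offset forward.
    public void advanceStartOffset(long newStartOffset) {
        if (newStartOffset > startOffset) {
            startOffset = newStartOffset;
        }
    }

    // Proposed for 3.: instead of throwing when the snapshot end offset is below
    // the log start offset, return Optional.empty() so the caller skips the snapshot.
    // The Long in the result is a placeholder for a real snapshot writer.
    public Optional<Long> createNewSnapshot(long snapshotEndOffset) {
        if (snapshotEndOffset < startOffset()) {
            return Optional.empty();
        }
        return Optional.of(snapshotEndOffset);
    }
}
```

With this shape, a state machine that asks for a snapshot at an offset that has already been cleaned gets an empty result and can carry on, rather than failing the poll with an uncaught exception.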
--
This message was sent by Atlassian Jira
(v8.3.4#803005)