Tsz-wo Sze created RATIS-1884:
---------------------------------
Summary: Fix retry cache warning condition
Key: RATIS-1884
URL: https://issues.apache.org/jira/browse/RATIS-1884
Project: Ratis
Issue Type: Bug
Components: server
Reporter: Tsz-wo Sze
Assignee: Song Ziyang
I made a mistake in the previous PR [#904|https://github.com/apache/ratis/pull/904].
The conditions here are a bit tricky.
The cache entry is expected to be *not completed normally* when
{{replyPendingRequest}} is called, since we complete this cache entry at the very
end of {{replyPendingRequest}}.
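A minimal sketch of that expectation, not the actual Ratis code: {{EntrySketch}} and {{ReplySketch}} are hypothetical stand-ins, and the only assumption is that the entry wraps a {{CompletableFuture}} for the reply.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a retry cache entry; not the Ratis class. */
final class EntrySketch {
  private final CompletableFuture<String> replyFuture = new CompletableFuture<>();

  /** True only if the future finished with a value (neither failed nor cancelled). */
  boolean isCompletedNormally() {
    return replyFuture.isDone() && !replyFuture.isCompletedExceptionally();
  }

  void complete(String reply) {
    replyFuture.complete(reply);
  }

  void fail(Throwable t) {
    replyFuture.completeExceptionally(t);
  }
}

/** Hypothetical stand-in for the reply path. */
final class ReplySketch {
  /** Returns true if the entry was in the expected state, i.e. no warning was logged. */
  static boolean replyPendingRequest(EntrySketch entry, String reply) {
    // Expected: pending or failed, because the entry is only completed below.
    final boolean expected = !entry.isCompletedNormally();
    if (!expected) {
      System.err.println("WARN: cache entry was already completed normally");
    }
    entry.complete(reply); // completing happens at the very end
    return expected;
  }
}
```

In this sketch, a second call on the same entry is the only case that triggers the warning; a pending or failed entry passes the check.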
The explanation in the previous PR of why this assertion fails is incorrect. The real
path leading to the error is:
If a request r arrived and was committed, but then timed out because apply was blocked
(for example, stuck in synchronous snapshotting), a client may choose to retry r.
However, if the retry gap exceeds the retryCache expiration duration (in our case,
it did), the very same request r would be committed *again*. After the
snapshotting finished, applying these two identical requests would cause
the assertion to fail.
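The path above can be reproduced in a toy model. This is only a sketch under stated assumptions: {{RetryCacheExpiryDemo}}, its fields, and the logical timestamps are all hypothetical, and the model ignores everything about Ratis except "a cache entry that expires" and "a log that is applied later".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of an expiring retry cache in front of a log; all names are hypothetical. */
final class RetryCacheExpiryDemo {
  private final long expiryMillis;
  private final Map<String, Long> cache = new HashMap<>(); // callId -> insertion time
  private final List<String> log = new ArrayList<>();      // committed, waiting for apply

  RetryCacheExpiryDemo(long expiryMillis) {
    this.expiryMillis = expiryMillis;
  }

  /** Handles a request at logical time nowMillis; returns true if it was appended to the log. */
  boolean submit(String callId, long nowMillis) {
    final Long created = cache.get(callId);
    if (created != null && nowMillis - created < expiryMillis) {
      return false; // recognized as a retry, not committed again
    }
    // Either a new request or the old entry has expired: commit it.
    cache.put(callId, nowMillis);
    log.add(callId);
    return true;
  }

  /** Number of times callId was committed, i.e. how often it will be applied. */
  long committedCount(String callId) {
    return log.stream().filter(callId::equals).count();
  }

  public static void main(String[] args) {
    // Apply is blocked, so the client retries r after 90s.
    final RetryCacheExpiryDemo shortExpiry = new RetryCacheExpiryDemo(60_000);
    shortExpiry.submit("r", 0);
    shortExpiry.submit("r", 90_000);
    System.out.println("expiry 60s:  r committed " + shortExpiry.committedCount("r") + " time(s)");

    final RetryCacheExpiryDemo longExpiry = new RetryCacheExpiryDemo(120_000);
    longExpiry.submit("r", 0);
    longExpiry.submit("r", 90_000);
    System.out.println("expiry 120s: r committed " + longExpiry.committedCount("r") + " time(s)");
  }
}
```

With the 60s expiry the retry at 90s misses the cache and r ends up in the log twice; with the 120s expiry the retry is recognized and r is committed once.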
Maybe we should recommend that users set a retry cache expiration duration longer
than the client retry-waiting duration?
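Such a recommendation could be expressed as a simple startup check. The sketch below is an assumption, not an existing Ratis API: {{RetryCacheConfigCheck}} and {{isExpirySafe}} are hypothetical names, and the two durations would have to come from the actual server and client configuration.

```java
import java.time.Duration;

/** Hypothetical configuration check; not part of Ratis. */
final class RetryCacheConfigCheck {
  /** Returns true if a retry sent after clientRetryWait would still find its cache entry. */
  static boolean isExpirySafe(Duration retryCacheExpiry, Duration clientRetryWait) {
    return retryCacheExpiry.compareTo(clientRetryWait) > 0;
  }

  public static void main(String[] args) {
    final Duration expiry = Duration.ofSeconds(60);
    final Duration retryWait = Duration.ofSeconds(90);
    if (!isExpirySafe(expiry, retryWait)) {
      System.err.println("WARN: retry cache expiry " + expiry
          + " is not longer than client retry wait " + retryWait
          + "; a retried request may be committed twice");
    }
  }
}
```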