[
https://issues.apache.org/jira/browse/S4-44?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220172#comment-13220172
]
Matthieu Morel commented on S4-44:
----------------------------------
Implemented in branch S4-44:
https://git-wip-us.apache.org/repos/asf?p=incubator-s4.git;a=tree;h=refs/heads/S4-44;hb=S4-44
I also refactored the checkpointing mechanism so that time-based checkpointing
does not generate extra events.
> optional backoff upon multiple consecutive failed checkpoint fetches
> --------------------------------------------------------------------
>
> Key: S4-44
> URL: https://issues.apache.org/jira/browse/S4-44
> Project: Apache S4
> Issue Type: Improvement
> Affects Versions: 0.4
> Reporter: Matthieu Morel
> Fix For: 0.4
>
>
> If a checkpointing backend becomes unresponsive (e.g. stalled NFS) while a
> series of recoveries is in progress (for instance, at startup or during
> failover), then each checkpoint fetch operation blocks until a timeout or
> another exception occurs, and the system then continues without recovering
> that PE.
> We should provide a way to detect this pattern (multiple backend fetch
> failures in a short amount of time) and temporarily disable fetching from the
> backend, in order to reduce blocking when the backend becomes unresponsive.
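
The pattern described in the issue could be sketched roughly as follows. This
is only an illustration, not the code in the S4-44 branch: the class name
FetchBackoff, its methods, and its three parameters (failure threshold,
detection window, pause duration) are all hypothetical. The caller passes the
current time in milliseconds so the logic can be exercised without waiting.

```java
/**
 * Hypothetical helper (not part of Apache S4): disables checkpoint fetching
 * for a while after several consecutive fetch failures in a short window.
 */
final class FetchBackoff {
    private final int maxFailures;   // consecutive failures that trigger the pause
    private final long windowMillis; // the failures must all fall within this window
    private final long pauseMillis;  // how long fetching stays disabled

    private int consecutiveFailures = 0;
    private long firstFailureAt = 0L;
    private long disabledUntil = 0L;

    FetchBackoff(int maxFailures, long windowMillis, long pauseMillis) {
        this.maxFailures = maxFailures;
        this.windowMillis = windowMillis;
        this.pauseMillis = pauseMillis;
    }

    /** Returns true if a fetch from the backend may be attempted at time 'now'. */
    synchronized boolean isFetchAllowed(long now) {
        return now >= disabledUntil;
    }

    /** A successful fetch ends the current streak of failures. */
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
    }

    /** Records a failed fetch (timeout or other exception) at time 'now'. */
    synchronized void recordFailure(long now) {
        // Start a new streak if there is none, or if the old one is too old.
        if (consecutiveFailures == 0 || now - firstFailureAt > windowMillis) {
            consecutiveFailures = 0;
            firstFailureAt = now;
        }
        consecutiveFailures++;
        if (consecutiveFailures >= maxFailures) {
            // Too many failures in a short time: skip the backend for a while.
            disabledUntil = now + pauseMillis;
            consecutiveFailures = 0;
        }
    }
}
```

A recovery path would check isFetchAllowed(System.currentTimeMillis()) before
contacting the backend and, when it returns false, continue without recovering
the PE right away instead of blocking on another timeout.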