[jira] [Closed] (MESOS-1165) Retry required when recovering an empty log

Jie Yu (JIRA) Mon, 31 Mar 2014 15:33:31 -0700

     [ 
https://issues.apache.org/jira/browse/MESOS-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jie Yu closed MESOS-1165.
-------------------------

    Resolution: Fixed

> Retry required when recovering an empty log
> -------------------------------------------
>
>                 Key: MESOS-1165
>                 URL: https://issues.apache.org/jira/browse/MESOS-1165
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Jie Yu
>            Assignee: Jie Yu
>            Priority: Minor
>             Fix For: 0.19.0
>
>
> Reported by [~benjaminhindman]. It's fairly non-intuitive that a 'fill' retry
> is required when recovering an empty log. Moreover, since the retry is done
> via a 'delay', you can't pause the clock before calling
> Log::Writer::start! The following tests show the multiple calls, and at one
> point I added comments to explain the very esoteric reasoning here. Here is
> the sequence of events:
> First, a replica is recovered with nothing, but position 0 is always a hole:
> ----
> Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> ----
> At this point the replica assumes it was promised to a coordinator with 
> proposal 0 (that's the default metadata). Then an implicit promise request is 
> made with proposal 1.
> ----
> Replica received implicit promise request with proposal 1 
> ----
> Then the coordinator (via FillProcess) tries to fill the hole (position 0) 
> explicitly:
> ----
> Coordinator attemping to fill missing position 
> ----
> And the replica receives the request:
> ----
> Replica received explicit promise request for position 0 with proposal 1
> ----
> But the fill must be retried: the 0th position is already implicitly
> promised to proposer 1 (the same coordinator!), and the replica won't accept
> an explicit promise request carrying that same proposal number (because it
> might not be safe). So the FillProcess tries again with proposal number 2
> (after the delay). While correct, this seems unfortunate (and not intuitive).
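
The sequence of events described in the issue can be reproduced with a toy model. This is only a sketch under stated assumptions, not the Mesos implementation: the `Replica` class, its `implicit_promise` and `explicit_promise` methods, and the `fill` helper are hypothetical names, and the rule "reject any proposal that is not strictly higher than the one already promised" is assumed from the description. The real retry happens after a delay, which is left out here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Replica:
    """Toy replica (hypothetical): tracks promised proposal numbers only."""
    promised: int = 0  # default metadata: promised to proposal 0
    position_promises: Dict[int, int] = field(default_factory=dict)

    def implicit_promise(self, proposal: int) -> bool:
        # Assumed rule: accept only a strictly higher proposal number.
        if proposal <= self.promised:
            return False
        self.promised = proposal
        return True

    def explicit_promise(self, position: int, proposal: int) -> bool:
        # A position with no promise of its own inherits the implicit one.
        floor = self.position_promises.get(position, self.promised)
        if proposal <= floor:
            return False  # same or lower proposal: rejected
        self.position_promises[position] = proposal
        return True


def fill(replica: Replica, position: int, proposal: int) -> List[int]:
    """Retry with the next proposal number until the replica accepts.

    Returns every proposal number that was tried, in order.
    """
    tried = []
    while True:
        tried.append(proposal)
        if replica.explicit_promise(position, proposal):
            return tried
        proposal += 1  # the real retry happens after a delay; omitted here


replica = Replica()                      # recovered empty: position 0 is a hole
accepted = replica.implicit_promise(1)   # coordinator's implicit promise request
attempts = fill(replica, position=0, proposal=1)
print(attempts)
```

In this model the first explicit request for position 0 is rejected because it reuses proposal 1, so the fill only succeeds on the second attempt, which is the extra retry the issue describes.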



--
This message was sent by Atlassian JIRA
(v6.2#6252)
