> On Sept. 25, 2017, 5:36 a.m., Jie Yu wrote:
> > src/log/recover.cpp
> > Lines 652 (patched)
> > <https://reviews.apache.org/r/62286/diff/1/?file=1820908#file1820908line652>
> >
> >     Putting this in `recover.cpp|hpp` is very weird. I am leaning towards 
> > moving this to `catchup.hpp|cpp` and just overloading the `catchup` method 
> > (without positions).

Done!


> On Sept. 25, 2017, 5:36 a.m., Jie Yu wrote:
> > src/log/recover.cpp
> > Lines 701 (patched)
> > <https://reviews.apache.org/r/62286/diff/1/?file=1820908#file1820908line701>
> >
> >     Any reason for not returning a failure in this case?

No. Fixed.


> On Sept. 25, 2017, 5:36 a.m., Jie Yu wrote:
> > src/log/recover.cpp
> > Lines 711 (patched)
> > <https://reviews.apache.org/r/62286/diff/1/?file=1820908#file1820908line711>
> >
> >     Can you add a few comments on this? This is a bit hacky because we are 
> > essentially hijacking the recovery protocol. Ideally, we probably should 
> > split the recovery into several phases and use only the phase that's 
> > relevant to this.

Per our offline discussion, exposed `runRecoverProtocol()` in `recover.hpp` and 
moved retry logic from `RecoverProtocolProcess` to `RecoverProcess`.
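
A rough illustration of that split, not the actual patch: the protocol run is a 
single attempt, and the caller owns the retry loop. All names here (`Attempt`, 
`ProtocolRun`, `runWithRetry`) and the bounded `maxAttempts` are hypothetical 
and chosen only for the sketch; the real code is asynchronous.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical outcome of one protocol run. `done` is false when the
// attempt was inconclusive and the caller should try again.
struct Attempt {
  bool done;
  uint64_t begin;  // First position of the log, valid only if `done`.
  uint64_t end;    // Last position of the log, valid only if `done`.
};

// A single run of the protocol with no retry logic of its own.
// Stands in for whatever the exposed function does (assumption).
using ProtocolRun = std::function<Attempt()>;

// Caller-side retry loop: the piece that moves out of the protocol and
// into the process that drives it. The attempt bound is an assumption
// made to keep the sketch finite.
Attempt runWithRetry(const ProtocolRun& run, int maxAttempts)
{
  assert(maxAttempts > 0);

  for (int i = 0; i < maxAttempts; ++i) {
    Attempt attempt = run();
    if (attempt.done) {
      return attempt;
    }
  }

  // Every attempt was inconclusive.
  return Attempt{false, 0, 0};
}
```

With this shape, a second caller can invoke the single-attempt run directly and 
apply its own retry policy.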


> On Sept. 25, 2017, 5:36 a.m., Jie Yu wrote:
> > src/log/recover.cpp
> > Lines 730 (patched)
> > <https://reviews.apache.org/r/62286/diff/1/?file=1820908#file1820908line730>
> >
> >     Can you explain why we need to recover the same `begin` after a restart?

Discussed offline.


- Ilya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62286/#review186084
-----------------------------------------------------------


On Oct. 4, 2017, 12:16 a.m., Ilya Pronin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62286/
> -----------------------------------------------------------
> 
> (Updated Oct. 4, 2017, 12:16 a.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-7973
>     https://issues.apache.org/jira/browse/MESOS-7973
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The process is used to catch up a non-leading VOTING replica by running
> the recovery protocol to find the current begin and end positions of the log
> and catching up positions that are missing on the replica. This allows
> following replicas to serve eventually consistent reads.
> 
> 
> Diffs
> -----
> 
>   src/log/catchup.hpp 123bc7a57e5e89f9ba75c36ba0cbe5ead807c518 
>   src/log/catchup.cpp 94e1b00db2cd9d5a2368a979c1fd155bb6cac1f2 
>   src/tests/log_tests.cpp f9f9400c901152779ae0ebfe74cf8f7aac1d3396 
> 
> 
> Diff: https://reviews.apache.org/r/62286/diff/2/
> 
> 
> Testing
> -------
> 
> Added tests that verify that the new recovery process performs correctly and 
> produces meaningful results under various circumstances (recovered positions 
> were truncated, the replica was lagging far behind). Ran `make check`.
> 
> 
> Thanks,
> 
> Ilya Pronin
> 
>
