On Monday 01 Sep 2014 at 15:43:06 (+0800), Liu Yuan wrote:
> This patch set mainly adds two pieces of logic to implement device recovery:
> - notify the quorum driver of broken states from the child driver(s)
> - track dirty blocks and sync the device after it is repaired
>
> Thus quorum allows VMs to continue while some child devices are broken; when
> the child devices are repaired and come back, we sync the bits dirtied
> during the downtime to keep the data consistent.
>
> The recovery logic is based on the driver state bitmap and will sync the dirty
> bits with a timeslice window in a coroutine in this primitive implementation.
>
> Simple graph about 2 children with threshold=1 and read-pattern=fifo:
> (similar to DRBD)
>
> + denotes a device sync iteration
> - IO on a single device
> = IO on two devices
>
>                               sync complete, release dirty bitmap
>                                          ^
>                                          |
>   ====-----------------++++----++++----++==========
>      |                 |
>      |                 v
>      |                 device repaired and begin to sync
>      v
>   device broken, create a dirty bitmap
>
> This sync logic also takes care of the nested breakage problem, where devices
> break again while in sync. We just start a sync process after the devices are
> repaired again and switch the devices from broken to sound only when the sync
> completes.
>
> read-pattern=quorum mode benefits from the recovery logic without any
> problem.

Hi Liu,

I had something like that in mind.
This series seems very cool; I will review it.

Thanks for contributing to quorum.

Best regards

Benoît

>
> Todo:
> - use the aio interface to sync data (multiple transfers in one go)
> - dynamic slice window to control sync bandwidth more smoothly
> - add an auto-reconnection mechanism to other protocols (if not supported yet)
> - add tests
>
> Cc: Eric Blake <ebl...@redhat.com>
> Cc: Benoit Canet <ben...@irqsave.net>
> Cc: Kevin Wolf <kw...@redhat.com>
> Cc: Stefan Hajnoczi <stefa...@redhat.com>
>
> Liu Yuan (8):
>   block/quorum: initialize qcrs.aiocb for read
>   block: add driver operation callbacks
>   block/sheepdog: propagate disconnect/reconnect events to upper driver
>   block/quorum: add quorum_aio_release() helper
>   quorum: fix quorum_aio_cancel()
>   block/quorum: add broken state to BlockDriverState
>   block: add two helpers
>   quorum: add basic device recovery logic
>
>  block.c                   |  17 +++
>  block/quorum.c            | 324 +++++++++++++++++++++++++++++++++++++++++-----
>  block/sheepdog.c          |   9 ++
>  include/block/block.h     |   9 ++
>  include/block/block_int.h |   6 +
>  trace-events              |   5 +
>  6 files changed, 336 insertions(+), 34 deletions(-)
>
> --
> 1.9.1
>
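
For readers following the thread, the dirty tracking and time-sliced sync described in the cover letter can be sketched roughly as below. This is only a toy model under stated assumptions, not the code from the series: the `Child` struct, the helper names (`child_break`, `child_repair`, `mirror_write`, `sync_slice`), the in-memory block arrays and the per-call block budget standing in for the timeslice window are all hypothetical and do not correspond to QEMU structures or APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_BLOCKS 16   /* toy device size, in blocks */
#define BLOCK_SIZE 4    /* toy block size, in bytes */

/* Hypothetical child device model; not a QEMU structure. */
typedef struct Child {
    unsigned char data[NUM_BLOCKS][BLOCK_SIZE];
    bool broken;              /* stays true until a sync completes */
    bool reachable;           /* false while the backend is down */
    bool dirty[NUM_BLOCKS];   /* blocks missed while broken */
} Child;

static void child_init(Child *c)
{
    memset(c, 0, sizeof(*c));
    c->reachable = true;
}

/* Backend went away: mark the child broken so writes get dirty-tracked. */
static void child_break(Child *c)
{
    c->broken = true;
    c->reachable = false;
}

/* Backend came back: the child remains "broken" until fully synced. */
static void child_repair(Child *c)
{
    c->reachable = true;
}

/* Write one block to both children; a child that is not sound only
 * gets a dirty bit recorded for that block. */
static void mirror_write(Child *a, Child *b, size_t blk,
                         const unsigned char *buf)
{
    Child *kids[2] = { a, b };

    assert(blk < NUM_BLOCKS);
    for (int i = 0; i < 2; i++) {
        if (kids[i]->broken) {
            kids[i]->dirty[blk] = true;
        } else {
            memcpy(kids[i]->data[blk], buf, BLOCK_SIZE);
        }
    }
}

/* One sync iteration: copy at most `budget` dirty blocks from src to dst.
 * Returns true once dst is sound, i.e. no dirty block is left. */
static bool sync_slice(const Child *src, Child *dst, size_t budget)
{
    size_t remaining = 0;

    if (!dst->broken) {
        return true;
    }
    if (!dst->reachable) {
        return false;   /* broken again (or still down): retry later */
    }
    for (size_t blk = 0; blk < NUM_BLOCKS; blk++) {
        if (!dst->dirty[blk]) {
            continue;
        }
        if (budget == 0) {
            remaining++;
            continue;
        }
        memcpy(dst->data[blk], src->data[blk], BLOCK_SIZE);
        dst->dirty[blk] = false;
        budget--;
    }
    if (remaining == 0) {
        dst->broken = false;   /* broken -> sound only after a full sync */
    }
    return !dst->broken;
}
```

A caller would invoke `sync_slice()` repeatedly (for instance from a timer) until it returns true. If the child breaks again mid-sync, `child_break()` leaves the dirty bits in place, so the next repair simply resumes from the same bitmap, which is how this sketch mirrors the nested case from the cover letter.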