At Tue, 21 Aug 2012 04:34:05 +0000,
Dietmar Maurer wrote:
> 
> > On 08/21/2012 12:07 AM, Christoph Hellwig wrote:
> > > Another thing that sprang to mind is that instead of the formal
> > > recovery enable/disable we should simply always delay recovery, that
> > > is only do recovery after every N seconds if changes happened.
> > > Especially in the cases of whole racks going up/down or upgrades that
> > > dramatically reduces the number of epochs required, and thus reduces
> > > the recovery overhead.
> > >
> > > I didn't actually have time to look into the implementation
> > > implications of this yet, it's just high-level thoughts.
> > 
> > I am against delaying recovery all the time. Delaying recovery within a
> > time window for maintenance or operational purposes is useful, so I think
> > the idea of delaying recovery manually in a controlled window is sound.
> > But if we extend this to the whole running time, the cluster will be in a
> > less safe (if not dangerous) state at any point. (We only upgrade the
> > cluster or maintain individual nodes at certain times, not all the time,
> > no?)
> 
> I still think that automatic recovery without delay is the wrong approach.
> At least for small clusters you simply want to avoid unnecessary traffic.
> Such recovery can produce massive traffic on the network (several TB of
> data), and can make the whole system unusable because of that. I want to
> control when recovery starts.

Doesn't disabling automatic recovery by default work for you?  You can
then choose when recovery starts with "collie cluster recover enable".

Thanks,

Kazutaka
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog
