On Fri, Jan 7, 2011 at 3:18 PM, Iustin Pop <[email protected]> wrote:
> On Fri, Jan 07, 2011 at 03:04:02PM +0100, René Nussbaumer wrote:
>> ---
>>  lib/cmdlib.py |   34 ++++++++++++++++++++++------------
>>  1 files changed, 22 insertions(+), 12 deletions(-)
>>
>> diff --git a/lib/cmdlib.py b/lib/cmdlib.py
>> index d80c0da..3b6064f 100644
>> --- a/lib/cmdlib.py
>> +++ b/lib/cmdlib.py
>> @@ -6504,6 +6504,10 @@ def _WipeDisks(lu, instance):
>>    for idx, device in enumerate(instance.disks):
>>      lu.LogInfo("* Wiping disk %d", idx)
>>      logging.info("Wiping disk %d for instance %s", idx, instance.name)
>> +    logging.info("Pause sync of disk %d for instance %s", idx,
>> +                 instance.name)
>> +
>> +    result = lu.rpc.call_blockdev_pause_resume_sync(node, device, True)
>> +    result.Raise("Could not pause sync of disk %d" % idx)
>>
>>      # The wipe size is MIN_WIPE_CHUNK_PERCENT % of the instance disk but
>>      # MAX_WIPE_CHUNK at max
>> @@ -6515,18 +6519,24 @@
>>      last_output = 0
>>      start_time = time.time()
>>
>> -    while offset < size:
>> -      wipe_size = min(wipe_chunk_size, size - offset)
>> -      result = lu.rpc.call_blockdev_wipe(node, device, offset, wipe_size)
>> -      result.Raise("Could not wipe disk %d at offset %d for size %d" %
>> -                   (idx, offset, wipe_size))
>> -      now = time.time()
>> -      offset += wipe_size
>> -      if now - last_output >= 60:
>> -        eta = _CalcEta(now - start_time, offset, size)
>> -        lu.LogInfo(" - done: %.1f%% ETA: %s" %
>> -                   (offset / float(size) * 100, utils.FormatSeconds(eta)))
>> -        last_output = now
>> +    try:
>> +      while offset < size:
>> +        wipe_size = min(wipe_chunk_size, size - offset)
>> +        result = lu.rpc.call_blockdev_wipe(node, device, offset, wipe_size)
>> +        result.Raise("Could not wipe disk %d at offset %d for size %d" %
>> +                     (idx, offset, wipe_size))
>> +        now = time.time()
>> +        offset += wipe_size
>> +        if now - last_output >= 60:
>> +          eta = _CalcEta(now - start_time, offset, size)
>> +          lu.LogInfo(" - done: %.1f%% ETA: %s" %
>> +                     (offset / float(size) * 100, utils.FormatSeconds(eta)))
>> +          last_output = now
>> +    finally:
>> +      logging.info("Resume sync of disk %d for instance %s", idx,
>> +                   instance.name)
>> +
>> +      result = lu.rpc.call_blockdev_pause_resume_sync(node, device, False)
>> +      result.Raise("Could not resume sync of disk %d" % idx)
>
> The patch looks OK, however…
>
> For some time now, we have a known bug about --early-release and node
> overloading. The fix for that would be to write a small daemon that
> watches the DRBD syncs and orders pause/resumes as appropriate so that
> only N drbd resyncs per node are running.
>
> If Ganeti starts issuing per-device pause/resume, independently for each
> instance, that means we can no longer control globally the resync
> process.
>
> So my question is it wouldn't be better to change this whole
> architecture as follows:
>
> 1. create LV on primary node
> 2. wipe it
> 3. and only then create the DRBD devices
>
> Optionally, the install could move to step 2B, so that it doesn't need
> to replicate the data to the remote node twice.
>
> What do you think?
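[For reference, the pause/wipe/resume control flow that the patch adds can be sketched in isolation. `FakeRpc` and `wipe_device` below are hypothetical stand-ins for Ganeti's real RPC layer (`call_blockdev_pause_resume_sync`, `call_blockdev_wipe`); only the chunking and the try/finally structure mirror the patch, and the constants are assumed values, not Ganeti's:]

```python
# Sketch of the patch's control flow: pause DRBD resync, wipe the device
# in chunks, and always resume the sync, even if a wipe call fails.
# FakeRpc is a hypothetical stub that records calls instead of talking
# to a node daemon.

MAX_WIPE_CHUNK = 1024         # assumed chunk cap, in MiB
MIN_WIPE_CHUNK_PERCENT = 10   # assumed percentage of the disk size

class FakeRpc:
  def __init__(self):
    self.calls = []

  def pause_resume_sync(self, node, device, pause):
    self.calls.append(("pause" if pause else "resume", device))

  def wipe(self, node, device, offset, size):
    self.calls.append(("wipe", device, offset, size))

def wipe_device(rpc, node, device, size):
  """Wipe `size` MiB of `device`, pausing the resync for the duration."""
  wipe_chunk_size = min(MAX_WIPE_CHUNK,
                        size * MIN_WIPE_CHUNK_PERCENT // 100 or 1)
  rpc.pause_resume_sync(node, device, True)   # pause resync first
  try:
    offset = 0
    while offset < size:
      wipe_size = min(wipe_chunk_size, size - offset)
      rpc.wipe(node, device, offset, wipe_size)
      offset += wipe_size
  finally:
    # Resume unconditionally, as the patch's finally: block does.
    rpc.pause_resume_sync(node, device, False)

rpc = FakeRpc()
wipe_device(rpc, "node1", "disk0", 25)
print(rpc.calls[0], rpc.calls[-1])
# → ('pause', 'disk0') ('resume', 'disk0')
```

[This also makes Iustin's objection concrete: every `wipe_device` call issues its own pause/resume pair, so a global per-node limit on concurrent resyncs would have to coordinate with each of these independent pairs.]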
I agree that this should be the final state. However, I had planned to get this change in before the 2.4 freeze, and I'm unsure whether I can make the larger restructuring happen before then. As it stands, the wiping time needed increases drastically, by about 60-100% (depending on the disk size). I don't mind holding off, but I also wondered how much others would value this improvement.

René
