Hello all,

After 524 consecutive successful migrations I am confident enough to
report that the above patch from Apollon works just fine.
If we weren't both living in Athens I'd promise to buy him a beer at
the next GanetiCon, but he'll probably get it a lot earlier than that...

Thanks a lot!
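In case it helps anyone reading along, here is a rough Python sketch of the
condition Apollon's patch changes. The class below is a hypothetical,
simplified stand-in, not Ganeti's actual DRBD8Status; it only mirrors the
flags the patch uses:

```python
class DrbdStatus(object):
    """Hypothetical, simplified stand-in for Ganeti's DRBD8Status."""

    # Connection states counted as "in resync" for this sketch.
    SYNC_STATES = ("SyncSource", "SyncTarget", "WFSyncUUID")

    def __init__(self, cstate, local_disk, peer_disk):
        self.is_connected = cstate == "Connected"
        self.is_in_resync = cstate in self.SYNC_STATES
        self.is_disk_uptodate = local_disk == "UpToDate"
        self.peer_disk_uptodate = peer_disk == "UpToDate"


def ready_for_promotion(stats, multimaster):
    """Return True when the attach loop may stop waiting."""
    if multimaster:
        # Both peers become primary right after this wait, so a disk that
        # is merely syncing (and possibly Outdated) must block promotion.
        return (stats.is_connected and
                stats.is_disk_uptodate and
                stats.peer_disk_uptodate)
    # Plain primary/secondary attach: an ongoing resync is acceptable.
    return stats.is_connected or stats.is_in_resync


# The state from the report (WFSyncUUID, Outdated/UpToDate) passes the
# old resync-based check but is now held back in the multimaster case:
racy = DrbdStatus("WFSyncUUID", "Outdated", "UpToDate")
healthy = DrbdStatus("Connected", "UpToDate", "UpToDate")
```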



On Tue, Nov 5, 2013 at 4:30 PM, Apollon Oikonomopoulos <[email protected]> wrote:

> DrbdAttachNet supports both normal primary/secondary node operation and
> (during live migration) dual-primary operation. When resources are newly
> attached, we poll until we find all of them in a connected or syncing
> state.
>
> Although aggressive, this is enough for primary/secondary operation,
> because the primary/secondary role is not changed from within
> DrbdAttachNet. However, in the dual-primary (“multimaster”) case, both
> peers are subsequently upgraded to the primary role. If, for unspecified
> reasons, both disks are not UpToDate, a resync may be triggered after
> both peers have switched to primary, causing the resource to disconnect:
>
>   kernel: [1465514.164009] block drbd2: I shall become SyncTarget, but I
> am primary!
>   kernel: [1465514.171562] block drbd2: ASSERT( os.conn ==
> C_WF_REPORT_PARAMS ) in
> /build/linux-rrsxby/linux-3.2.51/drivers/block/drbd/drbd_receiver.c:3245
>
> This seems to be extremely racy and is possibly triggered by some
> underlying network issues (e.g. high latency), but it has been observed
> in the wild. By logging the DRBD resource state on the old secondary, we
> managed to see a resource getting promoted to primary while it was:
>
>   WFSyncUUID Secondary/Primary Outdated/UpToDate
>
> We fix this by explicitly waiting for “Connected” cstate and
> “UpToDate/UpToDate” disks, as advised in [1]:
>
>   “For this purpose and scenario,
>    you only want to promote once you are Connected UpToDate/UpToDate.”
>
> [1] http://lists.linbit.com/pipermail/drbd-user/2013-July/020173.html
>
> Signed-off-by: Apollon Oikonomopoulos <[email protected]>
> ---
>  lib/backend.py |   16 ++++++++++++++--
>  lib/bdev.py    |    1 +
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/lib/backend.py b/lib/backend.py
> index a75432b..9e12639 100644
> --- a/lib/backend.py
> +++ b/lib/backend.py
> @@ -3622,8 +3622,20 @@ def DrbdAttachNet(nodes_ip, disks, instance_name, multimaster):
>      for rd in bdevs:
>        stats = rd.GetProcStatus()
>
> -      all_connected = (all_connected and
> -                       (stats.is_connected or stats.is_in_resync))
> +      if multimaster:
> +        # In the multimaster case we have to wait explicitly until
> +        # the resource is Connected and UpToDate/UpToDate, because
> +        # we promote *both nodes* to primary directly afterwards.
> +        # Being in resync is not enough, since there is a race during which we
> +        # may promote a node with an Outdated disk to primary, effectively
> +        # tearing down the connection.
> +        all_connected = (all_connected and
> +                         stats.is_connected and
> +                         stats.is_disk_uptodate and
> +                         stats.peer_disk_uptodate)
> +      else:
> +        all_connected = (all_connected and
> +                         (stats.is_connected or stats.is_in_resync))
>
>        if stats.is_standalone:
>          # peer had different config info and this node became
> diff --git a/lib/bdev.py b/lib/bdev.py
> index 7623869..acc18ec 100644
> --- a/lib/bdev.py
> +++ b/lib/bdev.py
> @@ -1135,6 +1135,7 @@ class DRBD8Status(object):
>
>      self.is_diskless = self.ldisk == self.DS_DISKLESS
>      self.is_disk_uptodate = self.ldisk == self.DS_UPTODATE
> +    self.peer_disk_uptodate = self.rdisk == self.DS_UPTODATE
>
>      self.is_in_resync = self.cstatus in self.CSET_SYNC
>      self.is_in_use = self.cstatus != self.CS_UNCONFIGURED
> --
> 1.7.10.4
>
>


-- 
Καργιωτάκης Γιώργος
