Hi Philipp,

Is there an incompatibility between rc2 and rc3? Trying to run both versions in one cluster fails and I see:
drbd r0 [rc2 node]: Preparing remote state change 1013781642
drbd r0: Two-phase commit 1013781642 timeout   <-- rc3 node

Going back to rc2 solves the problem, and the cluster is healthy again.

Regards,
Rob


On 12/7/20 5:38 PM, Philipp Reisner wrote:
Hi,

It is time for an rc3; rc2 is already nearly 3 weeks old!

We were very busy ironing out details of the state engine for state
transitions when nodes establish a connection, or more precisely, when
two partitions join. It looks really good now. A new test tortures it
in a way we have never tested before. I am convinced that we have put
an end to an entire class of bugs.

While doing this, two patches reached us that aim to cure possible
sources of inconsistencies in mirroring the data. One of them got
merged; the other one is still under investigation. We will take the
time necessary to fully understand it and have a proper fix in place.

This is a release candidate; please help test it.

Changelog:
9.0.26-0rc3 (api:genl2/proto:86-118/transport:14)
--------
  * fix for writes not getting mirrored over a connection while the primary
    transitions through the WFBitMapS state
  * completed missing logic of the new two-phase-commit based connect process;
    avoid connecting partitions with a primary in each; ensure consistent
    decisions if the connect attempt will be retried

9.0.26-0rc2 (api:genl2/proto:86-118/transport:14)
--------
  * fix a crash if during resync a discard operation fails on the
    resync-target node
  * fix online verify to not clamp disk states to UpToDate
  * fix promoting resync-target nodes; the problem was that it could modify
    the bitmap of an ongoing resync, which leads to alarming log messages
  * pause a resync if the sync-source node becomes inconsistent; an example
    is a cascading resync where the upstream resync aborts and leaves the
    sync-source node for the downstream resync with an inconsistent disk;
    note, the node at the end of the chain could still have an outdated disk
    (better than inconsistent)
  * allow force primary on a sync-target node by breaking the resync
  * minor fixes to the compat tests

9.0.26-0rc1 (api:genl2/proto:86-118/transport:14)
--------
  * fix a case of a disk unexpectedly becoming Outdated by moving the
    exchange of the initial packets into the body of the two-phase-commit
    that happens at a connect
  * fix adding of new volumes to resources with a primary node
  * reliably detect split brain situation on both nodes
  * fix an unexpected occurrence of NetworkFailure state in a tight
    drbdsetup disconnect; drbdsetup connect sequence
  * fix online verify to return to Established from VerifyS if the VerifyT node
    was temporarily Inconsistent during the run
  * fix a corner case where a node ends up Outdated after the crash and rejoin
    of a primary node
  * implement 'blockdev --setro' in DRBD
  * following upstream changes to DRBD up to Linux 5.9 and ensure
    compatibility with Linux 5.8 and 5.9

https://www.linbit.com/downloads/drbd/9.0/drbd-9.0.26-0rc3.tar.gz
https://github.com/LINBIT/drbd/commit/9114a0383f72b87610cd9ee282676cf94213da5b
_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user

