>>> Lars Ellenberg <lars.ellenb...@linbit.com> wrote on 08.09.2022 at 15:01 in message <Yxnns8D0NDTWKjDU@grappa.linbit>:
> Scenario:
> three nodes, no fencing (I know)
> break network, isolating nodes
> unbreak network, see how cluster partitions rejoin and resume service
>
>
> Funny outcome:
> /usr/sbin/crm_mon -x pe-input-689.bz2
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: mqhavm24 (version 1.1.24.linbit-2.0.el7-8f22be2ae) - partition with quorum
>   * Last updated: Thu Sep  8 14:39:54 2022
>   * Last change:  Thu Aug 11 12:33:02 2022 by root via crm_resource on mqhavm24
>   * 3 nodes configured
>   * 16 resource instances configured (2 DISABLED)
>
> Node List:
>   * Online: [ mqhavm34 mqhavm37 ]
>   * OFFLINE: [ mqhavm24 ]
>
>
> Note how the current DC considers itself as OFFLINE!
>
> It accepted an apparently outdated cib replacement from one of the non-DCs
> from a previous membership while already authoritative itself,
> overwriting its own "join" status in the cib.
>
> I have full crm_reports and some context knowledge about the setup.
>
> For now I'd like to know: has anyone seen this before,
> is that a known bug in corner cases/races during re-join,
> has it even been fixed meanwhile?

I think the order of events is important here. Maybe provide some logs?

>
> Thanks,
> Lars
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/