You said you booted the hosts sequentially, but according to the logs they were starting in parallel.
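One way to check this from the kernel lines quoted in this thread: subtracting the uptime in square brackets from the syslog timestamp gives an approximate boot time per host. Below is a minimal Python sketch, assuming the line format seen in the quoted excerpts. The regex and the function name `approx_boot_time` are my own illustration, not part of any cluster tool, and the two sample lines are retyped from the quoted logs with plain ASCII hyphens.

```python
import re
from datetime import datetime, timedelta

# Assumed format: "<ISO timestamp> <host> kernel: [<uptime seconds>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<host>\S+)\s+kernel:\s+\[\s*(?P<uptime>\d+\.\d+)\]"
)


def approx_boot_time(line):
    """Return (host, approximate boot time) for one kernel log line, or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    stamp = datetime.fromisoformat(m.group("ts"))
    # Wall-clock time minus kernel uptime is roughly when the kernel started.
    return m.group("host"), stamp - timedelta(seconds=float(m.group("uptime")))


# Sample lines retyped from the quoted logs (one per host).
lines = [
    "2019-08-09T17:42:19.299886+02:00 ha-idg-2 kernel: [ 1232.117470] "
    "tg3 0000:03:04.0 eth2: Link is up at 1000 Mbps, full duplex",
    "2019-08-09T17:42:19.168035+02:00 ha-idg-1 kernel: [  110.164250] "
    "tg3 0000:02:00.3 eth3: Link is up at 1000 Mbps, full duplex",
]

boots = dict(approx_boot_time(line) for line in lines)
for host, boot in sorted(boots.items()):
    print(host, boot.isoformat())
```

The result is only as good as the clocks on both hosts, so it relies on the NTP synchronisation mentioned in the quoted message. Comparing the per-host values with the time Pacemaker started (17:43:04 on ha-idg-1) shows how the boots and the cluster start were ordered.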
>>> "Lentes, Bernd" <bernd.len...@helmholtz-muenchen.de> wrote on 13.08.2019 at 13:53 in message
<767205671.1953556.1565697218136.javamail.zim...@helmholtz-muenchen.de>:
> ----- On Aug 12, 2019, at 7:47 PM, Chris Walker cwal...@cray.com wrote:
>
>> When ha-idg-1 started Pacemaker around 17:43, it did not see ha-idg-2, for
>> example,
>>
>> Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd: info: pcmk_quorum_notification: Quorum retained | membership=1320 members=1
>>
>> after ~20s (dc-deadtime parameter), ha-idg-2 is marked 'unclean' and STONITHed
>> as part of startup fencing.
>>
>> There is nothing in ha-idg-2's HA logs around 17:43 indicating that it saw
>> ha-idg-1 either, so it appears that there was no communication at all between
>> the two nodes.
>>
>> I'm not sure exactly why the nodes did not see one another, but there are
>> indications of network issues around this time
>>
>> 2019-08-09T17:42:16.427947+02:00 ha-idg-2 kernel: [ 1229.245533] bond1: now running without any active interface!
>>
>> so perhaps that's related.
>
> This is the initialization of the bond1 on ha-idg-1 during boot.
> 3 seconds later bond1 is fine:
>
> 2019-08-09T17:42:19.299886+02:00 ha-idg-2 kernel: [ 1232.117470] tg3 0000:03:04.0 eth2: Link is up at 1000 Mbps, full duplex
> 2019-08-09T17:42:19.299908+02:00 ha-idg-2 kernel: [ 1232.117482] tg3 0000:03:04.0 eth2: Flow control is on for TX and on for RX
> 2019-08-09T17:42:19.315756+02:00 ha-idg-2 kernel: [ 1232.131565] tg3 0000:03:04.1 eth3: Link is up at 1000 Mbps, full duplex
> 2019-08-09T17:42:19.315767+02:00 ha-idg-2 kernel: [ 1232.131568] tg3 0000:03:04.1 eth3: Flow control is on for TX and on for RX
> 2019-08-09T17:42:19.351781+02:00 ha-idg-2 kernel: [ 1232.169386] bond1: link status definitely up for interface eth2, 1000 Mbps full duplex
> 2019-08-09T17:42:19.351792+02:00 ha-idg-2 kernel: [ 1232.169390] bond1: making interface eth2 the new active one
> 2019-08-09T17:42:19.352521+02:00 ha-idg-2 kernel: [ 1232.169473] bond1: first active interface up!
> 2019-08-09T17:42:19.352532+02:00 ha-idg-2 kernel: [ 1232.169480] bond1: link status definitely up for interface eth3, 1000 Mbps full duplex
>
> also on ha-idg-1:
>
> 2019-08-09T17:42:19.168035+02:00 ha-idg-1 kernel: [ 110.164250] tg3 0000:02:00.3 eth3: Link is up at 1000 Mbps, full duplex
> 2019-08-09T17:42:19.168050+02:00 ha-idg-1 kernel: [ 110.164252] tg3 0000:02:00.3 eth3: Flow control is on for TX and on for RX
> 2019-08-09T17:42:19.168052+02:00 ha-idg-1 kernel: [ 110.164254] tg3 0000:02:00.3 eth3: EEE is disabled
> 2019-08-09T17:42:19.172020+02:00 ha-idg-1 kernel: [ 110.171378] tg3 0000:02:00.2 eth2: Link is up at 1000 Mbps, full duplex
> 2019-08-09T17:42:19.172028+02:00 ha-idg-1 kernel: [ 110.171380] tg3 0000:02:00.2 eth2: Flow control is on for TX and on for RX
> 2019-08-09T17:42:19.172029+02:00 ha-idg-1 kernel: [ 110.171382] tg3 0000:02:00.2 eth2: EEE is disabled
> ...
> 2019-08-09T17:42:19.244066+02:00 ha-idg-1 kernel: [ 110.240310] bond1: link status definitely up for interface eth2, 1000 Mbps full duplex
> 2019-08-09T17:42:19.244083+02:00 ha-idg-1 kernel: [ 110.240311] bond1: making interface eth2 the new active one
> 2019-08-09T17:42:19.244085+02:00 ha-idg-1 kernel: [ 110.240353] bond1: first active interface up!
> 2019-08-09T17:42:19.244087+02:00 ha-idg-1 kernel: [ 110.240356] bond1: link status definitely up for interface eth3, 1000 Mbps full duplex
>
> And the cluster is started afterwards on ha-idg-1 at 17:43:04. I don't find
> further entries for problems with bond1. So I think it's not related.
> Time is synchronized by ntp.
>
> Bernd

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/