On Wed, Dec 21, 2011 at 11:18:23AM +0100, Ulrich Windl wrote:
> >>> Andreas Kurz <andr...@hastexo.com> wrote on 20.12.2011 at 22:57 in message
> <4ef104b3.7050...@hastexo.com>:
> > Hello,
> >
> > On 12/20/2011 02:47 PM, Ulrich Windl wrote:
> > > Hi!
> > >
> > > I have a dual-primary DRBD that is not working well: It was working, then
> > > I shut it down and restarted it. DRBD complained about split brain and
> > > fenced the other node. When coming up, the other node fenced this node.
> > > IMHO neither node should have fenced the other.
> > >
> >
> > no config from drbd, no cluster config, partial/filtered logs ...
> > fragments ... you have _all_ information and can't find the problem ...
> > sorry, but I can't see how anyone can help you based on that information.
>
> Well,
>
> to me the problem looks like this: When starting, both DRBDs talk to each
> other successfully, then they say "we just talked about not being able to
> talk to each other, so let's commit suicide, because afterwards we can talk
> better to each other".
>
> I think the diagnosis for "split brain" is based on disk content, not on
> communication failure, because the nodes just talked to each other. So a
> sync, not suicide, would be the proper solution for the conflict.
>
> And as far as the DRBD logs are concerned, they are complete in the interval
> that's interesting.
>
> I only heard from third-party rumors that "this and that" isn't working, but
> nobody could actually tell me why. I was hoping to get some insight here.
>
> >
> > I personally think it is part of the free community support deal to
> > share as much information as possible if one wants help for free.
>
> Well, if anybody has a dual-primary DRBD (with OCFS on top) working with
> pacemaker, would you share your configuration with me to find out what's
> different?
>
> Here's my configuration:
> # grep -v '^[ ]*#' *
> global_common.conf:global {
> global_common.conf:    usage-count no;
> global_common.conf:}
> global_common.conf:
> global_common.conf:common {
> global_common.conf:    protocol C;
> global_common.conf:
> global_common.conf:    handlers {
> global_common.conf:        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; sync; echo b > /proc/sysrq-trigger ; reboot -f";
> global_common.conf:        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; sync; echo b > /proc/sysrq-trigger ; reboot -f";
> global_common.conf:        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; sync; echo o > /proc/sysrq-trigger ; halt -f";
> global_common.conf:        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
> global_common.conf:        out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
> global_common.conf:    }
> global_common.conf:
> global_common.conf:    startup {
> global_common.conf:        become-primary-on both;
> global_common.conf:        wfc-timeout 15;
> global_common.conf:    }
> global_common.conf:
> global_common.conf:    disk {
> global_common.conf:        use-bmbv;
> global_common.conf:    }

So you do not even have DRBD fencing configured,
yet claim that DRBD fencing was shooting your nodes.
Yeah, right.

> global_common.conf:
> global_common.conf:    net {
> global_common.conf:        allow-two-primaries;
> global_common.conf:        after-sb-0pri discard-zero-changes;
> global_common.conf:        after-sb-1pri discard-secondary;
> global_common.conf:        after-sb-2pri disconnect;
> global_common.conf:    }
> global_common.conf:
> global_common.conf:    syncer {
> global_common.conf:    }
> global_common.conf:}
> r0.res:resource r0 {
> r0.res:    device /dev/drbd_r0 minor 0;
> r0.res:    disk /dev/sys/samba;
> r0.res:    meta-disk internal;
> r0.res:    on h02 {
> r0.res:        address 172.20.78.2:7780;
> r0.res:    }
> r0.res:    on h06 {
> r0.res:        address 172.20.78.6:7780;
> r0.res:    }
> r0.res:    syncer {
> r0.res:        rate 7M;
> r0.res:    }
> r0.res:}

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
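
For illustration, regarding the remark that the quoted configuration has no DRBD fencing: a minimal sketch of what DRBD-level fencing might look like here. It assumes DRBD 8.3 syntax (matching the quoted files) and assumes that the Pacemaker helper scripts crm-fence-peer.sh and crm-unfence-peer.sh are installed under /usr/lib/drbd/ next to the notify scripts. Neither the fencing policy nor these script paths appear in the original configuration, so treat this as a starting point rather than a verified fix.

```
common {
    disk {
        # Assumption: when the replication link is lost, suspend I/O
        # and call the fence-peer handler before continuing.
        fencing resource-and-stonith;
    }

    handlers {
        # Assumed paths; check where your distribution installs these scripts.
        fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
}
```

With a policy of resource-and-stonith, DRBD freezes I/O on loss of the peer and resumes only once the fence-peer handler has returned. In a dual-primary setup this is generally combined with working STONITH in the cluster itself.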