Re: [Ubuntu-ha] [Bug 799711] Re: o2cb[11796]: ERROR: ocfs2_controld.pcmk did not come up
Henning,

You are probably right that this isn't the right place to continue with non-bug questions, though Andres or others could answer that more accurately. I would recommend the drbd-users mailing list, as there are many experts there for configuration and troubleshooting. The pacemaker mailing list is also a good one.

*snip*

I don't have any idea why you're getting the broken pipe, but do you have STONITH/fencing configured? Normally, when a communication link breaks like that, you want Pacemaker to STONITH the disconnected node. Before the STONITH happens, whichever DRBD node is going to survive should fence the resource, preventing the disconnected node's copy from becoming primary until it is UpToDate.

I use DRBD's fence-peer handler with fencing set to resource-only, and have STONITH configured in Pacemaker. This way a break in the DRBD link doesn't automatically STONITH the node (but it does prevent the borked DRBD from coming up as Master and causing split brain), unless the cluster communications are also dead, at which point Pacemaker will shoot the node.

I think the lack of fencing/STONITH is causing the split brain: when the nodes are not communicating, each does its own thing, and that is what causes the data sets to diverge.

> > This causes a split brain every time this happens even though there are
> > no writes on the devices yet.

Do you still have your split brain handling configured like this?

    after-sb-0pri disconnect;
    after-sb-1pri consensus;
    after-sb-2pri disconnect;

With these split brain lines you are telling it to disconnect regardless of changes (I believe that's part of the cause). Have you considered using more aggressive split brain handling if you're going with dual primary? These settings can be a controversial topic because of the risk of data loss, but they are worth a look.
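For reference, the fencing arrangement described above (fence-peer handler with resource-only fencing) might look roughly like the fragment below. This is only a sketch, assuming DRBD 8.3-style syntax and the Pacemaker helper scripts that ship with DRBD; the resource name "r0" is a placeholder and the script paths are the usual defaults, which may differ on your system.

```
resource r0 {                      # "r0" is a placeholder resource name
  disk {
    fencing resource-only;         # fence the resource, do not freeze I/O or STONITH
  }
  handlers {
    # helper scripts shipped with DRBD; paths are assumed defaults
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  # ... rest of the resource definition stays as it is
}
```

The fence-peer script places a constraint in the Pacemaker configuration that keeps the outdated side from being promoted, and the after-resync-target script removes it again once the resync has finished. STONITH itself is configured separately in Pacemaker.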
The users guide lists, at the bottom of this page, split brain behaviors that are considered OK for dual primary/clustered filesystem setups:

http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html

    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;

You would likely hit after-sb-1pri if you had fencing. It would go something like this:

1. Break in comms.
2. Node 1 is fenced, preventing it from becoming master.
3. Pacemaker shoots Node 1.
4. Node 1 reboots and (if set up to auto start the cluster) pacemaker accepts the node back into the cluster.
5. DRBD links up and finds that Node 1 has diverged.
6. Node 1 is still fenced, so it is not master at this point.
7. DRBD considers the after-sb-1pri rule. This assumes that, since the setup is dual primary, if only one node is primary then the dataset on that primary is always good and there is no need to preserve the secondary's data, so it is simply overwritten. This basically executes the commands you ran manually and discards the Node 1 data.
8. Once Node 1 is UpToDate, DRBD removes the fencing and allows Node 1 to become master.

I hope all of that wasn't too confusing!

Jake

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/799711

Title:
  o2cb[11796]: ERROR: ocfs2_controld.pcmk did not come up

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ocfs2-tools/+bug/799711/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
Re: [Ubuntu-ha] [Bug 799711] Re: o2cb[11796]: ERROR: ocfs2_controld.pcmk did not come up
On Thu, 23 Jun 2011, at 07:13 +, HenningMalzahn wrote:
> 3. When you dpkg-reconfigure ocfs2-tools package, and after the output has
> finished showing, did you disable o2cb as showed in the HowTo? "sudo
> update-rc.d o2cb disable"
> - Yes, did that. Enabled the services to be loaded at boot time and answered
> all other questions accepting the defaults.
>
> 4. When you use OCFS2 with pacemaker you *don't* have to create
> /etc/ocfs2/cluster.conf. Please drop that file.
> - Did that too.

That's why it doesn't work. OCFS2 supports two cluster modes.

One is OCFS2 native, for which you have to enable the o2cb service and set up /etc/ocfs2/cluster.conf. For this setup you don't need pacemaker.

The other mode is when you integrate OCFS2 with pacemaker. For that you have to disable the o2cb service in upstart, remove /etc/ocfs2/cluster.conf and set up OCFS2 within pacemaker.

If you removed /etc/ocfs2/cluster.conf but didn't integrate OCFS2 with pacemaker, it won't work.

-- 
Ante Karamatic
OEM Server Engineer, Canonical Ltd
ante.karama...@canonical.com
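As a rough illustration of the second mode, setting up OCFS2 within pacemaker usually means running the DLM control daemon and o2cb as cloned cluster resources. The crm configuration below is only a sketch under assumptions: the resource, clone and constraint names are made up, the intervals are arbitrary, and it presumes that the ocf:pacemaker:controld and ocf:pacemaker:o2cb agents are installed on every node.

```
# Hypothetical names and intervals; adjust to the actual cluster
primitive resDLM ocf:pacemaker:controld \
    op monitor interval="120s"
primitive resO2CB ocf:pacemaker:o2cb \
    op monitor interval="120s"
clone cloneDLM resDLM meta interleave="true"
clone cloneO2CB resO2CB meta interleave="true"
colocation colO2CBDLM inf: cloneO2CB cloneDLM
order ordDLMO2CB 0: cloneDLM cloneO2CB
```

The o2cb resource is colocated with and ordered after the DLM resource. In this mode the o2cb init service stays disabled and /etc/ocfs2/cluster.conf stays removed, as described above.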