Hello!

I'm currently experimenting with how a good DRBD dual-primary setup can be achieved with Pacemaker. I know all the "you have to have good fencing in place" advice ... that is exactly what I'm currently trying to test in my setup, among other things.

But even without a node crashing or the link dropping, I already run into a split-brain situation whenever a node comes back up that was previously in standby.

For example: I have both nodes running, connected, both Primary, and everything is fine. I put one node into standby, and DRBD is stopped on that node.

I do some work, reboot the server, and so on; finally I try to rejoin the node to the cluster. Pacemaker starts all resources, and eventually DRBD drops the connection, informing me about a split brain.
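For reference, taking the node out and rejoining it is nothing special, just the usual crm shell commands, roughly:

crm node standby storage-test-d    # pacemaker stops DRBD on that node
(maintenance work, reboot, ...)
crm node online storage-test-d     # pacemaker starts all resources again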

In the log this looks like:

Oct 1 19:44:42 storage-test-d kernel: [ 111.138512] block drbd10: disk( Diskless -> Attaching )
Oct 1 19:44:42 storage-test-d kernel: [ 111.139283] drbd testdata1: Method to ensure write ordering: drain
Oct 1 19:44:42 storage-test-d kernel: [ 111.139288] block drbd10: max BIO size = 1048576
Oct 1 19:44:42 storage-test-d kernel: [ 111.139296] block drbd10: drbd_bm_resize called with capacity == 838835128
Oct 1 19:44:42 storage-test-d kernel: [ 111.144488] block drbd10: resync bitmap: bits=104854391 words=1638350 pages=3200
Oct 1 19:44:42 storage-test-d kernel: [ 111.144494] block drbd10: size = 400 GB (419417564 KB)
Oct 1 19:44:42 storage-test-d kernel: [ 111.289327] block drbd10: recounting of set bits took additional 3 jiffies
Oct 1 19:44:42 storage-test-d kernel: [ 111.289334] block drbd10: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Oct 1 19:44:42 storage-test-d kernel: [ 111.289346] block drbd10: disk( Attaching -> UpToDate )
Oct 1 19:44:42 storage-test-d kernel: [ 111.289352] block drbd10: attached to UUIDs A41D74E79299A144:0000000000000000:86B0140AA1A527C0:86AF140AA1A527C1
Oct 1 19:44:42 storage-test-d kernel: [ 111.321564] drbd testdata2: conn( StandAlone -> Unconnected )
Oct 1 19:44:42 storage-test-d kernel: [ 111.321628] drbd testdata2: Starting receiver thread (from drbd_w_testdata [3211])
Oct 1 19:44:42 storage-test-d kernel: [ 111.321794] drbd testdata2: receiver (re)started
Oct 1 19:44:42 storage-test-d kernel: [ 111.321822] drbd testdata2: conn( Unconnected -> WFConnection )
Oct 1 19:44:42 storage-test-d kernel: [ 111.337708] drbd testdata1: conn( StandAlone -> Unconnected )
Oct 1 19:44:42 storage-test-d kernel: [ 111.337764] drbd testdata1: Starting receiver thread (from drbd_w_testdata [3215])
Oct 1 19:44:42 storage-test-d kernel: [ 111.337904] drbd testdata1: receiver (re)started
Oct 1 19:44:42 storage-test-d kernel: [ 111.337927] drbd testdata1: conn( Unconnected -> WFConnection )
Oct 1 19:44:43 storage-test-d kernel: [ 111.808897] block drbd10: role( Secondary -> Primary )
Oct 1 19:44:43 storage-test-d kernel: [ 111.810883] block drbd11: role( Secondary -> Primary )
Oct 1 19:44:43 storage-test-d kernel: [ 111.820040] drbd testdata2: Handshake successful: Agreed network protocol version 101
Oct 1 19:44:43 storage-test-d kernel: [ 111.820046] drbd testdata2: Agreed to support TRIM on protocol level
Oct 1 19:44:43 storage-test-d kernel: [ 111.823292] block drbd10: new current UUID 8369EB6F395C0D29:A41D74E79299A144:86B0140AA1A527C0:86AF140AA1A527C1
Oct 1 19:44:43 storage-test-d kernel: [ 111.836096] drbd testdata1: Handshake successful: Agreed network protocol version 101
Oct 1 19:44:43 storage-test-d kernel: [ 111.836108] drbd testdata1: Agreed to support TRIM on protocol level
Oct 1 19:44:43 storage-test-d kernel: [ 111.848917] block drbd11: new current UUID 69A056C665A38F35:C8B4320C2FE11A0C:D13C0AA6DC58CC8C:D13B0AA6DC58CC8D
Oct 1 19:44:43 storage-test-d kernel: [ 111.871100] drbd testdata2: conn( WFConnection -> WFReportParams )
Oct 1 19:44:43 storage-test-d kernel: [ 111.871108] drbd testdata2: Starting asender thread (from drbd_r_testdata [3249])
Oct 1 19:44:43 storage-test-d kernel: [ 111.909687] drbd testdata1: conn( WFConnection -> WFReportParams )
Oct 1 19:44:43 storage-test-d kernel: [ 111.909695] drbd testdata1: Starting asender thread (from drbd_r_testdata [3270])
Oct 1 19:44:43 storage-test-d kernel: [ 111.943986] drbd testdata2: meta connection shut down by peer.
Oct 1 19:44:43 storage-test-d kernel: [ 111.944063] drbd testdata2: conn( WFReportParams -> NetworkFailure )
Oct 1 19:44:43 storage-test-d kernel: [ 111.944067] drbd testdata2: asender terminated
Oct 1 19:44:43 storage-test-d kernel: [ 111.944070] drbd testdata2: Terminating drbd_a_testdata
Oct 1 19:44:43 storage-test-d kernel: [ 111.988005] drbd testdata1: meta connection shut down by peer.
Oct 1 19:44:43 storage-test-d kernel: [ 111.988089] drbd testdata1: conn( WFReportParams -> NetworkFailure )
Oct 1 19:44:43 storage-test-d kernel: [ 111.988094] drbd testdata1: asender terminated
Oct 1 19:44:43 storage-test-d kernel: [ 111.988098] drbd testdata1: Terminating drbd_a_testdata
Oct 1 19:44:43 storage-test-d kernel: [ 112.031948] drbd testdata2: Connection closed
Oct 1 19:44:43 storage-test-d kernel: [ 112.032116] drbd testdata2: conn( NetworkFailure -> Unconnected )
Oct 1 19:44:43 storage-test-d kernel: [ 112.032121] drbd testdata2: receiver terminated
Oct 1 19:44:43 storage-test-d kernel: [ 112.032124] drbd testdata2: Restarting receiver thread
Oct 1 19:44:43 storage-test-d kernel: [ 112.032127] drbd testdata2: receiver (re)started
Oct 1 19:44:43 storage-test-d kernel: [ 112.032136] drbd testdata2: conn( Unconnected -> WFConnection )
Oct 1 19:44:43 storage-test-d kernel: [ 112.096002] drbd testdata1: Connection closed
Oct 1 19:44:43 storage-test-d kernel: [ 112.096194] drbd testdata1: conn( NetworkFailure -> Unconnected )

To resolve this, I simply put the cluster into maintenance mode, stop DRBD on the node I just brought back, reconnect on the other side, and then start the rejoined node's DRBD again. It then connects into Secondary/Primary state without a problem, and without any data being discarded. Afterwards I can go back to Primary/Primary. In a real-world setup, fencing would have kicked in at this point and, with a bit of bad luck, might even have fenced the healthy node (I already saw the "split brain detected" messages on both sides of the cluster), bringing the possibly outdated side up with all of its ancient data.
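In commands, that manual recovery looks roughly like this (shown for testdata1; testdata2 is handled the same way):

# on either node: keep pacemaker from reacting while DRBD is fixed by hand
crm configure property maintenance-mode=true
# on the node that just rejoined (storage-test-d): take the resource down
drbdadm down testdata1
# on the surviving node (storage-test-c): listen for the peer again
drbdadm connect testdata1
# back on storage-test-d: bring the resource up; it connects as Secondary
drbdadm up testdata1
# once /proc/drbd shows Connected and UpToDate, hand control back
crm configure property maintenance-mode=false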

As I can see from the logs, the resource is being promoted while it is still in WFConnection state. I assume this is the problem here: both sides are already Primary by the time the connection is actually established, and then one node drops the connection. I don't think this can be the desired behaviour. How can Pacemaker be made to promote DRBD only once it is already connected and has (presumably) good data? In other words:

1. bring DRBD up as Secondary
2. let DRBD determine whether data has to be resynced, and so on
3. once DRBD is finally in "Secondary/UpToDate" state, promote it, and only afterwards start the services that rely on the DRBD device. If something goes wrong, the promote should fail and the cluster could then fence the outdated node. I know there might be a RARE situation where it is necessary to bring a Secondary/Unknown node up to Primary (e.g. the cluster was degraded and, for some reason, the remaining good node restarted or had to be restarted) - but that could be handled manually.
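From reading the DRBD documentation, I suspect DRBD's own fencing hooks are meant to cover exactly this, but I have not enabled them in my test setup yet. A minimal sketch of what I mean, per resource in drbd.conf (DRBD 8.4 syntax; the fence scripts ship with drbd-utils, and I am assuming their standard install path):

resource testdata1 {
        disk {
                # freeze I/O and call the fence-peer handler when the
                # peer becomes unreachable
                fencing resource-and-stonith;
        }
        handlers {
                # sets a constraint in the CIB that keeps the master role
                # away from nodes DRBD does not know to be up to date
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # removes that constraint again after a successful resync
                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
        # ... existing on/net/disk sections unchanged ...
}

If I understand the scripts correctly, that constraint would block the promotion of a disconnected, possibly outdated node, which sounds like the behaviour described in the three steps above - but I would appreciate confirmation that this is the intended mechanism.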

This is the portion of the cluster config that should, generally speaking, be everything directly related to DRBD:

primitive drbd_testdata1 ocf:linbit:drbd \
        params drbd_resource="testdata1" \
        op monitor interval="29s" role="Master" \
        op monitor interval="31s" role="Slave"

ms ms_drbd_testdata1 drbd_testdata1 \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"

location l-drbd1 ms_drbd_testdata1 \
        rule $id="l-drbd1-rule" 0: #uname eq storage-test-d or #uname eq storage-test-c
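For completeness: the services on top would be tied to the master role in the usual way, e.g. (fs_testdata1 is just a placeholder name here, not an actual primitive from my config):

colocation c-fs-on-drbd1 inf: fs_testdata1 ms_drbd_testdata1:Master
order o-fs-after-drbd1 inf: ms_drbd_testdata1:promote fs_testdata1:start

so a filesystem would only ever start after a successful promote - which is why a failing promote on bad data would already keep the dependent services down.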


Thanks for any hints in advance,
Felix

