Hello,

I had to restore a VM snapshot of a DRBD machine today to revert a catastrophic 
DB update. However, after restoring the snapshot, I can't seem to get the two 
nodes to connect. I have already followed the documented recommendations for 
split-brain recovery, and the results are shown below.

This is my primary:
[root@MCM5-DB4 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:3136608 dr:9182265 al:31009 bm:15031 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1191752
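
For anyone who wants to compare the two nodes programmatically, here is a minimal sketch that pulls the connection state, roles and disk states out of the device line above. The function name and the dictionary keys are my own choices for illustration, not anything DRBD provides.

```python
import re

# Device status line copied from the /proc/drbd output above.
STATUS_LINE = " 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----"


def parse_status(line):
    """Split the key:value fields of a /proc/drbd device line (hypothetical helper)."""
    # Matches pairs such as "cs:StandAlone"; the leading "0:" is skipped
    # because it is followed by a space rather than a value.
    fields = dict(re.findall(r"(\w+):(\S+)", line))
    local_role, peer_role = fields["ro"].split("/")
    local_disk, peer_disk = fields["ds"].split("/")
    return {
        "connection": fields["cs"],
        "local_role": local_role,
        "peer_role": peer_role,
        "local_disk": local_disk,
        "peer_disk": peer_disk,
    }


status = parse_status(STATUS_LINE)
print(status)
```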

And from /var/log/messages on the primary:
Dec  5 10:20:07 localhost kernel: block drbd0: conn( StandAlone -> Unconnected )
Dec  5 10:20:07 localhost kernel: block drbd0: Starting receiver thread (from drbd0_worker [1629])
Dec  5 10:20:07 localhost kernel: block drbd0: receiver (re)started
Dec  5 10:20:07 localhost kernel: block drbd0: conn( Unconnected -> WFConnection )
Dec  5 10:20:07 localhost kernel: block drbd0: Handshake successful: Agreed network protocol version 97
Dec  5 10:20:07 localhost kernel: block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Dec  5 10:20:07 localhost kernel: block drbd0: conn( WFConnection -> WFReportParams )
Dec  5 10:20:07 localhost kernel: block drbd0: Starting asender thread (from drbd0_receiver [6450])
Dec  5 10:20:07 localhost kernel: block drbd0: data-integrity-alg: <not-used>
Dec  5 10:20:07 localhost kernel: block drbd0: drbd_sync_handshake:
Dec  5 10:20:07 localhost kernel: block drbd0: self 5E31C2DC5B55B225:D72E026811DB74F3:20B3ACD5A61DC8E5:20B2ACD5A61DC8E5 bits:297938 flags:0
Dec  5 10:20:07 localhost kernel: block drbd0: peer A98F08F32D2FCB34:0000000000000000:5E32C2DC5B55B224:5E31C2DC5B55B225 bits:183494167 flags:1
Dec  5 10:20:07 localhost kernel: block drbd0: uuid_compare()=-2 by rule 60
Dec  5 10:20:07 localhost kernel: block drbd0: I shall become SyncTarget, but I am primary!
Dec  5 10:20:07 localhost kernel: block drbd0: conn( WFReportParams -> Disconnecting )
Dec  5 10:20:07 localhost kernel: block drbd0: error receiving ReportState, l: 4!
Dec  5 10:20:07 localhost kernel: block drbd0: asender terminated
Dec  5 10:20:07 localhost kernel: block drbd0: Terminating drbd0_asender
Dec  5 10:20:07 localhost kernel: block drbd0: Connection closed
Dec  5 10:20:07 localhost kernel: block drbd0: conn( Disconnecting -> StandAlone )
Dec  5 10:20:07 localhost kernel: block drbd0: receiver terminated
Dec  5 10:20:07 localhost kernel: block drbd0: Terminating drbd0_receiver

And this is my secondary:
[root@MCM5-DB5 log]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:733976668

With the log:
Dec  5 10:20:07 localhost kernel: block drbd0: Handshake successful: Agreed network protocol version 97
Dec  5 10:20:07 localhost kernel: block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
Dec  5 10:20:07 localhost kernel: block drbd0: conn( WFConnection -> WFReportParams )
Dec  5 10:20:07 localhost kernel: block drbd0: Starting asender thread (from drbd0_receiver [2171])
Dec  5 10:20:07 localhost kernel: block drbd0: data-integrity-alg: <not-used>
Dec  5 10:20:07 localhost kernel: block drbd0: drbd_sync_handshake:
Dec  5 10:20:07 localhost kernel: block drbd0: self A98F08F32D2FCB34:0000000000000000:5E32C2DC5B55B224:5E31C2DC5B55B225 bits:183494167 flags:0
Dec  5 10:20:07 localhost kernel: block drbd0: peer 5E31C2DC5B55B225:D72E026811DB74F3:20B3ACD5A61DC8E5:20B2ACD5A61DC8E5 bits:297938 flags:2
Dec  5 10:20:07 localhost kernel: block drbd0: uuid_compare()=2 by rule 80
Dec  5 10:20:07 localhost kernel: block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake.
Dec  5 10:20:07 localhost kernel: block drbd0: meta connection shut down by peer.
Dec  5 10:20:07 localhost kernel: block drbd0: conn( WFReportParams -> NetworkFailure )
Dec  5 10:20:07 localhost kernel: block drbd0: asender terminated
Dec  5 10:20:07 localhost kernel: block drbd0: Terminating drbd0_asender
Dec  5 10:20:08 localhost kernel: block drbd0: bitmap WRITE of 5600 pages took 183 jiffies
Dec  5 10:20:08 localhost kernel: block drbd0: 700 GB (183494167 bits) marked out-of-sync by on disk bit-map.
Dec  5 10:20:08 localhost kernel: block drbd0: error receiving ReportState, l: 4!
Dec  5 10:20:08 localhost kernel: block drbd0: Connection closed
Dec  5 10:20:08 localhost kernel: block drbd0: conn( NetworkFailure -> Unconnected )
Dec  5 10:20:08 localhost kernel: block drbd0: receiver terminated
Dec  5 10:20:08 localhost kernel: block drbd0: Restarting drbd0_receiver
Dec  5 10:20:08 localhost kernel: block drbd0: receiver (re)started
Dec  5 10:20:08 localhost kernel: block drbd0: conn( Unconnected -> WFConnection )
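
My own reading of the two drbd_sync_handshake lines, as a rough sketch rather than DRBD's actual comparison code: the primary's current generation identifier shows up in the secondary's history slots, which would fit the primary being asked to become SyncTarget. The sketch assumes the four fields are ordered current, bitmap, then two history entries, and that the lowest bit is a flag to be masked off before comparing. Both are assumptions on my part, and the helper names are made up for illustration.

```python
# Generation identifiers copied from the drbd_sync_handshake log lines above.
# Assumed field order: current : bitmap : history : history.
PRIMARY_UUIDS = "5E31C2DC5B55B225:D72E026811DB74F3:20B3ACD5A61DC8E5:20B2ACD5A61DC8E5"
SECONDARY_UUIDS = "A98F08F32D2FCB34:0000000000000000:5E32C2DC5B55B224:5E31C2DC5B55B225"


def parse_uuids(text):
    """Parse a colon-separated identifier list into integers (hypothetical helper).

    Assumption: bit 0 is a flag rather than part of the identifier,
    so it is cleared before any comparison.
    """
    return [int(part, 16) & ~1 for part in text.split(":")]


def current_in_peer_history(node, peer):
    """True if node's current identifier appears among peer's history entries."""
    current = parse_uuids(node)[0]
    peer_history = parse_uuids(peer)[2:]
    return current in peer_history


primary_is_older = current_in_peer_history(PRIMARY_UUIDS, SECONDARY_UUIDS)
secondary_is_older = current_in_peer_history(SECONDARY_UUIDS, PRIMARY_UUIDS)
print("primary's current UUID found in secondary's history:", primary_is_older)
print("secondary's current UUID found in primary's history:", secondary_is_older)
```

If that reading is right, the secondary already regards the restored snapshot's data generation as an older one, which is the opposite of the direction I want to sync in.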

Can anyone tell me how to get my primary to connect and push its data over to 
the secondary?

Thanks!
Tyler Hains

_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
