Hello all,
It seems that we are facing the same problem on our ganeti clusters that
have been upgraded from squeeze to wheezy. The live migration failure rate
is huge: roughly 1 out of every 8-10 migrations fails. When these nodes were
running squeeze there were absolutely no such issues.
For comparison purposes, here are the package details of our clusters:
Squeeze:
  drbd8-utils:
    Installed: 2:8.3.7-2.1
  ganeti2:
    Installed: 2.4.2+ippool5-1
  linux-image-2.6.32-5-amd64:
    Installed: 2.6.32-48squeeze3
Wheezy:
  drbd8-utils:
    Installed: 2:8.3.13-2
  ganeti2:
    Installed: 2.8.1-1~bpo70+httpboot
  linux-image-3.2.0-4-amd64:
    Installed: 3.2.51-1
I am attaching the relevant log files from two failed migrations. To
replicate the issue, just migrate a VM back and forth in a "while true"
loop and wait a little while.
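
For anyone who wants to reproduce this, the loop can be sketched roughly as
follows. This is only a minimal sketch, not the exact script we ran: the
function name, the round limit and the pause between rounds are assumptions
made for illustration, and it relies on "gnt-instance migrate -f" returning a
non-zero exit code when the migration fails.

```python
import subprocess
import time


def migrate_until_failure(instance, max_rounds=50, pause=5.0, run=subprocess.call):
    """Migrate `instance` repeatedly.

    Returns the 1-based round in which the migration command first failed,
    or None if all `max_rounds` migrations succeeded.
    """
    for round_no in range(1, max_rounds + 1):
        # "-f" skips the interactive confirmation prompt.
        rc = run(["gnt-instance", "migrate", "-f", instance])
        if rc != 0:
            return round_no
        # Short pause between rounds (an assumption of this sketch).
        time.sleep(pause)
    return None
```

Calling migrate_until_failure("migrateme.vm.grnet.gr") on the master node
should stop at the first failed round; the "run" parameter exists only so the
loop can be exercised without a cluster.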
We have experienced the same issue with ganeti 2.6, 2.7 and 2.8, and also
on squeeze running kernel 3.2 from backports.
The common denominator in all problematic situations seems to be kernel 3.2,
but maybe there's a way to overcome this issue in ganeti itself.
Can someone with better insight into drbd/ganeti/kernel take a look at the
proposed "option a" fix from
http://lists.linbit.com/pipermail/drbd-user/2013-July/020173.html ?
Would that work?
We would really like some help from the ganeti-devel team to solve this
grave issue. We are available for any testing that might be needed.
Regards,
George Kargiotakis
On Tuesday, July 16, 2013 4:01:06 PM UTC+3, candlerb wrote:
>
> Some feedback from the drbd-user list:
>
> http://lists.linbit.com/pipermail/drbd-user/2013-July/020166.html
>
> "> Jul 11 10:59:46 wrn-vm2 kernel: [236603.135779] block drbd0: I shall
> > become SyncTarget, but I am primary!
> The message above is MUCH more frightening than the line below.
>
> Apparently a node was promoted right in the middle of a resync
> handshake, and did not like that at all.
>
> ...
>
> In that backend script,
>
> add a loop before the promote,
> that checks that the connection state really is "Connected",
> and the disk state really is "UpToDate"."
>
>
>
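
The check suggested in the quoted message (wait until the connection state is
"Connected" and the disk state is "UpToDate" before promoting) could look
roughly like the sketch below. It is not ganeti code: the function names, the
timeout and the polling interval are hypothetical, and it assumes the DRBD 8.3
/proc/drbd layout, where each device line reads like
" 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----".

```python
import re
import time

# Matches the per-device status line of /proc/drbd (DRBD 8.3 layout assumed).
_STATE_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+ro:(?P<ro>\S+)\s+ds:(?P<ds>\S+)",
    re.MULTILINE,
)


def drbd_state(proc_text, minor):
    """Return (connection_state, disk_states) for `minor`, or None if absent."""
    for m in _STATE_RE.finditer(proc_text):
        if int(m.group("minor")) == minor:
            return m.group("cs"), m.group("ds")
    return None


def safe_to_promote(proc_text, minor):
    """True only if the device is Connected and both disks are UpToDate."""
    return drbd_state(proc_text, minor) == ("Connected", "UpToDate/UpToDate")


def wait_until_safe(minor, timeout=30.0, interval=0.5, read=None):
    """Poll until `minor` is safe to promote; return False on timeout."""
    if read is None:
        def read():
            with open("/proc/drbd") as f:
                return f.read()
    deadline = time.monotonic() + timeout
    while True:
        if safe_to_promote(read(), minor):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the failed runs below, the node is promoted while the connection is still
in WFBitMapT or SyncTarget, which is exactly the window such a check is meant
to close. Whether this is the right place to fix it in ganeti is for the
developers to judge.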
Nov 4 11:37:23 gnt7-05 kernel: [320941.793545] block drbd1: peer( Secondary ->
Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Nov 4 11:37:23 gnt7-05 kernel: [320941.805332] block drbd1: new current UUID
F352DE775D28F401:12DD91AD18639433:4BB4240B4E6917E1:4BB3240B4E6917E1
Nov 4 11:37:23 gnt7-05 kernel: [320941.839558] block drbd1: meta connection
shut down by peer.
Nov 4 11:37:23 gnt7-05 kernel: [320941.846398] block drbd1: asender terminated
Nov 4 11:37:23 gnt7-05 kernel: [320941.851704] block drbd1: Terminating
drbd1_asender
Nov 4 11:37:23 gnt7-05 kernel: [320941.866233] block drbd1: Connection closed
Nov 4 11:37:23 gnt7-05 kernel: [320941.871348] block drbd1: conn( TearDown ->
Unconnected )
Nov 4 11:37:23 gnt7-05 kernel: [320941.877894] block drbd1: receiver terminated
Nov 4 11:37:23 gnt7-05 kernel: [320941.883157] block drbd1: Restarting
drbd1_receiver
Nov 4 11:37:23 gnt7-05 kernel: [320941.889071] block drbd1: receiver
(re)started
Nov 4 11:37:23 gnt7-05 kernel: [320941.894465] block drbd1: conn( Unconnected
-> WFConnection )
Nov 4 11:37:23 gnt7-05 kernel: [320942.008330] block drbd1: conn( WFConnection
-> Disconnecting )
Nov 4 11:37:23 gnt7-05 kernel: [320942.015185] block drbd1: Discarding network
configuration.
Nov 4 11:37:23 gnt7-05 kernel: [320942.021902] block drbd1: Connection closed
Nov 4 11:37:23 gnt7-05 kernel: [320942.027016] block drbd1: conn(
Disconnecting -> StandAlone )
Nov 4 11:37:23 gnt7-05 kernel: [320942.033851] block drbd1: receiver terminated
Nov 4 11:37:23 gnt7-05 kernel: [320942.039077] block drbd1: Terminating
drbd1_receiver
Nov 4 11:37:24 gnt7-05 kernel: [320942.442752] block drbd1: conn( StandAlone
-> Unconnected )
Nov 4 11:37:24 gnt7-05 kernel: [320942.449268] block drbd1: Starting receiver
thread (from drbd1_worker [1349])
Nov 4 11:37:24 gnt7-05 kernel: [320942.457533] block drbd1: receiver
(re)started
Nov 4 11:37:24 gnt7-05 kernel: [320942.462787] block drbd1: conn( Unconnected
-> WFConnection )
Nov 4 11:37:24 gnt7-05 kernel: [320943.187673] block drbd1: Handshake
successful: Agreed network protocol version 96
Nov 4 11:37:24 gnt7-05 kernel: [320943.196844] block drbd1: Peer authenticated
using 16 bytes of 'md5' HMAC
Nov 4 11:37:24 gnt7-05 kernel: [320943.204675] block drbd1: conn( WFConnection
-> WFReportParams )
Nov 4 11:37:24 gnt7-05 kernel: [320943.211873] block drbd1: Starting asender
thread (from drbd1_receiver [4721])
Nov 4 11:37:24 gnt7-05 kernel: [320943.220713] block drbd1:
data-integrity-alg: <not-used>
Nov 4 11:37:25 gnt7-05 kernel: [320943.226924] block drbd1:
drbd_sync_handshake:
Nov 4 11:37:25 gnt7-05 kernel: [320943.232084] block drbd1: self
F352DE775D28F401:12DD91AD18639433:4BB4240B4E6917E1:4BB3240B4E6917E1 bits:0
flags:0
Nov 4 11:37:25 gnt7-05 kernel: [320943.243844] block drbd1: peer
12DD91AD18639432:0000000000000000:4BB4240B4E6917E0:4BB3240B4E6917E1 bits:0
flags:0
Nov 4 11:37:25 gnt7-05 kernel: [320943.255536] block drbd1: uuid_compare()=1
by rule 70
Nov 4 11:37:25 gnt7-05 kernel: [320943.261569] block drbd1: peer( Unknown ->
Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Nov 4 11:37:25 gnt7-05 kernel: [320943.309761] block drbd1: helper command:
/bin/true before-resync-source minor-1
Nov 4 11:37:25 gnt7-05 kernel: [320943.319359] block drbd1: helper command:
/bin/true before-resync-source minor-1 exit code 0 (0x0)
Nov 4 11:37:25 gnt7-05 kernel: [320943.329939] block drbd1: conn( WFBitMapS ->
SyncSource ) pdsk( Consistent -> Inconsistent )
Nov 4 11:37:25 gnt7-05 kernel: [320943.339587] block drbd1: conn( SyncSource
-> WFBitMapS ) pdsk( Inconsistent -> Consistent )
Nov 4 11:37:25 gnt7-05 kernel: [320943.339707] block drbd1: updated sync UUID
F352DE775D28F401:12DE91AD18639433:12DD91AD18639433:4BB4240B4E6917E1
Nov 4 11:37:25 gnt7-05 kernel: [320943.339904] block drbd1: Began resync as
SyncSource (will sync 0 KB [0 bits set]).
Nov 4 11:37:25 gnt7-05 kernel: [320943.473000] block drbd1: peer( Secondary ->
Primary )
Nov 4 11:37:25 gnt7-05 kernel: [320943.659643] block drbd1: sock was shut down
by peer
Nov 4 11:37:25 gnt7-05 kernel: [320943.665674] block drbd1: meta connection
shut down by peer.
Nov 4 11:37:25 gnt7-05 kernel: [320943.665698] block drbd1: peer( Primary ->
Unknown ) conn( WFBitMapS -> BrokenPipe ) pdsk( Consistent -> DUnknown )
Nov 4 11:37:25 gnt7-05 kernel: [320943.665708] block drbd1: short read
expecting header on sock: r=0
Nov 4 11:37:25 gnt7-05 kernel: [320943.683418] block drbd1: bitmap WRITE of 24
pages took 5 jiffies
Nov 4 11:37:25 gnt7-05 kernel: [320943.683537] block drbd1: 0 KB (0 bits)
marked out-of-sync by on disk bit-map.
Nov 4 11:37:25 gnt7-05 kernel: [320943.714752] block drbd1: asender terminated
Nov 4 11:37:25 gnt7-05 kernel: [320943.719679] block drbd1: Terminating
drbd1_asender
Nov 4 11:37:25 gnt7-05 kernel: [320943.719830] block drbd1: Connection closed
Nov 4 11:37:25 gnt7-05 kernel: [320943.719837] block drbd1: conn( BrokenPipe
-> Unconnected )
Nov 4 11:37:25 gnt7-05 kernel: [320943.719844] block drbd1: receiver terminated
Nov 4 11:37:25 gnt7-05 kernel: [320943.719848] block drbd1: Restarting
drbd1_receiver
Nov 4 11:37:25 gnt7-05 kernel: [320943.719851] block drbd1: receiver
(re)started
Nov 4 11:37:25 gnt7-05 kernel: [320943.719857] block drbd1: conn( Unconnected
-> WFConnection )
Mon Nov 4 11:54:11 2013 Migrating instance migrateme.vm.grnet.gr
Mon Nov 4 11:54:11 2013 * checking disk consistency between source and target
Mon Nov 4 11:54:12 2013 * switching node gnt7-05.cluster.tld to secondary mode
Mon Nov 4 11:54:12 2013 * changing into standalone mode
Mon Nov 4 11:54:13 2013 * changing disks into dual-master mode
Mon Nov 4 11:54:14 2013 * wait until resync is done
Failure: command execution error:
Cannot resync disks on node gnt7-01.cluster.tld: DRBD device <<class
'ganeti.bdev.DRBD8'>: unique_id: ('X.Y.Z.186', 11054, 'X.Y.Z.178', 11054, 2,
'2c5aac54fc9a537c5207f471b827d1cd20b2291a'), children: [<<class
'ganeti.bdev.LogicalVolume'>: unique_id: ('ganeti',
'95e7a41b-d955-435b-a9cf-fcffe4763565.disk0_data'), children: [], 253:5,
/dev/ganeti/95e7a41b-d955-435b-a9cf-fcffe4763565.disk0_data>, <<class
'ganeti.bdev.LogicalVolume'>: unique_id: ('ganeti',
'95e7a41b-d955-435b-a9cf-fcffe4763565.disk0_meta'), children: [], 253:6,
/dev/ganeti/95e7a41b-d955-435b-a9cf-fcffe4763565.disk0_meta>], 147:2,
/dev/drbd2> is not in sync: stats=<ganeti.bdev.DRBD8Status object at 0x2e7ac90>
Mon Nov 4 11:37:21 2013 Migrating instance migrateme.vm.grnet.gr
Mon Nov 4 11:37:22 2013 * checking disk consistency between source and target
Mon Nov 4 11:37:22 2013 * switching node gnt7-01.cluster.tld to secondary mode
Mon Nov 4 11:37:23 2013 * changing into standalone mode
Mon Nov 4 11:37:23 2013 * changing disks into dual-master mode
Mon Nov 4 11:37:25 2013 * wait until resync is done
Failure: command execution error:
Cannot resync disks on node gnt7-01.cluster.tld: DRBD device <<class
'ganeti.bdev.DRBD8'>: unique_id: ('X.Y.Z.186', 11054, 'X.Y.Z.178', 11054, 2,
'2c5aac54fc9a537c5207f471b827d1cd20b2291a'), children: [<<class
'ganeti.bdev.LogicalVolume'>: unique_id: ('ganeti',
'95e7a41b-d955-435b-a9cf-fcffe4763565.disk0_data'), children: [], 253:5,
/dev/ganeti/95e7a41b-d955-435b-a9cf-fcffe4763565.disk0_data>, <<class
'ganeti.bdev.LogicalVolume'>: unique_id: ('ganeti',
'95e7a41b-d955-435b-a9cf-fcffe4763565.disk0_meta'), children: [], 253:6,
/dev/ganeti/95e7a41b-d955-435b-a9cf-fcffe4763565.disk0_meta>], 147:2,
/dev/drbd2> is not in sync: stats=<ganeti.bdev.DRBD8Status object at 0x2e7ac90>
Nov 4 11:54:13 gnt7-05 kernel: [321949.679761] block drbd1: peer( Primary ->
Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Nov 4 11:54:13 gnt7-05 kernel: [321949.705073] block drbd1: asender terminated
Nov 4 11:54:13 gnt7-05 kernel: [321949.710232] block drbd1: Terminating
drbd1_asender
Nov 4 11:54:13 gnt7-05 kernel: [321949.710545] block drbd1: Connection closed
Nov 4 11:54:13 gnt7-05 kernel: [321949.710556] block drbd1: conn(
Disconnecting -> StandAlone )
Nov 4 11:54:13 gnt7-05 kernel: [321949.710677] block drbd1: receiver terminated
Nov 4 11:54:13 gnt7-05 kernel: [321949.710681] block drbd1: Terminating
drbd1_receiver
Nov 4 11:54:13 gnt7-05 kernel: [321950.137532] block drbd1: conn( StandAlone
-> Unconnected )
Nov 4 11:54:13 gnt7-05 kernel: [321950.144033] block drbd1: Starting receiver
thread (from drbd1_worker [1349])
Nov 4 11:54:13 gnt7-05 kernel: [321950.152339] block drbd1: receiver
(re)started
Nov 4 11:54:13 gnt7-05 kernel: [321950.157508] block drbd1: conn( Unconnected
-> WFConnection )
Nov 4 11:54:14 gnt7-05 kernel: [321950.906247] block drbd1: Handshake
successful: Agreed network protocol version 96
Nov 4 11:54:14 gnt7-05 kernel: [321950.915625] block drbd1: Peer authenticated
using 16 bytes of 'md5' HMAC
Nov 4 11:54:14 gnt7-05 kernel: [321950.923566] block drbd1: conn( WFConnection
-> WFReportParams )
Nov 4 11:54:14 gnt7-05 kernel: [321950.930752] block drbd1: Starting asender
thread (from drbd1_receiver [8885])
Nov 4 11:54:14 gnt7-05 kernel: [321950.939826] block drbd1:
data-integrity-alg: <not-used>
Nov 4 11:54:14 gnt7-05 kernel: [321950.946016] block drbd1:
drbd_sync_handshake:
Nov 4 11:54:14 gnt7-05 kernel: [321950.951215] block drbd1: self
6FDF6EA09F4D0048:0000000000000000:9C341FC2219C7FEC:9C331FC2219C7FED bits:0
flags:0
Nov 4 11:54:14 gnt7-05 kernel: [321950.963083] block drbd1: peer
774CB90EE2E17FA3:6FDF6EA09F4D0049:9C341FC2219C7FED:9C331FC2219C7FED bits:0
flags:0
Nov 4 11:54:14 gnt7-05 kernel: [321950.974845] block drbd1: uuid_compare()=-1
by rule 50
Nov 4 11:54:14 gnt7-05 kernel: [321950.980824] block drbd1: peer( Unknown ->
Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated )
pdsk( DUnknown -> UpToDate )
Nov 4 11:54:14 gnt7-05 kernel: [321951.033029] block drbd1: role( Secondary ->
Primary )
Nov 4 11:54:14 gnt7-05 kernel: [321951.050646] block drbd1: conn( WFBitMapT ->
WFSyncUUID )
Nov 4 11:54:14 gnt7-05 kernel: [321951.124139] block drbd1: updated sync uuid
6FE06EA09F4D0049:0000000000000000:9C341FC2219C7FEC:9C331FC2219C7FED
Nov 4 11:54:14 gnt7-05 kernel: [321951.142076] block drbd1: helper command:
/bin/true before-resync-target minor-1
Nov 4 11:54:14 gnt7-05 kernel: [321951.151470] block drbd1: helper command:
/bin/true before-resync-target minor-1 exit code 0 (0x0)
Nov 4 11:54:14 gnt7-05 kernel: [321951.161667] block drbd1: conn( WFSyncUUID
-> SyncTarget ) disk( Outdated -> Inconsistent )
Nov 4 11:54:14 gnt7-05 kernel: [321951.171328] block drbd1: Began resync as
SyncTarget (will sync 0 KB [0 bits set]).
Nov 4 11:54:14 gnt7-05 kernel: [321951.180205] block drbd1: unexpected cstate
(SyncTarget) in receive_bitmap
Nov 4 11:54:14 gnt7-05 kernel: [321951.196233] block drbd1: Resync done (total
1 sec; paused 0 sec; 0 K/sec)
Nov 4 11:54:14 gnt7-05 kernel: [321951.204005] block drbd1: updated UUIDs
774CB90EE2E17FA3:0000000000000000:6FE06EA09F4D0049:6FDF6EA09F4D0049
Nov 4 11:54:14 gnt7-05 kernel: [321951.215002] block drbd1: conn( SyncTarget
-> Connected ) disk( Inconsistent -> UpToDate )
Nov 4 11:54:14 gnt7-05 kernel: [321951.228579] block drbd1: helper command:
/bin/true after-resync-target minor-1
Nov 4 11:54:14 gnt7-05 kernel: [321951.238086] block drbd1: helper command:
/bin/true after-resync-target minor-1 exit code 0 (0x0)
Nov 4 11:54:14 gnt7-05 kernel: [321951.253840] block drbd1: bitmap WRITE of 24
pages took 2 jiffies
Nov 4 11:54:14 gnt7-05 kernel: [321951.260812] block drbd1: 0 KB (0 bits)
marked out-of-sync by on disk bit-map.
Nov 4 11:54:14 gnt7-05 kernel: [321951.269458] block drbd1:
drbd_sync_handshake:
Nov 4 11:54:14 gnt7-05 kernel: [321951.274530] block drbd1: self
774CB90EE2E17FA3:0000000000000000:6FE06EA09F4D0049:6FDF6EA09F4D0049 bits:0
flags:0
Nov 4 11:54:14 gnt7-05 kernel: [321951.286188] block drbd1: peer
774CB90EE2E17FA3:6FE06EA09F4D0049:6FDF6EA09F4D0049:9C341FC2219C7FED bits:0
flags:0
Nov 4 11:54:14 gnt7-05 kernel: [321951.297779] block drbd1: was SyncTarget,
peer missed the resync finished event, corrected peer:
Nov 4 11:54:14 gnt7-05 kernel: [321951.307767] block drbd1: peer
774CB90EE2E17FA3:0000000000000000:6FE06EA09F4D0049:6FDF6EA09F4D0049 bits:0
flags:0
Nov 4 11:54:14 gnt7-05 kernel: [321951.319401] block drbd1: uuid_compare()=-1
by rule 35
Nov 4 11:54:14 gnt7-05 kernel: [321951.325278] block drbd1: I shall become
SyncTarget, but I am primary!
Nov 4 11:54:14 gnt7-05 kernel: [321951.332712] block drbd1: ASSERT( os.conn ==
C_WF_REPORT_PARAMS ) in
/build/linux-rrsxby/linux-3.2.51/drivers/block/drbd/drbd_receiver.c:3245
Nov 4 11:54:14 gnt7-05 kernel: [321951.347080] block drbd1: peer( Primary ->
Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Nov 4 11:54:14 gnt7-05 kernel: [321951.359186] block drbd1: new current UUID
AFED5C6F5DF458AF:774CB90EE2E17FA3:6FE06EA09F4D0049:6FDF6EA09F4D0049
Nov 4 11:54:14 gnt7-05 kernel: [321951.359257] block drbd1: error receiving
ReportState, l: 4!
Nov 4 11:54:14 gnt7-05 kernel: [321951.383986] block drbd1: asender terminated
Nov 4 11:54:14 gnt7-05 kernel: [321951.388899] block drbd1: Terminating
drbd1_asender
Nov 4 11:54:14 gnt7-05 kernel: [321951.389136] block drbd1: Connection closed
Nov 4 11:54:14 gnt7-05 kernel: [321951.389149] block drbd1: conn(
Disconnecting -> StandAlone )
Nov 4 11:54:14 gnt7-05 kernel: [321951.389171] block drbd1: receiver terminated
Nov 4 11:54:14 gnt7-05 kernel: [321951.389175] block drbd1: Terminating
drbd1_receiver
Nov 4 11:54:10 gnt7-01 kargig: MIGRATION STARTING for migrateme.vm.grnet.gr at
20131104 - 11:50:07
Nov 4 11:54:13 gnt7-01 kernel: [1466520.229933] block drbd2: peer( Secondary
-> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Nov 4 11:54:13 gnt7-01 kernel: [1466520.241809] block drbd2: new current UUID
774CB90EE2E17FA3:6FDF6EA09F4D0049:9C341FC2219C7FED:9C331FC2219C7FED
Nov 4 11:54:13 gnt7-01 kernel: [1466520.264393] block drbd2: asender terminated
Nov 4 11:54:13 gnt7-01 kernel: [1466520.269398] block drbd2: Terminating
drbd2_asender
Nov 4 11:54:13 gnt7-01 kernel: [1466520.269628] block drbd2: Connection closed
Nov 4 11:54:13 gnt7-01 kernel: [1466520.269635] block drbd2: conn( TearDown ->
Unconnected )
Nov 4 11:54:13 gnt7-01 kernel: [1466520.269641] block drbd2: receiver
terminated
Nov 4 11:54:13 gnt7-01 kernel: [1466520.269643] block drbd2: Restarting
drbd2_receiver
Nov 4 11:54:13 gnt7-01 kernel: [1466520.269646] block drbd2: receiver
(re)started
Nov 4 11:54:13 gnt7-01 kernel: [1466520.269651] block drbd2: conn( Unconnected
-> WFConnection )
Nov 4 11:54:13 gnt7-01 kernel: [1466520.376402] block drbd2: conn(
WFConnection -> Disconnecting )
Nov 4 11:54:13 gnt7-01 kernel: [1466520.383342] block drbd2: Discarding
network configuration.
Nov 4 11:54:13 gnt7-01 kernel: [1466520.390233] block drbd2: Connection closed
Nov 4 11:54:13 gnt7-01 kernel: [1466520.395452] block drbd2: conn(
Disconnecting -> StandAlone )
Nov 4 11:54:13 gnt7-01 kernel: [1466520.402486] block drbd2: receiver
terminated
Nov 4 11:54:13 gnt7-01 kernel: [1466520.407775] block drbd2: Terminating
drbd2_receiver
Nov 4 11:54:13 gnt7-01 kernel: [1466520.745331] block drbd2: conn( StandAlone
-> Unconnected )
Nov 4 11:54:13 gnt7-01 kernel: [1466520.751910] block drbd2: Starting receiver
thread (from drbd2_worker [9955])
Nov 4 11:54:13 gnt7-01 kernel: [1466520.760261] block drbd2: receiver
(re)started
Nov 4 11:54:13 gnt7-01 kernel: [1466520.765454] block drbd2: conn( Unconnected
-> WFConnection )
Nov 4 11:54:14 gnt7-01 kernel: [1466521.469005] block drbd2: Handshake
successful: Agreed network protocol version 96
Nov 4 11:54:14 gnt7-01 kernel: [1466521.478277] block drbd2: Peer
authenticated using 16 bytes of 'md5' HMAC
Nov 4 11:54:14 gnt7-01 kernel: [1466521.486308] block drbd2: conn(
WFConnection -> WFReportParams )
Nov 4 11:54:14 gnt7-01 kernel: [1466521.493533] block drbd2: Starting asender
thread (from drbd2_receiver [13958])
Nov 4 11:54:14 gnt7-01 kernel: [1466521.502318] block drbd2:
data-integrity-alg: <not-used>
Nov 4 11:54:14 gnt7-01 kernel: [1466521.508529] block drbd2:
drbd_sync_handshake:
Nov 4 11:54:14 gnt7-01 kernel: [1466521.513762] block drbd2: self
774CB90EE2E17FA3:6FDF6EA09F4D0049:9C341FC2219C7FED:9C331FC2219C7FED bits:0
flags:0
Nov 4 11:54:14 gnt7-01 kernel: [1466521.525533] block drbd2: peer
6FDF6EA09F4D0048:0000000000000000:9C341FC2219C7FEC:9C331FC2219C7FED bits:0
flags:0
Nov 4 11:54:14 gnt7-01 kernel: [1466521.537520] block drbd2: uuid_compare()=1
by rule 70
Nov 4 11:54:14 gnt7-01 kernel: [1466521.543623] block drbd2: peer( Unknown ->
Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
Nov 4 11:54:14 gnt7-01 kernel: [1466521.589203] block drbd2: peer( Secondary
-> Primary )
Nov 4 11:54:14 gnt7-01 kernel: [1466521.621278] block drbd2: helper command:
/bin/true before-resync-source minor-2
Nov 4 11:54:14 gnt7-01 kernel: [1466521.630776] block drbd2: helper command:
/bin/true before-resync-source minor-2 exit code 0 (0x0)
Nov 4 11:54:14 gnt7-01 kernel: [1466521.641214] block drbd2: conn( WFBitMapS
-> SyncSource ) pdsk( Consistent -> Inconsistent )
Nov 4 11:54:14 gnt7-01 kernel: [1466521.650964] block drbd2: conn( SyncSource
-> WFBitMapS ) pdsk( Inconsistent -> Consistent )
Nov 4 11:54:14 gnt7-01 kernel: [1466521.651006] block drbd2: updated sync UUID
774CB90EE2E17FA3:6FE06EA09F4D0049:6FDF6EA09F4D0049:9C341FC2219C7FED
Nov 4 11:54:14 gnt7-01 kernel: [1466521.651417] block drbd2: Began resync as
SyncSource (will sync 0 KB [0 bits set]).
Nov 4 11:54:14 gnt7-01 kernel: [1466521.951672] block drbd2: meta connection
shut down by peer.
Nov 4 11:54:14 gnt7-01 kernel: [1466521.951680] block drbd2: sock was shut
down by peer
Nov 4 11:54:14 gnt7-01 kernel: [1466521.951688] block drbd2: peer( Primary ->
Unknown ) conn( WFBitMapS -> BrokenPipe ) pdsk( Consistent -> DUnknown )
Nov 4 11:54:14 gnt7-01 kernel: [1466521.951697] block drbd2: short read
expecting header on sock: r=0
Nov 4 11:54:14 gnt7-01 kernel: [1466521.983428] block drbd2: asender terminated
Nov 4 11:54:14 gnt7-01 kernel: [1466521.988748] block drbd2: Terminating
drbd2_asender
Nov 4 11:54:14 gnt7-01 kernel: [1466522.009066] block drbd2: bitmap WRITE of
24 pages took 14 jiffies
Nov 4 11:54:14 gnt7-01 kernel: [1466522.016232] block drbd2: 0 KB (0 bits)
marked out-of-sync by on disk bit-map.
Nov 4 11:54:14 gnt7-01 kernel: [1466522.024669] block drbd2: Connection closed
Nov 4 11:54:14 gnt7-01 kernel: [1466522.029589] block drbd2: conn( BrokenPipe
-> Unconnected )
Nov 4 11:54:14 gnt7-01 kernel: [1466522.036286] block drbd2: receiver
terminated
Nov 4 11:54:14 gnt7-01 kernel: [1466522.041388] block drbd2: Restarting
drbd2_receiver
Nov 4 11:54:14 gnt7-01 kernel: [1466522.047104] block drbd2: receiver
(re)started
Nov 4 11:54:14 gnt7-01 kernel: [1466522.052372] block drbd2: conn( Unconnected
-> WFConnection )
Nov 4 11:54:30 gnt7-01 kargig: MIGRATION FINISHED for migrateme.vm.grnet.gr at
20131104 - 11:50:07
Nov 4 11:37:20 gnt7-01 kargig: MIGRATION STARTING for migrateme.vm.grnet.gr at
20131104 - 11:35:06
Nov 4 11:37:23 gnt7-01 kernel: [1465512.367688] block drbd2: peer( Primary ->
Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Nov 4 11:37:23 gnt7-01 kernel: [1465512.395666] block drbd2: asender terminated
Nov 4 11:37:23 gnt7-01 kernel: [1465512.401035] block drbd2: Terminating
drbd2_asender
Nov 4 11:37:23 gnt7-01 kernel: [1465512.401173] block drbd2: Connection closed
Nov 4 11:37:23 gnt7-01 kernel: [1465512.401185] block drbd2: conn(
Disconnecting -> StandAlone )
Nov 4 11:37:23 gnt7-01 kernel: [1465512.401297] block drbd2: receiver
terminated
Nov 4 11:37:23 gnt7-01 kernel: [1465512.401300] block drbd2: Terminating
drbd2_receiver
Nov 4 11:37:24 gnt7-01 kernel: [1465513.124874] block drbd2: conn( StandAlone
-> Unconnected )
Nov 4 11:37:24 gnt7-01 kernel: [1465513.131445] block drbd2: Starting receiver
thread (from drbd2_worker [4581])
Nov 4 11:37:24 gnt7-01 kernel: [1465513.139817] block drbd2: receiver
(re)started
Nov 4 11:37:24 gnt7-01 kernel: [1465513.145037] block drbd2: conn( Unconnected
-> WFConnection )
Nov 4 11:37:24 gnt7-01 kernel: [1465513.749700] block drbd2: Handshake
successful: Agreed network protocol version 96
Nov 4 11:37:24 gnt7-01 kernel: [1465513.758828] block drbd2: Peer
authenticated using 16 bytes of 'md5' HMAC
Nov 4 11:37:24 gnt7-01 kernel: [1465513.766721] block drbd2: conn(
WFConnection -> WFReportParams )
Nov 4 11:37:24 gnt7-01 kernel: [1465513.773903] block drbd2: Starting asender
thread (from drbd2_receiver [9400])
Nov 4 11:37:24 gnt7-01 kernel: [1465513.782583] block drbd2:
data-integrity-alg: <not-used>
Nov 4 11:37:24 gnt7-01 kernel: [1465513.788813] block drbd2:
drbd_sync_handshake:
Nov 4 11:37:24 gnt7-01 kernel: [1465513.794055] block drbd2: self
12DD91AD18639432:0000000000000000:4BB4240B4E6917E0:4BB3240B4E6917E1 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465513.805843] block drbd2: peer
F352DE775D28F401:12DD91AD18639433:4BB4240B4E6917E1:4BB3240B4E6917E1 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465513.817707] block drbd2: uuid_compare()=-1
by rule 50
Nov 4 11:37:25 gnt7-01 kernel: [1465513.823823] block drbd2: peer( Unknown ->
Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated )
pdsk( DUnknown -> UpToDate )
Nov 4 11:37:25 gnt7-01 kernel: [1465513.848313] block drbd2: conn( WFBitMapT
-> WFSyncUUID )
Nov 4 11:37:25 gnt7-01 kernel: [1465513.903940] block drbd2: updated sync uuid
12DE91AD18639432:0000000000000000:4BB4240B4E6917E0:4BB3240B4E6917E1
Nov 4 11:37:25 gnt7-01 kernel: [1465513.919322] block drbd2: helper command:
/bin/true before-resync-target minor-2
Nov 4 11:37:25 gnt7-01 kernel: [1465513.928772] block drbd2: helper command:
/bin/true before-resync-target minor-2 exit code 0 (0x0)
Nov 4 11:37:25 gnt7-01 kernel: [1465513.939070] block drbd2: conn( WFSyncUUID
-> SyncTarget ) disk( Outdated -> Inconsistent )
Nov 4 11:37:25 gnt7-01 kernel: [1465513.948854] block drbd2: Began resync as
SyncTarget (will sync 0 KB [0 bits set]).
Nov 4 11:37:25 gnt7-01 kernel: [1465513.958712] block drbd2: Resync done
(total 1 sec; paused 0 sec; 0 K/sec)
Nov 4 11:37:25 gnt7-01 kernel: [1465513.958960] block drbd2: unexpected cstate
(SyncTarget) in receive_bitmap
Nov 4 11:37:25 gnt7-01 kernel: [1465513.974437] block drbd2: updated UUIDs
F352DE775D28F400:0000000000000000:12DE91AD18639432:12DD91AD18639433
Nov 4 11:37:25 gnt7-01 kernel: [1465513.985537] block drbd2: conn( SyncTarget
-> Connected ) disk( Inconsistent -> UpToDate )
Nov 4 11:37:25 gnt7-01 kernel: [1465513.995357] block drbd2:
drbd_sync_handshake:
Nov 4 11:37:25 gnt7-01 kernel: [1465514.000881] block drbd2: self
F352DE775D28F400:0000000000000000:12DE91AD18639432:12DD91AD18639433 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465514.000965] block drbd2: helper command:
/bin/true after-resync-target minor-2
Nov 4 11:37:25 gnt7-01 kernel: [1465514.001830] block drbd2: helper command:
/bin/true after-resync-target minor-2 exit code 0 (0x0)
Nov 4 11:37:25 gnt7-01 kernel: [1465514.031579] block drbd2: peer
F352DE775D28F400:0000000000000000:12DE91AD18639432:12DD91AD18639433 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465514.031710] block drbd2: bitmap WRITE of
24 pages took 7 jiffies
Nov 4 11:37:25 gnt7-01 kernel: [1465514.031733] block drbd2: 0 KB (0 bits)
marked out-of-sync by on disk bit-map.
Nov 4 11:37:25 gnt7-01 kernel: [1465514.059065] block drbd2: uuid_compare()=0
by rule 40
Nov 4 11:37:25 gnt7-01 kernel: [1465514.059121] block drbd2: role( Secondary
-> Primary )
Nov 4 11:37:25 gnt7-01 kernel: [1465514.071416] block drbd2:
drbd_sync_handshake:
Nov 4 11:37:25 gnt7-01 kernel: [1465514.076858] block drbd2: self
F352DE775D28F401:0000000000000000:12DE91AD18639432:12DD91AD18639433 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465514.088967] block drbd2: peer
F352DE775D28F400:0000000000000000:12DE91AD18639432:12DD91AD18639433 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465514.101158] block drbd2: uuid_compare()=0
by rule 40
Nov 4 11:37:25 gnt7-01 kernel: [1465514.107455] block drbd2:
drbd_sync_handshake:
Nov 4 11:37:25 gnt7-01 kernel: [1465514.112686] block drbd2: self
F352DE775D28F401:0000000000000000:12DE91AD18639432:12DD91AD18639433 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465514.124443] block drbd2: peer
F352DE775D28F401:12DE91AD18639433:12DD91AD18639433:4BB4240B4E6917E1 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465514.136194] block drbd2: was SyncTarget,
peer missed the resync finished event, corrected peer:
Nov 4 11:37:25 gnt7-01 kernel: [1465514.146282] block drbd2: peer
F352DE775D28F401:0000000000000000:12DE91AD18639433:12DD91AD18639433 bits:0
flags:0
Nov 4 11:37:25 gnt7-01 kernel: [1465514.158018] block drbd2: uuid_compare()=-1
by rule 35
Nov 4 11:37:25 gnt7-01 kernel: [1465514.164009] block drbd2: I shall become
SyncTarget, but I am primary!
Nov 4 11:37:25 gnt7-01 kernel: [1465514.171562] block drbd2: ASSERT( os.conn
== C_WF_REPORT_PARAMS ) in
/build/linux-rrsxby/linux-3.2.51/drivers/block/drbd/drbd_receiver.c:3245
Nov 4 11:37:25 gnt7-01 kernel: [1465514.186009] block drbd2: peer( Primary ->
Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
Nov 4 11:37:25 gnt7-01 kernel: [1465514.198134] block drbd2: error receiving
ReportState, l: 4!
Nov 4 11:37:25 gnt7-01 kernel: [1465514.198154] block drbd2: new current UUID
CA140684A1D3D437:F352DE775D28F401:12DE91AD18639432:12DD91AD18639433
Nov 4 11:37:25 gnt7-01 kernel: [1465514.216187] block drbd2: asender terminated
Nov 4 11:37:25 gnt7-01 kernel: [1465514.221189] block drbd2: Terminating
drbd2_asender
Nov 4 11:37:25 gnt7-01 kernel: [1465514.221542] block drbd2: Connection closed
Nov 4 11:37:25 gnt7-01 kernel: [1465514.221559] block drbd2: conn(
Disconnecting -> StandAlone )
Nov 4 11:37:25 gnt7-01 kernel: [1465514.221602] block drbd2: receiver
terminated
Nov 4 11:37:25 gnt7-01 kernel: [1465514.221605] block drbd2: Terminating
drbd2_receiver
Nov 4 11:37:41 gnt7-01 kargig: MIGRATION FINISHED for migrateme.vm.grnet.gr at
20131104 - 11:35:06