On Tue, Apr 23, 2013 at 06:17:31AM -0500, Gerald Brandt wrote:
> Hi,
>
> I have a DRBD setup with primary and secondary. DRBD0 is a RAID6
> array, and DRBD1 is a RAID1 array. A couple of times a day, DRBD1
> forces a re-sync due to PingAck not arriving in time, yet DRBD0 never
> has that issue. Both sync over the same dedicated GigE link. DRBD0
> is in use, while DRBD1 isn't yet. Running stock Ubuntu 12.10.
>
> Is there anything I can do about this?

If drbd1 is not yet in use, down it :)

I have seen similar behavior when some "intelligent" network component
or firewall closed "idle" TCP connections, for some "definition" of idle.

> PRIMARY:
>
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.042587] block drbd1: sock was shut down by peer
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.042597] block drbd1: peer( Secondary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.042633] block drbd1: short read expecting header on sock: r=0
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.042645] block drbd1: meta connection shut down by peer.
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.042650] block drbd1: new current UUID 7E8F445E07B7C819:7308FAB30BA7338D:3D2A5512F8677B9B:3D295512F8677B9B
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.042673] block drbd1: asender terminated
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.042675] block drbd1: Terminating drbd1_asender
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.098710] block drbd1: Connection closed
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.098714] block drbd1: conn( BrokenPipe -> Unconnected )
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.098718] block drbd1: receiver terminated
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.098719] block drbd1: Restarting drbd1_receiver
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.098721] block drbd1: receiver (re)started
> Apr 23 03:56:22 iscsi-filer-1 kernel: [927115.098724] block drbd1: conn( Unconnected -> WFConnection )
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937526] block drbd1: Handshake successful: Agreed network protocol version 96
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937698] block drbd1: Peer authenticated using 20 bytes of 'sha1' HMAC
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937716] block drbd1: conn( WFConnection -> WFReportParams )
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937720] block drbd1: Starting asender thread (from drbd1_receiver [1620])
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937747] block drbd1: data-integrity-alg: <not-used>
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937758] block drbd1: drbd_sync_handshake:
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937760] block drbd1: self 7E8F445E07B7C819:7308FAB30BA7338D:3D2A5512F8677B9B:3D295512F8677B9B bits:0 flags:0
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937763] block drbd1: peer 7308FAB30BA7338C:0000000000000000:3D2A5512F8677B9A:3D295512F8677B9B bits:0 flags:0
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937766] block drbd1: uuid_compare()=1 by rule 70
> Apr 23 03:56:23 iscsi-filer-1 kernel: [927115.937770] block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
> Apr 23 03:56:24 iscsi-filer-1 kernel: [927117.010807] block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1
> Apr 23 03:56:24 iscsi-filer-1 kernel: [927117.065171] block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1 exit code 0 (0x0)
> Apr 23 03:56:24 iscsi-filer-1 kernel: [927117.065179] block drbd1: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
> Apr 23 03:56:24 iscsi-filer-1 kernel: [927117.065186] block drbd1: Began resync as SyncSource (will sync 0 KB [0 bits set]).
> Apr 23 03:56:24 iscsi-filer-1 kernel: [927117.065191] block drbd1: updated sync UUID 7E8F445E07B7C819:7309FAB30BA7338D:7308FAB30BA7338D:3D2A5512F8677B9B
> Apr 23 03:56:25 iscsi-filer-1 kernel: [927117.521667] block drbd1: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
> Apr 23 03:56:25 iscsi-filer-1 kernel: [927117.521672] block drbd1: updated UUIDs 7E8F445E07B7C819:0000000000000000:7309FAB30BA7338D:7308FAB30BA7338D
> Apr 23 03:56:25 iscsi-filer-1 kernel: [927117.521676] block drbd1: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
> Apr 23 03:56:25 iscsi-filer-1 kernel: [927118.433518] block drbd1: bitmap WRITE of 14904 pages took 228 jiffies
> Apr 23 03:56:26 iscsi-filer-1 kernel: [927118.486667] block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
>
>
> SECONDARY:
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.461499] block drbd1: PingAck did not arrive in time.
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.461872] block drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.502525] block drbd1: asender terminated
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.502533] block drbd1: Terminating drbd1_asender
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.502588] block drbd1: Connection closed
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.502593] block drbd1: conn( NetworkFailure -> Unconnected )
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.502598] block drbd1: receiver terminated
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.502600] block drbd1: Restarting drbd1_receiver
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.502601] block drbd1: receiver (re)started
> Apr 23 03:56:22 iscsi-filer-2 kernel: [844879.502604] block drbd1: conn( Unconnected -> WFConnection )
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397008] block drbd1: Handshake successful: Agreed network protocol version 96
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397249] block drbd1: Peer authenticated using 20 bytes of 'sha1' HMAC
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397264] block drbd1: conn( WFConnection -> WFReportParams )
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397268] block drbd1: Starting asender thread (from drbd1_receiver [1670])
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397326] block drbd1: data-integrity-alg: <not-used>
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397376] block drbd1: drbd_sync_handshake:
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397378] block drbd1: self 7308FAB30BA7338C:0000000000000000:3D2A5512F8677B9A:3D295512F8677B9B bits:0 flags:0
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397380] block drbd1: peer 7E8F445E07B7C819:7308FAB30BA7338D:3D2A5512F8677B9B:3D295512F8677B9B bits:0 flags:0
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397382] block drbd1: uuid_compare()=-1 by rule 50
> Apr 23 03:56:23 iscsi-filer-2 kernel: [844880.397386] block drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.417113] block drbd1: conn( WFBitMapT -> WFSyncUUID )
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.577678] block drbd1: updated sync uuid 7309FAB30BA7338C:0000000000000000:3D2A5512F8677B9A:3D295512F8677B9B
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.606019] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.658326] block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.658332] block drbd1: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.658338] block drbd1: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.660583] block drbd1: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.660588] block drbd1: updated UUIDs 7E8F445E07B7C818:0000000000000000:7309FAB30BA7338C:7308FAB30BA7338D
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.660592] block drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.672649] block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1
> Apr 23 03:56:24 iscsi-filer-2 kernel: [844881.759827] block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)
> Apr 23 03:56:25 iscsi-filer-2 kernel: [844882.698240] block drbd1: bitmap WRITE of 14904 pages took 235 jiffies
> Apr 23 03:56:25 iscsi-filer-2 kernel: [844882.760933] block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> _______________________________________________
> drbd-user mailing list
> drbd-user@lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed