Hi all,

I'm trying to join a new node to an existing 2-node cluster, but something seems to be broken...

I'm using DRBD9 on Ubuntu 16.04 LTS:
ii  drbd-dkms          9.0.9-1ppa1~xenial1    all    RAID 1 over TCP/IP for Linux module source
ii  drbd-utils         9.1.1-1ppa1~xenial1    amd64  RAID 1 over TCP/IP for Linux (user utilities)
ii  python-drbdmanage  0.99.10-1ppa1~xenial1  all    DRBD distributed resource management utility

The existing 2-node cluster:
- Node hv2 / HW-Raid5 = 5x 1TB (512n) / LVM-Crypt / Drbd uses LvmThinLv
- Node hv3 / HW-Raid5 = 3x 2TB (512n) / LVM-Crypt / Drbd uses LvmThinLv

The new one:
- Node hv1 / 1x 6TB (4Kn) / LVM-Crypt / Drbd uses LvmThinLv
(actually tried with SW-Raid1 and had the same results before)
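
For reference, this is roughly how I compare the logical/physical sector sizes
along the stack (raw disk -> dm-crypt -> LVM); /dev/sda and the crypt mapping
name below are just placeholders for my layout:

root@hv1 ~ # lsblk -o NAME,TYPE,LOG-SEC,PHY-SEC /dev/sda        # whole stack below the disk
root@hv1 ~ # blockdev --getss --getpbsz /dev/mapper/crypt_hv1   # dm-crypt mapping (placeholder name)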


State before joining:

root@hv2 ~ # drbd-overview
Resources:
  0:.drbdctrl/0    Connected(2*) Second/Primar UpToDa/UpToDa
  1:.drbdctrl/1    Connected(2*) Second/Primar UpToDa/UpToDa
100:vm_dc2/0       Connected(2*) Primar/Second UpToDa/UpToDa *dc2            sda scsi
101:vm_dc1/0       Connected(2*) Primar/Second UpToDa/UpToDa *dc1            sda scsi
... several more resources in the same state


Then I add hv1:

root@hv3 ~ # drbdmanage add-node hv1 192.168.42.2

In dmesg of hv1 I see:
[ 1186.205695] drbd .drbdctrl/0 drbd0: logical block size of local backend does not match (drbd:512, backend:4096); was this a late attach?
[ 1186.205702] drbd .drbdctrl/0 drbd0: logical block sizes do not match (me:512, peer:512); this may cause problems.
In dmesg of hv2 and hv3 I see:
[ 7765.951161] drbd .drbdctrl/0 drbd0: logical block sizes do not match (me:512, peer:4096); this may cause problems.
[ 7765.951165] drbd .drbdctrl/0 drbd0: current Primary must NOT adjust logical block size (512 -> 4096); hope for the best.
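
The sizes those messages compare can also be read straight from sysfs; here for
minor 0 (.drbdctrl/0), where dm-5 is just a placeholder for whatever lsblk shows
as drbd0's backing LV:

root@hv1 ~ # cat /sys/block/drbd0/queue/logical_block_size
root@hv2 ~ # cat /sys/block/drbd0/queue/logical_block_size
root@hv1 ~ # cat /sys/block/dm-5/queue/logical_block_size   # backing LV of drbd0 (placeholder name)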


State still looks good:

root@hv1 ~ # drbd-overview
 0:.drbdctrl/0  Connected(3*) Seco(hv1,hv2)/Prim(hv3) UpTo(hv1)/UpTo(hv2,hv3)
 1:.drbdctrl/1  Connected(3*) Seco(hv2,hv1)/Prim(hv3) UpTo(hv1)/UpTo(hv2,hv3)


Then I assign a resource:

root@hv3:~# drbdmanage assign-resource vm_dc2 hv1

and hv1 ends up diskless :-( In dmesg of hv1 I see:

[ 4564.041711] drbd vm_dc2/0 drbd100 hv2: helper command: /sbin/drbdadm before-resync-target
[ 4564.044700] drbd vm_dc2/0 drbd100 hv2: helper command: /sbin/drbdadm before-resync-target exit code 0 (0x0)
[ 4564.044759] drbd vm_dc2/0 drbd100 hv2: repl( WFBitMapT -> SyncTarget )
[ 4564.044765] drbd vm_dc2/0 drbd100 hv3: resync-susp( no -> connection dependency )
[ 4564.044978] drbd vm_dc2/0 drbd100 hv2: Began resync as SyncTarget (will sync 16777216 KB [4194304 bits set]).
[ 4564.048031] drbd vm_dc2/0 drbd100 hv3: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
[ 4564.050267] drbd vm_dc2/0 drbd100 hv3: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 23(1), total 23; compression: 100.0%
[ 4564.050285] drbd vm_dc2/0 drbd100 hv3: helper command: /sbin/drbdadm before-resync-target
[ 4564.053016] drbd vm_dc2/0 drbd100 hv3: helper command: /sbin/drbdadm before-resync-target exit code 0 (0x0)
[ 4564.053074] drbd vm_dc2/0 drbd100 hv3: repl( WFBitMapT -> PausedSyncT )
[ 4564.053286] drbd vm_dc2/0 drbd100 hv3: Began resync as PausedSyncT (will sync 16777216 KB [4194304 bits set]).
[ 4636.342184] sd 4:0:0:0: [sda] Bad block number requested
[ 4636.346976] drbd vm_dc2/0 drbd100: write: error=10 s=2050s
[ 4636.347015] drbd vm_dc2/0 drbd100: disk( Inconsistent -> Failed )
[ 4636.347021] drbd vm_dc2/0 drbd100 hv2: repl( SyncTarget -> Established )
[ 4636.347026] drbd vm_dc2/0 drbd100 hv3: repl( PausedSyncT -> Established ) resync-susp( connection dependency -> no )
[ 4636.347063] drbd vm_dc2/0 drbd100: Local IO failed in drbd_endio_write_sec_final. Detaching...
[ 4636.354204] drbd vm_dc2/0 drbd100: disk( Failed -> Diskless )
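
My naive guess, in case it helps: if the "s=2050s" in the write error is a
512-byte sector number, the failed write starts 1024 bytes past a 4 KiB
boundary, which the 4Kn backend would refuse - but maybe I'm misreading the
log:

root@hv1 ~ # echo $(( 2050 * 512 ))            # byte offset of the failed write
1049600
root@hv1 ~ # echo $(( (2050 * 512) % 4096 ))   # 0 would mean 4 KiB aligned
1024

So it looks to me like the resync I/O is still issued with 512-byte
granularity while the new backend only accepts 4096-byte-aligned writes.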


Any ideas what is going wrong?

Thanks and regards
Sebastian Hasait
