Re: [DRBD-user] Configuring a two-node cluster with redundant nics on each node?
On 2018-10-18 04:07, Bryan K. Walton wrote:
> Hi,
>
> I'm trying to configure a two-node cluster, where each node has
> dedicated redundant NICs:
>
> storage node 1 has two private IPs: 10.40.1.3 and 10.40.2.2
> storage node 2 has two private IPs: 10.40.1.2 and 10.40.2.3
>
> I'd like to configure the resource so that the nodes have two possible
> paths to the other node. I've tried this:
>
> resource r0 {
>   on storage1 {
>     device    /dev/drbd1;
>     disk      /dev/mapper/centos_storage1-storage;
>     address   10.40.2.2:7789;
>     address   10.40.1.3:7789;
>     meta-disk internal;
>   }
>   on storage2 {
>     device    /dev/drbd1;
>     disk      /dev/mapper/centos_storage2-storage;
>     address   10.40.1.2:7789;
>     address   10.40.2.3:7789;
>     meta-disk internal;
>   }
> }
>
> But this doesn't work. When I try to create the device metadata, I get
> the following error:
>
> drbd.d/r0.res:6: conflicting use of address statement 'r0:storage1:address' ...
> drbd.d/r0.res:5: address statement 'r0:storage1:address' first used here.
>
> Clearly, my configuration won't work. Is there a way to accomplish what
> I'd like to accomplish?

Why aren't you using Ethernet bonding?

--
Adi Pircalabu

___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user
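Following Adi's suggestion: on CentOS 7 an active-backup bond can be created with NetworkManager roughly as below. This is only a sketch; the interface names eth1/eth2 and the 10.40.1.3/24 address are assumptions (substitute the actual devices and the node's own private IP):

```sh
# Create an active-backup bond and enslave the two storage NICs.
nmcli con add type bond ifname bond0 con-name bond0 \
      bond.options "mode=active-backup,miimon=100"
nmcli con add type bond-slave ifname eth1 master bond0
nmcli con add type bond-slave ifname eth2 master bond0

# Give the bond one of the private IPs and bring it up.
nmcli con mod bond0 ipv4.method manual ipv4.addresses 10.40.1.3/24
nmcli con up bond0
```

With the bond in place, DRBD needs only a single `address` statement per host, pointing at the bond's IP.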
[DRBD-user] Configuring a two-node cluster with redundant nics on each node?
Hi,

I'm trying to configure a two-node cluster, where each node has dedicated redundant NICs:

storage node 1 has two private IPs: 10.40.1.3 and 10.40.2.2
storage node 2 has two private IPs: 10.40.1.2 and 10.40.2.3

I'd like to configure the resource so that the nodes have two possible paths to the other node. I've tried this:

resource r0 {
  on storage1 {
    device    /dev/drbd1;
    disk      /dev/mapper/centos_storage1-storage;
    address   10.40.2.2:7789;
    address   10.40.1.3:7789;
    meta-disk internal;
  }
  on storage2 {
    device    /dev/drbd1;
    disk      /dev/mapper/centos_storage2-storage;
    address   10.40.1.2:7789;
    address   10.40.2.3:7789;
    meta-disk internal;
  }
}

But this doesn't work. When I try to create the device metadata, I get the following error:

drbd.d/r0.res:6: conflicting use of address statement 'r0:storage1:address' ...
drbd.d/r0.res:5: address statement 'r0:storage1:address' first used here.

Clearly, my configuration won't work. Is there a way to accomplish what I'd like to accomplish?

Thanks,
Bryan Walton
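For what it's worth, DRBD 8.4 accepts only one `address` statement per `on` section, which is exactly what the parser error complains about. DRBD 9 added multi-path connections; a sketch of what that looks like (syntax per the drbd.conf-9.0 man page; node IDs are assumptions, verify against your installed version):

```
resource r0 {
  device    /dev/drbd1;
  meta-disk internal;
  on storage1 { disk /dev/mapper/centos_storage1-storage; node-id 0; }
  on storage2 { disk /dev/mapper/centos_storage2-storage; node-id 1; }
  connection {
    path {
      host storage1 address 10.40.1.3:7789;
      host storage2 address 10.40.1.2:7789;
    }
    path {
      host storage1 address 10.40.2.2:7789;
      host storage2 address 10.40.2.3:7789;
    }
  }
}
```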
Re: [DRBD-user] drbdadm down failed (-12) - blocked by drbd_submit
It turned out that the NFS daemon was blocking DRBD. Thanks, the comment about the 'drbd' kernel processes was helpful.

BTW, the documentation (man pages) for DRBD 9.0 is still from 8.4 and some options are no longer there.

On Thu, Oct 11, 2018 at 9:48 AM, Radoslaw Garbacz <radoslaw.garb...@xtremedatainc.com> wrote:
> Thanks, will take a closer look at this.
>
> On Thu, Oct 11, 2018 at 3:47 AM, Lars Ellenberg wrote:
>> On Tue, Oct 02, 2018 at 12:56:38PM -0500, Radoslaw Garbacz wrote:
>>> Hi,
>>>
>>> I have a problem which (from what I found) has been discussed, however
>>> not in the particular case I experienced, so I would be grateful for
>>> any suggestions on how to deal with it.
>>>
>>> I.
>>> 1. I get an error when demoting the DRBD resource:
>>>    * drbdadm down data0
>>>
>>>    data0: State change failed: (-12) Device is held open by someone
>>>    additional info from kernel:
>>>    failed to demote
>>>    Command 'drbdsetup-84 down data0' terminated with exit code 11
>>>
>>> 2. The device is not mounted and not used by any LVM, so based on some
>>>    online discussions I checked the blocking process and it is
>>>    "drbd0_submit":
>>>
>>>    * lsof | grep drbd0
>>>    drbd0_sub 16687 root cwd DIR 202,1 251 64 /
>>
>> No, it is not.
>>
>> drbd*submitter (only 16 bytes of that name actually make it into the
>> comm part of the task struct, which is what ps or lsof or the like can
>> display) are kernel threads, and part of DRBD operations.
>> They are certainly NOT "holding it open".
>> They are a required part of its existence.
>>
>> "Holding it open" when you think you already unmounted it
>> is typically either some forgotten device mapper thingy
>> (semi-automatically created by kpartx e.g.),
>> or some racy "udev triggered probe".
>>
>> In the latter case, if you retry after a couple of seconds,
>> demoting should work.
>>
>>> Is there a good way to deal with this case? I.e., is some DRBD step
>>> missing which leaves the process behind, or is killing the process
>>> the right way?
>>
>> Again, that "process" has nothing to do with DRBD being "held open",
>> but is a kernel thread that is part of the existence of that DRBD volume.
>>
>> --
>> : Lars Ellenberg
>> : LINBIT | Keeping the Digital World Running
>> : DRBD -- Heartbeat -- Corosync -- Pacemaker
>>
>> DRBD® and LINBIT® are registered trademarks of LINBIT
>> __
>> please don't Cc me, but send to list -- I'm subscribed

--
Best Regards,

Radoslaw Garbacz
XtremeData Incorporated
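To check Lars's point in practice, one can verify that "drbd0_submit" is a kernel thread and look for the real opener instead. A rough sketch (PID 16687 comes from the lsof output above; `fuser` and `dmsetup` availability is assumed):

```sh
# Kernel threads are children of kthreadd (PID 2); a PPID of 2 means
# this is not a userspace process holding the device open.
ps -o ppid=,comm= -p 16687

# Look for actual openers of the DRBD device instead:
fuser -v /dev/drbd0

# And check for leftover device-mapper mappings stacked on top of it:
dmsetup ls --tree
```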
Re: [DRBD-user] split brain on both nodes
On 2018-10-17 5:35 a.m., Adam Weremczuk wrote:
> Hi all,
>
> Yesterday I rebooted both nodes a couple of times (replacing BBU RAID
> batteries) and ended up with:
>
> drbd0: Split-Brain detected but unresolved, dropping connection!

Fencing prevents this.

> on both.
>
> node1: drbd-overview
> 0:r0/0 StandAlone Primary/Unknown UpToDate/DUnknown /srv/test1 ext4 3.6T 75G 3.4T 3% /
>
> node2: drbd-overview
> 0:r0/0 StandAlone Secondary/Unknown UpToDate/DUnknown
>
> I understand there is a good chance (but not absolute guarantee) that
> node1 holds consistent and up to date data.
>
> Q1:
>
> Is it reasonably possible to mount /dev/drbd0 (/dev/sdb1) on node2 in
> read only mode?
>
> I would like to examine the data before discarding and syncing
> everything from node1.

Yes. You can also promote node 2 to examine it as well.

> drbdadm disconnect all
> drbdadm -- --discard-my-data connect all

Discarding the data will trigger a resync and resolve the split-brain, but of course, any changes on the discarded node will be lost.

> Q2:
>
> Will the above completely purge all data on node2 or just drbd metadata?
>
> I.e. will all 75G have to be fully copied block by block or a lot less?

It will do a full resync.

> I'm concerned about time and impact on performance when it comes to
> terabytes of data.
>
> Regards,
> Adam

The resync (on 8.4) adapts the resync rate to minimize impact on applications using the storage. As it slows itself down to "stay out of the way", the resync time increases of course. You won't have redundancy until the resync completes.

--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
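For completeness, the manual split-brain recovery described in the DRBD 8.4 User's Guide amounts to roughly the following (a sketch; the resource name r0 is taken from the drbd-overview output above, and each step must run on the node indicated):

```sh
# On node2, the split-brain "victim" whose local changes will be discarded:
drbdadm disconnect r0
drbdadm secondary r0
drbdadm connect --discard-my-data r0

# On node1, the survivor (only needed while it reports StandAlone):
drbdadm connect r0
```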
Re: [DRBD-user] slow sync speed
Just a quick note...

> You are correct, it shouldn't be required (v8.9.10) and I was surprised
> with that too.

The DRBD documentation states:

"When multiple DRBD resources share a single replication/synchronization network, synchronization with a fixed rate may not be an optimal approach. So, in DRBD 8.4.0 the variable-rate synchronization was enabled by default."

..and..

"In a few, very restricted situations[4], it might make sense to just use some fixed synchronization rate. In this case, first of all you need to turn the dynamic sync rate controller off, by using c-plan-ahead 0;."

Observing your configuration, it looks like you added that option since the first time, hence no surprises here: you explicitly decided to disable variable-rate sync... :)
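In 8.4 syntax the two quoted statements combine to something like the following disk section. This is only a sketch; `resync-rate` is the disk-section replacement for the deprecated `syncer { rate ...; }`, and it is only honored once the controller is disabled:

```
disk {
  c-plan-ahead 0;     # turn the dynamic sync-rate controller off
  resync-rate 150M;   # fixed synchronization rate
}
```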
[DRBD-user] split brain on both nodes
Hi all,

Yesterday I rebooted both nodes a couple of times (replacing BBU RAID batteries) and ended up with:

drbd0: Split-Brain detected but unresolved, dropping connection!

on both.

node1: drbd-overview
0:r0/0 StandAlone Primary/Unknown UpToDate/DUnknown /srv/test1 ext4 3.6T 75G 3.4T 3% /

node2: drbd-overview
0:r0/0 StandAlone Secondary/Unknown UpToDate/DUnknown

I understand there is a good chance (but not absolute guarantee) that node1 holds consistent and up to date data.

Q1:

Is it reasonably possible to mount /dev/drbd0 (/dev/sdb1) on node2 in read only mode?

I would like to examine the data before discarding and syncing everything from node1.

drbdadm disconnect all
drbdadm -- --discard-my-data connect all

Q2:

Will the above completely purge all data on node2 or just drbd metadata?

I.e. will all 75G have to be fully copied block by block or a lot less?

I'm concerned about time and impact on performance when it comes to terabytes of data.

Regards,
Adam
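Regarding Q1: a Secondary cannot be mounted directly, but while the nodes are disconnected (StandAlone) node2 can be promoted and inspected read-only, roughly like this (a sketch; the mount point /mnt is an assumption):

```sh
# On node2 (currently Secondary, StandAlone):
drbdadm primary r0
mount -o ro /dev/drbd0 /mnt
# ... examine the data ...
umount /mnt
drbdadm secondary r0
```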
Re: [DRBD-user] slow sync speed
You are correct, it shouldn't be required (v8.9.10) and I was surprised with that too.

Further evidence of the option being honored is "want: 150,000 k/sec", which I sometimes (not always) see in /proc/drbd.

On 17/10/18 10:17, Oleksiy Evin wrote:
> If I'm not wrong, the "syncer" section has been deprecated somewhere
> around the 8.4.0 DRBD version. Based on the logs you provided, the
> version you use is 8.4.10, so I don't think that should have any speed
> impact. But I'm glad you've got it resolved.
>
> //OE
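The "want:" figure can be pulled out of /proc/drbd with a one-liner, illustrated here against a hypothetical sample (on a live node, replace the sample variable with the output of `cat /proc/drbd`):

```shell
# Hypothetical excerpt of /proc/drbd during a resync (sample text only):
sample='0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
	finish: 0:12:37 speed: 148,480 (120,380) want: 150,000 K/sec'

# Extract the requested ("want") resync rate:
want=$(printf '%s\n' "$sample" | grep -o 'want: [0-9,]*' | head -n1)
echo "$want"
# prints: want: 150,000
```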
Re: [DRBD-user] slow sync speed
If I'm not wrong, the "syncer" section has been deprecated somewhere around the 8.4.0 DRBD version. Based on the logs you provided, the version you use is 8.4.10, so I don't think that should have any speed impact. But I'm glad you've got it resolved.

//OE

-----Original Message-----
From: Adam Weremczuk
To: Robert Altnoeder
Cc: drbd-user@lists.linbit.com
Subject: Re: [DRBD-user] slow sync speed
Date: Wed, 17 Oct 2018 10:05:10 +0100

"Max-buffers 8k" appears to be the sweet spot for me. I'm now getting 145-150 MB/s transfer rates between nodes, which I'm happy with. The biggest problem was I didn't have a "syncer" section defined at all.

Currently my fully working and behaving config looks like below:

global {
  usage-count no;
}
common {
  protocol C;
}
resource r0 {
  disk {
    on-io-error detach;
    no-disk-flushes;
    no-disk-barrier;
    c-plan-ahead 0;
  }
  net {
    max-buffers 8k;
  }
  syncer {
    rate 150M;
    al-extents 6400;
  }
  on lion {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.200.1:7788;
    meta-disk internal;
  }
  on tiger {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.200.2:7788;
    meta-disk internal;
  }
}

On 11/10/18 15:06, Robert Altnoeder wrote:
> On 10/11/2018 03:56 PM, Oleksiy Evin wrote:
>> Try to remove the following:
>>
>> c-fill-target 24M;
>> c-min-rate 80M;
>> c-max-rate 720M;
>> sndbuf-size 1024k;
>> rcvbuf-size 2048k;
>>
>> Then gradually increase max-buffers from 4K to 12K, checking its impact
>> to the sync speed. Make sure you have the same config on both nodes
>> and apply the changes with "drbdadm adjust all" on both nodes too.
Re: [DRBD-user] slow sync speed
"Max-buffers 8k" appears to be the sweet spot for me. I'm now getting 145-150 MB/s transfer rates between nodes, which I'm happy with. The biggest problem was I didn't have a "syncer" section defined at all.

Currently my fully working and behaving config looks like below:

global {
  usage-count no;
}
common {
  protocol C;
}
resource r0 {
  disk {
    on-io-error detach;
    no-disk-flushes;
    no-disk-barrier;
    c-plan-ahead 0;
  }
  net {
    max-buffers 8k;
  }
  syncer {
    rate 150M;
    al-extents 6400;
  }
  on lion {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.200.1:7788;
    meta-disk internal;
  }
  on tiger {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.200.2:7788;
    meta-disk internal;
  }
}

On 11/10/18 15:06, Robert Altnoeder wrote:
> On 10/11/2018 03:56 PM, Oleksiy Evin wrote:
>> Try to remove the following:
>>
>> c-fill-target 24M;
>> c-min-rate 80M;
>> c-max-rate 720M;
>> sndbuf-size 1024k;
>> rcvbuf-size 2048k;
>>
>> Then gradually increase max-buffers from 4K to 12K, checking its impact
>> to the sync speed. Make sure you have the same config on both nodes
>> and apply the changes with "drbdadm adjust all" on both nodes too.
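After editing the config, the changes can be applied and verified without restarting DRBD, roughly as follows (a sketch; the resource name r0 comes from the config above, and both commands should run on each node):

```sh
# Apply configuration changes on-line:
drbdadm adjust r0

# Verify what the kernel is actually using:
drbdsetup show r0
cat /proc/drbd
```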
[DRBD-user] drbd-9.0.16rc1
Hi,

the list is shorter than with the last releases. I think this is good news.

What really made us release now is fixing the regression introduced with 9.0.15. It was probably not triggered by many parties, because you can only trigger it if you have requests in flight at exactly the moment a timer comes by to check whether the network timeout has expired.

The distributed connect loop was never seen in the wild; maybe only our test suite ever reproduced it.

The fixes to the quorum code ensure that recovery works as expected after a primary node lost quorum.

Please help testing! -- We will release in one week if nobody comes up with "interesting" behavior. We will use the time to write more test cases for our test suite.

9.0.16-0rc1 (api:genl2/proto:86-114/transport:14)
 * Fix regression (introduced with 9.0.15) in handling request timeouts;
   all pending requests were always considered overdue when the timer
   function was executed; this led to false positives in detecting timeouts
 * Fix a possible distributed loop when establishing a connection
 * Fix a corner case in case a resync "overtakes" another one
 * Fix clearing of the PRIMARY_LOST_QUORUM flag
 * Check peers (to ensure quorum is not lost) before generating a new
   current UUID after losing a node
 * In case the locally configured address of a connection is not available,
   keep on retrying until it comes back

http://www.linbit.com/downloads/drbd/9.0/drbd-9.0.16-0rc1.tar.gz
https://github.com/LINBIT/drbd-9.0/releases/tag/drbd-9.0.16-0rc1

best regards,
Phil

--
LINBIT | Keeping The Digital World Running
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
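For anyone wanting to help test: building the release candidate from the tarball linked above goes roughly like this (a sketch; it assumes kernel headers for the running kernel and a build toolchain are installed):

```sh
wget http://www.linbit.com/downloads/drbd/9.0/drbd-9.0.16-0rc1.tar.gz
tar xzf drbd-9.0.16-0rc1.tar.gz
cd drbd-9.0.16-0rc1
make                # builds the module against the running kernel's headers
sudo make install   # then reload the drbd module to pick up the new version
```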