Re: [DRBD-user] DRBD Sync stalls

2009-08-04 Thread Lars Ellenberg
On Mon, Aug 03, 2009 at 09:25:09PM -0700, Димитър Бойн wrote:
 Hi, again!
 So hoping that the new DRBD 8.3 might address my issue I did the upgrade.
 :-( it didn't help.
 As I did not receive any answers to my original post 2 weeks ago I guess my 
 problem is unique?

There have been two answers.

 What would be the best way to troubleshoot? Any way to troubleshoot?

have a look at my sig ;)

-- 
: Lars Ellenberg
: LINBIT HA-Solutions GmbH
: DRBD®/HA support and consulting   http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] slow drbd over tripple gigabit bonding balance-rr

2009-08-04 Thread Zoltan Patay
Mark,

The syncer rate does not define the overall drbd performance between the two
nodes; rather, it sets a limit for re-syncing, to preserve normal system
performance while a resync is happening in the background (so your re-sync
processes don't eat up all the bandwidth between your drbd nodes, and normal
drbd usage can continue).

It has nothing to do with the working performance of a single drbd device
pair. See more here:
http://www.drbd.org/users-guide/s-configure-syncer-rate.html
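
For reference, this is roughly how the resync rate is adjusted on DRBD 8.3,
using the resource and device names from the config quoted further down (35M
being the figure suggested in the next paragraph):

  # permanent: edit the syncer section in /etc/drbd.conf ...
  #   syncer { rate 35M; }
  # ... and tell drbd to re-read it
  drbdadm adjust OpenVZ_C1C2_B_LVM5

  # temporary override (DRBD 8.x syntax), reverted by the next adjust
  drbdsetup /dev/drbd26 syncer -r 35M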

Originally it was set to 80M in this case, but it should probably be even
lower: despite the iperf results, I never saw DRBD go over 117MB/s (a single
gigabit link hitting its performance wall), so it should probably be as low
as 35MB/s.

As I wrote before, the two boxes have three gigabit links dedicated to DRBD.
These are bonded using the balance-rr mode with ARP IP monitoring (basically
I can unplug any of the cables between the nodes, in any order, and plug them
back in; as long as there is a single link, the connection is uninterrupted.
Also, depending on how many gigabit links are alive, the bandwidth scales up
with every additional connection, which is pretty swift actually).
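
For context, a balance-rr bond with ARP monitoring of the kind described
above is set up roughly like this; the interface names, addresses and ARP
interval are assumptions for illustration, not the actual values used here:

  # load the bonding driver in round-robin mode with ARP link monitoring
  modprobe bonding mode=balance-rr arp_interval=100 arp_ip_target=10.0.10.20
  # bring the bond up on the replication network and enslave the three NICs
  ifconfig bond0 10.0.10.10 netmask 255.255.255.0 up
  ifenslave bond0 eth1 eth2 eth3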

What I would like to see is higher write rates. I know how the different
bonding modes work; I also know balance-rr is the only mode where a single
connection can scale beyond the capacity of a single card, and it clearly
happens when I benchmark with iperf, but I have never seen it in DRBD.
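
(For what it is worth, the single-stream iperf test that shows whether one
TCP connection scales across the bond looks roughly like this; the peer
address is the replication IP from the config quoted below:)

  # on one node
  iperf -s
  # on the other node: a single TCP stream across the bonded link
  iperf -c 10.0.10.10 -t 10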

Now, since this is a Xen Dom0, I have been able to do more testing in the
DomU itself (half of the testing was done before I wrote to the mailing
list)

In the paravirtualized DomU, the drbd devices are imported as xvdb to xvdf,
and they were used:

1) as physical volumes for LVM in the DomU itself, setting up the logical
volumes using striping in LVM

2) as part of a raid0 stripe

3) the raid0 stripe from above as a physical volume for LVM

There is no significant difference in any of these modes.
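
Roughly, the three layouts above were built along these lines; the device
names, volume group name and stripe parameters are illustrative only:

  # 1) striped LVM directly on the imported DRBD devices
  pvcreate /dev/xvdb /dev/xvdc /dev/xvdd
  vgcreate datavg /dev/xvdb /dev/xvdc /dev/xvdd
  lvcreate -i 3 -I 64 -L 100G -n datalv datavg

  # 2) an md raid0 stripe over the same devices
  mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/xvdb /dev/xvdc /dev/xvdd

  # 3) the raid0 stripe used as a single LVM physical volume
  pvcreate /dev/md0
  vgcreate datavg /dev/md0
  lvcreate -L 100G -n datalv datavg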

I have even tested a RAID0 stripe over the drbd volumes in the Xen Dom0, with
the same results.

I also know about the LVM default block device read-ahead performance issues,
and blockdev --setra is used as a workaround at all levels.
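
That workaround amounts to something like the following at every layer of the
stack; the read-ahead value (in 512-byte sectors) and device paths are only
examples:

  # check the current read-ahead
  blockdev --getra /dev/drbd26
  # raise it on the DRBD device and on the volumes underneath it
  blockdev --setra 8192 /dev/drbd26
  blockdev --setra 8192 /dev/mapper/xenvg-OpenVZ_C1C2_B_LVM5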

Whenever I disconnect drbd, performance is as expected (close to the measured
raid10 performance).

The reason this is so annoying is that both on the DRBD wiki and on the
mailing list there are hints that even over a dual gigabit link in balance-rr
the performance is much better; see for yourself:

http://www.drbd.org/home/wiki/?tx_drwiki_pi1[keyword]=performance

http://www.nabble.com/DRBD-Performance-td18745802.html

http://lists.linbit.com/pipermail/drbd-user/2008-July/009893.html


Also, in case it is not clear, this is nested LVM (the DRBD devices are the
physical volumes for LVM in the paravirtualized Xen instance; DRBD itself is
running in Dom0):


Xen Dom0: 6x HDD - RAID10 - LVM -+- DRBD -+            +- xvdb (PV) -+
                                 +- DRBD -+- Xen DomU -+- xvdc (PV) -+- LVM - file systems
                                 +- DRBD -+            +- xvdd (PV) -+

As a side note, I am a seasoned sysadmin of fifteen years; I have used Linux
for practically everything for the last ten years and work with it at least
twelve hours a day (usually more than that; I am lucky to do for a living
what I love, so work and fun are the same).

So, does anybody know how those magical numbers were achieved over those links?

z

On Thu, Jul 30, 2009 at 9:18 AM, Mark Watts m.wa...@eris.qinetiq.com wrote:

 On Thu, 2009-07-30 at 03:57 -0400, Zoltan Patay wrote:
  using dd if=/dev/zero of=/dev/drbd26 bs=10M count=100 I get:
 
  drbd connected
  1048576000 bytes (1.0 GB) copied, 13.6526 seconds, 76.8 MB/s
  1048576000 bytes (1.0 GB) copied, 13.4238 seconds, 78.1 MB/s
  1048576000 bytes (1.0 GB) copied, 13.2448 seconds, 79.2 MB/s
 
  drbd disconnected
  1048576000 bytes (1.0 GB) copied, 4.04754 seconds, 259 MB/s
  1048576000 bytes (1.0 GB) copied, 4.06758 seconds, 258 MB/s
  1048576000 bytes (1.0 GB) copied, 4.06758 seconds, 258 MB/s
 
  The three (intel) gigabit PCIe cards are bonded with balance-rr, and
  iperf gives me:
 
  iperf 0.0-10.0 sec  2.52 GBytes  2.16 Gbits/sec (276.48MB/s)
 
  So clearly there is enough speed for both on the network and in the
  backend to support higher speeds. The boxes are with cross-over
  back-to-back no-switch.
 
  version: 8.3.0 (api:88/proto:86-89)
  GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by
  p...@fat-tyre, 2008-12-18 15:26:13
 
  global { usage-count yes; }
 common { syncer { rate 650M; } }

 Try actually setting this to a sensible value for 3 x 1Gbit links.
 eg: 300M

  resource OpenVZ_C1C2_B_LVM5 {
protocol C;
startup {degr-wfc-timeout 120;}
disk {on-io-error
  detach;no-disk-flushes;no-md-flushes;no-disk-drain;no-disk-barrier;}
net {
  cram-hmac-alg sha1;
  shared-secret OpenVZ_C1C2_B;
  allow-two-primaries;
  after-sb-0pri discard-zero-changes;
  after-sb-1pri discard-secondary;
  after-sb-2pri disconnect;
  rr-conflict disconnect;
  

[DRBD-user] SOLVED - kind of (Fwd: slow drbd over tripple gigabit bonding balance-rr)

2009-08-04 Thread Zoltan Patay
Turns out, once I did the testing with a 1M block size instead of 10M, it was
showing the performance I expected.

sync; echo 3 > /proc/sys/vm/drop_caches   # free pagecache, dentries and inodes

sync;dd if=/dev/zero of=/vz/blob bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 4.04935 seconds, 259 MB/s

I wonder if max-buffers and max-epoch-size being set to 2048 have anything to
do with this.
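
A note on the benchmark itself: with a buffered dd like the one above, the
block size and the page cache both influence the reported number, so a
variant that forces the data to disk gives a more comparable figure (sketch
only, reusing the same test file):

  # bypass the page cache entirely
  dd if=/dev/zero of=/vz/blob bs=1M count=1000 oflag=direct
  # or keep buffered I/O but include the final flush in the timing
  dd if=/dev/zero of=/vz/blob bs=1M count=1000 conv=fdatasync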

z

PS: I want to thank Mark for replying to the post.

-- Forwarded message --
From: Zoltan Patay zoltanpa...@gmail.com
Date: Thu, Jul 30, 2009 at 3:57 AM
Subject: slow drbd over tripple gigabit bonding balance-rr
To: drbd-user@lists.linbit.com


using dd if=/dev/zero of=/dev/drbd26 bs=10M count=100 I get:

drbd connected
1048576000 bytes (1.0 GB) copied, 13.6526 seconds, 76.8 MB/s
1048576000 bytes (1.0 GB) copied, 13.4238 seconds, 78.1 MB/s
1048576000 bytes (1.0 GB) copied, 13.2448 seconds, 79.2 MB/s

drbd disconnected
1048576000 bytes (1.0 GB) copied, 4.04754 seconds, 259 MB/s
1048576000 bytes (1.0 GB) copied, 4.06758 seconds, 258 MB/s
1048576000 bytes (1.0 GB) copied, 4.06758 seconds, 258 MB/s

The three (intel) gigabit PCIe cards are bonded with balance-rr, and iperf
gives me:

iperf 0.0-10.0 sec  2.52 GBytes  2.16 Gbits/sec (276.48MB/s)

So clearly there is enough speed both on the network and in the backend to
support higher write rates. The boxes are connected back-to-back with
crossover cables, no switch in between.

version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by p...@fat-tyre,
2008-12-18 15:26:13

global { usage-count yes; }
   common { syncer { rate 650M; } }

resource OpenVZ_C1C2_B_LVM5 {
  protocol C;
  startup {degr-wfc-timeout 120;}
  disk {on-io-error detach; no-disk-flushes; no-md-flushes; no-disk-drain; no-disk-barrier;}
  net {
cram-hmac-alg sha1;
shared-secret OpenVZ_C1C2_B;
allow-two-primaries;
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
after-sb-2pri disconnect;
rr-conflict disconnect;
timeout 300;
connect-int 10;
ping-int 10;
max-buffers 2048;
max-epoch-size 2048;
  }
  syncer {rate 650M;al-extents 257;verify-alg crc32c;}
  on c1 {
device /dev/drbd26;
disk   /dev/mapper/xenvg-OpenVZ_C1C2_B_LVM5;
address10.0.10.10:7826;
meta-disk  /dev/mapper/xenvg-DRBD_MetaDisk[26];
  }
  on c2 {
device/dev/drbd26;
disk   /dev/mapper/xenvg-OpenVZ_C1C2_B_LVM5;
address   10.0.10.20:7826;
meta-disk  /dev/mapper/xenvg-DRBD_MetaDisk[26];
  }
}


Some of the settings above are unsafe (no-disk-flushes; no-md-flushes); they
were turned on to see if it makes any difference (it did not).

The two boxes are quad-core 3GHz Nehalems with 12GB of triple-channel
DDR3-1600 and 6 Western Digital Caviar Black 750GB HDDs in RAID10 with LVM on
top of it; the DRBD backends are carved out of LVM. Three separate Intel
gigabit PCIe cards are bonded with balance-rr and connect the boxes
back-to-back, with a fourth gigabit card in each box (onboard) toward the
outside.

The OS is Debian Etch + Backports with some custom deb packages rolled by
me. The machine is a Xen Dom0, kernel: 2.6.26, xen: 3.2.1, drbd: 8.3.0

Thanks for any help / hints in advance,

z
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


[DRBD-user] I think that there is a missing instruction for RPM building.

2009-08-04 Thread Dotan Barak
Hi all.

I tried to follow the directions at this URL:
http://www.drbd.org/users-guide/s-build-rpm.html
and it seems that something is missing:
before executing make rpm, you first need to build the .filelist.

This can be done with make .filelist or with make tarball.
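
In other words, the order that appears to be needed is roughly the following
(the tarball name is only an example for whichever DRBD version is being
built):

  tar xzf drbd-8.3.2.tar.gz
  cd drbd-8.3.2
  make .filelist      # or: make tarball, which also generates it
  make rpm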

Thanks
Dotan
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


[DRBD-user] switch over takes a quite long time

2009-08-04 Thread Pierre LEBRECH
Hello,

context: 3-node cluster, every node connected, HA services on node1, DRBD
version 8.3.2 on Linux 2.6.30.

I switch HA services over to node2 with /usr/lib/heartbeat/hb_standby all 
from node1.

It takes a long time to perform: 20 seconds.

I have this in the logs (ha-log):

on node1 :

ResourceManager[7475]:  2009/08/04_11:53:27 info: Running 
/etc/ha.d/resource.d/drbdupper r0-U start
Filesystem[7799]:   2009/08/04_11:53:47 INFO:  Resource is stopped
ResourceManager[7475]:  2009/08/04_11:53:47 info: Running 
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 start

and on node2 :

heartbeat[4836]: 2009/08/04_11:53:26 info: Local standby process completed 
[all].
heartbeat[4836]: 2009/08/04_11:53:26 info: New standby state: 3
heartbeat[4836]: 2009/08/04_11:53:26 info: Managed go_standby process 12561 
exited with return code 0.
heartbeat[4836]: 2009/08/04_11:53:48 WARN: 1 lost packet(s) for [node1] 
[7702:7704]
heartbeat[4836]: 2009/08/04_11:53:48 info: remote resource transition completed.


Question: why does drbdupper take such a long time to start? Is that normal?

Thanks.






Here is the drbd.conf file:

global {
usage-count yes;
}
common {
  syncer { rate 10M; }
  net {
max-buffers 4;
  }
}

resource r0 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
  }
  startup {
wfc-timeout  0;
degr-wfc-timeout 120;
  }
  disk {
on-io-error   detach;
  }
  net {
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
  }
  syncer {
rate 90M;
al-extents 128;
csums-alg md5;
  }

  on node1 {
  device /dev/drbd0;
  disk   /dev/md2;
  address10.0.0.1:7788;
  meta-disk  /dev/md1 [0];
  }
  on node2 {
  device /dev/drbd0;
  disk   /dev/md2;
  address10.0.0.2:7788;
  meta-disk  /dev/md1 [0];
  }
}

resource r0-U {
  protocol C;

  syncer {
csums-alg md5;
rate 5M;
  }

  stacked-on-top-of r0 {
device/dev/drbd1;
address   192.168.2.15:7788;
  }

  on node3 {
device/dev/drbd1;
disk  /dev/md2;
address   192.168.2.14:7788;
meta-disk internal;
  }


}

___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Antwort: upgrade problems

2009-08-04 Thread George
Hi,

Thanks for your replies.

I followed your replies and it worked as you described. Unfortunately I could
not finish the upgrade because the HDD died while upgrading the second server,
so I had to recreate everything from scratch.

On Mon, Jul 27, 2009 at 1:34 PM, robert.koe...@knapp.com wrote:

 Hi, the message you get during upgrade is perfectly normal. Answer with yes
 and it will search for old metadata; answer with yes again if it offers to
 upgrade, and it will upgrade the metadata. However, be sure to have a working
 backup of your data in case something goes wrong. DO NOT ANSWER WITH YES IF
 NO V7 METADATA IS FOUND!
 I have upgraded about half a dozen DRBD clusters in the last months
 following this guide:

 http://blogs.linbit.com/florian/2007/10/03/step-by-step-upgrade-from-drbd-07-to-drbd-8/

 Mit freundlichen Grüßen / Best regards,

 Robert Köppl
 System Administration
 ---
 Phone: +43 3842 805-910
 Fax: +43 3842 805-500
 robert.koe...@knapp.com
 www.KNAPP.com
 ---
 KNAPP Systemintegration GmbH
 Waltenbachstrasse 9
 8700 Leoben, Austria
 ---
 Commercial register number: FN 138870x
 Commercial register court: Leoben
 ---

 ___
 drbd-user mailing list
 drbd-user@lists.linbit.com
 http://lists.linbit.com/mailman/listinfo/drbd-user


___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] split brain detected when switching back to the 2node cluster from the DR node

2009-08-04 Thread guohuai li

Hi,
 
There are several options like the ones below in /etc/drbd.conf; you may need
to look into them.
 
My DRBD is 8.3.0.
It works well.
 
edward
 
#after-sb-0pri disconnect;
after-sb-0pri discard-older-primary;
 
#after-sb-1pri disconnect;
after-sb-1pri discard-secondary; 
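
If the automatic after-split-brain policies do not apply (or are left at
disconnect), the manual recovery from the DRBD 8.3 user's guide is roughly
the following; the resource name r0 is just an example, and for a stacked
resource you would add --stacked as in the commands quoted below:

  # on the node whose changes are to be discarded (the split-brain victim)
  drbdadm secondary r0
  drbdadm -- --discard-my-data connect r0

  # on the surviving node, if it is StandAlone, simply reconnect
  drbdadm connect r0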
 
 Date: Tue, 4 Aug 2009 18:58:15 +0200
 From: pierre.lebr...@laposte.net
 To: drbd-user@lists.linbit.com
 Subject: [DRBD-user] split brain detected when switching back to the 2node 
 cluster from the DR node
 
 Hello,
 
 I always get a split brain when I switch the HA services back to the 2node 
 cluster from my DR node.
 
 Here are the steps I follow:
 
 - HA services are on the DR node
 - I stop these HA services
 - I umount the data
 
 The state of DRBD on node3 is as follows:
 
 --
 version: 8.3.2 (api:88/proto:86-90)
 GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hcns1, 
 2009-08-04 09:41:09
 
 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r
 ns:0 nr:34083396 dw:34084032 dr:68168163 al:12 bm:2094 lo:0 pe:0 ua:0 ap:0 
 ep:1 wo:f oos:208
 --
 
 Then, on node1 :
 
 - drbdadm primary r0
 - I start the HA IP
 - drbdadm --stacked up r0-U
 
 At this point, everything is OK. Here is the output of cat /proc/drbd:
 
 --
 version: 8.3.2 (api:88/proto:86-90)
 GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hans1, 
 2009-08-04 09:43:39
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r
 ns:13706 nr:174 dw:15096 dr:102401147 al:30 bm:2170 lo:0 pe:0 ua:0 ap:0 ep:1 
 wo:f oos:0
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r
 ns:0 nr:244 dw:244 dr:416 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 --
 
 Then, I set everything back (reset):
 
 on node1 :
 
 - drbdadm --stacked down r0-U
 - drbdadm secondary r0
 - I stop the HA IP
 
 The state on node1 is as follows:
 
 --
 version: 8.3.2 (api:88/proto:86-90)
 GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hans1, 
 2009-08-04 09:43:39
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r
 ns:13707 nr:174 dw:15097 dr:102401147 al:30 bm:2186 lo:0 pe:0 ua:0 ap:0 ep:1 
 wo:f oos:0
 1: cs:Unconfigured
 --
 
 On node3, I type these commands to reset the state:
 
 - drbdadm secondary r0-U
 
 Then, on node1 and node2, I start heartbeat normally.
 
 
 
 Well, each time I follow these steps, node3 gets a split-brain.
 
 Where is the problem?
 
 
 
 
 context: 3-node cluster, every node connected, HA services on node1, DRBD
 version 8.3.2 on Linux 2.6.30.
 
 ___
 drbd-user mailing list
 drbd-user@lists.linbit.com
 http://lists.linbit.com/mailman/listinfo/drbd-user

_
Share your memories online with anyone you want.
http://www.microsoft.com/middleeast/windows/windowslive/products/photos-share.aspx?tab=1___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user