Re: [DRBD-user] "kernel: bio too big device drbd0"

2013-06-06 Thread Lars Ellenberg
On Thu, Jun 06, 2013 at 06:30:23PM +0200, Lutz Vieweg wrote: > On 06/06/2013 02:51 PM, Lars Ellenberg wrote: > >You did something bad, and that confused the IO stack. > > I would have expected any kind of error message from any of the > tools I used to increase the device s

Re: [DRBD-user] "kernel: bio too big device drbd0"

2013-06-07 Thread Lars Ellenberg
io size smaller while being in use, there would have been a log message like ASSERT FAILED new < now; (xyz < XYZ) max BIO size = NEW SIZE Lutz, you see that in your kernel logs? -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.li

Re: [DRBD-user] "kernel: bio too big device drbd0"

2013-06-07 Thread Lars Ellenberg
On Fri, Jun 07, 2013 at 02:42:32PM +0200, Lutz Vieweg wrote: > On 06/07/2013 02:03 PM, Lars Ellenberg wrote: > >>I also had a look at the latest DRBD 8.4 source. They still mess around > >>with the blk queue limits. Set it larger, set it smaller dynamically > >>while

Re: [DRBD-user] "kernel: bio too big device drbd0"

2013-06-11 Thread Lars Ellenberg
On Mon, Jun 10, 2013 at 04:14:03PM +0200, Helmut Wollmersdorfer wrote: > > > Am 07.06.2013 um 23:21 schrieb Lars Ellenberg: > > > >Bah. > >"Should not happen" > > > > That's what developers usually say. > > And experienced te

Re: [DRBD-user] [drbd?] Kernel panic - not syncing: Out of memory and no killable processes...

2013-06-11 Thread Lars Ellenberg
e the hard limit for the number of minor devices (allocation of an array of corresponding size), that has long since changed, and now it is really only used as scaling factor for these mempools. Lars

Re: [DRBD-user] protocol error ?

2013-06-11 Thread Lars Ellenberg
ted > > Restarting receiver thread > receiver (re)started > conn( Unconnected -> WFConnection ) > conn( WFConnection -> Disconnecting ) > Discarding network configuration. > Connection closed > conn( Disconnecting -> StandAlone ) > receiver terminated

Re: [DRBD-user] r0 ok, r1 PingAck did not arrive in time

2013-06-27 Thread Lars Ellenberg
was more than two years ago. But from the changelog of 8.3.11: * Fixed wrong connection drops ("PingAck did not arrive in time") with asymmetrically congested networks Just saying ...

Re: [DRBD-user] Replication problems constants with DRBD 8.3.10

2013-06-28 Thread Lars Ellenberg
"upper layers" are doing, hoping that whatever it is, it is supposed to be valid. Or just don't do dual-primary. Better yet: fix those "upper layers" to not do what they are doing, enable the checksum if it makes you feel good, do single-primary anyways, and still add fenc

Re: [DRBD-user] Replication problems constants with DRBD 8.3.10

2013-07-01 Thread Lars Ellenberg
See below. On Fri, Jun 28, 2013 at 08:34:50PM -0700, cesar wrote: > Hi guys > > can anybody help me > I will be very grateful if anyone can help me > > *Mr. Lars Ellenberg tell me that:* > "With special purpose built fencing handlers, > we may be able to fix

Re: [DRBD-user] How to unlock on /var/lock/drbd-147-xxx

2013-07-03 Thread Lars Ellenberg
ugxw | grep 6719 > root 6719 0.0 0.0 4112 652 ? S Jun28 0:01 drbdsetup > 12 disconnect

Re: [DRBD-user] make a script for my needs

2013-07-05 Thread Lars . Ellenberg
On Thu, Jul 04, 2013 at 10:25:01AM -0700, br...@click.com.py wrote: > Hi Lars > > According to your words: > "With special purpose built fencing handlers, we may be able to fix > your setup so it will freeze IO during the disconnected period, > reconnect, and replay pending buffers, without any re

Re: [DRBD-user] [drbd-8.4.3] Initial full synchronization starts without `drbdadm primary --force resource'

2013-07-11 Thread Lars Ellenberg
4327 flags:4 > Jul 10 10:22:59 nodeb kernel: block drbd0: uuid_compare()=0 by rule 10 > Jul 10 10:22:59 nodeb kernel: block drbd0: No resync, but 536854327 bits in > bitmap! > Jul 10 10:22:59 nodeb kernel: block drbd0: disk( Negotiating -> Inconsistent ) > Jul 10 10:22:59 nodeb kernel: b

Re: [DRBD-user] Reproducible ASSERT( os.conn == C_WF_REPORT_PARAMS )

2013-07-15 Thread Lars Ellenberg
pToDate". It is probably supposed to already wait in this fashion, but apparently it does not. > The google groups thread includes an strace log of execve() calls, > so you can see what sequence of drbdsetup calls are being issued. Is > it possible that ganeti is taking an unsafe

Re: [DRBD-user] Configuring the scheduler for a DRBD backend

2013-07-15 Thread Lars Ellenberg
and I'd suggest to set some levels of such a stack to no-op. > Another point is, are those /dev/dm-x devices consistent on reboots? Most of the time, unless you changed "something", but not "reliably". But they are not relevant. Why not just use deadline as defau

Re: [DRBD-user] Configuring the scheduler for a DRBD backend

2013-07-16 Thread Lars Ellenberg
On Mon, Jul 15, 2013 at 06:14:16PM +0200, Wiebe Cazemier wrote: > - Original Message - > > From: "Lars Ellenberg" > > To: drbd-user@lists.linbit.com > > Sent: Monday, 15 July, 2013 5:56:51 PM > > Subject: Re: [DRBD-user] Configuring the scheduler for

Re: [DRBD-user] Reproducible ASSERT( os.conn == C_WF_REPORT_PARAMS )

2013-07-16 Thread Lars Ellenberg
onnect/reconnect with different settings dance. Then the whole procedure above folds to - promote migration target - migrate - demote migration source no disconnect, reconnect, wait for whatever and again... the "risk" during normal mode (which is supposedly single primary,

Re: [DRBD-user] Replication problems constants with DRBD 8.3.10

2013-07-17 Thread Lars Ellenberg
alg" to "calculate-additional-checksums-for-diagnostic-purposes"? > Always the communication of DRBD are: > 1- NIC to NIC > 2- Bond active-backup with two NICs > > On a previous post Mr. Lars Ellenberg tell me basically two things about > loss connection of DRBD: > 1- fi

Re: [DRBD-user] recovery from "page allocation failure"

2013-07-17 Thread Lars Ellenberg
ess likely to hit this situation. Part of the issue was that there is no "physically contiguous" memory available: even though we have free memory, it is too fragmented. The "compaction" should cause "defragmentation" during normal allocations, making it much less likely

Re: [DRBD-user] recovery from "page allocation failure"

2013-07-18 Thread Lars Ellenberg
On Thu, Jul 18, 2013 at 10:11:15AM +0900, Christian Balzer wrote: > On Wed, 17 Jul 2013 11:27:23 +0200 Lars Ellenberg wrote: > > > On Wed, Jul 17, 2013 at 05:25:13PM +0900, Christian Balzer wrote: > > > > > > > > > On a very busy cluster with ker

Re: [DRBD-user] "BAD! BarrierAck"... after split brain.

2013-09-09 Thread Lars Ellenberg
his messages > > in new master server: > > > > d-con shared: BAD! BarrierAck #547846 received, expected #547845! > > > > Where the first number is always greater than the second by 1. > > > > This is Kernel 3.10.5, in-kernel DRBD 8.4.3. > > > > An

Re: [DRBD-user] Integrating DRBD with Redhat cluster

2013-09-09 Thread Lars Ellenberg
DRBD. And I really don't like that. So thank you for asking on the mailing list, that gets my hopes up a bit again ;-) If you have problems getting the documented way working, or following the available tutorials, please come back to the mailing list.

Re: [DRBD-user] [PATCH] crm-fence-peer.sh: tweak --suicide-on-failure-if-primary

2013-09-09 Thread Lars Ellenberg
msg}going to reboot -f in $(( > reboot_timeout - SECONDS )) seconds! To cancel: kill $$" > sleep 2 > done > @@ -367,6 +364,10 @@ drbd_peer_fencing() > get_cib_xml -Ql || return > fence_peer_init || return > > + local startup_fencing stonith_

Re: [DRBD-user] Digest integrity check FAILED - Help tracking down the cause

2013-09-13 Thread Lars Ellenberg
ate -> Inconsistent ) > Sep 5 07:49:11 mdb1-ha2 kernel: [32102360.020212] block drbd1: Began > resync as SyncSource (will sync 244 KB [61 bits set]). > Sep 5 07:49:11 mdb1-ha2 kernel: [32102360.109042] block drbd1: Resync > done (total 1 sec; paused 0 sec; 244 K/sec) > Sep 5 07:49:11 mdb1-ha2 kernel: [321023

Re: [DRBD-user] Problem with crm fencing

2013-09-30 Thread Lars Ellenberg
cemaker is not even responsible for promoting, telling pacemaker to not promote something it does not even know about cannot possibly help ;-) Using "become-primary" when using pacemaker is almost always not the best idea.

Re: [DRBD-user] ocf resource puts drbd in secondary mode on startup

2013-10-05 Thread Lars Ellenberg
... > } > > I shall try this, may be next night (to prevent rebooting my virtual > servers in work time). No. Your problem is not with the DRBD resource agent, but with your pacemaker configuration. Lars

Re: [DRBD-user] Another iSCSI Active/Active Thread

2013-10-09 Thread Lars Ellenberg
ware applications. If you want to use multiple targets concurrently on the same data set in a cluster, the target needs to become cluster aware *itself*. Whether or not the backend storage is replicated by DRBD, shared SCSI or SAN, or whatever else, is unrelated as well.

Re: [DRBD-user] Protocol D

2013-10-09 Thread Lars Ellenberg
uld make this work, if technically sensibly possible. But as things are now, if you use this "remus + DRBD protocol D", and it breaks, you get to keep the pieces.

Re: [DRBD-user] pacemaker + corosync and postgresql

2013-10-11 Thread Lars Ellenberg
subsys: AMF > debug: off > } > } > > amf { > mode: disabled > } > > aisexec{ > user : root > group : root > } > > service{ > # Load the Pacemaker Cluster Resource Manager > name : pacemaker > ver : 0 >

Re: [DRBD-user] pacemaker + corosync and postgresql

2013-10-14 Thread Lars Ellenberg
essing > failed op drbd_postgresql:0_last_failure_0 on ha-master: unknown error (1) > Oct 14 11:10:08 ha-master pengine: [786]: info: clone_print: Master/Slave > Set: ms_drbd_postgresql [drbd_postgresql] > Oct 14 11:10:08 ha-master pengine: [786]: info: short_print: Stopped: [ >

Re: [DRBD-user] DRBD 8.3.13 Sync getting stalled with RHEL 6

2013-10-21 Thread Lars Ellenberg
grade to y ;-) I'd recommend using 8.3.16 (or 8.4.4), and come back if that does not help. Lars

Re: [DRBD-user] DRBD sync stalled

2013-11-11 Thread Lars Ellenberg
e successful: DRBD Network > Protocol version 74 > Nov 5 11:55:08 w583s3255 kernel: drbd0: drbd0_asender [4856]: cstate > SyncSource --> NetworkFailure > Nov 5 11:55:08 w583s3255 kernel: drbd0: drbd0_receiver [2909]: cstate > NetworkFailure --> Unconnected > > > How

Re: [DRBD-user] Ahead / Behind Log Messages

2013-11-12 Thread Lars Ellenberg
replication possibly repeat on next "congestion" event (write burst). If you try with too low thresholds, it may flip-flop between those states very quickly. No, we don't intend to suppress state change log messages.

Re: [DRBD-user] Fwd: Safely Remove Resource in 8.3

2013-11-18 Thread Lars Ellenberg
And that will stay until the next module load. DRBD 8.3 does not remove resource objects completely, but will keep "stubs". Starting with DRBD 8.4, we remove them completely. Lars

Re: [DRBD-user] DRBD 8.4 on OpenVZ

2013-11-28 Thread Lars Ellenberg
n just use the in-kernel implementation instead of doing its own "compat" implementation, which may or may not still sufficiently match the current in-kernel implementation. Lars

Re: [DRBD-user] data mismatch when primary/secondary are both up2date

2013-11-29 Thread Lars Ellenberg
or worse quickly. So don't bypass DRBD. Lars

Re: [DRBD-user] drbdadm verify stalled

2013-12-06 Thread Lars Ellenberg
his entry on the syslog : > > > > Dec 3 09:39:23 ifprdstor6a kernel: [61078.583849] block drbd2: > > [drbd2_worker/3080] sock_sendmsg time expired, ko = 4294957143 > > Dec 3 09:39:29 ifprdstor6a kernel: [61084.574556] block drbd2: > > [drbd2_worker/3080] sock_sendmsg time expired, ko = 4294957142 > > Dec 3 09:39:31 ifprdstor6a

Re: [DRBD-user] How to Trim ssd raid+ocfs2+drbd+dual primary?

2014-01-09 Thread Lars Ellenberg
SD *and* MD properly support and propagate the discard capabilities and requests?

Re: [DRBD-user] DRBD stalls reproducibly on every "drbdadm verify"

2014-01-09 Thread Lars Ellenberg
exhaust max-buffers with resync/verify-requests. Whether you achieve that by increasing max-buffers, or by starting the verify from the other node, or by reducing the number of queued resync requests per unit time, or both, is not that important.

Re: [DRBD-user] How to Trim ssd raid+ocfs2+drbd+dual primary?

2014-01-10 Thread Lars Ellenberg
tem. > So question is 'How trim filesystem on drbd which doesn't support discard > mount option?' use fstrim ;-) Admittedly, this was not an ocfs2, I just did not have one handy. Still, it proves that fstrim works on top of DRBD. mount /dev/drbd0 /mnt/something

Re: [DRBD-user] BUG: Uncatchable DRBD out-of-sync issue

2014-02-04 Thread Lars Ellenberg
" -o "$raid_lvl" = "raid10" ]; then continue fi Anyways. Point being: Either have those upper layers stop modifying buffers while they are in-flight (keyword: "stable pages"). Kernel upgrade within the VMs may do it. Changing something in the &quo

Re: [DRBD-user] Full DRDB device on LVM is now unusable

2014-02-04 Thread Lars Ellenberg
get > primitive ldap lsb:ldap > primitive lvmdata ocf:heartbeat:LVM \ > params volgrpname="vg0drbd" \ > meta target-role="started" > primitive nfs lsb:nfs > primitive nfslock lsb:nfslock > primitive openfiler lsb:openfiler > primitive samba l

Re: [DRBD-user] issue with create-md

2014-02-04 Thread Lars Ellenberg
' not defined in your config (for this host). > > Am I missing something here? this is DRBD 8.4.4 on CentOS 6.5

Re: [DRBD-user] BarrierAck #440237 received, expected #440236!

2014-02-18 Thread Lars Ellenberg
ixed... Probably easiest to upgrade to out-of-tree.

Re: [DRBD-user] Diskless / unclean metadata

2014-02-24 Thread Lars Ellenberg
; [need to type 'yes' to confirm] no > > I didn't confirmed the create-md process because I'm afraid that it > will discard all of the data saved on the disk, right? > > And now, the icing on the cake: I currently have no secondary node > due to an hardware failu

Re: [DRBD-user] BUG: Uncatchable DRBD out-of-sync issue

2014-02-24 Thread lars . ellenberg
e with the results, which are > > potentially not identical blocks on the DRBD peers. > > > > Hello Lars, > > Thank you for the detailed explanation. I've done some more tests and found > that "out of sync" sectors appear for master-slave also, not only

Re: [DRBD-user] [Drbd-dev] [PATCH] block-drbd: type is "phy" for drbd backends

2014-02-26 Thread Lars Ellenberg
drbd|phy) > > drbd_resource=$p > > drbd_role="$(drbdadm role $drbd_resource)" > > drbd_lrole="${drbd_role%%/*}" > > @@ -278,7 +278,7 @@ case "$command" in > > > >remove) > > case $t in >

Re: [DRBD-user] Diskless / unclean metadata

2014-02-27 Thread Lars Ellenberg
On Thu, Feb 27, 2014 at 11:05:30AM +0100, olc wrote: > Hi - > > On 24/02/2014 17:52, Lars Ellenberg wrote: > >>My primary node crashed today. I have had to reboot it but now it > >>reports unclean metadata. :( > > > >How about a simple "drbdadm adjust a

Re: [DRBD-user] Ahead stuck problem

2014-03-16 Thread Lars Ellenberg
han zero */ > ! /*atomic_read(&mdev->ap_in_flight) == 0 && */ > ! atomic_read(&mdev->ap_in_flight) <= 0 && > !drbd_test_and_set_flag(mdev, AHEAD_TO_SYNC_SOURCE)) { > + > + /* Reset ap_in_flight into zero */ > + ato

Re: [DRBD-user] Fencing & split brain related questions

2014-03-16 Thread Lars Ellenberg
's local DRBD data. > > IPMI/PDU fencing is certainly the way to go. You cannot use the replication device as its own fencing mechanism. That's a dependency loop. You can still use DRBD, and SBD, but you would have a different, actually shared IO medium for the SBD, independent fro

Re: [DRBD-user] [drbd] Kernel panic - not syncing: Out of memory and no killable processes...

2014-03-18 Thread Lars Ellenberg
11:43 +0800 | From: Fengguang Wu | To: Philipp Reisner , drbd-user@lists.linbit.com, linux-ker...@vger.kernel.org | Subject: Re: [drbd?] Kernel panic - not syncing: Out of memory and no killable processes... | Message-ID: <20130612101143.GA13837@localhost> | | On Tue, Jun 11, 2013 at 05:33:27

Re: [DRBD-user] Linbit drbd Failed status not configured problem

2014-03-18 Thread Lars Ellenberg
ot the way to use ocf-tester with the DRBD RA. But rest assured that the DRBD RA is not your problem. > DRBD status > # cat /proc/drbd > > > > > *version: 8.4.4 (api:1/proto:86-101)GIT-hash: > 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@umonitor1, > 2014-03-

Re: [DRBD-user] PROBLEM:"Digest mismatch, buffer modified by upper layers during write" happend again and again

2014-03-28 Thread Lars Ellenberg
flight to the storage (keyword: stable pages). Or you live with the symptoms you are seeing.

Re: [DRBD-user] performance in 8.4.4

2014-04-01 Thread Lars Ellenberg
hanges are in 9 branch also. > But 9 is not released yet. How about 8.4.4? Yeah, of course we will put performance improvements into 8.4.3, and take it out again in 8.4.4. Totally makes sense. What are you asking, really?

Re: [DRBD-user] [Drbd-dev] [Xen-devel] [PATCH] block-drbd: type is "phy" for drbd backends

2014-04-07 Thread Lars Ellenberg
On Tue, Apr 01, 2014 at 09:12:21AM +0200, Roger Pau Monné wrote: > Ping? Has been committed to our internal repositories, will find its way to the public git and into the tarballs and packages "soon". > > On 26/02/14 18:01, Roger Pau Monné wrote: > > On 26/02/14 17

Re: [DRBD-user] peer-max-bio-size 1M

2014-04-11 Thread Lars Ellenberg
swer to create my resources :-) For any normal use case, using that option to drbdmeta is simply wrong. So don't. Lars

Re: [DRBD-user] block drbd10: ASSERT( i >= 0 ) in drivers/block/drbd/drbd_int.h:2052

2014-04-11 Thread Lars Ellenberg
t means and where the problem > is related to. DRBD seems to work fine though. Here's the output of > /proc/drbd at this RaspberryPi. It's my third node, it's a stacked > device, and this third node has the role "secondary". Disconnect drbd. Reconnect drbd. If that

Re: [DRBD-user] /etc/init.d/drbd stop error

2014-04-11 Thread Lars Ellenberg
pe:0 ua:0 ap:0 ep:1 wo:f oos:0* > > I can't find the reason for that. > Any help please. You likely have a DRBD init script version from before 8.4, and that fails to properly down everything when using the 8.4 module, but expects you to do drbdadm down all (or equivalent) yourself

Re: [DRBD-user] peer-max-bio-size 1M

2014-04-11 Thread Lars Ellenberg
orrect. > However, I may be wrong with my understanding :) Nope, you are right. Also I already admitted that this is an oversight (Bug), and will be fixed anyways, even though it is mostly cosmetic imho. Thanks, Lars

Re: [DRBD-user] automated LVM snapshots: interfere with nested LVM configuration

2014-04-11 Thread Lars Ellenberg
have two (or more) separate lvm.conf with suitable filter settings, with the "outer" one being the default (and filter out all DRBD), and any "inner" one(s) only being used explicitly by setting the respective LVM_SYSTEM_DIR (and filter out everything BUT the corresponding DRB
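A minimal sketch of the "separate lvm.conf per layer" idea, assuming a hypothetical directory /etc/lvm-inner that holds the second lvm.conf. The wrapper name with_inner_lvm is made up for illustration; LVM_SYSTEM_DIR is the environment variable the LVM tools read to locate their configuration directory.

```shell
#!/bin/sh
# Hypothetical location of the "inner" configuration directory (assumption).
# It would hold an lvm.conf whose filter accepts only the DRBD device.
INNER_LVM_DIR="${INNER_LVM_DIR:-/etc/lvm-inner}"

# Run one command with LVM_SYSTEM_DIR pointing at the inner configuration.
# Everything not started through this wrapper keeps using the default
# ("outer") configuration, which filters out all DRBD devices.
with_inner_lvm() {
    LVM_SYSTEM_DIR="$INNER_LVM_DIR" "$@"
}
```

With this in place, something like `with_inner_lvm vgs` would only see the volume groups that live on top of DRBD, while a plain `vgs` would only see the backing ones.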

Re: [DRBD-user] Bug: section type conflict (drbd 8.4 / gcc 4.8.2)

2014-04-22 Thread Lars Ellenberg
from work/drbd-8.4./drbd/drbd_nl.c:103: > work/drbd-8.4./drbd/linux/drbd_genl.h:252:36: note: ‘drbd_mcg_events’ was > declared here

Re: [DRBD-user] bonding more than two NICs

2014-04-22 Thread Lars Ellenberg
"optimum" from single-TCP session throughput vs # bonding channels. For other bonding modes, you won't be able to increase single TCP session beyond single physical link saturation, but your aggregate throughput will increase with number of bonding channels and number of communic

Re: [DRBD-user] Multi-threaded on-line verification

2014-04-22 Thread Lars Ellenberg
d our own threading/async layer over this. Guess that "soon" is very relative. Yes, we can add that as feature request somewhere on our (not so short) todo list.

Re: [DRBD-user] Replication for disaster recovery

2014-04-26 Thread Lars Ellenberg
ablished, or the admin explicitly resumes IO. In theory you could try to force disconnect, then force detach, then resume And all file systems/applications on top of it will get IO errors, and may then panic ;-) Maybe easier to just have it freeze, and start a timer... if it is still fr

Re: [DRBD-user] Xen migration with drbd

2014-04-29 Thread Lars Ellenberg
as it still was "in-use" by "someone". That someone frequently turns out to be some udev triggered device scan. Please find the logs, or add your own "echo >> some-log" style debugging to the script, then see why it does not work as expected. Maybe we simp
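A rough sketch of the "echo >> some-log" style debugging suggested here, assuming a hypothetical log path and a helper name (dbg) that are not part of any shipped script:

```shell
#!/bin/sh
# Hypothetical log location (assumption); any writable path will do.
DEBUG_LOG="${DEBUG_LOG:-/tmp/drbd-script-debug.log}"

# Append one timestamped line, tagged with the PID of the calling script,
# to the debug log. Never let a logging failure break the caller.
dbg() {
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$$" "$*" \
        >> "$DEBUG_LOG" || true
}
```

Calls such as `dbg "about to demote $resource"` before a step and `dbg "demote returned $?"` right after it should show which step fails and when, relative to whatever else is touching the device.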

Re: [DRBD-user] drbdsetup-8.3 not compatible with drbdsetup 8.3 ?

2014-04-30 Thread Lars Ellenberg
" error. So the > frontend drbdsetup is rejecting this command *before* giving it to > drbdsetup-83. > > Regards, > > Brian. -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com DRBD® and LINBIT®

Re: [DRBD-user] Load high on primary node while doing backup on secondary

2014-04-30 Thread Lars Ellenberg
" DRBD setup, and disconnect the third node. Or, if you can live with reduced redundancy during the backup, disconnect the secondary for that time. Or add a dedicated PV for the snapshot "exception store", or add non-volatile cache to your RAID. Or a number of other options. Thing is, if you s

Re: [DRBD-user] minor-count

2014-04-30 Thread Lars Ellenberg
s is possible to apply this parameter without reloading the > drbd kernel module ? In 8.3: no. You may need to switch all to one side, reload the module on the then passive side, rinse and repeat. In 8.4: no need. There it is no longer relevant, but only an initial guesstimate to dimension some

Re: [DRBD-user] Bug: section type conflict (drbd 8.4 / gcc 4.8.2)

2014-04-30 Thread Lars Ellenberg
some > out-of-tree modules at http://www.grsecurity.net/~paxguy1/ . > --- > > As it sounds like this grsec option is useful to decrease attack > vectors, are you inclined to support this kernel option some day? Or is > it too exotic? Please let me know. I don't really care r

Re: [DRBD-user] three node backup setup without public/internal IP address

2014-05-07 Thread Lars Ellenberg
*. If your sustained average write rate exceeds the average drain rate through compression and replication link, you can only either disconnect, or throttle the primary system to that drain rate.

Re: [DRBD-user] resize disks/partitions

2014-05-12 Thread Lars Ellenberg
he drbd > partitions. although if it can be done without downtime then good, i can > get it procedure tested and documented in case it needs to be done in > future when the systems are live. If you had things on LVM, you could add a new PV, do lvextend or pvmove as you see fit, without los

Re: [DRBD-user] drbd+device mapper drbd didn't start.

2014-05-12 Thread Lars Ellenberg
56 centos1 kernel: drbd r0: Starting worker thread (from > drbdsetup [5231]) > May 8 16:16:56 centos1 kernel: block drbd0: open("/dev/mapper/mapatha") > failed with -16 Something already claimed mapatha. Maybe you need to exclude kpartx from mapping internal partitions, or adj

Re: [DRBD-user] mysql crashes when I move the datafiles from its current mysql directory to drbd file system

2014-05-13 Thread Lars Ellenberg
and simply symlink /var/lib/mysql to where your new mount point is (after rsync'ing everything over again, apparently right now it's a complete mess).

Re: [DRBD-user] drbd+device mapper drbd didn't start.

2014-05-14 Thread Lars Ellenberg
evice. If you need "partitions" inside of one DRBD I recommend to use DRBD as PV (physical volume) for a LVM VG (volume group). Hth, Lars > I started drbd. Same error occured. > > > > > --- On Tue, 2014/5/13, Lars Ellenberg wrote: > > On Thu, May 08

Re: [DRBD-user] drbd+device mapper drbd didn't start.

2014-05-16 Thread Lars Ellenberg
r from the top of my head, and it may not be supported on all platforms (yet). If nothing else helps, chmod -x kpartx ;-) > > I created partitions from Windows2008R first. > > Windows2008R created MS data partition(mpathap2) with MS reserved > partition(mpathap1). > >

Re: [DRBD-user] DRDB three-node cluster variant

2014-05-17 Thread Lars Ellenberg
's finally the killer application for dual-primary and stacked mode ;-) Cheers, Lars

[DRBD-user] drbd-8.4.5rc1.tar.gz

2014-05-21 Thread Lars Ellenberg
policy to be bumped up again * trigger tcp_flush_pending_frames() for PING/PING_ACK

Re: [DRBD-user] Minor number count for DRBD 8.4

2014-05-22 Thread Lars Ellenberg
ed). > For DRBD 8.3 the default value was 32 I think thats why we have a > specific option in the drbd kernel module of 128. > > -- > Met vriendelijke groet / Kind regards, > Bram Klein Gunnewiek | Shock Media B.V. > > Tel: +31 (0)546 - 714360 > Fax: +31 (0)546 - 71

[DRBD-user] Clarification about drbd-utils 8.9.0 [Re: drbd-8.4.5rc1.tar.gz]

2014-05-22 Thread Lars Ellenberg
On Wed, May 21, 2014 at 12:00:19PM +0200, Lars Ellenberg wrote: > > This is a release candidate. Please help testing, > and let us know of any problems or regressions, performance or otherwise. > > Thanks, > Lars > > Git: > http://git.linbit.com/drbd-8.4.

Re: [DRBD-user] Pacemaker - DRBD fails on node every couple hours

2014-05-26 Thread Lars Ellenberg
On Fri, May 23, 2014 at 03:37:32PM +0200, Andreas Greve wrote: > My post was related to this > http://lists.linbit.com/pipermail/drbd-user/2012-March/017922.html A fix for the issue described in that thread has been in the DRBD code since March 2012. In fact, it is 19 commits post the 8.4.1 releas

Re: [DRBD-user] Adjusting al-extents on-the-fly

2014-05-27 Thread Lars Ellenberg
t a *destructive* re-creation of the block device? I've since reverted > my changes back to what's shown above on both nodes, and have not > proceeded. The documentation on exactly what that command does is > unclear to me. > > Is it a sane thing to try and adju

Re: [DRBD-user] Low.dev. smaller than requested DRBD-dev. size.

2014-05-29 Thread Lars Ellenberg
re in the docs or on the net... > If it's a device size difference, shouldn't DRBD automatically adjust to the > lower available size? usually it means literally that the lower level device (the "physical" disk), is too small. how big is /dev/cciss/disc0/part11 ?

Re: [DRBD-user] HA NFS with more than 2 nodes.

2014-05-29 Thread Lars Ellenberg
o ... and then you do something wrong, because of the pressure to get it going again at Monday 3:27 am ... and how you recover from that. -- : Lars Ellenberg Tel +43-1-8178292-0 : : LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :

Re: [DRBD-user] Kernel 2.6.17-rc5 under amd64 may be hanging I/O, [Was] Re: [Q] What would cause fsck running on a drbd device to just stop?

2014-05-29 Thread Lars Ellenberg
too. > >* what are the numbers in /proc/drbd > > This is how it appears long after the copy hang and I stopped and just > restarted drbd on the peer. The output from then hanging computer: thanks, that might help in debugging things. I'll have a look, maybe I can find somethi

Re: [DRBD-user] Monitoring DRBD

2014-05-29 Thread Lars Ellenberg
L will change in future versions? yes. definitely. > 2. Is it possible to obtain synchronization speed, completion percents > and remaining time via IOCTLs? no.

Re: [DRBD-user] input/output error in drbd after deleting files

2014-05-29 Thread Lars Ellenberg
famous bio_clone bug...) > 2. is it a regular process to do an fsck after deleting a lot of files > in the drbd filesystem? no.

Re: [DRBD-user] Drbd make user timed out when secondary node up

2014-05-29 Thread Lars Ellenberg
les probably double or triple the speed. still not very fast with a base rate of 10 MBit, probably feels like burning an old quad-cdrom drive... note, however, that _read_ requests are carried out locally, and thus should still be as fast as your local io bandwidth can provide.

Re: [DRBD-user] Spontaneous access to the CDROM on two computers simultaneously

2014-05-29 Thread Lars Ellenberg
t least theoretically triggered this somehow. no. much more likely some program scanning ("discovering") devices. just a wild guess: man lvm.conf /filter

Re: [DRBD-user] primary/secondary problem

2014-05-29 Thread Lars Ellenberg
mean by "insist"? if you tell it to become primary, what exactly does happen? how do you try to do the "drbdadm primary all"?

Re: [DRBD-user] primary/secondary problem

2014-05-29 Thread Lars Ellenberg
pt) does not. > > > > Mine does now. Should I conclude from your question that both > constituents of a drbd cluster start in secondary mode unless altered > via init script or heartbeat? if you insist on shooting yourself, just go ahead.

Re: [DRBD-user] primary/secondary problem

2014-05-29 Thread Lars Ellenberg
/ 2006-06-13 09:22:43 +0200 \ Gernot W. Schmied: > Lars Ellenberg wrote: > > / 2006-06-11 23:22:38 +0200 > > \ Gernot W. Schmied: > >> Hi, > >> > >> Since my migration (didn't change a thing!) > >> to newest Ubuntu my DRBD "didn'

Re: [DRBD-user] primary/secondary problem

2014-05-29 Thread Lars Ellenberg
/ 2006-06-13 09:22:43 +0200 \ Gernot W. Schmied: > Lars Ellenberg wrote: > > / 2006-06-11 23:22:38 +0200 > > \ Gernot W. Schmied: > >> Hi, > >> > >> Since my migration (didn't change a thing!) to newest Ubuntu my DRBD > >> setup (0.7.17

Re: [DRBD-user] Errors when creating a LVM PV on DRBD

2014-05-29 Thread Lars Ellenberg
w will be used for lvm. > > I have excluded /dev/sdb2 in the lvm.conf file thus: filter = [ "r/dev/sdb2" ] verify that you have exactly one filter statement. try filter = [ "a|/dev/drbd|", "r|.*|" ] or filter = [ "r|/dev/sdb2|" ] ( the sec
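The filter suggestions in this message depend on how LVM walks the list. As far as the lvm.conf documentation describes it, each entry is an action letter (a = accept, r = reject) followed by a delimited regular expression; the first entry whose pattern matches a device path decides, and a path that matches no entry is accepted. Below is a minimal Python sketch of that rule. It is a simplified model and not LVM's own code: the function names are made up for illustration, Python's `re` module stands in for LVM's regex engine, bracket-style delimiters are not handled, and devices reachable under several path names are ignored.

```python
import re


def parse_filter_entry(entry):
    """Split an lvm.conf-style filter entry such as 'r|/dev/sdb2|'.

    Hypothetical helper: returns (action, compiled_regex), where action
    is 'a' (accept) or 'r' (reject). The character after the action
    letter is taken as the delimiter and must also end the entry.
    """
    if len(entry) < 3 or entry[0] not in ("a", "r"):
        raise ValueError("malformed filter entry: %r" % (entry,))
    delimiter = entry[1]
    if not entry.endswith(delimiter):
        raise ValueError("missing closing delimiter in: %r" % (entry,))
    return entry[0], re.compile(entry[2:-1])


def device_accepted(path, filter_entries):
    """Apply the first-match-wins rule to one device path."""
    for entry in filter_entries:
        action, pattern = parse_filter_entry(entry)
        if pattern.search(path):  # unanchored match, first hit decides
            return action == "a"
    return True  # nothing matched: the device is accepted


# The two filters suggested in the message, applied to sample paths.
drbd_only = ["a|/dev/drbd|", "r|.*|"]
reject_sdb2 = ["r|/dev/sdb2|"]
for dev in ("/dev/drbd0", "/dev/sdb2"):
    print(dev, device_accepted(dev, drbd_only), device_accepted(dev, reject_sdb2))
```

Under this model the first filter accepts only paths containing /dev/drbd, while the second rejects /dev/sdb2 and lets everything else through. An entry written without delimiters, such as "r/dev/sdb2", is reported as malformed by this sketch, which may be why the delimited forms are suggested.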

Re: [DRBD-user] HELP ... Primary becomes StandAlone.

2014-05-29 Thread Lars Ellenberg
/etc/init.d/drbd stop
dd if=/dev/zero bs=4096 count=1 of=/dev/whatever-your-drbd-meta-data-device
/etc/init.d/drbd start
and, on the primary, do "drbdadm connect all" again. you'll see a Full Sync of the device then. if you use internal meta-data, this is slightly more involv

Re: [DRBD-user] [DRBD user] HELP ... Primary becomes StandAlone.

2014-05-29 Thread Lars Ellenberg
/ 2006-06-20 15:30:37 +0200 \ Regis Gras: > Lars Ellenberg wrote: > > you have several options. > > > the most easy one would be to go to the secondary, > > and wipe the drbd meta data area. > > if you use external meta-data, that is > > /etc/init.d/drbd stop

Re: [DRBD-user] What server to run services while syncing

2014-05-29 Thread Lars Ellenberg
" must print the string "running" to stdout if the service is running, and should print "stopped" to stdout if the service is not running. this is for heartbeat 1.X, or heartbeat 2.x in the non-crm mode. the heartbeat crm-mode is happy with these requirements, too, but doe
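The status contract described in this message (print "running" to stdout when the service is up, "stopped" otherwise) could be sketched roughly as follows. This is only an illustration under assumptions that are not in the original: the service is assumed to record its PID in a pid file, the path used in the usage note is hypothetical, and real heartbeat resource scripts are usually shell scripts rather than Python.

```python
import os


def service_status(pid_file):
    """Return 'running' if the PID stored in pid_file is alive, else 'stopped'.

    Assumption for this sketch: the service writes its PID to pid_file.
    """
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return "stopped"  # no pid file, or no usable PID in it
    if pid <= 0:
        return "stopped"  # never signal a process group by accident
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return "stopped"
    except PermissionError:
        return "running"  # exists, but belongs to another user
    return "running"


def handle_status(pid_file):
    """Print exactly one of the two strings the message asks for."""
    print(service_status(pid_file))
```

A wrapper script would call something like `handle_status("/var/run/myservice.pid")` when invoked with the `status` argument; that path is a placeholder, not taken from the thread.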

Re: [DRBD-user] Problem with Primary/Primary drbd8.0pre3 ocfs2

2014-05-29 Thread Lars Ellenberg
in error reporting... should have been fixed now in svn. did you "net { allow-two-primaries; }" in drbd.conf? -- : Lars Ellenberg Tel +43-1-8178292-0 : : LINBIT Information Technologies GmbH Fax +43-1-8178292-82 : : Schoenbrunne

Re: [DRBD-user] higher throughput => less happy?

2014-05-29 Thread Lars Ellenberg
ernel: Pid: 4099, comm: xvd 3 93:03 Not tainted > 2.6.16-1.2133_FC5xen0 #1 well. there should be a line before this, telling us what this is about... memory allocation failed? soft lockup? oops? bug(), debug backtrace of some kind? is cash's peer called jonny? -- : Lars Ellenberg

Re: [DRBD-user] [DRBD user] HELP ... Primary becomes StandAlone.

2014-05-29 Thread Lars Ellenberg
/ 2006-06-21 16:30:35 +0200 \ Regis Gras: > Lars Ellenberg wrote: > > >since the sda6 is a multiple of 4kB, too, > >we don't need to round down the offset. > >so I'd say, to wipe out the meta data, you do > > > >perl -e '$offset = sysseek STDO

Re: [DRBD-user] HELP ... Primary becomes StandAlone.

2014-05-29 Thread Lars Ellenberg
may be an option to
/etc/init.d/drbd stop
rmmod drbd # just in case
# install drbd-0.7.19 (user space and kernel module)
modprobe drbd
drbdadm attach r0
drbdadm invalidate r0
drbdadm adjust r0
-- : Lars Ellenberg Tel +43-1-8178292-0 : : LINBIT Information

Re: [DRBD-user] Apologies...wrong subject: should have been drbd performance issue..

2014-05-29 Thread Lars Ellenberg
? if that is a 100 Megabit / second, that will max out the write throughput of connected drbd at around 10 MegaByte per second. if that is a 10 Mbit/s line, that obviously goes even down to one megabyte per second... what write rates does your customer actually observe, and what write rates would
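The rule of thumb in this message (a 100 megabit link caps connected-mode writes at around 10 megabytes per second) is a unit conversion plus some allowance for overhead. The sketch below spells that out. The function name is made up, and the 0.8 efficiency factor is an assumption chosen only so that the result lands near the "around 10" quoted in the message; it is not a measured DRBD figure.

```python
def max_write_throughput_mbyte_s(link_mbit_s, efficiency=0.8):
    """Rough ceiling on replicated write throughput in megabytes per second.

    link_mbit_s: nominal link speed in megabits per second.
    efficiency:  assumed usable fraction of the link after protocol
                 overhead (an illustrative guess, not a DRBD constant).
    """
    if link_mbit_s < 0:
        raise ValueError("link speed must not be negative")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # 8 bits per byte: 100 Mbit/s is 12.5 MByte/s before overhead.
    return link_mbit_s / 8 * efficiency


for link in (100, 10):
    print("%d Mbit/s link: roughly %.1f MByte/s" % (link, max_write_throughput_mbyte_s(link)))
```

With these assumptions a 100 Mbit/s link comes out at about 10 MByte/s and a 10 Mbit/s link at about 1 MByte/s, which matches the orders of magnitude given in the message. Reads are not affected, since they are served from the local disk.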
