Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA
>>> Lars Marowsky-Bree <l...@suse.com> wrote on 03.12.2013 at 15:34 in message
>>> <20131203143443.gl27...@suse.de>:
> On 2013-12-02T09:22:10, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:
>
>> No! Then it can't work.
>
> Exclusive activation only works for clustered volume groups, since it
> uses the DLM to protect against the VG being activated more than once
> in the cluster.

Hi!

Try it with resource-agents-3.9.4-0.26.84: it works; with
resource-agents-3.9.5-0.6.26.11 it doesn't work ;-)

> You thought it was working, but in fact it wasn't. ;-)

"working" meaning the resource started; "not working" meaning the
resource does not start. You see, I have minimal requirements ;-)

> Or at least, not as you expected. You could argue that it never
> should have worked.
>
>> Anyway: If you want to activate a VG on exactly one node, you should
>> not need cLVM; you only need it if you mean to activate the VG on
>> multiple nodes (as for a cluster file system)...
>
> You don't need cLVM to activate a VG on exactly one node.

Correct. And you don't.

> The cluster stack will never activate a resource twice.

Occasionally two safety lines are better than one. We HAD filesystem
corruptions due to the cluster doing things it shouldn't do.

> You need cLVM if you want LVM2 to enforce that at the LVM2 level,
> because it does this by getting a lock on the VG/LV; otherwise LVM2
> has no way of knowing if the VG/LV is currently active somewhere
> else. And this is what exclusive=true turns on.
>
> If you don't want that to happen, exclusive=true is not what you want
> to set.

That makes sense, but what I don't like is that I have to mess with
local lvm.conf files...

All the best,
Ulrich

> Regards,
>     Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
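[Editor's note: for readers hitting the same change, a minimal sketch of
the kind of resource definition under discussion. The VG name vg1 and
the timeouts are placeholders; with the newer agent, exclusive=true
expects the VG to carry the clustered flag and clvmd/DLM to be running.]

```
primitive lvm.vg1 ocf:heartbeat:LVM \
        params volgrpname=vg1 exclusive=true \
        op monitor interval=30s timeout=30s
```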
Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA
On 2013-12-04T10:25:58, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:

>> You thought it was working, but in fact it wasn't. ;-)
>
> "working" meaning the resource started; "not working" meaning the
> resource does not start. You see, I have minimal requirements ;-)

I'm sorry; we couldn't possibly test all misconfigurations. So this
slipped through; we didn't expect that someone had previously set that
for a non-clustered VG.

>> You could argue that it never should have worked.
>
>>> Anyway: If you want to activate a VG on exactly one node, you
>>> should not need cLVM; you only need it if you mean to activate the
>>> VG on multiple nodes (as for a cluster file system)...
>>
>> You don't need cLVM to activate a VG on exactly one node.
>
> Correct. And you don't.
>
>> The cluster stack will never activate a resource twice.
>
> Occasionally two safety lines are better than one. We HAD filesystem
> corruptions due to the cluster doing things it shouldn't do.

And that's perfectly fine. All you need to do to activate this is
"vgchange -c y" on the specific volume group, and the exclusive=true
flag will work just fine.

>> If you don't want that to happen, exclusive=true is not what you
>> want to set.
>
> That makes sense, but what I don't like is that I have to mess with
> local lvm.conf files...

You don't. Just drop exclusive=true, or set the clustered flag on the
VG. You only have to change anything in lvm.conf if you want to use
tags for exclusivity protection (I defer to the LVM RA help for how to
use that; I've never tried it).

Regards,
    Lars

--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
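[Editor's note: a sketch of the commands Lars describes, assuming
clvmd is running on all nodes; vg1 is a placeholder VG name.]

```shell
# Mark the VG as clustered so LVM2 uses DLM locking for it:
vgchange -c y vg1

# Exclusive activation (what the RA's exclusive=true amounts to):
# "e" acquires a cluster-wide exclusive lock on the VG, so no other
# node can activate it while the lock is held.
vgchange -aey vg1
```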
Re: [Linux-HA] FYI: resource-agents-3.9.2-40.el6.x86_64 kills heartbeat-3.0.4
On 2013-12-02 00:03, Andrew Beekhof wrote:
> On Wed, Nov 27, 2013, at 06:15 PM, Jefferson Ogata wrote:
>> On 2013-11-28 01:55, Andrew Beekhof wrote:
>>> On 28 Nov 2013, at 11:29 am, Jefferson Ogata <linux...@antibozo.net> wrote:
>>>> On 2013-11-28 00:12, Dimitri Maziuk wrote:
>>>>> Just so you know: RedHat's (centos, actually) latest build of
>>>>> resource-agents sets $HA_BIN to /usr/libexec/heartbeat. The daemon
>>>>> in the heartbeat-3.0.4 RPM is /usr/lib64/heartbeat/heartbeat, so
>>>>> the $HA_BIN/heartbeat binary does not exist.
>>>>> (And please hold the "upgrade to pacemaker" comments: I'm hoping
>>>>> if I wait just a little bit longer I can upgrade to ceph and
>>>>> openstack -- or retire, whichever comes first ;)
>>>> Hey, upgrading to pacemaker wouldn't necessarily help. Red Hat
>>>> broke that last month by dropping most of the resource agents
>>>> they'd initially shipped. (Don't you love Technology Previews?)
>>> That's the whole point behind the tech preview label... it means the
>>> software is not yet in a form that Red Hat will support and is
>>> subject to changes _exactly_ like the one made to resource-agents.
>> Um, yes, i know. That's why i mentioned it.
> Ok, sorry, I wasn't sure.
>> It's nicer, however, when Red Hat takes a conservative position with
>> the Tech Preview. They could have shipped a minimal set of resource
>> agents in the first place,
> 3 years ago we didn't know if pacemaker would _ever_ be supported in
> RHEL-6, so stripping out agents wasn't on our radar. I'm sure the
> only reason it and the rest of pacemaker shipped at all was to humor
> the guy they'd just hired. It was only at the point that supporting
> pacemaker in 6.5 became likely that someone took a look at the full
> list and had a heart attack.
>> so people would have a better idea what they had to provide on their
>> own end, instead of pulling the rug out with nary a mention of what
>> they were doing.
> Yes, that was not good. One of the challenges I find at Red Hat is
> the gaps between when a decision is made, when we're allowed to talk
> about it and when customers find out about it.
> As a developer, it's more that the things we spent significant time
> on are what first come to mind when writing release notes, not the 3s
> it took to remove some files from the spec file - even though the
> latter is going to have a bigger effect :-(
> We can only say that lessons have been learned and that we will do
> better if there is a similar situation next time.

+1 Insightful/Informative/Interesting.
Re: [Linux-HA] drbd/pacemaker multiple tgt targets, portblock, and race conditions (long-ish)
On 2013-11-21 16:34, Jefferson Ogata wrote:
> On 2013-11-20 08:35, Jefferson Ogata wrote:
>> Indeed, using iptables with REJECT and tcp-reset, this seems to piss
>> off the initiators, creating immediate i/o errors. But one can use
>> DROP on incoming SYN packets and let established connections drain.
>
> I've been trying to get this to work but am finding that it takes so
> long for some connections to drain that something times out. I
> haven't given up on this approach, tho. Testing this stuff can be
> tricky because if i make one mistake, stonith kicks in and i end up
> having to wait 5-10 minutes for the machine to reboot and resync its
> DRBD devices.
>
> Follow-up on this: the original race condition i reported still
> occurs with this strategy: if existing TCP connections are allowed to
> drain by passing packets from established initiator connections (by
> blocking only SYN packets), then the initiator can also send new
> requests to the target during the takedown process; the takedown
> removes LUNs from the live target, and the initiator generates an i/o
> error if it happens to try to access a LUN that has been removed
> before the connection is removed.
>
> This happens because the configuration looks something like this (crm):
>
>     group foo portblock vip iSCSITarget:target iSCSILogicalUnit:lun1 \
>         iSCSILogicalUnit:lun2 iSCSILogicalUnit:lun3 portunblock
>
> On takedown, if portblock is tweaked to pass packets for existing
> connections so they can drain, there's a window while LUNs lun3,
> lun2, lun1 are being removed from the target where this race
> condition occurs. The connection isn't removed until iSCSITarget runs
> to stop the target.
>
> A way to handle this that should actually work is to write a new RA
> that deletes the connections from the target *before* the LUNs are
> removed during takedown. The config would look something like this,
> then:
>
>     group foo portblock vip iSCSITarget:target iSCSILogicalUnit:lun1 \
>         iSCSILogicalUnit:lun2 iSCSILogicalUnit:lun3 tgtConnections \
>         portunblock
>
> On takedown, then, portunblock will block new incoming connections,
> tgtConnections will shut down existing connections and wait for them
> to drain, then the LUNs can be safely removed before the target is
> taken down. I'll write this RA today and see how that works.

So, this strategy worked. The final RA is attached.

The config (crm) then looks like this, using the tweaked portblock RA
that blocks SYN only, the tgtUser RA that adds a tgtd user, and the
tweaked iSCSITarget RA that doesn't add a user if no password is
provided (see previous discussion for the latter two RAs). This is a
two-node cluster using DRBD-backed LVM and multiple targets. The names
have been changed to protect the innocent, and the config is
simplified to a single target for brevity, but it should be clear how
to do multiple DRBDs/VGs/targets. I've left out the stonith config
here also.
primitive tgtd lsb:tgtd \
        op monitor interval=10s
clone clone.tgtd tgtd
primitive user.username ocf:local:tgtUser \
        params username=username password=password
clone clone.user.username user.username
order clone.tgtd_before_clone.user.username inf: clone.tgtd:start clone.user.username:start
primitive drbd.pv1 ocf:linbit:drbd \
        params drbd_resource=pv1 \
        op monitor role=Slave interval=29s timeout=600s \
        op monitor role=Master interval=31s timeout=600s \
        op start timeout=240s \
        op stop timeout=240s
ms ms.drbd.pv1 drbd.pv1 \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
primitive lvm.vg1 ocf:heartbeat:LVM \
        params volgrpname=vg1 \
        op monitor interval=30s timeout=30s \
        op start timeout=30s \
        op stop timeout=30s
order ms.drbd.pv1_before_lvm.vg1 inf: ms.drbd.pv1:promote lvm.vg1:start
colocation ms.drbd.pv1_with_lvm.vg1 inf: ms.drbd.pv1:Master lvm.vg1
primitive target.1 ocf:local:iSCSITarget \
        params iqn=iqnt1 tid=1 incoming_username=username implementation=tgt portals="" \
        op monitor interval=30s \
        op start timeout=30s \
        op stop timeout=120s
primitive lun.1.1 ocf:heartbeat:iSCSILogicalUnit \
        params target_iqn=iqnt1 lun=1 path=/dev/vg1/lv1 \
                additional_parameters="scsi_id=vg1/lv1 mode_page=8:0:18:0x10:0:0xff:0xff:0:0:0xff:0xff:0xff:0xff:0x80:0x14:0:0:0:0:0:0" \
                implementation=tgt \
        op monitor interval=30s \
        op start timeout=30s \
        op stop timeout=120s
primitive ip.192.168.1.244 ocf:heartbeat:IPaddr \
        params ip=192.168.1.244 cidr_netmask=24 nic=bond0
primitive portblock.ip.192.168.1.244 ocf:local:portblock \
        params ip=192.168.1.244 action=block protocol=tcp portno=3260 syn_only=true \
        op monitor interval=10s timeout=10s depth=0
primitive tgtfinal.1 ocf:local:tgtFinal \
        params tid=1 \
        op monitor interval=30s timeout=30s \
        op stop timeout=60s
primitive portunblock.ip.192.168.1.244 ocf:local:portblock \
        params ip=192.168.1.244 action=unblock protocol=tcp portno=3260 syn_only=true \
        op monitor interval=10s timeout=10s depth=0
group group.target.1 lvm.vg1 portblock.ip.192.168.1.244 ip.192.168.1.244 target.1 lun.1.1 tgtfinal.1 portunblock.ip.192.168.1.244
order clone.tgtd_before_group.target.1 inf: clone.tgtd:start group.target.1:start
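[Editor's note: a hedged sketch of the mechanisms the tweaked portblock
RA (syn_only=true) and the connection-draining RA presumably rely on;
the actual logic lives in the attached scripts, and the tid/sid/cid
values below are illustrative only.]

```shell
# Block only NEW initiator connections to the iSCSI portal; packets on
# established TCP sessions keep flowing, so in-flight i/o can drain:
iptables -I INPUT -p tcp --dport 3260 --syn -j DROP

# List open connections on target tid 1, then force-close one (the sid
# and cid come from the listing) before the LUNs are removed:
tgtadm --lld iscsi --op show --mode conn --tid 1
tgtadm --lld iscsi --op delete --mode conn --tid 1 --sid 2 --cid 0
```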
Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA
----- Original Message -----
From: "Lars Marowsky-Bree" <l...@suse.com>
To: "General Linux-HA mailing list" <linux-ha@lists.linux-ha.org>
Sent: Wednesday, December 4, 2013 3:49:17 AM
Subject: Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA

On 2013-12-04T10:25:58, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:

>>> You thought it was working, but in fact it wasn't. ;-)
>>
>> "working" meaning the resource started; "not working" meaning the
>> resource does not start. You see, I have minimal requirements ;-)
>
> I'm sorry; we couldn't possibly test all misconfigurations. So this
> slipped through; we didn't expect that someone had previously set
> that for a non-clustered VG.

Updates have been made to the LVM agent to allow exclusive activation
without clvmd.

http://www.davidvossel.com/wiki/index.php?title=HA_LVM

-- Vossel

>>> You could argue that it never should have worked.
>>
>>>> Anyway: If you want to activate a VG on exactly one node, you
>>>> should not need cLVM; you only need it if you mean to activate the
>>>> VG on multiple nodes (as for a cluster file system)...
>>>
>>> You don't need cLVM to activate a VG on exactly one node.
>>
>> Correct. And you don't.
>>
>>> The cluster stack will never activate a resource twice.
>>
>> Occasionally two safety lines are better than one. We HAD filesystem
>> corruptions due to the cluster doing things it shouldn't do.
>
> And that's perfectly fine. All you need to do to activate this is
> "vgchange -c y" on the specific volume group, and the exclusive=true
> flag will work just fine.
>
>>> If you don't want that to happen, exclusive=true is not what you
>>> want to set.
>>
>> That makes sense, but what I don't like is that I have to mess with
>> local lvm.conf files...
>
> You don't. Just drop exclusive=true, or set the clustered flag on the
> VG. You only have to change anything in lvm.conf if you want to use
> tags for exclusivity protection (I defer to the LVM RA help for how
> to use that; I've never tried it).
>
> Regards,
>     Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
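[Editor's note: the tag-based approach mentioned above (and described
on the linked HA_LVM page) boils down to an lvm.conf volume_list
restriction; the tag name and local VG name here are placeholders, and
on distributions that activate LVM from the initramfs you typically
have to rebuild it after changing this.]

```
# /etc/lvm/lvm.conf fragment (sketch): only VGs listed here, or VGs
# carrying the node's tag, may be activated on this node; the cluster
# resource agent moves the tag to grant exclusive activation.
activation {
    volume_list = [ "vg_local", "@my_cluster_tag" ]
}
```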