Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA
- Original Message -
> From: "Lars Marowsky-Bree"
> To: "General Linux-HA mailing list"
> Sent: Wednesday, December 4, 2013 3:49:17 AM
> Subject: Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA
>
> On 2013-12-04T10:25:58, Ulrich Windl wrote:
>
> > > You thought it was working, but in fact it wasn't. ;-)
> > "working" meaning "the resource started".
> > "not working" meaning "the resource does not start"
> >
> > You see I have minimal requirements ;-)
>
> I'm sorry; we couldn't possibly test all misconfigurations. So this
> slipped through; we didn't expect someone to have set that for a
> non-clustered VG before.

Updates have been made to the LVM agent to allow exclusive activation without clvmd.

http://www.davidvossel.com/wiki/index.php?title=HA_LVM

-- Vossel

> > >> You could argue that it never should have worked. Anyway: If you want
> > >> to activate a VG on exactly one node you should not need cLVM; only if
> > >> you mean to activate the VG on multiple nodes (as for a cluster file
> > >> system)...
> > >
> > > You don't need cLVM to activate a VG on exactly one node. Correct. And
> > > you don't. The cluster stack will never activate a resource twice.
> >
> > Occasionally two safety lines are better than one. We HAD filesystem
> > corruptions due to the cluster doing things it shouldn't do.
>
> And that's perfectly fine. All you need to do to activate this is
> "vgchange -c y" on the specific volume group, and the exclusive=true
> flag will work just fine.
>
> > > If you don't want that to happen, exclusive=true is not what you want to
> > > set.
> > That makes sense, but what I don't like is that I have to mess with local
> > lvm.conf files...
>
> You don't. Just drop exclusive=true, or set the clustered flag on the
> VG.
>
> You only have to change anything in lvm.conf if you want to use tags
> for exclusivity protection (I defer to the LVM RA help for how to use
> that; I've never tried it).
>
> Regards,
> Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
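Putting Lars's advice into concrete form, a minimal sketch of the clustered-VG route might look like the following (vg_shared and lvm.vg_shared are placeholder names; volgrpname and exclusive are the stock LVM agent parameters discussed in this thread):

    # Mark the volume group as clustered so exclusive activation is lock-protected
    vgchange -c y vg_shared

    # crm shell: example LVM resource with exclusive activation enabled
    primitive lvm.vg_shared ocf:heartbeat:LVM \
        params volgrpname="vg_shared" exclusive="true" \
        op monitor interval="30s" timeout="30s"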
Re: [Linux-HA] drbd/pacemaker multiple tgt targets, portblock, and race conditions (long-ish)
On 2013-11-21 16:34, Jefferson Ogata wrote:

On 2013-11-20 08:35, Jefferson Ogata wrote:

Indeed, using iptables with REJECT and tcp-reset, this seems to piss off the initiators, creating immediate i/o errors. But one can use DROP on incoming SYN packets and let established connections drain.

I've been trying to get this to work but am finding that it takes so long for some connections to drain that something times out. I haven't given up on this approach, though. Testing this stuff can be tricky because if I make one mistake, stonith kicks in and I end up having to wait 5-10 minutes for the machine to reboot and resync its DRBD devices.

Follow-up on this: the original race condition I reported still occurs with this strategy. If existing TCP connections are allowed to drain by passing packets from established initiator connections (by blocking only SYN packets), then the initiator can also send new requests to the target during the takedown process; the takedown removes LUNs from the live target, and the initiator generates an i/o error if it happens to access a LUN that has been removed before the connection itself is removed.

This happens because the configuration looks something like this (crm):

group foo portblock vip iSCSITarget:target iSCSILogicalUnit:lun1 iSCSILogicalUnit:lun2 iSCSILogicalUnit:lun3 portunblock

On takedown, if portblock is tweaked to pass packets for existing connections so they can drain, there's a window while LUNs lun3, lun2, lun1 are being removed from the target in which this race condition occurs. The connection isn't removed until iSCSITarget runs to stop the target.

A way to handle this that should actually work is to write a new RA that deletes the connections from the target *before* the LUNs are removed during takedown. The config would then look something like this:

group foo portblock vip iSCSITarget:target iSCSILogicalUnit:lun1 iSCSILogicalUnit:lun2 iSCSILogicalUnit:lun3 tgtConnections portunblock

On takedown, then, portunblock will block new incoming connections, tgtConnections will shut down existing connections and wait for them to drain, and the LUNs can then be safely removed before the target is taken down. I'll write this RA today and see how that works.

So, this strategy worked. The final RA is attached. The config (crm) then looks like this, using the tweaked portblock RA that blocks SYN only, the tgtUser RA that adds a tgtd user, and the tweaked iSCSITarget RA that doesn't add a user if no password is provided (see previous discussion for the latter two RAs). This is a two-node cluster using DRBD-backed LVM and multiple targets. The names have been changed to protect the innocent, and the config is simplified to a single target for brevity, but it should be clear how to extend it to multiple DRBDs/VGs/targets. I've left out the stonith config here as well.
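As an aside before the configuration itself: the syn_only behaviour of the tweaked portblock RA described above amounts to filtering only new connection attempts on the iSCSI portal, roughly like the rule below (a sketch, not the attached RA; the address and port are the ones used in the example config that follows). Established sessions keep flowing, so in-flight commands can drain before the LUNs and the target are torn down.

    # Drop only TCP SYNs to the portal; established iSCSI sessions are untouched
    iptables -I INPUT -p tcp -d 192.168.1.244 --dport 3260 --syn -j DROP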
primitive tgtd lsb:tgtd \
    op monitor interval="10s"
clone clone.tgtd tgtd
primitive user.username ocf:local:tgtUser \
    params username="username" password="password"
clone clone.user.username user.username
order clone.tgtd_before_clone.user.username inf: clone.tgtd:start clone.user.username:start
primitive drbd.pv1 ocf:linbit:drbd \
    params drbd_resource="pv1" \
    op monitor role="Slave" interval="29s" timeout="600s" \
    op monitor role="Master" interval="31s" timeout="600s" \
    op start timeout="240s" \
    op stop timeout="240s"
ms ms.drbd.pv1 drbd.pv1 \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
primitive lvm.vg1 ocf:heartbeat:LVM \
    params volgrpname="vg1" \
    op monitor interval="30s" timeout="30s" \
    op start timeout="30s" \
    op stop timeout="30s"
order ms.drbd.pv1_before_lvm.vg1 inf: ms.drbd.pv1:promote lvm.vg1:start
colocation ms.drbd.pv1_with_lvm.vg1 inf: ms.drbd.pv1:Master lvm.vg1
primitive target.1 ocf:local:iSCSITarget \
    params iqn="iqnt1" tid="1" incoming_username="username" implementation="tgt" portals="" \
    op monitor interval="30s" \
    op start timeout="30s" \
    op stop timeout="120s"
primitive lun.1.1 ocf:heartbeat:iSCSILogicalUnit \
    params target_iqn="iqnt1" lun="1" path="/dev/vg1/lv1" \
        additional_parameters="scsi_id=vg1/lv1 mode_page=8:0:18:0x10:0:0xff:0xff:0:0:0xff:0xff:0xff:0xff:0x80:0x14:0:0:0:0:0:0" \
        implementation="tgt" \
    op monitor interval="30s" \
    op start timeout="30s" \
    op stop timeout="120s"
primitive ip.192.168.1.244 ocf:heartbeat:IPaddr \
    params ip="192.168.1.244" cidr_netmask="24" nic="bond0"
primitive portblock.ip.192.168.1.244 ocf:local:portblock \
    params ip="192.168.1.244" action="block" protocol="tcp" portno="3260" syn_only="true" \
    op monitor interval="10s" timeout="10s" depth="0"
primitive tgtfinal.1 ocf:local:tgtFinal \
    params tid="1" \
    op monitor interval="30s" timeout="30s" \
    op stop timeout="60s"
primitive portunblock.ip.192.168.1.244 ocf:local:portblock \
    params ip="192.168.1.244" action="unblock" protocol="tcp" portno="3260" syn_only="true" \
    op monitor interval="10s" timeout="10s" depth="0"
group group.target.1 lvm.vg1 portblock.ip.192.168.1.244 ip.192.168.1.244 targ
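The attached tgtFinal/tgtConnections RA is not reproduced in this archive, but its stop action presumably boils down to something like the following (a hypothetical sketch using stock tgtadm commands; tid 1 matches the example target above):

    # Close every initiator connection on target 1 so sessions drain before
    # the LUNs are removed (sketch only; the real RA is the attachment)
    TID=1
    tgtadm --lld iscsi --mode conn --op show --tid "$TID" |
      awk '$1 == "Session:" {sid=$2} $1 == "Connection:" {print sid, $2}' |
      while read -r sid cid; do
          tgtadm --lld iscsi --mode conn --op delete --tid "$TID" --sid "$sid" --cid "$cid"
      done

A loop that polls the same "--op show" output until it is empty would then provide the "wait for them to drain" step described above.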
Re: [Linux-HA] FYI: resource-agents-3.9.2-40.el6.x86_64 kills heartbeat-3.0.4
On 2013-12-02 00:03, Andrew Beekhof wrote:

On Wed, Nov 27, 2013, at 06:15 PM, Jefferson Ogata wrote:

On 2013-11-28 01:55, Andrew Beekhof wrote:

On 28 Nov 2013, at 11:29 am, Jefferson Ogata wrote:

On 2013-11-28 00:12, Dimitri Maziuk wrote:

Just so you know: RedHat's (CentOS, actually) latest build of resource-agents sets $HA_BIN to /usr/libexec/heartbeat. The daemon in the heartbeat-3.0.4 RPM is /usr/lib64/heartbeat/heartbeat, so the $HA_BIN/heartbeat binary does not exist. (And please hold the "upgrade to pacemaker" comments: I'm hoping if I wait just a little bit longer I can upgrade to ceph and openstack -- or retire, whichever comes first ;)

Hey, "upgrading" to pacemaker wouldn't necessarily help. Red Hat broke that last month by dropping most of the resource agents they'd initially shipped. (Don't you love "Technology Previews"?)

That's the whole point behind the "tech preview" label... it means the software is not yet in a form that Red Hat will support and is subject to changes _exactly_ like the one made to resource-agents.

Um, yes, I know. That's why I mentioned it.

Ok, sorry, I wasn't sure.

It's nicer, however, when Red Hat takes a conservative position with the Tech Preview. They could have shipped a minimal set of resource agents in the first place,

3 years ago we didn't know if pacemaker would _ever_ be supported in RHEL-6, so stripping out agents wasn't on our radar. I'm sure the only reason it and the rest of pacemaker shipped at all was to humor the guy they'd just hired. It was only at the point that supporting pacemaker in 6.5 became likely that someone took a look at the full list and had a heart attack.

so people would have a better idea what they had to provide on their own end, instead of pulling the rug out with nary a mention of what they were doing.

Yes, that was not good. One of the challenges I find at Red Hat is the gap between when a decision is made, when we're allowed to talk about it, and when customers find out about it. As a developer, it's the things we spent significant time on that first come to mind when writing release notes, not the 3s it took to remove some files from the spec file - even though the latter is going to have a bigger effect :-( We can only say that lessons have been learned and that we will do better if there is a similar situation next time.

+1 Insightful/Informative/Interesting.
Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA
On 2013-12-04T10:25:58, Ulrich Windl wrote:

> > You thought it was working, but in fact it wasn't. ;-)
> "working" meaning "the resource started".
> "not working" meaning "the resource does not start"
>
> You see I have minimal requirements ;-)

I'm sorry; we couldn't possibly test all misconfigurations. So this slipped through; we didn't expect someone to have set that for a non-clustered VG before.

> >> You could argue that it never should have worked. Anyway: If you want
> >> to activate a VG on exactly one node you should not need cLVM; only if
> >> you mean to activate the VG on multiple nodes (as for a cluster file
> >> system)...
> >
> > You don't need cLVM to activate a VG on exactly one node. Correct. And
> > you don't. The cluster stack will never activate a resource twice.
>
> Occasionally two safety lines are better than one. We HAD filesystem
> corruptions due to the cluster doing things it shouldn't do.

And that's perfectly fine. All you need to do to activate this is "vgchange -c y" on the specific volume group, and the exclusive=true flag will work just fine.

> > If you don't want that to happen, exclusive=true is not what you want to
> > set.
> That makes sense, but what I don't like is that I have to mess with local
> lvm.conf files...

You don't. Just drop exclusive=true, or set the clustered flag on the VG.

You only have to change anything in lvm.conf if you want to use tags for exclusivity protection (I defer to the LVM RA help for how to use that; I've never tried it).

Regards,
Lars

--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
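For reference, the tag-based scheme Lars defers to here (the approach documented at the HA_LVM link in David Vossel's reply above) roughly works by whitelisting activation in lvm.conf. The following is only a sketch of the idea, with placeholder VG and tag names, not the agent's exact procedure:

    # /etc/lvm/lvm.conf, activation section: only VGs named here, or carrying
    # a matching tag, may be activated on this node
    volume_list = [ "vg_system", "@pacemaker" ]

    # The agent side then tags the VG it owns before activating it, e.g.:
    vgchange --addtag pacemaker vg_shared
    vgchange -ay vg_shared

The price of this scheme is exactly the objection quoted above: it needs a matching edit to the local lvm.conf on every node, whereas the clustered-VG route does not.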
Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA
>>> Lars Marowsky-Bree wrote on 03.12.2013 at 15:34 in message <20131203143443.gl27...@suse.de>:
> On 2013-12-02T09:22:10, Ulrich Windl wrote:
>
> >> >> No!
> >> >
> >> > Then it can't work. Exclusive activation only works for clustered volume
> >> > groups, since it uses the DLM to protect against the VG being activated
> >> > more than once in the cluster.
> >>
> >> Hi!
> >>
> >> Try it with "resource-agents-3.9.4-0.26.84": it works; with
> >> "resource-agents-3.9.5-0.6.26.11" it doesn't work ;-)
>
> You thought it was working, but in fact it wasn't. ;-)

"working" meaning "the resource started".
"not working" meaning "the resource does not start"

You see I have minimal requirements ;-)

> Or at least, not as you expected.
>
>> You could argue that it never should have worked. Anyway: If you want
>> to activate a VG on exactly one node you should not need cLVM; only if
>> you mean to activate the VG on multiple nodes (as for a cluster file
>> system)...
>
> You don't need cLVM to activate a VG on exactly one node. Correct. And
> you don't. The cluster stack will never activate a resource twice.

Occasionally two safety lines are better than one. We HAD filesystem corruptions due to the cluster doing things it shouldn't do.

> You need cLVM if you want LVM2 to enforce that at the LVM2 level -
> because it does this by getting a lock on the VG/LV, since otherwise
> LVM2 has no way of knowing if the VG/LV is currently active somewhere
> else. And this is what "exclusive=true" turns on.
>
> If you don't want that to happen, exclusive=true is not what you want to
> set.

That makes sense, but what I don't like is that I have to mess with local lvm.conf files...

All the best,
Ulrich

> Regards,
> Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
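To make the locking point concrete, this is roughly what exclusive activation amounts to at the LVM2 level for a clustered VG (a sketch; vg_shared is a placeholder name, and clvmd/DLM must be running on the nodes):

    vgchange -c y vg_shared   # mark the VG clustered (done once)
    vgchange -aey vg_shared   # activate exclusively: the activation takes a
                              # DLM-backed lock, so the same attempt on a
                              # second node is refused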