Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA

2013-12-04 Thread Ulrich Windl
 Lars Marowsky-Bree l...@suse.com wrote on 03.12.2013 at 15:34 in message
20131203143443.gl27...@suse.de:
 On 2013-12-02T09:22:10, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de
wrote:
 
  No!
  
  Then it can't work. Exclusive activation only works for clustered volume
  groups, since it uses the DLM to protect against the VG being activated
  more than once in the cluster.
 Hi!
 
 Try it with resource-agents-3.9.4-0.26.84: it works; with 
 resource-agents-3.9.5-0.6.26.11 it doesn't work ;-)
 
 You thought it was working, but in fact it wasn't. ;-)

"working" meaning the resource started;
"not working" meaning the resource did not start.

You see, I have minimal requirements ;-)

 
 Or at least, not as you expected.
 
 You could argue that it never should have worked. Anyway: If you want
 to activate a VG on exactly one node you should not need cLVM; only if
 you mean to activate the VG on multiple nodes (as for a cluster file
 system)...
 
 You don't need cLVM to activate a VG on exactly one node. Correct. And
 you don't. The cluster stack will never activate a resource twice.

Occasionally two safety lines are better than one. We HAD filesystem
corruption due to the cluster doing things it shouldn't do.

 
 You need cLVM if you want LVM2 to enforce that at the LVM2 level -
 because it does this by getting a lock on the VG/LV, since otherwise
 LVM2 has no way of knowing if the VG/LV is currently active somewhere
 else. And this is what exclusive=true turns on.
 
 If you don't want that to happen, exclusive=true is not what you want to
 set.

That makes sense, but what I don't like is that I have to mess with local
lvm.conf files...


All the best,
Ulrich

 
 
 Regards,
 Lars
 
 -- 
 Architect Storage/HA
 SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
 HRB 21284 (AG Nürnberg)
 Experience is the name everyone gives to their mistakes. -- Oscar Wilde
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org 
 http://lists.linux-ha.org/mailman/listinfo/linux-ha 
 See also: http://linux-ha.org/ReportingProblems 


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA

2013-12-04 Thread Lars Marowsky-Bree
On 2013-12-04T10:25:58, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:

  You thought it was working, but in fact it wasn't. ;-)
 working meaning the resource started.
 not working meaning the resource does not start
 
 You see I have minimal requirements ;-)

I'm sorry; we couldn't possibly test all misconfigurations. So this
slipped through: we didn't expect anyone to have set that for a
non-clustered VG previously.

  You could argue that it never should have worked. Anyway: If you want
  to activate a VG on exactly one node you should not need cLVM; only if
  you mean to activate the VG on multiple nodes (as for a cluster file
  system)...
  
  You don't need cLVM to activate a VG on exactly one node. Correct. And
  you don't. The cluster stack will never activate a resource twice.
 
 Occasionally two safety lines are better than one. We HAD filesystem
 corruption due to the cluster doing things it shouldn't do.

And that's perfectly fine. All you need to do to enable this is run
vgchange -c y on the specific volume group, and the exclusive=true
flag will work just fine.
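
To make that concrete, a minimal sketch (vg1 and lvm.vg1 are placeholder
names; the crm syntax assumes the stock ocf:heartbeat:LVM agent and a
running clvmd/DLM stack):

# mark the VG as clustered so cLVM/DLM can take the lock
vgchange -c y vg1

# then let the cluster activate it exclusively (crm shell syntax)
primitive lvm.vg1 ocf:heartbeat:LVM \
    params volgrpname=vg1 exclusive=true \
    op monitor interval=30s timeout=30s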

  If you don't want that to happen, exclusive=true is not what you want to
  set.
 That makes sense, but what I don't like is that I have to mess with local
 lvm.conf files...

You don't. Just drop exclusive=true, or set the clustered flag on the
VG.

You only have to change anything in lvm.conf if you want to use tags
for exclusivity protection (I defer to the LVM RA help for how to use
that; I've never tried it).
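
For the record, the tag-based approach works roughly like this (untested
here; "rootvg" and "@mynodetag" are placeholders, and the exact recipe
should be checked against the LVM RA documentation):

# /etc/lvm/lvm.conf on every node: only the root VG and VGs carrying the
# node's own tag may be activated locally
activation {
    volume_list = [ "rootvg", "@mynodetag" ]
}

The resource agent then adds and removes the tag on the VG as it starts
and stops, so only the node currently owning the resource can activate it.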


Regards,
Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 
21284 (AG Nürnberg)
Experience is the name everyone gives to their mistakes. -- Oscar Wilde

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] FYI: resource-agents-3.9.2-40.el6.x86_64 kills heartbeat-3.0.4

2013-12-04 Thread Jefferson Ogata

On 2013-12-02 00:03, Andrew Beekhof wrote:

On Wed, Nov 27, 2013, at 06:15 PM, Jefferson Ogata wrote:

On 2013-11-28 01:55, Andrew Beekhof wrote:

On 28 Nov 2013, at 11:29 am, Jefferson Ogata linux...@antibozo.net wrote:

On 2013-11-28 00:12, Dimitri Maziuk wrote:

Just so you know:

Red Hat's (CentOS, actually) latest build of resource-agents sets $HA_BIN
to /usr/libexec/heartbeat. The daemon in the heartbeat-3.0.4 RPM is
/usr/lib64/heartbeat/heartbeat, so the $HA_BIN/heartbeat binary does not exist.

(And please hold the upgrade to pacemaker comments: I'm hoping if I
wait just a little bit longer I can upgrade to ceph and openstack -- or
retire, whichever comes first ;)
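
(For anyone bitten by the same mismatch, one crude and untested workaround,
purely illustrative, is to make the path the scripts expect point at the
binary that actually exists:

# see where the daemon really lives vs. where $HA_BIN points
ls -l /usr/lib64/heartbeat/heartbeat /usr/libexec/heartbeat/
# paper over the packaging mismatch with a symlink
mkdir -p /usr/libexec/heartbeat
ln -s /usr/lib64/heartbeat/heartbeat /usr/libexec/heartbeat/heartbeat

The real fix is of course a corrected package.)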


Hey, upgrading to pacemaker wouldn't necessarily help. Red Hat broke that last month by 
dropping most of the resource agents they'd initially shipped. (Don't you love Technology 
Previews?)


That's the whole point behind the tech preview label... it means the software 
is not yet in a form that Red Hat will support and is subject to changes _exactly_ like 
the one made to resource-agents.


Um, yes, I know. That's why I mentioned it.


Ok, sorry, I wasn't sure.


It's nicer, however, when Red Hat takes a conservative position with the
Tech Preview. They could have shipped a minimal set of resource agents
in the first place,


3 years ago we didn't know if pacemaker would _ever_ be supported in
RHEL-6, so stripping out agents wasn't on our radar.
I'm sure the only reason it and the rest of pacemaker shipped at all was
to humor the guy they'd just hired.

It was only at the point that supporting pacemaker in 6.5 became likely
that someone took a look at the full list and had a heart-attack.


so people would have a better idea what they had to
provide on their own end, instead of pulling the rug out with nary a
mention of what they were doing.


Yes, that was not good.
One of the challenges I find at Red Hat is the gaps between when a
decision is made, when we're allowed to talk about it, and when customers
find out about it.  As a developer, it's the things we spent
significant time on that first come to mind when writing release notes,
not the 3 seconds it took to remove some files from the spec file - even
though the latter is going to have a bigger effect :-(

We can only say that lessons have been learned and that we will do
better if there is a similar situation next time.


+1 Insightful/Informative/Interesting.
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] drbd/pacemaker multiple tgt targets, portblock, and race conditions (long-ish)

2013-12-04 Thread Jefferson Ogata

On 2013-11-21 16:34, Jefferson Ogata wrote:

On 2013-11-20 08:35, Jefferson Ogata wrote:

Indeed, using iptables with REJECT and tcp-reset, this seems to piss off
the initiators, creating immediate i/o errors. But one can use DROP on
incoming SYN packets and let established connections drain. I've been
trying to get this to work but am finding that it takes so long for some
connections to drain that something times out. I haven't given up on
this approach, though. Testing this stuff can be tricky because if I make
one mistake, stonith kicks in and I end up having to wait 5-10 minutes
for the machine to reboot and resync its DRBD devices.
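
For reference, the two variants compared above look roughly like this
(a sketch only; 3260 is the standard iSCSI port used in this setup):

# variant 1: reject with a TCP reset - initiators error out immediately
iptables -I INPUT -p tcp --dport 3260 -j REJECT --reject-with tcp-reset

# variant 2: drop only new connection attempts (SYN) and let established
# sessions drain
iptables -I INPUT -p tcp --dport 3260 --syn -j DROP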


Follow-up on this: the original race condition I reported still occurs
with this strategy: if existing TCP connections are allowed to drain by
passing packets from established initiator connections (by blocking only
SYN packets), then the initiator can also send new requests to the
target during the takedown process; the takedown removes LUNs from the
live target and the initiator generates an i/o error if it happens to
try to access a LUN that has been removed before the connection is removed.

This happens because the configuration looks something like this (crm):

group foo portblock vip iSCSITarget:target iSCSILogicalUnit:lun1
iSCSILogicalUnit:lun2 iSCSILogicalUnit:lun3 portunblock

On takedown, if portblock is tweaked to pass packets for existing
connections so they can drain, there's a window while LUNs lun3, lun2,
lun1 are being removed from the target where this race condition occurs.
The connection isn't removed until iSCSITarget runs to stop the target.

A way to handle this that should actually work is to write a new RA that
deletes the connections from the target *before* the LUNs are removed
during takedown. The config would look something like this, then:

group foo portblock vip iSCSITarget:target iSCSILogicalUnit:lun1
iSCSILogicalUnit:lun2 iSCSILogicalUnit:lun3 tgtConnections portunblock

On takedown, then, portunblock will block new incoming connections,
tgtConnections will shut down existing connections and wait for them to
drain, then the LUNs can be safely removed before the target is taken down.

I'll write this RA today and see how that works.


So, this strategy worked. The final RA is attached. The config (crm)
then looks like this, using the tweaked portblock RA that blocks SYN
only, the tgtUser RA that adds a tgtd user, and the tweaked iSCSITarget
RA that doesn't add a user if no password is provided (see previous
discussion for the latter two RAs). This is a two-node cluster using
DRBD-backed LVM volume groups and multiple targets. The names have been
changed to protect the innocent, and the config is simplified to a single
target for brevity, but it should be clear how to do multiple
DRBDs/VGs/targets. I've left out the stonith config here as well.
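
Since the RA itself is attached rather than pasted inline, here is a rough
sketch of the stop-time logic such an RA (tgtfinal.1 below) might perform,
assuming the tgt implementation; the exact option spelling should be
checked against tgtadm(8):

# list the sessions/connections currently open on the target (tid 1) ...
tgtadm --lld iscsi --mode conn --op show --tid 1
# ... then delete each one (SID/CID taken from the output above), waiting
# until none remain before the LUNs are removed
tgtadm --lld iscsi --mode conn --op delete --tid 1 --sid SID --cid CID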



primitive tgtd lsb:tgtd op monitor interval=10s
clone clone.tgtd tgtd
primitive user.username ocf:local:tgtUser params username=username 
password=password

clone clone.user.username user.username
order clone.tgtd_before_clone.user.username inf: clone.tgtd:start 
clone.user.username:start


primitive drbd.pv1 ocf:linbit:drbd params drbd_resource=pv1 op monitor 
role=Slave interval=29s timeout=600s op monitor role=Master 
interval=31s timeout=600s op start timeout=240s op stop timeout=240s
ms ms.drbd.pv1 drbd.pv1 meta master-max=1 master-node-max=1 
clone-max=2 clone-node-max=1 notify=true
primitive lvm.vg1 ocf:heartbeat:LVM params volgrpname=vg1 op monitor 
interval=30s timeout=30s op start timeout=30s op stop timeout=30s

order ms.drbd.pv1_before_lvm.vg1 inf: ms.drbd.pv1:promote lvm.vg1:start
colocation ms.drbd.pv1_with_lvm.vg1 inf: ms.drbd.pv1:Master lvm.vg1

primitive target.1 ocf:local:iSCSITarget params iqn=iqnt1 tid=1 
incoming_username=username implementation=tgt portals= op monitor 
interval=30s op start timeout=30s op stop timeout=120s
primitive lun.1.1 ocf:heartbeat:iSCSILogicalUnit params 
target_iqn=iqnt1 lun=1 path=/dev/vg1/lv1 
additional_parameters=scsi_id=vg1/lv1 
mode_page=8:0:18:0x10:0:0xff:0xff:0:0:0xff:0xff:0xff:0xff:0x80:0x14:0:0:0:0:0:0 
implementation=tgt op monitor interval=30s op start timeout=30s op 
stop timeout=120s
primitive ip.192.168.1.244 ocf:heartbeat:IPaddr params 
ip=192.168.1.244 cidr_netmask=24 nic=bond0
primitive portblock.ip.192.168.1.244 ocf:local:portblock params 
ip=192.168.1.244 action=block protocol=tcp portno=3260 
syn_only=true op monitor interval=10s timeout=10s depth=0
primitive tgtfinal.1 ocf:local:tgtFinal params tid=1 op monitor 
interval=30s timeout=30s op stop timeout=60s
primitive portunblock.ip.192.168.1.244 ocf:local:portblock params 
ip=192.168.1.244 action=unblock protocol=tcp portno=3260 
syn_only=true op monitor interval=10s timeout=10s depth=0


group group.target.1 lvm.vg1 portblock.ip.192.168.1.244 
ip.192.168.1.244 target.1 lun.1.1 tgtfinal.1 portunblock.ip.192.168.1.244


order clone.tgtd_before_group.target.1 inf: clone.tgtd:start group.target.1:start

Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM RA

2013-12-04 Thread David Vossel




- Original Message -
 From: Lars Marowsky-Bree l...@suse.com
 To: General Linux-HA mailing list linux-ha@lists.linux-ha.org
 Sent: Wednesday, December 4, 2013 3:49:17 AM
 Subject: Re: [Linux-HA] Antw: Re: SLES11 SP2 HAE: problematic change for LVM 
 RA
 
 On 2013-12-04T10:25:58, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de
 wrote:
 
   You thought it was working, but in fact it wasn't. ;-)
  working meaning the resource started.
  not working meaning the resource does not start
  
  You see I have minimal requirements ;-)
 
 I'm sorry; we couldn't possibly test all misconfigurations. So this
 slipped through, we didn't expect someone to set that for a
 non-clustered VG previously.

Updates have been made to the LVM agent to allow exclusive activation without 
clvmd.

http://www.davidvossel.com/wiki/index.php?title=HA_LVM
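
For reference, the non-clvmd variant boils down to something like this
(a sketch; parameter names should be verified against the agent metadata
shipped with your resource-agents version):

# non-clustered VG; exclusivity is enforced via LVM tags plus a
# volume_list filter in lvm.conf (see the HA_LVM page above) rather than
# clvmd/DLM locking
primitive lvm.vg1 ocf:heartbeat:LVM \
    params volgrpname=vg1 exclusive=true \
    op monitor interval=30s timeout=30s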

-- Vossel

 
   You could argue that it never should have worked. Anyway: If you want
   to activate a VG on exactly one node you should not need cLVM; only if
   you mean to activate the VG on multiple nodes (as for a cluster file
   system)...
   
   You don't need cLVM to activate a VG on exactly one node. Correct. And
   you don't. The cluster stack will never activate a resource twice.
  
  Occasionally two safety lines are better than one. We HAD filesystem
  corruption due to the cluster doing things it shouldn't do.
 
 And that's perfectly fine. All you need to do to activate this is
 vgchange -c y on the specific volume group, and the exclusive=true
 flag will work just fine.
 
   If you don't want that to happen, exclusive=true is not what you want to
   set.
  That makes sense, but what I don't like is that I have to mess with local
  lvm.conf files...
 
 You don't. Just drop exclusive=true, or set the clustered flag on the
 VG.
 
 You only have to change anything in the lvm.conf if you want to use tags
 for exclusivity protection (I defer to the LVM RA help for how to use
 that, I've never tried it).
 
 
 Regards,
 Lars
 
 --
 Architect Storage/HA
 SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
 HRB 21284 (AG Nürnberg)
 Experience is the name everyone gives to their mistakes. -- Oscar Wilde
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems