Re: [Linux-ha-dev] PATCH: race in iSCSILogicalUnit

2012-10-23 Thread Florian Haas
On 10/08/2012 01:58 PM, Philipp Marek wrote:
> Hi Florian,
> 
> Dejan told me that you're the maintainer for the iSCSI pieces, so I'm 
> sending you this patch.

Sorry about the late reply; lately I've been watching the GitHub pull
requests more religiously than the list.

> Please apply, thank you very much!

As the change you are proposing is LIO specific, please rename the
parameter to lio_iblock, add code that will log warnings if this
parameter is used in a configuration that uses one of the other
supported targets, and send a pull request.
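
A minimal sketch of the kind of check requested above, not the actual patch. It assumes the renamed parameter arrives as OCF_RESKEY_lio_iblock and that the target type is selected through OCF_RESKEY_implementation. The helper name warn_on_stray_lio_iblock is hypothetical, and ocf_log is replaced by a local stand-in so the snippet runs outside an OCF environment.

```shell
#!/bin/sh

# Stand-in for ocf_log from ocf-shellfuncs, so this runs outside an OCF
# environment. It writes "<level>: <message>" to stderr.
ocf_log() {
    level=$1
    shift
    echo "${level}: $*" >&2
}

# Hypothetical helper: returns 1 (and logs a warning) when lio_iblock is set
# although the configured implementation is not "lio".
warn_on_stray_lio_iblock() {
    if [ -n "${OCF_RESKEY_lio_iblock}" ] \
        && [ "${OCF_RESKEY_implementation}" != "lio" ]; then
        ocf_log warn "lio_iblock is LIO specific, but implementation is \"${OCF_RESKEY_implementation}\""
        return 1
    fi
    return 0
}

# Example values (assumptions, for illustration only).
OCF_RESKEY_lio_iblock="4"

OCF_RESKEY_implementation="tgt"
stray_status=0
warn_on_stray_lio_iblock || stray_status=$?

OCF_RESKEY_implementation="lio"
lio_status=0
warn_on_stray_lio_iblock || lio_status=$?

echo "tgt: ${stray_status}, lio: ${lio_status}"
```

Whether the agent should merely warn or also ignore the parameter in that case is a design decision this sketch leaves open.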

Thanks,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-ha-dev] RA developer's guide 1.0.3

2012-07-26 Thread Florian Haas
Hi everyone,

Tal Yalon has pointed out an error in the RA developer's guide about
the "required" and "unique" attributes in RA metadata (they belong on
<parameter> elements, not <content> as the guide erroneously stated).
I've spun and pushed a minor update. Enjoy release 1.0.3.

http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html

Cheers,
Florian



[Linux-ha-dev] sbd spinoff from cluster-glue

2012-06-01 Thread Florian Haas
Dejan, Lars,

is it confirmed from your end that sbd is moving out of cluster-glue?
If so, it would be nice if we could get a cluster-glue release with
sbd removed, and a release of standalone sbd, so packagers can fix the
relevant distro packages up properly.

Cheers,
Florian



Re: [Linux-ha-dev] [PATCH 0 of 2] Autotoolize build

2012-05-29 Thread Florian Haas
On Tue, May 29, 2012 at 6:38 PM, Lars Marowsky-Bree wrote:
> On 2012-05-29T18:34:15, Florian Haas wrote:
>
>> Yeah, it seems you just broke the build by including cluster/stack.h
>> and not bothering to add an AC_CHECK_HEADERS to configure.ac. Where
>> does that come from, is that new to Pacemaker?
>
> Uh? It builds here on the 1.1.7 pacemaker version.
>
> The integration with the cluster stack is rather specific to whatever
> pacemaker/corosync version + configuration you build against.
> Unfortunately.

Well that's what #ifdef HAVE_CLUSTER_STACK_H and friends are good for, no?

Florian


Re: [Linux-ha-dev] [PATCH 0 of 2] Autotoolize build

2012-05-29 Thread Florian Haas
On Tue, May 29, 2012 at 6:26 PM, Lars Marowsky-Bree wrote:
> On 2012-05-29T17:56:59, Florian Haas wrote:
>
>> In case you're wondering why I didn't use PKG_CHECK_MODULES for the PE
>> libraries: their pkg-config file is currently broken; Andrew has a
>> pull request for Pacemaker for that.
>
> I was wondering more about how to build this against older codebases,
> but then decided not to bother ;-)

Yeah, it seems you just broke the build by including cluster/stack.h
and not bothering to add an AC_CHECK_HEADERS to configure.ac. Where
does that come from, is that new to Pacemaker?

> The code seems to be working quite well. A README and a manpage would
> probably be good ideas ...

Well autofoo already gave you an INSTALL file, and you can use
help2man for the man page generation (look at booth for the
Makefile.am and configure.ac hack). For the README, copy your wiki
page.



Re: [Linux-ha-dev] [PATCH 0 of 2] Autotoolize build

2012-05-29 Thread Florian Haas
On Tue, May 29, 2012 at 4:32 PM, Lars Marowsky-Bree wrote:
> On 2012-05-29T14:31:20, Florian Haas wrote:
>
>> Forgot to mention this in the original cover message, for those who
>> haven't been following the discussion: this is for sbd which is just
>> spinning off from cluster-glue.
>
> Thanks, I've merged them both!

In case you're wondering why I didn't use PKG_CHECK_MODULES for the PE
libraries: their pkg-config file is currently broken; Andrew has a
pull request for Pacemaker for that.

Cheers,
Florian


Re: [Linux-ha-dev] [PATCH 0 of 2] Autotoolize build

2012-05-29 Thread Florian Haas
On Tue, May 29, 2012 at 2:27 PM, Florian Haas wrote:
> Lars,
>
> I did this as an exercise of sorts to understand how this compiles and
> what its dependencies are. Considering the code base is quite small
> it may seem like a pointless stunt to jump through all the autofoo
> hoops, but it makes life that much easier for distro packagers.

Forgot to mention this in the original cover message, for those who
haven't been following the discussion: this is for sbd which is just
spinning off from cluster-glue.

Cheers,
Florian



[Linux-ha-dev] [PATCH 1 of 2] build: autotoolize

2012-05-29 Thread Florian Haas
# HG changeset patch
# User Florian Haas 
# Date 1338230941 -7200
# Branch autotools
# Node ID 9888c2e4353b08599e6977e5e61dd6d34ce6151e
# Parent  c4de704b6cea21c69b3c767d1c47bed727f94d82
build: autotoolize

diff -r c4de704b6cea -r 9888c2e4353b COPYING
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/COPYING   Mon May 28 20:49:01 2012 +0200
@@ -0,0 +1,339 @@
+GNU GENERAL PUBLIC LICENSE
+   Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physi

[Linux-ha-dev] [PATCH 0 of 2] Autotoolize build

2012-05-29 Thread Florian Haas
Lars,

I did this as an exercise of sorts to understand how this compiles and
what its dependencies are. Considering the code base is quite small
it may seem like a pointless stunt to jump through all the autofoo
hoops, but it makes life that much easier for distro packagers.

Feel free to apply this as you see fit.

Cheers,
Florian


Re: [Linux-ha-dev] [rfc] SBD with Pacemaker/Quorum integration

2012-05-28 Thread Florian Haas
On Fri, May 25, 2012 at 9:56 PM, Lars Marowsky-Bree wrote:
> Should be packageable on every platform, though I admit that I've not
> tried building the pacemaker module against anything but the
> corosync+pacemaker+openais stuff we ship on SLE HA 11 so far.

Are you expecting this to build without "-I/usr/include/libxml2"? It
didn't for me, before I added that.
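
One way to avoid hardcoding that include path is to ask pkg-config for it. The sketch below assumes libxml2 installs a libxml-2.0.pc file, as it does on common distributions, and falls back to the path mentioned above otherwise. The variable name xml_cflags is only for illustration.

```shell
#!/bin/sh

# Ask pkg-config for the libxml2 compiler flags when possible; otherwise
# fall back to the include path that had to be added by hand.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libxml-2.0; then
    xml_cflags=$(pkg-config --cflags libxml-2.0)
else
    xml_cflags="-I/usr/include/libxml2"
fi

echo "extra CFLAGS for libxml2: ${xml_cflags}"
```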

Florian



Re: [Linux-ha-dev] [rfc] SBD with Pacemaker/Quorum integration

2012-05-25 Thread Florian Haas
On Fri, May 25, 2012 at 5:41 PM, Lars Marowsky-Bree wrote:
> On 2012-05-25T17:31:52, Florian Haas wrote:
>
>> > That aside, what do you think of the idea/approach?
>> Um, right now I have no opinion. Your commit messages are pretty
>> terse, and there's no README in the repo. Mind adding one?
>
> Good point. I wasn't aware the commit messages were terse ;-)
>
> To sketch this out:
>
> Basically though SBD continues as it always did.
>
> If you specify "-P" to the daemon start-up (usually via
> /etc/sysconfig/sbd SBD_OPTS), the following will happen:
>
> sbd will start (in addition to the worker processes that monitor the
> disks) a process that signs in with pacemaker (and corosync). This
> process monitors that the partition the local node is part of is
> quorate, and that the local node (according to the CIB as run through
> pengine) is "healthy".
>
> If so, the master thread will not self-fence even if the majority of
> devices is currently unavailable.
>
> That's it, nothing more. Does that help?

It does. One naive question: what's the rationale of tying in with
Pacemaker's view of things? Couldn't you just consume the quorum and
membership information from Corosync alone?
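
For reference, enabling the behaviour Lars describes would look roughly like this in /etc/sysconfig/sbd. Only SBD_OPTS and the "-P" switch come from this thread; the SBD_DEVICE line and its path are placeholders added for context and should be treated as assumptions.

```shell
# /etc/sysconfig/sbd (sketch; the device path below is a placeholder)
SBD_DEVICE="/dev/disk/by-id/EXAMPLE-sbd-part1"

# "-P" turns on the Pacemaker/quorum check described above. Without it,
# sbd behaves as in previous versions.
SBD_OPTS="-P"
```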

> It became needed because customers had scenarios with just one device
> (which experienced intermittent problems), where MPIO acted up (I've
> seen IO stuck for minutes), or even three devices where failures were
> correlated. Then, SBD would self-fence, and the customer be unhappy.
>
>
> (I have opinions on particularly the last failure mode. This seems to
> arise specifically when customers have built setups with two HBAs, two
> SANs, two storages, but then cross-linked the SANs, connected the HBAs
> to each, and the storages too. That seems to frequently lead to
> hiccups where the *entire* fabric is affected. I'm thinking this
> cross-linking is a case of sham redundancy; it *looks* as if it makes
> things more redundant, but in reality reduces it since faults are no
> longer independent. Alas, they've not wanted to change that.)

Henceforth, I'm going to dangle this thread in front of everyone who
believes their SAN can never fail. Thanks. :)

Are there any SUSEisms in SBD or would you expect it to be packageable
on any platform?

Cheers,
Florian



Re: [Linux-ha-dev] [rfc] SBD with Pacemaker/Quorum integration

2012-05-25 Thread Florian Haas
On Thu, May 24, 2012 at 3:10 PM, Lars Marowsky-Bree wrote:
> On 2012-05-24T14:34:59, Florian Haas wrote:
>
>> > To give you a glance of the extended sbd code, you can check out
>> > http://hg.linux-ha.org/sbd - the new Pacemaker integration is activated
>> > using the "-P" option in /etc/sysconfig/sbd, otherwise sbd remains a
>> > drop-in replacement for the previous versions.
>> Just as a suggestion: since you're already taking this out of glue,
>> would you mind also moving the repo to GitHub? It's just orders of
>> magnitude more straightforward to review and comment on code that way.
>
> I'll probably do that, but since I stripped it out of glue to start
> with, sticking with hg was easier for the time being.
>
> But yes, I am contemplating to get over my git aversion ;-)
>
> That aside, what do you think of the idea/approach?

Um, right now I have no opinion. Your commit messages are pretty
terse, and there's no README in the repo. Mind adding one?

Cheers,
Florian



Re: [Linux-ha-dev] [rfc] SBD with Pacemaker/Quorum integration

2012-05-24 Thread Florian Haas
On Thu, May 24, 2012 at 11:10 AM, Lars Marowsky-Bree wrote:
> Hi all,
>
> I had to repeatedly deal with customer/partner scenarios where the SAN
> was unreliable, and outages were correlated across fabrics. The desire
> was to avoid the self-fence in such cases if the cluster is quorate and
> the node is not unhealthy.
>
> This required SBD to link against pacemaker's CIB and PE libraries, and
> all that that implies. Which meant sbd had to move out of cluster-glue,
> or else we'd face a build loop.
>
> To give you a glance of the extended sbd code, you can check out
> http://hg.linux-ha.org/sbd - the new Pacemaker integration is activated
> using the "-P" option in /etc/sysconfig/sbd, otherwise sbd remains a
> drop-in replacement for the previous versions.

Just as a suggestion: since you're already taking this out of glue,
would you mind also moving the repo to GitHub? It's just orders of
magnitude more straightforward to review and comment on code that way.

Cheers,
Florian



Re: [Linux-ha-dev] [Linux-HA] Bug in iSCSILogicalUnit

2012-05-21 Thread Florian Haas
Hi Vadym,

moving the discussion to the -dev list, which is the more appropriate
forum for this. Please reply to -dev; more comments inline.

On Sun, May 20, 2012 at 9:52 PM, Vadym Chepkov wrote:
> Hi,
>
>
> The monitor operation of  iSCSILogicalUnit  is not specific enough in the 
> regular expression and I got very "nice" fencing going because it was falsely 
> reporting "failed to stop" resource.
>
> I happened to add primitive lun-build10, while already having lun-build1.

It would be bad if _that_ caused problems, but I'm unsure how that
would be related to your patch.

> For myself I have narrowed it down to the following fix, but probably a more 
> appropriate regex has to be applied to these and other commands serving the 
> same purpose for other iSCSI implementations.
>
> diff --git a/heartbeat/iSCSILogicalUnit b/heartbeat/iSCSILogicalUnit
> index 25ee32e..2cee970 100755
> --- a/heartbeat/iSCSILogicalUnit
> +++ b/heartbeat/iSCSILogicalUnit
> @@ -328,7 +328,7 @@ iSCSILogicalUnit_monitor() {
>        tgt)
>            # Figure out and set the target ID
>            TID=`tgtadm --lld iscsi --op show --mode target \
> -               | sed -ne "s/^Target \([[:digit:]]\+\): ${OCF_RESKEY_target_iqn}/\1/p"`
> +               | sed -ne "s/^Target \([[:digit:]]\+\): ${OCF_RESKEY_target_iqn}$/\1/p"`

Adding the end-of-line anchor there does make good sense, but it would
only fix the case of there being two _targets_ sharing part of their
IQN, not two LUs with primitives similarly named. Do you have more
than one target, where the full IQN of one target is a substring of
the IQN of another?

>            if [ -z "$TID" ]; then
>                # Our target is not configured, thus we're not
>                # running.
> @@ -337,7 +337,7 @@ iSCSILogicalUnit_monitor() {
>            # This only looks for the backing store, but does not test
>            # for the correct target ID and LUN.
>            tgtadm --lld iscsi --op show --mode target \
> -               | grep -E -q "[[:space:]]+Backing store.*: ${OCF_RESKEY_path}" && return $OCF_SUCCESS
> +               | grep -E -q "[[:space:]]+Backing store.*: ${OCF_RESKEY_path}$" && return $OCF_SUCCESS
>            ;;
>        lio)

Here the "$" looks OK too, but here it would apply to two backing
devices with overlapping paths. I presume you named your LVs
"lun-build1" and "lun-build10" also, and they're in the same VG?
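
To illustrate the overlapping-path case: a minimal sketch that runs the unanchored and the anchored pattern from the patch against simplified, made-up tgtadm output. The IQN, the volume group name vg0 and the sample text are assumptions for illustration; real tgtadm output contains more fields.

```shell
#!/bin/sh

# Simplified stand-in for `tgtadm --lld iscsi --op show --mode target`.
# Only lun-build10 is exported here; lun-build1 is NOT.
sample_output='Target 1: iqn.2012-05.com.example:storage
    LUN: 1
        Backing store type: rdwr
        Backing store path: /dev/vg0/lun-build10'

# Stands in for ${OCF_RESKEY_path} of the lun-build1 primitive.
path="/dev/vg0/lun-build1"

# Unanchored pattern: matches, because lun-build1 is a prefix of lun-build10.
if printf '%s\n' "$sample_output" \
    | grep -E -q "[[:space:]]+Backing store.*: ${path}"; then
    unanchored="match"
else
    unanchored="no match"
fi

# Anchored pattern: the trailing "$" requires the line to end after the path.
if printf '%s\n' "$sample_output" \
    | grep -E -q "[[:space:]]+Backing store.*: ${path}\$"; then
    anchored="match"
else
    anchored="no match"
fi

echo "unanchored: ${unanchored}; anchored: ${anchored}"
```

With the unanchored pattern, a monitor for lun-build1 would report success although only lun-build10 is exported, which fits the false "running" state described in the report.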

Cheers,
Florian



Re: [Linux-ha-dev] [PATCH v2] resource-agents: add Linux proxy arp resource agent

2012-04-04 Thread Florian Haas
On Wed, Apr 4, 2012 at 1:52 AM, Christian Franke wrote:
> Hello Florian,
>
> Your question is fully justified - I sincerely apologize for ignoring
> that comprehensive documentation.
>
> I rewrote the patch trying to adhere to the requirements given in the
> documentation.

Wow, that is a response we infrequently get so promptly from new RA
authors. I'm impressed. Thanks!

And no-one's expecting you to apologize for just missing a piece of
documentation.

Cheers,
Florian



Re: [Linux-ha-dev] [PATCH] resource-agents: add GNU/Linux proxy arp resource agent

2012-04-03 Thread Florian Haas
On Tue, Apr 3, 2012 at 1:07 PM, Christian Franke wrote:
> This patch adds an OCF resource agent which maintains proxy arp entries
> in a GNU/Linux arp table.
>
> This is especially useful when a high-availability routing setup is built
> and it is required to perform proxy arp.

Thanks for the contribution. May I ask whether you've taken a look at
the OCF RA dev guide?

Florian



Re: [Linux-ha-dev] some iSCSITarget meta data issues

2012-02-26 Thread Florian Haas
On 02/27/12 07:38, Rasto Levrinc wrote:
> I am talking about long shortdescs. E.g.
> 
> Specifies the iSCSI target implementation ("iet",
> "tgt" or "lio").
> 
> is way too long and it abbreviates to something like "Specifies the
> iSCSI..." in the GUI. You may not care about this, but many people do.
> So I am not nitpicking or anything, this meta-data happens to be my
> interface to the resource agents and it used to work quite well in the
> past. So generally I think that short-description shouldn't encode that
> they specify something, the name of the resource agent and possible
> values.
> 
> Implementation
> 
> or
> 
> iSCSI target implementation
> 
> would be enough in my opinion and it's nothing shameful to have short
> short-descriptions.

So you're proposing either a shortdesc that's _identical_ to the
parameter name -- doesn't do a fat lot of good -- or one that contains
the parameter name plus the string "iSCSI target" which you've been
complaining about upthread? Does not compute.

Let me suggest that this discussion is getting us nowhere. The
shortdescs stay as they are until someone comes up with a real improvement.

Thanks for reporting the other issues, though.

Cheers,
Florian



Re: [Linux-ha-dev] some iSCSITarget meta data issues

2012-02-26 Thread Florian Haas
On 02/26/12 15:31, Rasto Levrinc wrote:
> I think, I'm never bikeshedding. :) It is a minor issue but the "iSCSI
> target" in every short description is redundant,

Which RA are you looking at? iSCSITarget supports 8 different
parameters; 4 of them have "iSCSI target" in their shortdesc. That's
hardly "every short description".

> since they all belong
> to the iSCSI target RA, and because the long descriptions are cut in the

Now what, are you talking about longdesc or shortdesc?

> display, then they all look like "iSCSI target..." and you must mouse
> over to actually see what they are. I am exaggerating a bit here. So I
> propose as a general style rule: don't include the RA name in the short
> descriptions.
> 
>>
>>> 3. defaults are computed in this way that they may be different in
>>> different cluster nodes and may change after the cluster is configured,
>>> which is not very useful in my opinion.
>>
>> That was my way of trying to provide a "reasonable" default across
>> distros. The alternative would be that every distro packager would
>> have to patch the RA to provide the proper default for their platform
>> -- which would be tgt for RHEL/CentOS, iet for SLES 11 and then lio
>> for SLES 11 SP2+ (I think), undefined for Debian. You get the picture.
>> I think the existing way of figuring out the defaults is saner, if not
>> perfect. Feel free to convince me otherwise, though.
> 
> 
> You are right about that. The problem I am having is that there are two
> types of defaults, which you can't distinguish just by looking at the
> meta-data.
> 
> The first type is a default that is used by the RA itself, so you don't
> have to define 20 parameters, only those where you want to use something
> other than the default. This is the most common use.
> 
> The second type is a suggested value, that is advertised as a default,
> but unless it is stored like normal value in the cib, it is not used by
> the RA.
> 
> The third type is a combination from the two above, like iSCSITarget.
> 
> For now I am solving it by keeping track of which kind of defaults each
> RA is using, but I'd prefer that there were some consistency in
> it.

Patches welcome.

Florian



Re: [Linux-ha-dev] some iSCSITarget meta data issues

2012-02-26 Thread Florian Haas
On Sat, Feb 25, 2012 at 10:02 AM, Rasto Levrinc wrote:
> Hi,
>
> There are a couple of issues in iSCSITarget meta-data.
>
> 1. there is an extra space in the name attribute of the status action and
> it causes all sorts of problems:
>
> 
> what it does, is that crm shell doesn't understand "status " action only
> "status", but cib doesn't accept "status". So I can't simply fix it in
> the GUI.

Fixed in the metadata:

https://github.com/ClusterLabs/resource-agents/commit/ebb5e5d103066cb19b46427ef0f28937f4943dbe

However, iirc RAs expose "status" operation only for age-old
compatibility reasons, and Pacemaker only ever uses "monitor". Which
is why, I guess, no-one has run into this problem before. Any reason
for LCMC to use "status" at all?

> And some not very critical...
>
> 2. short descriptions are not really short, I think it's not necessary
> to prepend every one of them with "iSCSI target"
>
> The worst offender
> Manages an iSCSI target export
>
> could be changed to "Implementation", or "Daemon Implementation" to be
> short and descriptive.

Yeah, that one was just a copy & paste error. Fixed too, thanks.

About the others, I can only surmise you're bikeshedding -- those look
fine to me. However, I'll be happy to take a patch if you have better
suggestions.

> 3. defaults are computed in this way that they may be different in
> different cluster nodes and may change after the cluster is configured,
> which is not very useful in my opinion.

That was my way of trying to provide a "reasonable" default across
distros. The alternative would be that every distro packager would
have to patch the RA to provide the proper default for their platform
-- which would be tgt for RHEL/CentOS, iet for SLES 11 and then lio
for SLES 11 SP2+ (I think), undefined for Debian. You get the picture.
I think the existing way of figuring out the defaults is saner, if not
perfect. Feel free to convince me otherwise, though.

Cheers,
Florian



Re: [Linux-ha-dev] [RfC] [Patch] Filesystem

2012-02-21 Thread Florian Haas
On Mon, Feb 20, 2012 at 9:40 PM, Lars Ellenberg wrote:
> What do you say?

+1

Florian



Re: [Linux-ha-dev] [PATCH] Medium: Use the resource timeout as an override to the default dbus timeout for upstart RA

2012-02-20 Thread Florian Haas
On Mon, Feb 20, 2012 at 11:57 AM, Andrew Beekhof wrote:
>> It does, but the exit status is always '0', which makes 'service' binary
>> unusable for monitoring the status of the service without parsing the
>> command output.
>
> 10 head
> 20 desk
> 30 add
> 40 goto 10

I believe you went through that same loop months ago when you found
out about this on IRC. Looks like premature cache invalidation of the
result.

Florian



Re: [Linux-ha-dev] Additional changes made via DHCPD review process

2011-12-09 Thread Florian Haas
On Fri, Dec 9, 2011 at 6:30 AM, Dejan Muhamedagic wrote:
> Hi,
>
> On Tue, Dec 06, 2011 at 01:39:04PM -0400, Chris Bowlby wrote:
>> Hi All,
>>
>>   Ok, I'll look into csync, and will concede the point on the RA syncing
>> the out of chrooted configuration file.
>>
>> I still need to find a means to monitor the DHCP responses however, as
>> that will just improve the reliability of the cluster itself, as well as
>> the service.
>
> I'm really not sure how to do that.
>
> Didn't review the agent, but on a cursory look, perhaps you could
> provide the default for chrooted_path (/var/lib/dhcp).
>
> BTW, did you think of adding an ocft test case?

Please, cut the new guy some slack. :) Evidently this is Chris' first
contributed RA, and he has been enormously responsive to our
suggestions, and has drastically improved his agent from his first
submission. I'm sure he'll get to ocft in due course.

Cheers,
Florian


Re: [Linux-ha-dev] Additional changes made via DHCPD review process

2011-12-06 Thread Florian Haas
On Tue, Dec 6, 2011 at 4:44 PM, Dejan Muhamedagic wrote:
> Hi,
>
> On Tue, Dec 06, 2011 at 10:59:20AM -0400, Chris Bowlby wrote:
>> Hi Everyone,
>>
>>   I would like to thank Florian, Andreas and Dejan for making
>> suggestions and pointing out some additional changes I should make. At
>> this point the following additional changes have been made:
>>
>> - A test case in the validation function for ocf_is_probe has been
>> reversed to ! ocf_is_probe, and the "test"/"[ ]" wrappers removed to
>> ensure the validation is not occurring if the partition is not mounted or
>> under a probe.
>> - An extraneous return code has been removed from the "else" clause of
>> the probe test, to ensure the rest of the validation can finish.
>> - The call to the DHCPD daemon itself during the start phase has been
>> wrapped with the ocf_run helper function, to ensure that is somewhat
>> standardized.
>>
>> The first two changes corrected the "Failed Action... Not installed"
>> issue on the secondary node, as well as the fail-over itself. I've been
>> able to fail over to secondary and primary nodes multiple times and the
>> service follows the rest of the grouped services.
>>
>> There are a few things I'd like to add to the script, now that the main
>> issues/code changes have been addressed, and they are as follows:
>>
>> - Add a means of copying /etc/dhcpd.conf from node1 to node2...nodeX
>> from within the script. The logic behind this is as follows:
>
> I'd say that this is admin's responsibility. There are tools such
> as csync2 which can deal with that. Doing it from the RA is
> possible, but definitely very error prone and I'd be very
> reluctant to do that. Note that we have many RAs which keep
> additional configuration in a file and none of them tries to keep
> the copies of that configuration in sync itself.

Seconded. Whatever configuration doesn't live _in_ the CIB proper, is
not Pacemaker's job to replicate. The admin gets to either sync files
manually across the nodes (csync2 greatly simplifies this; no need to
reinvent the wheel), or put the config files on storage that's
available to all cluster nodes.
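
As an aside on the ocf_is_probe fix Chris lists above: wrapping a function name in test or [ ] does not call it, which is why removing the wrappers mattered. A minimal sketch follows, using a local stand-in for ocf_is_probe (the real one lives in ocf-shellfuncs) that always reports a regular, non-probe invocation.

```shell
#!/bin/sh

# Stand-in for ocf_is_probe from ocf-shellfuncs. Returning 1 here means
# "this invocation is not a probe".
ocf_is_probe() {
    return 1
}

# Wrapped in [ ]: the function is never called. "[" only sees the non-empty
# string "ocf_is_probe", and a non-empty string tests as true.
if [ ocf_is_probe ]; then
    wrapped="true"
else
    wrapped="false"
fi

# Called directly: the function runs and its exit status decides the branch.
if ocf_is_probe; then
    direct="true"
else
    direct="false"
fi

# Negated form, as described in the change list.
if ! ocf_is_probe; then
    negated="true"
else
    negated="false"
fi

echo "wrapped=${wrapped} direct=${direct} negated=${negated}"
```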

Cheers,
Florian


Re: [Linux-ha-dev] pgsql and streaming replcation

2011-12-05 Thread Florian Haas
On Sun, Dec 4, 2011 at 11:11 PM, Serge Dubrouski  wrote:
> Florian, Dejan how would you like to merge a patch when we are ready? The
> patch will be rather big one and AFAIK you have some policy on the amount of
> changes for one patch.

If it's a big addition of functionality, then a big patch is expected.
However please make sure that you do one patch per functional change.
Also, don't mix functional changes with "cleanup" work like fixing
whitespace, correcting incorrectly advertised resource parameters,
etc. It's acceptable to mix those in with the same pull request, but
they should be in separate changesets so we can easily bisect any
arising issues.

Other than that, I guess Dejan will agree with me that your PostgreSQL
expertise is way better than his and mine. So if you greenlight the
feature addition functionally we're unlikely to second-guess you on
that.

Does this help?
Cheers,
Florian


Re: [Linux-ha-dev] [Linux-HA] Generic Python framework for OCF Resource Agents released

2011-12-03 Thread Florian Haas
Hi Volker,

welcome. First off I would like to say that what you present is pretty
impressive and could turn out to be something extraordinarily helpful.
I agree that being able to easily create robust resource agents in
Python would be a huge plus. I also agree with the pain points you
mentioned about resource agents in shell. I'm impressed with what's on
bitbucket; it's fairly easy to look around and get a good grasp of
what you're doing.

In addition, I'm an admitted Python fanboy and have been wishing for a
Python-based VirtualDomain agent (one of the RAs I wrote and
co-maintain, based on libvirt), rather than the current shell/virsh
based one, for quite a while.

But as it turns out what I, putting myself into the shoes of a
resource agent author, would expect from a resource agent seems to
only partially agree with your expectations. :) Let me give you my
idea of an ideal Python resource agent (all pseudocode, just hacked
up, totally untested, errors and typos mine, don't run this at home):

class MyAgent(ResourceAgent):
  VERSION = "0.1"

  # Declare parameter types. Yeah, of course we can use something more
  # elegant than a nested dictionary.
  PARAMS = { "foobar": { "type" : INTEGER,
                         "required" : True,
                         "unique" : True,
                         "default" : "blah" }}

  def start(self, timeout, **params):
      # Do stuff to start the RA
      ...

  def stop(self, timeout, **params):
      # Do stuff to stop the RA
      ...

  def migrate_to(self, timeout, target, **params):
      # Do stuff to migrate the RA to target
      ...

  def migrate_from(self, timeout, source, **params):
      # Do stuff to migrate the RA from source
      ...

  def notify_post_start(self, timeout, *nodes, **params):
      # Do stuff to handle a post notification for
      # resources started on nodes
      ...

And now, I'd like that all the scaffolding is done by the abstract
base class. Such as:

* parse command line options (you got that right)
* create resource agent metadata (that too, however I'd much prefer if
rather than registering handlers you would just be able to introspect
the public methods on the RA, plus the params attribute, and build it
that way.)
* create the usage message (idem)
* translate all the OCF_RESKEY_* envars into simple method parameters
that then get passed into the methods
* insert defaults for parameter values, where they exist and haven't
been overridden
* same for the various OCF_RESKEY_CRM_meta* envars
* handle the command line invocation
* set up logging handlers in a sane way so the RA author can just use
logging.info() and friends, and the log output ends up where the
cluster admin decides
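
To make a few items on that wish list more concrete, here is a rough
sketch of how a base class might introspect public methods and turn
OCF_RESKEY_* envars into method parameters. Everything in it is
hypothetical: the ResourceAgent class, the PARAMS layout, the helper
names and the 20-second fallback timeout are assumptions made for
illustration, not part of any existing framework. Only the return
codes (0 for OCF_SUCCESS, 3 for OCF_ERR_UNIMPLEMENTED) and the
millisecond unit of OCF_RESKEY_CRM_meta_timeout are taken from OCF
conventions.

```python
import os


class ResourceAgent:
    """Hypothetical base class; a sketch, not an existing API."""

    PARAMS = {}
    PREFIX = "OCF_RESKEY_"

    def actions(self):
        # Introspect public methods instead of registering handlers.
        helpers = ("actions", "collect_params", "dispatch")
        return sorted(
            name for name in dir(self)
            if not name.startswith("_")
            and name not in helpers
            and callable(getattr(self, name))
        )

    def collect_params(self, environ):
        # Start from declared defaults, then overlay OCF_RESKEY_* values.
        params = {name: spec["default"]
                  for name, spec in self.PARAMS.items()
                  if "default" in spec}
        meta_prefix = self.PREFIX + "CRM_meta_"
        for key, value in environ.items():
            if key.startswith(self.PREFIX) and not key.startswith(meta_prefix):
                params[key[len(self.PREFIX):]] = value
        return params

    def dispatch(self, action, environ=None):
        environ = os.environ if environ is None else environ
        if action not in self.actions():
            return 3  # OCF_ERR_UNIMPLEMENTED
        # The meta timeout is given in milliseconds; the fallback of
        # 20000 is an assumption made for this sketch.
        raw = environ.get("OCF_RESKEY_CRM_meta_timeout", "20000")
        timeout = int(raw) / 1000
        return getattr(self, action)(timeout, **self.collect_params(environ))


class Dummy(ResourceAgent):
    PARAMS = {"state": {"default": "/tmp/dummy.state"}}

    def start(self, timeout, **params):
        return 0  # OCF_SUCCESS

    def stop(self, timeout, **params):
        return 0  # OCF_SUCCESS
```

With something like this, an RA author would only write the Dummy
part; metadata and usage output could be generated from actions() and
PARAMS in the same introspective way.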

I wouldn't want to muck around with registering handlers and
parameters. I as a resource agent author would like to not have to
worry about the innards at all, unlike with the shell RAs where
there's no real way around that. Just give me a method signature I
need to implement, and a few public attributes I need to fill, and
then I want to be able to focus on function, not form.

Just so that's clear: that is just my idea; I'm not saying your
approach is in any way inferior. We're just getting this discussion
started. Maybe I'm totally off my rocker (it happens. :) ).

As this is a discussion that's really for the -dev list, I've added
that list to the recipients and would encourage people to continue the
discussion there.

Cheers,
Florian


Re: [Linux-ha-dev] [Linux-HA] new RA: varnish

2011-12-02 Thread Florian Haas
On Wed, Nov 23, 2011 at 10:40 AM, Léon Keijser  wrote:
> Hi,
>
> I've created a new RA to manage Varnish instances. I've forked
> resource-agents and added it here:
>
> https://github.com/lkeijser/resource-agents

We haven't seen much review from others here on the list, but Léon has
been enormously responsive on GitHub and has polished and improved
this RA considerably. I have just merged this into the upstream repo,
the merge commit is here:

https://github.com/ClusterLabs/resource-agents/commit/1e70eea45251b5375ea3314b63491e940cb055cd

If you would like to test this RA (and you're much encouraged to do
so), it's available from here:

https://raw.github.com/ClusterLabs/resource-agents/master/heartbeat/varnish

Thanks Léon, and enjoy your vacation!

Cheers,
Florian



Re: [Linux-ha-dev] [PATCH 0/2] LVM: Fix activation for clustered VGs

2011-11-30 Thread Florian Haas
On Fri, Nov 25, 2011 at 6:38 PM, Florian Haas  wrote:
> Lars (both), Dejan, Nils,
>
> could you take a quick peek at whether the following (untested) patches
> look like they're making sense? The more important one is obviously
> the second one.
>
> Nils, could you apply those patches on the system where you ran into
> the issue? If it's more convenient, you can also fetch the patched RA
> from my Github repo:
>
> https://github.com/fghaas/resource-agents/blob/lvm/heartbeat/LVM
>
> Thanks!
>
> Cheers,
> Florian
>
> [PATCH 1/2] Low: LVM: add local convenience variables in LVM_start
> [PATCH 2/2] Medium: LVM: force dmevent monitoring for clones

Merged, plus a couple of trivial janitor patches. Recent commit history is here:

https://github.com/ClusterLabs/resource-agents/commits/master/heartbeat/LVM

Cheers,
Florian


Re: [Linux-ha-dev] [PATCH 2/2] Medium: LVM: force dmevent monitoring for clones

2011-11-28 Thread Florian Haas
On Sat, Nov 26, 2011 at 11:58 AM, Lars Marowsky-Bree  wrote:
> On 2011-11-25T18:38:06, Florian Haas  wrote:
>
>> Starting a clustered volume with monitoring disabled is not allowed:
>>
>> http://www.redhat.com/archives/lvm-devel/2010-March/msg00289.html
>>
>> Which would be fine, as activation/monitoring = 1 ships as the default
>> in lvm.conf. However, at least some versions of LVM seem to ignore this,
>> throwing an error on vgchange unless "--monitor y" is explicitly set
>> on the command line:
>>
>> https://bugs.launchpad.net/ubuntu/+source/lvm2/+bug/833368
>>
>> Thus, for cloned instances, always invoke vgchange with "--monitor y".
>>
>> Thanks to Nils Meyer for pointing out this issue.
>> ---
>>  heartbeat/LVM |    6 ++
>>  1 files changed, 6 insertions(+), 0 deletions(-)
>
> Seems to make sense. of course, an alternative would be to add a
> "Conflicts: lvm2 < x.y.z" to the package on the respective versions to
> make sure it's only installed with a fixed lvm2 package ...?

Surely you're joking. resource-agents does not enforce any packaging
dependencies for the stuff it's capable of managing, so why throw in a
random conflict here?

Of course, we probably wouldn't have this version issue if the LVM RA
was packaged with LVM, but someone shot down that suggestion. Who was
that, I wonder? I'm thinking, I'm thinking... nevermind, it'll come to
me.

Cheers,
Florian


[Linux-ha-dev] [PATCH 2/2] Medium: LVM: force dmevent monitoring for clones

2011-11-25 Thread Florian Haas
Starting a clustered volume with monitoring disabled is not allowed:

http://www.redhat.com/archives/lvm-devel/2010-March/msg00289.html

Which would be fine, as activation/monitoring = 1 ships as the default
in lvm.conf. However, at least some versions of LVM seem to ignore this,
throwing an error on vgchange unless "--monitor y" is explicitly set
on the command line:

https://bugs.launchpad.net/ubuntu/+source/lvm2/+bug/833368

Thus, for cloned instances, always invoke vgchange with "--monitor y".

Thanks to Nils Meyer for pointing out this issue.
---
 heartbeat/LVM |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/heartbeat/LVM b/heartbeat/LVM
index d8ad3ca..05eefe7 100755
--- a/heartbeat/LVM
+++ b/heartbeat/LVM
@@ -224,6 +224,12 @@ LVM_start() {
vgchange_options="$vgchange_options --partial"
   fi
 
+  # for clones (clustered volume groups), we'll also have to force
+  # monitoring, even if disabled in lvm.conf.
+  if ocf_is_clone; then
+   vgchange_options="$vgchange_options --monitor y"
+  fi
+
   ocf_run vgchange $vgchange_options $1 || return $OCF_ERR_GENERIC
 
   if LVM_status $1; then
-- 
1.7.5.4



[Linux-ha-dev] [PATCH 0/2] LVM: Fix activation for clustered VGs

2011-11-25 Thread Florian Haas
Lars (both), Dejan, Nils,

could you take a quick peek at whether the following (untested) patches
look like they're making sense? The more important one is obviously
the second one.

Nils, could you apply those patches on the system where you ran into
the issue? If it's more convenient, you can also fetch the patched RA
from my Github repo:

https://github.com/fghaas/resource-agents/blob/lvm/heartbeat/LVM

Thanks!

Cheers,
Florian

[PATCH 1/2] Low: LVM: add local convenience variables in LVM_start
[PATCH 2/2] Medium: LVM: force dmevent monitoring for clones


[Linux-ha-dev] [PATCH 1/2] Low: LVM: add local convenience variables in LVM_start

2011-11-25 Thread Florian Haas
---
 heartbeat/LVM |   11 +++
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/heartbeat/LVM b/heartbeat/LVM
index 683d4d5..d8ad3ca 100755
--- a/heartbeat/LVM
+++ b/heartbeat/LVM
@@ -201,6 +201,8 @@ LVM_monitor() {
 #  Enable LVM volume
 #
 LVM_start() {
+  local vgchange_options
+  local active_mode
 
   # TODO: This MUST run vgimport as well
 
@@ -215,13 +217,14 @@ LVM_start() {
   active_mode="ly"
   if ocf_is_true "$OCF_RESKEY_exclusive" ; then
active_mode="ey"
-  fi   
-  partial_active=""
+  fi
+  vgchange_options="-a $active_mode"
+
   if ocf_is_true "$OCF_RESKEY_partial_activation" ; then
-   partial_active="--partial"
+   vgchange_options="$vgchange_options --partial"
   fi
 
-  ocf_run vgchange -a $active_mode $partial_active $1 || return $OCF_ERR_GENERIC
+  ocf_run vgchange $vgchange_options $1 || return $OCF_ERR_GENERIC
 
   if LVM_status $1; then
 : OK Volume $1 activated just fine!
-- 
1.7.5.4



Re: [Linux-ha-dev] [PATCH] ldirectord: Remove dependence on shellfuncs from /etc/init.d/ldirectord

2011-11-23 Thread Florian Haas
On 11/23/11 08:52, Simon Horman wrote:
> On Wed, Nov 23, 2011 at 04:10:46PM +0900, TATEISHI Katsuyuki wrote:
>> Hi All,
>>
>> Please consider applying the attached patch to remove dependence on
>> shellfuncs from /etc/init.d/ldirectord.
>>
>> It allows ldirectord-RPM users to run /etc/init.d/ldirectord without
>> resource-agents RPM installed.
>>
>> It is quite harmless, because /etc/init.d/ldirectord does not use any
>> functions in /etc/ha.d/shellfuncs.
>>
>> Thank you,
> 
> Hi Tateishi-san,
> 
> this looks good to me.
> 
> Dejan, could you apply it?

I had just merged Mori-san's Makefile.am patch, so I picked up this one
as well. Merged and pushed, with a slightly modified commit message:

https://github.com/ClusterLabs/resource-agents/commit/6c2e8146c757da47b7ff926edb6895f7b1832e55

Thanks!

Cheers,
Florian




Re: [Linux-ha-dev] Review request: fixes to IPaddr2

2011-11-22 Thread Florian Haas
On 11/21/11 16:51, Dejan Muhamedagic wrote:
> Hi Florian,
> 
> On Fri, Nov 11, 2011 at 10:28:00AM +0100, Florian Haas wrote:
>> Dejan/Lars,
>>
>> I noticed I've had a bunch of minor changes to IPaddr2 sitting in a
>> branch since July, and never got around to asking for a review or
>> merging them. I've just rebased them to the current state of master. If
>> one of you could take a look, I'd much appreciate that. Thanks!
>>
>> https://github.com/fghaas/resource-agents/compare/master...ipaddr2-fixes
> 
> 
> Reviewed all the changes and didn't find any problem. Good to
> put a bit more order in IPaddr2!

Thanks! Merged & pushed.

https://github.com/ClusterLabs/resource-agents/commit/881f539aafa1e0198144137bae9e3491c45f7cd1

Cheers,
Florian



[Linux-ha-dev] Updated OCF RA dev guide (was Re: [GIT PULL] Moving the OCF RA dev guide to the resource-agents repo)

2011-11-18 Thread Florian Haas
On 11/18/11 10:50, Florian Haas wrote:
> Done. Merged and pushed. I'll now add a few updates and then rebuild the
> content hosted at
> http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html. I guess I
> should be able to upload the new stuff some time between now and Monday
> morning.

An updated version of the OCF RA developer's guide is now available here:

A related blog post of mine is at: http://wp.me/p4XzQ-bo

Cheers,
Florian



Re: [Linux-ha-dev] [GIT PULL] Moving the OCF RA dev guide to the resource-agents repo

2011-11-18 Thread Florian Haas
On 11/16/11 10:37, Florian Haas wrote:
> Hi everyone,
> 
> this is something I've been meaning to do for a long time, and I've
> finally had the time to do so. Now that the ClusterLabs repo on Github
> has been established as the central source of OCF resource agents, there
> is really no reason why the RA dev guide should live in the linux-ha.org
> Mercurial repo any longer. I've informally proposed to move it to Github
> on IRC and received very positive feedback on that.
> 
> I've done some mangling with git and hg to preserve the guide's entire
> history, so this pull request is a bit bloated because of that.
> 
> So as not to clutter the doc/ directory, I've created two subdirs,
> "doc/dev-guide" (for the dev guide sources) and "doc/man" (for the magic
> that does man page autogeneration for RAs). I've updated configure.ac
> and the Automake files accordingly.
> 
> If I don't hear from anyone with a valid objection to moving this over
> to Github, I'll proceed with the merge about 48 hours from now.

Done. Merged and pushed. I'll now add a few updates and then rebuild the
content hosted at
http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html. I guess I
should be able to upload the new stuff some time between now and Monday
morning.

For those building resource agent man pages in a git checkout, please
remember to rerun ./autogen.sh and ./configure (or run autoreconf) as
the generated man pages (and their Automake file) have moved from doc/
to doc/man/.

Cheers,
Florian



Re: [Linux-ha-dev] [GIT PULL] Re: [Linux-HA] [RfC] Review request for ocf:heartbeat:asterisk (Asterisk OCF RA)

2011-11-17 Thread Florian Haas
On 11/16/11 10:28, Florian Haas wrote:
> Hi everybody,
> 
> barring any last-minute vetoes, I intend to pull the following changes
> since commit 020c8f7b08e232aef05e277b09632171a7561744:

Heard no vetoes. Merged and pushed.

https://github.com/ClusterLabs/resource-agents/commit/ff3aff0006368dcb5cf7da226ee69a8c53b4ef62

Cheers,
Florian



[Linux-ha-dev] [GIT PULL] Moving the OCF RA dev guide to the resource-agents repo

2011-11-16 Thread Florian Haas
Hi everyone,

this is something I've been meaning to do for a long time, and I've
finally had the time to do so. Now that the ClusterLabs repo on Github
has been established as the central source of OCF resource agents, there
is really no reason why the RA dev guide should live in the linux-ha.org
Mercurial repo any longer. I've informally proposed to move it to Github
on IRC and received very positive feedback on that.

I've done some mangling with git and hg to preserve the guide's entire
history, so this pull request is a bit bloated because of that.

So as not to clutter the doc/ directory, I've created two subdirs,
"doc/dev-guide" (for the dev guide sources) and "doc/man" (for the magic
that does man page autogeneration for RAs). I've updated configure.ac
and the Automake files accordingly.

If I don't hear from anyone with a valid objection to moving this over
to Github, I'll proceed with the merge about 48 hours from now.

Cheers,
Florian


The following changes since commit 020c8f7b08e232aef05e277b09632171a7561744:

  Medium: jboss: add the java_opts parameter for java options (2011-11-15 16:27:26 +0100)

are available in the git repository at:
  git://github.com/fghaas/resource-agents dev-guide

Dejan Muhamedagic (1):
  RA dev guide: edit ocft description

Florian Haas (68):
  doc: move man page generation to doc/man
  Add RA developer's guide
  RA dev guide: explain API
  RA dev guide: add info about expected behavior for actions
  RA dev guide: add information about pseudo resources
  RA dev guide: add information about logging, ocf_run, and locks
  RA dev guide: add info about script interpreters
  RA dev guide: more information about pseudo resources
  RA dev guide: add info about testing for booleans and numbers
  RA dev guide: add separate section on convenience functions
  RA dev guide: explain have_binary and check_binary
  RA dev guide: explain validate-all and meta-data
  RA dev guide: do not use full paths when referring to executables
  RA dev guide: update example metadata
  RA dev guide: add line breaks in example metadata
  RA dev guide: add stubs for remaining actions
  RA dev guide: add section on variables
  RA dev guide: add stubs for initialization and locale considerations
  RA dev guide: add info about promote action
  RA dev guide: explain demote action
  RA dev guide: explain notify actions
  RA dev guide: explain migrate_to operation
  RA dev guide: explain migrate_from actions
  Add information about licensing, initialization, and locale settings
  RA dev guide: add hints for syntax highlighting
  RA dev guide: fix typo
  RA dev guide: fix incorrect debug message in example code
  RA dev guide: fix a misleading comment
  RA dev guide: fix example code for stop action
  RA dev guide: fix incorrect function name
  RA dev guide: fix trivial typo
  RA dev guide: add information about crm_master
  RA dev guide: rewrite monitor example
  RA dev guide: add convenience function names to the corresponding section headers
  RA dev guide: add legal notice for CC-BY-SA license
  RA dev guide: add note about ocf_is_true
  RA dev guide: add new section on resource agent structure
  RA dev guide: fix typos (migrate-*/migrate_*)
  RA dev guide: fix typo (missing "it")
  RA dev guide: improve stop example
  RA dev guide: fix comments in demote example
  RA dev guide: mention the checkbashisms script for /bin/sh RAs
  RA dev guide: fix misleading comment
  RA dev guide: add section on testing for running processes
  RA dev guide: put return codes in a separate section
  RA dev guide: add warning about implications of failed stop actions
  RA dev guide: clarify information on timeouts
  RA dev guide: add note about HA_RSCTMP being cleaned on startup
  RA dev guide: rename "Resource agent behavior" to "Resource agent actions"
  RA dev guide: add a section about ocf-tester
  RA dev guide: move testing to a different section
  RA dev guide: add section on installing resource agents
  RA dev guide: add section on RPM packaging
  RA dev guide: expand license information
  RA dev guide: add section on Debian packaging
  RA dev guide: add note about submitting RAs
  RA dev guide: fix names of notify variables
  RA dev guide: add another note about crm_master
  RA dev guide: put author information in docinfo file
  RA dev guide: add revision information to docinfo file
  RA dev guide: remove information about superfluous tests for zombies
  RA dev guide: add revision 1.0.1, update copyright years
  RA dev guide: add "Conventions" section
  RA dev guide: put testing into its own top-level section
  RA dev guide: add section on ocft

[Linux-ha-dev] [GIT PULL] Re: [Linux-HA] [RfC] Review request for ocf:heartbeat:asterisk (Asterisk OCF RA)

2011-11-16 Thread Florian Haas
Hi everybody,

barring any last-minute vetoes, I intend to pull the following changes
since commit 020c8f7b08e232aef05e277b09632171a7561744:

  Medium: jboss: add the java_opts parameter for java options (2011-11-15 16:27:26 +0100)

into the ClusterLabs resource-agents repo from the git repository at:
  git://github.com/fghaas/resource-agents asterisk

Thanks a lot to Dejan, Lars and Russell for their extensive and valuable
feedback.

Cheers,
Florian

Andreas Kurz (2):
  Medium: asterisk: remove -x option from pgrep
  Low: asterisk: refine sipsak exit code interpretation

Florian Haas (24):
  Low: asterisk: use ocf_run where appropriate, invoke kill with "-s"
  Low: asterisk: honor $PATH in binary defaults
  Low: asterisk: don't advertise monitor depth
  Low: asterisk: remove LSB boilerplate
  Low: asterisk: don't redirect ocf_run invocations to /dev/null
  Low: asterisk: remove superfluous line-end semicolons
  Low: asterisk: simplify equality tests
  Low: asterisk: remove boilerplate copied from mysql
  Low: asterisk: downgrade logging severity for successful monitor
  Low: asterisk: improve directory creation on start
  Low: asterisk: improve/simplify start
  Low: asterisk: improve stop
  Low: asterisk: declare remaining "pid" variables local
  Low: asterisk: fix typo in log message
  Low: asterisk: remove useless return statement
  Low: asterisk: religiously use $rc, not $?
  Low: asterisk: exit, don't return, in the face of uncaught errors
  Low: asterisk: remove unused variable
  Low: asterisk: initialize convenience variables after validate
  Low: asterisk: add optional SIP monitoring with sipsak
  Low: asterisk: do "core show channels count " during monitor
  Low: asterisk: whitespace cleanup
  High: asterisk: fix typo (missing "$")
  Low: asterisk: remove -v flag from sipsak invocation

Martin Loschwitz (6):
  High: asterisk: new resource agent
  Low: asterisk: set suggested timeouts to 20s
  Low: asterisk: add convenience function for connecting to Asterisk console
  Low: asterisk: rewrite stop operation
  Low: asterisk: cleanup
  Medium: asterisk: properly handle astcanary

 doc/Makefile.am       |    1 +
 heartbeat/Makefile.am |    1 +
 heartbeat/asterisk    |  485 +
 3 files changed, 487 insertions(+), 0 deletions(-)
 create mode 100755 heartbeat/asterisk


Re: [Linux-ha-dev] ocf_run: sanitize output before logging?

2011-11-15 Thread Florian Haas
On 2011-11-15 16:21, Dejan Muhamedagic wrote:
> Hi,
> 
> On Mon, Nov 14, 2011 at 09:53:12PM +0100, Florian Haas wrote:
>> Dejan, Lars, and other shell gurus in attendance,
>>
>> maybe I'm totally off my rocker, and one of you guys can set me
>> straight. But to me this part of the ocf_run function seems a bit fishy:
>>
>> output=`"$@" 2>&1`
>> rc=$?
>> output=`echo $output`
> 
>> Am I gravely mistaken, or would any funny control characters produced by
>> the wrapped command line totally mess up the content of "output" here as
>> it is mangled by the backticks?
> 
> I think you're not :) The last line was most probably put there
> to convert CR to spaces.

>> $ output=`sipsak -v -s sip:somenotexistantextens...@ekiga.net 2>&1`
>> $ echo $output
>>  Content-Length: 0(1.5.3-notls
>> (i386/linux))tag=c64e1f832a41ec1c1f4e5673ac5b80f6.8ff585.127.155.32
> 
> Seems like part of the output goes to stdout and another part to
> stderr. The two are interspersed in an unpredictable manner.

Unlikely.

If I do
$ output=`sipsak -v -s sip:somenotexistantextens...@ekiga.net 2>&1`
$ xxd <<< $output

...then the output is in the hexdump in exactly the right order. It's
just delimited by CR (0x0d), not LF (0x0a). Which is mighty odd for a
utility running on any *nix platform, but still shouldn't be transformed
to something thus garbled, simply by being stuffed into a variable.

For now, I guess we'll wimp out and simply remove "-v" from the sipsak
invocation so it just doesn't produce any output, in which case ocf_run
falls back to logging just the command and its exit code.

Cheers,
Florian



[Linux-ha-dev] ocf_run: sanitize output before logging?

2011-11-14 Thread Florian Haas
Dejan, Lars, and other shell gurus in attendance,

maybe I'm totally off my rocker, and one of you guys can set me
straight. But to me this part of the ocf_run function seems a bit fishy:

output=`"$@" 2>&1`
rc=$?
output=`echo $output`

Am I gravely mistaken, or would any funny control characters produced by
the wrapped command line totally mess up the content of "output" here as
it is mangled by the backticks?

What I'm noticing is the invocation of "ocf_run sipsak -v -s ",
which we put into the asterisk RA as per Russell Bryant's suggestion,
seems to totally garble the output.

Compare this:

$ sipsak -v -s sip:somenotexistantextens...@ekiga.net 2>&1
SIP/2.0 200 OK
Via: SIP/2.0/UDP
127.0.0.1:43665;branch=z9hG4bK.539207ad;rport=53485;alias;received=85.127.155.32
From: sip:sipsak@127.0.0.1:43665;tag=6dafacb9
To:
sip:somenotexistantextens...@ekiga.net;tag=c64e1f832a41ec1c1f4e5673ac5b80f6.3109
Call-ID: 1840229561@127.0.0.1
CSeq: 1 OPTIONS
Server: Kamailio (1.5.3-notls (i386/linux))
Content-Length: 0

To this:

$ output=`sipsak -v -s sip:somenotexistantextens...@ekiga.net 2>&1`
$ echo $output
 Content-Length: 0(1.5.3-notls
(i386/linux))tag=c64e1f832a41ec1c1f4e5673ac5b80f6.8ff585.127.155.32

In this case it appears to be due to carriage-return (0x0d, ^M)
characters that sipsak injects into its output, which is annoying but
relatively benign. But maybe we want to sanitize the ocf_run output
before we hand it off to be written to the logs?
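
As an illustration of what such sanitizing could look like, here is a
small sketch. It is written in Python purely to show the
transformation; ocf_run itself is shell, and the function name
sanitize is made up for this example. The assumption is that CR and
CRLF should be treated as line breaks, that other control characters
can be dropped, and that the result is collapsed onto one line the
way the existing "echo $output" step does. The garbling seen above is
presumably just the terminal jumping back to column 0 at each CR.

```python
import re


def sanitize(output: str) -> str:
    # Treat CRLF and bare CR as ordinary line breaks.
    text = output.replace("\r\n", "\n").replace("\r", "\n")
    # Drop remaining control characters, keeping tab and newline.
    text = re.sub(r"[\x00-\x08\x0b-\x1f\x7f]", "", text)
    # Collapse all whitespace runs into single spaces for one-line logging.
    return " ".join(text.split())
```

Fed with CR-delimited header lines such as those from sipsak, this
yields a single space-separated log line instead of text that
overwrites itself on the terminal.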

Cheers,
Florian


Re: [Linux-ha-dev] [PATCH] prevent Slave promotion in mysql RA

2011-11-14 Thread Florian Haas
On 2011-11-14 20:44, Marek Marczykowski wrote:
> On 14.11.2011 09:55, Raoul Bhatia [IPAX] wrote:
>> On 2011-11-11 09:22, Junko IKEDA wrote:
>>> Hi,
>>>
>>> I am running MySQL replication setting with 2 nodes Master/Slave
>>> configuration.
>>> If Slave status(secs_behind) is larger than Master's
>>> parameter(max_slave_lag),
>>> Slave data is outdated, right?
>>> check_slave() in mysql RA would run "crm_master -v 0" in this
>>> situation to mark Slave as "outdated",
>>> but if Master is shut down in this status,
>>> Slave will be able to promote instead of its old data.
>>> (is this correct?)
>>> It seems that "crm_master -v -INFINITY" is effectual to prevent Slave
>>> promotion.
>>
>> forwarding this to marek and fghaas as i'm not familiar with
>> multi-state handling inside resource agents.
> 
> Did you set "evict_outdated_slaves"?

In a master/slave set, evict_outdated_slaves will actually kick out (by
failing with $OCF_ERR_INSTALLED) any slave that has fallen behind.

If set to false (the default), then the slave will be allowed to stay in
the cluster, but its master preference will be pushed down so it's not
promoted, and this seems to be Ikeda-san's preferred behavior. The
caveat which I mentioned in my other email in this thread applies here,
though.

For those pulling this thread from the archives: this information is in
the resource agent man page, and in "crm ra info ocf:heartbeat:mysql".

Cheers,
Florian




Re: [Linux-ha-dev] [PATCH] prevent Slave promotion in mysql RA

2011-11-14 Thread Florian Haas
Hello Ikeda-san!

On 2011-11-11 09:22, Junko IKEDA wrote:
> Hi,
> 
> I am running MySQL replication setting with 2 nodes Master/Slave 
> configuration.
> If Slave status(secs_behind) is larger than Master's parameter(max_slave_lag),
> Slave data is outdated, right?

Yes.

> check_slave() in mysql RA would run "crm_master -v 0" in this
> situation to mark Slave as "outdated",
> but if Master is shut down in this status,
> Slave will be able to promote instead of its old data.
> (is this correct?)

Actually, not quite. :)

Andrew will correct me if I'm wrong on this. But as I understand it,

- while a _placement score_ of 0 makes a node eligible for _running_ a
resource (including an instance of a master/slave set),
- only a _promotion score_ of greater than 0 (i.e. a minimum of 1) makes
the node eligible for promoting a resource to the Master role.

So, if a node has a promotion score of 0, then it will not be promoted.
However, your point is entirely valid if you also set a master
preference via a location constraint on the master role. Consider this:

node alice
  attributes standby="on"
node bob
  attributes standby="off"

primitive p_mysql ocf:heartbeat:mysql
ms ms_mysql p_mysql

location l_master_prefers_bob ms_mysql \
  rule 200: $role=Master #uname eq bob

In that case, if bob has fallen too far behind (automatic master score:
0), then the location rule still increases that score by 200, so the
total promotion score for bob is 0 + 200 == 200, and bob will be promoted.

> It seems that "crm_master -v -INFINITY" is effectual to prevent Slave 
> promotion.

Yes, that is entirely correct. In the example above, if the outdated
slave sets a promotion preference of -INFINITY, then the total promotion
score would be -INFINITY + 200 == -INFINITY. So the outdated slave, bob,
would never be promoted to master.
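
The score arithmetic in the two cases above can be sketched as
follows. This is an illustration only, not Pacemaker code: the
function names are made up, and the "greater than 0" promotion rule
is simply the understanding stated earlier in this mail. What it
relies on is that Pacemaker's INFINITY is 1000000 and that -INFINITY
plus any other score stays -INFINITY.

```python
INFINITY = 1000000  # Pacemaker's numeric value for INFINITY


def add_scores(a, b):
    # -INFINITY combined with anything, even +INFINITY, stays -INFINITY.
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Ordinary scores add up, clamped to the valid range.
    return max(-INFINITY, min(INFINITY, a + b))


def promotable(master_score, location_score=0):
    # Assumed rule from the discussion: only a total above 0 allows promotion.
    return add_scores(master_score, location_score) > 0
```

So promotable(0, 200) is true (the outdated slave bob still gets
promoted), while promotable(-INFINITY, 200) is false.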

But:
> if [ $master_pref -lt 0 ]; then
> # Sanitize a below-zero preference to just zero
> master_pref=0
> 
> fi
> $CRM_MASTER -v $master_pref

This if block is unfortunately there for a good reason, namely that (at
least some versions back) the Policy Engine really did not like negative
promotion scores at all. I forget the exact details, but maybe Lars
(Ellenberg) will remember -- I seem to recall him telling me very firmly
something to the effect of "whatever you do, don't use crm_master with a
negative score anywhere". Now, it may be that said issues with the
pengine have since been fixed. If that is the case, I'll be happy to
modify the mysql RA as you suggest.

Surely you have patched your local version of the RA to set a -INFINITY
master preference. If so, does it behave as you expect it? If yes, could
you test it on both a 1.1 and a 1.0 cluster?

Cheers,
Florian
-- 
Need help with High Availability?
http://www.hastexo.com/now
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Patches for VirtualDomain RA

2011-11-11 Thread Florian Haas
On 2011-11-11 11:42, Michael Schwartzkopff wrote:
>>> 2) The next problem is that a graceful shutdown sometimes does not work
>>> when the machine just booted. This patch makes the RA send a shutdown
>>> command every 10 seconds while shutting down the machine. This catches
>>> the boot problem.
>>>
>>> @@ -234,6 +240,9 @@
>>> shutdown_timeout=$((($OCF_RESKEY_CRM_meta_timeout/1000)-5))
>>> # Loop on status for $shutdown_timeout seconds
>>> for i in `seq $shutdown_timeout`; do
>>> +   if [ $((i%10)) -eq 0 ]; then
>>> +   virsh $VIRSH_OPTIONS shutdown ${DOMAIN_NAME}
>>> +   fi
>>> VirtualDomain_Status
>>> status=$?
>>> case $status in
>>
>> I see the point -- if you're issuing a KVM shutdown while the machine is
>> still booting and the guest's acpid is not started, then the shutdown
>> effectively doesn't happen. And issuing a shutdown request for a domain
>> that's already got one should do no harm.
>>
>> Question is, why only do this every 10 seconds then? Might as well do it
>> on every iteration. So we could just roll the invocation of "virsh
>> $VIRSH_OPTIONS shutdown ${DOMAIN_NAME}" into the existing "while [ $NOW
>> -lt $shutdown_timeout ]; do" loop.
>>
>> What do others think?
> 
> Perhaps the shutdown might cause a considerable load on the system.

Why?

Florian


Re: [Linux-ha-dev] Patches for VirtualDomain RA

2011-11-11 Thread Florian Haas
On 2011-07-29 10:22, Michael Schwartzkopff wrote:
> Hi,
> 
> I hope I found the correct list. Playing with the VirtualDomain RA I found
> two problems. Please find the description and patches below.

Sorry for not tending to this for a while, and thanks to Dejan for the
reminder.

> 1) During stop operation libvirt occasionally returns an error because the
> state cannot be determined just the moment the machine is shut down. This
> patch makes the RA try to get the state again one time. If the machine is
> down then everything is OK.
> 
> --- /root/VirtualDomain 2011-07-29 08:39:30.652675972 +0200
> +++ /usr/lib/ocf/resource.d/heartbeat/VirtualDomain 2011-07-29 
> 10:08:24.712790703 +0200
> @@ -149,6 +149,7 @@
>  VirtualDomain_Status() {
>  rc=$OCF_ERR_GENERIC
>  status="no state"
> +bail_wait="yes";
>  while [ "$status" = "no state" ]; do
>  status="`virsh $VIRSH_OPTIONS domstate $DOMAIN_NAME`"
>  case "$status" in
> @@ -177,8 +178,13 @@
> # During the stop operation, we want to bail out
> # quickly, so as to be able to force-stop (destroy)
> # the domain if necessary.
> -   ocf_log error "Virtual domain $DOMAIN_NAME has no state 
> during 
> stop operation, bailing out."
> -   return $OCF_ERR_GENERIC;
> +   ocf_log info "Virtual domain $DOMAIN_NAME has no state 
> during 
> stop operation."
> +   if [ "$bail_wait" = "no" ]; then
> +   ocf_log error "Virtual domain $DOMAIN_NAME has no 
> state 
> during stop operation, bailing out."
> +   return $OCF_ERR_GENERIC;
> +   fi
> +   bail_wait="no"
> +   sleep 1
> else
> # During all other actions, we just wait and try
> # again, relying on the CRM/LRM to time us out if

Can you please configure your mail agent to not insert line breaks when
you send patches? Better still, use git send-email.

At any rate, I consider the patch obsolete (and actually, it already was
when it was submitted), as Lars Ellenberg implemented a "try this three
times" logic in commit ffc83235, on July 1, 2010:

https://github.com/ClusterLabs/resource-agents/commit/ffc8323515c19bc51fe0801fc3d2610878699ce3
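
The general shape of such a bounded retry can be sketched as follows. This
is not the code from that commit. query_state is a stub standing in for the
"virsh domstate" call, and state_with_retries and RETRY_DELAY are names
invented for this example:

```shell
#!/bin/sh
# Hypothetical stub: pretends the first two queries report "no state"
# and the third reports "shut off". A real RA would instead run
# something like: state=$(virsh $VIRSH_OPTIONS domstate $DOMAIN_NAME)
attempts=0
query_state() {
    attempts=$((attempts + 1))
    if [ "$attempts" -lt 3 ]; then
        state="no state"
    else
        state="shut off"
    fi
}

# Query the state up to $1 times, pausing between attempts.
# Returns 0 as soon as a definite state is seen, 1 if every attempt
# came back as "no state".
state_with_retries() {
    tries=$1
    while [ "$tries" -gt 0 ]; do
        query_state
        if [ "$state" != "no state" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${RETRY_DELAY:-1}"
    done
    return 1
}
```

Bounding the number of attempts keeps the stop operation from waiting
indefinitely, so a forced stop can still follow if the state never settles.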

> 2) The next problem is that a graceful shutdown sometimes does not work when
> the machine just booted. This patch makes the RA send a shutdown command
> every 10 seconds while shutting down the machine. This catches the boot
> problem.
> 
> @@ -234,6 +240,9 @@
> shutdown_timeout=$((($OCF_RESKEY_CRM_meta_timeout/1000)-5))
> # Loop on status for $shutdown_timeout seconds
> for i in `seq $shutdown_timeout`; do
> +   if [ $((i%10)) -eq 0 ]; then
> +   virsh $VIRSH_OPTIONS shutdown ${DOMAIN_NAME}
> +   fi
> VirtualDomain_Status
> status=$?
> case $status in

I see the point -- if you're issuing a KVM shutdown while the machine is
still booting and the guest's acpid is not started, then the shutdown
effectively doesn't happen. And issuing a shutdown request for a domain
that's already got one should do no harm.

Question is, why only do this every 10 seconds then? Might as well do it
on every iteration. So we could just roll the invocation of "virsh
$VIRSH_OPTIONS shutdown ${DOMAIN_NAME}" into the existing "while [ $NOW
-lt $shutdown_timeout ]; do" loop.
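
A minimal sketch of that idea, with the shutdown request issued on every
pass through the loop. The stubs request_shutdown and domain_is_running, the
function stop_with_repeated_shutdown and the POLL_INTERVAL variable are all
hypothetical and only serve to make the sketch runnable without libvirt:

```shell
#!/bin/sh
# Hypothetical stubs so the sketch runs without libvirt.
# request_shutdown stands in for:
#   virsh $VIRSH_OPTIONS shutdown ${DOMAIN_NAME}
# domain_is_running stands in for a VirtualDomain_Status check; here the
# domain "stops" on the third poll.
requests=0
polls=0
request_shutdown() {
    requests=$((requests + 1))
}
domain_is_running() {
    polls=$((polls + 1))
    [ "$polls" -lt 3 ]
}

# Re-issue the shutdown request on every iteration until the domain is
# down (return 0) or $1 iterations have passed (return 1).
stop_with_repeated_shutdown() {
    timeout=$1
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        request_shutdown
        if ! domain_is_running; then
            return 0
        fi
        sleep "${POLL_INTERVAL:-1}"
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

A non-zero return would be the point at which the RA falls back to a forced
stop.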

What do others think?

Cheers,
Florian



[Linux-ha-dev] Review request: fixes to IPaddr2

2011-11-11 Thread Florian Haas
Dejan/Lars,

I noticed I've had a bunch of minor changes to IPaddr2 sitting in a
branch since July, and never got around to asking for a review or
merging them. I've just rebased them to the current state of master. If
one of you could take a look, I'd much appreciate that. Thanks!

https://github.com/fghaas/resource-agents/compare/master...ipaddr2-fixes

Cheers,
Florian



Re: [Linux-ha-dev] [Linux-HA] [RfC] Review request for ocf:heartbeat:asterisk (Asterisk OCF RA)

2011-11-11 Thread Florian Haas
Just FYI, I noticed I erroneously put the asterisk changes in the master
branch on my github repo; I've now moved them to a separate "asterisk"
branch. The direct links to commits, which I posted earlier, should
still work as the SHA IDs are unchanged. They just point to commits in a
different branch now.

For those just tuning in, the current state of the RA is here:

https://github.com/fghaas/resource-agents/blob/asterisk/heartbeat/asterisk

Cheers,
Florian



Re: [Linux-ha-dev] [Linux-HA] [RfC] Review request for ocf:heartbeat:asterisk (Asterisk OCF RA)

2011-11-10 Thread Florian Haas
On 2011-11-10 17:08, Dejan Muhamedagic wrote:
> Hi Lars,
> 
> Pity I didn't see this earlier, could've saved meself some time :)
> 
> On Thu, Nov 10, 2011 at 04:33:02PM +0100, Lars Ellenberg wrote:
>> On Thu, Nov 10, 2011 at 04:11:16PM +0100, Florian Haas wrote:
>>> Hi Dejan,
>>>
>>> thanks for the feedback! We've worked in most of your suggested changes,
>>> see below:
>>
>>>> More direct would be:
>>>>
>>>> if [ $? -ne 0 ]; then
>>
>>  $? in a test is almost always an error.
> 
> Unless you don't need the outcome later.

I'm with you, however Lars did evidently spot the one occasion where we
used $? twice trying to get the same return code. So he wins. :)

https://github.com/fghaas/resource-agents/commit/2cbb26648c133ce04b0d51e439c41541dac039e1

I left one invocation in there: "asterisk_validate || exit $?". I hope
that one is acceptable. :)
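
The underlying pitfall is that $? is overwritten by every command,
including the test that inspects it, so reading it twice does not return
the same code twice. A small sketch of two safe patterns follows; run_checked
is a made-up helper and not part of the asterisk RA:

```shell
#!/bin/sh
# Hypothetical helper: run a command and report a failure exactly once.
run_checked() {
    # Pattern 1: test the command directly instead of testing $? afterwards.
    if "$@"; then
        return 0
    else
        # Pattern 2: when the code itself is needed, copy $? into a variable
        # right away; in the else branch it still holds the command's status.
        rc=$?
        echo "command failed with rc=$rc"
        return "$rc"
    fi
}
```

The "asterisk_validate || exit $?" form is safe for the same reason: $? is
read once, immediately after the command that set it.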

>> Btw,
>>  "User $OCF_RESKEY_user doesn't exit"
>>  there is an s missing.
> 
> "user doesn't exit" sounds good too ;-)

https://github.com/fghaas/resource-agents/commit/1568a990dfcac03ddfe5785c5d65940ed230068c

We also tossed in a few more changes:

Properly handle multiple instances of the "astcanary" watchdog daemon:
https://github.com/fghaas/resource-agents/commit/c257d4a57f9131a4353143991fe101f02b51d790

Remove a pointless "return $?"
https://github.com/fghaas/resource-agents/commit/812f2b55b9a7a9f8bdd270cca8d037f07f0e980a

Fix a couple of log messages, and exit where we shouldn't return:
https://github.com/fghaas/resource-agents/commit/58b5f55da4ac5537acf9a36b56f8b35f2b96da56

Cheers,
Florian




Re: [Linux-ha-dev] [Linux-HA] [RfC] Review request for ocf:heartbeat:asterisk (Asterisk OCF RA)

2011-11-10 Thread Florian Haas
Hi Dejan,

thanks for the feedback! We've worked in most of your suggested changes,
see below:

On 2011-11-10 13:14, Dejan Muhamedagic wrote:
> Hi,
> 
> On Thu, Nov 10, 2011 at 10:27:36AM +0100, Florian Haas wrote:
>> On 2011-11-09 12:02, Martin Gerhard Loschwitz wrote:
>>> Hello everybody,
>>>
>>> I wrote an asterisk OCF resource agent which I am hereby putting up
>>> for discussion. Any feedback is welcome.
>>>
>>> It's available from
>>> https://github.com/fghaas/resource-agents/blob/master/heartbeat/asterisk
>>
>> Let's move this thread to the -dev list where it really belongs.
>>
>> FWIW, I consider this RA in pretty good shape -- I did review it rather
>> thoroughly and sent a few patches, for which Martin was kind enough to
>> include me, undeservingly, in the authors list. Feedback from others
>> would still be very much appreciated (even if it's just a "+1 for
>> merge"). Thanks!
> 
> Just a few remarks:
> 
> Ending commands with ';' is not necessary:
> 
>   return $OCF_ERR_INSTALLED;
> 
> i.e. ';' serves as a command separator. (:%s/;$//)

https://github.com/fghaas/resource-agents/commit/088ba39b855d4ca6375a17500aa0c0e1a2578db8#heartbeat/asterisk

> This construct looks a bit unusual:
> 
> if [ ! $? -eq 0 ]; then
> 
> More direct would be:
> 
> if [ $? -ne 0 ]; then

https://github.com/fghaas/resource-agents/commit/bbe7a0ba38d366b25067b09141208087a9e44850#heartbeat/asterisk

> Is this necessary (in asterisk_status):
> 
>   if [ -d /proc -a -d /proc/1 ]; then
>   [ "u$pid" != "u" -a -d /proc/$pid ]
>   else
>   ocf_run kill -s 0 $pid
>   fi
> 
> Why not just:
> 
>   kill -s 0 $pid

https://github.com/fghaas/resource-agents/commit/d77afe185d9cec53388c2248ce3b290f95e4cad5

> Line 273 in monitor is going to produce a lot of logging, better
> reduce severity to debug:
> 
>   ocf_log info "Asterisk PBX monitor succeeded";

https://github.com/fghaas/resource-agents/commit/cf130ce1a3d1b9502ad07df6361f611a256ee560#heartbeat/asterisk

> In asterisk_start() $ASTRUNDIR is first created using install(8),
> then again checked in lines 292-296 and created using mkdir,
> chown, etc. Superfluous.
>
> Start may exit with some arbitrary error code (line 324 in
> asterisk_start()).
> 
> Perhaps to move all local statements to the top of the function
> in asterisk_start().
> 
> [nitpicking] start_wait is not needed, why not just
> 
>   [ $rc -eq $OCF_SUCCESS ] && break

https://github.com/fghaas/resource-agents/commit/80ea432336a56cf2680f30c389b06c20f43eef79#heartbeat/asterisk

> Should check content of $pid before line 377 in stop.

I'm unsure what you're suggesting here.

- Just check that the pid file is non-empty?
- Check whether its contents are numeric?
- Read the pid and do a kill -0 before kill -TERM?
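
For reference, the three options can be combined in one helper. This is
only a sketch of what such a check might look like, not what the RA does.
The function name term_from_pidfile and its return codes are invented for
the example:

```shell
#!/bin/sh
# Hypothetical helper combining the three checks listed above.
# Return codes are arbitrary and for illustration only:
#   0 = TERM sent, 1 = pid file missing or empty,
#   2 = contents not numeric, 3 = no such process.
term_from_pidfile() {
    pidfile=$1
    # 1. The pid file must exist and be non-empty.
    [ -s "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # 2. The contents must consist of digits only.
    case $pid in
        ''|*[!0-9]*) return 2 ;;
    esac
    # 3. Signal 0 only checks that the process exists and can be signalled.
    kill -s 0 "$pid" 2>/dev/null || return 3
    kill -s TERM "$pid"
}
```

In a real stop action, codes 1 and 3 would most likely map to "already
stopped" rather than to an error.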

Cheers,
Florian



Re: [Linux-ha-dev] [Linux-HA] [RfC] Review request for ocf:heartbeat:asterisk (Asterisk OCF RA)

2011-11-10 Thread Florian Haas
On 2011-11-09 12:02, Martin Gerhard Loschwitz wrote:
> Hello everybody,
> 
> I wrote an asterisk OCF resource agent which I am hereby putting up
> for discussion. Any feedback is welcome.
> 
> It's available from
> https://github.com/fghaas/resource-agents/blob/master/heartbeat/asterisk

Let's move this thread to the -dev list where it really belongs.

FWIW, I consider this RA in pretty good shape -- I did review it rather
thoroughly and sent a few patches, for which Martin was kind enough to
include me, undeservingly, in the authors list. Feedback from others
would still be very much appreciated (even if it's just a "+1 for
merge"). Thanks!

Cheers,
Florian



Re: [Linux-ha-dev] Stonith turns node names to lowercase

2011-10-18 Thread Florian Haas
On 2011-10-18 12:12, Alberic de Pertat wrote:
> Hi,
> 
> I am currently in the process of writing a fencing agent for VMware
> vCenter. After some tests, I noticed that the stonith command is turning
> the nodename to lowercase.
> 
> The problem is that almost every VM in my inventory is uppercase with
> some mixed case too. VMware allows you to have two VM with the same name
> but different cases.

Gah. That sounds like a great way to go insane, or drive someone else
nuts. I really wonder why on earth anyone would want to do this.

> I cannot find a proper way to deal with this as a
> case insensitive search through the inventory could yield more than one
> result.
> 
> Looking through the stonith command source (Mercurial HEAD), I found the
> following in stonith.c (l. 456) : 
> 
>   g_strdown(nodecopy);
> 
> Is there a reason for this ?

I suppose Dejan will accept a patch making this configurable.
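
To illustrate why the lowercasing is a problem for such an inventory, here
is a small shell sketch. The helpers to_lower and names_collide are
hypothetical; the lowercasing is only similar in effect to what g_strdown()
does to nodecopy:

```shell
#!/bin/sh
# Lowercase a name, similar in effect to g_strdown() on nodecopy.
to_lower() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# True if two distinct names become identical once lowercased.
names_collide() {
    [ "$1" != "$2" ] && [ "$(to_lower "$1")" = "$(to_lower "$2")" ]
}
```

Two VMs named, say, "WebVM01" and "webvm01" would collide under this check,
which is exactly the case a case-insensitive inventory search cannot
resolve.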

Cheers,
Florian

-- 
Need help with fencing?
http://www.hastexo.com/now


Re: [Linux-ha-dev] [Linux-HA] [ha-wg] CFP: HA Mini-Conference in Prague on Oct 25th

2011-10-07 Thread Florian Haas
Hi everyone,

I can finally respond to this and do want to take the opportunity to
apologize for my silence-for-obvious-reasons over the past month.

On 2011-10-07 00:05, Andrew Beekhof wrote:
> On Thu, Oct 6, 2011 at 1:53 AM, Lars Marowsky-Bree wrote:
>> On 2011-10-03T11:10:13, Andrew Beekhof wrote:
>>
>>> Based on Boston last year, I imagine the conversations will last right
>>> up until Lars starts presenting his talk on Friday afternoon.
>>> People came and went at random, and if someone essential was missing
>>> for a conversation we deferred it until later.
>>
>> Oh, then we're going to not stop, ever - because I don't have a talk at
>> the main conference this time ;-)
> 
> The schedule has you in a friday afternoon slot iirc.

That one is actually the talk by Madison and yours truly.

Regarding my own plans: when the originally planned miniconf was
canceled, I committed to speaking at Percona Live in London that same
week, so I will be late for Linuxcon and arrive in the late evening of
the 26th. I'm completely open for Boston-style round table sessions all
day on the 27th, and also in the morning and evening of the 28th. Minus
talk prep with Madison, of course. I am also not heading back down to
Vienna before the early afternoon of Saturday the 29th, so if anyone has
plans to do something interesting that Saturday morning I'd be more than
happy to join.

Cheers,
Florian



Re: [Linux-ha-dev] Postfix status

2011-09-08 Thread Florian Haas
On 09/08/11 10:34, Raoul Bhatia [IPAX] wrote:
> On 09/08/2011 04:49 AM, renayama19661...@ybb.ne.jp wrote:
>>   do not apply a patch even if you apply this patch, there is not the big 
>> problem.
>> I am lacking in my explanation, and I'm sorry.
> 
> ok. i just updated my pull request.
> 
> https://github.com/ClusterLabs/resource-agents/pull/20
> 
> dejan, can you please review and apply our patches?

Taking the liberty to step in for Dejan, I've merged and pushed your
changes. Thanks for your contribution!

Cheers,
Florian





Re: [Linux-ha-dev] pacemaker - migrate RA, based on the state of other RA, w/o clone?

2011-07-14 Thread Florian Haas
On 2011-07-14 12:55, RNZ wrote:
> 
> 
> On Thu, Jul 14, 2011 at 2:02 PM, Florian Haas wrote:
> 
>> On 2011-07-14 08:46, RNZ wrote:
>>> No, I want and I need - multi-master scheme (more than two nodes)...
>>
>> There is nothing in Pacemaker's master/slave scheme that restricts you
>> to a single master. The ocf:linbit:drbd resource agent, for example, is
>> configurable in dual-Master mode.
>>
>> Once the resource agent properly implements the functionality (the hard
>> part), configuring a multi-master master/slave set is simply a question
>> of setting the master-max meta parameter to a value greater than 1 (the
>> easy part).
> 
> I don't think so... The CouchDB RESTful API very easily allows running
> replication by the following scheme:

It's entirely possible that the couchdb native API may be more powerful
in specific regards, but if you want to put it into a Pacemaker cluster
you may have to occasionally accept some minor limitations. That's a
tradeoff which is present for all Pacemaker managed applications.

> primitive cdb0 
> hostA: hostB:dbB > localhost:dbB
> hostA: hostC:dbC > localhost:dbC
> hostA: hostD:dbD > localhost:dbD
> primitive cdb1
> hostB: hostA:dbB > localhost:dbB
> primitive cdb2
> hostC: hostA:dbC > localhost:dbC
> 
> In this scheme hostA used as master for hostB and hostC (master-master)
> and as slave for hostD (slave-master). Both (master-master and
> slave-master for different servers/databases) scheme per one instance.

So you mean there would be a cascading replication, like so:

 hostD
   |
 hostA
 /   \
 hostB   hostC

Such a thing is not something Pacemaker caters for specifically, but I
dare say it doesn't need to, either. You would simply create one
master/slave set where D is master and A is slave, and another where A
is master and B and C are slaves.

>> By the way, is there any specific reason you are contributing under a
>> pseudonym? It's highly unusual in this community to do so.
> 
> 
> Sorry, habit... My real name Alibek.Amaev, alibek.am...@gmail.com or
> alibe...@gmail.com

Pleased to meet you Alibek, welcome to the tribe. :)

Cheers,
Florian





Re: [Linux-ha-dev] pacemaker - migrate RA, based on the state of other RA, w/o clone?

2011-07-14 Thread Florian Haas
On 2011-07-14 08:46, RNZ wrote:
> No, I want and I need - multi-master scheme (more than two nodes)...

There is nothing in Pacemaker's master/slave scheme that restricts you
to a single master. The ocf:linbit:drbd resource agent, for example, is
configurable in dual-Master mode.

Once the resource agent properly implements the functionality (the hard
part), configuring a multi-master master/slave set is simply a question
of setting the master-max meta parameter to a value greater than 1 (the
easy part).
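
As a configuration sketch in crm shell syntax, the easy part could look
like the fragment below. The resource names and the ocf:heartbeat:couchdb
agent are hypothetical (no such agent is assumed to exist), and the values
are only an example for a three-node cluster:

```
primitive p_couchdb ocf:heartbeat:couchdb
ms ms_couchdb p_couchdb \
  meta master-max="3" clone-max="3"
```

Here master-max caps how many instances may hold the Master role at once,
and clone-max caps the total number of instances.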

By the way, is there any specific reason you are contributing under a
pseudonym? It's highly unusual in this community to do so.

Florian





[Linux-ha-dev] [GIT PULL] IPaddr2 cleanup

2011-07-06 Thread Florian Haas
Hi everyone,

please review the following changes since commit
dca8808361fc2d130e44d3e1ebc1c5ff38fbf9ac:

  Low: Route: insert paragraph breaks in longdesc (2011-07-04 14:32:17 +0200)

in the git repository at:
  git://github.com/fghaas/resource-agents ipaddr2-fixes

These should not introduce any functional changes, just a general
cleanup and streamlining along current best practices, hopefully making
the RA easier to maintain.

The updated RA passes all ocft tests.

Everyone's input is much appreciated. Thanks!

Cheers,
Florian

Florian Haas (7):
  Low: IPaddr2: improve add_interface function
  Low: IPaddr2: sanitize defaults initialization
  Low: IPaddr2: open-code references to resource parameters
  Low: IPaddr2: use ocf_is_true when evaluating lvs_support parameter
  Low: IPaddr2: use ocf_is_true when evaluating arp_bg parameter
  Low: IPaddr2: remove falsely advertised default
  Low: ocft: remove "InstallPackage" line for IPaddr2

 heartbeat/IPaddr2  |  146 ++--
 tools/ocft/IPaddr2 |    1 -
 2 files changed, 73 insertions(+), 74 deletions(-)






Re: [Linux-ha-dev] regressions in resource-agents 3.9.1

2011-06-22 Thread Florian Haas
On 2011-06-22 11:48, Dejan Muhamedagic wrote:
> Hello all,
> 
> Unfortunately, it turned out that there were two regressions in
> the 3.9.1 release:
> 
> - iscsi on platforms which run open-iscsi 2.0-872 (see
>   http://developerbugs.linux-foundation.org/show_bug.cgi?id=2562)
> 
> - pgsql probes with shared storage (iirc), see
>   http://marc.info/?l=linux-ha&m=130858569405820&w=2
> 
> Thanks to Vadym Chepkov for finding and reporting them.
> 
> I'd suggest to make a quick fix release 3.9.2.
> 
> Opinions?

Agree.

Cheers,
Florian






Re: [Linux-ha-dev] Uniquness OCF Parameters

2011-06-16 Thread Florian Haas
On 2011-06-16 10:51, Lars Ellenberg wrote:
> On Thu, Jun 16, 2011 at 09:48:20AM +0200, Florian Haas wrote:
>> On 2011-06-16 09:03, Lars Ellenberg wrote:
>>> With the current "unique=true/false", you cannot express that.
>>
>> Thanks. You learn something every day. :)
> 
> Sorry that I left off the "As you are well aware of,"
> introductory phrase. ;-)

In case that wasn't clear earlier, I was very much not aware of this. I
wasn't being ironic, for a change. :)

>>> Question is: do we really want or need that.
>>
>> That is a discussion for the updated OCF RA spec discussion, really. And
>> the driver of that discussion is currently submerged. :)
> 
> I guess this was @LMB?
> Hey there ... do you read? :)

He is on a diving vacation in Croatia. Not only was I not being ironic;
I referred to his literal submersion.

Cheers,
Florian





Re: [Linux-ha-dev] Uniquness OCF Parameters

2011-06-16 Thread Florian Haas
On 2011-06-16 09:03, Lars Ellenberg wrote:
> With the current "unique=true/false", you cannot express that.

Thanks. You learn something every day. :)

> Depending on what we choose the meaning to be,
> parameters marked "unique=true" would be required to
>   either be all _independently_ unique,
>   or be unique as a tuple.
> 
> If we want to be able to express both, we need a different markup.
> 
> Of course, we can move the markup out of the parameter description,
> into an additional markup, that spells them out,
> like .
> 
> But using unique=0 as the current non-unique meaning, then
> unique=, would
> name the scope for this uniqueness requirement,
> where parameters marked with the same such label
> would form a unique tuple.
> Enables us to mark multiple tuples, and individual parameters,
> at the same time.
> 
> Question is: do we really want or need that.

That is a discussion for the updated OCF RA spec discussion, really. And
the driver of that discussion is currently submerged. :)

Florian





Re: [Linux-ha-dev] Uniquness OCF Parameters

2011-06-15 Thread Florian Haas
On 2011-06-15 15:50, Alan Robertson wrote:
> On 06/14/2011 06:03 AM, Florian Haas wrote:
>> On 2011-06-14 13:08, Dejan Muhamedagic wrote:
>>> Hi Alan,
>>>
>>> On Mon, Jun 13, 2011 at 10:32:02AM -0600, Alan Robertson wrote:
>>>> On 06/13/2011 04:12 AM, Simon Talbot wrote:
>>>>> A couple of observations (I am sure there are more) on the uniqueness
>>>>> flag for OCF script parameters:
>>>>>
>>>>> Would it be wise for the index parameter of the SFEX ocf script
>>>>> to have its unique flag set to 1 so that the crm tool (and others) would
>>>>> warn if one inadvertently tried to create two SFEX resource primitives
>>>>> with the same index?
>>>>>
>>>>> Also, an example of the opposite, the Stonith/IPMI script, has parameters
>>>>> such as interface, username and password with their unique flags set to
>>>>> 1, causing erroneous warnings if you use the same interface, username or
>>>>> password for multiple IPMI stonith primitives, which of course is often
>>>>> the case in large clusters?
>>>>>
>>>> When we designed it, we intended that Unique applies to the complete set
>>>> of parameters - not to individual parameters.  It's like a multi-part
>>>> unique key.  It takes all 3 to create a unique instance (for the example
>>>> you gave).
>>> That makes sense.
>> Does it really? Then what would be the point of having some params that
>> are unique, and some that are not? Or would the tuple of _all_
>> parameters marked as unique be considered unique?
>>
> I don't know what you think I said, but A multi-part key to a database 
> is a tuple which consists of all marked parameters.  You just said what 
> I said in a different way.
> 
> So we agree.

Jfyi, I was asking a question, not stating an opinion. Hence the use of
a question mark.

So then, if the uniqueness should be enforced for a "unique key" that is
comprised of _all_ the parameters marked unique in a parameter set, then
what would be the correct way to express required uniqueness of
_individual_ parameters?

In other words, if I have foo and bar marked unique, then one resource
with foo=1 and bar=2, and another with foo=1, bar=3 does not violate the
uniqueness constraint. What if I want both foo and bar to be unique in
and of themselves, so any duplicate use of foo=1 should be treated as a
uniqueness violation?
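
The difference between the two readings can be made concrete with a small
shell sketch. It is purely illustrative and unrelated to how crm or the CRM
actually validates parameters. The function names and the three-column
input format are assumptions made for this example:

```shell
#!/bin/sh
# Input on stdin: one resource per line as "<name> <foo> <bar>".
# Both functions print the names of resources that violate uniqueness.

# Tuple semantics: only an identical (foo, bar) pair is a violation.
tuple_violations() {
    awk '{ key = $2 SUBSEP $3; if (seen[key]++) print $1 }'
}

# Individual semantics: reusing either foo or bar alone is a violation.
individual_violations() {
    awk '{ f = seen_foo[$2]++; b = seen_bar[$3]++; if (f || b) print $1 }'
}
```

For two resources with foo=1, bar=2 and foo=1, bar=3, tuple_violations
reports nothing, while individual_violations reports the second resource
because foo=1 is reused.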

Florian





Re: [Linux-ha-dev] Patch for pgsql

2011-06-15 Thread Florian Haas
On 2011-06-15 14:26, Serge Dubrouski wrote:
> I screwed up with git so here is the patch attached.

Merged. I took the liberty to split this into two patches, and drop the
spelling fix because the original spelling is actually correct. :)

https://github.com/ClusterLabs/resource-agents/commit/f64c77a61ca4794ee636801b2447a2c1a6c531ce
https://github.com/ClusterLabs/resource-agents/commit/2dd56104687b38006ae41d3aa033cc6f1cc41509

Thanks for the fixes.

Cheers,
Florian





Re: [Linux-ha-dev] Patch for pgsql

2011-06-15 Thread Florian Haas
On 2011-06-15 14:26, Serge Dubrouski wrote:
> I screwed up with git so here is the patch attached.

Nice, thanks. Is the pgsql ocft test case OK as it is in the repo?

Cheers,
Florian





Re: [Linux-ha-dev] Uniquness OCF Parameters

2011-06-14 Thread Florian Haas
On 2011-06-14 13:08, Dejan Muhamedagic wrote:
> Hi Alan,
> 
> On Mon, Jun 13, 2011 at 10:32:02AM -0600, Alan Robertson wrote:
>> On 06/13/2011 04:12 AM, Simon Talbot wrote:
>>> A couple of observations (I am sure there are more) on the uniqueness flag
>>> for OCF script parameters:
>>>
>>> Would it be wise for the index parameter of the SFEX ocf script to
>>> have its unique flag set to 1 so that the crm tool (and others) would warn
>>> if one inadvertently tried to create two SFEX resource primitives with the
>>> same index?
>>>
>>> Also, an example of the opposite, the Stonith/IPMI script, has parameters
>>> such as interface, username and password with their unique flags set to 1,
>>> causing erroneous warnings if you use the same interface, username or
>>> password for multiple IPMI stonith primitives, which of course is often the
>>> case in large clusters?
>>>
>>
>> When we designed it, we intended that Unique applies to the complete set 
>> of parameters - not to individual parameters.  It's like a multi-part 
>> unique key.  It takes all 3 to create a unique instance (for the example 
>> you gave).
> 
> That makes sense. 

Does it really? Then what would be the point of having some params that
are unique, and some that are not? Or would the tuple of _all_
parameters marked as unique be considered unique?

Florian





Re: [Linux-ha-dev] [ha-wg-technical] resource agents 3.9.1rc1 release

2011-06-08 Thread Florian Haas
On 06/08/2011 03:06 AM, Dejan Muhamedagic wrote:
> Hi,
> 
> On Wed, Jun 08, 2011 at 10:50:17AM +0200, Fabio M. Di Nitto wrote:
>> Hi,
>>
>> On 6/8/2011 10:16 AM, Keisuke MORI wrote:
>>> Hi,
>>>
>>> Thank you for all your efforts for the new release.
>>>
>>>
>>> 2011/6/7 Fabio M. Di Nitto:
>>>> Several changes have been made to the build system and the spec file to
>>>> accommodate both projects' needs. The most noticeable change is the
>>>> option to select "all", "linux-ha" or "rgmanager" resource agents at
>>>> configuration time, which will also set the default for the
>>>> spec file.
>>>
>>> Why is the ldirectord package disabled on RHEL environment?
>>> I would expect that it would be built as same as (linux-ha)
>>> resource-agents-1.0.4
>>> so that we can use the upcoming 3.9.1 as the upgrade version.
>>
>> Because ldirectord requires libnet to build and libnet is not available
>> on default RHEL (unless you explicitly enable EPEL).
>>
>> Florian, last time we spoke, we were trying to avoid adding BR on
>> packages that are not part of RHEL, but then to build linux-ha agents we
>> need cluster-glue* that are not part of RHEL anyway.
>>
>> We should be consistent here.
>>
>> I am ok to allow people to build ldirectord.
>>
>>>
>>> We still use the resource-agents/ldirectord on many RHEL systems and
>>> if it was missing
>>> we can not upgrade them anymore.
>>
>> Understood, we are still smoothing a few corners after the merge. It´s
>> good people are spotting those bits.
>>
>>>
>>>
>>>> NOTE: About the 3.9.x version (particularly for linux-ha folks): This
>>>> version was chosen simply because the rgmanager set was already at
>>>> 3.1.x. In order to make it easier for distribution, and to keep package
>>>> upgrades linear, we decided to bump the number higher than both
>>>> projects. There is no other special meaning associated with it.
>>>>
>>>> The final 3.9.1 release will take place soon.
>>>
>>> BTW why not 4.0? :)
>>> just curious though.
>>
>> There is really nothing major in this release vs 1.0.4 for linux-ha and
>> 3.1.x for rgmanager agents, other than co-exist in the same tree.
> 
> Actually, while looking at it, I'd also like something else
> rather than 3.9.x. Can't put my finger on what's exactly the
> issue, but something like 4.0 would somehow look better. Is it
> only me?
> 
>> We will probably use 4.0 to introduce the new OCF standard and the new
>> common clusterlabs/ provider and mark effectively the introduction of
>> new features.
> 
> 4.1?

I realize I'm bikeshedding, but my preference would be for 3.9 for this
one, and 4.0 to implement the new standard. Like Fabio originally suggested.

Cheers,
Florian





Re: [Linux-ha-dev] [ha-wg-technical] resource agents 3.9.1rc1 release

2011-06-08 Thread Florian Haas
On 06/08/2011 02:50 AM, Fabio M. Di Nitto wrote:
> Hi,
> 
> On 6/8/2011 10:16 AM, Keisuke MORI wrote:
>> Hi,
>>
>> Thank you for all your efforts for the new release.
>>
>>
>> 2011/6/7 Fabio M. Di Nitto :
>>> Several changes have been made to the build system and the spec file to
>>> accommodate both projects´ needs. The most noticeable change is the
>>> option to select "all", "linux-ha" or "rgmanager" resource agents at
>>> configuration time, which will also set the default for the
>>> spec file.
>>
>> Why is the ldirectord package disabled on RHEL environment?
>> I would expect that it would be built as same as (linux-ha)
>> resource-agents-1.0.4
>> so that we can use the upcoming 3.9.1 as the upgrade version.
> 
> Because ldirectord requires libnet to build and libnet is not available
> on default RHEL (unless you explicitly enable EPEL).
> 
> Florian, last time we spoke, we were trying to avoid adding BR on
> packages that are not part of RHEL, but then to build linux-ha agents we
> need cluster-glue* that are not part of RHEL anyway.
> 
> We should be consistent here.
> 
> I am ok to allow people to build ldirectord.

No objection.

>> We still use the resource-agents/ldirectord on many RHEL systems and
>> if it was missing
>> we can not upgrade them anymore.
> 
> Understood, we are still smoothing a few corners after the merge. It´s
> good people are spotting those bits.
> 
>>
>>
>>> NOTE: About the 3.9.x version (particularly for linux-ha folks): This
>>> version was chosen simply because the rgmanager set was already at
>>> 3.1.x. In order to make it easier for distribution, and to keep package
>>> upgrades linear, we decided to bump the number higher than both
>>> projects. There is no other special meaning associated with it.
>>>
>>> The final 3.9.1 release will take place soon.
>>
>> BTW why not 4.0? :)
>> just curious though.
> 
> There is really nothing major in this release vs 1.0.4 for linux-ha and
> 3.1.x for rgmanager agents, other than co-exist in the same tree.
> 
> We will probably use 4.0 to introduce the new OCF standard and the new
> common clusterlabs/ provider and mark effectively the introduction of
> new features.

I'd agree.

Lars, it's now June, we have our final resource-agents release before we
start actually merging functional code (as opposed to build systems),
and we still don't even have coverage for deprecation in the OCF RA
spec. Can I ask you to please either start working on that spec update
or give up this task so someone else can pick it up?

Florian





Re: [Linux-ha-dev] MySQL 5.5 no longer supports CHANGE MASTER TO MASTER_HOST=''

2011-06-03 Thread Florian Haas
On 2011-06-03 12:51, Ben Mildren wrote:
> Hi all
> 
> I've submitted a patch for the mysql resource agent.  It currently will
> error on MySQL 5.5+ as CHANGE MASTER TO MASTER_HOST='' has been deprecated.
> 
> Within the unset_master function, I've supplied a dummy value as the
> host name to ensure replication would fail to start if restarted
> erroneously before the mysql service is restarted, and issued a RESET
> SLAVE to remove the replication metadata after a restart. 
> 
> I've altered the is_slave function to add an extra check for the dummy
> host name within the SHOW SLAVE STATUS output, as there is no longer a
> way to stop SHOW SLAVE STATUS returning a resultset before the mysql
> instance is restarted.

Thanks Ben. Marek, thoughts on this? The pull request diff is here:

https://github.com/ClusterLabs/resource-agents/pull/9/files

Cheers,
Florian





[Linux-ha-dev] Postfix status (was Re: state of heartbeat resource agents)

2011-06-03 Thread Florian Haas
> Hi All,
> 
> We found a problem in the resource agent of postfix.

*Please* don't reply to an old thread if you mean to start a new one,
hijacking threads just confuses everyone.

> 
> The resource agent of postfix carries out /usr/sbin/postfix in status 
> parameter, but this is not available in old postfix.

I believe this has been addressed in the latest patch set that was
merged a couple of days ago; please try to reproduce the problem with
the postfix RA from upstream git before you start working on your own
patch. Thanks.

Cheers,
Florian





Re: [Linux-ha-dev] [PATCH] Low: fio: add missing log level

2011-06-01 Thread Florian Haas
On 2011-06-01 16:26, r.bha...@ipax.at wrote:
> From: Raoul Bhatia 
> 
> ---
>  heartbeat/fio |6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)

Merged, thanks.

Cheers,
Florian





Re: [Linux-ha-dev] changelog for resource agents 3.9.x

2011-06-01 Thread Florian Haas
On 2011-06-01 15:33, Raoul Bhatia [IPAX] wrote:
> hello dejan!
> 
> On 06/01/2011 02:19 PM, Dejan Muhamedagic wrote:
>> Is it this pull request:
>>
>> https://github.com/ClusterLabs/resource-agents/pull/6
> yes
> 
>> What about Florian's comment?
> have been addressed.
> 
>> Also, I cannot merge anything from there because commit lines
>> don't specify the agent, just the level. For instance:
>>
>> Low: inform user that postfix stopped
>> ...
>> Low: at present, OCF_CHECK_LEVEL is not evaluated
>>
>> Can you please address this too.
> 
> sure. how do i best update the changelog in this regard?

git rebase -i 

The "reword" option in the interactive editor allows you to rephrase
your commit messages.
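
As a rough illustration of what to look for while rewording: the subjects Dejan quotes carry only the level, whereas a subject such as "Low: fio: add missing log level" also names the agent. The sketch below checks for that shape. The helper names has_agent_prefix and check_subject and the exact pattern are assumptions made for this example, not part of any project tooling.

```shell
# Hypothetical helper: succeeds if a commit subject names the agent after the
# level, e.g. "Low: postfix: inform user that postfix stopped".
has_agent_prefix() {
    printf '%s\n' "$1" | grep -Eq '^[A-Z][A-Za-z]*: [A-Za-z0-9_.-]+: .+'
}

# Report subjects that would still need the "reword" action in "git rebase -i".
check_subject() {
    if has_agent_prefix "$1"; then
        echo "ok: $1"
    else
        echo "needs reword: $1"
    fi
}

check_subject "Low: inform user that postfix stopped"
check_subject "Low: postfix: inform user that postfix stopped"
```

The first call reports a subject that needs rewording, the second one that is acceptable.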

Cheers,
Florian





[Linux-ha-dev] mysql RA fixes merged

2011-05-30 Thread Florian Haas
Hello,

I've merged and pushed a number of fixes to master/slave replication in
the mysql RA, contributed by Marek Marczykowski. I've deliberately left
out Raoul Bhatia's retab patch, though; those "janitor" patches
usually make debugging harder if we do run into issues. We can always
merge that patch later.

I've also fixed the commit messages to be prefixed with "mysql". Marek,
could you please rebase your github repo against current upstream?
Thanks for the contribution!

Recent commit history is here:

https://github.com/ClusterLabs/resource-agents/commits/master/heartbeat/mysql

Testing credit goes to Raoul and Dejan. Thanks to you too.

Cheers,
Florian





[Linux-ha-dev] lxc RA merged

2011-05-30 Thread Florian Haas
Hello,

after much useful testing from Christoph Mitasch and a number of
necessary changes highlighted by ocf-tester, I've now merged and pushed
the lxc resource agent that was originally contributed by Darren Thompson.

The resource agent is here:

https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/lxc

Its commit history up to this point can be reviewed here:

https://github.com/ClusterLabs/resource-agents/commits/master/heartbeat/lxc

Hope this is useful.

Cheers,
Florian






Re: [Linux-ha-dev] state of heartbeat resource agents

2011-05-24 Thread Florian Haas
On 2011-05-24 13:38, Raoul Bhatia [IPAX] wrote:
> postfix:
> fghaas reviewed my code. i tried to catch him on irc on how to
> progress with his comments.

I tried to catch you that same day; you weren't there. How about falling
back to the mailing list, or using github's line note interface to discuss?

Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 90, Issue 8

2011-05-11 Thread Florian Haas
Darren,

On 2011-05-05 15:07, Florian Haas wrote:
> On 2011-05-05 14:26, Darren Thompson wrote:
>>> Can you confirm that the current version is working for you and passes
>>> ocf-tester on your system?
>>
>> What is an ocf-tester???
> 
> http://www.linux-ha.org/doc/dev-guides/_testing_installing_and_packaging_resource_agents.html
> 
>> I have been testing this "the hard way" by actually creating and running
>> the agents against actual LXC containers in a running cluster... If
>> there is a simple way of streamlining this testing I'd love to hear more
>> about it. (Did I mention that I'm not normally a "coder/developer"? -
>> Yes I know that's getting repetitive ;-) )
>>
>> But, back on topic... I can confirm that this agent is working correctly
>> in a "live fire" environment.
> 
> That's good to know. ocf-tester doesn't shoot blanks either (it operates
> on an actual incarnation of the resource), but it might run some tests
> that you manually do not, so it's always a wise idea to use it.

Any news regarding running ocf-tester on your lxc agent?

Cheers,
Florian





Re: [Linux-ha-dev] New OCF RA: symlink

2011-05-09 Thread Florian Haas
On 2011-05-04 15:19, Florian Haas wrote:
> On 2011-04-20 14:37, Florian Haas wrote:
>> Dominik doesn't have a github repo yet, so I added this to a separate
>> branch in mine. The current revision is here:
>>
>> https://github.com/fghaas/resource-agents/blob/symlink/heartbeat/symlink
>>
>> Please comment freely. Thanks!
> 
> I've just responded to two comments from Lars and Alan and I'd
> appreciate more. As of right now I don't see any show stoppers and I
> wouldn't like to hold up Dominik's contribution much longer.
> 
> Unless I hear any valid objections, I intend to merge this new RA next
> Monday. Thanks!

OK. I believe we did stir up some valuable discussion here, but I
haven't seen anyone identify a real show stopper.

Merged. Thanks Dominik!

https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/symlink

Cheers,
Florian





Re: [Linux-ha-dev] Filesystem ocf file

2011-05-06 Thread Florian Haas
On 2011-05-06 09:58, Darren Thompson wrote:
> Florian
> 
> Ok then... I agree it does seem to be poorly designed and It's far from
> intuitive...
> 
> But If it's actually "correct" who am I to argue...

"Correct" in the sense of "in line with the rules", not in the sense
that it's actually smart. But yeah, it's a bullet we'll have to bite.

Cheers,
Florian





Re: [Linux-ha-dev] Filesystem ocf file

2011-05-06 Thread Florian Haas
On 2011-05-06 09:26, Darren Thompson wrote:
> Team
> 
> I was reviewing some errors on a cluster mounted file-system that caused
> me to review the Filesystem ocf file.
> 
> I notice that it uses an "undeclared" parameter of "OCF_CHECK_LEVEL" to
> determine what degree of testing of the filesystem is required in "monitor"
> 
> I have now updated it to more formally work with a "check_level" value
> with the more obvious values of "mounted, read & write" ( my updated
> version attached )
> 
> Could someone (Florian is this something you can do?) please review this
> with a view to patching the upstream Filesystem ocf file.

NACK, sorry. The OCF_CHECK_LEVEL is specific to the monitor action and
described as such in the OCF spec; this will not be changed without a
change to the spec.

To use it, set "op monitor interval=X OCF_CHECK_LEVEL=Y"

Yes, it's poorly designed, it makes no sense why this is pretty much the
only sensible time to set a parameter specifically for an operation (as
opposed to on a resource), it's inexplicable why it's all caps, etc.,
but that's the way it is.
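
For illustration, here is a minimal sketch of how an agent's monitor code might branch on the variable once the cluster manager has placed it in the environment. Only the name OCF_CHECK_LEVEL comes from the spec; the function monitor_depth and the meaning attached to levels 0, 10 and 20 (basic, read, read/write) are assumptions for this example.

```shell
# Hypothetical monitor dispatcher: prints which checks a given level would run.
monitor_depth() {
    case "${OCF_CHECK_LEVEL:-0}" in
        0)  echo "mounted" ;;               # default when the op sets nothing
        10) echo "mounted read" ;;          # additionally try a read
        20) echo "mounted read write" ;;    # additionally try a write
        *)  echo "unsupported level ${OCF_CHECK_LEVEL}" >&2
            return 1 ;;
    esac
}

# Simulate "op monitor interval=X OCF_CHECK_LEVEL=20" in a subshell so the
# assignment does not leak into the rest of the script.
( OCF_CHECK_LEVEL=20; monitor_depth )
```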

Cheers,
Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 90, Issue 8

2011-05-05 Thread Florian Haas
On 2011-05-05 14:26, Darren Thompson wrote:
>> Can you confirm that the current version is working for you and passes
>> ocf-tester on your system?
> 
> What is an ocf-tester???

http://www.linux-ha.org/doc/dev-guides/_testing_installing_and_packaging_resource_agents.html

> I have been testing this "the hard way" by actually creating and running
> the agents against actual LXC containers in a running cluster... If
> there is a simple way of streamlining this testing I'd love to hear more
> about it. (Did I mention that I'm not normally a "coder/developer"? -
> Yes I know that's getting repetitive ;-) )
> 
> But, back on topic... I can confirm that this agent is working correctly
> in a "live fire" environment.

That's good to know. ocf-tester doesn't shoot blanks either (it operates
on an actual incarnation of the resource), but it might run some tests
that you manually do not, so it's always a wise idea to use it.

Cheers,
Florian





Re: [Linux-ha-dev] [ha-wg] Cluster Stack - Ubuntu Developer Summit

2011-05-05 Thread Florian Haas
On 2011-04-26 19:33, Andres Rodriguez wrote:
> UDS' are open-to-public events, and I believe it would be great if
> upstream could participate and maybe even further the discussion about
> the Cluster Stack. For more information about UDS, please visit [1]. The
> specific date/time for the Cluster Stack session is not yet available.
> 
> If you require any further information please don't hesitate to contact me.

Andres already knows this, but FWIW I'll repost here that I'll be at UDS
in time for the cluster stack session at 12 noon on 5/12. I'll stay in
Budapest that evening and will probably join the Budapest sightseeing
tour that the Hungarian Ubuntu team is organizing, so if anyone wants to
link up with Andres and me for a few beverages please let us know.

Andrew, interested in making a day trip to Budapest while you're still
on this continent?

Cheers,
Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 90, Issue 6

2011-05-05 Thread Florian Haas
Darren,

can you please subscribe to the list as a normal subscriber rather than
to just the digest, so we can keep this discussion in one thread?

On 2011-05-05 04:47, Darren Thompson wrote:
> Florian/Team
> 
> There was an error in the GIT-Hub version that was causing my re-base
> attempts to fail, so I was forced to try to bring my "last known good"
> version to the same configuration (mostly successful).
> 
> I have since found the error in the GIT-Hub version (the initialisation
> section was wrong, the meta-data error was a 'red herring') so have been
> found and resolved so I have done an actual re-base now based on the
> GIT-Hub version.
> 
> Changes:
> 
> 1. Corrected error in utilisation causing ocf to fail in HB_GUI.

That is not an error; the Github version is correct. The path to the
ocf-shellfuncs library was recently changed upstream; your installed
version is apparently still using the old path. For the Github version
to work on your system, you will have to apply the attached patch after
you check out.

Note that normally people would be building the whole resource-agents
package from a git checkout and use _that_ on their test system, but
you're not using git, so that option is out for you. Have I mentioned
that starting to use git would be a good option?

> 2. Added "information"  to stop  section, to provide more feedback on
> container shutdown/stop (and to assist with future development of
> "containers using alternate 'init' systems").

Applied and pushed to my lxc branch.

Can you confirm that the current version is working for you and passes
ocf-tester on your system?

Cheers,
Florian
diff --git a/heartbeat/lxc b/heartbeat/lxc
index 3b0df91..07e0026 100755
--- a/heartbeat/lxc
+++ b/heartbeat/lxc
@@ -34,8 +34,8 @@
 #   OCF_RESKEY_config
 
 # Initialization:
-: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
-. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs
+: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/resource.d/heartbeat}
+. ${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs
 
 # Set default TRANS_RES_STATE (temporary file to "flag" if resource was stated but not stopped)
 TRANS_RES_STATE="${HA_RSCTMP}/${OCF_RESOURCE_INSTANCE}.state"
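
The two library locations that appear in the patch could also be probed at run time instead of being patched by hand. This is only a sketch of that idea, not what the upstream agent does; the helper name find_shellfuncs is made up for the example, and the demonstration runs against a throwaway directory rather than a real installation.

```shell
# Hypothetical helper: print the first readable shell-function library found
# under the given OCF root, trying the newer location before the older one.
find_shellfuncs() {
    root=$1
    for candidate in \
        "$root/lib/heartbeat/ocf-shellfuncs" \
        "$root/resource.d/heartbeat/.ocf-shellfuncs"
    do
        if [ -r "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Demonstration tree that only contains the older layout.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/resource.d/heartbeat"
: > "$demo_root/resource.d/heartbeat/.ocf-shellfuncs"
find_shellfuncs "$demo_root"
```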




[Linux-ha-dev] ACLs and privilege escalation (was Re: New OCF RA: symlink)

2011-05-05 Thread Florian Haas
Rather than going into ACLs in more detail, I wanted to highlight that
however we limit access to the CIB, the resource agents still _execute_
as root, so we will always have what would normally be considered a
privilege escalation issue.

Now, we could agree on security guidelines for RAs, and some of those
would certainly be no-brainers to define (such as, don't ever "eval"
unsanitized user input), but I refuse to even suggest to tackle any such
guidelines before the OCF spec update has gotten off the ground.
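
To make the "eval" point concrete, here is a minimal sketch of checking a parameter value against an allow-list of characters before it is used anywhere, instead of passing it through eval. The function name is_safe_name and the permitted character set are assumptions for this example, not an agreed guideline.

```shell
# Hypothetical validator: accept only plain names made of letters, digits,
# dot, underscore and hyphen; reject empty values and everything else.
is_safe_name() {
    case "$1" in
        ""|*[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Stand-in for a value that would arrive as an OCF_RESKEY_* parameter.
user_value='web01; rm -rf /tmp/x'
if is_safe_name "$user_value"; then
    echo "accepted"
else
    echo "rejected"
fi
```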

One such thing that could be added to the spec would be optional meta
variables named "user" and "group", directing the LRM (or any successor)
to execute the RA as that user rather than root. Just an idea.

Cheers,
Florian






Re: [Linux-ha-dev] New OCF RA: symlink

2011-05-04 Thread Florian Haas
On 2011-04-20 14:37, Florian Haas wrote:
> On 2011-04-20 11:41, Dominik Klein wrote:
>> Hi
>>
>> I wrote a new RA that can manage a symlink.
>>
>> Configuration:
>>
>> primitive mylink ocf:heartbeat:symlink \
>>  params link="/tmp/link" target="/tmp/target" \
>>  op monitor interval="15" timeout="15"
>>
>> This will basically
>> ln -s /tmp/target /tmp/link
>>
>> hth
>> Dominik
> 
> Dominik doesn't have a github repo yet, so I added this to a separate
> branch in mine. The current revision is here:
> 
> https://github.com/fghaas/resource-agents/blob/symlink/heartbeat/symlink
> 
> Please comment freely. Thanks!

I've just responded to two comments from Lars and Alan and I'd
appreciate more. As of right now I don't see any show stoppers and I
wouldn't like to hold up Dominik's contribution much longer.

Unless I hear any valid objections, I intend to merge this new RA next
Monday. Thanks!

Cheers,
Florian





Re: [Linux-ha-dev] New OCF RA: symlink

2011-05-04 Thread Florian Haas
On 2011-04-22 14:25, Alan Robertson wrote:
> Drbdlinks was never converted to an OCF RA, that I recall.  It handles 
> cases of needing to restart the logging system when you changed symlnks 
> around - mainly for chroot services.  I've used it for many years.  You 
> can find the source for it here:
>  http://www.tummy.com/Community/software/drbdlinks/
> 
> It's pretty well thought out, and works quite well.  I'd certainly look 
> it over before reinventing the wheel.

AFAICS drbdlinks does some things on its own which in a Pacemaker
cluster would be under Pacemaker control (restarting daemons, for
example). The symlink RA does none of this, it's simple and effective
and ties in quite well with Pacemaker management.

Cheers,
Florian





Re: [Linux-ha-dev] New OCF RA: symlink

2011-05-04 Thread Florian Haas
Coming back to this one, as the discussion seems to have died down.

On 2011-04-20 19:00, Lars Ellenberg wrote:
> Oh, well, thinking about non-roots that may have cibadmin karma,
> they now can configure a resource that will remove /etc/passwd.
> I'm not sure if I like that.
> 
> How about a staged system? Double symlinks?
> Similar to the alternatives system in Debian or others.
> 
> The RA will force a single directory that will contain the indirection
> symlinks, and will sanitize (or force) link names to not contain slashes.
> 
> The real symlinks will point to that indirection symlink, which will
> point to the end location.
> 
> /etc/postfix/main.cf
>-> /var/lib/wherever-indirection-dir/postfix_main.cf <<<===
>   -> /mnt/somewhere/you/want/to/point/to/main.cf
> 
> And <<<=== will be managed by the resource agent.

Considering we have an "anything" resource agent which, well, lets us do
anything, I consider this pointless cluttering of the resource agent
which creates a false sense of security. Thoughts?
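
For reference, the staged layout Lars describes can be sketched in a scratch directory. All paths below are stand-ins created under a temporary directory; nothing here is taken from the symlink RA itself.

```shell
# Throwaway tree standing in for /etc, the indirection dir and the shared mount.
base=$(mktemp -d)
mkdir -p "$base/etc/postfix" "$base/indirection" "$base/mnt"
echo "myhostname = example" > "$base/mnt/main.cf"

# Static link, set up once by the administrator and never touched by the agent.
ln -s "$base/indirection/postfix_main.cf" "$base/etc/postfix/main.cf"

# The only link the resource agent would create on start and remove on stop.
ln -s "$base/mnt/main.cf" "$base/indirection/postfix_main.cf"

# Reading through both hops reaches the real file.
cat "$base/etc/postfix/main.cf"
```

Removing the second link leaves the first one dangling, which is the "stopped" state in this scheme.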

Cheers,
Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 90, Issue 3

2011-05-04 Thread Florian Haas
On 05/04/2011 10:52 AM, Darren Thompson wrote:
> Florian/Team
> 
> I have now updated my "re-based" ocf file to include the "experimental"
> support for upstart and systemd using containers.
> 
> I can confirm that this is still working correctly for containers
> running 'sysv init' and "in theory" should now also work for containers
> using 'upstart' and 'systemd'.
> 
> I'm currently doing a "crash course' in installing containers to use
> these 'init replacments' but have not yet succedded in testing either
> 'upstart' or 'systemd' containers yet.
> 
> If there is anyone with a better understanding of LXC containers and
> one/both of these other 'init systems', please contact me as your
> information/assistance would be invaluable.

OK, updated my git branch. You really want to double check your
"rebasing" method; you're constantly re-introducing things that I've
removed or fixed in earlier commits.

Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 90, Issue 2

2011-05-04 Thread Florian Haas
On 05/04/2011 09:09 AM, Florian Haas wrote:
> On 05/04/2011 08:44 AM, Darren Thompson wrote:
>> Florian
>>
>> I have tried to re-base on your version but it just will not run for me.
>>
>> I keep getting "Failed to parse the metadata of LXC" syntax error line
>> 1, column 1"
>>
>> I've no idea where this error is as it all looks fine...
>>
>> I'll attach my copy and a screen-shot of the error, HELP!!!
> 
> It's probably not a wise approach to "test" with the GUI while you're
> not even sure the resource agent will run. You will have to start the
> script from the command line and see where your error is coming from.

Btw, your patch contains this:

diff --git a/heartbeat/lxc b/heartbeat/lxc
index 25ef6e3..9819a47 100755
--- a/heartbeat/lxc
+++ b/heartbeat/lxc
@@ -123,7 +123,6 @@ LXC_start() {
if ! LXC_monitor ; then
touch $TRANS_RES_STATE
ocf_log info "Starting" ${OCF_RESKEY_container}
-   cd "`dirname ${OCF_RESKEY_config`"
ocf_run ${STARTCMD} || exit $OCF_ERR_GENERIC
else
# If already running, consider start successful

Sorry, that was a typo on my part, should have been cd "`dirname
${OCF_RESKEY_config}`" (with closing brace) of course. Should I correct
the typo, or can we drop that line altogether?
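
If the line stays, one way to make this kind of typo harder to miss is the $(...) form with the expansion quoted. A minimal sketch, using a made-up config path in place of the real parameter value:

```shell
# Stand-in value; in the agent this comes from the "config" resource parameter.
OCF_RESKEY_config="/var/lib/lxc/web01/config"

# $(...) nests more readably than backticks, and the inner quotes keep paths
# containing spaces intact.
config_dir=$(dirname "${OCF_RESKEY_config}")
echo "$config_dir"
```

The agent would then run cd "$config_dir" before starting the container.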

Cheers,
Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 90, Issue 2

2011-05-04 Thread Florian Haas
On 05/04/2011 08:44 AM, Darren Thompson wrote:
> Florian
> 
> I have tried to re-base on your version but it just will not run for me.
> 
> I keep getting "Failed to parse the metadata of LXC" syntax error line
> 1, column 1"
> 
> I've no idea where this error is as it all looks fine...
> 
> I'll attach my copy and a screen-shot of the error, HELP!!!

It's probably not a wise approach to "test" with the GUI while you're
not even sure the resource agent will run. You will have to start the
script from the command line and see where your error is coming from.
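
A sketch of what such a command-line check could look like. A parse error at "line 1, column 1" is often a sign that something other than XML was printed first, so the example looks at the first line of the meta-data output. The stand-in agent below is invented so that the example is self-contained; on a real system you would point the agent variable at the installed lxc script instead.

```shell
# Hypothetical stand-in agent that only implements the meta-data action.
agent=$(mktemp)
cat > "$agent" <<'EOF'
#!/bin/sh
case "$1" in
    meta-data) printf '%s\n' '<?xml version="1.0"?>' '<resource-agent name="demo"/>' ;;
    *) exit 3 ;;
esac
EOF

# Run the agent by hand and capture the very first line it prints,
# including anything it writes to stderr.
first_line=$(sh "$agent" meta-data 2>&1 | head -n 1)
case "$first_line" in
    "<?xml"*) echo "meta-data starts with an XML declaration" ;;
    *)        echo "unexpected first line: $first_line" ;;
esac
```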

Florian






Re: [Linux-ha-dev] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 89, Issue 32

2011-05-03 Thread Florian Haas
On 2011-05-03 09:38, Darren Thompson wrote:
> Florian/Team
> 
> How do I see any latter commits to the GIT repository???

See my other message --
https://github.com/fghaas/resource-agents/commits/lxc/heartbeat/lxc

> Is there a way I can confirm that you have committed my latest
> version/changes?

Yes, review the history.

I've added a bunch of small commits to bring the resource agent in line
with accepted precedent. Please test the current version and highlight
any issues that may arise. You will need to tweak your configuration as
I have renamed your parameters, and changed the semantics of some.

One thing you might want to revisit is your use of the
"${TRANS_RES_STATE}" state file. AIUI you're using this in order to tell
a gracefully stopped container from one that has crashed -- are you sure
LXC doesn't provide a built-in way to do that?
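
To spell out the state-file approach as I understand it, here is a minimal sketch. The function names and the scratch location are assumptions for the example; the agent in this thread keeps its flag under ${HA_RSCTMP} and uses its own status check.

```shell
# Throwaway location for the flag file.
state_file="$(mktemp -d)/demo-container.state"

mark_started() { touch "$state_file"; }   # called when the container is started
mark_stopped() { rm -f "$state_file"; }   # called after a clean stop

# If the flag exists but the container is not running, no clean stop was recorded.
classify() {
    running=$1   # "yes" or "no", as reported by the agent's own status check
    if [ "$running" = "yes" ]; then
        echo "running"
    elif [ -e "$state_file" ]; then
        echo "crashed"
    else
        echo "stopped"
    fi
}

mark_started
classify no
```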

Cheers,
Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 89, Issue 32

2011-05-02 Thread Florian Haas
Hello Darren,

Please get the current version from
https://github.com/fghaas/resource-agents/blob/lxc/heartbeat/lxc, and
also review the commit history at
https://github.com/fghaas/resource-agents/commits/lxc/heartbeat/lxc.

When you send more updates, please do make sure they track the latest
version in my repo. I am doing my best splitting this up into patches as
I can and check them in individually, but the re-introduction of errors
that have already been fixed is not something that gives me thrills. Thanks.

Cheers,
Florian







Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers) - Linux-HA-Dev Digest, Vol 89, Issue 30

2011-04-29 Thread Florian Haas
On 2011-04-29 08:04, Darren Thompson wrote:
> You posted my "first attempt" and not the latest version, is it possible
> to add that one as it addresses some( most hopefully) of the issues you
> identified.

Already there. Been there since yesterday.

https://github.com/fghaas/resource-agents/commit/07827c42494dbec2c011133d9f82e831bc8b2eb6

> There are still some valid points you have raised however, So I'm going
> to try to incorporate them into a "third version".

See how much easier this would be if you actually did this in your own
github repo that we could just pull from?

Cheers,
Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers)

2011-04-28 Thread Florian Haas
On 2011-04-28 10:21, Darren Thompson wrote:
> Florina/TEAM
> 
> Thanks for your input and the link to the guidelines
> 
> I have updated my original ocf file in line with the guidlines, it even
> gave me a few tips on how to do things "better" so was well worth the
> time spent.
> 
> Please find the updated ocf file for LXC contianers as a cluster
> resource attached.
> 
> Since I'm not an actual developer (or even a career coder)

Do you think I am?

> I do not have
> the facility to host my own github fork so would appreciate "someone"
> adopting this and integrating it into their git repository.

OK, I have added this to a separate "lxc" branch in my own github fork.
I'd appreciate if you could at least get yourself an account on github
so you can comment on commit line notes.

I have added my comments to this page:

https://github.com/fghaas/resource-agents/commit/73f80b31f1cee5eff1c2fe2b968f4ea593e8f405


Some of those may have already been addressed in your updated version,
but to keep things simple I've kept my comments to one commit for the
time being.

Florian

PS: We can stop CC'ing the openais list, this is in no way
Corosync/OpenAIS related.






Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers)

2011-04-27 Thread Florian Haas
On 2011-04-27 00:29, Darren Thompson wrote:
> Florian
> 
> All good points.
> 
> Unfortunately I'm not a "programmer", so have no idea how to setup a
> 'git repo' and currently have no facility to host it even if I knew how.

That's the point of github; they provide all that infrastructure for you.

It all boils down to

- go to https://github.com and set up an account
- go to https://github.com/ClusterLabs/resource-agents and click "Fork"
- open a command line and do "git clone" of the URL that the web page
then shows (likely g...@github.com:/resource-agents.git)
- add your resource agent into the heartbeat/ directory
- do "git add heartbeat/lxc"
- do "git commit" with a meaningful commit message
- do "git push" to push the changes to your github repo.
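
The command-line part of the steps above can be sketched roughly as
follows. This is only an illustration: a local bare repository stands in
for the GitHub fork so the commands can be tried without an account, the
file content is a placeholder, and the commit identity is made up. In
practice you would clone the URL GitHub shows for your fork instead.

```shell
set -eu

work=$(mktemp -d)

# Stand-in for your fork on GitHub (assumption: a local bare repo).
git init --quiet --bare "$work/fork.git"

# "git clone" of the URL the fork page shows; here, the local stand-in.
git clone --quiet "$work/fork.git" "$work/resource-agents"
cd "$work/resource-agents"

# Add the resource agent into the heartbeat/ directory (placeholder content).
mkdir -p heartbeat
printf '#!/bin/sh\n# placeholder for the lxc resource agent\n' > heartbeat/lxc

# Stage, commit with a meaningful message, and push to the fork.
git add heartbeat/lxc
git -c user.name="Example Author" -c user.email="author@example.org" \
    commit --quiet -m "Add lxc resource agent"
git push --quiet origin HEAD
```

After the push, the commit is visible in the fork and can be pulled or
commented on from there.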

> I will review the developers guide and as much as possible bring the OCF
> in-line with those recommendations

Yes, please do.

> Why I did not use libvirt-manager LXC containers:
> 1. Frankly I could not get the libvirt integration to work and wasted
> weeks worth of testing trying, if someone more experienced would like to
> get that working, more power to them.
> 2. The libvirt works and acts like a competing fork and does not use any
> of the "normal" lxc tools, I'm not sold that it's the best approach.

OK, fair enough. If we get your resource agent into mergeable shape then
the fact that it may duplicate some VirtualDomain functionality is not a
show stopper.

One other question: have you considered submitting this resource agent
to the lxc folks?

Cheers,
Florian





Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers)

2011-04-26 Thread Florian Haas
Thanks Darren!

Thanks for the contribution! Can I suggest

- we move this discussion to the linux-ha-dev list (where most OCF RA
related discussions and reviews take place);

- you give the RA a makeover following the OCF RA developer's guide
(http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html);

- you set up your own github fork off of
https://github.com/ClusterLabs/resource-agents, and push your RA to that
so we can eventually pull it into the mainline repo?

Also, can you explain what the advantages of your approach are, versus
using libvirt-managed lxc containers which Pacemaker can tie into via
the existing VirtualDomain agent?

Thanks!
Cheers,
Florian





Re: [Linux-ha-dev] Crowd-sourceing request to review RAs for missing input sanitation/missing quotes/missing escapes etc. [Was: New OCF RA: symlink]

2011-04-22 Thread Florian Haas
On 2011-04-22 10:57, Lars Ellenberg wrote:
> On Thu, Apr 21, 2011 at 03:19:10PM +0200, Florian Haas wrote:
>> On 2011-04-20 19:00, Lars Ellenberg wrote:
>>> On Wed, Apr 20, 2011 at 06:49:48PM +0200, Lars Ellenberg wrote:
>>> [a lot]
>>>
>>> I know I'm paranoid.
>>> Am I too paranoid?
>>
>> Patches welcome.
> 
> That phrase does work as reply to everything
> you don't want to hear about ;-)
> 
> Just because we probably have resource agents in tree
> that don't do proper input sanitation,
> and some of them may even do things like eval,
> or forget to quote parameters that need to be quoted ...

As for symlink, point taken.

https://github.com/fghaas/resource-agents/commit/0fe17c1188e5228012427ccc17d7a79af40f8b31

I also rebased my branch against current upstream.
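
To illustrate the kind of quoting problem Lars is describing (this is a
generic sketch, not the content of the commit above): a parameter value
containing a space is only handled correctly when the variable is quoted.
The name OCF_RESKEY_link follows the usual OCF parameter convention; the
paths are made up.

```shell
set -eu

dir=$(mktemp -d)
touch "$dir/target"

# A parameter value with a space in it (hypothetical path).
OCF_RESKEY_link="$dir/my link"
ln -s "$dir/target" "$OCF_RESKEY_link"

# Unquoted, [ -L $OCF_RESKEY_link ] would be split into two words and
# fail with a syntax error. Quoted, the test sees exactly one path.
if [ -L "$OCF_RESKEY_link" ]; then
    echo "link present"
fi
```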

Florian






Re: [Linux-ha-dev] New OCF RA: symlink

2011-04-21 Thread Florian Haas
On 2011-04-20 19:00, Lars Ellenberg wrote:
> On Wed, Apr 20, 2011 at 06:49:48PM +0200, Lars Ellenberg wrote:
> [a lot]
>
> I know I'm paranoid.
> Am I too paranoid?

Patches welcome.

Cheers,
Florian





Re: [Linux-ha-dev] New OCF RA: symlink

2011-04-20 Thread Florian Haas
On 2011-04-20 11:41, Dominik Klein wrote:
> Hi
> 
> I wrote a new RA that can manage a symlink.
> 
> Configuration:
> 
> primitive mylink ocf:heartbeat:symlink \
>   params link="/tmp/link" target="/tmp/target" \
>   op monitor interval="15" timeout="15"
> 
> This will basically
> ln -s /tmp/target /tmp/link
> 
> hth
> Dominik

Dominik doesn't have a github repo yet, so I added this to a separate
branch in mine. The current revision is here:

https://github.com/fghaas/resource-agents/blob/symlink/heartbeat/symlink

Please comment freely. Thanks!
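
For readers who have not looked at the branch: the core of what such an
agent does can be sketched as below. This is not Dominik's code, just a
minimal illustration of the start/monitor/stop idea under the assumption
that the agent only creates, checks and removes the link. The function
names and temporary paths are made up, and a real agent would use the
OCF return codes and shell functions.

```shell
set -eu

dir=$(mktemp -d)
link="$dir/link"
target="$dir/target"
touch "$target"

# start: create the symlink (the "ln -s target link" from the description)
symlink_start() { ln -s "$target" "$link"; }

# monitor: succeed only if the link exists and points at the target
symlink_monitor() { [ -L "$link" ] && [ "$(readlink "$link")" = "$target" ]; }

# stop: remove the link, leaving the target alone
symlink_stop() { rm -f "$link"; }

symlink_start
symlink_monitor && echo "running"
symlink_stop
symlink_monitor || echo "stopped"
```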

Cheers,
Florian





Re: [Linux-ha-dev] [GIT PULL] ldirectord

2011-04-19 Thread Florian Haas
On 2011-04-19 02:09, Simon Horman wrote:
> Hi Florian,
> 
> please consider pulling
> git://github.com/horms/resource-agents.git master
> to get the following bug fix for ldirectord by Takeuchi-san.
> 
> Sohgo Takeuchi (1):
>   fix a bug that IPv6 does not work fine.
> 
>  ldirectord/ldirectord.in |   16 +---
>  1 files changed, 13 insertions(+), 3 deletions(-)

Applied with a slightly modified commit message. Please do not forget to
rebase your tree on current upstream. Thanks!

Cheers,
Florian





Re: [Linux-ha-dev] Linux-HA Wiki

2011-04-15 Thread Florian Haas
On 2011-04-14 10:41, Ulf wrote:
> Hi Florian,
> 
> I've seen that you are editing some of the http://www.linux-ha.org/wiki/ 
> pages.
> In http://linux-ha.org/doc/man-pages/re-ra-sfex.html is a link to 
> http://www.linux-ha.org/wiki/sfex_(resource_agent) , which doesn't exist.
> 
> It would be very useful if you can migrate the very good documentation of 
> sfex from http://linux-ha.org/sfex to the wiki.

It would be even more useful if *you* could do that, and also update the
configuration snippet to Pacemaker crm shell syntax. :) I've created an
account for you on the Linux-HA wiki, and you should have received a
password by email. Thanks for picking this up!

Cheers,
Florian





Re: [Linux-ha-dev] mysql m/s rapid failover problem

2011-04-14 Thread Florian Haas
On 2011-04-14 18:07, Raoul Bhatia [IPAX] wrote:
> On 04/13/2011 01:18 PM, Florian Haas wrote:
>> On 2011-04-13 12:54, Marek Marczykowski wrote:
>>> On 04/13/11 09:17, Florian Haas wrote:
>>>> Marek, have you considered setting up a personal fork of the
>>>> ClusterLabs/resource-agents repo where you could keep track of your
>>>> progress and which upstream could pull from?
>>>
>>> Good idea :)
>>>
>>> Pushed here:
>>> https://github.com/marmarek/resource-agents
>>
>> Raoul, inclined to give Marek's current version of mysql a spin?
> 
> first try:
> https://github.com/marmarek/resource-agents/commit/ba7ab1d7012259be70c02cc26cbbc7313aa753d7#L0R857

See how easy that was? :)

Cheers,
Florian





Re: [Linux-ha-dev] mysql m/s rapid failover problem

2011-04-14 Thread Florian Haas
On 2011-04-14 17:23, Raoul Bhatia [IPAX] wrote:
> On 04/14/2011 04:15 PM, Raoul Bhatia [IPAX] wrote:
>> On 04/13/2011 02:46 PM, Florian Haas wrote:
>>>> 2. shouldn't line 251 be removed. it reads:
>>>>> On M/S setup --skip-slave-start is needed (or in config file).
>>>> but --skip-slave-start is enforced on line 791.
>>>
>>> The "line notes" feature on github is actually remarkably useful for
>>> comments like this one. :)
>>
>> so you suggest i add a comment like this via github so a dev can fix
>> this, correct? (i just started to actually *work* with github yesterday)
> 
> mhm - i can only line-comment on my own fork, right?
> 

Don't think so ... I've been able to comment on Evgeny's commits (nif)
without problems.

Cheers,
Florian





Re: [Linux-ha-dev] mysql m/s rapid failover problem

2011-04-13 Thread Florian Haas
On 2011-04-13 13:56, Raoul Bhatia [IPAX] wrote:
> hi,
> 
> i'll take a look in the next days.
> 
> as i'm running on debian squeeze with glue 1.0.6-1 and
> cluster-agents 1:1.0.3-3.1, i will have to revert cs
> 322b7fc587ea722a25e099f7a62cfafa01851394 where
> OCF_FUNCTIONS_DIR changed.
> 
> 
> i currently run [1] and things seem to be stable.
> 
> 
> things i noticed:
> 
> 1. the ra i'm running does not catch "connection refused" errors.
> c/p from the replication error:
> 
>> Last_IO_Errno: 2013
>> Last_IO_Error: error connecting to master 
>> 'mysql_rep@wdb01:3306' - retry-time: 60  retries: 86400
> 
> i run into this error because of a firewall misconfiguration.

Well, if the firewall just dropped the packets (as opposed to returning a
TCP RST packet), then no "connection refused" error would be expected.

> 2. shouldn't line 251 be removed. it reads:
>> On M/S setup --skip-slave-start is needed (or in config file).
> but --skip-slave-start is enforced on line 791.

The "line notes" feature on github is actually remarkably useful for
comments like this one. :)

> 3. is merging possible? i thought that this mysql ra will
> always need more than one node to function properly?

It shouldn't and if it does, then that's a bug. The agent can always
test whether it's running as a master/slave set with ocf_is_ms.

Cheers,
Florian





Re: [Linux-ha-dev] mysql m/s rapid failover problem

2011-04-13 Thread Florian Haas
On 2011-04-13 12:54, Marek Marczykowski wrote:
> On 04/13/11 09:17, Florian Haas wrote:
>> Marek, have you considered setting up a personal fork of the
>> ClusterLabs/resource-agents repo where you could keep track of your
>> progress and which upstream could pull from?
> 
> Good idea :)
> 
> Pushed here:
> https://github.com/marmarek/resource-agents

Raoul, inclined to give Marek's current version of mysql a spin?

Florian




