Re: [ClusterLabs Developers] No ruinings in 2020! (Was: I will ruin your Christmas holidays Developers!)

2020-01-02 Thread Jan Pokorný
On 02/01/20 12:47 +0100, Jan Pokorný wrote:
> On 21/12/19 01:29 -0500, Digimer wrote:
>> I'm not sure how this got through the queue... Sorry for the noise.
> 
> in fact, it did not from what I can see, meaning that you (and perhaps
> other shadow moderators) do a stellar job, despite this not happening
> in direct sight -- or in other words, practically non-existent spam
> is proof of how high the bar is (you can attest by scanning the long
> abandoned lists and comparing[*]).
> 
> Thanks for that, and to the broader community, my wishes for the best
> in the new year (whether it has just arrived in your calendar, is about
> to happen soon for you, or at any other occasion that will eventually
> come, alike).
> 
> To summarize, the most generic, high-level agenda regarding (Julian)

^ just teasing your (or my, TBH) acumen, Gregorian is correct here :-)

> year 2020 likely is:
> 
> - cluster summit:
>   http://plan.alteeve.ca/index.php/HA_Cluster_Summit_2020
> 
> - official EOL for Python 2:
>   https://www.python.org/psf/press-release/pr20191220/
> 
> Amendable, indeed, just respond on-list.
> 
>> digimer
>> 
>> On 2019-12-19 1:19 p.m., TorPedoHunt3r wrote:
>>> 
>> 
> 
> [*] These lists are, AFAICT, abandoned, please don't revive, treat
> them just as a visitor in a reservation would; visiting the links is
> _not_ recommended, only at your own risk:
> https://lists.clusterlabs.org/pipermail/pacemaker/2016-August/thread.html
> 
> https://lists.linuxfoundation.org/pipermail/ha-wg-technical/2019-December/thread.html
> 
> P.S. Sorry for piggy-backing here :-)

-- 
Jan (Poki)


pgpPlFSu39TZh.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs Developers] No ruinings in 2020! (Was: I will ruin your Christmas holidays Developers!)

2020-01-02 Thread Jan Pokorný
On 21/12/19 01:29 -0500, Digimer wrote:
> I'm not sure how this got through the queue... Sorry for the noise.

in fact, it did not from what I can see, meaning that you (and perhaps
other shadow moderators) do a stellar job, despite this not happening
in direct sight -- or in other words, practically non-existent spam
is proof of how high the bar is (you can attest by scanning the long
abandoned lists and comparing[*]).

Thanks for that, and to the broader community, my wishes for the best
in the new year (whether it has just arrived in your calendar, is about
to happen soon for you, or at any other occasion that will eventually
come, alike).

To summarize, the most generic, high-level agenda regarding (Julian)
year 2020 likely is:

- cluster summit:
  http://plan.alteeve.ca/index.php/HA_Cluster_Summit_2020

- official EOL for Python 2:
  https://www.python.org/psf/press-release/pr20191220/

Amendable, indeed, just respond on-list.

> digimer
> 
> On 2019-12-19 1:19 p.m., TorPedoHunt3r wrote:
>> 
> 

[*] These lists are, AFAICT, abandoned, please don't revive, treat
them just as a visitor in a reservation would; visiting the links is
_not_ recommended, only at your own risk:
https://lists.clusterlabs.org/pipermail/pacemaker/2016-August/thread.html

https://lists.linuxfoundation.org/pipermail/ha-wg-technical/2019-December/thread.html

P.S. Sorry for piggy-backing here :-)

-- 
Jan (Poki)


pgpOqoimCcN49.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs Developers] Consensus on to-avoid in pacemaker, unnecessary proliferation of redundant goal-achievers, undocumented options and such? (Was: maintenance vs is-managed; different levels of

2019-12-18 Thread Jan Pokorný
On 18/12/19 02:36 +0100, Jan Pokorný wrote:
> [...]
> 
> - based on the above, an increase of redundancy/burden, plus
>   maintenance costs not just in pacemaker itself (more complex
>   codebase) but also in any external tooling incl. higher level tools
>   (ditto, plus ensuring the change is caught by these at all[*]),
>   confusion about combinability, etc.
> 
> [...]
> 
> [*] for instance, I missed that change when suggesting the equivalent
> to pcs team: https://bugzilla.redhat.com/show_bug.cgi?id=1303969
> but hey, they made do avoiding that configuration addition
> altogether :-)

Oh, this may be caused by the "maintenance" resource meta-attribute not
being documented at all in the 1.1 line (while it was introduced in
1.1.12 and we are now at 1.1.22):
https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html#_resource_meta_attributes

Again, something we shall rather prevent (undocumented options, not
even in some "experimental" section that would list provisions that
may be changed or disappear again for things in need of some gradual,
multi-release incubation).

Can we agree on some principles like this?

-- 
Jan (Poki)


pgptN5ssh740u.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] [deveoplers] maintenance vs is-managed; different levels of the maintenance property

2019-12-17 Thread Jan Pokorný
On 28/11/19 11:36 +, Yan Gao wrote:
> On 11/28/19 1:19 AM, Ken Gaillot wrote:
>> There is some room for coming up with better option naming and
>> meaning.  For example maybe the cluster-wide "maintenance-mode"
>> should be something like "force-maintenance" to make clear it takes
>> precedence over node and resource maintenance.
> Not sure if renaming would introduce even more confusion ...

+1

> But indeed, documentation definitely makes lot of sense.

+1

> Based on the whole idea, an inconsistent logic is in here then:
> 
> https://github.com/ClusterLabs/pacemaker/commit/9a8cb86573#diff-b4b7b0fdcefcd3eb5087dfbf0d101ec4R471
> 
> We should probably remove the "else" there, so that cluster-wide 
> maintenance-mode=true ALWAYS takes precedence.
> 
> Currently there's a corner case:
> 
> * Cluster maintenance-mode=true
> * Resource is-managed=true
> * Resource maintenance=false
> 
> , which makes an exception that the resource will be "managed".

ouch, slight +1 (oh, these least-surprise concerns, where least
surprise towards existing reliance and least surprise towards
newcomers are clearly in mutual contradiction, making it tough, not
for the first time).
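
For concreteness, the precedence being argued for amounts to
something like this (a sketch with hypothetical names, not the
actual pacemaker code):

    #include <stdbool.h>

    /* cluster-wide maintenance-mode wins unconditionally -- no
     * "else" carve-out, so the corner case above cannot arise */
    static bool
    resource_is_managed(bool cluster_maintenance, bool rsc_maintenance,
                        bool rsc_is_managed)
    {
        if (cluster_maintenance || rsc_maintenance) {
            return false;   /* maintenance always takes precedence */
        }
        return rsc_is_managed;
    }

With that shape, cluster maintenance-mode=true plus resource
is-managed=true and maintenance=false yields "unmanaged", exactly
as suggested above.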

Anyway, sorry for picking this as an exemplary showcase (we all
are learning as we can, nobody is born with experience, but since
we are at the dev list, and we are not used to some in-community
retrospectives (yet?) slash meta-talk about approaches exactly to
take away something, please forgive, it has nothing to do with who,
just what) of how we shall _not_ be extending pacemaker, since
what seemed a simple, straightforward and justified addition of
a new configuration toggle carries, in hindsight, a lot of hidden
costs that were not foreseen at that time:

- most notably that there are now two competing options, one
  being a specialization (expressible with a combination of
  other configuration steps!) of another established option
  (correct me if I am wrong)

- based on the above, any human or machine needs to perform a
  two-step check (easy to miss) to be reasonably sure whether some
  claim holds or not (that the resource will take part in the
  cluster's acting)

- based on the above, an increase of redundancy/burden, plus
  maintenance costs not just in pacemaker itself (more complex
  codebase) but also in any external tooling incl. higher level tools
  (ditto, plus ensuring the change is caught by these at all[*]),
  confusion about combinability, etc.

There are bounds to the evolution of code if there's some
responsibility behind it; let's keep up sustainability (in all
directions if possible).  The suggested renaming would be a misstep
in that regard as well, I think.  High-level tools can abstract it
whatever way they like...

Speaking of these (I think they are still rather mid-level tools;
there is an enormous space for improvement in them once they detach
from trying to carry a 1:1 mapping to low level bits and move closer
to user-oriented concepts, otherwise it feels like using plain TeX
forever when it was shown that a user-oriented simplification like
LaTeX can go far beyond, still benefitting from the universality
aspect of the former) and their advent, I think it's fully OK to
resist urges to combine existing primitives in some composite way
unless there's a clear blocker (risk of race conditions, for
instance).  These combinations shall occur higher up, outside (there
were some "middleware" ideas in the talks previously, not sure where
that went, but given a contract on the API, it could well be outside
the pacemaker project).

[*] for instance, I missed that change when suggesting the equivalent
to pcs team: https://bugzilla.redhat.com/show_bug.cgi?id=1303969
but hey, they made do avoiding that configuration addition
altogether :-)

-- 
Jan (Poki)


pgpM3wkYxXGpS.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] Extend enumeration of OCF return values

2019-10-16 Thread Jan Pokorný
On 16/10/19 09:18 +, Yan Gao wrote:
> On 10/15/19 4:31 PM, Ken Gaillot wrote:
>> On Tue, 2019-10-15 at 13:08 +0200, Tony den Haan wrote:
>>> Hi,
>>> I ran into getting "error 1" from portblock, so OCF_ERR_GENERIC,
>>> which for me doesn't guarantee the error was RC from portblock or
>>> pacemaker itself.
>>> Wouldn't it be quite useful to
>>> 1) give the agents a unique number to add to the OCF RC code, thus
>>> helping to determine origin of error
> 2) show an actual error string instead of "unknown error(1)". This is
> the last thing you want to see when a cluster is stuck.
>>> 
>>> Tony
>> 
>> I agree it's an issue, but the exit codes have to stay fairly generic.
>> There are only 255 possible exit codes, and half of those most shells
>> use for signals. Meanwhile there are dozens of agents. More
>> importantly, Pacemaker needs standard meanings to know how to respond.
>> 
>> However there are possibilities:
>> 
>> - OCF could add a few more codes for common error conditions. (This
>> requires updating the standard, as well as software such as Pacemaker
>> to be aware of them.)
>> 
>> - OCF already supports an arbitrary string "exit reason" which
>> pacemaker will display beyond just "unknown". It's up to the individual
>> agents to support this, and all of them should. Agents can get as
>> specific as they like with exit reasons.
>> 
>> - Agents can also log to the system log, or print error output which
>> pacemaker will log in its detail log. Many already provide good
>> information this way, but there's always room for improvement.
>> 
> All make sense. A lot of times, I can feel it's the wording "unknown 
> error" that frustrates users since they are definitely not in a good 
> mood seeing any errors in their beloved clusters, not to mention ones 
> that are even "unknown" ;-)
> 
> As a matter of fact, it's probably the most commonly returned error.
> I'd prefer to call it something different in user interfaces, for
> example "generic error" or just "error". Since:

/me votes for "sundry error" :-)

Seriously, a distinctive name is better for getting the right hits
from a random $WEBSEARCHER, since that is the first line of universal
defense for a growing population.  This assumes proper documentation
that web bots can explore.

> - If "exit reason" gives a hint, it's not really "unknown".
> - Even if there's no "exit reason" given, it doesn't mean it's 
> "unknown". Usually clues could be found from logs.

-- 
Jan (Poki)


pgpwAykXyS9UZ.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs Developers] FYI: looks like there are DNS glitches with clusterlabs.org subdomains

2019-10-09 Thread Jan Pokorný
Neither bugs.c.o nor lists.c.o work for me ATM.
Either it resolves by itself, or Ken will intervene, I believe.

-- 
Jan (Poki)


pgpVtzxiRrw_d.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] kronosnet v1.12 released

2019-09-20 Thread Jan Pokorný
On 20/09/19 05:22 +0200, Fabio M. Di Nitto wrote:
> We are pleased to announce the general availability of kronosnet v1.12
> (bug fix release)
> 
> [...]
> 
> * Add support for musl libc

Congrats, and the above is great news, since I've been toying with
an idea of putting together a truly minimalistic and vendor neutral
try-out image based on Alpine Linux, which uses musl as its libc of
choice (for its bloatlessness, just as there's no systemd, etc.).

-- 
Jan (Poki)


pgpurvnsq6XPh.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] performance problems with ocf resource nfsserver script

2019-09-12 Thread Jan Pokorný
Hello Eberhard,

On 11/09/19 10:01 +0200, Eberhard Kuemmerle wrote:
> I use pacemaker with some years-old hardware.
> In combination with an rsync backup, I had nfsserver monitoring
> timeouts that resulted in stonith fencing events...
> 
> So I tested the ocf resource nfsserver script and found, that even
> in an idle situation (without rsync or other heavy load), 'nfsserver
> monitor' was running for more than 10 seconds.
> 
> I found two critical actions in the script:
> - systemctl status nfs-server  (which also calls journalctl)
> - systemctl list-unit-files
> 
> So I modified the script and replaced
> 
> systemctl $cmd $svc
> by
> systemctl -n0 $cmd $svc
> in nfs_exec() to suppress the journalctl call
> 
> and
> 
> systemctl list-unit-files
> by
> systemctl list-unit-files 'nfs-*'
> and
> systemctl list-unit-files 'rpc-*'
> 
> That reduced the runtime for 'nfsserver monitor' to less than 0.2
> seconds!

That's a great improvement, indeed!

Thanks for being attentive to these details that actually sometimes
matter, as you could attest with your system.

> So I strongly recommend to integrate that modification in your
> repository.
> 
> [...]
> 
> [actual patch]
> 

Assuming your intention is to upstream your changes (and therefore
you consent to publishing your changes under the same
conditions/license as applied to that very file per the embedded
notice in its header ~ GPLv2+), and assuming that publishing your
changes on this list is your preferred workflow (development itself
occurs at GitHub at this point), I brought the patch to where it will
be paid bigger attention:

https://github.com/ClusterLabs/resource-agents/pull/1398

Feel free to comment further at either location.

Btw. I haven't checked, but per the timestamp alone, I suppose the
doubled messages on this list carry the identical version of the
patch.  Don't stay quiet if either this and/or the license assumption
does not apply, please :-)

-- 
Poki


pgpD8awBOotnA.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] Reminder that /proc is just rather an unreliable quirk, not a firm grip on processes

2019-07-08 Thread Jan Pokorný
On 03/07/19 11:45 +0200, Jan Pokorný wrote:
> [...]

Coincidentally, something fundamentally related to process scans
and the related imprecise overapproximation just popped up:

-- 
Jan (Poki)


pgpyJOvTXCBYc.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs Developers] Reminder that /proc is just rather an unreliable quirk, not a firm grip on processes

2019-07-03 Thread Jan Pokorný
[in a sense, this is a follow-up for my recent post:
https://lists.clusterlabs.org/pipermail/users/2019-May/025749.html]

Have come across an interesting experience regarding /proc traversal:

https://rkeene.org/projects/info/wiki/173

(as well as a danger of exhausting available inodes mentioned in the
new spurred discussion: https://lobste.rs/s/ihz50b/day_proc_died)

Even if it wasn't observed with Linux in that particular case, it
just adds to the overall arguments for avoiding it, directly or
indirectly (which is what ps, pidof, killall etc. make use of),
whenever possible, for instance:

- (at least on most systems) no snapshot semantics, meaning the
  scan-through is completely racy and ephemeral processes (or
  a fork chain thereof, see also CVE-2018-1121 for intentional
  carefully crafted abuse) are easy to miss completely

- problem of recycled PIDs is imminent (however theoretical), when
  the observer cannot subscribe itself to watch for changes in the
  process under supervision (verging on problems related to polling
  vs. event based systems, incl. timely responses to changes)

- finally, all these problems with unexpected behaviours of /proc
  under corner case situations like that mentioned initially, but
  add the possibility that arbitrary unprivileged users can
  deliberately block /proc enumeration triggered in other processes
  incl. privileged ones in Linux systems (see CVE-2018-1120[*]),
  for instance

Now, why am I mentioning this: higher layers of the cluster stack
rely heavily on /proc inspection, the net outcome being that they can
only be as reliable as the /proc filesystem is, not more.

So my ask here is to use our brain cluster (pun intended) so as
to devise ways to get less reliant on /proc based enumeration.
One portable idea is to allow for agent persistency, i.e., the
agent would be directly informed about its child (effectively the
service being run, as proxied by this agent instance).  One
non-portable idea would be to leverage the pidfd facility recently
introduced into Linux (as already mentioned in the May's post).
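
For illustration, a minimal sketch of that non-portable idea
(assuming a Linux >= 5.3 kernel whose headers define SYS_pidfd_open;
glibc gained a wrapper only later, hence the raw syscall):

    #include <poll.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <unistd.h>

    static int
    wait_for_service_exit(pid_t pid)
    {
        int pfd = (int) syscall(SYS_pidfd_open, pid, 0);

        if (pfd < 0) {
            perror("pidfd_open");
            return -1;
        }
        /* a pidfd becomes readable exactly once, when the process
         * terminates; unlike a bare PID, it cannot be recycled */
        struct pollfd pd = { .fd = pfd, .events = POLLIN };
        int rc = poll(&pd, 1, -1);

        close(pfd);
        return (rc < 0) ? -1 : 0;
    }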

Good news is that there's still room for _also_ cheap improvements,
such as what I did along the recent security fixes for pacemaker
(in a nutshell: IPC end-points already constitute system-wide
singletons, equivalent for our purposes to checking via /proc,
allowing for a swap, and -- as a paradox -- this positive change
was secondary, as it effectively enabled us to close the security
hole at hand, which was the primary objective).

Apparently, the most affected are resource agents.

[*] I've mentioned such risks once on this list already:
https://lists.clusterlabs.org/pipermail/developers/2018-May/001237.html
but alas, it received no responses

-- 
Jan (Poki)


pgp_j4VnmQ7e2.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] If anybody develops against libpe_status.so: skipped soname bump (in 2.0.2)

2019-06-19 Thread Jan Pokorný
On 14/06/19 18:46 -0500, Ken Gaillot wrote:
> On Fri, 2019-06-14 at 23:57 +0200, Jan Pokorný wrote:
>> On 14/06/19 14:56 -0500, Ken Gaillot wrote:
>>> On Fri, 2019-06-14 at 20:13 +0200, Jan Pokorný wrote:
>>>>> On Thu, 2019-06-06 at 10:12 -0500, Ken Gaillot wrote:
>>> Since those functions are internal API, we don't need a soname
>>> bump.  Distributing the header was a mistake and should not be
>>> considered making it public API. The only functions in there that
>>> had doxygen blocks were marked internal, so that helps.
>>> 
>>> As an aside, the entire libpe_status was undocumented until 2.0.1,
>>> but that was an oversight (a missing \file line).
>> 
>> In FOSS, the (un)documentation aspect doesn't play that much of a role...
>> 
>>> In practice there were some projects that used it, and we did bump
>>> the soname for most functions. Now however it's documented
>>> properly, so the line should be clear.
>> 
>> Not at all, see above.
>> 
>> Traces of the pre-existing mess have some momentum.
>> 
>> Anyway, good to know the root cause, question is how to deal with
>> the still real fallout.
> 
> What's the fallout? An internal function

"sort of", but definitely only after said forthcoming change :-)

> that no external application uses changed

"sort of", but they could with the header interpretable as public
(since Pacemaker-1.1.15), just wasn't discovered before (I don't
think I ever tried to match the changes back to the headers, plus
how these headers are to be interpreted)

> which doesn't require a soname bump.
> 
> I'll handle it by renaming the header and moving it to noinst.

Yes, that will help going forward.

This thread hopefully (justly) mitigates any surprising sharp edges
up to this point (effectively for any potential usage established in
the 1.1.15 - 2.0.1 timespan), should there be any.

Anyway, it looks like libabigail is a very useful tool we might
consider alongside or instead of abi-compliance-checker.
It looks like it can be told precisely which headers are private
and which are not, so there could even be some sort of authoritative
listing (regardless of documentation or not; as mentioned, that's
secondary with FOSS projects) to source that from.

One idea there would be to add another, standalone pass to our
Travis CI tests that would leverage the TRAVIS_COMMIT_RANGE env.
variable (too bad that it's all stateless, without any convenient
lookaside storage, or is there?) to get the two builds (looks like it
could naturally be done in rather an efficient manner) for a
subsequent ABI comparison.  Either merely "informative" (i.e., pass
unless there's an actual build failure), or "punishing" if we can
afford to switch more into an "always ready" paradigm (which CI is
all about) -- when a pull request breaks the ABI in some
not-mere-addition way (while a soname bump didn't occur?  or when
there are at least any API/ABI hurdles found?), raise a flag.  It
would then be up to deliberation whether it's a blocker or not.  But
it would attract attention for sure, hence more care, in an
ahead-of-time fashion.

-- 
Jan (Poki)


pgpqwV0p9BA0e.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] If anybody develops against libpe_status.so: skipped soname bump (Was: Pacemaker 2.0.2 final release now available)

2019-06-14 Thread Jan Pokorný
On 14/06/19 14:56 -0500, Ken Gaillot wrote:
> On Fri, 2019-06-14 at 20:13 +0200, Jan Pokorný wrote:
>>> On Thu, 2019-06-06 at 10:12 -0500, Ken Gaillot wrote:
>>> 
>>> Source code for the Pacemaker 2.0.2 and 1.1.21 releases is now
>>> available:
>>> 
>>> 
> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.2
>>> 
>>> 
> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.21
>> 
>> In retrospect (I know, everybody is a general once the battle is
>> over), called out by some automated tests in Fedora, there were
>> some slight discrepancies -- mattering depending on whether there
>> are any external clients of particular "wannabe internal" libraries
>> of pacemaker accompanied with "wannabe internal" headers, none of
>> which are marked so expressly (and in the case of headers, these
>> are usually shipped in dev packages anyway).
> 
> All public API is documented:
> 
>   https://clusterlabs.org/pacemaker/doxygen/
> 
> Anything not documented there is private API.

It's a rather simplistic (and hypocritical) view, not being part of
any written "contract", isn't it? :-)

> remote.h should be in noinst_HEADERS, thanks for catching that. It
> would also be a good idea to put "_internal" in all internal
> headers' names to be absolutely clear; most of them already have it.

Yes, that was that surprising moment here.

>> For the peace of mind, I am detailing the respective library that
>> would likely have been eligible for an explicit soname bump and why.
>> If you feel affected, please speak up so we have a clear incentive to
>> publish a "hotfix" for downstreams and direct consumers, otherwise
>> at least I don't feel compelled to anything immediate beyond this
>> FYI,
>> and we shall rather do it in 2.0.3 even if not otherwise justified
>> with an inter-release delta, so there isn't a tiniest glitch possible
>> when 2.0.2 is skipped on the upgrade path (which is generally not
>> recommended but would be understandable if you happen to rely on
>> those very libpe_status.so ABI details).
>> 
>> The mentioned ABI changes are:
>> 
>> * libpe_status.so.28.0.2 (2.0.1: soname 28.0.1)
>>   - include/crm/pengine/remote.h: function renames, symbolic notation:
>> { -> pe__}{is_baremetal_remote_node -> is_remote_node,
>>is_container_remote_node -> is_guest_node,
>>is_remote_node -> is_guest_or_remote_node,
>>is_rsc_baremetal_remote_node -> resource_is_remote_conn,
>>rsc_contains_remote_node -> resource_contains_guest_node}
>> 
>> (all other ABI breaking changes appear self-contained for not
>> being related to anything exposed through what could be considered
>> a public header/API -- not to be confused with ABI)
> 
> Since those functions are internal API, we don't need a soname bump.
> Distributing the header was a mistake and should not be considered
> making it public API. The only functions in there that had doxygen
> blocks were marked internal, so that helps.
> 
> As an aside, the entire libpe_status was undocumented until 2.0.1,
> but that was an oversight (a missing \file line).

In FOSS, the (un)documentation aspect doesn't play that much of a role...

> In practice there were some projects that used it, and we did bump
> the soname for most functions. Now however it's documented properly,
> so the line should be clear.

Not at all, see above.

Traces of the pre-existing mess have some momentum.

Anyway, good to know the root cause, question is how to deal with
the still real fallout.

>> Note that there's at least a single publicly known consumer of
>> libpe_status.so, but luckily, sbd only uses some unaffected pe_*
>> functions.  Said after-the-fact bump of said library would require
>> it to be rebuilt as well (and all the SW that'd be in the same
>> boat), so even less appealing to do that now, but note that
>> such rebuild will be needed with said planned bump for 2.0.3.
>> 
>> But perhaps, some other changes as announced in [1] will be faster
>> than that -- to that account, I'd note that perhaps applying
>> single source -> multiple binary copies of code scheme is not all
>> that bad and we could move some of shared internal only code into
>> static libraries subsequently used to feed the links from the
>> actual daemons/tools code objects -- or the private libraries
>> shall at least be factually privatized/unshared, i.e., put into
>> a private, non-standard location (this is what, e.g., systemd uses)
>> where only "accustomed" executables can find them.
>> 
>> [1]https://lists.clusterlabs.org/pipermail/developers/2019-February/001358.html

-- 
Jan (Poki)


pgpJu_eOQaMYq.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] If anybody develops against libpe_status.so: skipped soname bump (Was: Pacemaker 2.0.2 final release now available)

2019-06-14 Thread Jan Pokorný
On 14/06/19 20:13 +0200, Jan Pokorný wrote:
> For the peace of mind, I am detailing the respective library that
> would likely have been eligible for an explicit soname bump and why.
> If you feel affected, please speak up so we have a clear incentive to
> publish a "hotfix" for downstreams and direct consumers, otherwise
> at least I don't feel compelled to anything immediate beyond this FYI,
> and we shall rather do it in 2.0.3 even if not otherwise justified
> with an inter-release delta, so there isn't a tiniest glitch possible
> when 2.0.2 is skipped on the upgrade path (which is generally not
> recommended but would be understandable if you happen to rely on
> those very libpe_status.so ABI details).

Of course, an alternative applicable right now (also suitable for
those who self-compile ... and use libpe_status.so in their client
code at the same time) and avoiding the soname bump is to add the
original symbols back, e.g. using

  __attribute__((alias ("original_name")))
  
for brevity.
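
For illustration, with the renames listed below, that could look like
this (a sketch with simplified signatures; GCC/Clang specific, and the
alias must live in the same translation unit as the definition):

    /* new name, the actual definition */
    int
    pe__is_remote_node(const void *node)
    {
        return node != NULL;   /* placeholder body for the sketch */
    }

    /* old, pre-2.0.2 name kept purely for ABI compatibility */
    int is_baremetal_remote_node(const void *node)
        __attribute__((alias ("pe__is_remote_node")));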

> The mentioned ABI changes are:
> 
> * libpe_status.so.28.0.2 (2.0.1: soname 28.0.1)
>   - include/crm/pengine/remote.h: function renames, symbolic notation:
> { -> pe__}{is_baremetal_remote_node -> is_remote_node,
>    is_container_remote_node -> is_guest_node,
>    is_remote_node -> is_guest_or_remote_node,
>    is_rsc_baremetal_remote_node -> resource_is_remote_conn,
>    rsc_contains_remote_node -> resource_contains_guest_node}
> 
> (all other ABI breaking changes appear self-contained for not
> being related to anything exposed through what could be considered
> a public header/API -- not to be confused with ABI)

-- 
Jan (Poki)


pgpBiHNh8gfAK.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs Developers] If anybody develops against libpe_status.so: skipped soname bump (Was: Pacemaker 2.0.2 final release now available)

2019-06-14 Thread Jan Pokorný
> On Thu, 2019-06-06 at 10:12 -0500, Ken Gaillot wrote:
> 
> Source code for the Pacemaker 2.0.2 and 1.1.21 releases is now
> available:
> 
> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.0.2
> 
> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.21

In retrospect (I know, everybody is a general once the battle is
over), called out by some automated tests in Fedora, there were
some slight discrepancies -- mattering depending on whether there are
any external clients of particular "wannabe internal" libraries of
pacemaker accompanied with "wannabe internal" headers, none of which
are marked so expressly (and in the case of headers, these are
usually shipped in dev packages anyway).

For the peace of mind, I am detailing the respective library that
would likely have been eligible for an explicit soname bump and why.
If you feel affected, please speak up so we have a clear incentive to
publish a "hotfix" for downstreams and direct consumers, otherwise
at least I don't feel compelled to anything immediate beyond this FYI,
and we shall rather do it in 2.0.3 even if not otherwise justified
with an inter-release delta, so there isn't a tiniest glitch possible
when 2.0.2 is skipped on the upgrade path (which is generally not
recommended but would be understandable if you happen to rely on
those very libpe_status.so ABI details).

The mentioned ABI changes are:

* libpe_status.so.28.0.2 (2.0.1: soname 28.0.1)
  - include/crm/pengine/remote.h: function renames, symbolic notation:
{ -> pe__}{is_baremetal_remote_node -> is_remote_node,
   is_container_remote_node -> is_guest_node,
   is_remote_node -> is_guest_or_remote_node,
   is_rsc_baremetal_remote_node -> resource_is_remote_conn,
   rsc_contains_remote_node -> resource_contains_guest_node}

(all other ABI breaking changes appear self-contained for not
being related to anything exposed through what could be considered
a public header/API -- not to be confused with ABI)

Note that there's at least a single publicly known consumer of
libpe_status.so, but luckily, sbd only uses some unaffected pe_*
functions.  Said after-the-fact bump of said library would require
it to be rebuilt as well (and all the SW that'd be in the same
boat), so even less appealing to do that now, but note that
such rebuild will be needed with said planned bump for 2.0.3.

But perhaps, some other changes as announced in [1] will be faster
than that -- to that account, I'd note that perhaps applying
single source -> multiple binary copies of code scheme is not all
that bad and we could move some of shared internal only code into
static libraries subsequently used to feed the links from the
actual daemons/tools code objects -- or the private libraries
shall at least be factually privatized/unshared, i.e., put into
a private, non-standard location (this is what, e.g., systemd uses)
where only "accustomed" executables can find them.

[1] https://lists.clusterlabs.org/pipermail/developers/2019-February/001358.html

-- 
Poki


pgpCVko0r_AcB.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs Developers] Multiple processes appending to the same log file questions (Was: Pacemaker detail log directory permissions)

2019-04-30 Thread Jan Pokorný
[let's move this to developers@cl.o, please drop users on response
unless you are only subscribed there, I tend to only respond to the
lists]

On 30/04/19 13:55 +0200, Jan Pokorný wrote:
> On 30/04/19 07:55 +0200, Ulrich Windl wrote:
>>>>> Jan Pokorný  schrieb am 29.04.2019 um 17:22
>>>>> in Nachricht <20190429152200.ga19...@redhat.com>:
>>> On 29/04/19 14:58 +0200, Jan Pokorný wrote:
>>>> On 29/04/19 08:20 +0200, Ulrich Windl wrote:
>> I agree that multiple threads in one process have no problem using
>> printf(), but (at least in the buffered case) if multiple processes
>> write to the same file, that type of locking doesn't help much IMHO.
> 
> Oops, you are right, I made a logical shortcut connecting flockfile(3)
> and flock(1), which was entirely unbacked.  You are correct it would
> matter only amongst the threads, not otherwise unsynchronized processes.
> Sorry about the noise :-/
> 
> Shamefully, this rather important (nobody wants garbled log messages)
> aspect is in no way documented in libqb's context (it does not do any
> explicit locking on its own), especially since the length of the logged
> messages can go above the default of 512 B (in the upcoming libqb 2,
> IIUIC) ... and luckily, I was steering the direction to still stay
> modest and cap that on 4 kiB, even if for other reasons:
> 
> https://github.com/ClusterLabs/libqb/pull/292#issuecomment-361745575
> 
> which still might be within Linux + typical FSs (ext4) boundaries
> to guarantee atomicity of an append (or maybe not even that, it all
> seems a gray area of any guarantees provided by the underlying system,
> inputs from the experts welcome).  Anyway, ISTM that we should at the
> very least increase the buffer size for block buffering to the
> configured one (up to 4 kiB as mentioned) if BUFSIZ would be less,
> to prevent tainting this expected atomicity from the get-go.

This post seems to indicate that while the equivalent of BUFSIZ is
8 kiB with glibc (confirmed on Fedora/x86-64), it might possibly be
1 kiB only on BSDs (unless the underlying FS provides a hint
otherwise?), so the opt-in maxed-out message (of 4 kiB currently, but
generic guards are in order in case anybody decides to bump that even
further) might readily cause log corruption on some systems with
multiple processes appending to the same file:

https://github.com/the-tcpdump-group/libpcap/issues/792

Any (especially BSD) people to advise here on what "atomic append"
means on their system, and which conditions need to be assuredly met?
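
Meanwhile, the mentioned buffer-size counter-measure could look like
this (a sketch under glibc-ish assumptions; a precondition for, not
a guarantee of, atomic appends):

    #include <stdio.h>

    #define MAX_LOG_MSG 4096   /* capped maximum message length */

    static FILE *
    open_log(const char *path)
    {
        FILE *f = fopen(path, "a");   /* append mode -> O_APPEND */

        if (f == NULL) {
            return NULL;
        }
        /* BUFSIZ may be smaller than MAX_LOG_MSG (reportedly 1 kiB
         * on some BSDs), so install a sufficiently large buffer,
         * letting a single flushed message map to a single write(2) */
        if (setvbuf(f, NULL, _IOFBF, MAX_LOG_MSG) != 0) {
            fclose(f);
            return NULL;
        }
        return f;
    }

(Each complete message would still need an explicit fflush() so that
partial lines never hit the kernel on their own.)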

-- 
Jan (Poki)


pgphjyR5U2bib.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] Using ClusterLabs logo

2019-04-29 Thread Jan Pokorný
On 29/04/19 16:31 +0200, Kristoffer Grönlund wrote:
> Tomas Jelinek  writes:
>> Is it OK to use ClusterLabs logo as a favicon for pcs in upstream? If 
>> so, are there any conditions to meet?
> 
> Yes, this would be OK to me at least (as the creator of the logo)!
> 
>> 
>> I went through new logo threads in mailinglists but I didn't find 
>> anything specific other than this:
> 
> I don't remember the specific license we decided on back then, but at
> least to me, CC-BY would make sense, where a link to clusterlabs.org
> would be sufficient attribution I think.
> 
> https://creativecommons.org/licenses/by/4.0/

Also a technical note:

Would it then be possible for you to go through *.svg files
you authored in https://github.com/ClusterLabs/clusterlabs-www.git
and add the respective licenses there?

Should be as easy as:
- open with inkscape
- Shift+Ctrl+D (File -> Document Properties)
- select the respective license (or by URI),
  perhaps edit some more metadata
- save again

Seems more appropriate for the author to do this himself if it's
indeed his intention :-)

Thanks!

-- 
Jan (Poki)


pgpYbhEVFLb0u.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] FYI: github policy change potentially affecting ssh/app access to repositories

2019-04-26 Thread Jan Pokorný
On 25/04/19 22:41 +0200, Jan Pokorný wrote:
> On 25/04/19 11:27 -0500, Ken Gaillot wrote:
>> FYI OAuth access restrictions are now in place on the ClusterLabs
>> organization account.
>> 
>> [...]
>> 
>> If you use an app that needs repo access, I believe a request to allow
>> it will be sent automatically, but if problems arise just mention them
>> here or to me directly.
> 
> Looks like Travis CI integration is also affected, at least in case of
> pacemaker:
> 
> https://github.com/ClusterLabs/pacemaker/pull/1759#issuecomment-486817936

Confirming it works now, apparently thanks to some more intervention
by Ken.

>> On Wed, 2019-04-10 at 17:44 -0500, Ken Gaillot wrote:
>>> Florian Haas and Kristoffer Grönlund noticed that the ClusterLabs
>>> organization on github currently carries over any app access that
>>> members have given to their own accounts.
>>> 
>>> This is not significant at the moment since we don't have any private
>>> repositories and few accounts have write access, but to stay on the
>>> safe side, we'd like to enable OAuth access restrictions on the
>>> organization account.
>>> 
>>> Going forward, this will simply mean that any apps that need access
>>> will need to be approved individually by one of the administrators.
>>> 
>>> But as a side effect, this will invalidate existing apps' access as
>>> well as some individual contributors' ssh key access to the
>>> repositories. If you are affected, you can simply re-upload your ssh
>>> key and it will work again.

-- 
Jan (Poki)


pgpPqkT_y304_.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] [ClusterLabs] Coming in 2.0.2: check whether a date-based rule is expired

2019-04-23 Thread Jan Pokorný
On 16/04/19 12:38 -0500, Ken Gaillot wrote:
> We are adding a "crm_rule" command

Wouldn't `pcmk-rule` be a more sensible command name -- I mean, why
not benefit from not suffering the historical burden in this case,
given that `crm` in the broadest "almost anything that can be
associated with our cluster SW" sense is an anachronism, whereas the
term metamorphosed into the invoking name of the original management
shell project (heck, we don't have `crmd` as a daemon name anymore)?

> that has the ability to check whether a particular date-based rule is
> currently in effect.
> 
> The motivation is a perennial user complaint: expired constraints
> remain in the configuration, which can be confusing.
> 
> [...]
> 
> The new command gives users (and high-level tools) a way to determine
> whether a rule is in effect, so they can remove it themselves, whether
> manually or in an automated way such as a cron.
> 
> You can use it like:
> 
> crm_rule -r <rule-id> [-d <date>] [-X <xml>]
> 
> With just -r, it will tell you whether the specified rule from the
> configuration is currently in effect. If you give -d, it will check as
> of that date and time (ISO 8601 format).

Uh, date-time data representations (encodings of the singular
information) shall be used with some consideration towards the use
cases:

1. _data-exchange friendly_, point-of-use-context-agnostic
   (yet timezone-respecting if need be) representation
   - this is something you want to have serialized in data
 to outlive the code (extrapolated: for exchange between
 various revisions of the same code)
   - ISO 8601 fills the bill

2. _user-friendly_, point-of-use-context-respecting representation
   - this is something you want user to work with, be it the
 management tools or helpers like crm_rule
   - ISO 8601 _barely_ fills the bill, fails in basic attempts at
 integration with the surrounding system:

$ CIB_file=cts/scheduler/date-1.xml ./tools/crm_rule -c \
-r rule.auto-2 -d "next Monday 12:00"
> (crm_abort)   error: crm_time_check: Triggered assert at iso8601.c:1116 : 
> dt->days > 0
> (crm_abort)   error: parse_date: Triggered assert at iso8601.c:757 : 
> crm_time_check(dt)

 no good, let's try good old coreutils' `date` as the "chewer"

$ CIB_file=cts/scheduler/date-1.xml ./tools/crm_rule -c \
-r rule.auto-2 -d "$(date -d "next Monday 12:00")"
> (crm_abort)   error: crm_time_check: Triggered assert at iso8601.c:1116 : 
> dt->days > 0
> (crm_abort)   error: parse_date: Triggered assert at iso8601.c:757 : 
> crm_time_check(dt)

 still no good, so after a few more iterations:

$ CIB_file=cts/scheduler/date-1.xml ./tools/crm_rule -c \
-r rule.auto-2 -d "$(date -Iminutes -d "next Monday 12:00")"
> Rule rule.auto-2 is still in effect

 that could be much more intuitive + locale-driven (assuming users
 have the locales set per what's natural to them/what they are
 used to), couldn't it?

I mean, at least allowing `-d` switch in `crm_rule` to support
LANG-native date/time specification makes a lot of sense to me:
https://github.com/ClusterLabs/pacemaker/pull/1756
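
For illustration, such LANG-native parsing could boil down to
something like this (a sketch assuming POSIX strptime/nl_langinfo;
the helper name is hypothetical, not what the PR does verbatim):

    #define _XOPEN_SOURCE 700
    #include <langinfo.h>
    #include <locale.h>
    #include <string.h>
    #include <time.h>

    static int
    parse_local_datetime(const char *spec, struct tm *out)
    {
        setlocale(LC_TIME, "");   /* honor the user's LC_TIME/LANG */
        memset(out, 0, sizeof(*out));
        /* D_T_FMT is the locale's preferred date-time representation,
         * i.e. roughly what plain `date` prints */
        return (strptime(spec, nl_langinfo(D_T_FMT), out) != NULL)
               ? 0 : -1;
    }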

Perhaps iso8601 (and more?) would deserve the same, even though it
smells of dragging some compatibility/interoperability into the
game?  (at least, `crm_rule` is brand new, moreover marked
experimental, anyway; let's discuss this part at the developers ML
if need be -- one more thing to possibly put up for debate, actually:
this user interface sanitization could be performed merely in the
opaque management shell wrappings, but if nothing else, that amounts
to duplication of work and makes bare-bones use a bit of a PITA).

-- 
Jan (Poki)


pgpZVFPL9_hBx.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] FYI: github policy change potentially affecting ssh/app access to repositories

2019-04-15 Thread Jan Pokorný
On 14/04/19 22:48 +0200, Valentin Vidic wrote:
> On Wed, Apr 10, 2019 at 05:44:45PM -0500, Ken Gaillot wrote:
>> Florian Haas and Kristoffer Grönlund noticed that the ClusterLabs
>> organization on github currently carries over any app access that
>> members have given to their own accounts.
> 
> Related to github setup, I just noticed that some ClusterLabs repos
> don't have Issues tab enabled, but I suppose this was intentional?

I think that's very intentional, for several reasons:

* proliferation of issue trackers to watch for a single component is
  just a distraction (also for all those nice reporters that care
  about not filing duplicates), some started before GitHub (or as
  a replacement of a prior non-GH tracking), and will likely stay
  ever after (which is good, see below);
  also, speaking for clufter, I chose to use pagure.io as a primary
  home, so only kept issue tracking enabled there, which makes
  a tonne of sense (aligned with the above concern)

* as we know in HA, it's no good to put all the eggs in one basket
  (a.k.a. SPOF avoidance) -- git is trivial to move around since
  it's distributed by nature, and is continuously mirrored by many
  individuals, so the outliers (important points mentioned just as
  PR commentary etc.) shall preferably be as sparse as possible[1];
  the tracked issues themselves would not be that easy to recover
  if GitHub stopped working this minute (however unexpectedly);
  I'd actually suggest that ClusterLabs projects with issue tracking
  enabled would opt in to communication collection to some extra
  read-only mailing list that'd be established for that archival
  purpose (dev-pulse@cl.o?  technically, likely another GH account
  with its casual communication email address set to that of this
  list, and subscribed to all these projects; note that the
  bugs.clusterlabs.org Bugzilla instance could also forward there)
  and possibly also be mirrored with some 3rd party services (too bad
  that Gmane passed away)

* partly related to that is the flexibility wrt. which forge to
  choose as authoritative, but I believe the data migration freedom
  is quite reasonable here, so there's no data lock-in per se
  (still a proponent of switching to GitLab; a recent lengthy PR
  at GH demonstrated how unscalable these ongoing iterations within
  a single PR are there)

[1] see point 1. at
https://lists.clusterlabs.org/pipermail/developers/2018-January/001958.html

-- 
Jan (Poki)


pgpaMYgM9k0Ox.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs Developers] Karma needed - Re: Updated kronosnet Fedora / EPEL packages to v1.8

2019-04-11 Thread Jan Pokorný
Hello Digimer,

On 11/04/19 01:09 -0400, digimer wrote:
> Would anyone with time and inclination please review / vote for
> these packages? Would like to get them pushed out if possible, short
> a vote each.

FYI, you will be allowed to push to stable 7 days after filing the
update at the latest, regardless of karma.

Shouldn't this rather target the users list, preferably with
a "[FEDORA/EPEL]" tag, to save the time of distro-unaffiliated
users?

-- 
Jan (Poki)


pgpge5VSKJmf_.pgp
Description: PGP signature
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/developers

ClusterLabs home: https://www.clusterlabs.org/

[ClusterLabs Developers] Easy opt-in copyright delegation assignment (Was: Feedback wanted: proposed new copyright policy for Pacemaker)

2019-03-11 Thread Jan Pokorný
On 11/03/19 13:49 -0500, Ken Gaillot wrote:
> There's a pull request for the new policy in case anyone is interested:
> 
> https://github.com/ClusterLabs/pacemaker/pull/1716

As I mentioned there, this could be a possible next evolution step,
but it's in no hurry (unlike the former one of reality reflection,
perhaps):

> In an outlook, I'd like to also see some simplification regarding the
> opt-in desire to assign the respective portional copyright of the
> changesets to come to a designated other party, typically an employer,
> as a pragmatic (and voluntary loyalty) legalese measure.
> 
> What was devised in a private discussion with Ken was adding an
> AFFILIATION.md file to the tree root, and mapping there
> (with enumeration or wildcards) the well-known "Signed-off-by"
> line email addresses to the respective recipient entity
> plus the start date it comes to effect for the particular item.
> Then, the pacemaker project would gain a clear semantics for
> Signed-off-by lines, and this copyright delegation would be
> trivial once established in AFFILIATION.md.

I also have more concrete wording for the projected AFFILIATION.md
file header to run by you, but that might be premature if there's
some early criticism about this idea as such (any other idea that
would still simplify the objective at hand?).

Also, is there possibly some collective wildcard-matching catch-all
consensus about this for particular companies with active
involvement in the project, either due to policy or based simply
on a unisono agreement?

For instance, it seems that RH is relaxed about this topic, though
for myself as an employee tasked with this on-behalf-of-work, I'd
like to express my intention towards the company explicitly anyway
(as I'd normally do with new files I'd start on the project, at
least prior to proposed unification), for being rather pragmatic.

Feedback wanted on this as well, thanks in advance.

-- 
Jan (Poki)


pgpAjbgzi9hub.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] libqb: Re: 633f262 logging: Remove linker 'magic' and just use statics for logging callsites (#322)

2019-02-27 Thread Jan Pokorný
Late to the party (for some rather personal reasons), but
anyway, I don't see any progress while there's a pressing need
to resolve at least a single thing for sure before the release,
so here I go...

On 18/01/19 18:53 +0100, Lars Ellenberg wrote:
> On Thu, Jan 17, 2019 at 09:09:11AM +1100, Andrew Beekhof wrote:
>>> On 17 Jan 2019, at 2:59 am, Ken Gaillot  wrote:
>>> I'm not familiar with the reasoning for the current setup, but
>>> pacemaker's crm_crit(), crm_error(), etc. use qb_logt(), while
>>> crm_debug() and crm_trace() (which won't be used in ordinary runs) do
>>> something similar to what you propose.
>>> 
>>> Pacemaker has about 1,700 logging calls that would be affected
>>> (not counting another 2,000 debug/trace). Presumably that means
>>> Pacemaker currently has about +16KB of memory overhead and
>>> binary size for debug/trace logging static pointers, and that
>>> would almost double using them for all logs. Not a big deal
>>> today? Or meaningful in an embedded context?
>>> 
>>> Not sure if that overhead vs runtime trade-off is the original
>>> motivation or not, but that's the first thing that comes to mind.
>> 
>> I believe my interest was the ability to turn them on dynamically
>> in a running program (yes, i used it plenty back in the day) and
>> have the overhead be minimal for the normal case when they weren't
>> in use.

That's what the run-time configuration of the filtering per log
target (or per tags, even) is for, and generally, what the tracing
library should allow one to do naturally, isn't it?

Were there an enormous impact in the "normal case", as you put it,
it'd be a bug/misfeature, asking for new native approaches.

> Also, with libqb before the commit mentioned in the subject
> (633f262) and that is what pacemaker is using right now, you'd get
> one huge static array of "struct callsites" (in a special linker
> section; that's the linker magic that patch removes).

Yes, heap with all the run-time book-keeping overhead vs. cold data
used to be one of the benefits.

> Note: the whole struct was statically allocated,
> it is an array of structs, not just an array of pointers.
> 
> sizeof(struct qb_log_callsite) is 40
> 
> Now, those structs get dynamically allocated,
> and put in some lineno based lookup hash.

(Making it, in the degenerate case, a linear (complexity) search,
vs. constant-time with the callsite section.)

> (so already at least additional 16 bytes),
> not counting malloc overhead for all the tiny objects.
> 
> The additional 8 byte static pointer
> is certainly not "doubling" that overhead.
> 
> But can be used to skip the additional lookup,
> sprintf, memcpy and whatnot, and even the function call,
> if the callsite at hand is currently disabled,
> which is probably the case for most >= trace
> callsites most of the time.
> 
> Any volunteers to benchmark the cpu usage?
> I think we'd need
> (trace logging: {enabled, disabled})
> x ({before 633f262,
> after 633f262,
> after 633f262 + lars patch})

Well, no numbers were presented even to support dropping the
callsite section case.  Otherwise the method could be just
repeated, I guess.
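
For concreteness, the static-pointer scheme under discussion amounts
to something like this (a sketch assuming the <qb/qblog.h> names, NOT
the actual patch): resolve the callsite once per call site, then test
its cheap targets bitmap before paying for formatting or a call:

    #include <qb/qblog.h>

    #define my_log(priority, fmt, args...) do {                       \
            static struct qb_log_callsite *cs = NULL;                 \
            if (cs == NULL) {                                         \
                cs = qb_log_callsite_get(__func__, __FILE__, fmt,     \
                                         (priority), __LINE__, 0);    \
            }                                                         \
            /* skip everything if no target wants this callsite */    \
            if (cs != NULL && cs->targets != 0) {                     \
                qb_log_real_(cs, ##args);                             \
            }                                                         \
        } while (0)

Note this caching is also precisely where the custom-filter
side-effect mentioned below comes from: the filter gets applied
only upon the first resolution.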

> BTW,
> I think without the "linker magic"
> (static array of structs),
> the _log_filter_apply() becomes a no-op?

Could possibly agree in qb_log_callsites_register() flow
(we have just applied the filters and stuff, haven't we?),
but not in qb_log_filter_ctl2() one.
At least without a closer look (so take it with a grain of salt).

> That's qb_log_filter_ctl2() at runtime.
> It would have to iterate over all the collision lists in all the
> buckets of the dynamically allocated callsites, instead of iterating
> the (now non-existing) static array of callsites.

It's what it does now?  I mean, the only simplification would
be to peel off a callsite section indirection, since only
a single section is now carried?

> One side-effect of not using a static pointer,
> but *always* doing the lookup (qb_log_callsite_get()) again,
> is that a potentially set _custom_filter_fn() would be called
> and that filter applied to the callsite, at each invocation.
> 
> But I don't think that that is intentional?
> 
> Anyways.
> "just saying" :-)

There are more problems to be solved when switching to the static
pointer regarding "at least some continuity and room for future
optimizations", see the pressing one in the discussion along
("Note that this..."): https://github.com/ClusterLabs/libqb/issues/336

* * *

Thanks for pushing on this front, where rather impulsive changes
without a truly caring approach were made, my critical voice
notwithstanding.

-- 
Cheers,
Jan (Poki)


pgpBOkRds9H54.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] [RFC][pacemaker] Antora as a successor for the current publication platform base on (abandoned?) publican

2019-01-22 Thread Jan Pokorný
On 17/01/19 21:00 +0100, Jan Pokorný wrote:
> For instance, also Fedora project, ironically with the intimately
> strongest inclination towards this project, decided to ditch it in
> favour of Antora:
> 
> https://fedoramagazine.org/fedora-docs-overhaul/

[...]

> My ask is then: how do you feel about this possible change
> (addressing intentionally YOU on this very list, as an existing or
> possible future contributor), whether you know of some other tool
> comparable to publican, or whether you think we might be served with
> some other approach to mastering publications with as little
> friction as possible (staying with AsciiDoc preferred for the time
> being) unless we get something really appealing in return (is there
> any cherry like that with, e.g., Sphinx?).

Just tossing it here for possible future reference: a more detailed
article linked from the Fedora Magazine post mentioned that Fedora
was also toying with what OpenShift documentation uses:

https://github.com/redhataccess/ascii_binder

once they were clear that publican was not viable going forward, and
before sticking with Antora.  It doesn't look very maintained either,
though, and brings a whole new dependency avalanche with it (Ruby),
too.

-- 
Nazdar,
Jan (Poki)


pgpzguzozsAmm.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] [RFC][pacemaker] Antora as a successor for the current publication platform base on (abandoned?) publican

2019-01-17 Thread Jan Pokorný
> Antora looks interesting. The biggest downside vs publican is that it
> appears to be only a static website generator, i.e. it would not
> generate PDF, epub, or single-page HTML the way we do now.

A couple of good questions were, coincidentally, raised "yesterday":
https://gitlab.com/antora/antora/issues/401

Nonetheless, I hadn't heard of that project until I checked the
details about the Fedora docs migration I vaguely knew about.
I now see a connection between that project and Asciidoctor (for
which we already introduced small compat changes), though.

-- 
Jan (Poki)


pgpKzr749VsjD.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/developers


[ClusterLabs Developers] [RFC][pacemaker] Antora as a successor for the current publication platform base on (abandoned?) publican

2019-01-17 Thread Jan Pokorný
I am now talking about documents as available, e.g., at:

https://clusterlabs.org/pacemaker/doc/ (Versioned documentation)

Sadly, I've come to realize that publican is no longer being
developed, and while this alone is bearable since it fulfills
its role well, worse, some distros are not (going to be) packaging
it anymore.  Also, think of staying up-to-date with target formats
and "pleasing aesthetics of the decade".

For instance, also Fedora project, ironically with the intimately
strongest inclination towards this project, decided to ditch it in
favour of Antora:

https://fedoramagazine.org/fedora-docs-overhaul/

At first sight, getting rid of publican looked good -- the less
extensive the dependencies (like the Perl ecosystem) the better.  But
the crux is that Antora is possibly even worse in this regard :-D
A good thing about Antora, though, is that it natively works
with AsciiDoc formatted files, just as we already do, e.g.:
https://github.com/ClusterLabs/pacemaker/tree/Pacemaker-2.0.1-rc2/doc/Pacemaker_Explained/en-US


My ask is then: how do you feel about this possible change (addressing
intentionally YOU on this very list, as an existing or possible future
contributor), do you know of some other tool comparable to publican,
or do you think we might be better served with some other approach to
mastering publications with as little friction as possible (staying
with AsciiDoc is preferred for the time being unless we get something
really appealing in return -- is there any cherry like that with, e.g.,
Sphinx?).

I figure downstream also possibly has something to say here
if they are after shipping such handbooks as well.

Thanks for your inputs.

-- 
Jan (Poki)




Re: [ClusterLabs Developers] Heads up for potential Pacemaker API change

2018-11-02 Thread Jan Pokorný
On 01/11/18 16:41 -0500, Ken Gaillot wrote:
> I ran into a situation recently where a fix would require changing
> libpe_status's pe_working_set_t data type.
> 
> For most data types in the Pacemaker API, we require (usually by
> documented policy rather than code) that library-provided
> constructors be used to allocate them. That allows us to add new
> members at the end of structs without existing applications needing
> to be rebuilt.

Note this is not a panacea unless the struct definition is moved to
a private-only header and the respective pointers are all that's exposed
in the public API.  So currently the client programs can just as well get
broken by future struct expansion (imagine an array of structs -- see
the sketch below).
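
To illustrate (a hedged sketch with made-up names, not pacemaker's
actual API) why constructor-based allocation tolerates appended struct
members while client-side allocation does not:

    #include <stdlib.h>

    typedef struct working_set_s {
        int num_nodes;
        /* a future library version appends new members here,
         * growing sizeof(working_set_t) */
    } working_set_t;

    /* library-provided constructor: the allocation size is decided
     * inside the library, so it always matches the library's own,
     * current struct layout */
    working_set_t *working_set_new(void)
    {
        return calloc(1, sizeof(working_set_t));
    }

    int main(void)
    {
        working_set_t *safe = working_set_new();

        /* NOT safe: sizeof() and the array stride get baked into the
         * client at compile time, so indexing breaks once the library
         * appends members to the struct */
        working_set_t baked_in[4];

        (void) safe;
        (void) baked_in;
        return 0;
    }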

> A bit of searching turned up only sbd, fence-virt, and pacemaker-mgmt
> using libpe_status (and I'm not sure pacemaker-mgmt is still active).
> But I'm curious if anyone has custom applications that might be
> affected, or has an opinion on the problem and solution here.

fence-virt (it never occurred to me that it ever had a pacemaker
backend!) was inevitably broken since 1.1.8 at the latest, due to the
renaming of the "new_ha_date" function:
https://github.com/ClusterLabs/pacemaker/commit/9d2805ab00a117ddf3d1c67e2383c7778a81230f#diff-917f93b8d6f4434bbf21cd5b8240895cL1044
hence I bet it was never actively used (an HA cluster of virtual nodes
spread across clustered hypervisors and/or even mixed topologies?
is the idea worth reviving?).

-- 
Nazdar,
Jan (Poki)




Re: [ClusterLabs Developers] [HA] future of OpenStack OCF resource agents (was: resource-agents v4.2.0)

2018-10-24 Thread Jan Pokorný
On 24/10/18 14:42 +0200, Valentin Vidic wrote:
> On Wed, Oct 24, 2018 at 01:25:54PM +0100, Adam Spiers wrote:
>> No doubt I've missed some pros and cons here.  At this point
>> personally I'm slightly leaning towards keeping them in the
>> openstack-resource-agents - but that's assuming I can either hand off
>> maintainership to someone with more time, or somehow find the time
>> myself to do a better job.
>> 
>> What does everyone else think?  All opinions are very welcome,
>> obviously.
> 
> Well, I can just comment that with all the python agents coming in,
> the resource-agents package is getting a bit heavy on the dependencies
> (at least in Debian) so we might decide to split it at some point in
> the future.

At least packaging-wise, I think it would certainly be helpful to
split the current resource-agents monolith.  Luckily, a streamlined
user experience (a catalogue of readily configurable resources)
is not necessarily in opposition to deliberate picking of particular
cherries from the basket (weak dependencies, catch-all meta packages
like the fence-agents-all dependencies-only RPM, etc.) -- exactly
what would avoid the dependency creep.

Sorry for going slightly off-topic; I don't have any opinion on the
main matter under discussion.

-- 
Nazdar,
Jan (Poki)




Re: [ClusterLabs Developers] [RFC] Time to migrate authoritative source forge elsewhere?

2018-10-23 Thread Jan Pokorný
On 08/06/18 00:21 +0200, Jan Pokorný wrote:
> On 07/06/18 15:40 -0500, Ken Gaillot wrote:
>> On Thu, 2018-06-07 at 11:01 -0400, Digimer wrote:
>>> I think we need to hang tight and wait to see what the landscape
>>> looks like after the dust settles. There are a lot of people on
>>> different projects under the Clusterlabs group. To have them all
>>> move in coordination would NOT be easy. If we do move, we need to
>>> be certain that it's worth the hassle and that we're going to the
>>> right place.
>>> 
>>> I don't think either of those can be met just now. Gitlab has had
>>> some well publicized, major problems in the past. No solution I
>>> know of is totally open, so it's a question of "picking your
>>> poison" which doesn't make a strong "move" argument.
>>> 
>>> I vote to just hang tight, say for 3~6 months, then start a new
>>> thread to discuss further.
>> 
>> +1
>> 
>> I'd wait until the dust settles to see if a clear favorite emerges.
>> Hopefully this will spur the other projects to compete more strongly on
>> features.
>> 
>> My gut feeling is that ClusterLabs may end up self-hosting one or
>> another of the open(ish) projects; our traffic is low enough it
>> shouldn't involve much admin. But as you suggested, I wouldn't look
>> forward to the migration. It's a time sink that means less coding on
>> our projects.
> 
> Hopefully not at all:
> https://docs.gitlab.com/ce/user/project/import/github.html
> 
> Btw. just to prevent any sort of squatting, I've registered
> https://gitlab.com/ClusterLabs & sharing now the intended dedication
> of this namespace publicly in a signed email in case it will turn
> up useful and the bus factor or whatever kicks in.

I guess you could see this thread bump coming: with the recent
lack of HA at GitHub [1,2] (some rumours guessed that the problem
in question might have had something to do with split brain scenarios
-- what a fitting reminder of the consequences, isn't it?), there's
a new opportunity to get the ball slowly rolling again and
reconsider where the biggest benefits vs. losses (e.g. suboptimal
merge reviews) lie, and whether now isn't a suitable time
for action.

[1] https://blog.github.com/2018-10-21-october21-incident-report/
[2] https://blog.github.com/2018-10-22-incident-update/

-- 
Nazdar,
Jan (Poki)




Re: [ClusterLabs Developers] CIB daemon up and running

2018-08-13 Thread Jan Pokorný
On 13/08/18 10:19 -0500, Ken Gaillot wrote:
> On Mon, 2018-08-13 at 05:36 +, Rohit Saini wrote:
>> Gentle Reminder!!
>>  
>> From: Rohit Saini 
>> Sent: 31 July 2018 10:34
>> To: 'developers@clusterlabs.org' 
>> Subject: CIB daemon up and running
>>  
>> Hello,
>>  
>> After “pcs cluster start”, how would I know if my CIB daemon has come
>> up and is initialized properly.
>> Currently I am checking output of “cibadmin -Q” periodically and when
>> I get the output, I consider CIB daemon has come up and initialized.
>>  
>> Is there anything better than this? I am looking for some
>> optimizations with respect to above.
>>  
>>  
>> Thanks,
>> Rohit
> 
> That's probably the best way available currently. You could copy the
> source code of cibadmin and modify it to do the query in a loop until
> successful, if you wanted to make it more convenient.

IOW, the plain polling mentioned in my previous reply:
https://lists.clusterlabs.org/pipermail/developers/2018-July/001271.html

Have you missed that, Rohit?  What's your ultimate objective?
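
For completeness, the equivalent of that loop as a trivial C sketch
(merely shelling out to the real cibadmin; no smarter than polling
from a shell script):

    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        /* retry until the CIB answers a query */
        while (system("cibadmin -Q >/dev/null 2>&1") != 0) {
            sleep(1);
        }
        return 0;  /* pacemaker-based is up and responding */
    }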

-- 
Nazdar,
Jan (Poki)




Re: [ClusterLabs Developers] CIB daemon up and running

2018-07-31 Thread Jan Pokorný
Hello Rohit,

On 31/07/18 05:03 +, Rohit Saini wrote:
> After "pcs cluster start", how would I know if my CIB daemon has
> come up and is initialized properly.
> Currently I am checking output of "cibadmin -Q" periodically and
> when I get the output, I consider CIB daemon has come up and
> initialized.
> 
> Is there anything better than this? I am looking for some
> optimizations with respect to above.

The natural question: what's your wider goal here?

Do you want to establish the connection with the CIB (the daemon
got renamed to pacemaker-based as of 2.0) as soon as possible,
as a readiness indication for your scripting/application on top
of pacemaker?  I actually suspect we are back in automation waters
(Ansible?)...  Then, the users list might actually be a more suitable
venue to discuss this (CC'd).

The client/server arrangement of local inter-process communication
will hardly allow for anything better than polling at this time,
with one exception being a slight possibility of using inotify to
hook in when the /dev/shm/qb-cib_* file gets created.  That would,
however, rely on some kind of an implementation detail, which is
indeed discouraged, as it's generally a moving target (an
illustrative sketch follows below).
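
For illustration only, a hedged sketch of that inotify approach in C
-- it leans on the qb-cib_* shared-memory naming, i.e. exactly the
kind of implementation detail cautioned against above, so a successful
cibadmin -Q query should still be the final arbiter:

    #include <stdio.h>
    #include <string.h>
    #include <sys/inotify.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096]
            __attribute__((aligned(__alignof__(struct inotify_event))));
        int fd = inotify_init();

        if (fd < 0 || inotify_add_watch(fd, "/dev/shm", IN_CREATE) < 0) {
            perror("inotify");
            return 1;
        }
        for (;;) {
            ssize_t len = read(fd, buf, sizeof(buf));  /* blocks */
            char *p = buf;

            while (len > 0 && p < buf + len) {
                struct inotify_event *ev = (struct inotify_event *) p;

                /* qb-cib_* files back the CIB daemon's IPC */
                if (ev->len > 0 && strncmp(ev->name, "qb-cib", 6) == 0) {
                    printf("CIB IPC endpoint appeared: %s\n", ev->name);
                    close(fd);
                    return 0;  /* now confirm with, e.g., cibadmin -Q */
                }
                p += sizeof(*ev) + ev->len;
            }
        }
    }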

If there's a demand, we could possibly add a DBus interface and
emit signals about events like these -- that might also alleviate
busy waiting/polling (or unreliable guarantees) in crm/pcs when
it comes to cases like this as well, I guess.  Or is there any
better idea how to move towards fully event-driven system?

-- 
Nazdar,
Jan (Poki)




[ClusterLabs Developers] [questionnaire] Do you overload pacemaker's meta-attributes to track your own data?

2018-06-28 Thread Jan Pokorný
Hello, and since it is a month since the preceding attempt to gather
some feedback, welcome to yet another simple set of questions that
I will be glad to have answered by as many of you as possible,
as an auxiliary indicator of what's generally acceptable and what's not
within the userbase.

This time, I need to introduce the context of the questions, since
that's important, and I am sorry it's rather long (feel free to skip
down to the same original indentation level if you are pressed for time):

  As you've surely heard when in touch with pacemaker, there's
  a level of declarative annotations for resources (whether primitive
  or otherwise), their operations and few other entities.  You'll
  find which ones (which identifiers in variable assignments emulated
  with identifier + value pairs) can be effectively applied in which
  context in the documentation[1] -- these are comprehended by
  pacemaker and put into the resource allocation equations.

  Perhaps less known is the fact that these sets are open to possibly
  foreign, user-defined assignments that may effectively overload
  the primary role of meta-attributes, dragging user-defined semantics
  there.  There may be warnings about doing so at the high-level
  management tools, but pacemaker won't protest by design, as this
  is also what allows for smooth configuration reuse with various
  point releases possibly acquiring new meanings for new identifiers.

  This possibility of a free-form consumer extensibility doesn't appear
  to be advertised anywhere (perhaps to prevent people confusing CIB,
  the configuration hierarchy, with generic key-value store, which it
  is rather not), and within the pacemaker configuration realms, it
  wasn't useful until it started to be an optional point of interest
  in location constraints thanks to the ability to refer to
  meta-attributes in the respective rules based on "value-source"
  indirection[2],
  which arrived with pacemaker 1.1.17.

  More experienced users/developers (intentionally sent to both lists)
  may already start suspecting potential namespace collisions between
  a narrow but possibly growing set of identifiers claimed by pacemaker
  for its own (and here original) purpose, and those that are added
  by users, either so as to pose in the mentioned constraint rules
  or for some other, possibly external automation related purpose.

  So, I've figured out that with the upcoming 2.0 release, we have a nice
  opportunity to start doing something about that, and the least
  effort, fully backward + tooling compatible, that would start
  getting us to a conflict-less situation is, in my opinion, to start
  actively pushing for a lexical cut, asking for a special
  prefix/naming convention for the mentioned custom additions.
  
  This initiative is meant to consist of two steps:
  
  a. modify the documentation to expressly detail said lexical
 requirement
 - you can read draft of my change as a pull request for pacemaker:
   https://github.com/ClusterLabs/pacemaker/pull/1523/files
   (warning: the respective discussion was somewhat heated,
   and is not a subject of examination nor of a special interest
   here), basically I suggest "x-*" naming, with full recommended
   convention being "x-appname_identifier"
  
  b. add a warning to the logs/standard error output (daemons/CLI)
 when not recognized as pacemaker's claimed identifier nor
 starting with dedicated prefix(es), possibly referring to
 the documentation stanza per a., in a similar way the user
 gets notified that no fencing devices were configured
 - this would need to be coded (see the sketch right after this list)
 - note that this way, you would get actually warned about
   your own typos in the meta-attribute identifiers even
   if you are not using any high-level tooling
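
  A hedged sketch of what the warning in b. could boil down to
  (illustrative only: the claimed-name list is abridged and the
  function is hypothetical):

      #include <stdio.h>
      #include <string.h>

      /* abridged; pacemaker recognizes many more */
      static const char *pcmk_claimed[] = {
          "target-role", "is-managed", "migration-threshold", NULL
      };

      static void check_meta_attr_name(const char *name)
      {
          for (const char **k = pcmk_claimed; *k != NULL; k++) {
              if (strcmp(name, *k) == 0) {
                  return;  /* pacemaker's own, recognized identifier */
              }
          }
          if (strncmp(name, "x-", 2) == 0) {
              return;  /* properly namespaced custom addition */
          }
          fprintf(stderr, "warning: meta-attribute '%s' is neither"
                  " recognized nor prefixed with 'x-' (a typo, or"
                  " a future name clash?)\n", name);
      }

      int main(void)
      {
          check_meta_attr_name("target-role");         /* silent */
          check_meta_attr_name("x-myapp_owner");       /* silent */
          check_meta_attr_name("migration-treshold");  /* warns: typo */
          return 0;
      }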

  This may be the final status quo, or the eventual separation
  of the identifiers makes it really easy to perform other schema
  upgrade related steps with future major schema version bumps
  _safely_.  Nobody is immediately forced to anything, although
  the above points should make it clear it's prudent to get ready
  (e.g. also regarding the custom tooling around that) with respect
  to future major pacemaker/schema version bumps and the respective
  auto-upgrades of the configuration (say it will be declared
  it's valid to upgrade to pacemaker 3.0 only from as old pacemaker
  as 2.0 -- that's the justification for acting _now_ with preparing
  sane grounds slowly).

* * *

So now the promised questions; just send a reply where you [x] tick
your selections for the questions below, possibly with some more
commentary on the topic, and preferably on-list (a single one of
your choice is enough):

1. In your cluster configurations, do you carry meta-attributes
   other than those recognized by pacemaker?

   [ ] no

   [ ] yes (if so, can you specify whether for said constraints
rules, as a way to permanently attach some kind of
administrative 

Re: [ClusterLabs Developers] [RFC] Time to migrate authoritative source forge elsewhere?

2018-06-08 Thread Jan Pokorný
On 07/06/18 11:10 +, Nils Carlson wrote:
> On 2018-06-07 08:58, Kristoffer Grönlund wrote:
>> Jan Pokorný  writes:
>>> AFAIK this doesn't address the qualitative complaint I have.  It makes
>>> for a very poor experience when there's no readily available way to
>>> observe evolution of particular patchsets, only to waste time of the
>>> reviewer or contribute to oversights ("I'll skip this part I am sure
>>> I reviewed already, if there was a generational diff, I'd have a look,
>>> but the review is quite a pain already, I'll move on").
>>> No, setting up a bot to gradually capture work in progress is not
>>> a solution.  And pull-request-per-patchset-iteration sounds crazy
>>> considering this count sometimes goes pretty high.
>>> 
>> 
>> I'll confess that I have no experience with Gerrit or the Github
>> required reviews, and I don't really know how they differ. :)
> 
> 
> Adding some info as these are things I know something about.
> 
> Gitlab & Github are very similar, but I much prefer Gitlab after having used
> both.
> 
> For open-source projects Gitlab gives you all features, including things
> like "approvers" for merge-requests. They have a nice permission model which
> allows only some users to approve merge requests and to set a minimum number
> of approvers.
> 
> The fundamental unit of review in Gitlab is the merge-request, requesting
> that a branch be merged into another. This works very well in practice. You
> can configure a regex for branch names and only allow users to push to
> branches with a prefix like "contributions/", making all other branches
> "protected", i.e. prevent direct pushes.
> 
> The code-review is good, but could be better. Every time you update the
> branch (either amending a commit or pushing a new commit) this creates a new
> "version" of the merge-request that you can diff against previous versions.

I must admit, this would be a killer feature for me (see the above
rant) and the best trade-off if willingness to try/adopt Gerrit
is unlikely.

> The bad thing here is that comments are not always carried over as they
> should be. There is also no way of marking a file as reviewed, so large
> reviews can be cumbersome. The good news is that this stuff is improving
> slowly.
> 
> Gerrit is a much more powerful tool for code-review. The workflow is less
> intuitive however and has a far higher learning curve. It requires specific
> hooks to be installed to work well and works by a "patch-set" concept. You
> push your changes to a "for" branch, i.e. "for-master" and they then end up
> on an unnamed branch on the server in a review. From there they can be
> pulled and tested.
> 
> The code-review is top-notch, with comments attached to a version of the
> patch-set and intra-version diffs being quick and elegant.
> 
> The negative sides of Gerrit typically outweigh the positive for most
> organizations I'm afraid:
> 
> - No central hosting like gitlab.com.
> - High threshold for new contributors (unusual workflow, hooks needed. )
> - No bugs/issues etc. But good jira integration.
> 
> I haven't tried pagure. There is also gitea which looks promising. And
> bitbucket.

Thanks for sharing your thoughts, Nils, appreciated.

P.S. Your post may be stuck in the moderation queue; hopefully this
is resolved soon (as a rule of thumb, I recommend subscribing to the
particular list first if not already, but there can be additional
anti-spam measures for first-time/unrecognized posters).

-- 
Poki




Re: [ClusterLabs Developers] [RFC] Time to migrate authoritative source forge elsewhere?

2018-06-07 Thread Jan Pokorný
On 07/06/18 15:40 -0500, Ken Gaillot wrote:
> On Thu, 2018-06-07 at 11:01 -0400, Digimer wrote:
>> I think we need to hang tight and wait to see what the landscape
>> looks like after the dust settles. There are a lot of people on
>> different projects under the Clusterlabs group. To have them all
>> move in coordination would NOT be easy. If we do move, we need to
>> be certain that it's worth the hassle and that we're going to the
>> right place.
>> 
>> I don't think either of those can be met just now. Gitlab has had
>> some well publicized, major problems in the past. No solution I
>> know of is totally open, so it's a question of "picking your
>> poison" which doesn't make a strong "move" argument.
>> 
>> I vote to just hang tight, say for 3~6 months, then start a new
>> thread to discuss further.
> 
> +1
> 
> I'd wait until the dust settles to see if a clear favorite emerges.
> Hopefully this will spur the other projects to compete more strongly on
> features.
> 
> My gut feeling is that ClusterLabs may end up self-hosting one or
> another of the open(ish) projects; our traffic is low enough it
> shouldn't involve much admin. But as you suggested, I wouldn't look
> forward to the migration. It's a time sink that means less coding on
> our projects.

Hopefully not at all:
https://docs.gitlab.com/ce/user/project/import/github.html

Btw. just to prevent any sort of squatting, I've registered
https://gitlab.com/ClusterLabs & sharing now the intended dedication
of this namespace publicly in a signed email in case it will turn
up useful and the bus factor or whatever kicks in.

-- 
Poki




Re: [ClusterLabs Developers] [RFC] Time to migrate authoritative source forge elsewhere?

2018-06-07 Thread Jan Pokorný
On 04/06/18 09:23 +0200, Jan Pokorný wrote:
> As a second step, it might also be wise to start offering release
> tarballs elsewhere, preferrably OpenPGP-signed proper releases
> (as in "make dist" or the like) -- then it can be served practically
> from whatever location without imminent risk of being tampered with.

Meanwhile in Gitea land (another alternative for self-hosting):
https://github.com/go-gitea/gitea/issues/4167

A practical demonstration of why to sign releases (tags, commits, ...),
and why the permissions aspect of mixing proprietary and self-managed
services sucks.

-- 
Poki




Re: [ClusterLabs Developers] [RFC] Time to migrate authoritative source forge elsewhere?

2018-06-07 Thread Jan Pokorný
On 07/06/18 08:48 +0200, Kristoffer Grönlund wrote:
> Jan Pokorný  writes:
>> But with the latest headlines on where that site is likely headed,
>> I think it's a great opportunity for us to possibly jump on the
>> bandwagon inclined more towards free (as in freedom) software
>> principles.
>> 
>> Possible options off the top of my head:
>> - GitLab, pagure: either their authoritative sites or self-hosted
>> - self-hosted cgit/whatever
>> 
>> It would also allow us to reconsider our workflows, e.g. using gerrit
>> for patch review queue (current silent force-pushes is a horrible
>> scheme!).
>> 
> My general view is that I also feel (and have felt) a bit uneasy about
> free software projects depending so strongly on a proprietary
> service. However, unless self-hosting, I don't see how f.ex. GitLab is
> much of an improvement

Open-core business approach aside, as a perhaps necessary downside at
these scales, the difference is crucial: Community Edition is open
source, so anyone can host it individually, which is what enabled
both Debian and GNOME to consider its usage (it became a reality
for the latter: https://gitlab.gnome.org/explore/groups,
https://www.gnome.org/news/2018/05/gnome-moves-to-gitlab-2/)

Feature-wise:
https://wiki.debian.org/Alioth/GitNext/GitLab
https://wiki.debian.org/Alioth/GitNext
https://wiki.gnome.org/Initiatives/DevelopmentInfrastructure/FeatureMatrix

> (Pagure might be a different story, but does it offer a comparable
> user experience?) in that regard, and anything hosted on "public"
> cloud is basically the same. ;)

Pagure has the benefit that you can influence it relatively easily,
as I directly attested :-)

> crmsh used to be hosted at GNU Savannah, which is Free with a capital F,
> but the admin experience, user experience and general discoverability in
> the world at large all left something to be desired.
> 
> In regard to workflows, if everyone agrees, we should be able to improve
> that without moving. For example, if all changes went through pull
> requests, there is a "required reviews" feature in github. I don't know
> if that is something everyone want, though.
> 
> https://help.github.com/articles/enabling-required-reviews-for-pull-requests/

AFAIK this doesn't address the qualitative complaint I have.  It makes
for a very poor experience when there's no readily available way to
observe evolution of particular patchsets, only to waste time of the
reviewer or contribute to oversights ("I'll skip this part I am sure
I reviewed already, if there was a generational diff, I'd have a look,
but the review is quite a pain already, I'll move on").
No, setting up a bot to gradually capture work in progress is not
a solution.  And pull-request-per-patchset-iteration sounds crazy
considering this count sometimes goes pretty high.


In the short term, I'd suggest concentrating on the two points I raised:
- good discipline regarding commit messages
- more systemic approach to release tarballs if possible

-- 
Poki




Re: [ClusterLabs Developers] Impact of changing Pacemaker daemon names on other projects?

2018-04-16 Thread Jan Pokorný
On 16/04/18 14:32 +0200, Klaus Wenninger wrote:
> On 04/16/2018 01:52 PM, Jan Pokorný wrote:
>> On 29/03/18 11:13 -0500, Ken Gaillot wrote:
>>> 4. Public API symbols: for example, crm_meta_name() ->
>>> pcmk_meta_name(). This would be a huge project with huge impact, and
>>> will definitely not be done for 2.0.0. We would immediately start using
>>> the new convention for new API symbols, and more slowly update existing
>>> ones (with compatibility wrappers for the old names).
>> 
>> Value added here would be putting some commitment behind the "true
>> public API" when the symbols get sifted carefully, leaving some other
>> naming prefixes reserved for private only ones (without any commitment
>> whatsoever).
> 
> Like e.g. pcmk_* & pcmkpriv_*  (preferably something shorter
> for the latter) ?

Yes, something like that (pcmk_* vs. anything not starting with "pcmk_"
might suffice), which would allow for compiling the library(ies) twice
-- once for public use (only "public API" symbols visible), once
for pacemaker's own usage (libpcmk_foo_private.so, everything non-static
visible).  That might be a first step towards something supportable:
starting with literally nothing in the public version, then gradually
growing the numbers, with almost no hassle other than adding symbols
to an external list and/or renaming formerly private-only symbols so
as to match the regexp/glob.  All native executables would naturally
link against the libpcmk_foo_private versions.  Later on, these can
be merged or otherwise restructured.
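
One conceivable realization of the dual build, sketched with GCC/Clang
symbol visibility (the macro and function names below are hypothetical,
and a linker version script could achieve the same):

    /* public variant:  cc -fvisibility=hidden -DPCMK_PUBLIC_BUILD ...
     * private variant: cc ...  (everything non-static stays visible) */
    #if defined(PCMK_PUBLIC_BUILD)
    #define PCMK_API __attribute__((visibility("default")))
    #else
    #define PCMK_API
    #endif

    /* part of the committed-to public API */
    PCMK_API const char *pcmk_meta_name(const char *field);

    /* no "pcmk_" prefix: hidden in the public build, but visible in
     * libpcmk_foo_private.so for pacemaker's own executables */
    const char *pcmkpriv_meta_helper(const char *field);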

-- 
Poki




Re: [ClusterLabs Developers] Impact of changing Pacemaker daemon names on other projects?

2018-04-16 Thread Jan Pokorný
On 29/03/18 11:13 -0500, Ken Gaillot wrote:
> As I'm sure you've seen, there is a strong sentiment on the users list
> to change all the Pacemaker daemon names in Pacemaker 2.0.0, mainly to
> make it easier to read the logs.
> 
> This will obviously affect any other scripts and projects that look for
> the old names. I'd like to hear more developer input on how far we
> should go with this, and how much or little of a headache it will
> cause. I'm interested in both the public projects that use pacemaker
> (crmsh, pcs, sbd, dlm, openstack) and one-off scripts that people
> commonly put together.
> 
> In order of minimum impact to maximum impact, we could actually do this
> in stages:
> 
> 1. Log tags: This hopefully wouldn't affect anyone. For example, from
> 
> Mar 12 12:10:49 [11120] node1 pacemakerd: info:
> crm_log_init: Changed active directory to /var/lib/pacemaker/cores
> 
> to
> 
> Mar 12 12:10:49 [11120] node1 pcmk-launchd: info:
> crm_log_init: Changed active directory to /var/lib/pacemaker/cores
> 
> 2. Process names: what shows up in "ps". I'm hoping this would affect
> very little outside code, so we can at least get this far.
> 
> 3. Library names: for example, -lstonithd to -lpcmk-fencing. Other
> projects would need their configure script to auto-detect which is
> available. Not difficult, but it makes all older versions of other
> projects incompatible with Pacemaker 2.0. This is mostly what I want
> feedback on, whether this is a good idea. The only advantage is
> consistency and clarity.

The good news is that pkg-config/pkgconf (PKG_CHECK_MODULES et al.
Autoconf macros) honours the names of *.pc files, hence compatibility
can be maintained with symlinks.

> 4. Public API symbols: for example, crm_meta_name() ->
> pcmk_meta_name(). This would be a huge project with huge impact, and
> will definitely not be done for 2.0.0. We would immediately start using
> the new convention for new API symbols, and more slowly update existing
> ones (with compatibility wrappers for the old names).

Value added here would be putting some commitment behind the "true
public API" when the symbols get sifted carefully, leaving some other
naming prefixes reserved for private only ones (without any commitment
whatsoever).

-- 
Poki




Re: [ClusterLabs Developers] New challenges with corosync 3/kronosnet + pacemaker

2018-02-19 Thread Jan Pokorný
On 09/02/18 17:55 -0600, Ken Gaillot wrote:
> On Fri, 2018-02-09 at 18:54 -0500, Digimer wrote:
>> On 2018-02-09 06:51 PM, Ken Gaillot wrote:
>>> On Fri, 2018-02-09 at 12:52 -0500, Digimer wrote:
>>>> On 2018-02-09 03:27 AM, Jan Pokorný wrote:
>>>>> there is certainly whole can of these worms, put first that
>>>>> crosses my mind: performing double (de)compression on two levels
>>>>> of abstraction in the inter-node communication is not very
>>>>> clever, to put it mildly.
>>>>> 
>>>>> So far, just pacemaker was doing that for itself under certain
>>>>> conditions, now corosync 3 will have it's iron in this fire
>>>>> through kronosnet, too.  Perhaps something to keep in mind to
>>>>> avoid exercises in futility.
>>>> 
>>>> Can pacemaker be told to not do compression? If not, can that be
>>>> added in pacemaker v2?
>>> 
>>> Or better yet, is there some corosync API call we can use to
>>> determine whether corosync/knet is using compression?
>>> 
>>> There's currently no way to turn compression off in Pacemaker,
>>> however it is only used for IPC messages that pass a fairly high
>>> size threshold, so many clusters would be unaffected even without
>>> changes.
>> 
>> Can you "turn off compression" but just changing that threshold to
>> some silly high number?
> 
> It's hardcoded, so you'd have to edit the source and recompile.

FTR, since half a year ago, I've had some resources noted for further
investigation on this topic of pacemaker-level compression -- since
it compresses XML, there are some specifics of the input that suggest
more effective processing is possible.

Indeed, there's a huge, rigorously maintained benchmark of non-binary
file compression that coincidentally also aims at XML files
(though presumably more text-oriented than structure-oriented):

  http://mattmahoney.net/dc/text.html

Basically, I can see two (three) categories of possible optimizations:

0. pre-fill the scan dictionary for the compression algorithm
   with sequences that are statistically (constantly) most frequent
   (a priori known tag names? -- see the sketch right after this list)

1. preprocessing of XML to allow for more efficient generic
   compression (like with bzip2 that is currently utilized), e.g.

   * XMill
 - https://homes.cs.washington.edu/~suciu/XMILL/

   * XWRT (XML-WRT)
 - https://github.com/inikep/XWRT

2. more efficient algorithms as such for non-binary payloads
   (the benchmark above can help with selection of the candidates)

* * *

That being said, there are legitimate reasons to want merely the
high-level messaging to be involved with compression, because that's
the only layer intimate with the respective application-specific
data and hence the only one that can provide optimal compression
methods beyond the reach of the generic ones.

-- 
Poki




[ClusterLabs Developers] [ANTICIPATED FAQ] libqb v1.0.3 vs. binutils' linker (Was: [Announce] libqb 1.0.3 release)

2017-12-21 Thread Jan Pokorný
I've been meaning to spread the following piece of advice but forgot...

On 21/12/17 17:45 +0100, Jan Pokorný wrote:
> On 21/12/17 14:40 +, Christine Caulfield wrote:
>> We are pleased to announce the release of libqb 1.0.3
>> 
>> 
>> Source code is available at:
>> https://github.com/ClusterLabs/libqb/releases/download/v1.0.3/libqb-1.0.3.tar.xz
>> 
>> 
>> This is mainly a bug-fix release to 1.0.2
>> 
>> [...]
> 
> Thanks Chrissie for the release; I'd like to take this opportunity to
> pick on one particularly important thing for "latest greatest pursuing"
> system deployments and distributions:
> 
>> High: bare fix for libqb logging not working with ld.bfd/binutils 2.29+
> 
> Together with auxiliary changes likewise present in v1.0.3, this
> effectively allows libqb to fulfil its logging duty properly also
> when any participating binary part (incl. libqb as a library itself)
> was build-time linked with a standard linker (known as ld or ld.bfd)
> from binutils 2.29 or newer.  Previous libqb releases would fail
> one way or another to proceed the messages stemming from ordinary way
> to issue them under these circumstances (and unless the linker feature
> based offloading was bypassed, which happens, e.g., for selected
> architectures [PowerPC] or platforms [Cygwin] automatically).

So now, you may face these questions:

Q1: Given the fact there was no SONAME bump (marking binary
compatibility being preserved) with libqb v1.0.3, do I have
to rebuild everything depending on libqb once I deploy this
new, "log-fixing" version?

A1: First, yes, the public-facing ABI remains unchanged.  Second, it
    depends on whether these dependent components have anything in
    common with the ld linker from binutils 2.29+:

- every component that has already been build-time linked using
  such a linker prior to deploying the log-fixing libqb version
  (just this v1.0.3 and newer if we talk about official releases)
  SHOULD be recompiled with the log-fixing libqb in the build-time
  link (note that libqb pre-1.0.3 will likewise break the logging
  of the run-time linked-by programs when build-time linked using
  such a linker, but that's off-topic as we discuss
  post-deployment of the log-fixing version)

- for extra sanity, you may consider rebuilding such components,
  which will gain an advantage in case there's a risk of libqb
  being downgraded to "pre-1.0.3 version that was build-time
  linked with binutils 2.29+" -- but the mitigation measure will
  ONLY have an effect in case the component in question uses the
  QB_LOG_INIT_DATA macro defined in the qblog.h header file of libqb
  (e.g., pacemaker does)

- otherwise, no component needs rebuilding if it was previously
  built using pre-2.29 binutils' linker, it shall combine with
  new log-fixing libqb (build-time linked with whichever binutils'
  linker) just fine

- mind that some minor exceptions do apply (see the end of the
  quoted response wrt. architectures and platforms) but are left
  out from the previous consideration


Please respond on either or both lists should you have more
questions.

I am far from having tested every possible combination of mixing
various build-time linkers/libqb versions per partes for the software
pieces that will eventually get linked together, but I tried to cover
that space exhaustively in the limited dimensions, so let's say I have
some insights and intuition, and we can always test a particular
set of input configuration variables by hand to get any wiser.

-- 
Jan (Poki)




Re: [ClusterLabs Developers] [libqb] heads-up: logging not working with binutils-2.29 standard linker (ld.bfd)

2017-12-15 Thread Jan Pokorný
On 19/10/17 22:49 +0200, Jan Pokorný wrote:
> The reconciling patchset is not merged yet, but I'd say it's in
> good shape: https://github.com/ClusterLabs/libqb/pull/266
> 
> Testing is requested, of course ;)

We finally got to merge it with some further changes, and there are
just a few more cleanups pending till the upcoming new release.

-- 
Jan (Poki)




Re: [ClusterLabs Developers] RA as a systemd wrapper -- the right way?

2017-12-05 Thread Jan Pokorný
On 02/12/17 21:00 +0100, Jan Pokorný wrote:
> https://jdebp.eu/FGA/unix-daemon-readiness-protocol-problems.html
> 
> Quoting it:
>   Of course, only the service program itself can determine exactly
>   when this point [of being ready, that, "is about to enter its main
>   request processing loop"] is.
> 
> There's no way around this.
> 
> The whole objective of OCF standard looks retrospectively pretty
> sidetracked through this lense: instead of pulling weight of the
> semiformal standardization body (comprising significant industry
> players[*]) to raise awareness of this solvable reliability
> discrepancy, possibly contributing to generally acknowledged,
> resource manager agnostic solution (that could be inherited the
> next generation of the init systems), it just put a little bit of
> systemic approach to configuration management and monitoring on
> top of the legacy of organically grown "good enough" initscripts,
> clearly (because of inherent raciness and whatnot) not very suitable
> for the act of supervision nor for any sort of reactive balancing
> to satisfy the requirements (crucial in HA, polling interval-based
> approach leads to losing trailing nines needlessly for cases you
> can be notified about directly).

... although there was clearly a notion of employing asynchronous
mechanisms (one can infer, for a technically more sound binding between
the resource manager and the supervised processes) even some 14+ years
ago:
https://github.com/ClusterLabs/OCF-spec/commit/2331bb8d3624a2697afaf3429cec1f47d19251f5#diff-316ade5241704833815c8fa2c2b71d4dR422

> Basically, that page also provides an overview of the existing
> "formalized intefaces" I had in mind above, in its "Several
> incompatible protocols with low adoption" section, including
> the mentioned sd_notify way of doing that in systemd realms
> (and its criticism just as well).
> 
> Apparently, this is a recurring topic because to this day, the problem
> hasn't been overcome in generic enough way, see NetBSD, as another
> example:
> https://mail-index.netbsd.org/tech-userlevel/2014/01/28/msg008401.html
> 
> This situation, caused by a lack of interest to get things right
> in the past plus OS ecosystem segmentation playing against any
> conceivable attempt to unify on a portable solution, is pretty
> unsettling :-/
> 
> [*] see https://en.wikipedia.org/wiki/Open_Cluster_Framework

-- 
Jan (Poki)




Re: [ClusterLabs Developers] New LVM resource agent name (currently LVM-activate)

2017-11-23 Thread Jan Pokorný
[this follow-up is mostly to re-CC some people that were gradually
 omitted as the thread progressed; I am not sure who among them is
 subscribed and who is not]

On 23/11/17 20:27 +0100, Jan Pokorný wrote:
> On 23/11/17 16:54 +0800, Eric Ren wrote:
>>> What about VolumeGroup (in the tradition of Filesystem, for instance)?
> 
>> In the LVM-activate, we will support both all VG activation and only
>> one specified LV activation depending on the parameters.
> 
> This non-educated suggestion was driven solely by the fact that VG
> needs to always be specified.
> 
>>> Or why not shoot for an LVM merge (plus proper versioning to tell
>>> the difference)?
>> 
>> You mean merging LVM-activate with the existing LVM?
> 
> Yep, see below.
> 
>> Here was a long discussion about that:
>> 
>> https://github.com/ClusterLabs/resource-agents/pull/1040
> 
> Honestly, was only vaguely aware of some previous complaints from
> Dejan in a different thread, but otherwise unlightened on what's
> happening.
> 
> And I must admit, I am quite sympathetic to the non-articulated wish
> of knowing there's a plan to give a new spin to enclustered LVM
> beforehand -- afterall, adoption depends also on whether the situation
> is/will be clear to the userbase.  Some feedback could have been
> gathered earlier -- perhaps something to learn some lessons from
> for the future.
> 
> Putting the "community logistics" issue aside...
> 
> Bear with me, I am only very slightly familiar with the storage field.
> I suspect there are some framed pictures of "LVM-activate" use that
> are yet to be recognized.  At least it looks to me like one of them
> is to couple+serialize lvmlockd agent instance followed with
> "LVM-activate".  In that case, the latter seems to be spot-on naming
> and I'd rather speculate about the former to make this dependency
> clearer, renaming it to something like LVM-prep-lvmlockd :-)
> 
> [At this point, I can't refrain myself from reminding how handy the
>  "composability with parameter reuse" provision in RGManager was,
>  and how naturally it would wrap such a pairing situation (unless
>  the original assumption doesn't hold).  I don't really see how
>  that could be emulated in pacemaker, it would definitely be
>  everything but a self-contained cut.]

Wanted to add a comment on the IPaddr vs. IPaddr2 situation (which, as
mentioned, boils down to ifconfig vs. iproute2) being used for
comparison -- this is substantially a different story, as iproute2
(and in turn, IPaddr2) is Linux-only, while the whole stack is more
or less deployable on various *nixes, so having two agents in parallel,
one portable but with some deficiencies + one targeted and more capable,
makes damn good sense.  Cannot claim the same here.

-- 
Jan (Poki)




Re: [ClusterLabs Developers] [libqb] heads-up: logging not working with binutils-2.29 standard linker (ld.bfd)

2017-10-20 Thread Jan Pokorný
On 19/10/17 22:49 +0200, Jan Pokorný wrote:
> On 03/08/17 20:50 +0200, Valentin Vidic wrote:
>>> Proper solution:
>>> - give me few days to investigate better ways to deal with this
> 
> well, that estimate was off... by far :)
> 
> But given the goals of
> - as high a level of isolation of the client space from the linker
>   (respectively toolchain) subtleties as possible (no new compilation
>   flags and such on that side)
> - universality, as you don't really want to instruct libqb users
>   to use this set of flags with linker A and this with linker B,
>   and there's no way to hook any computational ad-hoc decision
>   when the compilation is about to happen (and particular linker
>   to be used + it's version are quite obscured in the build pipeline
>   so only configure-like checks would be viable, anyway)
> - mapping the exact scope of the issue for basic combinations of
>   link participants each doing the logging on its own and possibly
>   differing in the linker used to build them into standalone shared
>   libraries or executable using the former + cranking up the runner
>   of the underlying test matrix
> - some reasonable assurance that logging is not silently severed (see
>   the headaches note below)

I would have forgotten the most important one(!):
- BINARY COMPATIBILITY (ABI) is PRESERVED, except for a single "ABI
  nongracefulness" I am aware of but that's more a consequence of
  slightly incorrect assumptions in the logic of QB_LOG_INIT_DATA
  macro function predating this whole affair by a long shot and which
  the patchset finally rectifies:
  if in the run-time dynamic link, the following is combined:
  (. libqb, arbitrary variant: pre-/post-fix, binutils < / >= 2.29)
  . an "intermediate" library (something that the end executable links
with) triggering QB_LOG_INIT_DATA macro and being built with
pre-fix libqb (and perhaps only with binutils < 2.29)
  . end executable using no libqb's logging at all, but being built
with post-fix libqb (and arbitrary binutils < / >= 2.29)
  then, unlike when executable is built with pre-fix libqb, the
  special callsite data containing section in the ELF structure
  of the executable is created + its boundary denoting symbols
  defined within, despite the section being empty (did not happen
  with pre-fix libqb), and because the symbols defined within the
  target program have priority over that of shared libraries in the
  symbol resolution fallback scheme, the assertion of QB_LOG_INIT_DATA
  of the mentioned intermediate library will actually be evaluating
  the inequality of boundaries for the section of the executable(!)
  rather than its own (or whatever higher prio symbols are hit,
  presumably only present if the section at that level is non-empty,
  basically a generalization of the story so far);

  the problem then manifests as an inability to run said executable
  as it will fail because of the intermediate-library-inflicted
  assertion (sadly with a very unhelpful "Assertion `0' failed"
  message);

  fortunately, there's enough flexibility as to how to fix
  this; either of the following should be fine:
  . have everything in the executable's library dependency closure
that links against libqb assurably linked with one variant of
libqb only (either all pre-fix or post-fix)
  . have the end executable (that does not use logging at all as
discussed precondition) linked using substitution like this:
s/-lqb/-l:libqb.so.0/  (you may need to adapt the number later)
and you may also need to add this CPPFLAG for the executable:
-DQB_KILL_ATTRIBUTE_SECTION

* * *

Note: the QB_LOG_INIT_DATA macro is not that widespread in the client
  space (though pacemaker uses it and corosync did use an internal
  variant that's hopefully ditched in favour of the former:
  https://github.com/corosync/corosync/pull/251) but I would
  recommend using it anywhere the logging is involved as it
  helps to check for preconditions of functional logging
  early at startup of the executable -- hard to predict
  what more breakage is to come from the linker side :-/
  (and on that note, there was an attempt to reconcile
  linker changes in the upstream I had no idea about
  until recently:
  https://github.com/ClusterLabs/libqb/pull/266#issuecomment-337700089
  but only limited subset of the behaviour was restored, which
  doesn't help us with libqb and binutils 2.29.1 still enforces
  us to use the workaround for 2.29 -- on the other hand, no new
  breakage was introduced, so the coexistence remains settled
  as of the fix)
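
For reference, a minimal sketch of adopting that recommendation in a
client executable (assuming linkage with -lqb; the logger tag is
arbitrary):

    #include <syslog.h>
    #include <qb/qbdefs.h>
    #include <qb/qblog.h>

    /* asserts at startup that the linker kept the callsite section
     * boundaries sane -- better a loud early failure than silently
     * severed logging */
    QB_LOG_INIT_DATA(linker_sanity_check);

    int main(void)
    {
        qb_log_init("my-app", LOG_USER, LOG_INFO);
        qb_log_ctl(QB_LOG_STDERR, QB_LOG_CONF_ENABLED, QB_TRUE);

        qb_log(LOG_INFO, "logging survived the linker");

        qb_log_fini();
        return 0;
    }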

> I believe it was worth the effort.
> 
>>>   fancy linker, it will likely differ from the iterim one above
>>>   (so far, I had quite miserable knowledge of linker script and
>>>   other internals, getting bet

Re: [ClusterLabs Developers] [libqb] heads-up: logging not working with binutils-2.29 standard linker (ld.bfd)

2017-10-19 Thread Jan Pokorný
On 03/08/17 20:50 +0200, Valentin Vidic wrote:
>> Proper solution:
>> - give me few days to investigate better ways to deal with this

well, that estimate was off... by far :)

But given the goals of
- as high a level of isolation of the client space from the linker
  (respectively toolchain) subtleties as possible (no new compilation
  flags and such on that side)
- universality, as you don't really want to instruct libqb users
  to use this set of flags with linker A and this with linker B,
  and there's no way to hook any computational ad-hoc decision
  when the compilation is about to happen (and particular linker
  to be used + it's version are quite obscured in the build pipeline
  so only configure-like checks would be viable, anyway)
- mapping the exact scope of the issue for basic combinations of
  link participants each doing the logging on its own and possibly
  differing in the linker used to build them into standalone shared
  libraries or executable using the former + cranking up the runner
  of the underlying test matrix
- some reasonable assurance that logging is not silently severed (see
  the headaches note below)
I believe it was worth the effort.

>>   fancy linker, it will likely differ from the iterim one above
>>   (so far, I had quite miserable knowledge of linker script and
>>   other internals, getting better but not without headaches);
>>   we should also ensure there's a safety net because realizing
>>   there are logs missing when they are expected the most
>>   ... priceless
> 
> Thank you for the effort.  There is no rush here, we just won't
> be able to upload new version to Debian unstable.

The reconciling patchset is not merged yet, but I'd say it's in
good shape: https://github.com/ClusterLabs/libqb/pull/266

Testing is requested, of course ;)

-- 
Jan (Poki)




Re: [ClusterLabs Developers] [ClusterLabs] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-09-06 Thread Jan Pokorný
On 24/07/17 16:59 +0200, Jan Pokorný wrote:
> On 23/07/17 12:32 +0100, Adam Spiers wrote:
>> Jan Pokorný <jpoko...@redhat.com> wrote:
>>> So, going to attend summit and want your key signed while reciprocally
>>> spreading the web of trust?
>>> Awesome, let's reuse the steps from the last time:
>>> 
>>> Once you have a key pair (and provided that you are using GnuPG),
>>> please run the following sequence:
>>> 
>>>   # figure out the key ID for the identity to be verified;
>>>   # IDENTITY is either your associated email address/your name
>>>   # if only single key ID matches, specific key otherwise
>>>   # (you can use "gpg -K" to select a desired ID at the "sec" line)
>>>   KEY=$(gpg --with-colons 'IDENTITY' | grep '^pub' | cut -d: -f5)
>> 
>> AFAICS this has two problems: it's missing a --list-key option,
> 
> Bummer!  I've been checking the original thread(s) for responses from
> others, but forgot to check my own:
> http://lists.linux-ha.org/pipermail/linux-ha/2015-January/048511.html
> 
> Thanks for spotting (and the public key already sent), Adam.
> 
>> and it doesn't handle multiple matches for 'IDENTITY'.  So to make it
>> choose the newest key if there are several:
>> 
>>read IDENTITY
>>KEY=$(gpg --with-colons --list-key "$IDENTITY" | grep '^pub' |
>>  sort -t: -nr -k6 | head -n1 | cut -d: -f5)
> 
> Good point.  Hopefully the affected persons, allegedly heavy users of
> GPG, are capable of adapting on-the-fly anyway :-)
> 
>>>  # export the public key to a file that is suitable for exchange
>>>  gpg --export -a -- $KEY > $KEY
>>> 
>>>  # verify that you have an expected data to share
>>>  gpg --with-fingerprint -- $KEY

Thanks to the attendees, and I am sorry for not responding to the ones
with on-the-edge submissions -- there was actually one more such
submission accepted, and I've refreshed the authoritative record about
the event at https://people.redhat.com/jpokorny/keysigning/2017-ha/
accordingly (see '*2.*' suffixes).

I'd also kindly ask the actual attendees (one person skipped the
event) to do the remaining signing work within a month at the latest.
You can just grab the key of the other, already verified party from
the linked source (or the well known key server if present), sign it,
and then (IMHO) preferably send the signed key back to the original
person at one of his/her listed email addresses, again (IMHO)
preferably in an encrypted form.
scale, such as PIUS (https://github.com/jaymzh/pius) to give an
example, but YMMV.

May the web of trust be with you.

-- 
Jan (Poki)




Re: [ClusterLabs Developers] [libqb] heads-up: logging not working with binutils-2.29 standard linker (ld.bfd)

2017-07-31 Thread Jan Pokorný
On 31/07/17 21:55 +0200, Jan Pokorný wrote:
> This might be of interest *now* if you are fiddling with bleeding
> edge, or *later* when the distros adopt that version of binutils or
> newer:  Root cause is currently unknown, but the good news is that
> the failure will be captured by the test suite.  At least this was
> the case with the recent mass rebuild in Fedora Rawhide.
> 
> Will post more details/clarifications/rectifications when I know more.

So, after reverting the following patches (modulo test suite files
that can be skipped easily) from 2.29:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=7dba9362c172f1073487536eb137feb2da30b0ff
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=b27685f2016c510d03ac9a64f7b04ce8efcf95c4
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=cbd0eecf261c2447781f8c89b0d955ee66fae7e9

I got log.test running happily again.  Will try to identify which one
is to blame and follow up with the binutils/ld maintainers.

There's also an obligation on the libqb side to make the configure
test much more bullet-proof, as having logging silently directed at
a "virtual /dev/null" could be quite painful.  We might go as far
as refusing to compile when the section attribute is supported by the
compiler/GCC but the linker is a show stopper -- I suspect performance
is the key driver for using that mechanism, so a silent regression
in this area might be undesirable as well.

-- 
Jan (Poki)




[ClusterLabs Developers] [libqb] heads-up: logging not working with binutils-2.29 standard linker (ld.bfd)

2017-07-31 Thread Jan Pokorný
This might be of interest *now* if you are fiddling with the bleeding
edge, or *later* when the distros adopt that version of binutils or
newer.  The root cause is currently unknown, but the good news is that
the failure will be captured by the test suite.  At least this was
the case with the recent mass rebuild in Fedora Rawhide.

Will post more details/clarifications/rectifications when I know more.

-- 
Jan (Poki)




Re: [ClusterLabs Developers] bundle/docker: zombie process on resource stop

2017-07-28 Thread Jan Pokorný
On 27/07/17 17:40 -0500, Ken Gaillot wrote:
> On Thu, 2017-07-27 at 23:26 +0200, Jan Pokorný wrote:
>> On 24/07/17 17:59 +0200, Valentin Vidic wrote:
>>> On Mon, Jul 24, 2017 at 09:57:01AM -0500, Ken Gaillot wrote:
>>>> Are you sure you have pacemaker 1.1.17 inside the container as well? The
>>>> pid-1 reaping stuff was added then.
>>> 
>>> Yep, the docker container from the bundle example got an older
>>> version installed, so mystery solved :)
>>> 
>>>   pacemaker-remote-1.1.15-11.el7_3.5.x86_64
>> 
>> As with docker/moby kinds of bundles, pacemaker on the host knows
>> whether it sets pacemaker_remoted as the command to be run within the
>> container, so it would be possible for it in such a case to check
>> whether this remote peer is recent enough to cope with zombie reaping,
>> and to prevent it from running any resources if not.
> 
> Leaving zombies behind is preferable to being unable to use containers
> with an older pacemaker_remoted installed. A common use case of
> containers is to run some legacy application that requires an old OS
> environment. The ideal usage there would be to compile a newer pacemaker
> for it, but many users won't have that option.

I was talking about the in-bundle use case (as opposed to the generic
pacemaker-remote one) in particular, where it might be preferable
to have such a sanity check in place as opposed to hard-to-predict
consequences, such as when the resource cannot be stopped due to
interference with zombies (well, there's a whole lot of other issues
with this weak grip on processes, such as the resource agents on the
host getting seriously confused by the processes running in the
local containers!).

For the particular, specific use case at hand, it might be reasonable
to require a pacemaker-remote version that actually became bundle-ready,
IMHO.

>> The catch -- pacemaker on the host likely cannot evaluate this "recent
>> enough" part of the equation properly, as there was no LRMD protocol
>> version bump for 1.1.17.  Correct?  Any other hints it could use?

-- 
Poki


pgpIkjUa3IAuC.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] bundle/docker: zombie process on resource stop

2017-07-27 Thread Jan Pokorný
On 24/07/17 17:59 +0200, Valentin Vidic wrote:
> On Mon, Jul 24, 2017 at 09:57:01AM -0500, Ken Gaillot wrote:
>> Are you sure you have pacemaker 1.1.17 inside the container as well? The
>> pid-1 reaping stuff was added then.
> 
> Yep, the docker container from the bundle example got an older
> version installed, so mystery solved :)
> 
>   pacemaker-remote-1.1.15-11.el7_3.5.x86_64

As with docker/moby kinds of bundles, pacemaker on the host knows
whether it sets pacemaker_remoted as the command to be run within the
container, so it would be possible for it in such a case to check
whether this remote peer is recent enough to cope with zombie reaping
and to prevent it from running any resources if not.

The catch -- pacemaker on the host likely cannot evaluate this "recent
enough" part of the equation properly, as there was no LRMD protocol
version bump for 1.1.17.  Correct?  Any other hints it could use?

-- 
Poki


pgpNm0NnBp1O0.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] [ClusterLabs] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-07-24 Thread Jan Pokorný
On 23/07/17 12:32 +0100, Adam Spiers wrote:
> Jan Pokorný <jpoko...@redhat.com> wrote:
>> So, going to attend summit and want your key signed while reciprocally
>> spreading the web of trust?
>> Awesome, let's reuse the steps from the last time:
>> 
>> Once you have a key pair (and provided that you are using GnuPG),
>> please run the following sequence:
>> 
>>   # figure out the key ID for the identity to be verified;
>>   # IDENTITY is either your associated email address/your name
>>   # if only single key ID matches, specific key otherwise
>>   # (you can use "gpg -K" to select a desired ID at the "sec" line)
>>   KEY=$(gpg --with-colons 'IDENTITY' | grep '^pub' | cut -d: -f5)
> 
> AFAICS this has two problems: it's missing a --list-key option,

Bummer!  I've been checking the original thread(s) for responses from
others, but forgot to check my own:
http://lists.linux-ha.org/pipermail/linux-ha/2015-January/048511.html

Thanks for spotting (and the public key already sent), Adam.

> and it doesn't handle multiple matches for 'IDENTITY'.  So to make it
> choose the newest key if there are several:
> 
>read IDENTITY
>KEY=$(gpg --with-colons --list-key "$IDENTITY" | grep '^pub' |
>  sort -t: -nr -k6 | head -n1 | cut -d: -f5)

Good point.  Hopefully the affected persons, allegedly heavy users of
GPG, are capable of adapting on the fly anyway :-)

>>  # export the public key to a file that is suitable for exchange
>>  gpg --export -a -- $KEY > $KEY
>> 
>>  # verify that you have an expected data to share
>>  gpg --with-fingerprint -- $KEY
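
For convenience, the whole sequence with the fix folded in (same
assumptions as in the original instructions) then reads:

read IDENTITY
KEY=$(gpg --with-colons --list-key "$IDENTITY" | grep '^pub' |
      sort -t: -nr -k6 | head -n1 | cut -d: -f5)

# export the public key to a file that is suitable for exchange
gpg --export -a -- "$KEY" > "$KEY"

# verify that you have the expected data to share
gpg --with-fingerprint -- "$KEY"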

-- 
Jan (Poki)


pgpMxrReDwmaM.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


[ClusterLabs Developers] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-07-21 Thread Jan Pokorný
Hello cluster masters :-)

as there's a little less than 7 weeks left to "The Summit" meetup
(), it's about time to get the ball
rolling so we can voluntarily augment the digital trust amongst
us the attendees, on an OpenPGP basis.

Doing that, we'll actually establish a tradition, since this will
be the second time such an event is being kicked off (unlike the
birds-of-a-feather gathering itself, which was edu-feathered back then):

  
  

If there are no objections, yours truly will conduct this undertaking.
(As an aside, I am toying with the idea of optimizing the process
a bit now that many keys are cross-signed already; I doubt there's
value in adding identical signatures just with different timestamps,
unless, of course, the inscribed level of trust is going to change,
presumably elevate -- any comments?)

* * *

So, going to attend summit and want your key signed while reciprocally
spreading the web of trust?
Awesome, let's reuse the steps from the last time:

Once you have a key pair (and provided that you are using GnuPG),
please run the following sequence:

# figure out the key ID for the identity to be verified;
# IDENTITY is either your associated email address/your name
# if only single key ID matches, specific key otherwise
# (you can use "gpg -K" to select a desired ID at the "sec" line)
KEY=$(gpg --with-colons 'IDENTITY' | grep '^pub' | cut -d: -f5)

# export the public key to a file that is suitable for exchange
gpg --export -a -- $KEY > $KEY

# verify that you have an expected data to share
gpg --with-fingerprint -- $KEY

with IDENTITY adjusted as per the instruction above, and send me the
resulting $KEY file, preferably in a signed (or even encrypted[*]) email
from an address associated with that very public key of yours.

Timeline?
Please, send me your public keys *by 2017-09-05*, off-list and
best with [key-2017-ha] prefix in the subject.  I will then compile
a list of the attendees together with their keys and publish it at

so it can be printed beforehand.

[*] You can find my public key at public keyservers:

Indeed, the trust in this key should be ephemeral/one-off
(e.g. using a temporary keyring, not a universal one before we
proceed with the signing :)
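
A minimal sketch of such a one-off verification with a temporary
keyring (keyserver, key ID and file name below are mere placeholders):

# fetch the key into a throwaway keyring just to check one signature
tmp=$(mktemp -d)
gpg --homedir "$tmp" --keyserver hkps://keyserver.example.org \
    --recv-keys 0xDEADBEEFCAFEF00D
gpg --homedir "$tmp" --verify announcement.asc
rm -rf -- "$tmp"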

* * *

Thanks for your cooperation; looking forward to this side stage
(nonetheless an important one if release or commit[1] signing is to
gain traction) happening, and I hope this will be beneficial to all
involved.

See you there!


[1] for instance, see:



-- 
Jan (Poki)


pgpAflvBotm3a.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] bundle/rkt: port-mapping numbers/names

2017-07-19 Thread Jan Pokorný
On 19/07/17 09:49 -0500, Ken Gaillot wrote:
> On 07/19/2017 01:20 AM, Valentin Vidic wrote:
>> Another issue with the rkt containers is the port-mapping.  Each container
>> defines exposed ports:
>> 
>>  "ports": [
>>  {
>>  "name": "http",
>>  "protocol": "tcp",
>>  "port": 80,
>>  "count": 1,
>>  "socketActivated": false
>>  },
>>  ]
>> 
>> These are then mapped using the "name" from the definition:
>> 
>>   --port=   ports to expose on the host (requires 
>> contained network). Syntax: --port=NAME:[HOSTIP:]HOSTPORT
>> 
>> The problem now is that the xml defines the port to be a number:
>> 
>>   
>> 
>> Workaround is to use "80" as a port name, but perhaps we could allow
>> port to be a string or introduce a new attribute:
>> 
>>   
>> 
>> What do you think?
> 
> Hmm, this was a questionable design choice on our part. There was some
> question as to what to include in the docker tag (and thus could be
> different under different container technologies) and what to put
> outside of it (and thus should be supported by all technologies).
> 
> I'm guessing the situation is that your code needs to do something about
> the port mapping (otherwise you could just omit port-mapping with rkt),
> and the rkt "ports" configuration is pre-existing (otherwise your code
> could generate it with an arbitrary name).
> 
> I would think this would also affect the control-port attribute.
> 
> I see these alternatives, from simplest to most complicated:
> 
> * Just document the issue and require rkt configurations to have name
> equal to port number.

I don't think that alone would suffice; I'd expect at least the
(port, transport) pair to be reasonably unique as long as you can remap
TCP/UDP independently (I am not sure, but it would be no surprise) --
but hey, we have just hit another limitation of the current schema
(the transport layer not being taken into account -- is TCP silently
assumed, then?).

> * Is it possible for the code to take the port number from port-mapping
> and query the rkt configuration to find the appropriate name?
> 
> * Is it possible for the code to generate a duplicate/override "ports"
> configuration with a generated name?
> 
> * Relax the port attribute to  and let the container
> implementation validate it further as needed. A downside is that some
> Docker config errors wouldn't be caught in the schema validation phase.
> (I think I prefer this over a separate port-name attribute.)
> 
> * Restructure the RNG so that the choice is between
>  and
> . It would be ugly and
> involve some duplication, but it would satisfy both implementations.

A similar approach was discussed with another proposed change:
http://oss.clusterlabs.org/pipermail/users/2017-April/005552.html
(item 1., i.e., separating the pacemaker-level pseudogenerics from
the tag for a particular engine) which still might be appealing,
especially as/if the schema gets changed anyway.

Valentin, is rkt able to serve containers from one image/location
in multiple instances in parallel?

> * Modify the schema so  is enclosed within the technology tag,
> and provide an XSL transform for existing configurations.
> 
> The last two options have the advantage of letting us move the 
> "network" attribute to the  tag.

-- 
Jan (Poki)


pgpFCzmKCQDpS.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] ocf_take_lock is NOT actually safe to use

2017-06-22 Thread Jan Pokorný
On 21/06/17 16:40 +0200, Lars Ellenberg wrote:
> Repost to a wider audience, to raise awareness for this.

Appreciated, Lars.
Adding developers ML for possibly even larger outreach.

> ocf_take_lock may or may not be better than nothing.
> 
> It at least "annotates" that the auther would like to protect something
> that is considered a "critical region" of the resource agent.
> 
> At the same time, it does NOT deliver what the name seems to imply.
> 
> I think I brought this up a few times over the years, but was not noisy
> enough about it, because it seemed not important enough: no-one was
> actually using this anyways.

True, I have found this reference (a leaf in the whole thread):
http://lists.linux-ha.org/pipermail/linux-ha-dev/2010-October/017801.html

> But since new usage has been recently added with
> [ClusterLabs/resource-agents] targetcli lockfile (#917)

[linked: https://github.com/ClusterLabs/resource-agents/pull/917]

> here goes:
> 
> On Wed, Jun 07, 2017 at 02:49:41PM -0700, Dejan Muhamedagic wrote:
>> On Wed, Jun 07, 2017 at 05:52:33AM -0700, Lars Ellenberg wrote:
>>> Note: ocf_take_lock is NOT actually safe to use.
>>> 
>>> As implemented, it uses "echo $pid > lockfile" to create the lockfile,
>>> which means if several such "ocf_take_lock" happen at the same time,
>>> they all "succeed", only the last one will be the "visible" one to future 
>>> waiters.
>> 
>> Ugh.
> 
> Exactly.
> 
> Reproducer:
> #
> #!/bin/bash
> export OCF_ROOT=/usr/lib/ocf/ ;
> .  /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs ;
> 
> x() (
>   ocf_take_lock dummy-lock ;
>   ocf_release_lock_on_exit dummy-lock  ;
>   set -C;
>   echo x > protected && sleep 0.15 && rm -f protected || touch BROKEN;
> );
> 
> mkdir -p /run/ocf_take_lock_demo
> cd /run/ocf_take_lock_demo
> rm -f BROKEN; i=0;
> time while ! test -e BROKEN; do
>   x &  x &
>   wait;
>   i=$(( i+1 ));
> done ;
> test -e BROKEN && echo "reproduced race in $i iterations"
> #
> 
> x() above takes, and, because of the () subshell and
> ocf_release_lock_on_exit, releases the "dummy-lock",
> and within the protected region of code,
> creates and removes a file "protected".
> 
> If ocf_take_lock was good, there could never be two instances
> inside the lock, so echo x > protected should never fail.
> 
> With the current implementation of ocf_take_lock,
> it takes "just a few" iterations here to reproduce the race.
> (usually within a minute).
> 
> The races I see in ocf_take_lock:
> "creation race":
>   test -e $lock
>   # someone else may create it here
>   echo $$ > $lock
>   # but we override it with ours anyways
> 
> "still empty race":
>   test -e $lock   # maybe it already exists (open O_CREAT|O_TRUNC)
>   # but does not yet contain target pid,
>   pid=`cat $lock` # this one is empty,
>   kill -0 $pid# and this one fails
>   and thus a "just being created" one is considered stale
> 
> There are other problems around "stale pid file detection",
> but let's not go into that minefield right now.
> 
>>> Maybe we should change it to 
>>> ```
>>> while ! ( set -C; echo $pid > lockfile ); do
>>> if test -e lockfile ; then
>>> : error handling for existing lockfile, stale lockfile detection
>>> else
>>> : error handling for not being able to create lockfile
>>> fi
>>> done
>>> : only reached if lockfile was successfully created
>>> ```
>>> 
>>> (or use flock or other tools designed for that purpose)
>> 
>> flock would probably be the easiest. mkdir would do too, but for
>> upgrade issues.
> 
> and, being part of util-linux, flock should be available "everywhere".
> 
> but because writing "wrappers" around flock similar to the intended
> semantics of ocf_take_lock and ocf_release_lock_on_exit is not easy
> either, usually you'd be better off using flock directly in the RA.
> 
> so, still trying to do this with shell:
> 
> "set -C" (respectively set -o noclober):
>   If set, disallow existing regular files to be overwritten
>   by redirection of output.

For completeness, also guaranteed with POSIX specification:
http://pubs.opengroup.org/onlinepubs/009695399/utilities/set.html

> normal '>' means: O_WRONLY|O_CREAT|O_TRUNC,

From
https://github.com/ClusterLabs/resource-agents/pull/622#issuecomment-113166800
I actually got the impression that this is shell-specific.

> set -C '>' means: O_WRONLY|O_CREAT|O_EXCL

The only thing I can add at this point (it needs more time to read up
on the proposals) is that this is another con of using "standard"
shell as an implementation language, along with, e.g., it being easy
to mishandle whitespace in the parameters being passed:
http://oss.clusterlabs.org/pipermail/users/2015-May/000403.html
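
And since flock(1) was mentioned above as the preferable direct route,
a minimal sketch of a critical region guarded that way (lock path and
file descriptor number are arbitrary for the example):

(
    flock -x 9 || exit 1  # blocks until the exclusive lock is granted
    : the critical region goes here, manipulating the shared state
) 9>"${HA_RSCTMP}/myagent.lock"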

> using "set -C ; echo $$ > $lock" instead of 
> "test -e $lock || echo $$ > $lock"
> 

Re: [ClusterLabs Developers] checking all procs on system enough during stop action?

2017-04-24 Thread Jan Pokorný
On 24/04/17 17:32 +0200, Jehan-Guillaume de Rorthais wrote:
> On Mon, 24 Apr 2017 17:08:15 +0200
> Lars Ellenberg  wrote:
> 
>> On Mon, Apr 24, 2017 at 04:34:07PM +0200, Jehan-Guillaume de Rorthais wrote:
>>> Hi all,
>>> 
>>> In the PostgreSQL Automatic Failover (PAF) project, one of most frequent
>>> negative feedback we got is how difficult it is to experience with it
>>> because of fencing occurring way too frequently. I am currently hunting
>>> this kind of useless fencing to make life easier.
>>> 
>>> It occurs to me, a frequent reason of fencing is because during the stop
>>> action, we check the status of the PostgreSQL instance using our monitor
>>> function before trying to stop the resource. If the function does not return
>>> OCF_NOT_RUNNING, OCF_SUCCESS or OCF_RUNNING_MASTER, we just raise an error,
>>> leading to a fencing. See:
>>> https://github.com/dalibo/PAF/blob/d50d0d783cfdf5566c3b7c8bd7ef70b11e4d1043/script/pgsqlms#L1291-L1301
>>> 
>>> I am considering adding a check to define if the instance is stopped even
>>> if the monitor action returns an error. The idea would be to parse **all**
>>> the local processes looking for at least one pair of
>>> "/proc//{comm,cwd}" related to the PostgreSQL instance we want to
>>> stop. If none are found, we consider the instance is not running.
>>> Gracefully or not, we just know it is down and we can return OCF_SUCCESS.
>>> 
>>> Just for completeness, the piece of code would be:
>>> 
>>>my @pids;
>>>foreach my $f (glob "/proc/[0-9]*") {
>>>push @pids => basename($f)
>>>if -r $f
>>>and basename( readlink( "$f/exe" ) ) eq "postgres"
>>>and readlink( "$f/cwd" ) eq $pgdata;
>>>}
>>> 
>>> I feels safe enough to me.
> 
> [...]
> 
> But anyway, here or there, I would have to add this piece of code looking at
> each processes. According to you, is it safe enough? Do you see some hazard
> with it?

Just for the sake of completeness, there's indeed a race condition
in the multiple repeated path traversals (without being fixed to
a particular entry's inode), which can be interleaved with a new
postgres process being launched anew (or what not).  But that may
happen even before the code in question is executed -- naturally,
not having a firm grip on the process leaves one open to such
issues, so this is just an aside.
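
For comparison, a roughly equivalent one-shot scan in shell (binary
name and $PGDATA handling are illustrative only, and the very same
race caveat applies):

# exit 0 when no postgres process tied to $PGDATA is found
for p in /proc/[0-9]*; do
    case "$(readlink "$p/exe" 2>/dev/null)" in
        */postgres) ;;  # candidate found, check its cwd below
        *) continue ;;
    esac
    [ "$(readlink "$p/cwd" 2>/dev/null)" = "$PGDATA" ] && exit 1
done
exit 0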

-- 
Jan (Poki)


pgpwmC7RyNunW.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] FenceAgentAPI

2017-03-07 Thread Jan Pokorný
On 06/03/17 17:12 -0500, Digimer wrote:
>   The old FenceAgentAPI document on fedorahosted is gone now that fedora
> hosted is closed. So I created a copy on the clusterlabs wiki:
> 
> http://wiki.clusterlabs.org/wiki/FenceAgentAPI

Note that just few days ago I've announced that the page has moved to
https://docs.pagure.org/ClusterLabs.fence-agents/FenceAgentAPI.md, see
http://oss.clusterlabs.org/pipermail/developers/2017-February/000438.html
(that hit just the developers list; I don't think it's of interest to
users of the stack as such).  Therefore that's another duplicate, just
as http://wiki.clusterlabs.org/wiki/Fedorahosted.org_FenceAgentAPI
(linked from the original fedorahosted.org page so as to allow for
future flexibility should the content still be visible, which turned
out not to be the case) is.

I will add you (or whoever wants to maintain that file) to the
linux-cluster group at pagure.io so you can edit the underlying Markdown
file (just let me know your Fedora Account System username off-list).
The file itself is tracked in a git repository; access URLs were
provided in the announcement email.

>   It desperately needs an update. Specifically, it needs '-o metadata'
> properly explained. I am happy to update this document and change the
> cman/cluster.conf example over to a pacemaker example, etc, but I do not
> feel like I am authoritative on the XML validation side of things.
> 
>   Can someone give me, even just point-form notes, how to explain this?
> If so, I'll create 'FenceAgentAPI - Working' document and I will have
> anyone interested comment before making it an official update.
> 
> Comments?

-- 
Jan (Poki)


pgpR6PrKqVgj_.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


[ClusterLabs Developers] Pagure.io as legacy codebases/distribution files/documentation hosting (Was: Moving cluster project)

2017-02-28 Thread Jan Pokorný
On 28/02/17 03:18 +0100, Jan Pokorný wrote:
> On 17/01/17 22:27 +0100, Jan Pokorný wrote:
>> On 17/01/17 21:14 +, Andrew Price wrote:
>>> On 17/01/17 19:58, Jan Pokorný wrote:
>>>> So I think we should arrange for a move to pagure.io for this cluster
>>>> project as well if possible, if only to retain the ability to change
>>>> something should there be a need.
>>> 
>>> Good plan.
>>> 
>>>> I can pursue this if there are no complaints.  Just let me know
>>>> (off-list) who aspires to cluster-maint group (to be created)
>>>> membership.
>>> 
>>> Could you give the gfs2-utils-maint group push access to the cluster project
>>> once it's been set up? (It is possible to add many groups to a project.) I
>>> think that would be the most logical way to do it.
>> 
>> Sure and thanks for a cumulative access assignment tip.
>> 
>> I'll proceed on Friday or early next week, then.
> 
> Well, scheduler of mine didn't get to it until now, so sorry
> to anyone starting to worry.
> 
> So what's been done:
> 
> - git repo moved over to https://pagure.io/linux-cluster/cluster
>   + granted commit rights for gfs2-utils-maint group
> (and will add some more folks to linux-cluster group,
> feel free to bug me off-list about that)
>   + mass-committed an explanation change to every branch at
> the discontinued fedorahosted.org (fh.o) provider I could,
> as some are already frozen
> (https://git.fedorahosted.org/cgit/cluster.git/)
>   . I've decided to use a namespace (because there are possibly
> more projects to be migrated under that label),

Actually, there are quite a few legacy projects copied over, some
merely for plain archival bits-preserving:
https://pagure.io/group/linux-cluster
[did I miss anything?  AFAIK, the gfs2-utils and dlm components migrated
on their own, and corosync has been on GitHub for years]

Actually also some components otherwise found under ClusterLabs label
(note that *-agents are common to both worlds) are affected, and for
that I created a separate ClusterLabs group on pagure.io:
https://pagure.io/group/ClusterLabs

The respective projects there are just envelopes that I used for
uploading distribution files and/or documentation that were so
far served by fedorahosted.org [*], not used for active code
hosting (at this time, anyway).

[*] locations like:
https://fedorahosted.org/releases/q/u/quarterback/
https://fedorahosted.org/releases/f/e/fence-agents/

> and have stuck with linux-cluster referring to the mailing list
> of the same name that once actively served to discuss the
> cluster stack in question (and is quite abandoned nowadays)
> 
> - quickly added backup location links at
>   https://fedorahosted.org/cluster/ and
>   https://fedorahosted.org/cluster/wiki/FenceAgentAPI,

I've converted the latter to Markdown and exposed at
https://docs.pagure.org/ClusterLabs.fence-agents/FenceAgentAPI.md
The maintenance or just source access should be as simple as
cloning from ssh://g...@pagure.io/docs/ClusterLabs/fence-agents.git
or https://pagure.io/docs/ClusterLabs/fence-agents.git, respectively.

>   i.e., the pages that seem most important to me, to allow for
>   smooth "forward compatibility"; the links currently refer to vain
>   stubs at ClusterLabs wiki, but that can be solved later on -- I am
>   still unsure if trac wikis at fh.o will be served in the next
>   phase or shut down right away and apparently this measure will
>   help only in the former case
> 
> What to do:
> - move releases over to pagure.io as well:
>   https://fedorahosted.org/releases/c/l/cluster/

Done for cluster:
http://releases.pagure.org/linux-cluster/cluster/

Tarballs for split components from here will eventually be uploaded
to respective release directories for the particular projects, e.g.,
http://releases.pagure.org/ClusterLabs/fence-agents/, it's a WIP.

> - possibly migrate some original wiki content to proper
>   "doc pages" exposed directly through pagure.io

So far I am just collecting the cluster wiki texts for possible
later resurrection.

> - resolve the question of the linked wiki stubs and
>   cross-linking as such
> 
> Any comments?  Ideas?

-- 
Jan (Poki)


pgp6lzFuvGOAh.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] Moving cluster project (Was: Moving gfs2-utils away from fedorahosted.org)

2017-02-27 Thread Jan Pokorný
On 17/01/17 22:27 +0100, Jan Pokorný wrote:
> On 17/01/17 21:14 +, Andrew Price wrote:
>> On 17/01/17 19:58, Jan Pokorný wrote:
>>> So I think we should arrange for a move to pagure.io for this cluster
>>> project as well if possible, if only to retain the ability to change
>>> something should there be a need.
>> 
>> Good plan.
>> 
>>> I can pursue this if there are no complaints.  Just let me know
>>> (off-list) who aspires to cluster-maint group (to be created)
>>> membership.
>> 
>> Could you give the gfs2-utils-maint group push access to the cluster project
>> once it's been set up? (It is possible to add many groups to a project.) I
>> think that would be the most logical way to do it.
> 
> Sure, and thanks for the cumulative access assignment tip.
> 
> I'll proceed on Friday or early next week, then.

Well, scheduler of mine didn't get to it until now, so sorry
to anyone starting to worry.

So what's been done:

- git repo moved over to https://pagure.io/linux-cluster/cluster
  + granted commit rights for gfs2-utils-maint group
(and will add some more folks to linux-cluster group,
feel free to bug me off-list about that)
  + mass-committed an explanation change to every branch at
the discontinued fedorahosted.org (fh.o) provider I could,
as some are already frozen
(https://git.fedorahosted.org/cgit/cluster.git/)
  . I've decided to use a namespace (because there are possibly
more projects to be migrated under that label), and have
stuck with linux-cluster referring to the mailing list of
the same name that once actively served to discuss the
cluster stack in question (and is quite abandoned nowadays)

- quickly added backup location links at
  https://fedorahosted.org/cluster/ and
  https://fedorahosted.org/cluster/wiki/FenceAgentAPI, i.e.,
  the pages that seem most important to me, to allow for
  smooth "forward compatibility"; the links currently refer
  to vain stubs at ClusterLabs wiki, but that can be solved
  later on -- I am still unsure if trac wikis at fh.o will
  be served in the next phase or shut down right away and
  apparently this measure will help only in the former case

What to do:
- move releases over to pagure.io as well:
  https://fedorahosted.org/releases/c/l/cluster/
- possibly migrate some original wiki content to proper
  "doc pages" exposed directly through pagure.io
- resolve the question of the linked wiki stubs and
  cross-linking as such

Any comments?  Ideas?

-- 
Jan (Poki)


pgpHvoCAcoBIM.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


[ClusterLabs Developers] Moving cluster project (Was: Moving gfs2-utils away from fedorahosted.org)

2017-01-17 Thread Jan Pokorný
On 17/01/17 21:14 +, Andrew Price wrote:
> On 17/01/17 19:58, Jan Pokorný wrote:
>> So I think we should arrange for a move to pagure.io for this cluster
>> project as well if possible, if only to retain the ability to change
>> something should there be a need.
> 
> Good plan.
> 
>> I can pursue this if there are no complaints.  Just let me know
>> (off-list) who aspires to cluster-maint group (to be created)
>> membership.
> 
> Could you give the gfs2-utils-maint group push access to the cluster project
> once it's been set up? (It is possible to add many groups to a project.) I
> think that would be the most logical way to do it.

Sure, and thanks for the cumulative access assignment tip.

I'll proceed on Friday or early next week, then.

-- 
Jan (Poki)


pgpq79t5ViFqe.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] Moving gfs2-utils away from fedorahosted.org

2017-01-17 Thread Jan Pokorný
[adding developers list at clusterlabs to CC]

On 16/01/17 18:45 +, Andrew Price wrote:
> On 19/09/16 17:48, Andrew Price wrote:
>> Re: https://communityblog.fedoraproject.org/fedorahosted-sunset-2017-02-28/
>> 
>> We'll need to find a new host for the cluster projects that haven't
>> migrated away from fedorahosted.org yet.
>> 
>> The recommended successor to fedorahosted.org is pagure.io which is a
>> Fedora project, open source, uses the same user account system, allows
>> git hooks to be set up, and has the added advantage that we have a
>> direct line to the admins and developers.
>> 
>> [...]
> 
> Progress on this:
> 
> - A new repository has been created at  and
> everything in the gfs2-utils Fedora Hosted repository has been pushed to it.
> This will be kept mirrored until the switch over.
> 
> - A gfs2-utils maintainers group 
> has been set up and given push access to the repository.
> 
> - Filed a ticket  to get
> the release tarballs etc. migrated over (and hopefully a URL redirect set
> up).
> 
> - Disabled the issue tracker and pull request features for the project as we
> currently have no plans to move away from Bugzilla and email.

Thanks for setting an example on this matter, Andy.

Tangentially related is the question of the "cluster" project at the
fedorahosted location, incl. still occasionally evolving or at least
valuable material, perhaps subject to future changes -- the git tree
itself and the wiki.

For the former, there are still active branches:
- RHEL6 (head currently featuring Andy's recent commit):
  https://git.fedorahosted.org/cgit/cluster.git/commit/?h=RHEL6
- STABLE32 (Chrissie's commit from around the same time as above)
  https://git.fedorahosted.org/cgit/cluster.git/commit/?h=STABLE32

For the latter, there are some pretty authoritative documents, such
as definition of the API that fence agents should provide:
https://fedorahosted.org/cluster/wiki/FenceAgentAPI

So I think we should arrange for a move to pagure.io for this cluster
project as well if possible, if only to retain the ability to change
something should there be a need.

I can pursue this if there are no complaints.  Just let me know
(off-list) who aspires to cluster-maint group (to be created)
membership.

-- 
Jan (Poki)


pgpeY6ii7kncL.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/developers


[ClusterLabs Developers] @ClusterLabs/devel COPR with new libqb (Was: [ClusterLabs] libqb 1.0.1 release)

2016-11-24 Thread Jan Pokorný
On 24/11/16 10:42 +, Christine Caulfield wrote:
> I am very pleased to announce the 1.0.1 release of libqb

For instant tryout on Fedora/EL-based distros, there is already
a habitual COPR build.  But this time around, I'd like to introduce
some advancements in the process...

* * *

First, we now have a dedicated ClusterLabs group established in COPR,
and so far a single repository, devel, underneath -- see:

https://copr.fedorainfracloud.org/coprs/g/ClusterLabs/devel/

The page hopefully states clearly what to expect; it's by no means
intended to eclipse fine-tuned downstream packages[*].  The packages
are provided AS IS and the distros themselves have no liabilities,
so please do not file bugs at downstream trackers -- any feedback
at the upstream level is still appreciated (as detailed), though.

[*] that being said, Fedora is receiving an update soonish

* * *

Second, new packages are generated once a new push of changesets
occurs at the respective upstream repositories, so it's always at
one's discretion whether to pick a particular tagged version of
the component, or whichever else (usually the newest one).

So to update strictly to the 1.0.1 version of libqb from here, and
supposing you have dnf available + your distro is directly covered
by the builds, you would have to do as root:

  # dnf copr enable @ClusterLabs/devel
  # dnf update libqb-1.0.1-1$(rpm -E %dist)

as mere "dnf update libqb" would currently update even higher,
up to 1.0.1-1.2.d03b7 (2 commits pass the 1.0.1 version)
as of writing this email.

In other words, not specifying the particular version will provide
you with the latest and greatest version, which is only useful if you
want to push living on the bleeding edge to the extreme (and this
COPR setup is hence a means of "continuous delivery", to shout a first
buzzword here).  It's good to be aware of this.
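
Should you want to stay put at a tagged build while keeping the
repository enabled, the dnf versionlock plugin may come in handy
(a sketch; the plugin's availability on your distro is an assumption):

  # dnf install 'dnf-command(versionlock)'
  # dnf versionlock add 'libqb-1.0.1-1*'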

* * *

[now especially for developers ML readers]

Third, the coverage of the ClusterLabs-associated packages is
going to grow.  So far, there's pacemaker in the pipeline[**].
There's also an immediate benefit for developers of these packages,
as the cross-dependencies are primarily satisfied within the same
COPR repository, which means that here, the latest development version
of pacemaker will get built against the latest version of libqb at
that moment, and thanks to pacemaker's unit tests (run as a hook in
the %check scriptlet when building the RPM package), there's also
really a notion of integration testing (finally "continuous
integration" in a proper sense, IMHO; the other term to mention here).

That being said, if you work on a fellow project and want it to join
this club (and you are not a priori against Fedora affiliation as that
requires you obtaining an account in Fedora Account System), please
contact me off-list and we'll work it out.

[**] https://github.com/ClusterLabs/pacemaker/pull/1182

* * *

Hope you'll find this useful.

-- 
Jan (Poki)


pgpxt0sIRFYbk.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers


[ClusterLabs Developers] RA as a systemd wrapper -- the right way?

2016-09-21 Thread Jan Pokorný
Hello,

https://github.com/ClusterLabs/resource-agents/pull/846 seems to be
a first crack on integrating systemd to otherwise init-system-unaware
resource-agents.

As pacemaker already handles native systemd integration, I wonder if
it wouldn't be better to just allow, on top of that, perhaps as a
special "systemd+hooks" class of resources, a "hooks" (meta) attribute
pointing to an executable implementing a formalized API akin to OCF
(say on-start, on-stop, meta-data actions) that would take care of
initially reflecting on the rest of the parameters + possibly
a cleanup later on.

Technically, something akin to injecting Environment, ExecStartPre
and ExecStopPost into the service definition might also achieve the
same goal, if there's a transparent way to do it from pacemaker using
just the systemd API (I don't know).
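
For illustration, that injecting maps onto a plain systemd drop-in,
here mimicked by hand (a sketch only -- unit and hook paths are made
up, and pacemaker would have to achieve the equivalent
programmatically):

# hypothetical drop-in wiring hook scripts around a wrapped service
mkdir -p /etc/systemd/system/myservice.service.d
cat > /etc/systemd/system/myservice.service.d/50-hooks.conf <<'EOF'
[Service]
Environment=OCF_RESKEY_foo=bar
ExecStartPre=/usr/local/libexec/hooks/on-start
ExecStopPost=/usr/local/libexec/hooks/on-stop
EOF
systemctl daemon-reload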

Indeed, the scenario I have in mind would make do with a separate
"prepare grounds" agent, suitably grouped with such a systemd-class
resource, but that seems more fragile configuration-wise (this
is not the granularity a cluster administrator would be supposed
to be thinking in, IMHO, just as with the ocf class).

Just thinking aloud before the can is open.

-- 
Jan (Poki)


pgpe9b5hU66Ge.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] Resurrecting OCF

2016-09-21 Thread Jan Pokorný
On 21/09/16 14:50 +1000, Andrew Beekhof wrote:
> I like where this is going.
> Although I don’t think we want to get into the business of trying to
> script config changes from one agent to another, so I’d drop #4

Not agent parameter changes, just its specification -- to reflect
formally what the proposed symlink-based delegation scheme does when
the old one is still in use.  If the old and new are incompatible,
such automatic delegation is not possible anyway (that's one of
the reasons "description" would come handy).

I see there's much bigger potential (parameter renames, ...) but for
that, each agent should be responsible on its own (somehow, subject
of further evolution).

Also, supposing there are more consumers of RAs, the suggestion to
run the script should be more generic ("when used from under
pacemaker, ...").

> I would make .deprecated a nested directory so that if we want to
> retire (for example) a ClusterLabs agent in the future we can create
> .deprecate/clusterlabs/ and put the agent there. Rather than make
> this heartbeat specific.

Good point; it would also prevent clashes when single directory should
serve all the providers.

> I wonder if some of this should live in pacemaker itself though…

This runs directly into the other side of the RA-pacemaker bias,
pacemaker caring about RA evolutionary internals :-)

In the long run, that would make any separate OCF standard efforts
worthless and we could just call it the pacemaker resource standard
right away and forget about any sort of self-containment
(which the proposed procedure aims to align with).

I am not sure that would be the best thing.

> If resources_action_create() cannot find ocf:${provider}:${agent} in
> its usual location, look up
> ${OCF_ROOT_DIR}/.compat/${provider}/__entries__
> 
> Format for __entries__:
># old, replacement
># ${agent} , ${new_provider}:${new_agent} , ${description}
>IPaddr , clusterlabs:IP , Replaced with different semantics
>IPaddr2 , clusterlabs:IP , Moved
>drbd , linbit:drbd , Moved
>eDirectory , , Deleted

Additional "what happened" field might work well in the update
suggestions.
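
To make the format concrete, the lookup against such an __entries__
file could boil down to the following sketch (variable names follow
the quoted proposal; deleted agents, having an empty replacement
field, would deserve an extra case):

# resolve a legacy agent name through the proposed mapping
entry=$(grep -v '^#' "${OCF_ROOT_DIR}/.compat/${provider}/__entries__" |
        awk -F' *, *' -v a="$agent" '$1 == a { print $2; exit }')
case "$entry" in
    '')  echo "no compat mapping for ocf:${provider}:${agent}" ;;
    *:*) echo "superseded by ocf:${entry}" ;;
esac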

> Assuming an entry is found:
> - If  . compat/${old_provider}/${old_agent} exists, notify the user
>“somehow”, then call it.
> - Otherwise, return OCF_ERR_NOT_INSTALLED and use ${description} and
>   ${replacement} as the exit reason (which shows up in pcs status).
> 
> Perhaps the “somehow” is creating PCMK_OCF_DEPRECATED (with the same
> semantics as PCMK_OCF_DEGRADED) and prepending ${description} to the
> output (assuming its not a metadata op) and/or the exit reason[1].
> Maybe only on successful start operations to minimise the noise?
> 
> [1] Shouldn’t be too hard with some extra fields for 'struct
> svc_action_private_s’ or svc_action_t
> 
> 
>> On 19 Aug 2016, at 6:59 PM, Jan Pokorný <jpoko...@redhat.com> wrote:
>> 
>> On 18/08/16 17:27 +0200, Klaus Wenninger wrote:
>>> On 08/18/2016 05:16 PM, Ken Gaillot wrote:
>>>> On 08/18/2016 08:31 AM, Kristoffer Grönlund wrote:
>>>>> Jan Pokorný <jpoko...@redhat.com> writes:
>>>>> 
>>>>>> Thinking about that, ClusterLabs may be considered a brand established
>>>>>> well enough for "clusterlabs" provider to work better than anything
>>>>>> general such as previously proposed "core".  Also, it's not expected
>>>>>> there will be more RA-centered projects under this umbrella than
>>>>>> resource-agents (pacemaker deserves to be a provider on its own),
>>>>>> so it would be a pretty unambiguous pointer.
>>>>> I like this suggestion as well.
>>>> Sounds good to me.
>>>> 
>>>>>> And for new, not well-tested agents within resource-agents, there could
>>>>>> also be a provider schema akin to "clusterlabs-staging" introduced.
>>>>>> 
>>>>>> 1 CZK
>>>>> ...and this too.
>>>> I'd rather not see this. If the RA gets promoted to "well-tested",
>>>> everyone's configuration has to change. And there's never a clear line
>>>> between "not well-tested" and "well-tested", so things wind up staying
>>>> in "beta" status long after they're widely used in production, which
>>>> unnecessarily makes people question their reliability.
>>>> 
>>>> If an RA is considered experimental, say so in the documentation
>>>> (including the man page and help text), and give it an "0.x" version 
>>>> number.
>>>> 
>>>>> Here i

Re: [ClusterLabs Developers] Resurrecting OCF

2016-09-05 Thread Jan Pokorný
On 18/08/16 15:31 +0200, Kristoffer Grönlund wrote:
> A pet peeve of mine would also be to move heartbeat/IPaddr2 to
> clusterlabs/IP, to finally get rid of that weird 2 in the name...

Just recalled I used to be uncomfortable with "apache" (also present
in rgmanager's breed of RAs) as it's no longer unambiguous due to
a handful of other "Apache X" projects (and Apache is rather the name
of the parent foundation, anyway).

And as I've just discovered, httpd is unfortunately not unambiguous
either -- there's at least the (currently disjoint) OpenBSD variant.

(sigh)

-- 
Jan (Poki)


pgpouTlZnj7Jq.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] Potential logo for Cluster Labs

2016-08-25 Thread Jan Pokorný
On 25/08/16 09:17 -0500, Ken Gaillot wrote:
> On 08/25/2016 09:02 AM, Kristoffer Grönlund wrote:
>> Klaus Wenninger  writes:
>> 
>>> On 08/25/2016 03:13 PM, Andrew Price wrote:
 On 25/08/16 13:58, Klaus Wenninger wrote:
> On 08/25/2016 12:49 PM, Andrew Price wrote:
>> On 24/08/16 18:50, Ken Gaillot wrote:
>>> Suggestions/revisions/alternatives are welcome.
>> 
>> Here's a possible alternative theme. It's similarly greyscale and I'm
>> not hugely happy with the font (I don't seem to have many good ones
>> installed) but I'm happy enough with it to throw it on the pile :)
>> 
>> Alright, if we're throwing logo design ideas on a pile, here's mine!
>> 
>> The idea being basically a beaker with servers in it, hence.. Clusterlabs.
> 
> Bwahaha ... I love it.

Yeah, imagine an animated, blinking version :)

-- 
Jan (Poki)


pgpkwELu4xYZb.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] Potential logo for Cluster Labs

2016-08-24 Thread Jan Pokorný
On 24/08/16 12:50 -0500, Ken Gaillot wrote:
> I was doodling the other day and came up with a potential logo for
> Cluster Labs. I've attached an example of what I came up with. It's
> meant to subtly represent an outer "C" of resources around an inner "L"
> of nodes.
> 
> We have a Pacemaker logo used on the website already, but I thought it
> might be nice to have a Cluster Labs logo for the website and
> documentation, that could tie all the various projects together.
> 
> Comments anyone? The example here is greyscale for discussion purposes,
> but the final should have some color scheme.
> Suggestions/revisions/alternatives are welcome.

Not a bad idea to start with.

I just hope the graphic source boils down to vectors (preferably SVG).
We are not binary patchers, after all ;-)

-- 
Jan (Poki)


pgp8GKjnpS4Ab.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] Resurrecting OCF

2016-08-19 Thread Jan Pokorný
On 19/08/16 13:12 +0200, Jan Pokorný wrote:
> On 19/08/16 11:14 +0200, Jan Pokorný wrote:
>> On 19/08/16 10:59 +0200, Jan Pokorný wrote:
>>> So, having some more thoughts on this, here's the possible action
>>> plan (just for heartbeat -> clusterlabs transition + deprecating
>>> some agents, but clusterlabs-staging -> clusterlabs would be similar):
>>> 
>>> # (adapt and) move original heartbeat agents
>>> 
>>> 1. have a resource.d subdirectory "clusterlabs" and move (possibly under
>>>new names) agents that were a priori updated to reflect new revision
>>>of OCF there
>>> 
>>> 2. have a resource.d subdirectory ".deprecated" (for instance) and
>>>move the RAs that are going to be sunset over there (i.e.,
>>>original heartbeat agents = agents moved to clusterlabs + agents
>>>moved to .deprecated + agents that remained under heartbeat, pending
>>>to be moved under clusterlabs)
>>> 
>>> # preparation for backward compatibility
>>> 
>>> 3. have a file with old heartbeat name -> new clusterlabs name mapping
>>>for the agents from 0., i.e., hence physically changed the directory;
>>>the format can be as simple as CSV with "old name; [new name]" lines
>>>where omitted new name means that the actual name hasn't changed
>>>(unlike proposed IPaddr2 -> IP)
>>> 
>>> 4. have an XSL template that will convert resource references per the
>>>translation file from 3. (this XSLT should be automatically
>>>generated based on that file) and a script that will call
>>>something like:
>>>cibadmin -Q | xsltproc  - | cibadmin --replace --xml-pipe
>>> 
>>> 5. have a shell script "__cl_compat__" (for instance, name clearly
>>>distinguishable will become handy later on), that will:
>>>- figure which symlink it was called under ("$0") and figure out
>>>  how it should behave based on file from 3.:
>>>  . $0 found as old name with new name -> clusterlabs/
>>>will be called
>>>  . $0 found as old name without new name -> clusterlabs/
>>>will be called
>>>  . $0 not found as old name -> .deprecated/ will be
>>>called if exists (otherwise fail early)
>>>- if "$HA_RSCTMP/$(basename $0)_compat" exists, just run:
>>>  $0 "$@"; exit $?
>>>  the purpose here is to avoid excessive spamming in the logs
>>>- touch "$HA_RSCTMP/$(basename $0)_compat"
>>>- emit a warning "Your configuration refers to the agent with
>>>  an obsolete specification", followed with corresponding:
>>>   . "please consider changing ocf:heartbeat: to
>>>  ocf:clusterlabs:, you may use 

Re: [ClusterLabs Developers] Resurrecting OCF

2016-08-19 Thread Jan Pokorný
On 19/08/16 11:14 +0200, Jan Pokorný wrote:
> On 19/08/16 10:59 +0200, Jan Pokorný wrote:
>> So, having some more thoughts on this, here's the possible action
>> plan (just for heartbeat -> clusterlabs transition + deprecating
>> some agents, but clusterlabs-staging -> clusterlabs would be similar):
>> 
>> # (adapt and) move original heartbeat agents
>> 
>> 1. have a resource.d subdirectory "clusterlabs" and move (possibly under
>>new names) agents that were a priori updated to reflect new revision
>>of OCF there
>> 
>> 2. have a resource.d subdirectory ".deprecated" (for instance) and
>>move the RAs that are going to be sunset over there (i.e.,
>>original heartbeat agents = agents moved to clusterlabs + agents
>>moved to .deprecated + agents that remained under heartbeat, pending
>>to be moved under clusterlabs)
>> 
>> # preparation for backward compatibility
>> 
>> 3. have a file with old heartbeat name -> new clusterlabs name mapping
>>for the agents from 0., i.e., hence physically changed the directory;
>>the format can be as simple as CSV with "old name; [new name]" lines
>>where omitted new name means that the actual name hasn't changed
>>(unlike proposed IPaddr2 -> IP)
>> 
>> 4. have an XSL template that will convert resource references per the
>>translation file from 3. (this XSLT should be automatically
>>generated based on that file) and a script that will call
>>something like:
>>cibadmin -Q | xsltproc  - | cibadmin --replace --xml-pipe
>> 
>> 5. have a shell script "__cl_compat__" (for instance, name clearly
>>distinguishable will become handy later on), that will:
>>- figure which symlink it was called under ("$0") and figure out
>>  how it should behave based on file from 3.:
>>  . $0 found as old name with new name -> clusterlabs/
>>will be called
>>  . $0 found as old name without new name -> clusterlabs/
>>will be called
>>  . $0 not found as old name -> .deprecated/ will be
>>called if exists (otherwise fail early)
>>- if "$HA_RSCTMP/$(basename $0)_compat" exists, just run:
>>  $0 "$@"; exit $?
>>  the purpose here is to avoid excessive spamming in the logs
>>- touch "$HA_RSCTMP/$(basename $0)_compat"
>>- emit a warning "Your configuration refers to the agent with
>>  an obsolete specification", followed with corresponding:
>>   . "please consider changing ocf:heartbeat: to
>>  ocf:clusterlabs:, you may use 

Re: [ClusterLabs Developers] Resurrecting OCF

2016-08-18 Thread Jan Pokorný
On 15/08/16 12:37 +0200, Jan Pokorný wrote:
> On 18/07/16 11:13 -0500, Ken Gaillot wrote:
>> A suggestion came up recently to formalize a new version of the OCF
>> resource agent API standard[1].
>> 
>> The main goal would be to formalize the API as it is actually used
>> today, and to replace the "unique" meta-data attribute with two new
>> attributes indicating uniqueness and reloadability.
> 
> My suggestion would be to consider changing the provider name for RAs
> from the resource-agents upstream project to anything more reasonable
> than "heartbeat"

Thinking about that, ClusterLabs may be considered a brand established
well enough for "clusterlabs" provider to work better than anything
general such as previously proposed "core".  Also, it's not expected
there will be more RA-centered projects under this umbrella than
resource-agents (pacemaker deserves to be a provider on its own),
so it would be a pretty unambiguous pointer.

And for new, not well-tested agents within resource-agents, there could
also be a provider schema akin to "clusterlabs-staging" introduced.

1 CZK

> in one step with bumping to-be-added conformance parameter in
> meta-data denoting that the RA in question reflects the requirements
> of the new revision of OCF/resource agents API (and, apparently, in
> one step with delivering any conformance adjustments needed, such as
> mentioned "unique" indicator).
> 
> Original thread regarding this related suggestion from 3 years ago:
> http://lists.linux-ha.org/pipermail/linux-ha/2013-July/047320.html
> 
> spanned also into the following month:
> http://lists.linux-ha.org/pipermail/linux-ha/2013-August/047368.html
> 
>> We could also add the fence agent API as a new spec, or expand the
>> RA spec to cover both.
> 
> Definitely, the spec(s) should be as language-agnostic as possible
> (so no pretending that, e.g., fencing library of fence-agents is
> a panacea to hide all the interaction/interface details; the goal
> of the standardization work should be to allow truly interchangeable
> components).
> 
>> [...]
>> 
>> [1]
>> http://www.opencf.org/cgi-bin/viewcvs.cgi/specs/ra/resource-agent-api.txt?rev=HEAD

-- 
Jan (Poki)


pgp1m8vxTmBLf.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] Resurrecting OCF

2016-08-15 Thread Jan Pokorný
On 18/07/16 11:13 -0500, Ken Gaillot wrote:
> A suggestion came up recently to formalize a new version of the OCF
> resource agent API standard[1].
> 
> The main goal would be to formalize the API as it is actually used
> today, and to replace the "unique" meta-data attribute with two new
> attributes indicating uniqueness and reloadability.

My suggestion would be to consider changing the provider name for RAs
from the resource-agents upstream project to anything more reasonable
than "heartbeat", in one step with bumping a to-be-added conformance
parameter in the meta-data denoting that the RA in question reflects
the requirements of the new revision of the OCF/resource agents API
(and, apparently, in one step with delivering any conformance
adjustments needed, such as the mentioned "unique" indicator).

Original thread regarding this related suggestion from 3 years ago:
http://lists.linux-ha.org/pipermail/linux-ha/2013-July/047320.html

spanned also into the following month:
http://lists.linux-ha.org/pipermail/linux-ha/2013-August/047368.html

> We could also add the fence agent API as a new spec, or expand the
> RA spec to cover both.

Definitely, the spec(s) should be as language-agnostic as possible
(so no pretending that, e.g., fencing library of fence-agents is
a panacea to hide all the interaction/interface details; the goal
of the standardization work should be to allow truly interchangeable
components).

> [...]
> 
> [1]
> http://www.opencf.org/cgi-bin/viewcvs.cgi/specs/ra/resource-agent-api.txt?rev=HEAD

-- 
Jan (Poki)


pgpLroCFNbwgJ.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] [booth][sbd] GPLv2.1+ clarification request

2016-05-05 Thread Jan Pokorný
On 05/04/16 12:33 +0200, Dejan Muhamedagic wrote:
> On Wed, Mar 30, 2016 at 05:27:20PM +0200, Jan Pokorný wrote:
>> On 24/03/16 17:18 +0100, Jan Pokorný wrote:
>>> On 22/03/16 19:18 +0100, Dejan Muhamedagic wrote:
>>>> On Mon, Mar 21, 2016 at 10:03:12PM +0100, Jan Pokorný wrote:
>>>>> On 18/03/16 16:16 +0100, Lars Ellenberg wrote:
>>>>>> So I move to change it to GPLv2+, for everything that is a "program",
>>>>>> and LGPLv2.1 for everything that may be viewed as a library.
>>>>>> 
>>>>>> At least that's how I will correct the wording in the
>>>>>> affected files in the heartbeat mercurial.
>>>>> 
>>>>> In the light of the presented historic excursion, that feels natural.
>>>>> 
>>>>> Assuming no licensors want to speak up, the question now stands:
>>>>> Is it the same conclusion that has been reached by booth and sbd
>>>>> package maintainers (Dejan and Andrew respectively, if I follow what's
>>>>> authoritative nowadays properly) and are these willing to act on it to
>>>>> prevent the mentioned ambiguous interpretation once forever?
>>>> 
>>>> Yes, that's all fine with me.
>>>> 
>>>>> I will be happy to provide actual patches,
>>>> 
>>>> Even better :)
>>> 
>>> Added the "maint: clarify GPLv2.1+ -> GPLv2+ in the license notices"
>>> (e294fa2) commit into https://github.com/ClusterLabs/booth/pull/23
>>> if that's OK with you, Dejan.
>> 
>> I hope we are all on the same page as Andrew went ahead there (thanks).
>> Alas, I've noticed there were some subtleties neglected in there so,
>> with regrets, a separate (and hopefully final) pull request:
>> 
>> https://github.com/ClusterLabs/booth/pull/24
> 
> This got merged too. Thanks!

Neverending story, it seems.  Regrettably, please accept also
https://github.com/ClusterLabs/booth/pull/33 to call this license
clarification effort complete, Dejan.

-- 
Jan (Poki)


pgpNOD1IwzaMT.pgp
Description: PGP signature
___
Developers mailing list
Developers@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/developers


Re: [ClusterLabs Developers] [ClusterLabsu] [patch][crmsh] Rework function next_nodeid.

2016-04-06 Thread Jan Pokorný
Andrei,

On 06/04/16 14:39 +0300, Andrei Maruha wrote:
> Attached patch contains a little bit reworked function next_nodeid> 
> 
> [...]

there are two better-aligned channels for proposing patches (ordered
by preference, at least judging based on
https://github.com/ClusterLabs/crmsh/pulls?q=is%3Apr+is%3Aclosed):

1. pull request against https://github.com/ClusterLabs/crmsh

2. patch (per git conventions, which are followed here) sent to the
   developers@c.o ML, with an appropriately labeled topic so that
   the respective upstream folks can hook on easily (in this case,
   the prefix should contain "crmsh"; see how I modified the topic,
   along with cross-posting to the correct list)

There is a reason behind having two mailing lists, a different audience
being the most prominent one (true, devels will likely read both,
but the rest of the "users" would be better off without such traffic).

Thanks for understanding.

-- Jan

> From 56d99aa764abb2af8d638425b10a1e493d935e4b Mon Sep 17 00:00:00 2001
> From: Andrei Maruha 
> Date: Wed, 6 Apr 2016 12:33:27 +0300
> Subject: low: corosync: Don't take next node id based on max value, if some
>  smaller node id is free.
> 
> Do not assign node id equals to 'maxid + 1' in case if some node was
> removed and free node id can be reused.
> 
> diff --git a/crmsh/corosync.py b/crmsh/corosync.py
> index e9950b8..6401f52 100644
> --- a/crmsh/corosync.py
> +++ b/crmsh/corosync.py
> @@ -327,11 +327,16 @@ def diff_configuration(nodes, checksum=False):
>  utils.remote_diff(local_path, nodes)
>  
>  
> -def next_nodeid(parser):
> +def get_free_nodeid(parser):
>  ids = parser.get_all('nodelist.node.nodeid')
>  if not ids:
>  return 1
> -return max([int(i) for i in ids]) + 1
> +ids = [int(i) for i in ids]
> +max_id = max(ids) + 1
> +for i in xrange(1, max_id):
> +if i not in ids:
> +return i
> +return max_id
>  
>  
>  def get_ip(node):
> @@ -399,7 +404,7 @@ def add_node(addr, name=None):
>  p = Parser(f)
>  
>  node_addr = addr
> -node_id = next_nodeid(p)
> +node_id = get_free_nodeid(p)
>  node_name = name
>  node_value = (make_value('nodelist.node.ring0_addr', node_addr) +
>make_value('nodelist.node.nodeid', str(node_id)))
> diff --git a/test/unittests/test_corosync.py b/test/unittests/test_corosync.py
> index db8dd8c..d2a25b6 100644
> --- a/test/unittests/test_corosync.py
> +++ b/test/unittests/test_corosync.py
> @@ -5,6 +5,7 @@
>  
>  import os
>  import unittest
> +import mock
>  from crmsh import corosync
>  from crmsh.corosync import Parser, make_section, make_value
>  
> @@ -67,7 +68,7 @@ class TestCorosyncParser(unittest.TestCase):
>  p.add('nodelist',
>make_section('nodelist.node',
> make_value('nodelist.node.ring0_addr', 
> '10.10.10.10') +
> -   make_value('nodelist.node.nodeid', 
> str(corosync.next_nodeid(p)
> +   make_value('nodelist.node.nodeid', 
> str(corosync.get_free_nodeid(p)
>  _valid(p)
>  self.assertEqual(p.count('nodelist.node'), 6)
>  self.assertEqual(p.get_all('nodelist.node.nodeid'),
> @@ -75,11 +76,11 @@ class TestCorosyncParser(unittest.TestCase):
>  
>  def test_add_node_no_nodelist(self):
>  "test checks that if there is no nodelist, no node is added"
> -from crmsh.corosync import make_section, make_value, next_nodeid
> +from crmsh.corosync import make_section, make_value, get_free_nodeid
>  
>  p = Parser(F1)
>  _valid(p)
> -nid = next_nodeid(p)
> +nid = get_free_nodeid(p)
>  self.assertEqual(p.count('nodelist.node'), nid - 1)
>  p.add('nodelist',
>make_section('nodelist.node',
> @@ -89,11 +90,11 @@ class TestCorosyncParser(unittest.TestCase):
>  self.assertEqual(p.count('nodelist.node'), nid - 1)
>  
>  def test_add_node_nodelist(self):
> -from crmsh.corosync import make_section, make_value, next_nodeid
> +from crmsh.corosync import make_section, make_value, get_free_nodeid
>  
>  p = Parser(F2)
>  _valid(p)
> -nid = next_nodeid(p)
> +nid = get_free_nodeid(p)
>  c = p.count('nodelist.node')
>  p.add('nodelist',
>make_section('nodelist.node',
> @@ -101,7 +102,7 @@ class TestCorosyncParser(unittest.TestCase):
> make_value('nodelist.node.nodeid', str(nid
>  _valid(p)
>  self.assertEqual(p.count('nodelist.node'), c + 1)
> -self.assertEqual(next_nodeid(p), nid + 1)
> +self.assertEqual(get_free_nodeid(p), nid + 1)
>  
>  def test_remove_node(self):
>  p = Parser(F2)
> @@ -118,5 +119,14 @@ class TestCorosyncParser(unittest.TestCase):
>  _valid(p)
>  self.assertEqual(p.count('service.ver'), 1)
>  
> +def test_get_free_nodeid(self):
> 
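
As an aside, the helper in the patch relies on xrange(), which does
not exist in Python 3; a rough Python 3 rendition of the same
lowest-free-id scan (assuming the same Parser API as in the diff
above) could be:

    def get_free_nodeid(parser):
        """Return the smallest positive node id not yet in use."""
        ids = parser.get_all('nodelist.node.nodeid')
        if not ids:
            return 1
        used = {int(i) for i in ids}
        candidate = 1
        while candidate in used:
            candidate += 1
        return candidate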

Re: [ClusterLabs Developers] [ClusterLabs] [Announce] libqb 1.0rc4 release

2016-03-18 Thread Jan Pokorný
On 17/03/16 16:37 +, Christine Caulfield wrote:
> This is a bugfix release and a potential 1.0 candidate.

Since libqb nowadays primarily serves as a building block for some of
the components in the common cluster stack, its releases should likely
be announced (also) on the developers ML, release candidates in
particular (hence I am taking the liberty to cross-post).

As usual, there are COPR builds for Fedora/EPEL for convenient tryout:
https://copr.fedorainfracloud.org/coprs/jpokorny/libqb/build/168979/
This time around, I made the builds fully self-source-hosted
(SRPM obtained with "make srpm").

> There are no actual code changes in this release, most of the patches
> are to the build system. Thanks to Jan Pokorný for, er, all of them.
> I've bumped the library soname to 0.18.0 which should really have
> happened last time.
> 
> Changes from 1.0rc3
> 
> build: fix tests/_syslog_override.h not being distributed
> build: enable syslog tests when configuring in spec
> build: do not install syslog_override for the RPM packaging
> build: update library soname to 0.18.0
> build: do not try to second-guess "distdir" Automake variable
> build: switch to XZ tarball format for {,s}rpm packaging
> CI: make sure RPM can be built all the time
> Generating the man pages definitely doesn't depend on existence of
> (possibly generated) header files that we omit anyway.
> build: drop extra qbconfig.h rule for auto_check_header self-test
> build: extra clean-local rule instead of overriding clean-generic
> build: docs: {dependent -> public}_headers + more robust obtaining
> build: tests: grab "public_headers" akin to docs precedent
> build: fix preposterous usage of $(AM_V_GEN)
> build: tests: add intermediate check-headers target
> CI: "make check" already included in "make distcheck"
> build: fix out-of-tree build broken with 0b04ed5 (#184)
> docs: enhance qb_log_ctl* description + fix doxygen warning
> docs: apply "doxygen -u" on {html,man}.dox.in, fix deprecations
> docs: {html,man}.dox.in: strip options for unused outputs
> docs: {html,man}.dox.in: unify where reasonable
> docs: make README.markdown always point to "CURRENT" docs
> build: reorder LINT_FLAGS in a more logical way
> build: make the code splint-friendly where not already
> build: better support for splint checker
> build: make splint check tolerant of existing defects

-- 
Jan (Poki)




Re: [ClusterLabs Developers] Proposed future feature: multiple notification scripts

2015-12-04 Thread Jan Pokorný
On 04/12/15 12:33 +1100, Andrew Beekhof wrote:
>> On 4 Dec 2015, at 2:45 AM, Jan Pokorný <jpoko...@redhat.com> wrote:
>> On 02/12/15 17:23 -0600, Ken Gaillot wrote:
>>> This will be of interest to cluster front-end developers and anyone who
>>> needs event notifications ...
>>> 
>>> One of the new features in Pacemaker 1.1.14 will be built-in
>>> notifications of cluster events, as described by Andrew Beekhof on That
>>> Cluster Guy blog:
>>> http://blog.clusterlabs.org/blog/2015/reliable-notifications/
>>> 
>>> For a future version, we're considering extending that to allow multiple
>>> notification scripts, each with multiple recipients. This would require
>>> a significant change in the CIB. Instead of a simple cluster property,
>>> our current idea is a new configuration section in the CIB, probably
>>> along these lines:
>>> 
>>> [the XML example of the proposed "notifications" CIB section, with
>>> per-script and per-recipient entries, was stripped by the list
>>> archive]
>>> 
>>> The recipient values would be passed to the script as command-line
>>> arguments (ex. "/my/script.sh m...@example.com").
>> 
>> Just thinking out loud, Pacemaker is well adapted to cope with
>> asymmetric/heterogeneous nodes (incl. user-assisted optimizations
>> like with the non-default "resource-discovery" property of a location
>> constraint, for instance).
>> 
>> Setting notifications universally for all nodes may be desired
>> in some scenarios, but may not be optimal if nodes may diverge,
> 
> Correct always wins over optimal.
> 
> I’d not be optimising around scripts that only apply to specific
> resources that also don’t run everywhere - at most you waste a few
> cycles.  If that ever becomes a real issue we can add a filter to
> the notify block.
> 
> Far worse is if a service can run somewhere new and you forgot to
> copy the script across… The knowledge doesn’t exist to report that
> as a problem.
> 
> The common scenario will be feeding fencing events into things like
> galera or nova and sending via different transports, like SNMP, SMS,
> email.  Particularly sending SNMP alerts into a fully fledged
> monitoring and alerts system that finds duplicates and does advanced
> filtering.  We do not and should not be trying to reimplement that.
> 
>> or will for sure:
>> 
>> (1) the script may not be distributed across all the nodes
> 
> That's a bug, not a feature.

see below

>>- or (1b) it is located on shared storage that will only become
>>  available later during the cluster life cycle because it is
>>  subject to cluster service management as well
> 
> How will that script send a notification that the shared storage is
> no longer available?

This was mostly based on the (made up, yes) assumption that the
notification script is only checked for existence once.  On the other
hand, if not, a periodic recheck won't be drastically different in
complexity from a periodic dir rescan (and optimizations on some
systems do exist).

>> (2) one intentionally wants to run the notification mechanism
>>on a subset of nodes
> 
> Can you explain to me when that would be a good idea?

I have no idea about the nifty details of how it all should work, but
it may be desired to, e.g., decide whether the notification agent
should also run in the pacemaker_remote case or not.  Or you want to
run backup SMS notifications only on the nodes with a GSM module
installed.

> Particularly when those nodes are the only remaining survivors
> (which you can’t know isn’t the case).
> If we don’t care about the services on those nodes, why did we make
> them HA?

You can achieve a good-enough HA notification mechanism by combining
several non-HA notification methods, just as you do with fencing
topologies, or just as an HA cluster is built from nodes that are not
HA by themselves.

>> Note also that once you have the responsibility to distribute the
>> script on your own, you can use the same distribution mechanism to
>> share your configuration for this script, as an alternative to using
>> "value" attribute in the above proposal
> 
> So instead of using a standard pool of agents and pcs to set a
> value, I get to maintain two sets of files on every node in the
> cluster?
> And this is supposed to be a feature?

Just wanted to remind that the CIB solves just a subset of the
orchestration problems.  Tools like pcs add only a tiny fraction to
this subset.
Standard pool of agents + (mostly) single value cus

Re: [ClusterLabs Developers] Proposed future feature: multiple notification scripts

2015-12-03 Thread Jan Pokorný
On 02/12/15 17:23 -0600, Ken Gaillot wrote:
> This will be of interest to cluster front-end developers and anyone who
> needs event notifications ...
> 
> One of the new features in Pacemaker 1.1.14 will be built-in
> notifications of cluster events, as described by Andrew Beekhof on That
> Cluster Guy blog:
> http://blog.clusterlabs.org/blog/2015/reliable-notifications/
> 
> For a future version, we're considering extending that to allow multiple
> notification scripts, each with multiple recipients. This would require
> a significant change in the CIB. Instead of a simple cluster property,
> our current idea is a new configuration section in the CIB, probably
> along these lines:
> 
> [the XML example of the proposed "notifications" CIB section, with
> per-script and per-recipient entries, was stripped by the list
> archive]
> 
> The recipient values would be passed to the script as command-line
> arguments (ex. "/my/script.sh m...@example.com").
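
To make that calling convention concrete, a minimal sketch of such a
script in Python (the CRM_notify_* environment variables are assumed
from the existing mechanism plus the compatibility note further below;
the actual transport is stubbed out):

    #!/usr/bin/env python
    import os
    import sys

    def main():
        # the first recipient arrives as argv[1]; CRM_notify_recipient
        # is the proposed backward-compatible fallback
        recipient = (sys.argv[1] if len(sys.argv) > 1
                     else os.environ.get("CRM_notify_recipient", ""))
        # collect whatever event details the cluster exported
        event = dict((k, v) for k, v in os.environ.items()
                     if k.startswith("CRM_notify_"))
        # here one would mail/SNMP/SMS the event to the recipient;
        # printing merely stands in for a real transport
        print("notify %s: %s" % (recipient, sorted(event)))

    if __name__ == "__main__":
        main()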

Just thinking out loud, Pacemaker is well adapted to cope with
asymmetric/heterogeneous nodes (incl. user-assisted optimizations
like with the non-default "resource-discovery" property of a location
constraint, for instance).

Setting notifications universally for all nodes may be desired
in some scenarios, but may not be optimal if nodes may diverge,
or will for sure:

(1) the script may not be distributed across all the nodes
- or (1b) it is located on shared storage that will only become
  available later during the cluster life cycle because it is
  subject to cluster service management as well

(2) one intentionally wants to run the notification mechanism
on a subset of nodes

Note also that once you have the responsibility to distribute the
script on your own, you can use the same distribution mechanism to
share your configuration for this script, as an alternative to using
the "value" attribute in the above proposal (and again, this way, you
are free to have an asymmetric configuration).  There are tons of
cases like that and one has to deal with them already (some RAs, the
file with the secret for Corosync, ...).

What I am up to is a proposal of an alternative/parallel mechanism
that better fits the asymmetric (and asynchronous from the cluster
life cycle POV) use cases: good old drop-in files.  There would simply
be a dedicated directory (say /usr/share/pacemaker/notify.d) where
the software interested in notifications would drop its own
listener script (or a symlink thereof); the script is then discovered
by Pacemaker upon a subsequent dir rescan or inotify event, done
(see the sketch below the two points).

--> no configuration needed (or it is external to the CIB, or is
interspersed there in a non-invasive way), install and go

--> it has a local-only effect, just as the installation of the
respective software utilizing notifications is local
(and as is the local handling of the notifications!)
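
A rough sketch of that discovery step, in Python (the directory name
and the "executable regular file or symlink" policy are assumptions
of this proposal, not an existing Pacemaker interface):

    import os
    import stat

    NOTIFY_DIR = "/usr/share/pacemaker/notify.d"  # assumed location

    def discover_notify_scripts(path=NOTIFY_DIR):
        """Return executable drop-in scripts, sorted by name."""
        try:
            entries = sorted(os.listdir(path))
        except OSError:
            return []  # no directory: no local listeners installed
        scripts = []
        for name in entries:
            full = os.path.join(path, name)
            try:
                st = os.stat(full)  # follows symlinks on purpose
            except OSError:
                continue  # e.g. a dangling symlink
            if stat.S_ISREG(st.st_mode) and (st.st_mode & stat.S_IXUSR):
                scripts.append(full)
        return scripts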

> For backward compatibility, the (brand new!) notification-agent and
> notification-recipient cluster properties would be kept as deprecated
> shortcuts for a single notify script and recipient.
> 
> Also for backward compatibility, the first recipient would be passed to
> the script as the CRM_notify_recipient environment variable.
> 
> This proposal came about because the new notification capability has
> turned out to be useful enough that people sometimes want to use it for
> multiple purposes, e.g. email an administrator, and notify some software
> that an event occurred.

The proposal might be useful especially for the latter.

> Trying to fit unrelated actions in one notification script (or a
> script that calls multiple other scripts) has obvious pitfalls, so
> this would make it easier on sysadmins.
> 
> Another advantage will be a configurable timeout (1.1.14 will have a
> hardcoded 5-minute timeout for notification scripts).

There may be a catch-all configurable global default that would be
applied also to drop-in files (replicating a metadata framework
in the notification scripts sounds like over-engineering).
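
For illustration, enforcing such a timeout around a notification
script could be as simple as the following Python 3 sketch (the 300 s
default mirrors the hardcoded 5-minute value mentioned above; the
knob name is made up):

    import subprocess

    DEFAULT_NOTIFY_TIMEOUT = 300  # seconds; assumed global default

    def run_notify(script, recipient, env,
                   timeout=DEFAULT_NOTIFY_TIMEOUT):
        try:
            # env replaces the child's whole environment, so it should
            # carry the CRM_notify_* variables for the event
            subprocess.run([script, recipient], env=env,
                           timeout=timeout, check=False)
        except subprocess.TimeoutExpired:
            pass  # a stuck notifier must not block the cluster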

> The crm_attribute command and the various cluster front-ends would need
> to be modified to handle the new configuration syntax.
> 
> This is all in the idea stage (development is still a ways off), so any
> comments, suggestions, criticisms, etc. are welcome.

In the same spirit, please comment on this associated idea.

-- 
Jan (Poki)

