Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Jehan-Guillaume de Rorthais
On Mon, 7 Nov 2016 12:39:32 -0600
Ken Gaillot  wrote:

> On 11/07/2016 12:03 PM, Jehan-Guillaume de Rorthais wrote:
> > On Mon, 7 Nov 2016 09:31:20 -0600
> > Ken Gaillot  wrote:
> >   
> >> On 11/07/2016 03:47 AM, Klaus Wenninger wrote:  
> >>> On 11/07/2016 10:26 AM, Jehan-Guillaume de Rorthais wrote:
>  On Mon, 7 Nov 2016 10:12:04 +0100
>  Klaus Wenninger  wrote:
> 
> > On 11/07/2016 08:41 AM, Ulrich Windl wrote:
> > Ken Gaillot  wrote on 04.11.2016 at 22:37 in
> > message
> >> <27c2ca20-c52c-8fb4-a60f-5ae12f7ff...@redhat.com>:  
> >>> On 11/04/2016 02:29 AM, Ulrich Windl wrote:  
> >>> Ken Gaillot  wrote on 03.11.2016 at 17:08
> >>> in
>  message
>  <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:  
>  ...
> >>> Another possible use would be for a cron that needs to know whether a
> >>> particular resource is running, and an attribute query is quicker and
> >>> easier than something like parsing crm_mon output or probing the
> >>> service.  
> >> crm_mon reads parts of the CIB; crm_attribute also does, I guess, so
> >> aside from lacking options and an inefficient implementation, why should
> >> one be faster than the other?
> > attrd_updater doesn't go through the CIB
>  AFAIK, attrd_updater actually goes to the CIB, unless you set "--private"
>  since 1.1.13:
>  https://github.com/ClusterLabs/pacemaker/blob/master/ChangeLog#L177
> >>> That prevents values being stored in the CIB. attrd_updater should
> >>> always talk to attrd as I got it ...
> >>
> >> It's a bit confusing: Both crm_attribute and attrd_updater will
> >> ultimately affect both attrd and the CIB in most cases, but *how* they
> >> do so is different. crm_attribute modifies the CIB, and lets attrd pick
> >> up the change from there; attrd_updater notifies attrd, and lets attrd
> >> modify the CIB.
> >>
> >> The difference is subtle.
> >>
> >> With corosync 2, attrd only modifies "transient" node attributes (which
> >> stay in effect till the next reboot), not "permanent" attributes.  
> > 
> > So why is "--private" not compatible with corosync 1.x, since attrd_updater
> > only sets "transient" attributes anyway?
> 
> Corosync 1 does not support certain reliability guarantees required by
> the current attrd, so when building against the corosync 1 libraries,
> pacemaker will install "legacy" attrd instead. The difference is mainly
> that the current attrd can guarantee atomic updates to attribute values.
> attrd_updater actually can set permanent attributes when used with
> legacy attrd.

OK, I understand now.

> > How and where are private attributes stored?
> 
> They are kept in memory only, in attrd. Of course, attrd is clustered,
> so they are kept in sync across all nodes.

OK, that was my guess.

> >> So crm_attribute must be used if you want to set a permanent attribute.
> >> crm_attribute also has the ability to modify cluster properties and
> >> resource defaults, as well as node attributes.
> >>
> >> On the other hand, by contacting attrd directly, attrd_updater can
> >> change an attribute's "dampening" (how often it is flushed to the CIB),
> >> and it can (as mentioned above) set "private" attributes that are never
> >> written to the CIB (and thus never cause the cluster to re-calculate
> >> resource placement).  
> > 
> > Interesting, thank you for the clarification.
> > 
> > As I understand it, it comes down to:
> > 
> >   crm_attribute -> CIB <-(poll/notify?) attrd
> >   attrd_updater -> attrd -> CIB  
> 
> Correct. On startup, attrd registers with CIB to be notified of all changes.
> 
> > Just a quick question about this: is it possible to set a "dampening" high
> > enough that attrd never flushes it to the CIB (a kind of private attribute too)?
> 
> I'd expect that to work, if the dampening interval was higher than the
> lifetime of the cluster being up.

Interesting.

> It's also possible to abuse attrd to create a kind of private attribute
> by using a node name that doesn't exist and never will. :) This ability
> is intentionally allowed, so you can set attributes for nodes that the
> current partition isn't aware of, or nodes that are planned to be added
> later, but only attributes for known nodes will be written to the CIB.

Again, interesting. I'll do some tests on my RA, as I need clustered private
attributes and was not able to get them under the old stack (Debian < 8 or
RHEL < 7).

Thank you very much for your answers!

Regards,



Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Ken Gaillot
On 11/07/2016 12:03 PM, Jehan-Guillaume de Rorthais wrote:
> On Mon, 7 Nov 2016 09:31:20 -0600
> Ken Gaillot  wrote:
> 
>> On 11/07/2016 03:47 AM, Klaus Wenninger wrote:
>>> On 11/07/2016 10:26 AM, Jehan-Guillaume de Rorthais wrote:  
 On Mon, 7 Nov 2016 10:12:04 +0100
 Klaus Wenninger  wrote:
  
> On 11/07/2016 08:41 AM, Ulrich Windl wrote:  
> Ken Gaillot  wrote on 04.11.2016 at 22:37 in
> message
>> <27c2ca20-c52c-8fb4-a60f-5ae12f7ff...@redhat.com>:
>>> On 11/04/2016 02:29 AM, Ulrich Windl wrote:
> >>> Ken Gaillot  wrote on 03.11.2016 at 17:08
> >>> in
 message
 <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:
 ...  
>>> Another possible use would be for a cron that needs to know whether a
>>> particular resource is running, and an attribute query is quicker and
>>> easier than something like parsing crm_mon output or probing the
>>> service.
>> crm_mon reads parts of the CIB; crm_attribute also does, I guess, so
>> aside from lacking options and an inefficient implementation, why should one
>> be faster than the other?
> attrd_updater doesn't go through the CIB
 AFAIK, attrd_updater actually goes to the CIB, unless you set "--private"
 since 1.1.13:
 https://github.com/ClusterLabs/pacemaker/blob/master/ChangeLog#L177  
>>> That prevents values being stored in the CIB. attrd_updater should
>>> always talk to attrd as I got it ...  
>>
>> It's a bit confusing: Both crm_attribute and attrd_updater will
>> ultimately affect both attrd and the CIB in most cases, but *how* they
>> do so is different. crm_attribute modifies the CIB, and lets attrd pick
>> up the change from there; attrd_updater notifies attrd, and lets attrd
>> modify the CIB.
>>
>> The difference is subtle.
>>
>> With corosync 2, attrd only modifies "transient" node attributes (which
>> stay in effect till the next reboot), not "permanent" attributes.
> 
> So why is "--private" not compatible with corosync 1.x, since attrd_updater
> only sets "transient" attributes anyway?

Corosync 1 does not support certain reliability guarantees required by
the current attrd, so when building against the corosync 1 libraries,
pacemaker will install "legacy" attrd instead. The difference is mainly
that the current attrd can guarantee atomic updates to attribute values.
attrd_updater actually can set permanent attributes when used with
legacy attrd.

> How and where are private attributes stored?

They are kept in memory only, in attrd. Of course, attrd is clustered,
so they are kept in sync across all nodes.
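For example (hypothetical attribute name; a minimal, untested sketch):

  # set a private attribute on the local node -- synced by attrd across the
  # cluster, but never written to the CIB
  attrd_updater --name my_private_attr --update 1 --private

  # read it back (attrd answers from memory)
  attrd_updater --name my_private_attr --query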

>> So crm_attribute must be used if you want to set a permanent attribute.
>> crm_attribute also has the ability to modify cluster properties and
>> resource defaults, as well as node attributes.
>>
>> On the other hand, by contacting attrd directly, attrd_updater can
>> change an attribute's "dampening" (how often it is flushed to the CIB),
>> and it can (as mentioned above) set "private" attributes that are never
>> written to the CIB (and thus never cause the cluster to re-calculate
>> resource placement).
> 
> Interesting, thank you for the clarification.
> 
> As I understand it, it comes down to:
> 
>   crm_attribute -> CIB <-(poll/notify?) attrd
>   attrd_updater -> attrd -> CIB

Correct. On startup, attrd registers with CIB to be notified of all changes.

> Just a quick question about this: is it possible to set a "dampening" high
> enough that attrd never flushes it to the CIB (a kind of private attribute too)?

I'd expect that to work, if the dampening interval was higher than the
lifetime of the cluster being up.
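Something along these lines should do it (untested sketch; the attribute name
and the one-hour delay are arbitrary):

  # attrd waits for the dampening delay before flushing the value to the CIB,
  # so a delay longer than the cluster's uptime effectively keeps it out of it
  attrd_updater --name my_attr --update 1 --delay 3600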

It's also possible to abuse attrd to create a kind of private attribute
by using a node name that doesn't exist and never will. :) This ability
is intentionally allowed, so you can set attributes for nodes that the
current partition isn't aware of, or nodes that are planned to be added
later, but only attributes for known nodes will be written to the CIB.
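For example (deliberately bogus node name):

  # attrd accepts and replicates this, but since "no-such-node" never becomes
  # a known node, the value is never written to the CIB
  attrd_updater --node no-such-node --name scratch_attr --update 1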



Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Jehan-Guillaume de Rorthais
On Mon, 7 Nov 2016 09:31:20 -0600
Ken Gaillot  wrote:

> On 11/07/2016 03:47 AM, Klaus Wenninger wrote:
> > On 11/07/2016 10:26 AM, Jehan-Guillaume de Rorthais wrote:  
> >> On Mon, 7 Nov 2016 10:12:04 +0100
> >> Klaus Wenninger  wrote:
> >>  
> >>> On 11/07/2016 08:41 AM, Ulrich Windl wrote:  
> >>> Ken Gaillot  wrote on 04.11.2016 at 22:37 in
> >>> message
>  <27c2ca20-c52c-8fb4-a60f-5ae12f7ff...@redhat.com>:
> > On 11/04/2016 02:29 AM, Ulrich Windl wrote:
> > Ken Gaillot  wrote on 03.11.2016 at 17:08
> > in
> >> message
> >> <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:
> >> ...  
> > Another possible use would be for a cron that needs to know whether a
> > particular resource is running, and an attribute query is quicker and
> > easier than something like parsing crm_mon output or probing the
> > service.
>  crm_mon reads parts of the CIB; crm_attribute also does, I guess, so
>  aside from lacking options and an inefficient implementation, why should one
>  be faster than the other?
> >>> attrd_updater doesn't go through the CIB
> >> AFAIK, attrd_updater actually goes to the CIB, unless you set "--private"
> >> since 1.1.13:
> >> https://github.com/ClusterLabs/pacemaker/blob/master/ChangeLog#L177  
> > That prevents values being stored in the CIB. attrd_updater should
> > always talk to attrd as I got it ...  
> 
> It's a bit confusing: Both crm_attribute and attrd_updater will
> ultimately affect both attrd and the CIB in most cases, but *how* they
> do so is different. crm_attribute modifies the CIB, and lets attrd pick
> up the change from there; attrd_updater notifies attrd, and lets attrd
> modify the CIB.
> 
> The difference is subtle.
> 
> With corosync 2, attrd only modifies "transient" node attributes (which
> stay in effect till the next reboot), not "permanent" attributes.

So why "--private" is not compatible with corosync 1.x as attrd_updater only set
"transient" attributes anyway?

How and where are private attributes stored?

> So crm_attribute must be used if you want to set a permanent attribute.
> crm_attribute also has the ability to modify cluster properties and
> resource defaults, as well as node attributes.
> 
> On the other hand, by contacting attrd directly, attrd_updater can
> change an attribute's "dampening" (how often it is flushed to the CIB),
> and it can (as mentioned above) set "private" attributes that are never
> written to the CIB (and thus never cause the cluster to re-calculate
> resource placement).

Interesting, thank you for the clarification.

As I understand it, it comes down to:

  crm_attribute -> CIB <-(poll/notify?) attrd
  attrd_updater -> attrd -> CIB

Just a quick question about this: is it possible to set a "dampening" high
enough that attrd never flushes it to the CIB (a kind of private attribute too)?

Regards,



Re: [ClusterLabs] permissions under /etc/corosync/qnetd (was: Corosync 2.4.0 is available at corosync.org!)

2016-11-07 Thread Ferenc Wágner
Jan Friesse  writes:

> Ferenc Wágner wrote:
>
>> Have you got any plans/timeline for 2.4.2 yet?
>
> Yep, I'm going to release it in a few minutes/hours.

Man, that was quick.  I've got a bunch of typo fixes queued... :) Please
consider announcing upcoming releases a couple of days in advance; as a
packager, I'd much appreciate it.  Maybe even tag release candidates...

Anyway, I've got a question concerning corosync-qnetd.  I run it as
user and group coroqnetd.  Is granting it read access to cert8.db and
key3.db enough for proper operation?  corosync-qnetd-certutil gives
group coroqnetd write access to everything, which seems unintuitive
to me.  Please note that I've got zero experience with NSS.  But I don't
expect the daemon to change the certificate database.  Should I?
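Concretely, I have something like this in mind (an untested sketch, assuming
the default /etc/corosync/qnetd/nssdb layout):

  # database owned by root, readable but not writable by the daemon's group
  chown -R root:coroqnetd /etc/corosync/qnetd/nssdb
  chmod 0750 /etc/corosync/qnetd/nssdb
  chmod 0640 /etc/corosync/qnetd/nssdb/cert8.db /etc/corosync/qnetd/nssdb/key3.db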
-- 
Thanks,
Feri



[ClusterLabs] Corosync 2.4.2 is available at corosync.org!

2016-11-07 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
2.4.2 available immediately from our website at
http://build.clusterlabs.org/corosync/releases/.

This release exists mainly because we forgot to bump the libvotequorum.so
major version number in 2.4.0. This is not that big a deal, because
libvotequorum isn't used by 3rd-party applications (pacemaker, ...), but it
still makes sense to have the issue fixed. Thanks to Ferenc Wágner for noticing.
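To double-check the bump after installing (the library path differs per
distribution; the expected 8.0.0 comes from the verso change discussed in the
2.4.0 thread):

  ls -l /usr/lib64/libvotequorum.so.*
  # 2.4.2 should ship libvotequorum.so.8.0.0 (2.4.0/2.4.1 still had 7.0.0)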



Complete changelog for 2.4.2:
Christine Caulfield (1):
  man: mention qdevice incompatibilites in votequorum.5

Fabio M. Di Nitto (1):
  [build] Fix build on RHEL7.3 latest

Jan Friesse (3):
  Man: Fix corosync-qdevice-net-certutil link
  Qnetd LMS: Fix two partition use case
  libvotequorum: Bump version

Michael Jones (1):
  cfg: Prevents use of uninitialized buffer


Upgrading is (as usual) highly recommended.

Thanks/congratulations to all the people who contributed to achieving this
great milestone.




Re: [ClusterLabs] Corosync 2.4.0 is available at corosync.org!

2016-11-07 Thread Jan Friesse

Ferenc Wágner wrote:

Jan Friesse  writes:


Jan Friesse  writes:


Please note that because of required changes in votequorum,
libvotequorum is no longer binary compatible. This is the reason for
the version bump.


Er, what version bump?  Corosync 2.4.1 still produces
libvotequorum.so.7.0.0 for me, just like Corosync 2.3.6.


Yep, you are right. Thanks for noticing; this is something that should
have happened.


Thanks for confirming.


Anyway, 2.3.6 and 2.4.x votequorum are incompatible (there were both
API and ABI changes). Probably something to fix in 2.4.2.


Have you got any plans/timeline for 2.4.2 yet?


Yep, I'm going to release it in a few minutes/hours.



Anyway, we're packaging 2.4.1 for Debian now, shall we ship it with

-7.0.0
+8.0.0

in lib/libvotequorum.verso?






Re: [ClusterLabs] What is the logic when two node are down at the same time and needs to be fenced

2016-11-07 Thread Niu Sibo

Hi Ken,

Thanks for the clarification. Now I have another real problem that needs
your advice.


The cluster consists of 5 nodes, and one of the nodes had a 1-second
network failure, which resulted in one of the VirtualDomain resources
starting on two nodes at the same time. The cluster property
no_quorum_policy is set to stop.


At 16:13:34, this happened:
16:13:34 zs95kj attrd[133000]:  notice: crm_update_peer_proc: Node 
zs93KLpcs1[5] - state is now lost (was member)
16:13:34 zs95kj corosync[132974]:  [CPG   ] left_list[0] 
group:pacemakerd\x00, ip:r(0) ip(10.20.93.13) , pid:28721

16:13:34 zs95kj crmd[133002]: warning: No match for shutdown action on 5
16:13:34 zs95kj attrd[133000]:  notice: Removing all zs93KLpcs1 
attributes for attrd_peer_change_cb

16:13:34 zs95kj corosync[132974]:  [CPG   ] left_list_entries:1
16:13:34 zs95kj crmd[133002]:  notice: Stonith/shutdown of zs93KLpcs1 
not matched

...
16:13:35 zs95kj attrd[133000]:  notice: crm_update_peer_proc: Node 
zs93KLpcs1[5] - state is now member (was (null))


From the DC:
[root@zs95kj ~]# crm_simulate --xml-file 
/var/lib/pacemaker/pengine/pe-input-3288.bz2 |grep 110187
 zs95kjg110187_res  (ocf::heartbeat:VirtualDomain): Started 
zs93KLpcs1 <-- This is the baseline where everything works normally


[root@zs95kj ~]# crm_simulate --xml-file 
/var/lib/pacemaker/pengine/pe-input-3289.bz2 |grep 110187
 zs95kjg110187_res  (ocf::heartbeat:VirtualDomain): Stopped 
<--- Here the node zs93KLpcs1 lost its network for 1 second, resulting
in this state.


[root@zs95kj ~]# crm_simulate --xml-file 
/var/lib/pacemaker/pengine/pe-input-3290.bz2 |grep 110187

 zs95kjg110187_res  (ocf::heartbeat:VirtualDomain): Stopped

[root@zs95kj ~]# crm_simulate --xml-file 
/var/lib/pacemaker/pengine/pe-input-3291.bz2 |grep 110187

 zs95kjg110187_res  (ocf::heartbeat:VirtualDomain): Stopped


From the DC's pengine log, it has:
16:05:01 zs95kj pengine[133001]:  notice: Calculated Transition 238: 
/var/lib/pacemaker/pengine/pe-input-3288.bz2

...
16:13:41 zs95kj pengine[133001]:  notice: Start 
zs95kjg110187_res#011(zs90kppcs1)

...
16:13:41 zs95kj pengine[133001]:  notice: Calculated Transition 239: 
/var/lib/pacemaker/pengine/pe-input-3289.bz2


From the DC's CRMD log, it has:
Sep  9 16:05:25 zs95kj crmd[133002]:  notice: Transition 238 
(Complete=48, Pending=0, Fired=0, Skipped=0, Incomplete=0, 
Source=/var/lib/pacemaker/pengine/pe-input-3288.bz2): Complete

...
Sep  9 16:13:42 zs95kj crmd[133002]:  notice: Initiating action 752: 
start zs95kjg110187_res_start_0 on zs90kppcs1

...
Sep  9 16:13:56 zs95kj crmd[133002]:  notice: Transition 241 
(Complete=81, Pending=0, Fired=0, Skipped=172, Incomplete=341, 
Source=/var/lib/pacemaker/pengine/pe-input-3291.bz2): Stopped


Here I do not see any log messages about pe-input-3289.bz2 or pe-input-3290.bz2.
Why is this?


From the log on zs93KLpcs1 where guest 110187 was running, I do not see
any message regarding stopping this resource after it lost its 
connection to the cluster.


Any ideas where to look for possible cause?

On 11/3/2016 1:02 AM, Ken Gaillot wrote:

On 11/02/2016 11:17 AM, Niu Sibo wrote:

Hi all,

I have a general question regarding the fencing logic in Pacemaker.

I have set up a three-node cluster with Pacemaker 1.1.13 and the cluster
property no_quorum_policy set to ignore. When two nodes lose the NIC that
corosync is running on at the same time, it looks like the two nodes are
getting fenced one by one, even though I have three fence devices defined
for each of the nodes.

What should I be expecting in this case?

It's probably coincidence that the fencing happens serially; there is
nothing enforcing that for separate fence devices. There are many steps
in a fencing request, so they can easily take different times to complete.


I noticed that if the node rejoins the cluster before the cluster starts the
fence actions, some resources will get activated on 2 nodes at the
same time. This is really not good if the resource happens to be a
virtual guest.  Thanks for any suggestions.

Since you're ignoring quorum, there's nothing stopping the disconnected
node from starting all resources on its own. It can even fence the other
nodes, unless the downed NIC is used for fencing. From that node's point
of view, it's the other two nodes that are lost.

Quorum is the only solution I know of to prevent that. Fencing will
correct the situation, but it won't prevent it.

See the votequorum(5) man page for various options that can affect how
quorum is calculated. Also, the very latest version of corosync supports
qdevice (a lightweight daemon that runs on a host outside the cluster
strictly for the purposes of quorum).
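For example, a votequorum setup along these lines (illustrative values for a
five-node cluster; see votequorum(5) for the exact semantics of each option):

  quorum {
      provider: corosync_votequorum
      expected_votes: 5
      wait_for_all: 1
      last_man_standing: 1
  }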






Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Ken Gaillot
On 11/07/2016 03:47 AM, Klaus Wenninger wrote:
> On 11/07/2016 10:26 AM, Jehan-Guillaume de Rorthais wrote:
>> On Mon, 7 Nov 2016 10:12:04 +0100
>> Klaus Wenninger  wrote:
>>
>>> On 11/07/2016 08:41 AM, Ulrich Windl wrote:
>>> Ken Gaillot  wrote on 04.11.2016 at 22:37 in
>>> message
 <27c2ca20-c52c-8fb4-a60f-5ae12f7ff...@redhat.com>:  
> On 11/04/2016 02:29 AM, Ulrich Windl wrote:  
> Ken Gaillot  wrote on 03.11.2016 at 17:08 in
>> message
>> <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:  
>> ...
> Another possible use would be for a cron that needs to know whether a
> particular resource is running, and an attribute query is quicker and
> easier than something like parsing crm_mon output or probing the service. 
>  
 crm_mon reads parts of the CIB; crm_attribute also does, I guess, so
 aside from lacking options and an inefficient implementation, why should one
 be faster than the other?
>>> attrd_updater doesn't go through the CIB
>> AFAIK, attrd_updater actually goes to the CIB, unless you set "--private"
>> since 1.1.13:
>> https://github.com/ClusterLabs/pacemaker/blob/master/ChangeLog#L177
> That prevents values being stored in the CIB. attrd_updater should
> always talk to attrd as I got it ...

It's a bit confusing: Both crm_attribute and attrd_updater will
ultimately affect both attrd and the CIB in most cases, but *how* they
do so is different. crm_attribute modifies the CIB, and lets attrd pick
up the change from there; attrd_updater notifies attrd, and lets attrd
modify the CIB.

The difference is subtle.

With corosync 2, attrd only modifies "transient" node attributes (which
stay in effect till the next reboot), not "permanent" attributes. So
crm_attribute must be used if you want to set a permanent attribute.
crm_attribute also has the ability to modify cluster properties and
resource defaults, as well as node attributes.

On the other hand, by contacting attrd directly, attrd_updater can
change an attribute's "dampening" (how often it is flushed to the CIB),
and it can (as mentioned above) set "private" attributes that are never
written to the CIB (and thus never cause the cluster to re-calculate
resource placement).
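For illustration, the two paths look something like this on the command line
(attribute, value and node names are made up):

  # permanent node attribute: crm_attribute writes it to the CIB, and attrd
  # picks up the change from there
  crm_attribute --node node1 --name my_attr --update some_value --lifetime forever

  # transient node attribute: attrd_updater tells attrd, and attrd writes it
  # to the CIB status section
  attrd_updater --name my_attr --update some_value

  # "private" attribute: kept in attrd only, never written to the CIB
  attrd_updater --name my_attr --update some_value --private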



Re: [ClusterLabs] Authoritative corosync's location

2016-11-07 Thread Jan Pokorný
On 22/09/16 09:05 +0200, Jan Friesse wrote:
> Jan Pokorný wrote:
>> On 21/09/16 09:16 +0200, Jan Friesse wrote:
 Thomas Lamprecht wrote:
 I also have another, organizational question. I saw on the GitHub page of
 corosync that pull requests are preferred there, and also that the
>>> 
>>> True
>> 
>> At this point, it's worth noting that ClusterLabs/corosync is
>> currently a stale fork of corosync/corosync location at GitHub,
>> which may be a source of confusion.
> 
> Nice catch, I didn't even know it exists.
> 
>> 
>> It would make sense to settle on just a single one as the clearly
>> authoritative place to be in touch with (not sure what the options
>> are -- aliasing/transferring?).
> 
> Sure. I don't know who created that fork but whoever it was please consider
> deleting it. It may be really confusing.

Even more so when it's occasionally updated;
https://github.com/ClusterLabs/corosync (at master branch) now says
"This branch is 3 commits behind corosync:master.".

That also means that there seems to be no satisfactory solution yet.

-- 
Jan (Poki)




Re: [ClusterLabs] Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Jan Pokorný
On 03/11/16 11:08 -0500, Ken Gaillot wrote:
> ClusterLabs is happy to announce the first release candidate for
> Pacemaker version 1.1.16. Source code is available at:
> 
> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1
> 
> [...]

As usual, there are COPR builds (using the upstream spec file without any
of the final touches that are usually done downstream) for easy consumption
in some environments:
https://copr.fedorainfracloud.org/coprs/jpokorny/pacemaker/build/473980/

I also have something to share regarding the recently announced security
fix in pacemaker, if you are interested in Fedora: fixed packages
should be available from the updates-testing repo in Fedora 23
and Fedora 25, and from the regular updates repo in Fedora 24 at the moment.

-- 
Jan (Poki)




Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Klaus Wenninger
On 11/07/2016 10:26 AM, Jehan-Guillaume de Rorthais wrote:
> On Mon, 7 Nov 2016 10:12:04 +0100
> Klaus Wenninger  wrote:
>
>> On 11/07/2016 08:41 AM, Ulrich Windl wrote:
>> Ken Gaillot  wrote on 04.11.2016 at 22:37 in
>> message
>>> <27c2ca20-c52c-8fb4-a60f-5ae12f7ff...@redhat.com>:  
 On 11/04/2016 02:29 AM, Ulrich Windl wrote:  
 Ken Gaillot  wrote on 03.11.2016 at 17:08 in
> message
> <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:  
> ...
 Another possible use would be for a cron that needs to know whether a
 particular resource is running, and an attribute query is quicker and
 easier than something like parsing crm_mon output or probing the service.  
>>> crm_mon reads parts of the CIB; crm_attribute also does, I guess, so
>>> aside from lacking options and an inefficient implementation, why should one
>>> be faster than the other?
>> attrd_updater doesn't go through the CIB
> AFAIK, attrd_updater actually goes to the CIB, unless you set "--private"
> since 1.1.13:
> https://github.com/ClusterLabs/pacemaker/blob/master/ChangeLog#L177
That prevents values being stored in the CIB. attrd_updater should
always talk to attrd as I got it ...





Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Jehan-Guillaume de Rorthais
On Mon, 7 Nov 2016 10:12:04 +0100
Klaus Wenninger  wrote:

> On 11/07/2016 08:41 AM, Ulrich Windl wrote:
>  Ken Gaillot  wrote on 04.11.2016 at 22:37 in
>  message
> > <27c2ca20-c52c-8fb4-a60f-5ae12f7ff...@redhat.com>:  
> >> On 11/04/2016 02:29 AM, Ulrich Windl wrote:  
> >> Ken Gaillot  wrote on 03.11.2016 at 17:08 in
> >>> message
> >>> <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:  
...
> >> Another possible use would be for a cron that needs to know whether a
> >> particular resource is running, and an attribute query is quicker and
> >> easier than something like parsing crm_mon output or probing the service.  
> > crm_mon reads parts of the CIB; crm_attribute also does, I guess, so
> > aside from lacking options and an inefficient implementation, why should one
> > be faster than the other?
>
> attrd_updater doesn't go through the CIB

AFAIK, attrd_updater actually goes to the CIB, unless you set "--private"
since 1.1.13:
https://github.com/ClusterLabs/pacemaker/blob/master/ChangeLog#L177




Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Klaus Wenninger
On 11/07/2016 08:41 AM, Ulrich Windl wrote:
 Ken Gaillot  wrote on 04.11.2016 at 22:37 in
 message
> <27c2ca20-c52c-8fb4-a60f-5ae12f7ff...@redhat.com>:
>> On 11/04/2016 02:29 AM, Ulrich Windl wrote:
>> Ken Gaillot  wrote on 03.11.2016 at 17:08 in
>>> message
>>> <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:
 ClusterLabs is happy to announce the first release candidate for
 Pacemaker version 1.1.16. Source code is available at:

 https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1 

 The most significant enhancements in this release are:

 * rsc-pattern may now be used instead of rsc in location constraints, to
 allow a single location constraint to apply to all resources whose names
 match a regular expression. Sed-like %0 - %9 backreferences let
 submatches be used in node attribute names in rules.

 * The new ocf:pacemaker:attribute resource agent sets a node attribute
 according to whether the resource is running or stopped. This may be
 useful in combination with attribute-based rules to model dependencies
 that simple constraints can't handle.
>>> I don't quite understand this: Isn't the state of a resource in the CIB
>>> status section anyway? If not, why not add it? So it would be readily available for
>>> anyone (rules, constraints, etc.).
>> This (hopefully) lets you model more complicated relationships.
>>
>> For example, someone recently asked whether they could make an ordering
>> constraint apply only at "start-up" -- the first time resource A starts,
>> it does some initialization that B needs, but once that's done, B can be
>> independent of A.
> Is "at start-up" before start of the resource, after start of the resource, 
> or parallel to the start of the resource ;-)
> Probably a "hook" in the corresponding RA is the better approach, unless you 
> can really model all of the above.
>
>> For that case, you could group A with an ocf:pacemaker:attribute
>> resource. The important part is that the attribute is not set if A has
>> never run on a node. So, you can make a rule that B can run only where
>> the attribute is set, regardless of the value -- even if A is later
>> stopped, the attribute will still be set.
> If a resource is not running on a node, it is "stopped"; isn't it?
>
>> Another possible use would be for a cron that needs to know whether a
>> particular resource is running, and an attribute query is quicker and
>> easier than something like parsing crm_mon output or probing the service.
> crm_mon reads parts of the CIB; crm_attribute also does, I guess, so aside
> from lacking options and an inefficient implementation, why should one be faster
> than the other?

attrd_updater doesn't go through the CIB
 
>
>> It's all theoretical at this point, and I'm not entirely sure those
>> examples would be useful :) but I wanted to make the agent available for
>> people to experiment with.
> A good product manager should resist the temptation to provide every feature the
> customers ask for, to avoid bloat-ware. That is to protect the customers from
> their own bad decisions. In most cases there is a better, more universal
> solution to the specific problem.
>
>
 * Pacemaker's existing "node health" feature allows resources to move
 off nodes that become unhealthy. Now, when using
 node-health-strategy=progressive, a new cluster property
 node-health-base will be used as the initial health score of newly
 joined nodes (defaulting to 0, which is the previous behavior). This
 allows cloned and multistate resource instances to start on a node even
 if it has some "yellow" health attributes.
>>> So the node health is more or less a "node score"? I don't understand the
>>> last sentence. Maybe give an example?
>> Yes, node health is a score that's added when deciding where to place a
>> resource. It does get complicated ...
>>
>> Node health monitoring is optional, and off by default.
>>
>> Node health attributes are set to red, yellow or green (outside
>> pacemaker itself -- either by a resource agent, or some external
>> process). As an example, let's say we have three node health attributes
>> for CPU usage, CPU temperature, and SMART error count.
>>
>> With a progressive strategy, red and yellow are assigned some negative
>> score, and green is 0. In our example, let's say yellow gets a -10 score.
>>
>> If any of our attributes are yellow, resources will avoid the node
>> (unless they have higher positive scores from something like stickiness
>> or a location constraint).
>>
> I understood so far.
>
>> Normally, this is what you want, but if your resources are cloned on all
>> nodes, maybe you don't care if some attributes are yellow. In that case,
>> you can set node-health-base=20, so even if two attributes are yellow,
>> it won't prevent resources from running (20 + -10 + -10 = 0).
> I don't understand