Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-17 Thread Klaus Wenninger
On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
> 17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
>> On Mon, 17 Aug 2020 10:19:45 -0500
>> Ken Gaillot  wrote:
>>
>>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
 Thanks to all your suggestions, I now have the systems with stonith
 configured on ipmi.  
>>> A word of caution: if the IPMI is on-board -- i.e. it shares the same
>>> power supply as the computer -- power becomes a single point of
>>> failure. If the node loses power, the other node can't fence because
>>> the IPMI is also down, and the cluster can't recover.
>>>
>>> Some on-board IPMI controllers can share an Ethernet port with the main
>>> computer, which would be a similar situation.
>>>
>>> It's best to have a backup fencing method when using IPMI as the
>>> primary fencing method. An example would be an intelligent power switch
>>> or sbd.
>> How would SBD be useful in this scenario? The poison pill will not be
>> swallowed by the dead node... Is it just to wait for the watchdog timeout?
>>
> A node is expected to commit suicide if SBD loses access to the shared block
> device. So either the node swallowed the poison pill and died, or it died
> because it realized it could not see the poison pill, or it was dead
> already. After the watchdog timeout (twice the watchdog timeout, for safety)
> we assume the node is dead.
Yes, this way a suicide via watchdog will be triggered if there are
issues with the disk. This is why it is important to have a reliable
watchdog with SBD even when using the poison pill. As this alone would
make a single shared disk a SPOF, when running with pacemaker integration
(the default) a node with SBD will survive losing the disk as long as it
has quorum and pacemaker looks healthy. As corosync quorum in 2-node mode
obviously won't be fit for this purpose, SBD will switch to checking for
the presence of both nodes if the 2-node flag is set.

Sorry for the lengthy explanation, but the full picture is required
to understand why it is sufficiently reliable and useful if configured
correctly.
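
For reference, a minimal sketch of the pieces such a setup involves; the
device paths, timeouts and the pcs/fence_sbd specifics below are assumptions,
so adjust them to your distribution and hardware:

  # /etc/sysconfig/sbd (path varies by distribution): shared disk plus
  # hardware watchdog, with pacemaker integration (SBD_PACEMAKER) enabled
  SBD_DEVICE="/dev/disk/by-id/shared-sbd-disk"   # assumed device path
  SBD_WATCHDOG_DEV="/dev/watchdog"
  SBD_WATCHDOG_TIMEOUT="5"
  SBD_PACEMAKER="yes"

  # poison-pill fencing resource on the same disk (assuming the fence_sbd
  # agent is installed; some distributions use external/sbd instead)
  pcs stonith create sbd-fencing fence_sbd \
      devices="/dev/disk/by-id/shared-sbd-disk"
  pcs property set stonith-enabled=true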

Klaus



Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-17 Thread Andrei Borzenkov
17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
> On Mon, 17 Aug 2020 10:19:45 -0500
> Ken Gaillot  wrote:
> 
>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
>>> Thanks to all your suggestions, I now have the systems with stonith
>>> configured on ipmi.  
>>
>> A word of caution: if the IPMI is on-board -- i.e. it shares the same
>> power supply as the computer -- power becomes a single point of
>> failure. If the node loses power, the other node can't fence because
>> the IPMI is also down, and the cluster can't recover.
>>
>> Some on-board IPMI controllers can share an Ethernet port with the main
>> computer, which would be a similar situation.
>>
>> It's best to have a backup fencing method when using IPMI as the
>> primary fencing method. An example would be an intelligent power switch
>> or sbd.
> 
> How would SBD be useful in this scenario? The poison pill will not be swallowed
> by the dead node... Is it just to wait for the watchdog timeout?
> 

A node is expected to commit suicide if SBD loses access to the shared block
device. So either the node swallowed the poison pill and died, or it died
because it realized it could not see the poison pill, or it was dead
already. After the watchdog timeout (twice the watchdog timeout, for safety)
we assume the node is dead.


Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-17 Thread Ken Gaillot
On Mon, 2020-08-17 at 22:39 +0200, Jehan-Guillaume de Rorthais wrote:
> On Mon, 17 Aug 2020 10:19:45 -0500
> Ken Gaillot  wrote:
> 
> > On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
> > > Thanks to all your suggestions, I now have the systems with
> > > stonith
> > > configured on ipmi.  
> > 
> > A word of caution: if the IPMI is on-board -- i.e. it shares the
> > same
> > power supply as the computer -- power becomes a single point of
> > failure. If the node loses power, the other node can't fence
> > because
> > the IPMI is also down, and the cluster can't recover.
> > 
> > Some on-board IPMI controllers can share an Ethernet port with the
> > main
> > computer, which would be a similar situation.
> > 
> > It's best to have a backup fencing method when using IPMI as the
> > primary fencing method. An example would be an intelligent power
> > switch
> > or sbd.
> 
> How would SBD be useful in this scenario? The poison pill will not be
> swallowed by the dead node... Is it just to wait for the watchdog timeout?

Right, I meant watchdog-only SBD. Although now that I think about it,
I'm not sure of the details of if/how that would work. Klaus Wenninger
might have some insight.
-- 
Ken Gaillot 



Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-17 Thread Jehan-Guillaume de Rorthais
On Mon, 17 Aug 2020 10:19:45 -0500
Ken Gaillot  wrote:

> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
> > Thanks to all your suggestions, I now have the systems with stonith
> > configured on ipmi.  
> 
> A word of caution: if the IPMI is on-board -- i.e. it shares the same
> power supply as the computer -- power becomes a single point of
> failure. If the node loses power, the other node can't fence because
> the IPMI is also down, and the cluster can't recover.
> 
> Some on-board IPMI controllers can share an Ethernet port with the main
> computer, which would be a similar situation.
> 
> It's best to have a backup fencing method when using IPMI as the
> primary fencing method. An example would be an intelligent power switch
> or sbd.

How would SBD be useful in this scenario? The poison pill will not be swallowed
by the dead node... Is it just to wait for the watchdog timeout?

Regards,


Re: [ClusterLabs] node utilization attributes are lost during upgrade

2020-08-17 Thread Ken Gaillot
On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik József wrote:
> Hello,
> 
> While upgrading a corosync/pacemaker/libvirt/KVM cluster from Debian stretch
> to buster, all the node utilization attributes were erased from the
> configuration. However, the same attributes were kept on the VirtualDomain
> resources. As a result, all resources with utilization attributes were
> stopped.

Ouch :(

There are two types of node attributes, transient and permanent.
Transient attributes last only until pacemaker is next stopped on the
node, while permanent attributes persist between reboots/restarts.

If you configured the utilization attributes with crm_attribute -z/
--utilization, they will default to permanent, but it's possible to
override that with -l/--lifetime reboot (or equivalently, -t/--type
status).
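
For example (node and attribute names are made up):

  # permanent utilization attribute (survives restarts) - the default for -z
  crm_attribute --node node1 --name cpu --update 8 --utilization

  # transient variant, lost when pacemaker next stops on that node
  crm_attribute --node node1 --name cpu --update 8 --utilization --lifetime reboot

  # check what is currently stored
  crm_attribute --node node1 --name cpu --query --utilization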

Permanent node attributes should definitely not be erased in an
upgrade.

> 
> The documentation says: "You can name utilization attributes
> according to 
> your preferences and define as many name/value pairs as your
> configuration 
> needs.", so one assumes utilization attributes are kept during
> upgrades, 
> for nodes and resources as well.
> 
> The corosync incompatibility made the upgrade more stressful anyway, and
> the stopping of the resources came out of the blue. The resources could
> not be started, of course, and there were no warning/error log messages
> saying that the resources were not started because the utilization
> constraints could not be satisfied. Pacemaker logs a lot (from an admin's
> point of view it is too much), but in this case there was no indication why
> the resources could not be started (or we were unable to find it in the
> logs?). So we wasted a lot of time debugging the VirtualDomain agent.
> 
> Currently we run the cluster with the placement-strategy set to
> default.
> 
> In my opinion node attributes should be kept and preserved during an
> upgrade. Also, it should be logged when a resource must be stopped/cannot
> be started because the utilization constraints cannot be satisfied.
> 
> Best regards,
> Jozsef
> --
> E-mail : kadlecsik.joz...@wigner.hu
> PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt
> Address: Wigner Research Centre for Physics
>  H-1525 Budapest 114, POB. 49, Hungary
-- 
Ken Gaillot 



[ClusterLabs] node utilization attributes are lost during upgrade

2020-08-17 Thread Kadlecsik József
Hello,

While upgrading a corosync/pacemaker/libvirt/KVM cluster from Debian stretch
to buster, all the node utilization attributes were erased from the
configuration. However, the same attributes were kept on the VirtualDomain
resources. As a result, all resources with utilization attributes were
stopped.

The documentation says: "You can name utilization attributes according to 
your preferences and define as many name/value pairs as your configuration 
needs.", so one assumes utilization attributes are kept during upgrades, 
for nodes and resources as well.
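
For anyone facing the same upgrade, a defensive snapshot of the CIB (or just
its nodes section, which carries the node utilization attributes) taken
beforehand would at least allow restoring them afterwards; a sketch, with the
file names made up:

  # before the upgrade: save the full CIB and the nodes section
  cibadmin -Q > cib-before-upgrade.xml
  cibadmin -Q -o nodes > nodes-before-upgrade.xml

  # after the upgrade, if the node utilization attributes are gone:
  cibadmin -R -o nodes -x nodes-before-upgrade.xml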

The corosync incompatibility made the upgrade more stressful anyway, and
the stopping of the resources came out of the blue. The resources could
not be started, of course, and there were no warning/error log messages
saying that the resources were not started because the utilization
constraints could not be satisfied. Pacemaker logs a lot (from an admin's
point of view it is too much), but in this case there was no indication why
the resources could not be started (or we were unable to find it in the
logs?). So we wasted a lot of time debugging the VirtualDomain agent.

Currently we run the cluster with the placement-strategy set to default.

In my opinion node attributes should be kept and preserved during an
upgrade. Also, it should be logged when a resource must be stopped/cannot
be started because the utilization constraints cannot be satisfied.

Best regards,
Jozsef
--
E-mail : kadlecsik.joz...@wigner.hu
PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics
 H-1525 Budapest 114, POB. 49, Hungary


[ClusterLabs] Beginner Question about VirtualDomain

2020-08-17 Thread Sameer Dhiman
Hi,

I am a beginner using pacemaker and corosync. I am trying to set up
a cluster of HA KVM guests as described by the Alteeve wiki (CentOS-6), but on
CentOS-8.2. My R&D setup is described below:

Physical Host running CentOS-8.2 with Nested Virtualization
2 x CentOS-8.2 guest machines as Cluster Node 1 and 2.
WinXP as a HA guest.

1. drbd --> dlm --> lvmlockd --> LVM-activate --> gfs2 (guest machine
definitions)
2. drbd --> dlm --> lvmlockd --> LVM-activate --> raw-lv (guest machine HDD)
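
(For readers: the ordering chain described above would look roughly like the
following in pacemaker; all resource names are invented, the clone layout is
assumed, and the DRBD promote ordering is simplified away for brevity.)

  pcs constraint order start drbd-clone then dlm-clone
  pcs constraint order start dlm-clone then lvmlockd-clone
  pcs constraint order start lvmlockd-clone then lvm-activate-clone
  pcs constraint order start lvm-activate-clone then gfs2-clone
  pcs constraint order start gfs2-clone then winxp-vm
  # keep the guest on a node where the filesystem is actually active
  pcs constraint colocation add winxp-vm with gfs2-clone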

Question(s):
1. How can I prevent guest startup until gfs2 and raw-lv are available? In
CentOS-6 Alteeve used autostart=0 in the  tag. Is there a similar
option in pacemaker? I did not find one in the documentation.

2. Suppose I configure an ordering constraint: gfs2 and raw-lv, then the guest
machine. Stopping the guest machine would also stop the complete service
tree, so how can I prevent this?

-- 
Sameer Dhiman


Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-17 Thread Ken Gaillot
On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
> Thanks to all your suggestions, I now have the systems with stonith
> configured on ipmi.

A word of caution: if the IPMI is on-board -- i.e. it shares the same
power supply as the computer -- power becomes a single point of
failure. If the node loses power, the other node can't fence because
the IPMI is also down, and the cluster can't recover.

Some on-board IPMI controllers can share an Ethernet port with the main
computer, which would be a similar situation.

It's best to have a backup fencing method when using IPMI as the
primary fencing method. An example would be an intelligent power switch
or sbd.
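
As a hedged illustration of such a layered setup (IPMI first, SBD as the
fallback); the agent parameters, addresses and credentials are placeholders,
and the sbd-fencing device is assumed to be configured separately:

  # per-node IPMI fencing as the first level
  pcs stonith create fence-node1 fence_ipmilan ip=10.0.0.1 username=admin \
      password=secret lanplus=1 pcmk_host_list=node1
  pcs stonith create fence-node2 fence_ipmilan ip=10.0.0.2 username=admin \
      password=secret lanplus=1 pcmk_host_list=node2

  # fencing topology: try IPMI first, fall back to SBD
  pcs stonith level add 1 node1 fence-node1
  pcs stonith level add 2 node1 sbd-fencing
  pcs stonith level add 1 node2 fence-node2
  pcs stonith level add 2 node2 sbd-fencing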

> Two questions:
> - how can I simulate a stonith situation to check that everything is
> ok?
> - considering that I have both nodes with stonith against the other
> node, once the two nodes can communicate, how can I be sure the two
> nodes will not try to stonith each other?
>  
> :)
> Thanks!
> Gabriele
> 
>  
>  
> Sonicle S.r.l. : http://www.sonicle.com
> Music: http://www.gabrielebulfon.com
> Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon
> 
> 
> 
> Da: Gabriele Bulfon 
> A: Cluster Labs - All topics related to open-source clustering
> welcomed 
> Data: 29 luglio 2020 14.22.42 CEST
> Oggetto: Re: [ClusterLabs] Antw: [EXT] Stonith failing
> 
> 
> >  
> > It is a ZFS based illumos system.
> > I don't think SBD is an option.
> > Is there a reliable ZFS based stonith?
> >  
> > Gabriele
> > 
> >  
> >  
> > Sonicle S.r.l. : http://www.sonicle.com
> > Music: http://www.gabrielebulfon.com
> > Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon
> > 
> > 
> > 
> > Da: Andrei Borzenkov 
> > A: Cluster Labs - All topics related to open-source clustering
> > welcomed 
> > Data: 29 luglio 2020 9.46.09 CEST
> > Oggetto: Re: [ClusterLabs] Antw: [EXT] Stonith failing
> > 
> > 
> > >  
> > > 
> > > On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon <
> > > gbul...@sonicle.com> wrote:
> > > > That one was taken from a specific implementation on Solaris
> > > > 11.
> > > > The situation is a dual node server with shared storage
> > > > controller: both nodes see the same disks concurrently.
> > > > Here we must be sure that the two nodes are not going to
> > > > import/mount the same zpool at the same time, or we will
> > > > encounter data corruption:
> > > > 
> > > 
> > >  
> > > ssh based "stonith" cannot guarantee it.
> > >  
> > > > node 1 will be preferred for pool 1, node 2 for pool 2; only in
> > > > case one of the nodes goes down or is taken offline should the
> > > > resources first be freed by the leaving node and taken over by
> > > > the other node.
> > > >  
> > > > Would you suggest one of the available stonith in this case?
> > > >  
> > > > 
> > > 
> > >  
> > > IPMI, managed PDU, SBD ...
> > > In practice, the only stonith method that works in case of
> > > complete node outage including any power supply is SBD.
-- 
Ken Gaillot 



Re: [ClusterLabs] why is node fenced ?

2020-08-17 Thread Ken Gaillot
On Fri, 2020-08-14 at 20:37 +0200, Lentes, Bernd wrote:
> - On Aug 9, 2020, at 10:17 PM, Bernd Lentes 
> bernd.len...@helmholtz-muenchen.de wrote:
> 
> 
> > > So this appears to be the problem. From these logs I would guess
> > > the
> > > successful stop on ha-idg-1 did not get written to the CIB for
> > > some
> > > reason. I'd look at the pe input from this transition on ha-idg-2 
> > > to
> > > confirm that.
> > > 
> > > Without the DC knowing about the stop, it tries to schedule a new
> > > one,
> > > but the node is shutting down so it can't do it, which means it
> > > has to
> > > be fenced.
> 
> I checked all relevant pe-files in this time period.
> This is what i found out (i just write the important entries):
> 
> ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-
> 3116 -G transition-3116.xml -D transition-3116.dot
> Current cluster status:
>  ...
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Started ha-idg-1
> Transition Summary:
>  ...
> * Migratevm_nextcloud   ( ha-idg-1 -> ha-idg-2 )
> Executing cluster transition:
>  * Resource action: vm_nextcloud migrate_from on ha-idg-2 <===
> migrate vm_nextcloud
>  * Resource action: vm_nextcloud stop on ha-idg-1
>  * Pseudo action:   vm_nextcloud_start_0
> Revised cluster status:
> Node ha-idg-1 (1084777482): standby
> Online: [ ha-idg-2 ]
> vm_nextcloud   (ocf::heartbeat:VirtualDomain): Started ha-idg-2
> 
> 
> ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-error-
> 48 -G transition-4514.xml -D transition-4514.dot
> Current cluster status:
> Node ha-idg-1 (1084777482): standby
> Online: [ ha-idg-2 ]
> ...
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): FAILED[ ha-idg-2 ha-idg-1 ] <== migration failed
> Transition Summary:
> ..
>  * Recovervm_nextcloud( ha-idg-2 )
> Executing cluster transition:
>  * Resource action: vm_nextcloud stop on ha-idg-2
>  * Resource action: vm_nextcloud stop on ha-idg-1
>  * Resource action: vm_nextcloud start on ha-idg-2
>  * Resource action: vm_nextcloud monitor=3 on ha-idg-2
> Revised cluster status:
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Started ha-idg-2
> 
> ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-
> 3117 -G transition-3117.xml -D transition-3117.dot
> Current cluster status:
> Node ha-idg-1 (1084777482): standby
> Online: [ ha-idg-2 ]
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): FAILED ha-idg-2
> <== start on ha-idg-2 failed
> Transition Summary:
>  * Stop   vm_nextcloud ( ha-idg-2 )   due to node
> availability <=== stop vm_nextcloud (what does "due to node
> availability" mean?)

"Due to node availability" means no node is allowed to run the
resource, so it has to be stopped.

> Executing cluster transition:
>  * Resource action: vm_nextcloud stop on ha-idg-2
> Revised cluster status:
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
> 
> ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-
> 3118 -G transition-4516.xml -D transition-4516.dot
> Current cluster status:
> Node ha-idg-1 (1084777482): standby
> Online: [ ha-idg-2 ]
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
> <== vm_nextcloud is stopped
> Transition Summary:
>  * Shutdown ha-idg-1
> Executing cluster transition:
>  * Resource action: vm_nextcloud stop on ha-idg-1 <=== why stop?
> It is already stopped

I'm not sure, I'd have to see the pe input.

> Revised cluster status:
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
> 
> ha-idg-1:~/why-fenced/ha-idg-2/pengine # crm_simulate -S -x pe-input-
> 3545 -G transition-0.xml -D transition-0.dot
> Current cluster status:
> Node ha-idg-1 (1084777482): pending
> Online: [ ha-idg-2 ]
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped <==
> vm_nextcloud is stopped
> Transition Summary:
> 
> Executing cluster transition:
> Using the original execution date of: 2020-07-20 15:05:33Z
> Revised cluster status:
> vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
> 
> ha-idg-1:~/why-fenced/ha-idg-2/pengine # crm_simulate -S -x pe-warn-
> 749 -G transition-1.xml -D transition-1.dot
> Current cluster status:
> Node ha-idg-1 (1084777482): OFFLINE (standby)
> Online: [ ha-idg-2 ]
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped <===
> vm_nextcloud is stopped
> Transition Summary:
>  * Fence (Off) ha-idg-1 'resource actions are unrunnable'
> Executing cluster transition:
>  * Fencing ha-idg-1 (Off)
>  * Pseudo action:   vm_nextcloud_stop_0 <=== why stop ? It is
> already stopped ?
> Revised cluster status:
> Node ha-idg-1 (1084777482): OFFLINE (standby)
> Online: [ ha-idg-2 ]
>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
> 
> I don't understand why the cluster tries to stop a resource which is
> already stopped.
> 
> Bernd
> Helmholtz Zentrum München
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)

Re: [ClusterLabs] why is node fenced ?

2020-08-17 Thread Ken Gaillot
On Fri, 2020-08-14 at 12:17 +0200, Lentes, Bernd wrote:
> 
> - On Aug 10, 2020, at 11:59 PM, kgaillot kgail...@redhat.com
> wrote:
> > The most recent transition is aborted, but since all its actions
> > are
> > complete, the only effect is to trigger a new transition.
> > 
> > We should probably rephrase the log message. In fact, the whole
> > "transition" terminology is kind of obscure. It's hard to come up
> > with
> > something better though.
> > 
> 
> Hi Ken,
> 
> I don't get it. How can something be aborted which is already completed?

I agree the wording is confusing :)

From the code's point of view, the actions in the transition are
complete, but the transition itself (as an abstract entity) remains
current until the next one starts. However that's academic and
meaningless from a user's point of view, so the log messages should be
reworded.

> Bernd
> Helmholtz Zentrum München
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin
> Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
-- 
Ken Gaillot 



[ClusterLabs] kronosnet v1.x series and future support / development

2020-08-17 Thread Fabio M. Di Nitto

All,

kronosnet (or knet for short) is the new underlying network protocol for
Linux HA components (corosync). It features the ability to use
multiple links between nodes, active/active and active/passive link
failover policies, automatic link recovery, FIPS-compliant encryption
(nss and/or openssl), automatic PMTUd and, in general, better performance
than the old network protocol.


After several weeks / months without any major bug reported, starting
with the v1.19 release, we are going to lock down the 1.x series to only two
kinds of changes:


* Bug fixes
* Onwire compatibility changes to allow rolling upgrades with the v2.x
  series (if necessary at all)

Upstream will continue to support v1.x for at least 12 months after v2.x
is released (date unknown; no, really, we haven't even started the
development).


If you have any amazing ideas on what v2.x should include, please file 
issues here:


https://github.com/kronosnet/kronosnet/issues

Or check the current TODO list here:

https://trello.com/kronosnet

Cheers,
The knet developer team


[ClusterLabs] kronosnet v1.19 released

2020-08-17 Thread Fabio M. Di Nitto

All,

We are pleased to announce the general availability of kronosnet v1.19

kronosnet (or knet for short) is the new underlying network protocol for
Linux HA components (corosync). It features the ability to use
multiple links between nodes, active/active and active/passive link
failover policies, automatic link recovery, FIPS-compliant encryption
(nss and/or openssl), automatic PMTUd and, in general, better performance
than the old network protocol.


Highlights in this release:

* Add native support for openssl 3.0 (drop API COMPAT macros).
* Code cleanup of public APIs. Lots of lines of code moved around, no
  functional changes.
* Removed kronosnetd unsupported code completely
* Removed unused poc-code from the source tree
* Make sure to initialize epoll events structures

Known issues in this release:

* None

The source tarballs can be downloaded here:

https://www.kronosnet.org/releases/

Upstream resources and contacts:

https://kronosnet.org/
https://github.com/kronosnet/kronosnet/
https://ci.kronosnet.org/
https://trello.com/kronosnet (TODO list and activities tracking)
https://goo.gl/9ZvkLS (google shared drive with presentations and diagrams)
IRC: #kronosnet on Freenode
https://lists.kronosnet.org/mailman/listinfo/users
https://lists.kronosnet.org/mailman/listinfo/devel
https://lists.kronosnet.org/mailman/listinfo/commits

Cheers,
The knet developer team



Re: [ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote

2020-08-17 Thread Klaus Wenninger
On 8/10/20 6:47 PM, Ken Gaillot wrote:
> Hi all,
>
> Looking ahead to the Pacemaker 2.0.5 release expected at the end of
> this year, here is a new feature already in the master branch.
>
> When configuring resource operations, Pacemaker lets you set an "on-
> fail" policy to specify whether to restart the resource, fence the
> node, etc., if the operation fails. With 2.0.5, a new possible value
> will be "demote", which will mean "demote this resource but do not
> fully restart it".
>
> "Demote" will be a valid value only for promote actions, and for
> recurring monitors with "role" set to "Master".
>
> Once the resource is demoted, it will be eligible for promotion again,
> so if the promotion scores have not changed, a promote on the same node
> may be attempted. If this is not desired, the agent can change the
> promotion scores either in the failed monitor or the demote.
>
> The intended use case is an application where a successful demote 
> assures a well-functioning service, and a full restart would be
> unnecessarily heavyweight. A large database might be an example.
>
> Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option
> to specify what happens to resources when quorum is lost (the default
> being to stop them). With 2.0.5, "demote" will be a possible value here
> as well, and will mean "demote all promotable resources and stop all
> other resources".
When using the new "no-quorum-policy" together with SBD please be
sure to use an SBD version that has
https://github.com/ClusterLabs/sbd/pull/111.
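
For illustration, a hedged sketch of what using the new options might look
like once 2.0.5 is available; the resource name and interval are invented:

  # demote (rather than fully restart) a promoted instance whose monitor fails
  pcs resource update big-database op monitor interval=10s role=Master on-fail=demote

  # on quorum loss, demote promotable resources and stop everything else
  pcs property set no-quorum-policy=demote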

Klaus
>
> The intended use case is an application that cannot cause any harm
> after being demoted, and may be useful in a demoted role even if there
> is no quorum. A database that operates read-only when demoted and
> doesn't depend on any non-promotable resources might be an example.
>
> Happy clustering :)



Re: [ClusterLabs] Clear Pending Fencing Action

2020-08-17 Thread Klaus Wenninger
On 8/3/20 7:04 AM, Reid Wahl wrote:
> Hi, Илья. `stonith_admin --cleanup` doesn't get rid of pending
> actions, only failed ones. You might be hitting
> https://bugs.clusterlabs.org/show_bug.cgi?id=5401.
>
> I believe a simultaneous reboot of both nodes will clear the pending
> actions. I don't recall whether there's any other way to clear them.
Even simultaneous rebooting might be something of a challenge.
When a node comes up it will request the history from the
running nodes, so a simultaneous reboot might not be
simultaneous enough to keep the nodes from passing this list
from one to another.
To be on the safe side you would have to shut down all nodes
and fire them up again.

If it is not the bug stated above and we are talking about a
pending fence action that was going on on a node that itself
just got fenced (and is still down), pacemaker coming up on
that node will remove (fail) the pending fence action.

Just for completeness:
'stonith_admin --cleanup' cleans everything that is 'just'
history (failed & successful), whereas whether an attempt is
still pending does have an effect on how fencing works.
As nobody would expect a history cleanup to influence
behavior, not touching pending actions is a safety measure
and not a bug.
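
For reference, the history-related calls discussed here look roughly like
this (the node wildcard works as shown; output details may vary by version):

  # show the fencing history (pending, failed and successful) for all nodes
  stonith_admin --history '*' --verbose

  # clean up completed (failed/successful) entries; pending ones are kept
  stonith_admin --cleanup --history '*'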

Klaus


>
> On Sun, Aug 2, 2020 at 8:26 PM Илья Насонов wrote:
>
> Hello!
>
>  
>
> After troubleshooting a 2-node cluster, outdated actions
> are displayed in crm_mon's “Pending Fencing Action:” list.
>
> How can I delete them?
>
> «stonith_admin --cleanup --history=*» does not delete them.
>
>  
>
>  
>
> Best regards,
> Илья Насонов
> el...@po-mayak.ru
>
>  
>
>
>
>
> -- 
> Regards,
>
> Reid Wahl, RHCA
> Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA
>



Re: [ClusterLabs] Alerts for qdevice/qnetd/booth

2020-08-17 Thread Jan Friesse

Thanks Honza. I have raised these on both upstream projects.


Thanks


I will leave it up to the implementer how best this can be done, considering
the technical limitations you mentioned.

https://github.com/corosync/corosync-qdevice/issues/13
https://github.com/ClusterLabs/booth/issues/99

Thanks,
Rohit
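
For context, the pacemaker-side alert mechanism referred to in this thread is
configured roughly like this; the sample agent path and the recipient value
are assumptions:

  # register one of the sample alert agents shipped with pacemaker
  pcs alert create path=/usr/share/pacemaker/alerts/alert_smtp.sh.sample id=smtp_alert
  pcs alert recipient add smtp_alert value=admin@example.com

What is being requested below are equivalent hooks for booth and
corosync-qdevice/qnetd, which sit outside the cluster that these alerts cover.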

On Thu, Aug 13, 2020 at 1:03 PM Jan Friesse  wrote:


Hi Rohit,


Hi Honza,
Thanks for your reply. Please find the attached image below:

[image: image.png]

Yes, I am talking about pacemaker alerts only.

Please find my suggestions/requirements below:

*Booth:*
1. The node5 booth-arbitrator should be able to emit an event when any of the
booth nodes joins or leaves. The booth IP can be passed in the event.


This is not how booth works. The ticket leader (so a site booth, never the
arbitrator) executes an election and gets replies from the other
sites/arbitrator. A follower executes an election when the leader hasn't
within the configured timeout.

What I want to say is that there is no "membership" as in (for
example) corosync.

The best we could get is a rough estimation based on election
requests/replies.


2. Event when booth-arbitrator is up successfully and has started
monitoring the booth nodes.


This is basically the start of the service. I think it's doable with a small
change in the unit file (something like
https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html).
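
A hedged sketch of that idea as a systemd drop-in; the unit name, the
notification template service and the script path are assumptions, not
something booth ships today:

  # /etc/systemd/system/booth-arbitrator.service.d/notify.conf
  # (after editing, run: systemctl daemon-reload)
  [Unit]
  # hypothetical notification template unit, as in the linked article
  OnFailure=notify-admin@booth-arbitrator.service

  [Service]
  # fire a (hypothetical) notification script once the arbitrator is up
  ExecStartPost=/usr/local/bin/notify-admin.sh "booth arbitrator started"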


2. A geo-site booth should be able to emit an event when its booth peers
join/leave. For example, geo site1 emits an event when the node5
booth-arbitrator joins/leaves, or when the site2 booth joins/leaves. The
booth IP can be passed in the event.
3. On ticket movements (revoke/grant), every booth node (site1/2 and node5)
should emit events.


That would be doable



Note: pacemaker alerts work in a cluster. Since the arbitrator is a
non-cluster node, I am not sure how exactly it will work there. But this is
a good-to-have feature.

*Qnetd/Qdevice:*
This is similar to the above.
1. The node5 qnetd should be able to raise an event when any of the cluster
nodes joins/leaves the quorum.


Doable


2. Event when qnetd is up successfully and has started monitoring the
cluster nodes


Qnetd itself is not monitoring qdevice nodes (it doesn't have a list of
nodes). It monitors a node's status after the node joins (= it would be
possible to trigger an event on leave). So that may be enough.


3. A cluster node should be able to emit an event when any of the quorum
nodes leaves/joins.


You mean qdevice should be able to trigger an event when connected to qnetd?



Seen at a high level, these are kinds of node/resource events with respect to
booth and qnetd/qdevice.


Yeah



As of today, w.r.t. booth/qnetd, I don't see any provision where any of the
nodes emits an event when its peer leaves/joins. This makes it difficult
to know whether the geo site nodes can see the booth-arbitrator or not. This is


Got it. That's exactly what would be really problematic to implement,
because there is no "membership" in booth. It would, however, be possible to
implement a message when a ticket is granted/rejected and include a list of
the other booths' replies and what their votes were.


true the other way around as well, where the booth-arbitrator cannot see the
geo booth sites.
I am not sure how others are doing it in today's deployments, but I see a
need for monitoring of every other booth/qnetd node, so that on the basis of
an event, appropriate alarms can be raised and action can be taken
accordingly.

Please let me know if you agree on the use cases. I'll raise
feature requests

I can agree on the use cases, but (especially with booth) there are technical
problems in realizing them.


on the pacemaker upstream project accordingly.


Please use booth (https://github.com/ClusterLabs/booth) and qdevice
(https://github.com/corosync/corosync-qdevice) upstream rather than
pacemaker, because these requests really have nothing to do with pcmk.

Regards,
honza



Thanks,
Rohit

On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse  wrote:


Hi Rohit,

Rohit Saini napsal(a):

Hi Team,

Question-1:
Similar to pcs alerts, do we have something similar for qdevice/qnetd? This

You mean pacemaker alerts, right?

is to detect asynchronously if any of the members is unreachable/joined/left
and whether that member is qdevice or qnetd.


Nope, but it actually shouldn't be that hard to implement. What exactly
would you like to see there?



Question-2:
Same question as above for booth nodes and the arbitrator. Is there any way
to receive events from the booth daemon?


Not directly (again, shouldn't be that hard to implement). But pacemaker
alerts should be triggered when a service changes state because of a ticket
grant/reject, shouldn't they?



My main objective is to see if these daemons give events related to
their internal state transitions and to raise some alarms accordingly. For
example: the boothd arbitrator is unreachable, a ticket moved from x to y,
etc.


I don't think "boothd arbitrator is unreachable" alert is really doable.
Ticket moved from x to y would be probably two alerts - 1. ticket
rejected on X and 2. granted on Y.

Would you m

Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-17 Thread Klaus Wenninger
On 8/17/20 9:19 AM, Andrei Borzenkov wrote:
> 17.08.2020 10:06, Klaus Wenninger wrote:
 Alternatively, you can set up corosync-qdevice, using a separate system
 running qnetd server as a quorum arbitrator.

>>> Any solution that is based on node suicide is prone to complete cluster
>>> loss. In particular, in a two-node cluster with qdevice the surviving node
>>> will commit suicide if qnetd is not accessible.
>> I don't think that what Reid suggested was going for nodes
>> that lose quorum to commit suicide right away.
>> You can use quorum simply as a means of preventing fence-races
>> otherwise inherent to 2-node-clusters.
> Can you please show a configuration example of how to do it? Sorry, but I
> do not understand how it is possible.
Simply don't set the 2-node-flag. So just one of the nodes will have
quorum and just one of them will attempt fencing.
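
A hedged sketch of the corosync.conf quorum section for that kind of setup,
with a qdevice arbitrator and without the two_node flag (the qnetd address is
a placeholder):

  # corosync.conf (quorum section only)
  quorum {
      provider: corosync_votequorum
      # no "two_node: 1" here, so at most one partition keeps quorum
      device {
          model: net
          net {
              host: 10.0.0.100    # qnetd arbitrator, placeholder address
              algorithm: ffsplit
          }
      }
  }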
>
>>> As long as external stonith is reasonably reliable it is much preferred
>>> to any solution based on quorum (unless you have very specific
>>> requirements and can tolerate running remaining nodes in "frozen" mode
>>> to limit unavailability).
>> Well, we can name the predominant scenario in which one might not want to
>> depend on fencing devices like IPMI: if you want to cover a scenario where
>> the nodes don't just lose corosync connectivity, but access from one node
>> to the fencing device of the other is interrupted as well, you probably
>> won't get around an approach that involves some kind of arbitrator.
> Sure. Which is why I said "reasonably reliable". Still even in this case
> one must understand all pros and cons to decide which risk is more
> important to mitigate.
>
Exactly! Which is why I tried to give some flesh to the idea to foster
this kind of understanding. You always have to be aware of the failure
scenarios you want to cover and what it might cost you elsewhere
to cover one specific scenario.

Klaus



Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-17 Thread Andrei Borzenkov
17.08.2020 10:06, Klaus Wenninger wrote:
>>
>>> Alternatively, you can set up corosync-qdevice, using a separate system
>>> running qnetd server as a quorum arbitrator.
>>>
>> Any solution that is based on node suicide is prone to complete cluster
>> loss. In particular, in a two-node cluster with qdevice the surviving node
>> will commit suicide if qnetd is not accessible.
> I don't think that what Reid suggested was going for nodes
> that lose quorum to commit suicide right away.
> You can use quorum simply as a means of preventing fence-races
> otherwise inherent to 2-node-clusters.

Can you please show a configuration example of how to do it? Sorry, but I
do not understand how it is possible.

>>
>> As long as external stonith is reasonably reliable it is much preferred
>> to any solution based on quorum (unless you have very specific
>> requirements and can tolerate running remaining nodes in "frozen" mode
>> to limit unavailability).
> Well, we can name the predominant scenario in which one might not want to
> depend on fencing devices like IPMI: if you want to cover a scenario where
> the nodes don't just lose corosync connectivity, but access from one node
> to the fencing device of the other is interrupted as well, you probably
> won't get around an approach that involves some kind of arbitrator.

Sure. Which is why I said "reasonably reliable". Still even in this case
one must understand all pros and cons to decide which risk is more
important to mitigate.



Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-17 Thread Klaus Wenninger
On 8/16/20 11:40 AM, Andrei Borzenkov wrote:
> 16.08.2020 04:25, Reid Wahl wrote:
>>
>>> - considering that I have both nodes with stonith against the other node,
>>> once the two nodes can communicate, how can I be sure the two nodes will
>>> not try to stonith each other?
>>>
>> The simplest option is to add a delay attribute (e.g., delay=10) to one of
>> the stonith devices. That way, if both nodes want to fence each other, the
>> node whose stonith device has a delay configured will wait for the delay to
>> expire before executing the reboot action.
If your fence agent supports a delay attribute you can of course use
that. As this isn't available with every fence agent, or looks
different depending on the fence agent, we've introduced
pcmk_delay_max & pcmk_delay_base. These are applied prior
to actually calling the fence agent and thus are always available and
always look the same. The delay will be some random time
between pcmk_delay_base and pcmk_delay_max.
This brings us to another approach for reducing the chances
of a fatal fence race: assuming that the reason the fence race
is triggered is detected at around the same time on both nodes, just
adding a random delay will very likely prevent them from killing each other.
This is especially interesting when there is no clear / easy way
to determine which of the nodes is more important at the time.
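
A hedged example of how that could look; the agent, address, credentials and
delay values are placeholders:

  # static base delay plus a random component, applied before the agent runs
  pcs stonith create fence-node1 fence_ipmilan ip=10.0.0.1 username=admin \
      password=secret pcmk_host_list=node1 pcmk_delay_base=5 pcmk_delay_max=10

  # with pacemaker >= 2.0.4, the property mentioned in the quote just below
  # can instead favor the node running the more important resources
  pcs property set priority-fencing-delay=15s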
>>
> Current pacemaker (2.0.4) also supports priority-fencing-delay option
> that computes delay based on which resources are active on specific
> node, so favoring node with "more important" resources.
>
>> Alternatively, you can set up corosync-qdevice, using a separate system
>> running qnetd server as a quorum arbitrator.
>>
> Any solution that is based on node suicide is prone to complete cluster
> loss. In particular, in a two-node cluster with qdevice the surviving node
> will commit suicide if qnetd is not accessible.
I don't think that what Reid suggested was going for nodes
that lose quorum to commit suicide right away.
You can use quorum simply as a means of preventing fence-races
otherwise inherent to 2-node-clusters.
>
> As long as external stonith is reasonably reliable it is much preferred
> to any solution based on quorum (unless you have very specific
> requirements and can tolerate running remaining nodes in "frozen" mode
> to limit unavailability).
Well, we can name the predominant scenario in which one might not want to
depend on fencing devices like IPMI: if you want to cover a scenario where
the nodes don't just lose corosync connectivity, but access from one node
to the fencing device of the other is interrupted as well, you probably
won't get around an approach that involves some kind of arbitrator.
>
> And before someone jumps in - SBD falls into "solution based on suicide"
> as well.
Got your point without that hint ;-)

Klaus
