Re: [ClusterLabs] Beginner Question about VirtualDomain

2020-08-18 Thread Digimer
On 2020-08-17 8:40 a.m., Sameer Dhiman wrote:
> Hi,
> 
> I am a beginner using pacemaker and corosync. I am trying to set up
> a cluster of HA KVM guests as described by Alteeve wiki (CentOS-6) but
> in CentOS-8.2. My R&D setup is described below
> 
> Physical Host running CentOS-8.2 with Nested Virtualization
> 2 x CentOS-8.2 guest machines as Cluster Node 1 and 2.
> WinXP as a HA guest.
> 
> 1. drbd --> dlm --> lvmlockd --> LVM-activate --> gfs2 (guest machine
> definitions)
> 2. drbd --> dlm --> lvmlockd --> LVM-activate --> raw-lv (guest machine HDD)
> 
> Question(s):
> 1. How to prevent guest startup until gfs2 and raw-lv are available? In
> CentOS-6, Alteeve used autostart=0 in the <vm> tag. Is there any similar
> option in pacemaker? I did not find one in the documentation.
> 
> 2. Suppose I configure an ordering constraint: gfs2 and raw-lv, then the guest
> machine. Stopping the guest machine would also stop the complete service
> tree, so how can I prevent this?
> 
> -- 
> Sameer Dhiman

Hi Sameer,

  I'm the author of that wiki. It's quite out of date, as you noted, and
we're actively developing a new release for EL8. Though, it won't be
ready until near the end of the year.

  There are a few changes we've made that you might want to consider:

1. We were never too happy with DLM, so we've reworked things to no
longer need it. We now use normal LVM backing DRBD resources: one
resource per VM, one volume per virtual disk, each backed by an LV. Our
tools will automate this, but you can easily create them manually if
your environment is fairly stable (a rough sketch of one such resource
file follows after point 3).

2. To get around GFS2, we create
/mnt/shared/{provision,definitions,files,archive} directories (note the
move from /shared to /mnt/shared, to be more LFS friendly). We'll again
automate management of files in Striker, but you can copy the files
manually and rsync out changes as needed (again, if your environment
doesn't change much).

3. We changed DRBD from v8.4 to 9.0, and this meant a few things had to
change. We will integrate support for short-throw DR hosts (an async
"third node" in DRBD that is outside pacemaker). We run the resources to
allow only a single primary normally, and enable auto-promote. For live
migration, we temporarily enable dual-primary, promote the target,
migrate, demote the old host, and disable dual-primary again. This makes
it safer, as it's far less likely that someone could accidentally start
a VM on the passive node (not that it ever happened, as our tools
prevented it, but it was _possible_, so we wanted to improve that).
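
For illustration, a rough sketch of what one such per-VM DRBD resource
file (point 1) might look like; all names, devices and addresses below
are placeholders rather than our actual layout:

  # one DRBD resource per VM; the backing LV holds that VM's virtual disk
  resource srv01-winxp {
      device      /dev/drbd0;
      disk        /dev/vm_vg/srv01-winxp_disk0;
      meta-disk   internal;
      on node1 {
          node-id  0;
          address  10.10.10.1:7788;
      }
      on node2 {
          node-id  1;
          address  10.10.10.2:7788;
      }
  }

And a rough outline of the live-migration sequence from point 3, using
drbdadm and virsh (again just a sketch; our tools drive this for us):

  # temporarily allow dual-primary on this resource
  drbdadm net-options --allow-two-primaries=yes srv01-winxp

  # on the migration target: promote (with auto-promote enabled, opening
  # the device during the migration effectively does this)
  drbdadm primary srv01-winxp

  # on the source host: live-migrate the domain
  virsh migrate --live srv01-winxp qemu+ssh://node2/system

  # on the old host: demote, then drop back to single-primary
  drbdadm secondary srv01-winxp
  drbdadm net-options --allow-two-primaries=no srv01-winxp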

To handle #3, we've written our own custom RA (ocf:alteeve:server
[1]). This RA is smart enough to watch/wait for things to become ready
before starting. It also handles the DRBD steps I mentioned, and the
virsh call to do the migration, so the pacemaker config is extremely
simple. Note, though, that it depends on the rest of our tools, so it
won't work outside the Anvil!. That said, if you wanted to use it before
we release Anvil! M3, you could probably adapt it easily enough.

If you have any questions, please let me know and I'll help as best I can.

Cheers,

digimer

(Note: during development, this code base is kept outside of
Clusterlabs. We'll move it in when it reaches beta).
1. https://github.com/digimer/anvil/blob/master/ocf/alteeve/server

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] node utilization attributes are lost during upgrade

2020-08-18 Thread Strahil Nikolov
Won't it be easier to:
- set the node to standby
- stop the node
- remove the node
- add it again with the new hostname

Best Regards,
Strahil Nikolov

On 18 August 2020 at 17:15:49 GMT+03:00, Ken Gaillot
wrote:
>On Tue, 2020-08-18 at 14:35 +0200, Kadlecsik József wrote:
>> Hi,
>> 
>> On Mon, 17 Aug 2020, Ken Gaillot wrote:
>> 
>> > On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik József wrote:
>> > > 
>> > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from
>> > > Debian 
>> > > stretch to buster, all the node utilization attributes were
>> > > erased 
>> > > from the configuration. However, the same attributes were kept at
>> > > the 
>> > > VirtualDomain resources. This resulted that all resources with 
>> > > utilization attributes were stopped.
>> > 
>> > Ouch :(
>> > 
>> > There are two types of node attributes, transient and permanent. 
>> > Transient attributes last only until pacemaker is next stopped on
>> > the 
>> > node, while permanent attributes persist between reboots/restarts.
>> > 
>> > If you configured the utilization attributes with crm_attribute
>> > -z/ 
>> > --utilization, it will default to permanent, but it's possible to 
>> > override that with -l/--lifetime reboot (or equivalently, -t/
>> > --type 
>> > status).
>> 
>> The attributes were defined by "crm configure edit", simply stating:
>> 
>> node 1084762113: atlas0 \
>> utilization hv_memory=192 cpu=32 \
>> attributes standby=off
>> ...
>> node 1084762119: atlas6 \
>> utilization hv_memory=192 cpu=32 \
>> 
>> But I believe now that corosync caused the problem, because the nodes
>> had 
>> been renumbered:
>
>Ah yes, that would do it. Pacemaker would consider them different nodes
>with the same names. The "other" node's attributes would not apply to
>the "new" node.
>
>The upgrade procedure would be similar except that you would start
>corosync by itself after each upgrade. After all nodes were upgraded,
>you would modify the CIB on one node (while pacemaker is not running)
>with:
>
>CIB_file=/var/lib/pacemaker/cib/cib.xml cibadmin --modify --scope=nodes
>-X '...'
>
>where '...' is a <node> XML entry from the CIB with the "id" value
>changed to the new ID, and repeat that for each node. Then, start
>pacemaker on that node and wait for it to come up, then start pacemaker
>on the other nodes.
>
>> 
>> node 3232245761: atlas0
>> ...
>> node 3232245767: atlas6
>> 
>> The upgrade process was:
>> 
>> for each node do
>> set the "hold" mark on the corosync package
>> put the node standby
>> wait for the resources to be migrated off
>> upgrade from stretch to buster
>> reboot
>> put the node online
>> wait for the resources to be migrated (back)
>> done
>> 
>> Up to this point all resources were running fine.
>> 
>> In order to upgrade corosync, we followed the next steps:
>> 
>> enable maintenance mode
>> stop pacemaker and corosync on all nodes
>> for each node do
>> delete the hold mark and upgrade corosync
>> install new config file (nodeid not specified)
>> restart corosync, start pacemaker
>> done
>> 
>> We could see that all resources were running unmanaged. When
>> disabling the 
>> maintenance mode, then those were stopped.
>> 
>> So I think corosync renumbered the nodes and I suspect the reason for
>> that 
>> was that "clear_node_high_bit: yes" was not specified in the new
>> config 
>> file. It means it was an admin error then.
>> 
>> Best regards,
>> Jozsef
>> --
>> E-mail : kadlecsik.joz...@wigner.hu
>> PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt
>> Address: Wigner Research Centre for Physics
>>  H-1525 Budapest 114, POB. 49, Hungary
>-- 
>Ken Gaillot 
>
>___
>Manage your subscription:
>https://lists.clusterlabs.org/mailman/listinfo/users
>
>ClusterLabs home: https://www.clusterlabs.org/
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] node utilization attributes are lost during upgrade

2020-08-18 Thread Ken Gaillot
On Tue, 2020-08-18 at 22:45 +0300, Strahil Nikolov wrote:
> Won't it be easier to:
> - set the node to standby
> - stop the node
> - remove the node
> - add it again with the new hostname

The hostname stays the same, but corosync is changing the numeric node
ID as part of the upgrade. If they remove the node, they'll lose its
utilization attributes, which is what they want to keep.

Looking at it again, I'm guessing there are no explicit node IDs in
corosync.conf, and corosync is choosing the IDs. In that case the
easiest approach would be to explicitly set the original node IDs in
corosync.conf before the upgrade, so they don't change.
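
For example, roughly (a sketch only, reusing the old IDs from their
config; the ring0_addr values are placeholders for however the nodes
are actually addressed):

  nodelist {
      node {
          ring0_addr: atlas0
          # keep the pre-upgrade ID so pacemaker sees the same node
          nodeid: 1084762113
      }
      ...
      node {
          ring0_addr: atlas6
          nodeid: 1084762119
      }
  }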

> 
> Best Regards,
> Strahil Nikolov
> 
> On 18 August 2020 at 17:15:49 GMT+03:00, Ken Gaillot <
> kgail...@redhat.com> wrote:
> > On Tue, 2020-08-18 at 14:35 +0200, Kadlecsik József wrote:
> > > Hi,
> > > 
> > > On Mon, 17 Aug 2020, Ken Gaillot wrote:
> > > 
> > > > On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik József wrote:
> > > > > 
> > > > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from
> > > > > Debian 
> > > > > stretch to buster, all the node utilization attributes were
> > > > > erased 
> > > > > from the configuration. However, the same attributes were
> > > > > kept at
> > > > > the 
> > > > > VirtualDomain resources. This resulted that all resources
> > > > > with 
> > > > > utilization attributes were stopped.
> > > > 
> > > > Ouch :(
> > > > 
> > > > There are two types of node attributes, transient and
> > > > permanent. 
> > > > Transient attributes last only until pacemaker is next stopped
> > > > on
> > > > the 
> > > > node, while permanent attributes persist between
> > > > reboots/restarts.
> > > > 
> > > > If you configured the utilization attributes with crm_attribute
> > > > -z/ 
> > > > --utilization, it will default to permanent, but it's possible
> > > > to 
> > > > override that with -l/--lifetime reboot (or equivalently, -t/
> > > > --type 
> > > > status).
> > > 
> > > The attributes were defined by "crm configure edit", simply
> > > stating:
> > > 
> > > node 1084762113: atlas0 \
> > > utilization hv_memory=192 cpu=32 \
> > > attributes standby=off
> > > ...
> > > node 1084762119: atlas6 \
> > > utilization hv_memory=192 cpu=32 \
> > > 
> > > But I believe now that corosync caused the problem, because the
> > > nodes
> > > had 
> > > been renumbered:
> > 
> > Ah yes, that would do it. Pacemaker would consider them different
> > nodes
> > with the same names. The "other" node's attributes would not apply
> > to
> > the "new" node.
> > 
> > The upgrade procedure would be similar except that you would start
> > corosync by itself after each upgrade. After all nodes were
> > upgraded,
> > you would modify the CIB on one node (while pacemaker is not
> > running)
> > with:
> > 
> > CIB_file=/var/lib/pacemaker/cib/cib.xml cibadmin --modify --
> > scope=nodes
> > -X '...'
> > 
> > where '...' is a <node> XML entry from the CIB with the "id" value
> > changed to the new ID, and repeat that for each node. Then, start
> > pacemaker on that node and wait for it to come up, then start
> > pacemaker
> > on the other nodes.
> > 
> > > 
> > > node 3232245761: atlas0
> > > ...
> > > node 3232245767: atlas6
> > > 
> > > The upgrade process was:
> > > 
> > > for each node do
> > > set the "hold" mark on the corosync package
> > > put the node standby
> > > wait for the resources to be migrated off
> > > upgrade from stretch to buster
> > > reboot
> > > put the node online
> > > wait for the resources to be migrated (back)
> > > done
> > > 
> > > Up to this point all resources were running fine.
> > > 
> > > In order to upgrade corosync, we followed the next steps:
> > > 
> > > enable maintenance mode
> > > stop pacemaker and corosync on all nodes
> > > for each node do
> > > delete the hold mark and upgrade corosync
> > > install new config file (nodeid not specified)
> > > restart corosync, start pacemaker
> > > done
> > > 
> > > We could see that all resources were running unmanaged. When
> > > disabling the 
> > > maintenance mode, then those were stopped.
> > > 
> > > So I think corosync renumbered the nodes and I suspect the reason
> > > for
> > > that 
> > > was that "clear_node_high_bit: yes" was not specified in the new
> > > config 
> > > file. It means it was an admin error then.
> > > 
> > > Best regards,
> > > Jozsef
> > > --
> > > E-mail : kadlecsik.joz...@wigner.hu
> > > PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt
> > > Address: Wigner Research Centre for Physics
> > >  H-1525 Budapest 114, POB. 49, Hungary
> > 
> > -- 
> > Ken Gaillot 
> > 
> > ___
> > Manage your subscription:
> > https://lists.clusterlabs.org/mailman/listinfo/users
> > 
> > ClusterLabs home: https://www.clusterlabs.org/
> 
> 
-- 
Ken Gaillot 

___
Manage your subscription:

Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-18 Thread Klaus Wenninger
On 8/18/20 9:07 PM, Andrei Borzenkov wrote:
> 18.08.2020 17:02, Ken Gaillot пишет:
>> On Tue, 2020-08-18 at 08:21 +0200, Klaus Wenninger wrote:
>>> On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
 On 17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
> On Mon, 17 Aug 2020 10:19:45 -0500
> Ken Gaillot  wrote:
>
>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
>>> Thanks to all your suggestions, I now have the systems with
>>> stonith
>>> configured on ipmi.  
>> A word of caution: if the IPMI is on-board -- i.e. it shares
>> the same
>> power supply as the computer -- power becomes a single point of
>> failure. If the node loses power, the other node can't fence
>> because
>> the IPMI is also down, and the cluster can't recover.
>>
>> Some on-board IPMI controllers can share an Ethernet port with
>> the main
>> computer, which would be a similar situation.
>>
>> It's best to have a backup fencing method when using IPMI as
>> the
>> primary fencing method. An example would be an intelligent
>> power switch
>> or sbd.
> How SBD would be useful in this scenario? Poison pill will not be
> swallowed by
> the dead node... Is it just to wait for the watchdog timeout?
>
 Node is expected to commit suicide if SBD lost access to shared
 block
 device. So either node swallowed poison pill and died or node died
 because it realized it was impossible to see poison pill or node
 was
 dead already. After watchdog timeout (twice watchdog timeout for
 safety)
 we assume node is dead.
>>> Yes, like this a suicide via watchdog will be triggered if there are
>>> issues with the disk. This is why it is important to have a reliable
>>> watchdog with SBD even when using poison pill. As this alone would
>>> make a single shared disk a SPOF, running with pacemaker integration
>>> (default) a node with SBD will survive despite losing the disk
>>> when it has quorum and pacemaker looks healthy. As corosync-quorum
>>> in 2-node-mode obviously won't be fit for this purpose SBD will
>>> switch
>>> to checking for presence of both nodes if 2-node-flag is set.
>>>
>>> Sorry for the lengthy explanation but the full picture is required
>>> to understand why it is sufficiently reliable and useful if configured
>>> correctly.
>>>
>>> Klaus
>> What I'm not sure about is how watchdog-only sbd would behave as a
>> fail-back method for a regular fence device. Will the cluster wait for
>> the sbd timeout no matter what, or only if the regular fencing fails,
>> or ...?
>>
> Diskless SBD implicitly creates fencing device ("watchdog"), timeout
> starts only when this device is selected for fencing. This device
> appears to be completely invisible to normal stonith_admin operation, I
> do not know how to query for it. In my testing explicit stonith resource
> was always called first and only if it failed was "watchdog" self
> fencing attempted. I tried to set negative priority for CIB stonith
> resource but it did not change anything.
>
This matches what I remember from going through the code ...
it is treated as if it had the lowest priority, but it is not used at
all if a fencing topology is defined ... which probably should be
overhauled ...
If you're interested, there is a branch on my pacemaker clone that makes
the watchdog device visible, intended for watchdog-fencing just certain
nodes.


Klaus

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-18 Thread Andrei Borzenkov
On 18.08.2020 17:02, Ken Gaillot wrote:
> On Tue, 2020-08-18 at 08:21 +0200, Klaus Wenninger wrote:
>> On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
>>> On 17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
 On Mon, 17 Aug 2020 10:19:45 -0500
 Ken Gaillot  wrote:

> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
>> Thanks to all your suggestions, I now have the systems with
>> stonith
>> configured on ipmi.  
>
> A word of caution: if the IPMI is on-board -- i.e. it shares
> the same
> power supply as the computer -- power becomes a single point of
> failure. If the node loses power, the other node can't fence
> because
> the IPMI is also down, and the cluster can't recover.
>
> Some on-board IPMI controllers can share an Ethernet port with
> the main
> computer, which would be a similar situation.
>
> It's best to have a backup fencing method when using IPMI as
> the
> primary fencing method. An example would be an intelligent
> power switch
> or sbd.

 How SBD would be useful in this scenario? Poison pill will not be
 swallowed by
 the dead node... Is it just to wait for the watchdog timeout?

>>>
>>> Node is expected to commit suicide if SBD lost access to shared
>>> block
>>> device. So either node swallowed poison pill and died or node died
>>> because it realized it was impossible to see poison pill or node
>>> was
>>> dead already. After watchdog timeout (twice watchdog timeout for
>>> safety)
>>> we assume node is dead.
>>
>> Yes, like this a suicide via watchdog will be triggered if there are
>> issues with the disk. This is why it is important to have a reliable
>> watchdog with SBD even when using poison pill. As this alone would
>> make a single shared disk a SPOF, running with pacemaker integration
>> (default) a node with SBD will survive despite losing the disk
>> when it has quorum and pacemaker looks healthy. As corosync-quorum
>> in 2-node-mode obviously won't be fit for this purpose SBD will
>> switch
>> to checking for presence of both nodes if 2-node-flag is set.
>>
>> Sorry for the lengthy explanation but the full picture is required
>> to understand why it is sufficiently reliable and useful if configured
>> correctly.
>>
>> Klaus
> 
> What I'm not sure about is how watchdog-only sbd would behave as a
> fail-back method for a regular fence device. Will the cluster wait for
> the sbd timeout no matter what, or only if the regular fencing fails,
> or ...?
> 

Diskless SBD implicitly creates fencing device ("watchdog"), timeout
starts only when this device is selected for fencing. This device
appears to be completely invisible to normal stonith_admin operation, I
do not know how to query for it. In my testing explicit stonith resource
was always called first and only if it failed was "watchdog" self
fencing attempted. I tried to set negative priority for CIB stonith
resource but it did not change anything.
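
(For context, watchdog-only / "diskless" SBD is what you get when sbd
runs with no SBD_DEVICE configured and the cluster property below is
set; the timeout value here is only an illustration:)

  # pacemaker-side switch for watchdog-only ("diskless") SBD
  crm configure property stonith-watchdog-timeout=10s
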
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] why is node fenced ?

2020-08-18 Thread Ken Gaillot
On Tue, 2020-08-18 at 16:47 +0200, Lentes, Bernd wrote:
> 
> - On Aug 17, 2020, at 5:09 PM, kgaillot kgail...@redhat.com
> wrote:
> 
> 
> > > I checked all relevant pe-files in this time period.
> > > This is what i found out (i just write the important entries):
> 
>  
> > > Executing cluster transition:
> > >  * Resource action: vm_nextcloud stop on ha-idg-2
> > > Revised cluster status:
> > >  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
> > > 
> > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-
> > > input-
> > > 3118 -G transition-4516.xml -D transition-4516.dot
> > > Current cluster status:
> > > Node ha-idg-1 (1084777482): standby
> > > Online: [ ha-idg-2 ]
> > >  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
> > > <== vm_nextcloud is stopped
> > > Transition Summary:
> > >  * Shutdown ha-idg-1
> > > Executing cluster transition:
> > >  * Resource action: vm_nextcloud stop on ha-idg-1 < why
> > > stop ?
> > > It is already stopped
> > 
> > I'm not sure, I'd have to see the pe input.
> 
> You find it here: 
> https://hmgubox2.helmholtz-muenchen.de/index.php/s/WJGtodMZ9k7rN29

This appears to be a scheduler bug.

The scheduler considers a migration to be "dangling" if it has a record
of a failed migrate_to on the source node, but no migrate_from on the
target node (and no migrate_from or start on the source node, which
would indicate a later full restart or reverse migration).

In this case, any migrate_from on the target has since been superseded
by a failed start and a successful stop, so there is no longer a record
of it. Therefore the migration is considered dangling, which requires a
full stop on the source node.

However in this case we already have a successful stop on the source
node after the failed migrate_to, and I believe that should be
sufficient to consider it no longer dangling.

> > >  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped <===
> > > vm_nextcloud is stopped
> > > Transition Summary:
> > >  * Fence (Off) ha-idg-1 'resource actions are unrunnable'
> > > Executing cluster transition:
> > >  * Fencing ha-idg-1 (Off)
> > >  * Pseudo action:   vm_nextcloud_stop_0 <=== why stop ? It is
> > > already stopped ?
> > > Revised cluster status:
> > > Node ha-idg-1 (1084777482): OFFLINE (standby)
> > > Online: [ ha-idg-2 ]
> > >  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
> > > 
> > > I don't understand why the cluster tries to stop a resource which
> > > is
> > > already stopped.
> 
> Bernd
> Helmholtz Zentrum München
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling
> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin
> Guenther
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] why is node fenced ?

2020-08-18 Thread Lentes, Bernd


- On Aug 17, 2020, at 5:09 PM, kgaillot kgail...@redhat.com wrote:


>> I checked all relevant pe-files in this time period.
>> This is what i found out (i just write the important entries):
 
>> Executing cluster transition:
>>  * Resource action: vm_nextcloud stop on ha-idg-2
>> Revised cluster status:
>>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
>> 
>> ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-
>> 3118 -G transition-4516.xml -D transition-4516.dot
>> Current cluster status:
>> Node ha-idg-1 (1084777482): standby
>> Online: [ ha-idg-2 ]
>>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
>> <== vm_nextcloud is stopped
>> Transition Summary:
>>  * Shutdown ha-idg-1
>> Executing cluster transition:
>>  * Resource action: vm_nextcloud stop on ha-idg-1 < why stop ?
>> It is already stopped
> 
> I'm not sure, I'd have to see the pe input.

You find it here: 
https://hmgubox2.helmholtz-muenchen.de/index.php/s/WJGtodMZ9k7rN29

>>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped <===
>> vm_nextcloud is stopped
>> Transition Summary:
>>  * Fence (Off) ha-idg-1 'resource actions are unrunnable'
>> Executing cluster transition:
>>  * Fencing ha-idg-1 (Off)
>>  * Pseudo action:   vm_nextcloud_stop_0 <=== why stop ? It is
>> already stopped ?
>> Revised cluster status:
>> Node ha-idg-1 (1084777482): OFFLINE (standby)
>> Online: [ ha-idg-2 ]
>>  vm_nextcloud   (ocf::heartbeat:VirtualDomain): Stopped
>> 
>> I don't understand why the cluster tries to stop a resource which is
>> already stopped.

Bernd
Helmholtz Zentrum München

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] node utilization attributes are lost during upgrade

2020-08-18 Thread Ken Gaillot
On Tue, 2020-08-18 at 14:35 +0200, Kadlecsik József wrote:
> Hi,
> 
> On Mon, 17 Aug 2020, Ken Gaillot wrote:
> 
> > On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik József wrote:
> > > 
> > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from
> > > Debian 
> > > stretch to buster, all the node utilization attributes were
> > > erased 
> > > from the configuration. However, the same attributes were kept at
> > > the 
> > > VirtualDomain resources. This resulted that all resources with 
> > > utilization attributes were stopped.
> > 
> > Ouch :(
> > 
> > There are two types of node attributes, transient and permanent. 
> > Transient attributes last only until pacemaker is next stopped on
> > the 
> > node, while permanent attributes persist between reboots/restarts.
> > 
> > If you configured the utilization attributes with crm_attribute
> > -z/ 
> > --utilization, it will default to permanent, but it's possible to 
> > override that with -l/--lifetime reboot (or equivalently, -t/
> > --type 
> > status).
> 
> The attributes were defined by "crm configure edit", simply stating:
> 
> node 1084762113: atlas0 \
> utilization hv_memory=192 cpu=32 \
> attributes standby=off
> ...
> node 1084762119: atlas6 \
> utilization hv_memory=192 cpu=32 \
> 
> But I believe now that corosync caused the problem, because the nodes
> had 
> been renumbered:

Ah yes, that would do it. Pacemaker would consider them different nodes
with the same names. The "other" node's attributes would not apply to
the "new" node.

The upgrade procedure would be similar except that you would start
corosync by itself after each upgrade. After all nodes were upgraded,
you would modify the CIB on one node (while pacemaker is not running)
with:

  CIB_file=/var/lib/pacemaker/cib/cib.xml cibadmin --modify --scope=nodes -X 
'...'

where '...' is a <node> XML entry from the CIB with the "id" value
changed to the new ID, and repeat that for each node. Then, start
pacemaker on that node and wait for it to come up, then start pacemaker
on the other nodes.
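
A hypothetical invocation for atlas0, whose new corosync ID is
3232245761 (the real <node> entry would also carry its utilization and
instance attributes), might look roughly like:

  CIB_file=/var/lib/pacemaker/cib/cib.xml cibadmin --modify --scope=nodes \
    -X '<node id="3232245761" uname="atlas0"/>'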

> 
> node 3232245761: atlas0
> ...
> node 3232245767: atlas6
> 
> The upgrade process was:
> 
> for each node do
> set the "hold" mark on the corosync package
> put the node standby
> wait for the resources to be migrated off
> upgrade from stretch to buster
> reboot
> put the node online
> wait for the resources to be migrated (back)
> done
> 
> Up to this point all resources were running fine.
> 
> In order to upgrade corosync, we followed the next steps:
> 
> enable maintenance mode
> stop pacemaker and corosync on all nodes
> for each node do
> delete the hold mark and upgrade corosync
> install new config file (nodeid not specified)
> restart corosync, start pacemaker
> done
> 
> We could see that all resources were running unmanaged. When
> disabling the 
> maintenance mode, then those were stopped.
> 
> So I think corosync renumbered the nodes and I suspect the reason for
> that 
> was that "clear_node_high_bit: yes" was not specified in the new
> config 
> file. It means it was an admin error then.
> 
> Best regards,
> Jozsef
> --
> E-mail : kadlecsik.joz...@wigner.hu
> PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt
> Address: Wigner Research Centre for Physics
>  H-1525 Budapest 114, POB. 49, Hungary
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-18 Thread Ken Gaillot
On Tue, 2020-08-18 at 08:21 +0200, Klaus Wenninger wrote:
> On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
> > On 17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
> > > On Mon, 17 Aug 2020 10:19:45 -0500
> > > Ken Gaillot  wrote:
> > > 
> > > > On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
> > > > > Thanks to all your suggestions, I now have the systems with
> > > > > stonith
> > > > > configured on ipmi.  
> > > > 
> > > > A word of caution: if the IPMI is on-board -- i.e. it shares
> > > > the same
> > > > power supply as the computer -- power becomes a single point of
> > > > failure. If the node loses power, the other node can't fence
> > > > because
> > > > the IPMI is also down, and the cluster can't recover.
> > > > 
> > > > Some on-board IPMI controllers can share an Ethernet port with
> > > > the main
> > > > computer, which would be a similar situation.
> > > > 
> > > > It's best to have a backup fencing method when using IPMI as
> > > > the
> > > > primary fencing method. An example would be an intelligent
> > > > power switch
> > > > or sbd.
> > > 
> > > How SBD would be useful in this scenario? Poison pill will not be
> > > swallowed by
> > > the dead node... Is it just to wait for the watchdog timeout?
> > > 
> > 
> > Node is expected to commit suicide if SBD lost access to shared
> > block
> > device. So either node swallowed poison pill and died or node died
> > because it realized it was impossible to see poison pill or node
> > was
> > dead already. After watchdog timeout (twice watchdog timeout for
> > safety)
> > we assume node is dead.
> 
> Yes, like this a suicide via watchdog will be triggered if there are
> issues with the disk. This is why it is important to have a reliable
> watchdog with SBD even when using poison pill. As this alone would
> make a single shared disk a SPOF, running with pacemaker integration
> (default) a node with SBD will survive despite losing the disk
> when it has quorum and pacemaker looks healthy. As corosync-quorum
> in 2-node-mode obviously won't be fit for this purpose SBD will
> switch
> to checking for presence of both nodes if 2-node-flag is set.
> 
> Sorry for the lengthy explanation but the full picture is required
> to understand why it is sufficiently reliable and useful if configured
> correctly.
> 
> Klaus

What I'm not sure about is how watchdog-only sbd would behave as a
fail-back method for a regular fence device. Will the cluster wait for
the sbd timeout no matter what, or only if the regular fencing fails,
or ...?
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] node utilization attributes are lost during upgrade

2020-08-18 Thread Kadlecsik József
Hi,

On Mon, 17 Aug 2020, Ken Gaillot wrote:

> On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik József wrote:
> > 
> > At upgrading a corosync/pacemaker/libvirt/KVM cluster from Debian 
> > stretch to buster, all the node utilization attributes were erased 
> > from the configuration. However, the same attributes were kept at the 
> > VirtualDomain resources. This resulted that all resources with 
> > utilization attributes were stopped.
> 
> Ouch :(
> 
> There are two types of node attributes, transient and permanent. 
> Transient attributes last only until pacemaker is next stopped on the 
> node, while permanent attributes persist between reboots/restarts.
> 
> If you configured the utilization attributes with crm_attribute -z/ 
> --utilization, it will default to permanent, but it's possible to 
> override that with -l/--lifetime reboot (or equivalently, -t/--type 
> status).

The attributes were defined by "crm configure edit", simply stating:

node 1084762113: atlas0 \
utilization hv_memory=192 cpu=32 \
attributes standby=off
...
node 1084762119: atlas6 \
utilization hv_memory=192 cpu=32 \

But I believe now that corosync caused the problem, because the nodes had 
been renumbered:

node 3232245761: atlas0
...
node 3232245767: atlas6

The upgrade process was:

for each node do
set the "hold" mark on the corosync package
put the node standby
wait for the resources to be migrated off
upgrade from stretch to buster
reboot
put the node online
wait for the resources to be migrated (back)
done

Up to this point all resources were running fine.

In order to upgrade corosync, we followed the next steps:

enable maintenance mode
stop pacemaker and corosync on all nodes
for each node do
delete the hold mark and upgrade corosync
install new config file (nodeid not specified)
restart corosync, start pacemaker
done

We could see that all resources were running unmanaged. When disabling the 
maintenance mode, then those were stopped.

So I think corosync renumbered the nodes and I suspect the reason for that 
was that "clear_node_high_bit: yes" was not specified in the new config 
file. It means it was an admin error then.
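
(For reference, a minimal sketch of the option in question; the rest of
the totem section is omitted:)

  totem {
      version: 2
      # keep auto-generated node IDs in the old (high-bit-cleared) range
      clear_node_high_bit: yes
  }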

Best regards,
Jozsef
--
E-mail : kadlecsik.joz...@wigner.hu
PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics
 H-1525 Budapest 114, POB. 49, Hungary
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-18 Thread Jehan-Guillaume de Rorthais
On Tue, 18 Aug 2020 08:21:50 +0200
Klaus Wenninger  wrote:

> On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
> > On 17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
> >> On Mon, 17 Aug 2020 10:19:45 -0500
> >> Ken Gaillot  wrote:
> >>  
> >>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:  
>  Thanks to all your suggestions, I now have the systems with stonith
>  configured on ipmi.
> >>> A word of caution: if the IPMI is on-board -- i.e. it shares the same
> >>> power supply as the computer -- power becomes a single point of
> >>> failure. If the node loses power, the other node can't fence because
> >>> the IPMI is also down, and the cluster can't recover.
> >>>
> >>> Some on-board IPMI controllers can share an Ethernet port with the main
> >>> computer, which would be a similar situation.
> >>>
> >>> It's best to have a backup fencing method when using IPMI as the
> >>> primary fencing method. An example would be an intelligent power switch
> >>> or sbd.  
> >> How SBD would be useful in this scenario? Poison pill will not be
> >> swallowed by the dead node... Is it just to wait for the watchdog timeout?
> >>  
> > Node is expected to commit suicide if SBD lost access to shared block
> > device. So either node swallowed poison pill and died or node died
> > because it realized it was impossible to see poison pill or node was
> > dead already. After watchdog timeout (twice watchdog timeout for safety)
> > we assume node is dead.  
> Yes, like this a suicide via watchdog will be triggered if there are
> issues with the disk. This is why it is important to have a reliable
> watchdog with SBD even when using poison pill. As this alone would
> make a single shared disk a SPOF, running with pacemaker integration
> (default) a node with SBD will survive despite losing the disk
> when it has quorum and pacemaker looks healthy. As corosync-quorum
> in 2-node-mode obviously won't be fit for this purpose SBD will switch
> to checking for presence of both nodes if 2-node-flag is set.
> 
> Sorry for the lengthy explanation but the full picture is required
> to understand why it is sufficiently reliable and useful if configured
> correctly.

Thank you Andrei and Klaus for the explanation.

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Stonith failing

2020-08-18 Thread Andrei Borzenkov
On 18.08.2020 10:35, Ulrich Windl wrote:
 Andrei Borzenkov wrote on 18.08.2020 at 09:24 in
> message <83aba38d-c9ea-1dff-e53b-14a9e0623...@gmail.com>:
>> On 18.08.2020 10:10, Ulrich Windl wrote:
>> Ken Gaillot wrote on 17.08.2020 at 17:19 in
>>> message
>>> <73d6ecf113098a3154a2e7db2e2a59557272024a.ca...@redhat.com>:
 On Fri, 2020‑08‑14 at 15:09 +0200, Gabriele Bulfon wrote:
> Thanks to all your suggestions, I now have the systems with stonith
> configured on ipmi.

 A word of caution: if the IPMI is on‑board ‑‑ i.e. it shares the same
 power supply as the computer ‑‑ power becomes a single point of
 failure. If the node loses power, the other node can't fence because
 the IPMI is also down, and the cluster can't recover.
>>>
>>> This may not always be true: We had servers with three(!) power supplies
> and 
>> a
>>> BMC (what today is called "lights-out management"). You could "power down" 
>> the
>>> server, while the BMC was still operational (and thus could "power up" the
>>> server again).
>>> With standard PC architecture these days things seem to be a bit more
>>> complicated (meaning "primitive")...
>>>
>>
>> BMC is powered by standby voltage. If AC input to all of your power
>> supplies is cut off, there is no standby voltage anymore. Just try to
>> unplug all power cables and see if BMC is still accessible.
> 
> Of course! What I tried to point out is: With a proper BMC, you DON'T need to
> cut off the server power.
> 

You seem to completely misunderstand the problem: if external power is
cut off, it is impossible to stonith the node via IPMI because the BMC
is not accessible.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Stonith failing

2020-08-18 Thread Ulrich Windl
>>> Andrei Borzenkov wrote on 18.08.2020 at 09:24 in
message <83aba38d-c9ea-1dff-e53b-14a9e0623...@gmail.com>:
> On 18.08.2020 10:10, Ulrich Windl wrote:
> Ken Gaillot wrote on 17.08.2020 at 17:19 in
>> message
>> <73d6ecf113098a3154a2e7db2e2a59557272024a.ca...@redhat.com>:
>>> On Fri, 2020‑08‑14 at 15:09 +0200, Gabriele Bulfon wrote:
 Thanks to all your suggestions, I now have the systems with stonith
 configured on ipmi.
>>>
>>> A word of caution: if the IPMI is on‑board ‑‑ i.e. it shares the same
>>> power supply as the computer ‑‑ power becomes a single point of
>>> failure. If the node loses power, the other node can't fence because
>>> the IPMI is also down, and the cluster can't recover.
>> 
>> This may not always be true: We had servers with three(!) power supplies
and 
> a
>> BMC (what today is called "lights-out management"). You could "power down" 
> the
>> server, while the BMC was still operational (and thus could "power up" the
>> server again).
>> With standard PC architecture these days things seem to be a bit more
>> complicated (meaning "primitive")...
>> 
> 
> BMC is powered by standby voltage. If AC input to all of your power
> supplies is cut off, there is no standby voltage anymore. Just try to
> unplug all power cables and see if BMC is still accessible.

Of course! What I tried to point out is: With a proper BMC, you DON'T need to
cut off the server power.

Regards,
Ulrich

> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: Re: Antw: [EXT] Stonith failing

2020-08-18 Thread Andrei Borzenkov
On 18.08.2020 10:10, Ulrich Windl wrote:
 Ken Gaillot wrote on 17.08.2020 at 17:19 in
> message
> <73d6ecf113098a3154a2e7db2e2a59557272024a.ca...@redhat.com>:
>> On Fri, 2020‑08‑14 at 15:09 +0200, Gabriele Bulfon wrote:
>>> Thanks to all your suggestions, I now have the systems with stonith
>>> configured on ipmi.
>>
>> A word of caution: if the IPMI is on‑board ‑‑ i.e. it shares the same
>> power supply as the computer ‑‑ power becomes a single point of
>> failure. If the node loses power, the other node can't fence because
>> the IPMI is also down, and the cluster can't recover.
> 
> This may not always be true: We had servers with three(!) power supplies and a
> BMC (what today is called "lights-out management"). You could "power down" the
> server, while the BMC was still operational (and thus could "power up" the
> server again).
> With standard PC architecture these days things seem to be a bit more
> complicated (meaning "primitive")...
> 

BMC is powered by standby voltage. If AC input to all of your power
supplies is cut off, there is no standby voltage anymore. Just try to
unplug all power cables and see if BMC is still accessible.
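
(For instance, with a hypothetical BMC address and credentials,
something like this shows whether the BMC still answers:)

  ipmitool -I lanplus -H 192.0.2.10 -U admin -P secret chassis power status
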
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Antw: [EXT] node utilization attributes are lost during upgrade

2020-08-18 Thread Ulrich Windl
>>> Kadlecsik József wrote on 17.08.2020 at
12:12 in
message :
> Hello,
> 
> At upgrading a corosync/pacemaker/libvirt/KVM cluster from Debian stretch 
> to buster, all the node utilization attributes were erased from the 
> configuration. However, the same attributes were kept at the VirtualDomain 
> resources. This resulted that all resources with utilization attributes 
> were stopped.
> 
> The documentation says: "You can name utilization attributes according to 
> your preferences and define as many name/value pairs as your configuration 
> needs.", so one assumes utilization attributes are kept during upgrades, 
> for nodes and resources as well.

Now that you mention it, I think we had it in the past with SLES, too.

> 
> The corosync incompatibility made the upgrade more stressful anyway and 
> the stopping of the resources came out of the blue. The resources could 
> not be started of course ‑ and there were no log warning/error messages 
> that the resources are not started because the utilization constrains 
> could not be satisfied. Pacemaker logs a lot (from admin point of view it 
> is too much), but in this case there was no indication why the resources 
> could not be started (or we were unable to find it in the logs?). So we 
> wasted a lot of time with debugging the VirtualDomain agent.

Also true: It's not very obvious when resources are not started due to
utilization constraints.

> 
> Currently we run the cluster with the placement‑strategy set to default.
> 
> In my opinion node attributes should be kept and preserved during an 
> upgrade. Also, it should be logged when a resource must be stopped/cannot 
> be started because the utilization constrains cannot be satisfied.

+1

> 
> Best regards,
> Jozsef
> ‑‑
> E‑mail : kadlecsik.joz...@wigner.hu 
> PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt 
> Address: Wigner Research Centre for Physics
>  H‑1525 Budapest 114, POB. 49, Hungary
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Antw: Re: Antw: [EXT] Stonith failing

2020-08-18 Thread Ulrich Windl
>>> Ken Gaillot wrote on 17.08.2020 at 17:19 in
message
<73d6ecf113098a3154a2e7db2e2a59557272024a.ca...@redhat.com>:
> On Fri, 2020‑08‑14 at 15:09 +0200, Gabriele Bulfon wrote:
>> Thanks to all your suggestions, I now have the systems with stonith
>> configured on ipmi.
> 
> A word of caution: if the IPMI is on‑board ‑‑ i.e. it shares the same
> power supply as the computer ‑‑ power becomes a single point of
> failure. If the node loses power, the other node can't fence because
> the IPMI is also down, and the cluster can't recover.

This may not always be true: We had servers with three(!) power supplies and a
BMC (what today is called "lights-out management"). You could "power down" the
server, while the BMC was still operational (and thus could "power up" the
server again).
With standard PC architecture these days things seem to be a bit more
complicated (meaning "primitive")...

> 
> Some on‑board IPMI controllers can share an Ethernet port with the main
> computer, which would be a similar situation.
> 
> It's best to have a backup fencing method when using IPMI as the
> primary fencing method. An example would be an intelligent power switch
> or sbd.
> 
>> Two questions:
>> ‑ how can I simulate a stonith situation to check that everything is
>> ok?
>> ‑ considering that I have both nodes with stonith against the other
>> node, once the two nodes can communicate, how can I be sure the two
>> nodes will not try to stonith each other?
>>  
>> :)
>> Thanks!
>> Gabriele
>> 
>>  
>>  
>> Sonicle S.r.l. : http://www.sonicle.com 
>> Music: http://www.gabrielebulfon.com 
>> Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon 
>> 
>> 
>> 
>> From: Gabriele Bulfon 
>> To: Cluster Labs - All topics related to open-source clustering
>> welcomed 
>> Date: 29 July 2020 14.22.42 CEST
>> Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
>> 
>> 
>> >  
>> > It is a ZFS based illumos system.
>> > I don't think SBD is an option.
>> > Is there a reliable ZFS based stonith?
>> >  
>> > Gabriele
>> > 
>> >  
>> >  
>> > Sonicle S.r.l. : http://www.sonicle.com 
>> > Music: http://www.gabrielebulfon.com 
>> > Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon 
>> > 
>> > 
>> > 
>> > From: Andrei Borzenkov 
>> > To: Cluster Labs - All topics related to open-source clustering
>> > welcomed 
>> > Date: 29 July 2020 9.46.09 CEST
>> > Subject: Re: [ClusterLabs] Antw: [EXT] Stonith failing
>> > 
>> > 
>> > >  
>> > > 
>> > > On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon <
>> > > gbul...@sonicle.com> wrote:
>> > > > That one was taken from a specific implementation on Solaris
>> > > > 11.
>> > > > The situation is a dual node server with shared storage
>> > > > controller: both nodes see the same disks concurrently.
>> > > > Here we must be sure that the two nodes are not going to
>> > > > import/mount the same zpool at the same time, or we will
>> > > > encounter data corruption:
>> > > > 
>> > > 
>> > >  
>> > > ssh based "stonith" cannot guarantee it.
>> > >  
>> > > > node 1 will be perferred for pool 1, node 2 for pool 2, only in
>> > > > case one of the node goes down or is taken offline the
>> > > > resources should be first free by the leaving node and taken by
>> > > > the other node.
>> > > >  
>> > > > Would you suggest one of the available stonith in this case?
>> > > >  
>> > > > 
>> > > 
>> > >  
>> > > IPMI, managed PDU, SBD ...
>> > > In practice, the only stonith method that works in case of
>> > > complete node outage including any power supply is SBD.
> ‑‑ 
> Ken Gaillot 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Antw: [EXT] Re: why is node fenced ?

2020-08-18 Thread Ulrich Windl
>>> Ken Gaillot wrote on 17.08.2020 at 16:54 in
message
<426bbd8063885706b0e8fdd3dab81be7e0c9f25d.ca...@redhat.com>:
> On Fri, 2020-08-14 at 12:17 +0200, Lentes, Bernd wrote:
>> 
>> - On Aug 10, 2020, at 11:59 PM, kgaillot kgail...@redhat.com 
>> wrote:
>> > The most recent transition is aborted, but since all its actions
>> > are
>> > complete, the only effect is to trigger a new transition.
>> > 
>> > We should probably rephrase the log message. In fact, the whole
>> > "transition" terminology is kind of obscure. It's hard to come up
>> > with
>> > something better though.
>> > 
>> 
>> Hi Ken,
>> 
>> i don't get it. How can s.th. be aborted which is already completed ?
> 
> I agree the wording is confusing :)
> 
> From the code's point of view, the actions in the transition are
> complete, but the transition itself (as an abstract entity) remains
> current until the next one starts. However that's academic and

Hi!

So when the "transition" had completed, it became a "state" ;-)
Aborting a "state" then seems to start a "transition", in my view.

> meaningless from a user's point of view, so the log messages should be
> reworded.

Another thing that always confused me in the past was the concept of a
"synapse"...
Good wording in error messages is a valuable resource!

Regards,
Ulrich

> 
>> Bernd
>> Helmholtz Zentrum München
>> 
>> Helmholtz Zentrum Muenchen
>> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
>> Ingolstaedter Landstr. 1
>> 85764 Neuherberg
>> www.helmholtz-muenchen.de 
>> Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling
>> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin
>> Guenther
>> Registergericht: Amtsgericht Muenchen HRB 6466
>> USt-IdNr: DE 129521671
> -- 
> Ken Gaillot 
> 
> ___
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Stonith failing

2020-08-18 Thread Klaus Wenninger
On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
> On 17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
>> On Mon, 17 Aug 2020 10:19:45 -0500
>> Ken Gaillot  wrote:
>>
>>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
 Thanks to all your suggestions, I now have the systems with stonith
 configured on ipmi.  
>>> A word of caution: if the IPMI is on-board -- i.e. it shares the same
>>> power supply as the computer -- power becomes a single point of
>>> failure. If the node loses power, the other node can't fence because
>>> the IPMI is also down, and the cluster can't recover.
>>>
>>> Some on-board IPMI controllers can share an Ethernet port with the main
>>> computer, which would be a similar situation.
>>>
>>> It's best to have a backup fencing method when using IPMI as the
>>> primary fencing method. An example would be an intelligent power switch
>>> or sbd.
>> How SBD would be useful in this scenario? Poison pill will not be swallowed 
>> by
>> the dead node... Is it just to wait for the watchdog timeout?
>>
> Node is expected to commit suicide if SBD lost access to shared block
> device. So either node swallowed poison pill and died or node died
> because it realized it was impossible to see poison pill or node was
> dead already. After watchdog timeout (twice watchdog timeout for safety)
> we assume node is dead.
Yes, like this a suicide via watchdog will be triggered if there are
issues with the disk. This is why it is important to have a reliable
watchdog with SBD even when using poison pill. As this alone would
make a single shared disk a SPOF, running with pacemaker integration
(default) a node with SBD will survive despite losing the disk
when it has quorum and pacemaker looks healthy. As corosync-quorum
in 2-node-mode obviously won't be fit for this purpose SBD will switch
to checking for presence of both nodes if 2-node-flag is set.
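
(For reference, a minimal sketch of an /etc/sysconfig/sbd for poison
pill plus watchdog with pacemaker integration; the device path is a
placeholder, not a recommendation:)

  # /etc/sysconfig/sbd
  SBD_DEVICE=/dev/disk/by-id/scsi-SHARED_DISK-part1
  SBD_WATCHDOG_DEV=/dev/watchdog
  SBD_WATCHDOG_TIMEOUT=5
  SBD_PACEMAKER=yes       # the pacemaker integration described above
  SBD_STARTMODE=always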

Sorry for the lengthy explanation but the full picture is required
to understand why it is sufficiently reliable and useful if configured
correctly.

Klaus

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/