Re: [ClusterLabs] I'm doing something stupid probably but...

2021-10-12 Thread kevin martin
Sigh... never mind. node2 was in standby (not sure how that happened). "pcs
node unstandby node2" cleared it, and now it's working.
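
For anyone hitting the same thing, a rough sketch of how to spot and clear a
standby node (assuming pcs 0.10 syntax; output details may differ):

pcs status nodes            # standby nodes are listed separately from "Online:"
pcs node attribute          # standby also shows up as a node attribute
pcs node unstandby node2    # clear it so resources may run there again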


---


Regards,

Kevin Martin


On Tue, Oct 12, 2021 at 3:43 PM kevin martin  wrote:

> Ok, so I'm doing more wrong than I thought.  I did a "pcs cluster stop
> node1" on the main node expecting it would roll over the virtual IP to
> node2, no joy.  So "graceful" failover doesn't work either.  The actual
> message is:  (pcmk__native_allocate)   info: Resource virtual_ip cannot run
> anywhere
>
> ---
>
>
> Regards,
>
> Kevin Martin
>
>
> On Tue, Oct 12, 2021 at 3:32 PM kevin martin  wrote:
>
>> I'm trying to replace a 2-node cluster running on rhel6 with a 2-node
>> cluster on el8, using the version of pacemaker/corosync/pcsd that's in the
>> repos (pacemaker 1.1.20, pcs 0.9, and corosync 2.4.3 on el6; pacemaker 2.0.5,
>> pcs 0.10, and corosync 3.1 on el8), and I must be doing something wrong.  When
>> I shut down the main node of the cluster (like with a reboot after patching)
>> I expect the virtual IP to move to the 2nd node, however I'm not seeing that.
>> I'm seeing a message in the pacemaker log that says the virtual IP cannot run
>> anywhere.  I'm not sure what I'm supposed to configure to allow that to
>> happen (again, this is with a reboot of the main node so it's an ungraceful
>> failover). Any help is appreciated.
>>
>>
>> ---
>>
>>
>> Regards,
>>
>> Kevin Martin
>>
>


Re: [ClusterLabs] I'm doing something stupid probably but...

2021-10-12 Thread kevin martin
Ok, so I'm doing more wrong than I thought.  I did a "pcs cluster stop
node1" on the main node expecting it would roll over the virtual IP to
node2, no joy.  So "graceful" failover doesn't work either.  The actual
message is:  (pcmk__native_allocate)   info: Resource virtual_ip cannot run
anywhere
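
A rough checklist for chasing "Resource X cannot run anywhere" messages like
this one (a sketch, assuming pcs 0.10 / Pacemaker 2.0 command syntax):

pcs status --full                    # node states (standby/maintenance), failed actions
pcs constraint --full                # location constraints/bans, e.g. left over from "pcs resource move"
crm_mon -1rf                         # one-shot status including per-resource fail counts
crm_simulate -sL | grep virtual_ip   # scheduler scores for placing the resource
pcs resource clear virtual_ip        # drop move/ban constraints, if any
pcs resource cleanup virtual_ip      # reset fail counts once the cause is fixed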

---


Regards,

Kevin Martin


On Tue, Oct 12, 2021 at 3:32 PM kevin martin  wrote:

> I'm trying to replace a 2-node cluster running on rhel6 with a 2-node
> cluster on el8, using the version of pacemaker/corosync/pcsd that's in the
> repos (pacemaker 1.1.20, pcs 0.9, and corosync 2.4.3 on el6; pacemaker 2.0.5,
> pcs 0.10, and corosync 3.1 on el8), and I must be doing something wrong.  When
> I shut down the main node of the cluster (like with a reboot after patching)
> I expect the virtual IP to move to the 2nd node, however I'm not seeing that.
> I'm seeing a message in the pacemaker log that says the virtual IP cannot run
> anywhere.  I'm not sure what I'm supposed to configure to allow that to
> happen (again, this is with a reboot of the main node so it's an ungraceful
> failover). Any help is appreciated.
>
>
> ---
>
>
> Regards,
>
> Kevin Martin
>


[ClusterLabs] I'm doing something stupid probably but...

2021-10-12 Thread kevin martin
I'm trying to replace a 2-node cluster running on rhel6 with a 2-node
cluster on el8, using the version of pacemaker/corosync/pcsd that's in the
repos (pacemaker 1.1.20, pcs 0.9, and corosync 2.4.3 on el6; pacemaker 2.0.5,
pcs 0.10, and corosync 3.1 on el8), and I must be doing something wrong.  When
I shut down the main node of the cluster (like with a reboot after patching)
I expect the virtual IP to move to the 2nd node, however I'm not seeing that.
I'm seeing a message in the pacemaker log that says the virtual IP cannot run
anywhere.  I'm not sure what I'm supposed to configure to allow that to
happen (again, this is with a reboot of the main node so it's an ungraceful
failover). Any help is appreciated.
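
For context, a sketch of the kind of setup described above (pcs 0.10 syntax;
the address below is a documentation placeholder, not the real VIP):

pcs resource create virtual_ip ocf:heartbeat:IPaddr2 \
    ip=192.0.2.10 cidr_netmask=24 op monitor interval=30s
pcs resource defaults resource-stickiness=100   # optional: keep the VIP where it is
pcs quorum config                               # for two nodes, expect "two_node: 1" so the
                                                # survivor keeps quorum when its peer goes down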


---


Regards,

Kevin Martin


Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Coming in Pacemaker 2.1.2: new fencing configuration options

2021-10-12 Thread Ken Gaillot
On Tue, 2021-10-12 at 20:48 +0300, Andrei Borzenkov wrote:
> On 12.10.2021 09:27, Ulrich Windl wrote:
> > > > > Andrei Borzenkov  wrote on 11.10.2021
> > > > > at 11:43 in
> > message
>  > >:
> > > On Mon, Oct 11, 2021 at 9:29 AM Ulrich Windl
> > >  wrote:
> > > 
> > > > > > Also how long would such a delay be: Long enough until the
> > > > > > other node
> > > > > > is
> > > > > > fenced, or long enough until the other node was fenced,
> > > > > > booted
> > > > > > (assuming it
> > > > > > does) and is running pacemaker?
> > > > > 
> > > > > The delay should be on the less‑preferred node, long enough
> > > > > for that
> > > > > node to get fenced. The other node, with no delay, will fence
> > > > > it if it
> > > > > can. If the other node is for whatever reason unable to
> > > > > fence, the node
> > > > > with the delay will fence it after the delay.
> > > > 
> > > > So the "fence intention" will be lost when the node is being
> > > > fenced?
> > > > Otherwise the surviving node would have to clean up the "fence
> > > > intention".
> > > > Or does it mean the "fence intention" does not make it to the
> > > > CIB and
> > stays
> > > > local on the node?
> > > > 
> > > 
> > > Two nodes cannot communicate with each other so the surviving
> > > node is
> > > not aware of anything the fenced node did or intended to do. When
> > > the
> > 
> > I thought (local) CIB writes do not need a quorum.
> > 
> > > fenced node reboots and pacemaker starts it should pull CIB from
> > > the
> > > surviving node, so whatever intentions the fenced node had before
> > > reboot should be lost at this point.
> > 
> > If the surviving node has a newer CIB (as per the
> > modification/configuration
> > count) than the fenced node, that is true; but if the fenced node has a
> > newer CIB,
> > the surviving node would pull the "other" CIB, right?
> 
> Indeed. I honestly did not expect it.
> 
> I am not sure what consequences it has in practice though. It is
> certainly one more argument against running without mandatory stonith
> because in this case both nodes happily continue and it is
> unpredictable
> which one will win after they rejoin.
> 
> Assuming we do run with mandatory stonith, then we have a relatively
> small
> window before the DC is killed (because only the DC can update the CIB). But
> I am not sure whether CIB changes will be committed locally before all
> nodes
> are either confirmed to be offline or have acknowledged the CIB changes. I
> guess
> only Ken can answer it :)

In general each node maintains its own copy of the CIB (writing
locally), and only changes (diffs) are passed between nodes. Checksums
are used to make sure the content remains functionally the same on all
nodes.

However full CIB replacements can be done, whether by user request (pcs
generally uses this for config changes, btw) or when the CIB gets out
of sync on the nodes.

When a node joins an existing cluster (like a fenced node rejoining),
the CIB versions will be compared, and the newest one wins (actually
more like the one with the most changes).

Generally, the existing cluster had more activity after the node was
fenced, and the fenced node has little to no activity before it rejoins
the cluster, so it works out well. However I have seen scripts that
start the cluster on a node and immediately set some node attributes
or whatnot, causing the fenced node to look "newer" when it rejoins.
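
A quick way to look at the fields that comparison uses is to dump the CIB
header (a sketch; the attribute values shown are illustrative):

# admin_epoch, epoch and num_updates are compared in that order
cibadmin --query | head -n 1
# e.g.: <cib crm_feature_set="3.2.0" validate-with="pacemaker-3.2"
#        admin_epoch="0" epoch="245" num_updates="3" ...>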

> 
> > I think I had a few cases in the past when the "last dying node"
> > did not have
> > the "latest" CIB, causing some "extra noise" when the cluster was
> > formed
> > again.
> 
> Details of what happened are certainly interesting.
> 
> > Probably some period to wait for all nodes to join (and thus sync
> > the CIBs)
> > before performing any actions would help there.

-- 
Ken Gaillot 



Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Coming in Pacemaker 2.1.2: new fencing configuration options

2021-10-12 Thread Andrei Borzenkov
On 12.10.2021 09:27, Ulrich Windl wrote:
 Andrei Borzenkov  wrote on 11.10.2021 at 11:43 in
> message
> :
>> On Mon, Oct 11, 2021 at 9:29 AM Ulrich Windl
>>  wrote:
>> 
> Also how long would such a delay be: Long enough until the other node
> is
> fenced, or long enough until the other node was fenced, booted
> (assuming it
> does) and is running pacemaker?

 The delay should be on the less‑preferred node, long enough for that
 node to get fenced. The other node, with no delay, will fence it if it
 can. If the other node is for whatever reason unable to fence, the node
 with the delay will fence it after the delay.
>>>
>>> So the "fence intention" will be lost when the node is being fenced?
>>> Otherwise the surviving node would have to clean up the "fence intention".
>>> Or does it mean the "fence intention" does not make it to the CIB and
> stays
>>> local on the node?
>>>
>>
>> Two nodes cannot communicate with each other so the surviving node is
>> not aware of anything the fenced node did or intended to do. When the
> 
> I thought (local) CIB writes do not need a quorum.
> 
>> fenced node reboots and pacemaker starts it should pull CIB from the
>> surviving node, so whatever intentions the fenced node had before
>> reboot should be lost at this point.
> 
> If the surviving node has a newer CIB (as per the modification/configuration
> count) than the fenced node, that is true; but if the fenced node has a newer CIB,
> the surviving node would pull the "other" CIB, right?

Indeed. I honestly did not expect it.

I am not sure what consequences it has in practice though. It is
certainly one more argument against running without mandatory stonith
because in this case both nodes happily continue and it is unpredictable
which one will win after they rejoin.

Assuming we do run with mandatory stonith, then we have a relatively small
window before the DC is killed (because only the DC can update the CIB). But I
am not sure whether CIB changes will be committed locally before all nodes
are either confirmed to be offline or have acknowledged the CIB changes. I guess
only Ken can answer it :)


> I think I had a few cases in the past when the "last dying node" did not have
> the "latest" CIB, causing some "extra noise" when the cluster was formed
> again.

Details of what happened are certainly interesting.

> Probably some period to wait for all nodes to join (and thus sync the CIBs)
> before performing any actions would help there.
> 


[ClusterLabs] Antw: [EXT] Re: Possible timing bug in SLES15

2021-10-12 Thread Ulrich Windl
>>> Roger Zhou via Users  wrote on 12.10.2021 at 09:55
in
message :

...
>> # Time syncs can make the clock jump backward, which messes with logging
>> # and failure timestamps, so wait until it's done.
>> After=time‑sync.target
>> ...
>> 
>> Oct 05 14:58:10 h16 pacemakerd[6974]:  notice: Starting Pacemaker 
> 2.0.4+20200616.2deceaa3a‑3.9.1
>> But still it does not "Require" time‑sync.target...
>> 
> 
> Actually, `After=` is a stricter dependency than `Requires=`.

From discussions on the systemd development list there is hardly a scenario
where After= without Requires= makes sense, because (as I understood it) "After"
only has an effect if both units are started in the same "transaction". The way
I understood it, this would mean that if you start pacemaker manually and your
clock is not in sync, pacemaker would start anyway.
I may be wrong, though.
Maybe a counter-argument is that pacemaker might stop if the time is not in
sync (although I believe a dependency on NTP would be bad, but time-sync is
probably OK).
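
A drop-in override along these lines would add the ordering (and optionally pull
the target in) without editing the packaged unit (a sketch; whether Wants= or
Requires= is appropriate is exactly the policy question above):

# systemctl edit corosync.service
# -> /etc/systemd/system/corosync.service.d/override.conf
[Unit]
# order corosync after the clock has been synchronized
After=time-sync.target
# and pull the target in even when corosync is started manually
Wants=time-sync.target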

> 
>> Doesn't corosync need synchronized clocks?
> 
> Seems good to have, but low priority.

Well at least when comparing log timestamps it seems useful if all nodes have
the same time.

Regards,
Ulrich




Re: [ClusterLabs] Antw: Re: Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-12 Thread Jehan-Guillaume de Rorthais
On Tue, 12 Oct 2021 09:46:04 +0200
"Ulrich Windl"  wrote:

> >>> Jehan-Guillaume de Rorthais  wrote on 12.10.2021 at
> >>> 09:35 in
> message <20211012093554.4bb761a2@firost>:
> > On Tue, 12 Oct 2021 08:42:49 +0200
> > "Ulrich Windl"  wrote:
> >   
> ...
> >> "watch cat /proc/meminfo" could be your friend.  
> > 
> > Or even better, make sure you have the sysstat or pcp tools family installed and
> > harvesting system metrics. You'll have the full history of the dirty-page
> > variations over the day/week/month.
> 
> Actually I think the 10-minute granularity of sysstat (sar) is too coarse to
> learn what's going on, specifically if your node is fenced before the latest
> record is written.

Indeed. You can still set it down to 1min in the crontab if needed. But the
point is to get a better understanding of how the dirty pages (and many other
useful metrics) evolve over a long time frame.

You will always lose a small part of the information after a fencing, no matter
whether your period is 10min, 1min or even 1s.
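
On RHEL/CentOS 7 that would mean editing the sysstat cron entry; a sketch
(path and collector location are the EL7 defaults, adjust for your distro):

# /etc/cron.d/sysstat -- default collects every 10 minutes:
#   */10 * * * * root /usr/lib64/sa/sa1 1 1
# collect every minute instead:
* * * * * root /usr/lib64/sa/sa1 1 1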


Re: [ClusterLabs] Possible timing bug in SLES15

2021-10-12 Thread Roger Zhou via Users



On 10/12/21 3:32 PM, Ulrich Windl wrote:

Hi!

I just examined the corosync.service unit in SLES15. It contains:
# /usr/lib/systemd/system/corosync.service
[Unit]
Description=Corosync Cluster Engine
Documentation=man:corosync man:corosync.conf man:corosync_overview
ConditionKernelCommandLine=!nocluster
Requires=network-online.target
After=network-online.target
...

However the documentation says corosync requires synchronized system clocks.
With this configuration corosync starts before the clocks are synchronized:


The point looks valid and makes sense. That said, it sounds like there are no
(or very seldom any) victims of this in real life.




Oct 05 14:57:47 h16 ntpd[6767]: ntpd 4.2.8p15@1.3728-o Tue Jun 15 12:00:00 UTC 
2021 (1): Starting
...
Oct 05 14:57:48 h16 systemd[1]: Starting Wait for ntpd to synchronize system 
clock...
...
Oct 05 14:57:48 h16 corosync[6793]:   [TOTEM ] Initializing transport (UDP/IP 
Unicast).
...
Oct 05 14:57:48 h16 systemd[1]: Started Corosync Cluster Engine.
...
Oct 05 14:58:10 h16 systemd[1]: Started Wait for ntpd to synchronize system 
clock.
Oct 05 14:58:10 h16 systemd[1]: Reached target System Time Synchronized.

Only pacemaker.service has:
# /usr/lib/systemd/system/pacemaker.service
[Unit]
Description=Pacemaker High Availability Cluster Manager
Documentation=man:pacemakerd
Documentation=https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html

# DefaultDependencies takes care of sysinit.target,
# basic.target, and shutdown.target

# We need networking to bind to a network address. It is recommended not to
# use Wants or Requires with network.target, and not to use
# network-online.target for server daemons.
After=network.target

# Time syncs can make the clock jump backward, which messes with logging
# and failure timestamps, so wait until it's done.
After=time-sync.target
...

Oct 05 14:58:10 h16 pacemakerd[6974]:  notice: Starting Pacemaker 
2.0.4+20200616.2deceaa3a-3.9.1
But still it does not "Require" time-sync.target...



Actually, `After=` is a stricter dependency than `Requires=`.


Doesn't corosync need synchronized clocks?


Seems good to have, but low priority.

BR,
Roger





Regards,
Ulrich





Re: [ClusterLabs] Antw: Re: Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-12 Thread Jehan-Guillaume de Rorthais
On Tue, 12 Oct 2021 08:42:49 +0200
"Ulrich Windl"  wrote:

> ...
> >> sysctl ‑a | grep dirty
> >> vm.dirty_background_bytes = 0
> >> vm.dirty_background_ratio = 10  
> > 
> > Considering your 256GB of physical memory, this means you can dirty up to
> > 25GB
> > of pages in cache before the kernel starts to write them to storage.
> > 
> > You might want to trigger these background, lighter syncs well before
> > hitting
> > this limit.
> >   
> >> vm.dirty_bytes = 0
> >> vm.dirty_expire_centisecs = 3000
> >> vm.dirty_ratio = 20  
> > 
> > This is 20% of your 256GB physical memory. After this limit, writes have to
> > go to disks, directly. Considering the time to write to SSD compared to
> > memory and the amount of data to sync in the background as well (52GB),
> > this could be very painful.  
> 
> However (unless doing really large commits) databases should flush buffers
> rather frequently, so I doubt database operations would fill up the dirty
> buffers that quickly.

It depends on your database setup, your concurrency, your active dataset, your
query profile, batches, and so on.

> "watch cat /proc/meminfo" could be your friend.

Or even better, make sure you have the sysstat or pcp tools family installed and
harvesting system metrics. You'll have the full history of the dirty-page
variations over the day/week/month.
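
A sketch of the kind of persistent vm.dirty_*/swappiness tuning discussed above
(the numbers are only examples to show the direction, not recommendations for
this particular box):

# /etc/sysctl.d/90-dirty.conf
# start background writeback after ~1GB of dirty pages instead of 10% of 256GB
vm.dirty_background_bytes = 1073741824
# block writers only after ~4GB of dirty pages instead of 20% of RAM
vm.dirty_bytes = 4294967296
# prefer dropping clean cache over swapping out the database or corosync
vm.swappiness = 10

# apply with: sysctl --system (setting the *_bytes variants zeroes the *_ratio ones)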


[ClusterLabs] Possible timing bug in SLES15

2021-10-12 Thread Ulrich Windl
Hi!

I just examined the corosync.service unit in SLES15. It contains:
# /usr/lib/systemd/system/corosync.service
[Unit]
Description=Corosync Cluster Engine
Documentation=man:corosync man:corosync.conf man:corosync_overview
ConditionKernelCommandLine=!nocluster
Requires=network-online.target
After=network-online.target
...

However the documentation says corosync requires synchronized system clocks.
With this configuration corosync starts before the clocks are synchronized:

Oct 05 14:57:47 h16 ntpd[6767]: ntpd 4.2.8p15@1.3728-o Tue Jun 15 12:00:00 UTC 
2021 (1): Starting
...
Oct 05 14:57:48 h16 systemd[1]: Starting Wait for ntpd to synchronize system 
clock...
...
Oct 05 14:57:48 h16 corosync[6793]:   [TOTEM ] Initializing transport (UDP/IP 
Unicast).
...
Oct 05 14:57:48 h16 systemd[1]: Started Corosync Cluster Engine.
...
Oct 05 14:58:10 h16 systemd[1]: Started Wait for ntpd to synchronize system 
clock.
Oct 05 14:58:10 h16 systemd[1]: Reached target System Time Synchronized.

Only pacemaker.service has:
# /usr/lib/systemd/system/pacemaker.service
[Unit]
Description=Pacemaker High Availability Cluster Manager
Documentation=man:pacemakerd
Documentation=https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html
 

# DefaultDependencies takes care of sysinit.target,
# basic.target, and shutdown.target

# We need networking to bind to a network address. It is recommended not to
# use Wants or Requires with network.target, and not to use
# network-online.target for server daemons.
After=network.target

# Time syncs can make the clock jump backward, which messes with logging
# and failure timestamps, so wait until it's done.
After=time-sync.target
...

Oct 05 14:58:10 h16 pacemakerd[6974]:  notice: Starting Pacemaker 
2.0.4+20200616.2deceaa3a-3.9.1
But still it does not "Require" time-sync.target...

Doesn't corosync need synchronized clocks?

Regards,
Ulrich





[ClusterLabs] Antw: Re: Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

2021-10-12 Thread Ulrich Windl
>>> Jehan-Guillaume de Rorthais  wrote on 11.10.2021 at
11:57 in
message <2021105737.7cc99e69@firost>:
> Hi,
> 
> I kept the full answer in history to keep the list informed of your full
> answer.
> 
> My answer down below.
> 
> On Mon, 11 Oct 2021 11:33:12 +0200
> damiano giuliani  wrote:
> 
>> Hey guys, sorry for being late, was busy during the weekend.
>> 
>> Here I am:
>> 
>> 
>> > Did you see the swap activity (in/out, not just swap occupation) happen
in
>> > the
>> >
>> > same time the member was lost on corosync side?
>> > Did you check corosync or some of its libs were indeed in swap?
>> >
>> >
>> No, and I don't know how to do that. I just noticed the swap occupation, which
>> suggested to me (and my colleague) that we find out if it could cause some trouble.
>> 
>> > First, corosync now sits on a lot of memory because of knet. Did you try to
>> > switch back to udpu, which uses way less memory?
>> 
>> 
>> No, I haven't moved to udpu; I can't stop the processes at all.
>> 
>>   "Could not lock memory of service to avoid page faults"
>> 
>> 
>> grep -rn 'Could not lock memory of service to avoid page faults' /var/log/*
>> returns nothing

Maybe the expression is too specific (try "lock memory", maybe), or syslog is
in the journal only (journalctl -b | grep "lock memory").
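
One way to verify whether corosync actually has its memory locked (a sketch):

# VmLck should be non-zero if mlockall() succeeded at startup
grep VmLck /proc/$(pidof corosync)/status
# and the limit it is running under:
grep "locked memory" /proc/$(pidof corosync)/limits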

> 
> This message should appear on corosync startup. Make sure the logs hadn't 
> been
> rotated to a blackhole in the meantime...
> 
>> > On my side, mlocks is unlimited on ulimit settings. Check the values
>> > in /proc/$(coro PID)/limits (be careful with the ulimit command, check
the
>> > proc itself).
>> 
>> 
>> cat /proc/101350/limits
>> Limit Soft Limit   Hard Limit   Units
>> Max cpu time  unlimitedunlimited   
seconds
>> Max file size unlimitedunlimitedbytes
>> Max data size unlimitedunlimitedbytes
>> Max stack size8388608  unlimitedbytes
>> Max core file size0unlimitedbytes
>> Max resident set  unlimitedunlimitedbytes
>> Max processes 770868   770868
>> processes
>> Max open files1024 4096 files
>> Max locked memory unlimitedunlimitedbytes
>> Max address space unlimitedunlimitedbytes
>> Max file locksunlimitedunlimitedlocks
>> Max pending signals   770868   770868  
signals
>> Max msgqueue size 819200   819200   bytes
>> Max nice priority 00
>> Max realtime priority 00
>> Max realtime timeout  unlimitedunlimitedus
>> 
>> Ah... That's the first thing I change.
>> > In SLES, that is defaulted to 10s and so far I have never seen an
>> > environment that is stable enough for the default 1s timeout.
>> 
>> 
>> old versions have a 10s default
>> you are not going to fix the problem this way; a 1s timeout for a bonded
>> network and overkill hardware is an enormous amount of time.
>> 
>> hostnamectl | grep Kernel
>> Kernel: Linux 3.10.0‑1160.6.1.el7.x86_64
>> [root@ltaoperdbs03 ~]# cat /etc/os‑release
>> NAME="CentOS Linux"
>> VERSION="7 (Core)"
>> 
>> > Indeed. But it's a trade-off between swapping process memory and freeing
>> > memory by removing data from cache. For database servers, it is advised to
>> > use a
>> > lower value for swappiness anyway, around 5-10, as a swapped process means
>> > longer queries, data sitting longer in caches, piling-up sessions, etc.
>> 
>> 
>> Totally agree; for a DB server swappiness has to be 5-10.
>> 
>> kernel?
>> > What are your settings for vm.dirty_* ?
>> 
>> 
>> 
>> hostnamectl | grep Kernel
>> Kernel: Linux 3.10.0‑1160.6.1.el7.x86_64
>> [root@ltaoperdbs03 ~]# cat /etc/os‑release
>> NAME="CentOS Linux"
>> VERSION="7 (Core)"
>> 
>> 
>> sysctl ‑a | grep dirty
>> vm.dirty_background_bytes = 0
>> vm.dirty_background_ratio = 10
> 
> Considering your 256GB of physical memory, this means you can dirty up to
> 25GB
> of pages in cache before the kernel starts to write them to storage.
> 
> You might want to trigger these background, lighter syncs well before
> hitting
> this limit.
> 
>> vm.dirty_bytes = 0
>> vm.dirty_expire_centisecs = 3000
>> vm.dirty_ratio = 20
> 
> This is 20% of your 256GB physical memory. After this limit, writes have to go
> to disks, directly. Considering the time to write to SSD compared to memory
> and the amount of data to sync in the background as well (52GB), this could be
> very painful.

However (unless doing really large commits) databases should flush buffers
rather frequently, so I doubt database operations would fill up the dirty
buffers that quickly.
"watch cat /proc/meminfo" could be your friend.

> 
>> vm.dirty_writeback_centisecs = 500
>> 
>> 
>> > Do you 

[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Coming in Pacemaker 2.1.2: new fencing configuration options

2021-10-12 Thread Ulrich Windl
>>> Andrei Borzenkov  wrote on 11.10.2021 at 11:43 in
message
:
> On Mon, Oct 11, 2021 at 9:29 AM Ulrich Windl
>  wrote:
> 
>> >> Also how long would such a delay be: Long enough until the other node
>> >> is
>> >> fenced, or long enough until the other node was fenced, booted
>> >> (assuming it
>> >> does) and is running pacemaker?
>> >
>> > The delay should be on the less‑preferred node, long enough for that
>> > node to get fenced. The other node, with no delay, will fence it if it
>> > can. If the other node is for whatever reason unable to fence, the node
>> > with the delay will fence it after the delay.
>>
>> So the "fence intention" will be lost when the node is being fenced?
>> Otherwise the surviving node would have to clean up the "fence intention".
>> Or does it mean the "fence intention" does not make it to the CIB and
stays
>> local on the node?
>>
> 
> Two nodes cannot communicate with each other so the surviving node is
> not aware of anything the fenced node did or intended to do. When the

I thought (local) CIB writes do not need a quorum.

> fenced node reboots and pacemaker starts it should pull CIB from the
> surviving node, so whatever intentions the fenced node had before
> reboot should be lost at this point.

If the surviving node has a newer CIB (as per the modification/configuration
count) than the fenced node, that is true; but if the fenced node has a newer CIB,
the surviving node would pull the "other" CIB, right?
I think I had a few cases in the past when the "last dying node" did not have
the "latest" CIB, causing some "extra noise" when the cluster was formed
again.
Probably some period to wait for all nodes to join (and thus sync the CIBs)
before performing any actions would help there.
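
For reference, a sketch of how the static delay discussed above can be set with
pcs (the device names are hypothetical, and the per-target syntax is only how I
understand the 2.1.2 addition, so treat it as such):

# delay any fencing that targets node1, so node1 wins a fence race against node2
pcs stonith update fence_of_node1 pcmk_delay_base=5s
# with 2.1.2, one device can reportedly carry per-target delays, roughly:
pcs stonith update fence_all pcmk_delay_base="node1:5s;node2:0s"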

Regards,
Ulrich




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/