On 26/11/2023 12:20, Reid Wahl wrote:
On Sun, Nov 26, 2023 at 1:32 AM lejeczek via Users
<users@clusterlabs.org> wrote:
Hi guys.
With these:
-> $ pcs resource status REDIS-6381-clone
* Clone Set: REDIS-6381-clone [REDIS-6381] (promotable):
* Promoted: [ ubusrv2 ]
* Unpromoted: [ ubusrv1 ubusrv3 ]
-> $ pcs resource status PGSQL-PAF-5433-clone
* Clone Set: PGSQL-PAF-5433-clone [PGSQL-PAF-5433] (promotable):
* Promoted: [ ubusrv1 ]
* Unpromoted: [ ubusrv2 ubusrv3 ]
-> $ pcs constraint ref REDIS-6381-clone
Resource: REDIS-6381-clone
colocation-REDIS-6381-clone-PGSQL-PAF-5433-clone-INFINITY
Basically, the promoted Redis instance should follow the promoted pgSQL instance,
but that is not happening; usually it does.
I presume pcs/cluster does something internally which results in
disobeying/ignoring that _colocation_ constraint for these resources.
I presume scoring might play a role:
REDIS-6385-clone with PGSQL-PAF-5435-clone (score:1001) (rsc-role:Master)
(with-rsc-role:Master)
but that scoring usually works; only "now" it does not.
Any comments would be much appreciated.
thanks, L.
I looked at the pacemaker log (snippet below, taken after REDIS-6381-clone was
re-enabled) but cannot see an explanation for this.
...
notice: Calculated transition 110, saving inputs in
/var/lib/pacemaker/pengine/pe-input-3729.bz2
notice: Transition 110 (Complete=0, Pending=0, Fired=0, Skipped=0,
Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-3729.bz2): Complete
notice: State transition S_TRANSITION_ENGINE -> S_IDLE
notice: State transition S_IDLE -> S_POLICY_ENGINE
notice: Actions: Start REDIS-6381:0 ( ubusrv2 )
notice: Actions: Start REDIS-6381:1 ( ubusrv3 )
notice: Actions: Start REDIS-6381:2 ( ubusrv1 )
notice: Calculated transition 111, saving inputs in
/var/lib/pacemaker/pengine/pe-input-3730.bz2
notice: Initiating start operation REDIS-6381_start_0 locally on ubusrv2
notice: Requesting local execution of start operation for REDIS-6381 on
ubusrv2
(to redis) root on none
pam_unix(su:session): session opened for user redis(uid=127) by (uid=0)
pam_sss(su:session): Request to sssd failed. Connection refused
pam_unix(su:session): session closed for user redis
pam_sss(su:session): Request to sssd failed. Connection refused
notice: Setting master-REDIS-6381[ubusrv2]: (unset) -> 1000
notice: Transition 111 aborted by status-2-master-REDIS-6381 doing create
master-REDIS-6381=1000: Transient attribute change
INFO: demote: Setting master to 'no-such-master'
notice: Result of start operation for REDIS-6381 on ubusrv2: ok
notice: Transition 111 (Complete=4, Pending=0, Fired=0, Skipped=1,
Incomplete=14, Source=/var/lib/pacemaker/pengine/pe-input-3730.bz2): Stopped
notice: Actions: Promote REDIS-6381:0 ( Unpromoted -> Promoted ubusrv2 )
notice: Actions: Start REDIS-6381:1 ( ubusrv1 )
notice: Actions: Start REDIS-6381:2 ( ubusrv3 )
notice: Calculated transition 112, saving inputs in
/var/lib/pacemaker/pengine/pe-input-3731.bz2
notice: Initiating notify operation REDIS-6381_pre_notify_start_0 locally on
ubusrv2
notice: Requesting local execution of notify operation for REDIS-6381 on
ubusrv2
notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
notice: Initiating start operation REDIS-6381_start_0 on ubusrv1
notice: Initiating start operation REDIS-6381:2_start_0 on ubusrv3
notice: Initiating notify operation REDIS-6381_post_notify_start_0 locally on
ubusrv2
notice: Requesting local execution of notify operation for REDIS-6381 on
ubusrv2
notice: Initiating notify operation REDIS-6381_post_notify_start_0 on ubusrv1
notice: Initiating notify operation REDIS-6381:2_post_notify_start_0 on
ubusrv3
notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
notice: Initiating notify operation REDIS-6381_pre_notify_promote_0 locally
on ubusrv2
notice: Requesting local execution of notify operation for REDIS-6381 on
ubusrv2
notice: Initiating notify operation REDIS-6381_pre_notify_promote_0 on ubusrv1
notice: Initiating notify operation REDIS-6381:2_pre_notify_promote_0 on
ubusrv3
notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
notice: Initiating promote operation REDIS-6381_promote_0 locally on ubusrv2
notice: Requesting local execution of promote operation for REDIS-6381 on
ubusrv2
notice: Result of promote operation for REDIS-6381 on ubusrv2: ok
notice: Initiating notify operation REDIS-6381_post_notify_promote_0 locally
on ubusrv2
notice: Requesting local execution of notify operation for REDIS-6381 on
ubusrv2
notice: Initiating notify operation REDIS-6381_post_notify_promote_0 on
ubusrv1
notice: Initiating notify operation REDIS-6381:2_post_notify_promote_0 on
ubusrv3
notice: Result of notify operation for REDIS-6381 on ubusrv2: ok
notice: Setting master-REDIS-6381[ubusrv3]: (unset) -> 1
notice: Transition 112 aborted by status-3-master-REDIS-6381 doing create
master-REDIS-6381=1: Transient attribute change
notice: Setting master-REDIS-6381[ubusrv1]: (unset) -> 1
notice: Transition 112 (Complete=25, Pending=0, Fired=0, Skipped=5,
Incomplete=5, Source=/var/lib/pacemaker/pengine/pe-input-3731.bz2): Stopped
notice: Calculated transition 113, saving inputs in
/var/lib/pacemaker/pengine/pe-input-3732.bz2
notice: Initiating monitor operation REDIS-6381_monitor_20000 locally on
ubusrv2
notice: Requesting local execution of monitor operation for REDIS-6381 on
ubusrv2
notice: Initiating monitor operation REDIS-6381_monitor_60000 on ubusrv3
notice: Initiating monitor operation REDIS-6381_monitor_45000 on ubusrv3
notice: Initiating monitor operation REDIS-6381_monitor_60000 on ubusrv1
notice: Initiating monitor operation REDIS-6381_monitor_45000 on ubusrv1
notice: Result of monitor operation for REDIS-6381 on ubusrv2: promoted
notice: Transition 113 (Complete=5, Pending=0, Fired=0, Skipped=0,
Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-3732.bz2): Complete
notice: State transition S_TRANSITION_ENGINE -> S_IDLE
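The pe-input files named in the log can be replayed offline to see which scores the scheduler computed for each instance. A minimal sketch, assuming crm_simulate from the Pacemaker command-line tools is installed and the saved file is still present on the node that wrote it; the helper name show_scores is made up for illustration:

```shell
#!/bin/sh
# show_scores is a hypothetical helper name, not part of pacemaker.
# $1: saved scheduler input, e.g. /var/lib/pacemaker/pengine/pe-input-3731.bz2
# $2: resource name to filter on, e.g. REDIS-6381
show_scores() {
    # --xml-file replays the saved cluster state instead of the live CIB;
    # --show-scores prints the scores the scheduler assigned.
    crm_simulate --xml-file "$1" --show-scores | grep -F -- "$2"
}
```

For example, `show_scores /var/lib/pacemaker/pengine/pe-input-3731.bz2 REDIS-6381` would limit the output to the lines that mention REDIS-6381.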
How much do transient attributes matter here?
As in my earlier example:
Node Attributes:
* Node: ubusrv1 (1):
* master-PGSQL-PAF-5433 : 1000
* master-PGSQL-PAF-5434 : 1001
* master-PGSQL-PAF-5435 : -1000
* master-REDIS-6380 : 1
* master-REDIS-6381 : 1
* master-REDIS-6382 : 1
* master-REDIS-6385 : 2
* Node: ubusrv2 (2):
* master-PGSQL-PAF-5433 : 990
* master-PGSQL-PAF-5434 : -1000
* master-PGSQL-PAF-5435 : 1001
* master-REDIS-6380 : 1
* master-REDIS-6381 : 1
* master-REDIS-6382 : 1
* master-REDIS-6385 : 2
* Node: ubusrv3 (3):
* master-PGSQL-PAF-5433 : 1001
* master-PGSQL-PAF-5434 : -1000
* master-PGSQL-PAF-5435 : -1000
* master-REDIS-6380 : 1000
* master-REDIS-6381 : 1
* master-REDIS-6382 : 1
* master-REDIS-6385 : 2
-> $ pcs constraint colocation --full | grep REDIS-6380
REDIS-6380-clone with PGSQL-PAF-5434-clone (score:9999)
(rsc-role:Master) (with-rsc-role:Master)
(id:colocation-REDIS-6380-clone-PGSQL-PAF-5434-clone-INFINITY-1)
Right now I again have a situation where the _master_ REDIS-6380
should be on ubusrv1, where the master PGSQL-PAF-5434 is, if
the constraint above were honored.
And again, I can move it manually:
-> $ pcs resource move --master REDIS-6380-clone ubusrv1
When moved like that, Redis reports replication status as
expected and seems problem-free.
As soon as I run:
-> $ pcs resource clear REDIS-6380-clone
the cluster moves the master REDIS-6380-clone back to _ubusrv3_,
where that _transient_ attribute is "strangely" high.
Is there a doc/manual somewhere that covers those parts:
transient attributes and their role?
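To compare the master-* values per node without reading the whole listing, a small filter can help. This is only a sketch: it assumes the input on stdin has the same layout as the "Node Attributes:" listing quoted above (for instance from crm_mon --one-shot --show-node-attributes, which is an assumption about where the listing came from), and the function name promotion_scores is made up for illustration:

```shell
#!/bin/sh
# promotion_scores is a hypothetical helper name.
# $1: attribute name, e.g. master-REDIS-6380
# stdin: a "Node Attributes:" listing in the layout quoted above
# Prints "<node> <score>" per node, highest score first.
promotion_scores() {
    awk -v attr="$1" '
        # "* Node: ubusrv1 (1):" -> remember the current node name
        $1 == "*" && $2 == "Node:" { node = $3; next }
        # "* master-REDIS-6380 : 1000" -> emit node and score
        $1 == "*" && $2 == attr && $3 == ":" { print node, $4 }
    ' | sort -k2,2nr
}
```

Piping the listing into `promotion_scores master-REDIS-6380` puts the node with the highest value for that attribute on the first line, which makes it easy to compare with where the colocation constraint says the promoted instance should be.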
many thanks, L.