On 08. 03. 22 at 23:08, Ken Gaillot wrote:
On Tue, 2022-03-08 at 17:20 +0100, Jehan-Guillaume de Rorthais wrote:
Hi,

Sorry, your mail was really hard to read on my side, but I think I
understood it and will try to answer below.

On Tue, 8 Mar 2022 11:45:30 +0000
lejeczek via Users <users@clusterlabs.org> wrote:

On 08/03/2022 10:21, Jehan-Guillaume de Rorthais wrote:
   op start timeout=60s \
   op stop timeout=60s \
   op promote timeout=30s \
   op demote timeout=120s \
   op monitor interval=15s timeout=10s role="Master" meta master-max=1 \
   op monitor interval=16s timeout=10s role="Slave" \
   op notify timeout=60s meta notify=true

Because "op" appears, we are back in resource ("pgsqld") context;
anything after it is interpreted as resource and operation attributes,
even the "meta notify=true". That's why your pgsqld-clone doesn't
have the meta attribute "notify=true" set.

Here is a one-liner that should do it - add, as per the 'debug-'
suggestion, 'master-max=1'

What debug- suggestion??

...
then do:

-> $ pcs resource delete pgsqld

'-clone' should get removed too, so now no 'pgsqld'
resource(s) but cluster - weirdly in my mind - leaves node
attributes on.

indeed.

I see 'master-pgsqld' on each node and do not see why
'node attributes' should be kept (let alone shown) for
non-existent resources (attributes that are intrinsic only
to those resources).
So, if you want to "clean" that up, perhaps because for now
you are not going to have/use 'pgsqlms', you can do that with:

-> $ pcs node attribute node1 master-pgsqld="" # same for remaining nodes

indeed.

Now, repeat your one-liner, which worked just a moment ago,
and you should get the exact same or similar errors (while
all nodes are stuck on 'slave'):

You have no promotion because your PostgreSQL instances have been
stopped in standby mode. The cluster has no way and no score to
promote one of them.

-> $ pcs resource debug-promote pgsqld
crm_resource: Error performing operation: Error occurred
Operation force-promote for pgsqld (ocf:heartbeat:pgsqlms)
returned 1 (error: Can not get current node LSN location)
/tmp:5432 - accepting connections

NEVER use "debug-promote" or other "debug-*" commands with pgsqlms, or
any other cloned resources. AFAIK, these commands work fine for
"stateless" resources, but do not (could not) create the required
environment for the clone and multi-state ones.

So I repeat, NEVER use "debug-promote".

What you want to do is set the promotion score on the node where you
want the promotion to happen. E.g.:

   pcs node attribute srv1 master-pgsqld=1001

You can use "crm_attribute" or "crm_master" as well.
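
As a rough illustration of the "crm_attribute" route, the sketch below
only prints the command instead of running it, so it can be reviewed
first. The helper name promotion_score_cmd is hypothetical; the
attribute name master-pgsqld and the score 1001 come from the example
above, and the options shown (--node, --name, --update) are assumed to
match the crm_attribute shipped with your Pacemaker version.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker or pcs): print the
# crm_attribute command that would set a promotion score on one node.
# Nothing is executed against the cluster.
promotion_score_cmd() {
    node=$1
    rsc_id=$2
    score=$3
    # The attribute name follows the master-<resource id> form seen
    # in the thread (master-pgsqld).
    printf 'crm_attribute --node %s --name master-%s --update %s\n' \
        "$node" "$rsc_id" "$score"
}

# Example: promotion_score_cmd srv1 pgsqld 1001
```

Printing rather than executing is a deliberate choice here: the
command can be checked against the local man page before it touches
the CIB.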

ocf-exit-reason:Can not get current node LSN location

This one is probably because of "debug-promote".

You have to 'cib-push' to "fix" this very problem.
In my(admin's) opinion this is a 100% candidate for a bug -
whether in PCS or PAF - perhaps authors may wish to comment?

Removing the node attributes with the resource might be legit from
the Pacemaker point of view, but I'm not sure how they can track the
dependency (ping Ken?).

Higher-level tools like pcs or crm shell could probably do it when
removing the resource (i.e. if the resource was a promotable clone,
check for and remove any node attributes of the form master-$RSC_ID).
That sounds like a good idea to me.
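
Until a tool does this itself, the cleanup described above could be
sketched by hand roughly as follows. This is a dry run only: it prints
one "pcs node attribute" command per node, in the same form used
earlier in this thread, and executes nothing. The function name
print_promotion_cleanup is hypothetical, and the node list has to be
supplied by the admin; nothing here checks whether the resource really
was a promotable clone.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs): print the commands that would
# clear the leftover master-<resource id> node attribute on each node
# after the promotable clone has been deleted.
print_promotion_cleanup() {
    rsc_id=$1
    shift
    for node in "$@"; do
        # An empty value is how the attribute was cleared in the thread
        # (master-pgsqld=""); "name=" is the same thing to the shell.
        printf 'pcs node attribute %s master-%s=\n' "$node" "$rsc_id"
    done
}

# Example: print_promotion_cleanup pgsqld node1 node2
```

The printed lines can be reviewed and then run one by one on a cluster
node.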

I put this on pcs todo list.

Regards,
Tomas


Pacemaker would be a bad place to do it because Pacemaker only sees the
newly modified CIB with the resource configuration gone -- it can't
know for sure whether it was a promotable clone, and it can only know
it existed at all if there are leftover status entries (causing the
resource to be listed as "orphaned"), which isn't guaranteed.


PAF has no way to know the resource is being deleted and cannot
remove its node attribute beforehand.

Maybe PCS can look for promotion scores and remove them during the
"resource delete" command (ping Tomas)?

Regards,

