Re: [ClusterLabs] attrd/attrd_updater asynchronous behavior

2018-04-16 Thread Jehan-Guillaume de Rorthais
I got an answer on IRC from Ken Gaillot. Below is his answer, for tracking purposes.

On Mon, 16 Apr 2018 23:28:39 +0200
Jehan-Guillaume de Rorthais  wrote:
[...]
> * is looping until the value becomes available enough to conclude that all
>   other nodes have the same value? Or is it available only locally on the
>   action's node and not yet "replicated" to other nodes?

kgaillot: « 
  That issue has come up recently in a different context. You are
  correct, currently there is no guarantee that the value has been set
  anywhere else, and looping until the query comes back only ensures that the
  new value is in the local attrd.

  The solution will probably be to offer something like a --wait option that
  doesn't return until the value is available (maybe locally, maybe everywhere,
  or maybe that's part of the option).
»

I'll file a bz to track this feature, as discussed on IRC.

> * any other suggestions about how we could share values synchronously with all
>   other nodes?

Any suggestion is very welcome...
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] attrd/attrd_updater asynchronous behavior

2018-04-16 Thread Jehan-Guillaume de Rorthais
Hi,

I have a question regarding attrd's asynchronous behavior.

In PAF, during the election process to pick the best PgSQL master, we use
private attributes to publish the status (LSN) of each pgsql instance during
the pre-promote action.

Because we need these LSNs from every node during the promote action, each time
we call

  attrd_updater --name blah --update x

we have a loop running

  attrd_updater --name blah --query

until the fetched value is the same as the one we set. We are basically trying
to force synchronous behavior.
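
For illustration, the update-then-poll pattern described above can be sketched
in shell roughly as follows. This is only a minimal sketch, not the actual PAF
code (which is Perl, see the link below): the helper name set_attr_and_wait,
the ATTRD_UPDATER override variable and the retry count are hypothetical, and
the match on value="..." assumes the query prints a line of the form
name="..." host="..." value="...". Note that a successful return only tells us
the local attrd has the value.

```shell
#!/bin/sh
# Hypothetical helper: update an attribute, then poll the local attrd until
# the query returns the value we just wrote.
# ATTRD_UPDATER is an assumption added so the command can be overridden.
ATTRD_UPDATER="${ATTRD_UPDATER:-attrd_updater}"

set_attr_and_wait() {
    name=$1
    value=$2
    tries=${3:-30}   # assumed retry budget, one attempt per second

    "$ATTRD_UPDATER" --name "$name" --update "$value" || return 1

    while [ "$tries" -gt 0 ]; do
        # Assumed output format: name="blah" host="node" value="x"
        if "$ATTRD_UPDATER" --name "$name" --query 2>/dev/null \
            | grep -qF "value=\"$value\""; then
            return 0   # value visible in the local attrd only
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1           # gave up: value never showed up locally
}
```

Usage would be something like "set_attr_and_wait blah x || exit 1" in place of
the bare update call.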

See: https://github.com/ClusterLabs/PAF/blob/master/script/pgsqlms#L310

But we have an issue on GitHub that makes me think this might not be enough to
make sure all the private attributes become available across the
cluster during the pre-promote action and before the promote action is
triggered. See: https://github.com/ClusterLabs/PAF/issues/131

In this issue, a simple switchover (pcs move) fails during the promote action
of the designated slave, because it couldn't check the LSNs of all other nodes:

  ocf-exit-reason:Can not get LSN location for "pg1-dev"

* is looping until the value becomes available enough to conclude that all
  other nodes have the same value? Or is it available only locally on the
  action's node and not yet "replicated" to other nodes?
* any other suggestions about how we could share values synchronously with all
  other nodes?

Thanks for your help,