Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-29 Thread Dejan Muhamedagic
On Fri, Oct 26, 2012 at 11:36:53AM +1100, Andrew Beekhof wrote:
 On Fri, Oct 26, 2012 at 12:52 AM, Dejan Muhamedagic de...@suse.de wrote:
  On Thu, Oct 25, 2012 at 06:09:38AM -0700, Lars Ellenberg wrote:
  On Thu, Oct 25, 2012 at 03:38:47AM -0700, Takatoshi MATSUO wrote:
   Usually,  we use crm_master command instead of crm_attribute to 
   change master score in RA.
   But PostgreSQL's slave can't get own replication status, so Master 
   changes Slave's master-score
   using instance number on Pacemaker 1.0.x .
   This probably is not ordinary usage.
  
Would the existing resource agent work with globally-unique=true ?
  
   I don't know whether it works with true.
   I use it with false, and it doesn't need true.
 
  I suggested that you actually should use globally-unique clones,
  as in that case you still get those instance numbers...
 
  Does using different clones make sense in pgsql? What is to be
  different between them? Or would it be just for the sake of
  getting instance numbers? If so, then it somehow looks wrong to
  me :)
 
  But thinking about it once more, I'm not so sure anymore.
 
  Correct me where I'm wrong.
 
  This is about the master score.
  In case the Master instance fails, we preferably want to promote the
  slave instance that is as close as possible to the Master.
  We only know which *node* was best at the last monitoring interval,
  which may be good enough.
 
  We need to then change the master score for *all possible instances*,
  for all nodes, accordingly.
 
  Which is what that loop did.
  (I think skipping the current instance is actually a bug;
   if Pacemaker relabels things in a bad way, you may hit it.)
 
  Now, with pacemaker 1.1.8, all instances become equal
  (for anonymous clones, aka globally-unique=false),
  and we only need to set the score on the resource-id,
  not for all resource-id:instance combinations.
 
  OK.
 
  Which is great. After all, the master score in this case is attached to
  the node (or, the data set accessible from that node), and not to the
  (arbitrary, potentially relabeled anytime) instance number pacemaker
  assigned to the clone instance running on that node.
 
 
  And that is exactly what your patch does:
   * detect whether the Pacemaker version in use attaches the instance
     number to the resource-id
   * if so, loop over all possible instance numbers as before
   * if not, set the master score on the resource-id only
 
 
  Is my understanding correct?
  Then I think your patch is good.
 
  Yes, the patch seems good then. Though there is quite a bit of
  code repetition. The set attribute part should be moved to an
  extra function.
 
  Still, other resource agents that use master scores (or any other
  attributes that reference instance numbers of anonymous clones)
  need to be reviewed.
 
  Though this "I'll set scores for other instances, not only myself"
  logic is unique to pgsql, so most other resource agents should just
  work with whatever is present in the environment; they typically treat
  $OCF_RESOURCE_INSTANCE as opaque.
 
  Seems like no other RA uses instance numbers. However, quite a
  few use OCF_RESOURCE_INSTANCE which, in case of clone/ms
  resources, may potentially lead to unpredictable results on
  upgrade to 1.1.8.
 
 No. Otherwise all the regression tests would fail.  The PE is smart
 enough to find promotion score and failcounts in either case.

Cool.

 Also, OCF_RESOURCE_INSTANCE contains whatever the local lrmd knows the
 resource as, not what we call it internally to the PE.

What I meant was that some RAs use OCF_RESOURCE_INSTANCE to name
local files which keep some kind of state. If
OCF_RESOURCE_INSTANCE changes on upgrade... Well, I guess that
the worst that can happen is for the probe to fail. But I didn't
take a closer look.
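
To make the concern concrete, here is a minimal sketch of how an RA might derive a state-file name that stays the same whether or not OCF_RESOURCE_INSTANCE carries a :N suffix. The function name state_file and the .state extension are made up for illustration, HA_RSCTMP is assumed to be provided by the OCF shell functions, and the fallback directory is only a guess. This would only suit anonymous clones, where at most one instance runs per node.

```shell
# Hypothetical helper, not taken from any existing RA.
state_file() {
    # Strip a ":N" instance suffix, if present, so that "pgsql:1"
    # (older Pacemaker) and "pgsql" (1.1.8) map to the same file.
    base="${OCF_RESOURCE_INSTANCE%%:*}"
    # HA_RSCTMP normally comes from the OCF shell functions; the
    # fallback path is an assumption for illustration.
    echo "${HA_RSCTMP:-/var/run/resource-agents}/${base}.state"
}
```

With such a helper, a probe after the upgrade would look for the same file the old version wrote, instead of failing because the name changed.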

Thanks,

Dejan

  Thanks,
Lars
 
  Cheers,
 
  Dejan
  ___
  Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
  http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
  Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-29 Thread Andrew Beekhof
On Mon, Oct 29, 2012 at 9:51 PM, Dejan Muhamedagic de...@suse.de wrote:
 On Fri, Oct 26, 2012 at 11:36:53AM +1100, Andrew Beekhof wrote:
 On Fri, Oct 26, 2012 at 12:52 AM, Dejan Muhamedagic de...@suse.de wrote:
  On Thu, Oct 25, 2012 at 06:09:38AM -0700, Lars Ellenberg wrote:
  On Thu, Oct 25, 2012 at 03:38:47AM -0700, Takatoshi MATSUO wrote:
   Usually,  we use crm_master command instead of crm_attribute to 
   change master score in RA.
   But PostgreSQL's slave can't get own replication status, so Master 
   changes Slave's master-score
   using instance number on Pacemaker 1.0.x .
   This probably is not ordinary usage.
  
Would the existing resource agent work with globally-unique=true ?
  
   I don't know whether it works with true.
   I use it with false, and it doesn't need true.
 
  I suggested that you actually should use globally-unique clones,
  as in that case you still get those instance numbers...
 
  Does using different clones make sense in pgsql? What is to be
  different between them? Or would it be just for the sake of
  getting instance numbers? If so, then it somehow looks wrong to
  me :)
 
  But thinking about it once more, I'm not so sure anymore.
 
  Correct me where I'm wrong.
 
  This is about the master score.
  In case the Master instance fails, we preferably want to promote the
  slave instance that is as close as possible to the Master.
  We only know which *node* was best at the last monitoring interval,
  which may be good enough.
 
  We need to then change the master score for *all possible instances*,
  for all nodes, accordingly.
 
  Which is what that loop did.
  (I think skipping the current instance is actually a bug;
   if Pacemaker relabels things in a bad way, you may hit it.)
 
  Now, with pacemaker 1.1.8, all instances become equal
  (for anonymous clones, aka globally-unique=false),
  and we only need to set the score on the resource-id,
  not for all resource-id:instance combinations.
 
  OK.
 
  Which is great. After all, the master score in this case is attached to
  the node (or, the data set accessible from that node), and not to the
  (arbitrary, potentially relabeled anytime) instance number pacemaker
  assigned to the clone instance running on that node.
 
 
  And that is exactly what your patch does:
   * detect whether the Pacemaker version in use attaches the instance
     number to the resource-id
   * if so, loop over all possible instance numbers as before
   * if not, set the master score on the resource-id only
 
 
  Is my understanding correct?
  Then I think your patch is good.
 
  Yes, the patch seems good then. Though there is quite a bit of
  code repetition. The set attribute part should be moved to an
  extra function.
 
  Still, other resource agents that use master scores (or any other
  attributes that reference instance numbers of anonymous clones)
  need to be reviewed.
 
  Though this "I'll set scores for other instances, not only myself"
  logic is unique to pgsql, so most other resource agents should just
  work with whatever is present in the environment; they typically treat
  $OCF_RESOURCE_INSTANCE as opaque.
 
  Seems like no other RA uses instance numbers. However, quite a
  few use OCF_RESOURCE_INSTANCE which, in case of clone/ms
  resources, may potentially lead to unpredictable results on
  upgrade to 1.1.8.

 No. Otherwise all the regression tests would fail.  The PE is smart
 enough to find promotion score and failcounts in either case.

 Cool.

 Also, OCF_RESOURCE_INSTANCE contains whatever the local lrmd knows the
 resource as, not what we call it internally to the PE.

 What I meant was that some RAs use OCF_RESOURCE_INSTANCE to name
 local files which keep some kind of state. If
 OCF_RESOURCE_INSTANCE changes on upgrade... Well, I guess that
 the worst that can happen is for the probe to fail.

Right. But only for attach/reattach.
And people should have maintenance-mode enabled at the point the probe
is run, so there is time to fix things up before the cluster does
anything about it.

 But I didn't
 take a closer look.

 Thanks,

 Dejan

  Thanks,
Lars
 
  Cheers,
 
  Dejan


Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-25 Thread Lars Ellenberg
On Thu, Oct 25, 2012 at 01:24:40AM -0700, Takatoshi MATSUO wrote:
 check existence of instance number in replication mode
 because Pacemaker 1.1.8 or higher do not append instance numbers.

I think this is wrong.

It seems this became necessary because of

 commit 427c7fe6ea94a566aaa714daf8d214290632f837
 Author: Andrew Beekhof and...@beekhof.net
 Date:   Fri Jul 13 13:37:42 2012 +1000

     High: PE: Do not append instance numbers to anonymous clones

     Benefits:
     - they shouldnt have been exposed in the first place, but I didnt know
       how not to back then
     - if admins don't know what they are, they can't be misunderstood or
       misused
     - more reliable failcount and promotion scores (since you dont have to
       check for all possible permutations)
     - smaller status section since there cant be entries for each possible
       :N suffix
     - the name in the config corresponds to the resource in the logs


So if pgsql thinks it needs these instance numbers,
maybe it is not so anonymous a clone, after all?

Would the existing resource agent work with globally-unique=true ?

Lars

 
 You can merge this Pull Request by running:
 
   git pull https://github.com/t-matsuo/resource-agents check-instance-number
 
 Or you can view, comment on it, or merge it online at:
 
   https://github.com/ClusterLabs/resource-agents/pull/159
 
 -- Commit Summary --
 
   * Low: pgsql: check existence of instance number in replication mode
 
 -- File Changes --
 
 M heartbeat/pgsql (44)
 
 -- Patch Links --
 
 https://github.com/ClusterLabs/resource-agents/pull/159.patch
 https://github.com/ClusterLabs/resource-agents/pull/159.diff
 
 
 ---
 Reply to this email directly or view it on GitHub:
 https://github.com/ClusterLabs/resource-agents/pull/159

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com


Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-25 Thread Takatoshi MATSUO
Usually, an RA uses the crm_master command rather than crm_attribute to
change its own master score.
But a PostgreSQL Slave can't get its own replication status, so the Master
changes the Slave's master score,
using the instance number on Pacemaker 1.0.x.
This is probably not ordinary usage.

 So if pgsql thinks it needs these instance numbers,
 maybe it is not so anonymous a clone, after all?

 Would the existing resource agent work with globally-unique=true ?

No, I use it with false and it doesn't need true.

--
Takatoshi MATSUO




Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-25 Thread Dejan Muhamedagic
On Thu, Oct 25, 2012 at 06:09:38AM -0700, Lars Ellenberg wrote:
 On Thu, Oct 25, 2012 at 03:38:47AM -0700, Takatoshi MATSUO wrote:
  Usually,  we use crm_master command instead of crm_attribute to change 
  master score in RA.
  But PostgreSQL's slave can't get own replication status, so Master changes 
  Slave's master-score 
  using instance number on Pacemaker 1.0.x .
  This probably is not ordinary usage.
  
   Would the existing resource agent work with globally-unique=true ?
  
  I don't know whether it works with true.
  I use it with false, and it doesn't need true.
 
 I suggested that you actually should use globally-unique clones,
 as in that case you still get those instance numbers...

Does using different clones make sense in pgsql? What is to be
different between them? Or would it be just for the sake of
getting instance numbers? If so, then it somehow looks wrong to
me :)

 But thinking about it once more, I'm not so sure anymore.
 
 Correct me where I'm wrong.
 
 This is about the master score.
 In case the Master instance fails, we preferably want to promote the
 slave instance that is as close as possible to the Master.
 We only know which *node* was best at the last monitoring interval,
 which may be good enough.
 
 We need to then change the master score for *all possible instances*,
 for all nodes, accordingly.
 
 Which is what that loop did.
 (I think skipping the current instance is actually a bug;
  if Pacemaker relabels things in a bad way, you may hit it.)
 
 Now, with pacemaker 1.1.8, all instances become equal
 (for anonymous clones, aka globally-unique=false),
 and we only need to set the score on the resource-id,
 not for all resource-id:instance combinations.

OK.

 Which is great. After all, the master score in this case is attached to
 the node (or, the data set accessible from that node), and not to the
 (arbitrary, potentially relabeled anytime) instance number pacemaker
 assigned to the clone instance running on that node.
 
 
 And that is exactly what your patch does:
  * detect whether the Pacemaker version in use attaches the instance
    number to the resource-id
  * if so, loop over all possible instance numbers as before
  * if not, set the master score on the resource-id only
 
 
 Is my understanding correct?
 Then I think your patch is good.

Yes, the patch seems good then. Though there is quite a bit of
code repetition. The set attribute part should be moved to an
extra function.
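
A rough sketch of what that could look like, with the attribute update in a single helper and the three steps Lars lists around it. This is not the code from the pull request: the function names are made up, detecting the old naming by looking for a ":" in OCF_RESOURCE_INSTANCE is an assumption, and so is reading the number of instances from OCF_RESKEY_CRM_meta_clone_max. Unlike the loop discussed above, it does not skip the current instance.

```shell
# Hypothetical sketch; names and structure are assumptions, not the
# code from the pull request.

# Single place that sets a master score for a given node.
# With DRY_RUN set to a non-empty value, the command is printed
# instead of executed.
set_master_score() {
    # $1 = node name, $2 = attribute name, $3 = score
    ${DRY_RUN:+echo} crm_attribute -l reboot -N "$1" -n "$2" -v "$3"
}

change_master_score() {
    # $1 = target node, $2 = score
    node="$1"
    score="$2"
    res_id="${OCF_RESOURCE_INSTANCE%%:*}"   # resource-id without ":N"

    case "$OCF_RESOURCE_INSTANCE" in
        *:*)
            # Instance numbers are attached: set the score for every
            # possible instance number, including the current one.
            i=0
            while [ "$i" -lt "${OCF_RESKEY_CRM_meta_clone_max:-1}" ]; do
                set_master_score "$node" "master-${res_id}:${i}" "$score"
                i=$((i + 1))
            done
            ;;
        *)
            # No instance numbers: one attribute per resource-id.
            set_master_score "$node" "master-${res_id}" "$score"
            ;;
    esac
}
```

Running it with DRY_RUN=1 shows which attributes would be touched without changing anything in the cluster.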

 Still, other resource agents that use master scores (or any other
 attributes that reference instance numbers of anonymous clones)
 need to be reviewed.
 
 Though this "I'll set scores for other instances, not only myself"
 logic is unique to pgsql, so most other resource agents should just
 work with whatever is present in the environment; they typically treat
 $OCF_RESOURCE_INSTANCE as opaque.

Seems like no other RA uses instance numbers. However, quite a
few use OCF_RESOURCE_INSTANCE which, in case of clone/ms
resources, may potentially lead to unpredictable results on
upgrade to 1.1.8.

 Thanks,
   Lars

Cheers,

Dejan


Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-25 Thread Andrew Beekhof
On Thu, Oct 25, 2012 at 10:01 PM, Takatoshi MATSUO matsuo@gmail.com wrote:
 Usually, we use crm_master command instead of crm_attribute to
 change own master score in RA.
 But PostgreSQL's Slave can't get own replication status, so Master
 changes Slave's master-score
 using instance number on Pacemaker 1.0.x .
 This probably is not ordinary usage.

Ouch!  No, not ordinary (or recommended) at all :-)
What does the crm_attribute command line look like?  Maybe the --node
option could help?




Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-25 Thread Andrew Beekhof
On Fri, Oct 26, 2012 at 12:52 AM, Dejan Muhamedagic de...@suse.de wrote:
 On Thu, Oct 25, 2012 at 06:09:38AM -0700, Lars Ellenberg wrote:
 On Thu, Oct 25, 2012 at 03:38:47AM -0700, Takatoshi MATSUO wrote:
  Usually,  we use crm_master command instead of crm_attribute to change 
  master score in RA.
  But PostgreSQL's slave can't get own replication status, so Master changes 
  Slave's master-score
  using instance number on Pacemaker 1.0.x .
  This probably is not ordinary usage.
 
   Would the existing resource agent work with globally-unique=true ?
 
  I don't know whether it works with true.
  I use it with false, and it doesn't need true.

 I suggested that you actually should use globally-unique clones,
 as in that case you still get those instance numbers...

 Does using different clones make sense in pgsql? What is to be
 different between them? Or would it be just for the sake of
 getting instance numbers? If so, then it somehow looks wrong to
 me :)

 But thinking about it once more, I'm not so sure anymore.

 Correct me where I'm wrong.

 This is about the master score.
 In case the Master instance fails, we preferably want to promote the
 slave instance that is as close as possible to the Master.
 We only know which *node* was best at the last monitoring interval,
 which may be good enough.

 We need to then change the master score for *all possible instances*,
 for all nodes, accordingly.

 Which is what that loop did.
 (I think skipping the current instance is actually a bug;
  if Pacemaker relabels things in a bad way, you may hit it.)

 Now, with pacemaker 1.1.8, all instances become equal
 (for anonymous clones, aka globally-unique=false),
 and we only need to set the score on the resource-id,
 not for all resource-id:instance combinations.

 OK.

 Which is great. After all, the master score in this case is attached to
 the node (or, the data set accessible from that node), and not to the
 (arbitrary, potentially relabeled anytime) instance number pacemaker
 assigned to the clone instance running on that node.


 And that is exactly what your patch does:
  * detect whether the Pacemaker version in use attaches the instance
    number to the resource-id
  * if so, loop over all possible instance numbers as before
  * if not, set the master score on the resource-id only


 Is my understanding correct?
 Then I think your patch is good.

 Yes, the patch seems good then. Though there is quite a bit of
 code repetition. The set attribute part should be moved to an
 extra function.

 Still, other resource agents that use master scores (or any other
 attributes that reference instance numbers of anonymous clones)
 need to be reviewed.

 Though this "I'll set scores for other instances, not only myself"
 logic is unique to pgsql, so most other resource agents should just
 work with whatever is present in the environment; they typically treat
 $OCF_RESOURCE_INSTANCE as opaque.

 Seems like no other RA uses instance numbers. However, quite a
 few use OCF_RESOURCE_INSTANCE which, in case of clone/ms
 resources, may potentially lead to unpredictable results on
 upgrade to 1.1.8.

No. Otherwise all the regression tests would fail.  The PE is smart
enough to find promotion score and failcounts in either case.
Also, OCF_RESOURCE_INSTANCE contains whatever the local lrmd knows the
resource as, not what we call it internally to the PE.


 Thanks,
   Lars

 Cheers,

 Dejan


Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-25 Thread Takatoshi MATSUO
2012/10/26 Andrew Beekhof and...@beekhof.net:
 On Thu, Oct 25, 2012 at 10:01 PM, Takatoshi MATSUO matsuo@gmail.com wrote:
 Usually, we use crm_master command instead of crm_attribute to
 change own master score in RA.
 But PostgreSQL's Slave can't get own replication status, so Master
 changes Slave's master-score
 using instance number on Pacemaker 1.0.x .
 This probably is not ordinary usage.

 Ouch!  No, not ordinary (or recommended) at all :-)
 What does the crm_attribute command line look like?  Maybe the --node
 option could help?

# crm_attribute -l reboot -N pm02 -n master-pgsql:1 -v 1000

This line is based on crm_master.
I would like crm_master to have a parameter for setting the hostname.

But crm_master gets the hostname using the crm_node -n command these days,
so I think I should fix the method of getting the hostname for the next version.
It also needs compatibility code for Pacemaker 1.0.x :(



--
Thanks,
Takatoshi MATSUO


Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-25 Thread Takatoshi MATSUO
2012/10/25 Dejan Muhamedagic de...@suse.de:
 On Thu, Oct 25, 2012 at 06:09:38AM -0700, Lars Ellenberg wrote:
 On Thu, Oct 25, 2012 at 03:38:47AM -0700, Takatoshi MATSUO wrote:
  Usually,  we use crm_master command instead of crm_attribute to change 
  master score in RA.
  But PostgreSQL's slave can't get own replication status, so Master changes 
  Slave's master-score
  using instance number on Pacemaker 1.0.x .
  This probably is not ordinary usage.
 
   Would the existing resource agent work with globally-unique=true ?
 
  I don't know whether it works with true.
  I use it with false, and it doesn't need true.

 I suggested that you actually should use globally-unique clones,
 as in that case you still get those instance numbers...

 Does using different clones make sense in pgsql? What is to be
 different between them? Or would it be just for the sake of
 getting instance numbers? If so, then it somehow looks wrong to
 me :)

It makes no sense to use different clones.
pgsql only uses instance numbers to change the master score on other nodes.
The master score needs them on Pacemaker 1.0.x regardless of globally-unique.


 But thinking about it once more, I'm not so sure anymore.

 Correct me where I'm wrong.

 This is about the master score.
 In case the Master instance fails, we preferably want to promote the
 slave instance that is as close as possible to the Master.
 We only know which *node* was best at the last monitoring interval,
 which may be good enough.

 We need to then change the master score for *all possible instances*,
 for all nodes, accordingly.

 Which is what that loop did.
 (I think skipping the current instance is actually a bug;
  if Pacemaker relabels things in a bad way, you may hit it.)

 Now, with pacemaker 1.1.8, all instances become equal
 (for anonymous clones, aka globally-unique=false),
 and we only need to set the score on the resource-id,
 not for all resource-id:instance combinations.

 OK.

 Which is great. After all, the master score in this case is attached to
 the node (or, the data set accessible from that node), and not to the
 (arbitrary, potentially relabeled anytime) instance number pacemaker
 assigned to the clone instance running on that node.


 And that is exactly what your patch does:
  * detect whether the Pacemaker version in use attaches the instance
    number to the resource-id
  * if so, loop over all possible instance numbers as before
  * if not, set the master score on the resource-id only


 Is my understanding correct?
 Then I think your patch is good.

 Yes, the patch seems good then. Though there is quite a bit of
 code repetition. The set attribute part should be moved to an
 extra function.

I will improve it.


 Still, other resource agents that use master scores (or any other
 attributes that reference instance numbers of anonymous clones)
 need to be reviewed.

 Though this "I'll set scores for other instances, not only myself"
 logic is unique to pgsql, so most other resource agents should just
 work with whatever is present in the environment; they typically treat
 $OCF_RESOURCE_INSTANCE as opaque.

 Seems like no other RA uses instance numbers. However, quite a
 few use OCF_RESOURCE_INSTANCE which, in case of clone/ms
 resources, may potentially lead to unpredictable results on
 upgrade to 1.1.8.

 Thanks,
   Lars

 Cheers,

 Dejan


Thanks,
Takatoshi MATSUO


Re: [Linux-ha-dev] [resource-agents] Low: pgsql: check existence of instance number in replication mode (#159)

2012-10-25 Thread Andrew Beekhof
On Fri, Oct 26, 2012 at 12:49 PM, Takatoshi MATSUO matsuo@gmail.com wrote:
 2012/10/26 Andrew Beekhof and...@beekhof.net:
 On Thu, Oct 25, 2012 at 10:01 PM, Takatoshi MATSUO matsuo@gmail.com wrote:
 Usually, we use crm_master command instead of crm_attribute to
 change own master score in RA.
 But PostgreSQL's Slave can't get own replication status, so Master
 changes Slave's master-score
 using instance number on Pacemaker 1.0.x .
 This probably is not ordinary usage.

 Ouch!  No, not ordinary (or recommended) at all :-)
 What does the crm_attribute command line look like?  Maybe the --node
 option could help?

 # crm_attribute -l reboot  -N pm02 -n master-pgsql:1 -v 1000

That looks fine; just drop the :1 (or use whatever is in OCF_RESOURCE_INSTANCE).


 This line is based on crm_master.
 I would like crm_master to have a parameter for setting the hostname.

Probably not going to happen.  crm_master is a convenience function
for the common use case.
It's fine to switch to crm_attribute for advanced usage.
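
As an illustration only, here is a sketch of the crm_attribute form with the attribute name built from OCF_RESOURCE_INSTANCE instead of a hard-coded :1. The function name master_score_cmd is hypothetical, and it only prints the command line rather than running it, so it can be tried without a cluster.

```shell
# Hypothetical helper: print the crm_attribute call that would set the
# master score for an arbitrary node, taking the resource name from
# OCF_RESOURCE_INSTANCE rather than hard-coding an instance number.
master_score_cmd() {
    # $1 = node name, $2 = score
    echo "crm_attribute -l reboot -N $1 -n master-${OCF_RESOURCE_INSTANCE} -v $2"
}
```

With OCF_RESOURCE_INSTANCE=pgsql, calling master_score_cmd pm02 1000 prints the command from above without the :1. On a version that still appends instance numbers, the variable holds only the local instance's name, so this form alone would not cover the other instances.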


