Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-06-08 Thread Lars Ellenberg
On Mon, Jun 04, 2012 at 11:33:45AM +1000, Andrew Beekhof wrote:
> On Mon, Jun 4, 2012 at 11:28 AM, Andrew Beekhof  wrote:
> > On Fri, May 25, 2012 at 7:48 PM, Florian Haas  wrote:
> >> On Fri, May 25, 2012 at 11:38 AM, Lars Ellenberg
> >>  wrote:
> >>> On Fri, May 25, 2012 at 11:15:32AM +0200, Florian Haas wrote:
>  On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
>   wrote:
>  > Sorry, sent too early.
>  >
>  > That would not catch the case of cluster partitions joining,
>  > only the pacemaker startup with fully connected cluster communication
>  > already up.
>  >
>  > I thought about a dc-priority default of 100,
>  > and only triggering a re-election if I am DC,
>  > my dc-priority is < 50, and I see a node joining.
> 
>  Hardcoded arbitrary defaults aren't that much fun. "You can use any
>  number, but 100 is the magic threshold" is something I wouldn't want
>  to explain to people over and over again.
> >>>
> >>> Then don't ;-)
> >>>
> >>> Not helping, and irrelevant to this case.
> >>>
> >>> Besides, that was an example.
> >>> Easily possible: move the "I want to lose" vs "I want to win"
> >>> magic number to be 0, and allow both positive and negative priorities.
> >>> You get to decide whether positive or negative is the "I'd rather lose"
> >>> side. Want to make that configurable as well? Right.
> >>
> >> Nope, 0 is used as a threshold value in Pacemaker all over the place.
> >> So allowing both positive and negative priorities and making 0 the
> >> default sounds perfectly sane to me.
> >>
> >>> I don't think this can be made part of the cib configuration,
> >>> DC election takes place before cibs are resynced, so if you have
> >>> diverging cibs, you possibly end up with a never ending election?
> >>>
> >>> Then maybe the election is stable enough,
> >>> even after this change to the algorithm.
> >>
> >> Andrew?
> >
> > Probably.  The preferences are not going to be rapidly changing, so
> > there is no reason to suspect it would destabilise things.
> 
> Oh, you mean if the values are stored in the CIB?
> Yeah, I guess you could have issues if you changed the CIB during a
> cluster partition... don't do that?

Right. That was my concern.
So I'd rather not add them to the cib,
but get them from environment variables.
Which means I would need to restart the local stack if I wanted
to change the preference. Good enough.
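For illustration, reading such a preference from the environment is tiny; a minimal sketch, assuming an invented variable name PCMK_dc_priority (not an existing Pacemaker option) and the 0 default Florian suggested, where anything unset or malformed counts as "no preference":

```c
#include <stdlib.h>

/* Sketch only: PCMK_dc_priority is an invented variable name, not an
 * existing Pacemaker option.  Unset or non-numeric values fall back to
 * 0, the proposed "no preference" default. */
static int read_dc_priority(void)
{
    const char *val = getenv("PCMK_dc_priority");
    char *end = NULL;
    long prio;

    if (val == NULL || *val == '\0') {
        return 0;
    }
    prio = strtol(val, &end, 10);
    if (*end != '\0') {
        return 0;               /* trailing garbage: treat as unset */
    }
    return (int) prio;
}
```

Since the value would be read once at daemon start, changing it does indeed require restarting the local stack.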

> Honestly though, given the number (1? 2? 0?) of sites in the world
> that actually need this, my main criterion for a successful patch is
> "not screwing it up for everyone else".
> Which certainly rules out starting elections just because someone
> joined.  Although "I've just started and have a non-zero preference so
> I'm going to force an election" would be fine.

Thanks.
I'll see what the current status of that patch is, and if we can prepare
a patch to be considered for upstream inclusion.
May take a while though, due to round trip times ;-)


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-06-03 Thread Andrew Beekhof
On Mon, Jun 4, 2012 at 11:28 AM, Andrew Beekhof  wrote:
> On Fri, May 25, 2012 at 7:48 PM, Florian Haas  wrote:
>> On Fri, May 25, 2012 at 11:38 AM, Lars Ellenberg
>>  wrote:
>>> On Fri, May 25, 2012 at 11:15:32AM +0200, Florian Haas wrote:
 On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
  wrote:
 > Sorry, sent too early.
 >
 > That would not catch the case of cluster partitions joining,
 > only the pacemaker startup with fully connected cluster communication
 > already up.
 >
 > I thought about a dc-priority default of 100,
 > and only triggering a re-election if I am DC,
 > my dc-priority is < 50, and I see a node joining.

 Hardcoded arbitrary defaults aren't that much fun. "You can use any
 number, but 100 is the magic threshold" is something I wouldn't want
 to explain to people over and over again.
>>>
>>> Then don't ;-)
>>>
>>> Not helping, and irrelevant to this case.
>>>
>>> Besides, that was an example.
>>> Easily possible: move the "I want to lose" vs "I want to win"
>>> magic number to be 0, and allow both positive and negative priorities.
>>> You get to decide whether positive or negative is the "I'd rather lose"
>>> side. Want to make that configurable as well? Right.
>>
>> Nope, 0 is used as a threshold value in Pacemaker all over the place.
>> So allowing both positive and negative priorities and making 0 the
>> default sounds perfectly sane to me.
>>
>>> I don't think this can be made part of the cib configuration,
>>> DC election takes place before cibs are resynced, so if you have
>>> diverging cibs, you possibly end up with a never ending election?
>>>
>>> Then maybe the election is stable enough,
>>> even after this change to the algorithm.
>>
>> Andrew?
>
> Probably.  The preferences are not going to be rapidly changing, so
> there is no reason to suspect it would destabilise things.

Oh, you mean if the values are stored in the CIB?
Yeah, I guess you could have issues if you changed the CIB during a
cluster partition... don't do that?

Honestly though, given the number (1? 2? 0?) of sites in the world
that actually need this, my main criterion for a successful patch is
"not screwing it up for everyone else".
Which certainly rules out starting elections just because someone
joined.  Although "I've just started and have a non-zero preference so
I'm going to force an election" would be fine.

>
>>
>>> But you'd need to add another trigger on "dc-priority in configuration
>>> changed", complicating this stuff for no reason.
>>>
 We actually discussed node defaults a while back. Those would be
 similar to resource and op defaults which Pacemaker already has, and
 set defaults for node attributes for newly joined nodes. At the time
 the idea was to support putting new joiners in standby mode by
 default, so when you added a node in a symmetric cluster, you wouldn't
 need to be afraid that Pacemaker would shuffle resources around.[1]
 This dc-priority would be another possibly useful use case for this.
>>>
>>> Not so sure about that.
>>>
 [1] Yes, semi-doable with putting the cluster into maintenance mode
 before firing up the new node, setting that node into standby, and
 then unsetting maintenance mode. But that's just an additional step
 that users can easily forget about.
>>>
>>> Why not simply add the node to the cib, and set it to standby,
>>> before it even joins for the first time?
>>
>> Haha, good one.
>>
>> Wait, you weren't joking?
>>
>> Florian


Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-06-03 Thread Andrew Beekhof
On Fri, May 25, 2012 at 7:48 PM, Florian Haas  wrote:
> On Fri, May 25, 2012 at 11:38 AM, Lars Ellenberg
>  wrote:
>> On Fri, May 25, 2012 at 11:15:32AM +0200, Florian Haas wrote:
>>> On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
>>>  wrote:
>>> > Sorry, sent too early.
>>> >
>>> > That would not catch the case of cluster partitions joining,
>>> > only the pacemaker startup with fully connected cluster communication
>>> > already up.
>>> >
>>> > I thought about a dc-priority default of 100,
>>> > and only triggering a re-election if I am DC,
>>> > my dc-priority is < 50, and I see a node joining.
>>>
>>> Hardcoded arbitrary defaults aren't that much fun. "You can use any
>>> number, but 100 is the magic threshold" is something I wouldn't want
>>> to explain to people over and over again.
>>
>> Then don't ;-)
>>
>> Not helping, and irrelevant to this case.
>>
>> Besides, that was an example.
>> Easily possible: move the "I want to lose" vs "I want to win"
>> magic number to be 0, and allow both positive and negative priorities.
>> You get to decide whether positive or negative is the "I'd rather lose"
>> side. Want to make that configurable as well? Right.
>
> Nope, 0 is used as a threshold value in Pacemaker all over the place.
> So allowing both positive and negative priorities and making 0 the
> default sounds perfectly sane to me.
>
>> I don't think this can be made part of the cib configuration,
>> DC election takes place before cibs are resynced, so if you have
>> diverging cibs, you possibly end up with a never ending election?
>>
>> Then maybe the election is stable enough,
>> even after this change to the algorithm.
>
> Andrew?

Probably.  The preferences are not going to be rapidly changing, so
there is no reason to suspect it would destabilise things.

>
>> But you'd need to add another trigger on "dc-priority in configuration
>> changed", complicating this stuff for no reason.
>>
>>> We actually discussed node defaults a while back. Those would be
>>> similar to resource and op defaults which Pacemaker already has, and
>>> set defaults for node attributes for newly joined nodes. At the time
>>> the idea was to support putting new joiners in standby mode by
>>> default, so when you added a node in a symmetric cluster, you wouldn't
>>> need to be afraid that Pacemaker would shuffle resources around.[1]
>>> This dc-priority would be another possibly useful use case for this.
>>
>> Not so sure about that.
>>
>>> [1] Yes, semi-doable with putting the cluster into maintenance mode
>>> before firing up the new node, setting that node into standby, and
>>> then unsetting maintenance mode. But that's just an additional step
>>> that users can easily forget about.
>>
>> Why not simply add the node to the cib, and set it to standby,
>> before it even joins for the first time?
>
> Haha, good one.
>
> Wait, you weren't joking?
>
> Florian
>


Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-25 Thread Lars Ellenberg
On Fri, May 25, 2012 at 09:05:54PM +1000, Andrew Beekhof wrote:
> On Fri, May 25, 2012 at 7:48 PM, Florian Haas  wrote:
> > On Fri, May 25, 2012 at 11:38 AM, Lars Ellenberg
> >  wrote:
> >> On Fri, May 25, 2012 at 11:15:32AM +0200, Florian Haas wrote:
> >>> On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
> >>>  wrote:
> >>> > Sorry, sent too early.
> >>> >
> >>> > That would not catch the case of cluster partitions joining,
> >>> > only the pacemaker startup with fully connected cluster communication
> >>> > already up.
> >>> >
> >>> > I thought about a dc-priority default of 100,
> >>> > and only triggering a re-election if I am DC,
> >>> > my dc-priority is < 50, and I see a node joining.
> >>>
> >>> Hardcoded arbitrary defaults aren't that much fun. "You can use any
> >>> number, but 100 is the magic threshold" is something I wouldn't want
> >>> to explain to people over and over again.
> >>
> >> Then don't ;-)
> >>
> >> Not helping, and irrelevant to this case.
> >>
> >> Besides, that was an example.
> >> Easily possible: move the "I want to lose" vs "I want to win"
> >> magic number to be 0, and allow both positive and negative priorities.
> >> You get to decide whether positive or negative is the "I'd rather lose"
> >> side. Want to make that configurable as well? Right.
> >
> > Nope, 0 is used as a threshold value in Pacemaker all over the place.
> > So allowing both positive and negative priorities and making 0 the
> > default sounds perfectly sane to me.
> >
> >> I don't think this can be made part of the cib configuration,
> >> DC election takes place before cibs are resynced, so if you have
> >> diverging cibs, you possibly end up with a never ending election?
> >>
> >> Then maybe the election is stable enough,
> >> even after this change to the algorithm.
> >
> > Andrew?
> 
> This whole thread makes me want to hurt kittens.

Yep...

Sorry for that :(

Lars



Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-25 Thread Andrew Beekhof
On Fri, May 25, 2012 at 7:48 PM, Florian Haas  wrote:
> On Fri, May 25, 2012 at 11:38 AM, Lars Ellenberg
>  wrote:
>> On Fri, May 25, 2012 at 11:15:32AM +0200, Florian Haas wrote:
>>> On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
>>>  wrote:
>>> > Sorry, sent too early.
>>> >
>>> > That would not catch the case of cluster partitions joining,
>>> > only the pacemaker startup with fully connected cluster communication
>>> > already up.
>>> >
>>> > I thought about a dc-priority default of 100,
>>> > and only triggering a re-election if I am DC,
>>> > my dc-priority is < 50, and I see a node joining.
>>>
>>> Hardcoded arbitrary defaults aren't that much fun. "You can use any
>>> number, but 100 is the magic threshold" is something I wouldn't want
>>> to explain to people over and over again.
>>
>> Then don't ;-)
>>
>> Not helping, and irrelevant to this case.
>>
>> Besides, that was an example.
>> Easily possible: move the "I want to lose" vs "I want to win"
>> magic number to be 0, and allow both positive and negative priorities.
>> You get to decide whether positive or negative is the "I'd rather lose"
>> side. Want to make that configurable as well? Right.
>
> Nope, 0 is used as a threshold value in Pacemaker all over the place.
> So allowing both positive and negative priorities and making 0 the
> default sounds perfectly sane to me.
>
>> I don't think this can be made part of the cib configuration,
>> DC election takes place before cibs are resynced, so if you have
>> diverging cibs, you possibly end up with a never ending election?
>>
>> Then maybe the election is stable enough,
>> even after this change to the algorithm.
>
> Andrew?

This whole thread makes me want to hurt kittens.

>
>> But you'd need to add another trigger on "dc-priority in configuration
>> changed", complicating this stuff for no reason.
>>
>>> We actually discussed node defaults a while back. Those would be
>>> similar to resource and op defaults which Pacemaker already has, and
>>> set defaults for node attributes for newly joined nodes. At the time
>>> the idea was to support putting new joiners in standby mode by
>>> default, so when you added a node in a symmetric cluster, you wouldn't
>>> need to be afraid that Pacemaker would shuffle resources around.[1]
>>> This dc-priority would be another possibly useful use case for this.
>>
>> Not so sure about that.
>>
>>> [1] Yes, semi-doable with putting the cluster into maintenance mode
>>> before firing up the new node, setting that node into standby, and
>>> then unsetting maintenance mode. But that's just an additional step
>>> that users can easily forget about.
>>
>> Why not simply add the node to the cib, and set it to standby,
>> before it even joins for the first time?
>
> Haha, good one.
>
> Wait, you weren't joking?
>
> Florian
>


Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-25 Thread Florian Haas
On Fri, May 25, 2012 at 11:38 AM, Lars Ellenberg
 wrote:
> On Fri, May 25, 2012 at 11:15:32AM +0200, Florian Haas wrote:
>> On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
>>  wrote:
>> > Sorry, sent too early.
>> >
>> > That would not catch the case of cluster partitions joining,
>> > only the pacemaker startup with fully connected cluster communication
>> > already up.
>> >
>> > I thought about a dc-priority default of 100,
>> > and only triggering a re-election if I am DC,
>> > my dc-priority is < 50, and I see a node joining.
>>
>> Hardcoded arbitrary defaults aren't that much fun. "You can use any
>> number, but 100 is the magic threshold" is something I wouldn't want
>> to explain to people over and over again.
>
> Then don't ;-)
>
> Not helping, and irrelevant to this case.
>
> Besides, that was an example.
> Easily possible: move the "I want to lose" vs "I want to win"
> magic number to be 0, and allow both positive and negative priorities.
> You get to decide whether positive or negative is the "I'd rather lose"
> side. Want to make that configurable as well? Right.

Nope, 0 is used as a threshold value in Pacemaker all over the place.
So allowing both positive and negative priorities and making 0 the
default sounds perfectly sane to me.

> I don't think this can be made part of the cib configuration,
> DC election takes place before cibs are resynced, so if you have
> diverging cibs, you possibly end up with a never ending election?
>
> Then maybe the election is stable enough,
> even after this change to the algorithm.

Andrew?

> But you'd need to add another trigger on "dc-priority in configuration
> changed", complicating this stuff for no reason.
>
>> We actually discussed node defaults a while back. Those would be
>> similar to resource and op defaults which Pacemaker already has, and
>> set defaults for node attributes for newly joined nodes. At the time
>> the idea was to support putting new joiners in standby mode by
>> default, so when you added a node in a symmetric cluster, you wouldn't
>> need to be afraid that Pacemaker would shuffle resources around.[1]
>> This dc-priority would be another possibly useful use case for this.
>
> Not so sure about that.
>
>> [1] Yes, semi-doable with putting the cluster into maintenance mode
>> before firing up the new node, setting that node into standby, and
>> then unsetting maintenance mode. But that's just an additional step
>> that users can easily forget about.
>
> Why not simply add the node to the cib, and set it to standby,
> before it even joins for the first time?

Haha, good one.

Wait, you weren't joking?

Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now



Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-25 Thread Lars Ellenberg
On Fri, May 25, 2012 at 11:15:32AM +0200, Florian Haas wrote:
> On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
>  wrote:
> > Sorry, sent too early.
> >
> > That would not catch the case of cluster partitions joining,
> > only the pacemaker startup with fully connected cluster communication
> > already up.
> >
> > I thought about a dc-priority default of 100,
> > and only triggering a re-election if I am DC,
> > my dc-priority is < 50, and I see a node joining.
> 
> Hardcoded arbitrary defaults aren't that much fun. "You can use any
> number, but 100 is the magic threshold" is something I wouldn't want
> to explain to people over and over again.

Then don't ;-)

Not helping, and irrelevant to this case.

Besides, that was an example.
Easily possible: move the "I want to lose" vs "I want to win"
magic number to be 0, and allow both positive and negative priorities.
You get to decide whether positive or negative is the "I'd rather lose"
side. Want to make that configurable as well? Right.
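To make the zero-centred variant concrete, a sketch of the comparison (invented names, not the patch's actual election code): the higher signed dc-priority wins, and equal priorities fall through to whatever tie-breaker the election already uses.

```c
/* Illustrative sketch, not the patch's election code.  With a
 * zero-centred dc-priority, negative means "I'd rather lose", positive
 * "I'd rather win", and the default 0 preserves today's behaviour.
 * Only a tie falls through to the existing tie-breaker (node id here
 * stands in for it). */
static int election_wins(int my_prio, int my_id, int peer_prio, int peer_id)
{
    if (my_prio != peer_prio) {
        return my_prio > peer_prio;     /* higher signed priority wins */
    }
    return my_id > peer_id;             /* unchanged tie-breaker */
}
```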

I don't think this can be made part of the cib configuration:
DC election takes place before cibs are resynced, so if you have
diverging cibs, you possibly end up with a never-ending election?

Then maybe the election is stable enough,
even after this change to the algorithm.

But you'd need to add another trigger on "dc-priority in configuration
changed", complicating this stuff for no reason.

> We actually discussed node defaults a while back. Those would be
> similar to resource and op defaults which Pacemaker already has, and
> set defaults for node attributes for newly joined nodes. At the time
> the idea was to support putting new joiners in standby mode by
> default, so when you added a node in a symmetric cluster, you wouldn't
> need to be afraid that Pacemaker would shuffle resources around.[1]
> This dc-priority would be another possibly useful use case for this.

Not so sure about that.

> [1] Yes, semi-doable with putting the cluster into maintenance mode
> before firing up the new node, setting that node into standby, and
> then unsetting maintenance mode. But that's just an additional step
> that users can easily forget about.

Why not simply add the node to the cib, and set it to standby,
before it even joins for the first time?



Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-25 Thread Florian Haas
On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
 wrote:
> Sorry, sent too early.
>
> That would not catch the case of cluster partitions joining,
> only the pacemaker startup with fully connected cluster communication
> already up.
>
> I thought about a dc-priority default of 100,
> and only triggering a re-election if I am DC,
> my dc-priority is < 50, and I see a node joining.

Hardcoded arbitrary defaults aren't that much fun. "You can use any
number, but 100 is the magic threshold" is something I wouldn't want
to explain to people over and over again.

We actually discussed node defaults a while back. Those would be
similar to resource and op defaults which Pacemaker already has, and
set defaults for node attributes for newly joined nodes. At the time
the idea was to support putting new joiners in standby mode by
default, so when you added a node in a symmetric cluster, you wouldn't
need to be afraid that Pacemaker would shuffle resources around.[1]
This dc-priority would be another possibly useful use case for this.

Just my two cents.
Florian

[1] Yes, semi-doable with putting the cluster into maintenance mode
before firing up the new node, setting that node into standby, and
then unsetting maintenance mode. But that's just an additional step
that users can easily forget about.



Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-25 Thread Lars Ellenberg
On Fri, May 25, 2012 at 10:29:58AM +0200, Lars Ellenberg wrote:
> On Fri, May 25, 2012 at 10:50:25AM +1000, Andrew Beekhof wrote:
> > On Fri, May 25, 2012 at 10:04 AM, Lars Ellenberg
> >  wrote:
> > > On Sun, May 06, 2012 at 09:45:09PM +1000, Andrew Beekhof wrote:
> > >> On Thu, May 3, 2012 at 5:38 PM, Lars Ellenberg
> > >>  wrote:
> > >> >
> > >> > People sometimes think they have a use case
> > >> > for influencing which node will be the DC.
> > >>
> > >> Agreed :-)
> > >>
> > >> >
> > >> > Sometimes it is latency (certain cli commands work faster
> > >> > when done on the DC),
> > >>
> > >> Config changes can be run against any node, there is no reason to go
> > >> to the one on the DC.
> > >>
> > >> > sometimes they add a "mostly quorum"
> > >> > node which may be not quite up to the task of being DC.
> > >>
> > >> I'm not sure I buy that.  Most of the load would come from the
> > >> resources themselves.
> > >>
> > >> > Prohibiting a node from becoming DC completely would
> > >> > mean it cannot even be cleanly shut down (with 1.0.x, no MCP),
> > >> > or act on its own resources for certain no-quorum policies.
> > >> >
> > >> > So here is a patch I have been asked to present for discussion,
> > >>
> > >> May one ask where it originated?
> > >>
> > >> > against Pacemaker 1.0, that introduces a "dc-prio" configuration
> > >> > parameter, which will add some skew to the election algorithm.
> > >> >
> > >> >
> > >> > Open questions:
> > >> >  * does it make sense at all?
> > >>
> > >> Doubtful :-)
> > >>
> > >> >
> > >> >  * election algorithm compatibility, stability:
> > >> >   will the election be correct if some nodes have this patch,
> > >> >   and some don't ?
> > >>
> > >> Unlikely, but you could easily make it so by placing it after the
> > >> version check (and bumping said version in the patch)
> > >>
> > >> >  * How can it be improved so that a node with dc-prio=0 will
> > >> >   "give up" its DC-role as soon as there is at least one other node
> > >> >   with dc-prio > 0?
> > >>
> > >> Short of causing an election every time a node joins... I doubt it.
> > >
> > > Where would be a suitable place in the code/fsa to do so?
> > 
> > Just after the call to exit(0) :)
> 
> Just what I thought ;-)
> 
> > I'd do it at the end of do_started() but only if dc-priority* > 0.
> > That way you only cause an election if someone who is likely to win it 
> > starts.
> > And people that don't enable this feature are unaffected.

Sorry, sent too early.

That would not catch the case of cluster partitions joining,
only the pacemaker startup with fully connected cluster communication
already up.

I thought about a dc-priority default of 100,
and only triggering a re-election if I am DC,
my dc-priority is < 50, and I see a node joining.

That would then happen in handle_request():

    /* == DC-Only Actions == */
    if (AM_I_DC) {
        if (strcmp(op, CRM_OP_JOIN_ANNOUNCE) == 0) {
            if ( *** new logic goes here *** )
                return I_ELECTION;
            else
                return I_NODE_JOIN;
        }
    }

Of course, we could even add the dc-priority to the CRM_OP_JOIN_ANNOUNCE
message, so we only trigger an election if we are likely to lose.
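Using the zero-centred priorities floated earlier in the thread, and assuming the joiner's value really were carried in the announce message (today it is not), the check could be as small as:

```c
/* Sketch of that "new logic", under two assumptions: priorities are
 * zero-centred, and CRM_OP_JOIN_ANNOUNCE has been extended to carry the
 * joiner's dc-priority (it has not been).  We only re-elect when we are
 * DC, dislike the job (negative priority), and the joiner would likely
 * beat us. */
enum { I_NODE_JOIN = 0, I_ELECTION = 1 };

static int on_join_announce(int am_i_dc, int my_prio, int joiner_prio)
{
    if (am_i_dc && my_prio < 0 && joiner_prio > my_prio) {
        return I_ELECTION;      /* likely to lose: trigger re-election */
    }
    return I_NODE_JOIN;         /* normal join, DC unchanged */
}
```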



Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-25 Thread Lars Ellenberg
On Fri, May 25, 2012 at 10:50:25AM +1000, Andrew Beekhof wrote:
> On Fri, May 25, 2012 at 10:04 AM, Lars Ellenberg
>  wrote:
> > On Sun, May 06, 2012 at 09:45:09PM +1000, Andrew Beekhof wrote:
> >> On Thu, May 3, 2012 at 5:38 PM, Lars Ellenberg
> >>  wrote:
> >> >
> >> > People sometimes think they have a use case
> >> > for influencing which node will be the DC.
> >>
> >> Agreed :-)
> >>
> >> >
> >> > Sometimes it is latency (certain cli commands work faster
> >> > when done on the DC),
> >>
> >> Config changes can be run against any node, there is no reason to go
> >> to the one on the DC.
> >>
> >> > sometimes they add a "mostly quorum"
> >> > node which may be not quite up to the task of being DC.
> >>
> >> I'm not sure I buy that.  Most of the load would come from the
> >> resources themselves.
> >>
> >> > Prohibiting a node from becoming DC completely would
> >> > mean it cannot even be cleanly shut down (with 1.0.x, no MCP),
> >> > or act on its own resources for certain no-quorum policies.
> >> >
> >> > So here is a patch I have been asked to present for discussion,
> >>
> >> May one ask where it originated?
> >>
> >> > against Pacemaker 1.0, that introduces a "dc-prio" configuration
> >> > parameter, which will add some skew to the election algorithm.
> >> >
> >> >
> >> > Open questions:
> >> >  * does it make sense at all?
> >>
> >> Doubtful :-)
> >>
> >> >
> >> >  * election algorithm compatibility, stability:
> >> >   will the election be correct if some nodes have this patch,
> >> >   and some don't ?
> >>
> >> Unlikely, but you could easily make it so by placing it after the
> >> version check (and bumping said version in the patch)
> >>
> >> >  * How can it be improved so that a node with dc-prio=0 will
> >> >   "give up" its DC-role as soon as there is at least one other node
> >> >   with dc-prio > 0?
> >>
> >> Short of causing an election every time a node joins... I doubt it.
> >
> > Where would be a suitable place in the code/fsa to do so?
> 
> Just after the call to exit(0) :)

Just what I thought ;-)

> I'd do it at the end of do_started() but only if dc-priority* > 0.
> That way you only cause an election if someone who is likely to win it starts.
> And people that don't enable this feature are unaffected.
>
> * Not dc-prio, it's 2012, there's no need to save the extra 4 chars :-)

Thanks,



Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-24 Thread Andrew Beekhof
On Fri, May 25, 2012 at 10:04 AM, Lars Ellenberg
 wrote:
> On Sun, May 06, 2012 at 09:45:09PM +1000, Andrew Beekhof wrote:
>> On Thu, May 3, 2012 at 5:38 PM, Lars Ellenberg
>>  wrote:
>> >
>> > People sometimes think they have a use case
>> > for influencing which node will be the DC.
>>
>> Agreed :-)
>>
>> >
>> > Sometimes it is latency (certain cli commands work faster
>> > when done on the DC),
>>
>> Config changes can be run against any node, there is no reason to go
>> to the one on the DC.
>>
>> > sometimes they add a "mostly quorum"
>> > node which may be not quite up to the task of being DC.
>>
>> I'm not sure I buy that.  Most of the load would come from the
>> resources themselves.
>>
>> > Prohibiting a node from becoming DC completely would
>> > mean it cannot even be cleanly shut down (with 1.0.x, no MCP),
>> > or act on its own resources for certain no-quorum policies.
>> >
>> > So here is a patch I have been asked to present for discussion,
>>
>> May one ask where it originated?
>>
>> > against Pacemaker 1.0, that introduces a "dc-prio" configuration
>> > parameter, which will add some skew to the election algorithm.
>> >
>> >
>> > Open questions:
>> >  * does it make sense at all?
>>
>> Doubtful :-)
>>
>> >
>> >  * election algorithm compatibility, stability:
>> >   will the election be correct if some nodes have this patch,
> >> >   and some don't?
>>
>> Unlikely, but you could easily make it so by placing it after the
>> version check (and bumping said version in the patch)
>>
>> >  * How can it be improved so that a node with dc-prio=0 will
>> >   "give up" its DC-role as soon as there is at least one other node
>> >   with dc-prio > 0?
>>
>> Short of causing an election every time a node joins... I doubt it.
>
> Where would be a suitable place in the code/fsa to do so?

Just after the call to exit(0) :)

I'd do it at the end of do_started() but only if dc-priority* > 0.
That way you only cause an election if someone who is likely to win it starts.
And people that don't enable this feature are unaffected.

* Not dc-prio, it's 2012, there's no need to save the extra 4 chars :-)
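The suggestion above can be sketched in standalone C. This is an illustration only, not crmd code: the helper names (read_dc_priority, should_trigger_election_on_start) are invented for the sketch, but the parsing mirrors the patch's HA_dc_prio handling, defaulting to 1 when the option is unset.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative sketch: at the end of do_started(), an election would
 * be triggered only when the local dc-priority is positive, so nodes
 * that could never win do not force a re-election on startup. */
static int read_dc_priority(const char *env_val)
{
    /* default to 1 when the option is unset, as in the patch */
    return (env_val == NULL) ? 1 : atoi(env_val);
}

static int should_trigger_election_on_start(int dc_priority)
{
    /* only a node that is likely to win should cause an election */
    return dc_priority > 0;
}
```

With those defaults, an unconfigured node behaves exactly as today, which is what keeps the feature opt-in.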



Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-24 Thread Lars Ellenberg
On Sun, May 06, 2012 at 09:45:09PM +1000, Andrew Beekhof wrote:
> On Thu, May 3, 2012 at 5:38 PM, Lars Ellenberg
>  wrote:
> >
> > People sometimes think they have a use case
> > for influencing which node will be the DC.
> 
> Agreed :-)
> 
> >
> > Sometimes it is latency (certain cli commands work faster
> > when done on the DC),
> 
> Config changes can be run against any node, there is no reason to go
> to the one on the DC.
> 
> > sometimes they add a "mostly quorum"
> > node which may not be quite up to the task of being DC.
> 
> I'm not sure I buy that.  Most of the load would come from the
> resources themselves.
> 
> > Prohibiting a node from becoming DC completely would
> > mean it cannot even be cleanly shut down (with 1.0.x, no MCP),
> > or act on its own resources for certain no-quorum policies.
> >
> > So here is a patch I have been asked to present for discussion,
> 
> May one ask where it originated?
> 
> > against Pacemaker 1.0, that introduces a "dc-prio" configuration
> > parameter, which will add some skew to the election algorithm.
> >
> >
> > Open questions:
> >  * does it make sense at all?
> 
> Doubtful :-)
> 
> >
> >  * election algorithm compatibility, stability:
> >   will the election be correct if some nodes have this patch,
> >   and some don't?
> 
> Unlikely, but you could easily make it so by placing it after the
> version check (and bumping said version in the patch)
> 
> >  * How can it be improved so that a node with dc-prio=0 will
> >   "give up" its DC-role as soon as there is at least one other node
> >   with dc-prio > 0?
> 
> Short of causing an election every time a node joins... I doubt it.

Where would be a suitable place in the code/fsa to do so?

Thanks,

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com



Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-06 Thread Andrew Beekhof
On Thu, May 3, 2012 at 5:38 PM, Lars Ellenberg
 wrote:
>
> People sometimes think they have a use case
> for influencing which node will be the DC.

Agreed :-)

>
> Sometimes it is latency (certain cli commands work faster
> when done on the DC),

Config changes can be run against any node, there is no reason to go
to the one on the DC.

> sometimes they add a "mostly quorum"
> node which may not be quite up to the task of being DC.

I'm not sure I buy that.  Most of the load would come from the
resources themselves.

> Prohibiting a node from becoming DC completely would
> mean it cannot even be cleanly shut down (with 1.0.x, no MCP),
> or act on its own resources for certain no-quorum policies.
>
> So here is a patch I have been asked to present for discussion,

May one ask where it originated?

> against Pacemaker 1.0, that introduces a "dc-prio" configuration
> parameter, which will add some skew to the election algorithm.
>
>
> Open questions:
>  * does it make sense at all?

Doubtful :-)

>
>  * election algorithm compatibility, stability:
>   will the election be correct if some nodes have this patch,
>   and some don't?

Unlikely, but you could easily make it so by placing it after the
version check (and bumping said version in the patch)

>  * How can it be improved so that a node with dc-prio=0 will
>   "give up" its DC-role as soon as there is at least one other node
>   with dc-prio > 0?

Short of causing an election every time a node joins... I doubt it.

>        Lars
>
>
> --- ./crmd/election.c.orig      2011-05-11 11:36:05.577329600 +0200
> +++ ./crmd/election.c   2011-05-12 13:49:04.671484200 +0200
> @@ -29,6 +29,7 @@
>  GHashTable *voted = NULL;
>  uint highest_born_on = -1;
>  static int current_election_id = 1;
> +static int our_dc_prio = -1;
>
>  /*     A_ELECTION_VOTE */
>  void
> @@ -55,6 +56,20 @@
>                        break;
>        }
>
> +       if (our_dc_prio < 0) {
> +                       char * dc_prio_str = getenv("HA_dc_prio");
> +
> +                       if (dc_prio_str == NULL) {
> +                               our_dc_prio = 1;
> +                       } else {
> +                               our_dc_prio = atoi(dc_prio_str);
> +                       }
> +       }
> +
> +       if (!our_dc_prio) {
> +               not_voting = TRUE;
> +       }
> +
>        if(not_voting == FALSE) {
>                if(is_set(fsa_input_register, R_STARTING)) {
>                        not_voting = TRUE;
> @@ -72,12 +87,13 @@
>        }
>
>        vote = create_request(
> -               CRM_OP_VOTE, NULL, NULL,
> +               our_dc_prio?CRM_OP_VOTE:CRM_OP_NOVOTE, NULL, NULL,
>                CRM_SYSTEM_CRMD, CRM_SYSTEM_CRMD, NULL);
>
>        current_election_id++;
>        crm_xml_add(vote, F_CRM_ELECTION_OWNER, fsa_our_uuid);
>        crm_xml_add_int(vote, F_CRM_ELECTION_ID, current_election_id);
> +       crm_xml_add_int(vote, F_CRM_DC_PRIO, our_dc_prio);
>
>        send_cluster_message(NULL, crm_msg_crmd, vote, TRUE);
>        free_xml(vote);
> @@ -188,6 +204,7 @@
>                       fsa_data_t *msg_data)
>  {
>        int election_id = -1;
> +       int your_dc_prio = 1;
>        int log_level = LOG_INFO;
>        gboolean done = FALSE;
>        gboolean we_loose = FALSE;
> @@ -216,6 +233,17 @@
>        your_version   = crm_element_value(vote->msg, F_CRM_VERSION);
>        election_owner = crm_element_value(vote->msg, F_CRM_ELECTION_OWNER);
>        crm_element_value_int(vote->msg, F_CRM_ELECTION_ID, &election_id);
> +       crm_element_value_int(vote->msg, F_CRM_DC_PRIO, &your_dc_prio);
> +
> +       if (our_dc_prio < 0) {
> +               char * dc_prio_str = getenv("HA_dc_prio");
> +
> +               if (dc_prio_str == NULL) {
> +                       our_dc_prio = 1;
> +               } else {
> +                       our_dc_prio = atoi(dc_prio_str);
> +               }
> +       }
>
>        CRM_CHECK(vote_from != NULL, vote_from = fsa_our_uname);
>
> @@ -269,6 +297,13 @@
>            reason = "Recorded";
>            done = TRUE;
>
> +       } else if(our_dc_prio < your_dc_prio) {
> +           reason = "DC Prio";
> +           we_loose = TRUE;
> +
> +       } else if(our_dc_prio > your_dc_prio) {
> +           reason = "DC Prio";
> +
>        } else if(compare_version(your_version, CRM_FEATURE_SET) < 0) {
>            reason = "Version";
>            we_loose = TRUE;
> @@ -328,6 +363,7 @@
>
>                crm_xml_add(novote, F_CRM_ELECTION_OWNER, election_owner);
>                crm_xml_add_int(novote, F_CRM_ELECTION_ID, election_id);
> +               crm_xml_add_int(novote, F_CRM_DC_PRIO, our_dc_prio);
>
>                send_cluster_message(vote_from, crm_msg_crmd, novote, TRUE);
>                free_xml(novote);
> --- ./include/crm/msg_xml.h.orig        2011-05-11 18:22:08.061726000 +0200
> +++ ./include/crm/msg_xml.h     2011-05-11 18:24:17.405132000 +0200
> @@ -32,6 +32,7 @@
>  #define F_CRM_ORIGIN                   "origin"
>  #define F_CR

[Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)

2012-05-03 Thread Lars Ellenberg

People sometimes think they have a use case
for influencing which node will be the DC.

Sometimes it is latency (certain cli commands work faster
when done on the DC), sometimes they add a "mostly quorum"
node which may not be quite up to the task of being DC.


Prohibiting a node from becoming DC completely would
mean it cannot even be cleanly shut down (with 1.0.x, no MCP),
or act on its own resources for certain no-quorum policies.

So here is a patch I have been asked to present for discussion,
against Pacemaker 1.0, that introduces a "dc-prio" configuration
parameter, which will add some skew to the election algorithm.
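The skew amounts to weighing dc-prio before the existing comparisons in do_election_count_vote(). A minimal standalone model of that ordering (illustrative only; the real comparison covers more fields, and struct/function names here are invented): note that in the version step the *older* feature set wins, matching the patch's existing compare_version() branch, so the DC never uses features its peers lack.

```c
#include <assert.h>
#include <string.h>

/* Simplified model of the patched vote comparison. dc_prio is checked
 * first, then a stand-in for the CRM feature set (lower wins, as in
 * Pacemaker), then the node name as a deterministic tie-breaker. */
struct vote {
    int dc_prio;        /* defaults to 1; 0 means "never become DC" */
    int feature_set;    /* stand-in for the CRM_FEATURE_SET check */
    const char *uname;  /* node name, final tie-breaker */
};

/* returns 1 if `us` wins the election against `them`, else 0 */
static int we_win(const struct vote *us, const struct vote *them)
{
    if (us->dc_prio != them->dc_prio)
        return us->dc_prio > them->dc_prio;  /* priority skew first */
    if (us->feature_set != them->feature_set)
        return us->feature_set < them->feature_set;  /* older wins */
    return strcmp(us->uname, them->uname) > 0;
}
```

Placing the priority check first is exactly what makes mixed clusters risky: an unpatched node never sends the field and compares in a different order.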


Open questions:
 * does it make sense at all?

 * election algorithm compatibility, stability:
   will the election be correct if some nodes have this patch,
   and some don't?

 * How can it be improved so that a node with dc-prio=0 will
   "give up" its DC-role as soon as there is at least one other node
   with dc-prio > 0?
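For that last question, one conceivable shape (pure illustration, not crmd code; the function name is invented) is a check run whenever a node joins: a DC holding the role only because nobody eligible was present steps down as soon as a peer with positive priority appears.

```c
#include <assert.h>

/* Illustrative only: a DC running with dc-prio 0 would re-trigger an
 * election as soon as a peer with positive priority joins. */
static int dc_should_step_down(int our_dc_prio, int joining_peer_prio)
{
    return our_dc_prio == 0 && joining_peer_prio > 0;
}
```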

Lars


--- ./crmd/election.c.orig      2011-05-11 11:36:05.577329600 +0200
+++ ./crmd/election.c   2011-05-12 13:49:04.671484200 +0200
@@ -29,6 +29,7 @@
 GHashTable *voted = NULL;
 uint highest_born_on = -1;
 static int current_election_id = 1;
+static int our_dc_prio = -1;

 /*     A_ELECTION_VOTE */
 void
@@ -55,6 +56,20 @@
                        break;
        }

+       if (our_dc_prio < 0) {
+               char * dc_prio_str = getenv("HA_dc_prio");
+
+               if (dc_prio_str == NULL) {
+                       our_dc_prio = 1;
+               } else {
+                       our_dc_prio = atoi(dc_prio_str);
+               }
+       }
+
+       if (!our_dc_prio) {
+               not_voting = TRUE;
+       }
+
        if(not_voting == FALSE) {
                if(is_set(fsa_input_register, R_STARTING)) {
                        not_voting = TRUE;
@@ -72,12 +87,13 @@
        }

        vote = create_request(
-               CRM_OP_VOTE, NULL, NULL,
+               our_dc_prio?CRM_OP_VOTE:CRM_OP_NOVOTE, NULL, NULL,
                CRM_SYSTEM_CRMD, CRM_SYSTEM_CRMD, NULL);

        current_election_id++;
        crm_xml_add(vote, F_CRM_ELECTION_OWNER, fsa_our_uuid);
        crm_xml_add_int(vote, F_CRM_ELECTION_ID, current_election_id);
+       crm_xml_add_int(vote, F_CRM_DC_PRIO, our_dc_prio);

        send_cluster_message(NULL, crm_msg_crmd, vote, TRUE);
        free_xml(vote);
@@ -188,6 +204,7 @@
                      fsa_data_t *msg_data)
 {
        int election_id = -1;
+       int your_dc_prio = 1;
        int log_level = LOG_INFO;
        gboolean done = FALSE;
        gboolean we_loose = FALSE;
@@ -216,6 +233,17 @@
        your_version   = crm_element_value(vote->msg, F_CRM_VERSION);
        election_owner = crm_element_value(vote->msg, F_CRM_ELECTION_OWNER);
        crm_element_value_int(vote->msg, F_CRM_ELECTION_ID, &election_id);
+       crm_element_value_int(vote->msg, F_CRM_DC_PRIO, &your_dc_prio);
+
+       if (our_dc_prio < 0) {
+               char * dc_prio_str = getenv("HA_dc_prio");
+
+               if (dc_prio_str == NULL) {
+                       our_dc_prio = 1;
+               } else {
+                       our_dc_prio = atoi(dc_prio_str);
+               }
+       }

        CRM_CHECK(vote_from != NULL, vote_from = fsa_our_uname);

@@ -269,6 +297,13 @@
            reason = "Recorded";
            done = TRUE;

+       } else if(our_dc_prio < your_dc_prio) {
+           reason = "DC Prio";
+           we_loose = TRUE;
+
+       } else if(our_dc_prio > your_dc_prio) {
+           reason = "DC Prio";
+
        } else if(compare_version(your_version, CRM_FEATURE_SET) < 0) {
            reason = "Version";
            we_loose = TRUE;
@@ -328,6 +363,7 @@

                crm_xml_add(novote, F_CRM_ELECTION_OWNER, election_owner);
                crm_xml_add_int(novote, F_CRM_ELECTION_ID, election_id);
+               crm_xml_add_int(novote, F_CRM_DC_PRIO, our_dc_prio);

                send_cluster_message(vote_from, crm_msg_crmd, novote, TRUE);
                free_xml(novote);
--- ./include/crm/msg_xml.h.orig        2011-05-11 18:22:08.061726000 +0200
+++ ./include/crm/msg_xml.h     2011-05-11 18:24:17.405132000 +0200
@@ -32,6 +32,7 @@
 #define F_CRM_ORIGIN                   "origin"
 #define F_CRM_JOIN_ID                  "join_id"
 #define F_CRM_ELECTION_ID              "election-id"
+#define F_CRM_DC_PRIO                  "dc-prio"
 #define F_CRM_ELECTION_OWNER           "election-owner"
 #define F_CRM_TGRAPH                   "crm-tgraph"
 #define F_CRM_TGRAPH_INPUT             "crm-tgraph-in"
--- ./lib/ais/plugin.c.orig 2011-05-11 11:29:38.496116000 +0200
+++ ./lib/ais/plugin.c  2011-05-11 17:28:32.385425300 +0200
@@ -421,6 +421,9 @@
     get_config_opt(pcmk_api, local_handle, "use_logd", &value, "no");
     pcmk_env.use_logd = value;

+    get_config_opt(pcmk_api, local_handle, "dc_prio", &value, "1");
+    pcmk_