Hi Lars,

Thanks for the explanation.

On 04/28/11 02:55, Lars Marowsky-Bree wrote:
> On 2011-04-26T23:34:16, Yan Gao <y...@novell.com> wrote:
>
> Perhaps choosing the name "token" for the cluster-wide attributes was not
> a wise move, as it does invoke the "token" association from
> corosync/totem.
>
> What do you all think about switching this word to "ticket"? And have
> the Cluster Ticket Registry manage them? Less confusion later on, I
> think.
>
> I'll try the word "ticket" for the rest of the mail and we can see how
> that works out ;-)
>
> (I think the word works - you can own a ticket, grant a ticket, cancel,
> and revoke tickets ...)
Sounds fine to me :-)

>>> "Tokens" are, essentially, cluster-wide attributes (similar to node
>>> attributes, just for the whole partition).
>> Specifically, a "<tokens>" section with an attribute set (
>> "<token_set>" or something) under "/cib/configuration"?
>
> Yes; a ticket section, just like that.
All right. How about this schema:

  <element name="configuration">
    <interleave>
      ...
      <element name="tickets">
        <zeroOrMore>
          <element name="ticket_set">
            <externalRef href="nvset.rng"/>
          </element>
        </zeroOrMore>
      </element>
      ...

>> - A completely new type of constraint:
>>   <rsc_token id="rscX-with-tokenA" rsc="rscX" token="tokenA"
>>   kind="Deadman"/>
>
> Personally, I lean towards this. (Andrew has expressed a wish to do
> without the "rsc_" prefix, so let's drop this ;-)
Well then, how about "ticket_dep" or "ticket_req"?

>
> Not sure the kind="Deadman" is actually required, but it probably makes
> sense to be able to switch off the big hammer for debugging purposes.
> ;-)
I was thinking of it as a switch for "fence immediately once the
dependency is no longer satisfied".

>
> I don't see why any resource would depend on several tickets; but I can
> see a use case for wanting to depend on _not_ owning a ticket, similar
> to the node attributes. And the resource would need a role, obviously.
OK. The schema I can imagine:

  <define name="element-ticket_dep">
    <element name="ticket_dep">
      <attribute name="id"><data type="ID"/></attribute>
      <choice>
        <oneOrMore>
          <ref name="element-resource-set"/>
        </oneOrMore>
        <group>
          <attribute name="rsc"><data type="IDREF"/></attribute>
          <optional>
            <attribute name="rsc-role">
              <ref name="attribute-roles"/>
            </attribute>
          </optional>
        </group>
      </choice>
      <attribute name="ticket"><text/></attribute>
    </element>
  </define>

>
> Andrew, Yan - do you think we should allow _values_ for tickets, or
> should they be strictly defined/undefined/set/unset?
I think allowing values would help to distinguish different demands.

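To make the two schema proposals above a bit more concrete, here is a rough sketch of a CIB fragment they might accept, wrapped in a few lines of Python so that its shape can be checked. Everything in the sample is made up for illustration: the ids, the ticket name "ticketA", the resource "rscX" and the "Master" role. It also assumes that "ticket_dep" would live under "constraints" and that nvset.rng provides nvpairs with id/name/value attributes. The code only checks well-formedness, not RELAX NG validity.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance of the proposed schema; all ids and names are invented.
# Assumption: ticket_dep is placed under <constraints>.
SAMPLE = """
<configuration>
  <tickets>
    <ticket_set id="tickets-site-a">
      <nvpair id="tickets-site-a-ticketA" name="ticketA" value="1"/>
    </ticket_set>
  </tickets>
  <constraints>
    <ticket_dep id="rscX-needs-ticketA" rsc="rscX" rsc-role="Master"
                ticket="ticketA"/>
  </constraints>
</configuration>
"""


def ticket_deps(xml_text):
    """Return (resource, role, ticket) for each single-resource ticket_dep."""
    root = ET.fromstring(xml_text)
    return [(dep.get("rsc"), dep.get("rsc-role"), dep.get("ticket"))
            for dep in root.iter("ticket_dep")
            if dep.get("rsc") is not None]


def ticket_values(xml_text):
    """Map ticket name to its value string, across all ticket_set nvpairs."""
    root = ET.fromstring(xml_text)
    return {nv.get("name"): nv.get("value")
            for ts in root.iter("ticket_set")
            for nv in ts.iter("nvpair")}


if __name__ == "__main__":
    print(ticket_deps(SAMPLE))
    print(ticket_values(SAMPLE))
```

The resource-set form of "ticket_dep" is left out here to keep the sample short.
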
>> If so, isn't it supposed to be revoked manually by default? So the
>> short-circuited fail-over needs an admin to participate?
>
> No to both; it can be revoked manually, yes, but it isn't going to be
> always the case. I'm also not quite sure I understand where this
> question is headed; how does it matter here whether the ticket is
> revoked manually or not?
I was just thinking -- before we have the CTR, we rely on the admin
quite a lot.

>
>> Does it mean an option for users to choose if they want an
>> immediate fencing or stopping the resources normally? Is it global
>> or particularly for a specific token, or even/just for a specific
>> dependency?
>
> Good question. This came up above already briefly ...
>
> I _think_ there should be a special value that a ticket can be set to
> that doesn't fence, but stops everything cleanly.
>
> However, while the ticket is in this state, the site _still_ owns it (no
> other site can get it yet, and were it to lose the ticket due to
> expiration, it'd still need to fence all remaining nodes so that the
> services can be started elsewhere).
>
> Perhaps the CTR doesn't even need to know about this - it's a special
> setting of the ticket at a given site. Perhaps it makes sense to
> distinguish between owning the ticket (as granted on request via the CTR
> or manually), and its value (which is set locally)? perhaps:
>
> Ownership is a true/false flag. Value is a positive integer (including
> 0).
>
> A site that "owns" a ticket of value 0 will stop resources cleanly, and
> afterwards relinquish the ticket itself.
>
> A site that "owns" a ticket of any value and loses it will perform the
> deadman dance.
>
> A site that does not own a ticket but has a non-zero value for it
> defined will request the ticket from the CTR; the CTR will grant it to
> the site with the highest bid (but not to a site with 0)
Suppose the site with the highest "bid" has its ticket revoked. Should it
clear the "bid" as well?
Otherwise, won't it get the ticket again soon after?

> (if these are
> equal, to the site with the highest node count, if these again are
> equal, to the site with the lowest nodeid).
>
> (Tangent - ownership appears to belong to the status section; the value
> seems to belong to the cib->ticket section(?).)
Perhaps. Although so far there's no appropriate place to set a
cluster-wide attribute in the status section. Another solution would be
that a "ticket" is not an nvpair. It is either:
- An object with "ownership" and "bid" attributes.
Or:
- An nvpair-set which includes the "ownership" and "bid" nvpairs.

>
> The value can be set manually - in that case, it allows the admin to
> define a primary site for a given set of resources. (It might also be
> modified automatically at a later stage based on whatever metric.)
>
> If a site owns a ticket, but doesn't have the highest value, it would
> either fail-back automatically - or require manual intervention,
OK, that seems to answer my previous question. It should be
configurable on the CTR server side.

> which
> I'd assume to be quite common. (Again, this builds a very simplistic
> active/passive overlay.)
>
> Does that make sense, or am I creating more confusion than answers? ;-)
Definitely makes a lot of sense :-)

Regards,
  Yan
--
Gao,Yan <y...@novell.com>
Software Engineer
China Server Team, OPS Engineering, Novell, Inc.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
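
The ownership/value semantics and the granting rule quoted in this mail could be sketched roughly as below. This is only an illustration of the proposal as worded in the thread, not CTR code. The names `Site`, `grant_ticket` and `site_action` are hypothetical, and treating "lowest nodeid" as a single representative node id per site is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    name: str
    bid: int         # locally set ticket value; 0 means "does not want it"
    node_count: int
    nodeid: int      # assumption: one representative node id per site


def grant_ticket(sites: List[Site]) -> Optional[Site]:
    """Pick the site that gets the ticket, or None if no site bids above 0."""
    bidders = [s for s in sites if s.bid > 0]
    if not bidders:
        return None
    # Highest bid first, then highest node count, then lowest nodeid.
    return min(bidders, key=lambda s: (-s.bid, -s.node_count, s.nodeid))


def site_action(owns: bool, value: int, lost: bool = False) -> str:
    """What a site does for one ticket under the proposed semantics.

    `lost` means the site owned the ticket and has just lost it.
    """
    if lost:
        return "fence"                  # the "deadman dance", whatever the value
    if owns and value == 0:
        return "stop-and-relinquish"    # clean stop, then give the ticket up
    if not owns and value > 0:
        return "request"                # ask the CTR for the ticket
    return "none"


if __name__ == "__main__":
    sites = [Site("a", 2, 3, 10), Site("b", 2, 3, 5), Site("c", 0, 9, 1)]
    winner = grant_ticket(sites)
    print(winner.name if winner else None)
    print(site_action(owns=True, value=0))
```

Whether a revoked site should also clear its bid, as asked above, is left open: in this sketch a site whose bid stays highest would simply win the next grant again.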