Re: [Pacemaker] Pacemaker Explained -- Issues
On Sat, May 7, 2011 at 10:03 AM, KIA posei...@mail.ru wrote: Hi Andrew, I've found a consistency issue in subj. In section 1.4.1 DC is Designated Co-ordinator. In section 2.3 DC is Designated Controller. In all following occurrences of DC we have been left alone to guess what it means here. :( Both mean the same thing. A decision should be made, and the issue should be fixed. Sincerely, Igor ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional role parameter
On Fri, 2011-05-06 at 16:15 +0200, Andrew Beekhof wrote:
On Fri, May 6, 2011 at 12:28 PM, Holger Teutsch holger.teut...@web.de wrote:
On Fri, 2011-05-06 at 11:03 +0200, Andrew Beekhof wrote:
On Fri, May 6, 2011 at 9:53 AM, Andrew Beekhof and...@beekhof.net wrote:
On Thu, May 5, 2011 at 5:43 PM, Holger Teutsch holger.teut...@web.de wrote:
On Fri, 2011-04-29 at 09:41 +0200, Andrew Beekhof wrote:

Unfortunately the devel code does not run at all in my environment so I have to fix this first.

Oh? I ran CTS on it the other day and it was fine here.

I installed pacemaker-devel on top of a compilation of pacemaker-1.1. In addition I tried make uninstall for both versions and then again make install for devel. Pacemaker does not come up, crm_mon shows nodes as offline. I suspect the reason is:

May 5 17:09:34 devel1 crmd: [5942]: notice: crmd_peer_update: Status update: Client devel1/crmd now has status [online] (DC=null)
May 5 17:09:34 devel1 crmd: [5942]: info: crm_update_peer: Node devel1: id=1790093504 state=unknown addr=(null) votes=0 born=0 seen=0 proc=00111312 (new)
May 5 17:09:34 devel1 crmd: [5942]: info: pcmk_quorum_notification: Membership 0: quorum retained (0)
May 5 17:09:34 devel1 crmd: [5942]: debug: do_fsa_action: actions:trace: #011// A_STARTED
May 5 17:09:34 devel1 crmd: [5942]: info: do_started: Delaying start, no membership data (0010)
^
May 5 17:09:34 devel1 crmd: [5942]: debug: register_fsa_input_adv: Stalling the FSA pending further input: cause=C_FSA_INTERNAL

Any ideas?

Hg version? Corosync config? I'm running -devel here right now and things are fine.

Uh, I think I see now. Try http://hg.clusterlabs.org/pacemaker/1.1/rev/b94ce5673ce4

Yeah, I realized afterwards that it was specific to devel. What does your corosync config look like?

I run corosync-1.3.0-3.1.x86_64.
It's exactly the same config that worked with pacemaker 1.1 rev 10608:b4f456380f60

# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
    # Run as root - this is necessary to be able to manage
    # resources with Pacemaker
    user: root
    group: root
}

service {
    # Load the Pacemaker Cluster Resource Manager
    ver: 1
    name: pacemaker
    use_mgmtd: yes
    use_logd: yes
}

totem {
    # The only valid version is 2
    version: 2

    # How long before declaring a token lost (ms)
    token: 5000

    # How many token retransmits before forming a new configuration
    token_retransmits_before_loss_const: 10

    # How long to wait for join messages in the membership protocol (ms)
    join: 60

    # How long to wait for consensus to be achieved before starting
    # a new round of membership configuration (ms)
    consensus: 6000

    # Turn off the virtual synchrony filter
    vsftype: none

    # Number of messages that may be sent by one processor on
    # receipt of the token
    max_messages: 20

    # Limit generated nodeids to 31-bits (positive signed integers)
    clear_node_high_bit: yes

    # Disable encryption
    secauth: off

    # How many threads to use for encryption/decryption
    threads: 0

    # Optionally assign a fixed node id (integer)
    # nodeid: 1234

    rrp_mode: active

    interface {
        ringnumber: 0
        # The following values need to be set based on your environment
        bindnetaddr: 192.168.178.0
        mcastaddr: 226.94.40.1
        mcastport: 5409
    }

    interface {
        ringnumber: 1
        # The following values need to be set based on your environment
        bindnetaddr: 10.1.1.0
        mcastaddr: 226.94.41.1
        mcastport: 5411
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: no
    to_syslog: yes
    syslog_facility: daemon
    debug: on
    timestamp: off
}

amf {
    mode: disabled
}
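A side note on the totem timeouts in the config above: as far as I recall from corosync.conf(5), consensus should be at least 1.2 times token, and the values posted here (5000 and 6000) sit exactly on that bound. Below is a rough Python sketch of such a check. The helper names and the simplified parser are made up for illustration and are not part of corosync or Pacemaker, and the 1.2 factor is my assumption from the man page, not something stated in this thread.

```python
def parse_flat_settings(text):
    """Collect 'key: value' pairs from corosync.conf-style text.

    Simplification (assumption): section nesting is ignored, so a key that
    appears in several sections keeps only its last value.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.endswith("{") or line == "}":
            continue  # skip blank lines and section delimiters
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def consensus_ok(settings):
    """Return True if consensus >= 1.2 * token (both in milliseconds)."""
    token = int(settings["token"])
    consensus = int(settings["consensus"])
    # Integer arithmetic avoids floating point rounding on the 1.2 factor.
    return consensus * 10 >= token * 12


# Abbreviated from the totem section quoted above.
SAMPLE = """
totem {
    version: 2
    token: 5000        # ms
    consensus: 6000    # ms
}
"""

if __name__ == "__main__":
    print(consensus_ok(parse_flat_settings(SAMPLE)))
```

The parser deliberately ignores nesting, which is enough for top-level totem keys but would not distinguish the two interface sections.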
Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional role parameter
On Mon, May 9, 2011 at 9:58 AM, Holger Teutsch holger.teut...@web.de wrote: On Fri, 2011-05-06 at 16:15 +0200, Andrew Beekhof wrote: What does your corosync config look like? I run corosync-1.3.0-3.1.x86_64.
It's exactly the same config that worked with pacemaker 1.1 rev 10608:b4f456380f60

If it works for that version, then it should work for any of the 7 commits since. None of them could produce:

May 5 17:09:33 devel1 pacemakerd: [5929]: info: get_cluster_type: Cluster type is: 'corosync'

Going back through the commit logs, it looks like the following is when things broke:

http://hg.clusterlabs.org/pacemaker/1.1/rev/44e5f4e760f6

Specifically, this needs to be removed:

+setenv(HA_cluster_type, corosync,1);

Could you see if that change helps?
Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional role parameter
I thought you said you were running 1.1?

May 5 17:09:33 devel1 pacemakerd: [5929]: info: read_config: Reading configure for stack: corosync

This message is specific to the devel branch. Update to get the following fix and you should be fine:

http://hg.clusterlabs.org/pacemaker/devel/rev/84ef5401322f
Re: [Pacemaker] [pacemaker][patch 3/4] Simple changes for Pacemaker Explained, Chapter 6 CH_Constraints.xml
On Fri, May 6, 2011 at 12:29 PM, Dejan Muhamedagic deja...@fastmail.fm wrote:
On Fri, May 06, 2011 at 09:47:29AM +0200, Andrew Beekhof wrote:
On Thu, May 5, 2011 at 5:20 PM, Dejan Muhamedagic deja...@fastmail.fm wrote:
On Thu, May 05, 2011 at 12:02:01PM +0200, Andrew Beekhof wrote:
On Thu, May 5, 2011 at 11:37 AM, Dejan Muhamedagic deja...@fastmail.fm wrote:
On Thu, May 05, 2011 at 09:07:05AM +0200, Andrew Beekhof wrote:
On Wed, May 4, 2011 at 7:15 PM, Dejan Muhamedagic deja...@fastmail.fm wrote:

Hi,

On Wed, May 04, 2011 at 12:49:03PM +0200, Andrew Beekhof wrote:

Tick tock. I'm going to push this soon unless someone raises an objection RSN.

On Fri, Apr 15, 2011 at 4:55 PM, Andrew Beekhof and...@beekhof.net wrote:
On Fri, Apr 15, 2011 at 3:00 PM, Lars Marowsky-Bree l...@novell.com wrote:
On 2011-04-13T08:37:12, Andrew Beekhof and...@beekhof.net wrote:

Before:

rsc_colocation id=coloc-set score=INFINITY
  resource_set id=coloc-set-0
    resource_ref id=dummy2/
    resource_ref id=dummy3/
  /resource_set
  resource_set id=coloc-set-1 sequential=false role=Master
    resource_ref id=dummy0/
    resource_ref id=dummy1/
  /resource_set
/rsc_colocation
rsc_order id=order-set score=INFINITY
  resource_set id=order-set-0 role=Master
    resource_ref id=dummy0/
    resource_ref id=dummy1/
  /resource_set
  resource_set id=order-set-1 sequential=false
    resource_ref id=dummy2/
    resource_ref id=dummy3/
  /resource_set
/rsc_order

After:

So I am understanding this properly - we're getting rid of the sequential attribute, yes?

Absolutely.

So, the internal-collocation replaces the sequential attribute?

Yes.

What are the possible and/or meaningful values for internal-collocation? It looks like that would be 0 or INFINITY only, which would translate to old sequential false and true, right?

No.

choice
  data type=integer/
  valueINFINITY/value
  value+INFINITY/value
  value-INFINITY/value
/choice

I saw that, but wonder what makes sense in this context. What's the difference between values 0, INF, 50, -50, 100?
Are all those necessary?

Just as necessary as for colocation constraints not involving sets. You're setting up the colocation score between elements of the set.

OK. Looking at the schema, the ordering constraint lost score

Score was being mapped to kind inside the PE anyway.

and is using only the kind attribute which can have one of:

valueNone/value
valueOptional/value
valueMandatory/value
valueSerialize/value

But then, the kind attribute is optional. If missing, how's that different from value None?

If it's missing you get the default. Which IIRC is Mandatory, not None.

What does Serialize mean? (in orders)

Same as it did before, this is not new.

What does score-attribute-mangle mean? (in collocations)

As above. Not new.

Where are these two documented? Couldn't find anything in the docs.

Looks to be just an alias for XML_RULE_ATTR_SCORE_ATTRIBUTE dating back to 2005. So there is probably a reason I didn't document it.

So, it's obsolete then? The crm shell actually never supported it :-| And I can't recall that I've ever seen it in a configuration.

Serialize is newer. It's like optional but for a set - no member will start or stop at the same time as another.

OK. I think that it'd be good to clarify the shell syntax before applying these changes.

Actually I'm going to flip-flop here... there's really no need for this. Until the shell understands the new syntax, it will just show xml, right?

Right. But in my experience trying things out in shell syntax sometimes reveals design shortcomings. That was so with the resource sets. Going back to the example you've shown earlier in this thread ...
Before:

collocation c inf: ( dummy0:Master dummy1:Master ) dummy2 dummy3
order o Mandatory: dummy0:promote dummy1:promote ( dummy2 dummy3 )

After(1):

collocation_set c inf: 0:[dummy0:Master dummy1:Master] inf:[dummy2 dummy3]
order_set o Mandatory: Mandatory:[dummy0:promote dummy1:promote] Optional:[dummy2 dummy3]

After(2):

collocation_set c inf: 0:[dummy0:Master dummy1:Master] dummy2 dummy3
order_set o Mandatory: dummy0:promote dummy1:promote Optional:[dummy2 dummy3]

The second version removes redundant specification, i.e. for the sets which have the same kind/score as the constraint. Would this kind
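To make the earlier exchange about the optional kind attribute concrete: per Andrew's answer (which he hedged with "IIRC"), a missing kind falls back to Mandatory, which is not the same as an explicit None. A tiny Python sketch of that rule follows. effective_kind is a hypothetical helper written for this illustration, not Pacemaker code, and the default should be treated as an assumption until checked against the schema.

```python
# Values the schema allows for kind, as listed in the thread.
VALID_KINDS = ("None", "Optional", "Mandatory", "Serialize")

# Default when kind is omitted, per the thread ("IIRC"); an assumption.
DEFAULT_KIND = "Mandatory"


def effective_kind(kind=None):
    """Return the kind an ordering constraint would end up with."""
    if kind is None:  # attribute absent from the XML
        return DEFAULT_KIND
    if kind not in VALID_KINDS:
        raise ValueError("unknown kind: %r" % (kind,))
    return kind


if __name__ == "__main__":
    print(effective_kind())        # attribute omitted
    print(effective_kind("None"))  # explicit None is kept as-is
```

Note that the Python value None stands for "attribute not present", while the string "None" is one of the legal attribute values; keeping the two apart is the whole point of the question in the thread.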
Re: [Pacemaker] filesystem could not mount after reboot
Hello. I think a resource failed and is now blocked by its failcount, or you have a wrong location score.
Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional role parameter
I had 1.1 but Dejan asked me to rebase my patches on devel. So long story short: devel now works after upgrading to the rev you mentioned and I got back to working on my patches.

Thanx
Holger

On Mon, 2011-05-09 at 10:58 +0200, Andrew Beekhof wrote:

I thought you said you were running 1.1?

May 5 17:09:33 devel1 pacemakerd: [5929]: info: read_config: Reading configure for stack: corosync

This message is specific to the devel branch. Update to get the following fix and you should be fine:

http://hg.clusterlabs.org/pacemaker/devel/rev/84ef5401322f
Re: [Pacemaker] [PATCH]Bug 2567 - crm resource migrate should support an optional role parameter
On Wed, 2011-04-27 at 13:25 +0200, Andrew Beekhof wrote:
On Sun, Apr 24, 2011 at 4:31 PM, Holger Teutsch holger.teut...@web.de wrote:

...

Remaining diffs seem to be not related to my changes.

Unlikely I'm afraid. We run the regression tests after every commit and complain loudly if they fail. What is the regression test output?

That's the output of tools/regression.sh of pacemaker-devel *without* my patches:

Version: parent: 10731:bf7b957f4cbe tip

see attachment

-holger

Using local binaries from: .
* Passed: cibadmin - Require --force for CIB erasure
* Passed: cibadmin - Allow CIB erasure with --force
* Passed: cibadmin - Query CIB
* Passed: crm_attribute - Set cluster option
* Passed: cibadmin - Query new cluster option
* Passed: cibadmin - Query cluster options
* Passed: cibadmin - Delete nvpair
* Passed: cibadmin - Create operaton should fail with: -21, The object already exists
* Passed: cibadmin - Modify cluster options section
* Passed: cibadmin - Query updated cluster option
* Passed: crm_attribute - Set duplicate cluster option
* Passed: crm_attribute - Setting multiply defined cluster option should fail with -216, Could not set cluster option
* Passed: crm_attribute - Set cluster option with -s
* Passed: crm_attribute - Delete cluster option with -i
* Passed: cibadmin - Create node entry
* Passed: cibadmin - Create node status entry
* Passed: crm_attribute - Create node attribute
* Passed: cibadmin - Query new node attribute
* Passed: cibadmin - Digest calculation
* Passed: cibadmin - Replace operation should fail with: -45, Update was older than existing configuration
* Passed: crm_standby- Default standby value
* Passed: crm_standby- Set standby status
* Passed: crm_standby- Query standby value
* Passed: crm_standby- Delete standby value
* Passed: cibadmin - Create a resource
* Passed: crm_resource - Create a resource meta attribute
* Passed: crm_resource - Query a resource meta attribute
* Passed: crm_resource - Remove a resource meta attribute
* Passed: crm_resource - Create a resource attribute
* Passed: crm_resource - List the configured resources
* Passed: crm_resource - Set a resource's fail-count
* Passed: crm_resource - Require a destination when migrating a resource that is stopped
* Passed: crm_resource - Don't support migration to non-existant locations
* Passed: crm_resource - Migrate a resource
* Passed: crm_resource - Un-migrate a resource
--- ./regression.exp 2011-05-09 20:26:27.669381187 +0200
+++ ./regression.out 2011-05-09 20:38:27.112098949 +0200
@@ -616,7 +616,7 @@
 /status
 /cib
 * Passed: crm_resource - List the configured resources
-cib epoch=16 num_updates=2 admin_epoch=0 validate-with=pacemaker-1.2
+cib epoch=16 num_updates=1 admin_epoch=0 validate-with=pacemaker-1.2
 configuration
 crm_config
 cluster_property_set id=cib-bootstrap-options/
@@ -642,19 +642,13 @@
 constraints/
 /configuration
 status
-node_state id=clusterNode-UUID uname=clusterNode-UNAME
- transient_attributes id=clusterNode-UUID
-instance_attributes id=status-clusterNode-UUID
- nvpair id=status-clusterNode-UUID-fail-count-dummy name=fail-count-dummy value=10/
-/instance_attributes
- /transient_attributes
-/node_state
+node_state id=clusterNode-UUID uname=clusterNode-UNAME/
 /status
 /cib
 * Passed: crm_resource - Set a resource's fail-count
 Resource dummy not moved: not-active and no preferred location specified.
 Error performing operation: cib object missing
-cib epoch=16 num_updates=2 admin_epoch=0 validate-with=pacemaker-1.2
+cib epoch=16 num_updates=1 admin_epoch=0 validate-with=pacemaker-1.2
 configuration
 crm_config
 cluster_property_set id=cib-bootstrap-options/
@@ -680,19 +674,13 @@
 constraints/
 /configuration
 status
-node_state id=clusterNode-UUID uname=clusterNode-UNAME
- transient_attributes id=clusterNode-UUID
-instance_attributes id=status-clusterNode-UUID
- nvpair id=status-clusterNode-UUID-fail-count-dummy name=fail-count-dummy value=10/
-/instance_attributes
- /transient_attributes
-/node_state
+node_state id=clusterNode-UUID uname=clusterNode-UNAME/
 /status
 /cib
 * Passed: crm_resource - Require a destination when migrating a resource that is stopped
 Error performing operation: i.dont.exist is not a known node
 Error performing operation: The object/attribute does not exist
-cib epoch=16 num_updates=2 admin_epoch=0 validate-with=pacemaker-1.2
+cib epoch=16 num_updates=1 admin_epoch=0 validate-with=pacemaker-1.2
 configuration
 crm_config
 cluster_property_set id=cib-bootstrap-options/
@@ -718,13 +706,7 @@
 constraints/
 /configuration
 status
-node_state id=clusterNode-UUID uname=clusterNode-UNAME
- transient_attributes id=clusterNode-UUID
-
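For readers less familiar with this kind of output: the diff above compares the expected output (regression.exp) with what the run actually produced (regression.out), so every "-" line is something the test expected and did not get. The same comparison can be sketched with Python's difflib. This is only an illustration and makes no claim about how tools/regression.sh is implemented; the sample lines are abbreviated from the diff above and regression_diff is a made-up helper name.

```python
import difflib

# Abbreviated expected vs. actual output, taken from the diff above.
expected = [
    "cib epoch=16 num_updates=2 admin_epoch=0",
    "status",
    "/status",
]
actual = [
    "cib epoch=16 num_updates=1 admin_epoch=0",
    "status",
    "/status",
]


def regression_diff(exp_lines, out_lines):
    """Return unified diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            exp_lines,
            out_lines,
            fromfile="regression.exp",
            tofile="regression.out",
            lineterm="",
        )
    )


if __name__ == "__main__":
    for line in regression_diff(expected, actual):
        print(line)
```

An empty result corresponds to a clean regression run; any output at all is what the test suite would complain about.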