Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers)
On 2011-04-26T16:03:48, Dejan Muhamedagic <de...@suse.de> wrote:

> - the required attributes in meta-data need to be reviewed; a parameter
>   is either required or has a default, cannot be both

Why would this be the case?

Regards,
    Lars

--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
Re: [Linux-ha-dev] [Openais] An OCF agent for LXC (Linux Containers)
On 5/2/2011 at 04:49 AM, Lars Marowsky-Bree <l...@novell.com> wrote:

> On 2011-04-26T16:03:48, Dejan Muhamedagic <de...@suse.de> wrote:
>
>> - the required attributes in meta-data need to be reviewed; a parameter
>>   is either required or has a default, cannot be both
>
> Why would this be the case?

There was some discussion about this last March:

http://www.gossamer-threads.com/lists/linuxha/pacemaker/62163#62163

In summary (from lge): "If a mandatory parameter has a default, then I'd
think it is no longer mandatory, because, if not specified, it has its
default to fall back to. [...] Mandatory parameters in my opinion should
be such parameters that cannot possibly have a sane default, like the IP
for IPaddr2."

Regards,

Tim

--
Tim Serong <tser...@novell.com>
Senior Clustering Engineer, OPS Engineering, Novell Inc.
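To make the distinction concrete, here is a minimal meta-data sketch of the
two cases under discussion. The parameter names and the default value are
illustrative, not taken from the LXC agent itself:

<!-- Contradictory: required="1" demands a user-supplied value, yet the
     default below means the agent could always fall back without one. -->
<parameter name="config" unique="0" required="1">
  <shortdesc lang="en">Container configuration file</shortdesc>
  <content type="string" default="/etc/lxc/lxc.conf"/>
</parameter>

<!-- Consistent: genuinely mandatory, so no default is declared. -->
<parameter name="ip" unique="1" required="1">
  <shortdesc lang="en">IPv4 address to bring up</shortdesc>
  <content type="string"/>
</parameter>

Under lge's reading, only the second form should carry required="1"; the
first should drop either the required flag or the default.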
[Linux-HA] [PATCH] Low: adding cluster-glue-extras subpackage
Hi,

The recent addition of the vcenter external plugin generates a dependency
on the exotic perl(VMware::VIRuntime) package, which the majority of users
won't need. I propose creating a separate subpackage, cluster-glue-extras,
for all optional components.

# HG changeset patch
# User Vadym Chepkov <vchep...@gmail.com>
# Date 1304261276 14400
# Node ID f8122aff3cef64089d611682f1b77a7695102757
# Parent  b3ab6686445b5267a18a37d1a1404170693306db
Low: adding cluster-glue-extras subpackage

Adding cluster-glue-extras subpackage for optional components

diff --git a/cluster-glue-fedora.spec b/cluster-glue-fedora.spec
--- a/cluster-glue-fedora.spec
+++ b/cluster-glue-fedora.spec
@@ -139,6 +139,7 @@
 %{_libdir}/stonith/plugins/stonith2/*.so
 %{_libdir}/stonith/plugins/stonith2/*.py*
 %exclude %{_libdir}/stonith/plugins/external/ssh
+%exclude %{_libdir}/stonith/plugins/external/vcenter
 %exclude %{_libdir}/stonith/plugins/stonith2/null.so
 %exclude %{_libdir}/stonith/plugins/stonith2/ssh.so
 %{_libdir}/stonith/plugins/xen0-ha-dom0-stonith-helper
@@ -214,11 +215,23 @@
 %{_includedir}/pils
 %{_datadir}/%{name}/lrmtest
 %{_libdir}/heartbeat/plugins/test/test.so
-%{_libdir}/stonith/plugins/external/ssh
-%{_libdir}/stonith/plugins/stonith2/null.so
-%{_libdir}/stonith/plugins/stonith2/ssh.so
 %doc AUTHORS
 %doc COPYING
 %doc COPYING.LIB

+%package -n cluster-glue-extras
+Summary: Additional cluster-glue components
+Group: Application/System
+Requires: cluster-glue-libs = %{version}-%{release}
+
+%description -n cluster-glue-extras
+cluster-glue-extras includes optional components of cluster-glue framework
+
+%files -n cluster-glue-extras
+%defattr(-,root,root)
+%{_libdir}/stonith/plugins/external/ssh
+%{_libdir}/stonith/plugins/external/vcenter
+%{_libdir}/stonith/plugins/stonith2/null.so
+%{_libdir}/stonith/plugins/stonith2/ssh.so
+
 %changelog

Cheers,
Vadym
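As a quick sanity check of a split like this, one might rebuild and inspect
the resulting packages. This is only a sketch; the architecture and build
paths depend on your environment:

# Rebuild the packages from the modified spec (path is illustrative).
rpmbuild -ba cluster-glue-fedora.spec

# The optional plugins should now appear only in the -extras subpackage:
rpm -qpl ~/rpmbuild/RPMS/x86_64/cluster-glue-extras-*.rpm

# ...and the perl(VMware::VIRuntime) dependency should be confined to it:
rpm -qp --requires ~/rpmbuild/RPMS/x86_64/cluster-glue-extras-*.rpm | grep -i vmware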
Re: [Linux-HA] get haresources2cib.py
Dear Andrew,

I read your "Clusters from Scratch" document and found it very detailed. It
gave lots of information, but I was looking for how to create a cib.xml and
could not decipher the syntax and the different fields to be put in cib.xml.

I am still looking for the haresources2cib.py script. I searched the web
but could not find it anywhere.

I have two more questions:

1. Do I have to create the cib.xml file on the nodes where I am running the
   heartbeat v2 software?
2. Does cib.xml have to reside in the /var/lib/crm directory, or can it
   reside somewhere else?

Kindly provide these answers. I will greatly appreciate your help. Have a
nice day.

Thanks,
nagrik

On Sat, Apr 30, 2011 at 1:32 AM, Andrew Beekhof <and...@beekhof.net> wrote:

> Forget the conversion. Use the crm shell to create one from scratch. And
> look for the "Clusters from Scratch" doc relevant to your version; it's
> worth the read.
>
> On Sat, Apr 30, 2011 at 1:19 AM, Vinay Nagrik <vnag...@gmail.com> wrote:
>
>> Hello Group,
>>
>> Kindly tell me where I can download the haresources2cib.py file from.
>> Please also tell me whether I can convert the haresources file on a node
>> where I am not running the high availability service, and then copy the
>> converted .xml file into the /var/lib/heartbeat directory on the node
>> where I am running high availability.
>>
>> Also, must the cib file reside under the /var/lib/heartbeat directory,
>> or can it reside under any other directory, for example under /etc?
>> Please let me know. I am just a beginner.
>>
>> Thanks in advance.
>>
>> --
>> Thanks
>> Nagrik
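As a concrete starting point for Andrew's suggestion: a first resource can
be created entirely from the crm shell, so the CIB never needs to be
written by hand. A sketch with made-up names and addresses:

# Sketch only: define a floating IP resource from scratch in the crm shell.
crm configure
crm(live)configure# primitive sharedIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.100" cidr_netmask="24" \
        op monitor interval="30s"
crm(live)configure# commit
crm(live)configure# quit

As for the second question: the live CIB is managed by the cluster itself
(typically /var/lib/heartbeat/crm/cib.xml on heartbeat v2 installations)
and should be changed through the crm shell or cibadmin, not by editing or
relocating cib.xml by hand.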
Re: [Linux-HA] How to debug corosync?
On 4/29/2011 at 08:14 PM, Stallmann, Andreas <astallm...@conet.de> wrote:

> Hi!
>
>> Just on a punt... There's not a (partial) firewall running on app02,
>> is there?
>
> No, no iptables running anywhere, and no layer 3 switches around which
> could do any filtering.
>
> How do you debug corosync? Every command I find to debug corosync shows
> that everything is all right. Still, both nodes see each other as
> offline. :-(

Try "corosync-objctl runtime.totem.pg.mrp.srp.members". You should see
something like:

runtime.totem.pg.mrp.srp.956305584.ip=r(0) ip(176.16.0.185)
runtime.totem.pg.mrp.srp.956305584.join_count=1
runtime.totem.pg.mrp.srp.956305584.status=joined
runtime.totem.pg.mrp.srp.973082800.ip=r(0) ip(176.16.0.186)
runtime.totem.pg.mrp.srp.973082800.join_count=1
runtime.totem.pg.mrp.srp.973082800.status=joined

Note the status. If both nodes show status=joined, corosync should be
communicating OK, and the problem is at a higher level (Pacemaker), in
which case check /var/log/messages for errors from e.g. crmd, cib, etc.

If either node shows status=left, there's a lower-level problem (network,
firewall (although you ruled that out), etc.). For lower-level stuff,
possibly try asking on the openais mailing list, which is where the
corosync devs hang out: http://corosync.org/doku.php?id=support

Regards,

Tim

--
Tim Serong <tser...@novell.com>
Senior Clustering Engineer, OPS Engineering, Novell Inc.
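If membership looks healthy, the layer-by-layer check Tim describes can be
run roughly like this. A sketch; the log path varies by distribution:

# Corosync layer: both nodes should report status=joined.
corosync-objctl runtime.totem.pg.mrp.srp.members | grep status

# Pacemaker layer: scan the log for errors from the cluster daemons
# (/var/log/messages is typical; adjust for your distribution).
grep -E 'crmd|cib|pengine|attrd' /var/log/messages | grep -iE 'error|warn' | tail -n 50

# One-shot view of what the cluster itself thinks is going on:
crm_mon -1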
Re: [Linux-HA] Auto Failback despite location constraint
On 4/29/2011 at 08:25 PM, Stallmann, Andreas <astallm...@conet.de> wrote:

> Ha! It works. But still, there are two strange (side) effects:
>
> Firstly, mgmt01 still takes over if it was disconnected from the net for
> a time shorter than five minutes. If mgmt01 stays disconnected for more
> than 5 min, no auto failback will happen after it's reconnected again.
>
> Secondly, when mgmt01 comes back after 5 min or more, the resources
> *will* stay on mgmt01 (good so far), but do *restart* on mgmt02 (and
> that's equally bad as if the services would fail back, because we run
> phone conferences on the server and those disconnect on every restart of
> the resources/services). Any ideas *why* the resources restart and how
> to keep them from doing so?

I think I'm confused about the exact sequence of events here. To verify,
you mean: mgmt01 goes offline, mgmt02 takes over resources, mgmt01 comes
back online, no failback occurs, but resources are restarted on mgmt02?
Which resources, specifically?

This may not be related, but I noticed you seem to have some redundant
order constraints. I would remove them:

order apache-after-ip inf: sharedIP web_res
order nagios-after-apache inf: web_res nagios_res

These are not necessary because they are already implied by this group:

group nag_grp fs_r0 sharedIP web_res nagios_res ajaxterm

> Any ideas *why* mgmt01 has to stay disconnected for 5 min or more to
> prevent an auto failback? If the network is flapping for some reason,
> this would lead to flapping services too, and that's really (really!)
> not desirable.

No, not really. If the scores (ptest -Ls) are the same on both nodes, the
resources should stay where they're already running. I wonder if your ping
rule is involved somehow?

location only-if-connected nag_grp \
        rule $id="only-if-connected-rule" -inf: not_defined pingd or pingd lte 1500

Note that -INF scores will always trump any other non-infinity score, see:

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch-constraints.html#s-scores-infinity

Regards,

Tim

-----Original Message-----
From: linux-ha-boun...@lists.linux-ha.org
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Stallmann, Andreas
Sent: Friday, 29 April 2011 10:39
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Auto Failback despite location constraint

Hi!

> If the resource ends up on the non-preferred node, those settings will
> cause it to have an equal score on both nodes, so it should stay put. If
> you want to verify, try "ptest -Ls" to see what scores each resource has.

Great, that's the command I was looking for! Before the failover the
output is:

group_color: nag_grp allocation score on ipfuie-mgmt01: 100
group_color: nag_grp allocation score on ipfuie-mgmt02: 0

When nag_grp has failed over to ipfuie-mgmt02 it is:

group_color: nag_grp allocation score on ipfuie-mgmt01: -INFINITY
group_color: nag_grp allocation score on ipfuie-mgmt02: 0

Strange, isn't it? I would have expected the default-resource-stickiness
to have some influence on the values, but obviously it has not. When
mgmt01 comes back, we see (pretty soon) again:

group_color: nag_grp allocation score on ipfuie-mgmt01: 100
group_color: nag_grp allocation score on ipfuie-mgmt02: 0

Thus, the resource fails over to mgmt01 again, which is not what we
intended.
> Anyway, the problem is this constraint:
>
> location cli-prefer-nag_grp nag_grp \
>         rule $id="cli-prefer-rule-nag_grp" inf: #uname eq ipfuie-mgmt01 \
>         and #uname eq ipfuie-mgmt01

TNX, I briefly thought of applying a vast amount of necessary cruelty to
the colleague who did a "migrate" without the following "unmigrate". I
have unmigrated the resources now (the location constraint is gone), but
the result is the same: the resource stickiness is not taken into account.
AAARGH!!! (As Terry Pratchett says: three exclamation marks, a clear sign
of an insane mind... that's where configuring clusters gets me...)

Please help, otherwise I might think of doing something really nasty,
like, like, like... like for example switching to Windows! Ha! ;-)

Thanks in advance for your ongoing patience with me,

Andreas

CONET Solutions GmbH, Theodor-Heuss-Allee 19, 53773 Hennef.
Registergericht/Registration Court: Amtsgericht Siegburg (HRB Nr. 9136)
Geschäftsführer/Managing Directors: Jürgen Zender (Sprecher/Chairman),
Anke Höfer
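For anyone hitting the same symptoms, the cleanup implied above is to drop
the crm-generated preference and give resources explicit stickiness. A
sketch using the names from this thread; the stickiness value is
arbitrary, chosen to exceed the node preference score of 100:

# Remove the location constraint left behind by "crm resource migrate":
crm resource unmigrate nag_grp

# Stickiness above 100 outweighs the returning node's preference:
crm configure rsc_defaults resource-stickiness=200

Note that stickiness only breaks ties between finite scores; the -INF ping
rule quoted earlier will still move resources whenever pingd drops below
its threshold.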