[Pacemaker] About the difference in handling of "sequential".
Hi All,

There is a difference between the two places that handle "sequential" of a colocation "resource_set". Is one of them a mistake?

    static gboolean
    unpack_colocation_set(xmlNode * set, int score, pe_working_set_t * data_set)
    {
        xmlNode *xml_rsc = NULL;
        resource_t *with = NULL;
        resource_t *resource = NULL;
        const char *set_id = ID(set);
        const char *role = crm_element_value(set, "role");
        const char *sequential = crm_element_value(set, "sequential");
        int local_score = score;

        const char *score_s = crm_element_value(set, XML_RULE_ATTR_SCORE);

        if (score_s) {
            local_score = char2score(score_s);
        }

        /* When "sequential" is not set, "sequential" is treated as TRUE. */
        if (sequential != NULL && crm_is_true(sequential) == FALSE) {
            return TRUE;
    (snip)

    static gboolean
    colocate_rsc_sets(const char *id, xmlNode * set1, xmlNode * set2, int score,
                      pe_working_set_t * data_set)
    {
        xmlNode *xml_rsc = NULL;
        resource_t *rsc_1 = NULL;
        resource_t *rsc_2 = NULL;

        const char *role_1 = crm_element_value(set1, "role");
        const char *role_2 = crm_element_value(set2, "role");

        const char *sequential_1 = crm_element_value(set1, "sequential");
        const char *sequential_2 = crm_element_value(set2, "sequential");

        /* When "sequential" is not set, "sequential" is treated as FALSE. */
        if (crm_is_true(sequential_1)) {
            /* get the first one */
            for (xml_rsc = __xml_first_child(set1); xml_rsc != NULL; xml_rsc = __xml_next(xml_rsc)) {
                if (crm_str_eq((const char *)xml_rsc->name, XML_TAG_RESOURCE_REF, TRUE)) {
                    EXPAND_CONSTRAINT_IDREF(id, rsc_1, ID(xml_rsc));
                    break;
                }
            }
        }

        if (crm_is_true(sequential_2)) {
            /* get the last one */
    (snip)

Best Regards,
Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
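The difference described in the message above can be made concrete with a small model. This is only a sketch in Python, not Pacemaker code: the function names `sequential_within_set` and `sequential_between_sets` are hypothetical, and the list of strings accepted as true is an assumption. It mirrors the two quoted conditions and shows that they disagree only when "sequential" is unset.

```python
# Hypothetical model of the two quoted checks; not taken from Pacemaker itself.
TRUE_STRINGS = {"true", "on", "yes", "y", "1"}  # assumed spellings of "true"


def crm_is_true(value):
    """Model of crm_is_true(): an unset (None) value counts as false."""
    return value is not None and value.lower() in TRUE_STRINGS


def sequential_within_set(sequential):
    """Mirror of the check in unpack_colocation_set().

    The function returns early (non-sequential) only when the attribute
    is present and false, so an unset attribute behaves like TRUE.
    """
    return not (sequential is not None and not crm_is_true(sequential))


def sequential_between_sets(sequential):
    """Mirror of the check in colocate_rsc_sets().

    The sequential branch runs only when the attribute is present and
    true, so an unset attribute behaves like FALSE.
    """
    return crm_is_true(sequential)


if __name__ == "__main__":
    for value in (None, "true", "false"):
        print(repr(value), sequential_within_set(value), sequential_between_sets(value))
```

Only the unset case produces different answers from the two functions, which matches the two annotated comments in the quoted C code.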
Re: [Pacemaker] [Question:crmsh] About a setting method of sequential=true in crmsh.
Hi Kristoffer,

Thank you for your comments. I tested it. However, the problem still seems to occur.

---
[root@srv01 crmsh-8d984b138fc4]# pwd
/opt/crmsh-8d984b138fc4
[root@srv01 crmsh-8d984b138fc4]# ./autogen.sh
autoconf: autoconf (GNU Autoconf) 2.63
automake: automake (GNU automake) 1.11.1
(snip)
[root@srv01 crmsh-8d984b138fc4]# ./configure --sysconfdir=/etc --localstatedir=/var
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
(snip)
  Prefix                 = /usr
  Executables            = /usr/sbin
  Man pages              = /usr/share/man
  Libraries              = /usr/lib64
  Header files           = ${prefix}/include
  Arch-independent files = /usr/share
  State information      = /var
  System configuration   = /etc
[root@srv01 crmsh-8d984b138fc4]# make install
Making install in doc
make[1]: Entering directory `/opt/crmsh-8d984b138fc4/doc'
a2x -f manpage crm.8.txt
WARNING: crm.8.txt: line 621: missing [[cmdhelp_._status] section
WARNING: crm.8.txt: line 3936: missing [[cmdhelp_._report] section
./crm.8.xml:3137: element refsect1: validity error : Element refsect1 content does not follow the DTD, expecting (refsect1info? , (title , subtitle? , titleabbrev?) , (((calloutlist | glosslist | bibliolist | itemizedlist | orderedlist | segmentedlist | simplelist | variablelist | caution | important | note | tip | warning | literallayout | programlisting | programlistingco | screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | graphicco | mediaobject | mediaobjectco | informalequation | informalexample | informalfigure | informaltable | equation | example | figure | table | msgset | procedure | sidebar | qandaset | task | anchor | bridgehead | remark | highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , refsect2*) | refsect2+)), got (title simpara simpara simpara simpara literallayout refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 simpara simpara simpara literallayout simpara literallayout refsect2 refsect2 refsect2 )
^
a2x: failed: xmllint --nonet --noout --valid "./crm.8.xml"
make[1]: *** [crm.8] Error 1
make[1]: Leaving directory `/opt/crmsh-8d984b138fc4/doc'
make: *** [install-recursive] Error 1
---

Best Regards,
Hideo Yamauchi.

--- On Mon, 2014/2/10, Kristoffer Grönlund wrote:

> On Fri, 7 Feb 2014 09:21:12 +0900 (JST)
> renayama19661...@ybb.ne.jp wrote:
>
> > Hi Kristoffer,
> >
> > In RHEL6.4, crmsh-c8f214020b2c gives the next error and cannot
> > install it.
> >
> > Does a procedure of any installation have a problem?
>
> Hello,
>
> It seems that docbook validation is stricter on RHEL 6.4 than on other
> systems I use to test. I have pushed a fix for this problem, please
> test again with changeset 8d984b138fc4.
>
> Thank you,
>
> --
> // Kristoffer Grönlund
> // kgronl...@suse.com
Re: [Pacemaker] node1 fencing itself after node2 being fenced
> -Original Message-
> From: Vladislav Bogdanov [mailto:bub...@hoster-ok.com]
> Sent: 11 February 2014 03:44
> To: pacemaker@oss.clusterlabs.org
> Subject: Re: [Pacemaker] node1 fencing itself after node2 being fenced
>
> Nope, it's Centos6. In few words, It is probably safer for you to stay with
> cman, especially if you need GFS2. gfs_controld is not officially ported to
> corosync2 and is obsolete in EL7 because communication between
> gfs2 and dlm is moved to kernelspace there.
>

OK, thanks. I may do some searching on how to compile corosync2 on CentOS 6 for a different cluster I need to set up that does not have the gfs2 requirement. Thanks for the info.

>
> You need to fix that for sure.
>

I ended up rebuilding all my nodes and adding a third one to see if quorum may have been the issue, but the symptoms are still the same. I then straced clvmd, and it looks like it tries to write to /dev/misc/dlm_clvmd, which doesn't exist on the "failed" node. I attached the trace to an existing bug listed in the CentOS bug tracker:

http://bugs.centos.org/view.php?id=6853

This looks like something to do with clvmd and its locks, but dlm appears to be operating fine for me; I don't see any kern_stop flags for clvmd at all when the node is being fenced.

It is a strange one: if I shut down and reboot any of the nodes cleanly, everything comes back up OK. The issue only appears when I simulate a failure.

>
> Strange message, looks like something is bound to that port already.
> You may want to try dlm in tcp mode btw.
>

I was unable to run dlm in tcp mode as I have dual-homed interfaces, so dlm won't run in tcp mode in this case :) Thanks for the recommendation, though.
Re: [Pacemaker] resource is too active problem in a 2-node cluster
Yes, we have cman (version: cman-3.0.12.1-49). We use manual fencing (I know it is not recommended). There is an external monitoring and fencing service that we use (our own).

Perhaps the subject line "resource is too active problem in a 2-node cluster" was misleading. The real problem is that the resource is *NOT* too active, but pacemaker thinks it is, which leads to an undesirable recovery procedure. See the log lines below:

Feb 04 11:27:38 [45167] gol-5-7-0 pengine: warning: unpack_rsc_op: Processing failed op monitor for GOL-HA on gol-5-7-0: unknown error (1)
Feb 04 11:27:38 [45167] gol-5-7-0 pengine: warning: unpack_rsc_op: Processing failed op monitor for GOL-HA on gol-5-7-6: unknown error (1)
Feb 04 11:27:38 [45167] gol-5-7-0 pengine: error: native_create_actions: Resource GOL-HA (ocf::script.sh) is active on 2 nodes attempting recovery

On 02/10/2014 09:43 PM, Digimer wrote:

> On 10/02/14 09:13 PM, Aggarwal, Ajay wrote:
> > I have a 2 node cluster with no-quorum-policy=ignore. I call these nodes
> > as node-0 and node-1. In addition, I have two cluster resources in a
> > group; an IP-address and an OCF script.
>
> Turning off quorum on a 2-node cluster is fine, in fact, it's required.
> However, that makes stonith all the more important. Without stonith, in
> any cluster but in particular on two node clusters, things will not work
> right.
>
> First and foremost; Configure stonith and test to make sure it works.
>
> > Pacemaker version: 1.1.10
> > Corosync version: 1-4.1-15
> > OS: CentOS 6.4
>
> With CentOS/RHEL 6, you need cman as well. Please be sure to also
> configure fence_pcmk in cluster.conf to "hook" it into pacemaker's real
> fencing.
>
> > What am I doing wrong?
>
> > name="stonith-enabled" value="false"/>
>
> That. :)
>
> Once you have stonith working, see if the problem remains.
Re: [Pacemaker] [Gluster-users] Pacemaker and GlusterFS
Hi Vossel,

I already do this.

 Resource: home (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=localhost:/home_gv directory=/home fstype=glusterfs
  Operations: start interval=0 timeout=60 (home-start-interval-0)
              stop interval=0 timeout=240 (home-stop-interval-0)
              monitor interval=30s role=Started (home-monitor-interval-0)

But when I try to start it, I get the error below and can see the error in the log.

Operation start for home (ocf:heartbeat:Filesystem) returned 1
 >  stdout: Mount failed. Please check the log file for more details.
 >  stderr: INFO: Running start for localhost:/home_gv on /home
 >  stderr: ERROR: Couldn't mount filesystem localhost:/home_gv on /home

It works fine with fstab and mount -a:

[root@srvmail0 ~]# mount -a
[root@srvmail0 ~]# df
Filesystem                   1K-blocks     Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root   1491664  1298072    117816  92% /
tmpfs                           312236    44080    268156  15% /dev/shm
/dev/xvda1                      495844    75560    394684  17% /boot
/dev/xvdb1                    20954552 16049876   4904676  77% /gv
/dev/xvdc1                     2063504  1133824    824860  58% /var
localhost:/gv_home            20954496 16049920   4904576  77% /home
[root@srvmail0 ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Wed Dec 19 18:01:54 2012
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/VolGroup-lv_root               /         ext4       defaults        1 1
UUID=a7af8398-cbea-495f-80cd-1a642d94d9f4  /boot     ext4       defaults        1 2
/dev/mapper/VolGroup-lv_swap               swap      swap       defaults        0 0
tmpfs                                      /dev/shm  tmpfs      defaults        0 0
devpts                                     /dev/pts  devpts     gid=5,mode=620  0 0
sysfs                                      /sys      sysfs      defaults        0 0
proc                                       /proc     proc       defaults        0 0
/dev/xvdb1                                 /gv       xfs        defaults        1 1
/dev/xvdc1                                 /var      ext4       defaults        1 1
localhost:/gv_home                         /home     glusterfs  _netdev         0 0
[root@srvmail0 ~]#

Regards,

On 07-02-2014 17:53, David Vossel wrote:

> - Original Message -
> > From: "Jefferson Carlos Machado"
> > To: "The Pacemaker cluster resource manager" , gluster-us...@gluster.org
> > Sent: Friday, February 7, 2014 11:55:37 AM
> > Subject: [Pacemaker] [Gluster-users] Pacemaker and GlusterFS
> >
> > Hi,
> >
> > How the best way to create a resource filesystem managed type glusterfs?
>
> I suppose using the Filesystem resource agent.
> https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/Filesystem
>
> -- Vossel
>
> > Regards,
> > ___
> > Gluster-users mailing list
> > gluster-us...@gluster.org
> > http://supercolony.gluster.org/mailman/listinfo/gluster-users
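One detail in the output above may be worth checking: the resource's device attribute reads localhost:/home_gv, while the fstab entry that mounts successfully reads localhost:/gv_home. The thread does not confirm that this is the cause of the failure, so the following is only a sketch of how the two values could be compared. The helper name `gluster_devices_from_fstab` is hypothetical, and the strings are copied from the message rather than read from a live system.

```python
def gluster_devices_from_fstab(fstab_text):
    """Return {mount_point: device} for the glusterfs entries in fstab text."""
    devices = {}
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and anything that is not a glusterfs mount.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        if fields[2] == "glusterfs":
            devices[fields[1]] = fields[0]
    return devices


# Values copied from the message above (not read from a running system).
FSTAB = """\
/dev/xvdc1          /var   ext4       defaults 1 1
localhost:/gv_home  /home  glusterfs  _netdev  0 0
"""
RESOURCE_DEVICE = "localhost:/home_gv"  # from the resource's Attributes line
RESOURCE_DIRECTORY = "/home"

if __name__ == "__main__":
    fstab_device = gluster_devices_from_fstab(FSTAB).get(RESOURCE_DIRECTORY)
    print("fstab device:   ", fstab_device)
    print("resource device:", RESOURCE_DEVICE)
    print("match:          ", fstab_device == RESOURCE_DEVICE)
```

If the volume is in fact named gv_home, aligning the resource's device attribute with the fstab entry would be the first thing to try.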
Re: [Pacemaker] pacemaker. config safe and create a new cluster?
On 2014-02-11T11:38:15, Beo Banks wrote:

> can i use this configuation to create a new cluster system?
> maybe with crm configure safe > whatever-bak
> change the hostname/ip in whatever-bak
> copy the file to the new cluster system install all services
> and then crm configure load whatever-bak
>
> is that works?

Sure. I copy & paste between cluster configurations all the time.

Regards,
    Lars

--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
[Pacemaker] pacemaker. config safe and create a new cluster?
hi,

after a long time i have a "good" cluster configuration now.
can i use this configuration to create a new cluster system?

maybe with
crm configure safe > whatever-bak
change the hostname/ip in whatever-bak
copy the file to the new cluster system
install all services
and then crm configure load whatever-bak

does that work?
the services are the same (only the hostname / ip is different).

thanks
beo
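The "change the hostname/ip in whatever-bak" step above can be scripted. What follows is only a sketch under assumptions: the function name `retarget_config`, the sample configuration text, and every host name and address in it are hypothetical. It covers the text substitution alone, not the crm save or load commands. Names are replaced only as whole tokens, so node1 does not touch node10, and 192.168.1.10 does not touch 192.168.1.100.

```python
import re


def retarget_config(text, replacements):
    """Replace whole-token host names or IP addresses in a saved configuration.

    A match must not be preceded by a word character, dot, or hyphen, and
    must not be followed by a word character, a hyphen, or a dot plus a
    word character. Identifiers that merely contain the old name (for
    example "prefer-node1") are therefore left alone.
    """
    for old, new in replacements.items():
        pattern = r"(?<![\w.-])" + re.escape(old) + r"(?![\w-]|\.\w)"
        # A function replacement avoids backslash interpretation in `new`.
        text = re.sub(pattern, lambda match, value=new: value, text)
    return text


# Hypothetical saved configuration and hypothetical new names.
SAMPLE = (
    "node node1\n"
    "node node2\n"
    "primitive vip ocf:heartbeat:IPaddr2 params ip=192.168.1.10 cidr_netmask=24\n"
    "primitive vip2 ocf:heartbeat:IPaddr2 params ip=192.168.1.100 cidr_netmask=24\n"
    "location prefer-node1 vip 100: node1\n"
)
MAPPING = {"node1": "alpha1", "node2": "alpha2", "192.168.1.10": "10.0.0.10"}

if __name__ == "__main__":
    print(retarget_config(SAMPLE, MAPPING))
```

Reviewing the rewritten file by hand before loading it on the new cluster is still advisable, since constraint ids and other free-form names are deliberately left unchanged.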
[Pacemaker] display order in crm_mon output
Hi List,

we've recovered a cluster after a failure and used a previously exported cib.xml. Everything is back to a normal state.

The strange thing is that the order in the output of crm_mon is not like before. Can anyone shed some light on this, please? What affects the order of the displayed resources? Can we rearrange it somehow?

Before the failure:

 Resource Group: cluster1
     p_bond0       (ocf::heartbeat:IPaddr2):       Started node1
     p_vlan100     (ocf::heartbeat:IPaddr2):       Started node1
     p_vlan200     (ocf::heartbeat:IPaddr2):       Started node1
     p_route       (ocf::heartbeat:Route):         Started node1
     p_conntrackd  (lsb:conntrackd-sync):          Started node1
 Clone Set: pingclone [p_ping]
     Started: [ node1 ]
     Stopped: [ p_ping:1 ]
 p_vpn_B   (ocf::heartbeat:anything):      Started node1
 p_vpn_C   (ocf::heartbeat:anything):      Started node1
 p_vpn_H   (ocf::heartbeat:anything):      Started node1
 p_vpn_K   (ocf::heartbeat:anything):      Started node1
 p_vpn_L1  (ocf::heartbeat:anything):      Started node1
 p_vpn_LS  (ocf::heartbeat:anything):      Started node1
 p_vpn_M   (ocf::heartbeat:anything):      Started node1

After the recovery:

 p_vpn_H   (ocf::heartbeat:anything):      Started node1
 p_vpn_K   (ocf::heartbeat:anything):      Started node1
 p_vpn_L1  (ocf::heartbeat:anything):      Started node1
 p_vpn_LS  (ocf::heartbeat:anything):      Started node1
 p_vpn_M   (ocf::heartbeat:anything):      Started node1
 Resource Group: cluster1
     p_bond        (ocf::heartbeat:IPaddr2):       Started node1
     p_vlan100     (ocf::heartbeat:IPaddr2):       Started node1
     p_vlan200     (ocf::heartbeat:IPaddr2):       Started node1
     p_route       (ocf::heartbeat:Route):         Started node1
     p_conntrackd  (lsb:conntrackd-sync):          Started node1
 Clone Set: pingclone [p_ping]
     Started: [ node1 ]
     Stopped: [ p_ping:1 ]
 p_vpn_B   (ocf::heartbeat:anything):      Started node1
 p_vpn_C   (ocf::heartbeat:anything):      Started node1

Stefan
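One way to investigate the question above is to list the order in which top-level resources appear in the exported cib.xml and compare it with the crm_mon output. The sketch below rests on an assumption the thread does not confirm: that crm_mon shows resources in the order they occur in the `<resources>` section of the CIB. The function name `top_level_resource_order` and the trimmed sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def top_level_resource_order(cib_xml):
    """Return (tag, id) pairs for the direct children of <resources>, in file order."""
    root = ET.fromstring(cib_xml)
    resources = root.find("./configuration/resources")
    if resources is None:
        return []
    return [(child.tag, child.get("id")) for child in resources]


# Hypothetical, heavily trimmed CIB using ids from the message above.
SAMPLE_CIB = """\
<cib>
  <configuration>
    <resources>
      <primitive id="p_vpn_H" class="ocf" provider="heartbeat" type="anything"/>
      <group id="cluster1">
        <primitive id="p_bond" class="ocf" provider="heartbeat" type="IPaddr2"/>
      </group>
      <clone id="pingclone">
        <primitive id="p_ping" class="ocf" provider="pacemaker" type="ping"/>
      </clone>
      <primitive id="p_vpn_B" class="ocf" provider="heartbeat" type="anything"/>
    </resources>
  </configuration>
</cib>
"""

if __name__ == "__main__":
    for tag, resource_id in top_level_resource_order(SAMPLE_CIB):
        print(tag, resource_id)
```

If the order printed for the real cib.xml matches the "after the recovery" listing, that would suggest the export or re-import changed the element order, and reordering the elements in the CIB would be the place to look.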