Re-sending as I don't see my post on the thread.

On Sun, May 1, 2016 at 4:21 PM, Nikhil Utane <nikhil.subscri...@gmail.com> wrote:
> Hi,
>
> Looking for some guidance here as we are completely blocked otherwise :(.
>
> -Regards
> Nikhil
>
> On Fri, Apr 29, 2016 at 6:11 PM, Sriram <sriram...@gmail.com> wrote:
>
>> Corrected the subject.
>>
>> We went ahead and captured corosync debug logs for our ppc board.
>> After analysing the logs and comparing them with the successful logs
>> (from the x86 machine), we didn't find
>> *"[ MAIN ] Completed service synchronization, ready to provide service."*
>> in the ppc logs.
>> So it looks like corosync is not in a position to accept connections
>> from Pacemaker.
>> I also tried with the new corosync.conf, with no success.
>>
>> Any hints on this issue would be really helpful.
>>
>> Attaching ppc_notworking.log, x86_working.log, corosync.conf.
>>
>> Regards,
>> Sriram
>>
>> On Fri, Apr 29, 2016 at 2:44 PM, Sriram <sriram...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I went ahead and made some changes in the file system (I brought in
>>> /etc/init.d/corosync, /etc/init.d/pacemaker and /etc/sysconfig). After
>>> that I was able to run "pcs cluster start".
>>> But it failed with the following error:
>>> # pcs cluster start
>>> Starting Cluster...
>>> Starting Pacemaker Cluster Manager[FAILED]
>>> Error: unable to start pacemaker
>>>
>>> And in /var/log/pacemaker.log, I saw these errors:
>>> pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 4s
>>> Apr 29 08:53:47 [15863] node_cu pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 5s
>>> Apr 29 08:53:52 [15863] node_cu pacemakerd: warning: mcp_read_config: Could not connect to Cluster Configuration Database API, error 6
>>> Apr 29 08:53:52 [15863] node_cu pacemakerd: notice: main: Could not obtain corosync config data, exiting
>>> Apr 29 08:53:52 [15863] node_cu pacemakerd: info: crm_xml_cleanup: Cleaning up memory from libxml2
>>>
>>> And in /var/log/Debuglog, I saw these errors coming from corosync:
>>> 20160429 085347.487050 airv_cu daemon.warn corosync[12857]: [QB ] Denied connection, is not ready (12857-15863-14)
>>> 20160429 085347.487067 airv_cu daemon.info corosync[12857]: [QB ] Denied connection, is not ready (12857-15863-14)
>>>
>>> I browsed the libqb code and found that it is failing in
>>>
>>> https://github.com/ClusterLabs/libqb/blob/master/lib/ipc_setup.c
>>>
>>> Line 600:
>>> handle_new_connection function
>>>
>>> Line 637:
>>> if (auth_result == 0 && c->service->serv_fns.connection_accept) {
>>>         res = c->service->serv_fns.connection_accept(c,
>>>                 c->euid, c->egid);
>>> }
>>> if (res != 0) {
>>>         goto send_response;
>>> }
>>>
>>> Any hints on this issue would be really helpful for me to go ahead.
>>> Please let me know if any logs are required.
>>>
>>> Regards,
>>> Sriram
>>>
>>> On Thu, Apr 28, 2016 at 2:42 PM, Sriram <sriram...@gmail.com> wrote:
>>>
>>>> Thanks Ken and Emmanuel.
>>>> It's a big-endian machine. I will try running "pcs cluster setup"
>>>> and "pcs cluster start".
>>>> Inside cluster.py, "service pacemaker start" and "service corosync
>>>> start" are executed to bring up pacemaker and corosync.
>>>> Those service scripts, and the infrastructure needed to bring up the
>>>> processes in that manner, don't exist on my board.
>>>> As it is an embedded board with limited memory, a full-fledged Linux
>>>> is not installed.
>>>> Just curious to know what could be the reason pacemaker throws this
>>>> error:
>>>>
>>>> *"cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 1s"*
>>>>
>>>> Thanks for the response.
>>>>
>>>> Regards,
>>>> Sriram.
>>>>
>>>> On Thu, Apr 28, 2016 at 8:55 AM, Ken Gaillot <kgail...@redhat.com>
>>>> wrote:
>>>>
>>>>> On 04/27/2016 11:25 AM, emmanuel segura wrote:
>>>>> > you need to use pcs to do everything, pcs cluster setup and pcs
>>>>> > cluster start, try to use the redhat docs for more information.
>>>>>
>>>>> Agreed -- pcs cluster setup will create a proper corosync.conf for you.
>>>>> Your corosync.conf below uses corosync 1 syntax, and there were
>>>>> significant changes in corosync 2. In particular, you don't need the
>>>>> file created in step 4, because pacemaker is no longer launched via a
>>>>> corosync plugin.
>>>>>
>>>>> > 2016-04-27 17:28 GMT+02:00 Sriram <sriram...@gmail.com>:
>>>>> >> Dear All,
>>>>> >>
>>>>> >> I'm trying to use pacemaker and corosync for a clustering
>>>>> >> requirement that came up recently.
>>>>> >> We have cross-compiled corosync, pacemaker and pcs (python) for the
>>>>> >> ppc environment (the target board where pacemaker and corosync are
>>>>> >> supposed to run).
>>>>> >> I'm having trouble bringing up pacemaker in that environment,
>>>>> >> though I could successfully bring up corosync.
>>>>> >> Any help is welcome.
>>>>> >>
>>>>> >> I'm using these versions of pacemaker and corosync:
>>>>> >> [root@node_cu pacemaker]# corosync -v
>>>>> >> Corosync Cluster Engine, version '2.3.5'
>>>>> >> Copyright (c) 2006-2009 Red Hat, Inc.
>>>>> >> [root@node_cu pacemaker]# pacemakerd -$
>>>>> >> Pacemaker 1.1.14
>>>>> >> Written by Andrew Beekhof
>>>>> >>
>>>>> >> For running corosync, I did the following.
>>>>> >> 1. Created the following directories:
>>>>> >> /var/lib/pacemaker
>>>>> >> /var/lib/corosync
>>>>> >> /var/lib/pacemaker
>>>>> >> /var/lib/pacemaker/cores
>>>>> >> /var/lib/pacemaker/pengine
>>>>> >> /var/lib/pacemaker/blackbox
>>>>> >> /var/lib/pacemaker/cib
>>>>> >>
>>>>> >> 2. Created a file called corosync.conf under the /etc/corosync
>>>>> >> folder with the following contents:
>>>>> >>
>>>>> >> totem {
>>>>> >>
>>>>> >>         version: 2
>>>>> >>         token: 5000
>>>>> >>         token_retransmits_before_loss_const: 20
>>>>> >>         join: 1000
>>>>> >>         consensus: 7500
>>>>> >>         vsftype: none
>>>>> >>         max_messages: 20
>>>>> >>         secauth: off
>>>>> >>         cluster_name: mycluster
>>>>> >>         transport: udpu
>>>>> >>         threads: 0
>>>>> >>         clear_node_high_bit: yes
>>>>> >>
>>>>> >>         interface {
>>>>> >>                 ringnumber: 0
>>>>> >>                 # The following three values need to be set based on your environment
>>>>> >>                 bindnetaddr: 10.x.x.x
>>>>> >>                 mcastaddr: 226.94.1.1
>>>>> >>                 mcastport: 5405
>>>>> >>         }
>>>>> >> }
>>>>> >>
>>>>> >> logging {
>>>>> >>         fileline: off
>>>>> >>         to_syslog: yes
>>>>> >>         to_stderr: no
>>>>> >>         to_syslog: yes
>>>>> >>         logfile: /var/log/corosync.log
>>>>> >>         syslog_facility: daemon
>>>>> >>         debug: on
>>>>> >>         timestamp: on
>>>>> >> }
>>>>> >>
>>>>> >> amf {
>>>>> >>         mode: disabled
>>>>> >> }
>>>>> >>
>>>>> >> quorum {
>>>>> >>         provider: corosync_votequorum
>>>>> >> }
>>>>> >>
>>>>> >> nodelist {
>>>>> >>         node {
>>>>> >>                 ring0_addr: node_cu
>>>>> >>                 nodeid: 1
>>>>> >>         }
>>>>> >> }
>>>>> >>
>>>>> >> 3. Created authkey under /etc/corosync
>>>>> >>
>>>>> >> 4. Created a file called pcmk under /etc/corosync/service.d with
>>>>> >> the contents below:
>>>>> >> cat pcmk
>>>>> >> service {
>>>>> >>         # Load the Pacemaker Cluster Resource Manager
>>>>> >>         name: pacemaker
>>>>> >>         ver: 1
>>>>> >> }
>>>>> >>
>>>>> >> 5. Added the node name "node_cu" in /etc/hosts with the 10.X.X.X IP
>>>>> >>
>>>>> >> 6. ./corosync -f -p & --> this step started corosync
>>>>> >>
>>>>> >> [root@node_cu pacemaker]# netstat -alpn | grep -i coros
>>>>> >> udp        0      0 10.X.X.X:61841    0.0.0.0:*               9133/corosync
>>>>> >> udp        0      0 10.X.X.X:5405     0.0.0.0:*               9133/corosync
>>>>> >> unix  2    [ ACC ]   STREAM   LISTENING   148888   9133/corosync   @quorum
>>>>> >> unix  2    [ ACC ]   STREAM   LISTENING   148884   9133/corosync   @cmap
>>>>> >> unix  2    [ ACC ]   STREAM   LISTENING   148887   9133/corosync   @votequorum
>>>>> >> unix  2    [ ACC ]   STREAM   LISTENING   148885   9133/corosync   @cfg
>>>>> >> unix  2    [ ACC ]   STREAM   LISTENING   148886   9133/corosync   @cpg
>>>>> >> unix  2    [ ]       DGRAM                148840   9133/corosync
>>>>> >>
>>>>> >> 7. ./pacemakerd -f & gives the following error and exits.
>>>>> >> [root@node_cu pacemaker]# pacemakerd -f
>>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 1s
>>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 2s
>>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 3s
>>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 4s
>>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 5s
>>>>> >> Could not connect to Cluster Configuration Database API, error 6
>>>>> >>
>>>>> >> Can you please point out what is missing in these steps?
>>>>> >>
>>>>> >> Before trying these steps, I tried running "pcs cluster start", but
>>>>> >> that command fails with "service" script not found, as the root
>>>>> >> filesystem doesn't contain either /etc/init.d/ or /sbin/service.
>>>>> >>
>>>>> >> So the plan is to bring up corosync and pacemaker manually, and
>>>>> >> later do the cluster configuration using "pcs" commands.
>>>>> >>
>>>>> >> Regards,
>>>>> >> Sriram
>>>>> >>
>>>>> >> _______________________________________________
>>>>> >> Users mailing list: Users@clusterlabs.org
>>>>> >> http://clusterlabs.org/mailman/listinfo/users
>>>>> >>
>>>>> >> Project Home: http://www.clusterlabs.org
>>>>> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> >> Bugs: http://bugs.clusterlabs.org
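
For anyone wanting to repeat the log comparison described in the quoted thread without reading both logs by hand, here is a rough sketch, not an official corosync or pacemaker tool. It looks for the "Completed service synchronization, ready to provide service" message that was present in the x86 log and counts the "Denied connection, is not ready" messages from libqb. The function name scan_corosync_log is my own, and the default path is only an assumption taken from the logfile setting in the quoted corosync.conf. Matching is by substring because the log prefix differs between syslog and file output.

```python
import sys

# Substrings taken from the log excerpts quoted in this thread.
READY_MARKER = "Completed service synchronization, ready to provide service"
DENIED_MARKER = "Denied connection, is not ready"


def scan_corosync_log(lines):
    """Return (ready, denied_count) for an iterable of corosync log lines.

    ready        -- True if the readiness message appears at least once
    denied_count -- number of lines reporting a denied IPC connection
    """
    ready = False
    denied = 0
    for line in lines:
        if READY_MARKER in line:
            ready = True
        if DENIED_MARKER in line:
            denied += 1
    return ready, denied


def main(path):
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(path, errors="replace") as fh:
        ready, denied = scan_corosync_log(fh)
    print("ready marker seen:", ready)
    print("denied IPC connections:", denied)


# Only runs when a log path is given on the command line, for example
# /var/log/corosync.log (assumed from the quoted config's logfile setting).
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Running it once against ppc_notworking.log and once against x86_working.log should show the difference Sriram described: the ready marker missing on ppc. It only confirms the symptom and says nothing about why synchronization never completes on the board.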