Hi,

Looking for some guidance here, as we are completely blocked otherwise :(
-Regards
Nikhil

On Fri, Apr 29, 2016 at 6:11 PM, Sriram <sriram...@gmail.com> wrote:

> Corrected the subject.
>
> We went ahead and captured corosync debug logs for our ppc board.
> After log analysis and comparison with the successful logs (from the x86
> machine), we didn't find
> *"[ MAIN ] Completed service synchronization, ready to provide service."*
> in the ppc logs.
> So it looks like corosync is not in a position to accept connections from
> Pacemaker.
> I even tried with the new corosync.conf, with no success.
>
> Any hints on this issue would be really helpful.
>
> Attaching ppc_notworking.log, x86_working.log, corosync.conf.
>
> Regards,
> Sriram
>
> On Fri, Apr 29, 2016 at 2:44 PM, Sriram <sriram...@gmail.com> wrote:
>
>> Hi,
>>
>> I went ahead and made some changes in the file system (I brought in
>> /etc/init.d/corosync, /etc/init.d/pacemaker and /etc/sysconfig). After
>> that I was able to run "pcs cluster start".
>> But it failed with the following error:
>> # pcs cluster start
>> Starting Cluster...
>> Starting Pacemaker Cluster Manager[FAILED]
>> Error: unable to start pacemaker
>>
>> And in /var/log/pacemaker.log, I saw these errors:
>> pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 4s
>> Apr 29 08:53:47 [15863] node_cu pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 5s
>> Apr 29 08:53:52 [15863] node_cu pacemakerd: warning: mcp_read_config: Could not connect to Cluster Configuration Database API, error 6
>> Apr 29 08:53:52 [15863] node_cu pacemakerd: notice: main: Could not obtain corosync config data, exiting
>> Apr 29 08:53:52 [15863] node_cu pacemakerd: info: crm_xml_cleanup: Cleaning up memory from libxml2
>>
>> And in /var/log/Debuglog, I saw these errors coming from corosync:
>> 20160429 085347.487050 airv_cu daemon.warn corosync[12857]: [QB ] Denied connection, is not ready (12857-15863-14)
>> 20160429 085347.487067 airv_cu daemon.info corosync[12857]: [QB ] Denied connection, is not ready (12857-15863-14)
>>
>> I browsed the libqb code and found that it is failing in
>>
>> https://github.com/ClusterLabs/libqb/blob/master/lib/ipc_setup.c
>>
>> Line 600:
>> handle_new_connection function
>>
>> Line 637:
>> if (auth_result == 0 && c->service->serv_fns.connection_accept) {
>>         res = c->service->serv_fns.connection_accept(c,
>>                                                      c->euid, c->egid);
>> }
>> if (res != 0) {
>>         goto send_response;
>> }
>>
>> Any hints on this issue would be really helpful for me to go ahead.
>> Please let me know if any logs are required.
>>
>> Regards,
>> Sriram
>>
>> On Thu, Apr 28, 2016 at 2:42 PM, Sriram <sriram...@gmail.com> wrote:
>>
>>> Thanks Ken and Emmanuel.
>>> It's a big-endian machine. I will try running "pcs cluster setup"
>>> and "pcs cluster start".
>>> Inside cluster.py, "service pacemaker start" and "service corosync
>>> start" are executed to bring up pacemaker and corosync.
>>> Those service scripts, and the infrastructure needed to bring up the
>>> processes in that manner, don't exist on my board.
>>> As it is an embedded board with limited memory, a full-fledged Linux is
>>> not installed.
>>> Just curious to know what could be the reason pacemaker throws this
>>> error:
>>>
>>> *"cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 1s"*
>>>
>>> Thanks for the response.
>>>
>>> Regards,
>>> Sriram.
>>>
>>> On Thu, Apr 28, 2016 at 8:55 AM, Ken Gaillot <kgail...@redhat.com> wrote:
>>>
>>>> On 04/27/2016 11:25 AM, emmanuel segura wrote:
>>>> > you need to use pcs to do everything, pcs cluster setup and pcs
>>>> > cluster start, try to use the redhat docs for more information.
>>>>
>>>> Agreed -- pcs cluster setup will create a proper corosync.conf for you.
>>>> Your corosync.conf below uses corosync 1 syntax, and there were
>>>> significant changes in corosync 2. In particular, you don't need the
>>>> file created in step 4, because pacemaker is no longer launched via a
>>>> corosync plugin.
>>>>
>>>> > 2016-04-27 17:28 GMT+02:00 Sriram <sriram...@gmail.com>:
>>>> >> Dear All,
>>>> >>
>>>> >> I'm trying to use pacemaker and corosync for a clustering requirement
>>>> >> that came up recently.
>>>> >> We have cross-compiled corosync, pacemaker and pcs (python) for a ppc
>>>> >> environment (the target board where pacemaker and corosync are
>>>> >> supposed to run).
>>>> >> I'm having trouble bringing up pacemaker in that environment, though
>>>> >> I could successfully bring up corosync.
>>>> >> Any help is welcome.
>>>> >>
>>>> >> I'm using these versions of pacemaker and corosync:
>>>> >> [root@node_cu pacemaker]# corosync -v
>>>> >> Corosync Cluster Engine, version '2.3.5'
>>>> >> Copyright (c) 2006-2009 Red Hat, Inc.
>>>> >> [root@node_cu pacemaker]# pacemakerd -$
>>>> >> Pacemaker 1.1.14
>>>> >> Written by Andrew Beekhof
>>>> >>
>>>> >> For running corosync, I did the following.
>>>> >>
>>>> >> 1. Created the following directories:
>>>> >> /var/lib/pacemaker
>>>> >> /var/lib/corosync
>>>> >> /var/lib/pacemaker
>>>> >> /var/lib/pacemaker/cores
>>>> >> /var/lib/pacemaker/pengine
>>>> >> /var/lib/pacemaker/blackbox
>>>> >> /var/lib/pacemaker/cib
>>>> >>
>>>> >> 2. Created a file called corosync.conf under the /etc/corosync folder
>>>> >> with the following contents:
>>>> >>
>>>> >> totem {
>>>> >>
>>>> >>         version: 2
>>>> >>         token: 5000
>>>> >>         token_retransmits_before_loss_const: 20
>>>> >>         join: 1000
>>>> >>         consensus: 7500
>>>> >>         vsftype: none
>>>> >>         max_messages: 20
>>>> >>         secauth: off
>>>> >>         cluster_name: mycluster
>>>> >>         transport: udpu
>>>> >>         threads: 0
>>>> >>         clear_node_high_bit: yes
>>>> >>
>>>> >>         interface {
>>>> >>                 ringnumber: 0
>>>> >>                 # The following three values need to be set based on your environment
>>>> >>                 bindnetaddr: 10.x.x.x
>>>> >>                 mcastaddr: 226.94.1.1
>>>> >>                 mcastport: 5405
>>>> >>         }
>>>> >> }
>>>> >>
>>>> >> logging {
>>>> >>         fileline: off
>>>> >>         to_syslog: yes
>>>> >>         to_stderr: no
>>>> >>         to_syslog: yes
>>>> >>         logfile: /var/log/corosync.log
>>>> >>         syslog_facility: daemon
>>>> >>         debug: on
>>>> >>         timestamp: on
>>>> >> }
>>>> >>
>>>> >> amf {
>>>> >>         mode: disabled
>>>> >> }
>>>> >>
>>>> >> quorum {
>>>> >>         provider: corosync_votequorum
>>>> >> }
>>>> >>
>>>> >> nodelist {
>>>> >>         node {
>>>> >>                 ring0_addr: node_cu
>>>> >>                 nodeid: 1
>>>> >>         }
>>>> >> }
>>>> >>
>>>> >> 3. Created authkey under /etc/corosync
>>>> >>
>>>> >> 4. Created a file called pcmk under /etc/corosync/service.d with the
>>>> >> contents below:
>>>> >> cat pcmk
>>>> >> service {
>>>> >>         # Load the Pacemaker Cluster Resource Manager
>>>> >>         name: pacemaker
>>>> >>         ver: 1
>>>> >> }
>>>> >>
>>>> >> 5. Added the node name "node_cu" in /etc/hosts with the 10.X.X.X ip
>>>> >>
>>>> >> 6. ./corosync -f -p & --> this step started corosync
>>>> >>
>>>> >> [root@node_cu pacemaker]# netstat -alpn | grep -i coros
>>>> >> udp        0      0 10.X.X.X:61841      0.0.0.0:*                     9133/corosync
>>>> >> udp        0      0 10.X.X.X:5405       0.0.0.0:*                     9133/corosync
>>>> >> unix  2      [ ACC ]     STREAM     LISTENING     148888 9133/corosync  @quorum
>>>> >> unix  2      [ ACC ]     STREAM     LISTENING     148884 9133/corosync  @cmap
>>>> >> unix  2      [ ACC ]     STREAM     LISTENING     148887 9133/corosync  @votequorum
>>>> >> unix  2      [ ACC ]     STREAM     LISTENING     148885 9133/corosync  @cfg
>>>> >> unix  2      [ ACC ]     STREAM     LISTENING     148886 9133/corosync  @cpg
>>>> >> unix  2      [ ]         DGRAM                    148840 9133/corosync
>>>> >>
>>>> >> 7. ./pacemakerd -f & gives the following error and exits.
>>>> >> [root@node_cu pacemaker]# pacemakerd -f
>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 1s
>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 2s
>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 3s
>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 4s
>>>> >> cmap connection setup failed: CS_ERR_TRY_AGAIN. Retrying in 5s
>>>> >> Could not connect to Cluster Configuration Database API, error 6
>>>> >>
>>>> >> Can you please point out what is missing in these steps?
>>>> >>
>>>> >> Before trying these steps, I tried running "pcs cluster start", but
>>>> >> that command fails with "service" script not found, as the root
>>>> >> filesystem contains neither /etc/init.d/ nor /sbin/service.
>>>> >>
>>>> >> So the plan is to bring up corosync and pacemaker manually, and later
>>>> >> do the cluster configuration using "pcs" commands.
>>>> >>
>>>> >> Regards,
>>>> >> Sriram
>>>> >>
>>>> >> _______________________________________________
>>>> >> Users mailing list: Users@clusterlabs.org
>>>> >> http://clusterlabs.org/mailman/listinfo/users
>>>> >>
>>>> >> Project Home: http://www.clusterlabs.org
>>>> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> >> Bugs: http://bugs.clusterlabs.org
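
For anyone repeating the log comparison described in this thread, here is a minimal Python sketch (not something posted by the original authors) that scans a corosync log for the "Completed service synchronization, ready to provide service." message and counts the "Denied connection, is not ready" messages. The two marker strings are taken from the log excerpts quoted in the thread; the function and constant names are hypothetical, chosen only for illustration.

```python
# Marker strings copied from the log excerpts quoted in this thread.
READY_MARKER = "Completed service synchronization, ready to provide service."
DENIED_MARKER = "Denied connection, is not ready"


def summarize_corosync_log(lines):
    """Return (ready, denied_count) for an iterable of log lines.

    ready        -- True if any line contains READY_MARKER
    denied_count -- number of lines containing DENIED_MARKER
    """
    ready = False
    denied_count = 0
    for line in lines:
        if READY_MARKER in line:
            ready = True
        if DENIED_MARKER in line:
            denied_count += 1
    return ready, denied_count
```

A file object can be passed directly, for example `summarize_corosync_log(open("/var/log/corosync.log", errors="replace"))`. Running it on both the working and non-working logs gives a quick side-by-side check, though it says nothing about why synchronization did not complete.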