Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts

2016-04-29 Thread Ken Gaillot
On 04/28/2016 10:24 AM, Lars Marowsky-Bree wrote:
> On 2016-04-27T12:10:10, Klaus Wenninger wrote:
> 
>>> Having things in ARGV[] is always risky due to them being exposed more
>>> easily via ps. Environment variables or stdin appear better.
>> What made you assume the recipient is being passed as argument?
>>
>> The environment variable CRM_alert_recipient is being used to pass it.
> 
> Ah, excellent! But what made me think that this would be passed as
> arguments is that your announcement said: "Each alert may have any
> number of recipients configured. These values will simply be passed to
> the script as *arguments*." ;-)

Yes, that was my mistake in the original announcement. :-/

An early design had the script called once with all recipients as
arguments, but in the implementation we went with the script being
called once per recipient (and using only the environment variable).
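
To illustrate the per-recipient calling convention described above, here
is a minimal sketch of an alert script in Python. Only the environment
variable name CRM_alert_recipient comes from this thread; treating the
recipient as a file path to append to, and the function names, are
assumptions made purely for illustration.

```python
import os
import sys


def recipient_from_env(environ=os.environ):
    """Return the single recipient for this invocation, or None if unset."""
    value = environ.get("CRM_alert_recipient", "").strip()
    return value or None


def main():
    # The script runs once per configured recipient, so there is exactly
    # one value to read; nothing is taken from sys.argv.
    recipient = recipient_from_env()
    if recipient is None:
        print("no recipient configured", file=sys.stderr)
        return 1
    # Assumption for illustration: the recipient is a file path to append to.
    with open(recipient, "a") as handle:
        handle.write("cluster alert received\n")
    return 0
```

A real script would finish by calling main() and exiting with its return
value. Because the recipient never appears in the argument list, it is
not exposed through ps.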


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] [ClusterLab] : Corosync not initializing successfully

2016-04-29 Thread Sriram
Corrected the subject.

We went ahead and captured corosync debug logs on our ppc board.
After analysing them and comparing them with the logs from a successful
run on an x86 machine, we did not find the line
"[ MAIN  ] Completed service synchronization, ready to provide service."
in the ppc logs.
So it looks like corosync never reaches the state in which it accepts
connections from Pacemaker.
I also tried a new corosync.conf, with no success.
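
The log comparison described above can be automated with a small check.
This is only a sketch: the marker string is taken from the message
quoted in this mail, while the function names are hypothetical.

```python
READY_MARKER = "Completed service synchronization, ready to provide service"


def reached_ready_state(log_lines):
    """Return True if any line contains corosync's ready-to-serve message."""
    return any(READY_MARKER in line for line in log_lines)


def check_log_file(path):
    """Scan a log file on disk for the ready-to-serve message."""
    with open(path, errors="replace") as handle:
        return reached_ready_state(handle)
```

Running check_log_file() on each of the two attached logs should report
True for the x86 log and False for the ppc log if the observation above
holds.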

Any hints on this issue would be really helpful.

Attaching ppc_notworking.log, x86_working.log, corosync.conf.

Regards,
Sriram



On Fri, Apr 29, 2016 at 2:44 PM, Sriram <sriram...@gmail.com> wrote:

> Hi,
>
> I went ahead and made some changes in file system(Like I brought in
> /etc/init.d/corosync and /etc/init.d/pacemaker, /etc/sysconfig ), After
> that I was able to run  "pcs cluster start".
> But it failed with the following error
>  # pcs cluster start
> Starting Cluster...
> Starting Pacemaker Cluster Manager[FAILED]
> Error: unable to start pacemaker
>
> And in the /var/log/pacemaker.log, I saw these errors
> pacemakerd: info: mcp_read_config:  cmap connection setup failed:
> CS_ERR_TRY_AGAIN.  Retrying in 4s
> Apr 29 08:53:47 [15863] node_cu pacemakerd: info: mcp_read_config:
> cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 5s
> Apr 29 08:53:52 [15863] node_cu pacemakerd:  warning: mcp_read_config:
> Could not connect to Cluster Configuration Database API, error 6
> Apr 29 08:53:52 [15863] node_cu pacemakerd:   notice: main: Could not
> obtain corosync config data, exiting
> Apr 29 08:53:52 [15863] node_cu pacemakerd: info: crm_xml_cleanup:
> Cleaning up memory from libxml2
>
>
> And in the /var/log/Debuglog, I saw these errors coming from corosync
> 20160429 085347.487050 airv_cu daemon.warn corosync[12857]:   [QB]
> Denied connection, is not ready (12857-15863-14)
> 20160429 085347.487067 airv_cu daemon.info corosync[12857]:   [QB]
> Denied connection, is not ready (12857-15863-14)
>
>
> I browsed the code of libqb to find that it is failing in
>
> https://github.com/ClusterLabs/libqb/blob/master/lib/ipc_setup.c
>
> Line 600 :
> handle_new_connection function
>
> Line 637:
> if (auth_result == 0 && c->service->serv_fns.connection_accept) {
> res = c->service->serv_fns.connection_accept(c,
>  c->euid, c->egid);
> }
> if (res != 0) {
> goto send_response;
> }
>
> Any hints on this issue would be really helpful for me to go ahead.
> Please let me know if any logs are required,
>
> Regards,
> Sriram
>
> On Thu, Apr 28, 2016 at 2:42 PM, Sriram <sriram...@gmail.com> wrote:
>
>> Thanks Ken and Emmanuel.
>> Its a big endian machine. I will try with running "pcs cluster setup" and
>> "pcs cluster start"
>> Inside cluster.py, "service pacemaker start" and "service corosync start"
>> are executed to bring up pacemaker and corosync.
>> Those service scripts and the infrastructure needed to bring up the
>> processes in the above said manner doesn't exist in my board.
>> As it is a embedded board with the limited memory, full fledged linux is
>> not installed.
>> Just curious to know, what could be reason the pacemaker throws that
>> error.
>>
>>
>>
>> *"cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 1s"*
>> Thanks for response.
>>
>> Regards,
>> Sriram.
>>
>> On Thu, Apr 28, 2016 at 8:55 AM, Ken Gaillot <kgail...@redhat.com> wrote:
>>
>>> On 04/27/2016 11:25 AM, emmanuel segura wrote:
>>> > you need to use pcs to do everything, pcs cluster setup and pcs
>>> > cluster start, try to use the redhat docs for more information.
>>>
>>> Agreed -- pcs cluster setup will create a proper corosync.conf for you.
>>> Your corosync.conf below uses corosync 1 syntax, and there were
>>> significant changes in corosync 2. In particular, you don't need the
>>> file created in step 4, because pacemaker is no longer launched via a
>>> corosync plugin.
>>>
>>> > 2016-04-27 17:28 GMT+02:00 Sriram <sriram...@gmail.com>:
>>> >> Dear All,
>>> >>
>>> >> I m trying to use pacemaker and corosync for the clustering
>>> requirement that
>>> >> came up recently.
>>> >> We have cross compiled corosync, pacemaker and pcs(python) for ppc
>>> >> environment (Target board where pacemaker and corosync are supposed
>>> to run)
>>> >> I m having trouble bringing up pacemaker in that envi

Re: [ClusterLabs] [ClusterLab] : Unable to bring up pacemaker

2016-04-29 Thread Sriram
Hi,

I went ahead and made some changes to the file system (I brought in
/etc/init.d/corosync, /etc/init.d/pacemaker and /etc/sysconfig). After
that I was able to run "pcs cluster start",
but it failed with the following error:
 # pcs cluster start
Starting Cluster...
Starting Pacemaker Cluster Manager[FAILED]
Error: unable to start pacemaker

In /var/log/pacemaker.log, I saw these errors:
pacemakerd: info: mcp_read_config:  cmap connection setup failed:
CS_ERR_TRY_AGAIN.  Retrying in 4s
Apr 29 08:53:47 [15863] node_cu pacemakerd: info: mcp_read_config:
cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 5s
Apr 29 08:53:52 [15863] node_cu pacemakerd:  warning: mcp_read_config:
Could not connect to Cluster Configuration Database API, error 6
Apr 29 08:53:52 [15863] node_cu pacemakerd:   notice: main: Could not
obtain corosync config data, exiting
Apr 29 08:53:52 [15863] node_cu pacemakerd: info: crm_xml_cleanup:
Cleaning up memory from libxml2
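
The log above shows the connection attempt being retried with a growing
delay ("Retrying in 4s", "Retrying in 5s") before pacemakerd gives up
with error 6. The following Python sketch models that pattern. It is an
assumption inferred from the log, not Pacemaker's actual code: the
function name, the limit of five attempts and the one-second increment
are illustrative. The numeric values are the ones corosync uses for
CS_OK and CS_ERR_TRY_AGAIN.

```python
import time

CS_OK = 1             # corosync success code
CS_ERR_TRY_AGAIN = 6  # matches "error 6" in the log above


def connect_with_retries(connect, max_retries=5, sleep=time.sleep):
    """Call connect() until it returns CS_OK or max_retries attempts fail.

    Returns the last result code. The delay grows by one second after
    each failed attempt (1s, 2s, 3s, ...).
    """
    retries = 0
    while True:
        rc = connect()
        if rc == CS_OK:
            return rc
        retries += 1
        sleep(retries)
        if retries >= max_retries:
            return rc
```

If corosync never becomes ready, every attempt returns CS_ERR_TRY_AGAIN
and the caller ends up with error 6, which is what the log reports.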


In /var/log/Debuglog, I saw these messages coming from corosync:
20160429 085347.487050 airv_cu daemon.warn corosync[12857]:   [QB]
Denied connection, is not ready (12857-15863-14)
20160429 085347.487067 airv_cu daemon.info corosync[12857]:   [QB]
Denied connection, is not ready (12857-15863-14)


I browsed the libqb code and found that the connection is rejected in

https://github.com/ClusterLabs/libqb/blob/master/lib/ipc_setup.c

Line 600 :
handle_new_connection function

Line 637:
if (auth_result == 0 && c->service->serv_fns.connection_accept) {
        res = c->service->serv_fns.connection_accept(c,
                                                     c->euid, c->egid);
}
if (res != 0) {
        goto send_response;
}

Any hints on this issue would be really helpful for me to move forward.
Please let me know if any logs are required.

Regards,
Sriram

On Thu, Apr 28, 2016 at 2:42 PM, Sriram <sriram...@gmail.com> wrote:

> Thanks Ken and Emmanuel.
> Its a big endian machine. I will try with running "pcs cluster setup" and
> "pcs cluster start"
> Inside cluster.py, "service pacemaker start" and "service corosync start"
> are executed to bring up pacemaker and corosync.
> Those service scripts and the infrastructure needed to bring up the
> processes in the above said manner doesn't exist in my board.
> As it is a embedded board with the limited memory, full fledged linux is
> not installed.
> Just curious to know, what could be reason the pacemaker throws that error.
>
>
>
> *"cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 1s"*
> Thanks for response.
>
> Regards,
> Sriram.
>
> On Thu, Apr 28, 2016 at 8:55 AM, Ken Gaillot <kgail...@redhat.com> wrote:
>
>> On 04/27/2016 11:25 AM, emmanuel segura wrote:
>> > you need to use pcs to do everything, pcs cluster setup and pcs
>> > cluster start, try to use the redhat docs for more information.
>>
>> Agreed -- pcs cluster setup will create a proper corosync.conf for you.
>> Your corosync.conf below uses corosync 1 syntax, and there were
>> significant changes in corosync 2. In particular, you don't need the
>> file created in step 4, because pacemaker is no longer launched via a
>> corosync plugin.
>>
>> > 2016-04-27 17:28 GMT+02:00 Sriram <sriram...@gmail.com>:
>> >> Dear All,
>> >>
>> >> I m trying to use pacemaker and corosync for the clustering
>> requirement that
>> >> came up recently.
>> >> We have cross compiled corosync, pacemaker and pcs(python) for ppc
>> >> environment (Target board where pacemaker and corosync are supposed to
>> run)
>> >> I m having trouble bringing up pacemaker in that environment, though I
>> could
>> >> successfully bring up corosync.
>> >> Any help is welcome.
>> >>
>> >> I m using these versions of pacemaker and corosync
>> >> [root@node_cu pacemaker]# corosync -v
>> >> Corosync Cluster Engine, version '2.3.5'
>> >> Copyright (c) 2006-2009 Red Hat, Inc.
>> >> [root@node_cu pacemaker]# pacemakerd -$
>> >> Pacemaker 1.1.14
>> >> Written by Andrew Beekhof
>> >>
>> >> For running corosync, I did the following.
>> >> 1. Created the following directories,
>> >> /var/lib/pacemaker
>> >> /var/lib/corosync
>> >> /var/lib/pacemaker
>> >> /var/lib/pacemaker/cores
>> >> /var/lib/pacemaker/pengine
>> >> /var/lib/pacemaker/blackbox
>> >> /var/lib/pacemaker/cib
>> >>
>> >>
>> >> 2. Created a file called corosync.conf under /etc/co