[ClusterLabs] corosync.conf - reload failed

2018-01-24 Thread Sriram
Hi,


I'm trying to update the mcast IP in corosync.conf in an existing cluster.
I use the "corosync-cfgtool -R" command to reload the corosync
configuration.



corosync-cfgtool -R

Reloading corosync.conf...

Done



But the corosync multicast IP has not changed.



Does corosync provide any command to reload the mcast IP, or do I have to
restart Pacemaker? Kindly let me know.
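If a full restart turns out to be necessary, the fallback I have in mind is a
rolling restart of the stack on each node, one node at a time (a sketch; it
assumes systemd-managed corosync and pacemaker services, which may not match
every setup):

  # on each node, one at a time:
  systemctl stop pacemaker corosync
  # update mcastaddr in /etc/corosync/corosync.conf
  systemctl start corosync pacemaker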



The corosync version is 2.3.5.



Regards,

Sriram.
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: Notification agent and Notification recipients

2017-08-15 Thread Sriram
Thanks for clarifying.

Regards,
Sriram.

On Mon, Aug 14, 2017 at 7:34 PM, Klaus Wenninger 
wrote:

> On 08/14/2017 03:19 PM, Sriram wrote:
>
> Yes, I had precreated the script file with the required permission.
>
> [root@node1 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4140 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh
> [root@node2 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4139 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh
> [root@node3 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4139 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh
>
> Later I observed that the user "hacluster" is not able to create the log
> file at /usr/share/pacemaker/alert_file.log.
> I am sorry, I should have checked the log before posting the query. After
> I changed the path to /tmp/alert_file.log, it is able to create the file
> now.
> Thanks for pointing it out.
>
> I have one more clarification,
>
> If the resource is running on node2:
> [root@node2 tmp]# pcs resource
>  TRR(ocf::heartbeat:TimingRedundancyRA):Started node2
>
> I executed the below command to put it in standby:
> [root@node2 tmp]# pcs node standby node2
>
> The resource shifted to node3 because of its higher location constraint score:
> [root@node2 tmp]# pcs resource
>  TRR(ocf::heartbeat:TimingRedundancyRA):Started node3
>
>
> I got the log file created under node2(resource stopped) and
> node3(resource started).
>
> Node1 was not notified about the resource shift; I mean, no log file was
> created there.
> That is because alerts are designed to notify external agents about
> cluster events; they are not for internal notifications.
>
> Is my understanding correct ?
>
>
> Quite simple: crmd of node1 just didn't have anything to do with shifting
> the resource
> from node2 -> node3. There is no additional information passed between the
> nodes
> just to create a full set of notifications on every node. If you want to
> have a full log
> (or whatever your alert-agent is doing) in one place, this would be up to
> your alert-agent.
>
>
> Regards,
> Klaus
>
>
> Regards,
> Sriram.
>
>
>
> On Mon, Aug 14, 2017 at 5:42 PM, Klaus Wenninger 
> wrote:
>
>> On 08/14/2017 12:32 PM, Sriram wrote:
>>
>> Hi Ken,
>>
>> I used the alerts as well, but they seem not to be working.
>>
>> Please check the below configuration
>> [root@node1 alerts]# pcs config show
>> Cluster Name:
>> Corosync Nodes:
>> Pacemaker Nodes:
>>  node1 node2 node3
>>
>> Resources:
>>  Resource: TRR (class=ocf provider=heartbeat type=TimingRedundancyRA)
>>   Operations: start interval=0s timeout=60s (TRR-start-interval-0s)
>>   stop interval=0s timeout=20s (TRR-stop-interval-0s)
>>   monitor interval=10 timeout=20 (TRR-monitor-interval-10)
>>
>> Stonith Devices:
>> Fencing Levels:
>>
>> Location Constraints:
>>   Resource: TRR
>> Enabled on: node1 (score:100) (id:location-TRR-node1-100)
>> Enabled on: node2 (score:200) (id:location-TRR-node2-200)
>> Enabled on: node3 (score:300) (id:location-TRR-node3-300)
>> Ordering Constraints:
>> Colocation Constraints:
>> Ticket Constraints:
>>
>> Alerts:
>>  Alert: alert_file (path=/usr/share/pacemaker/alert_file.sh)
>>   Options: debug_exec_order=false
>>   Meta options: timeout=15s
>>   Recipients:
>>   Recipient: recipient_alert_file_id (value=/usr/share/pacemaker/alert_file.log)
>>
>>
>> Did you pre-create the file with proper rights? Be aware that the
>> alert-agent
>> is called as user hacluster.
>>
>>
>> Resources Defaults:
>>  resource-stickiness: INFINITY
>> Operations Defaults:
>>  No defaults set
>>
>> Cluster Properties:
>>  cluster-infrastructure: corosync
>>  dc-version: 1.1.15-11.el7_3.4-e174ec8
>>  default-action-timeout: 240
>>  have-watchdog: false
>>  no-quorum-policy: ignore
>>  placement-strategy: balanced
>>  stonith-enabled: false
>>  symmetric-cluster: false
>>
>> Quorum:
>>   Options:
>>
>>
>> /usr/share/pacemaker/alert_file.sh does not get called whenever I
>> trigger a failover scenario.
>> Please let me know if I'm missing anything.
>>
>>
>> Do you get any logs - like for startup of resources - or nothing at all?
>>
>> Regards,
>> Klaus
>>
>>
>>
>>
>> Regards,
>> Sriram.
>>
>> On Tue, Aug 8, 2

Re: [ClusterLabs] Antw: Re: Notification agent and Notification recipients

2017-08-14 Thread Sriram
Yes, I had pre-created the script file with the required permissions.

[root@node1 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
-rwxr-xr-x. 1 root root 4140 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh
[root@node2 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
-rwxr-xr-x. 1 root root 4139 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh
[root@node3 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
-rwxr-xr-x. 1 root root 4139 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh

Later I observed that the user "hacluster" is not able to create the log
file at /usr/share/pacemaker/alert_file.log.
I am sorry, I should have checked the log before posting the query. After I
changed the path to /tmp/alert_file.log, it is able to create the file now.
Thanks for pointing it out.
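For anyone hitting the same problem, pre-creating the recipient file so that
the "hacluster" user can write to it looks roughly like this (a sketch; the
"haclient" group is the usual default and may differ on other installs):

  touch /tmp/alert_file.log
  chown hacluster:haclient /tmp/alert_file.log
  chmod 0640 /tmp/alert_file.log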

I have one more clarification,

If the resource is running on node2:
[root@node2 tmp]# pcs resource
 TRR(ocf::heartbeat:TimingRedundancyRA):Started node2

I executed the below command to put it in standby:
[root@node2 tmp]# pcs node standby node2

The resource shifted to node3 because of its higher location constraint score:
[root@node2 tmp]# pcs resource
 TRR(ocf::heartbeat:TimingRedundancyRA):Started node3


I got the log file created on node2 (resource stopped) and node3 (resource
started).

Node1 was not notified about the resource shift; I mean, no log file was
created there.
That is because alerts are designed to notify external agents about cluster
events; they are not for internal notifications.

Is my understanding correct?

Regards,
Sriram.



On Mon, Aug 14, 2017 at 5:42 PM, Klaus Wenninger 
wrote:

> On 08/14/2017 12:32 PM, Sriram wrote:
>
> Hi Ken,
>
> I used the alerts as well, but they seem not to be working.
>
> Please check the below configuration
> [root@node1 alerts]# pcs config show
> Cluster Name:
> Corosync Nodes:
> Pacemaker Nodes:
>  node1 node2 node3
>
> Resources:
>  Resource: TRR (class=ocf provider=heartbeat type=TimingRedundancyRA)
>   Operations: start interval=0s timeout=60s (TRR-start-interval-0s)
>   stop interval=0s timeout=20s (TRR-stop-interval-0s)
>   monitor interval=10 timeout=20 (TRR-monitor-interval-10)
>
> Stonith Devices:
> Fencing Levels:
>
> Location Constraints:
>   Resource: TRR
> Enabled on: node1 (score:100) (id:location-TRR-node1-100)
> Enabled on: node2 (score:200) (id:location-TRR-node2-200)
> Enabled on: node3 (score:300) (id:location-TRR-node3-300)
> Ordering Constraints:
> Colocation Constraints:
> Ticket Constraints:
>
> Alerts:
>  Alert: alert_file (path=/usr/share/pacemaker/alert_file.sh)
>   Options: debug_exec_order=false
>   Meta options: timeout=15s
>   Recipients:
>   Recipient: recipient_alert_file_id (value=/usr/share/pacemaker/alert_file.log)
>
>
> Did you pre-create the file with proper rights? Be aware that the
> alert-agent
> is called as user hacluster.
>
>
> Resources Defaults:
>  resource-stickiness: INFINITY
> Operations Defaults:
>  No defaults set
>
> Cluster Properties:
>  cluster-infrastructure: corosync
>  dc-version: 1.1.15-11.el7_3.4-e174ec8
>  default-action-timeout: 240
>  have-watchdog: false
>  no-quorum-policy: ignore
>  placement-strategy: balanced
>  stonith-enabled: false
>  symmetric-cluster: false
>
> Quorum:
>   Options:
>
>
> /usr/share/pacemaker/alert_file.sh does not get called whenever I trigger
> a failover scenario.
> Please let me know if I'm missing anything.
>
>
> Do you get any logs - like for startup of resources - or nothing at all?
>
> Regards,
> Klaus
>
>
>
>
> Regards,
> Sriram.
>
> On Tue, Aug 8, 2017 at 8:29 PM, Ken Gaillot  wrote:
>
>> On Tue, 2017-08-08 at 17:40 +0530, Sriram wrote:
>> > Hi Ulrich,
>> >
>> >
>> > Please see inline.
>> >
>> > On Tue, Aug 8, 2017 at 2:01 PM, Ulrich Windl
>> >  wrote:
>> > >>> Sriram  wrote on 08.08.2017 at
>> > 09:30 in message
>> > > > +dv...@mail.gmail.com>:
>> > > Hi Ken & Jan,
>> > >
>> > > In the cluster we have, there is only one resource running.
>> > It's an opt-in
>> > > cluster with resource-stickiness set to INFINITY.
>> > >
>> > > Just to clarify my question, lets take a scenario where
>> > there are four
>> > > nodes N1, N2, N3, N4
>> > > a. N1 comes up first, starts the cluster.
>> >
>> > The cluster will start once it has a quorum.
>> >
>> > > b. N1 Checks that there is no resource running

Re: [ClusterLabs] Antw: Re: Notification agent and Notification recipients

2017-08-14 Thread Sriram
Hi Ken,

I used the alerts as well, but they seem not to be working.

Please check the below configuration
[root@node1 alerts]# pcs config show
Cluster Name:
Corosync Nodes:
Pacemaker Nodes:
 node1 node2 node3

Resources:
 Resource: TRR (class=ocf provider=heartbeat type=TimingRedundancyRA)
  Operations: start interval=0s timeout=60s (TRR-start-interval-0s)
  stop interval=0s timeout=20s (TRR-stop-interval-0s)
  monitor interval=10 timeout=20 (TRR-monitor-interval-10)

Stonith Devices:
Fencing Levels:

Location Constraints:
  Resource: TRR
Enabled on: node1 (score:100) (id:location-TRR-node1-100)
Enabled on: node2 (score:200) (id:location-TRR-node2-200)
Enabled on: node3 (score:300) (id:location-TRR-node3-300)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:

Alerts:
 Alert: alert_file (path=/usr/share/pacemaker/alert_file.sh)
  Options: debug_exec_order=false
  Meta options: timeout=15s
  Recipients:
   Recipient: recipient_alert_file_id (value=/usr/share/pacemaker/alert_file.log)

Resources Defaults:
 resource-stickiness: INFINITY
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 dc-version: 1.1.15-11.el7_3.4-e174ec8
 default-action-timeout: 240
 have-watchdog: false
 no-quorum-policy: ignore
 placement-strategy: balanced
 stonith-enabled: false
 symmetric-cluster: false

Quorum:
  Options:


/usr/share/pacemaker/alert_file.sh does not get called whenever I trigger a
failover scenario.
Please let me know if I'm missing anything.


Regards,
Sriram.

On Tue, Aug 8, 2017 at 8:29 PM, Ken Gaillot  wrote:

> On Tue, 2017-08-08 at 17:40 +0530, Sriram wrote:
> > Hi Ulrich,
> >
> >
> > Please see inline.
> >
> > On Tue, Aug 8, 2017 at 2:01 PM, Ulrich Windl
> >  wrote:
> > Sriram  wrote on 08.08.2017 at
> > 09:30 in message
> >  > +dv...@mail.gmail.com>:
> > > Hi Ken & Jan,
> > >
> > > In the cluster we have, there is only one resource running.
> > It's an opt-in
> > > cluster with resource-stickiness set to INFINITY.
> > >
> > > Just to clarify my question, lets take a scenario where
> > there are four
> > > nodes N1, N2, N3, N4
> > > a. N1 comes up first, starts the cluster.
> >
> > The cluster will start once it has a quorum.
> >
> > > b. N1 Checks that there is no resource running, so it will
> > add the
> > > resource(R) with the some location constraint(lets say score
> > 100)
> > > c. So Resource(R) runs in N1 now.
> > > d. N2 comes up next, checks that resource(R) is already
> > running in N1, so
> > > it will update the location constraint(lets say score 200)
> > > e. N3 comes up next, checks that resource(R) is already
> > running in N1, so
> > > it will update the location constraint(lets say score 300)
> >
> > See my remark on quorum above.
> >
> > Yes you are right, I forgot to mention it.
> >
> >
> > > f.  N4 comes up next, checks that resource(R) is already
> > running in N1, so
> > > it will update the location constraint(lets say score 400)
> > > g. For the some reason, if N1 goes down, resource(R) shifts
> > to N4(as its
> > > score is higher than anyone).
> > >
> > > In this case is it possible to notify the nodes N2, N3 that
> > newly elected
> > > active node is N4 ?
> >
> > What type of notification, and what would the node do with it?
> > Any node in the cluster always has up to date configuration
> > information. So it knows the status of the other nodes also.
> >
> >
> > I agree that the node always has upto date configuration information,
> > but an application or a thread needs to poll for that information. Is
> > there any way, where the notifications are received through some
> > action function in RA. ?
>
> Ah, I misunderstood your situation, I thought you had a cloned resource.
>
> For that, the alerts feature (available in Pacemaker 1.1.15 and later)
> might be useful:
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-
> single/Pacemaker_Explained/index.html#idm139900098676896
>
>
> >
> >
> > Regards,
> > Sriram.
> >
> > >
> > > I went through clone notifications and master-slave; it looks
> >  

Re: [ClusterLabs] Antw: Re: Notification agent and Notification recipients

2017-08-08 Thread Sriram
Hi Ulrich,

Please see inline.

On Tue, Aug 8, 2017 at 2:01 PM, Ulrich Windl <
ulrich.wi...@rz.uni-regensburg.de> wrote:

> >>> Sriram  wrote on 08.08.2017 at 09:30 in
> message
> :
> > Hi Ken & Jan,
> >
> > In the cluster we have, there is only one resource running. It's an opt-in
> > cluster with resource-stickiness set to INFINITY.
> >
> > Just to clarify my question, lets take a scenario where there are four
> > nodes N1, N2, N3, N4
> > a. N1 comes up first, starts the cluster.
>
> The cluster will start once it has a quorum.
>
> > b. N1 Checks that there is no resource running, so it will add the
> > resource(R) with the some location constraint(lets say score 100)
> > c. So Resource(R) runs in N1 now.
> > d. N2 comes up next, checks that resource(R) is already running in N1, so
> > it will update the location constraint(lets say score 200)
> > e. N3 comes up next, checks that resource(R) is already running in N1, so
> > it will update the location constraint(lets say score 300)
>
> See my remark on quorum above.
>
> Yes, you are right; I forgot to mention it.

> f.  N4 comes up next, checks that resource(R) is already running in N1, so
> > it will update the location constraint(lets say score 400)
> > g. For the some reason, if N1 goes down, resource(R) shifts to N4(as its
> > score is higher than anyone).
> >
> > In this case is it possible to notify the nodes N2, N3 that newly elected
> > active node is N4 ?
>
> What type of notification, and what would the node do with it?
> Any node in the cluster always has up to date configuration information.
> So it knows the status of the other nodes also.
>

I agree that the node always has up-to-date configuration information, but
an application or a thread needs to poll for that information. Is there any
way the notifications can be received through some action function in the
RA?
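By polling I mean something like the following sketch: an application can
periodically ask Pacemaker where the resource is, e.g.

  crm_resource --resource R --locate

but that is pull-style; I am looking for a push-style callback into the RA.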

Regards,
Sriram.

>
> >
> > I went through clone notifications and master-slave; it looks like it either
> > requires identical resources(Anonymous) or Unique or Stateful resources
> to
> > be running
> > in all the nodes of the cluster, where as in our case there is only
> > resource running in the whole cluster.
>
> Maybe the main reason for not having notifications is that if a node fails
> hard, it won't be able to send out much status information to the other
> nodes.
>
> Regards,
> Ulrich
>
> >
> > Regards,
> > Sriram.
> >
> >
> >
> >
> > On Mon, Aug 7, 2017 at 11:28 AM, Sriram  wrote:
> >
> >>
> >> Thanks Ken, Jan. Will look into the clone notifications.
> >>
> >> Regards,
> >> Sriram.
> >>
> >> On Sat, Aug 5, 2017 at 1:25 AM, Ken Gaillot 
> wrote:
> >>
> >>> On Thu, 2017-08-03 at 12:31 +0530, Sriram wrote:
> >>> >
> >>> > Hi Team,
> >>> >
> >>> >
> >>> > We have a four node cluster (1 active : 3 standby) in our lab for a
> >>> > particular service. If the active node goes down, one of the three
> >>> > standby node  becomes active. Now there will be (1 active :  2
> >>> > standby : 1 offline).
> >>> >
> >>> >
> >>> > Is there any way where this newly elected node sends notification to
> >>> > the remaining 2 standby nodes about its new status ?
> >>>
> >>> Hi Sriram,
> >>>
> >>> This depends on how your service is configured in the cluster.
> >>>
> >>> If you have a clone or master/slave resource, then clone notifications
> >>> is probably what you want (not alerts, which is the path you were going
> >>> down -- alerts are designed to e.g. email a system administrator after
> >>> an important event).
> >>>
> >>> For details about clone notifications, see:
> >>>
> >>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-sing
> >>> le/Pacemaker_Explained/index.html#_clone_resource_agent_requirements
> >>>
> >>> The RA must support the "notify" action, which will be called when a
> >>> clone instance is started or stopped. See the similar section later for
> >>> master/slave resources for additional information. See the mysql or
> >>> pgsql resource agents for examples of notify implementations.
> >>>
> >>> > I was exploring "notification agent" and "notification recipient"
> >>> > features, but that d

Re: [ClusterLabs] Notification agent and Notification recipients

2017-08-08 Thread Sriram
Hi Ken & Jan,

In the cluster we have, there is only one resource running. It's an opt-in
cluster with resource-stickiness set to INFINITY.

Just to clarify my question, let's take a scenario where there are four
nodes: N1, N2, N3, N4.
a. N1 comes up first and starts the cluster.
b. N1 checks that there is no resource running, so it will add the
resource (R) with some location constraint (let's say score 100).
c. So resource (R) runs on N1 now.
d. N2 comes up next, checks that resource (R) is already running on N1, so
it will update the location constraint (let's say score 200).
e. N3 comes up next, checks that resource (R) is already running on N1, so
it will update the location constraint (let's say score 300).
f. N4 comes up next, checks that resource (R) is already running on N1, so
it will update the location constraint (let's say score 400).
g. If for some reason N1 goes down, resource (R) shifts to N4 (as its
score is higher than anyone else's).

In this case, is it possible to notify nodes N2 and N3 that the newly
elected active node is N4?
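For reference, the constraint updates in steps b-f are plain location
constraints; each joining node runs something like this sketch with an
increasing score:

  pcs constraint location R prefers N2=200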

I went through clone notifications and master/slave; it looks like they
require identical (anonymous), unique, or stateful resources to be running
on all the nodes of the cluster, whereas in our case there is only one
resource running in the whole cluster.
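For context, a clone RA that supports notifications implements a "notify"
action; a minimal sketch using the standard OCF notify environment variables
would look like:

  notify() {
      # phase ("pre" or "post") and operation ("start", "stop", ...)
      local type="$OCF_RESKEY_CRM_meta_notify_type"
      local op="$OCF_RESKEY_CRM_meta_notify_operation"
      # space-separated node names where instances were just started
      local started="$OCF_RESKEY_CRM_meta_notify_start_uname"
      logger -t myRA "notify: $type-$op, started on: $started"
      return 0  # OCF_SUCCESS
  }

But again, that path only applies when the resource is a clone.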

Regards,
Sriram.




On Mon, Aug 7, 2017 at 11:28 AM, Sriram  wrote:

>
> Thanks Ken, Jan. Will look into the clone notifications.
>
> Regards,
> Sriram.
>
> On Sat, Aug 5, 2017 at 1:25 AM, Ken Gaillot  wrote:
>
>> On Thu, 2017-08-03 at 12:31 +0530, Sriram wrote:
>> >
>> > Hi Team,
>> >
>> >
>> > We have a four node cluster (1 active : 3 standby) in our lab for a
>> > particular service. If the active node goes down, one of the three
>> > standby node  becomes active. Now there will be (1 active :  2
>> > standby : 1 offline).
>> >
>> >
>> > Is there any way where this newly elected node sends notification to
>> > the remaining 2 standby nodes about its new status ?
>>
>> Hi Sriram,
>>
>> This depends on how your service is configured in the cluster.
>>
>> If you have a clone or master/slave resource, then clone notifications
>> is probably what you want (not alerts, which is the path you were going
>> down -- alerts are designed to e.g. email a system administrator after
>> an important event).
>>
>> For details about clone notifications, see:
>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-sing
>> le/Pacemaker_Explained/index.html#_clone_resource_agent_requirements
>>
>> The RA must support the "notify" action, which will be called when a
>> clone instance is started or stopped. See the similar section later for
>> master/slave resources for additional information. See the mysql or
>> pgsql resource agents for examples of notify implementations.
>>
>> > I was exploring "notification agent" and "notification recipient"
>> > features, but that doesn't seem to work. /etc/sysconfig/notify.sh
>> > doesn't get invoked even in the newly elected active node.
>>
>> Yep, that's something different altogether -- it's only enabled on RHEL
>> systems, and solely for backward compatibility with an early
>> implementation of the alerts interface. The new alerts interface is more
>> flexible, but it's not designed to send information between cluster
>> nodes -- it's designed to send information to something external to the
>> cluster, such as a human, or an SNMP server, or a monitoring system.
>>
>>
>> > Cluster Properties:
>> >  cluster-infrastructure: corosync
>> >  dc-version: 1.1.17-e2e6cdce80
>> >  default-action-timeout: 240
>> >  have-watchdog: false
>> >  no-quorum-policy: ignore
>> >  notification-agent: /etc/sysconfig/notify.sh
>> >  notification-recipient: /var/log/notify.log
>> >  placement-strategy: balanced
>> >  stonith-enabled: false
>> >  symmetric-cluster: false
>> >
>> >
>> >
>> >
>> > I m using the following versions of pacemaker and corosync.
>> >
>> >
>> > /usr/sbin # ./pacemakerd --version
>> > Pacemaker 1.1.17
>> > Written by Andrew Beekhof
>> > /usr/sbin # ./corosync -v
>> > Corosync Cluster Engine, version '2.3.5'
>> > Copyright (c) 2006-2009 Red Hat, Inc.
>> >
>> >
>> > Can you please suggest if I m doing anything wrong or if there any
>> > other mechanisms to achieve this ?
>> >
>> >
>> > Regards,
>> &

Re: [ClusterLabs] Notification agent and Notification recipients

2017-08-06 Thread Sriram
Thanks Ken, Jan. Will look into the clone notifications.

Regards,
Sriram.

On Sat, Aug 5, 2017 at 1:25 AM, Ken Gaillot  wrote:

> On Thu, 2017-08-03 at 12:31 +0530, Sriram wrote:
> >
> > Hi Team,
> >
> >
> > We have a four node cluster (1 active : 3 standby) in our lab for a
> > particular service. If the active node goes down, one of the three
> > standby node  becomes active. Now there will be (1 active :  2
> > standby : 1 offline).
> >
> >
> > Is there any way where this newly elected node sends notification to
> > the remaining 2 standby nodes about its new status ?
>
> Hi Sriram,
>
> This depends on how your service is configured in the cluster.
>
> If you have a clone or master/slave resource, then clone notifications
> is probably what you want (not alerts, which is the path you were going
> down -- alerts are designed to e.g. email a system administrator after
> an important event).
>
> For details about clone notifications, see:
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-
> single/Pacemaker_Explained/index.html#_clone_resource_agent_requirements
>
> The RA must support the "notify" action, which will be called when a
> clone instance is started or stopped. See the similar section later for
> master/slave resources for additional information. See the mysql or
> pgsql resource agents for examples of notify implementations.
>
> > I was exploring "notification agent" and "notification recipient"
> > features, but that doesn't seem to work. /etc/sysconfig/notify.sh
> > doesn't get invoked even in the newly elected active node.
>
> Yep, that's something different altogether -- it's only enabled on RHEL
> systems, and solely for backward compatibility with an early
> implementation of the alerts interface. The new alerts interface is more
> flexible, but it's not designed to send information between cluster
> nodes -- it's designed to send information to something external to the
> cluster, such as a human, or an SNMP server, or a monitoring system.
>
>
> > Cluster Properties:
> >  cluster-infrastructure: corosync
> >  dc-version: 1.1.17-e2e6cdce80
> >  default-action-timeout: 240
> >  have-watchdog: false
> >  no-quorum-policy: ignore
> >  notification-agent: /etc/sysconfig/notify.sh
> >  notification-recipient: /var/log/notify.log
> >  placement-strategy: balanced
> >  stonith-enabled: false
> >  symmetric-cluster: false
> >
> >
> >
> >
> > I m using the following versions of pacemaker and corosync.
> >
> >
> > /usr/sbin # ./pacemakerd --version
> > Pacemaker 1.1.17
> > Written by Andrew Beekhof
> > /usr/sbin # ./corosync -v
> > Corosync Cluster Engine, version '2.3.5'
> > Copyright (c) 2006-2009 Red Hat, Inc.
> >
> >
> > Can you please suggest if I m doing anything wrong or if there any
> > other mechanisms to achieve this ?
> >
> >
> > Regards,
> > Sriram.
> >
> >
> > ___
> > Users mailing list: Users@clusterlabs.org
> > http://lists.clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>
> --
> Ken Gaillot 
>
>
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Notification agent and Notification recipients

2017-08-03 Thread Sriram
Hi,

Any idea what could have gone wrong, or if there are other ways to achieve
the same?

Regards,
Sriram.

-- Forwarded message --
From: Sriram 
Date: Thu, Aug 3, 2017 at 12:31 PM
Subject: Notification agent and Notification recipients
To: Cluster Labs - All topics related to open-source clustering welcomed <
users@clusterlabs.org>



Hi Team,

We have a four-node cluster (1 active : 3 standby) in our lab for a
particular service. If the active node goes down, one of the three standby
nodes becomes active. Now there will be (1 active : 2 standby : 1 offline).

Is there any way this newly elected node can send a notification to the
remaining 2 standby nodes about its new status?

I was exploring the "notification agent" and "notification recipient"
features, but that doesn't seem to work. /etc/sysconfig/notify.sh doesn't
get invoked even on the newly elected active node.

Cluster Properties:
 cluster-infrastructure: corosync
 dc-version: 1.1.17-e2e6cdce80
 default-action-timeout: 240
 have-watchdog: false
 no-quorum-policy: ignore
 notification-agent: /etc/sysconfig/notify.sh
 notification-recipient: /var/log/notify.log
 placement-strategy: balanced
 stonith-enabled: false
 symmetric-cluster: false


I'm using the following versions of pacemaker and corosync.

/usr/sbin # ./pacemakerd --version
Pacemaker 1.1.17
Written by Andrew Beekhof
/usr/sbin # ./corosync -v
Corosync Cluster Engine, version '2.3.5'
Copyright (c) 2006-2009 Red Hat, Inc.

Can you please suggest if I'm doing anything wrong, or if there are any
other mechanisms to achieve this?
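One sanity check that can be scripted (a sketch; when invoked by the cluster
the agent runs as the cluster user, and it receives CRM_notify_* environment
variables that a manual run does not set):

  ls -l /etc/sysconfig/notify.sh
  sudo -u hacluster /etc/sysconfig/notify.sh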

Regards,
Sriram.
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Notification agent and Notification recipients

2017-08-03 Thread Sriram
Hi Team,

We have a four-node cluster (1 active : 3 standby) in our lab for a
particular service. If the active node goes down, one of the three standby
nodes becomes active. Now there will be (1 active : 2 standby : 1 offline).

Is there any way this newly elected node can send a notification to the
remaining 2 standby nodes about its new status?

I was exploring the "notification agent" and "notification recipient"
features, but that doesn't seem to work. /etc/sysconfig/notify.sh doesn't
get invoked even on the newly elected active node.

Cluster Properties:
 cluster-infrastructure: corosync
 dc-version: 1.1.17-e2e6cdce80
 default-action-timeout: 240
 have-watchdog: false
 no-quorum-policy: ignore
 notification-agent: /etc/sysconfig/notify.sh
 notification-recipient: /var/log/notify.log
 placement-strategy: balanced
 stonith-enabled: false
 symmetric-cluster: false


I'm using the following versions of pacemaker and corosync.

/usr/sbin # ./pacemakerd --version
Pacemaker 1.1.17
Written by Andrew Beekhof
/usr/sbin # ./corosync -v
Corosync Cluster Engine, version '2.3.5'
Copyright (c) 2006-2009 Red Hat, Inc.

Can you please suggest if I'm doing anything wrong, or if there are any
other mechanisms to achieve this?

Regards,
Sriram.
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] [ClusterLab] : Corosync not initializing successfully

2016-04-29 Thread Sriram
Corrected the subject.

We went ahead and captured corosync debug logs for our ppc board.
After log analysis and comparison with the successful logs (from an x86
machine), we didn't find "[ MAIN  ] Completed service synchronization,
ready to provide service." in the ppc logs.
So it looks like corosync is not in a position to accept connections from
Pacemaker.
I even tried with the new corosync.conf, with no success.

Any hints on this issue would be really helpful.
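The comparison itself was essentially this sketch, run against the two
attached captures:

  grep -F 'Completed service synchronization, ready to provide service' \
      ppc_notworking.log x86_working.log

The line shows up only in the x86 log.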

Attaching ppc_notworking.log, x86_working.log, corosync.conf.

Regards,
Sriram



On Fri, Apr 29, 2016 at 2:44 PM, Sriram  wrote:

> Hi,
>
> I went ahead and made some changes in the file system (I brought in
> /etc/init.d/corosync, /etc/init.d/pacemaker, and /etc/sysconfig). After
> that I was able to run "pcs cluster start".
> But it failed with the following error
>  # pcs cluster start
> Starting Cluster...
> Starting Pacemaker Cluster Manager[FAILED]
> Error: unable to start pacemaker
>
> And in the /var/log/pacemaker.log, I saw these errors
> pacemakerd: info: mcp_read_config:  cmap connection setup failed:
> CS_ERR_TRY_AGAIN.  Retrying in 4s
> Apr 29 08:53:47 [15863] node_cu pacemakerd: info: mcp_read_config:
> cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 5s
> Apr 29 08:53:52 [15863] node_cu pacemakerd:  warning: mcp_read_config:
> Could not connect to Cluster Configuration Database API, error 6
> Apr 29 08:53:52 [15863] node_cu pacemakerd:   notice: main: Could not
> obtain corosync config data, exiting
> Apr 29 08:53:52 [15863] node_cu pacemakerd: info: crm_xml_cleanup:
> Cleaning up memory from libxml2
>
>
> And in the /var/log/Debuglog, I saw these errors coming from corosync
> 20160429 085347.487050 airv_cu daemon.warn corosync[12857]:   [QB]
> Denied connection, is not ready (12857-15863-14)
> 20160429 085347.487067 airv_cu daemon.info corosync[12857]:   [QB]
> Denied connection, is not ready (12857-15863-14)
>
>
> I browsed the code of libqb to find that it is failing in
>
> https://github.com/ClusterLabs/libqb/blob/master/lib/ipc_setup.c
>
> Line 600 :
> handle_new_connection function
>
> Line 637:
> if (auth_result == 0 && c->service->serv_fns.connection_accept) {
> res = c->service->serv_fns.connection_accept(c,
>  c->euid, c->egid);
> }
> if (res != 0) {
> goto send_response;
> }
>
> Any hints on this issue would be really helpful for me to go ahead.
> Please let me know if any logs are required,
>
> Regards,
> Sriram
>
> On Thu, Apr 28, 2016 at 2:42 PM, Sriram  wrote:
>
>> Thanks Ken and Emmanuel.
>> Its a big endian machine. I will try with running "pcs cluster setup" and
>> "pcs cluster start"
>> Inside cluster.py, "service pacemaker start" and "service corosync start"
>> are executed to bring up pacemaker and corosync.
>> Those service scripts and the infrastructure needed to bring up the
>> processes in the above said manner doesn't exist in my board.
>> As it is a embedded board with the limited memory, full fledged linux is
>> not installed.
>> Just curious to know, what could be reason the pacemaker throws that
>> error.
>>
>>
>>
>> *"cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 1s"*
>> Thanks for response.
>>
>> Regards,
>> Sriram.
>>
>> On Thu, Apr 28, 2016 at 8:55 AM, Ken Gaillot  wrote:
>>
>>> On 04/27/2016 11:25 AM, emmanuel segura wrote:
>>> > you need to use pcs to do everything, pcs cluster setup and pcs
>>> > cluster start, try to use the redhat docs for more information.
>>>
>>> Agreed -- pcs cluster setup will create a proper corosync.conf for you.
>>> Your corosync.conf below uses corosync 1 syntax, and there were
>>> significant changes in corosync 2. In particular, you don't need the
>>> file created in step 4, because pacemaker is no longer launched via a
>>> corosync plugin.
>>>
>>> > 2016-04-27 17:28 GMT+02:00 Sriram :
>>> >> Dear All,
>>> >>
>>> >> I m trying to use pacemaker and corosync for the clustering
>>> requirement that
>>> >> came up recently.
>>> >> We have cross compiled corosync, pacemaker and pcs(python) for ppc
>>> >> environment (Target board where pacemaker and corosync are supposed
>>> to run)
>>> >> I m having trouble bringing up pacemaker in that environment, though
>>> I could
>>> >> successfully bring up corosync.
>>> &

Re: [ClusterLabs] [ClusterLab] : Unable to bring up pacemaker

2016-04-29 Thread Sriram
Hi,

I went ahead and made some changes in the file system (I brought in
/etc/init.d/corosync, /etc/init.d/pacemaker, and /etc/sysconfig). After
that I was able to run "pcs cluster start".
But it failed with the following error:
 # pcs cluster start
Starting Cluster...
Starting Pacemaker Cluster Manager[FAILED]
Error: unable to start pacemaker

And in the /var/log/pacemaker.log, I saw these errors
pacemakerd: info: mcp_read_config:  cmap connection setup failed:
CS_ERR_TRY_AGAIN.  Retrying in 4s
Apr 29 08:53:47 [15863] node_cu pacemakerd: info: mcp_read_config:
cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 5s
Apr 29 08:53:52 [15863] node_cu pacemakerd:  warning: mcp_read_config:
Could not connect to Cluster Configuration Database API, error 6
Apr 29 08:53:52 [15863] node_cu pacemakerd:   notice: main: Could not
obtain corosync config data, exiting
Apr 29 08:53:52 [15863] node_cu pacemakerd: info: crm_xml_cleanup:
Cleaning up memory from libxml2


And in the /var/log/Debuglog, I saw these errors coming from corosync
20160429 085347.487050 airv_cu daemon.warn corosync[12857]:   [QB]
Denied connection, is not ready (12857-15863-14)
20160429 085347.487067 airv_cu daemon.info corosync[12857]:   [QB]
Denied connection, is not ready (12857-15863-14)


I browsed the code of libqb and found that it is failing in

https://github.com/ClusterLabs/libqb/blob/master/lib/ipc_setup.c

Line 600: handle_new_connection function

Line 637:
if (auth_result == 0 && c->service->serv_fns.connection_accept) {
        res = c->service->serv_fns.connection_accept(c,
                                                     c->euid, c->egid);
}
if (res != 0) {
        goto send_response;
}

Any hints on this issue would be really helpful for me to go ahead.
Please let me know if any logs are required.
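A quick way to reproduce the denied IPC connection outside of pacemaker is a
sketch like this; if corosync is not ready, it should fail in the same way
the cmap connection from pacemakerd does:

  corosync-cmapctl -g totem.cluster_name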

Regards,
Sriram

On Thu, Apr 28, 2016 at 2:42 PM, Sriram  wrote:

> Thanks Ken and Emmanuel.
> Its a big endian machine. I will try with running "pcs cluster setup" and
> "pcs cluster start"
> Inside cluster.py, "service pacemaker start" and "service corosync start"
> are executed to bring up pacemaker and corosync.
> Those service scripts and the infrastructure needed to bring up the
> processes in the above said manner doesn't exist in my board.
> As it is a embedded board with the limited memory, full fledged linux is
> not installed.
> Just curious to know, what could be reason the pacemaker throws that error.
>
>
>
> *"cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 1s"*
> Thanks for response.
>
> Regards,
> Sriram.
>
> On Thu, Apr 28, 2016 at 8:55 AM, Ken Gaillot  wrote:
>
>> On 04/27/2016 11:25 AM, emmanuel segura wrote:
>> > you need to use pcs to do everything, pcs cluster setup and pcs
>> > cluster start, try to use the redhat docs for more information.
>>
>> Agreed -- pcs cluster setup will create a proper corosync.conf for you.
>> Your corosync.conf below uses corosync 1 syntax, and there were
>> significant changes in corosync 2. In particular, you don't need the
>> file created in step 4, because pacemaker is no longer launched via a
>> corosync plugin.
>>
>> > 2016-04-27 17:28 GMT+02:00 Sriram :
>> >> Dear All,
>> >>
>> >> I m trying to use pacemaker and corosync for the clustering
>> requirement that
>> >> came up recently.
>> >> We have cross compiled corosync, pacemaker and pcs(python) for ppc
>> >> environment (Target board where pacemaker and corosync are supposed to
>> run)
>> >> I m having trouble bringing up pacemaker in that environment, though I
>> could
>> >> successfully bring up corosync.
>> >> Any help is welcome.
>> >>
>> >> I m using these versions of pacemaker and corosync
>> >> [root@node_cu pacemaker]# corosync -v
>> >> Corosync Cluster Engine, version '2.3.5'
>> >> Copyright (c) 2006-2009 Red Hat, Inc.
>> >> [root@node_cu pacemaker]# pacemakerd -$
>> >> Pacemaker 1.1.14
>> >> Written by Andrew Beekhof
>> >>
>> >> For running corosync, I did the following.
>> >> 1. Created the following directories,
>> >> /var/lib/pacemaker
>> >> /var/lib/corosync
>> >> /var/lib/pacemaker
>> >> /var/lib/pacemaker/cores
>> >> /var/lib/pacemaker/pengine
>> >> /var/lib/pacemaker/blackbox
>> >> /var/lib/pacemaker/cib
>> >>
>> >>
>> >> 2. Created a file called corosync.conf under /etc/corosync folder with
>> the
>> >> follo

Re: [ClusterLabs] [ClusterLab] : Unable to bring up pacemaker

2016-04-28 Thread Sriram
Thanks Ken and Emmanuel.
It's a big-endian machine. I will try running "pcs cluster setup" and
"pcs cluster start".
Inside cluster.py, "service pacemaker start" and "service corosync start"
are executed to bring up pacemaker and corosync.
Those service scripts, and the infrastructure needed to bring up the
processes in that manner, don't exist on my board.
As it is an embedded board with limited memory, a full-fledged Linux is
not installed.
Just curious: what could be the reason pacemaker throws that error?



*"cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 1s"*
Thanks for response.

Regards,
Sriram.

On Thu, Apr 28, 2016 at 8:55 AM, Ken Gaillot  wrote:

> On 04/27/2016 11:25 AM, emmanuel segura wrote:
> > you need to use pcs to do everything, pcs cluster setup and pcs
> > cluster start, try to use the redhat docs for more information.
>
> Agreed -- pcs cluster setup will create a proper corosync.conf for you.
> Your corosync.conf below uses corosync 1 syntax, and there were
> significant changes in corosync 2. In particular, you don't need the
> file created in step 4, because pacemaker is no longer launched via a
> corosync plugin.
>
> > 2016-04-27 17:28 GMT+02:00 Sriram :
> >> Dear All,
> >>
> >> I m trying to use pacemaker and corosync for the clustering requirement
> that
> >> came up recently.
> >> We have cross compiled corosync, pacemaker and pcs(python) for ppc
> >> environment (Target board where pacemaker and corosync are supposed to
> run)
> >> I m having trouble bringing up pacemaker in that environment, though I
> could
> >> successfully bring up corosync.
> >> Any help is welcome.
> >>
> >> I m using these versions of pacemaker and corosync
> >> [root@node_cu pacemaker]# corosync -v
> >> Corosync Cluster Engine, version '2.3.5'
> >> Copyright (c) 2006-2009 Red Hat, Inc.
> >> [root@node_cu pacemaker]# pacemakerd -$
> >> Pacemaker 1.1.14
> >> Written by Andrew Beekhof
> >>
> >> For running corosync, I did the following.
> >> 1. Created the following directories,
> >> /var/lib/pacemaker
> >> /var/lib/corosync
> >> /var/lib/pacemaker
> >> /var/lib/pacemaker/cores
> >> /var/lib/pacemaker/pengine
> >> /var/lib/pacemaker/blackbox
> >> /var/lib/pacemaker/cib
> >>
> >>
> >> 2. Created a file called corosync.conf under /etc/corosync folder with
> the
> >> following contents
> >>
> >> totem {
> >>
> >> version: 2
> >> token:  5000
> >> token_retransmits_before_loss_const: 20
> >> join:   1000
> >> consensus:  7500
> >> vsftype:none
> >> max_messages:   20
> >> secauth:off
> >> cluster_name:   mycluster
> >> transport:  udpu
> >> threads:0
> >> clear_node_high_bit: yes
> >>
> >> interface {
> >> ringnumber: 0
> >> # The following three values need to be set based on
> your
> >> environment
> >> bindnetaddr: 10.x.x.x
> >> mcastaddr: 226.94.1.1
> >> mcastport: 5405
> >> }
> >>  }
> >>
> >>  logging {
> >> fileline: off
> >> to_syslog: yes
> >> to_stderr: no
> >> to_syslog: yes
> >> logfile: /var/log/corosync.log
> >> syslog_facility: daemon
> >> debug: on
> >> timestamp: on
> >>  }
> >>
> >>  amf {
> >> mode: disabled
> >>  }
> >>
> >>  quorum {
> >> provider: corosync_votequorum
> >>  }
> >>
> >> nodelist {
> >>   node {
> >> ring0_addr: node_cu
> >> nodeid: 1
> >>}
> >> }
> >>
> >> 3.  Created authkey under /etc/corosync
> >>
> >> 4.  Created a file called pcmk under /etc/corosync/service.d and
> contents as
> >> below,
> >>   cat pcmk
> >>   service {
> >>  # Load the Pacemaker Cluster Resource Manager
> >>  name: pacemaker
> >>  ver:  1
> >>   }
> >>
> >> 5. Added the node name "node_c

[ClusterLabs] [ClusterLab] : Unable to bring up pacemaker

2016-04-27 Thread Sriram
Dear All,

I'm trying to use pacemaker and corosync for a clustering requirement
that came up recently.
We have cross-compiled corosync, pacemaker, and pcs (Python) for a ppc
environment (the target board where pacemaker and corosync are supposed to
run).
I'm having trouble bringing up pacemaker in that environment, though I
could successfully bring up corosync.
Any help is welcome.

I'm using these versions of pacemaker and corosync:
[root@node_cu pacemaker]# corosync -v
Corosync Cluster Engine, version '2.3.5'
Copyright (c) 2006-2009 Red Hat, Inc.
[root@node_cu pacemaker]# pacemakerd -$
Pacemaker 1.1.14
Written by Andrew Beekhof

For running corosync, I did the following.
1. Created the following directories,
/var/lib/pacemaker
/var/lib/corosync
/var/lib/pacemaker
/var/lib/pacemaker/cores
/var/lib/pacemaker/pengine
/var/lib/pacemaker/blackbox
/var/lib/pacemaker/cib


2. Created a file called corosync.conf under the /etc/corosync folder with
the following contents:

totem {

version: 2
token:  5000
token_retransmits_before_loss_const: 20
join:   1000
consensus:  7500
vsftype:none
max_messages:   20
secauth:off
cluster_name:   mycluster
transport:  udpu
threads:0
clear_node_high_bit: yes

interface {
ringnumber: 0
# The following three values need to be set based on your
environment
bindnetaddr: 10.x.x.x
mcastaddr: 226.94.1.1
mcastport: 5405
}
 }

 logging {
fileline: off
to_syslog: yes
to_stderr: no
to_syslog: yes
logfile: /var/log/corosync.log
syslog_facility: daemon
debug: on
timestamp: on
 }

 amf {
mode: disabled
 }

 quorum {
provider: corosync_votequorum
 }

nodelist {
  node {
ring0_addr: node_cu
nodeid: 1
   }
}

3.  Created authkey under /etc/corosync

4.  Created a file called pcmk under /etc/corosync/service.d with the
following contents:
  cat pcmk
  service {
 # Load the Pacemaker Cluster Resource Manager
 name: pacemaker
 ver:  1
  }

5. Added the node name "node_cu" to /etc/hosts with a 10.X.X.X IP

6. ./corosync -f -p & --> this step started corosync

[root@node_cu pacemaker]# netstat -alpn | grep -i coros
udp0  0 10.X.X.X:61841 0.0.0.0:*
9133/corosync
udp0  0 10.X.X.X:5405  0.0.0.0:*
9133/corosync
unix  2  [ ACC ] STREAM LISTENING 14
9133/corosync   @quorum
unix  2  [ ACC ] STREAM LISTENING 148884
9133/corosync   @cmap
unix  2  [ ACC ] STREAM LISTENING 148887
9133/corosync   @votequorum
unix  2  [ ACC ] STREAM LISTENING 148885
9133/corosync   @cfg
unix  2  [ ACC ] STREAM LISTENING 148886
9133/corosync   @cpg
unix  2  [ ] DGRAM148840 9133/corosync

7. ./pacemakerd -f & gives the following error and exits.
[root@node_cu pacemaker]# pacemakerd -f
cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 1s
cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 2s
cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 3s
cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 4s
cmap connection setup failed: CS_ERR_TRY_AGAIN.  Retrying in 5s
Could not connect to Cluster Configuration Database API, error 6

Can you please point out what is missing in these steps?

Before trying these steps, I tried running "pcs cluster start", but that
command fails because the "service" script is not found, as the root
filesystem contains neither /etc/init.d/ nor /sbin/service.

So the plan is to bring up corosync and pacemaker manually, and later do
the cluster configuration using "pcs" commands.
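Once that works, the pcs side would be roughly this sketch (exact flags
depend on the pcs version):

  pcs cluster setup --name mycluster node_cu
  pcs cluster start --all

which should generate a proper /etc/corosync/corosync.conf and start
corosync and pacemaker.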

Regards,
Sriram
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org