[ClusterLabs] Unable to perform resource failover.

2017-11-07 Thread Garima
Hi All,

I am new to Pacemaker and Corosync.

I have created a simple environment with 2 nodes (Active/Passive) and 2
resources.
Resources:
One resource is the cluster VIP.
The other resource is the Apache httpd service.

[root@node1 ~]# pcs resource show Httpd
Resource: Httpd (class=ocf provider=heartbeat type=apache)
  Attributes: configfile=/etc/httpd/conf/httpd.conf
  Operations: monitor interval=30s (Httpd-monitor-interval-30s)
  start interval=0s timeout=40s (Httpd-start-interval-0s)
  stop interval=0s timeout=60s (Httpd-stop-interval-0s)
[root@node1 ~]# pcs resource show Cluster_VIP
Resource: Cluster_VIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=32 ip=10.0.4.99
  Operations: monitor interval=20s (Cluster_VIP-monitor-interval-20s)
  start interval=0s timeout=20s (Cluster_VIP-start-interval-0s)
  stop interval=0s timeout=20s (Cluster_VIP-stop-interval-0s)

[root@node1 ~]# pcs status
Cluster name: Cluster
Stack: corosync
Current DC: node2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Tue Nov  7 15:09:40 2017
Last change: Tue Nov  7 15:03:22 2017 by root via cibadmin on node1
2 nodes configured
2 resources configured
Online: [ node1 node2 ]
Full list of resources:
Cluster_VIP    (ocf::heartbeat:IPaddr2):   Started node1
Httpd  (ocf::heartbeat:apache):    Started node1
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

To find and kill the process ID (PID) of httpd, I used the following command:

* ps -aef | grep httpd

[root@node1 ~]# ps -aef | grep httpd
root      4392     1  0 15:03 ?    00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    4393  4392  0 15:03 ?    00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    4394  4392  0 15:03 ?    00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    4395  4392  0 15:03 ?    00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    4396  4392  0 15:03 ?    00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid
apache    4397  4392  0 15:03 ?    00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run//httpd.pid

[root@node1 ~]# kill -9 4392

I am trying to trigger a resource failover by killing the httpd PID.
Observation:
Resource failover does not happen after killing the PID. The status of the
Httpd resource remains "Started" on node1.
We don't want to use the resource move ("pcs resource move Httpd") or resource
disable ("pcs resource disable Httpd") commands for this.

Query:
What is wrong with our approach?
How can we achieve a resource failover?

Later I will use this environment for testing migration-threshold.
Any suggestions regarding this are also welcome.

TIA

Regards,
Garima

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Unable to perform resource failover.

2017-11-07 Thread Alberto Mijares
> I am trying to do resource failover by killing pid of httpd.
>
> Observation:
>
> I observed that resource failover is not happening after killing the pid.
> Status of resource(Httpd) remains started on node1.
>
> We don't want to use resource move "pcs resource move Httpd" and resource
> disable "pcs resource disable httpd" command for this.
>


There are many things to check. First of all, check if the service is
being restarted by systemd or another process manager.

Regards,


Alberto Mijares



Re: [ClusterLabs] Unable to perform resource failover.

2017-11-07 Thread Garima
Hi,


>> There are many things to check. First of all, check if the service is being 
>> restarted by systemd or another process manager.



We restarted httpd and the cluster services with systemctl, using the commands
below, and also restarted the cluster nodes:

systemctl restart httpd.service
systemctl restart pacemaker.service
systemctl restart corosync.service
systemctl restart pcsd.service

Does this have any impact on the cluster?



Regards,

Garima





Re: [ClusterLabs] Unable to perform resource failover.

2017-11-07 Thread Alberto Mijares
> We restarted the systemd and other process by using command mentioned below
> and also restarted the cluster nodes:
>
> systemctl restart httpd.service
> systemctl restart pacemaker.service
> systemctl restart corosync.service
> systemctl restart pcsd.service
>
> Does this impact in cluster?
>


You should run

systemctl disable httpd.service && systemctl stop httpd.service

on all nodes of the cluster where Apache is installed, so that systemd no
longer starts or manages httpd. Then start the resource with any cluster
utility, for example

pcs resource enable Httpd

If no other specific configuration has been done, that should start
Apache. Then test killing it again.

Regards,


Alberto Mijares



Re: [ClusterLabs] Unable to perform resource failover.

2017-11-07 Thread Garima
Hi,

>> systemctl disable httpd.service && systemctl stop httpd.service in all nodes 
>> of the cluster where Apache is installed

I executed both commands on the nodes. The output is given below.

[root@node2 ~]# systemctl disable httpd.service
[root@node2 ~]# systemctl stop httpd.service
[root@node2 ~]# systemctl status httpd.service
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:httpd(8)
           man:apachectl(8)

Nov 07 16:59:17 node1 systemd[1]: Starting The Apache HTTP Server...
Nov 07 16:59:17 node1 httpd[15462]: AH00558: httpd: Could not reliably determine the server's fully qualified domain na...message
Nov 07 16:59:17 node1 systemd[1]: Started The Apache HTTP Server.
Nov 07 16:59:37 node1 systemd[1]: Stopping The Apache HTTP Server...
Nov 07 16:59:38 node1 systemd[1]: Stopped The Apache HTTP Server.
Nov 07 17:01:10 node1 systemd[1]: Starting The Apache HTTP Server...
Nov 07 17:01:10 node1 httpd[15720]: AH00558: httpd: Could not reliably determine the server's fully qualified domain na...message
Nov 07 17:01:10 node1 systemd[1]: Started The Apache HTTP Server.
Nov 07 17:02:14 node1 systemd[1]: Stopping The Apache HTTP Server...
Nov 07 17:02:15 node1 systemd[1]: Stopped The Apache HTTP Server.

[root@node2 ~]# pcs resource enable Httpd
[root@node2 ~]# pcs status
Cluster name: Cluster
Stack: corosync
Current DC: node2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Tue Nov  7 17:06:58 2017
Last change: Tue Nov  7 16:02:05 2017 by root via crm_resource on node1

2 nodes configured
2 resources configured

Online: [ node1 node2 ]

Full list of resources:

 Cluster_VIP    (ocf::heartbeat:IPaddr2):   Started node2
 Httpd  (ocf::heartbeat:apache):    Stopped

Failed Actions:
* Httpd_monitor_3 on node1 'not running' (7): call=21, status=complete, exitreason='none',
    last-rc-change='Tue Nov  7 15:26:27 2017', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

There is no change in resource status.

TIA 
Regards,
Garima



Re: [ClusterLabs] Unable to perform resource failover.

2017-11-07 Thread Ken Gaillot
On Tue, 2017-11-07 at 10:30 +, Garima wrote:
> I am trying to do resource failover by killing pid of httpd.
> Observation:
> I observed that resource failover is not happening after killing the
> pid. Status of resource(Httpd) remains started on node1.
> We don't want to use resource move "pcs resource move Httpd" and
> resource disable "pcs resource disable httpd" command for this.
>  
> Query:
> What is the issue in our approach ?

Pacemaker's default recovery behavior for service failures is not
failover, but restart. Chances are, pacemaker restarted httpd in the
above situation, and the outage was short enough that you didn't notice
it. You could check the pid of httpd afterward to see if it's the same
or a new one.
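
A rough sketch of that check, for illustration only: record the main httpd PID before the kill and compare it with the PID seen afterwards. The function names `classify_recovery` and `current_httpd_pid` are made up for this example, and the default PidFile path is taken from the `ps` output quoted in this thread, so adjust it if your setup differs.

```shell
#!/bin/sh
# Classify what happened to httpd, given the main PID recorded before the
# kill and the main PID seen afterwards (empty means nothing is running).
classify_recovery() {
    before=$1
    after=$2
    if [ -z "$after" ]; then
        echo "not running"
    elif [ "$before" = "$after" ]; then
        echo "same process"
    else
        echo "restarted"
    fi
}

# Print the PID stored in the PidFile, but only if that process is alive.
# After "kill -9" the file still holds the old PID, so a stale entry is
# treated as "nothing running" and prints nothing.
# The default path is an assumption based on the ps output in this thread.
current_httpd_pid() {
    pidfile=${1:-/var/run/httpd.pid}
    pid=$(cat "$pidfile" 2>/dev/null) || return 0
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "$pid"
    fi
}

# Example session on the active node, run by hand as root:
#   before=$(current_httpd_pid)
#   kill -9 "$before"
#   sleep 60        # longer than the 30s monitor interval
#   classify_recovery "$before" "$(current_httpd_pid)"
```

If this prints "restarted", Pacemaker most likely recovered the resource in place. On pcs versions like the one in this thread, `pcs resource failcount show Httpd` should then also report a failure for that node.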

As discussed elsewhere in this thread, you also want to make sure that
your operating system is not managing the httpd process (via systemd,
upstart, lsb init, etc.).
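
For systemd, that check could look roughly like the sketch below. The helper name `os_manages_unit` is hypothetical; it only interprets the strings printed by `systemctl is-enabled` and `systemctl is-active`. The ocf:heartbeat:apache agent starts httpd directly rather than through the systemd unit, so on a correctly configured node the unit is expected to be neither enabled nor active.

```shell
#!/bin/sh
# Decide whether systemd is also managing a service, based on the output of
# "systemctl is-enabled UNIT" (first argument) and
# "systemctl is-active UNIT" (second argument).
os_manages_unit() {
    enabled=$1
    active=$2
    if [ "$enabled" = "enabled" ] || [ "$active" = "active" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

# Example, run by hand on every cluster node:
#   os_manages_unit "$(systemctl is-enabled httpd.service)" \
#                   "$(systemctl is-active httpd.service)"
```

A "yes" on any node means systemd may start or restart httpd behind Pacemaker's back, which makes the cluster's monitor results unreliable.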

> How we can achieve a resources failover?

migration-threshold=1
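
Since migration-threshold is a resource meta attribute, with pcs it could be set along the lines of the sketch below. The helper name `failover_commands` is hypothetical, and it only prints the pcs commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Print the pcs commands that set migration-threshold on a resource and then
# clear its recorded failures. Nothing is executed here.
#   $1  resource ID (e.g. Httpd)
#   $2  threshold, defaults to 1 (move away after the first failure)
failover_commands() {
    resource=$1
    threshold=${2:-1}
    echo "pcs resource meta $resource migration-threshold=$threshold"
    echo "pcs resource cleanup $resource"
}

# Example, run by hand as root on one cluster node:
#   failover_commands Httpd        # review the output first
#   failover_commands Httpd | sh   # then apply it
```

The cleanup step clears failures already recorded for the resource, which would otherwise count toward the threshold. Note that once a node reaches the threshold, the resource stays away from it until the fail count is cleared again, either with another cleanup or through a failure-timeout if one is configured.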

>  
> Further I will use this environment for testing the
> migration-threshold.
> Any suggestions regarding this also welcome.
>  
> TIA
>  
> Regards,
> Garima
-- 
Ken Gaillot 
