Re: [Linux-cluster] Normal startup vs startup due to failover on cluster node - can they be distinguished?

2012-11-27 Thread Parvez Shaikh
send the script which you have > > > > On 23 November 2012 10:55, Parvez Shaikh > wrote: > > Hi experts, > > > > I am using Red Hat Cluster available on RHEL 5.5. And it doesn't have any > > inbuilt mechanism to generate SNMP traps in failures of res

[Linux-cluster] Normal startup vs startup due to failover on cluster node - can they be distinguished?

2012-11-22 Thread Parvez Shaikh
Hi experts, I am using Red Hat Cluster available on RHEL 5.5. And it doesn't have any inbuilt mechanism to generate SNMP traps in failures of resources or failover of services from one node to another. I have a script agent, which starts, stops and checks status of my application. Is it possible

Re: [Linux-cluster] Not restarting "max_restart" times before relocating failed service

2012-10-31 Thread Parvez Shaikh
Hi, I am using recovery=restart as evident from earlier attached cluster.conf Thanks, Parvez On Wed, Oct 31, 2012 at 2:53 PM, emmanuel segura wrote: > Hello > > Maybe you missing recovery="restart" in your services > > 2012/10/31 Parvez Shaikh > >> Hi Digi

Re: [Linux-cluster] Not restarting "max_restart" times before relocating failed service

2012-10-30 Thread Parvez Shaikh
always come back to > this later after this issue is resolved, if you wish. > > On 10/30/2012 09:20 PM, Parvez Shaikh wrote: >

Re: [Linux-cluster] Not restarting "max_restart" times before relocating failed service

2012-10-30 Thread Parvez Shaikh
, Oct 30, 2012 at 9:25 PM, Digimer wrote: > On 10/30/2012 01:54 AM, Parvez Shaikh wrote: > > Hi experts, > > > > I have defined a service as follows in cluster.conf - > > > > > max_restar

[Linux-cluster] Monitoring Frequency - can it be changed?

2012-10-30 Thread Parvez Shaikh
Hi experts, Can we change the frequency at which the cluster monitors resources? I observed a monitoring interval of 30 seconds. Thanks, Parvez -- Linux-cluster mailing list Linux-cluster@redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster

[Linux-cluster] Not restarting "max_restart" times before relocating failed service

2012-10-30 Thread Parvez Shaikh
Hi experts, I have defined a service as follows in cluster.conf - I mentioned max_restarts=5 hoping that if cluster fails to start service 5 times, then it will relocate to another cluster node in failover domain

Re: [Linux-cluster] Hi

2012-10-02 Thread Parvez Shaikh
A curious observation: there is a sudden surge of emails sent to private addresses rather than to the mailing list. Please send your doubts / questions to the mailing list "linux-cluster@redhat.com" instead of addressing me personally. Regarding configuration for manual fencing - I don't hav

Re: [Linux-cluster] linux-cluster

2012-10-02 Thread Parvez Shaikh
Hi Digimer, Could you please point me to references or case studies explaining why manual fencing was dropped and how automated fencing fixes those problems? Thanks, Parvez On Tue, Oct 2, 2012 at 7:08 PM, Digimer wrote: > On 10/02/2012 04:00 AM, Parvez Shaikh wrote: > >> What kind of clu

Re: [Linux-cluster] linux-cluster

2012-10-02 Thread Parvez Shaikh
What kind of cluster is this - an academic project or a production-quality solution? If it's the former - go for manual fencing. You won't need a fence device, but failover won't be automatic. If it's the latter - yes, you'll need a fence device. On Mon, Oct 1, 2012 at 10:15 PM, Rajagopal Swaminathan < raju.rajs...@gm

Re: [Linux-cluster] 2 node cluster showing strange behaviour

2012-09-17 Thread Parvez Shaikh
Had similar issues however I was using RHEL 5.5 Please refer - https://access.redhat.com/knowledge/solutions/18542 On Mon, Sep 17, 2012 at 9:22 PM, Ben .T.George wrote: > > > HI > > i am just started building 2 node cluster.i installed all packages of red > hat cluster suite by mounting RHEL 6

Re: [Linux-cluster] How to add shell script to cluster.conf

2012-09-16 Thread Parvez Shaikh
From this link - https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/4/html/Cluster_Administration/s1-config-service-dev-CA.html Script *Name* — Enter a name for the custom user script. *File (with path)* — Enter the path where this custom script is located (for example, /et

Re: [Linux-cluster] clurgmgrd : relocating a service to better node

2012-04-10 Thread Parvez Shaikh
> recovery="relocate"> > > > > > > > On Wed, Apr 11, 2012 at 11:51 AM, Digimer wrote: > On 04/11/2012 02:14 AM, Parvez Shaikh wrote: > > Hi, > > > > When I start or enable a service (that was previously disab

[Linux-cluster] clurgmgrd : relocating a service to better node

2012-04-10 Thread Parvez Shaikh
Hi, When I start or enable a service (that was previously disabled) on a cluster node, I see a message saying clurgmgrd is relocating the service to a "better" node. I do not understand why. I can relocate the service back to the node where I saw the above message and it runs fine there. What does "better" node

[Linux-cluster] Multicast address by CMAN

2012-04-03 Thread Parvez Shaikh
Hi all, As per my understanding, CMAN uses the cluster name to internally generate a multicast address. In my cluster.conf Having a cluster with the same name in a given network leads to issues and is undesirable. I want to know whether there is any way to find out if a multicast address is already in use by some other

Re: [Linux-cluster] Two clusters in the network with same "name" in cluster.conf

2012-03-28 Thread Parvez Shaikh
subnet. Usually it is generated from a hash of the cluster name, but >it can be overridden here if you feel the need. Sometimes cluster >names can hash to the same ID. > > > > Kind regards, > Laszlo > > On 03/28/2012 09:14 AM, Parvez Shaikh wrote: > > H

Re: [Linux-cluster] Two clusters in the network with same "name" in cluster.conf

2012-03-28 Thread Parvez Shaikh
a point of curiosity; why do you need the cluster names to be identical? > > Digi > > On 03/28/2012 12:00 AM, Parvez Shaikh wrote: > > Thanks Fabio & Digimer, > > > > Just to add more information that each node has 4 NIC cards out of which > - > > > > eth

Re: [Linux-cluster] Two clusters in the network with same "name" in cluster.conf

2012-03-28 Thread Parvez Shaikh
e4-2 ccsd[6821]: Remote copy of cluster.conf is from quorate node. Mar 28 11:57:07 blade4-2 ccsd[6821]: Local version # : 1 Mar 28 11:57:07 blade4-2 ccsd[6821]: Remote version #: 1 On Wed, Mar 28, 2012 at 12:13 PM, Fabio M. Di Nitto wrote: > On 3/28/2012 8:14 AM, Parvez Shaikh wrote:

[Linux-cluster] [TOTEM] The consensus timeout expired.

2012-03-26 Thread Parvez Shaikh
Hi all, I have a cluster with two blades in an IBM BladeCenter. The following error appears when I start the cman service, and it keeps repeating in /var/log/messages - openais[10770]: [TOTEM] The consensus timeout expired. openais[10770]: [TOTEM] entering GATHER state from 3. Heart beating

Re: [Linux-cluster] $OCF_ERR_CONFIGURED - recovers service on another cluster node

2012-01-27 Thread Parvez Shaikh
ce > > like that you can see what it's wrong > > 2012/1/27 Parvez Shaikh > >> Hi guys, >> >> I am using Red Hat Cluster Suite which comes with RHEL 5.5 - >> >> cman_tool version >> >>6.2.0 config xxx >> >> Now I have a sc

[Linux-cluster] $OCF_ERR_CONFIGURED - recovers service on another cluster node

2012-01-27 Thread Parvez Shaikh
Hi guys, I am using Red Hat Cluster Suite which comes with RHEL 5.5 - cman_tool version >>6.2.0 config xxx Now I have a script resource in which I return $OCF_ERR_CONFIGURED; in case of a Fatal irrecoverable error, hoping that my service would not start on another cluster node. But I see that c

Re: [Linux-cluster] Configuring failover time with Red Hat Cluster

2011-07-05 Thread Parvez Shaikh
o anything more I recommend you read the documentation for the > actual version of clustering you are going to install > > https://access.redhat.com/knowledge/docs/Red_Hat_Enterprise_Linux/ > > Chris

[Linux-cluster] Configuring failover time with Red Hat Cluster

2011-07-05 Thread Parvez Shaikh
Hi all, I was trying to find out how long it takes RHCS to detect a failure and recover from it. I found the link - http://www.redhat.com/whitepapers/rha/RHA_ClusterSuiteWPPDF.pdf It says that the network polling interval is 2 seconds and 6 retries are attempted before declaring a node as
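
Taking the two whitepaper figures quoted in the post above at face value, a rough estimate of how long the cluster waits before giving up on a silent node is simply the polling interval multiplied by the number of retries. The sketch below is only that arithmetic; the function name is hypothetical, and it deliberately ignores fencing and service restart time, which would add to the total failover time.

```python
def detection_window_seconds(poll_interval_s: float, retries: int) -> float:
    """Time spent polling before a node is given up on.

    Assumption: detection is a plain interval * retries product, as the
    figures quoted from the whitepaper suggest. Fencing and service
    recovery are not included.
    """
    if poll_interval_s < 0 or retries < 0:
        raise ValueError("interval and retries must be non-negative")
    return poll_interval_s * retries


# Figures quoted in the post: 2-second polling interval, 6 retries.
window = detection_window_seconds(2, 6)
print(f"Node failure detected after roughly {window} seconds")
```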

Re: [Linux-cluster] fence_ipmilan fails to reboot - SOLVED

2011-07-01 Thread Parvez Shaikh
e: > > fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="xx.xx.xx.xx" > lanplus="1" login="xxxt" name="node1_ilo" passwd="yyy > > > Regards > > Shalom. > > On Thu, Jun 30, 2011 at 1:03 PM, Parvez Shaik

[Linux-cluster] fence_ipmilan fails to reboot

2011-06-30 Thread Parvez Shaikh
Hi all, I am on RHEL 5.5, and I have two rack-mounted servers with IPMI configured. When I run the command from the prompt to reboot the server through fence_ipmilan, it shuts down the server fine but fails to power it on # fence_ipmilan -a -l admin -p password -o reboot > Rebooting machine @ IPM

Re: [Linux-cluster] Plugged out blade from bladecenter chassis - fence_bladecenter failed

2011-06-19 Thread Parvez Shaikh
e: > There is a bug related to missing_as_off - > https://bugzilla.redhat.com/show_bug.cgi?id=689851 - expects the fix in > rhel5u7 . > > regards, > > On Wed, Apr 27, 2011 at 1:59 PM, Parvez Shaikh > wrote: > >> Hi all, >> >> I am using RHCS on IB

Re: [Linux-cluster] Plugged out blade from bladecenter chassis - fence_bladecenter failed

2011-06-14 Thread Parvez Shaikh
Hi, Has anyone used missing_as_off in cluster.conf file? Any help where to put this option in cluster.conf would be greatly appreciated Thanks, Parvez On Mon, May 2, 2011 at 6:49 PM, Parvez Shaikh wrote: > Hi Marek, > > I tried the option missing_as_off="1" and now I g

Re: [Linux-cluster] oracle DB is not failing over on killin PMON deamon

2011-05-14 Thread Parvez Shaikh
Hi Sufyan Does your status function return 0 or 1 if database is up or down respectively (i.e. have you tested it works outside script_db.sh) when run as "root"? On Thu, May 12, 2011 at 12:52 PM, Sufyan Khan wrote: > First of all thanks for you quick response. > > Secondly please note: the wor

Re: [Linux-cluster] Plugged out blade from bladecenter chassis - fence_bladecenter failed

2011-05-02 Thread Parvez Shaikh
Did I miss something? Thanks Parvez On Mon, May 2, 2011 at 1:03 PM, Marek Grac wrote: > Hi, > > > On 04/29/2011 10:15 AM, Parvez Shaikh wrote: > >> Hi Marek, >> >> Can we give this option in cluster.conf file for bladecenter fen

Re: [Linux-cluster] Plugged out blade from bladecenter chassis - fence_bladecenter failed

2011-04-29 Thread Parvez Shaikh
Hi Marek, Can we give this option in the cluster.conf file for the bladecenter fencing device or method? For IPMI fencing, is there a similar option? On Fri, Apr 29, 2011 at 1:38 PM, Marek Grac wrote: > Hi, > > > On 04/27/2011 10:29 AM, Parvez Shaikh wrote: > >> Hi all, >>

[Linux-cluster] Plugged out blade from bladecenter chassis - fence_bladecenter failed

2011-04-27 Thread Parvez Shaikh
Hi all, I am using RHCS on an IBM BladeCenter with BladeCenter fencing. I pulled a blade out of its BladeCenter chassis slot, hoping that failover would occur. However, when I did so, I got the following message - fenced[10240]: agent "fence_bladecenter" reports: Failed: Unable to obtain correct plu

Re: [Linux-cluster] Node without fencing method, is it possible to failover from such a node?

2011-03-24 Thread Parvez Shaikh
. Gratefully, Parvez On Thu, Mar 17, 2011 at 10:19 PM, Rajagopal Swaminathan < raju.rajs...@gmail.com> wrote: > Greetings, > > On 3/17/11, Digimer wrote: > > On 03/17/2011 01:25 AM, Parvez Shaikh wrote: > >> Hi all, > >> > >> Life was good until I

[Linux-cluster] Node without fencing method, is it possible to failover from such a node?

2011-03-16 Thread Parvez Shaikh
Hi all, I have a Red Hat cluster on an IBM BladeCenter, with blades as my cluster nodes and the fence_bladecenter fencing agent. I have a couple of resources - an IP resource, which activates or deactivates a floating IP, and a script, which starts my server listening on this floating IP. This is a stateless server with no sh

[Linux-cluster] Clustat exit code for service status

2011-03-15 Thread Parvez Shaikh
Hi all, The command clustat -s gives the status of a service. If the service is started (i.e. running on some node), the exit code of this command is 0; if the service is not running, its exit code is non-zero (I found it to be 119). Is this right and going to be continued in subsequent cluster versions as wel
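
The behaviour described in the post above (exit code 0 when the service is started, non-zero otherwise) could be wrapped in a small helper like the sketch below. It relies only on the zero versus non-zero distinction, not on the specific value 119, since whether that value is stable across versions is exactly what the post asks. The function names are hypothetical, and `check_service` only works on a node where `clustat` is installed.

```python
import subprocess


def service_is_started(exit_code: int) -> bool:
    """Interpret the exit status of `clustat -s <service>`.

    Assumption (taken from the post, not from documentation):
    0 means the service is started, anything else means it is not.
    """
    return exit_code == 0


def check_service(service_name: str) -> bool:
    """Run `clustat -s <service_name>` and report whether it is started.

    Must be run on a cluster node; raises FileNotFoundError elsewhere.
    """
    result = subprocess.run(
        ["clustat", "-s", service_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return service_is_started(result.returncode)
```

Keeping the exit-code interpretation in its own function means the rule can be adjusted in one place if a later cluster version turns out to behave differently.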

Re: [Linux-cluster] Two node cluster - a potential problem of node fencing each other?

2011-03-13 Thread Parvez Shaikh
redundant network link - i trust you were referring to ethernet bonding. On Sun, Mar 13, 2011 at 1:19 PM, Ian Hayes wrote: > On Sat, Mar 12, 2011 at 11:19 PM, Parvez Shaikh > wrote: > >> Hi all, >> >> I have a question pertaining to two node cluster, I have RHEL 5.

[Linux-cluster] Two node cluster - a potential problem of node fencing each other?

2011-03-12 Thread Parvez Shaikh
Hi all, I have a question pertaining to two-node clusters. I have RHEL 5.5 and the cluster suite that comes along with it, which should have at least two nodes. In a situation where both nodes of the cluster are up, and have a reliable connection to the fencing device (e.g. a power switch or any other power fencing device) and

Re: [Linux-cluster] SNMP support with IBM Blade Center Fence Agent

2011-03-04 Thread Parvez Shaikh
ks again and have great weekend ahead Yours truly, Parvez On Fri, Mar 4, 2011 at 10:45 PM, Lon Hohberger wrote: > On Tue, Mar 01, 2011 at 06:50:18PM +0530, Parvez Shaikh wrote: > > Hi Ryan, > > > > Thank you for response. Does it mean there is no way to intimate > > ad

Re: [Linux-cluster] SNMP support with IBM Blade Center Fence Agent

2011-03-01 Thread Parvez Shaikh
)? Thanks On Mon, Feb 28, 2011 at 9:44 PM, Ryan O'Hara wrote: > On Mon, Feb 28, 2011 at 12:43:10PM +0530, Parvez Shaikh wrote: > > Hi all, > > > > I have a question related to fence agents and SNMP alarms. > > > > Fence Agent can fail to fence the faile

[Linux-cluster] SNMP support with IBM Blade Center Fence Agent

2011-02-27 Thread Parvez Shaikh
Hi all, I have a question related to fence agents and SNMP alarms. Fence Agent can fail to fence the failed node for various reason; e.g. with my bladecenter fencing agent, I sometimes get message saying bladecenter fencing failed because of timeout or fence device IP address/user credentials are

[Linux-cluster] Tuning red hat cluster

2011-02-10 Thread Parvez Shaikh
Hi, As per my understanding, rgmanager periodically invokes 'status' on resource groups to determine whether these resources are up or down. I observed that this period is around 30 seconds. Is it possible to tune or adjust this period for individual services or resource groups? Thanks

Re: [Linux-cluster] Running cluster tools using non-root user

2011-01-27 Thread Parvez Shaikh
ed form our devel branch about a week ago. > > [/Shameless plug] > > On Tue, Jan 25, 2011 at 10:39 AM, Parvez Shaikh > wrote: > > Hi all > > > > Is it possible to run cluster tools like clustat or clusvcadm etc. using > > non-root user? > > > > If ye

[Linux-cluster] Running cluster tools using non-root user

2011-01-25 Thread Parvez Shaikh
Hi all Is it possible to run cluster tools like clustat or clusvcadm etc. as a non-root user? If yes, which groups should this user belong to? Otherwise, can this be done using sudo (and the sudoers file)? As of now I get the following error on clustat - Could not connect to CMAN: Permission denied

[Linux-cluster] Questions related to cluster quorum and fencing

2011-01-18 Thread Parvez Shaikh
Hi all, *Quorum - * The questions are a bit theoretical. I have gone through the documentation and man pages and have understood that a cluster is "quorate" if a cluster or its partition has nodes with votes equal to or more than "expected_votes" in the "cman" section of the cluster.conf file (with no require

Re: [Linux-cluster] Determining failed node on another node of cluster during failover

2011-01-13 Thread Parvez Shaikh
Gratefully yours On 1/13/11, Parvez Shaikh wrote: > Hi, > > I have been using clustat command. clustat -x -s servicename to get > following XML file - > > > > > state_str="started" flags="0" flags_str="" owner="node1

Re: [Linux-cluster] Determining failed node on another node of cluster during failover

2011-01-12 Thread Parvez Shaikh
at' and 'cman_tool'? > > > Regards, > > Kit > > -Original Message- > From: linux-cluster-boun...@redhat.com > [mailto:linux-cluster-boun...@redhat.com] On Behalf Of Parvez Shaikh > Sent: woensdag 12 januari 2011 11:01 > To: linux clustering > Subj

Re: [Linux-cluster] Determining failed node on another node of cluster during failover

2011-01-12 Thread Parvez Shaikh
would suggest also setting up monitoring. > The monitoring package can then notify you if any cluster member fails. > > > Regards, > > Kit > > -Original Message- > From: linux-cluster-boun...@redhat.com > [mailto:linux-cluster-boun...@redhat.com] On Behalf

[Linux-cluster] Determining failed node on another node of cluster during failover

2011-01-11 Thread Parvez Shaikh
Hi all, Taking this question from another thread, here is a challenge that I am facing - Following is a simple cluster configuration - Node 1, node 2, node 3, and node 4 are part of the cluster; it's an unrestricted, unordered failover domain with an active-active NxN configuration. So a node 2 can get serv

Re: [Linux-cluster] Error while manual fencing and output of clustat

2011-01-10 Thread Parvez Shaikh
g XML file to obtain value of "last_owner" field. Any input on how to find out name of failed node on another cluster node, over which services from failed node are starting? Thanks On Mon, Jan 10, 2011 at 6:58 PM, Xavier Montagutelli wrote: > Hello Parvez, > > On Monday 10 Ja
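
The idea mentioned in the post above, reading the "last_owner" field from the XML status output, might be sketched as follows. Only the attribute names `owner` and `last_owner` come from the thread; the element names and the sample document are assumptions made for illustration, so the helper scans every element rather than relying on a particular tag. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element names and values are assumptions,
# only the attribute names appear in the thread.
SAMPLE = """<clustat>
  <groups>
    <group name="service:S2" state_str="started" flags="0"
           flags_str="" owner="node1" last_owner="node2"/>
  </groups>
</clustat>"""


def owner_and_last_owner(status_xml: str):
    """Return (owner, last_owner) from the first element carrying a
    last_owner attribute, or None if no element has one."""
    root = ET.fromstring(status_xml)
    for element in root.iter():
        last_owner = element.get("last_owner")
        if last_owner is not None:
            return element.get("owner"), last_owner
    return None


print(owner_and_last_owner(SAMPLE))
```

Whether `last_owner` reliably names the failed node in every failover scenario is the open question in the thread, so the result should be treated as a hint rather than proof.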

[Linux-cluster] Error while manual fencing and output of clustat

2011-01-10 Thread Parvez Shaikh
Dear experts, I have a two-node cluster (node1 and node2), and manual fencing is configured. Service S2 is running on node2. To check that failover happens, I shut down node2. I see the following messages in /var/log/messages - agent "fence_manual" reports: failed: fence_manual no node na

Re: [Linux-cluster] configuring bladecenter fence device

2011-01-06 Thread Parvez Shaikh
Thanks Hugo Yours gratefully On Fri, Jan 7, 2011 at 11:09 AM, Hugo Lombard wrote: > On Fri, Jan 07, 2011 at 10:12:16AM +0530, Parvez Shaikh wrote: >> >> >> >> >>

Re: [Linux-cluster] configuring bladecenter fence device

2011-01-06 Thread Parvez Shaikh
Hi Ben Thanks a ton for below information. But I have doubt on cluster.conf file snippet below - Here for "node1

[Linux-cluster] configuring bladecenter fence device

2011-01-06 Thread Parvez Shaikh
Hi all, From the RHCS documentation, I could see that bladecenter is one of the fence devices - http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/ap-fence-device-param-CA.html Table B.9. IBM Blade Center Field Description Name A name for the IBM BladeCente

Re: [Linux-cluster] Determining red hat cluster version

2011-01-06 Thread Parvez Shaikh
ents (with their own versions may be) Gratefully yours On Thu, Jan 6, 2011 at 1:14 PM, Fabio M. Di Nitto wrote: > On 1/6/2011 8:28 AM, Parvez Shaikh wrote: >> Hi Fabio >> >> This produces output - >> >> cman-2.0.115-29.el5 >> >> So does it indicat

Re: [Linux-cluster] Determining red hat cluster version

2011-01-05 Thread Parvez Shaikh
Hi Fabio This produces output - cman-2.0.115-29.el5 So does it indicate 2.0.115-29 is version? On Thu, Jan 6, 2011 at 12:34 PM, Fabio M. Di Nitto wrote: > On 1/6/2011 6:24 AM, Parvez Shaikh wrote: >> Hi all, >> >> Is there any command which states Red Hat cluster v

[Linux-cluster] Determining red hat cluster version

2011-01-05 Thread Parvez Shaikh
Hi all, Is there any command which states Red Hat cluster version? I tried cman_tool version, and ccs_tool -V both produce different results, most likely reporting version of their own (not of Cluster suite) yum list installed *Cluster* produces following - Installed Packages Cluster_Administr

Re: [Linux-cluster] IP Resource behavior with Red Hat Cluster

2010-12-26 Thread Parvez Shaikh
s and interest in this problem Gratefully yours On Mon, Dec 27, 2010 at 12:18 PM, Rajagopal Swaminathan wrote: > Greetinds, > > On Mon, Dec 27, 2010 at 9:51 AM, Parvez Shaikh > wrote: >> Hi >> >> >> Dec 27 17:35:32 datablade1 clurgmgrd[31853]: Error stori

Re: [Linux-cluster] IP Resource behavior with Red Hat Cluster

2010-12-26 Thread Parvez Shaikh
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) Interrupt:162 Memory:9600-96012800 On Sat, Dec 25, 2010 at 6:34 AM, Jakov Sosic wrote: > On 12/24/2010 05:46 PM, Parvez Shaikh wrote

Re: [Linux-cluster] IP Resource behavior with Red Hat Cluster

2010-12-24 Thread Parvez Shaikh
Thanks a ton Jakov. It has clarified my doubts. Yours gratefully, Parvez On Sat, Dec 25, 2010 at 6:34 AM, Jakov Sosic wrote: > On 12/24/2010 05:46 PM, Parvez Shaikh wrote: >> Hi Jakov >> >> Thank you for your response. My two hosts have multiple network >> inte

Re: [Linux-cluster] IP Resource behavior with Red Hat Cluster

2010-12-24 Thread Parvez Shaikh
m if these are in same subnets. Gratefully yours, Parvez On Fri, Dec 24, 2010 at 6:16 PM, Jakov Sosic wrote: > On 12/24/2010 09:41 AM, Parvez Shaikh wrote: >> Hi Rajagopal, >> >> Thank you for your response >> >> I have created a cluster configuration by adding IP

Re: [Linux-cluster] IP Resource behavior with Red Hat Cluster

2010-12-24 Thread Parvez Shaikh
to add virtual interface manually (as above or any other method?) before I could start service with IP resource under it? Thanks Parvez On Fri, Dec 24, 2010 at 11:30 AM, Rajagopal Swaminathan wrote: > Greetings, > > On Fri, Dec 24, 2010 at 5:33 AM, Parvez Shaikh > wrote: >

[Linux-cluster] IP Resource behavior with Red Hat Cluster

2010-12-23 Thread Parvez Shaikh
Hi all, I am using Red Hat cluster 6.2.0 (the version shown by cman_tool version) on Red Hat 5.5. I am on a host that has multiple network interfaces, all (or some) of which may be active when I try to bring my IP resource up. My cluster has a simple configuration - It has only 2 nodes, and s