Re: [Pacemaker] crm : unknown expected votes

2011-04-20 Thread Andrew Beekhof
On Tue, Apr 19, 2011 at 3:37 PM,  hari.n.tatit...@accenture.com wrote:
 Hi,



     I created a 2-node cluster using Pacemaker on Fedora
 14 (2.6.35.6-45.fc14.x86_64).

     I have two errors that I am not able to resolve.

     Can someone help me resolve these errors?



   1) It always shows “unknown expected votes” when I run ‘crm status’.

Not an error; Heartbeat-based clusters do not use expected votes.


   2) The logfile shows the message below even though stonith is
 not enabled.

     Error: te_connect_stonith: Attempting connection to fencing daemon…

Disabling stonith does not affect whether the daemon is started, nor
whether we connect to it.

google is your friend:
   http://www.mail-archive.com/linux-ha@lists.linux-ha.org/msg16967.html






  Pasted below are the configuration and status:

 ===

 -bash-4.1# crm configure show

 node $id=2e9dd3fa-8083-4363-96b4-331aa9b93d1f rabbithanode2

 node $id=3a56dae9-d8c7-46b0-8a86-f6bd3b9658f4 rabbithanode1

 primitive bunny ocf:rabbitmq:rabbitmq-server \

     params mnesia_base=/cluster1

 primitive drbd ocf:linbit:drbd \

     params drbd_resource=wwwdata \

     op monitor interval=60s

 primitive drbd_fs ocf:heartbeat:Filesystem \

     params device=/dev/drbd1 directory=/cluster1 fstype=ext4

 ms drbd_ms drbd \

     meta master-max=1 master-node-max=1 clone-max=2
 clone-node-max=1 notify=true

 colocation bunny_on_fs inf: bunny drbd_fs

 colocation fs_on_drbd inf: drbd_fs drbd_ms:Master

 order bunny_after_fs inf: drbd_fs bunny

 order fs_after_drbd inf: drbd_ms:promote drbd_fs:start

 property $id=cib-bootstrap-options \

     dc-version=1.1.4-ac608e3491c7dfc3b3e3c36d966ae9b016f77065 \

     cluster-infrastructure=Heartbeat \

     stonith-enabled=false \

     resource-stickiness=100 \

     no-quorum-policy=ignore

 ===



 -bash-4.1# crm status

 

 Last updated: Tue Apr 19 09:32:52 2011

 Stack: Heartbeat

 Current DC: rabbithanode2 (2e9dd3fa-8083-4363-96b4-331aa9b93d1f) - partition
 with quorum

 Version: 1.1.4-ac608e3491c7dfc3b3e3c36d966ae9b016f77065

 2 Nodes configured, unknown expected votes

 3 Resources configured.

 



 Online: [ rabbithanode1 rabbithanode2 ]



  Master/Slave Set: drbd_ms [drbd]

  Masters: [ rabbithanode1 ]

  Slaves: [ rabbithanode2 ]

  drbd_fs    (ocf::heartbeat:Filesystem):    Started rabbithanode1

  bunny  (ocf::rabbitmq:rabbitmq-server):    Started rabbithanode1

 -bash-4.1#

 ===



 Thanks  Regds

 Hari Tatituri



 TACG-Cloud Factory Mobilization

 Desk  : +91-080-43154146

 Mobile: +91-9686022660



 
 This message is for the designated recipient only and may contain
 privileged, proprietary, or otherwise private information. If you have
 received it in error, please notify the sender immediately and delete the
 original. Any other use of the email by you is prohibited.

 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker

 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs:
 http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker





Re: [Pacemaker] Ordering set of resources, problem in ordering chain of resources

2011-04-20 Thread Andrew Beekhof
On Tue, Apr 19, 2011 at 12:40 PM, Rakesh K rakirocker4...@gmail.com wrote:
 Andrew Beekhof andrew@... writes:


 There is nothing in this config that requires tomcat2 to be stopped.

 Perhaps:
    colocation Tomcat2-with-Tomcat inf: Tomcat1 Tomcat2VIP
 was intended to be:
    colocation Tomcat2-with-Tomcat inf: Tomcat2 Tomcat1

 The only other service active is httpd, which also has no constraints
 indicating it should stop when mysql is down.


 Thanks Andrew for the valuable feed back.

 As mentioned, I changed the colocation constraint but am still facing the
 same issue.

 As per the order given in the HA configuration, I am providing the output of
 my crm configure show command.

Not enough sorry, I need the status section too.
   crm configure show xml
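
For reference, crm shell colocation places the first (dependent) resource where the second already runs, so the direction matters. A minimal sketch using the resource names from this thread (the constraint ids are illustrative):

```shell
# colocation <id> <score>: <dependent-rsc> <target-rsc>
# "run Tomcat2 where Tomcat1 runs"; if Tomcat1 stops, Tomcat2 is stopped too
colocation Tomcat2-with-Tomcat1 inf: Tomcat2 Tomcat1

# colocation alone does not serialize startup; add an order constraint for that
order Tomcat1-before-Tomcat2 inf: Tomcat1 Tomcat2
```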



Re: [Pacemaker] Unable to stop Multi state resource

2011-04-20 Thread Andrew Beekhof
On Tue, Apr 19, 2011 at 12:34 PM, Rakesh K rakirocker4...@gmail.com wrote:
 Rakesh K rakirocker4236@... writes:


 Hi Andrew

 FSR is a file-system replication script that adheres to the OCF cluster
 framework. The script is similar to the MySQL OCF script: it is a multi-state
 resource where, on the master, an ssh server is running, and on the slave,
 rsync scripts synchronize the data between the master and the slave.

 The rsync script is given the master FSR location, so the rsync tool
 frequently replicates the data from the master FSR location.

 Here is the crm configure show output.

Thanks, but this doesn't really answer my question about whether the
cluster tried to stop it.



Re: [Pacemaker] SBD kills both nodes in a two node cluster.

2011-04-20 Thread Andrew Beekhof
On Tue, Apr 19, 2011 at 12:04 PM, Ulf m...@gmx.net wrote:
 I've two nodes with shared storage and multipathing, but the SBD device
 doesn't work as expected.
 My idea was that in case of a split brain, one node kills the other node and
 one survives.
 But in my case I get a double kill: both nodes are killed at the same
 time.

http://ourobengr.com/ha might be of some assistance.
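
One common mitigation for this fencing race is to give the two nodes different fencing delays so that one side reliably wins. Newer stonith-ng versions expose this as a `pcmk_delay_max` instance attribute; whether the Pacemaker/external-sbd versions in this thread support it needs checking, so treat the snippet below as a sketch to verify, not a drop-in fix:

```shell
# Random delay (up to 15s) before the fencing action is executed, so both
# nodes are unlikely to shoot at the same instant. Verify pcmk_delay_max
# is supported by your stonith-ng version before relying on it.
primitive stonith_sbd stonith:external/sbd \
    params sbd_device="/dev/disk/by-id/scsi-36..." pcmk_delay_max="15s"
clone stonith_sbd-clone stonith_sbd
```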

 I simulated the split brain with "ip link set down eth0" on one node. I 
 tested it several times.

 The sbd deamon is running on both nodes.
 My configuration:
 primitive stonith_sbd stonith:external/sbd params 
 sbd_device=/dev/disk/by-id/scsi-36...
 clone stonith_sbd-clone stonith_sbd

 /var/log/messages:
 Node A:
 Apr 19 10:37:09 nodeA crmd: [7690]: info: te_fence_node: Executing reboot 
 fencing operation (17) on nodeB (timeout=18)
 Apr 19 10:37:09 nodeA stonith-ng: [7685]: info: initiate_remote_stonith_op: 
 Initiating remote operation reboot for nodeB: 
 d4226746-fef1-4d29-bc85-2d33e9bf7f94
 Apr 19 10:37:09 nodeA stonith-ng: [7685]: info: stonith_query: Query <stonith_command
 t="stonith-ng" st_async_id="d4226746-fef1-4d29-bc85-2d33e9bf7f94" st_op="st_query"
 st_callid="0" st_callopt="0" st_remote_op="d4226746-fef1-4d29-bc85-2d33e9bf7f94"
 st_target="nodeB" st_device_action="reboot" st_clientid="3b1b3feb-5e4e-4a3c-ae8e-2131ea2ae588"
 st_timeout="18000" src="nodeA" seq="1" />


 Node B:
 Apr 19 10:37:09 nodeB crmd: [7851]: info: te_fence_node: Executing reboot 
 fencing operation (17) on nodeA (timeout=18)
 Apr 19 10:37:09 nodeB stonith-ng: [7846]: info: initiate_remote_stonith_op: 
 Initiating remote operation reboot for nodeA: 
 e361b3b6-2890-474d-8671-b73eea62d1ab
 Apr 19 10:37:09 nodeB stonith-ng: [7846]: info: stonith_query: Query <stonith_command
 t="stonith-ng" st_async_id="e361b3b6-2890-474d-8671-b73eea62d1ab" st_op="st_query"
 st_callid="0" st_callopt="0" st_remote_op="e361b3b6-2890-474d-8671-b73eea62d1ab"
 st_target="nodeA" st_device_action="reboot" st_clientid="a0d67d7e-5e30-44fe-bc88-e733019e594d"
 st_timeout="18000" src="nodeB" seq="1" />


 On both nodes I started "sbd -d /dev/disk/by-id/scsi-36... list" in an 
 endless loop and these are the last SBD commands I get.
 As you can see, both nodes request a reset at the same time and both will 
 succeed => double kill.
 Node A:
 0       nodeB clear
 1       nodeA clear
 0       nodeB clear
 1       nodeA reset   nodeB
 0       nodeB reset   nodeA
 1       nodeA reset   nodeB

 Node B:
 0       nodeB clear
 1       nodeA reset   nodeB
 0       nodeB clear
 1       nodeA reset   nodeB
 0       nodeB clear
 1       nodeA reset   nodeB
 0       nodeB reset   nodeA
 1       nodeA reset   nodeB
 0       nodeB reset   nodeA
 1       nodeA reset   nodeB


 Cheers,
 Ulf





Re: [Pacemaker] crm : unknown expected votes

2011-04-20 Thread hari.n.tatituri
Thank you so much for this info!

-Original Message-
From: Andrew Beekhof [mailto:and...@beekhof.net]
Sent: Wednesday, April 20, 2011 11:59 AM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] crm : unknown expected votes

On Tue, Apr 19, 2011 at 3:37 PM,  hari.n.tatit...@accenture.com wrote:
 Hi,



 I created a 2 node cluster created using pacemaker on Fedora
 14(2.6.35.6-45.fc14.x86_64)

 I have two errors that I am not able to resolve.

 Can someone help me resolve these errors.



   1) It always shows "unknown expected votes" when I run 'crm status'.

Not an error; Heartbeat-based clusters do not use expected votes.


   2) The logfile shows the message below even though stonith is
 not enabled.

 Error: te_connect_stonith: Attempting connection to fencing daemon...

Disabling stonith does not affect whether the daemon is started, nor
whether we connect to it.

google is your friend:
   http://www.mail-archive.com/linux-ha@lists.linux-ha.org/msg16967.html






 [quoted configuration and status snipped; identical to the original message above]







Re: [Pacemaker] Ordering set of resources, problem in ordering chain of resources

2011-04-20 Thread Rakesh K
Andrew Beekhof andrew@... writes:

Hi Andrew,

Thanks for the replies, and sorry for troubling you so frequently.

Here is the output of crm configure show xml:
<?xml version="1.0" ?>
<cib admin_epoch="0" crm_feature_set="3.0.1"
     dc-uuid="87b8b88e-3ded-4e34-8708-46f7afe62935" epoch="1120" have-quorum="1"
     num_updates="35" validate-with="pacemaker-1.0">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
                value="1.0.9-89bd754939df5150de7cd76835f98fe90851b677"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure"
                name="cluster-infrastructure" value="Heartbeat"/>
        <nvpair id="cib-bootstrap-options-stonith-enabled"
                name="stonith-enabled" value="false"/>
        <nvpair id="cib-bootstrap-options-no-quorum-policy"
                name="no-quorum-policy" value="ignore"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh"
                name="last-lrm-refresh" value="1300787402"/>
      </cluster_property_set>
    </crm_config>
    <rsc_defaults>
      <meta_attributes id="rsc-options">
        <nvpair id="rsc-options-resource-stickiness" name="resource-stickiness"
                value="100"/>
      </meta_attributes>
    </rsc_defaults>
    <op_defaults/>
    <nodes>
      <node id="6317f856-e57b-4a03-acf1-ca81af4f19ce" type="normal"
            uname="cisco-demomsf"/>
      <node id="87b8b88e-3ded-4e34-8708-46f7afe62935" type="normal"
            uname="mysql3"/>
    </nodes>
    <resources>
      <master id="MS_Mysql">
        <meta_attributes id="MS_Mysql-meta_attributes">
          <nvpair id="MS_Mysql-meta_attributes-notify" name="notify"
                  value="true"/>
          <nvpair id="MS_Mysql-meta_attributes-target-role" name="target-role"
                  value="Stopped"/>
        </meta_attributes>
        <primitive class="ocf" id="Mysql" provider="heartbeat" type="mysql">
          <instance_attributes id="Mysql-instance_attributes">
            <nvpair id="Mysql-instance_attributes-binary" name="binary"
                    value="/usr/bin/mysqld_safe"/>
            <nvpair id="Mysql-instance_attributes-config" name="config"
                    value="/etc/my.cnf"/>
            <nvpair id="Mysql-instance_attributes-datadir" name="datadir"
                    value="/var/lib/mysql"/>
            <nvpair id="Mysql-instance_attributes-user" name="user"
                    value="mysql"/>
            <nvpair id="Mysql-instance_attributes-pid" name="pid"
                    value="/var/lib/mysql/mysql.pid"/>
            <nvpair id="Mysql-instance_attributes-socket" name="socket"
                    value="/var/lib/mysql/mysql.sock"/>
            <nvpair id="Mysql-instance_attributes-test_passwd"
                    name="test_passwd" value="slavepass"/>
            <nvpair id="Mysql-instance_attributes-test_table" name="test_table"
                    value="msfha.conn"/>
            <nvpair id="Mysql-instance_attributes-test_user" name="test_user"
                    value="repl"/>
            <nvpair id="Mysql-instance_attributes-replication_user"
                    name="replication_user" value="repl"/>
            <nvpair id="Mysql-instance_attributes-replication_passwd"
                    name="replication_passwd" value="slavepass"/>
          </instance_attributes>
          <operations>
            <op id="Mysql-start-0" interval="0" name="start" timeout="120s"/>
            <op id="Mysql-stop-0" interval="0" name="stop" timeout="120s"/>
            <op id="Mysql-monitor-10s" interval="10s" name="monitor"
                role="Master" timeout="8s"/>
            <op id="Mysql-monitor-12s" interval="12s" name="monitor"
                timeout="8s"/>
          </operations>
        </primitive>
      </master>
      <primitive class="ocf" id="Tomcat1VIP" provider="heartbeat"
                 type="IPaddr3">
        <instance_attributes id="Tomcat1VIP-instance_attributes">
          <nvpair id="Tomcat1VIP-instance_attributes-ip" name="ip"
                  value="172.21.52.140"/>
          <nvpair id="Tomcat1VIP-instance_attributes-eth_num" name="eth_num"
                  value="eth0:2"/>
          <nvpair id="Tomcat1VIP-instance_attributes-vip_cleanup_file"
                  name="vip_cleanup_file" value="/var/run/bigha.pid"/>
        </instance_attributes>
        <operations>
          <op id="Tomcat1VIP-start-0" interval="0" name="start" timeout="120s"/>
          <op id="Tomcat1VIP-monitor-30s" interval="30s" name="monitor"/>
        </operations>
        <meta_attributes id="Tomcat1VIP-meta_attributes">
          <nvpair id="Tomcat1VIP-meta_attributes-target-role" name="target-role"
                  value="Started"/>
        </meta_attributes>
      </primitive>
      <primitive class="ocf" id="Tomcat1" provider="msf" type="tomcat">
        <instance_attributes id="Tomcat1-instance_attributes">
          <nvpair id="Tomcat1-instance_attributes-tomcat_name"
                  name="tomcat_name" value="tomcat"/>
          <nvpair id="Tomcat1-instance_attributes-statusurl" name="statusurl"
                  value="http://localhost:8080/dbtest/testtomcat.html"/>
          <nvpair id="Tomcat1-instance_attributes-java_home" name="java_home"
                  value="/"/>
          <nvpair id="Tomcat1-instance_attributes-catalina_home"
                  name="catalina_home" value="/home/msf/runtime/tomcat/apache-tomcat-6.0.18"/>
          <nvpair id="Tomcat1-instance_attributes-client" name="client"
                  value="curl"/>
          <nvpair id="Tomcat1-instance_attributes-testregex" name="testregex"
                  value="*&lt;/html&gt;"/>
        </instance_attributes>
        <operations>
          <op id="Tomcat1-start-0" interval="0" name="start" timeout="60s"/>
          <op id="Tomcat1-monitor-50s" interval="50s" name="monitor"
              timeout="50s"/>
          <op id="Tomcat1-stop-0" interval="0" name="stop"/>
        </operations>

[Pacemaker] [pacemaker] need some help regarding network failure setup in pacemaker.

2011-04-20 Thread rakesh k
Hello Everybody


 How can we detect network failure in a Pacemaker configuration? My two
nodes in the cluster framework are set up as follows:

two network routers connected via a switch as the mediator for communication.

How can we detect the network failure and stop the Heartbeat processes when I
shut down the interface? I am seeing a split-brain issue where Heartbeat is
started on both nodes and each acts as a separate Heartbeat process.

I have configured the pingd resource that comes with Pacemaker as a clone.
When there is a network failure, I see a split-brain issue where the Heartbeat
processes start separately on both nodes. My question here is: is there any
way to stop the Heartbeat process when pingd on a particular node reports that
there is no communication between the interface and the node where HA is
running?

Regards
rakesh


Re: [Pacemaker] [pacemaker] need some help regarding network failure setup in pacemaker.

2011-04-20 Thread Jelle de Jong
On 20-04-11 11:44, rakesh k wrote:
 How can we detect network failure in pacemaker configuration.

http://www.clusterlabs.org/wiki/Pingd_with_resources_on_different_networks
http://www.woodwose.net/thatremindsme/2011/04/the-pacemaker-ping-resource-agent/
http://wiki.lustre.org/index.php/Using_Pacemaker_with_Lustre

crm configure help location
crm ra info ocf:ping

That should give you a jump start.

You may need to increase the corosync token.
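
As a concrete starting point along the lines of those links, a minimal connectivity-check sketch (the resource names, the ping target IP, and the monitored resource `my_resource` are placeholders):

```shell
# Ping an external reference IP from every node and publish the score in the
# node attribute "pingd" (the ocf:pacemaker:ping agent's default attribute name)
primitive p_ping ocf:pacemaker:ping \
    params host_list="192.168.1.1" multiplier="1000" dampen="5s" \
    op monitor interval="10s"
clone c_ping p_ping

# Keep my_resource off any node that cannot reach the ping target; this moves
# resources away from the isolated node rather than stopping Heartbeat itself
location l_connected my_resource \
    rule -inf: not_defined pingd or pingd lte 0
```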

Kind regards,

Jelle de Jong



Re: [Pacemaker] [pacemaker] need some help regarding network failure setup in pacemaker.

2011-04-20 Thread Rakesh K
Jelle de Jong jelledejong@... writes:
Hi Jelle de Jong

 
 [quoted message from Jelle de Jong snipped]
Thanks for the help.

My question is:

I went through the scripts and found that in the ping_update method there is a
variable called ACTIVE, the number of nodes in host_list that are reachable.
Based on this value, for our scenario, can we stop the heartbeat/pacemaker
process when the host node cannot ping any other nodes in the cluster
framework? Please provide your suggestions, as this would help us in our
context.

Regards
Rakesh






[Pacemaker] missing pacemakerd on rhel5.5 installation

2011-04-20 Thread Emil Enemærke
Hi,

I have a rhel5.5 64bit installation and have tried installing heartbeat, 
pacemaker and corosync. The installation was done by 

rpm -Uvh 
http://download.fedora.redhat.com/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
wget -O /etc/yum.repos.d/pacemaker.repo 
http://clusterlabs.org/rpm/epel-5/clusterlabs.repo
yum install pacemaker.x86_64 corosync.x86_64 heartbeat.x86_64

and went without any problems.

I have the corosync and heartbeat daemons, but I cannot locate the pacemaker 
daemon. Here is the info on the pacemaker rpm:
Name        : pacemaker                      Relocations: (not relocatable)
Version     : 1.0.10                              Vendor: (none)
Release     : 1.4.el5                         Build Date: Wed 17 Nov 2010 03:55:21 PM CET
Install Date: Wed 20 Apr 2011 11:35:08 AM CEST  Build Host: f13.beekhof.net
Group       : System Environment/Daemons      Source RPM: pacemaker-1.0.10-1.4.el5.src.rpm
Size        : 12316094                           License: GPLv2+ and LGPLv2+
Signature   : (none)
URL         : http://www.clusterlabs.org
Summary     : Scalable High-Availability cluster resource manager
Signature   : (none)
URL : http://www.clusterlabs.org
Summary : Scalable High-Availability cluster resource manager


But there is no /etc/init.d/pacemaker nor a pacemakerd.

Any help or suggestions?

/Emil


Re: [Pacemaker] [Linux-HA] Announce: Hawk (HA Web Konsole) 0.4.0

2011-04-20 Thread Lars Marowsky-Bree
On 2011-04-19T04:59:35, Tim Serong tser...@novell.com wrote:

 Greetings All,
 
 This is to announce version 0.4.0 of Hawk, a web-based GUI for
 managing and monitoring Pacemaker High-Availability clusters.

Hi Tim,

This is great news and a big step forward! Congratulations!


Regards,
Lars

-- 
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
Experience is the name everyone gives to their mistakes. -- Oscar Wilde




[Pacemaker] Drbd dstate to trigger failover

2011-04-20 Thread Shravan Mishra
Hi,

I'm using following config for io errors:



resource <resource> {
  disk {
on-io-error detach;
...
  }
  ...
}


The above leads to the following state in case of disk errors:

Diskless/UpToDate

The DRBD documentation contains the following line:

STMT -- "If the disk failure has occurred on your primary node, you may
combine this step with a switch-over operation."

When I look at drbd resource agent's monitor:

drbd_monitor() {
local status

drbd_status
status=$?

drbd_update_master_score

return $status
}


Now in the above function the error is reported only based on role not
on estate. drbd_update_master_score is only updating score based on
estates.


My question is: can I return an error in drbd_monitor if my primary goes
Diskless, and then cause the switch-over based on the STMT?

Or should I be doing something else?
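
One way to act on the local disk state is to fold a dstate check into the monitor action. Below is a minimal sketch of just the decision logic; the helper name `dstate_to_rc` is hypothetical, and in a real agent its input would come from `drbdadm dstate <resource>`:

```shell
#!/bin/sh
# Map a "local/peer" dstate string (as printed by `drbdadm dstate <res>`)
# to an OCF monitor exit code. Only the local side is inspected: a local
# Diskless/Failed/Inconsistent disk is reported as a monitor error so
# Pacemaker initiates recovery (and hence a possible switch-over).
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

dstate_to_rc() {
    local_dstate="${1%%/*}"        # strip the peer half, keep the local dstate
    case "$local_dstate" in
        UpToDate|Consistent)           return $OCF_SUCCESS ;;
        Diskless|Failed|Inconsistent)  return $OCF_ERR_GENERIC ;;
        *)                             return $OCF_SUCCESS ;;  # unknown: don't flap
    esac
}

# In a real agent this would be: dstate_to_rc "$(drbdadm dstate $OCF_RESOURCE_INSTANCE)"
dstate_to_rc "Diskless/UpToDate"; echo "rc=$?"   # prints: rc=1
```

Whether reporting an error from monitor (rather than only lowering the master score) is the right recovery policy for your setup is a judgment call; it will make the cluster restart or migrate the resource instead of merely preferring the peer at the next promotion.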


Sincerely
Shravan



Re: [Pacemaker] Drbd dstate to trigger failover

2011-04-20 Thread Shravan Mishra
Wherever there is estates/estate it should be dstates/dstate.

Thanks
Shravan

On Wed, Apr 20, 2011 at 9:46 PM, Shravan Mishra
shravan.mis...@gmail.com wrote:
 [quoted original message snipped]

