[Pacemaker] Cluster reboot for maintenance

2016-06-20 Thread marco
Hi,
I have a two-node cluster with some VMs (Pacemaker resources) running on
the two hypervisors:
pacemaker-1.0.10
corosync-1.3.0

I need to do some maintenance, so I need to:
- put the cluster into maintenance so that it doesn't
  touch/start/stop/monitor the VMs
- update the VMs
- stop the VMs
- stop the cluster stack (corosync/pacemaker)
- reboot the hypervisors.

What is the correct way to do that on the corosync/pacemaker side?


Best regards
Marco
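
The steps listed in this message could be sketched roughly as follows. This is only a dry-run outline under assumptions that are not in the post: that the crm shell (crmsh) is installed, that the installed Pacemaker supports the maintenance-mode cluster property, and that corosync is controlled through /etc/init.d/corosync. The run and maintenance_sequence helpers are hypothetical names.

```shell
#!/bin/bash
# Dry-run sketch of the maintenance sequence; not a verified procedure.
# Assumptions: crmsh is installed, the cluster supports the
# maintenance-mode property, and corosync has a SysV init script.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

maintenance_sequence() {
    # 1. Ask the cluster to stop managing and monitoring all resources.
    run crm configure property maintenance-mode=true
    # 2. Update and shut down the VMs by hand here (site-specific, not shown).
    # 3. Stop the cluster stack on this node.
    run /etc/init.d/corosync stop
    # 4. Reboot the hypervisor.
    run reboot
}

maintenance_sequence
```

With the default DRY_RUN=1 the script only prints what it would do. Once both nodes are back and the cluster has rejoined, maintenance-mode would presumably need to be set back to false.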

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Clustermon issue

2015-01-09 Thread Marco Querci

I tried to insert some check code in my script:

#!/bin/bash

# Debug check: record every invocation of this script
date >> /tmp/check

monitorfile=/tmp/clustermonitor.html
hostname=$(hostname)

echo "Cluster state changes detected" | mail -r "$hostname@" -s "Cluster Monitor" -a "$monitorfile" mquerc...@gmail.com


to check whether the script is being called or not.
In the /tmp directory there is the clustermonitor.html file, which is created
by the ClusterMon resource, but no check file is present.
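
A slightly more informative check could also record which event triggered the call. The sketch below assumes the CRM_notify_* environment variables that crm_mon exports to an external agent; LOGFILE and log_invocation are hypothetical names, and any of the variables may be empty.

```shell
#!/bin/bash
# Hypothetical debugging variant of the notification script.
# Assumption: crm_mon exports CRM_notify_node, CRM_notify_rsc,
# CRM_notify_task and CRM_notify_rc to the external agent.

LOGFILE=${LOGFILE:-/tmp/check}

log_invocation() {
    # One line per call: timestamp plus whatever event details were exported.
    printf '%s node=%s rsc=%s task=%s rc=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" \
        "${CRM_notify_node:-}" "${CRM_notify_rsc:-}" \
        "${CRM_notify_task:-}" "${CRM_notify_rc:-}" >> "$LOGFILE"
}

log_invocation
```

If the log file stays empty after a resource change, the agent is not being invoked at all, which separates a ClusterMon problem from a mail problem.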


Thanks.



On 08/01/2015 03:31, Andrew Beekhof wrote:

And there is no indication this is being called?


On 7 Jan 2015, at 6:21 pm, Marco Querci  wrote:

#!/bin/bash

monitorfile=/tmp/clustermonitor.html
hostname=$(hostname)

echo "Cluster state changes detected" | mail -r "$hostname@" -s "Cluster Monitor" -a "$monitorfile" mquerc...@gmail.com


Thanks.


On 06/01/2015 01:21, Andrew Beekhof wrote:

On 6 Jan 2015, at 3:37 am, Marco Querci  wrote:

Hi All.
Any news for my problem?

Maybe post your /home/administrator/clustermonitor_notification.sh script?


Many thanks.


On 19/12/2014 12:13, Marco Querci wrote:

Many thanks for your reply.
Here is my configuration:

[CIB XML stripped in this archive copy]

On 19/12/2014 10:02, Florian Crouzat wrote:

On 18/12/2014 16:21, Marco Querci wrote:

Hi all,
I have a pacemaker + corosync cluster installed on CentOS 6.5.
I have a ClusterMon resource with the external_agent param set up.
Before the last pacemaker update, event notification to the external
script worked perfectly.
After the pacemaker update to version 1.1.11, the ClusterMon resource
continues to work but has stopped notifying the external agent.
I followed setup instructions found on the internet, but I can't figure out
why it doesn't work as expected.

Any help will be appreciated.
Many thanks.

Hello, please paste your full configuration here so we understand how
you use the ClusterMon stuff.
Remember that on RHEL 6.x, SNMP support is not built in; but that's probably
why you use an external_agent. I just need to make sure by reading your
configuration.

Cheers,
Florian Crouzat


Re: [Pacemaker] Clustermon issue

2015-01-08 Thread Marco Querci

Sorry, that was my mistake.
My CentOS is 6.6:

[root@langate1 ~]# cat /etc/redhat-release
CentOS release 6.6 (Final)

fully upgraded.



On 08/01/2015 23:00, Andrew Beekhof wrote:

On 8 Jan 2015, at 10:00 pm, Marco Querci  wrote:

Thanks for your suggestion.
But my version is:

[root@langate1 ~]# pacemakerd --version
Pacemaker 1.1.11

Why were you saying my version is 1.1.12-rc3?

Because that's the tag that '97629de' in the following corresponds to:

   


I installed pacemaker on CentOS 6.5 from the repository. Why aren't those
repositories updated with this patch?

Because CentOS blindly rebuilds from RHEL and no RHEL customers complained 
about it (which means package maintainers aren't allowed to fix it until 6.6).


Many Thanks.


On 08/01/2015 03:39, Andrew Beekhof wrote:

On 8 Jan 2015, at 1:31 pm, Andrew Beekhof  wrote:

And there is no indication this is being called?

Doh. I know this one... you're actually using 1.1.12-rc3.

You need this patch which landed after 1.1.12 shipped:
https://github.com/beekhof/pacemaker/commit/3df6aff


On 7 Jan 2015, at 6:21 pm, Marco Querci  wrote:

#!/bin/bash

monitorfile=/tmp/clustermonitor.html
hostname=$(hostname)

echo "Cluster state changes detected" | mail -r "$hostname@" -s "Cluster Monitor" -a "$monitorfile" mquerc...@gmail.com


Thanks.


On 06/01/2015 01:21, Andrew Beekhof wrote:

On 6 Jan 2015, at 3:37 am, Marco Querci  wrote:

Hi All.
Any news for my problem?

Maybe post your /home/administrator/clustermonitor_notification.sh script?


Many thanks.


On 19/12/2014 12:13, Marco Querci wrote:

Many thanks for your reply.
Here is my configuration:

[CIB XML stripped in this archive copy]

On 19/12/2014 10:02, Florian Crouzat wrote:

On 18/12/2014 16:21, Marco Querci wrote:

Hi all,
I have a pacemaker + corosync cluster installed on CentOS 6.5.
I have a ClusterMon resource with the external_agent param set up.
Before the last pacemaker update, event notification to the external
script worked perfectly.
After the pacemaker update to version 1.1.11, the ClusterMon resource
continues to work but has stopped notifying the external agent.
I followed setup instructions found on the internet, but I can't figure out
why it doesn't work as expected.

Any help will be appreciated.
Many thanks.

Hello, please paste your full configuration here so we understand how
you use the ClusterMon stuff.
Remember that on RHEL 6.x, SNMP support is not built in; but that's probably
why you use an external_agent. I just need to make sure by reading your
configuration.

Cheers,
Florian Crouzat


Re: [Pacemaker] Clustermon issue

2015-01-08 Thread Marco Querci

Thanks for your suggestion.
But my version is:

[root@langate1 ~]# pacemakerd --version
Pacemaker 1.1.11

Why were you saying my version is 1.1.12-rc3?
I installed pacemaker on CentOS 6.5 from the repository. Why aren't those
repositories updated with this patch?


Many Thanks.


On 08/01/2015 03:39, Andrew Beekhof wrote:

On 8 Jan 2015, at 1:31 pm, Andrew Beekhof  wrote:

And there is no indication this is being called?

Doh. I know this one... you're actually using 1.1.12-rc3.

You need this patch which landed after 1.1.12 shipped:
https://github.com/beekhof/pacemaker/commit/3df6aff


On 7 Jan 2015, at 6:21 pm, Marco Querci  wrote:

#!/bin/bash

monitorfile=/tmp/clustermonitor.html
hostname=$(hostname)

echo "Cluster state changes detected" | mail -r "$hostname@" -s "Cluster Monitor" -a "$monitorfile" mquerc...@gmail.com


Thanks.


On 06/01/2015 01:21, Andrew Beekhof wrote:

On 6 Jan 2015, at 3:37 am, Marco Querci  wrote:

Hi All.
Any news for my problem?

Maybe post your /home/administrator/clustermonitor_notification.sh script?


Many thanks.


On 19/12/2014 12:13, Marco Querci wrote:

Many thanks for your reply.
Here is my configuration:

[CIB XML stripped in this archive copy]

On 19/12/2014 10:02, Florian Crouzat wrote:

On 18/12/2014 16:21, Marco Querci wrote:

Hi all,
I have a pacemaker + corosync cluster installed on CentOS 6.5.
I have a ClusterMon resource with the external_agent param set up.
Before the last pacemaker update, event notification to the external
script worked perfectly.
After the pacemaker update to version 1.1.11, the ClusterMon resource
continues to work but has stopped notifying the external agent.
I followed setup instructions found on the internet, but I can't figure out
why it doesn't work as expected.

Any help will be appreciated.
Many thanks.

Hello, please paste your full configuration here so we understand how
you use the ClusterMon stuff.
Remember that on RHEL 6.x, SNMP support is not built in; but that's probably
why you use an external_agent. I just need to make sure by reading your
configuration.

Cheers,
Florian Crouzat


Re: [Pacemaker] Clustermon issue

2015-01-06 Thread Marco Querci

#!/bin/bash

monitorfile=/tmp/clustermonitor.html
hostname=$(hostname)

echo "Cluster state changes detected" | mail -r "$hostname@" -s "Cluster Monitor" -a "$monitorfile" mquerc...@gmail.com



Thanks.


On 06/01/2015 01:21, Andrew Beekhof wrote:

On 6 Jan 2015, at 3:37 am, Marco Querci  wrote:

Hi All.
Any news for my problem?

Maybe post your /home/administrator/clustermonitor_notification.sh script?


Many thanks.


On 19/12/2014 12:13, Marco Querci wrote:

Many thanks for your reply.
Here is my configuration:

[CIB XML stripped in this archive copy]

On 19/12/2014 10:02, Florian Crouzat wrote:

On 18/12/2014 16:21, Marco Querci wrote:

Hi all,
I have a pacemaker + corosync cluster installed on CentOS 6.5.
I have a ClusterMon resource with the external_agent param set up.
Before the last pacemaker update, event notification to the external
script worked perfectly.
After the pacemaker update to version 1.1.11, the ClusterMon resource
continues to work but has stopped notifying the external agent.
I followed setup instructions found on the internet, but I can't figure out
why it doesn't work as expected.

Any help will be appreciated.
Many thanks.

Hello, please paste your full configuration here so we understand how
you use the ClusterMon stuff.
Remember that on RHEL 6.x, SNMP support is not built in; but that's probably
why you use an external_agent. I just need to make sure by reading your
configuration.

Cheers,
Florian Crouzat



Re: [Pacemaker] Clustermon issue

2015-01-05 Thread Marco Querci

Hi All.
Any news for my problem?

Many thanks.


On 19/12/2014 12:13, Marco Querci wrote:

Many thanks for your reply.
Here is my configuration:

validate-with="pacemaker-1.2" cib-last-written="Thu Dec 18 20:04:43 
2014" update-origin="langate1" update-client="crmd" 
crm_feature_set="3.0.9" have-quorum="1" dc-uuid="langate1">

  

  
name="dc-version" value="1.1.11-97629de"/>
name="cluster-infrastructure" value="classic openais (with plugin)"/>
name="expected-quorum-votes" value="2"/>
name="stonith-enabled" value="false"/>
name="no-quorum-policy" value="ignore"/>
name="last-lrm-refresh" value="1418929320"/>

  


  

  
  

  


  
type="IPaddr2">

  
name="ip" value="192.168.0.254"/>
id="ClusterIP_int-instance_attributes-cidr_netmask" 
name="cidr_netmask" value="32"/>
name="nic" value="eth3"/>

  
  
name="monitor"/>

  


  
  
name="monitor"/>

  
  

  
  

  
  
name="monitor"/>

  


  
  
provider="heartbeat" type="IPaddr2">

  
name="ip" value="10.10.10.2"/>
id="ClusterIP_ext1-instance_attributes-cidr_netmask" 
name="cidr_netmask" value="32"/>
name="nic" value="eth0"/>

  
  
interval="60s" name="monitor"/>

  

provider="heartbeat" type="IPaddr2">

  
name="ip" value="172.16.0.2"/>
id="ClusterIP_ext2-instance_attributes-cidr_netmask" 
name="cidr_netmask" value="32"/>
name="nic" value="eth1"/>

  
  
interval="60s" name="monitor"/>

  


  
  
name="monitor"/>

  
  

  
  

  
  
name="monitor"/>

  
  


  
  
provider="pacemaker" type="ClusterMon">

  
id="ClusterMonitor-instance_attributes-extra_options" 
name="extra_options" value="-E 
/home/administrator/clustermonitor_notification.sh -e 
"/>

  
  
name="start" timeout="20"/>
name="stop" timeout="20"/>
name="monitor" timeout="20"/>

  


  



  
name="resource-stickiness" value="100"/>

  

  
  
crmd="online" crm-debug-origin="do_update_resource" join="member" 
expected="member">

  

  value="0"/>
  name="probe_complete" value="true"/>


  
  

  
operation_key="WanFailover_start_0" operation="start" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="17:67:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:0;17:67:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="67" rc-code="0" op-status="0" interval="0" 
last-run="1418929483" last-rc-change="1418929483" exec-time="10" 
queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8" 
on_node="langate1"/>
operation_key="WanFailover_monitor_6" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="18:67:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:0;18:67:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="68" rc-code="0" op-status="0" interval="6" 
last-rc-change="1418929483" exec-time="15" queue-time="0" 
op-digest="4811cef7f7f94e3a35a70be7916cb2fd" on_node="langate1"/>

  
  
operation_key="Shorewall_start_0" oper

Re: [Pacemaker] Clustermon issue

2014-12-19 Thread Marco Querci
itor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="15:65:7:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:7;15:65:7:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="18" rc-code="7" op-status="0" interval="0" 
last-run="1418929392" last-rc-change="1418929392" exec-time="452" 
queue-time="0" op-digest="3a2172b3600a74a02c56030c73d7efd6" 
on_node="langate2"/>

  
  provider="heartbeat">
operation_key="ClusterIP_int_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="12:65:7:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:7;12:65:7:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="5" rc-code="7" op-status="0" interval="0" last-run="1418929392" 
last-rc-change="1418929392" exec-time="502" queue-time="0" 
op-digest="2e0d4879baaebfc3a7092f3adfeadb9e" on_node="langate2"/>

  
  provider="heartbeat">
operation_key="ClusterIP_ext2_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="16:65:7:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:7;16:65:7:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="22" rc-code="7" op-status="0" interval="0" 
last-run="1418929392" last-rc-change="1418929392" exec-time="431" 
queue-time="0" op-digest="cc4af9155b9449867acd30be849b0d3f" 
on_node="langate2"/>

  
  
operation_key="Shorewall_start_0" operation="start" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="28:65:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:0;28:65:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="37" rc-code="0" op-status="0" interval="0" 
last-run="1418929393" last-rc-change="1418929393" exec-time="6584" 
queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8" 
on_node="langate2"/>
operation_key="Shorewall_monitor_6" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="23:66:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:0;23:66:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="41" rc-code="0" op-status="0" interval="6" 
last-rc-change="1418929399" exec-time="99" queue-time="1" 
op-digest="4811cef7f7f94e3a35a70be7916cb2fd" on_node="langate2"/>

  
  
operation_key="Fail2ban_monitor_0" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="17:65:7:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:7;17:65:7:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="26" rc-code="7" op-status="0" interval="0" 
last-run="1418929392" last-rc-change="1418929392" exec-time="73" 
queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8" 
on_node="langate2"/>

  
  
operation_key="Postfix_start_0" operation="start" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="46:65:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:0;46:65:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="38" rc-code="0" op-status="0" interval="0" 
last-run="1418929393" last-rc-change="1418929393" exec-time="4401" 
queue-time="0" op-digest="f2317cad3d54cec5d7d7aa7d0bf35cf8" 
on_node="langate2"/>
operation_key="Postfix_monitor_6" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="42:66:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:0;42:66:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="42" rc-code="0" op-status="0" interval="6" 
last-rc-change="1418929399" exec-time="54" queue-time="0" 
op-digest="4811cef7f7f94e3a35a70be7916cb2fd" on_node="langate2"/>

  
  class="ocf" provider="pacemaker">
operation_key="ClusterMonitor_start_0" operation="start" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="54:65:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:0;54:65:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="39" rc-code="0" op-status="0" interval="0" 
last-run="1418929393" last-rc-change="1418929393" exec-time="163" 
queue-time="0" op-digest="bea5e7b384fbbbc979747b3584d1c025" 
on_node="langate2"/>
operation_key="ClusterMonitor_monitor_1" operation="monitor" 
crm-debug-origin="do_update_resource" crm_feature_set="3.0.9" 
transition-key="55:65:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
transition-magic="0:0;55:65:0:2883fef9-479a-473e-a889-5ecd6d367ed0" 
call-id="40" rc-code="0" op-status="0" interval="1" 
last-rc-change="1418929393" exec-time="62" queue-time="0" 
op-digest="31503c31050d63046026e4abfd181f64" on_node="langate2"/>

  

  
  

  
  name="probe_complete" value="true"/>


  

  




On 19/12/2014 10:02, Florian Crouzat wrote:

On 18/12/2014 16:21, Marco Querci wrote:

Hi all,
I have a pacemaker + corosync cluster installed on CentOS 6.5.
I have a ClusterMon resource with the external_agent param set up.
Before the last pacemaker update, event notification to the external
script worked perfectly.
After the pacemaker update to version 1.1.11, the ClusterMon resource
continues to work but has stopped notifying the external agent.
I followed setup instructions found on the internet, but I can't figure out
why it doesn't work as expected.

Any help will be appreciated.
Many thanks.

Hello, please paste your full configuration here so we understand how
you use the ClusterMon stuff.
Remember that on RHEL 6.x, SNMP support is not built in; but that's probably
why you use an external_agent. I just need to make sure by reading your
configuration.


Cheers,
Florian Crouzat



[Pacemaker] Clustermon issue

2014-12-18 Thread Marco Querci
Hi all,
I have a pacemaker + corosync cluster installed on CentOS 6.5.
I have a ClusterMon resource with the external_agent param set up.
Before the last pacemaker update, event notification to the external script
worked perfectly.
After the pacemaker update to version 1.1.11, the ClusterMon resource continues
to work but has stopped notifying the external agent.
I followed setup instructions found on the internet, but I can't figure out why
it doesn't work as expected.

Any help will be appreciated.
Many thanks.


[Pacemaker] clustermon external_agent issue

2014-11-24 Thread Marco Querci
Hi all,
I have a pacemaker + corosync cluster installed on CentOS 6.5.
I have a ClusterMon resource with the external_agent param set up.
Before the last pacemaker update, event notification to the external script
worked perfectly.
After the pacemaker update to version 1.1.11, the ClusterMon resource continues
to work but has stopped notifying the external agent.
I followed setup instructions found on the internet, but I can't figure out why
it doesn't work as expected.

Any help will be appreciated.
Many thanks.


[Pacemaker] corosync.conf tuning for Vm

2014-08-04 Thread marco
Hi all,
in an HA hypervisor environment (corosync/pacemaker) with virtual machines,
is it OK to set the token to 1 minute?

My needs are:

- I don't want a temporary overload on a hypervisor to break corosync
  communication or trigger a token loss.

- Is it OK to set the token so high (1 minute), or are there problems
  I don't know about?
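
For reference, corosync.conf takes the token timeout in milliseconds, so 1 minute corresponds to 60000. A minimal sketch that prints such a stanza follows; the totem_stanza helper is a hypothetical name, and the printed fragment is not a complete corosync.conf.

```shell
#!/bin/bash
# Sketch: print a totem stanza for a token timeout given in seconds.
# Assumption: the token value in corosync.conf is in milliseconds.

totem_stanza() {
    local seconds=$1
    printf 'totem {\n    version: 2\n    token: %d\n}\n' "$((seconds * 1000))"
}

totem_stanza 60
```

The trade-off to keep in mind, as far as I understand it, is that a longer token also delays detection of a node that has really failed, so recovery of its VMs would start correspondingly later.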

thanks





Re: [Pacemaker] cib: ERROR: send_ais_message: Not connected to AIS

2014-04-14 Thread Marco Felettigh
On Mon, 14 Apr 2014 14:40:43 +1000
Andrew Beekhof  wrote:

> 
> On 11 Apr 2014, at 10:54 pm, Marco Felettigh  wrote:
> 
> > On Fri, 11 Apr 2014 17:17:57 +1000
> > Andrew Beekhof  wrote:
> > 
> >> 
> >> On 8 Apr 2014, at 8:37 pm, ma...@nucleus.it wrote:
> >> 
> >>> On Tue, 8 Apr 2014 10:49:16 +1000
> >>> Andrew Beekhof  wrote:
> >>> 
> >>>> 
> >>>> On 7 Apr 2014, at 8:46 pm, ma...@nucleus.it wrote:
> >>>> 
> >>>>> Hi,
> >>>>> in a production environment with 2 nodes ( nodeA , nodeB ) we
> >>>>> had a hardware failure, so we restarted nodeB.
> >>>>> After the restarted nodeB came up we restarted corosync/pacemaker
> >>>>> on it, but for 2 days now the corosync/pacemaker stack has been
> >>>>> looping.
> >>>>> 
> >>>>> crm_mon NodeA:
> >>>>> 
> >>>>> Stack: openais
> >>>>> Current DC: nodeA - partition with quorum
> >>>>> Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
> >>>>> 2 Nodes configured, 2 expected votes
> >>>>> 17 Resources configured.
> >>>>> 
> >>>>> 
> >>>>> Online: [ nodeA ]
> >>>>> OFFLINE: [ nodeB ]
> >>>>> 
> >>>>> 
> >>>>> crm_mon NodeB:
> >>>>> 
> >>>>> Stack: openais
> >>>>> Current DC: NONE
> >>>>> 2 Nodes configured, 2 expected votes
> >>>>> 17 Resources configured.
> >>>>> 
> >>>>> 
> >>>>> OFFLINE: [ nodeA nodeB ]
> >>>>> 
> >>>>> This loop on nodeB reports:
> >>>>> crmd: [7149]: debug: do_election_count_vote: Election 3 (owner:
> >>>>> nodeA) lost: vote from nodeA (Age)
> >>>>> 
> >>>>> Investigating further, I found this message on nodeA:
> >>>>> cib: [28755]: ERROR: send_ais_message: Not connected to AIS
> >>>>> 
> >>>>> This message is now repeated for every operation.
> >>>>> Is it a corosync problem or a cib/pacemaker one?
> >>>>> Any suggestions on what happened?
> >>>> 
> >>>> For some reason the cib can't connect to corosync anymore.
> >>>> No software got upgraded recently?
> >>>> 
> >>>> Are there any logs from corosync?
> >>>> Which distro is this?
> >>>> 
> >>>>> And why did starting a cluster node crash the DC? :(
> >>>>> 
> >>>>> 
> >>>>> Bye Marco
> >>>>> 
> >>>> 
> >>> 
> >>> Hi,
> >>> the distro is openSUSE 11.1 and there are no updates, also
> >>> because the distro is out of maintenance.
> >> 
> >> A good reason to be using SLES (or RHEL/CentOS).
> > 
> > Better Gentoo ;)
> > 
> >> 
> >>> We are planning an upgrade, but the interesting thing is to figure
> >>> out the reason for the problem.
> >>> The log is attached; thanks for the support.
> >> 
> >> There's nothing obvious in the logs.  Just that as far as pacemaker
> >> could tell, corosync suddenly went away. Was the corosync process
> >> still running?
> >> 
> > 
> > Yes, corosync was still running.
> 
> Stopping pacemaker and restarting it didn't help?
> 

In the end we restarted the two servers and then started
corosync/pacemaker.


Thanks for the support
Marco



Re: [Pacemaker] cib: ERROR: send_ais_message: Not connected to AIS

2014-04-11 Thread Marco Felettigh
On Fri, 11 Apr 2014 17:17:57 +1000
Andrew Beekhof  wrote:

> 
> On 8 Apr 2014, at 8:37 pm, ma...@nucleus.it wrote:
> 
> > On Tue, 8 Apr 2014 10:49:16 +1000
> > Andrew Beekhof  wrote:
> > 
> >> 
> >> On 7 Apr 2014, at 8:46 pm, ma...@nucleus.it wrote:
> >> 
> >>> Hi,
> >>> in a production environment with 2 nodes ( nodeA , nodeB ) we had
> >>> a hardware failure, so we restarted nodeB.
> >>> After the restarted nodeB came up we restarted corosync/pacemaker on
> >>> it, but for 2 days now the corosync/pacemaker stack has been
> >>> looping.
> >>> 
> >>> crm_mon NodeA:
> >>> 
> >>> Stack: openais
> >>> Current DC: nodeA - partition with quorum
> >>> Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
> >>> 2 Nodes configured, 2 expected votes
> >>> 17 Resources configured.
> >>> 
> >>> 
> >>> Online: [ nodeA ]
> >>> OFFLINE: [ nodeB ]
> >>> 
> >>> 
> >>> crm_mon NodeB:
> >>> 
> >>> Stack: openais
> >>> Current DC: NONE
> >>> 2 Nodes configured, 2 expected votes
> >>> 17 Resources configured.
> >>> 
> >>> 
> >>> OFFLINE: [ nodeA nodeB ]
> >>> 
> >>> This loop on nodeB reports:
> >>> crmd: [7149]: debug: do_election_count_vote: Election 3 (owner:
> >>> nodeA) lost: vote from nodeA (Age)
> >>> 
> >>> Investigating further, I found this message on nodeA:
> >>> cib: [28755]: ERROR: send_ais_message: Not connected to AIS
> >>> 
> >>> This message is now repeated for every operation.
> >>> Is it a corosync problem or a cib/pacemaker one?
> >>> Any suggestions on what happened?
> >> 
> >> For some reason the cib can't connect to corosync anymore.
> >> No software got upgraded recently?
> >> 
> >> Are there any logs from corosync?
> >> Which distro is this?
> >> 
> >>> And why did starting a cluster node crash the DC? :(
> >>> 
> >>> 
> >>> Bye Marco
> >>> 
> >> 
> > 
> > Hi,
> > the distro is openSUSE 11.1 and there are no updates, also because
> > the distro is out of maintenance.
> 
> A good reason to be using SLES (or RHEL/CentOS).

Better Gentoo ;)

> 
> > We are planning an upgrade, but the interesting thing is to figure
> > out the reason for the problem.
> > The log is attached; thanks for the support.
> 
> There's nothing obvious in the logs.  Just that as far as pacemaker
> could tell, corosync suddenly went away. Was the corosync process
> still running?
> 

Yes, corosync was still running.




[Pacemaker] cib: ERROR: send_ais_message: Not connected to AIS

2014-04-07 Thread marco
Hi,
in a production environment with 2 nodes (nodeA, nodeB) we had a
hardware failure, so we restarted nodeB.
After nodeB came back up we restarted corosync/pacemaker on it,
but for 2 days now the corosync/pacemaker stuff has been looping.

crm_mon NodeA:

Stack: openais
Current DC: nodeA - partition with quorum
Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
17 Resources configured.


Online: [ nodeA ]
OFFLINE: [ nodeB ]


crm_mon NodeB:

Stack: openais
Current DC: NONE
2 Nodes configured, 2 expected votes
17 Resources configured.


OFFLINE: [ nodeA nodeB ]

During this loop, nodeB reports:
crmd: [7149]: debug: do_election_count_vote: Election 3 (owner: nodeA)
lost: vote from nodeA (Age)

Investigating further, I found this message on nodeA:
cib: [28755]: ERROR: send_ais_message: Not connected to AIS

This message is now repeated for every operation.
Is it a corosync problem or a cib/pacemaker one?
Any suggestion on what happened?
And why did starting a cluster node crash the DC stuff? :(


Bye Marco
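
To compare what each node believes, one could pull the DC name out of the one-shot crm_mon output on both nodes. The sketch below is only an illustration: the helper name current_dc is made up here, and it assumes the "Current DC:" line looks like the outputs quoted above.

```shell
# current_dc: hypothetical helper. Reads crm_mon output on stdin and
# prints the node named on the "Current DC:" line ("NONE" if no DC).
current_dc() {
    sed -n 's/^Current DC: *\([^ ]*\).*$/\1/p'
}

# Intended use on each node (crm_mon -1 prints the status once and exits):
#   crm_mon -1 | current_dc
```

If one node prints a node name and the other prints NONE, the two nodes disagree about membership, which is the situation described in this post.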





Re: [Pacemaker] token lost - need clarification

2013-12-18 Thread Marco Felettigh
On Tue, 17 Dec 2013 09:28:51 +0100
Michael Schwartzkopff  wrote:

> On Tuesday, 17 December 2013, 09:17:31, ma...@nucleus.it wrote:
> > Hi to all,
> > I set up a 2-node cluster with a crossover cable between the two nodes,
> > without stonith; I know this is not the best way, but this is the
> > scenario I needed at that time.
> > 
> > I know the releases are old:
> > corosync-1.2.7-1.2
> > libcorosync-1.2.7-1.2
> > pacemaker-1.0.10-1.4
> > libpacemaker3-1.0.10-1.4
> > 
> > Everything was OK for some days/months, but a few days ago, without any
> > network interruption (no messages related to the Ethernet modules, no
> > errors in the network statistics, no notifications from the Nagios ping
> > checks) between the two nodes, something went wrong.
> > 
> > From what I understand from the attached logs:
> > Token Timeout (10000 ms) retransmit timeout (980 ms)
> > token hold (774 ms) retransmits before loss (10 retrans)
> > 
> > 
> > the 2 nodes lost a token and tried to resolve the situation, but
> > node1 thinks node2 is up:
> > 
> > Dec  7 05:01:41 node1 pengine: [1138]: info:
> > determine_online_status: Node node2 is online
> > Dec  7 05:01:41 node1 pengine: [1138]: info:
> > determine_online_status: Node node1 is online
> > 
> > and then lost
> > 
> > Dec  7 05:01:54 node1 corosync[1128]:   [pcmk  ] info:
> > ais_mark_unseen_peer_dead: Node node2 was not seen in the previous
> > transition
> > Dec  7 05:01:54 node1 corosync[1128]:   [pcmk  ] info:
> > update_member: Node 33559980/node2 is now: lost
> > 
> > while node2 thinks node1 is gone:
> > 
> > Dec  7 05:01:34 node2 corosync[6356]:   [pcmk  ] info:
> > ais_mark_unseen_peer_dead: Node node1 was not seen in the previous
> > transition
> > Dec  7 05:01:34 node2 corosync[6356]:   [pcmk  ] info:
> > update_member: Node 16782764/node1 is now: lost
> > 
> > then they go into split brain.
> > Any suggestion as to why node1 still saw node2 at first, while
> > node2 immediately declared node1 lost?
> 
> This depends on who initiates the round. Both nodes recognized the
> failure within 20 seconds. This is OK, especially if you allow 10
> seconds for a token timeout.
> 
> Kind regards,
> 
> Michael Schwartzkopff
> 

OK, that is fine, but it is very strange that, without any network loss
between the nodes, they cannot resend the token and later re-establish
quorum :( .

Marco
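
The timeout figures quoted from the log can be reproduced with a little arithmetic. This is a sketch under assumptions, not a statement of how corosync is implemented: it assumes a token timeout of 10000 ms (the 10 seconds Michael mentions), 10 retransmits before loss, and a derivation of the form retransmit = token / (retransmits + 0.2) and hold = retransmit * 0.8 - 1000/HZ with HZ = 100. The function name is made up for illustration.

```shell
# derive_totem_timeouts TOKEN_MS RETRANSMITS
# Assumed derivation (see the note above, HZ taken as 100):
#   retransmit = token / (retransmits + 0.2)
#   hold       = retransmit * 0.8 - 1000/HZ
derive_totem_timeouts() {
    token=$1
    retrans=$2
    # integer arithmetic: multiply both sides by 10 to express the "+ 0.2"
    retransmit=$(( token * 10 / (retrans * 10 + 2) ))
    hold=$(( retransmit * 8 / 10 - 1000 / 100 ))
    echo "retransmit=${retransmit}ms hold=${hold}ms"
}

derive_totem_timeouts 10000 10
```

With these inputs the result is 980 ms and 774 ms, the same values as the retransmit timeout and token hold in the log above, which suggests the configured token timeout really was 10 seconds.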



Re: [Pacemaker] Configuring Clusterstack on Scientific Linux 6

2011-09-08 Thread Marco van Putten

On 09/08/2011 11:08 AM, Vadim Bulst wrote:

Hi all,

I'd like to build a 10-node cluster based on Scientific Linux 6 with
Corosync and Pacemaker. On my test installation I didn't find any
stonith components in the cluster-glue packages. Do I have to install any
more packages? In an openSUSE installation there is a directory called
/usr/lib/stonith/; I didn't find anything similar in an SL environment.
My package list:

cluster-glue.i686 1.0.5-2.el6 @sl
cluster-glue-libs.i686 1.0.5-2.el6 @sl
cluster-glue-libs-devel.i686 1.0.5-2.el6 @sl
clusterlib.i686 3.0.12-41.el6 @sl
corosync.i686 1.2.3-36.el6 @sl
corosynclib.i686 1.2.3-36.el6 @sl
pacemaker.i686 1.1.5-5.el6 @sl
pacemaker-libs.i686 1.1.5-5.el6 @sl

Cheers,

Vadim




For this you'll need the "fence-agents" package.
The fencing binaries, however, are not in /usr/lib/stonith but in
/usr/sbin/fence_*


You may also need the resource-agents package.


Bye,
Marco.
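
After installing the packages (for example with "yum install fence-agents resource-agents"), a quick way to see which agents are present is to list the fence_* executables. A minimal sketch; the helper name is made up for illustration and the default directory is the /usr/sbin location mentioned above.

```shell
# list_fence_agents [DIR]: hypothetical helper that prints the names of
# executable fence_* files in DIR (default: /usr/sbin).
list_fence_agents() {
    dir=${1:-/usr/sbin}
    for f in "$dir"/fence_*; do
        # when nothing matches, the pattern stays literal and -x fails
        [ -x "$f" ] && basename "$f"
    done
    return 0
}
```

Calling list_fence_agents without arguments checks /usr/sbin; an empty result would suggest the fence-agents package is not installed.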



Re: [Pacemaker] Pacemaker in RHEL6.

2011-08-12 Thread Marco van Putten

On 08/12/2011 06:05 AM, Larry Brigman wrote:

On Thu, Aug 11, 2011 at 8:51 PM, Larry Brigman <larry.brig...@gmail.com> wrote:

On Thu, Aug 11, 2011 at 5:37 PM, Andrew Beekhof <and...@beekhof.net> wrote:

On Fri, Aug 12, 2011 at 1:13 AM, Larry Brigman <larry.brig...@gmail.com> wrote:
> On Wed, Aug 10, 2011 at 10:50 PM, Marco van Putten <marco.vanput...@tudelft.nl> wrote:
>>
>> On 08/10/2011 06:23 PM, David Coulson wrote:
>>>
>>> On 8/10/11 11:43 AM, Marco van Putten wrote:
>>>>
>>>> Thanks Andreas. But our managers insist on using Red Hat.
>>>
>>> I think the idea would be to take the HA packages distributed with
>>> Scientific Linux 6.x and run them on RHEL.
>>
>>
>> OK, thanks for the heads up. I will give it a try with the Scientific Linux
>> packages on RHEL.
>>
>>
>>>
>>> Note that even when you do subscribe to the HA add-on in RHEL6,
>>> pacemaker is not supported by Red Hat. Are you sure you can't buy the HA
>>> add-on to go with your base entitlement for RHEL?
>>
>>
>> No, unfortunately Red Hat's license model doesn't work that way. Instead of
>> the $150 academic license you have to buy the full licensed version and then
>> some extra for the add-on.
>>
> If you have the install DVD then the packages are there, just in a different
> repo on the disk.
> The directory is HighAvailability.
>  ls pacemaker-*
> pacemaker-1.1.2-7.el6.x86_64.rpm
> pacemaker-libs-1.1.2-7.el6.i686.rpm
> pacemaker-libs-1.1.2-7.el6.x86_64.rpm

Are corosync and cluster-glue in there too?

Yes.
Packages]$ ls coro*
corosync-1.2.3-21.el6.x86_64.rpm
corosynclib-1.2.3-21.el6.i686.rpm
corosynclib-1.2.3-21.el6.x86_64.rpm
Packages]$ ls cluster*
cluster-cim-0.16.2-10.el6.x86_64.rpm
cluster-glue-1.0.5-2.el6.x86_64.rpm
cluster-glue-libs-1.0.5-2.el6.i686.rpm
cluster-glue-libs-1.0.5-2.el6.x86_64.rpm
clusterlib-3.0.12-23.el6.i686.rpm
clusterlib-3.0.12-23.el6.x86_64.rpm
cluster-snmp-0.16.2-10.el6.x86_64.rpm


The source packages are also available.
ftp://ftp.redhat.com/pub/redhat/linux/enterprise/6Server/en/os/SRPMS/





I also found the RPMs on our Red Hat Satellite server. But this doesn't
make it much easier if you want to upgrade to a newer version.


I've tried the Scientific Linux way by adding it as a disabled repository

and then installing pacemaker with:
# yum install --enablerepo=scientificlinux pacemaker

Yum then takes care of all the dependencies and (somehow) only uses the
pacemaker/corosync/etc. packages from Scientific Linux while the rest comes
from Red Hat. You still need the EPEL repository as well, btw.


So the Scientific Linux option works best for our situation, I think.

Thanks everyone for all the replies,
Marco.
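
For readers who want to try the disabled-repository approach, here is a rough sketch of what the repo definition could look like. The helper name, the repo name text and the mirror URL are all placeholders invented for this example; only the repo id "scientificlinux" and the --enablerepo command come from the message above.

```shell
# write_sl_repo FILE BASEURL: hypothetical helper that writes a yum repo
# definition which stays disabled unless --enablerepo is given.
write_sl_repo() {
    cat > "$1" <<EOF
[scientificlinux]
name=Scientific Linux 6 (disabled by default)
baseurl=$2
enabled=0
EOF
}

# Example (run as root; the mirror URL is a placeholder, not a real one):
#   write_sl_repo /etc/yum.repos.d/scientificlinux.repo http://mirror.example.org/sl/6/x86_64/os/
#   yum install --enablerepo=scientificlinux pacemaker
```

Because of enabled=0, normal yum updates ignore this repository and it is only consulted when it is enabled explicitly on the command line.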







Re: [Pacemaker] Pacemaker in RHEL6.

2011-08-10 Thread Marco van Putten

On 08/10/2011 06:23 PM, David Coulson wrote:

On 8/10/11 11:43 AM, Marco van Putten wrote:


Thanks Andreas. But our managers insist on using Red Hat.


I think the idea would be to take the HA packages distributed with
Scientific Linux 6.x and run them on RHEL.



OK Thanks for the heads up. I will give it a try with the Scientific 
Linux packages on RHEL.





Note that even when you do subscribe to the HA add-on in RHEL6,
pacemaker is not supported by RedHat. Are you sure you can't buy the HA
add-on to go with your base entitlement for RHEL?



No, unfortunately Red Hat's license model doesn't work that way. Instead
of the $150 academic license you have to buy the full licensed version
and then some extra for the add-on.





David



Bye,
Marco.



Re: [Pacemaker] Pacemaker in RHEL6.

2011-08-10 Thread Marco van Putten

On 08/10/2011 04:31 PM, Andreas Kurz wrote:

On 2011-08-10 14:13, Marco van Putten wrote:

Hi,

Is it possible to get the pacemaker RPMs for RHEL6 in the
Clusterlabs repository (as for RHEL5)?

I know they are available through Redhat's "High Availability" channel.
But since we have academic licences we don't have this channel available.

scientific linux 6.1 should provide all packages

Regards,
Andreas




Thanks Andreas. But our managers insist on using Red Hat.


Bye,
Marco.





[Pacemaker] Pacemaker in RHEL6.

2011-08-10 Thread Marco van Putten

Hi,

Is it possible to get the pacemaker RPMs for RHEL6 in the
Clusterlabs repository (as for RHEL5)?


I know they are available through Redhat's "High Availability" channel.
But since we have academic licences we don't have this channel available.


Bye,
Marco.



Re: [Pacemaker] [RPMs] clusterlabs.org and epel-6 ?

2011-07-12 Thread Marco van Putten




I don't believe there is any need for additional RPMs.  Pacemaker
should already be in CentOS6



For example, we're running Red Hat on an academic licence and don't have a
subscription for the HA channel. It would be very welcome to have RPMs
from Clusterlabs.



Bye,
Marco.



Re: [Pacemaker] Pacemaker and Apache resource configuration problem

2010-06-11 Thread Marco van Putten

On 06/11/2010 01:09 PM, Julio Gómez Belmonte wrote:

Hello everyone,

I'm configuring Pacemaker as an active/passive cluster between two nodes
that need to run tomcat and mysql alternately. When I try to configure
Apache I get the following error in the status output:

Apache_start_0 (node=SSCC-01, call=39, rc=-2, status=Timed Out):
unknown exec error

The command that I used to configure the Apache resource is:

configure primitive Apache ocf:heartbeat:apache params
configfile="/etc/apache2/apache2.conf" port="443"

Does anyone have any idea why this may be happening?



Did you activate mod_status and uncomment/set "<Location /server-status> etc..." in your Apache config?


Bye,
Marco.
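
As far as I know, the ocf:heartbeat:apache agent decides whether Apache is up by fetching a status page, so a start can time out when no status handler is reachable. Below is a rough sketch for checking a config file. It assumes the block is written as <Location /server-status> and sits in the file you pass in (on some distributions it lives in an included file instead); the helper name is made up for illustration.

```shell
# has_server_status FILE: hypothetical check. Succeeds if FILE contains an
# uncommented "<Location /server-status>" line, fails otherwise.
has_server_status() {
    grep -Eq '^[[:space:]]*<Location[[:space:]]+/server-status>' "$1"
}

# Example:
#   has_server_status /etc/apache2/apache2.conf && echo "status handler found"
```

This only checks that the block is not commented out; whether the URL is actually reachable from localhost on the configured port still has to be tested separately.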




Thanks in advance and best regards,





Re: [Pacemaker] Unable to commit from crm.

2010-04-26 Thread Marco van Putten

Hi Dejan,

Dejan Muhamedagic wrote:

Hi,

On Mon, Apr 26, 2010 at 11:53:17AM +0200, Marco van Putten wrote:

Hi Dejan,

Thanks for your response.

Dejan Muhamedagic wrote:

Never saw this. The '--noprofile' is a bash option. Looks like
some strange interaction between python and bash. If you set the
"user" option in crm, sudo is used to run all external programs.
Perhaps that is the culprit.


I was running the crm command with another user name than root but
with userid 0 and groupid 0.


You mean as effective id 0 (as in su or sudo) or that you have
another user with the id 0?



I have another user with uid 0 in /etc/passwd.





But if I run crm as root it works well.

For me this works as an OK workaround but on 3 other clusters (all
with an older pacemaker version) I don't have this problem...


Must be some environment issue. You should check your .profile
.bashrc .bash_profile /etc/profile.d/* /etc/bash*, there seem to
be so many in use and I probably forgot a few.



I use the tcsh shell for this userid 0 user. When I switch to bash in 
/etc/passwd the problem disappears.

So it's probably something in my .tcshrc. I'll look into it.

Thank you very much for your help.


Bye,
Marco.





Thanks,

Dejan


Has this anything to do with the new version of cluster-glue a
couple of days ago...?

No, glue shouldn't have anything to do with this.

Thanks,

Dejan


Thanks,
Marco.
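
Since the usage line in the error looks like it comes from tcsh rather than bash, it may help to see which uid 0 accounts use which login shell. A small sketch with a made-up helper name, assuming the usual colon-separated passwd format:

```shell
# uid0_shells [PASSWD_FILE]: hypothetical helper that prints "name shell"
# for every account with uid 0 (default file: /etc/passwd).
uid0_shells() {
    awk -F: '$3 == 0 { print $1 " " $7 }' "${1:-/etc/passwd}"
}

# Example:
#   uid0_shells
```

Any account listed with a shell other than bash is a candidate for the '--noprofile' error described in this thread.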



Re: [Pacemaker] Unable to commit from crm.

2010-04-26 Thread Marco van Putten

Hi Dejan,

Thanks for your response.

Dejan Muhamedagic wrote:


Never saw this. The '--noprofile' is a bash option. Looks like
some strange interaction between python and bash. If you set the
"user" option in crm, sudo is used to run all external programs.
Perhaps that is the culprit.



I was running the crm command with another user name than root but with 
userid 0 and groupid 0. But if I run crm as root it works well.


For me this works as an OK workaround but on 3 other clusters (all with 
an older pacemaker version) I don't have this problem...






Has this anything to do with the new version of cluster-glue a
couple of days ago...?


No, glue shouldn't have anything to do with this.

Thanks,

Dejan



Thanks,
Marco.



[Pacemaker] Unable to commit from crm.

2010-04-25 Thread Marco van Putten

Hi,

I've just installed a fresh version of pacemaker/heartbeat/ldirectord on 
2 RHEL5 servers with the repositories from 
http://www.clusterlabs.org/rpm/epel-5.


When I startup heartbeat all is fine. No problems so far.
But after I do some editing with "crm > configure > edit" when I try to 
do a commit I get this error message:


crm(live)configure# commit
Unknown option: `--noprofile'
Usage: -norc [ -bcdefilmnqstvVxX ] [ argument ... ].
ERROR: creating tmp shadow __crmshell.2440 failed

When I edit the file  /var/lib/heartbeat/crm/shadow.__crmshell.2440 and 
do a "crm_shadow -C __crmshell.2440 --force" eventually my modifications 
are committed.


Has anyone else had this problem, and is there something I can do about it?
Does this have anything to do with the new version of cluster-glue released
a couple of days ago...?


These are the versions of heartbeat/pacemaker/ldirectord I'm running:
heartbeat-libs-3.0.3-1.el5
heartbeat-3.0.3-1.el5
pacemaker-libs-1.0.8-5.el5
pacemaker-1.0.8-5.el5
cluster-glue-libs-1.0.4-1.el5
cluster-glue-1.0.4-1.el5
ldirectord-1.0.3-1.el5

Thanks,
Marco.



Re: [Pacemaker] Resources don't start on second node after ping fails

2010-04-09 Thread Marco van Putten

Hi Benjamin,

Congratulations!
Do you mean not connected as in physically not connected?

I'm no expert on the matter but I just ran into the "number" problem a 
couple of weeks ago myself.

Maybe in a newer version this is no longer an issue...

Bye,
Marco.

benjamin.b...@t-systems.com wrote:

Hi everybody!

I fixed this 'problem'... 
My drbd-resource wasn't connected. m(

The configuration of the ping resource and location were correct. I implemented 
Marco's advice but I'm sure my solution would've also worked.
The failover works just fine right now.

Thanks for reading!
Benjamin Benz


-Original Message-
From: Benz, Benjamin
Sent: Thu 08.04.2010 14:46
To: pacemaker@oss.clusterlabs.org
Subject: [Pacemaker] Resources don't start on second node after ping fails
 
Hi there!


I've got a problem with the configuration.
I'm using Pacemaker 1.0.7 to move my database from node1 to node2. Everything 
works fine when I migrate the resources manually or pull out the power plug.
Since I want the database to be available in case of network problems I tried 
to integrate a ping resource as you can see below.
When I pull out the network cable the resources stop on node1 but don't start 
on node2.

crm_mon output:

Online: [ bb-node1 bb-node2 ]

 Master/Slave Set: ms_drbd_ora
 Slaves: [ bb-node2 ]
 Stopped: [ drbd_ora:1 ]
 Clone Set: connected
 Started: [ bb-node1 bb-node2 ]


I guess there's something wrong with my configuration of the location but I 
can't figure it out.
It would be great if someone could help me out!

If you have other helpful hints concerning my config feel free to answer!

Regards
Benjamin Benz


crm configure show:

node $id="d109b732-1cfc-4cd8-9cce-ba9323a56087" bb-node2
node $id="f995b3ac-734f-4cc4-aacb-cbec22e48de5" bb-node1
primitive drbd_ora ocf:linbit:drbd \
        params drbd_resource="ora" \
        op monitor interval="5s" timeout="20s" on-fail="restart"
primitive fs_ora ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/oracle" fstype="ext3" \
        op monitor interval="5s" timeout="40s" on-fail="restart"
primitive ip_ora ocf:heartbeat:IPaddr2 \
        params ip="53.113.178.29" cidr_netmask="255.255.255.0" \
        op monitor interval="5s" timeout="20s" on-fail="restart"
primitive oracle_ora ocf:heartbeat:oracle \
        params home="/oracle" sid="bbcluster" user="oracle" ipcrm="orauser" \
        op monitor interval="5s" timeout="30s" on-fail="restart"
primitive oralsnr_ora ocf:heartbeat:oralsnr \
        params home="/oracle" sid="bbcluster" user="oracle" \
        op monitor interval="5s" timeout="30s" on-fail="restart"
primitive ping ocf:pacemaker:ping \
        params dampen="5s" host_list="53.118.160.121" multiplier="1000" name="pingval" \
        operations $id="ping-operations" \
        op monitor interval="10s" timeout="10s"
group ora_group fs_ora ip_ora oralsnr_ora oracle_ora \
        meta target-role="Started"
ms ms_drbd_ora drbd_ora \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
clone connected ping \
        meta globally-unique="false" target-role="Started"
location ms_drbd_ora_on_connected_node ms_drbd_ora \
        rule $id="ms_drbd_ora_on_connected_node-rule" -inf: not_defined pingval or pingval lte 0
colocation ora_group_on_ms_drbd_ora inf: ora_group ms_drbd_ora:Master
order ms_drbd_ora_before_ora_group inf: ms_drbd_ora:promote ora_group:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.7-6e1815972fc236825bf3658d7f8451d33227d420" \
        cluster-infrastructure="Heartbeat" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        last-lrm-refresh="1270732011"

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

  



___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
  



___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker


Re: [Pacemaker] Resources don't start on second node after ping fails

2010-04-08 Thread Marco van Putten

Hi Benjamin,



rule $id="ms_drbd_ora_on_connected_node-rule" -inf: not_defined pingval or pingval lte 0



You could give this a try instead:

rule $id="ms_drbd_ora_on_connected_node-rule" -inf: not_defined pingval or pingval number:lte 0


Good luck,
Marco.
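
The point of the "number:" prefix is that a value compared as a string can order differently from the same value compared as a number. The sketch below only illustrates that general difference with awk; it is not how Pacemaker evaluates rules internally, and the two function names are made up for this example.

```shell
# lte_string A B: succeeds if A <= B when both are compared as strings.
lte_string() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !((a "") <= (b "")) }'
}

# lte_number A B: succeeds if A <= B when both are compared as numbers,
# which is what the "number:" prefix asks for.
lte_number() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !((a + 0) <= (b + 0)) }'
}

lte_string 1000 200 && echo "string: 1000 <= 200"
lte_number 1000 200 || echo "number: 1000 > 200"
```

As strings, "1000" sorts before "200" because the first characters are compared first, so a ping score with multiplier 1000 can be misjudged against a threshold unless the comparison is forced to be numeric.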
