Re: [Pacemaker] Error when managing network with ping/pingd.

2013-08-29 Thread Andrew Beekhof

On 29/08/2013, at 5:36 PM, Francis SOUYRI  wrote:

> Hello,
> 
> I have a corosync/pacemaker cluster with 2 nodes and 2 networks per node: 
> 192.168.1.0/24 for cluster access and 10.1.1.0/24 for drbd on a bond, both 
> used by corosync.
> I am trying to use ocf:pacemaker:ping to monitor 192.168.1.0/24. I have the 
> configuration below, but when I unplug the cable on noeud1, the named 
> resource group does not migrate to noeud2.
> 
> When I run the command "crm_attribute -G -t status -N  -n pingd" I get 
> this.

Invalid command.  -N requires an argument.
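
For reference, a minimal sketch of a well-formed query, assuming the node of interest is noeud1.apec.fr (the name used in the constraints quoted below) and that the ping resource publishes its result under the attribute name pingd. The helper name pingd_query is hypothetical, and it only prints the command so that it can be reviewed before being run on a cluster node:

```shell
# Hypothetical helper: print a crm_attribute query in which -N always
# receives a node name (defaults to this host's name from uname -n).
pingd_query() {
    node="${1:-$(uname -n)}"
    printf 'crm_attribute -G -t status -N %s -n pingd\n' "$node"
}

pingd_query noeud1.apec.fr
```

With the node name supplied, "-n pingd" is parsed as the attribute name instead of being consumed as the argument to -N.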

> 
> Could not map uname=-n to a UUID: The object/attribute does not exist
> scope=status   value=(null)
> Error performing operation: cib object missing
> 
> # CONFIG
> ...
>  
>
>  
>
> type="ping">
>  
> name="host_list" value="192.168.1.1"/>
> name="multiplier" value="100"/>
>  
>
>  
>
>
>   node="noeud1.apec.fr" rsc="named" score="50"/>
>   node="noeud1.apec.fr" rsc="dhcpd" score="50"/>
>   node="noeud1.apec.fr" rsc="named2" score="50"/>
>   node="noeud1.apec.fr" rsc="samba" score="50"/>
>  
>
>   operation="defined"/>
>
>  
>
>
>  
> name="resource-stickiness" value="100"/>
>  
>
> 

Please don't do this.

The contents of "" might not seem relevant to you, but I assure you they 
are.  Particularly in this case.


> # END CONFIG
> 
> Best regards.
> 
> Francis
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org





Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-08-29 Thread Andrew Beekhof

On 29/08/2013, at 7:31 PM, Andrey Groshev  wrote:

> 
> 
> 29.08.2013, 12:25, "Andrey Groshev" :
>> 29.08.2013, 02:55, "Andrew Beekhof" :
>> 
>>>  On 28/08/2013, at 5:38 PM, Andrey Groshev  wrote:
>>>>   28.08.2013, 04:06, "Andrew Beekhof" :
>>>>>   On 27/08/2013, at 1:13 PM, Andrey Groshev  wrote:
>>>>>>    27.08.2013, 05:39, "Andrew Beekhof" :
>>>>>>>    On 26/08/2013, at 3:09 PM, Andrey Groshev  wrote:
>>>>>>>> 26.08.2013, 03:34, "Andrew Beekhof" :
>>>>>>>>> On 23/08/2013, at 9:39 PM, Andrey Groshev  
>>>>>>>>> wrote:
>>>>>>>>>>  Hello,
>>>>>>>>>> 
>>>>>>>>>>  Today I try remake my test cluster from cman to corosync2.
>>>>>>>>>>  I drew attention to the following:
>>>>>>>>>>  If I reset cluster with cman through cibadmin --erase --force
>>>>>>>>>>  In cib is still there exist names of nodes.
>>>>>>>>> Yes, the cluster puts back entries for all the nodes it know 
>>>>>>>>> about automagically.
>>>>>>>>>>  cibadmin -Ql
>>>>>>>>>>  .
>>>>>>>>>> 
>>>>>>>>>>   > uname="dev-cluster2-node2"/>
>>>>>>>>>>   > uname="dev-cluster2-node4"/>
>>>>>>>>>>   > uname="dev-cluster2-node3"/>
>>>>>>>>>> 
>>>>>>>>>>  
>>>>>>>>>> 
>>>>>>>>>>  Even if cman and pacemaker running only one node.
>>>>>>>>> I'm assuming all three are configured in cluster.conf?
>>>>>>>> Yes, there exist list nodes.
>>>>>>>>>>  And if I do too on cluster with corosync2
>>>>>>>>>>  I see only names of nodes which run corosync and pacemaker.
>>>>>>>>> Since you're not included your config, I can only guess that your 
>>>>>>>>> corosync.conf does not have a nodelist.
>>>>>>>>> If it did, you should get the same behaviour.
>>>>>>>> I try and expected_node and nodelist.
>>>>>>>    And it didn't work? What version of pacemaker?
>>>>>>    It does not work as I expected.
>>>>>   Thats because you've used IP addresses in the node list.
>>>>>   ie.
>>>>> 
>>>>>   node {
>>>>> ring0_addr: 10.76.157.17
>>>>>   }
>>>>> 
>>>>>   try including the node name as well, eg.
>>>>> 
>>>>>   node {
>>>>> name: dev-cluster2-node2
>>>>> ring0_addr: 10.76.157.17
>>>>>   }
>>>>   The same thing.
>>>  I don't know what to say.  I tested it here yesterday and it worked as 
>>> expected.
>> 
>> I found that the reason that You and I have different results - I did not 
>> have reverse DNS zone for these nodes.
>> I know what it should be, but (PACEMAKER + CMAN) worked without a reverse 
>> area!
>> 
> 
> Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn!

It would have surprised me... pacemaker 1.1.11 doesn't do any dns lookups - 
reverse or otherwise.
Can you set

 PCMK_trace_files=corosync.c

in your environment and retest?

On RHEL6 that means putting the following in /etc/sysconfig/pacemaker
  export PCMK_trace_files=corosync.c

It should produce additional logging[1] that will help diagnose the issue.

[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/
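
A minimal sketch of making that change so that it can be re-run safely, assuming a RHEL6-style /etc/sysconfig/pacemaker. The function name enable_trace is hypothetical, and the file path can be passed as an argument for a dry run against a scratch file:

```shell
# Hypothetical helper: append the trace setting to the sysconfig file
# only if that exact line is not already present.
enable_trace() {
    conf="${1:-/etc/sysconfig/pacemaker}"
    line='export PCMK_trace_files=corosync.c'
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Usage (as root, on the node being debugged):
#   enable_trace
# Dry run against a scratch file:
#   enable_trace /tmp/pacemaker.sysconfig
```

The setting is read from the environment, so pacemaker presumably has to be restarted on that node before the extra logging appears.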

> 
   # corosync-cmapctl |grep nodelist
   nodelist.local_node_pos (u32) = 2
   nodelist.node.0.name (str) = dev-cluster2-node2
   nodelist.node.0.ring0_addr (str) = 10.76.157.17
   nodelist.node.1.name (str) = dev-cluster2-node3
   nodelist.node.1.ring0_addr (str) = 10.76.157.18
   nodelist.node.2.name (str) = dev-cluster2-node4
   nodelist.node.2.ring0_addr (str) = 10.76.157.19
 
   # corosync-quorumtool -s
   Quorum information
   --
   Date: Wed Aug 28 11:29:49 2013
   Quorum provider:  corosync_votequorum
   Nodes:1
   Node ID:  172793107
   Ring ID:  52
   Quorate:  No
 
   Votequorum information
   --
   Expected votes:   3
   Highest expected: 3
   Total votes:  1
   Quorum:   2 Activity blocked
   Flags:
 
   Membership information
   --
  Nodeid  Votes Name
   172793107  1 dev-cluster2-node4 (local)
 
   # cibadmin -Q
   >>> validate-with="pacemaker-1.2" crm_feature_set="3.0.7" 
 cib-last-written="Wed Aug 28 11:24:06 2013" 
 update-origin="dev-cluster2-node4" update-client="crmd" have-quorum="0" 
 dc-uuid="172793107">

  

  >>> value="1.1.11-1.el6-4f672bc"/>
  >>> name="cluster-infrastructure" value="corosync"/>

  
  

  
  
  


  >>> crmd="online" crm-debug-origin="do_state_transition" join="member" 
 expected="member">

  


  
>>> name="probe_complete" value="true"/>
  

  

   
>>I figured out a way get around this, but it would be easier to do if 
>> the CIB has worked as a w

Re: [Pacemaker] Restart service after failover or failback

2013-08-29 Thread Andrew Beekhof

On 30/08/2013, at 1:28 AM, Digimer  wrote:

> Use the "script" resource agent.

No!
LSB scripts are supported directly:
  
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/_linux_standard_base.html

example at the bottom of:
  
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/s-resource-options.html
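
A sketch of what that could look like for the named init script from the question, assuming the crm shell is in use and /etc/init.d/named is LSB compliant. The helper name lsb_primitive and the 30s monitor interval are arbitrary choices for illustration, and the helper only prints the command rather than running it:

```shell
# Hypothetical helper: print a crm shell command that defines an LSB
# resource for the init script /etc/init.d/<name>.
lsb_primitive() {
    printf 'crm configure primitive %s lsb:%s op monitor interval=30s\n' "$1" "$1"
}

lsb_primitive named
```

Once the service is a cluster resource, pacemaker starts it on whichever node takes over, which may remove the need for an explicit restart after failover.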


> 
> On 29/08/13 00:45, lista linux wrote:
>> Hi,
>> 
>> I have already read the docs, but I don't understand how I can make pacemaker
>> execute "/etc/init.d/named restart".
>> 
>> Regards,
>> On 08/29/2013 01:24 AM, Digimer wrote:
>>> On 29/08/13 00:16, lista linux wrote:
>>>> Hi,
>>>> 
>>>> Sorry for my english.
>>>> 
>>>> I need help to configure my corosync/pacemaker to restart the named
>>>> service after failover or failback.
>>>> 
>>>> Regards,
>>> 
>>> The "Cluster from Scratch" at the link below is the best place to start;
>>> 
>>> http://clusterlabs.org/doc/
>>> 
>> 
> 
> 
> -- 
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without access 
> to education?
> 





Re: [Pacemaker] [Problem]Two error information is displayed.

2013-08-29 Thread renayama19661014
Hi Andreas,

Thank you for your comment.

> But seriously: I see this phenomenon, too.
> (pacemaker 1.1.11-1.el6-4f672bc)

If the version you checked is the same as the one below, the same problem 
probably occurs there.
It contains similar code.
(https://github.com/ClusterLabs/pacemaker/blob/4f672bc85eefd33e2fb09b601bb8ec1510645468/lib/pengine/unpack.c)

Best Regards,
Hideo Yamauchi.

--- On Thu, 2013/8/29, Andreas Mock  wrote:

> Hi Hideo san,
> 
> the two lines are there to emphasize that you do not just have trouble
> but real trouble...  ;-)
> 
> But seriously: I see this phenomenon, too.
> (pacemaker 1.1.11-1.el6-4f672bc)
> 
> Best regards
> Andreas Mock
> 
> -Original Message-
> From: renayama19661...@ybb.ne.jp [mailto:renayama19661...@ybb.ne.jp] 
> Sent: Thursday, 29 August 2013 02:38
> To: PaceMaker-ML
> Subject: [Pacemaker] [Problem]Two error information is displayed.
> 
> Hi All,
> 
> Although the failure occurred only once, two error entries are displayed in
> crm_mon.
> 
> -
> 
> [root@rh64-coro2 ~]# crm_mon -1 -Af
> Last updated: Thu Aug 29 18:11:00 2013
> Last change: Thu Aug 29 18:10:45 2013 via cibadmin on rh64-coro2
> Stack: corosync
> Current DC: NONE
> 1 Nodes configured
> 1 Resources configured
> 
> 
> Online: [ rh64-coro2 ]
> 
> 
> Node Attributes:
> * Node rh64-coro2:
> 
> Migration summary:
> * Node rh64-coro2: 
>    dummy: migration-threshold=1 fail-count=1 last-failure='Thu Aug 29
> 18:10:57 2013'
> 
> Failed actions:
>     dummy_monitor_3000 on (null) 'not running' (7): call=11,
> status=complete, last-rc-change='Thu Aug 29 18:10:57 2013', queued=0ms,
> exec=0ms
>     dummy_monitor_3000 on rh64-coro2 'not running' (7): call=11,
> status=complete, last-rc-change='Thu Aug 29 18:10:57 2013', queued=0ms,
> exec=0ms
> 
> -
> 
> There seems to be a problem in the code that adds the error information:
> it looks like the failure is added twice.
> 
> -
> static void
> unpack_rsc_op_failure(resource_t *rsc, node_t *node, int rc, xmlNode
> *xml_op, enum action_fail_response *on_fail, pe_working_set_t * data_set) 
> {
>     int interval = 0;
>     bool is_probe = FALSE;
>     action_t *action = NULL;
> (snip)
>     if (rc != PCMK_OCF_NOT_INSTALLED || is_set(data_set->flags,
> pe_flag_symmetric_cluster)) {
>         if ((node->details->shutdown == FALSE) || (node->details->online ==
> TRUE)) {
>             add_node_copy(data_set->failed, xml_op);
>         }
>     }
> 
>     crm_xml_add(xml_op, XML_ATTR_UNAME, node->details->uname);
>     if ((node->details->shutdown == FALSE) || (node->details->online ==
> TRUE)) {
>         add_node_copy(data_set->failed, xml_op);
>     }
> (snip)
> -
> 
> 
> Please revise the code that adds the error information.
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> 
> 



Re: [Pacemaker] Restart service after failover or failback

2013-08-29 Thread Digimer

Use the "script" resource agent.

On 29/08/13 00:45, lista linux wrote:
> Hi,
>
> I have already read the docs, but I don't understand how I can make pacemaker
> execute "/etc/init.d/named restart".
>
> Regards,
> On 08/29/2013 01:24 AM, Digimer wrote:
>> On 29/08/13 00:16, lista linux wrote:
>>> Hi,
>>>
>>> Sorry for my English.
>>>
>>> I need help configuring my corosync/pacemaker to restart the named
>>> service after failover or failback.
>>>
>>> Regards,
>>
>> The "Cluster from Scratch" at the link below is the best place to start;
>>
>> http://clusterlabs.org/doc/






--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?




Re: [Pacemaker] [Problem]Two error information is displayed.

2013-08-29 Thread Andreas Mock
Hi Hideo san,

the two lines are there to emphasize that you do not just have trouble
but real trouble...  ;-)

But seriously: I see this phenomenon, too.
(pacemaker 1.1.11-1.el6-4f672bc)

Best regards
Andreas Mock

-Original Message-
From: renayama19661...@ybb.ne.jp [mailto:renayama19661...@ybb.ne.jp] 
Sent: Thursday, 29 August 2013 02:38
To: PaceMaker-ML
Subject: [Pacemaker] [Problem]Two error information is displayed.

Hi All,

Although the failure occurred only once, two error entries are displayed in
crm_mon.

-

[root@rh64-coro2 ~]# crm_mon -1 -Af
Last updated: Thu Aug 29 18:11:00 2013
Last change: Thu Aug 29 18:10:45 2013 via cibadmin on rh64-coro2
Stack: corosync
Current DC: NONE
1 Nodes configured
1 Resources configured


Online: [ rh64-coro2 ]


Node Attributes:
* Node rh64-coro2:

Migration summary:
* Node rh64-coro2: 
   dummy: migration-threshold=1 fail-count=1 last-failure='Thu Aug 29
18:10:57 2013'

Failed actions:
dummy_monitor_3000 on (null) 'not running' (7): call=11,
status=complete, last-rc-change='Thu Aug 29 18:10:57 2013', queued=0ms,
exec=0ms
dummy_monitor_3000 on rh64-coro2 'not running' (7): call=11,
status=complete, last-rc-change='Thu Aug 29 18:10:57 2013', queued=0ms,
exec=0ms

-

There seems to be a problem in the code that adds the error information:
it looks like the failure is added twice.

-
static void
unpack_rsc_op_failure(resource_t *rsc, node_t *node, int rc, xmlNode
*xml_op, enum action_fail_response *on_fail, pe_working_set_t * data_set) 
{
    int interval = 0;
    bool is_probe = FALSE;
    action_t *action = NULL;
(snip)
    if (rc != PCMK_OCF_NOT_INSTALLED || is_set(data_set->flags,
pe_flag_symmetric_cluster)) {
        if ((node->details->shutdown == FALSE) || (node->details->online ==
TRUE)) {
            add_node_copy(data_set->failed, xml_op);
        }
    }

    crm_xml_add(xml_op, XML_ATTR_UNAME, node->details->uname);
    if ((node->details->shutdown == FALSE) || (node->details->online ==
TRUE)) {
        add_node_copy(data_set->failed, xml_op);
    }
(snip)
-


Please revise the code that adds the error information.

Best Regards,
Hideo Yamauchi.




Re: [Pacemaker] Does ocf:heartbeat:IPaddr2 support binding the virtual IP on a bond interface?

2013-08-29 Thread Dejan Muhamedagic
Hi,

On Thu, Aug 29, 2013 at 11:43:11AM +0200, Florian Crouzat wrote:
> On 28/08/2013 19:18, Xiaomin Zhang wrote:
> >Actually I don't know how to specify the bond interface to assign this
> >virtual IP.
> 
> 
> $ sudo crm ra meta IPaddr2

Just to note that no special privileges are necessary to run crm
ra commands.

Cheers,

Dejan

> search for "nic" and make sure the underlying interface is up, as
> pacemaker doesn't do "ifup" but creates aliases on already created
> interfaces (cf. the prerequisite in the "nic" section).
> 
> -- 
> Cheers,
> Florian Crouzat
> 


Re: [Pacemaker] editing primitive via script

2013-08-29 Thread Dejan Muhamedagic
Hi,

On Wed, Aug 28, 2013 at 11:17:13PM +0200, Christian Parpart wrote:
> Hey,
> 
> make it one-line commands, such as
> 
> #! /bin/sh
> crm configure unmanage pgsql ...

I guess you wanted to say:

crm resource unmanage ...

> crm configure delete pgsql ...
> crm configure primitive pgsql ...
> 
> This should do it, even though, it feels like an abuse :)

This would be a bit better:

echo primitive pgsql ... | crm configure load update -

There's also crm configure filter though that may be difficult to
get right.
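
A sketch of how a script might use the load update approach, assuming the primitive definition is generated by a small function. The function name render_pgsql is hypothetical, and the agent parameters shown (node_list in particular) are placeholders for illustration, not the poster's actual configuration:

```shell
# Hypothetical helper: print a primitive definition on stdout. The
# parameters are placeholders; substitute the real pgsql settings.
render_pgsql() {
    printf 'primitive pgsql ocf:heartbeat:pgsql params node_list="%s"\n' "$1"
}

render_pgsql "node1 node2"
# On a cluster node, the same output could be applied with:
#   render_pgsql "node1 node2" | crm configure load update -
```

Generating the definition separately makes it possible to inspect or diff the text before piping it into crm.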

Cheers,

Dejan

> Cheers,
> Christian Parpart.
> 
> 
> 
> On Wed, Aug 28, 2013 at 5:58 PM, Gregg Jaskiewicz  wrote:
> 
> > Hi guys,
> >
> > So I have to change one of the 'primitives' configuration, in this case
> > pgsql - to add a new node or remove it.
> > I'd like to script it.
> >
> > Atm someone has to go in and manually run:
> > crm configuration edit pgsql
> > change the setting, and save it.
> >
> > How could one do this automatically in a script ?
> > crm configuration edit doesn't like to be fed things from outside using
> > stdin pipe.
> >
> > Thanks.
> >
> > --
> > GJ
> >
> >
> >



Re: [Pacemaker] Resource does not auto recover from failed state

2013-08-29 Thread tetsuo shima
Update: I tried to isolate the problem by running 2 Wheezy virtual machines
configured with Debian pinning like this:

# cat /etc/apt/preferences
Package: *
Pin: release a=wheezy
Pin-Priority: 900

Package: *
Pin: release a=squeeze
Pin-Priority: 800

# aptitude install corosync
# aptitude install pacemaker/squeeze

so :

root@pcmk2:/etc/corosync# dpkg -l | grep pacem
ii  pacemaker     1.0.9.1+hg15626-1  amd64  HA cluster resource manager
root@pcmk2:/etc/corosync# dpkg -l | grep corosync
ii  corosync      1.4.2-3            amd64  Standards-based cluster framework (daemon and modules)
ii  libcorosync4  1.4.2-3            all    Standards-based cluster framework (transitional package)


and the problem did not occur :

root@pcmk1:~/pacemaker# crm_mon -1

Last updated: Thu Aug 29 05:53:50 2013
Stack: openais
Current DC: pcmk1 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.


Online: [ pcmk2 pcmk1 ]

 ip(ocf::heartbeat:IPaddr2):Started pcmk1
 Clone Set: mysql-mm (unmanaged)
 mysql:0(ocf::heartbeat:mysql):Started pcmk2 (unmanaged)
 mysql:1(ocf::heartbeat:mysql):Started pcmk1 (unmanaged)

root@pcmk2:/etc/corosync# /etc/init.d/mysql stop
[ ok ] Stopping MySQL database server: mysqld.

root@pcmk1:~/pacemaker# crm_mon -1

Last updated: Thu Aug 29 05:55:39 2013
Stack: openais
Current DC: pcmk1 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.


Online: [ pcmk2 pcmk1 ]

 ip(ocf::heartbeat:IPaddr2):Started pcmk1
 Clone Set: mysql-mm (unmanaged)
 mysql:0(ocf::heartbeat:mysql):Started pcmk2 (unmanaged) FAILED
 mysql:1(ocf::heartbeat:mysql):Started pcmk1 (unmanaged)

Failed actions:
mysql:0_monitor_15000 (node=pcmk2, call=5, rc=7, status=complete): not
running


root@pcmk2:/etc/corosync# /etc/init.d/mysql start
[ ok ] Starting MySQL database server: mysqld ..
[info] Checking for tables which need an upgrade, are corrupt or were
not closed cleanly..

root@pcmk1:~/pacemaker# crm_mon -1

Last updated: Thu Aug 29 05:56:34 2013
Stack: openais
Current DC: pcmk1 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.


Online: [ pcmk2 pcmk1 ]

 ip(ocf::heartbeat:IPaddr2):Started pcmk1
 Clone Set: mysql-mm (unmanaged)
 mysql:0(ocf::heartbeat:mysql):Started pcmk2 (unmanaged)
 mysql:1(ocf::heartbeat:mysql):Started pcmk1 (unmanaged)



-

What I noticed:

With pacemaker 1.1.7, crm sees 3 resources configured, whereas 1.0.9 sees
2 resources (for the exact same configuration).



2013/8/27 tetsuo shima 

> Hi list !
>
> I'm having an issue with corosync, here is the scenario :
>
> # crm_mon -1
> 
> Last updated: Tue Aug 27 09:50:13 2013
> Last change: Mon Aug 26 16:06:01 2013 via cibadmin on node2
> Stack: openais
> Current DC: node1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> 
>
> Online: [ node2 node1 ]
>
>  ip(ocf::heartbeat:IPaddr2):Started node1
>  Clone Set: mysql-mm [mysql] (unmanaged)
>  mysql:0(ocf::heartbeat:mysql):Started node1 (unmanaged)
>  mysql:1(ocf::heartbeat:mysql):Started node2 (unmanaged)
>
> # /etc/init.d/mysql stop
> [ ok ] Stopping MySQL database server: mysqld.
>
> # crm_mon -1
> 
> Last updated: Tue Aug 27 09:50:30 2013
> Last change: Mon Aug 26 16:06:01 2013 via cibadmin on node2
> Stack: openais
> Current DC: node1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> 
>
> Online: [ node2 node1 ]
>
>  ip(ocf::heartbeat:IPaddr2):Started node1
>  Clone Set: mysql-mm [mysql] (unmanaged)
>  mysql:0(ocf::heartbeat:mysql):Started node1 (unmanaged)
>  mysql:1(ocf::heartbeat:mysql):Started node2 (unmanaged) FAILED
>
> Failed actions:
> mysql:0_monitor_15000 (node=node2, call=27, rc=7, status=complete):
> not running
>
> # /etc/init.d/mysql start
> [ ok ] Starting MySQL database server: mysqld ..
> [info] Checking for tables which need an upgrade, are corrupt or were
> not closed cleanly..
>
> # sleep 60 && crm_mon -1
> 
> Last updated: Tue Aug 27 09:51:54 2013
> Last change: Mon Aug 26 16:06:01 2013 via cibadmin on node2
> Stack: openais
> Current DC: node1 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> 
>
> Online: [ node2 node1 ]
>
>  ip(ocf::heartbeat:IPaddr2):Started node1
>  

Re: [Pacemaker] Does ocf:heartbeat:IPaddr2 support binding the virtual IP on a bond interface?

2013-08-29 Thread Florian Crouzat

On 28/08/2013 19:18, Xiaomin Zhang wrote:

> Actually I don't know how to specify the bond interface to assign this
> virtual IP.



$ sudo crm ra meta IPaddr2

search for "nic" and make sure the underlying interface is up, as 
pacemaker doesn't do "ifup" but creates aliases on already created 
interfaces (cf. the prerequisite in the "nic" section).
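
A minimal sketch of that check, assuming a Linux host where interface state is exposed under /sys/class/net. The function name nic_is_up, the interface name bond0 and the address in the comment are placeholders, not values from this thread:

```shell
# Hypothetical helper: succeed only if the interface's operstate is "up".
# SYSFS_NET can be overridden for testing; it defaults to /sys/class/net.
nic_is_up() {
    state=$(cat "${SYSFS_NET:-/sys/class/net}/$1/operstate" 2>/dev/null || true)
    [ "$state" = "up" ]
}

# Example (bond0 and the address are placeholders):
#   nic_is_up bond0 && \
#     crm configure primitive vip ocf:heartbeat:IPaddr2 \
#       params ip=192.168.1.250 cidr_netmask=24 nic=bond0
```

The nic parameter is what ties the virtual IP to the bond; the check only guards against configuring it on an interface that is not up yet.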


--
Cheers,
Florian Crouzat



Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-08-29 Thread Andrey Groshev


29.08.2013, 12:25, "Andrey Groshev" :
> 29.08.2013, 02:55, "Andrew Beekhof" :
>
>>  On 28/08/2013, at 5:38 PM, Andrey Groshev  wrote:
>>>   28.08.2013, 04:06, "Andrew Beekhof" :
>>>>   On 27/08/2013, at 1:13 PM, Andrey Groshev  wrote:
>>>>>    27.08.2013, 05:39, "Andrew Beekhof" :
>>>>>>    On 26/08/2013, at 3:09 PM, Andrey Groshev  wrote:
>>>>>>> 26.08.2013, 03:34, "Andrew Beekhof" :
>>>>>>>> On 23/08/2013, at 9:39 PM, Andrey Groshev  wrote:
>>>>>>>>>  Hello,
>>>>>>>>>
>>>>>>>>>  Today I try remake my test cluster from cman to corosync2.
>>>>>>>>>  I drew attention to the following:
>>>>>>>>>  If I reset cluster with cman through cibadmin --erase --force
>>>>>>>>>  In cib is still there exist names of nodes.
>>>>>>>> Yes, the cluster puts back entries for all the nodes it know about 
>>>>>>>> automagically.
>>>>>>>>>  cibadmin -Ql
>>>>>>>>>  .
>>>>>>>>> 
>>>>>>>>>    uname="dev-cluster2-node2"/>
>>>>>>>>>    uname="dev-cluster2-node4"/>
>>>>>>>>>    uname="dev-cluster2-node3"/>
>>>>>>>>> 
>>>>>>>>>  
>>>>>>>>>
>>>>>>>>>  Even if cman and pacemaker running only one node.
>>>>>>>> I'm assuming all three are configured in cluster.conf?
>>>>>>> Yes, there exist list nodes.
>>>>>>>>>  And if I do too on cluster with corosync2
>>>>>>>>>  I see only names of nodes which run corosync and pacemaker.
>>>>>>>> Since you're not included your config, I can only guess that your 
>>>>>>>> corosync.conf does not have a nodelist.
>>>>>>>> If it did, you should get the same behaviour.
>>>>>>> I try and expected_node and nodelist.
>>>>>>    And it didn't work? What version of pacemaker?
>>>>>    It does not work as I expected.
>>>>   Thats because you've used IP addresses in the node list.
>>>>   ie.
>>>>
>>>>   node {
>>>> ring0_addr: 10.76.157.17
>>>>   }
>>>>
>>>>   try including the node name as well, eg.
>>>>
>>>>   node {
>>>> name: dev-cluster2-node2
>>>> ring0_addr: 10.76.157.17
>>>>   }
>>>   The same thing.
>>  I don't know what to say.  I tested it here yesterday and it worked as 
>> expected.
>
> I found that the reason that You and I have different results - I did not 
> have reverse DNS zone for these nodes.
> I know what it should be, but (PACEMAKER + CMAN) worked without a reverse 
> area!
>

Hasty. Deleted all. Reinstalled. Configured. Not working again. Damn!

>>>   # corosync-cmapctl |grep nodelist
>>>   nodelist.local_node_pos (u32) = 2
>>>   nodelist.node.0.name (str) = dev-cluster2-node2
>>>   nodelist.node.0.ring0_addr (str) = 10.76.157.17
>>>   nodelist.node.1.name (str) = dev-cluster2-node3
>>>   nodelist.node.1.ring0_addr (str) = 10.76.157.18
>>>   nodelist.node.2.name (str) = dev-cluster2-node4
>>>   nodelist.node.2.ring0_addr (str) = 10.76.157.19
>>>
>>>   # corosync-quorumtool -s
>>>   Quorum information
>>>   --
>>>   Date: Wed Aug 28 11:29:49 2013
>>>   Quorum provider:  corosync_votequorum
>>>   Nodes:    1
>>>   Node ID:  172793107
>>>   Ring ID:  52
>>>   Quorate:  No
>>>
>>>   Votequorum information
>>>   --
>>>   Expected votes:   3
>>>   Highest expected: 3
>>>   Total votes:  1
>>>   Quorum:   2 Activity blocked
>>>   Flags:
>>>
>>>   Membership information
>>>   --
>>>  Nodeid  Votes Name
>>>   172793107  1 dev-cluster2-node4 (local)
>>>
>>>   # cibadmin -Q
>>>   >> validate-with="pacemaker-1.2" crm_feature_set="3.0.7" cib-last-written="Wed 
>>> Aug 28 11:24:06 2013" update-origin="dev-cluster2-node4" 
>>> update-client="crmd" have-quorum="0" dc-uuid="172793107">
>>>    
>>>  
>>>    
>>>  >> value="1.1.11-1.el6-4f672bc"/>
>>>  >> name="cluster-infrastructure" value="corosync"/>
>>>    
>>>  
>>>  
>>>    
>>>  
>>>  
>>>  
>>>    
>>>    
>>>  >> crmd="online" crm-debug-origin="do_state_transition" join="member" 
>>> expected="member">
>>>    
>>>  
>>>    
>>>    
>>>  
>>>    >> name="probe_complete" value="true"/>
>>>  
>>>    
>>>  
>>>    
>>>   
>    I figured out a way get around this, but it would be easier to do if 
> the CIB has worked as a with CMAN.
>    I just do not start the main resource if the attribute is not defined 
> or it is not true.
>    This slightly changes the logic of the cluster.
>    But I'm not sure what the correct behavior.
>
>    libqb 0.14.4
>    corosync 2.3.1
>    pacemaker 1.1.11
>
>    All build from source in previews week.
>>> Now in corosync.conf:
>>>
>>> totem {
>>>    version: 2
>>>    crypto_cipher: none
>>>    crypto_hash: none
>>>    interface {
>>>    ringnumber: 0
>>> bindnetaddr: 10.76.157.18
>>>    

Re: [Pacemaker] Error when managing network with ping/pingd.

2013-08-29 Thread Francis SOUYRI

Hello,

Below is the result of the scoreshow script 
(https://github.com/ClusterLabs/pacemaker/blob/master/extra/showscores.sh); 
the output of "pcs config" is attached.


drbd: ocf:redhat:drbd.sh
Filesystem: ocf:heartbeat:Filesystem
IPaddr2: ocf:heartbeat:IPaddr2
ping: ocf:pacemaker:ping

# ./scoreshow
Resource           Score      Node            Stickiness #Fail Migration-Threshold

drbd_dhcpd         200        noeud1.apec.fr  1000
drbd_dhcpd         -INFINITY  noeud2.apec.fr  1000
drbd_named         200        noeud1.apec.fr  1000
drbd_named2        200        noeud2.apec.fr  1000
drbd_named2        -INFINITY  noeud1.apec.fr  1000
drbd_named         -INFINITY  noeud2.apec.fr  1000
drbd_samba         200        noeud2.apec.fr  1000
drbd_samba         -INFINITY  noeud1.apec.fr  1000
Filesystem_dhcpd   100        noeud1.apec.fr  1000
Filesystem_dhcpd   -INFINITY  noeud2.apec.fr  1000
Filesystem_named   100        noeud1.apec.fr  1000
Filesystem_named2  100        noeud2.apec.fr  1000
Filesystem_named2  -INFINITY  noeud1.apec.fr  1000
Filesystem_named   -INFINITY  noeud2.apec.fr  1000
Filesystem_samba   100        noeud2.apec.fr  1000
Filesystem_samba   -INFINITY  noeud1.apec.fr  1000
IPaddr2_dhcpd      0          noeud2.apec.fr  1000
IPaddr2_dhcpd      350        noeud1.apec.fr  1000
IPaddr2_named      0          noeud2.apec.fr  1000
IPaddr2_named2     0          noeud1.apec.fr  1000
IPaddr2_named2     350        noeud2.apec.fr  1000
IPaddr2_named      350        noeud1.apec.fr  1000
IPaddr2_samba      0          noeud1.apec.fr  1000
IPaddr2_samba      350        noeud2.apec.fr  1000
ping-gateway:0     100        noeud2.apec.fr  1000
ping-gateway:0     -INFINITY  noeud1.apec.fr  1000
ping-gateway:1     0          noeud2.apec.fr  1000
ping-gateway:1     100        noeud1.apec.fr  1000

Best regards.

Francis

On 08/29/2013 10:49 AM, Francis SOUYRI wrote:

Hello Emmanuel,

Yes, I think so too, but why? That is the question. I searched the Net for
information about how the score is calculated, without success.

I found a "scoreshow" script but I do not understand the result.

Best regards.

Francis

On 08/29/2013 10:32 AM, emmanuel segura wrote:

I think your score is wrong in your rule expression


2013/8/29 Francis SOUYRI <francis.sou...@apec.fr>

 Hello,

 I have a corosync/pacemaker cluster with 2 nodes and 2 networks per node:
 192.168.1.0/24 for cluster access and 10.1.1.0/24 for drbd on a bond,
 both used by corosync.
 I am trying to use ocf:pacemaker:ping to monitor 192.168.1.0/24. I have
 the configuration below, but when I unplug the cable on noeud1, the named
 resource group does not migrate to noeud2.

 When I run the command "crm_attribute -G -t status -N  -n pingd"
 I get this.

 Could not map uname=-n to a UUID: The object/attribute does not exist
 scope=status   value=(null)
 Error performing operation: cib object missing

 # CONFIG
 ...

  

  
  

  
  

  

  
  
  node="noeud1.apec.fr" rsc="named" score="50"/>
  node="noeud1.apec.fr" rsc="dhcpd" score="50"/>
  node="noeud1.apec.fr" rsc="named2" score="50"/>
  node="noeud1.apec.fr" rsc="samba" score="50"/>

  

  

  
  

  

  
 
 # END CONFIG

 Best regards.

 Francis





--
esta es mi vida e me la vivo hasta que dios quiera







# pcs config
Corosync Nodes:
 10.1.1.1 10.1.1.2 
Pacemaker Nodes:
 noeud1.apec.fr noeud2.apec.fr 

Resources: 
 Group: named
  Resource: IPaddr2_named (type=IPaddr2 class=ocf provider=heartbeat)
   Attributes: ip=192.168.1.250 cidr_netmask=24 
   Operations: monitor interval=5s timeou

Re: [Pacemaker] Need help with quickstart of pacemaker on redhat

2013-08-29 Thread Andrew Beekhof
Pacemaker configuration. cibadmin -Ql

Sent from a mobile device

On 29/08/2013, at 7:02 PM, Moturi Upendra  wrote:

> Hi,
> 
> here it is
> 
> 
> 
> Just executed those steps in the document.
> 
> Thanks
> Upendra
> 
> 
> On Thu, Aug 29, 2013 at 2:59 AM, Andrew Beekhof  wrote:
>> 
>> On 28/08/2013, at 8:18 PM, Moturi Upendra  wrote:
>> 
>> > Thanks for the reply,
>> > but according to the doc it should move to a different node.
>> 
>> Can you show your configuration please?
>> 
>> >
>> > From the document:
>> >
>> > Simulate a Service Failure
>> >
>> > We can simulate an error by telling the service to stop directly (without 
>> > telling the cluster):
>> >
>> > [ONE] # crm_resource --resource my_first_svc --force-stop
>> >
>> > If you now run crm_mon in interactive mode (the default), you should see 
>> > (within the monitor interval - 2 minutes) the cluster notice that 
>> > my_first_svc failed and move it to another node.
>> >
>> >
>> >
>> > thanks
>> >
>> > Upendra
>> >
>> >
>> >
>> > On Wed, Aug 28, 2013 at 3:51 AM, Andrew Beekhof  wrote:
>> >
>> > On 28/08/2013, at 12:12 AM, Moturi Upendra  
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > I followed your article in setting up 2-node cluster with pacemaker on 
>> > > redhat 6.4
>> > > http://clusterlabs.org/quickstart-redhat.html
>> > >
>> > > I executed exactly the steps described in the document.
>> > > When I test the failure condition, to make the dummy agent start on
>> > > node2, it reports this error:
>> > > "my_first_svc_monitor_3 (node=node1, call=76, rc=7, 
>> > > status=complete): not running"
>> > >
>> > > Please help me understand the error.
>> >
>> > That's the cluster detecting that the resource was stopped - which is
>> > expected since you stopped it.
>> >
> 
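
For readers puzzled by the "rc=7" in the quoted log line: in the OCF
resource agent convention, an exit code of 7 from a monitor operation means
"not running", which matches the text at the end of the line. A minimal
Python sketch that pulls the code out of such a line follows. The helper name
explain_rc is made up for illustration, and the table is limited to the
basic OCF codes.

```python
import re

# Basic OCF resource agent exit codes and their symbolic names.
OCF_RETURN_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}


def explain_rc(line):
    """Hypothetical helper: return (rc, symbolic name) for a failed-action line."""
    match = re.search(r"\brc=(\d+)", line)
    if match is None:
        raise ValueError("no rc= field found in line")
    rc = int(match.group(1))
    return rc, OCF_RETURN_CODES.get(rc, "unknown")


# The line quoted in this thread, joined back into one string.
line = ("my_first_svc_monitor_3 (node=node1, call=76, rc=7, "
        "status=complete): not running")
print(explain_rc(line))
```

Seeing this entry after stopping the service by hand is therefore the
monitor doing its job, not a fault in the cluster.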


Re: [Pacemaker] Error when managing network with ping/pingd.

2013-08-29 Thread emmanuel segura
Read this link
http://www.woodwose.net/thatremindsme/2011/04/the-pacemaker-ping-resource-agent/


2013/8/29 Francis SOUYRI 

> Hello Emmanuel,
>
> Yes, I think so too, but why? That is the question. I searched the Net for
> information about how the score is calculated, without success.
>
> I retrieved a "schoreshow" script but I do not understand its output.
>
> Best regards.
>
> Francis
>
>
> On 08/29/2013 10:32 AM, emmanuel segura wrote:
>
>> I think your score is wrong in your rule expression
>>
>>
>> 2013/8/29 Francis SOUYRI
>>
>> Hello,
>>
>> I have a corosync/pacemaker cluster with 2 nodes and 2 networks per node:
>> 192.168.1.0/24 for cluster access and 10.1.1.0/24 (bonded) for DRBD, both
>> used by corosync.
>> I am trying to use ocf:pacemaker:ping to monitor 192.168.1.0/24 with the
>> configuration below, but when I unplug the cable on noeud1, the named
>> group resource does not migrate to noeud2.
>>
>> When I run the command "crm_attribute -G -t status -N  -n pingd",
>> I get this:
>>
>> Could not map uname=-n to a UUID: The object/attribute does not exist
>> scope=status   value=(null)
>> Error performing operation: cib object missing
>>
>> # CONFIG
>> ...
>>
>> value="true"/>
>> provider="pacemaker" type="ping">
>> id="ping-gateway-instance_attributes-host_list" name="host_list" value="192.168.1.1"/>
>> id="ping-gateway-instance_attributes-multiplier" name="multiplier" value="100"/>
>> node="noeud1.apec.fr" rsc="named" score="50"/>
>> node="noeud1.apec.fr" rsc="dhcpd" score="50"/>
>> node="noeud1.apec.fr" rsc="named2" score="50"/>
>> node="noeud1.apec.fr" rsc="samba" score="50"/>
>> operation="defined"/>
>> name="resource-stickiness" value="100"/>
>> # END CONFIG
>>
>> Best regards.
>>
>> Francis
>>





Re: [Pacemaker] Error when managing network with ping/pingd.

2013-08-29 Thread Francis SOUYRI

Hello Emmanuel,

Yes, I think so too, but why? That is the question. I searched the Net for 
information about how the score is calculated, without success.

I retrieved a "schoreshow" script but I do not understand its output.

Best regards.

Francis

On 08/29/2013 10:32 AM, emmanuel segura wrote:

I think your score is wrong in your rule expression


2013/8/29 Francis SOUYRI <francis.sou...@apec.fr>

Hello,

I have a corosync/pacemaker cluster with 2 nodes and 2 networks per node:
192.168.1.0/24 for cluster access and 10.1.1.0/24 (bonded) for DRBD, both
used by corosync.
I am trying to use ocf:pacemaker:ping to monitor 192.168.1.0/24 with the
configuration below, but when I unplug the cable on noeud1, the named
group resource does not migrate to noeud2.

When I run the command "crm_attribute -G -t status -N  -n pingd",
I get this:

Could not map uname=-n to a UUID: The object/attribute does not exist
scope=status   value=(null)
Error performing operation: cib object missing

# CONFIG
...
node="noeud1.apec.fr" rsc="named" score="50"/>
node="noeud1.apec.fr" rsc="dhcpd" score="50"/>
node="noeud1.apec.fr" rsc="named2" score="50"/>
node="noeud1.apec.fr" rsc="samba" score="50"/>

# END CONFIG

Best regards.

Francis



Re: [Pacemaker] Error when managing network with ping/pingd.

2013-08-29 Thread emmanuel segura
I think your score is wrong in your rule expression
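
To make the scoring concrete: ocf:pacemaker:ping publishes a node attribute
(named "pingd" by default) whose value is the number of reachable hosts in
host_list multiplied by "multiplier". The Python sketch below is only a model
of that arithmetic, not Pacemaker code. The function names are invented, and
because the rule in the quoted configuration is only partly visible, the two
rule styles compared here are assumptions about what it might contain.

```python
def pingd_value(reachable_hosts, multiplier):
    """Model of the attribute value: reachable hosts times multiplier."""
    return reachable_hosts * multiplier


def matches_defined(node_attrs, name="pingd"):
    """Model of a rule expression that only tests 'attribute is defined'."""
    return name in node_attrs


def matches_gt_zero(node_attrs, name="pingd"):
    """Model of a rule expression that tests 'attribute is greater than 0'."""
    return name in node_attrs and int(node_attrs[name]) > 0


multiplier = 100  # value from the quoted configuration
# One host in host_list (192.168.1.1): reachable, then unreachable.
link_up = {"pingd": str(pingd_value(1, multiplier))}
link_down = {"pingd": str(pingd_value(0, multiplier))}

for label, attrs in (("link up", link_up), ("link down", link_down)):
    print(label, attrs["pingd"], matches_defined(attrs), matches_gt_zero(attrs))
```

If this model is right, a rule that only checks whether the attribute is
defined cannot distinguish a node with connectivity from one without, since
the attribute stays defined with value 0 after the cable is pulled.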


2013/8/29 Francis SOUYRI 

> Hello,
>
> I have a corosync/pacemaker cluster with 2 nodes and 2 networks per node:
> 192.168.1.0/24 for cluster access and 10.1.1.0/24 (bonded) for DRBD, both
> used by corosync.
> I am trying to use ocf:pacemaker:ping to monitor 192.168.1.0/24 with the
> configuration below, but when I unplug the cable on noeud1, the named
> group resource does not migrate to noeud2.
>
> When I run the command "crm_attribute -G -t status -N  -n pingd", I get
> this:
>
> Could not map uname=-n to a UUID: The object/attribute does not exist
> scope=status   value=(null)
> Error performing operation: cib object missing
>
> # CONFIG
> ...
> value="true"/>
> type="ping">
> name="host_list" value="192.168.1.1"/>
> name="multiplier" value="100"/>
> operation="defined"/>
> name="resource-stickiness" value="100"/>
> # END CONFIG
>
> Best regards.
>
> Francis
>
>





Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-08-29 Thread Andrey Groshev


29.08.2013, 02:55, "Andrew Beekhof" :
> On 28/08/2013, at 5:38 PM, Andrey Groshev  wrote:
>
>>  28.08.2013, 04:06, "Andrew Beekhof" :
>>>  On 27/08/2013, at 1:13 PM, Andrey Groshev  wrote:
>>>> 27.08.2013, 05:39, "Andrew Beekhof" :
>>>>> On 26/08/2013, at 3:09 PM, Andrey Groshev  wrote:
>>>>>> 26.08.2013, 03:34, "Andrew Beekhof" :
>>>>>>> On 23/08/2013, at 9:39 PM, Andrey Groshev  wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Today I tried to rebuild my test cluster, moving from cman to corosync2.
>>>>>>>> I noticed the following:
>>>>>>>> If I reset the cluster with cman using cibadmin --erase --force,
>>>>>>>> the node names still exist in the CIB.
>>>>>>> Yes, the cluster puts back entries for all the nodes it knows about
>>>>>>> automagically.
>>>>>>>> cibadmin -Ql
>>>>>>>> .
>>>>>>>> uname="dev-cluster2-node2"/>
>>>>>>>> uname="dev-cluster2-node4"/>
>>>>>>>> uname="dev-cluster2-node3"/>
>>>>>>>>
>>>>>>>> Even if cman and pacemaker are running on only one node.
>>>>>>> I'm assuming all three are configured in cluster.conf?
>>>>>> Yes, the node list is there.
>>>>>>>> And if I do the same on a cluster with corosync2,
>>>>>>>> I see only the names of nodes that are running corosync and pacemaker.
>>>>>>> Since you've not included your config, I can only guess that your
>>>>>>> corosync.conf does not have a nodelist.
>>>>>>> If it did, you should get the same behaviour.
>>>>>> I tried both expected_node and nodelist.
>>>>> And it didn't work? What version of pacemaker?
>>>> It does not work as I expected.
>>>  That's because you've used IP addresses in the node list.
>>>  ie.
>>>
>>>  node {
>>>    ring0_addr: 10.76.157.17
>>>  }
>>>
>>>  try including the node name as well, eg.
>>>
>>>  node {
>>>    name: dev-cluster2-node2
>>>    ring0_addr: 10.76.157.17
>>>  }
>>  The same thing.
>
> I don't know what to say.  I tested it here yesterday and it worked as 
> expected.

I found the reason why you and I got different results: I did not have a 
reverse DNS zone for these nodes.
I know there should be one, but Pacemaker + CMAN worked without a reverse zone!
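
A quick way to spot this kind of gap is to ask the resolver for the name
behind each ring0 address and compare it with the node name. The Python
sketch below is only an illustration: the helper names are invented, the
node names and addresses are the ones from the nodelist quoted further down,
and the stub resolver stands in for real DNS so the example runs anywhere.
Passing no resolver argument would use socket.gethostbyaddr against the
system's resolver instead.

```python
import socket


def reverse_name(addr, resolver=socket.gethostbyaddr):
    """Return the primary name a reverse lookup gives for addr, or None."""
    try:
        hostname, _aliases, _addresses = resolver(addr)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


def missing_reverse_entries(nodes, resolver=socket.gethostbyaddr):
    """nodes maps node name -> address; return names that do not resolve back."""
    missing = []
    for name, addr in nodes.items():
        found = reverse_name(addr, resolver)
        # Compare short host names so "node2" matches "node2.example.org".
        if found is None or found.split(".")[0] != name.split(".")[0]:
            missing.append(name)
    return missing


nodes = {
    "dev-cluster2-node2": "10.76.157.17",
    "dev-cluster2-node3": "10.76.157.18",
    "dev-cluster2-node4": "10.76.157.19",
}


def stub_resolver(addr):
    """Stand-in for DNS: only one address has a reverse (PTR) entry."""
    table = {"10.76.157.17": "dev-cluster2-node2"}
    if addr not in table:
        raise socket.herror(1, "Unknown host")
    return table[addr], [], [addr]


print(missing_reverse_entries(nodes, resolver=stub_resolver))
```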

>
>>  # corosync-cmapctl |grep nodelist
>>  nodelist.local_node_pos (u32) = 2
>>  nodelist.node.0.name (str) = dev-cluster2-node2
>>  nodelist.node.0.ring0_addr (str) = 10.76.157.17
>>  nodelist.node.1.name (str) = dev-cluster2-node3
>>  nodelist.node.1.ring0_addr (str) = 10.76.157.18
>>  nodelist.node.2.name (str) = dev-cluster2-node4
>>  nodelist.node.2.ring0_addr (str) = 10.76.157.19
>>
>>  # corosync-quorumtool -s
>>  Quorum information
>>  --
>>  Date: Wed Aug 28 11:29:49 2013
>>  Quorum provider:  corosync_votequorum
>>  Nodes:    1
>>  Node ID:  172793107
>>  Ring ID:  52
>>  Quorate:  No
>>
>>  Votequorum information
>>  --
>>  Expected votes:   3
>>  Highest expected: 3
>>  Total votes:  1
>>  Quorum:   2 Activity blocked
>>  Flags:
>>
>>  Membership information
>>  --
>> Nodeid  Votes Name
>>  172793107  1 dev-cluster2-node4 (local)
>>
>>  # cibadmin -Q
>>  validate-with="pacemaker-1.2" crm_feature_set="3.0.7"
>>  cib-last-written="Wed Aug 28 11:24:06 2013"
>>  update-origin="dev-cluster2-node4" update-client="crmd"
>>  have-quorum="0" dc-uuid="172793107">
>>  value="1.1.11-1.el6-4f672bc"/>
>>  name="cluster-infrastructure" value="corosync"/>
>>  crmd="online" crm-debug-origin="do_state_transition" join="member"
>>  expected="member">
>>  value="true"/>
>>>> I figured out a way to get around this, but it would be easier if the
>>>> CIB behaved as it does with CMAN.
>>>> I simply do not start the main resource if the attribute is not defined
>>>> or is not true.
>>>> This slightly changes the logic of the cluster.
>>>> But I'm not sure what the correct behavior is.
>>>>
>>>> libqb 0.14.4
>>>> corosync 2.3.1
>>>> pacemaker 1.1.11
>>>>
>>>> All built from source in the previous week.
>>>>>> Now in corosync.conf:
>>>>>>
>>>>>> totem {
>>>>>>   version: 2
>>>>>>   crypto_cipher: none
>>>>>>   crypto_hash: none
>>>>>>   interface {
>>>>>>     ringnumber: 0
>>>>>>     bindnetaddr: 10.76.157.18
>>>>>>     mcastaddr: 239.94.1.56
>>>>>>     mcastport: 5405
>>>>>>     ttl: 1
>>>>>>   }
>>>>>> }
>>>>>> logging {
>>>>>>   fileline: off
>>>>>>   to_stderr: no
>>>>>>   to_logfile: yes
>>>>>>   logfile: /var/log/cluster/corosync.log
>>>>>>   to_syslog: yes
>>>>>>   debug: on
>>>>>>   timestamp: on
>>  

[Pacemaker] Error when managing network with ping/pingd.

2013-08-29 Thread Francis SOUYRI

Hello,

I have a corosync/pacemaker cluster with 2 nodes and 2 networks per node: 
192.168.1.0/24 for cluster access and 10.1.1.0/24 (bonded) for DRBD, both 
used by corosync.
I am trying to use ocf:pacemaker:ping to monitor 192.168.1.0/24 with the 
configuration below, but when I unplug the cable on noeud1, the named 
group resource does not migrate to noeud2.

When I run the command "crm_attribute -G -t status -N  -n pingd", I get 
this:


Could not map uname=-n to a UUID: The object/attribute does not exist
scope=status   value=(null)
Error performing operation: cib object missing
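
The "uname=-n" in the first error line is consistent with getopt-style
option parsing: -N expects a value, none was given, so the next word on the
command line, "-n", is taken as the node name. The Python sketch below
reproduces that behaviour with the standard getopt module as a stand-in. The
option string "Gt:N:n:" is an assumption that covers only the four options
used in the command above, not the real option table of crm_attribute.

```python
import getopt

# The command line as typed, without the program name.
argv = ["-G", "-t", "status", "-N", "-n", "pingd"]

# A trailing ":" marks an option that requires a value.
opts, rest = getopt.getopt(argv, "Gt:N:n:")
parsed = dict(opts)

print(parsed)  # "-N" has swallowed "-n" as its value
print(rest)    # "pingd" is left over as a plain argument
```

Supplying the node name explicitly, for example
"crm_attribute -G -t status -N noeud1.apec.fr -n pingd", avoids the
misparse.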

# CONFIG
...
value="true"/>
type="ping">
name="host_list" value="192.168.1.1"/>
name="multiplier" value="100"/>
node="noeud1.apec.fr" rsc="named" score="50"/>
node="noeud1.apec.fr" rsc="dhcpd" score="50"/>
node="noeud1.apec.fr" rsc="named2" score="50"/>
node="noeud1.apec.fr" rsc="samba" score="50"/>
operation="defined"/>
name="resource-stickiness" value="100"/>


# END CONFIG

Best regards.

Francis
