Re: [Pacemaker] Prioritized failover

2011-11-17 Thread Florian Haas
On 11/17/11 08:03, Nirmala S wrote:
> Hi,
> 
>  
> 
> I am in the process of modeling high availability for DB using
> Pacemaker. DB is an in-memory one with optional storage on disk.
> Replication is used as main form of data communication between nodes.
> The cluster needs to have a master-preferred slave-other slaves. Master
> and preferred slave replicate synchronously. Preferred slave acts as
> replication master for other slaves. In case of failure of master, the
> preferred slave should become master and pick one of the other slaves to
> handle the preferred slave role.
> 
>  
> 
> Topology is
> 
>  
> 
> Master
> 
>|
> 
> Slave/Master
> 
>  /  |  \
> 
> Slave Slave Slave

You can easily test this with the "Stateful" resource agent. Say you've
got 5 nodes, alice, bob, charlie, daisy and eric. alice is meant to be
your central master, bob is a slave to alice and a master to the three
others. alice and bob should be able to switch roles.

The following example configuration is untested, but it should suffice
to illustrate the idea.

node alice attributes class="central"
node bob attributes class="central"
node charlie attributes class="satellite"
node daisy attributes class="satellite"
node eric attributes class="satellite"

primitive p_stateful1 ocf:pacemaker:Stateful
primitive p_stateful2 ocf:pacemaker:Stateful
ms ms_stateful1 p_stateful1 meta master-max=1 clone-max=2
ms ms_stateful2 p_stateful2 meta master-max=1 clone-max=4

location l_stateful1_on_central ms_stateful1 \
  rule -inf: class ne central
location l_stateful2_master_on_central ms_stateful2 \
  rule $role=Master -inf: class ne central
location l_stateful2_slave_on_satellite ms_stateful2 \
  rule $role=Slave -inf: class ne satellite
colocation c_stateful2_master_on_stateful1_slave \
  inf: ms_stateful2:Master ms_stateful1:Slave

> It needs to be achieved by one resource agent. Should the prioritized
> failover be based on ordering or location preference? In either case,
> based on the replication lag, can it be set dynamically?

Take a look at what ocf:heartbeat:mysql and ocf:linbit:drbd do with
crm_master. It will allow you to set a master preference in whatever way
you see fit.
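For illustration, a monitor action could translate the observed replication lag into a master score, similar in spirit to what those agents do with crm_master. This is an untested sketch; `compute_preference` and its thresholds are invented for the example, and the actual crm_master call is left commented out:

```shell
#!/bin/sh
# Sketch: map replication lag (in seconds) to a master preference.
# The thresholds below are arbitrary examples, not recommendations.
compute_preference() {
    lag=$1
    if [ "$lag" -le 1 ]; then
        printf '%s\n' 100    # nearly in sync: strong promotion candidate
    elif [ "$lag" -le 10 ]; then
        printf '%s\n' 50     # lagging, but still promotable
    else
        printf '%s\n' -1     # too far behind: do not promote here
    fi
}

# Inside a real RA's monitor action you would then run something like:
#   crm_master -q -l reboot -v "$(compute_preference "$lag")"
```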

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] Prioritized failover

2011-11-17 Thread Nirmala
Florian Haas  writes:

> 
> You can easily test this with the "Stateful" resource agent. Say you've
> got 5 nodes, alice, bob, charlie, daisy and eric. alice is meant to be
> your central master, bob is a slave to alice and a master to the three
> others. alice and bob should be able to switch roles.
> 
> The following example configuration is untested, but it should suffice
> to illustrate the idea.
> 
> node alice attributes class="central"
> node bob attributes class="central"
> node charlie attributes class="satellite"
> node daisy attributes class="satellite"
> node eric attributes class="satellite"

Is it possible for the nodes in the "satellite" class to become "central" at
any point in time ?

    Master (alice)                    Master (bob)
          |                                 |
    Slave/Master (bob)       to       Slave/Master (eric)
      /     |     \                      /       \
 Slave   Slave   Slave              Slave       Slave
(eric)  (daisy) (charlie)          (daisy)    (charlie)

Now if alice is restarted, it should be "satellite".

> primitive p_stateful1 ocf:pacemaker:Stateful
> primitive p_stateful2 ocf:pacemaker:Stateful
> ms ms_stateful1 p_stateful1 meta master-max=1 clone-max=2
> ms ms_stateful2 p_stateful2 meta master-max=1 clone-max=4
> 
> location l_stateful1_on_central ms_stateful1 \
>   rule -inf: class ne central
> location l_stateful2_master_on_central ms_stateful2 \
>   rule $role=Master -inf: class ne central
> location l_stateful2_slave_on_satellite ms_stateful2 \
>   rule $role=Slave -inf: class ne satellite
> colocation c_stateful2_master_on_stateful1_slave \
>   inf: ms_stateful2:Master ms_stateful1:Slave
> 
> 
> Take a look at what ocf:heartbeat:mysql and ocf:linbit:drbd do with
> crm_master. It will allow you to set a master preference in whatever way
> you see fit.
> 

regards
Nirmala 







Re: [Pacemaker] Prioritized failover

2011-11-17 Thread Florian Haas
On 11/17/11 11:39, Nirmala wrote:
> Florian Haas  writes:
> 
>>
>> You can easily test this with the "Stateful" resource agent. Say you've
>> got 5 nodes, alice, bob, charlie, daisy and eric. alice is meant to be
>> your central master, bob is a slave to alice and a master to the three
>> others. alice and bob should be able to switch roles.
>>
>> The following example configuration is untested, but it should suffice
>> to illustrate the idea.
>>
>> node alice attributes class="central"
>> node bob attributes class="central"
>> node charlie attributes class="satellite"
>> node daisy attributes class="satellite"
>> node eric attributes class="satellite"
> 
> Is it possible for the nodes in "satellite" class to become "central" at any
> point of time ?

What I outlined was merely _one way_ of doing this, for you to easily
test with an existing resource agent. Of course, if you roll your own
RA, there are myriad other ways of doing it.
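If satellites do need to become promotable to "central", one conceivable approach is to rewrite a node's "class" attribute at runtime, since the location rules above key on it. Untested sketch; the `reclassify` helper is invented here and only prints the crm_attribute invocation, so it can be reviewed before being run against a live cluster:

```shell
#!/bin/sh
# Hypothetical helper: emit the command that would reclassify a node.
# Pipe the output to sh only after reviewing it on a test cluster.
reclassify() {
    node=$1
    cls=$2
    printf 'crm_attribute --type nodes --node %s --name class --update %s\n' \
        "$node" "$cls"
}

# Example: demote a restarted alice to satellite duty.
reclassify alice satellite
```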

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now



Re: [Pacemaker] Colocating resources on the same physical server when resources are run inside virtual servers

2011-11-17 Thread Andreas Kurz
Hello,

On 11/16/2011 09:04 PM, Attila Megyeri wrote:
> Hi Team,
> 
>  
> 
> Resources “A” and “B” are running within virtual servers. There are two
> physical servers “ph1” and “ph2” with two virtualized nodes on each.
> 
>  
> 
> What would be the easiest way to have a specific resource (e.g. resource
> "A") move to another node (From node "1" to node "2") in case when a
> different resource (e.g. "B")
> 
> moves from node 3 to node 4?
> 
>  
> 
> Resources "A" and "B" are independent, but nodes 1 and 3 are virtual
> servers running on physical host "ph1" whereas nodes 2 and 4 are virtual
> servers on physical host "ph2" and
> 
> the goal is to have resources "A" and "B" run on the same physical
> (host) server.

Is this one cluster or are these two independent clusters?

For one cluster, there is a (hidden, undocumented) feature: node
attributes can be used in colocation constraints. Unfortunately, the
crm shell is not aware of this feature, so you would have to manipulate
the CIB directly ... might be worth trying.
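For reference, such a constraint in raw CIB XML might look roughly like the following. This is untested, and the attribute name `phys-host` and the resource ids are invented for the example:

```xml
<!-- Each virtual node carries a phys-host node attribute (ph1 or ph2).
     node-attribute makes the colocation match on that attribute instead
     of the node name, so A and B land on VMs sharing a physical host. -->
<rsc_colocation id="c_same_phys_host" rsc="rsc_A" with-rsc="rsc_B"
                score="INFINITY" node-attribute="phys-host"/>
```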

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> 
>  
> 
> Thank you in advance,
> 
>  
> 
> Bests
> 
>  
> 
> Attila
> 
>  
> 
>  
> 
>  
> 
> 
> 







Re: [Pacemaker] Postgresql streaming replication failover - RAneeded

2011-11-17 Thread Buckingham, Brett
> Well I'm not sure I would be able to do that change. Failover is
> relatively easy to do but I really have no idea how to do the failback
> part.

> And that's exactly the reason why I haven't implemented it yet. With
> the way replication is currently done in PostgreSQL there is no easy
> way to switch between roles, or at least I don't know of such a way.
> Implementing just fail-over functionality by creating a trigger file on
> a slave server in the case of failure on the master side doesn't create
> a full master-slave implementation, in my opinion.

We have created just such a multi-state RA, which incorporates a design
to manage failover, failback, and fallback (regular backups).  Please
give us a few days - a member of my team is removing any
product-specifics from it, and we'll post it shortly.

Brett




Re: [Pacemaker] IPv6addr failure loopback interface

2011-11-17 Thread Lars Ellenberg
On Mon, Oct 24, 2011 at 02:57:24PM +0200, Arturo Borrero Gonzalez wrote:
> Hi there,
> 
> I'm working on deploying an Active/Active openldap cluster.
> 
> At first, I have 2 nodes.
> 
> I'm having some trouble with IPv6addr when trying to assign an IPv6 to the
> loopback interface.
> 
> The error is not very explicit:
> 
> IPv6addr: [1563]: ERROR: no valid mecahnisms //(yes, malformed word
> included)

And grepping for that malformed word in IPv6addr.c
would have been very easy.

Following the call path leads quickly to
scan_if(),
which has a comment that says:
/* Consider link-local addresses (scope == 0x20) only when
 * the inerface name is provided, and global addresses
 * (scope == 0). Skip everything else.
 */

where that 0x20 is what shows up in the 4th column of /proc/net/if_inet6,
not what IPv6 scope values are defined to be.

Apparently IPv6addr will only ever try to manage an address that shares
the scope and prefix of some existing one.

I suspect that IPv6addr would even work,
as soon as you have manually assigned one fc00::/7 address to lo.

The main reason for implementing it in C was to be able to
/* Send an unsolicited advertisement packet
 * Please refer to rfc4861 / rfc3542
 */

Which, well, does not appear to be useful on lo, anyways.

So if your shell thingy works for you, why not.

Anyways, if you change scan_if() (or whatever else is necessary) to e.g. just
use the provided if name, and not do sanity checks on scope and prefix, that
should be enough, and that patch should be fairly small.

Though I fail to understand the use case: why would I want to assign
globally routable IPv6 addresses to lo?

> Adding an IPv6 addr to the loopback interface is possible with ifconfig, so
> maybe I should write a new IPv6addr RA that manage IPv6addr on loopback with
> ifconfig.
> Something like this:
> 
> ifconfig lo add fc00::10/7
> ifconfig lo del fc00::10/7
> 
> To monitor:
> 
> ifconfig lo | grep fc00::10
> if [ $? -ne 0 ]; then
> don't have that ip on loopback
> else
> we have that ip on loopback
> fi
> 
> What do you think?
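As a variation on the quoted sketch, the check could also be written around `ip` instead of ifconfig, factored so the parsing is testable without touching a real interface. Untested sketch; `has_addr` is an invented helper:

```shell
#!/bin/sh
# has_addr reads `ip -6 addr show` output on stdin and succeeds if the
# given address is present. Keeping the parse separate from the `ip`
# call makes it easy to exercise with canned input; the trailing "/"
# in the pattern prevents fc00::10 from matching fc00::100.
has_addr() {
    grep -q "inet6 $1/"
}

# Real usage (assumed): ip -6 addr show dev lo | has_addr fc00::10
```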


On Fri, Oct 28, 2011 at 10:55:46PM +0200, Arturo Borrero Gonzalez wrote:
> In a previous mail, I reported some errors with IPv6addr assigning IPv6 to
> the loopback interface.
> 
> I've developed a RA that is able to manage an IPv6 in the main loopback
> interface of most linux systems: "lo".
> 
> 
> I put here the code, but you can also found it here:
> 
> http://pastebin.com/rsqz83V3
> http://ral-arturo.blogspot.com/2011/10/ipv6addrlo-asignando-ipv6-interfaz-de.html
> 
> #!/bin/bash
> #
> #   OCF Resource Agent compliant resource script.
> # Arturo Borrero  || October 2011
> #
> # Based on the anything RA.
> #
> # GPLv3 Licensed. You can read the license in
> # http://www.gnu.org/licenses/gpl-3.0.html
> #
> # Initialization:
> 
> : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/resource.d/heartbeat}
> . ${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs
> 
> # Custom vars:
> IFCONFIG_BIN="/sbin/ifconfig"
> GREP_BIN="grep"
> IFACE="lo"
> process=$OCF_RESOURCE_INSTANCE
> ipv6addr=$OCF_RESKEY_ipv6addr
> cidr_netmask=$OCF_RESKEY_cidr_netmask
> pidfile=$OCF_RESKEY_pidfile
> [ -z "$pidfile" ] && pidfile=${HA_VARRUN}IPv6addrLO_${process}.pid
> logfile=$OCF_RESKEY_logfile
> [ -z "$logfile" ] && logfile="/var/log/syslog"
> errlogfile=$OCF_RESKEY_errlogfile
> [ -z "$errlogfile" ] && errlogfile="/var/log/syslog"
> 
> 
> validate_ipv6(){
> ocf_log debug "Validating IPv6 addr: [\"$1\"]."
> 
> echo "$1" | $GREP_BIN -E
> "^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]
> |2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:)))(%.+)?\s*$"
> > /dev/null

Are you serious ;-)

>  if [ $? -eq 0 ]
> then

indentation seems a bit unusual?
maybe that's a mail client issue though.

> # the ipv6 is valid
>  ocf_log debug "IPv6 addr: [\"$1\"] is va

Re: [Pacemaker] Regarding Stonith RAs

2011-11-17 Thread Andrew Beekhof
On Thu, Nov 17, 2011 at 1:28 AM, Dejan Muhamedagic  wrote:
> Hi,
>
> On Wed, Nov 16, 2011 at 05:49:30PM +0530, neha chatrath wrote:
> [...]
>> Nov 14 13:16:57 ggns2mexsatsdp17.hsc.com lrmd: [3976]: notice:
>> on_msg_get_rsc_types: can not find this RA class stonith"
>
> The PILS plugin handling stonith resources was not found.
> Strange, cannot recall seeing this before.

Could be a RHEL6 based distro.

> It should be in
> /usr/lib/heartbeat/plugins/RAExec/stonith.so (or /usr/lib64
> depending on your installation). Please check permissions and if
> this file is really a valid so object file. If everything's in
> order no idea what else could be the reason. You could strace
> lrmd on startup and see what happens between lines 1137 and 1158.
>
> Thanks,
>
> Dejan
>
>
>>
>> Thanks and regards
>> Neha Chatrath
>>
>> > --
>> >
>> >
>> > Date: Wed, 16 Nov 2011 11:12:12 +0100
>> > From: Dejan Muhamedagic 
>> > To: The Pacemaker cluster resource manager
>> >        
>> > Subject: Re: [Pacemaker] Regarding Stonith RAs
>> > Message-ID: <2016101211.GA4938@walrus.homenet>
>> > Content-Type: text/plain; charset=us-ascii
>> >
>> > Hi,
>> >
>> > On Tue, Nov 15, 2011 at 08:44:45AM +0530, neha chatrath wrote:
>> > > Hello Dejan,
>> > >
>> > > I am using Cluster Glue version 1.0.7.
>> > > Also this does not seem to be a problem with a specific Stonith agent
>> > like
>> > > IPMI, I think it is more of an issue with all the Stonith agents.
>> > > I have tried configuring another test Stonith agent e.g. Suicide and I am
>> > > facing exactly the same issue.
>> >
>> > Looks like a broken installation. I guess that metadata for other
>> > resource classes works fine. It could be some issue with
>> > stonith-ng. Did you notice any messages from stonith-ng?
>> >
>> > Thanks,
>> >
>> > Dejan
>> >
>> > > Kindly please suggest.
>> > >
>> > > Thanks and regards
>> > > Neha Chatrath
>> > >
>> > > Date: Mon, 14 Nov 2011 15:41:43 +0100
>> > > From: Dejan Muhamedagic 
>> > > To: The Pacemaker cluster resource manager
>> > >        > > > >
>> > > Subject: Re: [Pacemaker] Regarding Stonith RAs
>> > > Message-ID: <2014144142.GA3735@squib>
>> > > Content-Type: text/plain; charset=us-ascii
>> > >
>> > > Hi,
>> > >
>> > > On Mon, Nov 14, 2011 at 02:05:49PM +0530, neha chatrath wrote:
>> > > > Hello,
>> > > > I am facing issue in configuring a Stonith resource in my system of
>> > > cluster
>> > > > with 2 nodes.
>> > > > Whenever I try to give the following command:
>> > > > "crm configure primitive app_fence stonith::external/ipmi params
>> > hostname=
>> > > > ggns2mexsatsdp17.hsc.com ipaddr=192.168.113.17 userid=root
>> > > > passwd=pass@abc123" ,
>> > > > I get the following errors:
>> > > >
>> > > > "ERROR: stonith:external/ipmi: could not parse meta-data:
>> > >
>> > > Which version of cluster-glue do you have installed? There is a
>> > > serious issue with external/ipmi in version 1.0.8, we'll make a
>> > > new release ASAP.
>> > >
>> > > Thanks,
>> > >
>> > > Dejan
>> > >
>> > >
>> > > On Mon, Nov 14, 2011 at 2:05 PM, neha chatrath > > >wrote:
>> > >
>> > > > Hello,
>> > > > I am facing issue in configuring a Stonith resource in my system of
>> > > > cluster with 2 nodes.
>> > > > Whenever I try to give the following command:
>> > > > "crm configure primitive app_fence stonith::external/ipmi params
>> > hostname=
>> > > > ggns2mexsatsdp17.hsc.com ipaddr=192.168.113.17 userid=root
>> > > > passwd=pass@abc123" ,
>> > > > I get the following errors:
>> > > >
>> > > > "ERROR: stonith:external/ipmi: could not parse meta-data:
>> > > > Traceback (most recent call last):
>> > > >   File "/usr/sbin/crm", line 41, in 
>> > > >     crm.main.run()
>> > > >   File "/usr/lib/python2.6/site-packages/crm/main.py", line 249, in run
>> > > >     if parse_line(levels,shlex.split(' '.join(args))):
>> > > >   File "/usr/lib/python2.6/site-packages/crm/main.py", line 145, in
>> > > > parse_line
>> > > >     lvl.release()
>> > > >   File "/usr/lib/python2.6/site-packages/crm/levels.py", line 68, in
>> > > > release
>> > > >     self.droplevel()
>> > > >   File "/usr/lib/python2.6/site-packages/crm/levels.py", line 87, in
>> > > > droplevel
>> > > >     self.current_level.end_game(self._in_transit)
>> > > >   File "/usr/lib/python2.6/site-packages/crm/ui.py", line 1524, in
>> > end_game
>> > > >     self.commit("commit")
>> > > >   File "/usr/lib/python2.6/site-packages/crm/ui.py", line 1425, in
>> > commit
>> > > >     self._verify(mkset_obj("xml","changed"),mkset_obj("xml"))
>> > > >   File "/usr/lib/python2.6/site-packages/crm/ui.py", line 1324, in
>> > _verify
>> > > >     rc2 = set_obj_semantic.semantic_check(set_obj_all)
>> > > >   File "/usr/lib/python2.6/site-packages/crm/cibconfig.py", line 280,
>> > in
>> > > > semantic_check
>> > > >     rc = self.__check_unique_clash(set_obj_all)
>> > > >   File "/usr/lib/python2.6/site-packages/crm/cibconfig.py", line 260,
>> > in
>> > 

Re: [Pacemaker] Colocating resources on the same physical server when resources are run inside virtual servers

2011-11-17 Thread Andrew Beekhof
On Fri, Nov 18, 2011 at 12:46 AM, Andreas Kurz  wrote:
> Hello,
>
> On 11/16/2011 09:04 PM, Attila Megyeri wrote:
>> Hi Team,
>>
>>
>>
>> Resources “A” and “B” are running within virtual servers. There are two
>> physical servers “ph1” and “ph2” with two virtualized nodes on each.
>>
>>
>>
>> What would be the easiest way to have a specific resource (e.g. resource
>> "A") move to another node (From node "1" to node "2") in case when a
>> different resource (e.g. "B")
>>
>> moves from node 3 to node 4?
>>
>>
>>
>> Resources "A" and "B" are independent, but nodes 1 and 3 are virtual
>> servers running on physical host "ph1" whereas nodes 2 and 4 are virtual
>> servers on physical host "ph2" and
>>
>> the goal is to have resources "A" and "B" run on the same physical
>> (host) server.
>
> Is this one cluster or are these two independent clusters?
>
> For one cluster there would the (hidden,undocumented) feature to use a
> node-attribute in colocation constraints

Did I not document it, or just not explain how useful it could be in
these cases? :-)

>  unfortunately the crm shell is
> not aware of this feature so you would have to manipulate the cib
> directly ... might be worth trying.
>
> Regards,
> Andreas
>
> --
> Need help with Pacemaker?
> http://www.hastexo.com/now
>
>>
>>
>>
>> Thank you in advance,
>>
>>
>>
>> Bests
>>
>>
>>
>> Attila
>>
>>
>>
>>
>>
>>
>>
>>
>>


Re: [Pacemaker] Colocating resources on the same physical server when resources are run inside virtual servers

2011-11-17 Thread Attila Megyeri
HI,


-Original Message-
From: Andrew Beekhof [mailto:and...@beekhof.net] 
Sent: 2011. november 18. 0:28
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Colocating resources on the same physical server when 
resources are run inside virtual servers

On Fri, Nov 18, 2011 at 12:46 AM, Andreas Kurz  wrote:
> Hello,
>
> On 11/16/2011 09:04 PM, Attila Megyeri wrote:
>> Hi Team,
>>
>>
>>
>> Resources "A" and "B" are running within virtual servers. There are 
>> two physical servers "ph1" and "ph2" with two virtualized nodes on each.
>>
>>
>>
>> What would be the easiest way to have a specific resource (e.g. 
>> resource
>> "A") move to another node (From node "1" to node "2") in case when a 
>> different resource (e.g. "B")
>>
>> moves from node 3 to node 4?
>>
>>
>>
>> Resources "A" and "B" are independent, but nodes 1 and 3 are virtual 
>> servers running on physical host "ph1" whereas nodes 2 and 4 are 
>> virtual servers on physical host "ph2" and
>>
>> the goal is to have resources "A" and "B" run on the same physical
>> (host) server.
>
> Is this one cluster or are these two independent clusters?


Well, my first idea was to make them independent, but I am ready to merge them 
as well :)


>
> For one cluster there would the (hidden,undocumented) feature to use a 
> node-attribute in colocation constraints

Did I not document it, or just not explain how useful it could be in these 
cases? :-)


So I guess I found something not very trivial :)

>  unfortunately the crm shell is
> not aware of this feature so you would have to manipulate the cib 
> directly ... might be worth trying.
>
> Regards,
> Andreas
>
> --
> Need help with Pacemaker?
> http://www.hastexo.com/now
>
>>
>>
>>
>> Thank you in advance,
>>
>>
>>
>> Bests
>>
>>
>>
>> Attila
>>
>>
>>
>>
>>
>>
>>
>>
>>


Re: [Pacemaker] [Drbd-dev] crm_attribute --quiet (was Fwd: [Linux-HA] Should This Worry Me?)

2011-11-17 Thread Andrew Beekhof
On Mon, Nov 14, 2011 at 9:56 PM, Lars Ellenberg
 wrote:
> On Mon, Nov 14, 2011 at 09:51:46AM +1100, Andrew Beekhof wrote:
>> > confused as to what the correct flag actually is. ocf:linbit:drbd (in
>> > both 8.3 and 8.4) uses "-Q" whereas Pacemaker expects "-q" as of this
>> > commit:
>> >
>> > commit c11ce5e9b0b13ead02b5fc4add928d7e7f95092e
>> > Author: Andrew Beekhof 
>> > Date:   Tue Sep 22 17:29:38 2009 +0200
>> >
>> >    Medium: Tools: Use -q as the short form for --quiet (for consistency)
>> >
>> >    Mercurial revision: 7289e661e4923beee4b7b45bc85592564ccdc438
>> >
>> > Should ocf:linbit:drbd be using "-q"?
>>
>> Correct.  Sorry about that.
>
>
> -Q is still accepted, though.

Ah, good. I do usually try and think about compatibility when I make
these sorts of changes.
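An RA that wants to stay compatible across versions could, in principle, probe which flag the installed tool advertises rather than hard-coding one. Untested sketch; `quiet_flag` and its fallback behaviour are assumptions for illustration, not something either agent actually does (the flag names are from this thread):

```shell
#!/bin/sh
# Prefer -q where the installed crm_attribute advertises it in its help
# text, and fall back to the older -Q otherwise.
quiet_flag() {
    if crm_attribute --help 2>&1 | grep -q -- '-q'; then
        printf '%s\n' '-q'
    else
        printf '%s\n' '-Q'
    fi
}
```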

> As it is accepted for a larger range of crm_attribute versions,
> I'll keep it for now.
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>


Re: [Pacemaker] killing corosync leaves crmd, stonithd, lrmd, cib and attrd to hog up the cpu

2011-11-17 Thread Andrew Beekhof
On Mon, Nov 14, 2011 at 10:32 PM, ihjaz Mohamed
 wrote:
> Hi All,
> As part of some robustness test for my cluster, I tried killing the corosync
> process using kill -9 . After this I see that the pacemakerd service is
> stopped but the processes crmd, stonithd, lrmd, cib and attrd are still
> running and are hogging up the cpu.

This is an old-ish[1] bug in the IPC code used by pacemaker to talk to corosync.
Try upgrading.

[1] Sufficiently long ago that I don't recall the version numbers anymore.

>
> top - 06:26:51 up  2:01,  4 users,  load average: 12.04, 12.01, 11.98
> Tasks: 330 total,  13 running, 317 sleeping,   0 stopped,   0 zombie
> Cpu(s):  7.1%us, 17.1%sy,  0.0%ni, 75.6%id,  0.1%wa,  0.0%hi,  0.0%si,
> 0.0%st
> Mem:   8015444k total,  4804412k used,  3211032k free,    54800k buffers
> Swap: 10256376k total,    0k used, 10256376k free,  1604464k cached
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2053 hacluste  RT   0 90492 3324 2476 R 100.0  0.0 113:40.61 crmd
>  2047 root  RT   0 81480 2108 1712 R 99.8  0.0 113:40.43 stonithd
>  2048 hacluste  RT   0 83404 5260 2992 R 99.8  0.1 113:40.90 cib
>  2050 hacluste  RT   0 85896 2388 1952 R 99.8  0.0 113:40.43 attrd
>  5018 root  20   0 8787m 345m  56m S  2.0  4.4   0:56.95 java
> 19017 root  20   0 15068 1252  796 R  2.0  0.0   0:00.01 top
>     1 root  20   0 19232 1444 1156 S  0.0  0.0   0:01.71 init
>     2 root  20   0 0    0    0 S  0.0  0.0   0:00.00 kthreadd
>     3 root  RT   0 0    0    0 S  0.0  0.0   0:00.00 migration/0
>     4 root  20   0 0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
>
>
> Is there a way to cleanup these processes ? OR Do I need to kill them one by
> one before respawning the corosync?
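Until an upgrade is possible, one hedged way to script the cleanup asked about above: generate the kill commands from the process names seen in the `top` output and review them before running. The names are taken from this thread; -x is used so only exact process-name matches are hit:

```shell
#!/bin/sh
# Emit (do not execute) cleanup commands for the orphaned children.
# After reviewing, run:  cleanup_cmds | sh   then restart corosync.
cleanup_cmds() {
    for p in crmd stonithd lrmd cib attrd; do
        printf 'pkill -9 -x %s\n' "$p"
    done
}

cleanup_cmds
```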
>


[Pacemaker] Modify resource agent.

2011-11-17 Thread Mark Gardner
I'm using a commercial cluster aware filesystem appliance called Panasas.
The OCF:Heartbeat:Filesystem resource agent politely declined to mount the
filesystem claiming that it was not a cluster aware filesystem.

For the time being I simply made a copy of the Filesystem resource agent
and added the file system type (panfs) alongside gfs, nfs, ocfs2.
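For illustration, the kind of change involved can be modelled as a tiny predicate; the actual Filesystem RA's list of cluster filesystems differs in detail, so treat the names below as examples only:

```shell
#!/bin/sh
# Toy version of the "is this a cluster-aware filesystem?" check that
# the Filesystem RA performs, with panfs added alongside the usual
# suspects. The list here is illustrative, not the RA's real list.
is_cluster_fs() {
    case "$1" in
        ocfs2|gfs2|nfs|nfs4|cifs|panfs) return 0 ;;
        *) return 1 ;;
    esac
}
```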

Would it be appropriate to submit a patch/enhancement request to have this
added to the resource agent?  How would I go about submitting it?

I'd like to keep getting any future updates to the Resource Agent and have
the panfs filesystem available for use.

-- 

   ~ Mark
 Gardner ~
If it were easy everyone would do it.  Hard is what keeps out the riffraff.
***


Re: [Pacemaker] Colocating resources on the same physical server when resources are run inside virtual servers

2011-11-17 Thread Andrew Beekhof
On Fri, Nov 18, 2011 at 10:33 AM, Attila Megyeri
 wrote:
> HI,
>
>
> -Original Message-
> From: Andrew Beekhof [mailto:and...@beekhof.net]
> Sent: 2011. november 18. 0:28
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Colocating resources on the same physical server 
> when resources are run inside virtual servers
>
> On Fri, Nov 18, 2011 at 12:46 AM, Andreas Kurz  wrote:
>> Hello,
>>
>> On 11/16/2011 09:04 PM, Attila Megyeri wrote:
>>> Hi Team,
>>>
>>>
>>>
>>> Resources "A" and "B" are running within virtual servers. There are
>>> two physical servers "ph1" and "ph2" with two virtualized nodes on each.
>>>
>>>
>>>
>>> What would be the easiest way to have a specific resource (e.g.
>>> resource
>>> "A") move to another node (From node "1" to node "2") in case when a
>>> different resource (e.g. "B")
>>>
>>> moves from node 3 to node 4?
>>>
>>>
>>>
>>> Resources "A" and "B" are independent, but nodes 1 and 3 are virtual
>>> servers running on physical host "ph1" whereas nodes 2 and 4 are
>>> virtual servers on physical host "ph2" and
>>>
>>> the goal is to have resources "A" and "B" run on the same physical
>>> (host) server.
>>
>> Is this one cluster or are these two independent clusters?
>
>
> Well, my first idea was to make them independent, but I am ready to merge 
> them as well :)
>
>
>>
>> For one cluster there would the (hidden,undocumented) feature to use a
>> node-attribute in colocation constraints
>
> Did I not document it, or just not explain how useful it could be in these 
> cases? :-)
>
>
> So I guess I found something not very trivial :)

Actually it's pretty trivial to configure, it's just not very common.

>
>>  unfortunately the crm shell is
>> not aware of this feature so you would have to manipulate the cib
>> directly ... might be worth trying.
>>
>> Regards,
>> Andreas
>>
>> --
>> Need help with Pacemaker?
>> http://www.hastexo.com/now
>>
>>>
>>>
>>>
>>> Thank you in advance,
>>>
>>>
>>>
>>> Bests
>>>
>>>
>>>
>>> Attila
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>