Re: [ClusterLabs] Pacemaker (remote) component relations

2016-02-08 Thread Ken Gaillot
On 02/08/2016 07:55 AM, Ferenc Wágner wrote:
> Hi,
> 
> I'm looking for information about the component interdependencies,
> because I'd like to split the Pacemaker packages in Debian properly.
> The current idea is to create two daemon packages, pacemaker and
> pacemaker-remote, which exclude each other, as they contain daemons
> listening on the same sockets.
> 
> 1. Are the socket names configurable?  Are there reasonable use cases
>requiring both daemons running concurrently?

No, they are exclusive. Pacemaker Remote simulates the cluster services,
so they listen on the same sockets. They can be installed on the same
machine, just not running at the same time.
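
For example, on a systemd-based distro you could keep both packages
installed and just make sure only one daemon is enabled at a time,
something like:

    # on a full cluster node:
    systemctl disable --now pacemaker_remote
    systemctl enable --now pacemaker

    # on a remote node, the reverse:
    systemctl disable --now pacemaker
    systemctl enable --now pacemaker_remote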

> These two daemon packages would depend on a package providing the common
> hacluster user, the haclient group, the sysconfig and logrotate config.
> 
> What else should go here?
> 
> 2. Are the various RNG and XSL files under /usr/share/pacemaker used
>equally by pacemakerd and pacemaker_remoted?  Or by the CLI utils?

Yes, the RNG and XSL files should be considered common.

> 3. Maybe the ocf:pacemaker RAs and their man pages?  Or are they better
>in a separate pacemaker-resource-agents package?

If I were starting from scratch, I would consider a separate package.
Upstream, we recently moved most of the agents to the CLI package, which
is also a good alternative.

Note that a few agents can't run on remote nodes, and we've left those
in the pacemaker package upstream. These are remote (obviously),
controld and o2cb (I believe because the services they start require
direct access to fencing).

> 4. Is /usr/share/snmp/mibs/PCMK-MIB.txt used by any component, or is it
>only for deciphering SNMP traps at their destination?  I guess it can
>go anywhere, it will be copied by the admin anyway.

The ClusterMon resource uses it, and user scripts can use it. I'd vote
for common, as it's architecture-neutral and theoretically usable by
anything.

> There's also a separate package for the various command line utilities,
> which would depend on pacemaker OR pacemaker-remote.
> 
> 5. crm_attribute is in the pacemaker RPM package, unlike the other
>utilities, which are in pacemaker-cli.  What's the reason for this?
> 
> 6. crm_node connects to corosync directly, so it won't work with
>pacemaker_remote.  crm_master uses crm_node, thus it won't either (at
>least without -N).  But section 4.3.5 of the Remote book explicitly mentions
>crm_master amongst the tools usable with pacemaker_remote.  Where's
>the mistake?

crm_attribute and crm_node both depend on the cluster-layer libraries,
which won't necessarily be available on a remote node. We hope to remove
that dependency at some point.

It is possible to install the cluster libraries on a remote node, and
some of the crm_attribute/crm_node functionality will work, though not all.

> 7. According to its man page, crm_master should be invoked from an OCF
>resource agent, which could happen under pacemaker_remote.  This is
>again problematic due to the previous point.

I'd have to look into this to be sure, but it's possible that if the
cluster libs are installed, the particular options that crm_master uses
are functional.
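
If it helps, crm_master is a thin wrapper around crm_attribute, and
agents typically call it along these lines:

    # inside an OCF resource agent:
    crm_master -l reboot -v 100   # set this node's promotion preference
    crm_master -l reboot -D       # clear it again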

> 8. Do fence_legacy, fence_pcmk and stonith_admin make any sense outside
>the pacemaker package?  Even if they can run on top of
>pacemaker_remote, the cluster would never use them there, right?
>And what about attrd_updater?  To be honest, I don't know what that's
>for anyway...

No fence agents run on remote nodes.

I'd expect stonith_admin and attrd_updater to work, as the remote node
will proxy socket connections to stonithd and attrd. attrd_updater is an
alternative interface similar to crm_attribute.
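
For example, roughly (my-attr is just a placeholder name):

    # transient node attribute via attrd (proxied on a remote node):
    attrd_updater --name my-attr --update 42
    attrd_updater --name my-attr --query

    # similar, but via the CIB, and optionally permanent:
    crm_attribute --name my-attr --update 42 --lifetime forever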

> 9. What's the purpose of chgrp haclient /etc/pacemaker if
>pacemaker_remoted is running as root?  What other haclients may need
>access to the authkey file?

The crmd (which runs as hacluster:haclient) manages the cluster side of
a remote node connection, so on the cluster nodes, the key has to be
readable by that process. In the how-to's, I kept it consistent on the
remote nodes to reduce confusion and the chance for errors.
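
For reference, the key setup from the Remote docs is the same on
cluster and remote nodes, roughly:

    mkdir -p --mode=0750 /etc/pacemaker
    chgrp haclient /etc/pacemaker
    dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
    chown root:haclient /etc/pacemaker/authkey
    chmod 640 /etc/pacemaker/authkey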



Re: [ClusterLabs] Hawk release 2.0

2016-02-08 Thread Jorge Fábregas
On 02/08/2016 11:30 AM, Kristoffer Grönlund wrote:
> It is my great pleasure to announce that Hawk 2.0.0 is released! 

Hi Kristoffer,

Thanks for the announcement.  It looks good!

Will this make it into SLES 12?

Thanks,
Jorge



Re: [ClusterLabs] Hawk release 2.0

2016-02-08 Thread Kristoffer Grönlund
Jorge Fábregas  writes:

> On 02/08/2016 11:30 AM, Kristoffer Grönlund wrote:
>> It is my great pleasure to announce that Hawk 2.0.0 is released! 
>
> Hi Kristoffer,
>
> Thanks for the announcement.  It looks good!
>
> Will this make it into SLES 12?
>

Hi, and thank you!

There is a hawk2 package available in SLE HAE 12 SP1. It is not exactly
the same as this release, but very close.

By default, the old version of Hawk is still installed, so to get the
new version you need to install the hawk2 package instead.

Cheers,
Kristoffer

> Thanks,
> Jorge
>

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



[ClusterLabs] Working with 2 VIPs

2016-02-08 Thread Louis Chanouha

Hello,
I'm not sure if this mailing list is the proper place to send my request;
please tell me where I should send it if not :)


I have a use case that I currently can't implement with corosync + pacemaker.

I have two nodes, two VIPs and two services (one duplicated), in order
to provide an active/active service (2 physical sites).
In the normal situation, each VIP is associated with one node via a preferred
location, and the service runs on both nodes (cloned).


In a failure situation, I want the working node to take over the IP of the
other host without migrating the service (it listens on 0.0.0.0). That is,
failover should happen when:

 - the service is down - currently not working
 - the node is down (network or OS layer) - currently working

I can't find the proper way to model this problem with the
group/colocation/order notions of pacemaker. I would be happy if you
could give me some thoughts on appropriate options.



Thank you in advance for your help.
Sorry for my non-native English.

Louis Chanouha

**

My current configuration is below. I can translate it to XML if you
need it.


node Gollum
node edison
primitive cups lsb:cups \
        op monitor interval="2s"
primitive vip_edison ocf:heartbeat:IPaddr2 \
        params nic="eth0" ip="10.1.9.18" cidr_netmask="24" \
        op monitor interval="2s"
primitive vip_gollum ocf:heartbeat:IPaddr2 \
        params nic="eth0" ip="10.1.9.23" cidr_netmask="24" \
        op monitor interval="2s"
clone ha-cups cups
location pref_edison vip_edison 50: edison
location pref_gollum vip_gollum 50: Gollum
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
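
I wonder if simply colocating each VIP with the cloned service would be
enough, something like (untested sketch):

    colocation vip_edison_with_cups inf: vip_edison ha-cups
    colocation vip_gollum_with_cups inf: vip_gollum ha-cups

Would that move a VIP to the surviving node when either the service or
the node fails, while the 50-point location preferences keep the VIPs
apart in normal operation?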



--

*Louis Chanouha | Systems and Network Engineer*
Service Numérique de l'Université de Toulouse
*Université Fédérale Toulouse Midi-Pyrénées*
15 rue des Lois - BP 61321 - 31013 Toulouse Cedex 6
Tel.: +33(0)5 61 10 80 45 / internal ext.: 18045

louis.chano...@univ-toulouse.fr
Facebook | Twitter | www.univ-toulouse.fr




Re: [ClusterLabs] Two bugs in fence_ec2 script

2016-02-08 Thread Steve Marshall
Do you know where fence_ec2 is officially revision controlled?  Is there an
"official" release maintained anywhere?  I've found a number of different
versions online, but I haven't been able to find any that claim to be the
repository for debugging and development.

Thanks for any guidance you can give.

On Mon, Feb 8, 2016 at 8:33 AM, Marek marx Grác  wrote:

> Hi,
>
> fence_ec2 is not part of the official upstream
> (https://github.com/ClusterLabs/fence-agents/) yet. There are various
> issues that block it; see the archives if you are interested.
>
> Anyway, thanks for the patch and I hope people will find it there.
>
> m,
>
>
>
>
>
> On 5 February 2016 at 16:11:02, Steve Marshall (steve.marsh...@weather.com)
> wrote:
> > I've found two bugs in the fence_ec2 script found in this RPM:
> > fence_ec2-0.1-0.10.1.x86_64.rpm
> >
> > I believe this is the latest and greatest, but I may be wrong. I'm not
> > sure where to post these fixes, so I'm starting here. If there is a
> better
> > place to post, please let me know.
> >
> > The errors are:
> > 1. The tag attribute does not default to "Name", as specified in the
> > documentation
> > 2. The script does not accept attribute settings from stdin
> >
> > #1 is not critical, since tag=Name can be specified explicitly when
> setting
> > up a stonith resource. However, #2 is very important, since stonith_admin
> > (and I think most of pacemaker) passes arguments to fencing scripts via
> > stdin. Without this fix, fence_ec2 will not work properly via pacemaker,
> > that is:
> >
> > This command works
> > fence_ec2 --action=metadata
> >
> > ...but this alternate version of same command does not:
> > echo "action=metadata" | fence_ec2
> >
> > The fixes are relatively trivial. The version of fence_ec2 from the RPM
> is
> > attached as fence_ec2.old. My modified version is attached as
> > fence_ec2.new. I've also attached the RPM that was the source for
> > fence_ec2.old.


Re: [ClusterLabs] Two bugs in fence_ec2 script

2016-02-08 Thread Kristoffer Grönlund
Marek marx Grác  writes:

> Hi,
>
> fence_ec2 is not part of the official upstream
> (https://github.com/ClusterLabs/fence-agents/) yet. There are various
> issues that block it; see the archives if you are interested.
>
> Anyway, thanks for the patch and I hope people will find it there.
>
> m,
>

A fork of fence_ec2 has been merged into cluster-glue as
stonith:external/ec2. It has already received additional patches
since being merged there:

http://hg.linux-ha.org/glue/file/56f40ec5d37e/lib/plugins/stonith/external/ec2

Error #1 as described below has already been fixed in the cluster-glue
version. The second issue (#2) is not a problem for glue stonith agents.
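
For context, fence-agents-style scripts are expected to read their
options as key=value lines on stdin (which is how pacemaker and
stonith_admin invoke them), while glue stonith plugins get their options
as environment variables. A minimal sketch of the stdin convention, with
made-up option names:

    # read "option=value" lines from stdin:
    while IFS='=' read -r key value; do
        case "$key" in
            action) action="$value" ;;
            tag)    tag="$value" ;;
        esac
    done

so 'echo "action=metadata" | fence_foo' must behave the same as
'fence_foo --action=metadata' (fence_foo being a hypothetical agent).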

Cheers,
Kristoffer

>
>
>
>
> On 5 February 2016 at 16:11:02, Steve Marshall (steve.marsh...@weather.com) 
> wrote:
>> I've found two bugs in the fence_ec2 script found in this RPM:
>> fence_ec2-0.1-0.10.1.x86_64.rpm
>>  
>> I believe this is the latest and greatest, but I may be wrong. I'm not
>> sure where to post these fixes, so I'm starting here. If there is a better
>> place to post, please let me know.
>>  
>> The errors are:
>> 1. The tag attribute does not default to "Name", as specified in the
>> documentation
>> 2. The script does not accept attribute settings from stdin
>>  
>> #1 is not critical, since tag=Name can be specified explicitly when setting
>> up a stonith resource. However, #2 is very important, since stonith_admin
>> (and I think most of pacemaker) passes arguments to fencing scripts via
>> stdin. Without this fix, fence_ec2 will not work properly via pacemaker,
>> that is:
>>  
>> This command works
>> fence_ec2 --action=metadata
>>  
>> ...but this alternate version of same command does not:
>> echo "action=metadata" | fence_ec2
>>  
>> The fixes are relatively trivial. The version of fence_ec2 from the RPM is
>> attached as fence_ec2.old. My modified version is attached as
>> fence_ec2.new. I've also attached the RPM that was the source for
>> fence_ec2.old.

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] Two bugs in fence_ec2 script

2016-02-08 Thread Marek marx Grác
Hi,

fence_ec2 is not part of the official upstream
(https://github.com/ClusterLabs/fence-agents/) yet. There are various
issues that block it; see the archives if you are interested.

Anyway, thanks for the patch and I hope people will find it there.

m,





On 5 February 2016 at 16:11:02, Steve Marshall (steve.marsh...@weather.com) 
wrote:
> I've found two bugs in the fence_ec2 script found in this RPM:
> fence_ec2-0.1-0.10.1.x86_64.rpm
>  
> I believe this is the latest and greatest, but I may be wrong. I'm not
> sure where to post these fixes, so I'm starting here. If there is a better
> place to post, please let me know.
>  
> The errors are:
> 1. The tag attribute does not default to "Name", as specified in the
> documentation
> 2. The script does not accept attribute settings from stdin
>  
> #1 is not critical, since tag=Name can be specified explicitly when setting
> up a stonith resource. However, #2 is very important, since stonith_admin
> (and I think most of pacemaker) passes arguments to fencing scripts via
> stdin. Without this fix, fence_ec2 will not work properly via pacemaker,
> that is:
>  
> This command works
> fence_ec2 --action=metadata
>  
> ...but this alternate version of same command does not:
> echo "action=metadata" | fence_ec2
>  
> The fixes are relatively trivial. The version of fence_ec2 from the RPM is
> attached as fence_ec2.old. My modified version is attached as
> fence_ec2.new. I've also attached the RPM that was the source for
> fence_ec2.old.


[ClusterLabs] crmsh configure delete for constraints

2016-02-08 Thread Vladislav Bogdanov
Hi,

when performing a delete operation with -F, crmsh (2.2.0) tries to stop
the passed arguments (as resources) and then waits for the DC to become idle.

That is not needed if only constraints are passed to delete.
Could that be changed? Or, could it wait only if there is something to stop?

Something like this:
diff --git a/modules/ui_configure.py b/modules/ui_configure.py
index cf98702..96ab77e 100644
--- a/modules/ui_configure.py
+++ b/modules/ui_configure.py
@@ -552,6 +552,9 @@ class CibConfig(command.UI):
             if not ok or not cib_factory.commit():
                 raise ValueError("Failed to stop one or more running resources: %s" %
                                  (', '.join(to_stop)))
+            return True
+        else:
+            return False
 
     @command.skill_level('administrator')
     @command.completers_repeating(_id_list)
@@ -562,8 +565,8 @@ class CibConfig(command.UI):
         arg_force = any((x in ('-f', '--force')) for x in argl)
         argl = [x for x in argl if (x not in ('-f', '--force'))]
         if arg_force or config.core.force:
-            self._stop_if_running(argl)
-            utils.wait4dc(what="Stopping %s" % (", ".join(argl)))
+            if self._stop_if_running(argl):
+                utils.wait4dc(what="Stopping %s" % (", ".join(argl)))
         return cib_factory.delete(*argl)
 
     @command.name('default-timeouts')


Also, it may be worth checking the stop-orphan-resources property and
passing the stop work to pacemaker if it is set to true.
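
(For reference, the current value can be checked with something like

    crm_attribute --type crm_config --name stop-orphan-resources --query

and it defaults to true.)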


Thank you,
Vladislav



Re: [ClusterLabs] crmsh configure delete for constraints

2016-02-08 Thread Kristoffer Grönlund
Vladislav Bogdanov  writes:

> Hi,
>
> when performing a delete operation with -F, crmsh (2.2.0) tries to stop
> the passed arguments (as resources) and then waits for the DC to become idle.
>
> That is not needed if only constraints are passed to delete.
> Could that be changed? Or, could it wait only if there is something to stop?
>
[.. snipped patch ..]
>
>
> More, it may be worth checking stop-orphan-resources property and pass stop
> work to pacemaker if it is set to true.
>

Hi,

Thanks for the idea and patch, sounds good to me at least. I didn't know
about stop-orphan-resources. You are right, it sounds like we should be
checking that.

I'll test and see if it works as expected.

Cheers,
Kristoffer

>
> Thank you,
> Vladislav
>

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] DLM fencing

2016-02-08 Thread Digimer
On 08/02/16 03:55 PM, G Spot wrote:
> Hi Ken,
> 
> I am trying to create shared storage with clvm/gfs2, and when I try to
> fence I only see the scsi option, but my storage is connected through FC.
> Are there other ways I can fence my 1G stonith device besides scsi?

Fencing of a lost node with clvmd/gfs2 is no different from normal
cluster fencing. To be clear, DLM does NOT fence; it simply waits for
the cluster to fence. So you can use IPMI, switched PDUs or whatever
else is available in your environment.
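
For example, an IPMI-based fence device in crm shell syntax would look
something like this (addresses and credentials are placeholders):

    primitive fence-node1 stonith:fence_ipmilan \
        params pcmk_host_list="node1" ipaddr="192.168.122.101" \
               login="admin" passwd="secret" lanplus="1" \
        op monitor interval="60s"
    location fence-node1-placement fence-node1 -inf: node1

The location rule just keeps a node from running its own fence device.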

> On Mon, Feb 8, 2016 at 2:03 PM, Digimer wrote:
> 
> On 08/02/16 01:56 PM, Ferenc Wágner wrote:
> > Ken Gaillot writes:
> >
> >> On 02/07/2016 12:21 AM, G Spot wrote:
> >>
> >>> Thanks for your response. I am using the ocf:pacemaker:controld
> >>> resource agent with stonith-enabled=false; do I need to configure a
> >>> stonith device to make this work?
> >>
> >> Correct. DLM requires access to fencing.
> >
> > I've been meaning to explore this connection for a long time, but never found much
> > useful material on the subject.  How does DLM fencing fit into the
> > modern Pacemaker architecture?  Fencing is a confusing topic in itself
> > already (fence_legacy, fence_pcmk, stonith, stonithd, stonith_admin),
> > then dlm_controld can use dlm_stonith to proxy fencing requests to
> > Pacemaker, and it becomes hopeless... :)
> >
> > I'd be grateful for a pointer to a good overview document, or a quick
> > sketch if you can spare the time.  To invoke some concrete questions:
> > When does DLM fence a node?  Is it necessary only when there's no
> > resource manager running on the cluster?  Does it matter whether
> > dlm_controld is run as a standalone daemon or as a controld resource?
> > Wouldn't Pacemaker fence a failing node itself all the same?  Or is
> > dlm_stonith for the case when only the stonithd component of Pacemaker
> > is active somehow?
> 
> DLM is a thing unto itself, and some tools like gfs2 and clustered-lvm
> use it to coordinate locking across the cluster. If a node drops out,
> the cluster informs dlm and it blocks until the lost node is confirmed
> fenced. Then it reaps the lost locks and recovery can begin.
> 
> If fencing fails or is not configured, DLM never unblocks and anything
> using it is left hung (by design, better to hang than risk corruption).
> 
> One of many reasons why fencing is critical.
> 
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
> 


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org