Re: [ClusterLabs] Notifications on changes in clustered LVM

2017-06-23 Thread Jan Pokorný
On 20/06/17 10:59 +0200, Ferenc Wágner wrote:
> Digimer  writes:
> 
>> On 19/06/17 11:40 PM, Andrei Borzenkov wrote:
>>> udev events are sent over netlink, not D-Bus.
>> 
>> I've not used that before. Any docs on how to listen for those events,
>> by chance? If nothing off hand, don't worry, I can search.
> 
> Or just configure udev to run appropriate programs on the events you're
> interested in.  Less efficient, but simpler.
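
Roughly, such a rule could look like this (a sketch only; the rule file
name and the hook script path are hypothetical, and the dm-* match
assumes it is device-mapper/LVM devices you care about):

  # /etc/udev/rules.d/99-clvm-notify.rules (hypothetical name)
  # run a custom hook whenever a device-mapper block device changes
  ACTION=="change", SUBSYSTEM=="block", KERNEL=="dm-*", RUN+="/usr/local/bin/clvm-notify.sh $env{DEVNAME}"

  # have udevd pick up the new rule afterwards
  udevadm control --reload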

FYI, in the past I experimented with filtering the events obtained from
"udevadm monitor" as an ordinary (non-root) user, which used to work
happily out of the box:
https://fedorapeople.org/cgit/jpokorny/public_git/uudev.git/
(in that example, I was running a custom script when a particular
keyboard was attached, I believe).

It's rather a poor man's solution, but the advantage is that it doesn't
require root privileges for anything, and it may suffice for some use
cases (not sure whether it does for yours, Digimer, as well).
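
For completeness, the same idea can also be spelled out as a plain shell
pipeline (a rough sketch; the hook script path is hypothetical, and the
DM_LV_NAME property assumes the stock LVM/device-mapper udev rules are
installed):

  udevadm monitor --udev --property --subsystem-match=block |
  while read -r line; do
      case "$line" in
          DM_LV_NAME=*)
              # an LVM/device-mapper event carrying the LV name arrived
              /path/to/your-hook.sh "${line#DM_LV_NAME=}"
              ;;
      esac
  done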

-- 
Jan (Poki)




Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Ken Gaillot
On 06/23/2017 11:52 AM, Dimitri Maziuk wrote:
> On 06/23/2017 11:24 AM, Jan Pokorný wrote:
> 
>> People using ifdown or the iproute-based equivalent seem far too
>> prevalent, even though, to long-time bystanders, the idea looks like
>> it has been disproved ad nauseam.
> 
> Has anyone had a network card fail recently, and what does that look
> like on modern kernels? That's an honest question; I haven't seen that
> in forever (fingers crossed, knock on wood).
> 
> I.e., is the expectation that a real-life failure will be "nice" to
> corosync actually warranted?

I don't think there is such an expectation. If I understand correctly,
the issue with using ifdown as a test is two-fold: it's not a good
simulation of a typical network outage, and corosync is unable to
recover from an interface that goes down and later comes back up, so you
can only test the "down" part. Implementing some sort of recovery
mechanism in that situation is a goal for corosync 3, I believe.



Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Dimitri Maziuk
On 06/23/2017 11:24 AM, Jan Pokorný wrote:

> People using ifdown or the iproute-based equivalent seem far too
> prevalent, even though, to long-time bystanders, the idea looks like
> it has been disproved ad nauseam.

Has anyone had a network card fail recently, and what does that look
like on modern kernels? That's an honest question; I haven't seen that
in forever (fingers crossed, knock on wood).

I.e., is the expectation that a real-life failure will be "nice" to
corosync actually warranted?

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Jan Pokorný
On 23/06/17 08:48 -0500, Ken Gaillot wrote:
> On 06/22/2017 09:44 PM, Hui Xiang wrote:
>>   I have set up 3 nodes (node-1, node-2, node-3) as controller nodes,
>> and a VIP is assigned by Pacemaker among them. After I manually took
>> the management interface (used by corosync) down on node-1, while it
>> still had connectivity to the public/non-management network
> 
> How did you take the cluster interface down? If you're blocking it via
> firewall, be aware that you have to block *outbound* traffic on the
> corosync port.

People using ifdown or the iproute-based equivalent seem far too
prevalent, even though, to long-time bystanders, the idea looks like
it has been disproved ad nauseam.

I've added some previous posts on that topic to the lists-dedicated
scratchpad on the ClusterLabs wiki:
http://wiki.clusterlabs.org/w/index.php?title=Lists%27_Digest=1601=1587#To_Avoid

-- 
Jan (Poki)




Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Ken Gaillot
On 06/22/2017 09:44 PM, Hui Xiang wrote:
> Hi guys,
> 
>   I have set up 3 nodes (node-1, node-2, node-3) as controller nodes,
> and a VIP is assigned by Pacemaker among them. After I manually took
> the management interface (used by corosync) down on node-1, while it
> still had connectivity to the public/non-management network, I expected
> the VIP on node-1 to be stopped/removed by Pacemaker, since this node
> lost connection with the other two nodes. However, there are now two
> VIPs in the cluster. Below is my configuration:
> 
> [node-1]
> Online: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>  vip__public_old    (ocf::es:ns_IPaddr2):    Started node-1.domain.tld
> 
> [node-2 node-3]
> Online: [ node-2.domain.tld node-3.domain.tld ]
> OFFLINE: [ node-1.domain.tld ]
>  vip__public_old    (ocf::es:ns_IPaddr2):    Started node-3.domain.tld
> 
> 
> My question is: am I missing any configuration? How can I get the VIP
> removed on node-1? Shouldn't crm status on node-1 show:
> [node-1]
> Online: [ node-1.domain.tld ]
> OFFLINE: [ node-2.domain.tld node-3.domain.tld ]
> 
> 
> Thanks much.
> Hui.

Hi,

How did you take the cluster interface down? If you're blocking it via
firewall, be aware that you have to block *outbound* traffic on the
corosync port.
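
For reference, a firewall-based simulation might look roughly like this
(a sketch; it assumes iptables and corosync's default UDP port 5405, so
adjust to your ring configuration and firewall tooling):

  # drop corosync traffic in both directions on the node under test
  iptables -A INPUT  -p udp --dport 5405 -j DROP
  iptables -A OUTPUT -p udp --dport 5405 -j DROP

  # later, undo the simulated outage
  iptables -D INPUT  -p udp --dport 5405 -j DROP
  iptables -D OUTPUT -p udp --dport 5405 -j DROP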

Do you have stonith working? When the cluster loses a node, it recovers
by fencing it.
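
In case it helps, a rough checklist of the settings involved (pcs syntax
shown purely as an assumption; crmsh has equivalents):

  pcs property show stonith-enabled    # should be true for reliable recovery
  pcs property show no-quorum-policy   # the default "stop" makes an isolated node drop the VIP
  pcs stonith show                     # at least one configured (and tested) fence device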

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org