Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Prasad, Shashank
> I don't think that having a hook that bypasses stonith is the right way…. The intention is NOT to bypass STONITH. STONITH shall always remain active and an integral part of the cluster. The discussion is about bailing out of situations where STONITH itself fails due to fencing agent

Re: [ClusterLabs] why resources are restarted when a node rejoins a cluster?

2017-07-24 Thread Digimer
On 2017-07-24 11:04 PM, ztj wrote: Hi all, I have 2 CentOS nodes with Heartbeat and Pacemaker 1.1.13 installed, and almost everything is working fine. I have only Apache configured for

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Ken Gaillot
On Mon, 2017-07-24 at 21:29 +, Tomer Azran wrote: > I tend to agree with Klaus – I don't think that having a hook that > bypasses stonith is the right way. It is better to not use stonith at > all. > > I think I will try to use an iSCSI target on my qdevice and set SBD to > use it. Certainly,

Re: [ClusterLabs] resources do not migrate although node is going to standby

2017-07-24 Thread Ken Gaillot
On Mon, 2017-07-24 at 20:52 +0200, Lentes, Bernd wrote: > Hi, > > just to be sure: > I have a VirtualDomain resource (called prim_vm_servers_alive) running on one > node (ha-idg-2). For reasons I don't remember I have a location constraint: > location cli-prefer-prim_vm_servers_alive

Re: [ClusterLabs] timeout for stop VirtualDomain running Windows 7

2017-07-24 Thread Ken Gaillot
On Mon, 2017-07-24 at 19:30 +0200, Lentes, Bernd wrote: > Hi, > > I have a VirtualDomain resource running a Windows 7 client. This is the > respective configuration: > > primitive prim_vm_servers_alive VirtualDomain \ > params config="/var/lib/libvirt/images/xml/Server_Monitoring.xml" \

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
I tend to agree with Klaus – I don't think that having a hook that bypasses stonith is the right way. It is better to not use stonith at all. I think I will try to use an iSCSI target on my qdevice and set SBD to use it. I still don't understand why qdevice can't take the place of SBD with shared
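
For illustration, a minimal sketch of that idea, assuming the iSCSI LUN shows up on both nodes under a stable path such as /dev/disk/by-id/scsi-SBD (the path is a hypothetical name):

    # initialize the SBD slots on the shared disk (run once, from one node)
    sbd -d /dev/disk/by-id/scsi-SBD create
    # point sbd at the device in /etc/sysconfig/sbd on every node
    SBD_DEVICE="/dev/disk/by-id/scsi-SBD"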

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Kristián Feldsam
Yes, I just had an idea: he probably has a managed switch or fabric...

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Klaus Wenninger
On 07/24/2017 09:46 PM, Kristián Feldsam wrote: > so why not use some other fencing method, like disabling the port on the > switch, so nobody can access the faulty node and write data to it? It is > common practice too. Well, don't get me wrong here. I don't want to hard-sell sbd. Just thought that very likely

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Kristián Feldsam
So why not use some other fencing method, like disabling the port on the switch, so nobody can access the faulty node and write data to it? It is common practice too.

Re: [ClusterLabs] resources do not migrate although node is going to standby

2017-07-24 Thread Kristián Feldsam
Hmm, I think that it is just a preferred location; if it is not available, the server should start on the other node. You can of course migrate manually with "crm resource move resource_name node_name", which in effect changes that location preference.
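
A sketch of that workflow with the resource from this thread (the target node name is assumed):

    # move the resource; this creates a cli-prefer-* location constraint
    crm resource move prim_vm_servers_alive ha-idg-1
    # later, drop the constraint so the cluster may place the resource freely
    crm resource unmove prim_vm_servers_alive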

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Klaus Wenninger
On 07/24/2017 08:27 PM, Prasad, Shashank wrote: > > My understanding is that SBD will need shared storage between > clustered nodes. > > And that SBD will need at least 3 nodes in a cluster if used w/o > shared storage. > Haven't tried it, to be honest, but the reason for 3 nodes is that without
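
For reference, a watchdog-only (diskless) SBD setup needs little more than the sketch below, assuming each node has a working /dev/watchdog; whether that is safe with fewer than three nodes is exactly the question here:

    # /etc/sysconfig/sbd on every node
    SBD_WATCHDOG_DEV=/dev/watchdog
    # let pacemaker assume an unseen node has self-fenced after this long
    pcs property set stonith-watchdog-timeout=10s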

[ClusterLabs] resources do not migrate although node is going to standby

2017-07-24 Thread Lentes, Bernd
Hi, just to be sure: I have a VirtualDomain resource (called prim_vm_servers_alive) running on one node (ha-idg-2). For reasons I don't remember I have a location constraint: location cli-prefer-prim_vm_servers_alive prim_vm_servers_alive role=Started inf: ha-idg-2 Now I try to set this node
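
If that leftover constraint is what pins the resource, deleting it by its id (a sketch using crmsh, matching the configuration above) should let the standby migration proceed:

    crm configure delete cli-prefer-prim_vm_servers_alive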

Re: [ClusterLabs] timeout for stop VirtualDomain running Windows 7

2017-07-24 Thread Kristián Feldsam
Hmm, is it possible to disable installing updates on shutdown, and do regular maintenance to update manually?

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Klaus Wenninger
On 07/24/2017 07:32 PM, Prasad, Shashank wrote: > > Sometimes IPMI fence devices share the power of the node, and it > cannot be avoided. > > In such scenarios the HA cluster is NOT able to handle the power > failure of a node, since the power is shared with its own fence device. > > The failure

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Prasad, Shashank
Sometimes IPMI fence devices share the power of the node, and it cannot be avoided. In such scenarios the HA cluster is NOT able to handle the power failure of a node, since the power is shared with its own fence device. IPMI-based fencing can also fail for other reasons

[ClusterLabs] timeout for stop VirtualDomain running Windows 7

2017-07-24 Thread Lentes, Bernd
Hi, I have a VirtualDomain resource running a Windows 7 client. This is the respective configuration: primitive prim_vm_servers_alive VirtualDomain \ params config="/var/lib/libvirt/images/xml/Server_Monitoring.xml" \ params hypervisor="qemu:///system" \ params
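
Since a Windows guest that installs updates on shutdown can easily exceed the default stop timeout, one hedge is a generous per-operation timeout on the primitive (other params omitted; the 600s is only an illustrative value):

    primitive prim_vm_servers_alive VirtualDomain \
        params config="/var/lib/libvirt/images/xml/Server_Monitoring.xml" \
        op stop interval=0 timeout=600s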

Re: [ClusterLabs] epic fail

2017-07-24 Thread Dimitri Maziuk
On 07/24/2017 11:34 AM, Ken Gaillot wrote: > On Mon, 2017-07-24 at 18:09 +0200, Valentin Vidic wrote: >> On Mon, Jul 24, 2017 at 11:01:26AM -0500, Dimitri Maziuk wrote: >>> Lsof/fuser show the PID of the process holding FS open as "kernel". >> >> That could be the NFS server running in the kernel.

Re: [ClusterLabs] epic fail

2017-07-24 Thread Ken Gaillot
On Mon, 2017-07-24 at 18:09 +0200, Valentin Vidic wrote: > On Mon, Jul 24, 2017 at 11:01:26AM -0500, Dimitri Maziuk wrote: > > Lsof/fuser show the PID of the process holding FS open as "kernel". > > That could be the NFS server running in the kernel. Dimitri, Is the NFS server also managed by

Re: [ClusterLabs] epic fail

2017-07-24 Thread Valentin Vidic
On Mon, Jul 24, 2017 at 10:38:40AM -0500, Ken Gaillot wrote: > Standby is not necessary, it's just a cautious step that allows the > admin to verify that all resources moved off correctly. The restart that > yum does should be sufficient for pacemaker to move everything. > > A restart shouldn't

Re: [ClusterLabs] epic fail

2017-07-24 Thread Dimitri Maziuk
> Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stopping NFS > server ... > Jul 22 14:03:46 zebrafish systemd: Stopping NFS server and services... > Jul 22 14:03:46 zebrafish systemd: Stopped NFS server and services. > Jul 22 14:03:46 zebrafish systemd: Stopping NFS Mount Daemon...

Re: [ClusterLabs] epic fail

2017-07-24 Thread Kristián Feldsam
Is the NFS server/share also managed by pacemaker, and is the ordering set right?
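
That ordering would be a single constraint, e.g. with pcs (resource names hypothetical):

    # start the filesystem first and the NFS server second; on shutdown
    # pacemaker reverses this, stopping NFS before the unmount
    pcs constraint order fs_drbd then nfs_server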

Re: [ClusterLabs] epic fail

2017-07-24 Thread Valentin Vidic
On Mon, Jul 24, 2017 at 11:01:26AM -0500, Dimitri Maziuk wrote: > Lsof/fuser show the PID of the process holding FS open as "kernel". That could be the NFS server running in the kernel. -- Valentin

Re: [ClusterLabs] epic fail

2017-07-24 Thread Dimitri Maziuk
On 07/24/2017 10:38 AM, Ken Gaillot wrote: > A restart shouldn't lead to fencing in any case where something's not > going seriously wrong. I'm not familiar with the "kernel is using it" > message, I haven't run into that before. I posted it at least once before. > > Jul 22 14:03:48 zebrafish

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Klaus Wenninger
On 07/24/2017 05:37 PM, Kristián Feldsam wrote: > I personally think that powering off the node via a switched PDU is safer, > or not? True, if that is working in your environment. If you can't do a physical setup where you aren't simultaneously losing the connection to both your node and the switch-device

Re: [ClusterLabs] epic fail

2017-07-24 Thread Ken Gaillot
On Mon, 2017-07-24 at 17:13 +0200, Kristián Feldsam wrote: > Hmm, so when you know that it happens also when putting the node in > standby, then why do you run yum update on a live cluster? It must be > clear that the node will be fenced. Standby is not necessary, it's just a cautious step that allows the admin

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Klaus Wenninger
On 07/24/2017 05:32 PM, Tomer Azran wrote: > So your suggestion is to use sbd with or without qdevice? What is the > point of having a qdevice in a two-node cluster if it doesn't help in > this situation? If you have a qdevice setup that is already working (meaning that one of your

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Kristián Feldsam
I personally think that powering off the node via a switched PDU is safer, or not?

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
So your suggestion is to use sbd with or without qdevice? What is the point of having a qdevice in a two-node cluster if it doesn't help in this situation? From: Klaus Wenninger Sent: Monday, July 24, 18:28 Subject: Re: [ClusterLabs] Two nodes cluster issue To: Cluster Labs - All topics related

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Klaus Wenninger
On 07/24/2017 05:15 PM, Tomer Azran wrote: > I still don't understand why the qdevice concept doesn't help in this > situation. Since the master node is down, I would expect the quorum to > declare it as dead. > Why doesn't it happen? That is not how quorum works. It just limits the
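
For completeness, attaching a qdevice arbiter to an existing two-node cluster is typically a one-liner (the hostname is hypothetical), and it only influences quorum, not fencing:

    pcs quorum device add model net host=qdevice.example.com algorithm=ffsplit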

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
I still don't understand why the qdevice concept doesn't help in this situation. Since the master node is down, I would expect the quorum to declare it as dead. Why doesn't it happen? On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk"

Re: [ClusterLabs] epic fail

2017-07-24 Thread Kristián Feldsam
Hmm, so when you know that it happens also when putting the node in standby, then why do you run yum update on a live cluster? It must be clear that the node will be fenced. Would you post your pacemaker config, plus some logs?

Re: [ClusterLabs] epic fail

2017-07-24 Thread Dimitri Maziuk
On 07/24/2017 09:40 AM, Jan Pokorný wrote: > Would there be an interest, though? And would that be meaningful? IMO the only reason to put a node in standby is if you want to reboot the active node with no service interruption. For anything else, including a reboot with service interruption

Re: [ClusterLabs] [ClusterLabs Developers] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-07-24 Thread Jan Pokorný
On 23/07/17 12:32 +0100, Adam Spiers wrote: > Jan Pokorný wrote: >> So, going to attend summit and want your key signed while reciprocally >> spreading the web of trust? >> Awesome, let's reuse the steps from the last time: >> >> Once you have a key pair (and provided that
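
For anyone joining, the mechanics are the standard GnuPG ones, roughly as follows (the address, key id, and keyserver below are placeholders):

    # print the fingerprint you will bring along on paper
    gpg --fingerprint you@example.org
    # publish the key so others can fetch it after verifying the fingerprint
    gpg --keyserver hkp://pool.sks-keyservers.net --send-keys KEYID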

Re: [ClusterLabs] epic fail

2017-07-24 Thread Jan Pokorný
On 23/07/17 14:40 +0200, Valentin Vidic wrote: > On Sun, Jul 23, 2017 at 07:27:03AM -0500, Dmitri Maziuk wrote: >> So yesterday I ran yum update that pulled in the new pacemaker and tried to >> restart it. The node went into its usual "can't unmount drbd because kernel >> is using it" and got

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Kristián Feldsam
An APC AP7921 goes for just €200 on eBay.

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Dmitri Maziuk
On 2017-07-24 07:51, Tomer Azran wrote: We don't have the ability to use it. Is that the only solution? No, but I'd recommend thinking about it first. Are you sure you will care about your cluster working when your server room is on fire? 'Cause unless you have halon suppression, your server

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
We don't have the ability to use it. Is that the only solution? In addition, it will not cover a scenario where the server room is down (for example, fire or earthquake); the switch will go down as well. From: Klaus Wenninger Sent: Monday, July 24, 15:31 Subject: Re: [ClusterLabs] Two nodes

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Klaus Wenninger
On 07/24/2017 02:05 PM, Kristián Feldsam wrote: > Hello, you have to use a second fencing device, for ex. an APC Switched PDU. > > https://wiki.clusterlabs.org/wiki/Configure_Multiple_Fencing_Devices_Using_pcs The problem here seems to be that the fencing devices available are running from the same

Re: [ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Kristián Feldsam
Hello, you have to use a second fencing device, for ex. an APC Switched PDU. https://wiki.clusterlabs.org/wiki/Configure_Multiple_Fencing_Devices_Using_pcs
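
The wiki page boils down to registering the PDU as a second fencing level per node, e.g. (device and node names hypothetical):

    # level 1: try IPMI first; level 2: fall back to the switched PDU
    pcs stonith level add 1 node1 fence_ipmi_node1
    pcs stonith level add 2 node1 fence_pdu_node1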

[ClusterLabs] Two nodes cluster issue

2017-07-24 Thread Tomer Azran
Hello, We built a pacemaker cluster with 2 physical servers. We configured DRBD in a Master/Slave setup, a floating IP, and a file system mount in Active/Passive mode. We configured two STONITH devices (fence_ipmilan), one for each server. We are trying to simulate a situation where the Master server
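
For reference, a per-node fence_ipmilan device along these lines is typical (addresses and credentials hypothetical):

    pcs stonith create fence_node1 fence_ipmilan \
        pcmk_host_list="node1" ipaddr="192.168.1.101" \
        login="admin" passwd="secret" lanplus=1 \
        op monitor interval=60s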

Re: [ClusterLabs] pcs: how to properly unset a value for resource/stonith? [Was: (no subject)]

2017-07-24 Thread ArekW
Hi, thank you for setting the subject. I confirm that the parameter can be disabled. The only issue is that sometimes there is a "zombie" message in the logs, like I showed before: Jul 20 07:14:11 nfsnode1 stonith-ng[11097]: warning: fence_vbox[3092] stderr: [ WARNING:root:Parse error: Ignoring option
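
For the record, the usual way to unset such a parameter with pcs is to update it to an empty value (placeholders below):

    # clears <parameter> from the stonith device's configuration
    pcs stonith update <stonith-id> <parameter>=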

Re: [ClusterLabs] [ClusterLabs Developers] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-07-24 Thread Kristoffer Grönlund
Jan Pokorný writes: > [ Unknown signature status ] > Hello cluster masters :-) > > as there's a little less than 7 weeks left to "The Summit" meetup, it's about > time to get the ball rolling so we can voluntarily augment the digital trust amongst