Re: [ClusterLabs] Regression in Filesystem RA
Hello, sorry for the late reply; moving data centers tends to keep one busy.

I looked at the PR, and while it works and certainly is an improvement, it wouldn't help me much in my case. The biggest issue is fuser and its exponential slowdown, and the RA still uses it. What I did was to recklessly force my crap code into a script:

---
#!/bin/bash
lsof -n | grep "$1" | grep DIR | awk '{print $2}'
---

And call that instead of fuser, as well as removing all kill logging by default (determining the number of pids isn't free either). With that in place it can deal with 10k processes to kill in less than 10 seconds.

Regards,

Christian

On Tue, 24 Oct 2017 09:07:50 +0200 Dejan Muhamedagic wrote:
> On Tue, Oct 24, 2017 at 08:59:17AM +0200, Dejan Muhamedagic wrote:
> > [...]
> > I just made a pull request:
> >
> > https://github.com/ClusterLabs/resource-agents/pull/1042
> >
> > NB: It is completely untested!
>
> It would be great if you could test it!
>
> > Cheers,
> >
> > Dejan
> >
> > > Regards,
> > >
> > > Christian
> > >
> > > > > Maybe we can even come up with a way
> > > > > to both "pretty print" and kill fast?
> > > >
> > > > My best guess right now is no ;-) But we could log nicely for the
> > > > usual case of a small number of stray processes ...
> > > > maybe something like this:
> > > >
> > > > i=""
> > > > get_pids | tr '\n' ' ' | fold -s |
> > > > while read procs; do
> > > >     if [ -z "$i" ]; then
> > > >         killnlog $procs
> > > >         i="nolog"
> > > >     else
> > > >         justkill $procs
> > > >     fi
> > > > done
> > > >
> > > > Cheers,
> > > >
> > > > Dejan
> > > >
> > > > > --
> > > > > : Lars Ellenberg
> > > > > : LINBIT | Keeping the Digital World Running
> > > > > : DRBD -- Heartbeat -- Corosync -- Pacemaker
> > > > > : R&D, Integration, Ops, Consulting, Support
> > > > >
> > > > > DRBD® and LINBIT® are registered trademarks of LINBIT

--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Rakuten Communications

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
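For readers following along, the batching idea sketched in the quoted thread (log only the first, typical-case batch of stray processes, then kill the rest silently) can be tried standalone. This is a hedged sketch, not the shipped RA code: get_pids, killnlog and justkill are stand-ins for the hypothetical helper names used in the quoted snippet.

```shell
# Sketch of the batching idea from the thread above: log the first batch
# of stray PIDs, kill later batches without the per-batch logging
# overhead. get_pids/killnlog/justkill are stand-ins (hypothetical names
# taken from the quoted snippet, not real resource-agents code).
get_pids() { seq 1 100; }                        # pretend: 100 stray PIDs
killnlog() { echo "killing (logged): $# pids"; } # stand-in for kill + log
justkill() { echo "killing: $# pids"; }          # stand-in for plain kill

out=$(
    i=""
    get_pids | tr '\n' ' ' | fold -s -w 60 |
    while read -r procs; do
        if [ -z "$i" ]; then
            killnlog $procs   # first batch: log it
            i="nolog"
        else
            justkill $procs   # subsequent batches: no logging
        fi
    done
)
printf '%s\n' "$out"
```

The word-splitting of $procs is intentional here: each line from fold carries a whitespace-separated batch of PIDs, so `$#` inside the helpers is the batch size.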
[ClusterLabs] Antw: Re: questions about startup fencing
> Kristoffer Gronlund wrote:
>> Adam Spiers writes:
>>
>>> - The whole cluster is shut down cleanly.
>>>
>>> - The whole cluster is then started up again. (Side question: what
>>> happens if the last node to shut down is not the first to start up?
>>> How will the cluster ensure it has the most recent version of the
>>> CIB? Without that, how would it know whether the last man standing
>>> was shut down cleanly or not?)
>>
>> This is my opinion, I don't really know what the "official" pacemaker
>> stance is: There is no such thing as shutting down a cluster cleanly. A
>> cluster is a process stretching over multiple nodes - if they all shut
>> down, the process is gone. When you start up again, you effectively have
>> a completely new cluster.
>
> Sorry, I don't follow you at all here. When you start the cluster up
> again, the cluster config from before the shutdown is still there.
> That's very far from being a completely new cluster :-)

The problem is you cannot "start the cluster" in pacemaker; you can only
"start nodes". The nodes will come up one by one. As opposed (as I had
said) to HP Service Guard, where there is a "cluster formation timeout":
the nodes wait for the specified time for the cluster to "form", and then
the cluster starts as a whole. Of course that only applies if the whole
cluster was down, not if a single node was down.

>> When starting up, how is the cluster, at any point, to know if the
>> cluster it has knowledge of is the "latest" cluster?
>
> That was exactly my question.
>
>> The next node could have a newer version of the CIB which adds yet
>> more nodes to the cluster.
>
> Yes, exactly. If the first node to start up was not the last man
> standing, the CIB history is effectively being forked. So how is this
> issue avoided? Quorum? "Cluster formation delay"?
>
>> The only way to bring up a cluster from being completely stopped is to
>> treat it as creating a completely new cluster. The first node to start
>> "creates" the cluster and later nodes join that cluster.
>
> That's ignoring the cluster config, which persists even when the
> cluster's down.
>
> But to be clear, you picked a small side question from my original
> post and answered that. The main questions I had were about startup
> fencing :-)

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Is corosync supposed to be restarted if it dies?
On 29/11/17 22:00 +0100, Jan Pokorný wrote:
> On 28/11/17 22:35 +0300, Andrei Borzenkov wrote:
>> 28.11.2017 13:01, Jan Pokorný writes:
>>> On 27/11/17 17:43 +0300, Andrei Borzenkov wrote:
>>>> Sent from iPhone
>>>>
>>>>> On 27 Nov 2017, at 14:36, Ferenc Wágner wrote:
>>>>>
>>>>> Andrei Borzenkov writes:
>>>>>
>>>>>> 25.11.2017 10:05, Andrei Borzenkov writes:
>>>>>>
>>>>>>> In one of the guides the suggested procedure to simulate split
>>>>>>> brain was to kill the corosync process. It actually worked on one
>>>>>>> cluster, but on another the corosync process was restarted after
>>>>>>> being killed without the cluster noticing anything. Except that
>>>>>>> after several attempts pacemaker died while stopping resources ... :)
>>>>>>>
>>>>>>> This is SLES12 SP2; I do not see any Restart in the service
>>>>>>> definition, so it is probably not systemd.
>>>>>>
>>>>>> FTR - it was not corosync, but pacemaker; its unit file specifies
>>>>>> RestartOn=error so killing corosync caused pacemaker to fail and be
>>>>>> restarted by systemd.
>>>>>
>>>>> And starting corosync via a Requires dependency?
>>>>
>>>> Exactly.
>>>
>>> From my testing it looks like we should change
>>> "Requires=corosync.service" to "BindsTo=corosync.service"
>>> in pacemaker.service.
>>>
>>> Could you give it a try?
>>
>> I'm not sure what the expected outcome is, but pacemaker.service is
>> still restarted (due to Restart=on-failure).
>
> The expected outcome is that pacemaker.service will become
> "inactive (dead)" after killing corosync (as a result of being
> "bound" to corosync). Have you indeed issued "systemctl
> daemon-reload" after updating the pacemaker unit file?
>
> (FTR, I tried with systemd 235.)
>
>> If the intention is to unconditionally stop it when corosync dies,
>> pacemaker should probably exit with a unique code and the unit files
>> have RestartPreventExitStatus set to it.
>
> That would be an elaborate way to reach the same.

But good point in questioning what the "best intention" around these
scenarios is -- normally, fencing would happen, but as you note, the node
had actually survived by being fast enough to put corosync back to life,
and from there, whether it adds any value to have pacemaker restarted on
non-clean terminations at all, I don't know.

Would it make more sense to have FailureAction=reboot-immediate to at
least in part emulate the fencing instead? Although the restart may also
be blazingly fast in some cases, not making much difference except for
taking all the previously running resources forcibly down as an extra
step, which may be either good or bad.

--
Jan (Poki)

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
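For anyone wanting to experiment with the Requires= → BindsTo= change discussed above without patching the packaged unit file, a systemd drop-in can override the dependency. This is a sketch: the drop-in path is the standard systemd override location, and the After= line mirrors the ordering dependency the stock unit already carries.

```ini
# /etc/systemd/system/pacemaker.service.d/bindsto.conf
# With BindsTo= instead of Requires=, pacemaker.service is stopped when
# corosync.service leaves the active state (e.g. corosync is killed),
# rather than staying up or being restarted alongside a dead corosync.
[Unit]
BindsTo=corosync.service
After=corosync.service
```

As noted in the thread, run "systemctl daemon-reload" afterwards for the drop-in to take effect.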
Re: [ClusterLabs] Is corosync supposed to be restarted if it dies?
On 28/11/17 22:35 +0300, Andrei Borzenkov wrote:
> 28.11.2017 13:01, Jan Pokorný writes:
>> On 27/11/17 17:43 +0300, Andrei Borzenkov wrote:
>>> Sent from iPhone
>>>
>>>> On 27 Nov 2017, at 14:36, Ferenc Wágner wrote:
>>>>
>>>> Andrei Borzenkov writes:
>>>>
>>>>> 25.11.2017 10:05, Andrei Borzenkov writes:
>>>>>
>>>>>> In one of the guides the suggested procedure to simulate split
>>>>>> brain was to kill the corosync process. It actually worked on one
>>>>>> cluster, but on another the corosync process was restarted after
>>>>>> being killed without the cluster noticing anything. Except that
>>>>>> after several attempts pacemaker died while stopping resources ... :)
>>>>>>
>>>>>> This is SLES12 SP2; I do not see any Restart in the service
>>>>>> definition, so it is probably not systemd.
>>>>>
>>>>> FTR - it was not corosync, but pacemaker; its unit file specifies
>>>>> RestartOn=error so killing corosync caused pacemaker to fail and be
>>>>> restarted by systemd.
>>>>
>>>> And starting corosync via a Requires dependency?
>>>
>>> Exactly.
>>
>> From my testing it looks like we should change
>> "Requires=corosync.service" to "BindsTo=corosync.service"
>> in pacemaker.service.
>>
>> Could you give it a try?
>
> I'm not sure what the expected outcome is, but pacemaker.service is
> still restarted (due to Restart=on-failure).

The expected outcome is that pacemaker.service will become "inactive
(dead)" after killing corosync (as a result of being "bound" to
corosync). Have you indeed issued "systemctl daemon-reload" after
updating the pacemaker unit file?

(FTR, I tried with systemd 235.)

> If the intention is to unconditionally stop it when corosync dies,
> pacemaker should probably exit with a unique code and the unit files
> have RestartPreventExitStatus set to it.

That would be an elaborate way to reach the same.

But good point in questioning what the "best intention" around these
scenarios is -- normally, fencing would happen, but as you note, the node
had actually survived by being fast enough to put corosync back to life,
and from there, whether it adds any value to have pacemaker restarted on
non-clean terminations at all, I don't know.

Would it make more sense to have FailureAction=reboot-immediate to at
least in part emulate the fencing instead?

--
Jan (Poki)

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] questions about startup fencing
On 11/29/2017 09:09 PM, Kristoffer Grönlund wrote:
> Adam Spiers writes:
>
>> OK, so reading between the lines, if we don't want our cluster's
>> latest config changes accidentally discarded during a complete cluster
>> reboot, we should ensure that the last man standing is also the first
>> one booted up - right?
>
> That would make sense to me, but I don't know if it's the only
> solution. If you separately ensure that they all have the same
> configuration first, you could start them in any order I guess.

I guess it is not that bad: after the last man standing has left the
stage, it would take a quorate number of nodes (actually depending on
how many you allow to survive) before anything happens again (equivalent
to wait-for-all in 2-node clusters). And one of these should have a
reasonably current CIB.

>> If so, I think that's a perfectly reasonable thing to ask for, but
>> maybe it should be documented explicitly somewhere? Apologies if it
>> is already and I missed it.
>
> Yeah, maybe a section discussing both starting and stopping a whole
> cluster would be helpful, but I don't know if I feel like I've thought
> about it enough myself. Regarding the HP Service Guard commands that
> Ulrich Windl mentioned, the very idea of such commands offends me on
> some level but I don't know if I can clearly articulate why. :D

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] questions about startup fencing
Adam Spiers writes: > > OK, so reading between the lines, if we don't want our cluster's > latest config changes accidentally discarded during a complete cluster > reboot, we should ensure that the last man standing is also the first > one booted up - right? That would make sense to me, but I don't know if it's the only solution. If you separately ensure that they all have the same configuration first, you could start them in any order I guess. > > If so, I think that's a perfectly reasonable thing to ask for, but > maybe it should be documented explicitly somewhere? Apologies if it > is already and I missed it. Yeah, maybe a section discussing both starting and stopping a whole cluster would be helpful, but I don't know if I feel like I've thought about it enough myself. Regarding the HP Service Guard commands that Ulrich Windl mentioned, the very idea of such commands offends me on some level but I don't know if I can clearly articulate why. :D -- // Kristoffer Grönlund // kgronl...@suse.com ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] questions about startup fencing
Kristoffer Gronlund wrote:
> Adam Spiers writes:
>> Kristoffer Gronlund wrote:
>>> Adam Spiers writes:
>>>
>>>> - The whole cluster is shut down cleanly.
>>>>
>>>> - The whole cluster is then started up again. (Side question: what
>>>> happens if the last node to shut down is not the first to start up?
>>>> How will the cluster ensure it has the most recent version of the
>>>> CIB? Without that, how would it know whether the last man standing
>>>> was shut down cleanly or not?)
>>>
>>> This is my opinion, I don't really know what the "official" pacemaker
>>> stance is: There is no such thing as shutting down a cluster cleanly.
>>> A cluster is a process stretching over multiple nodes - if they all
>>> shut down, the process is gone. When you start up again, you
>>> effectively have a completely new cluster.
>>
>> Sorry, I don't follow you at all here. When you start the cluster up
>> again, the cluster config from before the shutdown is still there.
>> That's very far from being a completely new cluster :-)
>
> You have a new cluster with (possibly fragmented) memories of a
> previous life ;)

Well yeah, that's another way of describing it :-)

>> Yes, exactly. If the first node to start up was not the last man
>> standing, the CIB history is effectively being forked. So how is this
>> issue avoided?
>>
>>> The only way to bring up a cluster from being completely stopped is
>>> to treat it as creating a completely new cluster. The first node to
>>> start "creates" the cluster and later nodes join that cluster.
>>
>> That's ignoring the cluster config, which persists even when the
>> cluster's down.
>
> There could be a command in pacemaker which resets a set of nodes to a
> common known state, basically to pick the CIB from one of the nodes as
> the survivor and copy that to all of them. But in the end, that's just
> the same thing as just picking one node as the first node, and telling
> the others to join that one and to discard their configurations. So,
> treating it as a new cluster.

OK, so reading between the lines, if we don't want our cluster's latest
config changes accidentally discarded during a complete cluster reboot,
we should ensure that the last man standing is also the first one booted
up - right?

If so, I think that's a perfectly reasonable thing to ask for, but maybe
it should be documented explicitly somewhere? Apologies if it is already
and I missed it.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] cluster with two ESX server
On 11/29/2017 08:24 PM, Andrei Borzenkov wrote:
> 29.11.2017 20:14, Klaus Wenninger writes:
>> On 11/28/2017 07:41 PM, Andrei Borzenkov wrote:
>>> 28.11.2017 10:45, Ramann, Björn writes:
>>>> hi@all,
>>>>
>>>> in my configuration, the 1st node runs on ESX1, the second runs on
>>>> ESX2. Now I'm looking for a way to configure the cluster
>>>> fence/stonith with two ESX servers - is this possible?
>>>
>>> If you have shared storage, SBD may be an option.
>>
>> True.
>> And if you feel like experimenting you can have a look at
>> https://github.com/wenningerk/sbd/tree/vmware.
>>
>> On ESX you don't have virtual watchdog-devices with a kernel-driver
>> sitting on top (contrary to e.g. qemu-kvm). This basically is a
>> test-implementation using vSphere HA Application Monitoring as a
>> replacement.
>
> This sure sounds interesting. Does it work with open-vm-tools or does
> it require VMware tools?

Unfortunately with neither. You need libappmonitorlib.so from the
GuestSDK, which I didn't find anywhere else. Apart from that library you
are fine with open-vm-tools. See VMware_GuestSDK.spec from my
github-repo for details.

When setting up a vSphere cluster, enable Application Monitoring and
check that the following is true:

('Failure interval' = 'Minimum uptime') * 'Maximum per-VM resets' == 'Maximum reset time window'

Otherwise your 'watchdog' will stop working after 3 resets till the
reset time window is over (maybe never).

Regards,
Klaus

>> In comparison to using softdog this approach doesn't rely on any
>> working code inside the VM to trigger a reboot.
>>
>> [root@node4 ~]# sbd query-watchdog
>>
>> Discovered 3 watchdog devices:
>>
>> [1] vmware
>> Identity: VMware Application Monitoring (gray)
>> Driver:
>>
>> [2] /dev/watchdog
>> Identity: Software Watchdog
>> Driver: softdog
>> CAUTION: Not recommended for use with sbd.
>>
>> [3] /dev/watchdog0
>> Identity: Software Watchdog
>> Driver: softdog
>> CAUTION: Not recommended for use with sbd.
>>
>> Have in mind that this is just a proof-of-concept implementation. So
>> expect any kind of changes and be aware that in the current state it
>> is definitely not fit to go into any distribution.
>>
>> Regarding building you can find VMware_GuestSDK.spec in the
>> vmware-branch of my sbd-fork. Basically this builds rpms from the
>> vmware-GuestSDK-tarball - both a library-binary-rpm for the target and
>> a devel-rpm for building vmware-enabled sbd.
>>
>> Regards,
>> Klaus
>>
>>>> I try to use fence_vmware with vcenter, but then the vcenter is a
>>>> single point of failure, and running two vcenters is currently not
>>>> possible.
>>>
>>> You can run vcenter on a vFT VM in which case it should be pretty
>>> robust.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
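The consistency rule Klaus quotes ('Failure interval' set equal to 'Minimum uptime', and their value times 'Maximum per-VM resets' matching the 'Maximum reset time window') can be sanity-checked with a trivial calculation. This is a sketch of one reading of that rule; the numbers are made-up illustration values, not recommendations.

```shell
# Sanity-check sketch for the vSphere HA Application Monitoring settings
# mentioned above. All numbers are made-up illustration values.
failure_interval=30   # seconds; per the rule, also used as 'Minimum uptime'
max_vm_resets=3       # 'Maximum per-VM resets'
reset_time_window=90  # seconds; 'Maximum reset time window'

required_window=$(( failure_interval * max_vm_resets ))
if [ "$required_window" -eq "$reset_time_window" ]; then
    echo "settings consistent"
else
    echo "watchdog may stall: window should be $required_window"
fi
```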
Re: [ClusterLabs] cluster with two ESX server
29.11.2017 20:14, Klaus Wenninger writes:
> On 11/28/2017 07:41 PM, Andrei Borzenkov wrote:
>> 28.11.2017 10:45, Ramann, Björn writes:
>>> hi@all,
>>>
>>> in my configuration, the 1st node runs on ESX1, the second runs on
>>> ESX2. Now I'm looking for a way to configure the cluster
>>> fence/stonith with two ESX servers - is this possible?
>>
>> If you have shared storage, SBD may be an option.
>
> True.
> And if you feel like experimenting you can have a look at
> https://github.com/wenningerk/sbd/tree/vmware.
>
> On ESX you don't have virtual watchdog-devices with a kernel-driver
> sitting on top (contrary to e.g. qemu-kvm). This basically is a
> test-implementation using vSphere HA Application Monitoring as a
> replacement.

This sure sounds interesting. Does it work with open-vm-tools or does it
require VMware tools?

> In comparison to using softdog this approach doesn't rely on any
> working code inside the VM to trigger a reboot.
>
> [root@node4 ~]# sbd query-watchdog
>
> Discovered 3 watchdog devices:
>
> [1] vmware
> Identity: VMware Application Monitoring (gray)
> Driver:
>
> [2] /dev/watchdog
> Identity: Software Watchdog
> Driver: softdog
> CAUTION: Not recommended for use with sbd.
>
> [3] /dev/watchdog0
> Identity: Software Watchdog
> Driver: softdog
> CAUTION: Not recommended for use with sbd.
>
> Have in mind that this is just a proof-of-concept implementation. So
> expect any kind of changes and be aware that in the current state it is
> definitely not fit to go into any distribution.
>
> Regarding building you can find VMware_GuestSDK.spec in the
> vmware-branch of my sbd-fork. Basically this builds rpms from the
> vmware-GuestSDK-tarball - both a library-binary-rpm for the target and
> a devel-rpm for building vmware-enabled sbd.
>
> Regards,
> Klaus
>
>>> I try to use fence_vmware with vcenter, but then the vcenter is a
>>> single point of failure, and running two vcenters is currently not
>>> possible.
>>
>> You can run vcenter on a vFT VM in which case it should be pretty
>> robust.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] questions about startup fencing
Adam Spiers writes: > Kristoffer Gronlund wrote: >>Adam Spiers writes: >> >>> - The whole cluster is shut down cleanly. >>> >>> - The whole cluster is then started up again. (Side question: what >>> happens if the last node to shut down is not the first to start up? >>> How will the cluster ensure it has the most recent version of the >>> CIB? Without that, how would it know whether the last man standing >>> was shut down cleanly or not?) >> >>This is my opinion, I don't really know what the "official" pacemaker >>stance is: There is no such thing as shutting down a cluster cleanly. A >>cluster is a process stretching over multiple nodes - if they all shut >>down, the process is gone. When you start up again, you effectively have >>a completely new cluster. > > Sorry, I don't follow you at all here. When you start the cluster up > again, the cluster config from before the shutdown is still there. > That's very far from being a completely new cluster :-) You have a new cluster with (possibly fragmented) memories of a previous life ;) > > Yes, exactly. If the first node to start up was not the last man > standing, the CIB history is effectively being forked. So how is this > issue avoided? > >>The only way to bring up a cluster from being completely stopped is to >>treat it as creating a completely new cluster. The first node to start >>"creates" the cluster and later nodes join that cluster. > > That's ignoring the cluster config, which persists even when the > cluster's down. There could be a command in pacemaker which resets a set of nodes to a common known state, basically to pick the CIB from one of the nodes as the survivor and copy that to all of them. But in the end, that's just the same thing as just picking one node as the first node, and telling the others to join that one and to discard their configurations. So, treating it as a new cluster. > > But to be clear, you picked a small side question from my original > post and answered that. 
The main questions I had were about startup > fencing :-) I did! :) -- // Kristoffer Grönlund // kgronl...@suse.com ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] questions about startup fencing
Klaus Wenninger wrote:
> On 11/29/2017 04:23 PM, Kristoffer Grönlund wrote:
>> Adam Spiers writes:
>>
>>> - The whole cluster is shut down cleanly.
>>>
>>> - The whole cluster is then started up again. (Side question: what
>>> happens if the last node to shut down is not the first to start up?
>>> How will the cluster ensure it has the most recent version of the
>>> CIB? Without that, how would it know whether the last man standing
>>> was shut down cleanly or not?)
>>
>> This is my opinion, I don't really know what the "official" pacemaker
>> stance is: There is no such thing as shutting down a cluster cleanly.
>> A cluster is a process stretching over multiple nodes - if they all
>> shut down, the process is gone. When you start up again, you
>> effectively have a completely new cluster.
>>
>> When starting up, how is the cluster, at any point, to know if the
>> cluster it has knowledge of is the "latest" cluster? The next node
>> could have a newer version of the CIB which adds yet more nodes to
>> the cluster.
>
> To make it even clearer imagine a node being reverted to a previous
> state by recovering it from a backup.

Yes, I'm asking how this kind of scenario is dealt with :-)

Another example is a config change being made after one or more of the
cluster nodes had already been shut down.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] questions about startup fencing
Kristoffer Gronlund wrote:
> Adam Spiers writes:
>
>> - The whole cluster is shut down cleanly.
>>
>> - The whole cluster is then started up again. (Side question: what
>> happens if the last node to shut down is not the first to start up?
>> How will the cluster ensure it has the most recent version of the
>> CIB? Without that, how would it know whether the last man standing
>> was shut down cleanly or not?)
>
> This is my opinion, I don't really know what the "official" pacemaker
> stance is: There is no such thing as shutting down a cluster cleanly.
> A cluster is a process stretching over multiple nodes - if they all
> shut down, the process is gone. When you start up again, you
> effectively have a completely new cluster.

Sorry, I don't follow you at all here. When you start the cluster up
again, the cluster config from before the shutdown is still there.
That's very far from being a completely new cluster :-)

> When starting up, how is the cluster, at any point, to know if the
> cluster it has knowledge of is the "latest" cluster?

That was exactly my question.

> The next node could have a newer version of the CIB which adds yet
> more nodes to the cluster.

Yes, exactly. If the first node to start up was not the last man
standing, the CIB history is effectively being forked. So how is this
issue avoided?

> The only way to bring up a cluster from being completely stopped is to
> treat it as creating a completely new cluster. The first node to start
> "creates" the cluster and later nodes join that cluster.

That's ignoring the cluster config, which persists even when the
cluster's down.

But to be clear, you picked a small side question from my original post
and answered that. The main questions I had were about startup
fencing :-)

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] cluster with two ESX server
On 11/28/2017 07:41 PM, Andrei Borzenkov wrote:
> 28.11.2017 10:45, Ramann, Björn writes:
>> hi@all,
>>
>> in my configuration, the 1st node runs on ESX1, the second runs on
>> ESX2. Now I'm looking for a way to configure the cluster fence/stonith
>> with two ESX servers - is this possible?
> If you have shared storage, SBD may be an option.

True.
And if you feel like experimenting you can have a look at
https://github.com/wenningerk/sbd/tree/vmware.

On ESX you don't have virtual watchdog-devices with a kernel-driver
sitting on top (contrary to e.g. qemu-kvm). This basically is a
test-implementation using vSphere HA Application Monitoring as a
replacement.

In comparison to using softdog this approach doesn't rely on any working
code inside the VM to trigger a reboot.

[root@node4 ~]# sbd query-watchdog

Discovered 3 watchdog devices:

[1] vmware
Identity: VMware Application Monitoring (gray)
Driver:

[2] /dev/watchdog
Identity: Software Watchdog
Driver: softdog
CAUTION: Not recommended for use with sbd.

[3] /dev/watchdog0
Identity: Software Watchdog
Driver: softdog
CAUTION: Not recommended for use with sbd.

Have in mind that this is just a proof-of-concept implementation. So
expect any kind of changes and be aware that in the current state it is
definitely not fit to go into any distribution.

Regarding building you can find VMware_GuestSDK.spec in the
vmware-branch of my sbd-fork. Basically this builds rpms from the
vmware-GuestSDK-tarball - both a library-binary-rpm for the target and a
devel-rpm for building vmware-enabled sbd.

Regards,
Klaus

>> I try to use fence_vmware with vcenter, but then the vcenter is a
>> single point of failure, and running two vcenters is currently not
>> possible.
> You can run vcenter on a vFT VM in which case it should be pretty
> robust.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] questions about startup fencing
On 11/29/2017 04:54 PM, Ken Gaillot wrote:
> On Wed, 2017-11-29 at 14:22 +, Adam Spiers wrote:
>> The same questions apply if this troublesome node was actually a
>> remote node running pacemaker_remoted, rather than the 5th node in
>> the cluster.
>
> Remote nodes don't join at the crmd level as cluster nodes do, so they
> don't "start up" in the same sense, and start-up fencing doesn't apply
> to them. Instead, the cluster initiates the connection when called for
> (I don't remember for sure whether it fences the remote node if the
> connection fails, but that would make sense).

According to link_rsc2remotenode() and handle_startup_fencing(), similar
"startup-fencing" applies to remote nodes too. So if a remote resource
fails to start, the remote node will be fenced. A global setting of
startup-fencing=false will change the behavior for remote nodes too.

Regards,
Yan

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] questions about startup fencing
On Wed, 2017-11-29 at 14:22 +, Adam Spiers wrote: > Hi all, > > A colleague has been valiantly trying to help me belatedly learn > about > the intricacies of startup fencing, but I'm still not fully > understanding some of the finer points of the behaviour. > > The documentation on the "startup-fencing" option[0] says > > Advanced Use Only: Should the cluster shoot unseen nodes? Not > using the default is very unsafe! > > and that it defaults to TRUE, but doesn't elaborate any further: > > https://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html/Pacemaker_Explained/s-cluster-options.html > > Let's imagine the following scenario: > > - We have a 5-node cluster, with all nodes running cleanly. > > - The whole cluster is shut down cleanly. > > - The whole cluster is then started up again. (Side question: what > happens if the last node to shut down is not the first to start up? > How will the cluster ensure it has the most recent version of the > CIB? Without that, how would it know whether the last man standing > was shut down cleanly or not?) Of course, the cluster can't know what CIB version the nodes it doesn't see have, so if a set of nodes is started with an older version, it will go with that. However, a node can't do much without quorum, so it would be difficult to get into a situation where CIB changes were made with quorum before shutdown, but none of those nodes are present at the next start-up with quorum. In any case, when a new node joins a cluster, the nodes do compare CIB versions. If the new node has a newer CIB, the cluster will use it. If other changes have been made since then, the newest CIB wins, so one or the other's changes will be lost. Whether missing nodes were shut down cleanly or not relates to your next question ... > - 4 of the nodes boot up fine and rejoin the cluster within the > dc-deadtime interval, forming a quorum, but the 5th doesn't.
> > IIUC, with startup-fencing enabled, this will result in that 5th node > automatically being fenced. If I'm right, is that really *always* > necessary? It's always safe. :-) As you mentioned, if the missing node was the last one alive in the previous run, the cluster can't know whether it shut down cleanly or not. Even if the node was known to shut down cleanly in the last run, the cluster still can't know whether the node was started since then and is now merely unreachable. So, fencing is necessary to ensure it's not accessing resources. The same scenario is why a single node can't have quorum at start-up in a cluster with "two_node" set. Both nodes have to see each other at least once before they can assume it's safe to do anything. > Let's suppose further that the cluster configuration is such that no > stateful resources which could potentially conflict with other nodes > will ever get launched on that 5th node. For example it might only > host stateless clones, or resources with requires=nothing set, or it > might not even host any resources at all due to some temporary > constraints which have been applied. > > In those cases, what is to be gained from fencing? The only thing I > can think of is that using (say) IPMI to power-cycle the node *might* > fix whatever issue was preventing it from joining the cluster. Are > there any other reasons for fencing in this case? It wouldn't help > avoid any data corruption, at least. Just because constraints are telling the node it can't run a resource doesn't mean the node isn't malfunctioning and running it anyway. If the node can't tell us it's OK, we have to assume it's not. > Now let's imagine the same scenario, except rather than a clean full > cluster shutdown, all nodes were affected by a power cut, but also > this time the whole cluster is configured to *only* run stateless > clones, so there is no risk of conflict between two nodes > accidentally > running the same resource.
On startup, the 4 nodes in the quorum > have > no way of knowing that the 5th node was also affected by the power > cut, so in theory from their perspective it could still be running a > stateless clone. Again, is there anything to be gained from fencing > the 5th node once it exceeds the dc-deadtime threshold for joining, > other than the chance that a reboot might fix whatever was preventing > it from joining, and get the cluster back to full strength? If a cluster runs only services that have no potential to conflict, then you don't need a cluster. :-) Unique clones require communication even if they're stateless (think IPaddr2). I'm pretty sure even some anonymous stateless clones require communication to avoid issues. > Also, when exactly does the dc-deadtime timer start ticking? > Is it reset to zero after a node is fenced, so that potentially that > node could go into a reboot loop if dc-deadtime is set too low? A node's crmd starts the timer at start-up and whenever a new election starts; the timer is stopped when the DC makes the node a join offer. I don't t
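To make the dc-deadtime discussion concrete: it too is just a cluster property, so slow-booting hardware can simply be given more headroom. A hedged sketch, assuming pcs or crmsh (syntax may differ by version, and 60s is an arbitrary illustrative value):

---
# Allow more time for peers to show up at start-up before the
# cluster gives up waiting on them (and, with startup-fencing=true,
# fences the nodes it never saw).
pcs property set dc-deadtime=60s
# crmsh equivalent:
crm configure property dc-deadtime=60s
---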
[ClusterLabs] Antw: Re: questions about startup fencing
> Adam Spiers writes: > >> - The whole cluster is shut down cleanly. >> >> - The whole cluster is then started up again. (Side question: what >> happens if the last node to shut down is not the first to start up? >> How will the cluster ensure it has the most recent version of the >> CIB? Without that, how would it know whether the last man standing >> was shut down cleanly or not?) > > This is my opinion, I don't really know what the "official" pacemaker > stance is: There is no such thing as shutting down a cluster cleanly. A > cluster is a process stretching over multiple nodes - if they all shut > down, the process is gone. When you start up again, you effectively have > a completely new cluster. > > When starting up, how is the cluster, at any point, to know if the > cluster it has knowledge of is the "latest" cluster? The next node could > have a newer version of the CIB which adds yet more nodes to the > cluster. > > The only way to bring up a cluster from being completely stopped is to > treat it as creating a completely new cluster. The first node to start > "creates" the cluster and later nodes join that cluster. I think it is (once again) a problem of pacemaker: In HP Service Guard there was a "cmhaltnode" to halt a node, and a "cmhaltcluster" (AFAIR) to halt the whole cluster. The other direction was "cmrunnode" and "cmruncluster" (AFAIR). So when doing it on the cluster level, all nodes end with the same information (and can start with the "latest"... 
> > Cheers, > Kristoffer > > -- > // Kristoffer Grönlund > // kgronl...@suse.com
Re: [ClusterLabs] questions about startup fencing
On 11/29/2017 04:23 PM, Kristoffer Grönlund wrote: > Adam Spiers writes: > >> - The whole cluster is shut down cleanly. >> >> - The whole cluster is then started up again. (Side question: what >> happens if the last node to shut down is not the first to start up? >> How will the cluster ensure it has the most recent version of the >> CIB? Without that, how would it know whether the last man standing >> was shut down cleanly or not?) > This is my opinion, I don't really know what the "official" pacemaker > stance is: There is no such thing as shutting down a cluster cleanly. A > cluster is a process stretching over multiple nodes - if they all shut > down, the process is gone. When you start up again, you effectively have > a completely new cluster. > > When starting up, how is the cluster, at any point, to know if the > cluster it has knowledge of is the "latest" cluster? The next node could > have a newer version of the CIB which adds yet more nodes to the > cluster. To make it even clearer imagine a node being reverted to a previous state by recovering it from a backup. Regards, Klaus > > The only way to bring up a cluster from being completely stopped is to > treat it as creating a completely new cluster. The first node to start > "creates" the cluster and later nodes join that cluster. > > Cheers, > Kristoffer >
Re: [ClusterLabs] questions about startup fencing
Adam Spiers writes: > - The whole cluster is shut down cleanly. > > - The whole cluster is then started up again. (Side question: what > happens if the last node to shut down is not the first to start up? > How will the cluster ensure it has the most recent version of the > CIB? Without that, how would it know whether the last man standing > was shut down cleanly or not?) This is my opinion, I don't really know what the "official" pacemaker stance is: There is no such thing as shutting down a cluster cleanly. A cluster is a process stretching over multiple nodes - if they all shut down, the process is gone. When you start up again, you effectively have a completely new cluster. When starting up, how is the cluster, at any point, to know if the cluster it has knowledge of is the "latest" cluster? The next node could have a newer version of the CIB which adds yet more nodes to the cluster. The only way to bring up a cluster from being completely stopped is to treat it as creating a completely new cluster. The first node to start "creates" the cluster and later nodes join that cluster. Cheers, Kristoffer -- // Kristoffer Grönlund // kgronl...@suse.com
Re: [ClusterLabs] building from source
On Tue, 2017-11-28 at 11:23 -0800, Aaron Cody wrote: > I'm trying to build all of the pacemaker/corosync components from > source instead of using the redhat rpms - I have a few questions. > > I'm building on redhat 7.2 and so far I have been able to build: > > libqb 1.0.2 > pacemaker 1.1.18 > corosync 2.4.3 > resource-agents 4.0.1 > > however I have not been able to build pcs yet, i'm getting ruby > errors: > > sudo make install_pcsd > which: no python3 in (/sbin:/bin:/usr/sbin:/usr/bin) > make -C pcsd build_gems > make[1]: Entering directory `/home/whacuser/pcs/pcsd' > bundle package > `ruby_22` is not a valid platform. The available options are: [:ruby, > :ruby_18, :ruby_19, :ruby_20, :ruby_21, :mri, :mri_18, :mri_19, > :mri_20, :mri_21, :rbx, :jruby, > :jruby_18, :jruby_19, :mswin, :mingw, :mingw_18, :mingw_19, > :mingw_20, :mingw_21, :x64_mingw, :x64_mingw_20, :x64_mingw_21] > make[1]: *** [get_gems] Error 4 > make[1]: Leaving directory `/home/whacuser/pcs/pcsd' > make: *** [install_pcsd] Error 2 > > > Q1: Is this the complete set of components I need to build? Not considering pcs, yes. > Q2: do I need cluster-glue? It's only used now to be able to use heartbeat-style fence agents. If you have what you need in Red Hat's fence agent packages, you don't need it. > Q3: any idea how I can get past the build error with pcsd? > Q4: if I use the pcs rpm instead of building pcs from source, I see > an error when my cluster starts up 'unable to get cib'. This didn't > happen when I was using the redhat rpms, so i'm wondering what i'm > missing... > > thanks pcs development is closely tied to Red Hat releases, so it's hit-or-miss mixing and matching pcs and RHEL versions. Upgrading to RHEL 7.4 would get you recent versions of everything, though, so that would be easiest if it's an option.
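Regarding Q3: the `ruby_22` platform symbol is only understood by newer Bundler releases, so that error usually means the Bundler on the build host is older than what pcsd's Gemfile expects. A sketch of one way to confirm and work around it (assuming `gem` is on the PATH and installing a newer Bundler is acceptable on that host):

---
# Check which Bundler the build is picking up; very old releases
# don't know the ruby_22 platform symbol used in pcsd's Gemfile.
bundle --version
# Install a newer Bundler and retry the failing target:
gem install bundler
make -C pcsd build_gems
---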
-- Ken Gaillot
[ClusterLabs] questions about startup fencing
Hi all, A colleague has been valiantly trying to help me belatedly learn about the intricacies of startup fencing, but I'm still not fully understanding some of the finer points of the behaviour. The documentation on the "startup-fencing" option[0] says Advanced Use Only: Should the cluster shoot unseen nodes? Not using the default is very unsafe! and that it defaults to TRUE, but doesn't elaborate any further: https://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html/Pacemaker_Explained/s-cluster-options.html Let's imagine the following scenario: - We have a 5-node cluster, with all nodes running cleanly. - The whole cluster is shut down cleanly. - The whole cluster is then started up again. (Side question: what happens if the last node to shut down is not the first to start up? How will the cluster ensure it has the most recent version of the CIB? Without that, how would it know whether the last man standing was shut down cleanly or not?) - 4 of the nodes boot up fine and rejoin the cluster within the dc-deadtime interval, forming a quorum, but the 5th doesn't. IIUC, with startup-fencing enabled, this will result in that 5th node automatically being fenced. If I'm right, is that really *always* necessary? Let's suppose further that the cluster configuration is such that no stateful resources which could potentially conflict with other nodes will ever get launched on that 5th node. For example it might only host stateless clones, or resources with requires=nothing set, or it might not even host any resources at all due to some temporary constraints which have been applied. In those cases, what is to be gained from fencing? The only thing I can think of is that using (say) IPMI to power-cycle the node *might* fix whatever issue was preventing it from joining the cluster. Are there any other reasons for fencing in this case? It wouldn't help avoid any data corruption, at least.
Now let's imagine the same scenario, except rather than a clean full cluster shutdown, all nodes were affected by a power cut, but also this time the whole cluster is configured to *only* run stateless clones, so there is no risk of conflict between two nodes accidentally running the same resource. On startup, the 4 nodes in the quorum have no way of knowing that the 5th node was also affected by the power cut, so in theory from their perspective it could still be running a stateless clone. Again, is there anything to be gained from fencing the 5th node once it exceeds the dc-deadtime threshold for joining, other than the chance that a reboot might fix whatever was preventing it from joining, and get the cluster back to full strength? Also, when exactly does the dc-deadtime timer start ticking? Is it reset to zero after a node is fenced, so that potentially that node could go into a reboot loop if dc-deadtime is set too low? The same questions apply if this troublesome node was actually a remote node running pacemaker_remoted, rather than the 5th node in the cluster. I have an uncomfortable feeling that I'm missing something obvious, probably due to the documentation's warning that "Not using the default [for startup-fencing] is very unsafe!" Or is it only unsafe when the node which exceeded dc-deadtime on startup could potentially be running a stateful resource which the cluster now wants to restart elsewhere? If that's the case, would it be possible to optionally limit startup fencing to when it's really needed? Thanks for any light you can shed!
[ClusterLabs] building from source
I'm trying to build all of the pacemaker/corosync components from source instead of using the redhat rpms - I have a few questions. I'm building on redhat 7.2 and so far I have been able to build: libqb 1.0.2 pacemaker 1.1.18 corosync 2.4.3 resource-agents 4.0.1 however I have not been able to build pcs yet, i'm getting ruby errors: sudo make install_pcsd which: no python3 in (/sbin:/bin:/usr/sbin:/usr/bin) make -C pcsd build_gems make[1]: Entering directory `/home/whacuser/pcs/pcsd' bundle package `ruby_22` is not a valid platform. The available options are: [:ruby, :ruby_18, :ruby_19, :ruby_20, :ruby_21, :mri, :mri_18, :mri_19, :mri_20, :mri_21, :rbx, :jruby, :jruby_18, :jruby_19, :mswin, :mingw, :mingw_18, :mingw_19, :mingw_20, :mingw_21, :x64_mingw, :x64_mingw_20, :x64_mingw_21] make[1]: *** [get_gems] Error 4 make[1]: Leaving directory `/home/whacuser/pcs/pcsd' make: *** [install_pcsd] Error 2 Q1: Is this the complete set of components I need to build? Q2: do I need cluster-glue? Q3: any idea how I can get past the build error with pcsd? Q4: if I use the pcs rpm instead of building pcs from source, I see an error when my cluster starts up 'unable to get cib'. This didn't happen when I was using the redhat rpms, so i'm wondering what i'm missing... thanks