On Mar 4, 2014, at 3:38 PM, Marcus wrote:
> On Tue, Mar 4, 2014 at 3:34 AM, France <mailingli...@isg.si> wrote:
>> Hi Marcus and others.
>>
>> There is no need to kill off the entire hypervisor if one of the primary
>> storages fails. You just need to kill the VMs and probably disable the SR on
>> XenServer, because all other SRs and VMs have no problems. If you kill those,
>> then you can safely start them elsewhere. On XenServer 6.2 you can destroy
>> the VMs which lost access to NFS without any problems.
>
> That's a great idea, but as already mentioned, it doesn't work in
> practice. You can't kill a VM that is hanging in D state, waiting on
> storage. I also mentioned that it causes problems for libvirt and much
> of the other system not using the storage.

You can on XS 6.2, as tried in real life and reported by others as well.
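For what it's worth, I did it through the GUI, but roughly the same thing should be
scriptable from the host. A minimal sketch, assuming the xe CLI is available on the
XenServer host and that vm-reset-powerstate / pbd-unplug behave the same way as the
GUI actions do; the UUIDs are placeholders, so please verify on your own pool first:

    #!/usr/bin/env python
    # Sketch: mark a VM that lost its NFS SR as halted, then detach the stale
    # SR from this host, leaving the other SRs and VMs on the host alone.
    import subprocess

    def xe(*args):
        # Thin wrapper around the XenServer xe CLI.
        return subprocess.check_output(("xe",) + args)

    vm_uuid = "REPLACE-WITH-VM-UUID"    # VM whose disks live on the dead NFS SR
    pbd_uuid = "REPLACE-WITH-PBD-UUID"  # PBD connecting this host to that SR

    # Force the power state of the stuck VM back to halted.
    xe("vm-reset-powerstate", "uuid=" + vm_uuid, "--force")

    # Detach the stale NFS SR from this host.
    # (Whether this succeeds while the NFS server is down is another matter.)
    xe("pbd-unplug", "uuid=" + pbd_uuid)

After that the halted VM can be started on another host.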
>
>> If you really want to still kill the entire host and its VMs in one go, I
>> would suggest live migrating the VMs which have not lost their storage off
>> first, and then killing those VMs on the stale NFS with a hard reboot.
>> The additional time spent migrating the working VMs would even give NFS
>> some grace time to maybe recover. :-)
>
> You won't be able to live migrate a VM that is stuck in D state, or
> use libvirt to do so if one of its storage pools is unresponsive,
> anyway.

I don't want to live migrate the VMs in D state, just the working VMs. Those that are stuck can die with the hypervisor reboot.

>> Hard reboot to recover from the D state of an NFS client can also be
>> avoided by using soft mount options.
>
> As mentioned, soft and intr very rarely actually work, in my
> experience. I wish they did, as I have truly come to loathe NFS for it.
>
>> I run a bunch of Pacemaker/Corosync/CMAN/Heartbeat/etc. clusters and we
>> don't just kill whole nodes but fence services from specific nodes.
>> STONITH is invoked only when a node loses quorum.
>
> Sure, but how do you fence a KVM host from an NFS server? I don't
> think we've written a firewall plugin that works to fence hosts from
> any NFS server. Regardless, what CloudStack does is more of a poor
> man's clustering; the mgmt server acts as the lock, in the sense that
> it is managing what's going on, but it's not a real clustering service.
> Heck, it doesn't even STONITH, it tries a clean shutdown, which fails
> as well due to hanging NFS (per the mentioned bug; to fix it they'll
> need IPMI fencing or something like that).

In my case, as well as in the case of the OP, the hypervisor got rebooted successfully.

> I didn't write the code, I'm just saying that I can completely
> understand why it kills nodes when it deems that their storage has
> gone belly-up. It's dangerous to leave that D state VM hanging around,
> and it will until the NFS storage comes back. In a perfect world you'd
> just stop the VMs that were having the issue, or if there were no VMs
> you'd just de-register the storage from libvirt, I agree.

As previously stated, on XS 6.2 you can "destroy" VMs with inaccessible NFS storage. I do not remember whether the processes were in D state or not, because I used the GUI, if I remember correctly. I am sure you can test it yourself too.

>>
>> Regards,
>> F.
>>
>> On 3/3/14 5:35 PM, Marcus wrote:
>>>
>>> It's the standard clustering problem. Any software that does any sort
>>> of active clustering is going to fence nodes that have problems, or
>>> should if it cares about your data. If the risk of losing a host due
>>> to a storage pool outage is too great, you could perhaps look at
>>> rearranging your pool-to-host correlations (certain hosts run VMs from
>>> certain pools) via clusters. Note that if you register a storage pool
>>> with a cluster, it will register the pool with libvirt when the pool
>>> is not in maintenance, which, when the storage pool goes down, will
>>> cause problems for the host even if no VMs from that storage are
>>> running (fetching storage stats, for example, will cause agent threads
>>> to hang if it's NFS), so you'd need to put Ceph in its own cluster and
>>> NFS in its own cluster.
>>>
>>> It's far more dangerous to leave a host in an unknown/bad state. If a
>>> host loses contact with one of your storage nodes, with HA, CloudStack
>>> will want to start the affected VMs elsewhere.
>>> If it does so, and your original host wakes up from its NFS hang, you
>>> suddenly have a VM running in two locations, and corruption ensues. You
>>> might think we could just stop the affected VMs, but NFS tends to make
>>> things that touch it go into D state, even with 'intr' and other
>>> parameters, which affects libvirt and the agent.
>>>
>>> We could perhaps open a feature request to disable all HA and just
>>> leave things as-is, disallowing operations when there are outages. If
>>> that sounds useful you can create the feature request on
>>> https://issues.apache.org/jira.
>>>
>>> On Mon, Mar 3, 2014 at 5:37 AM, Andrei Mikhailovsky <and...@arhont.com> wrote:
>>>>
>>>> Koushik, I understand that and I will put the storage into maintenance
>>>> mode next time. However, things happen and servers crash from time to
>>>> time, which is not a reason to reboot all host servers, even those
>>>> which do not have any running VMs with volumes on the NFS storage. The
>>>> bloody agent just rebooted every single host server regardless of
>>>> whether they were running VMs with volumes on the rebooted NFS server.
>>>> 95% of my VMs are running from Ceph and those should never have been
>>>> affected in the first place.
>>>>
>>>> ----- Original Message -----
>>>>
>>>> From: "Koushik Das" <koushik....@citrix.com>
>>>> To: "<us...@cloudstack.apache.org>" <us...@cloudstack.apache.org>
>>>> Cc: dev@cloudstack.apache.org
>>>> Sent: Monday, 3 March, 2014 5:55:34 AM
>>>> Subject: Re: ALARM - ACS reboots host servers!!!
>>>>
>>>> The primary storage needs to be put in maintenance before doing any
>>>> upgrade/reboot, as mentioned in the previous mails.
>>>>
>>>> -Koushik
>>>>
>>>> On 03-Mar-2014, at 6:07 AM, Marcus <shadow...@gmail.com> wrote:
>>>>
>>>>> Also, please note that the bug you referenced doesn't take issue with
>>>>> the reboot being triggered, but with the fact that the reboot never
>>>>> completes due to the hanging NFS mount (which is why the reboot
>>>>> occurs: inaccessible primary storage).
>>>>>
>>>>> On Sun, Mar 2, 2014 at 5:26 PM, Marcus <shadow...@gmail.com> wrote:
>>>>>>
>>>>>> Or do you mean you have multiple primary storages and this one was not
>>>>>> in use and put into maintenance?
>>>>>>
>>>>>> On Sun, Mar 2, 2014 at 5:25 PM, Marcus <shadow...@gmail.com> wrote:
>>>>>>>
>>>>>>> I'm not sure I understand. How do you expect to reboot your primary
>>>>>>> storage while VMs are running? It sounds like the host is being
>>>>>>> fenced since it cannot contact the resources it depends on.
>>>>>>>
>>>>>>> On Sun, Mar 2, 2014 at 3:24 PM, Nux! <n...@li.nux.ro> wrote:
>>>>>>>>
>>>>>>>> On 02.03.2014 21:17, Andrei Mikhailovsky wrote:
>>>>>>>>>
>>>>>>>>> Hello guys,
>>>>>>>>>
>>>>>>>>> I've recently come across the bug CLOUDSTACK-5429, which has rebooted
>>>>>>>>> all of my host servers without properly shutting down the guest VMs.
>>>>>>>>> I simply upgraded and rebooted one of the NFS primary storage
>>>>>>>>> servers, and a few minutes later, to my horror, I found out that all
>>>>>>>>> of my host servers had been rebooted. Is it just me, or should this
>>>>>>>>> bug be fixed ASAP and be a blocker for any new ACS release? Not only
>>>>>>>>> does it cause downtime, but also possible data loss and server
>>>>>>>>> corruption.
>>>>>>>>
>>>>>>>> Hi Andrei,
>>>>>>>>
>>>>>>>> Do you have HA enabled, and did you put that primary storage in
>>>>>>>> maintenance mode before rebooting it?
>>>>>>>> It's my understanding that ACS relies on the shared storage to
>>>>>>>> perform HA, so if the storage goes it's expected to go berserk. I've
>>>>>>>> noticed similar behaviour in XenServer pools without ACS.
>>>>>>>> I'd imagine a "cure" for this would be to use network-distributed
>>>>>>>> "filesystems" like GlusterFS or Ceph.
>>>>>>>>
>>>>>>>> Lucian
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sent from the Delta quadrant using Borg technology!
>>>>>>>>
>>>>>>>> Nux!
>>>>>>>> www.nux.ro
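P.S. On the KVM side, Marcus's "de-register the storage from libvirt" idea can be
sketched like this, for anyone who wants to experiment with it. This is only a
sketch: it assumes the libvirt Python bindings are installed, the pool name below is
a placeholder for whatever name libvirt knows the primary storage pool by (virsh
pool-list will show it), and, as Marcus points out, if the NFS mount is already
hard-hung even these calls may block.

    #!/usr/bin/env python
    # Sketch: stop and undefine a libvirt storage pool so nothing on the host
    # keeps touching a dead NFS mount (roughly `virsh pool-destroy` followed
    # by `virsh pool-undefine`).
    import libvirt

    conn = libvirt.open("qemu:///system")
    pool = conn.storagePoolLookupByName("REPLACE-WITH-POOL-NAME")

    if pool.isActive():
        pool.destroy()   # stop the pool; may block if the NFS server is gone
    pool.undefine()      # remove the pool definition from libvirt
    conn.close()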