Hi Marcus, thanks for explaining. Maybe a side question: "like storage/host tags to guarantee each host only uses one NFS" - what do you mean by this? That is, how would you implement this? I know of tags, but I only know how to make sure certain Compute/Disk offerings use certain Compute/Storage hosts.
Not sure how to make some Hosts use some NFSs...? Thanks anyway,
Andrija

On 14 November 2014 18:18, Marcus <shadow...@gmail.com> wrote:
> It is there (I believe) because CloudStack is acting as a cluster manager
> for KVM. It uses NFS to determine whether it is 'alive' on the network,
> and if it is not, it reboots itself to avoid a split-brain scenario where
> VMs start coming up on other hosts while they are still running on this
> host. It generally works when the problem is the host, but as you point
> out, there is a situation where the problem can be the NFS server. That
> is fairly rare for enterprise NFS with high availability, but a fair
> number of people run NFS on servers that are relatively low availability
> (non-clustered, or prone to getting overloaded and unresponsive).
>
> There's plenty of room for improvement in that script. I agree the
> original implementation seems fairly rudimentary, but we have to be
> careful in thinking through all scenarios and make sure there is no
> chance of split brain. In the meantime, one could also partition the
> resources such that you have more clusters and only one primary storage
> per cluster (or something else, like storage/host tags to guarantee each
> host only uses one NFS).
>
> On Fri, Nov 14, 2014 at 8:07 AM, Andrija Panic <andrija.pa...@gmail.com>
> wrote:
>
> > Hi guys,
> >
> > I'm wondering why there is a check inside
> > /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh
> > ?
> >
> > I understand that the KVM host checks the availability of Primary
> > Storage, and reboots itself if it can't write to the storage.
> >
> > But if we have, say, 3 NFS servers in a cluster and a lot of KVM
> > hosts, then 1 primary storage going down (the server crashing or
> > whatever) will probably bring 99% of the KVM hosts down for a reboot?
> > So instead of losing uptime for 1/3 of my VMs (1 storage out of 3), I
> > lose uptime for 99-100% of my VMs?
> >
> > I manually edit this script to disable reboots - but why is it there
> > in any case?
> > It doesn't make sense to me - unless I'm missing a point (probably)...
> >
> > Thanks,
> > --
> >
> > Andrija Panić
> --
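For anyone following along, the heartbeat check being discussed boils down to "try to write to the NFS mount; if the write hangs or fails, assume the host is fenced off and reboot". Below is a minimal sketch of that idea, not the actual kvmheartbeat.sh: the file name, mount-point argument, and timeout value are all illustrative assumptions, and it only prints what it would do instead of rebooting.

```shell
#!/bin/sh
# Sketch of an NFS heartbeat check in the spirit of kvmheartbeat.sh.
# NOT the real script: paths, timeout, and the reboot decision are
# illustrative assumptions.

MOUNT_POINT="${1:-/tmp}"            # where the NFS primary storage is mounted
HB_FILE="$MOUNT_POINT/hb-$(hostname)"
TIMEOUT=5                           # seconds before declaring the NFS dead

check_heartbeat() {
    # Write a timestamp onto the shared mount. If the NFS server is
    # unresponsive the write blocks, `timeout` kills it, and we fail.
    if timeout "$TIMEOUT" sh -c "date +%s > '$HB_FILE'"; then
        echo "storage alive"
    else
        echo "storage unreachable"
        return 1
    fi
}

# The real script reboots the host on failure to avoid split brain,
# i.e. VMs being started elsewhere while still running here.
check_heartbeat || echo "would reboot host to avoid split brain"
```

The point Marcus makes in the thread is that this trade-off (reboot on lost storage) is only safe to relax if you can rule out split brain some other way, e.g. one primary storage per cluster.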