Re: Primary storage failure

Dean Kamali Wed, 03 Jul 2013 11:51:15 -0700

Geoff, thanks for your help. I'm just wondering whether this change will have
any impact on the HA operations that CloudStack offers for HA instances (if one
of the nodes dies, the VM restarts on another node).


Thanks again for your help


On Wed, Jul 3, 2013 at 2:39 PM, David Nalley <[email protected]> wrote:

> This warrants a bug IMO.
>
> --David
>
> On Wed, Jul 3, 2013 at 2:38 PM, Geoff Higginbottom
> <[email protected]> wrote:
> > Dean,
> >
> > I am guessing you are using NFS for your Primary Storage.
> >
> > This is actually 'by design'.  The logic is that if the storage goes
> offline, then all VMs must have also failed, and a 'forced' reboot of the
> Host 'might' automatically fix things.
> >
> > This is great if you only have one Primary Storage, but typically you
> have more than one, so whilst the reboot might fix the failed storage, it
> will also kill off all the perfectly good VMs which were still happily
> running.
> >
> > The fix for XenServer Hosts is to:
> >
> > 1. Modify /opt/xensource/bin/xenheartbeat.sh on all your Hosts,
> commenting out the two entries which have "reboot -f"
> >
> > 2. Identify the PID of the script  - pidof -x xenheartbeat.sh
> >
> > 3. Restart the Script  - kill <pid>
> >
> > 4. Force reconnect Host from the UI,  the script will then re-launch on
> reconnect
> >
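
The XenServer steps above could be scripted roughly as follows. This is a minimal sketch, not a tested procedure. The function names disable_forced_reboot and stop_heartbeat are hypothetical, it assumes a POSIX shell with sed and pidof available on the Host, and it comments out every uncommented line containing "reboot -f" rather than checking that there are exactly two. Step 4 (force reconnect from the UI) stays manual.

```shell
#!/bin/sh
# Sketch of steps 1-3 above. Function names are illustrative only.

# Step 1: comment out every active line that contains "reboot -f".
# Usage: disable_forced_reboot /path/to/xenheartbeat.sh
disable_forced_reboot() {
    script="$1"
    # Refuse to clobber an earlier backup of the unmodified script.
    if [ -e "$script.orig" ]; then
        echo "backup $script.orig already exists" >&2
        return 1
    fi
    # Keep a copy so the change can be reverted.
    cp -p "$script" "$script.orig" || return 1
    # Skip lines that are already comments; prefix the rest with "#".
    # Writing with ">" keeps the file mode of the existing script.
    sed '/^[[:space:]]*#/!s/.*reboot -f.*/#&/' "$script.orig" > "$script"
}

# Steps 2 and 3: find the running script and kill it. Per step 4, the
# script is expected to re-launch when the Host reconnects.
# Usage: stop_heartbeat xenheartbeat.sh
stop_heartbeat() {
    name="$1"
    # pidof -x also matches running shell scripts by name.
    pids=$(pidof -x "$name") || { echo "$name is not running" >&2; return 0; }
    # Left unquoted on purpose: pidof may print several PIDs.
    kill $pids
}
```

On a Host this would be called as "disable_forced_reboot /opt/xensource/bin/xenheartbeat.sh" followed by "stop_heartbeat xenheartbeat.sh", and then the Host is force reconnected from the UI. The unmodified script is left next to it with an .orig suffix.
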
> > If you are running KVM, I'm guessing there is a similar script, but I have
> > not tried this yet for anything other than XenServer (it does not apply to
> > ESXi).
> >
> > Regards
> >
> > Geoff Higginbottom
> >
> > D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
> >
> > [email protected]
> >
> >
> > -----Original Message-----
> > From: Dean Kamali [mailto:[email protected]]
> > Sent: 03 July 2013 19:14
> > To: [email protected]
> > Subject: Primary storage failure
> >
> > Hello everyone
> >
> > I'm testing failure scenarios, and I have noticed that as soon as the
> > primary storage goes offline, the CloudStack management server seems to
> > think that the hypervisor is not responding and reboots the node. If you
> > have a number of nodes, it will eventually reboot all of them (losing
> > everything... fun!).
> >
> > What if I have more than one primary storage and one of them fails? Will it
> > reboot all of my hypervisors? That doesn't seem right to me.
> >
> > Is there a way to control this behavior?
> >
> > It seems that the CloudStack management server needs to be a little smarter.
> >
>
