Hi Jeromy,

I love the idea. I'm not really a developer, so the developers will look at 
things a different way, but...

These would be my initial comments:


  1.  We can't/don't run scripts on vSphere hosts (not sure about Hyper-V).
  2.  I know of one failure scenario (which happened) where MTU issues in 
intermediate switches meant that small amounts of data could pass, but anything 
that was passed as jumbo frames then failed. So it would be important to 
exercise that.
  3.  You need to be very sure of failures before shutting hosts down.  Also a 
host is likely to be connected to multiple storage pools, so you wouldn't want 
to shut down a host due to one pool becoming unavailable.
  4.  Environments can have hundreds of storage pools, so watch out for 
spamming the logs with updates.
  5.  The primary storage pools have a 'state' which should get updated and 
used by the deployment planners.
  6.  Secondary storage pools don't have a 'state' - but it would be great if 
that were added in the DB and reflected in the UI.
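
To illustrate points 2 and 3, here is a minimal Python sketch, not a proposed 
implementation. It assumes the probe is run once with a small payload and once 
with a payload large enough to be sent as full-size frames, and that the 
monitor knows which pools each VM uses. The probe sizes, the status strings 
and the names classify_probe and vms_to_recover are all hypothetical.

```python
from typing import Dict, List, Set

# Assumed probe sizes (hypothetical values, not taken from the proposal).
SMALL_PROBE_BYTES = 512        # fits inside one standard 1500-byte frame
LARGE_PROBE_BYTES = 64 * 1024  # large enough to travel as full-size frames


def classify_probe(results: Dict[int, bool]) -> str:
    """Turn per-size probe results (size -> succeeded) into a coarse status.

    A missing result is treated as a failure.
    """
    small_ok = results.get(SMALL_PROBE_BYTES, False)
    large_ok = results.get(LARGE_PROBE_BYTES, False)
    if small_ok and large_ok:
        return "healthy"
    if small_ok:
        # Small transfers pass but large ones fail: the pattern from point 2.
        return "suspect-mtu"
    return "unavailable"


def vms_to_recover(vm_pools: Dict[str, Set[str]],
                   failed_pools: Set[str]) -> List[str]:
    """Return only the VMs that use at least one failed pool (point 3)."""
    return sorted(vm for vm, pools in vm_pools.items() if pools & failed_pools)
```

Scoping the reaction to the affected VMs, rather than to the whole host, is 
one way to avoid shutting down a host because a single pool went away.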



Kind regards,

Paul Angus


paul.an...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HS UK
@shapeblue
  
 

From: Jeromy Grimmett [mailto:jer...@cloudbrix.com]
Sent: 10 March 2017 15:28
To: dev@cloudstack.apache.org
Subject: [Proposal] - StorageHA

Hello,

I am new to the mailing list, and we are glad to be a part of the CloudStack 
community.  We are looking to develop plugins and modules that will help grow 
and expand the adoption and use of CloudStack.  So as part of my introductory 
email, I'd like to introduce a little project we have been working on: a 
StorageHA Monitor.  The Monitor would allow CloudStack and the hosts to test, 
communicate and resolve VM availability issues when storage (primary and/or 
secondary) unavailability becomes apparent.  This is a small write-up about how 
it would work:

It consists of two scripts/programs:

The host script runs on the host servers and checks whether the primary and 
secondary storage are available by doing a read/write test, then reports to the 
master script that runs on the CloudStack server. The host script will test a 
read and a write to the storage every 5 seconds (configurable), and if the test 
fails 3 times (configurable) the failure will be recorded by the master script.
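
The host-side check described above could be sketched roughly as follows. 
This is a minimal Python illustration, not the code we have written: it assumes 
the storage is mounted as a local path, that the 3 failures must be 
consecutive, and that a failure is reported once per outage. The names 
read_write_probe, FailureTracker, monitor and the probe file name are 
hypothetical.

```python
import os
import time
from typing import Callable, Optional


def read_write_probe(mount_path: str, payload: bytes = b"storageha-probe") -> bool:
    """Write a small file under mount_path, read it back, and compare."""
    probe_file = os.path.join(mount_path, ".storageha_probe")  # hypothetical name
    try:
        with open(probe_file, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # push the write towards the storage
        with open(probe_file, "rb") as fh:
            ok = fh.read() == payload
        os.remove(probe_file)
        return ok
    except OSError:
        return False


class FailureTracker:
    """Counts consecutive failures (assumption: failures must be consecutive)."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, ok: bool) -> bool:
        """Return True only on the check where the threshold is first reached."""
        if ok:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures == self.threshold


def monitor(mount_path: str,
            report: Callable[[str], None],
            interval_s: float = 5.0,
            threshold: int = 3,
            max_checks: Optional[int] = None,
            probe: Callable[[str], bool] = read_write_probe,
            sleep: Callable[[float], None] = time.sleep) -> None:
    """Probe mount_path every interval_s seconds and report a sustained failure."""
    tracker = FailureTracker(threshold)
    checks = 0
    while max_checks is None or checks < max_checks:
        if tracker.record(probe(mount_path)):
            report(mount_path)  # stand-in for notifying the master script
        checks += 1
        sleep(interval_s)
```

Note that the read-back in this sketch may be served from the host's cache 
rather than from the storage itself, so a real test would likely need to 
bypass caching.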

The master script will monitor the results of the host script. If the test is 
good, no action is taken and the results are logged so that we can track the 
history of the test results. If the test reports back as failed, then it will 
perform the following actions:


  *   Secondary Storage - It will simply generate and send an alert that the 
failure has occurred.


  *   Primary Storage - The script will perform the following tasks:
     *   Generate and send an alert that the failure has occurred.
     *   Force the VMs on that host to shut down.
     *   Determine which host to move the VMs to.
     *   Start the VMs on the healthy host.
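
The actions listed above could be sketched as a planning step like the one 
below. This is a minimal Python illustration under stated assumptions: it only 
builds a list of intended actions and does not call any CloudStack API, and the 
Host fields, the "most free slots" placement rule and the names pick_target and 
plan_actions are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Host:
    name: str
    healthy: bool
    free_slots: int  # hypothetical capacity measure


def pick_target(hosts: List[Host], failed_host: str) -> Optional[Host]:
    """Pick the healthy host with the most free slots, or None if there is none."""
    candidates = [h for h in hosts
                  if h.healthy and h.name != failed_host and h.free_slots > 0]
    return max(candidates, key=lambda h: h.free_slots, default=None)


def plan_actions(storage_type: str, failed_host: str,
                 vms: List[str], hosts: List[Host]) -> List[Tuple[str, ...]]:
    """Return the ordered actions for a reported storage failure.

    Note: reserves capacity by decrementing free_slots on the chosen hosts.
    """
    actions: List[Tuple[str, ...]] = [
        ("alert", f"{storage_type} storage failure reported by {failed_host}")
    ]
    if storage_type != "primary":
        return actions  # secondary storage: alert only
    for vm in vms:
        actions.append(("force_stop", vm, failed_host))
    for vm in vms:
        target = pick_target(hosts, failed_host)
        if target is None:
            actions.append(("alert", f"no healthy host available for {vm}"))
            continue
        target.free_slots -= 1
        actions.append(("start", vm, target.name))
    return actions
```

Keeping the plan separate from its execution would also make it easy to log or 
review what the monitor intends to do before any VM is touched.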

We have already started working on some code, and the solution seems to be 
testing well.  Any thoughts, ideas or input are welcome.  Should there be a 
solution out there already, then please forgive our ignorance, and point us in 
the right direction. We look forward to further collaboration with you all.

Regards,
j

Jeromy Grimmett
155 Fleet Street
Portsmouth, NH 03801
Direct: 603.766.3625
Office: 603.766.4908
Fax: 603.766.4729
jer...@cloudbrix.com
www.cloudbrix.com
