[ https://issues.apache.org/jira/browse/CLOUDSTACK-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sebastien goasguen resolved CLOUDSTACK-1919.
--------------------------------------------
    Resolution: Fixed

Closing this as too old and too generic

> Runbooks
> --------
>
>                 Key: CLOUDSTACK-1919
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-1919
>             Project: CloudStack
>          Issue Type: Improvement
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Doc
>            Reporter: Jessica Tomechak
>
> The RS [RightScale] runbooks are good stuff - we should seriously consider
> producing CS-specific content like that. (Kevin Kluge)
>
> On 03/02/2012 01:27 PM, Chiradeep Vittal wrote:
> > Those are useful /tools/. I am more interested in the /content/.
> > Running a cloud is a combination of operating cloudstack (stop / start
> > / add host / delete host / devices / storage) + operating the storage +
> > operating the network + operating the hypervisor + ancillary items
> > like the SQL server database. It is clear, for instance, that the
> > requisite checks were not done at [customer] before adding hosts to the
> > cluster (check CPU level, driver patch levels, firmware upgrades), nor
> > were they monitoring cloudstack for warnings about filling up storage
> > or monitoring the XS hotfix mailblast. The Run Book for the cloud will
> > contain content like this, plus solutions for when they receive alerts
> > about storage filling up, how to recover corrupt VHDs, how to
> > periodically back up primary storage, etc. How to transfer VMs between
> > failure domains when a particular failure domain has failed.
> > References to other runbooks, such as how to back up and restore MySQL.
> > Monitor the CS server and the underlying hardware with Nagios etc. Host
> > maintenance procedures, storage maintenance procedures.
> >
> > In addition, there needs to be a reference architecture for deploying
> > a cloud (define your failure domains, plan for capacity based on
> > service offerings, calculate IOPS requirements, network bandwidth,
> > switch capacity, core router capacity, IP address planning – public and
> > private).
> >
> > Finally, [before a user sets up a cloud, they should evaluate whether they
> > are ready]. Do they have change
> > management procedures? Do they have a CMDB? Do they have documented
> > problem management procedures? Let me just throw ITIL in there.
> >
> I agree - honestly, the complexity of IaaS is incredible.
> I'd go a step further in [the user] evaluation and say:
> * Do they have config management?
> * Do they have automated provisioning (esp. for hypervisors) - can they get a
>   new hypervisor up in EXACTLY the same configuration as the last one, down to
>   network bonding, without any manual intervention?
> * Do they have a monitoring system in place - and is it, or can it be made,
>   capable of monitoring cloudstack? [Need to define the baseline of what the
>   user should be monitoring]

--
This message was sent by Atlassian JIRA
(v6.2#6252)