Folks - we’re getting bitten occasionally by stability issues on some of our customer VMs indirectly related to ACS:
* The billing package[1] we use is touchy, and will occasionally reboot VMs when we bring up the VM’s details page in the billing package
* ACS recently lost connectivity with a node, asked the VR to ping the VMs but was blocked by the host firewall, so it decided the VM was down and then killed it after reconnecting to the node
* Something was either fat-fingered or misinterpreted in the billing package, and deleting a licensing product from a customer resulted in it telling ACS to delete a domain, user, the 10 VMs in it and their storage (Luckily I saw the grey icon of Shutdown/Expunge and shut down the mgmt server, but not before losing one VM. Somehow I haven’t had a heart attack yet)

My thought is each VM would have a LOCK field - when that’s set, the VM basically becomes “read-only” to ACS - stats are gathered, and ACS still monitors whether it’s up/down, but any change in running state, the node it’s on, storage, network, firewalls etc would be denied without some type of authorization (I’m not sure what I mean here yet, if it’s a separate login or maybe authenticating to get a token and then presenting it with the change, or...).

I understand in a larger environment there’s too much happening and this could backfire, but for our customers with legacy non-cloud architectures, stability is hugely important and anything we can do to help with that is worthwhile.

Maybe in a “phase 2” of this implementation granular controls could be added to specify what could/could not happen during “production lock”...

Looking to gauge interest and ideas/suggestions in something like this. Unfortunately it just jumped pretty much to the top of my priority list...

John

1: I’d rather not say which at this point.
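
P.S. To make the LOCK idea a bit more concrete, here is a rough sketch in Python of the check I have in mind. Everything in it is hypothetical: the names (VM, authorize, LockedError, READ_ONLY_OPS), the operation strings and the per-VM token are placeholders I made up for illustration, not anything that exists in ACS today, and the token is only one of the authorization options mentioned above.

```python
import hmac
from dataclasses import dataclass
from typing import Optional

# Hypothetical operation names; monitoring-style calls stay allowed on a locked VM.
READ_ONLY_OPS = {"get_stats", "check_state"}


class LockedError(Exception):
    """Raised when a change is attempted on a locked VM without authorization."""


@dataclass
class VM:
    name: str
    locked: bool = False                 # the proposed LOCK field
    unlock_token: Optional[str] = None   # assumed authorization mechanism


def authorize(vm: VM, operation: str, token: Optional[str] = None) -> None:
    """Return quietly if the operation may proceed, otherwise raise LockedError."""
    # Unlocked VMs behave exactly as they do today.
    if not vm.locked:
        return
    # Stats gathering and up/down monitoring are never blocked.
    if operation in READ_ONLY_OPS:
        return
    # Any other change needs a token matching the one stored for this VM.
    if token is not None and vm.unlock_token is not None:
        if hmac.compare_digest(token, vm.unlock_token):
            return
    raise LockedError(f"{vm.name} is locked; '{operation}' denied")
```

With something like this in front of every state-changing call, a destroy request from the billing package against a locked VM would be refused unless it carried the token, while stats and health checks would carry on as normal.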