On 05/22/2017 03:36 PM, Sean Dague wrote:
> On 05/22/2017 02:45 PM, James Penick wrote:
> <snip>
>>> I recognize that large Ironic users expressed their concerns about IPMI/BMC communication being unreliable and not wanting to have users manually retry a baremetal instance launch. But, on this particular point, I'm of the opinion that Nova just do one thing and do it well. Nova isn't an orchestrator, nor is it intending to be a "just continually try to get me to this eventual state" system like Kubernetes.
>>
>> Kubernetes is a larger orchestration platform that provides autoscale. I don't expect Nova to provide autoscale, but I agree that Nova should do one thing and do it really well, and in my mind that thing is reliable provisioning of compute resources. Kubernetes does autoscale among other things. I'm not asking for Nova to provide Autoscale, I -AM- asking OpenStack's compute platform to provision a discrete compute resource reliably. This means overcoming common and simple error cases.
>>
>> As a deployer of OpenStack I'm trying to build a cloud that wraps the chaos of infrastructure, and present a reliable facade. When my users issue a boot request, I want to see it fulfilled. I don't expect it to be a 100% guarantee across any possible failure, but I expect (and my users demand) that my "Infrastructure as a service" API make reasonable accommodation to overcome common failures.
>
> Right, I think this hits my major queasiness with throwing the baby out with the bathwater here. I feel like Nova's job is to give me a compute when asked for computes. Yes, like malloc, things could fail. But honestly if Nova can recover from that scenario, it should try to. The baremetal and affinity cases are pretty good instances where Nova can catch and recover, and not just export that complexity up. It would make me sad to just export that complexity to users, and instead of handling those cases internally make every SDK, App, and simple script build their own retry loop.
If Heat was more widely deployed, would you feel this way? Would you reconsider having Heat as one of those "basic compute services" in OpenStack, then?
This is, unfortunately, one of the main problems stemming from OpenStack not having a *single* public API, with projects implementing parts of that single public API. You know, the thing I started arguing for about 6 years ago.
If we had one single public porcelain API, we wouldn't even need to have this conversation. People wouldn't even know we'd changed implementation details behind the scenes and were doing retries at a slightly higher level than before. Oh well... we live and learn (maybe).
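For anyone who wants the "retry loop" under discussion made concrete, here is a minimal sketch of what retrying a boot request one level above the compute call could look like. Everything in it is an assumption for illustration: `BootError`, `boot_with_retry`, and `make_flaky_boot` are hypothetical names, and `boot` is a stand-in callable, not a real Nova, Heat, or SDK call.

```python
import time


class BootError(Exception):
    """Hypothetical error standing in for a failed boot request."""


def boot_with_retry(boot, max_attempts=3, delay=1.0, sleep=time.sleep):
    """Call boot() until it succeeds or max_attempts is reached.

    boot is any zero-argument callable that returns a server handle or
    raises BootError. It is a stand-in, not a real Nova or SDK call.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return boot()
        except BootError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(delay)  # fixed pause before the next attempt


def make_flaky_boot(failures):
    """Build a fake boot() that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def boot():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise BootError("simulated BMC timeout")
        return "server-%d" % state["calls"]

    return boot
```

With `make_flaky_boot(2)` and `max_attempts=3`, the first two simulated failures are absorbed and the caller only sees the successful third attempt. Whether a loop like this lives inside the compute service, in a layer above it, or in every client script is exactly the question in this thread.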
Best,
-jay
