Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

Jay Pipes Tue, 23 May 2017 08:56:54 -0700

On 05/22/2017 03:36 PM, Sean Dague wrote:

On 05/22/2017 02:45 PM, James Penick wrote:
<snip>


     I recognize that large Ironic users expressed their concerns about
     IPMI/BMC communication being unreliable and not wanting to have
     users manually retry a baremetal instance launch. But, on this
     particular point, I'm of the opinion that Nova just do one thing and
     do it well. Nova isn't an orchestrator, nor is it intending to be a
     "just continually try to get me to this eventual state" system like
     Kubernetes.

Kubernetes is a larger orchestration platform that provides autoscale. I
don't expect Nova to provide autoscale, but

I agree that Nova should do one thing and do it really well, and in my
mind that thing is reliable provisioning of compute resources.
Kubernetes does autoscale among other things. I'm not asking for Nova to
provide Autoscale, I -AM- asking OpenStack's compute platform to
provision a discrete compute resource reliably. This means overcoming
common and simple error cases. As a deployer of OpenStack I'm trying to
build a cloud that wraps the chaos of infrastructure, and present a
reliable facade. When my users issue a boot request, I want to see it
fulfilled. I don't expect it to be a 100% guarantee across any possible
failure, but I expect (and my users demand) that my "Infrastructure as a
service" API make reasonable accommodation to overcome common failures.


Right, I think this hits my major queasiness with throwing the baby out with
the bathwater here. I feel like Nova's job is to give me a compute when
asked for computes. Yes, like malloc, things could fail. But honestly if
Nova can recover from that scenario, it should try to. The baremetal and
affinity cases are pretty good instances where Nova can catch and
recover, and not just export that complexity up.

It would make me sad to just export that complexity to users, and
instead of handling those cases internally make every SDK, App, and
simple script build their own retry loop.

If Heat was more widely deployed, would you feel this way? Would you
reconsider having Heat as one of those "basic compute services" in
OpenStack, then?

This is, unfortunately, one of the main problems stemming from OpenStack
not having a *single* public API, with projects implementing parts of
that single public API. You know, the thing I started arguing for about
6 years ago.

If we had one single public porcelain API, we wouldn't even need to have
this conversation. People wouldn't even know we'd changed implementation
details behind the scenes and were doing retries at a slightly higher
level than before. Oh well... we live and learn (maybe).


Best,
-jay

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
