Dan, et al-

> Well, (a) today you can't really externally retry a single instance
> build without just creating a new one. The new one could suffer the same
> fate, but that's why we just did the auto-disable feature for nova-compute.

Whoa, but that only kicks in after 10 consecutive failures (by default).
And if, e.g., the build bounced because the instance is too big for the
host, but other, smaller instances come in and succeed in the meantime,
that could wind up being stretched out indefinitely.  That doesn't sound
like a complete answer to this issue.
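
To make the worry concrete, here is a rough Python sketch of a
consecutive-failure counter.  It is not nova's actual code: the class and
function names are made up for illustration, and it assumes that any
successful build resets the count, which is the behavior I'm describing
above.

```python
class BuildFailureTracker:
    """Hypothetical stand-in for a per-host consecutive-failure counter."""

    def __init__(self, threshold=10):
        self.threshold = threshold        # assumed default of 10
        self.consecutive_failures = 0
        self.disabled = False

    def record(self, succeeded):
        """Record one build result; return True once the host is disabled."""
        if succeeded:
            # Assumption: any success resets the streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.threshold:
                self.disabled = True
        return self.disabled


def simulate(outcomes, threshold=10):
    """Feed a sequence of build outcomes (True = success) to one tracker."""
    tracker = BuildFailureTracker(threshold)
    for ok in outcomes:
        tracker.record(ok)
    return tracker.disabled
```

Ten straight failures disable the host, but a pattern like nine failures
followed by one success can repeat forever without ever tripping the
threshold.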

> Thing (b) is that if we fix rebuild so it works on a failed
> shell-of-an-instance from a boot operation, we could easily exclude the
> host it failed on, but it'd require some additional logic.

Right, so I think the need for that "additional logic" was my point.

Today you can limit the set of compute hosts to try by specifying an
"availability zone".  Perhaps the answer here is to support some kind of
"exclude these hosts" list for a "fresh" deploy.
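
Something like the following is what I have in mind, as a sketch only.
The function name, the `exclude_hosts` parameter, and the host dict layout
are all hypothetical; nothing here is an existing nova API.

```python
def candidate_hosts(hosts, availability_zone=None, exclude_hosts=()):
    """Return names of hosts eligible for a fresh deploy.

    hosts: list of dicts with "name" and "az" keys (made-up layout).
    availability_zone: if given, keep only hosts in that zone.
    exclude_hosts: hypothetical list of host names to skip, e.g. the
        host a previous build attempt failed on.
    """
    excluded = set(exclude_hosts)
    return [
        h["name"]
        for h in hosts
        if (availability_zone is None or h["az"] == availability_zone)
        and h["name"] not in excluded
    ]


hosts = [
    {"name": "cn1", "az": "az1"},
    {"name": "cn2", "az": "az1"},
    {"name": "cn3", "az": "az2"},
]
# Retry in az1 while skipping the host that failed last time.
retry_targets = candidate_hosts(hosts, "az1", exclude_hosts=["cn1"])
```

The zone restriction and the exclusion list compose, so a retry could stay
in the same zone while avoiding the host that just failed.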

But is the cure worse than the disease?

-efried


_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
