Hi all,

When many instances are created simultaneously, a lot of them end up in the ERROR state. Most of these failures are caused by network RPC request timeouts. That is not a very graceful result.

I think it would be better if the scheduler kept a queue of create requests. When it finds that all hosts are busy enough (compute_node.current_workload has reached some threshold), it would temporarily stop casting requests to hosts until some host becomes free enough again. This way, booting many instances simultaneously would result in ACTIVE instances rather than many ERROR instances.

The approach has one weak point: if the threshold for current_workload is set too low, instance creation will be slow.

Do you have another quick fix?

Thanks,
--
best regards,
gtt
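
P.S. To make the idea concrete, here is a rough sketch in plain Python. It is not actual nova code. Host, ThrottlingScheduler, max_workload, submit, drain and build_finished are names made up for illustration; only current_workload corresponds to the compute_node field mentioned above, and the dispatched list merely stands in for the real RPC cast.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    current_workload: int = 0  # stands in for compute_node.current_workload


@dataclass
class ThrottlingScheduler:
    hosts: list
    max_workload: int  # hypothetical per-host threshold
    pending: deque = field(default_factory=deque)
    dispatched: list = field(default_factory=list)

    def submit(self, request_id):
        """Queue a create request, then dispatch whatever currently fits."""
        self.pending.append(request_id)
        self.drain()

    def drain(self):
        """Dispatch queued requests while some host is below the threshold."""
        while self.pending:
            # Pick the least loaded host (ties go to the first one listed).
            host = min(self.hosts, key=lambda h: h.current_workload)
            if host.current_workload >= self.max_workload:
                break  # every host is busy: keep the rest queued
            request_id = self.pending.popleft()
            host.current_workload += 1
            # Placeholder for casting the request to the compute host.
            self.dispatched.append((request_id, host.name))

    def build_finished(self, host_name):
        """Called when a host finishes a build; frees a slot and retries."""
        for host in self.hosts:
            if host.name == host_name:
                host.current_workload -= 1
        self.drain()
```

With two hosts and max_workload=2, submitting six requests would dispatch four and hold two in the queue until build_finished() is called for one of the hosts. A real implementation would also need to decide how long a request may wait in the queue before it is failed.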

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp