Excerpts from Zane Bitter's message of 2015-10-09 17:09:46 +0000:
> On 08/10/15 21:32, Ian Wells wrote:
>
> >
> > 2. if many hosts suit the 5 VMs then this is *very* unlucky, because
> > we should be choosing a host at random from the set of
> > suitable hosts and that's a huge coincidence - so this is a tiny
> > corner case that we shouldn't be designing around
> >
> > Here is where we differ in our understanding. With the current
> > system of filters and weighers, 5 schedulers getting requests for
> > identical VMs and having identical information are *expected* to
> > select the same host. It is not a tiny corner case; it is the most
> > likely result for the current system design. By catching this
> > situation early (in the scheduling process) we can avoid multiple
> > RPC round-trips to handle the fail/retry mechanism.
> >
> >
> > And so maybe this would be a different fix - choose, at random, one of
> > the hosts above a weighting threshold, not choose the top host every
> > time? Technically, any host passing the filter is adequate to the task
> > from the perspective of an API user (and they can't prove if they got
> > the highest weighting or not), so if we assume weighting is an operator
> > preference, and just weaken it slightly, we'd have a few more options.
>
> The optimal way to do this would be a weighted random selection, where
> the probability of any given host being selected is proportional to its
> weighting. (Obviously this is limited by the accuracy of the weighting
> function in expressing your actual preferences - and it's at least
> conceivable that this could vary with the number of schedulers running.)
>
> In fact, the choice of the name 'weighting' would normally imply that
> it's done this way; hearing that the 'weighting' is actually used as a
> 'score' with the highest one always winning is quite surprising.
>
> cheers,
> Zane.
>
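For the archives, Zane's weighted random selection is simple to sketch. Here is a minimal illustration in Python, assuming the weighers have already produced a list of (host, weight) pairs with non-negative weights; the names `pick_host` and `weighed_hosts` are made up for the example and are not Nova code:

```python
import random

def pick_host(weighed_hosts):
    """Select one host at random with probability proportional to its
    weight, instead of always taking the top scorer.

    weighed_hosts: list of (host, weight) pairs, weights >= 0,
    standing in for the scheduler's weigher output (hypothetical).
    """
    hosts = [h for h, _ in weighed_hosts]
    weights = [w for _, w in weighed_hosts]
    # random.choices (Python 3.6+) performs exactly this weighted draw.
    return random.choices(hosts, weights=weights, k=1)[0]

# Demo: with weights 1/2/7, selections converge to roughly 10%/20%/70%,
# so concurrent schedulers mostly spread out while still favouring the
# operator's preferred hosts.
random.seed(42)
counts = {"a": 0, "b": 0, "c": 0}
for _ in range(10000):
    counts[pick_host([("a", 1.0), ("b", 2.0), ("c", 7.0)])] += 1
print(counts)
```

Note this only weakens, not discards, the operator preference: a host weighted 7x higher is still chosen 7x more often, but five schedulers drawing concurrently will rarely all land on the same host.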
There is a more generalized version of this algorithm for concurrent
scheduling that I've seen a few times: pick N options at random, apply a
heuristic over those N to pick the best, attempt to schedule on your
choice, and retry on failure. As long as you have a fast heuristic and
your N is sufficiently smaller than the total number of options, the
retries are rare-ish and cheap. It can also scale out extremely well.

Obviously you lose some of the ability to micro-manage where things are
placed with a scheduling setup like that, but if scaling up is the
concern I really hope that isn't a problem...

Cheers,
Greg

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
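[Editor's note: Greg's sample-then-score-then-retry loop can be sketched in a few lines of Python. Everything below - `schedule`, `try_claim`, the toy capacity model - is an illustrative stand-in, not an actual scheduler API; the claim step is where a concurrent scheduler can lose a race and fall into the retry path.]

```python
import random

def schedule(all_hosts, score, try_claim, sample_size=3, max_retries=10):
    """Pick `sample_size` hosts at random, run the (fast) heuristic
    `score` over just that sample, try to claim the best one, and retry
    with a fresh random sample if another scheduler beat us to it."""
    for _ in range(max_retries):
        sample = random.sample(all_hosts, min(sample_size, len(all_hosts)))
        best = max(sample, key=score)
        if try_claim(best):  # may fail under concurrent schedulers
            return best
    raise RuntimeError("no host claimed after %d tries" % max_retries)

# Toy demo: 100 hosts with capacity 1 each; a claim fails once a host is
# full, modelling a lost race against another scheduler.
random.seed(1)
capacity = {h: 1 for h in range(100)}

def try_claim(host):
    if capacity[host] > 0:
        capacity[host] -= 1
        return True
    return False

placed = [schedule(list(capacity), score=lambda h: capacity[h],
                   try_claim=try_claim) for _ in range(50)]
print(len(placed))  # 50 placements if every claim eventually succeeded
```

Because the heuristic only ever sees a small random sample, concurrent schedulers naturally diverge in their choices, which is exactly why the retries stay rare when N is much smaller than the host count.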