On Oct 8, 2015, at 1:38 PM, Ian Wells <ijw.ubu...@cack.org.uk> wrote:

>> You've hit upon the problem with the current design: multiple, and 
>> potentially out-of-sync copies of the data.
> 
> Arguably, this is the *intent* of the current design, not a problem with it.

It may have been the intent, but that doesn't mean that we are where we need to 
be.

> The data can never be perfect (ever) so go with 'good enough' and run with 
> it, and deal with the corner cases.

It is defining what counts as "good enough" that is problematic.

> Truth be told, storing that data in MySQL is secondary to the correct 
> functioning of the scheduler.

I have no problem with MySQL (well, I do, but that's not relevant to this 
discussion). My issue is that the current system poorly replicates its data 
from MySQL to the places where it is needed.

> The one thing it helps with is when the scheduler restarts - it stands a 
> chance of making sensible decisions before it gets its full picture back.  
> (This is all very like route distribution protocols, you know: make the best 
> decision on the information you have to hand, assuming the rest of the system 
> will deal with your mistakes.  And hold times, and graceful restart, and…)

Yes, this is all well and good. My focus is on improving the information in 
hand when making that best decision.

> Is there any reason why the duplication (given it's not a huge amount of data 
> - megabytes, not gigabytes) is a problem?  Is there any reason why 
> inconsistency is a problem?

Many of the larger deployments are likely to have issues with the amount of data 
that must be managed in-memory by so many different parts of the system. 
Inconsistency is a problem, but one that has workarounds. The primary issue is 
scalability: with the current design, increasing the number of scheduler 
processes increases the raciness of the system.
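To put a rough number on that: even under the charitable assumption that each 
scheduler places VMs uniformly at random across suitable hosts, concurrent picks 
collide at a rate that grows quickly with the number of schedulers (it's the 
birthday problem). A toy simulation -- hypothetical numbers, not Nova code:

    import random

    def collision_rate(num_schedulers, num_hosts, trials=10000):
        """Fraction of trials where two+ schedulers pick the same host
        from identical, stale host-state snapshots."""
        collisions = 0
        for _ in range(trials):
            # Each scheduler picks independently before any claim lands.
            picks = [random.randrange(num_hosts) for _ in range(num_schedulers)]
            if len(set(picks)) < len(picks):
                collisions += 1
        return collisions / trials

    for n in (2, 5, 10):
        print(n, collision_rate(n, num_hosts=100))
    # Rises from ~1% at 2 schedulers to ~37% at 10, even with purely
    # random placement; deterministic weighing only makes it worse.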

> I do sympathise with your point in the following email where you have 5 VMs 
> scheduled by 5 schedulers to the same host, but consider:
> 
> 1. if only one host suits the 5 VMs this results in the same behaviour: 1 VM 
> runs, the rest don't.  There's more work to discover that but arguably less 
> work than maintaining a consistent database.

True, but in a large-scale deployment this is an extremely rare case.

> 2. if many hosts suit the 5 VMs then this is *very* unlucky, because we 
> should be choosing a host at random from the set of suitable hosts and that's 
> a huge coincidence - so this is a tiny corner case that we shouldn't be 
> designing around

Here is where we differ in our understanding. With the current system of 
filters and weighers, 5 schedulers getting requests for identical VMs and 
having identical information are *expected* to select the same host. It is not 
a tiny corner case; it is the most likely result for the current system design. 
By catching this situation early, in the scheduling process itself, we can avoid 
multiple RPC round-trips for the fail/retry mechanism.
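To illustrate why collision is the expected outcome, here is a toy model of a 
filter-and-weigh pipeline (hypothetical names, not actual Nova code). Given 
identical snapshots of host state and identical requests, a deterministic 
weigher returns the same "best" host every single time:

    import random

    # Hypothetical host-state snapshot, shared by all schedulers.
    hosts = [{"name": "host%d" % i, "free_ram_mb": random.randint(4096, 65536)}
             for i in range(100)]

    def schedule(request_ram_mb, host_states):
        # Filter: keep only hosts with enough free RAM.
        candidates = [h for h in host_states
                      if h["free_ram_mb"] >= request_ram_mb]
        # Weigh: prefer the most free RAM -- fully deterministic.
        return max(candidates, key=lambda h: h["free_ram_mb"])

    # Five schedulers, identical requests, identical information:
    picks = [schedule(2048, hosts)["name"] for _ in range(5)]
    print(picks)  # all five select the same host -- no coincidence involved

(Nova mitigates this somewhat by choosing randomly among the top-N weighed 
hosts, but that only dilutes the effect; it does not remove it.)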

> The worst case, is, however
> 
> 3. we attempt to pick the optimal host, and the optimal host for all 5 VMs is 
> the same despite there being other less perfect choices out there.  That 
> would get you a stampeding herd and a bunch of retries.
> 
> I admit that the current system does not solve well for (3).

IMO, this is identical to (2).


-- Ed Leafe




