There are several ways to make Python code that deals with a lot of data
faster, especially when it operates on DB fields from SQL tables (and that is
not limited to the nova scheduler).
Pulling data from large SQL tables and operating on it through regular Python
code (using Python loops) is extremely inefficient due to the nature of the
Python interpreter. If this is what the nova scheduler code is doing today,
the good news is that there is potentially huge room for improvement.


The scale-out approach in practice means a few instances (3 instances is
common), so the gain would be on the order of 3x (less than one order of
magnitude), but with sharply increased complexity to deal with concurrent
schedulers and potentially conflicting results (with the use of tools like ZK
or Consul...). In essence we would just be running the same unoptimized code
concurrently to achieve better throughput.
On the other hand, optimizing code that is not very optimized to start with
can yield a much better return than 3x, with the advantage of simplicity (one
active scheduler, which could be backed by a standby for HA).

Python is actually one of the better languages for *fast* in-memory big data
processing, thanks to open source scientific and data analysis libraries that
provide native speed through Cython-compiled code and powerful high-level
abstractions for complex filters and vectorized operations. Not only is this
fast, it also yields much smaller code.
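To make the loop-vs-vectorized contrast concrete, here is a minimal sketch
(the "free RAM" field and the 4096 MB threshold are invented for
illustration, not taken from nova):

```python
import numpy as np

# Hypothetical per-host free-RAM check; names and numbers are made up.
hosts = [{"free_ram_mb": r} for r in range(100_000)]

# Plain Python: an interpreted loop over 100,000 dicts.
ok_loop = [h for h in hosts if h["free_ram_mb"] >= 4096]

# Vectorized: the same predicate evaluated in native code by numpy.
free_ram = np.arange(100_000)
ok_vec = free_ram[free_ram >= 4096]

print(len(ok_loop), len(ok_vec))  # both 95904
```

Both produce the same selection, but the second form runs the comparison and
the selection in compiled code rather than in the interpreter loop.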

I have used libraries such as numpy and pandas to operate on very large data
sets (the equivalent of SQL tables with hundreds of thousands of rows), and
there are easily two orders of magnitude of difference between plain Python
code with loops and Python code using these libraries when operating on such
data in memory (that is, without any DB access).
Ordering the filters to apply the most selective reduction first (the kind of
reduction you describe below) certainly helps, but it becomes a second-order
effect once you use pandas filters, because they are extremely fast even for
very large datasets.
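A sketch of what "pandas filters" means here (the host table, column names
and thresholds are all made up; a real table would be loaded from the DB):

```python
import pandas as pd

# Hypothetical host table; columns are invented for illustration.
df = pd.DataFrame({
    "host": [f"node{i}" for i in range(10_000)],
    "free_ram_mb": [(i * 7919) % 65536 for i in range(10_000)],
    "free_vcpus": [(i * 31) % 64 for i in range(10_000)],
})

# Each "filter" is a vectorized boolean mask; combining masks with & is
# evaluated in native code, so the order the masks are written in barely
# matters for performance.
candidates = df[(df["free_ram_mb"] >= 4096) & (df["free_vcpus"] >= 4)]
```

Each filter pass over 10,000 rows is a single native-code scan, which is why
the relative ordering of the filters stops being the dominant cost.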

I'm curious to know why this path was not explored more before embarking full
speed on concurrency/scale-out options, which are a very complex and
treacherous path, as we see in this discussion. Working with all these
complex distributed frameworks is clearly very attractive intellectually, but
the cost of complexity is often overlooked.

Is there any data showing the performance of the current nova scheduler? How
many scheduling operations per second can nova do at scale with worst-case
filters?
When you think about it, 10,000 nodes and their associated properties is not
such a big number if you use the right libraries.
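As a rough sketch of that claim, here is an end-to-end filter-plus-weigh pass
over 10,000 hosts done entirely with vectorized operations (all metrics,
thresholds and weights are invented; a real weigher would use nova's own
criteria):

```python
import numpy as np

# 10,000 hypothetical hosts with made-up capacity metrics.
n = 10_000
rng = np.random.default_rng(0)          # fixed seed for reproducibility
free_ram_mb = rng.integers(0, 65536, n)
free_vcpus = rng.integers(0, 64, n)

# Feasibility masks plus a simple weighted score, all vectorized.
feasible = (free_ram_mb >= 4096) & (free_vcpus >= 4)
score = 1.0 * free_ram_mb + 100.0 * free_vcpus
score[~feasible] = -np.inf              # rule out infeasible hosts
best = int(np.argmax(score))            # index of the winning host
```

The whole pass is a handful of native-code array scans over 10,000 elements,
which is far below what these libraries are designed to handle.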




On 10/9/15, 1:10 PM, "Joshua Harlow" <harlo...@fastmail.com> wrote:

>And also we should probably deprecate/not recommend:
>
>http://docs.openstack.org/developer/nova/api/nova.scheduler.filters.json_filter.html#nova.scheduler.filters.json_filter.JsonFilter
>
>That filter IMHO basically disallows optimizations like forming SQL 
>statements for each filter (and then letting the DB do the heavy 
>lifting) or say having each filter say 'oh my logic can be performed by 
>a prepared statement ABC and u should just use that instead' (and then 
>letting the DB do the heavy lifting).
>
>Chris Friesen wrote:
>> On 10/09/2015 12:25 PM, Alec Hothan (ahothan) wrote:
>>>
>>> Still the point from Chris is valid. I guess the main reason openstack is
>>> going with multiple concurrent schedulers is to scale out by
>>> distributing the
>>> load between multiple instances of schedulers because 1 instance is too
>>> slow. This discussion is about coordinating the many instances of
>>> schedulers
>>> in a way that works and this is actually a difficult problem and will get
>>> worst as the number of variables for instance placement increases (for
>>> example NFV is going to require a lot more than just cpu pinning, huge
>>> pages
>>> and numa).
>>>
>>> Has anybody looked at why 1 instance is too slow and what it would
>>> take to
>>> make 1 scheduler instance work fast enough? This does not preclude the
>>> use of
>>> concurrency for finer grain tasks in the background.
>>
>> Currently we pull data on all (!) of the compute nodes out of the
>> database via a series of RPC calls, then evaluate the various filters in
>> python code.
>>
>> I suspect it'd be a lot quicker if each filter was a DB query.
>>
>> Also, ideally we'd want to query for the most "strict" criteria first,
>> to reduce the total number of comparisons. For example, if you want to
>> implement the "affinity" server group policy, you only need to test a
>> single host. If you're matching against host aggregate metadata, you
>> only need to test against hosts in matching aggregates.
>>
>> Chris
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
