It is easy to see that scheduling in the nova-scheduler service consists of 
two major phases:
A. Cache refresh, in code [1].
B. Filtering and weighing, in code [2].

A couple of previous experiments [3][4] show that “cache-refresh” is the major 
bottleneck of the nova scheduler. For example, page 15 of presentation [3] 
shows that “cache-refresh” takes 98.5% of the time of the entire `_schedule` 
function [6] when there are 200-1000 nodes and 50+ concurrent requests. The 
latest experiments [5] in China Mobile’s 1000-node environment confirm the 
same conclusion; the figure is even 99.7% with 40+ concurrent requests.

Here are some existing solutions to the “cache-refresh” bottleneck:
I. Caching scheduler.
II. Scheduler filters in DB [7].
III. Eventually consistent scheduler host state [8].

I can discuss their merits and drawbacks in a separate thread, but here I want 
to show the simplest solution, based on my findings during the experiments 
[5]. I wrapped the expensive function [1] to observe the behavior of 
cache-refresh under pressure. Interestingly, a single cache-refresh costs only 
about 0.3 seconds, but when there are concurrent cache-refresh operations this 
cost can suddenly grow to 8 seconds. I’ve even seen it reach 60 seconds for 
one cache-refresh under higher pressure. See the section below for details.
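For reference, such a measurement can be done with a simple timing wrapper 
around the expensive function [1]. A minimal sketch (this is not the actual 
instrumentation code; names are illustrative) that also records how many 
calls are in flight at once:

```python
import functools
import threading
import time


def timed(func):
    """Log the wall-clock cost of each call to func, together with
    the number of concurrent in-flight calls at the moment it started."""
    in_flight = [0]
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            in_flight[0] += 1
            concurrent = in_flight[0]
        start = time.time()
        try:
            return func(*args, **kwargs)
        finally:
            cost = time.time() - start
            with lock:
                in_flight[0] -= 1
            print("%s: %.3fs (concurrency %d)"
                  % (func.__name__, cost, concurrent))

    return wrapper
```

Decorating the cache-refresh entry point with such a wrapper is enough to 
reproduce the numbers below: the per-call cost grows sharply as the reported 
concurrency rises.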

This raises a question about the current implementation: do we really need a 
cache-refresh operation [1] for *every* request? If those concurrent 
operations are replaced by a single database query, the scheduler still gets 
the latest resource view from the database. The scheduler is even happier, 
because the expensive cache-refresh operations are minimized and much faster 
(0.3 seconds). I believe this is the simplest optimization of scheduler 
performance: it doesn’t require any changes to the filter scheduler, and 
minor improvements inside the host manager are enough.

[1] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L104
[2] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L112-L123
[3] https://www.openstack.org/assets/presentation-media/7129-Dive-into-nova-scheduler-performance-summit.pdf
[4] http://lists.openstack.org/pipermail/openstack-dev/2016-June/098202.html
[5] Please refer to Barcelona summit session ID 15334 later: “A tool to test 
and tune your OpenStack Cloud? Sharing our 1000 node China Mobile experience.”
[6] https://github.com/openstack/nova/blob/master/nova/scheduler/filter_scheduler.py#L53
[7] https://review.openstack.org/#/c/300178/
[8] https://review.openstack.org/#/c/306844/


****** Here is the discovery from the latest experiments [5] ******
https://docs.google.com/document/d/1N_ZENg-jmFabyE0kLMBgIjBGXfL517QftX3DW7RVCzU/edit?usp=sharing
 

Figure 1 illustrates the concurrent cache-refresh operations in a 
nova-scheduler service. At one point (time 43s) there are as many as 23 
requests waiting for cache-refresh operations.

Figure 2 illustrates the time cost of every request in the same experiment. 
It shows that the cost increases with the level of concurrency, confirming a 
vicious circle: a request waits longer for the database when more requests 
are already waiting.

Figures 3 and 4 illustrate a worse case, in which a single cache-refresh 
operation reaches 60 seconds because of excessive concurrent cache-refresh 
operations.


-- 
Regards
Yingxin
