Hi all,
  We are working on scalability for KVM guests and have found a significant
issue in the Linux scheduler that can badly hurt guest performance and
scalability for some workloads running in a VM.  The current Linux scheduler
has a number of features intended to improve application performance; they
are defined in the file kvm.git/kernel/sched_features.h.  Most of them are
beneficial optimizations for overall system performance, but unfortunately
some of them can hurt VM performance and scalability in the KVM case.
  We know that if two or more vcpus of one guest are scheduled on the same
logical processor, the same CPU utilization may yield less useful work than
when they are scheduled on different logical processors, because the vcpus
contend for locks inside the guest OS.  We also know that, for KVM, a VM's
vcpus are executed by Qemu threads.  If the scheduler often pulls the vcpu
threads of one Qemu process onto the same logical processor, the guest's
performance may be hurt badly.  Our performance testing shows exactly this
bottleneck.  After analyzing the Linux scheduler, we found that it is indeed
caused by known scheduler features such as AFFINE_WAKEUPS and SYNC_WAKEUPS.
With these features on, the scheduler often tries to place the vcpu threads
of one guest on the same logical processor (LP) when vcpus are over-committed
and the logical processors are saturated.  Once the vcpu threads of one VM
are scheduled on the same LP, system performance drops dramatically with some
workloads (such as webbench running in a Windows guest).
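
One way to observe the placement described above is to check which logical
processor each thread of a Qemu process last ran on.  The sketch below is not
part of our patch; it assumes a Linux host with /proc mounted, and the helper
names are made up for illustration.  It lists every thread of the process,
not only the vcpu threads, so I/O and helper threads show up as well.

```python
import os
from collections import defaultdict


def parse_last_cpu(stat_line):
    """Return the CPU a thread last ran on, from a /proc/<pid>/task/<tid>/stat line.

    The comm field (field 2) may contain spaces and parentheses, so split
    after the last ')'.  What remains starts at field 3 (state), which puts
    field 39 (processor) at index 36.
    """
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[36])


def threads_by_cpu(pid):
    """Map each logical processor to the thread ids of `pid` last seen on it."""
    placement = defaultdict(list)
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        with open(f"{task_dir}/{tid}/stat") as f:
            placement[parse_last_cpu(f.read())].append(int(tid))
    return dict(placement)


def shared_cpus(placement):
    """Keep only the logical processors that host more than one thread."""
    return {cpu: tids for cpu, tids in placement.items() if len(tids) > 1}
```

Calling shared_cpus(threads_by_cpu(qemu_pid)) repeatedly while the workload
runs gives a rough picture of how often threads of one guest share an LP; the
value is only a snapshot of the last CPU used, not a scheduling trace.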
  To verify this finding, we wrote a simple patch (attached) that dynamically
switches off the two scheduler features mentioned above when the scheduler
knows the tasks being scheduled are vcpu threads, and we found that
whole-system performance and scalability improve a lot.  Certainly, this
patch is not suitable for upstream, but it can prompt us to think about how
to optimize the Linux scheduler, and we would like to start a discussion on
how to make the Linux scheduler more virtualization-friendly.  Besides, this
issue is probably not specific to KVM; it should be a common issue for
host-based VMs, and we hope to find an elegant solution that thoroughly closes
the performance and scalability gap compared with hypervisor-based VMs.
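
For anyone who wants to try a similar experiment without the patch, the
features can also be toggled through debugfs.  This is only a rough sketch:
it assumes debugfs is mounted and that the kernel exposes the file at
/sys/kernel/debug/sched_features (the location and the available feature
names differ between kernel versions), and it needs root.  Unlike the patch,
it switches the features off for every task on the host, not only for vcpu
threads.

```python
# Assumed debugfs location; it is not the same on every kernel version.
SCHED_FEATURES_PATH = "/sys/kernel/debug/sched_features"


def feature_token(name, enable):
    """Return the token the debugfs file expects: NAME enables, NO_NAME disables."""
    return name if enable else "NO_" + name


def set_features(names, enable, path=SCHED_FEATURES_PATH):
    """Write one token per open/write, and return the tokens written."""
    tokens = [feature_token(name, enable) for name in names]
    for token in tokens:
        with open(path, "w") as f:
            f.write(token)
    return tokens


def current_features(path=SCHED_FEATURES_PATH):
    """Return the feature tokens currently reported by the kernel."""
    with open(path) as f:
        return f.read().split()
```

For example, set_features(["AFFINE_WAKEUPS", "SYNC_WAKEUPS"], enable=False)
would write NO_AFFINE_WAKEUPS and NO_SYNC_WAKEUPS, provided the running
kernel still has features with those names.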
Any comments?
Thanks!
Xiantao

Attachment: sheduler_issue_fix.patch
Description: sheduler_issue_fix.patch


