Re: [kvm-devel] [PATCH] interrupt preemption support

Gregory Haskins Fri, 16 Mar 2007 04:17:38 -0800

>>> On Fri, Mar 16, 2007 at  3:03 AM, in message <[EMAIL PROTECTED]>,
Avi Kivity <[EMAIL PROTECTED]> wrote: 
> Gregory Haskins wrote:
>> This patch adds baseline support for interrupt preemption.  This lets one 
> context (virtual SMP vcpu, PV driver, etc) preempt another vcpu which is 
> currently in guest context when an interrupt is posted.  The patch consists 
> of the following:
>>
>> 1) Re- worked the vcpu mutex with a new "MPL" construct.  This allows 
>> updates 
> to the vcpu structure even if the vcpu is currently executing in guest mode.  
> The previous code held a mutex during guest execution.
>>
>>   
> 
> Whoa there.  kvm can't just add new locking constructs to the kernel.


With all due respect, why not? ;)  Of course it can.  We arent really adding 
anything exposed to the kernel outside of KVM or even adding a new one from 
scratch.  Its simply derivative from the existing mutex/semphore that we 
already use.   As a philosophical point, surely not ever construct we utilize 
in KVM has to be trickled down from the upstream kernel, or KVM itself would 
not exist ;)

> It has to be added to the kernel _first_, you need to justify it with 
> more than just kvm as a user, show correctness, performance, and 
> scalability.
> 
> Once it's accepted, kvm can use it.

I understand if what I said may not change your argument against proving the 
construct against a larger audience, but I thought I would point it out that 
its really just a struct+mutex in case you only read the comment and not the 
code.  I think it just sounds worse than it is.

> 
> Formal issues aside, why is this different from taking nested locks?  

This essentially *is* a set of nested locks, except that it also automatically 
protects against deadlock.  Alternatively I could have written the code to 
simply have replaced the original vcpu->mutex with something like 
vcpu->vcms_mutex + vcpu->vcpu_mutex, and then replaced all the calls to 
mutex_lock(&vcpu->mutex) with a macro that grabs both (in proper order).   
However, I felt as though my solution was better for two reasons:

1) Its impossible to bungle the lock-ordering to induce deadlock, regardless of 
nesting depth
2) Its non-contended acquisition overhead is O(1) regardless of nesting depth, 
whereas true nested locks are O(n) (where n = depth)

In my experience, nested locks work ok if there is only one developer on the 
project. When you get project newbies (like me w.r.t. KVM ;) ) hacking code in 
after the fact, bad things can happen pretty easily.  I thought this might 
perhaps be a good way to prevent that.  If it turns out you guys dont like it, 
its no big deal to rip it out and go the more traditional route.  I wanted your 
feedback.  Thats why I submitted this now before I had all the issues worked 
out. ;)

> The paravirt network code congealing in Dor's repo has a spinlock 
> protecting the interrupt bits.  The main execution path takes both the 
> vcpu mutex and the irq lock (when necessary), other paths take just the 
> irq lock.  This has the added advantage of not requiring a mutex to 
> inject an interrupt, which is often necessary from (host) irq context.
> 

Keep in mind that my primary intention was to fix the 
kvm_vcpu_(ioctl)_interrupt() function to use finer grained locking than the 
previous vcpu->mutex that it used to use.  Because the old lock was a mutex, 
the new nested lock was also based on a mutex.  So its true that its not 
interrupt friendly, but it wasn't to begin with.  Whether it actually needs to 
be is something that I do not yet know (see my comments below).

> 
>> 2) Exposed the previously static kvm_vcpu_ioctl_interrupt() as 
> kvm_vcpu_interrupt so that other mechanisms (PV drivers) can issue interrupts 
> just as the ioctl interface can
>>   
> 
> It's not enough to issue an interrupt, there is a whole dance involved 
> in the guest side to ack it.  You need to go through the apic, which is 
> currently emulated in userspace.  We may push it to the kernel later.

I added this interface as a stab at accommodating your request for PV driver 
support without fully understanding your requirements.  Based on your comments, 
I assume that the code that invokes the INTERRUPT ioctl must have already 
updated the APIC model?  I will revert this change to make it a static ioctl 
function again until I can understand how the PV drivers can update the apic 
(userspace or kernel, whichever it ends up being).

> 
>>
>> Index: kvm- 12/kernel/kvm.h
>>   
> 
> Please base things off trunk.  For kernel code, off the git repository, 
> not the bundled kernel module (which is mangled by the release process 
> in order to accommodate older kernels).

Note that it actually is based off of (approx) trunk, but my quilt series 
starts with the kvm-12 tarball and thus the directory name.  I think my series 
was patched up to r4524.  You comment about git vs tarball is legitimate.  I 
will have to refactor my workflow to utilize git instead of the tarball somehow.


> 
>>  
>> +typedef enum
>> +{
>> +    KVM_MPL_FREE,
>> +    KVM_MPL_VMCS,
>> +    KVM_MPL_VCPU,
>> +    KVM_MPL_FULL = KVM_MPL_VCPU /* Must be equal to the last entry */ 
>> +}kvm_mpl_locktypes_t;
>>   
> 
> Either you or your mailer mangle whitespace horribly.

Its probably a combination of both.  Do you guys just use the standard 
"linux/linus" settings for emacs (or equivalent).  Sorry about that.  I 
recently rebuilt my devel machine and don't have my .emacs brought over yet.  I 
will fix this.

-Greg


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
kvm-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Re: [kvm-devel] [PATCH] interrupt preemption support

Reply via email to