Gregory Haskins wrote:
>>
>>>
>>> vcpu->cpu = -1;
>>> vcpu->kvm = kvm;
>>> @@ -366,13 +370,20 @@ static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
>>>
>>> static void kvm_free_vcpu(struct kvm_vcpu *vcpu)
>>> {
>>> + unsigned long irqsave;
>>> +
>>> if (!vcpu->vmcs)
>>> return;
>>>
>>> vcpu_load(vcpu);
>>> kvm_mmu_destroy(vcpu);
>>> vcpu_put(vcpu);
>>> +
>>> + spin_lock_irqsave(&vcpu->irq.lock, irqsave);
>>> + vcpu->irq.task = NULL;
>>> + spin_unlock_irqrestore(&vcpu->irq.lock, irqsave);
>>>
>>>
>> Can irq.task be non-NULL here at all? Also, we only free vcpus when we
>> destroy the vm, and paravirt drivers would hopefully hold a ref to the
>> vm, so there's nobody to race against here.
>>
>
> I am perhaps being a bit overzealous here. What I found in practice is that
> the LVTT can screw things up on shutdown, so I was being pretty conservative
> on the synchronization here.
>
>
That may point to a different sync problem. All pending timers
ought to have been canceled before we reach here. Please check to make
sure this isn't papering over another problem.
>>> kvm_irqdevice_destructor(&vcpu->irq.dev);
>>>
>>> @@ -1868,6 +1880,10 @@ static int kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu,
>>>                                struct kvm_run *kvm_run)
>>>
>>> kvm_arch_ops->decache_regs(vcpu);
>>> }
>>>
>>> + spin_lock_irqsave(&vcpu->irq.lock, irqsaved);
>>> + vcpu->irq.task = current;
>>> + spin_unlock_irqrestore(&vcpu->irq.lock, irqsaved);
>>> +
>>>
>>>
>> Just assignment + __smp_wmb().
>>
>
> (This comment applies to all of the subsequent reviews where memory barriers
> are recommended instead of locks:)
>
> I can't quite wrap my head around whether all these critical sections are
> correct with just a barrier instead of a full-blown lock. I would prefer to
> be conservative and leave them as locks for now. Someone with better insight
> could make a second pass and optimize the locks into barriers where
> appropriate. I am just uncomfortable doing it without feeling confident that
> I am not causing races. If you insist on making the changes before the code
> is accepted, ok. Just note that I am not comfortable ;)
>
>
I approach it from the other direction: to me, a locked assignment says
that something is fundamentally wrong. Usually anything under a lock is
a read-modify-write operation, otherwise the writes just stomp on each
other.
This is the source of all the race-after-vmexit-irq-check comments
you've been getting from me. No matter how many times you explain it,
every time I see it the automatic race alarm pops up.
>>> +/*
>>> * This function will be invoked whenever the vcpu->irq.dev raises its
>>> * INTR line
>>> */
>>> @@ -2318,10 +2348,52 @@ static void kvm_vcpu_intr(struct kvm_irqsink *this,
>>> {
>>> struct kvm_vcpu *vcpu = (struct kvm_vcpu *)this->private;
>>> unsigned long flags;
>>> + int direct_ipi = -1;
>>>
>>> spin_lock_irqsave(&vcpu->irq.lock, flags);
>>> - __set_bit(pin, &vcpu->irq.pending);
>>> +
>>> + if (!test_bit(pin, &vcpu->irq.pending)) {
>>> + /*
>>> + * Record the change..
>>> + */
>>> + __set_bit(pin, &vcpu->irq.pending);
>>> +
>>> + /*
>>> + * then wake up the vcpu (if necessary)
>>> + */
>>> + if (vcpu->irq.task && (vcpu->irq.task != current)) {
>>> + if (vcpu->irq.guest_mode) {
>>> + /*
>>> + * If we are in guest mode, we can optimize
>>> + * the IPI by executing a function directly
>>> + * on the owning processor.
>>> + */
>>> + direct_ipi = task_cpu(vcpu->irq.task);
>>> + BUG_ON(direct_ipi == smp_processor_id());
>>> + } else
>>> + /*
>>> + * otherwise, we must assume that we could be
>>> + * blocked anywhere, including userspace. Send
>>> + * a signal to give everyone a chance to get
>>> + * notification
>>> + */
>>> + send_sig(vcpu->irq.signo, vcpu->irq.task, 0);
>>> + }
>>> + }
>>> +
>>> spin_unlock_irqrestore(&vcpu->irq.lock, flags);
>>> +
>>> + if (direct_ipi != -1) {
>>> + /*
>>> + * Not sure if disabling preemption is needed.
>>> + * The kick_process() code does this so I copied it
>>> + */
>>> + preempt_disable();
>>>
[preemption is disabled here anyway]
>>> + smp_call_function_single(direct_ipi,
>>> + kvm_vcpu_guest_intr,
>>> + vcpu, 0, 0);
>>> + preempt_enable();
>>> + }
>>>
>>>
>> I see why you must issue the IPI outside the spin_lock_irqsave(), but
>> aren't you now opening a race? vcpu enters guest mode, irq on other
>> cpu, irq sets direct_ipi to wakeup guest, releases lock, vcpu exits to
>> userspace (or migrates to another cpu), ipi is issued but nobody cares.
>>
>
> It's subtle, but I think it's OK. The race is actually against the setting
> of irq.pending. This *must* happen inside the lock or the guest could exit
> and miss the interrupt. Once the pending bit is set, however, the guest can
> be woken up in any old fashion and the behavior should be correct. If the
> guest has already exited before the IPI is issued, it's effectively a no-op
> (well, really it's just a wasted IPI/reschedule event, but no harm is done).
> Does this make sense? Did I miss something else?
>
No, you are correct wrt the vcpu migrating to another cpu.
What about vs. exit to userspace where we may sleep?
--
error compiling committee.c: too many arguments to function
_______________________________________________
kvm-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kvm-devel