Re: [kvm-devel] kernel device reset support

Avi Kivity Thu, 11 Oct 2007 23:18:43 -0700

Dong, Eddie wrote:
>>> Current VP wake up logic thru INIT/SIPI doesn't support this when
>>> irqchip in kernel. 
>>>
>>>
>>>       
>> Doesn't this code imply that waiting for SIPI is supported?
>>     
>
> It is supported to wake up VCPU in kernel, but can't wake up the VCPU
> in user level since irqchip_in_kernel is TRUE here. vcpu->mp_state
> doesn't export to user level.
>
>


We never sleep in user level if irqchip_in_kernel().  So the thread will
eventually go back to kernel mode.


>> You can put a goto to the top of the loop to redo the mmu reload.  In
>> any case you need to do that because you don't want to execute
>> the reset
>> code with interrupts and preemption disabled.
>>     
>
> A goto cross function? It is too aggresive and bad code style IMO.
> The vcpu->request check is in __vcpu_run, while entering block 
> state is in its parent function kvm_vcpu_ioctl_run.
>
>   

goto the label 'again' in __vcpu_run(), which has the call to
kvm_mmu_reload().

> But if you want, we can return a special value, 
> say REQUEST_INTERNAL_LOOP, to 
> kvm_vcpu_ioctl_run and let kvm_vcpu_ioctl_run use sepcial logic
> to do goto within function if it see the special return value 
> REQUEST_INTERNAL_LOOP. But is it cleaner?
>
> Also we will add more kernel to user EXIT reason, such as RESET request
> from kernel sensored guest tripple fault etc.
>
>   

There is already a triple fault exit code.

>>> The VCPU may be executing in kernel still, which may modify kernel
>>> device state. E.g. A VCPU may be doing PIO emulating.
>>>
>>>
>>>       
>> In that case we will wait when taking kvm->lock.
>>     
>
> Lock doesn't help. Lock can only avoid no 2+ modifcation
> in same time. But what we care if all other VCPUs can't
> do modification after BSP do device reset.
> It is different semantics.
>
> Maybe you are still arguing it is the AP who do RESET ops.
> Let us go to next discussion first.
>
>   

We first halt all vcpus, then take the lock.  So:

- other processors won't start after the device reset because they are
halted
- we won't do the reset concurrently with other processors because of
the lock

>>> If BSP reset the kernel devices earlier than the VCPU modify the
>>> device state, we are in trouble.
>>>
>>> No, VCPU0 (BSP) is current VCPU (though you don't have the current
>>> vcpu parameter explicitly) like mentioned in previous mail and
>>> as pre-requirement of user level change. Please refer my abswer
>>> above of this mail. 
>>>
>>>
>>>       
>> We can't rely on user space not to cause host kernel corruption.
>>     
>
> ???
>
> Even an AP trigger RESET, it just sets a reset_request flag in user
> level.
> It is another VCPU who will execute RESET operation.
>
> It seems the argument is who should do the RESET operation, 
> say RST_CPU. BSP only or AP too. For me, since after RESET
> only BSP can execute, and the thread executing 
> qemu_system_reset will continously execute 
> (after RESET kernel) per current Qemu code, so what we can do is:
>
>       1: RST_CPU=BSP. Then BSP does qemu_system_reset, or
>       2: RST_CPU = AP, say RAP, does qemu_system_reset, user level
> then 
> need to block RAP after qemu_system_reset and wake up BSP to take over.
>       A point here we can't blcok RAP in case 2 at kernel RESET time,
> since
> kernel RESET may be not the last step of qemu_system_reset. It may go 
> to kernel again.
>
>       If we go with #1, just 1 line change as in my previous mail.
>       If we go with #2, we have to add a new ABI for the AP to enter
> kernel 
> wait for INIT/SIPI/SIPI state, otherwise normal INIT/SIPI/SIPI couldn't
> wake it up.
>
>       I see much complicate in #2 while #1 has same functionality but
> simple.
>
>   

My view is:
- vcpu threads never sleep in userspace.  they will always eventually
end up in the kernel so we can stop or restart them there
- reset is a platform API so it can't be dependent on which vcpu thread
executes it (if any; it may be executed from an unrelated thread,
remember we plan to separate the qemu signal handling code into a
separate thread)
- we already have a way to send messages to other vcpus

So it seems to me everything is in place to make it fairly simple.

I'll try writing a patch that does what I mean and post it.  Either I'll
convince you that in-kernel is simpler, or I'll convince myself that it
is harder.


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Re: [kvm-devel] kernel device reset support

Reply via email to