On Mon, 2019-09-09 at 18:00 +0200, Stefan Hajnoczi wrote:

> Is this really necessary?  

Yes* :)

> Can the simulation interpose between the
> call/kick eventfds in order to control when events happen?
> 
>   CPU --cpu_kickfd--> Simulation --vhost_kickfd--> vhost-user device
> 
> and:
> 
>   vhost-user device --vhost_callfd--> Simulation -->cpu_callfd-> CPU
> 
> The simulation controls when the CPU's kick is seen by the device and
> also when the call is seen by the CPU.

The point isn't to let the simulation know about anything that happens.
The CPU and the device are *part* of the simulation.

> I don't understand why new vhost-user protocol messages are required.

I guess I haven't really explained it well then :-)

So let's say, WLOG, I have a simulated network and a bunch of Linux
machines that are running on simulation time. Today I can do that only
with user-mode Linux, but we'll see.

Now in order to run everything on simulation time, *everything* that
happens in the simulation needs to request a simulation calendar entry,
and is then told when that entry is scheduled.
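
To make the calendar idea concrete, here is a minimal sketch in Python.
It is only an illustration: the Calendar class and its request()/run()
methods are hypothetical names, not part of any existing simulation
framework or of vhost-user. Participants request an entry, and time only
moves when the calendar hands out the next one.

```python
import heapq
import itertools


class Calendar:
    """Hypothetical simulation calendar (illustration only)."""

    def __init__(self):
        self.now = 0
        self._entries = []
        self._seq = itertools.count()  # tie-breaker for equal times

    def request(self, delay, callback):
        # A participant asks for an entry 'delay' time units from now.
        heapq.heappush(self._entries,
                       (self.now + delay, next(self._seq), callback))

    def run(self):
        # Serialize all processing: one entry at a time, earliest first.
        while self._entries:
            when, _, callback = heapq.heappop(self._entries)
            self.now = when  # time skips forward, it does not elapse
            callback()


log = []
cal = Calendar()
cal.request(5, lambda: log.append(("device", cal.now)))
cal.request(2, lambda: log.append(("cpu", cal.now)))
cal.run()
print(log)
```

The entries run in calendar order regardless of the order in which they
were requested, and nothing runs outside of an entry.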

So think, for example, you have

CPU ---[kick]---> device

Now, this is essentially triggering an interrupt in the device. However,
the simulation code has to ensure that the simulated device's interrupt
handling only happens at a scheduler entry. Fundamentally, the
simulation serializes all processing, contrary to what you want in a
real system.

Now, this means that the CPU (that's part of the simulation) has to
*wait* for the device to add an entry to the simulation calendar in
response to the kick... That means that it really has to look like

CPU               device                   calendar
     ---[kick]-->
                         ---[add entry]-->
                         <---[return]-----
     <-[return]--

so that the CPU won't get to idle or some other processing where it asks
the simulation calendar for its own next entry...
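
The ordering problem can be sketched as follows. This is a toy model
under stated assumptions, not the proposed protocol change itself:
Calendar, Device and cpu_kick_then_idle() are made-up names, and the
delays of 1 and 10 time units are arbitrary. The "asynchronous" branch
stands in for an eventfd-style kick that the device only notices after
the CPU has already gone back to the calendar.

```python
import heapq


class Calendar:
    """Hypothetical simulation calendar (illustration only)."""

    def __init__(self):
        self.now = 0
        self.entries = []

    def add_entry(self, when, owner):
        heapq.heappush(self.entries, (when, owner))

    def next_entry(self):
        # Simulation time jumps straight to the earliest entry.
        self.now, owner = heapq.heappop(self.entries)
        return self.now, owner


class Device:
    def __init__(self, calendar):
        self.calendar = calendar

    def kick(self):
        # Interrupt handling must happen at a calendar entry, so the
        # device only schedules it here. Returning acts as the ack.
        self.calendar.add_entry(self.calendar.now + 1, "device")


def cpu_kick_then_idle(synchronous):
    calendar = Calendar()
    device = Device(calendar)
    undelivered = []
    if synchronous:
        device.kick()  # CPU waits until the device has its entry
    else:
        undelivered.append(device.kick)  # device reacts "later"
    # The CPU goes idle and asks the calendar for its own next entry.
    calendar.add_entry(calendar.now + 10, "cpu")
    first = calendar.next_entry()
    for deliver in undelivered:
        deliver()  # too late: time has already been skipped forward
    return first


print(cpu_kick_then_idle(synchronous=True))   # (1, 'device')
print(cpu_kick_then_idle(synchronous=False))  # (10, 'cpu')
```

In the synchronous case the device's entry is on the calendar before the
CPU idles, so it is scheduled first. In the asynchronous case the
calendar skips to the CPU's entry at time 10 and the device's interrupt
handling is ordered after it.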

Yes, like I said before, I realize that all of this is completely
opposed to what you want in a real system, but then in a real system you
also have real timeouts, and don't just skip time forward when the
simulation calendar says so ...


* Now, of course I lied, this is software after all. The *concept* is
necessary, but it's not strictly necessary to do this in-band in the
vhost-user protocol.
We could do an out-of-band simulation socket for the kick signal and
just pretend we're using polling mode as far as the vhost-user protocol
is concerned, but it'd probably be harder to implement, and we couldn't
do it in a way that would let us contribute anything upstream.
There are quite a few papers proposing such simulation systems, but the
only one I found that published its code is VMSimInt, and even that is
just some hacks on top of qemu 1.6.0...

johannes

