Ramon van Handel wrote:
> No, you missed the point.
>
> First of all, what you're saying is not true: it won't be
> scaled down proportionally. The amount of CPU time a
> process gets depends on the system load, which may well
> vary (considerably) during the uptime of the virtual
> machine. And in your model, the amount of CPU time ==
> the amount of virtualised time.
>
> Secondly, I couldn't care less about a 500mhz machine
> behaving like a 300 one. Nothing wrong with that, you
> can't avoid it anyway. What I'm worried about is how
> timed I/O sources behave:
>
> - timed interrupts (PIT, RTC, LAPIC)
> - timed I/O ports (VGA VBL, RTC)
Nope, I didn't miss the point. Our devices are *emulated*,
and we derive the timing of the emulation from the amount
of guest execution time. If the host is running slow,
then the guest will slow down. But we will deliver the
interrupts at exactly the points where they should occur
from the guest's perspective, in terms of how much guest
code has executed between interrupts.
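To make the idea concrete, here is a minimal sketch (names are illustrative, not actual FreeMWare code) of delivering timer interrupts based on accumulated *guest* execution time, so the spacing in guest cycles stays exact no matter how unevenly the host schedules us:

```python
class VirtualTimer:
    """Fires an interrupt callback every `period_cycles` of guest time."""

    def __init__(self, period_cycles, raise_irq):
        self.period = period_cycles   # guest cycles between interrupts
        self.accum = 0                # guest cycles since last interrupt
        self.raise_irq = raise_irq    # device-emulation callback

    def account(self, guest_cycles_executed):
        """Called after each run of guest code with the cycles it executed."""
        self.accum += guest_cycles_executed
        while self.accum >= self.period:
            self.accum -= self.period
            self.raise_irq()          # exact guest-time boundary, host-rate independent

fired = []
t = VirtualTimer(1000, lambda: fired.append(True))
for chunk in (600, 600, 2300):        # uneven host scheduling slices
    t.account(chunk)
# 3500 guest cycles total -> exactly 3 interrupts, one per 1000 guest cycles
```

However slowly the host runs the three slices, the guest still observes one interrupt per 1000 cycles of its own execution, which is the point being made above.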
Essentially, the guest will not notice the skew at all
using the method I described. Perceptually, the user
will notice a slowdown of things that are real-time
based, such as video and sound, as you mention, where frames
of data are expected to be delivered on time boundaries in
the host OS's time reference, since that's where the user
lives. For normal graphics updates, this is not a concern.
Things will run a little slower - no big deal.
For cases where it matters, I have no problem with adding
an ability in the timing facility to attempt tracking the
host OS's time reference, and, as I mentioned, there should
be an aggression factor, so the user can weight how important
it is to track time, trading emulation accuracy for better
delivery time of video and sound, for instance. I want
the ability to tweak the factor to 0, which *is* the
accurate way to do this.
The fundamental issue is this: the guest expects that, on
the whole, it runs 100% of the time. Under virtualization,
it runs only a fraction 1/N of the time. When there's not
much going on in the host, you get near 1/1, which is fine,
and the time-stretching stuff would work fine. When there
are 10 things going on, you get 1/10th of the time, and in
order to stretch time you have to nail the guest OS with 10
times the number of interrupts even though it is processing
the same amount of other code. This
is worsened by the overhead incurred by the virtualization,
and of course the overhead of an interrupt itself in the
guest OS, especially so if we are emulating the interrupt
in the monitor.
Certain guest OSes will likely handle being stretched better
than others. Linux will probably respond one heck of a lot
better than Windows to this. But if you are running 5 ray
tracing programs on your machine, and it is also the server
for Slashdot, you had better have some intelligence in the
stretching that can give up on synchronizing one or more
past host OS timeframes and then begin synchronizing on
future ones. A user would see this kind of technique manifest
as a window of slowed video frames, for example, followed by
normal display of the next ones, if there were, say, a sharp
burst of host activity that subsided thereafter.
What we should not do is insist on keeping up if we
continuously fall behind.
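The "give up on past timeframes" policy could be sketched like this (a hedged illustration with made-up names and a made-up lag threshold, not a committed design): if we trail the host wall-clock by more than a few frame periods, abandon the missed frames and resynchronize on the first future boundary instead of bursting interrupts to catch up.

```python
def resync_deadline(next_deadline, now, frame_period, max_lag_frames=2):
    """Return the deadline to aim for, skipping hopelessly old ones."""
    lag = now - next_deadline
    if lag > max_lag_frames * frame_period:
        # Give up on every fully missed frame; aim at the first
        # boundary that still lies in the future.
        missed = (lag // frame_period) + 1
        next_deadline += missed * frame_period
    return next_deadline

# A burst of host load has left us 50 time units behind a 10-unit
# frame period: skip the 5 missed frames and resync at t=160.
d = resync_deadline(next_deadline=100, now=150, frame_period=10)
```

Within the tolerance (lag of 15 units or less here) we still try to catch up; beyond it we cut our losses, which is exactly the window-of-slowdown-then-recovery behavior described above.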
> In overload situations, there's no helping performance
> anyway. Of course we need an overload check; but such
> a situation is not an optimal situation to be running
> FreeMWare in anyway (i.e., even without skewing I'd
> not expect it to perform very well that way).
True enough. Of course, users will push the edge anyway,
so we should still do what we can to make things work.
> There's no work about timing in there.
>
> I'll have to dig into the library and look up all those articles
> that are referenced in the DISCO article... (some nice review
> articles in there, too.) I'll do that next time I'm at school.
OK.
If you don't find anything, it's really not important anyway.
I pretty much know how to do it already. If you take time
reference samples in the host kernel module each time before
you run the monitor/guest, and the monitor takes them for each
block of time bounded by exceptions generated in the guest and
before it returns to the host, then you have time deltas for
each of the host and guest, say M and N.
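A toy version of that sampling scheme (illustrative names only, and a fake host clock standing in for the kernel module's time source) might look like:

```python
class TimeSampler:
    """Accumulates paired deltas: M (host time) and N (guest time)."""

    def __init__(self, host_clock):
        self.host_clock = host_clock
        self.M = 0   # host time consumed running monitor + guest
        self.N = 0   # guest virtual time actually delivered

    def run_slice(self, run_guest):
        t0 = self.host_clock()
        guest_time = run_guest()          # guest time executed this slice
        self.M += self.host_clock() - t0  # host-side delta around the run
        self.N += guest_time              # guest-side delta for the same run

# Fake host clock advancing 3 units per sample, while the guest slice
# executes only 2 units of guest time: the guest is running "slow",
# so M/N = 1.5.
ticks = iter(range(0, 100, 3))
s = TimeSampler(lambda: next(ticks))
s.run_slice(lambda: 2)
```

The paired M and N per slice are the raw material for the acceleration ratio discussed next.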
The ratio M/N gives you an idea of how much you need to
accelerate the timing facilities, which distribute time by
way of callbacks to the device emulation. We also factor in
the aggression factor given by the user's preference. If we
want to attempt wall-clock synchronization, then we can get
feedback from the CMOS RTC clock emulation, finding out how
far behind we are, and factor this into the acceleration
(stretching). Note that the RTC clock can arbitrarily be
stopped, started, or reprogrammed. When it stops, take its
weighting out of the picture. When it starts again, put it
back in. When it is reprogrammed, maintain an offset between
its new value and the host OS's clock, but everything else
stays the same.
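Putting the pieces together, one plausible blending of the M/N ratio, the RTC wall-clock feedback, and the aggression factor could look like the following. The function name and the particular blending formula are my own illustration, not a settled design; the key property is that aggression 0 yields no stretching at all, the accurate mode described above.

```python
def stretch_factor(m, n, behind=0.0, aggression=0.0):
    """
    m: host time consumed (M), n: guest time delivered (N), same units.
    behind: wall-clock lag reported by the CMOS RTC emulation.
    aggression: 0.0 = pure guest-relative timing (accurate mode),
                1.0 = fully chase the host OS's time reference.
    Returns the multiplier to apply to the timing facility's callbacks.
    """
    if n <= 0:
        return 1.0
    base = m / n                 # host time burned per unit of guest time
    catch_up = 1.0 + behind / n  # extra push to recover wall-clock lag
    # At aggression 0 no stretching is applied; at 1 we apply the full
    # load-and-lag correction.
    return 1.0 + aggression * (base * catch_up - 1.0)

# aggression 0: factor stays 1.0 regardless of host load.
# aggression 1, host twice as loaded (M=2, N=1), no lag: factor 2.0.
```

A stopped RTC would simply contribute `behind = 0.0` here (its weighting taken out of the picture), and a reprogrammed one would feed in a lag computed against its maintained offset from the host clock.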
-Kevin