On 01/04/2010 02:01 PM, Paolo Bonzini wrote:
On 01/04/2010 09:24 PM, Anthony Liguori wrote:
I'm not a huge fan of this for a couple reasons. The first is that
it introduces a subtle semantic change. Previously, timers always
ran before bottom halves whereas after this change, timers may run
after some bottoms halves but before others.
I see what you mean, and you are right: qemu_bh_new adds a
bottom half at the beginning of the queue, so it's pretty much
guaranteed that a ptimer's bottom half will run _before_ the alarm
timer's.
There are three possible fixes:
1) make async.c use a tail queue. Fixes the bug, but it is too clever
IMHO.
2) in tcg_exec, where there is
if (timer_alarm_pending) {
timer_alarm_pending = 0;
break;
}
instead check if any bottom half is scheduled. With this change,
after the timers run, if the ptimer's bottom half hadn't run TCG would
not execute code, qemu_bh_calculate_timeout would make main_loop_wait
nonblocking, and the ptimer's bottom half would execute right away.
BTW after my series the above check will test whether the timer bottom
half is scheduled, so in some sense this could be considered a bugfix
that would be placed _very early_ in the series or could even go in
independently.
3) Both of the above. 2 would provide the fix and 1 would provide a
performance improvement by avoiding the useless looping.
But more importantly, I think timer dispatch needs to be part of the
select loop. malc has a git tree that replaces host alarm timers
with select() timeouts. This has a lot of really nice properties
like it eliminates the need for signals and EINTR handling. A move
like this would likely make this more difficult.
Not necessarily, or at least, splitting qemu-timer.c may make it
marginally more difficult but not having a bottom half for timers.
When I think of a main loop, I think of something that provides the
following three services
1) fd based events
2) time based events
3) an idle callback
The problem we have is that bottom halves aren't quite idle callbacks.
They have some really weird properties about the guarantees of when
they're executed. At any rate, you really cannot express a time based
event as an idle callback because you need to setup the select() timeout
based on the next deadline and then dispatch timer event based on
selects return. idle functions have none of this ability.
With qemu-timer.c split you'd need something like
if (rc == 0) host_alarm_handler ();
after the select loop. I suppose you could basically revert this
patch and move timer handling into host_alarm_handler, but the code
should work independent of this patch. This patch (modulo your other
objection) just adds a level of indirection but doesn't change the
overall structure of the main loop.
Yeah, I'm not at all opposed to the qemu-timer.c split. What I would
suggest is just to continue calling timers explicitly instead of trying
to make them bottoms halves. I've played with this in the past and the
regressions are really nasty to track down. If we're going to make such
a change, I'd rather spend that regression-busting effort on mtloop or
completely changing the bottom half semantics to be idle callbacks.
Regards,
Anthony Liguori
Paolo