Hi Jan,

On Fri, Nov 21, 2008 at 08:54:56AM +0100, Jan Kiszka wrote:
> Eduardo Habkost wrote:
> > On Thu, Nov 20, 2008 at 12:22:53PM -0200, Eduardo Habkost wrote:
> >> Hi,
> >>
> >> When using a kvm.git kernel as host, I am getting guest boot failures
> >> when booting Fedora Rawhide kernel (2.6.27.5-117.fc10.x86_64). Guest
> >> stops booting at:
> >>
> >> ENABLING IO-APIC IRQs
> >> ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
> >> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> >> ...trying to set up timer (IRQ0) through the 8259A ...
> >> ..... (found apic 0 pin 0) ...
> >> ....... failed.
> >> ...trying to set up timer as Virtual Wire IRQ...
> >> ..... failed.
> >> ...trying to set up timer as ExtINT IRQ...
> > 
> > I've just found out this problem happens because the guest has HZ=1000
> > and the host had HZ=250 and no CONFIG_HIGH_RES_TIMERS.
> > 
> > With this setup, the host is not managing to inject enough timer
> > interrupts during the mdelay() loop on timer_irq_works().
> > 
> 
> Interesting, and plausible.
> 
> My observation so far is a sporadic test failure, often correlating with
> some raised host OS load. I'm running a high-res kernel, but that cannot
> prevent that this only 10 ticks long loop of the guest may obtain too
> few CPU cycles to handle enough of them once in a while (IIRC, it needs
> 4 out of the 10 ticks to declare the timer routing functional).

Using in-kernel PIT?

This is a potential problem which can be worked around by disabling the
whole thing, either via no_timer_check or a paravirt equivalent (Glauber?),
but for the non-paravirt case it seems it's not the culprit. Possible
failure scenarios:

1) lpj miscalibration (SMP guests), which kvm-clock deals with.

2) proper lpj calibration, so m/udelay behave as expected, but not
enough interrupts can be injected due to CPU starvation as you mention.

On my testbox, with each pCPU running a cycle hog at nice -10, the first
timer_irq_works call (via IOAPIC) does not fail, even though the guest is
truly starved. Tested on hosts with CONFIG_PREEMPT and with
CONFIG_PREEMPT_VOLUNTARY.

Moreover, the code first attempts delivery via the IOAPIC, then the 8259A,
then virtual wire. Reports show all three failing.

3) Failure to inject the interrupt will break the in-kernel PIT ack
logic. The VMX NMI/IRQ race you fixed can certainly cause this. Can you
reproduce it with the fix (and CONFIG_KVM_CLOCK=y)?

Any other possibilities? 
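
To put rough numbers on the HZ mismatch Eduardo reported, here is a
back-of-the-envelope model in plain C. It is only a sketch, not the kernel
code: ticks_seen() and timer_irq_works_model() are hypothetical helpers, the
"more than 4 ticks" threshold is my reading of the check Jan describes, and
the model assumes the host injects at most one guest tick per host tick (no
high-res timers, no catch-up of missed ticks).

```c
#include <assert.h>

/* Assumed pass condition: the guest must see more than this many ticks. */
#define MIN_TICKS_EXCLUSIVE 4u

/*
 * Upper bound on the guest timer interrupts delivered during the guest's
 * 10-tick mdelay() window. ASSUMPTION: one injection per host tick at most.
 */
static unsigned int ticks_seen(unsigned int guest_hz, unsigned int host_hz)
{
    assert(guest_hz > 0 && host_hz > 0);

    /* Length of the guest's wait: 10 guest ticks, in microseconds. */
    unsigned int window_us = (10u * 1000000u) / guest_hz;
    /* Spacing of host timer ticks, in microseconds. */
    unsigned int host_period_us = 1000000u / host_hz;
    /* Whole host ticks that fit into the window. */
    unsigned int host_ticks = window_us / host_period_us;

    /* The guest programmed only 10 ticks for this window. */
    return host_ticks < 10u ? host_ticks : 10u;
}

/* Returns 1 if the modelled check would pass, 0 if it would fail. */
static int timer_irq_works_model(unsigned int guest_hz, unsigned int host_hz)
{
    return ticks_seen(guest_hz, host_hz) > MIN_TICKS_EXCLUSIVE;
}
```

With guest HZ=1000 and host HZ=250 the window is 10 ms and the host tick
period is 4 ms, so under these assumptions only 2 ticks fit and the modelled
check fails. With matching HZ values all 10 ticks fit.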

> Maybe Gleb's anti-coalesce patches for the PIC can also deal with your
> timer resolution conflict. At least worth a try...
> 
> Jan
> 

