On Sun, 21 Apr 2013, Borislav Petkov wrote:

> + tglx.
> 
> On Sun, Apr 21, 2013 at 01:38:33AM +0400, vita...@yourcmc.ru wrote:
> > >>Stack trace picture is here:
> > >>http://vmx.yourcmc.ru/var/pics/IMG_20130306_141045.jpg
> > >
> > >Vitaliy reported that his system crashes when suspending to disk.
> > >This was a regression from 3.2 to 3.7, and remains in 3.8.  Some
> > >details of this system are in the bug log at
> > ><http://bugs.debian.org/700333>.
> > >
> > >The photo shows a BUG in hrtimer_interrupt() after making the
> > >hibernation image and while resuming the non-boot CPUs.  The HPET
> > >interrupt handler was called immediately after it was registered
> > >for CPU 2 (?), before the corresponding clock_event_device was
> > >registered.
> > >
> > >Seems like an obvious race condition, but then shouldn't the HPET
> > >have been stopped while the CPU was previously offlined?  And it's
> > >strange that this system apparently hits the race quite reliably.
> > 
> > Anyone?

So what happens is that the HPET seems to have an interrupt pending,
and it fires immediately when the handler is installed. The core code
does not remove the hpet->event_handler, so the interrupt calls into
hrtimer_interrupt(), where it hits the BUG and dies.

With the patch below, the box should survive and we should see a 

"Spurious HPET timer interrupt on HPET timer..." entry in dmesg.

That's a first workaround to confirm my theory. I'll look into the
HPET code to see how we can avoid that altogether.
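
To make the theory concrete, here is a minimal user-space sketch of the
mechanism. It is not kernel code: every fake_* name is hypothetical, and
it assumes the HPET interrupt handler checks for a NULL event_handler
before calling it, which is what the expected dmesg line suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily simplified stand-in for struct clock_event_device. */
struct fake_clock_event_device {
	void (*event_handler)(struct fake_clock_event_device *dev);
};

static int handled_events;	/* interrupts that reached the event handler */
static int spurious_events;	/* interrupts dropped because no handler was set */

/* Stand-in for hrtimer_interrupt(); here it only counts the call. */
static void fake_event_handler(struct fake_clock_event_device *dev)
{
	(void)dev;
	handled_events++;
}

/* Models the assumed guard in the HPET interrupt handler. */
static void fake_hpet_interrupt(struct fake_clock_event_device *dev, int timer)
{
	if (!dev->event_handler) {
		printf("Spurious HPET timer interrupt on HPET timer %d\n", timer);
		spurious_events++;
		return;
	}
	dev->event_handler(dev);
}

/* Models only the line the patch adds to tick_shutdown(). */
static void fake_tick_shutdown(struct fake_clock_event_device *dev)
{
	dev->event_handler = NULL;
}

/*
 * One normal tick, then CPU offline, then a pending interrupt that
 * fires as soon as the handler is installed again. Returns how many
 * interrupts were dropped as spurious.
 */
static int fake_offline_then_irq(void)
{
	struct fake_clock_event_device dev = { .event_handler = fake_event_handler };
	int before = spurious_events;

	fake_hpet_interrupt(&dev, 2);	/* handler set: event is delivered */
	fake_tick_shutdown(&dev);	/* patched shutdown clears the handler */
	fake_hpet_interrupt(&dev, 2);	/* stale interrupt: dropped, no crash */

	return spurious_events - before;
}
```

Calling fake_offline_then_irq() from a main() should report one spurious
interrupt. Without the NULL assignment in fake_tick_shutdown(), the
second interrupt would reach the stale handler instead, which
corresponds to the path that ends in the BUG.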

Thanks,

        tglx

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index b1600a6..0f0ce6e 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -323,6 +323,7 @@ static void tick_shutdown(unsigned int *cpup)
                 */
                dev->mode = CLOCK_EVT_MODE_UNUSED;
                clockevents_exchange_device(dev, NULL);
+               dev->event_handler = NULL;
                td->evtdev = NULL;
        }
        raw_spin_unlock_irqrestore(&tick_device_lock, flags);

