On Fri, 22 Jul 2011 09:12:51 -0000
Thomas Gleixner <[email protected]> wrote:

> Ben reported a lockup related to rtc. The lockup happens due to:
> 
> CPU0                                        CPU1
> 
> rtc_irq_set_state()                       __run_hrtimer()     
>   spin_lock_irqsave(&rtc->irq_task_lock)    rtc_handle_legacy_irq();
>                                             spin_lock(&rtc->irq_task_lock);
>   hrtimer_cancel()
>     while (callback_running);
> 
> So the running callback never finishes as it's blocked on
> rtc->irq_task_lock.  
> 
> Use hrtimer_try_to_cancel() instead and drop rtc->irq_task_lock while
> waiting for the callback. Fix this for both rtc_irq_set_state() and
> rtc_irq_set_freq().
> 
> ...
>
> +static int rtc_update_hrtimer(struct rtc_device *rtc, int enabled)
> +{
> +     /*
> +      * We unconditionally cancel the timer here, because otherwise

The comment seems wrong.  If hrtimer_try_to_cancel() fails, we simply
bale out so we did not "unconditionally cancel the timer"?

> +      * we could run into BUG_ON(timer->state != HRTIMER_STATE_CALLBACK);
> +      * when we manage to start the timer before the callback
> +      * returns HRTIMER_RESTART.
> +      *
> +      * We cannot use hrtimer_cancel() here as a running callback
> +      * could be blocked on rtc->irq_task_lock and hrtimer_cancel()
> +      * would spin forever.
> +      */
> +     if (hrtimer_try_to_cancel(&rtc->pie_timer) < 0)
> +             return -1;
> +
> +     if (enabled) {
> +             ktime_t period = ktime_set(0, NSEC_PER_SEC / rtc->irq_freq);
> +
> +             hrtimer_start(&rtc->pie_timer, period, HRTIMER_MODE_REL);
> +     }
> +     return 0;
> +}
> +
>  /**
>   * rtc_irq_set_state - enable/disable 2^N Hz periodic IRQs
>   * @rtc: the rtc device
> @@ -651,24 +674,21 @@ int rtc_irq_set_state(struct rtc_device 
>       int err = 0;
>       unsigned long flags;
>  
> +retry:
>       spin_lock_irqsave(&rtc->irq_task_lock, flags);
>       if (rtc->irq_task != NULL && task == NULL)
>               err = -EBUSY;
>       if (rtc->irq_task != task)
>               err = -EACCES;
> -     if (err)
> -             goto out;
> -
> -     if (enabled) {
> -             ktime_t period = ktime_set(0, NSEC_PER_SEC/rtc->irq_freq);
> -             hrtimer_start(&rtc->pie_timer, period, HRTIMER_MODE_REL);
> -     } else {
> -             hrtimer_cancel(&rtc->pie_timer);
> +     if (!err) {
> +             if (rtc_update_hrtimer(rtc, enabled) < 0) {
> +                     spin_unlock_irqrestore(&rtc->irq_task_lock, flags);
> +                     cpu_relax();
> +                     goto retry;
> +             }
> +             rtc->pie_enabled = enabled;

Well this is rather nasty.  Sort of an open-coded expensive spinlock. 
All rather pointless on SMP=n builds, too.

Is there no better way, such as fixing up the locking properly?

_______________________________________________
stable mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/stable

Reply via email to