Mircea Damian wrote:
> 
> Hello,
> 
> I was expecting to receive some replies to my last desperate messages:
> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg35446.html
> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg36591.html
> 
> My machine is dying in add_timer(). It seems to happen only on SMP
> machines and to be related to the network driver. For some reason
> one of the timer lists gets broken, so we (two people are trying to
> solve this issue) wrote a "safe" timer.c which tries to rebuild the chain
> in case it hits a NULL pointer.
> 
> The machine is (of course) slower with this patch, but at least it works.
> 
> Maybe someone can see what the real bug is and fix it.
> 
> Please help!

I found (at least part of) the problem.  In detach_timer() we test if
the timer is pending.  If it is not, the function does not remove the
timer from the list and returns 0.  The functions that call
detach_timer() do not check the return value and unconditionally set the
list pointers to NULL, even though the timer is still on the list.
A patch against 2.4.3 is attached, but there may be a better solution.
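
For illustration, the idea behind the patch can be sketched in userspace
C.  This is only a sketch under stated assumptions, not the kernel code:
the list_head and timer_list structures below are cut-down stand-ins,
and timer_pending() is assumed to mean "list.next is non-NULL".  The
point it shows is that the list pointers are cleared inside
detach_timer(), and only on the path where the entry has really been
unlinked.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-ins for the kernel types; illustrative only. */
struct list_head {
    struct list_head *next, *prev;
};

struct timer_list {
    struct list_head list;
};

/* Insert entry right after head in a circular doubly linked list. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from its neighbours; entry's own pointers are left stale. */
static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Assumption: a timer counts as pending while its next pointer is non-NULL. */
static int timer_pending(const struct timer_list *timer)
{
    return timer->list.next != NULL;
}

/* Returns 1 if the timer was unlinked, 0 if it was not pending. */
static int detach_timer(struct timer_list *timer)
{
    if (!timer_pending(timer))
        return 0;
    list_del(&timer->list);
    /* Clear the pointers only after the entry has really been unlinked. */
    timer->list.next = timer->list.prev = NULL;
    return 1;
}
```

With this shape, callers no longer need to touch timer->list themselves,
so the detach and the pointer reset cannot get out of step.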

-- 

                                                Brian Gerst
diff -urN linux-2.4.3/kernel/timer.c linux/kernel/timer.c
--- linux-2.4.3/kernel/timer.c  Thu Dec 14 20:52:22 2000
+++ linux/kernel/timer.c        Fri Apr 13 13:26:08 2001
@@ -194,6 +194,7 @@
        if (!timer_pending(timer))
                return 0;
        list_del(&timer->list);
+       timer->list.next = timer->list.prev = NULL;
        return 1;
 }
 
@@ -217,7 +218,6 @@
 
        spin_lock_irqsave(&timerlist_lock, flags);
        ret = detach_timer(timer);
-       timer->list.next = timer->list.prev = NULL;
        spin_unlock_irqrestore(&timerlist_lock, flags);
        return ret;
 }
@@ -246,7 +246,6 @@
 
                spin_lock_irqsave(&timerlist_lock, flags);
                ret += detach_timer(timer);
-               timer->list.next = timer->list.prev = 0;
                running = timer_is_running(timer);
                spin_unlock_irqrestore(&timerlist_lock, flags);
 
@@ -309,7 +308,6 @@
                        data= timer->data;
 
                        detach_timer(timer);
-                       timer->list.next = timer->list.prev = NULL;
                        timer_enter(timer);
                        spin_unlock_irq(&timerlist_lock);
                        fn(data);
