Steve,

You are totally correct.  I used the wrong words when I said
"ksoftirqd thread runs".  My apologies for very misleading wording.

I have updated the wording in-line below, in the original message to
indicate that it is softirq threads, in the ksoftirqd() function, not
the ksoftirqd thread.

-Frank

-----Original Message-----
From: Steven Rostedt [mailto:[EMAIL PROTECTED]
Sent: Tue 1/15/2008 4:39 PM
To: Rowand, Frank
Cc: linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
Subject: Re: [PATCH] lost softirq, 2.6.24-rc7
 
On Tue, Jan 15, 2008 at 02:15:26PM -0800, Frank Rowand wrote:
> From: Frank Rowand <[EMAIL PROTECTED]>
> 
> (Ingo, there is a question for you after the description, just before the
> patch.)
> 
> When running an interrupt and network intensive stress test with PREEMPT_RT
> enabled, the target system stopped processing received network packets.
> skbs from received packets were being queued by net_rx_action(), but the
> NET_RX_SOFTIRQ softirq was never running to remove the skbs from the queue.
> Since the target system root file system is NFS mounted, the system is now
> effectively hung.
> 
> A pseudocode description of how this state was reached follows.
> Each level of indentation represents a function call from the previous line.
> 
> 
> ethernet driver irq handler receives packet
>    netif_rx()
>       queues skb (qlen == 1), raises NET_RX_SOFTIRQ
> 
> on return from irq
>    ___do_softirq() [ 1 ]
>       Reset the pending bitmask

Frank,

This path should not be hit when running with PREEMPT_RT. The softirqs
are now all separate, and are not run in batch in ksoftirqd. In fact,
ksoftirqd should not be running at all with PREEMPT_RT.

-- Steve

>       net_rx_action()
>          dequeues skb (qlen == 0)
>          jiffies incremented, so
>             break out of processing
>             and raise NET_RX_SOFTIRQ
>             (but don't deassert NAPI_STATE_SCHED)
> 
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
> -
>          ksoftirqd thread runs

           ^^^^^^^^^^^^^^^^^^^^^^  should have been:

           the TIMER_SOFTIRQ and RCU_SOFTIRQ softirq threads, which are
           both executing ksoftirqd() run

>             process TIMER_SOFTIRQ
>             process RCU_SOFTIRQ

>             << ksoftirqd sleeps >>

              ^^^^^^^^^^^^^^^^^^^^^^^  should have been:
              the softirq threads, executing in ksoftirqd(), sleep

> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
> -
> 
>          ___do_softirq() [ 2 ]
>             Reset the pending bitmask
>             finds NET_RX_SOFTIRQ raised but already running
>             << ___do_softirq() [ 2 ] completes >>
> 
>       << ___do_softirq() [ 1 ] resumes >>
>       the pending bitmask is empty, so NET_RX_SOFTIRQ is lost
> 
> 
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to