Re: netif_tx_disable and lockless TX
On Wed, 2006-31-05 at 19:52 +0200, Robert Olsson wrote:
> jamal writes:
> > Latency-wise: TX completion interrupt provides the best latency.
> > Processing in the poll() -aka softirq- was almost close to the hardirq
> > variant. So if you can make things run in a softirq such as transmit
> > one, then the numbers will likely stay the same.
>
> I don't remember we tried tasklet for TX a la Herbert's suggestion but we
> used tasklets for controlling RX processing to avoid hardirq livelock
> in pre-NAPI versions.

Hrm - it may have been a private thing I did then. I could swear we did that experiment together... Perhaps Herbert's motivation was not really to optimize but rather to get something unstuck in the transmit path state machine, maybe in the context of netconsole? The conditions for which that tasklet would even run require a CPU collision on the transmit. Sorry, I didn't quite follow the motivation/discussion that ended in that patch.

> Had variants of tulip driver with both TX cleaning at ->poll and TX
> cleaning at hardirq and didn't see any performance difference. The
> ->poll was much cleaner but we kept Alexey's original work for tulip.

It certainly is cleaner - but I do recall the hardirq variant had better latency, most observable under high packet rates, aka small packets.

> > Sorry, I haven't been following discussions on netchannels[1] so I am not
> > qualified to comment on the "replacement" part Dave mentioned earlier.
> > What I can say is the tx processing doesn't have to be part of the NAPI
> > poll() and still use hardirq.
>
> Yes true but I see TX numbers with newer boards (wire rate small packets)
> with cleaning in ->poll. Also now linux is very safe in network "overload"
> situations. Moving work to hardirq may change that.

Oh, I am not suggesting a change - I am a lot more conservative than that ;-> These areas are delicate (not code-delicate Acme ;->), and what seems obvious requires a lot of experimental results first.
Robert, are your transmit results Intel or AMD based?

cheers,
jamal
-
To unsubscribe from this list: send the line "unsubscribe netdev"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: netif_tx_disable and lockless TX
Stephen Hemminger writes:
> I also noticed that you really don't save much by doing TX cleaning at
> hardirq, because in hardirq you need to do dev_kfree_irq and that causes
> a softirq (for the routing case where users=1). So when routing it
> doesn't make much difference, both methods cause the softirq delayed
> processing to be invoked. For locally generated packets which are
> cloned, the hardirq will drop the ref count, and that is faster than
> doing the whole softirq round trip.

Right. Also the other way around: repeated ->poll can avoid TX hardirqs.

Cheers.
--ro
Re: netif_tx_disable and lockless TX
Robert Olsson wrote:
> jamal writes:
> > Latency-wise: TX completion interrupt provides the best latency.
> > Processing in the poll() -aka softirq- was almost close to the hardirq
> > variant. So if you can make things run in a softirq such as transmit
> > one, then the numbers will likely stay the same.
>
> I don't remember we tried tasklet for TX a la Herbert's suggestion but we
> used tasklets for controlling RX processing to avoid hardirq livelock
> in pre-NAPI versions.
>
> Had variants of tulip driver with both TX cleaning at ->poll and TX
> cleaning at hardirq and didn't see any performance difference. The
> ->poll was much cleaner but we kept Alexey's original work for tulip.
>
> > Sorry, I haven't been following discussions on netchannels[1] so I am not
> > qualified to comment on the "replacement" part Dave mentioned earlier.
> > What I can say is the tx processing doesn't have to be part of the NAPI
> > poll() and still use hardirq.
>
> Yes true but I see TX numbers with newer boards (wire rate small packets)
> with cleaning in ->poll. Also now linux is very safe in network "overload"
> situations. Moving work to hardirq may change that.

I also noticed that you really don't save much by doing TX cleaning at hardirq, because in hardirq you need to do dev_kfree_irq and that causes a softirq (for the routing case where users=1). So when routing it doesn't make much difference; both methods cause the softirq delayed processing to be invoked. For locally generated packets which are cloned, the hardirq will drop the ref count, and that is faster than doing the whole softirq round trip.
Re: netif_tx_disable and lockless TX
On Wed, May 31, 2006 at 05:27:53PM -0700, Michael Chan wrote:
>
> Now that I think about it some more, we can eliminate most of the
> tx_lock in tg3 without converting to xmit_lock. In most places, we get
> the tx_lock after we call netif_tx_disable(), etc, and quiesce the chip.
> So the tx_lock or xmit_lock is really not necessary in most places. The
> tp->lock is sufficient after we stop everything.

Sounds good.

> The only few remaining places where we need it is around
> netif_stop_queue/netif_wake_queue in hard_start_xmit and tx completion.
> We can either keep the tx_lock or convert to xmit_lock and either way
> makes little difference because these are not in the normal fast path.
> If we keep the tx_lock, it will be just like BNX2.

Now that I look at it, the BNX2 method looks perfect.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
On Thu, 2006-06-01 at 10:42 +1000, Herbert Xu wrote:
> On Wed, May 31, 2006 at 04:01:06PM -0700, Michael Chan wrote:
> >
> > Oh, I thought your idea was to keep the LLTX and just replace tx_lock
> > with xmit_lock in tg3. But I suppose we can also clear LLTX, remove the
> > tx_lock in hard_start_xmit and convert the rest to avoid changing
> > netpoll.
>
> That's what I meant. This has the advantage that things like AF_PACKET
> won't see duplicate packets.

Now that I think about it some more, we can eliminate most of the tx_lock in tg3 without converting to xmit_lock. In most places, we get the tx_lock after we call netif_tx_disable(), etc, and quiesce the chip. So the tx_lock or xmit_lock is really not necessary in most places. The tp->lock is sufficient after we stop everything.

The only few remaining places where we need it is around netif_stop_queue/netif_wake_queue in hard_start_xmit and tx completion. We can either keep the tx_lock or convert to xmit_lock and either way makes little difference because these are not in the normal fast path. If we keep the tx_lock, it will be just like BNX2.
Re: netif_tx_disable and lockless TX
On Wed, May 31, 2006 at 04:01:06PM -0700, Michael Chan wrote:
>
> Oh, I thought your idea was to keep the LLTX and just replace tx_lock
> with xmit_lock in tg3. But I suppose we can also clear LLTX, remove the
> tx_lock in hard_start_xmit and convert the rest to avoid changing
> netpoll.

That's what I meant. This has the advantage that things like AF_PACKET won't see duplicate packets.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
On Thu, 2006-06-01 at 10:25 +1000, Herbert Xu wrote:
> On Wed, May 31, 2006 at 05:09:08PM -0700, David Miller wrote:
> > From: "Michael Chan" <[EMAIL PROTECTED]>
> > Date: Wed, 31 May 2006 14:20:29 -0700
> >
> > > David, So do we want to fix this issue as proposed by Herbert to replace
> > > tx_lock with xmit_lock? It seems quite straightforward to do. For this
> > > change to work, netpoll also needs to be fixed up a bit to check for
> > > LLTX before getting the xmit_lock.
> >
> > Oh yes, netpoll needs changes, thanks for noticing that.
>
> Why does it need to change? The idea is to take the LLTX flag off tg3
> after the conversion to xmit_lock.

Oh, I thought your idea was to keep the LLTX and just replace tx_lock with xmit_lock in tg3. But I suppose we can also clear LLTX, remove the tx_lock in hard_start_xmit and convert the rest to avoid changing netpoll.
Re: netif_tx_disable and lockless TX
On Wed, May 31, 2006 at 05:09:08PM -0700, David Miller wrote:
> From: "Michael Chan" <[EMAIL PROTECTED]>
> Date: Wed, 31 May 2006 14:20:29 -0700
>
> > David, So do we want to fix this issue as proposed by Herbert to replace
> > tx_lock with xmit_lock? It seems quite straightforward to do. For this
> > change to work, netpoll also needs to be fixed up a bit to check for
> > LLTX before getting the xmit_lock.
>
> Oh yes, netpoll needs changes, thanks for noticing that.

Why does it need to change? The idea is to take the LLTX flag off tg3 after the conversion to xmit_lock.

> I think this netpoll wrinkle means we also have to make
> sure to set the xmit_lock_owner across the board.

You're right. In fact this can deadlock today for those drivers that already make use of xmit_lock without setting the owner. So I suppose something like net_xmit_lock to obtain xmit_lock is called for.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
From: "Michael Chan" <[EMAIL PROTECTED]>
Date: Wed, 31 May 2006 14:20:29 -0700

> David, So do we want to fix this issue as proposed by Herbert to replace
> tx_lock with xmit_lock? It seems quite straightforward to do. For this
> change to work, netpoll also needs to be fixed up a bit to check for
> LLTX before getting the xmit_lock.

Oh yes, netpoll needs changes, thanks for noticing that. That's why I had the idea to make xmit_lock a pointer, so that things like that netpoll case could be transparent.

I think this netpoll wrinkle means we also have to make sure to set the xmit_lock_owner across the board.
Re: netif_tx_disable and lockless TX
On Tue, 2006-05-30 at 22:11 -0700, David Miller wrote:
> From: Herbert Xu <[EMAIL PROTECTED]>
> Date: Wed, 31 May 2006 14:58:11 +1000
>
> > Yes, TG3 does not disable IRQs when taking its TX lock. So do you see
> > any problems with replacing the TG3 TX lock using xmit_lock?
>
> I don't see any.

David, so do we want to fix this issue as proposed by Herbert to replace tx_lock with xmit_lock? It seems quite straightforward to do. For this change to work, netpoll also needs to be fixed up a bit to check for LLTX before getting the xmit_lock.
Re: netif_tx_disable and lockless TX
jamal writes:
> Latency-wise: TX completion interrupt provides the best latency.
> Processing in the poll() -aka softirq- was almost close to the hardirq
> variant. So if you can make things run in a softirq such as transmit
> one, then the numbers will likely stay the same.

I don't remember we tried tasklet for TX a la Herbert's suggestion but we used tasklets for controlling RX processing to avoid hardirq livelock in pre-NAPI versions.

Had variants of tulip driver with both TX cleaning at ->poll and TX cleaning at hardirq and didn't see any performance difference. The ->poll was much cleaner but we kept Alexey's original work for tulip.

> Sorry, I haven't been following discussions on netchannels[1] so I am not
> qualified to comment on the "replacement" part Dave mentioned earlier.
> What I can say is the tx processing doesn't have to be part of the NAPI
> poll() and still use hardirq.

Yes true, but I see TX numbers with newer boards (wire rate small packets) with cleaning in ->poll. Also now linux is very safe in network "overload" situations. Moving work to hardirq may change that.

Cheers.
--ro
Re: netif_tx_disable and lockless TX
On Wed, 2006-31-05 at 22:40 +1000, Herbert Xu wrote:
> On Wed, May 31, 2006 at 08:36:12AM -0400, jamal wrote:
> > [..]
> >
> > Been done in the past, bad numbers especially in SMP for reasons of
> > latency and likelihood that a tasklet will run in a totally different
> > CPU.
>
> Was the bad latency measured with code like the above or was it with
> an unconditional tasklet schedule?

The idea was to prune those skbs for reuse within reason so as to not sacrifice performance. Performance is sacrificed if you have too many TX EOLs, or you don't prune the descriptors fast enough. The dynamics proved to be hard to nail in a generic way because behavior varies based on traffic patterns. So the "within reason" part will unconditionally schedule a tasklet on the tx if certain DMA ring thresholds are crossed, way after the tx lock has been grabbed - which is different from your suggestion to do it when you can't grab the lock. Otherwise you let the hardirq take care of things. Tasklets proved to be a bad idea and instead we ended up reclaiming the skbs/descriptors right at the point where we had initially scheduled tasklets. I don't know what happened to using that technique - Robert can testify better than I.

> Please note that my code will do the completion in the IRQ handler
> most of the time. Only if there is contention for the spin lock will
> it defer work to the softirq.

That part is different and I think there may be potential - it will require experimenting with a variety of traffic patterns. I wish I had time to help.

cheers,
jamal
Re: netif_tx_disable and lockless TX
On Wed, May 31, 2006 at 08:36:12AM -0400, jamal wrote:
>
> > The IRQ handler would look like
> >
> > 	if (!spin_trylock(&dev->xmit_lock)) {
> > 		tasklet_schedule(&priv->tx_completion_tasklet);
> > 		return;
> > 	}
> >
> > 	handle_tx_completion();
> > 	spin_unlock(&dev->xmit_lock);
> >
> > Where the TX completion tasklet would simply do
> >
> > 	spin_lock(&dev->xmit_lock);
> > 	handle_tx_completion();
> > 	spin_unlock(&dev->xmit_lock);
> >
> > What do you think?
>
> Been done in the past, bad numbers especially in SMP for reasons of
> latency and likelihood that a tasklet will run in a totally different
> CPU.

Was the bad latency measured with code like the above or was it with an unconditional tasklet schedule?

Please note that my code will do the completion in the IRQ handler most of the time. Only if there is contention for the spin lock will it defer work to the softirq.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
On Wed, 2006-31-05 at 22:06 +1000, Herbert Xu wrote:
> On Wed, May 31, 2006 at 12:08:18AM -0700, David Miller wrote:
> >
> > I understand what you're saying. What I'm talking about is that in
> > net channel drivers we might go back to IRQ disabling locks again.
>
> OK, let's assume that the TX completion will go back into the IRQ
> handler. I contend that we can still get by with a BH-disabling
> xmit_lock. This is how it would work:
>
> The IRQ handler would look like
>
> 	if (!spin_trylock(&dev->xmit_lock)) {
> 		tasklet_schedule(&priv->tx_completion_tasklet);
> 		return;
> 	}
>
> 	handle_tx_completion();
> 	spin_unlock(&dev->xmit_lock);
>
> Where the TX completion tasklet would simply do
>
> 	spin_lock(&dev->xmit_lock);
> 	handle_tx_completion();
> 	spin_unlock(&dev->xmit_lock);
>
> What do you think?

Been done in the past, bad numbers especially in SMP for reasons of latency and the likelihood that a tasklet will run on a totally different CPU.

Latency-wise: TX completion interrupt provides the best latency. Processing in the poll() -aka softirq- was almost close to the hardirq variant. So if you can make things run in a softirq such as the transmit one, then the numbers will likely stay the same.

Sorry, I haven't been following discussions on netchannels[1] so I am not qualified to comment on the "replacement" part Dave mentioned earlier. What I can say is the tx processing doesn't have to be part of the NAPI poll() and still use hardirq. If you look at the earlier NAPI drivers such as the tulip or the broadcom 1250, the TX EOL was always made an exception and was never touched by the ->poll() code, rather by the main interrupt routine. I do prefer the scheme of leaving the TX EOL out of the NAPI poll because I believe it provides better performance.

The caveat is there's a lot of fscked hardware out there that does unconditional clear-on-read of the int status register ;-> I had a lot of fun with bcm1250 and ended up discovering that there were in fact two status regs ;-> (it pays to know people for undocumented things ;->) - one was a debug version which doesn't clear on writing. So in one spot I would read the real register and in the other the debug version. I had numbers which showed doing it this way provided better performance than doing it in the poll() routine (scouring the code just now it seems my version never made it in, so you will see something along the lines of e1000/tg3).

BTW, Andi Kleen on CC is the person who posted some numbers on the LLTX (as well as the patch) at some point. I have also CCed Robert who may have comments on the use of tasklets.

cheers,
jamal

[1] Such a shame, such an exciting topic, such few cycles. I will have time RealSoonNow and you will probably see a_patch_from_me_too.
Re: netif_tx_disable and lockless TX
On Wed, May 31, 2006 at 12:08:18AM -0700, David Miller wrote:
>
> I understand what you're saying. What I'm talking about is that in
> net channel drivers we might go back to IRQ disabling locks again.

OK, let's assume that the TX completion will go back into the IRQ handler. I contend that we can still get by with a BH-disabling xmit_lock. This is how it would work:

The IRQ handler would look like

	if (!spin_trylock(&dev->xmit_lock)) {
		tasklet_schedule(&priv->tx_completion_tasklet);
		return;
	}

	handle_tx_completion();
	spin_unlock(&dev->xmit_lock);

Where the TX completion tasklet would simply do

	spin_lock(&dev->xmit_lock);
	handle_tx_completion();
	spin_unlock(&dev->xmit_lock);

What do you think?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Wed, 31 May 2006 16:31:52 +1000

> On Tue, May 30, 2006 at 11:26:26PM -0700, David Miller wrote:
> >
> > I don't think it will be worthwhile to keep NAPI around just for
> > TX completion. Sure the dev_kfree_skb() will schedule software
> > interrupt work to do the actual free, but the TX ring walking
> > and dev_kfree_skb() calls will be in hard IRQ context.
>
> Sure we won't need all of NAPI. But would it be that bad to schedule
> a tasklet to do the TX completion?

It seems just an extra level of indirection. :)

> Oh, I'm not proposing that we disable IRQs on xmit_lock at all. ...

I understand what you're saying. What I'm talking about is that in net channel drivers we might go back to IRQ disabling locks again.
Re: netif_tx_disable and lockless TX
On Tue, May 30, 2006 at 11:26:26PM -0700, David Miller wrote:
>
> I don't think it will be worthwhile to keep NAPI around just for
> TX completion. Sure the dev_kfree_skb() will schedule software
> interrupt work to do the actual free, but the TX ring walking
> and dev_kfree_skb() calls will be in hard IRQ context.

Sure we won't need all of NAPI. But would it be that bad to schedule a tasklet to do the TX completion?

> > > The only regret I have about that is we will go back to not being able
> > > to profile ->hard_start_xmit() very well in such drivers.
> >
> > Can you elaborate on that? I think I've already removed all references
> > to this in my memory :)
>
> If you disable IRQs in the ->hard_start_xmit() handler, you don't
> get timer based profiling ticks. Currently we do.

Oh, I'm not proposing that we disable IRQs on xmit_lock at all. What I'm saying is that for tg3, since both xmit_lock and tx_lock are BH-disabling locks, they are equivalent and therefore we can replace tx_lock with xmit_lock. For other lockless NICs that have IRQ disabling tx_locks, they can be converted to BH-disabling ones and then be able to use xmit_lock.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Wed, 31 May 2006 15:14:51 +1000

> On Tue, May 30, 2006 at 10:11:17PM -0700, David Miller wrote:
> >
> > It will come back when net channels arrive :-) This is because net
> > channel drivers will do all the work in hard IRQs and not in a NAPI or
> > NAPI-like context.
>
> I thought the current channel stuff is RX only. Is TX completion moving
> to IRQ context as well?

I don't think it will be worthwhile to keep NAPI around just for TX completion. Sure the dev_kfree_skb() will schedule software interrupt work to do the actual free, but the TX ring walking and dev_kfree_skb() calls will be in hard IRQ context.

> > The only regret I have about that is we will go back to not being able
> > to profile ->hard_start_xmit() very well in such drivers.
>
> Can you elaborate on that? I think I've already removed all references
> to this in my memory :)

If you disable IRQs in the ->hard_start_xmit() handler, you don't get timer based profiling ticks. Currently we do.
Re: netif_tx_disable and lockless TX
On Tue, May 30, 2006 at 10:30:25PM -0700, Michael Chan wrote:
>
> I also don't see any problem. It looks like we don't have to set
> xmit_lock_owner, right?

Right, unless you want to call dev_queue_xmit recursively :)

--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
David Miller wrote:
> From: Herbert Xu <[EMAIL PROTECTED]>
> Date: Wed, 31 May 2006 14:58:11 +1000
>
> > Yes, TG3 does not disable IRQs when taking its TX lock. So do you see
> > any problems with replacing the TG3 TX lock using xmit_lock?
>
> I don't see any.

I also don't see any problem. It looks like we don't have to set xmit_lock_owner, right?
Re: netif_tx_disable and lockless TX
On Tue, May 30, 2006 at 10:11:17PM -0700, David Miller wrote:
>
> It will come back when net channels arrive :-) This is because net
> channel drivers will do all the work in hard IRQs and not in a NAPI or
> NAPI-like context.

I thought the current channel stuff is RX only. Is TX completion moving to IRQ context as well?

> The only regret I have about that is we will go back to not being able
> to profile ->hard_start_xmit() very well in such drivers.

Can you elaborate on that? I think I've already removed all references to this in my memory :)

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Wed, 31 May 2006 14:58:11 +1000

> Hi Michael:
>
> On Tue, May 30, 2006 at 09:51:03PM -0700, Michael Chan wrote:
> >
> > > That's why I suggest that every NIC that uses this feature be forced
> > > to do what TG3 does so only BH disabling is needed. Once that's done
> > > they can just use xmit_lock and everyone will be happy again.
> >
> > As long as the tx completion is all done in NAPI, it can use BH
> > disabling without irq disabling.
>
> Yes, TG3 does not disable IRQs when taking its TX lock. So do you see
> any problems with replacing the TG3 TX lock using xmit_lock?

I don't see any. Thanks for reminding me about the IRQ vs. BH disabling issue.

It will come back when net channels arrive :-) This is because net channel drivers will do all the work in hard IRQs and not in a NAPI or NAPI-like context.

The only regret I have about that is we will go back to not being able to profile ->hard_start_xmit() very well in such drivers.
Re: netif_tx_disable and lockless TX
Hi Michael:

On Tue, May 30, 2006 at 09:51:03PM -0700, Michael Chan wrote:
>
> > That's why I suggest that every NIC that uses this feature be forced
> > to do what TG3 does so only BH disabling is needed. Once that's done
> > they can just use xmit_lock and everyone will be happy again.
>
> As long as the tx completion is all done in NAPI, it can use BH
> disabling without irq disabling.

Yes, TG3 does not disable IRQs when taking its TX lock. So do you see any problems with replacing the TG3 TX lock using xmit_lock?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
Herbert Xu wrote:
> On Tue, May 30, 2006 at 09:13:22PM -0700, David Miller wrote:
> > As per the bug, I always keep thinking about changing how we
> > do the lockless stuff by simply making xmit_lock a pointer.
> > Drivers like tg3 can make it point to their local driver lock.
>
> This is equivalent to just using the xmit_lock in the driver, right?
> IIRC the problem was with IRQ disabling vs. BH disabling.

For drivers that always take a private tx lock between tx completion and hard_start_xmit on top of the xmit_lock, getting rid of the xmit_lock using LLTX is a net gain. For drivers that don't need a private tx lock, or rarely need it between tx completion and hard_start_xmit, it makes little difference whether you use xmit_lock or a private lock with LLTX.

> That's why I suggest that every NIC that uses this feature be forced
> to do what TG3 does so only BH disabling is needed. Once that's done
> they can just use xmit_lock and everyone will be happy again.

As long as the tx completion is all done in NAPI, it can use BH disabling without irq disabling. TG3 uses LLTX with a private tx_lock. BNX2 uses xmit_lock with a private tx_lock that is rarely used. Both drivers use BH disabling. The profiles of the 2 drivers are very similar.
Re: netif_tx_disable and lockless TX
On Tue, May 30, 2006 at 09:13:22PM -0700, David Miller wrote:
>
> There were numbers posted for real drivers at one point, I
> do remember that, and the gains were non-trivial. At least
> enough to justify the feature.

I wonder if there are other ways of achieving the same improvement.

> As per the bug, I always keep thinking about changing how we
> do the lockless stuff by simply making xmit_lock a pointer.
> Drivers like tg3 can make it point to their local driver lock.

This is equivalent to just using the xmit_lock in the driver, right? IIRC the problem was with IRQ disabling vs. BH disabling.

That's why I suggest that every NIC that uses this feature be forced to do what TG3 does so only BH disabling is needed. Once that's done they can just use xmit_lock and everyone will be happy again.

> This leaves still the cases like loopback which do no locking
> whatsoever. I don't know how to fit it into the above idea.

As far as I'm concerned loopback can do whatever it wants as long as it doesn't create bugs in real NICs :)

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: netif_tx_disable and lockless TX
Herbert> However, lockless drivers do not take the xmit_lock so
Herbert> this method is ineffective. Such drivers need to do
Herbert> their own checking inside whatever locks that they do
Herbert> take. For example, tg3 could get around this by checking
Herbert> whether the queue is stopped in its hard_start_xmit
Herbert> function.

Yes, I had to add this to the IPoIB driver, because calling netif_stop_queue() when the transmit ring was full was sometimes still allowing hard_start_xmit to be called again:

	/*
	 * Check if our queue is stopped. Since we have the LLTX bit
	 * set, we can't rely on netif_stop_queue() preventing our
	 * xmit function from being called with a full queue.
	 */
	if (unlikely(netif_queue_stopped(dev))) {
		spin_unlock_irqrestore(&priv->tx_lock, flags);
		return NETDEV_TX_BUSY;
	}

This bug started a long thread a while back, but I don't remember if there was any resolution.

Herbert> I must say though that I'm becoming less and less
Herbert> impressed by the lockless feature based on the number of
Herbert> problems that it has caused. Does anyone have any hard
Herbert> figures as to its effectiveness (excluding any stats
Herbert> relating to the loopback interface which can be easily
Herbert> separated from normal NIC drivers).

I don't have exact figures at hand, but I remember something like a 2 or 3 percent throughput improvement for IPoIB.

- R.
Re: netif_tx_disable and lockless TX
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Wed, 31 May 2006 14:03:07 +1000

> I must say though that I'm becoming less and less impressed by the
> lockless feature based on the number of problems that it has caused.
> Does anyone have any hard figures as to its effectiveness (excluding
> any stats relating to the loopback interface which can be easily
> separated from normal NIC drivers).

There were numbers posted for real drivers at one point, I do remember that, and the gains were non-trivial. At least enough to justify the feature.

As per the bug, I always keep thinking about changing how we do the lockless stuff by simply making xmit_lock a pointer. Drivers like tg3 can make it point to their local driver lock.

This still leaves the cases like loopback which do no locking whatsoever. I don't know how to fit it into the above idea.
netif_tx_disable and lockless TX
Hi:

It has occurred to me that the function netif_tx_disable may be unsafe when used in drivers that do lockless transmission. In particular, the function relies on taking the xmit_lock to ensure that no further transmissions will occur until the queue is started again.

However, lockless drivers do not take the xmit_lock so this method is ineffective. Such drivers need to do their own checking inside whatever locks that they do take. For example, tg3 could get around this by checking whether the queue is stopped in its hard_start_xmit function.

I must say though that I'm becoming less and less impressed by the lockless feature based on the number of problems that it has caused. Does anyone have any hard figures as to its effectiveness (excluding any stats relating to the loopback interface, which can be easily separated from normal NIC drivers)?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt