Re: [WIP][PATCHES] Network xmit batching - tg3 support
On Wed, 2007-04-07 at 09:49 +0530, Krishna Kumar2 wrote: > Do you see any contention for tx_lock which can justify having a prep > handler? As I understood, no other CPU can be in the xmit code at the > same time since RUNNING bit is held. Hence getting this lock early or > late should not matter for the xmit side (and you are also holding > dev->queue_lock while running prep so no new skbs can also be added to > the dev during this time). And I couldn't find any driver that uses the > same tx_lock on rx, On any non-LLTX driver: netif_tx_lock() is grabbed on both tx and receive paths. I have to admit, the e1000 is too clever for its own good: there is no documentation/description (and Herbert never explained) but it seems to be able to play with only certain parts of the ring on tx side and others on rx side so that a lock becomes unnecessary. So even for non-LLTX, some lock needs to be held on rx just to be clean (and theres a lot of grunty-ness against LLTX drivers out there). > so where is the saving by doing prep ? You would agree i think that if there is contention between two CPUs for the same lock, gut feeling is there is benefit to do things outside the lock when possible. This becomes more important with batching when amortizing the cost of a lock. Theres a little more than that and I could write up a longer description if needed. Also note: not all drivers may be able to move things outside of the lock - infact the tun driver i converted is hard to do anything with since it doesnt have any hardware features that makes it easy to move things outside. For this reason this api is optional (and tun infact doesnt use it); and as i have said a few times now, i will be more than happy to get rid of it if experiments show it is unnecessary. For now my take is someone needs to disprove it is valuable (most prefered with experiments). cheers, jamal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP][PATCHES] Network xmit batching - tg3 support
Hi Jamal, J Hadi Salim <[EMAIL PROTECTED]> wrote on 07/03/2007 06:56:20 PM: > On Mon, 2007-02-07 at 17:21 -0700, Michael Chan wrote: > > [Matt, please include the count in the fix per previous email] > long answer: > My goal with storing these values and computing them was to do certain > things that dont require holding the netif_tx_lock within a batch as > well. Evaluating the packet metadata and formating the packet to be > ready for stashing into DMA was one thing i could do outside of holding > the lock easily - and running that in loop of 100 packets amortizes the > instruction cache and allows me (when i get to it) to hold the lock for > a lot less computation. Do you see any contention for tx_lock which can justify having a prep handler? As I understood, no other CPU can be in the xmit code at the same time since RUNNING bit is held. Hence getting this lock early or late should not matter for the xmit side (and you are also holding dev->queue_lock while running prep so no new skbs can also be added to the dev during this time). And I couldn't find any driver that uses the same tx_lock on rx, so where is the saving by doing prep ? - KK - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP][PATCHES] Network xmit batching - tg3 support
On Tue, 2007-03-07 at 12:31 -0700, Matt Carlson wrote: > I had planned on using netperf, but pktgen sounds like a more controlled > environment. Thanks for the tip. I can help more if you use pktgen - netperf is more involved. Also pktgen is much closer to the driver so it would let you see closely if any improvements are obvious or not. > Do you see any reason why we couldn't just add the VLAN code to the prep > stage and simply overwrite that portion of the skb-cb? Our driver would > just store its value in the base_flags member of tg3_tx_cbdata. That would be a much better approach;-> Go for it. cheers, jamal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP][PATCHES] Network xmit batching - tg3 support
From: jamal <[EMAIL PROTECTED]> Date: Tue, 03 Jul 2007 09:09:48 -0400 > [It sounds very dangerous to me the way skb->cb is being used by the > vlan code (i.e requires human intervention/knowledge to catch it as an > issue). I had no freaking idea the vlan code was using it. Maybe a huge > comment somewhere on how these cbs are being used by drivers would help > or even a registration on startup to just make sure there is no conflict > at a layer (i have been meaning to do the later for years now). In any > case this is is a digression]. This is how the VLAN layer passed in the tag for the transmit function. The CB is owned by the caller until you spam it with your local data at your level, so if you need any information the previous layer provides you have to sample it before using the ->cb[] So you can simply fix this by reading the VLAN tag, then using the ->cb[] however you like. It's always been like this, don't be surprised :) - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP][PATCHES] Network xmit batching - tg3 support
On Tue, 2007-07-03 at 09:09 -0400, jamal wrote: > On Mon, 2007-02-07 at 14:20 -0700, Matt Carlson wrote: > > > > > > Hi Jamal. I'll be testing your patch soon, > > much thanks. Please let me know if you need help while doing this. > What tools are you planning to test with? I have tested this patch > with pktgen on a dual opteron/tg3-buggy. > There is an outstanding issue in regards to clocksource - however the > batch does well regardless of the clock source. I had planned on using netperf, but pktgen sounds like a more controlled environment. Thanks for the tip. > > but I wanted to point out a > > bug in the patch. The patch defines TG3_SKB_CB() as follows : > > > > #define TG3_SKB_CB(__skb) ((struct tg3_tx_cbdata *)&((__skb)->cb[0])) > > > > This definition will collide with the VLAN macros if TG3_VLAN_TAG_USED > > is set. vlan_tx_tag_get() is defined as : > > > > #define vlan_tx_tag_get(__skb) (VLAN_TX_SKB_CB(__skb)->vlan_tag) > > > > VLAN_TX_SKB_CB is defined as : > > > > #define VLAN_TX_SKB_CB(__skb) \ > > ((struct vlan_skb_tx_cookie *)&((__skb)->cb[0])) > > > > yikes. Thanks for catching that - I thought i had this pretty much > covered after scanning the source. So that bug exists on the e1000 as > well. > [It sounds very dangerous to me the way skb->cb is being used by the > vlan code (i.e requires human intervention/knowledge to catch it as an > issue). I had no freaking idea the vlan code was using it. Maybe a huge > comment somewhere on how these cbs are being used by drivers would help > or even a registration on startup to just make sure there is no conflict > at a layer (i have been meaning to do the later for years now). In any > case this is is a digression]. > > In the meantime, changing it to use byte 8 and above should do it? i.e. > #define TG3_SKB_CB(__skb) ((struct tg3_tx_cbdata *)&((__skb)->cb[8])) > > There are 48 bytes there on the skb-cb, so there should be plenty. Do you see any reason why we couldn't just add the VLAN code to the prep stage and simply overwrite that portion of the skb-cb? Our driver would just store its value in the base_flags member of tg3_tx_cbdata. > > Also, I think the count, max_per_txd, and nr_frags fields of the > > tg3_tx_cbdata struct are not needed. > > Yes, you are right. That was a result of the LinuxWay(tm) (aka cutnpaste > from the e1000 which needs them), Can you send me a patch for that and > TG3_SKB_CB? Once we iron out the skb-cb issue, sure. > > cheers, > jamal > > - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP][PATCHES] Network xmit batching - tg3 support
On Mon, 2007-02-07 at 17:21 -0700, Michael Chan wrote: [Matt, please include the count in the fix per previous email] > Only base_flags and mss are needed and these can be determined right > before sending the frame. So is it better not to store these in the > skb->cb at all? long answer: My goal with storing these values and computing them was to do certain things that dont require holding the netif_tx_lock within a batch as well. Evaluating the packet metadata and formating the packet to be ready for stashing into DMA was one thing i could do outside of holding the lock easily - and running that in loop of 100 packets amortizes the instruction cache and allows me (when i get to it) to hold the lock for a lot less computation. In the e1000 for example i was able to go as far as computing how many descriptors are needed per skb and stashing those in the skb->cb; the way the tg3 is structured makes doing all that needing some big changes. So to answer your question: It would be nice if i could keep the skb->cb for now and we can get rid of it later if deemed a nuisance for batching. > > + dcount = tg3_tx_avail(tp); > if (unlikely(netif_queue_stopped(tp->dev) && > - (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH(tp { > + (dcount > TG3_TX_WAKEUP_THRESH(tp { > netif_tx_lock(tp->dev); > + tp->dev->xmit_win = 1; > if (netif_queue_stopped(tp->dev) && > - (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH(tp))) > + (dcount > TG3_TX_WAKEUP_THRESH(tp))) { > netif_wake_queue(tp->dev); > + tp->dev->xmit_win = dcount; > + } > netif_tx_unlock(tp->dev); > } > } > > This is also not right. tg3_tx() runs without netif_tx_lock(). > tg3_tx_avail() can change after you get the netif_tx_lock() and we must > get the updated value again. If we just rely on dcount, we can call > wake_queue() when the ring is full, or there may be no wakeup when the > ring is empty. You are right, I was trying to be a smart-ass there so i didnt have to fold the lines (for readability). Can you or Matt restore it to the way it was originally while keeping the xmit_win update and just send me a patch? Thanks a lot Michael and Matt for looking at this and improving it. Maybe i should switch all my nics to tg3 from now on ;-> cheers, jamal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP][PATCHES] Network xmit batching - tg3 support
On Mon, 2007-02-07 at 14:20 -0700, Matt Carlson wrote: > > Hi Jamal. I'll be testing your patch soon, much thanks. Please let me know if you need help while doing this. What tools are you planning to test with? I have tested this patch with pktgen on a dual opteron/tg3-buggy. There is an outstanding issue in regards to clocksource - however the batch does well regardless of the clock source. > but I wanted to point out a > bug in the patch. The patch defines TG3_SKB_CB() as follows : > > #define TG3_SKB_CB(__skb) ((struct tg3_tx_cbdata *)&((__skb)->cb[0])) > > This definition will collide with the VLAN macros if TG3_VLAN_TAG_USED > is set. vlan_tx_tag_get() is defined as : > > #define vlan_tx_tag_get(__skb) (VLAN_TX_SKB_CB(__skb)->vlan_tag) > > VLAN_TX_SKB_CB is defined as : > > #define VLAN_TX_SKB_CB(__skb) \ > ((struct vlan_skb_tx_cookie *)&((__skb)->cb[0])) > yikes. Thanks for catching that - I thought i had this pretty much covered after scanning the source. So that bug exists on the e1000 as well. [It sounds very dangerous to me the way skb->cb is being used by the vlan code (i.e requires human intervention/knowledge to catch it as an issue). I had no freaking idea the vlan code was using it. Maybe a huge comment somewhere on how these cbs are being used by drivers would help or even a registration on startup to just make sure there is no conflict at a layer (i have been meaning to do the later for years now). In any case this is is a digression]. In the meantime, changing it to use byte 8 and above should do it? i.e. #define TG3_SKB_CB(__skb) ((struct tg3_tx_cbdata *)&((__skb)->cb[8])) There are 48 bytes there on the skb-cb, so there should be plenty. > Also, I think the count, max_per_txd, and nr_frags fields of the > tg3_tx_cbdata struct are not needed. Yes, you are right. That was a result of the LinuxWay(tm) (aka cutnpaste from the e1000 which needs them), Can you send me a patch for that and TG3_SKB_CB? cheers, jamal - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP][PATCHES] Network xmit batching - tg3 support
On Mon, 2007-07-02 at 14:20 -0700, Matt Carlson wrote: > > Also, I think the count, max_per_txd, and nr_frags fields of the > tg3_tx_cbdata struct are not needed. The count field is not needed also. +struct tg3_tx_cbdata { + u32 base_flags; + int count; + unsigned int max_per_txd; + unsigned int nr_frags; + unsigned int mss; +}; +#define TG3_SKB_CB(__skb) ((struct tg3_tx_cbdata *)&((__skb)->cb[0])) Only base_flags and mss are needed and these can be determined right before sending the frame. So is it better not to store these in the skb->cb at all? @@ -3118,12 +3120,16 @@ static void tg3_tx(struct tg3 *tp) */ smp_mb(); + dcount = tg3_tx_avail(tp); if (unlikely(netif_queue_stopped(tp->dev) && -(tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH(tp { +(dcount > TG3_TX_WAKEUP_THRESH(tp { netif_tx_lock(tp->dev); + tp->dev->xmit_win = 1; if (netif_queue_stopped(tp->dev) && - (tg3_tx_avail(tp) > TG3_TX_WAKEUP_THRESH(tp))) + (dcount > TG3_TX_WAKEUP_THRESH(tp))) { netif_wake_queue(tp->dev); + tp->dev->xmit_win = dcount; + } netif_tx_unlock(tp->dev); } } This is also not right. tg3_tx() runs without netif_tx_lock(). tg3_tx_avail() can change after you get the netif_tx_lock() and we must get the updated value again. If we just rely on dcount, we can call wake_queue() when the ring is full, or there may be no wakeup when the ring is empty. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP][PATCHES] Network xmit batching - tg3 support
On Wed, 2007-06-27 at 20:05 -0400, jamal wrote: > peoplez, > > I have added support for tg3 on batching. I see equivalent performance > improvement for pktgen as i did with e1000 when using gige. > I have only tested on two machines (one being a laptop which does > 10/100Mbps). Unfortunately in both cases these are considered to be in > the class of "buggy" tg3s (which take a longer code path). > > To the tg3 folks - can you double check if am off on something? > I have split a few things that you may like as well. > I havent upgraded the tree - it is still circa 2.6.22-rc4 based; at some > point i will sync with Daves net-26 > > Anyone who has tg3 based hardware: I would appreciate any testing and > results ... > > The git tree is at: > git://git.kernel.org/pub/scm/linux/kernel/git/hadi/batch-lin26.git > but i have attached the patch in case you just wanna stare. > > cheers, > jamal Hi Jamal. I'll be testing your patch soon, but I wanted to point out a bug in the patch. The patch defines TG3_SKB_CB() as follows : #define TG3_SKB_CB(__skb) ((struct tg3_tx_cbdata *)&((__skb)->cb[0])) This definition will collide with the VLAN macros if TG3_VLAN_TAG_USED is set. vlan_tx_tag_get() is defined as : #define vlan_tx_tag_get(__skb) (VLAN_TX_SKB_CB(__skb)->vlan_tag) VLAN_TX_SKB_CB is defined as : #define VLAN_TX_SKB_CB(__skb) \ ((struct vlan_skb_tx_cookie *)&((__skb)->cb[0])) Also, I think the count, max_per_txd, and nr_frags fields of the tg3_tx_cbdata struct are not needed. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html