Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
Mandeep Singh Baines wrote:
> Daniele Venzano ([EMAIL PROTECTED]) wrote:
>> The patch looks good and I think it can be pushed higher (-mm?) for
>> some wider testing. I don't have the hardware available to do any
>> tests myself, unfortunately, but the results would be similar to
>> yours anyway. I'd like to know how this works for people with less
>> memory and slower CPUs, but any kind of test run will be appreciated.
>
> Hi Daniele,
>
> I tested the driver under different CPU clock settings. At a lower
> clock I get fewer interrupts (what I was hoping for) and slightly
> lower performance. At a higher clock I get better performance and
> almost one interrupt per packet.

I have a patch that solves the high interrupt rate problem by keeping
the driver in polled mode longer. It's written against the latest NAPI
version that DaveM posted recently. I'll try to get some time to write
it up and post it for comment.

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
jamal wrote:
> On Wed, 2007-05-09 at 13:03 +0100, James Chapman wrote:
>> I have a patch that solves the high interrupt rate problem by
>> keeping the driver in polled mode longer. It's written against the
>> latest NAPI version that DaveM posted recently. I'll try to get some
>> time to write it up and post it for comment.
>
> There are interesting challenges to be tackled in resolving that.
> Just so you don't reinvent the wheel, or to help invent a better one,
> please read:
> http://kernel.org/pub/linux/kernel/people/hadi/docs/UKUUG2005.pdf
> If you have, what are the differences in your approach - and when do
> you find it useful?

Thanks Jamal. Yes, I'd already read your paper. I think my idea is
different from the ideas described in your paper, and I'm in the
process of writing it up as an RFC to post to netdev.

Briefly, though: the driver's ->poll() holds off doing
netif_rx_complete() for 1-2 jiffies and keeps itself in poll_on mode
for that time, consuming no quota. The net_rx softirq processing loop
is modified to detect the case where the only devices in its poll list
are doing no work (consuming no quota). The driver's ->poll() samples
jiffies while it is considering when to do the netif_rx_complete() -
like your Parked state, but no timers are used.

If I missed that this approach has already been tried before and
rejected, please let me know. I see better throughput and latency in
my packet forwarding and LAN setups using it.

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development
Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
On Wed, 2007-05-09 at 14:55 +0100, James Chapman wrote:
> Thanks Jamal. Yes, I'd already read your paper. I think my idea is
> different from the ideas described in your paper

I am hoping you can pick from the lessons of what has been tried and
failed, and the justification for critiquing something as failed. If
you have, I feel I accomplished something useful writing the paper.

> and I'm in the process of writing it up as an RFC to post to netdev.

Please cc me if you want my feedback - I am backlogged by about 3000
messages on netdev.

> Briefly, though: the driver's ->poll() holds off doing
> netif_rx_complete() for 1-2 jiffies and keeps itself in poll_on mode
> for that time, consuming no quota. The net_rx softirq processing loop
> is modified to detect the case where the only devices in its poll
> list are doing no work (consuming no quota). The driver's ->poll()
> samples jiffies while it is considering when to do the
> netif_rx_complete() - like your Parked state, but no timers are used.

OK, so the difference seems to be that you actually poll for those
jiffies instead of starting a timer in the parked state - is that
right?

> If I missed that this approach has already been tried before and
> rejected, please let me know. I see better throughput and latency in
> my packet forwarding and LAN setups using it.

If you read the paper: there are no issues with high throughput - NAPI
kicks in. The challenge to be overcome is at low traffic: if you have
a really fast processor, your cpu-cycles-used/bits-processed ratio is
high.

If you are polling (softirqs have high priority and, depending on the
CPU, there could be a few gazillion of them within those 1-2 jiffies),
then isn't the end result still high cpu-cycles-used? That's what the
timer tries to avoid (do nothing until some deferred point).

If you waste CPU cycles and poll, I can see that (to emphasize: for
the FastCPU-LowTraffic scenario) you will end up _not_ having the
latency issues I pointed out, but you have surely digressed from the
original goal, which is to address the CPU abuse at low traffic
(because you abuse more CPU). One thing that may be valuable is to
show that the timers and polls are not much different in terms of CPU
abuse (it's theoretically not true, but reality may not match). The
other thing is that now that hrtimers are in, can you try using a
piece of hardware that can get those kicked and see if you see any
useful results on latency?

cheers,
jamal
Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
jamal ([EMAIL PROTECTED]) wrote:
> If you read the paper: there are no issues with high throughput -
> NAPI kicks in. The challenge to be overcome is at low traffic: if
> you have a really fast processor, your cpu-cycles-used/bits-processed
> ratio is high.

I'm not sure cpu-cycles-used/bits-processed is the correct metric to
use. An alternative would be to look at cpu-cycles-used/unit-time
(i.e. CPU utilization or load) for a given bit-rate or packet-rate.
This would make an interesting graph.

At a low packet-rate, CPU utilization is low, so doing extra work per
packet is probably OK. Utilizing 2% of the CPU versus 1% is
negligible. But at a higher rate, when there is more CPU utilization,
using 40% of the CPU versus 60% is significant. I think the absolute
difference in CPU utilization is more important than the relative
difference. In other words, 2% versus 1%, even though a 2x difference
in cpu-cycles/packet, is negligible compared to 40% versus 60%.

> If you are polling (softirqs have high priority and, depending on
> the CPU, there could be a few gazillion of them within those 1-2
> jiffies), then isn't the end result still high cpu-cycles-used?
> That's what the timer tries to avoid (do nothing until some deferred
> point).

Using a timer might also behave better in a tick-less (CONFIG_NO_HZ)
configuration.

> If you waste CPU cycles and poll, I can see that (to emphasize: for
> the FastCPU-LowTraffic scenario) you will end up _not_ having the
> latency issues I pointed out, but you have surely digressed from the
> original goal, which is to address the CPU abuse at low traffic
> (because you abuse more CPU). One thing that may be valuable is to
> show that the timers and polls are not much different in terms of
> CPU abuse (it's theoretically not true, but reality may not match).
> The other thing is that now that hrtimers are in, can you try using
> a piece of hardware that can get those kicked and see if you see any
> useful results on latency?
>
> cheers,
> jamal
Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
On Mon, 2007-03-09 at 20:20 -0700, Mandeep Singh Baines wrote:
> I didn't see much saving in interrupts on my machine (too fast, I
> guess).

You could try the idea suggested by Dave earlier and just turn on
interrupts for every nth packet. That should cut down the numbers.

> I did see a significant boost to tx performance by optimizing
> start_xmit: more than double the pps in pktgen.

148Kpps on a slow piece of hardware ain't bad - good stuff. I wonder
how much CPU is being abused. If you wanna go one extra mile (separate
future patch): get rid of that tx lock and use netif_tx_lock on the
interrupt path. Look at some sane driver like tg3 for reference.

cheers,
jamal
Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
----- Original message -----
From: Mandeep Singh Baines [EMAIL PROTECTED]
Date: Mon, 3 Sep 2007 20:20:36 -0700
Subject: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

> Hi Daniele,
>
> Attached is a patch converting the sis900 driver to NAPI. Please
> take a look and let me know what you think. I'm not 100% sure that
> I'm handling the rotting packet issue correctly or that I have the
> locking right in tx_timeout. These might be areas to look at closely.
>
> I didn't see much saving in interrupts on my machine (too fast, I
> guess). But I still think it's beneficial: pushing work out of the
> interrupt handler into a bottom half is a good thing, and we no
> longer need to disable interrupts in start_xmit.
>
> I did see a significant boost to tx performance by optimizing
> start_xmit: more than double the pps in pktgen. I'm also attaching
> some test results for various iterations of development.

The patch looks good and I think it can be pushed higher (-mm?) for
some wider testing. I don't have the hardware available to do any
tests myself, unfortunately, but the results would be similar to yours
anyway. I'd like to know how this works for people with less memory
and slower CPUs, but any kind of test run will be appreciated.

Thanks, bye.

--
Daniele Venzano
[EMAIL PROTECTED]
Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
On 9/4/07, jamal [EMAIL PROTECTED] wrote:
> On Mon, 2007-03-09 at 20:20 -0700, Mandeep Singh Baines wrote:
>> I didn't see much saving in interrupts on my machine (too fast, I
>> guess).
>
> You could try the idea suggested by Dave earlier and just turn on
> interrupts for every nth packet. That should cut down the numbers.

I wanted to do that, but I couldn't figure out how to free the last
skb if it happened to be a transmit that didn't generate a tx
completion interrupt. Let's say I interrupt every 4th packet. I've
already sent packets 0 to 4. Now I send packets 5 and 6. I can't
figure out how to ensure that the skbs for these packets get freed in
a deterministic amount of time if there is no interrupt. They will get
freed when packet 8 gets transmitted, but there is no upper bound on
when that will happen.

Maybe I could create a tx skb cleanup timer that I kick off in
hard_start_xmit(). Every call to hard_start_xmit() would reset the
timer. I could try this in a future patch. Not sure if the extra code
would cost me more than I get back in savings.

>> I did see a significant boost to tx performance by optimizing
>> start_xmit: more than double the pps in pktgen.
>
> 148Kpps on a slow piece of hardware ain't bad - good stuff. I wonder
> how much CPU is being abused. If you wanna go one extra mile
> (separate future patch): get rid of that tx lock and use
> netif_tx_lock on the interrupt path. Look at some sane driver like
> tg3 for reference.

Cool. I'll check out the tg3.
Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
Cool. I'll try to see if I can clock my PC lower and run the
experiments again. I'll measure CPU utilization too this time around.
That should be useful for extrapolating.

Regards,
Mandeep

On 9/4/07, Daniele Venzano [EMAIL PROTECTED] wrote:
> The patch looks good and I think it can be pushed higher (-mm?) for
> some wider testing. I don't have the hardware available to do any
> tests myself, unfortunately, but the results would be similar to
> yours anyway. I'd like to know how this works for people with less
> memory and slower CPUs, but any kind of test run will be appreciated.
>
> Thanks, bye.
>
> --
> Daniele Venzano
> [EMAIL PROTECTED]
[PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
Hi Daniele,

Attached is a patch converting the sis900 driver to NAPI. Please take
a look and let me know what you think. I'm not 100% sure that I'm
handling the rotting packet issue correctly or that I have the locking
right in tx_timeout. These might be areas to look at closely.

I didn't see much saving in interrupts on my machine (too fast, I
guess). But I still think it's beneficial: pushing work out of the
interrupt handler into a bottom half is a good thing, and we no longer
need to disable interrupts in start_xmit.

I did see a significant boost to tx performance by optimizing
start_xmit: more than double the pps in pktgen. I'm also attaching
some test results for various iterations of development.

Regards,
Mandeep

Daniele Venzano ([EMAIL PROTECTED]) wrote:
> ----- Original message -----
> From: jamal [EMAIL PROTECTED]
> Date: Fri, 31 Aug 2007 09:50:03 -0400
> Subject: Re: Re: pktgen terminating condition
>
>> I don't know if you followed the discussion - by deferring the
>> freeing of skbs, you will be slowing down socket apps sending from
>> the local machine. It may be OK if the socket buffers were huge, but
>> that comes at the cost of system memory (which may not be a big
>> deal). Do you by any chance recall why you used the idle interrupt
>> instead of TxOK to kick the pruning of tx descriptors?
>
> That should be asked of the original author of the driver, whom I
> wasn't able to contact when I took over the maintainership several
> years ago. I think that since this chip is usually used in cheap/low
> performance (relatively speaking) devices, it was felt that
> generating an interrupt for each transmitted packet would affect the
> performance of the system too much. At the start, if I remember
> correctly, this was a chip designed for consumer use, where
> transmissions won't happen continuously for long periods of time.
> Then the sis900 started to get used everywhere, from embedded systems
> to (cheap) server motherboards, and the usage scenario changed.
>
> --
> Daniele Venzano
> [EMAIL PROTECTED]

Tested using pktgen.
cpuinfo:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 8
model name	: AMD Geode NX
stepping	: 1
cpu MHz		: 1397.641
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mp mmxext 3dnowext 3dnow ts fid vid
bogomips	: 2796.00
clflush size	: 32

meminfo:

MemTotal:       907092 kB

Results in pps. Sending 40 60-byte packets. Other than iterations 5
and 6, there is 1 interrupt per packet.

Iteration 0 (replace TxIDLE with TxOK): 6, 62052, 62335
Iteration 1 (optimize sis900_start_xmit): 148815, 148815, 148829
Iteration 2 (convert Rx to NAPI): 148669, 148635, 148677
Iteration 3 (remove tx_full dev state variable): 148677, 148528, 148532
Iteration 4 (optimize finish_xmit): 148677, 148676, 148689
Iteration 5 (move tx completion code into NAPI poll): 147751, 147658, 147809
  Interrupts: 359270, 360747, 358010
Iteration 6 (cleanup + rotting packet handling + rx optimization): 148652, 148680, 148681
  Interrupts: 399927, 399841, 399835

diff --git a/drivers/net/sis900.c b/drivers/net/sis900.c
index 7c6e480..862e15a 100644
--- a/drivers/net/sis900.c
+++ b/drivers/net/sis900.c
@@ -185,7 +185,6 @@ struct sis900_private {
 	dma_addr_t tx_ring_dma;
 	dma_addr_t rx_ring_dma;
 
-	unsigned int tx_full; /* The Tx queue is full. */
 	u8 host_bridge_rev;
 	u8 chipset_rev;
 };
@@ -202,8 +201,10 @@ MODULE_PARM_DESC(max_interrupt_work, "SiS 900/7016 maximum events handled per interrupt");
 MODULE_PARM_DESC(sis900_debug, "SiS 900/7016 bitmapped debugging message level");
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-static void sis900_poll(struct net_device *dev);
+static void sis900_poll_controller(struct net_device *dev);
 #endif
+
+static int sis900_poll(struct net_device *dev, int *budget);
 static int sis900_open(struct net_device *net_dev);
 static int sis900_mii_probe (struct net_device * net_dev);
 static void sis900_init_rxfilter (struct net_device * net_dev);
@@ -216,8 +217,8 @@ static void sis900_tx_timeout(struct net_device *net_dev);
 static void sis900_init_tx_ring(struct net_device *net_dev);
 static void sis900_init_rx_ring(struct net_device *net_dev);
 static int sis900_start_xmit(struct sk_buff *skb, struct net_device *net_dev);
-static int sis900_rx(struct net_device *net_dev);
-static void sis900_finish_xmit (struct net_device *net_dev);
+static int sis900_rx(struct net_device *net_dev, int limit);
+static int sis900_finish_xmit (struct net_device *net_dev);
 static irqreturn_t sis900_interrupt(int irq, void *dev_instance);
 static int sis900_close(struct net_device *net_dev);
 static int mii_ioctl(struct net_device *net_dev, struct ifreq *rq, int cmd);
@@