Re: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
> Thanks for the feedback, Rick. We haven't used the netperf trunk. The
> person who actually got these numbers will try the netperf trunk a
> little later, and we will post the results.

Just in case someone has top-of-trunk worries: the basic single-stream, bidirectional stuff is in the 2.4.3 released bits. It is the user-friendly netperf-does-the-math bit that is top of trunk :)

rick
RE: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
On 7/23/07, Rick Jones [EMAIL PROTECTED] wrote:
> The bidirectional looks like a two concurrent stream (TCP_STREAM +
> TCP_MAERTS) test, right? If you want a single-stream bidirectional
> test, then with the top-of-trunk netperf you can use: [...]

Thanks for the feedback, Rick. We haven't used the netperf trunk. The person who actually got these numbers will try the netperf trunk a little later, and we will post the results.
Re: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
> Bidirectional test.
>
>  87380  65536  65536  60.01  7809.57  28.66  30.02  2.405  2.519  TX
>  87380  65536  65536  60.01  7592.90  28.66  30.02  2.474  2.591  RX
>  --
>  87380  65536  65536  60.01  7629.73  28.32  29.64  2.433  2.546  RX
>  87380  65536  65536  60.01  7926.99  28.32  29.64  2.342  2.450  TX
>
> Single netperf stream between 2 quad-core Xeon based boxes. Tested on
> 2.6.20 and 2.6.22 kernels. Driver uses NAPI and LRO.

The bidirectional looks like a two concurrent stream (TCP_STREAM + TCP_MAERTS) test, right? If you want a single-stream bidirectional test, then with the top-of-trunk netperf you can use:

  ./configure --enable-burst
  make install # yadda yadda
  netperf -t TCP_RR -H remote -f m -v 2 -l 60 -c -C -- -r 64K -b 12

which will cause netperf to have 13 64K transactions in flight at one time on the connection, which for a 64K request size has been sufficient, thus far anyway, to saturate things.

As there is no select/poll/whatever call in netperf TCP_RR, it might be necessary to include the test-specific -s and -S options to make sure the socket buffer (SO_SNDBUF) is large enough that none of those send() calls ever block, lest both ends end up blocked in a send() call.

The -f m will switch the output from transactions/s to megabits per second, and is the part requiring the top-of-trunk netperf. The -v 2 option causes extra output giving the bitrate in each direction and the transactions/s rate, as well as the computed average latency. That is also in top of trunk; otherwise, for 2.4.3, you can skip that, do the math to convert to megabits/s yourself, and not get all the other derived values.

rick jones
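For reference, a complete invocation including the test-specific socket-buffer options Rick mentions might look like the following sketch; the remote host name and the 1 MB buffer sizes are illustrative assumptions, not values taken from the thread:

  # single-stream bidirectional TCP_RR: 13 outstanding 64K transactions,
  # with SO_SNDBUF/SO_RCVBUF sized so that send() never blocks on either end
  netperf -t TCP_RR -H remote -f m -v 2 -l 60 -c -C -- \
      -r 64K -b 12 -s 1048576 -S 1048576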
RE: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
On 7/10/07, Jeff Garzik [EMAIL PROTECTED] wrote:
> Veeraiyan, Ayyappan wrote:
>> I will post the performance numbers later today..

Sorry for not responding earlier. We faced a couple of issues, like setup problems and false alarms. Anyway, here are the numbers.

 Recv   Send    Send                          Utilization     Service Demand
 Socket Socket  Message  Elapsed              Send    Recv    Send    Recv
 Size   Size    Size     Time    Throughput   local   remote  local   remote
 bytes  bytes   bytes    sec     10^6bits/s   % S     % S     us/KB   us/KB

 87380  65536    128     60      2261.34      13.82    4.25   4.006   1.233
 87380  65536    256     60      3332.51      14.19    5.67   2.79    1.115
 87380  65536    512     60.01   4262.24      14.38    6.9    2.21    1.062
 87380  65536   1024     60      4659.18      14.4     7.39   2.026   1.039
 87380  65536   2048     60.01   6177.87      14.36   14.99   1.524   1.59
 87380  65536   4096     60.01   9410.29      11.58   14.6    0.807   1.017
 87380  65536   8192     60.01   9324.62      11.13   14.33   0.782   1.007
 87380  65536  16384     60.01   9371.35      11.07   14.28   0.774   0.999
 87380  65536  32768     60.02   9385.81      10.83   14.27   0.756   0.997
 87380  65536  65536     60.01   9363.5       10.73   14.26   0.751   0.998

TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to n0417 (10.0.4.17) port 0 AF_INET : cpu bind

 87380  65536  65536     60.02   9399.61       2.22   14.53   0.155   1.013
 87380  65536  65536     60.02   9348.01       2.46   14.39   0.173   1.009
 87380  65536  65536     60.02   9403.36       2.26   14.37   0.158   1.001
 87380  65536  65536     60.01   9332.22       2.23   14.51   0.157   1.019

Bidirectional test.

 87380  65536  65536     60.01   7809.57      28.66   30.02   2.405   2.519   TX
 87380  65536  65536     60.01   7592.90      28.66   30.02   2.474   2.591   RX
 --
 87380  65536  65536     60.01   7629.73      28.32   29.64   2.433   2.546   RX
 87380  65536  65536     60.01   7926.99      28.32   29.64   2.342   2.450   TX

Single netperf stream between 2 quad-core Xeon based boxes. Tested on 2.6.20 and 2.6.22 kernels. Driver uses NAPI and LRO.

To summarize, we are seeing line rate with NAPI (single Rx queue), and Rx CPU utilization is around 14%. In back-to-back scenarios, NAPI (combined with LRO) performs clearly better. In multiple-client scenarios, non-NAPI with multiple Rx queues performs better. I am continuing to do more benchmarking and will submit a patch to pick one this week. But going forward, if NAPI supports multiple Rx queues natively, I believe that would perform much better in most cases.

Also, did you get a chance to review the driver take #2? I would like to implement the review comments (if any) as early as possible and submit another version.

Thanks... Ayyappan
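For reference, tables in this format are standard netperf output. A sketch of the kind of invocations that would produce them follows; the loop and the exact options are assumptions, since the original command lines are not shown in the thread (only the target 10.0.4.17 appears above):

  # TCP_STREAM with varying send message size, measuring CPU on both ends
  for m in 128 256 512 1024 2048 4096 8192 16384 32768 65536; do
      netperf -t TCP_STREAM -H 10.0.4.17 -l 60 -c -C -- -m $m
  done

  # TCP_SENDFILE with a 64K send size; depending on the netperf version,
  # -F <file> may be needed to name the file to transmit
  netperf -t TCP_SENDFILE -H 10.0.4.17 -l 60 -c -C -- -m 65536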
[PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
This patch adds support for the Intel(R) 82598 based PCI Express 10GbE adapters. Please find the full driver as a patch to the latest linus-2.6 tree here:

  git-pull git://lost.foo-projects.org/~aveerani/git/linux-2.6 ixgbe

Changes from the last submission:

1. Suspend/resume support is added.
2. PCI error recovery support.
3. Bit-field usage is removed and replaced with #defines.
4. typedef boolean_t is replaced with bool.
5. Ethtool functionality for EEPROM and register dumps and adapter identification.
6. RxDescriptors, TxDescriptors and XsumRx parameters are removed from the module parameter list; they can be handled via ethtool (see the sketch below).
7. NAPI mode uses a single Rx queue, so the fake netdev usage is removed.
8. Non-NAPI mode is added.
9. LLTX is not used, and tx_lock usage in xmit_frame is cleaned up.
10. Performance and bug fixes.

thanks,
Ayyappan
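As an illustration of item 6 -- the device name and ring sizes here are hypothetical, not from the patch -- the removed module parameters map onto standard ethtool operations:

  # query and set the Rx/Tx descriptor ring sizes
  # (replaces the RxDescriptors/TxDescriptors module parameters)
  ethtool -g eth2
  ethtool -G eth2 rx 1024 tx 1024

  # toggle receive checksum offload (replaces the XsumRx parameter)
  ethtool -K eth2 rx on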
Re: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
[EMAIL PROTECTED] wrote:
> 7. NAPI mode uses a single Rx queue, so the fake netdev usage is removed.
> 8. Non-NAPI mode is added.

Honestly, I'm not sure about drivers that have both NAPI and non-NAPI paths. Several existing drivers do this, and in almost every case I tend to feel the driver would benefit from picking one approach rather than doing both. Doing both tends to signal that the author hasn't bothered to measure the differences between the various approaches and pick a clear winner.

I strongly prefer NAPI combined with hardware interrupt mitigation -- it helps multiple net interfaces balance load across the system at times of high load -- but I'm open to other solutions as well.

So... what are your preferences? What is the setup that gets closest to wire speed under Linux? :)

Jeff
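For what it's worth, hardware interrupt mitigation of the kind Jeff describes is typically tuned through ethtool's interrupt-coalescing parameters, assuming the driver wires up those hooks; the device name and value below are illustrative only:

  # show current coalescing settings, then limit Rx interrupts to
  # roughly 8000/s (one interrupt per 125 microseconds)
  ethtool -c eth2
  ethtool -C eth2 rx-usecs 125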
RE: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
On 7/10/07, Jeff Garzik [EMAIL PROTECTED] wrote:
> Doing both tends to signal that the author hasn't bothered to measure
> the differences between various approaches, and pick a clear winner.

I did pick NAPI in our previous submission, based on various tests. But to get the 10Gig line rate we need to use multiple Rx queues, which currently requires fake netdevs. Since fake netdevs weren't acceptable, I added non-NAPI support, which gets the 10Gig line rate with multiple Rx queues. I am OK with removing NAPI support until the work of separating the netdevs and NAPI is done.

> I strongly prefer NAPI combined with hardware interrupt mitigation --
> it helps multiple net interfaces balance load across the system, at
> times of high load -- but I'm open to other solutions as well.

In the majority of the tests we did here, we saw that NAPI is better. But for some specific test cases (especially if we add the SW RSC, i.e. LRO), we saw better throughput and CPU utilization with non-NAPI.

> So... what are your preferences? What is the setup that gets closest
> to wire speed under Linux? :)

With SW LRO, non-NAPI is better, but without LRO, NAPI is better -- and NAPI needs multiple Rx queues. So given the limitations, non-NAPI is my preference now. I will post the performance numbers later today.

Thanks.. Ayyappan
Re: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
Veeraiyan, Ayyappan wrote:
> I did pick NAPI in our previous submission, based on various tests.
> But to get the 10Gig line rate we need to use multiple Rx queues,
> which currently requires fake netdevs. Since fake netdevs weren't
> acceptable, I added non-NAPI support, which gets the 10Gig line rate
> with multiple Rx queues. I am OK with removing NAPI support until the
> work of separating the netdevs and NAPI is done.

That sounds fine to me. Separating netdev and NAPI is the right way to go. Maybe note that in a TODO list at the top of the driver.

> With SW LRO, non-NAPI is better, but without LRO, NAPI is better --
> and NAPI needs multiple Rx queues. So given the limitations, non-NAPI
> is my preference now.

On the subject of SW LRO: we are really looking for a generic implementation, hopefully authored by one or more interested parties. This is something we definitely do not want to reinvent over and over in new drivers -- and in the one or two drivers where it exists today, it should be removed once the generic code is in place. If Intel could assist with that effort, that would be very helpful.

> I will post the performance numbers later today.

Thanks,

Jeff
Re: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2
[EMAIL PROTECTED] wrote:
> This patch adds support for the Intel(R) 82598 based PCI Express 10GbE
> adapters. Please find the full driver as a patch to the latest
> linus-2.6 tree here:
>
>   git-pull git://lost.foo-projects.org/~aveerani/git/linux-2.6 ixgbe

Andrew, I rebased this with the new driver code, so you might need to drop the old version.

Cheers,

Auke