Re: Doubts about listen backlog and tcp_max_syn_backlog
On 01/25/2013 02:05 AM, Leandro Lucarella wrote:
> On Thu, Jan 24, 2013 at 10:12:46PM -0800, Nivedita Singhvi wrote:
>>>>> I was just kind of quoting the name given by netstat: "SYNs to LISTEN
>>>>> sockets dropped" (for kernel 3.0; I noticed newer kernels don't have
>>>>> this stat anymore, or the name was changed). I still don't know if we
>>>>> are talking about the same thing.
>>
>> [snip]
>>
>>>> I will sometimes be tripped up by netstat's not showing a statistic
>>>> with a zero value...
>>
>> Leandro, you should be able to do an "nstat -z"; it will print all
>> counters even if zero. You should see something like so:
>>
>> ipv4]> nstat -z
>> #kernel
>> IpInReceives                    2135               0.0
>> IpInHdrErrors                   0                  0.0
>> IpInAddrErrors                  202                0.0
>> ...
>>
>> You might want to take a look at those (your pkts may not even be
>> making it to tcp) and these in particular:
>>
>> TcpExtSyncookiesSent            0                  0.0
>> TcpExtSyncookiesRecv            0                  0.0
>> TcpExtSyncookiesFailed          0                  0.0
>> TcpExtListenOverflows           0                  0.0
>> TcpExtListenDrops               0                  0.0
>> TcpExtTCPBacklogDrop            0                  0.0
>> TcpExtTCPMinTTLDrop             0                  0.0
>> TcpExtTCPDeferAcceptDrop        0                  0.0
>>
>> If you don't have nstat on that version for some reason, download the
>> latest iproute pkg. Looking at the counter names is a lot more helpful
>> and precise than the netstat conversion for human consumption.
>
> Thanks, but what about this?
>
> pc2 $ nstat -z | grep -i drop
> TcpExtLockDroppedIcmps          0                  0.0
> TcpExtListenDrops               0                  0.0
> TcpExtTCPPrequeueDropped        0                  0.0
> TcpExtTCPBacklogDrop            0                  0.0
> TcpExtTCPMinTTLDrop             0                  0.0
> TcpExtTCPDeferAcceptDrop        0                  0.0

That seems bogus.

> pc2 $ netstat -s | grep -i drop
>     470 outgoing packets dropped
>     5659740 SYNs to LISTEN sockets dropped
>
> Is this normal?

That's a lot of connect requests dropped, but it depends on how long
you've been up and how much traffic you've seen.

Hmm...you were on an older Ubuntu, right? The netstat source was
patched to translate it as follows:

+{ "ListenDrops", N_("%u SYNs to LISTEN sockets dropped"), opt_number },

(see the file debian/patches/CVS-20081003-statistics.c_sync.patch in
the net-tools src)

i.e., the netstat pkg is printing the value of the TCPEXT MIB counter
TcpExtListenDrops. Theoretically, that number should be the same as
the one printed by nstat, as they are getting it from the same kernel
stats counter. I have not looked at the nstat code (I actually almost
always dump the counters from /proc/net/{netstat,snmp} via a simple
prettyprint script; will send you that offline). If the nstat and
netstat counters don't match, something is fishy. That nstat output is
broken.

>>> Yes, I already did captures and we are definitely losing packets
>>> (including SYNs), but it looks like the number of SYNs I'm losing is
>>> lower than the number of long connect() times I observe. This is not
>>> confirmed yet; I'm still investigating.
>>
>> Where did you narrow down the drop to? There are quite a few places in
>> the networking stack where we silently drop packets (such as the one
>> pointed out earlier in this thread), although they should almost all be
>> extremely low probability/NEVER type events. Do you want a patch to
>> gap the most likely scenario? (I'll post that to netdev separately.)
>
> Even though that would be awesome, unfortunately there is no way I
> could get permission to run a patched kernel (or even restart the
> servers, for that matter).
>
> And I don't know how I could narrow down the drops in any way.
> What I know is that, capturing traffic with tcpdump, I see some
> packets leaving one server but never arriving at the new one.

Hmm...do you have a switch between your two endpoints dropping pkts?
Could be.. Basically, by looking at the statistics kept by each layer,
you should be able to narrow it down a little bit at least.

It does still sound like some drops are occurring in TCP because the
accept backlog is full and you're overrunning TCP incoming processing
(or at least that this is contributing), going by that ListenDrops
count.

> Also, the hardware is not great either; I'm not sure it's not
> responsible for the loss. There are some errors reported by ethtool,
> but I don't know exactly what they mean:
>
> # ethtool -S eth0
> NIC statistics:
>      tx_packets: 336978308273
>      rx_packets: 384108075585
>      tx_errors: 0
>      rx_errors: 194
>      rx_missed: 1119
>      align_errors: 31731
>      tx_single_collisions: 0
>      tx_multi_collisions: 0
>      unicast: 384108023754
>      broadcast: 51825
>      multicast: 6
>      tx_aborted: 0
>      tx_underrun: 0
>
> Thanks!

Those error counts are tiny relative to your packet counts, so you
aren't suffering a lot of packet loss at the NIC. Sorry, I'm on the
road...
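P.S. Since I mentioned that prettyprint script and may not get to mail
it for a few days: the whole idea fits in a few lines of Python. This
is an illustrative sketch (not the actual script); it just pairs up
the name/value lines those two proc files come in:

#!/usr/bin/env python
# Rough sketch of a /proc/net counter prettyprinter. Both files
# consist of (header, value) line pairs, e.g.:
#   TcpExt: SyncookiesSent SyncookiesRecv ...
#   TcpExt: 0 0 ...
def dump(path):
    with open(path) as f:
        lines = f.readlines()
    # Pair each names line with the values line that follows it.
    for names, values in zip(lines[0::2], lines[1::2]):
        proto = names.split(':')[0]
        for name, value in zip(names.split()[1:], values.split()[1:]):
            print("%-32s %s" % (proto + name, value))

for path in ("/proc/net/snmp", "/proc/net/netstat"):
    dump(path)

The output lines ("TcpExtListenDrops 0" and so on) use the same names
nstat does, so you can diff them straight against nstat output.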
Re: Doubts about listen backlog and tcp_max_syn_backlog
On 01/24/2013 11:21 AM, Leandro Lucarella wrote:
> On Thu, Jan 24, 2013 at 10:44:32AM -0800, Rick Jones wrote:
>> On 01/24/2013 04:22 AM, Leandro Lucarella wrote:
>>> On Wed, Jan 23, 2013 at 11:28:08AM -0800, Rick Jones wrote:
>>>>> Then if syncookies are enabled, the time spent in connect()
>>>>> shouldn't be bigger than 3 seconds even if SYNs are being "dropped"
>>>>> by listen, right?
>>>>
>>>> Do you mean if "ESTABLISHED" connections are dropped because the
>>>> listen queue is full? I don't think I would put that as "SYNs being
>>>> dropped by listen" - too easy to confuse that with an actual
>>>> dropping of a SYN segment.
>>>
>>> I was just kind of quoting the name given by netstat: "SYNs to LISTEN
>>> sockets dropped" (for kernel 3.0; I noticed newer kernels don't have
>>> this stat anymore, or the name was changed). I still don't know if we
>>> are talking about the same thing.
>>
>> [snip]
>>
>> I will sometimes be tripped up by netstat's not showing a statistic
>> with a zero value...

Leandro, you should be able to do an "nstat -z"; it will print all
counters even if zero. You should see something like so:

ipv4]> nstat -z
#kernel
IpInReceives                    2135               0.0
IpInHdrErrors                   0                  0.0
IpInAddrErrors                  202                0.0
...

You might want to take a look at those (your pkts may not even be
making it to tcp) and these in particular:

TcpExtSyncookiesSent            0                  0.0
TcpExtSyncookiesRecv            0                  0.0
TcpExtSyncookiesFailed          0                  0.0
TcpExtListenOverflows           0                  0.0
TcpExtListenDrops               0                  0.0
TcpExtTCPBacklogDrop            0                  0.0
TcpExtTCPMinTTLDrop             0                  0.0
TcpExtTCPDeferAcceptDrop        0                  0.0

If you don't have nstat on that version for some reason, download the
latest iproute pkg. Looking at the counter names is a lot more helpful
and precise than the netstat conversion for human consumption.

> Yes, I already did captures and we are definitely losing packets
> (including SYNs), but it looks like the number of SYNs I'm losing is
> lower than the number of long connect() times I observe. This is not
> confirmed yet; I'm still investigating.

Where did you narrow down the drop to? There are quite a few places in
the networking stack where we silently drop packets (such as the one
pointed out earlier in this thread), although they should almost all be
extremely low probability/NEVER type events. Do you want a patch to
gap the most likely scenario? (I'll post that to netdev separately.)

thanks,
Nivedita
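P.S. If you have a scratch box to experiment on (not your production
servers), a quick way to see ListenOverflows/ListenDrops actually move
is to fill a tiny accept backlog and never call accept(). A purely
hypothetical sketch in Python:

# Illustrative only: provoke TcpExtListenOverflows/ListenDrops by
# filling a tiny accept queue. Assumes tcp_abort_on_overflow=0 (the
# default), where the kernel silently drops the excess SYNs.
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)                  # tiny backlog; the kernel may round it up
port = srv.getsockname()[1]

clients = []
for _ in range(64):            # far more connects than the backlog holds
    c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    c.settimeout(1.0)
    try:
        c.connect(("127.0.0.1", port))
    except socket.error:       # timed out (dropped SYN) or reset
        pass
    clients.append(c)

Run "nstat -z | grep -i listen" before and after; once the queue is
full, the overflow/drop counters should climb with every retransmitted
SYN.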
[PATCH] MAINTAINERS: Update John Stultz email
John's email switched from IBM to Linaro. One less place for him to
update now...

Signed-off-by: Nivedita Singhvi <n...@us.ibm.com>
---
 MAINTAINERS |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4e734ed..59e68d8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6596,7 +6596,7 @@ F:	drivers/dma/dw_dmac_regs.h
 F:	drivers/dma/dw_dmac.c
 
 TIMEKEEPING, NTP
-M:	John Stultz <johns...@us.ibm.com>
+M:	John Stultz <john.stu...@linaro.org>
 M:	Thomas Gleixner <t...@linutronix.de>
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core
 S:	Supported
--
1.7.5.4
Re: Poor UDP performance using 2.6.21-rc5-rt5
Dave Sperry wrote:
> Hi (adding netdev to cc list)
>
> I have a dual core Opteron machine that exhibits poor UDP performance
> (RT consumes more than 2X cpu) with 2.6.21-rc5-rt5 as compared to
> 2.6.21-rc5. Top shows the IRQ handler consuming a lot of CPU.

Dave, any chance you've got oprofile working on the -rt5? And I'm
assuming nothing very different in the stats or errors through both
runs?

thanks,
Nivedita

> The mother board is a Supermicro H8DME-2 with one dual core Opteron
> installed. The networking is provided by the on-board nVidia MCP55Pro
> chip.
>
> The RT test is done using netperf 2.4.3, with the server on an IBM
> LS20 blade running RHEL4U2 and the Supermicro running netperf under
> RHEL5 with 2.6.21-rc5-rt5. The non-RT test was done on the exact same
> setup except that 2.6.21-rc5 was loaded on the SuperMicro board.
> Cyclesoak was used to measure CPU utilization in all cases.
>
> Here are the RT results:
>
> ## 2.6.21-rc5-rt5 ###
> $ !netper
> netperf -l 100 -H 192.168.70.11 -t UDP_STREAM -- -m 1025
> UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 192.168.70.11 (192.168.70.11) port 0 AF_INET
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 126976    1025   100.00     8676376      0     711.46
> 135168           100.00     8676376            711.46
>
> ## cyclesoak during test
> $ ./cyclesoak
> using 2 CPUs
> System load: -0.1%
> System load: 40.5%
> System load: 51.6%
> System load: 51.5%
> System load: 50.9%
> System load: 50.7%
> System load: 50.8%
> System load: 50.7%
> System load: 50.6%
>
> top during test:
> top - 13:26:48 up 8 min, 4 users, load average: 1.74, 0.46, 0.15
> Tasks: 149 total, 4 running, 145 sleeping, 0 stopped, 0 zombie
> Cpu(s): 0.7%us, 16.8%sy, 50.6%ni, 0.0%id, 0.0%wa, 25.6%hi, 6.3%si, 0.0%st
> Mem:  2035444k total, 465888k used, 1569556k free, 28840k buffers
> Swap: 3068372k total, 0k used, 3068372k free, 318668k cached
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
>  3865 eadi  39  19  6804 1164  108 R  100  0.1  0:38.25 cyclesoak
>  2715 root -51  -5     0    0    0 S   51  0.0  0:09.52 IRQ-8406
>  3867 eadi  25   0  6440  632  480 R   34  0.0  0:06.03 netperf
>    19 root -51   0     0    0    0 S   13  0.0  0:02.33 softirq-net-tx/
>  3866 eadi  39  19  6804 1164  108 R    1  0.1  0:20.47 cyclesoak
>  3167 root  25   0 29888 1180  888 S    0  0.1  0:00.93 automount
>  3861 eadi  15   0 12712 1076  788 R    0  0.1  0:00.19 top
>     1 root  18   0 10308  668  552 S    0  0.0  0:00.67 init
>     2 root  RT   0     0    0    0 S    0  0.0  0:00.00 migration/0
>     3 root  RT   0     0    0    0 S    0  0.0  0:00.00 posix_cpu_timer
>     4 root -51   0     0    0    0 S    0  0.0  0:00.00 softirq-high/0
>     5 root -51   0     0    0    0 S    0  0.0  0:00.00 softirq-timer/0
>     6 root -51   0     0    0    0 S    0  0.0  0:00.00 softirq-net-tx/
>     7 root -51   0     0    0    0 S    0  0.0  0:00.00 softirq-net-rx/
>     8 root -51   0     0    0    0 S    0  0.0  0:00.00 softirq-block/0
>
> The baseline results: RHEL5 with 2.6.21-rc5 kernel
> ##
> $ netperf -l 100 -H 192.168.70.11 -t UDP_STREAM -- -m 1025
> UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 192.168.70.11 (192.168.70.11) port 0 AF_INET
> Socket  Message  Elapsed      Messages
> Size    Size     Time         Okay Errors   Throughput
> bytes   bytes    secs            #      #   10^6bits/sec
>
> 126976    1025   100.00    11405485      0     935.24
> 135168           100.00    11405485            935.24
>
> ###
> $ ./cyclesoak
> using 2 CPUs
> System load: 7.6%
> System load: 29.6%
> System load: 29.6%
> System load: 28.9%
> System load: 24.9%
> System load: 25.0%
> System load: 24.8%
> System load: 24.9%
>
> ###
> top:
> top - 13:52:22 up 10 min, 6 users, load average: 1.46, 0.43, 0.17
> Tasks: 118 total, 4 running, 114 sleeping, 0 stopped, 0 zombie
> Cpu(s): 0.5%us, 9.8%sy, 75.7%ni, 0.0%id, 0.0%wa, 5.8%hi, 8.1%si, 0.0%st
> Mem:  2057200k total, 459128k used, 1598072k free, 29020k buffers
> Swap: 3068372k total, 0k used, 3068372k free, 318968k cached
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
>  3882 eadi  39  19  6804 1164  108 R  100  0.1  0:52.11 cyclesoak
>  3881 eadi  39  19  6804 1164  108 R   65  0.1  0:38.47 cyclesoak
>  3883 eadi  15   0  6436  632  480 R   35  0.0  0:18.26 netperf
>  3879 eadi  15   0 12580 1052  788 R    0  0.1  0:00.15 top
>     1 root  18   0 10308  664  552 S    0  0.0  0:00.48 init
>     2 root  RT   0     0    0    0 S    0  0.0  0:00.00 migration/0
>     3 root  34  19     0    0    0 S    0  0.0
Re: Client receives TCP packets but does not ACK
> The bad network behavior was due to shared irqs somehow screwing
> things up. This explained most but not all of the problems.

ah, that's why your test pgm succeeded on my systems..

> When I last posted I had a reproducible test case which spewed a bunch
> of packets from a server to a client. The behavior is that the client
> eventually stops ACKing and so the connection stalls indefinitely.
> [...]
> packet. I added printk statements for each of these conditions in
> hopes of detecting why the final packet is not acked. I recompiled
> the kernel, and reran the test. The result was that the packet was
> being dropped in tcp_rcv_established() due to an invalid checksum.

Ouch!

In the interests of not having it be so painful to identify the
problem (to this point, i.e. TCP drops due to checksum failures) the
next time around, I'd like to ask:

- Were you seeing any bad csum error messages in /var/log/messages,
  i.e., or else was it only TCP?
- Was the stats field /proc/net/snmp Tcp:InErrs reflecting those drops?
- What additional logging/stats gathering would have made this (silent
  drops due to checksum failures by TCP) easier to detect?

My 2c: The stat TcpInErrs is updated for most TCP input failures, so
it's not obvious (unless you're really familiar with TCP) that there
are checksum failures happening. It actually includes only these
errors:

- checksum failures
- header length problems
- unexpected SYNs

Is this adequate as a diagnostic, or would adding breakdown counter(s)
for checksum (and other) failures be useful?

At the moment, there is no logging TCP does on a plain vanilla kernel;
you have to recompile the kernel with NETDEBUG in order to see logged
checksum failures, at least at the TCP level. It would be nice for
people to be able to look at a counter or stat on the fly and tell
whether they're having packets silently dropped due to checksum
failures (and other issues) without needing to recompile the kernel...

Any thoughts?

thanks,
Nivedita

---
I'd appreciate a cc since I'm not subscribed..
[EMAIL PROTECTED]
[EMAIL PROTECTED]
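P.S. Watching that field on the fly doesn't need netstat at all; a
rough sketch in Python, assuming the usual two-line name/value layout
of /proc/net/snmp:

# Sketch: pull Tcp InErrs straight out of /proc/net/snmp.
with open("/proc/net/snmp") as f:
    lines = f.readlines()

# Lines come in (names, values) pairs per protocol.
for names, values in zip(lines[0::2], lines[1::2]):
    if names.startswith("Tcp:"):
        fields = dict(zip(names.split()[1:], values.split()[1:]))
        print("TcpInErrs = %s" % fields["InErrs"])

Sample it before and after a test run; a climbing InErrs with nothing
in the logs is exactly the silent-checksum-drop signature described
above.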
Re: Abysmal RECV network performance
> > the Netgear FA311/2 (tulip). Found that the link lost
> > connectivity because of card lockups and transmit timeout
> > failures - and some of these were silent. However, I moved
> > to the 3C905C (3c59x driver) which behaved like a champ, and
>
> I'm a little confused here - do you mean the FA310TX ("tulip" driver)
> or the FA311/2 ("natsemi" driver)? I have not had any connection
> problems with either the FA310 or the FA311 cards. I haven't noticed
> any speed problems with the FA311 card, but I haven't benchmarked it,
> either. The FA310 is so horribly slow, I couldn't help but notice.
> Unfortunately, the same is true of the 3cSOHO.

Sorry, I meant to describe both (natsemi and tulip, the latter on the
older DEC chip).

> I looked at tcpdump to try and figure it out, and it appeared that the
> P-90 was taking a very long time to ACK some packets. I am not a
> TCP/IP guru by any stretch, but my guess at the time was that the
> packets that were taking forever to get ACK'ed were the ones causing a
> framing error on the P-90, but again, I'm not an expert.
>
> The only unusual stat is the framing errors. There are a lot of them
> under heavy receive load. The machine will go for weeks without a
> single framing error, but if I blast some netperf action at it (or FTP
> send to it, etc.) then I get about 1/3 of the incoming packets (to the
> P-90) with framing errors. I see no other errors at all except a TX
> overrun error (maybe 1 in 10 packets).

I tried to reproduce this problem last night on my machines at home
(kernel 2.4.4, 500MHz K7/400MHz K6), just doing FTP and netperf tests,
and didn't see any significant variation between the rcv and tx sides.
Admittedly different machines, and between a 3C905C and a FA310TX
(tulip). However, if the problem were purely kernel protocol under
load, it should have showed. Also, I am not seeing significant frame
errors - 1 in 10K, definitely nothing remotely like 30%.

If 1/3 of your packets are being dropped with frame errs, you'll see
lots of retransmissions and horrible performance, no question. But I
would expect frame errors to be due to things like the speed not being
negotiated correctly(?), or the board not sitting quite right (true -
that's the only experience I remember of the recv code path being more
error prone than tx), but that should affect all the kernel versions
you ran on that host.. I am pretty clueless about media-level issues,
but it would help to identify what's causing the framing errors.

Not much help, I know..

thanks,
Nivedita

---
Nivedita Singhvi                    (503) 578-4580
Linux Technology Center             [EMAIL PROTECTED]
IBM Beaverton, OR                   [EMAIL PROTECTED]
Re: Abysmal RECV network performance
> Can someone please help me troubleshoot this problem -
> I am getting abysmal (see numbers below) network performance
> on my system, but the poor performance seems limited to receiving
> data. Transmission is OK.

[ snip ]

> What kind of performance should I be seeing with a P-90
> on a 100Mbps connection? I was expecting something in the
> range of 40-70 Mbps - certainly not 1-2 Mbps.
>
> What can I do to track this problem down? Has anyone else
> had problems like this?

While we didn't use 2.2 kernels at all, we did similar tests on 2.4.0
through 2.4.4 kernels, on UP and SMP. I've used a similar machine
(PII 333MHz) as well as faster (866MHz) machines, and got pretty nifty
(> 90Mbps) throughput on netperf tests (tcp stream, no disk I/O) over
a 100Mb full-duplex link (not sure if there are any P-90 issues).
Throughput does drop with a small MTU, very small packet sizes, and
small socket buffer sizes, but only at the extremes; for the most part
throughput was well over 70Mbps. (This is true for single connections;
you don't mention how many connections you were scaling to, if any.)

However, we did run into serious performance problems with the Netgear
FA311/2 (tulip). We found that the link lost connectivity because of
card lockups and transmit timeout failures - and some of these were
silent. However, I moved to the 3C905C (3c59x driver), which behaved
like a champ, and we didn't see the problems any more, so have stuck
to that card. This was back in the 2.4.0 time frame, and there have
been many patches since then to various drivers, so I'm not sure
whether the problem(s) have been resolved or not (likely to have been;
they were extensively reported). Both your cards might actually be
underperforming..

Are you seeing any errors reported in /var/log/messages? Are you
monitoring your connection via tcpdump, for example? You might
sometimes see long gaps in transmission... Are there any abnormal
numbers in the /proc/net/ stats? I don't remember seeing frame errors
that high, although there were a few.

HW checksumming for the kind of test you are doing (tcp, mostly fast
path) will not buy you any real performance gain; the checksum is
actually consumed by the user-kernel copy routine. You can also run
the tests on a profiling kernel and compare results...

Nivedita

---
Nivedita Singhvi                    (503) 578-4580
Linux Technology Center             [EMAIL PROTECTED]
IBM Beaverton, OR                   [EMAIL PROTECTED]
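P.S. On the /proc/net/ stats: the per-interface receive error counters
(including the frame errors in question) all live in /proc/net/dev. A
rough sketch of pulling them out, assuming the 2.4-era column layout:

# Sketch: dump per-interface rx error counters from /proc/net/dev.
# rx columns: bytes packets errs drop fifo frame compressed multicast
with open("/proc/net/dev") as f:
    lines = f.readlines()[2:]            # skip the two header lines

for line in lines:
    name, data = line.split(":", 1)
    fields = data.split()
    print("%s: rx_packets=%s rx_errs=%s rx_frame=%s" %
          (name.strip(), fields[1], fields[2], fields[5]))

A frame count climbing during a netperf blast (but quiet otherwise)
would match what you're describing.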
netperf stream scaling; patches that help?
I'm trying to run a simple test on a pair of Linux 2.4.2 PCs that
starts up simultaneous netperf tcp stream tests, and I find that I
can't invoke more than 800 without running into memory allocation
failures. This wouldn't be strange except that I find that on the same
systems, FreeBSD seems to do twice as well (1600). I complete 500
concurrent netperf tcp stream tests sending 64 byte packets
successfully, but again, FreeBSD completes 1000 successfully. Also,
Linux appears to hog around 300MB on the server side, whereas FreeBSD
only appears to be using 3MB.

Those are the bare numbers (details available, of course), but what
I'd like to do is repeat the Linux test with 2.4.4 and include some VM
patches that might possibly alleviate any memory management issues I
may be running into. This is between a 500MHz PIII Katmai and a 333MHz
PII Deschutes, both with 512MB memory, over a 100Mb (3C905C) private
nw.

I'd appreciate any pointers to patches that might help, or suggestions
in general to improve the Linux numbers - especially any insight into
whether this is a case of apples/oranges or whether I'm missing some
trivial element here... I know of Ed Tomlinson's patch posted on this
list on 4/12; are there any others? I know Jonathan Morton posted some
OOM patches; are those included in 2.4.4?

thanks,
Nivedita

---
Nivedita Singhvi                    (503) 578-4580
Linux Technology Center             [EMAIL PROTECTED]
IBM Beaverton, OR                   [EMAIL PROTECTED]
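P.S. For concreteness, the shape of the test driver, as a sketch only
(not the harness actually used here; SERVER and N are placeholders for
your setup):

# Launch N concurrent netperf TCP_STREAM tests and count completions.
import subprocess

SERVER = "192.168.0.2"       # placeholder: your netserver's address
N = 800                      # concurrency level under test

procs = [subprocess.Popen(
             ["netperf", "-l", "60", "-H", SERVER,
              "-t", "TCP_STREAM", "--", "-m", "64"],
             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
         for _ in range(N)]

ok = sum(1 for p in procs if p.wait() == 0)
print("%d of %d netperf instances completed" % (ok, N))

The failure mode of interest is the one described above: somewhere
past a few hundred instances, the launch phase starts failing with
memory allocation errors rather than the tests merely slowing down.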