Bug#572201: forcedeth driver hangs under heavy load

2010-04-10 Thread Ben Hutchings
Stephen Mulcahy reported a regression in forcedeth at . The system information and some diagnostic information can be found there. Anyone able to help? Ben. stephen mulcahy wrote: > When running linux-image-2.6.32-trunk-amd64, the network stops > responding if la

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy
Ben Hutchings wrote: Stephen Mulcahy reported a regression in forcedeth at . The system information and some diagnostic information can be found there. Anyone able to help? Incidentally, I also tried the 2.6.33.2 kernel with CONFIG_FORCEDETH_NAPI set to "y" to

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy
stephen mulcahy wrote: It doesn't - further testing over the weekend saw 6 of 45 machines drop off the network with this problem. Nothing in dmesg or system logs. Happy to run more tests if someone can advise on what should be run. I also just tried using the 2.6.30-2-amd64 (Debian) forcedeth

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread Eric Dumazet
On Monday 12 April 2010 at 13:39 +0100, stephen mulcahy wrote: > stephen mulcahy wrote: > > It doesn't - further testing over the weekend saw 6 of 45 machines drop > > off the network with this problem. Nothing in dmesg or system logs. > > Happy to run more tests if someone can advise on what sh

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy
Eric Dumazet wrote: On Monday 12 April 2010 at 13:39 +0100, stephen mulcahy wrote: I am not sure I understand. Are you saying that using the 2.6.30-2-amd64 kernel also makes your forcedeth adapter non-functional? Hi Eric, If I run my tests with the 2.6.30-2-amd64 kernel the network doesn't

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy
stephen mulcahy wrote: Are both directions non-functional (RX and TX), or only one side? What's the best way of testing this? (tcpdump listening on both hosts and then running pings between the systems?) stephen mulcahy wrote: >> Are both directions non-functional (RX and TX), or only one side? > > What
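The tcpdump-plus-ping check suggested above could be sketched roughly as follows. This is only an illustration, not what was actually run in the thread: the variable names `IFACE`, `PEER` and `DRY_RUN`, the helper functions, and the default peer `node05` are assumptions made for the example. By default the script only prints the commands; set `DRY_RUN=0` (as root) to execute them, with the capture and the pings started in separate terminals.

```shell
#!/bin/sh
# Sketch only: IFACE, PEER and DRY_RUN are illustrative names.
IFACE="${IFACE:-eth0}"
PEER="${PEER:-node05}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, 0 = execute it (needs root)

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on each host: capture ICMP exchanged with the peer, without DNS lookups.
watch_icmp() { run tcpdump -n -i "$IFACE" icmp and host "$PEER"; }

# Run from the other terminal: send a handful of echo requests to the peer.
send_pings() { run ping -c 5 "$PEER"; }

watch_icmp
send_pings
```

If echo requests show up in the capture on the hung host but no replies reach the peer, the transmit side is the suspect; if the requests never appear at all, the receive side is.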

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread Eric Dumazet
On Monday 12 April 2010 at 14:19 +0100, stephen mulcahy wrote: > Does that help? Well, yes, because it seems to be a TCP problem. r...@node20:~# tcpdump host node20 and node05 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth0, link-type EN10MB (Ethernet), ca

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy
Eric Dumazet wrote: On Monday 12 April 2010 at 14:19 +0100, stephen mulcahy wrote: Do you have any netfilter rules? Hi Eric, I don't have any netfilter rules: r...@node34:~# for table in filter nat mangle raw; do iptables -t $table -L; done Chain INPUT (policy ACCEPT) target prot

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread Eric Dumazet
On Monday 12 April 2010 at 17:11 +0100, stephen mulcahy wrote: > Eric Dumazet wrote: > > On Monday 12 April 2010 at 14:19 +0100, stephen mulcahy wrote: > > > > Do you have any netfilter rules? > > > > Hi Eric, > > I don't have any netfilter rules: > > r...@node34:~# for table in filter n

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy
Eric Dumazet wrote: OK it seems forcedeth has a problem with checksums? Try to change "ethtool -k eth0" settings? ethtool -K eth0 tso off tx off Yes, that makes an unresponsive system responsive again immediately, nice! Should the driver default to disabling this until the problem is correcte

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread Eric Dumazet
On Tuesday 13 April 2010 at 11:03 +0100, stephen mulcahy wrote: > Eric Dumazet wrote: > > OK it seems forcedeth has a problem with checksums? > > > > Try to change "ethtool -k eth0" settings? > > > > ethtool -K eth0 tso off tx off > > Yes, that makes an unresponsive system responsive again immed

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy
Eric Dumazet wrote: On Tuesday 13 April 2010 at 11:03 +0100, stephen mulcahy wrote: Eric Dumazet wrote: OK it seems forcedeth has a problem with checksums? Try to change "ethtool -k eth0" settings? ethtool -K eth0 tso off tx off Yes, that makes an unresponsive system responsive again immedia

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread Ben Hutchings
On Tue, 2010-04-13 at 12:00 +0100, stephen mulcahy wrote: > Eric Dumazet wrote: > > On Tuesday 13 April 2010 at 11:03 +0100, stephen mulcahy wrote: > >> Eric Dumazet wrote: > >>> OK it seems forcedeth has a problem with checksums? > >>> > >>> Try to change "ethtool -k eth0" settings? > >>> > >>> et

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy
Ok, I've tried both of the following with my reproducer:

1. ethtool -K eth0 tso off
   RESULT: reproducer causes multiple hosts to become unresponsive on first run.
2. ethtool -K eth0 tx off
   RESULT: reproducer runs three times without any hosts becoming unresponsive.

-stephen
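For anyone repeating the experiment above, the two offload toggles can be wrapped as below. This is a hedged sketch rather than the script used in the thread: `IFACE`, `DRY_RUN` and the helper function names are invented for the example, and only the `ethtool -k` (query) and `ethtool -K` (set) invocations come from the messages themselves. By default the commands are printed, not executed; set `DRY_RUN=0` and run as root to apply them.

```shell
#!/bin/sh
# Sketch only: IFACE and DRY_RUN are illustrative names, not from the thread.
IFACE="${IFACE:-eth0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, 0 = execute it (needs root)

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Lowercase -k queries the current offload settings.
show_offloads() { run ethtool -k "$IFACE"; }

# Uppercase -K changes one offload feature, e.g. "tso off" or "tx off".
set_offload() { run ethtool -K "$IFACE" "$1" "$2"; }

show_offloads
set_offload tx off   # the setting reported above to avoid the hang
```

Changing one feature per run, and re-running the reproducer in between, keeps the TSO and TX checksumming results from being conflated.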

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread Eric Dumazet
On Tuesday 13 April 2010 at 15:27 +0100, stephen mulcahy wrote: > Ok, I've tried both of the following with my reproducer > > 1. ethtool -K eth0 tso off > > RESULT: reproducer causes multiple hosts to become unresponsive on > first run. > > 2. ethtool -K eth0 tx off > > RESULT: reproducer run

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy
Eric Dumazet wrote: On Tuesday 13 April 2010 at 15:27 +0100, stephen mulcahy wrote: Ok, I've tried both of the following with my reproducer 1. ethtool -K eth0 tso off RESULT: reproducer causes multiple hosts to become unresponsive on first run. 2. ethtool -K eth0 tx off RESULT: reproducer

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy
stephen mulcahy wrote: Now some brave souls to check the 6410 lines of this driver? ;) Question of the day: Why is TSO broken in forcedeth? Is it generically broken or is it broken for specific NICs? Actually, it is only when tx-checksumming is turned off that the problem doesn't occur

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread Eric Dumazet
On Tuesday 13 April 2010 at 15:49 +0100, stephen mulcahy wrote: > Eric Dumazet wrote: > > On Tuesday 13 April 2010 at 15:27 +0100, stephen mulcahy wrote: > >> Ok, I've tried both of the following with my reproducer > >> > >> 1. ethtool -K eth0 tso off > >> > >> RESULT: reproducer causes multiple ho

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy
Eric Dumazet wrote: I am scratching my head, but I thought you told me that ethtool -K eth0 tso off ethtool -K eth0 tx on was working ? No, sorry for the confusion. ethtool -K eth0 tx off fixes the problem. Setting only ethtool -K eth0 tso off ethtool -K eth0 tx on still results in f

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread Eric Dumazet
On Tuesday 13 April 2010 at 16:08 +0100, stephen mulcahy wrote: > Eric Dumazet wrote: > > > I am scratching my head, but I thought you told me that > > > > ethtool -K eth0 tso off > > ethtool -K eth0 tx on > > > > was working? > > No, sorry for the confusion. > > ethtool -K eth0 tx off > >

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy
Eric Dumazet wrote: OK, thanks for the clarification. Last question, did you try a vanilla kernel, e.g. 2.6.33.2? I built a Debian package from the vanilla 2.6.33.2 and installed that on all nodes and tried my reproducer with the same results - nodes becoming unresponsive. I didn

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread Eric Dumazet
On Tuesday 13 April 2010 at 16:25 +0100, stephen mulcahy wrote: > Eric Dumazet wrote: > > OK, thanks for the clarification. > > > > Last question, did you try a vanilla kernel, e.g. > > 2.6.33.2? > > I built a Debian package from the vanilla 2.6.33.2 and installed that on > all nodes a

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread Eric Dumazet
On Tuesday 13 April 2010 at 14:43 -0700, David Miller wrote: > Do you really come to the conclusion that TSO is broken with the above > test results? > > I would conclude that there is a TX checksumming issue, since merely > turning TSO off does not fix the problem whereas turning TX > checksummin

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread David Miller
From: Eric Dumazet Date: Tue, 13 Apr 2010 16:42:21 +0200 > On Tuesday 13 April 2010 at 15:27 +0100, stephen mulcahy wrote: >> Ok, I've tried both of the following with my reproducer >> >> 1. ethtool -K eth0 tso off >> >> RESULT: reproducer causes multiple hosts to become unresponsive on >> fi

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread Ayaz Abdulla
Attached fix has been submitted to netdev. Ayaz Eric Dumazet wrote: On Tuesday 13 April 2010 at 14:43 -0700, David Miller wrote: Do you really come to the conclusion that TSO is broken with the above test results? I would conclude that there is a TX checksumming issue, since merely turning

Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread David Miller
From: Ayaz Abdulla Date: Wed, 14 Apr 2010 01:33:15 -0400 > Attached fix has been submitted to netdev. Thanks! I'll apply this soon.

Bug#572201: forcedeth driver hangs under heavy load

2010-04-14 Thread stephen mulcahy
Ayaz Abdulla wrote: Attached fix has been submitted to netdev. I've run my reproducer with this patch applied to the Debian 2.6.32 kernel and so far the problem with nodes becoming unresponsive hasn't occurred. NIC settings were left at the default, so this looks positive. r...@node23:~# ethtool