Hello, 

I am seeing a bug on an Alpha ES40 (SMP) running a modified 2.4.4-ac3 kernel
(with the TCP fix from lkml applied).

Platform:

Linux Version:
-----------------------

My kernel is 2.4.4-ac3 with the tcp.c file modified as suggested by the
following patch.


>I see! Dave, please, take the second Andrea's patch (appended).
>It is really the cleanest one.

>Alexey


>--- 2.4.4aa3/net/ipv4/tcp.c.~1~        Tue May  1 10:44:57 2001
>+++ 2.4.4aa3/net/ipv4/tcp.c    Tue May  1 12:00:25 2001
>@@ -1183,11 +1183,8 @@
 
> do_fault:
>       if (skb->len==0) {
>-              if (tp->send_head == skb) {
>-                      tp->send_head = skb->next;
>-                      if (tp->send_head == (struct sk_buff*)&sk->write_queue)
>-                              tp->send_head = NULL;
>-              }
>+              if (tp->send_head == skb)
>+                      tp->send_head = NULL;
>               __skb_unlink(skb, skb->list);
>               tcp_free_skb(sk, skb);
>       }
>
>-

This time, to show that it has nothing to do with the ftp server, I used a
simple rcp:

Experiment 1:
----------------------

 ES40-06                                        ES40-05

 rcp es40-05:/mnt/big/mid /tmp/toto             Machine fine

 With a mid-sized file (1.4 megabytes), everything is fine.
 

Experiment 2:
----------------------

 ES40-06                                        ES40-05

 rcp es40-05:/mnt/big/1Giga /tmp/toto           Machine frozen

 ES40-06 managed to retrieve only 11 Mbytes, so I guess I can try again
 with a 12-megabyte file; that should be enough to trigger the bug.
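
 A test file of a chosen size could be generated along these lines. This is
 only a sketch: the helper name make_test_file and the file names in the
 comments are assumptions, not something that was run for this report. The
 file is filled with zeros, which should not matter for rcp.

```shell
#!/bin/sh
# make_test_file PATH SIZE_MB
# Hypothetical helper: writes SIZE_MB mebibytes of zeros to PATH.
make_test_file() {
    # bs is given in bytes (1 MiB) to avoid relying on size suffixes.
    dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null
}

# Possible use (paths are assumptions):
#   on es40-05:  make_test_file /mnt/big/12M 12
#   on es40-06:  rcp es40-05:/mnt/big/12M /tmp/toto
```

 Repeating this with a few sizes around 11 Mbytes would show whether the
 freeze depends on the amount of data transferred.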

Here is the log from the machine that crashed (es40-05):
-----------------------------------------------------------------------

May  3 17:27:57 es40-05 PAM_unix[651]: (system-auth) session opened for user root by (uid=0)
May  3 17:27:57 es40-05 in.rshd[651]: [EMAIL PROTECTED] as root: cmd='rcp -f /mnt/big/mid'
May  3 17:29:36 es40-05 PAM_unix[662]: (system-auth) session opened for user root by (uid=0)
May  3 17:29:36 es40-05 in.rshd[662]: [EMAIL PROTECTED] as root: cmd='rcp -f /mnt/big/1Giga'
May  3 17:29:36 es40-05 kernel: <oomerang_rx(): status e001
May  3 17:29:36 es40-05 kernel: <<7>eth0: interrupt, status e401, latency 4 ticks.
May  3 17:29:36 es40-05 kernel: .
May  3 17:29:36 es40-05 kernel: <th0: interrupt, status e401, latency 3 ticks.
May  3 17:29:36 es40-05 kernel: <7
May  3 17:29:36 es40-05 kernel: <7t()
May  3 17:29:37 es40-05 kernel: <01, latency 4 ticks.
May  3 17:29:37 es40-05 kernel: <7
May  3 17:29:37 es40-05 kernel: <7
May  3 17:29:37 es40-05 kernel: th0: interrupt, status e401, latency 4 ticks.
May  3 17:29:37 es40-05 kernel: <7o send a packet, Tx index 5905.
May  3 17:29:37 es40-05 kernel: <7<7>eth0: exiting interrupt, status e000.
May  3 17:29:37 es40-05 kernel:  e201.
May  3 17:29:37 es40-05 kernel: <7<7>eth0: In interrupt loop, status e401.
May  3 17:29:37 es40-05 kernel: <7omerang_start_xmit()
May  3 17:29:37 es40-05 kernel: <7omerang_start_xmit()

The next line in the log is the syslogd restart, nearly seven minutes later:
--------------------------
May  3 17:36:17 es40-05 syslogd 1.3-3: restart.



What can I do to determine where the problem is?

I tested the machine under high CPU load, memory pressure, heavy swapping, and
combinations of the three.
The only thing that does not work under load is the network... TCP/IP?

Andrew Morton is pretty sure this has nothing to do with his driver...

Any ideas on how I could track it down?

Thanks for any help.