2.4.4: Kernel crash, possibly tcp related
Greetings,

This is a possibly TCP-related bug that causes a kernel crash and can be triggered by an unprivileged user. Kernel 2.4.4, no patches applied.

The problem appeared while running some network-performance tests with a program called tcpblast. tcpblast has an option to set its "block size", which is the size of the buffer passed to the write function. The problem appears when this value is set to 40481 or higher. For example:

  $ tcpblast -d0 -s 40481 another_host 9000

With this block size the following message is printed repeatedly:

  tcp/udpblast send:: No such file or directory

Trying the same command with a 2.2.18 kernel gave:

  tcp/udpblast send:: Bad address

The first part is from tcpblast, the second is printed via perror.

If the machine then has "some" other work running, a kernel crash occurs (note that this only applies to 2.4.4; 2.2.18 didn't seem to have the problem):

  KERNEL: assertion (!skb_queue_empty(&sk->write_queue)) failed at tcp_timer.c(327): tcp_retransmit_timer
  Unable to handle kernel NULL pointer dereference...
  .
  .
  .
  Kernel panic: Aiee, killing interrupt handler!
  In interrupt handler - not syncing

After that the machine is completely locked up; VT switching, ctrl->scroll_lock etc. do not work.

The most efficient way I found to produce "some load" to trigger the bug while running tcpblast was to use a simple forkbomb:

  int main() { while(1) fork(); }

If you need more information, just ask.
regards,
/Ralf Nyrén

System information:

cat /proc/version
Linux version 2.4.4 (plumbum@client2) (gcc version 2.95.2 2220 (Debian GNU/Linux)) #4 Sat Apr 28 15:47:17 CEST 2001

cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 3
model name : Pentium II (Klamath)
stepping: 4
cpu MHz : 232.349
cache size : 512 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov mmx
bogomips: 463.66

cat /proc/modules
vfat 8688 0 (unused)
fat 30272 0 [vfat]

cat /proc/ioports
-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0070-007f : rtc
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(auto)
0376-0376 : ide1
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial(auto)
0cf8-0cff : PCI conf1
4000-403f : Intel Corporation 82371AB PIIX4 ACPI
5000-501f : Intel Corporation 82371AB PIIX4 ACPI
6400-641f : Intel Corporation 82371AB PIIX4 USB
6800-687f : VIA Technologies, Inc. VT86C100A [Rhine 10/100]
  6800-687f : via-rhine
e000-efff : PCI Bus #01
  e000-e0ff : ATI Technologies Inc 3D Rage LT Pro AGP-133
f000-f00f : Intel Corporation 82371AB PIIX4 IDE
  f000-f007 : ide0
  f008-f00f : ide1

cat /proc/iomem
-0009fbff : System RAM
0009fc00-0009 : reserved
000a-000b : Video RAM area
000c-000c7fff : Video ROM
000f-000f : System ROM
0010-03ff : System RAM
  0010-001d160b : Kernel code
  001d160c-0021a957 : Kernel data
a800-afff : PCI Bus #01
d800-dfff : PCI Bus #01
  d800-d8ff : ATI Technologies Inc 3D Rage LT Pro AGP-133
  d900-d9000fff : ATI Technologies Inc 3D Rage LT Pro AGP-133
e000-e3ff : Intel Corporation 440LX/EX - 82443LX/EX Host bridge
e400-e4ff : 3Dfx Interactive, Inc. Voodoo 2
e500-e57f : VIA Technologies, Inc. VT86C100A [Rhine 10/100]
  e500-e57f : via-rhine
- : reserved

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
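
For readers without tcpblast at hand: the "block size" described in the report is the length handed to each write call. The following is a minimal sketch of that idea only, not tcpblast's actual source. The name blast_blocks and its parameters are hypothetical, and it accepts any file descriptor so it can be tried without a network.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from tcpblast): write `count` blocks of
 * `block_size` bytes to `fd`. Returns the total number of bytes written,
 * or -1 with errno set if allocation or a write() call fails.
 */
long blast_blocks(int fd, size_t block_size, int count)
{
    assert(block_size > 0 && count >= 0);

    char *buf = malloc(block_size);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', block_size);

    long total = 0;
    for (int i = 0; i < count; i++) {
        size_t done = 0;
        /* write() may accept fewer bytes than requested, so loop. */
        while (done < block_size) {
            ssize_t n = write(fd, buf + done, block_size - done);
            if (n < 0) {
                if (errno == EINTR)
                    continue;          /* interrupted, retry */
                int saved = errno;     /* free() may clobber errno */
                free(buf);
                errno = saved;
                return -1;
            }
            done += (size_t)n;
        }
        total += (long)block_size;
    }
    free(buf);
    return total;
}
```

Calling blast_blocks(sock, 40481, n) on a connected TCP socket would correspond to the -s 40481 case above. A caller that wants the same style of diagnostic as in the report would call perror when the function returns -1. Whether this sketch triggers the reported behaviour has not been tested.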
Re: 2.4.4: Kernel crash, possibly tcp related
On Sun, 29 Apr 2001, David S. Miller wrote:

[snip]
>
> Anyways, I just tried to reproduce Ralf's problem on two of my
> machines. One was an SMP sparc64 system, and the other was my
> uniprocessor Athlon.
>
> What kind of machine are you reproducing this on Ralf? I'm not
> even getting the very strange errors from tcpblast on the command
> line, it is functioning perfectly fine and sending a stream of
> data to the other machine. Are you doing something weird like
> making the remote machine the local machine in your tcpblast run?
>
> Later,
> David S. Miller
> [EMAIL PROTECTED]
>

Sorry for not including a reference to the software. I used the tcpblast program from Debian (unstable). It can be found in the netdiag package:

  http://ftp.debian.org/debian/dists/woody/main/source/net/netdiag_0.7.orig.tar.gz

Since this problem seemed a bit hard to reproduce, I tested it on another machine too. It needed some more load, but eventually crashed. This machine is a PII 400MHz, 128MB, 440BX/ZX, PIIX, with a 3c905B network card.

For more information such as .config, System.map, ver_linux etc. see:

  http://www.educ.umu.se/~plumbum/kernel/panic_2.4.4_20010430/

Regarding the strange error message:

  tcp/udpblast send:: No such file or directory

Both the precompiled binary and one compiled from source produced this message. I did notice, though, that the minimum block size triggering the message changed from 40481 to 39841. Probably some compile-time feature :)

Making the remote machine the local machine... no, I send from my machine to another one. Both have 100Mbps network connections.

Reproduction procedure:

  ./tcpblast -d0 -s 20 _another_host_ 9000
  ./forkbomb
  wait...

The so-called "forkbomb" shouldn't really be necessary; any heavy load that makes use of the scheduler, memory and swap seems to do the trick.

I hope this information is helpful.
regards,
/Ralf
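
As noted earlier in the thread, the first part of the error message comes from tcpblast and the second from perror. perror prints its argument, then a colon and a space, then the text for the current errno, so a prefix that already ends in a colon would give the doubled "::" seen in the reports. The sketch below rebuilds both reported strings with strerror. The helper name perror_text is hypothetical, the prefix is inferred from the output rather than taken from tcpblast's source, and the exact wording of the messages depends on the C library.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the text perror(prefix) would print for a
 * given errno value, i.e. "<prefix>: <strerror(err)>", without the
 * trailing newline. Returns the snprintf() result.
 */
int perror_text(char *out, size_t size, const char *prefix, int err)
{
    assert(out != NULL && prefix != NULL);
    return snprintf(out, size, "%s: %s", prefix, strerror(err));
}
```

With glibc, passing the assumed prefix "tcp/udpblast send:" together with ENOENT yields the string seen on 2.4.4, and with EFAULT the string seen on 2.2.18. In other words, the two kernels appear to fail the same call with different errno values.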