Re: -CURRENT freeze under high load
On Fri, Oct 26, 2001 at 06:30:47PM +0100, David Malone wrote:
> > Anyway, both ways I can trigger the bug (find . -type f | xargs mutt,
> > and actually running fetchmail -a) do generate a LOT of work, so it's
> > actually possible that your diagnosis (mbuf exhaustion) is correct;
> > the trouble is, this shouldn't hurt the machine to the point that I
> > can't even enter DDB.
>
> Maybe I'll try installing qmail at home and reproducing the problem.

I've installed qmail and managed to reproduce the problem. I've also
managed to get a trace using a serial console, and it was stuck in
something like unp_scan after closing a pipe. I've got a fair idea
of where the problem is now, so I should be able to track it down.

	David.

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message
Re: -CURRENT freeze under high load
On Fri, Oct 26, 2001 at 06:16:12PM +0200, Andrea Campi wrote:
> Does anybody have any idea how to fix this properly?

Can you test the following patch?

	David.

Index: uipc_usrreq.c
===================================================================
RCS file: /cvs/FreeBSD-CVS/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.74
diff -u -r1.74 uipc_usrreq.c
--- uipc_usrreq.c	9 Oct 2001 21:40:30 -0000	1.74
+++ uipc_usrreq.c	27 Oct 2001 18:30:56 -0000
@@ -1420,7 +1420,7 @@
 
 	while (m0) {
 		for (m = m0; m; m = m->m_next) {
-			if (m->m_type == MT_CONTROL)
+			if (m->m_type != MT_CONTROL)
 				continue;
 
 			cm = mtod(m, struct cmsghdr *);
Re: -CURRENT freeze under high load
Looks like the problem below is caused by this commit:

dwmalone    2001/10/04 06:11:48 PDT

  Modified files:
    lib/libc/rpc         clnt_vc.c svc_vc.c
    sbin/mount_portalfs  activate.c
    sys/kern             uipc_socket.c uipc_usrreq.c
    sys/netgraph         ng_socket.c
    sys/sys              domain.h un.h
    usr.sbin/ppp         bundle.c

ACPI, which I previously wrongly blamed, isn't involved in any way.

Right now I am running a very recent -CURRENT, modulo this commit and
more recent commits to the same files, i.e. I updated using:

# cvs -q -R update -A -P -d
# cvs -q -R update -D'Oct 04 15:11' kern/kern_proc.c kern/kern_prot.c kern/uipc_socket.c kern/uipc_usrreq.c netgraph/ng_socket.c netinet/ip_fw.c netinet/raw_ip.c netinet/tcp_subr.c netinet/udp_usrreq.c sys/domain.h sys/socketvar.h sys/un.h

All my problems are now gone. This sort of makes sense to me, as the
culprit, qmail, is quite socket intensive.

Does anybody have any idea how to fix this properly?

Bye,
	Andrea

On Wed, Oct 24, 2001 at 03:31:51PM +0200, Andrea Campi wrote:
> Hi all,
>
> I am trying to diagnose a problem I've been having for a few weeks (I
> didn't report it earlier because I didn't have much time to hunt for
> it).
>
> The symptom is a total system freeze, i.e. I can't get into DDB. I can
> reproduce it only with qmail, but of course I don't think it's qmail
> specific in any way; it probably has something to do with locking.
>
> To reproduce it I run:
>
> find . -type f | xargs mutt
>
> (on my machine, all emails get delivered to me)
>
> A kernel from Oct 1 doesn't have this issue; a kernel from Oct 5 does.
> I'll start binary searching for a commit I can blame.
>
> Has anybody seen anything like this?

--
It is easier to fix Unix than to live with NT.
Re: -CURRENT freeze under high load
On Fri, Oct 26, 2001 at 06:16:12PM +0200, Andrea Campi wrote:
> All my problems are now gone. This sort of makes sense to me, as the
> culprit, qmail, is quite socket intensive.
>
> Does anybody have any idea how to fix this properly?

This patch changed quite a few things, so it's not obvious exactly
what is causing the problem.

Do you know if qmail does any descriptor passing? The code makes
descriptor passing a bit more mbuf intensive, so it's possible that
you're running your machine out of mbufs. I know qmail tends to run
machines as hard as it can, so it may have run the machine into the
ground.

Also, are you running on alpha or i386?

	David.
Re: -CURRENT freeze under high load
On Fri, Oct 26, 2001 at 05:52:37PM +0100, David Malone wrote:
> On Fri, Oct 26, 2001 at 06:16:12PM +0200, Andrea Campi wrote:
> > All my problems are now gone. This sort of makes sense to me, as the
> > culprit, qmail, is quite socket intensive.
> >
> > Does anybody have any idea how to fix this properly?
>
> This patch changed quite a few things, so it's not obvious exactly
> what is causing the problem.

I know. I'd like to look deeper into the issue, but from a quick glance
at the code, I don't think I could figure out a way to separate those
things and try each one. Do you happen to have separate patches for
them that I could try?

> Do you know if qmail does any descriptor passing? The code makes
> descriptor passing a bit more mbuf intensive, so it's possible that
> you're running your machine out of mbufs. I know qmail tends to run
> machines as hard as it can, so it may have run the machine into the
> ground.

I'm not 100% sure of how to check, but a

grep SOL_SOCKET *

in the sources didn't return anything. Also, from what I can understand
without really reading all of that #@#@ DJB code, qmail mainly uses
pipes and 2 or 3 fifos. AFAIK your commit wasn't intended to change
that, but is it possible that a bug did sneak in?

Anyway, both ways I can trigger the bug (find . -type f | xargs mutt,
and actually running fetchmail -a) do generate a LOT of work, so it's
actually possible that your diagnosis (mbuf exhaustion) is correct;
the trouble is, this shouldn't hurt the machine to the point that I
can't even enter DDB.

> Also, are you running on alpha or i386?

i386, IBM Thinkpad 570E (not that it being a laptop makes any
difference, of course ;-))

Bye,
	Andrea

--
Intel: where Quality is job number 0.9998782345!
Re: -CURRENT freeze under high load
On Fri, Oct 26, 2001 at 07:12:24PM +0200, Andrea Campi wrote:
> I know. I'd like to look deeper into the issue, but from a quick glance
> at the code, I don't think I could figure out a way to separate those
> things and try each one. Do you happen to have separate patches for
> them that I could try?

It's a bit hard to separate the bits out, which is why it happened
as one commit. I'll see if I can come up with something to segregate
the code out a bit.

One possibility would be to add a printf to the internalise and
externalise functions in uipc_usrreq.c - that way we can see if it
is actually executing the code there. If it's not, then that narrows
things down a bit.

> in the sources didn't return anything. Also, from what I can understand
> without really reading all of that #@#@ DJB code, qmail mainly uses
> pipes and 2 or 3 fifos. AFAIK your commit wasn't intended to change
> that, but is it possible that a bug did sneak in?

Hmm - I don't think my code should have changed fifos or pipes at
all.

> Anyway, both ways I can trigger the bug (find . -type f | xargs mutt,
> and actually running fetchmail -a) do generate a LOT of work, so it's
> actually possible that your diagnosis (mbuf exhaustion) is correct;
> the trouble is, this shouldn't hurt the machine to the point that I
> can't even enter DDB.

Maybe I'll try installing qmail at home and reproducing the problem.

> > Also, are you running on alpha or i386?
>
> i386, IBM Thinkpad 570E (not that it being a laptop makes any
> difference, of course ;-))

OK - that eliminates one possibility ;-)

	David.
Re: [acpi-jp 1370] Re: -CURRENT freeze under high load
I've just updated to -HEAD with this delta reverted and am running a
make buildkernel right now.

Looks like I spoke too soon; reverting just this delta wasn't enough.
I'm back to testing with all ACPI related work from Oct 04 08:32 rolled
back; if it works, I'll try to update each diff incrementally and find
what else failed.

Now I'll just shut up until I have conclusive evidence of what is
broken.

Bye,
	Andrea

--
It is easier to fix Unix than to live with NT.
-CURRENT freeze under high load
Hi all,

I am trying to diagnose a problem I've been having for a few weeks (I
didn't report it earlier because I didn't have much time to hunt for
it).

The symptom is a total system freeze, i.e. I can't get into DDB. I can
reproduce it only with qmail, but of course I don't think it's qmail
specific in any way; it probably has something to do with locking.

To reproduce it I run:

find . -type f | xargs mutt

(on my machine, all emails get delivered to me)

A kernel from Oct 1 doesn't have this issue; a kernel from Oct 5 does.
I'll start binary searching for a commit I can blame.

Has anybody seen anything like this?

Bye,
	Andrea
Re: -CURRENT freeze under high load
In [EMAIL PROTECTED]
	[EMAIL PROTECTED] (Andrea Campi) wrote:

AC> Has anybody seen anything like this?

Well, it may not be the same issue, but I have a similar problem.

In my case, just after logging in via xdm installed from
port/x11/XFree86-4, the load average rises to about 4.0 and the mouse
does not move smoothly. It occurs once uptime reaches one day or more.

Fortunately, my system never freezes or hangs.

Here is the dmesg. The mainboard is an ASUS P3V4X and the CPU is a
P3 933MHz with ASUS S370-??.

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #35: Wed Oct 17 15:51:39 JST 2001
    [EMAIL PROTECTED]:/usr/obj/usr/src/sys/NAKAJI
Timecounter i8254 frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (936.75-MHz 686-class CPU)
  Origin = GenuineIntel Id = 0x686 Stepping = 6
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory = 671072256 (655344K bytes)
avail memory = 647983104 (632796K bytes)
Preloaded elf kernel /boot/kernel/kernel at 0xc0471000.
Preloaded elf module /boot/kernel/acpi.ko at 0xc04710b4.
Pentium Pro MTRR support enabled
Using $PIR table, 8 entries at 0xc00f0e60
apm0: APM BIOS on motherboard
apm0: found APM BIOS v1.2, connected at v1.2
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: ASUS P3V_4X on motherboard
acpi0: power button is handled as a fixed feature programming model.
acpi_button0: Power Button on acpi0
fdc0: NEC 72065B or clone port 0x3f7,0x3f2-0x3f5 irq 6 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5 drive on fdc0 drive 0
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: IEEE1284 device found /NIBBLE/ECP/ECP_RLE
Probing for PnP devices on ppbus0:
ppbus0: EPSON LP-8400PS3 PRINTER POSTSCRIPT
plip0: PLIP network interface on ppbus0
lpt0: Printer on ppbus0
lpt0: Interrupt-driven port
ppi0: Parallel I/O on ppbus0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1 port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
atkbdc0: Keyboard controller (i8042) port 0x64,0x60 irq 1 on acpi0
atkbd0: AT Keyboard flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: failed to get data.
psm0: PS/2 Mouse irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
pcib0: Host to PCI bridge at pcibus 0 on motherboard
pci0: PCI bus on pcib0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
pci1: display, VGA at device 0.0 (no driver attached)
isab0: PCI-ISA bridge at device 4.0 on pci0
isa0: ISA bus on isab0
atapci0: VIA 82C596 ATA66 controller port 0xb800-0xb80f at device 4.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: serial bus, USB at device 4.2 (no driver attached)
xl0: 3Com 3c905B-TX Fast Etherlink XL port 0xb000-0xb07f mem 0xe180-0xe180007f irq 5 at device 9.0 on pci0
xl0: Ethernet address: 00:01:02:c2:15:af
miibus0: MII bus on xl0
xlphy0: 3Com internal media interface on miibus0
xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
csa0: CS4280/CS4614/CS4622/CS4624/CS4630 mem 0xe080-0xe08f,0xe100-0xe1000fff irq 10 at device 10.0 on pci0
csa: card is Unknown/invalid SSID (CS4614)
pcm0: CS461x PCM Audio on csa0
ahc0: Adaptec 2940 Ultra2 SCSI adapter (OEM) port 0xa800-0xa8ff mem 0xe000-0xefff irq 11 at device 12.0 on pci0
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
atapci1: HighPoint HPT370 ATA100 controller port 0x9000-0x90ff,0x9400-0x9403,0x9800-0x9807,0xa000-0xa003,0xa400-0xa407 irq 5 at device 13.0 on pci0
ata2: at 0xa400 on atapci1
ata3: at 0x9800 on atapci1
orm0: Option ROMs at iomem 0xc-0xc7fff,0xc8000-0xcd7ff on isa0
fdc1: cannot reserve I/O port range (6 ports)
ppc1: cannot reserve I/O port range
ppc2: cannot reserve I/O port range
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x300
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
ad0: 2441MB WDC AC32500H [4960/16/63] at ata0-master WDMA2
ad1: 14655MB Maxtor 31536U2 [29777/16/63] at ata0-slave UDMA66
ad3: 9765MB FUJITSU MPC3102AT E [19841/16/63] at ata1-slave UDMA33
ar0: 43979MB ATA RAID1 array [5606/255/63] subdisks:
  ad4: 43979MB IBM-DTLA-307045 [89355/16/63] at ata2-master UDMA100
  ad6: 43979MB IBM-DTLA-307045 [89355/16/63] at ata3-master UDMA100
ar1: 43979MB ATA RAID1 array [5606/255/63] subdisks:
  ad4: 43979MB IBM-DTLA-307045 [89355/16/63] at ata2-master UDMA100
  ad6: 43979MB IBM-DTLA-307045 [89355/16/63] at ata3-master UDMA100
ar2: 43979MB ATA RAID1 array [5606/255/63] subdisks:
  ad4: 43979MB IBM-DTLA-307045 [89355/16/63] at ata2-master UDMA100
  ad6: 43979MB IBM-DTLA-307045 [89355/16/63] at ata3-master UDMA100
ar3: 43979MB ATA RAID1 array [5606/255/63] subdisks:
  ad4: 43979MB
Re: -CURRENT freeze under high load
On 24-Oct-01 NAKAJI Hiroyuki wrote:
>> In [EMAIL PROTECTED]
>>      [EMAIL PROTECTED] (Andrea Campi) wrote:
>
> AC> Has anybody seen anything like this?
>
> Well, it may not be the same issue, but I have a similar problem.
>
> In my case, just after logging in via xdm installed from
> port/x11/XFree86-4, the load average rises to about 4.0 and the mouse
> does not move smoothly. It occurs once uptime reaches one day or more.
>
> Fortunately, my system never freezes or hangs.

I would try either removing apm from your kernel config or disabling
ACPI, so that only one of the two is in use, and seeing if that helps.

--

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
Power Users Use the Power to Serve! - http://www.FreeBSD.org/