Re: -CURRENT freeze under high load

2001-10-27 Thread David Malone

On Fri, Oct 26, 2001 at 06:30:47PM +0100, David Malone wrote:
> > Anyway, both ways I can trigger the bug (find . -type f | xargs mutt, and
> > actually running fetchmail -a) do generate a LOT of work, so it's actually
> > possible that your diagnosis (mbuf exhaustion) is correct; trouble is, this
> > shouldn't hurt the machine to the point I can't even enter DDB.
> 
> Maybe I'll try installing qmail at home and reproducing the problem.

I've installed qmail and managed to reproduce the problem. I've also
managed to get a trace using a serial console and it was stuck in
something like unp_scan after closing a pipe. I've got a fair idea of
where the problem is now, so I should be able to track it down.

David.

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: -CURRENT freeze under high load

2001-10-27 Thread David Malone

On Fri, Oct 26, 2001 at 06:16:12PM +0200, Andrea Campi wrote:
> Does anybody have any idea how to fix this properly?

Can you test the following patch?

David.


Index: uipc_usrreq.c
===================================================================
RCS file: /cvs/FreeBSD-CVS/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.74
diff -u -r1.74 uipc_usrreq.c
--- uipc_usrreq.c   9 Oct 2001 21:40:30 -   1.74
+++ uipc_usrreq.c   27 Oct 2001 18:30:56 -
@@ -1420,7 +1420,7 @@
 
 	while (m0) {
 		for (m = m0; m; m = m->m_next) {
-			if (m->m_type == MT_CONTROL)
+			if (m->m_type != MT_CONTROL)
 				continue;
 
 			cm = mtod(m, struct cmsghdr *);
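
For anyone reading the patch without the source at hand, the effect of the one-character change can be illustrated with a small user-space model of the loop. This is only a sketch: struct model_mbuf, the MODEL_MT_* codes and count_parsed_as_cmsg() are made-up stand-ins, not the kernel's definitions. It counts how many mbufs in a chain would reach the mtod() line and so be read as a struct cmsghdr. With the 1.74 test, control mbufs are skipped and data mbufs are parsed; with the patch, only MT_CONTROL mbufs are parsed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's mbuf type and type codes. */
enum model_type { MODEL_MT_DATA, MODEL_MT_CONTROL };

struct model_mbuf {
	enum model_type m_type;
	struct model_mbuf *m_next;
};

/*
 * Walk one chain the way the loop in the patch does and count the mbufs
 * whose payload would go on to be read as a struct cmsghdr.
 * patched == 0 models the test in revision 1.74,
 * patched != 0 models the test after the patch.
 */
static int
count_parsed_as_cmsg(const struct model_mbuf *m0, int patched)
{
	const struct model_mbuf *m;
	int parsed = 0;

	for (m = m0; m != NULL; m = m->m_next) {
		int skip = patched ? (m->m_type != MODEL_MT_CONTROL)
		    : (m->m_type == MODEL_MT_CONTROL);

		if (skip)
			continue;
		/* The kernel would do cm = mtod(m, struct cmsghdr *) here. */
		parsed++;
	}
	return (parsed);
}
```

For a chain of data, control, data, the patched test parses one mbuf (the control one), while the old test parses the two data mbufs instead.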





Re: -CURRENT freeze under high load

2001-10-26 Thread Andrea Campi

Looks like the problem below is caused by this commit:

dwmalone    2001/10/04 06:11:48 PDT

  Modified files:
    lib/libc/rpc         clnt_vc.c svc_vc.c
    sbin/mount_portalfs  activate.c
    sys/kern             uipc_socket.c uipc_usrreq.c
    sys/netgraph         ng_socket.c
    sys/sys              domain.h un.h
    usr.sbin/ppp         bundle.c

ACPI, which I previously wrongly blamed, isn't involved in any way. Right now I
am running a very recent -CURRENT, modulo this commit and more recent commits to
the same files, i.e. I updated using:

# cvs -q -R update -A -P -d
# cvs -q -R update -D'Oct 04 15:11' kern/kern_proc.c kern/kern_prot.c \
    kern/uipc_socket.c kern/uipc_usrreq.c netgraph/ng_socket.c netinet/ip_fw.c \
    netinet/raw_ip.c netinet/tcp_subr.c netinet/udp_usrreq.c sys/domain.h \
    sys/socketvar.h sys/un.h

All my problems are now gone. This sort of makes sense to me, as the culprit,
qmail, is quite socket intensive.

Does anybody have any idea how to fix this properly?

Bye,
Andrea


On Wed, Oct 24, 2001 at 03:31:51PM +0200, Andrea Campi wrote:
> Hi all,
> 
> I am trying to diagnose a problem I've been having for a few weeks (I didn't
> report it earlier because I didn't have much time to hunt for it).
> 
> The symptom is a total system freeze, i.e. I can't get into DDB. I can repeat
> it only with qmail, but of course I don't think it's qmail specific in any way;
> probably something to do with locking. To reproduce it I run:
> 
> find . -type f | xargs mutt  (on my machine, all emails get delivered to me)
> 
> A kernel from Oct 1 doesn't have this issue; a kernel from Oct 5 does. I'll
> start binary searching for a commit I can blame.
> 
> Anybody seen anything like this?

-- 
It is easier to fix Unix than to live with NT.




Re: -CURRENT freeze under high load

2001-10-26 Thread David Malone

On Fri, Oct 26, 2001 at 06:16:12PM +0200, Andrea Campi wrote:
> All my problems are now gone. This sort of makes sense to me, as the culprit,
> qmail, is quite socket intensive.
> 
> Does anybody have any idea how to fix this properly?

This patch changed quite a few things, so it's not obvious exactly
what is causing the problem.

Do you know if qmail does any descriptor passing? The code makes
descriptor passing a bit more mbuf intensive, so it's possible that
you're running your machine out of mbufs. I know qmail tends to
run machines as hard as it can, so it may have run the machine into
the ground.
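
For reference, descriptor passing over a Unix-domain socket generally looks like the sketch below: the descriptor travels in a control message with level SOL_SOCKET and type SCM_RIGHTS, alongside at least one byte of ordinary data. The helper names send_fd() and recv_fd() are hypothetical, and this is neither qmail's code nor the kernel side of the commit, just a minimal user-space illustration of what a program doing descriptor passing would contain.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Control buffer sized for exactly one descriptor, suitably aligned. */
union fd_ctl {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over socket sock. Returns 0 on success, -1 on error. */
static int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cm;
	union fd_ctl ctl;
	char byte = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	iov.iov_base = &byte;		/* one byte of real data */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_RIGHTS;
	cm->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cm), &fd, sizeof(int));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive one descriptor from sock. Returns the new descriptor or -1. */
static int
recv_fd(int sock)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cm;
	union fd_ctl ctl;
	char byte;
	int fd;

	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cm = CMSG_FIRSTHDR(&msg);
	if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
	    cm->cmsg_type != SCM_RIGHTS ||
	    cm->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cm), sizeof(int));
	return (fd);
}
```

A program that passes descriptors this way has to name SOL_SOCKET and SCM_RIGHTS somewhere, so searching the sources for those identifiers is a reasonable first check.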

Also, are you running on alpha or i386?

David.




Re: -CURRENT freeze under high load

2001-10-26 Thread Andrea Campi

On Fri, Oct 26, 2001 at 05:52:37PM +0100, David Malone wrote:
> On Fri, Oct 26, 2001 at 06:16:12PM +0200, Andrea Campi wrote:
> > All my problems are now gone. This sort of makes sense to me, as the culprit,
> > qmail, is quite socket intensive.
> > 
> > Does anybody have any idea how to fix this properly?
> 
> This patch changed quite a few things, so it's not obvious exactly
> what is causing the problem.

I know. I'd like to look deeper into the issue, but from a quick glance at the
code, I don't think I could figure out a way to separate those things and try
each one. Do you happen to have separate patches for them, that I could try?

> Do you know if qmail does any descriptor passing? The code makes
> descriptor passing a bit more mbuf intensive, so it's possible that
> you're running your machine out of mbufs. I know qmail tends to
> run machines as hard as it can, so it may have run the machine into
> the ground.

I'm not 100% sure of how to check, but a

grep SOL_SOCKET *

in the sources didn't return anything. Also, from what I can understand without
really reading all of that #@#@ DJB code, qmail mainly uses pipe and 2 or 3
fifos. AFAIK your commit wasn't intended to change that, but is it possible
that a bug did sneak in?

Anyway, both ways I can trigger the bug (find . -type f | xargs mutt, and
actually running fetchmail -a) do generate a LOT of work, so it's actually
possible that your diagnosis (mbuf exhaustion) is correct; trouble is, this
shouldn't hurt the machine to the point I can't even enter DDB.

 
> Also, are you running on alpha or i386?

i386, IBM Thinkpad 570E (not that it being a laptop makes any difference, of
course ;-))

Bye,
Andrea

-- 
   Intel: where Quality is job number 0.9998782345!




Re: -CURRENT freeze under high load

2001-10-26 Thread David Malone

On Fri, Oct 26, 2001 at 07:12:24PM +0200, Andrea Campi wrote:
> I know. I'd like to look deeper into the issue, but from a quick glance at the
> code, I don't think I could figure out a way to separate those things and try
> each one. Do you happen to have separate patches for them, that I could try?

It's a bit hard to separate the bits out, which is why it happened
as one commit. I'll see if I can come up with something to segregate
the code out a bit. One possibility would be to add a printf to
the internalise and externalise functions in uipc_usrreq.c - that
way we can see if it is actually executing the code there. If it's
not, then that narrows things down a bit.

> in the sources didn't return anything. Also, from what I can understand without
> really reading all of that #@#@ DJB code, qmail mainly uses pipe and 2 or 3
> fifos. AFAIK your commit wasn't intended to change that, but is it possible
> that a bug did sneak in?

Hmm - I don't think my code should have changed fifos or pipes at all.

> Anyway, both ways I can trigger the bug (find . -type f | xargs mutt, and
> actually running fetchmail -a) do generate a LOT of work, so it's actually
> possible that your diagnosis (mbuf exhaustion) is correct; trouble is, this
> shouldn't hurt the machine to the point I can't even enter DDB.

Maybe I'll try installing qmail at home and reproducing the problem.

> > Also, are you running on alpha or i386?
> 
> i386, IBM Thinkpad 570E (not that it being a laptop makes any difference, of
> course ;-))

OK - that eliminates one possibility ;-)

David.




Re: [acpi-jp 1370] Re: -CURRENT freeze under high load

2001-10-25 Thread Andrea Campi

 
> I've just updated to -HEAD with this delta reverted and running a make
> buildkernel right now.
 

Looks like I spoke too soon; reverting just this delta wasn't enough. I'm back
to testing with all ACPI related work from Oct 04 08:32 rolled back; if it
works, I'll reapply each diff one at a time to find what else is broken.

Now I'll just shut up until I have conclusive evidence of what is broken.

Bye,
Andrea

-- 
It is easier to fix Unix than to live with NT.




-CURRENT freeze under high load

2001-10-24 Thread Andrea Campi

Hi all,

I am trying to diagnose a problem I've been having for a few weeks (I didn't
report it earlier because I didn't have much time to hunt for it).

The symptom is a total system freeze, i.e. I can't get into DDB. I can repeat
it only with qmail, but of course I don't think it's qmail specific in any way;
probably something to do with locking. To reproduce it I run:

find . -type f | xargs mutt  (on my machine, all emails get delivered to me)

A kernel from Oct 1 doesn't have this issue; a kernel from Oct 5 does. I'll
start binary searching for a commit I can blame.

Anybody seen anything like this?

Bye,
Andrea




Re: -CURRENT freeze under high load

2001-10-24 Thread NAKAJI Hiroyuki

 In [EMAIL PROTECTED] 
   [EMAIL PROTECTED] (Andrea Campi) wrote:

AC> Anybody seen anything like this?

Well, it may not be the same issue, but I have a similar problem.

In my case, just after logging in via xdm installed from
port/x11/XFree86-4, the load average climbs to about 4.0 and the
mouse stops moving smoothly. It occurs once uptime reaches one day
or more.

Fortunately, my system never freezes or hangs.

Here is dmesg. Mainboard is ASUS P3V4X and CPU is P3 933MHz with ASUS
S370-??.

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #35: Wed Oct 17 15:51:39 JST 2001
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/NAKAJI
Timecounter i8254  frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (936.75-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x686  Stepping = 6
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 671072256 (655344K bytes)
avail memory = 647983104 (632796K bytes)
Preloaded elf kernel /boot/kernel/kernel at 0xc0471000.
Preloaded elf module /boot/kernel/acpi.ko at 0xc04710b4.
Pentium Pro MTRR support enabled
Using $PIR table, 8 entries at 0xc00f0e60
apm0: APM BIOS on motherboard
apm0: found APM BIOS v1.2, connected at v1.2
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: ASUS   P3V_4X   on motherboard
acpi0: power button is handled as a fixed feature programming model.
acpi_button0: Power Button on acpi0
fdc0: NEC 72065B or clone port 0x3f7,0x3f2-0x3f5 irq 6 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5 drive on fdc0 drive 0
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: IEEE1284 device found /NIBBLE/ECP/ECP_RLE
Probing for PnP devices on ppbus0:
ppbus0: EPSON LP-8400PS3 PRINTER POSTSCRIPT
plip0: PLIP network interface on ppbus0
lpt0: Printer on ppbus0
lpt0: Interrupt-driven port
ppi0: Parallel I/O on ppbus0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1 port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
atkbdc0: Keyboard controller (i8042) port 0x64,0x60 irq 1 on acpi0
atkbd0: AT Keyboard flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: failed to get data.
psm0: PS/2 Mouse irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
pcib0: Host to PCI bridge at pcibus 0 on motherboard
pci0: PCI bus on pcib0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
pci1: display, VGA at device 0.0 (no driver attached)
isab0: PCI-ISA bridge at device 4.0 on pci0
isa0: ISA bus on isab0
atapci0: VIA 82C596 ATA66 controller port 0xb800-0xb80f at device 4.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: serial bus, USB at device 4.2 (no driver attached)
xl0: 3Com 3c905B-TX Fast Etherlink XL port 0xb000-0xb07f mem 0xe180-0xe180007f 
irq 5 at device 9.0 on pci0
xl0: Ethernet address: 00:01:02:c2:15:af
miibus0: MII bus on xl0
xlphy0: 3Com internal media interface on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
csa0: CS4280/CS4614/CS4622/CS4624/CS4630 mem 
0xe080-0xe08f,0xe100-0xe1000fff irq 10 at device 10.0 on pci0
csa: card is Unknown/invalid SSID (CS4614)
pcm0: CS461x PCM Audio on csa0
ahc0: Adaptec 2940 Ultra2 SCSI adapter (OEM) port 0xa800-0xa8ff mem 
0xe000-0xefff irq 11 at device 12.0 on pci0
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
atapci1: HighPoint HPT370 ATA100 controller port 
0x9000-0x90ff,0x9400-0x9403,0x9800-0x9807,0xa000-0xa003,0xa400-0xa407 irq 5 at device 
13.0 on pci0
ata2: at 0xa400 on atapci1
ata3: at 0x9800 on atapci1
orm0: Option ROMs at iomem 0xc-0xc7fff,0xc8000-0xcd7ff on isa0
fdc1: cannot reserve I/O port range (6 ports)
ppc1: cannot reserve I/O port range
ppc2: cannot reserve I/O port range
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x300
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
ad0: 2441MB WDC AC32500H [4960/16/63] at ata0-master WDMA2
ad1: 14655MB Maxtor 31536U2 [29777/16/63] at ata0-slave UDMA66
ad3: 9765MB FUJITSU MPC3102AT E [19841/16/63] at ata1-slave UDMA33
ar0: 43979MB ATA RAID1 array [5606/255/63] subdisks:
   ad4: 43979MB IBM-DTLA-307045 [89355/16/63] at ata2-master UDMA100
   ad6: 43979MB IBM-DTLA-307045 [89355/16/63] at ata3-master UDMA100
ar1: 43979MB ATA RAID1 array [5606/255/63] subdisks:
   ad4: 43979MB IBM-DTLA-307045 [89355/16/63] at ata2-master UDMA100
   ad6: 43979MB IBM-DTLA-307045 [89355/16/63] at ata3-master UDMA100
ar2: 43979MB ATA RAID1 array [5606/255/63] subdisks:
   ad4: 43979MB IBM-DTLA-307045 [89355/16/63] at ata2-master UDMA100
   ad6: 43979MB IBM-DTLA-307045 [89355/16/63] at ata3-master UDMA100
ar3: 43979MB ATA RAID1 array [5606/255/63] subdisks:
   ad4: 43979MB 

Re: -CURRENT freeze under high load

2001-10-24 Thread John Baldwin


On 24-Oct-01 NAKAJI Hiroyuki wrote:
> In [EMAIL PROTECTED] 
>  [EMAIL PROTECTED] (Andrea Campi) wrote:
> 
> AC> Anybody seen anything like this?
> 
> Well, it may not be the same issue, but I have a similar problem.
> 
> In my case, just after logging in via xdm installed from
> port/x11/XFree86-4, the load average climbs to about 4.0 and the
> mouse stops moving smoothly. It occurs once uptime reaches one day
> or more.
> 
> Fortunately, my system never freezes or hangs.

I would try removing apm from your kernel config or disabling ACPI and only
using one or the other and seeing if that helps.

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/
