Re: slow system freeze - data

2004-12-27 Thread Kris Kennaway
On Tue, Dec 28, 2004 at 03:52:50AM +0100, Benjamin Lutz wrote:
> > Does xmms try to run with rtprio or idprio?  Those are still broken,
> > and can lead to deadlocks, afaik.
> 
> No, all PIDs are listed as "normal" by idprio and rtprio, except the 
> [pagezero] process, which is listed as "idle priority 31" by both 
> programs, and I suppose that's intentional.

You might have already mentioned this, but there aren't any other
messages being logged on the system console or in syslog, are there?
If your drive is failing this will lead to the above symptoms as well.
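
One way to check for such messages is to scan the log for ATA error lines. This is only a sketch: `count_ata_errors` is a hypothetical helper, the pattern is modelled on the "ad4: TIMEOUT - WRITE_DMA" lines reported elsewhere on this list, and /var/log/messages is assumed to be where syslog stores kernel messages.

```sh
#!/bin/sh
# count_ata_errors: hypothetical helper, not part of FreeBSD.
# Reads log text on stdin and prints how many lines look like ATA
# timeout or failure reports (e.g. "ad4: TIMEOUT - WRITE_DMA ...").
count_ata_errors() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -E -c 'ad[0-9]+: (TIMEOUT|FAILURE) - (READ|WRITE)_DMA' || true
}

# Typical use (the log path is an assumption):
#   count_ata_errors < /var/log/messages
#   dmesg | count_ata_errors
```

A non-zero count would point towards the disk or controller rather than a scheduler problem.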

Kris




pgp2mc4aP86kp.pgp
Description: PGP signature


Re: slow system freeze - data

2004-12-27 Thread Benjamin Lutz
> Does xmms try to run with rtprio or idprio?  Those are still broken,
> and can lead to deadlocks, afaik.

No, all PIDs are listed as "normal" by idprio and rtprio, except the 
[pagezero] process, which is listed as "idle priority 31" by both 
programs, and I suppose that's intentional.

Greetings
Benjamin




Re: TIMEOUT - WRITE_DMA - A possible FIX! turn off ACPI

2004-12-27 Thread whitevamp

- Original Message - 
From: "Joe Koberg" <[EMAIL PROTECTED]>
To: "Zsolt Kúti" <[EMAIL PROTECTED]>
Cc: ; 
Sent: Monday, December 27, 2004 6:29 PM
Subject: Re: TIMEOUT - WRITE_DMA - A possible FIX! turn off ACPI


> Zsolt Kúti wrote:
>
> >My system produces these messages that I already know well from this
> >list (as well ;):
> >ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=213249674
> >
> >
> Like many people I was confronted with "TIMEOUT - READ_DMA"
> and "TIMEOUT - WRITE_DMA" errors on my drives. I was frustrated.
> But I found a workaround: Turning off ACPI.
>
> I just received a Highpoint RocketRaid 1640 controller,
> 2 Maxtor 300GB drives, and a Supermicro 5-drive SATA cage.
> I am testing this configuration for a storage server.
>
> I am using an old motherboard, DTK brand, Slot 1. 300A Celeron.
>
> Under a fresh install of 5.3-RELEASE I am unable to read or write
> both drives heavily at the same time.  One drive alone seems to work
> OK. When I run dd blasting both drives with sequential I/O, I get
> TIMEOUT - WRITE(READ)_DMA. Repeatably, within 15 seconds.
>
> However, I got a good test before I installed 5.3-R, when the box was
> running 5.3-BETA. The only difference was that I booted without ACPI.
>
> So I rebooted the freshly installed 5.3-R without ACPI, and it works!
> I can read at 50MB/s per drive concurrently (hitting PCI bus speed
> limit?), and write at 30MB/s per drive concurrently. No errors so
> far, and it's been dd'ing for half an hour.
>
> I hope this report helps someone!
>
>
>
> Joe Koberg
> joe at osoft dot us

I too have been seeing this error, since 4.9, with my Western Digital 80 GB
hard drive. The error message has changed a little between the two versions,
but I do have hint.acpi.0.disabled="1" in device.hints and I still see the
error messages. Anyway, I just wanted to post and let everyone know
that turning off ACPI might not work for you.
Oh, and off topic here: I had ACPI turned off because my network cards
wouldn't work with it on.
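
Before ruling ACPI out, it may be worth confirming that the hint is really present and not commented out. A minimal sketch follows; `acpi_hint_disabled` is a hypothetical helper and /boot/device.hints is assumed as the default location.

```sh
#!/bin/sh
# acpi_hint_disabled: hypothetical helper, not part of FreeBSD.
# Succeeds if the hints file ($1, default /boot/device.hints) contains
# an uncommented hint.acpi.0.disabled="1" line.
acpi_hint_disabled() {
    grep -Eq '^[[:space:]]*hint\.acpi\.0\.disabled="?1"?[[:space:]]*$' \
        "${1:-/boot/device.hints}"
}

# Usage:
#   if acpi_hint_disabled; then echo "ACPI hint is set"; fi
```

This only checks the file; whether the kernel actually attached acpi0 still has to be read from dmesg.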









Re: slow system freeze - data

2004-12-27 Thread Kris Kennaway
On Tue, Dec 28, 2004 at 12:38:44PM +1100, Peter Jeremy wrote:
> On Sun, 2004-Dec-26 08:14:49 +0100, Benjamin Lutz wrote:
> >The freeze just happened again. I managed to get into the debugger and get 
> >some info.
> 
> The info you dumped shows that there's a filesystem deadlock on
> ad4s1f.  This is consistent with the behaviour you reported - the
> system is running "normally" but as soon as a process tries to access
> that filesystem, it freezes.  Eventually, all processes are
> frozen.
> 
> Unfortunately, it's not clear (to me) where to go next.  Printing the
> locked vnodes might help but that's not easy to do without gdb.
> 
> >The first app that froze as far as I could tell was xmms.
> 
> Actually, the locks suggest that the problem started with pid 678 - kdeinit.
> This is unlikely to be 

Does xmms try to run with rtprio or idprio?  Those are still broken,
and can lead to deadlocks, afaik.

Kris




Re: TIMEOUT - WRITE_DMA - A possible FIX! turn off ACPI

2004-12-27 Thread Joe Koberg
Zsolt Kúti wrote:
> My system produces these messages that I already know well from this
> list (as well ;):
> ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=213249674

Like many people I was confronted with "TIMEOUT - READ_DMA"
and "TIMEOUT - WRITE_DMA" errors on my drives. I was frustrated.
But I found a workaround: turning off ACPI.

I just received a Highpoint RocketRaid 1640 controller,
2 Maxtor 300GB drives, and a Supermicro 5-drive SATA cage.
I am testing this configuration for a storage server.

I am using an old motherboard, DTK brand, Slot 1. 300A Celeron.

Under a fresh install of 5.3-RELEASE I am unable to read or write
both drives heavily at the same time.  One drive alone seems to work
OK. When I run dd blasting both drives with sequential I/O, I get
TIMEOUT - WRITE(READ)_DMA. Repeatably, within 15 seconds.

However, I got a good test before I installed 5.3-R, when the box was
running 5.3-BETA. The only difference was that I booted without ACPI.

So I rebooted the freshly installed 5.3-R without ACPI, and it works!
I can read at 50MB/s per drive concurrently (hitting PCI bus speed
limit?), and write at 30MB/s per drive concurrently. No errors so
far, and it's been dd'ing for half an hour.

I hope this report helps someone!

Joe Koberg
joe at osoft dot us
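
The concurrent read test described above could be sketched roughly as follows. The exact dd invocation was not given in the report, so this is an assumption: `stress_read` is a hypothetical helper, and /dev/ad4 and /dev/ad6 are taken from the dmesg below.

```sh
#!/bin/sh
# stress_read: hypothetical helper. Starts one sequential dd read per
# argument, all in parallel, and returns non-zero if any of them fails.
stress_read() {
    pids=""
    for dev in "$@"; do
        dd if="$dev" of=/dev/null bs=1048576 2>/dev/null &
        pids="$pids $!"
    done
    rc=0
    for pid in $pids; do
        wait "$pid" || rc=1
    done
    return "$rc"
}

# Usage (device names are assumptions; reads the whole of both disks):
#   stress_read /dev/ad4 /dev/ad6
```

Only reads are shown here; a write test with dd destroys whatever is on the target device, so it belongs on scratch disks only.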


dmesg:
FreeBSD 5.3-RELEASE #0: Fri Nov  5 04:19:18 UTC 2004
   [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Pentium II/Pentium II Xeon/Celeron (307.84-MHz 686-class CPU)
 Origin = "GenuineIntel"  Id = 0x660  Stepping = 0
 Features=0x183f9ff
real memory  = 402587648 (383 MB)
avail memory = 384270336 (366 MB)
npx0: [FAST]
npx0:  on motherboard
npx0: INT 16 interface
pcib0:  pcibus 0 on motherboard
pir0:  on motherboard
pci0:  on pcib0
agp0:  mem 0xe000-0xe3ff at device 0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
pci1:  at device 0.0 (no driver attached)
isab0:  at device 7.0 on pci0
isa0:  on isab0
atapci0:  port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
uhci0:  port 0xb000-0xb01f irq 10 at device 7.2 on pci0
uhci0: [GIANT-LOCKED]
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
ums0: Microsoft Microsoft 5-Button Mouse with IntelliEye(TM), rev 1.10/3.00, addr 2, iclass 3/1
ums0: 5 buttons and Z dir.
pci0:  at device 7.3 (no driver attached)
atapci1:  port 0xc400-0xc4ff,0xc000-0xc003,0xbc00-0xbc07,0xb800-0xb803,0xb400-0xb407 irq 11 at device 17.0 on pci0
ata2: channel #0 on atapci1
ata3: channel #1 on atapci1
atapci2:  port 0xd800-0xd8ff,0xd400-0xd403,0xd000-0xd007,0xcc00-0xcc03,0xc800-0xc807 irq 11 at device 17.1 on pci0
ata4: channel #0 on atapci2
ata5: channel #1 on atapci2
dc0:  port 0xdc00-0xdcff mem 0xec00-0xec0003ff irq 12 at device 18.0 on pci0
miibus0:  on dc0
ukphy0:  on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc0: Ethernet address: 00:04:5a:56:80:76
dc0: if_start running deferred for Giant
dc0: [GIANT-LOCKED]
pci0:  at device 19.0 (no driver attached)
cpu0 on motherboard
orm0:  at iomem 0xcc000-0xcdfff,0xc-0xc8fff on isa0
pmtimer0 on isa0
atkbdc0:  at port 0x64,0x60 on isa0
atkbd0:  irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
fdc0:  at port 0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0:  at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0:  on ppc0
plip0:  on ppbus0
lpt0:  on ppbus0
lpt0: Interrupt-driven port
ppi0:  on ppbus0
sc0:  at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown:  can't assign resources (port)
unknown:  can't assign resources (memory)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
Timecounter "TSC" frequency 307842170 Hz quality 800
Timecounters tick every 10.000 msec
ad0: 43979MB  [89355/16/63] at ata0-master UDMA33
ad4: 286188MB  [581463/16/63] at ata2-master UDMA133
ad6: 286188MB  [581463/16/63] at ata3-master UDMA133
Mounting root from ufs:/dev/ad0s1a






> After these messages the two former cases result in FAILURE and finally
> in a panic. Even background fsck cannot run without another panic; only
> single-user mode helps. All of this prevents me from using them on my HW.
> However, B7, although it displays the messages as well, seemingly works
> fine. For the time being this version is sufficient, but I'd like to
> know - if possible at all - what the difference could be between the
> versions and if one can expect to bring the actual 5.3 ve

Re: slow system freeze - data

2004-12-27 Thread Benjamin Lutz
Hello Peter,

> The info you dumped shows that there's a filesystem deadlock on
> ad4s1f.

In case you haven't guessed, that'd be my /usr.

> Unfortunately, it's not clear (to me) where to go next.  Printing the
> locked vnodes might help but that's not easy to do without gdb.

You mean that's the point where I need serial console access? I hope to 
have that running after the holidays.

> >The first app that froze as far as I could tell was xmms.
>
> Actually, the locks suggest that the problem started with pid 678 -
> kdeinit. This is unlikely to be

Well, xmms is just the first app where it became apparent :)
PID 678 is really kded (at least it is at the moment - It is very likely 
it was then too, since these low PIDs seem to generally be assigned the 
same way with each boot). kded appears to be some CORBA-related tool used 
by KDE.

Btw, is my assumption that this is a kernel problem, not a problem with 
any of my applications, correct?

Anyway, many thanks for your help and insight so far, it is appreciated.

Greetings
Benjamin





Re: slow system freeze - data

2004-12-27 Thread Peter Jeremy
On Sun, 2004-Dec-26 08:14:49 +0100, Benjamin Lutz wrote:
>The freeze just happened again. I managed to get into the debugger and get 
>some info.

The info you dumped shows that there's a filesystem deadlock on
ad4s1f.  This is consistent with the behaviour you reported - the
system is running "normally" but as soon as a process tries to access
that filesystem, it freezes.  Eventually, all processes are
frozen.

Unfortunately, it's not clear (to me) where to go next.  Printing the
locked vnodes might help but that's not easy to do without gdb.

>The first app that froze as far as I could tell was xmms.

Actually, the locks suggest that the problem started with pid 678 - kdeinit.
This is unlikely to be 

-- 
Peter Jeremy
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: am64/FreeBSD-5.3-STABLE (or RELEASE) crashes often

2004-12-27 Thread Kris Kennaway
On Mon, Dec 27, 2004 at 02:52:45PM -0700, Troy Bowman wrote:

> And it doesn't dump its core to its dump swap space, too, so I can't run
> savecore after reboot to get debugging info.  I have the swap space in
> fstab commented out so it won't come up at boot to be able to manually
> harvest the core, as it gives "savecore: no dumps found."  (it doesn't
> happen automatically, either). 

Please double-check that you're running 'dumpon'.  If you don't
configure swap at boot time, it won't be run automatically by the boot
scripts.
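
As a rough sketch of doing this by hand: point the kernel at a dump device before the crash, then collect the dump after the reboot. The function names are hypothetical, /dev/ad0s1b stands in for whatever partition on the spare IDE drive is used, and /var/crash is assumed as the crash directory.

```sh
#!/bin/sh
# Hypothetical wrappers around dumpon(8) and savecore(8).

# arm_dump DEVICE: tell the kernel where to write a crash dump.
# Has to be run again after every boot, before the panic happens.
arm_dump() {
    dumpon "$1"
}

# harvest_dump DEVICE [DIR]: after the post-panic reboot, copy the
# dump from DEVICE into DIR (default /var/crash).
harvest_dump() {
    savecore "${2:-/var/crash}" "$1"
}

# Usage (device name is an assumption):
#   arm_dump /dev/ad0s1b
#   ... panic, reboot ...
#   harvest_dump /dev/ad0s1b
```

If savecore still reports no dumps after this, the dump itself is probably failing, and the debugger route below becomes more useful.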

> Hardware that it is running on is a Tyan s2875 with dual amd64/246
> processors, and 2 GB Registered DDR RAM (Corsair).  We're also running
> vinum for all of the filesystems, mirroring them all, including the root
> filesystem.  The vinum is using two SATA WD Raptors.  I have one older
> IDE drive plugged in to capture the kernel dumps.  

There's an erratum for vinum in 5.3.

> What can I do to debug this more if I can't harvest the kernel dumps to
> report a bug?

See the chapter on kernel debugging in the developers' handbook,
available on the website.  It takes you through how to configure your
kernel with support for the debugger, and how to obtain minimal
information from it when you encounter a panic.

You might like to first update to FreeBSD 5.3-STABLE in case the bugs
are already fixed.

> Is there anything the FreeBSD team can do?

Perhaps, once you have the above information.  Also report it to the
freebsd-amd64 mailing list.

Kris



am64/FreeBSD-5.3-STABLE (or RELEASE) crashes often

2004-12-27 Thread Troy Bowman
And it doesn't dump its core to its dump swap space, too, so I can't run
savecore after reboot to get debugging info.  I have the swap space in
fstab commented out so it won't come up at boot to be able to manually
harvest the core, as it gives "savecore: no dumps found."  (it doesn't
happen automatically, either). 

We recently thought we'd give 5.3 a go in production, and it has been
too unstable.   When it crashes, it doesn't reboot, so it just hangs
there until someone has to drive in and push the button.  Who knows,
maybe Linux would be more stable at this point.  Sigh.

Hardware that it is running on is a Tyan s2875 with dual amd64/246
processors, and 2 GB Registered DDR RAM (Corsair).  We're also running
vinum for all of the filesystems, mirroring them all, including the root
filesystem.  The vinum is using two SATA WD Raptors.  I have one older
IDE drive plugged in to capture the kernel dumps.  

We've tried many different memory configurations to see if we can tune
it so that FreeBSD can handle it (DRAM ECC vs master ECC, bank & node
interleaving turned off/on, slowing the memory down, DRAM Scrub Redirect
off/on, etc.), to no avail.

It's usually pagedaemon that croaks, but it crashes on the keyboard irq
process and serial IO irq process for some reason also.  I guess since
it's usually the pager that dies, that's the reason why I can't get
kernel dumps.  Here are some (manually copied) panics from the console.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x88
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0x80389aea
stack pointer           = 0x10:0xb2051a60
frame pointer           = 0x10:0xff006b12d000
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 53 (pagedaemon)
trap number             = 12
panic: page fault
cpuid = 0
boot() called on cpu#0
Uptime: 10h18m49s

...

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x88
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0x8038a10a
frame pointer           = 0x10:0xb2051ab0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 53 (pagedaemon)
trap number             = 12
panic: page fault
cpuid = 0
boot() called on cpu#0
Uptime: 15h59m55s

...
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 36 (swi5: clock sio)
trap number             = 12
panic: page fault
cpuid = 1
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x48
fault code              = supervisor read, page not present
instruction pointer     = 0x8: 0x803a40d3
stack pointer           = 0x10: 0xb1d63650
frame pointer           = 0x10: 0xff007b7f3a40
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 30
trap number             = 12
panic: page fault
cpuid = 1
spin lock sched lock held by 0xff007b8177b0 for > 5 seconds

...


What can I do to debug this more if I can't harvest the kernel dumps to
report a bug?  Is there anything the FreeBSD team can do?   Do I need to
resort to Linux for dual amd64 support for now? 

Thanks,

../troy


smime.p7s
Description: S/MIME cryptographic signature


Re: Spinlock errors.

2004-12-27 Thread Kris Kennaway
On Mon, Dec 27, 2004 at 06:48:36PM +, Yann Golanski wrote:
> I'm getting a few errors since I tried to upgrade gtk2 and librsvg,
> namely:  
> 
>   Fatal error 'Spinlock called when not threaded.' at line 83 in file
>   /usr/src/lib/libpthread/thread/thr_spinlock.c (errno = 0)
>   gmake[3]: *** [install-data-hook] Error 134
>   gmake[3]: Leaving directory
>   `/usr/ports/graphics/librsvg2/work/librsvg-2.8.1/gdk-pixbuf-loader'
> 
> Anyone got any ideas as to how to fix it?   I suspect some mix up of some
> core library but I am not sure which one...

See the mailing list archives and UPDATING for extensive discussion of
this issue.
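
To test the suspicion of a library mix-up, one could list which threading libraries a binary pulls in. This is a sketch under assumptions: `thread_libs` is a hypothetical helper, and the idea that a mix of libc_r and libpthread in one process is the cause is a guess to be checked against UPDATING, not something established in this thread.

```sh
#!/bin/sh
# thread_libs: hypothetical helper. Reads `ldd` output on stdin and
# prints the distinct threading libraries it mentions, one per line.
thread_libs() {
    awk '$1 ~ /^lib(c_r|pthread|thr)\.so\./ { print $1 }' | sort -u
}

# Usage (the binary path is an assumption):
#   ldd /usr/local/bin/some-program | thread_libs
```

More than one line of output for a single binary would suggest that it, or a library it loads, was built against a different threading library than the rest.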

Kris




Spinlock errors.

2004-12-27 Thread Yann Golanski
I'm getting a few errors since I tried to upgrade gtk2 and librsvg,
namely:  

  Fatal error 'Spinlock called when not threaded.' at line 83 in file
  /usr/src/lib/libpthread/thread/thr_spinlock.c (errno = 0)
  gmake[3]: *** [install-data-hook] Error 134
  gmake[3]: Leaving directory
  `/usr/ports/graphics/librsvg2/work/librsvg-2.8.1/gdk-pixbuf-loader'

Anyone got any ideas as to how to fix it?   I suspect some mix up of some
core library but I am not sure which one...

# gmake --version
GNU Make 3.80
Copyright (C) 2002  Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.

# uname -a
FreeBSD gridlinked.neverness.org 5.3-STABLE FreeBSD 5.3-STABLE #1: Fri
Dec  3 13:53:15 GMT 2004
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GRIDLINKED  i386

-- 
[EMAIL PROTECTED]  -=*=-  www.kierun.org
PGP:   009D 7287 C4A7 FD4F 1680  06E4 F751 7006 9DE2 6318




Re: acpi boot error messages after last update (Dec 22nd)

2004-12-27 Thread Federico Galvez-Durand Besnard
Sorry, I was out for Xmas. Longer dmesg here:
http://www.del.ufrj.br/~fico/FreeBSD/debug/dmesg03
Apparently, my Notebook works well (acpi doesn't).
I did not have these error messages before
last big acpi update.
Before that update dmesg pointed out acpi was doing something
(I was debugging USB, so this dmesg was recorded):
http://www.del.ufrj.br/~fico/FreeBSD/debug/dmesg01
Thanks!
p.s.:
uname -a
FreeBSD me.HERE 5.3-STABLE FreeBSD 5.3-STABLE #20: Wed Dec 22 20:31:43 
GMT-1 2004  [EMAIL PROTECTED]  i386

Lowell Gilbert wrote:
> Federico Galvez-Durand Besnard <[EMAIL PROTECTED]> writes:
>
> > Hi, just compiled new kernel with latest acpi.
> > I am getting boot error messages.
> > I set hint.acpi.0.disabled="1" in device.hints.
> > With acpi disabled I get this:
> > ( partial dmesg )
> > ...
> > vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
> > unknown:  can't assign resources (memory)
> > unknown:  can't assign resources (irq)
> > unknown:  can't assign resources (port)
> > unknown:  can't assign resources (irq)
> > unknown:  can't assign resources (port)
> > unknown:  can't assign resources (port)
> > unknown:  can't assign resources (port)
> > Timecounter "TSC" frequency 646825914 Hz quality 800
> > Timecounters tick every 10.000 msec
> > ...
>
> I don't see any error messages there.
> What is the actual problem?



Re: mbuf leak in bpf.c

2004-12-27 Thread Pawel Jakub Dawidek
On Mon, Dec 27, 2004 at 12:24:49PM +, Johnny Eriksson wrote:
+> If one tries to write a datagram to a bpf device, and the datagram is
+> longer than the MTU on the physical interface, the write fails as it
+> should, but an mbuf is allocated and thrown away.  Proposed solution:

Committed to HEAD, thanks!

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!




Re: Questions about GEOM and MIRROR

2004-12-27 Thread Pawel Jakub Dawidek
On Mon, Dec 13, 2004 at 08:20:30PM +0100, Samuel Tardieu wrote:
+> Hi.
+> 
+> I just added two disks (ad4 & ad6, SATA 160Go) to my FreeBSD box. I want to
+> use them in the following configuration:
+> 
+>   - ad4s1 & ad6s1: geom mirror of 80Go containing all my precious data (/,
+> /usr, /var, /home)
+> 
+>   - ad4s2b: swap
+> 
+>   - ad4s2x, ad6s2x: non-important data
+> 
+> On the mirror (ad4s1+ad6s1), I created partitions for /, /tmp, /usr,
+> and /var.
+> 
+> Is there any pitfall in doing so? Do I have to be careful to keep extra space
+> somewhere? (such as one sector at the end of ad4s1/ad6s1)
+> 
+> I can't seem to place bootcode at the beginning of the mirror:
+> 
+> # bsdlabel /dev/mirror/precious
+> # /dev/mirror/precious:
+> 8 partitions:
+> #        size   offset    fstype   [fsize bsize bps/cpg]
+>   a:    524288       16    4.2BSD     2048 16384 32776
+>   c: 167766731        0    unused        0     0         # "raw" part, don't edit
+>   d:   2097152   524304    4.2BSD     2048 16384 28552
+>   e:    524288  2621456    4.2BSD     2048 16384 32776
+>   f: 164620971  3145744    4.2BSD     2048 16384 28552
+> # bsdlabel -B /dev/mirror/precious
+> bsdlabel: Geom not found
+> 
+> What does this error mean?

It could mean that there is no such device.
Hard to say, as I can't reproduce it here - 'bsdlabel -B' works for me.

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!




Re: ggated, dvd+rw, atapicam problem

2004-12-27 Thread Pawel Jakub Dawidek
On Wed, Nov 24, 2004 at 09:25:28PM -0600, Vulpes Velox wrote:
+> Just hit an odd problem... here is what I am doing... I have a dvd+rw
+> drive that I am trying to export using ggated... and something
+> is going wrong... anyone have any idea what is happening?
+> 
+> I think I provided all the possible info; if anyone can think of
+> anything more, please let me know.
+> 
+> 
+> [v42]:/etc# ggatec create 192.168.0.3 /dev/cd0
+> ggate0
+> [v42]:/etc# dvd+rw-mediainfo /dev/ggate0
+> /dev/ggate0: unable to open: Inappropriate ioctl for device
+> 
+> excerpt from dmesg...
+> acd0: DVDR  at ata1-master UDMA33
+> cd0 at ata1 bus 0 target 0 lun 0
+> cd0:  Removable CD-ROM SCSI-0 device
+> cd0: 33.000MB/s transfers
+> cd0: cd present [338104 x 2048 byte records]
+> 
+> gg.exports...
+> 192.168.0.2 RW  /dev/acd0
+> 192.168.0.2 RW  /dev/acd0t01
+> 192.168.0.2 RW  /dev/cd
+> 
+> uname -a client box...
+> FreeBSD vixen42 5.3-STABLE FreeBSD 5.3-STABLE #2: Wed Nov 24 16:04:21
+> CST 2004 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/vixen42-1  i386
+> 
+> uname -a server box...
+> FreeBSD fennec 5.3-STABLE FreeBSD 5.3-STABLE #0: Wed Nov 10 13:34:27
+> CST 2004 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/fennec-1  i386
+> 
+> 
+> btw, the box I am trying to access it from does not have atapicam on it,
+> because atapicam causes this box to hard-lock: it has two ATAPI CD
+> drives on a Promise card, and the system hard-locks if any
+> ATAPI drives are found hooked up to a Promise controller.

GEOM Gate can send only I/O requests, it cannot forward ioctls.

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!




Re: urgent help

2004-12-27 Thread kalin mintchev
PLEASE REPLY TO [EMAIL PROTECTED]


> On Mon, Dec 27, 2004 at 02:40:34PM +0100, Andreas Widerøe Andersen typed:
>> At 09:35 27.12.2004, you wrote:
>> > PLEASE REPLY TO [EMAIL PROTECTED]
>> >
>> > upgraded from 4.6 => 4.10 rel
>> >
>> > network programs are crashing the new system: netstat, ping, the qmail
>> > tcp server, all of them...
>> > sshd is running but when accessing from outside it panics too... what
>> > is it?
>> >
>> > can i turn something off in the kernel?!
>> Did you "make world" in addition to recompiling the kernel? Sounds like
>> your system is out of sync.
>> Here's a note about how I did it a while back:
>> http://home.eunet.no/~awand/freebsd-4.6_installasjon.txt (it's in
>> Norwegian, but all commands and order should be understandable).

how do i make it in sync?!

i did buildworld first - as it's in the handbook. i've done 5.x five
before without a problem...

this is for a mailserver in production...

>
> From this document I understand you do a "make buildkernel" before you do
> a "make buildworld". That's not the recommended order. Build world before
> you build kernel.
>
>
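
The recommended order can be sketched as a small wrapper. This is an illustration only: `build_and_install_kernel` is a hypothetical name, GENERIC and /usr/src are assumed defaults, and the handbook for the release being installed remains the authority on the full procedure.

```sh
#!/bin/sh
# build_and_install_kernel: hypothetical wrapper around the usual make
# targets, run in the order "world first, then kernel".
#   $1 = kernel config name (default GENERIC)
#   $2 = source tree (default /usr/src)
# The body runs in a subshell so the caller's directory is unchanged.
build_and_install_kernel() (
    kernconf=${1:-GENERIC}
    srcdir=${2:-/usr/src}
    cd "$srcdir" || exit 1
    make buildworld &&
    make buildkernel KERNCONF="$kernconf" &&
    make installkernel KERNCONF="$kernconf"
)

# Afterwards, reboot into single-user mode and finish by hand:
#   cd /usr/src && make installworld && mergemaster
```

Stopping at the first failing step matters here: installing a kernel built against a world that did not finish building is one way to end up with a system that is out of sync.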


-- 



-- 
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: urgent help

2004-12-27 Thread kalin mintchev
PLEASE REPLY TO [EMAIL PROTECTED]


thank you Bill for replying...

well i did it a few times with the same success. it's not the first time
i'm doing it. it's the first time with the 4.x..

i followed the handbook step by step - rebuilt devs too..  and then
cleaned up obj.. to make it all again - the same problems were happening
after every try...

the machine would come up. then netstat or ping or ssh will crash it... the
first time i had to add the sshd user and group...

i mostly installed the new etc files except the passwd, group and hosts...

i have a copy of the old etc...

what else do i need?


> "kalin mintchev" <[EMAIL PROTECTED]> wrote:
>
>> PLEASE REPLY TO [EMAIL PROTECTED]
>>
>> upgraded from 4.6 => 4.10 rel
>>
>> network programs are crashing the new system: netstat, ping, the qmail
>> tcp server, all of them...
>> sshd is running but when accessing from outside it panics too...  what
>> is it?
>>
>> can i turn something off in the kernel?!
>
> What process did you follow to update?  It sounds to me like you didn't
> complete the upgrade process, skipped a step, or did it improperly.
>
> There's no reason I can think of that upgrading should cause things to
> panic, unless you did the upgrade process improperly.
>
> --
> Bill Moran
> Potential Technologies
> http://www.potentialtech.com
>




Re: urgent help

2004-12-27 Thread Bill Moran
"kalin mintchev" <[EMAIL PROTECTED]> wrote:

> PLEASE REPLY TO [EMAIL PROTECTED]
> 
> upgraded from 4.6 => 4.10 rel
> 
> network programs are crashing the new system: netstat, ping, the qmail tcp
> server all of them...
> sshd is running but when accessing from outside it panics too...  what is it?
> 
> can i turn something off in the kernel?!

What process did you follow to update?  It sounds to me like you didn't
complete the upgrade process, skipped a step, or did it improperly.

There's no reason I can think of that upgrading should cause things to
panic, unless you did the upgrade process improperly.

-- 
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: urgent help

2004-12-27 Thread kalin mintchev

 PLEASE REPLY TO [EMAIL PROTECTED]

 upgraded from 4.6 => 4.10 rel

 network programs are crashing the new system: netstat, ping, the qmail tcp
 server all of them...
 sshd is running but when accessing from outside it panics too...  what is
 it?

 can i turn something off in the kernel?!






urgent help

2004-12-27 Thread kalin mintchev
PLEASE REPLY TO [EMAIL PROTECTED]

upgraded from 4.6 => 4.10 rel

network programs are crashing the new system: netstat, ping, the qmail tcp
server all of them...
sshd is running but when accessing from outside it panics too...  what is it?

can i turn something off in the kernel?!






RE: netstat fails with memory allocation error and error in kvm_read

2004-12-27 Thread techlists
 
> > > You appear to be running out of kernel memory. Since you're 
> > > capturing the output of vmstat -m, you should check that for any 
> > > bins that are growing at a high rate of speed.
> > >
> > > Seems possible that its in pf :)
> >
> > I've checked the numbers from just before the freeze (it's within 15
> > secs) with two sets of data: From a fresh boot and five minutes
> > before the freeze.
> 
> You might also log 'sysctl vm.kvm_free' and 'sysctl vm.zone'.

sysctl vm.zone is identical to vmstat -z (according to man vmstat).

I've graphed the output from iostat (idle/user/...), vmstat -i (interrupt
rate), vmstat -m (in use), vmstat -z (used), sysctl vm.kvm_free (which is
constant) and the number of pfstates. The graphs are at
. The newest data are from just after a
deadlock. Is there anything else I should graph?

IRQ 20 is the NIC on our internal network (800+ machines), IRQ 18 and IRQ 21
are NICs connected to the internet. There are a lot of changes on the vmstat
-m graphs just before midnight last night that seem to correspond with the
increase in interrupts on IRQ 18. 

The only graphs I can see changing up to the deadlock are:
irq20 (internal NIC),
irq21 (primary external NIC),
the buckets (vmstat -z), which all grow (I suppose this is normal?),
the Mbufs, which seem to grow, but nothing extreme,
pffrag, pffrent (but not to levels they haven't been at before).

Most notably, most of the pf graphs don't change. Where can I see memory
used by pf/altq? If it is pfaltqpl (in vmstat -z), it doesn't change at all.

I'm in the process of setting up a serial console in the hope that I can
break to the debugger with that. I'm also trying to provoke the deadlock so
it will happen more frequently.
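
For anyone wanting to collect the same kind of data, a periodic snapshot could look roughly like this. `log_snapshot` is a hypothetical helper and the log path and 15-second interval are assumptions; the commands themselves are the ones named above.

```sh
#!/bin/sh
# log_snapshot: hypothetical helper. Appends a timestamped block to
# LOGFILE containing the output of each command string given.
#   log_snapshot LOGFILE 'cmd one' 'cmd two' ...
log_snapshot() {
    logfile=$1
    shift
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        for cmd in "$@"; do
            printf '%s %s\n' '---' "$cmd"
            sh -c "$cmd" 2>&1
        done
    } >> "$logfile"
}

# Usage (path and interval are assumptions):
#   while :; do
#       log_snapshot /var/log/kmem.log 'vmstat -m' 'vmstat -z' \
#           'vmstat -i' 'sysctl vm.kvm_free'
#       sleep 15
#   done
```

Appending to a file rather than graphing live means the last snapshot before a freeze survives the reboot, provided the log sits on a filesystem that gets synced.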

/Martin



mbuf leak in bpf.c

2004-12-27 Thread Johnny Eriksson
If one tries to write a datagram to a bpf device, and the datagram is
longer than the MTU on the physical interface, the write fails as it
should, but an mbuf is allocated and thrown away.  Proposed solution:

--- bpf.c.orig  Mon Dec 27 10:43:06 2004
+++ bpf.c   Mon Dec 27 10:44:16 2004
@@ -633,8 +633,10 @@
 	if (error)
 		return (error);
 
-	if (datlen > ifp->if_mtu)
+	if (datlen > ifp->if_mtu) {
+		m_freem(m);
 		return (EMSGSIZE);
+	}
 
 	if (d->bd_hdrcmplt)
 		dst.sa_family = pseudo_AF_HDRCMPLT;

--Johnny