Re: system hangup - I'm lost
cpghost wrote:
> If it's PATA, check the cabling, then check it again, and just to make sure, replace the cable even if the system used to work flawlessly in the past. I've had this on a few servers, but replacing the cables always fixed the problem for me.

It's SATA - it's a 3ware 9500S-4LP controller. I can only hope it would have detected any drive problem (even one caused by bad cabling). If not, I don't know why I have a RAID controller anyway ;)

The only other disk drive I have on that system is a USB-attached HDD for backup purposes... so I can't really try putting the swap somewhere else.

//nudel> /c0 show all
/c0 Driver Version = 3.60.04.003
/c0 Model = 9500S-4LP
/c0 Available Memory = 112MB
/c0 Firmware Version = FE9X 2.08.00.009
/c0 Bios Version = BE9X 2.03.01.052
/c0 Boot Loader Version = BL9X 2.02.00.001
/c0 Serial Number = D19004A5300589
/c0 PCB Version = Rev 019
/c0 PCHIP Version = 1.50
/c0 ACHIP Version = 3.20
/c0 Number of Ports = 4
/c0 Number of Drives = 4
/c0 Number of Units = 1
/c0 Total Optimal Units = 1
/c0 Not Optimal Units = 0
/c0 JBOD Export Policy = off
/c0 Disk Spinup Policy = 1
/c0 Spinup Stagger Time Policy (sec) = 2
/c0 Cache on Degrade Policy = Follow Unit Policy

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
----------------------------------------------------------------------
u0    RAID-5    OK      -       -       64K     698.461   ON     OFF

Port  Status  Unit  Size       Blocks     Serial
----------------------------------------------------------
p0    OK      u0    232.88 GB  488397168  WD-WCANK1079272
p1    OK      u0    232.88 GB  488397168  WD-WCANK1120378
p2    OK      u0    232.88 GB  488397168  WD-WCANK1120936
p3    OK      u0    232.88 GB  488397168  WD-WCANK1120805

Name  OnlineState  BBUReady  Status  Volt  Temp  Hours  LastCapTest
-------------------------------------------------------------------
bbu   On           Yes       OK      OK    OK    255    24-Aug-2008

//nudel>

--
Oliver Lehmann
http://www.pofo.de/
http://wishlist.ans-netz.de/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: system hangup - I'm lost
Jeremy Chadwick wrote:
> - Maxim MAX211ECA1, no idea but doesn't interest me

Just for completeness, this is a serial port driver IC.

Best regards
Oliver

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Commercial register: Munich registry court, HRA 74606; management: secnetix Verwaltungsgesellsch. mbH, commercial register: Munich registry court, HRB 125758; managing directors: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD services, products and more: http://www.secnetix.de/bsd

[...] one observation we can make here is that Python makes an excellent pseudocoding language, with the wonderful attribute that it can actually be executed. -- Bruce Eckel
Re: system hangup - I'm lost
Oliver Lehmann wrote:
> Hi,
>
> today I had a crash again - I was not able to get a crash dump (thought a panic at the end of the kdb would do it but didn't - should have called dumpon before ;)) - so here now the information I was able to retrieve:
>
> Ok, what I've got so far is the system writing stuff out to the console when it hangs up:
>
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
>
> ... and now the debugger stuff:
>
> [snipped]

So.. no idea? anyone?

--
Oliver Lehmann
http://www.pofo.de/
http://wishlist.ans-netz.de/
Re: system hangup - I'm lost
On Wednesday 01 October 2008 11:29:43 am Oliver Lehmann wrote:
> Hi,
>
> today I had a crash again - I was not able to get a crash dump (thought a panic at the end of the kdb would do it but didn't - should have called dumpon before ;)) - so here now the information I was able to retrieve:
>
> Ok, what I've got so far is the system writing stuff out to the console when it hangs up:
>
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096

Sounds like your disk has died, or perhaps the controller is hung and not completing disk I/O requests anymore.

--
John Baldwin
Re: system hangup - I'm lost
John Baldwin wrote:
> Sounds like your disk has died, or perhaps the controller is hung and not completing disk I/O requests anymore.

Hm - the 3ware event log does not shed any light on this - no events occurred. So I can only guess that the controller and the disks are fine (I once had a hard-failing disk and the controller detected it correctly).

Do you have an idea how to debug this further?

--
Oliver Lehmann
http://www.pofo.de/
http://wishlist.ans-netz.de/
Re: system hangup - I'm lost
On Thu, Oct 02, 2008 at 06:51:06PM +0200, Oliver Lehmann wrote:
> today I had a crash again - I was not able to get a crash dump (thought a panic at the end of the kdb would do it but didn't - should have called dumpon before ;)) - so here now the information I was able to retrieve:
>
> Ok, what I've got so far is the system writing stuff out to the console when it hangs up:
>
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
>
> ... and now the debugger stuff:
>
> [snipped]
>
> So.. no idea? anyone?

If it's PATA, check the cabling, then check it again, and just to make sure, replace the cable even if the system used to work flawlessly in the past. I've had this on a few servers, but replacing the cables always fixed the problem for me.

Oh, btw, you can reproduce this exact behavior on diskless workstations with an NFS-mounted swap. IIRC, it even happened on VERY slow hardware with GBDE or GELI-encrypted swap partitions; but I'm not 100% sure it was due to slowness (it could have been a bad cabling issue as well).

> --
> Oliver Lehmann
> http://www.pofo.de/
> http://wishlist.ans-netz.de/

-cpghost.

--
Cordula's Web. http://www.cordula.ws/
Re: system hangup - I'm lost
Jeremy Chadwick wrote:
> P.S. -- You're the 2nd person I've encountered in under a week who's using 440BX/GX-based hardware in present day. I would not be surprised if the board is simply going bad/failing due to age. :-)

I still have quite a few of these in active use. They are good workhorses. Sure, they don't have the raw computing power of newer servers, but for most of our tasks they get the job done. I also have a couple stacks of these in 2U cases sitting unused for spare parts and testing.

They make great FreeBSD boxes, and handle low-moderate loads pretty well. We use them for all kinds of things: firewalls, personal/testing servers, SVN repos, monitoring and traffic graphing, name servers, you name it.

To bring this back on topic, they might be old, but I have yet to encounter one single motherboard from that series that has failed on me in any way. (*knock on wood*) However, mine are all Intel L440GX boards with dual PIII CPUs in the 600-800MHz range.

We try to squeeze every last bit of value out of the hardware we have. :-)

Jim
Re: system hangup - I'm lost
Hi,

today I had a crash again - I was not able to get a crash dump (thought a panic at the end of the kdb would do it but didn't - should have called dumpon before ;)) - so here now the information I was able to retrieve:

Ok, what I've got so far is the system writing stuff out to the console when it hangs up:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2, size: 4096

... and now the debugger stuff:

KDB: enter: manual escape to debugger
[thread pid 40 tid 100048 ]
Stopped at kdb_enter+0x30: leave
db> sh locks
exclusive sleep mutex Giant r = 0 (0xc07c73c0) locked @ /usr/src/sys/kern/kern_intr.c:681
db> sh alllocks
Process 40 (irq1: atkbd0) thread 0xc4503a80 (100048)
exclusive sleep mutex Giant r = 0 (0xc07c73c0) locked @ /usr/src/sys/kern/kern_intr.c:681
db>

so there are no locks except the one I caused, but anyhow:

db> bt 100048
Tracing pid 40 tid 100048 td 0xc4503a80
kdb_enter(c077aee6,4,1,0,1,...) at kdb_enter+0x30
scgetc(c0842b60,2,de391c88,c05ad0b7,c4609340,...) at scgetc+0x575
sckbdevent(c0823740,0,c0842b60,c07c73c0,8,...) at sckbdevent+0x210
atkbd_intr(c0823740,0,de391cd8,c05695b8,c0823740,...) at atkbd_intr+0xa1
atkbdintr(c0823740,0,c076448a,2a9,8,...) at atkbdintr+0x21
ithread_execute_handlers(c460cc90,c4449680,c076448a,30e,c4503a80,...) at ithread_execute_handlers+0x108
ithread_loop(c45f66c0,de391d38,c07642ea,30c,0,...) at ithread_loop+0x64
fork_exit(c05696b0,c45f66c0,de391d38) at fork_exit+0x78
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xde391d6c, ebp = 0 ---
db> sh pcpu
cpuid        = 0
curthread    = 0xc4503a80: pid 40 irq1: atkbd0
curpcb       = 0xde391d90
fpcurthread  = none
idlethread   = 0xc444c780: pid 11 idle: cpu0
APIC ID      = 1
currentldt   = 0x50
spin locks held:

and now the output of ps (beware, it is long; no idea why there are so many cron processes - maybe crond still schedules them but they don't get processed?). show lockedvnods follows afterwards.

db> ps
  pid  ppid  pgrp  uid  state  wmesg   wchan       cmd
57919 57918   692    0  SV     ufs     0xc47857c8  cron
57918   692   692    0  S      ppwait  0xc6e63a78  cron
57917 57916   692    0  SV     ufs     0xc47857c8  cron
57916   692   692    0  S      ppwait  0xc6e63c90  cron
57915 57914   692    0  SV     ufs     0xc47857c8  cron
57914   692   692    0  S      ppwait  0xc6eb3000  cron
57913 57912   692    0  SV     ufs     0xc47857c8  cron
57912   692   692    0  S      ppwait  0xc70a9430  cron
57911 57908   692    0  SV     ufs     0xc47857c8  cron
57910 57907   692    0  SV     ufs     0xc47857c8  cron
57909 57906   692    0  SV     ufs     0xc47857c8  cron
57908   692   692    0  S      ppwait  0xc6eb3648  cron
57907   692   692    0  S      ppwait  0xc6eb3860  cron
57906   692   692    0  S      ppwait  0xc6eb3a78  cron
57905   686   686   25  S      ufs     0xc4953388  sendmail
57904 57902   692    0  SV     ufs     0xc47857c8  cron
57903 57901   692    0  SV     ufs     0xc47857c8  cron
57902   692   692    0  S      ppwait  0xc49a4430  cron
57901   692   692    0  S      ppwait  0xc49a4648  cron
57900 57899   692    0  SV     ufs     0xc47857c8  cron
57899   692   692    0  S      ppwait  0xc49a4860  cron
57898 57897   692    0  SV     ufs     0xc47857c8  cron
57897   692   692    0  S      ppwait  0xc49a4a78  cron
57896 57895   692    0  SV     ufs     0xc47857c8  cron
57895   692   692    0  S      ppwait  0xc49a4c90  cron
57894 57893   692    0  SV     ufs     0xc47857c8  cron
57893   692   692    0  S      ppwait  0xc6b7c648  cron
57892 57891   692    0  SV     ufs     0xc47857c8  cron
57891   692   692    0  S      ppwait  0xc66bc430  cron
57890 57889   692    0  SV     ufs     0xc47857c8  cron
57889   692   692    0  S      ppwait  0xc6b7c860  cron
57888 57887   692    0  SV     ufs     0xc47857c8  cron
57887   692   692    0  S      ppwait  0xc66bc860  cron
57886   686   686   25  S      ufs     0xc4953388  sendmail
57885 57884   692    0  SV     ufs     0xc47857c8  cron
57884   692   692    0  S      ppwait  0xc66bca78  cron
57883 57882   692    0  SV     ufs     0xc47857c8  cron
57882   692   692    0  S      ppwait  0xc66bcc90  cron
57881 57880   692    0  SV     ufs     0xc47857c8  cron
57880   692   692    0  S      ppwait  0xc6a65000  cron
57879 57878   692    0  SV     ufs     0xc47857c8  cron
57878   692   692    0  S      ppwait  0xc6a65218  cron
57877 57876   692    0  SV     ufs     0xc47857c8  cron
57876   692   692    0  S      ppwait  0xc6a65430  cron
57875 57874   692    0  SV     ufs     0xc47857c8  cron
57874   692   692    0  S      ppwait  0xc6a65648  cron
57873 57872   692    0  SV     ufs     0xc47857c8  cron
57872   692   692    0  S      ppwait
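A listing this long is easier to read once it is tallied by wait message and wait channel. The following is a minimal Python sketch, not something posted in the thread: the function name `tally_waits` is hypothetical, and the sample rows are copied from the ps listing above, with the sendmail row's fused pgrp/uid field split into "686 25" as an assumption.

```python
from collections import Counter

# A few rows copied from the DDB "ps" listing above. Columns:
# pid ppid pgrp uid state wmesg wchan cmd
# The sendmail row's pgrp/uid split ("686 25") is an assumption.
SAMPLE = """\
pid ppid pgrp uid state wmesg wchan cmd
57919 57918 692 0 SV ufs 0xc47857c8 cron
57918 692 692 0 S ppwait 0xc6e63a78 cron
57917 57916 692 0 SV ufs 0xc47857c8 cron
57916 692 692 0 S ppwait 0xc6e63c90 cron
57905 686 686 25 S ufs 0xc4953388 sendmail
"""


def tally_waits(ps_text):
    """Count processes per (cmd, wmesg, wchan) in DDB ps output.

    Only rows with exactly eight whitespace-separated fields and a
    numeric pid are counted, so the header, truncated rows, and
    commands containing spaces are skipped rather than guessed at.
    """
    counts = Counter()
    for line in ps_text.splitlines():
        fields = line.split()
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        _pid, _ppid, _pgrp, _uid, _state, wmesg, wchan, cmd = fields
        counts[(cmd, wmesg, wchan)] += 1
    return counts


if __name__ == "__main__":
    # Print the most common wait points first.
    for (cmd, wmesg, wchan), n in tally_waits(SAMPLE).most_common():
        print(f"{n:3d}  {cmd:<10} {wmesg:<8} {wchan}")
```

Run over the whole listing, a tally like this groups every cron child in state SV under the single entry ufs / 0xc47857c8, which would be consistent with all of them waiting on the same resource.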
Re: system hangup - I'm lost
Jeremy Chadwick wrote:
> On Wed, Oct 01, 2008 at 06:53:09AM +0200, Oliver Lehmann wrote:
>> Because it is a Server Board it offers a lot of managing features and other nice things like serial console at bootup and system monitoring features... but all unsupported within FreeBSD's software ;)
>
> Really? That's interesting, because Charles Sprickman told me that there is no hardware monitoring information in the BIOS if you go in there. Most motherboards provide that in the BIOS as a centralised place above all else.

You are right - I could have sworn that there was such a screen in the BIOS, but all I can see is for setting up stuff like enabling the event log and posting it through a modem connection and so on - server-specific stuff - but no display screen for health information... So you were right ;)

--
Oliver Lehmann
http://www.pofo.de/
http://wishlist.ans-netz.de/
Re: system hangup - I'm lost
On Mon, 2008-09-29 at 22:14 +0200, Oliver Lehmann wrote:
> Any idea what I could do to shed some more light on this behaviour? Why it is happening and what really is causing it? Would enabling the kernel debugger really help here? I mean the system is really hanging up - except ping response it is not responding to anything except the reset switch ;)

If it's responding to ping, you should be able to get into the debugger. Compile it in, along with options WITNESS and options WITNESS_SKIPSPIN, and press ctrl-alt-escape when the machine next hangs. From there, it should hopefully be possible to get more info.

It's been a long time since I've used the debugger under 6.x so some of the more useful commands may not exist, but the output of at least sh locks, sh alllocks and bt on any processes that seem to be holding locks would be useful. Also sh pcpu and ps will help to determine exactly what was running at the time.

Gavin
Re: system hangup - I'm lost
Oliver Lehmann wrote:
> Hi,
>
> My fileserver has sporadic hangups running 6.3:
>
> FreeBSD 6.3-STABLE #0: Thu Jun 19 00:21:00 CEST 2008
>     [EMAIL PROTECTED]:/usr/obj/i386-pentium3-6.3/usr/src/sys/NUDEL
>
> The exact release doesn't matter since it happened before. It always happens after some time of having some load on the system (I'm building ports with tinderbox and during the build process it just hangs up). The system does not write anything out on the console, neither the CRT nor the serial console.
>
> The system itself is:
>
> CPU: Intel Pentium III (845.64-MHz 686-class CPU)
>   Origin = GenuineIntel  Id = 0x683  Stepping = 3
>   Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
> real memory  = 805240832 (767 MB)
> avail memory = 778481664 (742 MB)
> ACPI APIC Table: <Intel N440BX>
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>  cpu0 (BSP): APIC ID: 1
>  cpu1 (AP): APIC ID: 0
> ioapic0 <Version 1.1> irqs 0-23 on motherboard
>
> while the disk space is provided by a 3ware RAID:
>
> twa0: <3ware 9000 series Storage Controller> port 0x2400-0x24ff mem 0xf4101000-0xf41010ff,0xf480-0xf4ff irq 18 at device 11.0 on pci0
> twa0: INFO: (0x04: 0x0053): Battery capacity test is overdue:
> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9500S-4LP, 4 ports, Firmware FE9X 2.08.00.009, BIOS BE9X 2.03.01.052
> da0 at twa0 bus 0 target 0 lun 0
> da0: <AMCC 9500S-4LP DISK 2.08> Fixed Direct Access SCSI-3 device
> da0: 100.000MB/s transfers
> da0: 715224MB (1464778752 512 byte sectors: 255H 63S/T 91178C)
>
> In the past I sometimes had messages left which were indicating that the system was not able to allocate swap space fast enough, if I recall correctly (_not_ out of swap space!), but the RAID is kinda fast imho.
>
> Any idea what I could do to shed some more light on this behaviour? Why it is happening and what really is causing it? Would enabling the kernel debugger really help here? I mean the system is really hanging up - except ping response it is not responding to anything except the reset switch ;)
>
> Greetings, Oliver

Personally I'd rather bet on some hardware problem (overheating?). Try to install mbmon from ports. I also had similar problems with old motherboards with swollen capacitors.

--
Bartosz Stec
Re: system hangup - I'm lost
On Tue, 30 Sep 2008, Gavin Atkinson wrote:
> On Mon, 2008-09-29 at 22:14 +0200, Oliver Lehmann wrote:
>> Any idea what I could do to shed some more light on this behaviour? Why it is happening and what really is causing it? Would enabling the kernel debugger really help here? I mean the system is really hanging up - except ping response it is not responding to anything except the reset switch ;)
>
> If it's responding to ping, you should be able to get into the debugger. Compile it in, along with options WITNESS and options WITNESS_SKIPSPIN, and press ctrl-alt-escape when the machine next hangs. From there, it should hopefully be possible to get more info.
>
> It's been a long time since I've used the debugger under 6.x so some of the more useful commands may not exist, but the output of at least sh locks, sh alllocks and bt on any processes that seem to be holding locks would be useful. Also sh pcpu and ps will help to determine exactly what was running at the time.

show lockedvnods is also quite useful if the problem originates in the file system, as it lists vnodes that have been locked, and by which threads.

Robert N M Watson
Computer Laboratory
University of Cambridge
Re: system hangup - I'm lost
On Tue, Sep 30, 2008 at 12:39:27PM +0200, Bartosz Stec wrote:
> Oliver Lehmann wrote:
>> Hi,
>>
>> My fileserver has sporadic hangups running 6.3:
>>
>> FreeBSD 6.3-STABLE #0: Thu Jun 19 00:21:00 CEST 2008
>>     [EMAIL PROTECTED]:/usr/obj/i386-pentium3-6.3/usr/src/sys/NUDEL
>>
>> The exact release doesn't matter since it happened before. It always happens after some time of having some load on the system (I'm building ports with tinderbox and during the build process it just hangs up). The system does not write anything out on the console, neither the CRT nor the serial console.
>>
>> The system itself is:
>>
>> CPU: Intel Pentium III (845.64-MHz 686-class CPU)
>>   Origin = GenuineIntel  Id = 0x683  Stepping = 3
>>   Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
>> real memory  = 805240832 (767 MB)
>> avail memory = 778481664 (742 MB)
>> ACPI APIC Table: <Intel N440BX>
>> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>>  cpu0 (BSP): APIC ID: 1
>>  cpu1 (AP): APIC ID: 0
>> ioapic0 <Version 1.1> irqs 0-23 on motherboard
>>
>> while the disk space is provided by a 3ware RAID:
>>
>> twa0: <3ware 9000 series Storage Controller> port 0x2400-0x24ff mem 0xf4101000-0xf41010ff,0xf480-0xf4ff irq 18 at device 11.0 on pci0
>> twa0: INFO: (0x04: 0x0053): Battery capacity test is overdue:
>> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9500S-4LP, 4 ports, Firmware FE9X 2.08.00.009, BIOS BE9X 2.03.01.052
>> da0 at twa0 bus 0 target 0 lun 0
>> da0: <AMCC 9500S-4LP DISK 2.08> Fixed Direct Access SCSI-3 device
>> da0: 100.000MB/s transfers
>> da0: 715224MB (1464778752 512 byte sectors: 255H 63S/T 91178C)
>>
>> In the past I sometimes had messages left which were indicating that the system was not able to allocate swap space fast enough, if I recall correctly (_not_ out of swap space!), but the RAID is kinda fast imho.
>>
>> Any idea what I could do to shed some more light on this behaviour? Why it is happening and what really is causing it? Would enabling the kernel debugger really help here? I mean the system is really hanging up - except ping response it is not responding to anything except the reset switch ;)
>>
>> Greetings, Oliver
>
> Personally I'd rather bet on some hardware problem (overheating?). Try to install mbmon from ports. I also had similar problems with old motherboards with swollen capacitors.

Be careful with mbmon and healthd -- just because they compile and run does not mean they're working properly (the values shown may be completely unreliable/incorrect). It's best to check such things in the system BIOS, unless you have absolute certainty that your motherboard is supported by mbmon/healthd.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: system hangup - I'm lost
On Monday 29 September 2008 04:14:08 pm Oliver Lehmann wrote:
> Hi,
>
> My fileserver has sporadic hangups running 6.3:
>
> FreeBSD 6.3-STABLE #0: Thu Jun 19 00:21:00 CEST 2008
>     [EMAIL PROTECTED]:/usr/obj/i386-pentium3-6.3/usr/src/sys/NUDEL
>
> The exact release doesn't matter since it happened before. It always happens after some time of having some load on the system (I'm building ports with tinderbox and during the build process it just hangs up). The system does not write anything out on the console, neither the CRT nor the serial console.

1) Set up support for crashdumps.

2) Add 'DDB' and 'KDB' to your kernel. When it hangs, break into the debugger (CTRL+ALT+ESC) and run 'panic' to generate a crash dump.

3) ps -axl -M /var/crash/vmcore.X -N /boot/kernel/kernel (where vmcore.X is the core file generated, probably vmcore.0).

That's the first place to start.

--
John Baldwin
Re: system hangup - I'm lost
Hi,

Jeremy Chadwick wrote:
> On Tue, Sep 30, 2008 at 12:39:27PM +0200, Bartosz Stec wrote:
>> Personally I'd rather bet on some hardware problem (overheating?). Try to install mbmon from ports. I also had similar problems with old motherboards with swollen capacitors.
>
> Be careful with mbmon and healthd -- just because they compile and run does not mean they're working properly (the values shown may be completely unreliable/incorrect). It's best to check such things in the system BIOS, unless you have absolute certainty that your motherboard is supported by mbmon/healthd.

The system's chipset (440GX - board is http://www.intel.com/support/motherboards/server/l440gx/) is not supported by mbmon. All I can check is the temperature of the hard drives, and they are between 30 and 45 °C. Which says nothing about the CPUs ;)

make world, for example, does not bring the system down - I only encounter this during my tinderbox runs - who knows what stresses it that much then.

I'll now build a kernel with all the debugging stuff in it...

--
Oliver Lehmann
http://www.pofo.de/
http://wishlist.ans-netz.de/
Re: system hangup - I'm lost
John Baldwin wrote:
> (CTRL+ALT+ESC) and run 'panic' to generate a crash dump.

The problem here is that after a memory upgrade my swap space is no longer big enough to cover the memory size. I'll try this as a last resort if the interactive work with kdb does not provide any help, and will remove some memory beforehand...

--
Oliver Lehmann
http://www.pofo.de/
http://wishlist.ans-netz.de/
Re: system hangup - I'm lost
On Tuesday 30 September 2008 10:57:19 am Oliver Lehmann wrote:
> John Baldwin wrote:
>> (CTRL+ALT+ESC) and run 'panic' to generate a crash dump.
>
> The problem here is that after a memory upgrade my swap space is no longer big enough to cover the memory size. I'll try this as a last resort if the interactive work with kdb does not provide any help, and will remove some memory beforehand...

Turn on minidumps. Minidumps don't dump all of memory (generally a lot, lot less).

--
John Baldwin
Re: system hangup - I'm lost
On Tue, Sep 30, 2008 at 04:55:34PM +0200, Oliver Lehmann wrote:
> Hi,
>
> Jeremy Chadwick wrote:
>> On Tue, Sep 30, 2008 at 12:39:27PM +0200, Bartosz Stec wrote:
>>> Personally I'd rather bet on some hardware problem (overheating?). Try to install mbmon from ports. I also had similar problems with old motherboards with swollen capacitors.
>>
>> Be careful with mbmon and healthd -- just because they compile and run does not mean they're working properly (the values shown may be completely unreliable/incorrect). It's best to check such things in the system BIOS, unless you have absolute certainty that your motherboard is supported by mbmon/healthd.
>
> The system's chipset (440GX - board is http://www.intel.com/support/motherboards/server/l440gx/) is not supported by mbmon. All I can check is the temperature of the hard drives, and they are between 30 and 45 °C. Which says nothing about the CPUs ;)

The chipset rarely matters (I've yet to encounter any PC chipset that natively handles full fan, voltage, and temperature monitoring), but the motherboard model can tell me a lot. :-)

Boards have to include an external H/W monitoring IC (such as one from National Semiconductor (LMxx), AMD, or Winbond), have thermistors placed around the board, and have the H/W IC tied into the ISA or SMBus. Sometimes the H/W monitoring IC also acts as a super I/O chip (which means it handles serial, parallel, keyboard, mouse, and floppy disks -- and sometimes IDE).

I can't find anything on Intel's site that clues me in; all the PDFs are vague as far as what chips are on the board. I tried searching for a high-resolution photo of the L440GX on Google Images, but I find none which are sharp/clear enough. The best I could find was this:

http://bbs.yjfy.com/UploadFile/2008-2/20082818545062073.jpg

I see Intel northbridge and southbridges, a Cirrus Logic (VGA?) chip, an Intel flash chip (probably for CMOS), and an Intel NIC. Four chips I don't recognise are an Intel chip on the far right, a mystery chip at the bottom of the board (can't make out company logo), and two chips with E in their company logo (right of PCI slots). Possibly one of these handles H/W monitoring.

If you can reboot the system and go into the BIOS, see if you can find anything that looks remotely like CPU and system temperatures, as well as voltages. If there's no such menu, the board likely has no support for such.

P.S. -- You're the 2nd person I've encountered in under a week who's using 440BX/GX-based hardware in present day. I would not be surprised if the board is simply going bad/failing due to age. :-)

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: system hangup - I'm lost
Jeremy Chadwick wrote:
> I can't find anything on Intel's site that clues me in; all the PDFs are vague as far as what chips are on the board.

Have you tried the Product specifications?

http://download.intel.com/support/motherboards/server/l440gx/254151-003.pdf

Beginning on page 33 (43 of the PDF). It has 3 different Server Management busses. The temperature part is handled within a Baseboard Management Controller. This BMC is implemented using a DS82CL10.

Because it is a Server Board it offers a lot of managing features and other nice things like serial console at bootup and system monitoring features... but all unsupported within FreeBSD's software ;)

> P.S. -- You're the 2nd person I've encountered in under a week who's using 440BX/GX-based hardware in present day. I would not be surprised if the board is simply going bad/failing due to age. :-)

Hm - I'd wonder if this would be the case. I mean, I'm using older hardware (Tyan Tsunami S1830S, PII300, DAC960P, RAID-1 2*IBM DFHS S2W) without any problems as a router ;)

--
Oliver Lehmann
http://www.pofo.de/
http://wishlist.ans-netz.de/
Re: system hangup - I'm lost
On Wed, Oct 01, 2008 at 06:53:09AM +0200, Oliver Lehmann wrote:
> Jeremy Chadwick wrote:
>> I can't find anything on Intel's site that clues me in; all the PDFs are vague as far as what chips are on the board.
>
> Have you tried the Product specifications?

No need -- Charles Sprickman sent me high-resolution pictures of all the ICs on the 440GX board, and I was able to identify all of them except a few (and those are obviously bit-latches or gates of some kind, so not important). Here's the list:

- National Semiconductor Super I/O chip [1]
- Cirrus Logic GD5480 video/VGA chip
- Samsung SGRAM module for VGA chip; 16MBytes, 70ns
- Intel 82371EB (PIIX4E) chip [2]
- Dallas Semiconductor DS80CH11 power management chip
- EtronTech SRAM; 256kbit, 15ns
- Unknown, looks like flash or DRAM
- Intel S82093AA I/O APIC
- Octal bit-latch IC
- Intel SB21150BC PCI bridge; 66MHz
- Intel chip of some kind, can't make it out due to dust
- Texas Instrument UCC5638 SCSI terminator
- Texas Instrument UCC5638 SCSI terminator
- Cypress Semiconductor W48C101 clock chip
- Numerous other bit-latching ICs
- Cypress Semiconductor 3.3V SDRAM buffering chip; probably used to drive SDRAM DIMMs (system memory)
- ??? Model 684702-003; not sure what this does, but is of no interest
- Some TI chip, doesn't interest me
- 2x California Micro Devices ECP/EPP (parallel port) terminator
- Maxim MAX211ECA1, no idea but doesn't interest me

[1]: I'll have to look up datasheets on this chip to see if it supports H/W monitoring.

[2]: This chip does a **lot**, the most important piece being it drives the entire PCI bus. It *does* support SMBus, but not I2C. Linux lm-sensors supports this chip, but I don't know how it supports it. I will need to look up the specs/datasheets on it.
http://www.lm-sensors.org/browser/lm-sensors/trunk/doc/busses/i2c-piix4

> http://download.intel.com/support/motherboards/server/l440gx/254151-003.pdf
>
> Beginning on page 33 (43 of the PDF). It has 3 different Server Management busses. The temperature part is handled within a Baseboard Management Controller. This BMC is implemented using a DS82CL10.

This tells me very little. :-)

> Because it is a Server Board it offers a lot of managing features and other nice things like serial console at bootup and system monitoring features... but all unsupported within FreeBSD's software ;)

Really? That's interesting, because Charles Sprickman told me that there is no hardware monitoring information in the BIOS if you go in there. Most motherboards provide that in the BIOS as a centralised place above all else.

Either way, I'm going to look into the details. Examining what exactly Linux lm-sensors means by support will be the first step.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |