Re: FreeBSD 5.3 SMP freezes with MySQL 4.1
The system is totally dead with the panic message, and I have to hard reset to reboot. Even after I reboot the server, it crashes again within several minutes, because hundreds of requests are coming in.

Following Klein's configuration, I set debug.mpsafenet=0 in /boot/loader.conf; the server has been stable so far, for 20 hours.

-- Young Lee

On Tue, 29 Mar 2005 21:01:51 -0800 (PST) Doug White [EMAIL PROTECTED] wrote:
> On Tue, 29 Mar 2005, 李毅刚 wrote:
> > Hi, I have a Dell PE2650 box with dual Xeon 2.4G, HTT disabled, 5.3-RELEASE installed, SMP enabled, and mysql 4.1.10a built from the port. If the number of mysql connections is high (e.g. over 100 connections), the system freezes within several minutes, and the fatal message always points to the mysqld process. When I build the kernel without SMP, the system is stable. I have tried 5.3-p5 and 5.4-PRE; the problem is still there. Is it my configuration mistake, or will it be solved in 5.4-RELEASE?
>
> There's just not enough information to tell. When you say freeze, is the machine totally unresponsive, or does the console allow you to type characters? Can you ping the system when it appears frozen? Does the system eventually recover if the load is taken away?
>
> --
> Doug White | FreeBSD: The Power to Serve
> [EMAIL PROTECTED] | www.FreeBSD.org

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
new LORs on 5.4 pre
Hi,

I've stumbled over some new LORs (all continuable) on 5.4pre from 2005-03-29 09:49 UTC, thus before the bpf/DHCP fix.

lock order reversal
 1st 0xc0642b60 Giant (Giant) @ /usr/src/sys/kern/kern_timeout.c:256
 2nd 0xc14d7264 fxp0 (network driver) @ /usr/src/sys/modules/fxp/../../dev/fxp/if_fxp.c:1233
KDB: stack backtrace:
kdb_backtrace(c05fc462,c14d7264,c14cab80,c06fc810,c06fc7ad) at 0xc04b05ae = kdb_backtrace+0x2e
witness_checkorder(c14d7264,9,c06fc7ad,4d1,c06018d6) at 0xc04bb6c6 = witness_checkorder+0x6a6
_mtx_lock_flags(c14d7264,0,c06fc7ad,4d1,c14d7000) at 0xc048a62a = _mtx_lock_flags+0x8a
fxp_start(c14d7000,12b,0,c14d7000) at 0xc06f9db7 = fxp_start+0x37
if_start(c14d7000,0,c06018d6,184,402) at 0xc050a999 = if_start+0x99
ether_output_frame(c14d7000,c15d8100,6,c9be5bd8,c9be5a8c) at 0xc050c0d8 = ether_output_frame+0x218
ether_output(c14d7000,c15d8100,c9be5bd8,0,0) at 0xc050beae = ether_output+0x44e
nd6_output(c14d7000,c14d7000,c15d8100,c9be5bd8,0) at 0xc0551ac1 = nd6_output+0x3c1
ip6_output(c15d8100,0,0,1,c9be5c40) at 0xc054b0b3 = ip6_output+0xf93
nd6_ns_output(c14d7000,0,c15dc8a8,0,1) at 0xc0552c95 = nd6_ns_output+0x3b5
nd6_dad_ns_output(c1594100,c15dc800,100,1,6) at 0xc055420c = nd6_dad_ns_output+0x4c
nd6_dad_timer(c15dc800,0,c05f9d24,100,1) at 0xc0553e94 = nd6_dad_timer+0x224
softclock(0,0,c05f6625,269,c0642b20) at 0xc04a29c8 = softclock+0x238
ithread_loop(c13dd500,c9be5d48,c05f641c,30e,0) at 0xc047d8c2 = ithread_loop+0x172
fork_exit(c047d750,c13dd500,c9be5d48) at 0xc047c8e6 = fork_exit+0xc6
fork_trampoline() at 0xc05c7c9c = fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xc9be5d7c, ebp = 0 ---
KDB: enter: witness_checkorder

(this one is similar to others on the list)

lock order reversal
 1st 0xc16718a0 rtentry (rtentry) @ /usr/src/sys/netinet/if_ether.c:445
 2nd 0xc14d7264 fxp0 (network driver) @ /usr/src/sys/modules/fxp/../../dev/fxp/if_fxp.c:1233
KDB: stack backtrace:
kdb_backtrace(c05fc462,c14d7264,c14cab80,c06fc810,c06fc7ad) at 0xc04b05ae = kdb_backtrace+0x2e
witness_checkorder(c14d7264,9,c06fc7ad,4d1,c06018d6) at 0xc04bb6c6 = witness_checkorder+0x6a6
_mtx_lock_flags(c14d7264,0,c06fc7ad,4d1,c14d7000) at 0xc048a62a = _mtx_lock_flags+0x8a
fxp_start(c14d7000,12b,0,c14d7000) at 0xc06f9db7 = fxp_start+0x37
if_start(c14d7000,0,c06018d6,184,202) at 0xc050a999 = if_start+0x99
ether_output_frame(c14d7000,c15d5500,6,12b,c1045b18) at 0xc050c0d8 = ether_output_frame+0x218
ether_output(c14d7000,c15d5500,ca446a40,0,2,c1670001,2302,c06021ce,1bd,516) at 0xc050beae = ether_output+0x44e
arprequest(c14d7000,c15fc0c8,ca446b14,c148c4ac,7) at 0xc0515489 = arprequest+0x109
arpresolve(c14d7000,c1671840,c15d5600,ca446b10,ca446aac) at 0xc05157cd = arpresolve+0x32d
ether_output(c14d7000,c15d5600,ca446b10,c1671840,c04bb7a7) at 0xc050badc = ether_output+0x7c
ip_output(c15d5600,0,ca446b0c,0,0) at 0xc0520897 = ip_output+0x7c7
udp_output(c166f9d8,c15d5600,0,0,c1499480) at 0xc0535a7a = udp_output+0x53a
udp_send(c166eca8,0,c15d5600,0,0) at 0xc0536280 = udp_send+0x30
sosend(c166eca8,0,ca446c48,c15d5600,0) at 0xc04d2db1 = sosend+0x701
kern_sendit(c1499480,d,ca446cc4,0,0) at 0xc04d95ef = kern_sendit+0x13f
sendit(c1499480,d,ca446cc4,0,810001d) at 0xc04d9481 = sendit+0x1a1
sendto(c1499480,ca446d14,18,431,6) at 0xc04d976b = sendto+0x5b
syscall(2f,2f,2f,2,0) at 0xc05d9170 = syscall+0x2a0
Xint0x80_syscall() at 0xc05c7c8f = Xint0x80_syscall+0x1f
--- syscall (133, FreeBSD ELF32, sendto), eip = 0x28233baf, esp = 0xbfbfd51c, ebp = 0xbfbfd548 ---
KDB: enter: witness_checkorder

lock order reversal
 1st 0xc168a57c inp (tcpinp) @ /usr/src/sys/netinet/tcp_usrreq.c:371
 2nd 0xc14d7264 fxp0 (network driver) @ /usr/src/sys/modules/fxp/../../dev/fxp/if_fxp.c:1233
KDB: stack backtrace:
kdb_backtrace(c05fc462,c14d7264,c14cab80,c06fc810,c06fc7ad) at 0xc04b05ae = kdb_backtrace+0x2e
witness_checkorder(c14d7264,9,c06fc7ad,4d1,c06018d6) at 0xc04bb6c6 = witness_checkorder+0x6a6
_mtx_lock_flags(c14d7264,0,c06fc7ad,4d1,c14d7000) at 0xc048a62a = _mtx_lock_flags+0x8a
fxp_start(c14d7000,12b,0,c14d7000) at 0xc06f9db7 = fxp_start+0x37
if_start(c14d7000,0,c06018d6,184,2) at 0xc050a999 = if_start+0x99
ether_output_frame(c14d7000,c15d6200,6,c1589150,ca455afc) at 0xc050c0d8 = ether_output_frame+0x218
ether_output(c14d7000,c15d6200,c1589150,c16718c4,255) at 0xc050beae = ether_output+0x44e
ip_output(c15d6200,0,ca455b5c,0,0) at 0xc0520897 = ip_output+0x7c7
tcp_output(c168ca68,c158b970,c1499c00,173,c19b7288) at 0xc052ad5d = tcp_output+0x134d
tcp_usr_connect(c19b7288,c158b970,c1499c00) at 0xc053297a = tcp_usr_connect+0x12a
soconnect(c19b7288,c158b970,c1499c00,c04daa66,808b4a0) at 0xc04d2651 = soconnect+0x61
kern_connect(c1499c00,3,c158b970,c158b970,0) at 0xc04d8e5d = kern_connect+0x8d
connect(c1499c00,ca455d14,c,431,3) at 0xc04d8db1 = connect+0x41
syscall(2f,2f,2f,808b480,8088240) at 0xc05d9170 = syscall+0x2a0
Xint0x80_syscall() at 0xc05c7c8f = Xint0x80_syscall+0x1f
--- syscall (98, FreeBSD ELF32, connect), eip = 0x282e7def, esp =
KDE refuses new processes when network goes away
This is on a laptop (IBM Thinkpad R50e) running 5.3-RELEASE. When I start it up with the network up everything is fine, but if the network dies (we have a slightly dodgy ADSL, which periodically plays yo-yo) it will refuse to open new processes or windows.

Is it (likely to be) a KDE, xopen or FreeBSD problem?

/Par

--
Par Leijonhufvud [EMAIL PROTECTED]
This is not a book to be set aside lightly. It should be thrown with great force!
 -- Dorothy Sayers,
Re: KDE refuses new processes when network goes away
Par Leijonhufvud wrote:
> When I start it up with network up everything is fine, but if the network dies (we have a slightly dodgy ADSL, which periodically plays yo-yo) it will refuse to open new processes or windows.
>
> Is it (likely to be) a KDE, xopen or freebsd problem?

Is your /etc/hosts correct? I've seen this happen when the local hostname (i.e., the output of `hostname`) cannot be resolved.

Colin Percival
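The check suggested above can be scripted. What follows is a minimal sketch and not something posted in the thread: the function name name_in_hosts is made up for illustration, and it only inspects a hosts-format file. It does not consult DNS or any other resolver source, and it compares names case-sensitively.

```sh
#!/bin/sh
# name_in_hosts NAME [FILE]
# Return 0 if NAME appears as a hostname or alias on a non-comment
# line of FILE (default /etc/hosts), 1 otherwise.
# Hypothetical helper for illustration only.
name_in_hosts() {
    name=$1
    hosts_file=${2:-/etc/hosts}
    [ -r "$hosts_file" ] || return 1
    awk -v n="$name" '
        { sub(/#.*/, "") }              # drop trailing comments
        NF >= 2 {                       # address followed by one or more names
            for (i = 2; i <= NF; i++)
                if ($i == n) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$hosts_file"
}

# Check the name this machine reports for itself.
me=$(hostname 2>/dev/null || uname -n)
if name_in_hosts "$me"; then
    echo "$me is listed in /etc/hosts"
else
    echo "$me is NOT listed in /etc/hosts; lookups of it may depend on the network"
fi
```

If the name turns out to be missing, one common remedy is to add it as an alias on the loopback line of /etc/hosts, so that resolving the local hostname no longer needs a reachable name server.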
Re: FreeBSD 5.3 SMP freezes with MySQL 4.1
Hi, Young,

On 2005-03-30 at 17:08 +0800, Young Lee wrote:
> The system is totally dead with the panic message, and I have to hard reset to reboot. Even after I reboot the server, it crashes again within several minutes, because hundreds of requests are coming in. Following Klein's configuration, I set debug.mpsafenet=0 in /boot/loader.conf; the server has been stable so far, for 20 hours.

Have you tried disabling SACK (net.inet.tcp.sack)?

BTW, I think it might be helpful to show your kernel configuration.

Cheers,
--
Xin LI delphij delphij net http://www.delphij.net/
5.4pre panic
Got this panic with auto-reboot (no dump :( ), saved from dmesg on 5.4pre 2003-05-29 09:49 UTC

processor eflags= IOPL = 0
current process = 29 (swi1: net)
trap number = 3
panic: breakpoint instruction fault
KDB: enter: panic

Fatal trap 3: breakpoint instruction fault while in kernel mode
instruction pointer = 0x8:0xc04b0650
stack pointer = 0x10:0xc0687bec
frame pointer = 0x10:0xc0687bf4
code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags= IOPL = 0
current process = 29 (swi1: net)
trap number = 3
panic: breakpoint instruction fault
KDB: enter: panic

Fatal trap 3: breakpoint instruction fault while in kernel mode
instruction pointer = 0x8:0xc04b0650
stack pointer = 0x10:0xc0687af4
frame pointer = 0x10:0xc0687afc
code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags= IOPL = 0
current process = 29 (swi1: net)
trap number = 3
panic: breakpoint instruction fault
KDB: enter: panic

Fatal trap 3: breakpoint instruction fault while in kernel mode
instruction pointer = 0x8:0xc04b0650
stack pointer = 0x10:0xc06879fc
frame pointer = 0x10:0xc0687a04
code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags= IOPL = 0
current process = 29 (swi1: net)
trap number = 3
panic: breakpoint instruction fault

Rene
--
It won't fit on the line.
 -- me, 2001
Upgrading to net-snmp-5.2.1_1
Hi,

on my FreeBSD laptop net-snmp-5.2.1 is installed. However, the portupgrade to net-snmp-5.2.1_1 failed. I tried to build it manually, but make failed, too. As a next step, I set the option WITHOUT_PERL=yes. But I still get this error every time:

Warning: -L../../snmplib/ changed to -L/usr/ports/net-mgmt/net-snmp/work/net-snmp-5.2.1/perl/SNMP/../../snmplib/
Unrecognized argument in LIBS ignored: '-rpath=/usr/local/lib'
Writing Makefile for SNMP
/libexec/ld-elf.so.1: /usr/local/lib/perl5/site_perl/5.8.6/mach/auto/SNMP/SNMP.so: Undefined symbol perl_get_sv
*** Error code 1

Stop in /usr/ports/net-mgmt/net-snmp/work/net-snmp-5.2.1.
*** Error code 1

Stop in /usr/ports/net-mgmt/net-snmp.

I am using FreeBSD 5.4-PRE and perl 5.8.6. Could you please give me a hint?

Kind regards
Rainer

---
Rainer Heesen, [EMAIL PROTECTED], Bonn, Germany
Re: FreeBSD 5.3 SMP freezes with MySQL 4.1
I am running one mildly busy new MySQL server that's running fine on 5.3-RELEASE-p5 #5: Sat Jan 22 04:54:07 EST 2005, built from the generic kernel config. It's a Dell 1850 with dual P4 Xeon 3.00GHz EM64T CPUs and HTT enabled.

FYI, I actually have a Dell 2650 that's not doing anything at the moment because it had sluggish performance when I started to put some serious load on it.

I recently updated MySQL from 4.1.5 to the latest 4.1.10a using this set of settings (I hate using ports manually):

portupgrade -Rfri -m 'BUILD_OPTIMIZED=yes BUILD_STATIC=yes' /var/db/pkg/mysql-server-4.1*

I copied the default large.cnf file to /var/db/mysql/my.cnf for better performance, but that's about it; I am still evaluating MySQL performance.

To give you a rough idea of how busy this MySQL server is, here are some bits from running mysqladmin extended-status:

| Bytes_received    | 49227436 |
| Bytes_sent        | 71933703 |
| Threads_connected | 25       |
| Threads_created   | 42       |
| Uptime            | 101775   |

According to the MySQL manual, Threads_created gives an idea of the load on the MySQL server. phpMyAdmin lists MySQL status in a much nicer way:

This MySQL server has been running for 1 days, 4 hours, 38 minutes and 28 seconds.

Query statistics: Since its startup, 335,544 queries have been sent to the server.

Total      ø per hour   ø per minute   ø per second
335,544    11,715.47    195.26         3.25

Query type   Total      ø per hour   %
select       204,621    7,144.31     61.04 %
insert       29,149     1,017.73     8.70 %
show keys    85,395     2,981.55     25.48 %

This server is doing more things than I originally planned it to do. It's also running a Postgres 7.4 server that has over 500 megs of data and almost constant 100% disk IO usage according to top via m. I have statistics enabled on postgres but no way to show some simple summaries.

I run Apache2 in prefork mode and currently have around 350 apache daemons on average:

ps -auxww | grep -c httpd
356

It's doing over 1 million dynamic page loads a day (some page loads don't use the database). This server is also running 12 separate Java processes, each around 200 megs in size.

Since cvsupping to the latest 5_3 for release security patches and critical updates, the server has been perfectly stable. Before that I did have kernel panic reboot problems that I believe were caused by massive thread usage from the Java processes. Although I rebooted just a little while ago, the server's uptime is currently 33 days.

With your server, how have you been updating to 5.3-p5? It's possible you have a similar problem.

I am emailing this in HTML format in the hope the tables come out more nicely.

Regards,
Mike

Young Lee wrote:
> I tried your solution yesterday; so far it is stable, and I will observe the stability for some days. BTW, I set debug.mpsafenet=0 in /boot/loader.conf to evade the possible network stack deadlock under SMP.
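The per-hour, per-minute and per-second averages in the phpMyAdmin summary quoted above follow directly from the query count and the uptime, which gives a quick way to sanity-check such figures. A small sketch, assuming nothing beyond the numbers in the message (the helper name query_rates is hypothetical; awk does the floating-point arithmetic):

```sh
#!/bin/sh
# query_rates COUNT DAYS HOURS MINUTES SECONDS
# Print the average rate of COUNT events per hour, per minute and per
# second over the given uptime, two decimals each.
# Hypothetical helper for illustration only.
query_rates() {
    awk -v q="$1" -v d="$2" -v h="$3" -v m="$4" -v s="$5" 'BEGIN {
        secs = ((d * 24 + h) * 60 + m) * 60 + s   # uptime in seconds
        printf "%.2f %.2f %.2f\n", q * 3600 / secs, q * 60 / secs, q / secs
    }'
}

# Figures from the message: 335,544 queries in 1 day, 4 h, 38 min, 28 s
# (an uptime of 103108 seconds).
query_rates 335544 1 4 38 28
# prints: 11715.47 195.26 3.25
```

The result matches the first row of the table (11,715.47 per hour, 195.26 per minute, 3.25 per second). The mysqladmin Uptime value of 101775 seconds is slightly smaller, which is consistent with the two snapshots having been taken at different times.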
Re: 5.4pre panic
On Wed, Mar 30, 2005 at 12:11:23PM +0200, Rene Ladan wrote:
> Got this panic with auto-reboot (no dump :( ), saved from dmesg on 5.4pre 2003-05-29 09:49 UTC
>
> processor eflags = IOPL = 0
> current process = 29 (swi1: net)
> trap number = 3
> panic: breakpoint instruction fault

I'd have expected this panic to be the result of setting a breakpoint, i.e. requesting the kernel to panic at this location. Are you absolutely certain this is not the case? If so, you should try to use DDB to obtain a traceback to find out what it was doing.

Kris
Re: new LORs on 5.4 pre
On Wed, Mar 30, 2005 at 11:17:50AM +0200, Rene Ladan wrote:
> Hi, I've stumbled over some new LORs (all continuable) on 5.4pre from 2005-03-29 09:49 UTC, thus before the bpf/DHCP fix.
>
> lock order reversal
>  1st 0xc0642b60 Giant (Giant) @ /usr/src/sys/kern/kern_timeout.c:256
>  2nd 0xc14d7264 fxp0 (network driver) @ /usr/src/sys/modules/fxp/../../dev/fxp/if_fxp.c:1233

Is your fxp module up-to-date? Stale modules (i.e. compiled for a different kernel than the one you're running) will cause problems.

Kris
Re: 5.4pre panic
On Wed, Mar 30, 2005 at 05:51:04AM -0800, Kris Kennaway wrote:
> On Wed, Mar 30, 2005 at 12:11:23PM +0200, Rene Ladan wrote:
> > Got this panic with auto-reboot (no dump :( ), saved from dmesg on 5.4pre 2003-05-29 09:49 UTC
> >
> > processor eflags= IOPL = 0
> > current process = 29 (swi1: net)
> > trap number = 3
> > panic: breakpoint instruction fault
>
> I'd have expected this panic to be the result of setting a breakpoint, i.e. requesting the kernel to panic at this location. Are you absolutely certain this is not the case?

I did not set a breakpoint, maybe some application was doing it for me. I was working in X at the moment.

> If so, you should try to use DDB to obtain a traceback to find out what it was doing.

I have DDB/KDB in my kernelconfig, but auto-reboot without dump doesn't help much, does it?

> Kris

Regards,
Rene
--
It won't fit on the line.
 -- me, 2001
Re: new LORs on 5.4 pre
On Wed, Mar 30, 2005 at 05:52:13AM -0800, Kris Kennaway wrote:
> On Wed, Mar 30, 2005 at 11:17:50AM +0200, Rene Ladan wrote:
> > Hi, I've stumbled over some new LORs (all continuable) on 5.4pre from 2005-03-29 09:49 UTC, thus before the bpf/DHCP fix.
> >
> > lock order reversal
> >  1st 0xc0642b60 Giant (Giant) @ /usr/src/sys/kern/kern_timeout.c:256
> >  2nd 0xc14d7264 fxp0 (network driver) @ /usr/src/sys/modules/fxp/../../dev/fxp/if_fxp.c:1233
>
> Is your fxp module up-to-date? Stale modules (i.e. compiled for a different kernel than the one you're running) will cause problems.

All modules are up to date. This LOR popped up after installing from a resume buildworld. After blowing away /usr/obj, make cleandir twice in /usr/src and rebuilding and reinstalling world+kernel, it and the others still pop up, especially after any of these commands:

# dhclient fxp0
# ntpd -q
% fetchmail (only at startup, not after wakeup)

Browsing in links does _not_ trigger a LOR.

> Kris

Regards,
Rene
--
It won't fit on the line.
 -- me, 2001
Re: 5.4pre panic
On Wed, Mar 30, 2005 at 04:17:08PM +0200, Rene Ladan wrote:
> On Wed, Mar 30, 2005 at 05:51:04AM -0800, Kris Kennaway wrote:
> > On Wed, Mar 30, 2005 at 12:11:23PM +0200, Rene Ladan wrote:
> > > Got this panic with auto-reboot (no dump :( ), saved from dmesg on 5.4pre 2003-05-29 09:49 UTC
> > >
> > > processor eflags = IOPL = 0
> > > current process = 29 (swi1: net)
> > > trap number = 3
> > > panic: breakpoint instruction fault
> >
> > I'd have expected this panic to be the result of setting a breakpoint, i.e. requesting the kernel to panic at this location. Are you absolutely certain this is not the case?
>
> I did not set a breakpoint, maybe some application was doing it for me. I was working in X at the moment.
>
> > If so, you should try to use DDB to obtain a traceback to find out what it was doing.
>
> I have DDB/KDB in my kernelconfig, but auto-reboot without dump doesn't help much, does it?

The point is to disable auto-reboot (I assume you mean DDB_UNATTENDED?) and obtain the traceback manually from the ddb prompt when it panics.

Kris
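Whether a kernel was configured to skip the debugger on panic can be checked from its config file. The following is only a sketch under stated assumptions: the helper name has_unattended is invented, the default path /usr/src/sys/i386/conf/GENERIC is a guess at where the config lives (use the file the running kernel was actually built from), and accepting the spelling KDB_UNATTENDED in addition to the DDB_UNATTENDED named above is an assumption about how the option may be spelled on 5.x.

```sh
#!/bin/sh
# has_unattended FILE
# Return 0 if kernel config FILE contains an uncommented
# "options DDB_UNATTENDED" (or, by assumption, "options KDB_UNATTENDED")
# line, 1 otherwise. Hypothetical helper for illustration only.
has_unattended() {
    awk '
        { sub(/#.*/, "") }      # ignore comments
        $1 == "options" && $2 ~ /^(DDB|KDB)_UNATTENDED$/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Path is an assumption; pass the real kernel config as the first argument.
conf=${1:-/usr/src/sys/i386/conf/GENERIC}
if [ -r "$conf" ] && has_unattended "$conf"; then
    echo "$conf: unattended option set; a panic will not stop at the debugger"
else
    echo "$conf: no unattended option found (or file not readable)"
fi
```

With the option removed and the kernel rebuilt, a panic should leave the machine at the ddb prompt, where typing trace prints the backtrace that was asked for.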
boot0: To beep or not to beep?
Hi,

long story short:

boot0cfg -B ad0 - No beeping on boot
boot0cfg -B -o noupdate ad0 - Annoying beep.

I really don't know any assembler, but reading /sys/boot/i386/boot0/boot0.S leads me to believe that there should be a beep on every boot. However, I certainly don't want any beeps from boot0, can I comment out the first two lines of main.10?

/*
 * Start of input loop. Beep and take note of time
 */
main.10:        movb $ASCII_BEL,%al             # Signal
                callw putchr                    # beep!
                xorb %ah,%ah                    # BIOS: Get
                int $0x1a                       # system time
                movw %dx,%di                    # Ticks when
                addw _TICKS(%bp),%di            # timeout
/*
 * Busy loop, looking for keystrokes but keeping one eye on the time.
 */
main.8:
#ifndef SIO
                movb $0x1,%ah                   # BIOS: Check
                int $0x16                       # for keypress
                jnz main.11                     # Have one
#else /* SIO */
                movb $0x03,%ah                  # BIOS: Read COM
                call bioscom
                testb $0x01,%ah                 # Check line status
                jnz main.11                     # (bit 1 indicates input)
#endif /* SIO */
                xorb %ah,%ah                    # BIOS: Get
                int $0x1a                       # system time
                cmpw %di,%dx                    # Timeout?
                jb main.8                       # No

Ulrich Spörlein
--
PGP Key ID: F0DB9F44 Encrypted mail welcome!
Fingerprint: F1CE D062 0CA9 ADE3 349B 2FE8 980A C6B5 F0DB 9F44
Ok, which part of "Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn." didn't you understand?
Re: FreeBSD 5.3 SMP freezes with MySQL 4.1
admin# sysctl -a | grep net.inet.tcp.sack
net.inet.tcp.sack.enable: 1

I use the SMP kernel configuration from cvsup without any modification.

-- Young Lee

On Wed, 30 Mar 2005 17:40:49 +0800 Xin LI [EMAIL PROTECTED] wrote:
> Hi, Young,
>
> On 2005-03-30 at 17:08 +0800, Young Lee wrote:
> > The system is totally dead with the panic message, and I have to hard reset to reboot. Even after I reboot the server, it crashes again within several minutes, because hundreds of requests are coming in. Following Klein's configuration, I set debug.mpsafenet=0 in /boot/loader.conf; the server has been stable so far, for 20 hours.
>
> Have you tried disabling SACK (net.inet.tcp.sack)?
>
> BTW, I think it might be helpful to show your kernel configuration.
>
> Cheers,
> --
> Xin LI delphij delphij net http://www.delphij.net/
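Boot-time tunables such as the debug.mpsafenet=0 workaround mentioned in this thread live in /boot/loader.conf as name="value" lines. Below is a minimal sketch of setting one idempotently; the helper name set_tunable is invented for illustration, it only matches assignments that start at the beginning of a line, and the demonstration deliberately works on a scratch file rather than the real /boot/loader.conf.

```sh
#!/bin/sh
# set_tunable FILE KEY VALUE
# Set KEY="VALUE" in a loader.conf-style FILE: replace the first existing
# assignment of KEY (dropping any later duplicates) or append a new line.
# Hypothetical helper for illustration only.
set_tunable() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    [ -f "$file" ] || : > "$file"
    awk -v k="$key" -v v="$value" '
        BEGIN { line = k "=\"" v "\"" }
        index($0, k "=") == 1 { if (!done) print line; done = 1; next }
        { print }
        END { if (!done) print line }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Demonstration on a scratch file, not on /boot/loader.conf itself.
conf=$(mktemp)
echo 'debug.mpsafenet="1"' > "$conf"
set_tunable "$conf" debug.mpsafenet 0
cat "$conf"
# prints: debug.mpsafenet="0"
rm -f "$conf"
```

A loader tunable only takes effect at the next boot. The SACK setting shown in the sysctl output above is different in that it is an ordinary sysctl, so on the affected host it could be switched off at run time with sysctl net.inet.tcp.sack.enable=0 to test the suggestion without rebooting.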
Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE
On Wed, March 30, 2005 12:43 am, Karl Denninger said:
> Here's the diff and some thoughts
> [snip, including first diff]
> 241,243c244,249
> < /* if reinit succeeded and retries still permit, reinject request */
> < if (ata_reinit(ch) && request->retries-- > 0 && request->device->param){
> ---
> > /*
> >  * if reinit succeeds, retries still permit and device didn't
> >  * get removed by the reinit, reinject request
> >  */
> > if (!ata_reinit(ch) && request->retries-- > 0 && request->device->param){
> [snip third diff]
> The second diff is really just a formatting and comment change.. you're certainly correct that the changes are small! :-)

No, it is not -- it reverses the sense of the first condition. At first glance that is what I would expect to be the core of the problem, but I don't have appropriate hardware to test on. (It also adds a third condition, but that is presumably the intent of the change and should give the desired results once the first condition is corrected.)

Jim
--
Jim Trigg, Lord High Everything Else     O-   /\
Hostmaster, Huie Kin family website           \ /  ASCII RIBBON CAMPAIGN
Verger and System Administrator,               X   HELP CURE HTML MAIL
All Saints Church - Sharon Chapel             / \
Re: Problem with IBM xSeries 226 with ServeRAID 6i+ using FreeBSD 5.3
Scott Long wrote:
> J. Buck Caldwell wrote:
> > Scott Long wrote:
> > > J. Buck Caldwell wrote:
> > > > Any help would be appreciated. Requests for more detail will be answered promptly.
> > >
> > > It sounds very much like an interrupt routing problem. Have you tried the 5.4-BETA CD?
> > >
> > > Scott
> >
> > No help. Booting normally brings me to the same point - ips0: resetting adapter, may take 5 minutes - then nothing. System just hangs. Also, after that point, hitting the power button will not power-off the machine (of course, holding it in will). Anytime before that point, hitting the power button will turn the power off instantly.
>
> One more test, would you mind trying the 6.0-CURRENT-SNAP002 snapshot? If that works then that gives us a target to shoot for with 5.4.
>
> Scott

Well, some excellent news - 6.0-CURRENT-SNAP002 works perfectly. Or at least, it boots - I'm doing an install now, but I wanted to let you know right away. Anything we could do to get this system running properly under 5.4, let me know - because we're paying lease payments on two servers that we can't use at the moment, until we can boot FreeBSD on them. (damn IBM and their end-of-life cycles)
Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE
On Wed, Mar 30, 2005 at 12:07:22PM -0500, Jim Trigg wrote:
> On Wed, March 30, 2005 12:43 am, Karl Denninger said:
> > Here's the diff and some thoughts
> > [snip, including first diff]
> > 241,243c244,249
> > < /* if reinit succeeded and retries still permit, reinject request */
> > < if (ata_reinit(ch) && request->retries-- > 0 && request->device->param){
> > ---
> > > /*
> > >  * if reinit succeeds, retries still permit and device didn't
> > >  * get removed by the reinit, reinject request
> > >  */
> > > if (!ata_reinit(ch) && request->retries-- > 0 && request->device->param){
> > [snip third diff]
> > The second diff is really just a formatting and comment change.. you're certainly correct that the changes are small! :-)
>
> No, it is not -- it reverses the sense of the first condition. At first glance that is what I would expect to be the core of the problem, but I don't have appropriate hardware to test on. (It also adds a third condition, but that is presumably the intent of the change and should give the desired results once the first condition is corrected.)
>
> Jim
> --
> Jim Trigg, Lord High Everything Else     O-   /\
> Hostmaster, Huie Kin family website           \ /  ASCII RIBBON CAMPAIGN
> Verger and System Administrator,               X   HELP CURE HTML MAIL
> All Saints Church - Sharon Chapel             / \

You're correct of course - I missed the !. Too darn late at night...

I've got my sandbox up and the world rebuilt so it's consistent with the machine that's having the problem - will add a SATA disk and see if I can duplicate this and then figure out what's going on here this afternoon - and hopefully how to fix it.

--
--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net My home on the net - links to everything I do!
http://scubaforum.org Your UNCENSORED place to talk about DIVING!
http://www.spamcuda.net SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
http://genesis3.blogspot.com Musings Of A Sentient Mind
Re: boot0: To beep or not to beep?
On Wed, Mar 30, 2005 at 05:00:31PM +0200, Ulrich Spoerlein wrote:
> I really don't know any assembler, but reading /sys/boot/i386/boot0/boot0.S leads me to believe that there should be a beep on every boot. However, I certainly don't want any beeps from boot0, can I comment out the first two lines of main.10?

You are correct. This is the patch I use for a laptop with an untunable (and loud) bell volume.

Marc

--- /usr/src/sys/boot/i386/boot0/boot0.S        Thu Jun 17 14:02:25 2004
+++ /usr/src/sys/boot/i386/boot0/boot0.S        Thu Oct  7 13:23:08 2004
@@ -38,7 +38,6 @@
                .set KEY_F1,0x3b                # F1 key scan code
                .set KEY_1,0x02                 # #1 key scan code
 
-               .set ASCII_BEL,0x07             # ASCII code for BEL
                .set ASCII_CR,0x0D              # ASCII code for CR
 
 /*
@@ -203,9 +202,7 @@
 /*
  * Start of input loop. Beep and take note of time
  */
-main.10:       movb $ASCII_BEL,%al             # Signal
-               callw putchr                    # beep!
-               xorb %ah,%ah                    # BIOS: Get
+main.10:       xorb %ah,%ah                    # BIOS: Get
                int $0x1a                       # system time
                movw %dx,%di                    # Ticks when
                addw _TICKS(%bp),%di            # timeout
Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE
On 3/30/2005 9:39 AM Karl Denninger wrote:
> On Wed, Mar 30, 2005 at 12:07:22PM -0500, Jim Trigg wrote:
> > On Wed, March 30, 2005 12:43 am, Karl Denninger said:
> > > Here's the diff and some thoughts
> > > [snip, including first diff]
> > > 241,243c244,249
> > > < /* if reinit succeeded and retries still permit, reinject request */
> > > < if (ata_reinit(ch) && request->retries-- > 0 && request->device->param){
> > > ---
> > > > /*
> > > >  * if reinit succeeds, retries still permit and device didn't
> > > >  * get removed by the reinit, reinject request
> > > >  */
> > > > if (!ata_reinit(ch) && request->retries-- > 0 && request->device->param){
> > > [snip third diff]
> > > The second diff is really just a formatting and comment change.. you're certainly correct that the changes are small! :-)
> >
> > No, it is not -- it reverses the sense of the first condition. At first glance that is what I would expect to be the core of the problem, but I don't have appropriate hardware to test on. (It also adds a third condition, but that is presumably the intent of the change and should give the desired results once the first condition is corrected.)
>
> You're correct of course - I missed the !. Too darn late at night...
>
> I've got my sandbox up and the world rebuilt so its consistent with the machine that's having the problem - will add a SATA disk and see if I can duplicate this and then figure out what's going on here this afternoon - and hopefully how to fix it.

I missed the beginning of this thread and apologize if my question has already been covered. But can you tell me if this issue might be the reason my PC locks up intermittently?

I have whatever cheap card came with a Maxtor 160 GB SATA drive installed in this machine, and the PC ran fine with Windows. Now I'm trying to install FBSD from the 5.4-BETA ISO I downloaded from the ftp site. The PC runs POST fine and always boots from the CD to the boot menu. After picking the default option 1 (normal boot), the PC locks up anywhere from the dmesg output to sysinstall actually beginning to install the base package after doing the fdisk and disklabel stuff.

Should I download 5.3-RELEASE and try installing from that?

Thanks,
Drew
Problems with AMD64 and 8 GB RAM?
I've recently acquired an AMD64 box (dual Opteron 242, SiS [EMAIL PROTECTED] motherboard (http://www.msi.com.tw/program/products/server/svr/pro_svr_detail.php?UID=484). See below for more details). I find it very unstable running with 8 GB memory, though 4 GB are not a problem. At first I thought it was the onboard peripherals, but after disabling them it still persisted. What's unstable? I only once got it through the boot process. Running a 5.3-RELEASE i386 kernel it panics, though I haven't investigated the panic (yet), since I'm not interested in the i386 kernel. The amd64 5.4-PRERELEASE kernel just hangs/freezes. When the peripherals are enabled, it's after probing the onboard NIC (bge) and before probing SATA (no drives present). I've done a verbose boot, of course, but no additional information is present. The NIC is recognized, and that's all. Without the peripherals, but with a 3Com 3c905 PCI NIC, it continues beyond this point, but doesn't enable the NIC. I don't have dmesg output for these attempts, so I can't produce the exact message, and I suspect it's not important. It continues until trying to mount NFS file systems, where it hangs for obvious reasons. Pressing ^C causes the system to either panic (and be unable to dump because I don't have that much swap) or just hang. None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory. I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask: 1. Has anybody else seen this problem? 2. Has anybody else used this hardware configuration and *not* seen this problem? 3. Where should I look next? I'm attaching the (non-verbose) dmesg from a successful boot. Greg -- See complete headers for address and phone numbers. 
Mar 30 14:17:16 obelix kernel: FreeBSD 5.4-PRERELEASE #0: Tue Mar 22 04:02:17 UTC 2005
Mar 30 14:17:16 obelix kernel: [EMAIL PROTECTED]:/usr/obj/src/FreeBSD/OBELIX/src/sys/OBELIX
Mar 30 14:17:16 obelix kernel: Timecounter i8254 frequency 1193182 Hz quality 0
Mar 30 14:17:16 obelix kernel: CPU: AMD Opteron(tm) Processor 242 (1603.65-MHz K8-class CPU)
Mar 30 14:17:16 obelix kernel: Origin = AuthenticAMD Id = 0xf5a Stepping = 10
Mar 30 14:17:16 obelix kernel: Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
Mar 30 14:17:16 obelix kernel: AMD Features=0xe0500800<SYSCALL,NX,MMX+,LM,3DNow+,3DNow>
Mar 30 14:17:16 obelix kernel: real memory = 3756916736 (3582 MB)
Mar 30 14:17:16 obelix kernel: avail memory = 3623907328 (3456 MB)
Mar 30 14:17:16 obelix kernel: ACPI APIC Table: VIAK8 AWRDACPI
Mar 30 14:17:16 obelix kernel: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
Mar 30 14:17:16 obelix kernel: cpu0 (BSP): APIC ID: 0
Mar 30 14:17:16 obelix kernel: cpu1 (AP): APIC ID: 1
Mar 30 14:17:16 obelix kernel: ioapic0: Changing APIC ID to 2
Mar 30 14:17:16 obelix kernel: ioapic0 Version 0.3 irqs 0-23 on motherboard
Mar 30 14:17:16 obelix kernel: acpi0: VIAK8 AWRDACPI on motherboard
Mar 30 14:17:16 obelix kernel: acpi0: Power Button (fixed)
Mar 30 14:17:16 obelix kernel: Timecounter ACPI-fast frequency 3579545 Hz quality 1000
Mar 30 14:17:16 obelix kernel: acpi_timer0: 24-bit timer at 3.579545MHz port 0x4008-0x400b on acpi0
Mar 30 14:17:16 obelix kernel: cpu0: ACPI CPU on acpi0
Mar 30 14:17:16 obelix kernel: cpu1: ACPI CPU on acpi0
Mar 30 14:17:16 obelix kernel: acpi_button0: Power Button on acpi0
Mar 30 14:17:16 obelix kernel: pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
Mar 30 14:17:16 obelix kernel: pci0: ACPI PCI bus on pcib0
Mar 30 14:17:16 obelix kernel: pcib1: PCI-PCI bridge at device 1.0 on pci0
Mar 30 14:17:16 obelix kernel: pci1: PCI bus on pcib1
Mar 30 14:17:16 obelix kernel: pci1: display, VGA at device 0.0 (no driver attached)
Mar 30 14:17:16 obelix kernel: xl0: 3Com 3c905C-TX Fast Etherlink XL port 0xd000-0xd07f mem 0xfb00-0xfb7f irq 18 at device 7.0 on pci0
Mar 30 14:17:16 obelix kernel: miibus0: MII bus on xl0
Mar 30 14:17:16 obelix kernel: xlphy0: 3c905C 10/100 internal PHY on miibus0
Mar 30 14:17:16 obelix kernel: xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
Mar 30 14:17:16 obelix kernel: xl0: Ethernet address: 00:50:da:cf:17:d3
Mar 30 14:17:16 obelix kernel: atapci0: VIA 8237 UDMA133 controller port 0xd400-0xd40f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.0 on pci0
Mar 30 14:17:16 obelix kernel: ata0: channel #0 on atapci0
Mar 30 14:17:16 obelix kernel: ata1: channel #1 on atapci0
Mar 30 14:17:16 obelix kernel: uhci0: VIA 83C572 USB controller port 0xd800-0xd81f irq 21 at device 16.0 on pci0
Mar 30 14:17:16 obelix kernel: usb0: VIA 83C572 USB controller on uhci0
Mar 30 14:17:16 obelix kernel: usb0: USB revision 1.0
Mar 30 14:17:16 obelix kernel: uhub0: VIA UHCI root hub, class 9/0, rev
Re: Problems with AMD64 and 8 GB RAM?
Greg 'groggy' Lehey wrote:
> I've recently acquired an AMD64 box (dual Opteron 242, SiS [EMAIL PROTECTED] motherboard (http://www.msi.com.tw/program/products/server/svr/pro_svr_detail.php?UID=484). See below for more details). I find it very unstable running with 8 GB memory, though 4 GB are not a problem. At first I thought it was the onboard peripherals, but after disabling them it still persisted.
>
> What's unstable? I only once got it through the boot process. Running a 5.3-RELEASE i386 kernel it panics, though I haven't investigated the panic (yet), since I'm not interested in the i386 kernel. The amd64 5.4-PRERELEASE kernel just hangs/freezes. When the peripherals are enabled, it's after probing the onboard NIC (bge) and before probing SATA (no drives present). I've done a verbose boot, of course, but no additional information is present. The NIC is recognized, and that's all.
>
> Without the peripherals, but with a 3Com 3c905 PCI NIC, it continues beyond this point, but doesn't enable the NIC. I don't have dmesg output for these attempts, so I can't produce the exact message, and I suspect it's not important. It continues until trying to mount NFS file systems, where it hangs for obvious reasons. Pressing ^C causes the system to either panic (and be unable to dump because I don't have that much swap) or just hang.
>
> None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory.
>
> I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask:
>
> 1. Has anybody else seen this problem?
> 2. Has anybody else used this hardware configuration and *not* seen this problem?
> 3. Where should I look next?
>
> I'm attaching the (non-verbose) dmesg from a successful boot.
>
> Greg
> --
> See complete headers for address and phone numbers.

5.3-RELEASE has a lot of problems with >4GB due to busdma issues. Those should no longer be an issue in RELENG_5, including 5.4-PRE. You'll need to dig in and provide some more details, I guess. I have an HDAMA dual Opteron system that behaves fine now with 8GB of RAM, so your problem might lie with particular hardware and/or drivers.

Scott
Re: Problems with AMD64 and 8 GB RAM?
On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote:
> None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory.
>
> I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask:
>
> 1. Has anybody else seen this problem?
> 2. Has anybody else used this hardware configuration and *not* seen this problem?
> 3. Where should I look next?

Have you run sysutils/memtest86 with the 8 GB? I had 4 bad out of 12 tested where the DIMMs were Crucial PC2700 2GB Reg. ECC DIMMs.

--
Steve
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday, 30 March 2005 at 15:30:37 -0700, Scott Long wrote:
> Greg 'groggy' Lehey wrote:
>> I've recently acquired an AMD64 box ...
>>
>> What's unstable? ... The amd64 5.4-PRERELEASE kernel just hangs/freezes.
>
> 5.3-RELEASE has a lot of problems with >4GB due to busdma issues. Those should no longer be an issue in RELENG_5, including 5.4-PRE.

They appear to be.

>> I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask:
>>
>> 3. Where should I look next?
>
> You'll need to dig in and provide some more details, I guess.

Yes, my guess too.

> I have an HDAMA dual Opteron system that behaves fine now with 8GB of RAM, so your problem might lie with particular hardware and/or drivers.

As I described, it doesn't appear to be the drivers.

Greg
--
See complete headers for address and phone numbers.
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday, 30 March 2005 at 14:35:46 -0800, Steve Kargl wrote:
> On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote:
>> None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory.
>>
>> I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask:
>>
>> 1. Has anybody else seen this problem?
>> 2. Has anybody else used this hardware configuration and *not* seen this problem?
>> 3. Where should I look next?
>
> Have you run sysutils/memtest86 with the 8 GB?

Heh. Difficult when the system doesn't run.

> I had 4 bad out of 12 tested where the DIMMs were Crucial PC2700 2GB Reg. ECC DIMMs.

OK, this makes sense. It might also explain why the 4 GB configuration only recognizes 3.5 GB.

Greg
--
See complete headers for address and phone numbers.
Re: Problems with AMD64 and 8 GB RAM?
On Thursday, 31 March 2005 at 10:32:33 +0930, Daniel O'Connor wrote:
> On Thu, 31 Mar 2005 08:14, Greg 'groggy' Lehey wrote:
>>> Have you run sysutils/memtest86 with the 8 GB?
>>
>> Heh. Difficult when the system doesn't run.
>
> You could try http://www.memtest86.com although that doesn't do >4Gb :(

I'm pretty sure it's not the memory. I've tried each pair individually, and it's only when they're both in there together that it's a problem. And yes, I've tried them in each pair of slots.

I now have dmesg output for verbose boots with both 4 GB and 8 GB memory. The complete dmesg output is at http://www.lemis.com/grog/Images/20050331/dmesg.4GB and http://www.lemis.com/grog/Images/20050331/dmesg.8GB. The diffs are at http://www.lemis.com/grog/Images/20050331/dmesg.diff. Here's a truncated summary:

--- dmesg.4GB Thu Mar 31 10:47:16 2005
+++ dmesg.8GB Thu Mar 31 10:52:32 2005
@@ -64,6 +10,7 @@
 SMAP type=01 base=0010 len=dfde
 SMAP type=03 base=dfee3000 len=d000
 SMAP type=04 base=dfee len=3000
+SMAP type=01 base=0001 len=0001
 Copyright (c) 1992-2005 The FreeBSD Project.
@@ -75,7 +22,7 @@
 Calibrating clock(s) ... i8254 clock: 1193283 Hz
 CLK_USE_I8254_CALIBRATION not specified - using default frequency
 Timecounter i8254 frequency 1193182 Hz quality 0
-Calibrating TSC clock ... TSC clock: 1603647337 Hz
+Calibrating TSC clock ... TSC clock: 1603647241 Hz
 CPU: AMD Opteron(tm) Processor 242 (1603.65-MHz K8-class CPU)
 Origin = AuthenticAMD Id = 0xf5a Stepping = 10
 Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
@@ -90,11 +37,12 @@
 L2 4KB data TLB: 512 entries, 4-way associative
 L2 4KB instruction TLB: 512 entries, 4-way associative
 L2 unified cache: 1024 kbytes, 64 bytes/line, 1 lines/tag, 16-way associative
-real memory = 3756916736 (3582 MB)
+real memory = 8589934592 (8192 MB)

This is interesting in that it has gained 4.5 GB.

 Physical memory chunk(s):
 0x1000 - 0x0009bfff, 634880 bytes (155 pages)
-0x00a05000 - 0xd95b7fff, 3636146176 bytes (887731 pages)
-avail memory = 3623817216 (3455 MB)
+0x00a09000 - 0xdfed, 3746394112 bytes (914647 pages)
+0x0001 - 0x0001f0fc, 4043112448 bytes (987088 pages)
+avail memory = 177600 (7416 MB)
 ACPI APIC Table: VIAK8 AWRDACPI
 APIC ID: physical 0, logical 0:0
 APIC ID: physical 1, logical 0:1
@@ -138,41 +86,12 @@
 ioapic0: intpin 9 trigger: level
 ioapic0: intpin 9 polarity: low
 lapic0: Routing NMI -> LINT1
-A IRQ 3 (edge, high)
-ioapic0: intpin 4 -> ISA IRQ 4 (edge, high)
-ioapic0: intpin 5 -> ISA IRQ 5 (edge, high)
-ioapic0: intpin 6 -> ISA IRQ 6 (edge, high)
-ioapic0: intpin 7 -> ISA IRQ 7 (edge, high)
-ioapic0: intpin 8 -> ISA IRQ 8 (edge, high)
-ioapic0: intpin 9 -> ISA IRQ 9 (edge, high)
-ioapic0: intpin 10 -> ISA IRQ 10 (edge, high)
-ioapic0: intpin 11 -> ISA IRQ 11 (edge, high)
-ioapic0: intpin 12 -> ISA IRQ 12 (edge, high)
-ioapic0: intpin 13 -> ISA IRQ 13 (edge, high)
-ioapic0: intpin 14 -> ISA IRQ 14 (edge, high)
-ioapic0: intpin 15 -> ISA IRQ 15 (edge, high)
-ioapic0: intpin 16 -> PCI IRQ 16 (level, low)
-ioapic0: intpin 17 -> PCI IRQ 17 (level, low)
-ioapic0: intpin 18 -> PCI IRQ 18 (level, low)
-ioapic0: intpin 19 -> PCI IRQ 19 (level, low)
-ioapic0: intpin 20 -> PCI IRQ 20 (level, low)
-ioapic0: intpin 21 -> PCI IRQ 21 (level, low)
-ioapic0: intpin 22 -> PCI IRQ 22 (level, low)
-ioapic0: intpin 23 -> PCI IRQ 23 (level, low)
-MADT: Interrupt override: source 0, irq 2
-ioapic0: Routing IRQ 0 -> intpin 2
-ioapic0: intpin 2 trigger: edge
-ioapic0: intpin 2 polarity: high
-MADT: Interrupt override: source 9, irq 9
-ioapic0: intpin 9 trigger: level
-ioapic0: intpin 9 polarity: low
-lapic0: Routing NMI -> LINT1

This stuff is puzzling. I suppose it could be related. Does anybody have any ideas?

 lapic0: LINT1 trigger: edge
 lapic0: LINT1 polarity: high
 lapic1: Routing NMI -> LINT1
 lapic1: LINT1 trigger: edge
 lapic1: LINT1 polarity: high
-ioapic0 Version 0.3 irqs 0-23 on motherboard
+ioapic0 Version 0.0 irqs 0-23 on motherboard
 cpu0 BSP:
 ID: 0x VER: 0x00040010 LDR: 0x0100 DFR: 0x0fff
 lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff

The last lines in the 8 GB dmesg are:

bge0: Broadcom BCM5705 Gigabit Ethernet, ASIC rev. 0x3003 mem 0xfa00-0xfa00 irq 16 at device 11.0 on pci0
bge0: Reserved 0x1 bytes for rid 0x10 type 3 at 0xfa00

They're identical in each probe.

Greg
--
See complete headers for address and phone numbers.
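The byte, page and MB figures quoted in the dmesg diff can be cross-checked with a little arithmetic. What follows is only a sketch, assuming 4 KB pages and assuming dmesg truncates byte counts to whole MB (both are assumptions, although they agree with every untruncated figure in the message); the function and variable names are made up for illustration.

```python
PAGE_SIZE = 4096          # assumption: 4 KB pages
MIB = 1024 * 1024


def to_mb(nbytes):
    """Truncate a byte count to whole megabytes, as the dmesg lines appear to."""
    return nbytes // MIB


# "real memory" lines from the 4 GB and 8 GB boots.
real_4gb = 3756916736
real_8gb = 8589934592

# (bytes, pages) pairs copied from the "Physical memory chunk(s)" lines.
chunks = [
    (634880, 155),
    (3636146176, 887731),   # 4 GB boot
    (3746394112, 914647),   # 8 GB boot, chunk below 4 GB
    (4043112448, 987088),   # 8 GB boot, chunk above 4 GB
]

# Every chunk size should be an exact multiple of the page size.
for nbytes, pages in chunks:
    assert nbytes == pages * PAGE_SIZE

gained = real_8gb - real_4gb

print(to_mb(real_4gb), to_mb(real_8gb))
print(round(gained / 2**30, 2), "GiB gained")
```

The chunk sizes are internally consistent, and the difference between the two "real memory" values comes to roughly 4.5 GiB, matching the remark in the message.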
Re: Problems with AMD64 and 8 GB RAM?
On Thu, 31 Mar 2005 11:24, Greg 'groggy' Lehey wrote:
> I'm pretty sure it's not the memory. I've tried each pair individually, and it's only when they're both in there together that it's a problem. And yes, I've tried them in each pair of slots.

Could be a marginal timing issue. You could try winding out the RAM timing slightly.

--
Daniel O'Connor software and network engineer for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there are so many of them to choose from. -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
Re: ALTQ, pf and VLANs
Max, that solution works fine. I have tried it and it works for me. Thanks.

Anyway, do you know of any issues with dropped traffic on VLAN-enabled em0 interfaces while running tcpdump? The average traffic we capture is about 10-20 Mbit/s, and while tcpdump is running we get almost 90% packet loss on the interfaces. Any clue?

Marko

Max Laier wrote:
> On Tuesday 29 March 2005 20:28, Marko uk wrote:
>> Will that be fixed in 5.4 ? Right now, today it won't work without a patch.
>>
>> pfctl: vlan0: driver does not support altq
>
> Please see: http://lists.freebsd.org/mailman/htdig/freebsd-net/2005-February/006456.html
>
> If you still can't live without ALTQ rate-limiting on VLAN submit a PR and throw it my way.

--
Private: http://cuk.nu
Sports: http://www.cuk.nu
Slovenian FreeBSD mirror admin http://www2.si.freebsd.org
Work @ http://www.xenya.si
Re: FreeBSD 5.3 SMP freezes with MySQL 4.1
On Wed, 30 Mar 2005, Young Lee wrote:
> the system is totally dead with the pannic message, and have to hard reset to reboot. even i reboot the server, it will crash in several minutes, because of hundreds of request is coming.

You get a panic? You didn't say that before!

> by refer to Klein's configuration and turn debug.mpsafenet=0 in /boot/loader.conf, the server is stable so far, and it last 20 hours.

Can't say I've had problems with mpsafenet here.

--
Doug White | FreeBSD: The Power to Serve
[EMAIL PROTECTED] | www.FreeBSD.org
Re: Problems with AMD64 and 8 GB RAM?
Greg 'groggy' Lehey wrote:
> I'm pretty sure it's not the memory. I've tried each pair individually, and it's only when they're both in there together that it's a problem. And yes, I've tried them in each pair of slots.

I'm sure you have checked this as well, but just for completeness: they aren't different pairs? Like one pair is single-sided and the other double-sided (I had some nasty and obscure problems with such a combination myself)?

mkb.
Re: buggy ATA controller: I can install 4.11, but not 5.3 !?!
On Wed, 30 Mar 2005, Rob wrote:
>> So its said again .. Use a different ATA controller. Please. The RZ1000 series should not be used under ANY CIRCUMSTANCES WHATSOEVER.
>
> I've got not really a choice here. This is a 10 year old Pentium-1 PC, which I would like to use. I'm not a hardware expert, but your advice would mean to throw this PC away? Or can I replace the ATA controller easily myself?

Any PCI ATA controller will work, and that machine probably has ISA slots so you could use an ISA ATA controller if you were desperate. I put my P90 out to pasture since once the 2GB disk dies in it I'll have to use flash media to get a disk the onboard controller will support. So I replaced it with a small fanless Soekris :)

> I have already put this machine under moderate load and recompiled/installed a new world/kernel without any problems. Apparently 4.11 knows how to bypass the flaws of this buggy ATA controller. At least that's my impression. Would 5.3 or 5.4 do this as-good here?

AFAIK the RZ1000 bug was that it would corrupt data to the slave channel if the primary channel was also active. If you only have one device then you may not be able to reproduce it.

Do you have verbose boot output from the non-working 5.x boot?

--
Doug White | FreeBSD: The Power to Serve
[EMAIL PROTECTED] | www.FreeBSD.org
Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE - UPDATE
On Tue, Mar 29, 2005 at 11:43:18PM -0600, Karl Denninger wrote:
Here's the diff and some thoughts

Fs:/usr/src/sys/dev/ata cvs diff -r 1.32.2.5 ata-queue.c
Index: ata-queue.c
===
RCS file: /usr/cvs/src/sys/dev/ata/ata-queue.c,v
retrieving revision 1.32.2.5
retrieving revision 1.32.2.6
diff -r1.32.2.5 -r1.32.2.6
30c30
< __FBSDID("$FreeBSD: src/sys/dev/ata/ata-queue.c,v 1.32.2.5 2004/10/24 09:27:37 sos Exp $");
---
> __FBSDID("$FreeBSD: src/sys/dev/ata/ata-queue.c,v 1.32.2.6 2005/03/23 04:50:26 mdodd Exp $");
218a219,221
> if (!dumping)
>     callout_reset(&request->callout, request->timeout * hz,
>                   (timeout_t*)ata_timeout, request);
241,243c244,249
< /* if reinit succeeded and retries still permit, reinject request */
< if (ata_reinit(ch) && request->retries-- > 0 && request->device->param){
---
> /*
>  * if reinit succeeds, retries still permit and device didn't
>  * get removed by the reinit, reinject request
>  */
> if (!ata_reinit(ch) && request->retries-- > 0 && request->device->param){
245a252
> request->donecount = 0;

Removing the second change (changing the test on the ata_reinit) appears to prevent both the destabilization and the actual requeue from taking place (that is, you get the immediate disconnect from the array when the error occurs; therefore whatever is causing the destabilization doesn't happen.)

I will attempt to remove the first delta alone (and put back the second), but from a quick perusal of the code I doubt this will make a material change.

--
--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net My home on the net - links to everything I do!
http://scubaforum.org Your UNCENSORED place to talk about DIVING!
http://www.spamcuda.net SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
http://genesis3.blogspot.com Musings Of A Sentient Mind
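To make the effect of the second delta easier to see, here is a small Python model of the two `if` tests from the diff. It is a sketch, not driver code: it assumes `ata_reinit()` returns 0 on success and non-zero on failure (which is what the new comment together with the `!ata_reinit(ch)` test implies), and the function names `requeue_old` and `requeue_new` are invented for illustration. Each function returns the requeue decision and the retry counter as C's short-circuit `&&` and post-decrement would leave it.

```python
def requeue_old(reinit_rc, retries, device_present):
    """Model of rev 1.32.2.5: ata_reinit(ch) && retries-- > 0 && device->param."""
    if not reinit_rc:
        # First operand is false, so && short-circuits: retries is untouched.
        return False, retries
    had_retries = retries > 0
    retries -= 1            # post-decrement runs once the second operand is evaluated
    return (had_retries and device_present), retries


def requeue_new(reinit_rc, retries, device_present):
    """Model of rev 1.32.2.6: !ata_reinit(ch) && retries-- > 0 && device->param."""
    if reinit_rc:
        # Reinit failed (assumed non-zero return): no requeue, retries untouched.
        return False, retries
    had_retries = retries > 0
    retries -= 1
    return (had_retries and device_present), retries


# Reinit succeeded (rc 0, by assumption), two retries left, device still attached.
print(requeue_old(0, 2, True))
print(requeue_new(0, 2, True))
```

Under that assumption, the old test never requeues after a successful reinit, which would be consistent with the immediate disconnect described above when the second delta is removed.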
Re: Problems with AMD64 and 8 GB RAM?
On Thursday, 31 March 2005 at 5:54:17 +0200, Matthias Buelow wrote:
> Greg 'groggy' Lehey wrote:
>> I'm pretty sure it's not the memory. I've tried each pair individually, and it's only when they're both in there together that it's a problem. And yes, I've tried them in each pair of slots.
>
> I'm sure you have checked this aswell but just for completeness, they aren't different pairs? Like one pair is single-sided and the other double-sided (had some nasty and obscure problems with such a combination myself)?

No, they're all the same.

Greg
--
See complete headers for address and phone numbers.
Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE - UPDATE
Nevermind on my update - somehow cvs got the wrong tag (5.3-RELEASE) on my sandbox build. Doing it again :)

--
--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net My home on the net - links to everything I do!
http://scubaforum.org Your UNCENSORED place to talk about DIVING!
http://www.spamcuda.net SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
http://genesis3.blogspot.com Musings Of A Sentient Mind
Re: Problems with AMD64 and 8 GB RAM?
On Mar 30, 2005, at 8:54 PM, Greg 'groggy' Lehey wrote:
>  lapic0: LINT1 trigger: edge
>  lapic0: LINT1 polarity: high
>  lapic1: Routing NMI -> LINT1
>  lapic1: LINT1 trigger: edge
>  lapic1: LINT1 polarity: high
> -ioapic0 Version 0.3 irqs 0-23 on motherboard
> +ioapic0 Version 0.0 irqs 0-23 on motherboard
>  cpu0 BSP:
>  ID: 0x VER: 0x00040010 LDR: 0x0100 DFR: 0x0fff
>  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff

This shows that in the - case the APIC is broken somehow (0.0 isn't a valid I/O APIC version). It would seem that the system has mapped RAM over top of the I/O APIC perhaps? It would be interesting to see the contents of your MADT to see if it's trying to use a 64-bit PA for your APIC. The local APIC portion seems ok though.

--
John Baldwin [EMAIL PROTECTED] http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve = http://www.FreeBSD.org
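For readers wondering where a string like "Version 0.3 irqs 0-23" comes from, the following Python sketch decodes an I/O APIC version register. It assumes the usual register layout (version in bits 0-7, highest redirection entry in bits 16-23) and assumes the kernel prints the version byte as two nibbles; the register values and the RAM range below are invented for illustration and are not taken from the logs. The address 0xFEC00000 is the conventional default I/O APIC base, used here as an assumption since the thread does not show the MADT.

```python
IOAPIC_DEFAULT_BASE = 0xFEC00000   # conventional default; the real value comes from the MADT


def decode_ioapic_version(reg):
    """Split a 32-bit version register into (major, minor, max_redirection_entry)."""
    version = reg & 0xFF
    max_redir = (reg >> 16) & 0xFF
    return version >> 4, version & 0xF, max_redir


def describe(reg):
    major, minor, max_redir = decode_ioapic_version(reg)
    return f"Version {major}.{minor} irqs 0-{max_redir}"


def covers(start, length, addr):
    """True if the physical range [start, start + length) contains addr."""
    return start <= addr < start + length


# Hypothetical register values that would reproduce the two dmesg lines.
print(describe(0x00170003))
print(describe(0x00170000))

# Hypothetical RAM chunk starting at 4 GiB: it does not cover the default base.
print(covers(0x100000000, 4 * 2**30, IOAPIC_DEFAULT_BASE))
```

A version byte of zero is what one would expect to read back if the register access did not reach a real I/O APIC, which is the scenario being suggested in the message above.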
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday, 30 March 2005 at 23:01:03 -0500, John Baldwin wrote:
> On Mar 30, 2005, at 8:54 PM, Greg 'groggy' Lehey wrote:
>>  lapic0: LINT1 trigger: edge
>>  lapic0: LINT1 polarity: high
>>  lapic1: Routing NMI -> LINT1
>>  lapic1: LINT1 trigger: edge
>>  lapic1: LINT1 polarity: high
>> -ioapic0 Version 0.3 irqs 0-23 on motherboard
>> +ioapic0 Version 0.0 irqs 0-23 on motherboard
>>  cpu0 BSP:
>>  ID: 0x VER: 0x00040010 LDR: 0x0100 DFR: 0x0fff
>>  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
>
> This shows that in the - case the APIC is broken somehow (0.0 isn't a valid I/O APIC version).

You mean the + case, I suppose. Yes, that's what I suspected.

> It would seem that the system has mapped RAM over top of the I/O APIC perhaps?

That's what I suspected too, but imp doesn't think so.

> It would be interesting to see the contents of your MADT to see if it's trying to use a 64-bit PA for your APIC.

Any suggestions about how to do so?

Greg
--
See complete headers for address and phone numbers.
Re: Problems with AMD64 and 8 GB RAM?
Greg 'groggy' Lehey wrote:
> On Wednesday, 30 March 2005 at 23:01:03 -0500, John Baldwin wrote:
>> On Mar 30, 2005, at 8:54 PM, Greg 'groggy' Lehey wrote:
>>>  lapic0: LINT1 trigger: edge
>>>  lapic0: LINT1 polarity: high
>>>  lapic1: Routing NMI -> LINT1
>>>  lapic1: LINT1 trigger: edge
>>>  lapic1: LINT1 polarity: high
>>> -ioapic0 Version 0.3 irqs 0-23 on motherboard
>>> +ioapic0 Version 0.0 irqs 0-23 on motherboard
>>>  cpu0 BSP:
>>>  ID: 0x VER: 0x00040010 LDR: 0x0100 DFR: 0x0fff
>>>  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
>>
>> This shows that in the - case the APIC is broken somehow (0.0 isn't a valid I/O APIC version).
>
> You mean the + case, I suppose. Yes, that's what I suspected.
>
>> It would seem that the system has mapped RAM over top of the I/O APIC perhaps?
>
> That's what I suspected too, but imp doesn't think so.

I'd be more inclined to believe that there is an erroneous mapping by the OS, not that things are fundamentally broken in hardware. Your SMAP table shows everything correctly. It's becoming hard to break through your preconceived notions here and explain how things actually work.

>> It would be interesting to see the contents of your MADT to see if it's trying to use a 64-bit PA for your APIC.
>
> Any suggestions about how to do so?

man acpidump
Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE - UPDATE (real this time)
Ok, here's what I've got so far.

Pulling the SECOND delta gets rid of both the stability problem AND the requeue fix (i.e. getting rid of that defeats the essential purpose of the deltas in the first place.)

Removing the FIRST delta, which is:

218a219,221
> if (!dumping)
>     callout_reset(&request->callout, request->timeout * hz,
>                   (timeout_t*)ata_timeout, request);

appears to get rid of the crashes while not harming data integrity OR the requeueing.

With this one out the errors (I was able to generate over a dozen retries in less than 10 minutes doing a large file copy with a 3-disk RAID 1 array comprised of 2 SATA disks, 1 UDMA100) still occur, BUT they are retried (apparently successfully.)

I copied the source tree to /usr/src2 and took the errors. I am now attempting to buildworld off it - so far, so good (about 1/4 of the way through - if there was data corruption it should have failed by now)

Also, the sandbox system is still up. That also is a major improvement.

I will let this buildworld complete, and if it is successful (proving that the retried errors didn't actually result in corrupted files!), will put this same change (pulling the first delta only) on the production system, rebuild the other RAID disks (I had to pull the cartridges from there to use them on the sandbox) and see if intentionally provoking the same error there allows the system to remain stable once the errors start showing up.

Again, I will not have a final determination on this until late tomorrow, but at first blush pulling the first delta appears to fix the stability issue.

Further update tomorrow as soon as I have it

--
--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net My home on the net - links to everything I do!
http://scubaforum.org Your UNCENSORED place to talk about DIVING!
http://www.spamcuda.net SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
http://genesis3.blogspot.com Musings Of A Sentient Mind
Re: Problems with AMD64 and 8 GB RAM?
[gratuitous empty lines removed]

On Wednesday, 30 March 2005 at 21:28:36 -0700, Scott Long wrote:
> Greg 'groggy' Lehey wrote:
>> On Wednesday, 30 March 2005 at 23:01:03 -0500, John Baldwin wrote:
>>> On Mar 30, 2005, at 8:54 PM, Greg 'groggy' Lehey wrote:
>>>>  lapic0: LINT1 trigger: edge
>>>>  lapic0: LINT1 polarity: high
>>>>  lapic1: Routing NMI -> LINT1
>>>>  lapic1: LINT1 trigger: edge
>>>>  lapic1: LINT1 polarity: high
>>>> -ioapic0 Version 0.3 irqs 0-23 on motherboard
>>>> +ioapic0 Version 0.0 irqs 0-23 on motherboard
>>>>  cpu0 BSP:
>>>>  ID: 0x VER: 0x00040010 LDR: 0x0100 DFR: 0x0fff
>>>>  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
>>> This shows that in the - case the APIC is broken somehow (0.0 isn't a valid I/O APIC version).
>> You mean the + case, I suppose. Yes, that's what I suspected.
>>> It would seem that the system has mapped RAM over top of the I/O APIC perhaps?
>> That's what I suspected too, but imp doesn't think so.
> I'd be more inclined to believe that there is an erroneous mapping by the OS, not that things are fundamentally broken in hardware.

Agreed. This has been my favourite hypothesis all along. But isn't that what jhb is saying?

> Your SMAP table shows everything correctly. It's becoming hard to break through your pre-concieved notions here and explain how things actually work.

No, there's nothing to break through. I think you're just having problems

1. expressing yourself, and
2. understanding what I'm saying.

I have no preconceived notions. All I can see here is an antagonistic attitude on your part. What's the problem? You'll recall from my first message that I asked for suggestions about how to approach the issue. jhb provided some; you haven't so far. From what you've written, it's unclear whether you disagree with jhb or not. If you do, why? If you don't, what's your point here?

>>> It would be interesting to see the contents of your MADT to see if it's trying to use a 64-bit PA for your APIC.
>> Any suggestions about how to do so?
> man acpidump

How do you run that on a system that won't boot?

Greg
--
See complete headers for address and phone numbers.
Re: Problems with AMD64 and 8 GB RAM?
On 03/30/05 23:14, Greg 'groggy' Lehey wrote:
> On Wednesday, 30 March 2005 at 21:28:36 -0700, Scott Long wrote:
>> Greg 'groggy' Lehey wrote:
>>> On Wednesday, 30 March 2005 at 23:01:03 -0500, John Baldwin wrote:
>>>> On Mar 30, 2005, at 8:54 PM, Greg 'groggy' Lehey wrote:
>>>>>  lapic0: LINT1 trigger: edge
>>>>>  lapic0: LINT1 polarity: high
>>>>>  lapic1: Routing NMI -> LINT1
>>>>>  lapic1: LINT1 trigger: edge
>>>>>  lapic1: LINT1 polarity: high
>>>>> -ioapic0 Version 0.3 irqs 0-23 on motherboard
>>>>> +ioapic0 Version 0.0 irqs 0-23 on motherboard
>>>>>  cpu0 BSP:
>>>>>  ID: 0x VER: 0x00040010 LDR: 0x0100 DFR: 0x0fff
>>>>>  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
>>>> This shows that in the - case the APIC is broken somehow (0.0 isn't a valid I/O APIC version).
>>> You mean the + case, I suppose. Yes, that's what I suspected.
>>>> It would seem that the system has mapped RAM over top of the I/O APIC perhaps?
>>> That's what I suspected too, but imp doesn't think so.
>> I'd be more inclined to believe that there is an erroneous mapping by the OS, not that things are fundamentally broken in hardware.
> Agreed. This has been my favourite hypothesis all along. But isn't that what jhb is saying?
>> Your SMAP table shows everything correctly. It's becoming hard to break through your pre-concieved notions here and explain how things actually work.
> No, there's nothing to break through. I think you're just having problems
>
> 1. expressing yourself, and
> 2. understanding what I'm saying.
>
> I have no preconceived notions. All I can see here is an antagonistic attitude on your part. What's the problem? You'll recall from my first message that I asked for suggestions about how to approach the issue. jhb provided some; you haven't so far. From what you've written, it's unclear whether you disagree with jhb or not. If you do, why? If you don't, what's your point here?
>>>> It would be interesting to see the contents of your MADT to see if it's trying to use a 64-bit PA for your APIC.
>>> Any suggestions about how to do so?
>> man acpidump
> How do you run that on a system that won't boot?

You said the system worked with 4 GB (albeit detecting only 3.5 GB). My perception of this whole ACPI thing is that it is fixed in your BIOS (although it can be overridden by the OS). As such, the amount of RAM you have in the machine shouldn't change acpidump results. Is that not correct?

Jon
Re: Problems with AMD64 and 8 GB RAM?
Jon Noack wrote: On 03/30/05 23:14, Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 21:28:36 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 23:01:03 -0500, John Baldwin wrote: On Mar 30, 2005, at 8:54 PM, Greg 'groggy' Lehey wrote: lapic0: LINT1 trigger: edge lapic0: LINT1 polarity: high lapic1: Routing NMI - LINT1 lapic1: LINT1 trigger: edge lapic1: LINT1 polarity: high -ioapic0 Version 0.3 irqs 0-23 on motherboard +ioapic0 Version 0.0 irqs 0-23 on motherboard cpu0 BSP: ID: 0x VER: 0x00040010 LDR: 0x0100 DFR: 0x0fff lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff This shows that in the - case the APIC is broken somehow (0.0 isn't a valid I/O APIC version). You mean the + case, I suppose. Yes, that's what I suspected. It would seem that the system has mapped RAM over top of the I/O APIC perhaps? That's what I suspected too, but imp doesn't think so. I'd be more inclined to believe that there is an erroneous mapping by the OS, not that things are fundamentally broken in hardware. Agreed. This has been my favourite hypothesis all along. But isn't that what jhb is saying? Your SMAP table shows everything correctly. It's becoming hard to break through your pre-concieved notions here and explain how things actually work. No, there's nothing to break through. I think you're just having problems 1. expressing yourself, and 2. understanding what I'm saying. I have no preconceived notions. All I can see here is an antagonistic attitude on your part. What's the problem? You'll recall from my first message that I asked for suggestions about how to approach the issue. jhb provided some; you haven't so far. From what you've written, it's unclear whether you disagree with jhb or not. If you do, why? If you don't, what's your point here? It would be interesting to see the contents of your MADT to see if it's trying to use a 64-bit PA for your APIC. Any suggestions about how to do so? 
man acpidump

How do you run that on a system that won't boot?

You said the system worked with 4 GB (albeit detecting only 3.5 GB). My perception of this whole ACPI thing is that it is fixed in your BIOS (although it can be overridden by the OS). As such, the amount of RAM you have in the machine shouldn't change acpidump results. Is that not correct? Jon

This is absolutely correct. Scott
Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE
On Wed, Mar 30, 2005 at 10:44:38AM -0800, Drew Tomlinson wrote:

I missed the beginning of this thread and apologize if my question has already been covered. But can you tell me if this issue might be the reason my PC locks up intermittently? I have whatever cheap card came with a Maxtor 160 GB SATA drive installed in this machine and the PC ran fine with Windows. Now I'm trying to install FBSD from the 5.4-BETA ISO I downloaded from the ftp site. The PC runs POST fine and always boots from the CD to the boot menu. After picking the default option 1 (normal boot) the PC locks up anywhere from the dmesg output to sysinstall actually beginning to install the base package after doing the fdisk and disklabel stuff. Should I download 5.3-RELEASE and try installing from that? Thanks, Drew

5.3-RELEASE may lock up too, but in different ways. In a non-redundant disk situation a bogus fatal write error hoses you in extremely bad ways, including possible file or filesystem metadata damage. I would NOT run 5.3 in an attempt to get around this, in that such damage could remain hidden (although not without notice, as the errors will show up on the console!) for quite some time until you discover holes in your files or a critical metadata write craps out and causes a crash - possibly with a corrupted disk that fsck can't fix. Grave danger (to your data) lies down that road.

5.4-PRERELEASE, once the tests are complete (that I'm working on now), the decisions on what to commit are made, and a new ISO is cut, should work - it will bitch (a LOT) about retried writes, but it should work. At least that's what I'm seeing right now - I can provoke the error, but it doesn't kill the machine anymore and it also doesn't appear to corrupt data as the retried write is (by all appearances) successful.

It'll be a couple of days before I can be SURE that what appears to be working right now is in fact stable though, then however long it takes for the back room stuff to get done and new ISOs generated.

BTW it's NOT your hardware at fault here - the same hardware that returns these complaints for me on 5.x works perfectly with 4.11. There have been changes made to the ATA code that apparently interact VERY badly with some controllers - particularly some very common SATA (SII chipset, used on Adaptec and Bustek boards, among others) ones. I don't know if GEOM/GMIRROR is truly involved here although that's the easiest way for me to provoke it - I suspect not - it's just that GEOM/GMIRROR produces an I/O load pattern that is conducive to the breakage showing up. Specifically, a dd from one or more disks does NOT fail - a mix of reads and writes and fairly significant load appears necessary to cause trouble. Of course installation produces a very nice load of that type.

I opened a PR on this quite some time ago - IMHO this sort of breakage should be considered a critical fault sufficient to stop a release until it's completely resolved. A workaround that stops the system from blowing up but leaves the pauses and errors isn't really a fix - I doubt anyone will consider that acceptable as a means of truly addressing the problem (at least I hope not!)

I got surprised by this (in a bad way) and have been fighting workarounds since 5.3 was deemed production quality. Going back to 4.x is possible for me, but highly undesirable for a number of reasons, not the least of which is the official FreeBSD posture on where work is and will be done on the OS down the road.

The Intel ICH-based SATA adapters appear NOT to have this problem. I've beat the living SNOT out of my two systems with ICH-based motherboard SATA controllers on them for days at a time and have been unable to provoke the problem - using the same disk drives.

The SII-based chipset boards I have (one Adaptec and one Bustek) reliably puke within seconds with a simple large-directory copy. Both ran for a VERY long time under 4.x and were completely stable. Unfortunately I've yet to find an actual BOARD with the ICH chipset on it - it is common among motherboard SATA controllers, but that doesn't help people who need the adapter on a PCI card.

ATA-GenIII may fix all this but I've yet to try it. In any event that's a research project right now, although it will likely soon get committed to -HEAD. That still doesn't help you though in that it won't show up in -STABLE until people are satisfied that it at worst is at least as good as what's in there now.

-- -- Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
http://www.denninger.net My home on the net - links to everything I do!
http://scubaforum.org Your UNCENSORED place to talk about DIVING!
http://www.spamcuda.net SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
http://genesis3.blogspot.com Musings Of A Sentient Mind
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday, 30 March 2005 at 22:27:43 -0700, Scott Long wrote: Jon Noack wrote: On 03/30/05 23:14, Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 21:28:36 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 23:01:03 -0500, John Baldwin wrote: It would be interesting to see the contents of your MADT to see if it's trying to use a 64-bit PA for your APIC. Any suggestions about how to do so? man acpidump How do you run that on a system that won't boot? You said the system worked with 4 GB (albeit detecting only 3.5 GB). Yes, this is correct. A number of people have explained why it only detected 3.5 GB in this configuration. My perception of this whole ACPI thing is that it is fixed in your BIOS (although it can be overridden by the OS). As such, the amount of RAM you have in the machine shouldn't change acpidump results. Is that not correct? This is absolutely correct. Ah, so you meant to say that the output from the system running with 4 GB memory is useful? That wasn't in the man page you pointed to. What it does say is: When invoked with the -t flag, the acpidump utility dumps contents of the following tables: ... MADT This may be the case, but between man page and output some terminology must have changed. I can't see any reference to anything like an MADT there. Does that mean that there isn't one, or that ACPI can't find it, or does the section APIC refer to/dump the MADT? 
Here's the complete output of acpidump -t, anyway:

/* RSD PTR: OEM=VIAK8, ACPI_Rev=1.0x (0) RSDT=0xdfee3000, cksum=97 */
/* RSDT: Length=44, Revision=1, Checksum=4, OEMID=VIAK8, OEM Table ID=AWRDACPI, OEM Revision=0x42302e31, Creator ID=AWRD, Creator Revision=0x0
   Entries={ 0xdfee3040, 0xdfee7b40 } */
/* FACP: Length=116, Revision=1, Checksum=255, OEMID=VIAK8, OEM Table ID=AWRDACPI, OEM Revision=0x42302e31, Creator ID=AWRD, Creator Revision=0x0
   FACS=0xdfee, DSDT=0xdfee30c0
   INT_MODEL=PIC
   Preferred_PM_Profile=Unspecified (0)
   SCI_INT=9
   SMI_CMD=0x402f, ACPI_ENABLE=0xa1, ACPI_DISABLE=0xa0, S4BIOS_REQ=0x0
   PSTATE_CNT=0x0
   PM1a_EVT_BLK=0x4000-0x4003
   PM1a_CNT_BLK=0x4004-0x4005
   PM_TMR_BLK=0x4008-0x400b
   GPE0_BLK=0x4020-0x4023
   P_LVL2_LAT=101 us, P_LVL3_LAT=1001 us
   FLUSH_SIZE=0, FLUSH_STRIDE=0
   DUTY_OFFSET=0, DUTY_WIDTH=1
   DAY_ALRM=125, MON_ALRM=126, CENTURY=50
   IAPC_BOOT_ARCH=
   Flags={WBINVD,PROC_C1,SLP_BUTTON,RTC_S4,RESET_REG}
   RESET_REG=0x:0[0] (Memory), RESET_VALUE=0x44 */
/* FACS: Length=64, HwSig=0x, Firm_Wake_Vec=0x Global_Lock= Flags= Version=0 */
/* DSDT: Length=19020, Revision=1, Checksum=28, OEMID=VIAK8, OEM Table ID=AWRDACPI, OEM Revision=0x1000, Creator ID=MSFT, Creator Revision=0x10e */
/* APIC: Length=104, Revision=1, Checksum=145, OEMID=VIAK8, OEM Table ID=AWRDACPI, OEM Revision=0x42302e31, Creator ID=AWRD, Creator Revision=0x0
   Local APIC ADDR=0xfee0 Flags={PC-AT}
   Type=Local APIC ACPI CPU=0 Flags={ENABLED} APIC ID=0
   Type=Local APIC ACPI CPU=1 Flags={ENABLED} APIC ID=1
   Type=IO APIC APIC ID=2 INT BASE=0 ADDR=0xfec0
   Type=INT Override BUS=0 IRQ=0 INTR=2 Flags={Polarity=conforming, Trigger=conforming}
   Type=INT Override BUS=0 IRQ=9 INTR=9 Flags={Polarity=active-lo, Trigger=level}
   Type=Local NMI ACPI CPU=0 LINT Pin=1 Flags={Polarity=active-hi, Trigger=edge}
   Type=Local NMI ACPI CPU=1 LINT Pin=1 Flags={Polarity=active-hi, Trigger=edge} */

Since I don't know anything about ACPI, this doesn't say too much to me. Suggestions welcome.
If the APIC section is the MADT, it looks as if we should update the docco. Greg -- See complete headers for address and phone numbers.
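
On the terminology question: the ACPI specification assigns the Multiple APIC Description Table (MADT) the table signature "APIC", so the APIC block in the dump is the table being asked about. The following is only a minimal sketch of how one might pull the address fields out of such a dump and check whether any of them needs more than 32 bits. The sample text is an abbreviated excerpt of the output above, with values copied exactly as they appear in this message; the function names are made up for illustration and are not part of acpidump.

```python
import re

# Abbreviated excerpt of the `acpidump -t` output from this thread,
# with values copied exactly as they appear in the message.
SAMPLE = """\
/* RSDT: Length=44, Revision=1, Checksum=4, OEMID=VIAK8 */
/* FACP: Length=116, Revision=1, Checksum=255, OEMID=VIAK8 */
/* APIC: Length=104, Revision=1, Checksum=145, OEMID=VIAK8
   Local APIC ADDR=0xfee0 Flags={PC-AT}
   Type=IO APIC APIC ID=2 INT BASE=0 ADDR=0xfec0
 */
"""

# Table signature the ACPI specification gives to the MADT.
MADT_SIGNATURE = "APIC"


def table_signatures(dump):
    """Return the 4-character signatures that open each comment block."""
    return re.findall(r"/\*\s*([A-Z0-9]{4}):", dump)


def madt_addresses(dump):
    """Return every ADDR= value found inside the APIC (MADT) block."""
    block = re.search(r"/\*\s*APIC:(.*?)\*/", dump, re.S)
    if block is None:
        return []
    values = re.findall(r"ADDR=0x([0-9a-fA-F]+)", block.group(1))
    return [int(value, 16) for value in values]


def fits_in_32_bits(address):
    """True when the address can be expressed as a 32-bit physical address."""
    return address < 2 ** 32


if __name__ == "__main__":
    print("tables:", table_signatures(SAMPLE))
    for address in madt_addresses(SAMPLE):
        print(hex(address), "32-bit" if fits_in_32_bits(address) else "needs 64-bit")
```

Run against a real dump, one would read the file instead of using SAMPLE; any address reported as needing 64 bits would be the case John Baldwin was asking about.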
Re: Problems with AMD64 and 8 GB RAM?
On 03/30/05 23:49, Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 22:27:43 -0700, Scott Long wrote: Jon Noack wrote: On 03/30/05 23:14, Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 21:28:36 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 23:01:03 -0500, John Baldwin wrote: It would be interesting to see the contents of your MADT to see if it's trying to use a 64-bit PA for your APIC. Any suggestions about how to do so? man acpidump How do you run that on a system that won't boot? You said the system worked with 4 GB (albeit detecting only 3.5 GB). Yes, this is correct. A number of people have explained why it only detected 3.5 GB in this configuration. My perception of this whole ACPI thing is that it is fixed in your BIOS (although it can be overridden by the OS). As such, the amount of RAM you have in the machine shouldn't change acpidump results. Is that not correct? This is absolutely correct. Ah, so you meant to say that the output from the system running with 4 GB memory is useful? That wasn't in the man page you pointed to. What it does say is: When invoked with the -t flag, the acpidump utility dumps contents of the following tables: ... MADT This may be the case, but between man page and output some terminology must have changed. I can't see any reference to anything like an MADT there. Does that mean that there isn't one, or that ACPI can't find it, or does the section APIC refer to/dump the MADT? Here's the complete output of acpidump -t, anyway: snip acpidump output Since I don't know anything about ACPI, this doesn't say too much to me. Suggestions welcome. If the APIC section is the MADT, it looks as if we should update the docco. My limited research (as in, Google) shows that the MADT was defined as part of ACPI 2.0: http://www.microsoft.com/whdc/system/platform/64bit/IA64_ACPI.mspx According to your previous link the motherboard specs, it supports both ACPI 1.0A and 2.0. 
Perhaps there is a BIOS knob to toggle between the two? Jon
Re: Problems with AMD64 and 8 GB RAM?
On Thursday, 31 March 2005 at 0:00:22 -0600, Jon Noack wrote: On 03/30/05 23:49, Greg 'groggy' Lehey wrote: Here's the complete output of acpidump -t, anyway: snip acpidump output Since I don't know anything about ACPI, this doesn't say too much to me. Suggestions welcome. If the APIC section is the MADT, it looks as if we should update the docco. My limited research (as in, Google) shows that the MADT was defined as part of ACPI 2.0: http://www.microsoft.com/whdc/system/platform/64bit/IA64_ACPI.mspx Thanks for the link. According to your previous link the motherboard specs, it supports both ACPI 1.0A and 2.0. Perhaps there is a BIOS knob to toggle between the two? I've taken a look, but I can't find anything. Greg -- See complete headers for address and phone numbers.
syscons options and memory use
Hello, The syscons manual page says: The following options will remove some features from the syscons driver and save kernel memory. [...] SC_NO_SYSMOUSE This option removes mouse support in the syscons driver. The mouse daemon moused(8) will fail if this option is defined. This option implies the SC_NO_CUTPASTE option too. How much memory does this save (or how can I discover that)? Is it worth it on a 96MB PentiumII laptop? Thanks in advance, Ronald. -- Ronald Klop, Amsterdam, The Netherlands
Re: syscons options and memory use
In the last episode (Mar 31), Ronald Klop said: The syscons manual page says: The following options will remove some features from the syscons driver and save kernel memory. [...] SC_NO_SYSMOUSE This option removes mouse support in the syscons driver. The mouse daemon moused(8) will fail if this option is defined. This option implies the SC_NO_CUTPASTE option too. How much memory does this save (or how can I discover that)? Is it worth it on a 96MB PentiumII laptop? I would guess that the memory savings is probably on the order of kilobytes. Useful if you're trying to prevent excessive swapping on an 8MB system. Not worth disabling on your system. -- Dan Nelson [EMAIL PROTECTED]
Re: Problems with AMD64 and 8 GB RAM?
Hi, Since we are discussing AMD64 with 8GB RAM, I would also like to point out my problem. I'm still looking for a way to run FreeBSD 5.3-STABLE with more than 4GB RAM on a dual amd64 2.2GHz machine (IBM @server 325) with a ServeRAID 6M (ips driver). Right now I'm using only 4GB RAM and this server is in production.

#uname -an
FreeBSD publica.ub.mng.net 5.3-STABLE FreeBSD 5.3-STABLE #12: Mon Nov 22 12:04:57 ULAT 2004 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/AMD amd64

As Scott said a few months ago, the problem is as follows: The ips driver looks like it will fail under heavy load when more than 4GB of RAM is present. It tries to force busdma to not defer requests when the bounce page reserve is low, but that looks to be broken and will result in corrupted commands.

Are the ips driver and bus_dma problems fixed yet in the STABLE tree? Is it worth trying a source update to see how it works? I'm afraid to do so, since it is a production server.

Please see my previous posts:
http://lists.freebsd.org/mailman/htdig/freebsd-current/2004-December/044325.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041003.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041005.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041013.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041015.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041094.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041112.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041164.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041258.html
http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041554.html
dmesg: http://lists.freebsd.org/pipermail/freebsd-current/2004-October/041265.html

thanks in advance, Ganbold
Re: Problems with AMD64 and 8 GB RAM?
Ganbold wrote: Hi, Since we are discussing AMD64 with 8GB RAM, I also would like to point my problem. I'm still looking for possibility to run FreeBSD 5.3-STABLE with more than 4GB RAM on Dual amd64 2.2GHz machine (IBM @server 325) with ServeRAID 6M (ips driver)). Right now I'm using only 4GB RAM and this server is in production. #uname -an FreeBSD publica.ub.mng.net 5.3-STABLE FreeBSD 5.3-STABLE #12: Mon Nov 22 12:04:57 ULAT 2004 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/AMD amd64 As Scott said a few months ago, problem is below: The ips driver looks like it will fail under heavy load when more than 4GB of RAM is present. It tries to force busdma to not defer requests when the bounce page reserve is low, but that looks to be broken and will result in corrupted commands. Are the ips driver and bus_dma problems fixed yet in STABLE tree? Is it worth to try source update and see how it works? I'm afraid to do so, since it is production server. Yes, I (hopefully) fixed the problems that I pointed out, and I also locked it and added crashdump support. It is reported to be stable and fast now, so you won't go wrong by updating. These changes are in both 5-stable and 6-current. Scott ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: syscons options and memory use
On Thu, 31 Mar 2005 01:04:10 -0600, Dan Nelson [EMAIL PROTECTED] wrote: In the last episode (Mar 31), Ronald Klop said: The syscons manual page says: The following options will remove some features from the syscons driver and save kernel memory. [...] SC_NO_SYSMOUSE This option removes mouse support in the syscons driver. The mouse daemon moused(8) will fail if this option is defined. This option implies the SC_NO_CUTPASTE option too. How much memory does this save (or how can I discover that)? Is it worth it on a 96MB PentiumII laptop? I would guess that the memory savings is probably on the order of kilobytes. Useful if you're trying to prevent excessive swapping on an 8MB system. Not worth disabling on your system. How can I see the size of my kernel? I know vmstat -m and netstat -m, but from that info I don't see if I reduced the memory footprint after disabling an option or device. -- Ronald Klop, Amsterdam, The Netherlands
Re: Problems with AMD64 and 8 GB RAM?
On Thu, Mar 31, 2005 at 08:14:45AM +0930, Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 14:35:46 -0800, Steve Kargl wrote: On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote: None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory. I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask: 3. Where should I look next? Have you run sysutils/memtest86 with the 8 GB? Heh. Difficult when the system doesn't run. That's what happens when 1 of 8 (1 of 4?) DIMM is bad :-) I had 4 bad out of 12 tested where the DIMMs were Crucial PC2700 2GB Reg. ECC DIMMs. OK, this makes sense. It might also explain why the 4 GB configuration only recognizes 3.5 GB. Search amd64 mailing list. The missing memory is reserved for something which escapes me at the moment. Similar to the infamous ISA memory hole. -- Steve ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Problems with AMD64 and 8 GB RAM?
Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 14:35:46 -0800, Steve Kargl wrote: On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote:

None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory. I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask: 1. Has anybody else seen this problem? 2. Has anybody else used this hardware configuration and *not* seen this problem? 3. Where should I look next?

Have you run sysutils/memtest86 with the 8 GB?

Heh. Difficult when the system doesn't run.

I had 4 bad out of 12 tested where the DIMMs were Crucial PC2700 2GB Reg. ECC DIMMs.

OK, this makes sense. It might also explain why the 4 GB configuration only recognizes 3.5 GB.

No, and I'm going to make this an FAQ and post it in a very obvious place, since 4+ GB is so easy to get and people don't seem to understand the PC architecture very well.

Almost all systems put the PCI Memory Mapped IO window into the 3.75-4GB region of the physical memory map. The registers for the APICs and other system resources are also typically in this region. Now with PCI-Express, the Memory Mapped PCI config registers are typically being mapped in the 3.5-3.75GB range. The memory controllers, host bridges, north-bridges, and/or whatever else glues the memory to the bus to the CPU decode these addresses into PCI cycles, not RAM cycles.

Some systems are smart and re-map the RAM that is hidden by these holes into a region above 4GB. Some systems are dumb, though, and just deny you access to the RAM that is covered up. It's very much like the old days of the XT/AT architecture when you had 1MB of RAM but everything above 640k was hidden by the VGA framebuffer, ISA option ROMs, and system BIOS, but some systems were smart enough to relocate the hidden RAM.

So, your missing .5GB is almost certainly not due to defective RAM, it's just due to The Way Things Are. It's a lot harder for Opteron systems to be smart about this than Xeon systems since all of the remapping magic can happen in the hostbridge on the Xeon, while the Opterons need to have their built-in memory controllers programmed specially for it. Scott
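
The arithmetic behind the missing 0.5 GB can be sketched as follows. This is a deliberately simplified model, not a description of any particular chipset: it assumes RAM is laid out contiguously from physical address 0, that there is a single hole from 3.5 GB to 4 GB (the real window varies by board), and that a "smart" system relocates all of the hidden RAM above the hole. The function and constant names are hypothetical.

```python
GB = 1024 ** 3

# Assumed hole: 3.5 GB up to 4 GB, per the ranges described above.
HOLE_START = 7 * GB // 2
HOLE_END = 4 * GB


def hidden_ram(installed, hole_start=HOLE_START, hole_end=HOLE_END):
    """Bytes of RAM whose addresses are decoded as PCI cycles instead."""
    return max(0, min(installed, hole_end) - hole_start)


def usable_ram(installed, remaps_hidden_ram,
               hole_start=HOLE_START, hole_end=HOLE_END):
    """Bytes of RAM the OS can reach under this simplified model."""
    if remaps_hidden_ram:
        # A "smart" system relocates the covered RAM above the hole.
        return installed
    # A "dumb" system simply loses whatever the hole covers.
    return installed - hidden_ram(installed, hole_start, hole_end)


if __name__ == "__main__":
    for installed in (2 * GB, 4 * GB, 8 * GB):
        for remap in (False, True):
            print(installed / GB, "GB installed, remap =", remap,
                  "->", usable_ram(installed, remap) / GB, "GB usable")
```

With 4 GB installed and no remapping, the model gives 3.5 GB usable, which matches what the BIOS reports in the 4 GB configuration.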
Re: Problems with AMD64 and 8 GB RAM?
Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 15:30:37 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: I've recently acquired an AMD64 box ... What's unstable? ... The amd64 5.4-PRERELEASE kernel just hangs/freezes. 5.3-RELEASE has a lot of problems with 4GB due to busdma issues. Those should no longer be an issue in RELENG_5, including 5.4-PRE. They appear to be. I don't understand what you mean here. I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask: 3. Where should I look next? You'll need to dig in and provide some more details, I guess. Yes, my guess too. I have an HDAMA dual Opteron system that behaves fine now with 8GB of RAM, so your problem might lie with particular hardware and/or drivers. As I described, it doesn't appear to be the drivers. I don't see how you proved or disproved this. Scott
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday, 30 March 2005 at 16:04:44 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 15:30:37 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: I've recently acquired an AMD64 box ... What's unstable? ... The amd64 5.4-PRERELEASE kernel just hangs/freezes. 5.3-RELEASE has a lot of problems with 4GB due to busdma issues. Those should no longer be an issue in RELENG_5, including 5.4-PRE. They appear to be. I don't understand what you mean here. As I said above (and trimmed for convenience), this problem occurs on 5.4-PRERELEASE as of yesterday morning. The dmesg shows that too. As I described, it doesn't appear to be the drivers. I don't see how you proved or disproved this. Shall I resend the original message? It seems independent of any particular driver. That's not proof, of course, but I didn't claim it was. Greg -- See complete headers for address and phone numbers.
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday, 30 March 2005 at 16:01:14 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 14:35:46 -0800, Steve Kargl wrote: On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote: None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory. I had 4 bad out of 12 tested where the DIMMs were Crucial PC2700 2GB Reg. ECC DIMMs. OK, this makes sense. It might also explain why the 4 GB configuration only recognizes 3.5 GB. No, and I'm going to make this an FAQ and post it in a very obvious place, since 4+ GB is so easy to get and people don't seem to understand the PC architecture very well. That's not easy to understand when it's barely documented. Thanks for the info: it helps a lot. This may still be a hint, though: that memory hole doesn't show up during a boot with 8 GB RAM. How come? Is the system trying to map RAM over the PCI hole? It looks as if I should get a verbose boot listing with 8 GB. It'll be a couple of hours before I find time to reboot this machine. In the meantime, there's a verbose boot with 4 GB at http://www.lemis.com/grog/Images/20050331/obelix-dmesg. I'm told it shows a number of strange things, including incorrect reporting of on-chip cache sizes. Greg -- See complete headers for address and phone numbers. pgpqVG3rp0uyU.pgp Description: PGP signature
Re: new LORs on 5.4 pre
On 03/30/05 08:23, Rene Ladan wrote: On Wed, Mar 30, 2005 at 05:52:13AM -0800, Kris Kennaway wrote: On Wed, Mar 30, 2005 at 11:17:50AM +0200, Rene Ladan wrote:

Hi, I've stumbled over some new LORs (all continuable) on 5.4pre from 2005-03-29 09:49 UTC, thus before the bpf/DHCP fix.

lock order reversal
1st 0xc0642b60 Giant (Giant) @ /usr/src/sys/kern/kern_timeout.c:256
2nd 0xc14d7264 fxp0 (network driver) @ /usr/src/sys/modules/fxp/../../dev/fxp/if_fxp.c:1233

Is your fxp module up-to-date? Stale modules (i.e. compiled for a different kernel than the one you're running) will cause problems.

All modules are up to date. This LOR popped up after installing from a resume buildworld. After blowing away /usr/obj, make cleandir twice in /usr/src and rebuilding and reinstalling world+kernel, it and the others still pop up, especially after any of these commands:

# dhclient fxp0
# ntpd -q
% fetchmail (only at startup, not after wakeup)

Browsing in links does _not_ trigger a LOR.

I saw a LOR similar to the rtentry one on -CURRENT a few days ago. See my message "4 LORs and a freeze with wi(4)": http://lists.freebsd.org/pipermail/freebsd-current/2005-March/047862.html Jon
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday, 30 March 2005 at 14:57:15 -0800, Steve Kargl wrote: On Thu, Mar 31, 2005 at 08:14:45AM +0930, Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 14:35:46 -0800, Steve Kargl wrote: On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote: None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory. I realize that this isn't enough to diagnose the problem. The reason for this message now is to ask: 3. Where should I look next? Have you run sysutils/memtest86 with the 8 GB? Heh. Difficult when the system doesn't run. That's what happens when 1 of 8 (1 of 4?) DIMM is bad :-) I've booted with the other 2 DIMMs now (I have 4 2 GB DIMMs, all the MB will hold). No problems. See my last reply to Scott: I'm wondering if the system is ignoring the PCI hole. Greg -- See complete headers for address and phone numbers.
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday 30 March 2005 03:15 pm, Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 16:01:14 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 14:35:46 -0800, Steve Kargl wrote: On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote:

None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory.

I had 4 bad out of 12 tested where the DIMMs were Crucial PC2700 2GB Reg. ECC DIMMs.

OK, this makes sense. It might also explain why the 4 GB configuration only recognizes 3.5 GB.

No, and I'm going to make this an FAQ and post it in a very obvious place, since 4+ GB is so easy to get and people don't seem to understand the PC architecture very well.

That's not easy to understand when it's barely documented. Thanks for the info: it helps a lot. This may still be a hint, though: that memory hole doesn't show up during a boot with 8 GB RAM. How come? Is the system trying to map RAM over the PCI hole?

Nope, it's still there. When you boot -v, you'll see the hole in the Physical memory chunk(s) list. However, I suspect that some of the BIOSes will set the 4GB hole partition in the physical RAM lower so that there will be 4.5GB of RAM above the 4GB mark. I haven't looked too closely to see for sure.

It looks as if I should get a verbose boot listing with 8 GB. It'll be a couple of hours before I find time to reboot this machine. In the meantime, there's a verbose boot with 4 GB at http://www.lemis.com/grog/Images/20050331/obelix-dmesg. I'm told it shows a number of strange things, including incorrect reporting of on-chip cache sizes.

Nope, it is correct. You have 1MB of L2 cache.

L1 data cache: 64 kbytes, 64 bytes/line, 1 lines/tag, 2-way associative
L1 instruction cache: 64 kbytes, 64 bytes/line, 1 lines/tag, 2-way associative
L2 unified cache: 1024 kbytes, 64 bytes/line, 1 lines/tag, 16-way associative

Greg -- See complete headers for address and phone numbers.

-- Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] All of this is for nothing if we don't go to the stars - JMS/B5
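
As a quick sanity check of the cache lines quoted above, the number of sets in a set-associative cache is the total size divided by (line size times associativity). The sketch below only applies that standard formula to the three reported geometries; the function name is invented for illustration and says nothing about how the kernel derives these figures.

```python
def cache_sets(size_kbytes, line_bytes, ways):
    """Number of sets = total size / (line size * associativity)."""
    total_bytes = size_kbytes * 1024
    way_bytes = line_bytes * ways
    if total_bytes % way_bytes != 0:
        raise ValueError("size is not a whole number of sets")
    return total_bytes // way_bytes


if __name__ == "__main__":
    # Geometries as reported in the verbose boot output quoted above.
    print("L1 data:", cache_sets(64, 64, 2), "sets")
    print("L1 instruction:", cache_sets(64, 64, 2), "sets")
    print("L2 unified:", cache_sets(1024, 64, 16), "sets")
```

All three geometries divide evenly into a power-of-two number of sets, which is at least consistent with the sizes being reported correctly.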
Re: Problems with AMD64 and 8 GB RAM?
.. Original Message ... On Thu, 31 Mar 2005 08:14:45 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED] wrote: Have you run sysutils/memtest86 with the 8 GB? Heh. Difficult when the system doesn't run. There is a bootable ISO version of memtest86 that you could try. - ask -- http://askask.com/
Re: Problems with AMD64 and 8 GB RAM?
Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 16:01:14 -0700, Scott Long wrote: Greg 'groggy' Lehey wrote: On Wednesday, 30 March 2005 at 14:35:46 -0800, Steve Kargl wrote: On Thu, Mar 31, 2005 at 07:54:39AM +0930, Greg 'groggy' Lehey wrote: None of these problems occur when I use 4 GB memory. About the only strangeness, which seems to come from the BIOS, is that it recognizes only 3.5 GB. If I put all DIMMS in, it recognizes the full 8 GB memory. I had 4 bad out of 12 tested where the DIMMs were Crucial PC2700 2GB Reg. ECC DIMMs. OK, this makes sense. It might also explain why the 4 GB configuration only recognizes 3.5 GB. No, and I'm going to make this an FAQ and post it in a very obvious place, since 4+ GB is so easy to get and people don't seem to understand the PC architecture very well. That's not easy to understand when it's barely documented. Thanks for the info: it helps a lot. This may still be a hint, though: that memory hole doesn't show up during a boot with 8 GB RAM. How come? Is the system trying to map RAM over the PCI hole? It looks as if I should get a verbose boot listing with 8 GB. It'll be a couple of hours before I find time to reboot this machine. In the meantime, there's a verbose boot with 4 GB at http://www.lemis.com/grog/Images/20050331/obelix-dmesg. I'm told it shows a number of strange things, including incorrect reporting of on-chip cache sizes. The SMAP will show the hole. It's well documented in most PC architecture books. Scott
Re: Problems with AMD64 and 8 GB RAM?
Greg 'groggy' Lehey wrote:
On Wednesday, 30 March 2005 at 16:04:44 -0700, Scott Long wrote:
Greg 'groggy' Lehey wrote:
On Wednesday, 30 March 2005 at 15:30:37 -0700, Scott Long wrote:
Greg 'groggy' Lehey wrote:

I've recently acquired an AMD64 box ...

What's unstable?

... The amd64 5.4-PRERELEASE kernel just hangs/freezes.

5.3-RELEASE has a lot of problems with 4GB due to busdma issues. Those should no longer be an issue in RELENG_5, including 5.4-PRE.

They appear to be.

I don't understand what you mean here.

As I said above (and trimmed for convenience), this problem occurs on 5.4-PRERELEASE as of yesterday morning. The dmesg shows that too.

And you're certain that it's due to the same busdma issues that I was describing? I must have missed the evidence that you use to support this.

As I described, it doesn't appear to be the drivers.

I don't see how you proved or disproved this.

Shall I resend the original message? It seems independent of any particular driver. That's not proof, of course, but I didn't claim it was.

Again, I must have missed the part where you investigated the drivers that apply to your particular system. I highly doubt that they apply to every 8GB Opteron system available on the market.

Scott
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday 30 March 2005 03:09 pm, Greg 'groggy' Lehey wrote:
On Wednesday, 30 March 2005 at 16:04:44 -0700, Scott Long wrote:
Greg 'groggy' Lehey wrote:
On Wednesday, 30 March 2005 at 15:30:37 -0700, Scott Long wrote:
Greg 'groggy' Lehey wrote:

I've recently acquired an AMD64 box ...

What's unstable?

... The amd64 5.4-PRERELEASE kernel just hangs/freezes.

5.3-RELEASE has a lot of problems with 4GB due to busdma issues. Those should no longer be an issue in RELENG_5, including 5.4-PRE.

They appear to be.

I don't understand what you mean here.

As I said above (and trimmed for convenience), this problem occurs on 5.4-PRERELEASE as of yesterday morning. The dmesg shows that too.

As I described, it doesn't appear to be the drivers.

I don't see how you proved or disproved this.

Shall I resend the original message? It seems independent of any particular driver. That's not proof, of course, but I didn't claim it was.

Greg: The busdma problems from 5.3-RELEASE are fixed. That doesn't mean that there are no *other* problems. Scott is saying the old busdma bug shouldn't be affecting 5.4-PRE, and he's correct.

Most likely, something else is happening, eg: you're running out of KVM or something silly like that. I know we're right on the brink at 8GB. The layout of the devices may be just enough to tip it over the edge.

--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: buggy ATA controller: I can install 4.11, but not 5.3 !?!
--- Doug White [EMAIL PROTECTED] wrote:
On Mon, 28 Mar 2005, Mars Trading wrote:

I think you may be on to something here. Not that I'm an expert, BTW. If I remember correctly, 4.11 doesn't use device.hints; device irq's and stuff were all included in the kernel configuration. This is no longer the case with 5.x which uses /boot/device.hints to tell where attached devices are. But how does one edit device.hints on a boot cd?

You set the variables from the loader with the set command. For example, if you need to add this to device.hints:

    hint.ata.0.at=isa

type this on the loader command line:

    set hint.ata.0.at=isa

On 4.x the hints were configured so that it would still probe the ISA resources if the PCI attachments failed. On 5.x this is likely not the case. However, on 5.x that "disabled by BIOS" message is gone, so it should attach if that particular message was the reason before. However, someone clipped that message from the output, so I don't even know why it's failing to attach in the first place.

I've got a few other 5.3-Stable (5.4-PreRelease) PCs here. On these machines I get:

    $ grep hint.ata.0 /boot/device.hints
    hint.ata.0.at=isa
    hint.ata.0.port=0x1F0
    hint.ata.0.irq=14
    $ grep hint.ata.0 /usr/src/sys/i386/conf/GENERIC.hints
    hint.ata.0.at=isa
    hint.ata.0.port=0x1F0
    hint.ata.0.irq=14

So it's already there! I suppose it's the same with the 5.3 installation floppies, or isn't it? Or would setting it manually at the loader prompt have a different effect?

So it's said again .. Use a different ATA controller. Please. The RZ1000 series should not be used under ANY CIRCUMSTANCES WHATSOEVER.

I don't really have a choice here. This is a 10 year old Pentium-1 PC, which I would like to use. I'm not a hardware expert, but would your advice mean throwing this PC away? Or can I replace the ATA controller easily myself? I have already put this machine under moderate load and recompiled/installed a new world/kernel without any problems. Apparently 4.11 knows how to bypass the flaws of this buggy ATA controller. At least that's my impression. Would 5.3 or 5.4 do this as well here?

Regards, Rob.
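Doug's advice above (type `set name=value` at the loader prompt for each line you would otherwise add to device.hints) is mechanical enough to sketch. The helper below is hypothetical and only restates that mapping: it takes hint lines in the unquoted form quoted in this thread and prints the matching `set` commands. It assumes `#` starts a comment and does nothing about quoting of values, so treat it as an illustration rather than a tool.

```python
def hints_to_loader_commands(hints_text):
    """Turn device.hints-style 'name=value' lines into loader 'set' commands.

    Hypothetical helper: assumes '#' starts a comment and that values
    need no extra quoting.
    """
    commands = []
    for raw in hints_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        commands.append(f"set {name.strip()}={value.strip()}")
    return commands


# The three hints Rob's grep output shows for ata0.
hints = """\
hint.ata.0.at=isa
hint.ata.0.port=0x1F0
hint.ata.0.irq=14
"""

for command in hints_to_loader_commands(hints):
    print(command)
```

Since Rob's grep shows these three hints already present in the stock device.hints, typing them by hand would probably only restate what the loader has already read, which is the doubt he raises himself.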
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday 30 March 2005 03:22 pm, Ask Bjørn Hansen wrote:
.. Original Message ...
On Thu, 31 Mar 2005 08:14:45 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:

Have you run sysutils/memtest86 with the 8 GB?

Heh. Difficult when the system doesn't run.

There is a bootable ISO version of memtest86 that you could try.

That's what the port does. It produces a bootable floppy or ISO.

--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: Problems with AMD64 and 8 GB RAM?
In message: [EMAIL PROTECTED] Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
: I've booted with the other 2 DIMMs now (I have 4 2 GB DIMMs, all the
: MB will hold). No problems. See my last reply to Scott: I'm
: wondering if the system is ignoring the PCI hole.

Unlikely. If it was, you'd not have enough of a system to complain about.

Warner
Re: Problems with AMD64 and 8 GB RAM?
[Format recovered--see http://www.lemis.com/email/email-format.html]
[gratuitous empty lines removed]

On Wednesday, 30 March 2005 at 16:23:34 -0700, Scott Long wrote:
Greg 'groggy' Lehey wrote:
On Wednesday, 30 March 2005 at 16:04:44 -0700, Scott Long wrote:
Greg 'groggy' Lehey wrote:
On Wednesday, 30 March 2005 at 15:30:37 -0700, Scott Long wrote:
Greg 'groggy' Lehey wrote:

I've recently acquired an AMD64 box ...

What's unstable?

... The amd64 5.4-PRERELEASE kernel just hangs/freezes.

5.3-RELEASE has a lot of problems with 4GB due to busdma issues. Those should no longer be an issue in RELENG_5, including 5.4-PRE.

They appear to be.

I don't understand what you mean here.

As I said above (and trimmed for convenience), this problem occurs on 5.4-PRERELEASE as of yesterday morning. The dmesg shows that too.

And you're certain that it's due to the same busdma issues that I was describing?

No.

I must have missed the evidence that you use to support this.

I didn't give any. It appears that I misunderstood what you were saying.

As I described, it doesn't appear to be the drivers.

I don't see how you proved or disproved this.

Shall I resend the original message? It seems independent of any particular driver. That's not proof, of course, but I didn't claim it was.

Again, I must have missed the part where you investigated the drivers that apply to your particular system.

The description is still there.

I highly doubt that they apply to every 8GB Opteron system available on the market.

I never suggested that they did. There's every reason to believe that it's something to do with this particular motherboard, but that doesn't mean that FreeBSD is blameless.

Greg
--
When replying to this message, please take care not to mutilate the original text.
For more information, see http://www.lemis.com/email.html
See complete headers for address and phone numbers.
Re: Problems with AMD64 and 8 GB RAM?
On Wednesday, 30 March 2005 at 15:25:37 -0800, Peter Wemm wrote:
On Wednesday 30 March 2005 03:09 pm, Greg 'groggy' Lehey wrote:
On Wednesday, 30 March 2005 at 16:04:44 -0700, Scott Long wrote:
Greg 'groggy' Lehey wrote:

As I described, it doesn't appear to be the drivers.

I don't see how you proved or disproved this.

Shall I resend the original message? It seems independent of any particular driver. That's not proof, of course, but I didn't claim it was.

Greg: The busdma problems from 5.3-RELEASE are fixed. That doesn't mean that there are no *other* problems. Scott is saying the old busdma bug shouldn't be affecting 5.4-PRE, and he's correct.

Yes, now I understand.

Most likely, something else is happening, eg: you're running out of KVM or something silly like that. I know we're right on the brink at 8GB. The layout of the devices may be just enough to tip it over the edge.

Yes, this seems reasonable. Where should I look next? I'm currently rebuilding world and will attempt a verbose boot via serial console when it's done. Anything else I should try?

Greg
--
See complete headers for address and phone numbers.
[OT] memtest86 (Was: Problems with AMD64 and 8 GB RAM?)
On Mar 30, Peter Wemm wrote:
On Wednesday 30 March 2005 03:22 pm, Ask Bjørn Hansen wrote:
.. Original Message ...
On Thu, 31 Mar 2005 08:14:45 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:

Have you run sysutils/memtest86 with the 8 GB?

Heh. Difficult when the system doesn't run.

There is a bootable ISO version of memtest86 that you could try.

That's what the port does. It produces a bootable floppy or ISO.

This reminds me, I noticed that Gentoo includes a memtest86 kernel in their install ISO. Would this be a hard feature to include in FreeBSD?

Mike
Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE
On Wed, Mar 30, 2005 at 12:06:11AM -0600, Karl Denninger wrote:
On Wed, Mar 30, 2005 at 12:50:31AM -0500, Matthew N. Dodd wrote:
On Tue, 29 Mar 2005, Karl Denninger wrote:

    245a252
    > request->donecount = 0;

Without the last delta the requeue doesn't happen at all.

So you're saying that this change:

    1.42: When resubmitting a timed out request, reset donecount.

produces the problem?

I believe so, yes, if my recollection from earlier in the month is correct. Without it, the problem doesn't exist, but the requeueing doesn't happen either (well technically according to the code it does, but it doesn't do anything) - so whether that's the problem or whether it simply MASKS the problem I can't say without further investigation.

I am loading up my sandbox machine with an exact copy now, and expect to know more sometime tomorrow - I will have to go through a full buildworld/installworld/buildkernel/installkernel to bring the sandbox up to date and then stuff a SATA disk and adapter in there to re-create the environment closely enough to be sure that I'm looking at the same issue.

More as soon as I know with certainty.

It appears that the change was backed out of the CVS tree late last night. I've reproduced the original problem on my sandbox, and am now testing removing the patch lines one at a time. Hopefully I can isolate exactly which line causes trouble.

BTW, it appears that the original problem (DMA write errors) ONLY occurs if you have at least two SATA devices - at least in my system - and at least one of them is on the SI chipset. A single UDMA100 PATA disk and a single SATA150 disk DO NOT trigger retries, no matter how high the load. Had to diddle with things to get it to go bang so I can isolate it.

Hopefully more this evening - first round (removing the ! change) being run now.

--
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant Kids Rights Activist
Re: Problems with AMD64 and 8 GB RAM?
On Thu, 31 Mar 2005 08:14, Greg 'groggy' Lehey wrote:

Have you run sysutils/memtest86 with the 8 GB?

Heh. Difficult when the system doesn't run.

You could try http://www.memtest86.com although that doesn't do 4 GB :(

--
Daniel O'Connor software and network engineer for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there are so many of them to choose from. -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
Re: Problems with AMD64 and 8 GB RAM?
On Thu, Mar 31, 2005 at 10:32:33AM +0930, Daniel O'Connor wrote:
On Thu, 31 Mar 2005 08:14, Greg 'groggy' Lehey wrote:

Have you run sysutils/memtest86 with the 8 GB?

Heh. Difficult when the system doesn't run.

You could try http://www.memtest86.com although that doesn't do 4Gb :(

http://www.memtest.org/

--
Steve
Re: buggy ATA controller: I can install 4.11, but not 5.3 !?!
Rob wrote:

I have already put this machine under moderate load and recompiled/installed a new world/kernel without any problems. Apparently 4.11 knows how to bypass the flaws of this buggy ATA controller. At least that's my impression. Would 5.3 or 5.4 do this as well here?

Googling on this topic, I found info on a Linux kernel configuration site:

Quote

    CONFIG_BLK_DEV_RZ1000

    The PC-Technologies RZ1000 chip is used on many common 486 and Pentium motherboards, usually along with the Neptune chipset. Unfortunately, it has a rather nasty design flaw that can cause severe data corruption under many conditions. Say Y here to include code which automatically detects and corrects the problem under Linux. This may slow disk throughput by a few percent, but at least things will operate 100% reliably. If unsure, say Y.

\Quote

I wonder whether FreeBSD (at least 4.X) uses the same strategy to bypass the RZ1000 trouble on my PC.

Regards, Rob.
Re: Problems with AMD64 and 8 GB RAM?
On Thu, 31 Mar 2005 10:40, Steve Kargl wrote:
On Thu, Mar 31, 2005 at 10:32:33AM +0930, Daniel O'Connor wrote:
On Thu, 31 Mar 2005 08:14, Greg 'groggy' Lehey wrote:

Have you run sysutils/memtest86 with the 8 GB?

Heh. Difficult when the system doesn't run.

You could try http://www.memtest86.com although that doesn't do 4Gb :(

http://www.memtest.org/

Ahh well there you go :) Thanks!

--
Daniel O'Connor software and network engineer for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there are so many of them to choose from. -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C