Re: cpu timer issues
on 30/09/2010 01:27 Jurgen Weber said the following:
> Gentlemen
>
> Ah, ok. Learn something new everyday. Fantastic.
>
> The first time the machine stopped during the boot process, but that is ok the 2nd time we have success.
>
> http://pastebin.com/r4UWdN7U
>
> I am not sure if ACPI is on, Jeremy you mention below that it should be in just by booting with this option so let me know if there are any problems there.

If you disabled it in BIOS, you have to re-enable it there. There is no magic.

--
Andriy Gapon
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime
on 30/09/2010 02:27 Don Lewis said the following:
> On 29 Sep, Andriy Gapon wrote:
>> on 29/09/2010 11:56 Don Lewis said the following:
>>> I'm using the same kernel config as the one on a slower !SMP box which I'm trying to squeeze as much performance out of as possible. My kernel config file contains these statements:
>>>     nooptions SMP
>>>     nodevice apic
>>> Testing with an SMP kernel is on my TODO list.
>>
>> SMP or not, it's really weird to see apic disabled nowadays.
>
> I tried enabling apic and got worse results. I saw ping RTTs as high as 67 seconds. Here's the timer info with apic enabled:

I didn't expect anything to change in this output with APIC enabled.

> # sysctl kern.timecounter
> kern.timecounter.tick: 1
> kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-100)
> kern.timecounter.hardware: ACPI-fast
> kern.timecounter.stepwarnings: 0
> kern.timecounter.tc.i8254.mask: 65535
> kern.timecounter.tc.i8254.counter: 53633
> kern.timecounter.tc.i8254.frequency: 1193182
> kern.timecounter.tc.i8254.quality: 0
> kern.timecounter.tc.ACPI-fast.mask: 16777215
> kern.timecounter.tc.ACPI-fast.counter: 7988816
> kern.timecounter.tc.ACPI-fast.frequency: 3579545
> kern.timecounter.tc.ACPI-fast.quality: 1000
> kern.timecounter.tc.TSC.mask: 4294967295
> kern.timecounter.tc.TSC.counter: 1341917999
> kern.timecounter.tc.TSC.frequency: 2500014018
> kern.timecounter.tc.TSC.quality: 800
> kern.timecounter.invariant_tsc: 0
>
> Here's the verbose boot info with apic:
> http://people.freebsd.org/~truckman/AN-M2_HD-8.1-STABLE-apic-verbose.txt

vmstat -i ?

> I've also experimented with SMP as well as SCHED_4BSD (all previous testing was with !SMP and SCHED_ULE). I still see occasional problems with SCHED_4BSD and !SMP, but so far I have not seen any problems with SCHED_ULE and SMP.

Good!

> I did manage to catch the problem with lock profiling enabled:
> http://people.freebsd.org/~truckman/AN-M2_HD-8.1-STABLE_lock_profile_freeze.txt
>
> I'm currently testing SMP some more to verify if it really avoids this problem.

OK.
--
Andriy Gapon
MCA messages in dmesg
For a while now, my home server has been acting up. It had a bad set of RAM long ago; I replaced it and it worked fine. It's been weird again now, and I've found this in dmesg:

MCA: Bank 0, Status 0xf2000800
MCA: Global Cap 0x0806, Status 0x
MCA: Vendor GenuineIntel, ID 0x6fb, APIC ID 2
MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory
MCA: Bank 0, Status 0xf2000800
MCA: Global Cap 0x0806, Status 0x
MCA: Vendor GenuineIntel, ID 0x6fb, APIC ID 3
MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory

I really don't know what MCA is, but that looks like possibly bad RAM again. I have some other DIMMs I can try, but I was hoping someone had some info on exactly what those messages mean. One concern is that the motherboard is bad and is hosing the memory.

Some more info:

FreeBSD vbox.galacticdominator.com 8.1-STABLE FreeBSD 8.1-STABLE #0: Mon Aug 2 11:19:16 CDT 2010 a...@vbox.galacticdominator.com:/usr/obj/usr/src/sys/GENERIC amd64

smbios.bios.reldate=01/22/2008
smbios.bios.vendor=Phoenix Technologies, LTD
smbios.bios.version=6.00 PG
smbios.chassis.maker=NVIDIA
smbios.chassis.serial=
smbios.chassis.tag=
smbios.chassis.version=NFORCE 680i LT SLI
smbios.memory.enabled=4194304
smbios.planar.maker=NVIDIA
smbios.planar.product=NFORCE 680i LT SLI
smbios.planar.serial=1
smbios.planar.version=2
smbios.socket.enabled=1
smbios.socket.populated=1
smbios.system.maker=NVIDIA
smbios.system.product=NFORCE 680i LT SLI
smbios.system.serial=1
smbios.system.uuid=86fe600d-034b-0400--
smbios.system.version=2
smbios.version=2.4

Normal dmesg:

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-STABLE #0: Mon Aug 2 11:19:16 CDT 2010
a...@vbox.galacticdominator.com:/usr/obj/usr/src/sys/GENERIC amd64
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz (2700.03-MHz K8-class CPU)
Origin = GenuineIntel Id = 0x6fb Family = 6 Model = f Stepping = 11
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
AMD Features=0x20100800<SYSCALL,NX,LM>
AMD Features2=0x1<LAHF>
TSC: P-state invariant
real memory = 4294967296 (4096 MB)
avail memory = 4073664512 (3884 MB)
ACPI APIC Table: Nvidia NVDAACPI
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
cpu2 (AP): APIC ID: 2
cpu3 (AP): APIC ID: 3
ioapic0: Changing APIC ID to 4
ioapic0 Version 1.1 irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: Nvidia NVDAACPI on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, afdf (3) failed
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0
cpu2: ACPI CPU on acpi0
cpu3: ACPI CPU on acpi0
acpi_hpet0: High Precision Event Timer iomem 0xfeff-0xfeff03ff on acpi0
device_attach: acpi_hpet0 attach returned 12
acpi_button0: Power Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
pci0: memory, RAM at device 0.1 (no driver attached)
pci0: memory, RAM at device 0.2 (no driver attached)
pci0: memory, RAM at device 0.3 (no driver attached)
pci0: memory, RAM at device 0.4 (no driver attached)
pci0: memory, RAM at device 0.5 (no driver attached)
pci0: memory, RAM at device 0.6 (no driver attached)
pci0: memory, RAM at device 0.7 (no driver attached)
pci0: memory, RAM at device 1.0 (no driver attached)
pci0: memory, RAM at device 1.1 (no driver attached)
pci0: memory, RAM at device 1.2 (no driver attached)
pci0: memory, RAM at device 1.3 (no driver attached)
pci0: memory, RAM at device 1.4 (no driver attached)
pci0: memory, RAM at device 1.5 (no driver attached)
pci0: memory, RAM at device 1.6 (no driver attached)
pci0: memory, RAM at device 2.0 (no driver attached)
pci0: memory, RAM at device 2.1 (no driver attached)
pci0: memory, RAM at device 2.2 (no driver attached)
pcib1: ACPI PCI-PCI bridge at device 3.0 on pci0
pci1: ACPI PCI bus on pcib1
vgapci0: VGA-compatible display port 0x8c00-0x8c7f mem 0xcc00-0xccff,0xb000-0xbfff,0xcd00-0xcdff irq 16 at device 0.0 on pci1
nvidia0: GeForce 7600 GT on vgapci0
vgapci0: child nvidia0 requested pci_enable_busmaster
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: child nvidia0 requested pci_enable_io
nvidia0: [ITHREAD]
pci0: memory, RAM at device 9.0 (no driver attached)
isab0: PCI-ISA bridge port 0xfc00-0xfc7f at device 10.0 on pci0
isa0: ISA
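The flag words on the "MCA: CPU n ..." lines can be cross-checked against the status value by hand. What follows is only a minimal sketch, assuming the standard Intel IA32_MCi_STATUS layout (bit 63 VAL, 62 OVER, 61 UC, 60 EN, 59 MISCV, 58 ADDRV, 57 PCC). It uses just the leading byte 0xf2 of the status as printed in the message; the rest of the value appears to have lost digits in the archive, so it is not decoded. The function name is made up for illustration.

```sh
#!/bin/sh
# decode_mci_top_byte: hypothetical helper, not part of FreeBSD.
# Takes the most significant byte of a 64-bit MCi_STATUS value as two
# hex digits and prints the names of the flag bits that are set
# (bits 63..57, per the Intel layout assumed in the text above).
decode_mci_top_byte() {
    byte=$((0x$1))
    flags=""
    [ $((byte & 0x80)) -ne 0 ] && flags="$flags VAL"    # bit 63: entry is valid
    [ $((byte & 0x40)) -ne 0 ] && flags="$flags OVER"   # bit 62: an earlier error was overwritten
    [ $((byte & 0x20)) -ne 0 ] && flags="$flags UC"     # bit 61: uncorrected error
    [ $((byte & 0x10)) -ne 0 ] && flags="$flags EN"     # bit 60: error reporting enabled
    [ $((byte & 0x08)) -ne 0 ] && flags="$flags MISCV"  # bit 59: MISC register valid
    [ $((byte & 0x04)) -ne 0 ] && flags="$flags ADDRV"  # bit 58: ADDR register valid
    [ $((byte & 0x02)) -ne 0 ] && flags="$flags PCC"    # bit 57: processor context corrupt
    echo "${flags# }"
}

decode_mci_top_byte f2
```

For 0xf2 this prints "VAL OVER UC EN PCC", which agrees with the UNCOR, PCC and OVER words in the dmesg lines. It confirms an uncorrected error was logged but does not by itself say whether the DIMMs or the board are at fault.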
mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
Something interesting I've come across which happens on both RELENG_7 and RELENG_8 (indicating it's not a problem with the older tty code or the newer pty/pts code), and it's reproducible on Linux (sort of...).

mysqld_safe appears to hold a pty/tty open even after the process has been backgrounded. I can understand how/why this might occur, just not in this particular case.

I had a colleague test the situation on his Linux machine. He was able to confirm that:

1) mysqld_safe >/dev/null 2>&1 never released the tty
2) nohup mysqld_safe >/dev/null 2>&1 did release the tty

With regards to test #1, looking in /proc/{pid}/fd showed that STDIN was being held open. I recommended he point STDIN to /dev/null as so:

mysqld_safe </dev/null >/dev/null 2>&1

Which also solved the problem.

On FreeBSD it's a different story. Below, mysql-server was started as root on pts/1. The open file descriptors all point to /dev/null, so I'm not sure why the pty/tty is being held open.

icarus# ps -aux -U mysql
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
mysql 10078 0.2 0.3 35100 11032 1 S 11:38PM 0:00.02 [mysqld]
mysql 9997 0.0 0.0 8228 1592 1 S 11:38PM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mys

icarus# procstat -f 9997
PID COMM FD T V FLAGS REF OFFSET PRO NAME
9997 sh cwd v d - - - /root
9997 sh root v d - - - /
9997 sh 0 v c r--- 1 0 - /dev/null
9997 sh 1 v c -w-- 2 0 - /dev/null
9997 sh 2 v c -w-- 2 0 - /dev/null

icarus# procstat -f 10078
PID COMM FD T V FLAGS REF OFFSET PRO NAME
10078 mysqld cwd v d - - - /storage/mysql
10078 mysqld root v d - - - /
10078 mysqld 0 v c r--- 1 0 - /dev/null
10078 mysqld 1 v r rwa- 1 32048 - /storage/mysql/icarus.home.lan.err
10078 mysqld 2 v r rwa- 1 32380 - /storage/mysql/icarus.home.lan.err

At this point I log out of pts/1 and log back in to the machine (which sticks me on pts/2 as a result of the problem).
Looking again, we see:

icarus# ps -aux -U mysql
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
mysql 9997 0.0 0.0 8228 1592 1- I 11:38PM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mys
mysql 10078 0.0 0.3 35100 11032 1- I 11:38PM 0:00.02 [mysqld]

With absolutely no change in procstat output relevant to fds 0/1/2. Yet pts/1 still appears held open by something:

icarus# ls -l /dev/pts
total 0
crw--w 1 jdc tty 0, 116 Sep 29 23:44 0
crw-rw-rw- 1 root wheel 0, 115 Sep 29 23:41 1
crw--w 1 jdc tty 0, 117 Sep 29 23:44 2

fstat also shows no indication of anything using pts/1:

icarus# fstat /dev/pts/1
USER CMD PID FD MOUNT INUM MODE SZ|DV R/W NAME
icarus# fstat | grep pts/1
icarus#

Ideas?

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
On Thu, Sep 30, 2010 at 1:51 AM, Jeremy Chadwick free...@jdc.parodius.com wrote:
> Something interesting I've come across which happens on both RELENG_7 and RELENG_8 (indicating it's not a problem with the older tty code or the newer pty/pts code), and it's reproducible on Linux (sort of...).
>
> mysqld_safe appears to hold a pty/tty open even after the process has been backgrounded. I can understand how/why this might occur, just not in this particular case.

Actually came across this the other day:

http://lists.freebsd.org/pipermail/freebsd-ports/2010-July/062417.html

It appears you aren't the only one to notice the issue.

--
Adam Vande More
Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
Hi Jeremy,

* Jeremy Chadwick free...@jdc.parodius.com wrote:
> 1) mysqld_safe >/dev/null 2>&1 never released the tty
> 2) nohup mysqld_safe >/dev/null 2>&1 did release the tty

What happens if you run the following command?

    daemon -cf mysqld_safe

The point is that FreeBSD's pts(4) driver only deallocates TTYs when it's really sure nothing uses it anymore. Even if there is not a single file descriptor referring to the slave device, it has to wait until there exist no processes which have the TTY as its controlling TTY.

The `pstat -t' command is quite useful to figure out whether there is still a session associated with the TTY.

See the following thread:

    http://lists.freebsd.org/pipermail/freebsd-ports/2010-July/062417.html

--
Ed Schouten e...@80386.nl
WWW: http://80386.nl/
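The condition Ed describes (a process still has the TTY as its controlling terminal even though none of its descriptors point at it) is visible in the TT column of ps. A minimal sketch, assuming input in the shape produced by `ps -ax -o pid,tt,command`; the function name is hypothetical, and the sample rows are modelled on the ps output posted in this thread. The trailing "-" that ps appends when a process can no longer reach its controlling terminal is matched as well.

```sh
#!/bin/sh
# tty_holders: hypothetical helper for illustration.
# Reads "PID TT COMMAND" rows on stdin (first row is the header) and
# prints the PIDs whose TT column names the given pts number, with or
# without the trailing "-" marker.
tty_holders() {
    awk -v tt="$1" 'NR > 1 && ($2 == tt || $2 == (tt "-")) { print $1 }'
}

# Sample input modelled on the thread: two processes still tied to
# pts/1 and one started through daemon(8) with no controlling terminal.
printf '%s\n' \
    '  PID TT COMMAND' \
    ' 9997 1- /bin/sh /usr/local/bin/mysqld_safe' \
    '10078 1- [mysqld]' \
    '11036 ?? /bin/sh /usr/local/bin/mysqld_safe' | tty_holders 1
```

On a live system the same filter could be fed from `ps -ax -o pid,tt,command | tty_holders 1`; `pstat -t` remains the direct way to see whether the TTY itself still has a session attached.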
Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
On Thu, Sep 30, 2010 at 09:03:33AM +0200, Ed Schouten wrote:
> Hi Jeremy,
>
> * Jeremy Chadwick free...@jdc.parodius.com wrote:
>> 1) mysqld_safe >/dev/null 2>&1 never released the tty
>> 2) nohup mysqld_safe >/dev/null 2>&1 did release the tty
>
> What happens if you run the following command?
>
>     daemon -cf mysqld_safe

Let's try it and find out. This is all being done from pts/2.

icarus# ps -auxwww -U mysql | grep mysqld_safe
mysql 9997 0.0 0.0 8228 1592 1- I 11:38PM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mysql/my.cnf --user=mysql --datadir=/storage/mysql --pid-file=/storage/mysql/icarus.home.lan.pid --skip-innodb
icarus# /usr/local/etc/rc.d/mysql-server stop
Stopping mysql.
Waiting for PIDS: 10078.
icarus# daemon -c -f -u mysql /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mysql/my.cnf --user=mysql --datadir=/storage/mysql --pid-file=/storage/mysql/icarus.home.lan.pid --skip-innodb
icarus# ps -auxwww -U mysql
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
mysql 11036 0.0 0.0 8228 1600 ?? Is 12:21AM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mysql/my.cnf --user=mysql --datadir=/storage/mysql --pid-file=/storage/mysql/icarus.home.lan.pid --skip-innodb
mysql 6 0.0 0.3 35100 11032 ?? I 12:21AM 0:00.02 [mysqld]
icarus# exit
$ exit

[another window, different tty]

icarus# pstat -t | grep pts/2
icarus#

Summary: looks good to me.

> The point is that FreeBSD's pts(4) driver only deallocates TTYs when it's really sure nothing uses it anymore. Even if there is not a single file descriptor referring to the slave device, it has to wait until there exist no processes which have the TTY as its controlling TTY.

Ah I see. Well that would explain the difference between Linux and FreeBSD then -- it sounds like Linux has a one-off with regards to fds that point to /dev/null.

> The `pstat -t' command is quite useful to figure out whether there is still a session associated with the TTY.
> See the following thread:
>
>     http://lists.freebsd.org/pipermail/freebsd-ports/2010-July/062417.html

Ahhh, two people pointing me to the same thread, sweet. :-) I wasn't subscribed to -ports back in July, else I'd almost certainly have said something then.

It's exactly as you stated in that thread -- the tty is in G state (waiting to be freed/process to exist). Please note the below output was obtained *before* attempting the daemon -cf stuff you recommended.

icarus# pstat -t | grep pts/1
pts/1 0000 000 0 9372 0 G

Until rc(8) can be updated to support daemon(8) natively, the ~76 ports which Do The Wrong Thing(tm) should get updated to do it this way. Ones like mysqlXX-server should be placed high on the priority list given their popularity/importance.

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
On Thu, Sep 30, 2010 at 09:30:25AM +0200, Alex Dupre wrote:
> Jeremy Chadwick wrote:
>> Until rc(8) can be updated to support daemon(8) natively,
>
> This would be the Right Thing IMHO.
>
>> the ~76 ports which Do The Wrong Thing(tm) should get updated to do it this way. Ones like mysqlXX-server should be placed high on the priority list given their popularity/importance.
>
> If you have an already tested patch for the mysql rc script, I'll commit it asap.

Just finished it for databases/mysql51-server. Tested on RELENG_8 with the below variables in use, and also tested with mysql_limits=yes.

mysql_enable=yes
mysql_dbdir=/storage/mysql
mysql_args=--skip-innodb

Should work fine on RELENG_7 since it has /usr/sbin/daemon too. Tested using stop, start, and restart. I can test a reboot if you'd like, just let me know.

Validation:

icarus# /usr/local/etc/rc.d/mysql-server stop
Stopping mysql.
Waiting for PIDS: 12015.
icarus# /usr/local/etc/rc.d/mysql-server start
Starting mysql.
icarus# ps -auxwww -U mysql
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
mysql 12271 0.0 0.0 8228 1600 ?? Is 12:53AM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mysql/my.cnf --user=mysql --datadir=/storage/mysql --pid-file=/storage/mysql/icarus.home.lan.pid --skip-innodb
mysql 12352 0.0 0.3 35100 11032 ?? I 12:53AM 0:00.02 [mysqld]

I'll also take this opportunity to point this out, since I'm certain someone will mention it: daemon's -u argument would be ideal except that it breaks when using rc.subr's xxx_user variable (which uses su(1) to change credentials/spawn $command). With both in use, daemon then fails on setusercontext(), which in turn fails because of initgroups() returning EPERM -- and this does make sense. So let's not use daemon -u in rc.subr for the time being.

The diff is pretty obvious/simple (2 line change), so the other databases/mysqlXX-server ports can be upgraded in the same manner.
--- files/mysql-server.sh.in.orig 2010-03-27 03:24:53.0 -0700
+++ files/mysql-server.sh.in 2010-09-30 00:45:38.0 -0700
@@ -35,8 +35,8 @@
 mysql_user=mysql
 mysql_limits_args=-e -U ${mysql_user}
 pidfile=${mysql_dbdir}/`/bin/hostname`.pid
-command=%%PREFIX%%/bin/mysqld_safe
-command_args=--defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args} >/dev/null 2>&1
+command=/usr/sbin/daemon
+command_args=-c -f /usr/local/bin/mysqld_safe --defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args}
 procname=%%PREFIX%%/libexec/mysqld
 start_precmd=${name}_prestart
 start_postcmd=${name}_poststart

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
Jeremy Chadwick wrote:
> Until rc(8) can be updated to support daemon(8) natively,

This would be the Right Thing IMHO.

> the ~76 ports which Do The Wrong Thing(tm) should get updated to do it this way. Ones like mysqlXX-server should be placed high on the priority list given their popularity/importance.

If you have an already tested patch for the mysql rc script, I'll commit it asap.

--
Alex Dupre
Re: Diskless/readonly root booting issues
> Hi all,
>
> I've been working on updating my semi-embedded images to 7.3-stable of late (I generally wait for .3+ releases), it's been a few years since the last time I did one of these and I'm having some issues getting my netboot test environment to behave itself. I'm sure it's something simple but I've spent quite a bit of time looking for answers and poking the system but no joy yet.
>
> Basically I use a PXE booted NFS root to test my reduced footprint image builds, the boot is working but init is attempting to remount / rw (in spite of it being marked ro in fstab) which of course fails because the directory is exported ro from the NFS server at which point the system dumps me to single user mode;
>
> === OUTPUT ===
> Starting file system checks:
> udp: Netconfig database not found
> Mounting root filesystem rw failed, startup aborted
> ERROR: ABORTING BOOT (sending SIGTERM to parent)!
> Sep 30 09:60:02 init: /bin/sh on /etc/rc terminated abnormally, going to single user mode
> Enter full pathname of shell or RETURN for /bin/sh:
>
> Relevant configs from the diskless root
>
> == rc.conf ==
> ifconfig_le0=DHCP
> diskless_mount=/etc/rc.initdiskless
> varsize=8192
> varmfs=YES
> tmpsize=8192
> tmpmfs=YES
> nfs_client_enable=YES
> dumpdev=NO
> =
>
> rc.initdiskless is the version from /usr/share/examples/rc.initdiskless
>
> == fstab ==
> 192.168.2.2:/usr/fbtest / nfs ro 0 0
> proc /proc procfs rw 0 0
>
> == loader.conf ==
> verbose_loading=YES
> autoboot_delay=2
>
> Kernel is (obviously) built with NFS_ROOT and NFSCLIENT, relatively minimalist otherwise, have also tested with GENERIC, same result.
>
> I must be forgetting something simple in all of this, I don't recall it being terribly difficult to get this stuff working when I was doing my original work with 6.3, though I don't recall the use of the initdiskless script, IIRC I was using rc.diskless2 which (again IIRC) was later replaced by /etc/rc.d/diskless but I've not been able to find this script anywhere.
>
> Any suggestions would be greatly appreciated at this point.
> Thanks,
> Morgan Reed

firstly, you should be using the latest pxeboot, it passes the root file-handle to the kernel, so no need to remount it, so remove the line from the fstab.

secondly, try using /etc/rc.initdiskless - which is the default.

use the KISS method :-)

danny
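danny's first suggestion amounts to dropping the "/" entry from the client's fstab so that nothing tries to remount the NFS root. A minimal sketch, assuming the two-line fstab shown in the original message; the function name is hypothetical, and the filter only prints the result rather than editing any file.

```sh
#!/bin/sh
# drop_root_entry: hypothetical helper for illustration.
# Reads fstab text on stdin and prints it back without the entry whose
# mount point (second field) is "/". Comment lines are kept as they are.
drop_root_entry() {
    awk '/^[ \t]*#/ || $2 != "/" { print }'
}

# The fstab from the message above, passed through the filter.
printf '%s\n' \
    '192.168.2.2:/usr/fbtest / nfs ro 0 0' \
    'proc /proc procfs rw 0 0' | drop_root_entry
```

Only the procfs line is printed. Whether the root entry can really be omitted depends on pxeboot handing the root file handle to the kernel, as danny describes.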
[releng_8_0 tinderbox] failure on ia64/ia64
TB --- 2010-09-30 08:07:07 - tinderbox 2.6 running on freebsd-current.sentex.ca
TB --- 2010-09-30 08:07:07 - starting RELENG_8_0 tinderbox run for ia64/ia64
TB --- 2010-09-30 08:07:07 - cleaning the object tree
TB --- 2010-09-30 08:10:48 - cvsupping the source tree
TB --- 2010-09-30 08:10:48 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_8_0/ia64/ia64/supfile
TB --- 2010-09-30 08:54:21 - WARNING: /usr/bin/csup returned exit code 1
TB --- 2010-09-30 08:54:21 - ERROR: unable to cvsup the source tree
TB --- 2010-09-30 08:54:21 - 1.22 user 134.47 system 2833.60 real

http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8_0-ia64-ia64.full
[releng_8_0 tinderbox] failure on mips/mips
TB --- 2010-09-30 08:54:21 - tinderbox 2.6 running on freebsd-current.sentex.ca
TB --- 2010-09-30 08:54:21 - starting RELENG_8_0 tinderbox run for mips/mips
TB --- 2010-09-30 08:54:21 - cleaning the object tree
TB --- 2010-09-30 08:56:19 - cvsupping the source tree
TB --- 2010-09-30 08:56:19 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_8_0/mips/mips/supfile
TB --- 2010-09-30 09:39:03 - WARNING: /usr/bin/csup returned exit code 1
TB --- 2010-09-30 09:39:03 - ERROR: unable to cvsup the source tree
TB --- 2010-09-30 09:39:03 - 0.80 user 76.49 system 2681.91 real

http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8_0-mips-mips.full
Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
On Thu, Sep 30, 2010 at 08:53:07AM -0400, Paul Mather wrote:
> On Sep 30, 2010, at 3:56 AM, Jeremy Chadwick wrote:
>> The diff is pretty obvious/simple (2 line change), so the other databases/mysqlXX-server ports can be upgraded in the same manner.
>>
>> --- files/mysql-server.sh.in.orig 2010-03-27 03:24:53.0 -0700
>> +++ files/mysql-server.sh.in 2010-09-30 00:45:38.0 -0700
>> @@ -35,8 +35,8 @@
>>  mysql_user=mysql
>>  mysql_limits_args=-e -U ${mysql_user}
>>  pidfile=${mysql_dbdir}/`/bin/hostname`.pid
>> -command=%%PREFIX%%/bin/mysqld_safe
>> -command_args=--defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args} >/dev/null 2>&1
>> +command=/usr/sbin/daemon
>> +command_args=-c -f /usr/local/bin/mysqld_safe --defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args}
>
> Shouldn't this be -c -f %%PREFIX%%/bin/mysqld_safe ... rather than hard-coding /usr/local?

Yes.

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
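With Paul's correction applied, the two changed variables would read roughly as follows. This is a minimal sketch, not the committed script: `prefix` stands in for the %%PREFIX%% placeholder that the ports framework substitutes, and the sample values (taken from the test setup quoted in this thread) are assumptions for illustration. The real rc script gets them from rc.conf and its own defaults.

```sh
#!/bin/sh
# Hypothetical stand-ins for values the real rc script receives from
# rc.conf and from the ports framework.
prefix="/usr/local"                           # what %%PREFIX%% usually expands to
mysql_user="mysql"
mysql_dbdir="/storage/mysql"
mysql_args="--skip-innodb"
pidfile="${mysql_dbdir}/icarus.home.lan.pid"  # hostname taken from the thread

# The two variables touched by the patch: daemon(8) becomes the command,
# and mysqld_safe moves into its argument list using the prefix instead
# of a hard-coded /usr/local.
command="/usr/sbin/daemon"
command_args="-c -f ${prefix}/bin/mysqld_safe --defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args}"

# Show the command line that would be run.
echo "${command} ${command_args}"
```

With these sample values, everything after "-c -f" matches the mysqld_safe command line shown in the ps output earlier in the thread.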
Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime
On 30 Sep, Andriy Gapon wrote:
> on 30/09/2010 02:27 Don Lewis said the following:

> vmstat -i ?

I didn't see anything odd in the vmstat -i output that I posted to the list earlier. It looked more or less normal as the ntp offset suddenly went insane.

>> I did manage to catch the problem with lock profiling enabled:
>> http://people.freebsd.org/~truckman/AN-M2_HD-8.1-STABLE_lock_profile_freeze.txt
>> I'm currently testing SMP some more to verify if it really avoids this problem.
>
> OK.

I wasn't able to cause SMP on stable to break.

The silent reboots that I was seeing with WITNESS go away if I add WITNESS_SKIPSPIN. Witness doesn't complain about anything.

I tested -CURRENT and !SMP seems to work ok. One difference in terms of hardware between the two tests is that I'm using a SATA drive when testing -STABLE and a SCSI drive when testing -CURRENT.

At this point, I think the biggest clues are going to be in the lock profile results.
Re: RELENG_7 em problems (and RELENG_8)
At 08:00 PM 9/26/2010, Jack Vogel wrote:
> The system I've had stress tests running on has 82574 LOMs, so I hope it will solve the problem, will see tomorrow morning at how things have held up...

I pulled a copy of sys/dev/e1000 from HEAD and copied onto my RELENG_8 box. I had another nic lock up last night :(

Anyways, now running with the driver from HEAD on RELENG_8 amd64

em0: Intel(R) PRO/1000 Network Connection 7.0.8 port 0x4040-0x405f mem 0xb440-0xb441,0xb4425000-0xb4425fff irq 16 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:15:17:ed:68:a5
em1: Intel(R) PRO/1000 Network Connection 7.0.8 port 0x2000-0x201f mem 0xb410-0xb411,0xb412-0xb4123fff irq 16 at device 0.0 on pci9
em1: Using MSIX interrupts with 3 vectors
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: Ethernet address: 00:15:17:ed:68:a4

e...@pci0:0:25:0: class=0x02 card=0x34ec8086 chip=0x10ef8086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    class = network
    subclass = ethernet
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP
e...@pci0:9:0:0: class=0x02 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class = network
    subclass = ethernet
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 0003[140] = Serial 1 001517ed68a4

interrupt          total    rate
irq4: uart0         2283       6
irq16: siis0        4332      11
irq18: arcmsr0    137175     372
irq19: twa0        18805      51
irq21: ehci0        2734       7
irq23: ehci1         675       1
cpu0: timer       733804    1994
irq256: em0        73195     198
irq257: em1:rx 0     238       0
irq258: em1:tx 0      37       0
irq260: ahci0       4328      11
cpu1: timer       725637    1971
cpu3: timer       725709    1972
cpu2: timer       725688    1971
Total            3154640    8572

---Mike

> Jack
>
> On Sun, Sep 26, 2010 at 4:43 PM, Mike Tancsa m...@sentex.net wrote:
>> At 06:19 PM 9/26/2010, Jack Vogel wrote:
>>> Your em1 is using MSI not MSIX and thus can't have multiple queues. I'm not sure what's broken from what you show here. I will try to get the new driver out shortly for you to try.
>>
>> With this particular NIC, it will wedge under high load. I tried 2 different motherboards and chipsets, with the same behaviour.
>>
>> ---Mike
>>
>>> Jack
>>>
>>> On Sun, Sep 26, 2010 at 2:57 PM, Mike Tancsa m...@sentex.net wrote:
>>>> At 06:36 PM 9/24/2010, Jack Vogel wrote:
>>>>> There is a new revision of the em driver coming next week, it's going through some stress pounding over the weekend, if no issues show up I'll put it into HEAD.
>>>>>
>>>>> Yongari's changes in TX context handling which affects checksum and tso are added. I've also decided that multiple queues in 82574 just are a source of problems without a lot of benefit, so it still uses MSIX but with only 3 vectors, meaning it separates TX and RX but has a single queue.
>>>>
>>>> Thanks, looking forward to trying it out! With respect to the multiple queues, I thought the driver already used just the one on RELENG_8? If not, is there a way to force the existing driver to use just the one queue?
>>>>
>>>> On the box that has the NIC locking up, it shows
>>>>
>>>> e...@pci0:9:0:0: class=0x02 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00
>>>>     vendor = 'Intel Corporation'
>>>>     device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>>>>     class = network
>>>>     subclass = ethernet
>>>>     cap 01[c8] = powerspec 2 supports D0 D3 current D0
>>>>     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>>>>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>>>>
>>>> and vmstat -i shows
>>>>
>>>> irq256: em0 5129063 353
>>>> irq257: em1 531251 36
>>>>
>>>> in a wedged state, stats look like
>>>>
>>>> dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
>>>> dev.em.1.%driver: em
>>>> dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART
>>>> dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0x34ec class=0x02
>>>> dev.em.1.%parent: pci9
>>>> dev.em.1.nvm: -1
>>>> dev.em.1.rx_int_delay: 0
>>>> dev.em.1.tx_int_delay: 66
>>>> dev.em.1.rx_abs_int_delay: 66
Re: fetch: Non-recoverable resolver failure
Jeremy Chadwick wrote: On Tue, Sep 28, 2010 at 10:59:04PM +0200, Miroslav Lachman wrote: Jeremy Chadwick wrote: On Tue, Sep 28, 2010 at 08:12:00PM +0200, Miroslav Lachman wrote: Hi, we are using fetch command from cron to run PHP scripts periodically and sometimes cron sends error e-mails like this: fetch: https://hiden.example.com/cron/fiveminutes: Non-recoverable resolver failure [...] Note: target domains are hosted on the server it-self and named too. The system is FreeBSD 7.3-RELEASE-p2 i386 GENERIC Can somebody help me to diagnose this random fetch+resolver issue? [...] There is PF with some basic rules, mostly blocking incomming packets, allowing all outgoing and scrubbing: scrub in on bge1 all fragment reassemble scrub out on bge1 all no-df random-id min-ttl 24 max-mss 1492 fragment reassemble pass out on bge1 inet proto udp all keep state pass out on bge1 inet proto tcp from 1.2.3.40 to any flags S/SA modulate state pass out on bge1 inet proto tcp from 1.2.3.41 to any flags S/SA modulate state pass out on bge1 inet proto tcp from 1.2.3.42 to any flags S/SA modulate state modified PF options: set timeout { frag 15, interval 5 } set limit { frags 2500, states 5000 } set optimization aggressive set block-policy drop set loginterface bge1 # Let loopback and internal interface traffic flow without restrictions set skip on lo0 Please also provide pfctl -s info output, in addition to uname -a output (you can hide the hostname), since the pf stack differs depending on what FreeBSD version you're using. 
# pfctl -s info
No ALTQ support in kernel
ALTQ related functions disabled
Status: Enabled for 32 days 11:31:02           Debug: Urgent

Interface Stats for bge1              IPv4             IPv6
  Bytes In                     37064314787                0
  Bytes Out                   279633869976                0
  Packets In
    Passed                       214057477                0
    Blocked                        1180125                0
  Packets Out
    Passed                       272266744                0
    Blocked                         128777                0

State Table                          Total             Rate
  current entries                      181
  searches                       518860439          184.9/s
  inserts                         16608172            5.9/s
  removals                        16607991            5.9/s
Counters
  match                           17951131            6.4/s
  bad-offset                             0            0.0/s
  fragment                              23            0.0/s
  short                                  0            0.0/s
  normalize                              4            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                         3095            0.0/s
  state-mismatch                     16707            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s

uname: 7.3-RELEASE-p2 FreeBSD 7.3-RELEASE-p2 #0: Mon Jul 12 19:04:04 UTC 2010 r...@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC i386

Things that catch my eye as potential problems -- I don't have a way to confirm these are responsible for your issue (DNS resolver lookups are UDP-based, not TCP), but I want to point them out anyway.

1) modulate state is broken on FreeBSD. Taken from our pf.conf notes:

# Filtering (public interface only; see set skip)
#
# NOTE: Do not use modulate state, as it's known to be broken on FreeBSD.
# http://lists.freebsd.org/pipermail/freebsd-pf/2008-March/004227.html

2) optimization aggressive sounds dangerous given what pf.conf(5) says about it. I'd like to know what it considers idle.

3) I would also remove many of the options you have set in your scrub out rule. Starting with a clean slate to see if things improve is probably a good idea. As you'll see below, sometimes pf does things which may be correct per IP specification but don't work quite right with other vendors' IP stacks.

4) Your set timeout values look to be extreme. I would recommend leaving these at their defaults given your situation.

5) This feature is not in use in your pf.conf, but I want to point it out regardless. reassemble tcp is also broken in some way. Again taken from our pf.conf notes:

# Normalization -- resolve/reduce traffic ambiguities.
#
# NOTE: Do NOT use 'reassemble tcp' as it definitely causes breakage.
# Issue may be related to other vendors' IP stacks, so let's leave it
# disabled.

Thank you for all your hints about PF! Maybe it's time to consider refactoring our standard pf.conf which was
Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE
On Tuesday, September 28, 2010 3:57:01 pm Vitaly Magerya wrote: Jung-uk Kim wrote: - the mouse doesn't work until I restart moused manually I always use hint.psm.0.flags=0x6000 in /boot/loader.conf, i.e., turn on both HOOKRESUME and INITAFTERSUSPEND, to work around similar problem on different laptop. Yes, that helps (after the stall period). Can you please report other problems in the appropriate ML? em - freebsd-net@ usb - freebsd-usb@ acpi_ec - freebsd-acpi@ I will try to do so. I'm not sure about acpi_ec issue though; it's only a warning, and it doesn't cause me any troubles. I also have this kernel message once in a few hours (seemingly random) if I used sleep/resume before: MCA: Bank 1, Status 0xe20001f5 MCA: Global Cap 0x0005, Status 0x MCA: Vendor GenuineIntel, ID 0x695, APIC ID 0 MCA: CPU 0 UNCOR PCC OVER DCACHE L1 ??? error But once again, it doesn't really cause any problems. A true uncorrected machine check would trigger a MC# fault and panic. I think this is just garbage in the MCx banks. Are you running the latest 8-stable? The change to reset the banks on resume was MFC'd in r210509 on July 26. -- John Baldwin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: MCA messages in dmesg
On Thursday, September 30, 2010 2:49:24 am Adam Vande More wrote:

For a while now, my home server has been acting up. Actually it had a bad set of RAM long ago; I replaced it and it worked fine. It's been weird again now, and I've found this in dmesg:

MCA: Bank 0, Status 0xf2000800
MCA: Global Cap 0x0806, Status 0x
MCA: Vendor GenuineIntel, ID 0x6fb, APIC ID 2
MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory
MCA: Bank 0, Status 0xf2000800
MCA: Global Cap 0x0806, Status 0x
MCA: Vendor GenuineIntel, ID 0x6fb, APIC ID 3
MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory

Are you getting a panic when this happens?

-- John Baldwin
Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE
John Baldwin wrote: A true uncorrected machine check would trigger a MC# fault and panic. I think this is just garbage in the MCx banks. Are you running the latest 8-stable? No, 8.1-RELEASE.
Re: MCA messages in dmesg
On Thu, Sep 30, 2010 at 8:40 AM, John Baldwin j...@freebsd.org wrote:
On Thursday, September 30, 2010 2:49:24 am Adam Vande More wrote:

For a while now, my home server has been acting up. Actually it had a bad set of RAM long ago; I replaced it and it worked fine. It's been weird again now, and I've found this in dmesg:

MCA: Bank 0, Status 0xf2000800
MCA: Global Cap 0x0806, Status 0x
MCA: Vendor GenuineIntel, ID 0x6fb, APIC ID 2
MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory
MCA: Bank 0, Status 0xf2000800
MCA: Global Cap 0x0806, Status 0x
MCA: Vendor GenuineIntel, ID 0x6fb, APIC ID 3
MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory

Are you getting a panic when this happens?

Its symptoms vary, but yes, I think so. The box is headless, so I depend on logs after boot to see what happened. Sometimes the box panics and powers off with no warning, and other times it just seems to hit a stall state where everything becomes unresponsive and I have to manually power off.

-- Adam Vande More
Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE
On Thursday, September 30, 2010 10:53:14 am Vitaly Magerya wrote: John Baldwin wrote: A true uncorrected machine check would trigger a MC# fault and panic. I think this is just garbage in the MCx banks. Are you running the latest 8-stable? No, 8.1-RELEASE. Ok, that almost certainly explains it then. -- John Baldwin
Re: MCA messages in dmesg
On Thursday, September 30, 2010 12:33:24 pm Adam Vande More wrote:
On Thu, Sep 30, 2010 at 8:40 AM, John Baldwin j...@freebsd.org wrote:
On Thursday, September 30, 2010 2:49:24 am Adam Vande More wrote:

For a while now, my home server has been acting up. Actually it had a bad set of RAM long ago; I replaced it and it worked fine. It's been weird again now, and I've found this in dmesg:

MCA: Bank 0, Status 0xf2000800
MCA: Global Cap 0x0806, Status 0x
MCA: Vendor GenuineIntel, ID 0x6fb, APIC ID 2
MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory
MCA: Bank 0, Status 0xf2000800
MCA: Global Cap 0x0806, Status 0x
MCA: Vendor GenuineIntel, ID 0x6fb, APIC ID 3
MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory

Are you getting a panic when this happens?

Its symptoms vary, but yes, I think so. The box is headless, so I depend on logs after boot to see what happened. Sometimes the box panics and powers off with no warning, and other times it just seems to hit a stall state where everything becomes unresponsive and I have to manually power off.

Ok, it is a memory error of some sort, but mcelog claims it is a transaction timeout rather than an ECC error, per se:

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 BANK 0
MCG status:
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-timeout Error
BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
STATUS f2000800 MCGSTATUS 0
MCGCAP 806 APICID 2 SOCKETID 0
CPUID Vendor Intel Family 6 Model 15

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 3 BANK 0
MCG status:
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-timeout Error
BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
STATUS f2000800 MCGSTATUS 0
MCGCAP 806 APICID 3 SOCKETID 0
CPUID Vendor Intel Family 6 Model 15

I've no idea what specific hardware is busted (memory or motherboard or CPU), but I suspect something is likely broken.

-- John Baldwin
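For readers following along, the flags printed in the dmesg line (UNCOR, PCC, OVER) and in the mcelog "MCi status" section can be read straight off the top bits of the bank status register. The following is only a rough sketch: the bit positions are the architectural ones from Intel's machine-check documentation, the function name `decode_flags` is made up for illustration, and it assumes the status value as archived here has lost its lower digits, so only the leading byte 0xf2 is used.

```python
# Flag bits in the upper part of a 64-bit IA32_MCi_STATUS register.
# (bit position, short name, meaning)
MCI_STATUS_FLAGS = [
    (63, "VAL", "register holds a valid error"),
    (62, "OVER", "error overflow, an earlier error was overwritten"),
    (61, "UC", "uncorrected error"),
    (60, "EN", "error reporting enabled"),
    (59, "MISCV", "MISC register valid"),
    (58, "ADDRV", "ADDR register valid"),
    (57, "PCC", "processor context corrupt"),
]


def decode_flags(status):
    """Return the short names of the flag bits set in a 64-bit status value."""
    return [name for bit, name, _ in MCI_STATUS_FLAGS if (status >> bit) & 1]


# Assumption: only the leading byte (0xf2) of the archived status is trusted,
# so it is shifted into the top byte of an otherwise empty 64-bit value.
example_status = 0xF2 << 56

for bit, name, meaning in MCI_STATUS_FLAGS:
    if (example_status >> bit) & 1:
        print(f"bit {bit}: {name} ({meaning})")
```

Under that assumption the set bits are VAL, OVER, UC, EN and PCC, which lines up with the "Error overflow / Uncorrected error / Error enabled / Processor context corrupt" lines in the mcelog output quoted above. The low-order error code (the BUS/timeout part) is deliberately not decoded here, since the full register value is not available in the archive.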
Re: MCA messages in dmesg
On Thu, Sep 30, 2010 at 12:25 PM, John Baldwin j...@freebsd.org wrote: Ok, it is a memory error of some sort, but mcelog claims it is a transaction timeout rather than an ECC error, per se: snip I've no idea what specific hardware is busted (memory or motherboard or CPU), but I suspect something is likely broken. Thanks for looking into it, I'm going to play around with BIOS voltages to see if I can achieve some stability since I don't have much to lose trying that first. The system may work fine for a week or more, then have a really bad day. I've made some raises to the cpu voltage and we'll see how that goes. -- Adam Vande More
Re: Problem running 8.1R on KVM with AMD hosts
Hi FreeBSD-stable,

1. Please, build your kernel with debug symbols. 2. Show kgdb output

I could not convince the kernel to dump (it was looping forever but not panicking), but I have managed to compile a kernel with debugging symbols and DDB which immediately drops into the debugger when the problem occurs, see screenshot at: http://lukemarsden.net/kvm-panic.png

Progress, I sense. I tried typing 'panic' on the understanding that this should force a panic and cause it to dump core to the configured swap device (I have set dump* in /etc/rc.conf) so that I could get you the kgdb output, but it just looped back into the debugger. This issue seems to occur very early in the boot process.

I would like to invite anyone with the skills and the inclination to have a poke around with this directly over VNC to email me off-list and I will turn on the VM and send you the VNC credentials. My email address is: luke [at] hybrid-logic.co.uk Or you can catch me on Skype at luke.marsden. I'm in GMT+1. I look forward to hearing from you ;-)

-- Best Regards, Luke Marsden Hybrid Logic Ltd. Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS Mobile: +447791750420
Re: Problem running 8.1R on KVM with AMD hosts
On Thursday 30 September 2010 02:57 pm, Luke Marsden wrote:

Hi FreeBSD-stable, 1. Please, build your kernel with debug symbols. 2. Show kgdb output I could not convince the kernel to dump (it was looping forever but not panicking), but I have managed to compile a kernel with debugging symbols and DDB which immediately drops into the debugger when the problem occurs, see screenshot at: http://lukemarsden.net/kvm-panic.png

It seems MCA capability is advertised by the CPUID translator but writing to the MSRs causes GPF. In other words, it seems like a CPU emulator bug. A simple workaround is 'set hw.mca.enabled=0' from the loader prompt. If it works, add hw.mca.enabled=0 in /boot/loader.conf to make it permanent. MCA does not make any sense in emulation anyway.

Jung-uk Kim
Re: Problem running 8.1R on KVM with AMD hosts
On Thu, Sep 30, 2010 at 07:57:51PM +0100, Luke Marsden wrote:

Hi FreeBSD-stable, 1. Please, build your kernel with debug symbols. 2. Show kgdb output I could not convince the kernel to dump (it was looping forever but not panicking), but I have managed to compile a kernel with debugging symbols and DDB which immediately drops into the debugger when the problem occurs, see screenshot at: http://lukemarsden.net/kvm-panic.png Progress, I sense. I tried typing 'panic' on the understanding that this should force a panic and cause it to dump core to the configured swap device (I have set dump* in /etc/rc.conf) so that I could get you the kgdb output, but it just looped back into the debugger.

Try call doadump instead.

-- | Jeremy Chadwick j...@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x)
On Sep 30, 2010, at 3:56 AM, Jeremy Chadwick wrote:

The diff is pretty obvious/simple (2 line change), so the other databases/mysqlXX-server ports can be upgraded in the same manner.

--- files/mysql-server.sh.in.orig 2010-03-27 03:24:53.0 -0700
+++ files/mysql-server.sh.in 2010-09-30 00:45:38.0 -0700
@@ -35,8 +35,8 @@
 mysql_user=mysql
 mysql_limits_args=-e -U ${mysql_user}
 pidfile=${mysql_dbdir}/`/bin/hostname`.pid
-command=%%PREFIX%%/bin/mysqld_safe
-command_args=--defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args} > /dev/null 2>&1
+command=/usr/sbin/daemon
+command_args=-c -f /usr/local/bin/mysqld_safe --defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args}

Shouldn't this be -c -f %%PREFIX%%/bin/mysqld_safe ... rather than hard-coding /usr/local?

 procname=%%PREFIX%%/libexec/mysqld
 start_precmd=${name}_prestart
 start_postcmd=${name}_poststart

-- | Jeremy Chadwick j...@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB |

Cheers, Paul.
Re: Problem running 8.1R on KVM with AMD hosts
On Thu, 2010-09-30 at 18:55 -0400, Jung-uk Kim wrote: It seems MCA capability is advertised by the CPUID translator but writing to the MSRs causes GPF. In other words, it seems like a CPU emulator bug. A simple workaround is 'set hw.mca.enabled=0' from the loader prompt. If it works, add hw.mca.enabled=0 in /boot/loader.conf to make it permanent. MCA does not make any sense in emulation any way. Awesome, this allows us to boot 8.1R on Linux KVM with AMD hardware! Thank you very much. This has just doubled our number of availability zones. -- Best Regards, Luke Marsden Hybrid Logic Ltd. Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS Mobile: +447791750420
Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime
On 30 Sep, Andriy Gapon wrote:
on 30/09/2010 02:27 Don Lewis said the following:

I tried enabling apic and got worse results. I saw ping RTTs as high as 67 seconds. Here's the timer info with apic enabled: [snip] Here's the verbose boot info with apic: http://people.freebsd.org/~truckman/AN-M2_HD-8.1-STABLE-apic-verbose.txt

vmstat -i ?

Here's the vmstat -i output at the time the machine starts experiencing freezes and ntp goes insane:

Thu Sep 30 11:38:57 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           10          0
irq12: psm0                           18          0
irq14: ata0                         2845          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      74628         40
cpu0: timer                      3676399       1999
irq256: nfe0                        3915          2
Total                            3758132       2043
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u  129  128  377    0.185   -0.307   0.020

Thu Sep 30 11:39:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           10          0
irq12: psm0                           18          0
irq14: ata0                         2935          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      78954         41
cpu0: timer                      3796447       1998
irq256: nfe0                        4090          2
Total                            3882771       2043
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u   61  128  377    0.185   -0.307   0.023

Thu Sep 30 11:40:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           10          0
irq12: psm0                           18          0
irq14: ata0                         3025          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      85038         43
cpu0: timer                      3916483       1998
irq256: nfe0                        4247          2
Total                            4009138       2045
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u  121  128  377    0.185   -0.307   0.023

Thu Sep 30 11:41:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           10          0
irq12: psm0                           18          0
irq14: ata0                         3115          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      89099         44
cpu0: timer                      4036529       1998
irq256: nfe0                        4384          2
Total                            4133472       2046
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u   54  128  377    0.185   -0.307 43008.9

Thu Sep 30 11:42:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           11          0
irq12: psm0                           18          0
irq14: ata0                         3205          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      92111         44
cpu0: timer                      4156575       1998
irq256: nfe0                        4421          2
Total                            4256658       2046
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u  114  128  377    0.185   -0.307 43008.9

Thu Sep 30 11:43:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           12          0
irq12: psm0                           18          0
irq14: ata0                         3295          1
irq17: ahc0                          310          0
irq19: