Re: Quggaa locking hard.
I have also seen this with a recent version of FreeBSD 8 (I know 8.0-BETA2 didn't have this problem, also I have an 8.0-RC1 without problems, but I think RC3 did have it, and I'm sure -RELEASE has it).

A few more details:

It happened both on amd64 and i386. I couldn't debug amd64 (it was a live server and we couldn't afford it), but on i386 flowcleaner was using a LOT of CPU.

It seemed to happen after booting, when quagga was importing global routing tables (~300k routes) from 2 BGP sessions. At least one of the sessions seemed to finish importing routes, but the kernel routing table seemed to be growing very slowly. Doing netstat -nr | wc -l took way longer than usual (20-30 seconds versus 9 seconds now), and it only reported about 100k routes. Doing it again after a minute or so showed the number of routes grew by around 10k. During this time, both quagga and zebra were very slow to respond to a new telnet session opened to them.

As a workaround, I did sysctl net.inet.flowtable.enable=0. This didn't ease the load on the CPU, but having it in /etc/sysctl.conf and rebooting did help (quagga started up normally and all routes are where they should be).

Hope this helps
Alex

--- On Fri, 12/4/09, Zaphod Beeblebrox zbee...@gmail.com wrote:

From: Zaphod Beeblebrox zbee...@gmail.com
Subject: Quggaa locking hard.
To: FreeBSD Stable freebsd-stable@freebsd.org
Date: Friday, December 4, 2009, 5:46 AM

> I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0 and not locking hard on 7.2. It seems (at this early point in the investigation) that both bgpd and zebra are wedging and zebra is listed as being in the RUN state. curiously, the load is also 4.0 (exactly the number of cores in the machine) even though the machine also reads 100% idle.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
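To put the numbers in Alex's report in perspective, here is a rough back-of-the-envelope sketch in C. It assumes the kernel table should end up with about 300k routes and that the observed growth of roughly 10k routes per minute stays constant; both are simplifications, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many minutes remain until the kernel routing table
 * reaches `target` routes, assuming a constant insertion rate.
 * Rounds up to the next whole minute. Hypothetical helper, not part
 * of quagga or FreeBSD.
 */
static uint32_t
minutes_to_converge(uint32_t current, uint32_t target, uint32_t routes_per_min)
{
    assert(routes_per_min > 0);    /* a zero rate would never converge */

    if (current >= target)
        return 0;

    uint32_t remaining = target - current;
    return (remaining + routes_per_min - 1) / routes_per_min;
}
```

With the figures from the report, minutes_to_converge(100000, 300000, 10000) gives 20, so under these assumptions the import would need about another 20 minutes with the flowtable enabled.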
Re: Quggaa locking hard.
At 10:46 PM 12/3/2009, Zaphod Beeblebrox wrote:
> I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0 and not locking hard on 7.2. It seems (at this early point in the investigation) that both bgpd and zebra are wedging and zebra is listed as being in the RUN state. curiously, the load is also 4.0 (exactly the number of cores in the machine) even though the machine also reads 100% idle.

I think I am seeing something similar on a test box. I was loading up the box with 200k routes to do testing with. Kernel is default, save for a few unused drivers removed. If I take out

    options FLOWTABLE # per-cpu routing cache

from the kernel, load avg is back to normal. This issue only seems to have come up in the past week or so as the previous kernel from ~8 days ago was OK.

last pid: 6229; load averages: 2.00, 2.00, 2.00 up 1+17:33:02 09:39:31
141 processes: 7 running, 106 sleeping, 28 waiting
CPU: 0.0% user, 0.0% nice, 22.2% system, 0.0% interrupt, 77.8% idle
Mem: 98M Active, 2233M Inact, 187M Wired, 36K Cache, 112M Buf, 979M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME  PRI NICE    SIZE     RES STATE   C   TIME    WCPU COMMAND
   22 root       76    -      0K      8K CPU3    3  41.5H 100.00% flowcleaner
   11 root      171 ki31      0K     32K CPU2    2  41.5H 100.00% {idle: cpu2}
   11 root      171 ki31      0K     32K CPU1    1  41.5H 100.00% {idle: cpu1}
   11 root      171 ki31      0K     32K RUN     0  41.4H 100.00% {idle: cpu0}
  869 root        4    0  64860K  64488K select  0   4:12   0.00% bgpd
   11 root      171 ki31      0K     32K RUN     3   2:09   0.00% {idle: cpu3}
   20 root       44    -      0K      8K syncer  0   1:00   0.00% syncer
   12 root      -32    -      0K    224K WAIT    1   0:47   0.00% {swi4: clock}
    0 root      -68    0      0K     80K -       2   0:03   0.00% {fw0_taskq}
 1230 root       76    0   3348K   1160K ttyin   2   0:02   0.00% getty
  863 root       96    0  24640K  24232K RUN     2   0:02   0.00% zebra
   12 root      -32    -      0K    224K WAIT    2   0:01   0.00% {swi4: clock}
   14 root      -16    -      0K      8K -       0   0:01   0.00% yarrow

Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike
Re: loader(8) readin failed on 7.2R and later including 8.0R
On Thursday 03 December 2009 4:20:08 pm Hiroki Sato wrote:
> John Baldwin j...@freebsd.org wrote
>   in 200912030803.29797@freebsd.org:
>
> jh> On Thursday 03 December 2009 5:29:13 am Hiroki Sato wrote:
> jh> > John Baldwin j...@freebsd.org wrote
> jh> >   in 200912020948.05698@freebsd.org:
> jh> >
> jh> > jh> On Tuesday 01 December 2009 12:13:39 pm Hiroki Sato wrote:
> jh> > jh> > While the load command seemed to finish, the box got stuck just
> jh> > jh> > after entering boot command.
> jh> > jh> >
> jh> > jh> > Curious to say, I have got this symptom only on a specific box in
> jh> > jh> > more than ten different boxes I upgraded so far; it is based on an
> jh> > jh> > old motherboard Supermicro P4DPE[*].
> jh> > jh> >
> jh> > jh> > [*] http://www.supermicro.com/products/motherboard/Xeon/E7500/P4DPE.cfm
> jh> > jh> >
> jh> > jh> > Any workaround? Booting from release CDROMs (7.2R and 8.0R) also
> jh> > jh> > fail. On the box 7.1R or 7.1R's loader + 7.2R kernel worked
> jh> > jh> > fine. It is possible something in changes of loader(8) between 7.1R
> jh> > jh> > and 7.2R is the cause, but I am still not sure what it is...
> jh> > jh>
> jh> > jh> It may be related to the loader switching to using memory > 1MB for its
> jh> > jh> malloc(). Maybe try building the loader with
> jh> > jh> 'LOADER_NO_GPT_SUPPORT=yes' in /etc/src.conf?
> jh> >
> jh> > Thanks, a recompiled loader with 'LOADER_NO_GPT_SUPPORT=yes' displayed
> jh> > "elf32_loadimage: could not read symbols - skipped!" for 8.0R kernel.
> jh> > This is the same as 7.1R's loader + 8.0R kernel case.
> jh>
> jh> Can you get the output of 'smap' from the loader? Is the 8.0 kernel bigger
> jh> than the 7.x kernel? If so, can you try trimming the 8.0 kernel a bit to see
> jh> if that changes things?
>
> Sure. Output of smap on an 8.0R loader with LOADER_NO_GPT_SUPPORT=yes was:
>
> | OK smap
> | SMAP type=01 base= len=0009f400
> | SMAP type=02 base=0009f400 len=0c00
> | SMAP type=02 base=000dc000 len=00024000
> | SMAP type=01 base=0010 len=00e0

So this is the region that ends up getting used for malloc:

    /* look for the first segment in 'extended' memory */
    if ((smap.type == SMAP_TYPE_MEMORY) && (smap.base == 0x10)) {
        bios_extmem = smap.length;
    ...
    /* Set memtop to actual top of memory */
    memtop = memtop_copyin = 0x10 + bios_extmem;

and then later:

    #if defined(LOADER_BZIP2_SUPPORT) || defined(LOADER_FIREWIRE_SUPPORT) || defined(LOADER_GPT_SUPPORT) || defined(LOADER_ZFS_SUPPORT)
        heap_top = PTOV(memtop_copyin);
        memtop_copyin -= 0x30;
        heap_bottom = PTOV(memtop_copyin);
    #else

So memtop_copyin would start off as 0xf0 but would end up as 0xc0, and since the kernel starts at 4MB, I think that only leaves about 8MB for the kernel. Probably the loader needs to be more intelligent about using high memory for malloc by using the largest region above 1MB but below 4GB for malloc() instead of stealing memory from bios_extmem in the SMAP case. Try the attached patch, which tries to make the loader use better smarts when picking a memory region for the heap (warning, I haven't tested it myself yet).

> | SMAP type=02 base=00f0 len=0010
> | SMAP type=01 base=0100 len=beef
> | SMAP type=03 base=bfef len=c000
> | SMAP type=04 base=bfefc000 len=4000
> | SMAP type=01 base=bff0 len=0008
> | SMAP type=02 base=bff8 len=0008
> | SMAP type=02 base=fec0 len=0001
> | SMAP type=02 base=fee0 len=1000
> | SMAP type=02 base=ff80 len=0040
> | SMAP type=02 base=fff0 len=0010
> | OK
>
> Size difference between the two kernels was:
>
> | -r-xr-xr-x 1 root wheel 9708240 Dec 1 16:22 kernel.7/kernel
> | -r-xr-xr-x 1 root wheel 11492703 Nov 21 15:48 kernel.8/kernel
>
> Then I rebuilt a smaller 8.0 kernel by removing some entries from the kernel configuration file. The size is now smaller than 7.1R kernel:
>
> | -r-xr-xr-x 1 root wheel 7710491 Dec 3 21:10 /boot/kernel.8X/kernel
>
> Loading the new kernel seemed to work fine with the recompiled 8.0R loader, but it got stuck just after entering boot:
>
> | OK load /boot/kernel.8X/kernel
> | /boot/kernel.8X/kernel text=0x5a7664 data=0x88d74+0x82f04 syms=[0x4+0x6d290+0x4+0x987e3]
> | OK boot
> | /

I'm not sure why it would get stuck. Can you add some debug printfs to see how far it gets before it dies? E.g. does it get to the point of calling exec() (in which case the hang is in the kernel in locore.S rather than in the loader)?

--
John Baldwin

--- //depot/vendor/freebsd/src/sys/boot/i386/libi386/biosmem.c 2007/10/28 21:26:35
+++ //depot/user/jhb/boot/sys/boot/i386/libi386/biosmem.c 2009/12/04 15:33:59
@@ -35,14 +35,27 @@
 #include "libi386.h"
 #include "btxv86.h"
-vm_offset_t memtop, memtop_copyin;
-u_int32_t bios_basemem, bios_extmem;
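For readers following the arithmetic in John's explanation, here is a minimal C sketch of it. It assumes that the hex values that appear shortened in this archive were originally 0x100000 (1MB) for the start of extended memory and 0x300000 (3MB) for the heap size, and that the SMAP entries read base=00100000 len=00e00000 and base=01000000 len=beef0000. Those expansions are only my reading, chosen because they reproduce the "about 8MB" conclusion; the archive itself does not state them. The function names and the sample table are hypothetical, and this is not the content of the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMAP_TYPE_MEMORY 1               /* type=01 in the smap output */
#define ONE_MB      0x100000ULL          /* assumed start of 'extended' memory */
#define FOUR_GB     0x100000000ULL
#define HEAP_SIZE   0x300000ULL          /* assumed 3MB heap taken below memtop */
#define KERNEL_BASE 0x400000ULL          /* "the kernel starts at 4MB" */

struct smap_entry {
    uint64_t base;
    uint64_t length;
    uint32_t type;
};

/* First six entries of the posted smap output, with assumed full-width values. */
static const struct smap_entry sample_smap[] = {
    { 0x00000000ULL, 0x0009f400ULL, 1 },
    { 0x0009f400ULL, 0x00000c00ULL, 2 },
    { 0x000dc000ULL, 0x00024000ULL, 2 },
    { 0x00100000ULL, 0x00e00000ULL, 1 },
    { 0x00f00000ULL, 0x00100000ULL, 2 },
    { 0x01000000ULL, 0xbeef0000ULL, 1 },
};
#define SAMPLE_SMAP_LEN (sizeof(sample_smap) / sizeof(sample_smap[0]))

/*
 * Behaviour described in the mail: the heap is carved from the top of
 * the usable segment that starts at 1MB. Returns the resulting heap
 * bottom (the new memtop_copyin), or 0 if no such segment exists.
 */
static uint64_t
heap_bottom_from_extmem(const struct smap_entry *map, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].type == SMAP_TYPE_MEMORY && map[i].base == ONE_MB) {
            assert(map[i].length >= HEAP_SIZE);
            return ONE_MB + map[i].length - HEAP_SIZE;
        }
    }
    return 0;
}

/* Space left between the kernel load address and the heap bottom. */
static uint64_t
room_for_kernel(uint64_t heap_bottom)
{
    return heap_bottom > KERNEL_BASE ? heap_bottom - KERNEL_BASE : 0;
}

/*
 * Alternative suggested in the mail: use the largest usable region
 * that lies at or above 1MB and ends at or below 4GB.
 */
static const struct smap_entry *
largest_high_region(const struct smap_entry *map, size_t n)
{
    const struct smap_entry *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (map[i].type != SMAP_TYPE_MEMORY || map[i].base < ONE_MB)
            continue;
        if (map[i].base + map[i].length > FOUR_GB)
            continue;
        if (best == NULL || map[i].length > best->length)
            best = &map[i];
    }
    return best;
}
```

Under these assumptions the first approach leaves 0x800000 bytes (8MB) between the kernel base and the heap, which is smaller than either of the two unmodified kernels listed above, while the second approach would pick the roughly 3GB region starting at 16MB and leave the low segment to the kernel.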
Re: Could you please fix this ?
On Thursday 03 December 2009 09:27 pm, Leonardo Santagostini wrote:
> Sorry, but I've not backed up this file; instead, I will copy the entire function (in fact it's very short):
>
> acpi_pcib_pci_attach(device_t dev)
> {
>     struct acpi_pcib_softc *sc;
>
>     ACPI_FUNCTION_TRACE((char *)(uintptr_t)__func__);
>
>     if (device_get_unit(dev) == 2) {
>         pci_write_config(dev, PCIR_COMMAND, PCIM_CMD_MEMEN | PCIM_CMD_PORTEN, 1);
>         pci_enable_busmaster(dev);
>         pci_write_config(dev, PCIR_IOBASEL_1, 0xf0, 1);
>         pci_write_config(dev, PCIR_MEMBASE_1, 0xf020, 2);
>         pci_write_config(dev, PCIR_MEMLIMIT_1, 0xf020, 2);
>         pci_write_config(dev, PCIR_PMBASEL_1, 0xfff1, 2);
>     }
>     if (device_get_unit(dev) == 3) {
>         pci_write_config(dev, PCIR_COMMAND, PCIM_CMD_MEMEN | PCIM_CMD_PORTEN, 1);
>         pci_enable_busmaster(dev);
>         pci_write_config(dev, PCIR_IOBASEL_1, 0xf0, 1);
>         pci_write_config(dev, PCIR_MEMBASE_1, 0xf030, 2);
>         pci_write_config(dev, PCIR_MEMLIMIT_1, 0xf030, 2);
>         pci_write_config(dev, PCIR_PMBASEL_1, 0xfff1, 2);
>     }
>
>     pcib_attach_common(dev);
>     sc = device_get_softc(dev);
>     sc->ap_handle = acpi_get_handle(dev);
>     return (acpi_pcib_attach(dev, &sc->ap_prt, sc->ap_pcibsc.secbus));
> }

As mav@ pointed out yesterday, this hack is very specific to this hardware. As jhb@ pointed out some time ago, this problem will be properly addressed by his multipass device probing mechanism. Sorry, there's nothing we can commit ATM.

Jung-uk Kim

> Kind Regards
> Leonardo Santagostini
>
> 2009/12/3 Giorgos Keramidas keram...@freebsd.org:
> > On Thu, 3 Dec 2009 01:57:50 +, Leonardo Santagostini lsantagost...@gmail.com wrote:
> > > Hello everybody,
> > >
> > > I was facing one big problem. I have a notebook, which is an Acer Aspire 5920. If you like I can send you my messages file.
> > >
> > > Which is:
> > > Intel(R) Core(TM)2 Duo CPU T5550 @ 1.83GHz (1833.48-MHz 686-class CPU)
> > > Intel(R) PRO/Wireless 3945ABG
> > > Broadcom NetLink Gigabit Ethernet Controller
> > > 2 Gigs RAM
> > > 160 Gigs SATA
> > >
> > > The point was: with ACPI disabled, I managed to boot but without WIFI; and with ACPI enabled, the boot process hung every time.
> > >
> > > I fixed this by adding
> > >
> > >     if (device_get_unit(dev) == 2) {
> > >         pci_write_config(dev, PCIR_COMMAND, PCIM_CMD_MEMEN | PCIM_CMD_PORTEN, 1);
> > >         pci_enable_busmaster(dev);
> > >         pci_write_config(dev, PCIR_IOBASEL_1, 0xf0, 1);
> > >         pci_write_config(dev, PCIR_MEMBASE_1, 0xf020, 2);
> > >         pci_write_config(dev, PCIR_MEMLIMIT_1, 0xf020, 2);
> > >         pci_write_config(dev, PCIR_PMBASEL_1, 0xfff1, 2);
> > >     }
> > >     if (device_get_unit(dev) == 3) {
> > >         pci_write_config(dev, PCIR_COMMAND, PCIM_CMD_MEMEN | PCIM_CMD_PORTEN, 1);
> > >         pci_enable_busmaster(dev);
> > >         pci_write_config(dev, PCIR_IOBASEL_1, 0xf0, 1);
> > >         pci_write_config(dev, PCIR_MEMBASE_1, 0xf030, 2);
> > >         pci_write_config(dev, PCIR_MEMLIMIT_1, 0xf030, 2);
> > >         pci_write_config(dev, PCIR_PMBASEL_1, 0xfff1, 2);
> > >     }
> > >
> > > to /usr/src/sys/dev/acpica/acpi_pcib_pci.c running on a 8.0-RELEASE.
> > >
> > > I was able to fix it my way but many people can't do it, so I would really appreciate it if you could add this piece of code.
> >
> > Hi Leonardo. Jung-uk Kim has done a lot of ACPI-related work, so he will probably know if the change is ok to commit to stable/8. I've added him to the thread, so he can let us know what he thinks of the change.
> >
> > Can you please post a diff that also shows _where_ the changes have to be installed in our current version of src/sys/dev/acpica/acpi_pcib_pci.c for 8.0-RELEASE?
Fatal trap 9 triggered by zfs?
I'm getting panics like this every so often (couple weeks, sometimes just a few days.) A second machine that has identical hardware and is running the same source has no such problems.

FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec 1 14:30:54 UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT amd64

# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          ad4s1d    ONLINE       0     0     0

# cat /boot/loader.conf
vfs.zfs.arc_max=512M
vfs.zfs.prefetch_disable=1
vfs.zfs.zil_disable=1

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0x80a39900
stack pointer       = 0x28:0xff80622ddae0
frame pointer       = 0x28:0xff80622ddb10
code segment        = base 0x0, limit 0xf, type 0x1b
                    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process     = 0 (spa_zio)
trap number         = 9
panic: general protection fault
cpuid = 0
Uptime: 17h44m5s
Physical memory: 3313 MB
Dumping 1843 MB: 1828 1812 1796 1780 1764 1748 1732 1716 1700 1684 1668 1652 1636 1620 1604 1588 1572 1556 1540 1524 1508 1492 1476 1460 1444 1428 1412 1396 1380 1364 1348 1332 1316 1300 1284 1268 1252 1236 1220 1204 1188 1172 1156 1140 1124 1108 1092 1076 1060 1044 1028 1012 996 980 964 948 932 916 900 884 868 852 836 820 804 788 772 756 740 724 708 692 676 660 644 628 612 596 580 564 548 532 516 500 484 468 452 436 420 404 388 372 356 340 324 308 292 276 260 244 228 212 196 180 164 148 132 116 100 84 68 52 36 20 4

#0 doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump () at pcpu.h:223
#1 0x803374b9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#2 0x8033790c in panic (fmt=Variable fmt is not available.
) at /usr/src/sys/kern/kern_shutdown.c:579
#3 0x805cbb8d in trap_fatal (frame=0x9, eva=Variable eva is not available.
) at /usr/src/sys/amd64/amd64/trap.c:857
#4 0x805cc6f2 in trap (frame=0xff80622dda30) at /usr/src/sys/amd64/amd64/trap.c:644
#5 0x805b2223 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#6 0x80a39900 in vdev_queue_agg_io_done (aio=0xff00374562d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:174
#7 0x80a4be6f in zio_done (zio=0xff00374562d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:2243
#8 0x80a49e87 in zio_execute (zio=0xff00374562d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#9 0x809ed603 in taskq_run (arg=0xff008d8d0420, pending=Variable pending is not available.
) at /usr/src/sys/modules/zfs/../../cddl/compat/opensolaris/kern/opensolaris_taskq.c:108
#10 0x80373533 in taskqueue_run (queue=0xff00017e1400) at /usr/src/sys/kern/subr_taskqueue.c:239
#11 0x803737b6 in taskqueue_thread_loop (arg=Variable arg is not available.
) at /usr/src/sys/kern/subr_taskqueue.c:360
#12 0x8030e0b8 in fork_exit (callout=0x80373770 taskqueue_thread_loop, arg=0xff00016434e0, frame=0xff80622ddc80) at /usr/src/sys/kern/kern_fork.c:843
#13 0x805b26fe in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:561
#14 0x in ?? ()
#15 0x in ?? ()
#16 0x in ?? ()
#17 0x in ?? ()
#18 0x in ?? ()
#19 0x in ?? ()
#20 0x in ?? ()
#21 0x in ?? ()
#22 0x in ?? ()
#23 0x in ?? ()
#24 0x in ?? ()
#25 0x in ?? ()
#26 0x in ?? ()
#27 0x in ?? ()
#28 0x in ?? ()
#29 0x in ?? ()
#30 0x in ?? ()
#31 0x in ?? ()
#32 0x in ?? ()
#33 0x in ?? ()
#34 0x in ?? ()
#35 0x in ?? ()
#36 0x in ?? ()
#37 0x in ?? ()
#38 0x00c6c000 in ?? ()
#39 0x in ?? ()
#40 0x000b in ?? ()
#41 0x80832500 in affinity ()
#42 0xff000173c390 in ?? ()
#43 0xff80622dd240 in ?? ()
#44 0xff80622dd1f8 in ?? ()
#45 0xff00015ecab0 in ?? ()
#46 0x8035aa48 in sched_switch (td=0x80373770, newtd=0xff00016434e0, flags=Variable flags is not available.
) at /usr/src/sys/kern/sched_ule.c:1858
Previous frame inner to this frame (corrupt stack?)
(kgdb)

--
Stefan Bethke s...@lassitu.de Fon +49 151 14070811
Re: Could you please fix this ?
Ok, anyway thanks for your time.

Best Regards
Leonardo Santagostini

2009/12/4 Jung-uk Kim j...@freebsd.org:
> As mav@ pointed out yesterday, this hack is very specific to this hardware. As jhb@ pointed out some time ago, this problem will be properly addressed by his multipass device probing mechanism. Sorry, there's nothing we can commit ATM.
>
> Jung-uk Kim
Re: vge problem
On Fri, Dec 04, 2009 at 02:47:35PM +0900, Yoshiaki Kasahara wrote:
> On Thu, 3 Dec 2009 15:08:10 -0800, Pyun YongHyeon pyu...@gmail.com said:
>
> > I remember there were several instability reports of vge(4). Would you try the following patch? The patch was generated against CURRENT so it may not cleanly apply to 8.0 due to if_timer changes. But I think you can download latest vge(4) code in CURRENT and apply the patch. Note, the patch was not tested at all on real hardware so it even may not work at all. (Long time ago, I ordered the vge(4) hardware but it was not delivered.)
>
> I downloaded vge(4) code in CURRENT, put it in 8.0R source tree, applied the patch, and rebuilt GENERIC kernel (actually I shortcut it with NO_CLEAN flag).
>
> After I rebooted with the new kernel, the boot sequence stopped just after setting hostname.
>
> Setting hostname: elvenbow.cc.kyushu-u.ac.jp
> msk0: Uncorrectable PCI Express error
> vge0: link state changed to DOWN
> msk0: link state changed to DOWN
> (stop)
>
> The system didn't completely freeze. I can push Scroll Lock and Page Up/Down to browse the boot messages, but sometimes it stopped responding to my input for a second. Ctrl-C had no effect and I had to hit the reset button. Now my PC is synchronizing degraded gmirror volume...(ouch)

I'm not sure why it would get stuck. You touched only vge(4), right?

> I'm wondering if I need complete kernel rebuild for the code to work...
>
> Regards,
> --
> Yoshiaki Kasahara
> Research Institute for Information Technology, Kyushu University
> kasah...@nc.kyushu-u.ac.jp
Re: vge problem
On Fri, 4 Dec 2009 09:36:01 -0800, Pyun YongHyeon pyu...@gmail.com said:

> > After I rebooted with the new kernel, the boot sequence stopped just after setting hostname.
> >
> > Setting hostname: elvenbow.cc.kyushu-u.ac.jp
> > msk0: Uncorrectable PCI Express error
> > vge0: link state changed to DOWN
> > msk0: link state changed to DOWN
> > (stop)
> >
> > The system didn't completely freeze. I can push Scroll Lock and Page Up/Down to browse the boot messages, but sometimes it stopped responding to my input for a second. Ctrl-C had no effect and I had to hit the reset button. Now my PC is synchronizing degraded gmirror volume...(ouch)
>
> I'm not sure why it would get stuck. You touched only vge(4), right?

Yes, I only touched vge(4). I believe that "msk0: Uncorrectable PCI Express error" wasn't relevant to the freeze because it also happened before I replaced vge(4). I guess the system froze while initializing vge(4), but I'm not really sure actually.

What can I do to narrow the cause of problems? Is it useful to build kernel with options KDB and DDB?

Regards,
--
Yoshiaki Kasahara
Research Institute for Information Technology, Kyushu University
kasah...@nc.kyushu-u.ac.jp
Re: vge problem
On Sat, Dec 05, 2009 at 03:26:45AM +0900, Yoshiaki Kasahara wrote:
> On Fri, 4 Dec 2009 09:36:01 -0800, Pyun YongHyeon pyu...@gmail.com said:
>
> > > After I rebooted with the new kernel, the boot sequence stopped just after setting hostname.
> > >
> > > Setting hostname: elvenbow.cc.kyushu-u.ac.jp
> > > msk0: Uncorrectable PCI Express error
> > > vge0: link state changed to DOWN
> > > msk0: link state changed to DOWN
> > > (stop)
> > >
> > > The system didn't completely freeze. I can push Scroll Lock and Page Up/Down to browse the boot messages, but sometimes it stopped responding to my input for a second. Ctrl-C had no effect and I had to hit the reset button. Now my PC is synchronizing degraded gmirror volume...(ouch)
> >
> > I'm not sure why it would get stuck. You touched only vge(4), right?
>
> Yes, I only touched vge(4). I believe that "msk0: Uncorrectable PCI Express error" wasn't relevant to the freeze because it also happened

In most cases you can ignore that message.

> before I replaced vge(4). I guess the system froze while initializing vge(4), but I'm not really sure actually.

Yes, that's also possible. But I can't explain how the patch can freeze the box. Another user also reported a similar vge(4) issue in private mail and tried the same patch, and he could successfully boot with the patched vge(4). Unfortunately the patch does not seem to fix his issue. I'm still working on it.

> What can I do to narrow the cause of problems? Is it useful to build kernel with options KDB and DDB?

Yes.
openjdk6 browser plugin
Hi list,

I've installed openjdk6 from ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-8-stable/java. Does this package contain a browser java plugin? I can't find it. Any tips are appreciated.

Regards,
Serguey.
Re: Fatal trap 9 triggered by zfs?
On 04.12.2009 at 17:52, Stefan Bethke wrote:
> I'm getting panics like this every so often (couple weeks, sometimes just a few days.) A second machine that has identical hardware and is running the same source has no such problems.
>
> FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec 1 14:30:54 UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT amd64
>
> # zpool status
>   pool: tank
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>           ad4s1d    ONLINE       0     0     0
>
> # cat /boot/loader.conf
> vfs.zfs.arc_max=512M
> vfs.zfs.prefetch_disable=1
> vfs.zfs.zil_disable=1

Got another, different one. Any tuning suggestions or similar?

#0 doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump () at pcpu.h:223
#1 0x80337bd9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#2 0x8033802c in panic (fmt=Variable fmt is not available.
) at /usr/src/sys/kern/kern_shutdown.c:579
#3 0x805cc2ad in trap_fatal (frame=0x9, eva=Variable eva is not available.
) at /usr/src/sys/amd64/amd64/trap.c:857
#4 0x805cce12 in trap (frame=0xff80625db030) at /usr/src/sys/amd64/amd64/trap.c:644
#5 0x805b2943 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#6 0x80586c7a in vm_map_entry_splay (addr=Variable addr is not available.
) at /usr/src/sys/vm/vm_map.c:771
#7 0x80587f37 in vm_map_lookup_entry (map=0xff0001e8, address=18446743523979624448, entry=0xff80625db170) at /usr/src/sys/vm/vm_map.c:1021
#8 0x80588aa3 in vm_map_delete (map=0xff0001e8, start=18446743523979624448, end=18446743523979689984) at /usr/src/sys/vm/vm_map.c:2685
#9 0x80588e61 in vm_map_remove (map=0xff0001e8, start=18446743523979624448, end=18446743523979689984) at /usr/src/sys/vm/vm_map.c:2774
#10 0x8057db85 in uma_large_free (slab=0xff005fcc7000) at /usr/src/sys/vm/uma_core.c:3021
#11 0x80325987 in free (addr=0xff80018b, mtp=0x80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
#12 0x80a36d03 in vdev_cache_evict (vc=0xff0001723ce0, ve=0xff003dd52200) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
#13 0x80a372ad in vdev_cache_read (zio=0xff005f5ca2d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182
#14 0x80a4a954 in zio_vdev_io_start (zio=0xff005f5ca2d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1814
#15 0x80a4ae87 in zio_execute (zio=0xff005f5ca2d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#16 0x80a3a080 in vdev_mirror_io_start (zio=0xff005f811b40) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:303
#17 0x80a4ae87 in zio_execute (zio=0xff005f811b40) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#18 0x809ff45a in arc_read_nolock (pio=0xff005f66d5a0, spa=0xff000150a000, bp=0xff800a91c440, done=0x80a02630 dbuf_read_done, private=Variable private is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2763
#19 0x809ff8ec in arc_read (pio=0xff005f66d5a0, spa=0xff000150a000, bp=0xff800a91c440, pbuf=0xff0042a3ca20, done=0x80a02630 dbuf_read_done, private=0xff005fbfc620, priority=0, zio_flags=1, arc_flags=0xff80625db5ec, zb=0xff80625db5c0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2508
#20 0x80a02aba in dbuf_read (db=0xff005fbfc620, zio=0xff005f66d5a0, flags=2) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521
#21 0x80a0602c in dmu_buf_hold (os=Variable os is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:106
#22 0x80a40db5 in zap_lockdir (os=0xff005f937610, obj=247890, tx=0x0, lti=RW_READER, fatreader=1, adding=0, zapp=0xff80625db888) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:388
#23 0x80a41724 in zap_cursor_retrieve (zc=0xff80625db880, za=0xff80625db8c0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:1004
#24 0x80a61b66 in zfs_freebsd_readdir (ap=Variable ap is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:2157
#25 0x803cfde9 in kern_getdirentries
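The start and end arguments in frames #8 and #9 are printed in decimal, which makes them hard to compare with the hexadecimal pointers elsewhere in the trace. The short C sketch below converts them and computes the size of the range being removed. The two constants are copied from the backtrace; the helper names are made up for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from frame #8 (vm_map_delete) of the backtrace. */
static const uint64_t map_start = 18446743523979624448ULL;
static const uint64_t map_end   = 18446743523979689984ULL;

/* Size in bytes of the half-open range [start, end). */
static uint64_t
range_size(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return end - start;
}

/* Print the range in hexadecimal together with its size in KB. */
static void
print_range(uint64_t start, uint64_t end)
{
    printf("start=0x%016" PRIx64 " end=0x%016" PRIx64 " size=%" PRIu64 " KB\n",
        start, end, range_size(start, end) / 1024);
}
```

Calling print_range(map_start, map_end) shows the range as 0xffffff80018b0000 to 0xffffff80018c0000, that is a 64 KB kernel virtual address range. This looks consistent with the (apparently shortened) addr=0xff80018b passed to free() in frame #11, although that reading of the shortened value is an assumption.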
Re: openjdk6 browser plugin
On Friday 04 December 2009 02:33 pm, S.N.Grigoriev wrote:
> Hi list,
>
> I've installed openjdk6 from ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-8-stable/java. Does this package contain a browser java plugin? I can't find it. Any tips are appreciated.

No, OpenJDK does not have a browser plugin. If Java plugin is all you need, you can use java/diablo-jre16.

Jung-uk Kim
Re: Fatal trap 9 triggered by zfs?
On Dec 4, 2009, at 8:56 PM, Stefan Bethke wrote:
> On 04.12.2009 at 17:52, Stefan Bethke wrote:
> > I'm getting panics like this every so often (couple weeks, sometimes just a few days.) A second machine that has identical hardware and is running the same source has no such problems.
> >
> > FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec 1 14:30:54 UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT amd64
> >
> > # zpool status
> >   pool: tank
> >  state: ONLINE
> >  scrub: none requested
> > config:
> >
> >         NAME        STATE     READ WRITE CKSUM
> >         tank        ONLINE       0     0     0
> >           ad4s1d    ONLINE       0     0     0
> >
> > # cat /boot/loader.conf
> > vfs.zfs.arc_max=512M
> > vfs.zfs.prefetch_disable=1
> > vfs.zfs.zil_disable=1
>
> Got another, different one. Any tuning suggestions or similar?
>
> #6 0x80586c7a in vm_map_entry_splay (addr=Variable addr is not available.
> ) at /usr/src/sys/vm/vm_map.c:771
> #7 0x80587f37 in vm_map_lookup_entry (map=0xff0001e8, address=18446743523979624448, entry=0xff80625db170) at /usr/src/sys/vm/vm_map.c:1021
> #8 0x80588aa3 in vm_map_delete (map=0xff0001e8, start=18446743523979624448, end=18446743523979689984) at /usr/src/sys/vm/vm_map.c:2685
> #9 0x80588e61 in vm_map_remove (map=0xff0001e8, start=18446743523979624448, end=18446743523979689984) at /usr/src/sys/vm/vm_map.c:2774
> #10 0x8057db85 in uma_large_free (slab=0xff005fcc7000) at /usr/src/sys/vm/uma_core.c:3021
> #11 0x80325987 in free (addr=0xff80018b, mtp=0x80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
> #12 0x80a36d03 in vdev_cache_evict (vc=0xff0001723ce0, ve=0xff003dd52200) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
> #13 0x80a372ad in vdev_cache_read (zio=0xff005f5ca2d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182

Bad RAM/motherboard? My first thought when I read your first mail (re: identical hardware) was bad hardware, and this seems to point towards that too, no? Have you tried memtest86+?

Regards,
Thomas
Re: Fatal trap 9 triggered by zfs?
On Fri, Dec 04, 2009 at 08:56:05PM +0100, Stefan Bethke wrote:

On 04.12.2009, at 17:52, Stefan Bethke wrote:

I'm getting panics like this every so often (couple weeks, sometimes just a few days.) A second machine that has identical hardware and is running the same source has no such problems.

FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec 1 14:30:54 UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT amd64

# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          ad4s1d    ONLINE       0     0     0

# cat /boot/loader.conf
vfs.zfs.arc_max=512M
vfs.zfs.prefetch_disable=1
vfs.zfs.zil_disable=1

Got another, different one. Any tuning suggestions or similar?

#0 doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump () at pcpu.h:223
#1 0x80337bd9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#2 0x8033802c in panic (fmt=Variable fmt is not available. ) at /usr/src/sys/kern/kern_shutdown.c:579
#3 0x805cc2ad in trap_fatal (frame=0x9, eva=Variable eva is not available. ) at /usr/src/sys/amd64/amd64/trap.c:857
#4 0x805cce12 in trap (frame=0xff80625db030) at /usr/src/sys/amd64/amd64/trap.c:644
#5 0x805b2943 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#6 0x80586c7a in vm_map_entry_splay (addr=Variable addr is not available. ) at /usr/src/sys/vm/vm_map.c:771
#7 0x80587f37 in vm_map_lookup_entry (map=0xff0001e8, address=18446743523979624448, entry=0xff80625db170) at /usr/src/sys/vm/vm_map.c:1021
#8 0x80588aa3 in vm_map_delete (map=0xff0001e8, start=18446743523979624448, end=18446743523979689984) at /usr/src/sys/vm/vm_map.c:2685
#9 0x80588e61 in vm_map_remove (map=0xff0001e8, start=18446743523979624448, end=18446743523979689984) at /usr/src/sys/vm/vm_map.c:2774
#10 0x8057db85 in uma_large_free (slab=0xff005fcc7000) at /usr/src/sys/vm/uma_core.c:3021
#11 0x80325987 in free (addr=0xff80018b, mtp=0x80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
#12 0x80a36d03 in vdev_cache_evict (vc=0xff0001723ce0, ve=0xff003dd52200) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
#13 0x80a372ad in vdev_cache_read (zio=0xff005f5ca2d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182
#14 0x80a4a954 in zio_vdev_io_start (zio=0xff005f5ca2d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1814
#15 0x80a4ae87 in zio_execute (zio=0xff005f5ca2d0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#16 0x80a3a080 in vdev_mirror_io_start (zio=0xff005f811b40) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:303
#17 0x80a4ae87 in zio_execute (zio=0xff005f811b40) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#18 0x809ff45a in arc_read_nolock (pio=0xff005f66d5a0, spa=0xff000150a000, bp=0xff800a91c440, done=0x80a02630 dbuf_read_done, private=Variable private is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2763
#19 0x809ff8ec in arc_read (pio=0xff005f66d5a0, spa=0xff000150a000, bp=0xff800a91c440, pbuf=0xff0042a3ca20, done=0x80a02630 dbuf_read_done, private=0xff005fbfc620, priority=0, zio_flags=1, arc_flags=0xff80625db5ec, zb=0xff80625db5c0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2508
#20 0x80a02aba in dbuf_read (db=0xff005fbfc620, zio=0xff005f66d5a0, flags=2) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521
#21 0x80a0602c in dmu_buf_hold (os=Variable os is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:106
#22 0x80a40db5 in zap_lockdir (os=0xff005f937610, obj=247890, tx=0x0, lti=RW_READER, fatreader=1, adding=0, zapp=0xff80625db888) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:388
#23 0x80a41724 in zap_cursor_retrieve (zc=0xff80625db880, za=0xff80625db8c0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:1004
#24 0x80a61b66 in zfs_freebsd_readdir (ap=Variable ap is not
Re: Fatal trap 9 triggered by zfs?
On 04.12.2009, at 21:20, Thomas Backman wrote:

Bad RAM/motherboard? My first thought when I read your first mail (re: identical hardware) was bad hardware, and this seems to point towards that too, no? Have you tried memtest86+?

No, I haven't yet, since I don't have physical access right now, and the box is in production service. I've shifted a couple of services to the other, identical box to see if that changes anything in the behavior. Right now it seems that heavy CPU load triggers panics, so bad RAM, CPU, chipset, mainboard, or a marginal power supply are all possibilities.

Stefan

--
Stefan Bethke s...@lassitu.de Fon +49 151 14070811
Re: Fatal trap 9 triggered by zfs?
On 04.12.2009, at 21:33, Jeremy Chadwick wrote:

You only have one disk in your pool. I'm not sure how long your system stays up before it panics, but could you try doing zpool scrub tank and let that run for a while? The first ~5 minutes may show the time to completion (from zpool status) getting worse and worse, but it should decrease/catch up. If the scrub is able to finish, look for any errors in the resulting R/W/CK fields.

Doh, should have thought of that myself. Will get started right away.

Stefan

--
Stefan Bethke s...@lassitu.de Fon +49 151 14070811
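
A minimal sketch of the check suggested above, for anyone following along. It assumes the usual `zpool status` layout with a "NAME STATE READ WRITE CKSUM" header followed by one row per device; the helper name zpool_error_rows is hypothetical and not part of ZFS. It only reads text on stdin, so it can be tried on saved output as well.

```shell
#!/bin/sh
# zpool_error_rows (hypothetical helper): read `zpool status` text on stdin
# and print "NAME READ WRITE CKSUM" for every row of the config table whose
# READ, WRITE or CKSUM counter is not "0". Prints nothing for a clean pool.
zpool_error_rows() {
    awk '
        # The header line marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }
        # A blank line ends the table (the "errors:" summary follows it).
        NF == 0 { in_cfg = 0 }
        # Inside the table, report rows with any non-zero counter.
        in_cfg && NF >= 5 && $3 ~ /^[0-9]/ &&
            ($3 != "0" || $4 != "0" || $5 != "0") { print $1, $3, $4, $5 }
    '
}
```

With the pool from this thread, that would be `zpool scrub tank`, then, once the scrub has finished, `zpool status tank | zpool_error_rows`. No output means all three counters are zero on every device.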
Re: Fatal trap 9 triggered by zfs?
On 04.12.2009, at 20:56, Stefan Bethke wrote:

On 04.12.2009, at 17:52, Stefan Bethke wrote:

I'm getting panics like this every so often (couple weeks, sometimes just a few days.) A second machine that has identical hardware and is running the same source has no such problems.

FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec 1 14:30:54 UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT amd64

# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          ad4s1d    ONLINE       0     0     0

# cat /boot/loader.conf
vfs.zfs.arc_max=512M
vfs.zfs.prefetch_disable=1
vfs.zfs.zil_disable=1

Third one. Since there's no mention of ZFS in this one, I'll start looking into potential hardware issues.

(kgdb) #0 doadump () at pcpu.h:223
#1 0x80337bd9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#2 0x8033802c in panic (fmt=Variable fmt is not available. ) at /usr/src/sys/kern/kern_shutdown.c:579
#3 0x805cc2ad in trap_fatal (frame=0x9, eva=Variable eva is not available. ) at /usr/src/sys/amd64/amd64/trap.c:857
#4 0x805cce12 in trap (frame=0xff800011ab00) at /usr/src/sys/amd64/amd64/trap.c:644
#5 0x805b2943 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#6 0x803405fc in msleep_spin (ident=0xff00015ce780, mtx=0xff00015ce7b0, wmesg=0x80638ffd -, timo=0) at /usr/src/sys/kern/kern_synch.c:312
#7 0x80373ef7 in taskqueue_thread_loop (arg=Variable arg is not available. ) at /usr/src/sys/kern/subr_taskqueue.c:89
#8 0x8030e7d8 in fork_exit ( callout=0x80373e90 taskqueue_thread_loop, arg=0xff80002e4768, frame=0xff800011ac80) at /usr/src/sys/kern/kern_fork.c:843
#9 0x805b2e1e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:561

Stefan

--
Stefan Bethke s...@lassitu.de Fon +49 151 14070811
Re: Quggaa locking hard.
If you have a large number of routes then you will want to disable the flowtable. The default maximum number of cacheable flows is fairly small; raising it can help on the low end, but fundamentally it's an optimization for systems that have fewer than a few thousand simultaneous peers - the common case. I do have longer-term plans for moving to lock-free L3 and L2 so that applications with large numbers of prefixes will also no longer be hampered by high locking overhead.

-Kip

On Fri, Dec 4, 2009 at 6:56 AM, Mike Tancsa m...@sentex.net wrote:

At 10:46 PM 12/3/2009, Zaphod Beeblebrox wrote:

I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0 and not locking hard on 7.2. It seems (at this early point in the investigation) that both bgpd and zebra are wedging and zebra is listed as being in the RUN state. Curiously, the load is also 4.0 (exactly the number of cores in the machine) even though the machine also reads 100% idle.

I think I am seeing something similar on a test box. I was loading up the box with 200k routes to do testing with. Kernel is default, save for a few unused drivers removed. If I take out

options FLOWTABLE # per-cpu routing cache

from the kernel, load avg is back to normal. This issue only seems to have come up in the past week or so as the previous kernel from ~8 days ago was OK.
last pid: 6229; load averages: 2.00, 2.00, 2.00 up 1+17:33:02 09:39:31
141 processes: 7 running, 106 sleeping, 28 waiting
CPU: 0.0% user, 0.0% nice, 22.2% system, 0.0% interrupt, 77.8% idle
Mem: 98M Active, 2233M Inact, 187M Wired, 36K Cache, 112M Buf, 979M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   22 root       76    -     0K     8K CPU3    3  41.5H 100.00% flowcleaner
   11 root      171 ki31     0K    32K CPU2    2  41.5H 100.00% {idle: cpu2}
   11 root      171 ki31     0K    32K CPU1    1  41.5H 100.00% {idle: cpu1}
   11 root      171 ki31     0K    32K RUN     0  41.4H 100.00% {idle: cpu0}
  869 root        4    0 64860K 64488K select  0   4:12   0.00% bgpd
   11 root      171 ki31     0K    32K RUN     3   2:09   0.00% {idle: cpu3}
   20 root       44    -     0K     8K syncer  0   1:00   0.00% syncer
   12 root      -32    -     0K   224K WAIT    1   0:47   0.00% {swi4: clock}
    0 root      -68    0     0K    80K -       2   0:03   0.00% {fw0_taskq}
 1230 root       76    0  3348K  1160K ttyin   2   0:02   0.00% getty
  863 root       96    0 24640K 24232K RUN     2   0:02   0.00% zebra
   12 root      -32    -     0K   224K WAIT    2   0:01   0.00% {swi4: clock}
   14 root      -16    -     0K     8K -       0   0:01   0.00% yarrow

Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike
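
For reference, a minimal sketch of making the flowtable workaround from this thread persistent. The sysctl name net.inet.flowtable.enable is the one reported earlier in the thread; the helper name set_sysctl_conf and its optional file argument are hypothetical, added only so the edit can be tried on a scratch file first. Note that the earlier report found that changing the sysctl on a running system did not ease the CPU load, while having it in /etc/sysctl.conf and rebooting did.

```shell
#!/bin/sh
# set_sysctl_conf KEY VALUE [FILE]  (hypothetical helper)
# Ensure FILE (default /etc/sysctl.conf) ends up with exactly one
# "KEY=VALUE" line, replacing any earlier assignment of KEY.
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Copy every existing line except assignments of KEY.
    if [ -f "$file" ]; then
        awk -F= -v k="$key" '$1 != k' "$file" > "$tmp"
    fi
    # Append the new assignment.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Rewrite in place so the file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

On an affected router this would be run as root as `set_sysctl_conf net.inet.flowtable.enable 0`, followed by a reboot. Calling it twice leaves a single line in the file.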
Re: loader(8) readin failed on 7.2R and later including 8.0R
On Friday 04 December 2009 10:35:59 am John Baldwin wrote:

On Thursday 03 December 2009 4:20:08 pm Hiroki Sato wrote:

John Baldwin j...@freebsd.org wrote in 200912030803.29797@freebsd.org:

On Thursday 03 December 2009 5:29:13 am Hiroki Sato wrote:

John Baldwin j...@freebsd.org wrote in 200912020948.05698@freebsd.org:

On Tuesday 01 December 2009 12:13:39 pm Hiroki Sato wrote:

While the load command seemed to finish, the box got stuck just after entering boot command.

Curious to say, I have got this symptom only on a specific box in more than ten different boxes I upgraded so far; it is based on an old motherboard Supermicro P4DPE[*].

[*] http://www.supermicro.com/products/motherboard/Xeon/E7500/P4DPE.cfm

Any workaround? Booting from release CDROMs (7.2R and 8.0R) also fail. On the box 7.1R or 7.1R's loader + 7.2R kernel worked fine. It is possible something in changes of loader(8) between 7.1R and 7.2R is the cause, but I am still not sure what it is...

It may be related to the loader switching to using memory > 1MB for its malloc(). Maybe try building the loader with 'LOADER_NO_GPT_SUPPORT=yes' in /etc/src.conf?

Thanks, a recompiled loader with 'LOADER_NO_GPT_SUPPORT=yes' displayed "elf32_loadimage: could not read symbols - skipped!" for 8.0R kernel. This is the same as 7.1R's loader + 8.0R kernel case.

Can you get the output of 'smap' from the loader? Is the 8.0 kernel bigger than the 7.x kernel? If so, can you try trimming the 8.0 kernel a bit to see if that changes things?

Sure.
Output of smap on an 8.0R loader with LOADER_NO_GPT_SUPPORT=yes was:

| OK smap
| SMAP type=01 base=0000000000000000 len=000000000009f400
| SMAP type=02 base=000000000009f400 len=0000000000000c00
| SMAP type=02 base=00000000000dc000 len=0000000000024000
| SMAP type=01 base=0000000000100000 len=0000000000e00000

So this is the region that ends up getting used for malloc:

    /* look for the first segment in 'extended' memory */
    if ((smap.type == SMAP_TYPE_MEMORY) && (smap.base == 0x100000)) {
        bios_extmem = smap.length;
    ...
    /* Set memtop to actual top of memory */
    memtop = memtop_copyin = 0x100000 + bios_extmem;

and then later:

    #if defined(LOADER_BZIP2_SUPPORT) || defined(LOADER_FIREWIRE_SUPPORT) || defined(LOADER_GPT_SUPPORT) || defined(LOADER_ZFS_SUPPORT)
        heap_top = PTOV(memtop_copyin);
        memtop_copyin -= 0x300000;
        heap_bottom = PTOV(memtop_copyin);
    #else

So memtop_copyin would start off as 0xf00000 (15MB) but would end up as 0xc00000 (12MB), and since the kernel starts at 4MB, I think that only leaves about 8MB for the kernel. Probably the loader needs to be more intelligent about using high memory for malloc by using the largest region > 1MB but < 4GB for malloc() instead of stealing memory from bios_extmem in the SMAP case.

Try the attached patch which tries to make the loader use better smarts when picking a memory region for the heap (warning, I haven't tested it myself yet).

Use the updated patch (actually tested in qemu) instead.

--
John Baldwin

--- //depot/vendor/freebsd/src/sys/boot/i386/libi386/biosmem.c 2007/10/28 21:26:35
+++ //depot/user/jhb/boot/sys/boot/i386/libi386/biosmem.c 2009/12/04 22:20:17
@@ -35,14 +35,20 @@
 #include "libi386.h"
 #include "btxv86.h"
 
-vm_offset_t memtop, memtop_copyin;
-u_int32_t bios_basemem, bios_extmem;
+vm_offset_t memtop, memtop_copyin, high_heap_base;
+uint32_t bios_basemem, bios_extmem, high_heap_size;
 
 static struct bios_smap smap;
 
+/*
+ * The minimum amount of memory to reserve in bios_extmem for the heap.
+ */
+#define HEAP_MIN (3 * 1024 * 1024)
+
 void
 bios_getmem(void)
 {
+    uint64_t size;
 
     /* Parse system memory map */
     v86.ebx = 0;
@@ -65,6 +71,26 @@
         if ((smap.type == SMAP_TYPE_MEMORY) && (smap.base == 0x100000)) {
             bios_extmem = smap.length;
         }
+
+        /*
+         * Look for the largest segment in 'extended' memory beyond
+         * 1MB but below 4GB.
+         */
+        if ((smap.type == SMAP_TYPE_MEMORY) && (smap.base > 0x100000) &&
+            (smap.base < 0x100000000ull)) {
+            size = smap.length;
+
+            /*
+             * If this segment crosses the 4GB boundary, truncate it.
+             */
+            if (smap.base + size > 0x100000000ull)
+                size = 0x100000000ull - smap.base;
+
+            if (size > high_heap_size) {
+                high_heap_size = size;
+                high_heap_base = smap.base;
+            }
+        }
     } while (v86.ebx != 0);
 
     /* Fall back to the old compatibility function for base memory */
@@ -97,5 +123,13 @@
     /* Set memtop to actual top of memory */
     memtop = memtop_copyin = 0x100000 + bios_extmem;
 
+    /*
+     * If we have extended memory and did not find a suitable heap
+     * region in the SMAP, use the