Re: Quggaa locking hard.

2009-12-04 Thread alexpalias-bsdstable
I have also seen this with a recent version of FreeBSD 8 (I know 8.0-BETA2 
didn't have this problem, also I have an 8.0-RC1 without problems,  but I think 
RC3 did have it, and I'm sure -RELEASE has it).

A few more details:  

It happened both on amd64 and i386.  I couldn't debug amd64 (it was a live 
server and we couldn't afford it), but on i386 flowcleaner was using a LOT of 
CPU.  

It seemed to happen after booting, when quagga was importing global routing 
tables (~300k routes) from 2 BGP sessions.  At least one of the sessions seemed 
to finish importing routes, but the kernel routing table seemed to be growing 
very slowly.  

Doing netstat -nr | wc -l took way longer than usual (20-30 seconds versus 9 
seconds now), and it only reported about 100k routes.  Doing it again after a 
minute or so showed the number of routes grew by around 10k.
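
For a rough sense of scale, the figures above can be turned into an estimate. This is only an illustrative sketch: it assumes the growth stays near 10k routes per minute and that the full table is about 300k routes, and the helper name minutes_to_fill is made up for this example.

```c
#include <assert.h>

/*
 * Hypothetical helper: minutes until the kernel table holds `target`
 * routes, given the current count and an observed growth rate in routes
 * per minute.  Rounds up; returns 0 if the table is already full.
 */
unsigned long
minutes_to_fill(unsigned long current, unsigned long target,
    unsigned long per_minute)
{
	assert(per_minute > 0);		/* a stalled table never fills */
	if (current >= target)
		return (0);
	return ((target - current + per_minute - 1) / per_minute);
}
```

With about 100k routes installed, 300k expected and 10k added per minute, minutes_to_fill(100000, 300000, 10000) comes out at 20, so the import would have needed roughly 20 more minutes at that pace.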

During this time, both quagga and zebra were very slow to respond to a new 
telnet session opened to them.

As a workaround, I did sysctl net.inet.flowtable.enable=0.  This didn't ease 
the load on the CPU, but having it in /etc/sysctl.conf and rebooting did help 
(quagga started up normally and all routes are where they should be).

Hope this helps
Alex

--- On Fri, 12/4/09, Zaphod Beeblebrox zbee...@gmail.com wrote:

> From: Zaphod Beeblebrox zbee...@gmail.com
> Subject: Quggaa locking hard.
> To: FreeBSD Stable freebsd-stable@freebsd.org
> Date: Friday, December 4, 2009, 5:46 AM
>
> I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0
> and not locking hard on 7.2.  It seems (at this early point in the
> investigation) that both bgpd and zebra are wedging and zebra is listed as
> being in the RUN state.
>
> Curiously, the load is also 4.0 (exactly the number of cores in the machine)
> even though the machine also reads 100% idle.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Quggaa locking hard.

2009-12-04 Thread Mike Tancsa

At 10:46 PM 12/3/2009, Zaphod Beeblebrox wrote:

> I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0
> and not locking hard on 7.2.  It seems (at this early point in the
> investigation) that both bgpd and zebra are wedging and zebra is listed as
> being in the RUN state.
>
> curiously, the load is also 4.0 (exactly the number of cores in the machine)
> even though the machine also reads 100% idle.



I think I am seeing something similar on a test box.  I was loading 
up the box with 200k routes to do testing with.  Kernel is default, 
save for a few unused drivers removed. If I take out

options FLOWTABLE   # per-cpu routing cache

from the kernel, load avg is back to normal.  This issue only seems 
to have come up in the past week or so as the previous kernel from ~8 
days ago was OK.


last pid:  6229;  load averages:  2.00,  2.00,  2.00    up 1+17:33:02  09:39:31

141 processes: 7 running, 106 sleeping, 28 waiting
CPU:  0.0% user,  0.0% nice, 22.2% system,  0.0% interrupt, 77.8% idle
Mem: 98M Active, 2233M Inact, 187M Wired, 36K Cache, 112M Buf, 979M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   22 root       76    -     0K     8K CPU3    3  41.5H 100.00% flowcleaner
   11 root      171 ki31     0K    32K CPU2    2  41.5H 100.00% {idle: cpu2}
   11 root      171 ki31     0K    32K CPU1    1  41.5H 100.00% {idle: cpu1}
   11 root      171 ki31     0K    32K RUN     0  41.4H 100.00% {idle: cpu0}
  869 root        4    0 64860K 64488K select  0   4:12   0.00% bgpd
   11 root      171 ki31     0K    32K RUN     3   2:09   0.00% {idle: cpu3}
   20 root       44    -     0K     8K syncer  0   1:00   0.00% syncer
   12 root      -32    -     0K   224K WAIT    1   0:47   0.00% {swi4: clock}
    0 root      -68    0     0K    80K -       2   0:03   0.00% {fw0_taskq}
 1230 root       76    0  3348K  1160K ttyin   2   0:02   0.00% getty
  863 root       96    0 24640K 24232K RUN     2   0:02   0.00% zebra
   12 root      -32    -     0K   224K WAIT    2   0:01   0.00% {swi4: clock}
   14 root      -16    -     0K     8K -       0   0:01   0.00% yarrow





Mike Tancsa,  tel +1 519 651 3400
Sentex Communications,m...@sentex.net
Providing Internet since 1994www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike



Re: loader(8) readin failed on 7.2R and later including 8.0R

2009-12-04 Thread John Baldwin
On Thursday 03 December 2009 4:20:08 pm Hiroki Sato wrote:
> John Baldwin j...@freebsd.org wrote
>   in 200912030803.29797@freebsd.org:
>
> jh> On Thursday 03 December 2009 5:29:13 am Hiroki Sato wrote:
> jh> > John Baldwin j...@freebsd.org wrote
> jh> >   in 200912020948.05698@freebsd.org:
> jh> >
> jh> > jh> On Tuesday 01 December 2009 12:13:39 pm Hiroki Sato wrote:
> jh> > jh> >  While the load command seemed to finish, the box got stuck just
> jh> > jh> >  after entering boot command.
> jh> > jh> >
> jh> > jh> >  Curious to say, I have got this symptom only on a specific box in
> jh> > jh> >  more than ten different boxes I upgraded so far; it is based on an
> jh> > jh> >  old motherboard Supermicro P4DPE[*].
> jh> > jh> >
> jh> > jh> >  [*] http://www.supermicro.com/products/motherboard/Xeon/E7500/P4DPE.cfm
> jh> > jh> >
> jh> > jh> >  Any workaround?  Booting from release CDROMs (7.2R and 8.0R) also
> jh> > jh> >  fail.  On the box 7.1R or 7.1R's loader + 7.2R kernel worked
> jh> > jh> >  fine.  It is possible something in changes of loader(8) between 7.1R
> jh> > jh> >  and 7.2R is the cause, but I am still not sure what it is...
> jh> > jh>
> jh> > jh> It may be related to the loader switching to using memory > 1MB for its
> jh> > jh> malloc().  Maybe try building the loader with 'LOADER_NO_GPT_SUPPORT=yes' in
> jh> > jh> /etc/src.conf?
> jh> >
> jh> >  Thanks, a recompiled loader with 'LOADER_NO_GPT_SUPPORT=yes' displayed
> jh> >  "elf32_loadimage: could not read symbols - skipped!" for 8.0R kernel.
> jh> >  This is the same as 7.1R's loader + 8.0R kernel case.
> jh>
> jh> Can you get the output of 'smap' from the loader?  Is the 8.0 kernel bigger
> jh> than the 7.x kernel?  If so, can you try trimming the 8.0 kernel a bit to see
> jh> if that changes things?
>
>  Sure.  Output of smap on an 8.0R loader with LOADER_NO_GPT_SUPPORT=yes
>  was:
>
> | OK smap
> | SMAP type=01 base= len=0009f400
> | SMAP type=02 base=0009f400 len=0c00
> | SMAP type=02 base=000dc000 len=00024000
> | SMAP type=01 base=0010 len=00e0
So this is the region that ends up getting used for malloc:

	/* look for the first segment in 'extended' memory */
	if ((smap.type == SMAP_TYPE_MEMORY) && (smap.base == 0x100000)) {
	    bios_extmem = smap.length;

...

    /* Set memtop to actual top of memory */
    memtop = memtop_copyin = 0x100000 + bios_extmem;


and then later:

#if defined(LOADER_BZIP2_SUPPORT) || defined(LOADER_FIREWIRE_SUPPORT) || defined(LOADER_GPT_SUPPORT) || defined(LOADER_ZFS_SUPPORT)
    heap_top = PTOV(memtop_copyin);
    memtop_copyin -= 0x300000;
    heap_bottom = PTOV(memtop_copyin);
#else

So memtop_copyin would start off as 0xf00000 but would end up as 0xc00000,
and since the kernel starts at 4MB, I think that only leaves about 8MB for
the kernel.  Probably the loader needs to be more intelligent about using
high memory for malloc by using the largest region > 1MB but < 4GB for
malloc() instead of stealing memory from bios_extmem in the SMAP case.
Try the attached patch, which tries to make the loader smarter about
picking a memory region for the heap (warning, I haven't tested it
myself yet).
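
The arithmetic behind the 8MB estimate can be checked with a small sketch. It assumes the SMAP line at 1MB describes a 14MB segment (base 0x100000, length 0xe00000) and that the heap takes 3MB (0x300000) from the top. The listing in the archive is abbreviated, so these readings are assumptions, and the macro and function names below are made up for illustration; this is not the loader's code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, read from the abbreviated figures in this thread. */
#define EXTMEM_BASE	UINT32_C(0x100000)	/* extended memory starts at 1MB */
#define EXTMEM_LEN	UINT32_C(0xe00000)	/* assumed length of that segment */
#define HEAP_SIZE	UINT32_C(0x300000)	/* taken from the top for malloc() */
#define KERNEL_LOAD	UINT32_C(0x400000)	/* "the kernel starts at 4MB" */

/* Top of the first extended-memory segment, before the heap is carved out. */
uint32_t
memtop_before_heap(uint32_t extmem_len)
{
	return (EXTMEM_BASE + extmem_len);
}

/* Bytes left between the kernel load address and the bottom of the heap. */
uint32_t
room_for_kernel(uint32_t extmem_len)
{
	uint32_t memtop_copyin = memtop_before_heap(extmem_len);

	assert(memtop_copyin >= HEAP_SIZE + KERNEL_LOAD);
	memtop_copyin -= HEAP_SIZE;		/* this becomes heap_bottom */
	return (memtop_copyin - KERNEL_LOAD);
}
```

Under those assumptions memtop_before_heap(EXTMEM_LEN) is 0xf00000 (15MB) and room_for_kernel(EXTMEM_LEN) is 0x800000, which matches the "about 8MB" figure.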

> | SMAP type=02 base=00f0 len=0010
> | SMAP type=01 base=0100 len=beef
> | SMAP type=03 base=bfef len=c000
> | SMAP type=04 base=bfefc000 len=4000
> | SMAP type=01 base=bff0 len=0008
> | SMAP type=02 base=bff8 len=0008
> | SMAP type=02 base=fec0 len=0001
> | SMAP type=02 base=fee0 len=1000
> | SMAP type=02 base=ff80 len=0040
> | SMAP type=02 base=fff0 len=0010
> | OK
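
The idea of using the largest usable region above 1MB but below 4GB for the heap could be sketched roughly as follows. This is not the attached patch: struct smap_entry and pick_heap_region are hypothetical names for this example, and the struct is a simplified stand-in for whatever the loader actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMAP_TYPE_MEMORY 1			/* usable RAM, "type=01" above */
#define ONE_MB	UINT64_C(0x100000)
#define FOUR_GB	UINT64_C(0x100000000)

/* Hypothetical, simplified SMAP entry; not the loader's own struct. */
struct smap_entry {
	uint64_t base;
	uint64_t length;
	uint32_t type;
};

/*
 * Return the index of the largest usable region that starts at or above
 * 1MB and ends at or below 4GB, or -1 if no region qualifies.
 */
int
pick_heap_region(const struct smap_entry *map, size_t n)
{
	uint64_t best_len = 0;
	int best = -1;

	assert(map != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (map[i].type != SMAP_TYPE_MEMORY)
			continue;
		if (map[i].base < ONE_MB || map[i].base >= FOUR_GB)
			continue;
		if (map[i].length > FOUR_GB - map[i].base)
			continue;		/* would cross the 4GB line */
		if (map[i].length > best_len) {
			best_len = map[i].length;
			best = (int)i;
		}
	}
	return (best);
}
```

A real implementation would also have to keep the heap clear of the area the kernel and modules are copied into, which this sketch does not attempt.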
 
>  Size difference between the two kernels was:
>
> | -r-xr-xr-x  1 root  wheel   9708240 Dec  1 16:22 kernel.7/kernel
> | -r-xr-xr-x  1 root  wheel  11492703 Nov 21 15:48 kernel.8/kernel
>
>  Then I rebuilt a smaller 8.0 kernel by removing some entries from the
>  kernel configuration file.  The size is now smaller than 7.1R kernel:
>
> | -r-xr-xr-x  1 root  wheel  7710491 Dec  3 21:10 /boot/kernel.8X/kernel
>
>  Loading the new kernel seemed to work fine with the recompiled 8.0R
>  loader, but it got stuck just after entering boot:
>
> | OK load /boot/kernel.8X/kernel
> | /boot/kernel.8X/kernel text=0x5a7664 data=0x88d74+0x82f04 syms=[0x4+0x6d290+0x4+0x987e3]
> | OK boot
> | /
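
As a side calculation, the sizes printed by the load command can be added up and compared with the roughly 8MB estimated earlier in this message. The sketch below only sums the numbers shown; it assumes they are printed in full, that the 8MB estimate applies to this box, and it ignores modules and metadata the loader may place in the same area. The function name loaded_size is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the segment sizes the loader prints for a kernel image. */
uint32_t
loaded_size(const uint32_t *parts, size_t n)
{
	uint32_t total = 0;

	assert(parts != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += parts[i];
	return (total);
}
```

Fed with text=0x5a7664, data=0x88d74+0x82f04 and syms=[0x4+0x6d290+0x4+0x987e3], the total is 8097111 bytes, about 7.7MB, so by this rough measure the trimmed kernel would just fit below 8MB.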

I'm not sure why it would get stuck.  Can you add some debug printfs to see
how far it gets before it dies?  E.g. does it get to the point of calling
exec() (in which case the hang is in the kernel in locore.S rather than in
the loader).

-- 
John Baldwin
--- //depot/vendor/freebsd/src/sys/boot/i386/libi386/biosmem.c	2007/10/28 21:26:35
+++ //depot/user/jhb/boot/sys/boot/i386/libi386/biosmem.c	2009/12/04 15:33:59
@@ -35,14 +35,27 @@
 #include "libi386.h"
 #include "btxv86.h"
 
-vm_offset_t	memtop, memtop_copyin;
-u_int32_t	bios_basemem, bios_extmem;

Re: Could you please fix this ?

2009-12-04 Thread Jung-uk Kim
On Thursday 03 December 2009 09:27 pm, Leonardo Santagostini wrote:
> Sorry, but I have not backed up this file; instead, I will
> copy the entire function (in fact it's very short)
>
> acpi_pcib_pci_attach(device_t dev)
> {
>     struct acpi_pcib_softc *sc;
>     ACPI_FUNCTION_TRACE((char *)(uintptr_t)__func__);
>
>     if (device_get_unit(dev) == 2) {
>         pci_write_config(dev, PCIR_COMMAND,
>             PCIM_CMD_MEMEN | PCIM_CMD_PORTEN, 1);
>         pci_enable_busmaster(dev);
>         pci_write_config(dev, PCIR_IOBASEL_1, 0xf0, 1);
>         pci_write_config(dev, PCIR_MEMBASE_1, 0xf020, 2);
>         pci_write_config(dev, PCIR_MEMLIMIT_1, 0xf020, 2);
>         pci_write_config(dev, PCIR_PMBASEL_1, 0xfff1, 2);
>     }
>     if (device_get_unit(dev) == 3) {
>         pci_write_config(dev, PCIR_COMMAND,
>             PCIM_CMD_MEMEN | PCIM_CMD_PORTEN, 1);
>         pci_enable_busmaster(dev);
>         pci_write_config(dev, PCIR_IOBASEL_1, 0xf0, 1);
>         pci_write_config(dev, PCIR_MEMBASE_1, 0xf030, 2);
>         pci_write_config(dev, PCIR_MEMLIMIT_1, 0xf030, 2);
>         pci_write_config(dev, PCIR_PMBASEL_1, 0xfff1, 2);
>     }
>
>     pcib_attach_common(dev);
>     sc = device_get_softc(dev);
>     sc->ap_handle = acpi_get_handle(dev);
>     return (acpi_pcib_attach(dev, &sc->ap_prt,
>         sc->ap_pcibsc.secbus));
> }
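
For readers wondering what the memory base and limit values in the quoted function mean: on a PCI-to-PCI bridge, the 16-bit Memory Base and Memory Limit registers carry address bits 31:20 in their upper 12 bits, so each pair describes a 1MB-aligned window. The sketch below decodes such values. It assumes the numbers quoted above are the complete 16-bit register values, and the function names are made up for this example; they are not FreeBSD kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* First address of the window encoded by a bridge Memory Base register. */
uint32_t
bridge_mem_base(uint16_t reg)
{
	return ((uint32_t)(reg & 0xfff0) << 16);
}

/* Last address of the window encoded by a bridge Memory Limit register. */
uint32_t
bridge_mem_limit(uint16_t reg)
{
	return (((uint32_t)(reg & 0xfff0) << 16) | 0xfffff);
}

/* Size in bytes of the window described by a base/limit register pair. */
uint32_t
bridge_mem_window_size(uint16_t base_reg, uint16_t limit_reg)
{
	uint32_t base = bridge_mem_base(base_reg);
	uint32_t limit = bridge_mem_limit(limit_reg);

	assert(limit >= base);	/* a limit below the base means no window */
	return (limit - base + 1);
}
```

Read this way, the hack gives unit 2 the 1MB window 0xf0200000-0xf02fffff and unit 3 the window 0xf0300000-0xf03fffff, which illustrates why it is tied to this particular machine's address layout.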

As mav@ pointed out yesterday, this hack is very specific to this 
hardware.  As jhb@ pointed out some time ago, this problem will be 
properly addressed by his multipass device probing mechanism.

Sorry, there's nothing we can commit ATM.

Jung-uk Kim

> Kind Regards
> Leonardo Santagostini
>
> 2009/12/3 Giorgos Keramidas keram...@freebsd.org:
> > On Thu, 3 Dec 2009 01:57:50 +, Leonardo Santagostini lsantagost...@gmail.com wrote:
> > > Hello everybody,
> > >
> > > I was facing one big problem. I have a notebook, an
> > > Acer Aspire 5920.  If you like, I can send you my messages
> > > file.
> > >
> > > Which is:
> > >
> > > Intel(R) Core(TM)2 Duo CPU     T5550  @ 1.83GHz (1833.48-MHz 686-class CPU)
> > > Intel(R) PRO/Wireless 3945ABG
> > > Broadcom NetLink Gigabit Ethernet Controller
> > > 2 Gigs RAM
> > > 160 Gigs SATA
> > >
> > > The point was:
> > > With ACPI disabled, I managed to boot but without WIFI; and with
> > > ACPI enabled, the boot process hung every time.
> > >
> > > I fixed this by adding
> > >
> > >     if (device_get_unit(dev)==2){
> > >         pci_write_config(dev, PCIR_COMMAND, PCIM_CMD_MEMEN |
> > >             PCIM_CMD_PORTEN, 1);
> > >         pci_enable_busmaster(dev);
> > >         pci_write_config(dev, PCIR_IOBASEL_1, 0xf0, 1);
> > >         pci_write_config(dev, PCIR_MEMBASE_1, 0xf020, 2);
> > >         pci_write_config(dev, PCIR_MEMLIMIT_1, 0xf020, 2);
> > >         pci_write_config(dev, PCIR_PMBASEL_1, 0xfff1, 2);
> > >     }
> > >     if (device_get_unit(dev)==3){
> > >         pci_write_config(dev, PCIR_COMMAND, PCIM_CMD_MEMEN |
> > >             PCIM_CMD_PORTEN, 1);
> > >         pci_enable_busmaster(dev);
> > >         pci_write_config(dev, PCIR_IOBASEL_1, 0xf0, 1);
> > >         pci_write_config(dev, PCIR_MEMBASE_1, 0xf030, 2);
> > >         pci_write_config(dev, PCIR_MEMLIMIT_1, 0xf030, 2);
> > >         pci_write_config(dev, PCIR_PMBASEL_1, 0xfff1, 2);
> > >     }
> > >
> > > to /usr/src/sys/dev/acpica/acpi_pcib_pci.c running on
> > > 8.0-RELEASE.
> > >
> > > I was able to fix it my way, but many people can't, so I
> > > would really appreciate it if you could add this piece of code.
> >
> > Hi Leonardo.
> >
> > Jung-uk Kim has done a lot of ACPI-related work, so he will
> > probably know if the change is ok to commit to stable/8.  I've
> > added him to the thread, so he can let us know what he thinks of
> > the change.  Can you please post a diff that also shows _where_
> > the changes have to be installed in our current version of
> > src/sys/dev/acpica/acpi_pcib_pci.c for 8.0-RELEASE?


Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke
I'm getting panics like this every so often (couple weeks, sometimes just a few 
days.) A second machine that has identical hardware and is running the same 
source has no such problems.

FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54 UTC 
2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64

# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	tank        ONLINE       0     0     0
	  ad4s1d    ONLINE       0     0     0
# cat /boot/loader.conf
vfs.zfs.arc_max=512M
vfs.zfs.prefetch_disable=1
vfs.zfs.zil_disable=1

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0x80a39900
stack pointer           = 0x28:0xff80622ddae0
frame pointer           = 0x28:0xff80622ddb10
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (spa_zio)
trap number             = 9
panic: general protection fault
cpuid = 0
Uptime: 17h44m5s
Physical memory: 3313 MB
Dumping 1843 MB: 1828 1812 1796 1780 1764 1748 1732 1716 1700 1684 1668 1652 
1636 1620 1604 1588 1572 1556 1540 1524 1508 1492 1476 1460 1444 1428 1412 1396 
1380 1364 1348 1332 1316 1300 1284 1268 1252 1236 1220 1204 1188 1172 1156 1140 
1124 1108 1092 1076 1060 1044 1028 1012 996 980 964 948 932 916 900 884 868 852 
836 820 804 788 772 756 740 724 708 692 676 660 644 628 612 596 580 564 548 532 
516 500 484 468 452 436 420 404 388 372 356 340 324 308 292 276 260 244 228 212 
196 180 164 148 132 116 100 84 68 52 36 20 4

#0  doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0x803374b9 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x8033790c in panic (fmt=Variable fmt is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:579
#3  0x805cbb8d in trap_fatal (frame=0x9, eva=Variable eva is not 
available.
)
at /usr/src/sys/amd64/amd64/trap.c:857
#4  0x805cc6f2 in trap (frame=0xff80622dda30)
at /usr/src/sys/amd64/amd64/trap.c:644
#5  0x805b2223 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:224
#6  0x80a39900 in vdev_queue_agg_io_done (aio=0xff00374562d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c:174
#7  0x80a4be6f in zio_done (zio=0xff00374562d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:2243
#8  0x80a49e87 in zio_execute (zio=0xff00374562d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#9  0x809ed603 in taskq_run (arg=0xff008d8d0420, pending=Variable 
pending is not available.
)
at 
/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris/kern/opensolaris_taskq.c:108
#10 0x80373533 in taskqueue_run (queue=0xff00017e1400)
at /usr/src/sys/kern/subr_taskqueue.c:239
#11 0x803737b6 in taskqueue_thread_loop (arg=Variable arg is not 
available.
)
at /usr/src/sys/kern/subr_taskqueue.c:360
#12 0x8030e0b8 in fork_exit (
callout=0x80373770 taskqueue_thread_loop, 
arg=0xff00016434e0, frame=0xff80622ddc80)
at /usr/src/sys/kern/kern_fork.c:843
#13 0x805b26fe in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:561
#14 0x in ?? ()
#15 0x in ?? ()
#16 0x in ?? ()
#17 0x in ?? ()
#18 0x in ?? ()
#19 0x in ?? ()
#20 0x in ?? ()
#21 0x in ?? ()
#22 0x in ?? ()
#23 0x in ?? ()
#24 0x in ?? ()
#25 0x in ?? ()
#26 0x in ?? ()
#27 0x in ?? ()
#28 0x in ?? ()
#29 0x in ?? ()
#30 0x in ?? ()
#31 0x in ?? ()
#32 0x in ?? ()
#33 0x in ?? ()
#34 0x in ?? ()
#35 0x in ?? ()
#36 0x in ?? ()
#37 0x in ?? ()
#38 0x00c6c000 in ?? ()
#39 0x in ?? ()
#40 0x000b in ?? ()
#41 0x80832500 in affinity ()
#42 0xff000173c390 in ?? ()
#43 0xff80622dd240 in ?? ()
#44 0xff80622dd1f8 in ?? ()
#45 0xff00015ecab0 in ?? ()
#46 0x8035aa48 in sched_switch (td=0x80373770, 
newtd=0xff00016434e0, flags=Variable flags is not available.
) at /usr/src/sys/kern/sched_ule.c:1858
Previous frame inner to this frame (corrupt stack?)
(kgdb) 

-- 
Stefan Bethke s...@lassitu.de   Fon +49 151 14070811





Re: Could you please fix this ?

2009-12-04 Thread Leonardo Santagostini
Ok, anyway thanks for your time.

Best Regards
Leonardo Santagostini





Re: vge problem

2009-12-04 Thread Pyun YongHyeon
On Fri, Dec 04, 2009 at 02:47:35PM +0900, Yoshiaki Kasahara wrote:
> On Thu, 3 Dec 2009 15:08:10 -0800,
>   Pyun YongHyeon pyu...@gmail.com said:
>
> > I remember there were several instability reports of vge(4). Would
> > you try the following patch? The patch was generated against
> > CURRENT so it may not cleanly apply to 8.0 due to if_timer changes.
> > But I think you can download latest vge(4) code in CURRENT and
> > apply the patch. Note, the patch was not tested at all on real
> > hardware so it may not work at all. (A long time ago I ordered
> > vge(4) hardware, but it was never delivered.)
>
> I downloaded vge(4) code in CURRENT, put it in 8.0R source tree,
> applied the patch, and rebuilt the GENERIC kernel (actually I shortcut it
> with NO_CLEAN flag).
>
> After I rebooted with the new kernel, the boot sequence stopped just
> after setting hostname.
>
> Setting hostname: elvenbow.cc.kyushu-u.ac.jp
> msk0: Uncorrectable PCI Express error
> vge0: link state changed to DOWN
> msk0: link state changed to DOWN
> (stop)
>
> The system didn't completely freeze. I can push Scroll Lock and
> Page Up/Down to browse the boot messages, but sometimes it stopped
> responding to my input for a second. Ctrl-C had no effect and I had to
> hit the reset button. Now my PC is synchronizing degraded gmirror
> volume...(ouch)
>

I'm not sure why it would get stuck. You touched only vge(4),
right?

> I'm wondering if I need complete kernel rebuild for the code to work...
>
> Regards,
> -- 
> Yoshiaki Kasahara
> Research Institute for Information Technology, Kyushu University
> kasah...@nc.kyushu-u.ac.jp


Re: vge problem

2009-12-04 Thread Yoshiaki Kasahara
On Fri, 4 Dec 2009 09:36:01 -0800,
Pyun YongHyeon pyu...@gmail.com said:

> > After I rebooted with the new kernel, the boot sequence stopped just
> > after setting hostname.
> >
> > Setting hostname: elvenbow.cc.kyushu-u.ac.jp
> > msk0: Uncorrectable PCI Express error
> > vge0: link state changed to DOWN
> > msk0: link state changed to DOWN
> > (stop)
> >
> > The system didn't completely freeze. I can push Scroll Lock and
> > Page Up/Down to browse the boot messages, but sometimes it stopped
> > responding to my input for a second. Ctrl-C had no effect and I had to
> > hit the reset button. Now my PC is synchronizing degraded gmirror
> > volume...(ouch)
>
> I'm not sure why it would get stuck. You touched only vge(4),
> right?

Yes, I only touched vge(4). I believe that "msk0: Uncorrectable PCI
Express error" wasn't relevant to the freeze because it also happened
before I replaced vge(4). I guess the system froze while initializing
vge(4), but I'm not really sure actually.

What can I do to narrow down the cause of the problem? Is it useful to
build a kernel with options KDB and DDB?

Regards,
-- 
Yoshiaki Kasahara
Research Institute for Information Technology, Kyushu University
kasah...@nc.kyushu-u.ac.jp







Re: vge problem

2009-12-04 Thread Pyun YongHyeon
On Sat, Dec 05, 2009 at 03:26:45AM +0900, Yoshiaki Kasahara wrote:
> On Fri, 4 Dec 2009 09:36:01 -0800,
>   Pyun YongHyeon pyu...@gmail.com said:
>
> > > After I rebooted with the new kernel, the boot sequence stopped just
> > > after setting hostname.
> > >
> > > Setting hostname: elvenbow.cc.kyushu-u.ac.jp
> > > msk0: Uncorrectable PCI Express error
> > > vge0: link state changed to DOWN
> > > msk0: link state changed to DOWN
> > > (stop)
> > >
> > > The system didn't completely freeze. I can push Scroll Lock and
> > > Page Up/Down to browse the boot messages, but sometimes it stopped
> > > responding to my input for a second. Ctrl-C had no effect and I had to
> > > hit the reset button. Now my PC is synchronizing degraded gmirror
> > > volume...(ouch)
> >
> > I'm not sure why it would get stuck. You touched only vge(4),
> > right?
>
> Yes, I only touched vge(4). I believe that "msk0: Uncorrectable PCI
> Express error" wasn't relevant to the freeze because it also happened

In most cases you can ignore that message.

> before I replaced vge(4). I guess the system froze while initializing
> vge(4), but I'm not really sure actually.

Yes, that's also possible. But I can't explain how the patch can
freeze the box. Another user also reported a similar vge(4) issue
in private mail and tried the same patch, and he could successfully
boot with the patched vge(4). Unfortunately the patch does not seem to
fix his issue. I'm still working on it.

> What can I do to narrow the cause of problems? Is it useful to build
> kernel with options KDB and DDB?

Yes.


openjdk6 browser plugin

2009-12-04 Thread S . N . Grigoriev
Hi list,

I've installed openjdk6 from 
ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-8-stable/java.
Does this package contain a browser java plugin? I can't find it. Any tips are 
appreciated.

Regards,
Serguey.


Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke
On 04.12.2009 at 17:52, Stefan Bethke wrote:

> I'm getting panics like this every so often (couple weeks, sometimes just a
> few days.) A second machine that has identical hardware and is running the
> same source has no such problems.
>
> FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54
> UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64
>
> # zpool status
>   pool: tank
>  state: ONLINE
>  scrub: none requested
> config:
>
> 	NAME        STATE     READ WRITE CKSUM
> 	tank        ONLINE       0     0     0
> 	  ad4s1d    ONLINE       0     0     0
> # cat /boot/loader.conf
> vfs.zfs.arc_max=512M
> vfs.zfs.prefetch_disable=1
> vfs.zfs.zil_disable=1

Got another, different one.  Any tuning suggestions or similar?

#0  doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0x80337bd9 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x8033802c in panic (fmt=Variable fmt is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:579
#3  0x805cc2ad in trap_fatal (frame=0x9, eva=Variable eva is not 
available.
)
at /usr/src/sys/amd64/amd64/trap.c:857
#4  0x805cce12 in trap (frame=0xff80625db030)
at /usr/src/sys/amd64/amd64/trap.c:644
#5  0x805b2943 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:224
#6  0x80586c7a in vm_map_entry_splay (addr=Variable addr is not 
available.
)
at /usr/src/sys/vm/vm_map.c:771
#7  0x80587f37 in vm_map_lookup_entry (map=0xff0001e8, 
address=18446743523979624448, entry=0xff80625db170)
at /usr/src/sys/vm/vm_map.c:1021
#8  0x80588aa3 in vm_map_delete (map=0xff0001e8, 
start=18446743523979624448, end=18446743523979689984)
at /usr/src/sys/vm/vm_map.c:2685
#9  0x80588e61 in vm_map_remove (map=0xff0001e8, 
start=18446743523979624448, end=18446743523979689984)
at /usr/src/sys/vm/vm_map.c:2774
#10 0x8057db85 in uma_large_free (slab=0xff005fcc7000)
at /usr/src/sys/vm/uma_core.c:3021
#11 0x80325987 in free (addr=0xff80018b, 
mtp=0x80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
#12 0x80a36d03 in vdev_cache_evict (vc=0xff0001723ce0, 
ve=0xff003dd52200)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
#13 0x80a372ad in vdev_cache_read (zio=0xff005f5ca2d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182
#14 0x80a4a954 in zio_vdev_io_start (zio=0xff005f5ca2d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1814
#15 0x80a4ae87 in zio_execute (zio=0xff005f5ca2d0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#16 0x80a3a080 in vdev_mirror_io_start (zio=0xff005f811b40)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:303
#17 0x80a4ae87 in zio_execute (zio=0xff005f811b40)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
#18 0x809ff45a in arc_read_nolock (pio=0xff005f66d5a0, 
spa=0xff000150a000, bp=0xff800a91c440, 
done=0x80a02630 dbuf_read_done, private=Variable private is not 
available.
)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2763
#19 0x809ff8ec in arc_read (pio=0xff005f66d5a0, 
spa=0xff000150a000, bp=0xff800a91c440, pbuf=0xff0042a3ca20, 
done=0x80a02630 dbuf_read_done, private=0xff005fbfc620, 
priority=0, zio_flags=1, arc_flags=0xff80625db5ec, 
zb=0xff80625db5c0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2508
#20 0x80a02aba in dbuf_read (db=0xff005fbfc620, 
zio=0xff005f66d5a0, flags=2)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521
#21 0x80a0602c in dmu_buf_hold (os=Variable os is not available.
)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:106
#22 0x80a40db5 in zap_lockdir (os=0xff005f937610, obj=247890, 
tx=0x0, lti=RW_READER, fatreader=1, adding=0, zapp=0xff80625db888)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:388
#23 0x80a41724 in zap_cursor_retrieve (zc=0xff80625db880, 
za=0xff80625db8c0)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:1004
#24 0x80a61b66 in zfs_freebsd_readdir (ap=Variable ap is not 
available.
)
at 
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:2157
#25 0x803cfde9 in kern_getdirentries 
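
A note on reading frames #7 to #9: kgdb prints the start and end arguments in decimal, which hides what they are. The small check below converts them. The two constants are copied from the backtrace; the macro and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from frames #8 and #9 (vm_map_delete / vm_map_remove). */
#define RANGE_START	UINT64_C(18446743523979624448)
#define RANGE_END	UINT64_C(18446743523979689984)

/* Length in bytes of the half-open range [start, end). */
uint64_t
range_len(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return (end - start);
}
```

In hex the range is 0xffffff80018b0000 to 0xffffff80018c0000, that is 65536 bytes (64KB). That appears consistent with the address passed to free() in frame #11, if one assumes the archive shortened that value by dropping repeated digits.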

Re: openjdk6 browser plugin

2009-12-04 Thread Jung-uk Kim
On Friday 04 December 2009 02:33 pm, S.N.Grigoriev wrote:
> Hi list,
>
> I've installed openjdk6 from
> ftp://ftp.freebsd.org/pub/FreeBSD/ports/amd64/packages-8-stable/java.
> Does this package contain a browser java plugin? I can't find it.
> Any tips are appreciated.

No, OpenJDK does not have a browser plugin.  If Java plugin is all you 
need, you can use java/diablo-jre16.

Jung-uk Kim


Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Thomas Backman
On Dec 4, 2009, at 8:56 PM, Stefan Bethke wrote:

 Am 04.12.2009 um 17:52 schrieb Stefan Bethke:
 
 I'm getting panics like this every so often (couple weeks, sometimes just a 
 few days.) A second machine that has identical hardware and is running the 
 same source has no such problems.
 
 FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54 
 UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64
 
 # zpool status
 pool: tank
 state: ONLINE
 scrub: none requested
 config:
 
   NAME        STATE     READ WRITE CKSUM
   tank        ONLINE       0     0     0
     ad4s1d    ONLINE       0     0     0
 # cat /boot/loader.conf
 vfs.zfs.arc_max=512M
 vfs.zfs.prefetch_disable=1
 vfs.zfs.zil_disable=1
 
 Got another, different one.  Any tuning suggestions or similar?
 
 
 #6  0x80586c7a in vm_map_entry_splay (addr=Variable addr is not 
 available.
 )
at /usr/src/sys/vm/vm_map.c:771
 #7  0x80587f37 in vm_map_lookup_entry (map=0xff0001e8, 
address=18446743523979624448, entry=0xff80625db170)
at /usr/src/sys/vm/vm_map.c:1021
 #8  0x80588aa3 in vm_map_delete (map=0xff0001e8, 
start=18446743523979624448, end=18446743523979689984)
at /usr/src/sys/vm/vm_map.c:2685
 #9  0x80588e61 in vm_map_remove (map=0xff0001e8, 
start=18446743523979624448, end=18446743523979689984)
at /usr/src/sys/vm/vm_map.c:2774
 #10 0x8057db85 in uma_large_free (slab=0xff005fcc7000)
at /usr/src/sys/vm/uma_core.c:3021
 #11 0x80325987 in free (addr=0xff80018b, 
mtp=0x80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
 #12 0x80a36d03 in vdev_cache_evict (vc=0xff0001723ce0, 
ve=0xff003dd52200)
at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
 #13 0x80a372ad in vdev_cache_read (zio=0xff005f5ca2d0)
at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182
Bad RAM/motherboard? My first thought when I read your first mail (re: 
identical hardware) was bad hardware, and this seems to point towards that too, 
no?
Have you tried memtest86+?

Regards,
Thomas


Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Jeremy Chadwick


On Fri, Dec 04, 2009 at 08:56:05PM +0100, Stefan Bethke wrote:
 Am 04.12.2009 um 17:52 schrieb Stefan Bethke:
 
  I'm getting panics like this every so often (couple weeks, sometimes just a 
  few days.) A second machine that has identical hardware and is running the 
  same source has no such problems.
  
  FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54 
  UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64
  
  # zpool status
   pool: tank
  state: ONLINE
  scrub: none requested
  config:
  
    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      ad4s1d    ONLINE       0     0     0
  # cat /boot/loader.conf
  vfs.zfs.arc_max=512M
  vfs.zfs.prefetch_disable=1
  vfs.zfs.zil_disable=1
 
 Got another, different one.  Any tuning suggestions or similar?
 
 #0  doadump () at pcpu.h:223
 223   pcpu.h: No such file or directory.
   in pcpu.h
 (kgdb) #0  doadump () at pcpu.h:223
 #1  0x80337bd9 in boot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:416
 #2  0x8033802c in panic (fmt=Variable fmt is not available.
 )
 at /usr/src/sys/kern/kern_shutdown.c:579
 #3  0x805cc2ad in trap_fatal (frame=0x9, eva=Variable eva is not 
 available.
 )
 at /usr/src/sys/amd64/amd64/trap.c:857
 #4  0x805cce12 in trap (frame=0xff80625db030)
 at /usr/src/sys/amd64/amd64/trap.c:644
 #5  0x805b2943 in calltrap ()
 at /usr/src/sys/amd64/amd64/exception.S:224
 #6  0x80586c7a in vm_map_entry_splay (addr=Variable addr is not 
 available.
 )
 at /usr/src/sys/vm/vm_map.c:771
 #7  0x80587f37 in vm_map_lookup_entry (map=0xff0001e8, 
 address=18446743523979624448, entry=0xff80625db170)
 at /usr/src/sys/vm/vm_map.c:1021
 #8  0x80588aa3 in vm_map_delete (map=0xff0001e8, 
 start=18446743523979624448, end=18446743523979689984)
 at /usr/src/sys/vm/vm_map.c:2685
 #9  0x80588e61 in vm_map_remove (map=0xff0001e8, 
 start=18446743523979624448, end=18446743523979689984)
 at /usr/src/sys/vm/vm_map.c:2774
 #10 0x8057db85 in uma_large_free (slab=0xff005fcc7000)
 at /usr/src/sys/vm/uma_core.c:3021
 #11 0x80325987 in free (addr=0xff80018b, 
 mtp=0x80ac61e0) at /usr/src/sys/kern/kern_malloc.c:471
 #12 0x80a36d03 in vdev_cache_evict (vc=0xff0001723ce0, 
 ve=0xff003dd52200)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:151
 #13 0x80a372ad in vdev_cache_read (zio=0xff005f5ca2d0)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:182
 #14 0x80a4a954 in zio_vdev_io_start (zio=0xff005f5ca2d0)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1814
 #15 0x80a4ae87 in zio_execute (zio=0xff005f5ca2d0)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
 #16 0x80a3a080 in vdev_mirror_io_start (zio=0xff005f811b40)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:303
 #17 0x80a4ae87 in zio_execute (zio=0xff005f811b40)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:996
 #18 0x809ff45a in arc_read_nolock (pio=0xff005f66d5a0, 
 spa=0xff000150a000, bp=0xff800a91c440, 
 done=0x80a02630 dbuf_read_done, private=Variable private is 
 not available.
 )
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2763
 #19 0x809ff8ec in arc_read (pio=0xff005f66d5a0, 
 spa=0xff000150a000, bp=0xff800a91c440, pbuf=0xff0042a3ca20, 
 done=0x80a02630 dbuf_read_done, private=0xff005fbfc620, 
 priority=0, zio_flags=1, arc_flags=0xff80625db5ec, 
 zb=0xff80625db5c0)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2508
 #20 0x80a02aba in dbuf_read (db=0xff005fbfc620, 
 zio=0xff005f66d5a0, flags=2)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521
 #21 0x80a0602c in dmu_buf_hold (os=Variable os is not available.
 )
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:106
 #22 0x80a40db5 in zap_lockdir (os=0xff005f937610, obj=247890, 
 tx=0x0, lti=RW_READER, fatreader=1, adding=0, zapp=0xff80625db888)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:388
 #23 0x80a41724 in zap_cursor_retrieve (zc=0xff80625db880, 
 za=0xff80625db8c0)
 at 
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:1004
 #24 0x80a61b66 in zfs_freebsd_readdir (ap=Variable ap is not 
 

Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke
Am 04.12.2009 um 21:20 schrieb Thomas Backman:

 Bad RAM/motherboard? My first thought when I read your first mail (re: 
 identical hardware) was bad hardware, and this seems to point towards that 
 too, no?
 Have you tried memtest86+?

No, I haven't yet, since I don't have physical access right now, and the box is 
in production service.  I've shifted a couple of services to the other, 
identical box to see if that changes anything in the behavior.

Right now it seems that heavy CPU load triggers panics, so bad RAM, CPU, 
chipset, mainboard, or marginal power supply are all possibilities.


Stefan

-- 
Stefan Bethke s...@lassitu.de   Fon +49 151 14070811






Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke

Am 04.12.2009 um 21:33 schrieb Jeremy Chadwick:

 You only have one disk in your pool.  I'm not sure how long your system
 stays up before it panics, but could you try doing zpool scrub tank
 and let that run for a while?  The first ~5 minutes may show the time to
 completion (from zpool status) getting worse and worse, but it should
 decrease/catch up.
 
 If the scrub is able to finish, look for any errors in the resulting
 R/W/CK fields.

Doh, should have thought of that myself. Will get started right away.


Stefan

-- 
Stefan Bethke s...@lassitu.de   Fon +49 151 14070811






Re: Fatal trap 9 triggered by zfs?

2009-12-04 Thread Stefan Bethke
Am 04.12.2009 um 20:56 schrieb Stefan Bethke:

 Am 04.12.2009 um 17:52 schrieb Stefan Bethke:
 
 I'm getting panics like this every so often (couple weeks, sometimes just a 
 few days.) A second machine that has identical hardware and is running the 
 same source has no such problems.
 
 FreeBSD XXX.hanse.de 8.0-STABLE FreeBSD 8.0-STABLE #16: Tue Dec  1 14:30:54 
 UTC 2009 r...@xxx.hanse.de:/usr/obj/usr/src/sys/EISENBOOT  amd64
 
 # zpool status
 pool: tank
 state: ONLINE
 scrub: none requested
 config:
 
   NAME        STATE     READ WRITE CKSUM
   tank        ONLINE       0     0     0
     ad4s1d    ONLINE       0     0     0
 # cat /boot/loader.conf
 vfs.zfs.arc_max=512M
 vfs.zfs.prefetch_disable=1
 vfs.zfs.zil_disable=1

Third one.  Since there's no mention of ZFS in this one, I'll start looking 
into potential hardware issues.

(kgdb) #0  doadump () at pcpu.h:223
#1  0x80337bd9 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x8033802c in panic (fmt=Variable fmt is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:579
#3  0x805cc2ad in trap_fatal (frame=0x9, eva=Variable eva is not 
available.
)
at /usr/src/sys/amd64/amd64/trap.c:857
#4  0x805cce12 in trap (frame=0xff800011ab00)
at /usr/src/sys/amd64/amd64/trap.c:644
#5  0x805b2943 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:224
#6  0x803405fc in msleep_spin (ident=0xff00015ce780, 
mtx=0xff00015ce7b0, wmesg=0x80638ffd -, timo=0)
at /usr/src/sys/kern/kern_synch.c:312
#7  0x80373ef7 in taskqueue_thread_loop (arg=Variable arg is not 
available.
)
at /usr/src/sys/kern/subr_taskqueue.c:89
#8  0x8030e7d8 in fork_exit (
callout=0x80373e90 taskqueue_thread_loop, 
arg=0xff80002e4768, frame=0xff800011ac80)
at /usr/src/sys/kern/kern_fork.c:843
#9  0x805b2e1e in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:561


Stefan

-- 
Stefan Bethke s...@lassitu.de   Fon +49 151 14070811






Re: Quggaa locking hard.

2009-12-04 Thread K. Macy
If you have a large number of routes then you will want to disable the
flowtable. The default maximum number of cacheable flows is fairly
small; raising it can help on the low end, but fundamentally it's an
optimization for systems that have fewer than a few thousand
simultaneous peers - the common case.

I do have longer term plans for moving to lock-free L3 and L2 so that
applications with large numbers of prefixes will also no longer be
hampered by high locking overhead.


-Kip



On Fri, Dec 4, 2009 at 6:56 AM, Mike Tancsa m...@sentex.net wrote:
 At 10:46 PM 12/3/2009, Zaphod Beeblebrox wrote:

 I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0
 and not locking hard on 7.2.  It seems (at this early point in the
 investigation) that both bgpd and zebra are wedging and zebra is listed as
 being in the RUN state.

 curiously, the load is also 4.0 (exactly the number of cores in the machine)
 even though the machine also reads 100% idle.


 I think I am seeing something similar on a test box.  I was loading up the
 box with 200k routes to do testing with.  Kernel is default, save for a few
 unused drivers removed. If I take out
 options                FLOWTABLE               # per-cpu routing cache
 from the kernel, load avg is back to normal.  This issue only seems to have
 come up in the past week or so as the previous kernel from ~8 days ago was
 OK.

 last pid:  6229;  load averages:  2.00,  2.00,  2.00    up 1+17:33:02  09:39:31
 141 processes: 7 running, 106 sleeping, 28 waiting
 CPU:  0.0% user,  0.0% nice, 22.2% system,  0.0% interrupt, 77.8% idle
 Mem: 98M Active, 2233M Inact, 187M Wired, 36K Cache, 112M Buf, 979M Free
 Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   22 root      76    -     0K     8K CPU3    3  41.5H 100.00% flowcleaner
   11 root     171 ki31     0K    32K CPU2    2  41.5H 100.00% {idle: cpu2}
   11 root     171 ki31     0K    32K CPU1    1  41.5H 100.00% {idle: cpu1}
   11 root     171 ki31     0K    32K RUN     0  41.4H 100.00% {idle: cpu0}
  869 root       4    0 64860K 64488K select  0   4:12  0.00% bgpd
   11 root     171 ki31     0K    32K RUN     3   2:09  0.00% {idle: cpu3}
   20 root      44    -     0K     8K syncer  0   1:00  0.00% syncer
   12 root     -32    -     0K   224K WAIT    1   0:47  0.00% {swi4: clock}
    0 root     -68    0     0K    80K -       2   0:03  0.00% {fw0_taskq}
  1230 root      76    0  3348K  1160K ttyin   2   0:02  0.00% getty
  863 root      96    0 24640K 24232K RUN     2   0:02  0.00% zebra
   12 root     -32    -     0K   224K WAIT    2   0:01  0.00% {swi4: clock}
   14 root     -16    -     0K     8K -       0   0:01  0.00% yarrow


 
 Mike Tancsa,                                      tel +1 519 651 3400
 Sentex Communications,                            m...@sentex.net
 Providing Internet since 1994                    www.sentex.net
 Cambridge, Ontario Canada                         www.sentex.net/mike




Re: loader(8) readin failed on 7.2R and later including 8.0R

2009-12-04 Thread John Baldwin
On Friday 04 December 2009 10:35:59 am John Baldwin wrote:
 On Thursday 03 December 2009 4:20:08 pm Hiroki Sato wrote:
  John Baldwin j...@freebsd.org wrote
in 200912030803.29797@freebsd.org:
  
  jh On Thursday 03 December 2009 5:29:13 am Hiroki Sato wrote:
  jh  John Baldwin j...@freebsd.org wrote
  jhin 200912020948.05698@freebsd.org:
  jh 
  jh  jh On Tuesday 01 December 2009 12:13:39 pm Hiroki Sato wrote:
  jh  jh   While the load command seemed to finish, the box got stuck just
  jh  jh   after entering boot command.
  jh  jh 
  jh  jh   Curious to say, I have got this symptom only on a specific box in
  jh  jh   more than ten different boxes I upgraded so far; it is based on an
  jh  jh   old motherboard Supermicro P4DPE[*].
  jh  jh 
  jh  jh   [*] http://www.supermicro.com/products/motherboard/Xeon/E7500/P4DPE.cfm
  jh  jh 
  jh  jh   Any workaround?  Booting from release CDROMs (7.2R and 8.0R) also
  jh  jh   fail.  On the box 7.1R or 7.1R's loader + 7.2R kernel worked
  jh  jh   fine.  It is possible something in changes of loader(8) between 7.1R
  jh  jh   and 7.2R is the cause, but I am still not sure what it is...
  jh  jh
  jh  jh It may be related to the loader switching to using memory > 1MB for its
  jh  jh malloc().  Maybe try building the loader with
  jh  jh 'LOADER_NO_GPT_SUPPORT=yes' in /etc/src.conf?
  jh 
  jh   Thanks, a recompiled loader with 'LOADER_NO_GPT_SUPPORT=yes' displayed
  jh   elf32_loadimage: could not read symbols - skipped! for 8.0R kernel.
  jh   This is the same as 7.1R's loader + 8.0R kernel case.
  jh
  jh Can you get the output of 'smap' from the loader?  Is the 8.0 kernel bigger
  jh than the 7.x kernel?  If so, can you try trimming the 8.0 kernel a bit to see
  jh if that changes things?
  
   Sure.  Output of smap on an 8.0R loader with LOADER_NO_GPT_SUPPORT=yes
   was:
  
  | OK smap
  | SMAP type=01 base=0000000000000000 len=000000000009f400
  | SMAP type=02 base=000000000009f400 len=0000000000000c00
  | SMAP type=02 base=00000000000dc000 len=0000000000024000
  | SMAP type=01 base=0000000000100000 len=0000000000e00000
 
 So this is the region that ends up getting used for malloc:
 
   /* look for the first segment in 'extended' memory */
   if ((smap.type == SMAP_TYPE_MEMORY) && (smap.base == 0x100000)) {
   bios_extmem = smap.length;
 
   ...
 
 /* Set memtop to actual top of memory */
 memtop = memtop_copyin = 0x100000 + bios_extmem;
 
 
 and then later:
 
 #if defined(LOADER_BZIP2_SUPPORT) || defined(LOADER_FIREWIRE_SUPPORT) || defined(LOADER_GPT_SUPPORT) || defined(LOADER_ZFS_SUPPORT)
 heap_top = PTOV(memtop_copyin);
 memtop_copyin -= 0x300000;
 heap_bottom = PTOV(memtop_copyin);
 #else
 
 So memtop_copyin would start off as 0xf00000 but would end up as 0xc00000,
 and since the kernel starts at 4MB, I think that only leaves about 8MB for
 the kernel.  Probably the loader needs to be more intelligent about using
 high memory for malloc by using the largest region > 1MB but < 4GB for
 malloc() instead of stealing memory from bios_extmem in the SMAP case.
 Try the attached patch which tries to make the loader use better smarts
 when picking a memory region for the heap (warning, I haven't tested it
 myself yet).

Use the updated patch (actually tested in qemu) instead.

-- 
John Baldwin
--- //depot/vendor/freebsd/src/sys/boot/i386/libi386/biosmem.c	2007/10/28 21:26:35
+++ //depot/user/jhb/boot/sys/boot/i386/libi386/biosmem.c	2009/12/04 22:20:17
@@ -35,14 +35,20 @@
 #include "libi386.h"
 #include "btxv86.h"
 
-vm_offset_t	memtop, memtop_copyin;
-u_int32_t	bios_basemem, bios_extmem;
+vm_offset_t	memtop, memtop_copyin, high_heap_base;
+uint32_t	bios_basemem, bios_extmem, high_heap_size;
 
 static struct bios_smap smap;
 
+/*
+ * The minimum amount of memory to reserve in bios_extmem for the heap.
+ */
+#define	HEAP_MIN	(3 * 1024 * 1024)
+
 void
 bios_getmem(void)
 {
+uint64_t size;
 
 /* Parse system memory map */
 v86.ebx = 0;
@@ -65,6 +71,26 @@
 	if ((smap.type == SMAP_TYPE_MEMORY) && (smap.base == 0x100000)) {
 	bios_extmem = smap.length;
 	}
+
+	/*
+	 * Look for the largest segment in 'extended' memory beyond
+	 * 1MB but below 4GB.
+	 */
+	if ((smap.type == SMAP_TYPE_MEMORY) && (smap.base > 0x100000) &&
+	(smap.base < 0x100000000ull)) {
+	size = smap.length;
+
+	/*
+	 * If this segment crosses the 4GB boundary, truncate it.
+	 */
+	if (smap.base + size > 0x100000000ull)
+		size = 0x100000000ull - smap.base;
+
+	if (size > high_heap_size) {
+		high_heap_size = size;
+		high_heap_base = smap.base;
+	}
+	}
 } while (v86.ebx != 0);
 
 /* Fall back to the old compatibility function for base memory */
@@ -97,5 +123,13 @@
 /* Set memtop to actual top of memory */
 memtop = memtop_copyin = 0x100000 + bios_extmem;
 
+/*
+ * If we have extended memory and did not find a suitable heap
+ * region in the SMAP, use the