Jeff Ross wrote:
On Sat, 30 Jan 2010, Bob Beck wrote:

Ooooh. nice one. Obviously ami couldn't get memory mappings and freaked out.

While not completely necessary, I'd love for you to file that whole
thing into sendbug() in a pr so we don't forget it. but that one I need
to pester krw, art, dlg, and maybe marco about what ami is doing.

note that the behaviour you see wrt free memory dropping but not
hitting swap is what I expect. basically that makes the buffer cache
subordinate to working set memory between 10 and 90% of physmem. the
buffer cache will throw away pages before allowing the system to swap.

Drop it back to 70% and tell me if you still get the same panic
please.  and if you have a fixed test case that reproduces this on your
machine ( a load generator for postgres with clients) I'd love to have
a copy in the pr as well.

70% produces the same panic.

panic: pmap_enter: no pv entries available
Stopped at      Debugger+0x4:   leave
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
IF RUNNING SMP, USE 'mach ddbcpu <#>' AND 'trace' ON OTHER PROCESSORS, TOO.
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb{0}> trace
Debugger(7149,dfe3ede8,d08edf18,c,0) at Debugger+0x4
panic(d077d740,0,7000,d08ad6e0,ffffffff) at panic+0x55
pmap_enter(d08f3520,e0031000,7149000,3,13) at pmap_enter+0x2e5
_bus_dmamem_map(d0875c40,d505fc44,1,6344,d505fc58,1,dfe3eebc,1) at _bus_dmamem_map+0x9c
ami_allocmem(d49b5800,6344,20,d0753ddc) at ami_allocmem+0x92
ami_mgmt(d49b5800,a1,4,0,0,6344,d49cd000,1) at ami_mgmt+0x268
ami_refresh_sensors(d49b5800,da987028,da987050,80000000,da987028) at ami_refresh_sensors+0x25
sensor_task_work(d49b3d80,0,50,200286) at sensor_task_work+0x1f
workq_thread(d0863100) at workq_thread+0x32
Bad frame pointer: 0xd0a32e78
ddb{0}>

I'll skip the ps for this go round since it should be pretty much the same and go directly to sendbug, including the pg_bench script I use to trigger it.

Thanks!

Jeff

Okay, I've been testing.  I brought everything up to current, applied the
ami.c patch sent by David Gwynne as modified by Philip Guenther, and the
patch to bus_dma.c sent by Kenneth Westerback.

I started by setting kern.bufcachepercent=60 and then moving down by 10 after
each panic. Anything 20 or greater triggers the same panic as above.
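
For anyone who wants to repeat the sweep, here is a minimal sketch of the sequence of settings described above. The helper names are made up for illustration; the only real command involved is sysctl(8) with a name=value argument, run as root before each load test.

```python
def sweep_values(start=60, step=10, floor=20):
    """Return the kern.bufcachepercent values to try, highest first."""
    return list(range(start, floor - 1, -step))


def sysctl_commands(values):
    """Build one sysctl(8) command line per value (hypothetical helper)."""
    return [f"sysctl kern.bufcachepercent={v}" for v in values]


if __name__ == "__main__":
    # Print the commands in the order they would be tried, one per test run.
    for cmd in sysctl_commands(sweep_values()):
        print(cmd)
```

The script only prints the commands; each one would be run by hand, followed by the pgbench load, with a reboot after any panic.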

I then set it to 10 to see what would happen.  The load ran okay, but I did
get three "uvm_mapent_alloc: out of static map entries" messages on the
console that seem to coincide with the end of one of the three pgbench runs
and the start of the next.

So I set it to 11 and got this:

ddb{2}> show panic
malloc: out of space in kmem_map
ddb{2}> trace
Debugger(3fff,c,d488a000,4,4000) at Debugger+0x4
panic(d0752c20,0,4000,0,0) at panic+0x55
malloc(4000,7f,0,da8980b4,70000) at malloc+0x76
vfs_getcwd_scandir(e002eee8,e002eecc,e002eed0,d4e80400,da24f184) at vfs_getcwd_scandir+0x123
vfs_getcwd_common(da898154,0,e002ef10,d4e80400,200,1,da24f184,22) at vfs_getcwd_common+0x1f0
sys___getcwd(da24f184,e002ef68,e002ef58,da24f184) at sys___getcwd+0x62
syscall() at syscall+0x12b
--- syscall (number 304) ---
0x1c028f25:

I reported a similar panic back in December
        http://kerneltrap.org/mailarchive/openbsd-misc/2009/12/14/6309363
and was told I'd twisted the knobs too hard ;-)

Here are the sysctl values I'm currently using:

kern.maxproc=10240
kern.maxfiles=20480
kern.maxvnodes=6000

kern.shminfo.shmseg=32
kern.seminfo.semmni=256
kern.seminfo.semmns=2048
kern.shminfo.shmall=512000
kern.shminfo.shmmax=768000000
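
To put the shared memory numbers in perspective, here is a rough conversion sketch. It assumes kern.shminfo.shmall is counted in pages and kern.shminfo.shmmax in bytes, and that the page size is 4 KB (the 32-bit addresses in the traces suggest i386, but that is an assumption). The function name is made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KB pages, as on i386


def shm_limits(shmall_pages, shmmax_bytes, page_size=PAGE_SIZE):
    """Convert the shm sysctl values to bytes and MiB for comparison."""
    shmall_bytes = shmall_pages * page_size
    return {
        "shmall_bytes": shmall_bytes,
        "shmall_mib": shmall_bytes / 2**20,
        "shmmax_mib": shmmax_bytes / 2**20,
        # True when a single maximum-size segment fits under the total limit.
        "segment_fits": shmmax_bytes <= shmall_bytes,
    }


# Values taken from the sysctl settings listed above.
limits = shm_limits(shmall_pages=512000, shmmax_bytes=768000000)

if __name__ == "__main__":
    for name, value in limits.items():
        print(name, value)
```

Under those assumptions the total limit works out to 2000 MiB and the largest single segment to a bit over 732 MiB.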

About that time Owain Ainsworth sent his version of a fix to bus_dma.c, so I
applied that and built a new kernel.  I still get panics when I adjust
kern.bufcachepercent above 15 or so.

Here's the latest panic, trace and ps with kern.bufcachepercent set to 20:
ddb{0}> show panic
pmap_enter: no pv entries available
ddb{0}> trace
Debugger(b6f30,dff02e1c,2000,c,0) at Debugger+0x4
panic(d077d7a0,0,0,d0e7a980,1) at panic+0x55
pmap_enter(d08f3520,e002d000,b6f30000,7,13) at pmap_enter+0x2e5
uvm_km_alloc1(d08ad720,2000,0,1) at uvm_km_alloc1+0xd5
fork1(da8c5834,14,1,0,0) at fork1+0x100
sys_fork(da8c5834,dff02f68,dff02f58,da8c5834) at sys_fork+0x38
syscall() at syscall+0x12b
--- syscall (number 2) ---
ddb{0}> ps
   PID   PPID   PGRP    UID  S       FLAGS  WAIT          COMMAND
 18032   3768  18032    503  3   0x2000008  flt_pmfail2   postgres
 28815   3768  28815    503  3   0x2000008  inode         postgres
 16262   3768  16262    503  2   0x2000008                postgres
 15301   3768  15301    503  3   0x2000008  inode         postgres
 16712   3768  16712    503  2   0x2000008                postgres
  5959   3768   5959    503  3   0x2000008  flt_pmfail2   postgres
 24166   3768  24166    503  3   0x2000008  flt_pmfail2   postgres
 23692   3768  23692    503  3   0x2000008  flt_pmfail2   postgres
 21841   3768  21841    503  3   0x2000008  flt_pmfail2   postgres
 23838   3768  23838    503  3   0x2000008  flt_pmfail2   postgres
 25423   3768  25423    503  3   0x2000008  inode         postgres
  5075   3768   5075    503  3   0x2000008  inode         postgres
 23008   3768  23008    503  3   0x2000008  inode         postgres
 10527   3768  10527    503  3   0x2000008  flt_pmfail2   postgres
 17391   3768  17391    503  3   0x2000008  flt_pmfail2   postgres
 27363   3768  27363    503  3   0x2000008  flt_pmfail2   postgres
  4858   3768   4858    503  2   0x2000008                postgres
 18716   3768  18716    503  2   0x2000008                postgres
  8073   3768   8073    503  2   0x2000008                postgres
 30893   3768  30893    503  3   0x2000008  flt_pmfail2   postgres
 13741   3768  13741    503  3   0x2000008  flt_pmfail2   postgres
 14272   3768  14272    503  2   0x2000008                postgres
  1962   3768   1962    503  3   0x2000008  inode         postgres
  4988   3768   4988    503  3   0x2000008  flt_pmfail2   postgres
  8452   3768   8452    503  3   0x2000008  inode         postgres
 15633   3768  15633    503  3   0x2000008  flt_pmfail2   postgres
 17648   3768  17648    503  3   0x2000008  flt_pmfail2   postgres
 27751   3768  27751    503  3   0x2000008  inode         postgres
 12932   3768  12932    503  7   0x2000008                postgres
 21367   3768  21367    503  2   0x2000008                postgres
 13583   3768  13583    503  3   0x2000008  flt_pmfail2   postgres
 11813   3768  11813    503  3   0x2000008  inode         postgres
  6094   3768   6094    503  3   0x2000008  inode         postgres
 13686   3768  13686    503  3   0x2000008  flt_pmfail2   postgres
 10011   3768  10011    503  3   0x2000008  flt_pmfail2   postgres
 30652   3768  30652    503  7   0x2000008                postgres
  5358   3768   5358    503  3   0x2000008  flt_pmfail2   postgres
 24385   3768  24385    503  3   0x2000008  inode         postgres
 29773   3768  29773    503  3   0x2000008  flt_pmfail2   postgres
 22599   3768  22599    503  3   0x2000008  flt_pmfail2   postgres
 14231   3768  14231    503  3   0x2000008  flt_pmfail2   postgres
 14083   3768  14083    503  3   0x2000008  flt_pmfail2   postgres
  9729   3768   9729    503  2   0x2000008                postgres
 11748   3768  11748    503  3   0x2000008  inode         postgres
  9097   3768   9097    503  3   0x2000008  flt_pmfail2   postgres
 16057   3768  16057    503  3   0x2000008  inode         postgres
  1067   3768   1067    503  3   0x2000008  flt_pmfail2   postgres
 27445   3768  27445    503  3   0x2000008  flt_pmfail2   postgres
 24037   3768  24037    503  3   0x2000008  flt_pmfail2   postgres
  7000   3768   7000    503  2   0x2000008                postgres
 29367   3768  29367    503  3   0x2000008  inode         postgres
 25020   3768  25020    503  3   0x2000008  flt_pmfail2   postgres
 20564   3768  20564    503  3   0x2000008  flt_pmfail2   postgres
  4090   3768   4090    503  3   0x2000008  flt_pmfail2   postgres
  8865   3768   8865    503  3   0x2000008  flt_pmfail2   postgres
 29631   3768  29631    503  3   0x2000008  inode         postgres
 16278   3768  16278    503  7   0x2000008                postgres
 20854   3768  20854    503  2   0x2000008                postgres
 31943   3768  31943    503  3   0x2000008  flt_pmfail2   postgres
 18137   3768  18137    503  2   0x2000008                postgres
 10523   4345  10523      0  3   0x2044182  poll          systat
  4345      1   4345      0  3   0x2004082  pause         ksh
  2520      1   2520      0  3   0x2004082  ttyin         getty
 32389      1  32389      0  3   0x2004082  ttyin         getty
 21553      1  21553      0  3   0x2000080  select        cron
 22285      1  22285      0  3   0x2000080  select        sshd
 19843      1  19843      0  3   0x2000180  select        inetd
 24948  27412  17184     83  3   0x2000180  poll          ntpd
 27412  17184  17184     83  3   0x2000180  poll          ntpd
 17184      1  17184      0  3   0x2000080  poll          ntpd
  7640  25372  25372     74  3   0x2000180  bpf           pflogd
 25372      1  25372      0  3   0x2000080  netio         pflogd
 25643  12522  12522     73  3   0x2000180  poll          syslogd
 12522      1  12522      0  3   0x2000088  netio         syslogd
 31132   3768  31132    503  3   0x2000088  poll          postgres
  6731   3768   6731    503  3   0x2000088  select        postgres
  3796   3768   3796    503  2         0x8                postgres
 10178   3768  10178    503  3   0x2000088  select        postgres
 23616   3768  23616    503  3   0x2000088  select        postgres
  1109  22145  22399   7794  3   0x2004082  piperd        qmail-clean
  4633  22145  22399   7795  3   0x2004082  select        qmail-rspawn
 26222  22145  22399      0  3   0x2004082  select        qmail-lspawn
* 3768  27937  22399    503  7   0x200400a                postgres
  3493  26189  22399     73  3   0x2004082  piperd        multilog
 19747  10749  22399   7792  3   0x2004082  piperd        multilog
 31915  18366  22399   1002  3   0x2004082  piperd        multilog
   611  19270  22399     73  3   0x2004182  netio         socklog
  6710      1   6710      0  3   0x2004082  ttyin         getty
 30305      1  30305      0  3   0x2004082  ttyin         getty
  1774      1   1774      0  3   0x2004082  ttyin         getty
 25968  31940  22399   1001  3   0x2004182  poll          dnscache
 30323   3543  22399     73  3   0x2004082  piperd        multilog
 22145  15017  22399   7796  3   0x2004082  select        qmail-send
 26189  30365  22399      0  3   0x2004082  poll          supervise
 27937  30365  22399      0  3   0x2004082  poll          supervise
  3543  30365  22399      0  3   0x2004082  poll          supervise
 19270  30365  22399      0  3   0x2004082  poll          supervise
 10749  30365  22399      0  3   0x2004082  poll          supervise
 15017  30365  22399      0  3   0x2004082  poll          supervise
 18366  30365  22399      0  3   0x2004082  poll          supervise
 31940  30365  22399      0  3   0x2004082  poll          supervise
 30671  22399  22399      0  3   0x2004082  piperd        readproctitle
 30365  22399  22399      0  3   0x2004082  nanosleep     svscan
 22399      1  22399      0  3   0x2004082  pause         sh
    19      0      0      0  3   0x2100200  bored         crypto
    18      0      0      0  3   0x2100200  aiodoned      aiodoned
    17      0      0      0  3   0x2100200  syncer        update
    16      0      0      0  3   0x2100200  cleaner       cleaner
    15      0      0      0  3    0x100200  reaper        reaper
    14      0      0      0  2   0x2100200                pagedaemon
    13      0      0      0  3   0x2100200  pftm          pfpurge
    12      0      0      0  3   0x2100200  usbevt        usb2
    11      0      0      0  3   0x2100200  usbevt        usb1
    10      0      0      0  3   0x2100200  usbtsk        usbtask
     9      0      0      0  3   0x2100200  usbevt        usb0
     8      0      0      0  3   0x2100200  acpi_idle     acpi0
     7      0      0      0  3  0x40100200                idle3
     6      0      0      0  3  0x40100200                idle2
     5      0      0      0  3  0x40100200                idle1
     4      0      0      0  3   0x2100200  bored         syswq
     3      0      0      0  3  0x40100200                idle0
     2      0      0      0  3   0x2100200  kmalloc       kmthread
     1      0      1      0  3   0x2004080  wait          init
     0     -1      0      0  3   0x2080200  scheduler     swapper
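
Since most of that list is postgres backends, a quick way to see what they are waiting on is to tally the WAIT column. Below is a minimal sketch that does this for pasted ddb ps output. The function name is hypothetical, the sample is only a few rows copied from the listing above, and the parser assumes command names contain no spaces.

```python
from collections import Counter


def wait_channels(ps_text, command="postgres"):
    """Count WAIT channels for rows whose COMMAND equals `command`."""
    counts = Counter()
    for line in ps_text.splitlines():
        # The currently running process is marked with a leading '*'.
        fields = line.replace("*", " ").split()
        # Skip blank lines and the header row.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        if fields[-1] != command:
            continue
        # Rows with an empty WAIT column have 7 fields instead of 8.
        wait = fields[6] if len(fields) == 8 else "(none)"
        counts[wait] += 1
    return counts


SAMPLE = """\
   PID   PPID   PGRP    UID  S       FLAGS  WAIT          COMMAND
 18032   3768  18032    503  3   0x2000008  flt_pmfail2   postgres
 28815   3768  28815    503  3   0x2000008  inode         postgres
 16262   3768  16262    503  2   0x2000008                postgres
 10523   4345  10523      0  3   0x2044182  poll          systat
* 3768  27937  22399    503  7   0x200400a                postgres
"""

if __name__ == "__main__":
    for wait, n in wait_channels(SAMPLE).most_common():
        print(f"{wait:12} {n}")
```

Feeding it the full listing instead of SAMPLE would give the split between backends sleeping on flt_pmfail2, those sleeping on inode, and those with no wait channel.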

Anybody got an idea what I can test from here?

Thanks,

Jeff
