Re: g_vfs_done():md2[WRITE(offset=434585600...?

Matthew Seaman Fri, 10 Feb 2006 06:54:20 -0800

[EMAIL PROTECTED] wrote:
> Hi all,
> 
> I am seeking information about what this and other similar messages
> mean, and corrective action to take. At the time of the error
> message, the machine spontaneously rebooted (apparently without a
> panic) and came back with a corrupt /var filesystem (for which fsck
> required manual intervention to recover).
> 
> The machine is a dual Xeon ASUS NCCH-DL board with 4 GB of RAM,
> running 6.0 STABLE Thu Dec 22 18:24:2005, and has otherwise been
> reliable. The machine was placed into test as a secondary mail
> server, seeded with dictionary-attack accounts and allowed to collect
> UCE and ratware at will, as a test for SpamAssassin and MIMEDefang.
> (Also makes a good test for a pf-spamd teergrube.)
> 
> md2 is a 512 MB memory disk mounted on /var/spool/MIMEDefang, to allow
> quick scanning with less hardware disk I/O. The main hardware drive
> controller is a 3ware 4-port SATA controller in RAID mirror mode.
> 
> Googling on this g_vfs_done() seems to show various similar requests
> for information related to other circumstances but no parseable
> responses. (I don't *think* md2 was ever *full*.) I can read code..
> but.. Geez, filesystem code... Echh. Clue-stick -> manpage welcome
> here. Thanks.
> 
> Feb  8 13:48:59 testbed kernel: g_vfs_done():md2[WRITE(offset=434585600, length=131072)]error = 28
> Feb  8 13:48:59 testbed kernel: g_vfs_done():md2[WRITE(offset=434716672, length=131072)]error = 28
> Feb  8 13:48:59 testbed kernel: g_vfs_done():md2[WRITE(offset=434847744, length=131072)]error = 28
> Feb  8 13:48:59 testbed kernel: g_vfs_done():md2[WRITE(offset=434978816, length=131072)]error = 28
> Feb  8 13:48:59 testbed kernel: g_vfs_done():md2[WRITE(offset=435109888, length=131072)]error = 28
> Feb  8 13:48:59 testbed kernel: g_vfs_done():md2[WRITE(offset=435240960, length=131072)]error = 28
>


Did you get a kernel dump after the reboot?  If you did, and you generated a
backtrace as described here:

    
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-gdb.html

I reckon you'd see that it panicked with 'kmem_map too small':

#0  doadump () at pcpu.h:165
#1  0xc063ce7f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc063d1a5 in panic (fmt=0xc0888692 "kmem_malloc(%ld): kmem_map too small: %ld total allocated") at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc07aa349 in kmem_malloc (map=0xc10600c0, size=16384, flags=1026) at /usr/src/sys/vm/vm_kern.c:299
#4  0xc07a1c72 in page_alloc (zone=0x0, bytes=16384, pflag=0x0, wait=1026) at /usr/src/sys/vm/uma_core.c:957
[etc...]

It's a bug -- the VM system seems to starve the memory disk of pages, causing
a crash.  See the example given at the end of 

   http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/87255
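
As an aside on the messages quoted above: the trailing 'error = 28' is an
errno value, and 28 is ENOSPC ("No space left on device") on FreeBSD.  Below
is a minimal Python sketch for decoding such lines.  It is not taken from the
original report: the decode() helper and the regex are my own, written against
those six log lines only, and Python's errno table reflects the host it runs
on, so the name lookup assumes a platform where 28 is ENOSPC (true on FreeBSD
and Linux).

```python
import errno
import os
import re

# Regex for the g_vfs_done() messages quoted above. It is an assumption
# derived only from those six lines, not from the kernel source.
LINE_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>\w+)\[(?P<op>\w+)\(offset=(?P<offset>\d+),\s*"
    r"length=(?P<length>\d+)\)\]error = (?P<err>\d+)"
)


def decode(line):
    """Return a dict describing one g_vfs_done() message, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    return {
        "device": m.group("dev"),
        "op": m.group("op"),
        # Byte offset into the device, expressed in MiB.
        "offset_mib": int(m.group("offset")) / 2**20,
        "length_kib": int(m.group("length")) // 1024,
        # Symbolic errno name on the host running this script.
        "errno_name": errno.errorcode.get(err, "unknown"),
        "message": os.strerror(err),
    }


sample = ("Feb  8 13:48:59 testbed kernel: g_vfs_done():md2[WRITE("
          "offset=434585600, length=131072)]error = 28")
info = decode(sample)
print(info)
```

For the first quoted line this reports a 128 KiB WRITE to md2 at roughly
414 MiB, which is inside a 512 MB device.  That seems consistent with the
memory disk failing to get backing pages rather than the filesystem running
past the end of the device, though I have not confirmed that against the md
driver source.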

The ultimate cause would be running a bunch of programs with heavy memory
requirements, leaving too little memory for both them and the malloc-backed
memory filesystem.  See mdconfig(8) -- as it says:

            malloc   Storage for this type of memory disk is allocated with
                      malloc(9).  This limits the size to the malloc bucket
                      limit in the kernel.  If the -o reserve option is not
                      set, creating and filling a large malloc-backed memory
                      disk is a very easy way to panic a system.

Hence using '-o reserve' looks like a very good thing to try.  Alternatively
use a swap-backed memory disk, or don't use a memory disk at all.

        Cheers,

        Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                       Flat 3
                                                      7 Priory Courtyard
PGP: http://www.infracaninophile.co.uk/pgpkey         Ramsgate
                                                      Kent, CT11 9PW, UK

