Hi Marion,

I'm not the right person to analyze your panic stack, but a quick
search suggests that the "page_sub: bad arg(s): pp" panic string may be
associated with a bad CPU or a page-locking problem.

I would recommend running CPU/memory diagnostics on this system.
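
Before pulling hardware, it may also be worth checking whether the Fault
Management framework has already flagged anything.  This is just a rough
sketch of what I'd look at first (SunVTS, if installed, can do a fuller
CPU/memory stress test):

    # Any already-diagnosed faults on record?
    fmadm faulty

    # Raw error telemetry (correctable/uncorrectable memory errors, CPU errors)
    fmdump -eV | more

    # Per-CPU status, in case FMA has offlined a CPU
    psrinfo -v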


Thanks,

Cindy

On 09/02/10 20:31, Marion Hakanson wrote:
Folks,

Has anyone seen a panic traceback like the following?  This is Solaris 10u7
on a Thumper, acting as an NFS server.  The machine had been up for nearly a
year; I added a dataset to an existing pool, set compression=on for the
first time on this system, loaded some data into it (via "rsync"), and
then mounted it on the NFS client.
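
For what it's worth, the sequence was roughly the following (the pool and
dataset names here are placeholders, and the NFS export may have been set
up differently than shown):

    zfs create tank/newdata                 # new dataset in the existing pool
    zfs set compression=on tank/newdata     # first use of compression on this box
    rsync -a /source/data/ /tank/newdata/   # initial data load, done locally
    zfs set sharenfs=on tank/newdata        # then mounted from the NFS client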

The first data was written by the client itself in a 10pm cron-job, and
the system crashed at 10:02pm as below:

panic[cpu2]/thread=fffffe8000f5cc60: page_sub: bad arg(s): pp ffffffff872b5610, *ppp 0

fffffe8000f5c470 unix:mutex_exit_critical_size+20219 ()
fffffe8000f5c4b0 unix:page_list_sub_pages+161 ()
fffffe8000f5c510 unix:page_claim_contig_pages+190 ()
fffffe8000f5c600 unix:page_geti_contig_pages+44b ()
fffffe8000f5c660 unix:page_get_contig_pages+c2 ()
fffffe8000f5c6f0 unix:page_get_freelist+1a4 ()
fffffe8000f5c760 unix:page_create_get_something+95 ()
fffffe8000f5c7f0 unix:page_create_va+2a1 ()
fffffe8000f5c850 unix:segkmem_page_create+72 ()
fffffe8000f5c8b0 unix:segkmem_xalloc+60 ()
fffffe8000f5c8e0 unix:segkmem_alloc_vn+8a ()
fffffe8000f5c8f0 unix:segkmem_alloc+10 ()
fffffe8000f5c9c0 genunix:vmem_xalloc+315 ()
fffffe8000f5ca20 genunix:vmem_alloc+155 ()
fffffe8000f5ca90 genunix:kmem_slab_create+77 ()
fffffe8000f5cac0 genunix:kmem_slab_alloc+107 ()
fffffe8000f5caf0 genunix:kmem_cache_alloc+e9 ()
fffffe8000f5cb00 zfs:zio_buf_alloc+1d ()
fffffe8000f5cb50 zfs:zio_compress_data+ba ()
fffffe8000f5cba0 zfs:zio_write_compress+78 ()
fffffe8000f5cbc0 zfs:zio_execute+60 ()
fffffe8000f5cc40 genunix:taskq_thread+bc ()
fffffe8000f5cc50 unix:thread_start+8 ()

syncing file systems... done
. . .

Unencumbered by more than a gut feeling, I disabled compression on
the dataset, and we've gotten through two nightly runs of the same
NFS client job without crashing; but of course we would technically
have to wait nearly a year before we had exactly replicated the
original situation (:-).

Unfortunately the dump slice was slightly too small; we were just short
of enough space to capture the whole 10GB crash dump.  I did get savecore
to write something out, and I uploaded it to the Oracle support site, but
it gives "scat" too much indigestion to be useful to the engineer I'm
working with.  They have not found any matching bugs so far, so I thought
I'd ask a slightly wider audience here.
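
For anyone in a similar spot: the current dump device and savecore
directory can be checked with dumpadm and repointed at something larger,
for example (the device name below is made up):

    dumpadm                           # show current dump device and savecore dir
    dumpadm -d /dev/dsk/c0t1d0s1      # repoint the dump device at a larger slice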

Thanks and regards,

Marion


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss