Chris
 
Have you adjusted the ZFS ARC settings with regard to memory consumption?
 
Regards
 
Jeremy.

>>> On 3/06/2009 at 9:37 am, in message 
>>> <206bef920906021637u65d451a0y7c1ad2c46bfd57e7 at mail.gmail.com>, Chris 
>>> Wells <chris.unix.dude at gmail.com> wrote:
Hi All - I need to use the MSOSUG hive mind!

I've got a newly built system (OpenSolaris 2009.06 / nv111b) which is
panicking unpredictably, and I want to narrow down why this is
happening.
I've mainly seen it happen when the system is under higher I/O load
(e.g. when doing a "zpool scrub rpool").
I have quite a few (15!) crashdumps and have looked at the
function stacks, but there doesn't seem to be any consistent pattern.
Sun (God bless 'em!) have said that they've seen a single-bit flip in
one of the crashdumps, and are wondering whether it's a hardware-related
issue.
(The memory is non-ECC.)
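
For anyone wanting to check a suspect value for this themselves: a single-bit
flip means the expected and observed values differ in exactly one bit, so
their XOR has exactly one bit set. Below is a minimal Python sketch of that
check. The function names are made up for illustration, and nothing here is
taken from Sun's analysis or from the actual crashdumps.

```python
def flipped_bits(expected, observed):
    """Return the bit positions (0 = least significant) where the values differ."""
    diff = expected ^ observed
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def is_single_bit_flip(expected, observed):
    """True when the two values differ in exactly one bit."""
    return len(flipped_bits(expected, observed)) == 1
```

Given a value that should have been a known pointer or constant, calling
flipped_bits() on the expected and observed values shows which bit (if any)
changed. Seeing the same bit position across several dumps would point more
strongly at one bad memory cell.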

I've already run memtest86 (which completed 13 iterations without
finding a fault), and am wondering about the next steps.

I was wondering how to subdivide the problem - my initial thoughts are to:

1) Remove the hard disks and boot from the LiveCD, then run some
memory and CPU stress tests. Can anyone suggest a suitable stress
test that could be run from the LiveCD (ideally in text / single-user
mode)?
2) Exchange the hard disks for some spares, and reinstall OpenSolaris
2009.06 (or another OS) from scratch.
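
As a rough idea of what a userland memory test for step 1 could look like,
here is a minimal Python sketch that fills a buffer with fixed bit patterns
and reads them back. It assumes a Python interpreter is available in the
LiveCD environment, and the function name, patterns and sizes are
illustrative choices only. It only exercises whatever memory the OS hands to
the process, and CPU caches may mask some faults, so a clean run proves very
little and it is no substitute for memtest86.

```python
# Classic fill patterns: all zeros, all ones, and alternating bits.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)


def pattern_test(size_bytes, chunk=1 << 20):
    """Write each pattern across a buffer, read it back, and
    return a list of (pattern, offset) tuples for every mismatched byte."""
    buf = bytearray(size_bytes)
    errors = []
    for pattern in PATTERNS:
        fill = bytes([pattern]) * chunk
        # Write pass: fill the whole buffer with the current pattern.
        for off in range(0, size_bytes, chunk):
            n = min(chunk, size_bytes - off)
            buf[off:off + n] = fill[:n]
        # Verify pass: compare chunk by chunk, then pinpoint bad bytes.
        for off in range(0, size_bytes, chunk):
            n = min(chunk, size_bytes - off)
            if buf[off:off + n] != fill[:n]:
                for i in range(n):
                    if buf[off + i] != pattern:
                        errors.append((pattern, off + i))
    return errors
```

For example, pattern_test(256 * 1024 * 1024) would exercise 256 MB; running
several copies in parallel, with sizes chosen to use most of free RAM, would
load both memory and CPU at once.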


Cheers-- Chris

PS - For those who might be interested, kmdb msgbuf gives this output
for the latest crash:

panic[cpu2]/thread=ffffff02e172b020:
BAD TRAP: type=e (#pf Page fault) rp=ffffff000f89d870 addr=0 occurred in module
"zfs" due to a NULL pointer dereference


zfs:
#pf Page fault
Bad kernel fault at addr=0x0
pid=6260, pc=0xfffffffff78a2fdb, sp=0xffffff000f89d960, eflags=0x10286
cr0: 80050033<pg,wp,ne,et,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0
cr3: 1e8640000
cr8: c

        rdi:                0 rsi: ffffff8000000000 rdx: ffffff02e172b020
        rcx:                1  r8: fffffffffbd09010  r9: ffffff02d78853d8
        rax:              200 rbx: ffffff02da10eec8 rbp: ffffff000f89d990
        r10:                0 r11:                0 r12: 48cc25d36ba3d0f4
        r13: ffffff02da10f480 r14:                0 r15:                0
        fsb:                0 gsb: ffffff02d8981a80  ds:               4b
         es:               4b  fs:                0  gs:              1c3
        trp:                e err:                0 rip: fffffffff78a2fdb
         cs:               30 rfl:            10286 rsp: ffffff000f89d960
         ss:               38

ffffff000f89d750 unix:die+dd ()
ffffff000f89d860 unix:trap+1752 ()
ffffff000f89d870 unix:cmntrap+e9 ()
ffffff000f89d990 zfs:arc_buf_clone+1b ()
ffffff000f89da30 zfs:arc_read_nolock+264 ()
ffffff000f89daf0 zfs:dmu_objset_open_impl+e2 ()
ffffff000f89db50 zfs:dmu_objset_open_ds_os+69 ()
ffffff000f89dbc0 zfs:dmu_objset_open+af ()
ffffff000f89dc00 zfs:zfs_ioc_objset_stats+33 ()
ffffff000f89dc40 zfs:zfs_ioc_snapshot_list_next+d6 ()
ffffff000f89dcc0 zfs:zfsdev_ioctl+10b ()
ffffff000f89dd00 genunix:cdev_ioctl+45 ()
ffffff000f89dd40 specfs:spec_ioctl+83 ()
ffffff000f89ddc0 genunix:fop_ioctl+7b ()
ffffff000f89dec0 genunix:ioctl+18e ()
ffffff000f89df10 unix:brand_sys_syscall32+197 ()

syncing file systems...
done


-- 

Regards,

Chris