I missed these two messages earlier. Thanks, guys.

I had swapped out the RAM and went a whole day without any panics, then one
happened at 10pm and a couple more in the early hours. At least it's
not crashing during business hours right now...

Anyways,
> And/or upload this dump to Manta[1] via thoth[2] and point us to it. ;)
I can't figure out how to get node onto OmniOS. I could definitely upload
the dump somewhere.
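
(If I do manage to get the node-manta tools installed, my understanding is
the upload would look roughly like this; the MANTA_* values and paths below
are placeholders I made up, not anything from this system:

  export MANTA_URL=https://us-east.manta.joyent.com
  export MANTA_USER=myaccount
  export MANTA_KEY_ID=$(ssh-keygen -l -f ~/.ssh/id_rsa.pub | awk '{print $2}')
  mput -f /var/crash/volatile/vmdump.45 /myaccount/stor/vmdump.45

)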

> Because the system paniced in mutex_enter(), you may be missing a
> stack frame from the output of $C. Can you also grab the output of:

  ffffff001fadfb60+8,40=pn8- | ::eval ./p

0xffffff001fadfb68:             taskq_thread+0x2d0
0xffffff001fadfb60:             0xffffff001fadfc20
0xffffff001fadfb58:             0xffffff04ecacf110
0xffffff001fadfb50:             0xffffff001fadfb60
0xffffff001fadfb48:             0xffffff001fadfbc0
0xffffff001fadfb40:             0xffffff04ec9004e0
0xffffff001fadfb38:             1
0xffffff001fadfb30:             0xffffff04ec9004d8
0xffffff001fadfb28:             0xffffff04ec9004b8
0xffffff001fadfb20:             1
0xffffff001fadfb18:             0x8915dfc9
0xffffff001fadfb10:             0xffffff04ec9004b8
0xffffff001fadfb08:             tsc_gethrtime+0x5c
0xffffff001fadfb00:             sleepq_head+0x1568
0xffffff001fadfaf8:             taskq_thread_wait+0xbe
0xffffff001fadfaf0:             0xffffff001fadfb60
0xffffff001fadfae8:             0xffffff04ec9004e8
0xffffff001fadfae0:             0xffffff04ec9004d8
0xffffff001fadfad8:             0xffffff001fadfbc0
0xffffff001fadfad0:             sleepq_head+0x1560
0xffffff001fadfac8:             0xffffff04ec9004d8
0xffffff001fadfac0:             0xffffff04ec9004ea
0xffffff001fadfab8:             cv_wait+0x7c
0xffffff001fadfab0:             0xffffff001fadfaf0
0xffffff001fadfaa8:             0
0xffffff001fadfaa0:             0xffffff04ec9004b8
0xffffff001fadfa98:             0xffffff04ec9004d8
0xffffff001fadfa90:             0xffffff04ec9004e8
0xffffff001fadfa88:             swtch+0x141
0xffffff001fadfa80:             0xffffff001fadfab0
0xffffff001fadfa78:             _resume_from_idle+0xf1
0xffffff001fadfa70:             0xffffff001fadfab0
0xffffff001fadfa68:             mptsas_handle_event+0x39
0xffffff001fadfa60:             0x458c5114031b1
0xffffff001fadfa58:             0x38
0xffffff001fadfa50:             0xffffff001fadfa68
0xffffff001fadfa48:             0x10246
0xffffff001fadfa40:             0x30
0xffffff001fadfa38:             mutex_enter+0xb
0xffffff001fadfa30:             2
0xffffff001fadfa28:             0xe
0xffffff001fadfa20:             0x1c3
0xffffff001fadfa18:             0
0xffffff001fadfa10:             0x4b
0xffffff001fadfa08:             0x4b
0xffffff001fadfa00:             0xffffff001fadfa30
0xffffff001fadf9f8:             disp_ratify+0x6f
0xffffff001fadf9f0:             0
0xffffff001fadf9e8:             0x20
0xffffff001fadf9e0:             1
0xffffff001fadf9d8:             0
0xffffff001fadf9d0:             0xffffff001fadfc40
0xffffff001fadf9c8:             sleepq_head+0x4e10
0xffffff001fadf9c0:             0xffffff001fadfb60
0xffffff001fadf9b8:             0
0xffffff001fadf9b0:             0
0xffffff001fadf9a8:             0x78
0xffffff001fadf9a0:             0xffffff04e760d8c0
0xffffff001fadf998:             0
0xffffff001fadf990:             0xffffff001fadfc40
0xffffff001fadf988:             0xffffff04ea7e1588
0xffffff001fadf980:             0x20
0xffffff001fadf978:             mutex_enter+0xb
0xffffff001fadf970:             0xffffff001fadfb60
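
(In case anyone else is following along: my loose reading of that dcmd,
which may well be off, is that it walks 0x40 stack words downward starting
at the given address plus 8, printing each one symbolically, and the pipe
to ::eval ./p then prints what each slot contains. The general shape would
be:

  <stack-pointer>+8,40=pn8- | ::eval ./p

with <stack-pointer> taken from the frame that $C reported.)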



On Mon, Oct 7, 2013 at 1:43 PM, Robert Mustacchi <[email protected]> wrote:

> On 10/7/13 13:25 , Travis LaDuke wrote:
> > Hi,
> > Looking at crash dumps on unix is new territory for me. I have this
> > OmniOS-stable VM on esxi, and it's having kernel panics and rebooting
> > 2 - 3 times a day. I'm guessing it's bad RAM, but I haven't been able
> > to bring the machine down to test yet.
> > Sorry if I'm posting on the wrong list. I'm mostly curious what other
> > info/output it would be useful to look at and if this output below says
> > anything obvious about what the problem is. Or what else I can do to
> > test.
>
> Hey Travis,
>
> So it's a bit hard to say from just that what it could be. That stack
> trace doesn't immediately stand out to me. What would help here is to
> open up the dump in mdb and get a little bit more information such as
> the arguments to the functions and the taskq in question.
>
> To start off what I might run is something like this:
>
> cd /var/crash/volatile
>
> In there you should see files with the name of vmdump.%d. So if you have
> a vmdump.0, you'd run `savecore -vf vmdump.0 .` and then `mdb 0`. If the
> trailing digit is different, just replace that everywhere. Once you do
> that you should have a prompt for mdb. To start with run the following:
>
> > ::status
> ... output ...
> > $C
> ... output ...
> > $r
> ... output ...
> > $q
>
> That last one will cause you to exit.
>
> Robert
>
> > # cat fmdump.txt
> > TIME                           UUID                                 SUNW-MSG-ID
> > Oct 04 2013 17:21:54.666705000 622961c5-bb0f-6890-929d-cbad7b19385b SUNOS-8000-KL
> >
> >   TIME                 CLASS                                 ENA
> >   Oct 04 17:21:54.6631 ireport.os.sunos.panic.dump_available           0x0000000000000000
> >   Oct 04 17:21:50.6898 ireport.os.sunos.panic.dump_pending_on_device   0x0000000000000000
> >
> > nvlist version: 0
> >         version = 0x0
> >         class = list.suspect
> >         uuid = 622961c5-bb0f-6890-929d-cbad7b19385b
> >         code = SUNOS-8000-KL
> >         diag-time = 1380932514 663452
> >         de = fmd:///module/software-diagnosis
> >         fault-list-sz = 0x1
> >         fault-list = (array of embedded nvlists)
> >         (start fault-list[0])
> >         nvlist version: 0
> >                 version = 0x0
> >                 class = defect.sunos.kernel.panic
> >                 certainty = 0x64
> >                 asru = sw:///:path=/var/crash/unknown/.622961c5-bb0f-6890-929d-cbad7b19385b
> >                 resource = sw:///:path=/var/crash/unknown/.622961c5-bb0f-6890-929d-cbad7b19385b
> >                 savecore-succcess = 1
> >                 dump-dir = /var/crash/unknown
> >                 dump-files = vmdump.45
> >                 os-instance-uuid = 622961c5-bb0f-6890-929d-cbad7b19385b
> >                 panicstr = BAD TRAP: type=e (#pf Page fault)
> > rp=ffffff001fac9970 addr=20 occurred in module "unix" due to a NULL
> > pointer dereference
> >                 panicstack = unix:die+df () | unix:trap+db3 () |
> > unix:cmntrap+e6 () | unix:mutex_enter+b () | genunix:taskq_thread+2d0 ()
> > | unix:thread_start+8 () |
> >                 crashtime = 1380932428
> >                 panic-time = Fri Oct  4 17:20:28 2013 PDT
> >         (end fault-list[0])
> >
> >         fault-status = 0x1
> >         severity = Major
> >         __ttl = 0x1
> >         __tod = 0x524f5ba2 0x27bd1c68
> >
> >
> >
> > -------------------------------------------
> > illumos-discuss
> > Archives: https://www.listbox.com/member/archive/182180/=now
> > RSS Feed:
> https://www.listbox.com/member/archive/rss/182180/21175748-a2cc1e82
> > Modify Your Subscription:
> https://www.listbox.com/member/?&;
> > Powered by Listbox: http://www.listbox.com
> >
>
>


