I recently started encountering this problem on OmniOS r151012. The system has
been under heavy load for several weeks, so I can't confirm whether this is
load related.

I have two crash dumps available at:
ftp://ftp.nrg.wustl.edu/pub/zfs/crashdump_2015-03-24.vmdump
ftp://ftp.nrg.wustl.edu/pub/zfs/crashdump_2015-03-22.vmdump

My deadman timeout was set to 4.5 seconds when these dumps occurred. I have
since set it to 45 seconds.

-Chip


On Thu, Mar 26, 2015 at 10:56 AM, Youzhong Yang <[email protected]> wrote:

> Hi all,
>
> I'd like to ask if anyone has experienced this issue or what can be done
> next to address the issue.
>
> We're trying to qualify a Supermicro X10DRi-T4+ (X10DRC) box, which has
>
>    - Intel Xeon CPU E5-2690 [email protected], 2 CPUs, each 12 cores, 384 GB
>    memory (Samsung DDR4 2133 MHz)
>
> The server crashes frequently when it's under relatively heavy load. The
> typical stack looks like this:
>
> genunix: [ID 596504 kern.notice] deadman: timed out after 50 seconds of clock inactivity
> unix: [ID 100000 kern.notice]
> genunix: [ID 802836 kern.notice] fffff002e02d7ea0 fffffffffb9b1d15 ()
> genunix: [ID 655072 kern.notice] fffff002e02d7ef0 genunix:cyclic_expire+d1 ()
> genunix: [ID 655072 kern.notice] fffff002e02d7f60 genunix:cyclic_fire+8c ()
> genunix: [ID 655072 kern.notice] fffff002e02d7f80 unix:cbe_fire+3e ()
> genunix: [ID 655072 kern.notice] fffff002e02d7fd0 apix:apix_dispatch_by_vector+8c ()
> genunix: [ID 655072 kern.notice] fffff002e02d7ff0 apix:apix_dispatch_hilevel+15 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1750 unix:switch_sp_and_call+13 ()
> genunix: [ID 655072 kern.notice] fffff002e02d17b0 apix:apix_do_interrupt+fe ()
> genunix: [ID 655072 kern.notice] fffff002e02d17c0 unix:cmnint+ba ()
> genunix: [ID 655072 kern.notice] fffff002e02d18e0 unix:mutex_delay_default+7 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1950 unix:mutex_vector_enter+cc ()
> genunix: [ID 655072 kern.notice] fffff002e02d19b0 unix:clock_tick_process+ce ()
> genunix: [ID 655072 kern.notice] fffff002e02d1a30 unix:clock_tick_execute_common+93 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1a70 unix:clock_tick_schedule+a8 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1af0 genunix:clock+2cb ()
> genunix: [ID 655072 kern.notice] fffff002e02d1b90 genunix:cyclic_softint+f3 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1ba0 unix:cbe_softclock+17 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1bf0 unix:av_dispatch_softvect+78 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1c20 apix:apix_dispatch_softint+35 ()
> genunix: [ID 655072 kern.notice] fffff002e4325950 unix:switch_sp_and_call+13 ()
>
> Has anyone run into the same issue? We've tried all possible BIOS tweaks,
> but none has stopped the crashes.
>
> Thanks,
>
> -Youzhong



-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=25769125&id_secret=25769125-7688e9fb
Powered by Listbox: http://www.listbox.com
