I recently started encountering this problem on OmniOS r151012. The system has been under heavy load for several weeks, so I can't confirm whether this is load related.
I have two crash dumps available at:

ftp://ftp.nrg.wustl.edu/pub/zfs/crashdump_2015-03-24.vmdump
ftp://ftp.nrg.wustl.edu/pub/zfs/crashdump_2015-03-22.vmdump

My deadman timeout was set to 4.5 seconds when these dumps occurred. I have since set it to 45 seconds.

-Chip

On Thu, Mar 26, 2015 at 10:56 AM, Youzhong Yang <[email protected]> wrote:

> Hi all,
>
> I'd like to ask if anyone has experienced this issue or what can be done
> next to address the issue.
>
> We're trying to qualify a Supermicro X10DRi-T4+ (X10DRC) box, which has
>
> - Intel Xeon CPU E5-2690 [email protected], 2 CPUs, each 12 cores, 384 GB
>   memory (Samsung DDR4 2133 MHz)
>
> The server machine crashes frequently when it's under relatively heavy
> load, the typical stack looks like
>
> genunix: [ID 596504 kern.notice] deadman: timed out after 50 seconds of clock inactivity
> unix: [ID 100000 kern.notice]
> genunix: [ID 802836 kern.notice] fffff002e02d7ea0 fffffffffb9b1d15 ()
> genunix: [ID 655072 kern.notice] fffff002e02d7ef0 genunix:cyclic_expire+d1 ()
> genunix: [ID 655072 kern.notice] fffff002e02d7f60 genunix:cyclic_fire+8c ()
> genunix: [ID 655072 kern.notice] fffff002e02d7f80 unix:cbe_fire+3e ()
> genunix: [ID 655072 kern.notice] fffff002e02d7fd0 apix:apix_dispatch_by_vector+8c ()
> genunix: [ID 655072 kern.notice] fffff002e02d7ff0 apix:apix_dispatch_hilevel+15 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1750 unix:switch_sp_and_call+13 ()
> genunix: [ID 655072 kern.notice] fffff002e02d17b0 apix:apix_do_interrupt+fe ()
> genunix: [ID 655072 kern.notice] fffff002e02d17c0 unix:cmnint+ba ()
> genunix: [ID 655072 kern.notice] fffff002e02d18e0 unix:mutex_delay_default+7 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1950 unix:mutex_vector_enter+cc ()
> genunix: [ID 655072 kern.notice] fffff002e02d19b0 unix:clock_tick_process+ce ()
> genunix: [ID 655072 kern.notice] fffff002e02d1a30 unix:clock_tick_execute_common+93 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1a70 unix:clock_tick_schedule+a8 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1af0 genunix:clock+2cb ()
> genunix: [ID 655072 kern.notice] fffff002e02d1b90 genunix:cyclic_softint+f3 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1ba0 unix:cbe_softclock+17 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1bf0 unix:av_dispatch_softvect+78 ()
> genunix: [ID 655072 kern.notice] fffff002e02d1c20 apix:apix_dispatch_softint+35 ()
> genunix: [ID 655072 kern.notice] fffff002e4325950 unix:switch_sp_and_call+13 ()
>
> Has anyone run into the same issue? We've tried all possible BIOS tweaks, but
> without any luck to stop the crash.
>
> Thanks,
>
> -Youzhong

-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00
Modify Your Subscription: https://www.listbox.com/member/?member_id=25769125&id_secret=25769125-7688e9fb
Powered by Listbox: http://www.listbox.com
