> > Maybe we can find out who's calling drv_usecwait(),
> > using:
> >    lockstat -kIW -f drv_usecwait -s 10 sleep 15
>
> # lockstat -kIW -f drv_usecwait -s 10 sleep 15
>
> Profiling interrupt: 88 events in 16.823 seconds (5 events/sec)
>
> -------------------------------------------------------------------------------
> Count indv cuml rcnt     nsec Hottest CPU+PIL        Caller
>    86  98%  98% 0.00      867 cpu[0]+4               drv_usecwait
>
>       nsec ------ Time Distribution ------ count     Stack
>       1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@     76        bge_poll_firmware
>       2048 |@@@                            10        bge_chip_reset
>                                                      bge_reset
>                                                      bge_restart
>                                                      bge_chip_factotum
>                                                      av_dispatch_softvect
>                                                      dispatch_softint
>                                                      switch_sp_and_call

Hmm, looks like the bge driver is using software interrupts, and I
think these could be running at priority level 4 (that would match the
"cpu[0]+4" in the Hottest CPU+PIL column above).

It seems that the bge hardware has some problem, and the driver tries
to reset the bge network hardware in an attempt to recover from it.
bge_poll_firmware() could be busy waiting for up to one second; I
suspect this could explain the kernel CPU time usage.

Are there any error or warning messages logged to /var/adm/messages
when the system starts consuming kernel CPU time?

Maybe the hang can be avoided when the bge NIC driver isn't used and
the bge interface is unconfigured / unplumbed? Or when the bge NIC
driver isn't allowed to load, by using the kernel option
"-B disable-bge=true"?

--
This message posted from opensolaris.org
_______________________________________________
opensolaris-help mailing list
opensolaris-help@opensolaris.org
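
The two checks suggested above (look for bge messages in the log, then
try running without the interface plumbed) could be sketched roughly as
follows. This is only a sketch: bge_log_lines is a hypothetical helper
name, not an existing tool, and bge0 is an assumed interface name that
should be confirmed with "ifconfig -a" first.

```shell
#!/bin/sh
# Sketch only: bge_log_lines is a hypothetical helper, not an existing tool.
# Print the log lines that mention the bge driver (case-insensitive).
# $1: log file to search; defaults to /var/adm/messages
bge_log_lines() {
    grep -i 'bge' "${1:-/var/adm/messages}"
}

# Typical use on the affected machine, as root. bge0 is an ASSUMED
# interface name; check "ifconfig -a" before unplumbing anything.
#   bge_log_lines | tail -20
#   ifconfig bge0 unplumb
```

If the kernel CPU time usage goes away with the interface unplumbed,
that would point at the reset loop seen in the stack trace.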