> Hmm, looks like the bge driver is using
> software interrupts, and I think these could
> be running at priority level 4.
>
> Seems that the bge hardware has some
> problems, and the driver tries to reset the
> bge network hardware in an attempt to
> recover from the bge hardware problem.
>
> bge_poll_firmware() could be busy waiting
> for up to one second; I suspect this could
> explain the kernel cpu time usage.
>
> Are there any error or warning messages
> logged to /var/adm/messages when the
> system starts consuming kernel cpu time?
>
>
> Maybe the hang can be avoided when the
> bge nic driver isn't used and the bge interface
> is unconfigured / unplumbed? Or the bge
> nic driver isn't allowed to load, by using
> the kernel option "-B disable-bge=true" ?

I started at the end, with -B disable-bge=true. The network applet still shows bge0, but it doesn't try to configure it. ifconfig bge0 unplumb reports that bge0 is not an interface, so the kernel option seems to have worked. Lockstat, though, still shows 98% of i86_mwait at 'sane' state.

I checked /var/adm/messages, but it is very long and I don't know what I should look for. I searched for 'excess' and 'consum', but neither had any hits. Here is what looks strange to me, a layperson in kernel land:

Aug 8 22:05:34 OSolUwe mac: [ID 469746 kern.info] NOTICE: bge0 registered
Aug 8 22:05:34 OSolUwe pci_pci: [ID 370704 kern.info] PCI-device: pci103c,3...@e, bge0
Aug 8 22:05:34 OSolUwe genunix: [ID 936769 kern.info] bge0 is /p...@0,0/pci8086,2...@1e/pci103c,3...@e
Aug 8 22:05:46 OSolUwe genunix: [ID 408114 kern.info] /p...@0,0/pci8086,2...@1e/pci103c,3...@e (bge0) online
Aug 8 22:05:47 OSolUwe ip: [ID 856290 kern.notice] ip: joining multicasts failed (4) on bge0 - will use link layer broadcasts for multicast
Aug 8 22:05:50 OSolUwe in.ndpd[366]: [ID 169330 daemon.error] Interface bge0 has been removed from kernel. in.ndpd will no longer use it
Aug 8 22:05:54 OSolUwe genunix: [ID 408114 kern.info] /p...@0,0/pci8086,2...@1e/pci103c,3...@e (bge0) online

At least I can confirm that the system now keeps running normally, meaning that the symptoms, at least, have been suppressed by that kernel option.

What next?
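
Since /var/adm/messages is too long to read end to end, one way to narrow it down might be a filter like the one below. This is only a sketch: the function name filter_messages is made up for illustration, the pattern list is a guess at what could be relevant (bge lines plus anything tagged as a warning or error), and it assumes a grep that accepts -E (GNU grep or /usr/xpg4/bin/grep; egrep with the same pattern should behave the same).

```shell
# Hypothetical helper: print only the syslog lines that mention bge
# or that carry a warning/error severity tag such as [ID ... kern.warning].
filter_messages() {
    grep -E 'bge|WARNING|kern\.(warning|err|crit)|daemon\.error' "$1"
}

# Example use (commented out so that defining the helper has no side effects):
#   filter_messages /var/adm/messages | tail -50
```

Comparing the timestamps of the surviving lines with the time the kernel cpu usage started climbing should show whether anything was logged at that moment.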