> The machine answered the question: It was running
> smoothly for close to an hour. Then I left for lunch.
> When I came back, the monitor was black, no reaction.
> I tried all and everything, with power button as last
> resort. That resulted in a cold start.

Is it possible that the IP address assigned to the
system's hostname was bound to the bge0 interface,
so that after the bge driver was disabled, connections
to the local machine using the machine's `hostname`
failed?  I suspect that the X screen blanker needs such
a connection to wake up from screen blank mode...
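
One way to check that guess is to see which interface carries the
address the hostname resolves to. Below is a minimal sketch; the
helper name iface_for_ip is made up for illustration, and it assumes
`ifconfig -a` prints "name: flags=..." header lines followed by
indented "inet <addr> ..." lines. On Solaris you may need nawk
instead of awk if the default awk does not accept -v.

```shell
#!/bin/sh
# iface_for_ip (hypothetical helper): read `ifconfig -a` style output
# on stdin and print the name of every interface whose "inet" line
# carries the address given as $1.
iface_for_ip() {
  awk -v ip="$1" '
    # A line starting in column one is an interface header,
    # for example "bge0: flags=...". Remember the name without the colon.
    /^[^ \t]/ { name = $1; sub(/:$/, "", name) }
    # An indented "inet <addr> ..." line belongs to the last header seen.
    $1 == "inet" && $2 == ip { print name }
  '
}

# Usage on the affected machine (not executed here):
#   addr=$(getent hosts "$(hostname)" | awk '{ print $1; exit }')
#   ifconfig -a | iface_for_ip "$addr"
```

If that prints bge0, the hostname really does depend on the bge
interface being up; if it prints lo0 or wpi0, my guess above is wrong.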


> Since this was a good opportunity, I gave it a shot
> and pulled a network cable to it, and disabled WLAN.
> And it connected properly to the network; so the NIC
> is probably not broken, as one could assume. The
> message lines are as in my earlier mail, except that
> there are two more: one with bge0 link up,
> immediately followed by bad address 0.0.0.0
> I had issued ifconfig bge0 dhcp afterwards, and there
> are no more bge0 messages in the log.

That is, when there is no cable connected to the
bge NIC hardware, the machine starts to consume
lots of kernel CPU time after a few minutes, and
eventually the system hangs?

And when a cable is connected, there is no
excessive kernel CPU time usage, and the machine
doesn't hang?


> So what we seem to encounter here, is a bad
> architectural mistake in the kernel. Blame nwam on
> pulling the wrong cords, nevermind. But the kernel
> must not allow this to happen: When bge0 can't
> connect, it monopolises all resources to load the
> 'correct' firmware to get it back up?

Yep; I'd say something is broken in the bge driver...

Maybe the BIOS has configured the NIC hardware to enter
a power saving state after five minutes with no activity,
and the Solaris bge driver is confused when the device
enters that power saving state (it tries to recover by
resetting the bge hardware, fails to wake up the hardware,
and then waits forever for the firmware to become ready)?


> On top of that,
> I never used bge0, always wpi0. So there is no reason
> at all for the kernel to try to force bge0 to work.

I think a possible workaround is to disable
svc:/network/physical:nwam, enable
svc:/network/physical:default, and manually
configure the wpi0 interface (and not use the
bge interface for now).
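
The steps above could look roughly like the sketch below. It is not
a tested recipe: the ESSID "example-essid" is a placeholder, the
helper names run and switch_to_manual_network are made up for
illustration, and an encrypted network additionally needs a key,
which is not shown here. By default the script only prints the
commands (DRY_RUN=1); set DRY_RUN=0 and run it as root to execute
them.

```shell
#!/bin/sh
# Sketch: switch from NWAM to manual network configuration and bring
# up the wireless interface only. bge0 is deliberately left alone.
IFACE="${IFACE:-wpi0}"            # wireless interface from this thread
ESSID="${ESSID:-example-essid}"   # placeholder, replace with your network

# run (hypothetical helper): print the command when DRY_RUN is 1
# (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

switch_to_manual_network() {
  # Stop NWAM so it no longer touches any interface.
  run svcadm disable svc:/network/physical:nwam
  # Enable the traditional, manually configured network service.
  run svcadm enable svc:/network/physical:default
  # Plumb the wireless interface, join the network, request a lease.
  run ifconfig "$IFACE" plumb
  run dladm connect-wifi -e "$ESSID" "$IFACE"
  run ifconfig "$IFACE" dhcp
}

# Usage:
#   switch_to_manual_network              # prints the five commands
#   DRY_RUN=0 switch_to_manual_network    # executes them (as root)
```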
-- 
This message posted from opensolaris.org