I'm trying to debug a problem I'm seeing in RBridges, and I'm not
quite sure where to turn.  Suggestions welcome.

The problem is that when my link-add function (which runs in a
separate task) calls mac_open_by_linkid(), that call sometimes takes
15 seconds or more to run on some platforms.  This seems abnormal to
me, but I guess I'm not sure whether I should consider it a bug.
(That's likely question #1.)

What happens is that mac_open_by_linkid() calls
dls_devnet_prop_task_wait(), and it sits there for 15 seconds stuck in
a cv_wait.  Since it's a plain cv_wait rather than cv_wait_sig, the
wait isn't interruptible, so I'd be hosed even if I could handle
signals here.  (I currently can't, but that could be fixable.)

I've tried tracing dls_devnet_prop_task() itself, and I see it taking
about 15 seconds to run, and all of that time spent in a single call
to door_ki_upcall_limited().

That's where things get fuzzy.  I've tried dtracing and trussing
dlmgmtd, and I don't see it wasting any time at all, so I don't know
where that 15 seconds is going.

Does this ring any bells for anyone?  Could it be devfs again?  (I've
seen really long delays on some systems in touching anything in /dev,
and these delays appear to be related to the devfsadmd song and dance
routine.)

Some possibly-helpful details: I've seen this effect only on x86
systems -- whitestar2-5.east, which is a blade system with two 'bge'
interfaces.  It happens only on bge1, which is not plumbed up by
default.

I plan to work around it by simply waiting indefinitely for this to
finish, though that doesn't seem like the best possible solution.

-- 
James Carlson, Solaris Networking              <[email protected]>
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677
_______________________________________________
networking-discuss mailing list
[email protected]