Interesting... the DDI DMA handle mechanism has, I believe, improved 
greatly since it was first introduced.  I've not done any performance 
analysis myself, but I am a bit surprised that you didn't see a nasty 
performance regression when you switched to a fresh DDI DMA handle 
allocation on every transmit.
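
For concreteness, a fresh-handle transmit pays the full handle 
lifecycle on every packet -- roughly the following sketch, with 
hypothetical xx_* names and error handling trimmed:

#include <sys/errno.h>
#include <sys/stream.h>
#include <sys/strsun.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

extern ddi_dma_attr_t xx_dma_attr;	/* device DMA attributes */

/* transmit path: allocate a handle and bind the message for DMA */
static int
xx_tx_map_fresh(dev_info_t *dip, mblk_t *mp, ddi_dma_handle_t *dmahp)
{
	ddi_dma_cookie_t cookie;
	uint_t ccount;

	if (ddi_dma_alloc_handle(dip, &xx_dma_attr, DDI_DMA_DONTWAIT,
	    NULL, dmahp) != DDI_SUCCESS)
		return (ENOMEM);

	if (ddi_dma_addr_bind_handle(*dmahp, NULL, (caddr_t)mp->b_rptr,
	    MBLKL(mp), DDI_DMA_WRITE | DDI_DMA_STREAMING,
	    DDI_DMA_DONTWAIT, NULL, &cookie, &ccount) != DDI_DMA_MAPPED) {
		ddi_dma_free_handle(dmahp);
		return (ENOMEM);
	}

	/* ... post cookie(s) to the tx ring and kick the hardware ... */
	return (0);
}

/* transmit-done path: tear it all down again, per packet */
static void
xx_tx_unmap_fresh(ddi_dma_handle_t *dmahp)
{
	(void) ddi_dma_unbind_handle(*dmahp);
	ddi_dma_free_handle(dmahp);
}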

That said, I wonder if a kmem_cache could be used to achieve a high 
level of efficiency here: constructed objects stay cached, and most 
allocations are satisfied from a per-CPU magazine without touching a 
global lock.
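
Something along these lines, perhaps -- a sketch only, with 
hypothetical xx_* names.  The constructor pays the 
ddi_dma_alloc_handle() cost once per object; after that, allocations 
are normally magazine hits:

#include <sys/kmem.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

extern ddi_dma_attr_t xx_dma_attr;	/* device DMA attributes */

typedef struct xx_txh {
	ddi_dma_handle_t	txh_dmah;
} xx_txh_t;

/* runs only when the cache needs a brand-new object, not per alloc */
static int
xx_txh_ctor(void *buf, void *arg, int kmflags)
{
	xx_txh_t *txh = buf;
	dev_info_t *dip = arg;

	if (ddi_dma_alloc_handle(dip, &xx_dma_attr,
	    (kmflags & KM_NOSLEEP) ? DDI_DMA_DONTWAIT : DDI_DMA_SLEEP,
	    NULL, &txh->txh_dmah) != DDI_SUCCESS)
		return (-1);
	return (0);
}

static void
xx_txh_dtor(void *buf, void *arg)
{
	xx_txh_t *txh = buf;

	ddi_dma_free_handle(&txh->txh_dmah);
}

/* at attach(9E) time; dip is passed through to the constructor */
static kmem_cache_t *
xx_txh_cache_create(dev_info_t *dip)
{
	return (kmem_cache_create("xx_txh_cache", sizeof (xx_txh_t), 0,
	    xx_txh_ctor, xx_txh_dtor, NULL, dip, NULL, 0));
}

The transmit path would then just kmem_cache_alloc(..., KM_NOSLEEP) 
and the completion path kmem_cache_free(), with the bind/unbind still 
done per packet.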

Admittedly, in my own experience, locking issues (specifically, the 
length of the code path executed while a lock is held) are one of the 
major bottlenecks in the one 10Gb driver I've looked at in detail 
(namely nxge).

But I think it's also true that if customers see that one of the top 
locks reported is your driver's lock, that by itself shouldn't worry 
them -- *if* the lock is not *contended*.  (An uncontended lock, 
however frequently taken, shouldn't be a problem.)

    -- Garrett

Andrew Gallatin wrote:
> Garrett D'Amore wrote:
>
> > Intelligent use of DDI-compliant DMA (reusing handles, and sometimes
>
> It is funny that you mention re-using handles.  That's just about
> my biggest pet peeve with the Solaris DMA DDI.
>
> <rant on>
> Handles are the most annoying part of the DDI for a NIC driver trying
> to map transmits for DMA.  They either require you to do a handle
> allocation in the critical path, or require you to set up a fairly
> complex infrastructure to re-use them, or require you to use fairly
> coarse-grained locking in your tx routine so you can associate a
> handle with every tx ring entry.  The locking problems inherent in
> handles really make me wish for a pci_map_page() sort of interface
> which is fire-and-forget.
>
> FWIW, I used to have a fairly clever handle management policy in my
> 10GbE driver's transmit path.  My transmit runs almost entirely
> unlocked, and acquires the tx lock only when passing the descriptors
> to the NIC.  Using a handle per ring entry would put a lock around a
> *lot* of code which can otherwise run lockless.  So, at plumb time,
> I'd allocate a pool of handles.  Each transmit would briefly acquire
> the pool lock and remove a pre-allocated handle from the pool.  If the
> pool was empty, it would be grown (after dropping the lock).  The
> transmit done path would build a list of free handles, and then
> acquire the pool mutex and add them to the free pool in one operation.
> The pool mutex was never held for more than one list insertion
> or removal operation.
>
> So I'm not sure how I could have handled the free pool locking any
> better.  Yet I would still get complaints from customers saying "when
> we run lockstat, one of the top locks is <address of my driver's tx
> handle lock>, please fix the locking in your driver".  I finally got
> sick and tired of getting these reports and I now do a
> ddi_dma_alloc_handle() each time.  Performance didn't get much worse,
> and now lockstat points to the ddi system, so the customers have
> stopped complaining.  Sigh.
>
> </rant>
>
> Drew
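
For concreteness, the free-list scheme Drew describes above boils down 
to something like the following sketch (hypothetical names; pool 
growth and handle allocation elided):

#include <sys/ksynch.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

typedef struct xx_txh {
	struct xx_txh		*txh_next;
	ddi_dma_handle_t	txh_dmah;
} xx_txh_t;

typedef struct xx_txh_pool {
	kmutex_t	txp_lock;
	xx_txh_t	*txp_free;	/* singly linked free list */
} xx_txh_pool_t;

/* transmit path: exactly one list removal under the lock */
static xx_txh_t *
xx_txh_get(xx_txh_pool_t *pool)
{
	xx_txh_t *txh;

	mutex_enter(&pool->txp_lock);
	if ((txh = pool->txp_free) != NULL)
		pool->txp_free = txh->txh_next;
	mutex_exit(&pool->txp_lock);

	return (txh);	/* NULL: caller grows the pool, lock dropped */
}

/* transmit-done path: return a whole chain in one list insertion */
static void
xx_txh_put_chain(xx_txh_pool_t *pool, xx_txh_t *head, xx_txh_t *tail)
{
	mutex_enter(&pool->txp_lock);
	tail->txh_next = pool->txp_free;
	pool->txp_free = head;
	mutex_exit(&pool->txp_lock);
}

The lock is held only for a pointer swap or two, which is presumably 
why it showed up in lockstat as hot but not as a real bottleneck.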
