Garrett D'Amore wrote:

>> More generally speaking, though - I'd be surprised if >1GbE NICs
>> perform well with copy-in/out, and CPU utilization is probably a huge
>> factor, and the size of the frame is likely less of a factor than the simple
>> transfer rate a good NIC can maintain.
>>   
> 
> The problem is that the cost of bcopy of ~1500 bytes becomes < the cost 
> to do the various DMA (or DVMA) related contortions.  For TX it's almost 
> impossible to make this work well, since you have to bind/unbind each 
> packet (at least one DMA operation per packet -- often more than that -- 
> since packets often come in that are split across a page boundary.)  For 

As I mentioned before, MacOSX also has to deal with IOMMUs, and it does
something really clever to avoid the overhead you're talking about.
MacOSX keeps its network buffers (mbufs) pre-mapped in the IOMMU.
This makes getting a DMA address a simple table lookup, without
the need to touch the IOMMU in the common case.
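
To make the table-lookup idea concrete, here is a rough user-space
sketch in C. It is not MacOSX or Solaris code: every name in it
(premapped_pool_t, pool_premap, pool_dma_lookup, the pool sizes) is
made up for illustration, and the "IOMMU mapping" is modeled as
nothing more than a base address plus an offset. The only point is
that the expensive step happens once at setup, and the per-packet
step is arithmetic on a table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes; a real pool would be tuned to the workload. */
#define POOL_NBUFS   8u
#define POOL_BUFSIZE 2048u   /* room for a ~1500-byte frame */

/* Hypothetical pool: buffer memory plus one DMA address per buffer. */
typedef struct premapped_pool {
    uint8_t  mem[POOL_NBUFS][POOL_BUFSIZE];
    uint64_t dma_addr[POOL_NBUFS];   /* filled in once, at setup */
} premapped_pool_t;

/*
 * Stand-in for the expensive bind step. In this model the "mapping"
 * is just iommu_base + offset; it runs once per buffer, not per packet.
 */
static void
pool_premap(premapped_pool_t *p, uint64_t iommu_base)
{
    for (size_t i = 0; i < POOL_NBUFS; i++)
        p->dma_addr[i] = iommu_base + (uint64_t)i * POOL_BUFSIZE;
}

/*
 * Per-packet path: translate a pointer that lies inside the pool
 * into a DMA address with a table lookup. Returns 0 on success, or
 * -1 if the pointer is not in the pool (a caller would then fall
 * back to bind/unbind or to bcopy).
 */
static int
pool_dma_lookup(const premapped_pool_t *p, const void *va, uint64_t *out)
{
    uintptr_t base = (uintptr_t)p->mem;
    uintptr_t addr = (uintptr_t)va;

    if (addr < base || addr - base >= sizeof (p->mem))
        return (-1);

    size_t off = (size_t)(addr - base);
    *out = p->dma_addr[off / POOL_BUFSIZE] + off % POOL_BUFSIZE;
    return (0);
}
```

A TX path built this way would call pool_dma_lookup() on the packet
data pointer and only take the slow path when it returns -1.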

I know that Solaris makes more generic use of mblks/dblks than
MacOSX makes of mbufs/clusters, due to its STREAMS heritage.
But perhaps you could set aside a special pool of pre-mapped
memory to be used on a socket write, and attach it via a
gesballoc() kind of mechanism.
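
Roughly what I mean by attaching pool memory with a free routine,
again as a user-space sketch in C rather than real kernel code. It
does not call gesballoc() or any other Solaris interface; it only
models the callback-plus-argument shape of an esballoc-style free
routine. All names (loan_pool_t, loan_buf_t, loan_pool_get,
loan_buf_free, and so on) are hypothetical, and there is no locking
here at all, which is exactly the part a real driver would have to
get right.

```c
#include <assert.h>
#include <stddef.h>

#define LOAN_NBUFS 4u   /* hypothetical pool size */

/* Hypothetical free routine: a callback plus its argument. */
typedef struct loan_frtn {
    void (*free_func)(void *);
    void *free_arg;
} loan_frtn_t;

typedef struct loan_pool loan_pool_t;

/* One loanable buffer; it remembers which pool it belongs to. */
typedef struct loan_buf {
    loan_pool_t   *owner;
    loan_frtn_t    frtn;
    unsigned char  data[2048];
} loan_buf_t;

struct loan_pool {
    loan_buf_t  bufs[LOAN_NBUFS];
    loan_buf_t *free_list[LOAN_NBUFS];   /* simple stack of free buffers */
    size_t      nfree;
};

/* Free callback: push the buffer straight back onto its pool. */
static void
loan_buf_release(void *arg)
{
    loan_buf_t *b = arg;
    loan_pool_t *p = b->owner;

    assert(p->nfree < LOAN_NBUFS);
    p->free_list[p->nfree++] = b;
}

/* Set up every buffer with its free routine and mark it free. */
static void
loan_pool_init(loan_pool_t *p)
{
    p->nfree = 0;
    for (size_t i = 0; i < LOAN_NBUFS; i++) {
        loan_buf_t *b = &p->bufs[i];
        b->owner = p;
        b->frtn.free_func = loan_buf_release;
        b->frtn.free_arg = b;
        p->free_list[p->nfree++] = b;
    }
}

/* Loan a buffer out; NULL means the pool is empty (use a normal alloc). */
static loan_buf_t *
loan_pool_get(loan_pool_t *p)
{
    return (p->nfree > 0 ? p->free_list[--p->nfree] : NULL);
}

/* What the block-free path would do once the last reference is gone. */
static void
loan_buf_free(loan_buf_t *b)
{
    b->frtn.free_func(b->frtn.free_arg);
}
```

Here the free is direct: loan_buf_free() runs the callback in the
caller's context, so the buffer is reusable immediately.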

Of course, to make this practical, you'd have to fix
the broken locking in some drivers to make it safe to directly
free gesballoc'ed blocks.  That broken locking is the underlying
reason the performance-sucking freebs_enqueue() taskq mechanism
is used to free gesballoc'ed buffers today.

Drew
_______________________________________________
driver-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/driver-discuss
