Garrett D'Amore wrote:
> An interesting idea.  It would require the framework knowing that the 
> drivers could do this, and the drivers would have to be able to support 
> whatever address attributes the stack's mblk's used.
>    -- Garrett

I think you'd just want to prefer pre-mapped buffers in the code
which does the copy-in (mcopyinuio, I think).  When they run
out, just fall back to allocb'ed buffers.
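
To make the "prefer pre-mapped, fall back when they run out" idea
concrete, here is a rough userland sketch.  Everything in it is
hypothetical: buf_t, pool_init(), buf_alloc() and buf_free() are names
made up for illustration, POOL_BUFSZ is only a stand-in for
IOMMU_PAGESIZE, malloc() stands in for allocb(), and the DMA addresses
are faked rather than obtained from a real IOMMU mapping.  It also
ignores locking entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_BUFS  4      /* hypothetical pool size */
#define POOL_BUFSZ 8192   /* stand-in for IOMMU_PAGESIZE (assumed value) */

/* Hypothetical buffer descriptor; NOT a real mblk_t/dblk_t. */
typedef struct buf {
    unsigned char *data;
    size_t         size;
    uint64_t       dma_addr;   /* meaningful only when premapped != 0 */
    int            premapped;  /* 1 = came from the pre-mapped pool */
    struct buf    *next;       /* free-list link */
} buf_t;

static unsigned char pool_mem[POOL_BUFS][POOL_BUFSZ];
static buf_t         pool_bufs[POOL_BUFS];
static buf_t        *pool_free;

/*
 * One-time setup.  In a kernel this is where the IOMMU mappings would
 * be established once; here the device addresses are simply faked as
 * dma_base + offset.
 */
void pool_init(uint64_t dma_base)
{
    pool_free = NULL;
    for (int i = 0; i < POOL_BUFS; i++) {
        buf_t *b = &pool_bufs[i];
        b->data = pool_mem[i];
        b->size = POOL_BUFSZ;
        b->dma_addr = dma_base + (uint64_t)i * POOL_BUFSZ;
        b->premapped = 1;
        b->next = pool_free;
        pool_free = b;
    }
}

/* Prefer a pre-mapped buffer; fall back to an ordinary allocation. */
buf_t *buf_alloc(size_t len)
{
    if (len <= POOL_BUFSZ && pool_free != NULL) {
        buf_t *b = pool_free;
        pool_free = b->next;
        b->next = NULL;
        return b;               /* DMA address is already known */
    }

    /* Fallback path, standing in for allocb(). */
    buf_t *b = malloc(sizeof (*b));
    if (b == NULL)
        return NULL;
    b->data = malloc(len ? len : 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->size = len;
    b->dma_addr = 0;            /* would need a per-packet bind */
    b->premapped = 0;
    b->next = NULL;
    return b;
}

/* Return a buffer to the pool, or free it if it was a fallback. */
void buf_free(buf_t *b)
{
    if (b->premapped) {
        b->next = pool_free;
        pool_free = b;
        return;
    }
    free(b->data);
    free(b);
}
```

The point of the sketch is only that the common case never touches
the mapping hardware: a pooled buffer already carries its device
address, and only the fallback path would need the usual bind/unbind.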

Assuming the pre-mapped buffers were IOMMU_PAGESIZE sized, I think most
drivers would be able to cope, or at least cope as well as they do
on amd64, as long as they use the normal ddi_dma interfaces.

The nice thing is that this could be totally transparent, and
drivers wouldn't need to change at all.

Drew


> Andrew Gallatin wrote:
>> Garrett D'Amore wrote:
>>
>>  
>>>> More generally speaking, though - I'd be surprised if >1GbE NICs
>>>> perform well with copy-in/out, and CPU utilization is probably a huge
>>>> factor, and the size of the frame is likely less of a factor than 
>>>> the simple
>>>> transfer rate a good NIC can maintain.
>>>>         
>>> The problem is that the cost of bcopy of ~1500 bytes becomes < the 
>>> cost to do the various DMA (or DVMA) related contortions.  For TX it's 
>>> almost impossible to make this work well, since you have to 
>>> bind/unbind each packet (at least one DMA operation per packet -- 
>>> often more than that -- since packets often come in that are split 
>>> across a page boundary.)  For     
>>
>> As I mentioned before.. MacOSX also has to deal with IOMMUs, and does
>> something really clever to avoid the overhead you're talking about.
>> MacOSX keeps its network buffers (mbufs)  pre-mapped in the IOMMU.
>> This makes getting a DMA address  a simple table lookup, without
>> the need to touch the IOMMU in the common case.
>>
>> I know that Solaris makes more generic use of mblks/dblks than
>> MacOSX makes of mbufs/clusters due to its streams heritage.
>> But perhaps you could set aside a special pool of pre-mapped
>> memory to be used on a socket write, and attach it via a
>> gesballoc() kind of mechanism.
>>
>> Of course, to make this practical, you'd have to fix
>> the broken locking in some drivers to make it safe to directly
>> free gesballoc'ed blocks.  That locking is the underlying reason
>> the performance-sucking freebs_enqueue() taskq mechanism for
>> freeing gesballoc'ed buffers is used today.
>>
>> Drew
>> _______________________________________________
>> driver-discuss mailing list
>> [email protected]
>> http://mail.opensolaris.org/mailman/listinfo/driver-discuss
>>   
