Andrew Gallatin wrote:
> I'm trying to reduce the memory footprint of a driver which
> makes heavy use of ddi_dma_mem_alloc()
>
> I need to allocate 1520 byte buffers which cannot cross a 4KB
> boundary. Note the 4KB requirement is irrespective of the host
> page size -- the device cannot DMA across a 4KB boundary.
>
> I'm currently using the following dma_attr:
>
> static ddi_dma_attr_t dma_attr = {
>       DMA_ATTR_V0,                    /* version number. */
>       (uint64_t)0,                    /* low address */
>       (uint64_t)0xffffffffffffffffULL, /* high address */
>       (uint64_t)0x7ffffff,            /* address counter max */
>       (uint64_t)2048,                 /* alignment */
>       (uint_t)0x7f,                   /* burstsizes for 32b and 64b xfers */
>       (uint32_t)0x1,                  /* minimum transfer size */
>       (uint64_t)0x7fffffff,           /* maximum transfer size */
>       UINT64_MAX,                     /* maximum segment size */
>       1,                              /* scatter/gather list length */
>       1,                              /* granularity */
>       0                               /* attribute flags */
> };
>
>
> The "real_length" coming back from these allocations is 2048, and
> I was expecting to see 1/2 the buffers I get back aligned on a 2KB
> boundary, and 1/2 on a 4KB boundary.
>
> However, a quick check shows that all the buffers are coming
> back aligned on a 2KB boundary. E.g., all the physical addresses
> end in 0x800.  My assumption is that since I'm never seeing
> physical addresses ending in 0x000, then I must be wasting
> an entire 4KB page.  Also, kmastat shows these allocations winding
> up in kmem_io_512M_4096
>
> How can I satisfy the "don't cross a 4KB boundary" requirement
> and still fit 2 buffers into a single 4KB page and save the system
> some memory?
>   

I'd be shocked if the system tried to put multiple DMA resources into a 
single physical page.  The problem is that if you do that, then you lose 
some of the protection guarantees that the system can otherwise 
provide.  (The system doesn't *know* that your resources are logically 
connected.)  There may be MMU-related restrictions as well.  (What 
happens, for example, if you need to map two different portions of the 
page into different contiguous physical address spaces using an IOMMU?  
I suspect the answer is "you can't".)

If you really care about this, you could just allocate 4K pages 
yourself, and subdivide each one into a pair of 2K buffers.  It might 
make for a more complex driver, but you wind up saving even more 
memory -- fewer PTEs in MMU/IOMMU tables, half as many DMA handles, etc.
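
A rough sketch of the subdividing arithmetic, in plain C with no DDI calls so it can be compiled anywhere.  The names (dma_chunk, dma_slot, chunk_slot, fits_in_window) are made up for illustration.  It assumes each chunk was obtained as a single 4096-byte allocation aligned to 4096 bytes (for example via ddi_dma_mem_alloc() with the alignment attribute raised to 4096) and that you already hold its device-visible address from the bind step:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEV_BOUNDARY 4096u  /* device cannot DMA across this boundary */
#define SLOT_SIZE    2048u  /* two slots per 4KB chunk */
#define BUF_LEN      1520u  /* buffer size from the original post */

/* Hypothetical bookkeeping for one 4KB-aligned DMA chunk. */
struct dma_chunk {
	uint64_t dev_addr;  /* device-visible address, assumed 4KB aligned */
	uint8_t *kaddr;     /* kernel virtual address of the same chunk */
};

/* One of the two buffers carved out of a chunk. */
struct dma_slot {
	uint64_t dev_addr;
	uint8_t *kaddr;
};

/* Return 1 if [addr, addr + len) stays inside a single 4KB window. */
static int
fits_in_window(uint64_t addr, size_t len)
{
	return (len != 0 &&
	    (addr / DEV_BOUNDARY) == ((addr + len - 1) / DEV_BOUNDARY));
}

/* Return slot idx (0 or 1) of a chunk; both views share one DMA handle. */
static struct dma_slot
chunk_slot(const struct dma_chunk *c, unsigned idx)
{
	struct dma_slot s;

	assert(idx < DEV_BOUNDARY / SLOT_SIZE);
	assert((c->dev_addr % DEV_BOUNDARY) == 0);

	s.dev_addr = c->dev_addr + (uint64_t)idx * SLOT_SIZE;
	s.kaddr = c->kaddr + (size_t)idx * SLOT_SIZE;
	return (s);
}
```

Because each slot starts at offset 0 or 2048 within an aligned 4KB chunk and is only 1520 bytes long, neither slot can cross a 4KB boundary.  The driver would still have to track which slots are in use and free the chunk only when both are idle.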

    -- Garrett
> BTW, I was testing on an amd64 running snv100
>
> Thanks,
>
> Drew
> _______________________________________________
> driver-discuss mailing list
> [email protected]
> http://mail.opensolaris.org/mailman/listinfo/driver-discuss
>   
