On 6/9/26 16:58, Bobby Eshleman wrote:
> On Mon, Jun 08, 2026 at 03:59:04PM +0200, Christian König wrote:
>> On 6/8/26 15:55, Bobby Eshleman wrote:
>>>
>>> On Sun, Jun 7, 2026 at 11:42 PM Christian König <[email protected]> wrote:
>>>
>>> On 6/5/26 20:44, Bobby Eshleman wrote:
>>> > On Fri, Jun 05, 2026 at 11:30:07AM +0200, Christian König wrote:
>>> >> On 6/4/26 02:42, Bobby Eshleman wrote:
>>> >>> From: Bobby Eshleman <[email protected]>
>>> >>>
>>> >>> get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
>>> >>> underlying folio was larger.
>>> >>>
>>> >>> Instead, walk folios[] and emit one sg entry per folio. When folios
>>> >>> represent large pages (as is for MFD_HUGETLB), each sg entry is a large
>>> >>> page. Normal PAGE_SIZE sg tables are unchanged.
>>> >>>
>>> >>> Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
>>> >>> udmabuf.
>>> >>
>>> >> That doesn't explain why this is required.
>>> >
>>> > Sure, can definitely add. Devmem currently requires dmabuf sg entries to
>>> > be length and size aligned when it allocates niovs for NIC page pools.
>>> > Though udmabuf is not violating any dmabuf contract by emitting
>>> > PAGE_SIZE entries and the above restriction is probably more a
>>> > shortcoming of devmem, by emitting a single entry per folio this patch
>>> > allows udmabuf to be used by devmem for large pages.
>>> >
>>> >>
>>> >> Please note that accessing the pages/folio of an sg-table returned
>>> >> by DMA-buf is illegal and strictly forbidden!
>>> >>
>>> >> Regards,
>>> >> Christian.
>>> >
>>> > It seems both devmem and io_uring zcrx at least introspect through to
>>> > the sg-table to build NIC page pools (not accessing the memory itself,
>>> > however). Is there a better way?
>>>
>>> That's an absolute NO-GO! We need to stop that immediately.
>>>
>>> Touching the underlying struct page of a DMA-buf exported sg-table is
>>> strictly forbidden.
>>>
>>> We even have code to wrap the sg_table and hide the struct pages on
>>> debug builds to catch those issues, see function dma_buf_wrap_sg_table().
>>>
>>> My last status is that the NIC page pools are built directly from the
>>> DMA addresses exposed by the sg_table.
>>>
>>> Was there any change I'm not aware of?
>>>
>>> Regards,
>>> Christian.
>>>
>>>
>>> Oh no change, your mental model is still current.
>>> They just go through each sg and use sg_dma_address() on each.
>>
>> Ah, thanks! That was a near heart attack :D
>>
>> Yeah that is perfectly correct, question is do you then still really need
>> this udmabuf change? I mean the DMA API usually merges together contiguous
>> DMA addresses.
>>
>> Regards,
>> Christian.
>>
>
> Hey Christian, sorry for the delay, I just wanted to double check what
> I'm seeing...
>
> I reverted the udmabuf patch and confirmed devmem still runs into 4K
> pages even for hugepage udmabuf. I see that the dma_map_direct() path is
> being taken, which if I am reading the code correctly results in the
> sg_dma_len(sg) inheriting sg->length directly (set by udmabuf's
> sg_set_folio(..., PAGE_SIZE) call), compared to the iommu_dma_map_phys()
> path which looks like it does merge when possible.

Ok, that makes more sense. Yeah, that is something which could potentially
be improved elsewhere.

Feel free to go ahead with this patch as a workaround, just adjust the
commit message and maybe add a code comment explaining why it is necessary
and helpful.

Thanks,
Christian.

>
> Best,
> Bobby
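
For illustration only, here is a small user-space model of the difference
discussed in the thread: a mapping path that passes the per-page lengths
through unchanged leaves a hugepage as many 4K entries, while a path that
merges contiguous addresses yields one large entry. This is a sketch, not
kernel code. `struct seg`, `emit_per_page()` and `coalesce()` are
hypothetical names, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this model */

/* Hypothetical stand-in for one sg entry: a DMA address and a length. */
struct seg {
	uint64_t addr;
	uint32_t len;
};

/*
 * Emit one MODEL_PAGE_SIZE entry per page of a physically contiguous
 * region starting at base. Models the per-page table described in the
 * thread. Returns the number of entries written (at most cap).
 */
static size_t emit_per_page(uint64_t base, size_t npages,
			    struct seg *out, size_t cap)
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < npages && i < cap; i++) {
		out[i].addr = base + (uint64_t)i * MODEL_PAGE_SIZE;
		out[i].len = MODEL_PAGE_SIZE;
	}
	return i;
}

/*
 * Merge neighbouring entries whose addresses are contiguous. Models a
 * mapping path that coalesces; a pass-through path would simply keep
 * the input entries as they are. Returns the number of merged entries.
 */
static size_t coalesce(const struct seg *in, size_t n,
		       struct seg *out, size_t cap)
{
	size_t i, m = 0;

	assert(in != NULL && out != NULL);
	for (i = 0; i < n; i++) {
		if (m > 0) {
			struct seg *last = &out[m - 1];

			/* Extend the previous entry if contiguous and no overflow. */
			if (last->addr + last->len == in[i].addr &&
			    in[i].len <= UINT32_MAX - last->len) {
				last->len += in[i].len;
				continue;
			}
		}
		if (m == cap)
			break; /* output full, stop early */
		out[m++] = in[i];
	}
	return m;
}
```

In this model a 2 MiB hugepage produces 512 entries from emit_per_page()
and a single 2 MiB entry after coalesce(). A consumer that sizes its
buffers from the entry length sees 4K unless something merges the entries,
either the exporter (one entry per folio, as in the patch) or the mapping
path.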

