On Mon, 15 Apr 2024 13:03:30 +0000 Michael Kelley <mhkli...@outlook.com> wrote:
> From: Petr Tesařík <p...@tesarici.cz> Sent: Monday, April 15, 2024 5:50 AM
> >
> > On Mon, 15 Apr 2024 12:23:22 +0000
> > Michael Kelley <mhkli...@outlook.com> wrote:
> >
> > > From: Petr Tesařík <p...@tesarici.cz> Sent: Monday, April 15, 2024 4:46 AM
> > > >
> > > > Hi Michael,
> > > >
> > > > sorry for taking so long to answer. Yes, there was no agreement on the
> > > > removal of the "dir" parameter, but I'm not sure it's because of
> > > > symmetry with swiotlb_sync_*(), because the topic was not really
> > > > discussed.
> > > >
> > > > The discussion was about the KUnit test suite and whether direction is
> > > > a property of the bounce buffer or of each sync operation. Since the
> > > > DMA API associates each DMA buffer with a direction, the direction
> > > > parameter passed to swiotlb_sync_*() should match what was passed to
> > > > swiotlb_tbl_map_single(), because that's how it is used by the generic
> > > > DMA code. In other words, if the parameter is kept, it should be kept
> > > > to match dma_map_*().
> > > >
> > > > However, there is also symmetry with swiotlb_tbl_unmap_single(). This
> > > > function does use the parameter for the final sync. I believe there
> > > > should be a matching initial sync in swiotlb_tbl_map_single(). In
> > > > short, the buffer sync for DMA non-coherent devices should be moved from
> > > > swiotlb_map() to swiotlb_tbl_map_single(). If this sync is not needed,
> > > > then the caller can (and should) include DMA_ATTR_SKIP_CPU_SYNC in
> > > > the flags parameter.
> > > >
> > > > To sum it up:
> > > >
> > > > * Do *NOT* remove the "dir" parameter.
> > > > * Let me send a patch which moves the initial buffer sync.
> > > >
> > >
> > > I'm not seeing the need to move the initial buffer sync. All
> > > callers of swiotlb_tbl_map_single() already have a subsequent
> > > check for a non-coherent device, and a call to
> > > arch_sync_dma_for_device(). And the Xen code has some
> > > special handling that probably shouldn't go in
> > > swiotlb_tbl_map_single(). Or am I missing something?
> >
> > Oh, sure, there's nothing broken ATM. It's merely a cleanup. The API is
> > asymmetric and thus confusing. You get a final sync by default if you
> > call swiotlb_tbl_unmap_single(),
>
> I don't see that final sync in swiotlb_tbl_unmap_single(). It calls
> swiotlb_bounce() to copy the data, but it doesn't deal with
> non-coherent devices or call arch_sync_dma_for_cpu().

Ouch. You're right! The buffer gets only bounced but not synced if
device DMA is non-coherent.

So, how is this supposed to work?

Now I'm looking at the code in dma_direct_map_page(), and it calls
arch_sync_dma_for_device() explicitly, _except_ when using SWIOTLB.

So, maybe I should instead review all callers of swiotlb_map(), make
sure that they handle non-coherent devices, and then remove the sync
from swiotlb_map()?

I mean, the current situation seems somewhat disorganized to me.

Petr T