Hi everyone,

This is a repost from a different address, as it seems the previous one ended up in Gmail's junk folder due to a domain error. I have added more information found while blindly debugging the issue.

Short version: I'm having an issue with direct DMA transfers from a device to host memory. It seems some of the data is not landing in the appropriate page.

Some more details:

I'm debugging a home-made PCI driver for our board (Kalray), attached to an x86_64 host running CentOS 7 (3.10.0-327.el7.x86_64).

In the current case, a userland application transfers data back and forth through read/write operations on a file. On the kernel side, this triggers DMA transfers over PCI to/from our board memory.

We followed what pretty much all the docs say about direct I/O to user buffers:

1) get_user_pages() (in the current case, at most 16 pages at once)
2) convert to a scatterlist
3) pci_map_sg
4) coalesce the sg entries where possible (the Intel IOMMU is enabled, so it usually is)
5) a lot of DMA engine handling code, using the dmaengine layer and virt-dma
6) wait for the transfer to complete; in the meantime, go back to (1) to schedule more work, if any
7) pci_unmap_sg
8) for read (card2host) transfers, set_page_dirty_lock
9) page_cache_release

In 99.9999% of cases it works perfectly.

However, I have one userland application where a few pages are not written by a read (card2host) transfer. I memset the buffer to a different value beforehand, so I can check whether anything has overwritten those pages.

I know (from a PCI protocol analyser) that the data left our board for the "right" address (the one set in the sg by pci_map_sg).

I tried reading the data between the pci_unmap_sg and the set_page_dirty_lock, using

    uint32_t *addr = page_address(trans->pages[0]);
    dev_warn(&pdata->pdev->dev, "val = %x\n", *addr);

and it has the expected value. But if I try to copy_from_user (using the address coming from userland, the one passed to get_user_pages), the data has not been written and I see the memset value.

New info:

The issue happens with the IOMMU on or off. I compiled a kernel with DMA_API_DEBUG enabled and got no warnings or errors.

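The userland check described above (fill the buffer with a known value, then look for pages the transfer never wrote) can be sketched roughly as follows. This is only an illustration, not the actual test application: the names poison_buffer, page_untouched and report_untouched_pages, the fill byte 0xA5, and the 4 KiB page size are all assumptions. Note that a page whose real payload happens to consist entirely of the fill byte would be reported as a false positive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SZ 4096u   /* assumption: 4 KiB base pages (x86_64) */
#define POISON  0xA5u   /* hypothetical fill value, not from the real application */

/* Fill the buffer with the poison byte before issuing the read(). */
static void poison_buffer(uint8_t *buf, size_t len)
{
    memset(buf, POISON, len);
}

/* Return 1 if every byte of the page still holds the poison value. */
static int page_untouched(const uint8_t *page)
{
    for (size_t i = 0; i < PAGE_SZ; i++)
        if (page[i] != POISON)
            return 0;
    return 1;
}

/*
 * Scan the buffer after the transfer and report the pages that still
 * contain only the poison value. Returns the number of such pages.
 * len must be a multiple of PAGE_SZ.
 */
static size_t report_untouched_pages(const uint8_t *buf, size_t len)
{
    size_t missing = 0;

    assert(len % PAGE_SZ == 0);
    for (size_t off = 0; off < len; off += PAGE_SZ) {
        if (page_untouched(buf + off)) {
            printf("page %zu (offset 0x%zx) still poisoned\n",
                   off / PAGE_SZ, off);
            missing++;
        }
    }
    return missing;
}
```

The intended use would be to call poison_buffer() on the user buffer, perform the read() on the device file, and then call report_untouched_pages() on the same buffer.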
I dug a little deeper with my very limited understanding of Linux mm and discovered that:

* we are using transparent huge pages
* the pages 'not transferred' are the last few of a huge page

More precisely:

- We have several transfers in flight from the same user buffer
- Each transfer is 16 pages long
- At some point, we start transferring from another huge page (transfers are still in flight from the previous one)
- When a transfer from the previous huge page completes, I dumped the mapcount of the pages from the previous transfers: they are all 0. The pages are still mapped for DMA at this point.
- A get_user_pages on the address of the completed transfer returns a different struct page * than the one I had. Yet this happens before I have unmapped them or called put_page on them.

From my understanding this should not have happened.

I tried the same code with a 4.5 kernel and encountered the same issue.

Disabling transparent huge pages makes the issue disappear.

Thanks in advance

Nicolas
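
For reference, the huge-page observation above (the affected pages are the last few of a huge page, with 16-page transfers) comes down to simple address arithmetic, sketched below. This is a minimal illustration assuming 4 KiB base pages and 2 MiB transparent huge pages aligned on 2 MiB virtual addresses, as is usual on x86_64; the helper names are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SHIFT 12u  /* assumption: 4 KiB base pages */
#define HUGE_PAGE_SHIFT 21u  /* assumption: 2 MiB transparent huge pages */
#define PAGES_PER_HUGE  (1u << (HUGE_PAGE_SHIFT - BASE_PAGE_SHIFT)) /* 512 */

/* Index (0..511) of the 4 KiB page containing addr within its 2 MiB region. */
static unsigned subpage_index(uintptr_t addr)
{
    return (unsigned)((addr >> BASE_PAGE_SHIFT) & (PAGES_PER_HUGE - 1));
}

/* Number of 4 KiB pages from addr's page to the end of its 2 MiB region,
 * counting addr's own page. */
static unsigned pages_to_huge_end(uintptr_t addr)
{
    return PAGES_PER_HUGE - subpage_index(addr);
}

/* Return 1 if a transfer of npages starting at addr's page spills into
 * the next 2 MiB region, 0 otherwise. */
static int crosses_huge_boundary(uintptr_t addr, unsigned npages)
{
    assert(npages > 0);
    return npages > pages_to_huge_end(addr);
}
```

Logging subpage_index() for the user address of each failing transfer would show whether the failures always fall in the last 16-page chunk of a 2 MiB region, and crosses_huge_boundary() tells whether a given transfer straddles two such regions.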