Dan, I personally like the "device-DAX" idea, but my concerns are:
- How well will it coexist with the DRM infrastructure / implementations
  in the parts dealing with CPU pointers?
- How well will we be able to handle the case where we need to
  "move"/"evict" memory/data to a new location, so that the CPU pointer
  must point to the new physical location/address (which may not be in
  PCI device memory at all)?

Sincerely yours,
Serguei Sagalovitch

On 2016-11-22 01:11 PM, Dan Williams wrote:
> On Mon, Nov 21, 2016 at 12:36 PM, Deucher, Alexander
> <Alexander.Deucher at amd.com> wrote:
>> This is certainly not the first time this has been brought up, but I'd like
>> to try and get some consensus on the best way to move this forward.
>> Allowing devices to talk directly improves performance and reduces latency
>> by avoiding the use of staging buffers in system memory. Also in cases
>> where both devices are behind a switch, it avoids the CPU entirely. Most
>> current APIs (DirectGMA, PeerDirect, CUDA, HSA) that deal with this are
>> pointer based. Ideally we'd be able to take a CPU virtual address and be
>> able to get to a physical address taking into account IOMMUs, etc. Having
>> struct pages for the memory would allow it to work more generally and
>> wouldn't require as much explicit support in drivers that wanted to use it.
>>
>> Some use cases:
>> 1. Storage devices streaming directly to GPU device memory
>> 2. GPU device memory to GPU device memory streaming
>> 3. DVB/V4L/SDI devices streaming directly to GPU device memory
>> 4. DVB/V4L/SDI devices streaming directly to storage devices
>>
>> Here is a relatively simple example of how this could work for testing.
>> This is obviously not a complete solution.
>> - Device memory will be registered with Linux memory sub-system by creating
>> corresponding struct page structures for device memory
>> - get_user_pages_fast() will return corresponding struct pages when CPU
>> address points to the device memory
>> - put_page() will deal with struct pages for device memory
>>
> [..]
>> 4. iopmem
>> iopmem : A block device for PCIe memory (https://lwn.net/Articles/703895/)
> The change I suggest for this particular approach is to switch to
> "device-DAX" [1]. I.e. a character device for establishing DAX
> mappings rather than a block device plus a DAX filesystem. The pro of
> this approach is standard user pointers and struct pages rather than a
> new construct. The con is that this is done via an interface separate
> from the existing gpu and storage device. For example it would require
> a /dev/dax instance alongside a /dev/nvme interface, but I don't see
> that as a significant blocking concern.
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2016-October/007496.html
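
For reference, the "standard user pointers" point quoted above can be pictured with a minimal userspace sketch of mapping a device-DAX character device. This is an illustration only and not code from the thread: the helper names (round_up_len, map_device_dax) are hypothetical, the device path is whatever /dev/dax instance the platform exposes, and the 2 MiB alignment mentioned in the comments is an assumption that should be checked against the actual device.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to a multiple of align; align must be a power of two. */
static size_t round_up_len(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: map at least `len` bytes of a device-DAX character
 * device (for example a /dev/dax instance) and return an ordinary CPU
 * pointer, or NULL on failure.
 *
 * Assumption: `align` matches the mapping alignment the device requires
 * (often 2 MiB); the requested length is rounded up to that alignment.
 */
static void *map_device_dax(const char *path, size_t len, size_t align,
                            size_t *mapped_len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    size_t map_len = round_up_len(len, align);
    void *p = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* an established mapping remains valid after close() */

    if (p == MAP_FAILED)
        return NULL;
    if (mapped_len != NULL)
        *mapped_len = map_len;
    return p;
}
```

The returned pointer is a plain user virtual address, which is what a pointer-based API would be handed. The sketch does not address the move/evict concern raised above: it shows only how the mapping is established, not what happens to the pointer if the backing memory is relocated.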