Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang...@intel.com) Task id: 7140 -Summary- Platform Delta drm-intel-nightly Series Applied ILK 302/302 302/302 SNB 315/315 315/315 IVB 336/336 336/336 BYT 283/283 283/283 HSW 378/378 378/378 -Detailed- Platform Testdrm-intel-nightly Series Applied Note: You need to pay more attention to line start with '*' ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached
On Mon, Aug 10, 2015 at 09:16:46PM +0100, Chris Wilson wrote: If the set of pages is being imported from another device, we cannot assume that it is fully coherent with the CPU cache, so mark it as such. However, if the source is the shared memory vgem allocator, we could treat the buffer as being cached (so long as all parties agree in the case the same buffer is shared between multiple devices) but as of today, vgem cannot export dma-bufs. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Daniel Vetter daniel.vet...@ffwll.ch Yay for x86 dma api to refuse acknowleding that incoherent devices exist (since this should be done in the dma-ap/sg stuff imo for dma-bufs). But since x86 assumes that everything is coherent shouldn't we do the inverse and use snooping/cacheable always? That should be correct for everything modern at least, and for old agp crap (do we care even, is sharing possible there?) snooping should only result in a bit more overhead. Or where exactly does this blow up? -Daniel --- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index 7748d57adffd..da5e4fb02350 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -265,8 +265,22 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, i915_gem_object_init(obj, i915_gem_object_dmabuf_ops); obj-base.import_attach = attach; + /* This bo is imported from a foreign HW device with which we cannot + * assume any coherency. Note that if this dma-buf was imported from + * a simple page allocator, like vgem, then we could treat this as + * cacheable (so long as all parties agree on that convention - i.e. + * if the same bo was shared with nouveau/radeon they also treat it + * as CPU cacheeable so that coherency is maintained between all + * parties). + */ + ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); + if (ret) + goto fail_obj; + return obj-base; +fail_obj: + drm_gem_object_unreference(obj-base); fail_detach: dma_buf_detach(dma_buf, attach); dma_buf_put(dma_buf); -- 2.5.0 -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached
On Tue, Aug 11, 2015 at 11:24:58AM +0100, Chris Wilson wrote: On Tue, Aug 11, 2015 at 12:12:08PM +0200, Daniel Vetter wrote: On Mon, Aug 10, 2015 at 09:16:46PM +0100, Chris Wilson wrote: If the set of pages is being imported from another device, we cannot assume that it is fully coherent with the CPU cache, so mark it as such. However, if the source is the shared memory vgem allocator, we could treat the buffer as being cached (so long as all parties agree in the case the same buffer is shared between multiple devices) but as of today, vgem cannot export dma-bufs. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Daniel Vetter daniel.vet...@ffwll.ch Yay for x86 dma api to refuse acknowleding that incoherent devices exist (since this should be done in the dma-ap/sg stuff imo for dma-bufs). But since x86 assumes that everything is coherent shouldn't we do the inverse and use snooping/cacheable always? Actually, the convention for PRIME is that transfer bos are uncached because we have to be sure that any write makes it into the foriegn pages, that applies to both GPU and CPU writes, precisely because the target is *incoherent*. That should be correct for everything modern at least, and for old agp crap (do we care even, is sharing possible there?) snooping should only result in a bit more overhead. Or where exactly does this blow up? Consider a kms_crc-esque test using a radeon/nouveau bo imported into i915 and accessed using i915 ioctls (with the CRC testing on the radeon scanout). Scanout on radeon must be in vram afaik, and that's a place i915 can't get at. I think that even holds true for the integrated radones (they just use stolen for vram then). The other way round also doesn't work I think since i915 can't change the caching policy for radoen/nouveau access if radeon/nouveau want to write directly to main memory. Otoh I think pcie transactions just snoop the cache and never put anything in there. I still think that everything we import and don't have a clue about should probably be treated as snooped conceptually. But practically who knows what's going on ... -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached
On Tue, Aug 11, 2015 at 12:12:08PM +0200, Daniel Vetter wrote: On Mon, Aug 10, 2015 at 09:16:46PM +0100, Chris Wilson wrote: If the set of pages is being imported from another device, we cannot assume that it is fully coherent with the CPU cache, so mark it as such. However, if the source is the shared memory vgem allocator, we could treat the buffer as being cached (so long as all parties agree in the case the same buffer is shared between multiple devices) but as of today, vgem cannot export dma-bufs. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Daniel Vetter daniel.vet...@ffwll.ch Yay for x86 dma api to refuse acknowleding that incoherent devices exist (since this should be done in the dma-ap/sg stuff imo for dma-bufs). But since x86 assumes that everything is coherent shouldn't we do the inverse and use snooping/cacheable always? Actually, the convention for PRIME is that transfer bos are uncached because we have to be sure that any write makes it into the foriegn pages, that applies to both GPU and CPU writes, precisely because the target is *incoherent*. That should be correct for everything modern at least, and for old agp crap (do we care even, is sharing possible there?) snooping should only result in a bit more overhead. Or where exactly does this blow up? Consider a kms_crc-esque test using a radeon/nouveau bo imported into i915 and accessed using i915 ioctls (with the CRC testing on the radeon scanout). -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached
On Tue, Aug 11, 2015 at 02:58:59PM +0200, Daniel Vetter wrote: On Tue, Aug 11, 2015 at 11:24:58AM +0100, Chris Wilson wrote: On Tue, Aug 11, 2015 at 12:12:08PM +0200, Daniel Vetter wrote: On Mon, Aug 10, 2015 at 09:16:46PM +0100, Chris Wilson wrote: If the set of pages is being imported from another device, we cannot assume that it is fully coherent with the CPU cache, so mark it as such. However, if the source is the shared memory vgem allocator, we could treat the buffer as being cached (so long as all parties agree in the case the same buffer is shared between multiple devices) but as of today, vgem cannot export dma-bufs. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Daniel Vetter daniel.vet...@ffwll.ch Yay for x86 dma api to refuse acknowleding that incoherent devices exist (since this should be done in the dma-ap/sg stuff imo for dma-bufs). But since x86 assumes that everything is coherent shouldn't we do the inverse and use snooping/cacheable always? Actually, the convention for PRIME is that transfer bos are uncached because we have to be sure that any write makes it into the foriegn pages, that applies to both GPU and CPU writes, precisely because the target is *incoherent*. That should be correct for everything modern at least, and for old agp crap (do we care even, is sharing possible there?) snooping should only result in a bit more overhead. Or where exactly does this blow up? Consider a kms_crc-esque test using a radeon/nouveau bo imported into i915 and accessed using i915 ioctls (with the CRC testing on the radeon scanout). Scanout on radeon must be in vram afaik, and that's a place i915 can't get at. I think that even holds true for the integrated radones (they just use stolen for vram then). The other way round also doesn't work I think since i915 can't change the caching policy for radoen/nouveau access if radeon/nouveau want to write directly to main memory. Otoh I think pcie transactions just snoop the cache and never put anything in there. I still think that everything we import and don't have a clue about should probably be treated as snooped conceptually. But practically who knows what's going on ... I'd say setting as uncached is safer. But at the moment, the policy is just whatever the platform default is and that leads to interesting platform specific behaviour. At the very least we do want snooping if we ever do get a shmem dma-buf source. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached
If the set of pages is being imported from another device, we cannot assume that it is fully coherent with the CPU cache, so mark it as such. However, if the source is the shared memory vgem allocator, we could treat the buffer as being cached (so long as all parties agree in the case the same buffer is shared between multiple devices) but as of today, vgem cannot export dma-bufs. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index 7748d57adffd..da5e4fb02350 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -265,8 +265,22 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, i915_gem_object_init(obj, i915_gem_object_dmabuf_ops); obj-base.import_attach = attach; + /* This bo is imported from a foreign HW device with which we cannot +* assume any coherency. Note that if this dma-buf was imported from +* a simple page allocator, like vgem, then we could treat this as +* cacheable (so long as all parties agree on that convention - i.e. +* if the same bo was shared with nouveau/radeon they also treat it +* as CPU cacheeable so that coherency is maintained between all +* parties). +*/ + ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE); + if (ret) + goto fail_obj; + return obj-base; +fail_obj: + drm_gem_object_unreference(obj-base); fail_detach: dma_buf_detach(dma_buf, attach); dma_buf_put(dma_buf); -- 2.5.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx