Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached

2015-08-14 Thread shuang . he
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: 
shuang...@intel.com)
Task id: 7140
-Summary-
Platform  Delta  drm-intel-nightly  Series Applied
ILK  302/302  302/302
SNB  315/315  315/315
IVB  336/336  336/336
BYT  283/283  283/283
HSW  378/378  378/378
-Detailed-
Platform  Testdrm-intel-nightly  Series 
Applied
Note: You need to pay more attention to line start with '*'
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached

2015-08-11 Thread Daniel Vetter
On Mon, Aug 10, 2015 at 09:16:46PM +0100, Chris Wilson wrote:
 If the set of pages is being imported from another device, we cannot
 assume that it is fully coherent with the CPU cache, so mark it as such.
 However, if the source is the shared memory vgem allocator, we could
 treat the buffer as being cached (so long as all parties agree in the
 case the same buffer is shared between multiple devices) but as of
 today, vgem cannot export dma-bufs.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Daniel Vetter daniel.vet...@ffwll.ch

Yay for x86 dma api to refuse acknowleding that incoherent devices exist
(since this should be done in the dma-ap/sg stuff imo for dma-bufs).

But since x86 assumes that everything is coherent shouldn't we do the
inverse and use snooping/cacheable always?

That should be correct for everything modern at least, and for old agp
crap (do we care even, is sharing possible there?) snooping should only
result in a bit more overhead.

Or where exactly does this blow up?
-Daniel


 ---
  drivers/gpu/drm/i915/i915_gem_dmabuf.c | 14 ++
  1 file changed, 14 insertions(+)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c 
 b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
 index 7748d57adffd..da5e4fb02350 100644
 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
 +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
 @@ -265,8 +265,22 @@ struct drm_gem_object *i915_gem_prime_import(struct 
 drm_device *dev,
   i915_gem_object_init(obj, i915_gem_object_dmabuf_ops);
   obj-base.import_attach = attach;
  
 + /* This bo is imported from a foreign HW device with which we cannot
 +  * assume any coherency. Note that if this dma-buf was imported from
 +  * a simple page allocator, like vgem, then we could treat this as
 +  * cacheable (so long as all parties agree on that convention - i.e.
 +  * if the same bo was shared with nouveau/radeon they also treat it
 +  * as CPU cacheeable so that coherency is maintained between all
 +  * parties).
 +  */
 + ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
 + if (ret)
 + goto fail_obj;
 +
   return obj-base;
  
 +fail_obj:
 + drm_gem_object_unreference(obj-base);
  fail_detach:
   dma_buf_detach(dma_buf, attach);
   dma_buf_put(dma_buf);
 -- 
 2.5.0
 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached

2015-08-11 Thread Daniel Vetter
On Tue, Aug 11, 2015 at 11:24:58AM +0100, Chris Wilson wrote:
 On Tue, Aug 11, 2015 at 12:12:08PM +0200, Daniel Vetter wrote:
  On Mon, Aug 10, 2015 at 09:16:46PM +0100, Chris Wilson wrote:
   If the set of pages is being imported from another device, we cannot
   assume that it is fully coherent with the CPU cache, so mark it as such.
   However, if the source is the shared memory vgem allocator, we could
   treat the buffer as being cached (so long as all parties agree in the
   case the same buffer is shared between multiple devices) but as of
   today, vgem cannot export dma-bufs.
   
   Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
   Cc: Daniel Vetter daniel.vet...@ffwll.ch
  
  Yay for x86 dma api to refuse acknowleding that incoherent devices exist
  (since this should be done in the dma-ap/sg stuff imo for dma-bufs).
  
  But since x86 assumes that everything is coherent shouldn't we do the
  inverse and use snooping/cacheable always?
 
 Actually, the convention for PRIME is that transfer bos are uncached
 because we have to be sure that any write makes it into the foriegn
 pages, that applies to both GPU and CPU writes, precisely because the
 target is *incoherent*.
 
  That should be correct for everything modern at least, and for old agp
  crap (do we care even, is sharing possible there?) snooping should only
  result in a bit more overhead.
  
  Or where exactly does this blow up?
 
 Consider a kms_crc-esque test using a radeon/nouveau bo imported into
 i915 and accessed using i915 ioctls (with the CRC testing on the radeon
 scanout).

Scanout on radeon must be in vram afaik, and that's a place i915 can't get
at. I think that even holds true for the integrated radones (they just use
stolen for vram then).

The other way round also doesn't work I think since i915 can't change the
caching policy for radoen/nouveau access if radeon/nouveau want to write
directly to main memory. Otoh I think pcie transactions just snoop the
cache and never put anything in there.

I still think that everything we import and don't have a clue about should
probably be treated as snooped conceptually. But practically who knows
what's going on ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached

2015-08-11 Thread Chris Wilson
On Tue, Aug 11, 2015 at 12:12:08PM +0200, Daniel Vetter wrote:
 On Mon, Aug 10, 2015 at 09:16:46PM +0100, Chris Wilson wrote:
  If the set of pages is being imported from another device, we cannot
  assume that it is fully coherent with the CPU cache, so mark it as such.
  However, if the source is the shared memory vgem allocator, we could
  treat the buffer as being cached (so long as all parties agree in the
  case the same buffer is shared between multiple devices) but as of
  today, vgem cannot export dma-bufs.
  
  Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
  Cc: Daniel Vetter daniel.vet...@ffwll.ch
 
 Yay for x86 dma api to refuse acknowleding that incoherent devices exist
 (since this should be done in the dma-ap/sg stuff imo for dma-bufs).
 
 But since x86 assumes that everything is coherent shouldn't we do the
 inverse and use snooping/cacheable always?

Actually, the convention for PRIME is that transfer bos are uncached
because we have to be sure that any write makes it into the foriegn
pages, that applies to both GPU and CPU writes, precisely because the
target is *incoherent*.

 That should be correct for everything modern at least, and for old agp
 crap (do we care even, is sharing possible there?) snooping should only
 result in a bit more overhead.
 
 Or where exactly does this blow up?

Consider a kms_crc-esque test using a radeon/nouveau bo imported into
i915 and accessed using i915 ioctls (with the CRC testing on the radeon
scanout).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached

2015-08-11 Thread Chris Wilson
On Tue, Aug 11, 2015 at 02:58:59PM +0200, Daniel Vetter wrote:
 On Tue, Aug 11, 2015 at 11:24:58AM +0100, Chris Wilson wrote:
  On Tue, Aug 11, 2015 at 12:12:08PM +0200, Daniel Vetter wrote:
   On Mon, Aug 10, 2015 at 09:16:46PM +0100, Chris Wilson wrote:
If the set of pages is being imported from another device, we cannot
assume that it is fully coherent with the CPU cache, so mark it as such.
However, if the source is the shared memory vgem allocator, we could
treat the buffer as being cached (so long as all parties agree in the
case the same buffer is shared between multiple devices) but as of
today, vgem cannot export dma-bufs.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Daniel Vetter daniel.vet...@ffwll.ch
   
   Yay for x86 dma api to refuse acknowleding that incoherent devices exist
   (since this should be done in the dma-ap/sg stuff imo for dma-bufs).
   
   But since x86 assumes that everything is coherent shouldn't we do the
   inverse and use snooping/cacheable always?
  
  Actually, the convention for PRIME is that transfer bos are uncached
  because we have to be sure that any write makes it into the foriegn
  pages, that applies to both GPU and CPU writes, precisely because the
  target is *incoherent*.
  
   That should be correct for everything modern at least, and for old agp
   crap (do we care even, is sharing possible there?) snooping should only
   result in a bit more overhead.
   
   Or where exactly does this blow up?
  
  Consider a kms_crc-esque test using a radeon/nouveau bo imported into
  i915 and accessed using i915 ioctls (with the CRC testing on the radeon
  scanout).
 
 Scanout on radeon must be in vram afaik, and that's a place i915 can't get
 at. I think that even holds true for the integrated radones (they just use
 stolen for vram then).
 
 The other way round also doesn't work I think since i915 can't change the
 caching policy for radoen/nouveau access if radeon/nouveau want to write
 directly to main memory. Otoh I think pcie transactions just snoop the
 cache and never put anything in there.
 
 I still think that everything we import and don't have a clue about should
 probably be treated as snooped conceptually. But practically who knows
 what's going on ...

I'd say setting as uncached is safer. But at the moment, the policy is
just whatever the platform default is and that leads to interesting
platform specific behaviour.

At the very least we do want snooping if we ever do get a shmem dma-buf
source.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Treat foreign dma-buf imports as uncached

2015-08-10 Thread Chris Wilson
If the set of pages is being imported from another device, we cannot
assume that it is fully coherent with the CPU cache, so mark it as such.
However, if the source is the shared memory vgem allocator, we could
treat the buffer as being cached (so long as all parties agree in the
case the same buffer is shared between multiple devices) but as of
today, vgem cannot export dma-bufs.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Daniel Vetter daniel.vet...@ffwll.ch
---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 7748d57adffd..da5e4fb02350 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -265,8 +265,22 @@ struct drm_gem_object *i915_gem_prime_import(struct 
drm_device *dev,
i915_gem_object_init(obj, i915_gem_object_dmabuf_ops);
obj-base.import_attach = attach;
 
+   /* This bo is imported from a foreign HW device with which we cannot
+* assume any coherency. Note that if this dma-buf was imported from
+* a simple page allocator, like vgem, then we could treat this as
+* cacheable (so long as all parties agree on that convention - i.e.
+* if the same bo was shared with nouveau/radeon they also treat it
+* as CPU cacheeable so that coherency is maintained between all
+* parties).
+*/
+   ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
+   if (ret)
+   goto fail_obj;
+
return obj-base;
 
+fail_obj:
+   drm_gem_object_unreference(obj-base);
 fail_detach:
dma_buf_detach(dma_buf, attach);
dma_buf_put(dma_buf);
-- 
2.5.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx