Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-11 Thread Daniel Vetter
On Fri, Dec 11, 2015 at 12:29:40PM +, Chris Wilson wrote:
> On Fri, Dec 11, 2015 at 12:19:09PM +, Dave Gordon wrote:
> > On 10/12/15 08:58, Daniel Vetter wrote:
> > >On Mon, Dec 07, 2015 at 12:51:49PM +, Dave Gordon wrote:
> > >>I think I missed i915_gem_phys_pwrite().
> > >>
> > >>i915_gem_gtt_pwrite_fast() marks the object dirty for most cases (via
> > >>set_to_gtt_domain()), but isn't called for all cases (or can return before
> > >>the set_domain). Then we try i915_gem_shmem_pwrite() for non-phys
> > >>objects (no check for stolen!) and that already marks the object dirty
> > >>[aside: we might be able to change that to page-by-page?], but
> > >>i915_gem_phys_pwrite() doesn't mark the object dirty, so we might lose
> > >>updates there?
> > >>
> > >>Or maybe we should move the marking up into i915_gem_pwrite_ioctl() 
> > >>instead.
> > >>The target object is surely going to be dirtied, whatever type it is.
> > >
> > >phys objects are special, and when binding we allocate new
> > >(contiguous) storage. In put_pages_phys that gets copied back and pages
> > >marked as dirty. While a phys object is pinned it's a kernel bug to look
> > >at the shmem pages and a userspace bug to touch the cpu mmap (since that
> > >data will simply be overwritten whenever the kernel feels like).
> > >
> > >phys objects are only used for cursors on old crap though, so ok if we
> > >don't streamline this fairly quirky old ABI.
> > >-Daniel
> > 
> > So is pread broken already for 'phys' ?
> 
> Yes. A completely unused corner of the API.

I think it would be useful to extract all the phys object stuff into
i915_gem_phys_obj.c, add minimal kerneldoc for the functions, and then an
overview section which explains in detail how fucked up this little bit of
ABI history lore is. I can do the overview section, but the
extraction/basic kerneldoc will probably take a bit longer to get around
to.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-11 Thread Chris Wilson
On Fri, Dec 11, 2015 at 12:19:09PM +, Dave Gordon wrote:
> On 10/12/15 08:58, Daniel Vetter wrote:
> >On Mon, Dec 07, 2015 at 12:51:49PM +, Dave Gordon wrote:
> >>I think I missed i915_gem_phys_pwrite().
> >>
> >>i915_gem_gtt_pwrite_fast() marks the object dirty for most cases (via
> >>set_to_gtt_domain()), but isn't called for all cases (or can return before
> >>the set_domain). Then we try i915_gem_shmem_pwrite() for non-phys
> >>objects (no check for stolen!) and that already marks the object dirty
> >>[aside: we might be able to change that to page-by-page?], but
> >>i915_gem_phys_pwrite() doesn't mark the object dirty, so we might lose
> >>updates there?
> >>
> >>Or maybe we should move the marking up into i915_gem_pwrite_ioctl() instead.
> >>The target object is surely going to be dirtied, whatever type it is.
> >
> >phys objects are special, and when binding we allocate new
> >(contiguous) storage. In put_pages_phys that gets copied back and pages
> >marked as dirty. While a phys object is pinned it's a kernel bug to look
> >at the shmem pages and a userspace bug to touch the cpu mmap (since that
> >data will simply be overwritten whenever the kernel feels like).
> >
> >phys objects are only used for cursors on old crap though, so ok if we
> >don't streamline this fairly quirky old ABI.
> >-Daniel
> 
> So is pread broken already for 'phys' ?

Yes. A completely unused corner of the API.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-11 Thread Dave Gordon

On 10/12/15 08:58, Daniel Vetter wrote:

On Mon, Dec 07, 2015 at 12:51:49PM +, Dave Gordon wrote:

I think I missed i915_gem_phys_pwrite().

i915_gem_gtt_pwrite_fast() marks the object dirty for most cases (via
set_to_gtt_domain()), but isn't called for all cases (or can return before
the set_domain). Then we try i915_gem_shmem_pwrite() for non-phys
objects (no check for stolen!) and that already marks the object dirty
[aside: we might be able to change that to page-by-page?], but
i915_gem_phys_pwrite() doesn't mark the object dirty, so we might lose
updates there?

Or maybe we should move the marking up into i915_gem_pwrite_ioctl() instead.
The target object is surely going to be dirtied, whatever type it is.


phys objects are special, and when binding we allocate new
(contiguous) storage. In put_pages_phys that gets copied back and pages
marked as dirty. While a phys object is pinned it's a kernel bug to look
at the shmem pages and a userspace bug to touch the cpu mmap (since that
data will simply be overwritten whenever the kernel feels like).

phys objects are only used for cursors on old crap though, so ok if we
don't streamline this fairly quirky old ABI.
-Daniel


So is pread broken already for 'phys' ? In the pwrite code, we have 
i915_gem_phys_pwrite(), which looks OK, but there isn't a corresponding 
i915_gem_phys_pread(); instead it will call i915_gem_shmem_pread(), and 
I'm not sure that will work. The question is whether the kernel has 
page table slots corresponding to the allocated DMA area; otherwise 
the for_each_sg_page()/sg_page_iter_page() loop in i915_gem_shmem_pread() 
isn't going to give meaningful results. And I found this comment in 
drm_pci_alloc() (called from i915_gem_object_attach_phys()):


/* XXX - Is virt_to_page() legal for consistent mem? */
/* Reserve */
for (addr = (unsigned long)dmah->vaddr, sz = size;
 sz > 0; addr += PAGE_SIZE, sz -= PAGE_SIZE) {
SetPageReserved(virt_to_page((void *)addr));
}

(and does it depend on which memory configuration is selected?).
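
For illustration, here is a minimal sketch of the kind of dedicated phys 
pread that seems to be missing: reading straight out of the contiguous 
kernel allocation rather than walking the (stale while pinned) shmem 
pages. This is hypothetical, not driver code; the phys_handle->vaddr 
field and the to_user_ptr() helper are assumed from the existing pwrite 
path, and offset/size are assumed to be range-checked by the ioctl.

static int
i915_gem_phys_pread(struct drm_i915_gem_object *obj,
		    struct drm_i915_gem_pread *args,
		    struct drm_file *file)
{
	/* The contiguous kernel copy is authoritative while pinned */
	char *vaddr = obj->phys_handle->vaddr + args->offset;
	void __user *user_data = to_user_ptr(args->data_ptr);

	/* A read leaves the object unchanged, so no dirtying is needed */
	if (copy_to_user(user_data, vaddr, args->size))
		return -EFAULT;

	return 0;
}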

See also current thread on "Support for pread/pwrite from/to non shmem 
backed objects" ...


.Dave.


Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-10 Thread Daniel Vetter
On Mon, Dec 07, 2015 at 12:51:49PM +, Dave Gordon wrote:
> I think I missed i915_gem_phys_pwrite().
> 
> i915_gem_gtt_pwrite_fast() marks the object dirty for most cases (via
> set_to_gtt_domain()), but isn't called for all cases (or can return before
> the set_domain). Then we try i915_gem_shmem_pwrite() for non-phys
> objects (no check for stolen!) and that already marks the object dirty
> [aside: we might be able to change that to page-by-page?], but
> i915_gem_phys_pwrite() doesn't mark the object dirty, so we might lose
> updates there?
> 
> Or maybe we should move the marking up into i915_gem_pwrite_ioctl() instead.
> The target object is surely going to be dirtied, whatever type it is.

phys objects are special, and when binding we allocate new
(contiguous) storage. In put_pages_phys that gets copied back and pages
marked as dirty. While a phys object is pinned it's a kernel bug to look
at the shmem pages and a userspace bug to touch the cpu mmap (since that
data will simply be overwritten whenever the kernel feels like).
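
To make that lifecycle concrete, here is a rough sketch of the copy-back 
step (simplified, with assumed field and helper names; not the literal 
driver code):

static void phys_put_pages_sketch(struct drm_i915_gem_object *obj)
{
	struct address_space *mapping = file_inode(obj->base.filp)->i_mapping;
	char *vaddr = obj->phys_handle->vaddr;	/* contiguous copy */
	int i;

	if (!obj->dirty)
		return;

	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
		struct page *page = shmem_read_mapping_page(mapping, i);
		void *dst;

		if (IS_ERR(page))
			continue;

		/* copy the pinned-time contents back into the shmem page */
		dst = kmap_atomic(page);
		memcpy(dst, vaddr + i * PAGE_SIZE, PAGE_SIZE);
		kunmap_atomic(dst);

		set_page_dirty(page);	/* obj->dirty pushed to each page */
		put_page(page);
	}
	obj->dirty = 0;
}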

phys objects are only used for cursors on old crap though, so ok if we
don't streamline this fairly quirky old ABI.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-10 Thread Daniel Vetter
On Mon, Dec 07, 2015 at 12:04:18PM +, Dave Gordon wrote:
> On 07/12/15 08:29, Daniel Vetter wrote:
> >On Fri, Dec 04, 2015 at 05:28:29PM +, Dave Gordon wrote:
> >>On 04/12/15 09:57, Daniel Vetter wrote:
> >>>On Tue, Dec 01, 2015 at 01:21:07PM +, Dave Gordon wrote:
> On 01/12/15 13:04, Chris Wilson wrote:
> >On Tue, Dec 01, 2015 at 12:42:02PM +, Dave Gordon wrote:
> >>In various places, one or more pages of a GEM object are mapped into CPU
> >>address space and updated. In each such case, the object should be
> >>marked dirty, to ensure that the modifications are not discarded if the
> >>object is evicted under memory pressure.
> >>
> >>This is similar to commit
> >>commit 51bc140431e233284660b1d22c47dec9ecdb521e
> >>Author: Chris Wilson 
> >>Date:   Mon Aug 31 15:10:39 2015 +0100
> >>drm/i915: Always mark the object as dirty when used by the GPU
> >>
> >>in which Chris ensured that updates by the GPU were not lost due to
> >>eviction, but this patch applies instead to the multiple places where
> >>object content is updated by the host CPU.
> >
> >Apart from that commit was to mask userspace bugs, here we are under
> >control of when the pages are marked and have chosen a different
> >per-page interface for CPU writes as opposed to per-object.
> >-Chris
> 
> The pattern
>   get_pages();
>   kmap(get_page())
>   write
>   kunmap()
> occurs often enough that it might be worth providing a common function to 
> do
> that and mark only the specific page dirty (other cases touch the whole
> object, so for those we can just set the obj->dirty flag and let 
> put_pages()
> take care of propagating that to all the individual pages).
> 
> But can we be sure that all the functions touched by this patch will 
> operate
> only on regular (default) GEM objects (i.e. not phys, stolen, etc) 'cos 
> some
> of those don't support per-page tracking. What about objects with no 
> backing
> store -- can/should we mark those as dirty (which would prevent eviction)?
> >>>
> >>>I thought our special objects do clear obj->dirty on put_pages? Can you
> >>>please elaborate on your concern?
> >>>
> >>>While we discuss all this: A patch at the end to document dirty (maybe
> >>>even as a first stab at kerneldoc for i915_drm_gem_buffer_object) would be
> >>>awesome.
> >>>-Daniel
> >>
> >>In general, obj->dirty means that (some or) all the pages of the object
> >>(may) have been modified since last time the object was read from backing
> >>store, and that the modified data should be written back rather than
> >>discarded.
> >>
> >>Code that works only on default (gtt) GEM objects may be able to optimise
> >>writebacks by marking individual pages dirty, rather than the object as a
> >>whole. But not every GEM object has backing store, and even among those that
> >>do, some do not support per-page dirty tracking.
> >>
> >>These are the GEM objects we may want to consider:
> >>
> >>1. Default (gtt) object
> >>* Discontiguous, lives in page cache while pinned during use
> >>* Backed by shmfs (swap)
> >>* put_pages() transfers dirty status from object to each page
> >>  before release
> >>* shmfs ensures that dirty unpinned pages are written out
> >>  before deallocation
> >>* Could optimise by marking individual pages at point of use,
> >>  rather than marking whole object and then pushing to all pages
> >>  during put_pages()
> >>
> >>2. Phys GEM object
> >>* Lives in physically-contiguous system memory, pinned during use
> >>* Backed by shmfs
> >>* if obj->dirty, put_pages() *copies* all pages back to shmfs via
> >>  page cache RMW
> >>* No per-page tracking, cannot optimise
> >>
> >>3. Stolen GEM object
> >>* Lives in (physically-contiguous) stolen memory, always pinned
> >>* No backing store!
> >>* obj->dirty is irrelevant (ignored)
> >>* put_pages() only called at end-of-life
> >>* No per-page tracking (not meaningful anyway)
> >>
> >>4. Userptr GEM object
> >>* Discontiguous, lives in page cache while pinned during use
> >>* Backed by user process memory (which may then map to some
> >>  arbitrary file mapping?)
> >>* put_pages() transfers dirty status from object to each page
> >>  before release
> >>* dirty pages are still resident in user space, can be swapped
> >>  out when not pinned
> >>* Could optimise by marking individual pages at point of use,
> >>  rather than marking whole object and then pushing to all pages
> >>  during put_pages()
> >>
> >>Are there any more?
> >>
> >>Given this diversity, it may be worth adding a dirty_page() vfunc, so that
> >>for those situations where a single page is dirtied AND the object type
> >>supports per-page tracking, we can take advantage of this to reduce copying.
> >>

Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-07 Thread Dave Gordon

On 01/12/15 12:42, Dave Gordon wrote:

In various places, one or more pages of a GEM object are mapped into CPU
address space and updated. In each such case, the object should be
marked dirty, to ensure that the modifications are not discarded if the
object is evicted under memory pressure.

This is similar to commit
commit 51bc140431e233284660b1d22c47dec9ecdb521e
Author: Chris Wilson 
Date:   Mon Aug 31 15:10:39 2015 +0100
drm/i915: Always mark the object as dirty when used by the GPU

in which Chris ensured that updates by the GPU were not lost due to
eviction, but this patch applies instead to the multiple places where
object content is updated by the host CPU.

It also incorporates and supersedes Alex Dai's earlier patch
[PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted

Signed-off-by: Dave Gordon 
Cc: Chris Wilson 
Cc: Alex Dai 
---
  drivers/gpu/drm/i915/i915_cmd_parser.c   | 1 +
  drivers/gpu/drm/i915/i915_gem.c  | 1 +
  drivers/gpu/drm/i915/i915_gem_dmabuf.c   | 2 ++
  drivers/gpu/drm/i915/i915_gem_execbuffer.c   | 2 ++
  drivers/gpu/drm/i915/i915_gem_render_state.c | 1 +
  drivers/gpu/drm/i915/i915_guc_submission.c   | 1 +
  drivers/gpu/drm/i915/intel_lrc.c | 6 +-
  7 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 814d894..292bd5d 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -945,6 +945,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj,
drm_clflush_virt_range(src, batch_len);

memcpy(dst, src, batch_len);
+   dest_obj->dirty = 1;

  unmap_src:
vunmap(src_base);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 33adc8f..76bacba 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5201,6 +5201,7 @@ i915_gem_object_create_from_data(struct drm_device *dev,
i915_gem_object_pin_pages(obj);
sg = obj->pages;
bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
+   obj->dirty = 1;
i915_gem_object_unpin_pages(obj);

if (WARN_ON(bytes != size)) {
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index e9c2bfd..49a74c6 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -208,6 +208,8 @@ static int i915_gem_begin_cpu_access(struct dma_buf 
*dma_buf, size_t start, size
return ret;

ret = i915_gem_object_set_to_cpu_domain(obj, write);
+   if (write)
+   obj->dirty = 1;
mutex_unlock(&dev->struct_mutex);
return ret;
  }
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a4c243c..bc28a10 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -281,6 +281,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj,
}

kunmap_atomic(vaddr);
+   obj->dirty = 1;

return 0;
  }
@@ -372,6 +373,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj,
}

kunmap_atomic(vaddr);
+   obj->dirty = 1;

return 0;
  }
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c 
b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 5026a62..dd1976c 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -144,6 +144,7 @@ static int render_state_setup(struct render_state *so)
so->aux_batch_size = ALIGN(so->aux_batch_size, 8);

kunmap(page);
+   so->obj->dirty = 1;

ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
if (ret)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index a057cbd..b4a99a2 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -583,6 +583,7 @@ static void lr_context_update(struct drm_i915_gem_request 
*rq)
reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);

kunmap_atomic(reg_state);
+   ctx_obj->dirty = 1;
  }

  /**
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4ebafab..bc77794 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -391,6 +391,7 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
}

kunmap_atomic(reg_state);
+   ctx_obj->dirty = 1;

return 0;
  }
@@ -1030,7 +1031,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs 
*ring,
if (ret)
goto unpin_ctx_obj;

-   ctx_obj->dirty = true;
+   ctx_obj->dirty = 1;

/* Invalidate GuC TLB. */
if (i915.enable_guc_submission)

Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-07 Thread Dave Gordon

On 07/12/15 08:29, Daniel Vetter wrote:

On Fri, Dec 04, 2015 at 05:28:29PM +, Dave Gordon wrote:

On 04/12/15 09:57, Daniel Vetter wrote:

On Tue, Dec 01, 2015 at 01:21:07PM +, Dave Gordon wrote:

On 01/12/15 13:04, Chris Wilson wrote:

On Tue, Dec 01, 2015 at 12:42:02PM +, Dave Gordon wrote:

In various places, one or more pages of a GEM object are mapped into CPU
address space and updated. In each such case, the object should be
marked dirty, to ensure that the modifications are not discarded if the
object is evicted under memory pressure.

This is similar to commit
commit 51bc140431e233284660b1d22c47dec9ecdb521e
Author: Chris Wilson 
Date:   Mon Aug 31 15:10:39 2015 +0100
drm/i915: Always mark the object as dirty when used by the GPU

in which Chris ensured that updates by the GPU were not lost due to
eviction, but this patch applies instead to the multiple places where
object content is updated by the host CPU.


Apart from that commit was to mask userspace bugs, here we are under
control of when the pages are marked and have chosen a different
per-page interface for CPU writes as opposed to per-object.
-Chris


The pattern
get_pages();
kmap(get_page())
write
kunmap()
occurs often enough that it might be worth providing a common function to do
that and mark only the specific page dirty (other cases touch the whole
object, so for those we can just set the obj->dirty flag and let put_pages()
take care of propagating that to all the individual pages).

But can we be sure that all the functions touched by this patch will operate
only on regular (default) GEM objects (i.e. not phys, stolen, etc) 'cos some
of those don't support per-page tracking. What about objects with no backing
store -- can/should we mark those as dirty (which would prevent eviction)?


I thought our special objects do clear obj->dirty on put_pages? Can you
please elaborate on your concern?

While we discuss all this: A patch at the end to document dirty (maybe
even as a first stab at kerneldoc for i915_drm_gem_buffer_object) would be
awesome.
-Daniel


In general, obj->dirty means that (some or) all the pages of the object
(may) have been modified since last time the object was read from backing
store, and that the modified data should be written back rather than
discarded.

Code that works only on default (gtt) GEM objects may be able to optimise
writebacks by marking individual pages dirty, rather than the object as a
whole. But not every GEM object has backing store, and even among those that
do, some do not support per-page dirty tracking.

These are the GEM objects we may want to consider:

1. Default (gtt) object
* Discontiguous, lives in page cache while pinned during use
* Backed by shmfs (swap)
* put_pages() transfers dirty status from object to each page
  before release
* shmfs ensures that dirty unpinned pages are written out
  before deallocation
* Could optimise by marking individual pages at point of use,
  rather than marking whole object and then pushing to all pages
  during put_pages()

2. Phys GEM object
* Lives in physically-contiguous system memory, pinned during use
* Backed by shmfs
* if obj->dirty, put_pages() *copies* all pages back to shmfs via
  page cache RMW
* No per-page tracking, cannot optimise

3. Stolen GEM object
* Lives in (physically-contiguous) stolen memory, always pinned
* No backing store!
* obj->dirty is irrelevant (ignored)
* put_pages() only called at end-of-life
* No per-page tracking (not meaningful anyway)

4. Userptr GEM object
* Discontiguous, lives in page cache while pinned during use
* Backed by user process memory (which may then map to some
  arbitrary file mapping?)
* put_pages() transfers dirty status from object to each page
  before release
* dirty pages are still resident in user space, can be swapped
  out when not pinned
* Could optimise by marking individual pages at point of use,
  rather than marking whole object and then pushing to all pages
  during put_pages()

Are there any more?

Given this diversity, it may be worth adding a dirty_page() vfunc, so that
for those situations where a single page is dirtied AND the object type
supports per-page tracking, we can take advantage of this to reduce copying.
For objects that don't support per-page tracking, the implementation would
just set obj->dirty.

For example:
 void (*dirty_page)(obj, pageno);
possibly with the additional semantic that pageno == -1 means 'dirty the
whole object'.

A convenient further facility would be:
 struct page *i915_gem_object_get_dirty_page(obj, pageno);
which is just like i915_gem_object_get_page() but with the additional effect
of marking the returned page dirty (by calling the above vfunc).
[Aside: can we call set_page_dirty() on a non-shmfs-backed page?].

This means that in all the places where

Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-07 Thread Daniel Vetter
On Fri, Dec 04, 2015 at 05:28:29PM +, Dave Gordon wrote:
> On 04/12/15 09:57, Daniel Vetter wrote:
> >On Tue, Dec 01, 2015 at 01:21:07PM +, Dave Gordon wrote:
> >>On 01/12/15 13:04, Chris Wilson wrote:
> >>>On Tue, Dec 01, 2015 at 12:42:02PM +, Dave Gordon wrote:
> In various places, one or more pages of a GEM object are mapped into CPU
> address space and updated. In each such case, the object should be
> marked dirty, to ensure that the modifications are not discarded if the
> object is evicted under memory pressure.
> 
> This is similar to commit
>   commit 51bc140431e233284660b1d22c47dec9ecdb521e
>   Author: Chris Wilson 
>   Date:   Mon Aug 31 15:10:39 2015 +0100
>   drm/i915: Always mark the object as dirty when used by the GPU
> 
> in which Chris ensured that updates by the GPU were not lost due to
> eviction, but this patch applies instead to the multiple places where
> object content is updated by the host CPU.
> >>>
> >>>Apart from that commit was to mask userspace bugs, here we are under
> >>>control of when the pages are marked and have chosen a different
> >>>per-page interface for CPU writes as opposed to per-object.
> >>>-Chris
> >>
> >>The pattern
> >>get_pages();
> >>kmap(get_page())
> >>write
> >>kunmap()
> >>occurs often enough that it might be worth providing a common function to do
> >>that and mark only the specific page dirty (other cases touch the whole
> >>object, so for those we can just set the obj->dirty flag and let put_pages()
> >>take care of propagating that to all the individual pages).
> >>
> >>But can we be sure that all the functions touched by this patch will operate
> >>only on regular (default) GEM objects (i.e. not phys, stolen, etc) 'cos some
> >>of those don't support per-page tracking. What about objects with no backing
> >>store -- can/should we mark those as dirty (which would prevent eviction)?
> >
> >I thought our special objects do clear obj->dirty on put_pages? Can you
> >please elaborate on your concern?
> >
> >While we discuss all this: A patch at the end to document dirty (maybe
> >even as a first stab at kerneldoc for i915_drm_gem_buffer_object) would be
> >awesome.
> >-Daniel
> 
> In general, obj->dirty means that (some or) all the pages of the object
> (may) have been modified since last time the object was read from backing
> store, and that the modified data should be written back rather than
> discarded.
> 
> Code that works only on default (gtt) GEM objects may be able to optimise
> writebacks by marking individual pages dirty, rather than the object as a
> whole. But not every GEM object has backing store, and even among those that
> do, some do not support per-page dirty tracking.
> 
> These are the GEM objects we may want to consider:
> 
> 1. Default (gtt) object
>* Discontiguous, lives in page cache while pinned during use
>* Backed by shmfs (swap)
>* put_pages() transfers dirty status from object to each page
>  before release
>* shmfs ensures that dirty unpinned pages are written out
>  before deallocation
>* Could optimise by marking individual pages at point of use,
>  rather than marking whole object and then pushing to all pages
>  during put_pages()
> 
> 2. Phys GEM object
>* Lives in physically-contiguous system memory, pinned during use
>* Backed by shmfs
>* if obj->dirty, put_pages() *copies* all pages back to shmfs via
>  page cache RMW
>* No per-page tracking, cannot optimise
> 
> 3. Stolen GEM object
>* Lives in (physically-contiguous) stolen memory, always pinned
>* No backing store!
>* obj->dirty is irrelevant (ignored)
>* put_pages() only called at end-of-life
>* No per-page tracking (not meaningful anyway)
> 
> 4. Userptr GEM object
>* Discontiguous, lives in page cache while pinned during use
>* Backed by user process memory (which may then map to some
>  arbitrary file mapping?)
>* put_pages() transfers dirty status from object to each page
>  before release
>* dirty pages are still resident in user space, can be swapped
>  out when not pinned
>* Could optimise by marking individual pages at point of use,
>  rather than marking whole object and then pushing to all pages
>  during put_pages()
> 
> Are there any more?
> 
> Given this diversity, it may be worth adding a dirty_page() vfunc, so that
> for those situations where a single page is dirtied AND the object type
> supports per-page tracking, we can take advantage of this to reduce copying.
> For objects that don't support per-page tracking, the implementation would
> just set obj->dirty.
> 
> For example:
> void (*dirty_page)(obj, pageno);
> possibly with the additional semantic that pageno == -1 means 'dirty the
> whole object'.
> 
> A convenient further facility would be:
> struct page *i915_gem_object_get_dirty_page(obj, pageno);
> which is just like i915_gem_object_get_page() but with the additional effect
> of marking the returned page dirty (by calling the above vfunc).

Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-04 Thread Dave Gordon

On 04/12/15 09:57, Daniel Vetter wrote:

On Tue, Dec 01, 2015 at 01:21:07PM +, Dave Gordon wrote:

On 01/12/15 13:04, Chris Wilson wrote:

On Tue, Dec 01, 2015 at 12:42:02PM +, Dave Gordon wrote:

In various places, one or more pages of a GEM object are mapped into CPU
address space and updated. In each such case, the object should be
marked dirty, to ensure that the modifications are not discarded if the
object is evicted under memory pressure.

This is similar to commit
commit 51bc140431e233284660b1d22c47dec9ecdb521e
Author: Chris Wilson 
Date:   Mon Aug 31 15:10:39 2015 +0100
drm/i915: Always mark the object as dirty when used by the GPU

in which Chris ensured that updates by the GPU were not lost due to
eviction, but this patch applies instead to the multiple places where
object content is updated by the host CPU.


Apart from that commit was to mask userspace bugs, here we are under
control of when the pages are marked and have chosen a different
per-page interface for CPU writes as opposed to per-object.
-Chris


The pattern
get_pages();
kmap(get_page())
write
kunmap()
occurs often enough that it might be worth providing a common function to do
that and mark only the specific page dirty (other cases touch the whole
object, so for those we can just set the obj->dirty flag and let put_pages()
take care of propagating that to all the individual pages).

But can we be sure that all the functions touched by this patch will operate
only on regular (default) GEM objects (i.e. not phys, stolen, etc) 'cos some
of those don't support per-page tracking. What about objects with no backing
store -- can/should we mark those as dirty (which would prevent eviction)?


I thought our special objects do clear obj->dirty on put_pages? Can you
please elaborate on your concern?

While we discuss all this: A patch at the end to document dirty (maybe
even as a first stab at kerneldoc for i915_drm_gem_buffer_object) would be
awesome.
-Daniel


In general, obj->dirty means that (some or) all the pages of the object 
(may) have been modified since last time the object was read from 
backing store, and that the modified data should be written back rather 
than discarded.


Code that works only on default (gtt) GEM objects may be able to 
optimise writebacks by marking individual pages dirty, rather than the 
object as a whole. But not every GEM object has backing store, and even 
among those that do, some do not support per-page dirty tracking.


These are the GEM objects we may want to consider:

1. Default (gtt) object
   * Discontiguous, lives in page cache while pinned during use
   * Backed by shmfs (swap)
   * put_pages() transfers dirty status from object to each page
 before release
   * shmfs ensures that dirty unpinned pages are written out
 before deallocation
   * Could optimise by marking individual pages at point of use,
 rather than marking whole object and then pushing to all pages
 during put_pages()

2. Phys GEM object
   * Lives in physically-contiguous system memory, pinned during use
   * Backed by shmfs
   * if obj->dirty, put_pages() *copies* all pages back to shmfs via
 page cache RMW
   * No per-page tracking, cannot optimise

3. Stolen GEM object
   * Lives in (physically-contiguous) stolen memory, always pinned
   * No backing store!
   * obj->dirty is irrelevant (ignored)
   * put_pages() only called at end-of-life
   * No per-page tracking (not meaningful anyway)

4. Userptr GEM object
   * Discontiguous, lives in page cache while pinned during use
   * Backed by user process memory (which may then map to some
 arbitrary file mapping?)
   * put_pages() transfers dirty status from object to each page
 before release
   * dirty pages are still resident in user space, can be swapped
 out when not pinned
   * Could optimise by marking individual pages at point of use,
 rather than marking whole object and then pushing to all pages
 during put_pages()

Are there any more?

Given this diversity, it may be worth adding a dirty_page() vfunc, so 
that for those situations where a single page is dirtied AND the object 
type supports per-page tracking, we can take advantage of this to reduce 
copying. For objects that don't support per-page tracking, the 
implementation would just set obj->dirty.


For example:
void (*dirty_page)(obj, pageno);
possibly with the additional semantic that pageno == -1 means 'dirty the 
whole object'.


A convenient further facility would be:
struct page *i915_gem_object_get_dirty_page(obj, pageno);
which is just like i915_gem_object_get_page() but with the additional 
effect of marking the returned page dirty (by calling the above vfunc).

[Aside: can we call set_page_dirty() on a non-shmfs-backed page?].

This means that in all the places where I added 'obj->dirty = 1' after a 
kunmap() call, we would instead just change the earlier get_page() to 
get_dirty_page().
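
A hypothetical sketch of how those two pieces might fit together (the 
dirty_page() hook and i915_gem_object_get_dirty_page() do not exist in 
the driver; the ops member is assumed purely for illustration):

/* Per-page implementation for shmem-backed (default/userptr) objects */
static void shmem_dirty_page(struct drm_i915_gem_object *obj, int pageno)
{
	if (pageno == -1)		/* -1 means dirty the whole object */
		obj->dirty = 1;
	else
		set_page_dirty(i915_gem_object_get_page(obj, pageno));
}

/* Fallback for types without per-page tracking (phys, stolen, ...) */
static void object_dirty_page(struct drm_i915_gem_object *obj, int pageno)
{
	obj->dirty = 1;
}

/* get_page() plus dirtying, so callers can drop explicit obj->dirty = 1 */
static struct page *
i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj, int pageno)
{
	obj->ops->dirty_page(obj, pageno);	/* hypothetical ops member */
	return i915_gem_object_get_page(obj, pageno);
}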

Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-04 Thread Daniel Vetter
On Tue, Dec 01, 2015 at 01:21:07PM +, Dave Gordon wrote:
> On 01/12/15 13:04, Chris Wilson wrote:
> >On Tue, Dec 01, 2015 at 12:42:02PM +, Dave Gordon wrote:
> >>In various places, one or more pages of a GEM object are mapped into CPU
> >>address space and updated. In each such case, the object should be
> >>marked dirty, to ensure that the modifications are not discarded if the
> >>object is evicted under memory pressure.
> >>
> >>This is similar to commit
> >>commit 51bc140431e233284660b1d22c47dec9ecdb521e
> >>Author: Chris Wilson 
> >>Date:   Mon Aug 31 15:10:39 2015 +0100
> >>drm/i915: Always mark the object as dirty when used by the GPU
> >>
> >>in which Chris ensured that updates by the GPU were not lost due to
> >>eviction, but this patch applies instead to the multiple places where
> >>object content is updated by the host CPU.
> >
> >Apart from that commit was to mask userspace bugs, here we are under
> >control of when the pages are marked and have chosen a different
> >per-page interface for CPU writes as opposed to per-object.
> >-Chris
> >
> 
> The pattern
>   get_pages();
>   kmap(get_page())
>   write
>   kunmap()
> occurs often enough that it might be worth providing a common function to do
> that and mark only the specific page dirty (other cases touch the whole
> object, so for those we can just set the obj->dirty flag and let put_pages()
> take care of propagating that to all the individual pages).
> 
> But can we be sure that all the functions touched by this patch will operate
> only on regular (default) GEM objects (i.e. not phys, stolen, etc) 'cos some
> of those don't support per-page tracking. What about objects with no backing
> store -- can/should we mark those as dirty (which would prevent eviction)?

I thought our special objects do clear obj->dirty on put_pages? Can you
please elaborate on your concern?

While we discuss all this: A patch at the end to document dirty (maybe
even as a first stab at kerneldoc for i915_drm_gem_buffer_object) would be
awesome.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-01 Thread Dave Gordon

On 01/12/15 13:04, Chris Wilson wrote:

On Tue, Dec 01, 2015 at 12:42:02PM +, Dave Gordon wrote:

In various places, one or more pages of a GEM object are mapped into CPU
address space and updated. In each such case, the object should be
marked dirty, to ensure that the modifications are not discarded if the
object is evicted under memory pressure.

This is similar to commit
commit 51bc140431e233284660b1d22c47dec9ecdb521e
Author: Chris Wilson 
Date:   Mon Aug 31 15:10:39 2015 +0100
drm/i915: Always mark the object as dirty when used by the GPU

in which Chris ensured that updates by the GPU were not lost due to
eviction, but this patch applies instead to the multiple places where
object content is updated by the host CPU.


Apart from that commit was to mask userspace bugs, here we are under
control of when the pages are marked and have chosen a different
per-page interface for CPU writes as opposed to per-object.
-Chris



The pattern
get_pages();
kmap(get_page())
write
kunmap()
occurs often enough that it might be worth providing a common function 
to do that and mark only the specific page dirty (other cases touch the 
whole object, so for those we can just set the obj->dirty flag and let 
put_pages() take care of propagating that to all the individual pages).
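
One possible shape for such a helper, sketched on the assumption that 
the object is shmem-backed (so set_page_dirty() on the individual page 
is meaningful) and that its pages are already pinned by the caller:

typedef void (*page_write_fn)(void *vaddr, void *data);

static void i915_gem_object_write_page(struct drm_i915_gem_object *obj,
				       int pageno, page_write_fn write,
				       void *data)
{
	struct page *page = i915_gem_object_get_page(obj, pageno);
	void *vaddr = kmap(page);

	write(vaddr, data);		/* caller-supplied update */
	kunmap(page);

	set_page_dirty(page);		/* mark just this page, not obj->dirty */
}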


But can we be sure that all the functions touched by this patch will 
operate only on regular (default) GEM objects (i.e. not phys, stolen, 
etc) 'cos some of those don't support per-page tracking. What about 
objects with no backing store -- can/should we mark those as dirty 
(which would prevent eviction)?


.Dave.


Re: [Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-01 Thread Chris Wilson
On Tue, Dec 01, 2015 at 12:42:02PM +, Dave Gordon wrote:
> In various places, one or more pages of a GEM object are mapped into CPU
> address space and updated. In each such case, the object should be
> marked dirty, to ensure that the modifications are not discarded if the
> object is evicted under memory pressure.
> 
> This is similar to commit
>   commit 51bc140431e233284660b1d22c47dec9ecdb521e
>   Author: Chris Wilson 
>   Date:   Mon Aug 31 15:10:39 2015 +0100
>   drm/i915: Always mark the object as dirty when used by the GPU
> 
> in which Chris ensured that updates by the GPU were not lost due to
> eviction, but this patch applies instead to the multiple places where
> object content is updated by the host CPU.

Apart from that commit was to mask userspace bugs, here we are under
control of when the pages are marked and have chosen a different
per-page interface for CPU writes as opposed to per-object.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] [PATCH] Always mark GEM objects as dirty when written by the CPU

2015-12-01 Thread Dave Gordon
In various places, one or more pages of a GEM object are mapped into CPU
address space and updated. In each such case, the object should be
marked dirty, to ensure that the modifications are not discarded if the
object is evicted under memory pressure.

This is similar to commit
commit 51bc140431e233284660b1d22c47dec9ecdb521e
Author: Chris Wilson 
Date:   Mon Aug 31 15:10:39 2015 +0100
drm/i915: Always mark the object as dirty when used by the GPU

in which Chris ensured that updates by the GPU were not lost due to
eviction, but this patch applies instead to the multiple places where
object content is updated by the host CPU.

It also incorporates and supersedes Alex Dai's earlier patch
[PATCH v1] drm/i915/guc: Fix a fw content lost issue after it is evicted

Signed-off-by: Dave Gordon 
Cc: Chris Wilson 
Cc: Alex Dai 
---
 drivers/gpu/drm/i915/i915_cmd_parser.c   | 1 +
 drivers/gpu/drm/i915/i915_gem.c  | 1 +
 drivers/gpu/drm/i915/i915_gem_dmabuf.c   | 2 ++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   | 2 ++
 drivers/gpu/drm/i915/i915_gem_render_state.c | 1 +
 drivers/gpu/drm/i915/i915_guc_submission.c   | 1 +
 drivers/gpu/drm/i915/intel_lrc.c | 6 +-
 7 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 814d894..292bd5d 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -945,6 +945,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj,
drm_clflush_virt_range(src, batch_len);
 
memcpy(dst, src, batch_len);
+   dest_obj->dirty = 1;
 
 unmap_src:
vunmap(src_base);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 33adc8f..76bacba 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5201,6 +5201,7 @@ i915_gem_object_create_from_data(struct drm_device *dev,
i915_gem_object_pin_pages(obj);
sg = obj->pages;
bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
+   obj->dirty = 1;
i915_gem_object_unpin_pages(obj);
 
if (WARN_ON(bytes != size)) {
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index e9c2bfd..49a74c6 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -208,6 +208,8 @@ static int i915_gem_begin_cpu_access(struct dma_buf 
*dma_buf, size_t start, size
return ret;
 
ret = i915_gem_object_set_to_cpu_domain(obj, write);
+   if (write)
+   obj->dirty = 1;
mutex_unlock(&dev->struct_mutex);
return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a4c243c..bc28a10 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -281,6 +281,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj,
}
 
kunmap_atomic(vaddr);
+   obj->dirty = 1;
 
return 0;
 }
@@ -372,6 +373,7 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj,
}
 
kunmap_atomic(vaddr);
+   obj->dirty = 1;
 
return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c 
b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 5026a62..dd1976c 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -144,6 +144,7 @@ static int render_state_setup(struct render_state *so)
so->aux_batch_size = ALIGN(so->aux_batch_size, 8);
 
kunmap(page);
+   so->obj->dirty = 1;
 
ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
if (ret)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index a057cbd..b4a99a2 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -583,6 +583,7 @@ static void lr_context_update(struct drm_i915_gem_request 
*rq)
reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
 
kunmap_atomic(reg_state);
+   ctx_obj->dirty = 1;
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4ebafab..bc77794 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -391,6 +391,7 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
}
 
kunmap_atomic(reg_state);
+   ctx_obj->dirty = 1;
 
return 0;
 }
@@ -1030,7 +1031,7 @@ static int intel_lr_context_do_pin(struct intel_engine_cs 
*ring,
if (ret)
goto unpin_ctx_obj;
 
-   ctx_obj->dirty = true;
+   ctx_obj->dirty = 1;
 
/* Invalidate GuC TLB. */
if (i915.enable_guc_submission)
@@ -1461,6 +1462,8 @@ static int