On 24/11/15 16:04, Daniel Vetter wrote:
On Tue, Nov 24, 2015 at 03:47:02PM +0000, Dave Gordon wrote:
On 10/11/15 23:27, yu....@intel.com wrote:
From: Alex Dai <yu....@intel.com>

We keep a copy of GuC fw in a GEM obj. However its content is lost
if the GEM obj is swapped (igt/gem_evict_*). Therefore, the later
fw loading during GPU reset will fail. Mark the obj dirty after
copying data into the pages. So its content will be kept during
swapped out.

Signed-off-by: Alex Dai <yu....@intel.com>
---
  drivers/gpu/drm/i915/i915_gem.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f1e3fde..3b15167 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5199,6 +5199,7 @@ i915_gem_object_create_from_data(struct drm_device *dev,
        i915_gem_object_pin_pages(obj);
        sg = obj->pages;
        bytes = sg_copy_from_buffer(sg->sgl, sg->nents, (void *)data, size);
+       obj->dirty = 1;
        i915_gem_object_unpin_pages(obj);

        if (WARN_ON(bytes != size)) {

Hi,

although I liked this at first, on further checking I've found that there
are other cases where the CPU writes on a GEM object without marking it
dirty, so in theory all of those could also be subject to the same issue.

Functions that may suffer from this problem include copy_batch(),
relocate_entry_cpu(), and render_state_setup(); hence, the objects that
might be affected (CPU updates discarded) would in general be batch buffers.

Obviously also i915_gem_object_create_from_data() currently used only for
firmware objects; and populate_lr_context() also modifies a GEM object using
the CPU, but it already explicitly marks the object dirty and so avoid any
problems.

At present, setting an object into the GTT domain with intent-to-write
causes the object to be marked as dirty, whereas setting it into the CPU
domain with intent-to-write doesn't. Since the obj->dirty flag doesn't mean
"CPU and GPU views differ" but rather "In-memory version is newer than
backing store", setting it should depend only on the intent-to-write flag,
not on which domain it's being put into.

So my preferred solution would be to set the obj->dirty flag inside
i915_gem_object_set_to_cpu_domain(obj, true). Then objects that have been
written by the CPU would be swapped out to backing store rather than
discarded if the shrinker decides to reclaim the memory.

To make this work properly, we also need to audit all the callers of
i915_gem_object_set_to_cpu_domain(), because I think in several places the
callers pass the second argument as true where it should be false; this
would cause objects to unnecessarily be marked dirty, increasing swap
requirements and impacting performance.

I'll follow up with an actual patch once people have had an opportunity to
check that the above analysis is accurate.

set_to_cpu_domain won't cover everyone, since some callers like
shmem_pwrite/pread do clever tricks with inline flushing.

Also there's a fundamental difference between writing through GTT (where
we always pin all the pages, even though now we do have partial views),
and CPU writes, which in most places access the bo page-by-page.

Given that I'm still slightly leaning towards a "cpu writes need to mark
pages dirty themselves" world-view. At least teaching this to
set_to_cpu_domain won't be a fully effective magic bullet.
-Daniel

shmem_pwrite doesn't call set_to_cpu_domain, and it already explicitly marks the GEM object as dirty; shmem_pread obviously doesn't need to as it's a read operation and doesn't change the object contents.

There's no difference (as far as backing store is concerned) whether the object is written via a CPU mapping, a GTT mapping, or by the GPU; the end result is that the object in memory is different from the version on backing store, and the object should be marked dirty if we want the changes to persist across eviction.

There may indeed be additional functions that modify GEM objects in some way without first setting them into the CPU domain, in which case they will also be responsible for marking the objects dirty. But all the (correct) callers of i915_gem_object_set_to_cpu_domain(obj, true) follow it by writing on the object, and only one marks the object as dirty, so all others are potential victims of having the shrinker discard their uncommitted changes.

Fixing it here (i.e when mapping whole objects for CPU write access) means that object-level operations then don't have to worry about page-level details. Functions that want to be smarter about tracking updates at the level of individual pages could still do so, but I don't want to impose that requirement on code that just wants to manipulate objects.

BTW, I note that the different implementations of put_pages() are not consistent about how they treat dirty pages within objects not marked as dirty. Does this matter?

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to