i915: Add an interface to dynamically change the cache level

Chris Wilson Wed, 13 Apr 2011 12:21:35 -0700

On Wed, 13 Apr 2011 20:59:46 +0200, Daniel Vetter <[email protected]> wrote:
> > @@ -3002,6 +3002,44 @@ i915_gem_object_set_to_gtt_domain(struct 
> > drm_i915_gem_object *obj, bool write)
> >     return 0;
> >  }
> >  
> > +int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > +                               enum i915_cache_level cache_level)
> > +{
> > +   int ret;
> > +
> > +   if (obj->cache_level == cache_level)
> > +           return 0;
> > +
> > +   if (obj->gtt_space) {
> > +           ret = i915_gem_object_flush_gpu(obj);
> > +           if (ret)
> > +                   return ret;
> > +
> > +           ret = i915_gem_gtt_bind_object(obj, cache_level);
> > +           if (ret)
> > +                   return ret;
> 
> This momentarily confused me till I've noticed that the fake agp driver
> does the right thing and does not re-create a dmar mapping if it already
> exists. So much for remembering my own code. Still, maybe extract
> i915_gem_gtt_rebind_object from restore_gtt_mappings and use that one
> here? Should make the intent clearer.


Now that you reminded me, I was going to ask you at one point if we can
move the construction of the sg to a separate function. I'm not completely
happy that we have the sanest of interfaces between the gtt driver
and i915 yet. I think it will be worth revisiting that as our usage
patterns change.

[The suggested change is good, of course.]

> 
> > +           /* Ensure that we invalidate the GPU's caches and TLBs. */
> > +           obj->base.read_domains &= I915_GEM_GPU_DOMAINS;
> 
> I can't make sense of this. Either we really want to ensure that the gpu
> buffers get invalidated on next use. But then it's probably
> 
>               read_domains &= ~GPU_DOMAINS
> 
> and would fit better grouped together with the call to object_flush_gpu
> (the rebind can't actually fail if the dmar mappings already exist). Or
> this is something else and I'm blind.

Gah, typo. Even re-reading what you wrote, I thought you had gone insane.
It was only me who had. ;-)

> > +   }
> > +
> > +   if (cache_level == I915_CACHE_NONE) {
> > +           /* If we're coming from LLC cached, then we haven't
> > +            * actually been tracking whether the data is in the
> > +            * CPU cache or not, since we only allow one bit set
> > +            * in obj->write_domain and have been skipping the clflushes.
> > +            * Just set it to the CPU cache for now.
> > +            */
> > +           WARN_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
> > +
> > +           obj->base.read_domains |= I915_GEM_DOMAIN_CPU;
> 
> This breaks the invariant that write_domain != 0 implies write_domain ==
> read_domains. Yes, if nothing prefetches and we clflush in due time the
> caches should still be valid, but paranoid me deems that a bit fragile.
> Also future patches shoot down fences, so we might as well shoot down the
> gtt mapping completely. That seems required for the redirect gtt mappings
> patch, too.
> 
> > +           obj->base.write_domain = I915_GEM_DOMAIN_CPU;
> 
> We might end up here with a write_domain == DOMAIN_GTT. Feels a tad bit
> unsafe. I'd prefer either a WARN_ON and push the problem out to callers or
> to call flush_gtt_write_domain somewhere in set_cache_level.

Right, we can simply flush the GTT write domain and do a complete move
into the CPU domain:

if (cache_level == I915_CACHE_NONE) {
        /* If we're coming from LLC cached, then we haven't
         * actually been tracking whether the data is in the
         * CPU cache or not, since we only allow one bit set
         * in obj->write_domain and have been skipping the
         * clflushes.
         * Just set it to the CPU cache for now.
         */
        WARN_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
        WARN_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);

        i915_gem_object_flush_gtt_write_domain(obj);

        obj->base.read_domains = I915_GEM_DOMAIN_CPU;
        obj->base.write_domain = I915_GEM_DOMAIN_CPU;
}

That should be a no-op and cause no greater impact than having to trigger
the clflushes. And keeps the domain tracking consistent.

However, considering the unflushed GTT bug, it does become much simpler...

> This looks like the critical part of the whole patch series so perhaps
> fold the follow-up patches in here, too (like fence teardown). This way
> there's just one spot that requires _really_ careful thinking.
> 
> Also, I haven't thought too hard about the uncached->cached transition on
> live objects, which is not (yet) required. Maybe some more careful
> handling of the gtt domain (mappings teardown) is needed for that.

Right, we've already identified that we have a bug here entirely due to
not flushing the GTT!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 18/30] drm/i915: Add an interface to dynamically change the cache level

Reply via email to