[Intel-gfx] [PATCH 05/18] drm/i915: context switch implementation
Implement the context switch code as well as the interfaces to do the context switch. This patch also doesn't match 1:1 with the RFC patches. The main difference is that from Daniel's responses the last context object is now stored instead of the last context. This aids in allows us to free the context data structure, and context object independently. There is room for optimization: this code will pin the context object until the next context is active. The optimal way to do it is to actually pin the object, move it to the active list, do the context switch, and then unpin it. This allows the eviction code to actually evict the context object if needed. The context switch code is missing workarounds, they will be implemented in future patches. Signed-off-by: Ben Widawsky --- drivers/gpu/drm/i915/i915_drv.h |5 ++ drivers/gpu/drm/i915/i915_gem_context.c | 118 ++- drivers/gpu/drm/i915/intel_ringbuffer.c | 26 +++ drivers/gpu/drm/i915/intel_ringbuffer.h |5 ++ 4 files changed, 153 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f458a8f..c6c2ada 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1303,10 +1303,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, enum i915_cache_level cache_level); /* i915_gem_context.c */ +#define I915_CONTEXT_FORCED_SWITCH (1<<0) +#define I915_CONTEXT_INITIAL_SWITCH (1<<1) void i915_gem_context_load(struct drm_device *dev); void i915_gem_context_unload(struct drm_device *dev); void i915_gem_context_open(struct drm_device *dev, struct drm_file *file); void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); +int i915_switch_context(struct intel_ring_buffer *ring, + struct drm_file *file, + int to_id, u32 seqno, u32 flags); /* i915_gem_gtt.c */ int __must_check i915_gem_init_aliasing_ppgtt(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 321bafd..cc508d5 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -262,7 +262,7 @@ void i915_gem_context_close(struct drm_device *dev, struct drm_file *file) mutex_unlock(&dev->struct_mutex); } -static __used struct i915_hw_context * +static struct i915_hw_context * i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id) { struct i915_hw_context *ctx = NULL; @@ -279,3 +279,119 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id) return ctx; } + +static int do_switch(struct drm_i915_gem_object *from_obj, +struct i915_hw_context *to, +u32 seqno, u32 flags) +{ + bool initial_switch = (flags & I915_CONTEXT_INITIAL_SWITCH) ? true : false; + bool force = (flags & I915_CONTEXT_FORCED_SWITCH) ? true : false; + struct intel_ring_buffer *ring = NULL; + u32 hw_flags = 0; + int ret; + + BUG_ON(to == NULL); + BUG_ON(from_obj != NULL && from_obj->pin_count == 0); + BUG_ON((from_obj != NULL && from_obj->context_id < 0) || to->obj->context_id < 0); + + ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false); + if (ret) + return ret; + + if (initial_switch) + hw_flags |= MI_RESTORE_INHIBIT; + if (force) + hw_flags |= MI_FORCE_RESTORE; + + ring = to->ring; + ret = intel_ring_mi_set_context(ring, to, hw_flags); + if (ret) { + i915_gem_object_unpin(to->obj); + return ret; + } + + /* The backing object for the context is done after switching to the +* *next* context. Therefore we cannot retire the previous context until +* the next context has already started running. In fact, the below code +* is a bit suboptimal because the retiring can occur simply after the +* MI_SET_CONTEXT instead of when the next seqno has completed. +*/ + if (from_obj != NULL) { + i915_gem_object_move_to_active(from_obj, ring, seqno); + /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the +* whole damn pipeline, we don't need to explicitly mark the +* object dirty. It should be safe to evict the object at any +* point after MI_SET_CONTEXT has finished executing... true as +* of GEN7. If not from_obj->dirty=1 would make this safer. +*/ + BUG_ON(from_obj->ring != to->ring); + } + + if (from_obj) + i915_gem_object_unpin(from_obj); + + ring->last_context_obj = to->obj; + + return 0; +} + +/** + * i915_switch_context() - perform a GPU context switch. + * @ring: ring for which we'll execute
Re: [Intel-gfx] [PATCH 05/18] drm/i915: context switch implementation
On Sun, Mar 18, 2012 at 01:39:45PM -0700, Ben Widawsky wrote: > Implement the context switch code as well as the interfaces to do the > context switch. This patch also doesn't match 1:1 with the RFC patches. > The main difference is that from Daniel's responses the last context > object is now stored instead of the last context. This aids in allows us > to free the context data structure, and context object independently. > > There is room for optimization: this code will pin the context object > until the next context is active. The optimal way to do it is to > actually pin the object, move it to the active list, do the context > switch, and then unpin it. This allows the eviction code to actually > evict the context object if needed. > > The context switch code is missing workarounds, they will be implemented > in future patches. > > Signed-off-by: Ben Widawsky Ok, I've looked at the use-sites of context_get and all this refcounting and noticed that: - we always hold dev->struct_mutex - we always drop the acquired reference to the context structure in the same function without dropping struct_mutex in between. So we don't seem to require any reference counting on these (and additional locking on the idr). Additionally the idr locking seems to give us a false sense of security because afaics the locking/refcounting would be broken when we would _not_ hold struct_mutex. So can we just rip this out or do we need this (in which case it needs some more work imo)? -Daniel > --- > drivers/gpu/drm/i915/i915_drv.h |5 ++ > drivers/gpu/drm/i915/i915_gem_context.c | 118 > ++- > drivers/gpu/drm/i915/intel_ringbuffer.c | 26 +++ > drivers/gpu/drm/i915/intel_ringbuffer.h |5 ++ > 4 files changed, 153 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index f458a8f..c6c2ada 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1303,10 +1303,15 @@ int i915_gem_object_set_cache_level(struct > drm_i915_gem_object *obj, > enum i915_cache_level cache_level); > > /* i915_gem_context.c */ > +#define I915_CONTEXT_FORCED_SWITCH (1<<0) > +#define I915_CONTEXT_INITIAL_SWITCH (1<<1) > void i915_gem_context_load(struct drm_device *dev); > void i915_gem_context_unload(struct drm_device *dev); > void i915_gem_context_open(struct drm_device *dev, struct drm_file *file); > void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); > +int i915_switch_context(struct intel_ring_buffer *ring, > + struct drm_file *file, > + int to_id, u32 seqno, u32 flags); > > /* i915_gem_gtt.c */ > int __must_check i915_gem_init_aliasing_ppgtt(struct drm_device *dev); > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c > b/drivers/gpu/drm/i915/i915_gem_context.c > index 321bafd..cc508d5 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -262,7 +262,7 @@ void i915_gem_context_close(struct drm_device *dev, > struct drm_file *file) > mutex_unlock(&dev->struct_mutex); > } > > -static __used struct i915_hw_context * > +static struct i915_hw_context * > i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id) > { > struct i915_hw_context *ctx = NULL; > @@ -279,3 +279,119 @@ i915_gem_context_get(struct drm_i915_file_private > *file_priv, u32 id) > > return ctx; > } > + > +static int do_switch(struct drm_i915_gem_object *from_obj, > + struct i915_hw_context *to, > + u32 seqno, u32 flags) > +{ > + bool initial_switch = (flags & I915_CONTEXT_INITIAL_SWITCH) ? true : > false; > + bool force = (flags & I915_CONTEXT_FORCED_SWITCH) ? true : false; > + struct intel_ring_buffer *ring = NULL; > + u32 hw_flags = 0; > + int ret; > + > + BUG_ON(to == NULL); > + BUG_ON(from_obj != NULL && from_obj->pin_count == 0); > + BUG_ON((from_obj != NULL && from_obj->context_id < 0) || > to->obj->context_id < 0); > + > + ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false); > + if (ret) > + return ret; > + > + if (initial_switch) > + hw_flags |= MI_RESTORE_INHIBIT; > + if (force) > + hw_flags |= MI_FORCE_RESTORE; > + > + ring = to->ring; > + ret = intel_ring_mi_set_context(ring, to, hw_flags); > + if (ret) { > + i915_gem_object_unpin(to->obj); > + return ret; > + } > + > + /* The backing object for the context is done after switching to the > + * *next* context. Therefore we cannot retire the previous context until > + * the next context has already started running. In fact, the below code > + * is a bit suboptimal because the retiring can occur simply after the > + * MI_SET_CONTEXT instead of when the next seqno has completed. > +
Re: [Intel-gfx] [PATCH 05/18] drm/i915: context switch implementation
On Thu, 29 Mar 2012 20:24:11 +0200 Daniel Vetter wrote: > On Sun, Mar 18, 2012 at 01:39:45PM -0700, Ben Widawsky wrote: > > Implement the context switch code as well as the interfaces to do the > > context switch. This patch also doesn't match 1:1 with the RFC patches. > > The main difference is that from Daniel's responses the last context > > object is now stored instead of the last context. This aids in allows us > > to free the context data structure, and context object independently. > > > > There is room for optimization: this code will pin the context object > > until the next context is active. The optimal way to do it is to > > actually pin the object, move it to the active list, do the context > > switch, and then unpin it. This allows the eviction code to actually > > evict the context object if needed. > > > > The context switch code is missing workarounds, they will be implemented > > in future patches. > > > > Signed-off-by: Ben Widawsky > > Ok, I've looked at the use-sites of context_get and all this refcounting > and noticed that: > - we always hold dev->struct_mutex > - we always drop the acquired reference to the context structure in the > same function without dropping struct_mutex in between. > > So we don't seem to require any reference counting on these (and > additional locking on the idr). Additionally the idr locking seems to give > us a false sense of security because afaics the locking/refcounting would > be broken when we would _not_ hold struct_mutex. > > So can we just rip this out or do we need this (in which case it needs > some more work imo)? > -Daniel I think it can be ripped out. I was on the fence about this before submitting the patches and left it out of laziness; it doesn't hurt as there is no lock contention assuming your statement is true with no bugs in the code, and it follows the canonical use of idrs. Let me look it over some more to make sure after you finish reviewing the other stuff. The idea was supposed to be contexts can be created and looked up without struct mutex, but that isn't actually done in the code. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 05/18] drm/i915: context switch implementation
On Sun, Mar 18, 2012 at 01:39:45PM -0700, Ben Widawsky wrote: > Implement the context switch code as well as the interfaces to do the > context switch. This patch also doesn't match 1:1 with the RFC patches. > The main difference is that from Daniel's responses the last context > object is now stored instead of the last context. This aids in allows us > to free the context data structure, and context object independently. > > There is room for optimization: this code will pin the context object > until the next context is active. The optimal way to do it is to > actually pin the object, move it to the active list, do the context > switch, and then unpin it. This allows the eviction code to actually > evict the context object if needed. > > The context switch code is missing workarounds, they will be implemented > in future patches. > > Signed-off-by: Ben Widawsky Meh, I've forgotten half of the review, some more comments below. > --- > drivers/gpu/drm/i915/i915_drv.h |5 ++ > drivers/gpu/drm/i915/i915_gem_context.c | 118 > ++- > drivers/gpu/drm/i915/intel_ringbuffer.c | 26 +++ > drivers/gpu/drm/i915/intel_ringbuffer.h |5 ++ > 4 files changed, 153 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index f458a8f..c6c2ada 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1303,10 +1303,15 @@ int i915_gem_object_set_cache_level(struct > drm_i915_gem_object *obj, > enum i915_cache_level cache_level); > > /* i915_gem_context.c */ > +#define I915_CONTEXT_FORCED_SWITCH (1<<0) > +#define I915_CONTEXT_INITIAL_SWITCH (1<<1) > void i915_gem_context_load(struct drm_device *dev); > void i915_gem_context_unload(struct drm_device *dev); > void i915_gem_context_open(struct drm_device *dev, struct drm_file *file); > void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); > +int i915_switch_context(struct intel_ring_buffer *ring, > + struct drm_file *file, > + int to_id, u32 seqno, u32 flags); > > /* i915_gem_gtt.c */ > int __must_check i915_gem_init_aliasing_ppgtt(struct drm_device *dev); > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c > b/drivers/gpu/drm/i915/i915_gem_context.c > index 321bafd..cc508d5 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -262,7 +262,7 @@ void i915_gem_context_close(struct drm_device *dev, > struct drm_file *file) > mutex_unlock(&dev->struct_mutex); > } > > -static __used struct i915_hw_context * > +static struct i915_hw_context * > i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id) > { > struct i915_hw_context *ctx = NULL; > @@ -279,3 +279,119 @@ i915_gem_context_get(struct drm_i915_file_private > *file_priv, u32 id) > > return ctx; > } > + > +static int do_switch(struct drm_i915_gem_object *from_obj, > + struct i915_hw_context *to, > + u32 seqno, u32 flags) > +{ > + bool initial_switch = (flags & I915_CONTEXT_INITIAL_SWITCH) ? true : > false; > + bool force = (flags & I915_CONTEXT_FORCED_SWITCH) ? true : false; I don't like this flag passing. force seems to be totally unused. And imo inital_switch would work much better if we track this in i915_hw_context->intialized (which will be false on context create) and check this when before doing the context switch. That way you'd also lose the complexity of having to initialize a context explicitly after create - it would just magically happen. > + struct intel_ring_buffer *ring = NULL; > + u32 hw_flags = 0; > + int ret; > + > + BUG_ON(to == NULL); > + BUG_ON(from_obj != NULL && from_obj->pin_count == 0); > + BUG_ON((from_obj != NULL && from_obj->context_id < 0) || > to->obj->context_id < 0); > + > + ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false); > + if (ret) > + return ret; > + > + if (initial_switch) > + hw_flags |= MI_RESTORE_INHIBIT; > + if (force) > + hw_flags |= MI_FORCE_RESTORE; Please move this into the function that actually emits the MI_ command - it's imo hard to read things if it's distributed all over the place. And because we don't seem to need force, a simple (and destriptive) bool inhibit_restore would be sufficient. > + > + ring = to->ring; > + ret = intel_ring_mi_set_context(ring, to, hw_flags); > + if (ret) { > + i915_gem_object_unpin(to->obj); > + return ret; > + } > + > + /* The backing object for the context is done after switching to the > + * *next* context. Therefore we cannot retire the previous context until > + * the next context has already started running. In fact, the below code > + * is a bit suboptimal because the retiring can occur simply afte
Re: [Intel-gfx] [PATCH 05/18] drm/i915: context switch implementation
On Thu, Mar 29, 2012 at 11:43:12AM -0700, Ben Widawsky wrote: > On Thu, 29 Mar 2012 20:24:11 +0200 > Daniel Vetter wrote: > > > On Sun, Mar 18, 2012 at 01:39:45PM -0700, Ben Widawsky wrote: > > > Implement the context switch code as well as the interfaces to do the > > > context switch. This patch also doesn't match 1:1 with the RFC patches. > > > The main difference is that from Daniel's responses the last context > > > object is now stored instead of the last context. This aids in allows us > > > to free the context data structure, and context object independently. > > > > > > There is room for optimization: this code will pin the context object > > > until the next context is active. The optimal way to do it is to > > > actually pin the object, move it to the active list, do the context > > > switch, and then unpin it. This allows the eviction code to actually > > > evict the context object if needed. > > > > > > The context switch code is missing workarounds, they will be implemented > > > in future patches. > > > > > > Signed-off-by: Ben Widawsky > > > > Ok, I've looked at the use-sites of context_get and all this refcounting > > and noticed that: > > - we always hold dev->struct_mutex > > - we always drop the acquired reference to the context structure in the > > same function without dropping struct_mutex in between. > > > > So we don't seem to require any reference counting on these (and > > additional locking on the idr). Additionally the idr locking seems to give > > us a false sense of security because afaics the locking/refcounting would > > be broken when we would _not_ hold struct_mutex. > > > > So can we just rip this out or do we need this (in which case it needs > > some more work imo)? > > -Daniel > > I think it can be ripped out. I was on the fence about this before > submitting the patches and left it out of laziness; it doesn't hurt as > there is no lock contention assuming your statement is true with no > bugs in the code, and it follows the canonical use of idrs. > > Let me look it over some more to make sure after you finish reviewing > the other stuff. The idea was supposed to be contexts can be created > and looked up without struct mutex, but that isn't actually done in the > code. Well, if you want to leave it I have to add some more review comments about it - atm I think it would be buggy and racy without holding struct_mutex ... -Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 05/18] drm/i915: context switch implementation
On Thu, 29 Mar 2012 20:49:00 +0200 Daniel Vetter wrote: > On Thu, Mar 29, 2012 at 11:43:12AM -0700, Ben Widawsky wrote: > > On Thu, 29 Mar 2012 20:24:11 +0200 > > Daniel Vetter wrote: > > > > > On Sun, Mar 18, 2012 at 01:39:45PM -0700, Ben Widawsky wrote: > > > > Implement the context switch code as well as the interfaces to do the > > > > context switch. This patch also doesn't match 1:1 with the RFC patches. > > > > The main difference is that from Daniel's responses the last context > > > > object is now stored instead of the last context. This aids in allows us > > > > to free the context data structure, and context object independently. > > > > > > > > There is room for optimization: this code will pin the context object > > > > until the next context is active. The optimal way to do it is to > > > > actually pin the object, move it to the active list, do the context > > > > switch, and then unpin it. This allows the eviction code to actually > > > > evict the context object if needed. > > > > > > > > The context switch code is missing workarounds, they will be implemented > > > > in future patches. > > > > > > > > Signed-off-by: Ben Widawsky > > > > > > Ok, I've looked at the use-sites of context_get and all this refcounting > > > and noticed that: > > > - we always hold dev->struct_mutex > > > - we always drop the acquired reference to the context structure in the > > > same function without dropping struct_mutex in between. > > > > > > So we don't seem to require any reference counting on these (and > > > additional locking on the idr). Additionally the idr locking seems to give > > > us a false sense of security because afaics the locking/refcounting would > > > be broken when we would _not_ hold struct_mutex. > > > > > > So can we just rip this out or do we need this (in which case it needs > > > some more work imo)? > > > -Daniel > > > > I think it can be ripped out. I was on the fence about this before > > submitting the patches and left it out of laziness; it doesn't hurt as > > there is no lock contention assuming your statement is true with no > > bugs in the code, and it follows the canonical use of idrs. > > > > Let me look it over some more to make sure after you finish reviewing > > the other stuff. The idea was supposed to be contexts can be created > > and looked up without struct mutex, but that isn't actually done in the > > code. > > Well, if you want to leave it I have to add some more review comments > about it - atm I think it would be buggy and racy without holding > struct_mutex ... > -Daniel As I said before, I'll either find a good use for it, or remove it. Context creation is the only currently viable case for it- but probably not worth the extra lock. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx