Re: [Intel-gfx] [PATCH 12/18] drm/i915: try to reset the gpu before unload

2012-03-30 Thread Daniel Vetter
On Fri, Mar 30, 2012 at 11:50:42AM -0700, Ben Widawsky wrote:
> On Thu, 29 Mar 2012 21:31:21 +0200
> Daniel Vetter  wrote:
> 
> > On Sun, Mar 18, 2012 at 01:39:52PM -0700, Ben Widawsky wrote:
> > > paranoia
> > > 
> > > For context support the HW expects the default context to always be
> > > available as there is no way to shut off HW contexts once turned on
> > > (afaics). This is problematic when unloading the driver as we have no
> > > way to prevent the GPU from expecting the BO to still be present once
> > > unloaded.
> > > 
> > > The best we can do to remedy the situation is to attempt a GPU reset
> > > when doing the unload.
> > > 
> > > NOTE: this patch isn't *really* required to go with the rest of the
> > > context serious.
> > > 
> > > Signed-off-by: Ben Widawsky 
> > 
> > I think the paranoia here is justified (albeit it would benefit from some
> > commit-message love imo). But we do not support i915_reset on all gens, so
> > I think you need to add a gen >= 5 check here.
> 
> I think i915_reset does the right thing, but I'm not sure. It has a big
> gen switch statement in it.

Yeah, it just returns. But on reading i915_reset I think you don't want
all it does - it resets the entire modeset and gem state, too. I think it
would be better to just do the hw reset and not bother with everything
else - the only chance is that it accidentally breaks module unload.

What about extracting the actual hw reset code (i.e. the switch(gen)
stuff) into i915_do_reset and only calling that? If you want, add the
is_gen5 check, but given that it's just module unload I don't care about
fried hw due to a not-so-well-working gpu reset ;-)
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 12/18] drm/i915: try to reset the gpu before unload

2012-03-30 Thread Ben Widawsky
On Thu, 29 Mar 2012 21:31:21 +0200
Daniel Vetter  wrote:

> On Sun, Mar 18, 2012 at 01:39:52PM -0700, Ben Widawsky wrote:
> > paranoia
> > 
> > For context support the HW expects the default context to always be
> > available as there is no way to shut off HW contexts once turned on
> > (afaics). This is problematic when unloading the driver as we have no
> > way to prevent the GPU from expecting the BO to still be present once
> > unloaded.
> > 
> > The best we can do to remedy the situation is to attempt a GPU reset
> > when doing the unload.
> > 
> > NOTE: this patch isn't *really* required to go with the rest of the
> > context serious.
> > 
> > Signed-off-by: Ben Widawsky 
> 
> I think the paranoia here is justified (albeit it would benefit from some
> commit-message love imo). But we do not support i915_reset on all gens, so
> I think you need to add a gen >= 5 check here.

I think i915_reset does the right thing, but I'm not sure. It has a big
gen switch statement in it.

> 
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c |3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index c1aab45..848cc45 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3910,6 +3910,9 @@ i915_gem_lastclose(struct drm_device *dev)
> > ret = i915_gem_idle(dev);
> > if (ret)
> > DRM_ERROR("failed to idle hardware: %d\n", ret);
> > +   ret = i915_reset(dev, GRDOM_FULL);
> > +   if (ret)
> > +   DRM_ERROR("failed to reset gpu: %d\n", ret);
> >  }
> >  
> >  static void
> > -- 
> > 1.7.9.4
> > 
> > ___
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 12/18] drm/i915: try to reset the gpu before unload

2012-03-30 Thread Daniel Vetter
On Fri, Mar 30, 2012 at 09:54:31AM -0700, Jesse Barnes wrote:
> On Sun, 18 Mar 2012 13:39:52 -0700
> Ben Widawsky  wrote:
> 
> > paranoia
> > 
> > For context support the HW expects the default context to always be
> > available as there is no way to shut off HW contexts once turned on
> > (afaics). This is problematic when unloading the driver as we have no
> > way to prevent the GPU from expecting the BO to still be present once
> > unloaded.
> > 
> > The best we can do to remedy the situation is to attempt a GPU reset
> > when doing the unload.
> > 
> > NOTE: this patch isn't *really* required to go with the rest of the
> > context serious.
> > 
> > Signed-off-by: Ben Widawsky 
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c |3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index c1aab45..848cc45 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3910,6 +3910,9 @@ i915_gem_lastclose(struct drm_device *dev)
> > ret = i915_gem_idle(dev);
> > if (ret)
> > DRM_ERROR("failed to idle hardware: %d\n", ret);
> > +   ret = i915_reset(dev, GRDOM_FULL);
> > +   if (ret)
> > +   DRM_ERROR("failed to reset gpu: %d\n", ret);
> >  }
> >  
> >  static void
> 
> This reminds me that we should make our reset finer grained.  We
> generally only need to reset the render engine when we detect an error,
> but today we reset display and media unconditionally too.  Not
> resetting the display would make reset more invisible like it ought to
> be...
> 
> I haven't tested it, but it's yet another thing for our TODO list.

I think with the pch split stuff we don't already reset display. And in
any cases the global gtt seems to just keep going. The real problem is
that you will notice the reset anyway, because we tend to wait a few
seconds before we declare the gpu dead.

For that we also need some improvements (and better culprit detection so
that we can start to reset earlier). That one is on my todo ;-)
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 12/18] drm/i915: try to reset the gpu before unload

2012-03-30 Thread Jesse Barnes
On Sun, 18 Mar 2012 13:39:52 -0700
Ben Widawsky  wrote:

> paranoia
> 
> For context support the HW expects the default context to always be
> available as there is no way to shut off HW contexts once turned on
> (afaics). This is problematic when unloading the driver as we have no
> way to prevent the GPU from expecting the BO to still be present once
> unloaded.
> 
> The best we can do to remedy the situation is to attempt a GPU reset
> when doing the unload.
> 
> NOTE: this patch isn't *really* required to go with the rest of the
> context serious.
> 
> Signed-off-by: Ben Widawsky 
> ---
>  drivers/gpu/drm/i915/i915_gem.c |3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c1aab45..848cc45 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3910,6 +3910,9 @@ i915_gem_lastclose(struct drm_device *dev)
>   ret = i915_gem_idle(dev);
>   if (ret)
>   DRM_ERROR("failed to idle hardware: %d\n", ret);
> + ret = i915_reset(dev, GRDOM_FULL);
> + if (ret)
> + DRM_ERROR("failed to reset gpu: %d\n", ret);
>  }
>  
>  static void

This reminds me that we should make our reset finer grained.  We
generally only need to reset the render engine when we detect an error,
but today we reset display and media unconditionally too.  Not
resetting the display would make reset more invisible like it ought to
be...

I haven't tested it, but it's yet another thing for our TODO list.

-- 
Jesse Barnes, Intel Open Source Technology Center


signature.asc
Description: PGP signature
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 12/18] drm/i915: try to reset the gpu before unload

2012-03-29 Thread Daniel Vetter
On Sun, Mar 18, 2012 at 01:39:52PM -0700, Ben Widawsky wrote:
> paranoia
> 
> For context support the HW expects the default context to always be
> available as there is no way to shut off HW contexts once turned on
> (afaics). This is problematic when unloading the driver as we have no
> way to prevent the GPU from expecting the BO to still be present once
> unloaded.
> 
> The best we can do to remedy the situation is to attempt a GPU reset
> when doing the unload.
> 
> NOTE: this patch isn't *really* required to go with the rest of the
> context serious.
> 
> Signed-off-by: Ben Widawsky 

I think the paranoia here is justified (albeit it would benefit from some
commit-message love imo). But we do not support i915_reset on all gens, so
I think you need to add a gen >= 5 check here.

> ---
>  drivers/gpu/drm/i915/i915_gem.c |3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c1aab45..848cc45 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3910,6 +3910,9 @@ i915_gem_lastclose(struct drm_device *dev)
>   ret = i915_gem_idle(dev);
>   if (ret)
>   DRM_ERROR("failed to idle hardware: %d\n", ret);
> + ret = i915_reset(dev, GRDOM_FULL);
> + if (ret)
> + DRM_ERROR("failed to reset gpu: %d\n", ret);
>  }
>  
>  static void
> -- 
> 1.7.9.4
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx