Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted

2018-02-08 Thread Chris Wilson
Quoting Mika Kuoppala (2018-02-08 11:43:50)
> Chris Wilson  writes:
> 
> > After we assert the reset request (and wait for 20us), when the device
> > has been fully reset it asserts the reset-status bit. Before we stop
> > requesting the reset and allow the device to return to normal, we should
> > wait for the reset to be completed. (Similar to how we wait for the
> > device to return to normal after deasserting the reset request.)
> >
> > v2: Rename i915_reset_completed() probe to not cause as much confusion.
> >
> > Signed-off-by: Chris Wilson 
> > Cc: Ville Syrjälä 
> > ---
> >  drivers/gpu/drm/i915/intel_uncore.c | 15 +++
> >  1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
> > b/drivers/gpu/drm/i915/intel_uncore.c
> > index 0fd59dd6bbb5..e09981a3113c 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -1550,24 +1550,31 @@ static void i915_stop_engines(struct 
> > drm_i915_private *dev_priv,
> >   gen3_stop_engine(engine);
> >  }
> >  
> > -static bool i915_reset_complete(struct pci_dev *pdev)
> > +static bool i915_in_reset(struct pci_dev *pdev)
> >  {
> >   u8 gdrst;
> >  
> >   pci_read_config_byte(pdev, I915_GDRST, );
> > - return (gdrst & GRDOM_RESET_STATUS) == 0;
> > + return gdrst & GRDOM_RESET_STATUS;
> >  }
> >  
> >  static int i915_do_reset(struct drm_i915_private *dev_priv, unsigned 
> > engine_mask)
> >  {
> >   struct pci_dev *pdev = dev_priv->drm.pdev;
> > + int err;
> >  
> > - /* assert reset for at least 20 usec */
> > + /* Assert reset for at least 20 usec, and wait for acknowledgement. */
> >   pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
> >   usleep_range(50, 200);
> > + err = wait_for(i915_in_reset(pdev), 500);
> > +
> > + /* Clear the reset request. */
> >   pci_write_config_byte(pdev, I915_GDRST, 0);
> > + usleep_range(50, 200);
> > + if (!err)
> > + err = wait_for(!i915_in_reset(pdev), 500);
> >
> 
> Regardless of if this helps or not with pnv, it makes the
> both phases clear.
> 
> Any thoughts of starting to log the reset attempts
> with timeout, even if the subsequent reset succeeds?
> 
> Patch is,
> Reviewed-by: Mika Kuoppala 

Although I don't expect this pair to help CI, it feels justifiable
paranoia. Thanks for the review,
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted

2018-02-08 Thread Chris Wilson
Quoting Mika Kuoppala (2018-02-08 12:34:08)
> Chris Wilson  writes:
> 
> > Quoting Mika Kuoppala (2018-02-08 11:43:50)
> >> Any thoughts of starting to log the reset attempts
> >> with timeout, even if the subsequent reset succeeds?
> >
> > If it succeeds, do we care? Capturing why it fails, sure.
> >
> > The question being what do we want to gain from it? Faster reset by
> > removing timeout loops -- but if it does take X attempts, we can't
> > really make it faster, just swap out one delay for another?
> >
> > It's a challenge, trying to provide the right information to solve a
> > user's problem without their intervention and without any burden.
> > Easier when you are chasing a problem down to know what you need. And
> > likely need again in future?
> 
> I was thinking of gauging the rest robustness. To check if
> the level we are at, through CI. I would keep the timeouts and would
> keep retries. 
> 
> Mainly the intent would be to find a pattern. Like if on some platform,
> after test x, the first reset always timeouts would be a sign
> to further robustify.

But unless you automate such pattern finding, it will be lost and just
annoy the next person trying to extract signal from the noise :)

If you can think of it, write a test for it and let it run for a few
months in CI.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted

2018-02-08 Thread Mika Kuoppala
Chris Wilson  writes:

> Quoting Mika Kuoppala (2018-02-08 11:43:50)
>> Any thoughts of starting to log the reset attempts
>> with timeout, even if the subsequent reset succeeds?
>
> If it succeeds, do we care? Capturing why it fails, sure.
>
> The question being what do we want to gain from it? Faster reset by
> removing timeout loops -- but if it does take X attempts, we can't
> really make it faster, just swap out one delay for another?
>
> It's a challenge, trying to provide the right information to solve a
> user's problem without their intervention and without any burden.
> Easier when you are chasing a problem down to know what you need. And
> likely need again in future?

I was thinking of gauging the rest robustness. To check if
the level we are at, through CI. I would keep the timeouts and would
keep retries. 

Mainly the intent would be to find a pattern. Like if on some platform,
after test x, the first reset always timeouts would be a sign
to further robustify.

-Mika
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted

2018-02-08 Thread Chris Wilson
Quoting Mika Kuoppala (2018-02-08 11:43:50)
> Any thoughts of starting to log the reset attempts
> with timeout, even if the subsequent reset succeeds?

If it succeeds, do we care? Capturing why it fails, sure.

The question being what do we want to gain from it? Faster reset by
removing timeout loops -- but if it does take X attempts, we can't
really make it faster, just swap out one delay for another?

It's a challenge, trying to provide the right information to solve a
user's problem without their intervention and without any burden.
Easier when you are chasing a problem down to know what you need. And
likely need again in future?
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted

2018-02-08 Thread Mika Kuoppala
Chris Wilson  writes:

> After we assert the reset request (and wait for 20us), when the device
> has been fully reset it asserts the reset-status bit. Before we stop
> requesting the reset and allow the device to return to normal, we should
> wait for the reset to be completed. (Similar to how we wait for the
> device to return to normal after deasserting the reset request.)
>
> v2: Rename i915_reset_completed() probe to not cause as much confusion.
>
> Signed-off-by: Chris Wilson 
> Cc: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
> b/drivers/gpu/drm/i915/intel_uncore.c
> index 0fd59dd6bbb5..e09981a3113c 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1550,24 +1550,31 @@ static void i915_stop_engines(struct drm_i915_private 
> *dev_priv,
>   gen3_stop_engine(engine);
>  }
>  
> -static bool i915_reset_complete(struct pci_dev *pdev)
> +static bool i915_in_reset(struct pci_dev *pdev)
>  {
>   u8 gdrst;
>  
>   pci_read_config_byte(pdev, I915_GDRST, );
> - return (gdrst & GRDOM_RESET_STATUS) == 0;
> + return gdrst & GRDOM_RESET_STATUS;
>  }
>  
>  static int i915_do_reset(struct drm_i915_private *dev_priv, unsigned 
> engine_mask)
>  {
>   struct pci_dev *pdev = dev_priv->drm.pdev;
> + int err;
>  
> - /* assert reset for at least 20 usec */
> + /* Assert reset for at least 20 usec, and wait for acknowledgement. */
>   pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
>   usleep_range(50, 200);
> + err = wait_for(i915_in_reset(pdev), 500);
> +
> + /* Clear the reset request. */
>   pci_write_config_byte(pdev, I915_GDRST, 0);
> + usleep_range(50, 200);
> + if (!err)
> + err = wait_for(!i915_in_reset(pdev), 500);
>

Regardless of if this helps or not with pnv, it makes the
both phases clear.

Any thoughts of starting to log the reset attempts
with timeout, even if the subsequent reset succeeds?

Patch is,
Reviewed-by: Mika Kuoppala 

> - return wait_for(i915_reset_complete(pdev), 500);
> + return err;
>  }
>  
>  static bool g4x_reset_complete(struct pci_dev *pdev)
> -- 
> 2.16.1
>
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted

2018-02-07 Thread Chris Wilson
After we assert the reset request (and wait for 20us), when the device
has been fully reset it asserts the reset-status bit. Before we stop
requesting the reset and allow the device to return to normal, we should
wait for the reset to be completed. (Similar to how we wait for the
device to return to normal after deasserting the reset request.)

v2: Rename i915_reset_completed() probe to not cause as much confusion.

Signed-off-by: Chris Wilson 
Cc: Ville Syrjälä 
---
 drivers/gpu/drm/i915/intel_uncore.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 0fd59dd6bbb5..e09981a3113c 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1550,24 +1550,31 @@ static void i915_stop_engines(struct drm_i915_private 
*dev_priv,
gen3_stop_engine(engine);
 }
 
-static bool i915_reset_complete(struct pci_dev *pdev)
+static bool i915_in_reset(struct pci_dev *pdev)
 {
u8 gdrst;
 
pci_read_config_byte(pdev, I915_GDRST, );
-   return (gdrst & GRDOM_RESET_STATUS) == 0;
+   return gdrst & GRDOM_RESET_STATUS;
 }
 
 static int i915_do_reset(struct drm_i915_private *dev_priv, unsigned 
engine_mask)
 {
struct pci_dev *pdev = dev_priv->drm.pdev;
+   int err;
 
-   /* assert reset for at least 20 usec */
+   /* Assert reset for at least 20 usec, and wait for acknowledgement. */
pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
usleep_range(50, 200);
+   err = wait_for(i915_in_reset(pdev), 500);
+
+   /* Clear the reset request. */
pci_write_config_byte(pdev, I915_GDRST, 0);
+   usleep_range(50, 200);
+   if (!err)
+   err = wait_for(!i915_in_reset(pdev), 500);
 
-   return wait_for(i915_reset_complete(pdev), 500);
+   return err;
 }
 
 static bool g4x_reset_complete(struct pci_dev *pdev)
-- 
2.16.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx