[PATCH 7/8] drm/i915: Fix random aux transactions failures.

2015-11-23 Thread Jani Nikula

I don't see how the subject matches the commit.

On Sat, 21 Nov 2015, Rodrigo Vivi  wrote:
> These read-wake retries were initially added by 2 commits:
>
> commit 61da5fab ("drm/i915/dp: retry link status read 3 times on failure")
> commit 899526d9 ("drm/i915/dp: try to read receiver capabilities 3 times when detecting")
>
> Both mention section 9.1 of the DisplayPort 1.1a spec, which actually
> tells us to retry three times in a certain case, when "writing 01h to
> DPCD address 600h", and that code is already in place in our driver. It
> was added by:
>
> commit c7ad3810 ("drm/i915/dp: manage sink power state if possible")

I still think what we currently do for the sink power state management
works by coincidence. We should still look into it.

However, I think this series overall (apart from patch 6/8 which really
is a bummer, the comment inline below, and the minor other comments)
looks like worthwhile changes. We can leave the power state management
for later. Or rip it out for now...

> At this point we have no visibility into whether those patches were added
> to work around certain corner cases like lazy dongles or something else,
> but at that time there also weren't enough retries in the proper places.
>
> So my proposal is to remove these retries, now that we have drm handling
> the retries, and if we hit any corner case again we study it and return
> EAGAIN or EBUSY to force retries at drm instead of handling them here.
>
> v2: Improve commit message trying to explain the origin of the retries.
>
> Cc: Daniel Vetter 
> Cc: Jani Nikula 
> Cc: Jesse Barnes 
> Signed-off-by: Rodrigo Vivi 
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 95 ++---
>  1 file changed, 32 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index c87e937..2ce6527 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -985,7 +985,8 @@ intel_dp_aux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>   if (WARN_ON(rxsize > 20))
>   return -E2BIG;
>  
> - ret = intel_dp_aux_ch(intel_dp, txbuf, txsize, rxbuf, rxsize);
> + ret = intel_dp_aux_ch(intel_dp, txbuf, txsize,
> +   rxbuf, rxsize);
>   if (ret > 0) {
>   msg->reply = rxbuf[0] >> 4;
>   /*
> @@ -3150,47 +3151,16 @@ static void chv_dp_post_pll_disable(struct intel_encoder *encoder)
>  }
>  
>  /*
> - * Native read with retry for link status and receiver capability reads for
> - * cases where the sink may still be asleep.
> - *
> - * Sinks are *supposed* to come up within 1ms from an off state, but we're also
> - * supposed to retry 3 times per the spec.
> - */
> -static ssize_t
> -intel_dp_dpcd_read_wake(struct drm_dp_aux *aux, unsigned int offset,
> - void *buffer, size_t size)
> -{
> - ssize_t ret;
> - int i;
> -
> - /*
> -  * Sometime we just get the same incorrect byte repeated
> -  * over the entire buffer. Doing just one throw away read
> -  * initially seems to "solve" it.
> -  */
> - drm_dp_dpcd_read(aux, DP_DPCD_REV, buffer, 1);

This still needs to be addressed somehow. Maybe it's sufficient for
Ville to test with his monitor?

commit f6a1906674005377b64ee5431c1418077c1b2425
Author: Ville Syrjälä 
Date:   Thu Oct 16 20:46:09 2014 +0300

drm/i915: Do a dummy DPCD read before the actual read

> -
> - for (i = 0; i < 3; i++) {
> - ret = drm_dp_dpcd_read(aux, offset, buffer, size);
> - if (ret == size)
> - return ret;
> - msleep(1);
> - }
> -
> - return ret;
> -}
> -
> -/*
>   * Fetch AUX CH registers 0x202 - 0x207 which contain
>   * link status information
>   */
>  bool
>  intel_dp_get_link_status(struct intel_dp *intel_dp, uint8_t link_status[DP_LINK_STATUS_SIZE])
>  {
> - return intel_dp_dpcd_read_wake(&intel_dp->aux,
> -DP_LANE0_1_STATUS,
> -link_status,
> -DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
> + return drm_dp_dpcd_read(&intel_dp->aux,
> + DP_LANE0_1_STATUS,
> + link_status,
> + DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
>  }
>  
>  /* These are source-specific values. */
> @@ -3825,8 +3795,8 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
>   struct drm_i915_private *dev_priv = dev->dev_private;
>   uint8_t rev;
>  
> - if (intel_dp_dpcd_read_wake(&intel_dp->aux, 0x000, intel_dp->dpcd,
> - sizeof(intel_dp->dpcd)) < 0)
> + if (drm_dp_dpcd_read(&intel_dp->aux, 0x000, intel_dp->dpcd,
> +  sizeof(intel_dp->dpcd)) < 0)
>   return false; /* aux transfer faile

[PATCH 7/8] drm/i915: Fix random aux transactions failures.

2015-11-23 Thread Jani Nikula
On Sat, 21 Nov 2015, Rodrigo Vivi  wrote:
> Mainly aux communications on sink_crc
> were failing a lot randomly on recent platforms.
> The first solution was to try to use intel_dp_dpcd_read_wake, but then
> it was suggested to move retries to drm level.
>
> Since drm level was already taking care of retries and we didn't want
> to throw random retries at that level, the second solution was to
> put the retries at the aux_transfer layer, which was nacked.
>
> So I realized we had so many retries in different places and
> started to organize that a bit. During this organization I noticed
> that we weren't handling at all the case where the message size was
> zeroed. And this was exactly the case that was affecting sink_crc.
>
> Also we weren't respecting BSpec, which says message sizes of 0 or > 20
> are forbidden.
>
> It is a fact that we still have no clue why we are getting this
> forbidden value there. But anyway we need to handle that for now
> so we return -EBUSY and drm level takes care of the retries that
> are already in place.
>
> Cc: Jani Nikula 
> Cc: Daniel Vetter 
> Signed-off-by: Rodrigo Vivi 
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 35048d6..c87e937 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -905,6 +905,17 @@ done:
>   /* Unload any bytes sent back from the other side */
>   recv_bytes = ((status & DP_AUX_CH_CTL_MESSAGE_SIZE_MASK) >>
> DP_AUX_CH_CTL_MESSAGE_SIZE_SHIFT);
> +
> + /*
> +  * By BSpec: "Message sizes of 0 or >20 are not allowed."
> +  * We have no idea of what happened so we return -EBUSY so
> +  * drm layer takes care for the necessary retries.
> +  */
> + if (recv_bytes == 0 || recv_bytes > 20) {
> + ret = -EBUSY;

This deserves debug logging at the very least.
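
A minimal sketch of what such logging could look like, written as plain
userspace C so it can be compiled on its own. The helper name
aux_validate_size_logged() is hypothetical, and fprintf() stands in for
whatever debug macro the driver would actually use (for example the
DRM_DEBUG_KMS seen elsewhere in this file):

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: rejects the message sizes
 * the patch treats as forbidden (0 or > 20) and logs the offending value
 * before returning -EBUSY. fprintf() is a stand-in for a driver debug macro.
 */
static int aux_validate_size_logged(unsigned int recv_bytes, FILE *log)
{
	if (recv_bytes == 0 || recv_bytes > 20) {
		fprintf(log, "Forbidden AUX message size %u, returning -EBUSY\n",
			recv_bytes);
		return -EBUSY;
	}

	return (int)recv_bytes;
}
```

With something like this, each rejected transfer leaves a trace of the
size that was actually reported, which may help narrow down why the
forbidden value shows up.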

BR,
Jani.


> + goto out;
> + }
> +
>   if (recv_bytes > recv_size)
>   recv_bytes = recv_size;

-- 
Jani Nikula, Intel Open Source Technology Center


[PATCH 7/8] drm/i915: Fix random aux transactions failures.

2015-11-20 Thread Rodrigo Vivi
These read-wake retries were initially added by 2 commits:

commit 61da5fab ("drm/i915/dp: retry link status read 3 times on failure")
commit 899526d9 ("drm/i915/dp: try to read receiver capabilities 3 times when detecting")

Both mention section 9.1 of the DisplayPort 1.1a spec, which actually
tells us to retry three times in a certain case, when "writing 01h to
DPCD address 600h", and that code is already in place in our driver. It
was added by:

commit c7ad3810 ("drm/i915/dp: manage sink power state if possible")

At this point we have no visibility into whether those patches were added
to work around certain corner cases like lazy dongles or something else,
but at that time there also weren't enough retries in the proper places.

So my proposal is to remove these retries, now that we have drm handling
the retries, and if we hit any corner case again we study it and return
EAGAIN or EBUSY to force retries at drm instead of handling them here.

v2: Improve commit message trying to explain the origin of the retries.

Cc: Daniel Vetter 
Cc: Jani Nikula 
Cc: Jesse Barnes 
Signed-off-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/intel_dp.c | 95 ++---
 1 file changed, 32 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index c87e937..2ce6527 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -985,7 +985,8 @@ intel_dp_aux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
if (WARN_ON(rxsize > 20))
return -E2BIG;

-   ret = intel_dp_aux_ch(intel_dp, txbuf, txsize, rxbuf, rxsize);
+   ret = intel_dp_aux_ch(intel_dp, txbuf, txsize,
+ rxbuf, rxsize);
if (ret > 0) {
msg->reply = rxbuf[0] >> 4;
/*
@@ -3150,47 +3151,16 @@ static void chv_dp_post_pll_disable(struct intel_encoder *encoder)
 }

 /*
- * Native read with retry for link status and receiver capability reads for
- * cases where the sink may still be asleep.
- *
- * Sinks are *supposed* to come up within 1ms from an off state, but we're also
- * supposed to retry 3 times per the spec.
- */
-static ssize_t
-intel_dp_dpcd_read_wake(struct drm_dp_aux *aux, unsigned int offset,
-   void *buffer, size_t size)
-{
-   ssize_t ret;
-   int i;
-
-   /*
-* Sometime we just get the same incorrect byte repeated
-* over the entire buffer. Doing just one throw away read
-* initially seems to "solve" it.
-*/
-   drm_dp_dpcd_read(aux, DP_DPCD_REV, buffer, 1);
-
-   for (i = 0; i < 3; i++) {
-   ret = drm_dp_dpcd_read(aux, offset, buffer, size);
-   if (ret == size)
-   return ret;
-   msleep(1);
-   }
-
-   return ret;
-}
-
-/*
  * Fetch AUX CH registers 0x202 - 0x207 which contain
  * link status information
  */
 bool
 intel_dp_get_link_status(struct intel_dp *intel_dp, uint8_t link_status[DP_LINK_STATUS_SIZE])
 {
-   return intel_dp_dpcd_read_wake(&intel_dp->aux,
-  DP_LANE0_1_STATUS,
-  link_status,
-  DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
+   return drm_dp_dpcd_read(&intel_dp->aux,
+   DP_LANE0_1_STATUS,
+   link_status,
+   DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
 }

 /* These are source-specific values. */
@@ -3825,8 +3795,8 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
struct drm_i915_private *dev_priv = dev->dev_private;
uint8_t rev;

-   if (intel_dp_dpcd_read_wake(&intel_dp->aux, 0x000, intel_dp->dpcd,
-   sizeof(intel_dp->dpcd)) < 0)
+   if (drm_dp_dpcd_read(&intel_dp->aux, 0x000, intel_dp->dpcd,
+sizeof(intel_dp->dpcd)) < 0)
return false; /* aux transfer failed */

DRM_DEBUG_KMS("DPCD: %*ph\n", (int) sizeof(intel_dp->dpcd), intel_dp->dpcd);
@@ -3837,9 +3807,9 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
/* Check if the panel supports PSR */
memset(intel_dp->psr_dpcd, 0, sizeof(intel_dp->psr_dpcd));
if (is_edp(intel_dp)) {
-   intel_dp_dpcd_read_wake(&intel_dp->aux, DP_PSR_SUPPORT,
-   intel_dp->psr_dpcd,
-   sizeof(intel_dp->psr_dpcd));
+   drm_dp_dpcd_read(&intel_dp->aux, DP_PSR_SUPPORT,
+intel_dp->psr_dpcd,
+sizeof(intel_dp->psr_dpcd));
if (intel_dp->psr_dpcd[0] & DP_PSR_IS_SUPPORTED) {
dev_priv->psr.sink_support = true;
DRM_DEBUG_KMS("Detected EDP PSR Panel.\n");
@@ -3850,9 +

[PATCH 7/8] drm/i915: Fix random aux transactions failures.

2015-11-20 Thread Rodrigo Vivi
Mainly aux communications on sink_crc
were failing a lot randomly on recent platforms.
The first solution was to try to use intel_dp_dpcd_read_wake, but then
it was suggested to move retries to drm level.

Since drm level was already taking care of retries and we didn't want
to throw random retries at that level, the second solution was to
put the retries at the aux_transfer layer, which was nacked.

So I realized we had so many retries in different places and
started to organize that a bit. During this organization I noticed
that we weren't handling at all the case where the message size was
zeroed. And this was exactly the case that was affecting sink_crc.

Also we weren't respecting BSpec, which says message sizes of 0 or > 20
are forbidden.

It is a fact that we still have no clue why we are getting this
forbidden value there. But anyway we need to handle that for now
so we return -EBUSY and drm level takes care of the retries that
are already in place.
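
The interaction described above can be sketched in standalone C. This is
an illustration under assumptions, not the driver code: the names
aux_check_recv_bytes(), aux_transfer_with_retry(), struct flaky_hw and
the retry budget are all hypothetical, and the fake hardware simply
reports a size of 0 a few times before reporting a valid one. The only
parts taken from the patch are the size check (0 or > 20 is rejected
with -EBUSY) and the clamp to the receive buffer size.

```c
#include <assert.h>
#include <errno.h>

#define AUX_MAX_MSG_BYTES 20 /* limit quoted from BSpec in the patch */

/* Hypothetical hook returning the message size the hardware reports. */
typedef int (*aux_hw_fn)(void *ctx);

/* Same check as the patch: forbidden sizes become -EBUSY, others are clamped. */
static int aux_check_recv_bytes(int recv_bytes, int recv_size)
{
	if (recv_bytes == 0 || recv_bytes > AUX_MAX_MSG_BYTES)
		return -EBUSY;

	if (recv_bytes > recv_size)
		recv_bytes = recv_size;

	return recv_bytes;
}

/*
 * Caller-side loop standing in for the "drm level" retries: try again
 * only while the transfer reports -EBUSY, up to max_retries attempts.
 */
static int aux_transfer_with_retry(aux_hw_fn hw, void *ctx,
				   int recv_size, int max_retries)
{
	int ret = -EBUSY;
	int i;

	assert(max_retries > 0);

	for (i = 0; i < max_retries; i++) {
		ret = aux_check_recv_bytes(hw(ctx), recv_size);
		if (ret != -EBUSY)
			return ret;
	}

	return ret;
}

/* Fake hardware for the sketch: reports size 0 for the first few calls. */
struct flaky_hw {
	int bad_calls_left;
	int good_size;
};

static int flaky_hw_read(void *ctx)
{
	struct flaky_hw *hw = ctx;

	if (hw->bad_calls_left > 0) {
		hw->bad_calls_left--;
		return 0;
	}

	return hw->good_size;
}
```

With a flaky_hw that fails twice and then reports 6 bytes, a call such as
aux_transfer_with_retry(flaky_hw_read, &hw, 20, 32) gets through on the
third attempt; if the retry budget runs out first, the caller still sees
-EBUSY and can decide what to do with it.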

Cc: Jani Nikula 
Cc: Daniel Vetter 
Signed-off-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/intel_dp.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 35048d6..c87e937 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -905,6 +905,17 @@ done:
/* Unload any bytes sent back from the other side */
recv_bytes = ((status & DP_AUX_CH_CTL_MESSAGE_SIZE_MASK) >>
  DP_AUX_CH_CTL_MESSAGE_SIZE_SHIFT);
+
+   /*
+* By BSpec: "Message sizes of 0 or >20 are not allowed."
+* We have no idea of what happened so we return -EBUSY so
+* drm layer takes care for the necessary retries.
+*/
+   if (recv_bytes == 0 || recv_bytes > 20) {
+   ret = -EBUSY;
+   goto out;
+   }
+
if (recv_bytes > recv_size)
recv_bytes = recv_size;

-- 
2.4.3