[PATCH 7/8] drm/i915: Fix random aux transactions failures.
I don't see how the subject matches the commit.

On Sat, 21 Nov 2015, Rodrigo Vivi wrote:
> This read wake with retries was initially added by 2 commits:
>
> commit 61da5fab ("drm/i915/dp: retry link status read 3 times on failure")
> commit 899526d9 ("drm/i915/dp: try to read receiver capabilities 3 times when detecting")
>
> Both mention section 9.1 of the 1.1a DisplayPort spec, which actually
> tells us to retry three times in a certain case when "writing 01h to DPCD
> address 600h", and this code is already in place in our driver. Added by:
>
> commit c7ad3810 ("drm/i915/dp: manage sink power state if possible")

I still think what we currently do for the sink power state management
works by coincidence. We should still look into it. However, I think
this series overall (apart from patch 6/8 which really is a bummer, the
comment inline below, and the minor other comments) looks like
worthwhile changes. We can leave the power state management for later.
Or rip it out for now...

> At this point we have no visibility if those patches were added to work
> around certain corner cases like lazy dongles or what, but also at that
> time there weren't enough retries in the proper places.
>
> So my proposal is to remove these retries now that we have drm handling
> the retries, and if we hit any corner case again we study it to return
> EAGAIN or EBUSY to force retries at drm instead of handling them here.
>
> v2: Improve commit message trying to explain the origin of the retries.
>
> Cc: Daniel Vetter
> Cc: Jani Nikula
> Cc: Jesse Barnes
> Signed-off-by: Rodrigo Vivi
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 95 ++---
>  1 file changed, 32 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index c87e937..2ce6527 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -985,7 +985,8 @@ intel_dp_aux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
>  		if (WARN_ON(rxsize > 20))
>  			return -E2BIG;
>
> -		ret = intel_dp_aux_ch(intel_dp, txbuf, txsize, rxbuf, rxsize);
> +		ret = intel_dp_aux_ch(intel_dp, txbuf, txsize,
> +				      rxbuf, rxsize);
>  		if (ret > 0) {
>  			msg->reply = rxbuf[0] >> 4;
>  			/*
> @@ -3150,47 +3151,16 @@ static void chv_dp_post_pll_disable(struct intel_encoder *encoder)
>  }
>
>  /*
> - * Native read with retry for link status and receiver capability reads for
> - * cases where the sink may still be asleep.
> - *
> - * Sinks are *supposed* to come up within 1ms from an off state, but we're also
> - * supposed to retry 3 times per the spec.
> - */
> -static ssize_t
> -intel_dp_dpcd_read_wake(struct drm_dp_aux *aux, unsigned int offset,
> -			void *buffer, size_t size)
> -{
> -	ssize_t ret;
> -	int i;
> -
> -	/*
> -	 * Sometime we just get the same incorrect byte repeated
> -	 * over the entire buffer. Doing just one throw away read
> -	 * initially seems to "solve" it.
> -	 */
> -	drm_dp_dpcd_read(aux, DP_DPCD_REV, buffer, 1);

This still needs to be addressed somehow. Maybe it's sufficient for
Ville to test with his monitor?

commit f6a1906674005377b64ee5431c1418077c1b2425
Author: Ville Syrjälä
Date:   Thu Oct 16 20:46:09 2014 +0300

    drm/i915: Do a dummy DPCD read before the actual read

> -
> -	for (i = 0; i < 3; i++) {
> -		ret = drm_dp_dpcd_read(aux, offset, buffer, size);
> -		if (ret == size)
> -			return ret;
> -		msleep(1);
> -	}
> -
> -	return ret;
> -}
> -
> -/*
>   * Fetch AUX CH registers 0x202 - 0x207 which contain
>   * link status information
>   */
>  bool
>  intel_dp_get_link_status(struct intel_dp *intel_dp, uint8_t link_status[DP_LINK_STATUS_SIZE])
>  {
> -	return intel_dp_dpcd_read_wake(&intel_dp->aux,
> -				       DP_LANE0_1_STATUS,
> -				       link_status,
> -				       DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
> +	return drm_dp_dpcd_read(&intel_dp->aux,
> +				DP_LANE0_1_STATUS,
> +				link_status,
> +				DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
>  }
>
>  /* These are source-specific values. */
> @@ -3825,8 +3795,8 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	uint8_t rev;
>
> -	if (intel_dp_dpcd_read_wake(&intel_dp->aux, 0x000, intel_dp->dpcd,
> -				    sizeof(intel_dp->dpcd)) < 0)
> +	if (drm_dp_dpcd_read(&intel_dp->aux, 0x000, intel_dp->dpcd,
> +			     sizeof(intel_dp->dpcd)) < 0)
>  		return false; /* aux transfer faile
[PATCH 7/8] drm/i915: Fix random aux transactions failures.
On Sat, 21 Nov 2015, Rodrigo Vivi wrote:
> Mainly aux communications on sink_crc
> were failing a lot randomly on recent platforms.
> The first solution was to try to use intel_dp_dpcd_read_wake, but then
> it was suggested to move retries to drm level.
>
> Since drm level was already taking care of retries and didn't want
> to throw random retries on that level the second solution was to
> put the retries at aux_transfer layer, which was nacked.
>
> So I realized we had so many retries in different places and
> started to organize that a bit. During this organization I noticed
> that we weren't handling at all the case where the message size was
> zeroed. And this was exactly the case that was affecting sink_crc.
>
> Also we weren't respecting BSpec, which says message sizes of 0 or > 20
> are forbidden.
>
> It is a fact that we still have no clue why we are getting this
> forbidden value there. But anyway we need to handle that for now
> so we return -EBUSY and drm level takes care of the retries that
> are already in place.
>
> Cc: Jani Nikula
> Cc: Daniel Vetter
> Signed-off-by: Rodrigo Vivi
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 35048d6..c87e937 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -905,6 +905,17 @@ done:
>  	/* Unload any bytes sent back from the other side */
>  	recv_bytes = ((status & DP_AUX_CH_CTL_MESSAGE_SIZE_MASK) >>
>  		      DP_AUX_CH_CTL_MESSAGE_SIZE_SHIFT);
> +
> +	/*
> +	 * By BSpec: "Message sizes of 0 or >20 are not allowed."
> +	 * We have no idea of what happened so we return -EBUSY so
> +	 * drm layer takes care for the necessary retries.
> +	 */
> +	if (recv_bytes == 0 || recv_bytes > 20) {
> +		ret = -EBUSY;

This deserves debug logging at the very least.

BR,
Jani.

> +		goto out;
> +	}
> +
>  	if (recv_bytes > recv_size)
>  		recv_bytes = recv_size;

-- 
Jani Nikula, Intel Open Source Technology Center
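
To make the review comment concrete, here is a minimal user-space sketch of the size check with a log line added. It is not the driver code: the helper name aux_recv_size(), the AUX_* constants (a 5-bit size field at bit 20 is an assumption made for illustration), and the use of fprintf() as a stand-in for a kernel debug macro such as DRM_DEBUG_KMS() are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed field layout, for illustration only: 5-bit size at bits 24:20. */
#define AUX_MSG_SIZE_SHIFT 20
#define AUX_MSG_SIZE_MASK  (0x1fu << AUX_MSG_SIZE_SHIFT)
#define AUX_MAX_MSG_SIZE   20u

/*
 * Hypothetical helper: extract the received byte count from a status word
 * and reject the sizes the quoted BSpec sentence forbids (0 or >20).
 * Returns the byte count, or -EBUSY so that a caller may retry.
 */
static int aux_recv_size(uint32_t status)
{
	unsigned int recv_bytes =
		(status & AUX_MSG_SIZE_MASK) >> AUX_MSG_SIZE_SHIFT;

	if (recv_bytes == 0 || recv_bytes > AUX_MAX_MSG_SIZE) {
		/* Stand-in for a kernel debug macro; logs size and raw status. */
		fprintf(stderr,
			"aux: forbidden message size %u (status 0x%08x)\n",
			recv_bytes, (unsigned int)status);
		return -EBUSY;
	}

	return (int)recv_bytes;
}
```

Logging the raw status word next to the decoded size is a design choice of this sketch: it keeps the other status bits visible when the forbidden value shows up, which may help narrow down why it appears.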
[PATCH 7/8] drm/i915: Fix random aux transactions failures.
This read wake with retries was initially added by 2 commits:

commit 61da5fab ("drm/i915/dp: retry link status read 3 times on failure")
commit 899526d9 ("drm/i915/dp: try to read receiver capabilities 3 times when detecting")

Both mention section 9.1 of the 1.1a DisplayPort spec, which actually
tells us to retry three times in a certain case when "writing 01h to DPCD
address 600h", and this code is already in place in our driver. Added by:

commit c7ad3810 ("drm/i915/dp: manage sink power state if possible")

At this point we have no visibility if those patches were added to work
around certain corner cases like lazy dongles or what, but also at that
time there weren't enough retries in the proper places.

So my proposal is to remove these retries now that we have drm handling
the retries, and if we hit any corner case again we study it to return
EAGAIN or EBUSY to force retries at drm instead of handling them here.

v2: Improve commit message trying to explain the origin of the retries.

Cc: Daniel Vetter
Cc: Jani Nikula
Cc: Jesse Barnes
Signed-off-by: Rodrigo Vivi
---
 drivers/gpu/drm/i915/intel_dp.c | 95 ++---
 1 file changed, 32 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index c87e937..2ce6527 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -985,7 +985,8 @@ intel_dp_aux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
 		if (WARN_ON(rxsize > 20))
 			return -E2BIG;

-		ret = intel_dp_aux_ch(intel_dp, txbuf, txsize, rxbuf, rxsize);
+		ret = intel_dp_aux_ch(intel_dp, txbuf, txsize,
+				      rxbuf, rxsize);
 		if (ret > 0) {
 			msg->reply = rxbuf[0] >> 4;
 			/*
@@ -3150,47 +3151,16 @@ static void chv_dp_post_pll_disable(struct intel_encoder *encoder)
 }

 /*
- * Native read with retry for link status and receiver capability reads for
- * cases where the sink may still be asleep.
- *
- * Sinks are *supposed* to come up within 1ms from an off state, but we're also
- * supposed to retry 3 times per the spec.
- */
-static ssize_t
-intel_dp_dpcd_read_wake(struct drm_dp_aux *aux, unsigned int offset,
-			void *buffer, size_t size)
-{
-	ssize_t ret;
-	int i;
-
-	/*
-	 * Sometime we just get the same incorrect byte repeated
-	 * over the entire buffer. Doing just one throw away read
-	 * initially seems to "solve" it.
-	 */
-	drm_dp_dpcd_read(aux, DP_DPCD_REV, buffer, 1);
-
-	for (i = 0; i < 3; i++) {
-		ret = drm_dp_dpcd_read(aux, offset, buffer, size);
-		if (ret == size)
-			return ret;
-		msleep(1);
-	}
-
-	return ret;
-}
-
-/*
  * Fetch AUX CH registers 0x202 - 0x207 which contain
  * link status information
  */
 bool
 intel_dp_get_link_status(struct intel_dp *intel_dp, uint8_t link_status[DP_LINK_STATUS_SIZE])
 {
-	return intel_dp_dpcd_read_wake(&intel_dp->aux,
-				       DP_LANE0_1_STATUS,
-				       link_status,
-				       DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
+	return drm_dp_dpcd_read(&intel_dp->aux,
+				DP_LANE0_1_STATUS,
+				link_status,
+				DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
 }

 /* These are source-specific values. */
@@ -3825,8 +3795,8 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint8_t rev;

-	if (intel_dp_dpcd_read_wake(&intel_dp->aux, 0x000, intel_dp->dpcd,
-				    sizeof(intel_dp->dpcd)) < 0)
+	if (drm_dp_dpcd_read(&intel_dp->aux, 0x000, intel_dp->dpcd,
+			     sizeof(intel_dp->dpcd)) < 0)
 		return false; /* aux transfer failed */

 	DRM_DEBUG_KMS("DPCD: %*ph\n", (int) sizeof(intel_dp->dpcd), intel_dp->dpcd);
@@ -3837,9 +3807,9 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
 	/* Check if the panel supports PSR */
 	memset(intel_dp->psr_dpcd, 0, sizeof(intel_dp->psr_dpcd));
 	if (is_edp(intel_dp)) {
-		intel_dp_dpcd_read_wake(&intel_dp->aux, DP_PSR_SUPPORT,
-					intel_dp->psr_dpcd,
-					sizeof(intel_dp->psr_dpcd));
+		drm_dp_dpcd_read(&intel_dp->aux, DP_PSR_SUPPORT,
+				 intel_dp->psr_dpcd,
+				 sizeof(intel_dp->psr_dpcd));
 		if (intel_dp->psr_dpcd[0] & DP_PSR_IS_SUPPORTED) {
 			dev_priv->psr.sink_support = true;
 			DRM_DEBUG_KMS("Detected EDP PSR Panel.\n");
@@ -3850,9 +
[PATCH 7/8] drm/i915: Fix random aux transactions failures.
Mainly aux communications on sink_crc
were failing a lot randomly on recent platforms.
The first solution was to try to use intel_dp_dpcd_read_wake, but then
it was suggested to move retries to drm level.

Since drm level was already taking care of retries and didn't want
to throw random retries on that level the second solution was to
put the retries at aux_transfer layer, which was nacked.

So I realized we had so many retries in different places and
started to organize that a bit. During this organization I noticed
that we weren't handling at all the case where the message size was
zeroed. And this was exactly the case that was affecting sink_crc.

Also we weren't respecting BSpec, which says message sizes of 0 or > 20
are forbidden.

It is a fact that we still have no clue why we are getting this
forbidden value there. But anyway we need to handle that for now
so we return -EBUSY and drm level takes care of the retries that
are already in place.

Cc: Jani Nikula
Cc: Daniel Vetter
Signed-off-by: Rodrigo Vivi
---
 drivers/gpu/drm/i915/intel_dp.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 35048d6..c87e937 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -905,6 +905,17 @@ done:
 	/* Unload any bytes sent back from the other side */
 	recv_bytes = ((status & DP_AUX_CH_CTL_MESSAGE_SIZE_MASK) >>
 		      DP_AUX_CH_CTL_MESSAGE_SIZE_SHIFT);
+
+	/*
+	 * By BSpec: "Message sizes of 0 or >20 are not allowed."
+	 * We have no idea of what happened so we return -EBUSY so
+	 * drm layer takes care for the necessary retries.
+	 */
+	if (recv_bytes == 0 || recv_bytes > 20) {
+		ret = -EBUSY;
+		goto out;
+	}
+
 	if (recv_bytes > recv_size)
 		recv_bytes = recv_size;

-- 
2.4.3
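
The commit message relies on an upper layer retrying whenever the transfer reports -EBUSY. The sketch below shows that pattern in plain user-space C. It is not the drm helper code: transfer_with_retry(), the retry budget MAX_RETRIES, and the flaky_transfer() test double are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_RETRIES 32	/* assumed retry budget, for illustration only */

/* Hypothetical transfer callback: returns bytes moved or a negative errno. */
typedef int (*transfer_fn)(void *ctx, void *buf, size_t size);

/*
 * Retry only while the lower layer says -EBUSY; any other result,
 * success or a different error, goes straight back to the caller.
 */
static int transfer_with_retry(transfer_fn transfer, void *ctx,
			       void *buf, size_t size)
{
	int ret = -EBUSY;
	int i;

	for (i = 0; i < MAX_RETRIES; i++) {
		ret = transfer(ctx, buf, size);
		if (ret != -EBUSY)
			return ret;
	}

	return ret;
}

/* Test double: reports -EBUSY a fixed number of times, then succeeds. */
struct flaky {
	int busy_left;
	int calls;
};

static int flaky_transfer(void *ctx, void *buf, size_t size)
{
	struct flaky *f = ctx;

	(void)buf;
	f->calls++;
	if (f->busy_left > 0) {
		f->busy_left--;
		return -EBUSY;
	}

	return (int)size;
}
```

With a callback that is busy twice, transfer_with_retry() makes three calls and returns the size. With one that never recovers, it gives up after MAX_RETRIES calls and hands -EBUSY back, so a persistent failure still surfaces to the caller.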