Re: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-18 Thread Alex Deucher
On Fri, Nov 18, 2022 at 2:53 PM Lyude Paul  wrote:
>
> JFYI - I'm not sure of the correct commit ID to add for the Fixes: tag since
> it's in your branch, so I'll omit that and let you add it into the patch

Yeah, I'll add that.  Many thanks!

Alex

>
> On Fri, 2022-11-18 at 14:47 -0500, Lyude Paul wrote:
> > of course, will do in just a moment
> >
> > On Fri, 2022-11-18 at 14:46 -0500, Alex Deucher wrote:
> > > I've already picked this up.  Can you send a follow up patch with just
> > > the coverity fix?
> > >
> > > Alex
> > >
> > > On Fri, Nov 18, 2022 at 2:17 PM Lyude Paul  wrote:
> > > >
> > > > JFYI, Coverity pointed out one more issue with this series so I'm going 
> > > > to
> > > > send out a respin real quick to fix it. It's just a missing variable
> > > > assignment (we leave ret unassigned by mistake in
> > > > pre_compute_mst_dsc_configs()) so I will carry over your r-b on it.
> > > >
> > > > On Wed, 2022-11-16 at 04:39 +, Lin, Wayne wrote:
> > > > > [Public]
> > > > >
> > > > > All the patch set looks good to me. Feel free to add:
> > > > > Reviewed-by: Wayne Lin 
> > > > >
> > > > > Again, thank you Lyude for helping on this!!!
> > > > >
> > > > > Regards,
> > > > > Wayne
> > > > > > -Original Message-
> > > > > > From: Lyude Paul 
> > > > > > Sent: Tuesday, November 15, 2022 6:18 AM
> > > > > > To: amd-gfx@lists.freedesktop.org
> > > > > > Cc: Wentland, Harry ; 
> > > > > > sta...@vger.kernel.org;
> > > > > > Li, Sun peng (Leo) ; Siqueira, Rodrigo
> > > > > > ; Deucher, Alexander
> > > > > > ; Koenig, Christian
> > > > > > ; Pan, Xinhui ; David
> > > > > > Airlie ; Daniel Vetter ; 
> > > > > > Kazlauskas,
> > > > > > Nicholas ; Pillai, Aurabindo
> > > > > > ; Li, Roman ; Zuo, Jerry
> > > > > > ; Wu, Hersen ; Lin, Wayne
> > > > > > ; Thomas Zimmermann ;
> > > > > > Mahfooz, Hamza ; Hung, Alex
> > > > > > ; Mikita Lipski ; Liu,
> > > > > > Wenjing ; Francis, David
> > > > > > ; open list:DRM DRIVERS  > > > > > de...@lists.freedesktop.org>; open list 
> > > > > > 
> > > > > > Subject: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes 
> > > > > > and
> > > > > > deadlocking
> > > > > >
> > > > > > It appears that amdgpu makes the mistake of completely ignoring the 
> > > > > > return
> > > > > > values from the DP MST helpers, and instead just returns a simple 
> > > > > > true/false.
> > > > > > In this case, it seems to have come back to bite us because as a 
> > > > > > result of
> > > > > > simply returning false from compute_mst_dsc_configs_for_state(), 
> > > > > > amdgpu
> > > > > > had no way of telling when a deadlock happened from these helpers. 
> > > > > > This
> > > > > > could definitely result in some kernel splats.
> > > > > >
> > > > > > V2:
> > > > > > * Address Wayne's comments (fix another bunch of spots where we 
> > > > > > weren't
> > > > > >   passing down return codes)
> > > > > >
> > > > > > Signed-off-by: Lyude Paul 
> > > > > > Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
> > > > > > Cc: Harry Wentland 
> > > > > > Cc:  # v5.6+
> > > > > > ---
> > > > > >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  18 +-
> > > > > >  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 235 
> > > > > > ++--
> > > > > > --
> > > > > >  .../display/amdgpu_dm/amdgpu_dm_mst_types.h   |  12 +-
> > > > > >  3 files changed, 147 insertions(+), 118 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > > index 0db2a88cd4d7b..852a2100c6b38 100644
> > > > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > > @@ -6462,7 +6462,7 @@ static int
> > > > > > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > > > > > struct drm_connector_state *new_con_state;
> > > > > > struct amdgpu_dm_connector *aconnector;
> > > > > > struct dm_connector_state *dm_conn_state;
> > > > > > -   int i, j;
> > > > > > +   int i, j, ret;
> > > > > > int vcpi, pbn_div, pbn, slot_num = 0;
> > > > > >
> > > > > > for_each_new_connector_in_state(state, connector,
> > > > > > new_con_state, i) { @@ -6509,8 +6509,11 @@ static int
> > > > > > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > > > > > dm_conn_state->pbn = pbn;
> > > > > > dm_conn_state->vcpi_slots = slot_num;
> > > > > >
> > > > > > -   drm_dp_mst_atomic_enable_dsc(state, aconnector-
> > > > > > > port, dm_conn_state->pbn,
> > > > > > -false);
> > > > > > +   ret = drm_dp_mst_atomic_enable_dsc(state,
> > > > > > aconnector->port,
> > > > > > +  
> > > > > > dm_conn_state-
> > > > > > > pbn, false);
> > > > > > +   if (ret < 0)
> > > > > > +  

Re: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-18 Thread Lyude Paul
JFYI - I'm not sure of the correct commit ID to add for the Fixes: tag since
it's in your branch, so I'll omit that and let you add it into the patch

On Fri, 2022-11-18 at 14:47 -0500, Lyude Paul wrote:
> of course, will do in just a moment
> 
> On Fri, 2022-11-18 at 14:46 -0500, Alex Deucher wrote:
> > I've already picked this up.  Can you send a follow up patch with just
> > the coverity fix?
> > 
> > Alex
> > 
> > On Fri, Nov 18, 2022 at 2:17 PM Lyude Paul  wrote:
> > > 
> > > JFYI, Coverity pointed out one more issue with this series so I'm going to
> > > send out a respin real quick to fix it. It's just a missing variable
> > > assignment (we leave ret unassigned by mistake in
> > > pre_compute_mst_dsc_configs()) so I will carry over your r-b on it.
> > > 
> > > On Wed, 2022-11-16 at 04:39 +, Lin, Wayne wrote:
> > > > [Public]
> > > > 
> > > > All the patch set looks good to me. Feel free to add:
> > > > Reviewed-by: Wayne Lin 
> > > > 
> > > > Again, thank you Lyude for helping on this!!!
> > > > 
> > > > Regards,
> > > > Wayne
> > > > > -Original Message-
> > > > > From: Lyude Paul 
> > > > > Sent: Tuesday, November 15, 2022 6:18 AM
> > > > > To: amd-gfx@lists.freedesktop.org
> > > > > Cc: Wentland, Harry ; sta...@vger.kernel.org;
> > > > > Li, Sun peng (Leo) ; Siqueira, Rodrigo
> > > > > ; Deucher, Alexander
> > > > > ; Koenig, Christian
> > > > > ; Pan, Xinhui ; David
> > > > > Airlie ; Daniel Vetter ; 
> > > > > Kazlauskas,
> > > > > Nicholas ; Pillai, Aurabindo
> > > > > ; Li, Roman ; Zuo, Jerry
> > > > > ; Wu, Hersen ; Lin, Wayne
> > > > > ; Thomas Zimmermann ;
> > > > > Mahfooz, Hamza ; Hung, Alex
> > > > > ; Mikita Lipski ; Liu,
> > > > > Wenjing ; Francis, David
> > > > > ; open list:DRM DRIVERS  > > > > de...@lists.freedesktop.org>; open list 
> > > > > Subject: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and
> > > > > deadlocking
> > > > > 
> > > > > It appears that amdgpu makes the mistake of completely ignoring the 
> > > > > return
> > > > > values from the DP MST helpers, and instead just returns a simple 
> > > > > true/false.
> > > > > In this case, it seems to have come back to bite us because as a 
> > > > > result of
> > > > > simply returning false from compute_mst_dsc_configs_for_state(), 
> > > > > amdgpu
> > > > > had no way of telling when a deadlock happened from these helpers. 
> > > > > This
> > > > > could definitely result in some kernel splats.
> > > > > 
> > > > > V2:
> > > > > * Address Wayne's comments (fix another bunch of spots where we 
> > > > > weren't
> > > > >   passing down return codes)
> > > > > 
> > > > > Signed-off-by: Lyude Paul 
> > > > > Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
> > > > > Cc: Harry Wentland 
> > > > > Cc:  # v5.6+
> > > > > ---
> > > > >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  18 +-
> > > > >  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 235 ++--
> > > > > --
> > > > >  .../display/amdgpu_dm/amdgpu_dm_mst_types.h   |  12 +-
> > > > >  3 files changed, 147 insertions(+), 118 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > index 0db2a88cd4d7b..852a2100c6b38 100644
> > > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > @@ -6462,7 +6462,7 @@ static int
> > > > > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > > > > struct drm_connector_state *new_con_state;
> > > > > struct amdgpu_dm_connector *aconnector;
> > > > > struct dm_connector_state *dm_conn_state;
> > > > > -   int i, j;
> > > > > +   int i, j, ret;
> > > > > int vcpi, pbn_div, pbn, slot_num = 0;
> > > > > 
> > > > > for_each_new_connector_in_state(state, connector,
> > > > > new_con_state, i) { @@ -6509,8 +6509,11 @@ static int
> > > > > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > > > > dm_conn_state->pbn = pbn;
> > > > > dm_conn_state->vcpi_slots = slot_num;
> > > > > 
> > > > > -   drm_dp_mst_atomic_enable_dsc(state, aconnector-
> > > > > > port, dm_conn_state->pbn,
> > > > > -false);
> > > > > +   ret = drm_dp_mst_atomic_enable_dsc(state,
> > > > > aconnector->port,
> > > > > +  dm_conn_state-
> > > > > > pbn, false);
> > > > > +   if (ret < 0)
> > > > > +   return ret;
> > > > > +
> > > > > continue;
> > > > > }
> > > > > 
> > > > > @@ -9523,10 +9526,9 @@ static int amdgpu_dm_atomic_check(struct
> > > > > drm_device *dev,
> > > > > 
> > > > >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > > > > if (dc_resource_is_dsc_encoding_supported(dc)) {
> > > > > -   if (!pre_validate_dsc(state, 

Re: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-18 Thread Lyude Paul
of course, will do in just a moment

On Fri, 2022-11-18 at 14:46 -0500, Alex Deucher wrote:
> I've already picked this up.  Can you send a follow up patch with just
> the coverity fix?
> 
> Alex
> 
> On Fri, Nov 18, 2022 at 2:17 PM Lyude Paul  wrote:
> > 
> > JFYI, Coverity pointed out one more issue with this series so I'm going to
> > send out a respin real quick to fix it. It's just a missing variable
> > assignment (we leave ret unassigned by mistake in
> > pre_compute_mst_dsc_configs()) so I will carry over your r-b on it.
> > 
> > On Wed, 2022-11-16 at 04:39 +, Lin, Wayne wrote:
> > > [Public]
> > > 
> > > All the patch set looks good to me. Feel free to add:
> > > Reviewed-by: Wayne Lin 
> > > 
> > > Again, thank you Lyude for helping on this!!!
> > > 
> > > Regards,
> > > Wayne
> > > > -Original Message-
> > > > From: Lyude Paul 
> > > > Sent: Tuesday, November 15, 2022 6:18 AM
> > > > To: amd-gfx@lists.freedesktop.org
> > > > Cc: Wentland, Harry ; sta...@vger.kernel.org;
> > > > Li, Sun peng (Leo) ; Siqueira, Rodrigo
> > > > ; Deucher, Alexander
> > > > ; Koenig, Christian
> > > > ; Pan, Xinhui ; David
> > > > Airlie ; Daniel Vetter ; Kazlauskas,
> > > > Nicholas ; Pillai, Aurabindo
> > > > ; Li, Roman ; Zuo, Jerry
> > > > ; Wu, Hersen ; Lin, Wayne
> > > > ; Thomas Zimmermann ;
> > > > Mahfooz, Hamza ; Hung, Alex
> > > > ; Mikita Lipski ; Liu,
> > > > Wenjing ; Francis, David
> > > > ; open list:DRM DRIVERS  > > > de...@lists.freedesktop.org>; open list 
> > > > Subject: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and
> > > > deadlocking
> > > > 
> > > > It appears that amdgpu makes the mistake of completely ignoring the 
> > > > return
> > > > values from the DP MST helpers, and instead just returns a simple 
> > > > true/false.
> > > > In this case, it seems to have come back to bite us because as a result 
> > > > of
> > > > simply returning false from compute_mst_dsc_configs_for_state(), amdgpu
> > > > had no way of telling when a deadlock happened from these helpers. This
> > > > could definitely result in some kernel splats.
> > > > 
> > > > V2:
> > > > * Address Wayne's comments (fix another bunch of spots where we weren't
> > > >   passing down return codes)
> > > > 
> > > > Signed-off-by: Lyude Paul 
> > > > Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
> > > > Cc: Harry Wentland 
> > > > Cc:  # v5.6+
> > > > ---
> > > >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  18 +-
> > > >  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 235 ++--
> > > > --
> > > >  .../display/amdgpu_dm/amdgpu_dm_mst_types.h   |  12 +-
> > > >  3 files changed, 147 insertions(+), 118 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > index 0db2a88cd4d7b..852a2100c6b38 100644
> > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > @@ -6462,7 +6462,7 @@ static int
> > > > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > > > struct drm_connector_state *new_con_state;
> > > > struct amdgpu_dm_connector *aconnector;
> > > > struct dm_connector_state *dm_conn_state;
> > > > -   int i, j;
> > > > +   int i, j, ret;
> > > > int vcpi, pbn_div, pbn, slot_num = 0;
> > > > 
> > > > for_each_new_connector_in_state(state, connector,
> > > > new_con_state, i) { @@ -6509,8 +6509,11 @@ static int
> > > > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > > > dm_conn_state->pbn = pbn;
> > > > dm_conn_state->vcpi_slots = slot_num;
> > > > 
> > > > -   drm_dp_mst_atomic_enable_dsc(state, aconnector-
> > > > > port, dm_conn_state->pbn,
> > > > -false);
> > > > +   ret = drm_dp_mst_atomic_enable_dsc(state,
> > > > aconnector->port,
> > > > +  dm_conn_state-
> > > > > pbn, false);
> > > > +   if (ret < 0)
> > > > +   return ret;
> > > > +
> > > > continue;
> > > > }
> > > > 
> > > > @@ -9523,10 +9526,9 @@ static int amdgpu_dm_atomic_check(struct
> > > > drm_device *dev,
> > > > 
> > > >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > > > if (dc_resource_is_dsc_encoding_supported(dc)) {
> > > > -   if (!pre_validate_dsc(state, _state, vars)) {
> > > > -   ret = -EINVAL;
> > > > +   ret = pre_validate_dsc(state, _state, vars);
> > > > +   if (ret != 0)
> > > > goto fail;
> > > > -   }
> > > > }
> > > >  #endif
> > > > 
> > > > @@ -9621,9 +9623,9 @@ static int amdgpu_dm_atomic_check(struct
> > > > drm_device *dev,
> > > > }
> > > > 
> > > >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > > > -   if 

Re: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-18 Thread Alex Deucher
I've already picked this up.  Can you send a follow up patch with just
the coverity fix?

Alex

On Fri, Nov 18, 2022 at 2:17 PM Lyude Paul  wrote:
>
> JFYI, Coverity pointed out one more issue with this series so I'm going to
> send out a respin real quick to fix it. It's just a missing variable
> assignment (we leave ret unassigned by mistake in
> pre_compute_mst_dsc_configs()) so I will carry over your r-b on it.
>
> On Wed, 2022-11-16 at 04:39 +, Lin, Wayne wrote:
> > [Public]
> >
> > All the patch set looks good to me. Feel free to add:
> > Reviewed-by: Wayne Lin 
> >
> > Again, thank you Lyude for helping on this!!!
> >
> > Regards,
> > Wayne
> > > -Original Message-
> > > From: Lyude Paul 
> > > Sent: Tuesday, November 15, 2022 6:18 AM
> > > To: amd-gfx@lists.freedesktop.org
> > > Cc: Wentland, Harry ; sta...@vger.kernel.org;
> > > Li, Sun peng (Leo) ; Siqueira, Rodrigo
> > > ; Deucher, Alexander
> > > ; Koenig, Christian
> > > ; Pan, Xinhui ; David
> > > Airlie ; Daniel Vetter ; Kazlauskas,
> > > Nicholas ; Pillai, Aurabindo
> > > ; Li, Roman ; Zuo, Jerry
> > > ; Wu, Hersen ; Lin, Wayne
> > > ; Thomas Zimmermann ;
> > > Mahfooz, Hamza ; Hung, Alex
> > > ; Mikita Lipski ; Liu,
> > > Wenjing ; Francis, David
> > > ; open list:DRM DRIVERS  > > de...@lists.freedesktop.org>; open list 
> > > Subject: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and
> > > deadlocking
> > >
> > > It appears that amdgpu makes the mistake of completely ignoring the return
> > > values from the DP MST helpers, and instead just returns a simple 
> > > true/false.
> > > In this case, it seems to have come back to bite us because as a result of
> > > simply returning false from compute_mst_dsc_configs_for_state(), amdgpu
> > > had no way of telling when a deadlock happened from these helpers. This
> > > could definitely result in some kernel splats.
> > >
> > > V2:
> > > * Address Wayne's comments (fix another bunch of spots where we weren't
> > >   passing down return codes)
> > >
> > > Signed-off-by: Lyude Paul 
> > > Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
> > > Cc: Harry Wentland 
> > > Cc:  # v5.6+
> > > ---
> > >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  18 +-
> > >  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 235 ++--
> > > --
> > >  .../display/amdgpu_dm/amdgpu_dm_mst_types.h   |  12 +-
> > >  3 files changed, 147 insertions(+), 118 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > index 0db2a88cd4d7b..852a2100c6b38 100644
> > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > @@ -6462,7 +6462,7 @@ static int
> > > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > > struct drm_connector_state *new_con_state;
> > > struct amdgpu_dm_connector *aconnector;
> > > struct dm_connector_state *dm_conn_state;
> > > -   int i, j;
> > > +   int i, j, ret;
> > > int vcpi, pbn_div, pbn, slot_num = 0;
> > >
> > > for_each_new_connector_in_state(state, connector,
> > > new_con_state, i) { @@ -6509,8 +6509,11 @@ static int
> > > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > > dm_conn_state->pbn = pbn;
> > > dm_conn_state->vcpi_slots = slot_num;
> > >
> > > -   drm_dp_mst_atomic_enable_dsc(state, aconnector-
> > > > port, dm_conn_state->pbn,
> > > -false);
> > > +   ret = drm_dp_mst_atomic_enable_dsc(state,
> > > aconnector->port,
> > > +  dm_conn_state-
> > > > pbn, false);
> > > +   if (ret < 0)
> > > +   return ret;
> > > +
> > > continue;
> > > }
> > >
> > > @@ -9523,10 +9526,9 @@ static int amdgpu_dm_atomic_check(struct
> > > drm_device *dev,
> > >
> > >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > > if (dc_resource_is_dsc_encoding_supported(dc)) {
> > > -   if (!pre_validate_dsc(state, _state, vars)) {
> > > -   ret = -EINVAL;
> > > +   ret = pre_validate_dsc(state, _state, vars);
> > > +   if (ret != 0)
> > > goto fail;
> > > -   }
> > > }
> > >  #endif
> > >
> > > @@ -9621,9 +9623,9 @@ static int amdgpu_dm_atomic_check(struct
> > > drm_device *dev,
> > > }
> > >
> > >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > > -   if (!compute_mst_dsc_configs_for_state(state, dm_state-
> > > > context, vars)) {
> > > +   ret = compute_mst_dsc_configs_for_state(state, dm_state-
> > > > context, vars);
> > > +   if (ret) {
> > >
> > > DRM_DEBUG_DRIVER("compute_mst_dsc_configs_for_state()
> > > failed\n");
> > > -   ret = -EINVAL;
> > > goto fail;
> > > 

Re: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-18 Thread Lyude Paul
JFYI, Coverity pointed out one more issue with this series so I'm going to
send out a respin real quick to fix it. It's just a missing variable
assignment (we leave ret unassigned by mistake in
pre_compute_mst_dsc_configs()) so I will carry over your r-b on it.

On Wed, 2022-11-16 at 04:39 +, Lin, Wayne wrote:
> [Public]
> 
> All the patch set looks good to me. Feel free to add:
> Reviewed-by: Wayne Lin 
> 
> Again, thank you Lyude for helping on this!!!
> 
> Regards,
> Wayne
> > -Original Message-
> > From: Lyude Paul 
> > Sent: Tuesday, November 15, 2022 6:18 AM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Wentland, Harry ; sta...@vger.kernel.org;
> > Li, Sun peng (Leo) ; Siqueira, Rodrigo
> > ; Deucher, Alexander
> > ; Koenig, Christian
> > ; Pan, Xinhui ; David
> > Airlie ; Daniel Vetter ; Kazlauskas,
> > Nicholas ; Pillai, Aurabindo
> > ; Li, Roman ; Zuo, Jerry
> > ; Wu, Hersen ; Lin, Wayne
> > ; Thomas Zimmermann ;
> > Mahfooz, Hamza ; Hung, Alex
> > ; Mikita Lipski ; Liu,
> > Wenjing ; Francis, David
> > ; open list:DRM DRIVERS  > de...@lists.freedesktop.org>; open list 
> > Subject: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and
> > deadlocking
> > 
> > It appears that amdgpu makes the mistake of completely ignoring the return
> > values from the DP MST helpers, and instead just returns a simple 
> > true/false.
> > In this case, it seems to have come back to bite us because as a result of
> > simply returning false from compute_mst_dsc_configs_for_state(), amdgpu
> > had no way of telling when a deadlock happened from these helpers. This
> > could definitely result in some kernel splats.
> > 
> > V2:
> > * Address Wayne's comments (fix another bunch of spots where we weren't
> >   passing down return codes)
> > 
> > Signed-off-by: Lyude Paul 
> > Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
> > Cc: Harry Wentland 
> > Cc:  # v5.6+
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  18 +-
> >  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 235 ++--
> > --
> >  .../display/amdgpu_dm/amdgpu_dm_mst_types.h   |  12 +-
> >  3 files changed, 147 insertions(+), 118 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 0db2a88cd4d7b..852a2100c6b38 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -6462,7 +6462,7 @@ static int
> > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > struct drm_connector_state *new_con_state;
> > struct amdgpu_dm_connector *aconnector;
> > struct dm_connector_state *dm_conn_state;
> > -   int i, j;
> > +   int i, j, ret;
> > int vcpi, pbn_div, pbn, slot_num = 0;
> > 
> > for_each_new_connector_in_state(state, connector,
> > new_con_state, i) { @@ -6509,8 +6509,11 @@ static int
> > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > dm_conn_state->pbn = pbn;
> > dm_conn_state->vcpi_slots = slot_num;
> > 
> > -   drm_dp_mst_atomic_enable_dsc(state, aconnector-
> > > port, dm_conn_state->pbn,
> > -false);
> > +   ret = drm_dp_mst_atomic_enable_dsc(state,
> > aconnector->port,
> > +  dm_conn_state-
> > > pbn, false);
> > +   if (ret < 0)
> > +   return ret;
> > +
> > continue;
> > }
> > 
> > @@ -9523,10 +9526,9 @@ static int amdgpu_dm_atomic_check(struct
> > drm_device *dev,
> > 
> >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > if (dc_resource_is_dsc_encoding_supported(dc)) {
> > -   if (!pre_validate_dsc(state, _state, vars)) {
> > -   ret = -EINVAL;
> > +   ret = pre_validate_dsc(state, _state, vars);
> > +   if (ret != 0)
> > goto fail;
> > -   }
> > }
> >  #endif
> > 
> > @@ -9621,9 +9623,9 @@ static int amdgpu_dm_atomic_check(struct
> > drm_device *dev,
> > }
> > 
> >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > -   if (!compute_mst_dsc_configs_for_state(state, dm_state-
> > > context, vars)) {
> > +   ret = compute_mst_dsc_configs_for_state(state, dm_state-
> > > context, vars);
> > +   if (ret) {
> > 
> > DRM_DEBUG_DRIVER("compute_mst_dsc_configs_for_state()
> > failed\n");
> > -   ret = -EINVAL;
> > goto fail;
> > }
> > 
> > diff --git
> > a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> > index 6ff96b4bdda5c..bba2e8aaa2c20 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> > +++
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> > @@ -703,13 +703,13 @@ 

Re: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-16 Thread Lyude Paul
On Wed, 2022-11-16 at 04:39 +, Lin, Wayne wrote:
> [Public]
> 
> All the patch set looks good to me. Feel free to add:
> Reviewed-by: Wayne Lin 
> 
> Again, thank you Lyude for helping on this!!!

No problem! I was the one who introduced the bug anyway :P, I'm just glad we
were able to fix this on time.

Harry, Alex - feel free to merge this on whatever branch you want (I'm fine
with the mst-helper bits going through amd's branch, especially since AMD is
the only driver using the dsc stuff right now)

> 
> Regards,
> Wayne
> > -Original Message-
> > From: Lyude Paul 
> > Sent: Tuesday, November 15, 2022 6:18 AM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Wentland, Harry ; sta...@vger.kernel.org;
> > Li, Sun peng (Leo) ; Siqueira, Rodrigo
> > ; Deucher, Alexander
> > ; Koenig, Christian
> > ; Pan, Xinhui ; David
> > Airlie ; Daniel Vetter ; Kazlauskas,
> > Nicholas ; Pillai, Aurabindo
> > ; Li, Roman ; Zuo, Jerry
> > ; Wu, Hersen ; Lin, Wayne
> > ; Thomas Zimmermann ;
> > Mahfooz, Hamza ; Hung, Alex
> > ; Mikita Lipski ; Liu,
> > Wenjing ; Francis, David
> > ; open list:DRM DRIVERS  > de...@lists.freedesktop.org>; open list 
> > Subject: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and
> > deadlocking
> > 
> > It appears that amdgpu makes the mistake of completely ignoring the return
> > values from the DP MST helpers, and instead just returns a simple 
> > true/false.
> > In this case, it seems to have come back to bite us because as a result of
> > simply returning false from compute_mst_dsc_configs_for_state(), amdgpu
> > had no way of telling when a deadlock happened from these helpers. This
> > could definitely result in some kernel splats.
> > 
> > V2:
> > * Address Wayne's comments (fix another bunch of spots where we weren't
> >   passing down return codes)
> > 
> > Signed-off-by: Lyude Paul 
> > Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
> > Cc: Harry Wentland 
> > Cc:  # v5.6+
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  18 +-
> >  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 235 ++--
> > --
> >  .../display/amdgpu_dm/amdgpu_dm_mst_types.h   |  12 +-
> >  3 files changed, 147 insertions(+), 118 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 0db2a88cd4d7b..852a2100c6b38 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -6462,7 +6462,7 @@ static int
> > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > struct drm_connector_state *new_con_state;
> > struct amdgpu_dm_connector *aconnector;
> > struct dm_connector_state *dm_conn_state;
> > -   int i, j;
> > +   int i, j, ret;
> > int vcpi, pbn_div, pbn, slot_num = 0;
> > 
> > for_each_new_connector_in_state(state, connector,
> > new_con_state, i) { @@ -6509,8 +6509,11 @@ static int
> > dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
> > dm_conn_state->pbn = pbn;
> > dm_conn_state->vcpi_slots = slot_num;
> > 
> > -   drm_dp_mst_atomic_enable_dsc(state, aconnector-
> > > port, dm_conn_state->pbn,
> > -false);
> > +   ret = drm_dp_mst_atomic_enable_dsc(state,
> > aconnector->port,
> > +  dm_conn_state-
> > > pbn, false);
> > +   if (ret < 0)
> > +   return ret;
> > +
> > continue;
> > }
> > 
> > @@ -9523,10 +9526,9 @@ static int amdgpu_dm_atomic_check(struct
> > drm_device *dev,
> > 
> >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > if (dc_resource_is_dsc_encoding_supported(dc)) {
> > -   if (!pre_validate_dsc(state, _state, vars)) {
> > -   ret = -EINVAL;
> > +   ret = pre_validate_dsc(state, _state, vars);
> > +   if (ret != 0)
> > goto fail;
> > -   }
> > }
> >  #endif
> > 
> > @@ -9621,9 +9623,9 @@ static int amdgpu_dm_atomic_check(struct
> > drm_device *dev,
> > }
> > 
> >  #if defined(CONFIG_DRM_AMD_DC_DCN)
> > -   if (!compute_mst_dsc_configs_for_state(state, dm_state-
> > > context, vars)) {
> > +   ret = compute_mst_dsc_configs_for_state(state, dm_state-
> > > context, vars);
> > +   if (ret) {
> > 
> > DRM_DEBUG_DRIVER("compute_mst_dsc_configs_for_state()
> > failed\n");
> > -   ret = -EINVAL;
> > goto fail;
> > }
> > 
> > diff --git
> > a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> > index 6ff96b4bdda5c..bba2e8aaa2c20 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> > +++
> > 

RE: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-15 Thread Lin, Wayne
[Public]

All the patch set looks good to me. Feel free to add:
Reviewed-by: Wayne Lin 

Again, thank you Lyude for helping on this!!!

Regards,
Wayne
> -Original Message-
> From: Lyude Paul 
> Sent: Tuesday, November 15, 2022 6:18 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Wentland, Harry ; sta...@vger.kernel.org;
> Li, Sun peng (Leo) ; Siqueira, Rodrigo
> ; Deucher, Alexander
> ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter ; Kazlauskas,
> Nicholas ; Pillai, Aurabindo
> ; Li, Roman ; Zuo, Jerry
> ; Wu, Hersen ; Lin, Wayne
> ; Thomas Zimmermann ;
> Mahfooz, Hamza ; Hung, Alex
> ; Mikita Lipski ; Liu,
> Wenjing ; Francis, David
> ; open list:DRM DRIVERS  de...@lists.freedesktop.org>; open list 
> Subject: [PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and
> deadlocking
> 
> It appears that amdgpu makes the mistake of completely ignoring the return
> values from the DP MST helpers, and instead just returns a simple true/false.
> In this case, it seems to have come back to bite us because as a result of
> simply returning false from compute_mst_dsc_configs_for_state(), amdgpu
> had no way of telling when a deadlock happened from these helpers. This
> could definitely result in some kernel splats.
> 
> V2:
> * Address Wayne's comments (fix another bunch of spots where we weren't
>   passing down return codes)
> 
> Signed-off-by: Lyude Paul 
> Fixes: 8c20a1ed9b4f ("drm/amd/display: MST DSC compute fair share")
> Cc: Harry Wentland 
> Cc:  # v5.6+
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  18 +-
>  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 235 ++--
> --
>  .../display/amdgpu_dm/amdgpu_dm_mst_types.h   |  12 +-
>  3 files changed, 147 insertions(+), 118 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 0db2a88cd4d7b..852a2100c6b38 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -6462,7 +6462,7 @@ static int
> dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
>   struct drm_connector_state *new_con_state;
>   struct amdgpu_dm_connector *aconnector;
>   struct dm_connector_state *dm_conn_state;
> - int i, j;
> + int i, j, ret;
>   int vcpi, pbn_div, pbn, slot_num = 0;
> 
>   for_each_new_connector_in_state(state, connector,
> new_con_state, i) { @@ -6509,8 +6509,11 @@ static int
> dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state,
>   dm_conn_state->pbn = pbn;
>   dm_conn_state->vcpi_slots = slot_num;
> 
> - drm_dp_mst_atomic_enable_dsc(state, aconnector-
> >port, dm_conn_state->pbn,
> -  false);
> + ret = drm_dp_mst_atomic_enable_dsc(state,
> aconnector->port,
> +dm_conn_state-
> >pbn, false);
> + if (ret < 0)
> + return ret;
> +
>   continue;
>   }
> 
> @@ -9523,10 +9526,9 @@ static int amdgpu_dm_atomic_check(struct
> drm_device *dev,
> 
>  #if defined(CONFIG_DRM_AMD_DC_DCN)
>   if (dc_resource_is_dsc_encoding_supported(dc)) {
> - if (!pre_validate_dsc(state, _state, vars)) {
> - ret = -EINVAL;
> + ret = pre_validate_dsc(state, _state, vars);
> + if (ret != 0)
>   goto fail;
> - }
>   }
>  #endif
> 
> @@ -9621,9 +9623,9 @@ static int amdgpu_dm_atomic_check(struct
> drm_device *dev,
>   }
> 
>  #if defined(CONFIG_DRM_AMD_DC_DCN)
> - if (!compute_mst_dsc_configs_for_state(state, dm_state-
> >context, vars)) {
> + ret = compute_mst_dsc_configs_for_state(state, dm_state-
> >context, vars);
> + if (ret) {
> 
>   DRM_DEBUG_DRIVER("compute_mst_dsc_configs_for_state()
> failed\n");
> - ret = -EINVAL;
>   goto fail;
>   }
> 
> diff --git
> a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> index 6ff96b4bdda5c..bba2e8aaa2c20 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> +++
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> @@ -703,13 +703,13 @@ static int bpp_x16_from_pbn(struct
> dsc_mst_fairness_params param, int pbn)
>   return dsc_config.bits_per_pixel;
>  }
> 
> -static bool increase_dsc_bpp(struct drm_atomic_state *state,
> -  struct drm_dp_mst_topology_state *mst_state,
> -  struct dc_link *dc_link,
> -  struct dsc_mst_fairness_params *params,
> -  struct dsc_mst_fairness_vars *vars,
> -  int count,
> -