Re: [PATCH] drm/amdgpu: fix amdgpu_ras_block_late_init error handler

2022-02-22 Thread Kenny Ho
On Thu, Feb 17, 2022 at 2:06 PM Alex Deucher  wrote:
>
> On Thu, Feb 17, 2022 at 2:04 PM Nick Desaulniers wrote:
> >
> >
> > Alex,
> > Has AMD been able to set up clang builds, yet?
>
> No.  I think some individual teams do, but as far as I know it has
> not yet been integrated into our larger CI systems.

I added a clang build to our CI last night, so hopefully we will
catch these from now on.

Kenny

>
> Alex
>
>
> >
> > --
> > Thanks,
> > ~Nick Desaulniers


Re: [PATCH] drm/amdgpu: fix amdgpu_ras_block_late_init error handler

2022-02-17 Thread Alex Deucher
On Thu, Feb 17, 2022 at 2:04 PM Nick Desaulniers wrote:
>
> On Thu, Feb 17, 2022 at 8:16 AM Alex Deucher  wrote:
> >
> > Applied.  Thanks!
> >
> > Alex
>
> Alex,
> Has AMD been able to set up clang builds, yet?

No.  I think some individual teams do, but as far as I know it has
not yet been integrated into our larger CI systems.

Alex


>
> --
> Thanks,
> ~Nick Desaulniers


Re: [PATCH] drm/amdgpu: fix amdgpu_ras_block_late_init error handler

2022-02-17 Thread Alex Deucher
Applied.  Thanks!

Alex

On Thu, Feb 17, 2022 at 10:57 AM Luben Tuikov  wrote:
>
> Thanks for catching this.
>
> Reviewed-by: Luben Tuikov 
>
> Regards,
> Luben
>
> On 2022-02-17 10:38, t...@redhat.com wrote:
> > From: Tom Rix 
> >
> > Clang build fails with
> > amdgpu_ras.c:2416:7: error: variable 'ras_obj' is used uninitialized
> >   whenever 'if' condition is true
> >   if (adev->in_suspend || amdgpu_in_reset(adev)) {
> >   ^
> >
> > amdgpu_ras.c:2453:6: note: uninitialized use occurs here
> >  if (ras_obj->ras_cb)
> >  ^~~
> >
> > There is a logic error in the error handler's labels.
> > For example, sysfs: is the last goto label in the normal code but
> > sits in the middle of the error handler.  Rework the error handler.
> >
> > cleanup: is the first error, so its handler should be last.
> >
> > interrupt: is the second error, so its handler is next.  interrupt:
> > handles the failure of amdgpu_ras_interrupt_add_handler() by
> > calling amdgpu_ras_interrupt_remove_handler().  This is wrong:
> > remove() assumes the interrupt has been set up, not torn down by
> > add().  Change the goto label to cleanup.
> >
> > sysfs: is the last error, so its handler should be first.  sysfs:
> > handles the failure of amdgpu_ras_sysfs_create() by calling
> > amdgpu_ras_sysfs_remove().  But when create() fails nothing has
> > been added, so there is nothing to remove.  This error
> > handler is not needed.  Remove it and change the
> > goto label to interrupt.
> >
> > Fixes: b293e891b057 ("drm/amdgpu: add helper function to do common ras_late_init/fini (v3)")
> > Signed-off-by: Tom Rix 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 11 +++++------
> >  1 file changed, 5 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > index b5cd21cb6e58..c5c8a666110f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > @@ -2432,12 +2432,12 @@ int amdgpu_ras_block_late_init(struct amdgpu_device *adev,
> >  	if (ras_obj->ras_cb) {
> >  		r = amdgpu_ras_interrupt_add_handler(adev, ras_block);
> >  		if (r)
> > -			goto interrupt;
> > +			goto cleanup;
> >  	}
> >
> >  	r = amdgpu_ras_sysfs_create(adev, ras_block);
> >  	if (r)
> > -		goto sysfs;
> > +		goto interrupt;
> >
> >  	/* Those are the cached values at init.
> >  	 */
> > @@ -2447,12 +2447,11 @@ int amdgpu_ras_block_late_init(struct amdgpu_device *adev,
> >  	}
> >
> >  	return 0;
> > -cleanup:
> > -	amdgpu_ras_sysfs_remove(adev, ras_block);
> > -sysfs:
> > +
> > +interrupt:
> >  	if (ras_obj->ras_cb)
> >  		amdgpu_ras_interrupt_remove_handler(adev, ras_block);
> > -interrupt:
> > +cleanup:
> >  	amdgpu_ras_feature_enable(adev, ras_block, 0);
> >  	return r;
> >  }
>
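
For readers following the clang diagnostic quoted above, here is a minimal
sketch of the shape of the problem, assuming a simplified function. The names
struct block_obj, init_sketch() and handlers_removed_count() are hypothetical
and are not amdgpu code. An early exit jumps to the error labels before the
pointer is assigned, so the label it targets must come after any code that
reads that pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's ras_obj; not the amdgpu type. */
struct block_obj {
	int has_cb;
};

static int handlers_removed;	/* counts calls to the pretend remove step */

/*
 * early_exit models a branch that bails out before obj is assigned.
 * late_fail models a failure that happens after obj is assigned.
 */
int init_sketch(struct block_obj *src, int early_exit, int late_fail)
{
	struct block_obj *obj;	/* not assigned yet on the early path */
	int r = 0;

	if (early_exit) {
		r = -1;
		goto cleanup;	/* must not reach code that reads obj */
	}

	assert(src != NULL);
	obj = src;

	if (late_fail) {
		r = -2;
		goto interrupt;	/* obj is valid on this path */
	}
	return 0;

interrupt:
	if (obj->has_cb)
		handlers_removed++;
cleanup:
	/* Last label: touches nothing an early exit left unassigned. */
	return r;
}

int handlers_removed_count(void)
{
	return handlers_removed;
}
```

If cleanup: were placed above interrupt: and fell through into it, the early
exit would read obj before it was assigned, which is the pattern clang
reported.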


Re: [PATCH] drm/amdgpu: fix amdgpu_ras_block_late_init error handler

2022-02-17 Thread Luben Tuikov
Thanks for catching this.

Reviewed-by: Luben Tuikov 

Regards,
Luben

On 2022-02-17 10:38, t...@redhat.com wrote:
> From: Tom Rix 
> 
> Clang build fails with
> amdgpu_ras.c:2416:7: error: variable 'ras_obj' is used uninitialized
>   whenever 'if' condition is true
>   if (adev->in_suspend || amdgpu_in_reset(adev)) {
>   ^
> 
> amdgpu_ras.c:2453:6: note: uninitialized use occurs here
>  if (ras_obj->ras_cb)
>  ^~~
> 
> There is a logic error in the error handler's labels.
> For example, sysfs: is the last goto label in the normal code but
> sits in the middle of the error handler.  Rework the error handler.
> 
> cleanup: is the first error, so its handler should be last.
> 
> interrupt: is the second error, so its handler is next.  interrupt:
> handles the failure of amdgpu_ras_interrupt_add_handler() by
> calling amdgpu_ras_interrupt_remove_handler().  This is wrong:
> remove() assumes the interrupt has been set up, not torn down by
> add().  Change the goto label to cleanup.
> 
> sysfs: is the last error, so its handler should be first.  sysfs:
> handles the failure of amdgpu_ras_sysfs_create() by calling
> amdgpu_ras_sysfs_remove().  But when create() fails nothing has
> been added, so there is nothing to remove.  This error
> handler is not needed.  Remove it and change the
> goto label to interrupt.
> 
> Fixes: b293e891b057 ("drm/amdgpu: add helper function to do common ras_late_init/fini (v3)")
> Signed-off-by: Tom Rix 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index b5cd21cb6e58..c5c8a666110f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2432,12 +2432,12 @@ int amdgpu_ras_block_late_init(struct amdgpu_device *adev,
>  	if (ras_obj->ras_cb) {
>  		r = amdgpu_ras_interrupt_add_handler(adev, ras_block);
>  		if (r)
> -			goto interrupt;
> +			goto cleanup;
>  	}
>  
>  	r = amdgpu_ras_sysfs_create(adev, ras_block);
>  	if (r)
> -		goto sysfs;
> +		goto interrupt;
>  
>  	/* Those are the cached values at init.
>  	 */
> @@ -2447,12 +2447,11 @@ int amdgpu_ras_block_late_init(struct amdgpu_device *adev,
>  	}
>  
>  	return 0;
> -cleanup:
> -	amdgpu_ras_sysfs_remove(adev, ras_block);
> -sysfs:
> +
> +interrupt:
>  	if (ras_obj->ras_cb)
>  		amdgpu_ras_interrupt_remove_handler(adev, ras_block);
> -interrupt:
> +cleanup:
>  	amdgpu_ras_feature_enable(adev, ras_block, 0);
>  	return r;
>  }
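
The label ordering argued for in the commit message can be sketched in
isolation. This is a simplified illustration, assuming three setup steps that
each either succeed or fail cleanly. The helpers step_setup(), step_teardown(),
late_init_sketch() and steps_active() are hypothetical and are not part of
amdgpu. A failed step jumps to the label that undoes the steps before it, and
the labels appear in reverse order of setup.

```c
#include <assert.h>

/* Hypothetical setup steps, named after the stages in the patch. */
enum { STEP_FEATURE, STEP_INTERRUPT, STEP_SYSFS, STEP_COUNT };

static int active[STEP_COUNT];	/* 1 while a step is set up */

/* Set up one step, or fail with -1 and leave nothing behind. */
static int step_setup(int step, int fail_at)
{
	assert(step >= 0 && step < STEP_COUNT);
	if (step == fail_at)
		return -1;
	active[step] = 1;
	return 0;
}

static void step_teardown(int step)
{
	assert(step >= 0 && step < STEP_COUNT);
	active[step] = 0;
}

/* fail_at selects the step that fails; pass STEP_COUNT for success. */
int late_init_sketch(int fail_at)
{
	int r;

	r = step_setup(STEP_FEATURE, fail_at);
	if (r)
		return r;	/* nothing was set up, nothing to undo */

	r = step_setup(STEP_INTERRUPT, fail_at);
	if (r)
		goto cleanup;	/* the failed add left no handler to remove */

	r = step_setup(STEP_SYSFS, fail_at);
	if (r)
		goto interrupt;	/* the failed create left nothing to remove */

	return 0;

interrupt:
	step_teardown(STEP_INTERRUPT);
cleanup:
	step_teardown(STEP_FEATURE);
	return r;
}

/* Number of steps currently set up; used to check for leaks. */
int steps_active(void)
{
	int i, n = 0;

	for (i = 0; i < STEP_COUNT; i++)
		n += active[i];
	return n;
}
```

Whichever step fails, steps_active() should report zero afterwards, because
each label undoes only what was set up before the jump. The last step needs no
label of its own, which matches the removal of the sysfs: handler in the patch.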