Re: [PATCH] drm/amdgpu: add missed write lock for pci detected state pci_channel_io_normal

2021-09-30 Thread Andrey Grodzovsky

On 2021-09-30 10:00 p.m., Guchun Chen wrote:


When the PCI error state pci_channel_io_normal is detected, the driver
reports PCI_ERS_RESULT_CAN_RECOVER to the PCI core, and the PCI core
then runs the resume callback report_resume via pci_walk_bridge. That
callback ends up in amdgpu_pci_resume, where the write lock is released
unconditionally even though it was never acquired.



Good catch, but the issue is even wider in scope: what about
drm_sched_resubmit_jobs and drm_sched_start being called without the
schedulers having been stopped first? Better to put the entire body of
this function under a flag that is set only in the pci_channel_io_frozen
case. As far as I remember, we don't need to do anything in the
pci_channel_io_normal case.

Andrey
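
A minimal user-space sketch of the suggestion above, assuming a per-device flag that is set only on the pci_channel_io_frozen path. Every name in it (fake_dev, recovery_pending, fake_error_detected, fake_resume) is a hypothetical stand-in rather than amdgpu code; it only illustrates that the resume side should undo exactly what the error-detected side did, and nothing more.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from amdgpu. */
enum chan_state { CHAN_IO_NORMAL, CHAN_IO_FROZEN };

struct fake_dev {
	int  lock_depth;        /* models the write lock: 1 means held */
	bool sched_stopped;     /* models the schedulers being stopped */
	bool recovery_pending;  /* set only on the frozen path */
};

static void fake_error_detected(struct fake_dev *d, enum chan_state s)
{
	if (s != CHAN_IO_FROZEN)
		return;             /* io_normal: nothing locked, nothing stopped */

	d->lock_depth++;
	d->sched_stopped = true;
	d->recovery_pending = true;
}

static void fake_resume(struct fake_dev *d)
{
	/* Skip the whole body unless the frozen path prepared a recovery. */
	if (!d->recovery_pending)
		return;

	d->sched_stopped = false;   /* restart what was stopped */
	d->lock_depth--;            /* release what was acquired */
	d->recovery_pending = false;
}
```

With this shape, a resume that follows an io_normal report leaves the lock count at zero instead of driving it negative.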




Fixes: c9a6b82f45e2("drm/amdgpu: Implement DPC recovery")
Signed-off-by: Guchun Chen 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bb5ad2b6ca13..12f822d51de2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5370,6 +5370,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev 
*pdev, pci_channel_sta
  
  	switch (state) {

case pci_channel_io_normal:
+   amdgpu_device_lock_adev(adev, NULL);
return PCI_ERS_RESULT_CAN_RECOVER;
/* Fatal error, prepare for slot reset */
case pci_channel_io_frozen:


[PATCH] drm/amdgpu: add missed write lock for pci detected state pci_channel_io_normal

2021-09-30 Thread Guchun Chen
When the PCI error state pci_channel_io_normal is detected, the driver
reports PCI_ERS_RESULT_CAN_RECOVER to the PCI core, and the PCI core
then runs the resume callback report_resume via pci_walk_bridge. That
callback ends up in amdgpu_pci_resume, where the write lock is released
unconditionally even though it was never acquired.

Fixes: c9a6b82f45e2("drm/amdgpu: Implement DPC recovery")
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bb5ad2b6ca13..12f822d51de2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5370,6 +5370,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev 
*pdev, pci_channel_sta
 
switch (state) {
case pci_channel_io_normal:
+   amdgpu_device_lock_adev(adev, NULL);
return PCI_ERS_RESULT_CAN_RECOVER;
/* Fatal error, prepare for slot reset */
case pci_channel_io_frozen:
-- 
2.17.1



Re: [PATCH v3] drm/dp: Add Additional DP2 Headers

2021-09-30 Thread Rodrigo Siqueira
Applied to drm-misc-next.

Thanks

On 09/28, Harry Wentland wrote:
> On 2021-09-27 15:23, Fangzhi Zuo wrote:
> > Include FEC, DSC, Link Training related headers.
> > 
> > Change since v2
> > - Align with the spec for DP_DSC_SUPPORT_AND_DSC_DECODER_COUNT
> > 
> > Signed-off-by: Fangzhi Zuo 
> 
> Reviewed-by: Harry Wentland 
> 
> Harry
> 
> > ---
> > This patch is based on top of the other DP2.0 work in
> > "drm/dp: add LTTPR DP 2.0 DPCD addresses"
> > ---
> >  include/drm/drm_dp_helper.h | 20 
> >  1 file changed, 20 insertions(+)
> > 
> > diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
> > index 1d5b3dbb6e56..a1df35aa6e68 100644
> > --- a/include/drm/drm_dp_helper.h
> > +++ b/include/drm/drm_dp_helper.h
> > @@ -453,6 +453,7 @@ struct drm_panel;
> >  # define DP_FEC_UNCORR_BLK_ERROR_COUNT_CAP  (1 << 1)
> >  # define DP_FEC_CORR_BLK_ERROR_COUNT_CAP    (1 << 2)
> >  # define DP_FEC_BIT_ERROR_COUNT_CAP         (1 << 3)
> > +#define DP_FEC_CAPABILITY_1                 0x091   /* 2.0 */
> >  
> >  /* DP-HDMI2.1 PCON DSC ENCODER SUPPORT */
> >  #define DP_PCON_DSC_ENCODER_CAP_SIZE        0xC     /* 0x9E - 0x92 */
> > @@ -537,6 +538,9 @@ struct drm_panel;
> >  #define DP_DSC_BRANCH_OVERALL_THROUGHPUT_1  0x0a1
> >  #define DP_DSC_BRANCH_MAX_LINE_WIDTH        0x0a2
> >  
> > +/* DFP Capability Extension */
> > +#define DP_DFP_CAPABILITY_EXTENSION_SUPPORT 0x0a3   /* 2.0 */
> > +
> >  /* Link Configuration */
> >  #define DP_LINK_BW_SET                      0x100
> >  # define DP_LINK_RATE_TABLE                 0x00    /* eDP 1.4 */
> > @@ -688,6 +692,7 @@ struct drm_panel;
> >  
> >  #define DP_DSC_ENABLE   0x160   /* DP 1.4 */
> >  # define DP_DECOMPRESSION_EN                (1 << 0)
> > +#define DP_DSC_CONFIGURATION                0x161   /* DP 2.0 */
> >  
> >  #define DP_PSR_EN_CFG  0x170   /* XXX 1.2? */
> >  # define DP_PSR_ENABLE BIT(0)
> > @@ -743,6 +748,7 @@ struct drm_panel;
> >  # define DP_RECEIVE_PORT_0_STATUS  (1 << 0)
> >  # define DP_RECEIVE_PORT_1_STATUS  (1 << 1)
> >  # define DP_STREAM_REGENERATION_STATUS  (1 << 2) /* 2.0 */
> > +# define DP_INTRA_HOP_AUX_REPLY_INDICATION (1 << 3) /* 2.0 */
> >  
> >  #define DP_ADJUST_REQUEST_LANE0_1  0x206
> >  #define DP_ADJUST_REQUEST_LANE2_3  0x207
> > @@ -865,6 +871,8 @@ struct drm_panel;
> >  # define DP_PHY_TEST_PATTERN_80BIT_CUSTOM   0x4
> >  # define DP_PHY_TEST_PATTERN_CP2520 0x5
> >  
> > +#define DP_PHY_SQUARE_PATTERN  0x249
> > +
> >  #define DP_TEST_HBR2_SCRAMBLER_RESET        0x24A
> >  #define DP_TEST_80BIT_CUSTOM_PATTERN_7_0    0x250
> >  #define DP_TEST_80BIT_CUSTOM_PATTERN_15_8   0x251
> > @@ -1109,6 +1117,18 @@ struct drm_panel;
> >  #define DP_128B132B_TRAINING_AUX_RD_INTERVAL   0x2216 /* 2.0 */
> >  # define DP_128B132B_TRAINING_AUX_RD_INTERVAL_MASK 0x7f
> >  
> > +#define DP_TEST_264BIT_CUSTOM_PATTERN_7_0  0x2230
> > +#define DP_TEST_264BIT_CUSTOM_PATTERN_263_256  0x2250
> > +
> > +/* DSC Extended Capability Branch Total DSC Resources */
> > +#define DP_DSC_SUPPORT_AND_DSC_DECODER_COUNT   0x2260  /* 2.0 */
> > +# define DP_DSC_DECODER_COUNT_MASK (0b111 << 5)
> > +# define DP_DSC_DECODER_COUNT_SHIFT 5
> > +#define DP_DSC_MAX_SLICE_COUNT_AND_AGGREGATION_0   0x2270  /* 2.0 */
> > +# define DP_DSC_DECODER_0_MAXIMUM_SLICE_COUNT_MASK (1 << 0)
> > +# define DP_DSC_DECODER_0_AGGREGATION_SUPPORT_MASK (0b111 << 1)
> > +# define DP_DSC_DECODER_0_AGGREGATION_SUPPORT_SHIFT 1
> > +
> >  /* Protocol Converter Extension */
> >  /* HDMI CEC tunneling over AUX DP 1.3 section 5.3.3.3.1 DPCD 1.4+ */
> >  #define DP_CEC_TUNNELING_CAPABILITY         0x3000
> > 
> 

-- 
Rodrigo Siqueira
https://siqueira.tech
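
As a hedged illustration of how a MASK/SHIFT pair like DP_DSC_DECODER_COUNT_MASK and DP_DSC_DECODER_COUNT_SHIFT from the patch is typically consumed, here is a small sketch that extracts the raw 3-bit field from a register byte. The SKETCH_ macros and the helper name are assumptions made for this example, and 0x7 replaces 0b111 because binary literals are a compiler extension before C23. How the raw field value maps to an actual decoder count is defined by the DP specification and is not shown here.

```c
#include <stdint.h>

/* Local copies for illustration only; the real definitions live in
 * include/drm/drm_dp_helper.h under the DP_DSC_DECODER_COUNT_* names. */
#define SKETCH_DSC_DECODER_COUNT_SHIFT 5
#define SKETCH_DSC_DECODER_COUNT_MASK  (0x7u << SKETCH_DSC_DECODER_COUNT_SHIFT)

/* Return the raw 3-bit field stored in bits 7:5 of the register byte. */
static unsigned int dsc_decoder_count_field(uint8_t reg)
{
	return (reg & SKETCH_DSC_DECODER_COUNT_MASK) >>
	       SKETCH_DSC_DECODER_COUNT_SHIFT;
}
```

Masking before shifting keeps the helper correct even if the register value is later widened beyond one byte.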


Re: [PATCH 1/2] drm/amdkfd: remove redundant iommu cleanup code

2021-09-30 Thread Zhu, James
[AMD Official Use Only]


Reviewed-by: James Zhu  for the series



James Zhu


From: amd-gfx  on behalf of Yifan Zhang 

Sent: Tuesday, September 28, 2021 4:28 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Kuehling, Felix ; Zhang, Yifan 

Subject: [PATCH 1/2] drm/amdkfd: remove redundant iommu cleanup code

kfd_resume doesn't involve iommu operation, remove
redundant iommu cleanup code.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index c2a4d920da40..4a416231b24c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -1085,18 +1085,12 @@ static int kfd_resume(struct kfd_dev *kfd)
 int err = 0;

 err = kfd->dqm->ops.start(kfd->dqm);
-   if (err) {
+   if (err)
 dev_err(kfd_device,
 "Error starting queue manager for device %x:%x\n",
 kfd->pdev->vendor, kfd->pdev->device);
-   goto dqm_start_error;
-   }

 return err;
-
-dqm_start_error:
-   kfd_iommu_suspend(kfd);
-   return err;
 }

 static inline void kfd_queue_work(struct workqueue_struct *wq,
--
2.25.1



Re: [PATCH] drm/amdkfd: match the signatures of the real and stub kgd2kfd_probe()

2021-09-30 Thread Alex Deucher
On Thu, Sep 30, 2021 at 4:35 PM  wrote:
>
> From: Tom Rix 
>
> When CONFIG_HSA_AMD=n, there is this error:
> amdgpu_amdkfd.c:75:56: error: incompatible type for
>   argument 2 of ‘kgd2kfd_probe’
>75 |  adev->kfd.dev = kgd2kfd_probe((struct kgd_dev *)adev, vf);
>
> amdgpu_amdkfd.h:349:17: note: declared here
>   349 | struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>   struct pci_dev *pdev,
>
> The signature of the stub kgd2kfd_probe() does not match the real one.
> So change the stub to match.
>
> Fixes: 920f37e6a3fc ("drm/amdkfd: clean up parameters in kgd2kfd_probe")
> Signed-off-by: Tom Rix 

Anson fixed this up earlier today.  Thanks!

Alex


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 38d883dffc20..69de31754907 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -346,8 +346,7 @@ static inline void kgd2kfd_exit(void)
>  }
>
>  static inline
> -struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev,
> -   unsigned int asic_type, bool vf)
> +struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
>  {
> return NULL;
>  }
> --
> 2.26.3
>


[PATCH] drm/amdkfd: match the signatures of the real and stub kgd2kfd_probe()

2021-09-30 Thread trix
From: Tom Rix 

When CONFIG_HSA_AMD=n, there is this error:
amdgpu_amdkfd.c:75:56: error: incompatible type for
  argument 2 of ‘kgd2kfd_probe’
   75 |  adev->kfd.dev = kgd2kfd_probe((struct kgd_dev *)adev, vf);

amdgpu_amdkfd.h:349:17: note: declared here
  349 | struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
  struct pci_dev *pdev,

The signature of the stub kgd2kfd_probe() does not match the real one.
So change the stub to match.

Fixes: 920f37e6a3fc ("drm/amdkfd: clean up parameters in kgd2kfd_probe")
Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 38d883dffc20..69de31754907 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -346,8 +346,7 @@ static inline void kgd2kfd_exit(void)
 }
 
 static inline
-struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev,
-   unsigned int asic_type, bool vf)
+struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
 {
return NULL;
 }
-- 
2.26.3
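
A minimal sketch of the pattern this patch repairs: a real prototype and a static inline stub selected by a config macro, which must share one signature so that callers compile either way. SKETCH_FEATURE_ENABLED, sketch_probe and sketch_attach are hypothetical names used only for this example; they stand in for the CONFIG_HSA_AMD switch and kgd2kfd_probe().

```c
#include <stdbool.h>
#include <stddef.h>

struct kgd_dev;   /* opaque in this sketch */
struct kfd_dev;   /* opaque in this sketch */

#ifdef SKETCH_FEATURE_ENABLED
/* Real implementation, defined in another translation unit. */
struct kfd_dev *sketch_probe(struct kgd_dev *kgd, bool vf);
#else
/* Stub: same parameter list as the real prototype above. If the two
 * drift apart, callers break only in the configuration that is built
 * less often, which is how the reported error slipped through. */
static inline struct kfd_dev *sketch_probe(struct kgd_dev *kgd, bool vf)
{
	(void)kgd;
	(void)vf;
	return NULL;
}
#endif

/* One caller, compiled unchanged in both configurations. */
static bool sketch_attach(struct kgd_dev *kgd, bool vf)
{
	return sketch_probe(kgd, vf) != NULL;
}
```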



Re: [PATCH] amd/amdkfd: remove svms declaration to avoid werror

2021-09-30 Thread Alex Deucher
On Thu, Sep 30, 2021 at 11:53 AM Alex Sierra  wrote:
>
> svm_range_list svms declaration removed to avoid werror when
> CONFIG_HSA_AMD_SVM is not enabled.
>
> Signed-off-by: Alex Sierra 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 4de907f3e66a..f1e7edeb4e6b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1251,7 +1251,6 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
> *filep,
> struct kfd_process_device *pdd;
> void *mem;
> struct kfd_dev *dev;
> -   struct svm_range_list *svms = &p->svms;
> int idr_handle;
> long err;
> uint64_t offset = args->mmap_offset;
> @@ -1264,18 +1263,18 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
> *filep,
> /* Flush pending deferred work to avoid racing with deferred actions
>  * from previous memory map changes (e.g. munmap).
>  */
> -   svm_range_list_lock_and_flush_work(svms, current->mm);
> -   mutex_lock(&svms->lock);
> +   svm_range_list_lock_and_flush_work(&p->svms, current->mm);
> +   mutex_lock(&p->svms.lock);
> mmap_write_unlock(current->mm);
> -   if (interval_tree_iter_first(&svms->objects,
> +   if (interval_tree_iter_first(&p->svms.objects,
>  args->va_addr >> PAGE_SHIFT,
>  (args->va_addr + args->size - 1) >> PAGE_SHIFT)) {
> pr_err("Address: 0x%llx already allocated by SVM\n",
> args->va_addr);
> -   mutex_unlock(&svms->lock);
> +   mutex_unlock(&p->svms.lock);
> return -EADDRINUSE;
> }
> -   mutex_unlock(&svms->lock);
> +   mutex_unlock(&p->svms.lock);
>  #endif
> dev = kfd_device_by_id(args->gpu_id);
> if (!dev)
> --
> 2.32.0
>


Re: [PATCH] drm/amd: Guard IS_OLD_GCC assignment with CONFIG_CC_IS_GCC

2021-09-30 Thread Alex Deucher
On Thu, Sep 30, 2021 at 12:02 PM Nathan Chancellor  wrote:
>
> cc-ifversion only works for GCC, as clang pretends to be GCC 4.2.1 for
> glibc compatibility, which means IS_OLD_GCC will get set and unsupported
> flags will be passed to clang when building certain code within the DCN
> files:
>
> clang-14: error: unknown argument: '-mpreferred-stack-boundary=4'
> make[5]: *** [scripts/Makefile.build:277: 
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_resource.o] Error 1
>
> Guard the call to cc-ifversion with CONFIG_CC_IS_GCC so that everything
> continues to work properly. See commit 00db297106e8 ("drm/amdgpu: fix stack
> alignment ABI mismatch for GCC 7.1+") for more context.
>
> Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1468
> Signed-off-by: Nathan Chancellor 

Harry beat you to the punch by a little bit.

Thanks!

Alex

> ---
>  drivers/gpu/drm/amd/display/dc/dcn201/Makefile | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
> index d98d69705117..96cbd4ccd344 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
> @@ -14,9 +14,11 @@ ifdef CONFIG_PPC64
>  CFLAGS_$(AMDDALPATH)/dc/dcn201/dcn201_resource.o := -mhard-float -maltivec
>  endif
>
> +ifdef CONFIG_CC_IS_GCC
>  ifeq ($(call cc-ifversion, -lt, 0701, y), y)
>  IS_OLD_GCC = 1
>  endif
> +endif
>
>  ifdef CONFIG_X86
>  ifdef IS_OLD_GCC
>
> base-commit: b47b99e30cca8906753c83205e8c6179045dd725
> --
> 2.33.0.591.gddb1055343
>


Re: [PATCH] drm/amd: Return NULL instead of false in dcn201_acquire_idle_pipe_for_layer()

2021-09-30 Thread Alex Deucher
On Thu, Sep 30, 2021 at 1:23 PM Nick Desaulniers
 wrote:
>
> On Thu, Sep 30, 2021 at 10:10 AM Alex Deucher  wrote:
> >
> > Applied.  Thanks!
> >
> > Alex
> >
> > On Thu, Sep 30, 2021 at 12:23 PM Nathan Chancellor  
> > wrote:
> > >
> > > Clang warns:
>
> Any chance AMDGPU folks can look into adding clang to the CI roster?

We can look into it.  We may already be doing it for some groups.

Alex

> --
> Thanks,
> ~Nick Desaulniers


Re: [PATCH] drm/amd/display: fix DCC settings for DCN3

2021-09-30 Thread Joshua Ashton

Thanks for the info!

- Joshie ✨

On 9/30/21 18:33, Marek Olšák wrote:
The name is kind of correct. It means "64B with no 128B cache line 
straddling", which really means just 64B independent blocks with a small 
modification to support DCC image stores.  They are not true 128B 
independent blocks.


Marek

On Thu, Sep 30, 2021 at 12:35 PM Joshua Ashton wrote:


Can we please add documentation for this enum?

This was not necessarily a typo, but rather me misunderstanding it and
it happening to work in my testing.

I guess I don't understand why hubp_ind_block_64b_no_128bcl is for 64b
&& 128b when it specifically says "no_128" in the name.

Is there something about it I am missing or is it just misleading
naming?

- Joshie ✨

On 9/30/21 17:14, Marek Olšák wrote:
 > I've also amended the version bump that I forgot to do:
 >
 > -#define KMS_DRIVER_MINOR       43
 > +#define KMS_DRIVER_MINOR       44
 >
 > Marek
 >
 > On Thu, Sep 30, 2021 at 12:06 PM Alex Deucher wrote:
 >
 >     Acked-by: Alex Deucher
 >
 >     On Thu, Sep 30, 2021 at 11:50 AM Marek Olšák wrote:
 >      >
 >      > Hi,
 >      >
 >      > Just discovered this typo. Please review.
 >      >
 >      > Thanks,
 >      > Marek
 >



Re: [PATCH] drm/amd/display: fix DCC settings for DCN3

2021-09-30 Thread Marek Olšák
The name is kind of correct. It means "64B with no 128B cache line
straddling", which really means just 64B independent blocks with a small
modification to support DCC image stores.  They are not true 128B
independent blocks.

Marek

On Thu, Sep 30, 2021 at 12:35 PM Joshua Ashton  wrote:

> Can we please add documentation for this enum?
>
> This was not necessarily a typo, but rather me misunderstanding it and
> it happening to work in my testing.
>
> I guess I don't understand why hubp_ind_block_64b_no_128bcl is for 64b
> && 128b when it specifically says "no_128" in the name.
>
> Is there something about it I am missing or is it just misleading naming?
>
> - Joshie ✨
>
> On 9/30/21 17:14, Marek Olšák wrote:
> > I've also amended the version bump that I forgot to do:
> >
> > -#define KMS_DRIVER_MINOR   43
> > +#define KMS_DRIVER_MINOR   44
> >
> > Marek
> >
> > On Thu, Sep 30, 2021 at 12:06 PM Alex Deucher wrote:
> >
> > Acked-by: Alex Deucher
> >
> > On Thu, Sep 30, 2021 at 11:50 AM Marek Olšák wrote:
> >  >
> >  > Hi,
> >  >
> >  > Just discovered this typo. Please review.
> >  >
> >  > Thanks,
> >  > Marek
> >
>
>


Re: [PATCH] drm/amd: Initialize remove_mpcc in dcn201_update_mpcc()

2021-09-30 Thread Alex Deucher
Applied.  Thanks!

Alex

On Thu, Sep 30, 2021 at 12:16 PM Nathan Chancellor  wrote:
>
> Clang warns:
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:505:6: error: 
> variable 'remove_mpcc' is used uninitialized whenever 'if' condition is false 
> [-Werror,-Wsometimes-uninitialized]
> if (mpc->funcs->get_mpcc_for_dpp_from_secondary)
> ^~~
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:509:6: note: 
> uninitialized use occurs here
> if (remove_mpcc != NULL && mpc->funcs->remove_mpcc_from_secondary)
> ^~~
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:505:2: note: 
> remove the 'if' if its condition is always true
> if (mpc->funcs->get_mpcc_for_dpp_from_secondary)
> ^~~~
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:442:26: note: 
> initialize the variable 'remove_mpcc' to silence this warning
> struct mpcc *remove_mpcc;
> ^
>  = NULL
> 1 error generated.
>
> The code already handles remove_mpcc being NULL just fine so initialize
> it to NULL at the beginning of the function so it is never used
> uninitialized.
>
> Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1469
> Signed-off-by: Nathan Chancellor 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c 
> b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
> index ceaaeeb8f2de..cfd09b3f705e 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
> @@ -439,7 +439,7 @@ void dcn201_update_mpcc(struct dc *dc, struct pipe_ctx 
> *pipe_ctx)
> bool per_pixel_alpha = pipe_ctx->plane_state->per_pixel_alpha && 
> pipe_ctx->bottom_pipe;
> int mpcc_id, dpp_id;
> struct mpcc *new_mpcc;
> -   struct mpcc *remove_mpcc;
> +   struct mpcc *remove_mpcc = NULL;
> struct mpc *mpc = dc->res_pool->mpc;
> struct mpc_tree *mpc_tree_params = 
> &(pipe_ctx->stream_res.opp->mpc_tree_params);
>
>
> base-commit: 30fc33064c846df29888c3c61e30a064aad3a342
> --
> 2.33.0.591.gddb1055343
>


Re: [PATCH] drm/amd: Return NULL instead of false in dcn201_acquire_idle_pipe_for_layer()

2021-09-30 Thread Alex Deucher
Applied.  Thanks!

Alex

On Thu, Sep 30, 2021 at 12:23 PM Nathan Chancellor  wrote:
>
> Clang warns:
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_resource.c:1017:10: 
> error: expression which evaluates to zero treated as a null pointer constant 
> of type 'struct pipe_ctx *' [-Werror,-Wnon-literal-null-conversion]
> return false;
>^
> 1 error generated.
>
> Use NULL instead of false since the function is returning a pointer
> rather than a boolean.
>
> Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1470
> Signed-off-by: Nathan Chancellor 
> ---
>  drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
> index aec276e1db65..8523a048e6f6 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
> @@ -1014,7 +1014,7 @@ static struct pipe_ctx 
> *dcn201_acquire_idle_pipe_for_layer(
> ASSERT(0);
>
> if (!idle_pipe)
> -   return false;
> +   return NULL;
>
> idle_pipe->stream = head_pipe->stream;
> idle_pipe->stream_res.tg = head_pipe->stream_res.tg;
>
> base-commit: b47b99e30cca8906753c83205e8c6179045dd725
> --
> 2.33.0.591.gddb1055343
>


Re: [PATCH] drm/amd/display: fix DCC settings for DCN3

2021-09-30 Thread Joshua Ashton

Can we please add documentation for this enum?

This was not necessarily a typo, but rather me misunderstanding it and
it happening to work in my testing.


I guess I don't understand why hubp_ind_block_64b_no_128bcl is for 64b 
&& 128b when it specifically says "no_128" in the name.


Is there something about it I am missing or is it just misleading naming?

- Joshie ✨

On 9/30/21 17:14, Marek Olšák wrote:

I've also amended the version bump that I forgot to do:

-#define KMS_DRIVER_MINOR       43
+#define KMS_DRIVER_MINOR       44

Marek

On Thu, Sep 30, 2021 at 12:06 PM Alex Deucher wrote:


Acked-by: Alex Deucher

On Thu, Sep 30, 2021 at 11:50 AM Marek Olšák wrote:
 >
 > Hi,
 >
 > Just discovered this typo. Please review.
 >
 > Thanks,
 > Marek





Re: [PATCH] drm/amdkfd: avoid conflicting address mappings

2021-09-30 Thread Sierra Guiza, Alejandro (Alex)



On 9/29/2021 9:15 PM, Felix Kuehling wrote:

On 2021-09-29 7:35 p.m., Mike Lothian wrote:

Hi

This patch is causing a compile failure for me

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_chardev.c:1254:25: error:
unused variable 'svms' [-Werror,-Wunused-variable]
    struct svm_range_list *svms = &p->svms;
   ^
1 error generated.

I'll turn off Werror

I guess the struct svm_range_list *svms declaration should be under 
#if IS_ENABLED(CONFIG_HSA_AMD_SVM). Alternatively, we could get rid of 
it and use p->svms directly (it's used in 3 places in that function).


Would you like to propose a patch for that?


I have submitted a patch that fixes this for review.

Regards,
Alex Sierra



Thanks,
  Felix




On Mon, 19 Jul 2021 at 22:19, Alex Sierra  wrote:

[Why]
Avoid conflicts between address ranges already mapped by the SVM
mechanism and ranges that are allocated again through ioctl_alloc in
the same process, and vice versa.

[How]
For ioctl_alloc_memory_of_gpu allocations:
check that the address range passed into the ioctl memory
alloc does not already exist in the kfd_process
svms->objects interval tree.

For SVM allocations:
look for the address range in the VA interval tree of
the VM inside each pdd used by the kfd_process.

Signed-off-by: Alex Sierra 
---
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 13 
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 79 
+++-

  2 files changed, 75 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c

index 67541c30327a..f39baaa22a62 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1251,6 +1251,7 @@ static int 
kfd_ioctl_alloc_memory_of_gpu(struct file *filep,

 struct kfd_process_device *pdd;
 void *mem;
 struct kfd_dev *dev;
+   struct svm_range_list *svms = &p->svms;
 int idr_handle;
 long err;
 uint64_t offset = args->mmap_offset;
@@ -1259,6 +1260,18 @@ static int 
kfd_ioctl_alloc_memory_of_gpu(struct file *filep,

 if (args->size == 0)
 return -EINVAL;

+#if IS_ENABLED(CONFIG_HSA_AMD_SVM)
+   mutex_lock(&svms->lock);
+   if (interval_tree_iter_first(&svms->objects,
+    args->va_addr >> PAGE_SHIFT,
+    (args->va_addr + args->size - 1) >> PAGE_SHIFT)) {
+   pr_err("Address: 0x%llx already allocated by SVM\n",
+   args->va_addr);
+   mutex_unlock(&svms->lock);
+   return -EADDRINUSE;
+   }
+   mutex_unlock(&svms->lock);
+#endif
 dev = kfd_device_by_id(args->gpu_id);
 if (!dev)
 return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c

index 31f3f24cef6a..043ee0467916 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2581,9 +2581,54 @@ int svm_range_list_init(struct kfd_process *p)
 return 0;
  }

+/**
+ * svm_range_is_vm_bo_mapped - check if virtual address range mapped already
+ * @p: current kfd_process
+ * @start: range start address, in pages
+ * @last: range last address, in pages
+ *
+ * The purpose is to avoid virtual address ranges already allocated by
+ * kfd_ioctl_alloc_memory_of_gpu ioctl.
+ * It looks for each pdd in the kfd_process.
+ *
+ * Context: Process context
+ *
+ * Return 0 - OK, if the range is not mapped.
+ * Otherwise error code:
+ * -EADDRINUSE - if address is mapped already by kfd_ioctl_alloc_memory_of_gpu
+ * -ERESTARTSYS - A wait for the buffer to become unreserved was interrupted by
+ * a signal. Release all buffer reservations and return to user-space.
+ */
+static int
+svm_range_is_vm_bo_mapped(struct kfd_process *p, uint64_t start, uint64_t last)
+{
+   uint32_t i;
+   int r;
+
+   for (i = 0; i < p->n_pdds; i++) {
+   struct amdgpu_vm *vm;
+
+   if (!p->pdds[i]->drm_priv)
+   continue;
+
+   vm = drm_priv_to_vm(p->pdds[i]->drm_priv);
+   r = amdgpu_bo_reserve(vm->root.bo, false);
+   if (r)
+   return r;
+   if (interval_tree_iter_first(&vm->va, start, last)) {
+   pr_debug("Range [0x%llx 0x%llx] already mapped\n", start, last);
+   amdgpu_bo_unreserve(vm->root.bo);
+   return -EADDRINUSE;
+   }
+   amdgpu_bo_unreserve(vm->root.bo);
+   }
+
+   return 0;
+}
+
  /**
   * svm_range_is_valid - check if virtual address range is valid
- * @mm: current process mm_struct
+ * @mm: current kfd_process
   * @start: range start address, in pages
   * @size: range size, in pages
   *
@@ -2592,28 +2637,27 @@ int svm_range_list_init(struct kfd_process *p)
   * Context: Process context
   *
   * Return:
- *  true - valid svm range
- *  false - invalid svm 
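
The check described under [How] comes down to an inclusive page-range overlap test. The sketch below is an assumption-laden model of that idea: it scans a plain array instead of the kernel's interval tree, and SKETCH_PAGE_SHIFT, page_range, to_pages and range_conflicts are hypothetical names. It reuses the same start and last computation that appears in the patch (va_addr >> PAGE_SHIFT and (va_addr + size - 1) >> PAGE_SHIFT).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumes 4 KiB pages for this example */

/* Inclusive range expressed in pages, like the start/last arguments above. */
struct page_range {
	uint64_t start;
	uint64_t last;
};

/* Convert a byte address and a non-zero size to an inclusive page range. */
static struct page_range to_pages(uint64_t va_addr, uint64_t size)
{
	struct page_range r = {
		va_addr >> SKETCH_PAGE_SHIFT,
		(va_addr + size - 1) >> SKETCH_PAGE_SHIFT,
	};
	return r;
}

/* Linear stand-in for interval_tree_iter_first(): true if q overlaps
 * any already-mapped range. */
static bool range_conflicts(const struct page_range *mapped, size_t n,
			    struct page_range q)
{
	for (size_t i = 0; i < n; i++) {
		if (mapped[i].start <= q.last && q.start <= mapped[i].last)
			return true;
	}
	return false;
}
```

A caller modelled on the patch would return -EADDRINUSE when range_conflicts() reports an overlap.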

[PATCH] drm/amd: Return NULL instead of false in dcn201_acquire_idle_pipe_for_layer()

2021-09-30 Thread Nathan Chancellor
Clang warns:

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_resource.c:1017:10: 
error: expression which evaluates to zero treated as a null pointer constant of 
type 'struct pipe_ctx *' [-Werror,-Wnon-literal-null-conversion]
return false;
   ^
1 error generated.

Use NULL instead of false since the function is returning a pointer
rather than a boolean.

Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
Link: https://github.com/ClangBuiltLinux/linux/issues/1470
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
index aec276e1db65..8523a048e6f6 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c
@@ -1014,7 +1014,7 @@ static struct pipe_ctx 
*dcn201_acquire_idle_pipe_for_layer(
ASSERT(0);
 
if (!idle_pipe)
-   return false;
+   return NULL;
 
idle_pipe->stream = head_pipe->stream;
idle_pipe->stream_res.tg = head_pipe->stream_res.tg;

base-commit: b47b99e30cca8906753c83205e8c6179045dd725
-- 
2.33.0.591.gddb1055343
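
For readers unfamiliar with the warning, a short sketch of the same situation outside the driver: a function whose return type is a pointer should signal "nothing found" with NULL. Returning false happens to compile because it evaluates to zero, which is exactly what clang's -Wnon-literal-null-conversion flags. The struct and function names are hypothetical.

```c
#include <stddef.h>

struct sketch_pipe {
	int in_use;
};

/* Return the first idle pipe, or NULL when every pipe is busy. */
static struct sketch_pipe *first_idle_pipe(struct sketch_pipe *pipes, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!pipes[i].in_use)
			return &pipes[i];
	}
	return NULL;   /* a pointer result, so NULL rather than false */
}
```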



[PATCH] drm/amd: Initialize remove_mpcc in dcn201_update_mpcc()

2021-09-30 Thread Nathan Chancellor
Clang warns:

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:505:6: error: 
variable 'remove_mpcc' is used uninitialized whenever 'if' condition is false 
[-Werror,-Wsometimes-uninitialized]
if (mpc->funcs->get_mpcc_for_dpp_from_secondary)
^~~
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:509:6: note: 
uninitialized use occurs here
if (remove_mpcc != NULL && mpc->funcs->remove_mpcc_from_secondary)
^~~
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:505:2: note: 
remove the 'if' if its condition is always true
if (mpc->funcs->get_mpcc_for_dpp_from_secondary)
^~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_hwseq.c:442:26: note: 
initialize the variable 'remove_mpcc' to silence this warning
struct mpcc *remove_mpcc;
^
 = NULL
1 error generated.

The code already handles remove_mpcc being NULL just fine so initialize
it to NULL at the beginning of the function so it is never used
uninitialized.

Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
Link: https://github.com/ClangBuiltLinux/linux/issues/1469
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
index ceaaeeb8f2de..cfd09b3f705e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c
@@ -439,7 +439,7 @@ void dcn201_update_mpcc(struct dc *dc, struct pipe_ctx 
*pipe_ctx)
bool per_pixel_alpha = pipe_ctx->plane_state->per_pixel_alpha && 
pipe_ctx->bottom_pipe;
int mpcc_id, dpp_id;
struct mpcc *new_mpcc;
-   struct mpcc *remove_mpcc;
+   struct mpcc *remove_mpcc = NULL;
struct mpc *mpc = dc->res_pool->mpc;
struct mpc_tree *mpc_tree_params = 
&(pipe_ctx->stream_res.opp->mpc_tree_params);
 

base-commit: 30fc33064c846df29888c3c61e30a064aad3a342
-- 
2.33.0.591.gddb1055343
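
The pattern behind this warning can be shown in a few lines, assuming a made-up ops table with an optional callback (sketch_ops, sketch_node and lookup_id are hypothetical names). The pointer is assigned only when the callback exists but is read unconditionally afterwards, so it needs a defined initial value.

```c
#include <stddef.h>

struct sketch_node {
	int id;
};

struct sketch_ops {
	/* Optional callback; may be NULL, like the mpc->funcs hooks above. */
	struct sketch_node *(*find)(int id);
};

static struct sketch_node the_node = { 42 };

static struct sketch_node *find_42(int id)
{
	return id == 42 ? &the_node : NULL;
}

static int lookup_id(const struct sketch_ops *ops, int id)
{
	/* Without "= NULL" the read below is of an indeterminate value
	 * whenever ops->find is NULL. */
	struct sketch_node *found = NULL;

	if (ops->find)
		found = ops->find(id);

	if (found != NULL)
		return found->id;
	return -1;
}
```

Because the later code already treats NULL as "nothing to remove", initializing at the declaration is enough and no extra else branch is needed.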



Re: [PATCH] drm/amd/display: fix DCC settings for DCN3

2021-09-30 Thread Marek Olšák
I've also amended the version bump that I forgot to do:

-#define KMS_DRIVER_MINOR   43
+#define KMS_DRIVER_MINOR   44

Marek

On Thu, Sep 30, 2021 at 12:06 PM Alex Deucher  wrote:

> Acked-by: Alex Deucher 
>
> On Thu, Sep 30, 2021 at 11:50 AM Marek Olšák  wrote:
> >
> > Hi,
> >
> > Just discovered this typo. Please review.
> >
> > Thanks,
> > Marek
>


Re: [PATCH] drm/amd/display: fix DCC settings for DCN3

2021-09-30 Thread Alex Deucher
Acked-by: Alex Deucher 

On Thu, Sep 30, 2021 at 11:50 AM Marek Olšák  wrote:
>
> Hi,
>
> Just discovered this typo. Please review.
>
> Thanks,
> Marek


[PATCH] drm/amd: Guard IS_OLD_GCC assignment with CONFIG_CC_IS_GCC

2021-09-30 Thread Nathan Chancellor
cc-ifversion only works for GCC, as clang pretends to be GCC 4.2.1 for
glibc compatibility, which means IS_OLD_GCC will get set and unsupported
flags will be passed to clang when building certain code within the DCN
files:

clang-14: error: unknown argument: '-mpreferred-stack-boundary=4'
make[5]: *** [scripts/Makefile.build:277: 
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn201/dcn201_resource.o] Error 1

Guard the call to cc-ifversion with CONFIG_CC_IS_GCC so that everything
continues to work properly. See commit 00db297106e8 ("drm/amdgpu: fix stack
alignment ABI mismatch for GCC 7.1+") for more context.

Fixes: ff7e396f822f ("drm/amd/display: add cyan_skillfish display support")
Link: https://github.com/ClangBuiltLinux/linux/issues/1468
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/dc/dcn201/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
index d98d69705117..96cbd4ccd344 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
@@ -14,9 +14,11 @@ ifdef CONFIG_PPC64
 CFLAGS_$(AMDDALPATH)/dc/dcn201/dcn201_resource.o := -mhard-float -maltivec
 endif
 
+ifdef CONFIG_CC_IS_GCC
 ifeq ($(call cc-ifversion, -lt, 0701, y), y)
 IS_OLD_GCC = 1
 endif
+endif
 
 ifdef CONFIG_X86
 ifdef IS_OLD_GCC

base-commit: b47b99e30cca8906753c83205e8c6179045dd725
-- 
2.33.0.591.gddb1055343
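
Editorial sketch of the guard described in the commit message above. It models the Makefile decision in plain C, assuming a compiler is described by an "is this really GCC" flag (standing in for CONFIG_CC_IS_GCC) and a version encoded as major * 100 + minor, the same shape as the 0701 literal passed to cc-ifversion. The struct and function names are hypothetical and do not exist in the kernel tree.

```c
#include <stdbool.h>

/* Hypothetical model of the Makefile guard; not kernel code. */
struct compiler {
	bool is_gcc;   /* stands in for CONFIG_CC_IS_GCC */
	int version;   /* major * 100 + minor, e.g. 701 for GCC 7.1 */
};

/* Unguarded check: only the reported version is compared. */
static bool is_old_gcc_unguarded(struct compiler cc)
{
	return cc.version < 701;
}

/* Guarded check: the version test is only consulted for real GCC. */
static bool is_old_gcc_guarded(struct compiler cc)
{
	return cc.is_gcc && cc.version < 701;
}
```

With clang reporting itself as 4.2.1 (encoded as 402), the unguarded check wrongly classifies it as an old GCC, while the guarded one does not.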



[PATCH] amd/amdkfd: remove svms declaration to avoid werror

2021-09-30 Thread Alex Sierra
Remove the svm_range_list svms declaration to avoid a -Werror build
failure when CONFIG_HSA_AMD_SVM is not enabled.

Signed-off-by: Alex Sierra 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 4de907f3e66a..f1e7edeb4e6b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1251,7 +1251,6 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
*filep,
struct kfd_process_device *pdd;
void *mem;
struct kfd_dev *dev;
-   struct svm_range_list *svms = >svms;
int idr_handle;
long err;
uint64_t offset = args->mmap_offset;
@@ -1264,18 +1263,18 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
*filep,
/* Flush pending deferred work to avoid racing with deferred actions
 * from previous memory map changes (e.g. munmap).
 */
-   svm_range_list_lock_and_flush_work(svms, current->mm);
-   mutex_lock(>lock);
+   svm_range_list_lock_and_flush_work(>svms, current->mm);
+   mutex_lock(>svms.lock);
mmap_write_unlock(current->mm);
-   if (interval_tree_iter_first(>objects,
+   if (interval_tree_iter_first(>svms.objects,
 args->va_addr >> PAGE_SHIFT,
 (args->va_addr + args->size - 1) >> 
PAGE_SHIFT)) {
pr_err("Address: 0x%llx already allocated by SVM\n",
args->va_addr);
-   mutex_unlock(>lock);
+   mutex_unlock(>svms.lock);
return -EADDRINUSE;
}
-   mutex_unlock(>lock);
+   mutex_unlock(>svms.lock);
 #endif
dev = kfd_device_by_id(args->gpu_id);
if (!dev)
-- 
2.32.0
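
Editorial sketch of the pattern this patch applies. The assumption here is that the build error was an unused-variable warning promoted by -Werror: a local declared unconditionally but used only inside a config-guarded region. The struct layouts and the FEATURE_SVM macro below are hypothetical stand-ins, not the kfd definitions.

```c
/* Hypothetical stand-ins for the kfd structures; only the shape matters. */
struct svm_range_list { int lock; };
struct kfd_process { struct svm_range_list svms; };

/* Define FEATURE_SVM to model CONFIG_HSA_AMD_SVM being enabled. */
static int alloc_memory(struct kfd_process *p)
{
	int touched = 0;
#ifdef FEATURE_SVM
	/* Access the member directly, so no local exists outside the #ifdef. */
	p->svms.lock = 1;
	touched = 1;
#else
	(void)p;
#endif
	return touched;
}
```

Because nothing is declared outside the guarded region, the function compiles without warnings whether or not the feature macro is defined.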



[PATCH] drm/amd/display: fix DCC settings for DCN3

2021-09-30 Thread Marek Olšák
Hi,

Just discovered this typo. Please review.

Thanks,
Marek
From 3abee824223e214d8a74c3f1b47a24e5ea9a9a34 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Thu, 30 Sep 2021 11:13:59 -0400
Subject: [PATCH] drm/amd/display: fix DCC settings for DCN3
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

ind_block_64b_no_128bcl means INDEP_64B && INDEP_128B &&
MAX_COMPRESSED_BLOCK_SIZE == 64B. Only used by gfx10.3.

ind_block_64b means INDEP_64B && !INDEP_128B &&
MAX_COMPRESSED_BLOCK_SIZE == 64B. Only used by gfx9 and gfx10.

Signed-off-by: Marek Olšák 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 1 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index df83b1f438b6..ebdb959f4e1f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -97,6 +97,7 @@
  * - 3.41.0 - Add video codec query
  * - 3.42.0 - Add 16bpc fixed point display support
  * - 3.43.0 - Add device hot plug/unplug support
+ * - 3.44.0 - DCN3 supports DCC independent block settings: !64B && 128B, 64B && 128B
  */
 #define KMS_DRIVER_MAJOR	3
 #define KMS_DRIVER_MINOR	43
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a399a984b8a6..49be531d68ae 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5105,11 +5105,11 @@ fill_gfx9_plane_attributes_from_modifiers(struct amdgpu_device *adev,
 		dcc->independent_64b_blks = independent_64b_blks;
 		if (AMD_FMT_MOD_GET(TILE_VERSION, modifier) == AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS) {
 			if (independent_64b_blks && independent_128b_blks)
-dcc->dcc_ind_blk = hubp_ind_block_64b;
+dcc->dcc_ind_blk = hubp_ind_block_64b_no_128bcl;
 			else if (independent_128b_blks)
 dcc->dcc_ind_blk = hubp_ind_block_128b;
 			else if (independent_64b_blks && !independent_128b_blks)
-dcc->dcc_ind_blk = hubp_ind_block_64b_no_128bcl;
+dcc->dcc_ind_blk = hubp_ind_block_64b;
 			else
 dcc->dcc_ind_blk = hubp_ind_block_unconstrained;
 		} else {
-- 
2.25.1
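
Editorial sketch of the selection logic after the fix, written as a standalone function so the mapping in the commit message can be checked case by case. The enumerator names are taken from the diff; the enum type name, the numeric values and the pick_ind_blk() helper are assumptions made for illustration.

```c
#include <stdbool.h>

/* Names mirror the patch; the enum values here are illustrative only. */
enum hubp_ind_block_size {
	hubp_ind_block_unconstrained,
	hubp_ind_block_64b,
	hubp_ind_block_128b,
	hubp_ind_block_64b_no_128bcl,
};

/* Hypothetical helper reproducing the corrected if/else chain. */
static enum hubp_ind_block_size
pick_ind_blk(bool independent_64b_blks, bool independent_128b_blks)
{
	if (independent_64b_blks && independent_128b_blks)
		return hubp_ind_block_64b_no_128bcl;
	if (independent_128b_blks)
		return hubp_ind_block_128b;
	if (independent_64b_blks)
		return hubp_ind_block_64b;
	return hubp_ind_block_unconstrained;
}
```

Before the patch, the first and third results were swapped, so the "64B and 128B" case selected hubp_ind_block_64b.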



Re: [PATCH] drm/amdgpu: use generic fb helpers instead of setting up AMD own's.

2021-09-30 Thread Alex Deucher
@Christian Koenig
Have you had a chance to look at this yet?

Alex

On Mon, Sep 20, 2021 at 4:44 AM Thomas Zimmermann  wrote:
>
> Hi
>
> Am 20.09.21 um 10:41 schrieb Thomas Zimmermann:
> > (cc'ing dri-devel)
> >
> > Hi
> >
> > Am 13.09.21 um 16:36 schrieb Alex Deucher:
> >> On Thu, Sep 9, 2021 at 11:25 PM Evan Quan  wrote:
> >>>
> >>> With the shadow buffer support from generic framebuffer emulation, it's
> >>> possible now to have runpm kicked when no update for console.
> >>>
> >>> Change-Id: I285472c9100ee6f649d3f3f3548f402b9cd34eaf
> >>> Signed-off-by: Evan Quan 
> >>> Acked-by: Christian König 
> >>
> >> Reviewed-by: Alex Deucher 
> >
> > There was a long discussion about this change within radeon and the
> > result was that it cannot be done. [1] I don't remember the full
> > details, but semantics of the vmap/vunmap for dma-bufs were not
> > compatible IIRC. And the resolution was a redesign of the API.
>
> I posted a patchset with a new interface at [1].
>
> Best regards
> Thomas
>
> [1]
> https://lore.kernel.org/dri-devel/20201209142527.26415-1-tzimmerm...@suse.de/
>
> >
> > If that has changed, I'd be happy to see this patch merged. Otherwise,
> > it should better not be taken.
> >
> > Best regards
> > Thomas
> >
> > [1] https://patchwork.freedesktop.org/patch/400054/?series=83765=1
> >
> >>
> >>> --
> >>> v1->v2:
> >>>- rename amdgpu_align_pitch as amdgpu_gem_align_pitch to align with
> >>>  other APIs from the same file (Alex)
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/Makefile |   2 +-
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  12 +-
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |  11 +-
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  13 +
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c  | 388 
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  30 +-
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h|  20 -
> >>>   7 files changed, 50 insertions(+), 426 deletions(-)
> >>>   delete mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile
> >>> b/drivers/gpu/drm/amd/amdgpu/Makefile
> >>> index 8d0748184a14..73a2151ee43f 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> >>> @@ -45,7 +45,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
> >>>  amdgpu_atombios.o atombios_crtc.o amdgpu_connectors.o \
> >>>  atom.o amdgpu_fence.o amdgpu_ttm.o amdgpu_object.o
> >>> amdgpu_gart.o \
> >>>  amdgpu_encoders.o amdgpu_display.o amdgpu_i2c.o \
> >>> -   amdgpu_fb.o amdgpu_gem.o amdgpu_ring.o \
> >>> +   amdgpu_gem.o amdgpu_ring.o \
> >>>  amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \
> >>>  atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \
> >>>  atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> index 682d459e992a..bcc308b7f826 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> >>> @@ -3695,8 +3695,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >>>  /* Get a log2 for easy divisions. */
> >>>  adev->mm_stats.log2_max_MBps = ilog2(max(1u, max_MBps));
> >>>
> >>> -   amdgpu_fbdev_init(adev);
> >>> -
> >>>  r = amdgpu_pm_sysfs_init(adev);
> >>>  if (r) {
> >>>  adev->pm_sysfs_en = false;
> >>> @@ -3854,8 +3852,6 @@ void amdgpu_device_fini_hw(struct amdgpu_device
> >>> *adev)
> >>>  amdgpu_ucode_sysfs_fini(adev);
> >>>  sysfs_remove_files(>dev->kobj, amdgpu_dev_attributes);
> >>>
> >>> -   amdgpu_fbdev_fini(adev);
> >>> -
> >>>  amdgpu_irq_fini_hw(adev);
> >>>
> >>>  amdgpu_device_ip_fini_early(adev);
> >>> @@ -3931,7 +3927,7 @@ int amdgpu_device_suspend(struct drm_device
> >>> *dev, bool fbcon)
> >>>  drm_kms_helper_poll_disable(dev);
> >>>
> >>>  if (fbcon)
> >>> -   amdgpu_fbdev_set_suspend(adev, 1);
> >>> +
> >>> drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
> >>>
> >>>  cancel_delayed_work_sync(>delayed_init_work);
> >>>
> >>> @@ -4009,7 +4005,7 @@ int amdgpu_device_resume(struct drm_device
> >>> *dev, bool fbcon)
> >>>  flush_delayed_work(>delayed_init_work);
> >>>
> >>>  if (fbcon)
> >>> -   amdgpu_fbdev_set_suspend(adev, 0);
> >>> +
> >>> drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, false);
> >>>
> >>>  drm_kms_helper_poll_enable(dev);
> >>>
> >>> @@ -4638,7 +4634,7 @@ int amdgpu_do_asic_reset(struct list_head
> >>> *device_list_handle,
> >>>  if (r)
> >>>  goto out;
> >>>
> >>> -   amdgpu_fbdev_set_suspend(tmp_adev, 0);
> >>> +
> >>> 

Re: [PATCH] drm/amdkfd: Fix dummy kgd2kfd_probe parameters

2021-09-30 Thread Alex Deucher
On Thu, Sep 30, 2021 at 9:18 AM Anson Jacob  wrote:
>
> Commit 4d706ed6825f ("drm/amdkfd: clean up parameters in kgd2kfd_probe")
> updated parameters for kgd2kfd_probe. Update the dummy function as well
> when CONFIG_HSA_AMD is not enabled.
>
> Fixes: 4d706ed6825f ("drm/amdkfd: clean up parameters in kgd2kfd_probe")
> Signed-off-by: Anson Jacob 

Thanks!
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index 38d883dffc20..69de31754907 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -346,8 +346,7 @@ static inline void kgd2kfd_exit(void)
>  }
>
>  static inline
> -struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev,
> -   unsigned int asic_type, bool vf)
> +struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
>  {
> return NULL;
>  }
> --
> 2.25.1
>


Re: [PATCH] drm/amdgpu: print warning and taint kernel if lockup timeout is disabled

2021-09-30 Thread Deucher, Alexander
[AMD Official Use Only]

Acked-by: Alex Deucher 

From: Christian König 
Sent: Thursday, September 30, 2021 6:00 AM
To: Deucher, Alexander ; 
amd-gfx@lists.freedesktop.org 
Subject: [PATCH] drm/amdgpu: print warning and taint kernel if lockup timeout 
is disabled

Make sure that we notice this in error reports.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4d34b2da8582..8ee5bbc19f62 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3346,6 +3346,8 @@ static int amdgpu_device_get_job_timeout_settings(struct 
amdgpu_device *adev)
 continue;
 } else if (timeout < 0) {
 timeout = MAX_SCHEDULE_TIMEOUT;
+   dev_warn(adev->dev, "lockup timeout disabled");
+   add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
 } else {
 timeout = msecs_to_jiffies(timeout);
 }
--
2.25.1
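
Editorial sketch of the branch touched by this patch, reduced to a standalone function. A negative timeout disables lockup detection, which is the case the patch now flags. The SKETCH_ names, the fixed tick rate and resolve_timeout() are assumptions for illustration; the real code calls dev_warn() and add_taint() and uses the kernel's msecs_to_jiffies().

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in for the kernel's MAX_SCHEDULE_TIMEOUT. */
#define SKETCH_MAX_SCHEDULE_TIMEOUT LONG_MAX
/* Assumed tick rate for the sketch; the kernel uses its configured HZ. */
#define SKETCH_HZ 250

/* Returns the timeout in ticks and reports whether a warning is due. */
static long resolve_timeout(long timeout_ms, bool *warn)
{
	*warn = false;
	if (timeout_ms < 0) {
		*warn = true;   /* the patch warns and taints the kernel here */
		return SKETCH_MAX_SCHEDULE_TIMEOUT;
	}
	/* Simplified millisecond-to-tick conversion, rounding up. */
	return (timeout_ms * SKETCH_HZ + 999) / 1000;
}
```

The warn flag marks the one path where a hung job would never time out, which is why the patch wants it visible in error reports.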



Re: [PATCH 02/02 v2] drm/amd/display: add cyan_skillfish display support

2021-09-30 Thread Harry Wentland



On 2021-09-30 10:12, Alex Deucher wrote:
> On Thu, Sep 30, 2021 at 10:10 AM Alex Deucher  wrote:
>>
>> On Wed, Sep 29, 2021 at 10:00 PM Alex Deucher  wrote:
>>>
>>> On Wed, Sep 29, 2021 at 7:23 PM Mike Lothian  wrote:

>>>> Hi
>>>>
>>>> This patch is causing a failure for me when building with clang:
>>>>
>>>>
>>>> Enable DCN201 support in DC (DRM_AMD_DC_DCN201) [Y/n/?] (NEW) y
>>>> Enable HDCP support in DC (DRM_AMD_DC_HDCP) [Y/n/?] y
>>>> AMD DC support for Southern Islands ASICs (DRM_AMD_DC_SI) [N/y/?] n
>>>> Enable secure display support (DRM_AMD_SECURE_DISPLAY) [Y/n/?] y
>>>>   DESCEND objtool
>>>>   CALL    scripts/atomic/check-atomics.sh
>>>>   CALL    scripts/checksyscalls.sh
>>>>   CHK     include/generated/compile.h
>>>>   UPD     kernel/config_data
>>>>   GZIP    kernel/config_data.gz
>>>>   CC      kernel/configs.o
>>>>   AR      kernel/built-in.a
>>>>   CC      drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/clk_mgr.o
>>>>   CC      drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.o
>>>> clang-12: error: unknown argument: '-mpreferred-stack-boundary=4'
>>>> make[4]: *** [scripts/Makefile.build:278:
>>>> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.o] Error 1
>>>> make[4]: *** Waiting for unfinished jobs
>>>> make[3]: *** [scripts/Makefile.build:540: drivers/gpu/drm/amd/amdgpu] Error 2
>>>> make[2]: *** [scripts/Makefile.build:540: drivers/gpu/drm] Error 2
>>>> make[1]: *** [scripts/Makefile.build:540: drivers/gpu] Error 2
>>>> make: *** [Makefile:1868: drivers] Error 2
>>>
>>> The Makefiles for the new stuff added probably need to be fixed up for
>>> clang like the other Makefiles.  I can take a look tomorrow.
>>
>> I don't see anything off in the Makefiles.  Can you try with a clean tree?
>>
> 
> Nevermind, Harry found it.  patch on the list.
> 

I found a clang build error with dcn201_resource.o, not with dcn30_optc.o.

Not sure why you're seeing this with dcn30_optc.o.

Harry

> Alex
> 



Re: [PATCH 02/02 v2] drm/amd/display: add cyan_skillfish display support

2021-09-30 Thread Alex Deucher
On Thu, Sep 30, 2021 at 10:10 AM Alex Deucher  wrote:
>
> On Wed, Sep 29, 2021 at 10:00 PM Alex Deucher  wrote:
> >
> > On Wed, Sep 29, 2021 at 7:23 PM Mike Lothian  wrote:
> > >
> > > Hi
> > >
> > > This patch is causing a failure for me when building with clang:
> > >
> > >
> > > Enable DCN201 support in DC (DRM_AMD_DC_DCN201) [Y/n/?] (NEW) y
> > > Enable HDCP support in DC (DRM_AMD_DC_HDCP) [Y/n/?] y
> > > AMD DC support for Southern Islands ASICs (DRM_AMD_DC_SI) [N/y/?] n
> > > Enable secure display support (DRM_AMD_SECURE_DISPLAY) [Y/n/?] y
> > > >   DESCEND objtool
> > > >   CALL    scripts/atomic/check-atomics.sh
> > > >   CALL    scripts/checksyscalls.sh
> > > >   CHK     include/generated/compile.h
> > > >   UPD     kernel/config_data
> > > >   GZIP    kernel/config_data.gz
> > > >   CC      kernel/configs.o
> > > >   AR      kernel/built-in.a
> > > >   CC      drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/clk_mgr.o
> > > >   CC      drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.o
> > > clang-12: error: unknown argument: '-mpreferred-stack-boundary=4'
> > > make[4]: *** [scripts/Makefile.build:278:
> > > drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.o] Error 1
> > > make[4]: *** Waiting for unfinished jobs
> > > > make[3]: *** [scripts/Makefile.build:540: drivers/gpu/drm/amd/amdgpu] Error 2
> > > make[2]: *** [scripts/Makefile.build:540: drivers/gpu/drm] Error 2
> > > make[1]: *** [scripts/Makefile.build:540: drivers/gpu] Error 2
> > > make: *** [Makefile:1868: drivers] Error 2
> >
> > The Makefiles for the new stuff added probably need to be fixed up for
> > clang like the other Makefiles.  I can take a look tomorrow.
>
> I don't see anything off in the Makefiles.  Can you try with a clean tree?
>

Nevermind, Harry found it.  patch on the list.

Alex


Re: [PATCH] drm/amd/display: Don't use mpreferred-stack-boundary for clang on DCN201

2021-09-30 Thread Alex Deucher
On Thu, Sep 30, 2021 at 10:02 AM Harry Wentland  wrote:
>
> We were erroneously setting IS_OLD_GCC for clang since we didn't
> check first whether we're doing a GCC build.
>
> See dcn30/Makefile for reference.
>
> Fixes: 4ac93fa0ec12 ("drm/amd/display: add cyan_skillfish display support")
> Signed-off-by: Harry Wentland 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/display/dc/dcn201/Makefile | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
> index d98d69705117..f68038ceb1b1 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
> @@ -14,9 +14,12 @@ ifdef CONFIG_PPC64
>  CFLAGS_$(AMDDALPATH)/dc/dcn201/dcn201_resource.o := -mhard-float -maltivec
>  endif
>
> +ifdef CONFIG_CC_IS_GCC
>  ifeq ($(call cc-ifversion, -lt, 0701, y), y)
>  IS_OLD_GCC = 1
>  endif
> +CFLAGS_$(AMDDALPATH)/dc/dcn201/dcn201_resource.o += -mhard-float
> +endif
>
>  ifdef CONFIG_X86
>  ifdef IS_OLD_GCC
> --
> 2.33.0
>


Re: [PATCH 02/02 v2] drm/amd/display: add cyan_skillfish display support

2021-09-30 Thread Alex Deucher
On Wed, Sep 29, 2021 at 10:00 PM Alex Deucher  wrote:
>
> On Wed, Sep 29, 2021 at 7:23 PM Mike Lothian  wrote:
> >
> > Hi
> >
> > This patch is causing a failure for me when building with clang:
> >
> >
> > Enable DCN201 support in DC (DRM_AMD_DC_DCN201) [Y/n/?] (NEW) y
> > Enable HDCP support in DC (DRM_AMD_DC_HDCP) [Y/n/?] y
> > AMD DC support for Southern Islands ASICs (DRM_AMD_DC_SI) [N/y/?] n
> > Enable secure display support (DRM_AMD_SECURE_DISPLAY) [Y/n/?] y
> >   DESCEND objtool
> >   CALL    scripts/atomic/check-atomics.sh
> >   CALL    scripts/checksyscalls.sh
> >   CHK     include/generated/compile.h
> >   UPD     kernel/config_data
> >   GZIP    kernel/config_data.gz
> >   CC      kernel/configs.o
> >   AR      kernel/built-in.a
> >   CC      drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/clk_mgr.o
> >   CC      drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.o
> > clang-12: error: unknown argument: '-mpreferred-stack-boundary=4'
> > make[4]: *** [scripts/Makefile.build:278:
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.o] Error 1
> > make[4]: *** Waiting for unfinished jobs
> > make[3]: *** [scripts/Makefile.build:540: drivers/gpu/drm/amd/amdgpu] Error 2
> > make[2]: *** [scripts/Makefile.build:540: drivers/gpu/drm] Error 2
> > make[1]: *** [scripts/Makefile.build:540: drivers/gpu] Error 2
> > make: *** [Makefile:1868: drivers] Error 2
>
> The Makefiles for the new stuff added probably need to be fixed up for
> clang like the other Makefiles.  I can take a look tomorrow.

I don't see anything off in the Makefiles.  Can you try with a clean tree?

Alex


[PATCH] drm/amd/display: Don't use mpreferred-stack-boundary for clang on DCN201

2021-09-30 Thread Harry Wentland
We were erroneously setting IS_OLD_GCC for clang since we didn't
check first whether we're doing a GCC build.

See dcn30/Makefile for reference.

Fixes: 4ac93fa0ec12 ("drm/amd/display: add cyan_skillfish display support")
Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/dcn201/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
index d98d69705117..f68038ceb1b1 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn201/Makefile
@@ -14,9 +14,12 @@ ifdef CONFIG_PPC64
 CFLAGS_$(AMDDALPATH)/dc/dcn201/dcn201_resource.o := -mhard-float -maltivec
 endif
 
+ifdef CONFIG_CC_IS_GCC
 ifeq ($(call cc-ifversion, -lt, 0701, y), y)
 IS_OLD_GCC = 1
 endif
+CFLAGS_$(AMDDALPATH)/dc/dcn201/dcn201_resource.o += -mhard-float
+endif
 
 ifdef CONFIG_X86
 ifdef IS_OLD_GCC
-- 
2.33.0



Re: [PATCH 1/2] drm/amdgpu/jpeg2: move jpeg2 shared macro to header file

2021-09-30 Thread Leo Liu

The series is:

Reviewed-by: Leo Liu 

On 2021-09-29 3:57 p.m., James Zhu wrote:

Move jpeg2 shared macro to header file

Signed-off-by: James Zhu 
---
  drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 20 
  drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h | 20 
  2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
index 85967a5..299de1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
@@ -32,26 +32,6 @@
  #include "vcn/vcn_2_0_0_sh_mask.h"
  #include "ivsrcid/vcn/irqsrcs_vcn_2_0.h"
  
-#define mmUVD_JRBC_EXTERNAL_REG_INTERNAL_OFFSET 0x1bfff
-#define mmUVD_JPEG_GPCOM_CMD_INTERNAL_OFFSET 0x4029
-#define mmUVD_JPEG_GPCOM_DATA0_INTERNAL_OFFSET 0x402a
-#define mmUVD_JPEG_GPCOM_DATA1_INTERNAL_OFFSET 0x402b
-#define mmUVD_LMI_JRBC_RB_MEM_WR_64BIT_BAR_LOW_INTERNAL_OFFSET 0x40ea
-#define mmUVD_LMI_JRBC_RB_MEM_WR_64BIT_BAR_HIGH_INTERNAL_OFFSET 0x40eb
-#define mmUVD_LMI_JRBC_IB_VMID_INTERNAL_OFFSET 0x40cf
-#define mmUVD_LMI_JPEG_VMID_INTERNAL_OFFSET 0x40d1
-#define mmUVD_LMI_JRBC_IB_64BIT_BAR_LOW_INTERNAL_OFFSET 0x40e8
-#define mmUVD_LMI_JRBC_IB_64BIT_BAR_HIGH_INTERNAL_OFFSET 0x40e9
-#define mmUVD_JRBC_IB_SIZE_INTERNAL_OFFSET 0x4082
-#define mmUVD_LMI_JRBC_RB_MEM_RD_64BIT_BAR_LOW_INTERNAL_OFFSET 0x40ec
-#define mmUVD_LMI_JRBC_RB_MEM_RD_64BIT_BAR_HIGH_INTERNAL_OFFSET 0x40ed
-#define mmUVD_JRBC_RB_COND_RD_TIMER_INTERNAL_OFFSET 0x4085
-#define mmUVD_JRBC_RB_REF_DATA_INTERNAL_OFFSET 0x4084
-#define mmUVD_JRBC_STATUS_INTERNAL_OFFSET 0x4089
-#define mmUVD_JPEG_PITCH_INTERNAL_OFFSET 0x401f
-
-#define JRBC_DEC_EXTERNAL_REG_WRITE_ADDR   0x18000
-
  static void jpeg_v2_0_set_dec_ring_funcs(struct amdgpu_device *adev);
  static void jpeg_v2_0_set_irq_funcs(struct amdgpu_device *adev);
  static int jpeg_v2_0_set_powergating_state(void *handle,
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h 
b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h
index 15a344e..1a03baa 100644
--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h
@@ -24,6 +24,26 @@
  #ifndef __JPEG_V2_0_H__
  #define __JPEG_V2_0_H__
  
+#define mmUVD_JRBC_EXTERNAL_REG_INTERNAL_OFFSET 0x1bfff
+#define mmUVD_JPEG_GPCOM_CMD_INTERNAL_OFFSET 0x4029
+#define mmUVD_JPEG_GPCOM_DATA0_INTERNAL_OFFSET 0x402a
+#define mmUVD_JPEG_GPCOM_DATA1_INTERNAL_OFFSET 0x402b
+#define mmUVD_LMI_JRBC_RB_MEM_WR_64BIT_BAR_LOW_INTERNAL_OFFSET 0x40ea
+#define mmUVD_LMI_JRBC_RB_MEM_WR_64BIT_BAR_HIGH_INTERNAL_OFFSET 0x40eb
+#define mmUVD_LMI_JRBC_IB_VMID_INTERNAL_OFFSET 0x40cf
+#define mmUVD_LMI_JPEG_VMID_INTERNAL_OFFSET 0x40d1
+#define mmUVD_LMI_JRBC_IB_64BIT_BAR_LOW_INTERNAL_OFFSET 0x40e8
+#define mmUVD_LMI_JRBC_IB_64BIT_BAR_HIGH_INTERNAL_OFFSET 0x40e9
+#define mmUVD_JRBC_IB_SIZE_INTERNAL_OFFSET 0x4082
+#define mmUVD_LMI_JRBC_RB_MEM_RD_64BIT_BAR_LOW_INTERNAL_OFFSET 0x40ec
+#define mmUVD_LMI_JRBC_RB_MEM_RD_64BIT_BAR_HIGH_INTERNAL_OFFSET 0x40ed
+#define mmUVD_JRBC_RB_COND_RD_TIMER_INTERNAL_OFFSET 0x4085
+#define mmUVD_JRBC_RB_REF_DATA_INTERNAL_OFFSET 0x4084
+#define mmUVD_JRBC_STATUS_INTERNAL_OFFSET 0x4089
+#define mmUVD_JPEG_PITCH_INTERNAL_OFFSET 0x401f
+
+#define JRBC_DEC_EXTERNAL_REG_WRITE_ADDR   0x18000
+
  void jpeg_v2_0_dec_ring_insert_start(struct amdgpu_ring *ring);
  void jpeg_v2_0_dec_ring_insert_end(struct amdgpu_ring *ring);
  void jpeg_v2_0_dec_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 
seq,


[PATCH] drm/amdkfd: Fix dummy kgd2kfd_probe parameters

2021-09-30 Thread Anson Jacob
Commit 4d706ed6825f ("drm/amdkfd: clean up parameters in kgd2kfd_probe")
updated parameters for kgd2kfd_probe. Update the dummy function as well
when CONFIG_HSA_AMD is not enabled.

Fixes: 4d706ed6825f ("drm/amdkfd: clean up parameters in kgd2kfd_probe")
Signed-off-by: Anson Jacob 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 38d883dffc20..69de31754907 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -346,8 +346,7 @@ static inline void kgd2kfd_exit(void)
 }
 
 static inline
-struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev,
-   unsigned int asic_type, bool vf)
+struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
 {
return NULL;
 }
-- 
2.25.1
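
Editorial sketch of the stub pattern this patch keeps in sync. When a feature is compiled out, a static inline stub with the same prototype as the real function lets callers build unchanged, so the stub must track every signature change. The two-argument prototype is taken from the diff; SKETCH_HSA_AMD (modelling CONFIG_HSA_AMD) and kfd_available() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct kgd_dev;   /* opaque in this sketch */
struct kfd_dev;

/* SKETCH_HSA_AMD models CONFIG_HSA_AMD; leave it undefined to get the stub. */
#ifdef SKETCH_HSA_AMD
struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf);
#else
static inline struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
{
	(void)kgd;
	(void)vf;
	return NULL;
}
#endif

/* A caller written against the two-argument prototype builds either way. */
static bool kfd_available(struct kgd_dev *kgd)
{
	return kgd2kfd_probe(kgd, false) != NULL;
}
```

If the stub still took the old four arguments, this caller would fail to compile only in configurations with the feature disabled, which is the breakage the patch addresses.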



RE: [PATCH] drm/amdgpu: fix some repeated includings

2021-09-30 Thread 郭正奎
Actually, the duplicates are on lines 46, 47 and 62, 63.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 291a47f7992a..94fca56583a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -46,34 +46,32 @@
#include "vcn_v2_0.h"
#include "jpeg_v2_0.h"
#include "vcn_v2_5.h"
#include "jpeg_v2_5.h"
#include "smuio_v9_0.h"
#include "gmc_v10_0.h"
#include "gfxhub_v2_0.h"
#include "mmhub_v2_0.h"
#include "nbio_v2_3.h"
#include "nbio_v7_2.h"
#include "hdp_v5_0.h"
#include "nv.h"
#include "navi10_ih.h"
#include "gfx_v10_0.h"
#include "sdma_v5_0.h"
#include "sdma_v5_2.h"
-#include "vcn_v2_0.h"
-#include "jpeg_v2_0.h"
#include "vcn_v3_0.h"
#include "jpeg_v3_0.h"
#include "amdgpu_vkms.h"
#include "mes_v10_1.h"
#include "smuio_v11_0.h"
#include "smuio_v11_0_6.h"
#include "smuio_v13_0.h"

MODULE_FIRMWARE("amdgpu/ip_discovery.bin");

#define mmRCC_CONFIG_MEMSIZE   0xde3
#define mmMM_INDEX 0x0
#define mmMM_INDEX_HI  0x6
#define mmMM_DATA  0x1

static const char *hw_id_names[HW_ID_MAX] = {


[PATCH] drm/amdgpu: fix some repeated includings

2021-09-30 Thread Guo Zhengkui
Remove two repeated includes on lines 62 and 63.

Signed-off-by: Guo Zhengkui 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 291a47f7992a..94fca56583a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -59,8 +59,6 @@
 #include "gfx_v10_0.h"
 #include "sdma_v5_0.h"
 #include "sdma_v5_2.h"
-#include "vcn_v2_0.h"
-#include "jpeg_v2_0.h"
 #include "vcn_v3_0.h"
 #include "jpeg_v3_0.h"
 #include "amdgpu_vkms.h"
-- 
2.20.1



Repository for additional color and HDR related documentation (Re: [RFC PATCH v3 1/6] drm/doc: Color Management and HDR10 RFC)

2021-09-30 Thread Pekka Paalanen
On Thu, 23 Sep 2021 10:43:54 +0300
Pekka Paalanen  wrote:

> On Wed, 22 Sep 2021 11:28:37 -0400
> Harry Wentland  wrote:
> 
> > On 2021-09-22 04:31, Pekka Paalanen wrote:  
> > > On Tue, 21 Sep 2021 14:05:05 -0400
> > > Harry Wentland  wrote:
> > > 
> > >> On 2021-09-21 09:31, Pekka Paalanen wrote:
> > >>> On Mon, 20 Sep 2021 20:14:50 -0400
> > >>> Harry Wentland  wrote:
> > >>>   
> > 
> > ...
> >   
> > > 
> > >> Did anybody start any CM doc patches in Weston or Wayland yet?
> > > 
> > > There is the
> > > https://gitlab.freedesktop.org/swick/wayland-protocols/-/blob/color/unstable/color-management/color.rst
> > > we started a long time ago, and have not really touched it for a while.
> > > Since we last touched it, at least my understanding has developed
> > > somewhat.
> > > 
> > > It is linked from the overview in
> > > https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/14
> > > and if you want to propose changes, the way to do it is file a MR in
> > > https://gitlab.freedesktop.org/swick/wayland-protocols/-/merge_requests
> > > against the 'color' branch. Patches very much welcome, that doc does
> > > not need to limit itself to Wayland. :-)
> > > 
> > 
> > Right, I've read all that a while back.
> > 
> > It might be a good place to consolidate most of the Linux CM/HDR discussion,
> > since gitlab is good with allowing discussions, we can track changes, and
> > it's more formatting and diagram friendly than text-only email.  
> 
> Fine by me, but the way things are right now, we'd be hijacking
> Sebastian's personal repository for these things. That's not ideal.
> 
> We can't merge the protocol XML into wayland-protocols until it has the
> accepted implementations required by the governance rules, but I wonder
> if we could land color.rst ahead of time, then work on that in
> wayland-protocols upstream repo.
> 
> It's hard to pick a good place for a cross-project document. Any other
> ideas?
> 
> > > We also have issues tracked at
> > > https://gitlab.freedesktop.org/swick/wayland-protocols/-/issues?scope=all=%E2%9C%93=opened
> > >   

Hi all,

we discussed things in
https://gitlab.freedesktop.org/swick/wayland-protocols/-/issues/6

and we have a new home for the color related WIP documentation we can
use across Wayland, Mesa, DRM, and even X11 if people want to:

https://gitlab.freedesktop.org/pq/color-and-hdr

Yes, it's still someone's personal repository, but we avoid entangling
it with wayland-protocols which also means we can keep the full git
history. If this gets enough traction, the repository can be moved from
under my personal group to somewhere more communal, and if that is
still inside gitlab.fd.o then all merge requests and issues will move
with it.

The README notes that we will deal out merge permissions as well.

This is not meant to supersede the documentation of individual APIs,
but to host additional documentation that would be too verbose, too
big, or out of scope to host within respective API docs.

Feel free to join the effort or just to discuss.


Thanks,
pq




Re: [PATCH v6 2/2] habanalabs: add support for dma-buf exporter

2021-09-30 Thread Oded Gabbay
On Wed, Sep 29, 2021 at 12:17 AM Oded Gabbay  wrote:
>
> On Tue, Sep 28, 2021 at 8:36 PM Jason Gunthorpe  wrote:
> >
> > On Sun, Sep 12, 2021 at 07:53:09PM +0300, Oded Gabbay wrote:
> > > From: Tomer Tayar 
> > >
> > > Implement the calls to the dma-buf kernel api to create a dma-buf
> > > object backed by FD.
> > >
> > > We block the option to mmap the DMA-BUF object because we don't support
> > > DIRECT_IO and implicit P2P.
> >
> > This statement doesn't make sense, you can mmap your dmabuf if you
> > like. All dmabuf mmaps are supposed to set the special bit/etc to
> > exclude them from get_user_pages() anyhow - and since this is BAR
> > memory not struct page memory this driver would be doing it anyhow.
> >
> But we block mmap the dmabuf fd from user-space.
> If you try to do it, you will get MAP_FAILED.
> That's because we don't supply a function to the mmap callback in dmabuf.
> We did that per Christian's advice. It is in one of the long email
> threads on previous versions of this patch.
>
>
> > > We check the p2p distance using pci_p2pdma_distance_many() and refuse
> > > to map the dmabuf in case the distance doesn't allow p2p.
> >
> > Does this actually allow the p2p transfer for your intended use cases?
> >
> It depends on the system. If we are working bare-metal, then yes, it allows.
> If inside a VM, then no. The virtualized root complex is not
> white-listed and the kernel can't know the distance.
> But I remember you asked me to add this check, in v3 of the review IIRC.
> I don't mind removing this check if you don't object.
>
> > > diff --git a/drivers/misc/habanalabs/common/memory.c 
> > > b/drivers/misc/habanalabs/common/memory.c
> > > index 33986933aa9e..8cf5437c0390 100644
> > > +++ b/drivers/misc/habanalabs/common/memory.c
> > > @@ -1,7 +1,7 @@
> > >  // SPDX-License-Identifier: GPL-2.0
> > >
> > >  /*
> > > - * Copyright 2016-2019 HabanaLabs, Ltd.
> > > + * Copyright 2016-2021 HabanaLabs, Ltd.
> > >   * All Rights Reserved.
> > >   */
> > >
> > > @@ -11,11 +11,13 @@
> > >
> > >  #include 
> > >  #include 
> > > +#include 
> > >
> > >  #define HL_MMU_DEBUG 0
> > >
> > >  /* use small pages for supporting non-pow2 (32M/40M/48M) DRAM phys page 
> > > sizes */
> > > -#define DRAM_POOL_PAGE_SIZE SZ_8M
> > > +#define DRAM_POOL_PAGE_SIZE  SZ_8M
> > > +
> >
> > ??
> OK, I'll remove it.
> >
> > >  /*
> > >   * The va ranges in context object contain a list with the available 
> > > chunks of
> > > @@ -347,6 +349,13 @@ static int free_device_memory(struct hl_ctx *ctx, 
> > > struct hl_mem_in *args)
> > >   return -EINVAL;
> > >   }
> > >
> > > + if (phys_pg_pack->exporting_cnt) {
> > > + dev_err(hdev->dev,
> > > + "handle %u is exported, cannot free\n", 
> > > handle);
> > > + spin_unlock(>idr_lock);
> >
> > Don't write to the kernel log from user space triggered actions
> At all?
> This is the first time I have heard of this limitation...
> How do you tell the user they have done something wrong?
> I agree it might be better to rate-limit it, but why not give the
> information to the user?
>
> >
> > > +static int alloc_sgt_from_device_pages(struct hl_device *hdev,
> > > + struct sg_table **sgt, u64 *pages,
> > > + u64 npages, u64 page_size,
> > > + struct device *dev,
> > > + enum dma_data_direction dir)
> >
> > Why doesn't this return a sg_table * and an ERR_PTR?
> Basically I modeled this function after amdgpu_vram_mgr_alloc_sgt()
> And in that function they also return int and pass the sg_table as **
>
> If it's critical I can change.
>
> >
> > > +{
> > > + u64 chunk_size, bar_address, dma_max_seg_size;
> > > + struct asic_fixed_properties *prop;
> > > + int rc, i, j, nents, cur_page;
> > > + struct scatterlist *sg;
> > > +
> > > > + prop = &hdev->asic_prop;
> > > +
> > > + dma_max_seg_size = dma_get_max_seg_size(dev);
> >
> > > +
> > > + /* We would like to align the max segment size to PAGE_SIZE, so the
> > > +  * SGL will contain aligned addresses that can be easily mapped to
> > > +  * an MMU
> > > +  */
> > > + dma_max_seg_size = ALIGN_DOWN(dma_max_seg_size, PAGE_SIZE);
> > > + if (dma_max_seg_size < PAGE_SIZE) {
> > > + dev_err_ratelimited(hdev->dev,
> > > + "dma_max_seg_size %llu can't be smaller 
> > > than PAGE_SIZE\n",
> > > + dma_max_seg_size);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + *sgt = kzalloc(sizeof(**sgt), GFP_KERNEL);
> > > + if (!*sgt)
> > > + return -ENOMEM;
> > > +
> > > + /* If the size of each page is larger than the dma max segment size,
> > > +  * then we can't combine pages and the number of entries in the SGL
> > > +  * will just be the
> > > +  * 

Re: [PATCH] drm/amdgpu: fix some repeated includings

2021-09-30 Thread Christian König
Ah, that makes more sense. Then please remove the duplicates in lines 46 
and 47 instead since the other ones are more correctly grouped together 
with their blocks.


Christian.

On 30.09.21 at 13:54, 郭正奎 wrote:


Actually the duplicates take place in line 46, 47 and 62, 63.

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 291a47f7992a..94fca56583a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -46,34 +46,32 @@
#include "vcn_v2_0.h"
#include "jpeg_v2_0.h"
#include "vcn_v2_5.h"
#include "jpeg_v2_5.h"
#include "smuio_v9_0.h"
#include "gmc_v10_0.h"
#include "gfxhub_v2_0.h"
#include "mmhub_v2_0.h"
#include "nbio_v2_3.h"
#include "nbio_v7_2.h"
#include "hdp_v5_0.h"
#include "nv.h"
#include "navi10_ih.h"
#include "gfx_v10_0.h"
#include "sdma_v5_0.h"
#include "sdma_v5_2.h"
-#include "vcn_v2_0.h"
-#include "jpeg_v2_0.h"
#include "vcn_v3_0.h"
#include "jpeg_v3_0.h"
#include "amdgpu_vkms.h"
#include "mes_v10_1.h"
#include "smuio_v11_0.h"
#include "smuio_v11_0_6.h"
#include "smuio_v13_0.h"

MODULE_FIRMWARE("amdgpu/ip_discovery.bin");

#define mmRCC_CONFIG_MEMSIZE   0xde3
#define mmMM_INDEX 0x0
#define mmMM_INDEX_HI  0x6
#define mmMM_DATA  0x1

static const char *hw_id_names[HW_ID_MAX] = {





Re: [PATCH] drm/amdgpu: fix some repeated includings

2021-09-30 Thread Koenig, Christian
Seconded, there is one include for each hardware version.

At least offhand I don't see a duplicate.

From: Simon Ser 
Sent: Thursday, 30 September 2021 12:17
To: Guo Zhengkui 
Cc: Deucher, Alexander ; Koenig, Christian 
; Pan, Xinhui ; David Airlie 
; Daniel Vetter ; Chen, Guchun 
; Zhou, Peng Ju ; Zhang, Bokun 
; Gao, Likun ; 
amd-gfx@lists.freedesktop.org ; 
dri-de...@lists.freedesktop.org ; 
linux-ker...@vger.kernel.org ; ker...@vivo.com 

Subject: Re: [PATCH] drm/amdgpu: fix some repeated includings

One include is v2, the other is v3, or am I missing something?


Re: [PATCH] drm/amdgpu: fix some repeated includings

2021-09-30 Thread Simon Ser
One include is v2, the other is v3, or am I missing something?


[PATCH] drm/amdgpu: print warning and taint kernel if lockup timeout is disabled

2021-09-30 Thread Christian König
Make sure that we notice this in error reports.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4d34b2da8582..8ee5bbc19f62 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3346,6 +3346,8 @@ static int amdgpu_device_get_job_timeout_settings(struct 
amdgpu_device *adev)
continue;
} else if (timeout < 0) {
timeout = MAX_SCHEDULE_TIMEOUT;
+   dev_warn(adev->dev, "lockup timeout disabled");
+   add_taint(TAINT_SOFTLOCKUP, LOCKDEP_STILL_OK);
} else {
timeout = msecs_to_jiffies(timeout);
}
-- 
2.25.1
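
For readers following the hunk, the two branches it touches can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: sketch_parse_timeout, SKETCH_HZ and SKETCH_MAX_TIMEOUT are made-up stand-ins for the driver function, HZ and MAX_SCHEDULE_TIMEOUT, the millisecond conversion is a simplified imitation of msecs_to_jiffies(), and the dev_warn()/add_taint() pair is reduced to a flag. The earlier branch that ends in "continue" is not shown in the hunk and is not modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel definitions. */
#define SKETCH_HZ          250L
#define SKETCH_MAX_TIMEOUT LONG_MAX	/* plays the role of MAX_SCHEDULE_TIMEOUT */

struct sketch_timeout {
	long ticks;	/* timeout in scheduler ticks */
	bool tainted;	/* set where the patch warns and taints the kernel */
};

/* Convert milliseconds to ticks, rounding up (simplified conversion). */
static long sketch_msecs_to_ticks(long msecs)
{
	return (msecs * SKETCH_HZ + 999) / 1000;
}

/* Model the two branches visible in the hunk for a non-zero setting. */
static struct sketch_timeout sketch_parse_timeout(long timeout_ms)
{
	struct sketch_timeout t = { 0, false };

	assert(timeout_ms != 0);	/* the preceding branch is out of scope here */
	if (timeout_ms < 0) {
		/* Negative value: lockup detection is effectively disabled. */
		t.ticks = SKETCH_MAX_TIMEOUT;
		t.tainted = true;
	} else {
		t.ticks = sketch_msecs_to_ticks(timeout_ms);
	}
	return t;
}
```

The point of the flag is the same as in the patch: a disabled timeout leaves a visible mark, so it shows up when someone later reads an error report.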



Re: [PATCH] drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"

2021-09-30 Thread Das, Nirmoy

Acked-by: Nirmoy Das 

On 9/30/2021 11:26 AM, Christian König wrote:

This reverts commit 728e7e0cd61899208e924472b9e641dbeb0775c4.

Further discussion reveals that this feature is severely broken
and needs to be reverted ASAP.

GPU reset can never be delayed by userspace, even for debugging,
otherwise we can run into in-kernel deadlocks.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  2 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 80 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |  5 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 --
  4 files changed, 91 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index dc3c6b3a00e5..6a1928a720a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1078,8 +1078,6 @@ struct amdgpu_device {
charproduct_name[32];
charserial[20];
  
-	struct amdgpu_autodump		autodump;

-
atomic_tthrottling_logging_enabled;
struct ratelimit_state  throttling_logging_rs;
uint32_tras_hw_enabled;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 277128846dd1..0b89ba142a59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -27,7 +27,6 @@
  #include 
  #include 
  #include 
-#include 
  
  #include "amdgpu.h"
  #include "amdgpu_pm.h"
@@ -37,85 +36,7 @@
  #include "amdgpu_securedisplay.h"
  #include "amdgpu_fw_attestation.h"
  
-int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)
-{
  #if defined(CONFIG_DEBUG_FS)
-   unsigned long timeout = 600 * HZ;
-   int ret;
-
-   wake_up_interruptible(&adev->autodump.gpu_hang);
-
-   ret = wait_for_completion_interruptible_timeout(&adev->autodump.dumping, timeout);
-   if (ret == 0) {
-   pr_err("autodump: timeout, move on to gpu recovery\n");
-   return -ETIMEDOUT;
-   }
-#endif
-   return 0;
-}
-
-#if defined(CONFIG_DEBUG_FS)
-
-static int amdgpu_debugfs_autodump_open(struct inode *inode, struct file *file)
-{
-   struct amdgpu_device *adev = inode->i_private;
-   int ret;
-
-   file->private_data = adev;
-
-   ret = down_read_killable(&adev->reset_sem);
-   if (ret)
-   return ret;
-
-   if (adev->autodump.dumping.done) {
-   reinit_completion(&adev->autodump.dumping);
-   ret = 0;
-   } else {
-   ret = -EBUSY;
-   }
-
-   up_read(&adev->reset_sem);
-
-   return ret;
-}
-
-static int amdgpu_debugfs_autodump_release(struct inode *inode, struct file 
*file)
-{
-   struct amdgpu_device *adev = file->private_data;
-
-   complete_all(&adev->autodump.dumping);
-   return 0;
-}
-
-static unsigned int amdgpu_debugfs_autodump_poll(struct file *file, struct 
poll_table_struct *poll_table)
-{
-   struct amdgpu_device *adev = file->private_data;
-
-   poll_wait(file, &adev->autodump.gpu_hang, poll_table);
-
-   if (amdgpu_in_reset(adev))
-   return POLLIN | POLLRDNORM | POLLWRNORM;
-
-   return 0;
-}
-
-static const struct file_operations autodump_debug_fops = {
-   .owner = THIS_MODULE,
-   .open = amdgpu_debugfs_autodump_open,
-   .poll = amdgpu_debugfs_autodump_poll,
-   .release = amdgpu_debugfs_autodump_release,
-};
-
-static void amdgpu_debugfs_autodump_init(struct amdgpu_device *adev)
-{
-   init_completion(&adev->autodump.dumping);
-   complete_all(&adev->autodump.dumping);
-   init_waitqueue_head(&adev->autodump.gpu_hang);
-
-   debugfs_create_file("amdgpu_autodump", 0600,
-   adev_to_drm(adev)->primary->debugfs_root,
-   adev, &autodump_debug_fops);
-}
  
  /**
   * amdgpu_debugfs_process_reg_op - Handle MMIO register reads/writes
@@ -1590,7 +1511,6 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev)
}
  
  	amdgpu_ras_debugfs_create_all(adev);

-   amdgpu_debugfs_autodump_init(adev);
amdgpu_rap_debugfs_init(adev);
amdgpu_securedisplay_debugfs_init(adev);
amdgpu_fw_attestation_debugfs_init(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
index 141a8474e24f..8b641f40fdf6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
@@ -26,10 +26,6 @@
  /*
   * Debugfs
   */
-struct amdgpu_autodump {
-   struct completion   dumping;
-   struct wait_queue_head  gpu_hang;
-};
  
  int amdgpu_debugfs_regs_init(struct amdgpu_device *adev);
  int amdgpu_debugfs_init(struct amdgpu_device *adev);
@@ -37,4 +33,3 @@ void amdgpu_debugfs_fini(struct amdgpu_device *adev);
  void amdgpu_debugfs_fence_init(struct amdgpu_device *adev);
  void amdgpu_debugfs_firmware_init(struct amdgpu_device 

[PATCH] drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"

2021-09-30 Thread Christian König
This reverts commit 728e7e0cd61899208e924472b9e641dbeb0775c4.

Further discussion reveals that this feature is severely broken
and needs to be reverted ASAP.

GPU reset can never be delayed by userspace, even for debugging,
otherwise we can run into in-kernel deadlocks.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 80 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |  5 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 --
 4 files changed, 91 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index dc3c6b3a00e5..6a1928a720a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1078,8 +1078,6 @@ struct amdgpu_device {
charproduct_name[32];
charserial[20];
 
-   struct amdgpu_autodump  autodump;
-
atomic_tthrottling_logging_enabled;
struct ratelimit_state  throttling_logging_rs;
uint32_tras_hw_enabled;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 277128846dd1..0b89ba142a59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "amdgpu.h"
 #include "amdgpu_pm.h"
@@ -37,85 +36,7 @@
 #include "amdgpu_securedisplay.h"
 #include "amdgpu_fw_attestation.h"
 
-int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev)
-{
 #if defined(CONFIG_DEBUG_FS)
-   unsigned long timeout = 600 * HZ;
-   int ret;
-
-   wake_up_interruptible(&adev->autodump.gpu_hang);
-
-   ret = wait_for_completion_interruptible_timeout(&adev->autodump.dumping, timeout);
-   if (ret == 0) {
-   pr_err("autodump: timeout, move on to gpu recovery\n");
-   return -ETIMEDOUT;
-   }
-#endif
-   return 0;
-}
-
-#if defined(CONFIG_DEBUG_FS)
-
-static int amdgpu_debugfs_autodump_open(struct inode *inode, struct file *file)
-{
-   struct amdgpu_device *adev = inode->i_private;
-   int ret;
-
-   file->private_data = adev;
-
-   ret = down_read_killable(&adev->reset_sem);
-   if (ret)
-   return ret;
-
-   if (adev->autodump.dumping.done) {
-   reinit_completion(&adev->autodump.dumping);
-   ret = 0;
-   } else {
-   ret = -EBUSY;
-   }
-
-   up_read(&adev->reset_sem);
-
-   return ret;
-}
-
-static int amdgpu_debugfs_autodump_release(struct inode *inode, struct file 
*file)
-{
-   struct amdgpu_device *adev = file->private_data;
-
-   complete_all(&adev->autodump.dumping);
-   return 0;
-}
-
-static unsigned int amdgpu_debugfs_autodump_poll(struct file *file, struct 
poll_table_struct *poll_table)
-{
-   struct amdgpu_device *adev = file->private_data;
-
-   poll_wait(file, &adev->autodump.gpu_hang, poll_table);
-
-   if (amdgpu_in_reset(adev))
-   return POLLIN | POLLRDNORM | POLLWRNORM;
-
-   return 0;
-}
-
-static const struct file_operations autodump_debug_fops = {
-   .owner = THIS_MODULE,
-   .open = amdgpu_debugfs_autodump_open,
-   .poll = amdgpu_debugfs_autodump_poll,
-   .release = amdgpu_debugfs_autodump_release,
-};
-
-static void amdgpu_debugfs_autodump_init(struct amdgpu_device *adev)
-{
-   init_completion(&adev->autodump.dumping);
-   complete_all(&adev->autodump.dumping);
-   init_waitqueue_head(&adev->autodump.gpu_hang);
-
-   debugfs_create_file("amdgpu_autodump", 0600,
-   adev_to_drm(adev)->primary->debugfs_root,
-   adev, &autodump_debug_fops);
-}
 
 /**
  * amdgpu_debugfs_process_reg_op - Handle MMIO register reads/writes
@@ -1590,7 +1511,6 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev)
}
 
amdgpu_ras_debugfs_create_all(adev);
-   amdgpu_debugfs_autodump_init(adev);
amdgpu_rap_debugfs_init(adev);
amdgpu_securedisplay_debugfs_init(adev);
amdgpu_fw_attestation_debugfs_init(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
index 141a8474e24f..8b641f40fdf6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
@@ -26,10 +26,6 @@
 /*
  * Debugfs
  */
-struct amdgpu_autodump {
-   struct completion   dumping;
-   struct wait_queue_head  gpu_hang;
-};
 
 int amdgpu_debugfs_regs_init(struct amdgpu_device *adev);
 int amdgpu_debugfs_init(struct amdgpu_device *adev);
@@ -37,4 +33,3 @@ void amdgpu_debugfs_fini(struct amdgpu_device *adev);
 void amdgpu_debugfs_fence_init(struct amdgpu_device *adev);
 void amdgpu_debugfs_firmware_init(struct amdgpu_device *adev);
 void amdgpu_debugfs_gem_init(struct amdgpu_device *adev);
-int 

Re: [PATCH v6 0/2] Add p2p via dmabuf to habanalabs

2021-09-30 Thread Daniel Vetter
On Tue, Sep 28, 2021 at 10:04:29AM +0300, Oded Gabbay wrote:
> On Thu, Sep 23, 2021 at 12:22 PM Oded Gabbay  wrote:
> >
> > On Sat, Sep 18, 2021 at 11:38 AM Oded Gabbay  wrote:
> > >
> > > On Fri, Sep 17, 2021 at 3:30 PM Daniel Vetter  wrote:
> > > >
> > > > On Thu, Sep 16, 2021 at 10:10:14AM -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Sep 16, 2021 at 02:31:34PM +0200, Daniel Vetter wrote:
> > > > > > On Wed, Sep 15, 2021 at 10:45:36AM +0300, Oded Gabbay wrote:
> > > > > > > On Tue, Sep 14, 2021 at 7:12 PM Jason Gunthorpe  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Tue, Sep 14, 2021 at 04:18:31PM +0200, Daniel Vetter wrote:
> > > > > > > > > On Sun, Sep 12, 2021 at 07:53:07PM +0300, Oded Gabbay wrote:
> > > > > > > > > > Hi,
> > > > > > > > > > Re-sending this patch-set following the release of our 
> > > > > > > > > > user-space TPC
> > > > > > > > > > compiler and runtime library.
> > > > > > > > > >
> > > > > > > > > > I would appreciate a review on this.
> > > > > > > > >
> > > > > > > > > I think the big open we have is the entire revoke 
> > > > > > > > > discussions. Having the
> > > > > > > > > option to let dma-buf hang around which map to random local 
> > > > > > > > > memory ranges,
> > > > > > > > > without clear ownership link and a way to kill it sounds bad 
> > > > > > > > > to me.
> > > > > > > > >
> > > > > > > > > I think there's a few options:
> > > > > > > > > - We require revoke support. But I've heard rdma really 
> > > > > > > > > doesn't like that,
> > > > > > > > >   I guess because taking out an MR while holding the 
> > > > > > > > > dma_resv_lock would
> > > > > > > > >   be an inversion, so can't be done. Jason, can you recap 
> > > > > > > > > what exactly the
> > > > > > > > >   hold-up was again that makes this a no-go?
> > > > > > > >
> > > > > > > > RDMA HW can't do revoke.
> > > > > >
> > > > > > Like why? I'm assuming when the final open handle or whatever for 
> > > > > > that MR
> > > > > > is closed, you do clean up everything? Or does that MR still stick 
> > > > > > around
> > > > > > forever too?
> > > > >
> > > > > It is a combination of uAPI and HW specification.
> > > > >
> > > > > revoke here means you take a MR object and tell it to stop doing DMA
> > > > > without causing the MR object to be destructed.
> > > > >
> > > > > All the drivers can of course destruct the MR, but doing such a
> > > > > destruction without explicit synchronization with user space opens
> > > > > things up to a serious use-after potential that could be a security
> > > > > issue.
> > > > >
> > > > > When the open handle closes the userspace is synchronized with the
> > > > > kernel and we can destruct the HW objects safely.
> > > > >
> > > > > So, the special HW feature required is 'stop doing DMA but keep the
> > > > > object in an error state' which isn't really implemented, and doesn't
> > > > > extend very well to other object types beyond simple MRs.
> > > >
> > > > Yeah revoke without destroying the MR doesn't work, and it sounds like
> > > > revoke by destroying the MR just moves the can of worms around to 
> > > > another
> > > > place.
> > > >
> > > > > > 1. User A opens gaudi device, sets up dma-buf export
> > > > > >
> > > > > > 2. User A registers that with RDMA, or anything else that doesn't 
> > > > > > support
> > > > > > revoke.
> > > > > >
> > > > > > 3. User A closes gaudi device
> > > > > >
> > > > > > 4. User B opens gaudi device, assumes that it has full control over 
> > > > > > the
> > > > > > device and uploads some secrets, which happen to end up in the 
> > > > > > dma-buf
> > > > > > region user A set up
> > > > >
> > > > > I would expect this is blocked so long as the DMABUF exists - eg the
> > > > > DMABUF will hold a fget on the FD of #1 until the DMABUF is closed, so
> > > > > that #3 can't actually happen.
> > > > >
> > > > > > It's not mlocked memory, it's mlocked memory and I can exfiltrate
> > > > > > it.
> > > > >
> > > > > > That's just a bug, don't make buggy drivers :)
> > > >
> > > > Well yeah, but given that habanalabs hand rolled this I can't just check
> > > > for the usual things we have to enforce this in drm. And generally you 
> > > > can
> > > > just open chardevs arbitrarily, and multiple users fighting over each
> > > > other. The troubles only start when you have private state or memory
> > > > allocations of some kind attached to the struct file (instead of the
> > > > underlying device), or something else that requires device exclusivity.
> > > > There's no standard way to do that.
> > > >
> > > > Plus in many cases you really want revoke on top (can't get that here
> > > > unfortunately it seems), and the attempts to get towards a generic
> > > > revoke() just never went anywhere. So again it's all hand-rolled
> > > > per-subsystem. *insert lament about us not having done this through a
> > > > proper subsystem*
> > > >
> > > > Anyway it sounds like the code takes care of that.
> > > > -Daniel
> > >
> > > Daniel, 
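
The lifetime argument quoted above, where the exported DMABUF holds a reference on the device file so that step 3 cannot really release the device, can be pictured with a plain reference count. This is a deliberately tiny userspace model and not the habanalabs or dma-buf API: sketch_dev_file and the sketch_* functions are hypothetical names, and the increment in sketch_export_buf only stands in for the fget() mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an open device file; not a kernel structure. */
struct sketch_dev_file {
	int refs;	/* references held on the open file */
	bool released;	/* true once the last reference is dropped */
};

/* Opening the device gives the user one reference (the file descriptor). */
static void sketch_open(struct sketch_dev_file *f)
{
	f->refs = 1;
	f->released = false;
}

/* Exporting a buffer pins the file, standing in for fget() in the kernel. */
static void sketch_export_buf(struct sketch_dev_file *f)
{
	assert(!f->released);
	f->refs++;
}

/* Drop one reference; the device context is torn down only at zero. */
static void sketch_put(struct sketch_dev_file *f)
{
	assert(f->refs > 0);
	if (--f->refs == 0)
		f->released = true;
}
```

With one open and one export, the first sketch_put() (user A closing the fd) leaves released false; only the second one (the exported buffer going away) releases the device, which is the ordering the thread relies on to keep user B out until then.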

Re: amdgpu driver halted on suspend of shutdown

2021-09-30 Thread Christian König

Well you could remove it locally if it solves your problem at hand.

But keep in mind that a lot of ARM boards are simply not compliant with 
the PCIe specification and the hardware won't work correctly on those in 
general.


I'm pretty sure you have one of those cases here.

Christian.

On 30.09.21 at 03:26, 李真能 wrote:


So, can I remove the suspend process in amdgpu_pci_shutdown if I don't
use the amdgpu driver in a VM?


Thank you so much for your reply!

On 2021/9/30 at 5:12 AM, Alex Deucher wrote:

On Wed, Sep 29, 2021 at 3:25 AM 李真能  wrote:

Hello:

  When running an automated reboot loop test, I found that the kernel may
hang in memcpy_fromio in amdgpu's amdgpu_uvd_suspend. Removing the suspend
step from amdgpu_pci_shutdown makes the hang go away.

I have 3 questions to ask:

1. In amdgpu_pci_shutdown, the comment explains why we must execute
suspend, so I would like to know in which situations a VM will call the
amdgpu driver. As far as I know, a VM's graphics card is a virtual card.

2. I see a patch that was committed by Alex Deucher; the commit message is
as follows:

drm/amdgpu: just suspend the hw on pci shutdown

We can't just reuse pci_remove as there may be userspace still
  doing things.

My question is: in which situations may userspace still be doing
things?

3. Why does the amdgpu driver hang in memcpy_fromio in amdgpu_uvd_suspend?
I haven't launched any video app during the reboot test. Is it a bug in
the PCI bus?

Test environment:

CPU: arm64

I suspect the problem is something ARM specific.  IIRC, we added the
memcpy_fromio() to work around a limitation in ARM related to CPU
mappings of PCI BAR memory.  The whole point of the PCI shutdown
callback is to put the device into a quiescent state (e.g., stop all
DMAs and asynchronous engines, etc.).  Some of that tear down requires
access to PCI BARs.

Alex



Graphics card: r7340(amdgpu), rx550

OS: Ubuntu 20.04





Re: [PATCH] Documentation/gpu: remove spurious "+" in amdgpu.rst

2021-09-30 Thread Christian König

On 29.09.21 at 19:45, Alex Deucher wrote:

Not sure why that was there.  Remove it.

Signed-off-by: Alex Deucher 


Reviewed-by: Christian König 


---
  Documentation/gpu/amdgpu.rst | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/gpu/amdgpu.rst b/Documentation/gpu/amdgpu.rst
index 364680cdad2e..8ba72e898099 100644
--- a/Documentation/gpu/amdgpu.rst
+++ b/Documentation/gpu/amdgpu.rst
@@ -300,8 +300,8 @@ pcie_replay_count
  .. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
 :doc: pcie_replay_count
  
-+GPU SmartShift Information

-
+GPU SmartShift Information
+==
  
  GPU SmartShift information via sysfs
  




Re: [PATCH] drm/amdgpu: consolidate case statements

2021-09-30 Thread Christian König

On 29.09.21 at 19:45, Alex Deucher wrote:

IP_VERSION(11, 0, 13) does the exact same thing as
IP_VERSION(11, 0, 12) so squash them together.

Signed-off-by: Alex Deucher 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 7 ---
  1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index 382cebfc2069..aaf200ec982b 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -216,13 +216,6 @@ static int psp_v11_0_init_microcode(struct psp_context 
*psp)
case IP_VERSION(11, 0, 7):
case IP_VERSION(11, 0, 11):
case IP_VERSION(11, 0, 12):
-   err = psp_init_sos_microcode(psp, chip_name);
-   if (err)
-   return err;
-   err = psp_init_ta_microcode(psp, chip_name);
-   if (err)
-   return err;
-   break;
case IP_VERSION(11, 0, 13):
err = psp_init_sos_microcode(psp, chip_name);
if (err)