date:20230605

[PATCH] drm/amdgpu/mmsch: Correct the definition for mmsch init header

2023-06-05 Thread Emily Deng

The init header layout is tied to the MMSCH version, so it shouldn't be
sized by AMDGPU_MAX_VCN_INSTANCES.

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h | 4 +++-
 drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h | 4 +++-
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c   | 2 +-
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h b/drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h
index 3e4e858a6965..a773ef61b78c 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h
@@ -30,6 +30,8 @@
 #define MMSCH_VERSION_MINOR 0
 #define MMSCH_VERSION  (MMSCH_VERSION_MAJOR << 16 | MMSCH_VERSION_MINOR)
 
+#define MMSCH_V3_0_VCN_INSTANCES 0x2
+
 enum mmsch_v3_0_command_type {
MMSCH_COMMAND__DIRECT_REG_WRITE = 0,
MMSCH_COMMAND__DIRECT_REG_POLLING = 2,
@@ -47,7 +49,7 @@ struct mmsch_v3_0_table_info {
 struct mmsch_v3_0_init_header {
uint32_t version;
uint32_t total_size;
-   struct mmsch_v3_0_table_info inst[AMDGPU_MAX_VCN_INSTANCES];
+   struct mmsch_v3_0_table_info inst[MMSCH_V3_0_VCN_INSTANCES];
 };
 
 struct mmsch_v3_0_cmd_direct_reg_header {
diff --git a/drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h b/drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h
index 83653a50a1a2..796d4f8791e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h
@@ -43,6 +43,8 @@
 #define MMSCH_VF_MAILBOX_RESP__OK 0x1
 #define MMSCH_VF_MAILBOX_RESP__INCOMPLETE 0x2
 
+#define MMSCH_V4_0_VCN_INSTANCES 0x2
+
 enum mmsch_v4_0_command_type {
MMSCH_COMMAND__DIRECT_REG_WRITE = 0,
MMSCH_COMMAND__DIRECT_REG_POLLING = 2,
@@ -60,7 +62,7 @@ struct mmsch_v4_0_table_info {
 struct mmsch_v4_0_init_header {
uint32_t version;
uint32_t total_size;
-   struct mmsch_v4_0_table_info inst[AMDGPU_MAX_VCN_INSTANCES];
+   struct mmsch_v4_0_table_info inst[MMSCH_V4_0_VCN_INSTANCES];
struct mmsch_v4_0_table_info jpegdec;
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 70fefbf26c48..c8f63b3c6f69 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -1313,7 +1313,7 @@ static int vcn_v3_0_start_sriov(struct amdgpu_device *adev)
 
header.version = MMSCH_VERSION;
header.total_size = sizeof(struct mmsch_v3_0_init_header) >> 2;
-   for (i = 0; i < AMDGPU_MAX_VCN_INSTANCES; i++) {
+   for (i = 0; i < MMSCH_V3_0_VCN_INSTANCES; i++) {
header.inst[i].init_status = 0;
header.inst[i].table_offset = 0;
header.inst[i].table_size = 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
index 60c3fd20e8ce..8d371faaa2b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
@@ -1239,7 +1239,7 @@ static int vcn_v4_0_start_sriov(struct amdgpu_device *adev)
 
header.version = MMSCH_VERSION;
header.total_size = sizeof(struct mmsch_v4_0_init_header) >> 2;
-   for (i = 0; i < AMDGPU_MAX_VCN_INSTANCES; i++) {
+   for (i = 0; i < MMSCH_V4_0_VCN_INSTANCES; i++) {
header.inst[i].init_status = 0;
header.inst[i].table_offset = 0;
header.inst[i].table_size = 0;
-- 
2.36.1
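
Why the array bound matters can be illustrated with a small userspace sketch. It assumes, purely for illustration, a driver-wide limit of 4 and an interface limit of 2; `DRIVER_MAX_INSTANCES`, `FW_V3_0_INSTANCES`, and the struct names are hypothetical stand-ins, not the real amdgpu definitions. The point is only that `sizeof(header) >> 2`, which the patch stores in `total_size`, follows the array bound, so a header sized by a driver-wide constant changes layout whenever that constant changes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the amdgpu headers. */
#define DRIVER_MAX_INSTANCES 4 /* assumed driver-wide limit, may change over time */
#define FW_V3_0_INSTANCES    2 /* assumed limit fixed by the interface version */

struct table_info {
	uint32_t init_status;
	uint32_t table_offset;
	uint32_t table_size;
};

/* Layout sized by the driver-wide limit: changes whenever that limit changes. */
struct init_header_driver_sized {
	uint32_t version;
	uint32_t total_size;
	struct table_info inst[DRIVER_MAX_INSTANCES];
};

/* Layout sized by the interface version: stays stable. */
struct init_header_versioned {
	uint32_t version;
	uint32_t total_size;
	struct table_info inst[FW_V3_0_INSTANCES];
};

/* Header size in dwords, computed the same way as in the patch (sizeof >> 2). */
static uint32_t header_dwords_driver_sized(void)
{
	assert(sizeof(struct init_header_driver_sized) % 4 == 0);
	return (uint32_t)(sizeof(struct init_header_driver_sized) >> 2);
}

static uint32_t header_dwords_versioned(void)
{
	assert(sizeof(struct init_header_versioned) % 4 == 0);
	return (uint32_t)(sizeof(struct init_header_versioned) >> 2);
}
```

With these assumed values the versioned header is 8 dwords and the driver-sized one is 14, so a consumer expecting the first layout would misread the second.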

[PATCH] drm/amdgpu: disable virtual display support on APP device

2023-06-05 Thread Yang Wang

Virtual display is not supported on APP devices.

Signed-off-by: Yang Wang 
Signed-off-by: Gavin Wan 
Reviewed-by: Hawking Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 2c1fbed24535..0f1ca0136f50 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -56,7 +56,8 @@ void amdgpu_virt_init_setting(struct amdgpu_device *adev)
 
/* enable virtual display */
if (adev->asic_type != CHIP_ALDEBARAN &&
-   adev->asic_type != CHIP_ARCTURUS) {
+   adev->asic_type != CHIP_ARCTURUS &&
+   ((adev->pdev->class >> 8) != AMD_ACCELERATOR_PROCESSING)) {
if (adev->mode_info.num_crtc == 0)
adev->mode_info.num_crtc = 1;
adev->enable_virtual_display = true;
-- 
2.34.1
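
The added condition can be read as a test on the PCI class code. The sketch below is a userspace approximation, not the driver code: `SKETCH_ACCELERATOR_PROCESSING` is assumed to equal 0x1200 (the value of `PCI_CLASS_ACCELERATOR_PROCESSING` in `pci_ids.h`), and the two boolean flags are hypothetical stand-ins for the `asic_type` comparisons.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: matches PCI_CLASS_ACCELERATOR_PROCESSING (0x1200) in pci_ids.h. */
#define SKETCH_ACCELERATOR_PROCESSING 0x1200u

/*
 * pci_class holds base class, subclass and programming interface in its
 * low 24 bits; shifting by 8 drops the programming interface byte.
 */
static bool is_accelerator_class(uint32_t pci_class)
{
	assert((pci_class >> 24) == 0);
	return (pci_class >> 8) == SKETCH_ACCELERATOR_PROCESSING;
}

/* Mirrors the shape of the patched condition; the flags replace the asic_type checks. */
static bool wants_virtual_display(bool is_aldebaran, bool is_arcturus,
				  uint32_t pci_class)
{
	return !is_aldebaran && !is_arcturus && !is_accelerator_class(pci_class);
}
```

A VGA-class device (class code 0x030000) would still get a virtual display under this sketch, while a device reporting 0x120000 would not.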

Re: [Intel-gfx] [PATCH v2 1/2] vgaarb: various coding style and comments fix

2023-06-05 Thread Sui Jingfeng


Hi,

On 2023/6/6 06:16, Andi Shyti wrote:
> Hi Sui,
>
> On Mon, Jun 05, 2023 at 04:58:30AM +0800, Sui Jingfeng wrote:
> > From: Sui Jingfeng
> >
> > To keep consistent with vga_iostate_to_str() function, the third argument
> > of vga_str_to_iostate() function should be 'unsigned int *'.
>
> I think the real reason is not to keep consistent with
> vga_iostate_to_str() but because vga_str_to_iostate() is actually
> only taking "unsigned int *" parameters.

Yes, right.

My wording was not completely correct; I will update it in the next version.

I think we have the same opinion. Originally, I also wanted to express this:
it makes no sense to interpret the return value
(VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM) as an int type.

IO state should be denoted by an unsigned type.
vga_iostate_to_str() also receives an unsigned type:

static const char *vga_iostate_to_str(unsigned int iostate)

> > Signed-off-by: Sui Jingfeng
> > ---
> >  drivers/pci/vgaarb.c   | 29 +++--
> >  include/linux/vgaarb.h |  8 +++-
> >  2 files changed, 18 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> > index 5a696078b382..e40e6e5e5f03 100644
> > --- a/drivers/pci/vgaarb.c
> > +++ b/drivers/pci/vgaarb.c
> > @@ -61,7 +61,6 @@ static bool vga_arbiter_used;
> >  static DEFINE_SPINLOCK(vga_lock);
> >  static DECLARE_WAIT_QUEUE_HEAD(vga_wait_queue);
> >
> > -
>
> drop this change

OK. This is a double blank line.

Originally, I intended to accumulate all the tiny fixes and commit them
together, as they are trivial.

Should I now split this patch, so that the patch set contains two trivial
patches?

> >  static const char *vga_iostate_to_str(unsigned int iostate)
> >  {
> >   /* Ignore VGA_RSRC_IO and VGA_RSRC_MEM */
> > @@ -77,10 +76,12 @@ static const char *vga_iostate_to_str(unsigned int iostate)
> >   return "none";
> >  }
> >
> > -static int vga_str_to_iostate(char *buf, int str_size, int *io_state)
> > +static int vga_str_to_iostate(char *buf, int str_size, unsigned int *io_state)
>
> this is OK, it's actually what you are describing in the commit
> log, but...
>
> >  {
> > - /* we could in theory hand out locks on IO and mem
> > -  * separately to userspace but it can cause deadlocks */
> > + /*
> > +  * we could in theory hand out locks on IO and mem
> > +  * separately to userspace but it can cause deadlocks
> > +  */
>
> ... all the rest needs to go on different patches as it doesn't
> have anything to do with what you describe.

OK. I will wait a few days for more reviews and process them together,
which also avoids the version number growing too fast.

Thanks.

> Andi


--
Jingfeng
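
The type question discussed in this thread can be shown with a simplified userspace sketch. It is not the kernel function: `sketch_str_to_iostate` and the `SKETCH_*` macros are hypothetical names, the flag values are assumed to match `include/linux/vgaarb.h`, and the `str_size` parameter is dropped. The output parameter is an `unsigned int *` because the value written through it is a bitmask of resource flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed to match VGA_RSRC_LEGACY_IO / VGA_RSRC_LEGACY_MEM in vgaarb.h. */
#define SKETCH_RSRC_LEGACY_IO  0x01u
#define SKETCH_RSRC_LEGACY_MEM 0x02u

/*
 * Simplified parser: any recognised keyword yields both legacy resources,
 * because IO and MEM are locked together to avoid deadlocks.
 * Returns true when a keyword was recognised.
 */
static bool sketch_str_to_iostate(const char *buf, unsigned int *io_state)
{
	assert(buf && io_state);

	if (strncmp(buf, "io+mem", 6) != 0 &&
	    strncmp(buf, "io", 2) != 0 &&
	    strncmp(buf, "mem", 3) != 0)
		return false;

	/* A bitmask, so an unsigned type is the natural fit. */
	*io_state = SKETCH_RSRC_LEGACY_IO | SKETCH_RSRC_LEGACY_MEM;
	return true;
}
```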

RE: [PATCH 1/2] drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memory

2023-06-05 Thread Chen, Guchun

[Public]

Acked-by: Guchun Chen  for this series.

One simple question: is holding the lock unnecessary when the BO locations
are not changed?

Regards,
Guchun

> -----Original Message-----
> From: Christian König 
> Sent: Monday, June 5, 2023 5:11 PM
> To: amd-gfx@lists.freedesktop.org; mikhail.v.gavri...@gmail.com; Chen,
> Guchun 
> Subject: [PATCH 1/2] drm/amdgpu: make sure BOs are locked in
> amdgpu_vm_get_memory
>
> We need to grab the lock of the BO or otherwise can run into a crash when
> we try to inspect the current location.
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 69 +++-
> --
>  1 file changed, 39 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 3c0310576b3b..2c8cafec48a4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -920,42 +920,51 @@ int amdgpu_vm_update_range(struct
> amdgpu_device *adev, struct amdgpu_vm *vm,
>   return r;
>  }
>
> +static void amdgpu_vm_bo_get_memory(struct amdgpu_bo_va *bo_va,
> + struct amdgpu_mem_stats *stats) {
> + struct amdgpu_vm *vm = bo_va->base.vm;
> + struct amdgpu_bo *bo = bo_va->base.bo;
> +
> + if (!bo)
> + return;
> +
> + /*
> +  * For now ignore BOs which are currently locked and potentially
> +  * changing their location.
> +  */
> + if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv &&
> + !dma_resv_trylock(bo->tbo.base.resv))
> + return;
> +
> + amdgpu_bo_get_memory(bo, stats);
> + if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
> + dma_resv_unlock(bo->tbo.base.resv);
> +}
> +
>  void amdgpu_vm_get_memory(struct amdgpu_vm *vm,
> struct amdgpu_mem_stats *stats)
>  {
>   struct amdgpu_bo_va *bo_va, *tmp;
>
>   spin_lock(&vm->status_lock);
> - list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status)
> {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->relocated,
> base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status)
> {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->invalidated,
> base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> - list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status) {
> - if (!bo_va->base.bo)
> - continue;
> - amdgpu_bo_get_memory(bo_va->base.bo, stats);
> - }
> + list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->relocated,
> base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->invalidated,
> base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
> +
> + list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status)
> + amdgpu_vm_bo_get_memory(bo_va, stats);
>   spin_unlock(&vm->status_lock);
>  }
>
> --
> 2.34.1
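
The locking rule in the quoted helper can be sketched outside the kernel. The code below is an illustration under stated assumptions: `sketch_lock` is a toy flag rather than a `dma_resv` object, and `sketch_bo` and `sketch_account` are hypothetical names. It shows the same three cases as the patch: a buffer sharing the root reservation is accounted directly, a buffer with its own free lock is locked, accounted, and unlocked, and a buffer whose own lock is busy is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy lock standing in for a reservation object; not thread-safe. */
struct sketch_lock {
	bool held;
};

static bool sketch_trylock(struct sketch_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void sketch_unlock(struct sketch_lock *l)
{
	l->held = false;
}

struct sketch_bo {
	struct sketch_lock *resv; /* lock protecting this buffer's placement */
	uint64_t size;
};

/* Add bo->size to *total unless the buffer is locked by someone else. */
static void sketch_account(const struct sketch_bo *bo,
			   const struct sketch_lock *root_resv,
			   uint64_t *total)
{
	bool shares_root;

	assert(total);
	if (!bo)
		return;

	/* Buffers sharing the root lock are assumed already protected by the caller. */
	shares_root = (bo->resv == root_resv);
	if (!shares_root && !sketch_trylock(bo->resv))
		return; /* busy: skip rather than block */

	*total += bo->size;

	if (!shares_root)
		sketch_unlock(bo->resv);
}
```

Skipping busy buffers trades exactness of the statistics for never blocking while walking the lists, which matches the "for now ignore BOs which are currently locked" comment in the patch.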

Re: [Intel-gfx] [PATCH v2 1/2] vgaarb: various coding style and comments fix

2023-06-05 Thread Andi Shyti

Hi Sui,

On Mon, Jun 05, 2023 at 04:58:30AM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> To keep consistent with vga_iostate_to_str() function, the third argument
> of vga_str_to_iostate() function should be 'unsigned int *'.

I think the real reason is not to keep consistent with
vga_iostate_to_str() but because vga_str_to_iostate() is actually
only taking "unsigned int *" parameters.

> Signed-off-by: Sui Jingfeng 
> ---
>  drivers/pci/vgaarb.c   | 29 +++--
>  include/linux/vgaarb.h |  8 +++-
>  2 files changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index 5a696078b382..e40e6e5e5f03 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -61,7 +61,6 @@ static bool vga_arbiter_used;
>  static DEFINE_SPINLOCK(vga_lock);
>  static DECLARE_WAIT_QUEUE_HEAD(vga_wait_queue);
>  
> -

drop this change

>  static const char *vga_iostate_to_str(unsigned int iostate)
>  {
>   /* Ignore VGA_RSRC_IO and VGA_RSRC_MEM */
> @@ -77,10 +76,12 @@ static const char *vga_iostate_to_str(unsigned int 
> iostate)
>   return "none";
>  }
>  
> -static int vga_str_to_iostate(char *buf, int str_size, int *io_state)
> +static int vga_str_to_iostate(char *buf, int str_size, unsigned int 
> *io_state)

this is OK, it's actually what you are describing in the commit
log, but...

>  {
> - /* we could in theory hand out locks on IO and mem
> -  * separately to userspace but it can cause deadlocks */
> + /*
> +  * we could in theory hand out locks on IO and mem
> +  * separately to userspace but it can cause deadlocks
> +  */

... all the rest needs to go on different patches as it doesn't
have anything to do with what you describe.

Andi

RE: [PATCH] drm/amd: Check that a system is a NUMA system before looking for SRAT

2023-06-05 Thread Limonciello, Mario

[Public]

> On 2023-06-02 08:18, Mario Limonciello wrote:
> > It's pointless on laptops to look for the SRAT table as these are not
> > NUMA.  Check the number of possible nodes is > 1 to decide whether to
> > look for SRAT.
> >
> > Suggested-by: Felix Kuehling 
> > Signed-off-by: Mario Limonciello 
>
> I think we discussed this a while ago and I don't remember the exact
> issue that was meant to fix. Was just to get rid of an irritating
> warning in the kernel log? Anyway, the patch looks good to me.

Yeah I forgot all about sending out the fix until I noticed it again recently.

>
> Reviewed-by: Felix Kuehling 

Thanks!

>
>
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > index 950af6820153..3dcd8f8bc98e 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > @@ -2041,7 +2041,8 @@ static int kfd_fill_gpu_direct_io_link_to_cpu(int
> *avail_size,
> > sub_type_hdr->proximity_domain_from = proximity_domain;
> >
> >   #ifdef CONFIG_ACPI_NUMA
> > -   if (kdev->adev->pdev->dev.numa_node == NUMA_NO_NODE)
> > +   if (kdev->adev->pdev->dev.numa_node == NUMA_NO_NODE &&
> > +   num_possible_nodes() > 1)
> > kfd_find_numa_node_in_srat(kdev);
> >   #endif
> >   #ifdef CONFIG_NUMA

Re: [PATCH] drm/amd: Check that a system is a NUMA system before looking for SRAT

2023-06-05 Thread Felix Kuehling


On 2023-06-02 08:18, Mario Limonciello wrote:

It's pointless on laptops to look for the SRAT table as these are not
NUMA.  Check the number of possible nodes is > 1 to decide whether to
look for SRAT.

Suggested-by: Felix Kuehling 
Signed-off-by: Mario Limonciello 


I think we discussed this a while ago and I don't remember the exact 
issue that was meant to fix. Was just to get rid of an irritating 
warning in the kernel log? Anyway, the patch looks good to me.


Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 950af6820153..3dcd8f8bc98e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -2041,7 +2041,8 @@ static int kfd_fill_gpu_direct_io_link_to_cpu(int *avail_size,
sub_type_hdr->proximity_domain_from = proximity_domain;
  
  #ifdef CONFIG_ACPI_NUMA

-   if (kdev->adev->pdev->dev.numa_node == NUMA_NO_NODE)
+   if (kdev->adev->pdev->dev.numa_node == NUMA_NO_NODE &&
+   num_possible_nodes() > 1)
kfd_find_numa_node_in_srat(kdev);
  #endif
  #ifdef CONFIG_NUMA
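
The patched condition reduces to a two-part predicate. The sketch below restates it in plain C; `should_search_srat` and `SKETCH_NUMA_NO_NODE` are hypothetical names, and the node count is passed in as a parameter instead of being read from `num_possible_nodes()`.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUMA_NO_NODE (-1) /* same value the kernel uses for NUMA_NO_NODE */

/*
 * Search the SRAT only when the device has no node assigned AND the
 * system can have more than one node; on a single-node system the
 * lookup cannot yield anything useful.
 */
static bool should_search_srat(int dev_numa_node, unsigned int possible_nodes)
{
	assert(possible_nodes >= 1);
	return dev_numa_node == SKETCH_NUMA_NO_NODE && possible_nodes > 1;
}
```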

Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue

2023-06-05 Thread Felix Richter

I will apply this patch and see if it fixes the issue for me. I will let you
know when I am done.

Felix

On 05.06.23 16:11, Alex Deucher wrote:

On Sat, Jun 3, 2023 at 10:52 AM Felix Richter wrote:

Hi Guys,

sorry for the silence from my side. I had a lot of things to take care
of after returning from vacation. Also I had to wait on the zfs modules
to be updated to support kernel 6.3 for further testing.

The bad news is that I am still experiencing issues. I have been able to
get a reproducible trigger for the buggy behavior. The moment I take a
screenshot or any other program like `wdisplays` accesses the screen
buffer the screen starts flickering. The only way to reset it is to
reboot the machine or log out of the desktop.

With this I did a bisection to figure out which commit is responsible
for this. I attached the logs to the mail. The short version is that I
identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the
culprit. Seems that there are side effects of having more flexible
buffer placement for the case of the internal GPU. To verify that this
actually is the cause of the issue I built the current archlinux kernel
with an extra patch to revert the commit:
https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that the
bug is fixed!

+ Hamza

This is a known issue. You can workaround it by setting
amdgpu.sg_display=0. The issue should be fixed in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08da182175db4c7f80850354849d95f2670e8cd9

Alex

Now if this is the desired long term fix I do not know …

Kind regards,
Felix Richter

On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:

On 02.05.23 15:48, Felix Richter wrote:

On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:

On 02.05.23 15:13, Alex Deucher wrote:

On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
Leemhuis) wrote:

On 30.04.23 13:44, Felix Richter wrote:

Hi,

I am running into an issue with the integrated GPU of the Ryzen 9
7950X. It seems to be a regression from kernel version 6.1 to 6.2.
The bug materializes in the form of my monitor blinking, meaning it
turns full white shortly. This happens very often so that the
system becomes unpleasant to use.

I am running the Archlinux Kernel:
The Issue happens on the bleeding edge kernel: 6.2.13
Switching back to the LTS kernel resolves the issue: 6.1.26

I have two monitors attached to the system. One 42 inch 4k Display
and a 24 inch 1080p Display and am running sway as my desktop.

Let me know if there is more information I could provide to help
narrow down the issue.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced v6.1..v6.2
#regzbot title drm: amdgpu: system becomes unpleasant to use after
monitor starts blinking and turns full white
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify
when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags
pointing
to the report (the parent of this mail). See page linked in footer for
details.

This sounds exactly like the issue that was fixed in this patch which
is already on its way to Linus:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9

FWIW, you in the flood of emails likely missed that this is the same
thread where you yesterday replied "If the module parameter didn't help
then perhaps you are seeing some other issue. Can you bisect?". That's
why I decided to add this to the tracking. Or am I missing something
obvious here?

/me looks around again and can't see anything, but that doesn't have to
mean anything...

Felix, btw, this guide might help you with the bisection, even if it's
just for kernel compilation:

https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html

And to indirectly reply to your mail from yesterday[1]. You might want
to ignore the arch linux kernel git repo and just do a bisection between
6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
I'd also try 6.3 or even mainline before that, in case the issue was
fixed already.

[1]
https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279e...@felixrichter.tech/

Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
the newest commit.

FWIW, I wonder what you actually mean with "newest commit" here: a
bisection between 6.1 and mainline HEAD might be a waste of time, *if*
this is something that only happens in 6.2.y (say due to a broken or
incomplete backport)

That was the part I was mostly

Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue

2023-06-05 Thread Felix Richter

Hi,

I can confirm that setting amdgpu.sg_display=0 does not fix the issue
for me.

I have 64GB of Kingston memory running with XMP at 5200MHz. I attached
the result of `dmidecode --type=memory` to this email.

Kind regards
Felix Richter

On 05.06.23 17:27, Hamza Mahfooz wrote:

On 6/3/23 10:52, Felix Richter wrote:

Hi Guys,

sorry for the silence from my side. I had a lot of things to take
care of after returning from vacation. Also I had to wait on the zfs
modules to be updated to support kernel 6.3 for further testing.

The bad news is that I am still experiencing issues. I have been able
to get a reproducible trigger for the buggy behavior. The moment I
take a screenshot or any other program like `wdisplays` accesses the
screen buffer the screen starts flickering. The only way to reset it
is to reboot the machine or log out of the desktop.

With this I did a bisection to figure out which commit is responsible
for this. I attached the logs to the mail. The short version is that
I identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the
culprit. Seems that there are side effects of having more flexible
buffer placement for the case of the internal GPU. To verify that
this actually is the cause of the issue I built the current archlinux
kernel with an extra patch to revert the commit:
https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that
the bug is fixed!

Now if this is the desired long term fix I do not know …

Can you provide a dmidecode of your RAM (i.e. # dmidecode --type=memory)?

The current trend seems to suggest that if you have 64 or more gigs of
RAM, you will probably still experience issues with S/G mode enabled
even with my fix applied.

Kind regards,
Felix Richter

On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:

On 02.05.23 15:48, Felix Richter wrote:

On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:

On 02.05.23 15:13, Alex Deucher wrote:

On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
Leemhuis) wrote:

On 30.04.23 13:44, Felix Richter wrote:

Hi,

I am running the Archlinux Kernel:
The Issue happens on the bleeding edge kernel: 6.2.13
Switching back to the LTS kernel resolves the issue: 6.1.26

I have two monitors attached to the system. One 42 inch 4k Display
and a 24 inch 1080p Display and am running sway as my desktop.

Let me know if there is more information I could provide to help
narrow down the issue.
Thanks for the report. To be sure the issue doesn't fall through
the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel
regression

tracking bot:

#regzbot ^introduced v6.1..v6.2
#regzbot title drm: amdgpu: system becomes unpleasant to use after
monitor starts blinking and turns full white
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify
when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me --
ideally
while also telling regzbot about it, as explained by the page
listed in

the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags
pointing
to the report (the parent of this mail). See page linked in
footer for

details.
This sounds exactly like the issue that was fixed in this patch
which

is already on its way to Linus:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9

FWIW, you in the flood of emails likely missed that this is the same
thread where you yesterday replied "If the module parameter didn't
help
then perhaps you are seeing some other issue. Can you bisect?".
That's

why I decided to add this to the tracking. Or am I missing something
obvious here?

/me looks around again and can't see anything, but that doesn't
have to

mean anything...

Felix, btw, this guide might help you with the bisection, even if
it's

just for kernel compilation:

https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html

And to indirectly reply to your mail from yesterday[1]. You might
want
to ignore the arch linux kernel git repo and just do a bisection
between
6.1 and the latest 6.2.y kernel using upstream repos; and if I
were you

I'd also try 6.3 or even mainline before that, in case the issue was
fixed already.

[1]
https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279e...@felixrichter.tech/

Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
the newest commit.

FWIW, I wonder what you actually mean with "newest commit" here: a
bisection between 6.1 and mainlin

[PATCH 2/2] drm/amd/display: mark dml314's UseMinimumDCFCLK() as noinline_for_stack

2023-06-05 Thread Hamza Mahfooz

clang reports:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/display_mode_vba_314.c:3892:6: error: stack frame size (2632) exceeds limit (2048) in 'dml314_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
 3892 | void dml314_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
      |      ^
1 error generated.

So, since UseMinimumDCFCLK() consumes a lot of stack space, mark it as
noinline_for_stack to prevent it from blowing up
dml314_ModeSupportAndSystemConfigurationFull()'s stack size.

Signed-off-by: Hamza Mahfooz 
---
 .../gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
index 27b83162ae45..1532a7e0ed6c 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c
@@ -7061,7 +7061,7 @@ static double CalculateUrgentLatency(
return ret;
 }
 
-static void UseMinimumDCFCLK(
+static noinline_for_stack void UseMinimumDCFCLK(
struct display_mode_lib *mode_lib,
int MaxPrefetchMode,
int ReorderingBytes)
-- 
2.40.1

[PATCH 1/2] drm/amd/display: mark dml31's UseMinimumDCFCLK() as noinline_for_stack

2023-06-05 Thread Hamza Mahfooz

clang reports:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3797:6: error: stack frame size (2632) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
 3797 | void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
      |      ^
1 error generated.

So, since UseMinimumDCFCLK() consumes a lot of stack space, mark it as
noinline_for_stack to prevent it from blowing up
dml31_ModeSupportAndSystemConfigurationFull()'s stack size.

Signed-off-by: Hamza Mahfooz 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index 01603abd75bb..43016c462251 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -7032,7 +7032,7 @@ static double CalculateUrgentLatency(
return ret;
 }
 
-static void UseMinimumDCFCLK(
+static noinline_for_stack void UseMinimumDCFCLK(
struct display_mode_lib *mode_lib,
int MaxPrefetchMode,
int ReorderingBytes)
-- 
2.40.1
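
The effect of the annotation in these two patches can be approximated in userspace. In the kernel, `noinline_for_stack` is a documentation-flavoured alias for `noinline`; the macro below is a hypothetical stand-in using the GCC/Clang attribute directly, and the function names and the 256-element scratch array are made up for illustration. When the helper is kept out of line, its large locals are charged to its own stack frame rather than being merged into the caller's frame.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's noinline_for_stack (GCC/Clang attribute). */
#define noinline_for_stack_sketch __attribute__((noinline))

#define SCRATCH_LEN 256

/* Helper with a large local array; stays in its own frame because of the attribute. */
static noinline_for_stack_sketch long sum_with_scratch(const int *v, size_t n)
{
	long scratch[SCRATCH_LEN];
	long total = 0;
	size_t i;

	assert(v != NULL || n == 0);

	/* Copy at most SCRATCH_LEN inputs, zero-filling the rest. */
	for (i = 0; i < SCRATCH_LEN; i++)
		scratch[i] = (i < n) ? v[i] : 0;
	for (i = 0; i < SCRATCH_LEN; i++)
		total += scratch[i];
	return total;
}

/* The caller's own frame stays small: it never holds the scratch array. */
static long caller(const int *v, size_t n)
{
	return sum_with_scratch(v, n) + 1;
}
```

The total stack depth at the deepest point is not reduced by this; the annotation only keeps any single frame under the per-function limit that `-Wframe-larger-than` checks.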

Re: drm/amd: Drop messages in init for radeon, amdgpu

2023-06-05 Thread Limonciello, Mario




On 6/5/2023 9:28 AM, Alex Deucher wrote:

Since there is overlap in supported devices, both
modules load, but only one will bind to a particular
device depending on the user's configuration.  Drop
the message in the module init function as this can
be confusing to users.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2608
Signed-off-by: Alex Deucher 

Reviewed-by: Mario Limonciello 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 -
  drivers/gpu/drm/radeon/radeon_drv.c | 1 -
  2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 7eda4f039224..94509b76fa6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -3065,7 +3065,6 @@ static int __init amdgpu_init(void)
if (r)
goto error_fence;
  
-	DRM_INFO("amdgpu kernel modesetting enabled.\n");

amdgpu_register_atpx_handler();
amdgpu_acpi_detect();
  
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c

index e4374814f0ef..16b9eab90185 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -634,7 +634,6 @@ static int __init radeon_module_init(void)
if (radeon_modeset == 0)
return -EINVAL;
  
-	DRM_INFO("radeon kernel modesetting enabled.\n");

radeon_register_atpx_handler();
  
  	return pci_register_driver(&radeon_kms_pci_driver);

Re: [PATCH] drm/amdkfd: mark some clear_address_watch() callback static

2023-06-05 Thread Alex Deucher

On Mon, Jun 5, 2023 at 6:58 AM Arnd Bergmann  wrote:
>
> From: Arnd Bergmann 
>
> Some of the newly introduced clear_address_watch callback handlers have
> no prototype because they are only used in one file, which causes a W=1
> warning:
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c:164:10: error: no 
> previous prototype for 'kgd_gfx_aldebaran_clear_address_watch' 
> [-Werror=missing-prototypes]
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c:782:10: error: no previous 
> prototype for 'kgd_gfx_v11_clear_address_watch' [-Werror=missing-prototypes]
>
> Mark these ones static. If another user comes up in the future, that
> can be reverted along with adding the prototype.
>
> Fixes: cfd9715f741a1 ("drm/amdkfd: add debug set and clear address watch 
> points operation")
> Signed-off-by: Arnd Bergmann 

Thanks.  Srinivasan already sent out a fix for this.

Alex


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c   | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
> index efd6a72aab4eb..bdda8744398fe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
> @@ -161,7 +161,7 @@ static uint32_t kgd_gfx_aldebaran_set_address_watch(
> return watch_address_cntl;
>  }
>
> -uint32_t kgd_gfx_aldebaran_clear_address_watch(struct amdgpu_device *adev,
> +static uint32_t kgd_gfx_aldebaran_clear_address_watch(struct amdgpu_device 
> *adev,
> uint32_t watch_id)
>  {
> return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c
> index 52efa690a3c21..131859ce3e7e9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v11.c
> @@ -779,7 +779,7 @@ static uint32_t kgd_gfx_v11_set_address_watch(struct 
> amdgpu_device *adev,
> return watch_address_cntl;
>  }
>
> -uint32_t kgd_gfx_v11_clear_address_watch(struct amdgpu_device *adev,
> +static uint32_t kgd_gfx_v11_clear_address_watch(struct amdgpu_device *adev,
> uint32_t watch_id)
>  {
> return 0;
> --
> 2.39.2
>
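
The reasoning in the quoted commit message can be sketched as follows. All names here (`sketch_ops`, `sketch_clear_address_watch`, `sketch_call_clear`) are hypothetical and only loosely modelled on the callbacks in the patch. A function that is referenced solely through a function-pointer table in the same file needs no external linkage, so marking it `static` removes the need for a prototype in a header and silences `-Wmissing-prototypes`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback table, loosely modelled on a per-ASIC ops struct. */
struct sketch_ops {
	uint32_t (*clear_address_watch)(uint32_t watch_id);
};

/* static: file-local, so no prototype in a header is required. */
static uint32_t sketch_clear_address_watch(uint32_t watch_id)
{
	(void)watch_id;
	return 0;
}

/* The callback is only ever reached through this table. */
static const struct sketch_ops sketch_v11_ops = {
	.clear_address_watch = sketch_clear_address_watch,
};

/* Callers go through the table rather than naming the function directly. */
static uint32_t sketch_call_clear(const struct sketch_ops *ops, uint32_t watch_id)
{
	assert(ops && ops->clear_address_watch);
	return ops->clear_address_watch(watch_id);
}
```

If a second file later needs to call the function directly, the `static` can be dropped and a prototype added to a shared header, as the commit message notes.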

Re: [PATCH] drm/amdgpu: Report ras_num_recs in debugfs

2023-06-05 Thread Alex Deucher

On Sat, Jun 3, 2023 at 1:11 AM Luben Tuikov  wrote:
>
> Report the number of records stored in the RAS EEPROM table in debugfs.
>
> This can be used by user-space to calculate the capacity of the RAS EEPROM
> table since "bad_page_cnt_threshold" is also reported in the same place in
> debugfs.
>
> See commit reference 7fb6407145479d (drm/amdgpu: Add bad_page_cnt_threshold to
> debugfs, 2021-04-13).
>
> ras_num_recs can already be inferred by dumping the RAS EEPROM table, also in
> the same debugfs location, see commit reference c65b0805e77919 (drm/amdgpu:
> RAS EEPROM table is now in debugfs, 2021-04-08). This commit makes it an
> integer value easily shown in a single file.
>
> Cc: Alex Deucher 
> Cc: Hawking Zhang 
> Cc: Tao Zhou 
> Cc: Stanley Yang 
> Cc: John Clements 
> Signed-off-by: Luben Tuikov 

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index f2da69adcd9d48..68163890f9632d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1487,6 +1487,7 @@ static int amdgpu_ras_sysfs_remove_all(struct 
> amdgpu_device *adev)
>  static struct dentry *amdgpu_ras_debugfs_create_ctrl_node(struct 
> amdgpu_device *adev)
>  {
> struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
> +   struct amdgpu_ras_eeprom_control *eeprom = &con->eeprom_control;
> struct drm_minor  *minor = adev_to_drm(adev)->primary;
> struct dentry *dir;
>
> @@ -1497,6 +1498,7 @@ static struct dentry 
> *amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *
> &amdgpu_ras_debugfs_eeprom_ops);
> debugfs_create_u32("bad_page_cnt_threshold", 0444, dir,
>&con->bad_page_cnt_threshold);
> +   debugfs_create_u32("ras_num_recs", 0444, dir, &eeprom->ras_num_recs);
> debugfs_create_x32("ras_hw_enabled", 0444, dir, 
> &adev->ras_hw_enabled);
> debugfs_create_x32("ras_enabled", 0444, dir, &adev->ras_enabled);
> debugfs_create_file("ras_eeprom_size", S_IRUGO, dir, adev,
>
> base-commit: e82c20a8755677528a5e01e58b7763a42edf
> --
> 2.41.0
>
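
For readers who want to consume the two debugfs values from user space, the following is a minimal sketch, not part of the patch. It assumes that both files print a single unsigned decimal number followed by a newline, as debugfs u32 files normally do. The helper names and the "records left below the threshold" calculation are hypothetical illustrations; the patch itself does not define how capacity should be derived.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one unsigned decimal value as printed by a debugfs u32 file. */
static int parse_u32(const char *text, uint32_t *out)
{
	char *end;
	unsigned long long v;

	assert(text && out);
	errno = 0;
	v = strtoull(text, &end, 10);
	if (errno || end == text || v > UINT32_MAX)
		return -1;
	*out = (uint32_t)v;
	return 0;
}

/*
 * Read a u32 from a file. The caller supplies the path; something like
 * /sys/kernel/debug/dri/0/ras/ras_num_recs is an assumption, since the
 * exact location depends on the DRM minor of the device.
 */
static int read_u32_file(const char *path, uint32_t *out)
{
	char buf[32];
	int ret = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_u32(buf, out);
	fclose(f);
	return ret;
}

/*
 * Hypothetical helper: how many more records fit before the count
 * reaches bad_page_cnt_threshold. Saturates at zero.
 */
static uint32_t ras_recs_below_threshold(uint32_t threshold, uint32_t num_recs)
{
	return num_recs >= threshold ? 0 : threshold - num_recs;
}
```

A tool would call read_u32_file() once for ras_num_recs and once for bad_page_cnt_threshold, then combine the two values however its policy requires.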

RE: WARNING: CPU: 5 PID: 1464 at drivers/gpu/drm/ttm/ttm_bo.c:326 ttm_bo_release+0x27e/0x2d0 [ttm]

2023-06-05 Thread Deucher, Alexander

[Public]

+ Christian

> -Original Message-
> From: Borislav Petkov 
> Sent: Saturday, June 3, 2023 1:48 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; dri-
> de...@lists.freedesktop.org; lkml 
> Subject: WARNING: CPU: 5 PID: 1464 at drivers/gpu/drm/ttm/ttm_bo.c:326
> ttm_bo_release+0x27e/0x2d0 [ttm]
>
> Hi,
>
> this below triggers with the latest Linus tree:
>
> 51f269a6ecc7 ("Merge tag 'probes-fixes-6.4-rc4' of
> git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace")
>
> ...
> [   16.173593] [drm] radeon kernel modesetting enabled.
> [   16.173743] radeon :29:00.0: vgaarb: deactivate vga console
> [   16.174300] MCE: In-kernel MCE decoding enabled.
> [   16.175695] EDAC DEBUG: umc_read_base_mask:   DCSB0[0]=0x0001
> reg: 0x5
> [   16.175698] EDAC DEBUG: umc_read_base_mask:
> DCSB_SEC0[0]=0x reg: 0x50010
> [   16.175700] EDAC DEBUG: umc_read_base_mask:   DCSB0[1]=0x
> reg: 0x50004
> [   16.175702] EDAC DEBUG: umc_read_base_mask:
> DCSB_SEC0[1]=0x reg: 0x50014
> [   16.175703] EDAC DEBUG: umc_read_base_mask:   DCSB0[2]=0x0201
> reg: 0x50008
> [   16.175705] EDAC DEBUG: umc_read_base_mask:
> DCSB_SEC0[2]=0x reg: 0x50018
> [   16.175706] EDAC DEBUG: umc_read_base_mask:   DCSB0[3]=0x
> reg: 0x5000c
> [   16.175707] EDAC DEBUG: umc_read_base_mask:
> DCSB_SEC0[3]=0x reg: 0x5001c
> [   16.175709] EDAC DEBUG: umc_read_base_mask:   DCSM0[0]=0x03fffdfe
> reg: 0x50020
> [   16.175710] EDAC DEBUG: umc_read_base_mask:
> DCSM_SEC0[0]=0x reg: 0x50028
> [   16.175712] EDAC DEBUG: umc_read_base_mask:   DCSM0[1]=0x03fffdfe
> reg: 0x50024
> [   16.175713] EDAC DEBUG: umc_read_base_mask:
> DCSM_SEC0[1]=0x reg: 0x5002c
> [   16.175715] EDAC DEBUG: umc_read_base_mask:   DCSB1[0]=0x0001
> reg: 0x15
> [   16.175716] EDAC DEBUG: umc_read_base_mask:
> DCSB_SEC1[0]=0x reg: 0x150010
> [   16.175718] EDAC DEBUG: umc_read_base_mask:   DCSB1[1]=0x
> reg: 0x150004
> [   16.175719] EDAC DEBUG: umc_read_base_mask:
> DCSB_SEC1[1]=0x reg: 0x150014
> [   16.175720] EDAC DEBUG: umc_read_base_mask:   DCSB1[2]=0x0201
> reg: 0x150008
> [   16.175722] EDAC DEBUG: umc_read_base_mask:
> DCSB_SEC1[2]=0x reg: 0x150018
> [   16.175723] EDAC DEBUG: umc_read_base_mask:   DCSB1[3]=0x
> reg: 0x15000c
> [   16.175725] EDAC DEBUG: umc_read_base_mask:
> DCSB_SEC1[3]=0x reg: 0x15001c
> [   16.175726] EDAC DEBUG: umc_read_base_mask:   DCSM1[0]=0x03fffdfe
> reg: 0x150020
> [   16.175728] EDAC DEBUG: umc_read_base_mask:
> DCSM_SEC1[0]=0x reg: 0x150028
> [   16.175729] EDAC DEBUG: umc_read_base_mask:   DCSM1[1]=0x03fffdfe
> reg: 0x150024
> [   16.175730] EDAC DEBUG: umc_read_base_mask:
> DCSM_SEC1[1]=0x reg: 0x15002c
> [   16.175741] EDAC DEBUG: umc_determine_memory_type:   UMC0 DIMM
> type: Unbuffered-DDR4
> [   16.175742] EDAC DEBUG: umc_determine_memory_type:   UMC1 DIMM
> type: Unbuffered-DDR4
> [   16.177514] Console: switching to colour dummy device 80x25
> [   16.177693] [drm] initializing kernel modesetting (CEDAR 0x1002:0x68E1
> 0x174B:0x3000 0x00).
> [   16.177733] ATOM BIOS: AMD
> [   16.177795] radeon :29:00.0: VRAM: 1024M 0x
> - 0x3FFF (1024M used)
> [   16.177798] radeon :29:00.0: GTT: 1024M 0x4000 -
> 0x7FFF
> [   16.177800] [drm] Detected VRAM RAM=1024M, BAR=256M
> [   16.177802] [drm] RAM width 64bits DDR
> [   16.177835] [drm] radeon: 1024M of VRAM memory ready
> [   16.177836] [drm] radeon: 1024M of GTT memory ready.
> [   16.177839] [drm] Loading CEDAR Microcode
> [   16.179106] [drm] Internal thermal controller without fan control
> [   16.199812] [drm] radeon: dpm initialized
> [   16.200179] [drm] GART: num cpu pages 262144, num gpu pages 262144
> [   16.200399] [drm] enabling PCIE gen 2 link speeds, disable with
> radeon.pcie_gen2=0
> [   16.218135] [drm] PCIE GART of 1024M enabled (table at
> 0x0014C000).
> [   16.218239] radeon :29:00.0: WB enabled
> [   16.218240] radeon :29:00.0: fence driver on ring 0 use gpu addr
> 0x4c00
> [   16.218242] radeon :29:00.0: fence driver on ring 3 use gpu addr
> 0x4c0c
> [   16.218606] radeon :29:00.0: fence driver on ring 5 use gpu addr
> 0x0005c418
> [   16.218657] radeon :29:00.0: radeon: MSI limited to 32-bit
> [   16.218689] radeon :29:00.0: radeon: using MSI.
> [   16.218707] [drm] radeon: irq initialized.
> [   16.234730] [drm] ring test on 0 succeeded in 0 usecs
> [   16.234738] [drm] ring test on 3 succeeded in 2 usecs
> [   16.317725] r8169 :25:00.0 eth0: Link is Down
> [   16.410486] [drm] ring test on 5 succeeded in 1 usecs
> [   16.410492] [drm] UVD initialized successfully.
> [   16.410555] [drm] ib test on ring 0 succeeded in 0 usecs
> [   16.410596] [drm] ib test on ring 3 succeeded in 0 usecs
> [   17.077422] [drm] ib test on ring 5 succeeded
> [   17.077581] [drm] Radeo

Re: [PATCH] drm/amdgpu: fix xclk freq on CHIP_STONEY

2023-06-05 Thread Alex Deucher

Applied.  Thanks!

On Fri, Jun 2, 2023 at 11:13 PM Chia-I Wu  wrote:
>
> On Fri, Jun 2, 2023 at 11:50 AM Alex Deucher  wrote:
> >
> > Nevermind, missing your Signed-off-by.  Please add and I'll apply.
> Sorry that I keep forgetting...  This patch is
>
>   Signed-off-by: Chia-I Wu 
>
> I can send v2 if necessary.
> >
> > Alex
> >

Re: [PATCH 2/4] drm/amdkfd: Signal page table fence after KFD flush tlb

2023-06-05 Thread Christian König


On 05.06.23 at 17:40, Shashank Sharma wrote:


On 05/06/2023 17:18, Christian König wrote:

On 05.06.23 at 17:13, Shashank Sharma wrote:


On 02/06/2023 16:54, Felix Kuehling wrote:

On 2023-06-02 at 07:57, Christian König wrote:

On 01.06.23 at 21:31, Philip Yang wrote:

To free page table BOs which are freed when updating the page table, for
example PTE BOs when PDE0 is used as PTE.

Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c

index af0a4b5257cc..0ff007a74d03 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -2101,6 +2101,11 @@ void kfd_flush_tlb(struct 
kfd_process_device *pdd, enum TLB_FLUSH_TYPE type)

  amdgpu_amdkfd_flush_gpu_tlb_pasid(
  dev->adev, pdd->process->pasid, type, xcc);
  }
+
+    /* Signal page table fence to free page table BOs */
+    dma_fence_signal(vm->pt_fence);


That's not something you can do here.

Signaling a fence can never depend on anything except for hardware 
work. In other words dma_fence_signal() is supposed to be called 
only from interrupt context!


We are signaling eviction fences from normal user context, too. 
There is no practical way to put this into an interrupt handler 
when the TLB flush is being done synchronously on a user thread. We 
have to do this in such a context for user mode queues.



We are currently working on adding a high-level kernel API
that can be called directly to perform a TLB flush. Internally this
API will queue deferred work that uses the SDMA engine to perform the GPU
TLB flush (to compensate for a HW bug in Navi chips). If my
understanding is right, by interrupt context Christian means
performing this flush and signaling from that deferred work; is that so,
@Christian?


Well more or less. Ideally you put the TLB flush in a work item (or 
use the SDMA for the hw bug workaround on Navi 1x).


The point is that you shouldn't have it in the same execution thread 
as the VM page table updates, because any memory allocation or 
grabbing a lock could potentially depend on the TLB flush as soon as 
you have published the dma_fence (by adding it to the VM page table 
BOs for example).


Would it work for everyone if we add this generic API (say
amdgpu_flush_tlb_async()) that queues the TLB flush work and also calls
dma_fence_signal() from within it? KFD can simply consume it from
wherever they want. Do you see a race condition if we do it like this?


Yes, that's pretty much the whole idea. amdgpu_flush_tlb() should just 
return a dma_fence object.


This dma_fence object should either be the SDMA workaround or signaled 
from a work item.


We can then fence the BOs or just wait for the dma_fence object to signal.

Regards,
Christian.




- Shashank


Christian.



- Shashank



Regards,
  Felix




What we can do is put the TLB flushing into an irq worker or
work item and let the signaling happen from there.


Amar and Shashank are already working on this, I strongly suggest 
to sync up with them.


Regards,
Christian.


+    dma_fence_put(vm->pt_fence);
+    vm->pt_fence = amdgpu_pt_fence_create();
  }
    struct kfd_process_device 
*kfd_process_device_data_by_id(struct kfd_process *p, uint32_t 
gpu_id)
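
The design Christian describes above (the flush call returns a fence, and the fence is signaled from a work item rather than by the caller) can be modelled in a few lines of plain C. This is a single-threaded user-space sketch for illustration only: struct fence, struct flush_work, flush_tlb_async() and run_pending_work() are hypothetical stand-ins, not the kernel's dma_fence or workqueue APIs and not the interface that was eventually merged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dma_fence: only tracks the signaled state. */
struct fence {
	bool signaled;
};

/* Hypothetical stand-in for a queued work item that owns its fence. */
struct flush_work {
	struct fence fence;
	bool queued;
	int flushes_done;
};

/*
 * Queue the flush and hand back the fence immediately. The caller never
 * signals the fence itself; it can attach it to BOs or wait on it.
 */
static struct fence *flush_tlb_async(struct flush_work *w)
{
	w->fence.signaled = false;
	w->queued = true;
	return &w->fence;
}

/*
 * Models the worker context: perform the flush, then signal. In this
 * sketch the "flush" is just a counter increment.
 */
static void run_pending_work(struct flush_work *w)
{
	if (!w->queued)
		return;
	w->flushes_done++;
	w->queued = false;
	w->fence.signaled = true;
}
```

The point of the split is that the code path publishing the fence (the page table update) is not the one responsible for signaling it, so nothing that path allocates or locks can end up waiting on its own fence.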

Re: [PATCH 2/4] drm/amdkfd: Signal page table fence after KFD flush tlb

2023-06-05 Thread Shashank Sharma




On 05/06/2023 17:18, Christian König wrote:

On 05.06.23 at 17:13, Shashank Sharma wrote:


On 02/06/2023 16:54, Felix Kuehling wrote:

On 2023-06-02 at 07:57, Christian König wrote:

On 01.06.23 at 21:31, Philip Yang wrote:

To free page table BOs which are freed when updating the page table, for
example PTE BOs when PDE0 is used as PTE.

Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c

index af0a4b5257cc..0ff007a74d03 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -2101,6 +2101,11 @@ void kfd_flush_tlb(struct 
kfd_process_device *pdd, enum TLB_FLUSH_TYPE type)

  amdgpu_amdkfd_flush_gpu_tlb_pasid(
  dev->adev, pdd->process->pasid, type, xcc);
  }
+
+    /* Signal page table fence to free page table BOs */
+    dma_fence_signal(vm->pt_fence);


That's not something you can do here.

Signaling a fence can never depend on anything except for hardware 
work. In other words dma_fence_signal() is supposed to be called 
only from interrupt context!


We are signaling eviction fences from normal user context, too. 
There is no practical way to put this into an interrupt handler when 
the TLB flush is being done synchronously on a user thread. We have 
to do this in such a context for user mode queues.



We are currently working on adding a high-level kernel API
that can be called directly to perform a TLB flush. Internally this
API will queue deferred work that uses the SDMA engine to perform the GPU
TLB flush (to compensate for a HW bug in Navi chips). If my
understanding is right, by interrupt context Christian means
performing this flush and signaling from that deferred work; is that so,
@Christian?


Well more or less. Ideally you put the TLB flush in a work item (or 
use the SDMA for the hw bug workaround on Navi 1x).


The point is that you shouldn't have it in the same execution thread 
as the VM page table updates, because any memory allocation or 
grabbing a lock could potentially depend on the TLB flush as soon as 
you have published the dma_fence (by adding it to the VM page table 
BOs for example).


Would it work for everyone if we add this generic API (say
amdgpu_flush_tlb_async()) that queues the TLB flush work and also calls
dma_fence_signal() from within it? KFD can simply consume it from
wherever they want. Do you see a race condition if we do it like this?


- Shashank


Christian.



- Shashank



Regards,
  Felix




What we can do is put the TLB flushing into an irq worker or
work item and let the signaling happen from there.


Amar and Shashank are already working on this, I strongly suggest 
to sync up with them.


Regards,
Christian.


+    dma_fence_put(vm->pt_fence);
+    vm->pt_fence = amdgpu_pt_fence_create();
  }
    struct kfd_process_device 
*kfd_process_device_data_by_id(struct kfd_process *p, uint32_t 
gpu_id)

[PATCH] drm/amdgpu: Log if device is unsupported

2023-06-05 Thread Paul Menzel

Since there is overlap in supported devices, both modules load, but only
one will bind to a particular device depending on the model and user's
configuration.

amdgpu binds to all display class devices with VID 0x1002 and then
determines whether or not to bind to a device based on whether the
individual device is supported by the driver or not. Log that case, so
users looking at the logs know what is going on.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2608
Signed-off-by: Paul Menzel 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 86fbb4138285..410ff918c350 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2062,8 +2062,10 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 
/* skip devices which are owned by radeon */
for (i = 0; i < ARRAY_SIZE(amdgpu_unsupported_pciidlist); i++) {
-   if (amdgpu_unsupported_pciidlist[i] == pdev->device)
+   if (amdgpu_unsupported_pciidlist[i] == pdev->device) {
+   DRM_INFO("This hardware is only supported by radeon.");
return -ENODEV;
+   }
}
 
if (amdgpu_aspm == -1 && !pcie_aspm_enabled(pdev))
-- 
2.40.1

RE: [PATCH] drm/amdgpu: Log if device is unsupported

2023-06-05 Thread Deucher, Alexander

[AMD Official Use Only - General]

> -Original Message-
> From: Paul Menzel 
> Sent: Monday, June 5, 2023 11:23 AM
> To: Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ; David
> Airlie ; Daniel Vetter 
> Cc: Paul Menzel ; amd-gfx@lists.freedesktop.org;
> dri-de...@lists.freedesktop.org; linux-ker...@vger.kernel.org
> Subject: [PATCH] drm/amdgpu: Log if device is unsupported
>
> Since there is overlap in supported devices, both modules load, but only one
> will bind to a particular device depending on the model and user's
> configuration.
>
> amdgpu binds to all display class devices with VID 0x1002 and then
> determines whether or not to bind to a device based on whether the
> individual device is supported by the driver or not. Log that case, so users
> looking at the logs know what is going on.
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2608
> Signed-off-by: Paul Menzel 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 86fbb4138285..410ff918c350 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2062,8 +2062,10 @@ static int amdgpu_pci_probe(struct pci_dev
> *pdev,
>
>   /* skip devices which are owned by radeon */
>   for (i = 0; i < ARRAY_SIZE(amdgpu_unsupported_pciidlist); i++) {
> - if (amdgpu_unsupported_pciidlist[i] == pdev->device)
> + if (amdgpu_unsupported_pciidlist[i] == pdev->device) {
> + DRM_INFO("This hardware is only supported by
> radeon.");
>   return -ENODEV;

I think this will confuse users even more, as there will be a new "error"
message reported.  I'd suggest either dropping the message in init per my
proposed patch or just leaving things as is.

Alex

> + }
>   }
>
>   if (amdgpu_aspm == -1 && !pcie_aspm_enabled(pdev))
> --
> 2.40.1

Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue

2023-06-05 Thread Hamza Mahfooz

On 6/3/23 10:52, Felix Richter wrote:

Hi Guys,

sorry for the silence from my side. I had a lot of things to take care
of after returning from vacation. Also I had to wait on the zfs modules
to be updated to support kernel 6.3 for further testing.

Now if this is the desired long term fix I do not know …

Can you provide a dmidecode of your RAM (i.e. # dmidecode --type=memory)?

The current trend seems to suggest that if you have 64 or more gigs of
RAM, you will probably still experience issues with S/G mode enabled
even with my fix applied.

Kind regards,
Felix Richter

On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:

On 02.05.23 15:48, Felix Richter wrote:

On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:

On 02.05.23 15:13, Alex Deucher wrote:

On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
Leemhuis) wrote:

On 30.04.23 13:44, Felix Richter wrote:

Hi,

I am running the Archlinux Kernel:
The Issue happens on the bleeding edge kernel: 6.2.13
Switching back to the LTS kernel resolves the issue: 6.1.26

I have two monitors attached to the system. One 42 inch 4k Display
and a 24 inch 1080p Display and am running sway as my desktop.

Let me know if there is more information I could provide to help
narrow down the issue.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced v6.1..v6.2
#regzbot title drm: amdgpu: system becomes unpleasant to use after
monitor starts blinking and turns full white
#regzbot ignore-activity

the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

This sounds exactly like the issue that was fixed in this patch which
is already on its way to Linus:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9

/me looks around again and can't see anything, but that doesn't have to
mean anything...

Felix, btw, this guide might help you with the bisection, even if it's
just for kernel compilation:

https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html

And to indirectly reply to your mail from yesterday[1]. You might want
to ignore the arch linux kernel git repo and just do a bisection between
6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
I'd also try 6.3 or even mainline before that, in case the issue was
fixed already.

[1]
https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279e...@felixrichter.tech/

Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
the newest commit.

That was the part I was mostly unsure about … where
to start from.

I was planning to use PKGBUILD scripts from arch to achieve the same
configuration as I would when inst

Re: [PATCH v2] drm/radeon: fix race condition UAF in radeon_gem_set_domain_ioctl

2023-06-05 Thread Alex Deucher

Applied.  Thanks!

On Mon, Jun 5, 2023 at 4:13 AM Christian König  wrote:
>
> On 03.06.23 at 09:43, Min Li wrote:
> > Userspace can race to free the gobj (from which robj is converted), so robj
> > should not be accessed again after drm_gem_object_put(); otherwise it will
> > result in a use-after-free.
> >
> > Signed-off-by: Min Li 
>
> Reviewed-by: Christian König 
>
> > ---
> > Changes in v2:
> > - Remove unused robj to avoid a compiler warning
> >
> >   drivers/gpu/drm/radeon/radeon_gem.c | 4 +---
> >   1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> > b/drivers/gpu/drm/radeon/radeon_gem.c
> > index bdc5af23f005..d3f5ddbc1704 100644
> > --- a/drivers/gpu/drm/radeon/radeon_gem.c
> > +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> > @@ -459,7 +459,6 @@ int radeon_gem_set_domain_ioctl(struct drm_device *dev, 
> > void *data,
> >   struct radeon_device *rdev = dev->dev_private;
> >   struct drm_radeon_gem_set_domain *args = data;
> >   struct drm_gem_object *gobj;
> > - struct radeon_bo *robj;
> >   int r;
> >
> >   /* for now if someone requests domain CPU -
> > @@ -472,13 +471,12 @@ int radeon_gem_set_domain_ioctl(struct drm_device 
> > *dev, void *data,
> >   up_read(&rdev->exclusive_lock);
> >   return -ENOENT;
> >   }
> > - robj = gem_to_radeon_bo(gobj);
> >
> >   r = radeon_gem_set_domain(gobj, args->read_domains, 
> > args->write_domain);
> >
> >   drm_gem_object_put(gobj);
> >   up_read(&rdev->exclusive_lock);
> > - r = radeon_gem_handle_lockup(robj->rdev, r);
> > + r = radeon_gem_handle_lockup(rdev, r);
> >   return r;
> >   }
> >
>
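
The pattern behind this fix can be shown outside the kernel. The sketch below is a user-space illustration with hypothetical types (struct device_ctx, struct gem_obj) and helpers; it is not the radeon code. It shows why the device pointer must come from something that outlives the object, instead of being read through the object after the reference has been dropped.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the long-lived device structure. */
struct device_ctx {
	int lockups_handled;
};

/* Hypothetical stand-in for a reference-counted GEM object. */
struct gem_obj {
	int refcount;
	struct device_ctx *dev;
};

static struct gem_obj *obj_create(struct device_ctx *dev)
{
	struct gem_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	obj->refcount = 1;
	obj->dev = dev;
	return obj;
}

/* Drop one reference; the object is freed when the count reaches zero. */
static void obj_put(struct gem_obj *obj)
{
	if (--obj->refcount == 0)
		free(obj);
}

static int handle_lockup(struct device_ctx *dev, int r)
{
	dev->lockups_handled++;
	return r;
}

static int set_domain(struct device_ctx *dev, struct gem_obj *obj)
{
	int r = 0; /* stand-in for the real domain change */

	obj_put(obj); /* obj may be freed here */

	/*
	 * Wrong: handle_lockup(obj->dev, r) would read freed memory.
	 * Right: use the device pointer the caller already holds.
	 */
	return handle_lockup(dev, r);
}
```

In the actual patch the same idea appears as replacing robj->rdev with the local rdev, which was taken from dev->dev_private before the object was looked up.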

Re: [PATCH 2/4] drm/amdkfd: Signal page table fence after KFD flush tlb

2023-06-05 Thread Christian König


On 05.06.23 at 17:13, Shashank Sharma wrote:


On 02/06/2023 16:54, Felix Kuehling wrote:

On 2023-06-02 at 07:57, Christian König wrote:

On 01.06.23 at 21:31, Philip Yang wrote:

To free page table BOs which are freed when updating the page table, for
example PTE BOs when PDE0 is used as PTE.

Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c

index af0a4b5257cc..0ff007a74d03 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -2101,6 +2101,11 @@ void kfd_flush_tlb(struct kfd_process_device 
*pdd, enum TLB_FLUSH_TYPE type)

  amdgpu_amdkfd_flush_gpu_tlb_pasid(
  dev->adev, pdd->process->pasid, type, xcc);
  }
+
+    /* Signal page table fence to free page table BOs */
+    dma_fence_signal(vm->pt_fence);


That's not something you can do here.

Signaling a fence can never depend on anything except for hardware 
work. In other words dma_fence_signal() is supposed to be called 
only from interrupt context!


We are signaling eviction fences from normal user context, too. There 
is no practical way to put this into an interrupt handler when the 
TLB flush is being done synchronously on a user thread. We have to do 
this in such a context for user mode queues.



We are currently working on adding a high-level kernel API
that can be called directly to perform a TLB flush. Internally this
API will queue deferred work that uses the SDMA engine to perform the GPU
TLB flush (to compensate for a HW bug in Navi chips). If my
understanding is right, by interrupt context Christian means
performing this flush and signaling from that deferred work; is that so,
@Christian?


Well more or less. Ideally you put the TLB flush in a work item (or use 
the SDMA for the hw bug workaround on Navi 1x).


The point is that you shouldn't have it in the same execution thread as 
the VM page table updates, because any memory allocation or grabbing a 
lock could potentially depend on the TLB flush as soon as you have 
published the dma_fence (by adding it to the VM page table BOs for example).


Christian.



- Shashank



Regards,
  Felix




What we can do is put the TLB flushing into an irq worker or work
item and let the signaling happen from there.


Amar and Shashank are already working on this, I strongly suggest to 
sync up with them.


Regards,
Christian.


+    dma_fence_put(vm->pt_fence);
+    vm->pt_fence = amdgpu_pt_fence_create();
  }
    struct kfd_process_device *kfd_process_device_data_by_id(struct 
kfd_process *p, uint32_t gpu_id)

Re: [PATCH 2/2] drm/amdgpu: make sure that BOs have a backing store

2023-06-05 Thread Alex Deucher

On Mon, Jun 5, 2023 at 5:11 AM Christian König
 wrote:
>
> It's perfectly possible that the BO is about to be destroyed and doesn't
> have a backing store associated with it.
>
> Signed-off-by: Christian König 

Series is:
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 2bd1a54ee866..249385985a4f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1268,8 +1268,12 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object 
> *bo,
>  void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   struct amdgpu_mem_stats *stats)
>  {
> -   unsigned int domain;
> uint64_t size = amdgpu_bo_size(bo);
> +   unsigned int domain;
> +
> +   /* Abort if the BO doesn't currently have a backing store */
> +   if (!bo->tbo.resource)
> +   return;
>
> domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> switch (domain) {
> --
> 2.34.1
>
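
The guard added by this patch is an early return before the first dereference of the backing store. A small user-space sketch of the same shape follows; the types, the memory-type enum and the statistics fields are hypothetical simplifications, not the amdgpu or TTM structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory placements for the sketch. */
enum mem_type { MEM_VRAM, MEM_GTT };

/* Hypothetical stand-in for the backing store of a buffer object. */
struct resource {
	enum mem_type mem_type;
};

struct bo {
	uint64_t size;
	struct resource *resource; /* NULL while the BO is being destroyed */
};

struct mem_stats {
	uint64_t vram;
	uint64_t gtt;
};

static void bo_get_memory(const struct bo *bo, struct mem_stats *stats)
{
	/* Abort if the BO doesn't currently have a backing store. */
	if (!bo->resource)
		return;

	switch (bo->resource->mem_type) {
	case MEM_VRAM:
		stats->vram += bo->size;
		break;
	case MEM_GTT:
		stats->gtt += bo->size;
		break;
	}
}
```

Without the check, a BO that is about to be destroyed would cause a NULL pointer dereference when its memory type is read.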

Re: [PATCH 2/4] drm/amdkfd: Signal page table fence after KFD flush tlb

2023-06-05 Thread Shashank Sharma




On 02/06/2023 16:54, Felix Kuehling wrote:

On 2023-06-02 at 07:57, Christian König wrote:

On 01.06.23 at 21:31, Philip Yang wrote:

To free page table BOs which are freed when updating the page table, for
example PTE BOs when PDE0 is used as PTE.

Signed-off-by: Philip Yang 
---
  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c

index af0a4b5257cc..0ff007a74d03 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -2101,6 +2101,11 @@ void kfd_flush_tlb(struct kfd_process_device 
*pdd, enum TLB_FLUSH_TYPE type)

  amdgpu_amdkfd_flush_gpu_tlb_pasid(
  dev->adev, pdd->process->pasid, type, xcc);
  }
+
+    /* Signal page table fence to free page table BOs */
+    dma_fence_signal(vm->pt_fence);


That's not something you can do here.

Signaling a fence can never depend on anything except for hardware 
work. In other words dma_fence_signal() is supposed to be called only 
from interrupt context!


We are signaling eviction fences from normal user context, too. There 
is no practical way to put this into an interrupt handler when the TLB 
flush is being done synchronously on a user thread. We have to do this 
in such a context for user mode queues.



We are currently working on adding a high-level kernel API
that can be called directly to perform a TLB flush. Internally this API
will queue deferred work that uses the SDMA engine to perform the GPU TLB
flush (to compensate for a HW bug in Navi chips). If my
understanding is right, by interrupt context Christian means performing
this flush and signaling from that deferred work; is that so, @Christian?


- Shashank



Regards,
  Felix




What we can do is put the TLB flushing into an irq worker or work
item and let the signaling happen from there.


Amar and Shashank are already working on this, I strongly suggest to 
sync up with them.


Regards,
Christian.


+    dma_fence_put(vm->pt_fence);
+    vm->pt_fence = amdgpu_pt_fence_create();
  }
    struct kfd_process_device *kfd_process_device_data_by_id(struct 
kfd_process *p, uint32_t gpu_id)

[PATCH] drm/amd: Drop messages in init for radeon, amdgpu

2023-06-05 Thread Alex Deucher

Since there is overlap in supported devices, both
modules load, but only one will bind to a particular
device depending on the user's configuration.  Drop
the message in the module init function as this can
be confusing to users.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2608
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 -
 drivers/gpu/drm/radeon/radeon_drv.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 7eda4f039224..94509b76fa6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -3065,7 +3065,6 @@ static int __init amdgpu_init(void)
if (r)
goto error_fence;
 
-   DRM_INFO("amdgpu kernel modesetting enabled.\n");
amdgpu_register_atpx_handler();
amdgpu_acpi_detect();
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index e4374814f0ef..16b9eab90185 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -634,7 +634,6 @@ static int __init radeon_module_init(void)
if (radeon_modeset == 0)
return -EINVAL;
 
-   DRM_INFO("radeon kernel modesetting enabled.\n");
radeon_register_atpx_handler();
 
return pci_register_driver(&radeon_kms_pci_driver);
-- 
2.40.1