Re: [PATCH] drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov

2021-11-05 Thread Felix Kuehling



On 2021-11-05 1:59 p.m., Liu, Shaoyun wrote:

[AMD Official Use Only]

Ye, a lot already been changed since then , now the  pre_reset and  post_reset 
not in the  lock/unlock anymore.  With  my previous change , we make 
kfd_pre_reset  avoid touch  HW . Now it's pure SW handling , should be safe  to 
be moved out of the full access .
Anyway, thanks to bring this up, it will remind us to verify on the  XGMI 
configuration on SRIOV.

OK. Assuming it doesn't break in your testing, consider the patch

Reviewed-by: Felix Kuehling 




Regards
shaoyun.liu

-Original Message-
From: Kuehling, Felix 
Sent: Friday, November 5, 2021 1:48 PM
To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov

There was a reason why pre_reset was done differently on SRIOV. However, the 
code has changed a lot since then. Is this concern still valid?


commit 7b184b006185215daf4e911f8de212964c99a514
Author: wentalou 
Date:   Fri Dec 7 13:53:18 2018 +0800

     drm/amdgpu: kfd_pre_reset outside req_full_gpu cause sriov hang

     XGMI hive put kfd_pre_reset into amdgpu_device_lock_adev,
     but outside req_full_gpu of sriov.
     It would make sriov hang during reset.

     Signed-off-by: Wentao Lou 
     Reviewed-by: Shaoyun Liu 
     Signed-off-by: Alex Deucher 

Regards,
    Felix


On 2021-11-05 12:57 p.m., shaoyunl wrote:

The KFD pre_reset should be called before reset been executed, it will
hold the lock to prevent other rocm process to sent the packlage to
hiq during host execute the real reset on the HW

Signed-off-by: shaoyunl 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
   1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 95fec36e385e..d7c9dce17cad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4278,8 +4278,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device 
*adev,
if (r)
return r;
   
-	amdgpu_amdkfd_pre_reset(adev);

-
/* Resume IP prior to SMC */
r = amdgpu_device_ip_reinit_early_sriov(adev);
if (r)
@@ -5015,8 +5013,7 @@ int amdgpu_device_gpu_recover(struct
amdgpu_device *adev,
   
   		cancel_delayed_work_sync(_adev->delayed_init_work);
   
-		if (!amdgpu_sriov_vf(tmp_adev))

-   amdgpu_amdkfd_pre_reset(tmp_adev);
+   amdgpu_amdkfd_pre_reset(tmp_adev);
   
   		/*

 * Mark these ASICs to be reseted as untracked first


Re: [PATCH] drm/amdkfd: lower the VAs base offset to 8KB

2021-11-05 Thread Felix Kuehling

On 2021-11-05 3:25 p.m., Alex Sierra wrote:

The low 16MB of virtual address space are currently reserved for kernel
mode allocations mapped into user virtual address space. This causes
conflicts with HMM/SVM mappings at low virtual addresses. We tried to
move those kernel mode allocations to the upper half of the 64-bit
virtual address space for GFX9, which is naturally reserved for kernel
use. However, TBA (trap handler code) has problems to access addresses
in the high virtual space. We have decided to set this to 8KB of the
lower address space as a temporary fix, while investigate TBA address
problem. It is very unlikely for user space to map memory at this low
region.

Signed-off-by: Alex Sierra 


Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
index 2e86692def19..d1388896f9c1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
@@ -308,7 +308,7 @@
   * 16MB are reserved for kernel use (CWSR trap handler and kernel IB
   * for now).
   */
-#define SVM_USER_BASE 0x100ull
+#define SVM_USER_BASE (u64)(KFD_CWSR_TBA_TMA_SIZE + 2*PAGE_SIZE)
  #define SVM_CWSR_BASE (SVM_USER_BASE - KFD_CWSR_TBA_TMA_SIZE)
  #define SVM_IB_BASE   (SVM_CWSR_BASE - PAGE_SIZE)
  


Re: [PATCH] drm/amdkfd: replace trivial funcs with direct access

2021-11-05 Thread Felix Kuehling

On 2021-11-05 2:43 p.m., Graham Sider wrote:

These get funcs simply return an adev field. Replace funcs/calls with
direct field accesses instead.

Signed-off-by: Graham Sider 


Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 30 ---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  6 
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  4 +--
  drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  8 ++---
  .../amd/amdkfd/kfd_process_queue_manager.c|  2 +-
  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  7 ++---
  6 files changed, 10 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 83f863dca7af..46cf48b3904a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -521,16 +521,6 @@ uint64_t amdgpu_amdkfd_get_vram_usage(struct amdgpu_device 
*adev)
return amdgpu_vram_mgr_usage(vram_man);
  }
  
-uint64_t amdgpu_amdkfd_get_hive_id(struct amdgpu_device *adev)

-{
-   return adev->gmc.xgmi.hive_id;
-}
-
-uint64_t amdgpu_amdkfd_get_unique_id(struct amdgpu_device *adev)
-{
-   return adev->unique_id;
-}
-
  uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct amdgpu_device *dst,
  struct amdgpu_device *src)
  {
@@ -630,26 +620,6 @@ int amdgpu_amdkfd_get_pcie_bandwidth_mbytes(struct 
amdgpu_device *adev, bool is_
return (num_lanes_factor * gen_speed_mbits_factor)/BITS_PER_BYTE;
  }
  
-uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct amdgpu_device *adev)

-{
-   return adev->rmmio_remap.bus_addr;
-}
-
-uint32_t amdgpu_amdkfd_get_num_gws(struct amdgpu_device *adev)
-{
-   return adev->gds.gws_size;
-}
-
-uint32_t amdgpu_amdkfd_get_asic_rev_id(struct amdgpu_device *adev)
-{
-   return adev->rev_id;
-}
-
-int amdgpu_amdkfd_get_noretry(struct amdgpu_device *adev)
-{
-   return adev->gmc.noretry;
-}
-
  int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
enum kgd_engine_type engine,
uint32_t vmid, uint64_t gpu_addr,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 5f658823a637..d00de575c541 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -224,12 +224,6 @@ int amdgpu_amdkfd_get_dmabuf_info(struct amdgpu_device 
*adev, int dma_buf_fd,
  size_t buffer_size, uint32_t *metadata_size,
  uint32_t *flags);
  uint64_t amdgpu_amdkfd_get_vram_usage(struct amdgpu_device *adev);
-uint64_t amdgpu_amdkfd_get_hive_id(struct amdgpu_device *adev);
-uint64_t amdgpu_amdkfd_get_unique_id(struct amdgpu_device *adev);
-uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct amdgpu_device *adev);
-uint32_t amdgpu_amdkfd_get_num_gws(struct amdgpu_device *adev);
-uint32_t amdgpu_amdkfd_get_asic_rev_id(struct amdgpu_device *adev);
-int amdgpu_amdkfd_get_noretry(struct amdgpu_device *adev);
  uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct amdgpu_device *dst,
  struct amdgpu_device *src);
  int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct amdgpu_device *dst,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 8d5021e8c714..2e3d74f7fbfb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1313,7 +1313,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
*filep,
err = -EINVAL;
goto err_unlock;
}
-   offset = amdgpu_amdkfd_get_mmio_remap_phys_addr(dev->adev);
+   offset = dev->adev->rmmio_remap.bus_addr;
if (!offset) {
err = -ENOMEM;
goto err_unlock;
@@ -2066,7 +2066,7 @@ static int kfd_mmio_mmap(struct kfd_dev *dev, struct 
kfd_process *process,
if (vma->vm_end - vma->vm_start != PAGE_SIZE)
return -EINVAL;
  
-	address = amdgpu_amdkfd_get_mmio_remap_phys_addr(dev->adev);

+   address = dev->adev->rmmio_remap.bus_addr;
  
  	vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE |

VM_DONTDUMP | VM_PFNMAP;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index c8aade17efef..b752dc36a2cd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -892,7 +892,7 @@ static int kfd_gws_init(struct kfd_dev *kfd)
|| (kfd->device_info->asic_family == CHIP_ALDEBARAN
&& kfd->mec2_fw_version >= 0x28))
ret = amdgpu_amdkfd_alloc_gws(kfd->adev,
-   amdgpu_amdkfd_get_num_gws(kfd->adev), 
>gws);
+   

RE: [PATCH 00/22] DC Patches Nov 4, 2021

2021-11-05 Thread Wheeler, Daniel
[Public]

Hi all,
 
This week this patchset was tested on the following systems:
 
HP Envy 360, with Ryzen 5 4500U, with the following display types: eDP 1080p 
60hz, 4k 60hz  (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 
1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA)
 
AMD Ryzen 9 5900H, with the following display types: eDP 1080p 60hz, 4k 60hz  
(via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 1680*1050 60hz (via 
USB-C to DP and then DP to DVI/VGA)
 
Sapphire Pulse RX5700XT with the following display types:
4k 60hz  (via DP/HDMI), 1440p 144hz (via DP/HDMI), 1680*1050 60hz (via DP to 
DVI/VGA)
 
Reference AMD RX6800 with the following display types:
4k 60hz  (via DP/HDMI and USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI 
and USB-C to DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA)
 
Included testing using a Startech DP 1.4 MST hub at 2x 4k 60hz, and 3x 1080p 
60hz on all systems. Also tested DSC via USB-C to DP DSC Hub with 3x 4k 60hz on 
Ryzen 9 5900h and Ryzen 5 4500u.
 
Tested on Ubuntu 20.04.3 with Kernel Version 5.13
 
Tested-by: Daniel Wheeler 
 
 
Thank you,
 
Dan Wheeler
Technologist  |  AMD
SW Display
--
1 Commerce Valley Dr E, Thornhill, ON L3T 7X6
Facebook |  Twitter |  amd.com  

-Original Message-
From: amd-gfx  On Behalf Of Anson Jacob
Sent: November 4, 2021 4:52 PM
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Chao-kai (Stylon) ; Chiu, Solomon 
; Li, Sun peng (Leo) ; Wentland, 
Harry ; Zhuo, Qingqing ; 
Siqueira, Rodrigo ; Li, Roman ; 
Jacob, Anson ; Pillai, Aurabindo 
; Lin, Wayne ; Lipski, Mikita 
; Lakha, Bhawanpreet ; 
Gutierrez, Agustin ; Kotarac, Pavle 

Subject: [PATCH 00/22] DC Patches Nov 4, 2021

This DC patchset brings improvements in multiple areas. In summary, we
have:

* Improvements to INBOX0 HW Lock
* Add support for sending TPS3 pattern
* Fix Coverity Issues
* Fixes for DMUB
* Fix RGB MPO underflow with multiple displays
* WS fixes and code restructure

Alvin Lee (1):
  drm/amd/display: Wait for ACK for INBOX0 HW Lock

Angus Wang (1):
  drm/amd/display: Fix RGB MPO underflow with multiple displays

Anson Jacob (1):
  drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends

Aric Cyr (1):
  drm/amd/display: 3.2.161

Charlene Liu (3):
  drm/amd/display: remove dmcub_support cap dependency
  drm/amd/display: clean up some formats and log.
  drm/amd/display: Adjust code indentation

Chris Park (1):
  drm/amd/display: Fix Coverity Issues

Dmytro Laktyushkin (1):
  drm/amd/display: bring dcn31 clk mgr in line with other version style

Huang, ChiaWen (1):
  drm/amd/display: use link_rate_set above DPCD 1.3 (#1527)

Jimmy Kizito (3):
  drm/amd/display: Use link_enc_cfg API for queries.
  drm/amd/display: Query all entries in assignment table during updates.
  drm/amd/display: Initialise encoder assignment when initialising
dc_state.

Leo (Hanghong) Ma (1):
  drm/amd/display: Add helper for blanking all dp displays

Meenakshikumar Somasundaram (1):
  drm/amd/display: Add hpd pending flag to indicate detection of new
hpd.

Mikita Lipski (1):
  drm/amd/display: Pass panel inst to a PSR command

Nicholas Kazlauskas (3):
  drm/amd/display: Fix detection of aligned DMUB firmware meta info
  drm/amd/display: Don't lock connection_mutex for DMUB HPD
  drm/amd/display: Add callbacks for DMUB HPD IRQ notifications

Robin Chen (1):
  drm/amd/display: To support sending TPS3 pattern when restoring link

Roy Chan (1):
  drm/amd/display: fix stale info in link encoder assignment

Sung Joon Kim (1):
  drm/amd/display: retain/release stream pointer in link enc table

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 29 ---  
.../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c  |  8 +-  
.../display/dc/clk_mgr/dcn31/dcn31_clk_mgr.h  |  7 ++
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 17 ++--
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 78 ++-  
.../gpu/drm/amd/display/dc/core/dc_link_dp.c  |  2 +-
 .../drm/amd/display/dc/core/dc_link_dpia.c| 20 ++---
 .../drm/amd/display/dc/core/dc_link_enc_cfg.c | 51 ++--  
.../gpu/drm/amd/display/dc/core/dc_resource.c |  3 +
 drivers/gpu/drm/amd/display/dc/dc.h   |  2 +-
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c  | 37 -  
drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h  |  2 +
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  7 +-
 .../gpu/drm/amd/display/dc/dce/dce_audio.c|  6 --
 .../gpu/drm/amd/display/dc/dce/dce_audio.h|  2 +
 .../drm/amd/display/dc/dce/dmub_hw_lock_mgr.c |  3 +  
drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 13 +++-  
drivers/gpu/drm/amd/display/dc/dce/dmub_psr.h |  2 +-
 .../display/dc/dce110/dce110_hw_sequencer.c   | 22 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 41 +-  
.../drm/amd/display/dc/dcn20/dcn20_resource.c |  2 +-
 .../drm/amd/display/dc/dcn30/dcn30_hwseq.c| 39 +-
 

[PATCH 2/2] drm/amdkfd: convert misc checks to IP version checking

2021-11-05 Thread Graham Sider
Switch to IP version checking instead of asic_type on various KFD
version checks.

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 24 ++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  6 ++---
 .../amd/amdkfd/kfd_device_queue_manager_v9.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  6 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  8 +++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  6 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  4 ++--
 11 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 2e3d74f7fbfb..f66c78fda5be 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -321,7 +321,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
/* Return gpu_id as doorbell offset for mmap usage */
args->doorbell_offset = KFD_MMAP_TYPE_DOORBELL;
args->doorbell_offset |= KFD_MMAP_GPU_ID(args->gpu_id);
-   if (KFD_IS_SOC15(dev->device_info->asic_family))
+   if (KFD_IS_SOC15(dev->adev->ip_versions[GC_HWIP][0]))
/* On SOC15 ASICs, include the doorbell offset within the
 * process doorbell frame, which is 2 pages.
 */
@@ -1603,7 +1603,7 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file 
*filep,
}
mutex_unlock(>mutex);
 
-   if (dev->device_info->asic_family == CHIP_ALDEBARAN) {
+   if (dev->adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 2)) {
err = amdgpu_amdkfd_gpuvm_sync_memory(dev->adev,
(struct kgd_mem *) mem, true);
if (err) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 500bc7e40309..b41e62a324f6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1992,7 +1992,7 @@ static int kfd_fill_gpu_direct_io_link_to_cpu(int 
*avail_size,
sub_type_hdr->flags |= CRAT_IOLINK_FLAGS_BI_DIRECTIONAL;
sub_type_hdr->io_interface_type = CRAT_IOLINK_TYPE_XGMI;
sub_type_hdr->num_hops_xgmi = 1;
-   if (kdev->adev->asic_type == CHIP_ALDEBARAN) {
+   if (kdev->adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 2)) 
{
sub_type_hdr->minimum_bandwidth_mbs =
amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(
kdev->adev, NULL, true);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index b752dc36a2cd..29f8fcd4b779 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -844,23 +844,23 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, 
bool vf)
 static void kfd_cwsr_init(struct kfd_dev *kfd)
 {
if (cwsr_enable && kfd->device_info->supports_cwsr) {
-   if (kfd->device_info->asic_family < CHIP_VEGA10) {
+   if (kfd->adev->ip_versions[GC_HWIP][0] < IP_VERSION(9, 0, 1)) {
BUILD_BUG_ON(sizeof(cwsr_trap_gfx8_hex) > PAGE_SIZE);
kfd->cwsr_isa = cwsr_trap_gfx8_hex;
kfd->cwsr_isa_size = sizeof(cwsr_trap_gfx8_hex);
-   } else if (kfd->device_info->asic_family == CHIP_ARCTURUS) {
+   } else if (kfd->adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 
4, 1)) {
BUILD_BUG_ON(sizeof(cwsr_trap_arcturus_hex) > 
PAGE_SIZE);
kfd->cwsr_isa = cwsr_trap_arcturus_hex;
kfd->cwsr_isa_size = sizeof(cwsr_trap_arcturus_hex);
-   } else if (kfd->device_info->asic_family == CHIP_ALDEBARAN) {
+   } else if (kfd->adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 
4, 2)) {
BUILD_BUG_ON(sizeof(cwsr_trap_aldebaran_hex) > 
PAGE_SIZE);
kfd->cwsr_isa = cwsr_trap_aldebaran_hex;
kfd->cwsr_isa_size = sizeof(cwsr_trap_aldebaran_hex);
-   } else if (kfd->device_info->asic_family < CHIP_NAVI10) {
+   } else if (kfd->adev->ip_versions[GC_HWIP][0] < IP_VERSION(10, 
1, 1)) {
BUILD_BUG_ON(sizeof(cwsr_trap_gfx9_hex) > PAGE_SIZE);
kfd->cwsr_isa = cwsr_trap_gfx9_hex;
kfd->cwsr_isa_size = sizeof(cwsr_trap_gfx9_hex);
-   } else if (kfd->device_info->asic_family < CHIP_SIENNA_CICHLID) 
{
+   } else if (kfd->adev->ip_versions[GC_HWIP][0] < IP_VERSION(10, 
3, 

[PATCH 1/2] drm/amdkfd: convert to IP-based version checking

2021-11-05 Thread Graham Sider
Patches to change KFD to use IP versions rather than asic_type.
Converting IP version checking in main switch statements.

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 124 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  56 
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c  |  52 
 .../gpu/drm/amd/amdkfd/kfd_packet_manager.c   |  56 
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  54 
 5 files changed, 189 insertions(+), 153 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 1dc6cb7446e0..500bc7e40309 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1377,67 +1377,71 @@ static int kfd_fill_gpu_cache_info(struct kfd_dev *kdev,
pcache_info = vegam_cache_info;
num_of_cache_types = ARRAY_SIZE(vegam_cache_info);
break;
-   case CHIP_VEGA10:
-   pcache_info = vega10_cache_info;
-   num_of_cache_types = ARRAY_SIZE(vega10_cache_info);
-   break;
-   case CHIP_VEGA12:
-   pcache_info = vega12_cache_info;
-   num_of_cache_types = ARRAY_SIZE(vega12_cache_info);
-   break;
-   case CHIP_VEGA20:
-   case CHIP_ARCTURUS:
-   pcache_info = vega20_cache_info;
-   num_of_cache_types = ARRAY_SIZE(vega20_cache_info);
-   break;
-   case CHIP_ALDEBARAN:
-   pcache_info = aldebaran_cache_info;
-   num_of_cache_types = ARRAY_SIZE(aldebaran_cache_info);
-   break;
-   case CHIP_RAVEN:
-   pcache_info = raven_cache_info;
-   num_of_cache_types = ARRAY_SIZE(raven_cache_info);
-   break;
-   case CHIP_RENOIR:
-   pcache_info = renoir_cache_info;
-   num_of_cache_types = ARRAY_SIZE(renoir_cache_info);
-   break;
-   case CHIP_NAVI10:
-   case CHIP_NAVI12:
-   case CHIP_CYAN_SKILLFISH:
-   pcache_info = navi10_cache_info;
-   num_of_cache_types = ARRAY_SIZE(navi10_cache_info);
-   break;
-   case CHIP_NAVI14:
-   pcache_info = navi14_cache_info;
-   num_of_cache_types = ARRAY_SIZE(navi14_cache_info);
-   break;
-   case CHIP_SIENNA_CICHLID:
-   pcache_info = sienna_cichlid_cache_info;
-   num_of_cache_types = ARRAY_SIZE(sienna_cichlid_cache_info);
-   break;
-   case CHIP_NAVY_FLOUNDER:
-   pcache_info = navy_flounder_cache_info;
-   num_of_cache_types = ARRAY_SIZE(navy_flounder_cache_info);
-   break;
-   case CHIP_DIMGREY_CAVEFISH:
-   pcache_info = dimgrey_cavefish_cache_info;
-   num_of_cache_types = ARRAY_SIZE(dimgrey_cavefish_cache_info);
-   break;
-   case CHIP_VANGOGH:
-   pcache_info = vangogh_cache_info;
-   num_of_cache_types = ARRAY_SIZE(vangogh_cache_info);
-   break;
-   case CHIP_BEIGE_GOBY:
-   pcache_info = beige_goby_cache_info;
-   num_of_cache_types = ARRAY_SIZE(beige_goby_cache_info);
-   break;
-   case CHIP_YELLOW_CARP:
-   pcache_info = yellow_carp_cache_info;
-   num_of_cache_types = ARRAY_SIZE(yellow_carp_cache_info);
-   break;
default:
-   return -EINVAL;
+   switch(kdev->adev->ip_versions[GC_HWIP][0]) {
+   case IP_VERSION(9, 0, 1):
+   pcache_info = vega10_cache_info;
+   num_of_cache_types = ARRAY_SIZE(vega10_cache_info);
+   break;
+   case IP_VERSION(9, 2, 1):
+   pcache_info = vega12_cache_info;
+   num_of_cache_types = ARRAY_SIZE(vega12_cache_info);
+   break;
+   case IP_VERSION(9, 4, 0):
+   case IP_VERSION(9, 4, 1):
+   pcache_info = vega20_cache_info;
+   num_of_cache_types = ARRAY_SIZE(vega20_cache_info);
+   break;
+   case IP_VERSION(9, 4, 2):
+   pcache_info = aldebaran_cache_info;
+   num_of_cache_types = ARRAY_SIZE(aldebaran_cache_info);
+   break;
+   case IP_VERSION(9, 1, 0):
+   case IP_VERSION(9, 2, 2):
+   pcache_info = raven_cache_info;
+   num_of_cache_types = ARRAY_SIZE(raven_cache_info);
+   break;
+   case IP_VERSION(9, 3, 0):
+   pcache_info = renoir_cache_info;
+   num_of_cache_types = ARRAY_SIZE(renoir_cache_info);
+   break;
+   case IP_VERSION(10, 1, 10):
+   case 

[PATCH v10 08/10] dyndbg: add print-to-tracefs, selftest with it - RFC

2021-11-05 Thread Jim Cromie
Sean Paul proposed, in:
https://patchwork.freedesktop.org/series/78133/
drm/trace: Mirror DRM debug logs to tracefs

His patchset's objective is to be able to independently steer some of
the drm.debug stream to an alternate tracing destination, by splitting
drm_debug_enabled() into syslog & trace flavors, and enabling them
separately.  2 advantages were identified:

1- syslog is heavyweight, tracefs is much lighter
2- separate selection of enabled categories means less traffic

Dynamic-Debug can do 2nd exceedingly well:

A- all work is behind jump-label's NOOP, zero off cost.
B- exact site selectivity, precisely the useful traffic.
   can tailor enabled set interactively, at shell.

Since the tracefs interface is effective for drm (the threads suggest
so), adding that interface to dynamic-debug has real potential for
everyone including drm.

if CONFIG_TRACING:

Grab Sean's trace_init/cleanup code, use it to provide tracefs
available by default to all pr_debugs.  This will likely need some
further per-module treatment; perhaps something reflecting hierarchy
of module,file,function,line, maybe with a tuned flattening.

endif CONFIG_TRACING

Add a new +T flag to enable tracing, independent of +p, and add and
use 3 macros: dyndbg_site_is_enabled/logging/tracing(), to encapsulate
the flag checks.  Existing code treats T like other flags.

Add ddebug_validate_flags() as last step in ddebug_parse_flags().  Its
only job is to fail on +T for non-CONFIG_TRACING builds.  It only sees
the new flags, and cannot validate specific state transitions.  This
is fine, since we have no need for that; such a test would have to be
done in ddebug_change(), which actually updates the callsites.

ddebug_change() adjusts the static-key-enable/disable condition to use
_DPRINTK_ENABLED / abstraction macros.

dynamic_emit_prefix() now gates on _DPRINTK_ENABLED too, as an
optimization but mostly to allow decluttering of its users.

__dynamic_pr_debug() etal get minor changes:

 - call dynamic_emit_prefix() 1st, _enabled() optimizes.
 - if (T) call trace_array_printk
 - if (!p) go around original printk code.
   done to minimize diff,
   goto-ectomy + reindent later/separately
 - share vaf across p|T

WRT _dev, I skipped all the  specific dev_emit_prefix
additions for now.  tracefs is a fast customer with different needs,
its not clear that pretty device-ID-ish strings is useful tracefs
content (on ingest), or that couldn't be done more efficiently while
analysing or postprocesing the tracefs buffer.

SELFTEST: test_dynamic_debug.ko:

Uses the tracer facility to implement a kernel module selftest.

TODO:

Earlier core code had (tracerfn)() indirection, allowing a plugin
side-effector we could test the results of.

ATM all the tests which count +T'd callsite executions (and which
expect >0) are failing.

Now it needs a rethink to test from userspace, rather than the current
test-once at module-load.  It needs a parameters/testme button.

So remainder of this is a bit stale 

- A custom tracer counts the number of calls (of T-enabled pr_debugs),
- do_debugging(x) calls a set of categorized pr_debugs x times

- test registers the tracer on the module
  then iteratively:
  manipulates dyndbg states via query-cmds, mostly format ^prefix
  runs do_debugging()
  counts enabled callsite executions
  reports mismatches

- modprobe test_dynamic_debug use_bad_tracer=1
  attaches a bad/recursive tracer
  Bad Things (did) Happen.
  has thrown me interesting panics.
  cannot replicate atm.

RFC: (DONE)

The "tracer" interface probably needs work and a new name.  It is only
1/2 way towards a real tracefs interface; and the code I lifted from
Sean Paul in the next patch could be implemented in dynamic_debug.c
instead, and made available for all pr_debug users.

This would also eliminate need for dynamic_debug_(un)register_tracer(),
since dyndbg could just provide it when TRACING is on.

NOTES:

$> modprobe test_dynamic_debug dyndbg=+p

   it fails 3/29 tests. havent looked at why.

$> modprobe test_dynamic_debug use_bad_tracer=1

Earlier in dev, bad_tracer() exploded in recursion, I havent been able
to replicate that lately.

Signed-off-by: Jim Cromie 
---
 .../admin-guide/dynamic-debug-howto.rst   |   7 +-
 MAINTAINERS   |   1 +
 include/linux/dynamic_debug.h |  12 +-
 lib/Kconfig.debug |  11 +
 lib/Makefile  |   1 +
 lib/dynamic_debug.c   | 127 --
 lib/test_dynamic_debug.c  | 222 ++
 7 files changed, 355 insertions(+), 26 deletions(-)
 create mode 100644 lib/test_dynamic_debug.c

diff --git a/Documentation/admin-guide/dynamic-debug-howto.rst 
b/Documentation/admin-guide/dynamic-debug-howto.rst
index b119b8277b3e..48d32782bb11 100644
--- a/Documentation/admin-guide/dynamic-debug-howto.rst
+++ b/Documentation/admin-guide/dynamic-debug-howto.rst
@@ -227,7 +227,8 @@ of the 

[PATCH v10 06/10] drm_print: add choice to use dynamic debug in drm-debug

2021-11-05 Thread Jim Cromie
drm's debug system writes 10 distinct categories of messages to syslog
using a small API[1]: drm_dbg*(10 names), DRM_DEV_DEBUG*(3 names),
DRM_DEBUG*(8 names).  There are thousands of these callsites, each
categorized in this systematized way.

These callsites can be enabled at runtime by their category, each
controlled by a bit in drm.debug (/sys/modules/drm/parameter/debug).
In the current "basic" implementation, drm_debug_enabled() tests these
bits in __drm_debug each time an API[1] call is executed; while cheap
individually, the costs accumulate with uptime.

This patch uses dynamic-debug with (required) jump-label to patch
enabled callsites onto their respective NOOP slots, avoiding all
runtime bit-checks of __drm_debug by drm_debug_enabled().

Dynamic debug has no concept of category, but we can emulate one by
replacing enum categories with a set of prefix-strings; "drm:core:",
"drm:kms:" "drm:driver:" etc, and prepend them (at compile time) to
the given formats.

Then we can use:

   # echo module drm format "^drm:core: " +p > control`

to enable the whole category with one query.

This conversion yields many new prdbg callsites:

  dyndbg: 207 debug prints in module drm_kms_helper
  dyndbg: 376 debug prints in module drm
  dyndbg: 1811 debug prints in module i915
  dyndbg: 3917 debug prints in module amdgpu

Each site costs 56 bytes of .data, which is a big increase for
drm modules, so CONFIG_DRM_USE_DYNAMIC_DEBUG makes it optional.

CONFIG_JUMP_LABEL is also required, to get the promised optimizations.

The "basic" -> "dyndbg" switchover is layered into the macro scheme

A. A "prefix" version of DRM_UT_ map, named DRM_DBG_CAT_

"basic":  DRM_DBG_CAT_  <===  DRM_UT_.  Identity map.
"dyndbg":
   #define DRM_DBG_CAT_KMS"drm:kms: "
   #define DRM_DBG_CAT_PRIME  "drm:prime: "
   #define DRM_DBG_CAT_ATOMIC "drm:atomic: "

DRM_UT_* are preserved, since theyre used elsewhere.  Since the
callback maintains its state in __drm_debug, drm_debug_enabled() will
stay synchronized, and continue to work.  We can address them
separately if they are called enough to be worth fixing.

B. drm_dev_dbg() & drm_debug() are interposed with macros

basic:forward to renamed fn, with args preserved
enabled:  redirect to pr_debug, dev_dbg, with CATEGORY format catenated

This is where drm_debug_enabled() is avoided.  The prefix is prepended
at compile-time, no category at runtime.

C. API[1] uses DRM_DBG_CAT_s

The API already uses B, now it uses A too, instead of DRM_UT_, to
get the correct token type for "basic" and "dyndbg" configs.

D. use DEFINE_DYNAMIC_DEBUG_CATEGORIES()

This defines the map using DRM_CAT_s, and creates the /sysfs
bitmap to control those categories.

CONFIG_DRM_USE_DYNAMIC_DEBUG is also used to adjust amdgpu, i915
makefiles to add -DDYNAMIC_DEBUG_MODULE; it includes the current
CONFIG_DYNAMIC_DEBUG_CORE and is enabled by the user.

LIMITATIONS:

dev_dbg(etal) effectively prepends twice, category then driver-name,
yielding format strings like so:

bash-5.1# grep amdgpu: /proc/dynamic_debug/control | grep drm: | cut -d= -f2-
_ "amdgpu: drm:core: fence driver on ring %s use gpu addr 0x%016llx\012"
_ "amdgpu: drm:kms: Cannot create framebuffer from imported dma_buf\012"

This means we cannot use anchored "^drm:kms: " to specify the
category, a small loss of precision.

Note that searching on "format ^amdgpu: " works, but this is less
valuable, because the same can be done with "module amdgpu".

NOTES:

Because the dyndbg callback is keeping state in __drm_debug, it
synchronizes with drm_debug_enabled() and its remaining users; the
switchover should be transparent.

Code Review is expected to catch the lack of correspondence between
bit=>prefix definitions (the selector) and the prefixes used in the
API[1] layer above pr_debug()

I've coded the categories with trailing spaces.  This excludes any
sub-categories which might get added later.  This convention protects
any "drm:atomic:fail:" callsites from getting stomped on by `echo 0 >
debug`.  Other categories could differ, but we need some default.

Dyndbg requires that the prefix be in the compiled-in format string;
run-time prefixing evades callsite selection by category.

pr_debug("%s: ...", __func__, ...) // not ideal

Unfortunately __func__ is not a macro, and cannot be catenated at
preprocess/compile time.

If you want that, you might consider +mfl flags instead;)

Signed-off-by: Jim Cromie 
---
v5:
. use DEFINE_DYNAMIC_DEBUG_CATEGORIES in drm_print.c
. s/DRM_DBG_CLASS_/DRM_DBG_CAT_/ - dont need another term
. default=y in Kconfig entry - per @DanVet
. move some commit-log prose to dyndbg commit
. add-prototyes to (param_get/set)_dyndbg
. more wrinkles found by 
. relocate ratelimit chunk from elsewhere
v6:
. add kernel doc
. fix cpp paste, drop '#'
v7:
. change __drm_debug to long, to fit with DEFINE_DYNAMIC_DEBUG_CATEGORIES
. add -DDYNAMIC_DEBUG_MODULE to ccflags if DRM_USE_DYNAMIC_DEBUG
v8:
. adapt to altered ^ insertion
. add mem 

[PATCH v10 10/10] drm: use DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS in 3 places

2021-11-05 Thread Jim Cromie
add sysfs knobs to enable modules' pr_debug()s ---> tracefs

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/amd/display/dc/core/dc_debug.c |  8 
 drivers/gpu/drm/drm_print.c| 13 ++---
 drivers/gpu/drm/i915/intel_gvt.c   | 15 ---
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
index e49a755c6a69..58c56c1708e7 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
@@ -80,6 +80,14 @@ DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(debug_dc, __debug_dc,
DC_DYNDBG_BITMAP_DESC(debug_dc),
amdgpu_bitmap);
 
+#if defined(CONFIG_TRACING)
+
+unsigned long __trace_dc;
+EXPORT_SYMBOL(__trace_dc);
+DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(trace_dc, __trace_dc,
+   DC_DYNDBG_BITMAP_DESC(trace_dc),
+   amdgpu_bitmap);
+#endif
 #endif
 
 #define DC_LOGGER_INIT(logger)
diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index d5e0ffad467b..ee20e9c14ce9 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -72,9 +72,16 @@ static struct dyndbg_bitdesc drm_dyndbg_bitmap[] = {
[8] = { DRM_DBG_CAT_DP },
[9] = { DRM_DBG_CAT_DRMRES }
 };
-DEFINE_DYNAMIC_DEBUG_BITGRPS(debug, __drm_debug, DRM_DEBUG_DESC,
-drm_dyndbg_bitmap);
-
+DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(debug, __drm_debug, DRM_DEBUG_DESC,
+   drm_dyndbg_bitmap);
+
+#ifdef CONFIG_TRACING
+struct trace_array *trace_arr;
+unsigned long __drm_trace;
+EXPORT_SYMBOL(__drm_trace);
+DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS(trace, __drm_trace, DRM_DEBUG_DESC,
+ drm_dyndbg_bitmap);
+#endif
 #endif
 
 void __drm_puts_coredump(struct drm_printer *p, const char *str)
diff --git a/drivers/gpu/drm/i915/intel_gvt.c b/drivers/gpu/drm/i915/intel_gvt.c
index efaac5777873..84348d4aedf6 100644
--- a/drivers/gpu/drm/i915/intel_gvt.c
+++ b/drivers/gpu/drm/i915/intel_gvt.c
@@ -195,8 +195,17 @@ static struct dyndbg_bitdesc i915_dyndbg_bitmap[] = {
help_(7, "gvt:render:") \
help_(8, "gvt:sched:")
 
-DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_gvt, __gvt_debug,
-I915_GVT_CATEGORIES(debug_gvt),
-i915_dyndbg_bitmap);
+DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(debug_gvt, __gvt_debug,
+   I915_GVT_CATEGORIES(debug_gvt),
+   i915_dyndbg_bitmap);
 
+#if defined(CONFIG_TRACING)
+
+unsigned long __gvt_trace;
+EXPORT_SYMBOL(__gvt_trace);
+DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS(trace_gvt, __gvt_trace,
+ I915_GVT_CATEGORIES(trace_gvt),
+ i915_dyndbg_bitmap);
+
+#endif
 #endif
-- 
2.31.1



[PATCH v10 09/10] dyndbg: create DEFINE_DYNAMIC_DEBUG_LOG|TRACE_GROUPS

2021-11-05 Thread Jim Cromie
With the recent addition of pr_debug to tracefs via +T flag, we now
want to add drm.trace; like its model: drm.debug, it maps bits to
pr_debug categories, but this one enables/disables writing to tracefs
(iff CONFIG_TRACING).

Do this by:

1. add flags to dyndbg_bitmap_param, holds "p" or "T" to work for either.

   add DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS to init .flags
   DEFINE_DYNAMIC_DEBUG_BITGRPS gets "p" for compat.
   use it from...

2. DEFINE_DYNAMIC_DEBUG_LOG_GROUPS as (1) with "p" flags - print to syslog
   DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS as (1) with "T" flags - trace to tracefs
   add kdoc to these

NOTES

The flags args (1) is a string, not just a 'p' or 'T'.  This allows
use of decorator flags ("mflt") too, in case the module author wants
to insure those decorations are in the trace & log.

The LOG|TRACE (2) macros don't use any decorator flags, (and therefore
don't toggle them), allowing users to control those themselves.

Decorator flags are shared for both LOG and TRACE consumers,
coordination between users is expected.  ATM, theres no declarative
way to preset decorator flags, but DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS
can be used to explicitly toggle them.

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 44 ++-
 lib/dynamic_debug.c   |  4 ++--
 2 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 792bcff0297e..918ac1a92358 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -255,30 +255,52 @@ struct dyndbg_bitdesc {
 
 struct dyndbg_bitmap_param {
unsigned long *bits;/* ref to shared state */
+   const char *flags;
unsigned int maplen;
struct dyndbg_bitdesc *map; /* indexed by bitpos */
 };
 
 #if defined(CONFIG_DYNAMIC_DEBUG) || \
(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
+
+#define DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS(fsname, _var, _flags, desc, data) \
+   MODULE_PARM_DESC(fsname, desc); \
+   static struct dyndbg_bitmap_param ddcats_##_var =   \
+   { .bits = &(_var), .flags = (_flags),   \
+ .map = data, .maplen = ARRAY_SIZE(data) };\
+   module_param_cb(fsname, _ops_dyndbg, _##_var, 0644)
+
+#define DEFINE_DYNAMIC_DEBUG_BITGRPS(fsname, _var, desc, data) \
+   DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS(fsname, _var, "p", desc, data)
+
 /**
- * DEFINE_DYNAMIC_DEBUG_BITGRPS() - bitmap control of pr_debugs, by format 
match
+ * DEFINE_DYNAMIC_DEBUG_LOG_GROUPS() - bitmap control of grouped pr_debugs --> 
syslog
+ *
  * @fsname: parameter basename under /sys
  * @_var:   C-identifier holding bitmap
  * @desc:   string summarizing the controls provided
  * @bitmap: C array of struct dyndbg_bitdescs
  *
- * Intended for modules with a systematic use of pr_debug prefixes in
- * the format strings, this allows modules calling pr_debugs to
- * control them in groups by matching against their formats, and map
- * them to bits 0-N of a sysfs control point.
+ * Intended for modules having pr_debugs with prefixed/categorized
+ * formats; this lets you group them by substring match, map groups to
+ * bits, and enable per group to write to syslog, via @fsname.
  */
-#define DEFINE_DYNAMIC_DEBUG_BITGRPS(fsname, _var, desc, data) \
-   MODULE_PARM_DESC(fsname, desc); \
-   static struct dyndbg_bitmap_param ddcats_##_var =   \
-   { .bits = &(_var), .map = data, \
- .maplen = ARRAY_SIZE(data) }; \
-   module_param_cb(fsname, _ops_dyndbg, _##_var, 0644)
+#define DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(fsname, _var, desc, data)  \
+   DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS(fsname, _var, "p", desc, data)
+
+/**
+ * DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS() - bitmap control of pr_debugs --> 
tracefs
+ * @fsname: parameter basename under /sys
+ * @_var:   C-identifier holding bitmap
+ * @desc:   string summarizing the controls provided
+ * @bitmap: C array of struct dyndbg_bitdescs
+ *
+ * Intended for modules having pr_debugs with prefixed/categorized
+ * formats; this lets you group them by substring match, map groups to
+ * bits, and enable per group to write to tracebuf, via @fsname.
+ */
+#define DEFINE_DYNAMIC_DEBUG_TRACE_GROUPS(fsname, _var, desc, data)\
+   DEFINE_DYNAMIC_DEBUG_BITGRPS_FLAGS(fsname, _var, "T", desc, data)
 
 extern const struct kernel_param_ops param_ops_dyndbg;
 
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index d6e9c833f5d4..f55861dd76b2 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -629,8 +629,8 @@ int param_set_dyndbg(const char *instr, const struct 
kernel_param *kp)
for (i = 0; i < p->maplen && i < BITS_PER_LONG; map++, i++) {
if (test_bit(i, ) == test_bit(i, 

[PATCH v10 07/10] drm_print: instrument drm_debug_enabled

2021-11-05 Thread Jim Cromie
Duplicate drm_debug_enabled() code into both "basic" and "dyndbg"
ifdef branches.  Then add a pr_debug("todo: ...") into the "dyndbg"
branch.

Then convert the "dyndbg" branch's code to a macro, so that the
pr_debug() get its callsite info from the invoking function, instead
of from drm_debug_enabled() itself.

This gives us unique callsite info for the 8 remaining users of
drm_debug_enabled(), and lets us enable them individually to see how
much logging traffic they generate.  The oft-visited callsites can
then be reviewed for runtime cost and possible optimizations.

Heres what we get:

bash-5.1# modprobe drm
dyndbg: 384 debug prints in module drm
bash-5.1# grep todo: /proc/dynamic_debug/control
drivers/gpu/drm/drm_edid.c:1843 [drm]connector_bad_edid =_ "todo: maybe avoid 
via dyndbg\012"
drivers/gpu/drm/drm_print.c:309 [drm]___drm_dbg =p "todo: maybe avoid via 
dyndbg\012"
drivers/gpu/drm/drm_print.c:286 [drm]__drm_dev_dbg =p "todo: maybe avoid via 
dyndbg\012"
drivers/gpu/drm/drm_vblank.c:1491 [drm]drm_vblank_restore =_ "todo: maybe avoid 
via dyndbg\012"
drivers/gpu/drm/drm_vblank.c:787 
[drm]drm_crtc_vblank_helper_get_vblank_timestamp_internal =_ "todo: maybe avoid 
via dyndbg\012"
drivers/gpu/drm/drm_vblank.c:410 [drm]drm_crtc_accurate_vblank_count =_ "todo: 
maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_atomic_uapi.c:1457 [drm]drm_mode_atomic_ioctl =_ "todo: 
maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_edid_load.c:178 [drm]edid_load =_ "todo: maybe avoid via 
dyndbg\012"

At quick glance, edid won't qualify, drm_print might, drm_vblank is
strongest chance, maybe atomic-ioctl too.

Signed-off-by: Jim Cromie 
---
 include/drm/drm_print.h | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 392cff7cb95c..a902bd4d8c55 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -381,6 +381,11 @@ enum drm_debug_category {
 #define DRM_DBG_CAT_DP DRM_UT_DP
 #define DRM_DBG_CAT_DRMRES DRM_UT_DRMRES
 
+static inline bool drm_debug_enabled(enum drm_debug_category category)
+{
+   return unlikely(__drm_debug & category);
+}
+
 #else /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
 
 /* join prefix + format in cpp so dyndbg can see it */
@@ -414,12 +419,13 @@ enum drm_debug_category {
 #define DRM_DBG_CAT_DP "drm:dp: "
 #define DRM_DBG_CAT_DRMRES "drm:res: "
 
-#endif /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
+#define drm_debug_enabled(category)\
+   ({  \
+   pr_debug("todo: maybe avoid via dyndbg\n"); \
+   unlikely(__drm_debug & (category)); \
+   })
 
-static inline bool drm_debug_enabled(enum drm_debug_category category)
-{
-   return unlikely(__drm_debug & category);
-}
+#endif /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
 
 /*
  * struct device based logging
@@ -582,7 +588,6 @@ void __drm_dev_dbg(const struct device *dev, enum 
drm_debug_category category,
 #define drm_dbg_drmres(drm, fmt, ...)  \
drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_DBG_CAT_DRMRES, fmt, 
##__VA_ARGS__)
 
-
 /*
  * printk based logging
  *
-- 
2.31.1



[PATCH v10 04/10] i915/gvt: trim spaces from pr_debug "gvt: core:" prefixes

2021-11-05 Thread Jim Cromie
Taking embedded spaces out of existing prefixes makes them more easily
searchable; simplifying the extra quoting needed otherwise:

  $> echo format "^gvt: core:" +p >control

Dropping the internal spaces means any trailing space in a query will
more clearly terminate the prefix being searched for.

Consider a generic drm-debug example:

  # turn off ATOMIC reports
  echo format "^drm:atomic: " -p > control

  # turn off all ATOMIC:* reports, including any sub-categories
  echo format "^drm:atomic:" -p > control

  # turn on ATOMIC:FAIL: reports
  echo format "^drm:atomic:fail: " +p > control

Removing embedded spaces in the format prefixes simplifies the
corresponding match string.  This means that "quoted" match-prefixes
are only needed when the trailing space is desired, in order to
exclude explicitly sub-categorized pr-debugs; in this example,
"drm:atomic:fail:".

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/i915/gvt/debug.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/debug.h b/drivers/gpu/drm/i915/gvt/debug.h
index c6027125c1ec..bbecc279e077 100644
--- a/drivers/gpu/drm/i915/gvt/debug.h
+++ b/drivers/gpu/drm/i915/gvt/debug.h
@@ -36,30 +36,30 @@ do {
\
 } while (0)
 
 #define gvt_dbg_core(fmt, args...) \
-   pr_debug("gvt: core: "fmt, ##args)
+   pr_debug("gvt:core: " fmt, ##args)
 
 #define gvt_dbg_irq(fmt, args...) \
-   pr_debug("gvt: irq: "fmt, ##args)
+   pr_debug("gvt:irq: " fmt, ##args)
 
 #define gvt_dbg_mm(fmt, args...) \
-   pr_debug("gvt: mm: "fmt, ##args)
+   pr_debug("gvt:mm: " fmt, ##args)
 
 #define gvt_dbg_mmio(fmt, args...) \
-   pr_debug("gvt: mmio: "fmt, ##args)
+   pr_debug("gvt:mmio: " fmt, ##args)
 
 #define gvt_dbg_dpy(fmt, args...) \
-   pr_debug("gvt: dpy: "fmt, ##args)
+   pr_debug("gvt:dpy: " fmt, ##args)
 
 #define gvt_dbg_el(fmt, args...) \
-   pr_debug("gvt: el: "fmt, ##args)
+   pr_debug("gvt:el: " fmt, ##args)
 
 #define gvt_dbg_sched(fmt, args...) \
-   pr_debug("gvt: sched: "fmt, ##args)
+   pr_debug("gvt:sched: " fmt, ##args)
 
 #define gvt_dbg_render(fmt, args...) \
-   pr_debug("gvt: render: "fmt, ##args)
+   pr_debug("gvt:render: " fmt, ##args)
 
 #define gvt_dbg_cmd(fmt, args...) \
-   pr_debug("gvt: cmd: "fmt, ##args)
+   pr_debug("gvt:cmd: " fmt, ##args)
 
 #endif
-- 
2.31.1



[PATCH v10 05/10] i915/gvt: use dyndbg.BITGRPS for existing pr_debugs

2021-11-05 Thread Jim Cromie
The gvt component of this driver has ~120 pr_debugs with formats using
one of 9 fixed string prefixes, which are quite similar to those
enumerated in DRM debug categories.  Following the interface model of
drm.debug, add a parameter to map bits to these format prefixes.

static struct dyndbg_bitdesc i915_bitmap[] = {
[0] = { "gvt:cmd:" },
[1] = { "gvt:core:" },
[2] = { "gvt:dpy:" },
[3] = { "gvt:el:" },
[4] = { "gvt:irq:" },
[5] = { "gvt:mm:" },
[6] = { "gvt:mmio:" },
[7] = { "gvt:render:" },
[8] = { "gvt:sched:" }
};
DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_gvt, __gvt_debug,
"dyndbg bitmap desc",

If CONFIG_DYNAMIC_DEBUG_CORE=y, then gvt/Makefile adds
-DDYNAMIC_DEBUG_MODULE to cflags, which CONFIG_DYNAMIC_DEBUG=n
(CORE-only) builds need.  This is redone more comprehensively soon.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/i915/Makefile|  2 ++
 drivers/gpu/drm/i915/intel_gvt.c | 38 
 2 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 660bb03de6fc..0fa5f53312a8 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -317,6 +317,8 @@ i915-y += intel_gvt.o
 include $(src)/gvt/Makefile
 endif
 
+ccflags-$(CONFIG_DYNAMIC_DEBUG_CORE) += -DDYNAMIC_DEBUG_MODULE
+
 obj-$(CONFIG_DRM_I915) += i915.o
 obj-$(CONFIG_DRM_I915_GVT_KVMGT) += gvt/kvmgt.o
 
diff --git a/drivers/gpu/drm/i915/intel_gvt.c b/drivers/gpu/drm/i915/intel_gvt.c
index 4e70c1a9ef2e..efaac5777873 100644
--- a/drivers/gpu/drm/i915/intel_gvt.c
+++ b/drivers/gpu/drm/i915/intel_gvt.c
@@ -162,3 +162,41 @@ void intel_gvt_resume(struct drm_i915_private *dev_priv)
if (intel_gvt_active(dev_priv))
intel_gvt_pm_resume(dev_priv->gvt);
 }
+
+#if defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
+
+unsigned long __gvt_debug;
+EXPORT_SYMBOL(__gvt_debug);
+
+static struct dyndbg_bitdesc i915_dyndbg_bitmap[] = {
+   [0] = { "gvt:cmd:" },
+   [1] = { "gvt:core:" },
+   [2] = { "gvt:dpy:" },
+   [3] = { "gvt:el:" },
+   [4] = { "gvt:irq:" },
+   [5] = { "gvt:mm:" },
+   [6] = { "gvt:mmio:" },
+   [7] = { "gvt:render:" },
+   [8] = { "gvt:sched:" }
+};
+
+#define help_(_N, _cat)"\t  Bit-" #_N ":\t" _cat "\n"
+
+#define I915_GVT_CATEGORIES(name) \
+   " Enable debug output via /sys/module/i915/parameters/" #name   \
+   ", where each bit enables a debug category.\n"  \
+   help_(0, "gvt:cmd:")\
+   help_(1, "gvt:core:")   \
+   help_(2, "gvt:dpy:")\
+   help_(3, "gvt:el:") \
+   help_(4, "gvt:irq:")\
+   help_(5, "gvt:mm:") \
+   help_(6, "gvt:mmio:")   \
+   help_(7, "gvt:render:") \
+   help_(8, "gvt:sched:")
+
+DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_gvt, __gvt_debug,
+I915_GVT_CATEGORIES(debug_gvt),
+i915_dyndbg_bitmap);
+
+#endif
-- 
2.31.1



[PATCH v10 02/10] drm: fix doc grammar

2021-11-05 Thread Jim Cromie
allocates and initializes ...

Signed-off-by: Jim Cromie 
---
 include/drm/drm_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 0cd95953cdf5..4b29261c4537 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -486,7 +486,7 @@ void *__devm_drm_dev_alloc(struct device *parent,
  * @type: the type of the struct which contains struct _device
  * @member: the name of the _device within @type.
  *
- * This allocates and initialize a new DRM device. No device registration is 
done.
+ * This allocates and initializes a new DRM device. No device registration is 
done.
  * Call drm_dev_register() to advertice the device to user space and register 
it
  * with other core subsystems. This should be done last in the device
  * initialization sequence to make sure userspace can't access an inconsistent
-- 
2.31.1



[PATCH v10 03/10] amdgpu: use dyndbg.BITGRPS to control existing pr_debugs

2021-11-05 Thread Jim Cromie
logger_types.h defines many DC_LOG_*() categorized debug wrappers.
Most of these already use DRM debug API, so are controllable using
drm.debug, but others use a bare pr_debug("$prefix: .."), with 1 of 13
different class-prefixes matching [:uppercase:]

Use DEFINE_DYNAMIC_DEBUG_BITGRPS to create a sysfs location which maps
from bits to these 13 sets of categorized pr_debugs to en/disable.

Makefile adds -DDYNAMIC_DEBUG_MODULE for CONFIG_DYNAMIC_DEBUG_CORE,
otherwise BUILD_BUG_ON triggers (obvious errors on subtle misuse is
better than mysterious ones).

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |  2 +
 .../gpu/drm/amd/display/dc/core/dc_debug.c| 47 ++-
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 653726588956..077342ca803f 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -38,6 +38,8 @@ ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \
-I$(FULL_AMD_DISPLAY_PATH)/amdgpu_dm \
-I$(FULL_AMD_PATH)/amdkfd
 
+ccflags-$(CONFIG_DYNAMIC_DEBUG_CORE) += -DYNAMIC_DEBUG_MODULE
+
 amdgpu-y := amdgpu_drv.o
 
 # add KMS driver
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
index 21be2a684393..e49a755c6a69 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
@@ -36,8 +36,53 @@
 
 #include "resource.h"
 
-#define DC_LOGGER_INIT(logger)
+#if defined(CONFIG_DRM_USE_DYNAMIC_DEBUG)
+#include 
+
+unsigned long __debug_dc;
+EXPORT_SYMBOL(__debug_dc);
+
+#define help_(_N, _cat)"\t  Bit-" #_N "\t" _cat "\n"
+
+#define DC_DYNDBG_BITMAP_DESC(name)\
+   "Control pr_debugs via /sys/module/amdgpu/parameters/" #name\
+   ", where each bit controls a debug category.\n" \
+   help_(0, "[SURFACE]:")  \
+   help_(1, "[CURSOR]:")   \
+   help_(2, "[PFLIP]:")\
+   help_(3, "[VBLANK]:")   \
+   help_(4, "[HW_LINK_TRAINING]:") \
+   help_(5, "[HW_AUDIO]:") \
+   help_(6, "[SCALER]:")   \
+   help_(7, "[BIOS]:") \
+   help_(8, "[BANDWIDTH_CALCS]:")  \
+   help_(9, "[DML]:")  \
+   help_(10, "[IF_TRACE]:")\
+   help_(11, "[GAMMA]:")   \
+   help_(12, "[SMU_MSG]:")
+
+static struct dyndbg_bitdesc amdgpu_bitmap[] = {
+   [0] = { "[CURSOR]:" },
+   [1] = { "[PFLIP]:" },
+   [2] = { "[VBLANK]:" },
+   [3] = { "[HW_LINK_TRAINING]:" },
+   [4] = { "[HW_AUDIO]:" },
+   [5] = { "[SCALER]:" },
+   [6] = { "[BIOS]:" },
+   [7] = { "[BANDWIDTH_CALCS]:" },
+   [8] = { "[DML]:" },
+   [9] = { "[IF_TRACE]:" },
+   [10] = { "[GAMMA]:" },
+   [11] = { "[SMU_MSG]:" }
+};
+
+DEFINE_DYNAMIC_DEBUG_LOG_GROUPS(debug_dc, __debug_dc,
+   DC_DYNDBG_BITMAP_DESC(debug_dc),
+   amdgpu_bitmap);
+
+#endif
 
+#define DC_LOGGER_INIT(logger)
 
 #define SURFACE_TRACE(...) do {\
if (dc->debug.surface_trace) \
-- 
2.31.1



[PATCH v10 01/10] dyndbg: add DEFINE_DYNAMIC_DEBUG_BITGRPS macro and callbacks

2021-11-05 Thread Jim Cromie
DEFINE_DYNAMIC_DEBUG_BITGRPS(fsname, var, bitmap_desc, bitmap)
allows users to create a drm.debug style (bitmap) sysfs interface,
mapping each bit to a group of pr_debugs, matching on their formats.

This works well when the formats systematically include a prefix
string such as ERR|WARN|INFO, etc.

Such groups can (already) be manipulated like so:

echo "format $prefix +p" >control

This macro merely makes it easier to operate them as groups

/* standard usage */
static struct dyndbg_bitdesc my_bitmap[] = {
[0] = { "gvt:cmd:" },
[1] = { "gvt:core:" },
[2] = { "gvt:dpy:" },
[3] = { "gvt:el:" },
[4] = { "gvt:irq:" },
[5] = { "gvt:mm:" },
[6] = { "gvt:mmio:" },
[7] = { "gvt:render:" },
[8] = { "gvt:sched:" }
};
DEFINE_DYNAMIC_DEBUG_BITGRPS(debug_gvt, __gvt_debug,
 "i915/gvt bitmap desc", my_bitmap);

In addition to the macro, patch adds:

 - int param_set_dyndbg()
 - int param_get_dyndbg()
 - struct kernel_param_ops param_ops_dyndbg

Following the model of kernel/params.c STANDARD_PARAM_DEFS, these are
non-static and exported.

get/set use an augmented kernel_param; the arg refs a new struct
dyndbg_bitmap_param containing:

A- the map of "categories", an array of struct dyndbg_bitdescs,
   indexed by bitpos, defining the match against pr_debug formats.

B- a pointer to the user module's ulong holding the bits/state.
   By sharing state, we coordinate with code that still uses it
   directly.  This allows drm-debug api to be converted incrementally,
   while still using __drm_debug & drm_debug_enabled() in other parts.

param_set_dyndbg() compares new vs old bits, and only updates prdbgs
on changes.  This maximally preserves the underlying state, which may
have been customized via later `echo $cmd >control`.  So if a user
really wants to know that all prdbgs are set precisely, they must
pre-clear then set.

dynamic_debug.h:

Add DEFINE_DYNAMIC_DEBUG_BITGRPS() described above, and a stub
throwing a BUILD_BUG (RFC) when used without DYNAMIC_DEBUG support.

Add structs dyndbg_bitdesc, dyndbg_bitmap_param to support the main
macro, and several helper macros wrapping the given categories with
^prefix and ' ' suffix.  This way the callback can be more broadly
used, by using the right helper macro.

Also externs the struct kernel_param param_ops_dyndbg symbol, as is
done in moduleparams.h for all the STANDARD params.

USAGE NOTES:

Using dyndbg to query on "format $str" requires that $str must be
present in the compiled-in format string.  Searching on "%s" does not
define a useful set of callsites.

Using DEFINE_DYNAMIC_DEBUG_CATEGORIES wo support gets a BUILD_BUG.
ISTM there is already action at a declarative distance, nobody needs
mystery as to why the /sysfs thingy didn't appear.

Dyndbg is agnostic wrt the categorization scheme used, in order to
play well with any prefix convention already in use in the codebase.
In fact, "prefix" is not strictly accurate without ^ anchor.

Ad-hoc categories and sub-categories are implicitly allowed, author
discipline and review is expected.

Hierarchical classes/categories are natural:

"^drm::"   is used in a later commit
"^drm:::" is a natural extension.
"^drm:atomic:fail:" has been proposed, sounds directly useful

RFC: in a real sense we abandon enum strictures here, and lose some
compiler help, on spelling errs for example.  Obviously "drm:" != "DRM:".

Some properties of a hierarchical category deserve explication:

Trailing spaces matter !

With 1..3-space ("drm: ", "drm:atomic: ", "drm:atomic:fail: "), the
":" doesn't terminate the search-space, the trailing space does.  So a
"drm:" search spec will match all DRM categories & subcategories, and
will not be useful in an interface where all categories are already
controlled together.  That said, "drm:atomic:" & "drm:atomic: " are
different, and both are useful in cases.

Ad-Hoc categories & sub-categories:

Ad-hoc categories are those format-prefixes already in use; both
amdgpu and i915 have numerous (120,~1800) pr_debugs, most of these use
a system, a small set (9,13) of prefixes, to categorize the output.
Dyndbg already works on these, this patch just allows adding a new
bitmap knob to control them.

Ad-hoc sub-categories are slightly trickier.
  since drm_dbg_atomic("fail: ...") is a macro:
pr_debug("drm:atomic:" " " format,...) // cpp-paste in a trailing space

We get "drm:atomic: fail:", with that undesirable embedded space;
obviously not ideal wrt clear and simple prefixes.

a possible fix: drm_dbg_atomic_("fail: ..."); // trailing _ for ad-hoc subcat

Summarizing:

 - "drm:kms: " & "drm:kms:" are different
 - "drm:kms"also different - includes drm:kms2:
 - "drm:kms:\t" also different - could be troublesome
 - "drm:kms:*"  doesn't work, no wildcard on format atm.

Order matters in DEFINE_DYNAMIC_DEBUG_CATEGORIES(... @bit_descs)

Since bits are/will-stay applied 0-N, 

[PATCH v10 00/10] use DYNAMIC_DEBUG to implement DRM.debug & DRM.trace

2021-11-05 Thread Jim Cromie
Hi Jason, Greg, DRM-everyone,

This patchset has 3 separate but related parts:

1. DEFINE_DYNAMIC_DEBUG_BITGRPS [patch 1/10]

   Declares DRM.debug style bitmap, bits control pr_debugs by matching formats
   Adds callback to translate bits to $cmd > dynamic_debug/control
   This could obsolete EXPORT(dynamic_debug_exec_queries) not included.

   /* anticipated_usage */
   static struct dyndbg_desc drm_categories_map[] = {
  [0] = { DRM_DBG_CAT_CORE },
  [1] = { DRM_DBG_CAT_DRIVER },
  [2] = { DRM_DBG_CAT_KMS },
  [3] = { DRM_DBG_CAT_PRIME }, ... };

   DEFINE_DYNAMIC_DEBUG_BITGRPS(debug, __drm_debug,
" bits control drm.debug categories ",
drm_categories_map);

   Please consider this patch for -next:
   - new interface, new code, no users to break
   - allows DRM folks to consider in earnest.
   - api bikeshedding to do ?
 struct dyndbg_desc isnt that great a name, others too probably.

2. use (1) to reimplement drm.debug [patches 3-7]:

   1st in amdgpu & i915 to control existing pr_debugs by their formats
   then in drm-print, for all drm.debug API users
   POC for (1)
   avoids drm_debug_enabled(), gives NOOP savings & new flexibility.
   changes drm.debug categories from enum to format-prefix-string
   alters in-log format to include the format-prefix-string
   Daniel Vetter liked this at -v3
   https://lore.kernel.org/lkml/YPbPvm%2FxcBlTK1wq@phenom.ffwll.local/
   Im sure Ive missed stuff.
   
3. separately, Sean Paul proposed drm.trace to mirror drm.debug to tracefs
   https://patchwork.freedesktop.org/series/78133/

He argues:
   tracefs is fast/lightweight compared to syslog
   independent selection (of drm categories) to tracefs
   gives tailored traffic w.o flooding syslog

ISTM he's correct.  So it follows that write-to-tracefs is also a good
feature for dyndbg, where its then available for all pr_debug users,
on a per-site basis, via echo +T >control.  (iff CONFIG_TRACING).

So basically, I borg'd his:
   [patch 14/14] drm/print: Add tracefs support to the drm logging helpers
   
Then I added a T flag, so it can be toggled from shell:

   # turn on all drm's pr_debug --> tracefs
   echo module drm +T > /proc/dynamic_debug/control

It appears to just work: (RFC)

The instance name is a placeholder, per-module subdirs kinda fits the
tracefs pattern, full mod/file-basename/function/line feels like
overkill, mod/basename-func.line would flatten it nicely. RFC.

[root@gandalf dyndbg-tracefs]# pwd
/sys/kernel/tracing/instances/dyndbg-tracefs
[root@gandalf dyndbg-tracefs]# echo 1 > /sys/module/drm/parameters/trace
[root@gandalf dyndbg-tracefs]# head -n16 trace | sed -e 's/^#//'
 tracer: nop

 entries-in-buffer/entries-written: 405/405   #P:24

_-=> irqs-off
   / _=> need-resched
  | / _---=> hardirq/softirq
  || / _--=> preempt-depth
  ||| / _-=> migrate-disable
   / delay
   TASK-PID CPU#  |  TIMESTAMP  FUNCTION
  | | |   | | |
   <...>-2254[000] .  7040.894352: __dynamic_pr_debug: 
drm:core: comm="gnome-shel:cs0" pid=2254, dev=0xe200, auth=1, AMDGPU_CS
   <...>-2207[015] .  7040.894654: __dynamic_pr_debug: 
drm:core: comm="gnome-shell" pid=2207, dev=0xe200, auth=1, DRM_IOCTL_MODE_ADDFB2
   <...>-2207[015] .  7040.995403: __dynamic_pr_debug: 
drm:core: comm="gnome-shell" pid=2207, dev=0xe200, auth=1, DRM_IOCTL_MODE_RMFB
   <...>-2207[015] .  7040.995413: __dynamic_pr_debug: 
drm:core: OBJ ID: 121 (2)

This is the pr-debug doing most of that logging: (from dynamic_debug/control)

  drivers/gpu/drm/drm_ioctl.c:866 [drm]drm_ioctl =T "drm:core: comm=\042%s\042 
pid=%d, dev=0x%lx, auth=%d, %s\012"

Turning on decoration flags changes the trace:

  echo module drm format drm:core: +mflt > /proc/dynamic_debug/control 

   TASK-PID CPU#  |  TIMESTAMP  FUNCTION
  | | |   | | |
   <...>-2254[003] . 15980.936660: __dynamic_pr_debug: [2254] 
drm:drm_ioctl:866: drm:core: comm="gnome-shel:cs0" pid=2254, dev=0xe200, 
auth=1, AMDGPU_CS
   <...>-2207[015] . 15980.936966: __dynamic_pr_debug: [2207] 
drm:drm_ioctl:866: drm:core: comm="gnome-shell" pid=2207, dev=0xe200, auth=1, 
DRM_IOCTL_MODE_ADDFB2
   <...>-2207[015] . 15981.037727: __dynamic_pr_debug: [2207] 
drm:drm_ioctl:866: drm:core: comm="gnome-shell" pid=2207, dev=0xe200, auth=1, 
DRM_IOCTL_MODE_RMFB
   <...>-2207[015] . 15981.037739: __dynamic_pr_debug: [2207] 
drm:drm_mode_object_put:195: drm:core: OBJ ID: 124 (2)
   <...>-2207[015] . 15981.037742: __dynamic_pr_debug: [2207] 
drm:drm_mode_object_put:195: drm:core: OBJ ID: 124 

[PATCH] drm/amdkfd: lower the VAs base offset to 8KB

2021-11-05 Thread Alex Sierra
The low 16MB of virtual address space are currently reserved for kernel
mode allocations mapped into user virtual address space. This causes
conflicts with HMM/SVM mappings at low virtual addresses. We tried to
move those kernel mode allocations to the upper half of the 64-bit
virtual address space for GFX9, which is naturally reserved for kernel
use. However, TBA (trap handler code) has problems to access addresses
in the high virtual space. We have decided to set this to 8KB of the
lower address space as a temporary fix, while investigate TBA address
problem. It is very unlikely for user space to map memory at this low
region.

Signed-off-by: Alex Sierra 
---
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
index 2e86692def19..d1388896f9c1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
@@ -308,7 +308,7 @@
  * 16MB are reserved for kernel use (CWSR trap handler and kernel IB
  * for now).
  */
-#define SVM_USER_BASE 0x100ull
+#define SVM_USER_BASE (u64)(KFD_CWSR_TBA_TMA_SIZE + 2*PAGE_SIZE)
 #define SVM_CWSR_BASE (SVM_USER_BASE - KFD_CWSR_TBA_TMA_SIZE)
 #define SVM_IB_BASE   (SVM_CWSR_BASE - PAGE_SIZE)
 
-- 
2.32.0



[PATCH] drm/amdkfd: replace trivial funcs with direct access

2021-11-05 Thread Graham Sider
These get funcs simply return an adev field. Replace funcs/calls with
direct field accesses instead.

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 30 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  6 
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  4 +--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  8 ++---
 .../amd/amdkfd/kfd_process_queue_manager.c|  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  7 ++---
 6 files changed, 10 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 83f863dca7af..46cf48b3904a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -521,16 +521,6 @@ uint64_t amdgpu_amdkfd_get_vram_usage(struct amdgpu_device 
*adev)
return amdgpu_vram_mgr_usage(vram_man);
 }
 
-uint64_t amdgpu_amdkfd_get_hive_id(struct amdgpu_device *adev)
-{
-   return adev->gmc.xgmi.hive_id;
-}
-
-uint64_t amdgpu_amdkfd_get_unique_id(struct amdgpu_device *adev)
-{
-   return adev->unique_id;
-}
-
 uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct amdgpu_device *dst,
  struct amdgpu_device *src)
 {
@@ -630,26 +620,6 @@ int amdgpu_amdkfd_get_pcie_bandwidth_mbytes(struct 
amdgpu_device *adev, bool is_
return (num_lanes_factor * gen_speed_mbits_factor)/BITS_PER_BYTE;
 }
 
-uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct amdgpu_device *adev)
-{
-   return adev->rmmio_remap.bus_addr;
-}
-
-uint32_t amdgpu_amdkfd_get_num_gws(struct amdgpu_device *adev)
-{
-   return adev->gds.gws_size;
-}
-
-uint32_t amdgpu_amdkfd_get_asic_rev_id(struct amdgpu_device *adev)
-{
-   return adev->rev_id;
-}
-
-int amdgpu_amdkfd_get_noretry(struct amdgpu_device *adev)
-{
-   return adev->gmc.noretry;
-}
-
 int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
enum kgd_engine_type engine,
uint32_t vmid, uint64_t gpu_addr,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 5f658823a637..d00de575c541 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -224,12 +224,6 @@ int amdgpu_amdkfd_get_dmabuf_info(struct amdgpu_device 
*adev, int dma_buf_fd,
  size_t buffer_size, uint32_t *metadata_size,
  uint32_t *flags);
 uint64_t amdgpu_amdkfd_get_vram_usage(struct amdgpu_device *adev);
-uint64_t amdgpu_amdkfd_get_hive_id(struct amdgpu_device *adev);
-uint64_t amdgpu_amdkfd_get_unique_id(struct amdgpu_device *adev);
-uint64_t amdgpu_amdkfd_get_mmio_remap_phys_addr(struct amdgpu_device *adev);
-uint32_t amdgpu_amdkfd_get_num_gws(struct amdgpu_device *adev);
-uint32_t amdgpu_amdkfd_get_asic_rev_id(struct amdgpu_device *adev);
-int amdgpu_amdkfd_get_noretry(struct amdgpu_device *adev);
 uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct amdgpu_device *dst,
  struct amdgpu_device *src);
 int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struct amdgpu_device *dst,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 8d5021e8c714..2e3d74f7fbfb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1313,7 +1313,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file 
*filep,
err = -EINVAL;
goto err_unlock;
}
-   offset = amdgpu_amdkfd_get_mmio_remap_phys_addr(dev->adev);
+   offset = dev->adev->rmmio_remap.bus_addr;
if (!offset) {
err = -ENOMEM;
goto err_unlock;
@@ -2066,7 +2066,7 @@ static int kfd_mmio_mmap(struct kfd_dev *dev, struct 
kfd_process *process,
if (vma->vm_end - vma->vm_start != PAGE_SIZE)
return -EINVAL;
 
-   address = amdgpu_amdkfd_get_mmio_remap_phys_addr(dev->adev);
+   address = dev->adev->rmmio_remap.bus_addr;
 
vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE |
VM_DONTDUMP | VM_PFNMAP;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index c8aade17efef..b752dc36a2cd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -892,7 +892,7 @@ static int kfd_gws_init(struct kfd_dev *kfd)
|| (kfd->device_info->asic_family == CHIP_ALDEBARAN
&& kfd->mec2_fw_version >= 0x28))
ret = amdgpu_amdkfd_alloc_gws(kfd->adev,
-   amdgpu_amdkfd_get_num_gws(kfd->adev), 
>gws);
+   kfd->adev->gds.gws_size, >gws);
 
return ret;
 }
@@ 

RE: [PATCH] drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov

2021-11-05 Thread Liu, Shaoyun
[AMD Official Use Only]

Ye, a lot already been changed since then , now the  pre_reset and  post_reset 
not in the  lock/unlock anymore.  With  my previous change , we make 
kfd_pre_reset  avoid touch  HW . Now it's pure SW handling , should be safe  to 
be moved out of the full access . 
Anyway, thanks to bring this up, it will remind us to verify on the  XGMI 
configuration on SRIOV. 

Regards
shaoyun.liu 

-Original Message-
From: Kuehling, Felix  
Sent: Friday, November 5, 2021 1:48 PM
To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov

There was a reason why pre_reset was done differently on SRIOV. However, the 
code has changed a lot since then. Is this concern still valid?

> commit 7b184b006185215daf4e911f8de212964c99a514
> Author: wentalou 
> Date:   Fri Dec 7 13:53:18 2018 +0800
>
>     drm/amdgpu: kfd_pre_reset outside req_full_gpu cause sriov hang
>
>     XGMI hive put kfd_pre_reset into amdgpu_device_lock_adev,
>     but outside req_full_gpu of sriov.
>     It would make sriov hang during reset.
>
>     Signed-off-by: Wentao Lou 
>     Reviewed-by: Shaoyun Liu 
>     Signed-off-by: Alex Deucher 
Regards,
   Felix


On 2021-11-05 12:57 p.m., shaoyunl wrote:
> The KFD pre_reset should be called before reset been executed, it will 
> hold the lock to prevent other rocm process to sent the packlage to 
> hiq during host execute the real reset on the HW
>
> Signed-off-by: shaoyunl 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
>   1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 95fec36e385e..d7c9dce17cad 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4278,8 +4278,6 @@ static int amdgpu_device_reset_sriov(struct 
> amdgpu_device *adev,
>   if (r)
>   return r;
>   
> - amdgpu_amdkfd_pre_reset(adev);
> -
>   /* Resume IP prior to SMC */
>   r = amdgpu_device_ip_reinit_early_sriov(adev);
>   if (r)
> @@ -5015,8 +5013,7 @@ int amdgpu_device_gpu_recover(struct 
> amdgpu_device *adev,
>   
>   cancel_delayed_work_sync(_adev->delayed_init_work);
>   
> - if (!amdgpu_sriov_vf(tmp_adev))
> - amdgpu_amdkfd_pre_reset(tmp_adev);
> + amdgpu_amdkfd_pre_reset(tmp_adev);
>   
>   /*
>* Mark these ASICs to be reseted as untracked first


Re: [PATCH] drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov

2021-11-05 Thread Felix Kuehling
There was a reason why pre_reset was done differently on SRIOV. However, 
the code has changed a lot since then. Is this concern still valid?



commit 7b184b006185215daf4e911f8de212964c99a514
Author: wentalou 
Date:   Fri Dec 7 13:53:18 2018 +0800

    drm/amdgpu: kfd_pre_reset outside req_full_gpu cause sriov hang

    XGMI hive put kfd_pre_reset into amdgpu_device_lock_adev,
    but outside req_full_gpu of sriov.
    It would make sriov hang during reset.

    Signed-off-by: Wentao Lou 
    Reviewed-by: Shaoyun Liu 
    Signed-off-by: Alex Deucher 

Regards,
  Felix


On 2021-11-05 12:57 p.m., shaoyunl wrote:

The KFD pre_reset should be called before reset been executed, it will
hold the lock to prevent other rocm process to sent the packlage to hiq
during host execute the real reset on the HW

Signed-off-by: shaoyunl 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
  1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 95fec36e385e..d7c9dce17cad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4278,8 +4278,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device 
*adev,
if (r)
return r;
  
-	amdgpu_amdkfd_pre_reset(adev);

-
/* Resume IP prior to SMC */
r = amdgpu_device_ip_reinit_early_sriov(adev);
if (r)
@@ -5015,8 +5013,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
  
  		cancel_delayed_work_sync(_adev->delayed_init_work);
  
-		if (!amdgpu_sriov_vf(tmp_adev))

-   amdgpu_amdkfd_pre_reset(tmp_adev);
+   amdgpu_amdkfd_pre_reset(tmp_adev);
  
  		/*

 * Mark these ASICs to be reseted as untracked first


[PATCH] drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov

2021-11-05 Thread shaoyunl
The KFD pre_reset should be called before reset been executed, it will
hold the lock to prevent other rocm process to sent the packlage to hiq
during host execute the real reset on the HW

Signed-off-by: shaoyunl 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 95fec36e385e..d7c9dce17cad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4278,8 +4278,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device 
*adev,
if (r)
return r;
 
-   amdgpu_amdkfd_pre_reset(adev);
-
/* Resume IP prior to SMC */
r = amdgpu_device_ip_reinit_early_sriov(adev);
if (r)
@@ -5015,8 +5013,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 
cancel_delayed_work_sync(_adev->delayed_init_work);
 
-   if (!amdgpu_sriov_vf(tmp_adev))
-   amdgpu_amdkfd_pre_reset(tmp_adev);
+   amdgpu_amdkfd_pre_reset(tmp_adev);
 
/*
 * Mark these ASICs to be reseted as untracked first
-- 
2.17.1



Re: [RFC PATCH 3/3] drm/amdgpu: enable HIQ in amdgpu without kfd

2021-11-05 Thread Das, Nirmoy



On 11/5/2021 3:17 PM, Alex Deucher wrote:

On Fri, Nov 5, 2021 at 10:09 AM Nirmoy Das  wrote:

There is a HW bug which prevents CP to read secure buffers
with HIQ being configured and mapped using KIQ. KFD already
does this for amdgpu but when kfd is not enabled amdgpu
should that for itself.

Can we just move the HIQ init/fini into the KGD and then have KFD call
into the KGD when it needs to interact with it?  I'd rather not have
two code paths to maintain to handle the HIQ ring.



I looked into the kfd code a bit, AFAIU kfd deals with struct 
v{9|10}_mqd instead of amdgpu_ring.


I could try to expose a function in KGD to map HIQ with a mqd struct 
which kfd can use.



Regards,

Nirmoy




Alex


Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 -
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 77 
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 80 +
  3 files changed, 170 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 053a1119ebfe..837f76550242 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -519,7 +519,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
 AMDGPU_GEM_DOMAIN_VRAM, 
>mqd_obj,
 >mqd_gpu_addr, 
>mqd_ptr);
 if (r) {
-   dev_warn(adev->dev, "failed to create ring mqd ob 
(%d)", r);
+   dev_warn(adev->dev, "failed to create KIQ ring mqd ob 
(%d)", r);
 return r;
 }

@@ -569,6 +569,18 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
 }
 }

+   /* create MQD for HIQ */
+   ring = >gfx.hiq.ring;
+   if (!ring->mqd_obj) {
+   r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_VRAM, 
>mqd_obj,
+   >mqd_gpu_addr, 
>mqd_ptr);
+   if (r) {
+   dev_warn(adev->dev, "failed to create HIQ ring mqd ob 
(%d)", r);
+   return r;
+   }
+   }
+
 return 0;
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 538130c453a6..9532f013128f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4794,6 +4794,7 @@ static int gfx_v10_0_sw_init(void *handle)
  {
 int i, j, k, r, ring_id = 0;
 struct amdgpu_kiq *kiq;
+   struct amdgpu_hiq *hiq;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

 switch (adev->ip_versions[GC_HWIP][0]) {
@@ -4923,6 +4924,18 @@ static int gfx_v10_0_sw_init(void *handle)
 if (r)
 return r;

+   if (!adev->kfd.dev) {
+   r = amdgpu_gfx_hiq_init(adev, GFX10_MEC_HPD_SIZE);
+   if (r) {
+   DRM_ERROR("Failed to init HIQ BOs!\n");
+   return r;
+   }
+
+   hiq = >gfx.hiq;
+   r = amdgpu_gfx_hiq_init_ring(adev, >ring, >irq);
+   if (r)
+   return r;
+   }
 r = amdgpu_gfx_mqd_sw_init(adev, sizeof(struct v10_compute_mqd));
 if (r)
 return r;
@@ -7215,6 +7228,54 @@ static int gfx_v10_0_kcq_resume(struct amdgpu_device 
*adev)
 return r;
  }

+static int gfx_v10_0_hiq_init_queue(struct amdgpu_ring *ring)
+{
+   struct amdgpu_device *adev = ring->adev;
+   struct v10_compute_mqd *mqd = ring->mqd_ptr;
+
+
+   if (amdgpu_in_reset(adev)) {
+   /* reset ring buffer */
+   ring->wptr = 0;
+   amdgpu_ring_clear_ring(ring);
+
+   } else {
+   memset((void *)mqd, 0, sizeof(*mqd));
+   mutex_lock(>srbm_mutex);
+   nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
+   gfx_v10_0_compute_mqd_init(ring);
+   nv_grbm_select(adev, 0, 0, 0, 0);
+   mutex_unlock(>srbm_mutex);
+   }
+
+   return 0;
+}
+
+static int gfx_v10_0_hiq_resume(struct amdgpu_device *adev)
+{
+   struct amdgpu_ring *ring;
+   int r;
+
+   ring = >gfx.hiq.ring;
+
+   r = amdgpu_bo_reserve(ring->mqd_obj, false);
+   if (unlikely(r != 0))
+   return r;
+
+   r = amdgpu_bo_kmap(ring->mqd_obj, (void **)>mqd_ptr);
+   if (unlikely(r != 0))
+   return r;
+
+   gfx_v10_0_hiq_init_queue(ring);
+   amdgpu_bo_kunmap(ring->mqd_obj);
+   ring->mqd_ptr = NULL;
+   amdgpu_bo_unreserve(ring->mqd_obj);
+   ring->sched.ready = true;
+
+   amdgpu_gfx_enable_hiq(adev);
+   return 0;
+}
+
  static int gfx_v10_0_cp_resume(struct amdgpu_device *adev)
  {
 int r, i;
@@ 

Re: [PATCH] drm/amd/amdkfd: Don't sent command to HWS on kfd reset

2021-11-05 Thread Felix Kuehling


Am 2021-11-04 um 12:53 p.m. schrieb shaoyunl:
> When kfd need to be reset, sent command to HWS might cause hang and get 
> unnecessary timeout.
> This change try not to touch HW in pre_reset and keep queues to be in the 
> evicted state
> when the reset is done, so they are not put back on the runlist. These queues 
> will be destroied
> on process termination.
>
> Signed-off-by: shaoyunl 

Reviewed-by: Felix Kuehling 


> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 6 +-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index e9601d4dfb77..0a60317509c8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1430,7 +1430,7 @@ static int unmap_queues_cpsch(struct 
> device_queue_manager *dqm,
>  
>   if (!dqm->sched_running)
>   return 0;
> - if (dqm->is_hws_hang)
> + if (dqm->is_hws_hang || dqm->is_resetting)
>   return -EIO;
>   if (!dqm->active_runlist)
>   return retval;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index f8a8fdb95832..f29b3932e3dc 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -1715,7 +1715,11 @@ int kfd_process_evict_queues(struct kfd_process *p)
>  
>   r = pdd->dev->dqm->ops.evict_process_queues(pdd->dev->dqm,
>   >qpd);
> - if (r) {
> + /* evict return -EIO if HWS is hang or asic is resetting, in 
> this case
> +  * we would like to set all the queues to be in evicted state 
> to prevent
> +  * them been add back since they actually not be saved right 
> now.
> +  */
> + if (r && r != -EIO) {
>   pr_err("Failed to evict process queues\n");
>   goto fail;
>   }


Re: [RFC PATCH 3/3] drm/amdgpu: enable HIQ in amdgpu without kfd

2021-11-05 Thread Alex Deucher
On Fri, Nov 5, 2021 at 10:09 AM Nirmoy Das  wrote:
>
> There is a HW bug which prevents CP to read secure buffers
> with HIQ being configured and mapped using KIQ. KFD already
> does this for amdgpu but when kfd is not enabled amdgpu
> should that for itself.

Can we just move the HIQ init/fini into the KGD and then have KFD call
into the KGD when it needs to interact with it?  I'd rather not have
two code paths to maintain to handle the HIQ ring.

Alex

>
> Signed-off-by: Nirmoy Das 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 -
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 77 
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 80 +
>  3 files changed, 170 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 053a1119ebfe..837f76550242 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -519,7 +519,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
> AMDGPU_GEM_DOMAIN_VRAM, 
> >mqd_obj,
> >mqd_gpu_addr, 
> >mqd_ptr);
> if (r) {
> -   dev_warn(adev->dev, "failed to create ring mqd ob 
> (%d)", r);
> +   dev_warn(adev->dev, "failed to create KIQ ring mqd ob 
> (%d)", r);
> return r;
> }
>
> @@ -569,6 +569,18 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
> }
> }
>
> +   /* create MQD for HIQ */
> +   ring = >gfx.hiq.ring;
> +   if (!ring->mqd_obj) {
> +   r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
> +   AMDGPU_GEM_DOMAIN_VRAM, 
> >mqd_obj,
> +   >mqd_gpu_addr, 
> >mqd_ptr);
> +   if (r) {
> +   dev_warn(adev->dev, "failed to create HIQ ring mqd ob 
> (%d)", r);
> +   return r;
> +   }
> +   }
> +
> return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 538130c453a6..9532f013128f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -4794,6 +4794,7 @@ static int gfx_v10_0_sw_init(void *handle)
>  {
> int i, j, k, r, ring_id = 0;
> struct amdgpu_kiq *kiq;
> +   struct amdgpu_hiq *hiq;
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> switch (adev->ip_versions[GC_HWIP][0]) {
> @@ -4923,6 +4924,18 @@ static int gfx_v10_0_sw_init(void *handle)
> if (r)
> return r;
>
> +   if (!adev->kfd.dev) {
> +   r = amdgpu_gfx_hiq_init(adev, GFX10_MEC_HPD_SIZE);
> +   if (r) {
> +   DRM_ERROR("Failed to init HIQ BOs!\n");
> +   return r;
> +   }
> +
> +   hiq = >gfx.hiq;
> +   r = amdgpu_gfx_hiq_init_ring(adev, >ring, >irq);
> +   if (r)
> +   return r;
> +   }
> r = amdgpu_gfx_mqd_sw_init(adev, sizeof(struct v10_compute_mqd));
> if (r)
> return r;
> @@ -7215,6 +7228,54 @@ static int gfx_v10_0_kcq_resume(struct amdgpu_device 
> *adev)
> return r;
>  }
>
> +static int gfx_v10_0_hiq_init_queue(struct amdgpu_ring *ring)
> +{
> +   struct amdgpu_device *adev = ring->adev;
> +   struct v10_compute_mqd *mqd = ring->mqd_ptr;
> +
> +
> +   if (amdgpu_in_reset(adev)) {
> +   /* reset ring buffer */
> +   ring->wptr = 0;
> +   amdgpu_ring_clear_ring(ring);
> +
> +   } else {
> +   memset((void *)mqd, 0, sizeof(*mqd));
> +   mutex_lock(>srbm_mutex);
> +   nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
> +   gfx_v10_0_compute_mqd_init(ring);
> +   nv_grbm_select(adev, 0, 0, 0, 0);
> +   mutex_unlock(>srbm_mutex);
> +   }
> +
> +   return 0;
> +}
> +
> +static int gfx_v10_0_hiq_resume(struct amdgpu_device *adev)
> +{
> +   struct amdgpu_ring *ring;
> +   int r;
> +
> +   ring = >gfx.hiq.ring;
> +
> +   r = amdgpu_bo_reserve(ring->mqd_obj, false);
> +   if (unlikely(r != 0))
> +   return r;
> +
> +   r = amdgpu_bo_kmap(ring->mqd_obj, (void **)>mqd_ptr);
> +   if (unlikely(r != 0))
> +   return r;
> +
> +   gfx_v10_0_hiq_init_queue(ring);
> +   amdgpu_bo_kunmap(ring->mqd_obj);
> +   ring->mqd_ptr = NULL;
> +   amdgpu_bo_unreserve(ring->mqd_obj);
> +   ring->sched.ready = true;
> +
> +   amdgpu_gfx_enable_hiq(adev);
> +   return 0;
> +}
> +
>  static int gfx_v10_0_cp_resume(struct amdgpu_device *adev)
>  {
> int r, i;
> @@ -7252,6 

[RFC PATCH 2/3] drm/amdgpu: add HIQ eng_sel to KIQ packets

2021-11-05 Thread Nirmoy Das
Allow KIQ to map/unmap HIQ MQD as well.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  2 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  |  4 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   |  4 ++--
 4 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 5b8cb76e35a0..053a1119ebfe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -1010,3 +1010,17 @@ void amdgpu_gfx_state_change_set(struct amdgpu_device 
*adev, enum gfx_change_sta
(adev)->powerplay.pp_handle, state));
mutex_unlock(>pm.mutex);
 }
+
+int amdgpu_kiq_get_eng_num(struct amdgpu_ring *ring)
+{
+
+   switch (ring->funcs->type) {
+   case AMDGPU_RING_TYPE_GFX:
+   return 4;
+   case AMDGPU_RING_TYPE_HIQ:
+   return 1;
+   default:
+   return 0;
+   }
+
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 4d9c91f4400d..88d942b1ef08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -373,6 +373,8 @@ static inline u32 amdgpu_gfx_create_bitmask(u32 bit_width)
return (u32)((1ULL << bit_width) - 1);
 }
 
+int amdgpu_kiq_get_eng_num(struct amdgpu_ring *ring);
+
 int amdgpu_gfx_scratch_get(struct amdgpu_device *adev, uint32_t *reg);
 void amdgpu_gfx_scratch_free(struct amdgpu_device *adev, uint32_t reg);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 90a834dc4008..538130c453a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3633,7 +3633,7 @@ static void gfx10_kiq_unmap_queues(struct amdgpu_ring 
*kiq_ring,
   enum amdgpu_unmap_queues_action action,
   u64 gpu_addr, u64 seq)
 {
-   uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
+   uint32_t eng_sel = amdgpu_kiq_get_eng_num(ring);
 
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_UNMAP_QUEUES, 4));
amdgpu_ring_write(kiq_ring, /* Q_sel: 0, vmid: 0, engine: 0, num_Q: 1 */
@@ -3660,7 +3660,7 @@ static void gfx10_kiq_query_status(struct amdgpu_ring 
*kiq_ring,
   u64 addr,
   u64 seq)
 {
-   uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
+   uint32_t eng_sel = amdgpu_kiq_get_eng_num(ring);
 
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_QUERY_STATUS, 5));
amdgpu_ring_write(kiq_ring,
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 7f944bb11298..2b29e42bde62 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -847,7 +847,7 @@ static void gfx_v9_0_kiq_map_queues(struct amdgpu_ring 
*kiq_ring,
struct amdgpu_device *adev = kiq_ring->adev;
uint64_t mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj);
uint64_t wptr_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
-   uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
+   uint32_t eng_sel = amdgpu_kiq_get_eng_num(ring);
 
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5));
/* Q_sel:0, vmid:0, vidmem: 1, engine:0, num_Q:1*/
@@ -877,7 +877,7 @@ static void gfx_v9_0_kiq_unmap_queues(struct amdgpu_ring 
*kiq_ring,
   enum amdgpu_unmap_queues_action action,
   u64 gpu_addr, u64 seq)
 {
-   uint32_t eng_sel = ring->funcs->type == AMDGPU_RING_TYPE_GFX ? 4 : 0;
+   uint32_t eng_sel = amdgpu_kiq_get_eng_num(ring);
 
amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_UNMAP_QUEUES, 4));
amdgpu_ring_write(kiq_ring, /* Q_sel: 0, vmid: 0, engine: 0, num_Q: 1 */
-- 
2.31.1



[RFC PATCH 3/3] drm/amdgpu: enable HIQ in amdgpu without kfd

2021-11-05 Thread Nirmoy Das
There is a HW bug which prevents CP to read secure buffers
with HIQ being configured and mapped using KIQ. KFD already
does this for amdgpu but when kfd is not enabled amdgpu
should that for itself.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 -
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 77 
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 80 +
 3 files changed, 170 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 053a1119ebfe..837f76550242 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -519,7 +519,7 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
AMDGPU_GEM_DOMAIN_VRAM, 
>mqd_obj,
>mqd_gpu_addr, 
>mqd_ptr);
if (r) {
-   dev_warn(adev->dev, "failed to create ring mqd ob 
(%d)", r);
+   dev_warn(adev->dev, "failed to create KIQ ring mqd ob 
(%d)", r);
return r;
}
 
@@ -569,6 +569,18 @@ int amdgpu_gfx_mqd_sw_init(struct amdgpu_device *adev,
}
}
 
+   /* create MQD for HIQ */
+   ring = >gfx.hiq.ring;
+   if (!ring->mqd_obj) {
+   r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_VRAM, 
>mqd_obj,
+   >mqd_gpu_addr, 
>mqd_ptr);
+   if (r) {
+   dev_warn(adev->dev, "failed to create HIQ ring mqd ob 
(%d)", r);
+   return r;
+   }
+   }
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 538130c453a6..9532f013128f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4794,6 +4794,7 @@ static int gfx_v10_0_sw_init(void *handle)
 {
int i, j, k, r, ring_id = 0;
struct amdgpu_kiq *kiq;
+   struct amdgpu_hiq *hiq;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
switch (adev->ip_versions[GC_HWIP][0]) {
@@ -4923,6 +4924,18 @@ static int gfx_v10_0_sw_init(void *handle)
if (r)
return r;
 
+   if (!adev->kfd.dev) {
+   r = amdgpu_gfx_hiq_init(adev, GFX10_MEC_HPD_SIZE);
+   if (r) {
+   DRM_ERROR("Failed to init HIQ BOs!\n");
+   return r;
+   }
+
+   hiq = >gfx.hiq;
+   r = amdgpu_gfx_hiq_init_ring(adev, >ring, >irq);
+   if (r)
+   return r;
+   }
r = amdgpu_gfx_mqd_sw_init(adev, sizeof(struct v10_compute_mqd));
if (r)
return r;
@@ -7215,6 +7228,54 @@ static int gfx_v10_0_kcq_resume(struct amdgpu_device 
*adev)
return r;
 }
 
+static int gfx_v10_0_hiq_init_queue(struct amdgpu_ring *ring)
+{
+   struct amdgpu_device *adev = ring->adev;
+   struct v10_compute_mqd *mqd = ring->mqd_ptr;
+
+
+   if (amdgpu_in_reset(adev)) {
+   /* reset ring buffer */
+   ring->wptr = 0;
+   amdgpu_ring_clear_ring(ring);
+
+   } else {
+   memset((void *)mqd, 0, sizeof(*mqd));
+   mutex_lock(>srbm_mutex);
+   nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
+   gfx_v10_0_compute_mqd_init(ring);
+   nv_grbm_select(adev, 0, 0, 0, 0);
+   mutex_unlock(>srbm_mutex);
+   }
+
+   return 0;
+}
+
+static int gfx_v10_0_hiq_resume(struct amdgpu_device *adev)
+{
+   struct amdgpu_ring *ring;
+   int r;
+
+   ring = >gfx.hiq.ring;
+
+   r = amdgpu_bo_reserve(ring->mqd_obj, false);
+   if (unlikely(r != 0))
+   return r;
+
+   r = amdgpu_bo_kmap(ring->mqd_obj, (void **)>mqd_ptr);
+   if (unlikely(r != 0))
+   return r;
+
+   gfx_v10_0_hiq_init_queue(ring);
+   amdgpu_bo_kunmap(ring->mqd_obj);
+   ring->mqd_ptr = NULL;
+   amdgpu_bo_unreserve(ring->mqd_obj);
+   ring->sched.ready = true;
+
+   amdgpu_gfx_enable_hiq(adev);
+   return 0;
+}
+
 static int gfx_v10_0_cp_resume(struct amdgpu_device *adev)
 {
int r, i;
@@ -7252,6 +7313,12 @@ static int gfx_v10_0_cp_resume(struct amdgpu_device 
*adev)
return r;
}
 
+   if (!adev->kfd.dev) {
+   r = gfx_v10_0_hiq_resume(adev);
+   if (r)
+   return r;
+   }
+
for (i = 0; i < adev->gfx.num_gfx_rings; i++) {
ring = >gfx.gfx_ring[i];
r = amdgpu_ring_test_helper(ring);
@@ -7557,6 +7624,11 @@ static int gfx_v10_0_hw_fini(void *handle)
 #endif
if 

[RFC PATCH 1/3] drm/amdgpu: add HIQ ring to amdgpu

2021-11-05 Thread Nirmoy Das
Add HIQ ring structs and functions that will map HIQ
using KIQ.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c  | 142 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h  |  24 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |   3 +-
 4 files changed, 169 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
index 89e6ad30396f..2d9295adac06 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
@@ -40,6 +40,7 @@ struct amdgpu_doorbell {
  */
 struct amdgpu_doorbell_index {
uint32_t kiq;
+   uint32_t hiq;
uint32_t mec_ring0;
uint32_t mec_ring1;
uint32_t mec_ring2;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 1916ec84dd71..5b8cb76e35a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -256,6 +256,148 @@ void amdgpu_gfx_graphics_queue_acquire(struct 
amdgpu_device *adev)
bitmap_weight(adev->gfx.me.queue_bitmap, AMDGPU_MAX_GFX_QUEUES);
 }
 
+int amdgpu_gfx_hiq_acquire(struct amdgpu_device *adev, struct amdgpu_ring 
*ring)
+{
+   int queue_bit;
+   int mec, pipe, queue;
+
+   queue_bit = adev->gfx.mec.num_mec
+   * adev->gfx.mec.num_pipe_per_mec
+   * adev->gfx.mec.num_queue_per_pipe;
+
+   while (queue_bit-- >= 0) {
+   if (test_bit(queue_bit, adev->gfx.mec.queue_bitmap))
+   continue;
+
+   amdgpu_queue_mask_bit_to_mec_queue(adev, queue_bit, , 
, );
+
+   if (mec == 1 && pipe > 1)
+   continue;
+
+   ring->me = mec + 1;
+   ring->pipe = pipe;
+   ring->queue = queue;
+
+   return 0;
+   }
+
+   dev_err(adev->dev, "Failed to find a queue for HIQ\n");
+   return -EINVAL;
+}
+
+int amdgpu_gfx_hiq_init_ring(struct amdgpu_device *adev,
+struct amdgpu_ring *ring,
+struct amdgpu_irq_src *irq)
+{
+   struct amdgpu_hiq *hiq = >gfx.hiq;
+   int r = 0;
+
+   ring->adev = NULL;
+   ring->ring_obj = NULL;
+   ring->use_doorbell = true;
+   ring->doorbell_index = adev->doorbell_index.hiq;
+
+   r = amdgpu_gfx_hiq_acquire(adev, ring);
+   if (r)
+   return r;
+
+   ring->eop_gpu_addr = hiq->eop_gpu_addr;
+   ring->no_scheduler = true;
+   sprintf(ring->name, "hiq_%d.%d.%d", ring->me, ring->pipe, ring->queue);
+   r = amdgpu_ring_init(adev, ring, 1024, irq, 
AMDGPU_CP_IRQ_COMPUTE_MEC2_PIPE0_EOP,
+AMDGPU_RING_PRIO_DEFAULT, NULL);
+   if (r)
+   dev_warn(adev->dev, "(%d) failed to init hiq ring\n", r);
+
+   return r;
+}
+
+void amdgpu_gfx_hiq_free_ring(struct amdgpu_ring *ring)
+{
+   amdgpu_ring_fini(ring);
+}
+
+void amdgpu_gfx_hiq_init_ring_fini(struct amdgpu_device *adev)
+{
+   struct amdgpu_hiq *hiq = >gfx.hiq;
+
+   amdgpu_bo_free_kernel(>eop_obj, >eop_gpu_addr, NULL);
+}
+
+int amdgpu_gfx_hiq_init(struct amdgpu_device *adev,
+   unsigned hpd_size)
+{
+   int r;
+   u32 *hpd;
+   struct amdgpu_hiq *hiq = >gfx.hiq;
+
+   r = amdgpu_bo_create_kernel(adev, hpd_size, PAGE_SIZE,
+   AMDGPU_GEM_DOMAIN_GTT, >eop_obj,
+   >eop_gpu_addr, (void **));
+   if (r) {
+   dev_warn(adev->dev, "failed to create HIQ bo (%d).\n", r);
+   return r;
+   }
+
+   memset(hpd, 0, hpd_size);
+
+   r = amdgpu_bo_reserve(hiq->eop_obj, true);
+   if (unlikely(r != 0))
+   dev_warn(adev->dev, "(%d) reserve hiq eop bo failed\n", r);
+   amdgpu_bo_kunmap(hiq->eop_obj);
+   amdgpu_bo_unreserve(hiq->eop_obj);
+
+   return 0;
+}
+
+int amdgpu_gfx_disable_hiq(struct amdgpu_device *adev)
+{
+   struct amdgpu_kiq *kiq = >gfx.kiq;
+   struct amdgpu_ring *kiq_ring = >ring;
+   int r;
+
+   if (!kiq->pmf || !kiq->pmf->kiq_unmap_queues)
+   return -EINVAL;
+
+   spin_lock(>gfx.kiq.ring_lock);
+   if (amdgpu_ring_alloc(kiq_ring, kiq->pmf->unmap_queues_size)) {
+   spin_unlock(>gfx.kiq.ring_lock);
+   return -ENOMEM;
+   }
+
+   kiq->pmf->kiq_unmap_queues(kiq_ring, >gfx.kiq.ring, RESET_QUEUES,
+  0, 0);
+   r = amdgpu_ring_test_helper(kiq_ring);
+   spin_unlock(>gfx.kiq.ring_lock);
+
+   return r;
+}
+
+int amdgpu_gfx_enable_hiq(struct amdgpu_device *adev)
+{
+   struct amdgpu_kiq *kiq = >gfx.kiq;
+   struct amdgpu_ring *kiq_ring = >gfx.kiq.ring;
+   struct amdgpu_ring *hiq_ring = >gfx.hiq.ring;
+   int r;
+
+   

Re: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")

2021-11-05 Thread Borislav Petkov
On Fri, Nov 05, 2021 at 08:05:41AM +, Quan, Evan wrote:
> I'm wondering are you able to give the attached patch(alone) a try.

Yap, looks good.

Tested-by: Borislav Petkov 

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 1/2] drm: Add a drm_drv_enabled() to check if drivers should be enabled

2021-11-05 Thread Javier Martinez Canillas
On 11/5/21 11:04, Jani Nikula wrote:
> On Fri, 05 Nov 2021, Javier Martinez Canillas  wrote:

[snip]

>>
>> Do you envision other condition that could be added later to disable a
>> DRM driver ? Or do you think that just from a code readability point of
>> view makes worth it ?
> 
> Taking a step back for perspective.
> 
> I think there's broad consensus in moving the parameter to drm, naming
> the check function to drm_something_something(), and breaking the ties
> to CONFIG_VGA_CONSOLE. I appreciate the work you're doing to that
> effect.
>

Thanks, I appreciate your feedback and comments.
 
> I think everything beyond that is still a bit vague and/or
> contentious. So how about making the first 2-3 patches just that?
> Something we can all agree on, makes good progress, improves the kernel,
> and gives us something to build on?
>

That works for me. Thomas, do you agree with that approach ?
 
Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH v2 2/2] drm: Move nomodeset kernel parameter to the DRM subsystem

2021-11-05 Thread Javier Martinez Canillas
On 11/5/21 10:39, Thomas Zimmermann wrote:

[snip]


 +obj-$(CONFIG_VGA_CONSOLE) += drm_nomodeset.o
 +
>>>
>>> This now depends on the VGA textmode console. Even if you have no VGA
>>> console, you'd want drm_nomodeset.o. Simpledrm might be built-in and can
>>> provide graphics. Non-PC systems don't even have a VGA device.
>>
>> This was discussed in an earlier version, which had this builtin but the
>> header still had a stub for CONFIG_VGA_CONSOLE=n.
>>
>>> I think we really want a separate boolean config option that gets
>>> selected by CONFIG_DRM.
>>
>> Perhaps that should be a separate change on top.
> 
> Sure, make it a separate patch.
>

Agreed. I was planning to do it as a follow-up as well and drop the
#ifdef CONFIG_VGA_CONSOLE guard in the header.
 
> We want to make this work on ARM systems. I even have a request to 
> replace offb on Power architecture by simpledrm. So the final config has 
> to be system agnostic.
>

Same, since we want to drop the fbdev drivers in Fedora, for all arches.
 
> Best regards
> Thomas
> 
Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH v2 1/2] drm: Add a drm_drv_enabled() to check if drivers should be enabled

2021-11-05 Thread Javier Martinez Canillas
Hello Thomas,

On 11/5/21 09:43, Thomas Zimmermann wrote:
> Hi
> 
> Am 04.11.21 um 21:09 schrieb Javier Martinez Canillas:
>> Hello Jani,
>>
>> On 11/4/21 20:57, Jani Nikula wrote:
>>> On Thu, 04 Nov 2021, Javier Martinez Canillas  wrote:
 +/**
 + * drm_drv_enabled - Checks if a DRM driver can be enabled
 + * @driver: DRM driver to check
 + *
 + * Checks whether a DRM driver can be enabled or not. This may be the case
 + * if the "nomodeset" kernel command line parameter is used.
 + *
 + * Return: 0 on success or a negative error code on failure.
 + */
 +int drm_drv_enabled(const struct drm_driver *driver)
> 
> Jani mentioned that i915 absolutely wants this to run from the 
> module_init function. Best is to drop the parameter.
>

Ok. I now wonder though how much value would add this function since
it will just be a wrapper around the nomodeset check.

We talked about adding a new DRIVER_GENERIC feature flag and check for
this, but as danvet mentioned that is not really needed. We just need
to avoid testing for nomodeset in the simpledrm driver.

Do you envision other condition that could be added later to disable a
DRM driver ? Or do you think that just from a code readability point of
view makes worth it ?

 +{
 +  if (vgacon_text_force()) {
 +  DRM_INFO("%s driver is disabled\n", driver->name);
 +  return -ENODEV;
 +  }
> 
> If we run this from within a module_init function, we'd get plenty of 
> these warnings if drivers are compiled into the kernel. Maybe simply 
> remove the message. There's already a warning printed by the nomodeset 
> handler.
>

Indeed. I'll just drop it.

 +
 +  return 0;
 +}
 +EXPORT_SYMBOL(drm_drv_enabled);
>>>
>>> The name implies a bool return, but it's not.
>>>
>>> if (drm_drv_enabled(...)) {
>>> /* surprise, it's disabled! */
>>> }
>>>
>>
>> It used to return a bool in v2 but Thomas suggested an int instead to
>> have consistency on the errno code that was returned by the callers.
>>
>> I should probably name that function differently to avoid confusion.
> 
> Yes, please.
>

drm_driver_check() maybe ?
 
Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH v2 2/2] drm: Move nomodeset kernel parameter to the DRM subsystem

2021-11-05 Thread Javier Martinez Canillas
On 11/5/21 10:00, Thomas Zimmermann wrote:

[snip]

>> +
>> +static int __init disable_modeset(char *str)
>> +{
>> +drm_nomodeset = true;
>> +
>> +pr_warn("You have booted with nomodeset. This means your GPU drivers 
>> are DISABLED\n");
>> +pr_warn("Any video related functionality will be severely degraded, and 
>> you may not even be able to suspend the system properly\n");
>> +pr_warn("Unless you actually understand what nomodeset does, you should 
>> reboot without enabling it\n");
> 
> I'd update this text to be less sensational.
>

This is indeed quite sensational. But think we can do it as a follow-up patch.

Since we will have to stick with nomodeset for backward compatibility, I was
planning to add documentation to explain what this parameter is about and what
is the actual effect of setting it.

So I think we can change this as a part of that patch-set.
 
>> +
>> +return 1;
>> +}
>> +
>> +/* Disable kernel modesetting */
>> +__setup("nomodeset", disable_modeset);
>> diff --git a/drivers/gpu/drm/i915/i915_module.c 
>> b/drivers/gpu/drm/i915/i915_module.c
>> index 45cb3e540eff..c890c1ca20c4 100644
>> --- a/drivers/gpu/drm/i915/i915_module.c
>> +++ b/drivers/gpu/drm/i915/i915_module.c
>> @@ -4,8 +4,6 @@
>>* Copyright © 2021 Intel Corporation
>>*/
>>   
>> -#include 
>> -
>
> These changes should be in patch 1?
>

Yes, I forgot to move these when changed the order of the patches.

Best regards, -- 
Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH v2 1/2] drm: Add a drm_drv_enabled() to check if drivers should be enabled

2021-11-05 Thread Thomas Zimmermann

Hi

Am 05.11.21 um 13:00 schrieb Javier Martinez Canillas:

On 11/5/21 11:04, Jani Nikula wrote:

On Fri, 05 Nov 2021, Javier Martinez Canillas  wrote:


[snip]



Do you envision other condition that could be added later to disable a
DRM driver ? Or do you think that just from a code readability point of
view makes worth it ?


Taking a step back for perspective.

I think there's broad consensus in moving the parameter to drm, naming
the check function to drm_something_something(), and breaking the ties
to CONFIG_VGA_CONSOLE. I appreciate the work you're doing to that
effect.



Thanks, I appreciate your feedback and comments.
  

I think everything beyond that is still a bit vague and/or
contentious. So how about making the first 2-3 patches just that?
Something we can all agree on, makes good progress, improves the kernel,
and gives us something to build on?



That works for me. Thomas, do you agree with that approach ?


Sure. I think that's more or less what I proposed in my reply to that mail.

Best regards
Thomas

  
Best regards,




--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature


RE: [PATCH] drm/amdgpu: assign dpms for amdgpu_vkms_crtc_helper_funcs

2021-11-05 Thread Chen, Guchun
[Public]

Acked-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Asher Song
Sent: Friday, November 5, 2021 5:41 PM
To: amd-gfx@lists.freedesktop.org
Cc: Song, Asher 
Subject: [PATCH] drm/amdgpu: assign dpms for amdgpu_vkms_crtc_helper_funcs

In drm_helper_disable_unused_functions(), when !crtc->enable is false, a NULL 
pointer crtc_funcs->dpms may occur. To avoid this, assign dpms for 
amdgpu_vkms_crtc_helper_funcs.

 Call Trace:
  __drm_helper_disable_unused_functions+0xac/0xe0 [drm_kms_helper]
  drm_helper_disable_unused_functions+0x38/0x60 [drm_kms_helper]
  amdgpu_fbdev_init+0xf6/0x100 [amdgpu]
  amdgpu_device_init+0x13d4/0x1f10 [amdgpu]

Fixes: ba5317109d0ce ("drm/amdgpu: create amdgpu_vkms (v4)")
Signed-off-by: Asher Song 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 26 
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
index 50bdc39733aa..9cfe479c4c97 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
@@ -156,7 +156,33 @@ static void amdgpu_vkms_crtc_atomic_flush(struct drm_crtc 
*crtc,
}
 }
 
+static void amdgpu_vkms_crtc_dpms(struct drm_crtc *crtc, int mode) {
+   struct drm_device *dev = crtc->dev;
+   struct amdgpu_device *adev = drm_to_adev(dev);
+   struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
+   unsigned type;
+
+   switch (mode) {
+   case DRM_MODE_DPMS_ON:
+   amdgpu_crtc->enabled = true;
+   /* Make sure VBLANK interrupts are still enabled */
+   type = amdgpu_display_crtc_idx_to_irq_type(adev,
+   amdgpu_crtc->crtc_id);
+   amdgpu_irq_update(adev, >crtc_irq, type);
+   drm_crtc_vblank_on(crtc);
+   break;
+   case DRM_MODE_DPMS_STANDBY:
+   case DRM_MODE_DPMS_SUSPEND:
+   case DRM_MODE_DPMS_OFF:
+   drm_crtc_vblank_off(crtc);
+   amdgpu_crtc->enabled = false;
+   break;
+   }
+}
+
 static const struct drm_crtc_helper_funcs amdgpu_vkms_crtc_helper_funcs = {
+   .dpms = amdgpu_vkms_crtc_dpms,
.atomic_flush   = amdgpu_vkms_crtc_atomic_flush,
.atomic_enable  = amdgpu_vkms_crtc_atomic_enable,
.atomic_disable = amdgpu_vkms_crtc_atomic_disable,
--
2.25.1


Re: [PATCH v2 1/2] drm: Add a drm_drv_enabled() to check if drivers should be enabled

2021-11-05 Thread Thomas Zimmermann

Hi

Am 05.11.21 um 10:48 schrieb Javier Martinez Canillas:

Hello Thomas,

On 11/5/21 09:43, Thomas Zimmermann wrote:

Hi

Am 04.11.21 um 21:09 schrieb Javier Martinez Canillas:

Hello Jani,

On 11/4/21 20:57, Jani Nikula wrote:

On Thu, 04 Nov 2021, Javier Martinez Canillas  wrote:

+/**
+ * drm_drv_enabled - Checks if a DRM driver can be enabled
+ * @driver: DRM driver to check
+ *
+ * Checks whether a DRM driver can be enabled or not. This may be the case
+ * if the "nomodeset" kernel command line parameter is used.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int drm_drv_enabled(const struct drm_driver *driver)


Jani mentioned that i915 absolutely wants this to run from the
module_init function. Best is to drop the parameter.



Ok. I now wonder though how much value would add this function since
it will just be a wrapper around the nomodeset check.

We talked about adding a new DRIVER_GENERIC feature flag and check for
this, but as danvet mentioned that is not really needed. We just need
to avoid testing for nomodeset in the simpledrm driver.

Do you envision other condition that could be added later to disable a
DRM driver ? Or do you think that just from a code readability point of
view makes worth it ?


DRIVER_GENERIC would have been nice, but it's not happening now.

I suggest to move over the nomodeset parameter (i.e., this patchset), 
then make the config option system agnostic, and finally add the test to 
all drivers expect simpledrm. That should solve the imminent problem.


Best regards
Thomas




+{
+   if (vgacon_text_force()) {
+   DRM_INFO("%s driver is disabled\n", driver->name);
+   return -ENODEV;
+   }


If we run this from within a module_init function, we'd get plenty of
these warnings if drivers are compiled into the kernel. Maybe simply
remove the message. There's already a warning printed by the nomodeset
handler.



Indeed. I'll just drop it.


+
+   return 0;
+}
+EXPORT_SYMBOL(drm_drv_enabled);


The name implies a bool return, but it's not.

if (drm_drv_enabled(...)) {
/* surprise, it's disabled! */
}



It used to return a bool in v2 but Thomas suggested an int instead to
have consistency on the errno code that was returned by the callers.

I should probably name that function differently to avoid confusion.


Yes, please.



drm_driver_check() maybe ?
  
Best regards,




--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH v2 1/2] drm: Add a drm_drv_enabled() to check if drivers should be enabled

2021-11-05 Thread Jani Nikula
On Fri, 05 Nov 2021, Javier Martinez Canillas  wrote:
> Hello Thomas,
>
> On 11/5/21 09:43, Thomas Zimmermann wrote:
>> Hi
>> 
>> Am 04.11.21 um 21:09 schrieb Javier Martinez Canillas:
>>> Hello Jani,
>>>
>>> On 11/4/21 20:57, Jani Nikula wrote:
 On Thu, 04 Nov 2021, Javier Martinez Canillas  wrote:
> +/**
> + * drm_drv_enabled - Checks if a DRM driver can be enabled
> + * @driver: DRM driver to check
> + *
> + * Checks whether a DRM driver can be enabled or not. This may be the 
> case
> + * if the "nomodeset" kernel command line parameter is used.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int drm_drv_enabled(const struct drm_driver *driver)
>> 
>> Jani mentioned that i915 absolutely wants this to run from the 
>> module_init function. Best is to drop the parameter.
>>
>
> Ok. I now wonder though how much value would add this function since
> it will just be a wrapper around the nomodeset check.
>
> We talked about adding a new DRIVER_GENERIC feature flag and check for
> this, but as danvet mentioned that is not really needed. We just need
> to avoid testing for nomodeset in the simpledrm driver.
>
> Do you envision other condition that could be added later to disable a
> DRM driver ? Or do you think that just from a code readability point of
> view makes worth it ?

Taking a step back for perspective.

I think there's broad consensus in moving the parameter to drm, naming
the check function to drm_something_something(), and breaking the ties
to CONFIG_VGA_CONSOLE. I appreciate the work you're doing to that
effect.

I think everything beyond that is still a bit vague and/or
contentious. So how about making the first 2-3 patches just that?
Something we can all agree on, makes good progress, improves the kernel,
and gives us something to build on?

BR,
Jani.


>
> +{
> + if (vgacon_text_force()) {
> + DRM_INFO("%s driver is disabled\n", driver->name);
> + return -ENODEV;
> + }
>> 
>> If we run this from within a module_init function, we'd get plenty of 
>> these warnings if drivers are compiled into the kernel. Maybe simply 
>> remove the message. There's already a warning printed by the nomodeset 
>> handler.
>>
>
> Indeed. I'll just drop it.
>
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(drm_drv_enabled);

 The name implies a bool return, but it's not.

if (drm_drv_enabled(...)) {
/* surprise, it's disabled! */
}

>>>
>>> It used to return a bool in v2 but Thomas suggested an int instead to
>>> have consistency on the errno code that was returned by the callers.
>>>
>>> I should probably name that function differently to avoid confusion.
>> 
>> Yes, please.
>>
>
> drm_driver_check() maybe ?
>  
> Best regards,

-- 
Jani Nikula, Intel Open Source Graphics Center


[PATCH] drm/amdgpu: assign dpms for amdgpu_vkms_crtc_helper_funcs

2021-11-05 Thread Asher Song
In drm_helper_disable_unused_functions(), when !crtc->enable is false, a NULL 
pointer crtc_funcs->dpms may occur. To avoid this, assign dpms for 
amdgpu_vkms_crtc_helper_funcs.

 Call Trace:
  __drm_helper_disable_unused_functions+0xac/0xe0 [drm_kms_helper]
  drm_helper_disable_unused_functions+0x38/0x60 [drm_kms_helper]
  amdgpu_fbdev_init+0xf6/0x100 [amdgpu]
  amdgpu_device_init+0x13d4/0x1f10 [amdgpu]

Fixes: ba5317109d0ce ("drm/amdgpu: create amdgpu_vkms (v4)")
Signed-off-by: Asher Song 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 26 
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
index 50bdc39733aa..9cfe479c4c97 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
@@ -156,7 +156,33 @@ static void amdgpu_vkms_crtc_atomic_flush(struct drm_crtc 
*crtc,
}
 }
 
+static void amdgpu_vkms_crtc_dpms(struct drm_crtc *crtc, int mode)
+{
+   struct drm_device *dev = crtc->dev;
+   struct amdgpu_device *adev = drm_to_adev(dev);
+   struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
+   unsigned type;
+
+   switch (mode) {
+   case DRM_MODE_DPMS_ON:
+   amdgpu_crtc->enabled = true;
+   /* Make sure VBLANK interrupts are still enabled */
+   type = amdgpu_display_crtc_idx_to_irq_type(adev,
+   amdgpu_crtc->crtc_id);
+   amdgpu_irq_update(adev, >crtc_irq, type);
+   drm_crtc_vblank_on(crtc);
+   break;
+   case DRM_MODE_DPMS_STANDBY:
+   case DRM_MODE_DPMS_SUSPEND:
+   case DRM_MODE_DPMS_OFF:
+   drm_crtc_vblank_off(crtc);
+   amdgpu_crtc->enabled = false;
+   break;
+   }
+}
+
 static const struct drm_crtc_helper_funcs amdgpu_vkms_crtc_helper_funcs = {
+   .dpms = amdgpu_vkms_crtc_dpms,
.atomic_flush   = amdgpu_vkms_crtc_atomic_flush,
.atomic_enable  = amdgpu_vkms_crtc_atomic_enable,
.atomic_disable = amdgpu_vkms_crtc_atomic_disable,
-- 
2.25.1



Re: [PATCH v2 2/2] drm: Move nomodeset kernel parameter to the DRM subsystem

2021-11-05 Thread Thomas Zimmermann

Hi

Am 05.11.21 um 10:22 schrieb Jani Nikula:

On Fri, 05 Nov 2021, Thomas Zimmermann  wrote:

Hi

Am 04.11.21 um 17:07 schrieb Javier Martinez Canillas:

The "nomodeset" kernel cmdline parameter is handled by the vgacon driver
but the exported vgacon_text_force() symbol is only used by DRM drivers.

It makes much more sense for the parameter logic to be in the subsystem
of the drivers that are making use of it.

Let's move the vgacon_text_force() function and related logic to the DRM
subsystem. While doing that, rename the function to drm_check_modeset()
which better reflects what the function is really used to test for.

Suggested-by: Daniel Vetter 
Signed-off-by: Javier Martinez Canillas 
---

Changes in v2:
- Conditionally build drm_nomodeset.o if CONFIG_VGA_CONSOLE is set.
- Squash patches to move nomodeset logic to DRM and do the renaming.
- Name the function drm_check_modeset() and make it return -ENODEV.

   drivers/gpu/drm/Makefile|  2 ++
   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  1 -
   drivers/gpu/drm/ast/ast_drv.c   |  1 -
   drivers/gpu/drm/drm_drv.c   |  9 +
   drivers/gpu/drm/drm_nomodeset.c | 26 +
   drivers/gpu/drm/i915/i915_module.c  |  2 --
   drivers/gpu/drm/mgag200/mgag200_drv.c   |  1 -
   drivers/gpu/drm/nouveau/nouveau_drm.c   |  1 -
   drivers/gpu/drm/qxl/qxl_drv.c   |  1 -
   drivers/gpu/drm/radeon/radeon_drv.c |  1 -
   drivers/gpu/drm/tiny/bochs.c|  1 -
   drivers/gpu/drm/tiny/cirrus.c   |  1 -
   drivers/gpu/drm/vboxvideo/vbox_drv.c|  1 -
   drivers/gpu/drm/virtio/virtgpu_drv.c|  1 -
   drivers/gpu/drm/vmwgfx/vmwgfx_drv.c |  1 -
   drivers/video/console/vgacon.c  | 21 
   include/drm/drm_mode_config.h   |  6 ++
   include/linux/console.h |  6 --
   18 files changed, 39 insertions(+), 44 deletions(-)
   create mode 100644 drivers/gpu/drm/drm_nomodeset.c

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 1c41156deb5f..c74810c285af 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -33,6 +33,8 @@ drm-$(CONFIG_DRM_PRIVACY_SCREEN) += drm_privacy_screen.o 
drm_privacy_screen_x86.
   
   obj-$(CONFIG_DRM_DP_AUX_BUS) += drm_dp_aux_bus.o
   
+obj-$(CONFIG_VGA_CONSOLE) += drm_nomodeset.o

+


This now depends on the VGA textmode console. Even if you have no VGA
console, you'd want drm_nomodeset.o. Simpledrm might be built-in and can
provide graphics. Non-PC systems don't even have a VGA device.


This was discussed in an earlier version, which had this builtin but the
header still had a stub for CONFIG_VGA_CONSOLE=n.


I think we really want a separate boolean config option that gets
selected by CONFIG_DRM.


Perhaps that should be a separate change on top.


Sure, make it a separate patch.

We want to make this work on ARM systems. I even have a request to 
replace offb on Power architecture by simpledrm. So the final config has 
to be system agnostic.


Best regards
Thomas



BR,
Jani.





   drm_cma_helper-y := drm_gem_cma_helper.o
   obj-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_cma_helper.o
   
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 7fde40d06181..b4b6993861e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -31,7 +31,6 @@
   #include "amdgpu_drv.h"
   
   #include 

-#include 
   #include 
   #include 
   #include 
diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
index 802063279b86..6222082c3082 100644
--- a/drivers/gpu/drm/ast/ast_drv.c
+++ b/drivers/gpu/drm/ast/ast_drv.c
@@ -26,7 +26,6 @@
* Authors: Dave Airlie 
*/
   
-#include 

   #include 
   #include 
   
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c

index 3fb567d62881..80b85b8ea776 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -986,12 +986,13 @@ EXPORT_SYMBOL(drm_dev_set_unique);
*/
   int drm_drv_enabled(const struct drm_driver *driver)
   {
-   if (vgacon_text_force()) {
+   int ret;
+
+   ret = drm_check_modeset();
+   if (ret)
DRM_INFO("%s driver is disabled\n", driver->name);
-   return -ENODEV;
-   }
   
-	return 0;

+   return ret;
   }
   EXPORT_SYMBOL(drm_drv_enabled);
   
diff --git a/drivers/gpu/drm/drm_nomodeset.c b/drivers/gpu/drm/drm_nomodeset.c

new file mode 100644
index ..6683e396d2c5
--- /dev/null
+++ b/drivers/gpu/drm/drm_nomodeset.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+
+static bool drm_nomodeset;
+
+int drm_check_modeset(void)
+{
+   return drm_nomodeset ? -ENODEV : 0;
+}
+EXPORT_SYMBOL(drm_check_modeset);
+
+static int __init disable_modeset(char *str)
+{
+   drm_nomodeset = true;
+
+   pr_warn("You have booted with nomodeset. This means your GPU drivers are 

Re: [PATCH v2 2/2] drm: Move nomodeset kernel parameter to the DRM subsystem

2021-11-05 Thread Jani Nikula
On Fri, 05 Nov 2021, Thomas Zimmermann  wrote:
> Hi
>
> Am 04.11.21 um 17:07 schrieb Javier Martinez Canillas:
>> The "nomodeset" kernel cmdline parameter is handled by the vgacon driver
>> but the exported vgacon_text_force() symbol is only used by DRM drivers.
>> 
>> It makes much more sense for the parameter logic to be in the subsystem
>> of the drivers that are making use of it.
>> 
>> Let's move the vgacon_text_force() function and related logic to the DRM
>> subsystem. While doing that, rename the function to drm_check_modeset()
>> which better reflects what the function is really used to test for.
>> 
>> Suggested-by: Daniel Vetter 
>> Signed-off-by: Javier Martinez Canillas 
>> ---
>> 
>> Changes in v2:
>> - Conditionally build drm_nomodeset.o if CONFIG_VGA_CONSOLE is set.
>> - Squash patches to move nomodeset logic to DRM and do the renaming.
>> - Name the function drm_check_modeset() and make it return -ENODEV.
>> 
>>   drivers/gpu/drm/Makefile|  2 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  1 -
>>   drivers/gpu/drm/ast/ast_drv.c   |  1 -
>>   drivers/gpu/drm/drm_drv.c   |  9 +
>>   drivers/gpu/drm/drm_nomodeset.c | 26 +
>>   drivers/gpu/drm/i915/i915_module.c  |  2 --
>>   drivers/gpu/drm/mgag200/mgag200_drv.c   |  1 -
>>   drivers/gpu/drm/nouveau/nouveau_drm.c   |  1 -
>>   drivers/gpu/drm/qxl/qxl_drv.c   |  1 -
>>   drivers/gpu/drm/radeon/radeon_drv.c |  1 -
>>   drivers/gpu/drm/tiny/bochs.c|  1 -
>>   drivers/gpu/drm/tiny/cirrus.c   |  1 -
>>   drivers/gpu/drm/vboxvideo/vbox_drv.c|  1 -
>>   drivers/gpu/drm/virtio/virtgpu_drv.c|  1 -
>>   drivers/gpu/drm/vmwgfx/vmwgfx_drv.c |  1 -
>>   drivers/video/console/vgacon.c  | 21 
>>   include/drm/drm_mode_config.h   |  6 ++
>>   include/linux/console.h |  6 --
>>   18 files changed, 39 insertions(+), 44 deletions(-)
>>   create mode 100644 drivers/gpu/drm/drm_nomodeset.c
>> 
>> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
>> index 1c41156deb5f..c74810c285af 100644
>> --- a/drivers/gpu/drm/Makefile
>> +++ b/drivers/gpu/drm/Makefile
>> @@ -33,6 +33,8 @@ drm-$(CONFIG_DRM_PRIVACY_SCREEN) += drm_privacy_screen.o 
>> drm_privacy_screen_x86.
>>   
>>   obj-$(CONFIG_DRM_DP_AUX_BUS) += drm_dp_aux_bus.o
>>   
>> +obj-$(CONFIG_VGA_CONSOLE) += drm_nomodeset.o
>> +
>
> This now depends on the VGA textmode console. Even if you have no VGA 
> console, you'd want drm_nomodeset.o. Simpledrm might be built-in and can 
> provide graphics. Non-PC systems don't even have a VGA device.

This was discussed in an earlier version, which had this builtin but the
header still had a stub for CONFIG_VGA_CONSOLE=n.

> I think we really want a separate boolean config option that gets 
> selected by CONFIG_DRM.

Perhaps that should be a separate change on top.

BR,
Jani.

>
>
>>   drm_cma_helper-y := drm_gem_cma_helper.o
>>   obj-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_cma_helper.o
>>   
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 7fde40d06181..b4b6993861e6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -31,7 +31,6 @@
>>   #include "amdgpu_drv.h"
>>   
>>   #include 
>> -#include 
>>   #include 
>>   #include 
>>   #include 
>> diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
>> index 802063279b86..6222082c3082 100644
>> --- a/drivers/gpu/drm/ast/ast_drv.c
>> +++ b/drivers/gpu/drm/ast/ast_drv.c
>> @@ -26,7 +26,6 @@
>>* Authors: Dave Airlie 
>>*/
>>   
>> -#include 
>>   #include 
>>   #include 
>>   
>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> index 3fb567d62881..80b85b8ea776 100644
>> --- a/drivers/gpu/drm/drm_drv.c
>> +++ b/drivers/gpu/drm/drm_drv.c
>> @@ -986,12 +986,13 @@ EXPORT_SYMBOL(drm_dev_set_unique);
>>*/
>>   int drm_drv_enabled(const struct drm_driver *driver)
>>   {
>> -if (vgacon_text_force()) {
>> +int ret;
>> +
>> +ret = drm_check_modeset();
>> +if (ret)
>>  DRM_INFO("%s driver is disabled\n", driver->name);
>> -return -ENODEV;
>> -}
>>   
>> -return 0;
>> +return ret;
>>   }
>>   EXPORT_SYMBOL(drm_drv_enabled);
>>   
>> diff --git a/drivers/gpu/drm/drm_nomodeset.c 
>> b/drivers/gpu/drm/drm_nomodeset.c
>> new file mode 100644
>> index ..6683e396d2c5
>> --- /dev/null
>> +++ b/drivers/gpu/drm/drm_nomodeset.c
>> @@ -0,0 +1,26 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +#include 
>> +#include 
>> +
>> +static bool drm_nomodeset;
>> +
>> +int drm_check_modeset(void)
>> +{
>> +return drm_nomodeset ? -ENODEV : 0;
>> +}
>> +EXPORT_SYMBOL(drm_check_modeset);
>> +
>> +static int __init disable_modeset(char *str)
>> +{
>> +drm_nomodeset = true;
>> +
>> +pr_warn("You have booted with nomodeset. 

Re: [PATCH v2 2/2] drm: Move nomodeset kernel parameter to the DRM subsystem

2021-11-05 Thread Thomas Zimmermann

Hi

Am 04.11.21 um 17:07 schrieb Javier Martinez Canillas:

The "nomodeset" kernel cmdline parameter is handled by the vgacon driver
but the exported vgacon_text_force() symbol is only used by DRM drivers.

It makes much more sense for the parameter logic to be in the subsystem
of the drivers that are making use of it.

Let's move the vgacon_text_force() function and related logic to the DRM
subsystem. While doing that, rename the function to drm_check_modeset()
which better reflects what the function is really used to test for.

Suggested-by: Daniel Vetter 
Signed-off-by: Javier Martinez Canillas 
---

Changes in v2:
- Conditionally build drm_nomodeset.o if CONFIG_VGA_CONSOLE is set.
- Squash patches to move nomodeset logic to DRM and do the renaming.
- Name the function drm_check_modeset() and make it return -ENODEV.

  drivers/gpu/drm/Makefile|  2 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  1 -
  drivers/gpu/drm/ast/ast_drv.c   |  1 -
  drivers/gpu/drm/drm_drv.c   |  9 +
  drivers/gpu/drm/drm_nomodeset.c | 26 +
  drivers/gpu/drm/i915/i915_module.c  |  2 --
  drivers/gpu/drm/mgag200/mgag200_drv.c   |  1 -
  drivers/gpu/drm/nouveau/nouveau_drm.c   |  1 -
  drivers/gpu/drm/qxl/qxl_drv.c   |  1 -
  drivers/gpu/drm/radeon/radeon_drv.c |  1 -
  drivers/gpu/drm/tiny/bochs.c|  1 -
  drivers/gpu/drm/tiny/cirrus.c   |  1 -
  drivers/gpu/drm/vboxvideo/vbox_drv.c|  1 -
  drivers/gpu/drm/virtio/virtgpu_drv.c|  1 -
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c |  1 -
  drivers/video/console/vgacon.c  | 21 
  include/drm/drm_mode_config.h   |  6 ++
  include/linux/console.h |  6 --
  18 files changed, 39 insertions(+), 44 deletions(-)
  create mode 100644 drivers/gpu/drm/drm_nomodeset.c

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 1c41156deb5f..c74810c285af 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -33,6 +33,8 @@ drm-$(CONFIG_DRM_PRIVACY_SCREEN) += drm_privacy_screen.o 
drm_privacy_screen_x86.
  
  obj-$(CONFIG_DRM_DP_AUX_BUS) += drm_dp_aux_bus.o
  
+obj-$(CONFIG_VGA_CONSOLE) += drm_nomodeset.o

+


This now depends on the VGA textmode console. Even if you have no VGA 
console, you'd want drm_nomodeset.o. Simpledrm might be built-in and can 
provide graphics. Non-PC systems don't even have a VGA device.


I think we really want a separate boolean config option that gets 
selected by CONFIG_DRM.




  drm_cma_helper-y := drm_gem_cma_helper.o
  obj-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_cma_helper.o
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 7fde40d06181..b4b6993861e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -31,7 +31,6 @@
  #include "amdgpu_drv.h"
  
  #include 

-#include 
  #include 
  #include 
  #include 
diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
index 802063279b86..6222082c3082 100644
--- a/drivers/gpu/drm/ast/ast_drv.c
+++ b/drivers/gpu/drm/ast/ast_drv.c
@@ -26,7 +26,6 @@
   * Authors: Dave Airlie 
   */
  
-#include 

  #include 
  #include 
  
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c

index 3fb567d62881..80b85b8ea776 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -986,12 +986,13 @@ EXPORT_SYMBOL(drm_dev_set_unique);
   */
  int drm_drv_enabled(const struct drm_driver *driver)
  {
-   if (vgacon_text_force()) {
+   int ret;
+
+   ret = drm_check_modeset();
+   if (ret)
DRM_INFO("%s driver is disabled\n", driver->name);
-   return -ENODEV;
-   }
  
-	return 0;

+   return ret;
  }
  EXPORT_SYMBOL(drm_drv_enabled);
  
diff --git a/drivers/gpu/drm/drm_nomodeset.c b/drivers/gpu/drm/drm_nomodeset.c

new file mode 100644
index ..6683e396d2c5
--- /dev/null
+++ b/drivers/gpu/drm/drm_nomodeset.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+
+static bool drm_nomodeset;
+
+int drm_check_modeset(void)
+{
+   return drm_nomodeset ? -ENODEV : 0;
+}
+EXPORT_SYMBOL(drm_check_modeset);
+
+static int __init disable_modeset(char *str)
+{
+   drm_nomodeset = true;
+
+   pr_warn("You have booted with nomodeset. This means your GPU drivers are 
DISABLED\n");
+   pr_warn("Any video related functionality will be severely degraded, and you 
may not even be able to suspend the system properly\n");
+   pr_warn("Unless you actually understand what nomodeset does, you should 
reboot without enabling it\n");


I'd update this text to be less sensational.


+
+   return 1;
+}
+
+/* Disable kernel modesetting */
+__setup("nomodeset", disable_modeset);
diff --git a/drivers/gpu/drm/i915/i915_module.c 
b/drivers/gpu/drm/i915/i915_module.c
index 45cb3e540eff..c890c1ca20c4 100644
--- 

Re: [PATCH v2 1/2] drm: Add a drm_drv_enabled() to check if drivers should be enabled

2021-11-05 Thread Thomas Zimmermann

Hi

Am 04.11.21 um 21:09 schrieb Javier Martinez Canillas:

Hello Jani,

On 11/4/21 20:57, Jani Nikula wrote:

On Thu, 04 Nov 2021, Javier Martinez Canillas  wrote:

+/**
+ * drm_drv_enabled - Checks if a DRM driver can be enabled
+ * @driver: DRM driver to check
+ *
+ * Checks whether a DRM driver can be enabled or not. This may be the case
+ * if the "nomodeset" kernel command line parameter is used.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int drm_drv_enabled(const struct drm_driver *driver)


Jani mentioned that i915 absolutely wants this to run from the 
module_init function. Best is to drop the parameter.



+{
+   if (vgacon_text_force()) {
+   DRM_INFO("%s driver is disabled\n", driver->name);
+   return -ENODEV;
+   }


If we run this from within a module_init function, we'd get plenty of 
these warnings if drivers are compiled into the kernel. Maybe simply 
remove the message. There's already a warning printed by the nomodeset 
handler.



+
+   return 0;
+}
+EXPORT_SYMBOL(drm_drv_enabled);


The name implies a bool return, but it's not.

if (drm_drv_enabled(...)) {
/* surprise, it's disabled! */
}



It used to return a bool in v2 but Thomas suggested an int instead to
have consistency on the errno code that was returned by the callers.

I should probably name that function differently to avoid confusion.


Yes, please.

Best regards
Thomas



But I think you are correct and this change is caused too much churn
for not that much benefit, specially since is unclear that there might
be another condition to prevent a DRM driver to load besides nomodeset.

I'll just drop this patch and post only #2 but making drivers to test
using the drm_check_modeset() function (which doesn't have a name that
implies a bool return).



BR,
Jani.





Best regards,



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature


Kernel WARNING at

2021-11-05 Thread Zzy Wysm
I found the following warning in my log this evening.  I don’t know if or how 
it can be reproduced.

Linux 5.10.77 amd64.  Kernel config attached.  (The kernel taint is merely 
because of the struct randomization plugin.)

zzy

..

kernel: [ cut here ]
kernel: refcount_t: addition on 0; use-after-free.
kernel: WARNING: CPU: 3 PID: 957 at lib/refcount.c:25 
refcount_warn_saturate+0x68/0xf0
kernel: CPU: 3 PID: 957 Comm: Xorg Tainted: GW   T 5.10.77 #1
kernel: Hardware name: Supermicro Super Server/H11SSL-NC, BIOS 2.1 02/21/2020
kernel: RIP: 0010:refcount_warn_saturate+0x68/0xf0
kernel: Code: 05 2c 9f f5 01 01 e8 83 82 9e 00 0f 0b c3 80 3d 1c 9f f5 01 00 75 
d3 48 c7 c7 f0 2e 1d 96 c6 05 0c 9f f5 01 01 e8 64 82 9e 00 <0f> 0b c3 80 3d ff 
9e f5 01 00 75 d3 48 c7 c7 f0 2e 1d 96 c6 05 0c 9f f5 01 01 e8 64 82 9e 00 <0f> 
0b c3 80 3d ff 9e f5 01 00 75 b4 48 c7 c7 c8 2e 1d 96 c6 05 ef
kernel: RSP: 0018:b4f201cc3c00 EFLAGS: 00010286
kernel: RAX:  RBX: 8a0c00ede458 RCX: 0027
kernel: RDX: 0027 RSI: fffe RDI: 8a2ace192e88
kernel: RBP: b4f201cc3d38 R08: 8a2ace192e80 R09: b4f201cc3a28
kernel: R10: 0001 R11: 0001 R12: 8a0c911e5000
kernel: R13: 8a0cba1fc580 R14: b4f201cc3cc8 R15: 8a0c1a44
kernel: FS:  () GS:8a2ace18() 
knlGS:
kernel: CS:  0010 DS:  ES:  CR0: 80050033
kernel: CR2: 736ef80ab660 CR3: 0011da80c000 CR4: 00350ee0
kernel: Call Trace:
kernel:  dma_resv_add_shared_fence+0x122/0x180
kernel:  amdgpu_gem_object_close+0x1c3/0x250
kernel:  drm_gem_object_release_handle+0x2b/0x90
kernel:  ? drm_gem_object_handle_put_unlocked+0xc0/0xc0
kernel:  idr_for_each+0x70/0xe0
kernel:  drm_gem_release+0x17/0x20
kernel:  drm_file_free.part.0+0x273/0x280
kernel:  drm_release+0x60/0xe0
kernel:  __fput+0x96/0x240
kernel:  task_work_run+0x5a/0x90
kernel:  do_exit+0x34e/0xaf0
kernel:  do_group_exit+0x34/0xb0
kernel:  __x64_sys_exit_group+0xf/0x10
kernel:  do_syscall_64+0x33/0x40
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
kernel: RIP: 0033:0x736f1c33a699
kernel: Code: Unable to access opcode bytes at RIP 0x736f1c33a66f.
kernel: RSP: 002b:7ffce40b21e8 EFLAGS: 0246 ORIG_RAX: 00e7
kernel: RAX: ffda RBX: 736f1c42f610 RCX: 736f1c33a699
kernel: RDX: 003c RSI: 00e7 RDI: 
kernel: RBP:  R08: fc80 R09: 
kernel: R10: 736f1cdbaa40 R11: 0246 R12: 736f1c42f610
kernel: R13: 0b14 R14: 736f1c42fae8 R15: 
kernel: ---[ end trace 52a8b244b766437f ]—

..



kernel-warning-config
Description: Binary data




Re: Kernel WARNING at

2021-11-05 Thread Zzy Wysm
Another use-after-free on the same computer that looks like it’s in amdgpu:

[ 2101.168138] [ cut here ]
[ 2101.168144] refcount_t: underflow; use-after-free.
[ 2101.168162] WARNING: CPU: 4 PID: 965 at lib/refcount.c:28 
refcount_warn_saturate+0xa6/0xf0
[ 2101.168167] CPU: 4 PID: 965 Comm: Xorg Tainted: GT 5.10.77 #1
[ 2101.168169] Hardware name: Supermicro Super Server/H11SSL-NC, BIOS 2.1 
02/21/2020
[ 2101.168174] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[ 2101.168177] Code: 05 9f 6b f3 01 01 e8 4a 0d 9d 00 0f 0b c3 80 3d 8d 6b f3 
01 00 75 95 48 c7 c7 e0 ee fc ad c6 05 7d 6b f3 01 01 e8 2b 0d 9d 00 <0f> 0b c3 
80 3d 6c 6b f3 01 00 0f 85 72 ff ff ff 48 c7 c7 38 ef fc
[ 2101.168180] RSP: 0018:b3778216fdc0 EFLAGS: 00010282
[ 2101.168183] RAX:  RBX:  RCX: 0027
[ 2101.168186] RDX: 0027 RSI: fffe RDI: 8ddb4e212ec8
[ 2101.168187] RBP: 8dbdf149f680 R08: 8ddb4e212ec0 R09: b3778216fbe8
[ 2101.168189] R10: 0001 R11: 0001 R12: 
[ 2101.168191] R13: 8dbca0b50c00 R14: 8dbca0b50c58 R15: 
[ 2101.168194] FS:  71af94f3fa40() GS:8ddb4e20() 
knlGS:
[ 2101.168196] CS:  0010 DS:  ES:  CR0: 80050033
[ 2101.168198] CR2: 71af0c7a9000 CR3: 0015680a2000 CR4: 00350ee0
[ 2101.168199] Call Trace:
[ 2101.168206]  dma_resv_list_free.part.0+0x66/0x70
[ 2101.168212]  drm_gem_object_release+0x28/0x50
[ 2101.168218]  amdgpu_bo_destroy+0x60/0x100
[ 2101.168223]  ttm_bo_release+0x1de/0x310
[ 2101.168226]  amdgpu_bo_unref+0x15/0x20
[ 2101.168230]  amdgpu_gem_object_free+0x2b/0x50
[ 2101.168236]  drm_gem_dmabuf_release+0x31/0x50
[ 2101.168239]  dma_buf_release+0x35/0x70
[ 2101.168244]  __dentry_kill+0xe5/0x150
[ 2101.168249]  __fput+0xe1/0x250
[ 2101.168254]  task_work_run+0x5a/0x90
[ 2101.168260]  exit_to_user_mode_prepare+0xbe/0xc0
[ 2101.168266]  syscall_exit_to_user_mode+0x22/0xf0
[ 2101.168271]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2101.168274] RIP: 0033:0x71af953aacc7
[ 2101.168278] Code: 00 00 00 48 8b 05 c9 91 0c 00 64 c7 00 26 00 00 00 48 c7 
c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 
f0 ff ff 73 01 c3 48 8b 0d 99 91 0c 00 f7 d8 64 89 01 48
[ 2101.168280] RSP: 002b:7ffd1fbd3628 EFLAGS: 0246 ORIG_RAX: 
0010
[ 2101.168284] RAX:  RBX: 7ffd1fbd3660 RCX: 71af953aacc7
[ 2101.168286] RDX: 7ffd1fbd3660 RSI: 40086409 RDI: 0010
[ 2101.168288] RBP: 40086409 R08: 0007 R09: 000e
[ 2101.168290] R10: 001b R11: 0246 R12: 561c5b999b98
[ 2101.168292] R13: 0010 R14: 561c5ba2b72c R15: 7ffd1fbd36a0
[ 2101.168295] ---[ end trace 921c49c63d8e1053 ]—

zzy


> On Nov 4, 2021, at 9:34 PM, Zzy Wysm  wrote:
> 
> I found the following warning in my log this evening.  I don’t know if or how 
> it can be reproduced.
> 
> Linux 5.10.77 amd64.  Kernel config attached.  (The kernel taint is merely 
> because of the struct randomization plugin.)
> 
> zzy
> 
> ..
> 
> kernel: [ cut here ]
> kernel: refcount_t: addition on 0; use-after-free.
> kernel: WARNING: CPU: 3 PID: 957 at lib/refcount.c:25 
> refcount_warn_saturate+0x68/0xf0
> kernel: CPU: 3 PID: 957 Comm: Xorg Tainted: GW   T 5.10.77 #1
> kernel: Hardware name: Supermicro Super Server/H11SSL-NC, BIOS 2.1 02/21/2020
> kernel: RIP: 0010:refcount_warn_saturate+0x68/0xf0
> kernel: Code: 05 2c 9f f5 01 01 e8 83 82 9e 00 0f 0b c3 80 3d 1c 9f f5 01 00 
> 75 d3 48 c7 c7 f0 2e 1d 96 c6 05 0c 9f f5 01 01 e8 64 82 9e 00 <0f> 0b c3 80 
> 3d ff 9e f5 01 00 75 d3 48 c7 c7 f0 2e 1d 96 c6 05 0c 9f f5 01 01 e8 64 82 9e 
> 00 <0f> 0b c3 80 3d ff 9e f5 01 00 75 b4 48 c7 c7 c8 2e 1d 96 c6 05 ef
> kernel: RSP: 0018:b4f201cc3c00 EFLAGS: 00010286
> kernel: RAX:  RBX: 8a0c00ede458 RCX: 0027
> kernel: RDX: 0027 RSI: fffe RDI: 8a2ace192e88
> kernel: RBP: b4f201cc3d38 R08: 8a2ace192e80 R09: b4f201cc3a28
> kernel: R10: 0001 R11: 0001 R12: 8a0c911e5000
> kernel: R13: 8a0cba1fc580 R14: b4f201cc3cc8 R15: 8a0c1a44
> kernel: FS:  () GS:8a2ace18() 
> knlGS:
> kernel: CS:  0010 DS:  ES:  CR0: 80050033
> kernel: CR2: 736ef80ab660 CR3: 0011da80c000 CR4: 00350ee0
> kernel: Call Trace:
> kernel:  dma_resv_add_shared_fence+0x122/0x180
> kernel:  amdgpu_gem_object_close+0x1c3/0x250
> kernel:  drm_gem_object_release_handle+0x2b/0x90
> kernel:  ? drm_gem_object_handle_put_unlocked+0xc0/0xc0
> kernel:  idr_for_each+0x70/0xe0
> kernel:  drm_gem_release+0x17/0x20
> kernel:  drm_file_free.part.0+0x273/0x280
> kernel:  drm_release+0x60/0xe0
> kernel:  __fput+0x96/0x240
> kernel:  

RE: [PATCH] drm/amdgpu: fix the Carrizo UVD hang on system reboot

2021-11-05 Thread Quan, Evan
[AMD Official Use Only]



> -Original Message-
> From: amd-gfx  On Behalf Of
> James Zhu
> Sent: Friday, November 5, 2021 10:03 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: fix the Carrizo UVD hang on system
> reboot
> 
> 
> On 2021-11-04 4:19 a.m., Evan Quan wrote:
> > It's confirmed that on some APUs the interaction with SMU about DPM
> > disablement will power off the UVD completely. Thus the succeeding
> > interactions with UVD during the reboot will trigger hard hang. To
> > workaround this issue, we will skip the dpm disablement on APUs.
> >
> > Signed-off-by: Evan Quan 
> > Change-Id: I4340cc2fb0fd94f439cbac5d4963fe920866bc13
> > ---
> >   drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 20 ++
> >   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 30
> +++
> >   drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 18 +---
> >   drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 18 +---
> >   4 files changed, 52 insertions(+), 34 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> > b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> > index c108b8381795..67ec13622e51 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> > @@ -238,15 +238,17 @@ static int uvd_v4_2_suspend(void *handle)
> >  */
> > cancel_delayed_work_sync(>uvd.idle_work);
> >
> > -   if (adev->pm.dpm_enabled) {
> > -   amdgpu_dpm_enable_uvd(adev, false);
> 
> [JZ] Hi Evan, VCN code put amdgpu_dpm_enable_uvd(false) at the end of
> stop.  Can we do the same for uvd/vce?
[Quan, Evan] Sounds reasonable to me. Actually Lijo provided some insights 
which enlightened me.
I will drop this patch and provide a new one.

BR
Evan
> 
> Here, it is possible that some dec/enc jobs are still running when dpm is
> called. I am not sure if this situation caused hard hang during reboot.
> 
> > -   } else {
> > -   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > -   /* shutdown the UVD block */
> > -   amdgpu_device_ip_set_powergating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > -  AMD_PG_STATE_GATE);
> > -   amdgpu_device_ip_set_clockgating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > -  AMD_CG_STATE_GATE);
> > +   if (!(adev->flags & AMD_IS_APU)) {
> > +   if (adev->pm.dpm_enabled) {
> > +   amdgpu_dpm_enable_uvd(adev, false);
> > +   } else {
> > +   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > +   /* shutdown the UVD block */
> > +   amdgpu_device_ip_set_powergating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > +
> AMD_PG_STATE_GATE);
> > +   amdgpu_device_ip_set_clockgating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > +
> AMD_CG_STATE_GATE);
> > +   }
> > }
> >
> > r = uvd_v4_2_hw_fini(adev);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > index 2d558c2f417d..60d05ec8c953 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > @@ -560,15 +560,27 @@ static int uvd_v6_0_suspend(void *handle)
> >  */
> > cancel_delayed_work_sync(>uvd.idle_work);
> >
> > -   if (adev->pm.dpm_enabled) {
> > -   amdgpu_dpm_enable_uvd(adev, false);
> > -   } else {
> > -   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > -   /* shutdown the UVD block */
> > -   amdgpu_device_ip_set_powergating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > -  AMD_PG_STATE_GATE);
> > -   amdgpu_device_ip_set_clockgating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > -  AMD_CG_STATE_GATE);
> > +   /*
> > +* It's confirmed that on some APUs the interaction with SMU(about
> DPM disablement)
> > +* will power off the UVD. That will make the succeeding interactions
> with UVD on the
> > +* suspend path impossible. And the system will hang due to that. To
> workaround the
> > +* issue, we will skip the dpm disablement on APUs.
> > +*
> > +* TODO: a better solution is to reorg the action chains performed on
> suspend and make
> > +* the dpm disablement the last one. But that will involve a lot and
> needs MM team's
> > +* help.
> > +*/
> > +   if (!(adev->flags & AMD_IS_APU)) {
> > +   if (adev->pm.dpm_enabled) {
> > +   amdgpu_dpm_enable_uvd(adev, false);
> > +   } else {
> > +   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > +   /* shutdown the UVD block */
> > +   amdgpu_device_ip_set_powergating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > +
> AMD_PG_STATE_GATE);
> > +   amdgpu_device_ip_set_clockgating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > +
> AMD_CG_STATE_GATE);
> > +   }
> >   

RE: [PATCH] drm/amdgpu: fix the Carrizo UVD hang on system reboot

2021-11-05 Thread Quan, Evan
[AMD Official Use Only]



> -Original Message-
> From: Lazar, Lijo 
> Sent: Thursday, November 4, 2021 4:55 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: Re: [PATCH] drm/amdgpu: fix the Carrizo UVD hang on system
> reboot
> 
> 
> 
> On 11/4/2021 1:49 PM, Evan Quan wrote:
> > It's confirmed that on some APUs the interaction with SMU about DPM
> > disablement will power off the UVD completely. Thus the succeeding
> > interactions with UVD during the reboot will trigger hard hang. To
> > workaround this issue, we will skip the dpm disablement on APUs.
> >
> > Signed-off-by: Evan Quan 
> > Change-Id: I4340cc2fb0fd94f439cbac5d4963fe920866bc13
> > ---
> >   drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 20 ++
> >   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 30
> +++
> >   drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 18 +---
> >   drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 18 +---
> >   4 files changed, 52 insertions(+), 34 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> > b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> > index c108b8381795..67ec13622e51 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
> > @@ -238,15 +238,17 @@ static int uvd_v4_2_suspend(void *handle)
> >  */
> > cancel_delayed_work_sync(>uvd.idle_work);
> 
> If the idle work handler had already started execution, it also goes through
> the same logic. Wouldn't that be a problem? The other case is if decode work
> is already completed before suspend is called, then also it disables DPM. Not
> sure how it works then, or is this caused by a second atempt to power off
> again after idle work is executed?
[Quan, Evan] Good point. Yes, maybe there should not be 2nd attempt when the 
target IP is already in the desired state.
Let me confirm with Boris.

BR
Evan
> 
> Thanks,
> Lijo
> 
> >
> > -   if (adev->pm.dpm_enabled) {
> > -   amdgpu_dpm_enable_uvd(adev, false);
> > -   } else {
> > -   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > -   /* shutdown the UVD block */
> > -   amdgpu_device_ip_set_powergating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > -  AMD_PG_STATE_GATE);
> > -   amdgpu_device_ip_set_clockgating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > -  AMD_CG_STATE_GATE);
> > +   if (!(adev->flags & AMD_IS_APU)) {
> > +   if (adev->pm.dpm_enabled) {
> > +   amdgpu_dpm_enable_uvd(adev, false);
> > +   } else {
> > +   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > +   /* shutdown the UVD block */
> > +   amdgpu_device_ip_set_powergating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > +
> AMD_PG_STATE_GATE);
> > +   amdgpu_device_ip_set_clockgating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > +
> AMD_CG_STATE_GATE);
> > +   }
> > }
> >
> > r = uvd_v4_2_hw_fini(adev);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > index 2d558c2f417d..60d05ec8c953 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
> > @@ -560,15 +560,27 @@ static int uvd_v6_0_suspend(void *handle)
> >  */
> > cancel_delayed_work_sync(>uvd.idle_work);
> >
> > -   if (adev->pm.dpm_enabled) {
> > -   amdgpu_dpm_enable_uvd(adev, false);
> > -   } else {
> > -   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > -   /* shutdown the UVD block */
> > -   amdgpu_device_ip_set_powergating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > -  AMD_PG_STATE_GATE);
> > -   amdgpu_device_ip_set_clockgating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > -  AMD_CG_STATE_GATE);
> > +   /*
> > +* It's confirmed that on some APUs the interaction with SMU(about
> DPM disablement)
> > +* will power off the UVD. That will make the succeeding interactions
> with UVD on the
> > +* suspend path impossible. And the system will hang due to that. To
> workaround the
> > +* issue, we will skip the dpm disablement on APUs.
> > +*
> > +* TODO: a better solution is to reorg the action chains performed on
> suspend and make
> > +* the dpm disablement the last one. But that will involve a lot and
> needs MM team's
> > +* help.
> > +*/
> > +   if (!(adev->flags & AMD_IS_APU)) {
> > +   if (adev->pm.dpm_enabled) {
> > +   amdgpu_dpm_enable_uvd(adev, false);
> > +   } else {
> > +   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
> > +   /* shutdown the UVD block */
> > +   amdgpu_device_ip_set_powergating_state(adev,
> AMD_IP_BLOCK_TYPE_UVD,
> > +
> AMD_PG_STATE_GATE);
> > + 

Re: [PATCH 1/1] drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs

2021-11-05 Thread Christian König

Am 05.11.21 um 00:05 schrieb Felix Kuehling:

If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be
freed when the amdgpu_bo still lives on. Free the kfd_bo struct in the
release_notify callback then the amdgpu_bo is freed.

Signed-off-by: Felix Kuehling 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h   |  4 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 +---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |  2 +-
  3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 4accd584886b..5f658823a637 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -307,7 +307,7 @@ void amdgpu_amdkfd_ras_poison_consumption_handler(struct 
amdgpu_device *adev);
  void amdgpu_amdkfd_gpuvm_init_mem_limits(void);
  void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev,
struct amdgpu_vm *vm);
-void amdgpu_amdkfd_unreserve_memory_limit(struct amdgpu_bo *bo);
+void amdgpu_amdkfd_release_notify(struct amdgpu_bo *bo);
  void amdgpu_amdkfd_reserve_system_mem(uint64_t size);
  #else
  static inline
@@ -322,7 +322,7 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device 
*adev,
  }
  
  static inline

-void amdgpu_amdkfd_unreserve_memory_limit(struct amdgpu_bo *bo)
+void amdgpu_amdkfd_release_notify(struct amdgpu_bo *bo)
  {
  }
  #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 5174762f0b46..94fccf0b47ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -201,7 +201,7 @@ static void unreserve_mem_limit(struct amdgpu_device *adev,
spin_unlock(_mem_limit.mem_limit_lock);
  }
  
-void amdgpu_amdkfd_unreserve_memory_limit(struct amdgpu_bo *bo)

+void amdgpu_amdkfd_release_notify(struct amdgpu_bo *bo)
  {
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
u32 domain = bo->preferred_domains;
@@ -213,6 +213,8 @@ void amdgpu_amdkfd_unreserve_memory_limit(struct amdgpu_bo 
*bo)
}
  
  	unreserve_mem_limit(adev, amdgpu_bo_size(bo), domain, sg);

+
+   kfree(bo->kfd_bo);
  }
  
  
@@ -1599,9 +1601,13 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(

drm_vma_node_revoke(>bo->tbo.base.vma_node, drm_priv);
if (mem->dmabuf)
dma_buf_put(mem->dmabuf);
-   drm_gem_object_put(>bo->tbo.base);
mutex_destroy(>lock);
-   kfree(mem);
+
+   /* If this releases the last reference, it will end up calling
+* amdgpu_amdkfd_release_notify and kfree the mem struct. That's why
+* this needs to be the last call here.
+*/
+   drm_gem_object_put(>bo->tbo.base);
  
  	return ret;

  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6b25982a9077..156002db24e1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1279,7 +1279,7 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object 
*bo)
abo = ttm_to_amdgpu_bo(bo);
  
  	if (abo->kfd_bo)

-   amdgpu_amdkfd_unreserve_memory_limit(abo);
+   amdgpu_amdkfd_release_notify(abo);
  
  	/* We only remove the fence if the resv has individualized. */

WARN_ON_ONCE(bo->type == ttm_bo_type_kernel