Re: [PATCH v2 3/3] drm/amdgpu: recover gart table at resume

2021-10-19 Thread Christian König

Am 19.10.21 um 20:14 schrieb Nirmoy Das:

Get rid of pinning/unpinning the GART BO at resume/suspend and
instead pin it only once and try to recover the GART content
at resume time. This is much more stable in case there
is an OOM situation at the 2nd call to amdgpu_device_evict_resources()
while evicting the GART table.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 42 --
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c |  9 ++---
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 10 +++---
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 10 +++---
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  9 ++---
  6 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5807df52031c..f69e613805db 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3941,10 +3941,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
amdgpu_fence_driver_hw_fini(adev);

amdgpu_device_ip_suspend_phase2(adev);
-   /* This second call to evict device resources is to evict
-* the gart page table using the CPU.
-*/
-   amdgpu_device_evict_resources(adev);

return 0;
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index d3e4203f6217..97a9f61fa106 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -107,33 +107,37 @@ void amdgpu_gart_dummy_page_fini(struct amdgpu_device 
*adev)
   *
   * @adev: amdgpu_device pointer
   *
- * Allocate video memory for GART page table
+ * Allocate and pin video memory for GART page table
   * (pcie r4xx, r5xx+).  These asics require the
   * gart table to be in video memory.
   * Returns 0 for success, error for failure.
   */
  int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev)
  {
+   struct amdgpu_bo_param bp;
int r;

-   if (adev->gart.bo == NULL) {
-   struct amdgpu_bo_param bp;
-
-   memset(&bp, 0, sizeof(bp));
-   bp.size = adev->gart.table_size;
-   bp.byte_align = PAGE_SIZE;
-   bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
-   bp.flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
-   bp.type = ttm_bo_type_kernel;
-   bp.resv = NULL;
-   bp.bo_ptr_size = sizeof(struct amdgpu_bo);
-
-   r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
-   if (r) {
-   return r;
-   }
-   }
+   if (adev->gart.bo != NULL)
+   return 0;
+
+   memset(&bp, 0, sizeof(bp));
+   bp.size = adev->gart.table_size;
+   bp.byte_align = PAGE_SIZE;
+   bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
+   bp.flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
+   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+   bp.type = ttm_bo_type_kernel;
+   bp.resv = NULL;
+   bp.bo_ptr_size = sizeof(struct amdgpu_bo);
+
+   r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
+   if (r)
+   return r;
+
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+


Instead of all this you should be able to use amdgpu_bo_create_kernel().


return 0;
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 3ec5ff5a6dbe..75d584e1b0e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,9 +992,11 @@ static int gmc_v10_0_gart_enable(struct amdgpu_device 
*adev)
return -EINVAL;
}

-   r = amdgpu_gart_table_vram_pin(adev);
-   if (r)
-   return r;
+   if (adev->in_suspend) {
+   r = amdgpu_gtt_mgr_recover(adev);
+   if (r)
+   return r;
+   }


Please drop the in_suspend check here.

If I'm not completely mistaken, the GTT domain should already be 
initialized here, and if it's not, then we can easily check for that in 
amdgpu_gtt_mgr_recover.


Christian.



r = adev->gfxhub.funcs->gart_enable(adev);
if (r)
@@ -1062,7 +1064,6 @@ static void gmc_v10_0_gart_disable(struct amdgpu_device 
*adev)
  {
adev->gfxhub.funcs->gart_disable(adev);
adev->mmhub.funcs->gart_disable(adev);
-   amdgpu_gart_table_vram_unpin(adev);
  }

  static int gmc_v10_0_hw_fini(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 0a50fdaced7e..02e90d9443c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -620,9 +620,12 @@ static int gmc_v7_0_gart_enable(struct amdgpu_device *adev)
	dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
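
Christian's amdgpu_bo_create_kernel() suggestion above would collapse the
open-coded create-plus-pin sequence into a single helper call, since that
helper creates, pins and kmaps the BO in one step. A rough, untested sketch
of what amdgpu_gart_table_vram_alloc() could then look like (an assumption
about the final shape of the patch, not the patch itself):

```c
/* Hypothetical rework: amdgpu_bo_create_kernel() creates, pins and
 * kmaps the BO, so the separate amdgpu_gart_table_vram_pin() call
 * and the manual amdgpu_bo_param setup both go away. */
int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev)
{
	if (adev->gart.bo != NULL)
		return 0;

	return amdgpu_bo_create_kernel(adev, adev->gart.table_size,
				       PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
				       &adev->gart.bo, NULL,
				       (void *)&adev->gart.ptr);
}
```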

Re: [PATCH 1/3] drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr

2021-10-19 Thread Christian König

Am 19.10.21 um 20:14 schrieb Nirmoy Das:

Do not allow the exported amdgpu_gtt_mgr_*() functions to accept
a ttm_resource_manager pointer. There is also no need
to force other modules to call a TTM function just to
eventually reach the gtt_mgr functions.


That's a rather bad idea, I think.

The GTT and VRAM managers work on their respective objects, not on the 
adev directly.


Christian.



Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 31 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |  4 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +--
  4 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 41ce86244144..5807df52031c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4287,7 +4287,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device 
*adev,
  
  	amdgpu_virt_init_data_exchange(adev);

/* we need recover gart prior to run SMC/CP/SDMA resume */
-   amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev, TTM_PL_TT));
+   amdgpu_gtt_mgr_recover(adev);
  
  	r = amdgpu_device_fw_loading(adev);

if (r)
@@ -4604,7 +4604,7 @@ int amdgpu_do_asic_reset(struct list_head 
*device_list_handle,
amdgpu_inc_vram_lost(tmp_adev);
}
  
-r = amdgpu_gtt_mgr_recover(ttm_manager_type(&tmp_adev->mman.bdev, TTM_PL_TT));

+   r = amdgpu_gtt_mgr_recover(tmp_adev);
if (r)
goto out;
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c

index c18f16b3be9c..5e41f8ef743a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -77,10 +77,8 @@ static ssize_t amdgpu_mem_info_gtt_used_show(struct device 
*dev,
  {
struct drm_device *ddev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(ddev);
-   struct ttm_resource_manager *man;
  
-	man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);

-   return sysfs_emit(buf, "%llu\n", amdgpu_gtt_mgr_usage(man));
+   return sysfs_emit(buf, "%llu\n", amdgpu_gtt_mgr_usage(adev));
  }
  
  static DEVICE_ATTR(mem_info_gtt_total, S_IRUGO,

@@ -206,14 +204,19 @@ static void amdgpu_gtt_mgr_del(struct 
ttm_resource_manager *man,
  /**
   * amdgpu_gtt_mgr_usage - return usage of GTT domain
   *
- * @man: TTM memory type manager
+ * @adev: amdgpu_device pointer
   *
   * Return how many bytes are used in the GTT domain
   */
-uint64_t amdgpu_gtt_mgr_usage(struct ttm_resource_manager *man)
+uint64_t amdgpu_gtt_mgr_usage(struct amdgpu_device *adev)
  {
-   struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
-   s64 result = man->size - atomic64_read(&mgr->available);
+   struct ttm_resource_manager *man;
+   struct amdgpu_gtt_mgr *mgr;
+   s64 result;
+
+   man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);
+   mgr = to_gtt_mgr(man);
+   result = man->size - atomic64_read(&mgr->available);
  
  	return (result > 0 ? result : 0) * PAGE_SIZE;

  }
@@ -221,19 +224,20 @@ uint64_t amdgpu_gtt_mgr_usage(struct ttm_resource_manager 
*man)
  /**
   * amdgpu_gtt_mgr_recover - re-init gart
   *
- * @man: TTM memory type manager
+ * @adev: amdgpu_device pointer
   *
   * Re-init the gart for each known BO in the GTT.
   */
-int amdgpu_gtt_mgr_recover(struct ttm_resource_manager *man)
+int amdgpu_gtt_mgr_recover(struct amdgpu_device *adev)
  {
-   struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
-   struct amdgpu_device *adev;
+   struct ttm_resource_manager *man;
+   struct amdgpu_gtt_mgr *mgr;
struct amdgpu_gtt_node *node;
struct drm_mm_node *mm_node;
int r = 0;
  
-	adev = container_of(mgr, typeof(*adev), mman.gtt_mgr);

+   man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);
+   mgr = to_gtt_mgr(man);
spin_lock(&mgr->lock);
drm_mm_for_each_node(mm_node, &mgr->mm) {
node = container_of(mm_node, typeof(*node), base.mm_nodes[0]);
@@ -260,6 +264,7 @@ static void amdgpu_gtt_mgr_debug(struct 
ttm_resource_manager *man,
 struct drm_printer *printer)
  {
struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
+   struct amdgpu_device *adev = container_of(mgr, typeof(*adev), 
mman.gtt_mgr);
  
  	spin_lock(&mgr->lock);

drm_mm_print(&mgr->mm, printer);
@@ -267,7 +272,7 @@ static void amdgpu_gtt_mgr_debug(struct 
ttm_resource_manager *man,
  
  	drm_printf(printer, "man size:%llu pages, gtt available:%lld pages, usage:%lluMB\n",

   man->size, (u64)atomic64_read(&mgr->available),
-  amdgpu_gtt_mgr_usage(man) >> 20);
+  amdgpu_gtt_mgr_usage(adev) >> 20);



[PATCH 13/13] drm/amdgpu: cleanup drm_mm and apply DRM buddy

2021-10-19 Thread Arunpravin
Remove drm_mm references and add DRM buddy
functions

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h  |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 233 +++
 2 files changed, 138 insertions(+), 99 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 639c7b41e30b..a8ac9902ab29 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -26,6 +26,7 @@
 
 #include 
 #include 
+#include 
 #include "amdgpu.h"
 
 #define AMDGPU_PL_GDS  (TTM_PL_PRIV + 0)
@@ -40,12 +41,13 @@
 
 struct amdgpu_vram_mgr {
struct ttm_resource_manager manager;
-   struct drm_mm mm;
+   struct drm_buddy_mm mm;
spinlock_t lock;
struct list_head reservations_pending;
struct list_head reserved_pages;
atomic64_t usage;
atomic64_t vis_usage;
+   uint64_t default_page_size;
 };
 
 struct amdgpu_gtt_mgr {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index a9182c59907a..0c55a5ea1ed1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -180,10 +180,10 @@ const struct attribute_group amdgpu_vram_mgr_attr_group = 
{
  * Calculate how many bytes of the MM node are inside visible VRAM
  */
 static u64 amdgpu_vram_mgr_vis_size(struct amdgpu_device *adev,
-   struct drm_mm_node *node)
+   struct drm_buddy_block *block)
 {
-   uint64_t start = node->start << PAGE_SHIFT;
-   uint64_t end = (node->size + node->start) << PAGE_SHIFT;
+   uint64_t start = node_start(block);
+   uint64_t end = start + node_size(block);
 
if (start >= adev->gmc.visible_vram_size)
return 0;
@@ -204,9 +204,9 @@ u64 amdgpu_vram_mgr_bo_visible_size(struct amdgpu_bo *bo)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
struct ttm_resource *res = bo->tbo.resource;
-   unsigned pages = res->num_pages;
-   struct drm_mm_node *mm;
-   u64 usage;
+   struct amdgpu_vram_mgr_node *node = to_amdgpu_vram_mgr_node(res);
+   struct drm_buddy_block *block;
+   u64 usage = 0;
 
if (amdgpu_gmc_vram_full_visible(&adev->gmc))
return amdgpu_bo_size(bo);
@@ -214,9 +214,8 @@ u64 amdgpu_vram_mgr_bo_visible_size(struct amdgpu_bo *bo)
if (res->start >= adev->gmc.visible_vram_size >> PAGE_SHIFT)
return 0;
 
-   mm = &container_of(res, struct ttm_range_mgr_node, base)->mm_nodes[0];
-   for (usage = 0; pages; pages -= mm->size, mm++)
-   usage += amdgpu_vram_mgr_vis_size(adev, mm);
+   list_for_each_entry(block, &node->blocks, link)
+   usage += amdgpu_vram_mgr_vis_size(adev, block);
 
return usage;
 }
@@ -226,21 +225,30 @@ static void amdgpu_vram_mgr_do_reserve(struct 
ttm_resource_manager *man)
 {
struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
struct amdgpu_device *adev = to_amdgpu_device(mgr);
-   struct drm_mm *mm = &mgr->mm;
+   struct drm_buddy_mm *mm = &mgr->mm;
struct amdgpu_vram_reservation *rsv, *temp;
+   struct drm_buddy_block *block;
uint64_t vis_usage;
 
list_for_each_entry_safe(rsv, temp, &mgr->reservations_pending, node) {
-   if (drm_mm_reserve_node(mm, &rsv->mm_node))
+   if (drm_buddy_alloc(mm, rsv->start, rsv->start + rsv->size,
+   rsv->size, rsv->min_size, &rsv->block,
+   rsv->flags))
continue;
 
-   dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
-   rsv->mm_node.start, rsv->mm_node.size);
+   block = list_first_entry_or_null(&rsv->block,
+struct drm_buddy_block,
+link);
 
-   vis_usage = amdgpu_vram_mgr_vis_size(adev, &rsv->mm_node);
-   atomic64_add(vis_usage, &mgr->vis_usage);
-   atomic64_add(rsv->mm_node.size << PAGE_SHIFT, &mgr->usage);
-   list_move(&rsv->node, &mgr->reserved_pages);
+   if (block) {
+   dev_dbg(adev->dev, "Reservation 0x%llx - %lld, 
Succeeded\n",
+   rsv->start, rsv->size);
+
+   vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
+   atomic64_add(vis_usage, &mgr->vis_usage);
+   atomic64_add(rsv->size, &mgr->usage);
+   list_move(&rsv->node, &mgr->reserved_pages);
+   }
}
 }
 
@@ -264,11 +272,15 @@ int amdgpu_vram_mgr_reserve_range(struct 
ttm_resource_manager *man,
return -ENOMEM;
 
INIT_LIST_HEAD(&rsv->node);
-   rsv

[PATCH 12/13] drm/amdgpu: add cursor support for drm buddy

2021-10-19 Thread Arunpravin
- Add res cursor support for drm buddy
- Replace the if..else statement with a switch statement

Signed-off-by: Arunpravin 
---
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h| 97 +++
 1 file changed, 78 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index acfa207cf970..2c17e948355e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
 #include 
 #include 
 
+#include "amdgpu_vram_mgr.h"
+
 /* state back for walking over vram_mgr and gtt_mgr allocations */
 struct amdgpu_res_cursor {
uint64_tstart;
uint64_tsize;
uint64_tremaining;
-   struct drm_mm_node  *node;
+   void*node;
+   uint32_tmem_type;
 };
 
 /**
@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
 {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
 
-   if (!res || res->mem_type == TTM_PL_SYSTEM) {
-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto err_out;
 
BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
 
-   node = to_ttm_range_mgr_node(res)->mm_nodes;
-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_node(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto err_out;
+
+   while (start >= node_size(block)) {
+   start -= node_size(block);
+
+   next = block->link.next;
+   if (next != head)
+   block = list_entry(next, struct 
drm_buddy_block, link);
+   }
+
+   cur->start = node_start(block) + start;
+   cur->size = min(node_size(block) - start, size);
+   cur->remaining = size;
+   cur->node = block;
+   break;
+   case TTM_PL_TT:
+   node = to_ttm_range_mgr_node(res)->mm_nodes;
+   while (start >= node->size << PAGE_SHIFT)
+   start -= node++->size << PAGE_SHIFT;
+
+   cur->start = (node->start << PAGE_SHIFT) + start;
+   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   cur->remaining = size;
+   cur->node = node;
+   break;
+   default:
+   goto err_out;
+   }
 
-   cur->start = (node->start << PAGE_SHIFT) + start;
-   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   return;
+
+err_out:
+   cur->start = start;
+   cur->size = size;
cur->remaining = size;
-   cur->node = node;
+   cur->node = NULL;
+   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
+   return;
 }
 
 /**
@@ -85,7 +124,9 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
  */
 static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t 
size)
 {
-   struct drm_mm_node *node = cur->node;
+   struct drm_buddy_block *block;
+   struct drm_mm_node *node;
+   struct list_head *next;
 
BUG_ON(size > cur->remaining);
 
@@ -99,9 +140,27 @@ static inline void amdgpu_res_next(struct amdgpu_res_cursor 
*cur, uint64_t size)
return;
}
 
-   cur->node = ++node;
-   cur->start = node->start << PAGE_SHIFT;
-   cur->size = min(node->size << PAGE_SHIFT, cur->remaining);
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   block = cur->node;
+
+   next = block->link.next;
+   block = list_entry(next, struct drm_buddy_block, link);
+
+   cur->node = block;
+   cur->start = node_start(block);
+   cur->size = min(node_size(block), cur->remaining);
+   break;
+   case TTM_PL_TT:
+   node = cur->node;
+
+   cur->node = ++node;
+   cur->start = node->start << PAGE_SHIFT;
+   cur->size = min(node->size << PAGE_SHIFT, cur->remaining);
+   break;
+   default:
+   return;
+   }

[PATCH 11/13] drm/amdgpu: move vram defines into a header

2021-10-19 Thread Arunpravin
Move vram defines and inline functions into
a header file

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 18 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 72 
 2 files changed, 73 insertions(+), 17 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 7b2b0980ec41..a9182c59907a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -29,25 +29,9 @@
 #include "amdgpu_vm.h"
 #include "amdgpu_res_cursor.h"
 #include "amdgpu_atomfirmware.h"
+#include "amdgpu_vram_mgr.h"
 #include "atom.h"
 
-struct amdgpu_vram_reservation {
-   struct list_head node;
-   struct drm_mm_node mm_node;
-};
-
-static inline struct amdgpu_vram_mgr *
-to_vram_mgr(struct ttm_resource_manager *man)
-{
-   return container_of(man, struct amdgpu_vram_mgr, manager);
-}
-
-static inline struct amdgpu_device *
-to_amdgpu_device(struct amdgpu_vram_mgr *mgr)
-{
-   return container_of(mgr, struct amdgpu_device, mman.vram_mgr);
-}
-
 /**
  * DOC: mem_info_vram_total
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
new file mode 100644
index ..fcab6475ccbb
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_VRAM_MGR_H__
+#define __AMDGPU_VRAM_MGR_H__
+
+#include 
+
+struct amdgpu_vram_mgr_node {
+   struct ttm_resource base;
+   struct list_head blocks;
+   unsigned long flags;
+};
+
+struct amdgpu_vram_reservation {
+   uint64_t start;
+   uint64_t size;
+   uint64_t min_size;
+   unsigned long flags;
+   struct list_head block;
+   struct list_head node;
+};
+
+static inline uint64_t node_start(struct drm_buddy_block *block)
+{
+   return drm_buddy_block_offset(block);
+}
+
+static inline uint64_t node_size(struct drm_buddy_block *block)
+{
+   return PAGE_SIZE << drm_buddy_block_order(block);
+}
+
+static inline struct amdgpu_vram_mgr_node *
+to_amdgpu_vram_mgr_node(struct ttm_resource *res)
+{
+   return container_of(res, struct amdgpu_vram_mgr_node, base);
+}
+
+static inline struct amdgpu_vram_mgr *
+to_vram_mgr(struct ttm_resource_manager *man)
+{
+   return container_of(man, struct amdgpu_vram_mgr, manager);
+}
+
+static inline struct amdgpu_device *
+to_amdgpu_device(struct amdgpu_vram_mgr *mgr)
+{
+   return container_of(mgr, struct amdgpu_device, mman.vram_mgr);
+}
+
+#endif
-- 
2.25.1



[PATCH 09/13] drm: remove i915 selftest config check

2021-10-19 Thread Arunpravin
The i915 buddy selftests will be moved to the drm selftests folder,
hence the config condition check may be removed.

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 5eb7c4187009..e7a5d6d47a37 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -689,10 +689,6 @@ void drm_buddy_print(struct drm_buddy_mm *mm, struct 
drm_printer *p)
 }
 EXPORT_SYMBOL(drm_buddy_print);
 
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/i915_buddy.c"
-#endif
-
 void drm_buddy_module_exit(void)
 {
kmem_cache_destroy(slab_blocks);
-- 
2.25.1



[PATCH 10/13] drm/i915: cleanup i915 buddy and apply DRM buddy

2021-10-19 Thread Arunpravin
Remove i915 buddy references and add DRM buddy
functions

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/i915/Makefile |  1 -
 drivers/gpu/drm/i915/i915_module.c|  3 -
 drivers/gpu/drm/i915/i915_scatterlist.c   | 11 +--
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 91 +--
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |  5 +-
 5 files changed, 53 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 467872cca027..fc5ca8c4ccb2 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -161,7 +161,6 @@ gem-y += \
 i915-y += \
  $(gem-y) \
  i915_active.o \
- i915_buddy.o \
  i915_cmd_parser.o \
  i915_gem_evict.o \
  i915_gem_gtt.o \
diff --git a/drivers/gpu/drm/i915/i915_module.c 
b/drivers/gpu/drm/i915/i915_module.c
index ab2295dd4500..121b4178c5ca 100644
--- a/drivers/gpu/drm/i915/i915_module.c
+++ b/drivers/gpu/drm/i915/i915_module.c
@@ -9,7 +9,6 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_object.h"
 #include "i915_active.h"
-#include "i915_buddy.h"
 #include "i915_params.h"
 #include "i915_pci.h"
 #include "i915_perf.h"
@@ -50,8 +49,6 @@ static const struct {
{ .init = i915_check_nomodeset },
{ .init = i915_active_module_init,
  .exit = i915_active_module_exit },
-   { .init = i915_buddy_module_init,
- .exit = i915_buddy_module_exit },
{ .init = i915_context_module_init,
  .exit = i915_context_module_exit },
{ .init = i915_gem_context_module_init,
diff --git a/drivers/gpu/drm/i915/i915_scatterlist.c 
b/drivers/gpu/drm/i915/i915_scatterlist.c
index 4a6712dca838..84d622aa32d2 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.c
+++ b/drivers/gpu/drm/i915/i915_scatterlist.c
@@ -5,10 +5,9 @@
  */
 
 #include "i915_scatterlist.h"
-
-#include "i915_buddy.h"
 #include "i915_ttm_buddy_manager.h"
 
+#include 
 #include 
 
 #include 
@@ -126,9 +125,9 @@ struct sg_table *i915_sg_from_buddy_resource(struct 
ttm_resource *res,
struct i915_ttm_buddy_resource *bman_res = to_ttm_buddy_resource(res);
const u64 size = res->num_pages << PAGE_SHIFT;
const u64 max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-   struct i915_buddy_mm *mm = bman_res->mm;
+   struct drm_buddy_mm *mm = bman_res->mm;
struct list_head *blocks = &bman_res->blocks;
-   struct i915_buddy_block *block;
+   struct drm_buddy_block *block;
struct scatterlist *sg;
struct sg_table *st;
resource_size_t prev_end;
@@ -151,8 +150,8 @@ struct sg_table *i915_sg_from_buddy_resource(struct 
ttm_resource *res,
list_for_each_entry(block, blocks, link) {
u64 block_size, offset;
 
-   block_size = min_t(u64, size, i915_buddy_block_size(mm, block));
-   offset = i915_buddy_block_offset(block);
+   block_size = min_t(u64, size, drm_buddy_block_size(mm, block));
+   offset = drm_buddy_block_offset(block);
 
while (block_size) {
u64 len;
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index d59fbb019032..d09ea6c83a27 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -7,15 +7,15 @@
 
 #include 
 #include 
+#include 
 
 #include "i915_ttm_buddy_manager.h"
 
-#include "i915_buddy.h"
 #include "i915_gem.h"
 
 struct i915_ttm_buddy_manager {
struct ttm_resource_manager manager;
-   struct i915_buddy_mm mm;
+   struct drm_buddy_mm mm;
struct list_head reserved;
struct mutex lock;
u64 default_page_size;
@@ -34,15 +34,12 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
 {
struct i915_ttm_buddy_manager *bman = to_buddy_manager(man);
struct i915_ttm_buddy_resource *bman_res;
-   struct i915_buddy_mm *mm = &bman->mm;
+   struct drm_buddy_mm *mm = &bman->mm;
unsigned long n_pages;
-   unsigned int min_order;
u64 min_page_size;
u64 size;
int err;
 
-   GEM_BUG_ON(place->fpfn || place->lpfn);
-
bman_res = kzalloc(sizeof(*bman_res), GFP_KERNEL);
if (!bman_res)
return -ENOMEM;
@@ -59,11 +56,12 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
min_page_size = bo->page_alignment << PAGE_SHIFT;
 
GEM_BUG_ON(min_page_size < mm->chunk_size);
-   min_order = ilog2(min_page_size) - ilog2(mm->chunk_size);
-   if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
+
+   if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
size = roundup_pow_of_two(size);
-   min_order = ilog2(size) - ilog2(mm->chunk_size);
-   }
+
+   if (place->fpfn || place->lpfn)
+   bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION;

[PATCH 08/13] drm: export functions and write description

2021-10-19 Thread Arunpravin
Export functions and write kerneldoc description

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c | 89 ++---
 1 file changed, 83 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 3e3303dd6658..5eb7c4187009 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -63,6 +63,18 @@ static void mark_split(struct drm_buddy_block *block)
list_del(&block->link);
 }
 
+/**
+ * drm_buddy_init - init memory manager
+ *
+ * @mm: DRM buddy manager to initialize
+ * @size: size in bytes to manage
+ * @chunk_size: minimum page size in bytes for our allocations
+ *
+ * Initializes the memory manager and its resources.
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
 int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size)
 {
unsigned int i;
@@ -144,7 +156,15 @@ int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 
chunk_size)
kfree(mm->free_list);
return -ENOMEM;
 }
+EXPORT_SYMBOL(drm_buddy_init);
 
+/**
+ * drm_buddy_fini - tear down the memory manager
+ *
+ * @mm: DRM buddy manager to free
+ *
+ * Cleanup memory manager resources and the freelist
+ */
 void drm_buddy_fini(struct drm_buddy_mm *mm)
 {
int i;
@@ -159,6 +179,7 @@ void drm_buddy_fini(struct drm_buddy_mm *mm)
kfree(mm->roots);
kfree(mm->free_list);
 }
+EXPORT_SYMBOL(drm_buddy_fini);
 
 static int split_block(struct drm_buddy_mm *mm,
   struct drm_buddy_block *block)
@@ -235,6 +256,12 @@ void drm_buddy_free(struct drm_buddy_mm *mm,
__drm_buddy_free(mm, block);
 }
 
+/**
+ * drm_buddy_free_list - free blocks
+ *
+ * @mm: DRM buddy manager
+ * @objects: input list head to free blocks
+ */
 void drm_buddy_free_list(struct drm_buddy_mm *mm, struct list_head *objects)
 {
struct drm_buddy_block *block, *on;
@@ -245,6 +272,7 @@ void drm_buddy_free_list(struct drm_buddy_mm *mm, struct 
list_head *objects)
}
INIT_LIST_HEAD(objects);
 }
+EXPORT_SYMBOL(drm_buddy_free_list);
 
 static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
 {
@@ -256,6 +284,20 @@ static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
return s1 <= s2 && e1 >= e2;
 }
 
+/**
+ * drm_buddy_free_unused_pages - free unused pages
+ *
+ * @mm: DRM buddy manager
+ * @actual_size: original size requested
+ * @blocks: output list head to add allocated blocks
+ *
+ * For contiguous allocations, we round up the size to the nearest
+ * power-of-two value; drivers consume the *actual* size, so the
+ * remaining portion is unused and can be freed.
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
 int drm_buddy_free_unused_pages(struct drm_buddy_mm *mm,
u64 actual_size,
struct list_head *blocks)
@@ -342,6 +384,7 @@ int drm_buddy_free_unused_pages(struct drm_buddy_mm *mm,
__drm_buddy_free(mm, block);
return err;
 }
+EXPORT_SYMBOL(drm_buddy_free_unused_pages);
 
 static struct drm_buddy_block *
 alloc_range(struct drm_buddy_mm *mm,
@@ -494,13 +537,31 @@ alloc_from_freelist(struct drm_buddy_mm *mm,
return ERR_PTR(err);
 }
 
-/*
- * Allocate power-of-two block. The order value here translates to:
+/**
+ * drm_buddy_alloc - allocate power-of-two blocks
+ *
+ * @mm: DRM buddy manager to allocate from
+ * @start: start of the allowed range for this block
+ * @end: end of the allowed range for this block
+ * @size: size of the allocation
+ * @min_page_size: alignment of the allocation
+ * @blocks: output list head to add allocated blocks
+ * @flags: DRM_BUDDY_*_ALLOCATION flags
+ *
+ * alloc_range() is invoked when range limitations are given; it
+ * traverses the tree and returns the desired block.
+ *
+ * alloc_from_freelist() is called when *no* range restrictions
+ * are enforced; it picks a block from the freelist.
+ *
+ * Blocks are allocated in order; the order value here translates to:
  *
- *   0 = 2^0 * mm->chunk_size
- *   1 = 2^1 * mm->chunk_size
- *   2 = 2^2 * mm->chunk_size
- *   ...
+ * 0 = 2^0 * mm->chunk_size
+ * 1 = 2^1 * mm->chunk_size
+ * 2 = 2^2 * mm->chunk_size
+ *
+ * Returns:
+ * 0 on success, error code on failure.
  */
 int drm_buddy_alloc(struct drm_buddy_mm *mm,
u64 start, u64 end, u64 size,
@@ -573,7 +634,15 @@ int drm_buddy_alloc(struct drm_buddy_mm *mm,
drm_buddy_free_list(mm, &allocated);
return err;
 }
+EXPORT_SYMBOL(drm_buddy_alloc);
 
+/**
+ * drm_buddy_block_print - print block information
+ *
+ * @mm: DRM buddy manager
+ * @block: DRM buddy block
+ * @p: DRM printer to use
+ */
 void drm_buddy_block_print(struct drm_buddy_mm *mm,
   struct drm_buddy_block *block,
   struct drm_printer *p)
@@ -583,7 +652,14 @@ void drm_buddy_block_print(struct drm_buddy_mm *mm,
 
drm_printf(p, "%#018llx-%#018llx: %llu\n", start, start + size, s

[PATCH 07/13] drm: Implement method to free unused pages

2021-10-19 Thread Arunpravin
On contiguous allocation, we round up the size
to the nearest power of 2; implement a function
to free the unused pages.

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c | 87 +
 include/drm/drm_buddy.h |  4 ++
 2 files changed, 91 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 42ce4f8f4e0e..3e3303dd6658 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -256,6 +256,93 @@ static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
return s1 <= s2 && e1 >= e2;
 }
 
+int drm_buddy_free_unused_pages(struct drm_buddy_mm *mm,
+   u64 actual_size,
+   struct list_head *blocks)
+{
+   struct drm_buddy_block *block;
+   struct drm_buddy_block *buddy;
+   u64 actual_start;
+   u64 actual_end;
+   LIST_HEAD(dfs);
+   u64 count = 0;
+   int err;
+
+   if (!list_is_singular(blocks))
+   return -EINVAL;
+
+   block = list_first_entry_or_null(blocks,
+struct drm_buddy_block,
+link);
+
+   if (!block)
+   return -EINVAL;
+
+   if (actual_size > drm_buddy_block_size(mm, block))
+   return -EINVAL;
+
+   if (actual_size == drm_buddy_block_size(mm, block))
+   return 0;
+
+   list_del(&block->link);
+
+   actual_start = drm_buddy_block_offset(block);
+   actual_end = actual_start + actual_size - 1;
+
+   if (drm_buddy_block_is_allocated(block))
+   mark_free(mm, block);
+
+   list_add(&block->tmp_link, &dfs);
+
+   while (1) {
+   block = list_first_entry_or_null(&dfs,
+struct drm_buddy_block,
+tmp_link);
+
+   if (!block)
+   break;
+
+   list_del(&block->tmp_link);
+
+   if (count == actual_size)
+   return 0;
+
+   if (contains(actual_start, actual_end, drm_buddy_block_offset(block),
+   (drm_buddy_block_offset(block) + drm_buddy_block_size(mm, block) - 1))) {
+   BUG_ON(!drm_buddy_block_is_free(block));
+   /* Allocate only required blocks */
+   mark_allocated(block);
+   mm->avail -= drm_buddy_block_size(mm, block);
+   list_add_tail(&block->link, blocks);
+   count += drm_buddy_block_size(mm, block);
+   continue;
+   }
+
+   if (drm_buddy_block_order(block) == 0)
+   continue;
+
+   if (!drm_buddy_block_is_split(block)) {
+   err = split_block(mm, block);
+
+   if (unlikely(err))
+   goto err_undo;
+   }
+
+   list_add(&block->right->tmp_link, &dfs);
+   list_add(&block->left->tmp_link, &dfs);
+   }
+
+   return -ENOSPC;
+
+err_undo:
+   buddy = get_buddy(block);
+   if (buddy &&
+   (drm_buddy_block_is_free(block) &&
+drm_buddy_block_is_free(buddy)))
+   __drm_buddy_free(mm, block);
+   return err;
+}
+
 static struct drm_buddy_block *
 alloc_range(struct drm_buddy_mm *mm,
u64 start, u64 end,
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 19c7e298613e..993312841140 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -145,6 +145,10 @@ int drm_buddy_alloc(struct drm_buddy_mm *mm,
struct list_head *blocks,
unsigned long flags);
 
+int drm_buddy_free_unused_pages(struct drm_buddy_mm *mm,
+   u64 actual_size,
+   struct list_head *blocks);
+
 void drm_buddy_free(struct drm_buddy_mm *mm, struct drm_buddy_block *block);
 
 void drm_buddy_free_list(struct drm_buddy_mm *mm, struct list_head *objects);
-- 
2.25.1
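The trimming walk in drm_buddy_free_unused_pages() above has a compact arithmetic interpretation: once a contiguous request has been rounded up to a power of 2, the sub-blocks that remain allocated after splitting away unused buddies are exactly the binary decomposition of actual_size. A minimal userspace sketch of that invariant (the trim_*/kept_* names are hypothetical, not kernel API):

```c
#include <assert.h>
#include <stdint.h>

/* Model of the trim: given a block of size block_size (a power of 2,
 * in chunk units) and the actually needed actual_size, return the total
 * size still allocated after splitting away unused buddies.  The
 * recursion mirrors the DFS in the patch: a fully covered block is
 * kept, an untouched buddy is freed, a partially covered block is
 * split and recursed into. */
static uint64_t trim_kept_size(uint64_t block_size, uint64_t actual_size)
{
    if (actual_size == 0)
        return 0;
    if (actual_size >= block_size)      /* whole block stays allocated */
        return block_size;
    uint64_t half = block_size >> 1;    /* split into two buddies */
    if (actual_size <= half)            /* right buddy freed entirely */
        return trim_kept_size(half, actual_size);
    /* left buddy fully kept, recurse into the right buddy */
    return half + trim_kept_size(half, actual_size - half);
}

/* The number of kept sub-blocks equals the popcount of actual_size
 * (in chunk units) -- each set bit is one power-of-2 block. */
static unsigned kept_block_count(uint64_t actual_size)
{
    unsigned n = 0;
    while (actual_size) {
        n += actual_size & 1;
        actual_size >>= 1;
    }
    return n;
}
```

The conservation property (kept size equals actual_size whenever it fits) is what lets the patch bail out early once `count == actual_size`.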



[PATCH 06/13] drm: implement top-down allocation method

2021-10-19 Thread Arunpravin
Implemented a function which walks through the order list,
compares the offsets and returns the maximum offset block;
this method is unpredictable in obtaining the high range
address blocks, since it depends on the allocation and
deallocation history. For instance, if the driver requests
an address in a specific low range, the allocator traverses
from the root block and splits the larger blocks until it
reaches the requested block; in the process of splitting,
the lower orders in the freelist are populated with low
range address blocks, so a subsequent TOPDOWN memory
request may be served with low range blocks.

An alternative approach would be to keep each order list
sorted in ascending order, compare the last entry of each
order list in the freelist and return the max block. That
creates sorting overhead on every drm_buddy_free() request,
plus the split-up of larger blocks for a single page
request.

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c | 42 +++--
 include/drm/drm_buddy.h |  1 +
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 138e9f1a7340..42ce4f8f4e0e 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -334,6 +334,27 @@ alloc_range(struct drm_buddy_mm *mm,
return ERR_PTR(err);
 }
 
+static struct drm_buddy_block *
+get_maxblock(struct list_head *head)
+{
+   struct drm_buddy_block *max_block = NULL, *node;
+
+   max_block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+
+   if (!max_block)
+   return NULL;
+
+   list_for_each_entry(node, head, link) {
+   if (drm_buddy_block_offset(node) >
+   drm_buddy_block_offset(max_block))
+   max_block = node;
+   }
+
+   return max_block;
+}
+
 static struct drm_buddy_block *
 alloc_from_freelist(struct drm_buddy_mm *mm,
unsigned int order,
@@ -344,13 +365,22 @@ alloc_from_freelist(struct drm_buddy_mm *mm,
int err;
 
for (i = order; i <= mm->max_order; ++i) {
-   if (!list_empty(&mm->free_list[i])) {
-   block = list_first_entry_or_null(&mm->free_list[i],
-struct drm_buddy_block,
-link);
+   if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
+   if (!list_empty(&mm->free_list[i])) {
+   block = get_maxblock(&mm->free_list[i]);
 
-   if (block)
-   break;
+   if (block)
+   break;
+   }
+   } else {
+   if (!list_empty(&mm->free_list[i])) {
+   block = list_first_entry_or_null(&mm->free_list[i],
+struct drm_buddy_block,
+link);
+
+   if (block)
+   break;
+   }
}
}
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index ebf03d151845..19c7e298613e 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -27,6 +27,7 @@
|| size__ > end__ - start__; \
 })
 
+#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 0)
 #define DRM_BUDDY_RANGE_ALLOCATION (1 << 1)
 
 struct drm_buddy_block {
-- 
2.25.1
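get_maxblock() above is a plain linear scan for the highest start offset within one order's free list. The same selection can be modeled over an array instead of a kernel list_head (the toy_* names are illustrative only, not the patch's API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for a free-list entry: only the field the
 * TOPDOWN scan cares about. */
struct toy_block {
    uint64_t offset;
};

/* Model of get_maxblock(): walk the whole list (here: an array) and
 * return the entry with the largest offset, or NULL if it is empty.
 * Like the patch, this is O(n) per order list and assumes nothing
 * about the list's ordering. */
static struct toy_block *toy_get_maxblock(struct toy_block *blocks, size_t n)
{
    if (n == 0)
        return NULL;
    struct toy_block *max = &blocks[0];
    for (size_t i = 1; i < n; i++) {
        if (blocks[i].offset > max->offset)
            max = &blocks[i];
    }
    return max;
}
```

This is the cost the commit message weighs against keeping each list sorted: an O(n) scan on allocation versus sorting overhead on every free.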



[PATCH 05/13] drm: remove drm_buddy_alloc_range

2021-10-19 Thread Arunpravin
This function becomes obsolete now that drm_buddy_alloc() handles
range allocation, so remove it.

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c | 101 
 include/drm/drm_buddy.h |   4 --
 2 files changed, 105 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index f5f299dd9131..138e9f1a7340 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -457,107 +457,6 @@ int drm_buddy_alloc(struct drm_buddy_mm *mm,
return err;
 }
 
-/*
- * Allocate range. Note that it's safe to chain together multiple alloc_ranges
- * with the same blocks list.
- *
- * Intended for pre-allocating portions of the address space, for example to
- * reserve a block for the initial framebuffer or similar, hence the expectation
- * here is that drm_buddy_alloc() is still the main vehicle for
- * allocations, so if that's not the case then the drm_mm range allocator is
- * probably a much better fit, and so you should probably go use that instead.
- */
-int drm_buddy_alloc_range(struct drm_buddy_mm *mm,
- struct list_head *blocks,
- u64 start, u64 size)
-{
-   struct drm_buddy_block *block;
-   struct drm_buddy_block *buddy;
-   LIST_HEAD(allocated);
-   LIST_HEAD(dfs);
-   u64 end;
-   int err;
-   int i;
-
-   if (size < mm->chunk_size)
-   return -EINVAL;
-
-   if (!IS_ALIGNED(size | start, mm->chunk_size))
-   return -EINVAL;
-
-   if (range_overflows(start, size, mm->size))
-   return -EINVAL;
-
-   for (i = 0; i < mm->n_roots; ++i)
-   list_add_tail(&mm->roots[i]->tmp_link, &dfs);
-
-   end = start + size - 1;
-
-   do {
-   u64 block_start;
-   u64 block_end;
-
-   block = list_first_entry_or_null(&dfs,
-struct drm_buddy_block,
-tmp_link);
-   if (!block)
-   break;
-
-   list_del(&block->tmp_link);
-
-   block_start = drm_buddy_block_offset(block);
-   block_end = block_start + drm_buddy_block_size(mm, block) - 1;
-
-   if (!overlaps(start, end, block_start, block_end))
-   continue;
-
-   if (drm_buddy_block_is_allocated(block)) {
-   err = -ENOSPC;
-   goto err_free;
-   }
-
-   if (contains(start, end, block_start, block_end)) {
-   if (!drm_buddy_block_is_free(block)) {
-   err = -ENOSPC;
-   goto err_free;
-   }
-
-   mark_allocated(block);
-   mm->avail -= drm_buddy_block_size(mm, block);
-   list_add_tail(&block->link, &allocated);
-   continue;
-   }
-
-   if (!drm_buddy_block_is_split(block)) {
-   err = split_block(mm, block);
-   if (unlikely(err))
-   goto err_undo;
-   }
-
-   list_add(&block->right->tmp_link, &dfs);
-   list_add(&block->left->tmp_link, &dfs);
-   } while (1);
-
-   list_splice_tail(&allocated, blocks);
-   return 0;
-
-err_undo:
-   /*
-* We really don't want to leave around a bunch of split blocks, since
-* bigger is better, so make sure we merge everything back before we
-* free the allocated blocks.
-*/
-   buddy = get_buddy(block);
-   if (buddy &&
-   (drm_buddy_block_is_free(block) &&
-drm_buddy_block_is_free(buddy)))
-   __drm_buddy_free(mm, block);
-
-err_free:
-   drm_buddy_free_list(mm, &allocated);
-   return err;
-}
-
 void drm_buddy_block_print(struct drm_buddy_mm *mm,
   struct drm_buddy_block *block,
   struct drm_printer *p)
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index c64fd4062cb6..ebf03d151845 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -144,10 +144,6 @@ int drm_buddy_alloc(struct drm_buddy_mm *mm,
struct list_head *blocks,
unsigned long flags);
 
-int drm_buddy_alloc_range(struct drm_buddy_mm *mm,
-  struct list_head *blocks,
-  u64 start, u64 size);
-
 void drm_buddy_free(struct drm_buddy_mm *mm, struct drm_buddy_block *block);
 
 void drm_buddy_free_list(struct drm_buddy_mm *mm, struct list_head *objects);
-- 
2.25.1



[PATCH 04/13] drm: make drm_buddy_alloc a commonplace

2021-10-19 Thread Arunpravin
- Make drm_buddy_alloc a single function to handle both
  range allocation and non-range allocation demands.

- Implement a new function alloc_range() which allocates
  the requested order while complying with range limitations.

- Move the memory alignment logic from the i915 driver.

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c | 208 +++-
 include/drm/drm_buddy.h |  18 +++-
 2 files changed, 194 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 0398706cb7ae..f5f299dd9131 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -246,27 +246,112 @@ void drm_buddy_free_list(struct drm_buddy_mm *mm, struct list_head *objects)
INIT_LIST_HEAD(objects);
 }
 
-/*
- * Allocate power-of-two block. The order value here translates to:
- *
- *   0 = 2^0 * mm->chunk_size
- *   1 = 2^1 * mm->chunk_size
- *   2 = 2^2 * mm->chunk_size
- *   ...
- */
-struct drm_buddy_block *
-drm_buddy_alloc(struct drm_buddy_mm *mm, unsigned int order)
+static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= e2 && e1 >= s2;
+}
+
+static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= s2 && e1 >= e2;
+}
+
+static struct drm_buddy_block *
+alloc_range(struct drm_buddy_mm *mm,
+   u64 start, u64 end,
+   unsigned int order)
+{
+   struct drm_buddy_block *block;
+   struct drm_buddy_block *buddy;
+   LIST_HEAD(dfs);
+   int err;
+   int i;
+
+   end = end - 1;
+
+   for (i = 0; i < mm->n_roots; ++i)
+   list_add_tail(&mm->roots[i]->tmp_link, &dfs);
+
+   do {
+   u64 block_start;
+   u64 block_end;
+
+   block = list_first_entry_or_null(&dfs,
+struct drm_buddy_block,
+tmp_link);
+
+   if (!block)
+   break;
+
+   list_del(&block->tmp_link);
+
+   if (drm_buddy_block_order(block) < order)
+   continue;
+
+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block) - 1;
+
+   if (!overlaps(start, end, block_start, block_end))
+   continue;
+
+   if (drm_buddy_block_is_allocated(block))
+   continue;
+
+   if (contains(start, end, block_start, block_end)
+   && order == drm_buddy_block_order(block)) {
+   /*
+* Find the free block within the range.
+*/
+   if (drm_buddy_block_is_free(block))
+   return block;
+
+   continue;
+   }
+
+   if (!drm_buddy_block_is_split(block)) {
+   err = split_block(mm, block);
+   if (unlikely(err))
+   goto err_undo;
+   }
+
+   list_add(&block->left->tmp_link, &dfs);
+   list_add(&block->right->tmp_link, &dfs);
+   } while (1);
+
+   return ERR_PTR(-ENOSPC);
+
+err_undo:
+   /*
+* We really don't want to leave around a bunch of split blocks, since
+* bigger is better, so make sure we merge everything back before we
+* free the allocated blocks.
+*/
+   buddy = get_buddy(block);
+   if (buddy &&
+   (drm_buddy_block_is_free(block) &&
+drm_buddy_block_is_free(buddy)))
+   __drm_buddy_free(mm, block);
+   return ERR_PTR(err);
+}
+
+static struct drm_buddy_block *
+alloc_from_freelist(struct drm_buddy_mm *mm,
+   unsigned int order,
+   unsigned long flags)
 {
struct drm_buddy_block *block = NULL;
unsigned int i;
int err;
 
for (i = order; i <= mm->max_order; ++i) {
-   block = list_first_entry_or_null(&mm->free_list[i],
-struct drm_buddy_block,
-link);
-   if (block)
-   break;
+   if (!list_empty(&mm->free_list[i])) {
+   block = list_first_entry_or_null(&mm->free_list[i],
+struct drm_buddy_block,
+link);
+
+   if (block)
+   break;
+   }
}
 
if (!block)
@@ -276,33 +361,100 @@ drm_buddy_alloc(struct drm_buddy_mm *mm, unsigned int order)
 
while (i != order) {
err = split_block(mm, block);
+
if (unlikely(err))
-   goto out_free;
+   goto err_undo;
 
- 
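alloc_range() above prunes its depth-first search with the inclusive-range helpers overlaps() and contains(); both bounds are inclusive, which is why the function computes `end = end - 1` before comparing. The two predicates are easy to get wrong by one, so here is a standalone sketch (the range_* names are hypothetical, the logic matches the helpers in the diff):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Do the inclusive ranges [s1, e1] and [s2, e2] share any address?
 * Used to skip whole subtrees that lie outside the requested range. */
static bool range_overlaps(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
    return s1 <= e2 && e1 >= s2;
}

/* Is the inclusive range [s2, e2] entirely inside [s1, e1]?
 * A block fully contained in the request can be taken without
 * further splitting. */
static bool range_contains(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
    return s1 <= s2 && e1 >= e2;
}
```

A block that overlaps but is not contained gets split, and both halves are pushed back onto the DFS stack, exactly as the `list_add(&block->left/right->tmp_link, &dfs)` lines do.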

[PATCH 03/13] drm: add Makefile support for drm buddy

2021-10-19 Thread Arunpravin
- Include drm buddy in the DRM root Makefile
- Add drm buddy init and exit function calls
  to the drm core

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/Makefile  | 2 +-
 drivers/gpu/drm/drm_drv.c | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 0dff40bb863c..dc61e91a3154 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -18,7 +18,7 @@ drm-y   :=drm_aperture.o drm_auth.o drm_cache.o \
drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
-   drm_managed.o drm_vblank_work.o
+   drm_managed.o drm_vblank_work.o drm_buddy.o
 
 drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o drm_dma.o \
	drm_legacy_misc.o drm_lock.o drm_memory.o drm_scatter.o \
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 7a5097467ba5..6707eec21bef 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "drm_crtc_internal.h"
 #include "drm_internal.h"
@@ -1034,6 +1035,7 @@ static void drm_core_exit(void)
drm_sysfs_destroy();
idr_destroy(&drm_minors_idr);
drm_connector_ida_destroy();
+   drm_buddy_module_exit();
 }
 
 static int __init drm_core_init(void)
@@ -1043,6 +1045,7 @@ static int __init drm_core_init(void)
drm_connector_ida_init();
idr_init(&drm_minors_idr);
drm_memcpy_init_early();
+   drm_buddy_module_init();
 
ret = drm_sysfs_init();
if (ret < 0) {
-- 
2.25.1



[PATCH 02/13] drm: Move and rename i915 buddy source

2021-10-19 Thread Arunpravin
- Move i915_buddy.c to the drm root folder
- Rename the "i915" prefix to "drm" wherever applicable
- Rename the "I915" prefix to "DRM" wherever applicable
- Fix header file dependencies
- Fix alignment issues

Signed-off-by: Arunpravin 
---
 .../drm/{i915/i915_buddy.c => drm_buddy.c}| 193 +-
 include/drm/drm_buddy.h   |  10 +
 2 files changed, 105 insertions(+), 98 deletions(-)
 rename drivers/gpu/drm/{i915/i915_buddy.c => drm_buddy.c} (58%)

diff --git a/drivers/gpu/drm/i915/i915_buddy.c b/drivers/gpu/drm/drm_buddy.c
similarity index 58%
rename from drivers/gpu/drm/i915/i915_buddy.c
rename to drivers/gpu/drm/drm_buddy.c
index 6e2ad68f8f3f..0398706cb7ae 100644
--- a/drivers/gpu/drm/i915/i915_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -6,21 +6,18 @@
 #include 
 #include 
 
-#include "i915_buddy.h"
-
-#include "i915_gem.h"
-#include "i915_utils.h"
+#include 
 
 static struct kmem_cache *slab_blocks;
 
-static struct i915_buddy_block *i915_block_alloc(struct i915_buddy_mm *mm,
-struct i915_buddy_block *parent,
-unsigned int order,
-u64 offset)
+static struct drm_buddy_block *drm_block_alloc(struct drm_buddy_mm *mm,
+  struct drm_buddy_block *parent,
+  unsigned int order,
+  u64 offset)
 {
-   struct i915_buddy_block *block;
+   struct drm_buddy_block *block;
 
-   GEM_BUG_ON(order > I915_BUDDY_MAX_ORDER);
+   BUG_ON(order > DRM_BUDDY_MAX_ORDER);
 
block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL);
if (!block)
@@ -30,43 +27,43 @@ static struct i915_buddy_block *i915_block_alloc(struct i915_buddy_mm *mm,
block->header |= order;
block->parent = parent;
 
-   GEM_BUG_ON(block->header & I915_BUDDY_HEADER_UNUSED);
+   BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
return block;
 }
 
-static void i915_block_free(struct i915_buddy_mm *mm,
-   struct i915_buddy_block *block)
+static void drm_block_free(struct drm_buddy_mm *mm,
+  struct drm_buddy_block *block)
 {
kmem_cache_free(slab_blocks, block);
 }
 
-static void mark_allocated(struct i915_buddy_block *block)
+static void mark_allocated(struct drm_buddy_block *block)
 {
-   block->header &= ~I915_BUDDY_HEADER_STATE;
-   block->header |= I915_BUDDY_ALLOCATED;
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_ALLOCATED;
 
list_del(&block->link);
 }
 
-static void mark_free(struct i915_buddy_mm *mm,
- struct i915_buddy_block *block)
+static void mark_free(struct drm_buddy_mm *mm,
+ struct drm_buddy_block *block)
 {
-   block->header &= ~I915_BUDDY_HEADER_STATE;
-   block->header |= I915_BUDDY_FREE;
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_FREE;
 
list_add(&block->link,
-&mm->free_list[i915_buddy_block_order(block)]);
+&mm->free_list[drm_buddy_block_order(block)]);
 }
 
-static void mark_split(struct i915_buddy_block *block)
+static void mark_split(struct drm_buddy_block *block)
 {
-   block->header &= ~I915_BUDDY_HEADER_STATE;
-   block->header |= I915_BUDDY_SPLIT;
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_SPLIT;
 
list_del(&block->link);
 }
 
-int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 chunk_size)
+int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size)
 {
unsigned int i;
u64 offset;
@@ -87,7 +84,7 @@ int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 chunk_size)
mm->chunk_size = chunk_size;
mm->max_order = ilog2(size) - ilog2(chunk_size);
 
-   GEM_BUG_ON(mm->max_order > I915_BUDDY_MAX_ORDER);
+   BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER);
 
mm->free_list = kmalloc_array(mm->max_order + 1,
  sizeof(struct list_head),
@@ -101,7 +98,7 @@ int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 chunk_size)
mm->n_roots = hweight64(size);
 
mm->roots = kmalloc_array(mm->n_roots,
- sizeof(struct i915_buddy_block *),
+ sizeof(struct drm_buddy_block *),
  GFP_KERNEL);
if (!mm->roots)
goto out_free_list;
@@ -114,21 +111,21 @@ int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 chunk_size)
 * not itself a power-of-two.
 */
do {
-   struct i915_buddy_block *root;
+   struct drm_buddy_block *root;
unsigned int order;
u64 root_size;
 
   

[PATCH 01/13] drm: Move and rename i915 buddy header

2021-10-19 Thread Arunpravin
- Move i915_buddy.h to include/drm
- Rename the "i915" prefix to "drm"
- Rename the "I915" prefix to "DRM"

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/i915/i915_buddy.h | 143 --
 include/drm/drm_buddy.h   | 143 ++
 2 files changed, 143 insertions(+), 143 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 create mode 100644 include/drm/drm_buddy.h

diff --git a/drivers/gpu/drm/i915/i915_buddy.h b/drivers/gpu/drm/i915/i915_buddy.h
deleted file mode 100644
index 7077742112ac..
--- a/drivers/gpu/drm/i915/i915_buddy.h
+++ /dev/null
@@ -1,143 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2021 Intel Corporation
- */
-
-#ifndef __I915_BUDDY_H__
-#define __I915_BUDDY_H__
-
-#include 
-#include 
-#include 
-
-#include 
-
-struct i915_buddy_block {
-#define I915_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
-#define I915_BUDDY_HEADER_STATE  GENMASK_ULL(11, 10)
-#define   I915_BUDDY_ALLOCATED(1 << 10)
-#define   I915_BUDDY_FREE (2 << 10)
-#define   I915_BUDDY_SPLIT(3 << 10)
-/* Free to be used, if needed in the future */
-#define I915_BUDDY_HEADER_UNUSED GENMASK_ULL(9, 6)
-#define I915_BUDDY_HEADER_ORDER  GENMASK_ULL(5, 0)
-   u64 header;
-
-   struct i915_buddy_block *left;
-   struct i915_buddy_block *right;
-   struct i915_buddy_block *parent;
-
-   void *private; /* owned by creator */
-
-   /*
-* While the block is allocated by the user through i915_buddy_alloc*,
-* the user has ownership of the link, for example to maintain within
-* a list, if so desired. As soon as the block is freed with
-* i915_buddy_free* ownership is given back to the mm.
-*/
-   struct list_head link;
-   struct list_head tmp_link;
-};
-
-/* Order-zero must be at least PAGE_SIZE */
-#define I915_BUDDY_MAX_ORDER (63 - PAGE_SHIFT)
-
-/*
- * Binary Buddy System.
- *
- * Locking should be handled by the user, a simple mutex around
- * i915_buddy_alloc* and i915_buddy_free* should suffice.
- */
-struct i915_buddy_mm {
-   /* Maintain a free list for each order. */
-   struct list_head *free_list;
-
-   /*
-* Maintain explicit binary tree(s) to track the allocation of the
-* address space. This gives us a simple way of finding a buddy block
-* and performing the potentially recursive merge step when freeing a
-* block.  Nodes are either allocated or free, in which case they will
-* also exist on the respective free list.
-*/
-   struct i915_buddy_block **roots;
-
-   /*
-* Anything from here is public, and remains static for the lifetime of
-* the mm. Everything above is considered do-not-touch.
-*/
-   unsigned int n_roots;
-   unsigned int max_order;
-
-   /* Must be at least PAGE_SIZE */
-   u64 chunk_size;
-   u64 size;
-   u64 avail;
-};
-
-static inline u64
-i915_buddy_block_offset(struct i915_buddy_block *block)
-{
-   return block->header & I915_BUDDY_HEADER_OFFSET;
-}
-
-static inline unsigned int
-i915_buddy_block_order(struct i915_buddy_block *block)
-{
-   return block->header & I915_BUDDY_HEADER_ORDER;
-}
-
-static inline unsigned int
-i915_buddy_block_state(struct i915_buddy_block *block)
-{
-   return block->header & I915_BUDDY_HEADER_STATE;
-}
-
-static inline bool
-i915_buddy_block_is_allocated(struct i915_buddy_block *block)
-{
-   return i915_buddy_block_state(block) == I915_BUDDY_ALLOCATED;
-}
-
-static inline bool
-i915_buddy_block_is_free(struct i915_buddy_block *block)
-{
-   return i915_buddy_block_state(block) == I915_BUDDY_FREE;
-}
-
-static inline bool
-i915_buddy_block_is_split(struct i915_buddy_block *block)
-{
-   return i915_buddy_block_state(block) == I915_BUDDY_SPLIT;
-}
-
-static inline u64
-i915_buddy_block_size(struct i915_buddy_mm *mm,
- struct i915_buddy_block *block)
-{
-   return mm->chunk_size << i915_buddy_block_order(block);
-}
-
-int i915_buddy_init(struct i915_buddy_mm *mm, u64 size, u64 chunk_size);
-
-void i915_buddy_fini(struct i915_buddy_mm *mm);
-
-struct i915_buddy_block *
-i915_buddy_alloc(struct i915_buddy_mm *mm, unsigned int order);
-
-int i915_buddy_alloc_range(struct i915_buddy_mm *mm,
-  struct list_head *blocks,
-  u64 start, u64 size);
-
-void i915_buddy_free(struct i915_buddy_mm *mm, struct i915_buddy_block *block);
-
-void i915_buddy_free_list(struct i915_buddy_mm *mm, struct list_head *objects);
-
-void i915_buddy_print(struct i915_buddy_mm *mm, struct drm_printer *p);
-void i915_buddy_block_print(struct i915_buddy_mm *mm,
-   struct i915_buddy_block *block,
-   struct drm_printer *p);
-
-void i915_buddy_module_exit(void);
-int i915_buddy_module_init(void);
-
-#endif
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
new fil
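The i915_buddy_block header being moved above packs the block offset (bits 63:12), state (bits 11:10) and order (bits 5:0) into a single u64, and the same masks carry over into the DRM copy. A userspace sketch of that encoding (the TOY_*/toy_* names are hypothetical, and TOY_GENMASK_ULL hand-rolls the kernel's GENMASK_ULL for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Bits h..l set, everything else clear -- a stand-in for the kernel's
 * GENMASK_ULL(h, l). */
#define TOY_GENMASK_ULL(h, l) \
    (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

#define TOY_HEADER_OFFSET TOY_GENMASK_ULL(63, 12)  /* block start address */
#define TOY_HEADER_STATE  TOY_GENMASK_ULL(11, 10)  /* allocated/free/split */
#define TOY_ALLOCATED     (1ULL << 10)
#define TOY_FREE          (2ULL << 10)
#define TOY_HEADER_ORDER  TOY_GENMASK_ULL(5, 0)    /* size = chunk << order */

/* Pack the three fields.  The offset must be at least 4 KiB aligned so
 * its low 12 bits are zero and cannot clobber state/order. */
static uint64_t toy_make_header(uint64_t offset, uint64_t state, unsigned order)
{
    return (offset & TOY_HEADER_OFFSET) | state | (order & TOY_HEADER_ORDER);
}

static unsigned toy_order(uint64_t header)  { return header & TOY_HEADER_ORDER; }
static uint64_t toy_state(uint64_t header)  { return header & TOY_HEADER_STATE; }
static uint64_t toy_offset(uint64_t header) { return header & TOY_HEADER_OFFSET; }
```

The 4 KiB alignment requirement is why the header reserves bits 11:0 for metadata: order 0 is at least PAGE_SIZE, so every block offset is page aligned.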

[PATCH 00/13] drm: Enable buddy allocator support

2021-10-19 Thread Arunpravin
This series of patches moves the i915 buddy allocator to
the drm root and introduces new features, including:

- make drm_buddy_alloc a prime vehicle for allocation
- TOPDOWN range of address allocation support
- a function to free unused pages on contiguous allocation
- a function to allocate required size comply with range limitations
- cleanup i915 and amdgpu old mm manager references
- and finally add drm buddy support to i915 and amdgpu driver modules

selftest patches will be sent in a separate series.

Arunpravin (13):
  drm: Move and rename i915 buddy header
  drm: Move and rename i915 buddy source
  drm: add Makefile support for drm buddy
  drm: make drm_buddy_alloc a commonplace
  drm: remove drm_buddy_alloc_range
  drm: implement top-down allocation method
  drm: Implement method to free unused pages
  drm: export functions and write description
  drm: remove i915 selftest config check
  drm/i915: cleanup i915 buddy and apply DRM buddy
  drm/amdgpu: move vram defines into a header
  drm/amdgpu: add cursor support for drm buddy
  drm/amdgpu: cleanup drm_mm and apply DRM buddy

 drivers/gpu/drm/Makefile  |   2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 251 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h  |  72 ++
 drivers/gpu/drm/drm_buddy.c   | 704 ++
 drivers/gpu/drm/drm_drv.c |   3 +
 drivers/gpu/drm/i915/Makefile |   1 -
 drivers/gpu/drm/i915/i915_buddy.c | 466 
 drivers/gpu/drm/i915/i915_buddy.h | 143 
 drivers/gpu/drm/i915/i915_module.c|   3 -
 drivers/gpu/drm/i915/i915_scatterlist.c   |  11 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  91 ++-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   5 +-
 include/drm/drm_buddy.h   | 164 
 15 files changed, 1214 insertions(+), 803 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
 create mode 100644 drivers/gpu/drm/drm_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.c
 delete mode 100644 drivers/gpu/drm/i915/i915_buddy.h
 create mode 100644 include/drm/drm_buddy.h

-- 
2.25.1
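One invariant the whole series leans on is how drm_buddy_init() (shown in patch 02) sizes the order space: max_order = ilog2(size) - ilog2(chunk_size), so order 0 is one chunk and an order-max block spans the whole address space. A quick userspace sketch of that arithmetic (the toy_* names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Integer log2, as the kernel's ilog2() behaves for powers of two:
 * position of the highest set bit. */
static unsigned toy_ilog2(uint64_t v)
{
    unsigned r = 0;
    while (v >>= 1)
        r++;
    return r;
}

/* The largest order the allocator can hand out for a given managed
 * size and minimum chunk size.  An order-n block covers
 * chunk_size << n bytes. */
static unsigned toy_max_order(uint64_t size, uint64_t chunk_size)
{
    return toy_ilog2(size) - toy_ilog2(chunk_size);
}
```

For example, a 1 GiB space with 4 KiB chunks yields max_order 18, i.e. 19 per-order free lists.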



Re: [PATCH v3 13/13] drm/i915: replace drm_detect_hdmi_monitor() with drm_display_info.is_hdmi

2021-10-19 Thread Claudio Suarez
drm_get_edid() internally calls drm_connector_update_edid_property()
and then drm_add_display_info(), which parses the EDID.
This happens in the functions intel_hdmi_set_edid() and
intel_sdvo_tmds_sink_detect() (via intel_sdvo_get_edid()).

Once the EDID is parsed, the monitor's HDMI support information is
available through drm_display_info.is_hdmi. Retrieving the same
information with drm_detect_hdmi_monitor() is less efficient, so
switch to drm_display_info.is_hdmi.

This is a TODO task in Documentation/gpu/todo.rst

Signed-off-by: Claudio Suarez 
---
 drivers/gpu/drm/i915/display/intel_hdmi.c | 2 +-
 drivers/gpu/drm/i915/display/intel_sdvo.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c b/drivers/gpu/drm/i915/display/intel_hdmi.c
index b04685bb6439..008e5b0ba408 100644
--- a/drivers/gpu/drm/i915/display/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/display/intel_hdmi.c
@@ -2355,7 +2355,7 @@ intel_hdmi_set_edid(struct drm_connector *connector)
to_intel_connector(connector)->detect_edid = edid;
if (edid && edid->input & DRM_EDID_INPUT_DIGITAL) {
intel_hdmi->has_audio = drm_detect_monitor_audio(edid);
-   intel_hdmi->has_hdmi_sink = drm_detect_hdmi_monitor(edid);
+   intel_hdmi->has_hdmi_sink = connector->display_info.is_hdmi;
 
connected = true;
}
diff --git a/drivers/gpu/drm/i915/display/intel_sdvo.c b/drivers/gpu/drm/i915/display/intel_sdvo.c
index 6cb27599ea03..b4065e4df644 100644
--- a/drivers/gpu/drm/i915/display/intel_sdvo.c
+++ b/drivers/gpu/drm/i915/display/intel_sdvo.c
@@ -2060,8 +2060,9 @@ intel_sdvo_tmds_sink_detect(struct drm_connector *connector)
if (edid->input & DRM_EDID_INPUT_DIGITAL) {
status = connector_status_connected;
if (intel_sdvo_connector->is_hdmi) {
-   intel_sdvo->has_hdmi_monitor = drm_detect_hdmi_monitor(edid);
    intel_sdvo->has_hdmi_audio = drm_detect_monitor_audio(edid);
+   intel_sdvo->has_hdmi_monitor =
+   connector->display_info.is_hdmi;
}
} else
status = connector_status_disconnected;
-- 
2.33.0





Re: [PATCH v3 01/13] gpu/drm: make drm_add_edid_modes() consistent when updating connector->display_info

2021-10-19 Thread Claudio Suarez


According to the documentation, drm_add_edid_modes
"... Also fills out the &drm_display_info structure and ELD in @connector
with any information which can be derived from the edid."

drm_add_edid_modes accepts a struct edid *edid parameter which may have a
value or may be null. When it is not null, connector->display_info and
connector->eld are updated according to the edid. When edid=NULL, only
connector->eld is reset. Reset connector->display_info to be consistent
and accurate.

Since drm_edid_is_valid() considers NULL as an invalid EDID, simplify the
code to avoid duplicating code in the case of NULL/error.

Signed-off-by: Claudio Suarez 
---
 drivers/gpu/drm/drm_edid.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 6325877c5fd6..a019a26ede7a 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -5356,14 +5356,14 @@ int drm_add_edid_modes(struct drm_connector *connector, struct edid *edid)
int num_modes = 0;
u32 quirks;
 
-   if (edid == NULL) {
-   clear_eld(connector);
-   return 0;
-   }
if (!drm_edid_is_valid(edid)) {
+   /* edid == NULL or invalid here */
clear_eld(connector);
-   drm_warn(connector->dev, "%s: EDID invalid.\n",
-connector->name);
+   drm_reset_display_info(connector);
+   if (edid)
+   drm_warn(connector->dev,
+"[CONNECTOR:%d:%s] EDID invalid.\n",
+connector->base.id, connector->name);
return 0;
}
 
-- 
2.33.0
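The simplification in this patch relies on drm_edid_is_valid(NULL) returning false, which lets the NULL and corrupt-EDID paths share a single reset while warning only about the corrupt case. A toy model of that control flow (the toy_* names are hypothetical, not the DRM API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy EDID: one flag stands in for the real checksum/header checks. */
struct toy_edid { bool checksum_ok; };

/* Like drm_edid_is_valid(): NULL counts as invalid, so callers can
 * collapse "missing" and "corrupt" into one branch. */
static bool toy_edid_is_valid(const struct toy_edid *edid)
{
    return edid && edid->checksum_ok;
}

/* Returns the number of modes added; *warn_emitted models drm_warn().
 * Both failure paths do the same reset work, but only a non-NULL,
 * invalid EDID is worth complaining about. */
static int toy_add_modes(const struct toy_edid *edid, bool *warn_emitted)
{
    *warn_emitted = false;
    if (!toy_edid_is_valid(edid)) {
        /* reset ELD and display_info in one place for both cases */
        if (edid)
            *warn_emitted = true;
        return 0;
    }
    return 1;  /* pretend one mode was parsed from the EDID */
}
```

The design choice mirrors the review thread above: one guard, one reset, and the warning condition tucked inside it.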





Re: [PATCH v2 01/13] gpu/drm: make drm_add_edid_modes() consistent when updating connector->display_info

2021-10-19 Thread Claudio Suarez
On Tue, Oct 19, 2021 at 09:35:08PM +0300, Ville Syrjälä wrote:
> On Sat, Oct 16, 2021 at 08:42:14PM +0200, Claudio Suarez wrote:
> > According to the documentation, drm_add_edid_modes
> > "... Also fills out the &drm_display_info structure and ELD in @connector
> > with any information which can be derived from the edid."
> > 
> > drm_add_edid_modes accepts a struct edid *edid parameter which may have a
> > value or may be null. When it is not null, connector->display_info and
> > connector->eld are updated according to the edid. When edid=NULL, only
> > connector->eld is reset. Reset connector->display_info to be consistent
> > and accurate.
> > 
> > Signed-off-by: Claudio Suarez 
> > ---
> >  drivers/gpu/drm/drm_edid.c | 11 +--
> >  1 file changed, 5 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index 6325877c5fd6..c643db17782c 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -5356,14 +5356,13 @@ int drm_add_edid_modes(struct drm_connector *connector, struct edid *edid)
> > int num_modes = 0;
> > u32 quirks;
> >  
> > -   if (edid == NULL) {
> > -   clear_eld(connector);
> > -   return 0;
> > -   }
> > if (!drm_edid_is_valid(edid)) {
> 
> OK, so drm_edid_is_valid() will happily accept NULL and considers
> it invalid. You may want to mention that explicitly in the commit
> message.

Thank you for your comments, I appreciate :)
I'm sending new mails with the new commit messages.

> > +   /* edid == NULL or invalid here */
> > clear_eld(connector);
> > -   drm_warn(connector->dev, "%s: EDID invalid.\n",
> > -connector->name);
> > +   drm_reset_display_info(connector);
> > +   if (edid)
> > +   drm_warn(connector->dev, "%s: EDID invalid.\n",
> > +connector->name);
> 
> Could you respin this to use the standard [CONNECTOR:%d:%s] form
> while at it? Or I guess a patch to mass convert the whole drm_edid.c
> might be another option.

Good point.
I like the idea of a new patch. I'll start working on it. I can change
this drm_warn here to avoid merge conflicts.

> Patch looks good.
> Reviewed-by: Ville Syrjälä 

Thanks!

BR
Claudio Suarez.




Re: [PATCH] drm/amd/pm: Enable GPU metrics for One VF mode

2021-10-19 Thread Alex Deucher
On Tue, Oct 19, 2021 at 5:49 PM Vignesh Chander  wrote:
>

Please add a patch description, something like:

Enable the GPU metrics feature in one VF mode. These attributes are only
possible in one VF mode because the VF is dedicated in that case.

With that fixed:
Reviewed-by: Alex Deucher 
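The gating itself is just a bitmask test. A minimal sketch of how an attribute list like the one in this patch can be filtered by mode (the `MOCK_` flag values and the visibility rule are assumptions for illustration, not the real amdgpu_pm definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Mock flag bits -- the real values live in the amdgpu power-management code. */
#define MOCK_ATTR_FLAG_BASIC  (1u << 0)
#define MOCK_ATTR_FLAG_ONEVF  (1u << 1)

struct mock_attr { const char *name; uint32_t flags; };

/*
 * In one-VF mode only attributes tagged ONEVF are exposed; otherwise
 * the BASIC set applies. (Illustrative policy, not the exact kernel logic.)
 */
static int mock_attr_visible(const struct mock_attr *attr, int is_one_vf)
{
	if (is_one_vf)
		return !!(attr->flags & MOCK_ATTR_FLAG_ONEVF);
	return !!(attr->flags & MOCK_ATTR_FLAG_BASIC);
}
```

So adding `ATTR_FLAG_ONEVF` to entries like gpu_metrics opts them into the one-VF set without touching attributes such as pcie_bw that stay BASIC-only.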

> Signed-off-by: Vignesh Chander 
> Change-Id: I14a5c4d6b9d790b7f298b67cece2c501a003e2a7
> ---
>  drivers/gpu/drm/amd/pm/amdgpu_pm.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> index c255b4b8e685..01cca08a774f 100644
> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> @@ -2019,15 +2019,15 @@ static struct amdgpu_device_attr 
> amdgpu_device_attrs[] = {
> AMDGPU_DEVICE_ATTR_RW(pp_dpm_pcie,  
> ATTR_FLAG_BASIC),
> AMDGPU_DEVICE_ATTR_RW(pp_sclk_od,   
> ATTR_FLAG_BASIC),
> AMDGPU_DEVICE_ATTR_RW(pp_mclk_od,   
> ATTR_FLAG_BASIC),
> -   AMDGPU_DEVICE_ATTR_RW(pp_power_profile_mode,
> ATTR_FLAG_BASIC),
> +   AMDGPU_DEVICE_ATTR_RW(pp_power_profile_mode,
> ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
> AMDGPU_DEVICE_ATTR_RW(pp_od_clk_voltage,
> ATTR_FLAG_BASIC),
> -   AMDGPU_DEVICE_ATTR_RO(gpu_busy_percent, 
> ATTR_FLAG_BASIC),
> -   AMDGPU_DEVICE_ATTR_RO(mem_busy_percent, 
> ATTR_FLAG_BASIC),
> +   AMDGPU_DEVICE_ATTR_RO(gpu_busy_percent, 
> ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
> +   AMDGPU_DEVICE_ATTR_RO(mem_busy_percent, 
> ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
> AMDGPU_DEVICE_ATTR_RO(pcie_bw,  
> ATTR_FLAG_BASIC),
> -   AMDGPU_DEVICE_ATTR_RW(pp_features,  
> ATTR_FLAG_BASIC),
> -   AMDGPU_DEVICE_ATTR_RO(unique_id,
> ATTR_FLAG_BASIC),
> -   AMDGPU_DEVICE_ATTR_RW(thermal_throttling_logging,   
> ATTR_FLAG_BASIC),
> -   AMDGPU_DEVICE_ATTR_RO(gpu_metrics,  
> ATTR_FLAG_BASIC),
> +   AMDGPU_DEVICE_ATTR_RW(pp_features,  
> ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
> +   AMDGPU_DEVICE_ATTR_RO(unique_id,
> ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
> +   AMDGPU_DEVICE_ATTR_RW(thermal_throttling_logging,   
> ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
> +   AMDGPU_DEVICE_ATTR_RO(gpu_metrics,  
> ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
> AMDGPU_DEVICE_ATTR_RO(smartshift_apu_power, 
> ATTR_FLAG_BASIC,
>   .attr_update = ss_power_attr_update),
> AMDGPU_DEVICE_ATTR_RO(smartshift_dgpu_power,
> ATTR_FLAG_BASIC,
> --
> 2.25.1
>


[PATCH] drm/amd/pm: Enable GPU metrics for One VF mode

2021-10-19 Thread Vignesh Chander
Signed-off-by: Vignesh Chander 
Change-Id: I14a5c4d6b9d790b7f298b67cece2c501a003e2a7
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index c255b4b8e685..01cca08a774f 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -2019,15 +2019,15 @@ static struct amdgpu_device_attr amdgpu_device_attrs[] 
= {
AMDGPU_DEVICE_ATTR_RW(pp_dpm_pcie,  
ATTR_FLAG_BASIC),
AMDGPU_DEVICE_ATTR_RW(pp_sclk_od,   
ATTR_FLAG_BASIC),
AMDGPU_DEVICE_ATTR_RW(pp_mclk_od,   
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RW(pp_power_profile_mode,
ATTR_FLAG_BASIC),
+   AMDGPU_DEVICE_ATTR_RW(pp_power_profile_mode,
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RW(pp_od_clk_voltage,
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RO(gpu_busy_percent, 
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RO(mem_busy_percent, 
ATTR_FLAG_BASIC),
+   AMDGPU_DEVICE_ATTR_RO(gpu_busy_percent, 
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RO(mem_busy_percent, 
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RO(pcie_bw,  
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RW(pp_features,  
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RO(unique_id,
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RW(thermal_throttling_logging,   
ATTR_FLAG_BASIC),
-   AMDGPU_DEVICE_ATTR_RO(gpu_metrics,  
ATTR_FLAG_BASIC),
+   AMDGPU_DEVICE_ATTR_RW(pp_features,  
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RO(unique_id,
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RW(thermal_throttling_logging,   
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
+   AMDGPU_DEVICE_ATTR_RO(gpu_metrics,  
ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF),
AMDGPU_DEVICE_ATTR_RO(smartshift_apu_power, 
ATTR_FLAG_BASIC,
  .attr_update = ss_power_attr_update),
AMDGPU_DEVICE_ATTR_RO(smartshift_dgpu_power,
ATTR_FLAG_BASIC,
-- 
2.25.1



[PATCH 10/13] drm/amdkfd: replace kgd_dev in get amdgpu_amdkfd funcs

2021-10-19 Thread Graham Sider
Modified definitions:

- amdgpu_amdkfd_get_fw_version
- amdgpu_amdkfd_get_local_mem_info
- amdgpu_amdkfd_get_gpu_clock_counter
- amdgpu_amdkfd_get_max_engine_clock_in_mhz
- amdgpu_amdkfd_get_cu_info
- amdgpu_amdkfd_get_dmabuf_info
- amdgpu_amdkfd_get_vram_usage
- amdgpu_amdkfd_get_hive_id
- amdgpu_amdkfd_get_unique_id
- amdgpu_amdkfd_get_mmio_remap_phys_addr
- amdgpu_amdkfd_get_num_gws
- amdgpu_amdkfd_get_asic_rev_id
- amdgpu_amdkfd_get_noretry
- amdgpu_amdkfd_get_xgmi_hops_count
- amdgpu_amdkfd_get_xgmi_bandwidth_mbytes
- amdgpu_amdkfd_get_pcie_bandwidth_mbytes

Also replaces kfd_device_by_kgd with kfd_device_by_adev, now
searching via adev rather than kgd.
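The mechanical shape of the conversion is the same in every listed function: struct kgd_dev was only ever an opaque handle that each helper cast back to struct amdgpu_device, so taking the device pointer directly removes the cast. A standalone before/after sketch with mock types (not the kernel structs):

```c
#include <assert.h>

struct mock_adev { unsigned int fw_version; };

/* Before: opaque handle, cast back inside every helper. */
struct mock_kgd_dev;	/* never defined -- it *is* a mock_adev underneath */

static unsigned int get_fw_version_old(struct mock_kgd_dev *kgd)
{
	struct mock_adev *adev = (struct mock_adev *)kgd;

	return adev->fw_version;
}

/* After: take the device pointer directly, no cast needed. */
static unsigned int get_fw_version_new(struct mock_adev *adev)
{
	return adev->fw_version;
}
```

Both forms return the same value; the new one just lets the compiler type-check what the cast used to hide.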

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 73 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 38 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 16 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 16 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 14 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 18 ++---
 9 files changed, 82 insertions(+), 99 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 69fc8f0d9c45..79a2e37baa59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -358,11 +358,9 @@ void amdgpu_amdkfd_free_gws(struct amdgpu_device *adev, 
void *mem_obj)
amdgpu_bo_unref(&bo);
 }
 
-uint32_t amdgpu_amdkfd_get_fw_version(struct kgd_dev *kgd,
+uint32_t amdgpu_amdkfd_get_fw_version(struct amdgpu_device *adev,
  enum kgd_engine_type type)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-
switch (type) {
case KGD_ENGINE_PFP:
return adev->gfx.pfp_fw_version;
@@ -395,11 +393,9 @@ uint32_t amdgpu_amdkfd_get_fw_version(struct kgd_dev *kgd,
return 0;
 }
 
-void amdgpu_amdkfd_get_local_mem_info(struct kgd_dev *kgd,
+void amdgpu_amdkfd_get_local_mem_info(struct amdgpu_device *adev,
  struct kfd_local_mem_info *mem_info)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-
memset(mem_info, 0, sizeof(*mem_info));
 
mem_info->local_mem_size_public = adev->gmc.visible_vram_size;
@@ -424,19 +420,15 @@ void amdgpu_amdkfd_get_local_mem_info(struct kgd_dev *kgd,
mem_info->mem_clk_max = 100;
 }
 
-uint64_t amdgpu_amdkfd_get_gpu_clock_counter(struct kgd_dev *kgd)
+uint64_t amdgpu_amdkfd_get_gpu_clock_counter(struct amdgpu_device *adev)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-
if (adev->gfx.funcs->get_gpu_clock_counter)
return adev->gfx.funcs->get_gpu_clock_counter(adev);
return 0;
 }
 
-uint32_t amdgpu_amdkfd_get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
+uint32_t amdgpu_amdkfd_get_max_engine_clock_in_mhz(struct amdgpu_device *adev)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-
/* the sclk is in quantas of 10kHz */
if (amdgpu_sriov_vf(adev))
return adev->clock.default_sclk / 100;
@@ -446,9 +438,8 @@ uint32_t amdgpu_amdkfd_get_max_engine_clock_in_mhz(struct 
kgd_dev *kgd)
return 100;
 }
 
-void amdgpu_amdkfd_get_cu_info(struct kgd_dev *kgd, struct kfd_cu_info 
*cu_info)
+void amdgpu_amdkfd_get_cu_info(struct amdgpu_device *adev, struct kfd_cu_info 
*cu_info)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
struct amdgpu_cu_info acu_info = adev->gfx.cu_info;
 
memset(cu_info, 0, sizeof(*cu_info));
@@ -469,13 +460,12 @@ void amdgpu_amdkfd_get_cu_info(struct kgd_dev *kgd, 
struct kfd_cu_info *cu_info)
cu_info->lds_size = acu_info.lds_size;
 }
 
-int amdgpu_amdkfd_get_dmabuf_info(struct kgd_dev *kgd, int dma_buf_fd,
- struct kgd_dev **dma_buf_kgd,
+int amdgpu_amdkfd_get_dmabuf_info(struct amdgpu_device *adev, int dma_buf_fd,
+ struct amdgpu_device **dmabuf_adev,
  uint64_t *bo_size, void *metadata_buffer,
  size_t buffer_size, uint32_t *metadata_size,
  uint32_t *flags)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
struct dma_buf *dma_buf;
struct drm_gem_object *obj;
struct amdgpu_bo *bo;
@@ -503,8 +493,8 @@ int amdgpu_amdkfd_get_dmabuf_info(struct kgd_dev *kgd, int 
dma_buf_fd,
goto out_put;
 
r = 0;
-   if (dma_buf_kgd)
-   *dma_buf_kgd = (struct kgd_dev *)adev;
+   if (dmabuf_adev)
+   *dmabuf_adev = adev;
if (bo_size)
*bo_size = amdgpu_bo_size(bo);

[PATCH 08/13] drm/amdkfd: replace kgd_dev in various kfd2kgd funcs

2021-10-19 Thread Graham Sider
Modified definitions:

- program_sh_mem_settings
- set_pasid_vmid_mapping
- init_interrupts
- address_watch_disable
- address_watch_execute
- wave_control_execute
- address_watch_get_offset
- get_atc_vmid_pasid_mapping_info
- set_scratch_backing_va
- set_vm_context_page_table_base
- read_vmid_from_vmfault_reg
- get_cu_occupancy
- program_trap_handler_settings

Signed-off-by: Graham Sider 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 33 +
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  | 49 ++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 39 +--
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 33 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 35 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 22 -
 .../gpu/drm/amd/amdkfd/cik_event_interrupt.c  |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   | 18 +++
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 14 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  2 +-
 .../gpu/drm/amd/include/kgd_kfd_interface.h   | 29 ++-
 12 files changed, 106 insertions(+), 174 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 5927a8fcbc23..5f274b7c4121 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -80,14 +80,12 @@ static void release_queue(struct amdgpu_device *adev)
unlock_srbm(adev);
 }
 
-static void kgd_program_sh_mem_settings(struct kgd_dev *kgd, uint32_t vmid,
+static void kgd_program_sh_mem_settings(struct amdgpu_device *adev, uint32_t 
vmid,
uint32_t sh_mem_config,
uint32_t sh_mem_ape1_base,
uint32_t sh_mem_ape1_limit,
uint32_t sh_mem_bases)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
lock_srbm(adev, 0, 0, 0, vmid);
 
WREG32_SOC15(GC, 0, mmSH_MEM_CONFIG, sh_mem_config);
@@ -97,11 +95,9 @@ static void kgd_program_sh_mem_settings(struct kgd_dev *kgd, 
uint32_t vmid,
unlock_srbm(adev);
 }
 
-static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, u32 pasid,
+static int kgd_set_pasid_vmid_mapping(struct amdgpu_device *adev, u32 pasid,
unsigned int vmid)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
/*
 * We have to assume that there is no outstanding mapping.
 * The ATC_VMID_PASID_MAPPING_UPDATE_STATUS bit could be 0 because
@@ -144,9 +140,8 @@ static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, 
u32 pasid,
  * but still works
  */
 
-static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id)
+static int kgd_init_interrupts(struct amdgpu_device *adev, uint32_t pipe_id)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t mec;
uint32_t pipe;
 
@@ -669,11 +664,10 @@ static int kgd_hqd_sdma_destroy(struct amdgpu_device 
*adev, void *mqd,
return 0;
 }
 
-static bool get_atc_vmid_pasid_mapping_info(struct kgd_dev *kgd,
+static bool get_atc_vmid_pasid_mapping_info(struct amdgpu_device *adev,
uint8_t vmid, uint16_t *p_pasid)
 {
uint32_t value;
-   struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
 
value = RREG32(SOC15_REG_OFFSET(ATHUB, 0, mmATC_VMID0_PASID_MAPPING)
 + vmid);
@@ -682,12 +676,12 @@ static bool get_atc_vmid_pasid_mapping_info(struct 
kgd_dev *kgd,
return !!(value & ATC_VMID0_PASID_MAPPING__VALID_MASK);
 }
 
-static int kgd_address_watch_disable(struct kgd_dev *kgd)
+static int kgd_address_watch_disable(struct amdgpu_device *adev)
 {
return 0;
 }
 
-static int kgd_address_watch_execute(struct kgd_dev *kgd,
+static int kgd_address_watch_execute(struct amdgpu_device *adev,
unsigned int watch_point_id,
uint32_t cntl_val,
uint32_t addr_hi,
@@ -696,11 +690,10 @@ static int kgd_address_watch_execute(struct kgd_dev *kgd,
return 0;
 }
 
-static int kgd_wave_control_execute(struct kgd_dev *kgd,
+static int kgd_wave_control_execute(struct amdgpu_device *adev,
uint32_t gfx_index_val,
uint32_t sq_cmd)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t data = 0;
 
mutex_lock(&adev->grbm_idx_mutex);
@@ -721,18 +714,16 @@ static int kgd_wave_control_execute(struct kgd_dev *kgd,
return 0;
 }
 
-static uint32_t kgd_address_watch_get_offset(struct kgd_dev *kgd,
+static uint32_t kgd_address_watch_get_offset(struct amdgpu_device *adev,
  

[PATCH 04/13] drm/amdkfd: replace kgd_dev in static gfx v9 funcs

2021-10-19 Thread Graham Sider
Static funcs in amdgpu_amdkfd_gfx_v9.c now using amdgpu_device.
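The lock_srbm()/unlock_srbm() pair these hunks touch follows a common pattern: take a mutex, program a selector, and on unlock restore the selector to 0 before releasing the mutex. A generic sketch of that pairing, with a pthread mutex standing in for srbm_mutex and an int for the GRBM select state (mock names, not the kernel API):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mock_srbm_mutex = PTHREAD_MUTEX_INITIALIZER;
static int mock_selected_vmid;	/* stands in for the SRBM/GRBM select state */

static void mock_lock_srbm(int vmid)
{
	pthread_mutex_lock(&mock_srbm_mutex);
	mock_selected_vmid = vmid;	/* like soc15_grbm_select(adev, ..., vmid) */
}

static void mock_unlock_srbm(void)
{
	mock_selected_vmid = 0;	/* restore select to 0 before dropping the lock */
	pthread_mutex_unlock(&mock_srbm_mutex);
}
```

Because the selector is global hardware state, the mutex must be held for the whole select/use/deselect window, which is why the patch threads adev (the mutex owner) through these helpers instead of re-deriving it from kgd each time.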

Signed-off-by: Graham Sider 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 52 ---
 1 file changed, 23 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index bcc1cbeb8799..a79f4d110669 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -51,32 +51,26 @@ static inline struct amdgpu_device 
*get_amdgpu_device(struct kgd_dev *kgd)
return (struct amdgpu_device *)kgd;
 }
 
-static void lock_srbm(struct kgd_dev *kgd, uint32_t mec, uint32_t pipe,
+static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
mutex_lock(&adev->srbm_mutex);
soc15_grbm_select(adev, mec, pipe, queue, vmid);
 }
 
-static void unlock_srbm(struct kgd_dev *kgd)
+static void unlock_srbm(struct amdgpu_device *adev)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
soc15_grbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
 }
 
-static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
+static void acquire_queue(struct amdgpu_device *adev, uint32_t pipe_id,
uint32_t queue_id)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, queue_id, 0);
+   lock_srbm(adev, mec, pipe, queue_id, 0);
 }
 
 static uint64_t get_queue_mask(struct amdgpu_device *adev,
@@ -88,9 +82,9 @@ static uint64_t get_queue_mask(struct amdgpu_device *adev,
return 1ull << bit;
 }
 
-static void release_queue(struct kgd_dev *kgd)
+static void release_queue(struct amdgpu_device *adev)
 {
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 void kgd_gfx_v9_program_sh_mem_settings(struct kgd_dev *kgd, uint32_t vmid,
@@ -101,13 +95,13 @@ void kgd_gfx_v9_program_sh_mem_settings(struct kgd_dev 
*kgd, uint32_t vmid,
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-   lock_srbm(kgd, 0, 0, 0, vmid);
+   lock_srbm(adev, 0, 0, 0, vmid);
 
WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmSH_MEM_CONFIG), sh_mem_config);
WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmSH_MEM_BASES), sh_mem_bases);
/* APE1 no longer exists on GFX9 */
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 int kgd_gfx_v9_set_pasid_vmid_mapping(struct kgd_dev *kgd, u32 pasid,
@@ -180,13 +174,13 @@ int kgd_gfx_v9_init_interrupts(struct kgd_dev *kgd, 
uint32_t pipe_id)
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, 0, 0);
+   lock_srbm(adev, mec, pipe, 0, 0);
 
WREG32(SOC15_REG_OFFSET(GC, 0, mmCPC_INT_CNTL),
CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK |
CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK);
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 
return 0;
 }
@@ -245,7 +239,7 @@ int kgd_gfx_v9_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
 
m = get_mqd(mqd);
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
/* HQD registers extend from CP_MQD_BASE_ADDR to CP_HQD_EOP_WPTR_MEM. */
mqd_hqd = &m->cp_mqd_base_addr_lo;
@@ -308,7 +302,7 @@ int kgd_gfx_v9_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
WREG32_RLC(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_ACTIVE), data);
 
-   release_queue(kgd);
+   release_queue(adev);
 
return 0;
 }
@@ -325,7 +319,7 @@ int kgd_gfx_v9_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
 
m = get_mqd(mqd);
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
@@ -361,7 +355,7 @@ int kgd_gfx_v9_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
 
 out_unlock:
spin_unlock(&adev->gfx.kiq.ring_lock);
-   release_queue(kgd);
+   release_queue(adev);
 
return r;
 }
@@ -384,13 +378,13 @@ int kgd_gfx_v9_hqd_dump(struct kgd_dev *kgd,
if (*dump == NULL)
return -ENOMEM;
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
for (reg = SOC15_REG_OFFSET(GC, 0, mmCP_MQD_BASE_ADDR);
 reg <= SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_HI); reg++)
DUMP_REG(reg);
 
-   release_queue(kgd);
+   release_queue(adev);
 
WARN_ON_O

[PATCH 07/13] drm/amdkfd: replace kgd_dev in hqd/mqd kfd2kgd funcs

2021-10-19 Thread Graham Sider
Modified definitions:

- hqd_load
- hiq_mqd_load
- hqd_sdma_load
- hqd_dump
- hqd_sdma_dump
- hqd_is_occupied
- hqd_destroy
- hqd_sdma_is_occupied
- hqd_sdma_destroy
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   | 13 +++
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.h   |  9 +++--
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 36 +++---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  | 37 ---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 33 +++--
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 35 +++---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 36 +++---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 13 ---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  6 +--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 12 +++---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  | 14 +++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 14 +++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 12 +++---
 .../gpu/drm/amd/include/kgd_kfd_interface.h   | 25 +++--
 14 files changed, 129 insertions(+), 166 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index 5a7f680bcb3f..c2b8d970195b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -123,10 +123,9 @@ static uint32_t get_sdma_rlc_reg_offset(struct 
amdgpu_device *adev,
return sdma_rlc_reg_offset;
 }
 
-int kgd_arcturus_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
+int kgd_arcturus_hqd_sdma_load(struct amdgpu_device *adev, void *mqd,
 uint32_t __user *wptr, struct mm_struct *mm)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
struct v9_sdma_mqd *m;
uint32_t sdma_rlc_reg_offset;
unsigned long end_jiffies;
@@ -193,11 +192,10 @@ int kgd_arcturus_hqd_sdma_load(struct kgd_dev *kgd, void 
*mqd,
return 0;
 }
 
-int kgd_arcturus_hqd_sdma_dump(struct kgd_dev *kgd,
+int kgd_arcturus_hqd_sdma_dump(struct amdgpu_device *adev,
 uint32_t engine_id, uint32_t queue_id,
 uint32_t (**dump)[2], uint32_t *n_regs)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t sdma_rlc_reg_offset = get_sdma_rlc_reg_offset(adev,
engine_id, queue_id);
uint32_t i = 0, reg;
@@ -225,9 +223,9 @@ int kgd_arcturus_hqd_sdma_dump(struct kgd_dev *kgd,
return 0;
 }
 
-bool kgd_arcturus_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
+bool kgd_arcturus_hqd_sdma_is_occupied(struct amdgpu_device *adev,
+   void *mqd)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
struct v9_sdma_mqd *m;
uint32_t sdma_rlc_reg_offset;
uint32_t sdma_rlc_rb_cntl;
@@ -244,10 +242,9 @@ bool kgd_arcturus_hqd_sdma_is_occupied(struct kgd_dev 
*kgd, void *mqd)
return false;
 }
 
-int kgd_arcturus_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
+int kgd_arcturus_hqd_sdma_destroy(struct amdgpu_device *adev, void *mqd,
unsigned int utimeout)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
struct v9_sdma_mqd *m;
uint32_t sdma_rlc_reg_offset;
uint32_t temp;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.h
index ce08131b7b5f..756c1a5679c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.h
@@ -20,11 +20,12 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-int kgd_arcturus_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
+int kgd_arcturus_hqd_sdma_load(struct amdgpu_device *adev, void *mqd,
 uint32_t __user *wptr, struct mm_struct *mm);
-int kgd_arcturus_hqd_sdma_dump(struct kgd_dev *kgd,
+int kgd_arcturus_hqd_sdma_dump(struct amdgpu_device *adev,
 uint32_t engine_id, uint32_t queue_id,
 uint32_t (**dump)[2], uint32_t *n_regs);
-bool kgd_arcturus_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
-int kgd_arcturus_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
+bool kgd_arcturus_hqd_sdma_is_occupied(struct amdgpu_device *adev,
+   void *mqd);
+int kgd_arcturus_hqd_sdma_destroy(struct amdgpu_device *adev, void *mqd,
unsigned int utimeout);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 69aee5b52f64..5927a8fcbc23 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -212,12 +212,11 @@ static inline struct v10_sdma_mqd *get_sdma_mqd(void *mqd)
return (struct v10_sdma_mqd *)mqd;
 }
 
-static int kgd_hqd_load(str

[PATCH 13/13] drm/amdkfd: remove kgd_dev declaration and initialization

2021-10-19 Thread Graham Sider
Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c  | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h  | 4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 4 +---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 1 -
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 2 --
 5 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 79a2e37baa59..83f863dca7af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -72,7 +72,7 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
if (!kfd_initialized)
return;
 
-   adev->kfd.dev = kgd2kfd_probe((struct kgd_dev *)adev, vf);
+   adev->kfd.dev = kgd2kfd_probe(adev, vf);
 
if (adev->kfd.dev)
amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 35f703dda034..8c1ba8f258c8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -331,7 +331,7 @@ int kgd2kfd_schedule_evict_and_restore_process(struct 
mm_struct *mm,
 #if IS_ENABLED(CONFIG_HSA_AMD)
 int kgd2kfd_init(void);
 void kgd2kfd_exit(void);
-struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf);
+struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf);
 bool kgd2kfd_device_init(struct kfd_dev *kfd,
 struct drm_device *ddev,
 const struct kgd2kfd_shared_resources *gpu_resources);
@@ -355,7 +355,7 @@ static inline void kgd2kfd_exit(void)
 }
 
 static inline
-struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
+struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf)
 {
return NULL;
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 402891a02a01..7677ced16a27 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -627,12 +627,11 @@ static void kfd_gtt_sa_fini(struct kfd_dev *kfd);
 
 static int kfd_resume(struct kfd_dev *kfd);
 
-struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
+struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf)
 {
struct kfd_dev *kfd;
const struct kfd_device_info *device_info;
const struct kfd2kgd_calls *f2g;
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
struct pci_dev *pdev = adev->pdev;
 
switch (adev->asic_type) {
@@ -824,7 +823,6 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
if (!kfd)
return NULL;
 
-   kfd->kgd = kgd;
kfd->adev = adev;
kfd->device_info = device_info;
kfd->pdev = pdev;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 1fbc28c34c4c..32307b9f1ec2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -228,7 +228,6 @@ struct kfd_vmid_info {
 };
 
 struct kfd_dev {
-   struct kgd_dev *kgd;
struct amdgpu_device *adev;
 
const struct kfd_device_info *device_info;
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h 
b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 8f4f3a1700e8..ac941f62cbed 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -38,8 +38,6 @@ struct amdgpu_device;
 #define KGD_MAX_QUEUES 128
 
 struct kfd_dev;
-struct kgd_dev;
-
 struct kgd_mem;
 
 enum kfd_preempt_type {
-- 
2.25.1



[PATCH 12/13] drm/amdkfd: replace/remove remaining kgd_dev references

2021-10-19 Thread Graham Sider
Signed-off-by: Graham Sider 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |  5 ---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  5 ---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  |  5 ---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  5 ---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  5 ---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  5 ---
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c |  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  5 +--
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c   |  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  | 43 +++
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 12 +++---
 14 files changed, 31 insertions(+), 77 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
index c2b8d970195b..abe93b3ff765 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
@@ -57,11 +57,6 @@
(*dump)[i++][1] = RREG32(addr); \
} while (0)
 
-static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
-{
-   return (struct amdgpu_device *)kgd;
-}
-
 static inline struct v9_sdma_mqd *get_sdma_mqd(void *mqd)
 {
return (struct v9_sdma_mqd *)mqd;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 5f274b7c4121..7b7f4b2764c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -39,11 +39,6 @@ enum hqd_dequeue_request_type {
SAVE_WAVES
 };
 
-static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
-{
-   return (struct amdgpu_device *)kgd;
-}
-
 static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c
index 980430974aca..1f37d3574001 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c
@@ -38,11 +38,6 @@ enum hqd_dequeue_request_type {
SAVE_WAVES
 };
 
-static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
-{
-   return (struct amdgpu_device *)kgd;
-}
-
 static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index e31b03495db4..36528dad7684 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -82,11 +82,6 @@ union TCP_WATCH_CNTL_BITS {
float f32All;
 };
 
-static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
-{
-   return (struct amdgpu_device *)kgd;
-}
-
 static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 9a30a4f4f098..52832cd69a93 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -39,11 +39,6 @@ enum hqd_dequeue_request_type {
RESET_WAVES
 };
 
-static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
-{
-   return (struct amdgpu_device *)kgd;
-}
-
 static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index d7b31adcfd80..ddfe7aff919d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -46,11 +46,6 @@ enum hqd_dequeue_request_type {
SAVE_WAVES
 };
 
-static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
-{
-   return (struct amdgpu_device *)kgd;
-}
-
 static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 7143550becb0..1dc6cb7446e0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -1963,8 +1963,6 @@ static int kfd_fill_gpu_direct_io_link_to_cpu(int 
*avail_size,
struct crat_subtype_iolink *sub_type_hdr,
uint32_t proximity_domain)
 {
-   struc

[PATCH 09/13] drm/amdkfd: replace kgd_dev in various amgpu_amdkfd funcs

2021-10-19 Thread Graham Sider
Modified definitions:

- amdgpu_amdkfd_submit_ib
- amdgpu_amdkfd_set_compute_idle
- amdgpu_amdkfd_have_atomics_support
- amdgpu_amdkfd_flush_gpu_tlb_pasid
- amdgpu_amdkfd_flush_gpu_tlb_pasid
- amdgpu_amdkfd_gpu_reset
- amdgpu_amdkfd_alloc_gtt_mem
- amdgpu_amdkfd_free_gtt_mem
- amdgpu_amdkfd_alloc_gws
- amdgpu_amdkfd_free_gws
- amdgpu_amdkfd_ras_poison_consumption_handler

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 41 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 27 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 18 
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  8 ++--
 .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c   |  4 +-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  | 10 ++---
 8 files changed, 53 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 7077f21f0021..69fc8f0d9c45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -233,19 +233,16 @@ int amdgpu_amdkfd_post_reset(struct amdgpu_device *adev)
return r;
 }
 
-void amdgpu_amdkfd_gpu_reset(struct kgd_dev *kgd)
+void amdgpu_amdkfd_gpu_reset(struct amdgpu_device *adev)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-
if (amdgpu_device_should_recover_gpu(adev))
amdgpu_device_gpu_recover(adev, NULL);
 }
 
-int amdgpu_amdkfd_alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
+int amdgpu_amdkfd_alloc_gtt_mem(struct amdgpu_device *adev, size_t size,
void **mem_obj, uint64_t *gpu_addr,
void **cpu_ptr, bool cp_mqd_gfx9)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
struct amdgpu_bo *bo = NULL;
struct amdgpu_bo_param bp;
int r;
@@ -314,7 +311,7 @@ int amdgpu_amdkfd_alloc_gtt_mem(struct kgd_dev *kgd, size_t 
size,
return r;
 }
 
-void amdgpu_amdkfd_free_gtt_mem(struct kgd_dev *kgd, void *mem_obj)
+void amdgpu_amdkfd_free_gtt_mem(struct amdgpu_device *adev, void *mem_obj)
 {
struct amdgpu_bo *bo = (struct amdgpu_bo *) mem_obj;
 
@@ -325,10 +322,9 @@ void amdgpu_amdkfd_free_gtt_mem(struct kgd_dev *kgd, void 
*mem_obj)
amdgpu_bo_unref(&(bo));
 }
 
-int amdgpu_amdkfd_alloc_gws(struct kgd_dev *kgd, size_t size,
+int amdgpu_amdkfd_alloc_gws(struct amdgpu_device *adev, size_t size,
void **mem_obj)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
struct amdgpu_bo *bo = NULL;
struct amdgpu_bo_user *ubo;
struct amdgpu_bo_param bp;
@@ -355,7 +351,7 @@ int amdgpu_amdkfd_alloc_gws(struct kgd_dev *kgd, size_t 
size,
return 0;
 }
 
-void amdgpu_amdkfd_free_gws(struct kgd_dev *kgd, void *mem_obj)
+void amdgpu_amdkfd_free_gws(struct amdgpu_device *adev, void *mem_obj)
 {
struct amdgpu_bo *bo = (struct amdgpu_bo *)mem_obj;
 
@@ -675,11 +671,11 @@ int amdgpu_amdkfd_get_noretry(struct kgd_dev *kgd)
return adev->gmc.noretry;
 }
 
-int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum kgd_engine_type engine,
+int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
+   enum kgd_engine_type engine,
uint32_t vmid, uint64_t gpu_addr,
uint32_t *ib_cmd, uint32_t ib_len)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
struct amdgpu_job *job;
struct amdgpu_ib *ib;
struct amdgpu_ring *ring;
@@ -730,10 +726,8 @@ int amdgpu_amdkfd_submit_ib(struct kgd_dev *kgd, enum 
kgd_engine_type engine,
return ret;
 }
 
-void amdgpu_amdkfd_set_compute_idle(struct kgd_dev *kgd, bool idle)
+void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-
amdgpu_dpm_switch_power_profile(adev,
PP_SMC_POWER_PROFILE_COMPUTE,
!idle);
@@ -747,10 +741,9 @@ bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, 
u32 vmid)
return false;
 }
 
-int amdgpu_amdkfd_flush_gpu_tlb_vmid(struct kgd_dev *kgd, uint16_t vmid)
+int amdgpu_amdkfd_flush_gpu_tlb_vmid(struct amdgpu_device *adev,
+uint16_t vmid)
 {
-   struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-
if (adev->family == AMDGPU_FAMILY_AI) {
int i;
 
@@ -763,10 +756,9 @@ int amdgpu_amdkfd_flush_gpu_tlb_vmid(struct kgd_dev *kgd, 
uint16_t vmid)
return 0;
 }
 
-int amdgpu_amdkfd_flush_gpu_tlb_pasid(struct kgd_dev *kgd, uint16_t pasid,
- enum TLB_FLUSH_TYPE flush_type)
+int amdgpu_amdkfd_flush_gpu_tlb_pasid(struct amdgp

[PATCH 11/13] drm/amdkfd: replace kgd_dev in gpuvm amdgpu_amdkfd funcs

2021-10-19 Thread Graham Sider
Modified definitions:

- amdgpu_amdkfd_gpuvm_acquire_process_vm
- amdgpu_amdkfd_gpuvm_release_process_vm
- amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu
- amdgpu_amdkfd_gpuvm_free_memory_of_gpu
- amdgpu_amdkfd_gpuvm_map_memory_to_gpu
- amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu
- amdgpu_amdkfd_gpuvm_sync_memory
- amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel
- amdgpu_amdkfd_gpuvm_get_vm_fault_info
- amdgpu_amdkfd_gpuvm_import_dmabuf
- amdgpu_amdkfd_get_tile_config

Remove:

- get_amdgpu_device

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 24 ++-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 41 ++-
 .../gpu/drm/amd/amdkfd/cik_event_interrupt.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 22 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 22 +-
 5 files changed, 49 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 7e3697a7a5cd..35f703dda034 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -265,37 +265,39 @@ int amdgpu_amdkfd_get_pcie_bandwidth_mbytes(struct 
amdgpu_device *adev, bool is_
(&((struct amdgpu_fpriv *)  \
((struct drm_file *)(drm_priv))->driver_priv)->vm)
 
-int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct kgd_dev *kgd,
+int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct amdgpu_device *adev,
struct file *filp, u32 pasid,
void **process_info,
struct dma_fence **ef);
-void amdgpu_amdkfd_gpuvm_release_process_vm(struct kgd_dev *kgd, void *drm_priv);
+void amdgpu_amdkfd_gpuvm_release_process_vm(struct amdgpu_device *adev,
+   void *drm_priv);
 uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void *drm_priv);
 int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
-   struct kgd_dev *kgd, uint64_t va, uint64_t size,
+   struct amdgpu_device *adev, uint64_t va, uint64_t size,
void *drm_priv, struct kgd_mem **mem,
uint64_t *offset, uint32_t flags);
 int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
-   struct kgd_dev *kgd, struct kgd_mem *mem, void *drm_priv,
+   struct amdgpu_device *adev, struct kgd_mem *mem, void *drm_priv,
uint64_t *size);
 int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
-   struct kgd_dev *kgd, struct kgd_mem *mem, void *drm_priv, bool *table_freed);
+   struct amdgpu_device *adev, struct kgd_mem *mem, void *drm_priv,
+   bool *table_freed);
 int amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu(
-   struct kgd_dev *kgd, struct kgd_mem *mem, void *drm_priv);
+   struct amdgpu_device *adev, struct kgd_mem *mem, void *drm_priv);
 int amdgpu_amdkfd_gpuvm_sync_memory(
-   struct kgd_dev *kgd, struct kgd_mem *mem, bool intr);
-int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct kgd_dev *kgd,
+   struct amdgpu_device *adev, struct kgd_mem *mem, bool intr);
+int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct amdgpu_device *adev,
struct kgd_mem *mem, void **kptr, uint64_t *size);
 int amdgpu_amdkfd_gpuvm_restore_process_bos(void *process_info,
struct dma_fence **ef);
-int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct kgd_dev *kgd,
+int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev,
  struct kfd_vm_fault_info *info);
-int amdgpu_amdkfd_gpuvm_import_dmabuf(struct kgd_dev *kgd,
+int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
  struct dma_buf *dmabuf,
  uint64_t va, void *drm_priv,
  struct kgd_mem **mem, uint64_t *size,
  uint64_t *mmap_offset);
-int amdgpu_amdkfd_get_tile_config(struct kgd_dev *kgd,
+int amdgpu_amdkfd_get_tile_config(struct amdgpu_device *adev,
struct tile_config *config);
 void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev);
 #if IS_ENABLED(CONFIG_HSA_AMD)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index cdf46bd0d8d5..d632484b209e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -60,12 +60,6 @@ static const char * const domain_bit_to_string[] = {
 
 static void amdgpu_amdkfd_restore_userptr_worker(struct work_struct *work);
 
-
-static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
-{
-   return (struct amdgpu_device *)kgd;
-}
-
 static bool kfd_mem_is_attached(struct amdgpu_vm *

[PATCH 06/13] drm/amdkfd: replace kgd_dev in static gfx v10_3 funcs

2021-10-19 Thread Graham Sider
Static funcs in amdgpu_amdkfd_gfx_v10_3.c now use amdgpu_device instead of kgd_dev.

Signed-off-by: Graham Sider 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c  | 52 ---
 1 file changed, 23 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c
index dac0d751d5af..b33a9fe715cd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c
@@ -43,32 +43,26 @@ static inline struct amdgpu_device 
*get_amdgpu_device(struct kgd_dev *kgd)
return (struct amdgpu_device *)kgd;
 }
 
-static void lock_srbm(struct kgd_dev *kgd, uint32_t mec, uint32_t pipe,
+static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
mutex_lock(&adev->srbm_mutex);
nv_grbm_select(adev, mec, pipe, queue, vmid);
 }
 
-static void unlock_srbm(struct kgd_dev *kgd)
+static void unlock_srbm(struct amdgpu_device *adev)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
nv_grbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
 }
 
-static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
+static void acquire_queue(struct amdgpu_device *adev, uint32_t pipe_id,
uint32_t queue_id)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, queue_id, 0);
+   lock_srbm(adev, mec, pipe, queue_id, 0);
 }
 
 static uint64_t get_queue_mask(struct amdgpu_device *adev,
@@ -80,9 +74,9 @@ static uint64_t get_queue_mask(struct amdgpu_device *adev,
return 1ull << bit;
 }
 
-static void release_queue(struct kgd_dev *kgd)
+static void release_queue(struct amdgpu_device *adev)
 {
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 static void program_sh_mem_settings_v10_3(struct kgd_dev *kgd, uint32_t vmid,
@@ -93,13 +87,13 @@ static void program_sh_mem_settings_v10_3(struct kgd_dev 
*kgd, uint32_t vmid,
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-   lock_srbm(kgd, 0, 0, 0, vmid);
+   lock_srbm(adev, 0, 0, 0, vmid);
 
WREG32_SOC15(GC, 0, mmSH_MEM_CONFIG, sh_mem_config);
WREG32_SOC15(GC, 0, mmSH_MEM_BASES, sh_mem_bases);
/* APE1 no longer exists on GFX9 */
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 /* ATC is defeatured on Sienna_Cichlid */
@@ -127,13 +121,13 @@ static int init_interrupts_v10_3(struct kgd_dev *kgd, 
uint32_t pipe_id)
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, 0, 0);
+   lock_srbm(adev, mec, pipe, 0, 0);
 
WREG32_SOC15(GC, 0, mmCPC_INT_CNTL,
CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK |
CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK);
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 
return 0;
 }
@@ -201,7 +195,7 @@ static int hqd_load_v10_3(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
m = get_mqd(mqd);
 
pr_debug("Load hqd of pipe %d queue %d\n", pipe_id, queue_id);
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
/* HIQ is set during driver init period with vmid set to 0*/
if (m->cp_hqd_vmid == 0) {
@@ -281,7 +275,7 @@ static int hqd_load_v10_3(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
WREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE, data);
 
-   release_queue(kgd);
+   release_queue(adev);
 
return 0;
 }
@@ -298,7 +292,7 @@ static int hiq_mqd_load_v10_3(struct kgd_dev *kgd, void 
*mqd,
 
m = get_mqd(mqd);
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
@@ -334,7 +328,7 @@ static int hiq_mqd_load_v10_3(struct kgd_dev *kgd, void 
*mqd,
 
 out_unlock:
spin_unlock(&adev->gfx.kiq.ring_lock);
-   release_queue(kgd);
+   release_queue(adev);
 
return r;
 }
@@ -357,13 +351,13 @@ static int hqd_dump_v10_3(struct kgd_dev *kgd,
if (*dump == NULL)
return -ENOMEM;
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
for (reg = SOC15_REG_OFFSET(GC, 0, mmCP_MQD_BASE_ADDR);
 reg <= SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_HI); reg++)
DUMP_REG(reg);
 
-   release_queue(kgd);
+   release_queue(adev);
 
WARN_ON_ONCE(i != HQD_

[PATCH 05/13] drm/amdkfd: replace kgd_dev in static gfx v10 funcs

2021-10-19 Thread Graham Sider
Static funcs in amdgpu_amdkfd_gfx_v10.c now use amdgpu_device instead of kgd_dev.

Signed-off-by: Graham Sider 
---
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 52 ---
 1 file changed, 23 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index 960acf68150a..69aee5b52f64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -44,32 +44,26 @@ static inline struct amdgpu_device 
*get_amdgpu_device(struct kgd_dev *kgd)
return (struct amdgpu_device *)kgd;
 }
 
-static void lock_srbm(struct kgd_dev *kgd, uint32_t mec, uint32_t pipe,
+static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
mutex_lock(&adev->srbm_mutex);
nv_grbm_select(adev, mec, pipe, queue, vmid);
 }
 
-static void unlock_srbm(struct kgd_dev *kgd)
+static void unlock_srbm(struct amdgpu_device *adev)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
nv_grbm_select(adev, 0, 0, 0, 0);
mutex_unlock(&adev->srbm_mutex);
 }
 
-static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
+static void acquire_queue(struct amdgpu_device *adev, uint32_t pipe_id,
uint32_t queue_id)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, queue_id, 0);
+   lock_srbm(adev, mec, pipe, queue_id, 0);
 }
 
 static uint64_t get_queue_mask(struct amdgpu_device *adev,
@@ -81,9 +75,9 @@ static uint64_t get_queue_mask(struct amdgpu_device *adev,
return 1ull << bit;
 }
 
-static void release_queue(struct kgd_dev *kgd)
+static void release_queue(struct amdgpu_device *adev)
 {
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 static void kgd_program_sh_mem_settings(struct kgd_dev *kgd, uint32_t vmid,
@@ -94,13 +88,13 @@ static void kgd_program_sh_mem_settings(struct kgd_dev 
*kgd, uint32_t vmid,
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-   lock_srbm(kgd, 0, 0, 0, vmid);
+   lock_srbm(adev, 0, 0, 0, vmid);
 
WREG32_SOC15(GC, 0, mmSH_MEM_CONFIG, sh_mem_config);
WREG32_SOC15(GC, 0, mmSH_MEM_BASES, sh_mem_bases);
/* APE1 no longer exists on GFX9 */
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, u32 pasid,
@@ -159,13 +153,13 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, 
uint32_t pipe_id)
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, 0, 0);
+   lock_srbm(adev, mec, pipe, 0, 0);
 
WREG32_SOC15(GC, 0, mmCPC_INT_CNTL,
CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK |
CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK);
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 
return 0;
 }
@@ -231,7 +225,7 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
m = get_mqd(mqd);
 
pr_debug("Load hqd of pipe %d queue %d\n", pipe_id, queue_id);
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
/* HQD registers extend from CP_MQD_BASE_ADDR to CP_HQD_EOP_WPTR_MEM. */
mqd_hqd = &m->cp_mqd_base_addr_lo;
@@ -296,7 +290,7 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
WREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE, data);
 
-   release_queue(kgd);
+   release_queue(adev);
 
return 0;
 }
@@ -313,7 +307,7 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
 
m = get_mqd(mqd);
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
@@ -349,7 +343,7 @@ static int kgd_hiq_mqd_load(struct kgd_dev *kgd, void *mqd,
 
 out_unlock:
spin_unlock(&adev->gfx.kiq.ring_lock);
-   release_queue(kgd);
+   release_queue(adev);
 
return r;
 }
@@ -372,13 +366,13 @@ static int kgd_hqd_dump(struct kgd_dev *kgd,
if (*dump == NULL)
return -ENOMEM;
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
for (reg = SOC15_REG_OFFSET(GC, 0, mmCP_MQD_BASE_ADDR);
 reg <= SOC15_REG_OFFSET(GC, 0, mmCP_HQD_PQ_WPTR_HI); reg++)
DUMP_REG(reg);
 
-   release_queue(kgd);
+   release_queue(adev);
 
WA

[PATCH 02/13] drm/amdkfd: replace kgd_dev in static gfx v7 funcs

2021-10-19 Thread Graham Sider
Static funcs in amdgpu_amdkfd_gfx_v7.c now use amdgpu_device instead of kgd_dev.

Signed-off-by: Graham Sider 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 51 +--
 1 file changed, 23 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index b91d27e39bad..d00ba8d65a6d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -87,38 +87,33 @@ static inline struct amdgpu_device 
*get_amdgpu_device(struct kgd_dev *kgd)
return (struct amdgpu_device *)kgd;
 }
 
-static void lock_srbm(struct kgd_dev *kgd, uint32_t mec, uint32_t pipe,
+static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t value = PIPEID(pipe) | MEID(mec) | VMID(vmid) | QUEUEID(queue);
 
mutex_lock(&adev->srbm_mutex);
WREG32(mmSRBM_GFX_CNTL, value);
 }
 
-static void unlock_srbm(struct kgd_dev *kgd)
+static void unlock_srbm(struct amdgpu_device *adev)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
WREG32(mmSRBM_GFX_CNTL, 0);
mutex_unlock(&adev->srbm_mutex);
 }
 
-static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
+static void acquire_queue(struct amdgpu_device *adev, uint32_t pipe_id,
uint32_t queue_id)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, queue_id, 0);
+   lock_srbm(adev, mec, pipe, queue_id, 0);
 }
 
-static void release_queue(struct kgd_dev *kgd)
+static void release_queue(struct amdgpu_device *adev)
 {
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 static void kgd_program_sh_mem_settings(struct kgd_dev *kgd, uint32_t vmid,
@@ -129,14 +124,14 @@ static void kgd_program_sh_mem_settings(struct kgd_dev 
*kgd, uint32_t vmid,
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-   lock_srbm(kgd, 0, 0, 0, vmid);
+   lock_srbm(adev, 0, 0, 0, vmid);
 
WREG32(mmSH_MEM_CONFIG, sh_mem_config);
WREG32(mmSH_MEM_APE1_BASE, sh_mem_ape1_base);
WREG32(mmSH_MEM_APE1_LIMIT, sh_mem_ape1_limit);
WREG32(mmSH_MEM_BASES, sh_mem_bases);
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, u32 pasid,
@@ -174,12 +169,12 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, 
uint32_t pipe_id)
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, 0, 0);
+   lock_srbm(adev, mec, pipe, 0, 0);
 
WREG32(mmCPC_INT_CNTL, CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK |
CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK);
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 
return 0;
 }
@@ -220,7 +215,7 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
 
m = get_mqd(mqd);
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
/* HQD registers extend from CP_MQD_BASE_ADDR to CP_MQD_CONTROL. */
mqd_hqd = &m->cp_mqd_base_addr_lo;
@@ -239,16 +234,16 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
 * release srbm_mutex to avoid circular dependency between
 * srbm_mutex->mm_sem->reservation_ww_class_mutex->srbm_mutex.
 */
-   release_queue(kgd);
+   release_queue(adev);
valid_wptr = read_user_wptr(mm, wptr, wptr_val);
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
if (valid_wptr)
WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << wptr_shift) & wptr_mask);
 
data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
WREG32(mmCP_HQD_ACTIVE, data);
 
-   release_queue(kgd);
+   release_queue(adev);
 
return 0;
 }
@@ -271,7 +266,7 @@ static int kgd_hqd_dump(struct kgd_dev *kgd,
if (*dump == NULL)
return -ENOMEM;
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE0);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE1);
@@ -281,7 +276,7 @@ static int kgd_hqd_dump(struct kgd_dev *kgd,
for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_MQD_CONTROL; reg++)
DUMP_REG(reg);
 
-   release_queue(kgd);
+   release_queue(adev);
 
WARN_ON_ONCE(i != HQD_N_REGS);
*n_regs = i;
@@ -380,7 +375,7 @@ static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, 
uint64_t queue_address,
bool r

[PATCH 03/13] drm/amdkfd: replace kgd_dev in static gfx v8 funcs

2021-10-19 Thread Graham Sider
Static funcs in amdgpu_amdkfd_gfx_v8.c now use amdgpu_device instead of kgd_dev.

Signed-off-by: Graham Sider 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 51 +--
 1 file changed, 23 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 5ce0ce704a21..06be6061e4c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -44,38 +44,33 @@ static inline struct amdgpu_device 
*get_amdgpu_device(struct kgd_dev *kgd)
return (struct amdgpu_device *)kgd;
 }
 
-static void lock_srbm(struct kgd_dev *kgd, uint32_t mec, uint32_t pipe,
+static void lock_srbm(struct amdgpu_device *adev, uint32_t mec, uint32_t pipe,
uint32_t queue, uint32_t vmid)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t value = PIPEID(pipe) | MEID(mec) | VMID(vmid) | QUEUEID(queue);
 
mutex_lock(&adev->srbm_mutex);
WREG32(mmSRBM_GFX_CNTL, value);
 }
 
-static void unlock_srbm(struct kgd_dev *kgd)
+static void unlock_srbm(struct amdgpu_device *adev)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
WREG32(mmSRBM_GFX_CNTL, 0);
mutex_unlock(&adev->srbm_mutex);
 }
 
-static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
+static void acquire_queue(struct amdgpu_device *adev, uint32_t pipe_id,
uint32_t queue_id)
 {
-   struct amdgpu_device *adev = get_amdgpu_device(kgd);
-
uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, queue_id, 0);
+   lock_srbm(adev, mec, pipe, queue_id, 0);
 }
 
-static void release_queue(struct kgd_dev *kgd)
+static void release_queue(struct amdgpu_device *adev)
 {
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 static void kgd_program_sh_mem_settings(struct kgd_dev *kgd, uint32_t vmid,
@@ -86,14 +81,14 @@ static void kgd_program_sh_mem_settings(struct kgd_dev 
*kgd, uint32_t vmid,
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-   lock_srbm(kgd, 0, 0, 0, vmid);
+   lock_srbm(adev, 0, 0, 0, vmid);
 
WREG32(mmSH_MEM_CONFIG, sh_mem_config);
WREG32(mmSH_MEM_APE1_BASE, sh_mem_ape1_base);
WREG32(mmSH_MEM_APE1_LIMIT, sh_mem_ape1_limit);
WREG32(mmSH_MEM_BASES, sh_mem_bases);
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 }
 
 static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, u32 pasid,
@@ -132,12 +127,12 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, 
uint32_t pipe_id)
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
-   lock_srbm(kgd, mec, pipe, 0, 0);
+   lock_srbm(adev, mec, pipe, 0, 0);
 
WREG32(mmCPC_INT_CNTL, CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK |
CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK);
 
-   unlock_srbm(kgd);
+   unlock_srbm(adev);
 
return 0;
 }
@@ -178,7 +173,7 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
 
m = get_mqd(mqd);
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
/* HIQ is set during driver init period with vmid set to 0*/
if (m->cp_hqd_vmid == 0) {
@@ -226,16 +221,16 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, 
uint32_t pipe_id,
 * release srbm_mutex to avoid circular dependency between
 * srbm_mutex->mm_sem->reservation_ww_class_mutex->srbm_mutex.
 */
-   release_queue(kgd);
+   release_queue(adev);
valid_wptr = read_user_wptr(mm, wptr, wptr_val);
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
if (valid_wptr)
WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << wptr_shift) & wptr_mask);
 
data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
WREG32(mmCP_HQD_ACTIVE, data);
 
-   release_queue(kgd);
+   release_queue(adev);
 
return 0;
 }
@@ -258,7 +253,7 @@ static int kgd_hqd_dump(struct kgd_dev *kgd,
if (*dump == NULL)
return -ENOMEM;
 
-   acquire_queue(kgd, pipe_id, queue_id);
+   acquire_queue(adev, pipe_id, queue_id);
 
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE0);
DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE1);
@@ -268,7 +263,7 @@ static int kgd_hqd_dump(struct kgd_dev *kgd,
for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_HQD_EOP_DONES; reg++)
DUMP_REG(reg);
 
-   release_queue(kgd);
+   release_queue(adev);
 
WARN_ON_ONCE(i != HQD_N_REGS);
*n_regs = i;
@@ -375,7 +370,7 @@ static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, 
uint64_t queue_address,
bool retval = false;

[PATCH 01/13] drm/amdkfd: add amdgpu_device entry to kfd_dev

2021-10-19 Thread Graham Sider
Patch series to remove the kgd_dev struct and replace all instances with
amdgpu_device objects.

amdgpu_device needs to be declared in kgd_kfd_interface.h to be visible
to kfd2kgd_calls.

Signed-off-by: Graham Sider 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 1 +
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 1 +
 3 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 0fffaf859c59..81ca00d7b3da 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -825,6 +825,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, bool vf)
return NULL;
 
kfd->kgd = kgd;
+   kfd->adev = adev;
kfd->device_info = device_info;
kfd->pdev = pdev;
kfd->init_complete = false;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 6d8f9bb2d905..c8bd062fb954 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -229,6 +229,7 @@ struct kfd_vmid_info {
 
 struct kfd_dev {
struct kgd_dev *kgd;
+   struct amdgpu_device *adev;
 
const struct kfd_device_info *device_info;
struct pci_dev *pdev;
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index c84bd7b2cf59..ba444cbf9206 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -33,6 +33,7 @@
 #include 
 
 struct pci_dev;
+struct amdgpu_device;
 
 #define KGD_MAX_QUEUES 128
 
-- 
2.25.1



[PATCH] drm/amdgpu/display: remove unused variable in dcn31_init_hw()

2021-10-19 Thread Alex Deucher
Unused.  Remove it.

Fixes: d1065882691179 ("Revert "drm/amd/display: Add helper for blanking all dp 
displays"")
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c
index 7308c4c744ba..9a6ad1cebc85 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c
@@ -73,7 +73,6 @@ void dcn31_init_hw(struct dc *dc)
struct resource_pool *res_pool = dc->res_pool;
uint32_t backlight = MAX_BACKLIGHT_LEVEL;
int i, j;
-   int edp_num;
 
if (dc->clk_mgr && dc->clk_mgr->funcs->init_clocks)
dc->clk_mgr->funcs->init_clocks(dc->clk_mgr);
-- 
2.31.1



Re: [PATCH] amdgpu: replace snprintf in show functions with sysfs_emit

2021-10-19 Thread Alex Deucher
Applied.  Thanks!

On Fri, Oct 15, 2021 at 2:48 AM Qing Wang  wrote:
>
> show() must not use snprintf() when formatting the value to be
> returned to user space.
>
> Fix the following coccicheck warning:
> drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:427:
> WARNING: use scnprintf or sprintf.
>
> Signed-off-by: Qing Wang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> index 2834981..faf4011 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> @@ -424,7 +424,7 @@ static ssize_t show_##name(struct device *dev,
>   \
> struct drm_device *ddev = dev_get_drvdata(dev); \
> struct amdgpu_device *adev = drm_to_adev(ddev); \
> \
> -   return snprintf(buf, PAGE_SIZE, "0x%08x\n", adev->field);   \
> +   return sysfs_emit(buf, "0x%08x\n", adev->field);\
>  }  \
>  static DEVICE_ATTR(name, mode, show_##name, NULL)
>
> --
> 2.7.4
>


[PATCH 4/4] drm/amdgpu/vcn3.0: remove intermediate variable

2021-10-19 Thread Alex Deucher
No need for the id variable; just use the constant
plus the instance offset directly.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 57b62fb04750..da11ceba0698 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -60,11 +60,6 @@ static int amdgpu_ih_clientid_vcns[] = {
SOC15_IH_CLIENTID_VCN1
 };
 
-static int amdgpu_ucode_id_vcns[] = {
-   AMDGPU_UCODE_ID_VCN,
-   AMDGPU_UCODE_ID_VCN1
-};
-
 static int vcn_v3_0_start_sriov(struct amdgpu_device *adev);
 static void vcn_v3_0_set_dec_ring_funcs(struct amdgpu_device *adev);
 static void vcn_v3_0_set_enc_ring_funcs(struct amdgpu_device *adev);
@@ -1278,7 +1273,6 @@ static int vcn_v3_0_start_sriov(struct amdgpu_device 
*adev)
uint32_t param, resp, expected;
uint32_t offset, cache_size;
uint32_t tmp, timeout;
-   uint32_t id;
 
struct amdgpu_mm_table *table = &adev->virt.mm_table;
uint32_t *table_loc;
@@ -1322,13 +1316,12 @@ static int vcn_v3_0_start_sriov(struct amdgpu_device 
*adev)
cache_size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw->size + 4);
 
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   id = amdgpu_ucode_id_vcns[i];
MMSCH_V3_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCN, i,
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
-   adev->firmware.ucode[id].tmr_mc_addr_lo);
+   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN + i].tmr_mc_addr_lo);
MMSCH_V3_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCN, i,
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
-   adev->firmware.ucode[id].tmr_mc_addr_hi);
+   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN + i].tmr_mc_addr_hi);
offset = 0;
MMSCH_V3_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCN, i,
mmUVD_VCPU_CACHE_OFFSET0),
-- 
2.31.1



[PATCH 3/4] drm/amdgpu/vcn2.0: remove intermediate variable

2021-10-19 Thread Alex Deucher
No need for the tmp variable; just use the constant
directly.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 3883df5b31ab..313fc1b53999 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -1876,15 +1876,14 @@ static int vcn_v2_0_start_sriov(struct amdgpu_device *adev)
 
/* mc resume*/
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   tmp = AMDGPU_UCODE_ID_VCN;
MMSCH_V2_0_INSERT_DIRECT_WT(
SOC15_REG_OFFSET(UVD, i,
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW),
-   adev->firmware.ucode[tmp].tmr_mc_addr_lo);
+   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].tmr_mc_addr_lo);
MMSCH_V2_0_INSERT_DIRECT_WT(
SOC15_REG_OFFSET(UVD, i,
mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH),
-   adev->firmware.ucode[tmp].tmr_mc_addr_hi);
+   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].tmr_mc_addr_hi);
offset = 0;
} else {
MMSCH_V2_0_INSERT_DIRECT_WT(
-- 
2.31.1



[PATCH 2/4] drm/amdgpu: Consolidate VCN firmware setup code

2021-10-19 Thread Alex Deucher
Roughly the same code was present in all VCN versions.
Consolidate it into a single function.

v2: use AMDGPU_UCODE_ID_VCN + i, check if num_inst >= 2

Signed-off-by: Alex Deucher 
Reviewed-by: James Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 27 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  2 ++
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c   | 10 +
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c   | 10 +
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c   | 17 +---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c   | 17 +---
 6 files changed, 33 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index c7d316850570..2658414c503d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -949,3 +949,30 @@ enum amdgpu_ring_priority_level amdgpu_vcn_get_enc_ring_prio(int ring)
return AMDGPU_RING_PRIO_0;
}
 }
+
+void amdgpu_vcn_setup_ucode(struct amdgpu_device *adev)
+{
+   int i;
+   unsigned int idx;
+
+   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+   const struct common_firmware_header *hdr;
+   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
+
+   for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+   if (adev->vcn.harvest_config & (1 << i))
+   continue;
+   /* currently only support 2 FW instances */
+   if (i >= 2) {
+   dev_info(adev->dev, "More than 2 VCN FW instances!\n");
+   break;
+   }
+   idx = AMDGPU_UCODE_ID_VCN + i;
+   adev->firmware.ucode[idx].ucode_id = idx;
+   adev->firmware.ucode[idx].fw = adev->vcn.fw;
+   adev->firmware.fw_size +=
+   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
+   }
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
index 795cbaa02ff8..bfa27ea94804 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -310,4 +310,6 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout);
 
 enum amdgpu_ring_priority_level amdgpu_vcn_get_enc_ring_prio(int ring);
 
+void amdgpu_vcn_setup_ucode(struct amdgpu_device *adev);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index ad0d2564087c..d54d720b3cf6 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -111,15 +111,7 @@ static int vcn_v1_0_sw_init(void *handle)
/* Override the work func */
adev->vcn.idle_work.work.func = vcn_v1_0_idle_work_handler;
 
-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   const struct common_firmware_header *hdr;
-   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
-   }
+   amdgpu_vcn_setup_ucode(adev);
 
r = amdgpu_vcn_resume(adev);
if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 091d8c0f6801..3883df5b31ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -115,15 +115,7 @@ static int vcn_v2_0_sw_init(void *handle)
if (r)
return r;
 
-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   const struct common_firmware_header *hdr;
-   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
-   }
+   amdgpu_vcn_setup_ucode(adev);
 
r = amdgpu_vcn_resume(adev);
if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 59f469bab005..44fc4c218433 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -139,22 +139,7 @@ static int vcn_v2_5_sw_init(void *handle)
if (r)
  

[PATCH 1/4] drm/amdgpu/vcn3.0: handle harvesting in firmware setup

2021-10-19 Thread Alex Deucher
Only enable firmware for the instance that is enabled.

v2: use AMDGPU_UCODE_ID_VCN + i

Fixes: 1b592d00b4ac83 ("drm/amdgpu/vcn: remove manual instance setting")
Reviewed-by: James Zhu 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index dbfd92984655..49752574a13c 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -123,6 +123,7 @@ static int vcn_v3_0_sw_init(void *handle)
 {
struct amdgpu_ring *ring;
int i, j, r;
+   unsigned int idx;
int vcn_doorbell_index = 0;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
@@ -133,14 +134,13 @@ static int vcn_v3_0_sw_init(void *handle)
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
const struct common_firmware_header *hdr;
hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-
-   if (adev->vcn.num_vcn_inst == VCN_INSTANCES_SIENNA_CICHLID) {
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].ucode_id = AMDGPU_UCODE_ID_VCN1;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].fw = adev->vcn.fw;
+
+   for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+   if (adev->vcn.harvest_config & (1 << i))
+   continue;
+   idx = AMDGPU_UCODE_ID_VCN + i;
+   adev->firmware.ucode[idx].ucode_id = idx;
+   adev->firmware.ucode[idx].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
}
-- 
2.31.1



Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-19 Thread Dan Williams
On Tue, Oct 19, 2021 at 9:02 AM Jason Gunthorpe  wrote:
>
> On Tue, Oct 19, 2021 at 04:13:34PM +0100, Joao Martins wrote:
> > On 10/19/21 00:06, Jason Gunthorpe wrote:
> > > On Mon, Oct 18, 2021 at 12:37:30PM -0700, Dan Williams wrote:
> > >
> > >>> device-dax uses PUD, along with TTM, they are the only places. I'm not
> > >>> sure TTM is a real place though.
> > >>
> > >> I was setting device-dax aside because it can use Joao's changes to
> > >> get compound-page support.
> > >
> > > Ideally, but the ideas in that patch series have been floating around
> > > for a long time now..
> > >
> > The current status of the series misses a Rb on patches 6,7,10,12-14.
> > Well, patch 8 too should now drop its tag, considering the latest
> > discussion.
> >
> > If it helps moving things forward I could split my series further into:
> >
> > 1) the compound page introduction (patches 1-7) of my aforementioned series
> > 2) vmemmap deduplication for memory gains (patches 9-14)
> > 3) gup improvements (patch 8 and gup-slow improvements)
>
> I would split it, yes..
>
> I think we can see a general consensus that making compound_head/etc
> work consistently with how THP uses it will provide value and
> opportunity for optimization going forward.
>
> > What's the benefit of preventing longterm at start
> > versus only after mounting the filesystem? Or is the intended future purpose
> > to pass more context into a holder's potential future callback, e.g. nack
> > longterm pins on a page basis?
>
> I understood Dan's remark that the device-dax path allows
> FOLL_LONGTERM and the FSDAX path does not ?
>
> Which, IIRC, today is signaled based on vma properties and in all cases
> fast-gup is denied.

Yeah, I forgot that 7af75561e171 eliminated any possibility of
longterm-gup-fast for device-dax, let's not disturb that status quo.

> > Maybe we can start by at least not add any flags and just prevent
> > FOLL_LONGTERM on fsdax -- which I guess was the original purpose of
> > commit 7af75561e171 ("mm/gup: add FOLL_LONGTERM capability to GUP fast").
> > This patch (which I can formally send) has a sketch of that (below scissors 
> > mark):
> >
> > https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10f...@oracle.com/
>
> Yes, basically, whatever test we want for 'deny fast gup foll
> longterm' is fine.
>
> Personally I'd like to see us move toward a set of flag specifying
> each special behavior and not a collection of types that imply special
> behaviors.
>
> Eg we have at least:
>  - Block gup fast on foll_longterm
>  - Capture the refcount ==1 and use the pgmap free hook
>(confusingly called page_is_devmap_managed())
>  - Always use a swap entry
>  - page->index/mapping are used in the usual file based way?
>
> Probably more things..

Yes, agree with the principle of reducing type-implied special casing.



Re: [PATCH 2/4] drm/amdgpu: Clarify error when hitting bad page threshold

2021-10-19 Thread Luben Tuikov
Reviewed-by: Luben Tuikov 

Regards,
Luben

On 2021-10-19 13:50, Kent Russell wrote:
> Change the error message when the bad_page_threshold is reached,
> explicitly stating that the GPU will not be initialized.
>
> Cc: Luben Tuikov 
> Cc: Mukul Joshi 
> Signed-off-by: Kent Russell 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 8270aad23a06..7bb506a0ebd6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -,7 +,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>   *exceed_err_limit = true;
>   dev_err(adev->dev,
>   "RAS records:%d exceed threshold:%d, "
> - "maybe retire this GPU?",
> + "GPU will not be initialized. Replace this GPU or increase the threshold",
>   control->ras_num_recs, ras->bad_page_cnt_threshold);
>   }
>   } else {



Re: [PATCH 1/4] drm/amdgpu: Warn when bad pages approaches threshold

2021-10-19 Thread Luben Tuikov
On 2021-10-19 14:22, Russell, Kent wrote:
> [AMD Official Use Only]
>
>
>
>> -Original Message-
>> From: Kuehling, Felix 
>> Sent: Tuesday, October 19, 2021 2:09 PM
>> To: Russell, Kent ; amd-gfx@lists.freedesktop.org
>> Cc: Tuikov, Luben ; Joshi, Mukul 
>> Subject: Re: [PATCH 1/4] drm/amdgpu: Warn when bad pages approaches threshold
>>
>> Am 2021-10-19 um 1:50 p.m. schrieb Kent Russell:
>>> Currently dmesg doesn't warn when the number of bad pages approaches the
>>> threshold for page retirement. WARN when the number of bad pages
>>> is at 90% or greater for easier checks and planning, instead of waiting
>>> until the GPU is full of bad pages
>>>
>>> Cc: Luben Tuikov 
>>> Cc: Mukul Joshi 
>>> Signed-off-by: Kent Russell 
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 ++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>>> index 98732518543e..8270aad23a06 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>>> @@ -1077,6 +1077,16 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>>> if (res)
>>> DRM_ERROR("RAS table incorrect checksum or error:%d\n",
>>>   res);
>>> +
>>> +   /* threshold = -1 is automatic, threshold = 0 means that page
>>> +* retirement is disabled.
>>> +*/
>>> +   if (amdgpu_bad_page_threshold > 0 &&
>>> +   control->ras_num_recs >= 0 &&
>>> +   control->ras_num_recs >= (amdgpu_bad_page_threshold * 9 / 10))
>>> +   DRM_WARN("RAS records:%u approaching threshold:%d",
>>> +   control->ras_num_recs,
>>> +   amdgpu_bad_page_threshold);
>> This won't work for the default setting amdgpu_bad_page_threshold=-1.
>> For this case, you'd have to take the threshold from
>> ras->bad_page_cnt_threshold.
> Yep, completely missed that. Thanks, I'll fix that up.

Please also fix the round off, third conditional:

a >= b * 9/10   <==>   10*a >= 9*b

Then, you can also drop the second line, since from the first:

b > 0  ==>  10*a >= 9*b > 0  ==>  10*a > 0  ==>  a > 0.

Which shows that

b > 0 && 10*a >= 9*b

is true only when a and b are both greater than 0, so you don't need
the middle line of the check.

Also in your message, say something like:

DRM_WARN("RAS records:%u approaching a 90%% threshold:%d",
         control->ras_num_recs,
         amdgpu_bad_page_threshold);

Regards,
Luben

>
>  Kent
>> Regards,
>>    Felix
>>
>>
>>> } else if (hdr->header == RAS_TABLE_HDR_BAD &&
>>>amdgpu_bad_page_threshold != 0) {
>>> res = __verify_ras_table_checksum(control);



Re: [PATCH v2 13/13] drm/i915: replace drm_detect_hdmi_monitor() with drm_display_info.is_hdmi

2021-10-19 Thread Ville Syrjälä
On Sat, Oct 16, 2021 at 08:42:26PM +0200, Claudio Suarez wrote:
> Once EDID is parsed, the monitor HDMI support information is available
> through drm_display_info.is_hdmi. Retrieving the same information with
> drm_detect_hdmi_monitor() is less efficient. Change to
> drm_display_info.is_hdmi where possible.

We still need proof in the commit message that display_info
is actually populated by the time this gets called.

> 
> This is a TODO task in Documentation/gpu/todo.rst
> 
> Signed-off-by: Claudio Suarez 
> ---
>  drivers/gpu/drm/i915/display/intel_hdmi.c | 2 +-
>  drivers/gpu/drm/i915/display/intel_sdvo.c | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c b/drivers/gpu/drm/i915/display/intel_hdmi.c
> index b04685bb6439..008e5b0ba408 100644
> --- a/drivers/gpu/drm/i915/display/intel_hdmi.c
> +++ b/drivers/gpu/drm/i915/display/intel_hdmi.c
> @@ -2355,7 +2355,7 @@ intel_hdmi_set_edid(struct drm_connector *connector)
>   to_intel_connector(connector)->detect_edid = edid;
>   if (edid && edid->input & DRM_EDID_INPUT_DIGITAL) {
>   intel_hdmi->has_audio = drm_detect_monitor_audio(edid);
> - intel_hdmi->has_hdmi_sink = drm_detect_hdmi_monitor(edid);
> + intel_hdmi->has_hdmi_sink = connector->display_info.is_hdmi;
>  
>   connected = true;
>   }
> diff --git a/drivers/gpu/drm/i915/display/intel_sdvo.c b/drivers/gpu/drm/i915/display/intel_sdvo.c
> index 6cb27599ea03..b4065e4df644 100644
> --- a/drivers/gpu/drm/i915/display/intel_sdvo.c
> +++ b/drivers/gpu/drm/i915/display/intel_sdvo.c
> @@ -2060,8 +2060,9 @@ intel_sdvo_tmds_sink_detect(struct drm_connector *connector)
>   if (edid->input & DRM_EDID_INPUT_DIGITAL) {
>   status = connector_status_connected;
>   if (intel_sdvo_connector->is_hdmi) {
> - intel_sdvo->has_hdmi_monitor = drm_detect_hdmi_monitor(edid);
> + intel_sdvo->has_hdmi_audio = drm_detect_monitor_audio(edid);
> + intel_sdvo->has_hdmi_monitor =
> + connector->display_info.is_hdmi;
>   }
>   } else
>   status = connector_status_disconnected;
> -- 
> 2.33.0
> 

-- 
Ville Syrjälä
Intel


Re: [PATCH v2 01/13] gpu/drm: make drm_add_edid_modes() consistent when updating connector->display_info

2021-10-19 Thread Ville Syrjälä
On Sat, Oct 16, 2021 at 08:42:14PM +0200, Claudio Suarez wrote:
> According to the documentation, drm_add_edid_modes
> "... Also fills out the &drm_display_info structure and ELD in @connector
> with any information which can be derived from the edid."
> 
> drm_add_edid_modes accepts a struct edid *edid parameter which may have a
> value or may be null. When it is not null, connector->display_info and
> connector->eld are updated according to the edid. When edid=NULL, only
> connector->eld is reset. Reset connector->display_info to be consistent
> and accurate.
> 
> Signed-off-by: Claudio Suarez 
> ---
>  drivers/gpu/drm/drm_edid.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> index 6325877c5fd6..c643db17782c 100644
> --- a/drivers/gpu/drm/drm_edid.c
> +++ b/drivers/gpu/drm/drm_edid.c
> @@ -5356,14 +5356,13 @@ int drm_add_edid_modes(struct drm_connector *connector, struct edid *edid)
>   int num_modes = 0;
>   u32 quirks;
>  
> - if (edid == NULL) {
> - clear_eld(connector);
> - return 0;
> - }
>   if (!drm_edid_is_valid(edid)) {

OK, so drm_edid_is_valid() will happily accept NULL and considers
it invalid. You may want to mention that explicitly in the commit
message.

> + /* edid == NULL or invalid here */
>   clear_eld(connector);
> - drm_warn(connector->dev, "%s: EDID invalid.\n",
> -  connector->name);
> + drm_reset_display_info(connector);
> + if (edid)
> + drm_warn(connector->dev, "%s: EDID invalid.\n",
> +  connector->name);

Could you respin this to use the standard [CONNECTOR:%d:%s] form
while at it? Or I guess a patch to mass convert the whole drm_edid.c
might be another option.

Patch looks good.
Reviewed-by: Ville Syrjälä 


>   return 0;
>   }
>  
> -- 
> 2.33.0
> 
> 

-- 
Ville Syrjälä
Intel


RE: [PATCH 3/4] drm/amdgpu: Add kernel parameter for ignoring bad page threshold

2021-10-19 Thread Russell, Kent
[AMD Official Use Only]



> -Original Message-
> From: Kuehling, Felix 
> Sent: Tuesday, October 19, 2021 2:13 PM
> To: Russell, Kent ; amd-gfx@lists.freedesktop.org
> Cc: Tuikov, Luben ; Joshi, Mukul 
> Subject: Re: [PATCH 3/4] drm/amdgpu: Add kernel parameter for ignoring bad page threshold
> 
> 
> On 2021-10-19 at 1:50 p.m., Kent Russell wrote:
> > When a GPU hits the bad_page_threshold, it will not be initialized by
> > the amdgpu driver. This means that the table cannot be cleared, nor can
> > information gathering be performed (getting serial number, BDF, etc).
> > Add an override called ignore_bad_page_threshold that can be set to true
> > to still initialize the GPU, even when the bad page threshold has been
> > reached.
> Do you really need a new parameter for this? Wouldn't it be enough to
> set bad_page_threshold to the VRAM size? You could use a new special
> value (e.g. bad_page_threshold=-2) for that.

Ah interesting. That could definitely work here. I hadn't thought about 
co-opting another variable. We already check -1, so why not -2? Great insight. 
Thanks!

 Kent

> 
> Regards,
>   Felix
> 
> 
> >
> > Cc: Luben Tuikov 
> > Cc: Mukul Joshi 
> > Signed-off-by: Kent Russell 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 13 +
> >  2 files changed, 14 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index d58e37fd01f4..b85b67a88a3d 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -205,6 +205,7 @@ extern struct amdgpu_mgpu_info mgpu_info;
> >  extern int amdgpu_ras_enable;
> >  extern uint amdgpu_ras_mask;
> >  extern int amdgpu_bad_page_threshold;
> > +extern bool amdgpu_ignore_bad_page_threshold;
> >  extern struct amdgpu_watchdog_timer amdgpu_watchdog_timer;
> >  extern int amdgpu_async_gfx_ring;
> >  extern int amdgpu_mcbp;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index 96bd63aeeddd..3e9a7b072888 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -189,6 +189,7 @@ struct amdgpu_mgpu_info mgpu_info = {
> >  int amdgpu_ras_enable = -1;
> >  uint amdgpu_ras_mask = 0x;
> >  int amdgpu_bad_page_threshold = -1;
> > +bool amdgpu_ignore_bad_page_threshold;
> >  struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
> > .timeout_fatal_disable = false,
> > .period = 0x0, /* default to 0x0 (timeout disable) */
> > @@ -880,6 +881,18 @@ module_param_named(reset_method, amdgpu_reset_method, int, 0444);
> >  MODULE_PARM_DESC(bad_page_threshold, "Bad page threshold(-1 = auto(default value), 0 = disable bad page retirement)");
> >  module_param_named(bad_page_threshold, amdgpu_bad_page_threshold, int, 0444);
> >
> > +/**
> > + * DOC: ignore_bad_page_threshold (bool) Bad page threshold specifies
> > + * the threshold value of faulty pages detected by RAS ECC. Once the
> > + * threshold is hit, the GPU will not be initialized. Use this parameter
> > + * to ignore the bad page threshold so that information gathering can
> > + * still be performed. This also allows for booting the GPU to clear
> > + * the RAS EEPROM table.
> > + */
> > +
> > +MODULE_PARM_DESC(ignore_bad_page_threshold, "Ignore bad page threshold (false = respect bad page threshold (default value)");
> > +module_param_named(ignore_bad_page_threshold, amdgpu_ignore_bad_page_threshold, bool, 0644);
> > +
> >  MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup (8 if set to greater than 8 or less than 0, only affect gfx 8+)");
> >  module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
> >


RE: [PATCH 1/4] drm/amdgpu: Warn when bad pages approaches threshold

2021-10-19 Thread Russell, Kent
[AMD Official Use Only]



> -Original Message-
> From: Kuehling, Felix 
> Sent: Tuesday, October 19, 2021 2:09 PM
> To: Russell, Kent ; amd-gfx@lists.freedesktop.org
> Cc: Tuikov, Luben ; Joshi, Mukul 
> Subject: Re: [PATCH 1/4] drm/amdgpu: Warn when bad pages approaches threshold
> 
> On 2021-10-19 at 1:50 p.m., Kent Russell wrote:
> > Currently dmesg doesn't warn when the number of bad pages approaches the
> > threshold for page retirement. WARN when the number of bad pages
> > is at 90% or greater for easier checks and planning, instead of waiting
> > until the GPU is full of bad pages
> >
> > Cc: Luben Tuikov 
> > Cc: Mukul Joshi 
> > Signed-off-by: Kent Russell 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> > index 98732518543e..8270aad23a06 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> > @@ -1077,6 +1077,16 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
> > if (res)
> > DRM_ERROR("RAS table incorrect checksum or error:%d\n",
> >   res);
> > +
> > +   /* threshold = -1 is automatic, threshold = 0 means that page
> > +* retirement is disabled.
> > +*/
> > +   if (amdgpu_bad_page_threshold > 0 &&
> > +   control->ras_num_recs >= 0 &&
> > +   control->ras_num_recs >= (amdgpu_bad_page_threshold * 9 / 10))
> > +   DRM_WARN("RAS records:%u approaching threshold:%d",
> > +   control->ras_num_recs,
> > +   amdgpu_bad_page_threshold);
> 
> This won't work for the default setting amdgpu_bad_page_threshold=-1.
> For this case, you'd have to take the threshold from
> ras->bad_page_cnt_threshold.

Yep, completely missed that. Thanks, I'll fix that up.

 Kent
> 
> Regards,
>    Felix
> 
> 
> > } else if (hdr->header == RAS_TABLE_HDR_BAD &&
> >amdgpu_bad_page_threshold != 0) {
> > res = __verify_ras_table_checksum(control);


[PATCH 1/3] drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr

2021-10-19 Thread Nirmoy Das
Do not allow the exported amdgpu_gtt_mgr_*() functions to accept
an arbitrary ttm_resource_manager pointer. Also, there is no need
to force other modules to call a TTM function just to
eventually call the gtt_mgr functions.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 31 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +--
 4 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 41ce86244144..5807df52031c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4287,7 +4287,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 
amdgpu_virt_init_data_exchange(adev);
/* we need recover gart prior to run SMC/CP/SDMA resume */
-   amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev, TTM_PL_TT));
+   amdgpu_gtt_mgr_recover(adev);
 
r = amdgpu_device_fw_loading(adev);
if (r)
@@ -4604,7 +4604,7 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
amdgpu_inc_vram_lost(tmp_adev);
}
 
-   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&tmp_adev->mman.bdev, TTM_PL_TT));
+   r = amdgpu_gtt_mgr_recover(tmp_adev);
if (r)
goto out;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index c18f16b3be9c..5e41f8ef743a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -77,10 +77,8 @@ static ssize_t amdgpu_mem_info_gtt_used_show(struct device *dev,
 {
struct drm_device *ddev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(ddev);
-   struct ttm_resource_manager *man;
 
-   man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);
-   return sysfs_emit(buf, "%llu\n", amdgpu_gtt_mgr_usage(man));
+   return sysfs_emit(buf, "%llu\n", amdgpu_gtt_mgr_usage(adev));
 }
 
 static DEVICE_ATTR(mem_info_gtt_total, S_IRUGO,
@@ -206,14 +204,19 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager *man,
 /**
  * amdgpu_gtt_mgr_usage - return usage of GTT domain
  *
- * @man: TTM memory type manager
+ * @adev: amdgpu_device pointer
  *
  * Return how many bytes are used in the GTT domain
  */
-uint64_t amdgpu_gtt_mgr_usage(struct ttm_resource_manager *man)
+uint64_t amdgpu_gtt_mgr_usage(struct amdgpu_device *adev)
 {
-   struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
-   s64 result = man->size - atomic64_read(&mgr->available);
+   struct ttm_resource_manager *man;
+   struct amdgpu_gtt_mgr *mgr;
+   s64 result;
+
+   man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);
+   mgr = to_gtt_mgr(man);
+   result = man->size - atomic64_read(&mgr->available);
 
return (result > 0 ? result : 0) * PAGE_SIZE;
 }
@@ -221,19 +224,20 @@ uint64_t amdgpu_gtt_mgr_usage(struct ttm_resource_manager *man)
 /**
  * amdgpu_gtt_mgr_recover - re-init gart
  *
- * @man: TTM memory type manager
+ * @adev: amdgpu_device pointer
  *
  * Re-init the gart for each known BO in the GTT.
  */
-int amdgpu_gtt_mgr_recover(struct ttm_resource_manager *man)
+int amdgpu_gtt_mgr_recover(struct amdgpu_device *adev)
 {
-   struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
-   struct amdgpu_device *adev;
+   struct ttm_resource_manager *man;
+   struct amdgpu_gtt_mgr *mgr;
struct amdgpu_gtt_node *node;
struct drm_mm_node *mm_node;
int r = 0;
 
-   adev = container_of(mgr, typeof(*adev), mman.gtt_mgr);
+   man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);
+   mgr = to_gtt_mgr(man);
spin_lock(&mgr->lock);
drm_mm_for_each_node(mm_node, &mgr->mm) {
node = container_of(mm_node, typeof(*node), base.mm_nodes[0]);
@@ -260,6 +264,7 @@ static void amdgpu_gtt_mgr_debug(struct ttm_resource_manager *man,
 struct drm_printer *printer)
 {
struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
+   struct amdgpu_device *adev = container_of(mgr, typeof(*adev), mman.gtt_mgr);
 
spin_lock(&mgr->lock);
drm_mm_print(&mgr->mm, printer);
@@ -267,7 +272,7 @@ static void amdgpu_gtt_mgr_debug(struct ttm_resource_manager *man,
 
drm_printf(printer, "man size:%llu pages, gtt available:%lld pages, usage:%lluMB\n",
   man->size, (u64)atomic64_read(&mgr->available),
-  amdgpu_gtt_mgr_usage(man) >> 20);
+  amdgpu_gtt_mgr_usage(adev) >> 20);
 }
 
 static const struct ttm_resource_manager_func amdgpu_gtt_mgr_func = {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.

[PATCH v2 3/3] drm/amdgpu: recover gart table at resume

2021-10-19 Thread Nirmoy Das
Get rid of the pin/unpin of the gart BO at resume/suspend and
instead pin it only once and try to recover the gart content
at resume time. This is much more stable in case there
is an OOM situation at the 2nd call to amdgpu_device_evict_resources()
while evicting the GART table.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 42 --
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c |  9 ++---
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 10 +++---
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 10 +++---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  9 ++---
 6 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5807df52031c..f69e613805db 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3941,10 +3941,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
amdgpu_fence_driver_hw_fini(adev);

amdgpu_device_ip_suspend_phase2(adev);
-   /* This second call to evict device resources is to evict
-* the gart page table using the CPU.
-*/
-   amdgpu_device_evict_resources(adev);

return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index d3e4203f6217..97a9f61fa106 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -107,33 +107,37 @@ void amdgpu_gart_dummy_page_fini(struct amdgpu_device *adev)
  *
  * @adev: amdgpu_device pointer
  *
- * Allocate video memory for GART page table
+ * Allocate and pin video memory for GART page table
  * (pcie r4xx, r5xx+).  These asics require the
  * gart table to be in video memory.
  * Returns 0 for success, error for failure.
  */
 int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev)
 {
+   struct amdgpu_bo_param bp;
int r;

-   if (adev->gart.bo == NULL) {
-   struct amdgpu_bo_param bp;
-
-   memset(&bp, 0, sizeof(bp));
-   bp.size = adev->gart.table_size;
-   bp.byte_align = PAGE_SIZE;
-   bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
-   bp.flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
-   bp.type = ttm_bo_type_kernel;
-   bp.resv = NULL;
-   bp.bo_ptr_size = sizeof(struct amdgpu_bo);
-
-   r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
-   if (r) {
-   return r;
-   }
-   }
+   if (adev->gart.bo != NULL)
+   return 0;
+
+   memset(&bp, 0, sizeof(bp));
+   bp.size = adev->gart.table_size;
+   bp.byte_align = PAGE_SIZE;
+   bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
+   bp.flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
+   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+   bp.type = ttm_bo_type_kernel;
+   bp.resv = NULL;
+   bp.bo_ptr_size = sizeof(struct amdgpu_bo);
+
+   r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
+   if (r)
+   return r;
+
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+
return 0;
 }

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 3ec5ff5a6dbe..75d584e1b0e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,9 +992,11 @@ static int gmc_v10_0_gart_enable(struct amdgpu_device *adev)
return -EINVAL;
}

-   r = amdgpu_gart_table_vram_pin(adev);
-   if (r)
-   return r;
+   if (adev->in_suspend) {
+   r = amdgpu_gtt_mgr_recover(adev);
+   if (r)
+   return r;
+   }

r = adev->gfxhub.funcs->gart_enable(adev);
if (r)
@@ -1062,7 +1064,6 @@ static void gmc_v10_0_gart_disable(struct amdgpu_device 
*adev)
 {
adev->gfxhub.funcs->gart_disable(adev);
adev->mmhub.funcs->gart_disable(adev);
-   amdgpu_gart_table_vram_unpin(adev);
 }

 static int gmc_v10_0_hw_fini(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 0a50fdaced7e..02e90d9443c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -620,9 +620,12 @@ static int gmc_v7_0_gart_enable(struct amdgpu_device *adev)
dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
return -EINVAL;
}
-   r = amdgpu_gart_table_vram_pin(adev);
-   if (r)
-   return r;
+
+   if (adev->in_suspend) {
+   r = amdgpu_gtt_mgr_recover(adev);
+   if (r)
+   return r;
+   }

table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);

@@ -758,7 

[PATCH 2/3] drm/amdgpu: do not pass ttm_resource_manager to vram_mgr

2021-10-19 Thread Nirmoy Das
Do not allow the exported amdgpu_vram_mgr_*() functions to accept
any ttm_resource_manager pointer. Also there is no need
to force other modules to call a TTM function just to
eventually call the vram_mgr functions.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c   |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c  | 10 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c  |  6 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h  |  8 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c |  5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 54 
 7 files changed, 49 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 7077f21f0021..4837c579a787 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -531,9 +531,8 @@ int amdgpu_amdkfd_get_dmabuf_info(struct kgd_dev *kgd, int 
dma_buf_fd,
 uint64_t amdgpu_amdkfd_get_vram_usage(struct kgd_dev *kgd)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
-   struct ttm_resource_manager *vram_man = 
ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM);
 
-   return amdgpu_vram_mgr_usage(vram_man);
+   return amdgpu_vram_mgr_usage(adev);
 }
 
 uint64_t amdgpu_amdkfd_get_hive_id(struct kgd_dev *kgd)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 76fe5b71e35d..f4084ca8b614 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -298,7 +298,6 @@ static void amdgpu_cs_get_threshold_for_moves(struct 
amdgpu_device *adev,
 {
s64 time_us, increment_us;
u64 free_vram, total_vram, used_vram;
-   struct ttm_resource_manager *vram_man = 
ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM);
/* Allow a maximum of 200 accumulated ms. This is basically per-IB
 * throttling.
 *
@@ -315,7 +314,7 @@ static void amdgpu_cs_get_threshold_for_moves(struct 
amdgpu_device *adev,
}
 
total_vram = adev->gmc.real_vram_size - 
atomic64_read(&adev->vram_pin_size);
-   used_vram = amdgpu_vram_mgr_usage(vram_man);
+   used_vram = amdgpu_vram_mgr_usage(adev);
free_vram = used_vram >= total_vram ? 0 : total_vram - used_vram;
 
spin_lock(&adev->mm_stats.lock);
@@ -362,7 +361,7 @@ static void amdgpu_cs_get_threshold_for_moves(struct 
amdgpu_device *adev,
if (!amdgpu_gmc_vram_full_visible(&adev->gmc)) {
u64 total_vis_vram = adev->gmc.visible_vram_size;
u64 used_vis_vram =
- amdgpu_vram_mgr_vis_usage(vram_man);
+ amdgpu_vram_mgr_vis_usage(adev);
 
if (used_vis_vram < total_vis_vram) {
u64 free_vis_vram = total_vis_vram - used_vis_vram;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index b9b38f70e416..34674ccabd67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -672,10 +672,10 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
ui64 = atomic64_read(&adev->num_vram_cpu_page_faults);
return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
case AMDGPU_INFO_VRAM_USAGE:
-   ui64 = amdgpu_vram_mgr_usage(ttm_manager_type(&adev->mman.bdev, 
TTM_PL_VRAM));
+   ui64 = amdgpu_vram_mgr_usage(adev);
return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
case AMDGPU_INFO_VIS_VRAM_USAGE:
-   ui64 = 
amdgpu_vram_mgr_vis_usage(ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM));
+   ui64 = amdgpu_vram_mgr_vis_usage(adev);
return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
case AMDGPU_INFO_GTT_USAGE:
ui64 = amdgpu_gtt_mgr_usage(adev);
@@ -709,8 +709,6 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
}
case AMDGPU_INFO_MEMORY: {
struct drm_amdgpu_memory_info mem;
-   struct ttm_resource_manager *vram_man =
-   ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM);
struct ttm_resource_manager *gtt_man =
ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);
memset(&mem, 0, sizeof(mem));
@@ -719,7 +717,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
atomic64_read(&adev->vram_pin_size) -
AMDGPU_VM_RESERVED_VRAM;
mem.vram.heap_usage =
-   amdgpu_vram_mgr_usage(vram_man);
+   amdgpu_vram_mgr_usage(adev);
mem.vram.max_allocation = mem.vram.usable_heap_size * 3 / 4;
 
mem.cpu_accessible_vram.tot

Re: [PATCH 3/4] drm/amdgpu: Add kernel parameter for ignoring bad page threshold

2021-10-19 Thread Felix Kuehling


On 2021-10-19 at 1:50 p.m., Kent Russell wrote:
> When a GPU hits the bad_page_threshold, it will not be initialized by
> the amdgpu driver. This means that the table cannot be cleared, nor can
> information gathering be performed (getting serial number, BDF, etc).
> Add an override called ignore_bad_page_threshold that can be set to true
> to still initialize the GPU, even when the bad page threshold has been
> reached.
Do you really need a new parameter for this? Wouldn't it be enough to
set bad_page_threshold to the VRAM size? You could use a new special
value (e.g. bad_page_threshold=-2) for that.

Regards,
  Felix


>
> Cc: Luben Tuikov 
> Cc: Mukul Joshi 
> Signed-off-by: Kent Russell 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 13 +
>  2 files changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index d58e37fd01f4..b85b67a88a3d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -205,6 +205,7 @@ extern struct amdgpu_mgpu_info mgpu_info;
>  extern int amdgpu_ras_enable;
>  extern uint amdgpu_ras_mask;
>  extern int amdgpu_bad_page_threshold;
> +extern bool amdgpu_ignore_bad_page_threshold;
>  extern struct amdgpu_watchdog_timer amdgpu_watchdog_timer;
>  extern int amdgpu_async_gfx_ring;
>  extern int amdgpu_mcbp;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 96bd63aeeddd..3e9a7b072888 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -189,6 +189,7 @@ struct amdgpu_mgpu_info mgpu_info = {
>  int amdgpu_ras_enable = -1;
>  uint amdgpu_ras_mask = 0x;
>  int amdgpu_bad_page_threshold = -1;
> +bool amdgpu_ignore_bad_page_threshold;
>  struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
>   .timeout_fatal_disable = false,
>   .period = 0x0, /* default to 0x0 (timeout disable) */
> @@ -880,6 +881,18 @@ module_param_named(reset_method, amdgpu_reset_method, 
> int, 0444);
>  MODULE_PARM_DESC(bad_page_threshold, "Bad page threshold(-1 = auto(default 
> value), 0 = disable bad page retirement)");
>  module_param_named(bad_page_threshold, amdgpu_bad_page_threshold, int, 0444);
>  
> +/**
> + * DOC: ignore_bad_page_threshold (bool) Bad page threshold specifies
> + * the threshold value of faulty pages detected by RAS ECC. Once the
> + * threshold is hit, the GPU will not be initialized. Use this parameter
> + * to ignore the bad page threshold so that information gathering can
> + * still be performed. This also allows for booting the GPU to clear
> + * the RAS EEPROM table.
> + */
> +
> +MODULE_PARM_DESC(ignore_bad_page_threshold, "Ignore bad page threshold 
> (false = respect bad page threshold (default value))");
> +module_param_named(ignore_bad_page_threshold, 
> amdgpu_ignore_bad_page_threshold, bool, 0644);
> +
>  MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup 
> (8 if set to greater than 8 or less than 0, only affect gfx 8+)");
>  module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
>  


Re: [PATCH 1/4] drm/amdgpu: Warn when bad pages approaches threshold

2021-10-19 Thread Felix Kuehling
On 2021-10-19 at 1:50 p.m., Kent Russell wrote:
> Currently dmesg doesn't warn when the number of bad pages approaches the
> threshold for page retirement. WARN when the number of bad pages
> is at 90% or greater for easier checks and planning, instead of waiting
> until the GPU is full of bad pages
>
> Cc: Luben Tuikov 
> Cc: Mukul Joshi 
> Signed-off-by: Kent Russell 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> index 98732518543e..8270aad23a06 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
> @@ -1077,6 +1077,16 @@ int amdgpu_ras_eeprom_init(struct 
> amdgpu_ras_eeprom_control *control,
>   if (res)
>   DRM_ERROR("RAS table incorrect checksum or error:%d\n",
> res);
> +
> + /* threshold = -1 is automatic, threshold = 0 means that page
> +  * retirement is disabled.
> +  */
> + if (amdgpu_bad_page_threshold > 0 &&
> + control->ras_num_recs >= 0 &&
> + control->ras_num_recs >= (amdgpu_bad_page_threshold * 9 / 
> 10))
> + DRM_WARN("RAS records:%u approaching threshold:%d",
> + control->ras_num_recs,
> + amdgpu_bad_page_threshold);

This won't work for the default setting amdgpu_bad_page_threshold=-1.
For this case, you'd have to take the threshold from
ras->bad_page_cnt_threshold.

Regards,
   Felix


>   } else if (hdr->header == RAS_TABLE_HDR_BAD &&
>  amdgpu_bad_page_threshold != 0) {
>   res = __verify_ras_table_checksum(control);


[PATCH 4/4] drm/amdgpu: Implement ignore_bad_page_threshold parameter

2021-10-19 Thread Kent Russell
If the ignore_bad_page_threshold kernel parameter is set to true,
continue to post the GPU. Print a warning to dmesg that this action has
been taken, and that page retirement will obviously not work for said GPU.

Cc: Luben Tuikov 
Cc: Mukul Joshi 
Signed-off-by: Kent Russell 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 7bb506a0ebd6..63a0548a05bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -1108,11 +1108,16 @@ int amdgpu_ras_eeprom_init(struct 
amdgpu_ras_eeprom_control *control,
res = amdgpu_ras_eeprom_correct_header_tag(control,
   
RAS_TABLE_HDR_VAL);
} else {
-   *exceed_err_limit = true;
-   dev_err(adev->dev,
-   "RAS records:%d exceed threshold:%d, "
-   "GPU will not be initialized. Replace this GPU 
or increase the threshold",
+   dev_err(adev->dev, "RAS records:%d exceed threshold:%d",
control->ras_num_recs, 
ras->bad_page_cnt_threshold);
+   if (amdgpu_ignore_bad_page_threshold) {
+   dev_warn(adev->dev, "GPU will be initialized 
due to ignore_bad_page_threshold.");
+   dev_warn(adev->dev, "Page retirement will not 
work for this GPU in this state.");
+   res = 0;
+   } else {
+   *exceed_err_limit = true;
+   dev_err(adev->dev, "GPU will not be 
initialized. Replace this GPU or increase the threshold.");
+   }
}
} else {
DRM_INFO("Creating a new EEPROM table");
-- 
2.25.1



[PATCH 2/4] drm/amdgpu: Clarify error when hitting bad page threshold

2021-10-19 Thread Kent Russell
Change the error message when the bad_page_threshold is reached,
explicitly stating that the GPU will not be initialized.

Cc: Luben Tuikov 
Cc: Mukul Joshi 
Signed-off-by: Kent Russell 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 8270aad23a06..7bb506a0ebd6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -,7 +,7 @@ int amdgpu_ras_eeprom_init(struct 
amdgpu_ras_eeprom_control *control,
*exceed_err_limit = true;
dev_err(adev->dev,
"RAS records:%d exceed threshold:%d, "
-   "maybe retire this GPU?",
+   "GPU will not be initialized. Replace this GPU 
or increase the threshold",
control->ras_num_recs, 
ras->bad_page_cnt_threshold);
}
} else {
-- 
2.25.1



[PATCH 3/4] drm/amdgpu: Add kernel parameter for ignoring bad page threshold

2021-10-19 Thread Kent Russell
When a GPU hits the bad_page_threshold, it will not be initialized by
the amdgpu driver. This means that the table cannot be cleared, nor can
information gathering be performed (getting serial number, BDF, etc).
Add an override called ignore_bad_page_threshold that can be set to true
to still initialize the GPU, even when the bad page threshold has been
reached.

Cc: Luben Tuikov 
Cc: Mukul Joshi 
Signed-off-by: Kent Russell 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 13 +
 2 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d58e37fd01f4..b85b67a88a3d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -205,6 +205,7 @@ extern struct amdgpu_mgpu_info mgpu_info;
 extern int amdgpu_ras_enable;
 extern uint amdgpu_ras_mask;
 extern int amdgpu_bad_page_threshold;
+extern bool amdgpu_ignore_bad_page_threshold;
 extern struct amdgpu_watchdog_timer amdgpu_watchdog_timer;
 extern int amdgpu_async_gfx_ring;
 extern int amdgpu_mcbp;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 96bd63aeeddd..3e9a7b072888 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -189,6 +189,7 @@ struct amdgpu_mgpu_info mgpu_info = {
 int amdgpu_ras_enable = -1;
 uint amdgpu_ras_mask = 0x;
 int amdgpu_bad_page_threshold = -1;
+bool amdgpu_ignore_bad_page_threshold;
 struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
.timeout_fatal_disable = false,
.period = 0x0, /* default to 0x0 (timeout disable) */
@@ -880,6 +881,18 @@ module_param_named(reset_method, amdgpu_reset_method, int, 
0444);
 MODULE_PARM_DESC(bad_page_threshold, "Bad page threshold(-1 = auto(default 
value), 0 = disable bad page retirement)");
 module_param_named(bad_page_threshold, amdgpu_bad_page_threshold, int, 0444);
 
+/**
+ * DOC: ignore_bad_page_threshold (bool) Bad page threshold specifies
+ * the threshold value of faulty pages detected by RAS ECC. Once the
+ * threshold is hit, the GPU will not be initialized. Use this parameter
+ * to ignore the bad page threshold so that information gathering can
+ * still be performed. This also allows for booting the GPU to clear
+ * the RAS EEPROM table.
+ */
+
+MODULE_PARM_DESC(ignore_bad_page_threshold, "Ignore bad page threshold (false 
= respect bad page threshold (default value))");
+module_param_named(ignore_bad_page_threshold, 
amdgpu_ignore_bad_page_threshold, bool, 0644);
+
 MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup 
(8 if set to greater than 8 or less than 0, only affect gfx 8+)");
 module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
 
-- 
2.25.1



[PATCH 1/4] drm/amdgpu: Warn when bad pages approaches threshold

2021-10-19 Thread Kent Russell
Currently dmesg doesn't warn when the number of bad pages approaches the
threshold for page retirement. WARN when the number of bad pages
is at 90% or greater for easier checks and planning, instead of waiting
until the GPU is full of bad pages

Cc: Luben Tuikov 
Cc: Mukul Joshi 
Signed-off-by: Kent Russell 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 98732518543e..8270aad23a06 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -1077,6 +1077,16 @@ int amdgpu_ras_eeprom_init(struct 
amdgpu_ras_eeprom_control *control,
if (res)
DRM_ERROR("RAS table incorrect checksum or error:%d\n",
  res);
+
+   /* threshold = -1 is automatic, threshold = 0 means that page
+* retirement is disabled.
+*/
+   if (amdgpu_bad_page_threshold > 0 &&
+   control->ras_num_recs >= 0 &&
+   control->ras_num_recs >= (amdgpu_bad_page_threshold * 9 / 
10))
+   DRM_WARN("RAS records:%u approaching threshold:%d",
+   control->ras_num_recs,
+   amdgpu_bad_page_threshold);
} else if (hdr->header == RAS_TABLE_HDR_BAD &&
   amdgpu_bad_page_threshold != 0) {
res = __verify_ras_table_checksum(control);
-- 
2.25.1



Re: [PATCH] drm/amdgpu: Consolidate VCN firmware setup code

2021-10-19 Thread James Zhu

With two nit-picks below.

This patch is Reviewed-by: James Zhu

On 2021-10-19 11:56 a.m., Alex Deucher wrote:

Roughly the same code was present in all VCN versions.
Consolidate it into a single function.

Signed-off-by: Alex Deucher
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 25 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  2 ++
  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c   | 10 +-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c   | 10 +-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c   | 17 +
  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c   | 20 +---
  6 files changed, 31 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index c7d316850570..dc823349f870 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -949,3 +949,28 @@ enum amdgpu_ring_priority_level 
amdgpu_vcn_get_enc_ring_prio(int ring)
return AMDGPU_RING_PRIO_0;
}
  }
+
+void amdgpu_vcn_setup_ucode(struct amdgpu_device *adev)
+{
+   int i;
+   unsigned int idx;
+
+   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+   const struct common_firmware_header *hdr;
+   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
+
+   for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+   if (adev->vcn.harvest_config & (1 << i))
+   continue;

[JZ] Add check if i > 2.

+   if (i == 0)
+   idx = AMDGPU_UCODE_ID_VCN;
+   else
+   idx = AMDGPU_UCODE_ID_VCN1;
+   adev->firmware.ucode[idx].ucode_id = idx;
+   adev->firmware.ucode[idx].fw = adev->vcn.fw;
+   adev->firmware.fw_size +=
+   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), 
PAGE_SIZE);
+   }
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");

[JZ] DRM_DEV_INFO can be used instead.

+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
index 795cbaa02ff8..bfa27ea94804 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -310,4 +310,6 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, 
long timeout);
  
  enum amdgpu_ring_priority_level amdgpu_vcn_get_enc_ring_prio(int ring);
  
+void amdgpu_vcn_setup_ucode(struct amdgpu_device *adev);

+
  #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index ad0d2564087c..d54d720b3cf6 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -111,15 +111,7 @@ static int vcn_v1_0_sw_init(void *handle)
/* Override the work func */
adev->vcn.idle_work.work.func = vcn_v1_0_idle_work_handler;
  
-	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {

-   const struct common_firmware_header *hdr;
-   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = 
AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
-   }
+   amdgpu_vcn_setup_ucode(adev);
  
  	r = amdgpu_vcn_resume(adev);

if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 091d8c0f6801..3883df5b31ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -115,15 +115,7 @@ static int vcn_v2_0_sw_init(void *handle)
if (r)
return r;
  
-	if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {

-   const struct common_firmware_header *hdr;
-   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = 
AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
-   }
+   amdgpu_vcn_setup_ucode(adev);
  
  	r = amdgpu_vcn_resume(adev);

if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 59f469bab005..44fc4c218433 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -139,22 +139,7 @@ static int vcn_v2_5_sw_init(void *handle)
if (r)
return r;
  
-	if (adev->firm

Re: [PATCH 1/1] drm/amdgpu: recover gart table at resume

2021-10-19 Thread Das, Nirmoy



On 10/19/2021 5:43 PM, Christian König wrote:

On 19.10.21 at 15:22, Nirmoy Das wrote:

Get rid of pin/unpin and of evicting and swapping back the gart
page table, which should make things less likely to break.

Also remove 2nd call to amdgpu_device_evict_resources()
as we don't need it.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 -
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 17 +
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 16 
  4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 41ce86244144..22ff229ab981 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3941,11 +3941,6 @@ int amdgpu_device_suspend(struct drm_device 
*dev, bool fbcon)

  amdgpu_fence_driver_hw_fini(adev);
    amdgpu_device_ip_suspend_phase2(adev);
-    /* This second call to evict device resources is to evict
- * the gart page table using the CPU.
- */
-    amdgpu_device_evict_resources(adev);
-
  return 0;
  }
  diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c

index 3ec5ff5a6dbe..18e3f3c5aae6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,9 +992,16 @@ static int gmc_v10_0_gart_enable(struct 
amdgpu_device *adev)

  return -EINVAL;
  }
  -    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;


I think you can move the functionality of pinning into 
amdgpu_gart_table_vram_alloc().



+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }


And add a wrapper around this call here. Something like 
amdgpu_gart_recover() or similar.



Thanks Christian, I will resend with your suggested changes.



Regards,
Christian.


    r = adev->gfxhub.funcs->gart_enable(adev);
  if (r)
@@ -1062,7 +1069,8 @@ static void gmc_v10_0_gart_disable(struct 
amdgpu_device *adev)

  {
  adev->gfxhub.funcs->gart_disable(adev);
  adev->mmhub.funcs->gart_disable(adev);
-    amdgpu_gart_table_vram_unpin(adev);
+    if (!adev->in_suspend)
+    amdgpu_gart_table_vram_unpin(adev);
  }
    static int gmc_v10_0_hw_fini(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c

index 492ebed2915b..0ef50ad3d7d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,9 +837,17 @@ static int gmc_v8_0_gart_enable(struct 
amdgpu_device *adev)

  dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
  return -EINVAL;
  }
-    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;
+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }
    table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
  @@ -992,7 +1000,8 @@ static void gmc_v8_0_gart_disable(struct 
amdgpu_device *adev)

  tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
  WREG32(mmVM_L2_CNTL, tmp);
  WREG32(mmVM_L2_CNTL2, 0);
-    amdgpu_gart_table_vram_unpin(adev);
+    if (!adev->in_suspend)
+    amdgpu_gart_table_vram_unpin(adev);
  }
    /**
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c

index cb82404df534..1bbcefd53974 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1714,9 +1714,16 @@ static int gmc_v9_0_gart_enable(struct 
amdgpu_device *adev)

  return -EINVAL;
  }
  -    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;
+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }
    r = adev->gfxhub.funcs->gart_enable(adev);
  if (r)
@@ -1793,7 +1800,8 @@ static void gmc_v9_0_gart_disable(struct 
amdgpu_device *adev)

  {
  adev->gfxhub.funcs->gart_disable(adev);
  adev->mmhub.funcs->gart_disable(adev);
-    amdgpu_gart_table_vram_unpin(adev);
+    if (!adev->in_suspend)
+    amdgpu_gart_table_vram_unpin(adev);
  }
    static int gmc_v9_0_hw_fini(void *handle)




Re: [PATCH 1/1] drm/amdgpu: recover gart table at resume

2021-10-19 Thread Andrey Grodzovsky



On 2021-10-19 11:54 a.m., Christian König wrote:

On 19.10.21 at 17:41, Andrey Grodzovsky wrote:


On 2021-10-19 9:22 a.m., Nirmoy Das wrote:

Get rid of pin/unpin and of evicting and swapping back the gart
page table, which should make things less likely to break.


+Christian

Could you guys also clarify what exactly are the stability issues 
this fixes ?


When we evict the GART table during suspend it is theoretically 
possible that we run into an OOM situation.


But since the OOM killer and the buffer move functions are already 
disabled, that is basically not gracefully handleable.


When we just keep the GART pinned all the time and restore its 
content during resume from the metadata, we should be able to avoid any 
memory allocation for the move.


Christian.



Got it.

Andrey






Andrey




Also remove 2nd call to amdgpu_device_evict_resources()
as we don't need it.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 -
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 17 +
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 16 
  4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 41ce86244144..22ff229ab981 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3941,11 +3941,6 @@ int amdgpu_device_suspend(struct drm_device 
*dev, bool fbcon)

  amdgpu_fence_driver_hw_fini(adev);
    amdgpu_device_ip_suspend_phase2(adev);
-    /* This second call to evict device resources is to evict
- * the gart page table using the CPU.
- */
-    amdgpu_device_evict_resources(adev);
-
  return 0;
  }
  diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c

index 3ec5ff5a6dbe..18e3f3c5aae6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,9 +992,16 @@ static int gmc_v10_0_gart_enable(struct 
amdgpu_device *adev)

  return -EINVAL;
  }
  -    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;
+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }
    r = adev->gfxhub.funcs->gart_enable(adev);
  if (r)
@@ -1062,7 +1069,8 @@ static void gmc_v10_0_gart_disable(struct 
amdgpu_device *adev)

  {
  adev->gfxhub.funcs->gart_disable(adev);
  adev->mmhub.funcs->gart_disable(adev);
-    amdgpu_gart_table_vram_unpin(adev);
+    if (!adev->in_suspend)
+    amdgpu_gart_table_vram_unpin(adev);
  }
    static int gmc_v10_0_hw_fini(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c

index 492ebed2915b..0ef50ad3d7d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,9 +837,17 @@ static int gmc_v8_0_gart_enable(struct 
amdgpu_device *adev)

  dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
  return -EINVAL;
  }
-    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;
+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }
    table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
  @@ -992,7 +1000,8 @@ static void gmc_v8_0_gart_disable(struct 
amdgpu_device *adev)

  tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
  WREG32(mmVM_L2_CNTL, tmp);
  WREG32(mmVM_L2_CNTL2, 0);
-    amdgpu_gart_table_vram_unpin(adev);
+    if (!adev->in_suspend)
+    amdgpu_gart_table_vram_unpin(adev);
  }
    /**
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c

index cb82404df534..1bbcefd53974 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1714,9 +1714,16 @@ static int gmc_v9_0_gart_enable(struct 
amdgpu_device *adev)

  return -EINVAL;
  }
  -    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;
+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }
    r = adev->gfxhub.funcs->gart_enable(adev);
  if (r)
@@ -1793,7 +1800,8 @@ static void gmc_v9_0_gart_disable(struct 
amdgpu_device *adev)

  {
  adev->gfxhub

Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-19 Thread Jason Gunthorpe
On Tue, Oct 19, 2021 at 04:13:34PM +0100, Joao Martins wrote:
> On 10/19/21 00:06, Jason Gunthorpe wrote:
> > On Mon, Oct 18, 2021 at 12:37:30PM -0700, Dan Williams wrote:
> > 
> >>> device-dax uses PUD, along with TTM, they are the only places. I'm not
> >>> sure TTM is a real place though.
> >>
> >> I was setting device-dax aside because it can use Joao's changes to
> >> get compound-page support.
> > 
> > Ideally, but that ideas in that patch series have been floating around
> > for a long time now..
> >  
> The current status of the series misses a Rb on patches 6,7,10,12-14.
> Well, patch 8 too should now drop its tag, considering the latest
> discussion.
> 
> If it helps moving things forward I could split my series further into:
> 
> 1) the compound page introduction (patches 1-7) of my aforementioned series
> 2) vmemmap deduplication for memory gains (patches 9-14)
> 3) gup improvements (patch 8 and gup-slow improvements)

I would split it, yes..

I think we can see a general consensus that making compound_head/etc
work consistently with how THP uses it will provide value and
opportunity for optimization going forward.

> Whats the benefit between preventing longterm at start
> versus only after mounting the filesystem? Or is the intended future purpose
> to pass more context into an holder potential future callback e.g. nack 
> longterm
> pins on a page basis?

I understood Dan's remark that the device-dax path allows
FOLL_LONGTERM and the FSDAX path does not ?

Which, IIRC, today is signaled based on vma properties and in all cases
fast-gup is denied.

> Maybe we can start by at least not add any flags and just prevent
> FOLL_LONGTERM on fsdax -- which I guess was the original purpose of
> commit 7af75561e171 ("mm/gup: add FOLL_LONGTERM capability to GUP fast").
> This patch (which I can formally send) has a sketch of that (below scissors 
> mark):
> 
> https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10f...@oracle.com/

Yes, basically, whatever test we want for 'deny fast gup foll
longterm' is fine. 

Personally I'd like to see us move toward a set of flag specifying
each special behavior and not a collection of types that imply special
behaviors.

Eg we have at least:
 - Block gup fast on foll_longterm
 - Capture the refcount ==1 and use the pgmap free hook
   (confusingly called page_is_devmap_managed())
 - Always use a swap entry
 - page->index/mapping are used in the usual file based way?

Probably more things..

Jason



[PATCH] drm/amdgpu: Consolidate VCN firmware setup code

2021-10-19 Thread Alex Deucher
Roughly the same code was present in all VCN versions.
Consolidate it into a single function.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 25 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h |  2 ++
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c   | 10 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c   | 10 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c   | 17 +
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c   | 20 +---
 6 files changed, 31 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index c7d316850570..dc823349f870 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -949,3 +949,28 @@ enum amdgpu_ring_priority_level amdgpu_vcn_get_enc_ring_prio(int ring)
return AMDGPU_RING_PRIO_0;
}
 }
+
+void amdgpu_vcn_setup_ucode(struct amdgpu_device *adev)
+{
+   int i;
+   unsigned int idx;
+
+   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
+   const struct common_firmware_header *hdr;
+   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
+
+   for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+   if (adev->vcn.harvest_config & (1 << i))
+   continue;
+   if (i == 0)
+   idx = AMDGPU_UCODE_ID_VCN;
+   else
+   idx = AMDGPU_UCODE_ID_VCN1;
+   adev->firmware.ucode[idx].ucode_id = idx;
+   adev->firmware.ucode[idx].fw = adev->vcn.fw;
+   adev->firmware.fw_size +=
+   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
+   }
+   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
index 795cbaa02ff8..bfa27ea94804 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -310,4 +310,6 @@ int amdgpu_vcn_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout);
 
 enum amdgpu_ring_priority_level amdgpu_vcn_get_enc_ring_prio(int ring);
 
+void amdgpu_vcn_setup_ucode(struct amdgpu_device *adev);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index ad0d2564087c..d54d720b3cf6 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -111,15 +111,7 @@ static int vcn_v1_0_sw_init(void *handle)
/* Override the work func */
adev->vcn.idle_work.work.func = vcn_v1_0_idle_work_handler;
 
-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   const struct common_firmware_header *hdr;
-   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
-   }
+   amdgpu_vcn_setup_ucode(adev);
 
r = amdgpu_vcn_resume(adev);
if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
index 091d8c0f6801..3883df5b31ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -115,15 +115,7 @@ static int vcn_v2_0_sw_init(void *handle)
if (r)
return r;
 
-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   const struct common_firmware_header *hdr;
-   hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-   dev_info(adev->dev, "Will use PSP to load VCN firmware\n");
-   }
+   amdgpu_vcn_setup_ucode(adev);
 
r = amdgpu_vcn_resume(adev);
if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 59f469bab005..44fc4c218433 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -139,22 +139,7 @@ static int vcn_v2_5_sw_init(void *handle)
if (r)
return r;
 
-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
-   const struct common_firmware_header *hdr;
-   hdr = (const struct common_firmware_header *)adev->vcn.

Re: [PATCH 1/1] drm/amdgpu: recover gart table at resume

2021-10-19 Thread Christian König

Am 19.10.21 um 17:41 schrieb Andrey Grodzovsky:


On 2021-10-19 9:22 a.m., Nirmoy Das wrote:

Get rid off pin/unpin and evict and swap back gart
page table which should make things less likely to break.


+Christian

Could you guys also clarify what exactly are the stability issues this 
fixes ?


When we evict the GART table during suspend it is theoretically possible 
that we run into an OOM situation.


But since the OOM killer and the buffer move functions are already 
disabled at that point, such a failure cannot be handled gracefully.


When we just keep the GART pinned all the time and restore its content 
during resume from the metadata, we should be able to avoid any memory 
allocation for the move.


Christian.



Andrey




Also remove 2nd call to amdgpu_device_evict_resources()
as we don't need it.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 -
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 17 +
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 16 
  4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 41ce86244144..22ff229ab981 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3941,11 +3941,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)

  amdgpu_fence_driver_hw_fini(adev);
    amdgpu_device_ip_suspend_phase2(adev);
-    /* This second call to evict device resources is to evict
- * the gart page table using the CPU.
- */
-    amdgpu_device_evict_resources(adev);
-
  return 0;
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c

index 3ec5ff5a6dbe..18e3f3c5aae6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,9 +992,16 @@ static int gmc_v10_0_gart_enable(struct amdgpu_device *adev)

  return -EINVAL;
  }
  -    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;
+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }
    r = adev->gfxhub.funcs->gart_enable(adev);
  if (r)
@@ -1062,7 +1069,8 @@ static void gmc_v10_0_gart_disable(struct amdgpu_device *adev)

  {
  adev->gfxhub.funcs->gart_disable(adev);
  adev->mmhub.funcs->gart_disable(adev);
-    amdgpu_gart_table_vram_unpin(adev);
+    if (!adev->in_suspend)
+    amdgpu_gart_table_vram_unpin(adev);
  }
    static int gmc_v10_0_hw_fini(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c

index 492ebed2915b..0ef50ad3d7d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,9 +837,17 @@ static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)

  dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
  return -EINVAL;
  }
-    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;
+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }
    table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
@@ -992,7 +1000,8 @@ static void gmc_v8_0_gart_disable(struct amdgpu_device *adev)

  tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
  WREG32(mmVM_L2_CNTL, tmp);
  WREG32(mmVM_L2_CNTL2, 0);
-    amdgpu_gart_table_vram_unpin(adev);
+    if (!adev->in_suspend)
+    amdgpu_gart_table_vram_unpin(adev);
  }
    /**
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c

index cb82404df534..1bbcefd53974 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1714,9 +1714,16 @@ static int gmc_v9_0_gart_enable(struct amdgpu_device *adev)

  return -EINVAL;
  }
  -    r = amdgpu_gart_table_vram_pin(adev);
-    if (r)
-    return r;
+    if (!adev->in_suspend) {
+    r = amdgpu_gart_table_vram_pin(adev);
+    if (r)
+    return r;
+    } else {
+    r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+    TTM_PL_TT));
+    if (r)
+    return r;
+    }
    r = adev->gfxhub.funcs->gart_enable(adev);
  if (r)
@@ -1793,7 +1800,8 @@ static void gmc_v9_0_gart_disable(struct amdgpu_device *adev)

  {
  adev->gfxhub.funcs->gart_disable(adev);
  adev->mmhub.funcs->gart_disable(adev);
-

Re: [PATCH] drm/amdgpu/vcn3.0: handle harvesting in firmware setup

2021-10-19 Thread James Zhu


On 2021-10-19 11:13 a.m., Alex Deucher wrote:

Only enable firmware for the instance that is enabled.

Fixes: 1b592d00b4ac83 ("drm/amdgpu/vcn: remove manual instance setting")
Signed-off-by: Alex Deucher
---
  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 19 +++
  1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index dbfd92984655..e311303a5e01 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -123,6 +123,7 @@ static int vcn_v3_0_sw_init(void *handle)
  {
struct amdgpu_ring *ring;
int i, j, r;
+   unsigned int idx;
int vcn_doorbell_index = 0;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
@@ -133,14 +134,16 @@ static int vcn_v3_0_sw_init(void *handle)

if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
const struct common_firmware_header *hdr;
hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-
-   if (adev->vcn.num_vcn_inst == VCN_INSTANCES_SIENNA_CICHLID) {
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].ucode_id = AMDGPU_UCODE_ID_VCN1;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].fw = adev->vcn.fw;
+
+   for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+   if (adev->vcn.harvest_config & (1 << i))
+   continue;
+   if (i == 0)
+   idx = AMDGPU_UCODE_ID_VCN;
+   else
+   idx = AMDGPU_UCODE_ID_VCN1;


[JZ] Not sure if it is worth replacing idx with (AMDGPU_UCODE_ID_VCN + i).

This patch is Reviewed-by: James Zhu 


+   adev->firmware.ucode[idx].ucode_id = idx;
+   adev->firmware.ucode[idx].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
}

Re: [PATCH 1/1] drm/amdgpu: recover gart table at resume

2021-10-19 Thread Christian König

Am 19.10.21 um 15:22 schrieb Nirmoy Das:

Get rid off pin/unpin and evict and swap back gart
page table which should make things less likely to break.

Also remove 2nd call to amdgpu_device_evict_resources()
as we don't need it.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 -
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 17 +
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 16 
  4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 41ce86244144..22ff229ab981 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3941,11 +3941,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
amdgpu_fence_driver_hw_fini(adev);
  
  	amdgpu_device_ip_suspend_phase2(adev);

-   /* This second call to evict device resources is to evict
-* the gart page table using the CPU.
-*/
-   amdgpu_device_evict_resources(adev);
-
return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c

index 3ec5ff5a6dbe..18e3f3c5aae6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,9 +992,16 @@ static int gmc_v10_0_gart_enable(struct amdgpu_device *adev)
return -EINVAL;
}
  
-	r = amdgpu_gart_table_vram_pin(adev);

-   if (r)
-   return r;
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;


I think you can move the functionality of pinning into 
amdgpu_gart_table_vram_alloc().



+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }


And add a wrapper around this call here. Something like 
amdgpu_gart_recover() or similar.


Regards,
Christian.

  
  	r = adev->gfxhub.funcs->gart_enable(adev);

if (r)
@@ -1062,7 +1069,8 @@ static void gmc_v10_0_gart_disable(struct amdgpu_device *adev)
  {
adev->gfxhub.funcs->gart_disable(adev);
adev->mmhub.funcs->gart_disable(adev);
-   amdgpu_gart_table_vram_unpin(adev);
+   if (!adev->in_suspend)
+   amdgpu_gart_table_vram_unpin(adev);
  }
  
  static int gmc_v10_0_hw_fini(void *handle)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 492ebed2915b..0ef50ad3d7d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,9 +837,17 @@ static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
return -EINVAL;
}
-   r = amdgpu_gart_table_vram_pin(adev);
-   if (r)
-   return r;
+
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }
  
  	table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
  
@@ -992,7 +1000,8 @@ static void gmc_v8_0_gart_disable(struct amdgpu_device *adev)

tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
WREG32(mmVM_L2_CNTL, tmp);
WREG32(mmVM_L2_CNTL2, 0);
-   amdgpu_gart_table_vram_unpin(adev);
+   if (!adev->in_suspend)
+   amdgpu_gart_table_vram_unpin(adev);
  }
  
  /**

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index cb82404df534..1bbcefd53974 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1714,9 +1714,16 @@ static int gmc_v9_0_gart_enable(struct amdgpu_device *adev)
return -EINVAL;
}
  
-	r = amdgpu_gart_table_vram_pin(adev);

-   if (r)
-   return r;
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }
  
  	r = adev->gfxhub.funcs->gart_enable(adev);

if (r)
@@ -1793,7 +1800,8 @@ static void gmc_v9_0_gart_disable(struct amdgpu_device *adev)
  {
adev->gfxhub.funcs->gart_disable(adev);
adev->mmhub.funcs->gart_disable(adev);
- 

Re: [PATCH 1/1] drm/amdgpu: recover gart table at resume

2021-10-19 Thread Andrey Grodzovsky



On 2021-10-19 9:22 a.m., Nirmoy Das wrote:

Get rid off pin/unpin and evict and swap back gart
page table which should make things less likely to break.


+Christian

Could you guys also clarify what exactly are the stability issues this 
fixes ?


Andrey




Also remove 2nd call to amdgpu_device_evict_resources()
as we don't need it.

Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 -
  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 17 +
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 16 
  4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 41ce86244144..22ff229ab981 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3941,11 +3941,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
amdgpu_fence_driver_hw_fini(adev);
  
  	amdgpu_device_ip_suspend_phase2(adev);

-   /* This second call to evict device resources is to evict
-* the gart page table using the CPU.
-*/
-   amdgpu_device_evict_resources(adev);
-
return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c

index 3ec5ff5a6dbe..18e3f3c5aae6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,9 +992,16 @@ static int gmc_v10_0_gart_enable(struct amdgpu_device *adev)
return -EINVAL;
}
  
-	r = amdgpu_gart_table_vram_pin(adev);

-   if (r)
-   return r;
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }
  
  	r = adev->gfxhub.funcs->gart_enable(adev);

if (r)
@@ -1062,7 +1069,8 @@ static void gmc_v10_0_gart_disable(struct amdgpu_device *adev)
  {
adev->gfxhub.funcs->gart_disable(adev);
adev->mmhub.funcs->gart_disable(adev);
-   amdgpu_gart_table_vram_unpin(adev);
+   if (!adev->in_suspend)
+   amdgpu_gart_table_vram_unpin(adev);
  }
  
  static int gmc_v10_0_hw_fini(void *handle)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 492ebed2915b..0ef50ad3d7d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,9 +837,17 @@ static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
return -EINVAL;
}
-   r = amdgpu_gart_table_vram_pin(adev);
-   if (r)
-   return r;
+
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }
  
  	table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
  
@@ -992,7 +1000,8 @@ static void gmc_v8_0_gart_disable(struct amdgpu_device *adev)

tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
WREG32(mmVM_L2_CNTL, tmp);
WREG32(mmVM_L2_CNTL2, 0);
-   amdgpu_gart_table_vram_unpin(adev);
+   if (!adev->in_suspend)
+   amdgpu_gart_table_vram_unpin(adev);
  }
  
  /**

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index cb82404df534..1bbcefd53974 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1714,9 +1714,16 @@ static int gmc_v9_0_gart_enable(struct amdgpu_device *adev)
return -EINVAL;
}
  
-	r = amdgpu_gart_table_vram_pin(adev);

-   if (r)
-   return r;
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }
  
  	r = adev->gfxhub.funcs->gart_enable(adev);

if (r)
@@ -1793,7 +1800,8 @@ static void gmc_v9_0_gart_disable(struct amdgpu_device *adev)
  {
adev->gfxhub.funcs->gart_disable(adev);
adev->mmhub.funcs->gart_disable(adev);
-   amdgpu_gart_table_vram_unpin(adev);
+   if (!adev->in_suspend)
+   amdgpu_gar

[PATCH] drm/amdgpu/smu11.0: add missing IP version check

2021-10-19 Thread Alex Deucher
Add a missing IP version check in smu_v11_0_init_display_count().

Fixes: af3b89d3a639d5 ("drm/amdgpu/smu11.0: convert to IP version checking")
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
index 5c1703cc25fd..28b7c0562b99 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
@@ -755,6 +755,7 @@ int smu_v11_0_init_display_count(struct smu_context *smu, uint32_t count)
 */
if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 11) ||
adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 5, 0) ||
+   adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 12) ||
adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 13))
return 0;
 
-- 
2.31.1



Re: [PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

2021-10-19 Thread Joao Martins
On 10/19/21 00:06, Jason Gunthorpe wrote:
> On Mon, Oct 18, 2021 at 12:37:30PM -0700, Dan Williams wrote:
> 
>>> device-dax uses PUD, along with TTM, they are the only places. I'm not
>>> sure TTM is a real place though.
>>
>> I was setting device-dax aside because it can use Joao's changes to
>> get compound-page support.
> 
> Ideally, but that ideas in that patch series have been floating around
> for a long time now..
>  
The current status of the series misses a Rb on patches 6,7,10,12-14.
Well, patch 8 too should now drop its tag, considering the latest
discussion.

If it helps moving things forward I could split my series further into:

1) the compound page introduction (patches 1-7) of my aforementioned series
2) vmemmap deduplication for memory gains (patches 9-14)
3) gup improvements (patch 8 and gup-slow improvements)

The reason being that item 1) is the main dependency listed below.
And allows 2) and 3) to be parallelized. FWIW, it is almost fully reviewed
by Dan (as of v3->v4). For (1) patches 6 & 7 are on changes to
device-dax subsystem (drivers/dax/*) which still needs his Ack.

>>> Here I imagine the thing that creates the pgmap would specify the
>>> policy it wants. In most cases the policy is tightly coupled to what
>>> the free function in the the provided dev_pagemap_ops does..
>>
>> The thing that creates the pgmap is the device-driver, and
>> device-driver does not implement truncate or reclaim. It's not until
>> the FS mounts that the pgmap needs to start enforcing pin lifetime
>> guarantees.
> 
> I am explaining this wrong, the immediate need is really 'should
> foll_longterm fail fast-gup to the slow path' and something like the
> nvdimm driver can just set that to 1 and rely on VMA flags to control
> what the slow path does - as is today.
> 
> It is not as elegant as more flags in the pgmap, but it would get the
> job done with minimal fuss.
> 
> Might be nice to either rely fully on VMA flags or fully on pgmap
> holder flags for FOLL_LONGTERM?
>

What's the benefit of preventing longterm at start
versus only after mounting the filesystem? Or is the intended future purpose
to pass more context into a potential future holder callback, e.g. to nack
longterm pins on a per-page basis?

Maybe we can start by at least not add any flags and just prevent
FOLL_LONGTERM on fsdax -- which I guess was the original purpose of
commit 7af75561e171 ("mm/gup: add FOLL_LONGTERM capability to GUP fast").
This patch (which I can formally send) has a sketch of that (below scissors 
mark):

https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10f...@oracle.com/

It uses pgmap->type rather than adding further fields into pgmap, given this
restriction applies only to fsdax.

... and then we could improve devmap_longterm_available(pgmap) to look at the
holder::flags or pgmap::flags, should we decide that explicit flags are
required from the holder/pgmap, as a further improvement?

Joao



[PATCH] drm/amdgpu/vcn3.0: handle harvesting in firmware setup

2021-10-19 Thread Alex Deucher
Only enable firmware for the instance that is enabled.

Fixes: 1b592d00b4ac83 ("drm/amdgpu/vcn: remove manual instance setting")
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index dbfd92984655..e311303a5e01 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -123,6 +123,7 @@ static int vcn_v3_0_sw_init(void *handle)
 {
struct amdgpu_ring *ring;
int i, j, r;
+   unsigned int idx;
int vcn_doorbell_index = 0;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
@@ -133,14 +134,16 @@ static int vcn_v3_0_sw_init(void *handle)
if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
const struct common_firmware_header *hdr;
hdr = (const struct common_firmware_header *)adev->vcn.fw->data;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;
-   adev->firmware.fw_size +=
-   ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
-
-   if (adev->vcn.num_vcn_inst == VCN_INSTANCES_SIENNA_CICHLID) {
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].ucode_id = AMDGPU_UCODE_ID_VCN1;
-   adev->firmware.ucode[AMDGPU_UCODE_ID_VCN1].fw = adev->vcn.fw;
+
+   for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+   if (adev->vcn.harvest_config & (1 << i))
+   continue;
+   if (i == 0)
+   idx = AMDGPU_UCODE_ID_VCN;
+   else
+   idx = AMDGPU_UCODE_ID_VCN1;
+   adev->firmware.ucode[idx].ucode_id = idx;
+   adev->firmware.ucode[idx].fw = adev->vcn.fw;
adev->firmware.fw_size +=
ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);
}
-- 
2.31.1



Re: [PATCH 0/5] 0 MHz is not a valid current frequency

2021-10-19 Thread Luben Tuikov

  
It again fails with the same message! But this time it is different!
Here's why:

openat(AT_FDCWD, "/sys/class/drm/card0/device/pp_dpm_fclk", O_RDONLY) = 3
read(3, "0: 571Mhz \n1: 1274Mhz *\n2: 1221M"..., 8191) = 36
read(3, "", 8191)   = 0
close(3)    = 0
write(2, "python3: /home/ltuikov/proj/amd/"..., 220python3: /home/ltuikov/proj/amd/rocm_smi_lib/src/rocm_smi.cc:913: rsmi_status_t get_frequencies(amd::smi::DevInfoTypes, uint32_t, rsmi_frequencies_t*, uint32_t*): Assertion `f->frequency[i-1] <= f->frequency[i]' failed.
) = 220
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f531f9bc000
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
getpid()    = 37861
gettid()    = 37861
tgkill(37861, 37861, SIGABRT)   = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=37861, si_uid=1000} ---
+++ killed by SIGABRT (core dumped) +++
Aborted (core dumped)
$ cat /sys/class/drm/card0/device/pp_dpm_fclk
0: 571Mhz 
1: 1274Mhz *
2: 1221Mhz 
$ _

Why is the mid frequency larger than the last?
Why does get_frequencies() insist that they be ordered when they're not?
(Does the tool need fixing or the kernel?)

The current patchset doesn't report 0, and doesn't report any current
frequency if 0 would've been reported as current. But anything else is
reported as it would've been reported before the patch. And I tested it
with vanilla amd-staging-drm-next -- same thing.

Regards,
Luben
  
  
On 2021-10-19 09:25, Russell, Kent wrote:

[AMD Official Use Only]

It was the rocm-smi -c flag. Maybe some work was done to make it more robust,
that would be nice. But the -c flag is supposed to show the current frequency
for each clock type. -g would do the same, but just for SCLK.

Kent
 

  


RE: [PATCH 0/5] 0 MHz is not a valid current frequency

2021-10-19 Thread Russell, Kent
[AMD Official Use Only]

It was the rocm-smi -c flag. Maybe some work was done to make it more robust, 
that would be nice. But the -c flag is supposed to show the current frequency 
for each clock type. -g would do the same, but just for SCLK.

Kent

From: Tuikov, Luben 
Sent: Tuesday, October 19, 2021 12:27 AM
To: Russell, Kent ; Deucher, Alexander 
; Quan, Evan ; Lazar, Lijo 
; amd-gfx@lists.freedesktop.org
Cc: Kasiviswanathan, Harish 
Subject: Re: [PATCH 0/5] 0 MHz is not a valid current frequency

Kent,

What is the command which fails?
I can try to duplicate it here.

So far, things I've tried, I cannot make rocm-smi fail. Command arguments?

Regards,
Luben

On 2021-10-18 21:06, Russell, Kent wrote:

[AMD Official Use Only]

The * is required for the rocm-smi's functionality for showing what the current 
clocks are. We had a bug before where the * was removed, then the SMI died 
fantastically. Work could be done to try to handle that type of situation, but 
the SMI has a "show current clocks" and uses the * to determine which one is 
active

Kent

From: amd-gfx 

 On Behalf Of Russell, Kent
Sent: Monday, October 18, 2021 9:05 PM
To: Tuikov, Luben ; Deucher, 
Alexander ; Quan, 
Evan ; Lazar, Lijo 
; 
amd-gfx@lists.freedesktop.org
Cc: Kasiviswanathan, Harish 

Subject: RE: [PATCH 0/5] 0 MHz is not a valid current frequency


[AMD Official Use Only]

+Harish, rocm-smi falls under his purview now.

Kent

From: Tuikov, Luben <luben.tui...@amd.com>
Sent: Monday, October 18, 2021 4:30 PM
To: Deucher, Alexander <alexander.deuc...@amd.com>; Quan, Evan 
<evan.q...@amd.com>; Lazar, Lijo <lijo.la...@amd.com>; 
amd-gfx@lists.freedesktop.org; Russell, Kent <kent.russ...@amd.com>
Subject: Re: [PATCH 0/5] 0 MHz is not a valid current frequency

I think Kent has already seen these patches, as he commented on patch 1/5.

The v3 version of the patch, posted last week, removes the asterisk to report 
the lowest frequency as the current frequency, when the current frequency is 0, 
i.e. when the block is in low power state. Does the tool rely on the asterisk? 
If this information is necessary could it not use amdgpu_pm_info?

Regards,
Luben

On 2021-10-18 16:19, Deucher, Alexander wrote:

[Public]

Well, the current behavior (0 for the clock) already crashes the tool, so I don't 
think we can really make things worse.

Alex


From: Quan, Evan 
Sent: Thursday, October 14, 2021 10:25 PM
To: Lazar, Lijo ; Tuikov, Luben 
; 
amd-gfx@lists.freedesktop.org 
; Russell, 
Kent 
Cc: Deucher, Alexander 

Subject: RE: [PATCH 0/5] 0 MHz is not a valid current frequency


[AMD Official Use Only]



+Kent who maintains the Rocm tool



From: amd-gfx 

 On Behalf Of Lazar, Lijo
Sent: Thursday, October 14, 2021 1:07 AM
To: Tuikov, Luben ; 
amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 

Subject: Re: [PATCH 0/5] 0 MHz is not a valid current frequency



[AMD Official Use Only]



[AMD Official Use Only]



>Or maybe just a list without default hint, i.e. no asterisk?



I think this is also fine; it means we are having trouble determining the 
current frequency or DPM level. Evan/Alex? I don't know if this will crash the 
tools.



Thanks,
Lijo



From: Tuikov, Luben <luben.tui...@amd.com>
Sent: Wednesday, October 13, 2021 9:52:09 PM
To: Lazar, Lijo <lijo.la...@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <alexander.deuc...@amd.com>
Subject: Re: [PATCH 0/5] 0 MHz is not a valid current frequency



On 2021-10-13 00:14, Lazar, Lijo wrote:
>
> On 10/13/2021 8:40 AM, Luben Tuikov wrote:
>> Some ASIC support low-power functionality for the whole ASIC or just
>> an IP block. When in such low-power mode, some sysfs interfaces would
>> report a frequency of 0, e.g.,
>>
>> $cat /sys/class/drm/card0/device/pp_dpm_sclk
>> 0: 500Mhz
>> 1: 0Mhz *
>> 2: 2200Mhz
>> $_
>>
>> An operating frequency of 0 MHz doesn't make sense, and this interface
>> is designed to report only operating clock frequencies, i.e. non-zero,
>> and possibly the current one.
>>
>> When in this low-power state, round to the smallest
>> operating frequency, for this interface, as follows,
>>
> Would rather avoid this -
>
> 1) It is ma

[PATCH 1/1] drm/amdgpu: recover gart table at resume

2021-10-19 Thread Nirmoy Das
Get rid of pin/unpin and instead evict and swap back the GART
page table, which should make things less likely to break.

Also remove the 2nd call to amdgpu_device_evict_resources()
as it is no longer needed.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 -
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 16 
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 17 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 16 
 4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 41ce86244144..22ff229ab981 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3941,11 +3941,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
amdgpu_fence_driver_hw_fini(adev);
 
amdgpu_device_ip_suspend_phase2(adev);
-   /* This second call to evict device resources is to evict
-* the gart page table using the CPU.
-*/
-   amdgpu_device_evict_resources(adev);
-
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 3ec5ff5a6dbe..18e3f3c5aae6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,9 +992,16 @@ static int gmc_v10_0_gart_enable(struct amdgpu_device 
*adev)
return -EINVAL;
}
 
-   r = amdgpu_gart_table_vram_pin(adev);
-   if (r)
-   return r;
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }
 
r = adev->gfxhub.funcs->gart_enable(adev);
if (r)
@@ -1062,7 +1069,8 @@ static void gmc_v10_0_gart_disable(struct amdgpu_device 
*adev)
 {
adev->gfxhub.funcs->gart_disable(adev);
adev->mmhub.funcs->gart_disable(adev);
-   amdgpu_gart_table_vram_unpin(adev);
+   if (!adev->in_suspend)
+   amdgpu_gart_table_vram_unpin(adev);
 }
 
 static int gmc_v10_0_hw_fini(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 492ebed2915b..0ef50ad3d7d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -837,9 +837,17 @@ static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
return -EINVAL;
}
-   r = amdgpu_gart_table_vram_pin(adev);
-   if (r)
-   return r;
+
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }
 
table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
 
@@ -992,7 +1000,8 @@ static void gmc_v8_0_gart_disable(struct amdgpu_device 
*adev)
tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
WREG32(mmVM_L2_CNTL, tmp);
WREG32(mmVM_L2_CNTL2, 0);
-   amdgpu_gart_table_vram_unpin(adev);
+   if (!adev->in_suspend)
+   amdgpu_gart_table_vram_unpin(adev);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index cb82404df534..1bbcefd53974 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1714,9 +1714,16 @@ static int gmc_v9_0_gart_enable(struct amdgpu_device 
*adev)
return -EINVAL;
}
 
-   r = amdgpu_gart_table_vram_pin(adev);
-   if (r)
-   return r;
+   if (!adev->in_suspend) {
+   r = amdgpu_gart_table_vram_pin(adev);
+   if (r)
+   return r;
+   } else {
+   r = amdgpu_gtt_mgr_recover(ttm_manager_type(&adev->mman.bdev,
+   TTM_PL_TT));
+   if (r)
+   return r;
+   }
 
r = adev->gfxhub.funcs->gart_enable(adev);
if (r)
@@ -1793,7 +1800,8 @@ static void gmc_v9_0_gart_disable(struct amdgpu_device 
*adev)
 {
adev->gfxhub.funcs->gart_disable(adev);
adev->mmhub.funcs->gart_disable(adev);
-   amdgpu_gart_table_vram_unpin(adev);
+   if (!adev->in_suspend)
+   amdgpu_gart_table_vram_unpin(adev);
 }
 
 static int gmc_v9_0_hw_fini(void *handle)
-- 
2.32.0



Re: [PATCH] drm/amdgpu: support B0&B1 external revision id for yellow carp

2021-10-19 Thread Alex Deucher
On Mon, Oct 18, 2021 at 11:22 PM Aaron Liu  wrote:
>
> B0 internal rev_id is 0x01, B1 internal rev_id is 0x02.
> The external rev_id for B0 and B1 is 0x20.
> The original expression is not suitable for B1.

Are you sure about this?  We'll be losing the difference between B0
and B1.  I think 0x19 is correct.  That will give us 0x1a for B0 and
0x1b for B1.  That aligns with what the display code has as well.

Alex


>
> Signed-off-by: Aaron Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/nv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
> index 898e688be63c..5166a1573e7e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> @@ -1248,7 +1248,7 @@ static int nv_common_early_init(void *handle)
> AMD_PG_SUPPORT_VCN_DPG |
> AMD_PG_SUPPORT_JPEG;
> if (adev->pdev->device == 0x1681)
> -   adev->external_rev_id = adev->rev_id + 0x19;
> +   adev->external_rev_id = 0x20;
> else
> adev->external_rev_id = adev->rev_id + 0x01;
> break;
> --
> 2.25.1
>


[PATCH] drm/amdgpu: remove grbm cam remmaping for gfx v10

2021-10-19 Thread Huang Rui
PSP firmware will be responsible for applying the GRBM CAM remapping in
production, and the GRBM_CAM_INDEX / GRBM_CAM_DATA registers will be
protected by PSP under the security policy. So remove the remapping from
the driver in accordance with the new security policy.

Signed-off-by: Huang Rui 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 201 -
 1 file changed, 201 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 71bb3c0dc1da..a53036a05d7e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -270,25 +270,6 @@ MODULE_FIRMWARE("amdgpu/cyan_skillfish2_mec.bin");
 MODULE_FIRMWARE("amdgpu/cyan_skillfish2_mec2.bin");
 MODULE_FIRMWARE("amdgpu/cyan_skillfish2_rlc.bin");
 
-static const struct soc15_reg_golden golden_settings_gc_10_0[] =
-{
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmGRBM_CAM_INDEX, 0x, 0x),
-   /* TA_GRAD_ADJ_UCONFIG -> TA_GRAD_ADJ */
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmGRBM_CAM_DATA, 0x, 0x2544c382),
-   /* VGT_TF_RING_SIZE_UMD -> VGT_TF_RING_SIZE */
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmGRBM_CAM_DATA, 0x, 0x2262c24e),
-   /* VGT_HS_OFFCHIP_PARAM_UMD -> VGT_HS_OFFCHIP_PARAM */
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmGRBM_CAM_DATA, 0x, 0x226cc24f),
-   /* VGT_TF_MEMORY_BASE_UMD -> VGT_TF_MEMORY_BASE */
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmGRBM_CAM_DATA, 0x, 0x226ec250),
-   /* VGT_TF_MEMORY_BASE_HI_UMD -> VGT_TF_MEMORY_BASE_HI */
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmGRBM_CAM_DATA, 0x, 0x2278c261),
-   /* VGT_ESGS_RING_SIZE_UMD -> VGT_ESGS_RING_SIZE */
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmGRBM_CAM_DATA, 0x, 0x2232c240),
-   /* VGT_GSVS_RING_SIZE_UMD -> VGT_GSVS_RING_SIZE */
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmGRBM_CAM_DATA, 0x, 0x2233c241),
-};
-
 static const struct soc15_reg_golden golden_settings_gc_10_1[] =
 {
SOC15_REG_GOLDEN_VALUE(GC, 0, mmCB_HW_CONTROL_4, 0x, 
0x00400014),
@@ -3809,9 +3790,6 @@ static void gfx_v10_0_init_golden_registers(struct 
amdgpu_device *adev)
(const 
u32)ARRAY_SIZE(golden_settings_gc_10_3_5));
break;
case IP_VERSION(10, 1, 3):
-   soc15_program_register_sequence(adev,
-   golden_settings_gc_10_0,
-   (const 
u32)ARRAY_SIZE(golden_settings_gc_10_0));
soc15_program_register_sequence(adev,

golden_settings_gc_10_0_cyan_skillfish,
(const 
u32)ARRAY_SIZE(golden_settings_gc_10_0_cyan_skillfish));
@@ -7297,181 +7275,6 @@ static void gfx_v10_0_cp_enable(struct amdgpu_device 
*adev, bool enable)
gfx_v10_0_cp_compute_enable(adev, enable);
 }
 
-static bool gfx_v10_0_check_grbm_cam_remapping(struct amdgpu_device *adev)
-{
-   uint32_t data, pattern = 0xDEADBEEF;
-
-   /* check if mmVGT_ESGS_RING_SIZE_UMD
-* has been remapped to mmVGT_ESGS_RING_SIZE */
-   switch (adev->ip_versions[GC_HWIP][0]) {
-   case IP_VERSION(10, 3, 0):
-   case IP_VERSION(10, 3, 2):
-   case IP_VERSION(10, 3, 4):
-   case IP_VERSION(10, 3, 5):
-   data = RREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE_Sienna_Cichlid);
-   WREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE_Sienna_Cichlid, 0);
-   WREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE_UMD, pattern);
-
-   if (RREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE_Sienna_Cichlid) == 
pattern) {
-   WREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE_UMD , data);
-   return true;
-   } else {
-   WREG32_SOC15(GC, 0, 
mmVGT_ESGS_RING_SIZE_Sienna_Cichlid, data);
-   return false;
-   }
-   break;
-   case IP_VERSION(10, 3, 1):
-   case IP_VERSION(10, 3, 3):
-   return true;
-   default:
-   data = RREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE);
-   WREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE, 0);
-   WREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE_UMD, pattern);
-
-   if (RREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE) == pattern) {
-   WREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE_UMD, data);
-   return true;
-   } else {
-   WREG32_SOC15(GC, 0, mmVGT_ESGS_RING_SIZE, data);
-   return false;
-   }
-   break;
-   }
-}
-
-static void gfx_v10_0_setup_grbm_cam_remapping(struct amdgpu_device *adev)
-{
-   uint32_t data;
-
-   if (amdgpu_sriov_vf(adev))
-   return;
-
-   /* initialize cam_index to 0
-* index will auto-inc after each data writting */
-   WREG32_SOC15(GC, 0, mmGRBM_CAM_INDEX, 0)

Re: Re: [PATCH Review 1/1] drm/ttm: fix debugfs node create failed

2021-10-19 Thread Yang, Stanley
[AMD Official Use Only]



> -----Original Message-----
> From: Christian König 
> Sent: Tuesday, October 19, 2021 4:46 PM
> To: Yang, Stanley ; Das, Nirmoy
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: Re: [PATCH Review 1/1] drm/ttm: fix debugfs node create failed
> 
> Am 19.10.21 um 10:02 schrieb Yang, Stanley:
> > [AMD Official Use Only]
> >
> >
> >> -----Original Message-----
> >> From: amd-gfx  On Behalf Of Das,
> Nirmoy
> >> Sent: Thursday, October 14, 2021 2:11 AM
> >> To: Christian König ; amd-
> >> g...@lists.freedesktop.org
> >> Subject: Re: [PATCH Review 1/1] drm/ttm: fix debugfs node create failed
> >>
> >>
> >> On 10/13/2021 2:29 PM, Christian König wrote:
> >>> Am 12.10.21 um 15:12 schrieb Das, Nirmoy:
>  On 10/12/2021 1:58 PM, Stanley.Yang wrote:
> > Test scenario:
> >   modprobe amdgpu -> rmmod amdgpu -> modprobe amdgpu Error
> log:
> >   [   54.396807] debugfs: File 'page_pool' in directory 'amdttm'
> > already present!
> >   [   54.396833] debugfs: File 'page_pool_shrink' in directory
> > 'amdttm' already present!
> >   [   54.396848] debugfs: File 'buffer_objects' in directory
> > 'amdttm' already present!
> 
>  We should instead add a check if those debugfs files already
>  exist/created in ttm debugfs dir using debugfs_lookup() before creating.
> >>> No, IIRC the Intel guys had fixed that already by adding/removing
> >>> the debugfs file on module load/unload.
> >>
> >> Adding/removing on ttm module load/unload is nicer.
> > The point is that page_pool, page_pool_shrink and buffer_objects are
> > created by the amdgpu driver,
> 
> Yeah, but the debugfs files are not created by the driver. Those are global to
> TTM and can trivially be created during module load/unload.
[Yang, Stanley] Thanks Christian, I double-checked the TTM-related code; the TTM 
load will create those debugfs files.

Stanley
> 
> Christian.
> 
> >   I think it's better to remove them in the amdgpu module since the amdgpu
> > module creates them; otherwise, creating them will fail when only the
> > amdgpu module is reloaded.
> >
> > Stanley
> >>
> >> Nirmoy
> >>
> >>>
> >>> Christian.
> >>>
> 
>  Regards,
> 
>  Nirmoy
> 
> 
> 
> > Reason:
> >   page_pool, page_pool_shrink and buffer_objects can be
> > removed when
> >   rmmod amdttm, in the above test scenario only rmmod amdgpu,
> > so those
> >   debugfs nodes will not be removed, which causes file creation to fail.
> > Solution:
> >   create ttm_page directory under ttm_root directory when
> > insmod amdgpu,
> >   page_pool, page_pool_shrink and buffer_objects are stored in
> > ttm_page directory,
> >   remove ttm_page directory when do rmmod amdgpu, this can fix
> > above issue.
> >
> > Signed-off-by: Stanley.Yang 
> > ---
> >    drivers/gpu/drm/ttm/ttm_device.c | 12 +++-
> >    drivers/gpu/drm/ttm/ttm_module.c |  1 +
> >    drivers/gpu/drm/ttm/ttm_module.h |  1 +
> >    drivers/gpu/drm/ttm/ttm_pool.c   |  4 ++--
> >    4 files changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_device.c
> > b/drivers/gpu/drm/ttm/ttm_device.c
> > index 1de23edbc182..ad170328f0c8 100644
> > --- a/drivers/gpu/drm/ttm/ttm_device.c
> > +++ b/drivers/gpu/drm/ttm/ttm_device.c
> > @@ -55,6 +55,10 @@ static void ttm_global_release(void)
> >      ttm_pool_mgr_fini();
> >    +#ifdef CONFIG_DEBUG_FS
> > +    debugfs_remove(ttm_debugfs_page); #endif
> > +
> >    __free_page(glob->dummy_read_page);
> >    memset(glob, 0, sizeof(*glob));
> >    out:
> > @@ -85,6 +89,10 @@ static int ttm_global_init(void)
> >    >> PAGE_SHIFT;
> >    num_dma32 = min(num_dma32, 2UL << (30 - PAGE_SHIFT));
> >    +#ifdef CONFIG_DEBUG_FS
> > +    ttm_debugfs_page = debugfs_create_dir("ttm_page",
> > ttm_debugfs_root);
> > +#endif
> > +
> >    ttm_pool_mgr_init(num_pages);
> >    ttm_tt_mgr_init(num_pages, num_dma32);
> >    @@ -98,8 +106,10 @@ static int ttm_global_init(void)
> >    INIT_LIST_HEAD(&glob->device_list);
> >    atomic_set(&glob->bo_count, 0);
> >    -    debugfs_create_atomic_t("buffer_objects", 0444,
> > ttm_debugfs_root,
> > +#ifdef CONFIG_DEBUG_FS
> > +    debugfs_create_atomic_t("buffer_objects", 0444,
> > +ttm_debugfs_page,
> >    &glob->bo_count);
> > +#endif
> >    out:
> >    mutex_unlock(&ttm_global_mutex);
> >    return ret;
> > diff --git a/drivers/gpu/drm/ttm/ttm_module.c
> > b/drivers/gpu/drm/ttm/ttm_module.c
> > index 88970a6b8e32..66595e6e7087 100644
> > --- a/drivers/gpu/drm/ttm/ttm_module.c
> > +++ b/drivers/gpu/drm/ttm/ttm_module.c
> > @@ -38,6 +38,7 @@
> >    #include "ttm_module.h"
> >      struct dentry *ttm_debugfs_root;
> > +struct dentry *ttm_debugfs_page;
> >

Re: Re: [PATCH Review 1/1] drm/ttm: fix debugfs node create failed

2021-10-19 Thread Christian König

Am 19.10.21 um 10:02 schrieb Yang, Stanley:

[AMD Official Use Only]



-----Original Message-----
From: amd-gfx  On Behalf Of Das,
Nirmoy
Sent: Thursday, October 14, 2021 2:11 AM
To: Christian König ; amd-
g...@lists.freedesktop.org
Subject: Re: [PATCH Review 1/1] drm/ttm: fix debugfs node create failed


On 10/13/2021 2:29 PM, Christian König wrote:

Am 12.10.21 um 15:12 schrieb Das, Nirmoy:

On 10/12/2021 1:58 PM, Stanley.Yang wrote:

Test scenario:
  modprobe amdgpu -> rmmod amdgpu -> modprobe amdgpu Error log:
  [   54.396807] debugfs: File 'page_pool' in directory 'amdttm'
already present!
  [   54.396833] debugfs: File 'page_pool_shrink' in directory
'amdttm' already present!
  [   54.396848] debugfs: File 'buffer_objects' in directory
'amdttm' already present!


We should instead add a check if those debugfs files already
exist/created in ttm debugfs dir using debugfs_lookup() before creating.

No, IIRC the Intel guys had fixed that already by adding/removing the
debugfs file on module load/unload.


Adding/removing on ttm module load/unload is nicer.

The point is that page_pool, page_pool_shrink and buffer_objects are created by 
the amdgpu driver,


Yeah, but the debugfs files are not created by the driver. Those are 
global to TTM and can trivially be created during module load/unload.


Christian.


  I think it's better to remove them in the amdgpu module since the amdgpu module 
creates them; otherwise, creating them will fail when only the amdgpu module is 
reloaded.

Stanley


Nirmoy



Christian.



Regards,

Nirmoy




Reason:
  page_pool, page_pool_shrink and buffer_objects can be removed
when
  rmmod amdttm, in the above test scenario only rmmod amdgpu, so
those
  debugfs nodes will not be removed, which causes file creation to fail.
Solution:
  create ttm_page directory under ttm_root directory when insmod
amdgpu,
  page_pool, page_pool_shrink and buffer_objects are stored in
ttm_page directory,
  remove ttm_page directory when do rmmod amdgpu, this can fix
above issue.

Signed-off-by: Stanley.Yang 
---
   drivers/gpu/drm/ttm/ttm_device.c | 12 +++-
   drivers/gpu/drm/ttm/ttm_module.c |  1 +
   drivers/gpu/drm/ttm/ttm_module.h |  1 +
   drivers/gpu/drm/ttm/ttm_pool.c   |  4 ++--
   4 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c
b/drivers/gpu/drm/ttm/ttm_device.c
index 1de23edbc182..ad170328f0c8 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -55,6 +55,10 @@ static void ttm_global_release(void)
     ttm_pool_mgr_fini();
   +#ifdef CONFIG_DEBUG_FS
+    debugfs_remove(ttm_debugfs_page); #endif
+
   __free_page(glob->dummy_read_page);
   memset(glob, 0, sizeof(*glob));
   out:
@@ -85,6 +89,10 @@ static int ttm_global_init(void)
   >> PAGE_SHIFT;
   num_dma32 = min(num_dma32, 2UL << (30 - PAGE_SHIFT));
   +#ifdef CONFIG_DEBUG_FS
+    ttm_debugfs_page = debugfs_create_dir("ttm_page",
ttm_debugfs_root);
+#endif
+
   ttm_pool_mgr_init(num_pages);
   ttm_tt_mgr_init(num_pages, num_dma32);
   @@ -98,8 +106,10 @@ static int ttm_global_init(void)
   INIT_LIST_HEAD(&glob->device_list);
   atomic_set(&glob->bo_count, 0);
   -    debugfs_create_atomic_t("buffer_objects", 0444,
ttm_debugfs_root,
+#ifdef CONFIG_DEBUG_FS
+    debugfs_create_atomic_t("buffer_objects", 0444,
+ttm_debugfs_page,
   &glob->bo_count);
+#endif
   out:
   mutex_unlock(&ttm_global_mutex);
   return ret;
diff --git a/drivers/gpu/drm/ttm/ttm_module.c
b/drivers/gpu/drm/ttm/ttm_module.c
index 88970a6b8e32..66595e6e7087 100644
--- a/drivers/gpu/drm/ttm/ttm_module.c
+++ b/drivers/gpu/drm/ttm/ttm_module.c
@@ -38,6 +38,7 @@
   #include "ttm_module.h"
     struct dentry *ttm_debugfs_root;
+struct dentry *ttm_debugfs_page;
     static int __init ttm_init(void)
   {
diff --git a/drivers/gpu/drm/ttm/ttm_module.h
b/drivers/gpu/drm/ttm/ttm_module.h
index d7cac5d4b835..6007dc66f44e 100644
--- a/drivers/gpu/drm/ttm/ttm_module.h
+++ b/drivers/gpu/drm/ttm/ttm_module.h
@@ -36,5 +36,6 @@
   struct dentry;
     extern struct dentry *ttm_debugfs_root;
+extern struct dentry *ttm_debugfs_page;
     #endif /* _TTM_MODULE_H_ */
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c
b/drivers/gpu/drm/ttm/ttm_pool.c index 8be7fd7161fd..ecb33daad7b5
100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -709,9 +709,9 @@ int ttm_pool_mgr_init(unsigned long num_pages)
   }
     #ifdef CONFIG_DEBUG_FS
-    debugfs_create_file("page_pool", 0444, ttm_debugfs_root, NULL,
+    debugfs_create_file("page_pool", 0444, ttm_debugfs_page, NULL,
   &ttm_pool_debugfs_globals_fops);
-    debugfs_create_file("page_pool_shrink", 0400, ttm_debugfs_root,
NULL,
+    debugfs_create_file("page_pool_shrink", 0400, ttm_debugfs_page,
NULL,
   &ttm_pool_debugfs_shrink_fops);
   #endif




Re: [PATCH] amd/display: remove ChromeOS workaround

2021-10-19 Thread Paul Menzel

Dear Simon,


Am 19.10.21 um 10:10 schrieb Simon Ser:

On Tuesday, October 19th, 2021 at 01:21, Paul Menzel  
wrote:


Am 19.10.21 um 01:06 schrieb Simon Ser:

On Tuesday, October 19th, 2021 at 01:03, Paul Menzel wrote:


Excuse my ignorance. Reading the commit message, there was a Linux
kernel change, that broke Chrome OS userspace, right? If so, and we do
not know if there is other userspace using the API incorrectly,
shouldn’t the patch breaking Chrome OS userspace be reverted to adhere
to Linux’ no-regression rule?


No. There was a ChromeOS bug which has been thought to be an amdgpu bug. But
fixing that "bug" breaks other user-space.


Thank you for the explanation. I guess the bug was only surfacing 
because Chrome OS devices, like Chromebooks, have only been using AMD hardware 
for a short while (maybe since last year).

Reading your message *amdgpu: atomic API and cursor/overlay planes* [1]
again, it says:


Up until now we were using cursor and overlay planes in gamescope [3],
but some changes in the amdgpu driver [1] makes us unable to use planes


So this statement was incorrect? Which changes are those? Or did Chrome 
OS ever work correctly with an older Linux kernel or not?


The sequence of events is as follows:

- gamescope can use cursor and overlay planes.
- ChromeOS-specific commit lands, fixing some ChromeOS issues related to video
   playback. This breaks gamescope overlays.


I guess I am confused about which Chrome OS specific commit that is. Is it 
one of the reverted commits below? Which one?


1.  ddab8bd788f5 ("drm/amd/display: Fix two cursor duplication
when using overlay")
2.  e7d9560aeae5 ("Revert "drm/amd/display: Fix overlay validation by 
considering cursors"")



- Discussion to restrict the ChromeOS-specific logic to ChromeOS, or to revert
   it, either of these fix gamescope.

Given this, I don't see how the quoted statement is incorrect? Maybe I'm
missing something?


Your reply from August 2021 to commit ddab8bd788f5 (drm/amd/display: Fix 
two cursor duplication when using overlay) from April 2021 [2]:



Hm. This patch causes a regression for me. I was using primary + overlay
not covering the whole primary plane + cursor before. This patch breaks it.

This patch makes the overlay plane very useless for me, because the primary
plane is always under the overlay plane.


So, I would have thought, everything worked fine before some Linux 
kernel commit changed behavior, and regressed userspace.



Kind regards,

Paul


[2]: 
https://lore.kernel.org/amd-gfx/SrcUnUUGJquVgjp9P79uV8sv6s-kMHG4wp0S3b4Nh9ksi29EIOye5edofuXkDLRvGfvkkRpQZ9JM7MNqew2B3kFUhaxsonDRXprkAYXaQUo=@emersion.fr/


Re: [PATCH] amd/display: remove ChromeOS workaround

2021-10-19 Thread Simon Ser
On Tuesday, October 19th, 2021 at 01:21, Paul Menzel  
wrote:

> Am 19.10.21 um 01:06 schrieb Simon Ser:
> > On Tuesday, October 19th, 2021 at 01:03, Paul Menzel wrote:
> >
> >> Excuse my ignorance. Reading the commit message, there was a Linux
> >> kernel change, that broke Chrome OS userspace, right? If so, and we do
> >> not know if there is other userspace using the API incorrectly,
> >> shouldn’t the patch breaking Chrome OS userspace be reverted to adhere
> >> to Linux’ no-regression rule?
> >
> > No. There was a ChromeOS bug which has been thought to be an amdgpu bug. But
> > fixing that "bug" breaks other user-space.
>
> Thank you for the explanation. I guess the bug was only surfacing
> because Chrome OS devices, like Chromebooks, have only been using AMD hardware
> for a short while (maybe since last year).
>
> Reading your message *amdgpu: atomic API and cursor/overlay planes* [1]
> again, it says:
>
> > Up until now we were using cursor and overlay planes in gamescope [3],
> > but some changes in the amdgpu driver [1] makes us unable to use planes
>
> So this statement was incorrect? Which changes are those? Or did Chrome
> OS ever work correctly with an older Linux kernel or not?

The sequence of events is as follows:

- gamescope can use cursor and overlay planes.
- ChromeOS-specific commit lands, fixing some ChromeOS issues related to video
  playback. This breaks gamescope overlays.
- Discussion to restrict the ChromeOS-specific logic to ChromeOS, or to revert
  it, either of these fix gamescope.

Given this, I don't see how the quoted statement is incorrect? Maybe I'm
missing something?

Hope that helps,

Simon


Re: Use of conditionals with omitted operands in amdgpu (x? : y) (was: [PATCH 4/5] dpm/amd/pm: Sienna: Remove 0 MHz as a current clock frequency (v3))

2021-10-19 Thread Luben Tuikov
+AlexD
+ChrisianK
+LKML

On 2021-10-19 03:44, Paul Menzel wrote:
> Dear Luben,
>
>
> Am 19.10.21 um 06:50 schrieb Luben Tuikov:
>> On 2021-10-19 00:38, Lazar, Lijo wrote:
>>> On 10/19/2021 9:45 AM, Luben Tuikov wrote:
 On 2021-10-18 23:38, Lazar, Lijo wrote:
> On 10/19/2021 5:19 AM, Luben Tuikov wrote:
> […]
>
>> -if (ret)
>> -goto print_clk_out;
>> +freq_value[1] = curr_value ?: freq_value[0];
> Omitting the second expression is not standard C -
> https://gcc.gnu.org/onlinedocs/gcc/Conditionals.html
 Lijo just clarified to me that:

> well, i had to look up as I haven't seen it before
 I hope the following should make it clear about its usage:

 $cd linux/
 $find . -name "*.[ch]" -exec grep -E "\?:" \{\} \+ | wc -l
 1042
 $_
>  $ git grep -E "\?:" -- '*amdgpu*.[ch]'
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c: * Solution?:
>
> So for the AMDGPU subsystem, as the only result is a comment, currently, 
> conditionals with omitted operands are not used. So, it’s a valid 
> question, if the use should be introduced into the subsystem.
>
> The GCC documentation also states:
>
>> In this simple case, the ability to omit the middle operand is not
>> especially useful. When it becomes useful is when the first operand
>> does, or may (if it is a macro argument), contain a side effect. Then
>> repeating the operand in the middle would perform the side effect
>> twice. Omitting the middle operand uses the value already computed
>> without the undesirable effects of recomputing it.
> So, in your case, there are no side effect, if I am not mistaken.

The explanation you quoted above makes a case *for* using the extension: should 
the first operand be a macro with an argument, omitting the middle operand 
avoids evaluating it twice.

>
> I do not care, if the extension is going to be used or not.

So then why post this message, if you don't care? Since your email continues 
after this, like this:

> The 
> maintainers might want to officially confirm the use in the subsystem, 

I've added the maintainers, Alex and Christian, as well as LKML to the To: list 
of this email.
I believe that it is perfectly fine to use the ternary conditional abbreviation 
"c = a ?: b;", as the kernel uses it extensively, over 1000 occurrences in the 
kernel. It also avoids evaluating 'a' twice should 'a' be a macro with an 
argument that has side effects.

> as using these extensions is surprising for some C developers not 
> knowing the GNU extensions.
>
>>> Thanks Luben!
>> You're welcome. I'm glad you're learning new things from my patches.
>> Would've been easier if you'd just said in your email that you've
>> never seen this ternary conditional shortcut before and that you've
>> just learned of it from my patch. (Or not post anything at all in
>> this very case and get in touch with me privately via email or
>> Teams--I would've gladly clarified it there.)
> In my opinion, asking this on the list is perfectly valid, as other 
> readers, might have the same question. But being more elaborate to avoid 
> misunderstandings is always a good thing.

Lijo wasn't asking anything. There was no question in any of his emails on this 
thread, which is all about the use of "?:", a well-established practice.

Why are we having a thread about the use of "?:" ?

Regards,
Luben

>
>> I hope the find+egrep above is also edifying, so you can use it in
>> the future in your learning process.
> I hope, you like my solution without using find. ;-)
>
>
> Kind regards,
>
> Paul



Re: [PATCH Review 1/1] drm/ttm: fix debugfs node create failed

2021-10-19 Thread Yang, Stanley
[AMD Official Use Only]


> -----Original Message-----
> From: amd-gfx  On Behalf Of Das,
> Nirmoy
> Sent: Thursday, October 14, 2021 2:11 AM
> To: Christian König ; amd-
> g...@lists.freedesktop.org
> Subject: Re: [PATCH Review 1/1] drm/ttm: fix debugfs node create failed
> 
> 
> On 10/13/2021 2:29 PM, Christian König wrote:
> > Am 12.10.21 um 15:12 schrieb Das, Nirmoy:
> >>
> >> On 10/12/2021 1:58 PM, Stanley.Yang wrote:
> >>> Test scenario:
> >>>  modprobe amdgpu -> rmmod amdgpu -> modprobe amdgpu Error log:
> >>>  [   54.396807] debugfs: File 'page_pool' in directory 'amdttm'
> >>> already present!
> >>>  [   54.396833] debugfs: File 'page_pool_shrink' in directory
> >>> 'amdttm' already present!
> >>>  [   54.396848] debugfs: File 'buffer_objects' in directory
> >>> 'amdttm' already present!
> >>
> >>
> >> We should instead add a check if those debugfs files already
> >> exist/created in ttm debugfs dir using debugfs_lookup() before creating.
> >
> > No, IIRC the Intel guys had fixed that already by adding/removing the
> > debugfs file on module load/unload.
> 
> 
> Adding/removing on ttm module load/unload is nicer.
The point is that page_pool, page_pool_shrink and buffer_objects are created by 
the amdgpu driver. I think it's better to remove them in the amdgpu module since 
the amdgpu module creates them; otherwise, creating them will fail when only the 
amdgpu module is reloaded.

Stanley
> 
> 
> Nirmoy
> 
> >
> >
> > Christian.
> >
> >>
> >>
> >> Regards,
> >>
> >> Nirmoy
> >>
> >>
> >>
> >>> Reason:
> >>>  page_pool, page_pool_shrink and buffer_objects can only be removed
> >>>  when doing rmmod amdttm; in the above test scenario only amdgpu is
> >>>  removed, so those debugfs nodes will not be removed, which causes
> >>>  the file creation to fail.
> >>> Solution:
> >>>  create a ttm_page directory under the ttm_root directory when doing
> >>>  insmod amdgpu; page_pool, page_pool_shrink and buffer_objects are
> >>>  stored in the ttm_page directory, and the ttm_page directory is
> >>>  removed on rmmod amdgpu, which fixes the above issue.
> >>>
> >>> Signed-off-by: Stanley.Yang 
> >>> ---
> >>>   drivers/gpu/drm/ttm/ttm_device.c | 12 +++-
> >>>   drivers/gpu/drm/ttm/ttm_module.c |  1 +
> >>>   drivers/gpu/drm/ttm/ttm_module.h |  1 +
> >>>   drivers/gpu/drm/ttm/ttm_pool.c   |  4 ++--
> >>>   4 files changed, 15 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_device.c
> >>> b/drivers/gpu/drm/ttm/ttm_device.c
> >>> index 1de23edbc182..ad170328f0c8 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_device.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> >>> @@ -55,6 +55,10 @@ static void ttm_global_release(void)
> >>>     ttm_pool_mgr_fini();
> >>>   +#ifdef CONFIG_DEBUG_FS
> >>> +    debugfs_remove(ttm_debugfs_page);
> >>> +#endif
> >>> +
> >>>   __free_page(glob->dummy_read_page);
> >>>   memset(glob, 0, sizeof(*glob));
> >>>   out:
> >>> @@ -85,6 +89,10 @@ static int ttm_global_init(void)
> >>>   >> PAGE_SHIFT;
> >>>   num_dma32 = min(num_dma32, 2UL << (30 - PAGE_SHIFT));
> >>>   +#ifdef CONFIG_DEBUG_FS
> >>> +    ttm_debugfs_page = debugfs_create_dir("ttm_page", ttm_debugfs_root);
> >>> +#endif
> >>> +
> >>>   ttm_pool_mgr_init(num_pages);
> >>>   ttm_tt_mgr_init(num_pages, num_dma32);
> >>>   @@ -98,8 +106,10 @@ static int ttm_global_init(void)
> >>>   INIT_LIST_HEAD(&glob->device_list);
> >>>   atomic_set(&glob->bo_count, 0);
> >>> -    debugfs_create_atomic_t("buffer_objects", 0444, ttm_debugfs_root,
> >>> +#ifdef CONFIG_DEBUG_FS
> >>> +    debugfs_create_atomic_t("buffer_objects", 0444, ttm_debugfs_page,
> >>>   &glob->bo_count);
> >>> +#endif
> >>>   out:
> >>>   mutex_unlock(&ttm_global_mutex);
> >>>   return ret;
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_module.c
> >>> b/drivers/gpu/drm/ttm/ttm_module.c
> >>> index 88970a6b8e32..66595e6e7087 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_module.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_module.c
> >>> @@ -38,6 +38,7 @@
> >>>   #include "ttm_module.h"
> >>>     struct dentry *ttm_debugfs_root;
> >>> +struct dentry *ttm_debugfs_page;
> >>>     static int __init ttm_init(void)
> >>>   {
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_module.h
> >>> b/drivers/gpu/drm/ttm/ttm_module.h
> >>> index d7cac5d4b835..6007dc66f44e 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_module.h
> >>> +++ b/drivers/gpu/drm/ttm/ttm_module.h
> >>> @@ -36,5 +36,6 @@
> >>>   struct dentry;
> >>>     extern struct dentry *ttm_debugfs_root;
> >>> +extern struct dentry *ttm_debugfs_page;
> >>>     #endif /* _TTM_MODULE_H_ */
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> >>> index 8be7fd7161fd..ecb33daad7b5 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> >>> @@ -709,9 +709,9 @@ int ttm_pool_mgr_init(unsigned long num_pages)
> >>>   }
> >>>     #ifdef CONFIG_DEBUG_FS
> >>> -    debugfs_create_file("page_pool", 0444,

Re: [PATCH 0/5] Remove 0 MHz as a valid current frequency (v4)

2021-10-19 Thread Paul Menzel

Dear Luben,


Thank you for your quick reply.

Am 19.10.21 um 09:43 schrieb Luben Tuikov:

On 2021-10-19 03:23, Paul Menzel wrote:



Sorry, two more style nits.

1.  Could you please use 75 characters per line for the text width of
the commit messages. Currently, especially 4/5, are hard to read being
so short.


This is the default we use--I've not made any changes to the wrap.


What do you mean? Your editor wraps the lines at the point where you 
configured it, doesn’t it?



git-log(1) indents the text by 4/8 chars and it looks better if the
text doesn't roll past 75 chars per line in git-log.
Patch 4/5 uses a text width of 50 characters, which is too short. From 
commit 2a076f40d8c9 (checkpatch, SubmittingPatches: suggest line 
wrapping commit messages at 75 columns) [1], which added a check for too 
long lines:



Suggest line wrapping at 75 columns so the default git commit log
indentation of 4 plus the commit message text still fits on an 80 column
screen.




2.  No idea, what is done in amd-gfx, but for me it is more common to
put the iteration number (reroll count) in the PATCH tag in the
beginning. No idea, how Patchwork deals with it.


This is what we do in amd-gfx and particularly in amdgpu, so that the
version of the patch is recorded in the title of the patch and in history.

I forgot. Thank you.


Kind regards,

Paul


[1]: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a076f40d8c9be95bee7bcf18436655e1140447f


Use of conditionals with omitted operands in amdgpu (x? : y) (was: [PATCH 4/5] dpm/amd/pm: Sienna: Remove 0 MHz as a current clock frequency (v3))

2021-10-19 Thread Paul Menzel

Dear Luben,


Am 19.10.21 um 06:50 schrieb Luben Tuikov:

On 2021-10-19 00:38, Lazar, Lijo wrote:


On 10/19/2021 9:45 AM, Luben Tuikov wrote:

On 2021-10-18 23:38, Lazar, Lijo wrote:

On 10/19/2021 5:19 AM, Luben Tuikov wrote:


[…]


-   if (ret)
-   goto print_clk_out;
+   freq_value[1] = curr_value ?: freq_value[0];

Omitting the second expression is not standard C -
https://gcc.gnu.org/onlinedocs/gcc/Conditionals.html

Lijo just clarified to me that:


Well, I had to look it up as I haven't seen it before.

I hope the following should make it clear about its usage:

$cd linux/
$find . -name "*.[ch]" -exec grep -E "\?:" \{\} \+ | wc -l
1042
$_


$ git grep -E "\?:" -- '*amdgpu*.[ch]'
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c: * Solution?:

So, as the only result is a comment, conditionals with omitted operands are 
currently not used in the AMDGPU subsystem. It's therefore a valid question 
whether this use should be introduced into the subsystem.


The GCC documentation also states:


In this simple case, the ability to omit the middle operand is not
especially useful. When it becomes useful is when the first operand
does, or may (if it is a macro argument), contain a side effect. Then
repeating the operand in the middle would perform the side effect
twice. Omitting the middle operand uses the value already computed
without the undesirable effects of recomputing it.


So, in your case, there are no side effects, if I am not mistaken.

I do not care whether the extension is going to be used or not. The 
maintainers might want to officially confirm its use in the subsystem, 
as these extensions are surprising for C developers who do not know the 
GNU extensions.



Thanks Luben!


You're welcome. I'm glad you're learning new things from my patches.
Would've been easier if you'd just said in your email that you've
never seen this ternary conditional shortcut before and that you've
just learned of it from my patch. (Or not post anything at all in
this very case and get in touch with me privately via email or
Teams--I would've gladly clarified it there.)


In my opinion, asking this on the list is perfectly valid, as other 
readers might have the same question. But being more elaborate to avoid 
misunderstandings is always a good thing.



I hope the find+egrep above is also edifying, so you can use it in
the future in your learning process.


I hope you like my solution without using find. ;-)


Kind regards,

Paul


Re: [PATCH 0/5] Remove 0 MHz as a valid current frequency (v4)

2021-10-19 Thread Luben Tuikov
On 2021-10-19 03:23, Paul Menzel wrote:
> Dear Luben,
>
>
> Sorry, two more style nits.
>
> 1.  Could you please use 75 characters per line for the text width of 
> the commit messages. Currently, especially 4/5, are hard to read being 
> so short.

This is the default we use--I've not made any changes to the wrap. git-log(1) 
indents the text by 4/8 chars and it looks better if the text doesn't roll past 
75 chars per line in git-log.
>
> 2.  No idea, what is done in amd-gfx, but for me it is more common to 
> put the iteration number (reroll count) in the PATCH tag in the 
> beginning. No idea, how Patchwork deals with it.

This is what we do in amd-gfx and particularly in amdgpu, so that the version 
of the patch is recorded in the title of the patch and in history.

Regards,
Luben

>
>
> Kind regards,
>
> Paul



Re: [PATCH 0/5] Remove 0 MHz as a valid current frequency (v4)

2021-10-19 Thread Paul Menzel

Dear Luben,


Sorry, two more style nits.

1.  Could you please use 75 characters per line for the text width of 
the commit messages. Currently, especially in 4/5, they are hard to read, 
being so short.


2.  No idea what is done in amd-gfx, but for me it is more common to 
put the iteration number (reroll count) in the PATCH tag at the 
beginning. No idea how Patchwork deals with it.



Kind regards,

Paul