Re: [PATCH 3/4] drm/amdgpu: add IOCTL interface for per VM BOs v2

2017-08-29 Thread Michel Dänzer
On 30/08/17 03:42 PM, Michel Dänzer wrote:
> On 30/08/17 03:09 PM, Christian König wrote:
>> Am 29.08.2017 um 19:20 schrieb Deucher, Alexander:
>>>> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
>>>> Of Christian König
>>>>
>>>> @@ -89,6 +89,8 @@ extern "C" {
>>>>   #define AMDGPU_GEM_CREATE_SHADOW (1 << 4)
>>>>   /* Flag that allocating the BO should use linear VRAM */
>>>>   #define AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS (1 << 5)
>>>> +/* Flag that BO is local in the VM */
>>>> +#define AMDGPU_GEM_CREATE_LOCAL (1 << 6)
>>> I'm not crazy about the name LOCAL.  Maybe something like ALWAYS_VALID?
>>
>> Works for me as well. Dave any other opinion?
>>
>> If everybody is ok with ALWAYS_VALID I'm going to use that one.
> 
> FWIW, I like LOCAL better than ALWAYS_VALID. The latter suggests that
> the BO is valid under any circumstances, whereas LOCAL indicates that it
> cannot be used outside of the GPUVM it was created in.
> 
> I don't feel strongly about it though, feel free to go with either.

Another idea:

/* The BO can only be used in the VM it was created in */
#define AMDGPU_GEM_CREATE_UNSHAREABLE (1 << 6)
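
To make the intended semantics concrete, here is a minimal user-space sketch of the check such a flag implies: a BO carrying the flag may only be opened in the VM it was created in. All names (EXAMPLE_GEM_CREATE_VM_PRIVATE, struct example_vm, struct example_bo, example_bo_open) are hypothetical stand-ins for illustration and are not the driver's real types; the actual patch compares reservation objects instead of an owner pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag name; the thread is still deciding what to call it. */
#define EXAMPLE_GEM_CREATE_VM_PRIVATE (1u << 6)

/* Hypothetical stand-in for a GPU virtual address space. */
struct example_vm {
	int id;
};

/* Hypothetical stand-in for a buffer object. */
struct example_bo {
	uint64_t flags;                 /* creation flags */
	const struct example_vm *owner; /* VM the BO was created in */
};

/*
 * Refuse to open a VM-private BO in any VM other than its owner.
 * Returns 0 on success and -EPERM when the BO is private to another VM.
 */
static int example_bo_open(const struct example_bo *bo,
			   const struct example_vm *vm)
{
	if ((bo->flags & EXAMPLE_GEM_CREATE_VM_PRIVATE) && bo->owner != vm)
		return -EPERM;
	return 0;
}
```

A BO created without the flag can still be opened from any VM, which is the distinction that names like LOCAL or UNSHAREABLE try to capture.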


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 3/4] drm/amdgpu: add IOCTL interface for per VM BOs v2

2017-08-29 Thread Michel Dänzer
On 30/08/17 03:09 PM, Christian König wrote:
> Am 29.08.2017 um 19:20 schrieb Deucher, Alexander:
>>> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
>>> Of Christian König
>>>
>>> @@ -89,6 +89,8 @@ extern "C" {
>>>   #define AMDGPU_GEM_CREATE_SHADOW (1 << 4)
>>>   /* Flag that allocating the BO should use linear VRAM */
>>>   #define AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS (1 << 5)
>>> +/* Flag that BO is local in the VM */
>>> +#define AMDGPU_GEM_CREATE_LOCAL (1 << 6)
>> I'm not crazy about the name LOCAL.  Maybe something like ALWAYS_VALID?
> 
> Works for me as well. Dave any other opinion?
> 
> If everybody is ok with ALWAYS_VALID I'm going to use that one.

FWIW, I like LOCAL better than ALWAYS_VALID. The latter suggests that
the BO is valid under any circumstances, whereas LOCAL indicates that it
cannot be used outside of the GPUVM it was created in.

I don't feel strongly about it though, feel free to go with either.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH] drm/amd/amdgpu: cover fragment size between 4 and 9 when not aligned

2017-08-29 Thread Christian König

Hi David & Roger,

> I think that when the fragment can be selected automatically,
> vm_manager.fragment_size shouldn't be involved in the calculation; then
> we can use the most suitable fragment for every segment and get the
> best performance.

No, that won't work. I've already tried this and it decreases
performance. The problem is that the L2 on pre-Vega10 hardware uses the
fragment field to decide in which cache segment to put the PTE.


So we should only cover fragment sizes up to whatever the
vm_manager.fragment_size configuration currently is.


In addition, the patch needs to be simplified, because another 5-level
recursion is not something we want given the limited kernel stack
(only 4K usually).


If you don't mind, Roger, I'm going to fix this today; I think I already
have a good idea of how to handle it.
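
For illustration only, the two constraints above (cap the fragment at the configured maximum, and avoid recursion) can be met with a simple loop. The sketch below is a user-space model, not the fix that was eventually posted: the names frag_segment, lowest_bit, highest_bit and split_into_fragments are hypothetical, addresses are in units of GPU pages, and max_frag plays the role of vm_manager.fragment_size (assumed to be below 64).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record: PTEs in [start, end) share one fragment value. */
struct frag_segment {
	uint64_t start;
	uint64_t end;
	unsigned frag;
};

/* Index of the lowest set bit; v must be non-zero. */
static unsigned lowest_bit(uint64_t v)
{
	unsigned n = 0;

	while (!(v & 1)) {
		v >>= 1;
		n++;
	}
	return n;
}

/* Index of the highest set bit; v must be non-zero. */
static unsigned highest_bit(uint64_t v)
{
	unsigned n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/*
 * Split [start, end) into segments without recursion. Each segment gets
 * the largest fragment that its start is aligned to, that still fits in
 * the remaining range, and that does not exceed max_frag.
 * Returns the number of segments written, or -1 if out[] is too small.
 */
static int split_into_fragments(uint64_t start, uint64_t end,
				unsigned max_frag,
				struct frag_segment *out, int max_out)
{
	int n = 0;

	while (start < end) {
		/* Alignment of the start address (address 0 is fully aligned). */
		unsigned frag = start ? lowest_bit(start) : max_frag;
		/* Largest power-of-two block that fits in the remainder. */
		unsigned fit = highest_bit(end - start);
		uint64_t seg_end;

		if (frag > fit)
			frag = fit;
		if (frag > max_frag)
			frag = max_frag;

		if (frag == max_frag)
			/* Cover everything up to the last aligned boundary. */
			seg_end = end & ~((UINT64_C(1) << frag) - 1);
		else
			/* Emit one block, then re-evaluate the alignment. */
			seg_end = start + (UINT64_C(1) << frag);

		if (n == max_out)
			return -1;
		out[n++] = (struct frag_segment){ start, seg_end, frag };
		start = seg_end;
	}
	return n;
}
```

With start = 3, end = 37 and max_frag = 4, the loop emits six segments with fragment values 0, 2, 3, 4, 2 and 0, so the unaligned head and tail still get the largest fragment they can use while the stack depth stays constant.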


Regards,
Christian.

Am 30.08.2017 um 08:15 schrieb zhoucm1:

Hi Roger,

I think that when the fragment can be selected automatically,
vm_manager.fragment_size shouldn't be involved in the calculation; then
we can use the most suitable fragment for every segment and get the
best performance.


So vm_manager.fragment_size should normally be -1; if it is set to a
valid fragment value, we should always use that value and disable
automatic selection. A fixed value should only be used for validation.



Regards,

David Zhou


On 2017年08月30日 13:01, Roger He wrote:

This can improve performance in some cases.

Change-Id: Ibb58bb3099f7e8c4b5da90da73a03544cdb2bcb7
Signed-off-by: Roger He 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 98 +++---
 1 file changed, 79 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 592c3e7..4e5da5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1375,7 +1375,7 @@ static int amdgpu_vm_update_ptes(struct amdgpu_pte_update_params *params,
 }
 
 /*
- * amdgpu_vm_frag_ptes - add fragment information to PTEs
+ * amdgpu_vm_update_ptes_helper - add fragment information to PTEs
  *
  * @params: see amdgpu_pte_update_params definition
  * @vm: requested vm
@@ -1383,11 +1383,12 @@ static int amdgpu_vm_update_ptes(struct amdgpu_pte_update_params *params,
  * @end: last PTE to handle
  * @dst: addr those PTEs should point to
  * @flags: hw mapping flags
+ * @fragment: fragment size
  * Returns 0 for success, -EINVAL for failure.
  */
-static int amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params *params,
-   uint64_t start, uint64_t end,
-   uint64_t dst, uint64_t flags)
+static int amdgpu_vm_update_ptes_helper(struct amdgpu_pte_update_params *params,
+ uint64_t start, uint64_t end, uint64_t dst,
+ uint64_t flags, int fragment)
 {
int r;
 
@@ -1409,41 +1410,100 @@ static int amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params *params,
 * Userspace can support this by aligning virtual base address and
 * allocation size to the fragment size.
 */
-   unsigned pages_per_frag = params->adev->vm_manager.fragment_size;
-   uint64_t frag_flags = AMDGPU_PTE_FRAG(pages_per_frag);
-   uint64_t frag_align = 1 << pages_per_frag;
+   uint64_t frag_flags, frag_align, frag_start, frag_end;
 
-   uint64_t frag_start = ALIGN(start, frag_align);
-   uint64_t frag_end = end & ~(frag_align - 1);
+   if (start > end || fragment < 0)
+   return -EINVAL;
 
-   /* system pages are non continuously */
-   if (params->src || !(flags & AMDGPU_PTE_VALID) ||
-   (frag_start >= frag_end))
-   return amdgpu_vm_update_ptes(params, start, end, dst, flags);
+   fragment = min(fragment, max(0, fls64(end - start) - 1));
+   frag_flags = AMDGPU_PTE_FRAG(fragment);
+   frag_align = 1 << fragment;
+   frag_start = ALIGN(start, frag_align);
+   frag_end = end & ~(frag_align - 1);
+
+   if (frag_start >= frag_end) {
+   if (fragment <= 4)
+   return amdgpu_vm_update_ptes(params, start, end,
+   dst, flags);
+   else
+   return amdgpu_vm_update_ptes_helper(params, start,
+   end, dst, flags, fragment - 1);
+   }
+
+   if (fragment <= 4) {
+   /* handle the 4K area at the beginning */
+   if (start != frag_start) {
+   r = amdgpu_vm_update_ptes(params, start, frag_start,
+ dst, flags);
+   if (r)
+   return r;
+   dst += (frag_start - start) * AMDGPU_GPU_PAGE_SIZE;
+   }
+
+   /* handle the area in the middle */
+   r = amdgpu_vm_update_ptes(params, frag_start, frag_end, dst,
+ flags | frag_flags);
+   if (r)
+   return r;
+
+   /* handle the 4K area at the end */
+   if (frag_end != end) {
+   dst += (frag_end - frag_start) * AMDGPU_GPU_PAGE_SIZE;
+   r = amdgpu_vm_update_ptes(params, frag_end, end,
+   dst, f

Re: [PATCH 04/13] drm/amdgpu: update to new mmu_notifier semantic

2017-08-29 Thread Christian König

Am 30.08.2017 um 01:54 schrieb Jérôme Glisse:

Calls to mmu_notifier_invalidate_page() are replaced by calls to
mmu_notifier_invalidate_range(), and these calls are bracketed by
calls to mmu_notifier_invalidate_range_start()/end().

Remove the now-useless invalidate_page callback.

Signed-off-by: Jérôme Glisse 


Reviewed-by: Christian König 

The general approach is Acked-by: Christian König.


It's something very welcome, since I was one of the people (together with
the Intel guys) who failed to recognize what this callback really does.


Regards,
Christian.


Cc: amd-gfx@lists.freedesktop.org
Cc: Felix Kuehling 
Cc: Christian König 
Cc: Alex Deucher 
Cc: Kirill A. Shutemov 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Andrea Arcangeli 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 31 ---
  1 file changed, 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 6558a3ed57a7..e1cde6b80027 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -147,36 +147,6 @@ static void amdgpu_mn_invalidate_node(struct 
amdgpu_mn_node *node,
  }
  
  /**

- * amdgpu_mn_invalidate_page - callback to notify about mm change
- *
- * @mn: our notifier
- * @mn: the mm this callback is about
- * @address: address of invalidate page
- *
- * Invalidation of a single page. Blocks for all BOs mapping it
- * and unmap them by move them into system domain again.
- */
-static void amdgpu_mn_invalidate_page(struct mmu_notifier *mn,
- struct mm_struct *mm,
- unsigned long address)
-{
-   struct amdgpu_mn *rmn = container_of(mn, struct amdgpu_mn, mn);
-   struct interval_tree_node *it;
-
-   mutex_lock(&rmn->lock);
-
-   it = interval_tree_iter_first(&rmn->objects, address, address);
-   if (it) {
-   struct amdgpu_mn_node *node;
-
-   node = container_of(it, struct amdgpu_mn_node, it);
-   amdgpu_mn_invalidate_node(node, address, address);
-   }
-
-   mutex_unlock(&rmn->lock);
-}
-
-/**
   * amdgpu_mn_invalidate_range_start - callback to notify about mm change
   *
   * @mn: our notifier
@@ -215,7 +185,6 @@ static void amdgpu_mn_invalidate_range_start(struct 
mmu_notifier *mn,
  
  static const struct mmu_notifier_ops amdgpu_mn_ops = {

.release = amdgpu_mn_release,
-   .invalidate_page = amdgpu_mn_invalidate_page,
.invalidate_range_start = amdgpu_mn_invalidate_range_start,
  };
  





Re: [PATCH] drm/amd/amdgpu: cover fragment size between 4 and 9 when not aligned

2017-08-29 Thread zhoucm1

Hi Roger,

I think that when the fragment can be selected automatically,
vm_manager.fragment_size shouldn't be involved in the calculation; then
we can use the most suitable fragment for every segment and get the
best performance.


So vm_manager.fragment_size should normally be -1; if it is set to a
valid fragment value, we should always use that value and disable
automatic selection. A fixed value should only be used for validation.



Regards,

David Zhou


On 2017年08月30日 13:01, Roger He wrote:

This can improve performance in some cases.

Change-Id: Ibb58bb3099f7e8c4b5da90da73a03544cdb2bcb7
Signed-off-by: Roger He 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 98 +++---
  1 file changed, 79 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 592c3e7..4e5da5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1375,7 +1375,7 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
  }
  
  /*

- * amdgpu_vm_frag_ptes - add fragment information to PTEs
+ * amdgpu_vm_update_ptes_helper - add fragment information to PTEs
   *
   * @params: see amdgpu_pte_update_params definition
   * @vm: requested vm
@@ -1383,11 +1383,12 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
   * @end: last PTE to handle
   * @dst: addr those PTEs should point to
   * @flags: hw mapping flags
+ * @fragment: fragment size
   * Returns 0 for success, -EINVAL for failure.
   */
-static int amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params *params,
-   uint64_t start, uint64_t end,
-   uint64_t dst, uint64_t flags)
+static int amdgpu_vm_update_ptes_helper(struct amdgpu_pte_update_params 
*params,
+ uint64_t start, uint64_t end, uint64_t dst,
+ uint64_t flags, int fragment)
  {
int r;
  
@@ -1409,41 +1410,100 @@ static int amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params *params,
 * Userspace can support this by aligning virtual base address and
 * allocation size to the fragment size.
 */
-   unsigned pages_per_frag = params->adev->vm_manager.fragment_size;
-   uint64_t frag_flags = AMDGPU_PTE_FRAG(pages_per_frag);
-   uint64_t frag_align = 1 << pages_per_frag;
+   uint64_t frag_flags, frag_align, frag_start, frag_end;
  
-	uint64_t frag_start = ALIGN(start, frag_align);

-   uint64_t frag_end = end & ~(frag_align - 1);
+   if (start > end || fragment < 0)
+   return -EINVAL;
  
-	/* system pages are non continuously */

-   if (params->src || !(flags & AMDGPU_PTE_VALID) ||
-   (frag_start >= frag_end))
-   return amdgpu_vm_update_ptes(params, start, end, dst, flags);
+   fragment = min(fragment, max(0, fls64(end - start) - 1));
+   frag_flags = AMDGPU_PTE_FRAG(fragment);
+   frag_align = 1 << fragment;
+   frag_start = ALIGN(start, frag_align);
+   frag_end = end & ~(frag_align - 1);
+
+   if (frag_start >= frag_end) {
+   if (fragment <= 4)
+   return amdgpu_vm_update_ptes(params, start, end,
+   dst, flags);
+   else
+   return amdgpu_vm_update_ptes_helper(params, start,
+   end, dst, flags, fragment - 1);
+   }
+
+   if (fragment <= 4) {
+   /* handle the 4K area at the beginning */
+   if (start != frag_start) {
+   r = amdgpu_vm_update_ptes(params, start, frag_start,
+ dst, flags);
+   if (r)
+   return r;
+   dst += (frag_start - start) * AMDGPU_GPU_PAGE_SIZE;
+   }
+
+   /* handle the area in the middle */
+   r = amdgpu_vm_update_ptes(params, frag_start, frag_end, dst,
+ flags | frag_flags);
+   if (r)
+   return r;
+
+   /* handle the 4K area at the end */
+   if (frag_end != end) {
+   dst += (frag_end - frag_start) * AMDGPU_GPU_PAGE_SIZE;
+   r = amdgpu_vm_update_ptes(params, frag_end, end,
+   dst, flags);
+   }
+   return r;
+   }
  
-	/* handle the 4K area at the beginning */

+   /* handle the area at the beginning not aligned */
if (start != frag_start) {
-   r = amdgpu_vm_update_ptes(params, start, frag_start,
- dst, flags);
+   r = amdgpu_vm_update_ptes_helper(params, start, frag_start,
+   dst, flags, fragme

Re: [PATCH 3/4] drm/amdgpu: add IOCTL interface for per VM BOs v2

2017-08-29 Thread Christian König

Am 29.08.2017 um 19:20 schrieb Deucher, Alexander:

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Christian König
Sent: Tuesday, August 29, 2017 1:08 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 3/4] drm/amdgpu: add IOCTL interface for per VM BOs v2

From: Christian König 

Add the IOCTL interface so that applications can allocate per VM BOs.

Still WIP since not all corner cases are tested yet, but this reduces average
CS overhead for 10K BOs from 21ms down to 48us.

v2: add some extra checks, remove the WIP tag

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h       |  7 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c    |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   | 63 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c |  3 +-
  include/uapi/drm/amdgpu_drm.h             |  2 +
  5 files changed, 55 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index b1e817c..21cab36 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -457,9 +457,10 @@ struct amdgpu_sa_bo {
   */
  void amdgpu_gem_force_release(struct amdgpu_device *adev);
  int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned
long size,
-   int alignment, u32 initial_domain,
-   u64 flags, bool kernel,
-   struct drm_gem_object **obj);
+int alignment, u32 initial_domain,
+u64 flags, bool kernel,
+struct reservation_object *resv,
+struct drm_gem_object **obj);

  int amdgpu_mode_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
index 0e907ea..7256f83 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
@@ -144,7 +144,7 @@ static int amdgpufb_create_pinned_object(struct
amdgpu_fbdev *rfbdev,

AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |

AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |

AMDGPU_GEM_CREATE_VRAM_CLEARED,
-  true, &gobj);
+  true, NULL, &gobj);
if (ret) {
pr_err("failed to allocate framebuffer (%d)\n", aligned_size);
return -ENOMEM;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index e32a2b5..a835304 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -44,11 +44,12 @@ void amdgpu_gem_object_free(struct
drm_gem_object *gobj)
  }

  int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned
long size,
-   int alignment, u32 initial_domain,
-   u64 flags, bool kernel,
-   struct drm_gem_object **obj)
+int alignment, u32 initial_domain,
+u64 flags, bool kernel,
+struct reservation_object *resv,
+struct drm_gem_object **obj)
  {
-   struct amdgpu_bo *robj;
+   struct amdgpu_bo *bo;
int r;

*obj = NULL;
@@ -59,7 +60,7 @@ int amdgpu_gem_object_create(struct amdgpu_device
*adev, unsigned long size,

  retry:
r = amdgpu_bo_create(adev, size, alignment, kernel, initial_domain,
-flags, NULL, NULL, 0, &robj);
+flags, NULL, resv, 0, &bo);
if (r) {
if (r != -ERESTARTSYS) {
if (initial_domain ==
AMDGPU_GEM_DOMAIN_VRAM) {
@@ -71,7 +72,7 @@ int amdgpu_gem_object_create(struct amdgpu_device
*adev, unsigned long size,
}
return r;
}
-   *obj = &robj->gem_base;
+   *obj = &bo->gem_base;

return 0;
  }
@@ -119,6 +120,10 @@ int amdgpu_gem_object_open(struct
drm_gem_object *obj,
if (mm && mm != current->mm)
return -EPERM;

+   if (abo->flags & AMDGPU_GEM_CREATE_LOCAL &&
+   abo->tbo.resv != vm->root.base.bo->tbo.resv)
+   return -EPERM;
+
r = amdgpu_bo_reserve(abo, false);
if (r)
return r;
@@ -142,13 +147,14 @@ void amdgpu_gem_object_close(struct
drm_gem_object *obj,
struct amdgpu_vm *vm = &fpriv->vm;

struct amdgpu_bo_list_entry vm_pd;
-   struct list_head list;
+   struct list_head list, duplicates;
struct ttm_validate_buffer tv;
struct ww_acquire_ctx ticket;
struct amdgpu_bo_va *bo_va;
int r;

INIT_LIST_HEAD(&list);
+   INIT_LIST_HEAD(&duplicates);

tv.bo = &bo->tbo;
tv.shared = true;
@@ -156,7 +162,7 @@ void amdgpu

[PATCH] drm/amd/amdgpu: cover fragment size between 4 and 9 when not aligned

2017-08-29 Thread Roger He
This can improve performance in some cases.

Change-Id: Ibb58bb3099f7e8c4b5da90da73a03544cdb2bcb7
Signed-off-by: Roger He 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 98 +++---
 1 file changed, 79 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 592c3e7..4e5da5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1375,7 +1375,7 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
 }
 
 /*
- * amdgpu_vm_frag_ptes - add fragment information to PTEs
+ * amdgpu_vm_update_ptes_helper - add fragment information to PTEs
  *
  * @params: see amdgpu_pte_update_params definition
  * @vm: requested vm
@@ -1383,11 +1383,12 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
  * @end: last PTE to handle
  * @dst: addr those PTEs should point to
  * @flags: hw mapping flags
+ * @fragment: fragment size
  * Returns 0 for success, -EINVAL for failure.
  */
-static int amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params *params,
-   uint64_t start, uint64_t end,
-   uint64_t dst, uint64_t flags)
+static int amdgpu_vm_update_ptes_helper(struct amdgpu_pte_update_params 
*params,
+ uint64_t start, uint64_t end, uint64_t dst,
+ uint64_t flags, int fragment)
 {
int r;
 
@@ -1409,41 +1410,100 @@ static int amdgpu_vm_frag_ptes(struct 
amdgpu_pte_update_params *params,
 * Userspace can support this by aligning virtual base address and
 * allocation size to the fragment size.
 */
-   unsigned pages_per_frag = params->adev->vm_manager.fragment_size;
-   uint64_t frag_flags = AMDGPU_PTE_FRAG(pages_per_frag);
-   uint64_t frag_align = 1 << pages_per_frag;
+   uint64_t frag_flags, frag_align, frag_start, frag_end;
 
-   uint64_t frag_start = ALIGN(start, frag_align);
-   uint64_t frag_end = end & ~(frag_align - 1);
+   if (start > end || fragment < 0)
+   return -EINVAL;
 
-   /* system pages are non continuously */
-   if (params->src || !(flags & AMDGPU_PTE_VALID) ||
-   (frag_start >= frag_end))
-   return amdgpu_vm_update_ptes(params, start, end, dst, flags);
+   fragment = min(fragment, max(0, fls64(end - start) - 1));
+   frag_flags = AMDGPU_PTE_FRAG(fragment);
+   frag_align = 1 << fragment;
+   frag_start = ALIGN(start, frag_align);
+   frag_end = end & ~(frag_align - 1);
+
+   if (frag_start >= frag_end) {
+   if (fragment <= 4)
+   return amdgpu_vm_update_ptes(params, start, end,
+   dst, flags);
+   else
+   return amdgpu_vm_update_ptes_helper(params, start,
+   end, dst, flags, fragment - 1);
+   }
+
+   if (fragment <= 4) {
+   /* handle the 4K area at the beginning */
+   if (start != frag_start) {
+   r = amdgpu_vm_update_ptes(params, start, frag_start,
+ dst, flags);
+   if (r)
+   return r;
+   dst += (frag_start - start) * AMDGPU_GPU_PAGE_SIZE;
+   }
+
+   /* handle the area in the middle */
+   r = amdgpu_vm_update_ptes(params, frag_start, frag_end, dst,
+ flags | frag_flags);
+   if (r)
+   return r;
+
+   /* handle the 4K area at the end */
+   if (frag_end != end) {
+   dst += (frag_end - frag_start) * AMDGPU_GPU_PAGE_SIZE;
+   r = amdgpu_vm_update_ptes(params, frag_end, end,
+   dst, flags);
+   }
+   return r;
+   }
 
-   /* handle the 4K area at the beginning */
+   /* handle the area at the beginning not aligned */
if (start != frag_start) {
-   r = amdgpu_vm_update_ptes(params, start, frag_start,
- dst, flags);
+   r = amdgpu_vm_update_ptes_helper(params, start, frag_start,
+   dst, flags, fragment - 1);
if (r)
return r;
dst += (frag_start - start) * AMDGPU_GPU_PAGE_SIZE;
}
 
-   /* handle the area in the middle */
+   /* handle the area in the middle aligned*/
r = amdgpu_vm_update_ptes(params, frag_start, frag_end, dst,
  flags | frag_flags);
if (r)
return r;
 
-   /* handle the 4K area at the end */
+   /* handle the area at th

Re: [radeon-alex:amd-staging-drm-next 68/819] drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:154:10: error: 'drm_atomic_helper_connector_dpms' undeclared here (not in a function

2017-08-29 Thread Alex Deucher
On Tue, Aug 29, 2017 at 6:39 PM, Dieter Nützel  wrote:
> I sent a related kernel crash log to amd-devel some days ago, without any
> answer yet...
>
> Was:
> [amd-staging-drm-next] kernel crash with amdgpu on RX580 in
> 'drm_object_property_get_value'
>
> I get this in _all_ current 'amd-staging-drm-next' versions. ;-(

It's not a crash, just a warning.  It's due to a recent change in the
drm core, but we haven't gotten around to sorting it out yet.

Alex

>
> [16301.515079] ------------[ cut here ]------------
> [16301.515105] WARNING: CPU: 4 PID: 11871 at
> drivers/gpu/drm/drm_mode_object.c:294
> drm_object_property_get_value+0x22/0x30 [drm]
> [16301.515106] Modules linked in: fuse rfcomm nf_log_ipv6 xt_comment
> nf_log_ipv4 nf_log_common xt_LOG xt_limit rpcsec_gss_krb5 auth_rpcgss nfsv4
> dns_resolver nfs lockd nfnetlink_cthelper grace nfnetlink sunrpc fscache
> af_packet ipmi_ssif iscsi_ibft iscsi_boot_sysfs ip6t_REJECT nf_reject_ipv6
> nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 xt_pkttype
> xt_tcpudp iptable_filter ip6table_mangle nf_conntrack_netbios_ns
> nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
> xt_conntrack nf_conntrack libcrc32c ip6table_filter ip6_tables x_tables jc42
> bnep joydev snd_hda_codec_hdmi snd_hda_intel snd_hda_codec btusb snd_hwdep
> btrtl snd_hda_core btbcm btintel bluetooth snd_pcm e1000e intel_powerclamp
> iTCO_wdt snd_timer coretemp iTCO_vendor_support hid_generic snd ptp
> kvm_intel pps_core rfkill
> [16301.515129]  kvm ecdh_generic tpm_infineon soundcore irqbypass tpm_tis
> tpm_tis_core crc32c_intel pcspkr shpchp ipmi_si tpm usbhid i2c_i801
> i7core_edac lpc_ich ipmi_devintf ipmi_msghandler ac button acpi_cpufreq
> tcp_bbr raid1 md_mod amdkfd amd_iommu_v2 serio_raw sr_mod cdrom amdgpu
> mpt3sas i2c_algo_bit raid_class scsi_transport_sas drm_kms_helper
> syscopyarea sysfillrect ehci_pci sysimgblt fb_sys_fops ehci_hcd ttm usbcore
> drm sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
> [16301.515147] CPU: 4 PID: 11871 Comm: X Tainted: GW
> 4.13.0-rc5-1.g7262353-default+ #1
> [16301.515148] Hardware name: FUJITSU  PRIMERGY
> TX150 S7 /D2759, BIOS 6.00 Rev. 1.19.2759.A1
> 09/26/2012
> [16301.515149] task: 90c49525a180 task.stack: b84786ef4000
> [16301.515156] RIP: 0010:drm_object_property_get_value+0x22/0x30 [drm]
> [16301.515157] RSP: 0018:b84786ef7bf8 EFLAGS: 00010282
> [16301.515158] RAX: c03f0cc0 RBX: 90c52200 RCX:
> 
> [16301.515158] RDX: b84786ef7c10 RSI: 90c523837600 RDI:
> 90c523b19028
> [16301.515159] RBP: b84786ef7bf8 R08: 90c41ee28280 R09:
> 90c515defc00
> [16301.515159] R10: 00024bb8 R11:  R12:
> 
> [16301.515160] R13: 90c523b19000 R14: ffea R15:
> 90c511e8c100
> [16301.515161] FS:  7f86f15c9a40() GS:90c53fd0()
> knlGS:
> [16301.515162] CS:  0010 DS:  ES:  CR0: 80050033
> [16301.515162] CR2: 000849a98680 CR3: 0005dcc1a000 CR4:
> 06e0
> [16301.515163] Call Trace:
> [16301.515255]  amdgpu_dm_connector_atomic_set_property+0xe8/0x150 [amdgpu]
> [16301.515292]  drm_atomic_set_property+0x164/0x470 [drm]
> [16301.515300]  drm_mode_obj_set_property_ioctl+0x10b/0x240 [drm]
> [16301.515311]  ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
> [16301.515318]  drm_mode_connector_property_set_ioctl+0x30/0x40 [drm]
> [16301.515324]  drm_ioctl_kernel+0x5d/0xb0 [drm]
> [16301.515332]  drm_ioctl+0x31a/0x3d0 [drm]
> [16301.515339]  ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
> [16301.515342]  ? ext4_file_write_iter+0xba/0x390
> [16301.515362]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
> [16301.515365]  do_vfs_ioctl+0x92/0x5c0
> [16301.515367]  ? __fget+0x6e/0x90
> [16301.515368]  SyS_ioctl+0x79/0x90
> [16301.515372]  entry_SYSCALL_64_fastpath+0x1e/0xa9
> [16301.515373] RIP: 0033:0x7f86eef46507
> [16301.515373] RSP: 002b:7fffe5f9dca8 EFLAGS: 3246 ORIG_RAX:
> 0010
> [16301.515374] RAX: ffda RBX: 00084aeea990 RCX:
> 7f86eef46507
> [16301.515375] RDX: 7fffe5f9dce0 RSI: c01064ab RDI:
> 000d
> [16301.515375] RBP: 00084b50d830 R08: 00084b07d690 R09:
> 0001
> [16301.515376] R10: 0004 R11: 3246 R12:
> 00084aeea990
> [16301.515376] R13: 0005 R14: 00084aeeedd0 R15:
> 00084aeeedd0
> [16301.515377] Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 46 60
> 55 48 89 e5 48 8b 80 70 03 00 00 48 83 78 20 00 75 07 e8 60 ff ff ff 5d c3
> <0f> ff e8 57 ff ff ff 5d c3 0f 1f 44 00 00 66 66 66 66 90 55 48
> [16301.515394] ---[ end trace f957c1d5844ce46f ]---
>
> Thanks,
> Dieter
>
>
> Am 30.08.2017 00:22, schrieb kbuild test robot:
>>
>> tree:   git://people.freedesktop.org/~agd5f/linux.git amd-staging-drm-next
>> head:   ff14f0dca1f23b2cff41e43440c7952965e5fc1b
>> commit: 9b37a9b8f6

Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback

2017-08-29 Thread Jerome Glisse
On Tue, Aug 29, 2017 at 05:11:24PM -0700, Linus Torvalds wrote:
> On Tue, Aug 29, 2017 at 4:54 PM, Jérôme Glisse  wrote:
> >
> > Note this is barely tested. I intend to do more testing over the next few
> > days, but I do not have access to all the hardware that makes use of the
> > mmu_notifier API.
> 
> Thanks for doing this.
> 
> > The first 2 patches convert existing calls to mmu_notifier_invalidate_page()
> > into calls to mmu_notifier_invalidate_range() and bracket those calls with
> > calls to mmu_notifier_invalidate_range_start()/end().
> 
> Ok, those two patches are a bit more complex than I was hoping for,
> but not *too* bad.
> 
> And the final end result certainly looks nice:
> 
> >  16 files changed, 74 insertions(+), 214 deletions(-)
> 
> Yeah, removing all those invalidate_page() notifiers certainly makes
> for a nice patch.
> 
> And I actually think you missed some more lines that can now be
> removed: kvm_arch_mmu_notifier_invalidate_page() should no longer be
> needed either, so you can remove all of those too (most of them are
> empty inline functions, but x86 has one that actually does something).
> 
> So there's an added 30 or so dead lines that should be removed in the
> kvm patch, I think.

Yes, I missed that. I will wait for people to test and for the results of
my own tests before reposting if need be; otherwise I will post it as a
separate patch.

> 
> But from a _very_ quick read-through this looks fine. But it obviously
> needs testing.
> 
> People - *especially* the people who saw issues under KVM - can you
> try out Jérôme's patch-series? I added some people to the cc, the full
> series is on lkml. Jérôme - do you have a git branch for people to
> test that they could easily pull and try out?

https://cgit.freedesktop.org/~glisse/linux mmu-notifier branch
git://people.freedesktop.org/~glisse/linux

(Sorry if that tree is a bit big; it has a lot of dead things. I need
 to push a clean and slim one.)

Jérôme


[PATCH 00/13] mmu_notifier kill invalidate_page callback

2017-08-29 Thread Jérôme Glisse
(Sorry for cross-posting to so many lists and for the big cc.)

Please help with testing!

The invalidate_page callback suffered from 2 pitfalls. First, it used to
happen after the page table lock was released, and thus a new page might
have been set up for the virtual address before the call to
invalidate_page().

This was, in a weird way, fixed by c7ab0d2fdc840266b39db94538f74207ec2afbf6,
which moved the callback under the page table lock. That also broke
several existing users of the mmu_notifier API that assumed they could
sleep inside this callback.

The second pitfall was that invalidate_page was the only callback that did
not take a range of addresses for invalidation; instead it was given an
address and a page. A lot of the callback implementers assumed this could
never be a THP and thus failed to invalidate the appropriate range for THP
pages.

By killing this callback we unify the mmu_notifier callback API to always
take a virtual address range as input.
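
As an illustration of the THP pitfall, the helper below computes the half-open range [start, end) that has to be invalidated for a page of a given order. It is only a sketch under stated assumptions: EXAMPLE_PAGE_SHIFT assumes 4 KiB base pages, and the names example_range and example_page_range are hypothetical, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12 /* assumption: 4 KiB base pages */

/* Half-open virtual address range [start, end). */
struct example_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Compute the range covered by the (possibly huge) page that contains
 * address. order is log2 of the page size in base pages: 0 for a normal
 * page, 9 for a 2 MiB huge page when base pages are 4 KiB.
 */
static struct example_range example_page_range(uint64_t address,
					       unsigned order)
{
	uint64_t size = UINT64_C(1) << (EXAMPLE_PAGE_SHIFT + order);
	struct example_range r;

	r.start = address & ~(size - 1); /* round down to the page start */
	r.end = r.start + size;
	return r;
}
```

A callback that assumed order 0 would invalidate only 4 KiB around the address, whereas a 2 MiB THP needs the whole 2 MiB range, which is exactly what a range-based interface forces the caller to spell out.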

There are now 2 clear APIs (I am not mentioning the youngness API, which is
seldom used):
  - invalidate_range_start()/end() callbacks (which allow you to sleep)
  - invalidate_range(), where you cannot sleep but which happens right
    after the page table update, under the page table lock


Note that a lot of existing users seem broken with respect to range_start/
range_end. Many users only have a range_start() callback, but nothing
prevents them from undoing what was invalidated in their range_start()
callback after it returns but before any CPU page table update takes place.

The code pattern used in kvm or umem odp is an example of how to properly
avoid such a race. In a nutshell, use some kind of sequence number and an
active range invalidation counter to block anything that might undo what
the range_start() callback did.
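
A minimal, single-threaded sketch of that pattern follows. It only models the bookkeeping; the locking and memory barriers a real driver needs are deliberately omitted, and every name (example_notifier_state, example_range_start, example_range_end, example_snapshot, example_must_retry) is hypothetical rather than taken from kvm or umem odp.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device bookkeeping, normally protected by a lock. */
struct example_notifier_state {
	unsigned long seq;    /* bumped each time an invalidation completes */
	unsigned long active; /* invalidations currently in flight */
};

/* Called from the range_start() callback. */
static void example_range_start(struct example_notifier_state *s)
{
	s->active++;
}

/* Called from the range_end() callback. */
static void example_range_end(struct example_notifier_state *s)
{
	s->seq++;
	s->active--;
}

/* Taken before the driver starts looking up pages to map. */
static unsigned long example_snapshot(const struct example_notifier_state *s)
{
	return s->seq;
}

/*
 * Checked right before the driver installs its own mapping. If an
 * invalidation is in flight, or one completed since the snapshot, the
 * looked-up pages may be stale and the driver has to start over.
 */
static bool example_must_retry(const struct example_notifier_state *s,
			       unsigned long snapshot)
{
	return s->active != 0 || s->seq != snapshot;
}
```

The driver takes a snapshot, looks up its pages, and then, under the same lock the callbacks use, installs the mapping only if example_must_retry() returns false; otherwise it drops the pages and repeats the lookup.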

If you do not care about staying fully in sync with the CPU page table (i.e.
you can live with the CPU page table pointing to a new, different page for a
given virtual address), then you can take a reference on the pages inside
the range_start callback and drop it in range_end or when your driver
is done with those pages.

The last alternative is to use invalidate_range() if you can do the
invalidation without sleeping, as the invalidate_range() callback happens
under the CPU page table spinlock right after the page table is updated.


Note this is barely tested. I intend to do more testing over the next few
days, but I do not have access to all the hardware that makes use of the
mmu_notifier API.


The first 2 patches convert existing calls to mmu_notifier_invalidate_page()
into calls to mmu_notifier_invalidate_range() and bracket those calls with
calls to mmu_notifier_invalidate_range_start()/end().

The next 10 patches remove the existing invalidate_page() callbacks, as
they can no longer happen.

Finally, the last patch removes it completely so it can RIP.

Jérôme Glisse (13):
  dax: update to new mmu_notifier semantic
  mm/rmap: update to new mmu_notifier semantic
  powerpc/powernv: update to new mmu_notifier semantic
  drm/amdgpu: update to new mmu_notifier semantic
  IB/umem: update to new mmu_notifier semantic
  IB/hfi1: update to new mmu_notifier semantic
  iommu/amd: update to new mmu_notifier semantic
  iommu/intel: update to new mmu_notifier semantic
  misc/mic/scif: update to new mmu_notifier semantic
  sgi-gru: update to new mmu_notifier semantic
  xen/gntdev: update to new mmu_notifier semantic
  KVM: update to new mmu_notifier semantic
  mm/mmu_notifier: kill invalidate_page

Cc: Kirill A. Shutemov 
Cc: Linus Torvalds 
Cc: Andrew Morton 
Cc: Andrea Arcangeli 
Cc: Joerg Roedel 
Cc: Dan Williams 
Cc: Sudeep Dutt 
Cc: Ashutosh Dixit 
Cc: Dimitri Sivanich 
Cc: Jack Steiner 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 

Cc: linuxppc-...@lists.ozlabs.org
Cc: dri-de...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Cc: linux-r...@vger.kernel.org
Cc: io...@lists.linux-foundation.org
Cc: xen-de...@lists.xenproject.org
Cc: k...@vger.kernel.org


 arch/powerpc/platforms/powernv/npu-dma.c | 10
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c   | 31 --
 drivers/infiniband/core/umem_odp.c       | 19 --
 drivers/infiniband/hw/hfi1/mmu_rb.c      |  9 ---
 drivers/iommu/amd_iommu_v2.c             |  8 --
 drivers/iommu/intel-svm.c                |  9 ---
 drivers/misc/mic/scif/scif_dma.c         | 11
 drivers/misc/sgi-gru/grutlbpurge.c       | 12 -
 drivers/xen/gntdev.c                     |  8 --
 fs/dax.c                                 | 19 --
 include/linux/mm.h                       |  1 +
 include/linux/mmu_notifier.h             | 25 --
 mm/memory.c                              | 26 +++
 mm/mmu_notifier.c                        | 14 --
 mm/rmap.c                                | 44 +---
 virt/kvm/kvm_main.c                      | 42 --
 16 files changed, 74 insertions(+), 214 deletions(-)

-- 
2.13.5


[PATCH 04/13] drm/amdgpu: update to new mmu_notifier semantic

2017-08-29 Thread Jérôme Glisse
Calls to mmu_notifier_invalidate_page() are replaced by calls to
mmu_notifier_invalidate_range(), and those calls are bracketed by
calls to mmu_notifier_invalidate_range_start()/end().

Remove the now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse 
Cc: amd-gfx@lists.freedesktop.org
Cc: Felix Kuehling 
Cc: Christian König 
Cc: Alex Deucher 
Cc: Kirill A. Shutemov 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Andrea Arcangeli 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 31 ---
 1 file changed, 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 6558a3ed57a7..e1cde6b80027 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -147,36 +147,6 @@ static void amdgpu_mn_invalidate_node(struct amdgpu_mn_node *node,
 }
 
 /**
- * amdgpu_mn_invalidate_page - callback to notify about mm change
- *
- * @mn: our notifier
- * @mn: the mm this callback is about
- * @address: address of invalidate page
- *
- * Invalidation of a single page. Blocks for all BOs mapping it
- * and unmap them by move them into system domain again.
- */
-static void amdgpu_mn_invalidate_page(struct mmu_notifier *mn,
- struct mm_struct *mm,
- unsigned long address)
-{
-   struct amdgpu_mn *rmn = container_of(mn, struct amdgpu_mn, mn);
-   struct interval_tree_node *it;
-
-   mutex_lock(&rmn->lock);
-
-   it = interval_tree_iter_first(&rmn->objects, address, address);
-   if (it) {
-   struct amdgpu_mn_node *node;
-
-   node = container_of(it, struct amdgpu_mn_node, it);
-   amdgpu_mn_invalidate_node(node, address, address);
-   }
-
-   mutex_unlock(&rmn->lock);
-}
-
-/**
  * amdgpu_mn_invalidate_range_start - callback to notify about mm change
  *
  * @mn: our notifier
@@ -215,7 +185,6 @@ static void amdgpu_mn_invalidate_range_start(struct mmu_notifier *mn,
 
 static const struct mmu_notifier_ops amdgpu_mn_ops = {
.release = amdgpu_mn_release,
-   .invalidate_page = amdgpu_mn_invalidate_page,
.invalidate_range_start = amdgpu_mn_invalidate_range_start,
 };
 
-- 
2.13.5



Re: [PATCH 00/13] mmu_notifier kill invalidate_page callback

2017-08-29 Thread Linus Torvalds
On Tue, Aug 29, 2017 at 4:54 PM, Jérôme Glisse  wrote:
>
> Note this is barely tested. I intend to do more testing over the next few days
> but I do not have access to all the hardware that makes use of the mmu_notifier
> API.

Thanks for doing this.

> The first 2 patches convert existing calls of mmu_notifier_invalidate_page()
> to mmu_notifier_invalidate_range() and bracket those calls with calls to
> mmu_notifier_invalidate_range_start()/end().

Ok, those two patches are a bit more complex than I was hoping for,
but not *too* bad.

And the final end result certainly looks nice:

>  16 files changed, 74 insertions(+), 214 deletions(-)

Yeah, removing all those invalidate_page() notifiers certainly makes
for a nice patch.

And I actually think you missed some more lines that can now be
removed: kvm_arch_mmu_notifier_invalidate_page() should no longer be
needed either, so you can remove all of those too (most of them are
empty inline functions, but x86 has one that actually does something).

So there's an added 30 or so dead lines that should be removed in the
kvm patch, I think.

But from a _very_ quick read-through this looks fine. But it obviously
needs testing.

People - *especially* the people who saw issues under KVM - can you
try out Jérôme's patch-series? I added some people to the cc, the full
series is on lkml. Jérôme - do you have a git branch for people to
test that they could easily pull and try out?

Linus


Re: [PATCH 9/9] drm/amdgpu: WIP add IOCTL interface for per VM BOs

2017-08-29 Thread Marek Olšák
It might be interesting to try glmark2.

Marek

On Tue, Aug 29, 2017 at 3:59 PM, Christian König
 wrote:
> Ok, found something that works. Xonotic in lowest resolution, lowest effects
> quality (i.e. totally CPU bound):
>
> Without per process BOs:
>
> Xonotic 0.8:
> pts/xonotic-1.4.0 [Resolution: 800 x 600 - Effects Quality: Low]
> Test 1 of 1
> Estimated Trial Run Count:3
> Estimated Time To Completion: 3 Minutes
> Started Run 1 @ 21:13:50
> Started Run 2 @ 21:14:57
> Started Run 3 @ 21:16:03  [Std. Dev: 0.94%]
>
> Test Results:
> 187.436577
> 189.514724
> 190.9605812
>
> Average: 189.30 Frames Per Second
> Minimum: 131
> Maximum: 355
>
> With per process BOs:
>
> Xonotic 0.8:
> pts/xonotic-1.4.0 [Resolution: 800 x 600 - Effects Quality: Low]
> Test 1 of 1
> Estimated Trial Run Count:3
> Estimated Time To Completion: 3 Minutes
> Started Run 1 @ 21:20:05
> Started Run 2 @ 21:21:07
> Started Run 3 @ 21:22:10  [Std. Dev: 1.49%]
>
> Test Results:
> 203.0471676
> 199.6622532
> 197.0954183
>
> Average: 199.93 Frames Per Second
> Minimum: 132
> Maximum: 349
>
> Well that looks like some improvement.
>
> Regards,
> Christian.
>
>
> On 28.08.2017 at 14:59, Zhou, David(ChunMing) wrote:
>
> I will push our vulkan guys to test it, their bo list is very long.
>
> Sent from my Smartisan Nut Pro
>
> On 28 August 2017 at 7:55 PM, Christian König wrote:
>
> On 28.08.2017 at 06:21, zhoucm1 wrote:
>>
>>
>> On 2017-08-27 18:03, Christian König wrote:
>>> On 25.08.2017 at 21:19, Christian König wrote:
 On 25.08.2017 at 18:22, Marek Olšák wrote:
> On Fri, Aug 25, 2017 at 3:00 PM, Christian König
>  wrote:
>> On 25.08.2017 at 12:32, zhoucm1 wrote:
>>>
>>>
>>> On 2017-08-25 17:38, Christian König wrote:
 From: Christian König 

 Add the IOCTL interface so that applications can allocate per VM
 BOs.

 Still WIP since not all corner cases are tested yet, but this
 reduces
 average
 CS overhead for 10K BOs from 21ms down to 48us.
>>> Wow, cheers, you finally got the per-VM BOs onto the same reservation
>>> as the PD/PTs,
>>> which indeed shortens the BO list a lot.
>>
>> Don't cheer too loud yet, that is a completely constructed test case.
>>
>> So far I wasn't able to achieve any improvements with any real
>> game on this
>> with Mesa.
>> Thinking about it more, too many BOs share one reservation, which could
>> result in the reservation lock often being busy. If eviction or destroy also
>> happens often in the meantime, that could affect VM updates
>> and CS submission as well.
>
> That's exactly the reason why I've added code to the BO destroy path to
> avoid at least some of the problems. But yeah, that's only the tip of
> the iceberg of problems with that approach.
>
>> Anyway, this is a very good start at reducing CS overhead,
>> especially since we've seen "reduces average CS overhead for 10K BOs from
>> 21ms down to 48us".
>
> Actually, it's not that good. This is a completely constructed test
> case on a kernel with lockdep and KASAN enabled.
>
> In reality we usually don't have so many BOs and so far I wasn't able to
> find much of an improvement in any real world testing.
>
> Regards,
> Christian.


Re: [radeon-alex:amd-staging-drm-next 68/819] drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:154:10: error: 'drm_atomic_helper_connector_dpms' undeclared here (not in a function

2017-08-29 Thread Dieter Nützel
I sent a related kernel crash log to amd-devel some days ago, without
any answer yet...


Was:
[amd-staging-drm-next] kernel crash with amdgpu on RX580 in 
'drm_object_property_get_value'


I get this in _all_ current 'amd-staging-drm-next' versions. ;-(

[16301.515079] [ cut here ]
[16301.515105] WARNING: CPU: 4 PID: 11871 at 
drivers/gpu/drm/drm_mode_object.c:294 
drm_object_property_get_value+0x22/0x30 [drm]
[16301.515106] Modules linked in: fuse rfcomm nf_log_ipv6 xt_comment 
nf_log_ipv4 nf_log_common xt_LOG xt_limit rpcsec_gss_krb5 auth_rpcgss 
nfsv4 dns_resolver nfs lockd nfnetlink_cthelper grace nfnetlink sunrpc 
fscache af_packet ipmi_ssif iscsi_ibft iscsi_boot_sysfs ip6t_REJECT 
nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT 
nf_reject_ipv4 xt_pkttype xt_tcpudp iptable_filter ip6table_mangle 
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 
nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack libcrc32c 
ip6table_filter ip6_tables x_tables jc42 bnep joydev snd_hda_codec_hdmi 
snd_hda_intel snd_hda_codec btusb snd_hwdep btrtl snd_hda_core btbcm 
btintel bluetooth snd_pcm e1000e intel_powerclamp iTCO_wdt snd_timer 
coretemp iTCO_vendor_support hid_generic snd ptp kvm_intel pps_core 
rfkill
[16301.515129]  kvm ecdh_generic tpm_infineon soundcore irqbypass 
tpm_tis tpm_tis_core crc32c_intel pcspkr shpchp ipmi_si tpm usbhid 
i2c_i801 i7core_edac lpc_ich ipmi_devintf ipmi_msghandler ac button 
acpi_cpufreq tcp_bbr raid1 md_mod amdkfd amd_iommu_v2 serio_raw sr_mod 
cdrom amdgpu mpt3sas i2c_algo_bit raid_class scsi_transport_sas 
drm_kms_helper syscopyarea sysfillrect ehci_pci sysimgblt fb_sys_fops 
ehci_hcd ttm usbcore drm sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc 
scsi_dh_alua
[16301.515147] CPU: 4 PID: 11871 Comm: X Tainted: GW   
4.13.0-rc5-1.g7262353-default+ #1
[16301.515148] Hardware name: FUJITSU  PRIMERGY 
TX150 S7 /D2759, BIOS 6.00 Rev. 1.19.2759.A1   
09/26/2012

[16301.515149] task: 90c49525a180 task.stack: b84786ef4000
[16301.515156] RIP: 0010:drm_object_property_get_value+0x22/0x30 [drm]
[16301.515157] RSP: 0018:b84786ef7bf8 EFLAGS: 00010282
[16301.515158] RAX: c03f0cc0 RBX: 90c52200 RCX: 

[16301.515158] RDX: b84786ef7c10 RSI: 90c523837600 RDI: 
90c523b19028
[16301.515159] RBP: b84786ef7bf8 R08: 90c41ee28280 R09: 
90c515defc00
[16301.515159] R10: 00024bb8 R11:  R12: 

[16301.515160] R13: 90c523b19000 R14: ffea R15: 
90c511e8c100
[16301.515161] FS:  7f86f15c9a40() GS:90c53fd0() 
knlGS:

[16301.515162] CS:  0010 DS:  ES:  CR0: 80050033
[16301.515162] CR2: 000849a98680 CR3: 0005dcc1a000 CR4: 
06e0

[16301.515163] Call Trace:
[16301.515255]  amdgpu_dm_connector_atomic_set_property+0xe8/0x150 
[amdgpu]

[16301.515292]  drm_atomic_set_property+0x164/0x470 [drm]
[16301.515300]  drm_mode_obj_set_property_ioctl+0x10b/0x240 [drm]
[16301.515311]  ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
[16301.515318]  drm_mode_connector_property_set_ioctl+0x30/0x40 [drm]
[16301.515324]  drm_ioctl_kernel+0x5d/0xb0 [drm]
[16301.515332]  drm_ioctl+0x31a/0x3d0 [drm]
[16301.515339]  ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
[16301.515342]  ? ext4_file_write_iter+0xba/0x390
[16301.515362]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[16301.515365]  do_vfs_ioctl+0x92/0x5c0
[16301.515367]  ? __fget+0x6e/0x90
[16301.515368]  SyS_ioctl+0x79/0x90
[16301.515372]  entry_SYSCALL_64_fastpath+0x1e/0xa9
[16301.515373] RIP: 0033:0x7f86eef46507
[16301.515373] RSP: 002b:7fffe5f9dca8 EFLAGS: 3246 ORIG_RAX: 
0010
[16301.515374] RAX: ffda RBX: 00084aeea990 RCX: 
7f86eef46507
[16301.515375] RDX: 7fffe5f9dce0 RSI: c01064ab RDI: 
000d
[16301.515375] RBP: 00084b50d830 R08: 00084b07d690 R09: 
0001
[16301.515376] R10: 0004 R11: 3246 R12: 
00084aeea990
[16301.515376] R13: 0005 R14: 00084aeeedd0 R15: 
00084aeeedd0
[16301.515377] Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 46 
60 55 48 89 e5 48 8b 80 70 03 00 00 48 83 78 20 00 75 07 e8 60 ff ff ff 
5d c3 <0f> ff e8 57 ff ff ff 5d c3 0f 1f 44 00 00 66 66 66 66 90 55 48

[16301.515394] ---[ end trace f957c1d5844ce46f ]---

Thanks,
Dieter

On 30.08.2017 at 00:22, kbuild test robot wrote:
tree:   git://people.freedesktop.org/~agd5f/linux.git 
amd-staging-drm-next

head:   ff14f0dca1f23b2cff41e43440c7952965e5fc1b
commit: 9b37a9b8f6464e3ce1ce59eda1ec7053c8e77ca3 [68/819] drm/amd/dc:
Add dc display driver (v2)
config: ia64-allyesconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 6.2.0
reproduce:
wget
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross
-O ~/bin/make.cross
chmod +x ~/bin/make.cross
  

[PATCH 7/8] lib: Closed hash table with low overhead

2017-08-29 Thread Felix Kuehling
This adds a statically sized closed hash table implementation with
low memory and CPU overhead. The API is inspired by kfifo.

Storing, retrieving and deleting data does not involve any dynamic
memory management, which makes it ideal for use in interrupt context.
Static memory usage per entry comprises a 32 or 64 bit hash key, two
bits for occupancy tracking and the value size stored in the table.
No list heads or pointers are needed. Therefore this data structure
should be quite cache-friendly, too.

It uses linear probing and lazy deletion. During lookups free space
is reclaimed and entries relocated to speed up future lookups.
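
As a rough illustration of linear probing with lazy deletion, here is a
small userspace sketch. It is not the code from this patch: the sketch_
names, the tiny 8-slot size and the trivial hash are assumptions made for
the example, and the relocation of entries during lookup that the patch
performs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BITS 3
#define SKETCH_SIZE (1u << SKETCH_BITS)
#define SKETCH_MASK (SKETCH_SIZE - 1)

struct sketch_table {
	bool used[SKETCH_SIZE];   /* slot has held an entry at some point */
	bool valid[SKETCH_SIZE];  /* slot currently holds a live entry */
	uint32_t keys[SKETCH_SIZE];
};

/* Trivial hash for illustration only: the low bits of the key. */
static unsigned int sketch_hash(uint32_t key)
{
	return key & SKETCH_MASK;
}

/* Returns the slot holding key, or -1 if the key is absent. */
static int sketch_find(const struct sketch_table *t, uint32_t key)
{
	unsigned int i, slot = sketch_hash(key);

	for (i = 0; i < SKETCH_SIZE; i++, slot = (slot + 1) & SKETCH_MASK) {
		if (!t->used[slot])
			return -1;      /* a never-used slot ends the probe */
		if (t->valid[slot] && t->keys[slot] == key)
			return (int)slot;
		/* deleted entry or different key: keep probing */
	}
	return -1;
}

/* Returns 0 if inserted, 1 if already present, -1 if the table is full. */
static int sketch_insert(struct sketch_table *t, uint32_t key)
{
	unsigned int i, slot = sketch_hash(key);

	if (sketch_find(t, key) >= 0)
		return 1;
	for (i = 0; i < SKETCH_SIZE; i++, slot = (slot + 1) & SKETCH_MASK) {
		if (!t->valid[slot]) {  /* free or lazily deleted: reuse */
			t->used[slot] = true;
			t->valid[slot] = true;
			t->keys[slot] = key;
			return 0;
		}
	}
	return -1;
}

/* Lazy deletion: clear valid but leave used set. Returns 0 or -1. */
static int sketch_remove(struct sketch_table *t, uint32_t key)
{
	int slot = sketch_find(t, key);

	if (slot < 0)
		return -1;
	assert(t->used[slot]);  /* a valid slot is always marked used */
	t->valid[slot] = false;
	return 0;
}
```

Leaving the used flag set after a removal is what keeps longer probe
chains intact: a lookup for a colliding key walks past the deleted slot
instead of stopping there.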

Signed-off-by: Felix Kuehling 
Acked-by: Christian König 
---
 include/linux/chash.h | 358 +
 lib/Kconfig           |  24 ++
 lib/Makefile          |   2 +
 lib/chash.c           | 622 ++
 4 files changed, 1006 insertions(+)
 create mode 100644 include/linux/chash.h
 create mode 100644 lib/chash.c

diff --git a/include/linux/chash.h b/include/linux/chash.h
new file mode 100644
index 000..c89b92b
--- /dev/null
+++ b/include/linux/chash.h
@@ -0,0 +1,358 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _LINUX_CHASH_H
+#define _LINUX_CHASH_H
+
+#include 
+#include 
+#include 
+#include 
+
+struct __chash_table {
+   u8 bits;
+   u8 key_size;
+   unsigned int value_size;
+   u32 size_mask;
+   unsigned long *occup_bitmap, *valid_bitmap;
+   union {
+   u32 *keys32;
+   u64 *keys64;
+   };
+   u8 *values;
+
+#ifdef CONFIG_CHASH_STATS
+   u64 hits, hits_steps, hits_time_ns;
+   u64 miss, miss_steps, miss_time_ns;
+   u64 relocs, reloc_dist;
+#endif
+};
+
+#define __CHASH_BITMAP_SIZE(bits)  \
+   (((1 << (bits)) + BITS_PER_LONG - 1) / BITS_PER_LONG)
+#define __CHASH_ARRAY_SIZE(bits, size) \
+   ((((size) << (bits)) + sizeof(long) - 1) / sizeof(long))
+
+#define __CHASH_DATA_SIZE(bits, key_size, value_size)  \
+   (__CHASH_BITMAP_SIZE(bits) * 2 +\
+__CHASH_ARRAY_SIZE(bits, key_size) +   \
+__CHASH_ARRAY_SIZE(bits, value_size))
+
+#define STRUCT_CHASH_TABLE(bits, key_size, value_size) \
+   struct {\
+   struct __chash_table table; \
+   unsigned long data  \
+   [__CHASH_DATA_SIZE(bits, key_size, value_size)];\
+   }
+
+/**
+ * struct chash_table - Dynamically allocated closed hash table
+ *
+ * Use this struct for dynamically allocated hash tables (using
+ * chash_table_alloc and chash_table_free), where the size is
+ * determined at runtime.
+ */
+struct chash_table {
+   struct __chash_table table;
+   unsigned long *data;
+};
+
+/**
+ * DECLARE_CHASH_TABLE - macro to declare a closed hash table
+ * @table: name of the declared hash table
+ * @bts: Table size will be 2^bits entries
+ * @key_sz: Size of hash keys in bytes, 4 or 8
+ * @val_sz: Size of data values in bytes, can be 0
+ *
+ * This declares the hash table variable with a static size.
+ *
+ * The closed hash table stores key-value pairs with low memory and
+ * lookup overhead. In operation it performs no dynamic memory
+ * management. The data being stored does not require any
+ * list_heads. The hash table performs best with small @val_sz and as
+ * long as some space (about 50%) is left free in the table. But the
+ * table can still work reasonably efficiently even when filled up to
+ * about 90%. If bigger data items need to be stored and looked up,
+ * store the pointer to it as value in the hash table.
+ *
+ * @val_sz may be 0. This can be useful when all the stored
+ * informati

[PATCH 6/8] drm/amdgpu: Add prescreening stage in IH processing

2017-08-29 Thread Felix Kuehling
To filter out high-frequency interrupts that can be safely ignored.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c  |  6 ++
 drivers/gpu/drm/amd/amdgpu/cik_ih.c | 14 ++
 drivers/gpu/drm/amd/amdgpu/cz_ih.c  | 14 ++
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c | 14 ++
 drivers/gpu/drm/amd/amdgpu/si_ih.c  | 14 ++
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c   | 14 ++
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c  | 14 ++
 8 files changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 103635a..8db6b23 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -332,6 +332,7 @@ struct amdgpu_gart_funcs {
 struct amdgpu_ih_funcs {
/* ring read/write ptr handling, called from interrupt context */
u32 (*get_wptr)(struct amdgpu_device *adev);
+   bool (*prescreen_iv)(struct amdgpu_device *adev);
void (*decode_iv)(struct amdgpu_device *adev,
  struct amdgpu_iv_entry *entry);
void (*set_rptr)(struct amdgpu_device *adev);
@@ -1759,6 +1760,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_ring_init_cond_exec(r) (r)->funcs->init_cond_exec((r))
 #define amdgpu_ring_patch_cond_exec(r,o) (r)->funcs->patch_cond_exec((r),(o))
 #define amdgpu_ih_get_wptr(adev) (adev)->irq.ih_funcs->get_wptr((adev))
+#define amdgpu_ih_prescreen_iv(adev) (adev)->irq.ih_funcs->prescreen_iv((adev))
 #define amdgpu_ih_decode_iv(adev, iv) (adev)->irq.ih_funcs->decode_iv((adev), (iv))
 #define amdgpu_ih_set_rptr(adev) (adev)->irq.ih_funcs->set_rptr((adev))
 #define amdgpu_display_vblank_get_counter(adev, crtc) (adev)->mode_info.funcs->vblank_get_counter((adev), (crtc))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index 3ab4c65..c834a40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -169,6 +169,12 @@ int amdgpu_ih_process(struct amdgpu_device *adev)
while (adev->irq.ih.rptr != wptr) {
u32 ring_index = adev->irq.ih.rptr >> 2;
 
+   /* Prescreening of high-frequency interrupts */
+   if (!amdgpu_ih_prescreen_iv(adev)) {
+   adev->irq.ih.rptr &= adev->irq.ih.ptr_mask;
+   continue;
+   }
+
/* Before dispatching irq to IP blocks, send it to amdkfd */
amdgpu_amdkfd_interrupt(adev,
(const void *) &adev->irq.ih.ring[ring_index]);
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index b891843..07d3d89 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -228,6 +228,19 @@ static u32 cik_ih_get_wptr(struct amdgpu_device *adev)
  * [127:96] - reserved
  */
 
+/**
+ * cik_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool cik_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+   /* Process all interrupts */
+   return true;
+}
+
  /**
  * cik_ih_decode_iv - decode an interrupt vector
  *
@@ -433,6 +446,7 @@ static const struct amd_ip_funcs cik_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs cik_ih_funcs = {
.get_wptr = cik_ih_get_wptr,
+   .prescreen_iv = cik_ih_prescreen_iv,
.decode_iv = cik_ih_decode_iv,
.set_rptr = cik_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index 0c1209c..b6cdf4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -208,6 +208,19 @@ static u32 cz_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * cz_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool cz_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+   /* Process all interrupts */
+   return true;
+}
+
+/**
  * cz_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -414,6 +427,7 @@ static const struct amd_ip_funcs cz_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs cz_ih_funcs = {
.get_wptr = cz_ih_get_wptr,
+   .prescreen_iv = cz_ih_prescreen_iv,
.decode_iv = cz_ih_decode_iv,
.set_rptr = cz_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index 7a0ea27..65ed6d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -208,6 +208,19 @@ static u32 iceland_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * iceland_ih_prescreen_iv - prescreen an interrupt vect

[PATCH 4/8] drm/amdkfd: Separate doorbell allocation from PASID

2017-08-29 Thread Felix Kuehling
PASID management is moving into KGD. Limiting the PASID range to the
number of doorbell pages is no longer practical.
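
For illustration, the doorbell index arithmetic that this patch switches
from the PASID to a separately allocated per-process index can be sketched
as below. The 4 KiB slice size and all sketch_ names are assumptions made
for the example and are not taken from the patch; only the 4-byte doorbell
size matches KFD_SIZE_OF_DOORBELL_IN_BYTES.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DOORBELL_BYTES 4u     /* one doorbell is a 32-bit register */
#define SKETCH_SLICE_BYTES    4096u  /* assumed per-process doorbell slice */
#define SKETCH_SLICE_DWORDS   (SKETCH_SLICE_BYTES / SKETCH_DOORBELL_BYTES)

/*
 * Dword offset of a queue's doorbell: skip the doorbells in front of the
 * KFD range (id_offset), then skip one slice per preceding process index,
 * then pick the queue's doorbell inside the process's own slice.
 */
static uint32_t sketch_queue_doorbell(uint32_t id_offset,
				      uint32_t doorbell_index,
				      uint32_t queue_id)
{
	assert(queue_id < SKETCH_SLICE_DWORDS);
	return id_offset + doorbell_index * SKETCH_SLICE_DWORDS + queue_id;
}
```

With these assumed sizes, index 0 covers dwords 0 to 1023 of the range and
is the slice the patch reserves for the kernel, so user processes start at
index 1.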

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  7 -
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 50 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 10 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  6 
 4 files changed, 45 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 61fff25..5df12b2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -168,13 +168,6 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
pasid_limit = min_t(unsigned int,
(unsigned int)(1 << kfd->device_info->max_pasid_bits),
iommu_info.max_pasids);
-   /*
-* last pasid is used for kernel queues doorbells
-* in the future the last pasid might be used for a kernel thread.
-*/
-   pasid_limit = min_t(unsigned int,
-   pasid_limit,
-   kfd->doorbell_process_limit - 1);
 
err = amd_iommu_init_device(kfd->pdev, pasid_limit);
if (err < 0) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index acf4d2a..feb76c2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -24,16 +24,15 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
- * This extension supports a kernel level doorbells management for
- * the kernel queues.
- * Basically the last doorbells page is devoted to kernel queues
- * and that's assures that any user process won't get access to the
- * kernel doorbells page
+ * This extension supports a kernel level doorbells management for the
+ * kernel queues using the first doorbell page reserved for the kernel.
  */
 
-#define KERNEL_DOORBELL_PASID 1
+static DEFINE_IDA(doorbell_ida);
+static unsigned int max_doorbell_slices;
 #define KFD_SIZE_OF_DOORBELL_IN_BYTES 4
 
 /*
@@ -84,13 +83,16 @@ int kfd_doorbell_init(struct kfd_dev *kfd)
(doorbell_aperture_size - doorbell_start_offset) /
doorbell_process_allocation();
else
-   doorbell_process_limit = 0;
+   return -ENOSPC;
+
+   if (!max_doorbell_slices ||
+   doorbell_process_limit < max_doorbell_slices)
+   max_doorbell_slices = doorbell_process_limit;
 
kfd->doorbell_base = kfd->shared_resources.doorbell_physical_address +
doorbell_start_offset;
 
kfd->doorbell_id_offset = doorbell_start_offset / sizeof(u32);
-   kfd->doorbell_process_limit = doorbell_process_limit - 1;
 
kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base,
doorbell_process_allocation());
@@ -185,11 +187,10 @@ u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
return NULL;
 
/*
-* Calculating the kernel doorbell offset using "faked" kernel
-* pasid that allocated for kernel queues only
+* Calculating the kernel doorbell offset using the first
+* doorbell page.
 */
-   *doorbell_off = KERNEL_DOORBELL_PASID * (doorbell_process_allocation() /
-   sizeof(u32)) + inx;
+   *doorbell_off = kfd->doorbell_id_offset + inx;
 
pr_debug("Get kernel queue doorbell\n"
 " doorbell offset   == 0x%08X\n"
@@ -228,11 +229,12 @@ unsigned int kfd_queue_id_to_doorbell(struct kfd_dev *kfd,
 {
/*
 * doorbell_id_offset accounts for doorbells taken by KGD.
-* pasid * doorbell_process_allocation/sizeof(u32) adjusts
-* to the process's doorbells
+* index * doorbell_process_allocation/sizeof(u32) adjusts to
+* the process's doorbells.
 */
return kfd->doorbell_id_offset +
-   process->pasid * (doorbell_process_allocation()/sizeof(u32)) +
+   process->doorbell_index
+   * doorbell_process_allocation() / sizeof(u32) +
queue_id;
 }
 
@@ -250,5 +252,21 @@ phys_addr_t kfd_get_process_doorbells(struct kfd_dev *dev,
struct kfd_process *process)
 {
return dev->doorbell_base +
-   process->pasid * doorbell_process_allocation();
+   process->doorbell_index * doorbell_process_allocation();
+}
+
+int kfd_alloc_process_doorbells(struct kfd_process *process)
+{
+   int r = ida_simple_get(&doorbell_ida, 1, max_doorbell_slices,
+   GFP_KERNEL);
+   if (r > 0)
+   process->doorbell_index = r;
+
+   return r;
+}
+
+void kfd_free_process_doorbells(st

[PATCH 8/8] drm/amdgpu: Track pending retry faults in IH and VM (v2)

2017-08-29 Thread Felix Kuehling
IH tracks pending retry faults in a hash table for fast lookup in
interrupt context. Each VM has a short FIFO of pending VM faults for
processing in a bottom half.

The IH prescreening stage adds retry faults and filters out repeated
retry interrupts to minimize the impact of interrupt storms.

It's the VM's responsibility to remove pending faults once they are
handled. For now this is only done when the VM is destroyed.
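
A minimal userspace sketch of the filtering idea follows. It swaps the
closed hash table for a plain linear scan over 8 slots to stay short, so
it is not the implementation in this patch. The sketch_ names are
hypothetical, and treating only a newly recorded fault as worth further
processing is my reading of the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SLOTS 8  /* illustrative; the patch sizes its table at 1 << 8 */

struct sketch_faults {
	uint64_t keys[SKETCH_SLOTS];
	bool live[SKETCH_SLOTS];
	int count;
};

/*
 * Same return convention as documented for amdgpu_ih_add_fault():
 * 0 if the fault was added, 1 if it was already pending,
 * -ENOSPC if there are too many pending faults.
 */
static int sketch_add_fault(struct sketch_faults *f, uint64_t key)
{
	int i, free_slot = -1;

	/* Like the patch, bail out once 50% of the slots are in use. */
	if (f->count >= SKETCH_SLOTS / 2)
		return -ENOSPC;

	for (i = 0; i < SKETCH_SLOTS; i++) {
		if (f->live[i] && f->keys[i] == key)
			return 1;
		if (!f->live[i] && free_slot < 0)
			free_slot = i;
	}

	assert(free_slot >= 0);  /* guaranteed by the 50% limit above */
	f->keys[free_slot] = key;
	f->live[free_slot] = true;
	f->count++;
	return 0;
}

/* Forget a fault once it has been handled. */
static void sketch_clear_fault(struct sketch_faults *f, uint64_t key)
{
	int i;

	for (i = 0; i < SKETCH_SLOTS; i++) {
		if (f->live[i] && f->keys[i] == key) {
			f->live[i] = false;
			f->count--;
			return;
		}
	}
}

/* Prescreen: only a newly recorded fault is passed on for processing. */
static bool sketch_prescreen(struct sketch_faults *f, uint64_t key)
{
	return sketch_add_fault(f, key) == 0;
}
```

Repeated interrupts for the same key are dropped until the fault is
cleared, which is what keeps an interrupt storm from reaching the rest of
the handler.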

v2:
- Made the hash table smaller and the FIFO longer. I never want the
  FIFO to fill up, because that would make prescreen take longer.
  128 pending page faults should be enough to keep migrations busy.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/Kconfig                |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 76 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 12 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  7 +++
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 78 +-
 6 files changed, 180 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 83cb2a8..a79ce4c 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -184,6 +184,7 @@ config DRM_AMDGPU
select BACKLIGHT_CLASS_DEVICE
select BACKLIGHT_LCD_SUPPORT
select INTERVAL_TREE
+   select CHASH
help
  Choose this option if you have a recent AMD Radeon graphics card.
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index c834a40..f5f27e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -196,3 +196,79 @@ int amdgpu_ih_process(struct amdgpu_device *adev)
 
return IRQ_HANDLED;
 }
+
+/**
+ * amdgpu_ih_add_fault - Add a page fault record
+ *
+ * @adev: amdgpu device pointer
+ * @key: 64-bit encoding of PASID and address
+ *
+ * This should be called when a retry page fault interrupt is
+ * received. If this is a new page fault, it will be added to a hash
+ * table. The return value indicates whether this is a new fault, or
+ * a fault that was already known and is already being handled.
+ *
+ * If there are too many pending page faults, this will fail. Retry
+ * interrupts should be ignored in this case until there is enough
+ * free space.
+ *
+ * Returns 0 if the fault was added, 1 if the fault was already known,
+ * -ENOSPC if there are too many pending faults.
+ */
+int amdgpu_ih_add_fault(struct amdgpu_device *adev, u64 key)
+{
+   unsigned long flags;
+   int r = -ENOSPC;
+
+   if (WARN_ON_ONCE(!adev->irq.ih.faults))
+   /* Should be allocated in _ih_sw_init on GPUs that
+* support retry faults and require retry filtering.
+*/
+   return r;
+
+   spin_lock_irqsave(&adev->irq.ih.faults->lock, flags);
+
+   /* Only let the hash table fill up to 50% for best performance */
+   if (adev->irq.ih.faults->count >= (1 << (AMDGPU_PAGEFAULT_HASH_BITS-1)))
+   goto unlock_out;
+
+   r = chash_table_copy_in(&adev->irq.ih.faults->hash, key, NULL);
+   if (!r)
+   adev->irq.ih.faults->count++;
+
+   /* chash_table_copy_in should never fail unless we're losing count */
+   WARN_ON_ONCE(r < 0);
+
+unlock_out:
+   spin_unlock_irqrestore(&adev->irq.ih.faults->lock, flags);
+   return r;
+}
+
+/**
+ * amdgpu_ih_clear_fault - Remove a page fault record
+ *
+ * @adev: amdgpu device pointer
+ * @key: 64-bit encoding of PASID and address
+ *
+ * This should be called when a page fault has been handled. Any
+ * future interrupt with this key will be processed as a new
+ * page fault.
+ */
+void amdgpu_ih_clear_fault(struct amdgpu_device *adev, u64 key)
+{
+   unsigned long flags;
+   int r;
+
+   if (!adev->irq.ih.faults)
+   return;
+
+   spin_lock_irqsave(&adev->irq.ih.faults->lock, flags);
+
+   r = chash_table_remove(&adev->irq.ih.faults->hash, key, NULL);
+   if (!WARN_ON_ONCE(r < 0)) {
+   adev->irq.ih.faults->count--;
+   WARN_ON_ONCE(adev->irq.ih.faults->count < 0);
+   }
+
+   spin_unlock_irqrestore(&adev->irq.ih.faults->lock, flags);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
index 3de8e74..ada89358 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
@@ -24,6 +24,8 @@
 #ifndef __AMDGPU_IH_H__
 #define __AMDGPU_IH_H__
 
+#include 
+
 struct amdgpu_device;
  /*
   * vega10+ IH clients
@@ -69,6 +71,13 @@ enum amdgpu_ih_clientid
 
 #define AMDGPU_IH_CLIENTID_LEGACY 0
 
+#define AMDGPU_PAGEFAULT_HASH_BITS 8
+struct amdgpu_retryfault_hashtable {
+   DECLARE_CHASH_TABLE(hash, AMDGPU_PAGEFAULT_HASH_BITS, 8, 0);
+   spinlock_t  lock;
+   int count;
+};
+
 /*
  * R6xx+ IH ring
  */
@@ -87,6 +96,7 @@ struct amdgpu_ih_ring 

[PATCH 3/8] drm/radeon: Add PASID manager for KFD

2017-08-29 Thread Felix Kuehling
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index f6578c9..a2ac8ac 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -58,6 +58,10 @@ static uint64_t get_vmem_size(struct kgd_dev *kgd);
 static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
 static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
+
+static int alloc_pasid(unsigned int bits);
+static void free_pasid(unsigned int pasid);
+
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type);
 
 /*
@@ -112,6 +116,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_vmem_size = get_vmem_size,
.get_gpu_clock_counter = get_gpu_clock_counter,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+   .alloc_pasid = alloc_pasid,
+   .free_pasid = free_pasid,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
.init_pipeline = kgd_init_pipeline,
@@ -341,6 +347,31 @@ static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
return rdev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
 }
 
+/*
+ * PASID manager
+ */
+static DEFINE_IDA(pasid_ida);
+
+int alloc_pasid(unsigned int bits)
+{
+   int pasid = -EINVAL;
+
+   for (bits = min(bits, 31U); bits > 0; bits--) {
+   pasid = ida_simple_get(&pasid_ida,
+  1U << (bits - 1), 1U << bits,
+  GFP_KERNEL);
+   if (pasid != -ENOSPC)
+   break;
+   }
+
+   return pasid;
+}
+
+void free_pasid(unsigned int pasid)
+{
+   ida_simple_remove(&pasid_ida, pasid);
+}
+
 static inline struct radeon_device *get_radeon_device(struct kgd_dev *kgd)
 {
return (struct radeon_device *)kgd;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/8] drm/amdgpu: Fix error handling in amdgpu_vm_init

2017-08-29 Thread Felix Kuehling
Make sure vm->root.bo is not left reserved if amdgpu_bo_kmap fails.
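
The ordering fix can be illustrated outside the kernel. The sketch below is
a userspace stand-in, not driver code: fake_bo, fake_reserve, fake_unreserve,
fake_kmap and fake_init are hypothetical names invented for this example. It
only shows why releasing the reservation before testing the error code keeps
the object from staying reserved on the failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reservable buffer object. */
struct fake_bo {
	bool reserved;
};

static int fake_reserve(struct fake_bo *bo)
{
	assert(!bo->reserved);
	bo->reserved = true;
	return 0;
}

static void fake_unreserve(struct fake_bo *bo)
{
	assert(bo->reserved);
	bo->reserved = false;
}

/* Hypothetical mapping step that fails on request. */
static int fake_kmap(struct fake_bo *bo, bool fail)
{
	(void)bo;
	return fail ? -ENOMEM : 0;
}

static int fake_init(struct fake_bo *bo, bool fail_kmap)
{
	int r = fake_reserve(bo);

	if (r)
		return r;

	r = fake_kmap(bo, fail_kmap);
	/* Drop the reservation first so both outcomes leave it released. */
	fake_unreserve(bo);
	if (r)
		return r;

	return 0;
}
```

With the unreserve call placed after the error check instead, the failure
path would return with the object still reserved.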

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 6ff3c1b..0e068fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2541,9 +2541,9 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
goto error_free_root;
 
r = amdgpu_bo_kmap(vm->root.base.bo, NULL);
+   amdgpu_bo_unreserve(vm->root.base.bo);
if (r)
goto error_free_root;
-   amdgpu_bo_unreserve(vm->root.base.bo);
}
 
return 0;
-- 
2.7.4



[PATCH 5/8] drm/amdkfd: Use PASID manager from KGD

2017-08-29 Thread Felix Kuehling
Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_module.c |  6 ---
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c  | 90 ++---
 2 files changed, 38 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 0d73bea..6c5a9ca 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -103,10 +103,6 @@ static int __init kfd_module_init(void)
return -1;
}
 
-   err = kfd_pasid_init();
-   if (err < 0)
-   return err;
-
err = kfd_chardev_init();
if (err < 0)
goto err_ioctl;
@@ -126,7 +122,6 @@ static int __init kfd_module_init(void)
 err_topology:
kfd_chardev_exit();
 err_ioctl:
-   kfd_pasid_exit();
return err;
 }
 
@@ -137,7 +132,6 @@ static void __exit kfd_module_exit(void)
kfd_process_destroy_wq();
kfd_topology_shutdown();
kfd_chardev_exit();
-   kfd_pasid_exit();
dev_info(kfd_device, "Removed module\n");
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
index 1e06de0..d6a7961 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
@@ -20,78 +20,64 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include 
 #include 
 #include "kfd_priv.h"
 
-static unsigned long *pasid_bitmap;
-static unsigned int pasid_limit;
-static DEFINE_MUTEX(pasid_mutex);
-
-int kfd_pasid_init(void)
-{
-   pasid_limit = KFD_MAX_NUM_OF_PROCESSES;
-
-   pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long),
-   GFP_KERNEL);
-   if (!pasid_bitmap)
-   return -ENOMEM;
-
-   set_bit(0, pasid_bitmap); /* PASID 0 is reserved. */
-
-   return 0;
-}
-
-void kfd_pasid_exit(void)
-{
-   kfree(pasid_bitmap);
-}
+static unsigned int pasid_bits = 16;
+static const struct kfd2kgd_calls *kfd2kgd;
 
 bool kfd_set_pasid_limit(unsigned int new_limit)
 {
-   if (new_limit < pasid_limit) {
-   bool ok;
-
-   mutex_lock(&pasid_mutex);
-
-   /* ensure that no pasids >= new_limit are in-use */
-   ok = (find_next_bit(pasid_bitmap, pasid_limit, new_limit) ==
-   pasid_limit);
-   if (ok)
-   pasid_limit = new_limit;
-
-   mutex_unlock(&pasid_mutex);
-
-   return ok;
+   if (new_limit < 2)
+   return false;
+
+   if (new_limit < (1U << pasid_bits)) {
+   if (kfd2kgd)
+   /* We've already allocated user PASIDs, too late to
+* change the limit
+*/
+   return false;
+
+   while (new_limit < (1U << pasid_bits))
+   pasid_bits--;
}
 
return true;
 }
 
-inline unsigned int kfd_get_pasid_limit(void)
+unsigned int kfd_get_pasid_limit(void)
 {
-   return pasid_limit;
+   return 1U << pasid_bits;
 }
 
 unsigned int kfd_pasid_alloc(void)
 {
-   unsigned int found;
-
-   mutex_lock(&pasid_mutex);
-
-   found = find_first_zero_bit(pasid_bitmap, pasid_limit);
-   if (found == pasid_limit)
-   found = 0;
-   else
-   set_bit(found, pasid_bitmap);
+   int r;
+
+   /* Find the first best KFD device for calling KGD */
+   if (!kfd2kgd) {
+   struct kfd_dev *dev = NULL;
+   unsigned int i = 0;
+
+   while ((dev = kfd_topology_enum_kfd_devices(i)) != NULL) {
+   if (dev && dev->kfd2kgd) {
+   kfd2kgd = dev->kfd2kgd;
+   break;
+   }
+   i++;
+   }
+
+   if (!kfd2kgd)
+   return 0;
+   }
 
-   mutex_unlock(&pasid_mutex);
+   r = kfd2kgd->alloc_pasid(pasid_bits);
 
-   return found;
+   return r > 0 ? r : 0;
 }
 
 void kfd_pasid_free(unsigned int pasid)
 {
-   if (!WARN_ON(pasid == 0 || pasid >= pasid_limit))
-   clear_bit(pasid, pasid_bitmap);
+   if (kfd2kgd)
+   kfd2kgd->free_pasid(pasid);
 }
-- 
2.7.4



[PATCH 2/8] drm/amdgpu: Add PASID management

2017-08-29 Thread Felix Kuehling
Allows assigning a PASID to a VM for identifying VMs involved in page
faults. The global PASID manager is also exported in the KFD
interface so that AMDGPU and KFD can share the PASID space.

PASIDs of different sizes can be requested. On APUs, the PASID size
is determined by the capabilities of the IOMMU, so KFD must be able
to allocate PASIDs in a smaller range.
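
The allocation strategy described here (prefer the top half of the requested
width, fall back to narrower ranges only when that is full) can be sketched
in userspace. This is an illustration, not the kernel code: it replaces the
IDA with a small boolean array, and every sketch_* name and the 8-bit cap are
assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical cap so the bookkeeping array stays tiny. */
#define SKETCH_MAX_BITS 8u

static bool sketch_used[1u << SKETCH_MAX_BITS];

/* Return the first free ID in [start, end), or -ENOSPC if none is free. */
static int sketch_get_range(unsigned int start, unsigned int end)
{
	for (unsigned int id = start; id < end; id++) {
		if (!sketch_used[id]) {
			sketch_used[id] = true;
			return (int)id;
		}
	}
	return -ENOSPC;
}

/*
 * Try [2^(bits-1), 2^bits) first so that narrow PASIDs stay available
 * for callers that can only use a small width, then step down one bit
 * at a time. PASID 0 is never handed out because the lowest range
 * starts at 1.
 */
static int sketch_alloc_pasid(unsigned int bits)
{
	int pasid = -EINVAL;

	if (bits > SKETCH_MAX_BITS)
		bits = SKETCH_MAX_BITS;

	for (; bits > 0; bits--) {
		pasid = sketch_get_range(1u << (bits - 1), 1u << bits);
		if (pasid != -ENOSPC)
			break;
	}

	return pasid;
}

static void sketch_free_pasid(unsigned int pasid)
{
	assert(pasid < (1u << SKETCH_MAX_BITS));
	sketch_used[pasid] = false;
}
```

A caller limited to 2 bits gets IDs from [2, 4) first and only then ID 1,
while a caller that can use 3 bits starts at 4 and leaves the small IDs alone.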

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 75 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 14 -
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
 6 files changed, 97 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index b9dbbf9..dc7e25c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -169,6 +169,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_vmem_size = get_vmem_size,
.get_gpu_clock_counter = get_gpu_clock_counter,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+   .alloc_pasid = amdgpu_vm_alloc_pasid,
+   .free_pasid = amdgpu_vm_free_pasid,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
.init_pipeline = kgd_init_pipeline,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 309f241..c678c69 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -128,6 +128,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
.get_vmem_size = get_vmem_size,
.get_gpu_clock_counter = get_gpu_clock_counter,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+   .alloc_pasid = amdgpu_vm_alloc_pasid,
+   .free_pasid = amdgpu_vm_free_pasid,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
.init_pipeline = kgd_init_pipeline,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index e162290..79d9ab4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
}
 
r = amdgpu_vm_init(adev, &fpriv->vm,
-  AMDGPU_VM_CONTEXT_GFX);
+  AMDGPU_VM_CONTEXT_GFX, 0);
if (r) {
kfree(fpriv);
goto out_suspend;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0e068fb..f07b3b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -27,12 +27,59 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
 /*
+ * PASID manager
+ *
+ * PASIDs are global address space identifiers that can be shared
+ * between the GPU, an IOMMU and the driver. VMs on different devices
+ * may use the same PASID if they share the same address
+ * space. Therefore PASIDs are allocated using a global IDA. VMs are
+ * looked up from the PASID per amdgpu_device.
+ */
+static DEFINE_IDA(amdgpu_vm_pasid_ida);
+
+/**
+ * amdgpu_vm_alloc_pasid - Allocate a PASID
+ * @bits: Maximum width of the PASID in bits, must be at least 1
+ *
+ * Allocates a PASID of the given width while keeping smaller PASIDs
+ * available if possible.
+ *
+ * Returns a positive integer on success. Returns %-EINVAL if bits==0.
+ * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
+ * memory allocation failure.
+ */
+int amdgpu_vm_alloc_pasid(unsigned int bits)
+{
+   int pasid = -EINVAL;
+
+   for (bits = min(bits, 31U); bits > 0; bits--) {
+   pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
+  1U << (bits - 1), 1U << bits,
+  GFP_KERNEL);
+   if (pasid != -ENOSPC)
+   break;
+   }
+
+   return pasid;
+}
+
+/**
+ * amdgpu_vm_free_pasid - Free a PASID
+ * @pasid: PASID to free
+ */
+void amdgpu_vm_free_pasid(unsigned int pasid)
+{
+   ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
+}
+
+/*
  * GPUVM
  * GPUVM is similar to the legacy gart on older asics, however
  * rather than there being a single global gart table
@@ -2466,7 +2513,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, 
uint64_t vm_size, uint32_
  * Init @vm fields.
  */
 int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-  int vm_context)
+ 

[PATCH 0/8] Retry page fault handling for Vega10

2017-08-29 Thread Felix Kuehling
Rebased on the public drm-next-4.15-wip. Patch 8 from the WIP patch
series did not apply at all, because upstream KFD doesn't support
GPUVM yet.

The "lib: Closed hash table ..." change is updated and the same as
what I sent to LKML yesterday. Changes are mainly in the way the self
test is hooked up, Kconfig options and some checkpatch fixes. If it
takes too long to get accepted upstream, I could add it under
drivers/gpu/drm/amd/chash in the interim.

This is only compile tested on this branch. I can't do much more
because the upstream KFD doesn't support Vega10 and GPUVM yet. Someone
will have to add PASID support for graphics on top of this.

TODO:
* Finish upstreaming KFD
* Allocate PASIDs for graphics contexts
* Setup VMID-PASID mapping during graphics command submission
* Confirm that graphics page faults have the correct PASID in the IV

Felix Kuehling (8):
  drm/amdgpu: Fix error handling in amdgpu_vm_init
  drm/amdgpu: Add PASID management
  drm/radeon: Add PASID manager for KFD
  drm/amdkfd: Separate doorbell allocation from PASID
  drm/amdkfd: Use PASID manager from KGD
  drm/amdgpu: Add prescreening stage in IH processing
  lib: Closed hash table with low overhead
  drm/amdgpu: Track pending retry faults in IH and VM (v2)

 drivers/gpu/drm/Kconfig   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c|  82 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h|  12 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  84 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  21 +-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c   |  14 +
 drivers/gpu/drm/amd/amdgpu/cz_ih.c|  14 +
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c   |  14 +
 drivers/gpu/drm/amd/amdgpu/si_ih.c|  14 +
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c |  14 +
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c|  90 
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |   7 -
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c |  50 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c   |   6 -
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c|  90 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |   6 +
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
 drivers/gpu/drm/radeon/radeon_kfd.c   |  31 ++
 include/linux/chash.h | 358 +
 lib/Kconfig   |  24 +
 lib/Makefile  |   2 +
 lib/chash.c   | 622 ++
 27 files changed, 1489 insertions(+), 91 deletions(-)
 create mode 100644 include/linux/chash.h
 create mode 100644 lib/chash.c

-- 
2.7.4



Re: [PATCH] drm/amd/powerplay/hwmgr: Remove null check before kfree

2017-08-29 Thread Alex Deucher
On Tue, Aug 29, 2017 at 9:34 AM, Harry Wentland  wrote:
> On 2017-08-29 09:12 AM, Himanshu Jha wrote:
>> kfree on NULL pointer is a no-op and therefore checking is redundant.
>>
>> Signed-off-by: Himanshu Jha 
>
> Reviewed-by: Harry Wentland 

Applied.  thanks!

Alex

>
> Harry
>
>> ---
>>  drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c |  6 +-
>>  .../gpu/drm/amd/powerplay/hwmgr/processpptables.c  | 96 
>> --
>>  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c | 52 
>>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 12 +--
>>  4 files changed, 56 insertions(+), 110 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c 
>> b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
>> index bc839ff..9f2c037 100644
>> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
>> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
>> @@ -1225,10 +1225,8 @@ static int cz_hwmgr_backend_fini(struct pp_hwmgr 
>> *hwmgr)
>>   phm_destroy_table(hwmgr, &(hwmgr->power_down_asic));
>>   phm_destroy_table(hwmgr, &(hwmgr->setup_asic));
>>
>> - if (NULL != hwmgr->dyn_state.vddc_dep_on_dal_pwrl) {
>> - kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
>> - hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
>> + hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
>>
>>   kfree(hwmgr->backend);
>>   hwmgr->backend = NULL;
>> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c 
>> b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>> index 2716721..a6dbc55 100644
>> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
>> @@ -1615,85 +1615,53 @@ static int pp_tables_uninitialize(struct pp_hwmgr 
>> *hwmgr)
>>   if (hwmgr->chip_id == CHIP_RAVEN)
>>   return 0;
>>
>> - if (NULL != hwmgr->dyn_state.vddc_dependency_on_sclk) {
>> - kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
>> - hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
>> + hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.vddci_dependency_on_mclk) {
>> - kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
>> - hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
>> + hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.vddc_dependency_on_mclk) {
>> - kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
>> - hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
>> + hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.mvdd_dependency_on_mclk) {
>> - kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
>> - hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
>> + hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.valid_mclk_values) {
>> - kfree(hwmgr->dyn_state.valid_mclk_values);
>> - hwmgr->dyn_state.valid_mclk_values = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.valid_mclk_values);
>> + hwmgr->dyn_state.valid_mclk_values = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.valid_sclk_values) {
>> - kfree(hwmgr->dyn_state.valid_sclk_values);
>> - hwmgr->dyn_state.valid_sclk_values = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.valid_sclk_values);
>> + hwmgr->dyn_state.valid_sclk_values = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.cac_leakage_table) {
>> - kfree(hwmgr->dyn_state.cac_leakage_table);
>> - hwmgr->dyn_state.cac_leakage_table = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.cac_leakage_table);
>> + hwmgr->dyn_state.cac_leakage_table = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.vddc_phase_shed_limits_table) {
>> - kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
>> - hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
>> + hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.vce_clock_voltage_dependency_table) {
>> - kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
>> - hwmgr->dyn_state.vce_clock_voltage_dependency_table = NULL;
>> - }
>> + kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
>> + hwmgr->dyn_state.vce_clock_voltage_dependency_table = NULL;
>>
>> - if (NULL != hwmgr->dyn_state.uvd_clock_vo

Re: [PATCH] drm/amd: Remove null check before kfree

2017-08-29 Thread Alex Deucher
On Tue, Aug 29, 2017 at 9:28 AM, Christian König
 wrote:
> Am 29.08.2017 um 15:21 schrieb Himanshu Jha:
>>
>> Kfree on NULL pointer is a no-op and therefore checking is redundant.
>>
>> Signed-off-by: Himanshu Jha 
>
>
> Reviewed-by: Christian König 
>

Applied.  thanks!

Alex


>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 6 ++
>>   drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 6 ++
>>   2 files changed, 4 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
>> index 8d1cf2d..f51b41f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
>> @@ -346,10 +346,8 @@ static void amdgpu_connector_free_edid(struct
>> drm_connector *connector)
>>   {
>> struct amdgpu_connector *amdgpu_connector =
>> to_amdgpu_connector(connector);
>>   - if (amdgpu_connector->edid) {
>> -   kfree(amdgpu_connector->edid);
>> -   amdgpu_connector->edid = NULL;
>> -   }
>> +   kfree(amdgpu_connector->edid);
>> +   amdgpu_connector->edid = NULL;
>>   }
>> static int amdgpu_connector_ddc_get_modes(struct drm_connector
>> *connector)
>> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
>> b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
>> index 76347ff..00075c2 100644
>> --- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
>> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
>> @@ -606,10 +606,8 @@ int smu7_init(struct pp_smumgr *smumgr)
>> int smu7_smu_fini(struct pp_smumgr *smumgr)
>>   {
>> -   if (smumgr->backend) {
>> -   kfree(smumgr->backend);
>> -   smumgr->backend = NULL;
>> -   }
>> +   kfree(smumgr->backend);
>> +   smumgr->backend = NULL;
>> cgs_rel_firmware(smumgr->device, CGS_UCODE_ID_SMU);
>> return 0;
>>   }
>
>
>


[pull] amdgpu and ttm drm-next-4.14

2017-08-29 Thread Alex Deucher
Hi Dave,

A few fixes for drm-next for 4.14.  Nothing too major.  I'll check in
again in a week or so.


The following changes since commit 7c0059dd832cc686bf0febefdcf8295cdd93007f:

  Merge branch 'linux-4.14' of git://github.com/skeggsb/linux into drm-next (2017-08-23 05:32:26 +1000)

are available in the git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-next-4.14

for you to fetch changes up to 403df1f66cc0457221f3be5c210f128ab87de547:

  drm/amdgpu: remove duplicate return statement (2017-08-24 14:27:44 -0400)


Alex Deucher (5):
  drm/amdgpu/gfx8: fix spelling typo in mqd allocation
  drm/amdgpu: add automatic per asic settings for gart_size
  drm/amdgpu: refine default gart size
  drm/amdgpu: move default gart size setting into gmc modules
  drm/amdgpu: set sched_hw_submission higher for KIQ (v3)

Christian König (5):
  drm/amdgpu: fix and cleanup shadow handling
  drm/amdgpu: discard commands of killed processes
  drm/amdgpu: remove the GART copy hack
  drm/amdgpu: fix amdgpu_ttm_bind
  drm/amdgpu: inline amdgpu_ttm_do_bind again

Christophe JAILLET (1):
  drm/amdgpu: check memory allocation failure

Colin Ian King (1):
  drm/amdgpu: remove duplicate return statement

Emily Deng (1):
  drm/amdgpu/virtual_dce: Virtual display doesn't support disable vblank immediately

Evan Quan (2):
  drm/amd/powerplay: unhalt mec after loading
  drm/amd/powerplay: ACG frequency added in PPTable

Felix Kuehling (1):
  drm/amdgpu: Fix huge page updates with CPU

Monk Liu (2):
  drm/ttm: fix missing inc bo_count
  drm/ttm:fix wrong decoding of bo_count

Roger He (1):
  drm/amd/amdgpu: fix BANK_SELECT on Vega10 (v2)

 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 12 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h   |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c| 14 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c|  5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 46 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   | 16 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 76 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h|  4 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 46 ++---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 12 ++--
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c   |  5 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  | 19 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 22 ++-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 21 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 16 -
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c|  5 +-
 drivers/gpu/drm/amd/include/vi_structs.h   |  4 +-
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 11 +++-
 drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h |  6 +-
 drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c |  3 +-
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.c  | 23 +--
 drivers/gpu/drm/ttm/ttm_bo.c   |  4 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c  |  1 +
 28 files changed, 236 insertions(+), 156 deletions(-)


[PATCH] drm/amdkfd: remove memset before memcpy

2017-08-29 Thread Himanshu Jha
Calling memcpy immediately after memset on the same region of memory
makes the memset redundant.
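
A small userspace check of this claim, as a sketch only: struct sketch_props
and both copy helpers are hypothetical names standing in for struct
queue_properties and the code in pqm_create_queue. Because memcpy overwrites
every byte of the destination, the preceding memset cannot affect the result.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct queue_properties. */
struct sketch_props {
	unsigned int type;
	unsigned int priority;
	unsigned long size;
};

/* Old pattern: zero the destination, then overwrite all of it. */
static void copy_with_memset(struct sketch_props *dst,
			     const struct sketch_props *src)
{
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, sizeof(*dst));
}

/* New pattern: the full-size memcpy alone defines every byte of dst. */
static void copy_plain(struct sketch_props *dst,
		       const struct sketch_props *src)
{
	memcpy(dst, src, sizeof(*dst));
}
```

This equivalence holds only when the memcpy covers the whole region that the
memset cleared, which is the case in the patch since both use
sizeof(struct queue_properties).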

Signed-off-by: Himanshu Jha 
---
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 1cae95e..03bec76 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -143,7 +143,6 @@ int pqm_create_queue(struct process_queue_manager *pqm,
int num_queues = 0;
struct queue *cur;
 
-   memset(&q_properties, 0, sizeof(struct queue_properties));
memcpy(&q_properties, properties, sizeof(struct queue_properties));
q = NULL;
kq = NULL;
-- 
2.7.4



RE: [PATCH 3/4] drm/amdgpu: add IOCTL interface for per VM BOs v2

2017-08-29 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Christian König
> Sent: Tuesday, August 29, 2017 1:08 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH 3/4] drm/amdgpu: add IOCTL interface for per VM BOs v2
> 
> From: Christian König 
> 
> Add the IOCTL interface so that applications can allocate per VM BOs.
> 
> Still WIP since not all corner cases are tested yet, but this reduces average
> CS overhead for 10K BOs from 21ms down to 48us.
> 
> v2: add some extra checks, remove the WIP tag
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  7 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c|  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   | 63
> ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c |  3 +-
>  include/uapi/drm/amdgpu_drm.h |  2 +
>  5 files changed, 55 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index b1e817c..21cab36 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -457,9 +457,10 @@ struct amdgpu_sa_bo {
>   */
>  void amdgpu_gem_force_release(struct amdgpu_device *adev);
>  int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned
> long size,
> - int alignment, u32 initial_domain,
> - u64 flags, bool kernel,
> - struct drm_gem_object **obj);
> +  int alignment, u32 initial_domain,
> +  u64 flags, bool kernel,
> +  struct reservation_object *resv,
> +  struct drm_gem_object **obj);
> 
>  int amdgpu_mode_dumb_create(struct drm_file *file_priv,
>   struct drm_device *dev,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> index 0e907ea..7256f83 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
> @@ -144,7 +144,7 @@ static int amdgpufb_create_pinned_object(struct
> amdgpu_fbdev *rfbdev,
> 
> AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
> 
> AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
> 
> AMDGPU_GEM_CREATE_VRAM_CLEARED,
> -true, &gobj);
> +true, NULL, &gobj);
>   if (ret) {
>   pr_err("failed to allocate framebuffer (%d)\n", aligned_size);
>   return -ENOMEM;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index e32a2b5..a835304 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -44,11 +44,12 @@ void amdgpu_gem_object_free(struct
> drm_gem_object *gobj)
>  }
> 
>  int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned
> long size,
> - int alignment, u32 initial_domain,
> - u64 flags, bool kernel,
> - struct drm_gem_object **obj)
> +  int alignment, u32 initial_domain,
> +  u64 flags, bool kernel,
> +  struct reservation_object *resv,
> +  struct drm_gem_object **obj)
>  {
> - struct amdgpu_bo *robj;
> + struct amdgpu_bo *bo;
>   int r;
> 
>   *obj = NULL;
> @@ -59,7 +60,7 @@ int amdgpu_gem_object_create(struct amdgpu_device
> *adev, unsigned long size,
> 
>  retry:
>   r = amdgpu_bo_create(adev, size, alignment, kernel, initial_domain,
> -  flags, NULL, NULL, 0, &robj);
> +  flags, NULL, resv, 0, &bo);
>   if (r) {
>   if (r != -ERESTARTSYS) {
>   if (initial_domain ==
> AMDGPU_GEM_DOMAIN_VRAM) {
> @@ -71,7 +72,7 @@ int amdgpu_gem_object_create(struct amdgpu_device
> *adev, unsigned long size,
>   }
>   return r;
>   }
> - *obj = &robj->gem_base;
> + *obj = &bo->gem_base;
> 
>   return 0;
>  }
> @@ -119,6 +120,10 @@ int amdgpu_gem_object_open(struct
> drm_gem_object *obj,
>   if (mm && mm != current->mm)
>   return -EPERM;
> 
> + if (abo->flags & AMDGPU_GEM_CREATE_LOCAL &&
> + abo->tbo.resv != vm->root.base.bo->tbo.resv)
> + return -EPERM;
> +
>   r = amdgpu_bo_reserve(abo, false);
>   if (r)
>   return r;
> @@ -142,13 +147,14 @@ void amdgpu_gem_object_close(struct
> drm_gem_object *obj,
>   struct amdgpu_vm *vm = &fpriv->vm;
> 
>   struct amdgpu_bo_list_entry vm_pd;
> - struct list_head list;
> + struct list_head list, duplicates;
>   struct ttm_validate_buffer tv;
>   struct ww_acquire_ctx ticket;
>   struct amdgpu_bo_va *bo_va;
>   int r;
> 
>   INIT_LIST_HEAD(&list);
> + INIT_LIST_HEAD(&dup

[PATCH 4/4] drm/amdgpu: bump version for support of local BOs

2017-08-29 Thread Christian König
From: Christian König 

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 21116fc..5035305c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -69,9 +69,10 @@
  * - 3.17.0 - Add AMDGPU_NUM_VRAM_CPU_PAGE_FAULTS.
  * - 3.18.0 - Export gpu always on cu bitmap
  * - 3.19.0 - Add support for UVD MJPEG decode
+ * - 3.20.0 - Add support for local BOs
  */
 #define KMS_DRIVER_MAJOR   3
-#define KMS_DRIVER_MINOR   19
+#define KMS_DRIVER_MINOR   20
 #define KMS_DRIVER_PATCHLEVEL  0
 
 int amdgpu_vram_limit = 0;
-- 
2.7.4



[PATCH 3/4] drm/amdgpu: add IOCTL interface for per VM BOs v2

2017-08-29 Thread Christian König
From: Christian König 

Add the IOCTL interface so that applications can allocate per VM BOs.

Still WIP since not all corner cases are tested yet, but this reduces average
CS overhead for 10K BOs from 21ms down to 48us.

v2: add some extra checks, remove the WIP tag

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  7 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   | 63 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c |  3 +-
 include/uapi/drm/amdgpu_drm.h |  2 +
 5 files changed, 55 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index b1e817c..21cab36 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -457,9 +457,10 @@ struct amdgpu_sa_bo {
  */
 void amdgpu_gem_force_release(struct amdgpu_device *adev);
 int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
-   int alignment, u32 initial_domain,
-   u64 flags, bool kernel,
-   struct drm_gem_object **obj);
+int alignment, u32 initial_domain,
+u64 flags, bool kernel,
+struct reservation_object *resv,
+struct drm_gem_object **obj);
 
 int amdgpu_mode_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
index 0e907ea..7256f83 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
@@ -144,7 +144,7 @@ static int amdgpufb_create_pinned_object(struct 
amdgpu_fbdev *rfbdev,
   AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
   AMDGPU_GEM_CREATE_VRAM_CLEARED,
-  true, &gobj);
+  true, NULL, &gobj);
if (ret) {
pr_err("failed to allocate framebuffer (%d)\n", aligned_size);
return -ENOMEM;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index e32a2b5..a835304 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -44,11 +44,12 @@ void amdgpu_gem_object_free(struct drm_gem_object *gobj)
 }
 
 int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
-   int alignment, u32 initial_domain,
-   u64 flags, bool kernel,
-   struct drm_gem_object **obj)
+int alignment, u32 initial_domain,
+u64 flags, bool kernel,
+struct reservation_object *resv,
+struct drm_gem_object **obj)
 {
-   struct amdgpu_bo *robj;
+   struct amdgpu_bo *bo;
int r;
 
*obj = NULL;
@@ -59,7 +60,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, 
unsigned long size,
 
 retry:
r = amdgpu_bo_create(adev, size, alignment, kernel, initial_domain,
-flags, NULL, NULL, 0, &robj);
+flags, NULL, resv, 0, &bo);
if (r) {
if (r != -ERESTARTSYS) {
if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
@@ -71,7 +72,7 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, 
unsigned long size,
}
return r;
}
-   *obj = &robj->gem_base;
+   *obj = &bo->gem_base;
 
return 0;
 }
@@ -119,6 +120,10 @@ int amdgpu_gem_object_open(struct drm_gem_object *obj,
if (mm && mm != current->mm)
return -EPERM;
 
+   if (abo->flags & AMDGPU_GEM_CREATE_LOCAL &&
+   abo->tbo.resv != vm->root.base.bo->tbo.resv)
+   return -EPERM;
+
r = amdgpu_bo_reserve(abo, false);
if (r)
return r;
@@ -142,13 +147,14 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
struct amdgpu_vm *vm = &fpriv->vm;
 
struct amdgpu_bo_list_entry vm_pd;
-   struct list_head list;
+   struct list_head list, duplicates;
struct ttm_validate_buffer tv;
struct ww_acquire_ctx ticket;
struct amdgpu_bo_va *bo_va;
int r;
 
INIT_LIST_HEAD(&list);
+   INIT_LIST_HEAD(&duplicates);
 
tv.bo = &bo->tbo;
tv.shared = true;
@@ -156,7 +162,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
 
amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
 
-   r = ttm_eu_reserve_buffers(&ticket, &list, false, NULL);
+   r = ttm_eu_reserve_buffers(&ti
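
The open-time check that this patch adds to amdgpu_gem_object_open() can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions: model_resv, model_bo, model_vm and model_gem_open() are
hypothetical stand-ins, not the kernel types. The flag value (1 << 6) is the
one from the uapi hunk quoted in this thread, where the final name was still
being discussed. A BO created with the flag is refused with -EPERM unless it
shares its reservation object with the root page directory of the VM that
tries to open it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Value from the uapi hunk in this thread; the name is a placeholder. */
#define MODEL_GEM_CREATE_LOCAL (1u << 6)

/* Hypothetical stand-in for struct reservation_object. */
struct model_resv { int unused; };

/* Hypothetical stand-in for struct amdgpu_bo. */
struct model_bo {
	uint64_t flags;
	struct model_resv *resv;
};

/* Hypothetical stand-in for struct amdgpu_vm; root is the root PD BO. */
struct model_vm {
	struct model_bo *root;
};

/* Mirrors the added check: a local BO may only be opened in its own VM. */
int model_gem_open(const struct model_bo *bo, const struct model_vm *vm)
{
	assert(bo != NULL && vm != NULL && vm->root != NULL);

	if ((bo->flags & MODEL_GEM_CREATE_LOCAL) &&
	    bo->resv != vm->root->resv)
		return -EPERM;

	return 0;
}
```

A BO without the flag passes the check in any VM, which matches the way the
patch leaves ordinary shareable BOs untouched.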

[PATCH 2/4] drm/amdgpu: add support for per VM BOs v2

2017-08-29 Thread Christian König
From: Christian König 

Per VM BOs are handled like VM PDs and PTs. They are always valid and don't
need to be specified in the BO lists.

v2: validate PDs/PTs first

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 79 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  5 ++-
 3 files changed, 60 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index f68ac56..48e18cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -813,7 +813,7 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser 
*p)
 
}
 
-   r = amdgpu_vm_clear_moved(adev, vm, &p->job->sync);
+   r = amdgpu_vm_handle_moved(adev, vm, &p->job->sync);
 
if (amdgpu_vm_debug && p->bo_list) {
/* Invalidate all BOs to test for userspace bugs */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4cdfb70..6cd20e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -189,14 +189,18 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
spin_unlock(&glob->lru_lock);
}
 
-   if (vm->use_cpu_for_update) {
+   if (bo->tbo.type == ttm_bo_type_kernel &&
+   vm->use_cpu_for_update) {
r = amdgpu_bo_kmap(bo, NULL);
if (r)
return r;
}
 
spin_lock(&vm->status_lock);
-   list_move(&bo_base->vm_status, &vm->relocated);
+   if (bo->tbo.type != ttm_bo_type_kernel)
+   list_move(&bo_base->vm_status, &vm->moved);
+   else
+   list_move(&bo_base->vm_status, &vm->relocated);
}
spin_unlock(&vm->status_lock);
 
@@ -1994,20 +1998,23 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 }
 
 /**
- * amdgpu_vm_clear_moved - clear moved BOs in the PT
+ * amdgpu_vm_handle_moved - handle moved BOs in the PT
  *
  * @adev: amdgpu_device pointer
  * @vm: requested vm
+ * @sync: sync object to add fences to
  *
- * Make sure all moved BOs are cleared in the PT.
+ * Make sure all BOs which are moved are updated in the PTs.
  * Returns 0 for success.
  *
- * PTs have to be reserved and mutex must be locked!
+ * PTs have to be reserved!
  */
-int amdgpu_vm_clear_moved(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-   struct amdgpu_sync *sync)
+int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
+  struct amdgpu_vm *vm,
+  struct amdgpu_sync *sync)
 {
struct amdgpu_bo_va *bo_va = NULL;
+   bool clear;
int r = 0;
 
spin_lock(&vm->status_lock);
@@ -2016,7 +2023,10 @@ int amdgpu_vm_clear_moved(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
struct amdgpu_bo_va, base.vm_status);
spin_unlock(&vm->status_lock);
 
-   r = amdgpu_vm_bo_update(adev, bo_va, true);
+   /* Per VM BOs never need to be cleared in the page tables */
+   clear = bo_va->base.bo->tbo.resv != vm->root.base.bo->tbo.resv;
+
+   r = amdgpu_vm_bo_update(adev, bo_va, clear);
if (r)
return r;
 
@@ -2068,6 +2078,37 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct 
amdgpu_device *adev,
return bo_va;
 }
 
+
+/**
+ * amdgpu_vm_bo_insert_map - insert a new mapping
+ *
+ * @adev: amdgpu_device pointer
+ * @bo_va: bo_va to store the address
+ * @mapping: the mapping to insert
+ *
+ * Insert a new mapping into all structures.
+ */
+static void amdgpu_vm_bo_insert_map(struct amdgpu_device *adev,
+   struct amdgpu_bo_va *bo_va,
+   struct amdgpu_bo_va_mapping *mapping)
+{
+   struct amdgpu_vm *vm = bo_va->base.vm;
+   struct amdgpu_bo *bo = bo_va->base.bo;
+
+   list_add(&mapping->list, &bo_va->invalids);
+   amdgpu_vm_it_insert(mapping, &vm->va);
+
+   if (mapping->flags & AMDGPU_PTE_PRT)
+   amdgpu_vm_prt_get(adev);
+
+   if (bo && bo->tbo.resv == vm->root.base.bo->tbo.resv) {
+   spin_lock(&vm->status_lock);
+   list_move(&bo_va->base.vm_status, &vm->moved);
+   spin_unlock(&vm->status_lock);
+   }
+   trace_amdgpu_vm_bo_map(bo_va, mapping);
+}
+
 /**
  * amdgpu_vm_bo_map - map bo inside a vm
  *
@@ -2119,18 +2160,12 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
if (!mapping)
return -ENOMEM;
 
-   INIT_LIST_HEAD(&mapping->list);
mapping->start = saddr;
mapping->last = eaddr;
mapping->offset = offset;
mapping->flag
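
The two decisions this patch introduces can be summarised in a small model.
It is a sketch only: the enum and function names are hypothetical, and the
reservation objects are reduced to opaque pointers. After validation, kernel
BOs (the PDs and PTs) go to the relocated list while everything else goes to
the moved list, and amdgpu_vm_handle_moved() requests a clear only for BOs
that do not share the reservation object of the VM root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for ttm_bo_type_kernel and the other BO types. */
enum model_bo_type { MODEL_BO_KERNEL, MODEL_BO_DEVICE };

/* Hypothetical names for the vm->relocated and vm->moved lists. */
enum model_vm_list { MODEL_LIST_RELOCATED, MODEL_LIST_MOVED };

/* List selection as done in the amdgpu_vm_validate_pt_bos() hunk. */
enum model_vm_list model_list_after_validate(enum model_bo_type type)
{
	return type == MODEL_BO_KERNEL ? MODEL_LIST_RELOCATED
				       : MODEL_LIST_MOVED;
}

/* The "clear" decision from the amdgpu_vm_handle_moved() hunk. */
bool model_needs_clear(const void *bo_resv, const void *vm_root_resv)
{
	assert(bo_resv != NULL && vm_root_resv != NULL);
	return bo_resv != vm_root_resv;
}
```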

[PATCH 1/4] drm/amdgpu: restrict userptr even more

2017-08-29 Thread Christian König
From: Christian König 

Don't allow them to be GEM imported into another process.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d028806..e32a2b5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -112,7 +112,13 @@ int amdgpu_gem_object_open(struct drm_gem_object *obj,
struct amdgpu_fpriv *fpriv = file_priv->driver_priv;
struct amdgpu_vm *vm = &fpriv->vm;
struct amdgpu_bo_va *bo_va;
+   struct mm_struct *mm;
int r;
+
+   mm = amdgpu_ttm_tt_get_usermm(abo->tbo.ttm);
+   if (mm && mm != current->mm)
+   return -EPERM;
+
r = amdgpu_bo_reserve(abo, false);
if (r)
return r;
-- 
2.7.4



Re: [PATCH] drm/amd/powerplay/hwmgr: Remove null check before kfree

2017-08-29 Thread Christian König

On 29.08.2017 at 15:12, Himanshu Jha wrote:

kfree on NULL pointer is a no-op and therefore checking is redundant.

Signed-off-by: Himanshu Jha 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c |  6 +-
  .../gpu/drm/amd/powerplay/hwmgr/processpptables.c  | 96 --
  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c | 52 
  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 12 +--
  4 files changed, 56 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
index bc839ff..9f2c037 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
@@ -1225,10 +1225,8 @@ static int cz_hwmgr_backend_fini(struct pp_hwmgr *hwmgr)
phm_destroy_table(hwmgr, &(hwmgr->power_down_asic));
phm_destroy_table(hwmgr, &(hwmgr->setup_asic));
  
-   if (NULL != hwmgr->dyn_state.vddc_dep_on_dal_pwrl) {
-   kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
-   hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
+   hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
  
kfree(hwmgr->backend);
hwmgr->backend = NULL;
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
index 2716721..a6dbc55 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
@@ -1615,85 +1615,53 @@ static int pp_tables_uninitialize(struct pp_hwmgr 
*hwmgr)
if (hwmgr->chip_id == CHIP_RAVEN)
return 0;
  
-   if (NULL != hwmgr->dyn_state.vddc_dependency_on_sclk) {
-   kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
-   hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
+   hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
  
-   if (NULL != hwmgr->dyn_state.vddci_dependency_on_mclk) {
-   kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
-   hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
+   hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
  
-   if (NULL != hwmgr->dyn_state.vddc_dependency_on_mclk) {
-   kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
-   hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
+   hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
  
-   if (NULL != hwmgr->dyn_state.mvdd_dependency_on_mclk) {
-   kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
-   hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
-   }
+   kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
+   hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
  
-   if (NULL != hwmgr->dyn_state.valid_mclk_values) {
-   kfree(hwmgr->dyn_state.valid_mclk_values);
-   hwmgr->dyn_state.valid_mclk_values = NULL;
-   }
+   kfree(hwmgr->dyn_state.valid_mclk_values);
+   hwmgr->dyn_state.valid_mclk_values = NULL;
  
-   if (NULL != hwmgr->dyn_state.valid_sclk_values) {
-   kfree(hwmgr->dyn_state.valid_sclk_values);
-   hwmgr->dyn_state.valid_sclk_values = NULL;
-   }
+   kfree(hwmgr->dyn_state.valid_sclk_values);
+   hwmgr->dyn_state.valid_sclk_values = NULL;
  
-   if (NULL != hwmgr->dyn_state.cac_leakage_table) {
-   kfree(hwmgr->dyn_state.cac_leakage_table);
-   hwmgr->dyn_state.cac_leakage_table = NULL;
-   }
+   kfree(hwmgr->dyn_state.cac_leakage_table);
+   hwmgr->dyn_state.cac_leakage_table = NULL;
  
-   if (NULL != hwmgr->dyn_state.vddc_phase_shed_limits_table) {
-   kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
-   hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
+   hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
  
-   if (NULL != hwmgr->dyn_state.vce_clock_voltage_dependency_table) {
-   kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
-   hwmgr->dyn_state.vce_clock_voltage_dependency_table = NULL;
-   }
+   kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
+   hwmgr->dyn_state.vce_clock_voltage_dependency_table = NULL;
  
-   if (NULL != hwmgr->dyn_state.uvd_clock_voltage_dependency_table) {
-   kfree(hwmgr->dyn_state.uvd_clock_voltage_dependency_table);
-   hwmgr->dyn_state.uvd_clock_voltage_dependency_table = NULL;
-   }
+   kfree(hwmgr->dyn_state.uvd_clock_voltage_dependency_table);
+   hwmgr->dyn_state.uvd_clock_voltage_dep
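
The transformation in this patch relies on kfree(NULL) being a no-op. The
same property holds for free(NULL) in standard C, so the before and after
shapes can be compared in userspace. The sketch below is an analogy, not
kernel code: model_dyn_state and both helpers are hypothetical names, and
free() stands in for kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one pointer member of hwmgr->dyn_state. */
struct model_dyn_state {
	int *valid_sclk_values;
};

/* Shape before the patch: explicit NULL check around the release. */
void model_release_old(struct model_dyn_state *s)
{
	assert(s != NULL);
	if (NULL != s->valid_sclk_values) {
		free(s->valid_sclk_values);
		s->valid_sclk_values = NULL;
	}
}

/* Shape after the patch: free(NULL) does nothing, so no check is needed. */
void model_release_new(struct model_dyn_state *s)
{
	assert(s != NULL);
	free(s->valid_sclk_values);
	s->valid_sclk_values = NULL;
}
```

Both helpers leave the pointer NULL afterwards, so calling either of them a
second time is harmless.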

[PATCH] drm/amd/powerplay: fix sclk setting for profile mode for CZ/ST

2017-08-29 Thread Alex Deucher
Need to select dpm0 to avoid clock fluctuations.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c | 47 +-
 1 file changed, 1 insertion(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
index bc839ff..9f7918e 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
@@ -1312,48 +1312,9 @@ static int cz_phm_force_dpm_lowest(struct pp_hwmgr 
*hwmgr)
return 0;
 }
 
-static int cz_phm_force_dpm_sclk(struct pp_hwmgr *hwmgr, uint32_t sclk)
-{
-   smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
-   PPSMC_MSG_SetSclkSoftMin,
-   cz_get_sclk_level(hwmgr,
-   sclk,
-   PPSMC_MSG_SetSclkSoftMin));
-
-   smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
-   PPSMC_MSG_SetSclkSoftMax,
-   cz_get_sclk_level(hwmgr,
-   sclk,
-   PPSMC_MSG_SetSclkSoftMax));
-   return 0;
-}
-
-static int cz_get_profiling_clk(struct pp_hwmgr *hwmgr, uint32_t *sclk)
-{
-   struct phm_clock_voltage_dependency_table *table =
-   hwmgr->dyn_state.vddc_dependency_on_sclk;
-   int32_t tmp_sclk;
-   int32_t count;
-
-   tmp_sclk = table->entries[table->count-1].clk * 70 / 100;
-
-   for (count = table->count-1; count >= 0; count--) {
-   if (tmp_sclk >= table->entries[count].clk) {
-   tmp_sclk = table->entries[count].clk;
-   *sclk = tmp_sclk;
-   break;
-   }
-   }
-   if (count < 0)
-   *sclk = table->entries[0].clk;
-
-   return 0;
-}
-
 static int cz_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
enum amd_dpm_forced_level level)
 {
-   uint32_t sclk = 0;
int ret = 0;
uint32_t profile_mode_mask = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD |
AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK |
@@ -1391,6 +1352,7 @@ static int cz_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
break;
case AMD_DPM_FORCED_LEVEL_LOW:
case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK:
+   case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD:
ret = cz_phm_force_dpm_lowest(hwmgr);
if (ret)
return ret;
@@ -1402,13 +1364,6 @@ static int cz_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
return ret;
hwmgr->dpm_level = level;
break;
-   case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD:
-   ret = cz_get_profiling_clk(hwmgr, &sclk);
-   if (ret)
-   return ret;
-   hwmgr->dpm_level = level;
-   cz_phm_force_dpm_sclk(hwmgr, sclk);
-   break;
case AMD_DPM_FORCED_LEVEL_MANUAL:
hwmgr->dpm_level = level;
break;
-- 
2.5.5
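
For reference, the selection logic of the removed cz_get_profiling_clk()
helper can be restated as a standalone function. This is a sketch: the name
model_profiling_clk() and the flat clock array are hypothetical, and the
dependency table is reduced to a list of clocks in ascending order. The
helper took 70% of the highest table entry and picked the highest level that
does not exceed that target, falling back to the first entry. With the patch
applied, PROFILE_STANDARD instead takes the same path as LOW and
PROFILE_MIN_SCLK, which calls cz_phm_force_dpm_lowest().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * clk[] holds the sclk levels in ascending order (assumption of this
 * sketch). Returns the level the removed helper would have selected.
 */
uint32_t model_profiling_clk(const uint32_t *clk, int32_t count)
{
	uint32_t target;
	int32_t i;

	assert(clk != NULL && count > 0);

	/* 70% of the highest level, using integer arithmetic as the patch did */
	target = clk[count - 1] * 70 / 100;

	/* Highest level that is not above the target */
	for (i = count - 1; i >= 0; i--) {
		if (target >= clk[i])
			return clk[i];
	}

	/* No level is low enough: fall back to the first entry */
	return clk[0];
}
```

With made-up levels of 30000, 40000, 60000 and 80000 the target is 56000 and
the function selects 40000, a level in the middle of the table rather than
dpm0.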



Re: [PATCH] drm/amd/powerplay/hwmgr: Remove null check before kfree

2017-08-29 Thread Zhu, Rex

Reviewed-by: Rex Zhu <rex@amd.com>


Best Regards

Rex


From: Wentland, Harry
Sent: Tuesday, August 29, 2017 9:34:01 PM
To: Himanshu Jha; airl...@linux.ie
Cc: linux-ker...@vger.kernel.org; dri-de...@lists.freedesktop.org; 
amd-gfx@lists.freedesktop.org; Deucher, Alexander; Zhu, Rex; Koenig, Christian
Subject: Re: [PATCH] drm/amd/powerplay/hwmgr: Remove null check before kfree

On 2017-08-29 09:12 AM, Himanshu Jha wrote:
> kfree on NULL pointer is a no-op and therefore checking is redundant.
>
> Signed-off-by: Himanshu Jha 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c |  6 +-
>  .../gpu/drm/amd/powerplay/hwmgr/processpptables.c  | 96 
> --
>  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c | 52 
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 12 +--
>  4 files changed, 56 insertions(+), 110 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> index bc839ff..9f2c037 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> @@ -1225,10 +1225,8 @@ static int cz_hwmgr_backend_fini(struct pp_hwmgr 
> *hwmgr)
>phm_destroy_table(hwmgr, &(hwmgr->power_down_asic));
>phm_destroy_table(hwmgr, &(hwmgr->setup_asic));
>
> - if (NULL != hwmgr->dyn_state.vddc_dep_on_dal_pwrl) {
> - kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
> - hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
> + hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
>
>kfree(hwmgr->backend);
>hwmgr->backend = NULL;
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> index 2716721..a6dbc55 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> @@ -1615,85 +1615,53 @@ static int pp_tables_uninitialize(struct pp_hwmgr 
> *hwmgr)
>if (hwmgr->chip_id == CHIP_RAVEN)
>return 0;
>
> - if (NULL != hwmgr->dyn_state.vddc_dependency_on_sclk) {
> - kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
> - hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
> + hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
>
> - if (NULL != hwmgr->dyn_state.vddci_dependency_on_mclk) {
> - kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
> - hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
> + hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
>
> - if (NULL != hwmgr->dyn_state.vddc_dependency_on_mclk) {
> - kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
> - hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
> + hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
>
> - if (NULL != hwmgr->dyn_state.mvdd_dependency_on_mclk) {
> - kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
> - hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
> - }
> + kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
> + hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
>
> - if (NULL != hwmgr->dyn_state.valid_mclk_values) {
> - kfree(hwmgr->dyn_state.valid_mclk_values);
> - hwmgr->dyn_state.valid_mclk_values = NULL;
> - }
> + kfree(hwmgr->dyn_state.valid_mclk_values);
> + hwmgr->dyn_state.valid_mclk_values = NULL;
>
> - if (NULL != hwmgr->dyn_state.valid_sclk_values) {
> - kfree(hwmgr->dyn_state.valid_sclk_values);
> - hwmgr->dyn_state.valid_sclk_values = NULL;
> - }
> + kfree(hwmgr->dyn_state.valid_sclk_values);
> + hwmgr->dyn_state.valid_sclk_values = NULL;
>
> - if (NULL != hwmgr->dyn_state.cac_leakage_table) {
> - kfree(hwmgr->dyn_state.cac_leakage_table);
> - hwmgr->dyn_state.cac_leakage_table = NULL;
> - }
> + kfree(hwmgr->dyn_state.cac_leakage_table);
> + hwmgr->dyn_state.cac_leakage_table = NULL;
>
> - if (NULL != hwmgr->dyn_state.vddc_phase_shed_limits_table) {
> - kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
> - hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
> + hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
>
> - if (NULL != hwmgr->dyn_state.vce_clock_voltage_dependency_table) {
> - kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
> - hwmgr->dyn_state.vce

Re: [PATCH 9/9] drm/amdgpu: WIP add IOCTL interface for per VM BOs

2017-08-29 Thread Christian König
Ok, found something that works. Xonotic in lowest resolution, lowest 
effects quality (i.e. totally CPU bound):


Without per process BOs:

Xonotic 0.8:
pts/xonotic-1.4.0 [Resolution: 800 x 600 - Effects Quality: Low]
Test 1 of 1
Estimated Trial Run Count:3
Estimated Time To Completion: 3 Minutes
Started Run 1 @ 21:13:50
Started Run 2 @ 21:14:57
Started Run 3 @ 21:16:03  [Std. Dev: 0.94%]

Test Results:
187.436577
189.514724
190.9605812

Average: 189.30 Frames Per Second
Minimum: 131
Maximum: 355

With per process BOs:

Xonotic 0.8:
pts/xonotic-1.4.0 [Resolution: 800 x 600 - Effects Quality: Low]
Test 1 of 1
Estimated Trial Run Count:3
Estimated Time To Completion: 3 Minutes
Started Run 1 @ 21:20:05
Started Run 2 @ 21:21:07
Started Run 3 @ 21:22:10  [Std. Dev: 1.49%]

Test Results:
203.0471676
199.6622532
197.0954183

Average: 199.93 Frames Per Second
Minimum: 132
Maximum: 349

Well that looks like some improvement.

Regards,
Christian.
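
The averages above can be cross-checked with a few lines of C. This is a
sketch, with model_mean() and model_gain_percent() as hypothetical helper
names: the first averages the three runs of one configuration and the second
expresses the difference between two averages as a percentage.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
double model_mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(v != NULL && n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative gain of "after" over "before", in percent. */
double model_gain_percent(double before, double after)
{
	assert(before > 0.0);
	return (after / before - 1.0) * 100.0;
}
```

Fed with the three runs of each configuration, this reproduces the reported
averages of 189.30 and 199.93 frames per second and gives a gain of roughly
5.6% with per process BOs.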

On 28.08.2017 at 14:59, Zhou, David(ChunMing) wrote:

I will push our vulkan guys to test it, their bo list is very long.

Sent from my Nut Pro

Christian König wrote on 28 August 2017 at 7:55 PM:

On 28.08.2017 at 06:21, zhoucm1 wrote:
>
>
> On 2017-08-27 18:03, Christian König wrote:
>> On 25.08.2017 at 21:19, Christian König wrote:
>>> On 25.08.2017 at 18:22, Marek Olšák wrote:
 On Fri, Aug 25, 2017 at 3:00 PM, Christian König
  wrote:
> On 25.08.2017 at 12:32, zhoucm1 wrote:
>>
>>
>> On 2017-08-25 17:38, Christian König wrote:
>>> From: Christian König 
>>>
>>> Add the IOCTL interface so that applications can allocate per VM
>>> BOs.
>>>
>>> Still WIP since not all corner cases are tested yet, but this
>>> reduces
>>> average
>>> CS overhead for 10K BOs from 21ms down to 48us.
>> Wow, cheers, you finally got per VM BOs onto the same reservation
>> object as the PD/PTs,
>> that indeed saves a lot of BO list handling.
>
> Don't cheer too loud yet, that is a completely constructed test case.
>
> So far I wasn't able to achieve any improvements with any real
> game on this
> with Mesa.
> Thinking about it more: with too many BOs sharing one reservation
> object, the reservation lock could often be busy. If eviction or
> destruction also happens often in the meantime, that could affect VM
> updates and CS submission as well.

That's exactly the reason why I've added code to the BO destroy path to
avoid at least some of the problems. But yeah, that's only the tip of
the iceberg of problems with that approach.

> Anyway, this is a very good start at reducing CS overhead,
> especially since we've seen "reduces average CS overhead for 10K BOs
> from 21ms down to 48us.".

Actually, it's not that good. See, this is a completely constructed test
case on a kernel with lockdep and KASAN enabled.

In reality we usually don't have so many BOs and so far I wasn't able to
find much of an improvement in any real world testing.

Regards,
Christian.




RE: [PATCH 2/2] drm/amd/powerplay: set uvd/vce/nb/mclk level as pstate requested

2017-08-29 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rex Zhu
> Sent: Tuesday, August 29, 2017 5:14 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex
> Subject: [PATCH 2/2] drm/amd/powerplay: set uvd/vce/nb/mclk level as
> pstate requested
> 
> Change-Id: Ibd74590c3fe9dbdeac924b697d18448bddbefcdb

Missing your signed-off-by.

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> index a125e30..10bf687 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> @@ -1359,6 +1359,11 @@ static int cz_dpm_force_dpm_level(struct
> pp_hwmgr *hwmgr,
>   if (level == hwmgr->dpm_level)
>   return 0;
> 
> + if (level == AMD_DPM_FORCED_LEVEL_PROFILE_PEAK)
> + cz_nbdpm_pstate_enable_disable(hwmgr, false, false);
> + else if (level == AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD)
> + cz_nbdpm_pstate_enable_disable(hwmgr, false, true);
> +

Do we need a default case here as well for the nbdpm to reset it back to normal 
when profiling mode is disabled?

Alex

>   switch (level) {
>   case AMD_DPM_FORCED_LEVEL_HIGH:
>   case AMD_DPM_FORCED_LEVEL_PROFILE_PEAK:
> @@ -1435,7 +1440,8 @@ int cz_dpm_update_uvd_dpm(struct pp_hwmgr
> *hwmgr, bool bgate)
>   if (!bgate) {
>   /* Stable Pstate is enabled and we need to set the UVD DPM
> to highest level */
>   if (phm_cap_enabled(hwmgr-
> >platform_descriptor.platformCaps,
> -  PHM_PlatformCaps_StablePState)) {
> +  PHM_PlatformCaps_StablePState)
> + || hwmgr->en_umd_pstate) {
>   cz_hwmgr->uvd_dpm.hard_min_clk =
>  ptable->entries[ptable->count - 1].vclk;
> 
> @@ -1464,7 +1470,8 @@ int  cz_dpm_update_vce_dpm(struct pp_hwmgr
> *hwmgr)
> 
>   /* Stable Pstate is enabled and we need to set the VCE DPM to
> highest level */
>   if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps,
> -  PHM_PlatformCaps_StablePState)) {
> + PHM_PlatformCaps_StablePState)
> + || hwmgr->en_umd_pstate) {
>   cz_hwmgr->vce_dpm.hard_min_clk =
> ptable->entries[ptable->count - 1].ecclk;
> 
> --
> 1.9.1
> 


RE: [PATCH 1/2] drm/amd/powerplay: add UMD P-state in powerplay.

2017-08-29 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Rex Zhu
> Sent: Tuesday, August 29, 2017 5:14 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhu, Rex
> Subject: [PATCH 1/2] drm/amd/powerplay: add UMD P-state in powerplay.
> 
> This feature lets the UMD run benchmarks in a
> power state that is as steady as possible. The KMD
> needs to keep the power state as stable as possible.
> For now, the KMD supports four levels:
> profile_standard, peak, min_sclk, min_mclk
> 
> move common related code to amd_powerplay.c
> 
> Change-Id: Ie06c122199b7246f5b1951c354cf502bbed27485
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/powerplay/amd_powerplay.c  | 40
> +-
>  drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c | 24 +
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 25 +-
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 25 +
> -
>  drivers/gpu/drm/amd/powerplay/inc/hwmgr.h  |  3 +-
>  5 files changed, 44 insertions(+), 73 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> index f73e80c..310f34a 100644
> --- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> +++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> @@ -30,7 +30,7 @@
>  #include "pp_instance.h"
>  #include "power_state.h"
>  #include "eventmanager.h"
> -
> +#include "eventtasks.h"
> 
>  static inline int pp_check(struct pp_instance *handle)
>  {
> @@ -324,12 +324,44 @@ static int pp_dpm_fw_loading_complete(void
> *handle)
>   return 0;
>  }
> 
> +static void pp_dpm_en_umd_pstate(struct pp_hwmgr  *hwmgr,
> + enum
> amd_dpm_forced_level level)
> +{
> + uint32_t profile_mode_mask =
> AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD |
> +
>   AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK |
> +
>   AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK |
> +
>   AMD_DPM_FORCED_LEVEL_PROFILE_PEAK;
> +
> + if (!(hwmgr->dpm_level & profile_mode_mask)) {
> + /* enter umd pstate, save current level, disable gfx cg*/
> + if (level & profile_mode_mask) {
> + hwmgr->saved_dpm_level = hwmgr->dpm_level;
> + hwmgr->en_umd_pstate = true;
> + cgs_set_clockgating_state(hwmgr->device,
> + AMD_IP_BLOCK_TYPE_GFX,
> + AMD_CG_STATE_UNGATE);
> + }
> + } else {
> + /* exit umd pstate, restore level, enable gfx cg*/
> + if (!(level & profile_mode_mask)) {
> + if (level ==
> AMD_DPM_FORCED_LEVEL_PROFILE_EXIT)
> + level = hwmgr->saved_dpm_level;
> + hwmgr->en_umd_pstate = false;
> + cgs_set_clockgating_state(hwmgr->device,
> + AMD_IP_BLOCK_TYPE_GFX,
> + AMD_CG_STATE_GATE);
> + }
> + }
> + return;

Can drop the return here.  With that fixed:
Reviewed-by: Alex Deucher 

> +}
> +
>  static int pp_dpm_force_performance_level(void *handle,
>   enum amd_dpm_forced_level level)
>  {
>   struct pp_hwmgr  *hwmgr;
>   struct pp_instance *pp_handle = (struct pp_instance *)handle;
>   int ret = 0;
> + struct pem_event_data data = { {0} };
> 
>   ret = pp_check(pp_handle);
> 
> @@ -338,13 +370,19 @@ static int pp_dpm_force_performance_level(void
> *handle,
> 
>   hwmgr = pp_handle->hwmgr;
> 
> + if (level == hwmgr->dpm_level)
> + return 0;
> +
>   if (hwmgr->hwmgr_func->force_dpm_level == NULL) {
>   pr_info("%s was not implemented.\n", __func__);
>   return 0;
>   }
> 
>   mutex_lock(&pp_handle->pp_lock);
> + pp_dpm_en_umd_pstate(hwmgr, level);
> + pem_task_adjust_power_state(pp_handle->eventmgr, &data);
>   hwmgr->hwmgr_func->force_dpm_level(hwmgr, level);
> +
>   mutex_unlock(&pp_handle->pp_lock);
>   return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> index bc839ff..a125e30 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> @@ -1355,31 +1355,9 @@ static int cz_dpm_force_dpm_level(struct
> pp_hwmgr *hwmgr,
>  {
>   uint32_t sclk = 0;
>   int ret = 0;
> - uint32_t profile_mode_mask =
> AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD |
> -
>   AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK |
> -
>   AMD_DPM_FORCED_LEVEL_PROFILE_PEAK;
> 
>   if (level == hwmgr->dpm_level)
> - return ret;
> -
> - if (!(hwmgr->dpm_level & profile_mode_mask)) {
> - /* enter profile mode, save current level, disable gfx cg*/
> - if (level & profile_mode_mask) {
> -  
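
The enter/exit logic of pp_dpm_en_umd_pstate() quoted above can be modelled
as a small state machine. This is a sketch under stated assumptions: the
MODEL_LEVEL_* bit values are made up for the example and are not copied from
the kernel headers, struct model_hwmgr only keeps the fields the hunk
touches, and the clockgating call is reduced to a boolean. One detail is
worth noting: in the hunk as posted, level is passed by value and the
function returns void, so the restored saved_dpm_level does not reach
pp_dpm_force_performance_level(). The sketch returns the effective level
instead, which is one possible way to make it visible to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values are assumptions for this sketch only. */
#define MODEL_LEVEL_AUTO             (1u << 0)
#define MODEL_LEVEL_PROFILE_STANDARD (1u << 4)
#define MODEL_LEVEL_PROFILE_MIN_SCLK (1u << 5)
#define MODEL_LEVEL_PROFILE_MIN_MCLK (1u << 6)
#define MODEL_LEVEL_PROFILE_PEAK     (1u << 7)
#define MODEL_LEVEL_PROFILE_EXIT     (1u << 8)

#define MODEL_PROFILE_MASK (MODEL_LEVEL_PROFILE_STANDARD | \
			    MODEL_LEVEL_PROFILE_MIN_SCLK | \
			    MODEL_LEVEL_PROFILE_MIN_MCLK | \
			    MODEL_LEVEL_PROFILE_PEAK)

/* Hypothetical reduction of struct pp_hwmgr to the fields used here. */
struct model_hwmgr {
	uint32_t dpm_level;
	uint32_t saved_dpm_level;
	bool en_umd_pstate;
	bool gfx_clockgating;	/* stands in for cgs_set_clockgating_state() */
};

/* Returns the level the caller should actually apply. */
uint32_t model_en_umd_pstate(struct model_hwmgr *hw, uint32_t level)
{
	assert(hw != NULL);

	if (!(hw->dpm_level & MODEL_PROFILE_MASK)) {
		/* enter umd pstate: save current level, disable gfx cg */
		if (level & MODEL_PROFILE_MASK) {
			hw->saved_dpm_level = hw->dpm_level;
			hw->en_umd_pstate = true;
			hw->gfx_clockgating = false;
		}
	} else {
		/* exit umd pstate: restore level, enable gfx cg */
		if (!(level & MODEL_PROFILE_MASK)) {
			if (level == MODEL_LEVEL_PROFILE_EXIT)
				level = hw->saved_dpm_level;
			hw->en_umd_pstate = false;
			hw->gfx_clockgating = true;
		}
	}
	return level;
}
```

Whether the by-value parameter matters in practice depends on parts of the
series that are not visible in this excerpt.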

[PATCH umr] Add vram write functionality

2017-08-29 Thread Tom St Denis
Also add --vram-read and -vr which will eventually replace --vram and -v.

Signed-off-by: Tom St Denis 
---
 doc/umr.1   |  6 -
 src/app/main.c  | 43 +---
 src/lib/read_vram.c | 72 ++---
 src/umr.h   |  6 -
 4 files changed, 97 insertions(+), 30 deletions(-)

diff --git a/doc/umr.1 b/doc/umr.1
index 06950925b7b6..b990ff2c412f 100644
--- a/doc/umr.1
+++ b/doc/umr.1
@@ -94,12 +94,16 @@ The VMID can be specified in hexadecimal (with leading 
'0x') or in decimal.
 Implies '-O verbose' for the duration of the command so does not require it
 to be manually specified.
 
-.IP "--vram, -v [vmid@] "
+.IP "--vram-read, -vr [vmid@] "
 Read 'size' bytes (in hex) from the address specified (in hexadecimal) from 
VRAM
 to stdout.  Optionally specify the VMID (in decimal or in hex with a 0x prefix)
 treating the address as a virtual address instead.  Can use 'use_pci' to
 directly access VRAM.
 
+.IP "--vram-write, -vw [vmid@] "
+Write 'size' bytes (in hex) to the address specified (in hexadecimal) to VRAM
+from stdin.
+
 .IP "--update, -u" 
 Specify update file to add, change, or delete registers from the register
 database.  Useful for adding registers that are not including in the kernel 
headers.
diff --git a/src/app/main.c b/src/app/main.c
index 920f6815e220..8fdad3580686 100644
--- a/src/app/main.c
+++ b/src/app/main.c
@@ -382,7 +382,8 @@ int main(int argc, char **argv)
printf("--vm-decode requires two parameters\n");
return EXIT_FAILURE;
}
-   } else if (!strcmp(argv[i], "--vram") || !strcmp(argv[i], 
"-v")) {
+   } else if (!strcmp(argv[i], "--vram") || !strcmp(argv[i], "-v") 
||
+  !strcmp(argv[i], "--vram-read") || !strcmp(argv[i], 
"-vr")) {
if (i + 2 < argc) {
unsigned char buf[256];
uint64_t address;
@@ -413,7 +414,41 @@ int main(int argc, char **argv)
} while (size);
i += 2;
} else {
-   printf("--vram requires two parameters\n");
+   printf("--vram-read requires two parameters\n");
+   return EXIT_FAILURE;
+   }
+   } else if (!strcmp(argv[i], "--vram-write") || !strcmp(argv[i], 
"-vw")) {
+   if (i + 2 < argc) {
+   unsigned char buf[256];
+   uint64_t address;
+   uint32_t size, n, vmid;
+
+   if (!asic)
+   asic = get_asic();
+
+   // allow specifying the vmid in hex as well so
+   // people can add the HUB flags more easily
+   if ((n = sscanf(argv[i+1], 
"0x%"SCNx32"@%"SCNx64, &vmid, &address)) != 2)
+   if ((n = sscanf(argv[i+1], 
"%"SCNu32"@%"SCNx64, &vmid, &address)) != 2) {
+   sscanf(argv[i+1], "%"SCNx64, 
&address);
+   vmid = UMR_LINEAR_HUB;
+   }
+
+   // imply user hub if hub name specified
+   if (options.hub_name[0])
+   vmid |= UMR_USER_HUB;
+
+   sscanf(argv[i+2], "%"SCNx32, &size);
+   do {
+   n = size > sizeof(buf) ? sizeof(buf) : 
size;
+   fread(buf, 1, n, stdin);
+   umr_write_vram(asic, vmid, address, n, 
buf);
+   size -= n;
+   address += n;
+   } while (size);
+   i += 2;
+   } else {
+   printf("--vram-write requires two 
parameters\n");
return EXIT_FAILURE;
}
} else if (!strcmp(argv[i], "--option") || !strcmp(argv[i], 
"-O")) {
@@ -479,11 +514,13 @@ int main(int argc, char **argv)
"\n\t\tThe VMID can be specified in hexadecimal (with leading '0x') or 
in decimal."
"\n\t\tImplies '-O verbose' for the duration of the command so does not 
require it"
"\n\t\tto be manually specified.\n"
-"\n\t--vram, -v [@] "
+"\n\t--vram-read, -vr [@] "
"\n\t\tRead 'size' bytes (in hex) from a given address (in hex) to 
stdout. Optionally"
"\n\t\tspecify the VMID (in decimal or in hex with a '0x' prefix) 
treating the address"
"\n
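
A side note on the --vram-write loop in the patch above: it reads stdin in chunks of at most 256 bytes and hands each chunk to umr_write_vram(), but the return value of fread() is not checked, so a short input would still be written out as a full chunk. Below is a minimal sketch of the same chunking with short reads handled. The names copy_in_chunks, chunk_sink, mem_target and mem_sink are made up for this illustration and are not part of umr; the sink stands in for umr_write_vram(), which takes different arguments.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink callback; a stand-in for umr_write_vram(). */
typedef int (*chunk_sink)(uint64_t address, uint32_t size,
                          const void *data, void *ctx);

/* Copy up to `size` bytes from `in` to the sink in chunks of at most
 * 256 bytes, advancing `address` as the patch does. Returns the number
 * of bytes actually forwarded, which is less than `size` if the input
 * ends early or the sink fails. */
static uint32_t copy_in_chunks(FILE *in, uint64_t address, uint32_t size,
                               chunk_sink sink, void *ctx)
{
    unsigned char buf[256];
    uint32_t total = 0;

    while (size) {
        uint32_t want = size > sizeof(buf) ? (uint32_t)sizeof(buf) : size;
        uint32_t got = (uint32_t)fread(buf, 1, want, in);

        if (got == 0)
            break;              /* EOF or read error: stop, do not write stale data */
        if (sink(address, got, buf, ctx) != 0)
            break;              /* sink refused the chunk */
        total += got;
        address += got;
        size -= got;
    }
    return total;
}

/* Example sink for experiments: a flat byte array indexed by address. */
struct mem_target {
    unsigned char bytes[1024];
};

static int mem_sink(uint64_t address, uint32_t size,
                    const void *data, void *ctx)
{
    struct mem_target *t = ctx;

    if (address + size > sizeof(t->bytes))
        return -1;
    memcpy(t->bytes + address, data, size);
    return 0;
}
```

Calling copy_in_chunks(stdin, address, size, sink, ctx) mirrors the do/while loop in the patch; comparing the return value with the requested size tells the caller whether the input was truncated.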

RE: [PATCH xf86-video-ati] Use a timer for unreferencing the all-black FB

2017-08-29 Thread Deucher, Alexander


> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Michel Dänzer
> Sent: Tuesday, August 29, 2017 5:18 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH xf86-video-ati] Use a timer for unreferencing the all-black FB
> 
> From: Michel Dänzer 
> 
> The timer fires 1 second after LeaveVT. This gives the next DRM master
> enough time to set up scanout of its own buffers.
> 
> Fixes prolonged intermittent black screen when switching from Xorg to
> e.g. the GDM Wayland mode login VT.
> 
> Fixes: 06a465484101 ("Make all active CRTCs scan out an all-black
>   framebuffer in LeaveVT")
> Signed-off-by: Michel Dänzer 

Reviewed-by: Alex Deucher 

> ---
>  src/radeon_kms.c | 40 +++-
>  1 file changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/src/radeon_kms.c b/src/radeon_kms.c
> index 5410c4208..01594c6ca 100644
> --- a/src/radeon_kms.c
> +++ b/src/radeon_kms.c
> @@ -1150,7 +1150,6 @@ static void
> RADEONBlockHandler_KMS(BLOCKHANDLER_ARGS_DECL)
>  {
>  SCREEN_PTR(arg);
>  ScrnInfoPtrpScrn   = xf86ScreenToScrn(pScreen);
> -RADEONEntPtr pRADEONEnt = RADEONEntPriv(pScrn);
>  RADEONInfoPtr  info= RADEONPTR(pScrn);
>  xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(pScrn);
>  int c;
> @@ -1159,19 +1158,8 @@ static void
> RADEONBlockHandler_KMS(BLOCKHANDLER_ARGS_DECL)
>  (*pScreen->BlockHandler) (BLOCKHANDLER_ARGS);
>  pScreen->BlockHandler = RADEONBlockHandler_KMS;
> 
> -if (!xf86ScreenToScrn(radeon_master_screen(pScreen))->vtSema) {
> - /* Unreference the all-black FB created by RADEONLeaveVT_KMS.
> After
> -  * this, there should be no FB left created by this driver.
> -  */
> - for (c = 0; c < xf86_config->num_crtc; c++) {
> - drmmode_crtc_private_ptr drmmode_crtc =
> - xf86_config->crtc[c]->driver_private;
> -
> - drmmode_fb_reference(pRADEONEnt->fd, &drmmode_crtc->fb,
> NULL);
> - }
> -
> +if (!xf86ScreenToScrn(radeon_master_screen(pScreen))->vtSema)
>   return;
> -}
> 
>  if (!radeon_is_gpu_screen(pScreen))
>  {
> @@ -2473,6 +2461,30 @@ Bool
> RADEONEnterVT_KMS(VT_FUNC_ARGS_DECL)
>  return TRUE;
>  }
> 
> +static
> +CARD32 cleanup_black_fb(OsTimerPtr timer, CARD32 now, pointer data)
> +{
> +ScreenPtr screen = data;
> +ScrnInfoPtr scrn = xf86ScreenToScrn(screen);
> +RADEONEntPtr pRADEONEnt = RADEONEntPriv(scrn);
> +xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(scrn);
> +int c;
> +
> +if (xf86ScreenToScrn(radeon_master_screen(screen))->vtSema)
> + return 0;
> +
> +/* Unreference the all-black FB created by RADEONLeaveVT_KMS. After
> + * this, there should be no FB left created by this driver.
> + */
> +for (c = 0; c < xf86_config->num_crtc; c++) {
> + drmmode_crtc_private_ptr drmmode_crtc =
> + xf86_config->crtc[c]->driver_private;
> +
> + drmmode_fb_reference(pRADEONEnt->fd, &drmmode_crtc->fb,
> NULL);
> +}
> +
> +return 0;
> +}
> 
>  static void
>  pixmap_unref_fb(void *value, XID id, void *cdata)
> @@ -2569,6 +2581,8 @@ void
> RADEONLeaveVT_KMS(VT_FUNC_ARGS_DECL)
>  }
>  pixmap_unref_fb(pScreen->GetScreenPixmap(pScreen), None,
> pRADEONEnt);
> 
> +TimerSet(NULL, 0, 1000, cleanup_black_fb, pScreen);
> +
>  xf86_hide_cursors (pScrn);
> 
>  radeon_drop_drm_master(pScrn);
> --
> 2.14.1
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH xf86-video-amdgpu 6/6] Remove drmmode_scanout_free

2017-08-29 Thread Deucher, Alexander
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Michel Dänzer
> Sent: Tuesday, August 29, 2017 4:31 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH xf86-video-amdgpu 6/6] Remove drmmode_scanout_free
> 
> From: Michel Dänzer 
> 
> Not used anymore.
> 
> (Cherry picked from radeon commit
> e4a3df19d588a4310fcb889ef34e205d0e92e4d7)
> 
> Signed-off-by: Michel Dänzer 

Series is:
Reviewed-by: Alex Deucher 

> ---
>  src/drmmode_display.c | 10 --
>  src/drmmode_display.h |  1 -
>  2 files changed, 11 deletions(-)
> 
> diff --git a/src/drmmode_display.c b/src/drmmode_display.c
> index 6057699bf..9c838a8b9 100644
> --- a/src/drmmode_display.c
> +++ b/src/drmmode_display.c
> @@ -490,16 +490,6 @@
> drmmode_crtc_scanout_free(drmmode_crtc_private_ptr drmmode_crtc)
>   DamageDestroy(drmmode_crtc->scanout_damage);
>  }
> 
> -void
> -drmmode_scanout_free(ScrnInfoPtr scrn)
> -{
> - xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(scrn);
> - int c;
> -
> - for (c = 0; c < xf86_config->num_crtc; c++)
> - drmmode_crtc_scanout_free(xf86_config->crtc[c]-
> >driver_private);
> -}
> -
>  PixmapPtr
>  drmmode_crtc_scanout_create(xf86CrtcPtr crtc, struct drmmode_scanout
> *scanout,
>   int width, int height)
> diff --git a/src/drmmode_display.h b/src/drmmode_display.h
> index eff342942..03134f0c9 100644
> --- a/src/drmmode_display.h
> +++ b/src/drmmode_display.h
> @@ -208,7 +208,6 @@ extern Bool drmmode_setup_colormap(ScreenPtr
> pScreen, ScrnInfoPtr pScrn);
> 
>  extern void drmmode_crtc_scanout_destroy(drmmode_ptr drmmode,
>struct drmmode_scanout *scanout);
> -extern void drmmode_scanout_free(ScrnInfoPtr scrn);
>  void drmmode_crtc_scanout_free(drmmode_crtc_private_ptr
> drmmode_crtc);
>  PixmapPtr drmmode_crtc_scanout_create(xf86CrtcPtr crtc,
> struct drmmode_scanout *scanout,
> --
> 2.14.1
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/powerplay/hwmgr: Remove null check before kfree

2017-08-29 Thread Harry Wentland
On 2017-08-29 09:12 AM, Himanshu Jha wrote:
> kfree on NULL pointer is a no-op and therefore checking is redundant.
> 
> Signed-off-by: Himanshu Jha 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c |  6 +-
>  .../gpu/drm/amd/powerplay/hwmgr/processpptables.c  | 96 
> --
>  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c | 52 
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 12 +--
>  4 files changed, 56 insertions(+), 110 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> index bc839ff..9f2c037 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
> @@ -1225,10 +1225,8 @@ static int cz_hwmgr_backend_fini(struct pp_hwmgr 
> *hwmgr)
>   phm_destroy_table(hwmgr, &(hwmgr->power_down_asic));
>   phm_destroy_table(hwmgr, &(hwmgr->setup_asic));
>  
> - if (NULL != hwmgr->dyn_state.vddc_dep_on_dal_pwrl) {
> - kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
> - hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
> + hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
>  
>   kfree(hwmgr->backend);
>   hwmgr->backend = NULL;
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> index 2716721..a6dbc55 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
> @@ -1615,85 +1615,53 @@ static int pp_tables_uninitialize(struct pp_hwmgr 
> *hwmgr)
>   if (hwmgr->chip_id == CHIP_RAVEN)
>   return 0;
>  
> - if (NULL != hwmgr->dyn_state.vddc_dependency_on_sclk) {
> - kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
> - hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
> + hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
>  
> - if (NULL != hwmgr->dyn_state.vddci_dependency_on_mclk) {
> - kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
> - hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
> + hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
>  
> - if (NULL != hwmgr->dyn_state.vddc_dependency_on_mclk) {
> - kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
> - hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
> + hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
>  
> - if (NULL != hwmgr->dyn_state.mvdd_dependency_on_mclk) {
> - kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
> - hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
> - }
> + kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
> + hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
>  
> - if (NULL != hwmgr->dyn_state.valid_mclk_values) {
> - kfree(hwmgr->dyn_state.valid_mclk_values);
> - hwmgr->dyn_state.valid_mclk_values = NULL;
> - }
> + kfree(hwmgr->dyn_state.valid_mclk_values);
> + hwmgr->dyn_state.valid_mclk_values = NULL;
>  
> - if (NULL != hwmgr->dyn_state.valid_sclk_values) {
> - kfree(hwmgr->dyn_state.valid_sclk_values);
> - hwmgr->dyn_state.valid_sclk_values = NULL;
> - }
> + kfree(hwmgr->dyn_state.valid_sclk_values);
> + hwmgr->dyn_state.valid_sclk_values = NULL;
>  
> - if (NULL != hwmgr->dyn_state.cac_leakage_table) {
> - kfree(hwmgr->dyn_state.cac_leakage_table);
> - hwmgr->dyn_state.cac_leakage_table = NULL;
> - }
> + kfree(hwmgr->dyn_state.cac_leakage_table);
> + hwmgr->dyn_state.cac_leakage_table = NULL;
>  
> - if (NULL != hwmgr->dyn_state.vddc_phase_shed_limits_table) {
> - kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
> - hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
> - }
> + kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
> + hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
>  
> - if (NULL != hwmgr->dyn_state.vce_clock_voltage_dependency_table) {
> - kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
> - hwmgr->dyn_state.vce_clock_voltage_dependency_table = NULL;
> - }
> + kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
> + hwmgr->dyn_state.vce_clock_voltage_dependency_table = NULL;
>  
> - if (NULL != hwmgr->dyn_state.uvd_clock_voltage_dependency_table) {
> - kfree(hwmgr->dyn_state.uvd_clock_voltage_dependency_table);
> - hwmgr->dyn_state.uvd_clock_voltage_dependency_table

[PATCH] drm/amd/powerplay/hwmgr: Remove null check before kfree

2017-08-29 Thread Himanshu Jha
kfree on NULL pointer is a no-op and therefore checking is redundant.

Signed-off-by: Himanshu Jha 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c |  6 +-
 .../gpu/drm/amd/powerplay/hwmgr/processpptables.c  | 96 --
 drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c | 52 
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 12 +--
 4 files changed, 56 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
index bc839ff..9f2c037 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
@@ -1225,10 +1225,8 @@ static int cz_hwmgr_backend_fini(struct pp_hwmgr *hwmgr)
phm_destroy_table(hwmgr, &(hwmgr->power_down_asic));
phm_destroy_table(hwmgr, &(hwmgr->setup_asic));
 
-   if (NULL != hwmgr->dyn_state.vddc_dep_on_dal_pwrl) {
-   kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
-   hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddc_dep_on_dal_pwrl);
+   hwmgr->dyn_state.vddc_dep_on_dal_pwrl = NULL;
 
kfree(hwmgr->backend);
hwmgr->backend = NULL;
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
index 2716721..a6dbc55 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/processpptables.c
@@ -1615,85 +1615,53 @@ static int pp_tables_uninitialize(struct pp_hwmgr 
*hwmgr)
if (hwmgr->chip_id == CHIP_RAVEN)
return 0;
 
-   if (NULL != hwmgr->dyn_state.vddc_dependency_on_sclk) {
-   kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
-   hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddc_dependency_on_sclk);
+   hwmgr->dyn_state.vddc_dependency_on_sclk = NULL;
 
-   if (NULL != hwmgr->dyn_state.vddci_dependency_on_mclk) {
-   kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
-   hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddci_dependency_on_mclk);
+   hwmgr->dyn_state.vddci_dependency_on_mclk = NULL;
 
-   if (NULL != hwmgr->dyn_state.vddc_dependency_on_mclk) {
-   kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
-   hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddc_dependency_on_mclk);
+   hwmgr->dyn_state.vddc_dependency_on_mclk = NULL;
 
-   if (NULL != hwmgr->dyn_state.mvdd_dependency_on_mclk) {
-   kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
-   hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
-   }
+   kfree(hwmgr->dyn_state.mvdd_dependency_on_mclk);
+   hwmgr->dyn_state.mvdd_dependency_on_mclk = NULL;
 
-   if (NULL != hwmgr->dyn_state.valid_mclk_values) {
-   kfree(hwmgr->dyn_state.valid_mclk_values);
-   hwmgr->dyn_state.valid_mclk_values = NULL;
-   }
+   kfree(hwmgr->dyn_state.valid_mclk_values);
+   hwmgr->dyn_state.valid_mclk_values = NULL;
 
-   if (NULL != hwmgr->dyn_state.valid_sclk_values) {
-   kfree(hwmgr->dyn_state.valid_sclk_values);
-   hwmgr->dyn_state.valid_sclk_values = NULL;
-   }
+   kfree(hwmgr->dyn_state.valid_sclk_values);
+   hwmgr->dyn_state.valid_sclk_values = NULL;
 
-   if (NULL != hwmgr->dyn_state.cac_leakage_table) {
-   kfree(hwmgr->dyn_state.cac_leakage_table);
-   hwmgr->dyn_state.cac_leakage_table = NULL;
-   }
+   kfree(hwmgr->dyn_state.cac_leakage_table);
+   hwmgr->dyn_state.cac_leakage_table = NULL;
 
-   if (NULL != hwmgr->dyn_state.vddc_phase_shed_limits_table) {
-   kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
-   hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
-   }
+   kfree(hwmgr->dyn_state.vddc_phase_shed_limits_table);
+   hwmgr->dyn_state.vddc_phase_shed_limits_table = NULL;
 
-   if (NULL != hwmgr->dyn_state.vce_clock_voltage_dependency_table) {
-   kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
-   hwmgr->dyn_state.vce_clock_voltage_dependency_table = NULL;
-   }
+   kfree(hwmgr->dyn_state.vce_clock_voltage_dependency_table);
+   hwmgr->dyn_state.vce_clock_voltage_dependency_table = NULL;
 
-   if (NULL != hwmgr->dyn_state.uvd_clock_voltage_dependency_table) {
-   kfree(hwmgr->dyn_state.uvd_clock_voltage_dependency_table);
-   hwmgr->dyn_state.uvd_clock_voltage_dependency_table = NULL;
-   }
+   kfree(hwmgr->dyn_state.uvd_clock_voltage_dependency_table);
+   hwmgr->dyn_state.uvd_clock_voltage_dependency_table = NULL;
 
-

[PATCH] drm/amd: Remove null check before kfree

2017-08-29 Thread Himanshu Jha
Kfree on NULL pointer is a no-op and therefore checking is redundant.

Signed-off-by: Himanshu Jha 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 6 ++
 drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 6 ++
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
index 8d1cf2d..f51b41f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
@@ -346,10 +346,8 @@ static void amdgpu_connector_free_edid(struct 
drm_connector *connector)
 {
struct amdgpu_connector *amdgpu_connector = 
to_amdgpu_connector(connector);
 
-   if (amdgpu_connector->edid) {
-   kfree(amdgpu_connector->edid);
-   amdgpu_connector->edid = NULL;
-   }
+   kfree(amdgpu_connector->edid);
+   amdgpu_connector->edid = NULL;
 }
 
 static int amdgpu_connector_ddc_get_modes(struct drm_connector *connector)
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index 76347ff..00075c2 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -606,10 +606,8 @@ int smu7_init(struct pp_smumgr *smumgr)
 
 int smu7_smu_fini(struct pp_smumgr *smumgr)
 {
-   if (smumgr->backend) {
-   kfree(smumgr->backend);
-   smumgr->backend = NULL;
-   }
+   kfree(smumgr->backend);
+   smumgr->backend = NULL;
cgs_rel_firmware(smumgr->device, CGS_UCODE_ID_SMU);
return 0;
 }
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
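
For readers unfamiliar with the cleanup in the patch above: kfree(NULL) is a no-op in the kernel, just as free(NULL) is in standard C, so the guarding "if" adds nothing. A minimal userspace sketch of the resulting pattern follows, using free() in place of kfree(). The struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a driver state struct; not from the patch. */
struct dyn_state {
    int *table;
};

/* Release the table. free(NULL) is defined as a no-op by the C standard,
 * so no NULL check is needed before the call. */
static void dyn_state_fini(struct dyn_state *s)
{
    free(s->table);
    s->table = NULL;    /* a repeated call then frees NULL, which is harmless */
}
```

Resetting the pointer to NULL after the free is what keeps the function safe to call more than once, which is why the patch keeps those assignments while dropping the checks.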


Re: [PATCH] drm/amd: Remove null check before kfree

2017-08-29 Thread Christian König

Am 29.08.2017 um 15:21 schrieb Himanshu Jha:

Kfree on NULL pointer is a no-op and therefore checking is redundant.

Signed-off-by: Himanshu Jha 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 6 ++
  drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c | 6 ++
  2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
index 8d1cf2d..f51b41f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
@@ -346,10 +346,8 @@ static void amdgpu_connector_free_edid(struct 
drm_connector *connector)
  {
struct amdgpu_connector *amdgpu_connector = 
to_amdgpu_connector(connector);
  
-	if (amdgpu_connector->edid) {

-   kfree(amdgpu_connector->edid);
-   amdgpu_connector->edid = NULL;
-   }
+   kfree(amdgpu_connector->edid);
+   amdgpu_connector->edid = NULL;
  }
  
  static int amdgpu_connector_ddc_get_modes(struct drm_connector *connector)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index 76347ff..00075c2 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -606,10 +606,8 @@ int smu7_init(struct pp_smumgr *smumgr)
  
  int smu7_smu_fini(struct pp_smumgr *smumgr)

  {
-   if (smumgr->backend) {
-   kfree(smumgr->backend);
-   smumgr->backend = NULL;
-   }
+   kfree(smumgr->backend);
+   smumgr->backend = NULL;
cgs_rel_firmware(smumgr->device, CGS_UCODE_ID_SMU);
return 0;
  }



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: Add write() method to VRAM debugfs entry (v2)

2017-08-29 Thread Christian König

Am 29.08.2017 um 14:49 schrieb Tom St Denis:

Allows writing data to vram via debugfs.

Signed-off-by: Tom St Denis 


Reviewed-by: Christian König 



(v2):  Call get_user before holding spinlock.
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 42 -
  1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c97a99427eea..e9a05186b889 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1671,10 +1671,50 @@ static ssize_t amdgpu_ttm_vram_read(struct file *f, 
char __user *buf,
return result;
  }
  
+static ssize_t amdgpu_ttm_vram_write(struct file *f, const char __user *buf,

+   size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result = 0;
+   int r;
+
+   if (size & 0x3 || *pos & 0x3)
+   return -EINVAL;
+
+   if (*pos >= adev->mc.mc_vram_size)
+   return -ENXIO;
+
+   while (size) {
+   unsigned long flags;
+   uint32_t value;
+
+   if (*pos >= adev->mc.mc_vram_size)
+   return result;
+
+   r = get_user(value, (uint32_t *)buf);
+   if (r)
+   return r;
+
+   spin_lock_irqsave(&adev->mmio_idx_lock, flags);
+   WREG32(mmMM_INDEX, ((uint32_t)*pos) | 0x8000);
+   WREG32(mmMM_INDEX_HI, *pos >> 31);
+   WREG32(mmMM_DATA, value);
+   spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+
+   result += 4;
+   buf += 4;
+   *pos += 4;
+   size -= 4;
+   }
+
+   return result;
+}
+
  static const struct file_operations amdgpu_ttm_vram_fops = {
.owner = THIS_MODULE,
.read = amdgpu_ttm_vram_read,
-   .llseek = default_llseek
+   .write = amdgpu_ttm_vram_write,
+   .llseek = default_llseek,
  };
  
  #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/5] drm/amdgpu: rework moved handling in the VM v2

2017-08-29 Thread Christian König
> I asked you how you keep the access to base.moved safe in the last
> thread,

Sorry, I missed that question somehow.

> I checked it just now, it depends on the shared resv lock.

Actually that's not 100% correct. The moved member is protected by the
BO's resv lock.

That can be the shared one (in the case of PDs/PTs/local BOs), but it can
also be a separate lock.

> I find most of the VM lists are protected by the shared resv lock. Can
> the status lock be removed as well? It seems everything in the VM is
> protected by that big resv lock.

Nope, that won't work. When amdgpu_vm_bo_invalidate() is called only the
BO itself is locked, but nothing from the VM.

So we need a separate lock to protect this list.

Thanks for the review,
Christian.

Am 29.08.2017 um 04:13 schrieb zhoucm1:

Hi Christian,

I asked you how you keep the access to base.moved safe in the last
thread. I checked it just now, it depends on the shared resv lock.

For the patch itself, Reviewed-by: Chunming Zhou 

Another question comes to mind: I find most of the VM lists are
protected by the shared resv lock. Can the status lock be removed as
well? It seems everything in the VM is protected by that big resv lock.


Regards,
David Zhou

On 2017年08月29日 02:50, Christian König wrote:

From: Christian König 

Instead of using the vm_state use a separate flag to note
that the BO was moved.

v2: reorder patches to avoid temporary lockless access

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  3 +++
  2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index f621dba..ee53293 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1787,10 +1787,16 @@ int amdgpu_vm_bo_update(struct amdgpu_device 
*adev,

  else
  flags = 0x0;
  -spin_lock(&vm->status_lock);
-if (!list_empty(&bo_va->base.vm_status))
+if (!clear && bo_va->base.moved) {
+bo_va->base.moved = false;
  list_splice_init(&bo_va->valids, &bo_va->invalids);
-spin_unlock(&vm->status_lock);
+
+} else {
+spin_lock(&vm->status_lock);
+if (!list_empty(&bo_va->base.vm_status))
+list_splice_init(&bo_va->valids, &bo_va->invalids);
+spin_unlock(&vm->status_lock);
+}
list_for_each_entry(mapping, &bo_va->invalids, list) {
  r = amdgpu_vm_bo_split_mapping(adev, exclusive, pages_addr, 
vm,
@@ -2418,6 +2424,7 @@ void amdgpu_vm_bo_invalidate(struct 
amdgpu_device *adev,

  struct amdgpu_vm_bo_base *bo_base;
list_for_each_entry(bo_base, &bo->va, bo_list) {
+bo_base->moved = true;
  spin_lock(&bo_base->vm->status_lock);
  if (list_empty(&bo_base->vm_status))
  list_add(&bo_base->vm_status,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h

index 9347d28..1b478e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -105,6 +105,9 @@ struct amdgpu_vm_bo_base {
/* protected by spinlock */
  struct list_headvm_status;
+
+/* protected by the BO being reserved */
+boolmoved;
  };
struct amdgpu_vm_pt {


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/amdgpu: Add write() method to VRAM debugfs entry (v2)

2017-08-29 Thread Tom St Denis
Allows writing data to vram via debugfs.

Signed-off-by: Tom St Denis 

(v2):  Call get_user before holding spinlock.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 42 -
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c97a99427eea..e9a05186b889 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1671,10 +1671,50 @@ static ssize_t amdgpu_ttm_vram_read(struct file *f, 
char __user *buf,
return result;
 }
 
+static ssize_t amdgpu_ttm_vram_write(struct file *f, const char __user *buf,
+   size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result = 0;
+   int r;
+
+   if (size & 0x3 || *pos & 0x3)
+   return -EINVAL;
+
+   if (*pos >= adev->mc.mc_vram_size)
+   return -ENXIO;
+
+   while (size) {
+   unsigned long flags;
+   uint32_t value;
+
+   if (*pos >= adev->mc.mc_vram_size)
+   return result;
+
+   r = get_user(value, (uint32_t *)buf);
+   if (r)
+   return r;
+
+   spin_lock_irqsave(&adev->mmio_idx_lock, flags);
+   WREG32(mmMM_INDEX, ((uint32_t)*pos) | 0x8000);
+   WREG32(mmMM_INDEX_HI, *pos >> 31);
+   WREG32(mmMM_DATA, value);
+   spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+
+   result += 4;
+   buf += 4;
+   *pos += 4;
+   size -= 4;
+   }
+
+   return result;
+}
+
 static const struct file_operations amdgpu_ttm_vram_fops = {
.owner = THIS_MODULE,
.read = amdgpu_ttm_vram_read,
-   .llseek = default_llseek
+   .write = amdgpu_ttm_vram_write,
+   .llseek = default_llseek,
 };
 
 #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
-- 
2.12.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: Add write() method to VRAM debugfs entry

2017-08-29 Thread Tom St Denis

On 29/08/17 08:46 AM, Christian König wrote:

Am 29.08.2017 um 14:43 schrieb Tom St Denis:

Allows writing data to vram via debugfs.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 41 
-

  1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

index c97a99427eea..cdc96d027707 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1671,10 +1671,49 @@ static ssize_t amdgpu_ttm_vram_read(struct 
file *f, char __user *buf,

  return result;
  }
+static ssize_t amdgpu_ttm_vram_write(struct file *f, const char 
__user *buf,

+size_t size, loff_t *pos)
+{
+struct amdgpu_device *adev = file_inode(f)->i_private;
+ssize_t result = 0;
+int r;
+
+if (size & 0x3 || *pos & 0x3)
+return -EINVAL;
+
+if (*pos >= adev->mc.mc_vram_size)
+return -ENXIO;
+
+while (size) {
+unsigned long flags;
+uint32_t value;
+
+if (*pos >= adev->mc.mc_vram_size)
+return result;
+
+spin_lock_irqsave(&adev->mmio_idx_lock, flags);
+WREG32(mmMM_INDEX, ((uint32_t)*pos) | 0x8000);
+WREG32(mmMM_INDEX_HI, *pos >> 31);
+r = get_user(value, (uint32_t *)buf);


Don't call get_user() while holding a spinlock.


+if (r)
+return r;


You forgot to unlock the spinlock.


Yup, my bad.  I'll fix that up.

Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/amdgpu: Add write() method to VRAM debugfs entry

2017-08-29 Thread Christian König

Am 29.08.2017 um 14:43 schrieb Tom St Denis:

Allows writing data to vram via debugfs.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 41 -
  1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c97a99427eea..cdc96d027707 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1671,10 +1671,49 @@ static ssize_t amdgpu_ttm_vram_read(struct file *f, 
char __user *buf,
return result;
  }
  
+static ssize_t amdgpu_ttm_vram_write(struct file *f, const char __user *buf,

+   size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result = 0;
+   int r;
+
+   if (size & 0x3 || *pos & 0x3)
+   return -EINVAL;
+
+   if (*pos >= adev->mc.mc_vram_size)
+   return -ENXIO;
+
+   while (size) {
+   unsigned long flags;
+   uint32_t value;
+
+   if (*pos >= adev->mc.mc_vram_size)
+   return result;
+
+   spin_lock_irqsave(&adev->mmio_idx_lock, flags);
+   WREG32(mmMM_INDEX, ((uint32_t)*pos) | 0x8000);
+   WREG32(mmMM_INDEX_HI, *pos >> 31);
+   r = get_user(value, (uint32_t *)buf);


Don't call get_user() while holding a spinlock.


+   if (r)
+   return r;


You forgot to unlock the spinlock.

Regards,
Christian.


+   WREG32(mmMM_DATA, value);
+   spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+
+   result += 4;
+   buf += 4;
+   *pos += 4;
+   size -= 4;
+   }
+
+   return result;
+}
+
  static const struct file_operations amdgpu_ttm_vram_fops = {
.owner = THIS_MODULE,
.read = amdgpu_ttm_vram_read,
-   .llseek = default_llseek
+   .write = amdgpu_ttm_vram_write,
+   .llseek = default_llseek,
  };
  
  #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
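
The two review comments above boil down to one pattern: do the fallible (and, in the kernel, possibly sleeping) fetch of user data before taking the spinlock, so that every early return happens with the lock not held. Here is a userspace sketch of that ordering, using a C11 atomic_flag as a toy spinlock. All names in it (idx_lock, fake_vram, fetch_word, write_word) are invented for the illustration; fetch_word only plays the role of get_user() and the array plays the role of the MM_INDEX/MM_DATA register window.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_VRAM_WORDS 16

static atomic_flag idx_lock = ATOMIC_FLAG_INIT;   /* toy spinlock */
static uint32_t fake_vram[FAKE_VRAM_WORDS];       /* hypothetical backing store */

static void idx_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&idx_lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void idx_lock_release(void)
{
    atomic_flag_clear_explicit(&idx_lock, memory_order_release);
}

/* Stand-in for get_user(): copies one word and can fail. */
static int fetch_word(uint32_t *dst, const void *src)
{
    if (src == NULL)
        return -1;
    memcpy(dst, src, sizeof(*dst));
    return 0;
}

/* Store one word. The fetch happens first, so the error path below
 * returns without the lock ever having been taken. */
static int write_word(size_t index, const void *src)
{
    uint32_t value;

    if (index >= FAKE_VRAM_WORDS)
        return -1;
    if (fetch_word(&value, src) != 0)
        return -1;

    idx_lock_acquire();
    fake_vram[index] = value;   /* only the register-style access is locked */
    idx_lock_release();
    return 0;
}
```

With this ordering the critical section contains nothing that can fail, so there is no error path that needs its own unlock.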


[PATCH] drm/amd/amdgpu: Add write() method to VRAM debugfs entry

2017-08-29 Thread Tom St Denis
Allows writing data to vram via debugfs.

Signed-off-by: Tom St Denis 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 41 -
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c97a99427eea..cdc96d027707 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1671,10 +1671,49 @@ static ssize_t amdgpu_ttm_vram_read(struct file *f, 
char __user *buf,
return result;
 }
 
+static ssize_t amdgpu_ttm_vram_write(struct file *f, const char __user *buf,
+   size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result = 0;
+   int r;
+
+   if (size & 0x3 || *pos & 0x3)
+   return -EINVAL;
+
+   if (*pos >= adev->mc.mc_vram_size)
+   return -ENXIO;
+
+   while (size) {
+   unsigned long flags;
+   uint32_t value;
+
+   if (*pos >= adev->mc.mc_vram_size)
+   return result;
+
+   spin_lock_irqsave(&adev->mmio_idx_lock, flags);
+   WREG32(mmMM_INDEX, ((uint32_t)*pos) | 0x8000);
+   WREG32(mmMM_INDEX_HI, *pos >> 31);
+   r = get_user(value, (uint32_t *)buf);
+   if (r)
+   return r;
+   WREG32(mmMM_DATA, value);
+   spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+
+   result += 4;
+   buf += 4;
+   *pos += 4;
+   size -= 4;
+   }
+
+   return result;
+}
+
 static const struct file_operations amdgpu_ttm_vram_fops = {
.owner = THIS_MODULE,
.read = amdgpu_ttm_vram_read,
-   .llseek = default_llseek
+   .write = amdgpu_ttm_vram_write,
+   .llseek = default_llseek,
 };
 
 #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
-- 
2.12.0
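
For anyone following the write() path above, here is a minimal userspace sketch of the same validation and 4-byte loop. It is an illustration only: sim_vram_write(), split_mm_index() and the byte array standing in for VRAM are hypothetical names, and the index split (bit 31 set in the low word, the remaining address bits in the high word) is an assumption read off the patch rather than hardware documentation.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split a byte offset into the two index words the
 * patch programs (low 31 bits with bit 31 set, the rest in the high word). */
static void split_mm_index(uint64_t pos, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)pos | 0x80000000u;
	*hi = (uint32_t)(pos >> 31);
}

/* Userspace stand-in for the debugfs write(): copies 32-bit words from buf
 * into a byte array that plays the role of VRAM. Returns the number of bytes
 * written or a negative errno. Assumes vram_size is a multiple of 4. */
static long sim_vram_write(uint8_t *vram, uint64_t vram_size,
			   const uint8_t *buf, size_t size, uint64_t *pos)
{
	long result = 0;

	/* Only whole, aligned dwords are accepted. */
	if ((size & 0x3) || (*pos & 0x3))
		return -EINVAL;

	if (*pos >= vram_size)
		return -ENXIO;

	while (size) {
		uint32_t value;

		/* Running off the end yields a short write, not an error. */
		if (*pos >= vram_size)
			return result;

		/* Fetch the user data first, then touch the "hardware". */
		memcpy(&value, buf, sizeof(value));
		memcpy(vram + *pos, &value, sizeof(value));

		result += 4;
		buf += 4;
		*pos += 4;
		size -= 4;
	}

	return result;
}
```

A write that starts inside the buffer but extends past its end returns the number of bytes that fit, which is the same short-write behaviour the loop in the patch has.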



Re: [PATCH xf86-video-ati 5/6] Make all active CRTCs scan out an all-black framebuffer in LeaveVT

2017-08-29 Thread Emil Velikov
Hi Michel,

On 28 August 2017 at 10:23, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> And destroy all other FBs. This is so that other DRM masters can only
> get access to this all-black FB, not to any other FB we created, while
> we're switched away and not DRM master.
>
Isn't the issue applicable overall, be that in X, Wayland compositors, or
anything else?

IIRC the vmwgfx kernel driver has extra locking [1] in order to
address that.
Would a similar approach be applicable for radeon/amdgpu?

I'm not saying that this is bad or not needed. I'm just wondering about a
"perfect" long-term solution.
That is, unless I've lost the plot and the two are completely unrelated.

Thanks
Emil

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c?h=v4.13-rc7#n1054


[PATCH 2/2] drm/amd/powerplay: set uvd/vce/nb/mclk level as pstate requested

2017-08-29 Thread Rex Zhu
Change-Id: Ibd74590c3fe9dbdeac924b697d18448bddbefcdb
---
 drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
index a125e30..10bf687 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
@@ -1359,6 +1359,11 @@ static int cz_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
if (level == hwmgr->dpm_level)
return 0;
 
+   if (level == AMD_DPM_FORCED_LEVEL_PROFILE_PEAK)
+   cz_nbdpm_pstate_enable_disable(hwmgr, false, false);
+   else if (level == AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD)
+   cz_nbdpm_pstate_enable_disable(hwmgr, false, true);
+
switch (level) {
case AMD_DPM_FORCED_LEVEL_HIGH:
case AMD_DPM_FORCED_LEVEL_PROFILE_PEAK:
@@ -1435,7 +1440,8 @@ int cz_dpm_update_uvd_dpm(struct pp_hwmgr *hwmgr, bool bgate)
if (!bgate) {
/* Stable Pstate is enabled and we need to set the UVD DPM to highest level */
if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps,
-PHM_PlatformCaps_StablePState)) {
+PHM_PlatformCaps_StablePState)
+   || hwmgr->en_umd_pstate) {
cz_hwmgr->uvd_dpm.hard_min_clk =
   ptable->entries[ptable->count - 1].vclk;
 
@@ -1464,7 +1470,8 @@ int  cz_dpm_update_vce_dpm(struct pp_hwmgr *hwmgr)
 
/* Stable Pstate is enabled and we need to set the VCE DPM to highest level */
if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps,
-PHM_PlatformCaps_StablePState)) {
+   PHM_PlatformCaps_StablePState)
+   || hwmgr->en_umd_pstate) {
cz_hwmgr->vce_dpm.hard_min_clk =
  ptable->entries[ptable->count - 1].ecclk;
 
-- 
1.9.1
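
The UVD/VCE hunks above boil down to one decision: when either the stable P-state capability or the UMD P-state flag is set, the hard minimum clock is pinned to the last (highest) entry of the clock table. A minimal sketch of that selection follows. The struct and function names are hypothetical, and keeping the current minimum in the other case is an assumption, since the else branch is not visible in the diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a clock voltage dependency table. */
struct clk_table {
	size_t count;
	const uint32_t *clk;	/* sorted from lowest to highest */
};

/* Return the hard minimum clock to program. With stable P-state or UMD
 * P-state active, pin to the highest table entry; otherwise keep the
 * current minimum (assumption, see the lead-in). */
static uint32_t pick_hard_min_clk(const struct clk_table *t,
				  bool stable_pstate_cap,
				  bool en_umd_pstate,
				  uint32_t current_min)
{
	if ((stable_pstate_cap || en_umd_pstate) && t->count > 0)
		return t->clk[t->count - 1];

	return current_min;
}
```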



[PATCH 1/2] drm/amd/powerplay: add UMD P-state in powerplay.

2017-08-29 Thread Rex Zhu
This feature lets the UMD run benchmarks in a power state
that is as steady as possible, so the KMD needs to keep
the power state as stable as possible.
The KMD currently supports four levels:
profile_standard, profile_peak, profile_min_sclk, profile_min_mclk

Move the common code to amd_powerplay.c.

Change-Id: Ie06c122199b7246f5b1951c354cf502bbed27485
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/amd_powerplay.c  | 40 +-
 drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c | 24 +
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c   | 25 +-
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 25 +-
 drivers/gpu/drm/amd/powerplay/inc/hwmgr.h  |  3 +-
 5 files changed, 44 insertions(+), 73 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
index f73e80c..310f34a 100644
--- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
@@ -30,7 +30,7 @@
 #include "pp_instance.h"
 #include "power_state.h"
 #include "eventmanager.h"
-
+#include "eventtasks.h"
 
 static inline int pp_check(struct pp_instance *handle)
 {
@@ -324,12 +324,44 @@ static int pp_dpm_fw_loading_complete(void *handle)
return 0;
 }
 
+static void pp_dpm_en_umd_pstate(struct pp_hwmgr  *hwmgr,
+   enum amd_dpm_forced_level level)
+{
+   uint32_t profile_mode_mask = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD |
+   AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK |
+   AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK |
+   AMD_DPM_FORCED_LEVEL_PROFILE_PEAK;
+
+   if (!(hwmgr->dpm_level & profile_mode_mask)) {
+   /* enter umd pstate, save current level, disable gfx cg*/
+   if (level & profile_mode_mask) {
+   hwmgr->saved_dpm_level = hwmgr->dpm_level;
+   hwmgr->en_umd_pstate = true;
+   cgs_set_clockgating_state(hwmgr->device,
+   AMD_IP_BLOCK_TYPE_GFX,
+   AMD_CG_STATE_UNGATE);
+   }
+   } else {
+   /* exit umd pstate, restore level, enable gfx cg*/
+   if (!(level & profile_mode_mask)) {
+   if (level == AMD_DPM_FORCED_LEVEL_PROFILE_EXIT)
+   level = hwmgr->saved_dpm_level;
+   hwmgr->en_umd_pstate = false;
+   cgs_set_clockgating_state(hwmgr->device,
+   AMD_IP_BLOCK_TYPE_GFX,
+   AMD_CG_STATE_GATE);
+   }
+   }
+   return;
+}
+
 static int pp_dpm_force_performance_level(void *handle,
enum amd_dpm_forced_level level)
 {
struct pp_hwmgr  *hwmgr;
struct pp_instance *pp_handle = (struct pp_instance *)handle;
int ret = 0;
+   struct pem_event_data data = { {0} };
 
ret = pp_check(pp_handle);
 
@@ -338,13 +370,19 @@ static int pp_dpm_force_performance_level(void *handle,
 
hwmgr = pp_handle->hwmgr;
 
+   if (level == hwmgr->dpm_level)
+   return 0;
+
if (hwmgr->hwmgr_func->force_dpm_level == NULL) {
pr_info("%s was not implemented.\n", __func__);
return 0;
}
 
mutex_lock(&pp_handle->pp_lock);
+   pp_dpm_en_umd_pstate(hwmgr, level);
+   pem_task_adjust_power_state(pp_handle->eventmgr, &data);
hwmgr->hwmgr_func->force_dpm_level(hwmgr, level);
+
mutex_unlock(&pp_handle->pp_lock);
return 0;
 }
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
index bc839ff..a125e30 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
@@ -1355,31 +1355,9 @@ static int cz_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
 {
uint32_t sclk = 0;
int ret = 0;
-   uint32_t profile_mode_mask = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD |
-   AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK |
-   AMD_DPM_FORCED_LEVEL_PROFILE_PEAK;
 
if (level == hwmgr->dpm_level)
-   return ret;
-
-   if (!(hwmgr->dpm_level & profile_mode_mask)) {
-   /* enter profile mode, save current level, disable gfx cg*/
-   if (level & profile_mode_mask) {
-   hwmgr->saved_dpm_level = hwmgr->dpm_level;
-   cgs_set_clockgating_state(hwmgr->device,
-   AMD_IP_BLOCK_TYPE_GFX,
-   AMD_CG_STATE_UNGATE);
-   }
-   } else {
-   /* exit profile mode, restore level, enab
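
The enter/exit logic that this patch moves into pp_dpm_en_umd_pstate() can be sketched as a small state machine. Everything below is illustrative: the LEVEL_* values, struct pstate_ctx and umd_pstate_transition() are hypothetical names modelled on the patch, and the bit values are assumptions. The sketch takes the requested level by pointer so that a restored level on profile exit is visible to the caller; that is a design choice of this example, not something the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit-flag levels, modelled on AMD_DPM_FORCED_LEVEL_*. */
enum level {
	LEVEL_AUTO             = 0x1,
	LEVEL_PROFILE_STANDARD = 0x10,
	LEVEL_PROFILE_MIN_SCLK = 0x20,
	LEVEL_PROFILE_MIN_MCLK = 0x40,
	LEVEL_PROFILE_PEAK     = 0x80,
	LEVEL_PROFILE_EXIT     = 0x100,
};

#define PROFILE_MODE_MASK (LEVEL_PROFILE_STANDARD | LEVEL_PROFILE_MIN_SCLK | \
			   LEVEL_PROFILE_MIN_MCLK | LEVEL_PROFILE_PEAK)

struct pstate_ctx {
	uint32_t dpm_level;		/* level currently in effect */
	uint32_t saved_dpm_level;	/* level to restore on profile exit */
	bool en_umd_pstate;
	bool gfx_clockgating;		/* stands in for the GFX CG state */
};

static void umd_pstate_transition(struct pstate_ctx *c, uint32_t *level)
{
	if (!(c->dpm_level & PROFILE_MODE_MASK)) {
		/* Entering a profile level: remember where we came from
		 * and ungate GFX clocks. */
		if (*level & PROFILE_MODE_MASK) {
			c->saved_dpm_level = c->dpm_level;
			c->en_umd_pstate = true;
			c->gfx_clockgating = false;
		}
	} else if (!(*level & PROFILE_MODE_MASK)) {
		/* Leaving a profile level: restore on explicit exit and
		 * re-enable GFX clock gating. */
		if (*level == LEVEL_PROFILE_EXIT)
			*level = c->saved_dpm_level;
		c->en_umd_pstate = false;
		c->gfx_clockgating = true;
	}
}
```

The caller is expected to store the resulting level in dpm_level once the hardware has been programmed, as force_dpm_level would.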

[PATCH xf86-video-ati] Use a timer for unreferencing the all-black FB

2017-08-29 Thread Michel Dänzer
From: Michel Dänzer 

The timer fires 1 second after LeaveVT. This gives the next DRM master
enough time to set up scanout of its own buffers.

Fixes prolonged intermittent black screen when switching from Xorg to
e.g. the GDM Wayland mode login VT.

Fixes: 06a465484101 ("Make all active CRTCs scan out an all-black
  framebuffer in LeaveVT")
Signed-off-by: Michel Dänzer 
---
 src/radeon_kms.c | 40 +++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/src/radeon_kms.c b/src/radeon_kms.c
index 5410c4208..01594c6ca 100644
--- a/src/radeon_kms.c
+++ b/src/radeon_kms.c
@@ -1150,7 +1150,6 @@ static void RADEONBlockHandler_KMS(BLOCKHANDLER_ARGS_DECL)
 {
 SCREEN_PTR(arg);
 ScrnInfoPtr pScrn = xf86ScreenToScrn(pScreen);
-RADEONEntPtr pRADEONEnt = RADEONEntPriv(pScrn);
 RADEONInfoPtr info = RADEONPTR(pScrn);
 xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(pScrn);
 int c;
@@ -1159,19 +1158,8 @@ static void RADEONBlockHandler_KMS(BLOCKHANDLER_ARGS_DECL)
 (*pScreen->BlockHandler) (BLOCKHANDLER_ARGS);
 pScreen->BlockHandler = RADEONBlockHandler_KMS;
 
-if (!xf86ScreenToScrn(radeon_master_screen(pScreen))->vtSema) {
-   /* Unreference the all-black FB created by RADEONLeaveVT_KMS. After
-* this, there should be no FB left created by this driver.
-*/
-   for (c = 0; c < xf86_config->num_crtc; c++) {
-   drmmode_crtc_private_ptr drmmode_crtc =
-   xf86_config->crtc[c]->driver_private;
-
-   drmmode_fb_reference(pRADEONEnt->fd, &drmmode_crtc->fb, NULL);
-   }
-
+if (!xf86ScreenToScrn(radeon_master_screen(pScreen))->vtSema)
return;
-}
 
 if (!radeon_is_gpu_screen(pScreen))
 {
@@ -2473,6 +2461,30 @@ Bool RADEONEnterVT_KMS(VT_FUNC_ARGS_DECL)
 return TRUE;
 }
 
+static
+CARD32 cleanup_black_fb(OsTimerPtr timer, CARD32 now, pointer data)
+{
+ScreenPtr screen = data;
+ScrnInfoPtr scrn = xf86ScreenToScrn(screen);
+RADEONEntPtr pRADEONEnt = RADEONEntPriv(scrn);
+xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(scrn);
+int c;
+
+if (xf86ScreenToScrn(radeon_master_screen(screen))->vtSema)
+   return 0;
+
+/* Unreference the all-black FB created by RADEONLeaveVT_KMS. After
+ * this, there should be no FB left created by this driver.
+ */
+for (c = 0; c < xf86_config->num_crtc; c++) {
+   drmmode_crtc_private_ptr drmmode_crtc =
+   xf86_config->crtc[c]->driver_private;
+
+   drmmode_fb_reference(pRADEONEnt->fd, &drmmode_crtc->fb, NULL);
+}
+
+return 0;
+}
 
 static void
 pixmap_unref_fb(void *value, XID id, void *cdata)
@@ -2569,6 +2581,8 @@ void RADEONLeaveVT_KMS(VT_FUNC_ARGS_DECL)
 }
 pixmap_unref_fb(pScreen->GetScreenPixmap(pScreen), None, pRADEONEnt);
 
+TimerSet(NULL, 0, 1000, cleanup_black_fb, pScreen);
+
 xf86_hide_cursors (pScrn);
 
 radeon_drop_drm_master(pScrn);
-- 
2.14.1
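
The deferred cleanup described above can be sketched without the X server: a one-shot callback that does nothing if the VT was regained in the meantime and otherwise drops every per-CRTC reference to the black FB. All names here (struct fb, struct screen_state, cleanup_black_fb_sim) are hypothetical, and the reference counting is reduced to a plain integer for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct fb {
	int refcnt;
};

struct screen_state {
	bool vt_sema;		/* true while this server owns the VT */
	struct fb *crtc_fb[4];	/* per-CRTC framebuffer reference */
	size_t num_crtc;	/* must not exceed 4 in this sketch */
};

/* Drop the reference held in *slot, if any, and clear the slot. */
static void fb_unref(struct fb **slot)
{
	if (*slot) {
		(*slot)->refcnt--;
		*slot = NULL;
	}
}

/* One-shot timer callback. Returning 0 means "do not re-arm", which is
 * the convention the callback in the patch follows as well. */
static uint32_t cleanup_black_fb_sim(struct screen_state *s)
{
	size_t c;

	/* We switched back before the timer fired: keep the FBs. */
	if (s->vt_sema)
		return 0;

	for (c = 0; c < s->num_crtc; c++)
		fb_unref(&s->crtc_fb[c]);

	return 0;
}
```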



[PATCH xf86-video-amdgpu 4/6] Create amdgpu_master_screen helper

2017-08-29 Thread Michel Dänzer
From: Michel Dänzer 

Preparatory, no functional change intended yet.

(Ported from radeon commit 7f0cd68d1b0c132e32ae736371bce3e12ed33c7a)

Signed-off-by: Michel Dänzer 
---
 src/amdgpu_drv.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/amdgpu_drv.h b/src/amdgpu_drv.h
index 75c2a2653..8b378b18a 100644
--- a/src/amdgpu_drv.h
+++ b/src/amdgpu_drv.h
@@ -171,6 +171,15 @@ typedef enum {
 #define amdgpu_is_gpu_screen(screen) (screen)->isGPU
 #define amdgpu_is_gpu_scrn(scrn) (scrn)->is_gpu
 
+static inline ScreenPtr
+amdgpu_master_screen(ScreenPtr screen)
+{
+   if (screen->current_master)
+   return screen->current_master;
+
+   return screen;
+}
+
 static inline ScreenPtr
 amdgpu_dirty_master(PixmapDirtyUpdatePtr dirty)
 {
@@ -180,10 +189,7 @@ amdgpu_dirty_master(PixmapDirtyUpdatePtr dirty)
ScreenPtr screen = dirty->src->drawable.pScreen;
 #endif
 
-   if (screen->current_master)
-   return screen->current_master;
-
-   return screen;
+   return amdgpu_master_screen(screen);
 }
 
 static inline Bool
-- 
2.14.1
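
As a standalone illustration of what the helper factors out, the sketch below resolves a screen to its master, falling back to the screen itself when it has none. struct screen is a hypothetical stand-in with only the field that matters here; it is not the X server's ScreenRec.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for a screen record. */
struct screen {
	struct screen *current_master;	/* NULL when this screen is a master */
	int id;
};

/* Return the screen's master, or the screen itself if it has none. */
static struct screen *master_screen(struct screen *s)
{
	if (s->current_master)
		return s->current_master;

	return s;
}
```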



[PATCH xf86-video-amdgpu 6/6] Remove drmmode_scanout_free

2017-08-29 Thread Michel Dänzer
From: Michel Dänzer 

Not used anymore.

(Cherry picked from radeon commit e4a3df19d588a4310fcb889ef34e205d0e92e4d7)

Signed-off-by: Michel Dänzer 
---
 src/drmmode_display.c | 10 --
 src/drmmode_display.h |  1 -
 2 files changed, 11 deletions(-)

diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index 6057699bf..9c838a8b9 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -490,16 +490,6 @@ drmmode_crtc_scanout_free(drmmode_crtc_private_ptr drmmode_crtc)
DamageDestroy(drmmode_crtc->scanout_damage);
 }
 
-void
-drmmode_scanout_free(ScrnInfoPtr scrn)
-{
-   xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(scrn);
-   int c;
-
-   for (c = 0; c < xf86_config->num_crtc; c++)
-   drmmode_crtc_scanout_free(xf86_config->crtc[c]->driver_private);
-}
-
 PixmapPtr
 drmmode_crtc_scanout_create(xf86CrtcPtr crtc, struct drmmode_scanout *scanout,
int width, int height)
diff --git a/src/drmmode_display.h b/src/drmmode_display.h
index eff342942..03134f0c9 100644
--- a/src/drmmode_display.h
+++ b/src/drmmode_display.h
@@ -208,7 +208,6 @@ extern Bool drmmode_setup_colormap(ScreenPtr pScreen, ScrnInfoPtr pScrn);
 
 extern void drmmode_crtc_scanout_destroy(drmmode_ptr drmmode,
 struct drmmode_scanout *scanout);
-extern void drmmode_scanout_free(ScrnInfoPtr scrn);
 void drmmode_crtc_scanout_free(drmmode_crtc_private_ptr drmmode_crtc);
 PixmapPtr drmmode_crtc_scanout_create(xf86CrtcPtr crtc,
  struct drmmode_scanout *scanout,
-- 
2.14.1



[PATCH xf86-video-amdgpu 1/6] Create amdgpu_pixmap_clear helper

2017-08-29 Thread Michel Dänzer
From: Michel Dänzer 

Preparatory, no functional change intended yet.

(Ported from radeon commit 3f6210ca2c8ef60d59efc8139151d3b9838bb875)

Signed-off-by: Michel Dänzer 
---
 src/amdgpu_bo_helper.c | 20 
 src/amdgpu_bo_helper.h |  2 ++
 src/drmmode_display.c  | 14 +-
 3 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/src/amdgpu_bo_helper.c b/src/amdgpu_bo_helper.c
index 7acd0057e..ee52e0c24 100644
--- a/src/amdgpu_bo_helper.c
+++ b/src/amdgpu_bo_helper.c
@@ -120,6 +120,26 @@ struct amdgpu_buffer *amdgpu_alloc_pixmap_bo(ScrnInfoPtr pScrn, int width,
return pixmap_buffer;
 }
 
+/* Clear the pixmap contents to black */
+void
+amdgpu_pixmap_clear(PixmapPtr pixmap)
+{
+   ScreenPtr screen = pixmap->drawable.pScreen;
+   AMDGPUInfoPtr info = AMDGPUPTR(xf86ScreenToScrn(screen));
+   GCPtr gc = GetScratchGC(pixmap->drawable.depth, screen);
+   xRectangle rect;
+
+   ValidateGC(&pixmap->drawable, gc);
+   rect.x = 0;
+   rect.y = 0;
+   rect.width = pixmap->drawable.width;
+   rect.height = pixmap->drawable.height;
+   info->force_accel = TRUE;
+   gc->ops->PolyFillRect(&pixmap->drawable, gc, 1, &rect);
+   info->force_accel = FALSE;
+   FreeScratchGC(gc);
+}
+
 Bool amdgpu_bo_get_handle(struct amdgpu_buffer *bo, uint32_t *handle)
 {
if (bo->flags & AMDGPU_BO_FLAGS_GBM) {
diff --git a/src/amdgpu_bo_helper.h b/src/amdgpu_bo_helper.h
index 26fca1604..4f6b628a6 100644
--- a/src/amdgpu_bo_helper.h
+++ b/src/amdgpu_bo_helper.h
@@ -29,6 +29,8 @@ extern struct amdgpu_buffer *amdgpu_alloc_pixmap_bo(ScrnInfoPtr pScrn, int width
 int height, int depth, int usage_hint,
 int bitsPerPixel, int *new_pitch);
 
+extern void amdgpu_pixmap_clear(PixmapPtr pixmap);
+
 extern Bool amdgpu_bo_get_handle(struct amdgpu_buffer *bo, uint32_t *handle);
 
 extern uint64_t amdgpu_pixmap_get_tiling_info(PixmapPtr pixmap);
diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index 17efde8e8..285eb0a0f 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -2116,8 +2116,6 @@ static Bool drmmode_xf86crtc_resize(ScrnInfoPtr scrn, int width, int height)
PixmapPtr ppix = screen->GetScreenPixmap(screen);
void *fb_shadow;
int hint = 0;
-   xRectangle rect;
-   GCPtr gc;
 
if (scrn->virtualX == width && scrn->virtualY == height)
return TRUE;
@@ -2181,17 +2179,7 @@ static Bool drmmode_xf86crtc_resize(ScrnInfoPtr scrn, int width, int height)
goto fail;
}
 
-   /* Clear new buffer */
-   gc = GetScratchGC(ppix->drawable.depth, scrn->pScreen);
-   ValidateGC(&ppix->drawable, gc);
-   rect.x = 0;
-   rect.y = 0;
-   rect.width = width;
-   rect.height = height;
-   info->force_accel = TRUE;
-   (*gc->ops->PolyFillRect)(&ppix->drawable, gc, 1, &rect);
-   info->force_accel = FALSE;
-   FreeScratchGC(gc);
+   amdgpu_pixmap_clear(ppix);
amdgpu_glamor_finish(scrn);
 
for (i = 0; i < xf86_config->num_crtc; i++) {
-- 
2.14.1



[PATCH xf86-video-amdgpu 2/6] Create drmmode_set_mode helper

2017-08-29 Thread Michel Dänzer
From: Michel Dänzer 

Preparatory, no functional change intended yet.

(Ported from radeon commit 4bc992c31059eb50e22df4ebf5b92d08411f41ef)

Signed-off-by: Michel Dänzer 
---
 src/drmmode_display.c | 82 ++-
 src/drmmode_display.h |  3 ++
 2 files changed, 52 insertions(+), 33 deletions(-)

diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index 285eb0a0f..6092805fd 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -796,6 +796,52 @@ drmmode_crtc_gamma_do_set(xf86CrtcPtr crtc, uint16_t *red, uint16_t *green,
size, red, green, blue);
 }
 
+Bool
+drmmode_set_mode(xf86CrtcPtr crtc, struct drmmode_fb *fb, DisplayModePtr mode,
+int x, int y)
+{
+   ScrnInfoPtr scrn = crtc->scrn;
+   AMDGPUEntPtr pAMDGPUEnt = AMDGPUEntPriv(scrn);
+   xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(scrn);
+   drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private;
+   uint32_t *output_ids = calloc(sizeof(uint32_t), xf86_config->num_output);
+   int output_count = 0;
+   drmModeModeInfo kmode;
+   Bool ret;
+   int i;
+
+   if (!output_ids)
+   return FALSE;
+
+   for (i = 0; i < xf86_config->num_output; i++) {
+   xf86OutputPtr output = xf86_config->output[i];
+   drmmode_output_private_ptr drmmode_output = output->driver_private;
+
+   if (output->crtc != crtc)
+   continue;
+
+   output_ids[output_count] = drmmode_output->mode_output->connector_id;
+   output_count++;
+   }
+
+   drmmode_ConvertToKMode(scrn, &kmode, mode);
+
+   ret = drmModeSetCrtc(pAMDGPUEnt->fd,
+drmmode_crtc->mode_crtc->crtc_id,
+fb->handle, x, y, output_ids,
+output_count, &kmode) == 0;
+
+   if (ret) {
+   drmmode_fb_reference(pAMDGPUEnt->fd, &drmmode_crtc->fb, fb);
+   } else {
+   xf86DrvMsg(scrn->scrnIndex, X_ERROR,
+  "failed to set mode: %s\n", strerror(errno));
+   }
+
+   free(output_ids);
+   return ret;
+}
+
 static Bool
 drmmode_set_mode_major(xf86CrtcPtr crtc, DisplayModePtr mode,
   Rotation rotation, int x, int y)
@@ -811,12 +857,9 @@ drmmode_set_mode_major(xf86CrtcPtr crtc, DisplayModePtr mode,
int saved_x, saved_y;
Rotation saved_rotation;
DisplayModeRec saved_mode;
-   uint32_t *output_ids = NULL;
-   int output_count = 0;
Bool ret = FALSE;
int i;
struct drmmode_fb *fb = NULL;
-   drmModeModeInfo kmode;
 
/* The root window contents may be undefined before the WindowExposures
 * hook is called for it, so bail if we get here before that
@@ -835,23 +878,6 @@ drmmode_set_mode_major(xf86CrtcPtr crtc, DisplayModePtr mode,
crtc->y = y;
crtc->rotation = rotation;
 
-   output_ids = calloc(sizeof(uint32_t), xf86_config->num_output);
-   if (!output_ids)
-   goto done;
-
-   for (i = 0; i < xf86_config->num_output; i++) {
-   xf86OutputPtr output = xf86_config->output[i];
-   drmmode_output_private_ptr drmmode_output;
-
-   if (output->crtc != crtc)
-   continue;
-
-   drmmode_output = output->driver_private;
-   output_ids[output_count] =
-   drmmode_output->mode_output->connector_id;
-   output_count++;
-   }
-
if (!drmmode_handle_transform(crtc))
goto done;
 
@@ -862,8 +888,6 @@ drmmode_set_mode_major(xf86CrtcPtr crtc, DisplayModePtr mode,
drmmode_crtc_gamma_do_set(crtc, crtc->gamma_red, crtc->gamma_green,
  crtc->gamma_blue, crtc->gamma_size);
 
-   drmmode_ConvertToKMode(crtc->scrn, &kmode, mode);
-
 #ifdef AMDGPU_PIXMAP_SHARING
if (drmmode_crtc->prime_scanout_pixmap) {
drmmode_crtc_prime_scanout_update(crtc, mode, scanout_id,
@@ -907,17 +931,10 @@ drmmode_set_mode_major(xf86CrtcPtr crtc, DisplayModePtr mode,
drmmode_crtc_wait_pending_event(drmmode_crtc, pAMDGPUEnt->fd,
drmmode_crtc->flip_pending);
 
-   if (drmModeSetCrtc(pAMDGPUEnt->fd,
-  drmmode_crtc->mode_crtc->crtc_id,
-  fb->handle, x, y, output_ids,
-  output_count, &kmode) != 0) {
-   xf86DrvMsg(crtc->scrn->scrnIndex, X_ERROR,
-  "failed to set mode: %s\n", strerror(errno));
+   if (!drmmode_set_mode(crtc, fb, mode, x, 

[PATCH xf86-video-amdgpu 5/6] Make all active CRTCs scan out an all-black framebuffer in LeaveVT

2017-08-29 Thread Michel Dänzer
From: Michel Dänzer 

And destroy all other FBs. This is so that other DRM masters can only
get access to this all-black FB, not to any other FB we created, while
we're switched away and not DRM master.

Fixes: b09fde0d81e0 ("Use reference counting for tracking KMS
  framebuffer lifetimes")
(Ported from radeon commit 06a465484101f21e99d3a0a62fb03440bcaff93e)

Signed-off-by: Michel Dänzer 
---
 src/amdgpu_kms.c  | 99 ++-
 src/drmmode_display.c |  4 +--
 src/drmmode_display.h |  4 +++
 3 files changed, 97 insertions(+), 10 deletions(-)

diff --git a/src/amdgpu_kms.c b/src/amdgpu_kms.c
index e0b735819..c3613eb8d 100644
--- a/src/amdgpu_kms.c
+++ b/src/amdgpu_kms.c
@@ -32,6 +32,7 @@
 #include 
 /* Driver data structures */
 #include "amdgpu_drv.h"
+#include "amdgpu_bo_helper.h"
 #include "amdgpu_drm_queue.h"
 #include "amdgpu_glamor.h"
 #include "amdgpu_probe.h"
@@ -1052,11 +1053,10 @@ static void AMDGPUBlockHandler_KMS(BLOCKHANDLER_ARGS_DECL)
(*pScreen->BlockHandler) (BLOCKHANDLER_ARGS);
pScreen->BlockHandler = AMDGPUBlockHandler_KMS;
 
-   if (!pScrn->vtSema) {
-#if XORG_VERSION_CURRENT < XORG_VERSION_NUMERIC(1,19,0,0,0)
-   if (info->use_glamor)
-   amdgpu_glamor_flush(pScrn);
-#endif
+   if (!xf86ScreenToScrn(amdgpu_master_screen(pScreen))->vtSema) {
+   /* Unreference the all-black FB created by AMDGPULeaveVT_KMS. After
+* this, there should be no FB left created by this driver.
+*/
 
for (c = 0; c < xf86_config->num_crtc; c++) {
drmmode_crtc_private_ptr drmmode_crtc =
@@ -1967,21 +1967,104 @@ Bool AMDGPUEnterVT_KMS(VT_FUNC_ARGS_DECL)
return TRUE;
 }
 
+static void
+pixmap_unref_fb(void *value, XID id, void *cdata)
+{
+   PixmapPtr pixmap = value;
+   AMDGPUEntPtr pAMDGPUEnt = cdata;
+   struct drmmode_fb **fb_ptr = amdgpu_pixmap_get_fb_ptr(pixmap);
+
+   if (fb_ptr)
+   drmmode_fb_reference(pAMDGPUEnt->fd, fb_ptr, NULL);
+}
+
 void AMDGPULeaveVT_KMS(VT_FUNC_ARGS_DECL)
 {
SCRN_INFO_PTR(arg);
+   AMDGPUInfoPtr info = AMDGPUPTR(pScrn);
+   AMDGPUEntPtr pAMDGPUEnt = AMDGPUEntPriv(pScrn);
+   ScreenPtr pScreen = pScrn->pScreen;
+   xf86CrtcConfigPtr xf86_config = XF86_CRTC_CONFIG_PTR(pScrn);
+   struct drmmode_scanout black_scanout = { .pixmap = NULL, .bo = NULL };
+   xf86CrtcPtr crtc;
+   drmmode_crtc_private_ptr drmmode_crtc;
+   unsigned w = 0, h = 0;
+   int i;
 
xf86DrvMsgVerb(pScrn->scrnIndex, X_INFO, AMDGPU_LOGLEVEL_DEBUG,
   "AMDGPULeaveVT_KMS\n");
 
-   amdgpu_drop_drm_master(pScrn);
+   /* Compute maximum scanout dimensions of active CRTCs */
+   for (i = 0; i < xf86_config->num_crtc; i++) {
+   crtc = xf86_config->crtc[i];
+   drmmode_crtc = crtc->driver_private;
+
+   if (!drmmode_crtc->fb)
+   continue;
+
+   w = max(w, crtc->mode.HDisplay);
+   h = max(h, crtc->mode.VDisplay);
+   }
+
+   /* Make all active CRTCs scan out from an all-black framebuffer */
+   if (w > 0 && h > 0) {
+   if (drmmode_crtc_scanout_create(crtc, &black_scanout, w, h)) {
+   struct drmmode_fb *black_fb =
+   amdgpu_pixmap_get_fb(black_scanout.pixmap);
+
+   amdgpu_pixmap_clear(black_scanout.pixmap);
+   amdgpu_glamor_finish(pScrn);
+
+   for (i = 0; i < xf86_config->num_crtc; i++) {
+   crtc = xf86_config->crtc[i];
+   drmmode_crtc = crtc->driver_private;
+
+   if (drmmode_crtc->fb) {
+   if (black_fb) {
+   drmmode_set_mode(crtc, black_fb, &crtc->mode, 0, 0);
+   } else {
+   drmModeSetCrtc(pAMDGPUEnt->fd,
+  drmmode_crtc->mode_crtc->crtc_id, 0, 0,
+  0, NULL, 0, NULL);
+   drmmode_fb_reference(pAMDGPUEnt->fd, &drmmode_crtc->fb,
+NULL);
+   }
+
+   if (pScrn->is_gpu) {
+   if (drmmode_crtc->scanout[0].pixmap)
+   pixmap_unref_fb(drmmode_crtc->scanout[0].pixmap,
+   None, pAMDGPUEnt);
+   if (drmmode_crtc->scanout[1].pixmap)
+ 

[PATCH xf86-video-amdgpu 3/6] Create amdgpu_pixmap_get_fb_ptr helper

2017-08-29 Thread Michel Dänzer
From: Michel Dänzer 

Preparatory, no functional change intended yet.

Also inline amdgpu_pixmap_create_fb into amdgpu_pixmap_get_fb, since
there's only one call-site.

(Ported from radeon commit 20f6b56fdb74d88086e8e094013fedbb14e50a24)

Signed-off-by: Michel Dänzer 
---
 src/amdgpu_pixmap.h | 46 +++++++++++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/src/amdgpu_pixmap.h b/src/amdgpu_pixmap.h
index 00fb5bf05..eded17037 100644
--- a/src/amdgpu_pixmap.h
+++ b/src/amdgpu_pixmap.h
@@ -121,39 +121,47 @@ amdgpu_fb_create(int drm_fd, uint32_t width, uint32_t height, uint8_t depth,
return NULL;
 }
 
-static inline struct drmmode_fb*
-amdgpu_pixmap_create_fb(int drm_fd, PixmapPtr pix)
+static inline struct drmmode_fb**
+amdgpu_pixmap_get_fb_ptr(PixmapPtr pix)
 {
-   uint32_t handle;
+   ScrnInfoPtr scrn = xf86ScreenToScrn(pix->drawable.pScreen);
+   AMDGPUInfoPtr info = AMDGPUPTR(scrn);
 
-   if (!amdgpu_pixmap_get_handle(pix, &handle))
-   return NULL;
+   if (info->use_glamor) {
+   struct amdgpu_pixmap *priv = amdgpu_get_pixmap_private(pix);
+
+   if (!priv)
+   return NULL;
+
+   return &priv->fb;
+   }
 
-   return amdgpu_fb_create(drm_fd, pix->drawable.width, pix->drawable.height,
-   pix->drawable.depth, pix->drawable.bitsPerPixel,
-   pix->devKind, handle);
+   return NULL;
 }
 
 static inline struct drmmode_fb*
 amdgpu_pixmap_get_fb(PixmapPtr pix)
 {
-   ScrnInfoPtr scrn = xf86ScreenToScrn(pix->drawable.pScreen);
-   AMDGPUEntPtr pAMDGPUEnt = AMDGPUEntPriv(scrn);
-   AMDGPUInfoPtr info = AMDGPUPTR(scrn);
+   struct drmmode_fb **fb_ptr = amdgpu_pixmap_get_fb_ptr(pix);
 
-   if (info->use_glamor) {
-   struct amdgpu_pixmap *priv = amdgpu_get_pixmap_private(pix);
+   if (!fb_ptr)
+   return NULL;
 
-   if (!priv)
-   return NULL;
+   if (!*fb_ptr) {
+   uint32_t handle;
 
-   if (!priv->fb)
-   priv->fb = amdgpu_pixmap_create_fb(pAMDGPUEnt->fd, pix);
+   if (amdgpu_pixmap_get_handle(pix, &handle)) {
+   ScrnInfoPtr scrn = xf86ScreenToScrn(pix->drawable.pScreen);
+   AMDGPUEntPtr pAMDGPUEnt = AMDGPUEntPriv(scrn);
 
-   return priv->fb;
+   *fb_ptr = amdgpu_fb_create(pAMDGPUEnt->fd, pix->drawable.width,
+  pix->drawable.height, pix->drawable.depth,
+  pix->drawable.bitsPerPixel, pix->devKind,
+  handle);
+   }
}
 
-   return NULL;
+   return *fb_ptr;
 }
 
 enum {
-- 
2.14.1
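
The split between a "get pointer to the slot" helper and a "get or lazily create" helper can be shown in isolation. The sketch below is a reduced model under stated assumptions: struct pixmap, struct fb and both function names are hypothetical, has_priv stands in for "glamor is in use and the pixmap has private data", and malloc() stands in for the real FB creation call.

```c
#include <stddef.h>
#include <stdlib.h>

struct fb {
	int handle;
	int refcnt;
};

/* Hypothetical, reduced pixmap: has_priv models whether private data
 * (and therefore an FB slot) exists for this pixmap. */
struct pixmap {
	int has_priv;
	struct fb *fb;
	int handle;
};

/* Return the address of the pixmap's FB slot, or NULL if it has none.
 * Callers that only want to drop a reference can use this directly
 * without triggering creation. */
static struct fb **pixmap_get_fb_ptr(struct pixmap *p)
{
	return p->has_priv ? &p->fb : NULL;
}

/* Return the pixmap's FB, creating it on first use. */
static struct fb *pixmap_get_fb(struct pixmap *p)
{
	struct fb **slot = pixmap_get_fb_ptr(p);

	if (!slot)
		return NULL;

	if (!*slot) {
		struct fb *fb = malloc(sizeof(*fb));

		if (fb) {
			fb->handle = p->handle;
			fb->refcnt = 1;
			*slot = fb;
		}
	}

	return *slot;
}
```

Exposing the slot separately is what lets an unreference path clear the FB of a pixmap without creating one first.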
