Re: [igt-dev] RFC: Migration to Gitlab

2018-08-23 Thread Jani Nikula
On Wed, 22 Aug 2018, Rodrigo Vivi  wrote:
> On Wed, Aug 22, 2018 at 10:19:19AM -0400, Adam Jackson wrote:
>> On Wed, 2018-08-22 at 16:13 +0300, Jani Nikula wrote:
>> 
>> > - Sticking to fdo bugzilla and disabling gitlab issues for at least
>> >   drm-intel for the time being. Doing that migration in the same go is a
>> >   bit much I think. Reassignment across bugzilla and gitlab will be an
>> >   issue.
>> 
>> Can you elaborate a bit on the issues here? The actual move-the-bugs
>> process has been pretty painless for the parts of xorg we've done so
>> far.
>
> I guess there is nothing against moving the bugs there. The concern is only
> about doing everything at once.

No, it's not just that.

We have some automation using the bugzilla APIs directly, and
someone(tm) needs to figure out how this should work with gitlab. Maybe
we have a better chance of doing things with gitlab's APIs, maybe we can
reduce our dependence on external logic altogether.

We have special "i915 platform" and "i915 features" fields in
bugzilla. We use a mailing list default assignee. Some of us use the
"whiteboard" and "keywords" fields. Etc.

I don't think figuring all this out is rocket science, but someone needs
to actually do it, and get our workflows straightened out *before* we
flip the switch. I'm just trying to figure out if that is a blocker to
migrating the repos.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH] drm/amdgpu: Adjust the VM size based on system memory size v2

2018-08-23 Thread Huang Rui
On Thu, Aug 23, 2018 at 06:37:10PM -0400, Felix Kuehling wrote:
> Set the VM size based on system memory size between the ASIC-specific
> limits given by min_vm_size and max_bits. GFXv9 GPUs will keep their
> default VM size of 256TB (48 bit). Only older GPUs will adjust VM size
> depending on system memory size.
> 
> This makes more VM space available for ROCm applications on GFXv8 GPUs
> that want to map all available VRAM and system memory in their SVM
> address space.
> 
> v2:
> * Clarify comment
> * Round up memory size before >> 30
> * Round up automatic vm_size to power of two
> 
> Signed-off-by: Felix Kuehling 

Reviewed-by: Huang Rui 



Re: [PATCH] drm/amdgpu: Adjust the VM size based on system memory size v2

2018-08-23 Thread Zhang, Jerry (Junwei)

On 08/24/2018 06:37 AM, Felix Kuehling wrote:

Set the VM size based on system memory size between the ASIC-specific
limits given by min_vm_size and max_bits. GFXv9 GPUs will keep their
default VM size of 256TB (48 bit). Only older GPUs will adjust VM size
depending on system memory size.

This makes more VM space available for ROCm applications on GFXv8 GPUs
that want to map all available VRAM and system memory in their SVM
address space.

v2:
* Clarify comment
* Round up memory size before >> 30
* Round up automatic vm_size to power of two

Signed-off-by: Felix Kuehling 

Acked-by: Junwei Zhang 




[PATCH] drm/amdgpu: Adjust the VM size based on system memory size v2

2018-08-23 Thread Felix Kuehling
Set the VM size based on system memory size between the ASIC-specific
limits given by min_vm_size and max_bits. GFXv9 GPUs will keep their
default VM size of 256TB (48 bit). Only older GPUs will adjust VM size
depending on system memory size.

This makes more VM space available for ROCm applications on GFXv8 GPUs
that want to map all available VRAM and system memory in their SVM
address space.

v2:
* Clarify comment
* Round up memory size before >> 30
* Round up automatic vm_size to power of two

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 32 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +-
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a6b1126..543db67 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2482,28 +2482,52 @@ static uint32_t amdgpu_vm_get_block_size(uint64_t 
vm_size)
  * amdgpu_vm_adjust_size - adjust vm size, block size and fragment size
  *
  * @adev: amdgpu_device pointer
- * @vm_size: the default vm size if it's set auto
+ * @min_vm_size: the minimum vm size in GB if it's set auto
  * @fragment_size_default: Default PTE fragment size
  * @max_level: max VMPT level
  * @max_bits: max address space size in bits
  *
  */
-void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
+void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
   uint32_t fragment_size_default, unsigned max_level,
   unsigned max_bits)
 {
+   unsigned int max_size = 1 << (max_bits - 30);
+   unsigned int vm_size;
uint64_t tmp;
 
/* adjust vm size first */
if (amdgpu_vm_size != -1) {
-   unsigned max_size = 1 << (max_bits - 30);
-
vm_size = amdgpu_vm_size;
if (vm_size > max_size) {
dev_warn(adev->dev, "VM size (%d) too large, max is %u 
GB\n",
 amdgpu_vm_size, max_size);
vm_size = max_size;
}
+   } else {
+   struct sysinfo si;
+   unsigned int phys_ram_gb;
+
+   /* Optimal VM size depends on the amount of physical
+* RAM available. Underlying requirements and
+* assumptions:
+*
+*  - Need to map system memory and VRAM from all GPUs
+* - VRAM from other GPUs not known here
+* - Assume VRAM <= system memory
+*  - On GFX8 and older, VM space can be segmented for
+*different MTYPEs
+*  - Need to allow room for fragmentation, guard pages etc.
+*
+* This adds up to a rough guess of system memory x3.
+* Round up to power of two to maximize the available
+* VM size with the given page table size.
+*/
+   si_meminfo(&si);
+   phys_ram_gb = ((uint64_t)si.totalram * si.mem_unit +
+  (1 << 30) - 1) >> 30;
+   vm_size = roundup_pow_of_two(
+   min(max(phys_ram_gb * 3, min_vm_size), max_size));
}
 
adev->vm_manager.max_pfn = (uint64_t)vm_size << 18;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 1162c2b..ab1d23e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -345,7 +345,7 @@ struct amdgpu_bo_va_mapping 
*amdgpu_vm_bo_lookup_mapping(struct amdgpu_vm *vm,
 void amdgpu_vm_bo_trace_cs(struct amdgpu_vm *vm, struct ww_acquire_ctx 
*ticket);
 void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
  struct amdgpu_bo_va *bo_va);
-void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t vm_size,
+void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
   uint32_t fragment_size_default, unsigned max_level,
   unsigned max_bits);
 int amdgpu_vm_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
-- 
2.7.4
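
A quick worked example of the heuristic above (user-space C so the numbers are easy to check; the inputs and the round_up_pow2() helper are illustrative stand-ins for si_meminfo() and the kernel's roundup_pow_of_two(), not kernel code):

#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's roundup_pow_of_two(). */
static uint32_t round_up_pow2(uint32_t v)
{
	uint32_t r = 1;

	while (r < v)
		r <<= 1;
	return r;
}

int main(void)
{
	/* Assumed inputs: a GFXv8-class ASIC with 40 address bits and a
	 * 64 GB minimum VM size, on a machine with 12 GB of RAM. */
	uint64_t totalram_bytes = 12ULL << 30;	/* si.totalram * si.mem_unit */
	uint32_t min_vm_size = 64;		/* GB, ASIC-specific minimum */
	uint32_t max_bits = 40;			/* ASIC address-space limit */

	uint32_t max_size = 1u << (max_bits - 30);	/* 1024 GB */
	/* Round the byte count up before the >> 30, per the v2 note. */
	uint32_t phys_ram_gb = (totalram_bytes + (1u << 30) - 1) >> 30;	/* 12 */
	uint32_t want = phys_ram_gb * 3;		/* 36 */
	if (want < min_vm_size)
		want = min_vm_size;			/* clamp up: 64 */
	if (want > max_size)
		want = max_size;			/* clamp down */
	printf("vm_size = %u GB\n", round_up_pow2(want));	/* 64 GB */
	return 0;
}

With 48 GB of RAM the same math gives max(144, 64) = 144 GB, rounded up to a 256 GB VM; with 512 GB it saturates at the 1024 GB ASIC limit.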



Re: [PATCH 2/3] drm/amdgpu: Remove CONFIG_HSA_AMD_MODULE

2018-08-23 Thread Felix Kuehling
On 2018-08-23 11:17 AM, Amber Lin wrote:
> After amdkfd is merged to amdgpu, CONFIG_HSA_AMD_MODULE no longer exists.
>
> Change-Id: I42096cdf887e0d776075f3dd3e8d3f153aff4e85
> Signed-off-by: Amber Lin 

Reviewed-by: Felix Kuehling 



Re: [PATCH 1/3] drm/amdgpu: Merge amdkfd into amdgpu

2018-08-23 Thread Felix Kuehling
I believe you also need to edit drivers/gpu/drm/Kconfig. Otherwise
amdkfd/Kconfig will be included twice. With that fixed, this commit is

Reviewed-by: Felix Kuehling 

But let's give amdgpu reviewers some more time to respond.

Thanks,
  Felix


Re: [PATCH v2] drm/amd/display: Fix bug use wrong pp interface

2018-08-23 Thread Harry Wentland


On 2018-08-20 03:54 AM, Rex Zhu wrote:
> Used the wrong pp interface; the original interface is
> exposed by dpm on SI and partial CI.
> 
> Pointed out by Francis David 
> 
> v2: DAL only needs to set min_dcefclk and min_fclk on the SMU,
> so use the display_clock_voltage_request interface
> instead of updating the whole display configuration.
> 
> Acked-by: Alex Deucher 
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
> index e5c5b0a..7811d60 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
> @@ -480,12 +480,20 @@ void pp_rv_set_display_requirement(struct pp_smu *pp,
>  {
>   const struct dc_context *ctx = pp->dm;
>   struct amdgpu_device *adev = ctx->driver_context;
> + void *pp_handle = adev->powerplay.pp_handle;
>   const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
> + struct pp_display_clock_request clock = {0};
>  
> - if (!pp_funcs || !pp_funcs->display_configuration_changed)
> + if (!req || !pp_funcs || !pp_funcs->display_clock_voltage_request)

Is req ever NULL? I don't expect it to be.

Otherwise this looks good.

Harry

>   return;
>  
> - amdgpu_dpm_display_configuration_changed(adev);
> + clock.clock_type = amd_pp_dcf_clock;
> + clock.clock_freq_in_khz = req->hard_min_dcefclk_khz;
> + pp_funcs->display_clock_voltage_request(pp_handle, &clock);
> +
> + clock.clock_type = amd_pp_f_clock;
> + clock.clock_freq_in_khz = req->hard_min_fclk_khz;
> + pp_funcs->display_clock_voltage_request(pp_handle, &clock);
>  }
>  
>  void pp_rv_set_wm_ranges(struct pp_smu *pp,
> 


Re: [PATCH 2/5] drm/amdgpu: add ring soft recovery v2

2018-08-23 Thread Marek Olšák
On Thu, Aug 23, 2018 at 2:51 AM Christian König
 wrote:
>
> On 22.08.2018 21:32, Marek Olšák wrote:
> > On Wed, Aug 22, 2018 at 12:56 PM Alex Deucher  wrote:
> >> On Wed, Aug 22, 2018 at 6:05 AM Christian König
> >>  wrote:
> >>> Instead of hammering hard on the GPU try a soft recovery first.
> >>>
> >>> v2: reorder code a bit
> >>>
> >>> Signed-off-by: Christian König 
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ++
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 24 
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 
> >>>   3 files changed, 34 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> index 265ff90f4e01..d93e31a5c4e7 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> @@ -33,6 +33,12 @@ static void amdgpu_job_timedout(struct drm_sched_job 
> >>> *s_job)
> >>>  struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
> >>>  struct amdgpu_job *job = to_amdgpu_job(s_job);
> >>>
> >>> +   if (amdgpu_ring_soft_recovery(ring, job->vmid, 
> >>> s_job->s_fence->parent)) {
> >>> +   DRM_ERROR("ring %s timeout, but soft recovered\n",
> >>> + s_job->sched->name);
> >>> +   return;
> >>> +   }
> >> I think we should still bubble up the error to userspace even if we
> >> can recover.  Data is lost when the wave is killed.  We should treat
> >> it like a GPU reset.
> > Yes, please increment gpu_reset_counter, so that we are compliant with
> > OpenGL. Being able to recover from infinite loops is great, but test
> > suites also expect this to be properly reported to userspace via the
> > per-context query.
>
> Sure that shouldn't be a problem.
>
> > Also please bump the deadline to 1 second. Even you if you kill all
> > shaders, the IB can also contain CP DMA, which may take longer than 1
> > ms.
>
> Is there any way we can get feedback from the SQ on whether the kill was
> successful?

I don't think so. The kill should be finished pretty quickly, but more
waves with infinite loops may be waiting to be launched, so you still
need to repeat the kill command. And we should ideally repeat it for 1
second.

The reason is that vertex shader waves take a lot of time to launch. A
very very very large draw call can keep launching new waves for 1
second with the same infinite loop. You would have to soft-reset all
VGTs to stop that.

>
> 1 second is way too long, since in the case of a blocked MC we need to
> start the hard reset relatively fast.

10 seconds have already passed.

I think that some hangs from corrupted descriptors may still be
recoverable just by killing waves.

Marek

>
> Regards,
> Christian.
>
> >
> > Marek
> >
> > Marek
> >
> >> Alex
> >>
> >>> +
> >>>  DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
> >>>job->base.sched->name, 
> >>> atomic_read(&ring->fence_drv.last_seq),
> >>>ring->fence_drv.sync_seq);
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >>> index 5dfd26be1eec..c045a4e38ad1 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >>> @@ -383,6 +383,30 @@ void 
> >>> amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
> >>>  amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);
> >>>   }
> >>>
> >>> +/**
> >>> + * amdgpu_ring_soft_recovery - try to soft recover a ring lockup
> >>> + *
> >>> + * @ring: ring to try the recovery on
> >>> + * @vmid: VMID we try to get going again
> >>> + * @fence: timedout fence
> >>> + *
> >>> + * Tries to get a ring proceeding again when it is stuck.
> >>> + */
> >>> +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int 
> >>> vmid,
> >>> +  struct dma_fence *fence)
> >>> +{
> >>> +   ktime_t deadline = ktime_add_us(ktime_get(), 1000);
> >>> +
> >>> +   if (!ring->funcs->soft_recovery)
> >>> +   return false;
> >>> +
> >>> +   while (!dma_fence_is_signaled(fence) &&
> >>> +  ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
> >>> +   ring->funcs->soft_recovery(ring, vmid);
> >>> +
> >>> +   return dma_fence_is_signaled(fence);
> >>> +}
> >>> +
> >>>   /*
> >>>* Debugfs info
> >>>*/
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >>> index 409fdd9b9710..9cc239968e40 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >>> @@ -168,6 +168,8 @@ struct amdgpu_ring_funcs {
> >>>  /* priority functions */
> >>>  void (*set_priority) (struct amdgpu_ring *ring,
> >>>enum drm_sched_priority priority);
> >>> +   /*

Re: [PATCH (repost) 5/5] drm/amdgpu: add DisplayPort CEC-Tunneling-over-AUX support

2018-08-23 Thread Harry Wentland
On 2018-08-17 10:11 AM, Hans Verkuil wrote:
> From: Hans Verkuil 
> 
> Add DisplayPort CEC-Tunneling-over-AUX support to amdgpu.
> 
> Signed-off-by: Hans Verkuil 
> Acked-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c   | 13 +++--
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c |  2 ++
>  2 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 34f34823bab5..77898c95bef6 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -898,6 +898,7 @@ amdgpu_dm_update_connector_after_detect(struct 
> amdgpu_dm_connector *aconnector)
>   aconnector->dc_sink = sink;
>   if (sink->dc_edid.length == 0) {
>   aconnector->edid = NULL;
> + drm_dp_cec_unset_edid(&aconnector->dm_dp_aux.aux);
>   } else {
>   aconnector->edid =
>   (struct edid *) sink->dc_edid.raw_edid;
> @@ -905,10 +906,13 @@ amdgpu_dm_update_connector_after_detect(struct 
> amdgpu_dm_connector *aconnector)
>  
>   drm_connector_update_edid_property(connector,
>   aconnector->edid);
> + drm_dp_cec_set_edid(&aconnector->dm_dp_aux.aux,
> + aconnector->edid);
>   }
>   amdgpu_dm_add_sink_to_freesync_module(connector, 
> aconnector->edid);
>  
>   } else {
> + drm_dp_cec_unset_edid(&aconnector->dm_dp_aux.aux);
>   amdgpu_dm_remove_sink_from_freesync_module(connector);
>   drm_connector_update_edid_property(connector, NULL);
>   aconnector->num_modes = 0;
> @@ -1059,12 +1063,16 @@ static void handle_hpd_rx_irq(void *param)
>   drm_kms_helper_hotplug_event(dev);
>   }
>   }
> +
>   if ((dc_link->cur_link_settings.lane_count != LANE_COUNT_UNKNOWN) ||
> - (dc_link->type == dc_connection_mst_branch))
> + (dc_link->type == dc_connection_mst_branch)) {
>   dm_handle_hpd_rx_irq(aconnector);
> + }

These lines don't really add anything functional.

Either way, this patch is
Reviewed-by: Harry Wentland 

Harry

>  
> - if (dc_link->type != dc_connection_mst_branch)
> + if (dc_link->type != dc_connection_mst_branch) {
> + drm_dp_cec_irq(&aconnector->dm_dp_aux.aux);
>   mutex_unlock(&aconnector->hpd_lock);
> + }
>  }
>  
>  static void register_hpd_handlers(struct amdgpu_device *adev)
> @@ -2732,6 +2740,7 @@ static void amdgpu_dm_connector_destroy(struct 
> drm_connector *connector)
>   dm->backlight_dev = NULL;
>   }
>  #endif
> + drm_dp_cec_unregister_connector(&aconnector->dm_dp_aux.aux);
>   drm_connector_unregister(connector);
>   drm_connector_cleanup(connector);
>   kfree(connector);
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> index 9a300732ba37..18a3a6e5ffa0 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> @@ -496,6 +496,8 @@ void amdgpu_dm_initialize_dp_connector(struct 
> amdgpu_display_manager *dm,
>   aconnector->dm_dp_aux.ddc_service = aconnector->dc_link->ddc;
>  
>   drm_dp_aux_register(&aconnector->dm_dp_aux.aux);
> + drm_dp_cec_register_connector(&aconnector->dm_dp_aux.aux,
> +   aconnector->base.name, dm->adev->dev);
>   aconnector->mst_mgr.cbs = &dm_mst_cbs;
>   drm_dp_mst_topology_mgr_init(
>   &aconnector->mst_mgr,
> 


Re: [PATCH 1/4] drm/amdgpu: add ring soft recovery v3

2018-08-23 Thread Christian König

On 23.08.2018 17:20, Zhu, Rex wrote:



-Original Message-
From: amd-gfx  On Behalf Of
Christian König
Sent: Thursday, August 23, 2018 7:24 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 1/4] drm/amdgpu: add ring soft recovery v3

Instead of hammering hard on the GPU try a soft recovery first.

v2: reorder code a bit
v3: increase timeout to 10ms, increment GPU reset counter

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 25 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 
  3 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 265ff90f4e01..d93e31a5c4e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -33,6 +33,12 @@ static void amdgpu_job_timedout(struct
drm_sched_job *s_job)
struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
struct amdgpu_job *job = to_amdgpu_job(s_job);

+   if (amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
+   DRM_ERROR("ring %s timeout, but soft recovered\n",
+ s_job->sched->name);
+   return;
+   }
+
DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
  job->base.sched->name, atomic_read(&ring->fence_drv.last_seq),
  ring->fence_drv.sync_seq);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 5dfd26be1eec..d445acb3d956 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -383,6 +383,31 @@ void
amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);  }

+/**
+ * amdgpu_ring_soft_recovery - try to soft recover a ring lockup
+ *
+ * @ring: ring to try the recovery on
+ * @vmid: VMID we try to get going again
+ * @fence: timedout fence
+ *
+ * Tries to get a ring proceeding again when it is stuck.
+ */
+bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int
vmid,
+  struct dma_fence *fence)
+{
+   ktime_t deadline = ktime_add_us(ktime_get(), 10000);
+
+   if (!ring->funcs->soft_recovery)
+   return false;
+
+   atomic_inc(&ring->adev->gpu_reset_counter);
+   while (!dma_fence_is_signaled(fence) &&
+  ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
+   ring->funcs->soft_recovery(ring, vmid);

Hi Christian,

Is it necessary to add a udelay() here?


No, I don't think so.

Christian.



Regards
Rex

+   return dma_fence_is_signaled(fence);
+}
+
  /*
   * Debugfs info
   */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 409fdd9b9710..9cc239968e40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -168,6 +168,8 @@ struct amdgpu_ring_funcs {
/* priority functions */
void (*set_priority) (struct amdgpu_ring *ring,
  enum drm_sched_priority priority);
+   /* Try to soft recover the ring to make the fence signal */
+   void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
  };

  struct amdgpu_ring {
@@ -260,6 +262,8 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring);
void amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring
*ring,
uint32_t reg0, uint32_t val0,
uint32_t reg1, uint32_t val1);
+bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int
vmid,
+  struct dma_fence *fence);

  static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)  {
--
2.14.1



Re: Possible use_mm() mis-uses

2018-08-23 Thread Linus Torvalds
On Wed, Aug 22, 2018 at 11:16 PM Zhenyu Wang  wrote:
>
> yeah, that's the clear way to fix this imo. We only depend on guest
> life cycle to access guest memory properly. Here's proposed fix, will
> verify and integrate it later.

Thanks, this looks sane to me,

Linus


[PATCH xf86-video-ati 1/2] Add m4 directory

2018-08-23 Thread Michel Dänzer
From: Michel Dänzer 

Although normally it only warns about it, under some circumstances,
aclocal can error out if this directory doesn't exist.

Signed-off-by: Michel Dänzer 
---
 .gitignore| 5 -
 m4/.gitignore | 5 +
 2 files changed, 5 insertions(+), 5 deletions(-)
 create mode 100644 m4/.gitignore

diff --git a/.gitignore b/.gitignore
index 22bba424d..49cbec782 100644
--- a/.gitignore
+++ b/.gitignore
@@ -26,12 +26,7 @@ INSTALL
 install-sh
 .libs/
 libtool
-libtool.m4
 ltmain.sh
-lt~obsolete.m4
-ltoptions.m4
-ltsugar.m4
-ltversion.m4
 Makefile
 Makefile.in
 mdate-sh
diff --git a/m4/.gitignore b/m4/.gitignore
new file mode 100644
index 000000000..464ba5caa
--- /dev/null
+++ b/m4/.gitignore
@@ -0,0 +1,5 @@
+libtool.m4
+lt~obsolete.m4
+ltoptions.m4
+ltsugar.m4
+ltversion.m4
-- 
2.18.0



[PATCH xf86-video-ati 2/2] Use AC_CONFIG_MACRO_DIR instead of AC_CONFIG_MACRO_DIRS

2018-08-23 Thread Michel Dänzer
From: Michel Dänzer 

Older versions of autoconf only supported the former.

Signed-off-by: Michel Dänzer 
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 444862f3b..b6da673ea 100644
--- a/configure.ac
+++ b/configure.ac
@@ -29,7 +29,7 @@ AC_INIT([xf86-video-ati],
 
 AC_CONFIG_SRCDIR([Makefile.am])
 AC_CONFIG_HEADERS([config.h])
-AC_CONFIG_MACRO_DIRS([m4])
+AC_CONFIG_MACRO_DIR([m4])
 
 AC_CONFIG_AUX_DIR(.)
 
-- 
2.18.0



Re: [PATCH 3/3] drm/amdgpu: Move KFD parameters to amdgpu

2018-08-23 Thread Alex Deucher
On Thu, Aug 23, 2018 at 11:18 AM Amber Lin  wrote:
>
> After merging KFD into amdgpu, move module parameters defined in KFD to
> amdgpu_drv.c, where other module parameters are declared.
>
> Change-Id: I2de8d6c96bb49554c028bbc84bdb194f974c9278
> Signed-off-by: Amber Lin 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 41 
> +
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c | 40 
>  2 files changed, 41 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 2221f6b..af9a766 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -39,6 +39,7 @@
>  #include "amdgpu_gem.h"
>
>  #include "amdgpu_amdkfd.h"
> +#include "kfd_priv.h"
>
>  /*
>   * KMS wrapper.
> @@ -127,6 +128,15 @@ int amdgpu_compute_multipipe = -1;
>  int amdgpu_gpu_recovery = -1; /* auto */
>  int amdgpu_emu_mode = 0;
>  uint amdgpu_smu_memory_pool_size = 0;
> +/* KFD parameters */
> +int sched_policy = KFD_SCHED_POLICY_HWS;
> +int hws_max_conc_proc = 8;
> +int cwsr_enable = 1;
> +int max_num_of_queues_per_device = KFD_MAX_NUM_OF_QUEUES_PER_DEVICE_DEFAULT;
> +int send_sigterm;
> +int debug_largebar;
> +int ignore_crat;
> +int vega10_noretry;
>
>  /**
>   * DOC: vramlimit (int)
> @@ -532,6 +542,37 @@ MODULE_PARM_DESC(smu_memory_pool_size,
> "0x1 = 256Mbyte, 0x2 = 512Mbyte, 0x4 = 1 Gbyte, 0x8 = 
> 2GByte");
>  module_param_named(smu_memory_pool_size, amdgpu_smu_memory_pool_size, uint, 
> 0444);
>

Please add DOC comments for all of these options like the other amdgpu options.

Alex
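
For reference, the existing amdgpu options are documented with kernel-doc "DOC:" blocks like the vramlimit one visible in the quoted hunk above. A hypothetical entry for sched_policy, with illustrative wording that is not part of the patch, could look like:

/**
 * DOC: sched_policy (int)
 * Override the KFD queue scheduling policy.
 * 0 = HWS (hardware scheduling, the default),
 * 1 = HWS without over-subscription,
 * 2 = non-HWS (used for debugging only).
 */
module_param(sched_policy, int, 0444);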


RE: [PATCH 1/4] drm/amdgpu: add ring soft recovery v3

2018-08-23 Thread Zhu, Rex


> -Original Message-
> From: amd-gfx  On Behalf Of
> Christian König
> Sent: Thursday, August 23, 2018 7:24 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH 1/4] drm/amdgpu: add ring soft recovery v3
> 
> Instead of hammering hard on the GPU try a soft recovery first.
> 
> v2: reorder code a bit
> v3: increase timeout to 10ms, increment GPU reset counter
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ++
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 25 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 
>  3 files changed, 35 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 265ff90f4e01..d93e31a5c4e7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -33,6 +33,12 @@ static void amdgpu_job_timedout(struct
> drm_sched_job *s_job)
>   struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   struct amdgpu_job *job = to_amdgpu_job(s_job);
> 
> + if (amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
> + DRM_ERROR("ring %s timeout, but soft recovered\n",
> +   s_job->sched->name);
> + return;
> + }
> +
>   DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
> job->base.sched->name, atomic_read(&ring->fence_drv.last_seq),
> ring->fence_drv.sync_seq);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 5dfd26be1eec..d445acb3d956 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -383,6 +383,31 @@ void
> amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
>   amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);  }
> 
> +/**
> + * amdgpu_ring_soft_recovery - try to soft recover a ring lockup
> + *
> + * @ring: ring to try the recovery on
> + * @vmid: VMID we try to get going again
> + * @fence: timedout fence
> + *
> + * Tries to get a ring proceeding again when it is stuck.
> + */
> +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int
> vmid,
> +struct dma_fence *fence)
> +{
> + ktime_t deadline = ktime_add_us(ktime_get(), 10000);
> +
> + if (!ring->funcs->soft_recovery)
> + return false;
> +
> + atomic_inc(&ring->adev->gpu_reset_counter);
> + while (!dma_fence_is_signaled(fence) &&
> +ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
> + ring->funcs->soft_recovery(ring, vmid);
Hi Christian,

Is it necessary to add a udelay() here?

Regards
Rex
> + return dma_fence_is_signaled(fence);
> +}
> +
>  /*
>   * Debugfs info
>   */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 409fdd9b9710..9cc239968e40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -168,6 +168,8 @@ struct amdgpu_ring_funcs {
>   /* priority functions */
>   void (*set_priority) (struct amdgpu_ring *ring,
> enum drm_sched_priority priority);
> + /* Try to soft recover the ring to make the fence signal */
> + void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
>  };
> 
>  struct amdgpu_ring {
> @@ -260,6 +262,8 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring);
> void amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring
> *ring,
>   uint32_t reg0, uint32_t val0,
>   uint32_t reg1, uint32_t val1);
> +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int
> vmid,
> +struct dma_fence *fence);
> 
>  static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)  {
> --
> 2.14.1
> 


[PATCH 1/3] drm/amdgpu: Merge amdkfd into amdgpu

2018-08-23 Thread Amber Lin
Since KFD is only supported by a single GPU driver, it makes sense to merge
amdgpu and amdkfd into one module. This patch is the initial step: merge
Kconfig and Makefile.

Change-Id: I21c996ba29d393c1bf8064bdb2f5d89541159649
Signed-off-by: Amber Lin 
---
 drivers/gpu/drm/amd/amdgpu/Kconfig  |  1 +
 drivers/gpu/drm/amd/amdgpu/Makefile |  6 ++-
 drivers/gpu/drm/amd/amdkfd/Kconfig  |  2 +-
 drivers/gpu/drm/amd/amdkfd/Makefile | 53 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c | 76 ++---
 5 files changed, 63 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig 
b/drivers/gpu/drm/amd/amdgpu/Kconfig
index e8af1f5..9221e54 100644
--- a/drivers/gpu/drm/amd/amdgpu/Kconfig
+++ b/drivers/gpu/drm/amd/amdgpu/Kconfig
@@ -42,3 +42,4 @@ config DRM_AMDGPU_GART_DEBUGFS
 
 source "drivers/gpu/drm/amd/acp/Kconfig"
 source "drivers/gpu/drm/amd/display/Kconfig"
+source "drivers/gpu/drm/amd/amdkfd/Kconfig"
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index d2bafab..847536b 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -35,7 +35,8 @@ ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \
-I$(FULL_AMD_DISPLAY_PATH) \
-I$(FULL_AMD_DISPLAY_PATH)/include \
-I$(FULL_AMD_DISPLAY_PATH)/dc \
-   -I$(FULL_AMD_DISPLAY_PATH)/amdgpu_dm
+   -I$(FULL_AMD_DISPLAY_PATH)/amdgpu_dm \
+   -I$(FULL_AMD_PATH)/amdkfd
 
 amdgpu-y := amdgpu_drv.o
 
@@ -136,6 +137,9 @@ amdgpu-y += \
 amdgpu-y += amdgpu_amdkfd.o
 
 ifneq ($(CONFIG_HSA_AMD),)
+AMDKFD_PATH := ../amdkfd
+include $(FULL_AMD_PATH)/amdkfd/Makefile
+amdgpu-y += $(AMDKFD_FILES)
 amdgpu-y += \
 amdgpu_amdkfd_fence.o \
 amdgpu_amdkfd_gpuvm.o \
diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig 
b/drivers/gpu/drm/amd/amdkfd/Kconfig
index 3858820..fbf0ee5 100644
--- a/drivers/gpu/drm/amd/amdkfd/Kconfig
+++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
@@ -3,7 +3,7 @@
 #
 
 config HSA_AMD
-   tristate "HSA kernel driver for AMD GPU devices"
+   bool "HSA kernel driver for AMD GPU devices"
depends on DRM_AMDGPU && X86_64
imply AMD_IOMMU_V2
select MMU_NOTIFIER
diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile 
b/drivers/gpu/drm/amd/amdkfd/Makefile
index ffd096f..69ec969 100644
--- a/drivers/gpu/drm/amd/amdkfd/Makefile
+++ b/drivers/gpu/drm/amd/amdkfd/Makefile
@@ -23,26 +23,41 @@
 # Makefile for Heterogenous System Architecture support for AMD GPU devices
 #
 
-ccflags-y := -Idrivers/gpu/drm/amd/include/  \
-   -Idrivers/gpu/drm/amd/include/asic_reg
-
-amdkfd-y   := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
-   kfd_pasid.o kfd_doorbell.o kfd_flat_memory.o \
-   kfd_process.o kfd_queue.o kfd_mqd_manager.o \
-   kfd_mqd_manager_cik.o kfd_mqd_manager_vi.o \
-   kfd_mqd_manager_v9.o \
-   kfd_kernel_queue.o kfd_kernel_queue_cik.o \
-   kfd_kernel_queue_vi.o kfd_kernel_queue_v9.o \
-   kfd_packet_manager.o kfd_process_queue_manager.o \
-   kfd_device_queue_manager.o kfd_device_queue_manager_cik.o \
-   kfd_device_queue_manager_vi.o kfd_device_queue_manager_v9.o \
-   kfd_interrupt.o kfd_events.o cik_event_interrupt.o \
-   kfd_int_process_v9.o kfd_dbgdev.o kfd_dbgmgr.o kfd_crat.o
+AMDKFD_FILES   := $(AMDKFD_PATH)/kfd_module.o \
+   $(AMDKFD_PATH)/kfd_device.o \
+   $(AMDKFD_PATH)/kfd_chardev.o \
+   $(AMDKFD_PATH)/kfd_topology.o \
+   $(AMDKFD_PATH)/kfd_pasid.o \
+   $(AMDKFD_PATH)/kfd_doorbell.o \
+   $(AMDKFD_PATH)/kfd_flat_memory.o \
+   $(AMDKFD_PATH)/kfd_process.o \
+   $(AMDKFD_PATH)/kfd_queue.o \
+   $(AMDKFD_PATH)/kfd_mqd_manager.o \
+   $(AMDKFD_PATH)/kfd_mqd_manager_cik.o \
+   $(AMDKFD_PATH)/kfd_mqd_manager_vi.o \
+   $(AMDKFD_PATH)/kfd_mqd_manager_v9.o \
+   $(AMDKFD_PATH)/kfd_kernel_queue.o \
+   $(AMDKFD_PATH)/kfd_kernel_queue_cik.o \
+   $(AMDKFD_PATH)/kfd_kernel_queue_vi.o \
+   $(AMDKFD_PATH)/kfd_kernel_queue_v9.o \
+   $(AMDKFD_PATH)/kfd_packet_manager.o \
+   $(AMDKFD_PATH)/kfd_process_queue_manager.o \
+   $(AMDKFD_PATH)/kfd_device_queue_manager.o \
+   $(AMDKFD_PATH)/kfd_device_queue_manager_cik.o \
+   $(AMDKFD_PATH)/kfd_device_queue_manager_vi.o \
+   $(AMDKFD_PATH)/kfd_device_queue_manager_v9.o \
+   $(AMDKFD_PATH)/kfd_interrupt.o \
+   $(AMDKFD_PATH)/kfd_events.o \
+   $(AMDKFD_PATH)/cik_event_interrupt.o \
+   $(AMDKFD_PATH)/kfd_int_process_v9.o \
+   $(AMDKFD_PATH)/kfd_dbgdev.o \
+   $(AMDKFD_PATH)/kfd_dbgmgr.o \
+   $(AMDKFD_PATH)/kfd_crat

[PATCH 2/3] drm/amdgpu: Remove CONFIG_HSA_AMD_MODULE

2018-08-23 Thread Amber Lin
After amdkfd is merged to amdgpu, CONFIG_HSA_AMD_MODULE no longer exists.

Change-Id: I42096cdf887e0d776075f3dd3e8d3f153aff4e85
Signed-off-by: Amber Lin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 26 +++---
 1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index e3ed08d..8c652ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -36,36 +36,16 @@ int amdgpu_amdkfd_init(void)
 {
int ret;
 
-#if defined(CONFIG_HSA_AMD_MODULE)
-   int (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
-
-   kgd2kfd_init_p = symbol_request(kgd2kfd_init);
-
-   if (kgd2kfd_init_p == NULL)
-   return -ENOENT;
-
-   ret = kgd2kfd_init_p(KFD_INTERFACE_VERSION, &kgd2kfd);
-   if (ret) {
-   symbol_put(kgd2kfd_init);
-   kgd2kfd = NULL;
-   }
-
-
-#elif defined(CONFIG_HSA_AMD)
-
+#ifdef CONFIG_HSA_AMD
ret = kgd2kfd_init(KFD_INTERFACE_VERSION, &kgd2kfd);
if (ret)
kgd2kfd = NULL;
-
+   amdgpu_amdkfd_gpuvm_init_mem_limits();
 #else
kgd2kfd = NULL;
ret = -ENOENT;
 #endif
 
-#if defined(CONFIG_HSA_AMD_MODULE) || defined(CONFIG_HSA_AMD)
-   amdgpu_amdkfd_gpuvm_init_mem_limits();
-#endif
-
return ret;
 }
 
@@ -471,7 +451,7 @@ bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, 
u32 vmid)
return false;
 }
 
-#if !defined(CONFIG_HSA_AMD_MODULE) && !defined(CONFIG_HSA_AMD)
+#ifndef CONFIG_HSA_AMD
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
 {
return false;
-- 
2.7.4



[PATCH 3/3] drm/amdgpu: Move KFD parameters to amdgpu

2018-08-23 Thread Amber Lin
After merging KFD into amdgpu, move module parameters defined in KFD to
amdgpu_drv.c, where other module parameters are declared.

Change-Id: I2de8d6c96bb49554c028bbc84bdb194f974c9278
Signed-off-by: Amber Lin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 41 +
 drivers/gpu/drm/amd/amdkfd/kfd_module.c | 40 
 2 files changed, 41 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 2221f6b..af9a766 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -39,6 +39,7 @@
 #include "amdgpu_gem.h"
 
 #include "amdgpu_amdkfd.h"
+#include "kfd_priv.h"
 
 /*
  * KMS wrapper.
@@ -127,6 +128,15 @@ int amdgpu_compute_multipipe = -1;
 int amdgpu_gpu_recovery = -1; /* auto */
 int amdgpu_emu_mode = 0;
 uint amdgpu_smu_memory_pool_size = 0;
+/* KFD parameters */
+int sched_policy = KFD_SCHED_POLICY_HWS;
+int hws_max_conc_proc = 8;
+int cwsr_enable = 1;
+int max_num_of_queues_per_device = KFD_MAX_NUM_OF_QUEUES_PER_DEVICE_DEFAULT;
+int send_sigterm;
+int debug_largebar;
+int ignore_crat;
+int vega10_noretry;
 
 /**
  * DOC: vramlimit (int)
@@ -532,6 +542,37 @@ MODULE_PARM_DESC(smu_memory_pool_size,
"0x1 = 256Mbyte, 0x2 = 512Mbyte, 0x4 = 1 Gbyte, 0x8 = 2GByte");
 module_param_named(smu_memory_pool_size, amdgpu_smu_memory_pool_size, uint, 
0444);
 
+module_param(sched_policy, int, 0444);
+MODULE_PARM_DESC(sched_policy,
+   "Scheduling policy (0 = HWS (Default), 1 = HWS without 
over-subscription, 2 = Non-HWS (Used for debugging only)");
+
+module_param(hws_max_conc_proc, int, 0444);
+MODULE_PARM_DESC(hws_max_conc_proc,
+   "Max # processes HWS can execute concurrently when sched_policy=0 (0 = 
no concurrency, #VMIDs for KFD = Maximum(default))");
+
+module_param(cwsr_enable, int, 0444);
+MODULE_PARM_DESC(cwsr_enable, "CWSR enable (0 = Off, 1 = On (Default))");
+
+module_param(max_num_of_queues_per_device, int, 0444);
+MODULE_PARM_DESC(max_num_of_queues_per_device,
+   "Maximum number of supported queues per device (1 = Minimum, 4096 = 
default)");
+
+module_param(send_sigterm, int, 0444);
+MODULE_PARM_DESC(send_sigterm,
+   "Send sigterm to HSA process on unhandled exception (0 = disable, 1 = 
enable)");
+
+module_param(debug_largebar, int, 0444);
+MODULE_PARM_DESC(debug_largebar,
+   "Debug large-bar flag used to simulate large-bar capability on 
non-large bar machine (0 = disable, 1 = enable)");
+
+module_param(ignore_crat, int, 0444);
+MODULE_PARM_DESC(ignore_crat,
+   "Ignore CRAT table during KFD initialization (0 = use CRAT (default), 1 
= ignore CRAT)");
+
+module_param_named(noretry, vega10_noretry, int, 0644);
+MODULE_PARM_DESC(noretry,
+   "Set sh_mem_config.retry_disable on Vega10 (0 = retry enabled 
(default), 1 = retry disabled)");
+
 static const struct pci_device_id pciidlist[] = {
 #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 8847514..5f4977b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -21,7 +21,6 @@
  */
 
 #include 
-#include 
 #include 
 #include "kfd_priv.h"
 
@@ -39,45 +38,6 @@ static const struct kgd2kfd_calls kgd2kfd = {
  kgd2kfd_schedule_evict_and_restore_process,
 };
 
-int sched_policy = KFD_SCHED_POLICY_HWS;
-module_param(sched_policy, int, 0444);
-MODULE_PARM_DESC(sched_policy,
-   "Scheduling policy (0 = HWS (Default), 1 = HWS without 
over-subscription, 2 = Non-HWS (Used for debugging only)");
-
-int hws_max_conc_proc = 8;
-module_param(hws_max_conc_proc, int, 0444);
-MODULE_PARM_DESC(hws_max_conc_proc,
-   "Max # processes HWS can execute concurrently when sched_policy=0 (0 = 
no concurrency, #VMIDs for KFD = Maximum(default))");
-
-int cwsr_enable = 1;
-module_param(cwsr_enable, int, 0444);
-MODULE_PARM_DESC(cwsr_enable, "CWSR enable (0 = Off, 1 = On (Default))");
-
-int max_num_of_queues_per_device = KFD_MAX_NUM_OF_QUEUES_PER_DEVICE_DEFAULT;
-module_param(max_num_of_queues_per_device, int, 0444);
-MODULE_PARM_DESC(max_num_of_queues_per_device,
-   "Maximum number of supported queues per device (1 = Minimum, 4096 = 
default)");
-
-int send_sigterm;
-module_param(send_sigterm, int, 0444);
-MODULE_PARM_DESC(send_sigterm,
-   "Send sigterm to HSA process on unhandled exception (0 = disable, 1 = 
enable)");
-
-int debug_largebar;
-module_param(debug_largebar, int, 0444);
-MODULE_PARM_DESC(debug_largebar,
-   "Debug large-bar flag used to simulate large-bar capability on 
non-large bar machine (0 = disable, 1 = enable)");
-
-int ignore_crat;
-module_param(ignore_crat, int, 0444);
-MODULE_PARM_DESC(ignore_crat,
-   "Ignore CRAT table during KFD initialization (0 = use CRAT (default), 1 
= ignore CRAT)");
-
-int v

Re: [PATCH 1/3] drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins

2018-08-23 Thread Michel Dänzer
On 2018-08-23 2:59 p.m., Zhu, Rex wrote:
>> From: Michel Dänzer 
>> On 2018-08-23 11:24 a.m., Rex Zhu wrote:
>>> Forgot to add vce pg support via smu for Kaveri/Mullins.
>>>
>>> Regression issue caused by
>>> 'commit 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
>> to
>>> set_powergating_by_smu")'
>>
>> You can replace this paragraph with
>>
>> Fixes: 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
>>   to set_powergating_by_smu")
>>
>>
>> This patch fixes the VCE ring (and also IB) test on this laptop, thanks!
>>
>> Unfortunately though, there's still an oops if I let the amdkfd driver load
>> together with amdgpu (no issue when loading amdkfd manually later), see
>> the attached kernel.log excerpt. This is also a regression in the
>> 4.19 drm tree changes. It might be a separate issue, but TBH I don't feel like
>> another day or two bisecting right now. :)
> 
> Thanks Michel,  I will check the oops issue tomorrow.

FWIW, it does seem related: I re-tested the commit before 561a5c83eadd,
and the oops doesn't happen there. But it does happen with only this patch
on top of 561a5c83eadd.

I suspect https://bugs.freedesktop.org/107595 could similarly be at
least triggered by this change (I'm not using DC yet on this laptop).


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: enable GTT PD/PT for raven v2

2018-08-23 Thread Andrey Grodzovsky

Acked-by: Andrey Grodzovsky 

Andrey


On 08/23/2018 08:33 AM, Christian König wrote:

Should work on Vega10 as well, but with an obvious performance hit.

Older APUs can be enabled as well, but will probably be more work.

v2: fix error checking

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 928fdae0dab4..9ec56f59b03a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -308,6 +308,9 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
list_move(&bo_base->vm_status, &vm->moved);
spin_unlock(&vm->moved_lock);
} else {
+   r = amdgpu_ttm_alloc_gart(&bo->tbo);
+   if (r)
+   break;
list_move(&bo_base->vm_status, &vm->relocated);
}
}
@@ -396,6 +399,10 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
if (r)
goto error;
  
+	r = amdgpu_ttm_alloc_gart(&bo->tbo);

+   if (r)
+   return r;
+
r = amdgpu_job_alloc_with_ib(adev, 64, &job);
if (r)
goto error;
@@ -461,7 +468,11 @@ static void amdgpu_vm_bo_param(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
bp->size = amdgpu_vm_bo_size(adev, level);
bp->byte_align = AMDGPU_GPU_PAGE_SIZE;
bp->domain = AMDGPU_GEM_DOMAIN_VRAM;
-   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+   if (bp->size <= PAGE_SIZE && adev->asic_type == CHIP_RAVEN)
+   bp->domain |= AMDGPU_GEM_DOMAIN_GTT;
+   bp->domain = amdgpu_bo_get_preferred_pin_domain(adev, bp->domain);
+   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
+   AMDGPU_GEM_CREATE_CPU_GTT_USWC;
if (vm->use_cpu_for_update)
bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
else


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/4] drm/amdgpu: add ring soft recovery v3

2018-08-23 Thread Huang Rui
On Thu, Aug 23, 2018 at 01:23:31PM +0200, Christian König wrote:
> Instead of hammering hard on the GPU try a soft recovery first.
> 
> v2: reorder code a bit
> v3: increase timeout to 10ms, increment GPU reset counter
> 
> Signed-off-by: Christian König 

Series are Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 25 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 
>  3 files changed, 35 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 265ff90f4e01..d93e31a5c4e7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -33,6 +33,12 @@ static void amdgpu_job_timedout(struct drm_sched_job 
> *s_job)
>   struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   struct amdgpu_job *job = to_amdgpu_job(s_job);
>  
> + if (amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) 
> {
> + DRM_ERROR("ring %s timeout, but soft recovered\n",
> +   s_job->sched->name);
> + return;
> + }
> +
>   DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
> job->base.sched->name, atomic_read(&ring->fence_drv.last_seq),
> ring->fence_drv.sync_seq);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 5dfd26be1eec..d445acb3d956 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -383,6 +383,31 @@ void amdgpu_ring_emit_reg_write_reg_wait_helper(struct 
> amdgpu_ring *ring,
>   amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);
>  }
>  
> +/**
> + * amdgpu_ring_soft_recovery - try to soft recover a ring lockup
> + *
> + * @ring: ring to try the recovery on
> + * @vmid: VMID we try to get going again
> + * @fence: timedout fence
> + *
> + * Tries to get a ring proceeding again when it is stuck.
> + */
> +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
> +struct dma_fence *fence)
> +{
> + ktime_t deadline = ktime_add_us(ktime_get(), 10000);
> +
> + if (!ring->funcs->soft_recovery)
> + return false;
> +
> + atomic_inc(&ring->adev->gpu_reset_counter);
> + while (!dma_fence_is_signaled(fence) &&
> +ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
> + ring->funcs->soft_recovery(ring, vmid);
> +
> + return dma_fence_is_signaled(fence);
> +}
> +
>  /*
>   * Debugfs info
>   */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 409fdd9b9710..9cc239968e40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -168,6 +168,8 @@ struct amdgpu_ring_funcs {
>   /* priority functions */
>   void (*set_priority) (struct amdgpu_ring *ring,
> enum drm_sched_priority priority);
> + /* Try to soft recover the ring to make the fence signal */
> + void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
>  };
>  
>  struct amdgpu_ring {
> @@ -260,6 +262,8 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring);
>  void amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
>   uint32_t reg0, uint32_t val0,
>   uint32_t reg1, uint32_t val1);
> +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
> +struct dma_fence *fence);
>  
>  static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>  {
> -- 
> 2.14.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
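
What the per-ring soft_recovery hook actually does is left to the IP-specific
code. A minimal sketch of such a backend for GFXv9, assuming SOC15-style
register accessors and the SQ_CMD register layout (the field encodings below
are assumptions for illustration, not quoted from the series):

static void gfx_v9_0_ring_soft_recovery(struct amdgpu_ring *ring,
					unsigned int vmid)
{
	struct amdgpu_device *adev = ring->adev;
	uint32_t value = 0;

	/* Ask the SQ to kill all waves that belong to the hanging VMID;
	 * the CMD/MODE encodings are assumed values for this sketch. */
	value = REG_SET_FIELD(value, SQ_CMD, CMD, 0x03);
	value = REG_SET_FIELD(value, SQ_CMD, MODE, 0x01);
	value = REG_SET_FIELD(value, SQ_CMD, CHECK_VMID, 1);
	value = REG_SET_FIELD(value, SQ_CMD, VM_ID, vmid);
	WREG32_SOC15(GC, 0, mmSQ_CMD, value);
}

The helper above then polls the timed-out fence until the deadline while
repeatedly invoking this hook, and reports success only if the fence
actually signals.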


RE: [PATCH 2/3] drm/amdgpu: Power up uvd block when hw_fini

2018-08-23 Thread Zhu, Rex


> -Original Message-
> From: amd-gfx  On Behalf Of
> Michel Dänzer
> Sent: Thursday, August 23, 2018 6:58 PM
> To: Zhu, Rex 
> Cc: amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 2/3] drm/amdgpu: Power up uvd block when hw_fini
> 
> On 2018-08-23 11:24 a.m., Rex Zhu wrote:
> > When hw_fini/suspend, the SMU only needs to power up the UVD block if
> > UVD PG is supported; there's no need to call VCE to do hw_init.
> 
> Do you really mean VCE here, not UVD?

Yes, thanks for reminding me.


> > diff --git a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
> > b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
> > index a713c8b..8f625d6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
> > @@ -65,7 +65,6 @@ static int kv_set_thermal_temperature_range(struct
> amdgpu_device *adev,
> > int min_temp, int max_temp);
> static int
> > kv_init_fps_limits(struct amdgpu_device *adev);
> >
> > -static void kv_dpm_powergate_uvd(void *handle, bool gate);  static
> > void kv_dpm_powergate_samu(struct amdgpu_device *adev, bool gate);
> > static void kv_dpm_powergate_acp(struct amdgpu_device *adev, bool
> > gate);
> >
> > @@ -1390,7 +1389,8 @@ static void kv_dpm_disable(struct
> amdgpu_device *adev)
> > kv_dpm_powergate_samu(adev, false);
> > if (pi->caps_vce_pg) /* power on the VCE block */
> > amdgpu_kv_notify_message_to_smu(adev,
> PPSMC_MSG_VCEPowerON);
> > -   kv_dpm_powergate_uvd(adev, false);
> > +   if (pi->caps_uvd_pg) /* power off the UVD block */
> > +   amdgpu_kv_notify_message_to_smu(adev,
> PPSMC_MSG_UVDPowerON);
> 
> The comment should say "power on", shouldn't it?

Maybe. It was to keep it consistent with the SMU message name.
Will change the comment in v2.

Thanks.
rex


> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Possible use_mm() mis-uses

2018-08-23 Thread Paolo Bonzini
On 23/08/2018 08:07, Zhenyu Wang wrote:
> On 2018.08.22 20:20:46 +0200, Paolo Bonzini wrote:
>> On 22/08/2018 18:44, Linus Torvalds wrote:
>>> An example of something that *isn't* right, is the i915 kvm interface,
>>> which does
>>>
>>> use_mm(kvm->mm);
>>>
>>> on an mm that was initialized in virt/kvm/kvm_main.c using
>>>
>>> mmgrab(current->mm);
>>> kvm->mm = current->mm;
>>>
>>> which is *not* right. Using "mmgrab()" does indeed guarantee the
>>> lifetime of the 'struct mm_struct' itself, but it does *not* guarantee
>>> the lifetime of the page tables. You need to use "mmget()" and
>>> "mmput()", which get the reference to the actual process address
>>> space!
>>>
>>> Now, it is *possible* that the kvm use is correct too, because kvm
>>> does register a mmu_notifier chain, and in theory you can avoid the
>>> proper refcounting by just making sure the mmu "release" notifier
>>> kills any existing uses, but I don't really see kvm doing that. Kvm
>>> does register a release notifier, but that just flushes the shadow
>>> page tables, it doesn't kill any use_mm() use by some i915 use case.
>>
>> Yes, KVM is correct but the i915 bits are at least fishy.  It's probably
>> as simple as adding a mmget/mmput pair respectively in kvmgt_guest_init
>> and kvmgt_guest_exit, or maybe mmget_not_zero.
>>
> 
> Yeah, that's the clean way to fix this IMO. We only depend on the guest
> life cycle to access guest memory properly. Here's the proposed fix; I will
> verify and integrate it later.
> 
> Thanks!
> 
> From 5e5a8d0409aa150884adf5a4d0b956fd0b9906b3 Mon Sep 17 00:00:00 2001
> From: Zhenyu Wang 
> Date: Thu, 23 Aug 2018 14:08:06 +0800
> Subject: [PATCH] drm/i915/gvt: Fix life cycle reference on KVM mm
> 
> Handle guest mm access life cycle properly with mmget()/mmput()
> through guest init()/exit(). As noted by Linus, use_mm() depends
> on valid live page table but KVM's mmgrab() doesn't guarantee that.
> As vGPU usage depends on the guest VM life cycle, we need to make sure to
> use mmget()/mmput() to guarantee VM address access.
> 
> Cc: Linus Torvalds 
> Cc: Paolo Bonzini 
> Cc: Zhi Wang 
> Signed-off-by: Zhenyu Wang 

Reviewed-by: Paolo Bonzini 

> ---
>  drivers/gpu/drm/i915/gvt/kvmgt.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c 
> b/drivers/gpu/drm/i915/gvt/kvmgt.c
> index 71751be329e3..4a0988747d08 100644
> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> @@ -32,6 +32,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1614,9 +1615,16 @@ static int kvmgt_guest_init(struct mdev_device *mdev)
>   if (__kvmgt_vgpu_exist(vgpu, kvm))
>   return -EEXIST;
>  
> + if (!mmget_not_zero(kvm->mm)) {
> + gvt_vgpu_err("Can't get KVM mm reference\n");
> + return -EINVAL;
> + }
> +
>   info = vzalloc(sizeof(struct kvmgt_guest_info));
> - if (!info)
> + if (!info) {
> + mmput(kvm->mm);
>   return -ENOMEM;
> + }
>  
>   vgpu->handle = (unsigned long)info;
>   info->vgpu = vgpu;
> @@ -1647,6 +1655,8 @@ static bool kvmgt_guest_exit(struct kvmgt_guest_info 
> *info)
>   debugfs_remove(info->debugfs_cache_entries);
>  
>   kvm_page_track_unregister_notifier(info->kvm, &info->track_node);
> + if (info->kvm->mm)
> + mmput(info->kvm->mm);
>   kvm_put_kvm(info->kvm);
>   kvmgt_protect_table_destroy(info);
>   gvt_cache_destroy(info->vgpu);
> 




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
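
The distinction Linus describes maps onto two reference-counting pairs in
the kernel MM API: mmgrab()/mmdrop() only keep the struct mm_struct itself
allocated, while mmget()/mmput() keep the address space (the page tables)
alive. A minimal sketch of the safe pattern, with a hypothetical function
name for illustration:

#include <linux/sched/mm.h>
#include <linux/mmu_context.h>

static int example_access_guest_mm(struct mm_struct *mm)
{
	/* mmget_not_zero() raises mm_users, which is what pins the page
	 * tables; an mmgrab() reference alone would not be enough. */
	if (!mmget_not_zero(mm))
		return -EINVAL;	/* address space already torn down */

	use_mm(mm);
	/* ... copy_from_user()/copy_to_user() against guest memory ... */
	unuse_mm(mm);

	mmput(mm);
	return 0;
}

This is exactly the shape of the kvmgt fix above: mmget_not_zero() in
kvmgt_guest_init() and the matching mmput() in kvmgt_guest_exit().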


Re: [PATCH 11/11] drm/amdgpu: enable GTT PD/PT for raven

2018-08-23 Thread Christian König

On 23.08.2018 at 14:54, Huang Rui wrote:

On Wed, Aug 22, 2018 at 11:44:04AM -0400, Andrey Grodzovsky wrote:


On 08/22/2018 11:05 AM, Christian König wrote:

Should work on Vega10 as well, but with an obvious performance hit.

Older APUs can be enabled as well, but will probably be more work.

Raven's VRAM is actually carved out of system memory. May I know the benefit
of switching the PD/PT BOs from VRAM to GART?


We want to reduce VRAM usage as much as possible on APUs.

The end goal is that it should work with only 16MB (or was it 32MB?) 
stolen VRAM.


That is the recommended setting for newer OSes on that hardware.

Christian.




Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 928fdae0dab4..670a42729f88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -308,6 +308,7 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
list_move(&bo_base->vm_status, &vm->moved);
spin_unlock(&vm->moved_lock);
} else {
+   amdgpu_ttm_alloc_gart(&bo->tbo);

Looks like you forgot to check the return value here.


Yes, same comment from me.

Thanks,
Ray


Andrey


list_move(&bo_base->vm_status, &vm->relocated);
}
}
@@ -396,6 +397,10 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
if (r)
goto error;
+   r = amdgpu_ttm_alloc_gart(&bo->tbo);
+   if (r)
+   return r;
+
r = amdgpu_job_alloc_with_ib(adev, 64, &job);
if (r)
goto error;
@@ -461,7 +466,11 @@ static void amdgpu_vm_bo_param(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
bp->size = amdgpu_vm_bo_size(adev, level);
bp->byte_align = AMDGPU_GPU_PAGE_SIZE;
bp->domain = AMDGPU_GEM_DOMAIN_VRAM;
-   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+   if (bp->size <= PAGE_SIZE && adev->asic_type == CHIP_RAVEN)
+   bp->domain |= AMDGPU_GEM_DOMAIN_GTT;
+   bp->domain = amdgpu_bo_get_preferred_pin_domain(adev, bp->domain);
+   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
+   AMDGPU_GEM_CREATE_CPU_GTT_USWC;
if (vm->use_cpu_for_update)
bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
else

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
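
One detail the thread leaves implicit: a PD/PT placed in GTT has no valid
GPU address until it is bound into the GART aperture, which is presumably
why the patch adds the amdgpu_ttm_alloc_gart() calls in the validate and
clear paths before the buffers are used.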


Re: [PATCH 2/5] drm/amdgpu: add ring soft recovery v2

2018-08-23 Thread Huang Rui
On Thu, Aug 23, 2018 at 05:21:53PM +0800, Christian König wrote:
> On 23.08.2018 at 09:17, Huang Rui wrote:
> > On Wed, Aug 22, 2018 at 12:55:43PM -0400, Alex Deucher wrote:
> >> On Wed, Aug 22, 2018 at 6:05 AM Christian König
> >>  wrote:
> >>> Instead of hammering hard on the GPU try a soft recovery first.
> >>>
> >>> v2: reorder code a bit
> >>>
> >>> Signed-off-by: Christian König 
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ++
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 24 
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 
> >>>   3 files changed, 34 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> index 265ff90f4e01..d93e31a5c4e7 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> @@ -33,6 +33,12 @@ static void amdgpu_job_timedout(struct drm_sched_job 
> >>> *s_job)
> >>>  struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
> >>>  struct amdgpu_job *job = to_amdgpu_job(s_job);
> >>>
> >>> +   if (amdgpu_ring_soft_recovery(ring, job->vmid, 
> >>> s_job->s_fence->parent)) {
> >>> +   DRM_ERROR("ring %s timeout, but soft recovered\n",
> >>> + s_job->sched->name);
> >>> +   return;
> >>> +   }
> >> I think we should still bubble up the error to userspace even if we
> >> can recover.  Data is lost when the wave is killed.  We should treat
> >> it like a GPU reset.
> >>
> > May I know what the wavefront stands for? Why can we do the "light"
> > recovery here rather than a full reset?
> 
> Wavefront means a running shader in the SQ.
> 
> Basically this only covers the case when the application sends down a 
> shader with an endless loop to the hardware. Here we just kill the 
> shader and try to continue.
> 
> When you run into a hang because of a corrupted resource descriptor you
> usually need a full ASIC reset to get out of that again.
> 

Good to know this, thank you.

Series are Acked-by: Huang Rui 

> Regards,
> Christian.
> 
> >
> > Thanks,
> > Ray
> >
> >> Alex
> >>
> >>> +
> >>>  DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
> >>>job->base.sched->name, 
> >>> atomic_read(&ring->fence_drv.last_seq),
> >>>ring->fence_drv.sync_seq);
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >>> index 5dfd26be1eec..c045a4e38ad1 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> >>> @@ -383,6 +383,30 @@ void 
> >>> amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
> >>>  amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);
> >>>   }
> >>>
> >>> +/**
> >>> + * amdgpu_ring_soft_recovery - try to soft recover a ring lockup
> >>> + *
> >>> + * @ring: ring to try the recovery on
> >>> + * @vmid: VMID we try to get going again
> >>> + * @fence: timedout fence
> >>> + *
> >>> + * Tries to get a ring proceeding again when it is stuck.
> >>> + */
> >>> +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int 
> >>> vmid,
> >>> +  struct dma_fence *fence)
> >>> +{
> >>> +   ktime_t deadline = ktime_add_us(ktime_get(), 1000);
> >>> +
> >>> +   if (!ring->funcs->soft_recovery)
> >>> +   return false;
> >>> +
> >>> +   while (!dma_fence_is_signaled(fence) &&
> >>> +  ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
> >>> +   ring->funcs->soft_recovery(ring, vmid);
> >>> +
> >>> +   return dma_fence_is_signaled(fence);
> >>> +}
> >>> +
> >>>   /*
> >>>* Debugfs info
> >>>*/
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >>> index 409fdd9b9710..9cc239968e40 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> >>> @@ -168,6 +168,8 @@ struct amdgpu_ring_funcs {
> >>>  /* priority functions */
> >>>  void (*set_priority) (struct amdgpu_ring *ring,
> >>>enum drm_sched_priority priority);
> >>> +   /* Try to soft recover the ring to make the fence signal */
> >>> +   void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
> >>>   };
> >>>
> >>>   struct amdgpu_ring {
> >>> @@ -260,6 +262,8 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring);
> >>>   void amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring 
> >>> *ring,
> >>>  uint32_t reg0, uint32_t 
> >>> val0,
> >>>  uint32_t reg1, uint32_t 
> >>> val1);
> >>> +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int 
> >>> vmid,
> >>> +  struct dma_fen

RE: [PATCH 1/3] drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins

2018-08-23 Thread Zhu, Rex


> -Original Message-
> From: Michel Dänzer 
> Sent: Thursday, August 23, 2018 6:59 PM
> To: Zhu, Rex 
> Cc: amd-gfx@lists.freedesktop.org; Kuehling, Felix
> 
> Subject: Re: [PATCH 1/3] drm/amdgpu: Fix vce initialize failed on
> Kaveri/Mullins
> 
> On 2018-08-23 11:24 a.m., Rex Zhu wrote:
> > Forgot to add vce pg support via smu for Kaveri/Mullins.
> >
> > Regression issue caused by
> > 'commit 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
> to
> > set_powergating_by_smu")'
> 
> You can replace this paragraph with
> 
> Fixes: 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
>   to set_powergating_by_smu")
> 
> 
> This patch fixes the VCE ring (and also IB) test on this laptop, thanks!
> 
> Unfortunately though, there's still an oops if I let the amdkfd driver load
> together with amdgpu (no issue when loading amdkfd manually later), see
> the attached kernel.log excerpt. This is also a regression in the
> 4.19 drm tree changes. It might be a separate issue, but TBH I don't feel like
> another day or two bisecting right now. :)

Thanks Michel,  I will check the oops issue tomorrow.

Regards
Rex
> 
> Anyway, this series is
> 
> Tested-by: Michel Dänzer 
> 
> 
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Remove duplicated power source update

2018-08-23 Thread Deucher, Alexander
In that case,

Reviewed-by: Alex Deucher 


From: amd-gfx  on behalf of Zhu, Rex 

Sent: Thursday, August 23, 2018 8:41:28 AM
To: Alex Deucher
Cc: Wu, Hersen; amd-gfx list
Subject: RE: [PATCH] drm/amdgpu: Remove duplicated power source update

Hi Alex,

We get the initial state in amdgpu_device_init.

Best Regards
Rex



> -Original Message-
> From: Alex Deucher 
> Sent: Thursday, August 23, 2018 8:37 PM
> To: Zhu, Rex 
> Cc: amd-gfx list ; Wu, Hersen
> 
> Subject: Re: [PATCH] drm/amdgpu: Remove duplicated power source update
>
> On Thu, Aug 23, 2018 at 2:40 AM Rex Zhu  wrote:
> >
> > When AC/DC switches, the driver will be notified by an ACPI event and
> > the power source will be updated, so there's no need to query the power
> > source when setting the power state.
>
> Don't we need this to get the initial state?  Maybe we should move this to
> one of the init functions if we don't already check there.
>
> Alex
>
> >
> > Signed-off-by: Rex Zhu 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 8 
> >  1 file changed, 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> > index daa55fb..3e51e9c 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> > @@ -1933,14 +1933,6 @@ void amdgpu_pm_compute_clocks(struct
> amdgpu_device *adev)
> > amdgpu_fence_wait_empty(ring);
> > }
> >
> > -   mutex_lock(&adev->pm.mutex);
> > -   /* update battery/ac status */
> > -   if (power_supply_is_system_supplied() > 0)
> > -   adev->pm.ac_power = true;
> > -   else
> > -   adev->pm.ac_power = false;
> > -   mutex_unlock(&adev->pm.mutex);
> > -
> > if (adev->powerplay.pp_funcs->dispatch_tasks) {
> > if (!amdgpu_device_has_dc_support(adev)) {
> > mutex_lock(&adev->pm.mutex);
> > --
> > 1.9.1
> >
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
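
For illustration, the one-time initialization being discussed would look
roughly like this (the function name is hypothetical;
power_supply_is_system_supplied() is the same helper the removed code used):

static void amdgpu_pm_init_power_source(struct amdgpu_device *adev)
{
	mutex_lock(&adev->pm.mutex);
	/* Seed pm.ac_power once; later changes arrive via ACPI events. */
	adev->pm.ac_power = power_supply_is_system_supplied() > 0;
	mutex_unlock(&adev->pm.mutex);
}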


Re: [PATCH 11/11] drm/amdgpu: enable GTT PD/PT for raven

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 11:44:04AM -0400, Andrey Grodzovsky wrote:
> 
> 
> On 08/22/2018 11:05 AM, Christian König wrote:
> >Should work on Vega10 as well, but with an obvious performance hit.
> >
> >Older APUs can be enabled as well, but will probably be more work.

Raven's VRAM is actually carved out of system memory. May I know the benefit
of switching the PD/PT BOs from VRAM to GART?

> >
> >Signed-off-by: Christian König 
> >---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 ++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> >b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >index 928fdae0dab4..670a42729f88 100644
> >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >@@ -308,6 +308,7 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device 
> >*adev, struct amdgpu_vm *vm,
> > list_move(&bo_base->vm_status, &vm->moved);
> > spin_unlock(&vm->moved_lock);
> > } else {
> >+amdgpu_ttm_alloc_gart(&bo->tbo);
> 
> Looks like you forgot to check the return value here.
> 

Yes, same comment from me.

Thanks,
Ray

> Andrey
> 
> > list_move(&bo_base->vm_status, &vm->relocated);
> > }
> > }
> >@@ -396,6 +397,10 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device 
> >*adev,
> > if (r)
> > goto error;
> >+r = amdgpu_ttm_alloc_gart(&bo->tbo);
> >+if (r)
> >+return r;
> >+
> > r = amdgpu_job_alloc_with_ib(adev, 64, &job);
> > if (r)
> > goto error;
> >@@ -461,7 +466,11 @@ static void amdgpu_vm_bo_param(struct amdgpu_device 
> >*adev, struct amdgpu_vm *vm,
> > bp->size = amdgpu_vm_bo_size(adev, level);
> > bp->byte_align = AMDGPU_GPU_PAGE_SIZE;
> > bp->domain = AMDGPU_GEM_DOMAIN_VRAM;
> >-bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
> >+if (bp->size <= PAGE_SIZE && adev->asic_type == CHIP_RAVEN)
> >+bp->domain |= AMDGPU_GEM_DOMAIN_GTT;
> >+bp->domain = amdgpu_bo_get_preferred_pin_domain(adev, bp->domain);
> >+bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
> >+AMDGPU_GEM_CREATE_CPU_GTT_USWC;
> > if (vm->use_cpu_for_update)
> > bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> > else
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 10/11] drm/amdgpu: add helper for VM PD/PT allocation parameters

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:16PM +0200, Christian König wrote:
> Add a helper function to figure them out only once.
> 
> Signed-off-by: Christian König 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 61 --
>  1 file changed, 28 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 87e3d44b0a3f..928fdae0dab4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -446,6 +446,31 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
>   return r;
>  }
>  
> +/**
> + * amdgpu_vm_bo_param - fill in parameters for PD/PT allocation
> + *
> + * @adev: amdgpu_device pointer
> + * @vm: requesting vm
> + * @bp: resulting BO allocation parameters
> + */
> +static void amdgpu_vm_bo_param(struct amdgpu_device *adev, struct amdgpu_vm 
> *vm,
> +int level, struct amdgpu_bo_param *bp)
> +{
> + memset(bp, 0, sizeof(*bp));
> +
> + bp->size = amdgpu_vm_bo_size(adev, level);
> + bp->byte_align = AMDGPU_GPU_PAGE_SIZE;
> + bp->domain = AMDGPU_GEM_DOMAIN_VRAM;
> + bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
> + if (vm->use_cpu_for_update)
> + bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> + else
> + bp->flags |= AMDGPU_GEM_CREATE_SHADOW;
> + bp->type = ttm_bo_type_kernel;
> + if (vm->root.base.bo)
> + bp->resv = vm->root.base.bo->tbo.resv;
> +}
> +
>  /**
>   * amdgpu_vm_alloc_levels - allocate the PD/PT levels
>   *
> @@ -469,8 +494,8 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device 
> *adev,
> unsigned level, bool ats)
>  {
>   unsigned shift = amdgpu_vm_level_shift(adev, level);
> + struct amdgpu_bo_param bp;
>   unsigned pt_idx, from, to;
> - u64 flags;
>   int r;
>  
>   if (!parent->entries) {
> @@ -494,29 +519,14 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device 
> *adev,
>   saddr = saddr & ((1 << shift) - 1);
>   eaddr = eaddr & ((1 << shift) - 1);
>  
> - flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
> - if (vm->use_cpu_for_update)
> - flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> - else
> - flags |= (AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
> - AMDGPU_GEM_CREATE_SHADOW);
> + amdgpu_vm_bo_param(adev, vm, level, &bp);
>  
>   /* walk over the address space and allocate the page tables */
>   for (pt_idx = from; pt_idx <= to; ++pt_idx) {
> - struct reservation_object *resv = vm->root.base.bo->tbo.resv;
>   struct amdgpu_vm_pt *entry = &parent->entries[pt_idx];
>   struct amdgpu_bo *pt;
>  
>   if (!entry->base.bo) {
> - struct amdgpu_bo_param bp;
> -
> - memset(&bp, 0, sizeof(bp));
> - bp.size = amdgpu_vm_bo_size(adev, level);
> - bp.byte_align = AMDGPU_GPU_PAGE_SIZE;
> - bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
> - bp.flags = flags;
> - bp.type = ttm_bo_type_kernel;
> - bp.resv = resv;
>   r = amdgpu_bo_create(adev, &bp, &pt);
>   if (r)
>   return r;
> @@ -2564,8 +2574,6 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
> amdgpu_vm *vm,
>  {
>   struct amdgpu_bo_param bp;
>   struct amdgpu_bo *root;
> - unsigned long size;
> - uint64_t flags;
>   int r, i;
>  
>   vm->va = RB_ROOT_CACHED;
> @@ -2602,20 +2610,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
> amdgpu_vm *vm,
> "CPU update of VM recommended only for large BAR system\n");
>   vm->last_update = NULL;
>  
> - flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
> - if (vm->use_cpu_for_update)
> - flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> - else
> - flags |= AMDGPU_GEM_CREATE_SHADOW;
> -
> - size = amdgpu_vm_bo_size(adev, adev->vm_manager.root_level);
> - memset(&bp, 0, sizeof(bp));
> - bp.size = size;
> - bp.byte_align = AMDGPU_GPU_PAGE_SIZE;
> - bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
> - bp.flags = flags;
> - bp.type = ttm_bo_type_kernel;
> - bp.resv = NULL;
> + amdgpu_vm_bo_param(adev, vm, adev->vm_manager.root_level, &bp);
>   r = amdgpu_bo_create(adev, &bp, &root);
>   if (r)
>   goto error_free_sched_entity;
> -- 
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 09/11] drm/amdgpu: add amdgpu_gmc_get_pde_for_bo helper

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:15PM +0200, Christian König wrote:
> Helper to get the PDE for a PD/PT.
> 
> Signed-off-by: Christian König 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 37 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  4 +--
>  5 files changed, 57 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 36058feac64f..6f79ce108728 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -26,6 +26,38 @@
>  
>  #include "amdgpu.h"
>  
> +/**
> + * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
> + *
> + * @bo: the BO to get the PDE for
> + * @level: the level in the PD hirarchy
> + * @addr: resulting addr
> + * @flags: resulting flags
> + *
> + * Get the address and flags to be used for a PDE.
> + */
> +void amdgpu_gmc_get_pde_for_bo(struct amdgpu_bo *bo, int level,
> +uint64_t *addr, uint64_t *flags)
> +{
> + struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> + struct ttm_dma_tt *ttm;
> +
> + switch (bo->tbo.mem.mem_type) {
> + case TTM_PL_TT:
> + ttm = container_of(bo->tbo.ttm, struct ttm_dma_tt, ttm);
> + *addr = ttm->dma_address[0];
> + break;
> + case TTM_PL_VRAM:
> + *addr = amdgpu_bo_gpu_offset(bo);
> + break;
> + default:
> + *addr = 0;
> + break;
> + }
> + *flags = amdgpu_ttm_tt_pde_flags(bo->tbo.ttm, &bo->tbo.mem);
> + amdgpu_gmc_get_vm_pde(adev, level, addr, flags);
> +}
> +
>  /**
>   * amdgpu_gmc_pd_addr - return the address of the root directory
>   *
> @@ -35,13 +67,14 @@ uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo)
>   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>   uint64_t pd_addr;
>  
> - pd_addr = amdgpu_bo_gpu_offset(bo);
>   /* TODO: move that into ASIC specific code */
>   if (adev->asic_type >= CHIP_VEGA10) {
>   uint64_t flags = AMDGPU_PTE_VALID;
>  
> - amdgpu_gmc_get_vm_pde(adev, -1, &pd_addr, &flags);
> + amdgpu_gmc_get_pde_for_bo(bo, -1, &pd_addr, &flags);
>   pd_addr |= flags;
> + } else {
> + pd_addr = amdgpu_bo_gpu_offset(bo);
>   }
>   return pd_addr;
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> index 7c469cce0498..0d2c9f65ca13 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> @@ -131,6 +131,8 @@ static inline bool amdgpu_gmc_vram_full_visible(struct 
> amdgpu_gmc *gmc)
>   return (gmc->real_vram_size == gmc->visible_vram_size);
>  }
>  
> +void amdgpu_gmc_get_pde_for_bo(struct amdgpu_bo *bo, int level,
> +uint64_t *addr, uint64_t *flags);
>  uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo);
>  
>  #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index eb08a03b82a0..72366643e3c2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1428,13 +1428,14 @@ bool amdgpu_ttm_tt_is_readonly(struct ttm_tt *ttm)
>  }
>  
>  /**
> - * amdgpu_ttm_tt_pte_flags - Compute PTE flags for ttm_tt object
> + * amdgpu_ttm_tt_pde_flags - Compute PDE flags for ttm_tt object
>   *
>   * @ttm: The ttm_tt object to compute the flags for
>   * @mem: The memory registry backing this ttm_tt object
> + *
> + * Figure out the flags to use for a VM PDE.
>   */
> -uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt 
> *ttm,
> -  struct ttm_mem_reg *mem)
> +uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct ttm_mem_reg *mem)
>  {
>   uint64_t flags = 0;
>  
> @@ -1448,6 +1449,20 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device 
> *adev, struct ttm_tt *ttm,
>   flags |= AMDGPU_PTE_SNOOPED;
>   }
>  
> + return flags;
> +}
> +
> +/**
> + * amdgpu_ttm_tt_pte_flags - Compute PTE flags for ttm_tt object
> + *
> + * @ttm: The ttm_tt object to compute the flags for
> + * @mem: The memory registry backing this ttm_tt object
> + */
> +uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt 
> *ttm,
> +  struct ttm_mem_reg *mem)
> +{
> + uint64_t flags = amdgpu_ttm_tt_pde_flags(ttm, mem);
> +
>   flags |= adev->gart.gart_pte_flags;
>   flags |= AMDGPU_PTE_READABLE;
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
> index 8b3cc6687769..fe8f276e9811 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
> +++ b/drivers/g

RE: [PATCH] drm/amdgpu: Remove duplicated power source update

2018-08-23 Thread Zhu, Rex
Hi Alex,

We get the initial state in amdgpu_device_init.

Best Regards
Rex

 

> -Original Message-
> From: Alex Deucher 
> Sent: Thursday, August 23, 2018 8:37 PM
> To: Zhu, Rex 
> Cc: amd-gfx list ; Wu, Hersen
> 
> Subject: Re: [PATCH] drm/amdgpu: Remove duplicated power source update
> 
> On Thu, Aug 23, 2018 at 2:40 AM Rex Zhu  wrote:
> >
> > When AC/DC switches, the driver will be notified by an ACPI event and
> > the power source will be updated, so there's no need to query the power
> > source when setting the power state.
> 
> Don't we need this to get the initial state?  Maybe we should move this to
> one of the init functions if we don't already check there.
> 
> Alex
> 
> >
> > Signed-off-by: Rex Zhu 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 8 
> >  1 file changed, 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> > index daa55fb..3e51e9c 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> > @@ -1933,14 +1933,6 @@ void amdgpu_pm_compute_clocks(struct
> amdgpu_device *adev)
> > amdgpu_fence_wait_empty(ring);
> > }
> >
> > -   mutex_lock(&adev->pm.mutex);
> > -   /* update battery/ac status */
> > -   if (power_supply_is_system_supplied() > 0)
> > -   adev->pm.ac_power = true;
> > -   else
> > -   adev->pm.ac_power = false;
> > -   mutex_unlock(&adev->pm.mutex);
> > -
> > if (adev->powerplay.pp_funcs->dispatch_tasks) {
> > if (!amdgpu_device_has_dc_support(adev)) {
> > mutex_lock(&adev->pm.mutex);
> > --
> > 1.9.1
> >
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/5] drm/amdgpu: remove extra root PD alignment

2018-08-23 Thread Christian König
Just another leftover from radeon.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 3 ---
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a6b1126c61fd..53ce9982a5ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2566,8 +2566,6 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
 {
struct amdgpu_bo_param bp;
struct amdgpu_bo *root;
-   const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
-   AMDGPU_VM_PTE_COUNT(adev) * 8);
unsigned long size;
uint64_t flags;
int r, i;
@@ -2615,7 +2613,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
size = amdgpu_vm_bo_size(adev, adev->vm_manager.root_level);
memset(&bp, 0, sizeof(bp));
bp.size = size;
-   bp.byte_align = align;
+   bp.byte_align = AMDGPU_GPU_PAGE_SIZE;
bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
bp.flags = flags;
bp.type = ttm_bo_type_kernel;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 1162c2bf3138..1c9049feaaea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -48,9 +48,6 @@ struct amdgpu_bo_list_entry;
 /* number of entries in page table */
 #define AMDGPU_VM_PTE_COUNT(adev) (1 << (adev)->vm_manager.block_size)
 
-/* PTBs (Page Table Blocks) need to be aligned to 32K */
-#define AMDGPU_VM_PTB_ALIGN_SIZE   32768
-
 #define AMDGPU_PTE_VALID   (1ULL << 0)
 #define AMDGPU_PTE_SYSTEM  (1ULL << 1)
 #define AMDGPU_PTE_SNOOPED (1ULL << 2)
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
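
To see why the removed clamp was dead weight, assume the usual 9-bit block
size of the multi-level page-table layout:

/*
 *   AMDGPU_VM_PTE_COUNT(adev) = 1 << 9        = 512 entries
 *   512 entries * 8 bytes per entry           = 4096 bytes
 *   min(AMDGPU_VM_PTB_ALIGN_SIZE, 4096)       = 4096
 *
 * i.e. for that configuration the old expression already evaluated to
 * AMDGPU_GPU_PAGE_SIZE (4096), so aligning to the page size directly
 * changes nothing in practice.
 */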


[PATCH 2/5] drm/amdgpu: add helper for VM PD/PT allocation parameters

2018-08-23 Thread Christian König
Add a helper function to figure them out only once.

Signed-off-by: Christian König 
Reviewed-by: Junwei Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 61 --
 1 file changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 53ce9982a5ee..4035cf193d91 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -446,6 +446,31 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
return r;
 }
 
+/**
+ * amdgpu_vm_bo_param - fill in parameters for PD/PT allocation
+ *
+ * @adev: amdgpu_device pointer
+ * @vm: requesting vm
+ * @bp: resulting BO allocation parameters
+ */
+static void amdgpu_vm_bo_param(struct amdgpu_device *adev, struct amdgpu_vm 
*vm,
+  int level, struct amdgpu_bo_param *bp)
+{
+   memset(bp, 0, sizeof(*bp));
+
+   bp->size = amdgpu_vm_bo_size(adev, level);
+   bp->byte_align = AMDGPU_GPU_PAGE_SIZE;
+   bp->domain = AMDGPU_GEM_DOMAIN_VRAM;
+   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+   if (vm->use_cpu_for_update)
+   bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+   else
+   bp->flags |= AMDGPU_GEM_CREATE_SHADOW;
+   bp->type = ttm_bo_type_kernel;
+   if (vm->root.base.bo)
+   bp->resv = vm->root.base.bo->tbo.resv;
+}
+
 /**
  * amdgpu_vm_alloc_levels - allocate the PD/PT levels
  *
@@ -469,8 +494,8 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device 
*adev,
  unsigned level, bool ats)
 {
unsigned shift = amdgpu_vm_level_shift(adev, level);
+   struct amdgpu_bo_param bp;
unsigned pt_idx, from, to;
-   u64 flags;
int r;
 
if (!parent->entries) {
@@ -494,29 +519,14 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device 
*adev,
saddr = saddr & ((1 << shift) - 1);
eaddr = eaddr & ((1 << shift) - 1);
 
-   flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
-   if (vm->use_cpu_for_update)
-   flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
-   else
-   flags |= (AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
-   AMDGPU_GEM_CREATE_SHADOW);
+   amdgpu_vm_bo_param(adev, vm, level, &bp);
 
/* walk over the address space and allocate the page tables */
for (pt_idx = from; pt_idx <= to; ++pt_idx) {
-   struct reservation_object *resv = vm->root.base.bo->tbo.resv;
struct amdgpu_vm_pt *entry = &parent->entries[pt_idx];
struct amdgpu_bo *pt;
 
if (!entry->base.bo) {
-   struct amdgpu_bo_param bp;
-
-   memset(&bp, 0, sizeof(bp));
-   bp.size = amdgpu_vm_bo_size(adev, level);
-   bp.byte_align = AMDGPU_GPU_PAGE_SIZE;
-   bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
-   bp.flags = flags;
-   bp.type = ttm_bo_type_kernel;
-   bp.resv = resv;
r = amdgpu_bo_create(adev, &bp, &pt);
if (r)
return r;
@@ -2566,8 +2576,6 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
 {
struct amdgpu_bo_param bp;
struct amdgpu_bo *root;
-   unsigned long size;
-   uint64_t flags;
int r, i;
 
vm->va = RB_ROOT_CACHED;
@@ -2604,20 +2612,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
  "CPU update of VM recommended only for large BAR system\n");
vm->last_update = NULL;
 
-   flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
-   if (vm->use_cpu_for_update)
-   flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
-   else
-   flags |= AMDGPU_GEM_CREATE_SHADOW;
-
-   size = amdgpu_vm_bo_size(adev, adev->vm_manager.root_level);
-   memset(&bp, 0, sizeof(bp));
-   bp.size = size;
-   bp.byte_align = AMDGPU_GPU_PAGE_SIZE;
-   bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
-   bp.flags = flags;
-   bp.type = ttm_bo_type_kernel;
-   bp.resv = NULL;
+   amdgpu_vm_bo_param(adev, vm, adev->vm_manager.root_level, &bp);
r = amdgpu_bo_create(adev, &bp, &root);
if (r)
goto error_free_sched_entity;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
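
Both call sites then collapse to the same shape (condensed from the hunks
above):

	struct amdgpu_bo_param bp;
	struct amdgpu_bo *pt;
	int r;

	amdgpu_vm_bo_param(adev, vm, level, &bp);
	r = amdgpu_bo_create(adev, &bp, &pt);
	if (r)
		return r;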


RE: KFD co-maintainership and branch model

2018-08-23 Thread Deucher, Alexander
> -Original Message-
> From: Christian König 
> Sent: Thursday, August 23, 2018 3:01 AM
> To: Oded Gabbay ; Dave Airlie
> 
> Cc: Deucher, Alexander ; Kuehling, Felix
> ; amd-gfx list 
> Subject: Re: KFD co-maintainership and branch model
> 
> On 23.08.2018 at 08:54, Oded Gabbay wrote:
> > On Thu, Aug 23, 2018 at 4:34 AM David Airlie  wrote:
> >> On Thu, Aug 23, 2018 at 8:25 AM, Felix Kuehling
>  wrote:
> >>> Hi all,
> >>>
> >>> Oded has offered to make me co-maintainer of KFD, as he's super busy
> >>> at work and less responsive than he used to be.
> >>>
> >>> At the same time we're about to send out the first patches to merge
> >>> KFD and AMDGPU into a single kernel module.
> >>>
> >>> With that in mind I'd like to propose to upstream KFD through Alex's
> >>> branch in the future. It would avoid conflicts in shared code
> >>> (amdgpu_vm.c is most active at the moment) when merging branches,
> >>> and make the code flow and testing easier.
> >>>
> >>> Please let me know what you think?
> >>>
> >> Works for me.
> >>
> >> Thanks,
> >> Dave.
> > Works for me as well.
> 
> Sounds good to me as well.

Works for me as well.

Alex

> 
> Christian.
> 
> >
> > Oded
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Remove duplicated power source update

2018-08-23 Thread Alex Deucher
On Thu, Aug 23, 2018 at 2:40 AM Rex Zhu  wrote:
>
> When AC/DC switches, the driver will be notified by an ACPI event and
> the power source will be updated, so there's no need to query the power
> source when setting the power state.

Don't we need this to get the initial state?  Maybe we should move
this to one of the init functions if we don't already check there.

Alex

>
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index daa55fb..3e51e9c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -1933,14 +1933,6 @@ void amdgpu_pm_compute_clocks(struct amdgpu_device 
> *adev)
> amdgpu_fence_wait_empty(ring);
> }
>
> -   mutex_lock(&adev->pm.mutex);
> -   /* update battery/ac status */
> -   if (power_supply_is_system_supplied() > 0)
> -   adev->pm.ac_power = true;
> -   else
> -   adev->pm.ac_power = false;
> -   mutex_unlock(&adev->pm.mutex);
> -
> if (adev->powerplay.pp_funcs->dispatch_tasks) {
> if (!amdgpu_device_has_dc_support(adev)) {
> mutex_lock(&adev->pm.mutex);
> --
> 1.9.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 5/5] drm/amdgpu: enable GTT PD/PT for raven v2

2018-08-23 Thread Christian König
Should work on Vega10 as well, but with an obvious performance hit.

Older APUs can be enabled as well, but will probably be more work.

v2: fix error checking

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 928fdae0dab4..9ec56f59b03a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -308,6 +308,9 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
list_move(&bo_base->vm_status, &vm->moved);
spin_unlock(&vm->moved_lock);
} else {
+   r = amdgpu_ttm_alloc_gart(&bo->tbo);
+   if (r)
+   break;
list_move(&bo_base->vm_status, &vm->relocated);
}
}
@@ -396,6 +399,10 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
if (r)
goto error;
 
+   r = amdgpu_ttm_alloc_gart(&bo->tbo);
+   if (r)
+   return r;
+
r = amdgpu_job_alloc_with_ib(adev, 64, &job);
if (r)
goto error;
@@ -461,7 +468,11 @@ static void amdgpu_vm_bo_param(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
bp->size = amdgpu_vm_bo_size(adev, level);
bp->byte_align = AMDGPU_GPU_PAGE_SIZE;
bp->domain = AMDGPU_GEM_DOMAIN_VRAM;
-   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+   if (bp->size <= PAGE_SIZE && adev->asic_type == CHIP_RAVEN)
+   bp->domain |= AMDGPU_GEM_DOMAIN_GTT;
+   bp->domain = amdgpu_bo_get_preferred_pin_domain(adev, bp->domain);
+   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
+   AMDGPU_GEM_CREATE_CPU_GTT_USWC;
if (vm->use_cpu_for_update)
bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
else
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 4/5] drm/amdgpu: add amdgpu_gmc_get_pde_for_bo helper

2018-08-23 Thread Christian König
Helper to get the PDE for a PD/PT.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 37 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |  4 +--
 5 files changed, 57 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 36058feac64f..6f79ce108728 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -26,6 +26,38 @@
 
 #include "amdgpu.h"
 
+/**
+ * amdgpu_gmc_get_pde_for_bo - get the PDE for a BO
+ *
+ * @bo: the BO to get the PDE for
+ * @level: the level in the PD hierarchy
+ * @addr: resulting addr
+ * @flags: resulting flags
+ *
+ * Get the address and flags to be used for a PDE.
+ */
+void amdgpu_gmc_get_pde_for_bo(struct amdgpu_bo *bo, int level,
+  uint64_t *addr, uint64_t *flags)
+{
+   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+   struct ttm_dma_tt *ttm;
+
+   switch (bo->tbo.mem.mem_type) {
+   case TTM_PL_TT:
+   ttm = container_of(bo->tbo.ttm, struct ttm_dma_tt, ttm);
+   *addr = ttm->dma_address[0];
+   break;
+   case TTM_PL_VRAM:
+   *addr = amdgpu_bo_gpu_offset(bo);
+   break;
+   default:
+   *addr = 0;
+   break;
+   }
+   *flags = amdgpu_ttm_tt_pde_flags(bo->tbo.ttm, &bo->tbo.mem);
+   amdgpu_gmc_get_vm_pde(adev, level, addr, flags);
+}
+
 /**
  * amdgpu_gmc_pd_addr - return the address of the root directory
  *
@@ -35,13 +67,14 @@ uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo)
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
uint64_t pd_addr;
 
-   pd_addr = amdgpu_bo_gpu_offset(bo);
/* TODO: move that into ASIC specific code */
if (adev->asic_type >= CHIP_VEGA10) {
uint64_t flags = AMDGPU_PTE_VALID;
 
-   amdgpu_gmc_get_vm_pde(adev, -1, &pd_addr, &flags);
+   amdgpu_gmc_get_pde_for_bo(bo, -1, &pd_addr, &flags);
pd_addr |= flags;
+   } else {
+   pd_addr = amdgpu_bo_gpu_offset(bo);
}
return pd_addr;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 7c469cce0498..0d2c9f65ca13 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -131,6 +131,8 @@ static inline bool amdgpu_gmc_vram_full_visible(struct 
amdgpu_gmc *gmc)
return (gmc->real_vram_size == gmc->visible_vram_size);
 }
 
+void amdgpu_gmc_get_pde_for_bo(struct amdgpu_bo *bo, int level,
+  uint64_t *addr, uint64_t *flags);
 uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index eb08a03b82a0..72366643e3c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1428,13 +1428,14 @@ bool amdgpu_ttm_tt_is_readonly(struct ttm_tt *ttm)
 }
 
 /**
- * amdgpu_ttm_tt_pte_flags - Compute PTE flags for ttm_tt object
+ * amdgpu_ttm_tt_pde_flags - Compute PDE flags for ttm_tt object
  *
  * @ttm: The ttm_tt object to compute the flags for
  * @mem: The memory registry backing this ttm_tt object
+ *
+ * Figure out the flags to use for a VM PDE.
  */
-uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt 
*ttm,
-struct ttm_mem_reg *mem)
+uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct ttm_mem_reg *mem)
 {
uint64_t flags = 0;
 
@@ -1448,6 +1449,20 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device 
*adev, struct ttm_tt *ttm,
flags |= AMDGPU_PTE_SNOOPED;
}
 
+   return flags;
+}
+
+/**
+ * amdgpu_ttm_tt_pte_flags - Compute PTE flags for ttm_tt object
+ *
+ * @ttm: The ttm_tt object to compute the flags for
+ * @mem: The memory registry backing this ttm_tt object
+ */
+uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt 
*ttm,
+struct ttm_mem_reg *mem)
+{
+   uint64_t flags = amdgpu_ttm_tt_pde_flags(ttm, mem);
+
flags |= adev->gart.gart_pte_flags;
flags |= AMDGPU_PTE_READABLE;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 8b3cc6687769..fe8f276e9811 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -116,6 +116,7 @@ bool amdgpu_ttm_tt_userptr_invalidated(struct ttm_tt *ttm,
   int *last_invalidated);
 bool amdgpu_ttm_tt_userptr_needs_pages(struct ttm_tt *ttm);
 bool amdgpu_ttm_tt_is_readonly(struct ttm_tt *ttm);

[PATCH 3/5] drm/amdgpu: add GMC9 support for PDs/PTs in system memory

2018-08-23 Thread Christian König
Add the necessary handling.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index a82b3eb429e8..453bd7ea50e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -560,7 +560,7 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct 
amdgpu_device *adev,
 static void gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, int level,
uint64_t *addr, uint64_t *flags)
 {
-   if (!(*flags & AMDGPU_PDE_PTE))
+   if (!(*flags & AMDGPU_PDE_PTE) && !(*flags & AMDGPU_PTE_SYSTEM))
*addr = adev->vm_manager.vram_base_offset + *addr -
adev->gmc.vram_start;
BUG_ON(*addr & 0x003FULL);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
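
Combined with the amdgpu_gmc_get_pde_for_bo() helper from the same series,
the resulting PDE address selection can be summarized as follows (a sketch
of the combined behavior, not literal code from either patch):

/* PD/PT in system memory (GTT):
 *   addr   = ttm->dma_address[0];      DMA address of the first page
 *   flags |= AMDGPU_PTE_SYSTEM;        so gmc_v9_0_get_vm_pde() skips the
 *                                      VRAM base-offset translation
 *
 * PD/PT in VRAM:
 *   addr = amdgpu_bo_gpu_offset(bo);
 *   addr = vram_base_offset + addr - gmc.vram_start;
 */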


[PATCH] drm/amdgpu: enable GTT PD/PT for raven v2

2018-08-23 Thread Christian König
Should work on Vega10 as well, but with an obvious performance hit.

Older APUs can be enabled as well, but will probably be more work.

v2: fix error checking

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 928fdae0dab4..9ec56f59b03a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -308,6 +308,9 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
list_move(&bo_base->vm_status, &vm->moved);
spin_unlock(&vm->moved_lock);
} else {
+   r = amdgpu_ttm_alloc_gart(&bo->tbo);
+   if (r)
+   break;
list_move(&bo_base->vm_status, &vm->relocated);
}
}
@@ -396,6 +399,10 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
if (r)
goto error;
 
+   r = amdgpu_ttm_alloc_gart(&bo->tbo);
+   if (r)
+   return r;
+
r = amdgpu_job_alloc_with_ib(adev, 64, &job);
if (r)
goto error;
@@ -461,7 +468,11 @@ static void amdgpu_vm_bo_param(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
bp->size = amdgpu_vm_bo_size(adev, level);
bp->byte_align = AMDGPU_GPU_PAGE_SIZE;
bp->domain = AMDGPU_GEM_DOMAIN_VRAM;
-   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
+   if (bp->size <= PAGE_SIZE && adev->asic_type == CHIP_RAVEN)
+   bp->domain |= AMDGPU_GEM_DOMAIN_GTT;
+   bp->domain = amdgpu_bo_get_preferred_pin_domain(adev, bp->domain);
+   bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
+   AMDGPU_GEM_CREATE_CPU_GTT_USWC;
if (vm->use_cpu_for_update)
bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
else
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 08/11] drm/amdgpu: add amdgpu_gmc_pd_addr helper

2018-08-23 Thread Christian König

On 23.08.2018 at 05:07, Zhang, Jerry (Junwei) wrote:

On 08/22/2018 11:05 PM, Christian König wrote:

Add a helper to get the root PD address and remove the workarounds from
the GMC9 code for that.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/Makefile   |  3 +-
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  5 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c    |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   | 47 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h   |  2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  2 +-
  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c  |  7 +--
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  4 --
  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c   |  7 +--
  9 files changed, 56 insertions(+), 23 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile

index 860cb8731c7c..d2bafabe585d 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -51,7 +51,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
  amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
  amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
  amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o 
amdgpu_atomfirmware.o \

-    amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o
+    amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
+    amdgpu_gmc.o

  # add asic specific block
  amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c

index 7eadc58231f2..2e2393fe09b2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -364,7 +364,6 @@ static int vm_validate_pt_pd_bos(struct amdgpu_vm 
*vm)

  struct amdgpu_bo *pd = vm->root.base.bo;
  struct amdgpu_device *adev = amdgpu_ttm_adev(pd->tbo.bdev);
  struct amdgpu_vm_parser param;
-    uint64_t addr, flags = AMDGPU_PTE_VALID;
  int ret;

  param.domain = AMDGPU_GEM_DOMAIN_VRAM;
@@ -383,9 +382,7 @@ static int vm_validate_pt_pd_bos(struct amdgpu_vm 
*vm)

  return ret;
  }

-    addr = amdgpu_bo_gpu_offset(vm->root.base.bo);
-    amdgpu_gmc_get_vm_pde(adev, -1, &addr, &flags);
-    vm->pd_phys_addr = addr;
+    vm->pd_phys_addr = amdgpu_gmc_pd_addr(vm->root.base.bo);

  if (vm->use_cpu_for_update) {
  ret = amdgpu_bo_kmap(pd, NULL);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c

index 17bf63f93c93..d268035cf2f3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -946,7 +946,7 @@ static int amdgpu_cs_vm_handling(struct 
amdgpu_cs_parser *p)

  if (r)
  return r;

-    p->job->vm_pd_addr = amdgpu_bo_gpu_offset(vm->root.base.bo);
+    p->job->vm_pd_addr = amdgpu_gmc_pd_addr(vm->root.base.bo);

  if (amdgpu_vm_debug) {
  /* Invalidate all BOs to test for userspace bugs */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c

new file mode 100644
index ..36058feac64f
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -0,0 +1,47 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person 
obtaining a

+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, 
subject to

+ * the following conditions:
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO 
EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR 
ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 
TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 
SOFTWARE OR THE

+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial 
portions

+ * of the Software.
+ *
+ */
+
+#include "amdgpu.h"
+
+/**
+ * amdgpu_gmc_pd_addr - return the address of the root directory
+ *
+ */
+uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo)


If the function is going to handle all PD addresses, it would be
better to call it in gmc6/7/8 as well.



+{
+    struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+    uint64_t pd_addr;
+
+    pd_addr

Re: [PATCH 02/11] drm/amdgpu: validate the VM root PD from the VM code

2018-08-23 Thread Christian König

On 2018-08-23 09:28, Huang Rui wrote:

On Wed, Aug 22, 2018 at 05:05:08PM +0200, Christian König wrote:

Preparation for following changes. This validates the root PD twice,
but the overhead of that should be minimal.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 73b8dcaf66e6..53ce9982a5ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -291,11 +291,11 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) {
struct amdgpu_bo *bo = bo_base->bo;
  
-		if (bo->parent) {

-   r = validate(param, bo);
-   if (r)
-   break;
+   r = validate(param, bo);
+   if (r)
+   break;

In the original code we skipped the root PD, but now it is validated
once here. May I know where the other validation happens?


That is in the CS code: we add the root PD to the list of BOs which
are validated there.


Christian.



Thanks,
Ray

  
+		if (bo->parent) {

spin_lock(&glob->lru_lock);
ttm_bo_move_to_lru_tail(&bo->tbo);
if (bo->shadow)
--
2.17.1



Re: [PATCH 08/11] drm/amdgpu: add amdgpu_gmc_pd_addr helper

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:14PM +0200, Christian König wrote:
> Add a helper to get the root PD address and remove the workarounds from
> the GMC9 code for that.
> 
> Signed-off-by: Christian König 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile   |  3 +-
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  5 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   | 47 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h   |  2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  2 +-
>  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c  |  7 +--
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  4 --
>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c   |  7 +--
>  9 files changed, 56 insertions(+), 23 deletions(-)
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 860cb8731c7c..d2bafabe585d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -51,7 +51,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>   amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
>   amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
>   amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
> - amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o
> + amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
> + amdgpu_gmc.o
>  
>  # add asic specific block
>  amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 7eadc58231f2..2e2393fe09b2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -364,7 +364,6 @@ static int vm_validate_pt_pd_bos(struct amdgpu_vm *vm)
>   struct amdgpu_bo *pd = vm->root.base.bo;
>   struct amdgpu_device *adev = amdgpu_ttm_adev(pd->tbo.bdev);
>   struct amdgpu_vm_parser param;
> - uint64_t addr, flags = AMDGPU_PTE_VALID;
>   int ret;
>  
>   param.domain = AMDGPU_GEM_DOMAIN_VRAM;
> @@ -383,9 +382,7 @@ static int vm_validate_pt_pd_bos(struct amdgpu_vm *vm)
>   return ret;
>   }
>  
> - addr = amdgpu_bo_gpu_offset(vm->root.base.bo);
> - amdgpu_gmc_get_vm_pde(adev, -1, &addr, &flags);
> - vm->pd_phys_addr = addr;
> + vm->pd_phys_addr = amdgpu_gmc_pd_addr(vm->root.base.bo);
>  
>   if (vm->use_cpu_for_update) {
>   ret = amdgpu_bo_kmap(pd, NULL);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 17bf63f93c93..d268035cf2f3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -946,7 +946,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser 
> *p)
>   if (r)
>   return r;
>  
> - p->job->vm_pd_addr = amdgpu_bo_gpu_offset(vm->root.base.bo);
> + p->job->vm_pd_addr = amdgpu_gmc_pd_addr(vm->root.base.bo);
>  
>   if (amdgpu_vm_debug) {
>   /* Invalidate all BOs to test for userspace bugs */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> new file mode 100644
> index ..36058feac64f
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -0,0 +1,47 @@
> +/*
> + * Copyright 2018 Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, including
> + * without limitation the rights to use, copy, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY 
> CLAIM,
> + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
> + * USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + */
> +
> +#include "amdgpu.h"
> +
> +/**
> + * amdgpu_gmc_pd_addr - return the address of the root directory
> + *
> + */
> +uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *b

Re: [PATCH 07/11] drm/amdgpu: add GMC9 support for PDs/PTs in system memory

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:13PM +0200, Christian König wrote:
> Add the necessary handling.
> 
> Signed-off-by: Christian König 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index e412eb8e347c..3393a329fc9c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -571,7 +571,7 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct 
> amdgpu_device *adev,
>  static void gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   uint64_t *addr, uint64_t *flags)
>  {
> - if (!(*flags & AMDGPU_PDE_PTE))
> + if (!(*flags & AMDGPU_PDE_PTE) && !(*flags & AMDGPU_PTE_SYSTEM))
>   *addr = adev->vm_manager.vram_base_offset + *addr -
>   adev->gmc.vram_start;
>   BUG_ON(*addr & 0x003FULL);
> -- 
> 2.17.1
> 


Re: [PATCH 07/11] drm/amdgpu: add GMC9 support for PDs/PTs in system memory

2018-08-23 Thread Huang Rui
On Thu, Aug 23, 2018 at 10:50:49AM +0800, Zhang, Jerry (Junwei) wrote:
> On 08/22/2018 11:05 PM, Christian König wrote:
> >Add the necessary handling.
> >
> >Signed-off-by: Christian König 
> 
> It looks like this is going to use GTT for the page tables.
> What kind of scenario would use that?
> Could it be replaced by the CPU updating the page tables in system memory?

If the system bit is not set in the PTE, the entry will map a page of
video memory.

Thanks,
Ray
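
As a toy model of the address fixup under discussion: only entries
that are neither PTE-marked nor system-memory get rebased into the
VRAM aperture, while system pages keep their DMA address. The bit
positions and offsets below are made up for illustration and are not
the real amdgpu definitions:

#include <stdint.h>
#include <stdio.h>

#define TOY_PDE_PTE	(1ULL << 54)	/* illustrative bit positions */
#define TOY_PTE_SYSTEM	(1ULL << 1)

static const uint64_t vram_base_offset = 0x100000000ULL;	/* made up */
static const uint64_t vram_start = 0;

/* Mirrors the check in the gmc_v9_0_get_vm_pde() hunk quoted above. */
static uint64_t fixup_pde_addr(uint64_t addr, uint64_t flags)
{
	if (!(flags & TOY_PDE_PTE) && !(flags & TOY_PTE_SYSTEM))
		addr = vram_base_offset + addr - vram_start;	/* VRAM page */
	return addr;	/* system pages keep their DMA address */
}

int main(void)
{
	printf("vram:   0x%llx\n",
	       (unsigned long long)fixup_pde_addr(0x1000, 0));
	printf("system: 0x%llx\n",
	       (unsigned long long)fixup_pde_addr(0x1000, TOY_PTE_SYSTEM));
	return 0;
}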

> 
> Regards,
> Jerry
> 
> >---
> >  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> >b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> >index e412eb8e347c..3393a329fc9c 100644
> >--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> >+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> >@@ -571,7 +571,7 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct 
> >amdgpu_device *adev,
> >  static void gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, int level,
> > uint64_t *addr, uint64_t *flags)
> >  {
> >-if (!(*flags & AMDGPU_PDE_PTE))
> >+if (!(*flags & AMDGPU_PDE_PTE) && !(*flags & AMDGPU_PTE_SYSTEM))
> > *addr = adev->vm_manager.vram_base_offset + *addr -
> > adev->gmc.vram_start;
> > BUG_ON(*addr & 0x003FULL);
> >


Re: [PATCH 05/11] drm/amdgpu: rename gart.robj into gart.bo

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:11PM +0200, Christian König wrote:
> sed -i "s/gart.robj/gart.bo/" drivers/gpu/drm/amd/amdgpu/*.c
> sed -i "s/gart.robj/gart.bo/" drivers/gpu/drm/amd/amdgpu/*.h
> 
> Just cleaning up radeon leftovers.
> 
> Signed-off-by: Christian König 

Yes, 'bo' is a better name for the GART table object than 'robj'.

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 32 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h |  2 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c|  4 +--
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c|  4 +--
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c|  4 +--
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c|  4 +--
>  6 files changed, 25 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index a54d5655a191..f5cb5e2856c1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -112,7 +112,7 @@ int amdgpu_gart_table_vram_alloc(struct amdgpu_device 
> *adev)
>  {
>   int r;
>  
> - if (adev->gart.robj == NULL) {
> + if (adev->gart.bo == NULL) {
>   struct amdgpu_bo_param bp;
>  
>   memset(&bp, 0, sizeof(bp));
> @@ -123,7 +123,7 @@ int amdgpu_gart_table_vram_alloc(struct amdgpu_device 
> *adev)
>   AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>   bp.type = ttm_bo_type_kernel;
>   bp.resv = NULL;
> - r = amdgpu_bo_create(adev, &bp, &adev->gart.robj);
> + r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
>   if (r) {
>   return r;
>   }
> @@ -145,19 +145,19 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device 
> *adev)
>  {
>   int r;
>  
> - r = amdgpu_bo_reserve(adev->gart.robj, false);
> + r = amdgpu_bo_reserve(adev->gart.bo, false);
>   if (unlikely(r != 0))
>   return r;
> - r = amdgpu_bo_pin(adev->gart.robj, AMDGPU_GEM_DOMAIN_VRAM);
> + r = amdgpu_bo_pin(adev->gart.bo, AMDGPU_GEM_DOMAIN_VRAM);
>   if (r) {
> - amdgpu_bo_unreserve(adev->gart.robj);
> + amdgpu_bo_unreserve(adev->gart.bo);
>   return r;
>   }
> - r = amdgpu_bo_kmap(adev->gart.robj, &adev->gart.ptr);
> + r = amdgpu_bo_kmap(adev->gart.bo, &adev->gart.ptr);
>   if (r)
> - amdgpu_bo_unpin(adev->gart.robj);
> - amdgpu_bo_unreserve(adev->gart.robj);
> - adev->gart.table_addr = amdgpu_bo_gpu_offset(adev->gart.robj);
> + amdgpu_bo_unpin(adev->gart.bo);
> + amdgpu_bo_unreserve(adev->gart.bo);
> + adev->gart.table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
>   return r;
>  }
>  
> @@ -173,14 +173,14 @@ void amdgpu_gart_table_vram_unpin(struct amdgpu_device 
> *adev)
>  {
>   int r;
>  
> - if (adev->gart.robj == NULL) {
> + if (adev->gart.bo == NULL) {
>   return;
>   }
> - r = amdgpu_bo_reserve(adev->gart.robj, true);
> + r = amdgpu_bo_reserve(adev->gart.bo, true);
>   if (likely(r == 0)) {
> - amdgpu_bo_kunmap(adev->gart.robj);
> - amdgpu_bo_unpin(adev->gart.robj);
> - amdgpu_bo_unreserve(adev->gart.robj);
> + amdgpu_bo_kunmap(adev->gart.bo);
> + amdgpu_bo_unpin(adev->gart.bo);
> + amdgpu_bo_unreserve(adev->gart.bo);
>   adev->gart.ptr = NULL;
>   }
>  }
> @@ -196,10 +196,10 @@ void amdgpu_gart_table_vram_unpin(struct amdgpu_device 
> *adev)
>   */
>  void amdgpu_gart_table_vram_free(struct amdgpu_device *adev)
>  {
> - if (adev->gart.robj == NULL) {
> + if (adev->gart.bo == NULL) {
>   return;
>   }
> - amdgpu_bo_unref(&adev->gart.robj);
> + amdgpu_bo_unref(&adev->gart.bo);
>  }
>  
>  /*
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index 9f9e9dc87da1..d7b7c2d408d5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -41,7 +41,7 @@ struct amdgpu_bo;
>  
>  struct amdgpu_gart {
>   u64 table_addr;
> - struct amdgpu_bo*robj;
> + struct amdgpu_bo*bo;
>   void*ptr;
>   unsignednum_gpu_pages;
>   unsignednum_cpu_pages;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index c14cf1c5bf57..c50bd0c46508 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -497,7 +497,7 @@ static int gmc_v6_0_gart_enable(struct amdgpu_device 
> *adev)
>   int r, i;
>   u32 field;
>  
> - if (adev->gart.robj == NULL) {
> + if (adev->gart.bo == NULL) {
>   dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
>   return -EINVAL;

[PATCH 3/4] drm/amdgpu: implement soft_recovery for GFX8 v2

2018-08-23 Thread Christian König
Try to kill waves on the SQ.

v2: only for the GFX ring for now.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 282dba6cce86..9de940a65c80 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -6714,6 +6714,18 @@ static void gfx_v8_0_ring_emit_wreg(struct amdgpu_ring 
*ring, uint32_t reg,
amdgpu_ring_write(ring, val);
 }
 
+static void gfx_v8_0_ring_soft_recovery(struct amdgpu_ring *ring, unsigned 
vmid)
+{
+   struct amdgpu_device *adev = ring->adev;
+   uint32_t value = 0;
+
+   value = REG_SET_FIELD(value, SQ_CMD, CMD, 0x03);
+   value = REG_SET_FIELD(value, SQ_CMD, MODE, 0x01);
+   value = REG_SET_FIELD(value, SQ_CMD, CHECK_VMID, 1);
+   value = REG_SET_FIELD(value, SQ_CMD, VM_ID, vmid);
+   WREG32(mmSQ_CMD, value);
+}
+
 static void gfx_v8_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev,
 enum amdgpu_interrupt_state 
state)
 {
@@ -7171,6 +7183,7 @@ static const struct amdgpu_ring_funcs 
gfx_v8_0_ring_funcs_gfx = {
.init_cond_exec = gfx_v8_0_ring_emit_init_cond_exec,
.patch_cond_exec = gfx_v8_0_ring_emit_patch_cond_exec,
.emit_wreg = gfx_v8_0_ring_emit_wreg,
+   .soft_recovery = gfx_v8_0_ring_soft_recovery,
 };
 
 static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_compute = {
-- 
2.14.1
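
For readers unfamiliar with the REG_SET_FIELD pattern used in the
SQ_CMD programming above: it is a read-modify-write of a named
register bit-field via per-field shift/mask constants. Here is a
stand-alone toy version; the shift and mask values are made up and do
not match the real generated SQ_CMD headers:

#include <stdint.h>
#include <stdio.h>

/* Made-up field layout; real amdgpu takes FIELD__SHIFT/FIELD_MASK from
 * generated register headers. */
#define SQ_CMD__CMD__SHIFT	0
#define SQ_CMD__CMD_MASK	0x00000007U
#define SQ_CMD__MODE__SHIFT	4
#define SQ_CMD__MODE_MASK	0x00000070U

/* Clear the field's bits, then merge in the shifted new value. */
#define SET_FIELD(val, field, x) \
	(((val) & ~field##_MASK) | (((uint32_t)(x) << field##__SHIFT) & field##_MASK))

int main(void)
{
	uint32_t value = 0;

	value = SET_FIELD(value, SQ_CMD__CMD, 0x03);	/* kill waves */
	value = SET_FIELD(value, SQ_CMD__MODE, 0x01);	/* broadcast mode */
	printf("SQ_CMD = 0x%08x\n", value);	/* 0x00000013 with this layout */
	return 0;
}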



[PATCH 1/4] drm/amdgpu: add ring soft recovery v3

2018-08-23 Thread Christian König
Instead of hammering hard on the GPU try a soft recovery first.

v2: reorder code a bit
v3: increase timeout to 10ms, increment GPU reset counter

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 25 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 
 3 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 265ff90f4e01..d93e31a5c4e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -33,6 +33,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
struct amdgpu_job *job = to_amdgpu_job(s_job);
 
+   if (amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) 
{
+   DRM_ERROR("ring %s timeout, but soft recovered\n",
+ s_job->sched->name);
+   return;
+   }
+
DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
  job->base.sched->name, atomic_read(&ring->fence_drv.last_seq),
  ring->fence_drv.sync_seq);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 5dfd26be1eec..d445acb3d956 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -383,6 +383,31 @@ void amdgpu_ring_emit_reg_write_reg_wait_helper(struct 
amdgpu_ring *ring,
amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);
 }
 
+/**
+ * amdgpu_ring_soft_recovery - try to soft recover a ring lockup
+ *
+ * @ring: ring to try the recovery on
+ * @vmid: VMID we try to get going again
+ * @fence: timedout fence
+ *
+ * Tries to get a ring proceeding again when it is stuck.
+ */
+bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
+  struct dma_fence *fence)
+{
+   ktime_t deadline = ktime_add_us(ktime_get(), 10000);
+
+   if (!ring->funcs->soft_recovery)
+   return false;
+
+   atomic_inc(&ring->adev->gpu_reset_counter);
+   while (!dma_fence_is_signaled(fence) &&
+  ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
+   ring->funcs->soft_recovery(ring, vmid);
+
+   return dma_fence_is_signaled(fence);
+}
+
 /*
  * Debugfs info
  */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 409fdd9b9710..9cc239968e40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -168,6 +168,8 @@ struct amdgpu_ring_funcs {
/* priority functions */
void (*set_priority) (struct amdgpu_ring *ring,
  enum drm_sched_priority priority);
+   /* Try to soft recover the ring to make the fence signal */
+   void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
 };
 
 struct amdgpu_ring {
@@ -260,6 +262,8 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring);
 void amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
uint32_t reg0, uint32_t val0,
uint32_t reg1, uint32_t val1);
+bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
+  struct dma_fence *fence);
 
 static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
 {
-- 
2.14.1



[PATCH 4/4] drm/amdgpu: implement soft_recovery for GFX9

2018-08-23 Thread Christian König
Try to kill waves on the SQ.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 44707f94b2c5..ab5cacea967b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4421,6 +4421,18 @@ static void gfx_v9_0_ring_emit_reg_write_reg_wait(struct 
amdgpu_ring *ring,
   ref, mask);
 }
 
+static void gfx_v9_0_ring_soft_recovery(struct amdgpu_ring *ring, unsigned 
vmid)
+{
+   struct amdgpu_device *adev = ring->adev;
+   uint32_t value = 0;
+
+   value = REG_SET_FIELD(value, SQ_CMD, CMD, 0x03);
+   value = REG_SET_FIELD(value, SQ_CMD, MODE, 0x01);
+   value = REG_SET_FIELD(value, SQ_CMD, CHECK_VMID, 1);
+   value = REG_SET_FIELD(value, SQ_CMD, VM_ID, vmid);
+   WREG32(mmSQ_CMD, value);
+}
+
 static void gfx_v9_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev,
 enum amdgpu_interrupt_state 
state)
 {
@@ -4743,6 +4755,7 @@ static const struct amdgpu_ring_funcs 
gfx_v9_0_ring_funcs_gfx = {
.emit_wreg = gfx_v9_0_ring_emit_wreg,
.emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
.emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait,
+   .soft_recovery = gfx_v9_0_ring_soft_recovery,
 };
 
 static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_compute = {
-- 
2.14.1



Re: [PATCH 04/11] drm/amdgpu: move setting the GART addr into TTM

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:10PM +0200, Christian König wrote:
> Move setting the GART addr for window based copies into the TTM code
> which uses it.
> 
> Signed-off-by: Christian König 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 -
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 391e2f7c03aa..239ccbae09bc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -82,8 +82,6 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, 
> unsigned size,
>   r = amdgpu_ib_get(adev, NULL, size, &(*job)->ibs[0]);
>   if (r)
>   kfree(*job);
> - else
> - (*job)->vm_pd_addr = adev->gart.table_addr;
>  
>   return r;
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index c6611cff64c8..b4333f60ed8b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -2048,7 +2048,10 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, 
> uint64_t src_offset,
>   if (r)
>   return r;
>  
> - job->vm_needs_flush = vm_needs_flush;
> + if (vm_needs_flush) {
> + job->vm_pd_addr = adev->gart.table_addr;
> + job->vm_needs_flush = true;
> + }
>   if (resv) {
>   r = amdgpu_sync_resv(adev, &job->sync, resv,
>AMDGPU_FENCE_OWNER_UNDEFINED,
> -- 
> 2.17.1
> 


[PATCH 2/4] drm/amdgpu: implement soft_recovery for GFX7

2018-08-23 Thread Christian König
Try to kill waves on the SQ.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 95452c5a9df6..a15d9c0f233b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -4212,6 +4212,18 @@ static void gfx_v7_0_ring_emit_gds_switch(struct 
amdgpu_ring *ring,
amdgpu_ring_write(ring, (1 << (oa_size + oa_base)) - (1 << oa_base));
 }
 
+static void gfx_v7_0_ring_soft_recovery(struct amdgpu_ring *ring, unsigned 
vmid)
+{
+   struct amdgpu_device *adev = ring->adev;
+   uint32_t value = 0;
+
+   value = REG_SET_FIELD(value, SQ_CMD, CMD, 0x03);
+   value = REG_SET_FIELD(value, SQ_CMD, MODE, 0x01);
+   value = REG_SET_FIELD(value, SQ_CMD, CHECK_VMID, 1);
+   value = REG_SET_FIELD(value, SQ_CMD, VM_ID, vmid);
+   WREG32(mmSQ_CMD, value);
+}
+
 static uint32_t wave_read_ind(struct amdgpu_device *adev, uint32_t simd, 
uint32_t wave, uint32_t address)
 {
WREG32(mmSQ_IND_INDEX,
@@ -5088,6 +5100,7 @@ static const struct amdgpu_ring_funcs 
gfx_v7_0_ring_funcs_gfx = {
.pad_ib = amdgpu_ring_generic_pad_ib,
.emit_cntxcntl = gfx_v7_ring_emit_cntxcntl,
.emit_wreg = gfx_v7_0_ring_emit_wreg,
+   .soft_recovery = gfx_v7_0_ring_soft_recovery,
 };
 
 static const struct amdgpu_ring_funcs gfx_v7_0_ring_funcs_compute = {
-- 
2.14.1



Re: [PATCH 5/5] drm: add syncobj timeline support v2

2018-08-23 Thread Christian König

On 2018-08-23 11:58, zhoucm1 wrote:



On 2018-08-23 17:15, Christian König wrote:

On 2018-08-23 10:25, Chunming Zhou wrote:

VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer
payload identifying a point in a timeline. Such timeline semaphores
support the following operations:
    * Host query - A host operation that allows querying the payload
      of the timeline semaphore.
    * Host wait - A host operation that allows a blocking wait for a
  timeline semaphore to reach a specified value.


I think I have an idea what "Host" means in this context, but it would
probably be better to describe it.


How about "CPU"?



Yeah, that's probably better.




    * Device wait - A device operation that allows waiting for a
  timeline semaphore to reach a specified value.
    * Device signal - A device operation that allows advancing the
  timeline semaphore to a specified value.

Since it's a timeline, an earlier time point (PT) is always signaled
before a later PT.

a. signal PT design:
Signal PT fence N depends on the PT[N-1] fence and the signal operation
fence; when the PT[N] fence is signaled, the timeline advances to the
value of PT[N].
b. wait PT design:
A wait PT fence is signaled when the timeline reaches the point's value:
as the timeline increases, wait PT values are compared against the new
timeline value, and any wait PT whose value is at or below the timeline
value is signaled; otherwise it stays in the list. Since a semaphore
wait operation can wait on any point of the timeline, an RB tree is
needed to order the wait PTs. And since a wait PT can arrive ahead of
its signal PT, a submission fence is needed to handle that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed definitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate 
patch. (Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate 
patch.
5. drop the submission_fence implementation and instead use 
wait_event() for that. (Christian)

6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
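
To make the wait-PT rule above concrete, here is a minimal,
self-contained userspace model in plain C: a flat array stands in for
the RB tree and small structs stand in for dma_fences. All names here
are illustrative only, not the real DRM API.

/* Toy model of the wait-PT rule: a wait point is signaled as soon as
 * the timeline value reaches it. timeline_signal() advances the
 * timeline and flushes eligible waiters, mirroring the RB-tree walk
 * in the patch. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_WAIT_PTS 8

struct wait_pt {
	unsigned long long value;	/* timeline value this waiter needs */
	bool signaled;
};

struct toy_timeline {
	unsigned long long value;	/* last signaled point */
	struct wait_pt waiters[MAX_WAIT_PTS];
	int num_waiters;
};

/* Advance the timeline to 'point' and signal every waiter at or below it. */
static void timeline_signal(struct toy_timeline *tl, unsigned long long point)
{
	tl->value = point;
	for (int i = 0; i < tl->num_waiters; i++) {
		struct wait_pt *pt = &tl->waiters[i];

		if (!pt->signaled && pt->value <= tl->value) {
			pt->signaled = true;
			printf("wait PT %llu signaled (timeline=%llu)\n",
			       pt->value, tl->value);
		}
	}
}

int main(void)
{
	struct toy_timeline tl = {
		.num_waiters = 2,
		.waiters = { { .value = 2 }, { .value = 5 } },
	};

	timeline_signal(&tl, 3);	/* signals the PT at 2; 5 stays pending */
	timeline_signal(&tl, 5);	/* now the PT at 5 signals too */
	return 0;
}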


I really liked Daniel's idea to handle the classic syncobj like a
timeline syncobj with just 1 entry. That can probably simplify the
implementation quite a bit.
Yeah, after the timeline change it seems we can remove the old
syncobj->fence, right? I will try to unify them in an additional patch.


I think we could do something like the following:

1. When drm_syncobj_find_fence is called with point zero then we return
the last added fence.
2. When drm_syncobj_replace_fence is called with point zero we add the
new fence as point N+1 to the list.


This way syncobj should keep its functionality as-is with the timeline
support added on top.
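
A self-contained toy model of that rule (plain integers stand in for
dma_fence pointers; none of this is the real drm_syncobj API): a
point-zero read returns the most recently added entry and a point-zero
write appends at last+1, so a classic syncobj is simply a timeline
that is always used with point 0:

#include <stdio.h>

#define MAX_PTS 16

struct toy_syncobj {
	unsigned long long point[MAX_PTS];	/* ordered timeline points */
	int fence[MAX_PTS];			/* stand-in fence per point */
	int count;				/* no bounds checks, toy code */
};

static int toy_find_fence(struct toy_syncobj *so, unsigned long long pt)
{
	if (so->count == 0)
		return -1;
	if (pt == 0)				/* classic: last added fence */
		return so->fence[so->count - 1];
	for (int i = 0; i < so->count; i++)	/* timeline: exact lookup */
		if (so->point[i] == pt)
			return so->fence[i];
	return -1;
}

static void toy_replace_fence(struct toy_syncobj *so, unsigned long long pt,
			      int fence)
{
	if (pt == 0)				/* classic: append as N+1 */
		pt = so->count ? so->point[so->count - 1] + 1 : 1;
	so->point[so->count] = pt;
	so->fence[so->count] = fence;
	so->count++;
}

int main(void)
{
	struct toy_syncobj so = { .count = 0 };

	toy_replace_fence(&so, 0, 100);	/* classic usage lands at point 1 */
	toy_replace_fence(&so, 0, 101);	/* ...and at point 2 */
	printf("last: %d, point 1: %d\n",
	       toy_find_fence(&so, 0), toy_find_fence(&so, 1));
	return 0;
}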


Regards,
Christian.



Thanks,
David Zhou


Additional to that an amdgpu patch which shows how the interface is 
to be used is probably something Daniel will want to see as well.


Christian.



TODO:
1. CPU query and wait on timeline semaphore.
2. test application (Daniel Vetter)

Signed-off-by: Chunming Zhou 
Cc: Christian König 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c | 383 
+++---

  include/drm/drm_syncobj.h |  28 +++
  include/uapi/drm/drm.h    |   1 +
  3 files changed, 389 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index 6227df2cc0a4..f738d78edf65 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,44 @@
  #include "drm_internal.h"
  #include 
  +struct drm_syncobj_stub_fence {
+    struct dma_fence base;
+    spinlock_t lock;
+};
+
+static const char *drm_syncobj_stub_fence_get_name(struct dma_fence 
*fence)

+{
+    return "syncobjstub";
+}
+
+static bool drm_syncobj_stub_fence_enable_signaling(struct 
dma_fence *fence)

+{
+    return !dma_fence_is_signaled(fence);
+}
+
+static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
+    .get_driver_name = drm_syncobj_stub_fence_get_name,
+    .get_timeline_name = drm_syncobj_stub_fence_get_name,
+    .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
+    .release = NULL,
+};
+
+struct drm_syncobj_wait_pt {
+    struct drm_syncobj_stub_fence base;
+    u64    value;
+    struct rb_node   node;
+};
+struct drm_syncobj_signal_pt {
+    struct drm_syncobj_stub_fence base;
+    struct dma_fence *signal_fence;
+    struct dma_fence *pre_pt_base;
+    struct dma_fence_cb signal_cb;
+    struct dma_fence_cb pre_pt_cb;
+    struct drm_syncobj *syncobj;
+    u64    value;
+    struct list_head list;
+};
+
  /**
   * drm_syncobj_find - lookup and reference a sync object.
   * @file_private: drm file private pointer
@@ -137,6 +175,150 @@ void drm_syncobj_remove_callback(struct 
drm_syncobj *syncobj,

  spin_unlock(&syncobj->lock);
  }
  +static void drm_syncobj_timeline_signal_wait_pts(struct 
drm_syncobj *syncobj)

+{
+    struct rb_node *node = NULL

Re: [PATCH 1/3] drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins

2018-08-23 Thread Michel Dänzer
On 2018-08-23 11:24 a.m., Rex Zhu wrote:
> Forgot to add vce pg support via smu for Kaveri/Mullins.
> 
> Regression issue caused by
> 'commit 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
> to set_powergating_by_smu")'

You can replace this paragraph with

Fixes: 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
  to set_powergating_by_smu")


This patch fixes the VCE ring (and also IB) test on this laptop, thanks!

Unfortunately though, there's still an oops if I let the amdkfd driver
load together with amdgpu (no issue when loading amdkfd manually later),
see the attached kernel.log excerpt. This is also a regression in the
4.19 drm tree changes. It might be a separate issue, but TBH I don't
feel like another day or two bisecting right now. :)


Anyway, this series is

Tested-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
Aug 23 12:25:30 thor kernel: [  200.456163] [drm] amdgpu kernel modesetting enabled.
Aug 23 12:25:30 thor kernel: [  200.465731] Parsing CRAT table with 1 nodes
Aug 23 12:25:30 thor kernel: [  200.465741] Creating topology SYSFS entries
Aug 23 12:25:30 thor kernel: [  200.465786] Topology: Add APU node [0x0:0x0]
Aug 23 12:25:30 thor kernel: [  200.465789] Finished initializing topology
Aug 23 12:25:30 thor kernel: [  200.465853] kfd kfd: Initialized module
Aug 23 12:25:30 thor kernel: [  200.466288] checking generic (c000 30) vs hw (c000 1000)
Aug 23 12:25:30 thor kernel: [  200.466296] fb: switching to amdgpudrmfb from EFI VGA
Aug 23 12:25:30 thor kernel: [  200.466418] Console: switching to colour dummy device 80x25
Aug 23 12:25:30 thor kernel: [  200.467646] [drm] initializing kernel modesetting (KAVERI 0x1002:0x130A 0x103C:0x2234 0x00).
Aug 23 12:25:30 thor kernel: [  200.468031] [drm] register mmio base: 0xD680
Aug 23 12:25:30 thor kernel: [  200.468035] [drm] register mmio size: 262144
Aug 23 12:25:30 thor kernel: [  200.468058] [drm] add ip block number 0 
Aug 23 12:25:30 thor kernel: [  200.468062] [drm] add ip block number 1 
Aug 23 12:25:30 thor kernel: [  200.468064] [drm] add ip block number 2 
Aug 23 12:25:30 thor kernel: [  200.468067] [drm] add ip block number 3 
Aug 23 12:25:30 thor kernel: [  200.468071] [drm] add ip block number 4 
Aug 23 12:25:30 thor kernel: [  200.468074] [drm] add ip block number 5 
Aug 23 12:25:30 thor kernel: [  200.468077] [drm] add ip block number 6 
Aug 23 12:25:30 thor kernel: [  200.468080] [drm] add ip block number 7 
Aug 23 12:25:30 thor kernel: [  200.468082] [drm] add ip block number 8 
Aug 23 12:25:30 thor kernel: [  200.501755] [drm] BIOS signature incorrect 0 0
Aug 23 12:25:30 thor kernel: [  200.501804] resource sanity check: requesting [mem 0x000c-0x000d], which spans more than PCI Bus :00 [mem 0x000c-0x000c3fff window]
Aug 23 12:25:30 thor kernel: [  200.501812] caller pci_map_rom+0x58/0xe0 mapping multiple BARs
Aug 23 12:25:30 thor kernel: [  200.503187] ATOM BIOS: BR45464.001
Aug 23 12:25:30 thor kernel: [  200.503219] [drm] GPU posting now...
Aug 23 12:25:31 thor kernel: [  200.966309] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
Aug 23 12:25:31 thor kernel: [  200.966329] amdgpu :00:01.0: VRAM: 1024M 0x00F4 - 0x00F43FFF (1024M used)
Aug 23 12:25:31 thor kernel: [  200.966333] amdgpu :00:01.0: GART: 1024M 0x - 0x3FFF
Aug 23 12:25:31 thor kernel: [  200.966352] [drm] Detected VRAM RAM=1024M, BAR=1024M
Aug 23 12:25:31 thor kernel: [  200.966354] [drm] RAM width 128bits UNKNOWN
Aug 23 12:25:31 thor kernel: [  200.966695] [TTM] Zone  kernel: Available graphics memory: 3568742 kiB
Aug 23 12:25:31 thor kernel: [  200.966702] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
Aug 23 12:25:31 thor kernel: [  200.966705] [TTM] Initializing pool allocator
Aug 23 12:25:31 thor kernel: [  200.966714] [TTM] Initializing DMA pool allocator
Aug 23 12:25:31 thor kernel: [  200.966799] [drm] amdgpu: 1024M of VRAM memory ready
Aug 23 12:25:31 thor kernel: [  200.966803] [drm] amdgpu: 3072M of GTT memory ready.
Aug 23 12:25:31 thor kernel: [  200.966842] [drm] GART: num cpu pages 262144, num gpu pages 262144
Aug 23 12:25:31 thor kernel: [  200.967622] [drm] PCIE GART of 1024M enabled (table at 0x00F4).
Aug 23 12:25:31 thor kernel: [  200.967771] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Aug 23 12:25:31 thor kernel: [  200.967774] [drm] Driver supports precise vblank timestamp query.
Aug 23 12:25:31 thor kernel: [  200.967803] [drm] Internal thermal controller without fan control
Aug 23 12:25:31 thor kernel: [  200.967806] [drm] amdgpu: dpm initialized
Aug 23 12:25:31 thor kernel: [  200.969641] [drm] amdgpu atom DIG backlight initialized
Aug 23 12:25:31 thor kernel: [  200.969644] [drm] AMDGPU Display Connectors
Aug 23 12:25:3

Re: [PATCH 2/3] drm/amdgpu: Power up uvd block when hw_fini

2018-08-23 Thread Michel Dänzer
On 2018-08-23 11:24 a.m., Rex Zhu wrote:
> when hw_fini/suspend, smu only need to power up uvd block
> if uvd pg is supported, don't need to call vce to do hw_init.

Do you really mean VCE here, not UVD?


> diff --git a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c 
> b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
> index a713c8b..8f625d6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
> @@ -65,7 +65,6 @@ static int kv_set_thermal_temperature_range(struct 
> amdgpu_device *adev,
>   int min_temp, int max_temp);
>  static int kv_init_fps_limits(struct amdgpu_device *adev);
>  
> -static void kv_dpm_powergate_uvd(void *handle, bool gate);
>  static void kv_dpm_powergate_samu(struct amdgpu_device *adev, bool gate);
>  static void kv_dpm_powergate_acp(struct amdgpu_device *adev, bool gate);
>  
> @@ -1390,7 +1389,8 @@ static void kv_dpm_disable(struct amdgpu_device *adev)
>   kv_dpm_powergate_samu(adev, false);
>   if (pi->caps_vce_pg) /* power on the VCE block */
>   amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerON);
> - kv_dpm_powergate_uvd(adev, false);
> + if (pi->caps_uvd_pg) /* power off the UVD block */
> + amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_UVDPowerON);

The comment should say "power on", shouldn't it?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH 5/5] drm: add syncobj timeline support v2

2018-08-23 Thread zhoucm1



On 2018-08-23 17:15, Christian König wrote:

On 2018-08-23 10:25, Chunming Zhou wrote:

VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer
payload identifying a point in a timeline. Such timeline semaphores
support the following operations:
    * Host query - A host operation that allows querying the payload
      of the timeline semaphore.
    * Host wait - A host operation that allows a blocking wait for a
  timeline semaphore to reach a specified value.


I think I have an idea what "Host" means in this context, but it would
probably be better to describe it.


How about "CPU"?


    * Device wait - A device operation that allows waiting for a
  timeline semaphore to reach a specified value.
    * Device signal - A device operation that allows advancing the
  timeline semaphore to a specified value.

Since it's a timeline, an earlier time point (PT) is always signaled
before a later PT.

a. signal PT design:
Signal PT fence N depends on the PT[N-1] fence and the signal operation
fence; when the PT[N] fence is signaled, the timeline advances to the
value of PT[N].
b. wait PT design:
A wait PT fence is signaled when the timeline reaches the point's value:
as the timeline increases, wait PT values are compared against the new
timeline value, and any wait PT whose value is at or below the timeline
value is signaled; otherwise it stays in the list. Since a semaphore
wait operation can wait on any point of the timeline, an RB tree is
needed to order the wait PTs. And since a wait PT can arrive ahead of
its signal PT, a submission fence is needed to handle that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed definitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate 
patch. (Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate 
patch.
5. drop the submission_fence implementation and instead use 
wait_event() for that. (Christian)

6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)


I really liked Daniel's idea to handle the classic syncobj like a
timeline syncobj with just 1 entry. That can probably simplify the
implementation quite a bit.
Yeah, after the timeline change it seems we can remove the old
syncobj->fence, right? I will try to unify them in an additional patch.


Thanks,
David Zhou


Additional to that an amdgpu patch which shows how the interface is to 
be used is probably something Daniel will want to see as well.


Christian.



TODO:
1. CPU query and wait on timeline semaphore.
2. test application (Daniel Vetter)

Signed-off-by: Chunming Zhou 
Cc: Christian König 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c | 383 
+++---

  include/drm/drm_syncobj.h |  28 +++
  include/uapi/drm/drm.h    |   1 +
  3 files changed, 389 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c 
b/drivers/gpu/drm/drm_syncobj.c

index 6227df2cc0a4..f738d78edf65 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,44 @@
  #include "drm_internal.h"
  #include 
  +struct drm_syncobj_stub_fence {
+    struct dma_fence base;
+    spinlock_t lock;
+};
+
+static const char *drm_syncobj_stub_fence_get_name(struct dma_fence 
*fence)

+{
+    return "syncobjstub";
+}
+
+static bool drm_syncobj_stub_fence_enable_signaling(struct dma_fence 
*fence)

+{
+    return !dma_fence_is_signaled(fence);
+}
+
+static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
+    .get_driver_name = drm_syncobj_stub_fence_get_name,
+    .get_timeline_name = drm_syncobj_stub_fence_get_name,
+    .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
+    .release = NULL,
+};
+
+struct drm_syncobj_wait_pt {
+    struct drm_syncobj_stub_fence base;
+    u64    value;
+    struct rb_node   node;
+};
+struct drm_syncobj_signal_pt {
+    struct drm_syncobj_stub_fence base;
+    struct dma_fence *signal_fence;
+    struct dma_fence *pre_pt_base;
+    struct dma_fence_cb signal_cb;
+    struct dma_fence_cb pre_pt_cb;
+    struct drm_syncobj *syncobj;
+    u64    value;
+    struct list_head list;
+};
+
  /**
   * drm_syncobj_find - lookup and reference a sync object.
   * @file_private: drm file private pointer
@@ -137,6 +175,150 @@ void drm_syncobj_remove_callback(struct 
drm_syncobj *syncobj,

  spin_unlock(&syncobj->lock);
  }
  +static void drm_syncobj_timeline_signal_wait_pts(struct 
drm_syncobj *syncobj)

+{
+    struct rb_node *node = NULL;
+    struct drm_syncobj_wait_pt *wait_pt = NULL;
+
+    spin_lock(&syncobj->lock);
+    for(node = rb_first(&syncobj->syncobj_timeline.wait_pt_tree);
+    node != NULL; ) {
+    wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
+    node = rb_next(node);
+    if (wait_pt->value <= syncobj->syncobj_timeline.timeline) {
+    dma_fence_signal(&wait_pt->base.base);
+    rb_erase(&wait_pt->node,
+ &synco

Re: [PATCH 5/5] drm: add syncobj timeline support v2

2018-08-23 Thread zhoucm1



On 2018-08-23 17:08, Daniel Vetter wrote:

On Thu, Aug 23, 2018 at 04:25:42PM +0800, Chunming Zhou wrote:

VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer payload
identifying a point in a timeline. Such timeline semaphores support the
following operations:
* Host query - A host operation that allows querying the payload of the
  timeline semaphore.
* Host wait - A host operation that allows a blocking wait for a
  timeline semaphore to reach a specified value.
* Device wait - A device operation that allows waiting for a
  timeline semaphore to reach a specified value.
* Device signal - A device operation that allows advancing the
  timeline semaphore to a specified value.

Since it's a timeline, an earlier time point (PT) is always signaled
before a later PT.

a. signal PT design:
Signal PT fence N depends on the PT[N-1] fence and the signal operation
fence; when the PT[N] fence is signaled, the timeline advances to the
value of PT[N].
b. wait PT design:
A wait PT fence is signaled when the timeline reaches the point's value:
as the timeline increases, wait PT values are compared against the new
timeline value, and any wait PT whose value is at or below the timeline
value is signaled; otherwise it stays in the list. Since a semaphore
wait operation can wait on any point of the timeline, an RB tree is
needed to order the wait PTs. And since a wait PT can arrive ahead of
its signal PT, a submission fence is needed to handle that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed definitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

Depending upon how it's going to be used, this is the wrong thing to do.


TODO:
1. CPU query and wait on timeline semaphore.
2. test application (Daniel Vetter)

I also had some more suggestions, around aligning the two concepts of
future fences
The submission fence is replaced by wait_event(), so I didn't address
your future fence suggestion. Please feel free to explain the future
fence status.

and at least trying to merge the timeline and the other
fence (which really is just a special case of a timeline with only 1
slot).

Could you detail that? Do you mean merging syncobj->fence into a timeline point?

Thanks,
David Zhou

-Daniel


Signed-off-by: Chunming Zhou 
Cc: Christian König 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c | 383 +++---
  include/drm/drm_syncobj.h |  28 +++
  include/uapi/drm/drm.h|   1 +
  3 files changed, 389 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 6227df2cc0a4..f738d78edf65 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,44 @@
  #include "drm_internal.h"
  #include 
  
+struct drm_syncobj_stub_fence {

+   struct dma_fence base;
+   spinlock_t lock;
+};
+
+static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
+{
+return "syncobjstub";
+}
+
+static bool drm_syncobj_stub_fence_enable_signaling(struct dma_fence *fence)
+{
+return !dma_fence_is_signaled(fence);
+}
+
+static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
+   .get_driver_name = drm_syncobj_stub_fence_get_name,
+   .get_timeline_name = drm_syncobj_stub_fence_get_name,
+   .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
+   .release = NULL,
+};
+
+struct drm_syncobj_wait_pt {
+   struct drm_syncobj_stub_fence base;
+   u64value;
+   struct rb_node   node;
+};
+struct drm_syncobj_signal_pt {
+   struct drm_syncobj_stub_fence base;
+   struct dma_fence *signal_fence;
+   struct dma_fence *pre_pt_base;
+   struct dma_fence_cb signal_cb;
+   struct dma_fence_cb pre_pt_cb;
+   struct drm_syncobj *syncobj;
+   u64value;
+   struct list_head list;
+};
+
  /**
   * drm_syncobj_find - lookup and reference a sync object.
   * @file_private: drm file private pointer
@@ -137,6 +175,150 @@ void drm_syncobj_remove_callback(struct drm_syncobj 
*syncobj,
spin_unlock(&syncobj->lock);
  }
  
+static void drm_syncobj_timeline_signal_wait_pts(struct drm_syncobj *syncobj)

+{
+   struct rb_node *node = NULL;
+   struct drm_syncobj_wait_pt *wait_pt = NULL;
+
+   spin_lock(&syncobj->lock);
+   for(node = rb_first(&syncobj->syncobj_timeline.wait_pt_tree);
+   node != NULL; ) {
+   wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
+   node = rb_next(node);
+   if (wait_pt->value <= syncobj->syncobj_timeline.timeline) {
+   dma_fence_signal(&wait_pt->b

[PATCH 3/3] drm/amdgpu: Remove dead code in amdgpu_pm.c

2018-08-23 Thread Rex Zhu
As we have unified powergate_uvd/vce/mmhub into set_powergating_by_smu,
and set_powergating_by_smu is supported by both dpm and powerplay,
remove the else case.

Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 35 --
 1 file changed, 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index 3e51e9c..b7b16cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -1720,18 +1720,6 @@ void amdgpu_dpm_enable_uvd(struct amdgpu_device *adev, 
bool enable)
mutex_lock(&adev->pm.mutex);
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_UVD, 
!enable);
mutex_unlock(&adev->pm.mutex);
-   } else {
-   if (enable) {
-   mutex_lock(&adev->pm.mutex);
-   adev->pm.dpm.uvd_active = true;
-   adev->pm.dpm.state = POWER_STATE_TYPE_INTERNAL_UVD;
-   mutex_unlock(&adev->pm.mutex);
-   } else {
-   mutex_lock(&adev->pm.mutex);
-   adev->pm.dpm.uvd_active = false;
-   mutex_unlock(&adev->pm.mutex);
-   }
-   amdgpu_pm_compute_clocks(adev);
}
 }
 
@@ -1742,29 +1730,6 @@ void amdgpu_dpm_enable_vce(struct amdgpu_device *adev, 
bool enable)
mutex_lock(&adev->pm.mutex);
amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_VCE, 
!enable);
mutex_unlock(&adev->pm.mutex);
-   } else {
-   if (enable) {
-   mutex_lock(&adev->pm.mutex);
-   adev->pm.dpm.vce_active = true;
-   /* XXX select vce level based on ring/task */
-   adev->pm.dpm.vce_level = AMD_VCE_LEVEL_AC_ALL;
-   mutex_unlock(&adev->pm.mutex);
-   amdgpu_device_ip_set_clockgating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
-  
AMD_CG_STATE_UNGATE);
-   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
-  
AMD_PG_STATE_UNGATE);
-   amdgpu_pm_compute_clocks(adev);
-   } else {
-   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
-  
AMD_PG_STATE_GATE);
-   amdgpu_device_ip_set_clockgating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
-  
AMD_CG_STATE_GATE);
-   mutex_lock(&adev->pm.mutex);
-   adev->pm.dpm.vce_active = false;
-   mutex_unlock(&adev->pm.mutex);
-   amdgpu_pm_compute_clocks(adev);
-   }
-
}
 }
 
-- 
1.9.1



[PATCH 1/3] drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins

2018-08-23 Thread Rex Zhu
Forgot to add vce pg support via smu for Kaveri/Mullins.

Regression issue caused by
'commit 561a5c83eadd ("drm/amd/pp: Unify powergate_uvd/vce/mmhub
to set_powergating_by_smu")'

Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/amdgpu/kv_dpm.c | 41 +++--
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c 
b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
index 3f57f64..a713c8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
@@ -66,7 +66,6 @@ static int kv_set_thermal_temperature_range(struct 
amdgpu_device *adev,
 static int kv_init_fps_limits(struct amdgpu_device *adev);
 
 static void kv_dpm_powergate_uvd(void *handle, bool gate);
-static void kv_dpm_powergate_vce(struct amdgpu_device *adev, bool gate);
 static void kv_dpm_powergate_samu(struct amdgpu_device *adev, bool gate);
 static void kv_dpm_powergate_acp(struct amdgpu_device *adev, bool gate);
 
@@ -1374,6 +1373,8 @@ static int kv_dpm_enable(struct amdgpu_device *adev)
 
 static void kv_dpm_disable(struct amdgpu_device *adev)
 {
+   struct kv_power_info *pi = kv_get_pi(adev);
+
amdgpu_irq_put(adev, &adev->pm.dpm.thermal.irq,
   AMDGPU_THERMAL_IRQ_LOW_TO_HIGH);
amdgpu_irq_put(adev, &adev->pm.dpm.thermal.irq,
@@ -1387,7 +1388,8 @@ static void kv_dpm_disable(struct amdgpu_device *adev)
/* powerup blocks */
kv_dpm_powergate_acp(adev, false);
kv_dpm_powergate_samu(adev, false);
-   kv_dpm_powergate_vce(adev, false);
+   if (pi->caps_vce_pg) /* power on the VCE block */
+   amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerON);
kv_dpm_powergate_uvd(adev, false);
 
kv_enable_smc_cac(adev, false);
@@ -1551,7 +1553,6 @@ static int kv_update_vce_dpm(struct amdgpu_device *adev,
int ret;
 
if (amdgpu_new_state->evclk > 0 && amdgpu_current_state->evclk == 0) {
-   kv_dpm_powergate_vce(adev, false);
if (pi->caps_stable_p_state)
pi->vce_boot_level = table->count - 1;
else
@@ -1573,7 +1574,6 @@ static int kv_update_vce_dpm(struct amdgpu_device *adev,
kv_enable_vce_dpm(adev, true);
} else if (amdgpu_new_state->evclk == 0 && amdgpu_current_state->evclk 
> 0) {
kv_enable_vce_dpm(adev, false);
-   kv_dpm_powergate_vce(adev, true);
}
 
return 0;
@@ -1702,24 +1702,32 @@ static void kv_dpm_powergate_uvd(void *handle, bool 
gate)
}
 }
 
-static void kv_dpm_powergate_vce(struct amdgpu_device *adev, bool gate)
+static void kv_dpm_powergate_vce(void *handle, bool gate)
 {
+   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
struct kv_power_info *pi = kv_get_pi(adev);
-
-   if (pi->vce_power_gated == gate)
-   return;
+   int ret;
 
pi->vce_power_gated = gate;
 
-   if (!pi->caps_vce_pg)
-   return;
-
-   if (gate)
-   amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerOFF);
-   else
-   amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerON);
+   if (gate) {
+   /* stop the VCE block */
+   ret = amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
+AMD_PG_STATE_GATE);
+   kv_enable_vce_dpm(adev, false);
+   if (pi->caps_vce_pg) /* power off the VCE block */
+   amdgpu_kv_notify_message_to_smu(adev, 
PPSMC_MSG_VCEPowerOFF);
+   } else {
+   if (pi->caps_vce_pg) /* power on the VCE block */
+   amdgpu_kv_notify_message_to_smu(adev, 
PPSMC_MSG_VCEPowerON);
+   kv_enable_vce_dpm(adev, true);
+   /* re-init the VCE block */
+   ret = amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
+
AMD_PG_STATE_UNGATE);
+   }
 }
 
+
 static void kv_dpm_powergate_samu(struct amdgpu_device *adev, bool gate)
 {
struct kv_power_info *pi = kv_get_pi(adev);
@@ -3313,6 +3321,9 @@ static int kv_set_powergating_by_smu(void *handle,
case AMD_IP_BLOCK_TYPE_UVD:
kv_dpm_powergate_uvd(handle, gate);
break;
+   case AMD_IP_BLOCK_TYPE_VCE:
+   kv_dpm_powergate_vce(handle, gate);
+   break;
default:
break;
}
-- 
1.9.1



[PATCH 2/3] drm/amdgpu: Power up uvd block when hw_fini

2018-08-23 Thread Rex Zhu
when hw_fini/suspend, smu only need to power up uvd block
if uvd pg is supported, don't need to call vce to do hw_init.

Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/amdgpu/kv_dpm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c 
b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
index a713c8b..8f625d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/kv_dpm.c
@@ -65,7 +65,6 @@ static int kv_set_thermal_temperature_range(struct 
amdgpu_device *adev,
int min_temp, int max_temp);
 static int kv_init_fps_limits(struct amdgpu_device *adev);
 
-static void kv_dpm_powergate_uvd(void *handle, bool gate);
 static void kv_dpm_powergate_samu(struct amdgpu_device *adev, bool gate);
 static void kv_dpm_powergate_acp(struct amdgpu_device *adev, bool gate);
 
@@ -1390,7 +1389,8 @@ static void kv_dpm_disable(struct amdgpu_device *adev)
kv_dpm_powergate_samu(adev, false);
if (pi->caps_vce_pg) /* power on the VCE block */
amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_VCEPowerON);
-   kv_dpm_powergate_uvd(adev, false);
+   if (pi->caps_uvd_pg) /* power off the UVD block */
+   amdgpu_kv_notify_message_to_smu(adev, PPSMC_MSG_UVDPowerON);
 
kv_enable_smc_cac(adev, false);
kv_enable_didt(adev, false);
-- 
1.9.1



Re: [PATCH 2/5] drm/amdgpu: add ring soft recovery v2

2018-08-23 Thread Christian König

On 2018-08-23 09:17, Huang Rui wrote:

On Wed, Aug 22, 2018 at 12:55:43PM -0400, Alex Deucher wrote:

On Wed, Aug 22, 2018 at 6:05 AM Christian König
 wrote:

Instead of hammering hard on the GPU try a soft recovery first.

v2: reorder code a bit

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 24 
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 
  3 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 265ff90f4e01..d93e31a5c4e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -33,6 +33,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
 struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
 struct amdgpu_job *job = to_amdgpu_job(s_job);

+   if (amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
+   DRM_ERROR("ring %s timeout, but soft recovered\n",
+ s_job->sched->name);
+   return;
+   }

I think we should still bubble up the error to userspace even if we
can recover.  Data is lost when the wave is killed.  We should treat
it like a GPU reset.


May I know what the wavefront stands for? Why can we do this "light"
recovery here rather than a reset?


Wavefront means a running shader in the SQ.

Basically this only covers the case when the application sends down a 
shader with an endless loop to the hardware. Here we just kill the 
shader and try to continue.


When you run into a hang because of a corrupted resource descriptor, you
usually need a full ASIC reset to get out of that again.


Regards,
Christian.
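
For illustration, a minimal sketch of what an ASIC backend hooking the new
callback could look like. Everything below is an assumption for the
discussion, not code from the patch; the GFX9 names and the SQ_CMD field
encoding in particular are guesses:

	/* hypothetical GFX9 backend: ask the SQ to kill all waves of the
	 * hanging VMID so the stuck fence can signal */
	static void gfx_v9_0_ring_soft_recovery(struct amdgpu_ring *ring,
						unsigned vmid)
	{
		struct amdgpu_device *adev = ring->adev;
		uint32_t value = 0;

		value = REG_SET_FIELD(value, SQ_CMD, CMD, 0x03); /* assumed: KILL */
		value = REG_SET_FIELD(value, SQ_CMD, MODE, 0x01); /* assumed: BROADCAST */
		value = REG_SET_FIELD(value, SQ_CMD, CHECK_VMID, 1);
		value = REG_SET_FIELD(value, SQ_CMD, VM_ID, vmid);
		WREG32_SOC15(GC, 0, mmSQ_CMD, value);
	}

	static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_gfx = {
		/* ... existing callbacks ... */
		.soft_recovery = gfx_v9_0_ring_soft_recovery,
	};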



Thanks,
Ray


Alex


+
 DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
   job->base.sched->name, atomic_read(&ring->fence_drv.last_seq),
   ring->fence_drv.sync_seq);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 5dfd26be1eec..c045a4e38ad1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -383,6 +383,30 @@ void amdgpu_ring_emit_reg_write_reg_wait_helper(struct 
amdgpu_ring *ring,
 amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);
  }

+/**
+ * amdgpu_ring_soft_recovery - try to soft recover a ring lockup
+ *
+ * @ring: ring to try the recovery on
+ * @vmid: VMID we try to get going again
+ * @fence: timedout fence
+ *
+ * Tries to get a ring proceeding again when it is stuck.
+ */
+bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
+  struct dma_fence *fence)
+{
+   ktime_t deadline = ktime_add_us(ktime_get(), 1000);
+
+   if (!ring->funcs->soft_recovery)
+   return false;
+
+   while (!dma_fence_is_signaled(fence) &&
+  ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
+   ring->funcs->soft_recovery(ring, vmid);
+
+   return dma_fence_is_signaled(fence);
+}
+
  /*
   * Debugfs info
   */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 409fdd9b9710..9cc239968e40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -168,6 +168,8 @@ struct amdgpu_ring_funcs {
 /* priority functions */
 void (*set_priority) (struct amdgpu_ring *ring,
   enum drm_sched_priority priority);
+   /* Try to soft recover the ring to make the fence signal */
+   void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
  };

  struct amdgpu_ring {
@@ -260,6 +262,8 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring);
  void amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
 uint32_t reg0, uint32_t val0,
 uint32_t reg1, uint32_t val1);
+bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
+  struct dma_fence *fence);

  static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
  {
--
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 5/5] drm: add syncobj timeline support v2

2018-08-23 Thread Christian König

Am 23.08.2018 um 10:25 schrieb Chunming Zhou:

VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer payload
identifying a point in a timeline. Such timeline semaphores support the
following operations:
* Host query - A host operation that allows querying the payload of the
  timeline semaphore.
* Host wait - A host operation that allows a blocking wait for a
  timeline semaphore to reach a specified value.


I think I have an idea what "Host" means in this context, but it would 
probably be better to describe it.



* Device wait - A device operation that allows waiting for a
  timeline semaphore to reach a specified value.
* Device signal - A device operation that allows advancing the
  timeline semaphore to a specified value.

Since it's a timeline, the earlier time point (PT) is always signaled
before the later PT.
a. signal PT design:
Signal PT fence N depends on the PT[N-1] fence and the signal operation
fence; when the PT[N] fence is signaled, the timeline increases to the
value of PT[N].
b. wait PT design:
A wait PT fence is signaled when the timeline reaches its point value.
When the timeline increases, the wait PT values are compared with the new
timeline value; if a PT value is lower than the timeline value, that wait
PT is signaled, otherwise it stays in the list. A semaphore wait operation
can wait on any point of the timeline, so an RB tree is needed to order
them. And since a wait PT can be submitted ahead of its signal PT, we need
a submission fence to handle that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed definitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)


I really liked Daniel's idea to handle the classic syncobj like a 
timeline syncobj with just one entry. That can probably simplify the 
implementation quite a bit.


In addition to that, an amdgpu patch which shows how the interface is to 
be used is probably something Daniel will want to see as well.


Christian.
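
A rough sketch of that one-entry idea (all names below are assumptions for
the discussion, not code from this series): the binary syncobj path simply
becomes the point-0 case of the timeline path.

	/* hypothetical unified lookup */
	static struct dma_fence *
	drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point)
	{
		/* a binary syncobj behaves like a timeline with a single slot */
		if (syncobj->type == DRM_SYNCOBJ_TYPE_BINARY) {
			WARN_ON(point != 0);
			return drm_syncobj_fence_get(syncobj);
		}
		return drm_syncobj_timeline_point_get(syncobj, point);
	}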



TODO:
1. CPU query and wait on timeline semaphore.
2. test application (Daniel Vetter)

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/drm_syncobj.c | 383 +++---
  include/drm/drm_syncobj.h |  28 +++
  include/uapi/drm/drm.h|   1 +
  3 files changed, 389 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 6227df2cc0a4..f738d78edf65 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,44 @@
  #include "drm_internal.h"
  #include 
  
+struct drm_syncobj_stub_fence {

+   struct dma_fence base;
+   spinlock_t lock;
+};
+
+static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
+{
+return "syncobjstub";
+}
+
+static bool drm_syncobj_stub_fence_enable_signaling(struct dma_fence *fence)
+{
+return !dma_fence_is_signaled(fence);
+}
+
+static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
+   .get_driver_name = drm_syncobj_stub_fence_get_name,
+   .get_timeline_name = drm_syncobj_stub_fence_get_name,
+   .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
+   .release = NULL,
+};
+
+struct drm_syncobj_wait_pt {
+   struct drm_syncobj_stub_fence base;
+   u64value;
+   struct rb_node   node;
+};
+struct drm_syncobj_signal_pt {
+   struct drm_syncobj_stub_fence base;
+   struct dma_fence *signal_fence;
+   struct dma_fence *pre_pt_base;
+   struct dma_fence_cb signal_cb;
+   struct dma_fence_cb pre_pt_cb;
+   struct drm_syncobj *syncobj;
+   u64value;
+   struct list_head list;
+};
+
  /**
   * drm_syncobj_find - lookup and reference a sync object.
   * @file_private: drm file private pointer
@@ -137,6 +175,150 @@ void drm_syncobj_remove_callback(struct drm_syncobj 
*syncobj,
spin_unlock(&syncobj->lock);
  }
  
+static void drm_syncobj_timeline_signal_wait_pts(struct drm_syncobj *syncobj)

+{
+   struct rb_node *node = NULL;
+   struct drm_syncobj_wait_pt *wait_pt = NULL;
+
+   spin_lock(&syncobj->lock);
+   for(node = rb_first(&syncobj->syncobj_timeline.wait_pt_tree);
+   node != NULL; ) {
+   wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
+   node = rb_next(node);
+   if (wait_pt->value <= syncobj->syncobj_timeline.timeline) {
+   dma_fence_signal(&wait_pt->base.base);
+   rb_erase(&wait_pt->node,
+&syncobj->syncobj_timeline.wait_pt_tree);
+   RB_CLEAR_NODE(&wait_pt->node);

Re: [PATCH 2/5] drm: rename null fence to stub fence in syncobj

2018-08-23 Thread Christian König

Am 23.08.2018 um 10:25 schrieb Chunming Zhou:

stub fence will be used by timeline syncobj as well.


Mhm, I'm leaning a bit towards renaming it, but "null" fence vs. "stub" 
fence doesn't make a large difference to me.


The point is that it is a fence which is always signaled right from the 
beginning.


So any name which describes that would be welcome.

Christian.
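
To make that concrete, a minimal sketch built from the ops in this patch
(the helper name is an assumption; drm_syncobj_assign_null_handle() below
does the same steps inline):

	static struct dma_fence *drm_syncobj_stub_fence_create(void)
	{
		struct drm_syncobj_stub_fence *fence;

		fence = kzalloc(sizeof(*fence), GFP_KERNEL);
		if (!fence)
			return NULL;

		spin_lock_init(&fence->lock);
		dma_fence_init(&fence->base, &drm_syncobj_stub_fence_ops,
			       &fence->lock, 0, 0);
		/* signaled right from the beginning */
		dma_fence_signal(&fence->base);
		return &fence->base;
	}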



Signed-off-by: Chunming Zhou 
Cc: Jason Ekstrand 
---
  drivers/gpu/drm/drm_syncobj.c | 20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index d17ed75ac7e2..d4b48fb410a1 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -172,37 +172,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj 
*syncobj,
  }
  EXPORT_SYMBOL(drm_syncobj_replace_fence);
  
-struct drm_syncobj_null_fence {

+struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
  };
  
-static const char *drm_syncobj_null_fence_get_name(struct dma_fence *fence)

+static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
  {
-return "syncobjnull";
+return "syncobjstub";
  }
  
-static bool drm_syncobj_null_fence_enable_signaling(struct dma_fence *fence)

+static bool drm_syncobj_stub_fence_enable_signaling(struct dma_fence *fence)
  {
  return !dma_fence_is_signaled(fence);
  }
  
-static const struct dma_fence_ops drm_syncobj_null_fence_ops = {

-   .get_driver_name = drm_syncobj_null_fence_get_name,
-   .get_timeline_name = drm_syncobj_null_fence_get_name,
-   .enable_signaling = drm_syncobj_null_fence_enable_signaling,
+static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
+   .get_driver_name = drm_syncobj_stub_fence_get_name,
+   .get_timeline_name = drm_syncobj_stub_fence_get_name,
+   .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
.release = NULL,
  };
  
  static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)

  {
-   struct drm_syncobj_null_fence *fence;
+   struct drm_syncobj_stub_fence *fence;
fence = kzalloc(sizeof(*fence), GFP_KERNEL);
if (fence == NULL)
return -ENOMEM;
  
  	spin_lock_init(&fence->lock);

-   dma_fence_init(&fence->base, &drm_syncobj_null_fence_ops,
+   dma_fence_init(&fence->base, &drm_syncobj_stub_fence_ops,
   &fence->lock, 0, 0);
dma_fence_signal(&fence->base);
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 5/5] drm: add syncobj timeline support v2

2018-08-23 Thread Daniel Vetter
On Thu, Aug 23, 2018 at 04:25:42PM +0800, Chunming Zhou wrote:
> VK_KHR_timeline_semaphore:
> This extension introduces a new type of semaphore that has an integer payload
> identifying a point in a timeline. Such timeline semaphores support the
> following operations:
>* Host query - A host operation that allows querying the payload of the
>  timeline semaphore.
>* Host wait - A host operation that allows a blocking wait for a
>  timeline semaphore to reach a specified value.
>* Device wait - A device operation that allows waiting for a
>  timeline semaphore to reach a specified value.
>* Device signal - A device operation that allows advancing the
>  timeline semaphore to a specified value.
> 
> Since it's a timeline, the earlier time point (PT) is always signaled
> before the later PT.
> a. signal PT design:
> Signal PT fence N depends on the PT[N-1] fence and the signal operation
> fence; when the PT[N] fence is signaled, the timeline increases to the
> value of PT[N].
> b. wait PT design:
> A wait PT fence is signaled when the timeline reaches its point value.
> When the timeline increases, the wait PT values are compared with the new
> timeline value; if a PT value is lower than the timeline value, that wait
> PT is signaled, otherwise it stays in the list. A semaphore wait operation
> can wait on any point of the timeline, so an RB tree is needed to order
> them. And since a wait PT can be submitted ahead of its signal PT, we need
> a submission fence to handle that.
> 
> v2:
> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
> 2. move unexposed definitions to .c file. (Daniel Vetter)
> 3. split up the change to drm_syncobj_find_fence() in a separate patch. 
> (Christian)
> 4. split up the change to drm_syncobj_replace_fence() in a separate patch.
> 5. drop the submission_fence implementation and instead use wait_event() for 
> that. (Christian)
> 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

Depending upon how it's going to be used, this is the wrong thing to do.

> TODO:
> 1. CPU query and wait on timeline semaphore.
> 2. test application (Daniel Vetter)

I also had some more suggestions, around aligning the two concepts of
future fences and at least trying to merge the timeline and the other
fence (which really is just a special case of a timeline with only 1
slot).
-Daniel

> 
> Signed-off-by: Chunming Zhou 
> Cc: Christian Konig 
> Cc: Dave Airlie 
> Cc: Daniel Rakos 
> Cc: Daniel Vetter 
> ---
>  drivers/gpu/drm/drm_syncobj.c | 383 
> +++---
>  include/drm/drm_syncobj.h |  28 +++
>  include/uapi/drm/drm.h|   1 +
>  3 files changed, 389 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index 6227df2cc0a4..f738d78edf65 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -56,6 +56,44 @@
>  #include "drm_internal.h"
>  #include 
>  
> +struct drm_syncobj_stub_fence {
> + struct dma_fence base;
> + spinlock_t lock;
> +};
> +
> +static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
> +{
> +return "syncobjstub";
> +}
> +
> +static bool drm_syncobj_stub_fence_enable_signaling(struct dma_fence *fence)
> +{
> +return !dma_fence_is_signaled(fence);
> +}
> +
> +static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
> + .get_driver_name = drm_syncobj_stub_fence_get_name,
> + .get_timeline_name = drm_syncobj_stub_fence_get_name,
> + .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
> + .release = NULL,
> +};
> +
> +struct drm_syncobj_wait_pt {
> + struct drm_syncobj_stub_fence base;
> + u64value;
> + struct rb_node   node;
> +};
> +struct drm_syncobj_signal_pt {
> + struct drm_syncobj_stub_fence base;
> + struct dma_fence *signal_fence;
> + struct dma_fence *pre_pt_base;
> + struct dma_fence_cb signal_cb;
> + struct dma_fence_cb pre_pt_cb;
> + struct drm_syncobj *syncobj;
> + u64value;
> + struct list_head list;
> +};
> +
>  /**
>   * drm_syncobj_find - lookup and reference a sync object.
>   * @file_private: drm file private pointer
> @@ -137,6 +175,150 @@ void drm_syncobj_remove_callback(struct drm_syncobj 
> *syncobj,
>   spin_unlock(&syncobj->lock);
>  }
>  
> +static void drm_syncobj_timeline_signal_wait_pts(struct drm_syncobj *syncobj)
> +{
> + struct rb_node *node = NULL;
> + struct drm_syncobj_wait_pt *wait_pt = NULL;
> +
> + spin_lock(&syncobj->lock);
> + for(node = rb_first(&syncobj->syncobj_timeline.wait_pt_tree);
> + node != NULL; ) {
> + wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
> + node = rb_next(node);
> + if (wait_pt->value <= syncobj->syncobj_timeline.timeline) {
> + dma_fence_signal(&wait_pt->base.base);
> + rb_erase(&wait_pt->node,
> +  &syncobj->syncobj_timeline.wait_pt_tree);

Re: [PATCH 4/5] drm: expand replace_fence to support timeline point

2018-08-23 Thread Christian König

Am 23.08.2018 um 10:25 schrieb Chunming Zhou:

After this expansion, we can place a fence at a specific timeline point.

Signed-off-by: Chunming Zhou 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
  drivers/gpu/drm/drm_syncobj.c  | 16 +---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
  drivers/gpu/drm/v3d/v3d_gem.c  |  3 ++-
  drivers/gpu/drm/vc4/vc4_gem.c  |  2 +-
  include/drm/drm_syncobj.h  |  2 +-
  6 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 4d3f1a6ee078..ef922e34086e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1151,7 +1151,7 @@ static void amdgpu_cs_post_dependencies(struct 
amdgpu_cs_parser *p)
int i;
  
  	for (i = 0; i < p->num_post_dep_syncobjs; ++i)

-   drm_syncobj_replace_fence(p->post_dep_syncobjs[i], p->fence);
+   drm_syncobj_replace_fence(p->post_dep_syncobjs[i], p->fence, 0);
  }
  
  static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 3aac0b50a104..6227df2cc0a4 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -141,11 +141,13 @@ void drm_syncobj_remove_callback(struct drm_syncobj 
*syncobj,
   * drm_syncobj_replace_fence - replace fence in a sync object.
   * @syncobj: Sync object to replace fence in
   * @fence: fence to install in sync file.
+ * @point: timeline point
   *
- * This replaces the fence on a sync object.
+ * This replaces the fence on a sync object, or a timeline point fence.
   */
  void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
-  struct dma_fence *fence)
+  struct dma_fence *fence,
+  u64 point)
  {
struct dma_fence *old_fence;
struct drm_syncobj_cb *cur, *tmp;
@@ -206,7 +208,7 @@ static int drm_syncobj_assign_null_handle(struct 
drm_syncobj *syncobj)
   &fence->lock, 0, 0);
dma_fence_signal(&fence->base);
  
-	drm_syncobj_replace_fence(syncobj, &fence->base);

+   drm_syncobj_replace_fence(syncobj, &fence->base, 0);
  
  	dma_fence_put(&fence->base);
  
@@ -258,7 +260,7 @@ void drm_syncobj_free(struct kref *kref)

struct drm_syncobj *syncobj = container_of(kref,
   struct drm_syncobj,
   refcount);
-   drm_syncobj_replace_fence(syncobj, NULL);
+   drm_syncobj_replace_fence(syncobj, NULL, 0);
kfree(syncobj);
  }
  EXPORT_SYMBOL(drm_syncobj_free);
@@ -298,7 +300,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, 
uint32_t flags,
}
  
  	if (fence)

-   drm_syncobj_replace_fence(syncobj, fence);
+   drm_syncobj_replace_fence(syncobj, fence, 0);
  
  	*out_syncobj = syncobj;

return 0;
@@ -483,7 +485,7 @@ static int drm_syncobj_import_sync_file_fence(struct 
drm_file *file_private,
return -ENOENT;
}
  
-	drm_syncobj_replace_fence(syncobj, fence);

+   drm_syncobj_replace_fence(syncobj, fence, 0);
dma_fence_put(fence);
drm_syncobj_put(syncobj);
return 0;
@@ -965,7 +967,7 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
return ret;
  
  	for (i = 0; i < args->count_handles; i++)

-   drm_syncobj_replace_fence(syncobjs[i], NULL);
+   drm_syncobj_replace_fence(syncobjs[i], NULL, 0);
  
  	drm_syncobj_array_free(syncobjs, args->count_handles);
  
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c

index 60dc2a865f5f..fab3b8fe7a60 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2211,7 +2211,7 @@ signal_fence_array(struct i915_execbuffer *eb,
if (!(flags & I915_EXEC_FENCE_SIGNAL))
continue;
  
-		drm_syncobj_replace_fence(syncobj, fence);

+   drm_syncobj_replace_fence(syncobj, fence, 0);
}
  }
  
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c

index f6dfb8140a62..f3ec1f18c04c 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -587,7 +587,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
sync_out = drm_syncobj_find(file_priv, args->out_sync);
if (sync_out) {
drm_syncobj_replace_fence(sync_out,
- &exec->render.base.s_fence->finished);
+ &exec->render.base.s_fence->finished,
+ 0);
drm_syncobj_put(sync_out);
}
  
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_ge

Re: [PATCH 1/5] drm: fix syncobj null_fence_enable_signaling

2018-08-23 Thread Daniel Vetter
On Thu, Aug 23, 2018 at 11:02 AM, Christian König
 wrote:
> Am 23.08.2018 um 10:25 schrieb Chunming Zhou:
>>
>> That is certainly totally nonsense. dma_fence_enable_sw_signaling()
>> is the function that calls this callback.
>>
>> Signed-off-by: Chunming Zhou 
>> Cc: Jason Ekstrand 
>
>
> For this one: Reviewed-by: Christian König 

As mentioned in the v1 thread, you can outright nuke this, no longer
needed. But this at least makes sense now. Ack either way.
-Daniel

>
>> ---
>>   drivers/gpu/drm/drm_syncobj.c | 1 -
>>   1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
>> index 3a8837c49639..d17ed75ac7e2 100644
>> --- a/drivers/gpu/drm/drm_syncobj.c
>> +++ b/drivers/gpu/drm/drm_syncobj.c
>> @@ -184,7 +184,6 @@ static const char
>> *drm_syncobj_null_fence_get_name(struct dma_fence *fence)
>> static bool drm_syncobj_null_fence_enable_signaling(struct dma_fence
>> *fence)
>>   {
>> -dma_fence_enable_sw_signaling(fence);
>>   return !dma_fence_is_signaled(fence);
>>   }
>>
>
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 3/5] drm: expand drm_syncobj_find_fence to support timeline point

2018-08-23 Thread Christian König

Am 23.08.2018 um 10:25 schrieb Chunming Zhou:

After this expansion, we can fetch the fence at a specific timeline point.

Signed-off-by: Chunming Zhou 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
  drivers/gpu/drm/drm_syncobj.c  | 6 --
  drivers/gpu/drm/v3d/v3d_gem.c  | 4 ++--
  drivers/gpu/drm/vc4/vc4_gem.c  | 2 +-
  include/drm/drm_syncobj.h  | 2 +-
  5 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 7a625f3989a0..4d3f1a6ee078 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1062,7 +1062,7 @@ static int amdgpu_syncobj_lookup_and_add_to_sync(struct 
amdgpu_cs_parser *p,
  {
int r;
struct dma_fence *fence;
-   r = drm_syncobj_find_fence(p->filp, handle, &fence);
+   r = drm_syncobj_find_fence(p->filp, handle, &fence, 0);
if (r)
return r;
  
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c

index d4b48fb410a1..3aac0b50a104 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -218,6 +218,7 @@ static int drm_syncobj_assign_null_handle(struct 
drm_syncobj *syncobj)
   * @file_private: drm file private pointer
   * @handle: sync object handle to lookup.
   * @fence: out parameter for the fence
+ * @point: timeline point
   *
   * This is just a convenience function that combines drm_syncobj_find() and
   * drm_syncobj_fence_get().
@@ -228,7 +229,8 @@ static int drm_syncobj_assign_null_handle(struct 
drm_syncobj *syncobj)
   */
  int drm_syncobj_find_fence(struct drm_file *file_private,
   u32 handle,
-  struct dma_fence **fence)
+  struct dma_fence **fence,
+  u64 point)
  {
struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
int ret = 0;
@@ -498,7 +500,7 @@ static int drm_syncobj_export_sync_file(struct drm_file 
*file_private,
if (fd < 0)
return fd;
  
-	ret = drm_syncobj_find_fence(file_private, handle, &fence);

+   ret = drm_syncobj_find_fence(file_private, handle, &fence, 0);
if (ret)
goto err_put_fd;
  
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c

index e1fcbb4cd0ae..f6dfb8140a62 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -521,12 +521,12 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
kref_init(&exec->refcount);
  
  	ret = drm_syncobj_find_fence(file_priv, args->in_sync_bcl,

-&exec->bin.in_fence);
+&exec->bin.in_fence, 0);
if (ret == -EINVAL)
goto fail;
  
  	ret = drm_syncobj_find_fence(file_priv, args->in_sync_rcl,

-&exec->render.in_fence);
+&exec->render.in_fence, 0);
if (ret == -EINVAL)
goto fail;
  
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c

index 7910b9acedd6..f7b4971342e8 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -1173,7 +1173,7 @@ vc4_submit_cl_ioctl(struct drm_device *dev, void *data,
  
  	if (args->in_sync) {

ret = drm_syncobj_find_fence(file_priv, args->in_sync,
-&in_fence);
+&in_fence, 0);
if (ret)
goto fail;
  
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h

index e419c79ba94d..9962f7a1672c 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -135,7 +135,7 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
   struct dma_fence *fence);
  int drm_syncobj_find_fence(struct drm_file *file_private,
   u32 handle,
-  struct dma_fence **fence);
+  struct dma_fence **fence, u64 point);


The fence is the result of the function and should come last.

Christian.
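
A minimal sketch of the prototype with the suggested ordering (an
assumption about the next revision, not what this patch implements):

	int drm_syncobj_find_fence(struct drm_file *file_private,
				   u32 handle, u64 point,
				   struct dma_fence **fence);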


  void drm_syncobj_free(struct kref *kref);
  int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
   struct dma_fence *fence);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/5] drm: fix syncobj null_fence_enable_signaling

2018-08-23 Thread Christian König

Am 23.08.2018 um 10:25 schrieb Chunming Zhou:

That is certainly totally nonsense. dma_fence_enable_sw_signaling()
is the function that calls this callback.

Signed-off-by: Chunming Zhou 
Cc: Jason Ekstrand 


For this one: Reviewed-by: Christian König 


---
  drivers/gpu/drm/drm_syncobj.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 3a8837c49639..d17ed75ac7e2 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -184,7 +184,6 @@ static const char *drm_syncobj_null_fence_get_name(struct 
dma_fence *fence)
  
  static bool drm_syncobj_null_fence_enable_signaling(struct dma_fence *fence)

  {
-dma_fence_enable_sw_signaling(fence);
  return !dma_fence_is_signaled(fence);
  }
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/6] drm/amdgpu: cleanup GPU recovery check a bit

2018-08-23 Thread Christian König

Am 23.08.2018 um 04:54 schrieb Huang Rui:

On Wed, Aug 22, 2018 at 12:04:53PM +0200, Christian König wrote:

Check if we should call the function instead of providing the forced
flag.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 38 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c|  4 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c|  3 ++-
  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c  |  4 ++--
  drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c  |  3 ++-
  7 files changed, 36 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 19ef7711d944..340e40d03d54 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1158,8 +1158,9 @@ int emu_soc_asic_init(struct amdgpu_device *adev);
  #define amdgpu_asic_need_full_reset(adev) 
(adev)->asic_funcs->need_full_reset((adev))
  
  /* Common functions */

+bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev);
  int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
- struct amdgpu_job* job, bool force);
+ struct amdgpu_job* job);
  void amdgpu_device_pci_config_reset(struct amdgpu_device *adev);
  bool amdgpu_device_need_post(struct amdgpu_device *adev);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index c23339d8ae2d..9f5e4be76d5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3244,32 +3244,44 @@ static int amdgpu_device_reset_sriov(struct 
amdgpu_device *adev,
return r;
  }
  
+/**

+ * amdgpu_device_should_recover_gpu - check if we should try GPU recovery
+ *
+ * @adev: amdgpu device pointer
+ *
+ * Check amdgpu_gpu_recovery and SRIOV status to see if we should try to 
recover
+ * a hung GPU.
+ */
+bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev)
+{
+   if (!amdgpu_device_ip_check_soft_reset(adev)) {
+   DRM_INFO("Timeout, but no hardware hang detected.\n");
+   return false;
+   }
+
+   if (amdgpu_gpu_recovery == 0 || (amdgpu_gpu_recovery == -1  &&
+!amdgpu_sriov_vf(adev))) {
+   DRM_INFO("GPU recovery disabled.\n");
+   return false;
+   }
+
+   return true;
+}
+
  /**
   * amdgpu_device_gpu_recover - reset the asic and recover scheduler
   *
   * @adev: amdgpu device pointer
   * @job: which job trigger hang
- * @force: forces reset regardless of amdgpu_gpu_recovery
   *
   * Attempt to reset the GPU if it has hung (all asics).
   * Returns 0 for success or an error on failure.
   */
  int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
- struct amdgpu_job *job, bool force)
+ struct amdgpu_job *job)
  {

In my view, we actually don't need to return an "int" from this function,
because no caller checks the return value.


Yeah, that is also something I noticed.

But for now this cleanup is only about the force flag and the check if 
the function should be called or not.
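
A minimal sketch of the call-site pattern that results from this split,
using the two helpers from the patch (the surrounding timeout handler is
assumed):

	/* e.g. in a job timeout handler */
	if (amdgpu_device_should_recover_gpu(ring->adev))
		amdgpu_device_gpu_recover(ring->adev, job);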



Otherwise it looks good to me.
Reviewed-by: Huang Rui 


Thanks,
Christian.




int i, r, resched;
  
-	if (!force && !amdgpu_device_ip_check_soft_reset(adev)) {

-   DRM_INFO("No hardware hang detected. Did some blocks stall?\n");
-   return 0;
-   }
-
-   if (!force && (amdgpu_gpu_recovery == 0 ||
-   (amdgpu_gpu_recovery == -1  && 
!amdgpu_sriov_vf(adev {
-   DRM_INFO("GPU recovery disabled.\n");
-   return 0;
-   }
-
dev_info(adev->dev, "GPU reset begin!\n");
  
  	mutex_lock(&adev->lock_reset);

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index e74d620d9699..68cccebb8463 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -702,7 +702,7 @@ static int amdgpu_debugfs_gpu_recover(struct seq_file *m, 
void *data)
struct amdgpu_device *adev = dev->dev_private;
  
  	seq_printf(m, "gpu recover\n");

-   amdgpu_device_gpu_recover(adev, NULL, true);
+   amdgpu_device_gpu_recover(adev, NULL);
  
  	return 0;

  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 1abf5b5bac9e..b927e8798534 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -105,8 +105,8 @@ static void amdgpu_irq_reset_work_func(struct work_struct 
*work)
struct amdgpu_device *adev = container_of(work, struct amdgpu_device,
  reset_work);
  
-	if (!amdgpu_sriov_vf(adev))

-   amdgpu_device_g

[PATCH 5/5] drm: add syncobj timeline support v2

2018-08-23 Thread Chunming Zhou
VK_KHR_timeline_semaphore:
This extension introduces a new type of semaphore that has an integer payload
identifying a point in a timeline. Such timeline semaphores support the
following operations:
   * Host query - A host operation that allows querying the payload of the
 timeline semaphore.
   * Host wait - A host operation that allows a blocking wait for a
 timeline semaphore to reach a specified value.
   * Device wait - A device operation that allows waiting for a
 timeline semaphore to reach a specified value.
   * Device signal - A device operation that allows advancing the
 timeline semaphore to a specified value.

Since it's a timeline, the earlier time point (PT) is always signaled
before the later PT.
a. signal PT design:
Signal PT fence N depends on the PT[N-1] fence and the signal operation
fence; when the PT[N] fence is signaled, the timeline increases to the
value of PT[N].
b. wait PT design:
A wait PT fence is signaled when the timeline reaches its point value.
When the timeline increases, the wait PT values are compared with the new
timeline value; if a PT value is lower than the timeline value, that wait
PT is signaled, otherwise it stays in the list. A semaphore wait operation
can wait on any point of the timeline, so an RB tree is needed to order
them. And since a wait PT can be submitted ahead of its signal PT, we need
a submission fence to handle that.
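
As an illustration of the RB-tree ordering described above, a minimal
sketch of a wait-point insert (an assumption mirroring the design, not a
hunk from this patch; syncobj->lock is assumed to be held):

	static void drm_syncobj_timeline_add_wait_pt(struct drm_syncobj *syncobj,
						     struct drm_syncobj_wait_pt *wait_pt)
	{
		struct rb_node **new = &syncobj->syncobj_timeline.wait_pt_tree.rb_node;
		struct rb_node *parent = NULL;

		/* keep the tree ordered by timeline value */
		while (*new) {
			struct drm_syncobj_wait_pt *this =
				rb_entry(*new, struct drm_syncobj_wait_pt, node);

			parent = *new;
			if (wait_pt->value < this->value)
				new = &(*new)->rb_left;
			else
				new = &(*new)->rb_right;
		}
		rb_link_node(&wait_pt->node, parent, new);
		rb_insert_color(&wait_pt->node,
				&syncobj->syncobj_timeline.wait_pt_tree);
	}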

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed definitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. 
(Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for 
that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

TODO:
1. CPU query and wait on timeline semaphore.
2. test application (Daniel Vetter)

Signed-off-by: Chunming Zhou 
Cc: Christian Konig 
Cc: Dave Airlie 
Cc: Daniel Rakos 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_syncobj.c | 383 +++---
 include/drm/drm_syncobj.h |  28 +++
 include/uapi/drm/drm.h|   1 +
 3 files changed, 389 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 6227df2cc0a4..f738d78edf65 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,44 @@
 #include "drm_internal.h"
 #include 
 
+struct drm_syncobj_stub_fence {
+   struct dma_fence base;
+   spinlock_t lock;
+};
+
+static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
+{
+return "syncobjstub";
+}
+
+static bool drm_syncobj_stub_fence_enable_signaling(struct dma_fence *fence)
+{
+return !dma_fence_is_signaled(fence);
+}
+
+static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
+   .get_driver_name = drm_syncobj_stub_fence_get_name,
+   .get_timeline_name = drm_syncobj_stub_fence_get_name,
+   .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
+   .release = NULL,
+};
+
+struct drm_syncobj_wait_pt {
+   struct drm_syncobj_stub_fence base;
+   u64value;
+   struct rb_node   node;
+};
+struct drm_syncobj_signal_pt {
+   struct drm_syncobj_stub_fence base;
+   struct dma_fence *signal_fence;
+   struct dma_fence *pre_pt_base;
+   struct dma_fence_cb signal_cb;
+   struct dma_fence_cb pre_pt_cb;
+   struct drm_syncobj *syncobj;
+   u64value;
+   struct list_head list;
+};
+
 /**
  * drm_syncobj_find - lookup and reference a sync object.
  * @file_private: drm file private pointer
@@ -137,6 +175,150 @@ void drm_syncobj_remove_callback(struct drm_syncobj 
*syncobj,
spin_unlock(&syncobj->lock);
 }
 
+static void drm_syncobj_timeline_signal_wait_pts(struct drm_syncobj *syncobj)
+{
+   struct rb_node *node = NULL;
+   struct drm_syncobj_wait_pt *wait_pt = NULL;
+
+   spin_lock(&syncobj->lock);
+   for(node = rb_first(&syncobj->syncobj_timeline.wait_pt_tree);
+   node != NULL; ) {
+   wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
+   node = rb_next(node);
+   if (wait_pt->value <= syncobj->syncobj_timeline.timeline) {
+   dma_fence_signal(&wait_pt->base.base);
+   rb_erase(&wait_pt->node,
+&syncobj->syncobj_timeline.wait_pt_tree);
+   RB_CLEAR_NODE(&wait_pt->node);
+   /* kfree(wait_pt) is executed by fence put */
+   dma_fence_put(&wait_pt->base.base);
+   } else {
+   /* the loop is from left to right, the later entry value is
+* bigger, so don't need to check any more */
+   break;
+   }
+   }
+   spin_unlock(&syncobj->lock);
+}
+
+
+static void pt_fence_cb(struct drm_syncobj_signal_pt *signal_pt)
+{

[PATCH 4/5] drm: expand replace_fence to support timeline point

2018-08-23 Thread Chunming Zhou
After this expansion, we can place a fence at a specific timeline point.
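
For illustration, a hypothetical future call site (assumptions: 'seqno' is
the timeline value the submission will signal; the amdgpu names are as in
the hunk below) that attaches the scheduler fence to a timeline point
instead of point 0:

	static void amdgpu_cs_post_dependencies_timeline(struct amdgpu_cs_parser *p,
							 u64 seqno)
	{
		int i;

		for (i = 0; i < p->num_post_dep_syncobjs; ++i)
			drm_syncobj_replace_fence(p->post_dep_syncobjs[i],
						  p->fence, seqno);
	}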

Signed-off-by: Chunming Zhou 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
 drivers/gpu/drm/drm_syncobj.c  | 16 +---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/v3d/v3d_gem.c  |  3 ++-
 drivers/gpu/drm/vc4/vc4_gem.c  |  2 +-
 include/drm/drm_syncobj.h  |  2 +-
 6 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 4d3f1a6ee078..ef922e34086e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1151,7 +1151,7 @@ static void amdgpu_cs_post_dependencies(struct 
amdgpu_cs_parser *p)
int i;
 
for (i = 0; i < p->num_post_dep_syncobjs; ++i)
-   drm_syncobj_replace_fence(p->post_dep_syncobjs[i], p->fence);
+   drm_syncobj_replace_fence(p->post_dep_syncobjs[i], p->fence, 0);
 }
 
 static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 3aac0b50a104..6227df2cc0a4 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -141,11 +141,13 @@ void drm_syncobj_remove_callback(struct drm_syncobj 
*syncobj,
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @syncobj: Sync object to replace fence in
  * @fence: fence to install in sync file.
+ * @point: timeline point
  *
- * This replaces the fence on a sync object.
+ * This replaces the fence on a sync object, or a timeline point fence.
  */
 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
-  struct dma_fence *fence)
+  struct dma_fence *fence,
+  u64 point)
 {
struct dma_fence *old_fence;
struct drm_syncobj_cb *cur, *tmp;
@@ -206,7 +208,7 @@ static int drm_syncobj_assign_null_handle(struct 
drm_syncobj *syncobj)
   &fence->lock, 0, 0);
dma_fence_signal(&fence->base);
 
-   drm_syncobj_replace_fence(syncobj, &fence->base);
+   drm_syncobj_replace_fence(syncobj, &fence->base, 0);
 
dma_fence_put(&fence->base);
 
@@ -258,7 +260,7 @@ void drm_syncobj_free(struct kref *kref)
struct drm_syncobj *syncobj = container_of(kref,
   struct drm_syncobj,
   refcount);
-   drm_syncobj_replace_fence(syncobj, NULL);
+   drm_syncobj_replace_fence(syncobj, NULL, 0);
kfree(syncobj);
 }
 EXPORT_SYMBOL(drm_syncobj_free);
@@ -298,7 +300,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, 
uint32_t flags,
}
 
if (fence)
-   drm_syncobj_replace_fence(syncobj, fence);
+   drm_syncobj_replace_fence(syncobj, fence, 0);
 
*out_syncobj = syncobj;
return 0;
@@ -483,7 +485,7 @@ static int drm_syncobj_import_sync_file_fence(struct 
drm_file *file_private,
return -ENOENT;
}
 
-   drm_syncobj_replace_fence(syncobj, fence);
+   drm_syncobj_replace_fence(syncobj, fence, 0);
dma_fence_put(fence);
drm_syncobj_put(syncobj);
return 0;
@@ -965,7 +967,7 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
return ret;
 
for (i = 0; i < args->count_handles; i++)
-   drm_syncobj_replace_fence(syncobjs[i], NULL);
+   drm_syncobj_replace_fence(syncobjs[i], NULL, 0);
 
drm_syncobj_array_free(syncobjs, args->count_handles);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 60dc2a865f5f..fab3b8fe7a60 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2211,7 +2211,7 @@ signal_fence_array(struct i915_execbuffer *eb,
if (!(flags & I915_EXEC_FENCE_SIGNAL))
continue;
 
-   drm_syncobj_replace_fence(syncobj, fence);
+   drm_syncobj_replace_fence(syncobj, fence, 0);
}
 }
 
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index f6dfb8140a62..f3ec1f18c04c 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -587,7 +587,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
sync_out = drm_syncobj_find(file_priv, args->out_sync);
if (sync_out) {
drm_syncobj_replace_fence(sync_out,
- &exec->render.base.s_fence->finished);
+ &exec->render.base.s_fence->finished,
+ 0);
drm_syncobj_put(sync_out);
}
 
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index f7b4971342e8..68832d66d716

[PATCH 3/5] drm: expand drm_syncobj_find_fence to support timeline point

2018-08-23 Thread Chunming Zhou
After this expansion, we can fetch the fence at a specific timeline point.

Signed-off-by: Chunming Zhou 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 drivers/gpu/drm/drm_syncobj.c  | 6 --
 drivers/gpu/drm/v3d/v3d_gem.c  | 4 ++--
 drivers/gpu/drm/vc4/vc4_gem.c  | 2 +-
 include/drm/drm_syncobj.h  | 2 +-
 5 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 7a625f3989a0..4d3f1a6ee078 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1062,7 +1062,7 @@ static int amdgpu_syncobj_lookup_and_add_to_sync(struct 
amdgpu_cs_parser *p,
 {
int r;
struct dma_fence *fence;
-   r = drm_syncobj_find_fence(p->filp, handle, &fence);
+   r = drm_syncobj_find_fence(p->filp, handle, &fence, 0);
if (r)
return r;
 
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index d4b48fb410a1..3aac0b50a104 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -218,6 +218,7 @@ static int drm_syncobj_assign_null_handle(struct 
drm_syncobj *syncobj)
  * @file_private: drm file private pointer
  * @handle: sync object handle to lookup.
  * @fence: out parameter for the fence
+ * @point: timeline point
  *
  * This is just a convenience function that combines drm_syncobj_find() and
  * drm_syncobj_fence_get().
@@ -228,7 +229,8 @@ static int drm_syncobj_assign_null_handle(struct 
drm_syncobj *syncobj)
  */
 int drm_syncobj_find_fence(struct drm_file *file_private,
   u32 handle,
-  struct dma_fence **fence)
+  struct dma_fence **fence,
+  u64 point)
 {
struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
int ret = 0;
@@ -498,7 +500,7 @@ static int drm_syncobj_export_sync_file(struct drm_file 
*file_private,
if (fd < 0)
return fd;
 
-   ret = drm_syncobj_find_fence(file_private, handle, &fence);
+   ret = drm_syncobj_find_fence(file_private, handle, &fence, 0);
if (ret)
goto err_put_fd;
 
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index e1fcbb4cd0ae..f6dfb8140a62 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -521,12 +521,12 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
kref_init(&exec->refcount);
 
ret = drm_syncobj_find_fence(file_priv, args->in_sync_bcl,
-&exec->bin.in_fence);
+&exec->bin.in_fence, 0);
if (ret == -EINVAL)
goto fail;
 
ret = drm_syncobj_find_fence(file_priv, args->in_sync_rcl,
-&exec->render.in_fence);
+&exec->render.in_fence, 0);
if (ret == -EINVAL)
goto fail;
 
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index 7910b9acedd6..f7b4971342e8 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -1173,7 +1173,7 @@ vc4_submit_cl_ioctl(struct drm_device *dev, void *data,
 
if (args->in_sync) {
ret = drm_syncobj_find_fence(file_priv, args->in_sync,
-&in_fence);
+&in_fence, 0);
if (ret)
goto fail;
 
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index e419c79ba94d..9962f7a1672c 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -135,7 +135,7 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
   struct dma_fence *fence);
 int drm_syncobj_find_fence(struct drm_file *file_private,
   u32 handle,
-  struct dma_fence **fence);
+  struct dma_fence **fence, u64 point);
 void drm_syncobj_free(struct kref *kref);
 int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
   struct dma_fence *fence);
-- 
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/5] drm: rename null fence to stub fence in syncobj

2018-08-23 Thread Chunming Zhou
stub fence will be used by timeline syncobj as well.

Signed-off-by: Chunming Zhou 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/drm_syncobj.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index d17ed75ac7e2..d4b48fb410a1 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -172,37 +172,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj 
*syncobj,
 }
 EXPORT_SYMBOL(drm_syncobj_replace_fence);
 
-struct drm_syncobj_null_fence {
+struct drm_syncobj_stub_fence {
struct dma_fence base;
spinlock_t lock;
 };
 
-static const char *drm_syncobj_null_fence_get_name(struct dma_fence *fence)
+static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
 {
-return "syncobjnull";
+return "syncobjstub";
 }
 
-static bool drm_syncobj_null_fence_enable_signaling(struct dma_fence *fence)
+static bool drm_syncobj_stub_fence_enable_signaling(struct dma_fence *fence)
 {
 return !dma_fence_is_signaled(fence);
 }
 
-static const struct dma_fence_ops drm_syncobj_null_fence_ops = {
-   .get_driver_name = drm_syncobj_null_fence_get_name,
-   .get_timeline_name = drm_syncobj_null_fence_get_name,
-   .enable_signaling = drm_syncobj_null_fence_enable_signaling,
+static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
+   .get_driver_name = drm_syncobj_stub_fence_get_name,
+   .get_timeline_name = drm_syncobj_stub_fence_get_name,
+   .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
.release = NULL,
 };
 
 static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 {
-   struct drm_syncobj_null_fence *fence;
+   struct drm_syncobj_stub_fence *fence;
fence = kzalloc(sizeof(*fence), GFP_KERNEL);
if (fence == NULL)
return -ENOMEM;
 
spin_lock_init(&fence->lock);
-   dma_fence_init(&fence->base, &drm_syncobj_null_fence_ops,
+   dma_fence_init(&fence->base, &drm_syncobj_stub_fence_ops,
   &fence->lock, 0, 0);
dma_fence_signal(&fence->base);
 
-- 
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/5] drm: fix syncobj null_fence_enable_signaling

2018-08-23 Thread Chunming Zhou
That is certainly totally nonsense. dma_fence_enable_sw_signaling()
is the function that calls this callback.

Signed-off-by: Chunming Zhou 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/drm_syncobj.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 3a8837c49639..d17ed75ac7e2 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -184,7 +184,6 @@ static const char *drm_syncobj_null_fence_get_name(struct 
dma_fence *fence)
 
 static bool drm_syncobj_null_fence_enable_signaling(struct dma_fence *fence)
 {
-dma_fence_enable_sw_signaling(fence);
 return !dma_fence_is_signaled(fence);
 }
 
-- 
2.14.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 03/11] drm/amdgpu: cleanup VM handling in the CS a bit

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:09PM +0200, Christian König wrote:
> Add a helper function for getting the root PD addr, join the two
> VM-related functions, and clean up the function name.
> 
> No functional change.
> 
> Signed-off-by: Christian König 

Reviewed-by: Huang Rui 
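
A minimal sketch of what such a helper could look like (the name and
placement are assumptions; the hunk below still inlines the same
expression):

	/* hypothetical helper: GPU address of the VM's root page directory */
	static uint64_t amdgpu_vm_root_pd_addr(struct amdgpu_vm *vm)
	{
		return amdgpu_bo_gpu_offset(vm->root.base.bo);
	}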

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 160 -
>  1 file changed, 74 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index d42d1c8f78f6..17bf63f93c93 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -804,8 +804,9 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser 
> *parser, int error,
>   amdgpu_bo_unref(&parser->uf_entry.robj);
>  }
>  
> -static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
> +static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
>  {
> + struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
>   struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>   struct amdgpu_device *adev = p->adev;
>   struct amdgpu_vm *vm = &fpriv->vm;
> @@ -814,6 +815,71 @@ static int amdgpu_bo_vm_update_pte(struct 
> amdgpu_cs_parser *p)
>   struct amdgpu_bo *bo;
>   int r;
>  
> + /* Only for UVD/VCE VM emulation */
> + if (ring->funcs->parse_cs || ring->funcs->patch_cs_in_place) {
> + unsigned i, j;
> +
> + for (i = 0, j = 0; i < p->nchunks && j < p->job->num_ibs; i++) {
> + struct drm_amdgpu_cs_chunk_ib *chunk_ib;
> + struct amdgpu_bo_va_mapping *m;
> + struct amdgpu_bo *aobj = NULL;
> + struct amdgpu_cs_chunk *chunk;
> + uint64_t offset, va_start;
> + struct amdgpu_ib *ib;
> + uint8_t *kptr;
> +
> + chunk = &p->chunks[i];
> + ib = &p->job->ibs[j];
> + chunk_ib = chunk->kdata;
> +
> + if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB)
> + continue;
> +
> + va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK;
> + r = amdgpu_cs_find_mapping(p, va_start, &aobj, &m);
> + if (r) {
> + DRM_ERROR("IB va_start is invalid\n");
> + return r;
> + }
> +
> + if ((va_start + chunk_ib->ib_bytes) >
> + (m->last + 1) * AMDGPU_GPU_PAGE_SIZE) {
> + DRM_ERROR("IB va_start+ib_bytes is invalid\n");
> + return -EINVAL;
> + }
> +
> + /* the IB should be reserved at this point */
> + r = amdgpu_bo_kmap(aobj, (void **)&kptr);
> + if (r) {
> + return r;
> + }
> +
> + offset = m->start * AMDGPU_GPU_PAGE_SIZE;
> + kptr += va_start - offset;
> +
> + if (ring->funcs->parse_cs) {
> + memcpy(ib->ptr, kptr, chunk_ib->ib_bytes);
> + amdgpu_bo_kunmap(aobj);
> +
> + r = amdgpu_ring_parse_cs(ring, p, j);
> + if (r)
> + return r;
> + } else {
> + ib->ptr = (uint32_t *)kptr;
> + r = amdgpu_ring_patch_cs_in_place(ring, p, j);
> + amdgpu_bo_kunmap(aobj);
> + if (r)
> + return r;
> + }
> +
> + j++;
> + }
> + }
> +
> + if (!p->job->vm)
> + return amdgpu_cs_sync_rings(p);
> +
> +
>   r = amdgpu_vm_clear_freed(adev, vm, NULL);
>   if (r)
>   return r;
> @@ -876,6 +942,12 @@ static int amdgpu_bo_vm_update_pte(struct 
> amdgpu_cs_parser *p)
>   if (r)
>   return r;
>  
> + r = reservation_object_reserve_shared(vm->root.base.bo->tbo.resv);
> + if (r)
> + return r;
> +
> + p->job->vm_pd_addr = amdgpu_bo_gpu_offset(vm->root.base.bo);
> +
>   if (amdgpu_vm_debug) {
>   /* Invalidate all BOs to test for userspace bugs */
>   amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> @@ -887,90 +959,6 @@ static int amdgpu_bo_vm_update_pte(struct 
> amdgpu_cs_parser *p)
>   }
>   }
>  
> - return r;
> -}
> -
> -static int amdgpu_cs_ib_vm_chunk(struct amdgpu_device *adev,
> -  struct amdgpu_cs_parser *p)
> -{
> - struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
> - struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> - struct amdgp

Re: [PATCH 02/11] drm/amdgpu: validate the VM root PD from the VM code

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:08PM +0200, Christian König wrote:
> Preparation for following changes. This validates the root PD twice,
> but the overhead of that should be minimal.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 73b8dcaf66e6..53ce9982a5ee 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -291,11 +291,11 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device 
> *adev, struct amdgpu_vm *vm,
>   list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) {
>   struct amdgpu_bo *bo = bo_base->bo;
>  
> - if (bo->parent) {
> - r = validate(param, bo);
> - if (r)
> - break;
> + r = validate(param, bo);
> + if (r)
> + break;

In the original case, we skip the root PD. But now it is validated once
here. May I know where the other time is?

Thanks,
Ray

>  
> + if (bo->parent) {
>   spin_lock(&glob->lru_lock);
>   ttm_bo_move_to_lru_tail(&bo->tbo);
>   if (bo->shadow)
> -- 
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 01/11] drm/amdgpu: remove extra root PD alignment

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 05:05:07PM +0200, Christian König wrote:
> Just another leftover from radeon.
> 
> Signed-off-by: Christian König 

Acked-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 +---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 3 ---
>  2 files changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 662aec5c81d4..73b8dcaf66e6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2566,8 +2566,6 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
> amdgpu_vm *vm,
>  {
>   struct amdgpu_bo_param bp;
>   struct amdgpu_bo *root;
> - const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
> - AMDGPU_VM_PTE_COUNT(adev) * 8);
>   unsigned long size;
>   uint64_t flags;
>   int r, i;
> @@ -2615,7 +2613,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
> amdgpu_vm *vm,
>   size = amdgpu_vm_bo_size(adev, adev->vm_manager.root_level);
>   memset(&bp, 0, sizeof(bp));
>   bp.size = size;
> - bp.byte_align = align;
> + bp.byte_align = AMDGPU_GPU_PAGE_SIZE;
>   bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
>   bp.flags = flags;
>   bp.type = ttm_bo_type_kernel;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 1162c2bf3138..1c9049feaaea 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -48,9 +48,6 @@ struct amdgpu_bo_list_entry;
>  /* number of entries in page table */
>  #define AMDGPU_VM_PTE_COUNT(adev) (1 << (adev)->vm_manager.block_size)
>  
> -/* PTBs (Page Table Blocks) need to be aligned to 32K */
> -#define AMDGPU_VM_PTB_ALIGN_SIZE   32768
> -
>  #define AMDGPU_PTE_VALID (1ULL << 0)
>  #define AMDGPU_PTE_SYSTEM(1ULL << 1)
>  #define AMDGPU_PTE_SNOOPED   (1ULL << 2)
> -- 
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/5] drm/amdgpu: add ring soft recovery v2

2018-08-23 Thread Huang Rui
On Wed, Aug 22, 2018 at 12:55:43PM -0400, Alex Deucher wrote:
> On Wed, Aug 22, 2018 at 6:05 AM Christian König
>  wrote:
> >
> > Instead of hammering hard on the GPU try a soft recovery first.
> >
> > v2: reorder code a bit
> >
> > Signed-off-by: Christian König 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 ++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 24 
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 
> >  3 files changed, 34 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index 265ff90f4e01..d93e31a5c4e7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > @@ -33,6 +33,12 @@ static void amdgpu_job_timedout(struct drm_sched_job 
> > *s_job)
> > struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
> > struct amdgpu_job *job = to_amdgpu_job(s_job);
> >
> > +   if (amdgpu_ring_soft_recovery(ring, job->vmid, 
> > s_job->s_fence->parent)) {
> > +   DRM_ERROR("ring %s timeout, but soft recovered\n",
> > + s_job->sched->name);
> > +   return;
> > +   }
> 
> I think we should still bubble up the error to userspace even if we
> can recover.  Data is lost when the wave is killed.  We should treat
> it like a GPU reset.
> 
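
A sketch of that suggestion (an assumption, not part of the posted patch):
even when the soft recovery succeeds, account it like a GPU reset so
userspace can notice the lost wave data.

	if (amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
		DRM_ERROR("ring %s timeout, but soft recovered\n",
			  s_job->sched->name);
		/* hypothetical: bump the reset counter queried via amdgpu_ctx */
		atomic_inc(&ring->adev->gpu_reset_counter);
		return;
	}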

May I know what the wavefront stands for? Why can we do this "light"
recovery here rather than a reset?

Thanks,
Ray

> Alex
> 
> > +
> > DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
> >   job->base.sched->name, 
> > atomic_read(&ring->fence_drv.last_seq),
> >   ring->fence_drv.sync_seq);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > index 5dfd26be1eec..c045a4e38ad1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > @@ -383,6 +383,30 @@ void amdgpu_ring_emit_reg_write_reg_wait_helper(struct 
> > amdgpu_ring *ring,
> > amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);
> >  }
> >
> > +/**
> > + * amdgpu_ring_soft_recovery - try to soft recover a ring lockup
> > + *
> > + * @ring: ring to try the recovery on
> > + * @vmid: VMID we try to get going again
> > + * @fence: timedout fence
> > + *
> > + * Tries to get a ring proceeding again when it is stuck.
> > + */
> > +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
> > +  struct dma_fence *fence)
> > +{
> > +   ktime_t deadline = ktime_add_us(ktime_get(), 1000);
> > +
> > +   if (!ring->funcs->soft_recovery)
> > +   return false;
> > +
> > +   while (!dma_fence_is_signaled(fence) &&
> > +  ktime_to_ns(ktime_sub(deadline, ktime_get())) > 0)
> > +   ring->funcs->soft_recovery(ring, vmid);
> > +
> > +   return dma_fence_is_signaled(fence);
> > +}
> > +
> >  /*
> >   * Debugfs info
> >   */
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > index 409fdd9b9710..9cc239968e40 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > @@ -168,6 +168,8 @@ struct amdgpu_ring_funcs {
> > /* priority functions */
> > void (*set_priority) (struct amdgpu_ring *ring,
> >   enum drm_sched_priority priority);
> > +   /* Try to soft recover the ring to make the fence signal */
> > +   void (*soft_recovery)(struct amdgpu_ring *ring, unsigned vmid);
> >  };
> >
> >  struct amdgpu_ring {
> > @@ -260,6 +262,8 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring);
> >  void amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
> > uint32_t reg0, uint32_t 
> > val0,
> > uint32_t reg1, uint32_t 
> > val1);
> > +bool amdgpu_ring_soft_recovery(struct amdgpu_ring *ring, unsigned int vmid,
> > +  struct dma_fence *fence);
> >
> >  static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> >  {
> > --
> > 2.14.1
> >
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: KFD co-maintainership and branch model

2018-08-23 Thread Christian König

Am 23.08.2018 um 08:54 schrieb Oded Gabbay:

On Thu, Aug 23, 2018 at 4:34 AM David Airlie  wrote:

On Thu, Aug 23, 2018 at 8:25 AM, Felix Kuehling  wrote:

Hi all,

Oded has offered to make me co-maintainer of KFD, as he's super busy at
work and less responsive than he used to be.

At the same time we're about to send out the first patches to merge KFD
and AMDGPU into a single kernel module.

With that in mind I'd like to propose to upstream KFD through Alex's
branch in the future. It would avoid conflicts in shared code
(amdgpu_vm.c is most active at the moment) when merging branches, and
make the code flow and testing easier.

Please let me know what you think.


Works for me.

Thanks,
Dave.

Works for me as well.


Sounds good to me as well.

Christian.



Oded
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx