RE: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v4)

2020-07-31 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

Please check v5 and see if it looks better.

I used automatic indentation on those lines.

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: amd-gfx  On Behalf Of Liu, Monk
Sent: Friday, July 31, 2020 3:43 PM
To: Kuehling, Felix ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v4)

[AMD Official Use Only - Internal Distribution Only]

Felix

I cannot get the indentation to look right once the patch goes out through
email; I don't know why.

It looks fine in my vim, though.

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Kuehling, Felix 
Sent: Friday, July 31, 2020 12:54 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v4)


On 2020-07-30 at 11:42 p.m., Monk Liu wrote:
> what:
> the MQD save and restore for KCQs (kernel compute queues) costs a lot of
> clocks during world switch, which significantly hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of KCQs, to avoid a
> performance drop when no kernel compute queue is needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> v2:
> refine namings
>
> v3:
> choose queues for each ring so that they spread across pipes as evenly as possible.
>
> v4:
> fix indentation
> some cleanups in the gfx_compute_queue_acquire() function
>
> TODO:
> in the future the hypervisor driver will set this parameter
> automatically, so there will be no need for the user to configure it
> through modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 52 
> +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 +-
>  7 files changed, 80 insertions(+), 72 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..de11136 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
> +extern int amdgpu_num_kcq;
>
>  #define AMDGPU_VM_MAX_NUM_CTX4096
>  #define AMDGPU_SG_THRESHOLD(256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..cf445bab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct
> amdgpu_device *adev)
>
>  amdgpu_gmc_tmz_set(adev);
>
> +if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) { amdgpu_num_kcq = 8;
> +dev_warn(adev->dev, "set kernel compute queue number to 8 due to
> +invalid paramter provided by user\n"); }
> +
>  return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..b545c40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;  int amdgpu_force_asic_type =
> -1;  int amdgpu_tmz = 0;  int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq = -1;
>
>  struct amdgpu_mgpu_info mgpu_info = {  .mutex =
> __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
> MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default),
> 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
> module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>
> +MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want
> +to setup (8 if set to greater than 8 or less than 0, only affect gfx
> +8+)"); module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
> +
>  static const struct pci_device_id pciidlist[] = {  #ifdef
> CONFIG_DRM_AMDGPU_SI  {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
> CHIP_TAHITI}, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 8eff017..b43df8e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/

RE: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v4)

2020-07-31 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

Felix

I cannot get the indentation to look right once the patch goes out through
email; I don't know why.

It looks fine in my vim, though.

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Kuehling, Felix 
Sent: Friday, July 31, 2020 12:54 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v4)


On 2020-07-30 at 11:42 p.m., Monk Liu wrote:
> what:
> the MQD save and restore for KCQs (kernel compute queues) costs a lot of
> clocks during world switch, which significantly hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of KCQs, to avoid a
> performance drop when no kernel compute queue is needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> v2:
> refine namings
>
> v3:
> choose queues for each ring so that they spread across pipes as evenly as possible.
>
> v4:
> fix indentation
> some cleanups in the gfx_compute_queue_acquire() function
>
> TODO:
> in the future the hypervisor driver will set this parameter
> automatically, so there will be no need for the user to configure it
> through modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 52 
> +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 +-
>  7 files changed, 80 insertions(+), 72 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..de11136 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
> +extern int amdgpu_num_kcq;
>
>  #define AMDGPU_VM_MAX_NUM_CTX4096
>  #define AMDGPU_SG_THRESHOLD(256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..cf445bab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct
> amdgpu_device *adev)
>
>  amdgpu_gmc_tmz_set(adev);
>
> +if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
> +amdgpu_num_kcq = 8;
> +dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid 
> paramter provided by user\n");
> +}
> +
>  return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..b545c40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;  int amdgpu_force_asic_type =
> -1;  int amdgpu_tmz = 0;  int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq = -1;
>
>  struct amdgpu_mgpu_info mgpu_info = {
>  .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
> MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default),
> 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
> module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>
> +MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want
> +to setup (8 if set to greater than 8 or less than 0, only affect gfx
> +8+)"); module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
> +
>  static const struct pci_device_id pciidlist[] = {  #ifdef
> CONFIG_DRM_AMDGPU_SI
>  {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI}, diff
> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 8eff017..b43df8e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -202,40 +202,34 @@ bool
> amdgpu_gfx_is_high_priority_compute_queue(struct amdgpu_device *adev,
>
>  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)  {
> -int i, queue, pipe, mec;
> +int i, queue, pipe;
>  bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
> +int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
> + adev->gfx.mec.num_queue_per_pipe,
> + adev->gfx.num_compute_rings);
> +
> +if (multip

Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v4)

2020-07-30 Thread Felix Kuehling

On 2020-07-30 at 11:42 p.m., Monk Liu wrote:
> what:
> the MQD save and restore for KCQs (kernel compute queues)
> costs a lot of clocks during world switch, which significantly
> hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of KCQs, to avoid
> a performance drop when no kernel compute queue is needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> v2:
> refine namings
>
> v3:
> choose queues for each ring so that they spread across pipes as evenly as possible.
>
> v4:
> fix indentation
> some cleanups in the gfx_compute_queue_acquire() function
>
> TODO:
> in the future the hypervisor driver will set this parameter
> automatically, so there will be no need for the user to
> configure it through modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 52 
> +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 +-
>  7 files changed, 80 insertions(+), 72 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..de11136 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;
>  #ifdef CONFIG_DRM_AMDGPU_CIK
>  extern int amdgpu_cik_support;
>  #endif
> +extern int amdgpu_num_kcq;
>  
>  #define AMDGPU_VM_MAX_NUM_CTX4096
>  #define AMDGPU_SG_THRESHOLD  (256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..cf445bab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
> amdgpu_device *adev)
>  
>   amdgpu_gmc_tmz_set(adev);
>  
> + if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
> + amdgpu_num_kcq = 8;
> + dev_warn(adev->dev, "set kernel compute queue number to 8 due 
> to invalid paramter provided by user\n");
> + }
> +
>   return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..b545c40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;
>  int amdgpu_force_asic_type = -1;
>  int amdgpu_tmz = 0;
>  int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq = -1;
>  
>  struct amdgpu_mgpu_info mgpu_info = {
>   .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
>  MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
> legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
>  module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>  
> +MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup 
> (8 if set to greater than 8 or less than 0, only affect gfx 8+)");
> +module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
> +
>  static const struct pci_device_id pciidlist[] = {
>  #ifdef  CONFIG_DRM_AMDGPU_SI
>   {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 8eff017..b43df8e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -202,40 +202,34 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
> amdgpu_device *adev,
>  
>  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
>  {
> - int i, queue, pipe, mec;
> + int i, queue, pipe;
>   bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
> + int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
> +  adev->gfx.mec.num_queue_per_pipe,
> +  adev->gfx.num_compute_rings);
> +
> + if (multipipe_policy) {
> + /* policy: make queues evenly cross all pipes on MEC1 only */
> + for (i = 0; i < max_queues_per_mec; i++) {
> + pipe = i % adev->gfx.mec.num_pipe_per_mec;
> + queue = (i / adev->gfx.mec.num_pipe_per_mec) %
> + adev->gfx.mec.num_queue_per_pipe;
> +
> + set_bit(pipe * adev->gfx.mec.num_queue_per_pipe + queue,
> + adev->gfx.mec.queue_bitmap);

Indentation looks slightly off here.


> + }
> + } else {
> + /* policy: amdgpu owns all queues in the given pipe */
> + 

RE: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-30 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

My latest patch was already written with a visually-8-spaces-per-TAB setup, but the
TABs were *not* replaced by eight literal spaces

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Kuehling, Felix 
Sent: Friday, July 31, 2020 10:20 AM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v3)


On 2020-07-30 at 10:11 p.m., Liu, Monk wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
>>>> Indentation looks wrong. Did you use the wrong TAB size?
> My TAB size is 4 spaces, and I don't know why it looks strange, but I
> can use spaces to replace the TABs anyway

The Linux kernel uses TABs for indentation, and the TAB width is 8. Do not replace
TABs with spaces for indentation; that would violate the kernel coding style
guide. Please change the settings of your editor.

See Documentation/process/coding-style.rst.

Regards,
  Felix


>
> I will send a v4 patch addressing your other suggestions soon
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: Kuehling, Felix 
> Sent: Tuesday, July 28, 2020 10:38 PM
> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to
> configure how many KCQ we want(v3)
>
On 2020-07-28 at 5:00 a.m., Monk Liu wrote:
>> what:
>> the MQD save and restore for KCQs (kernel compute queues) costs a lot of
>> clocks during world switch, which significantly hurts multi-VF
>> performance
>>
>> how:
>> introduce a parameter to control the number of KCQs, to avoid
>> a performance drop when no kernel compute queue is needed
>>
>> notes:
>> this parameter only affects gfx 8/9/10
>>
>> v2:
>> refine namings
>>
>> v3:
>> choose queues for each ring so that they spread across pipes as evenly as possible.
> Thanks. Some more suggestions for simplifications inline.
>
>
>> TODO:
>> in the future the hypervisor driver will set this parameter
>> automatically, so there will be no need for the user to configure it
>> through modprobe in the virtual machine
>>
>> Signed-off-by: Monk Liu 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 
>> +++---
>>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
>>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
>>  7 files changed, 87 insertions(+), 71 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index e97c088..de11136 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
>> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
>> +extern int amdgpu_num_kcq;
>>
>>  #define AMDGPU_VM_MAX_NUM_CTX4096
>>  #define AMDGPU_SG_THRESHOLD(256*1024*1024)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 62ecac9..cf445bab 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1199,6 +1199,11 @@ static int
>> amdgpu_device_check_arguments(struct
>> amdgpu_device *adev)
>>
>>  amdgpu_gmc_tmz_set(adev);
>>
>> +if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) { amdgpu_num_kcq = 8;
>> +dev_warn(adev->dev, "set kernel compute queue number to 8 due to
>> +invalid paramter provided by user\n"); }
>> +
>>  return 0;
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 6291f5f..b545c40 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -150,6 +150,7 @@ int amdgpu_noretry;  int amdgpu_force_asic_type =
>> -1;  int amdgpu_tmz = 0;  int amdgpu_reset_method = -1; /* auto */
>> +int amdgpu_num_kcq = -1;
>>
>>  struct amdgpu_mgpu_info mgpu_info = {  .mutex =
>> __MUTEX_INITIALIZER(mgpu_info.mutex),
>> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
>> MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto
>> 

[PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v4)

2020-07-30 Thread Monk Liu
what:
the MQD save and restore for KCQs (kernel compute queues)
costs a lot of clocks during world switch, which significantly
hurts multi-VF performance

how:
introduce a parameter to control the number of KCQs, to avoid
a performance drop when no kernel compute queue is needed

notes:
this parameter only affects gfx 8/9/10

v2:
refine namings

v3:
choose queues for each ring so that they spread across pipes as evenly as possible.

v4:
fix indentation
some cleanups in the gfx_compute_queue_acquire() function

TODO:
in the future the hypervisor driver will set this parameter
automatically, so there will be no need for the user to
configure it through modprobe in the virtual machine

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 52 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 +-
 7 files changed, 80 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..de11136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;
 #ifdef CONFIG_DRM_AMDGPU_CIK
 extern int amdgpu_cik_support;
 #endif
+extern int amdgpu_num_kcq;
 
 #define AMDGPU_VM_MAX_NUM_CTX  4096
 #define AMDGPU_SG_THRESHOLD(256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..cf445bab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
amdgpu_device *adev)
 
amdgpu_gmc_tmz_set(adev);
 
+   if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
+   amdgpu_num_kcq = 8;
+   dev_warn(adev->dev, "set kernel compute queue number to 8 due 
to invalid paramter provided by user\n");
+   }
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6291f5f..b545c40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
 int amdgpu_force_asic_type = -1;
 int amdgpu_tmz = 0;
 int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq = -1;
 
 struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
 MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
 module_param_named(reset_method, amdgpu_reset_method, int, 0444);
 
+MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup 
(8 if set to greater than 8 or less than 0, only affect gfx 8+)");
+module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
+
 static const struct pci_device_id pciidlist[] = {
 #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..b43df8e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,40 +202,34 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
amdgpu_device *adev,
 
 void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
 {
-   int i, queue, pipe, mec;
+   int i, queue, pipe;
bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
+   int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
+adev->gfx.mec.num_queue_per_pipe,
+adev->gfx.num_compute_rings);
+
+   if (multipipe_policy) {
+   /* policy: make queues evenly cross all pipes on MEC1 only */
+   for (i = 0; i < max_queues_per_mec; i++) {
+   pipe = i % adev->gfx.mec.num_pipe_per_mec;
+   queue = (i / adev->gfx.mec.num_pipe_per_mec) %
+   adev->gfx.mec.num_queue_per_pipe;
+
+   set_bit(pipe * adev->gfx.mec.num_queue_per_pipe + queue,
+   adev->gfx.mec.queue_bitmap);
+   }
+   } else {
+   /* policy: amdgpu owns all queues in the given pipe */
+   for (i = 0; i < max_queues_per_mec; ++i) {
+   queue = i % adev->gfx.mec.num_queue_per_pipe;
+   pipe = (i / adev->gfx.mec.num_queue_per_pipe)
+   % adev->gfx.mec.num_pipe_per_mec;
 
-   /* policy 
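
As a quick usage sketch (the value below is purely illustrative): loading the
driver with "modprobe amdgpu num_kcq=2", or putting a line such as
"options amdgpu num_kcq=2" into a file under /etc/modprobe.d/, asks the driver
to set up only two kernel compute queues on gfx 8/9/10; any value outside 0-8
(including the built-in default of -1) is reset to 8 by the check added in
amdgpu_device_check_arguments().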

Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-30 Thread Felix Kuehling

On 2020-07-30 at 10:11 p.m., Liu, Monk wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
>>>> Indentation looks wrong. Did you use the wrong TAB size?
> My TAB size is 4 spaces, and I don't know why it looks strange, but I can use
> spaces to replace the TABs anyway

The Linux kernel uses TABs for indentation, and the TAB width is 8. Do not
replace TABs with spaces for indentation; that would violate the kernel
coding style guide. Please change the settings of your editor.

See Documentation/process/coding-style.rst.
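
As a side note for anyone fighting the same editor problem: a minimal vim setup
that matches this (assuming vim, which was mentioned earlier in the thread) is
"set noexpandtab tabstop=8 shiftwidth=8", i.e. keep real TAB characters and
display them eight columns wide.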

Regards,
  Felix


>
> I will send a v4 patch addressing your other suggestions soon
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: Kuehling, Felix 
> Sent: Tuesday, July 28, 2020 10:38 PM
> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
> many KCQ we want(v3)
>
On 2020-07-28 at 5:00 a.m., Monk Liu wrote:
>> what:
>> the MQD save and restore for KCQs (kernel compute queues) costs a lot of
>> clocks during world switch, which significantly hurts multi-VF performance
>>
>> how:
>> introduce a parameter to control the number of KCQs, to avoid a
>> performance drop when no kernel compute queue is needed
>>
>> notes:
>> this parameter only affects gfx 8/9/10
>>
>> v2:
>> refine namings
>>
>> v3:
>> choose queues for each ring so that they spread across pipes as evenly as possible.
> Thanks. Some more suggestions for simplifications inline.
>
>
>> TODO:
>> in the future the hypervisor driver will set this parameter
>> automatically, so there will be no need for the user to configure it
>> through modprobe in the virtual machine
>>
>> Signed-off-by: Monk Liu 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 
>> +++---
>>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
>>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
>>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
>>  7 files changed, 87 insertions(+), 71 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index e97c088..de11136 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
>> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
>> +extern int amdgpu_num_kcq;
>>
>>  #define AMDGPU_VM_MAX_NUM_CTX4096
>>  #define AMDGPU_SG_THRESHOLD(256*1024*1024)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 62ecac9..cf445bab 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct
>> amdgpu_device *adev)
>>
>>  amdgpu_gmc_tmz_set(adev);
>>
>> +if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
>> +amdgpu_num_kcq = 8;
>> +dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid 
>> paramter provided by user\n");
>> +}
>> +
>>  return 0;
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 6291f5f..b545c40 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -150,6 +150,7 @@ int amdgpu_noretry;  int amdgpu_force_asic_type =
>> -1;  int amdgpu_tmz = 0;  int amdgpu_reset_method = -1; /* auto */
>> +int amdgpu_num_kcq = -1;
>>
>>  struct amdgpu_mgpu_info mgpu_info = {
>>  .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
>> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
>> MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default),
>> 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
>> module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>>
>> +MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want
>> +to setup (8 if set to greater than 8 or less than 0, only affect gfx
>> +8+)"); module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
>> +
>>  static const struct pci_device_id pciidlist[] = {  #ifdef
>&g

[PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v4)

2020-07-30 Thread Monk Liu
what:
the MQD save and restore for KCQs (kernel compute queues)
costs a lot of clocks during world switch, which significantly
hurts multi-VF performance

how:
introduce a parameter to control the number of KCQs, to avoid
a performance drop when no kernel compute queue is needed

notes:
this parameter only affects gfx 8/9/10

v2:
refine namings

v3:
choose queues for each ring so that they spread across pipes as evenly as possible.

v4:
fix indentation
some cleanups in the gfx_compute_queue_acquire() function

TODO:
in the future the hypervisor driver will set this parameter
automatically, so there will be no need for the user to
configure it through modprobe in the virtual machine

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 52 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 +-
 7 files changed, 80 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..de11136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;
 #ifdef CONFIG_DRM_AMDGPU_CIK
 extern int amdgpu_cik_support;
 #endif
+extern int amdgpu_num_kcq;
 
 #define AMDGPU_VM_MAX_NUM_CTX  4096
 #define AMDGPU_SG_THRESHOLD(256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..cf445bab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
amdgpu_device *adev)
 
amdgpu_gmc_tmz_set(adev);
 
+   if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
+   amdgpu_num_kcq = 8;
+   dev_warn(adev->dev, "set kernel compute queue number to 8 due 
to invalid paramter provided by user\n");
+   }
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6291f5f..b545c40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
 int amdgpu_force_asic_type = -1;
 int amdgpu_tmz = 0;
 int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq = -1;
 
 struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
 MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
 module_param_named(reset_method, amdgpu_reset_method, int, 0444);
 
+MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup 
(8 if set to greater than 8 or less than 0, only affect gfx 8+)");
+module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
+
 static const struct pci_device_id pciidlist[] = {
 #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..24b3461 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,40 +202,34 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
amdgpu_device *adev,
 
 void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
 {
-   int i, queue, pipe, mec;
+   int i, queue, pipe;
bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
+   int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
+adev->gfx.mec.num_queue_per_pipe,
+adev->gfx.num_compute_rings);
+
+   if (multipipe_policy) {
+   /* policy: make queues evenly cross all pipes on MEC1 only */
+   for (i = 0; i < max_queues_per_mec; i++) {
+   pipe = i % adev->gfx.mec.num_pipe_per_mec;
+   queue = (i / adev->gfx.mec.num_pipe_per_mec) %
+   adev->gfx.mec.num_queue_per_pipe;
+
+   set_bit(pipe * adev->gfx.mec.num_queue_per_pipe + queue,
+   adev->gfx.mec.queue_bitmap);
+   }
+   } else {
+   /* policy: amdgpu owns all queues in the given pipe */
+   for (i = 0; i < max_queues_per_mec; ++i) {
+   queue = i % adev->gfx.mec.num_queue_per_pipe;
+   pipe = (i / adev->gfx.mec.num_queue_per_pipe)
+   % adev->gfx.mec.num_pipe_per_mec;
 
-   /* policy 
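
To make the two allocation policies above concrete, here is a small stand-alone
sketch. The pipe/queue counts are made-up example values rather than real ASIC
numbers, and the program only mirrors the index math used by
amdgpu_gfx_compute_queue_acquire():

#include <stdio.h>

int main(void)
{
        /* example values only -- the real counts come from adev->gfx.mec */
        const int num_pipe_per_mec   = 4;
        const int num_queue_per_pipe = 8;
        const int num_compute_rings  = 8;   /* e.g. num_kcq left at 8 */
        int max_queues_per_mec = num_pipe_per_mec * num_queue_per_pipe;
        int i, pipe, queue;

        if (num_compute_rings < max_queues_per_mec)
                max_queues_per_mec = num_compute_rings;

        /* multipipe policy: walk the pipes round-robin so queues spread evenly */
        printf("multipipe policy:\n");
        for (i = 0; i < max_queues_per_mec; i++) {
                pipe = i % num_pipe_per_mec;
                queue = (i / num_pipe_per_mec) % num_queue_per_pipe;
                printf("  i=%d -> pipe %d, queue %d (bitmap bit %d)\n",
                       i, pipe, queue, pipe * num_queue_per_pipe + queue);
        }

        /* single-pipe policy: amdgpu fills one pipe before moving to the next */
        printf("single-pipe policy:\n");
        for (i = 0; i < max_queues_per_mec; i++) {
                queue = i % num_queue_per_pipe;
                pipe = (i / num_queue_per_pipe) % num_pipe_per_mec;
                printf("  i=%d -> pipe %d, queue %d (bitmap bit %d)\n",
                       i, pipe, queue, pipe * num_queue_per_pipe + queue);
        }

        return 0;
}

With these example counts the first loop places one queue on each of the four
pipes before wrapping around, while the second loop keeps all eight queues on
pipe 0; the printed bit index is what the patch passes to set_bit() on
adev->gfx.mec.queue_bitmap.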

RE: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-30 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

>>> Indentation looks wrong. Did you use the wrong TAB size?

My TAB size is 4 spaces, and I don't know why it looks strange, but I can use
spaces to replace the TABs anyway

I will send a v4 patch addressing your other suggestions soon

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Kuehling, Felix 
Sent: Tuesday, July 28, 2020 10:38 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v3)

On 2020-07-28 at 5:00 a.m., Monk Liu wrote:
> what:
> the MQD save and restore for KCQs (kernel compute queues) costs a lot of
> clocks during world switch, which significantly hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of KCQs, to avoid a
> performance drop when no kernel compute queue is needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> v2:
> refine namings
>
> v3:
> choose queues for each ring so that they spread across pipes as evenly as possible.

Thanks. Some more suggestions for simplifications inline.


>
> TODO:
> in the future the hypervisor driver will set this parameter
> automatically, so there will be no need for the user to configure it
> through modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 
> +++---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
>  7 files changed, 87 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..de11136 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
> +extern int amdgpu_num_kcq;
>
>  #define AMDGPU_VM_MAX_NUM_CTX4096
>  #define AMDGPU_SG_THRESHOLD(256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..cf445bab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct
> amdgpu_device *adev)
>
>  amdgpu_gmc_tmz_set(adev);
>
> +if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
> +amdgpu_num_kcq = 8;
> +dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid 
> paramter provided by user\n");
> +}
> +
>  return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..b545c40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;  int amdgpu_force_asic_type =
> -1;  int amdgpu_tmz = 0;  int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq = -1;
>
>  struct amdgpu_mgpu_info mgpu_info = {
>  .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
> MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default),
> 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
> module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>
> +MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want
> +to setup (8 if set to greater than 8 or less than 0, only affect gfx
> +8+)"); module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
> +
>  static const struct pci_device_id pciidlist[] = {  #ifdef
> CONFIG_DRM_AMDGPU_SI
>  {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI}, diff
> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 8eff017..f83a9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -202,40 +202,42 @@ bool
> amdgpu_gfx_is_high_priority_compute_queue(struct amdgpu_device *adev,
>
>  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)  {
> -int i, queue, pipe, mec;
> +int i, queue, pipe;
>  bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
> +int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
> + adev->gfx.mec.nu

Re: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-30 Thread Felix Kuehling
I already replied to v3 of your patch on Tuesday. Did you see my
comments about further simplifying the two queue allocation policies?

Regards,
  Felix

On 2020-07-29 at 11:03 p.m., Liu, Monk wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Ping
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: amd-gfx  On Behalf Of Liu, Monk
> Sent: Tuesday, July 28, 2020 5:32 PM
> To: Koenig, Christian ; amd-...@freedesktop.org
> Cc: Kuehling, Felix 
> Subject: RE: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure 
> how many KCQ we want(v3)
>
> [AMD Official Use Only - Internal Distribution Only]
>
> I re-sent the patch through git-send-email
>
> _
> Monk Liu|GPU Virtualization Team |AMD
>
>
> -Original Message-
> From: Koenig, Christian 
> Sent: Tuesday, July 28, 2020 5:04 PM
> To: Liu, Monk ; amd-...@freedesktop.org
> Cc: Kuehling, Felix 
> Subject: Re: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure 
> how many KCQ we want(v3)
>
> The patch looks totally mangled to me, e.g. some spaces and new lines are 
> missing.
>
> Probably because it was forwarded.
>
> Christian.
>
On 2020-07-28 at 10:59, Liu, Monk wrote:
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> -Original Message-
>> From: Monk Liu 
>> Sent: Tuesday, July 28, 2020 2:59 PM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Liu, Monk 
>> Subject: [PATCH] drm/amdgpu: introduce a new parameter to configure
>> how many KCQ we want(v3)
>>
>> what:
>> the MQD save and restore for KCQs (kernel compute queues) costs a lot of
>> clocks during world switch, which significantly hurts multi-VF performance
>>
>> how:
>> introduce a parameter to control the number of KCQs, to avoid a
>> performance drop when no kernel compute queue is needed
>>
>> notes:
>> this parameter only affects gfx 8/9/10
>>
>> v2:
>> refine namings
>>
>> v3:
>> choose queues for each ring so that they spread across pipes as evenly as possible.
>>
>> TODO:
>> in the future the hypervisor driver will set this parameter
>> automatically, so there will be no need for the user to configure it
>> through modprobe in the virtual machine
>>
>> Signed-off-by: Monk Liu 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 
>> +++---
>>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
>>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
>>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
>>   7 files changed, 87 insertions(+), 71 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index e97c088..de11136 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
>> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
>> +extern int amdgpu_num_kcq;
>>
>>   #define AMDGPU_VM_MAX_NUM_CTX4096
>>   #define AMDGPU_SG_THRESHOLD(256*1024*1024)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 62ecac9..cf445bab 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct
>> amdgpu_device *adev)
>>
>>   amdgpu_gmc_tmz_set(adev);
>>
>> +if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) { amdgpu_num_kcq = 8;
>> +dev_warn(adev->dev, "set kernel compute queue number to 8 due to
>> +invalid paramter provided by user\n"); }
>> +
>>   return 0;
>>   }
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 6291f5f..b545c40 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -150,6 +150,7 @@ int amdgpu_noretry;
>>   int amdgpu_force_asic_type = -1;
>>   int amdgpu_tmz = 0;
>>   int amdgpu_reset_method = -1; /* auto */
>> +int amdgpu_num_kcq = -1;
>>
>> 

Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-30 Thread Christian König

On 2020-07-28 at 11:00, Monk Liu wrote:

what:
the MQD save and restore for KCQs (kernel compute queues)
costs a lot of clocks during world switch, which significantly
hurts multi-VF performance

how:
introduce a parameter to control the number of KCQs, to avoid
a performance drop when no kernel compute queue is needed

notes:
this parameter only affects gfx 8/9/10

v2:
refine namings

v3:
choose queues for each ring so that they spread across pipes as evenly as possible.

TODO:
in the future the hypervisor driver will set this parameter
automatically, so there will be no need for the user to
configure it through modprobe in the virtual machine

Signed-off-by: Monk Liu 


I would still call it num_kernel_compute_queues instead of num_kcq, but
that is just a nitpick.


Acked-by: Christian König  for the patch, but 
I would ask Felix for his rb. He knows all this MES, pipe and queue 
stuff much better than I do.



---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 +++---
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
  7 files changed, 87 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..de11136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;
  #ifdef CONFIG_DRM_AMDGPU_CIK
  extern int amdgpu_cik_support;
  #endif
+extern int amdgpu_num_kcq;
  
  #define AMDGPU_VM_MAX_NUM_CTX			4096

  #define AMDGPU_SG_THRESHOLD   (256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..cf445bab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
amdgpu_device *adev)
  
  	amdgpu_gmc_tmz_set(adev);
  
+	if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {

+   amdgpu_num_kcq = 8;
+   dev_warn(adev->dev, "set kernel compute queue number to 8 due to 
invalid paramter provided by user\n");
+   }
+
return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 6291f5f..b545c40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
  int amdgpu_force_asic_type = -1;
  int amdgpu_tmz = 0;
  int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq = -1;
  
  struct amdgpu_mgpu_info mgpu_info = {

.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
  MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = legacy, 
1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
  module_param_named(reset_method, amdgpu_reset_method, int, 0444);
  
+MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup (8 if set to greater than 8 or less than 0, only affect gfx 8+)");

+module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
+
  static const struct pci_device_id pciidlist[] = {
  #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..f83a9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,40 +202,42 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
amdgpu_device *adev,
  
  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)

  {
-   int i, queue, pipe, mec;
+   int i, queue, pipe;
bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
+   int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
+
adev->gfx.mec.num_queue_per_pipe,
+
adev->gfx.num_compute_rings);
+
+   if (multipipe_policy) {
+   /* policy: make queues evenly cross all pipes on MEC1 only */
+   for (i = 0; i < max_queues_per_mec; i++) {
+   pipe = i % adev->gfx.mec.num_pipe_per_mec;
+   queue = (i / adev->gfx.mec.num_pipe_per_mec) %
+   adev->gfx.mec.num_queue_per_pipe;
+
+   set_bit(pipe * adev->gfx.mec.num_queue_per_pipe + queue,
+   adev->gfx.mec.queue_bitmap);
+   }
+   } else {
+   int mec;
  
-	/* policy for amdgpu compute queue ownership */

- 

RE: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-29 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

Ping

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: amd-gfx  On Behalf Of Liu, Monk
Sent: Tuesday, July 28, 2020 5:32 PM
To: Koenig, Christian ; amd-...@freedesktop.org
Cc: Kuehling, Felix 
Subject: RE: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v3)

[AMD Official Use Only - Internal Distribution Only]

I re-sent the patch through git-send-email

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Koenig, Christian 
Sent: Tuesday, July 28, 2020 5:04 PM
To: Liu, Monk ; amd-...@freedesktop.org
Cc: Kuehling, Felix 
Subject: Re: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v3)

The patch looks totally mangled to me, e.g. some spaces and new lines are 
missing.

Probably because it was forwarded.

Christian.

On 2020-07-28 at 10:59, Liu, Monk wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> -Original Message-
> From: Monk Liu 
> Sent: Tuesday, July 28, 2020 2:59 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Liu, Monk 
> Subject: [PATCH] drm/amdgpu: introduce a new parameter to configure
> how many KCQ we want(v3)
>
> what:
> the MQD save and restore for KCQs (kernel compute queues) costs a lot of
> clocks during world switch, which significantly hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of KCQs, to avoid a
> performance drop when no kernel compute queue is needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> v2:
> refine namings
>
> v3:
> choose queues for each ring so that they spread across pipes as evenly as possible.
>
> TODO:
> in the future the hypervisor driver will set this parameter
> automatically, so there will be no need for the user to configure it
> through modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 
> +++---
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
>   7 files changed, 87 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..de11136 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
> +extern int amdgpu_num_kcq;
>
>   #define AMDGPU_VM_MAX_NUM_CTX4096
>   #define AMDGPU_SG_THRESHOLD(256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..cf445bab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct
> amdgpu_device *adev)
>
>   amdgpu_gmc_tmz_set(adev);
>
> +if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) { amdgpu_num_kcq = 8;
> +dev_warn(adev->dev, "set kernel compute queue number to 8 due to
> +invalid paramter provided by user\n"); }
> +
>   return 0;
>   }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..b545c40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;
>   int amdgpu_force_asic_type = -1;
>   int amdgpu_tmz = 0;
>   int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq = -1;
>
>   struct amdgpu_mgpu_info mgpu_info = {
>   .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
> MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default),
> 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
> module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>
> +MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want
> +to setup (8 if set to greater than 8 or less than 0, only affect gfx
> +8+)"); module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
> +
>   static const struct pci_device_id pciidlist[] = {  #ifdef  
> CONFIG_DRM_AMDGPU_SI
>   {0x1002, 0x6780, PCI_ANY_ID, P

Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-28 Thread Felix Kuehling
On 2020-07-28 at 5:00 a.m., Monk Liu wrote:
> what:
> the MQD save and restore for KCQs (kernel compute queues)
> costs a lot of clocks during world switch, which significantly
> hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of KCQs, to avoid
> a performance drop when no kernel compute queue is needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> v2:
> refine namings
>
> v3:
> choose queues for each ring so that they spread across pipes as evenly as possible.

Thanks. Some more suggestions for simplifications inline.


>
> TODO:
> in the future the hypervisor driver will set this parameter
> automatically, so there will be no need for the user to
> configure it through modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 
> +++---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
>  7 files changed, 87 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..de11136 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;
>  #ifdef CONFIG_DRM_AMDGPU_CIK
>  extern int amdgpu_cik_support;
>  #endif
> +extern int amdgpu_num_kcq;
>  
>  #define AMDGPU_VM_MAX_NUM_CTX4096
>  #define AMDGPU_SG_THRESHOLD  (256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..cf445bab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
> amdgpu_device *adev)
>  
>   amdgpu_gmc_tmz_set(adev);
>  
> + if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
> + amdgpu_num_kcq = 8;
> + dev_warn(adev->dev, "set kernel compute queue number to 8 due 
> to invalid paramter provided by user\n");
> + }
> +
>   return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..b545c40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;
>  int amdgpu_force_asic_type = -1;
>  int amdgpu_tmz = 0;
>  int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq = -1;
>  
>  struct amdgpu_mgpu_info mgpu_info = {
>   .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
>  MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
> legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
>  module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>  
> +MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup 
> (8 if set to greater than 8 or less than 0, only affect gfx 8+)");
> +module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
> +
>  static const struct pci_device_id pciidlist[] = {
>  #ifdef  CONFIG_DRM_AMDGPU_SI
>   {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 8eff017..f83a9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -202,40 +202,42 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
> amdgpu_device *adev,
>  
>  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
>  {
> - int i, queue, pipe, mec;
> + int i, queue, pipe;
>   bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
> + int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
> +  
> adev->gfx.mec.num_queue_per_pipe,
> +  
> adev->gfx.num_compute_rings);

Indentation looks wrong. Did you use the wrong TAB size?


> +
> + if (multipipe_policy) {
> + /* policy: make queues evenly cross all pipes on MEC1 only */
> + for (i = 0; i < max_queues_per_mec; i++) {
> + pipe = i % adev->gfx.mec.num_pipe_per_mec;
> + queue = (i / adev->gfx.mec.num_pipe_per_mec) %
> + adev->gfx.mec.num_queue_per_pipe;
> +
> + set_bit(pipe * adev->gfx.mec.num_queue_per_pipe + queue,
> + adev->gfx.mec.queue_bitmap);
> + }
> + } else {
> + int 

RE: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-28 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

I re-sent the patch through git-send-email

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Koenig, Christian 
Sent: Tuesday, July 28, 2020 5:04 PM
To: Liu, Monk ; amd-...@freedesktop.org
Cc: Kuehling, Felix 
Subject: Re: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v3)

The patch looks totally mangled to me, e.g. some spaces and new lines are 
missing.

Probably because it was forwarded.

Christian.

On 2020-07-28 at 10:59, Liu, Monk wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> -Original Message-
> From: Monk Liu 
> Sent: Tuesday, July 28, 2020 2:59 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Liu, Monk 
> Subject: [PATCH] drm/amdgpu: introduce a new parameter to configure
> how many KCQ we want(v3)
>
> what:
> the MQD save and restore for KCQs (kernel compute queues) costs a lot of
> clocks during world switch, which significantly hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of KCQs, to avoid a
> performance drop when no kernel compute queue is needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> v2:
> refine namings
>
> v3:
> choose queues for each ring so that they spread across pipes as evenly as possible.
>
> TODO:
> in the future the hypervisor driver will set this parameter
> automatically, so there will be no need for the user to configure it
> through modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 
> +++---
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
>   7 files changed, 87 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..de11136 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
> +extern int amdgpu_num_kcq;
>
>   #define AMDGPU_VM_MAX_NUM_CTX4096
>   #define AMDGPU_SG_THRESHOLD(256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..cf445bab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct
> amdgpu_device *adev)
>
>   amdgpu_gmc_tmz_set(adev);
>
> +if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) { amdgpu_num_kcq = 8;
> +dev_warn(adev->dev, "set kernel compute queue number to 8 due to
> +invalid paramter provided by user\n"); }
> +
>   return 0;
>   }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..b545c40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;
>   int amdgpu_force_asic_type = -1;
>   int amdgpu_tmz = 0;
>   int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq = -1;
>
>   struct amdgpu_mgpu_info mgpu_info = {
>   .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
> MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default),
> 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
> module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>
> +MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want
> +to setup (8 if set to greater than 8 or less than 0, only affect gfx
> +8+)"); module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
> +
>   static const struct pci_device_id pciidlist[] = {  #ifdef  
> CONFIG_DRM_AMDGPU_SI
>   {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI}, diff
> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 8eff017..f83a9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -202,40 +202,42 @@ bool
> amdgpu_gfx_is_high_priority_compute_queue(struct amdgpu_device *adev,
>
>   void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)  {

Re: FW: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-28 Thread Christian König
The patch looks totally mangled to me, e.g. some spaces and new lines 
are missing.


Probably because it was forwarded.

Christian.

On 2020-07-28 at 10:59, Liu, Monk wrote:

[AMD Official Use Only - Internal Distribution Only]

-Original Message-
From: Monk Liu 
Sent: Tuesday, July 28, 2020 2:59 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk 
Subject: [PATCH] drm/amdgpu: introduce a new parameter to configure how many 
KCQ we want(v3)

what:
the MQD save and restore of KCQs (kernel compute queues) costs a lot of clock
cycles during world switch, which significantly hurts multi-VF performance

how:
introduce a parameter to control the number of KCQs, to avoid a performance drop
when there is no kernel compute queue needed

notes:
this parameter only affects gfx 8/9/10

v2:
refine namings

v3:
choose the queues for the rings so that they spread across the pipes as evenly as possible.

TODO:
in the future we will let the hypervisor driver set this parameter automatically,
so there is no need for the user to configure it through modprobe in the virtual machine

Signed-off-by: Monk Liu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 +++---
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
  7 files changed, 87 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..de11136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef CONFIG_DRM_AMDGPU_CIK 
 extern int amdgpu_cik_support;  #endif
+extern int amdgpu_num_kcq;

  #define AMDGPU_VM_MAX_NUM_CTX4096
  #define AMDGPU_SG_THRESHOLD(256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..cf445bab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
amdgpu_device *adev)

  amdgpu_gmc_tmz_set(adev);

+if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
+amdgpu_num_kcq = 8;
+dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid paramter 
provided by user\n");
+}
+
  return 0;
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6291f5f..b545c40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
  int amdgpu_force_asic_type = -1;
  int amdgpu_tmz = 0;
  int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq = -1;

  struct amdgpu_mgpu_info mgpu_info = {
  .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);  
MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = legacy, 1 
= mode0, 2 = mode1, 3 = mode2, 4 = baco)");  module_param_named(reset_method, 
amdgpu_reset_method, int, 0444);

+MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to
+setup (8 if set to greater than 8 or less than 0, only affect gfx
+8+)"); module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
+
  static const struct pci_device_id pciidlist[] = {  #ifdef  
CONFIG_DRM_AMDGPU_SI
  {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI}, diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..f83a9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,40 +202,42 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
amdgpu_device *adev,

  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)  {
-int i, queue, pipe, mec;
+int i, queue, pipe;
  bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
+int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
+ adev->gfx.mec.num_queue_per_pipe,
+ adev->gfx.num_compute_rings);
+
+if (multipipe_policy) {
+/* policy: make queues evenly cross all pipes on MEC1 only */
+for (i = 0; i < max_queues_per_mec; i++) {
+pipe = i % adev->gfx.mec.num_pipe_per_mec;
+queue = (i / adev->gfx.mec.num_pipe_per_mec) %
+adev->gfx.mec.num_queue_per_pipe;
+
+set_bit(pipe * adev->gfx.mec.num_queue_per_pipe + queue,
+adev->gfx.mec.queue_bitmap);
+}
+} else {
+int mec;

-/* policy for amdgpu compute queue ownership */
-for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) {
-queue = i % adev->gfx.mec.num_queue_per_pipe;
-pipe = (i / adev->gfx.mec.num_queue_per_pipe)
-% adev->gfx.mec.num_pipe_per_mec;
-mec = (i / adev->gfx.mec.num_queue_per_pipe)

[PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v3)

2020-07-28 Thread Monk Liu
what:
the MQD save and restore of KCQs (kernel compute queues)
costs a lot of clock cycles during world switch, which
significantly hurts multi-VF performance

how:
introduce a parameter to control the number of KCQs, to avoid a
performance drop when there is no kernel compute queue needed

notes:
this parameter only affects gfx 8/9/10

v2:
refine namings

v3:
choose the queues for the rings so that they spread across the pipes as evenly as possible.

TODO:
in the future we will let the hypervisor driver set this parameter
automatically, so there is no need for the user to configure it
through modprobe in the virtual machine

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 58 +++---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 +++
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
 7 files changed, 87 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..de11136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;
 #ifdef CONFIG_DRM_AMDGPU_CIK
 extern int amdgpu_cik_support;
 #endif
+extern int amdgpu_num_kcq;
 
 #define AMDGPU_VM_MAX_NUM_CTX  4096
 #define AMDGPU_SG_THRESHOLD(256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..cf445bab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
amdgpu_device *adev)
 
amdgpu_gmc_tmz_set(adev);
 
+   if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
+   amdgpu_num_kcq = 8;
+   dev_warn(adev->dev, "set kernel compute queue number to 8 due 
to invalid paramter provided by user\n");
+   }
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6291f5f..b545c40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
 int amdgpu_force_asic_type = -1;
 int amdgpu_tmz = 0;
 int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq = -1;
 
 struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
 MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
 module_param_named(reset_method, amdgpu_reset_method, int, 0444);
 
+MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup 
(8 if set to greater than 8 or less than 0, only affect gfx 8+)");
+module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
+
 static const struct pci_device_id pciidlist[] = {
 #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..f83a9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,40 +202,42 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
amdgpu_device *adev,
 
 void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
 {
-   int i, queue, pipe, mec;
+   int i, queue, pipe;
bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
+   int max_queues_per_mec = min(adev->gfx.mec.num_pipe_per_mec *
+
adev->gfx.mec.num_queue_per_pipe,
+
adev->gfx.num_compute_rings);
+
+   if (multipipe_policy) {
+   /* policy: make queues evenly cross all pipes on MEC1 only */
+   for (i = 0; i < max_queues_per_mec; i++) {
+   pipe = i % adev->gfx.mec.num_pipe_per_mec;
+   queue = (i / adev->gfx.mec.num_pipe_per_mec) %
+   adev->gfx.mec.num_queue_per_pipe;
+
+   set_bit(pipe * adev->gfx.mec.num_queue_per_pipe + queue,
+   adev->gfx.mec.queue_bitmap);
+   }
+   } else {
+   int mec;
 
-   /* policy for amdgpu compute queue ownership */
-   for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) {
-   queue = i % adev->gfx.mec.num_queue_per_pipe;
-   pipe = (i / adev->gfx.mec.num_queue_per_pipe)
-   % adev->gfx.mec.num_pipe_per_mec;
-   mec = (i / adev->gfx.mec.num_queue_per_pipe)

RE: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v2)

2020-07-28 Thread Liu, Monk
[AMD Official Use Only - Internal Distribution Only]

Thanks Felix

I reworked my patch with your suggestion, and the queues are now distributed evenly
across the pipes, e.g. with: modprobe amdgpu num_kcq=6

[  409.878557] amdgpu :00:07.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[  409.878559] amdgpu :00:07.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[  409.878561] amdgpu :00:07.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[  409.878563] amdgpu :00:07.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[  409.878565] amdgpu :00:07.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[  409.878567] amdgpu :00:07.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[  409.878568] amdgpu :00:07.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 9 on hub 0
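
Assuming the ring names follow comp_<mec>.<pipe>.<queue> and MEC1 has 4 pipes
(both assumptions, not stated in the log itself), the selection order above
matches the new mapping pipe = i % num_pipe_per_mec,
queue = (i / num_pipe_per_mec) % num_queue_per_pipe:

i = 0 -> pipe 0, queue 0 (comp_1.0.0)
i = 1 -> pipe 1, queue 0 (comp_1.1.0)
i = 2 -> pipe 2, queue 0 (comp_1.2.0)
i = 3 -> pipe 3, queue 0 (comp_1.3.0)
i = 4 -> pipe 0, queue 1 (comp_1.0.1)
i = 5 -> pipe 1, queue 1 (comp_1.1.1)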

Please review the upcoming patch.

_
Monk Liu|GPU Virtualization Team |AMD


-Original Message-
From: Kuehling, Felix 
Sent: Tuesday, July 28, 2020 7:33 AM
To: amd-gfx@lists.freedesktop.org; Liu, Monk 
Subject: Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how 
many KCQ we want(v2)

On 2020-07-27 at 6:47 a.m., Monk Liu wrote:
> what:
> the MQD save and restore of kernel compute queues costs a lot of clock
> cycles during world switch, which significantly hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of kernel compute queues, to
> avoid a performance drop when there is no kernel compute queue needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> TODO:
> in the future we will let the hypervisor driver set this parameter
> automatically, so there is no need for the user to configure it through
> modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 27 +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +++--
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 ++--
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
> +++---
>  7 files changed, 71 insertions(+), 56 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..71a3d6a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;  #ifdef
> CONFIG_DRM_AMDGPU_CIK  extern int amdgpu_cik_support;  #endif
> +extern int amdgpu_num_kcq_user_set;
>
>  #define AMDGPU_VM_MAX_NUM_CTX4096
>  #define AMDGPU_SG_THRESHOLD(256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..18b93ef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct
> amdgpu_device *adev)
>
>  amdgpu_gmc_tmz_set(adev);
>
> +if (amdgpu_num_kcq_user_set > 8 || amdgpu_num_kcq_user_set < 0) {
> +amdgpu_num_kcq_user_set = 8;
> +dev_warn(adev-dev, "set KCQ number to 8 due to invalid paramter provided by 
> user\n");
> +}
> +
>  return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..03a94e9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;  int amdgpu_force_asic_type =
> -1;  int amdgpu_tmz = 0;  int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq_user_set = 8;
>
>  struct amdgpu_mgpu_info mgpu_info = {
>  .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
> MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default),
> 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
> module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>
> +MODULE_PARM_DESC(num_kcq, "number of KCQ user want to setup (8 if set
> +to greater than 8 or less than 0, only affect gfx 8+)");
> +module_param_named(num_kcq, amdgpu_num_kcq_user_set, int, 0444);
> +
>  static const struct pci_device_id pciidlist[] = {  #ifdef
> CONFIG_DRM_AMDGPU_SI
>  {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI}, diff
> --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 8eff017..0b59049 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> 

Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v2)

2020-07-27 Thread Felix Kuehling
On 2020-07-27 at 6:47 a.m., Monk Liu wrote:
> what:
> the MQD save and restore of kernel compute queues costs a lot of clock
> cycles during world switch, which significantly hurts multi-VF performance
>
> how:
> introduce a parameter to control the number of kernel compute queues, to
> avoid a performance drop when there is no kernel compute queue needed
>
> notes:
> this parameter only affects gfx 8/9/10
>
> TODO:
> in the future we will let the hypervisor driver set this parameter
> automatically, so there is no need for the user to configure it through
> modprobe in the virtual machine
>
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 27 +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +++--
>  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 ++--
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 
> +++---
>  7 files changed, 71 insertions(+), 56 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index e97c088..71a3d6a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;
>  #ifdef CONFIG_DRM_AMDGPU_CIK
>  extern int amdgpu_cik_support;
>  #endif
> +extern int amdgpu_num_kcq_user_set;
>  
>  #define AMDGPU_VM_MAX_NUM_CTX4096
>  #define AMDGPU_SG_THRESHOLD  (256*1024*1024)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 62ecac9..18b93ef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
> amdgpu_device *adev)
>  
>   amdgpu_gmc_tmz_set(adev);
>  
> + if (amdgpu_num_kcq_user_set > 8 || amdgpu_num_kcq_user_set < 0) {
> + amdgpu_num_kcq_user_set = 8;
> + dev_warn(adev-dev, "set KCQ number to 8 due to invalid paramter 
> provided by user\n");
> + }
> +
>   return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6291f5f..03a94e9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -150,6 +150,7 @@ int amdgpu_noretry;
>  int amdgpu_force_asic_type = -1;
>  int amdgpu_tmz = 0;
>  int amdgpu_reset_method = -1; /* auto */
> +int amdgpu_num_kcq_user_set = 8;
>  
>  struct amdgpu_mgpu_info mgpu_info = {
>   .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
>  MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
> legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
>  module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>  
> +MODULE_PARM_DESC(num_kcq, "number of KCQ user want to setup (8 if set to 
> greater than 8 or less than 0, only affect gfx 8+)");
> +module_param_named(num_kcq, amdgpu_num_kcq_user_set, int, 0444);
> +
>  static const struct pci_device_id pciidlist[] = {
>  #ifdef  CONFIG_DRM_AMDGPU_SI
>   {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 8eff017..0b59049 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -202,7 +202,7 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
> amdgpu_device *adev,
>  
>  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
>  {
> - int i, queue, pipe, mec;
> + int i, queue, pipe, mec, j = 0;
>   bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
>  
>   /* policy for amdgpu compute queue ownership */
> @@ -219,23 +219,24 @@ void amdgpu_gfx_compute_queue_acquire(struct 
> amdgpu_device *adev)
>  
>   if (multipipe_policy) {
>   /* policy: amdgpu owns the first two queues of the 
> first MEC */
> - if (mec == 0 && queue < 2)
> - set_bit(i, adev->gfx.mec.queue_bitmap);
> + if (mec == 0 && queue < 2) {
> + if (j++ < adev->gfx.num_compute_rings)

This is not ideal, because it wouldn't distribute the queues evenly
across the pipes if there are fewer than 7 queues. I would change how queue and
pipe are calculated from i for the multipipe_policy case:

if (multipipe_policy) {
pipe = i % adev->gfx.mec.num_pipe_per_mec;
queue = (i / adev->gfx.mec.num_pipe_per_mec)
% adev->gfx.mec.num_queue_per_pipe;
} else {
/* previous way */
}
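
For illustration, here is a small self-contained sketch (host-side only, not
driver code; the pipe/queue counts and ring count are assumptions for a typical
MEC with 4 pipes and 8 queues per pipe) of the distribution this calculation
produces:

#include <stdio.h>

int main(void)
{
	const int num_pipe_per_mec = 4;		/* assumption */
	const int num_queue_per_pipe = 8;	/* assumption */
	const int num_compute_rings = 6;	/* e.g. modprobe amdgpu num_kcq=6 */
	int i, pipe, queue;

	for (i = 0; i < num_compute_rings; i++) {
		pipe = i % num_pipe_per_mec;
		queue = (i / num_pipe_per_mec) % num_queue_per_pipe;
		/* bit index a queue_bitmap-style scheme would set:
		 * pipe * num_queue_per_pipe + queue
		 */
		printf("ring %d -> pipe %d, queue %d (bit %d)\n",
		       i, pipe, queue, pipe * num_queue_per_pipe + queue);
	}
	return 0;
}

With these counts and 6 rings this picks queue 0 on pipes 0-3 and queue 1 on
pipes 0-1, whereas the j++ counting above would have taken two queues from each
of the first three pipes and left pipe 3 idle.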


Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want

2020-07-27 Thread Christian König

On 2020-07-27 at 18:02, Felix Kuehling wrote:

On 2020-07-27 at 5:26 a.m., Christian König wrote:

On 2020-07-27 at 10:21, Monk Liu wrote:

what:
KCQs cost many clock cycles during world switch, which significantly hurts
multi-VF performance

how:
introduce a parameter to control the number of KCQs, to avoid a performance
drop when no KCQ is needed

notes:
this parameter only affects gfx 8/9/10

Sounds like a good idea to me, but that needs a different name.
Outside AMD most people don't know what a KCQ is.

Just use compute queue or something similar as the name for this.

Just "compute queue" would be confusing for ROCm users. Maybe "legacy
compute queues"?


"kernel compute queues" is just fine, we just shouldn't shorten it.

And we should especially drop the "_user_set" postfix.

Regards,
Christian.



Regards,
   Felix



Another comment below.


Signed-off-by: Monk Liu 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 +
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +++
   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  4 
   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 27
+-
   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30
+++--
   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29
++--
   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31
+++---
   7 files changed, 69 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..71a3d6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;
   #ifdef CONFIG_DRM_AMDGPU_CIK
   extern int amdgpu_cik_support;
   #endif
+extern int amdgpu_num_kcq_user_set;
     #define AMDGPU_VM_MAX_NUM_CTX    4096
   #define AMDGPU_SG_THRESHOLD    (256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..61c7583 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,9 @@ static int amdgpu_device_check_arguments(struct
amdgpu_device *adev)
     amdgpu_gmc_tmz_set(adev);
   +    if (amdgpu_num_kcq_user_set > 8 || amdgpu_num_kcq_user_set < 0)
+    amdgpu_num_kcq_user_set = 8;

This needs a warning or error message if we overwrite invalid user
provided parameters.

Christian.


+
   return 0;
   }
   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6291f5f..03a94e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
   int amdgpu_force_asic_type = -1;
   int amdgpu_tmz = 0;
   int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq_user_set = 8;
     struct amdgpu_mgpu_info mgpu_info = {
   .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
   MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto
(default), 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
   module_param_named(reset_method, amdgpu_reset_method, int, 0444);
   +MODULE_PARM_DESC(num_kcq, "number of KCQ user want to setup (8 if
set to greater than 8 or less than 0, only affect gfx 8+)");
+module_param_named(num_kcq, amdgpu_num_kcq_user_set, int, 0444);
+
   static const struct pci_device_id pciidlist[] = {
   #ifdef  CONFIG_DRM_AMDGPU_SI
   {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..0b59049 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,7 +202,7 @@ bool
amdgpu_gfx_is_high_priority_compute_queue(struct amdgpu_device *adev,
     void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
   {
-    int i, queue, pipe, mec;
+    int i, queue, pipe, mec, j = 0;
   bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
     /* policy for amdgpu compute queue ownership */
@@ -219,23 +219,24 @@ void amdgpu_gfx_compute_queue_acquire(struct
amdgpu_device *adev)
     if (multipipe_policy) {
   /* policy: amdgpu owns the first two queues of the
first MEC */
-    if (mec == 0 && queue < 2)
-    set_bit(i, adev->gfx.mec.queue_bitmap);
+    if (mec == 0 && queue < 2) {
+    if (j++ < adev->gfx.num_compute_rings)
+    set_bit(i, adev->gfx.mec.queue_bitmap);
+    else
+    break;
+    }
   } else {
   /* policy: amdgpu owns all queues in the first pipe */
-    if (mec == 0 && pipe == 0)
-    set_bit(i, adev->gfx.mec.queue_bitmap);
+    if (mec == 0 && pipe == 0) {
+    if (j++ < adev->gfx.num_compute_rings)
+    

Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want

2020-07-27 Thread Felix Kuehling
On 2020-07-27 at 5:26 a.m., Christian König wrote:
> On 2020-07-27 at 10:21, Monk Liu wrote:
>> what:
>> KCQs cost many clock cycles during world switch, which significantly hurts
>> multi-VF performance
>>
>> how:
>> introduce a parameter to control the number of KCQs, to avoid a performance
>> drop when no KCQ is needed
>>
>> notes:
>> this parameter only affects gfx 8/9/10
>
> Sounds like a good idea to me, but that needs a different name.
> Outside AMD most people don't know what a KCQ is.
>
> Just use compute queue or something similar as the name for this.

Just "compute queue" would be confusing for ROCm users. Maybe "legacy
compute queues"?

Regards,
  Felix


>
> Another comment below.
>
>>
>> Signed-off-by: Monk Liu 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  4 
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c    | 27
>> +-
>>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30
>> +++--
>>   drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29
>> ++--
>>   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31
>> +++---
>>   7 files changed, 69 insertions(+), 56 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index e97c088..71a3d6a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -201,6 +201,7 @@ extern int amdgpu_si_support;
>>   #ifdef CONFIG_DRM_AMDGPU_CIK
>>   extern int amdgpu_cik_support;
>>   #endif
>> +extern int amdgpu_num_kcq_user_set;
>>     #define AMDGPU_VM_MAX_NUM_CTX    4096
>>   #define AMDGPU_SG_THRESHOLD    (256*1024*1024)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 62ecac9..61c7583 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1199,6 +1199,9 @@ static int amdgpu_device_check_arguments(struct
>> amdgpu_device *adev)
>>     amdgpu_gmc_tmz_set(adev);
>>   +    if (amdgpu_num_kcq_user_set > 8 || amdgpu_num_kcq_user_set < 0)
>> +    amdgpu_num_kcq_user_set = 8;
>
> This needs a warning or error message if we overwrite invalid user
> provided parameters.
>
> Christian.
>
>> +
>>   return 0;
>>   }
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 6291f5f..03a94e9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -150,6 +150,7 @@ int amdgpu_noretry;
>>   int amdgpu_force_asic_type = -1;
>>   int amdgpu_tmz = 0;
>>   int amdgpu_reset_method = -1; /* auto */
>> +int amdgpu_num_kcq_user_set = 8;
>>     struct amdgpu_mgpu_info mgpu_info = {
>>   .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
>> @@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
>>   MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto
>> (default), 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
>>   module_param_named(reset_method, amdgpu_reset_method, int, 0444);
>>   +MODULE_PARM_DESC(num_kcq, "number of KCQ user want to setup (8 if
>> set to greater than 8 or less than 0, only affect gfx 8+)");
>> +module_param_named(num_kcq, amdgpu_num_kcq_user_set, int, 0444);
>> +
>>   static const struct pci_device_id pciidlist[] = {
>>   #ifdef  CONFIG_DRM_AMDGPU_SI
>>   {0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index 8eff017..0b59049 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -202,7 +202,7 @@ bool
>> amdgpu_gfx_is_high_priority_compute_queue(struct amdgpu_device *adev,
>>     void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
>>   {
>> -    int i, queue, pipe, mec;
>> +    int i, queue, pipe, mec, j = 0;
>>   bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
>>     /* policy for amdgpu compute queue ownership */
>> @@ -219,23 +219,24 @@ void amdgpu_gfx_compute_queue_acquire(struct
>> amdgpu_device *adev)
>>     if (multipipe_policy) {
>>   /* policy: amdgpu owns the first two queues of the
>> first MEC */
>> -    if (mec == 0 && queue < 2)
>> -    set_bit(i, adev->gfx.mec.queue_bitmap);
>> +    if (mec == 0 && queue < 2) {
>> +    if (j++ < adev->gfx.num_compute_rings)
>> +    set_bit(i, adev->gfx.mec.queue_bitmap);
>> +    else
>> +    break;
>> +    }
>>   } else {
>>   /* policy: amdgpu owns all queues in the first pipe */
>> -    if (mec == 0 && pipe == 0)
>> -    set_bit(i, adev->gfx.mec.queue_bitmap);
>> +    if (mec 

[PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v2)

2020-07-27 Thread Monk Liu
what:
the MQD save and restore of kernel compute queues costs a lot of clock cycles
during world switch, which significantly hurts multi-VF performance

how:
introduce a parameter to control the number of kernel compute queues, to
avoid a performance drop when there is no kernel compute queue needed

notes:
this parameter only affects gfx 8/9/10

TODO:
in the future we will let the hypervisor driver set this parameter
automatically, so there is no need for the user to configure it through
modprobe in the virtual machine

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 27 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +++--
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 +++---
 7 files changed, 71 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..71a3d6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;
 #ifdef CONFIG_DRM_AMDGPU_CIK
 extern int amdgpu_cik_support;
 #endif
+extern int amdgpu_num_kcq_user_set;
 
 #define AMDGPU_VM_MAX_NUM_CTX  4096
 #define AMDGPU_SG_THRESHOLD(256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..18b93ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,11 @@ static int amdgpu_device_check_arguments(struct 
amdgpu_device *adev)
 
amdgpu_gmc_tmz_set(adev);
 
+   if (amdgpu_num_kcq_user_set > 8 || amdgpu_num_kcq_user_set < 0) {
+   amdgpu_num_kcq_user_set = 8;
+   dev_warn(adev-dev, "set KCQ number to 8 due to invalid paramter 
provided by user\n");
+   }
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6291f5f..03a94e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
 int amdgpu_force_asic_type = -1;
 int amdgpu_tmz = 0;
 int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq_user_set = 8;
 
 struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
 MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
 module_param_named(reset_method, amdgpu_reset_method, int, 0444);
 
+MODULE_PARM_DESC(num_kcq, "number of KCQ user want to setup (8 if set to 
greater than 8 or less than 0, only affect gfx 8+)");
+module_param_named(num_kcq, amdgpu_num_kcq_user_set, int, 0444);
+
 static const struct pci_device_id pciidlist[] = {
 #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..0b59049 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,7 +202,7 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
amdgpu_device *adev,
 
 void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
 {
-   int i, queue, pipe, mec;
+   int i, queue, pipe, mec, j = 0;
bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
 
/* policy for amdgpu compute queue ownership */
@@ -219,23 +219,24 @@ void amdgpu_gfx_compute_queue_acquire(struct 
amdgpu_device *adev)
 
if (multipipe_policy) {
/* policy: amdgpu owns the first two queues of the 
first MEC */
-   if (mec == 0 && queue < 2)
-   set_bit(i, adev->gfx.mec.queue_bitmap);
+   if (mec == 0 && queue < 2) {
+   if (j++ < adev->gfx.num_compute_rings)
+   set_bit(i, adev->gfx.mec.queue_bitmap);
+   else
+   break;
+   }
} else {
/* policy: amdgpu owns all queues in the first pipe */
-   if (mec == 0 && pipe == 0)
-   set_bit(i, adev->gfx.mec.queue_bitmap);
+   if (mec == 0 && pipe == 0) {
+   if (j++ < adev->gfx.num_compute_rings)
+   set_bit(i, adev->gfx.mec.queue_bitmap);
+   else
+   

Re: [PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want

2020-07-27 Thread Christian König

On 2020-07-27 at 10:21, Monk Liu wrote:

what:
KCQs cost many clock cycles during world switch, which significantly hurts
multi-VF performance

how:
introduce a parameter to control the number of KCQs, to avoid a performance
drop when no KCQ is needed

notes:
this parameter only affects gfx 8/9/10


Sounds like a good idea to me, but that needs a different name. Outside 
AMD most people don't know what a KCQ is.


Just use compute queue or something similar as the name for this.

Another comment below.



Signed-off-by: Monk Liu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 27 +-
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +++--
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 ++--
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 +++---
  7 files changed, 69 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..71a3d6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;
  #ifdef CONFIG_DRM_AMDGPU_CIK
  extern int amdgpu_cik_support;
  #endif
+extern int amdgpu_num_kcq_user_set;
  
  #define AMDGPU_VM_MAX_NUM_CTX			4096

  #define AMDGPU_SG_THRESHOLD   (256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..61c7583 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,9 @@ static int amdgpu_device_check_arguments(struct 
amdgpu_device *adev)
  
  	amdgpu_gmc_tmz_set(adev);
  
+	if (amdgpu_num_kcq_user_set > 8 || amdgpu_num_kcq_user_set < 0)

+   amdgpu_num_kcq_user_set = 8;


This needs a warning or error message if we overwrite invalid user 
provided parameters.


Christian.


+
return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 6291f5f..03a94e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
  int amdgpu_force_asic_type = -1;
  int amdgpu_tmz = 0;
  int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq_user_set = 8;
  
  struct amdgpu_mgpu_info mgpu_info = {

.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
  MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = legacy, 
1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
  module_param_named(reset_method, amdgpu_reset_method, int, 0444);
  
+MODULE_PARM_DESC(num_kcq, "number of KCQ user want to setup (8 if set to greater than 8 or less than 0, only affect gfx 8+)");

+module_param_named(num_kcq, amdgpu_num_kcq_user_set, int, 0444);
+
  static const struct pci_device_id pciidlist[] = {
  #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..0b59049 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,7 +202,7 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
amdgpu_device *adev,
  
  void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)

  {
-   int i, queue, pipe, mec;
+   int i, queue, pipe, mec, j = 0;
bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
  
  	/* policy for amdgpu compute queue ownership */

@@ -219,23 +219,24 @@ void amdgpu_gfx_compute_queue_acquire(struct 
amdgpu_device *adev)
  
  		if (multipipe_policy) {

/* policy: amdgpu owns the first two queues of the 
first MEC */
-   if (mec == 0 && queue < 2)
-   set_bit(i, adev->gfx.mec.queue_bitmap);
+   if (mec == 0 && queue < 2) {
+   if (j++ < adev->gfx.num_compute_rings)
+   set_bit(i, adev->gfx.mec.queue_bitmap);
+   else
+   break;
+   }
} else {
/* policy: amdgpu owns all queues in the first pipe */
-   if (mec == 0 && pipe == 0)
-   set_bit(i, adev->gfx.mec.queue_bitmap);
+   if (mec == 0 && pipe == 0) {
+   if (j++ < adev->gfx.num_compute_rings)
+   set_bit(i, adev->gfx.mec.queue_bitmap);
+   else
+   break;
+

[PATCH] drm/amdgpu: introduce a new parameter to configure how many KCQ we want

2020-07-27 Thread Monk Liu
what:
KCQs cost many clock cycles during world switch, which significantly hurts
multi-VF performance

how:
introduce a parameter to control the number of KCQs, to avoid a performance
drop when no KCQ is needed

notes:
this parameter only affects gfx 8/9/10

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 27 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 30 +++--
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  | 29 ++--
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 31 +++---
 7 files changed, 69 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088..71a3d6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -201,6 +201,7 @@ extern int amdgpu_si_support;
 #ifdef CONFIG_DRM_AMDGPU_CIK
 extern int amdgpu_cik_support;
 #endif
+extern int amdgpu_num_kcq_user_set;
 
 #define AMDGPU_VM_MAX_NUM_CTX  4096
 #define AMDGPU_SG_THRESHOLD(256*1024*1024)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac9..61c7583 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1199,6 +1199,9 @@ static int amdgpu_device_check_arguments(struct 
amdgpu_device *adev)
 
amdgpu_gmc_tmz_set(adev);
 
+   if (amdgpu_num_kcq_user_set > 8 || amdgpu_num_kcq_user_set < 0)
+   amdgpu_num_kcq_user_set = 8;
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6291f5f..03a94e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -150,6 +150,7 @@ int amdgpu_noretry;
 int amdgpu_force_asic_type = -1;
 int amdgpu_tmz = 0;
 int amdgpu_reset_method = -1; /* auto */
+int amdgpu_num_kcq_user_set = 8;
 
 struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -765,6 +766,9 @@ module_param_named(tmz, amdgpu_tmz, int, 0444);
 MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = 
legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)");
 module_param_named(reset_method, amdgpu_reset_method, int, 0444);
 
+MODULE_PARM_DESC(num_kcq, "number of KCQ user want to setup (8 if set to 
greater than 8 or less than 0, only affect gfx 8+)");
+module_param_named(num_kcq, amdgpu_num_kcq_user_set, int, 0444);
+
 static const struct pci_device_id pciidlist[] = {
 #ifdef  CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 8eff017..0b59049 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -202,7 +202,7 @@ bool amdgpu_gfx_is_high_priority_compute_queue(struct 
amdgpu_device *adev,
 
 void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
 {
-   int i, queue, pipe, mec;
+   int i, queue, pipe, mec, j = 0;
bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
 
/* policy for amdgpu compute queue ownership */
@@ -219,23 +219,24 @@ void amdgpu_gfx_compute_queue_acquire(struct 
amdgpu_device *adev)
 
if (multipipe_policy) {
/* policy: amdgpu owns the first two queues of the 
first MEC */
-   if (mec == 0 && queue < 2)
-   set_bit(i, adev->gfx.mec.queue_bitmap);
+   if (mec == 0 && queue < 2) {
+   if (j++ < adev->gfx.num_compute_rings)
+   set_bit(i, adev->gfx.mec.queue_bitmap);
+   else
+   break;
+   }
} else {
/* policy: amdgpu owns all queues in the first pipe */
-   if (mec == 0 && pipe == 0)
-   set_bit(i, adev->gfx.mec.queue_bitmap);
+   if (mec == 0 && pipe == 0) {
+   if (j++ < adev->gfx.num_compute_rings)
+   set_bit(i, adev->gfx.mec.queue_bitmap);
+   else
+   break;
+   }
}
}
 
-   /* update the number of active compute rings */
-   adev->gfx.num_compute_rings =
-   bitmap_weight(adev->gfx.mec.queue_bitmap, 
AMDGPU_MAX_COMPUTE_QUEUES);
-
-   /* If you hit this case and edited the policy, you probably just
-* need to increase