[PATCH] drm/amdgpu: Fix the warning when unloading or removing amdgpu

2023-02-07 Thread Ma Jun
Check for GFX_CMD_ID_INVOKE_CMD to fix the warning below when
unloading or removing the amdgpu driver

[  319.489809] Call Trace:
[  319.489810]  
[  319.489812]  psp_ta_unload+0x9a/0xd0 [amdgpu]
[  319.489926]  ? smu_smc_hw_cleanup+0x2f6/0x360 [amdgpu]
[  319.490072]  psp_hw_fini+0xea/0x170 [amdgpu]
[  319.490231]  amdgpu_device_fini_hw+0x2fc/0x413 [amdgpu]
[  319.490398]  ? blocking_notifier_chain_unregister+0x56/0xb0
[  319.490401]  amdgpu_driver_unload_kms+0x51/0x60 [amdgpu]
[  319.490493]  amdgpu_pci_remove+0x5a/0x140 [amdgpu]
[  319.490583]  ? __pm_runtime_resume+0x60/0x90
[  319.490586]  pci_device_remove+0x3b/0xb0
[  319.490588]  __device_release_driver+0x1a8/0x2a0
[  319.490591]  driver_detach+0xf3/0x140
[  319.490593]  bus_remove_driver+0x6c/0xf0
[  319.490595]  driver_unregister+0x31/0x60
[  319.490597]  pci_unregister_driver+0x40/0x90
[  319.490599]  amdgpu_exit+0x15/0x44e [amdgpu]

Signed-off-by: Ma Jun 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 466054719842..5fb919cd9330 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -620,7 +620,8 @@ psp_cmd_submit_buf(struct psp_context *psp,
 */
if (!dev_entered)
WARN_ON(psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_LOAD_ASD &&
-   psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_UNLOAD_TA);
+   psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_UNLOAD_TA &&
+   psp->cmd_buf_mem->cmd_id != GFX_CMD_ID_INVOKE_CMD);
 
memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
 
-- 
2.25.1



[PATCH] drm/amd/display: fix dm irq error message in gpu recover

2023-02-07 Thread Tianci Yin
From: tiancyin 

[Why]
Variable adev->crtc_irq.num_types was initialized to the value of
adev->mode_info.num_crtc at the early_init stage. Later, at the hw_init
stage, num_crtc changes due to display pipe harvesting on some SKUs,
but num_types was not updated accordingly, which causes the errors
below in gpu recover.

  *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3

[How]
Defer the initialization of num_types until num_crtc has been adjusted,
which eliminates the error logs.
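
For reference, a minimal sketch of the dependency (assuming
amdgpu_dm_set_irq_funcs() matches its upstream form, where each
interrupt source's type count is copied from num_crtc):

	void amdgpu_dm_set_irq_funcs(struct amdgpu_device *adev)
	{
		/* num_types must track the post-harvest crtc count */
		adev->crtc_irq.num_types = adev->mode_info.num_crtc;
		adev->crtc_irq.funcs = &dm_crtc_irq_funcs;

		adev->pageflip_irq.num_types = adev->mode_info.num_crtc;
		adev->pageflip_irq.funcs = &dm_pageflip_irq_funcs;

		adev->vupdate_irq.num_types = adev->mode_info.num_crtc;
		adev->vupdate_irq.funcs = &dm_vupdate_irq_funcs;
	}

Calling this after the pipe-harvest adjustment in
amdgpu_dm_initialize_drm_device() keeps every num_types consistent with
the usable crtc count.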

Signed-off-by: tiancyin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b31cfda30ff9..506699c0d316 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4226,6 +4226,8 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
/* Update the actual used number of crtc */
adev->mode_info.num_crtc = adev->dm.display_indexes_num;
 
+   amdgpu_dm_set_irq_funcs(adev);
+
link_cnt = dm->dc->caps.max_links;
if (amdgpu_dm_mode_config_init(dm->adev)) {
DRM_ERROR("DM: Failed to initialize mode config\n");
@@ -4714,8 +4716,6 @@ static int dm_early_init(void *handle)
break;
}
 
-   amdgpu_dm_set_irq_funcs(adev);
-
if (adev->mode_info.funcs == NULL)
adev->mode_info.funcs = &dm_display_funcs;
 
-- 
2.34.1



Re: [PATCH] drm/amd/display: Align num_crtc to max_streams

2023-02-07 Thread Yin, Tianci (Rico)

Thank you very much Harry and Hamza.

You are right, adev->mode_info.num_crtc has already been updated; the
variables that really need updating are adev->crtc_irq.num_types,
adev->vline0_irq.num_types, adev->vupdate_irq.num_types and
adev->pageflip_irq.num_types.

I have made a new patch with title "drm/amd/display: fix dm irq error message 
in gpu recover", please help to review.

Regards,
Rico

From: Wentland, Harry 
Sent: Wednesday, February 8, 2023 0:19
To: Mahfooz, Hamza ; Yin, Tianci (Rico) 
; amd-gfx@lists.freedesktop.org 

Cc: Wang, Yu (Charlie) ; Siqueira, Rodrigo 
; Pillai, Aurabindo 
Subject: Re: [PATCH] drm/amd/display: Align num_crtc to max_streams



On 2/7/23 09:41, Hamza Mahfooz wrote:
>
> On 2/7/23 09:31, Harry Wentland wrote:
>>
>>
>> On 2/7/23 08:00, Hamza Mahfooz wrote:
>>>
>>> On 2/6/23 23:05, Tianci Yin wrote:
 From: tiancyin 

 [Why]
 Display pipe might be harvested on some SKUs, which causes
 adev->mode_info.num_crtc to mismatch the usable crtc number;
 the error dmesgs below are then observed after GPU recover.

 *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
 *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3

 [How]
 max_streams is the limited number after pipe fuse; align num_crtc
 to max_streams to eliminate the error logs.

 Signed-off-by: tiancyin 
 ---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
1 file changed, 3 insertions(+)

 diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 index b31cfda30ff9..87ec2574cc09 100644
 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 @@ -4285,6 +4285,9 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
break;
}
+/* Adjust the crtc number according to the DCN pipe fuse. */
 +adev->mode_info.num_crtc = dm->dc->caps.max_streams;
>>>
>>> This would introduce array-out-of-bounds issues, since there are arrays of
>>> size AMDGPU_MAX_CRTCS that use num_crtc as a bounds check.
>>
>>  From what I can tell max_streams is always <= AMDGPU_MAX_CRTCS (6),
>> though we're not checking. Maybe it'd be good to check and print a
>> DRM_ERROR here if that's ever not the case (e.g., if we ever add
>> any new ASIC that has more streams).
>
> I have had UBSAN warns before commit d633b7a25fbe ("drm/amd/display: fix
> possible buffer overflow relating to secure display") was applied, so it
> seems to already be the case, maybe due to virtual streams.
>

Interesting.

On closer look I'm not sure why this patch is needed. We already
do exactly what this patch does at the beginning of
amdgpu_dm_initialize_drm_device:

>dm->display_indexes_num = dm->dc->caps.max_streams;
>/* Update the actual used number of crtc */
>adev->mode_info.num_crtc = adev->dm.display_indexes_num;

Harry
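
A sketch of the bounds check suggested above (names taken from this
thread; not a tested patch):

	if (dm->dc->caps.max_streams > AMDGPU_MAX_CRTCS) {
		DRM_ERROR("max_streams (%d) exceeds AMDGPU_MAX_CRTCS (%d)\n",
			  dm->dc->caps.max_streams, AMDGPU_MAX_CRTCS);
		return -EINVAL;
	}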

>>
>> Harry
>>
>>>
 +
for (i = 0; i < dm->dc->caps.max_streams; i++)
if (amdgpu_dm_crtc_init(dm, mode_info->planes[i], i)) {
DRM_ERROR("KMS: Failed to initialize crtc\n");
>>>
>>
>



Re: [PATCH] drm/amdkfd: Prevent user space using both svm and kfd api to register same user buffer

2023-02-07 Thread Felix Kuehling

On 2023-02-07 at 18:36, Chen, Xiaogang wrote:


On 2/7/2023 2:48 PM, Felix Kuehling wrote:


On 2023-02-07 at 15:35, Xiaogang.Chen wrote:

From: Xiaogang Chen 

When xnack is on, user space can use svm page restore to set up a vm
range without registering it first, then use the regular api to
register it. Currently the kfd api and svm are not interoperable. We
already have a check for that, but for a user buffer the mapping
address is not the same as the buffer's cpu virtual address. Add a
check on that to avoid errors propagating to hmm.

Signed-off-by: Xiaogang Chen 
---
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c

index f79b8e964140..cb7acb0b9b52 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1065,6 +1065,23 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep,

  mutex_unlock(&p->svms.lock);
  return -EADDRINUSE;
  }
+
+    /* When register user buffer check if it has been registered by svm by
+ * buffer cpu virtual address.
+ */
+    if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
+
+    if (interval_tree_iter_first(&p->svms.objects,
+    untagged_addr(args->mmap_offset) >> PAGE_SHIFT,
+    (untagged_addr(args->mmap_offset) + args->size - 1) >> PAGE_SHIFT)) {


Instead of nesting two if-blocks, you can write this as a single 
if-block like


if ((flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) &&
    interval_tree_iter_first(&p->svms.objects,
 untagged_addr(args->mmap_offset) >> PAGE_SHIFT,
 (untagged_addr(args->mmap_offset) + args->size - 1) >> PAGE_SHIFT)) {


I'm also not sure untagged_addr is needed here. If it is, we're 
missing it in a bunch of other places too. Most notably, we don't 
untag pointers anywhere in the SVM API.
Memory virtual address tagging is architecture dependent. For example,
if the virtual address is 48 bits but pointers are 64 bits, the
additional bits can be used for other purposes. From the kernel source
tree it seems only arm64 and sparc define untagged_addr as something
other than a no-op; for other architectures it is a no-op. I think we
want to have it in case we run on a different cpu architecture.


Fair enough. But it has to be consistent. If the SVM code fills the 
interval tree with tagged addresses, we can't look for untagged 
addresses here. Given that the SVM code currently uses (potentially) 
tagged addresses, we should not use untagged addresses here. A proper 
fix for untagging addresses in the SVM code would be needed as a 
separate commit.
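
As a toy illustration of that mismatch (a userspace sketch, not kernel
code; the tag layout is an arm64-like assumption):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12

/* stand-in for untagged_addr(): clear the top byte (sketch only) */
static uint64_t untag(uint64_t addr)
{
	return addr & ((1ULL << 56) - 1);
}

int main(void)
{
	uint64_t tagged = (0x2bULL << 56) | 0x7f12345000ULL;

	/* The two page ranges differ, so an interval tree filled with
	 * tagged addresses never matches an untagged lookup. */
	printf("tagged   pfn 0x%llx\n",
	       (unsigned long long)(tagged >> PAGE_SHIFT));
	printf("untagged pfn 0x%llx\n",
	       (unsigned long long)(untag(tagged) >> PAGE_SHIFT));
	return 0;
}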


Regards,
  Felix




Regards,
  Felix



+
+    pr_err("User Buffer Address: 0x%llx already allocated by SVM\n",
+    untagged_addr(args->mmap_offset));
+    mutex_unlock(&p->svms.lock);
+    return -EADDRINUSE;
+    }
+
+    }
  mutex_unlock(&p->svms.lock);
  #endif
  mutex_lock(&p->mutex);


Re: [PATCH] drm/amdkfd: Prevent user space using both svm and kfd api to register same user buffer

2023-02-07 Thread Chen, Xiaogang



On 2/7/2023 2:48 PM, Felix Kuehling wrote:


On 2023-02-07 at 15:35, Xiaogang.Chen wrote:

From: Xiaogang Chen 

When xnack is on, user space can use svm page restore to set up a vm
range without registering it first, then use the regular api to
register it. Currently the kfd api and svm are not interoperable. We
already have a check for that, but for a user buffer the mapping
address is not the same as the buffer's cpu virtual address. Add a
check on that to avoid errors propagating to hmm.

Signed-off-by: Xiaogang Chen 
---
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c

index f79b8e964140..cb7acb0b9b52 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1065,6 +1065,23 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep,

  mutex_unlock(&p->svms.lock);
  return -EADDRINUSE;
  }
+
+    /* When register user buffer check if it has been registered by svm by
+ * buffer cpu virtual address.
+ */
+    if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
+
+    if (interval_tree_iter_first(&p->svms.objects,
+    untagged_addr(args->mmap_offset) >> PAGE_SHIFT,
+    (untagged_addr(args->mmap_offset) + args->size - 1) >> PAGE_SHIFT)) {


Instead of nesting two if-blocks, you can write this as a single 
if-block like


if ((flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) &&
    interval_tree_iter_first(&p->svms.objects,
 untagged_addr(args->mmap_offset) >> PAGE_SHIFT,
 (untagged_addr(args->mmap_offset) + args->size - 1) >> PAGE_SHIFT)) {


I'm also not sure untagged_addr is needed here. If it is, we're 
missing it in a bunch of other places too. Most notably, we don't 
untag pointers anywhere in the SVM API.
Memory virtual address tagging is architecture dependent. For example,
if the virtual address is 48 bits but pointers are 64 bits, the
additional bits can be used for other purposes. From the kernel source
tree it seems only arm64 and sparc define untagged_addr as something
other than a no-op; for other architectures it is a no-op. I think we
want to have it in case we run on a different cpu architecture.


Regards,
  Felix



+
+    pr_err("User Buffer Address: 0x%llx already allocated by SVM\n",
+    untagged_addr(args->mmap_offset));
+    mutex_unlock(&p->svms.lock);
+    return -EADDRINUSE;
+    }
+
+    }
  mutex_unlock(&p->svms.lock);
  #endif
  mutex_lock(&p->mutex);


[PATCH] drm/amd/display: fix glitches on hw rotation without pipe split

2023-02-07 Thread Melissa Wen
Fix glitches when moving the cursor close to the edge on a rotated
screen for drivers with one pipe (without pipe split) by halving
dst_x_offset.

Reported-by: Xaver Hugl 
Signed-off-by: Melissa Wen 
---

Hi,

I'm not sure if halving dst_x_offset (or doing it only for the one-pipe
case) is the right solution, but it solves the issue on different
devices when the pipe split policy is AVOID.

Context:

Some artifacts appear on HW rotated screen when moving the cursor close
to the edge, as reported in:
https://gitlab.freedesktop.org/drm/amd/-/issues/2247

This issue was initially reported on DCN 3.0.1 and it's not present in
DCN 2.1 by default, for example. These two drivers follow different
pipe_split_policy, where there is no pipe split on DCN 3.0.1 (AVOID),
but we see pipe splitting on DCN 2.1 (MPC_SPLIT_AVOID_MULT_DISP).

Splitting (or not) the pipe changes the way DC calculates cursor
movements and its position, as we can see in
dcn10_set_cursor_position(). In addition, it's possible to reproduce the
same issue found on DCN 3.0.1 by setting DCN 2.1 to avoid pipe splitting
plus rotating the screen to any angle different from zero. However, from
my experiments, setting DCN 3.0.1 to a different pipe split policy makes
the system unstable and causes GPU reset (even though DYNAMIC seems to
be the default policy for DC).

I noticed that plugging/unplugging the charger changed the intensity of
these artifacts, and also saw some temporary changes with different
power performance settings. Keeping that in mind, I verified the
calculations and register updates related to cursor movements
(dcn10_set_cursor_position(), hubp2_cursor_set_position()), and we can
see that some clk values participate in the final result of
dst_x_offset. After halving dst_x_offset, the artifacts no longer appear
and it solves the problem when pipe splitting is not allowed.
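
To make the scaling concrete, a hypothetical example with made-up clock
values (same formula as in hubp2_cursor_set_position(), see the hunk
below):

#include <stdio.h>

int main(void)
{
	int src_x_offset  = 3840;	/* cursor near the right edge */
	int ref_clk_khz   = 24000;	/* assumed reference clock */
	int pixel_clk_khz = 594000;	/* assumed 4k@60 pixel clock */

	/* current code: 3840 * 24000 / 594000 = 155 */
	printf("before: %d\n", src_x_offset * ref_clk_khz / pixel_clk_khz);
	/* with this patch the offset is halved first: 1920 * 24000 / 594000 = 77 */
	printf("after:  %d\n", (src_x_offset / 2) * ref_clk_khz / pixel_clk_khz);
	return 0;
}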

This change doesn't affect the correct behavior with more than one pipe,
but may affect the optimal setup of bandwidth and clocks. Perhaps the
current values don't deliver the right performance when the pipe is not
split, and halving dst_x_offset is exactly the right step for only one
pipe, but not for pipe split.

Finally, if this is not the right solution, I appreciate any feedback to
address this problem correctly.

Thanks,

Melissa

 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
index 4566bc7abf17..1ff85d81237e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
@@ -1018,7 +1018,7 @@ void hubp2_cursor_set_position(
src_y_offset = y_pos - (cursor_height - y_hotspot);
}
 
-   dst_x_offset = (src_x_offset >= 0) ? src_x_offset : 0;
+   dst_x_offset = (src_x_offset >= 0) ? src_x_offset / 2 : 0;
dst_x_offset *= param->ref_clk_khz;
dst_x_offset /= param->pixel_clk_khz;
 
-- 
2.39.0



Re: gpu_metrics does not provide 'current_gfxclk', 'current_uclk', 'average_cpu_power' & 'temperature_core' on AMD Ryzen 7000 CPU

2023-02-07 Thread sfrcorne
Dear Alex,

If current_gfxclk is not supported for my CPU, then using 
average_gfxclk_frequency instead is indeed the best solution in my opinion. I 
will try to get a fix merged for my CPU in Mangohud.

On a side note: you mentioned that unsupported fields would be 0, but I
don't think this is correct. In the Linux kernel/driver there is a line
of code that first sets all values to 0xFF with a memset() and then
populates the supported fields.

see 
"https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c#L999":
 memset(header, 0xFF, structure_size);

The value of the unsupported uint16_t fields thus should be 0xFFFF (or
65535 in decimal). This is also what I get when reading the gpu_metrics
file. I just wanted to mention this in case someone reads this in the
Archive.
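
A sketch of how a consumer like Mangohud could treat such fields
(hypothetical helper names; only the 0xFF fill behaviour comes from the
kernel source above):

#include <stdint.h>

/* Unsupported uint16_t gpu_metrics fields keep the 0xFF memset fill,
 * so 0xFFFF effectively means "not provided". */
static inline int metrics_u16_valid(uint16_t v)
{
	return v != 0xFFFF;
}

/* Prefer current_gfxclk when populated, else fall back to the average. */
static uint16_t pick_gfxclk(uint16_t current_gfxclk,
			    uint16_t average_gfxclk_frequency)
{
	return metrics_u16_valid(current_gfxclk) ?
		current_gfxclk : average_gfxclk_frequency;
}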

Anyway, thank you for your help!

Kind regards,
sfrcorne

--- Original Message ---
On Tuesday, February 7th, 2023 at 05:05, Alex Deucher  
wrote:


> On Mon, Feb 6, 2023 at 5:48 PM sfrcorne sfrco...@protonmail.com wrote:
> 
> > Dear Alex,
> > 
> > First of all, thank you for your response. Personally, I use a Ryzen 5 
> > 7600X however people with a Ryzen 9 7900X are also reporting this issue. 
> > The relevant bug report in Mangohud can be found here: 
> > "https://github.com/flightlessmango/MangoHud/issues/868".
> > 
> > I looked around a bit in both the Mangohud source code and the Linux kernel 
> > source code.
> > 
> > (Mangohud source): From what I understand, Mangohud looks for a file 
> > "/sys/class/drm/card*/device/gpu_metrics". If this file exists (and it does 
> > exists on my machine), it tries to read this file and extract the relevant 
> > GPU data (and in case of an APU also the CPU data) from it (these are the 
> > values I was talking about in my previous mail). When the file 
> > "/sys/class/drm/card*/device/gpu_metrics" exists, it will not use the data 
> > provided by hwmon (/sys/class/hwmon/hwmon*/*).
> > 
> > (Linux kernel): The gpu_metrics file contains different data, depending on 
> > what version is used. All valid versions can be found in the source code: 
> > "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/include/kgd_pp_interface.h#L725".
> >  For my CPU/APU the 'gpu_metrics_v2_1' structure is used (I tested this by 
> > reading the gpu_metrics file myself). Furthermore, I think that for my 
> > case, this structure is set by the function 
> > "https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_5_ppt.c#L459"
> >  but I am not completely sure about this.
> 
> 
> The metrics provided by the SMU firmware varies from asic to asic.
> For things that are not supported by the metrics table for a
> particular asic, those fields would be 0. You can see what metrics
> are supported for your asic in smu_v13_0_5_get_gpu_metrics() as that
> function populates the supported fields from the firmware to the
> common structure. current_gfxclk is not supported in your asic, but
> average_gfxclk_frequency is. So you'd want to use whichever field is
> available for a particular asic in Mangohud.
> 
> > Lastly, I am not familiar with umr. I assume that you are referring to 
> > "https://gitlab.freedesktop.org/tomstdenis/umr"? If I find some time this
> > weekend, then I will look into this some more.
> 
> 
> Yes, that is the right link. umr uses the same interface as mangohud,
> so you should see the same data.
> 
> Alex
> 
> > Kind regards,
> > sfrcorne
> > 
> > --- Original Message ---
> > On Monday, February 6th, 2023 at 22:22, Alex Deucher alexdeuc...@gmail.com 
> > wrote:
> > 
> > > On Mon, Feb 6, 2023 at 9:22 AM sfrcorne sfrco...@protonmail.com wrote:
> > > 
> > > > Hello,
> > > > 
> > > > I hope this is the correct place to ask my question. I was not sure if 
> > > > I should have opened a new issue on Gitlab or sent an email here, since
> > > > I don't know whether this is a bug or intended behaviour.
> > > > 
> > > > The question is about the new AMD Ryzen 7000 CPU's. These new CPU's 
> > > > have an iGPU and consequently provide a gpu_metrics file for monitoring 
> > > > the GPU/CPU (APU?). This file is used by programs like Mangohud, that 
> > > > try to read (among other values) the following 4 values:
> > > > - current_gfxclk
> > > > - current_uclk
> > > > - average_cpu_power
> > > > - temperature_core
> > > > However it appears that on AMD Ryzen 7000 CPU's these 4 values are not 
> > > > provided/updated in the gpu_metrics file. Other values like 
> > > > 'average_core_power', 'temperature_l3' and the other 'current_clk' 
> > > > are also not provided/updated but these are not used by Mangohud at the 
> > > > moment.
> > > > 
> > > > Is this intentional or a bug? And will this be fixed and/or will support
> > > > for these 4 values be added in the future?
> > > 
> > > What specific CPU/APU is this? I don't recall off hand how mangohud
> > > queries this stuff, but you can take a look at the hwmon inter

Re: [PATCH] drm/amdkfd: Prevent user space using both svm and kfd api to register same user buffer

2023-02-07 Thread Felix Kuehling



On 2023-02-07 at 15:35, Xiaogang.Chen wrote:

From: Xiaogang Chen 

When xnack is on, user space can use svm page restore to set up a vm
range without registering it first, then use the regular api to
register it. Currently the kfd api and svm are not interoperable. We
already have a check for that, but for a user buffer the mapping
address is not the same as the buffer's cpu virtual address. Add a
check on that to avoid errors propagating to hmm.

Signed-off-by: Xiaogang Chen 
---
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index f79b8e964140..cb7acb0b9b52 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1065,6 +1065,23 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep,
mutex_unlock(&p->svms.lock);
return -EADDRINUSE;
}
+
+   /* When register user buffer check if it has been registered by svm by
+* buffer cpu virtual address.
+*/
+   if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
+
+   if (interval_tree_iter_first(&p->svms.objects,
+   untagged_addr(args->mmap_offset) >> PAGE_SHIFT,
+   (untagged_addr(args->mmap_offset) + args->size - 1) >> PAGE_SHIFT)) {


Instead of nesting two if-blocks, you can write this as a single 
if-block like


if ((flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) &&
interval_tree_iter_first(&p->svms.objects,
 untagged_addr(args->mmap_offset) >> PAGE_SHIFT,
 (untagged_addr(args->mmap_offset) + args->size - 1) >> PAGE_SHIFT)) {

I'm also not sure untagged_addr is needed here. If it is, we're missing 
it in a bunch of other places too. Most notably, we don't untag pointers 
anywhere in the SVM API.


Regards,
  Felix



+
+   pr_err("User Buffer Address: 0x%llx already allocated by SVM\n",
+   untagged_addr(args->mmap_offset));
+   mutex_unlock(&p->svms.lock);
+   return -EADDRINUSE;
+   }
+
+   }
mutex_unlock(&p->svms.lock);
  #endif
mutex_lock(&p->mutex);


[PATCH] drm/amdkfd: Prevent user space using both svm and kfd api to register same user buffer

2023-02-07 Thread Xiaogang . Chen
From: Xiaogang Chen 

When xnack is on, user space can use svm page restore to set up a vm
range without registering it first, then use the regular api to
register it. Currently the kfd api and svm are not interoperable. We
already have a check for that, but for a user buffer the mapping
address is not the same as the buffer's cpu virtual address. Add a
check on that to avoid errors propagating to hmm.

Signed-off-by: Xiaogang Chen 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index f79b8e964140..cb7acb0b9b52 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1065,6 +1065,23 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep,
mutex_unlock(&p->svms.lock);
return -EADDRINUSE;
}
+
+   /* When register user buffer check if it has been registered by svm by
+* buffer cpu virtual address.
+*/
+   if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
+
+   if (interval_tree_iter_first(&p->svms.objects,
+   untagged_addr(args->mmap_offset) >> PAGE_SHIFT,
+   (untagged_addr(args->mmap_offset) + args->size - 1) >> PAGE_SHIFT)) {
+
+   pr_err("User Buffer Address: 0x%llx already allocated by SVM\n",
+   untagged_addr(args->mmap_offset));
+   mutex_unlock(&p->svms.lock);
+   return -EADDRINUSE;
+   }
+
+   }
mutex_unlock(&p->svms.lock);
 #endif
mutex_lock(&p->mutex);
-- 
2.25.1



Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Alex Deucher
On Tue, Feb 7, 2023 at 1:28 PM Shashank Sharma  wrote:
>
>
> On 07/02/2023 18:57, Alex Deucher wrote:
> > On Tue, Feb 7, 2023 at 12:14 PM Shashank Sharma  
> > wrote:
> >>
> >> On 07/02/2023 17:54, Alex Deucher wrote:
> >>> On Tue, Feb 7, 2023 at 11:37 AM Shashank Sharma  
> >>> wrote:
>  On 07/02/2023 17:05, Alex Deucher wrote:
> > On Tue, Feb 7, 2023 at 10:43 AM Shashank Sharma 
> >  wrote:
> >> On 07/02/2023 16:17, Alex Deucher wrote:
> >>> On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma 
> >>>  wrote:
>  From: Shashank Sharma 
> 
 MQD describes the properties of a user queue to the HW, and allows it to
>  accurately configure the queue while mapping it in GPU HW. This patch
>  adds:
>  - A new header file which contains the userqueue MQD definition for
>    V11 graphics engine.
>  - A new function which fills it with userqueue data and prepares MQD
>  - A function which sets-up the MQD function ptrs in the generic 
>  userqueue
>    creation code.
> 
>  V1: Addressed review comments from RFC patch series
>  - Reuse the existing MQD structure instead of creating a new 
>  one
>  - MQD format and creation can be IP specific, keep it like 
>  that
> 
>  Cc: Alex Deucher 
>  Cc: Christian Koenig 
>  Signed-off-by: Arvind Yadav 
>  Signed-off-by: Shashank Sharma 
>  ---
>   drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
>   .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 
>  ++
>   drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
>   4 files changed, 169 insertions(+), 8 deletions(-)
>   create mode 100644 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> 
>  diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
>  b/drivers/gpu/drm/amd/amdgpu/Makefile
>  index 764801cc8203..6ae9d5792791 100644
>  --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>  +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>  @@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o
> 
>   # add usermode queue
>   amdgpu-y += amdgpu_userqueue.o
>  +amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o
> 
>   ifneq ($(CONFIG_HSA_AMD),)
>   AMDKFD_PATH := ../amdkfd
>  diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
>  b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
>  index 625c2fe1e84a..9f3490a91776 100644
>  --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
>  +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
>  @@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, 
>  void *data,
>   return r;
>   }
> 
>  +extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
>  +
>  +static int
>  +amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
>  +{
>  +int maj;
>  +struct amdgpu_device *adev = uq_mgr->adev;
>  +uint32_t version = adev->ip_versions[GC_HWIP][0];
>  +
>  +maj = IP_VERSION_MAJ(version);
>  +if (maj == 11) {
>  +uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
>  +} else {
>  +DRM_WARN("This IP doesn't support usermode queues\n");
>  +return -EINVAL;
>  +}
>  +
> >>> I think it would be cleaner to just store these callbacks in adev.
> >>> Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
> >>> in early_init for each IP, we can register the callbacks.  When the
> >>> user goes to create a new user_queue, we can check to see if the
> >>> function pointer is NULL or not for the queue type:
> >>>
> >>> if (!adev->user_queue_funcs[ip_type])
> >>>   return -EINVAL
> >>>
> >>> r = adev->user_queue_funcs[ip_type]->create_queue();
> >> Sounds like a good idea, we can do this.
> >>
> >>> Actually, there is already an mqd manager interface (adev->mqds[]).
> >>> Maybe you can leverage that interface.
> >> Yep, I saw that and initially even tried to work on that interface
> >> itself, and then realized that it doesn't allow us to pass some
> >>
> >> additional parameters (like queue->vm, various BOs like proc_ctx_bo,
> >> gang_ctx_bo's and so on). All of these are required in the MQD
> >>
> >> and we will need them to be added into MQD. I even thought of expanding
> >> this structure with additional parameters, but I felt like
> >>
> >> it defeats the purpose of this MQD properties. But if you feel strongly
> >> about that, we can work around it.

Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Shashank Sharma



On 07/02/2023 18:57, Alex Deucher wrote:

On Tue, Feb 7, 2023 at 12:14 PM Shashank Sharma  wrote:


On 07/02/2023 17:54, Alex Deucher wrote:

On Tue, Feb 7, 2023 at 11:37 AM Shashank Sharma  wrote:

On 07/02/2023 17:05, Alex Deucher wrote:

On Tue, Feb 7, 2023 at 10:43 AM Shashank Sharma  wrote:

On 07/02/2023 16:17, Alex Deucher wrote:

On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma  wrote:

From: Shashank Sharma 

MQD describes the properties of a user queue to the HW, and allows it to
accurately configure the queue while mapping it in GPU HW. This patch
adds:
- A new header file which contains the userqueue MQD definition for
  V11 graphics engine.
- A new function which fills it with userqueue data and prepares MQD
- A function which sets-up the MQD function ptrs in the generic userqueue
  creation code.

V1: Addressed review comments from RFC patch series
- Reuse the existing MQD structure instead of creating a new one
- MQD format and creation can be IP specific, keep it like that

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Arvind Yadav 
Signed-off-by: Shashank Sharma 
---
 drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
 .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 ++
 drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
 4 files changed, 169 insertions(+), 8 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 764801cc8203..6ae9d5792791 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o

 # add usermode queue
 amdgpu-y += amdgpu_userqueue.o
+amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o

 ifneq ($(CONFIG_HSA_AMD),)
 AMDKFD_PATH := ../amdkfd
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
index 625c2fe1e84a..9f3490a91776 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
@@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, void *data,
 return r;
 }

+extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
+
+static int
+amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
+{
+int maj;
+struct amdgpu_device *adev = uq_mgr->adev;
+uint32_t version = adev->ip_versions[GC_HWIP][0];
+
+maj = IP_VERSION_MAJ(version);
+if (maj == 11) {
+uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
+} else {
+DRM_WARN("This IP doesn't support usermode queues\n");
+return -EINVAL;
+}
+

I think it would be cleaner to just store these callbacks in adev.
Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
in early_init for each IP, we can register the callbacks.  When the
user goes to create a new user_queue, we can check check to see if the
function pointer is NULL or not for the queue type:

if (!adev->user_queue_funcs[ip_type])
  return -EINVAL

r = adev->user_queue_funcs[ip_type]->create_queue();

Sounds like a good idea, we can do this.
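
For illustration, the dispatch described above might look like the
sketch below (struct layout and names are assumptions from this thread,
not upstream code):

struct amdgpu_userq_funcs {
	int (*create_queue)(struct amdgpu_userq_mgr *uq_mgr,
			    struct amdgpu_usermode_queue *queue);
	void (*destroy_queue)(struct amdgpu_userq_mgr *uq_mgr,
			      struct amdgpu_usermode_queue *queue);
};

/* Each IP registers its table once at early_init, e.g.
 * adev->user_queue_funcs[AMDGPU_HW_IP_GFX] = &userq_gfx_v11_funcs; */

static int amdgpu_userqueue_dispatch_create(struct amdgpu_device *adev,
					    struct amdgpu_userq_mgr *uq_mgr,
					    struct amdgpu_usermode_queue *queue,
					    unsigned int ip_type)
{
	/* a NULL entry means this IP has no usermode queue support */
	if (!adev->user_queue_funcs[ip_type])
		return -EINVAL;

	return adev->user_queue_funcs[ip_type]->create_queue(uq_mgr, queue);
}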


Actually, there is already an mqd manager interface (adev->mqds[]).
Maybe you can leverage that interface.

Yep, I saw that and initially even tried to work on that interface
itself, and then realized that it doesn't allow us to pass some
additional parameters (like queue->vm, various BOs like proc_ctx_bo,
gang_ctx_bo's and so on). All of these are required in the MQD and we
will need them to be added into MQD. I even thought of expanding this
structure with additional parameters, but I felt like it defeats the
purpose of this MQD properties. But if you feel strongly about that, we
can work around it.

I think it would be cleaner to just add whatever additional mqd
properties you need to amdgpu_mqd_prop, and then you can share
gfx_v11_0_gfx_mqd_init() and gfx_v11_0_compute_mqd_init()  for GFX and
sdma_v6_0_mqd_init() for SDMA.  That way if we make changes to the MQD
configuration, we only have to change one function.

Alex

Noted,

We might have to add some additional fptrs for .prepare_map() and
.prepare_unmap() in the mqd funcs. These are required to prepare data
for MES HW queue mapping.

OK.  I think we could start with just using the existing init_mqd
callbacks from your create/destroy queue functions for now.

Ok,

That
said, do we need the prepare_(un)map callbacks?  I think just
create/destory callbacks should be fine.  In the create callback, we
can init the mqd and map it, then in destroy, we can unmap and free.

If you observe the kernel MES framework, it expects the data to be fed
in a particular format, in the form of queue_properties, and creates the
map_queue_packet using those. So we need to re-arrange the data we have
in MQD or drm_mqd_in in the format of properties, which is being done in
prepare_map/unmap.

[linux-next:master] BUILD REGRESSION 49a8133221c71b935f36a7c340c0271c2a9ee2db

2023-02-07 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 49a8133221c71b935f36a7c340c0271c2a9ee2db  Add linux-next specific 
files for 20230207

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202301230743.xnut0zvc-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202301300743.bp7dpazv-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202301301801.y5o08tqx-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202301302110.metnwkbd-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202301310939.tagcoezb-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202302011836.ka3bxqdy-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202302061911.c7xvhx9v-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202302062224.byzetxh1-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202302072055.odjdvd5v-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

Documentation/riscv/uabi.rst:24: WARNING: Enumerated list ends without a blank 
line; unexpected unindent.
ERROR: modpost: "devm_platform_ioremap_resource" [drivers/dma/fsl-edma.ko] 
undefined!
ERROR: modpost: "devm_platform_ioremap_resource" [drivers/dma/idma64.ko] 
undefined!
FAILED: load BTF from vmlinux: No data available
arch/arm64/kvm/arm.c:2207: warning: expecting prototype for Initialize Hyp(). 
Prototype was for kvm_arm_init() instead
drivers/clk/qcom/gcc-sa8775p.c:313:32: warning: unused variable 
'gcc_parent_map_10' [-Wunused-const-variable]
drivers/clk/qcom/gcc-sa8775p.c:318:37: warning: unused variable 
'gcc_parent_data_10' [-Wunused-const-variable]
drivers/clk/qcom/gcc-sa8775p.c:333:32: warning: unused variable 
'gcc_parent_map_12' [-Wunused-const-variable]
drivers/clk/qcom/gcc-sa8775p.c:338:37: warning: unused variable 
'gcc_parent_data_12' [-Wunused-const-variable]
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.h:62:10: fatal error: 
mod_info_packet.h: No such file or directory
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_hubbub.c:1011:6: warning: 
no previous prototype for 'hubbub31_init' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_hubbub.c:948:6: warning: 
no previous prototype for 'hubbub32_init' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_hubp.c:158:6: warning: no 
previous prototype for 'hubp32_init' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_resource_helpers.c:62:18: 
warning: variable 'cursor_bpp' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/accessories/link_dp_trace.c:148:6:
 warning: no previous prototype for 'link_dp_trace_set_edp_power_timestamp' 
[-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/accessories/link_dp_trace.c:158:10:
 warning: no previous prototype for 'link_dp_trace_get_edp_poweron_timestamp' 
[-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/accessories/link_dp_trace.c:163:10:
 warning: no previous prototype for 'link_dp_trace_get_edp_poweroff_timestamp' 
[-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:1295:32:
 warning: variable 'result_write_min_hblank' set but not used 
[-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:279:42:
 warning: variable 'ds_port' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_training.c:1585:38:
 warning: variable 'result' set but not used [-Wunused-but-set-variable]
ftrace-ops.c:(.init.text+0x2c3): undefined reference to `__udivdi3'
libbpf: failed to find '.BTF' ELF section in vmlinux

Unverified Error/Warning (likely false positive, please contact us if 
interested):

drivers/media/i2c/max9286.c:802 max9286_s_stream() error: buffer overflow 
'priv->fmt' 4 <= 32
drivers/thermal/qcom/tsens-v0_1.c:106:40: sparse: sparse: symbol 
'tsens_9607_nvmem' was not declared. Should it be static?
drivers/thermal/qcom/tsens-v0_1.c:26:40: sparse: sparse: symbol 
'tsens_8916_nvmem' was not declared. Should it be static?
drivers/thermal/qcom/tsens-v0_1.c:42:40: sparse: sparse: symbol 
'tsens_8939_nvmem' was not declared. Should it be static?
drivers/thermal/qcom/tsens-v0_1.c:62:40: sparse: sparse: symbol 
'tsens_8974_nvmem' was not declared. Should it be static?
drivers/thermal/qcom/tsens-v0_1.c:84:40: sparse: sparse: symbol 
'tsens_8974_backup_nvmem' was not declared. Should it be static?
drivers/thermal/qcom/tsens-v1.c:24:40: sparse: sparse: symbol 
'tsens_qcs404_nvmem' was not declared. Should it be static?
drivers/thermal/qcom/tsens-v1.c:45:40: sparse: sparse: symbol 
'tsens_8976_nvmem' was not declare

Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Alex Deucher
On Tue, Feb 7, 2023 at 12:14 PM Shashank Sharma  wrote:
>
>
> On 07/02/2023 17:54, Alex Deucher wrote:
> > On Tue, Feb 7, 2023 at 11:37 AM Shashank Sharma  
> > wrote:
> >>
> >> On 07/02/2023 17:05, Alex Deucher wrote:
> >>> On Tue, Feb 7, 2023 at 10:43 AM Shashank Sharma  
> >>> wrote:
>  On 07/02/2023 16:17, Alex Deucher wrote:
> > On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma 
> >  wrote:
> >> From: Shashank Sharma 
> >>
> >> MQD describes the properties of a user queue to the HW, and allows it to
> >> accurately configure the queue while mapping it in GPU HW. This patch
> >> adds:
> >> - A new header file which contains the userqueue MQD definition for
> >>  V11 graphics engine.
> >> - A new function which fills it with userqueue data and prepares MQD
> >> - A function which sets-up the MQD function ptrs in the generic 
> >> userqueue
> >>  creation code.
> >>
> >> V1: Addressed review comments from RFC patch series
> >>- Reuse the existing MQD structure instead of creating a new one
> >>- MQD format and creation can be IP specific, keep it like that
> >>
> >> Cc: Alex Deucher 
> >> Cc: Christian Koenig 
> >> Signed-off-by: Arvind Yadav 
> >> Signed-off-by: Shashank Sharma 
> >> ---
> >> drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
> >> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
> >> .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 
> >> ++
> >> drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
> >> 4 files changed, 169 insertions(+), 8 deletions(-)
> >> create mode 100644 
> >> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> >> b/drivers/gpu/drm/amd/amdgpu/Makefile
> >> index 764801cc8203..6ae9d5792791 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> >> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> >> @@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o
> >>
> >> # add usermode queue
> >> amdgpu-y += amdgpu_userqueue.o
> >> +amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o
> >>
> >> ifneq ($(CONFIG_HSA_AMD),)
> >> AMDKFD_PATH := ../amdkfd
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> >> index 625c2fe1e84a..9f3490a91776 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> >> @@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, 
> >> void *data,
> >> return r;
> >> }
> >>
> >> +extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
> >> +
> >> +static int
> >> +amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
> >> +{
> >> +int maj;
> >> +struct amdgpu_device *adev = uq_mgr->adev;
> >> +uint32_t version = adev->ip_versions[GC_HWIP][0];
> >> +
> >> +maj = IP_VERSION_MAJ(version);
> >> +if (maj == 11) {
> >> +uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
> >> +} else {
> >> +DRM_WARN("This IP doesn't support usermode queues\n");
> >> +return -EINVAL;
> >> +}
> >> +
> > I think it would be cleaner to just store these callbacks in adev.
> > Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
> > in early_init for each IP, we can register the callbacks.  When the
> > user goes to create a new user_queue, we can check to see if the
> > function pointer is NULL or not for the queue type:
> >
> > if (!adev->user_queue_funcs[ip_type])
> >  return -EINVAL
> >
> > r = adev->user_queue_funcs[ip_type]->create_queue();
>  Sounds like a good idea, we can do this.
> 
> > Actually, there is already an mqd manager interface (adev->mqds[]).
> > Maybe you can leverage that interface.
>  Yep, I saw that and initially even tried to work on that interface
>  itself, and then realized that it doesn't allow us to pass some
> 
>  additional parameters (like queue->vm, various BOs like proc_ctx_bo,
>  gang_ctx_bo's and so on). All of these are required in the MQD
> 
>  and we will need them to be added into MQD. I even thought of expanding
>  this structure with additional parameters, but I felt like
> 
>  it defeats the purpose of this MQD properties. But if you feel strongly
>  about that, we can work around it.
> >>> I think it would be cleaner to just add whatever additional mqd
> >>> properties you need to amdgpu_mqd_prop, and then you can share
> >>> gfx_v11_0_gfx_mqd_init() and gfx_v11_0_compute_mqd_init()  for GFX and
> >>> sdma_v6_0_mqd_init() for SDMA.  That way if we make changes

Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Shashank Sharma



On 07/02/2023 17:54, Alex Deucher wrote:

On Tue, Feb 7, 2023 at 11:37 AM Shashank Sharma  wrote:


On 07/02/2023 17:05, Alex Deucher wrote:

On Tue, Feb 7, 2023 at 10:43 AM Shashank Sharma  wrote:

On 07/02/2023 16:17, Alex Deucher wrote:

On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma  wrote:

From: Shashank Sharma 

MQD describes the properties of a user queue to the HW, and allows it to
accurately configure the queue while mapping it in GPU HW. This patch
adds:
- A new header file which contains the userqueue MQD definition for
 V11 graphics engine.
- A new function which fills it with userqueue data and prepares MQD
- A function which sets-up the MQD function ptrs in the generic userqueue
 creation code.

V1: Addressed review comments from RFC patch series
   - Reuse the existing MQD structure instead of creating a new one
   - MQD format and creation can be IP specific, keep it like that

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Arvind Yadav 
Signed-off-by: Shashank Sharma 
---
drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
.../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 ++
drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
4 files changed, 169 insertions(+), 8 deletions(-)
create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 764801cc8203..6ae9d5792791 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o

# add usermode queue
amdgpu-y += amdgpu_userqueue.o
+amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o

ifneq ($(CONFIG_HSA_AMD),)
AMDKFD_PATH := ../amdkfd
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
index 625c2fe1e84a..9f3490a91776 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
@@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, void *data,
return r;
}

+extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
+
+static int
+amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
+{
+int maj;
+struct amdgpu_device *adev = uq_mgr->adev;
+uint32_t version = adev->ip_versions[GC_HWIP][0];
+
+maj = IP_VERSION_MAJ(version);
+if (maj == 11) {
+uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
+} else {
+DRM_WARN("This IP doesn't support usermode queues\n");
+return -EINVAL;
+}
+

I think it would be cleaner to just store these callbacks in adev.
Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
in early_init for each IP, we can register the callbacks.  When the
user goes to create a new user_queue, we can check to see if the
function pointer is NULL or not for the queue type:

if (!adev->user_queue_funcs[ip_type])
 return -EINVAL

r = adev->user_queue_funcs[ip_type]->create_queue();

Sounds like a good idea, we can do this.


Actually, there is already an mqd manager interface (adev->mqds[]).
Maybe you can leverage that interface.

Yep, I saw that and initially even tried to work on that interface
itself, and then realized that it doesn't allow us to pass some
additional parameters (like queue->vm, various BOs like proc_ctx_bo,
gang_ctx_bo's and so on). All of these are required in the MQD and we
will need them to be added into MQD. I even thought of expanding this
structure with additional parameters, but I felt like it defeats the
purpose of this MQD properties. But if you feel strongly about that, we
can work around it.

I think it would be cleaner to just add whatever additional mqd
properties you need to amdgpu_mqd_prop, and then you can share
gfx_v11_0_gfx_mqd_init() and gfx_v11_0_compute_mqd_init()  for GFX and
sdma_v6_0_mqd_init() for SDMA.  That way if we make changes to the MQD
configuration, we only have to change one function.

Alex

Noted,

We might have to add some additional fptrs for .prepare_map() and
.prepare_unmap() in the mqd funcs. These are required to prepare data
for MES HW queue mapping.

OK.  I think we could start with just using the existing init_mqd
callbacks from your create/destroy queue functions for now.

Ok,

That
said, do we need the prepare_(un)map callbacks?  I think just
create/destory callbacks should be fine.  In the create callback, we
can init the mqd and map it, then in destroy, we can unmap and free.


If you observe the kernel MES framework, it expects the data to be fed
in a particular format, in the form of queue_properties, and creates the
map_queue_packet using those. So we need to re-arrange the data we have
in MQD or drm_mqd_in in the format of properties, which is being done in
prepare_map/unmap. Now, as the MQD is IP specific, we will n

Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Alex Deucher
On Tue, Feb 7, 2023 at 11:37 AM Shashank Sharma  wrote:
>
>
> On 07/02/2023 17:05, Alex Deucher wrote:
> > On Tue, Feb 7, 2023 at 10:43 AM Shashank Sharma  
> > wrote:
> >>
> >> On 07/02/2023 16:17, Alex Deucher wrote:
> >>> On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma  
> >>> wrote:
>  From: Shashank Sharma 
> 
 MQD describes the properties of a user queue to the HW, and allows it to
>  accurately configure the queue while mapping it in GPU HW. This patch
>  adds:
>  - A new header file which contains the userqueue MQD definition for
>  V11 graphics engine.
>  - A new function which fills it with userqueue data and prepares MQD
>  - A function which sets-up the MQD function ptrs in the generic userqueue
>  creation code.
> 
>  V1: Addressed review comments from RFC patch series
>    - Reuse the existing MQD structure instead of creating a new one
>    - MQD format and creation can be IP specific, keep it like that
> 
>  Cc: Alex Deucher 
>  Cc: Christian Koenig 
>  Signed-off-by: Arvind Yadav 
>  Signed-off-by: Shashank Sharma 
>  ---
> drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
> .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 ++
> drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
> 4 files changed, 169 insertions(+), 8 deletions(-)
> create mode 100644 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> 
>  diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
>  b/drivers/gpu/drm/amd/amdgpu/Makefile
>  index 764801cc8203..6ae9d5792791 100644
>  --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>  +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>  @@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o
> 
> # add usermode queue
> amdgpu-y += amdgpu_userqueue.o
>  +amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o
> 
> ifneq ($(CONFIG_HSA_AMD),)
> AMDKFD_PATH := ../amdkfd
>  diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
>  b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
>  index 625c2fe1e84a..9f3490a91776 100644
>  --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
>  +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
>  @@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, 
>  void *data,
> return r;
> }
> 
>  +extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
>  +
>  +static int
>  +amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
>  +{
>  +int maj;
>  +struct amdgpu_device *adev = uq_mgr->adev;
>  +uint32_t version = adev->ip_versions[GC_HWIP][0];
>  +
>  +maj = IP_VERSION_MAJ(version);
>  +if (maj == 11) {
>  +uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
>  +} else {
>  +DRM_WARN("This IP doesn't support usermode queues\n");
>  +return -EINVAL;
>  +}
>  +
> >>> I think it would be cleaner to just store these callbacks in adev.
> >>> Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
> >>> in early_init for each IP, we can register the callbacks.  When the
> >>> user goes to create a new user_queue, we can check to see if the
> >>> function pointer is NULL or not for the queue type:
> >>>
> >>> if (!adev->user_queue_funcs[ip_type])
> >>> return -EINVAL
> >>>
> >>> r = adev->user_queue_funcs[ip_type]->create_queue();
> >> Sounds like a good idea, we can do this.
> >>
> >>> Actually, there is already an mqd manager interface (adev->mqds[]).
> >>> Maybe you can leverage that interface.
> >> Yep, I saw that and initially even tried to work on that interface
> >> itself, and then realized that it doesn't allow us to pass some
> >>
> >> additional parameters (like queue->vm, various BOs like proc_ctx_bo,
> >> gang_ctx_bo's and so on). All of these are required in the MQD
> >>
> >> and we will need them to be added into MQD. I even thought of expanding
> >> this structure with additional parameters, but I felt like
> >>
> >> it defeats the purpose of this MQD properties. But if you feel strongly
> >> about that, we can work around it.
> > I think it would be cleaner to just add whatever additional mqd
> > properties you need to amdgpu_mqd_prop, and then you can share
> > gfx_v11_0_gfx_mqd_init() and gfx_v11_0_compute_mqd_init()  for GFX and
> > sdma_v6_0_mqd_init() for SDMA.  That way if we make changes to the MQD
> > configuration, we only have to change one function.
> >
> > Alex
>
> Noted,
>
> We might have to add some additional fptrs for .prepare_map() and
> .prepare_unmap(). in the mqd funcs.
>
> These are the required to prepare data for MES HW queue mapping.

OK.  I think we could start with just using the existing init_mqd
callbacks from y

Re: [PATCH 5/8] drm/amdgpu: Create context for usermode queue

2023-02-07 Thread Alex Deucher
On Tue, Feb 7, 2023 at 11:51 AM Alex Deucher  wrote:
>
> On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma  
> wrote:
> >
> > The FW expects us to allocate at least one page as context space to
> > process gang, process, shadow, GDS and FW_space related work. This
> > patch creates objects for the same, and adds IP specific
> > functions to do this.
> >
> > Cc: Alex Deucher 
> > Cc: Christian Koenig 
> > Signed-off-by: Shashank Sharma 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  32 +
> >  .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 121 ++
> >  .../gpu/drm/amd/include/amdgpu_userqueue.h|  18 +++
> >  3 files changed, 171 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> > index 9f3490a91776..18281b3a51f1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> > @@ -42,6 +42,28 @@ static struct amdgpu_usermode_queue
> >  return idr_find(&uq_mgr->userq_idr, qid);
> >  }
> >
> > +static void
> > +amdgpu_userqueue_destroy_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
> > +   struct amdgpu_usermode_queue *queue)
> > +{
> > +uq_mgr->userq_mqd_funcs->ctx_destroy(uq_mgr, queue);
> > +}
> > +
> > +static int
> > +amdgpu_userqueue_create_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
> > +  struct amdgpu_usermode_queue *queue)
> > +{
> > +int r;
> > +
> > +r = uq_mgr->userq_mqd_funcs->ctx_create(uq_mgr, queue);
> > +if (r) {
> > +DRM_ERROR("Failed to create context space for queue\n");
> > +return r;
> > +}
> > +
> > +return 0;
> > +}
> > +
> >  static int
> >  amdgpu_userqueue_create_mqd(struct amdgpu_userq_mgr *uq_mgr, struct 
> > amdgpu_usermode_queue *queue)
> >  {
> > @@ -142,12 +164,21 @@ static int amdgpu_userqueue_create(struct drm_file 
> > *filp, union drm_amdgpu_userq
> >  goto free_qid;
> >  }
> >
> > +r = amdgpu_userqueue_create_ctx_space(uq_mgr, queue);
> > +if (r) {
> > +DRM_ERROR("Failed to create context space\n");
> > +goto free_mqd;
> > +}
> > +
> >  list_add_tail(&queue->userq_node, &uq_mgr->userq_list);
> >  args->out.q_id = queue->queue_id;
> >  args->out.flags = 0;
> >  mutex_unlock(&uq_mgr->userq_mutex);
> >  return 0;
> >
> > +free_mqd:
> > +amdgpu_userqueue_destroy_mqd(uq_mgr, queue);
> > +
> >  free_qid:
> >  amdgpu_userqueue_free_index(uq_mgr, queue->queue_id);
> >
> > @@ -170,6 +201,7 @@ static void amdgpu_userqueue_destroy(struct drm_file 
> > *filp, int queue_id)
> >  }
> >
> >  mutex_lock(&uq_mgr->userq_mutex);
> > +amdgpu_userqueue_destroy_ctx_space(uq_mgr, queue);
> >  amdgpu_userqueue_destroy_mqd(uq_mgr, queue);
> >  amdgpu_userqueue_free_index(uq_mgr, queue->queue_id);
> >  list_del(&queue->userq_node);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> > index 57889729d635..687f90a587e3 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> > @@ -120,6 +120,125 @@ amdgpu_userq_gfx_v11_mqd_destroy(struct 
> > amdgpu_userq_mgr *uq_mgr, struct amdgpu_
> >
> >  }
> >
> > +static int amdgpu_userq_gfx_v11_ctx_create(struct amdgpu_userq_mgr *uq_mgr,
> > +   struct amdgpu_usermode_queue 
> > *queue)
> > +{
> > +int r;
> > +struct amdgpu_device *adev = uq_mgr->adev;
> > +struct amdgpu_userq_ctx *pctx = &queue->proc_ctx;
> > +struct amdgpu_userq_ctx *gctx = &queue->gang_ctx;
> > +struct amdgpu_userq_ctx *gdsctx = &queue->gds_ctx;
> > +struct amdgpu_userq_ctx *fwctx = &queue->fw_ctx;
> > +struct amdgpu_userq_ctx *sctx = &queue->shadow_ctx;
> > +
> > +/*
> > + * The FW expects at least one page of space allocated for
> > + * process context related work, and one for gang context.
> > + */
> > +r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_PROC_CTX_SZ, PAGE_SIZE,
> > +AMDGPU_GEM_DOMAIN_VRAM,
> > +&pctx->obj,
> > +&pctx->gpu_addr,
> > +&pctx->cpu_ptr);
> > +if (r) {
> > +DRM_ERROR("Failed to allocate proc bo for userqueue (%d)", r);
> > +return r;
> > +}
> > +
> > +r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_GANG_CTX_SZ, PAGE_SIZE,
> > +AMDGPU_GEM_DOMAIN_VRAM,
> > +&gctx->obj,
> > +&gctx->gpu_addr,
> > +&gctx->cpu_ptr);
> > +if (r) {
> > +DRM_ERROR("Failed to allocate gang bo for userqueue (%d)", r);
> > +goto err_gangctx;
> > +}
> > +
> > +  

Re: [PATCH 5/8] drm/amdgpu: Create context for usermode queue

2023-02-07 Thread Alex Deucher
On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma  wrote:
>
> The FW expects us to allocate at least one page as context space for
> gang, process, shadow, GDS and FW_space related work. This
> patch creates the objects for the same, and adds IP specific
> functions to do this.
>
> Cc: Alex Deucher 
> Cc: Christian Koenig 
> Signed-off-by: Shashank Sharma 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  32 +
>  .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 121 ++
>  .../gpu/drm/amd/include/amdgpu_userqueue.h|  18 +++
>  3 files changed, 171 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> index 9f3490a91776..18281b3a51f1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> @@ -42,6 +42,28 @@ static struct amdgpu_usermode_queue
>  return idr_find(&uq_mgr->userq_idr, qid);
>  }
>
> +static void
> +amdgpu_userqueue_destroy_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
> +   struct amdgpu_usermode_queue *queue)
> +{
> +uq_mgr->userq_mqd_funcs->ctx_destroy(uq_mgr, queue);
> +}
> +
> +static int
> +amdgpu_userqueue_create_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
> +  struct amdgpu_usermode_queue *queue)
> +{
> +int r;
> +
> +r = uq_mgr->userq_mqd_funcs->ctx_create(uq_mgr, queue);
> +if (r) {
> +DRM_ERROR("Failed to create context space for queue\n");
> +return r;
> +}
> +
> +return 0;
> +}
> +
>  static int
>  amdgpu_userqueue_create_mqd(struct amdgpu_userq_mgr *uq_mgr, struct 
> amdgpu_usermode_queue *queue)
>  {
> @@ -142,12 +164,21 @@ static int amdgpu_userqueue_create(struct drm_file 
> *filp, union drm_amdgpu_userq
>  goto free_qid;
>  }
>
> +r = amdgpu_userqueue_create_ctx_space(uq_mgr, queue);
> +if (r) {
> +DRM_ERROR("Failed to create context space\n");
> +goto free_mqd;
> +}
> +
>  list_add_tail(&queue->userq_node, &uq_mgr->userq_list);
>  args->out.q_id = queue->queue_id;
>  args->out.flags = 0;
>  mutex_unlock(&uq_mgr->userq_mutex);
>  return 0;
>
> +free_mqd:
> +amdgpu_userqueue_destroy_mqd(uq_mgr, queue);
> +
>  free_qid:
>  amdgpu_userqueue_free_index(uq_mgr, queue->queue_id);
>
> @@ -170,6 +201,7 @@ static void amdgpu_userqueue_destroy(struct drm_file 
> *filp, int queue_id)
>  }
>
>  mutex_lock(&uq_mgr->userq_mutex);
> +amdgpu_userqueue_destroy_ctx_space(uq_mgr, queue);
>  amdgpu_userqueue_destroy_mqd(uq_mgr, queue);
>  amdgpu_userqueue_free_index(uq_mgr, queue->queue_id);
>  list_del(&queue->userq_node);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> index 57889729d635..687f90a587e3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> @@ -120,6 +120,125 @@ amdgpu_userq_gfx_v11_mqd_destroy(struct 
> amdgpu_userq_mgr *uq_mgr, struct amdgpu_
>
>  }
>
> +static int amdgpu_userq_gfx_v11_ctx_create(struct amdgpu_userq_mgr *uq_mgr,
> +   struct amdgpu_usermode_queue 
> *queue)
> +{
> +int r;
> +struct amdgpu_device *adev = uq_mgr->adev;
> +struct amdgpu_userq_ctx *pctx = &queue->proc_ctx;
> +struct amdgpu_userq_ctx *gctx = &queue->gang_ctx;
> +struct amdgpu_userq_ctx *gdsctx = &queue->gds_ctx;
> +struct amdgpu_userq_ctx *fwctx = &queue->fw_ctx;
> +struct amdgpu_userq_ctx *sctx = &queue->shadow_ctx;
> +
> +/*
> + * The FW expects at least one page of space allocated for
> + * process context related work, and one for gang context.
> + */
> +r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_PROC_CTX_SZ, PAGE_SIZE,
> +AMDGPU_GEM_DOMAIN_VRAM,
> +&pctx->obj,
> +&pctx->gpu_addr,
> +&pctx->cpu_ptr);
> +if (r) {
> +DRM_ERROR("Failed to allocate proc bo for userqueue (%d)", r);
> +return r;
> +}
> +
> +r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_GANG_CTX_SZ, PAGE_SIZE,
> +AMDGPU_GEM_DOMAIN_VRAM,
> +&gctx->obj,
> +&gctx->gpu_addr,
> +&gctx->cpu_ptr);
> +if (r) {
> +DRM_ERROR("Failed to allocate gang bo for userqueue (%d)", r);
> +goto err_gangctx;
> +}
> +
> +r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_GDS_CTX_SZ, PAGE_SIZE,
> +AMDGPU_GEM_DOMAIN_VRAM,
> +&gdsctx->obj,
> +&gdsctx->gpu_addr,
> +&gdsctx->cpu_ptr);
> +if (r) {
> +DRM
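
The quoted diff breaks off here. For orientation, a minimal sketch of how
the remaining allocations and the unwind path would typically look; this is
not the actual patch, and the AMDGPU_USERQ_FW_CTX_SZ macro and the err_*
labels are assumptions:

    /* Remaining context BOs follow the same pattern as above. */
    r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_FW_CTX_SZ, PAGE_SIZE,
                                AMDGPU_GEM_DOMAIN_VRAM,
                                &fwctx->obj, &fwctx->gpu_addr,
                                &fwctx->cpu_ptr);
    if (r)
        goto err_fwctx;

    /* ...shadow context allocated the same way... */

    /* Unwind in reverse order of allocation. */
    err_fwctx:
        amdgpu_bo_free_kernel(&gdsctx->obj, &gdsctx->gpu_addr,
                              &gdsctx->cpu_ptr);
    err_gdsctx:
        amdgpu_bo_free_kernel(&gctx->obj, &gctx->gpu_addr, &gctx->cpu_ptr);
    err_gangctx:
        amdgpu_bo_free_kernel(&pctx->obj, &pctx->gpu_addr, &pctx->cpu_ptr);
        return r;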

Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Shashank Sharma



On 07/02/2023 17:05, Alex Deucher wrote:

On Tue, Feb 7, 2023 at 10:43 AM Shashank Sharma  wrote:


On 07/02/2023 16:17, Alex Deucher wrote:

On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma  wrote:

From: Shashank Sharma 

MQD describes the properties of a user queue to the HW, and allows it to
accurately configure the queue while mapping it in GPU HW. This patch
adds:
- A new header file which contains the userqueue MQD definition for
V11 graphics engine.
- A new function which fills it with userqueue data and prepares MQD
- A function which sets up the MQD function ptrs in the generic userqueue
creation code.

V1: Addressed review comments from RFC patch series
  - Reuse the existing MQD structure instead of creating a new one
  - MQD format and creation can be IP specific, keep it like that

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Arvind Yadav 
Signed-off-by: Shashank Sharma 
---
   drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
   drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
   .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 ++
   drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
   4 files changed, 169 insertions(+), 8 deletions(-)
   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 764801cc8203..6ae9d5792791 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o

   # add usermode queue
   amdgpu-y += amdgpu_userqueue.o
+amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o

   ifneq ($(CONFIG_HSA_AMD),)
   AMDKFD_PATH := ../amdkfd
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
index 625c2fe1e84a..9f3490a91776 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
@@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, void *data,
   return r;
   }

+extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
+
+static int
+amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
+{
+int maj;
+struct amdgpu_device *adev = uq_mgr->adev;
+uint32_t version = adev->ip_versions[GC_HWIP][0];
+
+maj = IP_VERSION_MAJ(version);
+if (maj == 11) {
+uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
+} else {
+DRM_WARN("This IP doesn't support usermode queues\n");
+return -EINVAL;
+}
+

I think it would be cleaner to just store these callbacks in adev.
Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
in early_init for each IP, we can register the callbacks.  When the
user goes to create a new user_queue, we can check to see if the
function pointer is NULL or not for the queue type:

if (!adev->user_queue_funcs[ip_type])
return -EINVAL

r = adev->user_queue_funcs[ip_type]->create_queue();

Sounds like a good idea, we can do this.


Actually, there is already an mqd manager interface (adev->mqds[]).
Maybe you can leverage that interface.

Yep, I saw that and initially even tried to work on that interface
itself, and then realized that it doesn't allow us to pass some
additional parameters (like queue->vm, various BOs like proc_ctx_bo,
gang_ctx_bo's and so on). All of these are required in the MQD
and we will need them to be added into the MQD. I even thought of
expanding this structure with additional parameters, but I felt like
it defeats the purpose of these MQD properties. But if you feel
strongly about that, we can work around it.

I think it would be cleaner to just add whatever additional mqd
properties you need to amdgpu_mqd_prop, and then you can share
gfx_v11_0_gfx_mqd_init() and gfx_v11_0_compute_mqd_init()  for GFX and
sdma_v6_0_mqd_init() for SDMA.  That way if we make changes to the MQD
configuration, we only have to change one function.

Alex


Noted,

We might have to add some additional fptrs for .prepare_map() and
.prepare_unmap() in the mqd funcs.

These are required to prepare data for MES HW queue mapping.

- Shashank
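
For illustration, a minimal sketch of the per-IP callback table Alex
describes above; the user_queue_funcs field and its registration point are
assumptions, not existing amdgpu code:

    /* Hypothetical: a per-IP table in struct amdgpu_device. */
    const struct amdgpu_userq_mqd_funcs *user_queue_funcs[AMDGPU_HW_IP_NUM];

    /* Registered once from the IP block's early_init, e.g. for GFX v11: */
    adev->user_queue_funcs[AMDGPU_HW_IP_GFX] = &userq_gfx_v11_mqd_funcs;

    /* Queue creation then needs no IP-version switch: */
    if (!adev->user_queue_funcs[ip_type])
            return -EINVAL;
    r = adev->user_queue_funcs[ip_type]->mqd_create(uq_mgr, queue);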




+return 0;
+}
+
   int amdgpu_userq_mgr_init(struct amdgpu_userq_mgr *userq_mgr, struct 
amdgpu_device *adev)
   {
+int r;
+
   mutex_init(&userq_mgr->userq_mutex);
   idr_init_base(&userq_mgr->userq_idr, 1);
   INIT_LIST_HEAD(&userq_mgr->userq_list);
   userq_mgr->adev = adev;

+r = amdgpu_userqueue_setup_mqd_funcs(userq_mgr);
+if (r) {
+DRM_ERROR("Failed to setup MQD functions for usermode queue\n");
+return r;
+}
+
   return 0;
   }

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
new file mode 100644
index ..57889729d635
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
@@ -0,0 +1,132 @@
+/*
+ 

Re: [PATCH] drm/amd/display: Align num_crtc to max_streams

2023-02-07 Thread Harry Wentland



On 2/7/23 09:41, Hamza Mahfooz wrote:
> 
> On 2/7/23 09:31, Harry Wentland wrote:
>>
>>
>> On 2/7/23 08:00, Hamza Mahfooz wrote:
>>>
>>> On 2/6/23 23:05, Tianci Yin wrote:
 From: tiancyin 

 [Why]
 Display pipes might be harvested on some SKUs, which causes
 adev->mode_info.num_crtc to mismatch the usable crtc number;
 the below error dmesgs are then observed after GPU recovery.

     *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
     *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3

 [How]
 max_streams is the limited number after pipe fusing; align num_crtc
 to max_streams to eliminate the error logs.

 Signed-off-by: tiancyin 
 ---
    drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
    1 file changed, 3 insertions(+)

 diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
 b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 index b31cfda30ff9..87ec2574cc09 100644
 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
 @@ -4285,6 +4285,9 @@ static int amdgpu_dm_initialize_drm_device(struct 
 amdgpu_device *adev)
    break;
    }
    +    /* Adjust the crtc number according to the DCN pipe fuse. */
 +    adev->mode_info.num_crtc = dm->dc->caps.max_streams;
>>>
>>> This would introduce array-out-bounds issues, since there are arrays of
>>> size AMDGPU_MAX_CRTCS that use num_crtc as a bounds check.
>>
>>  From what I can tell max_streams is always <= AMDGPU_MAX_CRTCS (6),
>> though we're not checking. Maybe it'd be good to check and print a
>> DRM_ERROR here if that's ever not the case (e.g., if we ever add
>> any new ASIC that has more streams).
> 
> I have had UBSAN warns before commit d633b7a25fbe ("drm/amd/display: fix
> possible buffer overflow relating to secure display") was applied, so it
> seems to already be the case, maybe due to virtual streams.
> 

Interesting.

On closer look I'm not sure why this patch is needed. We already
do exactly what this patch does at the beginning of
amdgpu_dm_initialize_drm_device:

>   dm->display_indexes_num = dm->dc->caps.max_streams;
>   /* Update the actual used number of crtc */
>   adev->mode_info.num_crtc = adev->dm.display_indexes_num;

Harry

>>
>> Harry
>>
>>>
 +
    for (i = 0; i < dm->dc->caps.max_streams; i++)
    if (amdgpu_dm_crtc_init(dm, mode_info->planes[i], i)) {
    DRM_ERROR("KMS: Failed to initialize crtc\n");
>>>
>>
> 
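
For reference, a minimal sketch of the bounds check Harry suggests above;
the exact placement in amdgpu_dm_initialize_drm_device and the error
message are assumptions:

    /* Refuse before max_streams is used as num_crtc. */
    if (dm->dc->caps.max_streams > AMDGPU_MAX_CRTCS) {
            DRM_ERROR("KMS: max_streams (%d) exceeds AMDGPU_MAX_CRTCS (%d)\n",
                      dm->dc->caps.max_streams, AMDGPU_MAX_CRTCS);
            return -EINVAL;
    }
    adev->mode_info.num_crtc = dm->dc->caps.max_streams;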



Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Alex Deucher
On Tue, Feb 7, 2023 at 10:43 AM Shashank Sharma  wrote:
>
>
> On 07/02/2023 16:17, Alex Deucher wrote:
> > On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma  
> > wrote:
> >> From: Shashank Sharma 
> >>
> >> MQD describes the properties of a user queue to the HW, and allows it to
> >> accurately configure the queue while mapping it in GPU HW. This patch
> >> adds:
> >> - A new header file which contains the userqueue MQD definition for
> >>V11 graphics engine.
> >> - A new function which fills it with userqueue data and prepares MQD
> >> - A function which sets up the MQD function ptrs in the generic userqueue
> >>creation code.
> >>
> >> V1: Addressed review comments from RFC patch series
> >>  - Reuse the existing MQD structure instead of creating a new one
> >>  - MQD format and creation can be IP specific, keep it like that
> >>
> >> Cc: Alex Deucher 
> >> Cc: Christian Koenig 
> >> Signed-off-by: Arvind Yadav 
> >> Signed-off-by: Shashank Sharma 
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
> >>   .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 ++
> >>   drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
> >>   4 files changed, 169 insertions(+), 8 deletions(-)
> >>   create mode 100644 
> >> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> >> b/drivers/gpu/drm/amd/amdgpu/Makefile
> >> index 764801cc8203..6ae9d5792791 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> >> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> >> @@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o
> >>
> >>   # add usermode queue
> >>   amdgpu-y += amdgpu_userqueue.o
> >> +amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o
> >>
> >>   ifneq ($(CONFIG_HSA_AMD),)
> >>   AMDKFD_PATH := ../amdkfd
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> >> index 625c2fe1e84a..9f3490a91776 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> >> @@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, void 
> >> *data,
> >>   return r;
> >>   }
> >>
> >> +extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
> >> +
> >> +static int
> >> +amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
> >> +{
> >> +int maj;
> >> +struct amdgpu_device *adev = uq_mgr->adev;
> >> +uint32_t version = adev->ip_versions[GC_HWIP][0];
> >> +
> >> +maj = IP_VERSION_MAJ(version);
> >> +if (maj == 11) {
> >> +uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
> >> +} else {
> >> +DRM_WARN("This IP doesn't support usermode queues\n");
> >> +return -EINVAL;
> >> +}
> >> +
> > I think it would be cleaner to just store these callbacks in adev.
> > Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
> > in early_init for each IP, we can register the callbacks.  When the
> > user goes to create a new user_queue, we can check to see if the
> > function pointer is NULL or not for the queue type:
> >
> > if (!adev->user_queue_funcs[ip_type])
> >return -EINVAL
> >
> > r = adev->user_queue_funcs[ip_type]->create_queue();
>
> Sounds like a good idea, we can do this.
>
> >
> > Actually, there is already an mqd manager interface (adev->mqds[]).
> > Maybe you can leverage that interface.
>
> Yep, I saw that and initially even tried to work on that interface
> itself, and then realized that it doesn't allow us to pass some
>
> additional parameters (like queue->vm, various BOs like proc_ctx_bo,
> gang_ctx_bo's and so on). All of these are required in the MQD
>
> and we will need them to be added into MQD. I even thought of expanding
> this structure with additional parameters, but I felt like
>
> it defeats the purpose of these MQD properties. But if you feel strongly
> about that, we can work around it.

I think it would be cleaner to just add whatever additional mqd
properties you need to amdgpu_mqd_prop, and then you can share
gfx_v11_0_gfx_mqd_init() and gfx_v11_0_compute_mqd_init()  for GFX and
sdma_v6_0_mqd_init() for SDMA.  That way if we make changes to the MQD
configuration, we only have to change one function.

Alex

>
> >> +return 0;
> >> +}
> >> +
> >>   int amdgpu_userq_mgr_init(struct amdgpu_userq_mgr *userq_mgr, struct 
> >> amdgpu_device *adev)
> >>   {
> >> +int r;
> >> +
> >>   mutex_init(&userq_mgr->userq_mutex);
> >>   idr_init_base(&userq_mgr->userq_idr, 1);
> >>   INIT_LIST_HEAD(&userq_mgr->userq_list);
> >>   userq_mgr->adev = adev;
> >>
> >> +r = amdgpu_userqueue_setup_mqd_funcs(userq_mgr);
> >> +if (r) {
> >> +DRM_ERROR("Failed to setup MQD functions for usermode queue\n");
> >> +return r;
> >> +}
> >> +
> >>   return 0;
> >>   }
> >>
> >> diff --git a/drivers

Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Shashank Sharma



On 07/02/2023 16:17, Alex Deucher wrote:

On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma  wrote:

From: Shashank Sharma 

MQD describes the properties of a user queue to the HW, and allows it to
accurately configure the queue while mapping it in GPU HW. This patch
adds:
- A new header file which contains the userqueue MQD definition for
   V11 graphics engine.
- A new function which fills it with userqueue data and prepares MQD
- A function which sets up the MQD function ptrs in the generic userqueue
   creation code.

V1: Addressed review comments from RFC patch series
 - Reuse the existing MQD structure instead of creating a new one
 - MQD format and creation can be IP specific, keep it like that

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Arvind Yadav 
Signed-off-by: Shashank Sharma 
---
  drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
  .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 ++
  drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
  4 files changed, 169 insertions(+), 8 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 764801cc8203..6ae9d5792791 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o

  # add usermode queue
  amdgpu-y += amdgpu_userqueue.o
+amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o

  ifneq ($(CONFIG_HSA_AMD),)
  AMDKFD_PATH := ../amdkfd
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
index 625c2fe1e84a..9f3490a91776 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
@@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, void *data,
  return r;
  }

+extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
+
+static int
+amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
+{
+int maj;
+struct amdgpu_device *adev = uq_mgr->adev;
+uint32_t version = adev->ip_versions[GC_HWIP][0];
+
+maj = IP_VERSION_MAJ(version);
+if (maj == 11) {
+uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
+} else {
+DRM_WARN("This IP doesn't support usermode queues\n");
+return -EINVAL;
+}
+

I think it would be cleaner to just store these callbacks in adev.
Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
in early_init for each IP, we can register the callbacks.  When the
user goes to create a new user_queue, we can check to see if the
function pointer is NULL or not for the queue type:

if (!adev->user_queue_funcs[ip_type])
   return -EINVAL

r = adev->user_queue_funcs[ip_type]->create_queue();


Sounds like a good idea, we can do this.



Actually, there is already an mqd manager interface (adev->mqds[]).
Maybe you can leverage that interface.


Yep, I saw that and initially even tried to work on that interface
itself, and then realized that it doesn't allow us to pass some
additional parameters (like queue->vm, various BOs like proc_ctx_bo,
gang_ctx_bo's and so on). All of these are required in the MQD
and we will need them to be added into the MQD. I even thought of
expanding this structure with additional parameters, but I felt like
it defeats the purpose of these MQD properties. But if you feel
strongly about that, we can work around it.



+return 0;
+}
+
  int amdgpu_userq_mgr_init(struct amdgpu_userq_mgr *userq_mgr, struct 
amdgpu_device *adev)
  {
+int r;
+
  mutex_init(&userq_mgr->userq_mutex);
  idr_init_base(&userq_mgr->userq_idr, 1);
  INIT_LIST_HEAD(&userq_mgr->userq_list);
  userq_mgr->adev = adev;

+r = amdgpu_userqueue_setup_mqd_funcs(userq_mgr);
+if (r) {
+DRM_ERROR("Failed to setup MQD functions for usermode queue\n");
+return r;
+}
+
  return 0;
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
new file mode 100644
index ..57889729d635
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
@@ -0,0 +1,132 @@
+/*
+ * Copyright 2022 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the 

Re: [PATCH 4/8] drm/amdgpu: Add V11 graphics MQD functions

2023-02-07 Thread Alex Deucher
On Fri, Feb 3, 2023 at 4:55 PM Shashank Sharma  wrote:
>
> From: Shashank Sharma 
>
> MQD describes the properties of a user queue to the HW, and allows it to
> accurately configure the queue while mapping it in GPU HW. This patch
> adds:
> - A new header file which contains the userqueue MQD definition for
>   V11 graphics engine.
> - A new function which fills it with userqueue data and prepares MQD
> - A function which sets up the MQD function ptrs in the generic userqueue
>   creation code.
>
> V1: Addressed review comments from RFC patch series
> - Reuse the existing MQD structure instead of creating a new one
> - MQD format and creation can be IP specific, keep it like that
>
> Cc: Alex Deucher 
> Cc: Christian Koenig 
> Signed-off-by: Arvind Yadav 
> Signed-off-by: Shashank Sharma 
> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile   |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  28 
>  .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 132 ++
>  drivers/gpu/drm/amd/include/v11_structs.h |  16 +--
>  4 files changed, 169 insertions(+), 8 deletions(-)
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 764801cc8203..6ae9d5792791 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -212,6 +212,7 @@ amdgpu-y += amdgpu_amdkfd.o
>
>  # add usermode queue
>  amdgpu-y += amdgpu_userqueue.o
> +amdgpu-y += amdgpu_userqueue_mqd_gfx_v11.o
>
>  ifneq ($(CONFIG_HSA_AMD),)
>  AMDKFD_PATH := ../amdkfd
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> index 625c2fe1e84a..9f3490a91776 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> @@ -202,13 +202,41 @@ int amdgpu_userq_ioctl(struct drm_device *dev, void 
> *data,
>  return r;
>  }
>
> +extern const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs;
> +
> +static int
> +amdgpu_userqueue_setup_mqd_funcs(struct amdgpu_userq_mgr *uq_mgr)
> +{
> +int maj;
> +struct amdgpu_device *adev = uq_mgr->adev;
> +uint32_t version = adev->ip_versions[GC_HWIP][0];
> +
> +maj = IP_VERSION_MAJ(version);
> +if (maj == 11) {
> +uq_mgr->userq_mqd_funcs = &userq_gfx_v11_mqd_funcs;
> +} else {
> +DRM_WARN("This IP doesn't support usermode queues\n");
> +return -EINVAL;
> +}
> +

I think it would be cleaner to just store these callbacks in adev.
Maybe something like adev->user_queue_funcs[AMDGPU_HW_IP_NUM].  Then
in early_init for each IP, we can register the callbacks.  When the
> user goes to create a new user_queue, we can check to see if the
function pointer is NULL or not for the queue type:

if (!adev->user_queue_funcs[ip_type])
  return -EINVAL

r = adev->user_queue_funcs[ip_type]->create_queue();

Actually, there is already an mqd manager interface (adev->mqds[]).
Maybe you can leverage that interface.

> +return 0;
> +}
> +
>  int amdgpu_userq_mgr_init(struct amdgpu_userq_mgr *userq_mgr, struct 
> amdgpu_device *adev)
>  {
> +int r;
> +
>  mutex_init(&userq_mgr->userq_mutex);
>  idr_init_base(&userq_mgr->userq_idr, 1);
>  INIT_LIST_HEAD(&userq_mgr->userq_list);
>  userq_mgr->adev = adev;
>
> +r = amdgpu_userqueue_setup_mqd_funcs(userq_mgr);
> +if (r) {
> +DRM_ERROR("Failed to setup MQD functions for usermode queue\n");
> +return r;
> +}
> +
>  return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> new file mode 100644
> index ..57889729d635
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
> @@ -0,0 +1,132 @@
> +/*
> + * Copyright 2022 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISI

Re: [PATCH 2/8] drm/amdgpu: add usermode queues

2023-02-07 Thread Shashank Sharma



On 07/02/2023 15:54, Alex Deucher wrote:

On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma  wrote:

From: Shashank Sharma 

This patch adds skeleton code for usermode queue creation. It
contains:
- A new structure to keep all the user queue data in one place.
- An IOCTL function to create/free a usermode queue.
- A function to generate a unique index for the queue.
- A queue context manager in driver private data.

V1: Worked on design review comments from RFC patch series:
(https://patchwork.freedesktop.org/series/112214/)

- Alex: Keep a list of queues, instead of single queue per process.
- Christian: Use the queue manager instead of global ptrs,
Don't keep the queue structure in amdgpu_ctx

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Shashank Sharma 
---
  drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |   5 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 155 ++
  .../gpu/drm/amd/include/amdgpu_userqueue.h|  64 
  6 files changed, 230 insertions(+)
  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
  create mode 100644 drivers/gpu/drm/amd/include/amdgpu_userqueue.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 798d0e9a60b7..764801cc8203 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -210,6 +210,8 @@ amdgpu-y += \
  # add amdkfd interfaces
  amdgpu-y += amdgpu_amdkfd.o

+# add usermode queue
+amdgpu-y += amdgpu_userqueue.o

  ifneq ($(CONFIG_HSA_AMD),)
  AMDKFD_PATH := ../amdkfd
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6b74df446694..0625d6bdadf4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -109,6 +109,7 @@
  #include "amdgpu_fdinfo.h"
  #include "amdgpu_mca.h"
  #include "amdgpu_ras.h"
+#include "amdgpu_userqueue.h"

  #define MAX_GPU_INSTANCE   16

@@ -482,6 +483,7 @@ struct amdgpu_fpriv {
 struct mutexbo_list_lock;
 struct idr  bo_list_handles;
 struct amdgpu_ctx_mgr   ctx_mgr;
+   struct amdgpu_userq_mgr userq_mgr;
  };

  int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b4f2d61ea0d5..229976a2d0e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -52,6 +52,7 @@
  #include "amdgpu_ras.h"
  #include "amdgpu_xgmi.h"
  #include "amdgpu_reset.h"
+#include "amdgpu_userqueue.h"

  /*
   * KMS wrapper.
@@ -2748,6 +2749,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
 DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
 DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
 DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(AMDGPU_USERQ, amdgpu_userq_ioctl, 
DRM_AUTH|DRM_RENDER_ALLOW),
  };

  static const struct drm_driver amdgpu_kms_driver = {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 7aa7e52ca784..52e61e339a88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -1187,6 +1187,10 @@ int amdgpu_driver_open_kms(struct drm_device *dev, 
struct drm_file *file_priv)

 amdgpu_ctx_mgr_init(&fpriv->ctx_mgr, adev);

+   r = amdgpu_userq_mgr_init(&fpriv->userq_mgr, adev);
+   if (r)
+   DRM_WARN("Can't setup usermode queues, only legacy workload 
submission will work\n");
+
 file_priv->driver_priv = fpriv;
 goto out_suspend;

@@ -1254,6 +1258,7 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,

 amdgpu_ctx_mgr_fini(&fpriv->ctx_mgr);
 amdgpu_vm_fini(adev, &fpriv->vm);
+   amdgpu_userq_mgr_fini(&fpriv->userq_mgr);

 if (pasid)
 amdgpu_pasid_free_delayed(pd->tbo.base.resv, pasid);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
new file mode 100644
index ..d5bc7fe81750
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
@@ -0,0 +1,155 @@
+/*
+ * Copyright 2022 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do
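
To show how the new ioctl would be driven from userspace, a hedged sketch
using libdrm's drmCommandWriteRead(); the input field names (in.op, in.mqd)
are assumed from this series and may not match the final UAPI:

    #include <stdio.h>
    #include <xf86drm.h>
    #include "amdgpu_drm.h"

    int create_user_queue(int fd)
    {
            union drm_amdgpu_userq args = {0};

            /* Hypothetical field names, per drm_amdgpu_userq_mqd in 1/8. */
            args.in.op = AMDGPU_USERQ_OP_CREATE;
            args.in.mqd.ip_type = AMDGPU_HW_IP_GFX;
            /* ...flags, doorbell_handle, doorbell_offset, queue VA, etc. */

            if (drmCommandWriteRead(fd, DRM_AMDGPU_USERQ, &args, sizeof(args)))
                    return -1;

            printf("created user queue id %u\n", args.out.q_id);
            return 0;
    }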

Re: [PATCH 3/8] drm/amdgpu: introduce userqueue MQD handlers

2023-02-07 Thread Alex Deucher
On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma  wrote:
>
> From: Shashank Sharma 
>
> A Memory queue descriptor (MQD) of a userqueue defines it in the hardware's
> context. As the method of forming an MQD and its format can vary between
> different graphics IPs, we need gfx generation specific handlers to create MQDs.
>
> This patch:
> - Introduces MQD handler functions for the usermode queues
> - A general function to create and destroy MQD for a userqueue.
>
> V1: Worked on review comments from Alex on RFC patches:
> MQD creation should be gen and IP specific.
>
> Cc: Alex Deucher 
> Cc: Christian Koenig 
> Signed-off-by: Shashank Sharma 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 64 +++
>  .../gpu/drm/amd/include/amdgpu_userqueue.h|  9 +++
>  2 files changed, 73 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> index d5bc7fe81750..625c2fe1e84a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> @@ -42,6 +42,60 @@ static struct amdgpu_usermode_queue
>  return idr_find(&uq_mgr->userq_idr, qid);
>  }
>
> +static int
> +amdgpu_userqueue_create_mqd(struct amdgpu_userq_mgr *uq_mgr, struct 
> amdgpu_usermode_queue *queue)
> +{
> +int r;
> +int size;
> +struct amdgpu_device *adev = uq_mgr->adev;
> +
> +if (!uq_mgr->userq_mqd_funcs) {
> +DRM_ERROR("Userqueue not initialized\n");
> +return -EINVAL;
> +}
> +
> +size = uq_mgr->userq_mqd_funcs->mqd_size(uq_mgr);
> +r = amdgpu_bo_create_kernel(adev, size, PAGE_SIZE,
> +AMDGPU_GEM_DOMAIN_VRAM,
> +&queue->mqd_obj,
> +&queue->mqd_gpu_addr,
> +&queue->mqd_cpu_ptr);
> +if (r) {
> +DRM_ERROR("Failed to allocate bo for userqueue (%d)", r);
> +return r;
> +}
> +
> +memset(queue->mqd_cpu_ptr, 0, size);
> +r = amdgpu_bo_reserve(queue->mqd_obj, false);
> +if (unlikely(r != 0)) {
> +DRM_ERROR("Failed to reserve mqd for userqueue (%d)", r);
> +goto free_mqd;
> +}
> +
> +r = uq_mgr->userq_mqd_funcs->mqd_create(uq_mgr, queue);
> +amdgpu_bo_unreserve(queue->mqd_obj);
> +if (r) {
> +DRM_ERROR("Failed to create MQD for queue\n");
> +goto free_mqd;
> +}
> +return 0;
> +
> +free_mqd:
> +amdgpu_bo_free_kernel(&queue->mqd_obj,
> +  &queue->mqd_gpu_addr,
> +  &queue->mqd_cpu_ptr);
> +   return r;
> +}
> +
> +static void
> +amdgpu_userqueue_destroy_mqd(struct amdgpu_userq_mgr *uq_mgr, struct 
> amdgpu_usermode_queue *queue)
> +{
> +uq_mgr->userq_mqd_funcs->mqd_destroy(uq_mgr, queue);
> +amdgpu_bo_free_kernel(&queue->mqd_obj,
> +  &queue->mqd_gpu_addr,
> +  &queue->mqd_cpu_ptr);
> +}
> +
>  static int amdgpu_userqueue_create(struct drm_file *filp, union 
> drm_amdgpu_userq *args)
>  {
>  int r, pasid;
> @@ -82,12 +136,21 @@ static int amdgpu_userqueue_create(struct drm_file 
> *filp, union drm_amdgpu_userq
>  goto free_queue;
>  }
>
> +r = amdgpu_userqueue_create_mqd(uq_mgr, queue);
> +if (r) {
> +DRM_ERROR("Failed to create MQD\n");
> +goto free_qid;
> +}
> +
>  list_add_tail(&queue->userq_node, &uq_mgr->userq_list);
>  args->out.q_id = queue->queue_id;
>  args->out.flags = 0;
>  mutex_unlock(&uq_mgr->userq_mutex);
>  return 0;
>
> +free_qid:
> +amdgpu_userqueue_free_index(uq_mgr, queue->queue_id);
> +
>  free_queue:
>  mutex_unlock(&uq_mgr->userq_mutex);
>  kfree(queue);
> @@ -107,6 +170,7 @@ static void amdgpu_userqueue_destroy(struct drm_file 
> *filp, int queue_id)
>  }
>
>  mutex_lock(&uq_mgr->userq_mutex);
> +amdgpu_userqueue_destroy_mqd(uq_mgr, queue);
>  amdgpu_userqueue_free_index(uq_mgr, queue->queue_id);
>  list_del(&queue->userq_node);
>  mutex_unlock(&uq_mgr->userq_mutex);
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_userqueue.h 
> b/drivers/gpu/drm/amd/include/amdgpu_userqueue.h
> index 9557588fe34f..a6abdfd5cb74 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_userqueue.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_userqueue.h
> @@ -26,10 +26,13 @@
>
>  #define AMDGPU_MAX_USERQ 512
>
> +struct amdgpu_userq_mqd_funcs;
> +
>  struct amdgpu_userq_mgr {
> struct idr userq_idr;
> struct mutex userq_mutex;
> struct list_head userq_list;
> +   const struct amdgpu_userq_mqd_funcs *userq_mqd_funcs;
> struct amdgpu_device *adev;
>  };
>
> @@ -57,6 +60,12 @@ struct amdgpu_usermode_queue {
>
>  int amdgpu_userq_ioctl(struct drm_device *dev, void *data, struct drm_file 
> *filp);
>
> +struct amdgpu_userq_mqd_funcs {
> +   int (*mqd_size)(struct amdgpu_userq_mgr *);
> +   int (*mqd_create)(struct am
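
For orientation, a hedged sketch of how an IP backend would fill this
vtable; the function names follow patch 4/8 of this series, while the
mqd_size body and the v11_gfx_mqd type are assumptions:

    static int amdgpu_userq_gfx_v11_mqd_size(struct amdgpu_userq_mgr *uq_mgr)
    {
            /* Size of the HW MQD this IP expects (assumed type name). */
            return sizeof(struct v11_gfx_mqd);
    }

    const struct amdgpu_userq_mqd_funcs userq_gfx_v11_mqd_funcs = {
            .mqd_size    = amdgpu_userq_gfx_v11_mqd_size,
            .mqd_create  = amdgpu_userq_gfx_v11_mqd_create,
            .mqd_destroy = amdgpu_userq_gfx_v11_mqd_destroy,
    };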

Re: [RFC PATCH] drm: Create documentation about device resets

2023-02-07 Thread Michel Dänzer
On 2/7/23 14:30, Pekka Paalanen wrote:
> On Mon, 23 Jan 2023 21:38:11 +0100
> Christian König  wrote:
>> Am 23.01.23 um 21:26 schrieb André Almeida:
>>>
>>> diff --git a/Documentation/gpu/drm-reset.rst 
>>> b/Documentation/gpu/drm-reset.rst
>>> new file mode 100644
>>> index ..0dd11a469cf9
>>> --- /dev/null
>>> +++ b/Documentation/gpu/drm-reset.rst
>>> @@ -0,0 +1,51 @@
>>> +
>>> +DRM Device Reset
>>> +
>>> +
>>> +The GPU stack is really complex and is prone to errors, from hardware bugs,
>>> +faulty applications and everything in the many layers in between. To recover
>>> +from this kind of state, it is sometimes necessary to reset the GPU. Improper
>>> +handling of GPU resets can lead to an unstable userspace. This page describes
>>> +the expected behaviour of DRM drivers in those situations, as well as that of
>>> +usermode drivers and compositors.
>>> +
>>> +Robustness
>>> +----------
>>> +
>>> +First of all, application robustness APIs, when available, should be used.
>>> +This allows the application to correctly recover and continue to run after
>>> +a reset. Apps that don't use them should be promptly killed when the kernel
>>> +driver detects that they are in a broken state. Specific guidelines for
>>> +some APIs:
>>> +  
>>
>>> +- OpenGL: During a reset, the KMD kills processes that don't have ARB
>>> +  Robustness enabled, assuming they can't recover.
>>
>> This is a pretty clear NAK from my side to this approach. The KMD should 
>> never mess with a userspace process directly in such a way.
>>
>> Instead use something like this "OpenGL: KMD signals the abortion of 
>> submitted commands and the UMD should then react accordingly and abort 
>> the application.".
> 
> what Christian said, plus I would not assume that robust programs will
> always respond by creating a new context. They could also switch
> to a software renderer, [...]

That is indeed what Firefox does.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH 2/8] drm/amdgpu: add usermode queues

2023-02-07 Thread Alex Deucher
On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma  wrote:
>
> From: Shashank Sharma 
>
> This patch adds skeleton code for usermode queue creation. It
> contains:
> - A new structure to keep all the user queue data in one place.
> - An IOCTL function to create/free a usermode queue.
> - A function to generate a unique index for the queue.
> - A queue context manager in driver private data.
>
> V1: Worked on design review comments from RFC patch series:
> (https://patchwork.freedesktop.org/series/112214/)
>
> - Alex: Keep a list of queues, instead of single queue per process.
> - Christian: Use the queue manager instead of global ptrs,
>Don't keep the queue structure in amdgpu_ctx
>
> Cc: Alex Deucher 
> Cc: Christian Koenig 
> Signed-off-by: Shashank Sharma 
> ---
>  drivers/gpu/drm/amd/amdgpu/Makefile   |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |   5 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 155 ++
>  .../gpu/drm/amd/include/amdgpu_userqueue.h|  64 
>  6 files changed, 230 insertions(+)
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
>  create mode 100644 drivers/gpu/drm/amd/include/amdgpu_userqueue.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 798d0e9a60b7..764801cc8203 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -210,6 +210,8 @@ amdgpu-y += \
>  # add amdkfd interfaces
>  amdgpu-y += amdgpu_amdkfd.o
>
> +# add usermode queue
> +amdgpu-y += amdgpu_userqueue.o
>
>  ifneq ($(CONFIG_HSA_AMD),)
>  AMDKFD_PATH := ../amdkfd
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 6b74df446694..0625d6bdadf4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -109,6 +109,7 @@
>  #include "amdgpu_fdinfo.h"
>  #include "amdgpu_mca.h"
>  #include "amdgpu_ras.h"
> +#include "amdgpu_userqueue.h"
>
>  #define MAX_GPU_INSTANCE   16
>
> @@ -482,6 +483,7 @@ struct amdgpu_fpriv {
> struct mutexbo_list_lock;
> struct idr  bo_list_handles;
> struct amdgpu_ctx_mgr   ctx_mgr;
> +   struct amdgpu_userq_mgr userq_mgr;
>  };
>
>  int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index b4f2d61ea0d5..229976a2d0e7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -52,6 +52,7 @@
>  #include "amdgpu_ras.h"
>  #include "amdgpu_xgmi.h"
>  #include "amdgpu_reset.h"
> +#include "amdgpu_userqueue.h"
>
>  /*
>   * KMS wrapper.
> @@ -2748,6 +2749,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, 
> DRM_AUTH|DRM_RENDER_ALLOW),
> DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, 
> DRM_AUTH|DRM_RENDER_ALLOW),
> DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, 
> DRM_AUTH|DRM_RENDER_ALLOW),
> +   DRM_IOCTL_DEF_DRV(AMDGPU_USERQ, amdgpu_userq_ioctl, 
> DRM_AUTH|DRM_RENDER_ALLOW),
>  };
>
>  static const struct drm_driver amdgpu_kms_driver = {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 7aa7e52ca784..52e61e339a88 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -1187,6 +1187,10 @@ int amdgpu_driver_open_kms(struct drm_device *dev, 
> struct drm_file *file_priv)
>
> amdgpu_ctx_mgr_init(&fpriv->ctx_mgr, adev);
>
> +   r = amdgpu_userq_mgr_init(&fpriv->userq_mgr, adev);
> +   if (r)
> +   DRM_WARN("Can't setup usermode queues, only legacy workload 
> submission will work\n");
> +
> file_priv->driver_priv = fpriv;
> goto out_suspend;
>
> @@ -1254,6 +1258,7 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
>
> amdgpu_ctx_mgr_fini(&fpriv->ctx_mgr);
> amdgpu_vm_fini(adev, &fpriv->vm);
> +   amdgpu_userq_mgr_fini(&fpriv->userq_mgr);
>
> if (pasid)
> amdgpu_pasid_free_delayed(pd->tbo.base.resv, pasid);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> new file mode 100644
> index ..d5bc7fe81750
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> @@ -0,0 +1,155 @@
> +/*
> + * Copyright 2022 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, 

Re: [PATCH] drm/amd/display: Align num_crtc to max_streams

2023-02-07 Thread Hamza Mahfooz



On 2/7/23 09:31, Harry Wentland wrote:



On 2/7/23 08:00, Hamza Mahfooz wrote:


On 2/6/23 23:05, Tianci Yin wrote:

From: tiancyin 

[Why]
Display pipes might be harvested on some SKUs, which causes
adev->mode_info.num_crtc to mismatch the usable crtc number;
the below error dmesgs are then observed after GPU recovery.

    *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
    *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3

[How]
max_streams is the limited number after pipe fusing; align num_crtc
to max_streams to eliminate the error logs.

Signed-off-by: tiancyin 
---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b31cfda30ff9..87ec2574cc09 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4285,6 +4285,9 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
   break;
   }
   +    /* Adjust the crtc number according to the DCN pipe fuse. */
+    adev->mode_info.num_crtc = dm->dc->caps.max_streams;


This would introduce array-out-bounds issues, since there are arrays of
size AMDGPU_MAX_CRTCS that use num_crtc as a bounds check.


 From what I can tell max_streams is always <= AMDGPU_MAX_CRTCS (6),
though we're not checking. Maybe it'd be good to check and print a
DRM_ERROR here if that's ever not the case (e.g., if we ever add
any new ASIC that has more streams).


I have had UBSAN warns before commit d633b7a25fbe ("drm/amd/display: fix
possible buffer overflow relating to secure display") was applied, so it
seems to already be the case, maybe due to virtual streams.



Harry




+
   for (i = 0; i < dm->dc->caps.max_streams; i++)
   if (amdgpu_dm_crtc_init(dm, mode_info->planes[i], i)) {
   DRM_ERROR("KMS: Failed to initialize crtc\n");






--
Hamza



Re: [PATCH 1/8] drm/amdgpu: UAPI for user queue management

2023-02-07 Thread Shashank Sharma



On 07/02/2023 15:20, Alex Deucher wrote:

On Tue, Feb 7, 2023 at 9:19 AM Christian König  wrote:

Am 07.02.23 um 15:17 schrieb Alex Deucher:

On Tue, Feb 7, 2023 at 9:11 AM Christian König
 wrote:

Am 07.02.23 um 15:07 schrieb Alex Deucher:

On Tue, Feb 7, 2023 at 2:38 AM Shashank Sharma  wrote:

On 07/02/2023 08:03, Christian König wrote:

Am 06.02.23 um 22:03 schrieb Alex Deucher:

On Mon, Feb 6, 2023 at 12:01 PM Christian König
 wrote:

Am 06.02.23 um 17:56 schrieb Alex Deucher:

On Fri, Feb 3, 2023 at 5:26 PM Shashank Sharma
 wrote:

Hey Alex,

On 03/02/2023 23:07, Alex Deucher wrote:

On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma
 wrote:

From: Alex Deucher 

This patch introduces a new UAPI/IOCTL for usermode graphics
queue. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.

This UAPI maps the queue into the GPU, so the graphics app can start
submitting work to the queue as soon as the call returns.

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 
---
   include/uapi/drm/amdgpu_drm.h | 53
+++
   1 file changed, 53 insertions(+)

diff --git a/include/uapi/drm/amdgpu_drm.h
b/include/uapi/drm/amdgpu_drm.h
index 4038abe8505a..6c5235d107b3 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -54,6 +54,7 @@ extern "C" {
   #define DRM_AMDGPU_VM  0x13
   #define DRM_AMDGPU_FENCE_TO_HANDLE 0x14
   #define DRM_AMDGPU_SCHED   0x15
+#define DRM_AMDGPU_USERQ   0x16

   #define DRM_IOCTL_AMDGPU_GEM_CREATE
DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union
drm_amdgpu_gem_create)
   #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE
+ DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -71,6 +72,7 @@ extern "C" {
   #define DRM_IOCTL_AMDGPU_VM DRM_IOWR(DRM_COMMAND_BASE +
DRM_AMDGPU_VM, union drm_amdgpu_vm)
   #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE
DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union
drm_amdgpu_fence_to_handle)
   #define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE +
DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
+#define DRM_IOCTL_AMDGPU_USERQ DRM_IOW(DRM_COMMAND_BASE +
DRM_AMDGPU_USERQ, union drm_amdgpu_userq)

   /**
* DOC: memory domains
@@ -302,6 +304,57 @@ union drm_amdgpu_ctx {
  union drm_amdgpu_ctx_out out;
   };

+/* user queue IOCTL */
+#define AMDGPU_USERQ_OP_CREATE 1
+#define AMDGPU_USERQ_OP_FREE   2
+
+#define AMDGPU_USERQ_MQD_FLAGS_SECURE  (1 << 0)
+#define AMDGPU_USERQ_MQD_FLAGS_AQL (1 << 1)
+
+struct drm_amdgpu_userq_mqd {
+   /** Flags: AMDGPU_USERQ_MQD_FLAGS_* */
+   __u32   flags;
+   /** IP type: AMDGPU_HW_IP_* */
+   __u32   ip_type;
+   /** GEM object handle */
+   __u32   doorbell_handle;
+   /** Doorbell offset in dwords */
+   __u32   doorbell_offset;

Since doorbells are 64 bit, maybe this offset should be in qwords.

Can you please help to cross-check this information? All the existing
kernel doorbell calculations keep the doorbell size as sizeof(u32).

Doorbells on pre-vega hardware are 32 bits so that is where that comes
from, but from vega onward most doorbells are 64 bit.  I think some
versions of VCN may still use 32 bit doorbells.  Internally in the
kernel driver we just use two slots for newer hardware, but for the
UAPI, I think we can just stick with 64 bit slots to avoid confusion.
Even if an engine only uses a 32 bit one, I don't know that there is
much value to trying to support variable doorbell sizes.

I think we can stick with using __u32 because this is *not* the size of
the doorbell entries.

Instead this is the offset into the BO where to find the doorbell for
this queue (which then in turn is 64bits wide).

Since we will probably never have more than 4GiB doorbells we should be
pretty safe to use 32 bits here.

Yes, the offset would still be 32 bits, but the units would be
qwords.  E.g.,

+   /** Doorbell offset in qwords */
+   __u32   doorbell_offset;

That way you couldn't accidently specify an overlapping doorbell.

Ah, so you only wanted to fix the comment. That was absolutely not
clear from the discussion.

If I understand this correctly, the offset of the doorbell in the BO is
still 32-bit, but its width (size in bytes) is 64 bits. Am I getting
that right?

Right.  Each doorbell is 64 bits (8 bytes) so this value would
basically be an index into the doorbell bo.  Having it be a 64 bit
index rather than a 32 bit index would avoid the possibility of users
specifying overlapping doorbells.  E.g.,
offset in bytes
0 - doorbell
4 - doorbell
Would be incorrect, while
offset in bytes
0 - doorbell
8 - doorbell
Would be correct.

I.e., u64 doorbell_page[512] vs u32 doorbell_page[1024]

Well I usually prefer just straight byte offsets, but I think the main
question 
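
To make the dword-vs-qword point concrete, a small sketch of how a qword
index maps to a byte offset; the mqd_in variable name is hypothetical:

    /* UAPI doorbell_offset as an index into 64-bit doorbell slots. */
    u64 db_byte_offset = (u64)mqd_in->doorbell_offset * sizeof(u64);

    /*
     * With qword indexing, indices 0 and 1 land at bytes 0 and 8; the
     * overlapping pair (bytes 0 and 4) that a dword index permits
     * simply cannot be expressed.
     */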

Re: [PATCH] drm/amd/display: Align num_crtc to max_streams

2023-02-07 Thread Harry Wentland



On 2/7/23 08:00, Hamza Mahfooz wrote:
> 
> On 2/6/23 23:05, Tianci Yin wrote:
>> From: tiancyin 
>>
>> [Why]
>> Display pipes might be harvested on some SKUs, which causes
>> adev->mode_info.num_crtc to mismatch the usable crtc number;
>> the below error dmesgs are then observed after GPU recovery.
>>
>>    *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
>>    *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
>>
>> [How]
>> max_streams is the limited number after pipe fusing; align num_crtc
>> to max_streams to eliminate the error logs.
>>
>> Signed-off-by: tiancyin 
>> ---
>>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index b31cfda30ff9..87ec2574cc09 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -4285,6 +4285,9 @@ static int amdgpu_dm_initialize_drm_device(struct 
>> amdgpu_device *adev)
>>   break;
>>   }
>>   +    /* Adjust the crtc number according to the DCN pipe fuse. */
>> +    adev->mode_info.num_crtc = dm->dc->caps.max_streams;
> 
> This would introduce array-out-bounds issues, since there are arrays of
> size AMDGPU_MAX_CRTCS that use num_crtc as a bounds check.

From what I can tell max_streams is always <= AMDGPU_MAX_CRTCS (6),
though we're not checking. Maybe it'd be good to check and print a
DRM_ERROR here if that's ever not the case (e.g., if we ever add
any new ASIC that has more streams).

Harry

> 
>> +
>>   for (i = 0; i < dm->dc->caps.max_streams; i++)
>>   if (amdgpu_dm_crtc_init(dm, mode_info->planes[i], i)) {
>>   DRM_ERROR("KMS: Failed to initialize crtc\n");
> 



Re: [PATCH] drm/amdgpu: Use the TGID for trace_amdgpu_vm_update_ptes

2023-02-07 Thread Alex Deucher
I'll add the stable CC.  Thanks,

Alex

On Tue, Feb 7, 2023 at 2:34 AM Christian König  wrote:
>
> That sounds like a good idea to me as well.
>
> If you think that a patch should be backported please add a "CC:
> stable@vger.kernel.org" tag to it before sending it out.
>
> We can always remove it if we don't think a backport is appropriate,
> but maintainers seldom add it by themselves.
>
> Thanks,
> Christian.
>
> Am 07.02.23 um 00:09 schrieb Friedrich Vock:
> > Hi,
> >
> > thanks for applying the patch!
> >
> > Do you think it'd also be possible to backport it to previous kernel
> > versions or do you already plan to do that?
> > Since it is a one-liner bugfix it shouldn't be too hard to backport.
> >
> > Thank you,
> > Friedrich Vock
> >
> > On 06.02.23 21:26, Alex Deucher wrote:
> >> Applied.  Thanks!
> >>
> >> Alex
> >>
> >> On Mon, Feb 6, 2023 at 3:35 AM Christian König
> >>  wrote:
> >>>
> >>>
> >>> Am 02.02.23 um 17:21 schrieb Friedrich Vock:
>  The pid field corresponds to the result of gettid() in userspace.
>  However, userspace cannot reliably attribute PTE events to processes
>  with just the thread id. This patch allows userspace to easily
>  attribute PTE update events to specific processes by comparing this
>  field with the result of getpid().
> 
>  For attributing events to specific threads, the thread id is also
>  contained in the common fields of each trace event.
> 
>  Signed-off-by: Friedrich Vock 
> >>> Ah, yes that makes more sense. Reviewed-by: Christian König
> >>> 
> >>>
> >>> Alex do you pick this up or should I take care of it?
> >>>
> >>> Thanks,
> >>> Christian.
> >>>
>  ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
>  diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>  b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>  index b5f3bba851db..01e42bdd8e4e 100644
>  --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>  +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
>  @@ -974,7 +974,7 @@ int amdgpu_vm_ptes_update(struct
>  amdgpu_vm_update_params *params,
> trace_amdgpu_vm_update_ptes(params,
>  frag_start, upd_end,
>  min(nptes, 32u), dst, incr,
>  upd_flags,
>  - vm->task_info.pid,
>  + vm->task_info.tgid,
>  vm->immediate.fence_context);
> amdgpu_vm_pte_update_flags(params,
>  to_amdgpu_bo_vm(pt),
>  cursor.level, pe_start, dst,
>  --
>  2.39.1
> 
>
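
For readers outside the kernel: in struct task_struct, ->pid matches
userspace gettid() while ->tgid matches getpid(), which is why this
one-line switch fixes process attribution. A minimal userspace sketch;
the event_pid variable stands for the trace event's "pid" field:

    /* Attribute amdgpu_vm_update_ptes events to this process. */
    pid_t me = getpid();        /* compares equal to the kernel tgid */
    if (event_pid == me)
            handle_pte_update();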


Re: [PATCH 1/8] drm/amdgpu: UAPI for user queue management

2023-02-07 Thread Alex Deucher
On Tue, Feb 7, 2023 at 9:19 AM Christian König  wrote:
>
> Am 07.02.23 um 15:17 schrieb Alex Deucher:
> > On Tue, Feb 7, 2023 at 9:11 AM Christian König
> >  wrote:
> >> Am 07.02.23 um 15:07 schrieb Alex Deucher:
> >>> On Tue, Feb 7, 2023 at 2:38 AM Shashank Sharma  
> >>> wrote:
>  On 07/02/2023 08:03, Christian König wrote:
> > Am 06.02.23 um 22:03 schrieb Alex Deucher:
> >> On Mon, Feb 6, 2023 at 12:01 PM Christian König
> >>  wrote:
> >>> Am 06.02.23 um 17:56 schrieb Alex Deucher:
>  On Fri, Feb 3, 2023 at 5:26 PM Shashank Sharma
>   wrote:
> > Hey Alex,
> >
> > On 03/02/2023 23:07, Alex Deucher wrote:
> >> On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma
> >>  wrote:
> >>> From: Alex Deucher 
> >>>
> >>> This patch introduces a new UAPI/IOCTL for usermode graphics
> >>> queue. The userspace app will fill this structure and request
> >>> the graphics driver to add a graphics work queue for it. The
> >>> output of this UAPI is a queue id.
> >>>
> >>> This UAPI maps the queue into GPU, so the graphics app can start
> >>> submitting work to the queue as soon as the call returns.
> >>>
> >>> Cc: Alex Deucher 
> >>> Cc: Christian Koenig 
> >>> Signed-off-by: Alex Deucher 
> >>> Signed-off-by: Shashank Sharma 
> >>> ---
> >>>   include/uapi/drm/amdgpu_drm.h | 53
> >>> +++
> >>>   1 file changed, 53 insertions(+)
> >>>
> >>> diff --git a/include/uapi/drm/amdgpu_drm.h
> >>> b/include/uapi/drm/amdgpu_drm.h
> >>> index 4038abe8505a..6c5235d107b3 100644
> >>> --- a/include/uapi/drm/amdgpu_drm.h
> >>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>> @@ -54,6 +54,7 @@ extern "C" {
> >>>   #define DRM_AMDGPU_VM  0x13
> >>>   #define DRM_AMDGPU_FENCE_TO_HANDLE 0x14
> >>>   #define DRM_AMDGPU_SCHED   0x15
> >>> +#define DRM_AMDGPU_USERQ   0x16
> >>>
> >>>   #define DRM_IOCTL_AMDGPU_GEM_CREATE
> >>> DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union
> >>> drm_amdgpu_gem_create)
> >>>   #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE
> >>> + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>> @@ -71,6 +72,7 @@ extern "C" {
> >>>   #define DRM_IOCTL_AMDGPU_VM DRM_IOWR(DRM_COMMAND_BASE +
> >>> DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >>>   #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE
> >>> DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union
> >>> drm_amdgpu_fence_to_handle)
> >>>   #define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE +
> >>> DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>> +#define DRM_IOCTL_AMDGPU_USERQ DRM_IOW(DRM_COMMAND_BASE +
> >>> DRM_AMDGPU_USERQ, union drm_amdgpu_userq)
> >>>
> >>>   /**
> >>>* DOC: memory domains
> >>> @@ -302,6 +304,57 @@ union drm_amdgpu_ctx {
> >>>  union drm_amdgpu_ctx_out out;
> >>>   };
> >>>
> >>> +/* user queue IOCTL */
> >>> +#define AMDGPU_USERQ_OP_CREATE 1
> >>> +#define AMDGPU_USERQ_OP_FREE   2
> >>> +
> >>> +#define AMDGPU_USERQ_MQD_FLAGS_SECURE  (1 << 0)
> >>> +#define AMDGPU_USERQ_MQD_FLAGS_AQL (1 << 1)
> >>> +
> >>> +struct drm_amdgpu_userq_mqd {
> >>> +   /** Flags: AMDGPU_USERQ_MQD_FLAGS_* */
> >>> +   __u32   flags;
> >>> +   /** IP type: AMDGPU_HW_IP_* */
> >>> +   __u32   ip_type;
> >>> +   /** GEM object handle */
> >>> +   __u32   doorbell_handle;
> >>> +   /** Doorbell offset in dwords */
> >>> +   __u32   doorbell_offset;
> >> Since doorbells are 64 bit, maybe this offset should be in qwords.
> > Can you please help to cross check this information ? All the
> > existing
> > kernel doorbell calculations are keeping doorbells size as
> > sizeof(u32)
>  Doorbells on pre-vega hardware are 32 bits so that is where that 
>  comes
>  from, but from vega onward most doorbells are 64 bit.  I think some
>  versions of VCN may still use 32 bit doorbells.  Internally in the
>  kernel driver we just use two slots for newer hardware, but for the
>  UAPI, I think we can just stick with 64 bit slots to avoid confusion.
>  Even if an engine only uses a 32 bit one, I don't know that there is
>  much value to trying to support variable doorbell sizes.
> >>> I think we can stick with using __u32 because this is *not* the size 
> >>> of
> >>> the doorbell entries.

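For context, a hypothetical userspace sketch of driving the proposed
ioctl. Only the mqd fields quoted above come from the patch; the union's
in/out member layout and the "op" field are assumptions about the rest of
the series:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/amdgpu_drm.h>

/* Create a usermode graphics queue (sketch; assumes the UAPI lands as
 * posted). Returns the queue id on success, -1 on failure. */
static int create_user_queue(int drm_fd, uint32_t doorbell_handle)
{
	union drm_amdgpu_userq args = {0};

	args.in.op = AMDGPU_USERQ_OP_CREATE;            /* assumed member names */
	args.in.mqd.flags = 0;
	args.in.mqd.ip_type = AMDGPU_HW_IP_GFX;
	args.in.mqd.doorbell_handle = doorbell_handle;
	args.in.mqd.doorbell_offset = 0;                /* units under discussion */

	if (ioctl(drm_fd, DRM_IOCTL_AMDGPU_USERQ, &args) < 0)
		return -1;

	return (int)args.out.q_id;
}
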
Re: [PATCH 1/8] drm/amdgpu: UAPI for user queue management

2023-02-07 Thread Christian König

Am 07.02.23 um 15:17 schrieb Alex Deucher:

On Tue, Feb 7, 2023 at 9:11 AM Christian König
 wrote:

Am 07.02.23 um 15:07 schrieb Alex Deucher:

On Tue, Feb 7, 2023 at 2:38 AM Shashank Sharma  wrote:

On 07/02/2023 08:03, Christian König wrote:

Am 06.02.23 um 22:03 schrieb Alex Deucher:

On Mon, Feb 6, 2023 at 12:01 PM Christian König
 wrote:

Am 06.02.23 um 17:56 schrieb Alex Deucher:

On Fri, Feb 3, 2023 at 5:26 PM Shashank Sharma
 wrote:

Hey Alex,

On 03/02/2023 23:07, Alex Deucher wrote:

On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma
 wrote:

From: Alex Deucher 

This patch introduces a new UAPI/IOCTL for usermode graphics
queue. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.

This UAPI maps the queue into GPU, so the graphics app can start
submitting work to the queue as soon as the call returns.

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 
---
  include/uapi/drm/amdgpu_drm.h | 53
+++
  1 file changed, 53 insertions(+)

diff --git a/include/uapi/drm/amdgpu_drm.h
b/include/uapi/drm/amdgpu_drm.h
index 4038abe8505a..6c5235d107b3 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -54,6 +54,7 @@ extern "C" {
  #define DRM_AMDGPU_VM  0x13
  #define DRM_AMDGPU_FENCE_TO_HANDLE 0x14
  #define DRM_AMDGPU_SCHED   0x15
+#define DRM_AMDGPU_USERQ   0x16

  #define DRM_IOCTL_AMDGPU_GEM_CREATE
DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union
drm_amdgpu_gem_create)
  #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE
+ DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -71,6 +72,7 @@ extern "C" {
  #define DRM_IOCTL_AMDGPU_VM DRM_IOWR(DRM_COMMAND_BASE +
DRM_AMDGPU_VM, union drm_amdgpu_vm)
  #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE
DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union
drm_amdgpu_fence_to_handle)
  #define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE +
DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
+#define DRM_IOCTL_AMDGPU_USERQ DRM_IOW(DRM_COMMAND_BASE +
DRM_AMDGPU_USERQ, union drm_amdgpu_userq)

  /**
   * DOC: memory domains
@@ -302,6 +304,57 @@ union drm_amdgpu_ctx {
 union drm_amdgpu_ctx_out out;
  };

+/* user queue IOCTL */
+#define AMDGPU_USERQ_OP_CREATE 1
+#define AMDGPU_USERQ_OP_FREE   2
+
+#define AMDGPU_USERQ_MQD_FLAGS_SECURE  (1 << 0)
+#define AMDGPU_USERQ_MQD_FLAGS_AQL (1 << 1)
+
+struct drm_amdgpu_userq_mqd {
+   /** Flags: AMDGPU_USERQ_MQD_FLAGS_* */
+   __u32   flags;
+   /** IP type: AMDGPU_HW_IP_* */
+   __u32   ip_type;
+   /** GEM object handle */
+   __u32   doorbell_handle;
+   /** Doorbell offset in dwords */
+   __u32   doorbell_offset;

Since doorbells are 64 bit, maybe this offset should be in qwords.

Can you please help to cross check this information ? All the
existing
kernel doorbell calculations are keeping doorbells size as
sizeof(u32)

Doorbells on pre-vega hardware are 32 bits so that is where that comes
from, but from vega onward most doorbells are 64 bit.  I think some
versions of VCN may still use 32 bit doorbells.  Internally in the
kernel driver we just use two slots for newer hardware, but for the
UAPI, I think we can just stick with 64 bit slots to avoid confusion.
Even if an engine only uses a 32 bit one, I don't know that there is
much value to trying to support variable doorbell sizes.

I think we can stick with using __u32 because this is *not* the size of
the doorbell entries.

Instead this is the offset into the BO where to find the doorbell for
this queue (which then in turn is 64bits wide).

Since we will probably never have more than 4GiB of doorbells we should be
pretty safe to use 32 bits here.

Yes, the offset would still be 32 bits, but the units would be
qwords.  E.g.,

+   /** Doorbell offset in qwords */
+   __u32   doorbell_offset;

That way you couldn't accidentally specify an overlapping doorbell.

Ah, so you only wanted to fix the comment. That was absolutely not
clear from the discussion.

If I understand this correctly, the offset of the doorbell in the BO is
still 32-bit, but its width (size in bytes) is 64 bits. Am I getting
that right?

Right.  Each doorbell is 64 bits (8 bytes) so this value would
basically be an index into the doorbell bo.  Having it be a 64 bit
index rather than a 32 bit index would avoid the possibility of users
specifying overlapping doorbells.  E.g.,
offset in bytes
0 - doorbell
4 - doorbell
Would be incorrect, while
offset in bytes
0 - doorbell
8 - doorbell
Would be correct.

I.e., u64 doorbell_page[512] vs u32 doorbell_page[1024]

Well I usually prefer just straight byte offsets, but I think the main
question is what does the underlying hw/fw use?

If that's a dword index we should probably stick with that in the UAPI
as well. If it's in qword then stick to that, if it's in bytes then use
that.
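
To make the dword-vs-qword question concrete, a small illustrative sketch
(not from the patch) of how the two index interpretations map to byte
offsets in the doorbell BO:

#include <stdint.h>

/* With 64-bit doorbells, a qword index can never name two overlapping
 * doorbells, while a dword index can (dword indices 0 and 1 both touch
 * the first 64-bit doorbell). */
static inline uint64_t doorbell_byte_offset_qword(uint32_t index)
{
	return (uint64_t)index * sizeof(uint64_t);	/* u64 page[512] */
}

static inline uint64_t doorbell_byte_offset_dword(uint32_t index)
{
	return (uint64_t)index * sizeof(uint32_t);	/* u32 page[1024] */
}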

Re: [PATCH 1/8] drm/amdgpu: UAPI for user queue management

2023-02-07 Thread Alex Deucher
On Tue, Feb 7, 2023 at 9:11 AM Christian König
 wrote:
>
> Am 07.02.23 um 15:07 schrieb Alex Deucher:
> > On Tue, Feb 7, 2023 at 2:38 AM Shashank Sharma  
> > wrote:
> >>
> >> On 07/02/2023 08:03, Christian König wrote:
> >>> Am 06.02.23 um 22:03 schrieb Alex Deucher:
>  On Mon, Feb 6, 2023 at 12:01 PM Christian König
>   wrote:
> > Am 06.02.23 um 17:56 schrieb Alex Deucher:
> >> On Fri, Feb 3, 2023 at 5:26 PM Shashank Sharma
> >>  wrote:
> >>> Hey Alex,
> >>>
> >>> On 03/02/2023 23:07, Alex Deucher wrote:
>  On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma
>   wrote:
> > From: Alex Deucher 
> >
> > This patch introduces a new UAPI/IOCTL for usermode graphics
> > queue. The userspace app will fill this structure and request
> > the graphics driver to add a graphics work queue for it. The
> > output of this UAPI is a queue id.
> >
> > This UAPI maps the queue into GPU, so the graphics app can start
> > submitting work to the queue as soon as the call returns.
> >
> > Cc: Alex Deucher 
> > Cc: Christian Koenig 
> > Signed-off-by: Alex Deucher 
> > Signed-off-by: Shashank Sharma 
> > ---
> >  include/uapi/drm/amdgpu_drm.h | 53
> > +++
> >  1 file changed, 53 insertions(+)
> >
> > diff --git a/include/uapi/drm/amdgpu_drm.h
> > b/include/uapi/drm/amdgpu_drm.h
> > index 4038abe8505a..6c5235d107b3 100644
> > --- a/include/uapi/drm/amdgpu_drm.h
> > +++ b/include/uapi/drm/amdgpu_drm.h
> > @@ -54,6 +54,7 @@ extern "C" {
> >  #define DRM_AMDGPU_VM  0x13
> >  #define DRM_AMDGPU_FENCE_TO_HANDLE 0x14
> >  #define DRM_AMDGPU_SCHED   0x15
> > +#define DRM_AMDGPU_USERQ   0x16
> >
> >  #define DRM_IOCTL_AMDGPU_GEM_CREATE
> > DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union
> > drm_amdgpu_gem_create)
> >  #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE
> > + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> > @@ -71,6 +72,7 @@ extern "C" {
> >  #define DRM_IOCTL_AMDGPU_VM DRM_IOWR(DRM_COMMAND_BASE +
> > DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >  #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE
> > DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union
> > drm_amdgpu_fence_to_handle)
> >  #define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE +
> > DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> > +#define DRM_IOCTL_AMDGPU_USERQ DRM_IOW(DRM_COMMAND_BASE +
> > DRM_AMDGPU_USERQ, union drm_amdgpu_userq)
> >
> >  /**
> >   * DOC: memory domains
> > @@ -302,6 +304,57 @@ union drm_amdgpu_ctx {
> > union drm_amdgpu_ctx_out out;
> >  };
> >
> > +/* user queue IOCTL */
> > +#define AMDGPU_USERQ_OP_CREATE 1
> > +#define AMDGPU_USERQ_OP_FREE   2
> > +
> > +#define AMDGPU_USERQ_MQD_FLAGS_SECURE  (1 << 0)
> > +#define AMDGPU_USERQ_MQD_FLAGS_AQL (1 << 1)
> > +
> > +struct drm_amdgpu_userq_mqd {
> > +   /** Flags: AMDGPU_USERQ_MQD_FLAGS_* */
> > +   __u32   flags;
> > +   /** IP type: AMDGPU_HW_IP_* */
> > +   __u32   ip_type;
> > +   /** GEM object handle */
> > +   __u32   doorbell_handle;
> > +   /** Doorbell offset in dwords */
> > +   __u32   doorbell_offset;
>  Since doorbells are 64 bit, maybe this offset should be in qwords.
> >>> Can you please help to cross check this information ? All the
> >>> existing
> >>> kernel doorbell calculations are keeping doorbells size as
> >>> sizeof(u32)
> >> Doorbells on pre-vega hardware are 32 bits so that is where that comes
> >> from, but from vega onward most doorbells are 64 bit.  I think some
> >> versions of VCN may still use 32 bit doorbells.  Internally in the
> >> kernel driver we just use two slots for newer hardware, but for the
> >> UAPI, I think we can just stick with 64 bit slots to avoid confusion.
> >> Even if an engine only uses a 32 bit one, I don't know that there is
> >> much value to trying to support variable doorbell sizes.
> > I think we can stick with using __u32 because this is *not* the size of
> > the doorbell entries.
> >
> > Instead this is the offset into the BO where to find the doorbell for
> > this queue (which then in turn is 64bits wide).
> >
> > Since we will probably never have more than 4GiB of doorbells we should be
> > pretty safe to use 32 bits here.
>  Yes, the offset would still be 32 bits, but the units would be qwords.

Re: [PATCH 1/2] drm/amdgpu: Fix incorrect filenames in sysfs comments

2023-02-07 Thread Christian König

Am 06.02.23 um 18:25 schrieb kent.russ...@amd.com:

This looks like a standard copy/paste mistake. Replace the incorrect
serial_number references with product_name and product_model

Signed-off-by: kent.russ...@amd.com 


Reviewed-by: Christian König  for the series.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a10b627c8357..5a97021bbb23 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -162,7 +162,7 @@ static void amdgpu_device_get_pcie_info(struct 
amdgpu_device *adev);
   *
   * The amdgpu driver provides a sysfs API for reporting the product name
   * for the device
- * The file serial_number is used for this and returns the product name
+ * The file product_name is used for this and returns the product name
   * as returned from the FRU.
   * NOTE: This is only available for certain server cards
   */
@@ -184,7 +184,7 @@ static DEVICE_ATTR(product_name, S_IRUGO,
   *
   * The amdgpu driver provides a sysfs API for reporting the part number
   * for the device
- * The file serial_number is used for this and returns the part number
+ * The file product_number is used for this and returns the part number
   * as returned from the FRU.
   * NOTE: This is only available for certain server cards
   */




Re: Indexing of FeatureCtrlMask for SMU13 OverDrive

2023-02-07 Thread Alex Deucher
On Mon, Feb 6, 2023 at 11:36 PM Alex Deucher  wrote:
>
> On Mon, Feb 6, 2023 at 8:17 PM Matt Coffin  wrote:
> >
> > Hello again,
> >
> > I've been working on OverDrive support for smu13, as you probably
> > already know. In that endeavor, it also contains the following:
> >
> > 1. I've come up with a few patterns that I think will reduce the
> > amount of boilerplate and SMU-specific code required to
> > implement these interfaces in the future.
> > 2. Since the old pp_od_clk_voltage sysfs interface is inadequate for
> > usage in setting values other than a few indexed clock/voltage settings,
> > I'll likely be sending a proposed "generic" interface, where OD settings
> > are exposed to userspace by ASIC-specific indexed identifiers.
> >
> > But, those are beside the point, for now.
> >
> > While picking through the existing headers, the information in
> > smu_v13_0_0_pptable.h seems to not quite be in line with what I'm seeing
> > coming from the card, so I'm instead focusing mainly on
> > smu13_driver_if_v13_0_0.h.
> >
> > In the two OverDrive-related structs, OverDriveTable_t and
> > OverDriveLimits_t, the FeatureCtrlMask member seems to be controlling
> > which of the "features" of OverDrive would actually be in use. As of
> > yet, I haven't been able to find an index of what the bits in here
> > actually mean. Is there any way you could help me out with that?
>
> I can ask tomorrow.  That said, we are working on OD support and
> should have patches available soon.

Those bits refer to the OD feature bits PP_OD_FEATURE_*.  Looks like
they are missing from smu13_driver_if_v13_0_0.h, but they are the same
as the ones in smu13_driver_if_v13_0_7.h.

Alex
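
A small sketch of checking a mask against those bits; it only prints bit
numbers, since the authoritative PP_OD_FEATURE_* names live in
smu13_driver_if_v13_0_7.h and are not reproduced here:

#include <stdio.h>

/* List which OverDrive feature bits a FeatureCtrlMask advertises; map
 * each bit number to its PP_OD_FEATURE_*_BIT define to name it. */
static void decode_feature_ctrl_mask(unsigned int mask)
{
	for (unsigned int bit = 0; bit < 32; bit++)
		if (mask & (1u << bit))
			printf("OD feature bit %u set\n", bit);
}

For example, decode_feature_ctrl_mask(0x07cd) reports bits 0, 2, 3, 6, 7,
8, 9 and 10 as set.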

>
> Alex
>
>
>
> >
> > My best guess thus far is that they are indexed by each element of the
> > OverDriveTable_t struct, but that's just a guess.
> >
> > For reference, here are the values I'm seeing present in each at boot
> > time.
> >
> > Since FeatureCtrlMask is 0b1001101, the current theory is that the
> > "unsupported" features would be VddGfxVmax, GfxclkFmin, GfxclkFmax. Does
> > that line up with what we'd be expecting for this ASIC?
> >
> > Thanks in advance for any information you can provide. I really
> > appreciate the work that you all do.
> >
> > Thanks,
> > Matt
> >
> > OverDriveLimits:
> > FeatureCtrlMask: [0x07cd, 0x07cd]
> > VoltageOffsetPerZoneBoundary: [-450, 0]
> > VddGfxVmax: [0, 0]
> > IdlePwrSavingFeaturesCtrl: [0x00, 0x00]
> > RuntimePwrSavingFeaturesCtrl: [0x00, 0x00]
> > GfxclkFmin: [500, 5000]
> > GfxclkFmax: [500, 5000]
> > UclkFmin: [97, 1500]
> > UclkFmax: [97, 1500]
> > Ppt: [-10, 15], Tdc: [-10, 0]
> > FanLinearPwmPoints: [23, 100]
> > FanLinearTempPoints: [25, 100]
> > FanMinimumPwm: [23, 100]
> > AcousticTargetRpmThreshold: [500, 3200]
> > AcousticLimitRpmThreshold: [500, 3200]
> > FanTargetTemperature: [25, 105]
> > FanZeroRpmEnable: [0, 1]
> > FanZeroRpmStopTemp: [25, 100]
> > FanMode: [0, 1]
> > MaxOpTemp: [50, 110]
> > OverDriveTable:
> > FeatureCtrlMask: 0x
> > VoltageOffsetPerZoneBoundary[0]: 0
> > VoltageOffsetPerZoneBoundary[1]: 0
> > VoltageOffsetPerZoneBoundary[2]: 0
> > VoltageOffsetPerZoneBoundary[3]: 0
> > VoltageOffsetPerZoneBoundary[4]: 0
> > VoltageOffsetPerZoneBoundary[5]: 0
> > VddGfxVmax: 1150
> > IdlePwrSavingFeaturesCtrl: 0x00
> > RuntimePwrSavingFeaturesCtrl: 0x00
> > GfxclkFmin: 500
> > GfxclkFmax: 2890
> > UclkFmin: 97
> > UclkFmax: 1249
> > Ppt: 0
> > Tdc: 0
> > FanLinearPwmPoints[0]: 0
> > FanLinearPwmPoints[1]: 0
> > FanLinearPwmPoints[2]: 0
> > FanLinearPwmPoints[3]: 0
> > FanLinearPwmPoints[4]: 0
> > FanLinearPwmPoints[5]: 0
> > FanLinearTempPoints[0]: 0
> > FanLinearTempPoints[1]: 0
> > FanLinearTempPoints[2]: 0
> > FanLinearTempPoints[3]: 0
> > FanLinearTempPoints[4]: 0
> > FanLinearTempPoints[5]: 0
> > FanMinimumPwm: 35
> > AcousticTargetRpmThreshold: 1250
> > AcousticLimitRpmThreshold: 1500
> > FanTargetTemperature: 94
> > FanZeroRpmEnable: 1
> > FanZeroRpmStopTemp: 55
> > FanMode: 0
> > MaxOpTemp: 110


Re: [PATCH 1/8] drm/amdgpu: UAPI for user queue management

2023-02-07 Thread Christian König

Am 07.02.23 um 15:07 schrieb Alex Deucher:

On Tue, Feb 7, 2023 at 2:38 AM Shashank Sharma  wrote:


On 07/02/2023 08:03, Christian König wrote:

Am 06.02.23 um 22:03 schrieb Alex Deucher:

On Mon, Feb 6, 2023 at 12:01 PM Christian König
 wrote:

Am 06.02.23 um 17:56 schrieb Alex Deucher:

On Fri, Feb 3, 2023 at 5:26 PM Shashank Sharma
 wrote:

Hey Alex,

On 03/02/2023 23:07, Alex Deucher wrote:

On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma
 wrote:

From: Alex Deucher 

This patch introduces a new UAPI/IOCTL for usermode graphics
queue. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.

This UAPI maps the queue into GPU, so the graphics app can start
submitting work to the queue as soon as the call returns.

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Alex Deucher 
Signed-off-by: Shashank Sharma 
---
 include/uapi/drm/amdgpu_drm.h | 53
+++
 1 file changed, 53 insertions(+)

diff --git a/include/uapi/drm/amdgpu_drm.h
b/include/uapi/drm/amdgpu_drm.h
index 4038abe8505a..6c5235d107b3 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -54,6 +54,7 @@ extern "C" {
 #define DRM_AMDGPU_VM  0x13
 #define DRM_AMDGPU_FENCE_TO_HANDLE 0x14
 #define DRM_AMDGPU_SCHED   0x15
+#define DRM_AMDGPU_USERQ   0x16

 #define DRM_IOCTL_AMDGPU_GEM_CREATE
DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union
drm_amdgpu_gem_create)
 #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE
+ DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -71,6 +72,7 @@ extern "C" {
 #define DRM_IOCTL_AMDGPU_VM DRM_IOWR(DRM_COMMAND_BASE +
DRM_AMDGPU_VM, union drm_amdgpu_vm)
 #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE
DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union
drm_amdgpu_fence_to_handle)
 #define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE +
DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
+#define DRM_IOCTL_AMDGPU_USERQ DRM_IOW(DRM_COMMAND_BASE +
DRM_AMDGPU_USERQ, union drm_amdgpu_userq)

 /**
  * DOC: memory domains
@@ -302,6 +304,57 @@ union drm_amdgpu_ctx {
union drm_amdgpu_ctx_out out;
 };

+/* user queue IOCTL */
+#define AMDGPU_USERQ_OP_CREATE 1
+#define AMDGPU_USERQ_OP_FREE   2
+
+#define AMDGPU_USERQ_MQD_FLAGS_SECURE  (1 << 0)
+#define AMDGPU_USERQ_MQD_FLAGS_AQL (1 << 1)
+
+struct drm_amdgpu_userq_mqd {
+   /** Flags: AMDGPU_USERQ_MQD_FLAGS_* */
+   __u32   flags;
+   /** IP type: AMDGPU_HW_IP_* */
+   __u32   ip_type;
+   /** GEM object handle */
+   __u32   doorbell_handle;
+   /** Doorbell offset in dwords */
+   __u32   doorbell_offset;

Since doorbells are 64 bit, maybe this offset should be in qwords.

Can you please help to cross check this information ? All the
existing
kernel doorbell calculations are keeping doorbells size as
sizeof(u32)

Doorbells on pre-vega hardware are 32 bits so that is where that comes
from, but from vega onward most doorbells are 64 bit.  I think some
versions of VCN may still use 32 bit doorbells.  Internally in the
kernel driver we just use two slots for newer hardware, but for the
UAPI, I think we can just stick with 64 bit slots to avoid confusion.
Even if an engine only uses a 32 bit one, I don't know that there is
much value to trying to support variable doorbell sizes.

I think we can stick with using __u32 because this is *not* the size of
the doorbell entries.

Instead this is the offset into the BO where to find the doorbell for
this queue (which then in turn is 64bits wide).

Since we will probably never have more than 4GiB of doorbells we should be
pretty safe to use 32 bits here.

Yes, the offset would still be 32 bits, but the units would be
qwords.  E.g.,

+   /** Doorbell offset in qwords */
+   __u32   doorbell_offset;

That way you couldn't accidentally specify an overlapping doorbell.

Ah, so you only wanted to fix the comment. That was absolutely not
clear from the discussion.

If I understand this correctly, the offset of the doorbell in the BO is
still 32-bit, but its width (size in bytes) is 64 bits. Am I getting
that right?

Right.  Each doorbell is 64 bits (8 bytes) so this value would
basically be an index into the doorbell bo.  Having it be a 64 bit
index rather than a 32 bit index would avoid the possibility of users
specifying overlapping doorbells.  E.g.,
offset in bytes
0 - doorbell
4 - doorbell
Would be incorrect, while
offset in bytes
0 - doorbell
8 - doorbell
Would be correct.

I.e., u64 doorbell_page[512] vs u32 doorbell_page[1024]


Well I usually prefer just straight byte offsets, but I think the main 
question is what does the underlying hw/fw use?


If that's a dword index we should probably stick with that in the UAPI 
as well. If it's in qword then stick to that, if it's in bytes then use
that.


Otherwise we will just confuse 

Re: [PATCH 1/8] drm/amdgpu: UAPI for user queue management

2023-02-07 Thread Alex Deucher
On Tue, Feb 7, 2023 at 2:38 AM Shashank Sharma  wrote:
>
>
> On 07/02/2023 08:03, Christian König wrote:
> > Am 06.02.23 um 22:03 schrieb Alex Deucher:
> >> On Mon, Feb 6, 2023 at 12:01 PM Christian König
> >>  wrote:
> >>> Am 06.02.23 um 17:56 schrieb Alex Deucher:
>  On Fri, Feb 3, 2023 at 5:26 PM Shashank Sharma
>   wrote:
> > Hey Alex,
> >
> > On 03/02/2023 23:07, Alex Deucher wrote:
> >> On Fri, Feb 3, 2023 at 4:54 PM Shashank Sharma
> >>  wrote:
> >>> From: Alex Deucher 
> >>>
> > > This patch introduces a new UAPI/IOCTL for usermode graphics
> >>> queue. The userspace app will fill this structure and request
> >>> the graphics driver to add a graphics work queue for it. The
> >>> output of this UAPI is a queue id.
> >>>
> >>> This UAPI maps the queue into GPU, so the graphics app can start
> >>> submitting work to the queue as soon as the call returns.
> >>>
> >>> Cc: Alex Deucher 
> >>> Cc: Christian Koenig 
> >>> Signed-off-by: Alex Deucher 
> >>> Signed-off-by: Shashank Sharma 
> >>> ---
> >>> include/uapi/drm/amdgpu_drm.h | 53
> >>> +++
> >>> 1 file changed, 53 insertions(+)
> >>>
> >>> diff --git a/include/uapi/drm/amdgpu_drm.h
> >>> b/include/uapi/drm/amdgpu_drm.h
> >>> index 4038abe8505a..6c5235d107b3 100644
> >>> --- a/include/uapi/drm/amdgpu_drm.h
> >>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>> @@ -54,6 +54,7 @@ extern "C" {
> >>> #define DRM_AMDGPU_VM  0x13
> >>> #define DRM_AMDGPU_FENCE_TO_HANDLE 0x14
> >>> #define DRM_AMDGPU_SCHED   0x15
> >>> +#define DRM_AMDGPU_USERQ   0x16
> >>>
> >>> #define DRM_IOCTL_AMDGPU_GEM_CREATE
> >>> DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union
> >>> drm_amdgpu_gem_create)
> >>> #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE
> >>> + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>> @@ -71,6 +72,7 @@ extern "C" {
> >>> #define DRM_IOCTL_AMDGPU_VM DRM_IOWR(DRM_COMMAND_BASE +
> >>> DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >>> #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE
> >>> DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union
> >>> drm_amdgpu_fence_to_handle)
> >>> #define DRM_IOCTL_AMDGPU_SCHED DRM_IOW(DRM_COMMAND_BASE +
> >>> DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>> +#define DRM_IOCTL_AMDGPU_USERQ DRM_IOW(DRM_COMMAND_BASE +
> >>> DRM_AMDGPU_USERQ, union drm_amdgpu_userq)
> >>>
> >>> /**
> >>>  * DOC: memory domains
> >>> @@ -302,6 +304,57 @@ union drm_amdgpu_ctx {
> >>>union drm_amdgpu_ctx_out out;
> >>> };
> >>>
> >>> +/* user queue IOCTL */
> >>> +#define AMDGPU_USERQ_OP_CREATE 1
> >>> +#define AMDGPU_USERQ_OP_FREE   2
> >>> +
> >>> +#define AMDGPU_USERQ_MQD_FLAGS_SECURE  (1 << 0)
> >>> +#define AMDGPU_USERQ_MQD_FLAGS_AQL (1 << 1)
> >>> +
> >>> +struct drm_amdgpu_userq_mqd {
> >>> +   /** Flags: AMDGPU_USERQ_MQD_FLAGS_* */
> >>> +   __u32   flags;
> >>> +   /** IP type: AMDGPU_HW_IP_* */
> >>> +   __u32   ip_type;
> >>> +   /** GEM object handle */
> >>> +   __u32   doorbell_handle;
> >>> +   /** Doorbell offset in dwords */
> >>> +   __u32   doorbell_offset;
> >> Since doorbells are 64 bit, maybe this offset should be in qwords.
> > Can you please help to cross check this information ? All the
> > existing
> > kernel doorbell calculations are keeping doorbells size as
> > sizeof(u32)
>  Doorbells on pre-vega hardware are 32 bits so that is where that comes
>  from, but from vega onward most doorbells are 64 bit.  I think some
>  versions of VCN may still use 32 bit doorbells.  Internally in the
>  kernel driver we just use two slots for newer hardware, but for the
>  UAPI, I think we can just stick with 64 bit slots to avoid confusion.
>  Even if an engine only uses a 32 bit one, I don't know that there is
>  much value to trying to support variable doorbell sizes.
> >>> I think we can stick with using __u32 because this is *not* the size of
> >>> the doorbell entries.
> >>>
> >>> Instead this is the offset into the BO where to find the doorbell for
> >>> this queue (which then in turn is 64bits wide).
> >>>
> >>> Since we will probably never have more than 4GiB of doorbells we should be
> >>> pretty safe to use 32 bits here.
> >> Yes, the offset would still be 32 bits, but the units would be
> >> qwords.  E.g.,
> >>
> >> +   /** Doorbell offset in qwords */
> >> +   __u32   doorbell_offset;
> >>
> >> That way you couldn't accidentally specify an overlapping doorbell.
> >
> > Ah, so you only wanted to fix the comment. That was absolutely not
> > clear from the discussion.
>
> If I understand this correctly, the offset of the doorbell in the BO is
> still 32-bit, but its width (size in bytes) is 64 bits. Am I getting
> that right?

Re: [PATCH 1/5] drm/amd/display: disable S/G display on DCN 2.1.0

2023-02-07 Thread Christian König
If it's not already committed, Reviewed-by: Christian König .


Sorry for the vacation delay, I'm still catching up on mails.

Christian.

Am 31.01.23 um 19:13 schrieb Alex Deucher:

Causes flickering or white screens in some configurations.
Disable it for now until we can fix the issue.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2352
Cc: roman...@amd.com
Cc: yifan1.zh...@amd.com
Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 3a40c6dec3d2..2e4040f1a357 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1534,7 +1534,6 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
(adev->apu_flags & AMD_APU_IS_PICASSO))
init_data.flags.gpu_vm_support = true;
break;
-   case IP_VERSION(2, 1, 0):
case IP_VERSION(3, 0, 1):
case IP_VERSION(3, 1, 2):
case IP_VERSION(3, 1, 3):




Re: [RFC PATCH] drm: Create documentation about device resets

2023-02-07 Thread Pekka Paalanen
On Mon, 23 Jan 2023 21:38:11 +0100
Christian König  wrote:

> Am 23.01.23 um 21:26 schrieb André Almeida:
> > Create a document that specifies how to deal with DRM device resets for
> > kernel and userspace drivers.
> >
> > Signed-off-by: André Almeida 
> > ---
> >   Documentation/gpu/drm-reset.rst | 51 +
> >   Documentation/gpu/index.rst |  1 +
> >   2 files changed, 52 insertions(+)
> >   create mode 100644 Documentation/gpu/drm-reset.rst
> >
> > diff --git a/Documentation/gpu/drm-reset.rst 
> > b/Documentation/gpu/drm-reset.rst
> > new file mode 100644
> > index ..0dd11a469cf9
> > --- /dev/null
> > +++ b/Documentation/gpu/drm-reset.rst
> > @@ -0,0 +1,51 @@
> > +
> > +DRM Device Reset
> > +
> > +
> > +The GPU stack is really complex and is prone to errors, from hardware bugs,
> > +faulty applications and everything in the many layers in between. To recover
> > +from this kind of state, sometimes it is needed to reset the GPU. Improper
> > +handling of GPU resets can lead to an unstable userspace. This page describes
> > +the expected behaviour of DRM drivers in those situations, and of usermode
> > +drivers and compositors as well.
> > +
> > +Robustness
> > +--
> > +
> > +First of all, application robust APIs, when available, should be used. This
> > +allows the application to correctly recover and continue to run after a reset.
> > +Apps that don't use this should be promptly killed when the kernel driver
> > +detects that they're in a broken state. Specific guidelines for some APIs:
> > +
> 
> > +- OpenGL: During a reset, the KMD kills processes that don't have ARB
> > +  Robustness enabled, assuming they can't recover.
> 
> This is a pretty clear NAK from my side to this approach. The KMD should
> never mess with a userspace process directly in such a way.
> 
> Instead use something like this "OpenGL: KMD signals the abortion of 
> submitted commands and the UMD should then react accordingly and abort 
> the application.".
> 
> > +- Vulkan: Assumes that every app is able to deal with ``VK_ERROR_DEVICE_LOST``,
> > +  so KMD doesn't kill any. If it doesn't do it right, it's considered a broken
> > +  application and UMD will deal with it.
> 
> Again, please remove the "KMD kill" reference.
> 
> > +
> > +Kernel mode driver
> > +--
> > +
> > +The KMD should be able to detect that something is wrong with the application
> 
> Please replace *should* with *must* here; this is mandatory, otherwise
> core memory management can run into deadlocks during reclaim.
> 
> Regards,
> Christian.
> 
> > +and that a reset needs to take place to recover the device (e.g. an endless
> > +wait). It needs to properly track the context that is broken and mark it as
> > +dead, so any other syscalls to that context should be further rejected. The
> > +other contexts should be preserved when possible, avoiding crashes in the
> > +rest of userspace. KMD can ban a file descriptor that keeps causing resets,
> > +as it's likely in a broken loop.
> > +
> > +User mode driver
> > +
> > +
> > +During a reset, UMD should be aware that rejected syscalls indicate that the
> > +context is broken and for robust apps the recovery should happen for the
> > +context. Non-robust apps would already be terminated by KMD. If no new context
> > +is created for some time, it is assumed that the recovery didn't work, so UMD
> > +should terminate it.

Hi,

what Christian said, plus I would not assume that robust programs will
always respond by creating a new context. They could also switch
to a software renderer, or simply not do graphics again until something
else happens.

> > +
> > +Compositors
> > +---
> > +
> > +(In the long term) compositors should be robust as well to properly deal with
> > +errors. Init systems should be aware of the compositor status and reset it if
> > +it is broken.

I don't know how init systems could do that, or what difference does it
make to an init system whether the display server is robust or not.
Display servers can get stuck for other reasons as well. They may also
be live-stuck, where they respond to keepalive, serve clients, and
deliver input events, but still do not update the screen. You can't
tell if that's a malfunction or expected.



Have you checked
https://www.kernel.org/doc/html/latest/gpu/drm-uapi.html#device-hot-unplug
that you are consistent with hot-unplug plans?


Thanks,
pq

> > diff --git a/Documentation/gpu/index.rst b/Documentation/gpu/index.rst
> > index b99dede9a5b1..300b2529bd39 100644
> > --- a/Documentation/gpu/index.rst
> > +++ b/Documentation/gpu/index.rst
> > @@ -9,6 +9,7 @@ Linux GPU Driver Developer's Guide
> >  drm-mm
> >  drm-kms
> >  drm-kms-helpers
> > +   drm-reset
> >  drm-uapi
> >  drm-usage-stats
> >  driver-uapi  
> 




Re: [PATCH] drm/amd/display: Align num_crtc to max_streams

2023-02-07 Thread Hamza Mahfooz



On 2/6/23 23:05, Tianci Yin wrote:

From: tiancyin 

[Why]
Display pipes might be harvested on some SKUs, which causes
adev->mode_info.num_crtc to mismatch the usable crtc number,
so the error dmesgs below are observed after GPU recover.

   *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
   *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3

[How]
The max_streams is the limited number after the pipe fuse, so align
num_crtc to max_streams to eliminate the error logs.

Signed-off-by: tiancyin 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b31cfda30ff9..87ec2574cc09 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4285,6 +4285,9 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
break;
}
  
+	/* Adjust the crtc number according to the DCN pipe fuse. */
+	adev->mode_info.num_crtc = dm->dc->caps.max_streams;

This would introduce array-out-of-bounds issues, since there are arrays of
size AMDGPU_MAX_CRTCS that use num_crtc as a bounds check.


+
for (i = 0; i < dm->dc->caps.max_streams; i++)
if (amdgpu_dm_crtc_init(dm, mode_info->planes[i], i)) {
DRM_ERROR("KMS: Failed to initialize crtc\n");


--
Hamza



Re: [PATCH v2 00/21] Enable Colorspace connector property in amdgpu

2023-02-07 Thread Pekka Paalanen
On Fri, 13 Jan 2023 11:24:07 -0500
Harry Wentland  wrote:

> This patchset enables the DP and HDMI infoframe properties
> in amdgpu.
> 
> The first two patches are not completely related to the rest. The
> first patch allows for HDR_OUTPUT_METADATA with EOTFs that are
> unknown in the kernel.
> 
> The second one prints a connector's max_bpc as part of the atomic
> state debugfs print.
> 
> The following patches rework the connector colorspace code to
> 1) allow for easy printing of the colorspace in the drm_atomic
>state debugfs, and
> 2) allow drivers to specify the supported colorspaces on a
>connector.
> 
> The rest of the patches deal with the Colorspace enablement
> in amdgpu.
> 
> Why do drivers need to specify supported colorspaces? The amdgpu
> driver needs support for RGB-to-YCbCr conversion when we drive
> the display in YCbCr. This is currently not implemented for all
> colorspaces.
> 
> Since the Colorspace property didn't have an IGT test I added
> one to kms_hdr. The relevant patchset can be found on the IGT
> mailing list or on
> https://gitlab.freedesktop.org/hwentland/igt-gpu-tools/-/tree/hdr-colorimetry
> 
> We tested v1 of the patchset and confirmed that the infoframes
> are as expected for both DP and HDMI when running the IGT
> colorimetry tests.
> 
> Open Items
> --
> 
> A couple comments from Pekka about colorspace documentation are
> left unaddressed. I hope they won't block merging this set but
> should still be addressed separately.
> 
> Pekka's questions really got me thinking of how this colorspace
> property should be used and working with it more closely with
> Joshua who is enabling HDR in gamescope made me wonder even more.
> 
> Uma, is there a (canonical, upstream) userspace that uses this
> property that I can look at to understand more?
> 
> One of the key challenges that is currently not addressed is that
> userspace is expected to pick a colorspace format straight from the
> list of definitions out of the DP or HDMI spec. But the kernel
> drivers are the ones deciding on the output encoding (RGB, YCBCR444,
> YCBCR420, etc.). So there is no way for userspace to decide correctly
> between, for example, BT2020_RGB, BT2020_CYCC, BT2020_YCC.
> 
> So we end up in a scenario where gamescope sets BT2020_RGB but we
> output YCBCR444 so have to correct the colorspace value to
> BT2020_YCC. This in turn breaks the colorspace IGT tests I
> wrote. I don't think "fixing" the IGT tests to accept this is
> the right thing to do.
> 
> The way it stands this patchset allows us to specify the output
> colorspace on amdgpu and we try to do the right thing, but I don't
> think the way the colorspace property is defined is right. We're trying
> to expose things to userspace that should be under driver control. A
> much better approach would be to give userspace options for colorspace
> that are not tied to DP or HDMI specs, i.e., sRGB, BT709, BT2020, etc.,
> and have the driver do the right thing to fill the infoframe, e.g., by
> picking BT2020_YCC if the requested colorspace is BT2020 and the output
> encoding is YCBCR444.

Hi Harry,

well explained.

Indeed, if you want to keep the driver in control of the encoding on
the monitor cable, then your suggestion seems correct (ignoring the
question whether it breaks something existing).

I do recall something about letting userspace control the encoding on
the cable though, particularly when it affects performance or quality.
E.g. 4:2:0 sub-sampling might be wanted in some cases and unwanted in
others. It's a bit similar to bpc. The trade-off may be frame rate or
resolution. It might better to know that the hardware cannot do what
you ask, than to silently degrade. E.g. if you use sub-pixel rendering,
you really do not want 4:2:0.

That's compatible with your suggestion on changing the Colorspace
property, I think it would complement it. Cable encoding parameters
could be other properties, which might also be on "auto".

If Colorspace property cannot be changed, then options I haven't seen
discussed yet are have it force the cable encoding parameters, or
another new property replacing it.

> If no upstream userspace currently makes use of this property I
> can make that change, i.e., no longer tie the colorspace property
> directly to the infoframe and reduce the options to sRGB, BT709,
> BT601, and BT2020 (and possibly opRGB).
> 
> v2:
> - Tested with DP and HDMI analyzers
> - Confirmed driver will fallback to lower bpc when needed
> - Dropped hunk to set HDMI AVI infoframe as it was a no-op
> - Fixed BT.2020 YCbCr colorimetry (JoshuaAshton)
> - Simplify initialization of supported colorspaces (Jani)
> - Fix kerneldoc (kernel test robot)

I recall saying this before, but in the series there are occurrences
where sRGB is spelled as "RGB". I find that very confusing that "RGB"
would imply anything about colorimetry when it's just a color model.

Sometimes it might not be sRGB because the "default RGB" has probably
not been sRGB for many years now

Re: [PATCH v2 18/21] drm/amd/display: Fallback to 2020_YCBCR if the pixel encoding is not RGB

2023-02-07 Thread Pekka Paalanen
On Wed, 25 Jan 2023 12:59:53 +
Joshua Ashton  wrote:

> On 1/23/23 20:30, Sebastian Wick wrote:
> > A new property to control YCC and subsampling would be the more
> > complete path here. If we actually want to fix this in the short-term
> > though, we should handle the YCC and RGB Colorspace values as
> > equivalent, everywhere. Technically we're breaking the user space API
> > here so it should be documented on the KMS property and other drivers
> > must be adjusted accordingly as well.  
> 
> I am happy with treating 2020_YCC and 2020_RGB as the same.
> 
> I think having the YCC/RGB split here in Colorspace is a mistake.
> Pixel encoding should be completely separate from colorspace from a uAPI 
> perspective when we want to expose that.
> It's just a design flaw from when it was added as it just mirrors the 
> values in the colorimetry packets 1:1. I understand why this happened, 
> and I don't think it's a big deal for us to correct ourselves now:
> 
> I suggest we deprecate the _YCC variants, treat them the same as the RGB 
> enum to avoid any potential uAPI breakage and key the split entirely off 
> the pixel_encoding value.
> 
> That way when we do want to plumb more explicit pixel encoding down the 
> line, userspace only has to manage one thing. There's no advantage for 
> anything more here.

Sounds good to me!


Thanks,
pq

> 
> - Joshie 🐸✨
> 
> > 
> > On Fri, Jan 13, 2023 at 5:26 PM Harry Wentland  
> > wrote:  
> >>
> >> From: Joshua Ashton 
> >>
> >> Userspace might not be aware whether we're sending RGB or YCbCr
> >> data to the display. If COLOR_SPACE_2020_RGB_FULLRANGE is
> >> requested but the output encoding is YCbCr we should
> >> send COLOR_SPACE_2020_YCBCR.
> >>
> >> Signed-off-by: Joshua Ashton 
> >> Signed-off-by: Harry Wentland 
> >> Cc: Pekka Paalanen 
> >> Cc: Sebastian Wick 
> >> Cc: vitaly.pros...@amd.com
> >> Cc: Joshua Ashton 
> >> Cc: dri-de...@lists.freedesktop.org
> >> Cc: amd-gfx@lists.freedesktop.org
> >> Reviewed-by: Harry Wentland 
> >> ---
> >>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 -
> >>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> >> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> index f74b125af31f..16940ea61b59 100644
> >> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> >> @@ -5184,7 +5184,10 @@ get_output_color_space(const struct dc_crtc_timing 
> >> *dc_crtc_timing,
> >>  color_space = COLOR_SPACE_ADOBERGB;
> >>  break;
> >>  case DRM_MODE_COLORIMETRY_BT2020_RGB:
> >> -   color_space = COLOR_SPACE_2020_RGB_FULLRANGE;
> >> +   if (dc_crtc_timing->pixel_encoding == PIXEL_ENCODING_RGB)
> >> +   color_space = COLOR_SPACE_2020_RGB_FULLRANGE;
> >> +   else
> >> +   color_space = COLOR_SPACE_2020_YCBCR;
> >>  break;
> >>  case DRM_MODE_COLORIMETRY_BT2020_YCC:
> >>  color_space = COLOR_SPACE_2020_YCBCR;
> >> --
> >> 2.39.0
> >>  
> >   





Re: [PATCH 5/8] drm/amdgpu: Create context for usermode queue

2023-02-07 Thread Shashank Sharma



On 07/02/2023 08:55, Christian König wrote:

Am 07.02.23 um 08:51 schrieb Shashank Sharma:


On 07/02/2023 08:14, Christian König wrote:

Am 03.02.23 um 22:54 schrieb Shashank Sharma:

The FW expects us to allocate at least one page as context space to
process gang, process, shadow, GDS and FW_space related work. This
patch creates some objects for the same, and adds IP-specific
functions to do this.

Cc: Alex Deucher 
Cc: Christian Koenig 
Signed-off-by: Shashank Sharma 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  32 +
  .../amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c | 121 ++
  .../gpu/drm/amd/include/amdgpu_userqueue.h    |  18 +++
  3 files changed, 171 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c

index 9f3490a91776..18281b3a51f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
@@ -42,6 +42,28 @@ static struct amdgpu_usermode_queue
  return idr_find(&uq_mgr->userq_idr, qid);
  }
  +static void
+amdgpu_userqueue_destroy_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
+   struct amdgpu_usermode_queue 
*queue)

+{
+    uq_mgr->userq_mqd_funcs->ctx_destroy(uq_mgr, queue);
+}
+
+static int
+amdgpu_userqueue_create_ctx_space(struct amdgpu_userq_mgr *uq_mgr,
+  struct amdgpu_usermode_queue 
*queue)

+{
+    int r;
+
+    r = uq_mgr->userq_mqd_funcs->ctx_create(uq_mgr, queue);
+    if (r) {
+    DRM_ERROR("Failed to create context space for queue\n");
+    return r;
+    }
+
+    return 0;
+}
+
  static int
  amdgpu_userqueue_create_mqd(struct amdgpu_userq_mgr *uq_mgr, 
struct amdgpu_usermode_queue *queue)

  {
@@ -142,12 +164,21 @@ static int amdgpu_userqueue_create(struct 
drm_file *filp, union drm_amdgpu_userq

  goto free_qid;
  }
  +    r = amdgpu_userqueue_create_ctx_space(uq_mgr, queue);
+    if (r) {
+    DRM_ERROR("Failed to create context space\n");
+    goto free_mqd;
+    }
+
  list_add_tail(&queue->userq_node, &uq_mgr->userq_list);
  args->out.q_id = queue->queue_id;
  args->out.flags = 0;
  mutex_unlock(&uq_mgr->userq_mutex);
  return 0;
  +free_mqd:
+    amdgpu_userqueue_destroy_mqd(uq_mgr, queue);
+
  free_qid:
  amdgpu_userqueue_free_index(uq_mgr, queue->queue_id);
  @@ -170,6 +201,7 @@ static void amdgpu_userqueue_destroy(struct 
drm_file *filp, int queue_id)

  }
    mutex_lock(&uq_mgr->userq_mutex);
+    amdgpu_userqueue_destroy_ctx_space(uq_mgr, queue);
  amdgpu_userqueue_destroy_mqd(uq_mgr, queue);
  amdgpu_userqueue_free_index(uq_mgr, queue->queue_id);
  list_del(&queue->userq_node);
diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c

index 57889729d635..687f90a587e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_mqd_gfx_v11.c
@@ -120,6 +120,125 @@ amdgpu_userq_gfx_v11_mqd_destroy(struct 
amdgpu_userq_mgr *uq_mgr, struct amdgpu_

    }
  +static int amdgpu_userq_gfx_v11_ctx_create(struct 
amdgpu_userq_mgr *uq_mgr,
+   struct 
amdgpu_usermode_queue *queue)

+{
+    int r;
+    struct amdgpu_device *adev = uq_mgr->adev;
+    struct amdgpu_userq_ctx *pctx = &queue->proc_ctx;
+    struct amdgpu_userq_ctx *gctx = &queue->gang_ctx;
+    struct amdgpu_userq_ctx *gdsctx = &queue->gds_ctx;
+    struct amdgpu_userq_ctx *fwctx = &queue->fw_ctx;
+    struct amdgpu_userq_ctx *sctx = &queue->shadow_ctx;
+
+    /*
+ * The FW expects at least one page space allocated for
+ * process context related work, and one for gang context.
+ */
+    r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_PROC_CTX_SZ, 
PAGE_SIZE,

+    AMDGPU_GEM_DOMAIN_VRAM,
+    &pctx->obj,
+    &pctx->gpu_addr,
+    &pctx->cpu_ptr);


Again, don't use amdgpu_bo_create_kernel() for any of this.

Noted,



+    if (r) {
+    DRM_ERROR("Failed to allocate proc bo for userqueue (%d)", 
r);

+    return r;
+    }
+
+    r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_GANG_CTX_SZ, 
PAGE_SIZE,

+    AMDGPU_GEM_DOMAIN_VRAM,
+    &gctx->obj,
+    &gctx->gpu_addr,
+    &gctx->cpu_ptr);
+    if (r) {
+    DRM_ERROR("Failed to allocate gang bo for userqueue (%d)", 
r);

+    goto err_gangctx;
+    }
+
+    r = amdgpu_bo_create_kernel(adev, AMDGPU_USERQ_GDS_CTX_SZ, 
PAGE_SIZE,

+    AMDGPU_GEM_DOMAIN_VRAM,
+    &gdsctx->obj,
+    &gdsctx->gpu_addr,
+    &gdsctx->cpu_ptr);
+    if (r) {
+    DR