Re: [PATCH 2/3] drm/amdgpu: Use SDMA1 for buffer movement on Aldebaran

2021-08-19 Thread Christian König




Am 20.08.21 um 07:32 schrieb Joseph Greathouse:

Aldebaran should not use SDMA0 for buffer funcs such as page migration.
Instead, we move over to SDMA1 for these features. Leave SDMA0 in
charge for all other existing chips to avoid any possibility of
regressions.


The explanation of why we do this is missing; apart from that, it looks good to me.

Christian.



Signed-off-by: Joseph Greathouse 
---
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 8931000dcd41..771630d7bb3f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -2689,11 +2689,15 @@ static const struct amdgpu_buffer_funcs sdma_v4_0_buffer_funcs = {
  
  static void sdma_v4_0_set_buffer_funcs(struct amdgpu_device *adev)

  {
+   int engine = 0;
+
+   if (adev->asic_type == CHIP_ALDEBARAN)
+   engine = 1;
adev->mman.buffer_funcs = &sdma_v4_0_buffer_funcs;
if (adev->sdma.has_page_queue)
-   adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].page;
+   adev->mman.buffer_funcs_ring = &adev->sdma.instance[engine].page;
else
-   adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].ring;
+   adev->mman.buffer_funcs_ring = &adev->sdma.instance[engine].ring;
  }
  
  static const struct amdgpu_vm_pte_funcs sdma_v4_0_vm_pte_funcs = {




Re: [PATCH v2] drm/amdgpu/OLAND: clip the ref divider max value

2021-08-19 Thread Christian König
Sounds like a good idea to me, but I would limit this generally or at 
least for the whole generation and not just one particular chipset.


Regards,
Christian.

Am 20.08.21 um 08:05 schrieb Sharma, Shashank:

From 4841e5ba60e33ff798bde6cb69fbd7e137b6db9c Mon Sep 17 00:00:00 2001
From: Shashank Sharma 
Date: Fri, 20 Aug 2021 10:20:02 +0530
Subject: [PATCH v2] drm/amdgpu/OLAND: clip the ref divider max value
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch limits the ref_div_max value to 100 during the
calculation of the PLL feedback reference divider. With the
current value (128), the produced fb_ref_div value generates
unstable output at particular frequencies. The Radeon driver
limits this value to 100.

On Oland, when we try to set up mode 2048x1280@60 (a bit weird,
I know), it demands a clock of 221270 kHz. It's been observed
that the PLL calculations using the values 128 and 100 are vastly
different, and look like this:

+----------------------+--------+--------+
|Parameter             |AMDGPU  |Radeon  |
+----------------------+--------+--------+
|Clock feedback divider|        |        |
|max cap value         |  128   |  100   |
+----------------------+--------+--------+
|ref_div_max           |  42    |  20    |
+----------------------+--------+--------+
|ref_div               |  42    |  20    |
+----------------------+--------+--------+
|fb_div                |  10326 |  8195  |
+----------------------+--------+--------+
|frac fb_div           |  1024  |  163   |
+----------------------+--------+--------+
|fb_div_p              |  4     |  9     |
|frac fb_div_p         |        |        |
+----------------------+--------+--------+

With the ref_div_max value clipped at 100, the AMDGPU driver can also
drive video mode 2048x1280@60 (221 MHz) and produce proper output
without any blanking or distortion on the screen.

PS: This value was changed from 128 to 100 in Radeon driver also, here:
https://github.com/freedesktop/drm-tip/commit/4b21ce1b4b5d262e7d4656b8ececc891fc3cb806 



V1:
Got acks from:
Acked-by: Alex Deucher 
Acked-by: Christian König 

V2:
- Restricting the changes only for OLAND, just to avoid any regression
  for other cards.
- Changed unsigned -> unsigned int to make checkpatch quiet.

Cc: Alex Deucher 
Cc: Christian König 
Cc: Eddy Qin 
Signed-off-by: Shashank Sharma 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c    | 20 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pll.h    |  3 ++-
 drivers/gpu/drm/amd/amdgpu/atombios_crtc.c |  2 +-
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c

index f2e20666c9c1..6d04c1d25bfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
@@ -80,12 +80,17 @@ static void amdgpu_pll_reduce_ratio(unsigned *nom, unsigned *den,
  * Calculate feedback and reference divider for a given post divider. Makes
  * sure we stay within the limits.
  */
-static void amdgpu_pll_get_fb_ref_div(unsigned nom, unsigned den, unsigned post_div,
-  unsigned fb_div_max, unsigned ref_div_max,
-  unsigned *fb_div, unsigned *ref_div)
+static void amdgpu_pll_get_fb_ref_div(struct amdgpu_device *adev, unsigned int nom,
+  unsigned int den, unsigned int post_div,
+  unsigned int fb_div_max, unsigned int ref_div_max,
+  unsigned int *fb_div, unsigned int *ref_div)
 {
+
 /* limit reference * post divider to a maximum */
-    ref_div_max = min(128 / post_div, ref_div_max);
+    if (adev->asic_type == CHIP_OLAND)
+    ref_div_max = min(100 / post_div, ref_div_max);
+    else
+    ref_div_max = min(128 / post_div, ref_div_max);

 /* get matching reference and feedback divider */
 *ref_div = min(max(DIV_ROUND_CLOSEST(den, post_div), 1u), ref_div_max);
@@ -112,7 +117,8 @@ static void amdgpu_pll_get_fb_ref_div(unsigned nom, unsigned den, unsigned post_

  * Try to calculate the PLL parameters to generate the given frequency:
  * dot_clock = (ref_freq * feedback_div) / (ref_div * post_div)
  */
-void amdgpu_pll_compute(struct amdgpu_pll *pll,
+void amdgpu_pll_compute(struct amdgpu_device *adev,
+    struct amdgpu_pll *pll,
 u32 freq,
 u32 *dot_clock_p,
 u32 *fb_div_p,
@@ -199,7 +205,7 @@ void amdgpu_pll_compute(struct amdgpu_pll *pll,

 for (post_div = post_div_min; post_div <= post_div_max; ++post_div) {

 unsigned diff;
-    amdgpu_pll_ge

Re: [PATCH v6 02/13] mm: remove extra ZONE_DEVICE struct page refcount

2021-08-19 Thread Jerome Glisse
Note that you do not want GUP to succeed on a device page; I do not see
where that is handled in the new code.

On Sun, Aug 15, 2021 at 1:40 PM John Hubbard  wrote:
>
> On 8/15/21 8:37 AM, Christoph Hellwig wrote:
> >> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >> index 8ae31622deef..d48a1f0889d1 100644
> >> --- a/include/linux/mm.h
> >> +++ b/include/linux/mm.h
> >> @@ -1218,7 +1218,7 @@ __maybe_unused struct page *try_grab_compound_head(struct page *page, int refs,
> >>   static inline __must_check bool try_get_page(struct page *page)
> >>   {
> >>  page = compound_head(page);
> >> -if (WARN_ON_ONCE(page_ref_count(page) <= 0))
> >> +if (WARN_ON_ONCE(page_ref_count(page) < (int)!is_zone_device_page(page)))
> >
> > Please avoid the overly long line.  In fact I'd be tempted to just not
> > bother here and keep the old, more loose check.  Especially given that
> > John has a patch ready that removes try_get_page entirely.
> >
>
> Yes. Andrew has accepted it into mmotm.
>
> Ralph's patch here was written well before my cleanup that removed
> try_grab_page() [1]. But now that we're here, if you drop this hunk then
> it will make merging easier, I think.
>
>
> [1] https://lore.kernel.org/r/20210813044133.1536842-4-jhubb...@nvidia.com
>
> thanks,
> --
> John Hubbard
> NVIDIA
>


[PATCH v2] drm/amdgpu/OLAND: clip the ref divider max value

2021-08-19 Thread Sharma, Shashank

From 4841e5ba60e33ff798bde6cb69fbd7e137b6db9c Mon Sep 17 00:00:00 2001
From: Shashank Sharma 
Date: Fri, 20 Aug 2021 10:20:02 +0530
Subject: [PATCH v2] drm/amdgpu/OLAND: clip the ref divider max value
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch limits the ref_div_max value to 100 during the
calculation of the PLL feedback reference divider. With the
current value (128), the produced fb_ref_div value generates
unstable output at particular frequencies. The Radeon driver
limits this value to 100.

On Oland, when we try to set up mode 2048x1280@60 (a bit weird,
I know), it demands a clock of 221270 kHz. It's been observed
that the PLL calculations using the values 128 and 100 are vastly
different, and look like this:

+----------------------+--------+--------+
|Parameter             |AMDGPU  |Radeon  |
+----------------------+--------+--------+
|Clock feedback divider|        |        |
|max cap value         |  128   |  100   |
+----------------------+--------+--------+
|ref_div_max           |  42    |  20    |
+----------------------+--------+--------+
|ref_div               |  42    |  20    |
+----------------------+--------+--------+
|fb_div                |  10326 |  8195  |
+----------------------+--------+--------+
|frac fb_div           |  1024  |  163   |
+----------------------+--------+--------+
|fb_div_p              |  4     |  9     |
|frac fb_div_p         |        |        |
+----------------------+--------+--------+

With the ref_div_max value clipped at 100, the AMDGPU driver can also
drive video mode 2048x1280@60 (221 MHz) and produce proper output
without any blanking or distortion on the screen.

PS: This value was changed from 128 to 100 in Radeon driver also, here:
https://github.com/freedesktop/drm-tip/commit/4b21ce1b4b5d262e7d4656b8ececc891fc3cb806

V1:
Got acks from:
Acked-by: Alex Deucher 
Acked-by: Christian König 

V2:
- Restricting the changes only for OLAND, just to avoid any regression
  for other cards.
- Changed unsigned -> unsigned int to make checkpatch quiet.

Cc: Alex Deucher 
Cc: Christian König 
Cc: Eddy Qin 
Signed-off-by: Shashank Sharma 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c| 20 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pll.h|  3 ++-
 drivers/gpu/drm/amd/amdgpu/atombios_crtc.c |  2 +-
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c

index f2e20666c9c1..6d04c1d25bfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pll.c
@@ -80,12 +80,17 @@ static void amdgpu_pll_reduce_ratio(unsigned *nom, unsigned *den,
  * Calculate feedback and reference divider for a given post divider. Makes
  * sure we stay within the limits.
  */
-static void amdgpu_pll_get_fb_ref_div(unsigned nom, unsigned den, 
unsigned post_div,

- unsigned fb_div_max, unsigned ref_div_max,
- unsigned *fb_div, unsigned *ref_div)
+static void amdgpu_pll_get_fb_ref_div(struct amdgpu_device *adev, 
unsigned int nom,

+ unsigned int den, unsigned int post_div,
+ unsigned int fb_div_max, unsigned int 
ref_div_max,
+ unsigned int *fb_div, unsigned int 
*ref_div)
 {
+
/* limit reference * post divider to a maximum */
-   ref_div_max = min(128 / post_div, ref_div_max);
+   if (adev->asic_type == CHIP_OLAND)
+   ref_div_max = min(100 / post_div, ref_div_max);
+   else
+   ref_div_max = min(128 / post_div, ref_div_max);

/* get matching reference and feedback divider */
*ref_div = min(max(DIV_ROUND_CLOSEST(den, post_div), 1u), ref_div_max);
@@ -112,7 +117,8 @@ static void amdgpu_pll_get_fb_ref_div(unsigned nom, unsigned den, unsigned post_

  * Try to calculate the PLL parameters to generate the given frequency:
  * dot_clock = (ref_freq * feedback_div) / (ref_div * post_div)
  */
-void amdgpu_pll_compute(struct amdgpu_pll *pll,
+void amdgpu_pll_compute(struct amdgpu_device *adev,
+   struct amdgpu_pll *pll,
u32 freq,
u32 *dot_clock_p,
u32 *fb_div_p,
@@ -199,7 +205,7 @@ void amdgpu_pll_compute(struct amdgpu_pll *pll,

for (post_div = post_div_min; post_div <= post_div_max; ++post_div) {
unsigned diff;
-   amdgpu_pll_get_fb_ref_div(nom, den, post_div, fb_div_max,

[PATCH 3/3] drm/amdkfd: Spread out XGMI SDMA allocations

2021-08-19 Thread Joseph Greathouse
Avoid hotspotting SDMA engine allocations from the XGMI pool
by making each process attempt to allocate engines starting
from the engine after the last one that was allocated.
Signed-off-by: Joseph Greathouse 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 8 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h | 2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 1 +
 4 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 86bdb765f350..7f06ad6bdcd2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1096,9 +1096,12 @@ static int allocate_sdma_queue(struct device_queue_manager *dqm,
return -ENOMEM;
}
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
+   if (qpd->initial_xgmi_engine == -1)
+   qpd->initial_xgmi_engine = dqm->next_xgmi_engine;
num_engines = get_num_xgmi_sdma_engines(dqm);
for_each_set_bit(engine, &(qpd->xgmi_sdma_engine_bitmap), num_engines) {
-   available_queue_bitmap = sdma_engine_mask(engine, num_engines);
+   available_queue_bitmap = sdma_engine_mask(
+   qpd->initial_xgmi_engine + engine, num_engines);
available_queue_bitmap &= dqm->xgmi_sdma_bitmap;
 
if (!available_queue_bitmap)
@@ -1109,6 +1112,9 @@ static int allocate_sdma_queue(struct device_queue_manager *dqm,
qpd->xgmi_sdma_engine_bitmap &= ~(1ULL << engine);
found_sdma = true;
 
+   dqm->next_xgmi_engine = qpd->initial_xgmi_engine + engine + 1;
+   dqm->next_xgmi_engine %= num_engines;
+
bit = __ffs64(available_queue_bitmap);
dqm->xgmi_sdma_bitmap &= ~(1ULL << bit);
q->sdma_id = bit;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index c8719682c4da..b5955e7401e5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -183,6 +183,8 @@ struct device_queue_manager {
unsigned int*allocated_queues;
uint64_tsdma_bitmap;
uint64_txgmi_sdma_bitmap;
+   /* which XGMI engine the next process should attempt to start on */
+   unsigned intnext_xgmi_engine;
/* the pasid mapping for each kfd vmid */
uint16_tvmid_pasid[VMID_NUM];
uint64_tpipelines_addr;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index c38eebc9db4d..bcf56280c7ed 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -578,6 +578,7 @@ struct qcm_process_device {
unsigned int queue_count;
unsigned long sdma_engine_bitmap;
unsigned long xgmi_sdma_engine_bitmap;
+   int initial_xgmi_engine;
unsigned int vmid;
bool is_debug;
unsigned int evicted; /* eviction counter, 0=active */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 13c85624bf7d..24ce1b52a1d5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1449,6 +1449,7 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
pdd->qpd.mapped_gws_queue = false;
pdd->qpd.sdma_engine_bitmap = BIT_ULL(dev_info->num_sdma_engines) - 1;
pdd->qpd.xgmi_sdma_engine_bitmap = BIT_ULL(dev_info->num_xgmi_sdma_engines) - 1;
+   pdd->qpd.initial_xgmi_engine = -1;
pdd->process = p;
pdd->bound = PDD_UNBOUND;
pdd->already_dequeued = false;
-- 
2.20.1



[PATCH 2/3] drm/amdgpu: Use SDMA1 for buffer movement on Aldebaran

2021-08-19 Thread Joseph Greathouse
Aldebaran should not use SDMA0 for buffer funcs such as page migration.
Instead, we move over to SDMA1 for these features. Leave SDMA0 in
charge for all other existing chips to avoid any possibility of
regressions.

Signed-off-by: Joseph Greathouse 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 8931000dcd41..771630d7bb3f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -2689,11 +2689,15 @@ static const struct amdgpu_buffer_funcs sdma_v4_0_buffer_funcs = {
 
 static void sdma_v4_0_set_buffer_funcs(struct amdgpu_device *adev)
 {
+   int engine = 0;
+
+   if (adev->asic_type == CHIP_ALDEBARAN)
+   engine = 1;
adev->mman.buffer_funcs = &sdma_v4_0_buffer_funcs;
if (adev->sdma.has_page_queue)
-   adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].page;
+   adev->mman.buffer_funcs_ring = &adev->sdma.instance[engine].page;
else
-   adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].ring;
+   adev->mman.buffer_funcs_ring = &adev->sdma.instance[engine].ring;
 }
 
 static const struct amdgpu_vm_pte_funcs sdma_v4_0_vm_pte_funcs = {
-- 
2.20.1



[PATCH 1/3] drm/amdkfd: Allocate SDMA engines more fairly

2021-08-19 Thread Joseph Greathouse
Give every process at most one queue from each SDMA engine.
Previously, we allocated all SDMA engines and queues on a first-
come-first-serve basis. This meant that it was possible for two
processes racing on their allocation requests to each end up with
two queues on the same SDMA engine. That could serialize the
performance of commands to different queues.

This new mechanism allows each process at most a single queue
on each SDMA engine. Processes will check for queue availability on
SDMA engine 0, then 1, then 2, etc. and only allocate on that
engine if there is an available queue slot. If there are no
queue slots available (or if this process has already allocated
a queue on this engine), it moves on and tries the next engine.

The Aldebaran chip has a small quirk that SDMA0 should only be
used by the very first allocation from each process. It is OK to
use any other SDMA engine at any time, but after the first SDMA
allocation request from a process, the resulting engine must
be from SDMA1 or above. This patch handles this case as well.

Signed-off-by: Joseph Greathouse 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 135 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |   3 +
 3 files changed, 107 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index f8fce9d05f50..86bdb765f350 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -52,12 +52,14 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm,
 static int map_queues_cpsch(struct device_queue_manager *dqm);
 
 static void deallocate_sdma_queue(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd,
struct queue *q);
 
 static inline void deallocate_hqd(struct device_queue_manager *dqm,
struct queue *q);
 static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q);
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd,
struct queue *q);
 static void kfd_process_hw_exception(struct work_struct *work);
 
@@ -349,7 +351,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
q->pipe, q->queue);
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
-   retval = allocate_sdma_queue(dqm, q);
+   retval = allocate_sdma_queue(dqm, qpd, q);
if (retval)
goto deallocate_vmid;
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
@@ -410,7 +412,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
deallocate_hqd(dqm, q);
else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI)
-   deallocate_sdma_queue(dqm, q);
+   deallocate_sdma_queue(dqm, qpd, q);
 deallocate_vmid:
if (list_empty(&qpd->queues_list))
deallocate_vmid(dqm, qpd, q);
@@ -475,9 +477,9 @@ static int destroy_queue_nocpsch_locked(struct device_queue_manager *dqm,
if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE)
deallocate_hqd(dqm, q);
else if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
-   deallocate_sdma_queue(dqm, q);
+   deallocate_sdma_queue(dqm, qpd, q);
else if (q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI)
-   deallocate_sdma_queue(dqm, q);
+   deallocate_sdma_queue(dqm, qpd, q);
else {
pr_debug("q->properties.type %d is invalid\n",
q->properties.type);
@@ -1039,42 +1041,93 @@ static void pre_reset(struct device_queue_manager *dqm)
dqm_unlock(dqm);
 }
 
+static uint64_t sdma_engine_mask(int engine, int num_engines)
+{
+   uint64_t mask = 0;
+
+   engine %= num_engines;
+   while (engine < (sizeof(uint64_t)*8)) {
+   mask |= 1ULL << engine;
+   engine += num_engines;
+   }
+   return mask;
+}
+
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
+   struct qcm_process_device *qpd,
struct queue *q)
 {
-   int bit;
+   uint64_t available_queue_bitmap;
+   unsigned int bit, engine, num_engines;
+   bool found_sdma = false;
 
if (q->properties.type == KFD_QUEUE_TYPE_SDMA) {
-   if (dqm->sdma_bitmap == 0) {
+   num_engines = get_num_sdma_engines(dqm);
+   for_each_set_bit(engine, &(qpd->sdma_engine_bitmap), num_engines) {
+   /* 

[PATCH V2] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend

2021-08-19 Thread Evan Quan
Perform proper cleanups on UVD/VCE suspend: powergate enablement,
clockgating enablement and dpm disablement. This can fix some hangs
observed on suspend while UVD/VCE is still in use (e.g. issuing
"pm-suspend" while a video is still playing).

Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
Signed-off-by: Evan Quan 
Signed-off-by: xinhui pan 
--
v1->v2:
  - move the changes to ->hw_fini() (James Zhu)
---
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 24 
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 23 +++
 2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 4eebf973a065..c238aa2014fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -543,6 +543,30 @@ static int uvd_v6_0_hw_fini(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->uvd.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_uvd(adev, false);
+   } else {
+   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
+   /* shutdown the UVD block */
+   amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+  AMD_CG_STATE_GATE);
+   }
+
if (RREG32(mmUVD_STATUS) != 0)
uvd_v6_0_stop(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6d9108fa22e0..e99877c13d5f 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -490,6 +490,29 @@ static int vce_v3_0_hw_fini(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->vce.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_vce(adev, false);
+   } else {
+   amdgpu_asic_set_vce_clocks(adev, 0, 0);
+   amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+  AMD_CG_STATE_GATE);
+   }
+
r = vce_v3_0_wait_for_idle(handle);
if (r)
return r;
-- 
2.29.0



RE: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend

2021-08-19 Thread Quan, Evan
[AMD Official Use Only]



From: Lazar, Lijo 
Sent: Thursday, August 19, 2021 10:36 PM
To: Zhu, James ; Quan, Evan ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Chen, Guchun ; Pan, Xinhui 
Subject: RE: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend


[AMD Official Use Only]

If that is done  -

+   amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+  AMD_CG_STATE_GATE);

The usual order is CG followed by PG. It comes in the else part, so it is less likely to
happen. Nice to fix for code correctness purposes.
[Quan, Evan] Thanks Lijo, that makes sense to me. However, this code was actually
copied from amdgpu_uvd_idle_work_handler() in amdgpu_uvd.c, where the same logic
is used. So maybe @Zhu, James or @Liu, Leo can share some insights about this.

BR
Evan

Thanks,
Lijo

From: Zhu, James <james@amd.com>
Sent: Thursday, August 19, 2021 7:49 PM
To: Quan, Evan <evan.q...@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <alexander.deuc...@amd.com>; Chen, Guchun <guchun.c...@amd.com>; Lazar, Lijo <lijo.la...@amd.com>; Pan, Xinhui <xinhui@amd.com>
Subject: Re: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend


[AMD Official Use Only]


Why not move changes into hw_fini?


Best Regards!



James Zhu


From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Evan Quan <evan.q...@amd.com>
Sent: Wednesday, August 18, 2021 11:08 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <alexander.deuc...@amd.com>; Chen, Guchun <guchun.c...@amd.com>; Lazar, Lijo <lijo.la...@amd.com>; Quan, Evan <evan.q...@amd.com>; Pan, Xinhui <xinhui@amd.com>
Subject: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend

Perform proper cleanups on UVD/VCE suspend: powergate enablement,
clockgating enablement and dpm disablement. This can fix some hangs
observed on suspend while UVD/VCE is still in use (e.g. issuing
"pm-suspend" while a video is still playing).

Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
Signed-off-by: Evan Quan mailto:evan.q...@amd.com>>
Signed-off-by: xinhui pan mailto:xinhui@amd.com>>
---
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 24 
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 23 +++
 2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 4eebf973a065..d0fc6ec18c29 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -554,6 +554,30 @@ static int uvd_v6_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->uvd.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_uvd(adev, false);
+   } else {
+   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
+   /* shutdown the UVD block */
+   amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+  AMD_CG_STATE_GATE);
+   }
+
 r = uvd_v6_0_hw_fini(adev);
 if (r)
 return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6d9108fa22e0..a594ade5d30a 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -503,6 +503,29 @@ static int vce_v3_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+

RE: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend

2021-08-19 Thread Quan, Evan
[AMD Official Use Only]



From: Zhu, James 
Sent: Thursday, August 19, 2021 10:19 PM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Chen, Guchun ; Lazar, Lijo ; Pan, Xinhui 
Subject: Re: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend


[AMD Official Use Only]


Why not move changes into hw_fini?
[Quan, Evan] Sure, I can do that. Will send out a new patch with that updated.

BR
Evan


Best Regards!



James Zhu


From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Evan Quan <evan.q...@amd.com>
Sent: Wednesday, August 18, 2021 11:08 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <alexander.deuc...@amd.com>; Chen, Guchun <guchun.c...@amd.com>; Lazar, Lijo <lijo.la...@amd.com>; Quan, Evan <evan.q...@amd.com>; Pan, Xinhui <xinhui@amd.com>
Subject: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend

Perform proper cleanups on UVD/VCE suspend: powergate enablement,
clockgating enablement and dpm disablement. This can fix some hangs
observed on suspend while UVD/VCE is still in use (e.g. issuing
"pm-suspend" while a video is still playing).

Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
Signed-off-by: Evan Quan mailto:evan.q...@amd.com>>
Signed-off-by: xinhui pan mailto:xinhui@amd.com>>
---
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 24 
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 23 +++
 2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 4eebf973a065..d0fc6ec18c29 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -554,6 +554,30 @@ static int uvd_v6_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->uvd.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_uvd(adev, false);
+   } else {
+   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
+   /* shutdown the UVD block */
+   amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
+  AMD_CG_STATE_GATE);
+   }
+
 r = uvd_v6_0_hw_fini(adev);
 if (r)
 return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6d9108fa22e0..a594ade5d30a 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -503,6 +503,29 @@ static int vce_v3_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->vce.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_vce(adev, false);
+   } else {
+   amdgpu_asic_set_vce_clocks(adev, 0, 0);
+   amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, AMD_IP_BLOCK_TYPE_VCE,
+  AMD_CG_STATE_GATE);
+   }
+
 r = vce_v3_0_hw_fini(adev);
 if (r)
 return r;
--
2.29.0


Re: [PATCH 2/2] drm/amdkfd: map SVM range with correct access permission

2021-08-19 Thread Felix Kuehling
Am 2021-08-19 um 10:56 a.m. schrieb Philip Yang:
> Restore retry fault or prefetch range, or restore svm range after
> eviction, to map the range to the GPU with the correct read or write
> access permission.
>
> A range may include multiple VMAs; update the GPU page table with the
> offset into prange and the number of pages for each VMA, according to
> each VMA's access permission.
>
> Signed-off-by: Philip Yang 

Minor nitpicks, and one question. See inline. It looks good otherwise.


> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 131 +--
>  1 file changed, 84 insertions(+), 47 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index cf1009bb532a..94612581963f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -120,6 +120,7 @@ static void svm_range_remove_notifier(struct svm_range 
> *prange)
>  
>  static int
>  svm_range_dma_map_dev(struct amdgpu_device *adev, struct svm_range *prange,
> +   unsigned long offset, unsigned long npages,
> unsigned long *hmm_pfns, uint32_t gpuidx)
>  {
>   enum dma_data_direction dir = DMA_BIDIRECTIONAL;
> @@ -136,7 +137,8 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
> svm_range *prange,
>   prange->dma_addr[gpuidx] = addr;
>   }
>  
> - for (i = 0; i < prange->npages; i++) {
> + addr += offset;
> + for (i = 0; i < npages; i++) {
>   if (WARN_ONCE(addr[i] && !dma_mapping_error(dev, addr[i]),
> "leaking dma mapping\n"))
>   dma_unmap_page(dev, addr[i], PAGE_SIZE, dir);
> @@ -167,6 +169,7 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
> svm_range *prange,
>  
>  static int
>  svm_range_dma_map(struct svm_range *prange, unsigned long *bitmap,
> +   unsigned long offset, unsigned long npages,
> unsigned long *hmm_pfns)
>  {
>   struct kfd_process *p;
> @@ -187,7 +190,8 @@ svm_range_dma_map(struct svm_range *prange, unsigned long 
> *bitmap,
>   }
>   adev = (struct amdgpu_device *)pdd->dev->kgd;
>  
> - r = svm_range_dma_map_dev(adev, prange, hmm_pfns, gpuidx);
> + r = svm_range_dma_map_dev(adev, prange, offset, npages,
> +   hmm_pfns, gpuidx);
>   if (r)
>   break;
>   }
> @@ -1088,11 +1092,6 @@ svm_range_get_pte_flags(struct amdgpu_device *adev, 
> struct svm_range *prange,
>   pte_flags |= snoop ? AMDGPU_PTE_SNOOPED : 0;
>  
>   pte_flags |= amdgpu_gem_va_map_flags(adev, mapping_flags);
> -
> - pr_debug("svms 0x%p [0x%lx 0x%lx] vram %d PTE 0x%llx mapping 0x%x\n",
> -  prange->svms, prange->start, prange->last,
> -  (domain == SVM_RANGE_VRAM_DOMAIN) ? 1:0, pte_flags, 
> mapping_flags);
> -
>   return pte_flags;
>  }
>  
> @@ -1156,7 +1155,8 @@ svm_range_unmap_from_gpus(struct svm_range *prange, 
> unsigned long start,
>  
>  static int
>  svm_range_map_to_gpu(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> -  struct svm_range *prange, dma_addr_t *dma_addr,
> +  struct svm_range *prange, unsigned long offset,
> +  unsigned long npages, bool readonly, dma_addr_t *dma_addr,
>struct amdgpu_device *bo_adev, struct dma_fence **fence)
>  {
>   struct amdgpu_bo_va bo_va;
> @@ -1167,14 +1167,15 @@ svm_range_map_to_gpu(struct amdgpu_device *adev, 
> struct amdgpu_vm *vm,
>   int r = 0;
>   int64_t i;
>  
> - pr_debug("svms 0x%p [0x%lx 0x%lx]\n", prange->svms, prange->start,
> -  prange->last);
> + last_start = prange->start + offset;
> +
> + pr_debug("svms 0x%p [0x%lx 0x%lx] readonly %d\n", prange->svms,
> +  last_start, last_start + npages - 1, readonly);
>  
>   if (prange->svm_bo && prange->ttm_res)
>   bo_va.is_xgmi = amdgpu_xgmi_same_hive(adev, bo_adev);
>  
> - last_start = prange->start;
> - for (i = 0; i < prange->npages; i++) {
> + for (i = offset; i < offset + npages; i++) {
>   last_domain = dma_addr[i] & SVM_RANGE_VRAM_DOMAIN;
>   dma_addr[i] &= ~SVM_RANGE_VRAM_DOMAIN;
>   if ((prange->start + i) < prange->last &&
> @@ -1183,13 +1184,21 @@ svm_range_map_to_gpu(struct amdgpu_device *adev, 
> struct amdgpu_vm *vm,
>  
>   pr_debug("Mapping range [0x%lx 0x%llx] on domain: %s\n",
>last_start, prange->start + i, last_domain ? "GPU" : 
> "CPU");
> +
>   pte_flags = svm_range_get_pte_flags(adev, prange, last_domain);
> - r = amdgpu_vm_bo_update_mapping(adev, bo_adev, vm, false, 
> false, NULL,
> - last_start,
> + if (readonly)
> + pte_flags &= ~AMDGPU_PTE_WRITEABLE;
> +
> + pr_debug("svms 0x%p map [0x%lx 0x%llx] vram %

Re: [PATCH 1/2] drm/amdkfd: check access permisson to restore retry fault

2021-08-19 Thread Felix Kuehling
On 2021-08-19 at 10:56 a.m., Philip Yang wrote:
> Check the range access permission when restoring a GPU retry fault: if the
> GPU retry fault is on an address that belongs to a VMA, and the VMA does
> not grant the read or write permission requested by the GPU, fail to
> restore the address. The vm fault event will be passed back to user space.
>
> Signed-off-by: Philip Yang 

Just some nit-picks. Other than that the patch looks good to me. See
inline ...


> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  5 +++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  3 ++-
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c   | 30 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.h   |  5 +++--
>  6 files changed, 40 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 831f00644460..ff6de40b860c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -3347,12 +3347,13 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>   * @adev: amdgpu device pointer
>   * @pasid: PASID of the VM
>   * @addr: Address of the fault
> + * @rw_fault: 0 is read fault, 1 is write fault
>   *
>   * Try to gracefully handle a VM fault. Return true if the fault was handled 
> and
>   * shouldn't be reported any more.
>   */
>  bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
> - uint64_t addr)
> + uint64_t addr, uint32_t rw_fault)

Should rw_fault be a bool? And maybe call it write_fault to clarify what
"true" means.


>  {
>   bool is_compute_context = false;
>   struct amdgpu_bo *root;
> @@ -3377,7 +3378,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, 
> u32 pasid,
>   addr /= AMDGPU_GPU_PAGE_SIZE;
>  
>   if (is_compute_context &&
> - !svm_range_restore_pages(adev, pasid, addr)) {
> + !svm_range_restore_pages(adev, pasid, addr, rw_fault)) {
>   amdgpu_bo_unref(&root);
>   return true;
>   }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 80cc9ab2c1d0..1cc574ece180 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -448,7 +448,7 @@ void amdgpu_vm_check_compute_bug(struct amdgpu_device 
> *adev);
>  void amdgpu_vm_get_task_info(struct amdgpu_device *adev, u32 pasid,
>struct amdgpu_task_info *task_info);
>  bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
> - uint64_t addr);
> + uint64_t addr, uint32_t rw_fault);

bool write_fault


>  
>  void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index 24b781e90bef..994983901006 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -93,6 +93,7 @@ static int gmc_v10_0_process_interrupt(struct amdgpu_device 
> *adev,
>  struct amdgpu_iv_entry *entry)
>  {
>   bool retry_fault = !!(entry->src_data[1] & 0x80);
> + bool rw_fault = !!(entry->src_data[1] & 0x20);

write_fault


>   struct amdgpu_vmhub *hub = &adev->vmhub[entry->vmid_src];
>   struct amdgpu_task_info task_info;
>   uint32_t status = 0;
> @@ -121,7 +122,7 @@ static int gmc_v10_0_process_interrupt(struct 
> amdgpu_device *adev,
>   /* Try to handle the recoverable page faults by filling page
>* tables
>*/
> - if (amdgpu_vm_handle_fault(adev, entry->pasid, addr))
> + if (amdgpu_vm_handle_fault(adev, entry->pasid, addr, rw_fault))
>   return 1;
>   }
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 097230b5e946..9a37fd0527a9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -506,6 +506,7 @@ static int gmc_v9_0_process_interrupt(struct 
> amdgpu_device *adev,
> struct amdgpu_iv_entry *entry)
>  {
>   bool retry_fault = !!(entry->src_data[1] & 0x80);
> + bool rw_fault = !!(entry->src_data[1] & 0x20);

write_fault


>   uint32_t status = 0, cid = 0, rw = 0;
>   struct amdgpu_task_info task_info;
>   struct amdgpu_vmhub *hub;
> @@ -536,7 +537,7 @@ static int gmc_v9_0_process_interrupt(struct 
> amdgpu_device *adev,
>   /* Try to handle the recoverable page faults by filling page
>* tables
>*/
> - if (amdgpu_vm_handle_fault(adev, entry->pasid, addr))
> + if (amdgpu_vm_handle_fault(adev, entry->pasid, addr, rw_fault))
>   return 1;
>   }
>  
> diff --git a

Re: [PATCH v2 18/63] drm/amd/pm: Use struct_group() for memcpy() region

2021-08-19 Thread Kees Cook
On Thu, Aug 19, 2021 at 10:33:43AM +0530, Lazar, Lijo wrote:
> On 8/19/2021 5:29 AM, Kees Cook wrote:
> > On Wed, Aug 18, 2021 at 05:12:28PM +0530, Lazar, Lijo wrote:
> > > 
> > > On 8/18/2021 11:34 AM, Kees Cook wrote:
> > > > In preparation for FORTIFY_SOURCE performing compile-time and run-time
> > > > field bounds checking for memcpy(), memmove(), and memset(), avoid
> > > > intentionally writing across neighboring fields.
> > > > 
> > > > Use struct_group() in structs:
> > > > struct atom_smc_dpm_info_v4_5
> > > > struct atom_smc_dpm_info_v4_6
> > > > struct atom_smc_dpm_info_v4_7
> > > > struct atom_smc_dpm_info_v4_10
> > > > PPTable_t
> > > > so the grouped members can be referenced together. This will allow
> > > > memcpy() and sizeof() to more easily reason about sizes, improve
> > > > readability, and avoid future warnings about writing beyond the end of
> > > > the first member.
> > > > 
> > > > "pahole" shows no size nor member offset changes to any structs.
> > > > "objdump -d" shows no object code changes.
> > > > 
> > > > Cc: "Christian König" 
> > > > Cc: "Pan, Xinhui" 
> > > > Cc: David Airlie 
> > > > Cc: Daniel Vetter 
> > > > Cc: Hawking Zhang 
> > > > Cc: Feifei Xu 
> > > > Cc: Lijo Lazar 
> > > > Cc: Likun Gao 
> > > > Cc: Jiawei Gu 
> > > > Cc: Evan Quan 
> > > > Cc: amd-gfx@lists.freedesktop.org
> > > > Cc: dri-de...@lists.freedesktop.org
> > > > Signed-off-by: Kees Cook 
> > > > Acked-by: Alex Deucher 
> > > > Link: 
> > > > https://lore.kernel.org/lkml/CADnq5_Npb8uYvd%2BR4UHgf-w8-cQj3JoODjviJR_Y9w9wqJ71mQ%40mail.gmail.com
> > > > ---
> > > >drivers/gpu/drm/amd/include/atomfirmware.h   |  9 -
> > > >.../gpu/drm/amd/pm/inc/smu11_driver_if_arcturus.h|  3 ++-
> > > >drivers/gpu/drm/amd/pm/inc/smu11_driver_if_navi10.h  |  3 ++-
> > > >.../gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h   |  3 ++-
> > > 
> > > Hi Kees,
> > 
> > Hi! Thanks for looking into this.
> > 
> > > The headers which define these structs are firmware/VBIOS interfaces and 
> > > are
> > > picked directly from those components. There are difficulties in grouping
> > > them to structs at the original source as that involves other component
> > > changes.
> > 
> > So, can you help me understand this a bit more? It sounds like these are
> > generated headers, yes? I'd like to understand your constraints and
> > weight them against various benefits that could be achieved here.
> > 
> > The groupings I made do appear to be roughly documented already,
> > for example:
> > 
> > struct   atom_common_table_header  table_header;
> >   // SECTION: BOARD PARAMETERS
> > +  struct_group(dpm_info,
> > 
> > Something emitted the "BOARD PARAMETERS" section heading as a comment,
> > so it likely also would know where it ends, yes? The good news here is
> > that for the dpm_info groups, they all end at the end of the existing
> > structs, see:
> > struct atom_smc_dpm_info_v4_5
> > struct atom_smc_dpm_info_v4_6
> > struct atom_smc_dpm_info_v4_7
> > struct atom_smc_dpm_info_v4_10
> > 
> > The matching regions in the PPTable_t structs are similarly marked with a
> > "BOARD PARAMETERS" section heading comment:
> > 
> > --- a/drivers/gpu/drm/amd/pm/inc/smu11_driver_if_arcturus.h
> > +++ b/drivers/gpu/drm/amd/pm/inc/smu11_driver_if_arcturus.h
> > @@ -643,6 +643,7 @@ typedef struct {
> > // SECTION: BOARD PARAMETERS
> > // SVI2 Board Parameters
> > +  struct_group(v4_6,
> > uint16_t MaxVoltageStepGfx; // In mV(Q2) Max voltage step that SMU 
> > will request. Multiple steps are taken if voltage change exceeds this value.
> > uint16_t MaxVoltageStepSoc; // In mV(Q2) Max voltage step that SMU 
> > will request. Multiple steps are taken if voltage change exceeds this value.
> > @@ -728,10 +729,10 @@ typedef struct {
> > uint32_t BoardVoltageCoeffB;// decode by /1000
> > uint32_t BoardReserved[7];
> > +  );
> > // Padding for MMHUB - do not modify this
> > uint32_t MmHubPadding[8]; // SMU internal use
> > -
> >   } PPTable_t;
> > 
> > Where they end seems known as well (the padding switches from a "Board"
> > to "MmHub" prefix at exactly the matching size).
> > 
> > So, given that these regions are already known by the export tool, how
> > about just updating the export tool to emit a struct there? I imagine
> > the problem with this would be the identifier churn needed, but that's
> > entirely mechanical.
> > 
> > However, I'm curious about another aspect of these regions: they are,
> > by definition, the same. Why isn't there a single str

[PATCH] drm/amd/pm: And destination bounds checking to struct copy

2021-08-19 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

The "Board Parameters" members of the structs:
struct atom_smc_dpm_info_v4_5
struct atom_smc_dpm_info_v4_6
struct atom_smc_dpm_info_v4_7
struct atom_smc_dpm_info_v4_10
are written to the corresponding members of the corresponding PPTable_t
variables, but they lack destination size bounds checking, which means
the compiler cannot verify at compile time that this is an intended and
safe memcpy().

Since the header files are effectively immutable[1] and a struct_group()
cannot be used, nor a common struct referenced by both sides of the
memcpy() arguments, add a new helper, memcpy_trailing(), to perform the
bounds checking at compile time. Replace the open-coded memcpy()s with
memcpy_trailing() which includes enough context for the bounds checking.

"objdump -d" shows no object code changes.

[1] https://lore.kernel.org/lkml/e56aad3c-a06f-da07-f491-a894a570d...@amd.com

Cc: Lijo Lazar 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Hawking Zhang 
Cc: Feifei Xu 
Cc: Likun Gao 
Cc: Jiawei Gu 
Cc: Evan Quan 
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Kees Cook 
Link: 
https://lore.kernel.org/lkml/cadnq5_npb8uyvd+r4uhgf-w8-cqj3joodjvijr_y9w9wqj7...@mail.gmail.com
---
Alex, I dropped your prior Acked-by, since the implementation is very
different. If you're still happy with it, I can add it back. :)
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   | 25 +++
 .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c |  6 ++---
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 +++---
 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c|  5 ++--
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 96e895d6be35..4605934a4fb7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1446,4 +1446,29 @@ static inline int amdgpu_in_reset(struct amdgpu_device 
*adev)
 {
return atomic_read(&adev->in_gpu_reset);
 }
+
+/**
+ * memcpy_trailing - Copy the end of one structure into the middle of another
+ *
+ * @dst: Pointer to destination struct
+ * @first_dst_member: The member name in @dst where the overwrite begins
+ * @last_dst_member: The member name in @dst where the overwrite ends after
+ * @src: Pointer to the source struct
+ * @first_src_member: The member name in @src where the copy begins
+ *
+ */
+#define memcpy_trailing(dst, first_dst_member, last_dst_member,   \
+   src, first_src_member) \
+({\
+   size_t __src_offset = offsetof(typeof(*(src)), first_src_member);  \
+   size_t __src_size = sizeof(*(src)) - __src_offset; \
+   size_t __dst_offset = offsetof(typeof(*(dst)), first_dst_member);  \
+   size_t __dst_size = offsetofend(typeof(*(dst)), last_dst_member) - \
+   __dst_offset;  \
+   BUILD_BUG_ON(__src_size != __dst_size);\
+   __builtin_memcpy((u8 *)(dst) + __dst_offset,   \
+(u8 *)(src) + __src_offset,   \
+__dst_size);  \
+})
+
 #endif
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index 8ab58781ae13..1918e6232319 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -465,10 +465,8 @@ static int arcturus_append_powerplay_table(struct 
smu_context *smu)
 
if ((smc_dpm_table->table_header.format_revision == 4) &&
(smc_dpm_table->table_header.content_revision == 6))
-   memcpy(&smc_pptable->MaxVoltageStepGfx,
-  &smc_dpm_table->maxvoltagestepgfx,
-  sizeof(*smc_dpm_table) - offsetof(struct 
atom_smc_dpm_info_v4_6, maxvoltagestepgfx));
-
+   memcpy_trailing(smc_pptable, MaxVoltageStepGfx, BoardReserved,
+   smc_dpm_table, maxvoltagestepgfx);
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 2e5d3669652b..b738042e064d 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -431,16 +431,16 @@ static int navi10_append_powerplay_table(struct 
smu_context *smu)
 
switch (smc_dpm_table->table_header.content_revision) {
case 5: /* nv10 and nv14 */
-   memcpy(smc_pptable->I2cControllers

Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-19 Thread Kuppuswamy, Sathyanarayanan




On 8/19/21 11:33 AM, Tom Lendacky wrote:

There was some talk about this on the mailing list where TDX and SEV may
need to be differentiated, so we wanted to reserve a range of values per
technology. I guess I can remove them until they are actually needed.


In TDX also we have similar requirements and we need some flags for
TDX specific checks. So I think it is fine to leave some space for vendor
flags.

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Re: [PATCH v6 02/13] mm: remove extra ZONE_DEVICE struct page refcount

2021-08-19 Thread Felix Kuehling
On 2021-08-19 at 2:00 p.m., Sierra Guiza, Alejandro (Alex) wrote:
>
> On 8/18/2021 2:28 PM, Ralph Campbell wrote:
>> On 8/17/21 5:35 PM, Felix Kuehling wrote:
>>> Am 2021-08-17 um 8:01 p.m. schrieb Ralph Campbell:
 On 8/12/21 11:31 PM, Alex Sierra wrote:
> From: Ralph Campbell 
>
> ZONE_DEVICE struct pages have an extra reference count that
> complicates the
> code for put_page() and several places in the kernel that need to
> check the
> reference count to see that a page is not being used (gup,
> compaction,
> migration, etc.). Clean up the code so the reference count doesn't
> need to
> be treated specially for ZONE_DEVICE.
>
> v2:
> AS: merged this patch in linux 5.11 version
>
> v5:
> AS: add condition at try_grab_page to check for the zone device type,
> while
> page ref counter is checked less/equal to zero. In case of device
> zone, pages
> ref counter are initialized to zero.
>
> Signed-off-by: Ralph Campbell 
> Signed-off-by: Alex Sierra 
> ---
>    arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
>    drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
>    fs/dax.c   |  4 +-
>    include/linux/dax.h    |  2 +-
>    include/linux/memremap.h   |  7 +--
>    include/linux/mm.h | 13 +
>    lib/test_hmm.c |  2 +-
>    mm/internal.h  |  8 +++
>    mm/memremap.c  | 68
> +++---
>    mm/migrate.c   |  5 --
>    mm/page_alloc.c    |  3 ++
>    mm/swap.c  | 45 ++---
>    12 files changed, 46 insertions(+), 115 deletions(-)
>
 I haven't seen a response to the issues I raised back at v3 of this
 series.
 https://lore.kernel.org/linux-mm/4f6dd918-d79b-1aa7-3a4c-caa67ddc2...@nvidia.com/



 Did I miss something?
>>> I think part of the response was that we did more testing. Alex added
>>> support for DEVICE_GENERIC pages to test_hmm and he ran DAX tests
>>> recommended by Theodore Tso. In that testing he ran into a WARN_ON_ONCE
>>> about a zero page refcount in try_get_page. The fix is in the latest
>>> version of patch 2. But it's already obsolete because John Hubbard is
>>> about to remove that function altogether.
>>>
>>> I think the issues you raised were more uncertainty than known bugs. It
>>> seems the fact that you can have DAX pages with 0 refcount is a feature
>>> more than a bug.
>>>
>>> Regards,
>>>    Felix
>>
>> Did you test on a system without CONFIG_ARCH_HAS_PTE_SPECIAL defined?
>> In that case, mmap() of a DAX device will call insert_page() which calls
>> get_page() which would trigger VM_BUG_ON_PAGE().
>>
>> I can believe it is OK for PTE_SPECIAL page table entries to have no
>> struct page or that MEMORY_DEVICE_GENERIC struct pages be mapped with
>> a zero reference count using insert_pfn().
> Hi Ralph,
> We have tried the DAX tests with and without
> CONFIG_ARCH_HAS_PTE_SPECIAL defined.
> Apparently none of the tests touches that condition for a DAX device.
> Of course,
> that doesn't mean it could happen.
>
> Regards,
> Alex S.
>
>>
>>
>> I find it hard to believe that other MM developers don't see an issue
>> with a struct page with refcount == 0 and mapcount == 1.
>>
>> I don't see where init_page_count() is being called for the
>> MEMORY_DEVICE_GENERIC or MEMORY_DEVICE_PRIVATE struct pages the AMD
>> driver allocates and passes to migrate_vma_setup().
>> Looks like svm_migrate_get_vram_page() needs to call init_page_count()
>> instead of get_page(). (I'm looking at branch
>> origin/alexsierrag/device_generic
>> https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver.git
> Yes, you're right. My bad. Thanks for catching this. I didn't realize
> CONFIG_DEBUG_VM was missing from my build, so this BUG was never
> caught.
> It worked after I replaced get_pages by init_page_count at
> svm_migrate_get_vram_page. However, I don't think this is the best way
> to fix it.
> Ideally, get_pages call should work for device pages with ref count
> equal to 0
> too. Otherwise, we could overwrite refcounter if someone else is
> grabbing the page
> concurrently.

I think using init_page_count in svm_migrate_get_vram_page is the right
answer. This is where the page first gets allocated and initialized
(data migrated into it). I think nobody should have or try to take a
reference to the page before that. We should probably also add a
VM_BUG_ON_PAGE(page_ref_count(page) != 0) before calling init_page_count
to make sure of that.


> I was thinking to add a special condition in get_pages for dev pages.
> This could
> also fix the insert_page -> get_page call from a DAX device.

[+Theodore]

I got lost trying to understand how DAX counts page references an

[PATCH v3 6/6] drm/amd/display: Add DP 2.0 SST DC Support

2021-08-19 Thread Fangzhi Zuo
1. Retrieve 128/132b link cap.
2. 128/132b link training and payload allocation.
3. UHBR10 link rate support.

Signed-off-by: Fangzhi Zuo 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |8 +
 drivers/gpu/drm/amd/display/dc/core/dc.c  |   21 +
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  520 +++-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 1168 -
 .../drm/amd/display/dc/core/dc_link_hwss.c|  314 -
 .../gpu/drm/amd/display/dc/core/dc_resource.c |  118 ++
 drivers/gpu/drm/amd/display/dc/dc.h   |   27 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |  222 
 drivers/gpu/drm/amd/display/dc/dc_link.h  |3 +
 drivers/gpu/drm/amd/display/dc/dc_types.h |   21 +
 .../display/dc/dce110/dce110_hw_sequencer.c   |  104 +-
 .../amd/display/dc/dcn10/dcn10_link_encoder.c |9 +
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|   26 +-
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |4 +
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |   18 +-
 drivers/gpu/drm/amd/display/dc/dm_cp_psp.h|1 +
 drivers/gpu/drm/amd/display/dc/dm_helpers.h   |2 +
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |   22 +
 .../amd/display/dc/inc/hw_sequencer_private.h |3 +
 drivers/gpu/drm/amd/display/dc/inc/resource.h |5 +
 .../gpu/drm/amd/display/include/dpcd_defs.h   |   16 +
 .../amd/display/include/link_service_types.h  |   24 +-
 .../drm/amd/display/include/logger_types.h|6 +
 23 files changed, 2619 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 6fee12c91ef5..22ddd8d71bcf 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -751,3 +751,11 @@ void dm_helpers_mst_enable_stream_features(const struct 
dc_stream_state *stream)
 &new_downspread.raw,
 sizeof(new_downspread));
 }
+
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+void dm_set_phyd32clk(struct dc_context *ctx, int freq_khz)
+{
+   // FPGA programming for this clock in diags framework that
+   // needs to go through dm layer, therefore leave dummy interface here
+}
+#endif
\ No newline at end of file
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 7a442fcfa6ac..8d4d804df5b2 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -186,6 +186,9 @@ static bool create_links(
int i;
int connectors_num;
struct dc_bios *bios = dc->ctx->dc_bios;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   int dp_hpo_link_count = 0;
+#endif
 
dc->link_count = 0;
 
@@ -255,6 +258,24 @@ static bool create_links(
goto failed_alloc;
}
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   if (IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment) &&
+   dc->caps.dp_hpo &&
+   
link->dc->res_pool->res_cap->num_hpo_dp_link_encoder > 0) {
+   /* FPGA case - Allocate HPO DP link encoder */
+   if (i < 
link->dc->res_pool->res_cap->num_hpo_dp_link_encoder) {
+   link->hpo_dp_link_enc = 
link->dc->res_pool->hpo_dp_link_enc[i];
+
+   if (link->hpo_dp_link_enc == NULL) {
+   BREAK_TO_DEBUGGER();
+   goto failed_alloc;
+   }
+   link->hpo_dp_link_enc->hpd_source = 
link->link_enc->hpd_source;
+   link->hpo_dp_link_enc->transmitter = 
link->link_enc->transmitter;
+   }
+   }
+#endif
+
link->link_status.dpcd_caps = &link->dpcd_caps;
 
enc_init.ctx = dc->ctx;
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 8bd7f42a8053..bb06d8e3312e 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -64,6 +64,31 @@
 
/***
  * Private functions
  
**/
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+static bool add_dp_hpo_link_encoder_to_link(struct dc_link *link)
+{
+   struct hpo_dp_link_encoder *enc = 
resource_get_unused_hpo_dp_link_encoder(
+   link->dc->res_pool);
+
+   if (!link->hpo_dp_link_enc && enc) {
+   link->hpo_dp_link_enc = enc;
+   link->hpo_dp_link_enc->transmitter = 
link->link_enc->transmitter;
+   link->hpo_dp_link_enc->hpd_source = link->link_enc->hpd_source;
+   }
+
+  

[PATCH v3 5/6] drm/amd/display: Add DP 2.0 BIOS and DMUB Support

2021-08-19 Thread Fangzhi Zuo
Parse DP2 encoder caps and hpo instance from bios

Signed-off-by: Fangzhi Zuo 
---
 drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c | 10 ++
 drivers/gpu/drm/amd/display/dc/bios/command_table2.c   | 10 ++
 .../drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c  |  4 
 drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h   |  6 ++
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h|  4 
 .../gpu/drm/amd/display/include/bios_parser_types.h| 10 ++
 drivers/gpu/drm/amd/include/atomfirmware.h |  6 ++
 7 files changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c 
b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
index 6dbde74c1e06..cdb5c027411a 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
@@ -1604,6 +1604,16 @@ static enum bp_result bios_parser_get_encoder_cap_info(
ATOM_ENCODER_CAP_RECORD_HBR3_EN) ? 1 : 0;
info->HDMI_6GB_EN = (record->encodercaps &
ATOM_ENCODER_CAP_RECORD_HDMI6Gbps_EN) ? 1 : 0;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   info->IS_DP2_CAPABLE = (record->encodercaps &
+   ATOM_ENCODER_CAP_RECORD_DP2) ? 1 : 0;
+   info->DP_UHBR10_EN = (record->encodercaps &
+   ATOM_ENCODER_CAP_RECORD_UHBR10_EN) ? 1 : 0;
+   info->DP_UHBR13_5_EN = (record->encodercaps &
+   ATOM_ENCODER_CAP_RECORD_UHBR13_5_EN) ? 1 : 0;
+   info->DP_UHBR20_EN = (record->encodercaps &
+   ATOM_ENCODER_CAP_RECORD_UHBR20_EN) ? 1 : 0;
+#endif
info->DP_IS_USB_C = (record->encodercaps &
ATOM_ENCODER_CAP_RECORD_USB_C_TYPE) ? 1 : 0;
 
diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c 
b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
index f1f672a997d7..6e333b4af7d6 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
@@ -340,6 +340,13 @@ static enum bp_result transmitter_control_v1_7(
const struct command_table_helper *cmd = bp->cmd_helper;
struct dmub_dig_transmitter_control_data_v1_7 dig_v1_7 = {0};
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   uint8_t hpo_instance = (uint8_t)cntl->hpo_engine_id - ENGINE_ID_HPO_0;
+
+   if (dc_is_dp_signal(cntl->signal))
+   hpo_instance = (uint8_t)cntl->hpo_engine_id - 
ENGINE_ID_HPO_DP_0;
+#endif
+
dig_v1_7.phyid = cmd->phy_id_to_atom(cntl->transmitter);
dig_v1_7.action = (uint8_t)cntl->action;
 
@@ -353,6 +360,9 @@ static enum bp_result transmitter_control_v1_7(
dig_v1_7.hpdsel = cmd->hpd_sel_to_atom(cntl->hpd_sel);
dig_v1_7.digfe_sel = cmd->dig_encoder_sel_to_atom(cntl->engine_id);
dig_v1_7.connobj_id = (uint8_t)cntl->connector_obj_id.id;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   dig_v1_7.HPO_instance = hpo_instance;
+#endif
dig_v1_7.symclk_units.symclk_10khz = cntl->pixel_clock/10;
 
if (cntl->action == TRANSMITTER_CONTROL_ENABLE ||
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c
index 46ea39f5ef8d..6f3c2fb60790 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dio_link_encoder.c
@@ -192,6 +192,10 @@ void dcn30_link_encoder_construct(
enc10->base.features.flags.bits.IS_HBR3_CAPABLE =
bp_cap_info.DP_HBR3_EN;
enc10->base.features.flags.bits.HDMI_6GB_EN = 
bp_cap_info.HDMI_6GB_EN;
+   enc10->base.features.flags.bits.IS_DP2_CAPABLE = 
bp_cap_info.IS_DP2_CAPABLE;
+   enc10->base.features.flags.bits.IS_UHBR10_CAPABLE = 
bp_cap_info.DP_UHBR10_EN;
+   enc10->base.features.flags.bits.IS_UHBR13_5_CAPABLE = 
bp_cap_info.DP_UHBR13_5_EN;
+   enc10->base.features.flags.bits.IS_UHBR20_CAPABLE = 
bp_cap_info.DP_UHBR20_EN;
enc10->base.features.flags.bits.DP_IS_USB_C =
bp_cap_info.DP_IS_USB_C;
} else {
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
index fa3a725e11dc..b99efcf4712f 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
@@ -59,6 +59,12 @@ struct encoder_feature_support {
uint32_t IS_TPS3_CAPABLE:1;
uint32_t IS_TPS4_CAPABLE:1;
uint32_t HDMI_6GB_EN:1;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   uint32_t IS_DP2_CAPABLE:1;
+   uint32_t IS_UHBR10_CAPABLE:1;
+   uint32_t IS_UHBR13_5_CAPABLE:1;
+   uint32_t IS_UHBR20_CAPABLE:1;
+#endif
  

[PATCH v3 4/6] drm/amd/display: Add DP 2.0 DCCG

2021-08-19 Thread Fangzhi Zuo
HW Blocks:

+--------+  +-----+  +------+
|  OPTC  |  | HDA |  | HUBP |
+--------+  +-----+  +------+
     |         |         |
     |         |         |
 HPO |=========|=========|====
     |         |         |
     |         v         |
     |      +-----+      |
     |      | APG |      |
     |      +-----+      |
     |         |         |
     v         v         v
   +-------------------------+
   |   HPO Stream Encoder    |
   +-------------------------+
               |
               v
    +--------------------+
    |  HPO Link Encoder  |
    +--------------------+
               |
     ==========|=============
               v
      +------------------+
      |  DIO Output Mux  |
      +------------------+
               |
               v
           +-------+
           |  PHY  |
           +-------+
               | PHYD32CLK[0]
               v
           +------+
           | DCCG |
           +------+
               |
               v
            SYMCLK32

Signed-off-by: Fangzhi Zuo 
---
 .../gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c | 162 ++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_dccg.h |  18 ++
 drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h  |  23 +++
 .../amd/display/dc/inc/hw/timing_generator.h  |   3 +
 4 files changed, 206 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
index 696c9307715d..9896adf67425 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
@@ -42,6 +42,155 @@
 #define DC_LOGGER \
dccg->ctx->logger
 
+void dccg31_set_dpstreamclk(
+   struct dccg *dccg,
+   enum hdmistreamclk_source src,
+   int otg_inst)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
+   /* enabled to select one of the DTBCLKs for pipe */
+   switch (otg_inst) {
+   case 0:
+   REG_UPDATE(DPSTREAMCLK_CNTL,
+   DPSTREAMCLK_PIPE0_EN, (src == REFCLK) ? 0 : 1);
+   break;
+   case 1:
+   REG_UPDATE(DPSTREAMCLK_CNTL,
+   DPSTREAMCLK_PIPE1_EN, (src == REFCLK) ? 0 : 1);
+   break;
+   case 2:
+   REG_UPDATE(DPSTREAMCLK_CNTL,
+   DPSTREAMCLK_PIPE2_EN, (src == REFCLK) ? 0 : 1);
+   break;
+   case 3:
+   REG_UPDATE(DPSTREAMCLK_CNTL,
+   DPSTREAMCLK_PIPE3_EN, (src == REFCLK) ? 0 : 1);
+   break;
+   default:
+   BREAK_TO_DEBUGGER();
+   return;
+   }
+}
+
+void dccg31_enable_symclk32_se(
+   struct dccg *dccg,
+   int hpo_se_inst,
+   enum phyd32clk_clock_source phyd32clk)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
+   /* select one of the PHYD32CLKs as the source for symclk32_se */
+   switch (hpo_se_inst) {
+   case 0:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE0_SRC_SEL, phyd32clk,
+   SYMCLK32_SE0_EN, 1);
+   break;
+   case 1:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE1_SRC_SEL, phyd32clk,
+   SYMCLK32_SE1_EN, 1);
+   break;
+   case 2:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE2_SRC_SEL, phyd32clk,
+   SYMCLK32_SE2_EN, 1);
+   break;
+   case 3:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE3_SRC_SEL, phyd32clk,
+   SYMCLK32_SE3_EN, 1);
+   break;
+   default:
+   BREAK_TO_DEBUGGER();
+   return;
+   }
+}
+
+void dccg31_disable_symclk32_se(
+   struct dccg *dccg,
+   int hpo_se_inst)
+{
+   struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
+   /* set refclk as the source for symclk32_se */
+   switch (hpo_se_inst) {
+   case 0:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE0_SRC_SEL, 0,
+   SYMCLK32_SE0_EN, 0);
+   break;
+   case 1:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE1_SRC_SEL, 0,
+   SYMCLK32_SE1_EN, 0);
+   break;
+   case 2:
+   REG_UPDATE_2(SYMCLK32_SE_CNTL,
+   SYMCLK32_SE2_SRC_SEL, 0,
+   SYMCLK32_SE2_EN, 0);
+   break;

[PATCH v3 2/6] drm/amd/display: Add DP 2.0 HPO Stream Encoder

2021-08-19 Thread Fangzhi Zuo
HW Blocks:

+--------+  +-----+  +------+
|  OPTC  |  | HDA |  | HUBP |
+--------+  +-----+  +------+
     |         |         |
     |         |         |
 HPO |=========|=========|====
     |         |         |
     |         v         |
     |      +-----+      |
     |      | APG |      |
     |      +-----+      |
     |         |         |
     v         v         v
   +-------------------------+
   |   HPO Stream Encoder    |
   +-------------------------+

Signed-off-by: Fangzhi Zuo 
---
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |  33 +
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   2 +-
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.c| 749 ++
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.h| 241 ++
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  85 ++
 .../gpu/drm/amd/display/dc/inc/core_types.h   |  10 +
 .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |   3 +
 .../amd/display/dc/inc/hw/stream_encoder.h|  81 ++
 drivers/gpu/drm/amd/display/dc/inc/resource.h |   8 +
 .../amd/display/include/grph_object_defs.h|  12 +
 .../drm/amd/display/include/grph_object_id.h  |   8 +
 11 files changed, 1231 insertions(+), 1 deletion(-)
 create mode 100644 
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
 create mode 100644 
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index df8a7718a85f..be98f5513fe5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -466,6 +466,39 @@ void dcn10_log_hw_state(struct dc *dc,
 
log_mpc_crc(dc, log_ctx);
 
+   {
+   int hpo_dp_link_enc_count = 0;
+
+   if (pool->hpo_dp_stream_enc_count > 0) {
+   DTN_INFO("DP HPO S_ENC:  Enabled  OTG   Format   Depth  
 Vid   SDP   Compressed  Link\n");
+   for (i = 0; i < pool->hpo_dp_stream_enc_count; i++) {
+   struct hpo_dp_stream_encoder_state 
hpo_dp_se_state = {0};
+   struct hpo_dp_stream_encoder *hpo_dp_stream_enc 
= pool->hpo_dp_stream_enc[i];
+
+   if (hpo_dp_stream_enc && 
hpo_dp_stream_enc->funcs->read_state) {
+   
hpo_dp_stream_enc->funcs->read_state(hpo_dp_stream_enc, &hpo_dp_se_state);
+
+   DTN_INFO("[%d]: %d
%d   %6s   %d %d %d%d %d\n",
+   hpo_dp_stream_enc->id - 
ENGINE_ID_HPO_DP_0,
+   
hpo_dp_se_state.stream_enc_enabled,
+   
hpo_dp_se_state.otg_inst,
+   
(hpo_dp_se_state.pixel_encoding == 0) ? "4:4:4" :
+   
((hpo_dp_se_state.pixel_encoding == 1) ? "4:2:2" :
+   
(hpo_dp_se_state.pixel_encoding == 2) ? "4:2:0" : "Y-Only"),
+   
(hpo_dp_se_state.component_depth == 0) ? 6 :
+   
((hpo_dp_se_state.component_depth == 1) ? 8 :
+   
(hpo_dp_se_state.component_depth == 2) ? 10 : 12),
+   
hpo_dp_se_state.vid_stream_enabled,
+   
hpo_dp_se_state.sdp_enabled,
+   
hpo_dp_se_state.compressed_format,
+   
hpo_dp_se_state.mapped_to_link_enc);
+   }
+   }
+
+   DTN_INFO("\n");
+   }
+   }
+
DTN_INFO_END();
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
index bc2087f6dcb2..8b811f589524 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
@@ -12,7 +12,7 @@
 
 DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o dcn31_init.o 
dcn31_hubp.o \
dcn31_dccg.o dcn31_optc.o dcn31_dio_link_encoder.o dcn31_panel_cntl.o \
-   dcn31_apg.o
+   dcn31_apg.o dcn31_hpo_dp_stream_encoder.o
 
 ifdef CONFIG_X86
 CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -msse
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
new file mode 100644
index ..bc265ee0682

[PATCH v3 3/6] drm/amd/display: Add DP 2.0 HPO Link Encoder

2021-08-19 Thread Fangzhi Zuo
HW Blocks:

+--------+  +-----+  +------+
|  OPTC  |  | HDA |  | HUBP |
+--------+  +-----+  +------+
     |         |         |
     |         |         |
 HPO |=========|=========|====
     |         |         |
     |         v         |
     |      +-----+      |
     |      | APG |      |
     |      +-----+      |
     |         |         |
     |         v         v
     |  +-------------------------+
     |  |   HPO Stream Encoder    |
     |  +-------------------------+
     |               |
     |               v
     |   +--------------------+
     |   |  HPO Link Encoder  |
     v   +--------------------+

Signed-off-by: Fangzhi Zuo 
---
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   4 +
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |  32 +
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   2 +-
 .../display/dc/dcn31/dcn31_dio_link_encoder.c |   4 +
 .../dc/dcn31/dcn31_hpo_dp_link_encoder.c  | 620 ++
 .../dc/dcn31/dcn31_hpo_dp_link_encoder.h  | 222 +++
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  50 ++
 .../gpu/drm/amd/display/dc/inc/core_types.h   |   2 +
 .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |   1 +
 .../drm/amd/display/dc/inc/hw/link_encoder.h  |  87 +++
 drivers/gpu/drm/amd/display/dc/inc/resource.h |   5 +
 .../amd/display/include/link_service_types.h  |  17 +
 12 files changed, 1045 insertions(+), 1 deletion(-)
 create mode 100644 
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
 create mode 100644 
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.h

diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h 
b/drivers/gpu/drm/amd/display/dc/dc_link.h
index 83845d006c54..4450078213a2 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_link.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
@@ -45,6 +45,10 @@ struct dc_link_status {
 struct link_mst_stream_allocation {
/* DIG front */
const struct stream_encoder *stream_enc;
+#if defined(CONFIG_DRM_AMD_DC_DCN)
+   /* HPO DP Stream Encoder */
+   const struct hpo_dp_stream_encoder *hpo_dp_stream_enc;
+#endif
/* associate DRM payload table with DC stream encoder */
uint8_t vcp_id;
/* number of slots required for the DP stream in transport packet */
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index be98f5513fe5..70d47773c23c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -497,6 +497,38 @@ void dcn10_log_hw_state(struct dc *dc,
 
DTN_INFO("\n");
}
+
+   /* log DP HPO L_ENC section if any hpo_dp_link_enc exists */
+   for (i = 0; i < dc->link_count; i++)
+   if (dc->links[i]->hpo_dp_link_enc)
+   hpo_dp_link_enc_count++;
+
+   if (hpo_dp_link_enc_count) {
+   DTN_INFO("DP HPO L_ENC:  Enabled  Mode   Lanes   Stream 
 Slots   VC Rate XVC Rate Y\n");
+
+   for (i = 0; i < dc->link_count; i++) {
+   struct hpo_dp_link_encoder *hpo_dp_link_enc = 
dc->links[i]->hpo_dp_link_enc;
+   struct hpo_dp_link_enc_state hpo_dp_le_state = 
{0};
+
+   if (hpo_dp_link_enc && 
hpo_dp_link_enc->funcs->read_state) {
+   
hpo_dp_link_enc->funcs->read_state(hpo_dp_link_enc, &hpo_dp_le_state);
+   DTN_INFO("[%d]: %d  %6s 
%d%d  %d %d %d\n",
+   hpo_dp_link_enc->inst,
+   
hpo_dp_le_state.link_enc_enabled,
+   
(hpo_dp_le_state.link_mode == 0) ? "TPS1" :
+   
(hpo_dp_le_state.link_mode == 1) ? "TPS2" :
+   
(hpo_dp_le_state.link_mode == 2) ? "ACTIVE" : "TEST",
+   
hpo_dp_le_state.lane_count,
+   
hpo_dp_le_state.stream_src[0],
+   
hpo_dp_le_state.slot_count[0],
+   
hpo_dp_le_state.vc_rate_x[0],
+   
hpo_dp_le_state.vc_rate_y[0]);
+   DTN_INFO("\n");
+   }
+   }
+
+   DTN_INFO("\n");
+   }
}
 
DTN_INFO_END();
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile 
b/drivers/gpu/d

[PATCH v3 1/6] drm/amd/display: Add DP 2.0 Audio Package Generator

2021-08-19 Thread Fangzhi Zuo
HW Blocks:

     +-----+
     | HDA |
     +-----+
        |
        |
 HPO ===|=====
     |  v
     |  +-----+
     |  | APG |
     v  +-----+

Signed-off-by: Fangzhi Zuo 
---
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |   3 +-
 .../gpu/drm/amd/display/dc/dcn31/dcn31_apg.c  | 173 ++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_apg.h  | 115 
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  38 
 4 files changed, 328 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.h

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
index 4bab97acb155..bc2087f6dcb2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
@@ -11,7 +11,8 @@
 # Makefile for dcn31.
 
 DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o dcn31_init.o 
dcn31_hubp.o \
-   dcn31_dccg.o dcn31_optc.o dcn31_dio_link_encoder.o dcn31_panel_cntl.o
+   dcn31_dccg.o dcn31_optc.o dcn31_dio_link_encoder.o dcn31_panel_cntl.o \
+   dcn31_apg.o
 
 ifdef CONFIG_X86
 CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -msse
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c 
b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c
new file mode 100644
index ..6bd7a0626665
--- /dev/null
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c
@@ -0,0 +1,173 @@
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ *  and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: AMD
+ *
+ */
+
+
+#include "dc_bios_types.h"
+#include "hw_shared.h"
+#include "dcn31_apg.h"
+#include "reg_helper.h"
+
+#define DC_LOGGER \
+   apg31->base.ctx->logger
+
+#define REG(reg)\
+   (apg31->regs->reg)
+
+#undef FN
+#define FN(reg_name, field_name) \
+   apg31->apg_shift->field_name, apg31->apg_mask->field_name
+
+
+#define CTX \
+   apg31->base.ctx
+
+
+static void apg31_enable(
+   struct apg *apg)
+{
+   struct dcn31_apg *apg31 = DCN31_APG_FROM_APG(apg);
+
+   /* Reset APG */
+   REG_UPDATE(APG_CONTROL, APG_RESET, 1);
+   REG_WAIT(APG_CONTROL,
+   APG_RESET_DONE, 1,
+   1, 10);
+   REG_UPDATE(APG_CONTROL, APG_RESET, 0);
+   REG_WAIT(APG_CONTROL,
+   APG_RESET_DONE, 0,
+   1, 10);
+
+   /* Enable APG */
+   REG_UPDATE(APG_CONTROL2, APG_ENABLE, 1);
+}
+
+static void apg31_disable(
+   struct apg *apg)
+{
+   struct dcn31_apg *apg31 = DCN31_APG_FROM_APG(apg);
+
+   /* Disable APG */
+   REG_UPDATE(APG_CONTROL2, APG_ENABLE, 0);
+}
+
+static union audio_cea_channels speakers_to_channels(
+   struct audio_speaker_flags speaker_flags)
+{
+   union audio_cea_channels cea_channels = {0};
+
+   /* these are one to one */
+   cea_channels.channels.FL = speaker_flags.FL_FR;
+   cea_channels.channels.FR = speaker_flags.FL_FR;
+   cea_channels.channels.LFE = speaker_flags.LFE;
+   cea_channels.channels.FC = speaker_flags.FC;
+
+   /* if Rear Left and Right exist move RC speaker to channel 7
+* otherwise to channel 5
+*/
+   if (speaker_flags.RL_RR) {
+   cea_channels.channels.RL_RC = speaker_flags.RL_RR;
+   cea_channels.channels.RR = speaker_flags.RL_RR;
+   cea_channels.channels.RC_RLC_FLC = speaker_flags.RC;
+   } else {
+   cea_channels.channels.RL_RC = speaker_flags.RC;
+   }
+
+   /* FRONT Left Right Center and REAR Left Right Center are exclusive */
+   if (speaker_flags.FLC_FRC) {
+   cea_channels.channels.RC_RLC_FLC = speaker_flags.FLC_FRC;
+   

[PATCH v3 0/6] Add DP 2.0 SST Support

2021-08-19 Thread Fangzhi Zuo
This patch series adds SST UHBR10 support.

Fangzhi Zuo (6):
  drm/amd/display: Add DP 2.0 Audio Package Generator
  drm/amd/display: Add DP 2.0 HPO Stream Encoder
  drm/amd/display: Add DP 2.0 HPO Link Encoder
  drm/amd/display: Add DP 2.0 DCCG
  drm/amd/display: Add DP 2.0 BIOS and DMUB Support
  drm/amd/display: Add DP 2.0 SST DC Support

 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |8 +
 .../drm/amd/display/dc/bios/bios_parser2.c|   10 +
 .../drm/amd/display/dc/bios/command_table2.c  |   10 +
 drivers/gpu/drm/amd/display/dc/core/dc.c  |   21 +
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  520 +++-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 1168 -
 .../drm/amd/display/dc/core/dc_link_hwss.c|  314 -
 .../gpu/drm/amd/display/dc/core/dc_resource.c |  118 ++
 drivers/gpu/drm/amd/display/dc/dc.h   |   27 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |  222 
 drivers/gpu/drm/amd/display/dc/dc_link.h  |7 +
 drivers/gpu/drm/amd/display/dc/dc_types.h |   21 +
 .../display/dc/dce110/dce110_hw_sequencer.c   |  104 +-
 .../amd/display/dc/dcn10/dcn10_hw_sequencer.c |   65 +
 .../amd/display/dc/dcn10/dcn10_link_encoder.c |9 +
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|   26 +-
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |4 +
 .../display/dc/dcn30/dcn30_dio_link_encoder.c |4 +
 drivers/gpu/drm/amd/display/dc/dcn31/Makefile |3 +-
 .../gpu/drm/amd/display/dc/dcn31/dcn31_apg.c  |  173 +++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_apg.h  |  115 ++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c |  162 +++
 .../gpu/drm/amd/display/dc/dcn31/dcn31_dccg.h |   18 +
 .../display/dc/dcn31/dcn31_dio_link_encoder.c |4 +
 .../dc/dcn31/dcn31_hpo_dp_link_encoder.c  |  620 +
 .../dc/dcn31/dcn31_hpo_dp_link_encoder.h  |  222 
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.c|  749 +++
 .../dc/dcn31/dcn31_hpo_dp_stream_encoder.h|  241 
 .../drm/amd/display/dc/dcn31/dcn31_resource.c |  181 +++
 drivers/gpu/drm/amd/display/dc/dm_cp_psp.h|1 +
 drivers/gpu/drm/amd/display/dc/dm_helpers.h   |2 +
 .../gpu/drm/amd/display/dc/inc/core_types.h   |   12 +
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |   22 +
 drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h  |   23 +
 .../gpu/drm/amd/display/dc/inc/hw/hw_shared.h |4 +
 .../drm/amd/display/dc/inc/hw/link_encoder.h  |   93 ++
 .../amd/display/dc/inc/hw/stream_encoder.h|   81 ++
 .../amd/display/dc/inc/hw/timing_generator.h  |3 +
 .../amd/display/dc/inc/hw_sequencer_private.h |3 +
 drivers/gpu/drm/amd/display/dc/inc/resource.h |   18 +
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   |4 +
 .../amd/display/include/bios_parser_types.h   |   10 +
 .../gpu/drm/amd/display/include/dpcd_defs.h   |   16 +
 .../amd/display/include/grph_object_defs.h|   12 +
 .../drm/amd/display/include/grph_object_id.h  |8 +
 .../amd/display/include/link_service_types.h  |   41 +-
 .../drm/amd/display/include/logger_types.h|6 +
 drivers/gpu/drm/amd/include/atomfirmware.h|6 +
 48 files changed, 5472 insertions(+), 39 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_apg.h
 create mode 100644 
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
 create mode 100644 
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.h
 create mode 100644 
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c
 create mode 100644 
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_stream_encoder.h

-- 
2.25.1



Re: [PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-08-19 Thread Tom Lendacky
On 8/19/21 4:55 AM, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 11:59:23AM -0500, Tom Lendacky wrote:
>> +static inline bool prot_guest_has(unsigned int attr)
> 
> No real need to have this inline.  In fact I'd suggest we have the
> prototype in a common header so that everyone must implement it out
> of line.

I'll do the same thing I end up doing for x86.

Thanks,
Tom

> 


Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-19 Thread Tom Lendacky
On 8/19/21 4:52 AM, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 11:59:22AM -0500, Tom Lendacky wrote:
>> While the name suggests this is intended mainly for guests, it will
>> also be used for host memory encryption checks in place of sme_active().
> 
> Which suggests that the name is not good to start with.  Maybe protected
> hardware, system or platform might be a better choice?
> 
>> +static inline bool prot_guest_has(unsigned int attr)
>> +{
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +if (sme_me_mask)
>> +return amd_prot_guest_has(attr);
>> +#endif
>> +
>> +return false;
>> +}
> 
> Shouldn't this be entirely out of line?

I did it as inline originally because the presence of the function will be
decided based on the ARCH_HAS_PROTECTED_GUEST config. For now, that is
only selected by the AMD memory encryption support, so if I went out of
line I could put it in mem_encrypt.c. But with TDX wanting to also use it, it
would have to be in an always built file with some #ifdefs or in its own
file that is conditionally built based on the ARCH_HAS_PROTECTED_GUEST
setting (they've already tried building with ARCH_HAS_PROTECTED_GUEST=y
and AMD_MEM_ENCRYPT not set).

To take it out of line, I'm leaning towards the latter, creating a new
file that is built based on the ARCH_HAS_PROTECTED_GUEST setting.

> 
>> +/* 0x800 - 0x8ff reserved for AMD */
>> +#define PATTR_SME   0x800
>> +#define PATTR_SEV   0x801
>> +#define PATTR_SEV_ES0x802
> 
> Why do we need reservations for a purely in-kernel namespace?
> 
> And why are you overloading a brand new generic API with weird details
> of a specific implementation like this?

There was some talk about this on the mailing list where TDX and SEV may
need to be differentiated, so we wanted to reserve a range of values per
technology. I guess I can remove them until they are actually needed.

Thanks,
Tom

> 


Re: [PATCH] drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)

2021-08-19 Thread Alex Deucher
Please go ahead.  Thanks!

Alex

On Thu, Aug 19, 2021 at 8:05 AM Mike Lothian  wrote:
>
> Hi
>
> Do I need to open a new bug report for this?
>
> Cheers
>
> Mike
>
> On Wed, 18 Aug 2021 at 06:26, Andrey Grodzovsky  
> wrote:
>>
>>
>> On 2021-08-02 1:16 a.m., Guchun Chen wrote:
>> > In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop
>> > scheduler in s3 test, otherwise, fence related failure will arrive
>> > after resume. To fix this and for a better clean up, move drm_sched_fini
>> > from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and
>> > should never be called in hw_fini.
>> >
>> > v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init,
>> > to keep sw_init and sw_fini paired.
>> >
>> > Fixes: cd87a6dcf6af ("drm/amdgpu: adjust fence driver enable sequence")
>> > Suggested-by: Christian König 
>> > Signed-off-by: Guchun Chen 
>> > ---
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 ++---
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 12 +++-
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  4 ++--
>> >   3 files changed, 11 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> > index b1d2dc39e8be..9e53ff851496 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> > @@ -3646,9 +3646,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>> >
>> >   fence_driver_init:
>> >   /* Fence driver */
>> > - r = amdgpu_fence_driver_init(adev);
>> > + r = amdgpu_fence_driver_sw_init(adev);
>> >   if (r) {
>> > - dev_err(adev->dev, "amdgpu_fence_driver_init failed\n");
>> > + dev_err(adev->dev, "amdgpu_fence_driver_sw_init failed\n");
>> >   amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_FENCE_INIT_FAIL, 
>> > 0, 0);
>> >   goto failed;
>> >   }
>> > @@ -3988,7 +3988,6 @@ int amdgpu_device_resume(struct drm_device *dev, 
>> > bool fbcon)
>> >   }
>> >   amdgpu_fence_driver_hw_init(adev);
>> >
>> > -
>> >   r = amdgpu_device_ip_late_init(adev);
>> >   if (r)
>> >   return r;
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
>> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> > index 49c5c7331c53..7495911516c2 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
>> > @@ -498,7 +498,7 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring 
>> > *ring,
>> >   }
>> >
>> >   /**
>> > - * amdgpu_fence_driver_init - init the fence driver
>> > + * amdgpu_fence_driver_sw_init - init the fence driver
>> >* for all possible rings.
>> >*
>> >* @adev: amdgpu device pointer
>> > @@ -509,13 +509,13 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring 
>> > *ring,
>> >* amdgpu_fence_driver_start_ring().
>> >* Returns 0 for success.
>> >*/
>> > -int amdgpu_fence_driver_init(struct amdgpu_device *adev)
>> > +int amdgpu_fence_driver_sw_init(struct amdgpu_device *adev)
>> >   {
>> >   return 0;
>> >   }
>> >
>> >   /**
>> > - * amdgpu_fence_driver_fini - tear down the fence driver
>> > + * amdgpu_fence_driver_hw_fini - tear down the fence driver
>> >* for all possible rings.
>> >*
>> >* @adev: amdgpu device pointer
>> > @@ -531,8 +531,7 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device 
>> > *adev)
>> >
>> >   if (!ring || !ring->fence_drv.initialized)
>> >   continue;
>> > - if (!ring->no_scheduler)
>> > - drm_sched_fini(&ring->sched);
>> > +
>> >   /* You can't wait for HW to signal if it's gone */
>> >   if (!drm_dev_is_unplugged(&adev->ddev))
>> >   r = amdgpu_fence_wait_empty(ring);
>>
>>
>> Sorry for the late notice, I missed this patch. By moving drm_sched_fini
>> past amdgpu_fence_wait_empty, a race is created: even after you have waited
>> for all fences on the ring to signal, the sw scheduler will keep submitting
>> new jobs on the ring, so the ring won't stay empty.
>>
>> For hot device removal we also want to prevent any access to HW past PCI
>> removal, in order not to do any MMIO accesses inside the physical MMIO
>> range that no longer belongs to this device after its removal by the PCI
>> core. Stopping all the schedulers prevents any MMIO accesses done during
>> job submission, and that's why drm_sched_fini was done as part of
>> amdgpu_fence_driver_hw_fini and not amdgpu_fence_driver_sw_fini.
>>
>> Andrey
>>
>> > @@ -560,6 +559,9 @@ void amdgpu_fence_driver_sw_fini(struct amdgpu_device 
>> > *adev)
>> >   if (!ring || !ring->fence_drv.initialized)
>> >   continue;
>> >
>> > + if (!ring->no_scheduler)
>> > + drm_sched_fini(&ring->sched);
>> > +
>> >   for (j = 0; j <= ring->fence_drv.num_fences_mask; +

Re: [PATCH v6 02/13] mm: remove extra ZONE_DEVICE struct page refcount

2021-08-19 Thread Sierra Guiza, Alejandro (Alex)



On 8/18/2021 2:28 PM, Ralph Campbell wrote:

On 8/17/21 5:35 PM, Felix Kuehling wrote:

Am 2021-08-17 um 8:01 p.m. schrieb Ralph Campbell:

On 8/12/21 11:31 PM, Alex Sierra wrote:

From: Ralph Campbell 

ZONE_DEVICE struct pages have an extra reference count that complicates
the code for put_page() and several places in the kernel that need to
check the reference count to see that a page is not being used (gup,
compaction, migration, etc.). Clean up the code so the reference count
doesn't need to be treated specially for ZONE_DEVICE.

v2:
AS: merged this patch in linux 5.11 version

v5:
AS: add a condition at try_grab_page to check for the zone device type
while the page ref counter is checked to be less than or equal to zero.
In the device zone case, page ref counters are initialized to zero.

Signed-off-by: Ralph Campbell 
Signed-off-by: Alex Sierra 
---
   arch/powerpc/kvm/book3s_hv_uvmem.c |  2 +-
   drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
   fs/dax.c   |  4 +-
   include/linux/dax.h    |  2 +-
   include/linux/memremap.h   |  7 +--
   include/linux/mm.h | 13 +
   lib/test_hmm.c |  2 +-
   mm/internal.h  |  8 +++
   mm/memremap.c  | 68 
+++---

   mm/migrate.c   |  5 --
   mm/page_alloc.c    |  3 ++
   mm/swap.c  | 45 ++---
   12 files changed, 46 insertions(+), 115 deletions(-)


I haven't seen a response to the issues I raised back at v3 of this
series.
https://lore.kernel.org/linux-mm/4f6dd918-d79b-1aa7-3a4c-caa67ddc29bc@nvidia.com/




Did I miss something?

I think part of the response was that we did more testing. Alex added
support for DEVICE_GENERIC pages to test_hmm and he ran DAX tests
recommended by Theodore Tso. In that testing he ran into a WARN_ON_ONCE
about a zero page refcount in try_get_page. The fix is in the latest
version of patch 2. But it's already obsolete because John Hubbard is
about to remove that function altogether.

I think the issues you raised were more uncertainty than known bugs. It
seems the fact that you can have DAX pages with 0 refcount is a feature
more than a bug.

Regards,
   Felix


Did you test on a system without CONFIG_ARCH_HAS_PTE_SPECIAL defined?
In that case, mmap() of a DAX device will call insert_page() which calls
get_page() which would trigger VM_BUG_ON_PAGE().

I can believe it is OK for PTE_SPECIAL page table entries to have no
struct page or that MEMORY_DEVICE_GENERIC struct pages be mapped with
a zero reference count using insert_pfn().

Hi Ralph,
We have tried the DAX tests with and without CONFIG_ARCH_HAS_PTE_SPECIAL
defined. Apparently none of the tests touches that condition for a DAX
device. Of course, that doesn't mean it couldn't happen.

Regards,
Alex S.




I find it hard to believe that other MM developers don't see an issue
with a struct page with refcount == 0 and mapcount == 1.

I don't see where init_page_count() is being called for the
MEMORY_DEVICE_GENERIC or MEMORY_DEVICE_PRIVATE struct pages the AMD
driver allocates and passes to migrate_vma_setup().
Looks like svm_migrate_get_vram_page() needs to call init_page_count()
instead of get_page(). (I'm looking at branch 
origin/alexsierrag/device_generic

https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver.git)
Yes, you're right. My bad. Thanks for catching this. I didn't realize I
was missing CONFIG_DEBUG_VM in my build, therefore this BUG was never
caught. It worked after I replaced get_page with init_page_count at
svm_migrate_get_vram_page. However, I don't think this is the best way
to fix it. Ideally, the get_page call should work for device pages with
a ref count equal to 0 too. Otherwise, we could overwrite the refcounter
if someone else is grabbing the page concurrently.
I was thinking to add a special condition in get_page for device pages.
This could also fix the insert_page -> get_page call from a DAX device.

Regards,
Alex S.



Also, what about the other places where is_device_private_page() is called?
Don't they need to be updated to call is_device_page() instead?
One of my goals for this patch was 

Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-19 Thread Borislav Petkov
On Thu, Aug 19, 2021 at 10:52:53AM +0100, Christoph Hellwig wrote:
> Which suggests that the name is not good to start with.  Maybe protected
> hardware, system or platform might be a better choice?

Yah, coming up with a proper name here hasn't been easy.
prot_guest_has() is not the first variant.

From all three things you suggest above, I guess calling it a "platform"
is the closest. As in, this is a confidential computing platform which
provides host and guest facilities etc.

So calling it

confidential_computing_platform_has()

is obviously too long.

ccp_has() clashes with the namespace of drivers/crypto/ccp/ which is
used by the technology too.

coco_platform_has() is too unserious.

So I guess

cc_platform_has()

ain't all that bad.

Unless you have a better idea, ofc.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v2 02/12] mm: Introduce a function to check for virtualization protection features

2021-08-19 Thread Tom Lendacky
On 8/19/21 4:46 AM, Christoph Hellwig wrote:
> On Fri, Aug 13, 2021 at 11:59:21AM -0500, Tom Lendacky wrote:
>> +#define PATTR_MEM_ENCRYPT   0   /* Encrypted memory */
>> +#define PATTR_HOST_MEM_ENCRYPT  1   /* Host encrypted 
>> memory */
>> +#define PATTR_GUEST_MEM_ENCRYPT 2   /* Guest encrypted 
>> memory */
>> +#define PATTR_GUEST_PROT_STATE  3   /* Guest encrypted 
>> state */
> 
> Please write an actual detailed explanation of what these mean, that
> is what implications it has on the kernel.

Will do.

Thanks,
Tom

> 


Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-19 Thread Andrey Grodzovsky



On 2021-08-19 5:30 a.m., Daniel Vetter wrote:

On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:

On 2021-08-18 10:42 a.m., Daniel Vetter wrote:

On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:

On 2021-08-18 10:32 a.m., Daniel Vetter wrote:

On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:

On 2021-08-18 10:02 a.m., Alex Deucher wrote:


+ dri-devel

Since scheduler is a shared component, please add dri-devel on all
scheduler patches.

On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen  wrote:

[Why]
for bailing job, this commit will delete it from pending list thus the
bailing job will never have a chance to be resubmitted even in advance
tdr mode.

[How]
after embeded hw_fence into amdgpu_job is done, the race condition that
this commit tries to work around is completely solved. So revert this
commit.
This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.
v2:
add dma_fence_get/put() around timedout_job to avoid concurrent delete
during processing timedout_job

Signed-off-by: Jingwen Chen 
---
 drivers/gpu/drm/scheduler/sched_main.c | 23 +--
 1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a953693b45..f9b9b3aefc4a 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 {
struct drm_gpu_scheduler *sched;
struct drm_sched_job *job;
+   struct dma_fence *fence;
enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;

sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
@@ -325,11 +326,10 @@ static void drm_sched_job_timedout(struct work_struct 
*work)

if (job) {
/*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
+* Get job->s_fence->parent here to avoid concurrent delete 
during
+* processing timedout_job
 */
-   list_del_init(&job->list);
+   fence = dma_fence_get(job->s_fence->parent);

While this is true for amdgpu, it has no meaning for other drivers for whom
we haven't
done the refactoring of embedding HW fence (parent) into the job structure.
In fact thinking
about it, unless you do the HW fence embedding for all the drivers using the
scheduler you cannot
revert this patch or you will just break them.

btw, why did you do that embedding? I do still have my patches with
dma_fence annotations floating around, but my idea at least was to fix
that issue with a mempool, not with embedding. What was the motivation
for embedding the hw fence?
-Daniel

The motivation was 2-fold: avoid memory allocation during job submissions
(HW fence allocation), because as Christian explained this leads to deadlocks
with
mm code during evictions due to memory pressure (Christian can clarify if I
messed

Yeah that's the exact same thing I've chased with my dma_fence
annotations, but thus far zero to none interested in getting it sorted. I
think it'd be good to have some cross-driver agreement on how this should
be solved before someone just charges ahead ...


this explanation). Second is to exactly revert this patch because while it
solved the issue
described in the patch it created another with drivers who bailed out early
during TDR handling
for various reasons, and the job would just leak because it was already
removed from the pending list.
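For readers following the thread, the embedding Andrey describes can be sketched in plain userspace C: the fence lives inside the job, so submission needs no extra allocation, and a timeout handler that takes a reference on the fence keeps the whole job alive even while other paths drop theirs. All names here (toy_fence, toy_job, job_submit) are illustrative stand-ins, not the actual dma_fence/amdgpu_job structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for dma_fence / amdgpu_job; names are hypothetical. */
struct toy_fence {
	int refcount;
};

struct toy_job {
	struct toy_fence hw_fence;	/* embedded: no allocation at submit time */
	int id;
};

static struct toy_fence *fence_get(struct toy_fence *f)
{
	f->refcount++;
	return f;
}

/* Dropping the last reference frees the containing job. */
static void fence_put(struct toy_fence *f)
{
	struct toy_job *job = (struct toy_job *)f;	/* fence is first member */

	if (--f->refcount == 0)
		free(job);
}

/* Submission path: one allocation covers the job *and* its fence. */
static struct toy_job *job_submit(int id)
{
	struct toy_job *job = malloc(sizeof(*job));

	job->hw_fence.refcount = 1;	/* reference owned by the scheduler */
	job->id = id;
	return job;
}
```

A timeout handler would then do fence_get(&job->hw_fence) before touching the job and fence_put() afterwards, which is the same idea as the dma_fence_get/put pair the v2 patch adds around timedout_job.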

Can't we reinsert it before we restart the scheduler thread? It might need
a separate list for that due to the lockless queue tricks. Or am I
thinking about the wrong kind of "we lost the job"?
-Daniel


If you look at the original patch it would reinsert it even earlier - right
after stopping the SW scheduler thread, and even then it was too late for
some drivers as they would decide to return back from their TDR handler even
before that. It is solvable but in an ugly way as far as I see: you need to
require each driver in its code to put the job back in the list if they do
it before reaching the place where the scheduler framework does it. Kind of
spaghetti code seems to me.

Hm yeah I didn't realize this all happens before we stop the scheduler
thread.

Why can't we stop the scheduler thread first, so that there's guaranteed
no race? I've recently had a lot of discussions with panfrost folks about
their reset that spawns across engines, and without stopping the scheduler
thread first before you touch anything it's just plain impossible.



Talked with Christian on that, for each TDR we actually stop all the
schedulers for all the rings and not only the hung ring, since
ASIC reset will impact all the rings anyway. So we cannot allo

[PATCH 2/2] drm/amdkfd: map SVM range with correct access permission

2021-08-19 Thread Philip Yang
Restore retry fault or prefetch range, or restore svm range after
eviction, to map the range to the GPU with correct read or write access
permission.

A range may include multiple VMAs; update the GPU page table with the
offset into prange and the number of pages for each VMA, according to
each VMA's access permission.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 131 +--
 1 file changed, 84 insertions(+), 47 deletions(-)
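Before the diff, the per-VMA split the commit message describes can be sketched in userspace C: walk the VMAs covering the range and, for each overlap, compute the offset into the range, the page count, and whether the chunk is read-only. This is a minimal model with hypothetical names (toy_vma, split_range), not the actual kfd_svm.c code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_VM_WRITE 0x2	/* stand-in for VM_WRITE */

/* Minimal VMA model: [start, end) in page units, plus access flags. */
struct toy_vma {
	unsigned long start, end, flags;
};

struct toy_chunk {
	unsigned long offset;	/* offset into the range, in pages */
	unsigned long npages;
	bool readonly;
};

/* Split range [start, start+npages) across VMAs; returns chunk count. */
static int split_range(unsigned long start, unsigned long npages,
		       const struct toy_vma *vmas, int nvmas,
		       struct toy_chunk *out)
{
	unsigned long addr = start, end = start + npages;
	int n = 0;

	for (int i = 0; i < nvmas && addr < end; i++) {
		const struct toy_vma *v = &vmas[i];
		unsigned long next;

		if (addr < v->start || addr >= v->end)
			continue;
		next = v->end < end ? v->end : end;
		out[n].offset = addr - start;
		out[n].npages = next - addr;
		out[n].readonly = !(v->flags & TOY_VM_WRITE);
		n++;
		addr = next;	/* continue in the following VMA */
	}
	return n;
}
```

Each (offset, npages, readonly) triple would then drive one map call, mirroring how the patch threads offset/npages/readonly into svm_range_map_to_gpu().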

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index cf1009bb532a..94612581963f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -120,6 +120,7 @@ static void svm_range_remove_notifier(struct svm_range 
*prange)
 
 static int
 svm_range_dma_map_dev(struct amdgpu_device *adev, struct svm_range *prange,
+ unsigned long offset, unsigned long npages,
  unsigned long *hmm_pfns, uint32_t gpuidx)
 {
enum dma_data_direction dir = DMA_BIDIRECTIONAL;
@@ -136,7 +137,8 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
svm_range *prange,
prange->dma_addr[gpuidx] = addr;
}
 
-   for (i = 0; i < prange->npages; i++) {
+   addr += offset;
+   for (i = 0; i < npages; i++) {
if (WARN_ONCE(addr[i] && !dma_mapping_error(dev, addr[i]),
  "leaking dma mapping\n"))
dma_unmap_page(dev, addr[i], PAGE_SIZE, dir);
@@ -167,6 +169,7 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
svm_range *prange,
 
 static int
 svm_range_dma_map(struct svm_range *prange, unsigned long *bitmap,
+ unsigned long offset, unsigned long npages,
  unsigned long *hmm_pfns)
 {
struct kfd_process *p;
@@ -187,7 +190,8 @@ svm_range_dma_map(struct svm_range *prange, unsigned long 
*bitmap,
}
adev = (struct amdgpu_device *)pdd->dev->kgd;
 
-   r = svm_range_dma_map_dev(adev, prange, hmm_pfns, gpuidx);
+   r = svm_range_dma_map_dev(adev, prange, offset, npages,
+ hmm_pfns, gpuidx);
if (r)
break;
}
@@ -1088,11 +1092,6 @@ svm_range_get_pte_flags(struct amdgpu_device *adev, 
struct svm_range *prange,
pte_flags |= snoop ? AMDGPU_PTE_SNOOPED : 0;
 
pte_flags |= amdgpu_gem_va_map_flags(adev, mapping_flags);
-
-   pr_debug("svms 0x%p [0x%lx 0x%lx] vram %d PTE 0x%llx mapping 0x%x\n",
-prange->svms, prange->start, prange->last,
-(domain == SVM_RANGE_VRAM_DOMAIN) ? 1:0, pte_flags, 
mapping_flags);
-
return pte_flags;
 }
 
@@ -1156,7 +1155,8 @@ svm_range_unmap_from_gpus(struct svm_range *prange, 
unsigned long start,
 
 static int
 svm_range_map_to_gpu(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-struct svm_range *prange, dma_addr_t *dma_addr,
+struct svm_range *prange, unsigned long offset,
+unsigned long npages, bool readonly, dma_addr_t *dma_addr,
 struct amdgpu_device *bo_adev, struct dma_fence **fence)
 {
struct amdgpu_bo_va bo_va;
@@ -1167,14 +1167,15 @@ svm_range_map_to_gpu(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
int r = 0;
int64_t i;
 
-   pr_debug("svms 0x%p [0x%lx 0x%lx]\n", prange->svms, prange->start,
-prange->last);
+   last_start = prange->start + offset;
+
+   pr_debug("svms 0x%p [0x%lx 0x%lx] readonly %d\n", prange->svms,
+last_start, last_start + npages - 1, readonly);
 
if (prange->svm_bo && prange->ttm_res)
bo_va.is_xgmi = amdgpu_xgmi_same_hive(adev, bo_adev);
 
-   last_start = prange->start;
-   for (i = 0; i < prange->npages; i++) {
+   for (i = offset; i < offset + npages; i++) {
last_domain = dma_addr[i] & SVM_RANGE_VRAM_DOMAIN;
dma_addr[i] &= ~SVM_RANGE_VRAM_DOMAIN;
if ((prange->start + i) < prange->last &&
@@ -1183,13 +1184,21 @@ svm_range_map_to_gpu(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
 
pr_debug("Mapping range [0x%lx 0x%llx] on domain: %s\n",
 last_start, prange->start + i, last_domain ? "GPU" : 
"CPU");
+
pte_flags = svm_range_get_pte_flags(adev, prange, last_domain);
-   r = amdgpu_vm_bo_update_mapping(adev, bo_adev, vm, false, 
false, NULL,
-   last_start,
+   if (readonly)
+   pte_flags &= ~AMDGPU_PTE_WRITEABLE;
+
+   pr_debug("svms 0x%p map [0x%lx 0x%llx] vram %d PTE 0x%llx\n",
+prange->svms, last_start, prange->start + i,
+(last_domain == SVM_RANGE_VRAM_DOMAIN) ? 1 : 0,
+pte_flags);
+
+   r = amdgpu_vm_bo_update_mappi

[PATCH 1/2] drm/amdkfd: check access permisson to restore retry fault

2021-08-19 Thread Philip Yang
Check the range access permission before restoring a GPU retry fault: if
the retry fault is on an address that belongs to a VMA, and the VMA does
not grant the read or write access requested by the GPU, fail to restore
the address. The vm fault event will then be passed back to user space.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  5 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c   | 30 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h   |  5 +++--
 6 files changed, 40 insertions(+), 8 deletions(-)
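The decision the commit message describes reduces to a small predicate: a write retry fault is only restorable on a VMA with write permission, a read fault only on one with read permission. A minimal userspace sketch of that check (the real svm_range_allow_access() takes mm/vma structures and is part of the diff below):

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_VM_READ  0x1	/* stand-ins for VM_READ / VM_WRITE */
#define TOY_VM_WRITE 0x2

/*
 * Decide whether a retry fault may be restored: the VMA must grant
 * the access the GPU asked for (read for a read fault, write for a
 * write fault).
 */
static bool allow_restore(bool write_fault, unsigned long vma_flags)
{
	if (write_fault)
		return vma_flags & TOY_VM_WRITE;
	return vma_flags & TOY_VM_READ;
}
```

When the predicate is false, restore fails and the fault is reported to user space instead of being silently retried.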

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 831f00644460..ff6de40b860c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3347,12 +3347,13 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
  * @adev: amdgpu device pointer
  * @pasid: PASID of the VM
  * @addr: Address of the fault
+ * @rw_fault: 0 is read fault, 1 is write fault
  *
  * Try to gracefully handle a VM fault. Return true if the fault was handled 
and
  * shouldn't be reported any more.
  */
 bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
-   uint64_t addr)
+   uint64_t addr, uint32_t rw_fault)
 {
bool is_compute_context = false;
struct amdgpu_bo *root;
@@ -3377,7 +3378,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, 
u32 pasid,
addr /= AMDGPU_GPU_PAGE_SIZE;
 
if (is_compute_context &&
-   !svm_range_restore_pages(adev, pasid, addr)) {
+   !svm_range_restore_pages(adev, pasid, addr, rw_fault)) {
amdgpu_bo_unref(&root);
return true;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 80cc9ab2c1d0..1cc574ece180 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -448,7 +448,7 @@ void amdgpu_vm_check_compute_bug(struct amdgpu_device 
*adev);
 void amdgpu_vm_get_task_info(struct amdgpu_device *adev, u32 pasid,
 struct amdgpu_task_info *task_info);
 bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
-   uint64_t addr);
+   uint64_t addr, uint32_t rw_fault);
 
 void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 24b781e90bef..994983901006 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -93,6 +93,7 @@ static int gmc_v10_0_process_interrupt(struct amdgpu_device 
*adev,
   struct amdgpu_iv_entry *entry)
 {
bool retry_fault = !!(entry->src_data[1] & 0x80);
+   bool rw_fault = !!(entry->src_data[1] & 0x20);
struct amdgpu_vmhub *hub = &adev->vmhub[entry->vmid_src];
struct amdgpu_task_info task_info;
uint32_t status = 0;
@@ -121,7 +122,7 @@ static int gmc_v10_0_process_interrupt(struct amdgpu_device 
*adev,
/* Try to handle the recoverable page faults by filling page
 * tables
 */
-   if (amdgpu_vm_handle_fault(adev, entry->pasid, addr))
+   if (amdgpu_vm_handle_fault(adev, entry->pasid, addr, rw_fault))
return 1;
}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 097230b5e946..9a37fd0527a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -506,6 +506,7 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device 
*adev,
  struct amdgpu_iv_entry *entry)
 {
bool retry_fault = !!(entry->src_data[1] & 0x80);
+   bool rw_fault = !!(entry->src_data[1] & 0x20);
uint32_t status = 0, cid = 0, rw = 0;
struct amdgpu_task_info task_info;
struct amdgpu_vmhub *hub;
@@ -536,7 +537,7 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device 
*adev,
/* Try to handle the recoverable page faults by filling page
 * tables
 */
-   if (amdgpu_vm_handle_fault(adev, entry->pasid, addr))
+   if (amdgpu_vm_handle_fault(adev, entry->pasid, addr, rw_fault))
return 1;
}
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index d4a43c94bcf9..cf1009bb532a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2400,9 +2400,29 @@ svm_range_count_fault(struct amdgpu_device *adev, struct 
kfd_process *p,
WRITE_ONCE(pdd->faults, pdd->faults + 1);
 }
 
+static bool
+svm_range_allow_access(struct m

RE: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend

2021-08-19 Thread Lazar, Lijo
[AMD Official Use Only]

If that is done  -

+   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_UVD,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, 
AMD_IP_BLOCK_TYPE_UVD,
+  AMD_CG_STATE_GATE);

Usual order is CG followed by PG. It comes in the else part, so it is less
likely to happen. Nice to fix for code correctness purposes.

Thanks,
Lijo

From: Zhu, James 
Sent: Thursday, August 19, 2021 7:49 PM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Chen, Guchun 
; Lazar, Lijo ; Pan, Xinhui 

Subject: Re: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on 
suspend


[AMD Official Use Only]


Why not move changes into hw_fini?


Best Regards!



James Zhu


From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 on behalf of Evan Quan mailto:evan.q...@amd.com>>
Sent: Wednesday, August 18, 2021 11:08 PM
To: amd-gfx@lists.freedesktop.org 
mailto:amd-gfx@lists.freedesktop.org>>
Cc: Deucher, Alexander 
mailto:alexander.deuc...@amd.com>>; Chen, Guchun 
mailto:guchun.c...@amd.com>>; Lazar, Lijo 
mailto:lijo.la...@amd.com>>; Quan, Evan 
mailto:evan.q...@amd.com>>; Pan, Xinhui 
mailto:xinhui@amd.com>>
Subject: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on 
suspend

Perform proper cleanups on UVD/VCE suspend: powergate enablement,
clockgating enablement and dpm disablement. This can fix some hangs
observed on suspend when UVD/VCE is still in use (e.g. issuing
"pm-suspend" while a video is still playing).

Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
Signed-off-by: Evan Quan mailto:evan.q...@amd.com>>
Signed-off-by: xinhui pan mailto:xinhui@amd.com>>
---
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 24 
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 23 +++
 2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 4eebf973a065..d0fc6ec18c29 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -554,6 +554,30 @@ static int uvd_v6_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->uvd.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_uvd(adev, false);
+   } else {
+   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
+   /* shutdown the UVD block */
+   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_UVD,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, 
AMD_IP_BLOCK_TYPE_UVD,
+  AMD_CG_STATE_GATE);
+   }
+
 r = uvd_v6_0_hw_fini(adev);
 if (r)
 return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6d9108fa22e0..a594ade5d30a 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -503,6 +503,29 @@ static int vce_v3_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->vce.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_vce(adev, false);
+   } else {
+   amdgpu_asic_set_vce_clocks(adev, 0, 0);
+   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
+  AMD_CG_STATE_GATE);
+   }
+
 r = vce_v3_0_hw_fini(adev);
 if (r)
 return r;
--
2.29.0


Re: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend

2021-08-19 Thread Zhu, James
[AMD Official Use Only]


Why not move changes into hw_fini?


Best Regards!


James Zhu


From: amd-gfx  on behalf of Evan Quan 

Sent: Wednesday, August 18, 2021 11:08 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander ; Chen, Guchun 
; Lazar, Lijo ; Quan, Evan 
; Pan, Xinhui 
Subject: [PATCH] drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on 
suspend

Perform proper cleanups on UVD/VCE suspend: powergate enablement,
clockgating enablement and dpm disablement. This can fix some hangs
observed on suspend when UVD/VCE is still in use (e.g. issuing
"pm-suspend" while a video is still playing).

Change-Id: I36f39d9731e0a9638b52d5d92558b0ee9c23a9ed
Signed-off-by: Evan Quan 
Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 24 
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 23 +++
 2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index 4eebf973a065..d0fc6ec18c29 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -554,6 +554,30 @@ static int uvd_v6_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->uvd.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_uvd(adev, false);
+   } else {
+   amdgpu_asic_set_uvd_clocks(adev, 0, 0);
+   /* shutdown the UVD block */
+   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_UVD,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, 
AMD_IP_BLOCK_TYPE_UVD,
+  AMD_CG_STATE_GATE);
+   }
+
 r = uvd_v6_0_hw_fini(adev);
 if (r)
 return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6d9108fa22e0..a594ade5d30a 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -503,6 +503,29 @@ static int vce_v3_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /*
+* Proper cleanups before halting the HW engine:
+*   - cancel the delayed idle work
+*   - enable powergating
+*   - enable clockgating
+*   - disable dpm
+*
+* TODO: to align with the VCN implementation, move the
+* jobs for clockgating/powergating/dpm setting to
+* ->set_powergating_state().
+*/
+   cancel_delayed_work_sync(&adev->vce.idle_work);
+
+   if (adev->pm.dpm_enabled) {
+   amdgpu_dpm_enable_vce(adev, false);
+   } else {
+   amdgpu_asic_set_vce_clocks(adev, 0, 0);
+   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
+  AMD_PG_STATE_GATE);
+   amdgpu_device_ip_set_clockgating_state(adev, 
AMD_IP_BLOCK_TYPE_VCE,
+  AMD_CG_STATE_GATE);
+   }
+
 r = vce_v3_0_hw_fini(adev);
 if (r)
 return r;
--
2.29.0



[PATCH 18/18] drm/amdkfd: CRIU export kfd bos as prime dmabuf objects

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

KFD buffer objects do not have a GEM handle associated with them, so they
cannot directly be used with libdrm to initiate a system DMA (sDMA)
operation to speed up checkpoint and restore. Export them as dmabuf
objects instead and use the libdrm helper (amdgpu_bo_import) to further
process the sDMA command submissions.

With sDMA, we see a huge improvement in checkpoint and restore operations
compared to the generic PCI-based access via the host data path.

Suggested-by: Felix Kuehling 
Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 57 
 1 file changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 90e4d4ce4398..ead4cb37377b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "kfd_priv.h"
 #include "kfd_device_queue_manager.h"
@@ -43,6 +44,7 @@
 #include "amdgpu_amdkfd.h"
 #include "kfd_smi_events.h"
 #include "amdgpu_object.h"
+#include "amdgpu_dma_buf.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -1900,6 +1902,33 @@ uint64_t get_process_num_bos(struct kfd_process *p)
return num_of_bos;
 }
 
+static int criu_get_prime_handle(struct drm_gem_object *gobj, int flags,
+ u32 *shared_fd)
+{
+   struct dma_buf *dmabuf;
+   int ret;
+
+   dmabuf = amdgpu_gem_prime_export(gobj, flags);
+   if (IS_ERR(dmabuf)) {
+   ret = PTR_ERR(dmabuf);
+   pr_err("dmabuf export failed for the BO\n");
+   return ret;
+   }
+
+   ret = dma_buf_fd(dmabuf, flags);
+   if (ret < 0) {
+   pr_err("dmabuf create fd failed, ret:%d\n", ret);
+   goto out_free_dmabuf;
+   }
+
+   *shared_fd = ret;
+   return 0;
+
+out_free_dmabuf:
+   dma_buf_put(dmabuf);
+   return ret;
+}
+
 static int criu_dump_bos(struct kfd_process *p, struct 
kfd_ioctl_criu_dumper_args *args)
 {
struct kfd_criu_bo_bucket *bo_buckets;
@@ -1969,6 +1998,14 @@ static int criu_dump_bos(struct kfd_process *p, struct 
kfd_ioctl_criu_dumper_arg
goto exit;
}
}
+   if (bo_bucket->alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   ret = 
criu_get_prime_handle(&dumper_bo->tbo.base,
+   bo_bucket->alloc_flags &
+   
KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ? DRM_RDWR : 0,
+   &bo_bucket->dmabuf_fd);
+   if (ret)
+   goto exit;
+   }
if (bo_bucket->alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL)
bo_bucket->offset = KFD_MMAP_TYPE_DOORBELL |
KFD_MMAP_GPU_ID(pdd->dev->id);
@@ -1998,6 +2035,11 @@ static int criu_dump_bos(struct kfd_process *p, struct 
kfd_ioctl_criu_dumper_arg
}
 
 exit:
+   while (ret && bo_index--) {
+   if (bo_buckets[bo_index].alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
+   close_fd(bo_buckets[bo_index].dmabuf_fd);
+   }
+
kvfree(bo_buckets);
return ret;
 }
@@ -2516,6 +2558,7 @@ static int criu_restore_bos(struct kfd_process *p, struct 
kfd_ioctl_criu_restore
struct kfd_criu_bo_priv_data *bo_priv;
struct kfd_dev *dev;
struct kfd_process_device *pdd;
+   struct kgd_mem *kgd_mem;
void *mem;
u64 offset;
int idr_handle;
@@ -2663,6 +2706,16 @@ static int criu_restore_bos(struct kfd_process *p, 
struct kfd_ioctl_criu_restore
}
 
pr_debug("map memory was successful for the BO\n");
+   /* create the dmabuf object and export the bo */
+   kgd_mem = (struct kgd_mem *)mem;
+   if (bo_bucket->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
+   ret = criu_get_prime_handle(&kgd_mem->bo->tbo.base,
+   DRM_RDWR,
+   &bo_bucket->dmabuf_fd);
+   if (ret)
+   goto exit;
+   }
+
} /* done */
 
/* Flush TLBs after waiting for the page table updates to complete */
@@ -2687,6 +2740,10 @@ static int criu_restore_bos(struct kfd_process *p, 
struct kfd_ioctl_criu_restore
ret = -EFAULT;
 
 exit:
+   while (ret && i--) {
+   if (bo_buckets[i].alloc_flag

[PATCH 16/18] drm/amdkfd: CRIU implement gpu_id remapping

2021-08-19 Thread David Yat Sin
When doing a restore on a different node, the gpu_id's on the restore
node may be different, but the user space application will still use the
original gpu_id's in its ioctl calls. Add code to create a gpu_id
mapping so that KFD can determine the actual gpu_id during user
ioctl's.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 400 +--
 drivers/gpu/drm/amd/amdkfd/kfd_events.c  |   5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  10 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  18 +
 4 files changed, 324 insertions(+), 109 deletions(-)
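The remapping idea boils down to resolving a user-supplied gpu_id against the process's own device list (as kfd_process_device_data_by_id() does in the diff below) instead of the global device table. A minimal userspace model, with hypothetical names (toy_pdd, pdd_by_user_id) and made-up id values used purely for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kfd_process / kfd_process_device. */
struct toy_pdd {
	unsigned int user_gpu_id;	/* id the application saw at dump time */
	unsigned int actual_gpu_id;	/* id of the device on the restore node */
};

struct toy_process {
	struct toy_pdd pdds[4];
	int n_pdds;
};

/* Resolve a user-visible gpu_id to the process device entry, or NULL. */
static struct toy_pdd *pdd_by_user_id(struct toy_process *p, unsigned int id)
{
	for (int i = 0; i < p->n_pdds; i++)
		if (p->pdds[i].user_gpu_id == id)
			return &p->pdds[i];
	return NULL;
}
```

Every ioctl then looks up the pdd first and takes the device from it, which is why the patch replaces the kfd_device_by_id() calls and returns -EINVAL when no pdd matches.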

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index c8f523d8ab81..90e4d4ce4398 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -294,13 +294,14 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
return err;
 
pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev) {
+
+   mutex_lock(&p->mutex);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
return -EINVAL;
}
-
-   mutex_lock(&p->mutex);
+   dev = pdd->dev;
 
pdd = kfd_bind_process_to_device(dev, p);
if (IS_ERR(pdd)) {
@@ -491,7 +492,6 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
struct kfd_process *p, void *data)
 {
struct kfd_ioctl_set_memory_policy_args *args = data;
-   struct kfd_dev *dev;
int err = 0;
struct kfd_process_device *pdd;
enum cache_policy default_policy, alternate_policy;
@@ -506,13 +506,15 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
return -EINVAL;
}
 
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev)
-   return -EINVAL;
-
mutex_lock(&p->mutex);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
+   err = -EINVAL;
+   goto out;
+   }
 
-   pdd = kfd_bind_process_to_device(dev, p);
+   pdd = kfd_bind_process_to_device(pdd->dev, p);
if (IS_ERR(pdd)) {
err = -ESRCH;
goto out;
@@ -525,7 +527,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
(args->alternate_policy == KFD_IOC_CACHE_POLICY_COHERENT)
   ? cache_policy_coherent : cache_policy_noncoherent;
 
-   if (!dev->dqm->ops.set_cache_memory_policy(dev->dqm,
+   if (!pdd->dev->dqm->ops.set_cache_memory_policy(pdd->dev->dqm,
&pdd->qpd,
default_policy,
alternate_policy,
@@ -543,17 +545,18 @@ static int kfd_ioctl_set_trap_handler(struct file *filep,
struct kfd_process *p, void *data)
 {
struct kfd_ioctl_set_trap_handler_args *args = data;
-   struct kfd_dev *dev;
int err = 0;
struct kfd_process_device *pdd;
 
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev)
-   return -EINVAL;
-
mutex_lock(&p->mutex);
 
-   pdd = kfd_bind_process_to_device(dev, p);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   err = -EINVAL;
+   goto out;
+   }
+
+   pdd = kfd_bind_process_to_device(pdd->dev, p);
if (IS_ERR(pdd)) {
err = -ESRCH;
goto out;
@@ -577,16 +580,20 @@ static int kfd_ioctl_dbg_register(struct file *filep,
bool create_ok;
long status = 0;
 
-   dev = kfd_device_by_id(args->gpu_id);
-   if (!dev)
-   return -EINVAL;
+   mutex_lock(&p->mutex);
+   pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+   if (!pdd) {
+   status = -EINVAL;
+   goto out_unlock_p;
+   }
+   dev = pdd->dev;
 
if (dev->device_info->asic_family == CHIP_CARRIZO) {
pr_debug("kfd_ioctl_dbg_register not supported on CZ\n");
-   return -EINVAL;
+   status = -EINVAL;
+   goto out_unlock_p;
}
 
-   mutex_lock(&p->mutex);
mutex_lock(kfd_get_dbgmgr_mutex());
 
/*
@@ -596,7 +603,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
pdd = kfd_bind_process_to_device(dev, p);
if (IS_ERR(pdd)) {
status = PTR_ERR(pdd);
-   goto out;
+   goto out_unlock_dbg;
}
 
if (!dev->dbgmgr) {
@@ -614,8 +621,9 @@ static int kfd_ioctl_dbg_register(struct file *filep,
status = -EINVAL;
}
 
-ou

[PATCH 14/18] drm/amdkfd: CRIU dump/restore queue control stack

2021-08-19 Thread David Yat Sin
Dump contents of queue control stacks on CRIU dump and restore them
during CRIU restore.

(rajneesh: rebased to 5.11 and fixed merge conflict)
Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 31 ---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 30 --
 .../drm/amd/amdkfd/kfd_device_queue_manager.h | 10 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  8 +++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 17 +++---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  | 18 ---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 27 +---
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 17 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  8 +++--
 .../amd/amdkfd/kfd_process_queue_manager.c| 19 +++-
 11 files changed, 133 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 68b06037616f..19f16e3dd769 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -313,7 +313,7 @@ static int kfd_ioctl_create_queue(struct file *filep, 
struct kfd_process *p,
dev->id);
 
err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, 
NULL,
-  NULL, &doorbell_offset_in_process);
+   NULL, NULL, &doorbell_offset_in_process);
if (err != 0)
goto err_create_queue;
 
@@ -1968,13 +1968,15 @@ static int criu_dump_bos(struct kfd_process *p, struct 
kfd_ioctl_criu_dumper_arg
 static int get_queue_data_sizes(struct kfd_process_device *pdd,
struct queue *q,
uint32_t *cu_mask_size,
-   uint32_t *mqd_size)
+   uint32_t *mqd_size,
+   uint32_t *ctl_stack_size)
 {
int ret;
 
*cu_mask_size = sizeof(uint32_t) * (q->properties.cu_mask_count / 32);
 
-   ret = pqm_get_queue_dump_info(&pdd->process->pqm, 
q->properties.queue_id, mqd_size);
+   ret = pqm_get_queue_dump_info(&pdd->process->pqm, 
q->properties.queue_id, mqd_size,
+ ctl_stack_size);
if (ret)
pr_err("Failed to get queue dump info (%d)\n", ret);
 
@@ -1998,13 +2000,15 @@ int get_process_queue_info(struct kfd_process *p, 
uint32_t *num_queues, uint32_t
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) 
{
u32 cu_mask_size;
u32 mqd_size;
+   u32 ctl_stack_size;
int ret;
 
-   ret = get_queue_data_sizes(pdd, q, 
&cu_mask_size, &mqd_size);
+   ret = get_queue_data_sizes(pdd, q, 
&cu_mask_size, &mqd_size,
+  &ctl_stack_size);
if (ret)
return ret;
 
-   data_sizes += cu_mask_size + mqd_size;
+   data_sizes += cu_mask_size + mqd_size + 
ctl_stack_size;
q_index++;
} else {
pr_err("Unsupported queue type (%d)\n", 
q->properties.type);
@@ -2024,11 +2028,12 @@ static int criu_dump_queue(struct kfd_process_device *pdd,
			   void *private_data)
 {
	struct kfd_criu_queue_priv_data *q_data = (struct kfd_criu_queue_priv_data *) private_data;
-	uint8_t *cu_mask, *mqd;
+	uint8_t *cu_mask, *mqd, *ctl_stack;
	int ret;

	cu_mask = (void *)(q_data + 1);
	mqd = cu_mask + q_data->cu_mask_size;
+	ctl_stack = mqd + q_data->mqd_size;

	q_bucket->gpu_id = pdd->dev->id;
	q_data->type = q->properties.type;
@@ -2058,7 +2063,7 @@ static int criu_dump_queue(struct kfd_process_device *pdd,
	if (q_data->cu_mask_size)
		memcpy(cu_mask, q->properties.cu_mask, q_data->cu_mask_size);

-	ret = pqm_dump_mqd(&pdd->process->pqm, q->properties.queue_id, mqd);
+	ret = pqm_dump_mqd(&pdd->process->pqm, q->properties.queue_id, mqd, ctl_stack);
	if (ret) {
		pr_err("Failed dump queue_mqd (%d)\n", ret);
		return ret;
@@ -2086,6 +2091,7 @@ static int criu_dump_queues_device(struct kfd_process_device *pdd,
	uint64_t q_data_size;
	uint32_t cu_mask_size;
	uint32_t mqd_size;
+	uint32_t ctl_stack_size;

	if (q->properties.type != KFD_QUEUE_TYPE_COMPUTE &&
	    q->properties.type != KFD_QUEUE_TYPE_SDMA &&
@@ -2097,11 +2103,11 @@ static int criu_dump_queue

[PATCH 13/18] drm/amdkfd: CRIU dump and restore queue mqds

2021-08-19 Thread David Yat Sin
Dump the contents of each queue's MQD on CRIU dump and restore them
during CRIU restore.

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 53 ++
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 70 +--
 .../drm/amd/amdkfd/kfd_device_queue_manager.h | 11 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |  7 ++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 67 ++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  | 68 ++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 67 ++
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 68 ++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  7 ++
 .../amd/amdkfd/kfd_process_queue_manager.c| 47 -
 11 files changed, 444 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 813ed42e3ce6..68b06037616f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -313,7 +313,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
			dev->id);

	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL,
-			       &doorbell_offset_in_process);
+			       NULL, &doorbell_offset_in_process);
	if (err != 0)
		goto err_create_queue;
 
@@ -1965,11 +1965,20 @@ static int criu_dump_bos(struct kfd_process *p, struct kfd_ioctl_criu_dumper_arg
return ret;
 }
 
-static void get_queue_data_sizes(struct kfd_process_device *pdd,
+static int get_queue_data_sizes(struct kfd_process_device *pdd,
struct queue *q,
-   uint32_t *cu_mask_size)
+   uint32_t *cu_mask_size,
+   uint32_t *mqd_size)
 {
+   int ret;
+
*cu_mask_size = sizeof(uint32_t) * (q->properties.cu_mask_count / 32);
+
+	ret = pqm_get_queue_dump_info(&pdd->process->pqm, q->properties.queue_id, mqd_size);
+   if (ret)
+   pr_err("Failed to get queue dump info (%d)\n", ret);
+
+   return ret;
 }
 
 int get_process_queue_info(struct kfd_process *p, uint32_t *num_queues, uint32_t *q_data_sizes)
@@ -1988,10 +1997,14 @@ int get_process_queue_info(struct kfd_process *p, uint32_t *num_queues, uint32_t
			q->properties.type == KFD_QUEUE_TYPE_SDMA ||
			q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
u32 cu_mask_size;
+   u32 mqd_size;
+   int ret;
 
-   get_queue_data_sizes(pdd, q, &cu_mask_size);
+			ret = get_queue_data_sizes(pdd, q, &cu_mask_size, &mqd_size);
+   if (ret)
+   return ret;
 
-   data_sizes += cu_mask_size;
+   data_sizes += cu_mask_size + mqd_size;
q_index++;
} else {
			pr_err("Unsupported queue type (%d)\n", q->properties.type);
@@ -2005,15 +2018,17 @@ int get_process_queue_info(struct kfd_process *p, uint32_t *num_queues, uint32_t
return 0;
 }
 
-static void criu_dump_queue(struct kfd_process_device *pdd,
+static int criu_dump_queue(struct kfd_process_device *pdd,
   struct queue *q,
   struct kfd_criu_queue_bucket *q_bucket,
   void *private_data)
 {
	struct kfd_criu_queue_priv_data *q_data = (struct kfd_criu_queue_priv_data *) private_data;
-   uint8_t *cu_mask;
+   uint8_t *cu_mask, *mqd;
+   int ret;
 
cu_mask = (void *)(q_data + 1);
+   mqd = cu_mask + q_data->cu_mask_size;
 
q_bucket->gpu_id = pdd->dev->id;
q_data->type = q->properties.type;
@@ -2043,7 +2058,14 @@ static void criu_dump_queue(struct kfd_process_device *pdd,
if (q_data->cu_mask_size)
memcpy(cu_mask, q->properties.cu_mask, q_data->cu_mask_size);
 
+   ret = pqm_dump_mqd(&pdd->process->pqm, q->properties.queue_id, mqd);
+   if (ret) {
+   pr_err("Failed dump queue_mqd (%d)\n", ret);
+   return ret;
+   }
+
	pr_debug("Dumping Queue: gpu_id:%x queue_id:%u\n", q_bucket->gpu_id, q_data->q_id);
+   return ret;
 }
 
 static int criu_dump_queues_device(struct kfd_process_device *pdd,
@@ -2063,6 +2085,7 @@ static int criu_dump_queues_device(struct kfd_process_device *pdd,
struct kfd_criu_queue_priv_data *q_data;
uint64_t q_data_size;
uint32_t cu_mask_size;
+   uint32_t mqd_size;
 
if (q->pr

[PATCH 17/18] Revert "drm/amdgpu: Remove verify_access shortcut for KFD BOs"

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

This reverts commit 12ebe2b9df192a2a8580cd9ee3e9940c116913c8.

This is just a temporary workaround and will be dropped later.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 99ea29fd12bd..be7eb85af066 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -178,6 +178,13 @@ static int amdgpu_verify_access(struct ttm_buffer_object *bo, struct file *filp)
 {
struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo);
 
+   /*
+* Don't verify access for KFD BOs. They don't have a GEM
+* object associated with them.
+*/
+   if (abo->kfd_bo)
+   return 0;
+
if (amdgpu_ttm_tt_get_usermm(bo->ttm))
return -EPERM;
return drm_vma_node_verify_access(&abo->tbo.base.vma_node,
-- 
2.17.1



[PATCH 15/18] drm/amdkfd: CRIU dump and restore events

2021-08-19 Thread David Yat Sin
Add support to the existing CRIU ioctls to save and restore events
during criu checkpoint and restore.

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 130 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c  | 253 ---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  25 ++-
 3 files changed, 329 insertions(+), 79 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 19f16e3dd769..c8f523d8ab81 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1008,51 +1008,11 @@ static int kfd_ioctl_create_event(struct file *filp, struct kfd_process *p,
 * through the event_page_offset field.
 */
if (args->event_page_offset) {
-   struct kfd_dev *kfd;
-   struct kfd_process_device *pdd;
-   void *mem, *kern_addr;
-   uint64_t size;
-
-   if (p->signal_page) {
-   pr_err("Event page is already set\n");
-   return -EINVAL;
-   }
-
-   kfd = kfd_device_by_id(GET_GPU_ID(args->event_page_offset));
-   if (!kfd) {
-   pr_err("Getting device by id failed in %s\n", __func__);
-   return -EINVAL;
-   }
-
mutex_lock(&p->mutex);
-   pdd = kfd_bind_process_to_device(kfd, p);
-   if (IS_ERR(pdd)) {
-   err = PTR_ERR(pdd);
-   goto out_unlock;
-   }
-
-   mem = kfd_process_device_translate_handle(pdd,
-   GET_IDR_HANDLE(args->event_page_offset));
-   if (!mem) {
-   pr_err("Can't find BO, offset is 0x%llx\n",
-  args->event_page_offset);
-   err = -EINVAL;
-   goto out_unlock;
-   }
+   err = kfd_kmap_event_page(p, args->event_page_offset);
mutex_unlock(&p->mutex);
-
-   err = amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(kfd->kgd,
-   mem, &kern_addr, &size);
-   if (err) {
-   pr_err("Failed to map event page to kernel\n");
-   return err;
-   }
-
-   err = kfd_event_page_set(p, kern_addr, size);
-   if (err) {
-   pr_err("Failed to set event page\n");
+   if (err)
return err;
-   }
}
 
err = kfd_event_create(filp, p, args->event_type,
@@ -1061,10 +1021,7 @@ static int kfd_ioctl_create_event(struct file *filp, struct kfd_process *p,
&args->event_page_offset,
&args->event_slot_index);
 
-   return err;
-
-out_unlock:
-   mutex_unlock(&p->mutex);
+   pr_debug("Created event (id:0x%08x) (%s)\n", args->event_id, __func__);
return err;
 }
 
@@ -2208,6 +2165,41 @@ static int criu_dump_queues(struct kfd_process *p, struct kfd_ioctl_criu_dumper_
return ret;
 }
 
+static int criu_dump_events(struct kfd_process *p, struct kfd_ioctl_criu_dumper_args *args)
+{
+	struct kfd_criu_event_bucket *ev_buckets;
+	uint32_t num_events;
+	int ret = 0;
+
+	num_events = kfd_get_num_events(p);
+	if (args->num_objects != num_events) {
+		pr_err("Mismatch with number of events (current:%d user:%lld)\n",
+		       num_events, args->num_objects);
+
+	}
+
+	if (args->objects_size != args->num_objects *
+				  (sizeof(*ev_buckets) + sizeof(struct kfd_criu_event_priv_data))) {
+		pr_err("Invalid objects size for events\n");
+		return -EINVAL;
+	}
+
+   ev_buckets = kvzalloc(args->objects_size, GFP_KERNEL);
+   if (!ev_buckets)
+   return -ENOMEM;
+
+   ret = kfd_event_dump(p, ev_buckets, args->num_objects);
+   if (!ret) {
+		ret = copy_to_user((void __user *)args->objects, ev_buckets, args->objects_size);
+   if (ret) {
+   pr_err("Failed to copy events information to user\n");
+   ret = -EFAULT;
+   }
+   }
+   kvfree(ev_buckets);
+   return ret;
+}
+
 static int kfd_ioctl_criu_dumper(struct file *filep,
struct kfd_process *p, void *data)
 {
@@ -2246,6 +2238,8 @@ static int kfd_ioctl_criu_dumper(struct file *filep,
ret = criu_dump_queues(p, args);
break;
case KFD_CRIU_OBJECT_TYPE_EVENT:
+   ret = criu_dump_events(p, args);
+   break;
case KFD_CRIU_OBJECT_TYPE_DEVICE:
case KFD_CRIU_OBJECT_TYPE_SVM_RANGE:
default:
@@ -2676,6 

[PATCH 12/18] drm/amdkfd: CRIU restore queue doorbell id

2021-08-19 Thread David Yat Sin
When re-creating queues during CRIU restore, restore the queue with the
same doorbell id value used during CRIU dump.
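For reference, the doorbell-ID reuse described above can be sketched as a few lines of plain C. This is only an illustrative userspace model, not the kernel path: the 64-entry bitmap, the helper name and the -1 error returns are stand-ins for qpd->doorbell_bitmap, __test_and_set_bit() and -EINVAL/-EBUSY.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of the SOC15 CP-queue doorbell allocator: a bitmap of
 * used ids (set bit = in use), plus an optional restore_id that a CRIU
 * restore asks for. */
#define MAX_DOORBELLS 64

/* Returns the allocated doorbell id, or -1 on failure. */
int alloc_doorbell(uint64_t *bitmap, const int *restore_id)
{
	if (restore_id) {
		/* Restore path: the dumped id must still be free. */
		if (*restore_id < 0 || *restore_id >= MAX_DOORBELLS ||
		    (*bitmap & (1ULL << *restore_id)))
			return -1;
		*bitmap |= 1ULL << *restore_id;
		return *restore_id;
	}
	/* Normal create path: take the first free bit. */
	for (int bit = 0; bit < MAX_DOORBELLS; bit++) {
		if (!(*bitmap & (1ULL << bit))) {
			*bitmap |= 1ULL << bit;
			return bit;
		}
	}
	return -1;	/* no doorbells available */
}
```

The key property, as in the patch, is that restoring the same id twice fails instead of silently handing out a different doorbell.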

Signed-off-by: David Yat Sin 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 61 +--
 1 file changed, 41 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 677f94e93218..5c268c7726d2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -153,7 +153,13 @@ static void decrement_queue_count(struct device_queue_manager *dqm,
dqm->active_cp_queue_count--;
 }
 
-static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
+/*
+ * Allocate a doorbell ID to this queue.
+ * If doorbell_id is passed in, make sure requested ID is valid then allocate it.
+ */
+static int allocate_doorbell(struct qcm_process_device *qpd,
+struct queue *q,
+uint32_t const *restore_id)
 {
struct kfd_dev *dev = qpd->dqm->dev;
 
@@ -161,6 +167,9 @@ static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
/* On pre-SOC15 chips we need to use the queue ID to
 * preserve the user mode ABI.
 */
+   if (restore_id && *restore_id != q->properties.queue_id)
+   return -EINVAL;
+
q->doorbell_id = q->properties.queue_id;
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
@@ -169,25 +178,37 @@ static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
 * The doobell index distance between RLC (2*i) and (2*i+1)
 * for a SDMA engine is 512.
 */
-   uint32_t *idx_offset =
-   dev->shared_resources.sdma_doorbell_idx;
 
-   q->doorbell_id = idx_offset[q->properties.sdma_engine_id]
-   + (q->properties.sdma_queue_id & 1)
-   * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
-   + (q->properties.sdma_queue_id >> 1);
+   uint32_t *idx_offset = dev->shared_resources.sdma_doorbell_idx;
+		uint32_t valid_id = idx_offset[q->properties.sdma_engine_id]
+				    + (q->properties.sdma_queue_id & 1)
+				    * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
+				    + (q->properties.sdma_queue_id >> 1);
+
+   if (restore_id && *restore_id != valid_id)
+   return -EINVAL;
+   q->doorbell_id = valid_id;
} else {
-   /* For CP queues on SOC15 reserve a free doorbell ID */
-   unsigned int found;
-
-   found = find_first_zero_bit(qpd->doorbell_bitmap,
-   KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
-   if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
-   pr_debug("No doorbells available");
-   return -EBUSY;
+   /* For CP queues on SOC15 */
+   if (restore_id) {
+   /* make sure that ID is free  */
+			if (__test_and_set_bit(*restore_id, qpd->doorbell_bitmap))
+   return -EINVAL;
+
+   q->doorbell_id = *restore_id;
+   } else {
+   /* or reserve a free doorbell ID */
+   unsigned int found;
+
+   found = find_first_zero_bit(qpd->doorbell_bitmap,
+						    KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
+   if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
+   pr_debug("No doorbells available");
+   return -EBUSY;
+   }
+   set_bit(found, qpd->doorbell_bitmap);
+   q->doorbell_id = found;
}
-   set_bit(found, qpd->doorbell_bitmap);
-   q->doorbell_id = found;
}
 
q->properties.doorbell_off =
@@ -343,7 +364,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
}
 
-   retval = allocate_doorbell(qpd, q);
+   retval = allocate_doorbell(qpd, q, qd ? &qd->doorbell_id : NULL);
if (retval)
goto out_deallocate_hqd;
 
@@ -998,7 +1019,7 @@ static int start_nocpsch(struct device_queue_manager *dqm)
 {
pr_info("SW scheduler is used");
init_interrupts(dqm);
-   
+
if (dqm->dev->device_info->asic_family == CHIP_HAWAII)
return pm_init(&dqm->packets, dqm);

[PATCH 10/18] drm/amdkfd: CRIU restore queue ids

2021-08-19 Thread David Yat Sin
When re-creating queues during CRIU restore, restore the queue with the
same queue id value used during CRIU dump. Adding a new private
structure queue_restore_data to store queue restore information.
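The qid handling described above boils down to two bitmap operations: a normal create picks the first free slot, while a restore claims the exact slot that was dumped and fails if it is taken. A minimal standalone model (the 32-slot limit, function names and -1 returns are illustrative, standing in for pqm->queue_slot_bitmap and -EINVAL/-ENOSPC):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_QUEUES 32	/* stand-in for KFD_MAX_NUM_OF_QUEUES_PER_PROCESS */

/* Restore path: claim a specific qid, failing if out of range or busy. */
int assign_queue_slot_by_qid(uint32_t *bitmap, unsigned int qid)
{
	if (qid >= MAX_QUEUES)
		return -1;		/* -EINVAL in the kernel */
	if (*bitmap & (1U << qid))
		return -1;		/* qid already in use: -ENOSPC */
	*bitmap |= 1U << qid;
	return 0;
}

/* Normal create path: pick the first free slot. */
int find_available_queue_slot(uint32_t *bitmap, unsigned int *qid)
{
	for (unsigned int bit = 0; bit < MAX_QUEUES; bit++) {
		if (!(*bitmap & (1U << bit))) {
			*bitmap |= 1U << bit;
			*qid = bit;
			return 0;
		}
	}
	return -1;			/* bitmap full */
}
```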

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  2 ++
 .../amd/amdkfd/kfd_process_queue_manager.c| 22 ++-
 4 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 6f1c9fb8d46c..813ed42e3ce6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -312,7 +312,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
p->pasid,
dev->id);
 
-   err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id,
+	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL,
&doorbell_offset_in_process);
if (err != 0)
goto err_create_queue;
@@ -2543,7 +2543,7 @@ static int criu_restore_queue(struct kfd_process *p,
 
print_queue_properties(&qp);
 
-   ret = pqm_create_queue(&p->pqm, dev, NULL, &qp, &queue_id, NULL);
+	ret = pqm_create_queue(&p->pqm, dev, NULL, &qp, &queue_id, q_data, NULL);
if (ret) {
pr_err("Failed to create new queue err:%d\n", ret);
ret = -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 159add0f5aaa..749a7a3bf191 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
properties.type = KFD_QUEUE_TYPE_DIQ;
 
status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
-   &properties, &qid, NULL);
+   &properties, &qid, NULL, NULL);
 
if (status) {
pr_err("Failed to create DIQ\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 4b4808b191f2..eaf5fe1480e9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -468,6 +468,7 @@ enum KFD_QUEUE_PRIORITY {
  * it's user mode or kernel mode queue.
  *
  */
+
 struct queue_properties {
enum kfd_queue_type type;
enum kfd_queue_format format;
@@ -1114,6 +1115,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
struct file *f,
struct queue_properties *properties,
unsigned int *qid,
+   const struct kfd_criu_queue_priv_data *q_data,
uint32_t *p_doorbell_offset_in_process);
 int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid);
 int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 95a6c36cea4c..e6abab16b8de 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -42,6 +42,20 @@ static inline struct process_queue_node *get_queue_by_qid(
return NULL;
 }
 
+static int assign_queue_slot_by_qid(struct process_queue_manager *pqm,
+   unsigned int qid)
+{
+   if (qid >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
+   return -EINVAL;
+
+   if (__test_and_set_bit(qid, pqm->queue_slot_bitmap)) {
+		pr_err("Cannot create new queue because requested qid(%u) is in use\n", qid);
+   return -ENOSPC;
+   }
+
+   return 0;
+}
+
 static int find_available_queue_slot(struct process_queue_manager *pqm,
unsigned int *qid)
 {
@@ -193,6 +207,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
struct file *f,
struct queue_properties *properties,
unsigned int *qid,
+   const struct kfd_criu_queue_priv_data *q_data,
uint32_t *p_doorbell_offset_in_process)
 {
int retval;
@@ -224,7 +239,12 @@ int pqm_create_queue(struct process_queue_manager *pqm,
if (pdd->qpd.queue_count >= max_queues)
return -ENOSPC;
 
-   retval = find_available_queue_slot(pqm, qid);
+   if (q_data) {
+   retval = assign_queue_slot_by_qid(pqm, q_data->q_id);
+   *qid = q_data->q_id;
+   } else
+   retval = find_available_queue_slot(pqm, qid);
+
if (retval

[PATCH 11/18] drm/amdkfd: CRIU restore sdma id for queues

2021-08-19 Thread David Yat Sin
When re-creating queues during CRIU restore, restore the queue with the
same sdma id value used during CRIU dump.
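As context for the reviewers: the sdma_id both selects a bit in the free bitmap and derives the engine/queue pair, so reusing the dumped id also reproduces the original engine assignment. A small illustrative model of that mapping (set bit = free id, as in dqm->sdma_bitmap; the engine count, names and -1 returns are stand-ins, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SDMA_ENGINES 2	/* illustrative engine count */

struct sdma_alloc {
	unsigned int sdma_id;
	unsigned int engine_id;	/* sdma_id % NUM_SDMA_ENGINES */
	unsigned int queue_id;	/* sdma_id / NUM_SDMA_ENGINES */
};

/* Returns 0 on success; -1 if the requested id is busy or none are free. */
int allocate_sdma(uint64_t *bitmap, const unsigned int *restore_id,
		  struct sdma_alloc *out)
{
	unsigned int bit;

	if (restore_id) {
		/* Restore path: the dumped id must still be free. */
		if (!(*bitmap & (1ULL << *restore_id)))
			return -1;	/* -EBUSY in the kernel */
		bit = *restore_id;
	} else {
		if (!*bitmap)
			return -1;	/* -ENOMEM: no free ids */
		/* Lowest free id, mirroring __ffs64(). */
		bit = (unsigned int)__builtin_ctzll(*bitmap);
	}
	*bitmap &= ~(1ULL << bit);	/* clear bit = id in use */
	out->sdma_id = bit;
	out->engine_id = bit % NUM_SDMA_ENGINES;
	out->queue_id = bit / NUM_SDMA_ENGINES;
	return 0;
}
```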

Signed-off-by: David Yat Sin 
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 48 ++-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  3 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|  4 +-
 3 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 98c2046c7331..677f94e93218 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -58,7 +58,7 @@ static inline void deallocate_hqd(struct device_queue_manager *dqm,
				struct queue *q);
 static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q);
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
-			struct queue *q);
+			struct queue *q, const uint32_t *restore_sdma_id);
 static void kfd_process_hw_exception(struct work_struct *work);
 
 static inline
@@ -296,7 +296,8 @@ static void deallocate_vmid(struct device_queue_manager *dqm,
 
 static int create_queue_nocpsch(struct device_queue_manager *dqm,
struct queue *q,
-   struct qcm_process_device *qpd)
+   struct qcm_process_device *qpd,
+   const struct kfd_criu_queue_priv_data *qd)
 {
struct mqd_manager *mqd_mgr;
int retval;
@@ -336,7 +337,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
q->pipe, q->queue);
} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
-   retval = allocate_sdma_queue(dqm, q);
+   retval = allocate_sdma_queue(dqm, q, qd ? &qd->sdma_id : NULL);
if (retval)
goto deallocate_vmid;
dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
@@ -1022,7 +1023,7 @@ static void pre_reset(struct device_queue_manager *dqm)
 }
 
 static int allocate_sdma_queue(struct device_queue_manager *dqm,
-   struct queue *q)
+			struct queue *q, const uint32_t *restore_sdma_id)
 {
int bit;
 
@@ -1032,9 +1033,21 @@ static int allocate_sdma_queue(struct device_queue_manager *dqm,
return -ENOMEM;
}
 
-   bit = __ffs64(dqm->sdma_bitmap);
-   dqm->sdma_bitmap &= ~(1ULL << bit);
-   q->sdma_id = bit;
+   if (restore_sdma_id) {
+   /* Re-use existing sdma_id */
+   if (!(dqm->sdma_bitmap & (1ULL << *restore_sdma_id))) {
+   pr_err("SDMA queue already in use\n");
+   return -EBUSY;
+   }
+   dqm->sdma_bitmap &= ~(1ULL << *restore_sdma_id);
+   q->sdma_id = *restore_sdma_id;
+   } else {
+   /* Find first available sdma_id */
+   bit = __ffs64(dqm->sdma_bitmap);
+   dqm->sdma_bitmap &= ~(1ULL << bit);
+   q->sdma_id = bit;
+   }
+
q->properties.sdma_engine_id = q->sdma_id %
get_num_sdma_engines(dqm);
q->properties.sdma_queue_id = q->sdma_id /
@@ -1044,9 +1057,19 @@ static int allocate_sdma_queue(struct device_queue_manager *dqm,
pr_err("No more XGMI SDMA queue to allocate\n");
return -ENOMEM;
}
-   bit = __ffs64(dqm->xgmi_sdma_bitmap);
-   dqm->xgmi_sdma_bitmap &= ~(1ULL << bit);
-   q->sdma_id = bit;
+   if (restore_sdma_id) {
+   /* Re-use existing sdma_id */
+			if (!(dqm->xgmi_sdma_bitmap & (1ULL << *restore_sdma_id))) {
+   pr_err("SDMA queue already in use\n");
+   return -EBUSY;
+   }
+   dqm->xgmi_sdma_bitmap &= ~(1ULL << *restore_sdma_id);
+   q->sdma_id = *restore_sdma_id;
+   } else {
+   bit = __ffs64(dqm->xgmi_sdma_bitmap);
+   dqm->xgmi_sdma_bitmap &= ~(1ULL << bit);
+   q->sdma_id = bit;
+   }
/* sdma_engine_id is sdma id including
 * both PCIe-optimized SDMAs and XGMI-
 * optimized SDMAs. The calculation below
@@ -1269,7 +1292,8 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
 }
 
 static int create_queue_cpsch(struct device_queue_manager *dqm, struc

[PATCH 09/18] drm/amdkfd: CRIU add queues support

2021-08-19 Thread David Yat Sin
Add support to the existing CRIU ioctls to save the number of queues and
the queue properties for each queue during checkpoint, and to re-create
the queues on restore.
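One detail worth spelling out for reviewers is the flat per-queue private-data layout: a fixed-size header is immediately followed by the variable-size CU mask (one uint32_t per 32 CUs, as in get_queue_data_sizes()). A self-contained sketch of that packing; the struct here is a stand-in, not the real kfd_criu_queue_priv_data:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for the fixed header of the per-queue blob. */
struct q_priv {
	uint32_t q_id;
	uint32_t cu_mask_size;	/* bytes of CU mask that follow the header */
};

/* One uint32_t of mask per 32 CUs, mirroring get_queue_data_sizes(). */
uint32_t cu_mask_bytes(uint32_t cu_mask_count)
{
	return sizeof(uint32_t) * (cu_mask_count / 32);
}

/* Pack header + mask into one contiguous buffer, as criu_dump_queue()
 * does with cu_mask = (void *)(q_data + 1). */
void pack_queue(uint8_t *buf, uint32_t q_id,
		const uint32_t *cu_mask, uint32_t mask_bytes)
{
	struct q_priv *hdr = (struct q_priv *)buf;

	hdr->q_id = q_id;
	hdr->cu_mask_size = mask_bytes;
	memcpy(hdr + 1, cu_mask, mask_bytes);	/* mask starts right after header */
}
```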

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 380 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  22 +-
 2 files changed, 400 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 24e5c53261f5..6f1c9fb8d46c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1965,6 +1965,213 @@ static int criu_dump_bos(struct kfd_process *p, struct kfd_ioctl_criu_dumper_arg
return ret;
 }
 
+static void get_queue_data_sizes(struct kfd_process_device *pdd,
+   struct queue *q,
+   uint32_t *cu_mask_size)
+{
+   *cu_mask_size = sizeof(uint32_t) * (q->properties.cu_mask_count / 32);
+}
+
+int get_process_queue_info(struct kfd_process *p, uint32_t *num_queues, uint32_t *q_data_sizes)
+{
+   u32 data_sizes = 0;
+   u32 q_index = 0;
+   struct queue *q;
+   int i;
+
+   /* Run over all PDDs of the process */
+   for (i = 0; i < p->n_pdds; i++) {
+   struct kfd_process_device *pdd = p->pdds[i];
+
+   list_for_each_entry(q, &pdd->qpd.queues_list, list) {
+   if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE ||
+   q->properties.type == KFD_QUEUE_TYPE_SDMA ||
+			    q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
+   u32 cu_mask_size;
+
+   get_queue_data_sizes(pdd, q, &cu_mask_size);
+
+   data_sizes += cu_mask_size;
+   q_index++;
+   } else {
+				pr_err("Unsupported queue type (%d)\n", q->properties.type);
+   return -EOPNOTSUPP;
+   }
+   }
+   }
+   *num_queues = q_index;
+   *q_data_sizes = data_sizes;
+
+   return 0;
+}
+
+static void criu_dump_queue(struct kfd_process_device *pdd,
+  struct queue *q,
+  struct kfd_criu_queue_bucket *q_bucket,
+  void *private_data)
+{
+	struct kfd_criu_queue_priv_data *q_data = (struct kfd_criu_queue_priv_data *) private_data;
+   uint8_t *cu_mask;
+
+   cu_mask = (void *)(q_data + 1);
+
+   q_bucket->gpu_id = pdd->dev->id;
+   q_data->type = q->properties.type;
+   q_data->format = q->properties.format;
+   q_data->q_id =  q->properties.queue_id;
+   q_data->q_address = q->properties.queue_address;
+   q_data->q_size = q->properties.queue_size;
+   q_data->priority = q->properties.priority;
+   q_data->q_percent = q->properties.queue_percent;
+   q_data->read_ptr_addr = (uint64_t)q->properties.read_ptr;
+   q_data->write_ptr_addr = (uint64_t)q->properties.write_ptr;
+   q_data->doorbell_id = q->doorbell_id;
+
+   q_data->sdma_id = q->sdma_id;
+
+   q_data->eop_ring_buffer_address =
+   q->properties.eop_ring_buffer_address;
+
+   q_data->eop_ring_buffer_size = q->properties.eop_ring_buffer_size;
+
+   q_data->ctx_save_restore_area_address =
+   q->properties.ctx_save_restore_area_address;
+
+   q_data->ctx_save_restore_area_size =
+   q->properties.ctx_save_restore_area_size;
+
+   if (q_data->cu_mask_size)
+   memcpy(cu_mask, q->properties.cu_mask, q_data->cu_mask_size);
+
+	pr_debug("Dumping Queue: gpu_id:%x queue_id:%u\n", q_bucket->gpu_id, q_data->q_id);
+}
+
+static int criu_dump_queues_device(struct kfd_process_device *pdd,
+   unsigned int *q_index,
+   unsigned int max_num_queues,
+   struct kfd_criu_queue_bucket *q_buckets,
+   uint8_t *user_priv_data,
+   uint64_t *queues_priv_data_offset)
+{
+   struct queue *q;
+	uint8_t *q_private_data = NULL; /* Local buffer to store individual queue private data */
+   unsigned int q_private_data_size = 0;
+   int ret = 0;
+
+   list_for_each_entry(q, &pdd->qpd.queues_list, list) {
+   struct kfd_criu_queue_bucket q_bucket;
+   struct kfd_criu_queue_priv_data *q_data;
+   uint64_t q_data_size;
+   uint32_t cu_mask_size;
+
+   if (q->properties.type != KFD_QUEUE_TYPE_COMPUTE &&
+   q->properties.type != KFD_QUEUE_TYPE_SDMA &&
+   q->properties.type != KFD_QUEUE_TYPE_SDMA_XGMI) {
+
+			pr_err("Unsupported queue type (%d)\n", q->properties.type);
+   return -EOPNOTSUPP;
+   }

[PATCH 08/18] drm/amdkfd: CRIU Implement KFD pause ioctl

2021-08-19 Thread David Yat Sin
Introduce a pause ioctl. The CRIU amdgpu plugin needs to call
AMDKFD_IOC_CRIU_PAUSE(pause = 1) before starting the dump and
AMDKFD_IOC_CRIU_PAUSE(pause = 0) once the dump is complete. This ensures
that the queues are not modified between the individual CRIU dump ioctls.
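The ordering this enforces can be modeled in a few lines: dumping is rejected unless pause ran first. This is only an illustrative userspace sketch of the state machine; the struct and function names are stand-ins for the kernel's kfd_process, kfd_process_evict_queues() and the ioctl handlers:

```c
#include <assert.h>
#include <stdbool.h>

struct proc_state {
	bool queues_paused;	/* models p->queues_paused */
};

int criu_pause(struct proc_state *p, int pause)
{
	/* In the kernel this evicts (pause = 1) or restores (pause = 0)
	 * all process queues before flipping the flag. */
	p->queues_paused = !!pause;
	return 0;
}

int criu_dump(const struct proc_state *p)
{
	if (!p->queues_paused)
		return -1;	/* -EINVAL: PAUSE was not called first */
	return 0;		/* queues are quiesced, safe to dump */
}
```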

Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 23 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  3 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  1 +
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index f0c278e7d7e0..24e5c53261f5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1984,6 +1984,14 @@ static int kfd_ioctl_criu_dumper(struct file *filep,
		goto err_unlock;
	}

+	/* Confirm all process queues are evicted */
+	if (!p->queues_paused) {
+		pr_err("Cannot dump process when queues are not in evicted state\n");
+		/* CRIU plugin did not call AMDKFD_IOC_CRIU_PAUSE before dumping */
+   ret = -EINVAL;
+   goto err_unlock;
+   }
+
switch (args->type) {
case KFD_CRIU_OBJECT_TYPE_PROCESS:
ret = criu_dump_process(p, args);
@@ -2306,9 +2314,20 @@ static int kfd_ioctl_criu_restorer(struct file *filep,
 
 static int kfd_ioctl_criu_pause(struct file *filep, struct kfd_process *p, void *data)
 {
-   pr_debug("Inside %s\n", __func__);
+   int ret;
+   struct kfd_ioctl_criu_pause_args *args = data;
 
-   return 0;
+   if (args->pause)
+   ret = kfd_process_evict_queues(p);
+   else
+   ret = kfd_process_restore_queues(p);
+
+   if (ret)
+		pr_err("Failed to %s queues ret:%d\n", args->pause ? "evict" : "restore", ret);
+   else
+   p->queues_paused = !!(args->pause);
+
+   return ret;
 }
 
 static int kfd_ioctl_criu_resume(struct file *filep,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 719982605587..0b8165729cde 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -857,6 +857,9 @@ struct kfd_process {
bool svm_disabled;
 
bool xnack_enabled;
+
+	/* Queues are in paused state because we are in the process of doing a CRIU checkpoint */
+   bool queues_paused;
 };
 
 #define KFD_PROCESS_TABLE_SIZE 5 /* bits: 32 entries */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index bbf21395fb06..e4cb2f778590 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1268,6 +1268,7 @@ static struct kfd_process *create_process(const struct task_struct *thread)
process->lead_thread = thread->group_leader;
process->n_pdds = 0;
process->svm_disabled = false;
+   process->queues_paused = false;
INIT_DELAYED_WORK(&process->eviction_work, evict_process_worker);
INIT_DELAYED_WORK(&process->restore_work, restore_process_worker);
process->last_restore_timestamp = get_jiffies_64();
-- 
2.17.1



[PATCH 07/18] drm/amdkfd: CRIU Implement KFD resume ioctl

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

This adds support to create userptr BOs on restore and introduces a new
ioctl to restart memory notifiers for the restored userptr BOs.
When doing a CRIU restore, MMU notifications can happen anytime after we call
amdgpu_mn_register. Prevent MMU notifications until we reach stage-4 of the
restore process, i.e. until the criu_resume ioctl is received and the process
is ready to be resumed. This ioctl is different from the other KFD CRIU ioctls
since it is called by the CRIU master restore process for all the target
processes being resumed by CRIU.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
(cherry picked from commit 1f0300f5a4dc12b3c1140b0f0953300b4a6ac81f)
(cherry picked from commit 5c5ae6026ea795ae39acff06db862a7ef2fc6aa9)
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  5 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 51 +--
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 40 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 36 +++--
 5 files changed, 120 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 313ee49b9f17..158130a4f4cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -117,6 +117,7 @@ struct amdkfd_process_info {
atomic_t evicted_bos;
struct delayed_work restore_userptr_work;
struct pid *pid;
+   bool block_mmu_notifications;
 };
 
 int amdgpu_amdkfd_init(void);
@@ -249,7 +250,9 @@ uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void *drm_priv);
 int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
struct kgd_dev *kgd, uint64_t va, uint64_t size,
void *drm_priv, struct kgd_mem **mem,
-   uint64_t *offset, uint32_t flags);
+   uint64_t *offset, uint32_t flags, bool criu_resume);
+void amdgpu_amdkfd_block_mmu_notifications(void *p);
+int amdgpu_amdkfd_criu_resume(void *p);
 int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
struct kgd_dev *kgd, struct kgd_mem *mem, void *drm_priv,
uint64_t *size);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index dfa025d694f8..ad8818844526 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -597,7 +597,8 @@ static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem *mem,
  *
  * Returns 0 for success, negative errno for errors.
  */
-static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr)
+static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr,
+  bool criu_resume)
 {
struct amdkfd_process_info *process_info = mem->process_info;
struct amdgpu_bo *bo = mem->bo;
@@ -619,6 +620,17 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr)
goto out;
}
 
+   if (criu_resume) {
+   /*
+* During a CRIU restore operation, the userptr buffer objects
+* will be validated in the restore_userptr_work worker at a
+* later stage when it is scheduled by another ioctl called by
+* CRIU master process for the target pid for restore.
+*/
+   atomic_inc(&mem->invalid);
+   mutex_unlock(&process_info->lock);
+   return 0;
+   }
ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages);
if (ret) {
pr_err("%s: Failed to get user pages: %d\n", __func__, ret);
@@ -982,6 +994,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
INIT_DELAYED_WORK(&info->restore_userptr_work,
  amdgpu_amdkfd_restore_userptr_worker);
 
+   info->block_mmu_notifications = false;
*process_info = info;
*ef = dma_fence_get(&info->eviction_fence->base);
}
@@ -1139,10 +1152,37 @@ uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void 
*drm_priv)
return avm->pd_phys_addr;
 }
 
+void amdgpu_amdkfd_block_mmu_notifications(void *p)
+{
+   struct amdkfd_process_info *pinfo = (struct amdkfd_process_info *)p;
+
+   pinfo->block_mmu_notifications = true;
+}
+
+int amdgpu_amdkfd_criu_resume(void *p)
+{
+   int ret = 0;
+   struct amdkfd_process_info *pinfo = (struct amdkfd_process_info *)p;
+
+   mutex_lock(&pinfo->lock);
+   pr_debug("scheduling work\n");
+   atomic_inc(&pinfo->evicted_bos);
+   if (!pinfo->block_mmu_notifications) {
+   ret = -EINVAL;
+   goto out_unlock;
+   }
+   pinfo->block_mmu_notifications = false;
+   schedule_delayed_work(&pinfo->restore_userptr_work, 0);
+
+out_unlock:
+   mutex_unlock(&pinfo->lock);
+   return ret;
+}

[PATCH 06/18] drm/amdkfd: CRIU Implement KFD restore ioctl

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

This implements the KFD CRIU Restore ioctl that lays the basic
foundation for the CRIU restore operation. It provides support to
create the buffer objects corresponding to non-paged system memory
mapped for GPU and/or CPU access, and lays the basic foundation for the
userptr buffer objects which will be added in a separate patch.
This ioctl creates various types of buffer objects such as VRAM,
MMIO, Doorbell, GTT based on the data sent from the userspace plugin.
The data mostly contains the previously checkpointed KFD images from
some KFD process.

While restoring a criu process, attach old IDR values to newly
created BOs. This also adds the minimal gpu mapping support for a single
gpu checkpoint restore use case.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
(cherry picked from commit 47bb685701c336d1fde7e91be93d9cabe89a4c1b)
(cherry picked from commit b71ba8158a7ddf9e4fd8d872be4e40ddd9a29b4f)
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 284 ++-
 1 file changed, 282 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index d548e6691d69..2dab1845f9d3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2010,12 +2010,292 @@ static int kfd_ioctl_criu_dumper(struct file *filep,
return ret;
 }
 
+static int criu_restore_process(struct kfd_process *p, struct 
kfd_ioctl_criu_restorer_args *args)
+{
+   int ret = 0;
+   uint8_t *objects;
+   struct kfd_criu_process_bucket *process_bucket;
+   struct kfd_criu_process_priv_data *process_priv;
+
+   if (args->num_objects != 1) {
+   pr_err("Only 1 process supported\n");
+   return -EINVAL;
+   }
+
+   if (args->objects_size != sizeof(*process_bucket) + 
sizeof(*process_priv)) {
+   pr_err("Invalid objects size for process\n");
+   return -EINVAL;
+   }
+
+   objects = kmalloc(args->objects_size, GFP_KERNEL);
+   if (!objects)
+   return -ENOMEM;
+
+   ret = copy_from_user(objects, (void __user *)args->objects, 
args->objects_size);
+   if (ret) {
+   pr_err("Failed to copy process information from user\n");
+   ret = -EFAULT;
+   goto exit;
+   }
+
+   process_bucket = (struct kfd_criu_process_bucket *)objects;
+   /* Private data starts after process bucket */
+   process_priv = (struct kfd_criu_process_priv_data *)
+   (objects + sizeof(*process_bucket) + 
process_bucket->priv_data_offset);
+
+   if (process_priv->version != KFD_CRIU_PRIV_VERSION) {
+   pr_err("Invalid CRIU API version (checkpointed:%d 
current:%d)\n",
+   process_priv->version, KFD_CRIU_PRIV_VERSION);
+   ret = -EINVAL;
+   goto exit;
+   }
+
+exit:
+   kfree(objects);
+   return ret;
+}
+
+static int criu_restore_bos(struct kfd_process *p, struct 
kfd_ioctl_criu_restorer_args *args)
+{
+   uint8_t *objects, *private_data;
+   struct kfd_criu_bo_bucket *bo_buckets;
+   int ret = 0, i, j = 0;
+
+   if (args->objects_size != args->num_objects *
+   (sizeof(*bo_buckets) + sizeof(struct kfd_criu_bo_priv_data))) {
+   pr_err("Invalid objects size for BOs\n");
+   return -EINVAL;
+   }
+
+   objects = kmalloc(args->objects_size, GFP_KERNEL);
+   if (!objects)
+   return -ENOMEM;
+
+   ret = copy_from_user(objects, (void __user *)args->objects, 
args->objects_size);
+   if (ret) {
+   pr_err("Failed to copy BOs information from user\n");
+   ret = -EFAULT;
+   goto exit;
+   }
+
+   bo_buckets = (struct kfd_criu_bo_bucket *) objects;
+   /* Private data for first BO starts after all bo_buckets */
+   private_data = (void *)(bo_buckets + args->num_objects);
+
+   /* Create and map new BOs */
+   for (i = 0; i < args->num_objects; i++) {
+   struct kfd_criu_bo_bucket *bo_bucket;
+   struct kfd_criu_bo_priv_data *bo_priv;
+   struct kfd_dev *dev;
+   struct kfd_process_device *pdd;
+   void *mem;
+   u64 offset;
+   int idr_handle;
+
+   bo_bucket = &bo_buckets[i];
+   bo_priv = (struct kfd_criu_bo_priv_data *)
+   (private_data + bo_bucket->priv_data_offset);
+
+   dev = kfd_device_by_id(bo_bucket->gpu_id);
+   if (!dev) {
+   ret = -EINVAL;
+   pr_err("Failed to get pdd\n");
+   goto exit;
+   }
+   pdd = kfd_get_process_device_data(dev, p);
+   if (!pdd) {
+   ret = -EINVAL;
+   pr_err("Failed to get pdd\n");
+   goto exit;
+   

[PATCH 02/18] x86/configs: CRIU update debug rock defconfig

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

 - Update debug config for Checkpoint-Restore (CR) support
 - Also include necessary options for CR with docker containers.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: David Yat Sin 
---
 arch/x86/configs/rock-dbg_defconfig | 53 ++---
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/arch/x86/configs/rock-dbg_defconfig 
b/arch/x86/configs/rock-dbg_defconfig
index 54688993d6e2..87951da7de6a 100644
--- a/arch/x86/configs/rock-dbg_defconfig
+++ b/arch/x86/configs/rock-dbg_defconfig
@@ -236,6 +236,7 @@ CONFIG_BPF_SYSCALL=y
 CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
 # CONFIG_BPF_PRELOAD is not set
 # CONFIG_USERFAULTFD is not set
+CONFIG_USERFAULTFD=y
 CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
 CONFIG_KCMP=y
 CONFIG_RSEQ=y
@@ -994,6 +995,11 @@ CONFIG_PACKET_DIAG=y
 CONFIG_UNIX=y
 CONFIG_UNIX_SCM=y
 CONFIG_UNIX_DIAG=y
+CONFIG_SMC_DIAG=y
+CONFIG_XDP_SOCKETS_DIAG=y
+CONFIG_INET_MPTCP_DIAG=y
+CONFIG_TIPC_DIAG=y
+CONFIG_VSOCKETS_DIAG=y
 # CONFIG_TLS is not set
 CONFIG_XFRM=y
 CONFIG_XFRM_ALGO=y
@@ -1031,15 +1037,17 @@ CONFIG_SYN_COOKIES=y
 # CONFIG_NET_IPVTI is not set
 # CONFIG_NET_FOU is not set
 # CONFIG_NET_FOU_IP_TUNNELS is not set
-# CONFIG_INET_AH is not set
-# CONFIG_INET_ESP is not set
-# CONFIG_INET_IPCOMP is not set
-CONFIG_INET_TUNNEL=y
-CONFIG_INET_DIAG=y
-CONFIG_INET_TCP_DIAG=y
-# CONFIG_INET_UDP_DIAG is not set
-# CONFIG_INET_RAW_DIAG is not set
-# CONFIG_INET_DIAG_DESTROY is not set
+CONFIG_INET_AH=m
+CONFIG_INET_ESP=m
+CONFIG_INET_IPCOMP=m
+CONFIG_INET_ESP_OFFLOAD=m
+CONFIG_INET_TUNNEL=m
+CONFIG_INET_XFRM_TUNNEL=m
+CONFIG_INET_DIAG=m
+CONFIG_INET_TCP_DIAG=m
+CONFIG_INET_UDP_DIAG=m
+CONFIG_INET_RAW_DIAG=m
+CONFIG_INET_DIAG_DESTROY=y
 CONFIG_TCP_CONG_ADVANCED=y
 # CONFIG_TCP_CONG_BIC is not set
 CONFIG_TCP_CONG_CUBIC=y
@@ -1064,12 +1072,14 @@ CONFIG_TCP_MD5SIG=y
 CONFIG_IPV6=y
 # CONFIG_IPV6_ROUTER_PREF is not set
 # CONFIG_IPV6_OPTIMISTIC_DAD is not set
-CONFIG_INET6_AH=y
-CONFIG_INET6_ESP=y
-# CONFIG_INET6_ESP_OFFLOAD is not set
-# CONFIG_INET6_ESPINTCP is not set
-# CONFIG_INET6_IPCOMP is not set
-# CONFIG_IPV6_MIP6 is not set
+CONFIG_INET6_AH=m
+CONFIG_INET6_ESP=m
+CONFIG_INET6_ESP_OFFLOAD=m
+CONFIG_INET6_IPCOMP=m
+CONFIG_IPV6_MIP6=m
+CONFIG_INET6_XFRM_TUNNEL=m
+CONFIG_INET_DCCP_DIAG=m
+CONFIG_INET_SCTP_DIAG=m
 # CONFIG_IPV6_ILA is not set
 # CONFIG_IPV6_VTI is not set
 CONFIG_IPV6_SIT=y
@@ -1126,8 +1136,13 @@ CONFIG_NF_CT_PROTO_UDPLITE=y
 # CONFIG_NF_CONNTRACK_SANE is not set
 # CONFIG_NF_CONNTRACK_SIP is not set
 # CONFIG_NF_CONNTRACK_TFTP is not set
-# CONFIG_NF_CT_NETLINK is not set
-# CONFIG_NF_CT_NETLINK_TIMEOUT is not set
+CONFIG_COMPAT_NETLINK_MESSAGES=y
+CONFIG_NF_CT_NETLINK=m
+CONFIG_NF_CT_NETLINK_TIMEOUT=m
+CONFIG_NF_CT_NETLINK_HELPER=m
+CONFIG_NETFILTER_NETLINK_GLUE_CT=y
+CONFIG_SCSI_NETLINK=y
+CONFIG_QUOTA_NETLINK_INTERFACE=y
 CONFIG_NF_NAT=m
 CONFIG_NF_NAT_REDIRECT=y
 CONFIG_NF_NAT_MASQUERADE=y
@@ -1971,7 +1986,7 @@ CONFIG_NETCONSOLE_DYNAMIC=y
 CONFIG_NETPOLL=y
 CONFIG_NET_POLL_CONTROLLER=y
 # CONFIG_RIONET is not set
-# CONFIG_TUN is not set
+CONFIG_TUN=y
 # CONFIG_TUN_VNET_CROSS_LE is not set
 CONFIG_VETH=y
 # CONFIG_NLMON is not set
@@ -3955,7 +3970,7 @@ CONFIG_MANDATORY_FILE_LOCKING=y
 CONFIG_FSNOTIFY=y
 CONFIG_DNOTIFY=y
 CONFIG_INOTIFY_USER=y
-# CONFIG_FANOTIFY is not set
+CONFIG_FANOTIFY=y
 CONFIG_QUOTA=y
 CONFIG_QUOTA_NETLINK_INTERFACE=y
 # CONFIG_PRINT_QUOTA_WARNING is not set
-- 
2.17.1



[PATCH 05/18] drm/amdkfd: CRIU Implement KFD dumper ioctl

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

This adds support to discover the buffer objects that belong to a
process being checkpointed. The data corresponding to these buffer
objects is returned to user space plugin running under criu master
context which then stores this info to recreate these buffer objects
during a restore operation.

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
(cherry picked from commit 1f114a541bd21873de905db64bb9efa673274d4b)
(cherry picked from commit 20c435fad57d3201e5402e38ae778f1f0f84a09d)
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  |  20 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h  |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 182 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|   3 +-
 4 files changed, 204 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 7e7d8330d64b..99ea29fd12bd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1181,6 +1181,26 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device 
*bdev,
return ttm_pool_free(&adev->mman.bdev.pool, ttm);
 }
 
+/**
+ * amdgpu_ttm_tt_get_userptr - Return the userptr GTT ttm_tt for the current
+ * task
+ *
+ * @tbo: The ttm_buffer_object that contains the userptr
+ * @user_addr:  The returned value
+ */
+int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo,
+ uint64_t *user_addr)
+{
+   struct amdgpu_ttm_tt *gtt;
+
+   if (!tbo->ttm)
+   return -EINVAL;
+
+   gtt = (void *)tbo->ttm;
+   *user_addr = gtt->userptr;
+   return 0;
+}
+
 /**
  * amdgpu_ttm_tt_set_userptr - Initialize userptr GTT ttm_tt for the current
  * task
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 9e38475e0f8d..76f7a92e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -168,6 +168,8 @@ static inline bool amdgpu_ttm_tt_get_user_pages_done(struct 
ttm_tt *ttm)
 #endif
 
 void amdgpu_ttm_tt_set_user_pages(struct ttm_tt *ttm, struct page **pages);
+int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo,
+ uint64_t *user_addr);
 int amdgpu_ttm_tt_set_userptr(struct ttm_buffer_object *bo,
  uint64_t addr, uint32_t flags);
 bool amdgpu_ttm_tt_has_userptr(struct ttm_tt *ttm);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 09e2d30515e2..d548e6691d69 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -42,6 +42,7 @@
 #include "kfd_svm.h"
 #include "amdgpu_amdkfd.h"
 #include "kfd_smi_events.h"
+#include "amdgpu_object.h"
 
 static long kfd_ioctl(struct file *, unsigned int, unsigned long);
 static int kfd_open(struct inode *, struct file *);
@@ -1804,6 +1805,44 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
 }
 #endif
 
+static int criu_dump_process(struct kfd_process *p, struct 
kfd_ioctl_criu_dumper_args *args)
+{
+   int ret;
+   struct kfd_criu_process_bucket *process_bucket;
+   struct kfd_criu_process_priv_data *process_priv;
+
+   if (args->num_objects != 1) {
+   pr_err("Only 1 process supported\n");
+   return -EINVAL;
+   }
+
+   if (args->objects_size != sizeof(*process_bucket) + 
sizeof(*process_priv)) {
+   pr_err("Invalid objects size for process\n");
+   return -EINVAL;
+   }
+
+   process_bucket = kzalloc(args->objects_size, GFP_KERNEL);
+   if (!process_bucket)
+   return -ENOMEM;
+
+   /* Private data starts after process bucket */
+   process_priv = (void *)(process_bucket + 1);
+
+   process_priv->version = KFD_CRIU_PRIV_VERSION;
+
+   process_bucket->priv_data_offset = 0;
+   process_bucket->priv_data_size = sizeof(*process_priv);
+
+   ret = copy_to_user((void __user *)args->objects, process_bucket, 
args->objects_size);
+   if (ret) {
+   pr_err("Failed to copy process information to user\n");
+   ret = -EFAULT;
+   }
+
+   kfree(process_bucket);
+   return ret;
+}
+
 uint64_t get_process_num_bos(struct kfd_process *p)
 {
uint64_t num_of_bos = 0, i;
@@ -1824,12 +1863,151 @@ uint64_t get_process_num_bos(struct kfd_process *p)
return num_of_bos;
 }
 
+static int criu_dump_bos(struct kfd_process *p, struct 
kfd_ioctl_criu_dumper_args *args)
+{
+   struct kfd_criu_bo_bucket *bo_buckets;
+   struct kfd_criu_bo_priv_data *bo_privs;
+   uint64_t num_bos;
+
+   int ret = 0, pdd_index, bo_index = 0, id;
+   void *mem;
+
+   num_bos = get_process_num_bos(p);
+
+   if (args->num_objects != num_bos) {
+   pr_err("Mismatch with number of BOs (current:%lld user:%lld)\n",
+

[PATCH 03/18] drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

Checkpoint-Restore in userspace (CRIU) is a powerful tool that can
snapshot a running process and later restore it on the same or a remote
machine, but it expects the processes that have a device file (e.g. GPU)
associated with them to provide the necessary driver support to assist
CRIU and its extensible plugin interface. Thus, in order to support the
Checkpoint-Restore of any ROCm process, the AMD Radeon Open Compute
Kernel driver needs to provide a set of new APIs that provide the
necessary VRAM metadata and its contents to a userspace component
(CRIU plugin) that can store it in form of image files.

This introduces some new ioctls which will be used to checkpoint-restore
any KFD bound user process. KFD doesn't allow any arbitrary ioctl call
unless it is called by the group leader process. Since these ioctls are
expected to be called from a KFD CRIU plugin, which has elevated ptrace
privileges and CAP_SYS_ADMIN capabilities attached to its file
descriptors, modify KFD to allow such calls.

(API redesign suggested by Felix Kuehling and implemented by David Yat
Sin)

Signed-off-by: David Yat Sin 
Signed-off-by: Rajneesh Bhardwaj 
(cherry picked from commit 72f4907135aed9c037b9f442a6055b51733b518a)
(cherry picked from commit 33ff4953c5352f51d57a77ba8ae6614b7993e70d)
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  70 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  69 ++
 include/uapi/linux/kfd_ioctl.h   | 110 ++-
 3 files changed, 247 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 059c3f1ca27d..a1b60d29aae1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "kfd_priv.h"
@@ -1802,6 +1803,44 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
return -EPERM;
 }
 #endif
+static int kfd_ioctl_criu_dumper(struct file *filep,
+   struct kfd_process *p, void *data)
+{
+   pr_debug("Inside %s\n", __func__);
+
+   return 0;
+}
+
+static int kfd_ioctl_criu_restorer(struct file *filep,
+   struct kfd_process *p, void *data)
+{
+   pr_debug("Inside %s\n", __func__);
+
+   return 0;
+}
+
+static int kfd_ioctl_criu_pause(struct file *filep, struct kfd_process *p, 
void *data)
+{
+   pr_debug("Inside %s\n", __func__);
+
+   return 0;
+}
+
+static int kfd_ioctl_criu_resume(struct file *filep,
+   struct kfd_process *p, void *data)
+{
+   pr_debug("Inside %s\n", __func__);
+
+   return 0;
+}
+
+static int kfd_ioctl_criu_process_info(struct file *filep,
+   struct kfd_process *p, void *data)
+{
+   pr_debug("Inside %s\n", __func__);
+
+   return 0;
+}
 
 #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
@@ -1906,6 +1945,21 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
 
AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_XNACK_MODE,
kfd_ioctl_set_xnack_mode, 0),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_CRIU_DUMPER,
+kfd_ioctl_criu_dumper, KFD_IOC_FLAG_PTRACE_ATTACHED),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_CRIU_RESTORER,
+kfd_ioctl_criu_restorer, KFD_IOC_FLAG_ROOT_ONLY),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_CRIU_PROCESS_INFO,
+kfd_ioctl_criu_process_info, 
KFD_IOC_FLAG_PTRACE_ATTACHED),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_CRIU_RESUME,
+kfd_ioctl_criu_resume, KFD_IOC_FLAG_ROOT_ONLY),
+
+   AMDKFD_IOCTL_DEF(AMDKFD_IOC_CRIU_PAUSE,
+kfd_ioctl_criu_pause, KFD_IOC_FLAG_PTRACE_ATTACHED),
 };
 
 #define AMDKFD_CORE_IOCTL_COUNTARRAY_SIZE(amdkfd_ioctls)
@@ -1920,6 +1974,7 @@ static long kfd_ioctl(struct file *filep, unsigned int 
cmd, unsigned long arg)
char *kdata = NULL;
unsigned int usize, asize;
int retcode = -EINVAL;
+   bool ptrace_attached = false;
 
if (nr >= AMDKFD_CORE_IOCTL_COUNT)
goto err_i1;
@@ -1945,7 +2000,15 @@ static long kfd_ioctl(struct file *filep, unsigned int 
cmd, unsigned long arg)
 * processes need to create their own KFD device context.
 */
process = filep->private_data;
-   if (process->lead_thread != current->group_leader) {
+
+   rcu_read_lock();
+   if ((ioctl->flags & KFD_IOC_FLAG_PTRACE_ATTACHED) &&
+   ptrace_parent(process->lead_thread) == current)
+   ptrace_attached = true;
+   rcu_read_unlock();
+
+   if (process->lead_thread != current->group_leader
+   && !ptrace_attached) {
dev_dbg(kfd_device, "Using KFD FD in wrong process\n");
  

[PATCH 01/18] x86/configs: CRIU update release defconfig

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

Update rock-rel_defconfig for monolithic kernel release that enables
CRIU support with kfd.

Signed-off-by: Rajneesh Bhardwaj 
(cherry picked from commit 4a6d309a82648a23a4fc0add83013ac6db6187d5)
Signed-off-by: David Yat Sin 
---
 arch/x86/configs/rock-rel_defconfig | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/configs/rock-rel_defconfig 
b/arch/x86/configs/rock-rel_defconfig
index 16fe62276006..9c46bb890879 100644
--- a/arch/x86/configs/rock-rel_defconfig
+++ b/arch/x86/configs/rock-rel_defconfig
@@ -1045,6 +1045,11 @@ CONFIG_PACKET_DIAG=m
 CONFIG_UNIX=y
 CONFIG_UNIX_SCM=y
 CONFIG_UNIX_DIAG=m
+CONFIG_SMC_DIAG=y
+CONFIG_XDP_SOCKETS_DIAG=y
+CONFIG_INET_MPTCP_DIAG=y
+CONFIG_TIPC_DIAG=y
+CONFIG_VSOCKETS_DIAG=y
 # CONFIG_TLS is not set
 CONFIG_XFRM=y
 CONFIG_XFRM_ALGO=m
@@ -1089,7 +1094,7 @@ CONFIG_NET_FOU=m
 CONFIG_NET_FOU_IP_TUNNELS=y
 CONFIG_INET_AH=m
 CONFIG_INET_ESP=m
-# CONFIG_INET_ESP_OFFLOAD is not set
+CONFIG_INET_ESP_OFFLOAD=m
 # CONFIG_INET_ESPINTCP is not set
 CONFIG_INET_IPCOMP=m
 CONFIG_INET_XFRM_TUNNEL=m
@@ -1097,8 +1102,8 @@ CONFIG_INET_TUNNEL=m
 CONFIG_INET_DIAG=m
 CONFIG_INET_TCP_DIAG=m
 CONFIG_INET_UDP_DIAG=m
-# CONFIG_INET_RAW_DIAG is not set
-# CONFIG_INET_DIAG_DESTROY is not set
+CONFIG_INET_RAW_DIAG=m
+CONFIG_INET_DIAG_DESTROY=m
 CONFIG_TCP_CONG_ADVANCED=y
 CONFIG_TCP_CONG_BIC=m
 CONFIG_TCP_CONG_CUBIC=y
@@ -1126,7 +1131,7 @@ CONFIG_IPV6_ROUTE_INFO=y
 # CONFIG_IPV6_OPTIMISTIC_DAD is not set
 CONFIG_INET6_AH=m
 CONFIG_INET6_ESP=m
-# CONFIG_INET6_ESP_OFFLOAD is not set
+CONFIG_INET6_ESP_OFFLOAD=m
 # CONFIG_INET6_ESPINTCP is not set
 CONFIG_INET6_IPCOMP=m
 CONFIG_IPV6_MIP6=m
-- 
2.17.1



[PATCH 04/18] drm/amdkfd: CRIU Implement KFD process_info ioctl

2021-08-19 Thread David Yat Sin
From: Rajneesh Bhardwaj 

This IOCTL is expected to be called as a precursor to the actual
Checkpoint operation. It does the basic discovery of the target
process seized by CRIU and relays the information to the userspace that
utilizes it to start the Checkpoint operation via another dedicated
IOCTL.

The process_info IOCTL determines the number of GPUs and the buffer
objects that are associated with the target process, and its process id
in the caller's namespace, since the /proc/pid/mem interface may be used
to drain the contents of the discovered buffer objects in userspace and
getpid returns the pid of the CRIU dumper process. Also, the pid of a
process inside a container might be different than its global pid, so
return the ns pid.

Signed-off-by: Rajneesh Bhardwaj 
(cherry picked from commit b2fa92d0a8f1de51013cd6742b4996b38c285ffc)
(cherry picked from commit 8b44c466ce53162603cd8ae49624462902541a47)
Signed-off-by: David Yat Sin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 14 
 3 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index a1b60d29aae1..09e2d30515e2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1803,6 +1803,27 @@ static int kfd_ioctl_svm(struct file *filep, struct 
kfd_process *p, void *data)
return -EPERM;
 }
 #endif
+
+uint64_t get_process_num_bos(struct kfd_process *p)
+{
+   uint64_t num_of_bos = 0, i;
+
+   /* Run over all PDDs of the process */
+   for (i = 0; i < p->n_pdds; i++) {
+   struct kfd_process_device *pdd = p->pdds[i];
+   void *mem;
+   int id;
+
+   idr_for_each_entry(&pdd->alloc_idr, mem, id) {
+   struct kgd_mem *kgd_mem = (struct kgd_mem *)mem;
+
+   if ((uint64_t)kgd_mem->va > pdd->gpuvm_base)
+   num_of_bos++;
+   }
+   }
+   return num_of_bos;
+}
+
 static int kfd_ioctl_criu_dumper(struct file *filep,
struct kfd_process *p, void *data)
 {
@@ -1837,9 +1858,30 @@ static int kfd_ioctl_criu_resume(struct file *filep,
 static int kfd_ioctl_criu_process_info(struct file *filep,
struct kfd_process *p, void *data)
 {
+   struct kfd_ioctl_criu_process_info_args *args = data;
+   int ret = 0;
+
pr_debug("Inside %s\n", __func__);
+   mutex_lock(&p->mutex);
 
-   return 0;
+   if (!kfd_has_process_device_data(p)) {
+   pr_err("No pdd for given process\n");
+   ret = -ENODEV;
+   goto err_unlock;
+   }
+
+   args->task_pid = task_pid_nr_ns(p->lead_thread,
+   task_active_pid_ns(p->lead_thread));
+
+   args->process_priv_data_size = sizeof(struct 
kfd_criu_process_priv_data);
+
+   args->total_bos = get_process_num_bos(p);
+   args->bos_priv_data_size = args->total_bos * sizeof(struct 
kfd_criu_bo_priv_data);
+
+   dev_dbg(kfd_device, "Num of bos:%llu\n", args->total_bos);
+err_unlock:
+   mutex_unlock(&p->mutex);
+   return ret;
 }
 
 #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 768cc3fe95d2..4e390006b4b6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -932,6 +932,8 @@ void *kfd_process_device_translate_handle(struct 
kfd_process_device *p,
 void kfd_process_device_remove_obj_handle(struct kfd_process_device *pdd,
int handle);
 
+bool kfd_has_process_device_data(struct kfd_process *p);
+
 /* PASIDs */
 int kfd_pasid_init(void);
 void kfd_pasid_exit(void);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 9d4f527bda7c..bc133c3789d8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1359,6 +1359,20 @@ static int init_doorbell_bitmap(struct 
qcm_process_device *qpd,
return 0;
 }
 
+bool kfd_has_process_device_data(struct kfd_process *p)
+{
+   int i;
+
+   for (i = 0; i < p->n_pdds; i++) {
+   struct kfd_process_device *pdd = p->pdds[i];
+
+   if (pdd)
+   return true;
+   }
+
+   return false;
+}
+
 struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
struct kfd_process *p)
 {
-- 
2.17.1



[PATCH 00/18] CHECKPOINT RESTORE WITH ROCm

2021-08-19 Thread David Yat Sin
CRIU is a user space tool which is very popular for container live migration in
datacentres. It can checkpoint a running application, save its complete state,
memory contents and all system resources to images on disk which can be
migrated to another machine and restored later. More information on CRIU can be
found at https://criu.org/Main_Page

CRIU currently does not support Checkpoint / Restore with applications that
have device files open, so it cannot perform checkpoint and restore on GPU
devices, which are very complex and have their own privately managed VRAM.
CRIU, however, can support external devices by using a plugin architecture.
This patch series adds initial support for ROCm applications while we work on
the remaining features. We welcome some feedback, especially in regards to the
APIs, before involving a larger audience.

Our plugin code can be found at 
https://github.com/RadeonOpenCompute/criu/tree/criu-dev/plugins/amdgpu

We have tested the following scenarios:
-Checkpoint / Restore of a Pytorch (BERT) workload
-kfdtests with queues and events
-Gfx9 and Gfx10 based multi GPU test systems
-On baremetal and inside a docker container
-Restoring on a different system

David Yat Sin (9):
  drm/amdkfd: CRIU Implement KFD pause ioctl
  drm/amdkfd: CRIU add queues support
  drm/amdkfd: CRIU restore queue ids
  drm/amdkfd: CRIU restore sdma id for queues
  drm/amdkfd: CRIU restore queue doorbell id
  drm/amdkfd: CRIU dump and restore queue mqds
  drm/amdkfd: CRIU dump/restore queue control stack
  drm/amdkfd: CRIU dump and restore events
  drm/amdkfd: CRIU implement gpu_id remapping

Rajneesh Bhardwaj (9):
  x86/configs: CRIU update release defconfig
  x86/configs: CRIU update debug rock defconfig
  drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs
  drm/amdkfd: CRIU Implement KFD process_info ioctl
  drm/amdkfd: CRIU Implement KFD dumper ioctl
  drm/amdkfd: CRIU Implement KFD restore ioctl
  drm/amdkfd: CRIU Implement KFD resume ioctl
  Revert "drm/amdgpu: Remove verify_access shortcut for KFD BOs"
  drm/amdkfd: CRIU export kfd bos as prime dmabuf objects

 arch/x86/configs/rock-dbg_defconfig   |   53 +-
 arch/x86/configs/rock-rel_defconfig   |   13 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|5 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   51 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   27 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |2 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c  | 1730 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  187 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |   14 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  254 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  |   11 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  |   76 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  |   78 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   |   86 +
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   |   77 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |  140 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |   69 +-
 .../amd/amdkfd/kfd_process_queue_manager.c|   72 +-
 include/uapi/linux/kfd_ioctl.h|  110 +-
 20 files changed, 2743 insertions(+), 314 deletions(-)

-- 
2.17.1



Re: [PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()

2021-08-19 Thread Christoph Hellwig
On Fri, Aug 13, 2021 at 11:59:23AM -0500, Tom Lendacky wrote:
> +static inline bool prot_guest_has(unsigned int attr)

No real need to have this inline.  In fact I'd suggest we have the
prototype in a common header so that everyone must implement it out
of line.


Re: [PATCH v2 02/12] mm: Introduce a function to check for virtualization protection features

2021-08-19 Thread Christoph Hellwig
On Fri, Aug 13, 2021 at 11:59:21AM -0500, Tom Lendacky wrote:
> +#define PATTR_MEM_ENCRYPT0   /* Encrypted memory */
> +#define PATTR_HOST_MEM_ENCRYPT   1   /* Host encrypted 
> memory */
> +#define PATTR_GUEST_MEM_ENCRYPT  2   /* Guest encrypted 
> memory */
> +#define PATTR_GUEST_PROT_STATE   3   /* Guest encrypted 
> state */

Please write an actual detailed explanation of what these mean, that
is, what implications they have on the kernel.



Re: [PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()

2021-08-19 Thread Christoph Hellwig
On Fri, Aug 13, 2021 at 11:59:22AM -0500, Tom Lendacky wrote:
> While the name suggests this is intended mainly for guests, it will
> also be used for host memory encryption checks in place of sme_active().

Which suggest that the name is not good to start with.  Maybe protected
hardware, system or platform might be a better choice?

> +static inline bool prot_guest_has(unsigned int attr)
> +{
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + if (sme_me_mask)
> + return amd_prot_guest_has(attr);
> +#endif
> +
> + return false;
> +}

Shouldn't this be entirely out of line?

> +/* 0x800 - 0x8ff reserved for AMD */
> +#define PATTR_SME0x800
> +#define PATTR_SEV0x801
> +#define PATTR_SEV_ES 0x802

Why do we need reservations for a purely in-kernel namespace?

And why are you overloading a brand new generic API with weird details
of a specific implementation like this?


Re: [PATCH] drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)

2021-08-19 Thread Mike Lothian
Hi

Do I need to open a new bug report for this?

Cheers

Mike

On Wed, 18 Aug 2021 at 06:26, Andrey Grodzovsky 
wrote:

>
> On 2021-08-02 1:16 a.m., Guchun Chen wrote:
> > In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop
> > scheduler in s3 test, otherwise, fence related failure will arrive
> > after resume. To fix this and for a better clean up, move drm_sched_fini
> > from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and
> > should never be called in hw_fini.
> >
> > v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init,
> > to keep sw_init and sw_fini paired.
> >
> > Fixes: cd87a6dcf6af ("drm/amdgpu: adjust fence driver enable sequence")
> > Suggested-by: Christian König 
> > Signed-off-by: Guchun Chen 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  5 ++---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 12 +++-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  4 ++--
> >   3 files changed, 11 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index b1d2dc39e8be..9e53ff851496 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3646,9 +3646,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >
> >   fence_driver_init:
> >   /* Fence driver */
> > - r = amdgpu_fence_driver_init(adev);
> > + r = amdgpu_fence_driver_sw_init(adev);
> >   if (r) {
> > - dev_err(adev->dev, "amdgpu_fence_driver_init failed\n");
> > + dev_err(adev->dev, "amdgpu_fence_driver_sw_init failed\n");
> >   amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_FENCE_INIT_FAIL,
> 0, 0);
> >   goto failed;
> >   }
> > @@ -3988,7 +3988,6 @@ int amdgpu_device_resume(struct drm_device *dev,
> bool fbcon)
> >   }
> >   amdgpu_fence_driver_hw_init(adev);
> >
> > -
> >   r = amdgpu_device_ip_late_init(adev);
> >   if (r)
> >   return r;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > index 49c5c7331c53..7495911516c2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > @@ -498,7 +498,7 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring
> *ring,
> >   }
> >
> >   /**
> > - * amdgpu_fence_driver_init - init the fence driver
> > + * amdgpu_fence_driver_sw_init - init the fence driver
> >* for all possible rings.
> >*
> >* @adev: amdgpu device pointer
> > @@ -509,13 +509,13 @@ int amdgpu_fence_driver_init_ring(struct
> amdgpu_ring *ring,
> >* amdgpu_fence_driver_start_ring().
> >* Returns 0 for success.
> >*/
> > -int amdgpu_fence_driver_init(struct amdgpu_device *adev)
> > +int amdgpu_fence_driver_sw_init(struct amdgpu_device *adev)
> >   {
> >   return 0;
> >   }
> >
> >   /**
> > - * amdgpu_fence_driver_fini - tear down the fence driver
> > + * amdgpu_fence_driver_hw_fini - tear down the fence driver
> >* for all possible rings.
> >*
> >* @adev: amdgpu device pointer
> > @@ -531,8 +531,7 @@ void amdgpu_fence_driver_hw_fini(struct
> amdgpu_device *adev)
> >
> >   if (!ring || !ring->fence_drv.initialized)
> >   continue;
> > - if (!ring->no_scheduler)
> > - drm_sched_fini(&ring->sched);
> > +
> >   /* You can't wait for HW to signal if it's gone */
> >   if (!drm_dev_is_unplugged(&adev->ddev))
> >   r = amdgpu_fence_wait_empty(ring);
>
>
> Sorry for the late notice, I missed this patch. By moving drm_sched_fini
> past amdgpu_fence_wait_empty a race is created: even after you have waited
> for all fences on the ring to signal, the sw scheduler will keep submitting
> new jobs on the ring, so the ring won't stay empty.
>
> For hot device removal we also want to prevent any access to HW past PCI
> removal, in order not to do any MMIO accesses inside the physical MMIO
> range that no longer belongs to this device after its removal by the PCI
> core. Stopping all the schedulers prevents any MMIO accesses done during
> job submissions, and that is why drm_sched_fini was done as part of
> amdgpu_fence_driver_hw_fini and not amdgpu_fence_driver_sw_fini.
>
> Andrey
>
> > @@ -560,6 +559,9 @@ void amdgpu_fence_driver_sw_fini(struct
> amdgpu_device *adev)
> >   if (!ring || !ring->fence_drv.initialized)
> >   continue;
> >
> > + if (!ring->no_scheduler)
> > + drm_sched_fini(&ring->sched);
> > +
> >   for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
> >   dma_fence_put(ring->fence_drv.fences[j]);
> >   kfree(ring->fence_drv.fences);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > index 27adffa7658d..9

RE: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-19 Thread Liu, Monk
[AMD Official Use Only]

Hi Daniel

>> Why can't we stop the scheduler thread first, so that there's guaranteed no 
>> race? I've recently had a lot of discussions with panfrost folks about their 
>> reset that spans across engines, and without stopping the scheduler thread 
>> first before you touch anything it's just plain impossible.

Yeah, we had this thought in our mind as well.

Our second approach is to call kthread_park() in the job_timedout() routine so that
the "bad" job is guaranteed not to be touched or freed by the scheduler.
Please check this sample patch as well:

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..50a49cb 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,17 +319,12 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+   kthread_park(sched->thread);
spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
 
if (job) {
-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
-*/
-   list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
 
status = job->sched->ops->timedout_job(job);
@@ -345,6 +340,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
} else {
spin_unlock(&sched->job_list_lock);
}
+   kthread_unpark(sched->thread);
 
if (status != DRM_GPU_SCHED_STAT_ENODEV) {
spin_lock(&sched->job_list_lock);
@@ -393,20 +389,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
struct drm_sched_job *bad)
kthread_park(sched->thread);
 
/*
-* Reinsert back the bad job here - now it's safe as
-* drm_sched_get_cleanup_job cannot race against us and release the
-* bad job at this point - we parked (waited for) any in progress
-* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-* now until the scheduler thread is unparked.
-*/
-   if (bad && bad->sched == sched)
-   /*
-* Add at the head of the queue to reflect it was the earliest
-* job extracted.
-*/
-   list_add(&bad->list, &sched->pending_list);
-
-   /*
 * Iterate the job list from later to  earlier one and either deactive
 * their HW callbacks or remove them from pending list if they already
 * signaled.


Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Thursday, August 19, 2021 5:31 PM
To: Grodzovsky, Andrey 
Cc: Daniel Vetter ; Alex Deucher ; 
Chen, JingWen ; Maling list - DRI developers 
; amd-gfx list 
; Liu, Monk ; Koenig, 
Christian 
Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
> > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
> > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
> > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
> > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote:
> > > > > 
> > > > > > + dri-devel
> > > > > > 
> > > > > > Since scheduler is a shared component, please add dri-devel 
> > > > > > on all scheduler patches.
> > > > > > 
> > > > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen 
> > > > > >  wrote:
> > > > > > > [Why]
> > > > > > > For a bailing job, this commit will delete it from the pending
> > > > > > > list, thus the bailing job will never have a chance to be
> > > > > > > resubmitted, even in advance TDR mode.
> > > > > > > 
> > > > > > > [How]
> > > > > > > After embedding the hw_fence into amdgpu_job is done, the race
> > > > > > > condition that this commit tries to work around is
> > > > > > > completely solved. So revert this commit.
> > > > > > > This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.
> > > > > > > v2:
> > > > > > > add dma_fence_get/put() around timedout_job to avoid 
> > > > > > > concurrent delete during processing timedout_job
> > > > > > > 
> > > > > > > Signed-off-by: Jingwen Chen 
> > > > > > > ---
> > > > > > > drivers/gpu/drm/scheduler/sched_main.c | 23 
> > > > > > > +--
> > > > > > > 1 file changed, 5 insertions(+), 18 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/dr

Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-19 Thread Daniel Vetter
On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
> > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
> > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
> > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
> > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote:
> > > > > 
> > > > > > + dri-devel
> > > > > > 
> > > > > > Since scheduler is a shared component, please add dri-devel on all
> > > > > > scheduler patches.
> > > > > > 
> > > > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen 
> > > > > >  wrote:
> > > > > > > [Why]
> > > > > > > For a bailing job, this commit will delete it from the pending
> > > > > > > list, thus the bailing job will never have a chance to be
> > > > > > > resubmitted, even in advance TDR mode.
> > > > > > > 
> > > > > > > [How]
> > > > > > > After embedding the hw_fence into amdgpu_job is done, the race
> > > > > > > condition that this commit tries to work around is completely
> > > > > > > solved. So revert this commit.
> > > > > > > This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.
> > > > > > > v2:
> > > > > > > add dma_fence_get/put() around timedout_job to avoid concurrent 
> > > > > > > delete
> > > > > > > during processing timedout_job
> > > > > > > 
> > > > > > > Signed-off-by: Jingwen Chen 
> > > > > > > ---
> > > > > > > drivers/gpu/drm/scheduler/sched_main.c | 23 
> > > > > > > +--
> > > > > > > 1 file changed, 5 insertions(+), 18 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > > > > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > index a2a953693b45..f9b9b3aefc4a 100644
> > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct 
> > > > > > > work_struct *work)
> > > > > > > {
> > > > > > >struct drm_gpu_scheduler *sched;
> > > > > > >struct drm_sched_job *job;
> > > > > > > +   struct dma_fence *fence;
> > > > > > >enum drm_gpu_sched_stat status = 
> > > > > > > DRM_GPU_SCHED_STAT_NOMINAL;
> > > > > > > 
> > > > > > >sched = container_of(work, struct drm_gpu_scheduler, 
> > > > > > > work_tdr.work);
> > > > > > > @@ -325,11 +326,10 @@ static void drm_sched_job_timedout(struct 
> > > > > > > work_struct *work)
> > > > > > > 
> > > > > > >if (job) {
> > > > > > >/*
> > > > > > > -* Remove the bad job so it cannot be freed by 
> > > > > > > concurrent
> > > > > > > -* drm_sched_cleanup_jobs. It will be reinserted 
> > > > > > > back after sched->thread
> > > > > > > -* is parked at which point it's safe.
> > > > > > > +* Get job->s_fence->parent here to avoid 
> > > > > > > concurrent delete during
> > > > > > > +* processing timedout_job
> > > > > > > */
> > > > > > > -   list_del_init(&job->list);
> > > > > > > +   fence = dma_fence_get(job->s_fence->parent);
> > > > > While this is true for amdgpu, it has no meaning for other drivers 
> > > > > for whom
> > > > > we haven't
> > > > > done the refactoring of embedding HW fence (parent) into the job 
> > > > > structure.
> > > > > In fact thinking
> > > > > about it, unless you do the HW fence embedding for all the drivers 
> > > > > using the
> > > > > scheduler you cannot
> > > > > revert this patch or you will just break them.
> > > > btw, why did you do that embedding? I do still have my patches with
> > > > dma_fence annotations floating around, but my idea at least was to fix
> > > > that issue with a mempool, not with embeddeding. What was the motivation
> > > > for embedding the hw fence?
> > > > -Daniel
> > > 
> > > The motivation was twofold: avoid memory allocation during job
> > > submission (HW fence allocation), because as Christian explained this
> > > leads to deadlock with mm code during evictions due to memory pressure
> > > (Christian can clarify if I messed
> > Yeah that's the exact same thing I've chased with my dma_fence
> > annotations, but thus far zero to none interested in getting it sorted. I
> > think it'd be good to have some cross-driver agreement on how this should
> > be solved before someone just charges ahead ...
> > 
> > > this explanation). Second is to exactly revert this patch, because
> > > while it solved the issue described in the patch, it created another
> > > with drivers who bailed out early during TDR handling for various
> > > reasons, and the job would just leak because it was already removed
> > > from the pending list.
> > Can't we reinsert it before we restart the scheduler thread? It might need
> > a separate list for that due to the lockless qu

Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-19 Thread Daniel Vetter
On Thu, Aug 19, 2021 at 03:01:26AM +, Liu, Monk wrote:
> [AMD Official Use Only]
> 
> Hi Andrey and Daniel
> 
> We worked for a really long time on this new feature for AMD that finally
> can pick out the bad job from all the timed-out ones, and the change in the
> scheduler (get/put fence in drm_sched_job_timedout, and removing the bad
> job delete and put-back) is the last piece for us.
> 
> We understand that after the "bad job list node delete logic" is removed
> from job_timedout, race issues are introduced if a vendor's job_timeout
> callback accesses the bad job in parallel with the scheduler doing
> "sched->ops->free_job(cleanup_job)".
> 
> And to not introduce any impact on those vendors, I'd like to propose a
> very simple change (which introduces a new bool member in the scheduler
> to indicate whether the del/put-back logic is needed or not); check the
> patch below:

If everyone operates like that then the shared code becomes a massive mess
of incompatible options and turns unmaintainable. I don't think that's a
good path forward.
-Daniel

> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 47ea468..5e0bdc4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -495,6 +495,8 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring 
> *ring,
>   return r;
>   }
>  
> + ring->sched.keep_bad_job = true;
> +
>   return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 92d8de2..e7ac384 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
>  {
>   struct drm_gpu_scheduler *sched;
>   struct drm_sched_job *job;
> + struct dma_fence *f = NULL;
>  
>   sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>  
> @@ -328,7 +329,11 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
>* drm_sched_cleanup_jobs. It will be reinserted back after 
> sched->thread
>* is parked at which point it's safe.
>*/
> - list_del_init(&job->list);
> + if (sched->keep_bad_job == false)
> + list_del_init(&job->list);
> + else
> + f = dma_fence_get(job->s_fence->parent);//get parent 
> fence here to prevent hw_fence dropping to zero due to sched-main's 
> cleanup_jobs, for amdgpu once parent fence drop to zero the sched_job will be 
> kfree-ed 
> +
>   spin_unlock(&sched->job_list_lock);
>  
>   job->sched->ops->timedout_job(job);
> @@ -341,6 +346,8 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
>   job->sched->ops->free_job(job);
>   sched->free_guilty = false;
>   }
> +
> + dma_fence_put(f);
>   } else {
>   spin_unlock(&sched->job_list_lock);
>   }
> @@ -396,7 +403,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> struct drm_sched_job *bad)
>* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
>* now until the scheduler thread is unparked.
>*/
> - if (bad && bad->sched == sched)
> + if (bad && bad->sched == sched && sched->keep_bad_job == false)
>   /*
>* Add at the head of the queue to reflect it was the earliest
>* job extracted.
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 4ea8606..5f9a640 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -301,6 +301,7 @@ struct drm_gpu_scheduler {
>   atomic_t_score;
>   boolready;
>   boolfree_guilty;
> + bool keep_bad_job;
>  };
>  
>  int drm_sched_init(struct drm_gpu_scheduler *sched,
> 
> 
> Thanks 
> 
> --
> Monk Liu | Cloud-GPU Core team
> --
> 
> -Original Message-
> From: Daniel Vetter  
> Sent: Wednesday, August 18, 2021 10:43 PM
> To: Grodzovsky, Andrey 
> Cc: Daniel Vetter ; Alex Deucher ; 
> Chen, JingWen ; Maling list - DRI developers 
> ; amd-gfx list 
> ; Liu, Monk ; Koenig, 
> Christian 
> Subject: Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
> 
> On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
> > 
> > On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
> > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
> > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote:
> > > > 
> > > > > + dri-devel
> > > > > 
> > > > > Since scheduler is a shared component, please add dri-devel on 
> > > > > all scheduler patches.
> > > > > 
>