Re: [Nouveau] [PATCH drm-next v5 03/14] drm: manager to keep track of GPUs VA mappings

2023-06-23 Thread Christian König

On 6/23/23 15:55, Danilo Krummrich wrote:

[SNIP]

How do you efficiently find only the mappings of a BO in one VM?


Actually, I think this case should be even more efficient than with 
a BO having a list of GPUVAs (or mappings):


*than with a BO having a list of VMs:



Having a list of GPUVAs per GEM, each GPUVA has a pointer to its 
VM. Hence, you'd only need to iterate the list of mappings for a 
given BO and check the mapping's VM pointer.


Yeah, and that is extremely time consuming if you have tons of 
mappings in different VMs.
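
For illustration, a minimal sketch of the scan in question; the va->mgr 
pointer and the gem.entry list member are assumed names for this sketch, 
not necessarily the final API:

#include <linux/list.h>
#include <drm/drm_gem.h>
#include <drm/drm_gpuva_mgr.h> /* header name assumed from this series */

/* Walks *every* mapping of @obj, across all VMs, just to filter by @mgr. */
static void sketch_bo_mappings_in_vm(struct drm_gem_object *obj,
                                     struct drm_gpuva_manager *mgr)
{
    struct drm_gpuva *va;

    /* gem.entry is an assumed name for the GEM-list node in drm_gpuva. */
    list_for_each_entry(va, &obj->gpuva.list, gem.entry) {
        if (va->mgr != mgr)
            continue; /* mapping belongs to another VM */

        /* operate on this mapping of @obj in @mgr */
    }
}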




Having a list of VMs per BO, you'd have to iterate the whole VM to 
find the mappings that point to the given BO, right?


No, you don't seem to understand what I'm suggesting.

Currently you have a list of mappings attached to the BO, so when you 
need to make sure that a specific BO is up to date in a specific VM 
you either need to iterate over the VM or the BO. Neither of those is 
a good idea.


What you need is a representation of the data used for each BO+VM 
combination. In other words, another indirection that allows you to 
handle all the mappings of a BO inside a VM at once.


Ok, after having a quick look at amdgpu, I can see what you mean.

The missing piece for me was that the BO+VM abstraction itself keeps a 
list of mappings for this specific BO and VM.


Just to make it obvious for other people following the discussion, let 
me quickly sketch how this approach would look for the GPUVA 
manager:


1. We would need a new structure to represent the BO+VM combination, 
something like:


struct drm_gpuva_mgr_gem {
    struct drm_gpuva_manager *mgr;
    struct drm_gem_object *obj;
    struct list_head gpuva_list;
};

with a less horrible name, hopefully.

2. Create an instance of struct drm_gpuva_mgr_gem once a GEM becomes 
associated with a GPUVA manager (VM) and attach it to the GEM's 
existing "gpuva" list.


In amdgpu, for example, this seems to be the case once a GEM object is 
opened, since there is one VM per file_priv.


However, for other drivers this could be different, hence drivers 
would need to take care of this.


Yes, exactly that.




3. Attach GPUVAs to the new gpuva_list of the corresponding instance of
struct drm_gpuva_mgr_gem.

4. Drivers would need to clean up the instance of struct 
drm_gpuva_mgr_gem, once the GEM is not associated with the GPUVA 
manager anymore.


As pointed out by Christian, this would optimize the "get all mappings 
backed by a specific BO from a given VM" use case.
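
For illustration, a minimal sketch of that lookup; it assumes the 
drm_gpuva_mgr_gem additionally carries a list node (here called 
list_entry) linking it into the GEM's gpuva list, and the function name 
is made up:

/* Find the BO+VM combination for @obj in @mgr, if any. */
static struct drm_gpuva_mgr_gem *
sketch_mgr_gem_find(struct drm_gem_object *obj,
                    struct drm_gpuva_manager *mgr)
{
    struct drm_gpuva_mgr_gem *mgr_gem;

    /*
     * Assumes drm_gpuva_mgr_gem gained a list node (list_entry) that
     * links it into the GEM's gpuva list, which then holds one entry
     * per VM rather than one per mapping.
     */
    list_for_each_entry(mgr_gem, &obj->gpuva.list, list_entry) {
        if (mgr_gem->mgr == mgr)
            return mgr_gem;
    }

    return NULL;
}

All mappings of the BO in that VM are then reachable through 
mgr_gem->gpuva_list, without ever touching mappings in other VMs.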


The question for me is: do drivers other than amdgpu commonly need this?


I have no idea.



And what does amdgpu need this for? Maybe amdgpu does something we're 
not doing (yet)?


Basically when we do a CS we need to make sure that the VM used by this 
CS is up to date. For this we walk over the relocation list of BOs and 
check the status of each BO+VM structure.


This is done because we don't want to update all VMs at the same time, 
but rather just those that need the update.
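
For illustration, a rough sketch of what such a walk could look like on 
top of the drm_gpuva_mgr_gem sketch above; the needs_update flag, the 
list member names and sketch_mgr_gem_find() are hypothetical, not 
amdgpu's actual API:

static int sketch_validate_vm(struct drm_gpuva_manager *mgr,
                              struct drm_gem_object **objs,
                              unsigned int count)
{
    unsigned int i;

    for (i = 0; i < count; i++) {
        struct drm_gpuva_mgr_gem *mgr_gem;
        struct drm_gpuva *va;

        mgr_gem = sketch_mgr_gem_find(objs[i], mgr);

        /* needs_update is a hypothetical per-BO+VM "is stale" flag. */
        if (!mgr_gem || !mgr_gem->needs_update)
            continue; /* this BO is already up to date in this VM */

        /* Only the mappings of this BO in this VM get re-mapped. */
        list_for_each_entry(va, &mgr_gem->gpuva_list, gem.entry) {
            /* driver-specific page table update for @va */
        }

        mgr_gem->needs_update = false;
    }

    return 0;
}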




Christian - I know you didn't ask for "do it the way amdgpu does"; 
instead you voted for keeping it entirely driver specific. But I think 
everyone is pretty close, and I'm still optimistic that we could just 
generalize this.


Well, you should *not* necessarily do it like amdgpu does! Basically, the 
implementation in amdgpu was driven by requirements, e.g. "we need that, 
so let's do it like this".


It's perfectly possible that other requirements (e.g. focus on Vulkan) 
lead to a completely different implementation.


It's just that ideally I would like to have an implementation where I 
can apply at least the basics to amdgpu as well.


Regards,
Christian.



- Danilo





I'd think that a single VM potentially has more mapping entries 
than a single BO has mappings across multiple VMs.


Another case to consider is the one I originally had in mind when 
choosing this relationship: finding all mappings for a given BO, 
which I guess all drivers need to do in order to invalidate 
mappings on BO eviction.
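
For illustration, a minimal sketch of that eviction path on top of the 
GEM's gpuva list; the gem.entry member name is assumed, and the actual 
invalidation is left to the driver:

/* Invalidate every mapping of @obj, across all VMs it is mapped in. */
static void sketch_gem_evict(struct drm_gem_object *obj)
{
    struct drm_gpuva *va;

    dma_resv_assert_held(obj->resv);

    /* gem.entry is an assumed name for the GEM-list node in drm_gpuva. */
    list_for_each_entry(va, &obj->gpuva.list, gem.entry) {
        /* driver-specific invalidation of @va, e.g. zap its PTEs */
    }
}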


Having a list of VMs per BO, wouldn't you need to iterate all of 
the VMs entirely?


No, see how amdgpu works.

Regards,
Christian.







Re: [Nouveau] [PATCH drm-next v5 03/14] drm: manager to keep track of GPUs VA mappings

2023-06-23 Thread Danilo Krummrich

On 6/23/23 09:16, Christian König wrote:

On 6/22/23 17:07, Danilo Krummrich wrote:

On 6/22/23 17:04, Danilo Krummrich wrote:

On 6/22/23 16:42, Christian König wrote:

On 6/22/23 16:22, Danilo Krummrich wrote:

On 6/22/23 15:54, Christian König wrote:

On 6/20/23 14:23, Danilo Krummrich wrote:

Hi Christian,

On 6/20/23 08:45, Christian König wrote:

Hi Danilo,

sorry for the delayed reply. I'm trying to dig myself out of a 
hole at the moment.


No worries, thank you for taking a look anyway!



On 6/20/23 02:42, Danilo Krummrich wrote:

[SNIP]
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index bbc721870c13..5ec8148a30ee 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -36,6 +36,8 @@
  #include 
  #include 
+#include 
+#include 
  #include 
@@ -379,6 +381,18 @@ struct drm_gem_object {
   */
  struct dma_resv _resv;
+    /**
+ * @gpuva:
+ *
+ * Provides the list of GPU VAs attached to this GEM object.
+ *
+ * Drivers should lock list accesses with the GEMs &dma_resv lock
+ * (&drm_gem_object.resv).
+ */
+    struct {
+    struct list_head list;
+    } gpuva;
+
  /**
   * @funcs:
   *


I'm pretty sure that it's not a good idea to attach this 
directly to the GEM object.


Why do you think so? IMHO having a common way to connect mappings 
to their backing buffers is a good thing, since every driver 
needs this connection anyway.


E.g. when a BO gets evicted, drivers can just iterate the list of 
mappings and, as the circumstances require, invalidate the 
corresponding mappings or unmap all existing mappings of a 
given buffer.


What would be the advantage of letting every driver implement a 
driver-specific way of keeping this connection?


Flexibility. For example, on amdgpu the mappings of a BO are grouped 
by VM address space.


E.g. the BO points to multiple bo_vm structures which in turn have 
lists of their mappings.


Isn't this (almost) the same relationship I introduce with the 
GPUVA manager?


If you switched over to the GPUVA manager right now, every GEM would 
have a list of its mappings (the gpuva list). The 
mapping is represented by struct drm_gpuva (of course embedded in 
driver-specific structure(s)), which has a pointer to the VM address 
space it is part of, namely the GPUVA manager instance. And the 
GPUVA manager keeps a maple tree of its mappings as well.
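
For illustration, a minimal sketch of such a driver-side embedding; the 
struct and member names are made up:

/* Made-up driver structure embedding the common drm_gpuva. */
struct foo_vma {
    /* common part: VA range, backing GEM and pointer to its VM (manager) */
    struct drm_gpuva base;

    /* driver-specific state, e.g. page table bookkeeping */
    bool invalidated;
};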


If you would still like to *directly* keep a list of GPUVA managers 
(VM address spaces) per GEM (indirectly you already have that 
relationship), you could still do that in a driver-specific way.


Am I missing something?


How do you efficiently find only the mappings of a BO in one VM?


Actually, I think this case should be even more efficient than with a 
BO having a list of GPUVAs (or mappings):


*than with a BO having a list of VMs:



Having a list of GPUVAs per GEM, each GPUVA has a pointer to its VM. 
Hence, you'd only need to iterate the list of mappings for a given BO 
and check the mapping's VM pointer.


Yeah, and that is extremely time consuming if you have tons of mappings 
in different VMs.




Having a list of VMs per BO, you'd have to iterate the whole VM to 
find the mappings that point to the given BO, right?


No, you don't seem to understand what I'm suggesting.

Currently you have a list of mappings attached to the BO, so when you 
need to make sure that a specific BO is up to date in a specific VM you 
either need to iterate over the VM or the BO. Neither of those is a good 
idea.


What you need is a representation of the data used for each BO+VM 
combination. In other words, another indirection that allows you to 
handle all the mappings of a BO inside a VM at once.


Ok, after having a quick look at amdgpu, I can see what you mean.

The missing piece for me was that the BO+VM abstraction itself keeps a 
list of mappings for this specific BO and VM.


Just to make it obvious for other people following the discussion, let 
me quickly sketch how this approach would look for the GPUVA 
manager:


1. We would need a new structure to represent the BO+VM combination, 
something like:


struct drm_gpuva_mgr_gem {
    struct drm_gpuva_manager *mgr;
    struct drm_gem_object *obj;
    struct list_head gpuva_list;
};

with a less horrible name, hopefully.

2. Create an instance of struct drm_gpuva_mgr_gem once a GEM becomes 
associated with a GPUVA manager (VM) and attach it to the GEM's 
existing "gpuva" list.


In amdgpu, for example, this seems to be the case once a GEM object is 
opened, since there is one VM per file_priv.


However, for other drivers this could be different, hence drivers would 
need to take care of this.



3. Attach GPUVAs to the new gpuva_list of the corresponding instance of
struct drm_gpuva_mgr_gem.

4. Drivers would need to clean up the instance of struct 
drm_gpuva_mgr_gem, once the GEM is not associated with the GPUVA 
manager anymore.

Re: [Nouveau] [PATCH drm-next v5 03/14] drm: manager to keep track of GPUs VA mappings

2023-06-23 Thread Christian König

On 6/22/23 17:07, Danilo Krummrich wrote:

On 6/22/23 17:04, Danilo Krummrich wrote:

On 6/22/23 16:42, Christian König wrote:

On 6/22/23 16:22, Danilo Krummrich wrote:

On 6/22/23 15:54, Christian König wrote:

On 6/20/23 14:23, Danilo Krummrich wrote:

Hi Christian,

On 6/20/23 08:45, Christian König wrote:

Hi Danilo,

sorry for the delayed reply. I'm trying to dig myself out of a 
hole at the moment.


No worries, thank you for taking a look anyway!



On 6/20/23 02:42, Danilo Krummrich wrote:

[SNIP]
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index bbc721870c13..5ec8148a30ee 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -36,6 +36,8 @@
  #include 
  #include 
+#include 
+#include 
  #include 
@@ -379,6 +381,18 @@ struct drm_gem_object {
   */
  struct dma_resv _resv;
+    /**
+ * @gpuva:
+ *
+ * Provides the list of GPU VAs attached to this GEM object.
+ *
+ * Drivers should lock list accesses with the GEMs &dma_resv lock
+ * (&drm_gem_object.resv).
+ */
+    struct {
+    struct list_head list;
+    } gpuva;
+
  /**
   * @funcs:
   *


I'm pretty sure that it's not a good idea to attach this 
directly to the GEM object.


Why do you think so? IMHO having a common way to connect mappings 
to their backing buffers is a good thing, since every driver 
needs this connection anyway.


E.g. when a BO gets evicted, drivers can just iterate the list of 
mappings and, as the circumstances require, invalidate the 
corresponding mappings or unmap all existing mappings of a 
given buffer.


What would be the advantage of letting every driver implement a 
driver-specific way of keeping this connection?


Flexibility. For example, on amdgpu the mappings of a BO are grouped 
by VM address space.


E.g. the BO points to multiple bo_vm structures which in turn have 
lists of their mappings.


Isn't this (almost) the same relationship I introduce with the 
GPUVA manager?


If you switched over to the GPUVA manager right now, every GEM would 
have a list of its mappings (the gpuva list). The 
mapping is represented by struct drm_gpuva (of course embedded in 
driver-specific structure(s)), which has a pointer to the VM address 
space it is part of, namely the GPUVA manager instance. And the 
GPUVA manager keeps a maple tree of its mappings as well.


If you would still like to *directly* keep a list of GPUVA managers 
(VM address spaces) per GEM (indirectly you already have that 
relationship), you could still do that in a driver-specific way.


Am I missing something?


How do you efficiently find only the mappings of a BO in one VM?


Actually, I think this case should be even more efficient than with a 
BO having a list of GPUVAs (or mappings):


*than with a BO having a list of VMs:



Having a list of GPUVAs per GEM, each GPUVA has a pointer to its VM. 
Hence, you'd only need to iterate the list of mappings for a given BO 
and check the mapping's VM pointer.


Yeah, and that is extremely time consuming if you have tons of mappings 
in different VMs.




Having a list of VMs per BO, you'd have to iterate the whole VM to 
find the mappings that point to the given BO, right?


No, you don't seem to understand what I'm suggesting.

Currently you have a list of mappings attached to the BO, so when you 
need to make sure that a specific BO is up to date in a specific VM you 
either need to iterate over the VM or the BO. Neither of those is a good 
idea.


What you need is a representation of the data used for each BO+VM 
combination. In other words, another indirection that allows you to 
handle all the mappings of a BO inside a VM at once.




I'd think that a single VM potentially has more mapping entries than 
a single BO has mappings across multiple VMs.


Another case to consider is the one I originally had in mind when 
choosing this relationship: finding all mappings for a given BO, 
which I guess all drivers need to do in order to invalidate mappings 
on BO eviction.


Having a list of VMs per BO, wouldn't you need to iterate all of the 
VMs entirely?


No, see how amdgpu works.

Regards,
Christian.