Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-03-16 Thread Danilo Krummrich

Hi Oded,

sorry for the late response, somehow this mail slipped through.

On 2/6/23 15:48, Oded Gabbay wrote:

On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost  wrote:

Is this not an application issue? Millions of mappings seems a bit
absurd to me.

If I look at the most extreme case for AI, assuming 256GB of HBM
memory and a mapping granularity of 2MB, we get to 128K mappings. But
that's really the extreme case imo. I assume most mappings will be much
larger. In fact, in the most realistic scenario of large-scale
training, a single user will probably map the entire HBM memory using
1GB pages.

I also have a question: could this GPUVA code manage VA range
mappings for userptr mappings, assuming we work without svm/uva/usm
(pointer-is-a-pointer)? Because then we are talking about possible
4KB mappings of 1 - 1.5 TB of host server RAM. (Implied in my question
is the assumption that this can also be used for non-VK use-cases.
Please tell me if I'm totally wrong here.)
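
For a rough sense of the scale in question, a small stand-alone
calculation using the sizes quoted above (taken as binary units):

#include <stdio.h>

int main(void)
{
	unsigned long long hbm        = 256ULL  << 30; /* 256 GiB of HBM            */
	unsigned long long big_page   = 2ULL    << 20; /* 2 MiB mapping granularity */
	unsigned long long host_ram   = 1536ULL << 30; /* ~1.5 TiB of host RAM      */
	unsigned long long small_page = 4ULL    << 10; /* 4 KiB mapping granularity */

	/* 131072 mappings, i.e. the ~128K figure above */
	printf("HBM @ 2 MiB:     %llu mappings\n", hbm / big_page);
	/* ~400 million mappings in the 4 KiB userptr case */
	printf("userptr @ 4 KiB: %llu mappings\n", host_ram / small_page);
	return 0;
}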


In V2 I switched from drm_mm to a maple tree, which should improve the 
handling of lots of entries. I also dropped the requirement for GPUVA 
entries to be backed by a valid GEM object.


I think it can be used for non-VK use-cases. It basically just keeps 
track of mappings (it does not allocate them in the sense of finding a 
hole and providing a base address for a given size). There are basic 
functions to insert and remove entries. For these basic functions it is 
ensured that colliding entries can't be inserted and that only a specific 
given entry can be removed, rather than, e.g., an arbitrary range.
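
As a minimal, self-contained model of those basic semantics (plain
userspace C for illustration only, not the drm_gpuva code itself; the
fixed array is just a stand-in for the tree the real manager uses):

#include <stdbool.h>
#include <stdio.h>

struct va_entry { unsigned long long addr, range; };

static struct va_entry entries[64];
static unsigned int nr_entries;

static bool overlaps(const struct va_entry *a, const struct va_entry *b)
{
	return a->addr < b->addr + b->range && b->addr < a->addr + a->range;
}

/* Insertion is refused if the new entry collides with anything tracked. */
static int va_insert(struct va_entry e)
{
	if (nr_entries == 64)
		return -1;
	for (unsigned int i = 0; i < nr_entries; i++)
		if (overlaps(&entries[i], &e))
			return -1;
	entries[nr_entries++] = e;
	return 0;
}

/* Only an exact, previously inserted entry can be removed. */
static int va_remove(struct va_entry e)
{
	for (unsigned int i = 0; i < nr_entries; i++) {
		if (entries[i].addr == e.addr && entries[i].range == e.range) {
			entries[i] = entries[--nr_entries];
			return 0;
		}
	}
	return -1;
}

int main(void)
{
	struct va_entry a = { 0x1000, 0x2000 };
	struct va_entry b = { 0x2000, 0x1000 };

	printf("insert a: %d\n", va_insert(a)); /*  0: fits             */
	printf("insert b: %d\n", va_insert(b)); /* -1: collides with a  */
	printf("remove a: %d\n", va_remove(a)); /*  0: exact match only */
	return 0;
}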


There are also more advanced functions where users of the GPUVA manager 
can request to "force map" a new mapping or to unmap a given range. The 
GPUVA manager will figure out the (sub-)operations needed to make this 
happen (e.g. remove mappings that are in the way, split up mappings, etc.) 
and provide these operations (or steps) to the caller to process, either 
through callbacks or through a list of operations.
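
The "force map" case can be pictured with a tiny model as well: given one
existing, overlapping mapping and a new request, the manager hands the
driver a sequence of steps rather than touching anything itself. The step
names printed below are made up for the example; they only mirror the idea
of unmap/split/map (sub-)operations described above.

#include <stdio.h>

struct range { unsigned long long addr, range; };

/*
 * "Force map" a request over a single pre-existing, overlapping mapping
 * and print the steps a driver would be asked to process: drop the old
 * mapping, re-create its non-overlapped remainders, then create the
 * requested one.
 */
static void force_map(struct range old, struct range req)
{
	unsigned long long old_end = old.addr + old.range;
	unsigned long long req_end = req.addr + req.range;

	printf("unmap: 0x%llx +0x%llx\n", old.addr, old.range);
	if (req.addr > old.addr)      /* left remainder survives  */
		printf("map:   0x%llx +0x%llx\n", old.addr, req.addr - old.addr);
	if (req_end < old_end)        /* right remainder survives */
		printf("map:   0x%llx +0x%llx\n", req_end, old_end - req_end);
	printf("map:   0x%llx +0x%llx\n", req.addr, req.range);
}

int main(void)
{
	struct range old = { 0x0000, 0x4000 }; /* existing mapping        */
	struct range req = { 0x1000, 0x1000 }; /* overlapping "force map" */

	force_map(old, req);
	return 0;
}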


Are there any other use-cases or features you could think of that would 
be beneficial for accelerators?


- Danilo



Thanks,
Oded





Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-02-06 Thread Oded Gabbay
On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost  wrote:
>
> On Thu, Jan 19, 2023 at 05:04:32AM +0100, Danilo Krummrich wrote:
> > On 1/18/23 20:48, Christian König wrote:
> > > Am 18.01.23 um 20:17 schrieb Dave Airlie:
> > > > On Thu, 19 Jan 2023 at 02:54, Alex Deucher  
> > > > wrote:
> > > > > On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich
> > > > >  wrote:
> > > > > >
> > > > > >
> > > > > > On 1/18/23 17:30, Alex Deucher wrote:
> > > > > > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich
> > > > > > >  wrote:
> > > > > > > > On 1/18/23 16:37, Christian König wrote:
> > > > > > > > > Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> > > > > > > > > > Hi Christian,
> > > > > > > > > >
> > > > > > > > > > On 1/18/23 09:53, Christian König wrote:
> > > > > > > > > > > Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> > > > > > > > > > > > This patch series provides a new UAPI for the Nouveau 
> > > > > > > > > > > > driver in
> > > > > > > > > > > > order to
> > > > > > > > > > > > support Vulkan features, such as
> > > > > > > > > > > > sparse bindings and sparse
> > > > > > > > > > > > residency.
> > > > > > > > > > > >
> > > > > > > > > > > > Furthermore, with the DRM GPUVA
> > > > > > > > > > > > manager it provides a new DRM core
> > > > > > > > > > > > feature to
> > > > > > > > > > > > keep track of GPU virtual address
> > > > > > > > > > > > (VA) mappings in a more generic way.
> > > > > > > > > > > >
> > > > > > > > > > > > The DRM GPUVA manager is indented to help drivers 
> > > > > > > > > > > > implement
> > > > > > > > > > > > userspace-manageable
> > > > > > > > > > > > GPU VA spaces in reference to the Vulkan API. In order 
> > > > > > > > > > > > to achieve
> > > > > > > > > > > > this goal it
> > > > > > > > > > > > serves the following purposes in this context.
> > > > > > > > > > > >
> > > > > > > > > > > >1) Provide a dedicated range allocator to track 
> > > > > > > > > > > > GPU VA
> > > > > > > > > > > > allocations and
> > > > > > > > > > > >   mappings, making use of the drm_mm range 
> > > > > > > > > > > > allocator.
> > > > > > > > > > > This means that the ranges are allocated
> > > > > > > > > > > by the kernel? If yes that's
> > > > > > > > > > > a really really bad idea.
> > > > > > > > > > No, it's just for keeping track of the
> > > > > > > > > > ranges userspace has allocated.
> > > > > > > > > Ok, that makes more sense.
> > > > > > > > >
> > > > > > > > > So basically you have an IOCTL which asks kernel
> > > > > > > > > for a free range? Or
> > > > > > > > > what exactly is the drm_mm used for here?
> > > > > > > > Not even that, userspace provides both the base
> > > > > > > > address and the range,
> > > > > > > > the kernel really just keeps track of things.
> > > > > > > > Though, writing a UAPI on
> > > > > > > > top of the GPUVA manager asking for a free range instead would 
> > > > > > > > be
> > > > > > > > possible by just adding the corresponding wrapper functions to 
> > > > > > > > get a
> > > > > > > > free hole.
> > > > > > > >
> > > > > > > > Currently, and that's what I think I read out of
> > > > > > > > your question, the main
> > > > > > > > benefit of using drm_mm over simply stuffing the
> > > > > > > > entries into a list or
> > > > > > > > something boils down to easier collision detection and iterating
> > > > > > > > sub-ranges of the whole VA space.
> > > > > > > Why not just do this in userspace?  We have a range manager in
> > > > > > > libdrm_amdgpu that you could lift out into libdrm or some other
> > > > > > > helper.
> > > > > > The kernel still needs to keep track of the mappings within the 
> > > > > > various
> > > > > > VA spaces, e.g. it silently needs to unmap mappings that are backed 
> > > > > > by
> > > > > > BOs that get evicted and remap them once they're validated (or 
> > > > > > swapped
> > > > > > back in).
> > > > > Ok, you are just using this for maintaining the GPU VM space in
> > > > > the kernel.
> > > > >
> > > > Yes the idea behind having common code wrapping drm_mm for this is to
> > > > allow us to make the rules consistent across drivers.
> > > >
> > > > Userspace (generally Vulkan, some compute) has interfaces that pretty
> > > > much dictate a lot of how VMA tracking works, esp around lifetimes,
> > > > sparse mappings and splitting/merging underlying page tables, I'd
> > > > really like this to be more consistent across drivers, because already
> > > > I think we've seen with freedreno some divergence from amdgpu and we
> > > > also have i915/xe to deal with. I'd like to at least have one place
> > > > that we can say this is how it should work, since this is something
> > > > that *should* be consistent across drivers mostly, as it is more about
> > > > how the uapi is exposed.
> > >
> > > That's a really good idea, but the implementation with drm_mm won't work
> > > like that.
> > >
> > > We have Vulkan applications which use the sparse feature to create
> > > literally millions of mappings.

drm_gpuva_manager requirements (was Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI)

2023-01-19 Thread Christian König

Am 19.01.23 um 06:23 schrieb Matthew Brost:

[SNIP]

Userspace (generally Vulkan, some compute) has interfaces that pretty
much dictate a lot of how VMA tracking works, esp around lifetimes,
sparse mappings and splitting/merging underlying page tables, I'd
really like this to be more consistent across drivers, because already
I think we've seen with freedreno some divergence from amdgpu and we
also have i915/xe to deal with. I'd like to at least have one place
that we can say this is how it should work, since this is something
that *should* be consistent across drivers mostly, as it is more about
how the uapi is exposed.

That's a really good idea, but the implementation with drm_mm won't work
like that.

We have Vulkan applications which use the sparse feature to create
literally millions of mappings. That's why I have fine tuned the mapping

Is this not an application issue? Millions of mappings seems a bit
absurd to me.


That's unfortunately how some games are designed these days.


structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle
possible in the handling of that.

We might need to do a bit of work here in Xe, as our xe_vma structure is
quite big since we currently use it as a dumping ground for various features.


We have done that as well and it turned out to be a bad idea. At one 
point we added some power management information into the mapping 
structure, but quickly reverted that.



That's valuable information. Can you recommend such an application for
testing / benchmarking?


Also interested.


One of the most demanding ones is Forza Horizon 5. The general approach 
of that game seems to be to allocate 64GiB of address space (which equals 
16 million 4KiB pages) and then mmap() whatever data it needs into that 
self-managed space, assuming that every 4KiB page is individually 
mappable to a different location.
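
For perspective, combined with the structure sizes mentioned elsewhere in
this thread: 64 GiB / 4 KiB = 16,777,216 potential mappings. In the worst
case, where every page really is mapped individually, that is roughly
1.3 GiB of tracking structures at ~80 bytes per mapping, and over 3 GiB
at the ~200 bytes of a drm_mm_node. (Worst-case arithmetic only; in
practice neighbouring mappings merge.)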



Your optimization effort sounds great. Might it be worth thinking about
generalizing your approach by itself and stacking the drm_gpuva_manager on
top of it?


FWIW, Xe is on board with the drm_gpuva_manager effort; we basically
open-code all of this right now. I'd like to port over to the
drm_gpuva_manager ASAP so we can contribute and help find a viable
solution for all of us.


Sounds good. I haven't looked into the drm_gpuva_manager code yet, but a 
few design notes I've learned from amdgpu:


Separate address space management (drm_mm) from page table management. 
In other words, when an application asks for 64GiB of free address space 
you don't look into the page table structures, but rather into a 
separate drm_mm instance. In amdgpu we even moved the latter into 
userspace, but the general takeaway is that you have only a handful of 
address space requests while you have tons of mapping/unmapping requests.


Separate the tracking structure into two, one for each BO+VM combination 
(we call that amdgpu_bo_va) and one for each mapping (called 
amdgpu_bo_va_mapping). We unfortunately use that for our hw-dependent 
state machine as well, so it isn't easily generalizable.


I've gone back and forth on merging VMAs and then not merging them again. 
Not merging them can save quite a bit of overhead, but results in many 
more mappings for some use cases.
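
A compact way to picture both suggestions (the type and field names below
are invented for illustration; amdgpu's real structures are the
amdgpu_bo_va and amdgpu_bo_va_mapping named above):

#include <stdint.h>

/* One instance per VM: address *space* management is kept small and
 * separate -- it only answers "give me N GiB of free VA", is queried
 * rarely, and never needs to look at page tables or mappings.        */
struct example_vm {
	uint64_t va_start, va_size;        /* stand-in for a drm_mm /
	                                      userspace allocator         */
	struct example_bo_va *bo_va_list;
};

/* One instance per BO+VM combination (cf. amdgpu_bo_va): per-buffer
 * state shared by all mappings of that buffer in this VM.            */
struct example_bo_va {
	void *bo;                          /* the backing buffer object   */
	struct example_bo_va_mapping *mappings;
	struct example_bo_va *next;
};

/* One instance per mapping (cf. amdgpu_bo_va_mapping): this is the
 * structure that can exist millions of times, so it is kept as small
 * as possible -- and whether adjacent entries get merged decides how
 * many of these you end up with.                                     */
struct example_bo_va_mapping {
	uint64_t start, last;              /* VA range covered            */
	uint64_t offset;                   /* offset into the BO          */
	struct example_bo_va_mapping *next;
};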


Regards,
Christian.



Matt
  

A drm_mm_node is more in the range of ~200 bytes and certainly not
suitable for this kind of job.

I strongly suggest rather using a good chunk of the amdgpu VM code as a
blueprint for the common infrastructure.

I will definitely have a look.


Regards,
Christian.


Dave.




Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Matthew Brost
On Thu, Jan 19, 2023 at 05:04:32AM +0100, Danilo Krummrich wrote:
> On 1/18/23 20:48, Christian König wrote:
> > Am 18.01.23 um 20:17 schrieb Dave Airlie:
> > > On Thu, 19 Jan 2023 at 02:54, Alex Deucher  wrote:
> > > > On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich
> > > >  wrote:
> > > > > 
> > > > > 
> > > > > On 1/18/23 17:30, Alex Deucher wrote:
> > > > > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich
> > > > > >  wrote:
> > > > > > > On 1/18/23 16:37, Christian König wrote:
> > > > > > > > Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> > > > > > > > > Hi Christian,
> > > > > > > > > 
> > > > > > > > > On 1/18/23 09:53, Christian König wrote:
> > > > > > > > > > Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> > > > > > > > > > > This patch series provides a new UAPI for the Nouveau 
> > > > > > > > > > > driver in
> > > > > > > > > > > order to
> > > > > > > > > > > support Vulkan features, such as
> > > > > > > > > > > sparse bindings and sparse
> > > > > > > > > > > residency.
> > > > > > > > > > > 
> > > > > > > > > > > Furthermore, with the DRM GPUVA
> > > > > > > > > > > manager it provides a new DRM core
> > > > > > > > > > > feature to
> > > > > > > > > > > keep track of GPU virtual address
> > > > > > > > > > > (VA) mappings in a more generic way.
> > > > > > > > > > > 
> > > > > > > > > > > The DRM GPUVA manager is indented to help drivers 
> > > > > > > > > > > implement
> > > > > > > > > > > userspace-manageable
> > > > > > > > > > > GPU VA spaces in reference to the Vulkan API. In order to 
> > > > > > > > > > > achieve
> > > > > > > > > > > this goal it
> > > > > > > > > > > serves the following purposes in this context.
> > > > > > > > > > > 
> > > > > > > > > > >    1) Provide a dedicated range allocator to track 
> > > > > > > > > > > GPU VA
> > > > > > > > > > > allocations and
> > > > > > > > > > >   mappings, making use of the drm_mm range 
> > > > > > > > > > > allocator.
> > > > > > > > > > This means that the ranges are allocated
> > > > > > > > > > by the kernel? If yes that's
> > > > > > > > > > a really really bad idea.
> > > > > > > > > No, it's just for keeping track of the
> > > > > > > > > ranges userspace has allocated.
> > > > > > > > Ok, that makes more sense.
> > > > > > > > 
> > > > > > > > So basically you have an IOCTL which asks kernel
> > > > > > > > for a free range? Or
> > > > > > > > what exactly is the drm_mm used for here?
> > > > > > > Not even that, userspace provides both the base
> > > > > > > address and the range,
> > > > > > > the kernel really just keeps track of things.
> > > > > > > Though, writing a UAPI on
> > > > > > > top of the GPUVA manager asking for a free range instead would be
> > > > > > > possible by just adding the corresponding wrapper functions to 
> > > > > > > get a
> > > > > > > free hole.
> > > > > > > 
> > > > > > > Currently, and that's what I think I read out of
> > > > > > > your question, the main
> > > > > > > benefit of using drm_mm over simply stuffing the
> > > > > > > entries into a list or
> > > > > > > something boils down to easier collision detection and iterating
> > > > > > > sub-ranges of the whole VA space.
> > > > > > Why not just do this in userspace?  We have a range manager in
> > > > > > libdrm_amdgpu that you could lift out into libdrm or some other
> > > > > > helper.
> > > > > The kernel still needs to keep track of the mappings within the 
> > > > > various
> > > > > VA spaces, e.g. it silently needs to unmap mappings that are backed by
> > > > > BOs that get evicted and remap them once they're validated (or swapped
> > > > > back in).
> > > > Ok, you are just using this for maintaining the GPU VM space in
> > > > the kernel.
> > > > 
> > > Yes the idea behind having common code wrapping drm_mm for this is to
> > > allow us to make the rules consistent across drivers.
> > > 
> > > Userspace (generally Vulkan, some compute) has interfaces that pretty
> > > much dictate a lot of how VMA tracking works, esp around lifetimes,
> > > sparse mappings and splitting/merging underlying page tables, I'd
> > > really like this to be more consistent across drivers, because already
> > > I think we've seen with freedreno some divergence from amdgpu and we
> > > also have i915/xe to deal with. I'd like to at least have one place
> > > that we can say this is how it should work, since this is something
> > > that *should* be consistent across drivers mostly, as it is more about
> > > how the uapi is exposed.
> > 
> > That's a really good idea, but the implementation with drm_mm won't work
> > like that.
> > 
> > We have Vulkan applications which use the sparse feature to create
> > literally millions of mappings. That's why I have fine tuned the mapping

Is this not an application issue? Millions of mappings seems a bit
absurd to me.

> > structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle
> > possible in the handling of that.

We might need to do a bit of work here in Xe as our xe_vma 

Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Danilo Krummrich

On 1/18/23 20:48, Christian König wrote:

Am 18.01.23 um 20:17 schrieb Dave Airlie:

On Thu, 19 Jan 2023 at 02:54, Alex Deucher  wrote:
On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich  
wrote:



On 1/18/23 17:30, Alex Deucher wrote:
On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich  
wrote:

On 1/18/23 16:37, Christian König wrote:

Am 18.01.23 um 16:34 schrieb Danilo Krummrich:

Hi Christian,

On 1/18/23 09:53, Christian König wrote:

Am 18.01.23 um 07:12 schrieb Danilo Krummrich:

This patch series provides a new UAPI for the Nouveau driver in
order to
support Vulkan features, such as sparse bindings and sparse 
residency.


Furthermore, with the DRM GPUVA manager it provides a new DRM 
core

feature to
keep track of GPU virtual address (VA) mappings in a more 
generic way.


The DRM GPUVA manager is intended to help drivers implement 
userspace-manageable
GPU VA spaces in reference to the Vulkan API. In order to achieve
this goal it
serves the following purposes in this context.

   1) Provide a dedicated range allocator to track GPU VA
allocations and
  mappings, making use of the drm_mm range allocator.
This means that the ranges are allocated by the kernel? If yes 
that's

a really really bad idea.
No, it's just for keeping track of the ranges userspace has 
allocated.

Ok, that makes more sense.

So basically you have an IOCTL which asks kernel for a free 
range? Or

what exactly is the drm_mm used for here?
Not even that, userspace provides both the base address and the 
range,
the kernel really just keeps track of things. Though, writing a 
UAPI on

top of the GPUVA manager asking for a free range instead would be
possible by just adding the corresponding wrapper functions to get a
free hole.

Currently, and that's what I think I read out of your question, 
the main
benefit of using drm_mm over simply stuffing the entries into a 
list or

something boils down to easier collision detection and iterating
sub-ranges of the whole VA space.

Why not just do this in userspace?  We have a range manager in
libdrm_amdgpu that you could lift out into libdrm or some other
helper.

The kernel still needs to keep track of the mappings within the various
VA spaces, e.g. it silently needs to unmap mappings that are backed by
BOs that get evicted and remap them once they're validated (or swapped
back in).
Ok, you are just using this for maintaining the GPU VM space in the 
kernel.



Yes the idea behind having common code wrapping drm_mm for this is to
allow us to make the rules consistent across drivers.

Userspace (generally Vulkan, some compute) has interfaces that pretty
much dictate a lot of how VMA tracking works, esp around lifetimes,
sparse mappings and splitting/merging underlying page tables, I'd
really like this to be more consistent across drivers, because already
I think we've seen with freedreno some divergence from amdgpu and we
also have i915/xe to deal with. I'd like to at least have one place
that we can say this is how it should work, since this is something
that *should* be consistent across drivers mostly, as it is more about
how the uapi is exposed.


That's a really good idea, but the implementation with drm_mm won't work 
like that.


We have Vulkan applications which use the sparse feature to create 
literally millions of mappings. That's why I have fine tuned the mapping 
structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle 
possible in the handling of that.


That's valuable information. Can you recommend such an application for 
testing / benchmarking?


Your optimization effort sounds great. Might it be worth thinking about 
generalizing your approach by itself and stacking the drm_gpuva_manager 
on top of it?




A drm_mm_node is more in the range of ~200 bytes and certainly not 
suitable for this kind of job.


I strongly suggest rather using a good chunk of the amdgpu VM code as a 
blueprint for the common infrastructure.


I will definitely have a look.



Regards,
Christian.



Dave.






Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Christian König

Am 18.01.23 um 20:17 schrieb Dave Airlie:

On Thu, 19 Jan 2023 at 02:54, Alex Deucher  wrote:

On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich  wrote:



On 1/18/23 17:30, Alex Deucher wrote:

On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich  wrote:

On 1/18/23 16:37, Christian König wrote:

Am 18.01.23 um 16:34 schrieb Danilo Krummrich:

Hi Christian,

On 1/18/23 09:53, Christian König wrote:

Am 18.01.23 um 07:12 schrieb Danilo Krummrich:

This patch series provides a new UAPI for the Nouveau driver in
order to
support Vulkan features, such as sparse bindings and sparse residency.

Furthermore, with the DRM GPUVA manager it provides a new DRM core
feature to
keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement
userspace-manageable
GPU VA spaces in reference to the Vulkan API. In order to achieve
this goal it
serves the following purposes in this context.

   1) Provide a dedicated range allocator to track GPU VA
allocations and
  mappings, making use of the drm_mm range allocator.

This means that the ranges are allocated by the kernel? If yes that's
a really really bad idea.

No, it's just for keeping track of the ranges userspace has allocated.

Ok, that makes more sense.

So basically you have an IOCTL which asks kernel for a free range? Or
what exactly is the drm_mm used for here?

Not even that, userspace provides both the base address and the range,
the kernel really just keeps track of things. Though, writing a UAPI on
top of the GPUVA manager asking for a free range instead would be
possible by just adding the corresponding wrapper functions to get a
free hole.

Currently, and that's what I think I read out of your question, the main
benefit of using drm_mm over simply stuffing the entries into a list or
something boils down to easier collision detection and iterating
sub-ranges of the whole VA space.

Why not just do this in userspace?  We have a range manager in
libdrm_amdgpu that you could lift out into libdrm or some other
helper.

The kernel still needs to keep track of the mappings within the various
VA spaces, e.g. it silently needs to unmap mappings that are backed by
BOs that get evicted and remap them once they're validated (or swapped
back in).

Ok, you are just using this for maintaining the GPU VM space in the kernel.


Yes the idea behind having common code wrapping drm_mm for this is to
allow us to make the rules consistent across drivers.

Userspace (generally Vulkan, some compute) has interfaces that pretty
much dictate a lot of how VMA tracking works, esp around lifetimes,
sparse mappings and splitting/merging underlying page tables, I'd
really like this to be more consistent across drivers, because already
I think we've seen with freedreno some divergence from amdgpu and we
also have i915/xe to deal with. I'd like to at least have one place
that we can say this is how it should work, since this is something
that *should* be consistent across drivers mostly, as it is more about
how the uapi is exposed.


That's a really good idea, but the implementation with drm_mm won't work 
like that.


We have Vulkan applications which use the sparse feature to create 
literally millions of mappings. That's why I have fine tuned the mapping 
structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle 
possible in the handling of that.


A drm_mm_node is more in the range of ~200 bytes and certainly not 
suitable for this kind of job.


I strongly suggest rather using a good chunk of the amdgpu VM code as a 
blueprint for the common infrastructure.


Regards,
Christian.



Dave.




Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Dave Airlie
On Thu, 19 Jan 2023 at 02:54, Alex Deucher  wrote:
>
> On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich  wrote:
> >
> >
> >
> > On 1/18/23 17:30, Alex Deucher wrote:
> > > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich  wrote:
> > >>
> > >> On 1/18/23 16:37, Christian König wrote:
> > >>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> >  Hi Christian,
> > 
> >  On 1/18/23 09:53, Christian König wrote:
> > > Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> > >> This patch series provides a new UAPI for the Nouveau driver in
> > >> order to
> > >> support Vulkan features, such as sparse bindings and sparse 
> > >> residency.
> > >>
> > >> Furthermore, with the DRM GPUVA manager it provides a new DRM core
> > >> feature to
> > >> keep track of GPU virtual address (VA) mappings in a more generic 
> > >> way.
> > >>
> > >> The DRM GPUVA manager is indented to help drivers implement
> > >> userspace-manageable
> > >> GPU VA spaces in reference to the Vulkan API. In order to achieve
> > >> this goal it
> > >> serves the following purposes in this context.
> > >>
> > >>   1) Provide a dedicated range allocator to track GPU VA
> > >> allocations and
> > >>  mappings, making use of the drm_mm range allocator.
> > >
> > > This means that the ranges are allocated by the kernel? If yes that's
> > > a really really bad idea.
> > 
> >  No, it's just for keeping track of the ranges userspace has allocated.
> > >>>
> > >>> Ok, that makes more sense.
> > >>>
> > >>> So basically you have an IOCTL which asks kernel for a free range? Or
> > >>> what exactly is the drm_mm used for here?
> > >>
> > >> Not even that, userspace provides both the base address and the range,
> > >> the kernel really just keeps track of things. Though, writing a UAPI on
> > >> top of the GPUVA manager asking for a free range instead would be
> > >> possible by just adding the corresponding wrapper functions to get a
> > >> free hole.
> > >>
> > >> Currently, and that's what I think I read out of your question, the main
> > >> benefit of using drm_mm over simply stuffing the entries into a list or
> > >> something boils down to easier collision detection and iterating
> > >> sub-ranges of the whole VA space.
> > >
> > > Why not just do this in userspace?  We have a range manager in
> > > libdrm_amdgpu that you could lift out into libdrm or some other
> > > helper.
> >
> > The kernel still needs to keep track of the mappings within the various
> > VA spaces, e.g. it silently needs to unmap mappings that are backed by
> > BOs that get evicted and remap them once they're validated (or swapped
> > back in).
>
> Ok, you are just using this for maintaining the GPU VM space in the kernel.
>

Yes the idea behind having common code wrapping drm_mm for this is to
allow us to make the rules consistent across drivers.

Userspace (generally Vulkan, some compute) has interfaces that pretty
much dictate a lot of how VMA tracking works, esp around lifetimes,
sparse mappings and splitting/merging underlying page tables, I'd
really like this to be more consistent across drivers, because already
I think we've seen with freedreno some divergence from amdgpu and we
also have i915/xe to deal with. I'd like to at least have one place
that we can say this is how it should work, since this is something
that *should* be consistent across drivers mostly, as it is more about
how the uapi is exposed.

Dave.


Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Alex Deucher
On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich  wrote:
>
>
>
> On 1/18/23 17:30, Alex Deucher wrote:
> > On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich  wrote:
> >>
> >> On 1/18/23 16:37, Christian König wrote:
> >>> Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
>  Hi Christian,
> 
>  On 1/18/23 09:53, Christian König wrote:
> > Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
> >> This patch series provides a new UAPI for the Nouveau driver in
> >> order to
> >> support Vulkan features, such as sparse bindings and sparse residency.
> >>
> >> Furthermore, with the DRM GPUVA manager it provides a new DRM core
> >> feature to
> >> keep track of GPU virtual address (VA) mappings in a more generic way.
> >>
> >> The DRM GPUVA manager is indented to help drivers implement
> >> userspace-manageable
> >> GPU VA spaces in reference to the Vulkan API. In order to achieve
> >> this goal it
> >> serves the following purposes in this context.
> >>
> >>   1) Provide a dedicated range allocator to track GPU VA
> >> allocations and
> >>  mappings, making use of the drm_mm range allocator.
> >
> > This means that the ranges are allocated by the kernel? If yes that's
> > a really really bad idea.
> 
>  No, it's just for keeping track of the ranges userspace has allocated.
> >>>
> >>> Ok, that makes more sense.
> >>>
> >>> So basically you have an IOCTL which asks kernel for a free range? Or
> >>> what exactly is the drm_mm used for here?
> >>
> >> Not even that, userspace provides both the base address and the range,
> >> the kernel really just keeps track of things. Though, writing a UAPI on
> >> top of the GPUVA manager asking for a free range instead would be
> >> possible by just adding the corresponding wrapper functions to get a
> >> free hole.
> >>
> >> Currently, and that's what I think I read out of your question, the main
> >> benefit of using drm_mm over simply stuffing the entries into a list or
> >> something boils down to easier collision detection and iterating
> >> sub-ranges of the whole VA space.
> >
> > Why not just do this in userspace?  We have a range manager in
> > libdrm_amdgpu that you could lift out into libdrm or some other
> > helper.
>
> The kernel still needs to keep track of the mappings within the various
> VA spaces, e.g. it silently needs to unmap mappings that are backed by
> BOs that get evicted and remap them once they're validated (or swapped
> back in).

Ok, you are just using this for maintaining the GPU VM space in the kernel.

Alex

>
> >
> > Alex
> >
> >
> >>
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> 
>  - Danilo
> 
> >
> > Regards,
> > Christian.
> >
> >>
> >>   2) Generically connect GPU VA mappings to their backing
> >> buffers, in
> >>  particular DRM GEM objects.
> >>
> >>   3) Provide a common implementation to perform more complex 
> >> mapping
> >>  operations on the GPU VA space. In particular splitting and
> >> merging
> >>  of GPU VA mappings, e.g. for intersecting mapping requests
> >> or partial
> >>  unmap requests.
> >>
> >> The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager,
> >> itself
> >> providing the following new interfaces.
> >>
> >>   1) Initialize a GPU VA space via the new
> >> DRM_IOCTL_NOUVEAU_VM_INIT ioctl
> >>  for UMDs to specify the portion of VA space managed by the
> >> kernel and
> >>  userspace, respectively.
> >>
> >>   2) Allocate and free a VA space region as well as bind and
> >> unbind memory
> >>  to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND
> >> ioctl.
> >>
> >>   3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC 
> >> ioctl.
> >>
> >> Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use
> >> of the DRM
> >> scheduler to queue jobs and support asynchronous processing with DRM
> >> syncobjs
> >> as synchronization mechanism.
> >>
> >> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
> >> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
> >>
> >> The new VM_BIND UAPI for Nouveau makes also use of drm_exec
> >> (execution context
> >> for GEM buffers) by Christian König. Since the patch implementing
> >> drm_exec was
> >> not yet merged into drm-next it is part of this series, as well as a
> >> small fix
> >> for this patch, which was found while testing this series.
> >>
> >> This patch series is also available at [1].
> >>
> >> There is a Mesa NVK merge request by Dave Airlie [2] implementing the
> >> corresponding userspace parts for this series.
> >>
> >> The Vulkan CTS test suite passes the sparse binding and sparse

Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Danilo Krummrich




On 1/18/23 17:30, Alex Deucher wrote:

On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich  wrote:


On 1/18/23 16:37, Christian König wrote:

Am 18.01.23 um 16:34 schrieb Danilo Krummrich:

Hi Christian,

On 1/18/23 09:53, Christian König wrote:

Am 18.01.23 um 07:12 schrieb Danilo Krummrich:

This patch series provides a new UAPI for the Nouveau driver in
order to
support Vulkan features, such as sparse bindings and sparse residency.

Furthermore, with the DRM GPUVA manager it provides a new DRM core
feature to
keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement
userspace-manageable
GPU VA spaces in reference to the Vulkan API. In order to achieve
this goal it
serves the following purposes in this context.

  1) Provide a dedicated range allocator to track GPU VA
allocations and
 mappings, making use of the drm_mm range allocator.


This means that the ranges are allocated by the kernel? If yes that's
a really really bad idea.


No, it's just for keeping track of the ranges userspace has allocated.


Ok, that makes more sense.

So basically you have an IOCTL which asks kernel for a free range? Or
what exactly is the drm_mm used for here?


Not even that, userspace provides both the base address and the range,
the kernel really just keeps track of things. Though, writing a UAPI on
top of the GPUVA manager asking for a free range instead would be
possible by just adding the corresponding wrapper functions to get a
free hole.

Currently, and that's what I think I read out of your question, the main
benefit of using drm_mm over simply stuffing the entries into a list or
something boils down to easier collision detection and iterating
sub-ranges of the whole VA space.


Why not just do this in userspace?  We have a range manager in
libdrm_amdgpu that you could lift out into libdrm or some other
helper.


The kernel still needs to keep track of the mappings within the various 
VA spaces, e.g. it silently needs to unmap mappings that are backed by 
BOs that get evicted and remap them once they're validated (or swapped 
back in).




Alex






Regards,
Christian.



- Danilo



Regards,
Christian.



  2) Generically connect GPU VA mappings to their backing
buffers, in
 particular DRM GEM objects.

  3) Provide a common implementation to perform more complex mapping
 operations on the GPU VA space. In particular splitting and
merging
 of GPU VA mappings, e.g. for intersecting mapping requests
or partial
 unmap requests.

The new Nouveau VM_BIND UAPI builds on top of the DRM GPUVA manager,
itself
providing the following new interfaces.

  1) Initialize a GPU VA space via the new
DRM_IOCTL_NOUVEAU_VM_INIT ioctl
 for UMDs to specify the portion of VA space managed by the
kernel and
 userspace, respectively.

  2) Allocate and free a VA space region as well as bind and
unbind memory
 to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND
ioctl.

  3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.

Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use
of the DRM
scheduler to queue jobs and support asynchronous processing with DRM
syncobjs
as synchronization mechanism.

By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
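
Read as a whole, the intended flow looks roughly like the sketch below.
The helpers are invented stand-ins for the three ioctls named above; their
argument lists, the struct layouts, and the handle values are placeholders
for illustration and are not the actual uapi from this series.

#include <stdio.h>

/* Hypothetical wrappers around DRM_IOCTL_NOUVEAU_VM_INIT / _VM_BIND / _EXEC. */
static int vm_init(unsigned long long kstart, unsigned long long ksize)
{
	/* 1) Split the VA space into kernel- and userspace-managed portions. */
	printf("VM_INIT:  kernel-managed VA 0x%llx +0x%llx\n", kstart, ksize);
	return 0;
}

static int vm_bind_map(unsigned long long addr, unsigned long long range,
		       unsigned int bo_handle, int out_syncobj)
{
	/* 2) Bind memory into the userspace-managed portion. VM_BIND is
	 * synchronous by default; with a syncobj it can be queued on the
	 * DRM scheduler and completed asynchronously.                     */
	printf("VM_BIND:  map bo %u at 0x%llx +0x%llx (syncobj %d)\n",
	       bo_handle, addr, range, out_syncobj);
	return 0;
}

static int exec(unsigned int pushbuf_handle, int in_syncobj)
{
	/* 3) EXEC is asynchronous only; one plausible ordering is to make
	 * the submission wait on the bind's syncobj.                       */
	printf("EXEC:     pushbuf %u after syncobj %d\n", pushbuf_handle, in_syncobj);
	return 0;
}

int main(void)
{
	int syncobj = 1;                     /* placeholder handle */

	vm_init(0x0, 0x100000000ULL);
	vm_bind_map(0x100000000ULL, 0x10000, 42, syncobj);
	exec(7, syncobj);
	return 0;
}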

The new VM_BIND UAPI for Nouveau also makes use of drm_exec
(execution context
for GEM buffers) by Christian König. Since the patch implementing
drm_exec was
not yet merged into drm-next it is part of this series, as well as a
small fix
for this patch, which was found while testing this series.

This patch series is also available at [1].

There is a Mesa NVK merge request by Dave Airlie [2] implementing the
corresponding userspace parts for this series.

The Vulkan CTS test suite passes the sparse binding and sparse
residency test
cases for the new UAPI together with Dave's Mesa work.

There are also some test cases in the igt-gpu-tools project [3] for
the new UAPI
and hence the DRM GPU VA manager. However, most of them are testing
the DRM GPU
VA manager's logic through Nouveau's new UAPI and should be
considered just as
helper for implementation.

However, I absolutely intend to change those test cases to proper
kunit test
cases for the DRM GPUVA manager, if and when we agree on its
usefulness and
design.

[1]
https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
[2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
[3]
https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind

I also want to give credit to Dave Airlie, who contributed a lot of
ideas to
this patch series.

Christian König (1):
drm: execution context for GEM buffers

Danilo Krummrich (13):
drm/exec: fix memory leak in drm_exec

Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Alex Deucher
On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich  wrote:
>
> On 1/18/23 16:37, Christian König wrote:
> > Am 18.01.23 um 16:34 schrieb Danilo Krummrich:
> >> Hi Christian,
> >>
> >> On 1/18/23 09:53, Christian König wrote:
> >>> Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
>  This patch series provides a new UAPI for the Nouveau driver in
>  order to
>  support Vulkan features, such as sparse bindings and sparse residency.
> 
>  Furthermore, with the DRM GPUVA manager it provides a new DRM core
>  feature to
>  keep track of GPU virtual address (VA) mappings in a more generic way.
> 
>  The DRM GPUVA manager is indented to help drivers implement
>  userspace-manageable
>  GPU VA spaces in reference to the Vulkan API. In order to achieve
>  this goal it
>  serves the following purposes in this context.
> 
>   1) Provide a dedicated range allocator to track GPU VA
>  allocations and
>  mappings, making use of the drm_mm range allocator.
> >>>
> >>> This means that the ranges are allocated by the kernel? If yes that's
> >>> a really really bad idea.
> >>
> >> No, it's just for keeping track of the ranges userspace has allocated.
> >
> > Ok, that makes more sense.
> >
> > So basically you have an IOCTL which asks kernel for a free range? Or
> > what exactly is the drm_mm used for here?
>
> Not even that, userspace provides both the base address and the range,
> the kernel really just keeps track of things. Though, writing a UAPI on
> top of the GPUVA manager asking for a free range instead would be
> possible by just adding the corresponding wrapper functions to get a
> free hole.
>
> Currently, and that's what I think I read out of your question, the main
> benefit of using drm_mm over simply stuffing the entries into a list or
> something boils down to easier collision detection and iterating
> sub-ranges of the whole VA space.

Why not just do this in userspace?  We have a range manager in
libdrm_amdgpu that you could lift out into libdrm or some other
helper.

Alex


>
> >
> > Regards,
> > Christian.
> >
> >>
> >> - Danilo
> >>
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> 
>   2) Generically connect GPU VA mappings to their backing
>  buffers, in
>  particular DRM GEM objects.
> 
>   3) Provide a common implementation to perform more complex mapping
>  operations on the GPU VA space. In particular splitting and
>  merging
>  of GPU VA mappings, e.g. for intersecting mapping requests
>  or partial
>  unmap requests.
> 
>  The new VM_BIND Nouveau UAPI build on top of the DRM GPUVA manager,
>  itself
>  providing the following new interfaces.
> 
>   1) Initialize a GPU VA space via the new
>  DRM_IOCTL_NOUVEAU_VM_INIT ioctl
>  for UMDs to specify the portion of VA space managed by the
>  kernel and
>  userspace, respectively.
> 
>   2) Allocate and free a VA space region as well as bind and
>  unbind memory
>  to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND
>  ioctl.
> 
>   3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
> 
>  Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use
>  of the DRM
>  scheduler to queue jobs and support asynchronous processing with DRM
>  syncobjs
>  as synchronization mechanism.
> 
>  By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
>  DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
> 
>  The new VM_BIND UAPI for Nouveau makes also use of drm_exec
>  (execution context
>  for GEM buffers) by Christian König. Since the patch implementing
>  drm_exec was
>  not yet merged into drm-next it is part of this series, as well as a
>  small fix
>  for this patch, which was found while testing this series.
> 
>  This patch series is also available at [1].
> 
>  There is a Mesa NVK merge request by Dave Airlie [2] implementing the
>  corresponding userspace parts for this series.
> 
>  The Vulkan CTS test suite passes the sparse binding and sparse
>  residency test
>  cases for the new UAPI together with Dave's Mesa work.
> 
>  There are also some test cases in the igt-gpu-tools project [3] for
>  the new UAPI
>  and hence the DRM GPU VA manager. However, most of them are testing
>  the DRM GPU
>  VA manager's logic through Nouveau's new UAPI and should be
>  considered just as
>  helper for implementation.
> 
>  However, I absolutely intend to change those test cases to proper
>  kunit test
>  cases for the DRM GPUVA manager, once and if we agree on it's
>  usefulness and
>  design.
> 
>  [1]
>  https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next 
>  /
> 

Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Danilo Krummrich

On 1/18/23 16:37, Christian König wrote:

Am 18.01.23 um 16:34 schrieb Danilo Krummrich:

Hi Christian,

On 1/18/23 09:53, Christian König wrote:

Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
This patch series provides a new UAPI for the Nouveau driver in 
order to

support Vulkan features, such as sparse bindings and sparse residency.

Furthermore, with the DRM GPUVA manager it provides a new DRM core 
feature to

keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement 
userspace-manageable
GPU VA spaces in reference to the Vulkan API. In order to achieve 
this goal it

serves the following purposes in this context.

 1) Provide a dedicated range allocator to track GPU VA 
allocations and

    mappings, making use of the drm_mm range allocator.


This means that the ranges are allocated by the kernel? If yes that's 
a really really bad idea.


No, it's just for keeping track of the ranges userspace has allocated.


Ok, that makes more sense.

So basically you have an IOCTL which asks kernel for a free range? Or 
what exactly is the drm_mm used for here?


Not even that, userspace provides both the base address and the range,
the kernel really just keeps track of things. Though, writing a UAPI on
top of the GPUVA manager asking for a free range instead would be 
possible by just adding the corresponding wrapper functions to get a 
free hole.


Currently, and that's what I think I read out of your question, the main 
benefit of using drm_mm over simply stuffing the entries into a list or 
something boils down to easier collision detection and iterating 
sub-ranges of the whole VA space.




Regards,
Christian.



- Danilo



Regards,
Christian.



 2) Generically connect GPU VA mappings to their backing 
buffers, in

    particular DRM GEM objects.

 3) Provide a common implementation to perform more complex mapping
    operations on the GPU VA space. In particular splitting and 
merging
    of GPU VA mappings, e.g. for intersecting mapping requests 
or partial

    unmap requests.

The new Nouveau VM_BIND UAPI builds on top of the DRM GPUVA manager, 
itself

providing the following new interfaces.

 1) Initialize a GPU VA space via the new 
DRM_IOCTL_NOUVEAU_VM_INIT ioctl
    for UMDs to specify the portion of VA space managed by the 
kernel and

    userspace, respectively.

 2) Allocate and free a VA space region as well as bind and 
unbind memory
    to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND 
ioctl.


 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.

Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use 
of the DRM
scheduler to queue jobs and support asynchronous processing with DRM 
syncobjs

as synchronization mechanism.

By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.

The new VM_BIND UAPI for Nouveau also makes use of drm_exec 
(execution context
for GEM buffers) by Christian König. Since the patch implementing 
drm_exec was
not yet merged into drm-next it is part of this series, as well as a 
small fix

for this patch, which was found while testing this series.

This patch series is also available at [1].

There is a Mesa NVK merge request by Dave Airlie [2] implementing the
corresponding userspace parts for this series.

The Vulkan CTS test suite passes the sparse binding and sparse 
residency test

cases for the new UAPI together with Dave's Mesa work.

There are also some test cases in the igt-gpu-tools project [3] for 
the new UAPI
and hence the DRM GPU VA manager. However, most of them are testing 
the DRM GPU
VA manager's logic through Nouveau's new UAPI and should be 
considered just as

helper for implementation.

However, I absolutely intend to change those test cases to proper 
kunit test
cases for the DRM GPUVA manager, if and when we agree on its 
usefulness and

design.

[1] 
https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /

https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
[2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
[3] 
https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind


I also want to give credit to Dave Airlie, who contributed a lot of 
ideas to

this patch series.

Christian König (1):
   drm: execution context for GEM buffers

Danilo Krummrich (13):
   drm/exec: fix memory leak in drm_exec_prepare_obj()
   drm: manager to keep track of GPUs VA mappings
   drm: debugfs: provide infrastructure to dump a DRM GPU VA space
   drm/nouveau: new VM_BIND uapi interfaces
   drm/nouveau: get vmm via nouveau_cli_vmm()
   drm/nouveau: bo: initialize GEM GPU VA interface
   drm/nouveau: move usercopy helpers to nouveau_drv.h
   drm/nouveau: fence: fail to emit when fence context is killed
   drm/nouveau: chan: provide nouveau_channel_kill

Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Christian König

Am 18.01.23 um 16:34 schrieb Danilo Krummrich:

Hi Christian,

On 1/18/23 09:53, Christian König wrote:

Am 18.01.23 um 07:12 schrieb Danilo Krummrich:
This patch series provides a new UAPI for the Nouveau driver in 
order to

support Vulkan features, such as sparse bindings and sparse residency.

Furthermore, with the DRM GPUVA manager it provides a new DRM core 
feature to

keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement 
userspace-manageable
GPU VA spaces in reference to the Vulkan API. In order to achieve 
this goal it

serves the following purposes in this context.

 1) Provide a dedicated range allocator to track GPU VA 
allocations and

    mappings, making use of the drm_mm range allocator.


This means that the ranges are allocated by the kernel? If yes that's 
a really really bad idea.


No, it's just for keeping track of the ranges userspace has allocated.


Ok, that makes more sense.

So basically you have an IOCTL which asks kernel for a free range? Or 
what exactly is the drm_mm used for here?


Regards,
Christian.



- Danilo



Regards,
Christian.



 2) Generically connect GPU VA mappings to their backing 
buffers, in

    particular DRM GEM objects.

 3) Provide a common implementation to perform more complex mapping
    operations on the GPU VA space. In particular splitting and 
merging
    of GPU VA mappings, e.g. for intersecting mapping requests 
or partial

    unmap requests.

The new Nouveau VM_BIND UAPI builds on top of the DRM GPUVA manager, 
itself

providing the following new interfaces.

 1) Initialize a GPU VA space via the new 
DRM_IOCTL_NOUVEAU_VM_INIT ioctl
    for UMDs to specify the portion of VA space managed by the 
kernel and

    userspace, respectively.

 2) Allocate and free a VA space region as well as bind and 
unbind memory
    to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND 
ioctl.


 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.

Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use 
of the DRM
scheduler to queue jobs and support asynchronous processing with DRM 
syncobjs

as synchronization mechanism.

By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.

The new VM_BIND UAPI for Nouveau also makes use of drm_exec 
(execution context
for GEM buffers) by Christian König. Since the patch implementing 
drm_exec was
not yet merged into drm-next it is part of this series, as well as a 
small fix

for this patch, which was found while testing this series.

This patch series is also available at [1].

There is a Mesa NVK merge request by Dave Airlie [2] implementing the
corresponding userspace parts for this series.

The Vulkan CTS test suite passes the sparse binding and sparse 
residency test

cases for the new UAPI together with Dave's Mesa work.

There are also some test cases in the igt-gpu-tools project [3] for 
the new UAPI
and hence the DRM GPU VA manager. However, most of them are testing 
the DRM GPU
VA manager's logic through Nouveau's new UAPI and should be 
considered just as

helper for implementation.

However, I absolutely intend to change those test cases to proper 
kunit test
cases for the DRM GPUVA manager, if and when we agree on its 
usefulness and

design.

[1] 
https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next 
/

https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
[2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
[3] 
https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind


I also want to give credit to Dave Airlie, who contributed a lot of 
ideas to

this patch series.

Christian König (1):
   drm: execution context for GEM buffers

Danilo Krummrich (13):
   drm/exec: fix memory leak in drm_exec_prepare_obj()
   drm: manager to keep track of GPUs VA mappings
   drm: debugfs: provide infrastructure to dump a DRM GPU VA space
   drm/nouveau: new VM_BIND uapi interfaces
   drm/nouveau: get vmm via nouveau_cli_vmm()
   drm/nouveau: bo: initialize GEM GPU VA interface
   drm/nouveau: move usercopy helpers to nouveau_drv.h
   drm/nouveau: fence: fail to emit when fence context is killed
   drm/nouveau: chan: provide nouveau_channel_kill()
   drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
   drm/nouveau: implement uvmm for user mode bindings
   drm/nouveau: implement new VM_BIND UAPI
   drm/nouveau: debugfs: implement DRM GPU VA debugfs

  Documentation/gpu/driver-uapi.rst |   11 +
  Documentation/gpu/drm-mm.rst  |   43 +
  drivers/gpu/drm/Kconfig   |    6 +
  drivers/gpu/drm/Makefile  |    3 +
  drivers/gpu/drm/amd/amdgpu/Kconfig    |    1 +
  drivers/gpu/drm/drm_debugfs.c |   56 +
  drivers/gpu/drm/drm_exec.c 

Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Danilo Krummrich

Hi Christian,

On 1/18/23 09:53, Christian König wrote:

Am 18.01.23 um 07:12 schrieb Danilo Krummrich:

This patch series provides a new UAPI for the Nouveau driver in order to
support Vulkan features, such as sparse bindings and sparse residency.

Furthermore, with the DRM GPUVA manager it provides a new DRM core 
feature to

keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement 
userspace-manageable
GPU VA spaces in reference to the Vulkan API. In order to achieve this 
goal it

serves the following purposes in this context.

 1) Provide a dedicated range allocator to track GPU VA 
allocations and

    mappings, making use of the drm_mm range allocator.


This means that the ranges are allocated by the kernel? If yes that's a 
really really bad idea.


No, it's just for keeping track of the ranges userspace has allocated.

- Danilo



Regards,
Christian.



 2) Generically connect GPU VA mappings to their backing buffers, in
    particular DRM GEM objects.

 3) Provide a common implementation to perform more complex mapping
    operations on the GPU VA space. In particular splitting and 
merging
    of GPU VA mappings, e.g. for intersecting mapping requests or 
partial

    unmap requests.

The new Nouveau VM_BIND UAPI builds on top of the DRM GPUVA manager, 
itself

providing the following new interfaces.

 1) Initialize a GPU VA space via the new 
DRM_IOCTL_NOUVEAU_VM_INIT ioctl
    for UMDs to specify the portion of VA space managed by the 
kernel and

    userspace, respectively.

 2) Allocate and free a VA space region as well as bind and unbind 
memory

    to the GPUs VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.

 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.

Both, DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC, make use 
of the DRM
scheduler to queue jobs and support asynchronous processing with DRM 
syncobjs

as synchronization mechanism.

By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing,
DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.

The new VM_BIND UAPI for Nouveau also makes use of drm_exec 
context
for GEM buffers) by Christian König. Since the patch implementing 
drm_exec was
not yet merged into drm-next it is part of this series, as well as a 
small fix

for this patch, which was found while testing this series.

This patch series is also available at [1].

There is a Mesa NVK merge request by Dave Airlie [2] implementing the
corresponding userspace parts for this series.

The Vulkan CTS test suite passes the sparse binding and sparse 
residency test

cases for the new UAPI together with Dave's Mesa work.

There are also some test cases in the igt-gpu-tools project [3] for 
the new UAPI
and hence the DRM GPU VA manager. However, most of them are testing 
the DRM GPU
VA manager's logic through Nouveau's new UAPI and should be considered 
just as

helper for implementation.

However, I absolutely intend to change those test cases to proper 
kunit test
cases for the DRM GPUVA manager, if and when we agree on its 
usefulness and

design.

[1] 
https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /

 https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
[2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
[3] 
https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind


I also want to give credit to Dave Airlie, who contributed a lot of 
ideas to

this patch series.

Christian König (1):
   drm: execution context for GEM buffers

Danilo Krummrich (13):
   drm/exec: fix memory leak in drm_exec_prepare_obj()
   drm: manager to keep track of GPUs VA mappings
   drm: debugfs: provide infrastructure to dump a DRM GPU VA space
   drm/nouveau: new VM_BIND uapi interfaces
   drm/nouveau: get vmm via nouveau_cli_vmm()
   drm/nouveau: bo: initialize GEM GPU VA interface
   drm/nouveau: move usercopy helpers to nouveau_drv.h
   drm/nouveau: fence: fail to emit when fence context is killed
   drm/nouveau: chan: provide nouveau_channel_kill()
   drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
   drm/nouveau: implement uvmm for user mode bindings
   drm/nouveau: implement new VM_BIND UAPI
   drm/nouveau: debugfs: implement DRM GPU VA debugfs

  Documentation/gpu/driver-uapi.rst |   11 +
  Documentation/gpu/drm-mm.rst  |   43 +
  drivers/gpu/drm/Kconfig   |    6 +
  drivers/gpu/drm/Makefile  |    3 +
  drivers/gpu/drm/amd/amdgpu/Kconfig    |    1 +
  drivers/gpu/drm/drm_debugfs.c |   56 +
  drivers/gpu/drm/drm_exec.c    |  294 
  drivers/gpu/drm/drm_gem.c |    3 +
  drivers/gpu/drm/drm_gpuva_mgr.c   | 1323 +
  drivers/gpu/drm/nouveau/Kbuild    |    3

Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-18 Thread Christian König

Am 18.01.23 um 07:12 schrieb Danilo Krummrich:

This patch series provides a new UAPI for the Nouveau driver in order to
support Vulkan features, such as sparse bindings and sparse residency.

Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to
keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement userspace-manageable
GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it
serves the following purposes in this context.

 1) Provide a dedicated range allocator to track GPU VA allocations and
mappings, making use of the drm_mm range allocator.


This means that the ranges are allocated by the kernel? If yes, that's a
really, really bad idea.


Regards,
Christian.



 2) Generically connect GPU VA mappings to their backing buffers, in
particular DRM GEM objects.

 3) Provide a common implementation to perform more complex mapping
operations on the GPU VA space. In particular splitting and merging
of GPU VA mappings, e.g. for intersecting mapping requests or partial
unmap requests.

The new VM_BIND Nouveau UAPI builds on top of the DRM GPUVA manager, itself
providing the following new interfaces.

 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl
for UMDs to specify the portion of VA space managed by the kernel and
userspace, respectively.

 2) Allocate and free a VA space region as well as bind and unbind memory
to the GPU's VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.

 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.

Both DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC make use of the DRM
scheduler to queue jobs and support asynchronous processing with DRM syncobjs
as the synchronization mechanism.

By default, DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing;
DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.

The new VM_BIND UAPI for Nouveau also makes use of drm_exec (execution context
for GEM buffers) by Christian König. Since the patch implementing drm_exec was
not yet merged into drm-next, it is part of this series, as well as a small fix
for this patch, which was found while testing this series.

This patch series is also available at [1].

There is a Mesa NVK merge request by Dave Airlie [2] implementing the
corresponding userspace parts for this series.

The Vulkan CTS test suite passes the sparse binding and sparse residency test
cases for the new UAPI together with Dave's Mesa work.

There are also some test cases in the igt-gpu-tools project [3] for the new UAPI
and hence the DRM GPU VA manager. However, most of them test the DRM GPU VA
manager's logic through Nouveau's new UAPI and should be considered just as
helpers for the implementation.

That said, I absolutely intend to turn those test cases into proper kunit test
cases for the DRM GPUVA manager, once and if we agree on its usefulness and
design.

[1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
 https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
[2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
[3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind

I also want to give credit to Dave Airlie, who contributed a lot of ideas to
this patch series.

Christian König (1):
   drm: execution context for GEM buffers

Danilo Krummrich (13):
   drm/exec: fix memory leak in drm_exec_prepare_obj()
   drm: manager to keep track of GPUs VA mappings
   drm: debugfs: provide infrastructure to dump a DRM GPU VA space
   drm/nouveau: new VM_BIND uapi interfaces
   drm/nouveau: get vmm via nouveau_cli_vmm()
   drm/nouveau: bo: initialize GEM GPU VA interface
   drm/nouveau: move usercopy helpers to nouveau_drv.h
   drm/nouveau: fence: fail to emit when fence context is killed
   drm/nouveau: chan: provide nouveau_channel_kill()
   drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
   drm/nouveau: implement uvmm for user mode bindings
   drm/nouveau: implement new VM_BIND UAPI
   drm/nouveau: debugfs: implement DRM GPU VA debugfs

  Documentation/gpu/driver-uapi.rst |   11 +
  Documentation/gpu/drm-mm.rst  |   43 +
  drivers/gpu/drm/Kconfig   |6 +
  drivers/gpu/drm/Makefile  |3 +
  drivers/gpu/drm/amd/amdgpu/Kconfig|1 +
  drivers/gpu/drm/drm_debugfs.c |   56 +
  drivers/gpu/drm/drm_exec.c|  294 
  drivers/gpu/drm/drm_gem.c |3 +
  drivers/gpu/drm/drm_gpuva_mgr.c   | 1323 +
  drivers/gpu/drm/nouveau/Kbuild|3 +
  drivers/gpu/drm/nouveau/Kconfig   |2 +
  drivers/gpu/drm/nouveau/include/nvif/if000c.h |   23 +-
  drivers/gpu/drm/nouveau/include/nvif/vmm.h|   17 +-
  ..

[PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-01-17 Thread Danilo Krummrich
This patch series provides a new UAPI for the Nouveau driver in order to
support Vulkan features, such as sparse bindings and sparse residency.

Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to
keep track of GPU virtual address (VA) mappings in a more generic way.

The DRM GPUVA manager is intended to help drivers implement userspace-manageable
GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it
serves the following purposes in this context.

1) Provide a dedicated range allocator to track GPU VA allocations and
   mappings, making use of the drm_mm range allocator.

2) Generically connect GPU VA mappings to their backing buffers, in
   particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
   operations on the GPU VA space. In particular splitting and merging
   of GPU VA mappings, e.g. for intersecting mapping requests or partial
   unmap requests (see the sketch right after this list).
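
To make 3) concrete: assume a 1 MiB mapping exists and userspace asks to unmap
a single 4 KiB page out of its middle. The manager then has to split the old
mapping rather than drop it entirely. The sketch below only illustrates the
arithmetic of such a split; the operation names printed here are placeholders,
and the actual operation types and driver interface are whatever the
drm_gpuva_mgr patch in this series defines.

#include <inttypes.h>
#include <stdio.h>

struct sketch_va { uint64_t addr, range; };   /* placeholder, not the real struct */

/*
 * Partial unmap of [req.addr, req.addr + req.range) from an existing
 * mapping 'old': emit one unmap of the old entry plus a re-map of the
 * surviving head and/or tail.
 */
static void sketch_partial_unmap(struct sketch_va old, struct sketch_va req)
{
        uint64_t old_end = old.addr + old.range;
        uint64_t req_end = req.addr + req.range;

        printf("UNMAP 0x%" PRIx64 " + 0x%" PRIx64 "\n", old.addr, old.range);

        if (req.addr > old.addr)        /* surviving head */
                printf("REMAP 0x%" PRIx64 " + 0x%" PRIx64 "\n",
                       old.addr, req.addr - old.addr);

        if (req_end < old_end)          /* surviving tail */
                printf("REMAP 0x%" PRIx64 " + 0x%" PRIx64 "\n",
                       req_end, old_end - req_end);
}

int main(void)
{
        struct sketch_va old = { .addr = 0x100000, .range = 0x100000 };  /* 1 MiB */
        struct sketch_va req = { .addr = 0x180000, .range = 0x1000 };    /* 4 KiB */

        /* Expect: UNMAP the old entry, REMAP its head and its tail. */
        sketch_partial_unmap(old, req);
        return 0;
}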

The new VM_BIND Nouveau UAPI builds on top of the DRM GPUVA manager, itself
providing the following new interfaces.

1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl
   for UMDs to specify the portion of VA space managed by the kernel and
   userspace, respectively.

2) Allocate and free a VA space region as well as bind and unbind memory
   to the GPU's VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.

3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.

Both DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC make use of the DRM
scheduler to queue jobs and support asynchronous processing with DRM syncobjs
as the synchronization mechanism.

By default, DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing;
DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.

The new VM_BIND UAPI for Nouveau also makes use of drm_exec (execution context
for GEM buffers) by Christian König. Since the patch implementing drm_exec was
not yet merged into drm-next, it is part of this series, as well as a small fix
for this patch, which was found while testing this series.
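
For reference, below is roughly how a driver would use drm_exec to lock the
GEM objects involved in a bind or exec job. The helper names and signatures
are assumptions based on later revisions of the drm_exec work (only
drm_exec_prepare_obj() is confirmed by the patch list in this cover letter)
and may not match this exact RFC version.

#include <drm/drm_exec.h>
#include <drm/drm_gem.h>

/*
 * Sketch: lock all GEM objects a job needs, restarting the sequence if a
 * contended reservation forces a rollback.
 */
static int sketch_lock_job_objs(struct drm_gem_object **objs, unsigned int count)
{
        struct drm_exec exec;
        unsigned int i;
        int ret = 0;

        drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, count);
        drm_exec_until_all_locked(&exec) {
                for (i = 0; i < count; i++) {
                        /* reserve one fence slot per object for this job */
                        ret = drm_exec_prepare_obj(&exec, objs[i], 1);
                        drm_exec_retry_on_contention(&exec);
                        if (ret)
                                goto out;
                }
        }
        /* on success all objects are locked here; job setup would follow */
out:
        drm_exec_fini(&exec);
        return ret;
}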

This patch series is also available at [1].

There is a Mesa NVK merge request by Dave Airlie [2] implementing the
corresponding userspace parts for this series.

The Vulkan CTS test suite passes the sparse binding and sparse residency test
cases for the new UAPI together with Dave's Mesa work.

There are also some test cases in the igt-gpu-tools project [3] for the new UAPI
and hence the DRM GPU VA manager. However, most of them test the DRM GPU VA
manager's logic through Nouveau's new UAPI and should be considered just as
helpers for the implementation.

That said, I absolutely intend to turn those test cases into proper kunit test
cases for the DRM GPUVA manager, once and if we agree on its usefulness and
design.

[1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
[2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
[3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind

I also want to give credit to Dave Airlie, who contributed a lot of ideas to
this patch series.

Christian König (1):
  drm: execution context for GEM buffers

Danilo Krummrich (13):
  drm/exec: fix memory leak in drm_exec_prepare_obj()
  drm: manager to keep track of GPUs VA mappings
  drm: debugfs: provide infrastructure to dump a DRM GPU VA space
  drm/nouveau: new VM_BIND uapi interfaces
  drm/nouveau: get vmm via nouveau_cli_vmm()
  drm/nouveau: bo: initialize GEM GPU VA interface
  drm/nouveau: move usercopy helpers to nouveau_drv.h
  drm/nouveau: fence: fail to emit when fence context is killed
  drm/nouveau: chan: provide nouveau_channel_kill()
  drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
  drm/nouveau: implement uvmm for user mode bindings
  drm/nouveau: implement new VM_BIND UAPI
  drm/nouveau: debugfs: implement DRM GPU VA debugfs

 Documentation/gpu/driver-uapi.rst |   11 +
 Documentation/gpu/drm-mm.rst  |   43 +
 drivers/gpu/drm/Kconfig   |6 +
 drivers/gpu/drm/Makefile  |3 +
 drivers/gpu/drm/amd/amdgpu/Kconfig|1 +
 drivers/gpu/drm/drm_debugfs.c |   56 +
 drivers/gpu/drm/drm_exec.c|  294 
 drivers/gpu/drm/drm_gem.c |3 +
 drivers/gpu/drm/drm_gpuva_mgr.c   | 1323 +
 drivers/gpu/drm/nouveau/Kbuild|3 +
 drivers/gpu/drm/nouveau/Kconfig   |2 +
 drivers/gpu/drm/nouveau/include/nvif/if000c.h |   23 +-
 drivers/gpu/drm/nouveau/include/nvif/vmm.h|   17 +-
 .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h |   10 +
 drivers/gpu/drm/nouveau/nouveau_abi16.c   |   23 +
 drivers/gpu/drm/nouveau/nouveau_abi16.h   |1 +
 drivers/gpu/drm/nouveau/nouveau_bo.c  |