Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
On Tue, Dec 17, 2019 at 12:01:26PM -0600, Jason Ekstrand wrote: On Sun, Dec 15, 2019 at 10:24 PM Niranjan Vishwanathapura wrote: On Sat, Dec 14, 2019 at 10:31:37AM +, Chris Wilson wrote: >Quoting Jason Ekstrand (2019-12-14 00:36:19) >> On Fri, Dec 13, 2019 at 5:24 PM Niranjan Vishwanathapura < >> niranjana.vishwanathap...@intel.com> wrote: >> >> On Fri, Dec 13, 2019 at 04:58:42PM -0600, Jason Ekstrand wrote: >> > >> > +/** >> > + * struct drm_i915_gem_vm_bind >> > + * >> > + * Bind an object in a vm's page table. >> > >> > First off, this is something I've wanted for a while for Vulkan, it's >> just >> > never made its way high enough up the priority list. However, it's >> going >> > to have to come one way or another soon. I'm glad to see kernel API >> for >> > this being proposed. >> > I do, however, have a few high-level comments/questions about the API: >> >1. In order to be useful for sparse memory support, the API has to go >> the >> > other way around so that it binds a VA range to a range within the BO. >> It >> > also needs to be able to handle overlapping where two different VA >> ranges >> > may map to the same underlying bytes in the BO. This likely means that >> > unbind needs to also take a VA range and only unbind that range. >> >2. If this is going to be useful for managing GL's address space where >> we >> > have lots of BOs, we probably want it to take a list of ranges so we >> > aren't making one ioctl for each thing we want to bind. >> >> Hi Jason, >> >> Yah, some of these requirements came up. >> >> >> Yes, I have raised them every single time an API like this has come across my >> e-mail inbox for years and they continue to get ignored. Why are we landing an >> API that we know isn't the API we want especially when it's pretty obvious >> roughly what the API we want is? It may be less time in the short term, but >> long-term it means two ioctls and two implementations in i915, IGT tests for >> both code paths, and code in all UMDs to call one or the other depending on >> what kernel you're running on, and we have to maintain all that code going >> forward forever. Sure, that's a price we pay today for a variety of things but >> that's because they all seemed like the right thing at the time. Landing the >> wrong API when we know it's the wrong API seems foolish. > >Exactly. This is not even close to the uAPI we need. Reposting an RFC >without taking in the concerns last time (or the time before that, or >the time before that...) suggests that you aren't really requesting for >comments at all. Thanks Jason for detailed exlanation. Chris, all comments and guidance are much appreciated :) I haven't looked in detail, but my concern is that implementing partial object binding (offset, lenght) from vma down to [un]binding in ppgtt might be a lot of work to include in this SVM patch series. I believe we need the partial object binding in non-SVM scenario as well? Yes, the Vulkan APIs require both partial binding and aliasing. It may be worth pointing out that we're already doing some of this stuff today, although in a rather backwards way. Our non-softpin model for Vulkan uses a memfd which we then map in userspace and turn into a BO via userptr. Due to the way we handle threading in the driver, we end up with multiple BOs pointing to the same overlapping range in the memfd and hence the same pages. That doesn't mean that everything in the path is already set up for what you need but the VA -> page mappings should be. Also, avoiding these kinds of shinanigans is exactly why we want a "real" kernel API for this. :-) Ok thanks Jason for the explantion. Will look into supporting this here. Ok, let me change the interface as below. struct drm_i915_gem_vm_bind_va { /** VA start to bind **/ __u64 start; /** Offset in Object to bind for I915_GEM_VM_BIND_SVM_OBJ type **/ __u64 offset; /** VA length to [un]bind **/ __u64 length; /** Type of memory to [un]bind **/ __u32 type; #define I915_GEM_VM_BIND_SVM_OBJ 0 #define I915_GEM_VM_BIND_SVM_BUFFER 1 /** Object handle to [un]bind for I915_GEM_VM_BIND_SVM_OBJ type **/ __u32 handle; /** Flags **/ __u32 flags; #define I915_GEM_VM_BIND_UNBIND (1 << 0) #define I915_GEM_VM_BIND_READONLY(1 << 1) } struct drm_i915_gem_vm_
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
On Sun, Dec 15, 2019 at 08:15:24PM -0800, Niranjan Vishwanathapura wrote: On Sat, Dec 14, 2019 at 10:56:54AM +, Chris Wilson wrote: Quoting Niranjana Vishwanathapura (2019-12-13 21:56:04) Shared Virtual Memory (SVM) runtime allocator support allows binding a shared virtual address to a buffer object (BO) in the device page table through an ioctl call. Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Sudeep Dutt Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/Kconfig | 11 drivers/gpu/drm/i915/Makefile | 3 + .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 58 ++ drivers/gpu/drm/i915/gem/i915_gem_svm.c | 60 +++ drivers/gpu/drm/i915/gem/i915_gem_svm.h | 22 +++ drivers/gpu/drm/i915/i915_drv.c | 21 +++ drivers/gpu/drm/i915/i915_drv.h | 22 +++ drivers/gpu/drm/i915/i915_gem_gtt.c | 1 + drivers/gpu/drm/i915/i915_gem_gtt.h | 13 include/uapi/drm/i915_drm.h | 27 + 10 files changed, 227 insertions(+), 11 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index ba9595960bbe..c2e48710eec8 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -137,6 +137,16 @@ config DRM_I915_GVT_KVMGT Choose this option if you want to enable KVMGT support for Intel GVT-g. +config DRM_I915_SVM + bool "Enable Shared Virtual Memory support in i915" + depends on STAGING + depends on DRM_I915 + default n + help + Choose this option if you want Shared Virtual Memory (SVM) + support in i915. With SVM support, one can share the virtual + address space between a process and the GPU. + menu "drm/i915 Debugging" depends on DRM_I915 depends on EXPERT diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index e0fd10c0cfb8..75fe45633779 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -153,6 +153,9 @@ i915-y += \ intel_region_lmem.o \ intel_wopcm.o +# SVM code +i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o + # general-purpose microcontroller (GuC) support obj-y += gt/uc/ i915-y += gt/uc/intel_uc.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 5003e616a1ad..af360238a392 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2836,10 +2836,14 @@ int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_gem_exec_object2 *exec2_list, *exec2_list_user; struct drm_i915_gem_execbuffer2 *args = data; - struct drm_i915_gem_exec_object2 *exec2_list; - struct drm_syncobj **fences = NULL; const size_t count = args->buffer_count; + struct drm_syncobj **fences = NULL; + unsigned int i = 0, svm_count = 0; + struct i915_address_space *vm; + struct i915_gem_context *ctx; + struct i915_svm_obj *svm_obj; int err; if (!check_buffer_count(count)) { @@ -2851,15 +2855,46 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, if (err) return err; + ctx = i915_gem_context_lookup(file->driver_priv, args->rsvd1); + if (!ctx || !rcu_access_pointer(ctx->vm)) + return -ENOENT; This is just hopelessly wrong. For persistence, the _ce_->vm will have a list of must-be-present vma, with a flag for whether they need prefaulting (!svm everything must be prefaulted obviously). Then during reservation we ensure that all those persistent vma are in place (so we probably use an eviction list to keep track of those we need to instantiate on this execbuf). We don't even want to individually track activity on those vma, preferring to assume they are used by every request and so on change they need serialising [for explicit uAPI unbind, where possible we strive to do it async for endless, or at least sync against iova semaphore] against the last request in the vm (so we need a vm->active). However, we do need an EXT_EXTENSION to mark writes for implicit fencing (e.g. exported dmabuf) to replace the information lost from execobject[] I did not understand some points above. I am no expert here, and appreciate the feedback. My understanding is that [excluding endless batch buffer scenario which is not supported in this patch series,] VM_BIND is no different than the soft-pinning of objects we have today in the execbuf path. Hence the idea here is to add those VM_BIND objects to the execobject[] and let the execbuffer path to take care of the rest. Persistence of bindings across multiple requests is something no
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
On Sun, Dec 15, 2019 at 10:24 PM Niranjan Vishwanathapura < niranjana.vishwanathap...@intel.com> wrote: > On Sat, Dec 14, 2019 at 10:31:37AM +, Chris Wilson wrote: > >Quoting Jason Ekstrand (2019-12-14 00:36:19) > >> On Fri, Dec 13, 2019 at 5:24 PM Niranjan Vishwanathapura < > >> niranjana.vishwanathap...@intel.com> wrote: > >> > >> On Fri, Dec 13, 2019 at 04:58:42PM -0600, Jason Ekstrand wrote: > >> > > >> > +/** > >> > + * struct drm_i915_gem_vm_bind > >> > + * > >> > + * Bind an object in a vm's page table. > >> > > >> > First off, this is something I've wanted for a while for > Vulkan, it's > >> just > >> > never made its way high enough up the priority list. However, > it's > >> going > >> > to have to come one way or another soon. I'm glad to see > kernel API > >> for > >> > this being proposed. > >> > I do, however, have a few high-level comments/questions about > the API: > >> >1. In order to be useful for sparse memory support, the API > has to go > >> the > >> > other way around so that it binds a VA range to a range within > the BO. > >> It > >> > also needs to be able to handle overlapping where two different > VA > >> ranges > >> > may map to the same underlying bytes in the BO. This likely > means that > >> > unbind needs to also take a VA range and only unbind that range. > >> >2. If this is going to be useful for managing GL's address > space where > >> we > >> > have lots of BOs, we probably want it to take a list of ranges > so we > >> > aren't making one ioctl for each thing we want to bind. > >> > >> Hi Jason, > >> > >> Yah, some of these requirements came up. > >> > >> > >> Yes, I have raised them every single time an API like this has come > across my > >> e-mail inbox for years and they continue to get ignored. Why are we > landing an > >> API that we know isn't the API we want especially when it's pretty > obvious > >> roughly what the API we want is? It may be less time in the short > term, but > >> long-term it means two ioctls and two implementations in i915, IGT > tests for > >> both code paths, and code in all UMDs to call one or the other > depending on > >> what kernel you're running on, and we have to maintain all that code > going > >> forward forever. Sure, that's a price we pay today for a variety of > things but > >> that's because they all seemed like the right thing at the time. > Landing the > >> wrong API when we know it's the wrong API seems foolish. > > > >Exactly. This is not even close to the uAPI we need. Reposting an RFC > >without taking in the concerns last time (or the time before that, or > >the time before that...) suggests that you aren't really requesting for > >comments at all. > > Thanks Jason for detailed exlanation. > Chris, all comments and guidance are much appreciated :) > > I haven't looked in detail, but my concern is that implementing > partial object binding (offset, lenght) from vma down to [un]binding > in ppgtt might be a lot of work to include in this SVM patch series. > I believe we need the partial object binding in non-SVM scenario > as well? > Yes, the Vulkan APIs require both partial binding and aliasing. It may be worth pointing out that we're already doing some of this stuff today, although in a rather backwards way. Our non-softpin model for Vulkan uses a memfd which we then map in userspace and turn into a BO via userptr. Due to the way we handle threading in the driver, we end up with multiple BOs pointing to the same overlapping range in the memfd and hence the same pages. That doesn't mean that everything in the path is already set up for what you need but the VA -> page mappings should be. Also, avoiding these kinds of shinanigans is exactly why we want a "real" kernel API for this. :-) > Ok, let me change the interface as below. > > struct drm_i915_gem_vm_bind_va > { > /** VA start to bind **/ > __u64 start; > > /** Offset in Object to bind for I915_GEM_VM_BIND_SVM_OBJ type **/ > __u64 offset; > > /** VA length to [un]bind **/ > __u64 length; > > /** Type of memory to [un]bind **/ > __u32 type; > #define I915_GEM_VM_BIND_SVM_OBJ 0 > #define I915_GEM_VM_BIND_SVM_BUFFER 1 > > /** Object handle to [un]bind for I915_GEM_VM_BIND_SVM_OBJ type **/ > __u32 handle; > > /** Flags **/ > __u32 flags; > #define I915_GEM_VM_BIND_UNBIND (1 << 0) > #define I915_GEM_VM_BIND_READONLY(1 << 1) > } > > struct drm_i915_gem_vm_bind { > /** vm to [un]bind **/ > __u32 vm_id; > > /** number of VAs to bind **/ > __u32 num_vas; > > /** Array of VAs to bind **/ > struct drm_i915_gem_vm_bind_va *bind_vas; > > /** User extensions **/ > __u64 extensions; > }; > > When synchronization control is added as ext
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
On Sat, Dec 14, 2019 at 10:56:54AM +, Chris Wilson wrote: Quoting Niranjana Vishwanathapura (2019-12-13 21:56:04) Shared Virtual Memory (SVM) runtime allocator support allows binding a shared virtual address to a buffer object (BO) in the device page table through an ioctl call. Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Sudeep Dutt Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/Kconfig | 11 drivers/gpu/drm/i915/Makefile | 3 + .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 58 ++ drivers/gpu/drm/i915/gem/i915_gem_svm.c | 60 +++ drivers/gpu/drm/i915/gem/i915_gem_svm.h | 22 +++ drivers/gpu/drm/i915/i915_drv.c | 21 +++ drivers/gpu/drm/i915/i915_drv.h | 22 +++ drivers/gpu/drm/i915/i915_gem_gtt.c | 1 + drivers/gpu/drm/i915/i915_gem_gtt.h | 13 include/uapi/drm/i915_drm.h | 27 + 10 files changed, 227 insertions(+), 11 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index ba9595960bbe..c2e48710eec8 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -137,6 +137,16 @@ config DRM_I915_GVT_KVMGT Choose this option if you want to enable KVMGT support for Intel GVT-g. +config DRM_I915_SVM + bool "Enable Shared Virtual Memory support in i915" + depends on STAGING + depends on DRM_I915 + default n + help + Choose this option if you want Shared Virtual Memory (SVM) + support in i915. With SVM support, one can share the virtual + address space between a process and the GPU. + menu "drm/i915 Debugging" depends on DRM_I915 depends on EXPERT diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index e0fd10c0cfb8..75fe45633779 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -153,6 +153,9 @@ i915-y += \ intel_region_lmem.o \ intel_wopcm.o +# SVM code +i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o + # general-purpose microcontroller (GuC) support obj-y += gt/uc/ i915-y += gt/uc/intel_uc.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 5003e616a1ad..af360238a392 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2836,10 +2836,14 @@ int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_gem_exec_object2 *exec2_list, *exec2_list_user; struct drm_i915_gem_execbuffer2 *args = data; - struct drm_i915_gem_exec_object2 *exec2_list; - struct drm_syncobj **fences = NULL; const size_t count = args->buffer_count; + struct drm_syncobj **fences = NULL; + unsigned int i = 0, svm_count = 0; + struct i915_address_space *vm; + struct i915_gem_context *ctx; + struct i915_svm_obj *svm_obj; int err; if (!check_buffer_count(count)) { @@ -2851,15 +2855,46 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, if (err) return err; + ctx = i915_gem_context_lookup(file->driver_priv, args->rsvd1); + if (!ctx || !rcu_access_pointer(ctx->vm)) + return -ENOENT; This is just hopelessly wrong. For persistence, the _ce_->vm will have a list of must-be-present vma, with a flag for whether they need prefaulting (!svm everything must be prefaulted obviously). Then during reservation we ensure that all those persistent vma are in place (so we probably use an eviction list to keep track of those we need to instantiate on this execbuf). We don't even want to individually track activity on those vma, preferring to assume they are used by every request and so on change they need serialising [for explicit uAPI unbind, where possible we strive to do it async for endless, or at least sync against iova semaphore] against the last request in the vm (so we need a vm->active). However, we do need an EXT_EXTENSION to mark writes for implicit fencing (e.g. exported dmabuf) to replace the information lost from execobject[] I did not understand some points above. I am no expert here, and appreciate the feedback. My understanding is that [excluding endless batch buffer scenario which is not supported in this patch series,] VM_BIND is no different than the soft-pinning of objects we have today in the execbuf path. Hence the idea here is to add those VM_BIND objects to the execobject[] and let the execbuffer path to take care of the rest. Persistence of bindings across multiple requests is something not considered. Do we need this flag in execo
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
On Sat, Dec 14, 2019 at 10:31:37AM +, Chris Wilson wrote: Quoting Jason Ekstrand (2019-12-14 00:36:19) On Fri, Dec 13, 2019 at 5:24 PM Niranjan Vishwanathapura < niranjana.vishwanathap...@intel.com> wrote: On Fri, Dec 13, 2019 at 04:58:42PM -0600, Jason Ekstrand wrote: > > +/** > + * struct drm_i915_gem_vm_bind > + * > + * Bind an object in a vm's page table. > > First off, this is something I've wanted for a while for Vulkan, it's just > never made its way high enough up the priority list. However, it's going > to have to come one way or another soon. I'm glad to see kernel API for > this being proposed. > I do, however, have a few high-level comments/questions about the API: > 1. In order to be useful for sparse memory support, the API has to go the > other way around so that it binds a VA range to a range within the BO. It > also needs to be able to handle overlapping where two different VA ranges > may map to the same underlying bytes in the BO. This likely means that > unbind needs to also take a VA range and only unbind that range. > 2. If this is going to be useful for managing GL's address space where we > have lots of BOs, we probably want it to take a list of ranges so we > aren't making one ioctl for each thing we want to bind. Hi Jason, Yah, some of these requirements came up. Yes, I have raised them every single time an API like this has come across my e-mail inbox for years and they continue to get ignored. Why are we landing an API that we know isn't the API we want especially when it's pretty obvious roughly what the API we want is? It may be less time in the short term, but long-term it means two ioctls and two implementations in i915, IGT tests for both code paths, and code in all UMDs to call one or the other depending on what kernel you're running on, and we have to maintain all that code going forward forever. Sure, that's a price we pay today for a variety of things but that's because they all seemed like the right thing at the time. Landing the wrong API when we know it's the wrong API seems foolish. Exactly. This is not even close to the uAPI we need. Reposting an RFC without taking in the concerns last time (or the time before that, or the time before that...) suggests that you aren't really requesting for comments at all. Thanks Jason for detailed exlanation. Chris, all comments and guidance are much appreciated :) I haven't looked in detail, but my concern is that implementing partial object binding (offset, lenght) from vma down to [un]binding in ppgtt might be a lot of work to include in this SVM patch series. I believe we need the partial object binding in non-SVM scenario as well? Ok, let me change the interface as below. struct drm_i915_gem_vm_bind_va { /** VA start to bind **/ __u64 start; /** Offset in Object to bind for I915_GEM_VM_BIND_SVM_OBJ type **/ __u64 offset; /** VA length to [un]bind **/ __u64 length; /** Type of memory to [un]bind **/ __u32 type; #define I915_GEM_VM_BIND_SVM_OBJ 0 #define I915_GEM_VM_BIND_SVM_BUFFER 1 /** Object handle to [un]bind for I915_GEM_VM_BIND_SVM_OBJ type **/ __u32 handle; /** Flags **/ __u32 flags; #define I915_GEM_VM_BIND_UNBIND (1 << 0) #define I915_GEM_VM_BIND_READONLY(1 << 1) } struct drm_i915_gem_vm_bind { /** vm to [un]bind **/ __u32 vm_id; /** number of VAs to bind **/ __u32 num_vas; /** Array of VAs to bind **/ struct drm_i915_gem_vm_bind_va *bind_vas; /** User extensions **/ __u64 extensions; }; When synchronization control is added as extension, it applies to all VAs in the array. Does this looks good? Niranjana -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
Quoting Niranjana Vishwanathapura (2019-12-13 21:56:04) > Shared Virtual Memory (SVM) runtime allocator support allows > binding a shared virtual address to a buffer object (BO) in the > device page table through an ioctl call. > > Cc: Joonas Lahtinen > Cc: Jon Bloomfield > Cc: Daniel Vetter > Cc: Sudeep Dutt > Signed-off-by: Niranjana Vishwanathapura > --- > drivers/gpu/drm/i915/Kconfig | 11 > drivers/gpu/drm/i915/Makefile | 3 + > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 58 ++ > drivers/gpu/drm/i915/gem/i915_gem_svm.c | 60 +++ > drivers/gpu/drm/i915/gem/i915_gem_svm.h | 22 +++ > drivers/gpu/drm/i915/i915_drv.c | 21 +++ > drivers/gpu/drm/i915/i915_drv.h | 22 +++ > drivers/gpu/drm/i915/i915_gem_gtt.c | 1 + > drivers/gpu/drm/i915/i915_gem_gtt.h | 13 > include/uapi/drm/i915_drm.h | 27 + > 10 files changed, 227 insertions(+), 11 deletions(-) > create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c > create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h > > diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig > index ba9595960bbe..c2e48710eec8 100644 > --- a/drivers/gpu/drm/i915/Kconfig > +++ b/drivers/gpu/drm/i915/Kconfig > @@ -137,6 +137,16 @@ config DRM_I915_GVT_KVMGT > Choose this option if you want to enable KVMGT support for > Intel GVT-g. > > +config DRM_I915_SVM > + bool "Enable Shared Virtual Memory support in i915" > + depends on STAGING > + depends on DRM_I915 > + default n > + help > + Choose this option if you want Shared Virtual Memory (SVM) > + support in i915. With SVM support, one can share the virtual > + address space between a process and the GPU. > + > menu "drm/i915 Debugging" > depends on DRM_I915 > depends on EXPERT > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index e0fd10c0cfb8..75fe45633779 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -153,6 +153,9 @@ i915-y += \ > intel_region_lmem.o \ > intel_wopcm.o > > +# SVM code > +i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o > + > # general-purpose microcontroller (GuC) support > obj-y += gt/uc/ > i915-y += gt/uc/intel_uc.o \ > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > index 5003e616a1ad..af360238a392 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > @@ -2836,10 +2836,14 @@ int > i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, >struct drm_file *file) > { > + struct drm_i915_gem_exec_object2 *exec2_list, *exec2_list_user; > struct drm_i915_gem_execbuffer2 *args = data; > - struct drm_i915_gem_exec_object2 *exec2_list; > - struct drm_syncobj **fences = NULL; > const size_t count = args->buffer_count; > + struct drm_syncobj **fences = NULL; > + unsigned int i = 0, svm_count = 0; > + struct i915_address_space *vm; > + struct i915_gem_context *ctx; > + struct i915_svm_obj *svm_obj; > int err; > > if (!check_buffer_count(count)) { > @@ -2851,15 +2855,46 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, > void *data, > if (err) > return err; > > + ctx = i915_gem_context_lookup(file->driver_priv, args->rsvd1); > + if (!ctx || !rcu_access_pointer(ctx->vm)) > + return -ENOENT; This is just hopelessly wrong. For persistence, the _ce_->vm will have a list of must-be-present vma, with a flag for whether they need prefaulting (!svm everything must be prefaulted obviously). Then during reservation we ensure that all those persistent vma are in place (so we probably use an eviction list to keep track of those we need to instantiate on this execbuf). We don't even want to individually track activity on those vma, preferring to assume they are used by every request and so on change they need serialising [for explicit uAPI unbind, where possible we strive to do it async for endless, or at least sync against iova semaphore] against the last request in the vm (so we need a vm->active). However, we do need an EXT_EXTENSION to mark writes for implicit fencing (e.g. exported dmabuf) to replace the information lost from execobject[] > +struct drm_i915_gem_vm_bind { > + /** VA start to bind **/ > + __u64 start; iova; offset; /* into handle */ length; /* from offset */ > + > + /** Type of memory to [un]bind **/ > + __u32 type; > +#define I915_GEM_VM_BIND_SVM_OBJ 0 > + > + /** Object handle to [un]bind for I915_GEM_VM_BIND_SVM_OBJ type **/ > + __u32 handle; > + > + /** vm to [un]bind **/ > +
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
Quoting Jason Ekstrand (2019-12-14 00:36:19) > On Fri, Dec 13, 2019 at 5:24 PM Niranjan Vishwanathapura < > niranjana.vishwanathap...@intel.com> wrote: > > On Fri, Dec 13, 2019 at 04:58:42PM -0600, Jason Ekstrand wrote: > > > > +/** > > + * struct drm_i915_gem_vm_bind > > + * > > + * Bind an object in a vm's page table. > > > > First off, this is something I've wanted for a while for Vulkan, it's > just > > never made its way high enough up the priority list. However, it's > going > > to have to come one way or another soon. I'm glad to see kernel API > for > > this being proposed. > > I do, however, have a few high-level comments/questions about the API: > > 1. In order to be useful for sparse memory support, the API has to go > the > > other way around so that it binds a VA range to a range within the > BO. > It > > also needs to be able to handle overlapping where two different VA > ranges > > may map to the same underlying bytes in the BO. This likely means > that > > unbind needs to also take a VA range and only unbind that range. > > 2. If this is going to be useful for managing GL's address space > where > we > > have lots of BOs, we probably want it to take a list of ranges so we > > aren't making one ioctl for each thing we want to bind. > > Hi Jason, > > Yah, some of these requirements came up. > > > Yes, I have raised them every single time an API like this has come across my > e-mail inbox for years and they continue to get ignored. Why are we landing > an > API that we know isn't the API we want especially when it's pretty obvious > roughly what the API we want is? It may be less time in the short term, but > long-term it means two ioctls and two implementations in i915, IGT tests for > both code paths, and code in all UMDs to call one or the other depending on > what kernel you're running on, and we have to maintain all that code going > forward forever. Sure, that's a price we pay today for a variety of things > but > that's because they all seemed like the right thing at the time. Landing the > wrong API when we know it's the wrong API seems foolish. Exactly. This is not even close to the uAPI we need. Reposting an RFC without taking in the concerns last time (or the time before that, or the time before that...) suggests that you aren't really requesting for comments at all. -Chris ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
On Fri, Dec 13, 2019 at 5:24 PM Niranjan Vishwanathapura < niranjana.vishwanathap...@intel.com> wrote: > On Fri, Dec 13, 2019 at 04:58:42PM -0600, Jason Ekstrand wrote: > > > > +/** > > + * struct drm_i915_gem_vm_bind > > + * > > + * Bind an object in a vm's page table. > > > > First off, this is something I've wanted for a while for Vulkan, it's > just > > never made its way high enough up the priority list. However, it's > going > > to have to come one way or another soon. I'm glad to see kernel API > for > > this being proposed. > > I do, however, have a few high-level comments/questions about the API: > >1. In order to be useful for sparse memory support, the API has to go > the > > other way around so that it binds a VA range to a range within the > BO. It > > also needs to be able to handle overlapping where two different VA > ranges > > may map to the same underlying bytes in the BO. This likely means that > > unbind needs to also take a VA range and only unbind that range. > >2. If this is going to be useful for managing GL's address space > where we > > have lots of BOs, we probably want it to take a list of ranges so we > > aren't making one ioctl for each thing we want to bind. > > Hi Jason, > > Yah, some of these requirements came up. > Yes, I have raised them every single time an API like this has come across my e-mail inbox for years and they continue to get ignored. Why are we landing an API that we know isn't the API we want especially when it's pretty obvious roughly what the API we want is? It may be less time in the short term, but long-term it means two ioctls and two implementations in i915, IGT tests for both code paths, and code in all UMDs to call one or the other depending on what kernel you're running on, and we have to maintain all that code going forward forever. Sure, that's a price we pay today for a variety of things but that's because they all seemed like the right thing at the time. Landing the wrong API when we know it's the wrong API seems foolish. They are not being done here due to time and effort involved in defining > those requirements, implementing and validating. > For #1, yes, it would require more effort but for #2, it really doesn't take any extra effort to make it take an array... > However, this ioctl can be extended in a backward compatible way to handle > those requirements if required. > > >3. Why are there no ways to synchronize this with anything? For > binding, > > this probably isn't really needed as long as the VA range you're > binding > > is empty. However, if you want to move bindings around or unbind > > something, the only option is to block in userspace and then call > > bind/unbind. This can be done but it means even more threads in the > UMD > > which is unpleasant. One could argue that that's more or less what the > > kernel is going to have to do so we may as well do it in userspace. > > However, I'm not 100% convinced that's true. > > --Jason > > > > Yah, that is the thought. > But as SVM feature evolves, I think we can consider handling some such > cases > if hadling those in driver does make whole lot sense. > Sparse binding exists as a feature. It's been in D3D for some time and it's in Vulkan. We pretty much know what the requirements are. If you go look at how it's supposed to work in Vulkan, you have a binding queue and it waits on semaphores before [un]binding and signals semaphores after [un]binding. The biggest problem from an API (as opposed to implementation) POV with doing that in i915 is that we have too many synchronization primitives to choose from. :-( --Jason > Thanks, > Niranjana > > > > > + */ > > +struct drm_i915_gem_vm_bind { > > + /** VA start to bind **/ > > + __u64 start; > > + > > + /** Type of memory to [un]bind **/ > > + __u32 type; > > +#define I915_GEM_VM_BIND_SVM_OBJ 0 > > + > > + /** Object handle to [un]bind for I915_GEM_VM_BIND_SVM_OBJ > type > > **/ > > + __u32 handle; > > + > > + /** vm to [un]bind **/ > > + __u32 vm_id; > > + > > + /** Flags **/ > > + __u32 flags; > > +#define I915_GEM_VM_BIND_UNBIND (1 << 0) > > +#define I915_GEM_VM_BIND_READONLY(1 << 1) > > +}; > > + > > #if defined(__cplusplus) > > } > > #endif > > -- > > 2.21.0.rc0.32.g243a4c7e27 > > > > ___ > > Intel-gfx mailing list > > intel-...@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx > ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
On Fri, Dec 13, 2019 at 04:58:42PM -0600, Jason Ekstrand wrote: +/** + * struct drm_i915_gem_vm_bind + * + * Bind an object in a vm's page table. First off, this is something I've wanted for a while for Vulkan, it's just never made its way high enough up the priority list. However, it's going to have to come one way or another soon. I'm glad to see kernel API for this being proposed. I do, however, have a few high-level comments/questions about the API: 1. In order to be useful for sparse memory support, the API has to go the other way around so that it binds a VA range to a range within the BO. It also needs to be able to handle overlapping where two different VA ranges may map to the same underlying bytes in the BO. This likely means that unbind needs to also take a VA range and only unbind that range. 2. If this is going to be useful for managing GL's address space where we have lots of BOs, we probably want it to take a list of ranges so we aren't making one ioctl for each thing we want to bind. Hi Jason, Yah, some of these requirements came up. They are not being done here due to time and effort involved in defining those requirements, implementing and validating. However, this ioctl can be extended in a backward compatible way to handle those requirements if required. 3. Why are there no ways to synchronize this with anything? For binding, this probably isn't really needed as long as the VA range you're binding is empty. However, if you want to move bindings around or unbind something, the only option is to block in userspace and then call bind/unbind. This can be done but it means even more threads in the UMD which is unpleasant. One could argue that that's more or less what the kernel is going to have to do so we may as well do it in userspace. However, I'm not 100% convinced that's true. --Jason Yah, that is the thought. But as SVM feature evolves, I think we can consider handling some such cases if hadling those in driver does make whole lot sense. Thanks, Niranjana + */ +struct drm_i915_gem_vm_bind { + /** VA start to bind **/ + __u64 start; + + /** Type of memory to [un]bind **/ + __u32 type; +#define I915_GEM_VM_BIND_SVM_OBJ 0 + + /** Object handle to [un]bind for I915_GEM_VM_BIND_SVM_OBJ type **/ + __u32 handle; + + /** vm to [un]bind **/ + __u32 vm_id; + + /** Flags **/ + __u32 flags; +#define I915_GEM_VM_BIND_UNBIND (1 << 0) +#define I915_GEM_VM_BIND_READONLY(1 << 1) +}; + #if defined(__cplusplus) } #endif -- 2.21.0.rc0.32.g243a4c7e27 ___ Intel-gfx mailing list intel-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Intel-gfx] [RFC v2 02/12] drm/i915/svm: Runtime (RT) allocator support
On Fri, Dec 13, 2019 at 4:07 PM Niranjana Vishwanathapura < niranjana.vishwanathap...@intel.com> wrote: > Shared Virtual Memory (SVM) runtime allocator support allows > binding a shared virtual address to a buffer object (BO) in the > device page table through an ioctl call. > > Cc: Joonas Lahtinen > Cc: Jon Bloomfield > Cc: Daniel Vetter > Cc: Sudeep Dutt > Signed-off-by: Niranjana Vishwanathapura < > niranjana.vishwanathap...@intel.com> > --- > drivers/gpu/drm/i915/Kconfig | 11 > drivers/gpu/drm/i915/Makefile | 3 + > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 58 ++ > drivers/gpu/drm/i915/gem/i915_gem_svm.c | 60 +++ > drivers/gpu/drm/i915/gem/i915_gem_svm.h | 22 +++ > drivers/gpu/drm/i915/i915_drv.c | 21 +++ > drivers/gpu/drm/i915/i915_drv.h | 22 +++ > drivers/gpu/drm/i915/i915_gem_gtt.c | 1 + > drivers/gpu/drm/i915/i915_gem_gtt.h | 13 > include/uapi/drm/i915_drm.h | 27 + > 10 files changed, 227 insertions(+), 11 deletions(-) > create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c > create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h > > diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig > index ba9595960bbe..c2e48710eec8 100644 > --- a/drivers/gpu/drm/i915/Kconfig > +++ b/drivers/gpu/drm/i915/Kconfig > @@ -137,6 +137,16 @@ config DRM_I915_GVT_KVMGT > Choose this option if you want to enable KVMGT support for > Intel GVT-g. > > +config DRM_I915_SVM > + bool "Enable Shared Virtual Memory support in i915" > + depends on STAGING > + depends on DRM_I915 > + default n > + help > + Choose this option if you want Shared Virtual Memory (SVM) > + support in i915. With SVM support, one can share the virtual > + address space between a process and the GPU. > + > menu "drm/i915 Debugging" > depends on DRM_I915 > depends on EXPERT > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index e0fd10c0cfb8..75fe45633779 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -153,6 +153,9 @@ i915-y += \ > intel_region_lmem.o \ > intel_wopcm.o > > +# SVM code > +i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o > + > # general-purpose microcontroller (GuC) support > obj-y += gt/uc/ > i915-y += gt/uc/intel_uc.o \ > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > index 5003e616a1ad..af360238a392 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > @@ -2836,10 +2836,14 @@ int > i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, >struct drm_file *file) > { > + struct drm_i915_gem_exec_object2 *exec2_list, *exec2_list_user; > struct drm_i915_gem_execbuffer2 *args = data; > - struct drm_i915_gem_exec_object2 *exec2_list; > - struct drm_syncobj **fences = NULL; > const size_t count = args->buffer_count; > + struct drm_syncobj **fences = NULL; > + unsigned int i = 0, svm_count = 0; > + struct i915_address_space *vm; > + struct i915_gem_context *ctx; > + struct i915_svm_obj *svm_obj; > int err; > > if (!check_buffer_count(count)) { > @@ -2851,15 +2855,46 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, > void *data, > if (err) > return err; > > + ctx = i915_gem_context_lookup(file->driver_priv, args->rsvd1); > + if (!ctx || !rcu_access_pointer(ctx->vm)) > + return -ENOENT; > + > + rcu_read_lock(); > + vm = i915_vm_get(ctx->vm); > + rcu_read_unlock(); > + > +alloc_again: > + svm_count = vm->svm_count; > /* Allocate an extra slot for use by the command parser */ > - exec2_list = kvmalloc_array(count + 1, eb_element_size(), > + exec2_list = kvmalloc_array(count + svm_count + 1, > eb_element_size(), > __GFP_NOWARN | GFP_KERNEL); > if (exec2_list == NULL) { > DRM_DEBUG("Failed to allocate exec list for %zd buffers\n", > - count); > + count + svm_count); > return -ENOMEM; > } > - if (copy_from_user(exec2_list, > + mutex_lock(&vm->mutex); > + if (svm_count != vm->svm_count) { > + mutex_unlock(&vm->mutex); > + kvfree(exec2_list); > + goto alloc_again; > + } > + > + list_for_each_entry(svm_obj, &vm->svm_list, link) { > + memset(&exec2_list[i], 0, sizeof(*exec2_list)); > + exec2_list[i].handle = svm_obj->handle; > + exec2_list[i].offset = svm_obj->offset; > +