Re: [Intel-gfx] [PATCH 37/40] drm/i915: Allow a context to define its set of engines
On 28/09/2018 21:22, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-09-27 12:28:47)
> >
> > On 19/09/2018 20:55, Chris Wilson wrote:
> > > Over the last few years, we have debated how to extend the user API to
> > > support an increase in the number of engines, that may be sparse and
> > > even be heterogeneous within a class (not all video decoders created
> > > equal). We settled on using (class, instance) tuples to identify a
> > > specific engine, with an API for the user to construct a map of engines
> > > to capabilities. Into this picture, we then add a challenge of virtual
> > > engines; one user engine that maps behind the scenes to any number of
> > > physical engines. To keep it general, we want the user to have full
> > > control over that mapping. To that end, we allow the user to constrain a
> > > context to define the set of engines that it can access, order fully
> > > controlled by the user via (class, instance). With such precise control
> > > in context setup, we can continue to use the existing execbuf uABI of
> > > specifying a single index; only now it doesn't automagically map onto
> > > the engines, it uses the user defined engine map from the context.
> > >
> > > The I915_EXEC_DEFAULT slot is left empty, and invalid for use by
> > > execbuf. Its use will be revealed in the next patch.
> > >
> > > Signed-off-by: Chris Wilson
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem_context.c    | 88 ++
> > >  drivers/gpu/drm/i915/i915_gem_context.h    |  4 +
> > >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 22 --
> > >  include/uapi/drm/i915_drm.h                | 27 +++
> > >  4 files changed, 135 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > index a8570a07b3b7..313471253f51 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > @@ -90,6 +90,7 @@
> > >  #include
> > >  #include "i915_drv.h"
> > >  #include "i915_trace.h"
> > > +#include "i915_user_extensions.h"
> > >  #include "intel_workarounds.h"
> > >
> > >  #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
> > > @@ -223,6 +224,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
> > >  			ce->ops->destroy(ce);
> > >  	}
> > >
> > > +	kfree(ctx->engines);
> > > +
> > >  	if (ctx->timeline)
> > >  		i915_timeline_put(ctx->timeline);
> > >
> > > @@ -948,6 +951,87 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
> > >  	return ret;
> > >  }
> > >
> > > +struct set_engines {
> > > +	struct i915_gem_context *ctx;
> > > +	struct intel_engine_cs **engines;
> > > +	unsigned int nengine;
> > > +};
> > > +
> > > +static const i915_user_extension_fn set_engines__extensions[] = {
> > > +};
> >
> > This is OK unless someone one day gets the desire to make the extension
> > namespace sparse. I was thinking on how to put some warnings in the code
> > to warn about it, but I think that's for later. Namespace is also per
> > parent ioctl, another thing which would perhaps need enforcing.
>
> Unless you intend to go extremely sparse, I'd just leave it with the
> usual [NAME1] = func1, and skipping over NULLs. We can always add
> alternatives if need be (I'd actually been meaning to convert execbuf3
> over to this scheme to flesh it out a bit more).

It is fine for now, I was just thinking out loud how to protect us
against the table accidentally growing huge one day.

> > > +static int set_engines(struct i915_gem_context *ctx,
> > > +		       struct drm_i915_gem_context_param *args)
> > > +{
> > > +	struct i915_context_param_engines __user *user;
> > > +	struct set_engines set = {
> > > +		.ctx = ctx,
> > > +		.engines = NULL,
> > > +		.nengine = -1,
> > > +	};
> > > +	unsigned int n;
> > > +	u64 size, extensions;
> >
> > Size is u32 in the uAPI, so either that or unsigned int would do here.
> >
> > > +	int err;
> > > +
> > > +	user = u64_to_user_ptr(args->value);
> > > +	size = args->size;
> > > +	if (!size)
> >
> > 	args->size = sizeof(*user);
> >
> > ... if you want to allow size probing via set param, or we add a
> > get_param (too much work for nothing?) and only allow probing from there?
>
> It's a variable array, so a little trickier.

Indeed!

> > > +		goto out;
> > > +
> > > +	if (size < sizeof(struct i915_context_param_engines))
> > > +		return -EINVAL;
> > > +
> > > +	size -= sizeof(struct i915_context_param_engines);
> > > +	if (size % sizeof(*user->class_instance))
> > > +		return -EINVAL;
> > > +
> > > +	set.nengine = size / sizeof(*user->class_instance);
> > > +	if (set.nengine == 0 || set.nengine >= I915_EXEC_RING_MASK)
> > > +		return -EINVAL;
> > > +
> > > +	set.engines = kmalloc_array(set.nengine + 1,
> > > +				    sizeof(*set.engines),
> > > +				    GFP_KERNEL);
> > > +	if (!set.engines)
> > > +		return -ENOMEM;
> > > +
> > > +	set.engines[0] = NULL;
> >
> > /* Reserve the I915_EXEC_DEFAULT slot. */
> >
> > > +	for (n = 0; n < set.nengine; n++) {
> > > +		u32 class, instance;
> >
> > I will later recommend we use u16 for class/instance. I think we settled
> > for that in the meantime.
> >
> > > +
> > > +		if (get_user(class, &user->class_instance[n].class) ||
> > > +		    get_user(instance,
Re: [Intel-gfx] [PATCH 37/40] drm/i915: Allow a context to define its set of engines
Quoting Tvrtko Ursulin (2018-09-27 12:28:47)
>
> On 19/09/2018 20:55, Chris Wilson wrote:
> > Over the last few years, we have debated how to extend the user API to
> > support an increase in the number of engines, that may be sparse and
> > even be heterogeneous within a class (not all video decoders created
> > equal). We settled on using (class, instance) tuples to identify a
> > specific engine, with an API for the user to construct a map of engines
> > to capabilities. Into this picture, we then add a challenge of virtual
> > engines; one user engine that maps behind the scenes to any number of
> > physical engines. To keep it general, we want the user to have full
> > control over that mapping. To that end, we allow the user to constrain a
> > context to define the set of engines that it can access, order fully
> > controlled by the user via (class, instance). With such precise control
> > in context setup, we can continue to use the existing execbuf uABI of
> > specifying a single index; only now it doesn't automagically map onto
> > the engines, it uses the user defined engine map from the context.
> >
> > The I915_EXEC_DEFAULT slot is left empty, and invalid for use by
> > execbuf. Its use will be revealed in the next patch.
> >
> > Signed-off-by: Chris Wilson
> > ---
> >  drivers/gpu/drm/i915/i915_gem_context.c    | 88 ++
> >  drivers/gpu/drm/i915/i915_gem_context.h    |  4 +
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 22 --
> >  include/uapi/drm/i915_drm.h                | 27 +++
> >  4 files changed, 135 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> > b/drivers/gpu/drm/i915/i915_gem_context.c
> > index a8570a07b3b7..313471253f51 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -90,6 +90,7 @@
> >  #include
> >  #include "i915_drv.h"
> >  #include "i915_trace.h"
> > +#include "i915_user_extensions.h"
> >  #include "intel_workarounds.h"
> >
> >  #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
> > @@ -223,6 +224,8 @@ static void i915_gem_context_free(struct
> > i915_gem_context *ctx)
> >  			ce->ops->destroy(ce);
> >  	}
> >
> > +	kfree(ctx->engines);
> > +
> >  	if (ctx->timeline)
> >  		i915_timeline_put(ctx->timeline);
> >
> > @@ -948,6 +951,87 @@ int i915_gem_context_getparam_ioctl(struct drm_device
> > *dev, void *data,
> >  	return ret;
> >  }
> >
> > +struct set_engines {
> > +	struct i915_gem_context *ctx;
> > +	struct intel_engine_cs **engines;
> > +	unsigned int nengine;
> > +};
> > +
> > +static const i915_user_extension_fn set_engines__extensions[] = {
> > +};
>
> This is OK unless someone one day gets the desire to make the extension
> namespace sparse. I was thinking on how to put some warnings in the code
> to warn about it, but I think that's for later. Namespace is also per
> parent ioctl, another thing which would perhaps need enforcing.

Unless you intend to go extremely sparse, I'd just leave it with the
usual [NAME1] = func1, and skipping over NULLs. We can always add
alternatives if need be (I'd actually been meaning to convert execbuf3
over to this scheme to flesh it out a bit more).

> > +static int set_engines(struct i915_gem_context *ctx,
> > +		       struct drm_i915_gem_context_param *args)
> > +{
> > +	struct i915_context_param_engines __user *user;
> > +	struct set_engines set = {
> > +		.ctx = ctx,
> > +		.engines = NULL,
> > +		.nengine = -1,
> > +	};
> > +	unsigned int n;
> > +	u64 size, extensions;
>
> Size is u32 in the uAPI, so either that or unsigned int would do here.
>
> > +	int err;
> > +
> > +	user = u64_to_user_ptr(args->value);
> > +	size = args->size;
> > +	if (!size)
>
> 	args->size = sizeof(*user);
>
> ... if you want to allow size probing via set param, or we add a
> get_param (too much work for nothing?) and only allow probing from there?

It's a variable array, so a little trickier.

> > +		goto out;
> > +
> > +	if (size < sizeof(struct i915_context_param_engines))
> > +		return -EINVAL;
> > +
> > +	size -= sizeof(struct i915_context_param_engines);
> > +	if (size % sizeof(*user->class_instance))
> > +		return -EINVAL;
> > +
> > +	set.nengine = size / sizeof(*user->class_instance);
> > +	if (set.nengine == 0 || set.nengine >= I915_EXEC_RING_MASK)
> > +		return -EINVAL;
> > +
> > +	set.engines = kmalloc_array(set.nengine + 1,
> > +				    sizeof(*set.engines),
> > +				    GFP_KERNEL);
> > +	if (!set.engines)
> > +		return -ENOMEM;
> > +
> > +	set.engines[0] = NULL;
>
> /* Reserve the I915_EXEC_DEFAULT slot. */
>
> > +	for (n = 0; n < set.nengine; n++) {
> > +		u32 class,
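As an aside for readers of the archive, the "[NAME1] = func1, and skipping over NULLs" table layout mentioned in this reply can be sketched in plain userspace C. This is only a sketch under assumptions: struct ext, ext_fn, EXT_FOO, EXT_BAR and run_extensions() are hypothetical names that do not reflect the actual i915_user_extensions() interface, and treating a NULL slot as an error is an assumption about how holes in the table would be handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extension header: a singly linked chain tagged by name. */
struct ext {
	const struct ext *next;
	unsigned int name;
};

typedef int (*ext_fn)(const struct ext *ext, void *data);

/* Hypothetical names; the unused value 1 leaves a NULL hole in the table. */
enum { EXT_FOO = 0, EXT_BAR = 2 };

int ext_foo(const struct ext *ext, void *data)
{
	(void)ext;
	*(int *)data += 1;
	return 0;
}

int ext_bar(const struct ext *ext, void *data)
{
	(void)ext;
	*(int *)data += 10;
	return 0;
}

/* Dense table indexed by name; slots not listed are implicitly NULL. */
const ext_fn ext_table[] = {
	[EXT_FOO] = ext_foo,
	[EXT_BAR] = ext_bar,
};

#define TABLE_SIZE(t) (sizeof(t) / sizeof((t)[0]))

/* Walk the chain; reject names beyond the table or without a handler. */
int run_extensions(const struct ext *ext, const ext_fn *tbl, size_t count,
		   void *data)
{
	assert(tbl != NULL || count == 0);

	while (ext) {
		int err;

		if (ext->name >= count || !tbl[ext->name])
			return -1;

		err = tbl[ext->name](ext, data);
		if (err)
			return err;

		ext = ext->next;
	}

	return 0;
}
```

Note that the table occupies three slots although only two handlers exist: its size follows the largest name, which is the growth concern raised in the review.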
Re: [Intel-gfx] [PATCH 37/40] drm/i915: Allow a context to define its set of engines
On 19/09/2018 20:55, Chris Wilson wrote:
> Over the last few years, we have debated how to extend the user API to
> support an increase in the number of engines, that may be sparse and
> even be heterogeneous within a class (not all video decoders created
> equal). We settled on using (class, instance) tuples to identify a
> specific engine, with an API for the user to construct a map of engines
> to capabilities. Into this picture, we then add a challenge of virtual
> engines; one user engine that maps behind the scenes to any number of
> physical engines. To keep it general, we want the user to have full
> control over that mapping. To that end, we allow the user to constrain a
> context to define the set of engines that it can access, order fully
> controlled by the user via (class, instance). With such precise control
> in context setup, we can continue to use the existing execbuf uABI of
> specifying a single index; only now it doesn't automagically map onto
> the engines, it uses the user defined engine map from the context.
>
> The I915_EXEC_DEFAULT slot is left empty, and invalid for use by
> execbuf. Its use will be revealed in the next patch.
>
> Signed-off-by: Chris Wilson
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c    | 88 ++
>  drivers/gpu/drm/i915/i915_gem_context.h    |  4 +
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 22 --
>  include/uapi/drm/i915_drm.h                | 27 +++
>  4 files changed, 135 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index a8570a07b3b7..313471253f51 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -90,6 +90,7 @@
>  #include
>  #include "i915_drv.h"
>  #include "i915_trace.h"
> +#include "i915_user_extensions.h"
>  #include "intel_workarounds.h"
>
>  #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
> @@ -223,6 +224,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
>  			ce->ops->destroy(ce);
>  	}
>
> +	kfree(ctx->engines);
> +
>  	if (ctx->timeline)
>  		i915_timeline_put(ctx->timeline);
>
> @@ -948,6 +951,87 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
>  	return ret;
>  }
>
> +struct set_engines {
> +	struct i915_gem_context *ctx;
> +	struct intel_engine_cs **engines;
> +	unsigned int nengine;
> +};
> +
> +static const i915_user_extension_fn set_engines__extensions[] = {
> +};

This is OK unless someone one day gets the desire to make the extension
namespace sparse. I was thinking on how to put some warnings in the code
to warn about it, but I think that's for later. Namespace is also per
parent ioctl, another thing which would perhaps need enforcing.

> +
> +static int set_engines(struct i915_gem_context *ctx,
> +		       struct drm_i915_gem_context_param *args)
> +{
> +	struct i915_context_param_engines __user *user;
> +	struct set_engines set = {
> +		.ctx = ctx,
> +		.engines = NULL,
> +		.nengine = -1,
> +	};
> +	unsigned int n;
> +	u64 size, extensions;

Size is u32 in the uAPI, so either that or unsigned int would do here.

> +	int err;
> +
> +	user = u64_to_user_ptr(args->value);
> +	size = args->size;
> +	if (!size)

	args->size = sizeof(*user);

... if you want to allow size probing via set param, or we add a
get_param (too much work for nothing?) and only allow probing from there?

> +		goto out;
> +
> +	if (size < sizeof(struct i915_context_param_engines))
> +		return -EINVAL;
> +
> +	size -= sizeof(struct i915_context_param_engines);
> +	if (size % sizeof(*user->class_instance))
> +		return -EINVAL;
> +
> +	set.nengine = size / sizeof(*user->class_instance);
> +	if (set.nengine == 0 || set.nengine >= I915_EXEC_RING_MASK)
> +		return -EINVAL;
> +
> +	set.engines = kmalloc_array(set.nengine + 1,
> +				    sizeof(*set.engines),
> +				    GFP_KERNEL);
> +	if (!set.engines)
> +		return -ENOMEM;
> +
> +	set.engines[0] = NULL;

/* Reserve the I915_EXEC_DEFAULT slot. */

> +	for (n = 0; n < set.nengine; n++) {
> +		u32 class, instance;

I will later recommend we use u16 for class/instance. I think we settled
for that in the meantime.

> +
> +		if (get_user(class, &user->class_instance[n].class) ||
> +		    get_user(instance, &user->class_instance[n].instance)) {
> +			kfree(set.engines);
> +			return -EFAULT;
> +		}
> +
> +		set.engines[n + 1] =
> +			intel_engine_lookup_user(ctx->i915, class, instance);
> +		if (!set.engines[n + 1]) {
> +			kfree(set.engines);
> +			return -ENOENT;
> +		}
> +	}
> +
> +	err = -EFAULT;
> +	if (!get_user(extensions,
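The size checks in set_engines() reduce to a little arithmetic on a fixed header followed by a variable-length array. A hedged userspace sketch of that arithmetic follows. The struct layout (a 64-bit extensions field followed by pairs of 32-bit class/instance values) and the MAX_SLOTS value are assumptions made for illustration, count_engines() is a hypothetical helper, and the size == 0 case that the patch routes to "out" is deliberately not modelled. The size is taken as a 32-bit value, in line with the review comment above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: one (class, instance) pair per engine. */
struct class_instance {
	uint32_t class;
	uint32_t instance;
};

/* Assumed layout for illustration: fixed header plus a variable array. */
struct param_engines {
	uint64_t extensions;
	struct class_instance class_instance[];
};

/* Assumed stand-in for I915_EXEC_RING_MASK; illustrative value only. */
#define MAX_SLOTS 0x3f

/* Return the number of engines described by size bytes, or -1 if malformed. */
int count_engines(uint32_t size)
{
	uint32_t n;

	/* The sketch relies on the pair having no padding. */
	assert(sizeof(struct class_instance) == 2 * sizeof(uint32_t));

	/* Must at least cover the fixed header. */
	if (size < sizeof(struct param_engines))
		return -1;

	/* What remains must be a whole number of array entries. */
	size -= sizeof(struct param_engines);
	if (size % sizeof(struct class_instance))
		return -1;

	/* Reject an empty map and leave room for the reserved slot 0. */
	n = size / sizeof(struct class_instance);
	if (n == 0 || n >= MAX_SLOTS)
		return -1;

	return (int)n;
}
```

Because the array length is implied by the byte size, a probe for "how big a buffer do I need" has no single answer until the number of engines is known, which is presumably why probing a variable array is the trickier case.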
[Intel-gfx] [PATCH 37/40] drm/i915: Allow a context to define its set of engines
Over the last few years, we have debated how to extend the user API to
support an increase in the number of engines, that may be sparse and
even be heterogeneous within a class (not all video decoders created
equal). We settled on using (class, instance) tuples to identify a
specific engine, with an API for the user to construct a map of engines
to capabilities. Into this picture, we then add a challenge of virtual
engines; one user engine that maps behind the scenes to any number of
physical engines. To keep it general, we want the user to have full
control over that mapping. To that end, we allow the user to constrain a
context to define the set of engines that it can access, order fully
controlled by the user via (class, instance). With such precise control
in context setup, we can continue to use the existing execbuf uABI of
specifying a single index; only now it doesn't automagically map onto
the engines, it uses the user defined engine map from the context.

The I915_EXEC_DEFAULT slot is left empty, and invalid for use by
execbuf. Its use will be revealed in the next patch.

Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem_context.c    | 88 ++
 drivers/gpu/drm/i915/i915_gem_context.h    |  4 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 22 --
 include/uapi/drm/i915_drm.h                | 27 +++
 4 files changed, 135 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a8570a07b3b7..313471253f51 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,6 +90,7 @@
 #include
 #include "i915_drv.h"
 #include "i915_trace.h"
+#include "i915_user_extensions.h"
 #include "intel_workarounds.h"
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
@@ -223,6 +224,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 			ce->ops->destroy(ce);
 	}
 
+	kfree(ctx->engines);
+
 	if (ctx->timeline)
 		i915_timeline_put(ctx->timeline);
 
@@ -948,6 +951,87 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }
 
+struct set_engines {
+	struct i915_gem_context *ctx;
+	struct intel_engine_cs **engines;
+	unsigned int nengine;
+};
+
+static const i915_user_extension_fn set_engines__extensions[] = {
+};
+
+static int set_engines(struct i915_gem_context *ctx,
+		       struct drm_i915_gem_context_param *args)
+{
+	struct i915_context_param_engines __user *user;
+	struct set_engines set = {
+		.ctx = ctx,
+		.engines = NULL,
+		.nengine = -1,
+	};
+	unsigned int n;
+	u64 size, extensions;
+	int err;
+
+	user = u64_to_user_ptr(args->value);
+	size = args->size;
+	if (!size)
+		goto out;
+
+	if (size < sizeof(struct i915_context_param_engines))
+		return -EINVAL;
+
+	size -= sizeof(struct i915_context_param_engines);
+	if (size % sizeof(*user->class_instance))
+		return -EINVAL;
+
+	set.nengine = size / sizeof(*user->class_instance);
+	if (set.nengine == 0 || set.nengine >= I915_EXEC_RING_MASK)
+		return -EINVAL;
+
+	set.engines = kmalloc_array(set.nengine + 1,
+				    sizeof(*set.engines),
+				    GFP_KERNEL);
+	if (!set.engines)
+		return -ENOMEM;
+
+	set.engines[0] = NULL;
+	for (n = 0; n < set.nengine; n++) {
+		u32 class, instance;
+
+		if (get_user(class, &user->class_instance[n].class) ||
+		    get_user(instance, &user->class_instance[n].instance)) {
+			kfree(set.engines);
+			return -EFAULT;
+		}
+
+		set.engines[n + 1] =
+			intel_engine_lookup_user(ctx->i915, class, instance);
+		if (!set.engines[n + 1]) {
+			kfree(set.engines);
+			return -ENOENT;
+		}
+	}
+
+	err = -EFAULT;
+	if (!get_user(extensions, &user->extensions))
+		err = i915_user_extensions(u64_to_user_ptr(extensions),
+					   set_engines__extensions,
+					   ARRAY_SIZE(set_engines__extensions),
+					   &set);
+	if (err) {
+		kfree(set.engines);
+		return err;
+	}
+
+out:
+	kfree(ctx->engines);
+	ctx->engines = set.engines;
+	ctx->nengine = set.nengine + 1;
+
+	return 0;
+}
+
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file)
 {
@@ -1011,6 +1095,10 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
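For illustration only, the index-to-engine mapping described in the commit message can be modelled in plain userspace C. This is a sketch, not the i915 implementation: struct engine, struct ctx_model, find_engine(), ctx_set_engines() and ctx_lookup() are hypothetical names, and the only behaviour carried over from the patch is that slot 0 (the I915_EXEC_DEFAULT slot) is reserved and left empty, user entry n lands in slot n + 1, and a failed update leaves the previous map in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a physical engine; not intel_engine_cs. */
struct engine {
	unsigned int class;
	unsigned int instance;
};

/* Hypothetical per-context state: slot 0 reserved, user entries follow. */
struct ctx_model {
	const struct engine **engines;
	unsigned int nengine; /* counts the reserved slot as well */
};

/* Find an engine by (class, instance) among the available ones. */
const struct engine *find_engine(const struct engine *hw, size_t nhw,
				 unsigned int class, unsigned int instance)
{
	for (size_t i = 0; i < nhw; i++)
		if (hw[i].class == class && hw[i].instance == instance)
			return &hw[i];

	return NULL;
}

/* Build a new map; returns 0 on success, -1 on failure (ctx untouched). */
int ctx_set_engines(struct ctx_model *ctx,
		    const struct engine *hw, size_t nhw,
		    const struct engine *want, unsigned int nwant)
{
	const struct engine **map;

	assert(ctx != NULL);

	if (nwant == 0)
		return -1;

	map = calloc((size_t)nwant + 1, sizeof(*map));
	if (!map)
		return -1;

	map[0] = NULL; /* reserved default slot */
	for (unsigned int n = 0; n < nwant; n++) {
		map[n + 1] = find_engine(hw, nhw,
					 want[n].class, want[n].instance);
		if (!map[n + 1]) {
			free(map);
			return -1;
		}
	}

	/* Only replace the old map once the new one is fully validated. */
	free(ctx->engines);
	ctx->engines = map;
	ctx->nengine = nwant + 1;

	return 0;
}

/* Resolve an execbuf-style index; slot 0 and out-of-range yield NULL. */
const struct engine *ctx_lookup(const struct ctx_model *ctx, unsigned int idx)
{
	if (idx >= ctx->nengine)
		return NULL;

	return ctx->engines[idx];
}
```

In this model the order of the want array alone decides which index reaches which engine, which is the "order fully controlled by the user" property from the commit message.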