Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-23 Thread Sam Ravnborg
Hi Daniel.

On Mon, Mar 23, 2020 at 03:49:02PM +0100, Daniel Vetter wrote:
> We have lots of these. And the cleanup code tends to be of dubious
> quality. The biggest wrong pattern is that developers use devm_, which
> ties the release action to the underlying struct device, whereas
> all the userspace visible stuff attached to a drm_device can long
> outlive that one (e.g. after a hotunplug while userspace has open
> files and mmap'ed buffers). Give people what they want, but with more
> correctness.
> 
> Mostly copied from devres.c, with types adjusted to fit drm_device and
> a few simplifications - I didn't (yet) copy over everything. Since
> the types don't match code sharing looked like a hopeless endeavour.
> 
> For now it's only super simplified, no groups, you can't remove
> actions (but kfree exists, we'll need that soon). Plus all specific to
> drm_device ofc, including the logging. Which I didn't bother to make
> compile-time optional, since none of the other drm logging is compile
> time optional either.
> 
> One tricky bit here is the chicken&egg between allocating your
> drm_device structure and initializing it with drm_dev_init. For
> perfect onion unwinding we'd need to have the action to kfree the
> allocation registered before drm_dev_init registers any of its own
> release handlers. But drm_dev_init doesn't know where exactly the
> drm_device is embedded into the overall structure, and by the time it
> returns it'll all be too late. And forcing drivers to be able to clean up
> everything except the one kzalloc is silly.
> 
> Work around this by having a very special final_kfree pointer. This
> also avoids troubles with the list head possibly disappearing from
> underneath us when we release all resources attached to the
> drm_device.
> 
> v2: Do all the kerneldoc at the end, to avoid lots of fairly pointless
> shuffling while getting everything into shape.
> 
> v3: Add static to add/del_dr (Neil)
> Move typo fix to the right patch (Neil)
> 
> v4: Enforce contract for drmm_add_final_kfree:
> 
> Use ksize() to check that the drm_device is indeed contained somewhere
> in the final kfree(). Because we need that or the entire managed
> release logic blows up in a pile of use-after-frees. Motivated by a
> discussion with Laurent.
> 
> v5: Review from Laurent:
> - %zu instead of casting size_t
> - header guards
> - sorting of includes
> - guarding of data assignment if we didn't allocate it for a NULL
>   pointer
> - delete spurious newline
> - cast void* data parameter correctly in ->release call, no idea how
>   this even worked before
> 
> v3: Review from Sam
> - Add the kerneldoc for the managed sub-struct back in, even if it
>   doesn't show up in the generated html somehow.
> - Explain why __always_inline.
> - Fix bisectability around the final kfree() in drm_dev_release(). This
>   is just interim code which will disappear again.
> - Some whitespace polish.
> - Add debug output when drmm_add_action or drmm_kmalloc fail.
> 
> v4: My bisectability fix wasn't up to par as noticed by smatch.
> 
> v5: Remove unnecessary {} around if else
> 
> v6: Use kstrdup_const, which requires kfree_const and introducing a free_dr()
> helper (Thomas).
> 
> Cc: Thomas Zimmermann 
> Cc: Dan Carpenter 
> Cc: Sam Ravnborg 
> Cc: Laurent Pinchart 
> Cc: Neil Armstrong
> Cc: Greg Kroah-Hartman
> Cc: "Rafael J. Wysocki" 
> Signed-off-by: Daniel Vetter 

Looks good.
Reviewed-by: Sam Ravnborg 
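
For readers following the thread, the lifetime rule from the quoted commit message (release actions run in reverse order of registration, and the allocation that embeds the drm_device is freed last through a separate pointer) can be modelled in plain userspace C. This is only a sketch under stated assumptions: every model_* name, the singly linked list, and order_log are hypothetical and are not the kernel API, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev;
typedef void (*model_release_t)(struct model_dev *dev, void *data);

struct model_res {
	struct model_res *next;      /* newest entry first */
	model_release_t release;
	void *data;
};

struct model_dev {
	struct model_res *resources; /* head is the most recent action */
	void *final_free;            /* allocation that embeds this device */
};

/* Register a release action; 0 on success, -1 if allocation fails. */
static int model_add_action(struct model_dev *dev, model_release_t release,
			    void *data)
{
	struct model_res *res = malloc(sizeof(*res));

	if (!res)
		return -1;
	res->release = release;
	res->data = data;
	res->next = dev->resources;
	dev->resources = res;
	return 0;
}

/* Run all actions newest-first, then free the containing allocation. */
static void model_release_all(struct model_dev *dev)
{
	void *final_free = dev->final_free; /* read before anything is freed */

	while (dev->resources) {
		struct model_res *res = dev->resources;

		dev->resources = res->next;
		res->release(dev, res->data);
		free(res);
	}
	free(final_free); /* dev must not be touched after this point */
}

/* Demo state: records the order in which the actions ran. */
int order_log[4];
int order_len;

static void log_release(struct model_dev *dev, void *data)
{
	(void)dev;
	order_log[order_len++] = *(int *)data;
}

struct model_driver {
	int private_state;
	struct model_dev dev; /* embedded, deliberately not at offset 0 */
};

/* Registers actions 1, 2, 3 and returns how many ran on release. */
int model_demo(void)
{
	static int ids[] = { 1, 2, 3 };
	struct model_driver *drv = calloc(1, sizeof(*drv));
	size_t i;

	if (!drv)
		return -1;
	drv->dev.final_free = drv;
	order_len = 0;
	for (i = 0; i < 3; i++) {
		if (model_add_action(&drv->dev, log_release, &ids[i]))
			break;
	}
	model_release_all(&drv->dev);
	return order_len;
}
```

In this model the list head lives inside the device, so the containing allocation has to go last; that is the same reason the commit message gives for keeping final_kfree outside the resource list.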

> ---
>  Documentation/gpu/drm-internals.rst |   6 +
>  drivers/gpu/drm/Makefile|   3 +-
>  drivers/gpu/drm/drm_drv.c   |  15 ++-
>  drivers/gpu/drm/drm_internal.h  |   3 +
>  drivers/gpu/drm/drm_managed.c   | 193 
>  include/drm/drm_device.h|  15 +++
>  include/drm/drm_managed.h   |  30 +
>  include/drm/drm_print.h |   6 +
>  8 files changed, 267 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/gpu/drm/drm_managed.c
>  create mode 100644 include/drm/drm_managed.h
> 
> diff --git a/Documentation/gpu/drm-internals.rst b/Documentation/gpu/drm-internals.rst
> index a73320576ca9..a6b6145fda78 100644
> --- a/Documentation/gpu/drm-internals.rst
> +++ b/Documentation/gpu/drm-internals.rst
> @@ -132,6 +132,12 @@ be unmapped; on many devices, the ROM address decoder is shared with
>  other BARs, so leaving it mapped could cause undesired behaviour like
>  hangs or memory corruption.
>  
> +Managed Resources
> +-----------------
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_managed.c
> +   :doc: managed resources
> +
>  Bus-specific Device Registration and PCI Support
>  
>  
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 7f72ef5e7811..183c60048307 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -17,7 +17,8 @@ drm-y   :=  drm_auth.o drm_cache.o \
>   drm_plane.o drm_color_mgmt.o drm_print.o \
>   dr

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-16 Thread Daniel Vetter
On Wed, Mar 11, 2020 at 10:14:03AM +0100, Thomas Zimmermann wrote:
> 
> 
> Am 02.03.20 um 23:25 schrieb Daniel Vetter:
> <...>
> > +
> > +int __drmm_add_action(struct drm_device *dev,
> > + drmres_release_t action,
> > + void *data, const char *name)
> > +{
> > +   struct drmres *dr;
> > +   void **void_ptr;
> > +
> > +   dr = alloc_dr(action, data ? sizeof(void*) : 0,
> > + GFP_KERNEL | __GFP_ZERO,
> > + dev_to_node(dev->dev));
> > +   if (!dr) {
> > +   drm_dbg_drmres(dev, "failed to add action %s for %p\n",
> > +  name, data);
> > +   return -ENOMEM;
> > +   }
> > +
> > +   dr->node.name = name;
> 
> Maybe do a kstrdup_const() on name and later a kfree_const() during
> release. Just in case someone decides to allocate 'name' dynamically.

Makes sense, but a bit of churn since I need a free_dr() helper now :-)
-Daniel
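
A rough userspace sketch of why a single free helper becomes necessary once the name may be duplicated. The kernel's kstrdup_const()/kfree_const() decide "constant or not" by checking whether the pointer lies in .rodata, which portable C cannot do, so this model carries an explicit flag instead. All model_* names are hypothetical and this is not the code from the series.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model of a resource node with a debug name. */
struct model_node {
	const char *name;
	int name_owned; /* 1 if name was duplicated and must be freed */
};

/* Stand-in for kstrdup_const(): share constants, copy everything else. */
static int model_set_name(struct model_node *node, const char *name,
			  int is_constant)
{
	size_t len;
	char *copy;

	if (is_constant) {
		node->name = name;
		node->name_owned = 0;
		return 0;
	}
	len = strlen(name) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, name, len);
	node->name = copy;
	node->name_owned = 1;
	return 0;
}

/*
 * Stand-in for a free_dr()-style helper: every path that frees a node goes
 * through here, so the name is released together with the node.
 */
static void model_free_node(struct model_node *node)
{
	if (node->name_owned)
		free((void *)node->name);
	free(node);
}

/* Returns 1 if the literal was shared and the dynamic name was copied. */
int model_name_demo(void)
{
	static const char literal[] = "kmalloc";
	char dynamic[] = "probe_cleanup";
	struct model_node *a = calloc(1, sizeof(*a));
	struct model_node *b = calloc(1, sizeof(*b));
	int ok;

	if (!a || !b || model_set_name(a, literal, 1) ||
	    model_set_name(b, dynamic, 0)) {
		free(a);
		free(b);
		return -1;
	}
	dynamic[0] = 'X'; /* the caller's buffer changes after registration */
	ok = a->name == literal && strcmp(b->name, "probe_cleanup") == 0;
	model_free_node(a);
	model_free_node(b);
	return ok;
}
```

Without such a helper, each place that frees a node would have to remember the name separately, which is the churn mentioned above.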

> 
> > +   if (data) {
> > +   void_ptr = (void **)&dr->data;
> > +   *void_ptr = data;
> > +   }
> > +
> > +   add_dr(dev, dr);
> > +
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL(__drmm_add_action);
> > +
> > +void *drmm_kmalloc(struct drm_device *dev, size_t size, gfp_t gfp)
> > +{
> > +   struct drmres *dr;
> > +
> > +   dr = alloc_dr(NULL, size, gfp, dev_to_node(dev->dev));
> > +   if (!dr) {
> > +   drm_dbg_drmres(dev, "failed to allocate %zu bytes, %u flags\n",
> > +  size, gfp);
> > +   return NULL;
> > +   }
> > +   dr->node.name = "kmalloc";
> > +
> > +   add_dr(dev, dr);
> > +
> > +   return dr->data;
> > +}
> > +EXPORT_SYMBOL(drmm_kmalloc);
> > +
> > +void drmm_kfree(struct drm_device *dev, void *data)
> > +{
> > +   struct drmres *dr_match = NULL, *dr;
> > +   unsigned long flags;
> > +
> > +   if (!data)
> > +   return;
> > +
> > +   spin_lock_irqsave(&dev->managed.lock, flags);
> > +   list_for_each_entry(dr, &dev->managed.resources, node.entry) {
> > +   if (dr->data == data) {
> > +   dr_match = dr;
> > +   del_dr(dev, dr_match);
> > +   break;
> > +   }
> > +   }
> > +   spin_unlock_irqrestore(&dev->managed.lock, flags);
> > +
> > +   if (WARN_ON(!dr_match))
> > +   return;
> > +
> > +   kfree(dr_match);
> > +}
> > +EXPORT_SYMBOL(drmm_kfree);
> > diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> > index bb60a949f416..d39132b477dd 100644
> > --- a/include/drm/drm_device.h
> > +++ b/include/drm/drm_device.h
> > @@ -67,6 +67,21 @@ struct drm_device {
> > /** @dev: Device structure of bus-device */
> > struct device *dev;
> >  
> > +   /**
> > +* @managed:
> > +*
> > +* Managed resources linked to the lifetime of this &drm_device as
> > +* tracked by @ref.
> > +*/
> > +   struct {
> > +   /** @managed.resources: managed resources list */
> > +   struct list_head resources;
> > +   /** @managed.final_kfree: pointer for final kfree() call */
> > +   void *final_kfree;
> > +   /** @managed.lock: protects @managed.resources */
> > +   spinlock_t lock;
> > +   } managed;
> > +
> > /** @driver: DRM driver managing the device */
> > struct drm_driver *driver;
> >  
> > diff --git a/include/drm/drm_managed.h b/include/drm/drm_managed.h
> > new file mode 100644
> > index ..7b5df7d09b19
> > --- /dev/null
> > +++ b/include/drm/drm_managed.h
> > @@ -0,0 +1,30 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#ifndef _DRM_MANAGED_H_
> > +#define _DRM_MANAGED_H_
> > +
> > +#include 
> > +#include 
> > +
> > +struct drm_device;
> > +
> > +typedef void (*drmres_release_t)(struct drm_device *dev, void *res);
> > +
> > +#define drmm_add_action(dev, action, data) \
> > +   __drmm_add_action(dev, action, data, #action)
> > +
> > +int __must_check __drmm_add_action(struct drm_device *dev,
> > +  drmres_release_t action,
> > +  void *data, const char *name);
> > +
> > +void drmm_add_final_kfree(struct drm_device *dev, void *parent);
> > +
> > +void *drmm_kmalloc(struct drm_device *dev, size_t size, gfp_t gfp) __malloc;
> > +static inline void *drmm_kzalloc(struct drm_device *dev, size_t size, gfp_t gfp)
> > +{
> > +   return drmm_kmalloc(dev, size, gfp | __GFP_ZERO);
> > +}
> > +
> > +void drmm_kfree(struct drm_device *dev, void *data);
> > +
> > +#endif
> > diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
> > index ca7cee8e728a..1c9417430d08 100644
> > --- a/include/drm/drm_print.h
> > +++ b/include/drm/drm_print.h
> > @@ -313,6 +313,10 @@ enum drm_debug_category {
> >  * @DRM_UT_DP: Used in the DP code.
> >  */
> > DRM_UT_DP   = 0x100,
> > +   /**
> > +* @DRM_UT_DRMRES: Used in the drm managed resources code.
> > +*/
> > +   DRM_UT_DRMRES   = 0x200,
> >  };
> >  
> >  static inline bool drm_debug_enabled(enum drm_debug_category 

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-16 Thread Daniel Vetter
On Wed, Mar 11, 2020 at 10:07:13AM +0100, Thomas Zimmermann wrote:
> Hi Daniel
> 
> Am 02.03.20 um 23:25 schrieb Daniel Vetter:
> > We have lots of these. And the cleanup code tends to be of dubious
> > quality. The biggest wrong pattern is that developers use devm_, which
> > ties the release action to the underlying struct device, whereas
> > all the userspace visible stuff attached to a drm_device can long
> > outlive that one (e.g. after a hotunplug while userspace has open
> > files and mmap'ed buffers). Give people what they want, but with more
> > correctness.
> > 
> > Mostly copied from devres.c, with types adjusted to fit drm_device and
> > a few simplifications - I didn't (yet) copy over everything. Since
> > the types don't match code sharing looked like a hopeless endeavour.
> > 
> > For now it's only super simplified, no groups, you can't remove
> > actions (but kfree exists, we'll need that soon). Plus all specific to
> > drm_device ofc, including the logging. Which I didn't bother to make
> > compile-time optional, since none of the other drm logging is compile
> > time optional either.
> > 
> > One tricky bit here is the chicken&egg between allocating your
> > drm_device structure and initializing it with drm_dev_init. For
> > perfect onion unwinding we'd need to have the action to kfree the
> > allocation registered before drm_dev_init registers any of its own
> > release handlers. But drm_dev_init doesn't know where exactly the
> > drm_device is embedded into the overall structure, and by the time it
> > returns it'll all be too late. And forcing drivers to be able to clean up
> > everything except the one kzalloc is silly.
> > 
> > Work around this by having a very special final_kfree pointer. This
> > also avoids troubles with the list head possibly disappearing from
> > underneath us when we release all resources attached to the
> > drm_device.
> > 
> > v2: Do all the kerneldoc at the end, to avoid lots of fairly pointless
> > shuffling while getting everything into shape.
> > 
> > v3: Add static to add/del_dr (Neil)
> > Move typo fix to the right patch (Neil)
> > 
> > v4: Enforce contract for drmm_add_final_kfree:
> > 
> > Use ksize() to check that the drm_device is indeed contained somewhere
> > in the final kfree(). Because we need that or the entire managed
> > release logic blows up in a pile of use-after-frees. Motivated by a
> > discussion with Laurent.
> > 
> > v5: Review from Laurent:
> > - %zu instead of casting size_t
> > - header guards
> > - sorting of includes
> > - guarding of data assignment if we didn't allocate it for a NULL
> >   pointer
> > - delete spurious newline
> > - cast void* data parameter correctly in ->release call, no idea how
> >   this even worked before
> > 
> > v3: Review from Sam
> > - Add the kerneldoc for the managed sub-struct back in, even if it
> >   doesn't show up in the generated html somehow.
> > - Explain why __always_inline.
> > - Fix bisectability around the final kfree() in drm_dev_release(). This
> >   is just interim code which will disappear again.
> > - Some whitespace polish.
> > - Add debug output when drmm_add_action or drmm_kmalloc fail.
> > 
> > Cc: Sam Ravnborg 
> > Cc: Laurent Pinchart 
> > Cc: Neil Armstrong
> > Cc: Greg Kroah-Hartman
> > Cc: "Rafael J. Wysocki" 
> > Signed-off-by: Daniel Vetter 
> > ---
> >  Documentation/gpu/drm-internals.rst |   6 +
> >  drivers/gpu/drm/Makefile|   3 +-
> >  drivers/gpu/drm/drm_drv.c   |  12 ++
> >  drivers/gpu/drm/drm_internal.h  |   3 +
> >  drivers/gpu/drm/drm_managed.c   | 186 
> >  include/drm/drm_device.h|  15 +++
> >  include/drm/drm_managed.h   |  30 +
> >  include/drm/drm_print.h |   6 +
> >  8 files changed, 260 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/gpu/drm/drm_managed.c
> >  create mode 100644 include/drm/drm_managed.h
> > 
> > diff --git a/Documentation/gpu/drm-internals.rst b/Documentation/gpu/drm-internals.rst
> > index a73320576ca9..a6b6145fda78 100644
> > --- a/Documentation/gpu/drm-internals.rst
> > +++ b/Documentation/gpu/drm-internals.rst
> > @@ -132,6 +132,12 @@ be unmapped; on many devices, the ROM address decoder is shared with
> >  other BARs, so leaving it mapped could cause undesired behaviour like
> >  hangs or memory corruption.
> >  
> > +Managed Resources
> > +-----------------
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_managed.c
> > +   :doc: managed resources
> > +
> >  Bus-specific Device Registration and PCI Support
> >  
> >  
> > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > index 7f72ef5e7811..183c60048307 100644
> > --- a/drivers/gpu/drm/Makefile
> > +++ b/drivers/gpu/drm/Makefile
> > @@ -17,7 +17,8 @@ drm-y   :=drm_auth.o drm_cache.o \
> > drm_plane.o drm_color_mgmt.o drm_print.o \
> > drm_dumb_buffers.o drm_mode_config.o drm_v

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-11 Thread Thomas Zimmermann


Am 11.03.20 um 10:07 schrieb Thomas Zimmermann:
> Hi Daniel
> 
> Am 02.03.20 um 23:25 schrieb Daniel Vetter:
>> We have lots of these. And the cleanup code tends to be of dubious
>> quality. The biggest wrong pattern is that developers use devm_, which
>> ties the release action to the underlying struct device, whereas
>> all the userspace visible stuff attached to a drm_device can long
>> outlive that one (e.g. after a hotunplug while userspace has open
>> files and mmap'ed buffers). Give people what they want, but with more
>> correctness.
>>
>> Mostly copied from devres.c, with types adjusted to fit drm_device and
>> a few simplifications - I didn't (yet) copy over everything. Since
>> the types don't match code sharing looked like a hopeless endeavour.
>>
>> For now it's only super simplified, no groups, you can't remove
>> actions (but kfree exists, we'll need that soon). Plus all specific to
>> drm_device ofc, including the logging. Which I didn't bother to make
>> compile-time optional, since none of the other drm logging is compile
>> time optional either.
>>
>> One tricky bit here is the chicken&egg between allocating your
>> drm_device structure and initializing it with drm_dev_init. For
>> perfect onion unwinding we'd need to have the action to kfree the
>> allocation registered before drm_dev_init registers any of its own
>> release handlers. But drm_dev_init doesn't know where exactly the
>> drm_device is embedded into the overall structure, and by the time it
>> returns it'll all be too late. And forcing drivers to be able to clean up
>> everything except the one kzalloc is silly.
>>
>> Work around this by having a very special final_kfree pointer. This
>> also avoids troubles with the list head possibly disappearing from
>> underneath us when we release all resources attached to the
>> drm_device.
>>
>> v2: Do all the kerneldoc at the end, to avoid lots of fairly pointless
>> shuffling while getting everything into shape.
>>
>> v3: Add static to add/del_dr (Neil)
>> Move typo fix to the right patch (Neil)
>>
>> v4: Enforce contract for drmm_add_final_kfree:
>>
>> Use ksize() to check that the drm_device is indeed contained somewhere
>> in the final kfree(). Because we need that or the entire managed
>> release logic blows up in a pile of use-after-frees. Motivated by a
>> discussion with Laurent.
>>
>> v5: Review from Laurent:
>> - %zu instead of casting size_t
>> - header guards
>> - sorting of includes
>> - guarding of data assignment if we didn't allocate it for a NULL
>>   pointer
>> - delete spurious newline
>> - cast void* data parameter correctly in ->release call, no idea how
>>   this even worked before
>>
>> v3: Review from Sam
>> - Add the kerneldoc for the managed sub-struct back in, even if it
>>   doesn't show up in the generated html somehow.
>> - Explain why __always_inline.
>> - Fix bisectability around the final kfree() in drm_dev_release(). This
>>   is just interim code which will disappear again.
>> - Some whitespace polish.
>> - Add debug output when drmm_add_action or drmm_kmalloc fail.
>>
>> Cc: Sam Ravnborg 
>> Cc: Laurent Pinchart 
>> Cc: Neil Armstrong
>> Cc: Greg Kroah-Hartman
>> Cc: "Rafael J. Wysocki" 
>> Signed-off-by: Daniel Vetter 
>> ---
>>  Documentation/gpu/drm-internals.rst |   6 +
>>  drivers/gpu/drm/Makefile|   3 +-
>>  drivers/gpu/drm/drm_drv.c   |  12 ++
>>  drivers/gpu/drm/drm_internal.h  |   3 +
>>  drivers/gpu/drm/drm_managed.c   | 186 
>>  include/drm/drm_device.h|  15 +++
>>  include/drm/drm_managed.h   |  30 +
>>  include/drm/drm_print.h |   6 +
>>  8 files changed, 260 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/gpu/drm/drm_managed.c
>>  create mode 100644 include/drm/drm_managed.h
>>
>> diff --git a/Documentation/gpu/drm-internals.rst b/Documentation/gpu/drm-internals.rst
>> index a73320576ca9..a6b6145fda78 100644
>> --- a/Documentation/gpu/drm-internals.rst
>> +++ b/Documentation/gpu/drm-internals.rst
>> @@ -132,6 +132,12 @@ be unmapped; on many devices, the ROM address decoder is shared with
>>  other BARs, so leaving it mapped could cause undesired behaviour like
>>  hangs or memory corruption.
>>  
>> +Managed Resources
>> +-----------------
>> +
>> +.. kernel-doc:: drivers/gpu/drm/drm_managed.c
>> +   :doc: managed resources
>> +
>>  Bus-specific Device Registration and PCI Support
>>  
>>  
>> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
>> index 7f72ef5e7811..183c60048307 100644
>> --- a/drivers/gpu/drm/Makefile
>> +++ b/drivers/gpu/drm/Makefile
>> @@ -17,7 +17,8 @@ drm-y   := drm_auth.o drm_cache.o \
>>  drm_plane.o drm_color_mgmt.o drm_print.o \
>>  drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
>>  drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
>> -drm_client_modeset.o drm_atomic_uapi

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-11 Thread Thomas Zimmermann


Am 02.03.20 um 23:25 schrieb Daniel Vetter:
<...>
> +
> +int __drmm_add_action(struct drm_device *dev,
> +   drmres_release_t action,
> +   void *data, const char *name)
> +{
> + struct drmres *dr;
> + void **void_ptr;
> +
> + dr = alloc_dr(action, data ? sizeof(void*) : 0,
> +   GFP_KERNEL | __GFP_ZERO,
> +   dev_to_node(dev->dev));
> + if (!dr) {
> + drm_dbg_drmres(dev, "failed to add action %s for %p\n",
> +name, data);
> + return -ENOMEM;
> + }
> +
> + dr->node.name = name;

Maybe do a kstrdup_const() on name and later a kfree_const() during
release. Just in case someone decides to allocate 'name' dynamically.
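
For context on where 'name' normally comes from: the drmm_add_action() macro quoted further down passes #action, so the name is the stringified identifier of the release callback, that is, a string literal. A small userspace sketch of that macro shape follows; the model_* names and the last_name variable are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*model_release_t)(void *data);

/* Hypothetical stand-in for __drmm_add_action(): it only records the name. */
static const char *last_name;

int model_add_action_named(model_release_t action, void *data,
			   const char *name)
{
	(void)action;
	(void)data;
	last_name = name;
	return 0;
}

/*
 * Same shape as the quoted drmm_add_action() macro: #action turns the
 * callback's identifier into a string literal at compile time.
 */
#define model_add_action(action, data) \
	model_add_action_named(action, data, #action)

static void model_cleanup_fb(void *data)
{
	(void)data;
}

/* Returns the name recorded for the registered callback. */
const char *model_stringify_demo(void)
{
	if (model_add_action(model_cleanup_fb, NULL))
		return NULL;
	return last_name;
}
```

Going through the macro always yields a literal; a dynamically allocated name could only arrive through a direct call to the double-underscore function, which is the case the suggestion above guards against.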

> + if (data) {
> + void_ptr = (void **)&dr->data;
> + *void_ptr = data;
> + }
> +
> + add_dr(dev, dr);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(__drmm_add_action);
> +
> +void *drmm_kmalloc(struct drm_device *dev, size_t size, gfp_t gfp)
> +{
> + struct drmres *dr;
> +
> + dr = alloc_dr(NULL, size, gfp, dev_to_node(dev->dev));
> + if (!dr) {
> + drm_dbg_drmres(dev, "failed to allocate %zu bytes, %u flags\n",
> +size, gfp);
> + return NULL;
> + }
> + dr->node.name = "kmalloc";
> +
> + add_dr(dev, dr);
> +
> + return dr->data;
> +}
> +EXPORT_SYMBOL(drmm_kmalloc);
> +
> +void drmm_kfree(struct drm_device *dev, void *data)
> +{
> + struct drmres *dr_match = NULL, *dr;
> + unsigned long flags;
> +
> + if (!data)
> + return;
> +
> + spin_lock_irqsave(&dev->managed.lock, flags);
> + list_for_each_entry(dr, &dev->managed.resources, node.entry) {
> + if (dr->data == data) {
> + dr_match = dr;
> + del_dr(dev, dr_match);
> + break;
> + }
> + }
> + spin_unlock_irqrestore(&dev->managed.lock, flags);
> +
> + if (WARN_ON(!dr_match))
> + return;
> +
> + kfree(dr_match);
> +}
> +EXPORT_SYMBOL(drmm_kfree);
> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> index bb60a949f416..d39132b477dd 100644
> --- a/include/drm/drm_device.h
> +++ b/include/drm/drm_device.h
> @@ -67,6 +67,21 @@ struct drm_device {
>   /** @dev: Device structure of bus-device */
>   struct device *dev;
>  
> + /**
> +  * @managed:
> +  *
> +  * Managed resources linked to the lifetime of this &drm_device as
> +  * tracked by @ref.
> +  */
> + struct {
> + /** @managed.resources: managed resources list */
> + struct list_head resources;
> + /** @managed.final_kfree: pointer for final kfree() call */
> + void *final_kfree;
> + /** @managed.lock: protects @managed.resources */
> + spinlock_t lock;
> + } managed;
> +
>   /** @driver: DRM driver managing the device */
>   struct drm_driver *driver;
>  
> diff --git a/include/drm/drm_managed.h b/include/drm/drm_managed.h
> new file mode 100644
> index ..7b5df7d09b19
> --- /dev/null
> +++ b/include/drm/drm_managed.h
> @@ -0,0 +1,30 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#ifndef _DRM_MANAGED_H_
> +#define _DRM_MANAGED_H_
> +
> +#include 
> +#include 
> +
> +struct drm_device;
> +
> +typedef void (*drmres_release_t)(struct drm_device *dev, void *res);
> +
> +#define drmm_add_action(dev, action, data) \
> + __drmm_add_action(dev, action, data, #action)
> +
> +int __must_check __drmm_add_action(struct drm_device *dev,
> +drmres_release_t action,
> +void *data, const char *name);
> +
> +void drmm_add_final_kfree(struct drm_device *dev, void *parent);
> +
> +void *drmm_kmalloc(struct drm_device *dev, size_t size, gfp_t gfp) __malloc;
> +static inline void *drmm_kzalloc(struct drm_device *dev, size_t size, gfp_t gfp)
> +{
> + return drmm_kmalloc(dev, size, gfp | __GFP_ZERO);
> +}
> +
> +void drmm_kfree(struct drm_device *dev, void *data);
> +
> +#endif
> diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
> index ca7cee8e728a..1c9417430d08 100644
> --- a/include/drm/drm_print.h
> +++ b/include/drm/drm_print.h
> @@ -313,6 +313,10 @@ enum drm_debug_category {
>* @DRM_UT_DP: Used in the DP code.
>*/
>   DRM_UT_DP   = 0x100,
> + /**
> +  * @DRM_UT_DRMRES: Used in the drm managed resources code.
> +  */
> + DRM_UT_DRMRES   = 0x200,
>  };
>  
>  static inline bool drm_debug_enabled(enum drm_debug_category category)
> @@ -442,6 +446,8 @@ void drm_dev_dbg(const struct device *dev, enum drm_debug_category category,
>   drm_dev_dbg((drm)->dev, DRM_UT_LEASE, fmt, ##__VA_ARGS__)
>  #define drm_dbg_dp(drm, fmt, ...)\
>   drm_dev_dbg((drm)->dev, DRM_UT_DP, fmt, ##__VA_ARGS__)
> +#d

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-11 Thread Thomas Zimmermann
Hi Daniel

Am 02.03.20 um 23:25 schrieb Daniel Vetter:
> We have lots of these. And the cleanup code tends to be of dubious
> quality. The biggest wrong pattern is that developers use devm_, which
> ties the release action to the underlying struct device, whereas
> all the userspace visible stuff attached to a drm_device can long
> outlive that one (e.g. after a hotunplug while userspace has open
> files and mmap'ed buffers). Give people what they want, but with more
> correctness.
> 
> Mostly copied from devres.c, with types adjusted to fit drm_device and
> a few simplifications - I didn't (yet) copy over everything. Since
> the types don't match code sharing looked like a hopeless endeavour.
> 
> For now it's only super simplified, no groups, you can't remove
> actions (but kfree exists, we'll need that soon). Plus all specific to
> drm_device ofc, including the logging. Which I didn't bother to make
> compile-time optional, since none of the other drm logging is compile
> time optional either.
> 
> One tricky bit here is the chicken&egg between allocating your
> drm_device structure and initializing it with drm_dev_init. For
> perfect onion unwinding we'd need to have the action to kfree the
> allocation registered before drm_dev_init registers any of its own
> release handlers. But drm_dev_init doesn't know where exactly the
> drm_device is embedded into the overall structure, and by the time it
> returns it'll all be too late. And forcing drivers to be able to clean up
> everything except the one kzalloc is silly.
> 
> Work around this by having a very special final_kfree pointer. This
> also avoids troubles with the list head possibly disappearing from
> underneath us when we release all resources attached to the
> drm_device.
> 
> v2: Do all the kerneldoc at the end, to avoid lots of fairly pointless
> shuffling while getting everything into shape.
> 
> v3: Add static to add/del_dr (Neil)
> Move typo fix to the right patch (Neil)
> 
> v4: Enforce contract for drmm_add_final_kfree:
> 
> Use ksize() to check that the drm_device is indeed contained somewhere
> in the final kfree(). Because we need that or the entire managed
> release logic blows up in a pile of use-after-frees. Motivated by a
> discussion with Laurent.
> 
> v5: Review from Laurent:
> - %zu instead of casting size_t
> - header guards
> - sorting of includes
> - guarding of data assignment if we didn't allocate it for a NULL
>   pointer
> - delete spurious newline
> - cast void* data parameter correctly in ->release call, no idea how
>   this even worked before
> 
> v3: Review from Sam
> - Add the kerneldoc for the managed sub-struct back in, even if it
>   doesn't show up in the generated html somehow.
> - Explain why __always_inline.
> - Fix bisectability around the final kfree() in drm_dev_release(). This
>   is just interim code which will disappear again.
> - Some whitespace polish.
> - Add debug output when drmm_add_action or drmm_kmalloc fail.
> 
> Cc: Sam Ravnborg 
> Cc: Laurent Pinchart 
> Cc: Neil Armstrong
> Cc: Greg Kroah-Hartman
> Cc: "Rafael J. Wysocki" 
> Signed-off-by: Daniel Vetter 
> ---
>  Documentation/gpu/drm-internals.rst |   6 +
>  drivers/gpu/drm/Makefile|   3 +-
>  drivers/gpu/drm/drm_drv.c   |  12 ++
>  drivers/gpu/drm/drm_internal.h  |   3 +
>  drivers/gpu/drm/drm_managed.c   | 186 
>  include/drm/drm_device.h|  15 +++
>  include/drm/drm_managed.h   |  30 +
>  include/drm/drm_print.h |   6 +
>  8 files changed, 260 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/drm_managed.c
>  create mode 100644 include/drm/drm_managed.h
> 
> diff --git a/Documentation/gpu/drm-internals.rst b/Documentation/gpu/drm-internals.rst
> index a73320576ca9..a6b6145fda78 100644
> --- a/Documentation/gpu/drm-internals.rst
> +++ b/Documentation/gpu/drm-internals.rst
> @@ -132,6 +132,12 @@ be unmapped; on many devices, the ROM address decoder is shared with
>  other BARs, so leaving it mapped could cause undesired behaviour like
>  hangs or memory corruption.
>  
> +Managed Resources
> +-----------------
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_managed.c
> +   :doc: managed resources
> +
>  Bus-specific Device Registration and PCI Support
>  
>  
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 7f72ef5e7811..183c60048307 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -17,7 +17,8 @@ drm-y   :=  drm_auth.o drm_cache.o \
>   drm_plane.o drm_color_mgmt.o drm_print.o \
>   drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
>   drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
> - drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o
> + drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
> + drm_managed.o
>  
>  drm-$(CONFIG_DRM_LEGACY) += drm

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-03 Thread Dan Carpenter
Hi Daniel,

I love your patch! Perhaps something to improve:

url:    https://github.com/0day-ci/linux/commits/Daniel-Vetter/drm_device-managed-resources-v4/20200303-071023
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 
Reported-by: Dan Carpenter 

smatch warnings:
drivers/gpu/drm/drm_drv.c:843 drm_dev_release() error: dereferencing freed memory 'dev'

# https://github.com/0day-ci/linux/commit/5aba700d4c32ae5722a9931c959b13a6217a86e2
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 5aba700d4c32ae5722a9931c959b13a6217a86e2
vim +/dev +843 drivers/gpu/drm/drm_drv.c

099d1c290e2ebc drivers/gpu/drm/drm_stub.c David Herrmann 2014-01-29  826  static void drm_dev_release(struct kref *ref)
0dc8fe5985e01f drivers/gpu/drm/drm_stub.c David Herrmann 2013-10-02  827  {
099d1c290e2ebc drivers/gpu/drm/drm_stub.c David Herrmann 2014-01-29  828  	struct drm_device *dev = container_of(ref, struct drm_device, ref);
8f6599da8e772f drivers/gpu/drm/drm_stub.c David Herrmann 2013-10-20  829  
f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  830  	if (dev->driver->release) {
f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  831  		dev->driver->release(dev);
f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  832  	} else {
f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  833  		drm_dev_fini(dev);
5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  834  	}
5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  835  
5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  836  	drm_managed_release(dev);
5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  837  
5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  838  	if (!dev->driver->release && !dev->managed.final_kfree) {
5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  839  		WARN_ON(!list_empty(&dev->managed.resources));
0dc8fe5985e01f drivers/gpu/drm/drm_stub.c David Herrmann 2013-10-02  840  		kfree(dev);

^^
Free

0dc8fe5985e01f drivers/gpu/drm/drm_stub.c David Herrmann 2013-10-02  841  	}
5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  842  
5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02 @843  	if (dev->managed.final_kfree)

^
Dereference

5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  844  		kfree(dev->managed.final_kfree);
f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  845  }
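
One way to avoid the pattern flagged above is to read the final-free pointer into a local before any free happens, and to make the two free paths mutually exclusive. The following is a userspace sketch of that idea only, not necessarily the change that ended up in the series; all model_* names are hypothetical, has_driver_release stands in for dev->driver->release being set, and final_free stands in for dev->managed.final_kfree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model of the two fields the warning is about. */
struct model_dev {
	int has_driver_release; /* models dev->driver->release != NULL */
	void *final_free;       /* models dev->managed.final_kfree */
};

/* Returns which path freed memory: 0 none, 1 dev itself, 2 final_free. */
int model_dev_release(struct model_dev *dev)
{
	void *final_free = dev->final_free; /* snapshot before any free */

	if (!dev->has_driver_release && !final_free) {
		free(dev);
		return 1; /* dev is gone, so nothing below may touch it */
	}
	if (final_free) {
		free(final_free); /* frees the allocation embedding dev */
		return 2;
	}
	return 0; /* a driver release callback owns the memory */
}

struct model_driver {
	int state;
	struct model_dev dev; /* embedded device */
};

/* Exercises both free paths; returns first result * 10 + second result. */
int model_release_demo(void)
{
	struct model_dev *bare = calloc(1, sizeof(*bare));
	struct model_driver *drv = calloc(1, sizeof(*drv));
	int a, b;

	if (!bare || !drv) {
		free(bare);
		free(drv);
		return -1;
	}
	drv->dev.final_free = drv;
	a = model_dev_release(bare);
	b = model_dev_release(&drv->dev);
	return a * 10 + b;
}
```

The early return after the first free is what removes the dereference that smatch reports at line 843.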

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-03 Thread Daniel Vetter
On Tue, Mar 03, 2020 at 11:04:06AM +0300, Dan Carpenter wrote:
> Hi Daniel,
> 
> I love your patch! Perhaps something to improve:
> 
> url:    https://github.com/0day-ci/linux/commits/Daniel-Vetter/drm_device-managed-resources-v4/20200303-071023
> base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
> 
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot 
> Reported-by: Dan Carpenter 
> 
> smatch warnings:
> drivers/gpu/drm/drm_drv.c:843 drm_dev_release() error: dereferencing freed 
> memory 'dev'
> 
> # 
> https://github.com/0day-ci/linux/commit/5aba700d4c32ae5722a9931c959b13a6217a86e2
> git remote add linux-review https://github.com/0day-ci/linux
> git remote update linux-review
> git checkout 5aba700d4c32ae5722a9931c959b13a6217a86e2
> vim +/dev +843 drivers/gpu/drm/drm_drv.c
> 
> 099d1c290e2ebc drivers/gpu/drm/drm_stub.c David Herrmann 2014-01-29  826  static void drm_dev_release(struct kref *ref)
> 0dc8fe5985e01f drivers/gpu/drm/drm_stub.c David Herrmann 2013-10-02  827  {
> 099d1c290e2ebc drivers/gpu/drm/drm_stub.c David Herrmann 2014-01-29  828  	struct drm_device *dev = container_of(ref, struct drm_device, ref);
> 8f6599da8e772f drivers/gpu/drm/drm_stub.c David Herrmann 2013-10-20  829  
> f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  830  	if (dev->driver->release) {
> f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  831  		dev->driver->release(dev);
> f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  832  	} else {
> f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  833  		drm_dev_fini(dev);
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  834  	}
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  835  
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  836  	drm_managed_release(dev);
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  837  
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  838  	if (!dev->driver->release && !dev->managed.final_kfree) {
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  839  		WARN_ON(!list_empty(&dev->managed.resources));
> 0dc8fe5985e01f drivers/gpu/drm/drm_stub.c David Herrmann 2013-10-02  840  		kfree(dev);
>   ^^
> Free
> 
> 0dc8fe5985e01f drivers/gpu/drm/drm_stub.c David Herrmann 2013-10-02  841  	}
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  842  
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02 @843  	if (dev->managed.final_kfree)
>   ^
> Dereference

Drat, so much for me trying to get this to bisect properly (it's interim
code and will disappear; the end result is correct, I think). I guess I'll
try again.
-Daniel
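
For readers following the smatch report: one way the "free, then dereference" sequence could be avoided is to read everything needed from the device before any free happens, and to make the two free paths mutually exclusive. The following is only a userspace sketch under stated assumptions. The names mock_dev, mock_priv, release_tail and the enum are hypothetical, free() stands in for kfree(), and this is not the code that was eventually posted.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_device; only what the sketch needs. */
struct mock_dev {
	bool has_driver_release;	/* models dev->driver->release != NULL */
	void *final_kfree;		/* models dev->managed.final_kfree */
};

/* Hypothetical driver structure that embeds the device. */
struct mock_priv {
	int hw_state;
	struct mock_dev dev;
};

enum freed_by { FREED_FINAL, FREED_DEV, FREED_BY_DRIVER };

/*
 * Tail of a release function: frees the allocation at most once and never
 * touches dev after the memory that holds it has been freed.
 */
static enum freed_by release_tail(struct mock_dev *dev)
{
	/* Snapshot the fields first; dev may live inside 'final'. */
	void *final = dev->final_kfree;
	bool driver_frees = dev->has_driver_release;

	if (final) {
		free(final);		/* container allocation, freed last */
		return FREED_FINAL;
	}
	if (!driver_frees) {
		free(dev);		/* legacy path: the core owns the allocation */
		return FREED_DEV;
	}
	return FREED_BY_DRIVER;		/* ->release() is responsible for the memory */
}
```

The point of the sketch is only the ordering: no field of dev is read once either free() has run.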

> 
> 5aba700d4c32ae drivers/gpu/drm/drm_drv.c  Daniel Vetter  2020-03-02  844  		kfree(dev->managed.final_kfree);
> f30c92576af4bb drivers/gpu/drm/drm_drv.c  Chris Wilson   2017-02-02  845  }
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-02 Thread Sam Ravnborg
Hi Daniel / Jani.

> On Mon, Mar 02, 2020 at 11:22:34AM +0200, Jani Nikula wrote:
> > On Sat, 29 Feb 2020, Daniel Vetter  wrote:
> > > On Sat, Feb 29, 2020 at 12:17 PM Sam Ravnborg  wrote:
> > >> The header-check infrastructure was dropped again - see:
> > >> fcbb8461fd2376ba3782b5b8bd440c929b8e4980
> > >
> > > Uh I'm disappoint :-/
> > 
> > To say the least. I thought it was a good *opt-in* feature for whoever
> > wanted it. But the part that got the backlash was applying it to
> > absolutely everything under include/. And then it got removed
> > altogether. From one extreme to the other. Nuts.
> > 
> > > Adding Jani in case he missed this too. I guess maybe we should
> > > resurrect it for drm again (and with a file pattern starting in a
> > > .dot).
> > 
> > We have a local implementation in i915/Makefile again. It uses 'find' to
> > find the headers which is fine in i915, but the parameters need to be
> > adjusted for drm to not be recursive. -maxdepth 1 or something. Also
> > need to add another local config option. Sad trombone.
> 
> Splitting this up into two threads.
> 
> Could we extend this to drm headers again too? Sad trombones indeed, but
> at least here we could still play the proper fanfares ... Maybe something
> like having the Makefile snippet in drivers/gpu and then keeping a list of
> directories (or file glob patterns, probably better) in there that it
> should check.
> 
> I really liked this entire idea very much.
> 
> Oh, also maybe switch the temp files over to dotfiles, so Linus doesn't get
> upset (which really I think is all that Linus expected, but I guess
> people just panic and revert).

I will try to give it a spin by adding the feature back to kbuild,
but without the excessive use.
And with dot-files, so the run does not disturb anything.
That way we avoid different subsystems each making their own small solutions.

Give me a few weeks - I need to land some exciting (not) binding
patches for panel/ first.
If anyone else is up to the task, I will be happy to review.

Sam


Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-02 Thread Daniel Vetter
On Mon, Mar 02, 2020 at 11:22:34AM +0200, Jani Nikula wrote:
> On Sat, 29 Feb 2020, Daniel Vetter  wrote:
> > On Sat, Feb 29, 2020 at 12:17 PM Sam Ravnborg  wrote:
> >> The header-check infrastructure was dropped again - see:
> >> fcbb8461fd2376ba3782b5b8bd440c929b8e4980
> >
> > Uh I'm disappoint :-/
> 
> To say the least. I thought it was a good *opt-in* feature for whoever
> wanted it. But the part that got the backlash was applying it to
> absolutely everything under include/. And then it got removed
> altogether. From one extreme to the other. Nuts.
> 
> > Adding Jani in case he missed this too. I guess maybe we should
> > resurrect it for drm again (and with a file pattern starting in a
> > .dot).
> 
> We have a local implementation in i915/Makefile again. It uses 'find' to
> find the headers which is fine in i915, but the parameters need to be
> adjusted for drm to not be recursive. -maxdepth 1 or something. Also
> need to add another local config option. Sad trombone.

Splitting this up into two threads.

Could we extend this to drm headers again too? Sad trombones indeed, but
at least here we could still play the proper fanfares ... Maybe something
like having the Makefile snippet in drivers/gpu and then keeping a list of
directories (or file glob patterns, probably better) in there that it
should check.

I really liked this entire idea very much.

Oh, also maybe switch the temp files over to dotfiles, so Linus doesn't get
upset (which really I think is all that Linus expected, but I guess
people just panic and revert).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-02 Thread Daniel Vetter
On Mon, Mar 02, 2020 at 11:22:34AM +0200, Jani Nikula wrote:
> On Sat, 29 Feb 2020, Daniel Vetter  wrote:
> > On Sat, Feb 29, 2020 at 12:17 PM Sam Ravnborg  wrote:
> >> > > > + /**
> >> > > > +  * @managed:
> >> > > > +  *
> >> > > > +  * Managed resources linked to the lifetime of this &drm_device as
> >> > > > +  * tracked by @ref.
> >> > > > +  */
> >> > > > + struct {
> >> > > > + struct list_head resources;
> >> > > > + void *final_kfree;
> >> > > > + spinlock_t lock;
> >> > > > + } managed;
> >> > >
> >> > > I am missing kernel-doc here.
> >> > > At least document that lock is used to guard access to resources.
> >> > > (s/lock/lock_resources/ ?)
> >> >
> >> > Dunno why, but the support for name sub-structures seems to have
> >> > broken in kerneldoc. So I can type it, but it's not showing up, so I
> >> > didn't bother. Well I had it, but deleted it again. It's still
> >> > documented to work, but I have no idea what I'm doing wrong.
> >>
> >> Most readers prefer the .c files as the source.
> >> I personally read the generated kernel doc when I google
> >> and when I check that my own stuff looks good in kernel-doc format.
> >> So comments are still valuable despite not being picked up by
> >> kernel-doc.
> >> You know this - but I just wanted to encourage you to write the few
> >> lines that may help me and others :-)
> >
> > Hm I thought way back this actually worked. Again ping for Jani, he's
> > better on top of what's happening in kernel-doc land.
> 
> I haven't really been all that active lately, but I think the syntax
> here would be e.g. "@managed.resources:".

That's the one that doesn't seem to work unfortunately.

Adding kerneldoc people and mailing list, maybe this was intentionally
removed at some point ... Jon, any pointers?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-03-02 Thread Jani Nikula
On Sat, 29 Feb 2020, Daniel Vetter  wrote:
> On Sat, Feb 29, 2020 at 12:17 PM Sam Ravnborg  wrote:
>> The header-check infrastructure was dropped again - see:
>> fcbb8461fd2376ba3782b5b8bd440c929b8e4980
>
> Uh I'm disappoint :-/

To say the least. I thought it was a good *opt-in* feature for whoever
wanted it. But the part that got the backlash was applying it to
absolutely everything under include/. And then it got removed
altogether. From one extreme to the other. Nuts.

> Adding Jani in case he missed this too. I guess maybe we should
> resurrect it for drm again (and with a file pattern starting in a
> .dot).

We have a local implementation in i915/Makefile again. It uses 'find' to
find the headers which is fine in i915, but the parameters need to be
adjusted for drm to not be recursive. -maxdepth 1 or something. Also
need to add another local config option. Sad trombone.

>> > > > + /**
>> > > > +  * @managed:
>> > > > +  *
>> > > > +  * Managed resources linked to the lifetime of this &drm_device as
>> > > > +  * tracked by @ref.
>> > > > +  */
>> > > > + struct {
>> > > > + struct list_head resources;
>> > > > + void *final_kfree;
>> > > > + spinlock_t lock;
>> > > > + } managed;
>> > >
>> > > I am missing kernel-doc here.
>> > > At least document that lock is used to guard access to resources.
>> > > (s/lock/lock_resources/ ?)
>> >
>> > Dunno why, but the support for name sub-structures seems to have
>> > broken in kerneldoc. So I can type it, but it's not showing up, so I
>> > didn't bother. Well I had it, but deleted it again. It's still
>> > documented to work, but I have no idea what I'm doing wrong.
>>
>> Most readers prefer the .c files as the source.
>> I personally read the generated kernel doc when I google
>> and when I check that my own stuff looks good in kernel-doc format.
>> So comments are still valuable despite not being picked up by
>> kernel-doc.
>> You know this - but I just wanted to encourage you to write the few
>> lines that may help me and others :-)
>
> Hm I thought way back this actually worked. Again ping for Jani, he's
> better on top of what's happening in kernel-doc land.

I haven't really been all that active lately, but I think the syntax
here would be e.g. "@managed.resources:".
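
To make the suggested syntax concrete, here is a small sketch of how the dotted member names could be written for the nested struct quoted above. It assumes the nested-member form described in the kernel-doc guide; whether it renders with the kernel-doc version discussed in this thread is not confirmed here. struct drm_device_sketch and the two userspace stand-in types are hypothetical and exist only so that the fragment compiles outside the kernel tree.

```c
#include <stddef.h>

/* Userspace stand-ins (assumptions) for the kernel types used below. */
struct list_head { struct list_head *next, *prev; };
typedef int spinlock_t;

/**
 * struct drm_device_sketch - trimmed-down stand-in for &struct drm_device
 * @managed: managed resources linked to the lifetime of this device
 * @managed.resources: list of release actions attached to the device
 * @managed.final_kfree: allocation to free last, after all actions have run
 * @managed.lock: protects @managed.resources
 */
struct drm_device_sketch {
	struct {
		struct list_head resources;
		void *final_kfree;
		spinlock_t lock;
	} managed;
};
```

Even if the generated documentation skips these lines, the comment still tells a reader of the header what the lock guards.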

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-29 Thread Daniel Vetter
On Sat, Feb 29, 2020 at 12:17 PM Sam Ravnborg  wrote:
> > > > + *
> > > > + * Based on drivers/base/devres.c
> > > > + */
> > > > +
> > > > +#include 
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#include 
> > > > +#include 
> > >
> > > It is good practice to group the include files.
> > > And drm/ comes after linux/
> >
> > I try to put the main header first to make sure it's stand-alone, but
> > I guess that works with the header check now? Do I need to do anything
> > to get that checked?
>
> The header-check infrastructure was dropped again - see:
> fcbb8461fd2376ba3782b5b8bd440c929b8e4980

Uh I'm disappoint :-/

Adding Jani in case he missed this too. I guess maybe we should
resurrect it for drm again (and with a file pattern starting in a
.dot).

> > > > + /**
> > > > +  * @managed:
> > > > +  *
> > > > +  * Managed resources linked to the lifetime of this &drm_device as
> > > > +  * tracked by @ref.
> > > > +  */
> > > > + struct {
> > > > + struct list_head resources;
> > > > + void *final_kfree;
> > > > + spinlock_t lock;
> > > > + } managed;
> > >
> > > I am missing kernel-doc here.
> > > At least document that lock is used to guard access to resources.
> > > (s/lock/lock_resources/ ?)
> >
> > Dunno why, but the support for name sub-structures seems to have
> > broken in kerneldoc. So I can type it, but it's not showing up, so I
> > didn't bother. Well I had it, but deleted it again. It's still
> > documented to work, but I have no idea what I'm doing wrong.
>
> Most readers prefer the .c files as the source.
> I personally read the generated kernel doc when I google
> and when I check that my own stuff looks good in kernel-doc format.
> So comments are still valuable despite not being picked up by
> kernel-doc.
> You know this - but I just wanted to encourage you to write the few
> lines that may help me and others :-)

Hm I thought way back this actually worked. Again ping for Jani, he's
better on top of what's happening in kernel-doc land.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-29 Thread Sam Ravnborg
Hi Daniel.

> > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > index 9fcd6ab3c154..3e5627d6eba6 100644
> > > --- a/drivers/gpu/drm/drm_drv.c
> > > +++ b/drivers/gpu/drm/drm_drv.c
> > > @@ -629,6 +629,9 @@ int drm_dev_init(struct drm_device *dev,
> > >   dev->dev = get_device(parent);
> > >   dev->driver = driver;
> > >
> > > + INIT_LIST_HEAD(&dev->managed.resources);
> > > + spin_lock_init(&dev->managed.lock);
> > > +
> > >   /* no per-device feature limits by default */
> > >   dev->driver_features = ~0u;
> > >
> > > @@ -828,8 +831,16 @@ static void drm_dev_release(struct kref *ref)
> > >   dev->driver->release(dev);
> > >   } else {
> > >   drm_dev_fini(dev);
> > > - kfree(dev);
> > > + if (!dev->managed.final_kfree) {
> > > + WARN_ON(!list_empty(&dev->managed.resources));
> > > + kfree(dev);
> > > + }
> >
> > This looks sub-optimal.
> > We cannot be sure a driver has used drmm_add_final_kfree() if it makes
> > use of drmm_.
> > So we may not WARN in all relevant cases.
> > Also, we cannot expect all drivers that use devmm_ to have managed
> > to get rid of their ->release call-back.
> 
> The above is purely transition code. It gets cleaned up once all
> drivers call drmm_add_final_kfree(). This all disappears again, but
> indeed looks like the interim state isn't quite what we want.
> 
> > So the right thing looks to me like we should move it out to be
> > unconditional. So we will WARN_ON(!list_empty(&dev->managed.resources))
> > always.
> 
> Until the driver has set drmm_add_final_kfree it's actually dangerous
> to use the drmm stuff. Exactly because of the use-after-free you point
> out below. Hence the warning to make sure there's no release actions.
> I'll shuffle this around to make sure we call kfree last for all
> possible paths and make sure this bisects all correctly.

I was just reviewing the code I had on hand, and did not look further in
the set of patches.
Very good if we can keep it bisectable.
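
The "kfree last on all paths" ordering discussed above can be illustrated with a userspace sketch. It is written under assumptions: managed_dev, driver_priv, add_action, add_final_free and release_all are hypothetical names, malloc()/free() stand in for the kernel allocator, and the caller passes the container size explicitly where the patch description mentions ksize(). Release actions run newest-first, and the container that embeds the device is freed only after the last action has run.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef void (*res_release_t)(void *data);

struct res_node {
	struct res_node *next;
	res_release_t release;
	void *data;
};

struct managed_dev {
	struct res_node *resources;	/* most recently added action first */
	void *final_free;		/* container to free after all actions */
};

/* Hypothetical driver structure embedding the device. */
struct driver_priv {
	int hw_state;
	struct managed_dev dev;
};

/* Demo callback: records the integer tag that 'data' points to. */
static int release_log[8];
static size_t release_log_len;

static void log_release(void *data)
{
	if (release_log_len < 8)
		release_log[release_log_len++] = *(int *)data;
}

static int add_action(struct managed_dev *dev, res_release_t fn, void *data)
{
	struct res_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->release = fn;
	n->data = data;
	n->next = dev->resources;	/* push front, so release order is LIFO */
	dev->resources = n;
	return 0;
}

/* Accept 'parent' only if dev lies entirely inside [parent, parent + size). */
static int add_final_free(struct managed_dev *dev, void *parent, size_t size)
{
	uintptr_t p = (uintptr_t)parent, d = (uintptr_t)dev;

	if (d < p || d - p > size || size - (d - p) < sizeof(*dev))
		return -1;
	dev->final_free = parent;
	return 0;
}

static void release_all(struct managed_dev *dev)
{
	void *final = dev->final_free;	/* read before anything is freed */
	struct res_node *n = dev->resources;

	dev->resources = NULL;
	while (n) {
		struct res_node *next = n->next;

		n->release(n->data);
		free(n);
		n = next;
	}
	free(final);			/* last step; free(NULL) is a no-op */
}
```

In this sketch a device without a registered container is simply not freed by release_all(), which mirrors the transition-period concern raised in the review only loosely.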

> > > + *
> > > + * Based on drivers/base/devres.c
> > > + */
> > > +
> > > +#include 
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > +#include 
> > > +#include 
> >
> > It is good practice to group the include files.
> > And drm/ comes after linux/
> 
> I try to put the main header first to make sure it's stand-alone, but
> I guess that works with the header check now? Do I need to do anything
> to get that checked?

The header-check infrastructure was dropped again - see:
fcbb8461fd2376ba3782b5b8bd440c929b8e4980

So including it as the first header in the implementation
file is likely the best way to keep it self-contained.
We will spot errors sooner.

> > > +static __always_inline struct drmres * alloc_dr(drmres_release_t release,
> > > +						size_t size, gfp_t gfp, int nid)
> > Why do we force the compiler to inline this?
> > Seems a little aggressive.
> 
> It's not for performance, but for kmalloc_trace_caller. No point if
> our caller is always some boring function from drm_managed.c that
> calls alloc_dr. If we force alloc_dr to inline, then we get the caller
> of the drm_managed.c function traced as allocator. Much better.
> 
> (I stole that trick from devres.c)
> 
> I'll add a comment to explain this.
Thanks.
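
The inlining trick explained above can be sketched in userspace with GCC or Clang. The names tracked, alloc_tracked and managed_alloc are hypothetical, and __builtin_return_address(0) stands in for the kernel's caller-tracking allocation path. Because the helper is forced inline, the return address is evaluated in the frame of the function it is inlined into, so the recorded address belongs to that function's caller and not to a boring internal wrapper.

```c
#include <stddef.h>
#include <stdlib.h>

struct tracked {
	const void *caller;	/* return address recorded at allocation time */
	size_t size;		/* requested size, kept for bookkeeping only */
};

/*
 * Forced inline: __builtin_return_address(0) is evaluated in the frame of
 * whichever function this helper is inlined into, so it yields the caller
 * of that function.
 */
static inline __attribute__((always_inline)) struct tracked *
alloc_tracked(size_t size)
{
	struct tracked *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->caller = __builtin_return_address(0);
	t->size = size;
	return t;
}

/* Entry point; noinline so that it keeps its own frame in this demo. */
static __attribute__((noinline)) struct tracked *managed_alloc(size_t size)
{
	return alloc_tracked(size);
}
```

With this layout the recorded address points into whoever called managed_alloc(), which is the information an allocation trace actually wants.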

> 
> > Both users so far use dev_to_node(dev->dev) for the nid.
> > Maybe let this function take a drm_device * and thus move the
> > calculation to this function?
> 
> Copypastes like that :-) I feel somewhat meh here ...
Well - keep the diff for devres smaller for now and leave it.
It was just an observation.

> > > + /**
> > > +  * @managed:
> > > +  *
> > > +  * Managed resources linked to the lifetime of this &drm_device as
> > > +  * tracked by @ref.
> > > +  */
> > > + struct {
> > > + struct list_head resources;
> > > + void *final_kfree;
> > > + spinlock_t lock;
> > > + } managed;
> >
> > I am missing kernel-doc here.
> > At least document that lock is used to guard access to resources.
> > (s/lock/lock_resources/ ?)
> 
> Dunno why, but the support for name sub-structures seems to have
> broken in kerneldoc. So I can type it, but it's not showing up, so I
> didn't bother. Well I had it, but deleted it again. It's still
> documented to work, but I have no idea what I'm doing wrong.

Most readers prefer the .c files as the source.
I personally read the generated kernel doc when I google
and when I check that my own stuff looks good in kernel-doc format.
So comments are still valuable despite not being picked up by
kernel-doc.
You know this - but I just wanted to encourage you to write the few
lines that may help me and others :-)

Sam

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-28 Thread Daniel Vetter
On Fri, Feb 28, 2020 at 11:45 PM Sam Ravnborg  wrote:
>
> Hi Daniel.
>
> Some nitpicks / bikeshedding below.
>
> Sam
>
> On Thu, Feb 27, 2020 at 07:14:34PM +0100, Daniel Vetter wrote:
> > We have lots of these. And the cleanup code tends to be of dubious
> > quality. The biggest wrong pattern is that developers use devm_, which
> > ties the release action to the underlying struct device, whereas
> > all the userspace visible stuff attached to a drm_device can long
> > outlive that one (e.g. after a hotunplug while userspace has open
> > files and mmap'ed buffers). Give people what they want, but with more
> > correctness.
> >
> > Mostly copied from devres.c, with types adjusted to fit drm_device and
> > a few simplifications - I didn't (yet) copy over everything. Since
> > the types don't match code sharing looked like a hopeless endeavour.
>
> Readability would have been better if the short names were not reused.
>
> s/dr_/drmres_/
>
> But I know, this is in the bikeshedding area.
>
> >
> > For now it's only super simplified, no groups, you can't remove
> > actions (but kfree exists, we'll need that soon). Plus all specific to
> > drm_device ofc, including the logging. Which I didn't bother to make
> > compile-time optional, since none of the other drm logging is compile
> > time optional either.
> >
> > One tricky bit here is the chicken&egg between allocating your
> > drm_device structure and initializing it with drm_dev_init. For
> > perfect onion unwinding we'd need to have the action to kfree the
> > allocation registered before drm_dev_init registers any of its own
> > release handlers. But drm_dev_init doesn't know where exactly the
> > drm_device is embedded into the overall structure, and by the time it
> > returns it'll all be too late. And forcing drivers to be able to clean up
> > everything except the one kzalloc is silly.
> >
> > Work around this by having a very special final_kfree pointer. This
> > also avoids troubles with the list head possibly disappearing from
> > underneath us when we release all resources attached to the
> > drm_device.
> >
> > v2: Do all the kerneldoc at the end, to avoid lots of fairly pointless
> > shuffling while getting everything into shape.
> >
> > v3: Add static to add/del_dr (Neil)
> > Move typo fix to the right patch (Neil)
> >
> > v4: Enforce contract for drmm_add_final_kfree:
> >
> > Use ksize() to check that the drm_device is indeed contained somewhere
> > in the final kfree(). Because we need that or the entire managed
> > release logic blows up in a pile of use-after-frees. Motivated by a
> > discussion with Laurent.
> >
> > v5: Review from Laurent:
> > - %zu instead of casting size_t
> > - header guards
> > - sorting of includes
> > - guarding of data assignment if we didn't allocate it for a NULL
> >   pointer
> > - delete spurious newline
> > - cast void* data parameter correctly in ->release call, no idea how
> >   this even worked before
> >
> > Cc: Laurent Pinchart 
> > Cc: Neil Armstrong 
> > Cc: Greg Kroah-Hartman 
> > Cc: "Rafael J. Wysocki" 
> > Signed-off-by: Daniel Vetter 
> > ---
> >  Documentation/gpu/drm-internals.rst |   6 +
> >  drivers/gpu/drm/Makefile|   3 +-
> >  drivers/gpu/drm/drm_drv.c   |  13 ++-
> >  drivers/gpu/drm/drm_internal.h  |   3 +
> >  drivers/gpu/drm/drm_managed.c   | 175 
> >  include/drm/drm_device.h|  12 ++
> >  include/drm/drm_managed.h   |  30 +
> >  include/drm/drm_print.h |   6 +
> >  8 files changed, 246 insertions(+), 2 deletions(-)
> >  create mode 100644 drivers/gpu/drm/drm_managed.c
> >  create mode 100644 include/drm/drm_managed.h
> >
> > diff --git a/Documentation/gpu/drm-internals.rst b/Documentation/gpu/drm-internals.rst
> > index a73320576ca9..a6b6145fda78 100644
> > --- a/Documentation/gpu/drm-internals.rst
> > +++ b/Documentation/gpu/drm-internals.rst
> > @@ -132,6 +132,12 @@ be unmapped; on many devices, the ROM address decoder is shared with
> >  other BARs, so leaving it mapped could cause undesired behaviour like
> >  hangs or memory corruption.
> >
> > +Managed Resources
> > +-
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_managed.c
> > +   :doc: managed resources
> > +
> >  Bus-specific Device Registration and PCI Support
> >  
> >
> > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > index 7f72ef5e7811..183c60048307 100644
> > --- a/drivers/gpu/drm/Makefile
> > +++ b/drivers/gpu/drm/Makefile
> > @@ -17,7 +17,8 @@ drm-y   :=  drm_auth.o drm_cache.o \
> >   drm_plane.o drm_color_mgmt.o drm_print.o \
> >   drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
> >   drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
> > - drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o
> > + drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
> > +   

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-28 Thread Sam Ravnborg
Hi Daniel.

Some nitpicks / bikeshedding below.

Sam

On Thu, Feb 27, 2020 at 07:14:34PM +0100, Daniel Vetter wrote:
> We have lots of these. And the cleanup code tends to be of dubious
> quality. The biggest wrong pattern is that developers use devm_, which
> ties the release action to the underlying struct device, whereas
> all the userspace visible stuff attached to a drm_device can long
> outlive that one (e.g. after a hotunplug while userspace has open
> files and mmap'ed buffers). Give people what they want, but with more
> correctness.
> 
> Mostly copied from devres.c, with types adjusted to fit drm_device and
> a few simplifications - I didn't (yet) copy over everything. Since
> the types don't match code sharing looked like a hopeless endeavour.

Readability would have been better if the short names were not reused.

s/dr_/drmres_/

But I know, this is in the bikeshedding area.

> 
> For now it's only super simplified, no groups, you can't remove
> actions (but kfree exists, we'll need that soon). Plus all specific to
> drm_device ofc, including the logging. Which I didn't bother to make
> compile-time optional, since none of the other drm logging is compile
> time optional either.
> 
> One tricky bit here is the chicken&egg between allocating your
> drm_device structure and initializing it with drm_dev_init. For
> perfect onion unwinding we'd need to have the action to kfree the
> allocation registered before drm_dev_init registers any of its own
> release handlers. But drm_dev_init doesn't know where exactly the
> drm_device is embedded into the overall structure, and by the time it
> returns it'll all be too late. And forcing drivers to be able to clean up
> everything except the one kzalloc is silly.
> 
> Work around this by having a very special final_kfree pointer. This
> also avoids troubles with the list head possibly disappearing from
> underneath us when we release all resources attached to the
> drm_device.
> 
> v2: Do all the kerneldoc at the end, to avoid lots of fairly pointless
> shuffling while getting everything into shape.
> 
> v3: Add static to add/del_dr (Neil)
> Move typo fix to the right patch (Neil)
> 
> v4: Enforce contract for drmm_add_final_kfree:
> 
> Use ksize() to check that the drm_device is indeed contained somewhere
> in the final kfree(). Because we need that or the entire managed
> release logic blows up in a pile of use-after-frees. Motivated by a
> discussion with Laurent.
> 
> v5: Review from Laurent:
> - %zu instead of casting size_t
> - header guards
> - sorting of includes
> - guarding of data assignment if we didn't allocate it for a NULL
>   pointer
> - delete spurious newline
> - cast void* data parameter correctly in ->release call, no idea how
>   this even worked before
> 
> Cc: Laurent Pinchart 
> Cc: Neil Armstrong 
> Cc: Greg Kroah-Hartman 
> Cc: "Rafael J. Wysocki" 
> Signed-off-by: Daniel Vetter 
> ---
>  Documentation/gpu/drm-internals.rst |   6 +
>  drivers/gpu/drm/Makefile|   3 +-
>  drivers/gpu/drm/drm_drv.c   |  13 ++-
>  drivers/gpu/drm/drm_internal.h  |   3 +
>  drivers/gpu/drm/drm_managed.c   | 175 
>  include/drm/drm_device.h|  12 ++
>  include/drm/drm_managed.h   |  30 +
>  include/drm/drm_print.h |   6 +
>  8 files changed, 246 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/gpu/drm/drm_managed.c
>  create mode 100644 include/drm/drm_managed.h
> 
> diff --git a/Documentation/gpu/drm-internals.rst b/Documentation/gpu/drm-internals.rst
> index a73320576ca9..a6b6145fda78 100644
> --- a/Documentation/gpu/drm-internals.rst
> +++ b/Documentation/gpu/drm-internals.rst
> @@ -132,6 +132,12 @@ be unmapped; on many devices, the ROM address decoder is shared with
>  other BARs, so leaving it mapped could cause undesired behaviour like
>  hangs or memory corruption.
>  
> +Managed Resources
> +-----------------
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_managed.c
> +   :doc: managed resources
> +
>  Bus-specific Device Registration and PCI Support
>  
>  
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 7f72ef5e7811..183c60048307 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -17,7 +17,8 @@ drm-y   :=  drm_auth.o drm_cache.o \
>   drm_plane.o drm_color_mgmt.o drm_print.o \
>   drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
>   drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
> - drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o
> + drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
> + drm_managed.o
>  
>  drm-$(CONFIG_DRM_LEGACY) += drm_legacy_misc.o drm_bufs.o drm_context.o drm_dma.o drm_scatter.o drm_lock.o
>  drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 9fcd6ab

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-26 Thread Andrzej Hajda
On 26.02.2020 11:21, Daniel Vetter wrote:
> On Wed, Feb 26, 2020 at 10:21:17AM +0100, Andrzej Hajda wrote:
>> On 25.02.2020 16:03, Daniel Vetter wrote:
>>> On Tue, Feb 25, 2020 at 11:27 AM Andrzej Hajda  wrote:
>>>> Hi Daniel,
>>>>
>>>>
>>>> The patchset looks interesting.
>>>>
>>>>
>>>> On 21.02.2020 22:02, Daniel Vetter wrote:
>>>>> We have lots of these. And the cleanup code tends to be of dubious
>>>>> quality. The biggest wrong pattern is that developers use devm_, which
>>>>> ties the release action to the underlying struct device, whereas
>>>>> all the userspace visible stuff attached to a drm_device can long
>>>>> outlive that one (e.g. after a hotunplug while userspace has open
>>>>> files and mmap'ed buffers). Give people what they want, but with more
>>>>> correctness.
>>>> I am not familiar with this stuff, so forgive me stupid questions.
>>>>
>>>> Is it documented how uapi should behave in such case?
>>>>
>>>> I guess the general rule is to return errors on most ioctls (ENODEV,
>>>> EIO?), and wait until userspace releases everything, as there is not
>>>> much more to do.
>>>>
>>>> If that is true what is the point of keeping these structs anyway -
>>>> trivial functions with small context data should do the job.
>>>>
>>>> I suspect I am missing something but I do not know what :)
>>> We could do the above (also needs unmapping of all mmaps, so userspace
>>> then gets SIGSEGV everywhere) and watch userspace crash&burn.
>>> Essentially if the kernel can't do this properly, then there's no hope
>>> that userspace will be any better.
>>
>> We do not want to crash userspace. We just need to tell userspace that
>> the kernel objects userspace has references to are not valid.
>>
>> For this, two mechanisms should be enough:
>>
>> - signal hot-unplug,
>>
>> - report error (ENODEV for example) on any userspace requests (ioctls)
>> on invalid objects.
>>
>> Expecting userspace to handle ioctl errors properly seems fair.
> The trouble is that while it may be fair, practice says it's just not
> going to happen.


So what? Bad API usage causes bad things; crashes will force developers
to fix it, and if they do not, we can assume it is not so harmful.

The gain is that the kernel side is simpler and doesn't need to lie :)
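
As an illustration of the "report an error on any request after unplug" mechanism argued for here, a minimal userspace model could look like the following. mock_dev, mock_unplug and mock_set_brightness_ioctl are hypothetical names, and this is a sketch of the idea, not how any DRM driver implements it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model: one flag and one piece of hardware state. */
struct mock_dev {
	bool unplugged;		/* set once the hardware is gone */
	int brightness;		/* state that would normally reach hardware */
};

static void mock_unplug(struct mock_dev *dev)
{
	dev->unplugged = true;
}

/* Every entry point checks the flag first and refuses with -ENODEV. */
static int mock_set_brightness_ioctl(struct mock_dev *dev, int value)
{
	if (dev->unplugged)
		return -ENODEV;

	dev->brightness = value;
	return 0;
}
```

The sketch covers only the ioctl side; it says nothing about existing mmaps, which the thread treats as the harder problem.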


>> Regarding mmap I am not sure how to properly handle disappearing
>> devices, but this is common problem regardless which solution we use.
> signal handler wrapped around every mmap access. Which doesn't compose
> across libraries, so is essentially impossible.
>
> Note that e.g. GL's robustness extensions works exactly like this here
> too: GPU dies, kernel kills all your objects and contexts and everything.
> But the driver keeps "working". The only way to get information that
> everything is actually dead is by querying the robustness extension, which
> then will tell you what's happened.
>
> Again this is because it's impossible to make sure userspace actually
> checks error codes everywhere. It's also prohibitively expensive. vk goes
> as far as outright removing all error validation (at least as much as
> possible).


vk is a different story, and for me it is a counter-example: it has a clear
policy that the user should take care of proper API handling, otherwise they
risk undefined behavior or a crash. In your proposal I see the opposite:
let's baby-sit the user and protect him from his mistakes.


>
>>> Hence the idea is that we keep everything userspace facing still
>>> around, except it doesn't do much anymore. So connectors still there,
>>> but they look disconnected.
>>
>> It looks like lying to userspace that physical connectors still exists.
>> If we want to lie we need good reason for that. What is that reason?
>>
>> Why not just tell connectors are gone?
> Userspace sucks at handling hotunplugged connectors. Most of it is special
> case code for DP MST connectors only.
>
>>> Userspace can then hopefully eventually
>>> get around to processing the sysfs hotunplug event and remove the
>>> device from all its list. So the long-term idea is that a lot of stuff
>>> keeps working, except the driver doesn't talk to the hardware anymore.
>>> And we just sit around waiting for userspace to clean things up.
>>
>> What does it mean "lot of stuff keeps working"? What drm driver can do
>> without hardware? Could you show some examples?
> Nothing will "work", the goal is simply for userspace to not explode in
> fire and take the entire desktop down with it.


And why do we need to keep the whole drm device for this task? What exactly
causes the userspace explosion?


>
>>> I guess once we have a bunch of the panel/usb drivers converted over
>>> we could indeed document how this is all supposed to work from an uapi
>>> pov. But right now a lot of this is all rather aspirational, I think
>>> only the recent simple display pipe based drivers implement this as
>>> described above.
>>>
> Mostly copied from devres.c, with types adjusted to fit drm_device and
> a few simplifications - I didn't (yet) copy over everything. Since
>>

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-26 Thread Daniel Vetter
On Wed, Feb 26, 2020 at 10:21:17AM +0100, Andrzej Hajda wrote:
> On 25.02.2020 16:03, Daniel Vetter wrote:
> > On Tue, Feb 25, 2020 at 11:27 AM Andrzej Hajda  wrote:
> >> Hi Daniel,
> >>
> >>
> >> The patchset looks interesting.
> >>
> >>
> >> On 21.02.2020 22:02, Daniel Vetter wrote:
> >>> We have lots of these. And the cleanup code tends to be of dubious
> >>> quality. The biggest wrong pattern is that developers use devm_, which
> >>> ties the release action to the underlying struct device, whereas
> >>> all the userspace visible stuff attached to a drm_device can long
> >>> outlive that one (e.g. after a hotunplug while userspace has open
> >>> files and mmap'ed buffers). Give people what they want, but with more
> >>> correctness.
> >>
> >> I am not familiar with this stuff, so forgive me stupid questions.
> >>
> >> Is it documented how uapi should behave in such case?
> >>
> >> I guess the general rule is to return errors on most ioctls (ENODEV,
> >> EIO?), and wait until userspace releases everything, as there is not
> >> much more to do.
> >>
> >> If that is true what is the point of keeping these structs anyway -
> >> trivial functions with small context data should do the job.
> >>
> >> I suspect I am missing something but I do not know what :)
> > We could do the above (also needs unmapping of all mmaps, so userspace
> > then gets SIGSEGV everywhere) and watch userspace crash&burn.
> > Essentially if the kernel can't do this properly, then there's no hope
> > that userspace will be any better.
> 
> 
> We do not want to crash userspace. We just need to tell userspace that
> the kernel objects userspace has references to are not valid.
> 
> For this two mechanism should be enough:
> 
> - signal hot-unplug,
> 
> - report error (ENODEV for example) on any userspace requests (ioctls)
> on invalid objects.
> 
> Expecting from userspace properly handling ioctl errors seems to be fair.

The trouble is that, fair or not, practice says it's just not going to
happen.

> Regarding mmap I am not sure how to properly handle disappearing
> devices, but this is common problem regardless which solution we use.

A signal handler wrapped around every mmap access. That doesn't compose
across libraries, so it is essentially impossible.

Note that e.g. GL's robustness extension works exactly like this here
too: GPU dies, kernel kills all your objects and contexts and everything.
But the driver keeps "working". The only way to get information that
everything is actually dead is by querying the robustness extension, which
then will tell you what's happened.

Again this is because it's impossible to make sure userspace actually
checks error codes everywhere. It's also prohibitively expensive. vk goes
as far as outright removing all error validation (at least as much as
possible).

> > Hence the idea is that we keep everything userspace facing still
> > around, except it doesn't do much anymore. So connectors still there,
> > but they look disconnected.
> 
> 
> It looks like lying to userspace that physical connectors still exists.
> If we want to lie we need good reason for that. What is that reason?
> 
> Why not just tell connectors are gone?

Userspace sucks at handling hotunplugged connectors. Most of it is special
case code for DP MST connectors only.

> > Userspace can then hopefully eventually
> > get around to processing the sysfs hotunplug event and remove the
> > device from all its list. So the long-term idea is that a lot of stuff
> > keeps working, except the driver doesn't talk to the hardware anymore.
> > And we just sit around waiting for userspace to clean things up.
> 
> 
> What does it mean "lot of stuff keeps working"? What drm driver can do
> without hardware? Could you show some examples?

Nothing will "work", the goal is simply for userspace to not explode in
fire and take the entire desktop down with it.

> > I guess once we have a bunch of the panel/usb drivers converted over
> > we could indeed document how this is all supposed to work from an uapi
> > pov. But right now a lot of this is all rather aspirational, I think
> > only the recent simple display pipe based drivers implement this as
> > described above.
> >
> >>> Mostly copied from devres.c, with types adjusted to fit drm_device and
> >>> a few simplifications - I didn't (yet) copy over everything. Since
> >>> the types don't match code sharing looked like a hopeless endeavour.
> >>>
> >>> For now it's only super simplified, no groups, you can't remove
> >>> actions (but kfree exists, we'll need that soon). Plus all specific to
> >>> drm_device ofc, including the logging. Which I didn't bother to make
> >>> compile-time optional, since none of the other drm logging is compile
> >>> time optional either.
> >>
> >> I saw in v1 thread that copy/paste is OK and merging back devres and
> >> drmres can be done later, but experience shows that after short time
> >> things get de-synchronized and merging process becomes quite painful.
> >>
> >>

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-26 Thread Andrzej Hajda
On 25.02.2020 16:03, Daniel Vetter wrote:
> On Tue, Feb 25, 2020 at 11:27 AM Andrzej Hajda  wrote:
>> Hi Daniel,
>>
>>
>> The patchset looks interesting.
>>
>>
>> On 21.02.2020 22:02, Daniel Vetter wrote:
>>> We have lots of these. And the cleanup code tends to be of dubious
>>> quality. The biggest wrong pattern is that developers use devm_, which
>>> ties the release action to the underlying struct device, whereas
>>> all the userspace visible stuff attached to a drm_device can long
>>> outlive that one (e.g. after a hotunplug while userspace has open
>>> files and mmap'ed buffers). Give people what they want, but with more
>>> correctness.
>>
>> I am not familiar with this stuff, so forgive me stupid questions.
>>
>> Is it documented how uapi should behave in such case?
>>
>> I guess the general rule is to return errors on most ioctls (ENODEV,
>> EIO?), and wait until userspace releases everything, as there is not
>> much more to do.
>>
>> If that is true what is the point of keeping these structs anyway -
>> trivial functions with small context data should do the job.
>>
>> I suspect I am missing something but I do not know what :)
> We could do the above (also needs unmapping of all mmaps, so userspace
> then gets SIGSEGV everywhere) and watch userspace crash&burn.
> Essentially if the kernel can't do this properly, then there's no hope
> that userspace will be any better.


We do not want to crash userspace. We just need to tell userspace that
the kernel objects userspace has references to are not valid.

For this, two mechanisms should be enough:

- signal hot-unplug,

- report error (ENODEV for example) on any userspace requests (ioctls)
on invalid objects.

Expecting userspace to handle ioctl errors properly seems fair.
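
A minimal sketch of the second mechanism, written as userspace C rather than kernel code. The `fake_dev` structure and both function names are hypothetical and only illustrate rejecting requests once the device is gone; the hot-unplug notification itself is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state; not a real DRM structure. */
struct fake_dev {
	bool unplugged;		/* set once the hardware has disappeared */
	int brightness;		/* some state a request could change */
};

/* Mark the device as gone. Notifying userspace is not modelled here. */
void fake_dev_unplug(struct fake_dev *dev)
{
	dev->unplugged = true;
}

/* Hypothetical ioctl-style entry point: fail with -ENODEV after unplug. */
int fake_dev_set_brightness(struct fake_dev *dev, int value)
{
	if (dev->unplugged)
		return -ENODEV;
	dev->brightness = value;
	return 0;
}
```

A caller that checks the return value sees -ENODEV for every request made after fake_dev_unplug(), and the stored state is left untouched.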

Regarding mmap I am not sure how to properly handle disappearing
devices, but this is a common problem regardless of which solution we use.


>
> Hence the idea is that we keep everything userspace facing still
> around, except it doesn't do much anymore. So connectors still there,
> but they look disconnected.


It looks like lying to userspace that the physical connectors still exist.
If we want to lie we need a good reason for that. What is that reason?

Why not just tell userspace the connectors are gone?


> Userspace can then hopefully eventually
> get around to processing the sysfs hotunplug event and remove the
> device from all its list. So the long-term idea is that a lot of stuff
> keeps working, except the driver doesn't talk to the hardware anymore.
> And we just sit around waiting for userspace to clean things up.


What does "a lot of stuff keeps working" mean? What can a drm driver do
without hardware? Could you show some examples?


>
> I guess once we have a bunch of the panel/usb drivers converted over
> we could indeed document how this is all supposed to work from an uapi
> pov. But right now a lot of this is all rather aspirational, I think
> only the recent simple display pipe based drivers implement this as
> described above.
>
>>> Mostly copied from devres.c, with types adjusted to fit drm_device and
>>> a few simplifications - I didn't (yet) copy over everything. Since
>>> the types don't match code sharing looked like a hopeless endeavour.
>>>
>>> For now it's only super simplified, no groups, you can't remove
>>> actions (but kfree exists, we'll need that soon). Plus all specific to
>>> drm_device ofc, including the logging. Which I didn't bother to make
>>> compile-time optional, since none of the other drm logging is compile
>>> time optional either.
>>
>> I saw in v1 thread that copy/paste is OK and merging back devres and
>> drmres can be done later, but experience shows that after short time
>> things get de-synchronized and merging process becomes quite painful.
>>
>> On the other side I guess it shouldn't be difficult to split devres into
>> consumer agnostic core and "struct device" helpers and then use the core
>> in drm.
>>
>> For example currently devres uses two fields from struct device:
>>
>>     spinlock_t        devres_lock;
>>     struct list_head    devres_head;
>>
>> Lets put it into separate struct:
>>
>> struct devres {
>>
>>     spinlock_t        lock;
>>     struct list_head    head;
>>
>> };
>>
>> And embed this struct into "struct device".
>>
>> Then convert all core devres functions to take "struct devres *"
>> argument instead of "struct device *" and then these core functions can
>> be usable in drm.
>>
>> Looks quite simple separation of abstraction (devres) and its consumer
>> (struct device).
>>
>> After such split one could think about changing name devres to something
>> more reliable.
> There was a long discussion on v1 exactly about this, Greg's
> suggestion was to "just share a struct device". So we're not going to
> do this here, and the struct device seems like slight overkill and not
> a good enough fit here.


But my proposition is different, I want to get rid of "struct device"
from devres core - devres has nothing to do with dev

Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-25 Thread Daniel Vetter
On Tue, Feb 25, 2020 at 11:27 AM Andrzej Hajda  wrote:
>
> Hi Daniel,
>
>
> The patchset looks interesting.
>
>
> On 21.02.2020 22:02, Daniel Vetter wrote:
> > We have lots of these. And the cleanup code tends to be of dubious
> > quality. The biggest wrong pattern is that developers use devm_, which
> > ties the release action to the underlying struct device, whereas
> > all the userspace visible stuff attached to a drm_device can long
> > outlive that one (e.g. after a hotunplug while userspace has open
> > files and mmap'ed buffers). Give people what they want, but with more
> > correctness.
>
>
> I am not familiar with this stuff, so forgive me stupid questions.
>
> Is it documented how uapi should behave in such case?
>
> I guess the general rule is to return errors on most ioctls (ENODEV,
> EIO?), and wait until userspace releases everything, as there is not
> much more to do.
>
> If that is true what is the point of keeping these structs anyway -
> trivial functions with small context data should do the job.
>
> I suspect I am missing something but I do not know what :)

We could do the above (also needs unmapping of all mmaps, so userspace
then gets SIGSEGV everywhere) and watch userspace crash&burn.
Essentially if the kernel can't do this properly, then there's no hope
that userspace will be any better.

Hence the idea is that we keep everything userspace facing still
around, except it doesn't do much anymore. So connectors are still there,
but they look disconnected. Userspace can then hopefully eventually
get around to processing the sysfs hotunplug event and remove the
device from all its lists. So the long-term idea is that a lot of stuff
keeps working, except the driver doesn't talk to the hardware anymore.
And we just sit around waiting for userspace to clean things up.
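
The lifetime rule described here can be sketched in plain userspace C. Everything below is hypothetical (`fake_drm_dev` and its helpers are not the real DRM API, and a real implementation would need atomic reference counting); it only shows that unplug stops hardware access while the final cleanup waits for the last holder.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a drm_device; not the real structure. */
struct fake_drm_dev {
	int refcount;		/* one per holder: the driver, each open file */
	bool unplugged;		/* hardware is gone, stop touching it */
	bool released;		/* final cleanup has run */
};

/* Take a reference, e.g. when userspace opens the device node. */
void fake_drm_get(struct fake_drm_dev *dev)
{
	dev->refcount++;
}

/* Drop a reference; final cleanup runs only when the last holder lets go. */
void fake_drm_put(struct fake_drm_dev *dev)
{
	if (--dev->refcount == 0)
		dev->released = true;
}

/* Hotunplug: stop talking to hardware and drop the driver's own reference. */
void fake_drm_unplug(struct fake_drm_dev *dev)
{
	dev->unplugged = true;
	fake_drm_put(dev);
}
```

With the driver holding one reference and an open file holding another, fake_drm_unplug() leaves `released` false; it only becomes true once the file's reference is dropped with fake_drm_put().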

I guess once we have a bunch of the panel/usb drivers converted over
we could indeed document how this is all supposed to work from an uapi
pov. But right now a lot of this is all rather aspirational, I think
only the recent simple display pipe based drivers implement this as
described above.

> > Mostly copied from devres.c, with types adjusted to fit drm_device and
> > a few simplifications - I didn't (yet) copy over everything. Since
> > the types don't match code sharing looked like a hopeless endeavour.
> >
> > For now it's only super simplified, no groups, you can't remove
> > actions (but kfree exists, we'll need that soon). Plus all specific to
> > drm_device ofc, including the logging. Which I didn't bother to make
> > compile-time optional, since none of the other drm logging is compile
> > time optional either.
>
>
> I saw in v1 thread that copy/paste is OK and merging back devres and
> drmres can be done later, but experience shows that after short time
> things get de-synchronized and merging process becomes quite painful.
>
> On the other side I guess it shouldn't be difficult to split devres into
> consumer agnostic core and "struct device" helpers and then use the core
> in drm.
>
> For example currently devres uses two fields from struct device:
>
>     spinlock_t        devres_lock;
>     struct list_head    devres_head;
>
> Lets put it into separate struct:
>
> struct devres {
>
>     spinlock_t        lock;
>     struct list_head    head;
>
> };
>
> And embed this struct into "struct device".
>
> Then convert all core devres functions to take "struct devres *"
> argument instead of "struct device *" and then these core functions can
> be usable in drm.
>
> Looks quite simple separation of abstraction (devres) and its consumer
> (struct device).
>
> After such split one could think about changing name devres to something
> more reliable.

There was a long discussion on v1 exactly about this, Greg's
suggestion was to "just share a struct device". So we're not going to
do this here, and the struct device seems like slight overkill and not
a good enough fit here.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/51] drm: add managed resources tied to drm_device

2020-02-25 Thread Andrzej Hajda
Hi Daniel,


The patchset looks interesting.


On 21.02.2020 22:02, Daniel Vetter wrote:
> We have lots of these. And the cleanup code tends to be of dubious
> quality. The biggest wrong pattern is that developers use devm_, which
> ties the release action to the underlying struct device, whereas
> all the userspace visible stuff attached to a drm_device can long
> outlive that one (e.g. after a hotunplug while userspace has open
> files and mmap'ed buffers). Give people what they want, but with more
> correctness.


I am not familiar with this stuff, so forgive my stupid questions.

Is it documented how uapi should behave in such a case?

I guess the general rule is to return errors on most ioctls (ENODEV,
EIO?), and wait until userspace releases everything, as there is not
much more to do.

If that is true what is the point of keeping these structs anyway -
trivial functions with small context data should do the job.

I suspect I am missing something but I do not know what :)


>
> Mostly copied from devres.c, with types adjusted to fit drm_device and
> a few simplifications - I didn't (yet) copy over everything. Since
> the types don't match code sharing looked like a hopeless endeavour.
>
> For now it's only super simplified, no groups, you can't remove
> actions (but kfree exists, we'll need that soon). Plus all specific to
> drm_device ofc, including the logging. Which I didn't bother to make
> compile-time optional, since none of the other drm logging is compile
> time optional either.


I saw in v1 thread that copy/paste is OK and merging back devres and
drmres can be done later, but experience shows that after short time
things get de-synchronized and merging process becomes quite painful.

On the other side I guess it shouldn't be difficult to split devres into
consumer agnostic core and "struct device" helpers and then use the core
in drm.

For example currently devres uses two fields from struct device:

    spinlock_t        devres_lock;
    struct list_head    devres_head;

Let's put them into a separate struct:

struct devres {

    spinlock_t        lock;
    struct list_head    head;

};

And embed this struct into "struct device".

Then convert all core devres functions to take "struct devres *"
argument instead of "struct device *" and then these core functions can
be usable in drm.

This looks like a quite simple separation of the abstraction (devres) from
its consumer (struct device).
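
A rough userspace sketch of what such a split could look like. All names here (`res_core`, `res_node`, `fake_device`, `fake_drm_device`, `log_release`) are made up for illustration, locking is omitted, and this is not the actual devres code.

```c
#include <stdlib.h>

/* Hypothetical consumer-agnostic core: a plain list of release actions. */
struct res_node {
	struct res_node *next;
	void (*release)(void *data);
	void *data;
};

struct res_core {
	struct res_node *head;	/* most recently added action first */
};

void res_core_init(struct res_core *core)
{
	core->head = NULL;
}

/* Register a release action; returns 0 on success, -1 if allocation fails. */
int res_core_add(struct res_core *core, void (*release)(void *), void *data)
{
	struct res_node *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->release = release;
	node->data = data;
	node->next = core->head;
	core->head = node;
	return 0;
}

/* Run all actions in reverse order of registration, freeing each node. */
void res_core_release_all(struct res_core *core)
{
	while (core->head) {
		struct res_node *node = core->head;

		core->head = node->next;
		node->release(node->data);
		free(node);
	}
}

/* Two unrelated consumers embed the same core instead of duplicating it. */
struct fake_device { struct res_core res; };
struct fake_drm_device { struct res_core res; };

/* Demo helper: record the first character of each released tag, in order. */
static char release_log[8];
static int release_count;

void log_release(void *data)
{
	if (release_count < (int)sizeof(release_log) - 1)
		release_log[release_count++] = *(const char *)data;
}
```

The core functions only ever see a `struct res_core *`, so neither consumer type leaks into them. Registering "a" and then "b" with log_release and calling res_core_release_all() records "b" before "a", which is the reverse-order unwinding expected from managed resources.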

After such split one could think about changing name devres to something
more reliable.


Regards

Andrzej


