Re: [PATCH] drm: assure aux_dev is nonzero before using it

2020-09-08 Thread Lyude Paul
On Tue, 2020-08-18 at 10:58 -0700, Zwane Mwaikambo wrote:
> On Wed, 12 Aug 2020, Lyude Paul wrote:
> 
> > On Wed, 2020-08-12 at 16:10 +0200, Daniel Vetter wrote:
> > > On Wed, Aug 12, 2020 at 12:16 AM Zwane Mwaikambo 
> > > wrote:
> > > > On Tue, 11 Aug 2020, Daniel Vetter wrote:
> > > > 
> > > > > On Mon, Aug 10, 2020 at 10:11:50AM -0700, Zwane Mwaikambo wrote:
> > > > > > Hi Folks,
> > > > > > I know this thread eventually dropped off due to not
> > > > > > identifying
> > > > > > the underlying issue. It's still occuring on 5.8 and in my case it
> > > > > > happened because the udev device nodes for the DP aux devices were
> > > > > > not
> > > > > > cleaned up whereas the kernel had no association with them. I can
> > > > > > reproduce the bug just by creating a device node for a non-
> > > > > > existent
> > > > > > minor
> > > > > > device and calling open().
> > > > > 
> > > > > Hm I don't have that thread anymore, but generally these bugs are
> > > > > solved
> > > > > by not registering the device before it's ready for use. We do have
> > > > > drm_connector->late_register for that stuff. Just a guess since I'm
> > > > > not
> > > > > seeing full details here.
> > > > 
> > > > In this particular case, the physical device disappeared before the
> > > > nodes
> > > > were cleaned up. It involves putting a computer to sleep with a
> > > > monitor
> > > > plugged in and then waking it up with the monitor unplugged.
> > > 
> > > We also have early_unregister for the reverse, but yes this sounds
> > > more tricky ... Adding Lyude who's been working on way too much
> > > lifetime fun around dp recently.
> > > -Daniel
> > > 
> > Hi-I think just checking whether the auxdev is NULL or not is a reasonable
> > fix, although I am curious as to how exactly the aux dev's parent is
> > getting
> > destroyed before it's child, which I would have thought would be the only
> > way
> > you could hit this?
> 
> Hi, If this is acceptable, would you consider an updated patch against 
> 5.8?

Sure-although the process to getting this into stable is to get the patch into
drm-next first, then it can get cherry-picked into the stable kernel branches.
See https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html

> 
> Thanks,
>   Zwane
> 
-- 
Cheers,
Lyude Paul (she/her)
Software Engineer at Red Hat



Re: [PATCH] drm: assure aux_dev is nonzero before using it

2020-08-18 Thread Zwane Mwaikambo
On Wed, 12 Aug 2020, Lyude Paul wrote:

> On Wed, 2020-08-12 at 16:10 +0200, Daniel Vetter wrote:
> > On Wed, Aug 12, 2020 at 12:16 AM Zwane Mwaikambo  wrote:
> > > On Tue, 11 Aug 2020, Daniel Vetter wrote:
> > > 
> > > > On Mon, Aug 10, 2020 at 10:11:50AM -0700, Zwane Mwaikambo wrote:
> > > > > Hi Folks,
> > > > > I know this thread eventually dropped off due to not identifying
> > > > > the underlying issue. It's still occuring on 5.8 and in my case it
> > > > > happened because the udev device nodes for the DP aux devices were not
> > > > > cleaned up whereas the kernel had no association with them. I can
> > > > > reproduce the bug just by creating a device node for a non-existent
> > > > > minor
> > > > > device and calling open().
> > > > 
> > > > Hm I don't have that thread anymore, but generally these bugs are solved
> > > > by not registering the device before it's ready for use. We do have
> > > > drm_connector->late_register for that stuff. Just a guess since I'm not
> > > > seeing full details here.
> > > 
> > > In this particular case, the physical device disappeared before the nodes
> > > were cleaned up. It involves putting a computer to sleep with a monitor
> > > plugged in and then waking it up with the monitor unplugged.
> > 
> > We also have early_unregister for the reverse, but yes this sounds
> > more tricky ... Adding Lyude who's been working on way too much
> > lifetime fun around dp recently.
> > -Daniel
> > 
> Hi-I think just checking whether the auxdev is NULL or not is a reasonable
> fix, although I am curious as to how exactly the aux dev's parent is getting
> destroyed before it's child, which I would have thought would be the only way
> you could hit this?

Hi, If this is acceptable, would you consider an updated patch against 
5.8?

Thanks,
Zwane


Re: [PATCH] drm: assure aux_dev is nonzero before using it

2020-08-12 Thread Zwane Mwaikambo
On Wed, 12 Aug 2020, Lyude Paul wrote:

> On Wed, 2020-08-12 at 16:10 +0200, Daniel Vetter wrote:
> > On Wed, Aug 12, 2020 at 12:16 AM Zwane Mwaikambo  wrote:
> > > On Tue, 11 Aug 2020, Daniel Vetter wrote:
> > > 
> > > > On Mon, Aug 10, 2020 at 10:11:50AM -0700, Zwane Mwaikambo wrote:
> > > > > Hi Folks,
> > > > > I know this thread eventually dropped off due to not identifying
> > > > > the underlying issue. It's still occuring on 5.8 and in my case it
> > > > > happened because the udev device nodes for the DP aux devices were not
> > > > > cleaned up whereas the kernel had no association with them. I can
> > > > > reproduce the bug just by creating a device node for a non-existent
> > > > > minor
> > > > > device and calling open().
> > > > 
> > > > Hm I don't have that thread anymore, but generally these bugs are solved
> > > > by not registering the device before it's ready for use. We do have
> > > > drm_connector->late_register for that stuff. Just a guess since I'm not
> > > > seeing full details here.
> > > 
> > > In this particular case, the physical device disappeared before the nodes
> > > were cleaned up. It involves putting a computer to sleep with a monitor
> > > plugged in and then waking it up with the monitor unplugged.
> > 
> > We also have early_unregister for the reverse, but yes this sounds
> > more tricky ... Adding Lyude who's been working on way too much
> > lifetime fun around dp recently.
> > -Daniel
> > 
> Hi-I think just checking whether the auxdev is NULL or not is a reasonable
> fix, although I am curious as to how exactly the aux dev's parent is getting
> destroyed before it's child, which I would have thought would be the only way
> you could hit this?

Here is what it looks like without (1) and with (2) monitor connected. In 
the case where the monitor disappears during suspend, the device nodes 
aux3,4 are still around

1) No monitor connected
ls -l /dev/drm*
crw--- 1 root root 238, 0 Aug  6 22:32 /dev/drm_dp_aux0
crw--- 1 root root 238, 1 Aug  6 22:32 /dev/drm_dp_aux1


2) Monitor connected
crw--- 1 root root 238, 0 Aug  6 22:32 /dev/drm_dp_aux0
crw--- 1 root root 238, 1 Aug  6 22:32 /dev/drm_dp_aux1
crw--- 1 root root 238, 2 Aug 11 14:51 /dev/drm_dp_aux2
crw--- 1 root root 238, 3 Aug 11 14:51 /dev/drm_dp_aux3
crw--- 1 root root 238, 4 Aug 11 14:51 /dev/drm_dp_aux4



> 
> > > 
> > > > > To me it still makes sense to just check aux_dev because the chardev
> > > > > has
> > > > > no way to check before calling.
> > > > > 
> > > > > (gdb) list *drm_dp_aux_dev_get_by_minor+0x29
> > > > > 0x17b39 is in drm_dp_aux_dev_get_by_minor
> > > > > (drivers/gpu/drm/drm_dp_aux_dev.c:65).
> > > > > 60  static struct drm_dp_aux_dev
> > > > > *drm_dp_aux_dev_get_by_minor(unsigned index)
> > > > > 61  {
> > > > > 62  struct drm_dp_aux_dev *aux_dev = NULL;
> > > > > 63
> > > > > 64  mutex_lock(_idr_mutex);
> > > > > 65  aux_dev = idr_find(_idr, index);
> > > > > 66  if (!kref_get_unless_zero(_dev->refcount))
> > > > > 67  aux_dev = NULL;
> > > > > 68  mutex_unlock(_idr_mutex);
> > > > > 69
> > > > > (gdb) p/x &((struct drm_dp_aux_dev *)(0x0))->refcount
> > > > > $8 = 0x18
> > > > > 
> > > > > static int auxdev_open(struct inode *inode, struct file *file)
> > > > > {
> > > > > unsigned int minor = iminor(inode);
> > > > > struct drm_dp_aux_dev *aux_dev;
> > > > > 
> > > > > aux_dev = drm_dp_aux_dev_get_by_minor(minor);
> > > > > if (!aux_dev)
> > > > > return -ENODEV;
> > > > > 
> > > > > file->private_data = aux_dev;
> > > > > return 0;
> > > > > }
> > > > > 
> > > > > 
> > > > > ___
> > > > > dri-devel mailing list
> > > > > dri-de...@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 
> > 
> 


Re: [PATCH] drm: assure aux_dev is nonzero before using it

2020-08-12 Thread Lyude Paul
On Wed, 2020-08-12 at 16:10 +0200, Daniel Vetter wrote:
> On Wed, Aug 12, 2020 at 12:16 AM Zwane Mwaikambo  wrote:
> > On Tue, 11 Aug 2020, Daniel Vetter wrote:
> > 
> > > On Mon, Aug 10, 2020 at 10:11:50AM -0700, Zwane Mwaikambo wrote:
> > > > Hi Folks,
> > > > I know this thread eventually dropped off due to not identifying
> > > > the underlying issue. It's still occuring on 5.8 and in my case it
> > > > happened because the udev device nodes for the DP aux devices were not
> > > > cleaned up whereas the kernel had no association with them. I can
> > > > reproduce the bug just by creating a device node for a non-existent
> > > > minor
> > > > device and calling open().
> > > 
> > > Hm I don't have that thread anymore, but generally these bugs are solved
> > > by not registering the device before it's ready for use. We do have
> > > drm_connector->late_register for that stuff. Just a guess since I'm not
> > > seeing full details here.
> > 
> > In this particular case, the physical device disappeared before the nodes
> > were cleaned up. It involves putting a computer to sleep with a monitor
> > plugged in and then waking it up with the monitor unplugged.
> 
> We also have early_unregister for the reverse, but yes this sounds
> more tricky ... Adding Lyude who's been working on way too much
> lifetime fun around dp recently.
> -Daniel
> 
Hi-I think just checking whether the auxdev is NULL or not is a reasonable
fix, although I am curious as to how exactly the aux dev's parent is getting
destroyed before it's child, which I would have thought would be the only way
you could hit this?

> > 
> > > > To me it still makes sense to just check aux_dev because the chardev
> > > > has
> > > > no way to check before calling.
> > > > 
> > > > (gdb) list *drm_dp_aux_dev_get_by_minor+0x29
> > > > 0x17b39 is in drm_dp_aux_dev_get_by_minor
> > > > (drivers/gpu/drm/drm_dp_aux_dev.c:65).
> > > > 60  static struct drm_dp_aux_dev
> > > > *drm_dp_aux_dev_get_by_minor(unsigned index)
> > > > 61  {
> > > > 62  struct drm_dp_aux_dev *aux_dev = NULL;
> > > > 63
> > > > 64  mutex_lock(_idr_mutex);
> > > > 65  aux_dev = idr_find(_idr, index);
> > > > 66  if (!kref_get_unless_zero(_dev->refcount))
> > > > 67  aux_dev = NULL;
> > > > 68  mutex_unlock(_idr_mutex);
> > > > 69
> > > > (gdb) p/x &((struct drm_dp_aux_dev *)(0x0))->refcount
> > > > $8 = 0x18
> > > > 
> > > > static int auxdev_open(struct inode *inode, struct file *file)
> > > > {
> > > > unsigned int minor = iminor(inode);
> > > > struct drm_dp_aux_dev *aux_dev;
> > > > 
> > > > aux_dev = drm_dp_aux_dev_get_by_minor(minor);
> > > > if (!aux_dev)
> > > > return -ENODEV;
> > > > 
> > > > file->private_data = aux_dev;
> > > > return 0;
> > > > }
> > > > 
> > > > 
> > > > ___
> > > > dri-devel mailing list
> > > > dri-de...@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> 
-- 
Cheers,
Lyude Paul (she/her)
Software Engineer at Red Hat



Re: [PATCH] drm: assure aux_dev is nonzero before using it

2020-08-12 Thread Daniel Vetter
On Wed, Aug 12, 2020 at 12:16 AM Zwane Mwaikambo  wrote:
>
> On Tue, 11 Aug 2020, Daniel Vetter wrote:
>
> > On Mon, Aug 10, 2020 at 10:11:50AM -0700, Zwane Mwaikambo wrote:
> > > Hi Folks,
> > > I know this thread eventually dropped off due to not identifying
> > > the underlying issue. It's still occuring on 5.8 and in my case it
> > > happened because the udev device nodes for the DP aux devices were not
> > > cleaned up whereas the kernel had no association with them. I can
> > > reproduce the bug just by creating a device node for a non-existent minor
> > > device and calling open().
> >
> > Hm I don't have that thread anymore, but generally these bugs are solved
> > by not registering the device before it's ready for use. We do have
> > drm_connector->late_register for that stuff. Just a guess since I'm not
> > seeing full details here.
>
> In this particular case, the physical device disappeared before the nodes
> were cleaned up. It involves putting a computer to sleep with a monitor
> plugged in and then waking it up with the monitor unplugged.

We also have early_unregister for the reverse, but yes this sounds
more tricky ... Adding Lyude who's been working on way too much
lifetime fun around dp recently.
-Daniel

>
>
> > >
> > > To me it still makes sense to just check aux_dev because the chardev has
> > > no way to check before calling.
> > >
> > > (gdb) list *drm_dp_aux_dev_get_by_minor+0x29
> > > 0x17b39 is in drm_dp_aux_dev_get_by_minor 
> > > (drivers/gpu/drm/drm_dp_aux_dev.c:65).
> > > 60  static struct drm_dp_aux_dev 
> > > *drm_dp_aux_dev_get_by_minor(unsigned index)
> > > 61  {
> > > 62  struct drm_dp_aux_dev *aux_dev = NULL;
> > > 63
> > > 64  mutex_lock(_idr_mutex);
> > > 65  aux_dev = idr_find(_idr, index);
> > > 66  if (!kref_get_unless_zero(_dev->refcount))
> > > 67  aux_dev = NULL;
> > > 68  mutex_unlock(_idr_mutex);
> > > 69
> > > (gdb) p/x &((struct drm_dp_aux_dev *)(0x0))->refcount
> > > $8 = 0x18
> > >
> > > static int auxdev_open(struct inode *inode, struct file *file)
> > > {
> > > unsigned int minor = iminor(inode);
> > > struct drm_dp_aux_dev *aux_dev;
> > >
> > > aux_dev = drm_dp_aux_dev_get_by_minor(minor);
> > > if (!aux_dev)
> > > return -ENODEV;
> > >
> > > file->private_data = aux_dev;
> > > return 0;
> > > }
> > >
> > >
> > > ___
> > > dri-devel mailing list
> > > dri-de...@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm: assure aux_dev is nonzero before using it

2020-08-11 Thread Zwane Mwaikambo
On Tue, 11 Aug 2020, Daniel Vetter wrote:

> On Mon, Aug 10, 2020 at 10:11:50AM -0700, Zwane Mwaikambo wrote:
> > Hi Folks,
> > I know this thread eventually dropped off due to not identifying 
> > the underlying issue. It's still occuring on 5.8 and in my case it 
> > happened because the udev device nodes for the DP aux devices were not 
> > cleaned up whereas the kernel had no association with them. I can 
> > reproduce the bug just by creating a device node for a non-existent minor 
> > device and calling open().
> 
> Hm I don't have that thread anymore, but generally these bugs are solved
> by not registering the device before it's ready for use. We do have
> drm_connector->late_register for that stuff. Just a guess since I'm not
> seeing full details here.

In this particular case, the physical device disappeared before the nodes 
were cleaned up. It involves putting a computer to sleep with a monitor 
plugged in and then waking it up with the monitor unplugged.


> > 
> > To me it still makes sense to just check aux_dev because the chardev has 
> > no way to check before calling.
> > 
> > (gdb) list *drm_dp_aux_dev_get_by_minor+0x29
> > 0x17b39 is in drm_dp_aux_dev_get_by_minor 
> > (drivers/gpu/drm/drm_dp_aux_dev.c:65).
> > 60  static struct drm_dp_aux_dev *drm_dp_aux_dev_get_by_minor(unsigned 
> > index)
> > 61  {
> > 62  struct drm_dp_aux_dev *aux_dev = NULL;
> > 63
> > 64  mutex_lock(_idr_mutex);
> > 65  aux_dev = idr_find(_idr, index);
> > 66  if (!kref_get_unless_zero(_dev->refcount))
> > 67  aux_dev = NULL;
> > 68  mutex_unlock(_idr_mutex);
> > 69
> > (gdb) p/x &((struct drm_dp_aux_dev *)(0x0))->refcount
> > $8 = 0x18
> > 
> > static int auxdev_open(struct inode *inode, struct file *file)
> > {
> > unsigned int minor = iminor(inode);
> > struct drm_dp_aux_dev *aux_dev;
> > 
> > aux_dev = drm_dp_aux_dev_get_by_minor(minor);
> > if (!aux_dev)
> > return -ENODEV;
> > 
> > file->private_data = aux_dev;
> > return 0;
> > }
> > 
> > 
> > ___
> > dri-devel mailing list
> > dri-de...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> 


Re: [PATCH] drm: assure aux_dev is nonzero before using it

2020-08-11 Thread Daniel Vetter
On Mon, Aug 10, 2020 at 10:11:50AM -0700, Zwane Mwaikambo wrote:
> Hi Folks,
>   I know this thread eventually dropped off due to not identifying 
> the underlying issue. It's still occuring on 5.8 and in my case it 
> happened because the udev device nodes for the DP aux devices were not 
> cleaned up whereas the kernel had no association with them. I can 
> reproduce the bug just by creating a device node for a non-existent minor 
> device and calling open().

Hm I don't have that thread anymore, but generally these bugs are solved
by not registering the device before it's ready for use. We do have
drm_connector->late_register for that stuff. Just a guess since I'm not
seeing full details here.
-Daniel

> 
> To me it still makes sense to just check aux_dev because the chardev has 
> no way to check before calling.
> 
> (gdb) list *drm_dp_aux_dev_get_by_minor+0x29
> 0x17b39 is in drm_dp_aux_dev_get_by_minor 
> (drivers/gpu/drm/drm_dp_aux_dev.c:65).
> 60  static struct drm_dp_aux_dev *drm_dp_aux_dev_get_by_minor(unsigned 
> index)
> 61  {
> 62  struct drm_dp_aux_dev *aux_dev = NULL;
> 63
> 64  mutex_lock(_idr_mutex);
> 65  aux_dev = idr_find(_idr, index);
> 66  if (!kref_get_unless_zero(_dev->refcount))
> 67  aux_dev = NULL;
> 68  mutex_unlock(_idr_mutex);
> 69
> (gdb) p/x &((struct drm_dp_aux_dev *)(0x0))->refcount
> $8 = 0x18
> 
> static int auxdev_open(struct inode *inode, struct file *file)
> {
> unsigned int minor = iminor(inode);
> struct drm_dp_aux_dev *aux_dev;
> 
> aux_dev = drm_dp_aux_dev_get_by_minor(minor);
> if (!aux_dev)
> return -ENODEV;
> 
> file->private_data = aux_dev;
> return 0;
> }
> 
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm: assure aux_dev is nonzero before using it

2020-08-10 Thread Zwane Mwaikambo
Hi Folks,
I know this thread eventually dropped off due to not identifying 
the underlying issue. It's still occuring on 5.8 and in my case it 
happened because the udev device nodes for the DP aux devices were not 
cleaned up whereas the kernel had no association with them. I can 
reproduce the bug just by creating a device node for a non-existent minor 
device and calling open().

To me it still makes sense to just check aux_dev because the chardev has 
no way to check before calling.

(gdb) list *drm_dp_aux_dev_get_by_minor+0x29
0x17b39 is in drm_dp_aux_dev_get_by_minor (drivers/gpu/drm/drm_dp_aux_dev.c:65).
60  static struct drm_dp_aux_dev *drm_dp_aux_dev_get_by_minor(unsigned 
index)
61  {
62  struct drm_dp_aux_dev *aux_dev = NULL;
63
64  mutex_lock(_idr_mutex);
65  aux_dev = idr_find(_idr, index);
66  if (!kref_get_unless_zero(_dev->refcount))
67  aux_dev = NULL;
68  mutex_unlock(_idr_mutex);
69
(gdb) p/x &((struct drm_dp_aux_dev *)(0x0))->refcount
$8 = 0x18

static int auxdev_open(struct inode *inode, struct file *file)
{
unsigned int minor = iminor(inode);
struct drm_dp_aux_dev *aux_dev;

aux_dev = drm_dp_aux_dev_get_by_minor(minor);
if (!aux_dev)
return -ENODEV;

file->private_data = aux_dev;
return 0;
}