Re: video(4) multiple opens

2021-02-13 Thread Marcus Glocker
On Sat, Feb 13, 2021 at 10:45:17AM +0100, Claudio Jeker wrote:

> On Sat, Feb 13, 2021 at 10:26:48AM +0100, Marcus Glocker wrote:
> > On Sat, Feb 13, 2021 at 08:30:04AM +0100, Claudio Jeker wrote:
> > 
> > > On Fri, Feb 12, 2021 at 10:59:05PM +0100, Jeremie Courreges-Anglas wrote:
> > > > On Wed, Feb 10 2021, Martin Pieuchot  wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > Which fields is the new lock protecting?  Why isn't the KERNEL_LOCK()
> > > > > enough?
> > > > 
> > > > When I mentioned this potential lack of locking to Marcus, I was
> > > > mistaken.  Some of the functions in video.c are called from syscalls
> > > > that are marked NOLOCK.  But now that I have added some printf
> > > > debugging, I can see that the kernel lock is held in places where
> > > > I didn't expect it to be held (videoioctl, videoread, videoclose...).
> > > > So there's probably a layer of locking that I am missing.
> > > > 
> > > > Marcus, sorry for my misleading diff. O:)
> > > > 
> > > > *crawls back under his rock*
> > > 
> > > Just because a syscall is marked NOLOCK does not mean that all of it will
> > > run without the KERNEL_LOCK. For example read and write are marked NOLOCK
> > > but the KERNEL_LOCK is grabbed later for all non-socket filedescriptors.
> > > This is why video(4) still runs with the KERNEL_LOCK.
> > > The NOLOCK in syscalls.master just tell the kernel to not grab the
> > > KERNEL_LOCK right at the start of the syscall.
> > 
> > OK, that wasn't clear to me either!  If Jeremie has some space left
> > under his rock, I would join :-)
> > 
> > That basically means we need no extra locking in video(4) for this
> > implementation, right?  For a test I replaced the rw_enter_write()
> > with KERNEL_ASSERT_LOCKED(), running a stream and multiple concurrent
> > ioctl programs, and all went fine.
> > 
> > I would come up with a revised diff then for the device owner part.
> > Just doing some polishing together with Anton on that.
> 
> For now this is OK. I have some diffs around which will allow unlocked
> read and write down to individual devices (I was experimenting with tun(4)
> and tap(4)). For ioctl() this will not happen any time soon. ioctl() is a
> world of pain. Maybe for devices like video(4) the call path will be
> manageable but for network devices it is just mess. The main issue is that
> ioctls work on multiple layers and sometimes the calls have layer
> violations and often multiple sleep points built into them.

Got it.  Thanks for the explanation.
Here the adapted diff, including a lot of input from anton@.
I sprinkled KERNEL_ASSERT_LOCKED() in those functions where we require
locking.  Not sure if we should keep them in?


Index: sys/dev/video.c
===
RCS file: /cvs/src/sys/dev/video.c,v
retrieving revision 1.48
diff -u -p -u -p -r1.48 video.c
--- sys/dev/video.c 31 Jan 2021 19:32:01 -  1.48
+++ sys/dev/video.c 13 Feb 2021 09:49:45 -
@@ -47,7 +47,8 @@ struct video_softc {
struct device   *sc_dev;/* hardware device struct */
struct video_hw_if  *hw_if; /* hardware interface */
char sc_dying;  /* device detached */
-   struct process   *sc_owner; /* owner process */
+   struct process  *sc_owner;  /* owner process */
+   uint8_t  sc_open;   /* device opened */
 
int  sc_fsize;
uint8_t *sc_fbuffer;
@@ -69,6 +70,8 @@ int   videoactivate(struct device *, int);
 intvideoprint(void *, const char *);
 
 void   video_intr(void *);
+intvideo_stop(struct video_softc *);
+intvideo_claim(struct video_softc *, struct process *);
 
 struct cfattach video_ca = {
sizeof(struct video_softc), videoprobe, videoattach,
@@ -122,6 +125,9 @@ videoopen(dev_t dev, int flags, int fmt,
 {
int unit;
struct video_softc *sc;
+   int error = 0;
+
+   KERNEL_ASSERT_LOCKED();
 
unit = VIDEOUNIT(dev);
if (unit >= video_cd.cd_ndevs ||
@@ -129,38 +135,40 @@ videoopen(dev_t dev, int flags, int fmt,
 sc->hw_if == NULL)
return (ENXIO);
 
-   if (sc->sc_owner != NULL) {
-   if (sc->sc_owner == p->p_p)
-   return (0);
-   else
-   return (EBUSY);
-   } else
-   sc->sc_owner = p->p_p;
+   if (sc->sc_open) {
+   DPRINTF(("%s: device already open\n", __func__));
+   return (0);
+   } else {
+   sc->sc_open = 1;
+   DPRINTF(("%s: set device to open\n", __func__));
+   }
 
sc->sc_vidmode = VIDMODE_NONE;
sc->sc_frames_ready = 0;
 
if (sc->hw_if->open != NULL)
-   return (sc->hw_if->open(sc->hw_hdl, flags, >sc_fsize,
-   sc->sc_fbuffer, video_intr, sc));
-   else
-   return (0);
+ 

Re: video(4) multiple opens

2021-02-13 Thread Claudio Jeker
On Sat, Feb 13, 2021 at 10:26:48AM +0100, Marcus Glocker wrote:
> On Sat, Feb 13, 2021 at 08:30:04AM +0100, Claudio Jeker wrote:
> 
> > On Fri, Feb 12, 2021 at 10:59:05PM +0100, Jeremie Courreges-Anglas wrote:
> > > On Wed, Feb 10 2021, Martin Pieuchot  wrote:
> > > 
> > > [...]
> > > 
> > > > Which fields is the new lock protecting?  Why isn't the KERNEL_LOCK()
> > > > enough?
> > > 
> > > When I mentioned this potential lack of locking to Marcus, I was
> > > mistaken.  Some of the functions in video.c are called from syscalls
> > > that are marked NOLOCK.  But now that I have added some printf
> > > debugging, I can see that the kernel lock is held in places where
> > > I didn't expect it to be held (videoioctl, videoread, videoclose...).
> > > So there's probably a layer of locking that I am missing.
> > > 
> > > Marcus, sorry for my misleading diff. O:)
> > > 
> > > *crawls back under his rock*
> > 
> > Just because a syscall is marked NOLOCK does not mean that all of it will
> > run without the KERNEL_LOCK. For example read and write are marked NOLOCK
> > but the KERNEL_LOCK is grabbed later for all non-socket filedescriptors.
> > This is why video(4) still runs with the KERNEL_LOCK.
> > The NOLOCK in syscalls.master just tell the kernel to not grab the
> > KERNEL_LOCK right at the start of the syscall.
> 
> OK, that wasn't clear to me either!  If Jeremie has some space left
> under his rock, I would join :-)
> 
> That basically means we need no extra locking in video(4) for this
> implementation, right?  For a test I replaced the rw_enter_write()
> with KERNEL_ASSERT_LOCKED(), running a stream and multiple concurrent
> ioctl programs, and all went fine.
> 
> I would come up with a revised diff then for the device owner part.
> Just doing some polishing together with Anton on that.

For now this is OK. I have some diffs around which will allow unlocked
read and write down to individual devices (I was experimenting with tun(4)
and tap(4)). For ioctl() this will not happen any time soon. ioctl() is a
world of pain. Maybe for devices like video(4) the call path will be
manageable but for network devices it is just mess. The main issue is that
ioctls work on multiple layers and sometimes the calls have layer
violations and often multiple sleep points built into them.

-- 
:wq Claudio



Re: video(4) multiple opens

2021-02-13 Thread Marcus Glocker
On Sat, Feb 13, 2021 at 08:30:04AM +0100, Claudio Jeker wrote:

> On Fri, Feb 12, 2021 at 10:59:05PM +0100, Jeremie Courreges-Anglas wrote:
> > On Wed, Feb 10 2021, Martin Pieuchot  wrote:
> > 
> > [...]
> > 
> > > Which fields is the new lock protecting?  Why isn't the KERNEL_LOCK()
> > > enough?
> > 
> > When I mentioned this potential lack of locking to Marcus, I was
> > mistaken.  Some of the functions in video.c are called from syscalls
> > that are marked NOLOCK.  But now that I have added some printf
> > debugging, I can see that the kernel lock is held in places where
> > I didn't expect it to be held (videoioctl, videoread, videoclose...).
> > So there's probably a layer of locking that I am missing.
> > 
> > Marcus, sorry for my misleading diff. O:)
> > 
> > *crawls back under his rock*
> 
> Just because a syscall is marked NOLOCK does not mean that all of it will
> run without the KERNEL_LOCK. For example read and write are marked NOLOCK
> but the KERNEL_LOCK is grabbed later for all non-socket filedescriptors.
> This is why video(4) still runs with the KERNEL_LOCK.
> The NOLOCK in syscalls.master just tell the kernel to not grab the
> KERNEL_LOCK right at the start of the syscall.

OK, that wasn't clear to me either!  If Jeremie has some space left
under his rock, I would join :-)

That basically means we need no extra locking in video(4) for this
implementation, right?  For a test I replaced the rw_enter_write()
with KERNEL_ASSERT_LOCKED(), running a stream and multiple concurrent
ioctl programs, and all went fine.

I would come up with a revised diff then for the device owner part.
Just doing some polishing together with Anton on that.



Re: video(4) multiple opens

2021-02-12 Thread Claudio Jeker
On Fri, Feb 12, 2021 at 10:59:05PM +0100, Jeremie Courreges-Anglas wrote:
> On Wed, Feb 10 2021, Martin Pieuchot  wrote:
> 
> [...]
> 
> > Which fields is the new lock protecting?  Why isn't the KERNEL_LOCK()
> > enough?
> 
> When I mentioned this potential lack of locking to Marcus, I was
> mistaken.  Some of the functions in video.c are called from syscalls
> that are marked NOLOCK.  But now that I have added some printf
> debugging, I can see that the kernel lock is held in places where
> I didn't expect it to be held (videoioctl, videoread, videoclose...).
> So there's probably a layer of locking that I am missing.
> 
> Marcus, sorry for my misleading diff. O:)
> 
> *crawls back under his rock*

Just because a syscall is marked NOLOCK does not mean that all of it will
run without the KERNEL_LOCK. For example read and write are marked NOLOCK
but the KERNEL_LOCK is grabbed later for all non-socket filedescriptors.
This is why video(4) still runs with the KERNEL_LOCK.
The NOLOCK in syscalls.master just tell the kernel to not grab the
KERNEL_LOCK right at the start of the syscall.

-- 
:wq Claudio



Re: video(4) multiple opens

2021-02-12 Thread Jeremie Courreges-Anglas
On Wed, Feb 10 2021, Martin Pieuchot  wrote:

[...]

> Which fields is the new lock protecting?  Why isn't the KERNEL_LOCK()
> enough?

When I mentioned this potential lack of locking to Marcus, I was
mistaken.  Some of the functions in video.c are called from syscalls
that are marked NOLOCK.  But now that I have added some printf
debugging, I can see that the kernel lock is held in places where
I didn't expect it to be held (videoioctl, videoread, videoclose...).
So there's probably a layer of locking that I am missing.

Marcus, sorry for my misleading diff. O:)

*crawls back under his rock*
-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: video(4) multiple opens

2021-02-10 Thread Martin Pieuchot
On 09/02/21(Tue) 20:35, Marcus Glocker wrote:
> jca@ has recently committed a change to video(4) to allow the same
> process to do multiple opens on the same video device to satisfy
> certain applications, and start to go in to the V4L2 "1.1.4 Multiple
> Opens" specification direction as described here:
> 
> https://www.kernel.org/doc/html/v5.10/userspace-api/media/v4l/open.html#f1
> 
> As well he recently sent me some locking code to prevent concurrent
> access to certain video(4) functions.  On that I think it makes more
> sense to introduce the locking code together with the next step, which
> is to allow different processes to open the same video device.
> 
> Therefore I have added on top of jca@s locking code the code to define
> a device owner, and based on that distinguish between which process is
> allowed to call certain video(4) functions.  Basically the process
> starting with the buffer memory allocation or/and starting the video
> stream becomes the device owner.  Other processes can do things like
> calling VIDIOC_G_CTRL or VIDIOC_S_CTRL ioctls.  In this diff certainly
> more ioctls can be moved up to the "shared" part, but I only started
> with the video controls for now.
> 
> Also video(1) requires a small change to make the read(2) method
> signal that the stream needs to be stopped on close.  I checked that
> other applications do implement this behavior as well.  Otherwise
> if you have multiple file handles open on a video device, and the
> read(2) process exists before another process, we won't notice that
> the stream needs to be stopped, since only the last file handle
> close will call the videoclose() function.
> 
> I would appreciate some regression testing and feedback to make a
> start on implementing this specification.

Which fields is the new lock protecting?  Why isn't the KERNEL_LOCK()
enough?  Is it because of sleeping points?  Could you annotate them?

> Index: sys/dev/video.c
> ===
> RCS file: /cvs/src/sys/dev/video.c,v
> retrieving revision 1.48
> diff -u -p -u -p -r1.48 video.c
> --- sys/dev/video.c   31 Jan 2021 19:32:01 -  1.48
> +++ sys/dev/video.c   7 Feb 2021 15:33:52 -
> @@ -48,6 +48,8 @@ struct video_softc {
>   struct video_hw_if  *hw_if; /* hardware interface */
>   char sc_dying;  /* device detached */
>   struct process   *sc_owner; /* owner process */
> + struct rwlocksc_lock;   /* device lock */
> + uint8_t  sc_open;   /* device opened */
>  
>   int  sc_fsize;
>   uint8_t *sc_fbuffer;
> @@ -122,6 +124,7 @@ videoopen(dev_t dev, int flags, int fmt,
>  {
>   int unit;
>   struct video_softc *sc;
> + int r = 0;
>  
>   unit = VIDEOUNIT(dev);
>   if (unit >= video_cd.cd_ndevs ||
> @@ -129,22 +132,27 @@ videoopen(dev_t dev, int flags, int fmt,
>sc->hw_if == NULL)
>   return (ENXIO);
>  
> - if (sc->sc_owner != NULL) {
> - if (sc->sc_owner == p->p_p)
> - return (0);
> - else
> - return (EBUSY);
> - } else
> - sc->sc_owner = p->p_p;
> + rw_enter_write(>sc_lock);
> +
> + if (sc->sc_open) {
> + DPRINTF(("%s: device already open\n", __func__));
> + rw_exit_write(>sc_lock);
> + return (r);
> + } else {
> + sc->sc_open = 1;
> + DPRINTF(("%s: set device to open\n", __func__));
> + }
>  
>   sc->sc_vidmode = VIDMODE_NONE;
>   sc->sc_frames_ready = 0;
>  
>   if (sc->hw_if->open != NULL)
> - return (sc->hw_if->open(sc->hw_hdl, flags, >sc_fsize,
> - sc->sc_fbuffer, video_intr, sc));
> - else
> - return (0);
> + r = sc->hw_if->open(sc->hw_hdl, flags, >sc_fsize,
> + sc->sc_fbuffer, video_intr, sc);
> +
> + rw_exit_write(>sc_lock);
> +
> + return (r);
>  }
>  
>  int
> @@ -155,11 +163,23 @@ videoclose(dev_t dev, int flags, int fmt
>  
>   sc = video_cd.cd_devs[VIDEOUNIT(dev)];
>  
> + rw_enter_write(>sc_lock);
> +
>   if (sc->hw_if->close != NULL)
>   r = sc->hw_if->close(sc->hw_hdl);
>  
> + if (p != NULL) {
> + sc->sc_open = 0;
> + DPRINTF(("%s: last close\n", __func__));
> + }
> + DPRINTF(("%s: stream close\n", __func__));
> +
> + sc->sc_vidmode = VIDMODE_NONE;
> + sc->sc_frames_ready = 0;
>   sc->sc_owner = NULL;
>  
> + rw_exit_write(>sc_lock);
> +
>   return (r);
>  }
>  
> @@ -175,11 +195,28 @@ videoread(dev_t dev, struct uio *uio, in
>   (sc = video_cd.cd_devs[unit]) == NULL)
>   return (ENXIO);
>  
> - if (sc->sc_dying)
> + rw_enter_write(>sc_lock);
> +
> + if (sc->sc_dying) {
> + rw_exit_write(>sc_lock);
>   return 

Re: video(4) multiple opens

2021-01-06 Thread Jeremie Courreges-Anglas
On Wed, Jan 06 2021, Marcus Glocker  wrote:
> On Tue, Jan 05, 2021 at 11:54:31PM +0100, Jeremie Courreges-Anglas wrote:

[...]

>> Here's the diff.  IIUC the use of atomic operations isn't strictly
>> needed here since open(2) runs with the kernel lock, but the result
>> is easy to understand IMO.
>
> I don't agree with that, see my comments bellow in point a) and b).
>  
>> Thoughts?  ok?
>
> I think it doesn't harm to allow the same process to do multiple opens
> on video(1) as a first improvement.  But I'm not happy using
> atomic_cas_ptr(9) and atomic_swap_ptr(9) for this because:
>
>   a) As you already have mentioned, we don't require atomic
>  operations here.  I checked that those functions are very
>  rarely used in the kernel, and only there were atomic
>  operation is required.  I'm afraid that when we use those in
>  video(1), and somebody looks at the code later on to
>  eventually replace it, there will be confusion if there was
>  a specific reason why to use atomic functions.
>
>   b) IMO it doesn't simplify the code.  I first had to read the
>  manual pages to understand how those functions work.

Fair enough, I also felt that it was a bit premature.

> I would prefer to do something simpler, like in this adapted diff.
> If it works for you, I'm OK if you commit this instead.

Done, thanks.

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: video(4) multiple opens

2021-01-06 Thread Marcus Glocker
On Tue, Jan 05, 2021 at 11:54:31PM +0100, Jeremie Courreges-Anglas wrote:

> 
> I hit a weird failure with firefox and BigBlueButton
> (https://bigbluebutton.org/) where firefox can't use my webcam.
> video(1) works, same for other webrtc sites in firefox, eg meet.jit.si.
> ktrace shows that a single firefox process tries to open /dev/video0
> more than once, and that fails with EBUSY.  The code lies in the
> libwebrtc library
> 
>   
> https://webrtc.googlesource.com/src/+/refs/heads/master/modules/video_capture/linux/
> 
> In my tests the multiple open() calls only happen at initialization
> time, video streaming only uses one fd.
> 
> Back to our kernel, the current restrictive behavior was introduced with
> 
>   revision 1.18
>   date: 2008/07/23 22:10:21;  author: mglocker;  state: Exp;  lines: +11 -4;
>   If /dev/video* is already used by an application, return EBUSY to other
>   applications.  Fixes a kernel panic.
> 
>   Reported by ian@
> 
> Meanwhile, the V4L2 API specifies that a device "can be opened more than
> once" from multiple processes, with access to certain methods blocked
> when an application starts reading from the device.
> 
>   
> https://www.kernel.org/doc/html/v5.10/userspace-api/media/v4l/open.html#multiple-opens
> 
> So the API says we're being too restrictive, but we don't want to go
> back to uncontrolled access either.  I guess the ideal solution would be
> to implement multiple process access with fine-grained checks, but...
> I don't have time for that!

Oh, I guess that makes video(1) an "old and obscure" driver according
to the API [2] footnote ;-)

Finally we should implement the controls for multiple video device
openings based on what has been exactly requested.  Since the basic
idea mentioned in the API is that other processes, like a panel
application, can be started to e.g. change controls.  I will have a
look at that when I find some time.

> An easy improvement for my use case would be to allow the same process
> to open a device multiple times.  It fixes firefox + BigBlueButton for
> me and doesn't make concurrent accesses worse (multiple threads from the
> same process can already do concurrent accesses, which is something
> we might want to look at).

Interesting.  I was not aware that threads of the same process can do
that.  I need to have a closer look at what is happening here.
 
> Here's the diff.  IIUC the use of atomic operations isn't strictly
> needed here since open(2) runs with the kernel lock, but the result
> is easy to understand IMO.

I don't agree with that, see my comments bellow in point a) and b).
 
> Thoughts?  ok?

I think it doesn't harm to allow the same process to do multiple opens
on video(1) as a first improvement.  But I'm not happy using
atomic_cas_ptr(9) and atomic_swap_ptr(9) for this because:

a) As you already have mentioned, we don't require atomic
   operations here.  I checked that those functions are very
   rarely used in the kernel, and only there were atomic
   operation is required.  I'm afraid that when we use those in
   video(1), and somebody looks at the code later on to
   eventually replace it, there will be confusion if there was
   a specific reason why to use atomic functions.

b) IMO it doesn't simplify the code.  I first had to read the
   manual pages to understand how those functions work.

I would prefer to do something simpler, like in this adapted diff.
If it works for you, I'm OK if you commit this instead.

> Index: dev/video.c
> ===
> RCS file: /d/cvs/src/sys/dev/video.c,v
> retrieving revision 1.46
> diff -u -p -r1.46 video.c
> --- dev/video.c   28 Dec 2020 18:28:11 -  1.46
> +++ dev/video.c   5 Jan 2021 22:34:04 -
> @@ -28,6 +28,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  
>  #include 
> @@ -46,8 +48,7 @@ struct video_softc {
>   struct device   *sc_dev;/* hardware device struct */
>   struct video_hw_if  *hw_if; /* hardware interface */
>   char sc_dying;  /* device detached */
> -#define VIDEO_OPEN   0x01
> - char sc_open;
> + struct process  *sc_owner;  /* owner process */
>  
>   int  sc_fsize;
>   uint8_t *sc_fbuffer;
> @@ -101,6 +102,7 @@ videoattach(struct device *parent, struc
>   sc->hw_hdl = sa->hdl;
>   sc->sc_dev = parent;
>   sc->sc_fbufferlen = 0;
> + sc->sc_owner = NULL;
>  
>   if (sc->hw_if->get_bufsize)
>   sc->sc_fbufferlen = (sc->hw_if->get_bufsize)(sc->hw_hdl);
> @@ -121,6 +123,7 @@ videoopen(dev_t dev, int flags, int fmt,
>  {
>   int unit;
>   struct video_softc *sc;
> + struct process *owner;
>  
>   unit = VIDEOUNIT(dev);
>   if (unit >= video_cd.cd_ndevs ||
> @@ -128,9 +131,13 @@