Re: rwsleep(9)

2016-09-13 Thread Vincent Gross
On Tue, 13 Sep 2016 10:08:13 +0200
Martin Pieuchot  wrote:

> On 12/09/16(Mon) 12:12, Vincent Gross wrote:
> > On Mon, 12 Sep 2016 10:49:03 +0200
> > Martin Pieuchot  wrote:
> >   
> > > I'd like to use a write lock to serialize accesses to ip_output().
> > > This will be used to guarantee that atomic code sections in the
> > > socket layer stay atomic when the input/forwarding path won't run
> > > under KERNEL_LOCK().
> > > 
> > > For such purpose I'll have to convert some tsleep(9) to an
> > > msleep(9)-like function operating on a write lock.  That's why I'd
> > > like to introduce rwsleep(9).  I did not bother exporting a read
> > > variant of this function since I don't need it for the moment.
> > > 
> > > ok?  
> > 
> > MP noob here :
> > 
> > tsleep() and msleep() check if they are called during
> > autoconfiguration or after a panic to let interrupts run. There is
> > no such check here. I get that rwsleep() during autoconf makes
> > little sense, but to err on the safe side maybe add some kind of
> > assert (if it is not too much of a pain) ? and what about panic,
> > shouldn't this be handled ?  
> 
> This is not a MP problem but an old BSD heritage.  I don't mind adding
> it.  But that's not a real solution to panic being broken with sleep
> or locks.
> 

"old BSD heritage" -> 'nuff said. No need to spread the rot then.

ok vgross@



Re: rwsleep(9)

2016-09-13 Thread Martin Pieuchot
On 12/09/16(Mon) 12:12, Vincent Gross wrote:
> On Mon, 12 Sep 2016 10:49:03 +0200
> Martin Pieuchot  wrote:
> 
> > I'd like to use a write lock to serialize accesses to ip_output().
> > This will be used to guarantee that atomic code sections in the
> > socket layer stay atomic when the input/forwarding path won't run
> > under KERNEL_LOCK().
> > 
> > For such purpose I'll have to convert some tsleep(9) to an
> > msleep(9)-like function operating on a write lock.  That's why I'd
> > like to introduce rwsleep(9).  I did not bother exporting a read
> > variant of this function since I don't need it for the moment.
> > 
> > ok?
> 
> MP noob here :
> 
> tsleep() and msleep() check if they are called during autoconfiguration
> or after a panic to let interrupts run. There is no such check here. I
> get that rwsleep() during autoconf makes little sense, but to err on
> the safe side maybe add some kind of assert (if it is not too much of
> a pain) ? and what about panic, shouldn't this be handled ?

This is not a MP problem but an old BSD heritage.  I don't mind adding
it.  But that's not a real solution to panic being broken with sleep or
locks.



Re: rwsleep(9)

2016-09-12 Thread Ted Unangst
Philip Guenther wrote:
> So what's protecting the state that you're waiting for the change to
> occur in?  If it's the mutex, then why are you accessing it with only
> the rwlock early on?  If it's the rwlock then how to prevent a
> lost-wakeup by a thread on another CPU getting in between the
> rw_exit() and msleep() and calling wakeup() in that window?  That code
> would only work reliably if you held *both* the mutex and the rwlock
> when making changes to the state, or both the mutex and the big-lock.
> Either way, what's the point of the rwlock there?

In this case, the mutex protects the sleep condition/wakeup, while the rwlock
protects whatever else it protects. Something like the lockmgr interlock.
I imagine there are really two states here, the 'internal' state for which we
require locking, and the 'external' state which we sleep/wakeup for.

But i'm just yammering.



Re: rwsleep(9)

2016-09-12 Thread Philip Guenther
On Mon, Sep 12, 2016 at 4:29 AM, Ted Unangst  wrote:
...
> Something feels a little weird about this. Usually you want to keep the rwlock
> while sleeping.

Yes, that's usually the point of the lock, but when you need a
condition-variable for changes in the state protected by an rwlock,
you need rwsleep().


> Instead we use mutex for passing control over sleep.
> I don't know how this would make the code look, but something like
>
> rw_enter();
> /* do a bunch of stuff */
> mtx_enter();
> rw_exit();
> msleep();

So what's protecting the state that you're waiting for the change to
occur in?  If it's the mutex, then why are you accessing it with only
the rwlock early on?  If it's the rwlock then how to prevent a
lost-wakeup by a thread on another CPU getting in between the
rw_exit() and msleep() and calling wakeup() in that window?  That code
would only work reliably if you held *both* the mutex and the rwlock
when making changes to the state, or both the mutex and the big-lock.
Either way, what's the point of the rwlock there?


Philip Guenther



Re: rwsleep(9)

2016-09-12 Thread Philip Guenther
On Mon, Sep 12, 2016 at 1:49 AM, Martin Pieuchot  wrote:
> I'd like to use a write lock to serialize accesses to ip_output().  This
> will be used to guarantee that atomic code sections in the socket layer
> stay atomic when the input/forwarding path won't run under KERNEL_LOCK().
>
> For such purpose I'll have to convert some tsleep(9) to an msleep(9)-like
> function operating on a write lock.  That's why I'd like to introduce
> rwsleep(9).  I did not bother exporting a read variant of this function
> since I don't need it for the moment.
>
> ok?

ok guenther@



Re: rwsleep(9)

2016-09-12 Thread Ted Unangst
Martin Pieuchot wrote:
> I'd like to use a write lock to serialize accesses to ip_output().  This
> will be used to guarantee that atomic code sections in the socket layer
> stay atomic when the input/forwarding path won't run under KERNEL_LOCK().
> 
> For such purpose I'll have to convert some tsleep(9) to an msleep(9)-like
> function operating on a write lock.  That's why I'd like to introduce
> rwsleep(9).  I did not bother exporting a read variant of this function
> since I don't need it for the moment.

Something feels a little weird about this. Usually you want to keep the rwlock
while sleeping. Instead we use mutex for passing control over sleep.
I don't know how this would make the code look, but something like

rw_enter();
/* do a bunch of stuff */
mtx_enter();
rw_exit();
msleep();

But I don't see anything wrong with rwsleep.



Re: rwsleep(9)

2016-09-12 Thread Vincent Gross
On Mon, 12 Sep 2016 10:49:03 +0200
Martin Pieuchot  wrote:

> I'd like to use a write lock to serialize accesses to ip_output().
> This will be used to guarantee that atomic code sections in the
> socket layer stay atomic when the input/forwarding path won't run
> under KERNEL_LOCK().
> 
> For such purpose I'll have to convert some tsleep(9) to an
> msleep(9)-like function operating on a write lock.  That's why I'd
> like to introduce rwsleep(9).  I did not bother exporting a read
> variant of this function since I don't need it for the moment.
> 
> ok?

MP noob here :

tsleep() and msleep() check if they are called during autoconfiguration
or after a panic to let interrupts run. There is no such check here. I
get that rwsleep() during autoconf makes little sense, but to err on
the safe side maybe add some kind of assert (if it is not too much of
a pain) ? and what about panic, shouldn't this be handled ?

> 
> Index: sys/kern/kern_synch.c
> ===
> RCS file: /cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.134
> diff -u -p -r1.134 kern_synch.c
> --- sys/kern/kern_synch.c 3 Sep 2016 15:06:06 -
> 1.134 +++ sys/kern/kern_synch.c   12 Sep 2016 08:41:23 -
> @@ -226,6 +226,40 @@ msleep(const volatile void *ident, struc
>   return (error);
>  }
>  
> +/*
> + * Same as tsleep, but if we have a rwlock provided, then once we've
> + * entered the sleep queue we drop the it. After sleeping we re-lock.
> + */
> +int
> +rwsleep(const volatile void *ident, struct rwlock *wl, int priority,
> +const char *wmesg, int timo)
> +{
> + struct sleep_state sls;
> + int error, error1;
> +
> + KASSERT((priority & ~(PRIMASK | PCATCH | PNORELOCK)) == 0);
> + rw_assert_wrlock(wl);
> +
> + sleep_setup(&sls, ident, priority, wmesg);
> + sleep_setup_timeout(&sls, timo);
> + sleep_setup_signal(&sls, priority);
> +
> + rw_exit_write(wl);
> +
> + sleep_finish(&sls, 1);
> + error1 = sleep_finish_timeout(&sls);
> + error = sleep_finish_signal(&sls);
> +
> + if ((priority & PNORELOCK) == 0)
> + rw_enter_write(wl);
> +
> + /* Signal errors are higher priority than timeouts. */
> + if (error == 0 && error1 != 0)
> + error = error1;
> +
> + return (error);
> +}
> +
>  void
>  sleep_setup(struct sleep_state *sls, const volatile void *ident, int
> prio, const char *wmesg)
> Index: sys/sys/systm.h
> ===
> RCS file: /cvs/src/sys/sys/systm.h,v
> retrieving revision 1.116
> diff -u -p -r1.116 systm.h
> --- sys/sys/systm.h   4 Sep 2016 09:22:29 -   1.116
> +++ sys/sys/systm.h   12 Sep 2016 08:35:26 -
> @@ -246,11 +246,13 @@ int sleep_finish_signal(struct sleep_sta
>  void sleep_queue_init(void);
>  
>  struct mutex;
> +struct rwlock;
>  voidwakeup_n(const volatile void *, int);
>  voidwakeup(const volatile void *);
>  #define wakeup_one(c) wakeup_n((c), 1)
>  int  tsleep(const volatile void *, int, const char *, int);
>  int  msleep(const volatile void *, struct mutex *, int,  const
> char*, int); +int rwsleep(const volatile void *, struct rwlock
> *, int, const char *, int); void  yield(void);
>  
>  void wdog_register(int (*)(void *, int), void *);
> Index: share/man/man9/tsleep.9
> ===
> RCS file: /cvs/src/share/man/man9/tsleep.9,v
> retrieving revision 1.10
> diff -u -p -r1.10 tsleep.9
> --- share/man/man9/tsleep.9   14 Sep 2015 15:14:55 -
> 1.10 +++ share/man/man9/tsleep.9  12 Sep 2016 08:42:55 -
> @@ -34,6 +34,7 @@
>  .Sh NAME
>  .Nm tsleep ,
>  .Nm msleep ,
> +.Nm rwsleep ,
>  .Nm wakeup ,
>  .Nm wakeup_n ,
>  .Nm wakeup_one
> @@ -45,6 +46,8 @@
>  .Fn tsleep "void *ident" "int priority" "const char *wmesg" "int
> timo" .Ft int
>  .Fn msleep "void *ident" "struct mutex *mtx" "int priority" "const
> char *wmesg" "int timo" +.Ft int
> +.Fn rwsleep "void *ident" "struct rwlock *wl" "int priority" "const
> char *wmesg" "int timo" .Ft void
>  .Fn wakeup "void *ident"
>  .Ft void
> @@ -53,9 +56,10 @@
>  .Fn wakeup_one "void *ident"
>  .Sh DESCRIPTION
>  These functions implement voluntary context switching.
> -.Fn tsleep
> -and
> +.Fn tsleep ,
>  .Fn msleep
> +and
> +.Fn rwsleep
>  are used throughout the kernel whenever processing in the current
> context cannot continue for any of the following reasons:
>  .Bl -bullet -offset indent
> @@ -146,6 +150,22 @@ argument.
>  .El
>  .Pp
>  The
> +.Fn rwsleep
> +function behaves just like
> +.Fn tsleep ,
> +but takes an additional argument:
> +.Bl -tag -width priority
> +.It Fa wl
> +A write lock that will be unlocked when the process is safely
> +on the sleep queue.
> +The write lock will be relocked at the end of rwsleep unless the
> +.Dv PNORELOCK
> +flag is set in the
> +.Fa priority
> +argument.
> +.El
> +.Pp
> +The
>  .Fn wakeup
>  function will mark all proce