Re: timeout_set_proc(9)

2016-10-05 Thread Christiano F. Haesbaert
On 5 October 2016 at 18:26, Ted Unangst wrote:
> Christiano F. Haesbaert wrote:
>> There is another bug: the thread runs outside of IPL_SOFTCLOCK. The
>> interrupt handler already runs at IPL_SOFTCLOCK, so it did not need to
>> raise it, but the thread does.
>>
>> The mutex is not enough, as it is dropped before running the handler;
>> this can cause intr timeouts to interrupt proc timeouts.
>
> Is this a real bug? A proc timeout can, by definition, sleep, so it shouldn't
> be making any assumptions about running atomically. If it needs to block
> timeouts, the handler should use spl.

I agree, you can't assume it is atomic, but at this point, in this
transition phase, I'd raise the ipl, since the code underneath is
assumed to run without any timeouts triggering.
The code that actually does sleep (say on a rwlock) is "new" code
that should address losing the atomicity; but again, at this point
this does not happen, and it breaks an old invariant.
If this does not hold, then we shouldn't pin the thread to cpu0
either; it should run freely.



Re: timeout_set_proc(9)

2016-10-05 Thread Ted Unangst
Christiano F. Haesbaert wrote:
> There is another bug: the thread runs outside of IPL_SOFTCLOCK. The
> interrupt handler already runs at IPL_SOFTCLOCK, so it did not need to
> raise it, but the thread does.
> 
> The mutex is not enough, as it is dropped before running the handler;
> this can cause intr timeouts to interrupt proc timeouts.

Is this a real bug? A proc timeout can, by definition, sleep, so it shouldn't
be making any assumptions about running atomically. If it needs to block
timeouts, the handler should use spl.



Re: timeout_set_proc(9)

2016-10-05 Thread Christiano F. Haesbaert
On Monday, 26 September 2016, David Gwynne wrote:

>
> > > On 26 Sep 2016, at 13:36, Ted Unangst wrote:
> >
> > David Gwynne wrote:
> >> +mtx_enter(&timeout_mutex);
> >> +while (!CIRCQ_EMPTY(&timeout_proc)) {
> >> +to = timeout_from_circq(CIRCQ_FIRST(&timeout_proc));
> >> +CIRCQ_REMOVE(&to->to_list);
> >   leave();
> >> +timeout_run(to);
> >   enter();
> >> +}
> >> +mtx_leave(&timeout_mutex);
> >
> > you need to drop the mutex when running the timeout. with those changes,
> looks
> > pretty good.
>
> timeout_run drops the mutex.



There is another bug: the thread runs outside of IPL_SOFTCLOCK. The
interrupt handler already runs at IPL_SOFTCLOCK, so it did not need to
raise it, but the thread does.

The mutex is not enough, as it is dropped before running the handler;
this can cause intr timeouts to interrupt proc timeouts.

I suggest raising it on thread startup and leaving it always up. That's
why my diff had added IPL support for tasks, btw.


Re: timeout_set_proc(9)

2016-09-25 Thread David Gwynne

> On 26 Sep 2016, at 13:36, Ted Unangst wrote:
> 
> David Gwynne wrote:
>> +mtx_enter(&timeout_mutex);
>> +while (!CIRCQ_EMPTY(&timeout_proc)) {
>> +to = timeout_from_circq(CIRCQ_FIRST(&timeout_proc));
>> +CIRCQ_REMOVE(&to->to_list);
>   leave();
>> +timeout_run(to);
>   enter();
>> +}
>> +mtx_leave(&timeout_mutex);
> 
> you need to drop the mutex when running the timeout. with those changes, looks
> pretty good.

timeout_run drops the mutex.


Re: timeout_set_proc(9)

2016-09-25 Thread Ted Unangst
David Gwynne wrote:
> + mtx_enter(&timeout_mutex);
> + while (!CIRCQ_EMPTY(&timeout_proc)) {
> + to = timeout_from_circq(CIRCQ_FIRST(&timeout_proc));
> + CIRCQ_REMOVE(&to->to_list);
leave();
> + timeout_run(to);
enter();
> + }
> + mtx_leave(&timeout_mutex);

you need to drop the mutex when running the timeout. with those changes, looks
pretty good.



Re: timeout_set_proc(9)

2016-09-25 Thread David Gwynne
On Sat, Sep 24, 2016 at 12:20:02PM +0200, Christiano F. Haesbaert wrote:
> On Saturday, 24 September 2016, David Gwynne wrote:
> 
> > On Fri, Sep 23, 2016 at 10:16:34PM -0700, Philip Guenther wrote:
> > > On Fri, 23 Sep 2016, Christiano F. Haesbaert wrote:
> > > ...
> > > > The diff as it is will deadlock against SCHED_LOCK.
> > > > tsleep() calls sleep_setup(), which grabs SCHED_LOCK,
> > > > Then sleep_setup_timeout() will grab timeout_mutex in timeout_add()
> > > >
> > > > On softclock() you have the opposite:
> > > > Grabs timeout_mutex then does a wakeup, which grabs SCHED_LOCK.
> >
> > nice.
> >
> > >
> > > Hmm, yes. And softclock_thread() has the same problem: msleep() needs to
> > > grab SCHED_LOCK while the passed in mutex is held, so it'll hold
> > > timeout_mutex while grabbing SCHED_LOCK too.
> > >
> > > I just played around with using a separate mutex for protecting
> > > timeout_proc, a mutex which would be 'outside' SCHED_LOCK, unlike
> > > timeout_mutex which is 'inside' SCHED_LOCK.  The catch is that supporting
> > > all the functionality of timeout_add() and timeout_del() becomes ugly and
> > > fragile: they would need to check whether the mutex has
> > > TIMEOUT_NEEDPROCCTX set and, if so, grab the new mutex before grabbing
> > > timeout_mutex.  That's "safe" from being a lock loop from tsleep() because
> > > the thread's p_sleep_to *can't* have that flag set, so the 'outside' mutex
> > > wouldn't be needed... but geez is this ugly.  I'm also unsure what defined
> > > semantics, if any, timeout_triggered() should have for NEEDPROCCTX
> > > timeouts.  Should it return true once the timeout has been dequeued from
> > > the timeout wheel, or should it only be set once softclock_thread is
> > > actually going to run it?
> > >
> > >
> > > ...or maybe this makes people think we should toss this out and go
> > > directly to dlg@'s proposal...
> >
> > let's not go crazy.
> >
> > i believe the deadlock can be fixed by moving the wakeup outside
> > the hold of timeout_mutex in timeout_add. you only need to wake up
> > the thread once even if you queued multiple timeouts, cos the thread
> > will loop until it completes all pending work. think of this as
> > interrupt mitigation.
> 
> 
> Yes, that was my solution as well, i don't have access to the code right
> now, but I think this won't fix the msleep case guenther pointed out.

yeah. that one took me a bit longer to understand.

the below includes my first diff, but also gets rid of the hold of
the timeout mutex while waiting for entries in the timeout_proc
list. it basically sleeps on CIRCQ_EMPTY outside the lock.
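For reference, the CIRCQ used throughout is a circular doubly-linked queue whose list head is linked into the ring, so "empty" means the head points at itself. The self-contained model below sketches those macros plus the drain loop the thread runs; the struct name `timeout_model` is made up here, and the real macros live in kern_timeout.c:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of the kernel's struct circq list head/node. */
struct circq {
	struct circq *next;
	struct circq *prev;
};

#define CIRCQ_INIT(e)	do { (e)->next = (e); (e)->prev = (e); } while (0)
#define CIRCQ_EMPTY(e)	((e)->next == (e))
#define CIRCQ_FIRST(e)	((e)->next)

/* Insert at the tail of the list headed by "head". */
#define CIRCQ_INSERT(e, head) do {		\
	(e)->prev = (head)->prev;		\
	(e)->next = (head);			\
	(head)->prev->next = (e);		\
	(head)->prev = (e);			\
} while (0)

#define CIRCQ_REMOVE(e) do {			\
	(e)->next->prev = (e)->prev;		\
	(e)->prev->next = (e)->next;		\
} while (0)

struct timeout_model {
	struct circq to_list;
	int id;
};

/* container-of, as timeout_from_circq() does in the kernel. */
static struct timeout_model *
timeout_from_circq(struct circq *p)
{
	return (struct timeout_model *)((char *)p -
	    offsetof(struct timeout_model, to_list));
}

/* Drain the queue the way softclock_thread() does; returns items seen. */
static int
drain(struct circq *head, int *ids, int max)
{
	int n = 0;

	while (!CIRCQ_EMPTY(head) && n < max) {
		struct timeout_model *to =
		    timeout_from_circq(CIRCQ_FIRST(head));
		CIRCQ_REMOVE(&to->to_list);
		ids[n++] = to->id;
	}
	return n;
}
```

Because insertion is at the tail and CIRCQ_FIRST takes the head's next pointer, the drain runs in FIFO order.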

Index: kern/kern_timeout.c
===
RCS file: /cvs/src/sys/kern/kern_timeout.c,v
retrieving revision 1.49
diff -u -p -r1.49 kern_timeout.c
--- kern/kern_timeout.c 22 Sep 2016 12:55:24 -  1.49
+++ kern/kern_timeout.c 26 Sep 2016 02:56:21 -
@@ -375,6 +375,7 @@ softclock(void *arg)
int delta;
struct circq *bucket;
struct timeout *to;
+   int needsproc = 0;
 
mtx_enter(&timeout_mutex);
while (!CIRCQ_EMPTY(&timeout_todo)) {
@@ -391,7 +392,7 @@ softclock(void *arg)
CIRCQ_INSERT(&to->to_list, bucket);
} else if (to->to_flags & TIMEOUT_NEEDPROCCTX) {
CIRCQ_INSERT(&to->to_list, &timeout_proc);
-   wakeup(&timeout_proc);
+   needsproc = 1;
} else {
 #ifdef DEBUG
if (delta < 0)
@@ -401,6 +402,9 @@ softclock(void *arg)
}
}
mtx_leave(&timeout_mutex);
 
+
+   if (needsproc)
+   wakeup(&timeout_proc);
 
 void
@@ -415,6 +419,7 @@ softclock_thread(void *arg)
 {
CPU_INFO_ITERATOR cii;
struct cpu_info *ci;
+   struct sleep_state sls;
struct timeout *to;
 
KERNEL_ASSERT_LOCKED();
@@ -427,16 +432,18 @@ softclock_thread(void *arg)
KASSERT(ci != NULL);
sched_peg_curproc(ci);
 
-   mtx_enter(&timeout_mutex);
for (;;) {
-   while (CIRCQ_EMPTY(&timeout_proc))
-   msleep(&timeout_proc, &timeout_mutex, PSWP, "bored", 0);
+   sleep_setup(&sls, &timeout_proc, PSWP, "bored");
+   sleep_finish(&sls, CIRCQ_EMPTY(&timeout_proc));
 
-   to = timeout_from_circq(CIRCQ_FIRST(&timeout_proc));
-   CIRCQ_REMOVE(&to->to_list);
-   timeout_run(to);
+   mtx_enter(&timeout_mutex);
+   while (!CIRCQ_EMPTY(&timeout_proc)) {
+   to = timeout_from_circq(CIRCQ_FIRST(&timeout_proc));
+   CIRCQ_REMOVE(&to->to_list);
+   timeout_run(to);
+   }
+   mtx_leave(&timeout_mutex);
}
-   mtx_leave(&timeout_mutex);
 }
 
 #ifndef SMALL_KERNEL



Re: timeout_set_proc(9)

2016-09-24 Thread Christiano F. Haesbaert
On Saturday, 24 September 2016, David Gwynne wrote:

> On Fri, Sep 23, 2016 at 10:16:34PM -0700, Philip Guenther wrote:
> > On Fri, 23 Sep 2016, Christiano F. Haesbaert wrote:
> > ...
> > > The diff as it is will deadlock against SCHED_LOCK.
> > > tsleep() calls sleep_setup(), which grabs SCHED_LOCK,
> > > Then sleep_setup_timeout() will grab timeout_mutex in timeout_add()
> > >
> > > On softclock() you have the opposite:
> > > Grabs timeout_mutex then does a wakeup, which grabs SCHED_LOCK.
>
> nice.
>
> >
> > Hmm, yes. And softclock_thread() has the same problem: msleep() needs to
> > grab SCHED_LOCK while the passed in mutex is held, so it'll hold
> > timeout_mutex while grabbing SCHED_LOCK too.
> >
> > I just played around with using a separate mutex for protecting
> > timeout_proc, a mutex which would be 'outside' SCHED_LOCK, unlike
> > timeout_mutex which is 'inside' SCHED_LOCK.  The catch is that supporting
> > all the functionality of timeout_add() and timeout_del() becomes ugly and
> > fragile: they would need to check whether the mutex has
> > TIMEOUT_NEEDPROCCTX set and, if so, grab the new mutex before grabbing
> > timeout_mutex.  That's "safe" from being a lock loop from tsleep() because
> > the thread's p_sleep_to *can't* have that flag set, so the 'outside' mutex
> > wouldn't be needed... but geez is this ugly.  I'm also unsure what defined
> > semantics, if any, timeout_triggered() should have for NEEDPROCCTX
> > timeouts.  Should it return true once the timeout has been dequeued from
> > the timeout wheel, or should it only be set once softclock_thread is
> > actually going to run it?
> >
> >
> > ...or maybe this makes people think we should toss this out and go
> > directly to dlg@'s proposal...
>
> let's not go crazy.
>
> i believe the deadlock can be fixed by moving the wakeup outside
> the hold of timeout_mutex in timeout_add. you only need to wake up
> the thread once even if you queued multiple timeouts, cos the thread
> will loop until it completes all pending work. think of this as
> interrupt mitigation.


Yes, that was my solution as well, i don't have access to the code right
now, but I think this won't fix the msleep case guenther pointed out.



>
> it is worth noting that sleep_setup_timeout doesnt timeout_add if
> the wait is 0, so the thread doesnt recurse.
>
> Index: kern/kern_timeout.c
> ===
> RCS file: /cvs/src/sys/kern/kern_timeout.c,v
> retrieving revision 1.49
> diff -u -p -r1.49 kern_timeout.c
> --- kern/kern_timeout.c 22 Sep 2016 12:55:24 -  1.49
> +++ kern/kern_timeout.c 24 Sep 2016 09:45:09 -
> @@ -375,6 +375,7 @@ softclock(void *arg)
> int delta;
> struct circq *bucket;
> struct timeout *to;
> +   int needsproc = 0;
>
> mtx_enter(&timeout_mutex);
> while (!CIRCQ_EMPTY(&timeout_todo)) {
> @@ -391,7 +392,7 @@ softclock(void *arg)
> CIRCQ_INSERT(&to->to_list, bucket);
> } else if (to->to_flags & TIMEOUT_NEEDPROCCTX) {
> CIRCQ_INSERT(&to->to_list, &timeout_proc);
> -   wakeup(&timeout_proc);
> +   needsproc = 1;
> } else {
>  #ifdef DEBUG
> if (delta < 0)
> @@ -401,6 +402,9 @@ softclock(void *arg)
> }
> }
> mtx_leave(&timeout_mutex);
> +
> +   if (needsproc)
> +   wakeup(&timeout_proc);
>  }
>
>  void
>


Re: timeout_set_proc(9)

2016-09-24 Thread David Gwynne
On Fri, Sep 23, 2016 at 10:16:34PM -0700, Philip Guenther wrote:
> On Fri, 23 Sep 2016, Christiano F. Haesbaert wrote:
> ...
> > The diff as it is will deadlock against SCHED_LOCK.
> > tsleep() calls sleep_setup(), which grabs SCHED_LOCK,
> > Then sleep_setup_timeout() will grab timeout_mutex in timeout_add()
> > 
> > On softclock() you have the opposite:
> > Grabs timeout_mutex then does a wakeup, which grabs SCHED_LOCK.

nice.

> 
> Hmm, yes. And softclock_thread() has the same problem: msleep() needs to 
> grab SCHED_LOCK while the passed in mutex is held, so it'll hold 
> timeout_mutex while grabbing SCHED_LOCK too.
> 
> I just played around with using a separate mutex for protecting 
> timeout_proc, a mutex which would be 'outside' SCHED_LOCK, unlike 
> timeout_mutex which is 'inside' SCHED_LOCK.  The catch is that supporting 
> all the functionality of timeout_add() and timeout_del() becomes ugly and 
> fragile: they would need to check whether the mutex has 
> TIMEOUT_NEEDPROCCTX set and, if so, grab the new mutex before grabbing 
> timeout_mutex.  That's "safe" from being a lock loop from tsleep() because 
> the thread's p_sleep_to *can't* have that flag set, so the 'outside' mutex 
> wouldn't be needed... but geez is this ugly.  I'm also unsure what defined 
> semantics, if any, timeout_triggered() should have for NEEDPROCCTX 
> timeouts.  Should it return true once the timeout has been dequeued from 
> the timeout wheel, or should it only be set once softclock_thread is 
> actually going to run it?
> 
> 
> ...or maybe this makes people think we should toss this out and go 
> directly to dlg@'s proposal...

let's not go crazy.

i believe the deadlock can be fixed by moving the wakeup outside
the hold of timeout_mutex in timeout_add. you only need to wake up
the thread once even if you queued multiple timeouts, cos the thread
will loop until it completes all pending work. think of this as
interrupt mitigation.

it is worth noting that sleep_setup_timeout doesnt timeout_add if
the wait is 0, so the thread doesnt recurse.
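The "wakeup outside the mutex, drain everything per wakeup" shape described above can be modelled in userland with a pthread condition variable. This is only an analogue — pthreads permits signalling while holding the mutex, unlike the SCHED_LOCK situation here, so the point of the sketch is the pattern and the interrupt-mitigation behaviour, not the deadlock itself. `run_demo()` and `NWORK` are invented names:

```c
#include <assert.h>
#include <pthread.h>

/*
 * Toy analogue of the proposed fix: queue work while holding the
 * mutex, set a flag, and only issue the wakeup after the mutex is
 * released.  The worker drains all pending work per wakeup, so one
 * signal can cover many queued items ("interrupt mitigation").
 */
#define NWORK	100

static pthread_mutex_t q_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_cv = PTHREAD_COND_INITIALIZER;
static int pending, done, finished;

static void *
worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&q_mtx);
	while (!finished || pending > 0) {
		while (pending == 0 && !finished)
			pthread_cond_wait(&q_cv, &q_mtx);
		/* drain everything queued so far on this one wakeup */
		while (pending > 0) {
			pending--;
			done++;
		}
	}
	pthread_mutex_unlock(&q_mtx);
	return NULL;
}

int
run_demo(void)
{
	pthread_t t;
	int i, need_wakeup;

	pthread_create(&t, NULL, worker, NULL);

	for (i = 0; i < NWORK; i++) {
		pthread_mutex_lock(&q_mtx);
		pending++;
		need_wakeup = 1;	/* kernel sets this conditionally */
		pthread_mutex_unlock(&q_mtx);
		if (need_wakeup)
			pthread_cond_signal(&q_cv);	/* outside the lock */
	}

	pthread_mutex_lock(&q_mtx);
	finished = 1;
	pthread_mutex_unlock(&q_mtx);
	pthread_cond_signal(&q_cv);

	pthread_join(t, NULL);
	return done;
}
```

No work is lost even though the signal races with the worker: the worker rechecks `pending` under the mutex before sleeping, so a wakeup issued after unlock can never be missed.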

Index: kern/kern_timeout.c
===
RCS file: /cvs/src/sys/kern/kern_timeout.c,v
retrieving revision 1.49
diff -u -p -r1.49 kern_timeout.c
--- kern/kern_timeout.c 22 Sep 2016 12:55:24 -  1.49
+++ kern/kern_timeout.c 24 Sep 2016 09:45:09 -
@@ -375,6 +375,7 @@ softclock(void *arg)
int delta;
struct circq *bucket;
struct timeout *to;
+   int needsproc = 0;
 
mtx_enter(&timeout_mutex);
while (!CIRCQ_EMPTY(&timeout_todo)) {
@@ -391,7 +392,7 @@ softclock(void *arg)
CIRCQ_INSERT(&to->to_list, bucket);
} else if (to->to_flags & TIMEOUT_NEEDPROCCTX) {
CIRCQ_INSERT(&to->to_list, &timeout_proc);
-   wakeup(&timeout_proc);
+   needsproc = 1;
} else {
 #ifdef DEBUG
if (delta < 0)
@@ -401,6 +402,9 @@ softclock(void *arg)
}
}
mtx_leave(&timeout_mutex);
+
+   if (needsproc)
+   wakeup(&timeout_proc);
 }
 
 void



Re: timeout_set_proc(9)

2016-09-23 Thread Philip Guenther
On Fri, 23 Sep 2016, Christiano F. Haesbaert wrote:
...
> The diff as it is will deadlock against SCHED_LOCK.
> tsleep() calls sleep_setup(), which grabs SCHED_LOCK,
> Then sleep_setup_timeout() will grab timeout_mutex in timeout_add()
> 
> On softclock() you have the opposite:
> Grabs timeout_mutex then does a wakeup, which grabs SCHED_LOCK.
> 

Hmm, yes. And softclock_thread() has the same problem: msleep() needs to 
grab SCHED_LOCK while the passed in mutex is held, so it'll hold 
timeout_mutex while grabbing SCHED_LOCK too.

I just played around with using a separate mutex for protecting 
timeout_proc, a mutex which would be 'outside' SCHED_LOCK, unlike 
timeout_mutex which is 'inside' SCHED_LOCK.  The catch is that supporting 
all the functionality of timeout_add() and timeout_del() becomes ugly and 
fragile: they would need to check whether the mutex has 
TIMEOUT_NEEDPROCCTX set and, if so, grab the new mutex before grabbing 
timeout_mutex.  That's "safe" from being a lock loop from tsleep() because 
the thread's p_sleep_to *can't* have that flag set, so the 'outside' mutex 
wouldn't be needed... but geez is this ugly.  I'm also unsure what defined 
semantics, if any, timeout_triggered() should have for NEEDPROCCTX 
timeouts.  Should it return true once the timeout has been dequeued from 
the timeout wheel, or should it only be set once softclock_thread is 
actually going to run it?


...or maybe this makes people think we should toss this out and go 
directly to dlg@'s proposal...


Philip Guenther
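A cheap way to see the inversion described above is a toy "witness"-style order checker: give each lock a rank and flag any acquisition that does not increase the highest rank currently held. The sketch below is invented for illustration — the names are not kernel code, and the ranks (timeout_proc_mutex = 1 'outside' SCHED_LOCK = 2, timeout_mutex = 3 'inside') are hypothetical:

```c
#include <assert.h>

/*
 * Toy lock-order checker: locks are identified by a rank, and taking
 * a lock whose rank is not above the highest currently-held rank is
 * recorded as a violation (a potential deadlock cycle).
 */
#define MAXHELD	8

static int held[MAXHELD];
static int nheld;
static int violations;

static void
lk_acquire(int rank)
{
	if (nheld > 0 && held[nheld - 1] >= rank)
		violations++;		/* out-of-order acquisition */
	assert(nheld < MAXHELD);
	held[nheld++] = rank;
}

static void
lk_release(int rank)
{
	assert(nheld > 0 && held[nheld - 1] == rank);
	nheld--;
}
```

Simulating the tsleep() path (SCHED_LOCK rank 2, then timeout_mutex rank 3) is clean, while simulating softclock()'s timeout_mutex-then-wakeup() path trips the checker — exactly the AB/BA cycle being discussed.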

Index: kern_timeout.c
===
RCS file: /data/src/openbsd/src/sys/kern/kern_timeout.c,v
retrieving revision 1.49
diff -u -p -r1.49 kern_timeout.c
--- kern_timeout.c  22 Sep 2016 12:55:24 -  1.49
+++ kern_timeout.c  24 Sep 2016 05:10:00 -
@@ -87,9 +87,15 @@ timeout_from_circq(struct circq *p)
  * All wheels are locked with the same mutex.
  *
  * We need locking since the timeouts are manipulated from hardclock that's
- * not behind the big lock.
+ * not behind the big lock.  timeout_mutex protects the actual timeout
+ * wheel and queue of new timeouts.  timeout_proc_mutex proctects the
+ * queue of "need proc context" timeouts that have triggered.  Since those
+ * timeouts can reside in either location and are moved between them by
+ * softclock(), calls that might run when it's uncertain where the timeout
+ * is queued must hold *both* mutexes when queueing or dequeueing them.
  */
 struct mutex timeout_mutex = MUTEX_INITIALIZER(IPL_HIGH);
+struct mutex timeout_proc_mutex = MUTEX_INITIALIZER(IPL_NONE);
 
 /*
  * Circular queue definitions.
@@ -190,6 +196,8 @@ timeout_add(struct timeout *new, int to_
panic("timeout_add: to_ticks (%d) < 0", to_ticks);
 #endif
 
+   if (new->to_flags & TIMEOUT_NEEDPROCCTX)
+   mtx_enter(&timeout_proc_mutex);
mtx_enter(&timeout_mutex);
/* Initialize the time here, it won't change. */
old_time = new->to_time;
@@ -212,6 +220,8 @@ timeout_add(struct timeout *new, int to_
CIRCQ_INSERT(&new->to_list, &timeout_todo);
}
mtx_leave(&timeout_mutex);
+   if (new->to_flags & TIMEOUT_NEEDPROCCTX)
+   mtx_leave(&timeout_proc_mutex);
 
return (ret);
 }
@@ -312,6 +322,8 @@ timeout_del(struct timeout *to)
 {
int ret = 0;
 
+   if (to->to_flags & TIMEOUT_NEEDPROCCTX)
+   mtx_enter(&timeout_proc_mutex);
mtx_enter(&timeout_mutex);
if (to->to_flags & TIMEOUT_ONQUEUE) {
CIRCQ_REMOVE(&to->to_list);
@@ -320,6 +332,8 @@ timeout_del(struct timeout *to)
}
to->to_flags &= ~TIMEOUT_TRIGGERED;
mtx_leave(&timeout_mutex);
+   if (to->to_flags & TIMEOUT_NEEDPROCCTX)
+   mtx_leave(&timeout_proc_mutex);
 
return (ret);
 }
@@ -351,56 +365,66 @@ timeout_hardclock_update(void)
 }
 
 void
-timeout_run(struct timeout *to)
+timeout_run(struct timeout *to, struct mutex *mtx)
 {
void (*fn)(void *);
void *arg;
 
-   MUTEX_ASSERT_LOCKED(&timeout_mutex);
+   MUTEX_ASSERT_LOCKED(mtx);
 
to->to_flags &= ~TIMEOUT_ONQUEUE;
-   to->to_flags |= TIMEOUT_TRIGGERED;
 
fn = to->to_func;
arg = to->to_arg;
 
-   mtx_leave(&timeout_mutex);
+   mtx_leave(mtx);
fn(arg);
-   mtx_enter(&timeout_mutex);
+   mtx_enter(mtx);
 }
 
 void
 softclock(void *arg)
 {
+   struct circq need_proc;
int delta;
struct circq *bucket;
struct timeout *to;
+   int need_wakeup;
 
+   CIRCQ_INIT(&need_proc);
mtx_enter(&timeout_mutex);
while (!CIRCQ_EMPTY(&timeout_todo)) {
to = timeout_from_circq(CIRCQ_FIRST(&timeout_todo));
CIRCQ_REMOVE(&to->to_list);
 
-   /*
-* If due run it or defer execution to the thread,
-* otherwise insert it into the right bucket.
-*/
+   /* if not yet due, insert it into the right bucket */
delta = to->to_time - ticks;
  

Re: timeout_set_proc(9)

2016-09-23 Thread Christiano F. Haesbaert
On Wednesday, 21 September 2016, Martin Pieuchot wrote:

> On 21/09/16(Wed) 16:29, David Gwynne wrote:
> > [...]
> > the point i was trying to make was that the existing stuff (tasks,
> timeouts) can be used together to get the effect we want. my point was very
> poorly made though.
> >
> > i think your point is that you can make a clever change to timeouts and
> not have to do a ton of flow on code changes to take advantage of it.
>
> I'm trying to fix one problem at a time.
>
> > [...]
> > if timeouts are the way to schedule stuff to run in the future, then
> we're going to get head-of-line blocking problems. pending timeouts will
> end up stuck behind work that is waiting on an arbitrary lock, because
> there's an implicit single thread that will run them all. right now that is
> mitigated by timeouts being an interrupt context, we just dont put a lot of
> work like that in there right now.
>
> Really?  Is it worse than it is now with the KERNEL_LOCK()?
>
> > the benefit of taskqs is that you explicitly steer work to threads that
> can sleep independently of each other. they lack being able to schedule
> work to run in the future though.
> >
> > it turns out it isnt that hard to make taskqs use a priority queue
> internally instead of a tailq. this allows you to specify that tasks get
> executed in the future (or right now, like the current behaviour) in an
> explicit thread (or pool of threads). it does mean a lot of changes to code
> thats using timeouts now though.
>
> I agree with you, but these thoughts are IMHO too far ahead.  Everything
> is still serialized in our kernel.
>
>
The diff as it is will deadlock against SCHED_LOCK.
tsleep() calls sleep_setup(), which grabs SCHED_LOCK,
Then sleep_setup_timeout() will grab timeout_mutex in timeout_add()

On softclock() you have the opposite:
Grabs timeout_mutex then does a wakeup, which grabs SCHED_LOCK.


Re: timeout_set_proc(9)

2016-09-21 Thread Martin Pieuchot
On 21/09/16(Wed) 16:29, David Gwynne wrote:
> [...] 
> the point i was trying to make was that the existing stuff (tasks, timeouts) 
> can be used together to get the effect we want. my point was very poorly made 
> though.
>
> i think your point is that you can make a clever change to timeouts and not 
> have to do a ton of flow on code changes to take advantage of it.

I'm trying to fix one problem at a time.

> [...]
> if timeouts are the way to schedule stuff to run in the future, then we're 
> going to get head-of-line blocking problems. pending timeouts will end up 
> stuck behind work that is waiting on an arbitrary lock, because there's an 
> implicit single thread that will run them all. right now that is mitigated by 
> timeouts being an interrupt context, we just dont put a lot of work like that 
> in there right now. 

Really?  Is it worse than it is now with the KERNEL_LOCK()?

> the benefit of taskqs is that you explicitly steer work to threads that can 
> sleep independently of each other. they lack being able to schedule work to 
> run in the future though.
> 
> it turns out it isnt that hard to make taskqs use a priority queue internally 
> instead of a tailq. this allows you to specify that tasks get executed in the 
> future (or right now, like the current behaviour) in an explicit thread (or 
> pool of threads). it does mean a lot of changes to code thats using timeouts 
> now though.

I agree with you, but these thoughts are IMHO too far ahead.  Everything
is still serialized in our kernel.



Re: timeout_set_proc(9)

2016-09-21 Thread David Gwynne

> On 19 Sep 2016, at 20:28, Mike Belopuhov wrote:
> 
> On 19 September 2016 at 11:06, Martin Pieuchot wrote:
>> On 19/09/16(Mon) 13:47, David Gwynne wrote:
>>> [...]
>>> id rather not grow the timeout (or task for that matter) apis just
>>> yet. theyre nice and straightforward to read and understand so far.
>> 
>> So better introduce a new API, right?

i guess.

the point i was trying to make was that the existing stuff (tasks, timeouts) 
can be used together to get the effect we want. my point was very poorly made 
though.

i think your point is that you can make a clever change to timeouts and not 
have to do a ton of flow on code changes to take advantage of it.

>> 
>>> for this specific problem can we do a specific fix for it? the diff
>>> below is equiv to the timeout_set_proc change, but implements it
>>> by using explicit tasks that are called by the timeouts from a
>>> common trampoline in the network stack.
>> 
>> Is it really a specific problem?  We already encounter this for the
>> linkstate and the watchdog.
>> 
>> I'm not convinced by this approach.  I don't understand why:
>>  - adding a task in every data structure is worth it
>>  - introducing a new if_nettmo_* make things simpler
>> 
>> So there's something which isn't explain in this email.
>> 
>> And I'll bet that in the upcoming years we're going to stop using soft
>> interrupts.  Meaning that timeout handlers will always have a stack
>> available.  If/when this happens, it will be easier to do:
>> 
>>s/timeout_set_proc/timeout_set/
>> 
> 
> FWIW, I agree with mpi on this.  As a temporary change his
> approach is sound.

if its temporary, then fine.

> I would also like to know what prevents us from executing all
> timeouts in a thread context now? serial drivers with softtty
> line discipline entanglement?

theres some stuff in wi(4) and ipsec that would need to be fixed first. apart 
from that i have concerns a about having everything go through a single 
execution context.

if timeouts are the way to schedule stuff to run in the future, then we're 
going to get head-of-line blocking problems. pending timeouts will end up stuck 
behind work that is waiting on an arbitrary lock, because there's an implicit 
single thread that will run them all. right now that is mitigated by timeouts 
being an interrupt context, we just dont put a lot of work like that in there 
right now. 

the benefit of taskqs is that you explicitly steer work to threads that can 
sleep independently of each other. they lack being able to schedule work to run 
in the future though.

it turns out it isnt that hard to make taskqs use a priority queue internally 
instead of a tailq. this allows you to specify that tasks get executed in the 
future (or right now, like the current behaviour) in an explicit thread (or 
pool of threads). it does mean a lot of changes to code thats using timeouts 
now though.

anyway. if this is temporary, then fine.

dlg
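The priority-queue idea in the message above — taskqs keyed on when a task should run instead of a plain tailq — can be sketched as a tiny timed task queue: a binary min-heap on the due tick, drained by whatever thread owns the queue. Everything here (`ttask_add`, `ttask_run_due`, the logical tick in place of real time) is invented for illustration:

```c
#include <assert.h>

/*
 * Toy timed-task queue: a binary min-heap keyed on the tick a task
 * should run at.  A logical clock is used instead of real time so
 * the sketch stays deterministic.
 */
#define MAXTASKS	32

struct ttask {
	int when;		/* tick to run at */
	void (*fn)(void *);
	void *arg;
};

static struct ttask heap[MAXTASKS];
static int nheap;

static void
ttask_add(int when, void (*fn)(void *), void *arg)
{
	int i = nheap++;

	heap[i] = (struct ttask){ when, fn, arg };
	/* sift up until the parent is due no later than us */
	while (i > 0 && heap[(i - 1) / 2].when > heap[i].when) {
		struct ttask tmp = heap[i];
		heap[i] = heap[(i - 1) / 2];
		heap[(i - 1) / 2] = tmp;
		i = (i - 1) / 2;
	}
}

/* Run every task due at or before "now"; returns how many ran. */
static int
ttask_run_due(int now)
{
	int ran = 0;

	while (nheap > 0 && heap[0].when <= now) {
		struct ttask t = heap[0];
		int i = 0;

		/* pop: move last element to the root and sift down */
		heap[0] = heap[--nheap];
		for (;;) {
			int l = 2 * i + 1, r = 2 * i + 2, m = i;
			if (l < nheap && heap[l].when < heap[m].when) m = l;
			if (r < nheap && heap[r].when < heap[m].when) m = r;
			if (m == i)
				break;
			struct ttask tmp = heap[i];
			heap[i] = heap[m]; heap[m] = tmp;
			i = m;
		}
		t.fn(t.arg);
		ran++;
	}
	return ran;
}

/* helpers for the usage below */
static int order[8], norder;
static void record(void *arg) { order[norder++] = *(int *)arg; }
```

A worker thread would then sleep until the root's due tick and call `ttask_run_due()`, giving "run now" and "run later" a single queue, which is the unification being proposed.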


Re: timeout_set_proc(9)

2016-09-20 Thread Martin Pieuchot
On 15/09/16(Thu) 16:29, Martin Pieuchot wrote:
> After discussing with a few people about a new "timed task" API I came
> to the conclusion that mixing timeouts and tasks will result in:
> 
>   - always including a 'struct timeout' in a 'struct task', or the other
> the way around
> or
>   
>   - introducing a new data structure, hence API.
> 
> Since I'd like to keep the change as small as possible when converting
> existing timeout_set(9), neither option seem a good fit.  So I decided
> to add a new kernel thread, curiously named "softclock", that will
> offer his stack to the poor timeout handlers that need one. 
> 
> With this approach, converting a timeout is just a matter of doing:
> 
>   s/timeout_set/timeout_set_proc/
> 
> 
> Diff below includes the conversions I need for the "netlock".  I'm
> waiting for feedbacks and a better name to document the new function.

Updated diff using  CPU_INFO_FOREACH() as requested by kettenis@ and
including manpage bits.

Should I just move forward?  Objections?  Oks?


Index: sys/kern/init_main.c
===
RCS file: /cvs/src/sys/kern/init_main.c,v
retrieving revision 1.258
diff -u -p -r1.258 init_main.c
--- sys/kern/init_main.c18 Sep 2016 12:36:28 -  1.258
+++ sys/kern/init_main.c19 Sep 2016 09:18:49 -
@@ -144,6 +144,7 @@ voidprof_init(void);
 void   init_exec(void);
 void   kqueue_init(void);
 void   taskq_init(void);
+void   timeout_proc_init(void);
 void   pool_gc_pages(void *);
 
 extern char sigcode[], esigcode[], sigcoderet[];
@@ -335,6 +336,9 @@ main(void *framep)
sleep_queue_init();
sched_init_cpu(curcpu());
p->p_cpu->ci_randseed = (arc4random() & 0x7fff) + 1;
+
+   /* Initialize timeouts in process context. */
+   timeout_proc_init();
 
/* Initialize task queues */
taskq_init();
Index: sys/kern/kern_timeout.c
===
RCS file: /cvs/src/sys/kern/kern_timeout.c,v
retrieving revision 1.48
diff -u -p -r1.48 kern_timeout.c
--- sys/kern/kern_timeout.c 6 Jul 2016 15:53:01 -   1.48
+++ sys/kern/kern_timeout.c 20 Sep 2016 08:37:48 -
@@ -27,7 +27,7 @@
 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -54,6 +54,7 @@
 
 struct circq timeout_wheel[BUCKETS];   /* Queues of timeouts */
 struct circq timeout_todo; /* Worklist */
+struct circq timeout_proc; /* Due timeouts needing proc. context */
 
 #define MASKWHEEL(wheel, time) (((time) >> ((wheel)*WHEELBITS)) & WHEELMASK)
 
@@ -127,6 +128,9 @@ struct mutex timeout_mutex = MUTEX_INITI
 
 #define CIRCQ_EMPTY(elem) (CIRCQ_FIRST(elem) == (elem))
 
+void softclock_thread(void *);
+void softclock_create_thread(void *);
+
 /*
  * Some of the "math" in here is a bit tricky.
  *
@@ -147,11 +151,18 @@ timeout_startup(void)
int b;
 
CIRCQ_INIT(&timeout_todo);
+   CIRCQ_INIT(&timeout_proc);
for (b = 0; b < nitems(timeout_wheel); b++)
CIRCQ_INIT(&timeout_wheel[b]);
 }
 
 void
+timeout_proc_init(void)
+{
+   kthread_create_deferred(softclock_create_thread, NULL);
+}
+
+void
 timeout_set(struct timeout *new, void (*fn)(void *), void *arg)
 {
new->to_func = fn;
@@ -159,6 +170,12 @@ timeout_set(struct timeout *new, void (*
new->to_flags = TIMEOUT_INITIALIZED;
 }
 
+void
+timeout_set_proc(struct timeout *new, void (*fn)(void *), void *arg)
+{
+   timeout_set(new, fn, arg);
+   new->to_flags |= TIMEOUT_NEEDPROCCTX;
+}
 
 int
 timeout_add(struct timeout *new, int to_ticks)
@@ -334,38 +351,90 @@ timeout_hardclock_update(void)
 }
 
 void
+timeout_run(struct timeout *to)
+{
+   void (*fn)(void *);
+   void *arg;
+
+   MUTEX_ASSERT_LOCKED(&timeout_mutex);
+
+   to->to_flags &= ~TIMEOUT_ONQUEUE;
+   to->to_flags |= TIMEOUT_TRIGGERED;
+
+   fn = to->to_func;
+   arg = to->to_arg;
+
+   mtx_leave(&timeout_mutex);
+   fn(arg);
+   mtx_enter(&timeout_mutex);
+}
+
+void
 softclock(void *arg)
 {
int delta;
struct circq *bucket;
struct timeout *to;
-   void (*fn)(void *);
 
mtx_enter(&timeout_mutex);
while (!CIRCQ_EMPTY(&timeout_todo)) {
to = timeout_from_circq(CIRCQ_FIRST(&timeout_todo));
CIRCQ_REMOVE(&to->to_list);
 
-   /* If due run it, otherwise insert it into the right bucket. */
+   /*
+* If due run it or defer execution to the thread,
+* otherwise insert it into the right bucket.
+*/
delta = to->to_time - ticks;
if (delta > 0) {
bucket = &BUCKET(delta, to->to_time);
CIRCQ_INSERT(&to->to_list, bucket);
+   } else if (to->to_flags & TIMEOUT_NEEDPROCCTX) {
+   CIRCQ_INSERT(&to->to_list, &timeout_proc);
+   wakeup(&timeout_proc);
} else {
 #ifdef DEBUG
 

Re: timeout_set_proc(9)

2016-09-19 Thread Mark Kettenis
> Date: Mon, 19 Sep 2016 13:47:25 +1000
> From: David Gwynne 
> 
> On Fri, Sep 16, 2016 at 04:58:39PM +0200, Mark Kettenis wrote:
> > > Date: Thu, 15 Sep 2016 16:29:45 +0200
> > > From: Martin Pieuchot 
> > > 
> > > After discussing with a few people about a new "timed task" API I came
> > > to the conclusion that mixing timeouts and tasks will result in:
> > > 
> > >   - always including a 'struct timeout' in a 'struct task', or the other
> > > the way around
> > > or
> > >   
> > >   - introducing a new data structure, hence API.
> > > 
> > > Since I'd like to keep the change as small as possible when converting
> > > existing timeout_set(9), neither option seem a good fit.  So I decided
> > > to add a new kernel thread, curiously named "softclock", that will
> > > offer his stack to the poor timeout handlers that need one. 
> > > 
> > > With this approach, converting a timeout is just a matter of doing:
> > > 
> > >   s/timeout_set/timeout_set_proc/
> > > 
> > > 
> > > Diff below includes the conversions I need for the "netlock".  I'm
> > > waiting for feedbacks and a better name to document the new function.
> > > 
> > > Comments?
> > 
> > I like how minimal this is.  Would like to see a few more people that
> > are familliar with the timeout code chime in, but it looks mostly
> > correct to me as well.  One question though:
> 
> id rather not grow the timeout (or task for that matter) apis just
> yet. theyre nice and straightforward to read and understand so far.
> 
> for this specific problem can we do a specific fix for it? the diff
> below is equiv to the timeout_set_proc change, but implements it
> by using explicit tasks that are called by the timeouts from a
> common trampoline in the network stack.

But this is much more involved than mpi@'s change.

So unless somebody shoots a hole into his approach, I'd prefer the
timeout_set_proc() approach.

> Index: net/if.c
> ===
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.448
> diff -u -p -r1.448 if.c
> --- net/if.c  13 Sep 2016 08:15:01 -  1.448
> +++ net/if.c  19 Sep 2016 01:51:37 -
> @@ -155,6 +155,8 @@ void  if_watchdog_task(void *);
>  void if_input_process(void *);
>  void if_netisr(void *);
>  
> +void if_nettmo_add(void *);
> +
>  #ifdef DDB
>  void ifa_print_all(void);
>  #endif
> @@ -875,6 +877,21 @@ if_netisr(void *unused)
>  
>   splx(s);
>   KERNEL_UNLOCK();
> +}
> +
> +void
> +if_nettmo_add(void *t)
> +{
> + /* the task is added to systq so it inherits the KERNEL_LOCK */
> + task_add(systq, t);
> +}
> +
> +void
> +if_nettmo_set(struct timeout *tmo, struct task *task,
> +void (*fn)(void *), void *arg)
> +{
> + task_set(task, fn, arg);
> + timeout_set(tmo, if_nettmo_add, task);
>  }
>  
>  void
> Index: net/if_pflow.c
> ===
> RCS file: /cvs/src/sys/net/if_pflow.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 if_pflow.c
> --- net/if_pflow.c29 Apr 2016 08:55:03 -  1.61
> +++ net/if_pflow.c19 Sep 2016 01:51:37 -
> @@ -36,6 +36,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -547,16 +548,24 @@ pflow_init_timeouts(struct pflow_softc *
>   timeout_del(&sc->sc_tmo6);
>   if (timeout_initialized(&sc->sc_tmo_tmpl))
>   timeout_del(&sc->sc_tmo_tmpl);
> - if (!timeout_initialized(&sc->sc_tmo))
> - timeout_set(&sc->sc_tmo, pflow_timeout, sc);
> + if (!timeout_initialized(&sc->sc_tmo)) {
> + if_nettmo_set(&sc->sc_tmo, &sc->sc_task,
> + pflow_timeout, sc);
> + }
>   break;
>   case PFLOW_PROTO_10:
> - if (!timeout_initialized(&sc->sc_tmo_tmpl))
> - timeout_set(&sc->sc_tmo_tmpl, pflow_timeout_tmpl, sc);
> - if (!timeout_initialized(&sc->sc_tmo))
> - timeout_set(&sc->sc_tmo, pflow_timeout, sc);
> - if (!timeout_initialized(&sc->sc_tmo6))
> - timeout_set(&sc->sc_tmo6, pflow_timeout6, sc);
> + if (!timeout_initialized(&sc->sc_tmo_tmpl)) {
> + if_nettmo_set(&sc->sc_tmo_tmpl, &sc->sc_task_tmpl,
> + pflow_timeout_tmpl, sc);
> + }
> + if (!timeout_initialized(&sc->sc_tmo)) {
> + if_nettmo_set(&sc->sc_tmo, &sc->sc_task,
> + pflow_timeout, sc);
> + }
> + if (!timeout_initialized(&sc->sc_tmo6)) {
> + if_nettmo_set(&sc->sc_tmo6, &sc->sc_task6,
> + pflow_timeout6, sc);
> + }
>  
>   timeout_add_sec(&sc->sc_tmo_tmpl, PFLOW_TMPL_TIMEOUT);
>   break;
> Index: net/if_pflow.h
> ===
> RCS file: /cvs/src/sys/net/if_pflow.h,v

Re: timeout_set_proc(9)

2016-09-19 Thread Alexander Bluhm
On Thu, Sep 15, 2016 at 04:29:45PM +0200, Martin Pieuchot wrote:
> After discussing with a few people about a new "timed task" API I came
> to the conclusion that mixing timeouts and tasks will result in:
> 
>   - always including a 'struct timeout' in a 'struct task', or the other
> the way around
> or
>   
>   - introducing a new data structure, hence API.
> 
> Since I'd like to keep the change as small as possible when converting
> existing timeout_set(9), neither option seems a good fit.  So I decided
> to add a new kernel thread, curiously named "softclock", that will
> offer its stack to the poor timeout handlers that need one.
> 
> With this approach, converting a timeout is just a matter of doing:
> 
>   s/timeout_set/timeout_set_proc/
> 
> 
> Diff below includes the conversions I need for the "netlock".  I'm
> waiting for feedback and a better name to document the new function.
> 
> Comments?

I like this API, especially as we can switch easily between thread
and soft interrupt to try things.

OK bluhm@

> 
> Index: net/if_pflow.c
> ===
> RCS file: /cvs/src/sys/net/if_pflow.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 if_pflow.c
> --- net/if_pflow.c29 Apr 2016 08:55:03 -  1.61
> +++ net/if_pflow.c15 Sep 2016 14:19:10 -
> @@ -548,15 +548,16 @@ pflow_init_timeouts(struct pflow_softc *
>   if (timeout_initialized(&sc->sc_tmo_tmpl))
>   timeout_del(&sc->sc_tmo_tmpl);
>   if (!timeout_initialized(&sc->sc_tmo))
> - timeout_set(&sc->sc_tmo, pflow_timeout, sc);
> + timeout_set_proc(&sc->sc_tmo, pflow_timeout, sc);
>   break;
>   case PFLOW_PROTO_10:
>   if (!timeout_initialized(&sc->sc_tmo_tmpl))
> - timeout_set(&sc->sc_tmo_tmpl, pflow_timeout_tmpl, sc);
> + timeout_set_proc(&sc->sc_tmo_tmpl, pflow_timeout_tmpl,
> + sc);
>   if (!timeout_initialized(&sc->sc_tmo))
> - timeout_set(&sc->sc_tmo, pflow_timeout, sc);
> + timeout_set_proc(&sc->sc_tmo, pflow_timeout, sc);
>   if (!timeout_initialized(&sc->sc_tmo6))
> - timeout_set(&sc->sc_tmo6, pflow_timeout6, sc);
> + timeout_set_proc(&sc->sc_tmo6, pflow_timeout6, sc);
>  
>   timeout_add_sec(&sc->sc_tmo_tmpl, PFLOW_TMPL_TIMEOUT);
>   break;
> Index: net/if_pfsync.c
> ===
> RCS file: /cvs/src/sys/net/if_pfsync.c,v
> retrieving revision 1.231
> diff -u -p -r1.231 if_pfsync.c
> --- net/if_pfsync.c   15 Sep 2016 02:00:18 -  1.231
> +++ net/if_pfsync.c   15 Sep 2016 14:19:10 -
> @@ -328,9 +328,9 @@ pfsync_clone_create(struct if_clone *ifc
>   IFQ_SET_MAXLEN(&ifp->if_snd, IFQ_MAXLEN);
>   ifp->if_hdrlen = sizeof(struct pfsync_header);
>   ifp->if_mtu = ETHERMTU;
> - timeout_set(&sc->sc_tmo, pfsync_timeout, sc);
> - timeout_set(&sc->sc_bulk_tmo, pfsync_bulk_update, sc);
> - timeout_set(&sc->sc_bulkfail_tmo, pfsync_bulk_fail, sc);
> + timeout_set_proc(&sc->sc_tmo, pfsync_timeout, sc);
> + timeout_set_proc(&sc->sc_bulk_tmo, pfsync_bulk_update, sc);
> + timeout_set_proc(&sc->sc_bulkfail_tmo, pfsync_bulk_fail, sc);
>  
>   if_attach(ifp);
>   if_alloc_sadl(ifp);
> @@ -1723,7 +1723,7 @@ pfsync_defer(struct pf_state *st, struct
>   sc->sc_deferred++;
>   TAILQ_INSERT_TAIL(&sc->sc_deferrals, pd, pd_entry);
>  
> - timeout_set(&pd->pd_tmo, pfsync_defer_tmo, pd);
> + timeout_set_proc(&pd->pd_tmo, pfsync_defer_tmo, pd);
>   timeout_add_msec(&pd->pd_tmo, 20);
>  
>   schednetisr(NETISR_PFSYNC);
> Index: netinet/ip_carp.c
> ===
> RCS file: /cvs/src/sys/netinet/ip_carp.c,v
> retrieving revision 1.293
> diff -u -p -r1.293 ip_carp.c
> --- netinet/ip_carp.c 25 Jul 2016 16:44:04 -  1.293
> +++ netinet/ip_carp.c 15 Sep 2016 14:19:11 -
> @@ -831,9 +831,9 @@ carp_new_vhost(struct carp_softc *sc, in
>   vhe->vhid = vhid;
>   vhe->advskew = advskew;
>   vhe->state = INIT;
> - timeout_set(&vhe->ad_tmo, carp_send_ad, vhe);
> - timeout_set(&vhe->md_tmo, carp_master_down, vhe);
> - timeout_set(&vhe->md6_tmo, carp_master_down, vhe);
> + timeout_set_proc(&vhe->ad_tmo, carp_send_ad, vhe);
> + timeout_set_proc(&vhe->md_tmo, carp_master_down, vhe);
> + timeout_set_proc(&vhe->md6_tmo, carp_master_down, vhe);
>  
>   KERNEL_ASSERT_LOCKED(); /* touching carp_vhosts */
>  
> Index: netinet/tcp_timer.h
> ===
> RCS file: /cvs/src/sys/netinet/tcp_timer.h,v
> retrieving revision 1.13
> diff -u -p -r1.13 tcp_timer.h
> --- netinet/tcp_timer.h   6 Jul 2011 23:44:20 -   1.13
> +++ netinet/tcp_timer.h   15 Sep 2016 14:19:11 -
> @@ -116,7 +116,7 @@ const char *tcptimers[] =
>   * Init, arm, disarm, and test TCP 

Re: timeout_set_proc(9)

2016-09-19 Thread Martin Pieuchot
On 19/09/16(Mon) 13:47, David Gwynne wrote:
> [...] 
> id rather not grow the timeout (or task for that matter) apis just
> yet. theyre nice and straightforward to read and understand so far.

So better introduce a new API, right?

> for this specific problem can we do a specific fix for it? the diff
> below is equiv to the timeout_set_proc change, but implements it
> by using explicit tasks that are called by the timeouts from a
> common trampoline in the network stack.

Is it really a specific problem?  We already encounter this for the
linkstate and the watchdog. 

I'm not convinced by this approach.  I don't understand why:
  - adding a task in every data structure is worth it
  - introducing a new if_nettmo_* makes things simpler

So there's something which isn't explained in this email.

And I'll bet that in the upcoming years we're going to stop using soft
interrupts.  Meaning that timeout handlers will always have a stack 
available.  If/when this happens, it will be easier to do:

s/timeout_set_proc/timeout_set/

> Index: net/if.c
> ===
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.448
> diff -u -p -r1.448 if.c
> --- net/if.c  13 Sep 2016 08:15:01 -  1.448
> +++ net/if.c  19 Sep 2016 01:51:37 -
> @@ -155,6 +155,8 @@ void  if_watchdog_task(void *);
>  void if_input_process(void *);
>  void if_netisr(void *);
>  
> +void if_nettmo_add(void *);
> +
>  #ifdef DDB
>  void ifa_print_all(void);
>  #endif
> @@ -875,6 +877,21 @@ if_netisr(void *unused)
>  
>   splx(s);
>   KERNEL_UNLOCK();
> +}
> +
> +void
> +if_nettmo_add(void *t)
> +{
> + /* the task is added to systq so it inherits the KERNEL_LOCK */
> + task_add(systq, t);
> +}
> +
> +void
> +if_nettmo_set(struct timeout *tmo, struct task *task,
> +void (*fn)(void *), void *arg)
> +{
> + task_set(task, fn, arg);
> + timeout_set(tmo, if_nettmo_add, task);
>  }
>  
>  void
> Index: net/if_pflow.c
> ===
> RCS file: /cvs/src/sys/net/if_pflow.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 if_pflow.c
> --- net/if_pflow.c29 Apr 2016 08:55:03 -  1.61
> +++ net/if_pflow.c19 Sep 2016 01:51:37 -
> @@ -36,6 +36,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -547,16 +548,24 @@ pflow_init_timeouts(struct pflow_softc *
>   timeout_del(&sc->sc_tmo6);
>   if (timeout_initialized(&sc->sc_tmo_tmpl))
>   timeout_del(&sc->sc_tmo_tmpl);
> - if (!timeout_initialized(&sc->sc_tmo))
> - timeout_set(&sc->sc_tmo, pflow_timeout, sc);
> + if (!timeout_initialized(&sc->sc_tmo)) {
> + if_nettmo_set(&sc->sc_tmo, &sc->sc_task,
> + pflow_timeout, sc);
> + }
>   break;
>   case PFLOW_PROTO_10:
> - if (!timeout_initialized(&sc->sc_tmo_tmpl))
> - timeout_set(&sc->sc_tmo_tmpl, pflow_timeout_tmpl, sc);
> - if (!timeout_initialized(&sc->sc_tmo))
> - timeout_set(&sc->sc_tmo, pflow_timeout, sc);
> - if (!timeout_initialized(&sc->sc_tmo6))
> - timeout_set(&sc->sc_tmo6, pflow_timeout6, sc);
> + if (!timeout_initialized(&sc->sc_tmo_tmpl)) {
> + if_nettmo_set(&sc->sc_tmo_tmpl, &sc->sc_task_tmpl,
> + pflow_timeout_tmpl, sc);
> + }
> + if (!timeout_initialized(&sc->sc_tmo)) {
> + if_nettmo_set(&sc->sc_tmo, &sc->sc_task,
> + pflow_timeout, sc);
> + }
> + if (!timeout_initialized(&sc->sc_tmo6)) {
> + if_nettmo_set(&sc->sc_tmo6, &sc->sc_task6,
> + pflow_timeout6, sc);
> + }
>  
>   timeout_add_sec(&sc->sc_tmo_tmpl, PFLOW_TMPL_TIMEOUT);
>   break;
> Index: net/if_pflow.h
> ===
> RCS file: /cvs/src/sys/net/if_pflow.h,v
> retrieving revision 1.14
> diff -u -p -r1.14 if_pflow.h
> --- net/if_pflow.h3 Oct 2015 10:44:23 -   1.14
> +++ net/if_pflow.h19 Sep 2016 01:51:37 -
> @@ -182,8 +182,11 @@ struct pflow_softc {
>   u_int64_t    sc_gcounter;
>   u_int32_t    sc_sequence;
>   struct timeout   sc_tmo;
> - struct timeout   sc_tmo6;
> + struct task  sc_task;
>   struct timeout   sc_tmo_tmpl;
> + struct task  sc_task_tmpl;
> + struct timeout   sc_tmo6;
> + struct task  sc_task6;
>   struct socket   *so;
>   struct mbuf *send_nam;
>   struct sockaddr *sc_flowsrc;
> Index: net/if_pfsync.c
> ===
> RCS file: /cvs/src/sys/net/if_pfsync.c,v
> retrieving 

Re: timeout_set_proc(9)

2016-09-18 Thread David Gwynne
On Fri, Sep 16, 2016 at 04:58:39PM +0200, Mark Kettenis wrote:
> > Date: Thu, 15 Sep 2016 16:29:45 +0200
> > From: Martin Pieuchot 
> > 
> > After discussing with a few people about a new "timed task" API I came
> > to the conclusion that mixing timeouts and tasks will result in:
> > 
> >   - always including a 'struct timeout' in a 'struct task', or the other
> > the way around
> > or
> >   
> >   - introducing a new data structure, hence API.
> > 
> > Since I'd like to keep the change as small as possible when converting
> > existing timeout_set(9), neither option seems a good fit.  So I decided
> > to add a new kernel thread, curiously named "softclock", that will
> > offer its stack to the poor timeout handlers that need one.
> > 
> > With this approach, converting a timeout is just a matter of doing:
> > 
> > s/timeout_set/timeout_set_proc/
> > 
> > 
> > Diff below includes the conversions I need for the "netlock".  I'm
> > waiting for feedback and a better name to document the new function.
> > 
> > Comments?
> 
> I like how minimal this is.  Would like to see a few more people that
> are familiar with the timeout code chime in, but it looks mostly
> correct to me as well.  One question though:

id rather not grow the timeout (or task for that matter) apis just
yet. theyre nice and straightforward to read and understand so far.

for this specific problem can we do a specific fix for it? the diff
below is equiv to the timeout_set_proc change, but implements it
by using explicit tasks that are called by the timeouts from a
common trampoline in the network stack.

Index: net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.448
diff -u -p -r1.448 if.c
--- net/if.c13 Sep 2016 08:15:01 -  1.448
+++ net/if.c19 Sep 2016 01:51:37 -
@@ -155,6 +155,8 @@ void	if_watchdog_task(void *);
 void   if_input_process(void *);
 void   if_netisr(void *);
 
+void   if_nettmo_add(void *);
+
 #ifdef DDB
 void   ifa_print_all(void);
 #endif
@@ -875,6 +877,21 @@ if_netisr(void *unused)
 
splx(s);
KERNEL_UNLOCK();
+}
+
+void
+if_nettmo_add(void *t)
+{
+   /* the task is added to systq so it inherits the KERNEL_LOCK */
+   task_add(systq, t);
+}
+
+void
+if_nettmo_set(struct timeout *tmo, struct task *task,
+void (*fn)(void *), void *arg)
+{
+   task_set(task, fn, arg);
+   timeout_set(tmo, if_nettmo_add, task);
 }
 
 void
Index: net/if_pflow.c
===
RCS file: /cvs/src/sys/net/if_pflow.c,v
retrieving revision 1.61
diff -u -p -r1.61 if_pflow.c
--- net/if_pflow.c  29 Apr 2016 08:55:03 -  1.61
+++ net/if_pflow.c  19 Sep 2016 01:51:37 -
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -547,16 +548,24 @@ pflow_init_timeouts(struct pflow_softc *
	timeout_del(&sc->sc_tmo6);
	if (timeout_initialized(&sc->sc_tmo_tmpl))
	timeout_del(&sc->sc_tmo_tmpl);
-   if (!timeout_initialized(&sc->sc_tmo))
-   timeout_set(&sc->sc_tmo, pflow_timeout, sc);
+   if (!timeout_initialized(&sc->sc_tmo)) {
+   if_nettmo_set(&sc->sc_tmo, &sc->sc_task,
+   pflow_timeout, sc);
+   }
	break;
	case PFLOW_PROTO_10:
-   if (!timeout_initialized(&sc->sc_tmo_tmpl))
-   timeout_set(&sc->sc_tmo_tmpl, pflow_timeout_tmpl, sc);
-   if (!timeout_initialized(&sc->sc_tmo))
-   timeout_set(&sc->sc_tmo, pflow_timeout, sc);
-   if (!timeout_initialized(&sc->sc_tmo6))
-   timeout_set(&sc->sc_tmo6, pflow_timeout6, sc);
+   if (!timeout_initialized(&sc->sc_tmo_tmpl)) {
+   if_nettmo_set(&sc->sc_tmo_tmpl, &sc->sc_task_tmpl,
+   pflow_timeout_tmpl, sc);
+   }
+   if (!timeout_initialized(&sc->sc_tmo)) {
+   if_nettmo_set(&sc->sc_tmo, &sc->sc_task,
+   pflow_timeout, sc);
+   }
+   if (!timeout_initialized(&sc->sc_tmo6)) {
+   if_nettmo_set(&sc->sc_tmo6, &sc->sc_task6,
+   pflow_timeout6, sc);
+   }
 
	timeout_add_sec(&sc->sc_tmo_tmpl, PFLOW_TMPL_TIMEOUT);
break;
Index: net/if_pflow.h
===
RCS file: /cvs/src/sys/net/if_pflow.h,v
retrieving revision 1.14
diff -u -p -r1.14 if_pflow.h
--- net/if_pflow.h  3 Oct 2015 10:44:23 -   1.14
+++ net/if_pflow.h  19 Sep 2016 01:51:37 -
@@ -182,8 +182,11 @@ struct pflow_softc {
	u_int64_t    sc_gcounter;
	u_int32_t    sc_sequence;
struct timeout   sc_tmo;
-   struct timeout   sc_tmo6;
+   

Re: timeout_set_proc(9)

2016-09-16 Thread Mark Kettenis
> Date: Thu, 15 Sep 2016 16:29:45 +0200
> From: Martin Pieuchot 
> 
> After discussing with a few people about a new "timed task" API I came
> to the conclusion that mixing timeouts and tasks will result in:
> 
>   - always including a 'struct timeout' in a 'struct task', or the other
> the way around
> or
>   
>   - introducing a new data structure, hence API.
> 
> Since I'd like to keep the change as small as possible when converting
> existing timeout_set(9), neither option seems a good fit.  So I decided
> to add a new kernel thread, curiously named "softclock", that will
> offer its stack to the poor timeout handlers that need one.
> 
> With this approach, converting a timeout is just a matter of doing:
> 
>   s/timeout_set/timeout_set_proc/
> 
> 
> Diff below includes the conversions I need for the "netlock".  I'm
> waiting for feedback and a better name to document the new function.
> 
> Comments?

I like how minimal this is.  Would like to see a few more people that
are familiar with the timeout code chime in, but it looks mostly
correct to me as well.  One question though:

> Index: kern/kern_timeout.c
> ===
> RCS file: /cvs/src/sys/kern/kern_timeout.c,v
> retrieving revision 1.48
> diff -u -p -r1.48 kern_timeout.c
> --- kern/kern_timeout.c   6 Jul 2016 15:53:01 -   1.48
> +++ kern/kern_timeout.c   15 Sep 2016 14:19:10 -
> @@ -27,7 +27,7 @@
>  
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -54,6 +54,7 @@
>  
>  struct circq timeout_wheel[BUCKETS]; /* Queues of timeouts */
>  struct circq timeout_todo;   /* Worklist */
> +struct circq timeout_proc;   /* Due timeouts needing proc. context */
>  
>  #define MASKWHEEL(wheel, time) (((time) >> ((wheel)*WHEELBITS)) & WHEELMASK)
>  
> @@ -127,6 +128,9 @@ struct mutex timeout_mutex = MUTEX_INITI
>  
>  #define CIRCQ_EMPTY(elem) (CIRCQ_FIRST(elem) == (elem))
>  
> +void softclock_thread(void *);
> +void softclock_create_thread(void *);
> +
>  /*
>   * Some of the "math" in here is a bit tricky.
>   *
> @@ -147,11 +151,18 @@ timeout_startup(void)
>   int b;
>  
>   CIRCQ_INIT(&timeout_todo);
> + CIRCQ_INIT(&timeout_proc);
>   for (b = 0; b < nitems(timeout_wheel); b++)
>   CIRCQ_INIT(&timeout_wheel[b]);
>  }
>  
>  void
> +timeout_proc_init(void)
> +{
> + kthread_create_deferred(softclock_create_thread, curcpu());
> +}
> +
> +void
>  timeout_set(struct timeout *new, void (*fn)(void *), void *arg)
>  {
>   new->to_func = fn;
> @@ -159,6 +170,12 @@ timeout_set(struct timeout *new, void (*
>   new->to_flags = TIMEOUT_INITIALIZED;
>  }
>  
> +void
> +timeout_set_proc(struct timeout *new, void (*fn)(void *), void *arg)
> +{
> + timeout_set(new, fn, arg);
> + new->to_flags |= TIMEOUT_NEEDPROCCTX;
> +}
>  
>  int
>  timeout_add(struct timeout *new, int to_ticks)
> @@ -334,38 +351,84 @@ timeout_hardclock_update(void)
>  }
>  
>  void
> +timeout_run(struct timeout *to)
> +{
> + void (*fn)(void *);
> + void *arg;
> +
> + MUTEX_ASSERT_LOCKED(&timeout_mutex);
> +
> + to->to_flags &= ~TIMEOUT_ONQUEUE;
> + to->to_flags |= TIMEOUT_TRIGGERED;
> +
> + fn = to->to_func;
> + arg = to->to_arg;
> +
> + mtx_leave(&timeout_mutex);
> + fn(arg);
> + mtx_enter(&timeout_mutex);
> +}
> +
> +void
>  softclock(void *arg)
>  {
>   int delta;
>   struct circq *bucket;
>   struct timeout *to;
> - void (*fn)(void *);
>  
>   mtx_enter(&timeout_mutex);
>   while (!CIRCQ_EMPTY(&timeout_todo)) {
>   to = timeout_from_circq(CIRCQ_FIRST(&timeout_todo));
>   CIRCQ_REMOVE(&to->to_list);
>  
> - /* If due run it, otherwise insert it into the right bucket. */
> + /*
> +  * If due run it or defer execution to the thread,
> +  * otherwise insert it into the right bucket.
> +  */
>   delta = to->to_time - ticks;
>   if (delta > 0) {
>   bucket = &BUCKET(delta, to->to_time);
>   CIRCQ_INSERT(&to->to_list, bucket);
> + } else if (to->to_flags & TIMEOUT_NEEDPROCCTX) {
> + CIRCQ_INSERT(&to->to_list, &timeout_proc);
> + wakeup(&timeout_proc);
>   } else {
>  #ifdef DEBUG
>   if (delta < 0)
>   printf("timeout delayed %d\n", delta);
>  #endif
> - to->to_flags &= ~TIMEOUT_ONQUEUE;
> - to->to_flags |= TIMEOUT_TRIGGERED;
> + timeout_run(to);
> + }
> + }
> + mtx_leave(&timeout_mutex);
> +}
>  
> - fn = to->to_func;
> - arg = to->to_arg;
> +void
> +softclock_create_thread(void *xci)
> +{
> + if (kthread_create(softclock_thread, xci, NULL, "softclock"))
> + panic("fork softclock");
> +}
>  
> - mtx_leave(&timeout_mutex);
> -  

Re: timeout_set_proc(9)

2016-09-16 Thread Mark Kettenis
> Date: Fri, 16 Sep 2016 16:03:50 +0200
> From: Vincent Gross 
> 
> On Thu, 15 Sep 2016 16:29:45 +0200
> Martin Pieuchot  wrote:
> 
> > After discussing with a few people about a new "timed task" API I came
> > to the conclusion that mixing timeouts and tasks will result in:
> > 
> >   - always including a 'struct timeout' in a 'struct task', or the
> > other the way around
> > or
> >   
> >   - introducing a new data structure, hence API.
> > 
> > Since I'd like to keep the change as small as possible when converting
> > existing timeout_set(9), neither option seems a good fit.  So I decided
> > to add a new kernel thread, curiously named "softclock", that will
> > offer its stack to the poor timeout handlers that need one.
> > 
> > With this approach, converting a timeout is just a matter of doing:
> > 
> > s/timeout_set/timeout_set_proc/
> > 
> > 
> > Diff below includes the conversions I need for the "netlock".  I'm
> > waiting for feedback and a better name to document the new function.
> > 
> > Comments?
> 
> Reads OK; I like the simple renaming.
> 
> The "softclock" thread name will be confusing, the timeouts are indeed
> driven by the softclock interrupt, but the tasks have nothing to do
> with softclock. Maybe "timeothread" ?

Naming things is always hard.  The "thread" in the name is a bit
redundant.  Probably just "timeout" would be fine.  The nice thing
about "softclock" is that it is nicely symmetric with the "softnet"
thread.  Although that one is a taskq.



Re: timeout_set_proc(9)

2016-09-16 Thread Vincent Gross
On Thu, 15 Sep 2016 16:29:45 +0200
Martin Pieuchot  wrote:

> After discussing with a few people about a new "timed task" API I came
> to the conclusion that mixing timeouts and tasks will result in:
> 
>   - always including a 'struct timeout' in a 'struct task', or the
> other the way around
> or
>   
>   - introducing a new data structure, hence API.
> 
> Since I'd like to keep the change as small as possible when converting
> existing timeout_set(9), neither option seem a good fit.  So I decided
> to add a new kernel thread, curiously named "softclock", that will
> offer his stack to the poor timeout handlers that need one. 
> 
> With this approach, converting a timeout is just a matter of doing:
> 
>   s/timeout_set/timeout_set_proc/
> 
> 
> Diff below includes the conversions I need for the "netlock".  I'm
> waiting for feedbacks and a better name to document the new function.
> 
> Comments?

Reads OK; I like the simple renaming.

The "softclock" thread name will be confusing, the timeouts are indeed
driven by the softclock interrupt, but the tasks have nothing to do
with softclock. Maybe "timeothread" ?

Will this new thread stay, or is it only to ease the transition to MP
networking ?

> 
> Index: net/if_pflow.c
> ===
> RCS file: /cvs/src/sys/net/if_pflow.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 if_pflow.c
> --- net/if_pflow.c29 Apr 2016 08:55:03 -  1.61
> +++ net/if_pflow.c15 Sep 2016 14:19:10 -
> @@ -548,15 +548,16 @@ pflow_init_timeouts(struct pflow_softc *
>   if (timeout_initialized(&sc->sc_tmo_tmpl))
>   timeout_del(&sc->sc_tmo_tmpl);
>   if (!timeout_initialized(&sc->sc_tmo))
> - timeout_set(&sc->sc_tmo, pflow_timeout, sc);
> + timeout_set_proc(&sc->sc_tmo, pflow_timeout, sc);
>   break;
>   case PFLOW_PROTO_10:
>   if (!timeout_initialized(&sc->sc_tmo_tmpl))
> - timeout_set(&sc->sc_tmo_tmpl, pflow_timeout_tmpl, sc);
> + timeout_set_proc(&sc->sc_tmo_tmpl, pflow_timeout_tmpl,
> + sc);
>   if (!timeout_initialized(&sc->sc_tmo))
> - timeout_set(&sc->sc_tmo, pflow_timeout, sc);
> + timeout_set_proc(&sc->sc_tmo, pflow_timeout, sc);
>   if (!timeout_initialized(&sc->sc_tmo6))
> - timeout_set(&sc->sc_tmo6, pflow_timeout6, sc);
> + timeout_set_proc(&sc->sc_tmo6, pflow_timeout6, sc);
>  
>   timeout_add_sec(&sc->sc_tmo_tmpl, PFLOW_TMPL_TIMEOUT);
>   break;
> Index: net/if_pfsync.c
> ===
> RCS file: /cvs/src/sys/net/if_pfsync.c,v
> retrieving revision 1.231
> diff -u -p -r1.231 if_pfsync.c
> --- net/if_pfsync.c   15 Sep 2016 02:00:18 -  1.231
> +++ net/if_pfsync.c   15 Sep 2016 14:19:10 -
> @@ -328,9 +328,9 @@ pfsync_clone_create(struct if_clone *ifc
>   IFQ_SET_MAXLEN(&ifp->if_snd, IFQ_MAXLEN);
>   ifp->if_hdrlen = sizeof(struct pfsync_header);
>   ifp->if_mtu = ETHERMTU;
> - timeout_set(&sc->sc_tmo, pfsync_timeout, sc);
> - timeout_set(&sc->sc_bulk_tmo, pfsync_bulk_update, sc);
> - timeout_set(&sc->sc_bulkfail_tmo, pfsync_bulk_fail, sc);
> + timeout_set_proc(&sc->sc_tmo, pfsync_timeout, sc);
> + timeout_set_proc(&sc->sc_bulk_tmo, pfsync_bulk_update, sc);
> + timeout_set_proc(&sc->sc_bulkfail_tmo, pfsync_bulk_fail, sc);
>  
>   if_attach(ifp);
>   if_alloc_sadl(ifp);
> @@ -1723,7 +1723,7 @@ pfsync_defer(struct pf_state *st, struct
>   sc->sc_deferred++;
>   TAILQ_INSERT_TAIL(&sc->sc_deferrals, pd, pd_entry);
>  
> - timeout_set(&pd->pd_tmo, pfsync_defer_tmo, pd);
> + timeout_set_proc(&pd->pd_tmo, pfsync_defer_tmo, pd);
>   timeout_add_msec(&pd->pd_tmo, 20);
>  
>   schednetisr(NETISR_PFSYNC);
> Index: netinet/ip_carp.c
> ===
> RCS file: /cvs/src/sys/netinet/ip_carp.c,v
> retrieving revision 1.293
> diff -u -p -r1.293 ip_carp.c
> --- netinet/ip_carp.c 25 Jul 2016 16:44:04 -  1.293
> +++ netinet/ip_carp.c 15 Sep 2016 14:19:11 -
> @@ -831,9 +831,9 @@ carp_new_vhost(struct carp_softc *sc, in
>   vhe->vhid = vhid;
>   vhe->advskew = advskew;
>   vhe->state = INIT;
> - timeout_set(&vhe->ad_tmo, carp_send_ad, vhe);
> - timeout_set(&vhe->md_tmo, carp_master_down, vhe);
> - timeout_set(&vhe->md6_tmo, carp_master_down, vhe);
> + timeout_set_proc(&vhe->ad_tmo, carp_send_ad, vhe);
> + timeout_set_proc(&vhe->md_tmo, carp_master_down, vhe);
> + timeout_set_proc(&vhe->md6_tmo, carp_master_down, vhe);
>  
>   KERNEL_ASSERT_LOCKED(); /* touching carp_vhosts */
>  
> Index: netinet/tcp_timer.h
> ===
> RCS file: /cvs/src/sys/netinet/tcp_timer.h,v
> retrieving revision 1.13
> diff -u -p -r1.13 tcp_timer.h
> --- netinet/tcp_timer.h   6 Jul 2011 23:44:20