Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-29 Thread Tejun Heo
Hello,

On Wed, Jun 29, 2016 at 10:17:48AM +0200, Petr Mladek wrote:
> > Ah, okay, I don't think we need to change this.  I was suggesting to
> > simplify it by dropping the draining and just do flush from destroy.
> 
> I see. But then it does not address the original concern from Peter
> Zijlstra. He did not like that the caller was responsible for blocking
> further queueing. It still will be needed. Or did I miss something,
> please?

You can only protect against so much.  Let's say we make the worker
struct to be allocated by the user, what then prevents it prematurely
from user side?  Use-after-free is use-after-free.  If we can trivally
add some protection against it, great, but no need to contort the
design to add marginal protection.

Thanks.

-- 
tejun


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-29 Thread Petr Mladek
On Tue 2016-06-28 13:04:47, Tejun Heo wrote:
> Hello,
> 
> On Mon, Jun 27, 2016 at 04:33:50PM +0200, Petr Mladek wrote:
> > OK, so you suggest to do the following:
> > 
> >   1. Add a flag into struct kthread_worker that will prevent
> >  from further queuing.
> 
> This doesn't add any protection, right?  It's getting freed anyway.
> 
> >   2. kthread_create_worker()/kthread_destroy_worker() will
> >  not longer dynamically allocate struct kthread_worker.
> >  They will just start/stop the kthread.
> 
> Ah, okay, I don't think we need to change this.  I was suggesting to
> simplify it by dropping the draining and just do flush from destroy.

I see. But then it does not address the original concern from Peter
Zijlstra. He did not like that the caller was responsible for blocking
further queueing. It still will be needed. Or did I miss something,
please?

Best Regards,
Petr


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-28 Thread Tejun Heo
Hello,

On Mon, Jun 27, 2016 at 04:33:50PM +0200, Petr Mladek wrote:
> OK, so you suggest to do the following:
> 
>   1. Add a flag into struct kthread_worker that will prevent
>  from further queuing.

This doesn't add any protection, right?  It's getting freed anyway.

>   2. kthread_create_worker()/kthread_destroy_worker() will
>  not longer dynamically allocate struct kthread_worker.
>  They will just start/stop the kthread.

Ah, okay, I don't think we need to change this.  I was suggesting to
simplify it by dropping the draining and just do flush from destroy.

Thanks.

-- 
tejun


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-27 Thread Petr Mladek
Hi,

On Fri 2016-06-24 11:54:47, Tejun Heo wrote:
> On Fri, Jun 24, 2016 at 09:05:15AM +0200, Peter Zijlstra wrote:
> > > Given how rare that is 
> > 
> > Could you then not remove/rework these few cases for workqueue as well
> > and make that 'better' too?
> 
> Usage of draining is rare for workqueue but that still means several
> legitimate users.  With draining there, it's logical to use it during
> shutdown.  I don't think it makes sense to change it on workqueue
> side.
> 
> > > and the extra
> > > complexity of identifying self-requeueing cases, let's forget about
> > > draining and on destruction clear the worker pointer to block further
> > > queueing and then flush whatever is in flight.
> > 
> > You're talking about regular workqueues here?
> 
> No, kthread worker.  It's unlikely that kthread worker is gonna need
> chained draining especially given that most of its usages are gonna be
> conversions from raw kthread usages.  We won't lose much if anything
> by just ignoring draining and making the code simpler.

OK, so you suggest to do the following:

  1. Add a flag into struct kthread_worker that will prevent
 from further queuing.

  2. kthread_create_worker()/kthread_destroy_worker() will
 not longer dynamically allocate struct kthread_worker.
 They will just start/stop the kthread.


The result will be:

  a. User will not need the strict synchronization between
 the queue and create/destroy operations.

  b. We could get rid of drain_kthread_worker() because
 flush_kthread_worker() will be enough.


IMHO, the 1st change does not make sense without the 2nd one.
Otherwise, users could do an out-of-memory access when testing
the freed kthread_worker flag.

Do I get this correctly please?

Best Regards,
Petr


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-24 Thread Tejun Heo
Hello,

On Fri, Jun 24, 2016 at 09:05:15AM +0200, Peter Zijlstra wrote:
> > Given how rare that is 
> 
> Could you then not remove/rework these few cases for workqueue as well
> and make that 'better' too?

Usage of draining is rare for workqueue but that still means several
legitimate users.  With draining there, it's logical to use it during
shutdown.  I don't think it makes sense to change it on workqueue
side.

> > and the extra
> > complexity of identifying self-requeueing cases, let's forget about
> > draining and on destruction clear the worker pointer to block further
> > queueing and then flush whatever is in flight.
> 
> You're talking about regular workqueues here?

No, kthread worker.  It's unlikely that kthread worker is gonna need
chained draining especially given that most of its usages are gonna be
conversions from raw kthread usages.  We won't lose much if anything
by just ignoring draining and making the code simpler.

Thanks.

-- 
tejun


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-24 Thread Petr Mladek
On Fri 2016-06-24 09:05:15, Peter Zijlstra wrote:
> On Thu, Jun 23, 2016 at 05:32:58PM -0400, Tejun Heo wrote:
> > Hello,
> > 
> > On Wed, Jun 22, 2016 at 10:54:45PM +0200, Peter Zijlstra wrote:
> > > > + * The caller is responsible for blocking all users of this kthread
> > > > + * worker from queuing new works. Also it is responsible for blocking
> > > > + * the already queued works from an infinite re-queuing!
> > > 
> > > This, I really dislike that. And it makes the kthread_destroy_worker()
> > > from the next patch unnecessarily fragile.
> > > 
> > > Why not add a kthread_worker::blocked flag somewhere and refuse/WARN
> > > kthread_queue_work() when that is set.
> > 
> > It's the same logic from workqueue counterpart.
> 
> So ? Clearly it (the kthread workqueue) can be improved here.
> 
> > For workqueue, nothing can make it less fragile as the workqueue
> > struct itself is freed on destruction.  If its users fail to stop
> > issuing work items, it'll lead to use-after-free.
> 
> Right, but this kthread thingy does not, so why not add a failsafe?

The struct kthread_worker is freed in kthread_destroy_worker().
So kthread_worker is the same situation as workqueues.

The allocation/freeing has been added in v2. It helped
to make it clear when the structure was initialized. Note that we
still need the crate/destroy functions to start/stop the kthread.
See the discussion at
https://lkml.kernel.org/g/20150728172657.gc5...@mtj.duckdns.org

I personally do not have strong opinion about it.

On one hand, it makes the code more complex because we need strong
synchronization between queueing/canceling/destroying. There are cases
where it is not that important, for example the hugepage daemon or
hung task. It does not matter if the next round will be done or not.
Well, it is strange if someting gets queued and it is not proceed.

On the other hand, there are situations where the work must be
done, e.g. some I/O operation. They need the strong syncronization.
We could print a warning when queueing a work for a destroyed
(stoped) kthread_worker to catch potential problems. But then we will need
the strong synchronization in all cases to avoid "false" alarms.

After all, the blocked flag will not necessarily make the usage
less hairy. Or did I miss something?

Best Regards,
Petr


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-24 Thread Peter Zijlstra
On Thu, Jun 23, 2016 at 05:32:58PM -0400, Tejun Heo wrote:
> Hello,
> 
> On Wed, Jun 22, 2016 at 10:54:45PM +0200, Peter Zijlstra wrote:
> > > + * The caller is responsible for blocking all users of this kthread
> > > + * worker from queuing new works. Also it is responsible for blocking
> > > + * the already queued works from an infinite re-queuing!
> > 
> > This, I really dislike that. And it makes the kthread_destroy_worker()
> > from the next patch unnecessarily fragile.
> > 
> > Why not add a kthread_worker::blocked flag somewhere and refuse/WARN
> > kthread_queue_work() when that is set.
> 
> It's the same logic from workqueue counterpart.

So ? Clearly it (the kthread workqueue) can be improved here.

> For workqueue, nothing can make it less fragile as the workqueue
> struct itself is freed on destruction.  If its users fail to stop
> issuing work items, it'll lead to use-after-free.

Right, but this kthread thingy does not, so why not add a failsafe?

> IIRC, the draining of self-requeueing work items is a specific
> requirement from some edge use case which used workqueue to implement
> multi-step state machine. 

Right, that might be an issue,

> Given how rare that is 

Could you then not remove/rework these few cases for workqueue as well
and make that 'better' too?

> and the extra
> complexity of identifying self-requeueing cases, let's forget about
> draining and on destruction clear the worker pointer to block further
> queueing and then flush whatever is in flight.

You're talking about regular workqueues here?


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-23 Thread Tejun Heo
Hello,

On Wed, Jun 22, 2016 at 10:54:45PM +0200, Peter Zijlstra wrote:
> > + * The caller is responsible for blocking all users of this kthread
> > + * worker from queuing new works. Also it is responsible for blocking
> > + * the already queued works from an infinite re-queuing!
> 
> This, I really dislike that. And it makes the kthread_destroy_worker()
> from the next patch unnecessarily fragile.
> 
> Why not add a kthread_worker::blocked flag somewhere and refuse/WARN
> kthread_queue_work() when that is set.

It's the same logic from workqueue counterpart.  For workqueue,
nothing can make it less fragile as the workqueue struct itself is
freed on destruction.  If its users fail to stop issuing work items,
it'll lead to use-after-free.

IIRC, the draining of self-requeueing work items is a specific
requirement from some edge use case which used workqueue to implement
multi-step state machine.  Given how rare that is and the extra
complexity of identifying self-requeueing cases, let's forget about
draining and on destruction clear the worker pointer to block further
queueing and then flush whatever is in flight.

Thanks.

-- 
tejun


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-22 Thread Peter Zijlstra
On Thu, Jun 16, 2016 at 01:17:25PM +0200, Petr Mladek wrote:
> +/**
> + * kthread_drain_worker - drain a kthread worker
> + * @worker: worker to be drained
> + *
> + * Wait until there is no work queued for the given kthread worker.
> + * @worker is flushed repeatedly until it becomes empty.  The number
> + * of flushing is determined by the depth of chaining and should
> + * be relatively short.  Whine if it takes too long.
> + *

> + * The caller is responsible for blocking all users of this kthread
> + * worker from queuing new works. Also it is responsible for blocking
> + * the already queued works from an infinite re-queuing!

This, I really dislike that. And it makes the kthread_destroy_worker()
from the next patch unnecessarily fragile.

Why not add a kthread_worker::blocked flag somewhere and refuse/WARN
kthread_queue_work() when that is set.

> + */
> +void kthread_drain_worker(struct kthread_worker *worker)
> +{
> + int flush_cnt = 0;
> +
> + spin_lock_irq(&worker->lock);
> +
> + while (!list_empty(&worker->work_list)) {
> + spin_unlock_irq(&worker->lock);
> +
> + kthread_flush_worker(worker);
> + WARN_ONCE(flush_cnt++ > 10,
> +   "kthread worker %s: kthread_drain_worker() isn't 
> complete after %u tries\n",
> +   worker->task->comm, flush_cnt);
> +
> + spin_lock_irq(&worker->lock);
> + }
> +
> + spin_unlock_irq(&worker->lock);
> +}
> +EXPORT_SYMBOL(kthread_drain_worker);
> -- 
> 1.8.5.6
> 


Re: [PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-20 Thread Tejun Heo
On Thu, Jun 16, 2016 at 01:17:25PM +0200, Petr Mladek wrote:
> kthread_flush_worker() returns when the currently queued works are proceed.
> But some other works might have been queued in the meantime.
> 
> This patch adds kthread_drain_worker() that is inspired by
> drain_workqueue(). It returns when the queue is completely
> empty and warns when it takes too long.
> 
> The initial implementation does not block queuing new works when
> draining. It makes things much easier. The blocking would be useful
> to debug potential problems but it is not clear if it is worth
> the complication at the moment.
> 
> Signed-off-by: Petr Mladek 

Acked-by: Tejun Heo 

Thanks.

-- 
tejun


[PATCH v9 06/12] kthread: Add kthread_drain_worker()

2016-06-16 Thread Petr Mladek
kthread_flush_worker() returns when the currently queued works are proceed.
But some other works might have been queued in the meantime.

This patch adds kthread_drain_worker() that is inspired by
drain_workqueue(). It returns when the queue is completely
empty and warns when it takes too long.

The initial implementation does not block queuing new works when
draining. It makes things much easier. The blocking would be useful
to debug potential problems but it is not clear if it is worth
the complication at the moment.

Signed-off-by: Petr Mladek 
---
 include/linux/kthread.h |  1 +
 kernel/kthread.c| 34 ++
 2 files changed, 35 insertions(+)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index f68041837dd6..c889b653f8cb 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -135,5 +135,6 @@ bool kthread_queue_work(struct kthread_worker *worker,
struct kthread_work *work);
 void kthread_flush_work(struct kthread_work *work);
 void kthread_flush_worker(struct kthread_worker *worker);
+void kthread_drain_worker(struct kthread_worker *worker);
 
 #endif /* _LINUX_KTHREAD_H */
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 590b9f699e9d..4454b1267718 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -819,3 +819,37 @@ void kthread_flush_worker(struct kthread_worker *worker)
wait_for_completion(&fwork.done);
 }
 EXPORT_SYMBOL_GPL(kthread_flush_worker);
+
+/**
+ * kthread_drain_worker - drain a kthread worker
+ * @worker: worker to be drained
+ *
+ * Wait until there is no work queued for the given kthread worker.
+ * @worker is flushed repeatedly until it becomes empty.  The number
+ * of flushing is determined by the depth of chaining and should
+ * be relatively short.  Whine if it takes too long.
+ *
+ * The caller is responsible for blocking all users of this kthread
+ * worker from queuing new works. Also it is responsible for blocking
+ * the already queued works from an infinite re-queuing!
+ */
+void kthread_drain_worker(struct kthread_worker *worker)
+{
+   int flush_cnt = 0;
+
+   spin_lock_irq(&worker->lock);
+
+   while (!list_empty(&worker->work_list)) {
+   spin_unlock_irq(&worker->lock);
+
+   kthread_flush_worker(worker);
+   WARN_ONCE(flush_cnt++ > 10,
+ "kthread worker %s: kthread_drain_worker() isn't 
complete after %u tries\n",
+ worker->task->comm, flush_cnt);
+
+   spin_lock_irq(&worker->lock);
+   }
+
+   spin_unlock_irq(&worker->lock);
+}
+EXPORT_SYMBOL(kthread_drain_worker);
-- 
1.8.5.6