Re: [PATCH 3/3] make kthread_stop() scalable

2007-04-14 Thread Oleg Nesterov
On 04/14, Eric W. Biederman wrote:
>
> This is where I was going beyond what you were doing.  I needed a flag to say
> that this a kthread that is stopping to test in recalc_sigpending.  To be 
> certain
> of terminating interruptible sleeps.  I could not get at your struct kthread
> in that case.
> 
> If it wasn't for the wait_event_interruptible thing I likely would
> have just thrown a union in struct task_struct.
> 
> I also got lucky in that vfork_done is designed to point a completion
> just where I need it (when a task exits).  The name is now a little
> abused but otherwise it does just what I want it to.
> 
> >> It also doesn't solve the biggest problem with the current kthread 
> >> interface
> >> in that calling kthread_stop does not cause the code to break out of
> >> interruptible sleeps.
> >
> > Hm? kthread_stop() does wake_up_process(), it wakes up TASK_INTERRUPTIBLE 
> > tasks.
> 
> Yes. But if they are looping, unless signal_pending is set it is quite 
> possible
> they will go back to sleep.
> 
> Take for example:
> 
> > #define __wait_event_interruptible(wq, condition, ret)  \
> > do {
> > \
> > DEFINE_WAIT(__wait);\
> > \
> > for (;;) {  \
> > prepare_to_wait(, &__wait, TASK_INTERRUPTIBLE);  \
> > if (condition)  \
> > break;  \
> > if (!signal_pending(current)) { \
> > schedule(); \
> > continue;   \
> > }   \
> > ret = -ERESTARTSYS; \
> > break;  \
> > }   \
> > finish_wait(, &__wait);  \
> > } while (0)
> 
> We don't break out until either condition is true or signal_pending(current)
> is true.
> 
> Loops that do that are very common in the kernel.  I counted about 500
> calls of signal pending in places that otherwise care nothing about signals.
> Several kernel threads call into functions that use loops like
> wait_event_interruptible.  So I need a more forceful kthread_stop.  If
> I don't want to continue to use signals.

Yes, I got it reading your next patches. Ok, probably this change is good.
My question was: do we really want to force a kernel thread to exit if it
waits for event in TASK_INTERRUPTIBLE state? probably yes.

> > Yes, thanks... Can't understand how I was s stupid!!! thanks...
> >
> > Damn. We don't need 2 completions! just one.
> 
> Yep.  My second patch in this last round implements that.

Yes, I have read it. It is clearly better then mine, and I think correct.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] make kthread_stop() scalable

2007-04-14 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes:

> On 04/13, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <[EMAIL PROTECTED]> writes:
>> 
>> > It's a shame kthread_stop() (may take a while!) runs with a global 
>> > semaphore
>> > held. With this patch kthread() allocates all neccesary data (struct
> kthread)
>> > on its own stack, globals kthread_stop_xxx are deleted.
>> 
>> Oleg so fare you patches  have been inspiring.  However..
>> 
>> > HACKS:
>> >
>> >- re-use task_struct->set_child_tid to point to "struct kthread"
>> 
>>   task_struct->vfork_done is a better cannidate.
>> 
>> >- use do_exit() directly to preserve "struct kthread" on stack
>> 
>> Calling do_exit directly like that is not a hack, as it appears the preferred
>> way to exit is to call do_exit, or complete_and_exit.
>> 
>> While this does improve the scalability and remove a global variable.  It
>> also introduces a complex special case in the form of struct kthread.
>
> I can't say I agree. I thought it is good to have a struct which represents
> a kernel thread. Actually, I thought we can have __kthread_create() which
> returns "struct kthread". May be I am wrong, because yes, ->set_child_tid can
> point right to completion, and we can use some TIF flag instead of
> ->should_stop.

> This needs to update a lot of include/asm/ files.

Yes it does.

This is where I was going beyond what you were doing.  I needed a flag to say
that this a kthread that is stopping to test in recalc_sigpending.  To be 
certain
of terminating interruptible sleeps.  I could not get at your struct kthread
in that case.

If it wasn't for the wait_event_interruptible thing I likely would
have just thrown a union in struct task_struct.

I also got lucky in that vfork_done is designed to point a completion
just where I need it (when a task exits).  The name is now a little
abused but otherwise it does just what I want it to.

>> It also doesn't solve the biggest problem with the current kthread interface
>> in that calling kthread_stop does not cause the code to break out of
>> interruptible sleeps.
>
> Hm? kthread_stop() does wake_up_process(), it wakes up TASK_INTERRUPTIBLE 
> tasks.

Yes. But if they are looping, unless signal_pending is set it is quite possible
they will go back to sleep.

Take for example:

> #define __wait_event_interruptible(wq, condition, ret)\
> do {  \
>   DEFINE_WAIT(__wait);\
>   \
>   for (;;) {  \
>   prepare_to_wait(, &__wait, TASK_INTERRUPTIBLE);  \
>   if (condition)  \
>   break;  \
>   if (!signal_pending(current)) { \
>   schedule(); \
>   continue;   \
>   }   \
>   ret = -ERESTARTSYS; \
>   break;  \
>   }   \
>   finish_wait(, &__wait);  \
> } while (0)

We don't break out until either condition is true or signal_pending(current)
is true.

Loops that do that are very common in the kernel.  I counted about 500
calls of signal pending in places that otherwise care nothing about signals.
Several kernel threads call into functions that use loops like
wait_event_interruptible.  So I need a more forceful kthread_stop.  If
I don't want to continue to use signals.

>> > @@ -91,7 +105,7 @@ static void create_kthread(struct kthrea
>> >
>> >/* We want our own signal handler (we take no signals by default). */
>> >pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
>> > -  create->result = pid;
>> > +  create->result = ERR_PTR(pid);
>> 
>> Ouch.You have a nasty race here.
>> 
>> If kthread runs before kernel_thread returns then setting
>> "create->result = ERR_PTR(pid);" could easily stomp 
>> "create->result = ".
>
> Yes, thanks... Can't understand how I was s stupid!!! thanks...
>
> Damn. We don't need 2 completions! just one.

Yep.  My second patch in this last round implements that.

>   create_kthread:
>
>   pid = kernel_thread(...);
>   if (pid < 0) {
>   create->result = ERR_PTR(pid);
>   complete(create->started);
>   }
>   // else: kthread() will do complete()
>
>   return;

Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL 

Re: [PATCH 3/3] make kthread_stop() scalable

2007-04-14 Thread Oleg Nesterov
On 04/13, Eric W. Biederman wrote:
>
> Oleg Nesterov <[EMAIL PROTECTED]> writes:
> 
> > It's a shame kthread_stop() (may take a while!) runs with a global semaphore
> > held. With this patch kthread() allocates all neccesary data (struct 
> > kthread)
> > on its own stack, globals kthread_stop_xxx are deleted.
> 
> Oleg so fare you patches  have been inspiring.  However..
> 
> > HACKS:
> >
> > - re-use task_struct->set_child_tid to point to "struct kthread"
> 
>task_struct->vfork_done is a better cannidate.
> 
> > - use do_exit() directly to preserve "struct kthread" on stack
> 
> Calling do_exit directly like that is not a hack, as it appears the preferred
> way to exit is to call do_exit, or complete_and_exit.
> 
> While this does improve the scalability and remove a global variable.  It
> also introduces a complex special case in the form of struct kthread.

I can't say I agree. I thought it is good to have a struct which represents
a kernel thread. Actually, I thought we can have __kthread_create() which
returns "struct kthread". May be I am wrong, because yes, ->set_child_tid can
point right to completion, and we can use some TIF flag instead of 
->should_stop.
This needs to update a lot of include/asm/ files.

> It also doesn't solve the biggest problem with the current kthread interface
> in that calling kthread_stop does not cause the code to break out of
> interruptible sleeps.

Hm? kthread_stop() does wake_up_process(), it wakes up TASK_INTERRUPTIBLE tasks.

> > @@ -91,7 +105,7 @@ static void create_kthread(struct kthrea
> >
> > /* We want our own signal handler (we take no signals by default). */
> > pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
> > -   create->result = pid;
> > +   create->result = ERR_PTR(pid);
> 
> Ouch.You have a nasty race here.
> 
> If kthread runs before kernel_thread returns then setting
> "create->result = ERR_PTR(pid);" could easily stomp 
> "create->result = ".

Yes, thanks... Can't understand how I was s stupid!!! thanks...

Damn. We don't need 2 completions! just one.

create_kthread:

pid = kernel_thread(...);
if (pid < 0) {
create->result = ERR_PTR(pid);
complete(create->started);
}
// else: kthread() will do complete()

return;

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] make kthread_stop() scalable

2007-04-14 Thread Oleg Nesterov
On 04/13, Eric W. Biederman wrote:

 Oleg Nesterov [EMAIL PROTECTED] writes:
 
  It's a shame kthread_stop() (may take a while!) runs with a global semaphore
  held. With this patch kthread() allocates all neccesary data (struct 
  kthread)
  on its own stack, globals kthread_stop_xxx are deleted.
 
 Oleg so fare you patches  have been inspiring.  However..
 
  HACKS:
 
  - re-use task_struct-set_child_tid to point to struct kthread
 
task_struct-vfork_done is a better cannidate.
 
  - use do_exit() directly to preserve struct kthread on stack
 
 Calling do_exit directly like that is not a hack, as it appears the preferred
 way to exit is to call do_exit, or complete_and_exit.
 
 While this does improve the scalability and remove a global variable.  It
 also introduces a complex special case in the form of struct kthread.

I can't say I agree. I thought it is good to have a struct which represents
a kernel thread. Actually, I thought we can have __kthread_create() which
returns struct kthread. May be I am wrong, because yes, -set_child_tid can
point right to completion, and we can use some TIF flag instead of 
-should_stop.
This needs to update a lot of include/asm/ files.

 It also doesn't solve the biggest problem with the current kthread interface
 in that calling kthread_stop does not cause the code to break out of
 interruptible sleeps.

Hm? kthread_stop() does wake_up_process(), it wakes up TASK_INTERRUPTIBLE tasks.

  @@ -91,7 +105,7 @@ static void create_kthread(struct kthrea
 
  /* We want our own signal handler (we take no signals by default). */
  pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
  -   create-result = pid;
  +   create-result = ERR_PTR(pid);
 
 Ouch.You have a nasty race here.
 
 If kthread runs before kernel_thread returns then setting
 create-result = ERR_PTR(pid); could easily stomp 
 create-result = self.

Yes, thanks... Can't understand how I was s stupid!!! thanks...

Damn. We don't need 2 completions! just one.

create_kthread:

pid = kernel_thread(...);
if (pid  0) {
create-result = ERR_PTR(pid);
complete(create-started);
}
// else: kthread() will do complete()

return;

Oleg.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] make kthread_stop() scalable

2007-04-14 Thread Eric W. Biederman
Oleg Nesterov [EMAIL PROTECTED] writes:

 On 04/13, Eric W. Biederman wrote:

 Oleg Nesterov [EMAIL PROTECTED] writes:
 
  It's a shame kthread_stop() (may take a while!) runs with a global 
  semaphore
  held. With this patch kthread() allocates all neccesary data (struct
 kthread)
  on its own stack, globals kthread_stop_xxx are deleted.
 
 Oleg so fare you patches  have been inspiring.  However..
 
  HACKS:
 
 - re-use task_struct-set_child_tid to point to struct kthread
 
   task_struct-vfork_done is a better cannidate.
 
 - use do_exit() directly to preserve struct kthread on stack
 
 Calling do_exit directly like that is not a hack, as it appears the preferred
 way to exit is to call do_exit, or complete_and_exit.
 
 While this does improve the scalability and remove a global variable.  It
 also introduces a complex special case in the form of struct kthread.

 I can't say I agree. I thought it is good to have a struct which represents
 a kernel thread. Actually, I thought we can have __kthread_create() which
 returns struct kthread. May be I am wrong, because yes, -set_child_tid can
 point right to completion, and we can use some TIF flag instead of
 -should_stop.

 This needs to update a lot of include/asm/ files.

Yes it does.

This is where I was going beyond what you were doing.  I needed a flag to say
that this a kthread that is stopping to test in recalc_sigpending.  To be 
certain
of terminating interruptible sleeps.  I could not get at your struct kthread
in that case.

If it wasn't for the wait_event_interruptible thing I likely would
have just thrown a union in struct task_struct.

I also got lucky in that vfork_done is designed to point a completion
just where I need it (when a task exits).  The name is now a little
abused but otherwise it does just what I want it to.

 It also doesn't solve the biggest problem with the current kthread interface
 in that calling kthread_stop does not cause the code to break out of
 interruptible sleeps.

 Hm? kthread_stop() does wake_up_process(), it wakes up TASK_INTERRUPTIBLE 
 tasks.

Yes. But if they are looping, unless signal_pending is set it is quite possible
they will go back to sleep.

Take for example:

 #define __wait_event_interruptible(wq, condition, ret)\
 do {  \
   DEFINE_WAIT(__wait);\
   \
   for (;;) {  \
   prepare_to_wait(wq, __wait, TASK_INTERRUPTIBLE);  \
   if (condition)  \
   break;  \
   if (!signal_pending(current)) { \
   schedule(); \
   continue;   \
   }   \
   ret = -ERESTARTSYS; \
   break;  \
   }   \
   finish_wait(wq, __wait);  \
 } while (0)

We don't break out until either condition is true or signal_pending(current)
is true.

Loops that do that are very common in the kernel.  I counted about 500
calls of signal pending in places that otherwise care nothing about signals.
Several kernel threads call into functions that use loops like
wait_event_interruptible.  So I need a more forceful kthread_stop.  If
I don't want to continue to use signals.

  @@ -91,7 +105,7 @@ static void create_kthread(struct kthrea
 
 /* We want our own signal handler (we take no signals by default). */
 pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
  -  create-result = pid;
  +  create-result = ERR_PTR(pid);
 
 Ouch.You have a nasty race here.
 
 If kthread runs before kernel_thread returns then setting
 create-result = ERR_PTR(pid); could easily stomp 
 create-result = self.

 Yes, thanks... Can't understand how I was s stupid!!! thanks...

 Damn. We don't need 2 completions! just one.

Yep.  My second patch in this last round implements that.

   create_kthread:

   pid = kernel_thread(...);
   if (pid  0) {
   create-result = ERR_PTR(pid);
   complete(create-started);
   }
   // else: kthread() will do complete()

   return;

Eric

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] make kthread_stop() scalable

2007-04-14 Thread Oleg Nesterov
On 04/14, Eric W. Biederman wrote:

 This is where I was going beyond what you were doing.  I needed a flag to say
 that this a kthread that is stopping to test in recalc_sigpending.  To be 
 certain
 of terminating interruptible sleeps.  I could not get at your struct kthread
 in that case.
 
 If it wasn't for the wait_event_interruptible thing I likely would
 have just thrown a union in struct task_struct.
 
 I also got lucky in that vfork_done is designed to point a completion
 just where I need it (when a task exits).  The name is now a little
 abused but otherwise it does just what I want it to.
 
  It also doesn't solve the biggest problem with the current kthread 
  interface
  in that calling kthread_stop does not cause the code to break out of
  interruptible sleeps.
 
  Hm? kthread_stop() does wake_up_process(), it wakes up TASK_INTERRUPTIBLE 
  tasks.
 
 Yes. But if they are looping, unless signal_pending is set it is quite 
 possible
 they will go back to sleep.
 
 Take for example:
 
  #define __wait_event_interruptible(wq, condition, ret)  \
  do {
  \
  DEFINE_WAIT(__wait);\
  \
  for (;;) {  \
  prepare_to_wait(wq, __wait, TASK_INTERRUPTIBLE);  \
  if (condition)  \
  break;  \
  if (!signal_pending(current)) { \
  schedule(); \
  continue;   \
  }   \
  ret = -ERESTARTSYS; \
  break;  \
  }   \
  finish_wait(wq, __wait);  \
  } while (0)
 
 We don't break out until either condition is true or signal_pending(current)
 is true.
 
 Loops that do that are very common in the kernel.  I counted about 500
 calls of signal pending in places that otherwise care nothing about signals.
 Several kernel threads call into functions that use loops like
 wait_event_interruptible.  So I need a more forceful kthread_stop.  If
 I don't want to continue to use signals.

Yes, I got it reading your next patches. Ok, probably this change is good.
My question was: do we really want to force a kernel thread to exit if it
waits for event in TASK_INTERRUPTIBLE state? probably yes.

  Yes, thanks... Can't understand how I was s stupid!!! thanks...
 
  Damn. We don't need 2 completions! just one.
 
 Yep.  My second patch in this last round implements that.

Yes, I have read it. It is clearly better then mine, and I think correct.

Oleg.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] make kthread_stop() scalable

2007-04-13 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes:

> It's a shame kthread_stop() (may take a while!) runs with a global semaphore
> held. With this patch kthread() allocates all neccesary data (struct kthread)
> on its own stack, globals kthread_stop_xxx are deleted.

Oleg so fare you patches  have been inspiring.  However..

> HACKS:
>
>   - re-use task_struct->set_child_tid to point to "struct kthread"

 task_struct->vfork_done is a better cannidate.

>   - use do_exit() directly to preserve "struct kthread" on stack

Calling do_exit directly like that is not a hack, as it appears the preferred
way to exit is to call do_exit, or complete_and_exit.

While this does improve the scalability and remove a global variable.  It
also introduces a complex special case in the form of struct kthread.

It also doesn't solve the biggest problem with the current kthread interface
in that calling kthread_stop does not cause the code to break out of
interruptible sleeps.

>  static int kthread(void *_create)
>  {
> - struct kthread_create_info *create = _create;
> - int (*threadfn)(void *data);
> - void *data;
> - int ret = -EINTR;
> + struct kthread self = {
> + .task = current,
> + .err = -EINTR,
> + };
>  
>   /* Copy data: it's on kthread's stack */
> - threadfn = create->threadfn;
> - data = create->data;
> + struct kthread_create_info *create = _create;
> + int (*threadfn)(void *data) = create->threadfn;
> + void *data = create->data;
> +
> + /*
> +  * This should be enough to assure that self is still on
> +  * stack when we enter do_exit()
> +  */
> + set_kthread();
> + create->result = 
>  

> @@ -91,7 +105,7 @@ static void create_kthread(struct kthrea
>  
>   /* We want our own signal handler (we take no signals by default). */
>   pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
> - create->result = pid;
> + create->result = ERR_PTR(pid);

Ouch.You have a nasty race here.

If kthread runs before kernel_thread returns then setting
"create->result = ERR_PTR(pid);" could easily stomp 
"create->result = ".


Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] make kthread_stop() scalable

2007-04-13 Thread Eric W. Biederman
Oleg Nesterov [EMAIL PROTECTED] writes:

 It's a shame kthread_stop() (may take a while!) runs with a global semaphore
 held. With this patch kthread() allocates all neccesary data (struct kthread)
 on its own stack, globals kthread_stop_xxx are deleted.

Oleg so fare you patches  have been inspiring.  However..

 HACKS:

   - re-use task_struct-set_child_tid to point to struct kthread

 task_struct-vfork_done is a better cannidate.

   - use do_exit() directly to preserve struct kthread on stack

Calling do_exit directly like that is not a hack, as it appears the preferred
way to exit is to call do_exit, or complete_and_exit.

While this does improve the scalability and remove a global variable.  It
also introduces a complex special case in the form of struct kthread.

It also doesn't solve the biggest problem with the current kthread interface
in that calling kthread_stop does not cause the code to break out of
interruptible sleeps.

  static int kthread(void *_create)
  {
 - struct kthread_create_info *create = _create;
 - int (*threadfn)(void *data);
 - void *data;
 - int ret = -EINTR;
 + struct kthread self = {
 + .task = current,
 + .err = -EINTR,
 + };
  
   /* Copy data: it's on kthread's stack */
 - threadfn = create-threadfn;
 - data = create-data;
 + struct kthread_create_info *create = _create;
 + int (*threadfn)(void *data) = create-threadfn;
 + void *data = create-data;
 +
 + /*
 +  * This should be enough to assure that self is still on
 +  * stack when we enter do_exit()
 +  */
 + set_kthread(self);
 + create-result = self;
  

 @@ -91,7 +105,7 @@ static void create_kthread(struct kthrea
  
   /* We want our own signal handler (we take no signals by default). */
   pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
 - create-result = pid;
 + create-result = ERR_PTR(pid);

Ouch.You have a nasty race here.

If kthread runs before kernel_thread returns then setting
create-result = ERR_PTR(pid); could easily stomp 
create-result = self.


Eric

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/