Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-11 Thread Petr Mladek
On Wed 2018-06-06 09:49:50, Miroslav Benes wrote:
> On Tue, 5 Jun 2018, Josh Poimboeuf wrote:
> 
> > On Tue, Jun 05, 2018 at 09:17:52AM +0200, Miroslav Benes wrote:
> > > On Mon, 4 Jun 2018, Josh Poimboeuf wrote:
> > > 
> > > > On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> > > > > An administrator may send a fake signal to all remaining blocking 
> > > > > tasks
> > > > > of a running transition by writing to
> > > > > /sys/kernel/livepatch//signal attribute. Let's do it
> > > > > automatically after 10 seconds. The timeout is chosen deliberately. It
> > > > > gives the tasks enough time to transition themselves.
> > > > > 
> > > > > Theoretically, sending it once should be more than enough. Better be 
> > > > > safe
> > > > > than sorry, so send it periodically.
> > > > 
> > > > This is the part I don't understand.  Why do it periodically?
> > > 
> > > I met (rare!) cases when doing it once was not enough due to a race and 
> > > the signal was missed. However involved testcases were really artificial.
> > >  
> > > > Instead, might it make sense to just send the signals once, and if that
> > > > doesn't work, reverse the transition?  Then we could make patching a
> > > > synchronous operation.  But then, it might be remotely possible that the
> > > > reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
> > > > to just leave all these controls in the hands of the user.
> > > 
> > > And there is 'force' option...
> > > 
> > > So given all this, I'd call klp_send_signals() once and then leave it up 
> > > to the user. Would that work for you?
> > 
> > Well, I don't know.  Since the patching process will already need to be
> > managed by user space, what's the benefit of having the kernel doing
> > only this part of it?
> 
> I'm not sure about 'will already need to be managed by user space' part. 
> Sending the fake signal would unblock transition in certain cases "for 
> free". No user interaction is really needed and we can do it 
> automatically. That's why I find it beneficial.

I wonder if several kicks might be necessary to get a kthread patched.

BTW: This approach has a bit different effect in compare with kGraft.
In kGraft, it was enough to force kthread going over an explicitly
defined safe point where it migrated itself. In upstream, the kthread
must get scheduled outside any patched function. The chance might
be a bit random.

To make it clear. I believe that sending signal once makes sense for
sure. And it would make sense to even repeat it.

BTW: If we allow to repeat it, we should reset the counter in
klp_reverse_transition().

Best Regards,
Petr


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-11 Thread Petr Mladek
On Wed 2018-06-06 09:49:50, Miroslav Benes wrote:
> On Tue, 5 Jun 2018, Josh Poimboeuf wrote:
> 
> > On Tue, Jun 05, 2018 at 09:17:52AM +0200, Miroslav Benes wrote:
> > > On Mon, 4 Jun 2018, Josh Poimboeuf wrote:
> > > 
> > > > On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> > > > > An administrator may send a fake signal to all remaining blocking 
> > > > > tasks
> > > > > of a running transition by writing to
> > > > > /sys/kernel/livepatch//signal attribute. Let's do it
> > > > > automatically after 10 seconds. The timeout is chosen deliberately. It
> > > > > gives the tasks enough time to transition themselves.
> > > > > 
> > > > > Theoretically, sending it once should be more than enough. Better be 
> > > > > safe
> > > > > than sorry, so send it periodically.
> > > > 
> > > > This is the part I don't understand.  Why do it periodically?
> > > 
> > > I met (rare!) cases when doing it once was not enough due to a race and 
> > > the signal was missed. However involved testcases were really artificial.
> > >  
> > > > Instead, might it make sense to just send the signals once, and if that
> > > > doesn't work, reverse the transition?  Then we could make patching a
> > > > synchronous operation.  But then, it might be remotely possible that the
> > > > reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
> > > > to just leave all these controls in the hands of the user.
> > > 
> > > And there is 'force' option...
> > > 
> > > So given all this, I'd call klp_send_signals() once and then leave it up 
> > > to the user. Would that work for you?
> > 
> > Well, I don't know.  Since the patching process will already need to be
> > managed by user space, what's the benefit of having the kernel doing
> > only this part of it?
> 
> I'm not sure about 'will already need to be managed by user space' part. 
> Sending the fake signal would unblock transition in certain cases "for 
> free". No user interaction is really needed and we can do it 
> automatically. That's why I find it beneficial.

I wonder if several kicks might be necessary to get a kthread patched.

BTW: This approach has a bit different effect in compare with kGraft.
In kGraft, it was enough to force kthread going over an explicitly
defined safe point where it migrated itself. In upstream, the kthread
must get scheduled outside any patched function. The chance might
be a bit random.

To make it clear. I believe that sending signal once makes sense for
sure. And it would make sense to even repeat it.

BTW: If we allow to repeat it, we should reset the counter in
klp_reverse_transition().

Best Regards,
Petr


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-06 Thread Miroslav Benes
On Tue, 5 Jun 2018, Josh Poimboeuf wrote:

> On Tue, Jun 05, 2018 at 09:17:52AM +0200, Miroslav Benes wrote:
> > On Mon, 4 Jun 2018, Josh Poimboeuf wrote:
> > 
> > > On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> > > > An administrator may send a fake signal to all remaining blocking tasks
> > > > of a running transition by writing to
> > > > /sys/kernel/livepatch//signal attribute. Let's do it
> > > > automatically after 10 seconds. The timeout is chosen deliberately. It
> > > > gives the tasks enough time to transition themselves.
> > > > 
> > > > Theoretically, sending it once should be more than enough. Better be 
> > > > safe
> > > > than sorry, so send it periodically.
> > > 
> > > This is the part I don't understand.  Why do it periodically?
> > 
> > I met (rare!) cases when doing it once was not enough due to a race and 
> > the signal was missed. However involved testcases were really artificial.
> >  
> > > Instead, might it make sense to just send the signals once, and if that
> > > doesn't work, reverse the transition?  Then we could make patching a
> > > synchronous operation.  But then, it might be remotely possible that the
> > > reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
> > > to just leave all these controls in the hands of the user.
> > 
> > And there is 'force' option...
> > 
> > So given all this, I'd call klp_send_signals() once and then leave it up 
> > to the user. Would that work for you?
> 
> Well, I don't know.  Since the patching process will already need to be
> managed by user space, what's the benefit of having the kernel doing
> only this part of it?

I'm not sure about 'will already need to be managed by user space' part. 
Sending the fake signal would unblock transition in certain cases "for 
free". No user interaction is really needed and we can do it 
automatically. That's why I find it beneficial.

If it does not help, then the user must decide what to do next. Reversion 
or force. That's up to them.

Miroslav


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-06 Thread Miroslav Benes
On Tue, 5 Jun 2018, Josh Poimboeuf wrote:

> On Tue, Jun 05, 2018 at 09:17:52AM +0200, Miroslav Benes wrote:
> > On Mon, 4 Jun 2018, Josh Poimboeuf wrote:
> > 
> > > On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> > > > An administrator may send a fake signal to all remaining blocking tasks
> > > > of a running transition by writing to
> > > > /sys/kernel/livepatch//signal attribute. Let's do it
> > > > automatically after 10 seconds. The timeout is chosen deliberately. It
> > > > gives the tasks enough time to transition themselves.
> > > > 
> > > > Theoretically, sending it once should be more than enough. Better be 
> > > > safe
> > > > than sorry, so send it periodically.
> > > 
> > > This is the part I don't understand.  Why do it periodically?
> > 
> > I met (rare!) cases when doing it once was not enough due to a race and 
> > the signal was missed. However involved testcases were really artificial.
> >  
> > > Instead, might it make sense to just send the signals once, and if that
> > > doesn't work, reverse the transition?  Then we could make patching a
> > > synchronous operation.  But then, it might be remotely possible that the
> > > reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
> > > to just leave all these controls in the hands of the user.
> > 
> > And there is 'force' option...
> > 
> > So given all this, I'd call klp_send_signals() once and then leave it up 
> > to the user. Would that work for you?
> 
> Well, I don't know.  Since the patching process will already need to be
> managed by user space, what's the benefit of having the kernel doing
> only this part of it?

I'm not sure about 'will already need to be managed by user space' part. 
Sending the fake signal would unblock transition in certain cases "for 
free". No user interaction is really needed and we can do it 
automatically. That's why I find it beneficial.

If it does not help, then the user must decide what to do next. Reversion 
or force. That's up to them.

Miroslav


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-05 Thread Josh Poimboeuf
On Tue, Jun 05, 2018 at 09:17:52AM +0200, Miroslav Benes wrote:
> On Mon, 4 Jun 2018, Josh Poimboeuf wrote:
> 
> > On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> > > An administrator may send a fake signal to all remaining blocking tasks
> > > of a running transition by writing to
> > > /sys/kernel/livepatch//signal attribute. Let's do it
> > > automatically after 10 seconds. The timeout is chosen deliberately. It
> > > gives the tasks enough time to transition themselves.
> > > 
> > > Theoretically, sending it once should be more than enough. Better be safe
> > > than sorry, so send it periodically.
> > 
> > This is the part I don't understand.  Why do it periodically?
> 
> I met (rare!) cases when doing it once was not enough due to a race and 
> the signal was missed. However involved testcases were really artificial.
>  
> > Instead, might it make sense to just send the signals once, and if that
> > doesn't work, reverse the transition?  Then we could make patching a
> > synchronous operation.  But then, it might be remotely possible that the
> > reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
> > to just leave all these controls in the hands of the user.
> 
> And there is 'force' option...
> 
> So given all this, I'd call klp_send_signals() once and then leave it up 
> to the user. Would that work for you?

Well, I don't know.  Since the patching process will already need to be
managed by user space, what's the benefit of having the kernel doing
only this part of it?

-- 
Josh


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-05 Thread Josh Poimboeuf
On Tue, Jun 05, 2018 at 09:17:52AM +0200, Miroslav Benes wrote:
> On Mon, 4 Jun 2018, Josh Poimboeuf wrote:
> 
> > On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> > > An administrator may send a fake signal to all remaining blocking tasks
> > > of a running transition by writing to
> > > /sys/kernel/livepatch//signal attribute. Let's do it
> > > automatically after 10 seconds. The timeout is chosen deliberately. It
> > > gives the tasks enough time to transition themselves.
> > > 
> > > Theoretically, sending it once should be more than enough. Better be safe
> > > than sorry, so send it periodically.
> > 
> > This is the part I don't understand.  Why do it periodically?
> 
> I met (rare!) cases when doing it once was not enough due to a race and 
> the signal was missed. However involved testcases were really artificial.
>  
> > Instead, might it make sense to just send the signals once, and if that
> > doesn't work, reverse the transition?  Then we could make patching a
> > synchronous operation.  But then, it might be remotely possible that the
> > reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
> > to just leave all these controls in the hands of the user.
> 
> And there is 'force' option...
> 
> So given all this, I'd call klp_send_signals() once and then leave it up 
> to the user. Would that work for you?

Well, I don't know.  Since the patching process will already need to be
managed by user space, what's the benefit of having the kernel doing
only this part of it?

-- 
Josh


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-05 Thread Miroslav Benes
On Mon, 4 Jun 2018, Josh Poimboeuf wrote:

> On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> > An administrator may send a fake signal to all remaining blocking tasks
> > of a running transition by writing to
> > /sys/kernel/livepatch//signal attribute. Let's do it
> > automatically after 10 seconds. The timeout is chosen deliberately. It
> > gives the tasks enough time to transition themselves.
> > 
> > Theoretically, sending it once should be more than enough. Better be safe
> > than sorry, so send it periodically.
> 
> This is the part I don't understand.  Why do it periodically?

I met (rare!) cases when doing it once was not enough due to a race and 
the signal was missed. However involved testcases were really artificial.
 
> Instead, might it make sense to just send the signals once, and if that
> doesn't work, reverse the transition?  Then we could make patching a
> synchronous operation.  But then, it might be remotely possible that the
> reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
> to just leave all these controls in the hands of the user.

And there is 'force' option...

So given all this, I'd call klp_send_signals() once and then leave it up 
to the user. Would that work for you?

> All that said, a few code review comments:
> 
> - AFAICT, it does an 8 second delay instead of a 10 second delay,
>   because

Not that it matters, because it is still wrong, but it is a 9 second 
delay (or I miscounted again).

>   a) try_complete_transition() is first called before there's any delay;
> 
>   b) the preincrement operator used on signals_cnt.
> 
> - I think 15 seconds might be a better default.  I've seen longer
>   patching delays on a system with 100+ CPUs.

Ok, why not. 

> - If a kthread or idle task is sleeping on a patched function, the
>   pr_notice("signaling remaining tasks\n") will be repeated continously.

True.

> - It might be cleaner to do it from the delayed work function
>   (klp_transition_work_fn).

I considered it but then decided to do it in klp_try_complete_transition() 
under 'if (!complete)'. It belongs right there in my opinion.

Thanks,
Miroslav


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-05 Thread Miroslav Benes
On Mon, 4 Jun 2018, Josh Poimboeuf wrote:

> On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> > An administrator may send a fake signal to all remaining blocking tasks
> > of a running transition by writing to
> > /sys/kernel/livepatch//signal attribute. Let's do it
> > automatically after 10 seconds. The timeout is chosen deliberately. It
> > gives the tasks enough time to transition themselves.
> > 
> > Theoretically, sending it once should be more than enough. Better be safe
> > than sorry, so send it periodically.
> 
> This is the part I don't understand.  Why do it periodically?

I met (rare!) cases when doing it once was not enough due to a race and 
the signal was missed. However involved testcases were really artificial.
 
> Instead, might it make sense to just send the signals once, and if that
> doesn't work, reverse the transition?  Then we could make patching a
> synchronous operation.  But then, it might be remotely possible that the
> reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
> to just leave all these controls in the hands of the user.

And there is 'force' option...

So given all this, I'd call klp_send_signals() once and then leave it up 
to the user. Would that work for you?

> All that said, a few code review comments:
> 
> - AFAICT, it does an 8 second delay instead of a 10 second delay,
>   because

Not that it matters, because it is still wrong, but it is a 9 second 
delay (or I miscounted again).

>   a) try_complete_transition() is first called before there's any delay;
> 
>   b) the preincrement operator used on signals_cnt.
> 
> - I think 15 seconds might be a better default.  I've seen longer
>   patching delays on a system with 100+ CPUs.

Ok, why not. 

> - If a kthread or idle task is sleeping on a patched function, the
>   pr_notice("signaling remaining tasks\n") will be repeated continously.

True.

> - It might be cleaner to do it from the delayed work function
>   (klp_transition_work_fn).

I considered it but then decided to do it in klp_try_complete_transition() 
under 'if (!complete)'. It belongs right there in my opinion.

Thanks,
Miroslav


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-04 Thread Josh Poimboeuf
On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> An administrator may send a fake signal to all remaining blocking tasks
> of a running transition by writing to
> /sys/kernel/livepatch//signal attribute. Let's do it
> automatically after 10 seconds. The timeout is chosen deliberately. It
> gives the tasks enough time to transition themselves.
> 
> Theoretically, sending it once should be more than enough. Better be safe
> than sorry, so send it periodically.

This is the part I don't understand.  Why do it periodically?

Instead, might it make sense to just send the signals once, and if that
doesn't work, reverse the transition?  Then we could make patching a
synchronous operation.  But then, it might be remotely possible that the
reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
to just leave all these controls in the hands of the user.

All that said, a few code review comments:

- AFAICT, it does an 8 second delay instead of a 10 second delay,
  because
  
  a) try_complete_transition() is first called before there's any delay;

  b) the preincrement operator used on signals_cnt.

- I think 15 seconds might be a better default.  I've seen longer
  patching delays on a system with 100+ CPUs.

- If a kthread or idle task is sleeping on a patched function, the
  pr_notice("signaling remaining tasks\n") will be repeated continously.

- It might be cleaner to do it from the delayed work function
  (klp_transition_work_fn).

-- 
Josh


Re: [PATCH 1/2] livepatch: Send a fake signal periodically

2018-06-04 Thread Josh Poimboeuf
On Mon, Jun 04, 2018 at 04:16:35PM +0200, Miroslav Benes wrote:
> An administrator may send a fake signal to all remaining blocking tasks
> of a running transition by writing to
> /sys/kernel/livepatch//signal attribute. Let's do it
> automatically after 10 seconds. The timeout is chosen deliberately. It
> gives the tasks enough time to transition themselves.
> 
> Theoretically, sending it once should be more than enough. Better be safe
> than sorry, so send it periodically.

This is the part I don't understand.  Why do it periodically?

Instead, might it make sense to just send the signals once, and if that
doesn't work, reverse the transition?  Then we could make patching a
synchronous operation.  But then, it might be remotely possible that the
reverse operation also stalls (e.g., on a kthread).  So, maybe it's best
to just leave all these controls in the hands of the user.

All that said, a few code review comments:

- AFAICT, it does an 8 second delay instead of a 10 second delay,
  because
  
  a) try_complete_transition() is first called before there's any delay;

  b) the preincrement operator used on signals_cnt.

- I think 15 seconds might be a better default.  I've seen longer
  patching delays on a system with 100+ CPUs.

- If a kthread or idle task is sleeping on a patched function, the
  pr_notice("signaling remaining tasks\n") will be repeated continously.

- It might be cleaner to do it from the delayed work function
  (klp_transition_work_fn).

-- 
Josh