Re: livepatch: change to a per-task consistency model

2016-05-17 Thread Jessica Yu

+++ Josh Poimboeuf [28/04/16 15:44 -0500]:

[snip]


diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt
index 6c43f6e..bee86d0 100644
--- a/Documentation/livepatch/livepatch.txt
+++ b/Documentation/livepatch/livepatch.txt
@@ -72,7 +72,8 @@ example, they add a NULL pointer or a boundary check, fix a race by adding
a missing memory barrier, or add some locking around a critical section.
Most of these changes are self contained and the function presents itself
the same way to the rest of the system. In this case, the functions might
-be updated independently one by one.
+be updated independently one by one.  (This can be done by setting the
+'immediate' flag in the klp_patch struct.)

But there are more complex fixes. For example, a patch might change
ordering of locking in multiple functions at the same time. Or a patch
@@ -86,20 +87,103 @@ or no data are stored in the modified structures at the moment.
The theory about how to apply functions a safe way is rather complex.
The aim is to define a so-called consistency model. It attempts to define
conditions when the new implementation could be used so that the system
-stays consistent. The theory is not yet finished. See the discussion at
-http://thread.gmane.org/gmane.linux.kernel/1823033/focus=1828189
-
-The current consistency model is very simple. It guarantees that either
-the old or the new function is called. But various functions get redirected
-one by one without any synchronization.
-
-In other words, the current implementation _never_ modifies the behavior
-in the middle of the call. It is because it does _not_ rewrite the entire
-function in the memory. Instead, the function gets redirected at the
-very beginning. But this redirection is used immediately even when
-some other functions from the same patch have not been redirected yet.
-
-See also the section "Limitations" below.
+stays consistent.
+
+Livepatch has a consistency model which is a hybrid of kGraft and
+kpatch:  it uses kGraft's per-task consistency and syscall barrier
+switching combined with kpatch's stack trace switching.  There are also
+a number of fallback options which make it quite flexible.
+
+Patches are applied on a per-task basis, when the task is deemed safe to
+switch over.  When a patch is enabled, livepatch enters into a
+transition state where tasks are converging to the patched state.
+Usually this transition state can complete in a few seconds.  The same
+sequence occurs when a patch is disabled, except the tasks converge from
+the patched state to the unpatched state.
+
+An interrupt handler inherits the patched state of the task it
+interrupts.  The same is true for forked tasks: the child inherits the
+patched state of the parent.
+
+Livepatch uses several complementary approaches to determine when it's
+safe to patch tasks:
+
+1. The first and most effective approach is stack checking of sleeping
+   tasks.  If no affected functions are on the stack of a given task,
+   the task is patched.  In most cases this will patch most or all of
+   the tasks on the first try.  Otherwise it'll keep trying
+   periodically.  This option is only available if the architecture has
+   reliable stacks (CONFIG_RELIABLE_STACKTRACE and
+   CONFIG_STACK_VALIDATION).
+
+2. The second approach, if needed, is kernel exit switching.  A
+   task is switched when it returns to user space from a system call, a
+   user space IRQ, or a signal.  It's useful in the following cases:
+
+   a) Patching I/O-bound user tasks which are sleeping on an affected
+  function.  In this case you have to send SIGSTOP and SIGCONT to
+  force it to exit the kernel and be patched.


See below -


+   b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
+  then it will get patched the next time it gets interrupted by an
+  IRQ.
+   c) Applying patches for architectures which don't yet have
+  CONFIG_RELIABLE_STACKTRACE.  In this case you'll have to signal
+  most of the tasks on the system.  However this isn't a complete
+  solution, because there's currently no way to patch kthreads
+  without CONFIG_RELIABLE_STACKTRACE.
+
+   Note: since idle "swapper" tasks don't ever exit the kernel, they
+   instead have a kpatch_patch_task() call in the idle loop which allows
+   them to be patched before the CPU enters the idle state.
+
+3. A third approach (not yet implemented) is planned for the case where
+   a kthread is sleeping on an affected function.  In that case we could
+   kick the kthread with a signal and then try to patch the task from
+   the to-be-patched function's livepatch ftrace handler when it
+   re-enters the function.  This will require
+   CONFIG_RELIABLE_STACKTRACE.
+
+All the above approaches may be skipped by setting the 'immediate' flag
+in the 'klp_patch' struct, which will patch all tasks immediately.  This
+can be useful if the patch doesn't change any function or data
+semantics.  Note that, even with this flag set, it's possible that some
+tasks may still be running with an old version of the function, until
+that function returns.
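
(For illustration, a minimal sketch of how a livepatch module would opt
out of the consistency model with this flag -- the 'objs' array is
assumed to be defined as in the existing livepatch sample module:)

        static struct klp_patch patch = {
                .mod = THIS_MODULE,
                .objs = objs,
                /* skip per-task transitions; only safe if the patch
                 * changes no function or data semantics */
                .immediate = true,
        };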

Re: livepatch: change to a per-task consistency model

2016-05-18 Thread Jiri Kosina
On Tue, 17 May 2016, Jessica Yu wrote:

> What about tasks sleeping on affected functions in uninterruptible sleep 
> (possibly indefinitely)? Since all signals are ignored, we wouldn't be 
> able to patch those tasks in this way, right? Would that be an 
> unsupported case?

I don't think there is any better way out of this situation than 
documenting that the convergence of patching could in such cases take 
quite a lot of time (well, we can pro-actively try to detect this 
situation before the patching actually starts, and warn about the possible 
consequences).

But let's face it, this should be pretty uncommon, because (a) it's not 
realistic for the wait times to be really indefinite (b) the task is 
likely to be in TASK_KILLABLE rather than just plain TASK_UNINTERRUPTIBLE.

-- 
Jiri Kosina
SUSE Labs


Re: livepatch: change to a per-task consistency model

2016-05-18 Thread Josh Poimboeuf
On Wed, May 18, 2016 at 10:16:22AM +0200, Jiri Kosina wrote:
> On Tue, 17 May 2016, Jessica Yu wrote:
> 
> > What about tasks sleeping on affected functions in uninterruptible sleep 
> > (possibly indefinitely)? Since all signals are ignored, we wouldn't be 
> > able to patch those tasks in this way, right? Would that be an 
> > unsupported case?
> 
> I don't think there is any better way out of this situation than 
> documenting that the convergence of patching could in such cases take 
> quite a lot of time (well, we can pro-actively try to detect this 
> situation before the patching actually starts, and warn about the possible 
> consequences).
> 
> But let's face it, this should be pretty uncommon, because (a) it's not 
> realistic for the wait times to be really indefinite (b) the task is 
> likely to be in TASK_KILLABLE rather than just plain TASK_UNINTERRUPTIBLE.

Yeah, I think this situation -- a task sleeping on an affected function
in uninterruptible state for a long period of time -- would be
exceedingly rare and not something we need to worry about for now.

-- 
Josh

Re: livepatch: change to a per-task consistency model

2016-05-18 Thread Jiri Kosina
On Wed, 18 May 2016, Josh Poimboeuf wrote:

> Yeah, I think this situation -- a task sleeping on an affected function 
> in uninterruptible state for a long period of time -- would be 
> exceedingly rare and not something we need to worry about for now.

Plus in case task'd be in TASK_UNINTERRUPTIBLE for more than 120s, hung 
task detector would trigger anyway.

-- 
Jiri Kosina
SUSE Labs


RE: livepatch: change to a per-task consistency model

2016-05-23 Thread David Laight
From: Jiri Kosina
> Sent: 18 May 2016 21:23
> On Wed, 18 May 2016, Josh Poimboeuf wrote:
> 
> > Yeah, I think this situation -- a task sleeping on an affected function
> > in uninterruptible state for a long period of time -- would be
> > exceedingly rare and not something we need to worry about for now.
> 
> Plus in case task'd be in TASK_UNINTERRUPTIBLE for more than 120s, hung
> task detector would trigger anyway.

Related, please can we have a flag for the sleep and/or process so that
an uninterruptible sleep doesn't trigger the 'hung task' detector
and also stops the process counting towards the 'load average'.

In particular some kernel threads are not signalable, and do not
want to be woken by signals (they exit on a specific request).

David


RE: livepatch: change to a per-task consistency model

2016-05-23 Thread Jiri Kosina
On Mon, 23 May 2016, David Laight wrote:

> Related, please can we have a flag for the sleep and/or process so that
> an uninterruptible sleep doesn't trigger the 'hung task' detector

TASK_KILLABLE

> and also stops the process counting towards the 'load average'.

TASK_NOLOAD
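
For example, a kthread that wants an uninterruptible sleep which neither
counts toward the load average nor matches the hung task detector's
check can use TASK_IDLE (TASK_UNINTERRUPTIBLE | TASK_NOLOAD), roughly:

        set_current_state(TASK_IDLE);   /* uninterruptible, but not in loadavg */
        schedule();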

-- 
Jiri Kosina
SUSE Labs


RE: livepatch: change to a per-task consistency model

2016-05-24 Thread David Laight
From: Jiri Kosina 
> Sent: 23 May 2016 19:45
> > Related, please can we have a flag for the sleep and/or process so that
> > an uninterruptible sleep doesn't trigger the 'hung task' detector
> 
> TASK_KILLABLE

Not sure that does what I want.
It appears to allow some 'kill' actions to wake the process.
I'm not sure I've looked at the 'hung task' code since 2007.

> > and also stops the process counting towards the 'load average'.
> 
> TASK_NOLOAD

Ah, that was added in May 2015.
Not surprising I didn't know about it.

I'll leave the code doing:
  set_current_state(signal_pending(current) ? TASK_UNINTERRUPTIBLE : TASK_INTERRUPTIBLE);
for a while longer.

David


RE: livepatch: change to a per-task consistency model

2016-05-24 Thread Jiri Kosina
On Tue, 24 May 2016, David Laight wrote:

> > > Related, please can we have a flag for the sleep and/or process so that
> > > an uninterruptible sleep doesn't trigger the 'hung task' detector
> > 
> > TASK_KILLABLE
> 
> Not sure that does what I want.
> It appears to allow some 'kill' actions to wake the process.
> I'm sure I've looked at the 'hung task' code since 2007.

The trick is the 

if (t->state == TASK_UNINTERRUPTIBLE)

test in check_hung_uninterruptible_tasks(). That makes sure that 
TASK_KILLABLE tasks (e.g. waiting on NFS I/O, but not limited only to 
that; feel free to set it wherever you need) are skipped. Please note 
that TASK_KILLABLE is actually a _mask_ that includes TASK_UNINTERRUPTIBLE 
as well; therefore the '==' test skips such tasks.
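
For reference, the relevant pieces (from that era's include/linux/sched.h
and kernel/hung_task.c) look roughly like this:

        #define TASK_UNINTERRUPTIBLE    2
        #define TASK_KILLABLE           (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

        /* in check_hung_uninterruptible_tasks(): only a plain
         * uninterruptible sleep matches, killable sleepers are skipped */
        if (t->state == TASK_UNINTERRUPTIBLE)
                check_hung_task(t, timeout);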

-- 
Jiri Kosina
SUSE Labs


[PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-08 Thread Josh Poimboeuf
Change livepatch to use a basic per-task consistency model.  This is the
foundation which will eventually enable us to patch those ~10% of
security patches which change function or data semantics.  This is the
biggest remaining piece needed to make livepatch more generally useful.

This code stems from the design proposal made by Vojtech [1] in November
2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
consistency and syscall barrier switching combined with kpatch's stack
trace switching.  There are also a number of fallback options which make
it quite flexible.

Patches are applied on a per-task basis, when the task is deemed safe to
switch over.  When a patch is enabled, livepatch enters into a
transition state where tasks are converging to the patched state.
Usually this transition state can complete in a few seconds.  The same
sequence occurs when a patch is disabled, except the tasks converge from
the patched state to the unpatched state.

An interrupt handler inherits the patched state of the task it
interrupts.  The same is true for forked tasks: the child inherits the
patched state of the parent.

Livepatch uses several complementary approaches to determine when it's
safe to patch tasks:

1. The first and most effective approach is stack checking of sleeping
   tasks.  If no affected functions are on the stack of a given task,
   the task is patched.  In most cases this will patch most or all of
   the tasks on the first try.  Otherwise it'll keep trying
   periodically.  This option is only available if the architecture has
   reliable stacks (HAVE_RELIABLE_STACKTRACE).

2. The second approach, if needed, is kernel exit switching.  A
   task is switched when it returns to user space from a system call, a
   user space IRQ, or a signal.  It's useful in the following cases:

   a) Patching I/O-bound user tasks which are sleeping on an affected
  function.  In this case you have to send SIGSTOP and SIGCONT to
  force it to exit the kernel and be patched.
   b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
  then it will get patched the next time it gets interrupted by an
  IRQ.
   c) In the future it could be useful for applying patches for
  architectures which don't yet have HAVE_RELIABLE_STACKTRACE.  In
  this case you would have to signal most of the tasks on the
  system.  However this isn't supported yet because there's
  currently no way to patch kthreads without
  HAVE_RELIABLE_STACKTRACE.

3. For idle "swapper" tasks, since they don't ever exit the kernel, they
   instead have a klp_update_patch_state() call in the idle loop which
   allows them to be patched before the CPU enters the idle state.

   (Note there's not yet such an approach for kthreads.)
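
   The idle-loop hook mentioned in point 3 looks roughly like this
   (condensed from the do_idle() hunk quoted later in this thread):

        /* kernel/sched/idle.c, tail of do_idle() -- sketch */
        sched_ttwu_pending();
        schedule_preempt_disabled();

        if (unlikely(klp_patch_pending(current)))
                klp_update_patch_state(current);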

All the above approaches may be skipped by setting the 'immediate' flag
in the 'klp_patch' struct, which will disable per-task consistency and
patch all tasks immediately.  This can be useful if the patch doesn't
change any function or data semantics.  Note that, even with this flag
set, it's possible that some tasks may still be running with an old
version of the function, until that function returns.

There's also an 'immediate' flag in the 'klp_func' struct which allows
you to specify that certain functions in the patch can be applied
without per-task consistency.  This might be useful if you want to patch
a common function like schedule(), and the function change doesn't need
consistency but the rest of the patch does.
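
As an illustration (the function names here are hypothetical), a patch
mixing the two modes would mark only that one function:

        static struct klp_func funcs[] = {
                {
                        .old_name = "schedule",
                        .new_func = livepatch_schedule,
                        /* this func alone skips per-task consistency */
                        .immediate = true,
                },
                {
                        .old_name = "some_other_func",
                        .new_func = livepatch_some_other_func,
                },
                { }
        };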

For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user
must set patch->immediate which causes all tasks to be patched
immediately.  This option should be used with care, only when the patch
doesn't change any function or data semantics.

In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE
may be allowed to use per-task consistency if we can come up with
another way to patch kthreads.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition.  Only a single patch (the topmost patch on the stack)
can be in transition at a given time.  A patch can remain in transition
indefinitely, if any of the tasks are stuck in the initial patch state.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress.  Then all the tasks will attempt to
converge back to the original patch state.

[1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz

Signed-off-by: Josh Poimboeuf 
---
 Documentation/ABI/testing/sysfs-kernel-livepatch |   8 +
 Documentation/livepatch/livepatch.txt            | 127 +-
 include/linux/init_task.h                        |   9 +
 include/linux/livepatch.h                        |  40 +-
 include/linux/sched.h                            |   3 +
 kernel/fork.c                                    |   3 +
 kernel/livepatch/Makefile                        |   2 +-
 kernel/livepatc

[PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-01-19 Thread Josh Poimboeuf
Change livepatch to use a basic per-task consistency model.  This is the
foundation which will eventually enable us to patch those ~10% of
security patches which change function or data semantics.  This is the
biggest remaining piece needed to make livepatch more generally useful.

This code stems from the design proposal made by Vojtech [1] in November
2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
consistency and syscall barrier switching combined with kpatch's stack
trace switching.  There are also a number of fallback options which make
it quite flexible.

Patches are applied on a per-task basis, when the task is deemed safe to
switch over.  When a patch is enabled, livepatch enters into a
transition state where tasks are converging to the patched state.
Usually this transition state can complete in a few seconds.  The same
sequence occurs when a patch is disabled, except the tasks converge from
the patched state to the unpatched state.

An interrupt handler inherits the patched state of the task it
interrupts.  The same is true for forked tasks: the child inherits the
patched state of the parent.

Livepatch uses several complementary approaches to determine when it's
safe to patch tasks:

1. The first and most effective approach is stack checking of sleeping
   tasks.  If no affected functions are on the stack of a given task,
   the task is patched.  In most cases this will patch most or all of
   the tasks on the first try.  Otherwise it'll keep trying
   periodically.  This option is only available if the architecture has
   reliable stacks (HAVE_RELIABLE_STACKTRACE).

2. The second approach, if needed, is kernel exit switching.  A
   task is switched when it returns to user space from a system call, a
   user space IRQ, or a signal.  It's useful in the following cases:

   a) Patching I/O-bound user tasks which are sleeping on an affected
  function.  In this case you have to send SIGSTOP and SIGCONT to
  force it to exit the kernel and be patched.
   b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
  then it will get patched the next time it gets interrupted by an
  IRQ.
   c) In the future it could be useful for applying patches for
  architectures which don't yet have HAVE_RELIABLE_STACKTRACE.  In
  this case you would have to signal most of the tasks on the
  system.  However this isn't supported yet because there's
  currently no way to patch kthreads without
  HAVE_RELIABLE_STACKTRACE.

3. For idle "swapper" tasks, since they don't ever exit the kernel, they
   instead have a klp_update_patch_state() call in the idle loop which
   allows them to be patched before the CPU enters the idle state.

   (Note there's not yet such an approach for kthreads.)

All the above approaches may be skipped by setting the 'immediate' flag
in the 'klp_patch' struct, which will disable per-task consistency and
patch all tasks immediately.  This can be useful if the patch doesn't
change any function or data semantics.  Note that, even with this flag
set, it's possible that some tasks may still be running with an old
version of the function, until that function returns.

There's also an 'immediate' flag in the 'klp_func' struct which allows
you to specify that certain functions in the patch can be applied
without per-task consistency.  This might be useful if you want to patch
a common function like schedule(), and the function change doesn't need
consistency but the rest of the patch does.

For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user
must set patch->immediate which causes all tasks to be patched
immediately.  This option should be used with care, only when the patch
doesn't change any function or data semantics.

In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE
may be allowed to use per-task consistency if we can come up with
another way to patch kthreads.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition.  Only a single patch (the topmost patch on the stack)
can be in transition at a given time.  A patch can remain in transition
indefinitely, if any of the tasks are stuck in the initial patch state.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress.  Then all the tasks will attempt to
converge back to the original patch state.

[1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz

Signed-off-by: Josh Poimboeuf 
---
 Documentation/ABI/testing/sysfs-kernel-livepatch |   8 +
 Documentation/livepatch/livepatch.txt            | 158 ++-
 include/linux/init_task.h                        |   9 +
 include/linux/livepatch.h                        |  42 +-
 include/linux/sched.h                            |   3 +
 kernel/fork.c                                    |   3 +
 kernel/livepatch/Makefile                        |   2 +-
 kernel/livepat

[PATCH v5 13/15] livepatch: change to a per-task consistency model

2017-02-13 Thread Josh Poimboeuf
Change livepatch to use a basic per-task consistency model.  This is the
foundation which will eventually enable us to patch those ~10% of
security patches which change function or data semantics.  This is the
biggest remaining piece needed to make livepatch more generally useful.

This code stems from the design proposal made by Vojtech [1] in November
2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
consistency and syscall barrier switching combined with kpatch's stack
trace switching.  There are also a number of fallback options which make
it quite flexible.

Patches are applied on a per-task basis, when the task is deemed safe to
switch over.  When a patch is enabled, livepatch enters into a
transition state where tasks are converging to the patched state.
Usually this transition state can complete in a few seconds.  The same
sequence occurs when a patch is disabled, except the tasks converge from
the patched state to the unpatched state.

An interrupt handler inherits the patched state of the task it
interrupts.  The same is true for forked tasks: the child inherits the
patched state of the parent.

Livepatch uses several complementary approaches to determine when it's
safe to patch tasks:

1. The first and most effective approach is stack checking of sleeping
   tasks.  If no affected functions are on the stack of a given task,
   the task is patched.  In most cases this will patch most or all of
   the tasks on the first try.  Otherwise it'll keep trying
   periodically.  This option is only available if the architecture has
   reliable stacks (HAVE_RELIABLE_STACKTRACE).

2. The second approach, if needed, is kernel exit switching.  A
   task is switched when it returns to user space from a system call, a
   user space IRQ, or a signal.  It's useful in the following cases:

   a) Patching I/O-bound user tasks which are sleeping on an affected
  function.  In this case you have to send SIGSTOP and SIGCONT to
  force it to exit the kernel and be patched.
   b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
  then it will get patched the next time it gets interrupted by an
  IRQ.
   c) In the future it could be useful for applying patches for
  architectures which don't yet have HAVE_RELIABLE_STACKTRACE.  In
  this case you would have to signal most of the tasks on the
  system.  However this isn't supported yet because there's
  currently no way to patch kthreads without
  HAVE_RELIABLE_STACKTRACE.

3. For idle "swapper" tasks, since they don't ever exit the kernel, they
   instead have a klp_update_patch_state() call in the idle loop which
   allows them to be patched before the CPU enters the idle state.

   (Note there's not yet such an approach for kthreads.)

All the above approaches may be skipped by setting the 'immediate' flag
in the 'klp_patch' struct, which will disable per-task consistency and
patch all tasks immediately.  This can be useful if the patch doesn't
change any function or data semantics.  Note that, even with this flag
set, it's possible that some tasks may still be running with an old
version of the function, until that function returns.

There's also an 'immediate' flag in the 'klp_func' struct which allows
you to specify that certain functions in the patch can be applied
without per-task consistency.  This might be useful if you want to patch
a common function like schedule(), and the function change doesn't need
consistency but the rest of the patch does.

For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user
must set patch->immediate which causes all tasks to be patched
immediately.  This option should be used with care, only when the patch
doesn't change any function or data semantics.

In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE
may be allowed to use per-task consistency if we can come up with
another way to patch kthreads.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition.  Only a single patch (the topmost patch on the stack)
can be in transition at a given time.  A patch can remain in transition
indefinitely, if any of the tasks are stuck in the initial patch state.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress.  Then all the tasks will attempt to
converge back to the original patch state.

[1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz

Signed-off-by: Josh Poimboeuf 
---
 Documentation/ABI/testing/sysfs-kernel-livepatch |   8 +
 Documentation/livepatch/livepatch.txt            | 186 +++-
 include/linux/init_task.h                        |   9 +
 include/linux/livepatch.h                        |  42 +-
 include/linux/sched.h                            |   3 +
 kernel/fork.c                                    |   3 +
 kernel/livepatch/Makefile                        |   2 +-
 kernel/livepa

[RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-04-28 Thread Josh Poimboeuf
Change livepatch to use a basic per-task consistency model.  This is the
foundation which will eventually enable us to patch those ~10% of
security patches which change function or data semantics.  This is the
biggest remaining piece needed to make livepatch more generally useful.

This code stems from the design proposal made by Vojtech [1] in November
2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
consistency and syscall barrier switching combined with kpatch's stack
trace switching.  There are also a number of fallback options which make
it quite flexible.

Patches are applied on a per-task basis, when the task is deemed safe to
switch over.  When a patch is enabled, livepatch enters into a
transition state where tasks are converging to the patched state.
Usually this transition state can complete in a few seconds.  The same
sequence occurs when a patch is disabled, except the tasks converge from
the patched state to the unpatched state.

An interrupt handler inherits the patched state of the task it
interrupts.  The same is true for forked tasks: the child inherits the
patched state of the parent.

Livepatch uses several complementary approaches to determine when it's
safe to patch tasks:

1. The first and most effective approach is stack checking of sleeping
   tasks.  If no affected functions are on the stack of a given task,
   the task is patched.  In most cases this will patch most or all of
   the tasks on the first try.  Otherwise it'll keep trying
   periodically.  This option is only available if the architecture has
   reliable stacks (CONFIG_RELIABLE_STACKTRACE and
   CONFIG_STACK_VALIDATION).

2. The second approach, if needed, is kernel exit switching.  A
   task is switched when it returns to user space from a system call, a
   user space IRQ, or a signal.  It's useful in the following cases:

   a) Patching I/O-bound user tasks which are sleeping on an affected
  function.  In this case you have to send SIGSTOP and SIGCONT to
  force it to exit the kernel and be patched.
   b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
  then it will get patched the next time it gets interrupted by an
  IRQ.
   c) Applying patches for architectures which don't yet have
  CONFIG_RELIABLE_STACKTRACE.  In this case you'll have to signal
  most of the tasks on the system.  However this isn't a complete
  solution, because there's currently no way to patch kthreads
  without CONFIG_RELIABLE_STACKTRACE.

   Note: since idle "swapper" tasks don't ever exit the kernel, they
   instead have a kpatch_patch_task() call in the idle loop which allows
   them to be patched before the CPU enters the idle state.

3. A third approach (not yet implemented) is planned for the case where
   a kthread is sleeping on an affected function.  In that case we could
   kick the kthread with a signal and then try to patch the task from
   the to-be-patched function's livepatch ftrace handler when it
   re-enters the function.  This will require
   CONFIG_RELIABLE_STACKTRACE.

All the above approaches may be skipped by setting the 'immediate' flag
in the 'klp_patch' struct, which will patch all tasks immediately.  This
can be useful if the patch doesn't change any function or data
semantics.  Note that, even with this flag set, it's possible that some
tasks may still be running with an old version of the function, until
that function returns.

There's also an 'immediate' flag in the 'klp_func' struct which allows
you to specify that certain functions in the patch can be applied
without per-task consistency.  This might be useful if you want to patch
a common function like schedule(), and the function change doesn't need
consistency but the rest of the patch does.

For architectures which don't have CONFIG_RELIABLE_STACKTRACE, there
are two options:

a) the user can set the patch->immediate flag which causes all tasks to
   be patched immediately.  This option should be used with care, only
   when the patch doesn't change any function or data semantics; or

b) use the kernel exit switching approach (this is the default).
   Note the patching will never complete because there's currently no
   way to patch kthreads without CONFIG_RELIABLE_STACKTRACE.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition.  Only a single patch (the topmost patch on the stack)
can be in transition at a given time.  A patch can remain in transition
indefinitely, if any of the tasks are stuck in the initial patch state.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress.  Then all the tasks will attempt to
converge back to the original patch state.

[1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz

Signed-off-by: Josh Poimboeuf 
---
 Documentation/ABI/testing/sysfs-kernel-livepatch |   8 +
 Documentation/livepatch/livepatch.txt   

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-20 Thread Petr Mladek
On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> 
> Signed-off-by: Josh Poimboeuf 
> ---
> diff --git a/Documentation/livepatch/livepatch.txt 
> b/Documentation/livepatch/livepatch.txt
> index 6c43f6e..f87e742 100644
> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt

I like the description.

Just a note that we will also need to review the section about
limitations. But I am not sure that we want to do it in this patch.
It might open a long discussion on its own.

> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 1a5a93c..8e06fe5 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -28,18 +28,40 @@
>  
>  #include 
>  
> +/* task patch states */
> +#define KLP_UNDEFINED -1
> +#define KLP_UNPATCHED 0
> +#define KLP_PATCHED   1
> +
>  /**
>   * struct klp_func - function structure for live patching
>   * @old_name:name of the function to be patched
>   * @new_func:pointer to the patched function code
>   * @old_sympos: a hint indicating which symbol position the old function
>   *   can be found (optional)
> + * @immediate:  patch the func immediately, bypassing backtrace safety checks

There are more checks possible. I would use the same description
as for klp_object.


>   * @old_addr:the address of the function being patched
>   * @kobj:kobject for sysfs resources
>   * @stack_node:  list node for klp_ops func_stack list
>   * @old_size:size of the old function
>   * @new_size:size of the new function
>   * @patched: the func has been added to the klp_ops list
> + * @transition:  the func is currently being applied or reverted
> + *
> @@ -86,6 +110,7 @@ struct klp_object {
>   * struct klp_patch - patch structure for live patching
>   * @mod: reference to the live patch module
>   * @objs:object entries for kernel objects to be patched
> + * @immediate:  patch all funcs immediately, bypassing safety mechanisms
>   * @list:list node for global list of registered patches
>   * @kobj:kobject for sysfs resources
>   * @enabled: the patch is enabled (but operation may be incomplete)

[...]

> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index fc160c6..22c0c01 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -424,7 +477,10 @@ static ssize_t enabled_store(struct kobject *kobj, 
> struct kobj_attribute *attr,
>   goto err;
>   }
>  
> - if (enabled) {
> + if (patch == klp_transition_patch) {
> + klp_reverse_transition();
> + mod_delayed_work(system_wq, &klp_transition_work, 0);

I would put this mod_delayed_work() into klp_reverse_transition().
Also I would put that schedule_delayed_work() into
klp_try_complete_transition().

If I did not miss anything, it will allow us to move the
klp_transition_work code to transition.c, where it logically
belongs.
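
I.e. roughly (just a sketch of the suggestion), enabled_store() would
then become:

        if (patch == klp_transition_patch) {
                /* klp_reverse_transition() would now reschedule
                 * klp_transition_work internally */
                klp_reverse_transition();
        } else if (enabled) {
                ret = __klp_enable_patch(patch);
                if (ret)
                        goto err;
        }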

> + } else if (enabled) {
>   ret = __klp_enable_patch(patch);
>   if (ret)
>   goto err;

[...]

> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 5efa262..e79ebb5 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include "patch.h"
> +#include "transition.h"
>  
>  static LIST_HEAD(klp_ops);
>  
> @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>  {
>   struct klp_ops *ops;
>   struct klp_func *func;
> + int patch_state;
>  
>   ops = container_of(fops, struct klp_ops, fops);
>  
>   rcu_read_lock();
> +
>   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - if (WARN_ON_ONCE(!func))
> +
> + if (!func)
>   goto unlock;

Why did you remove the WARN_ON_ONCE(), please?

We still add the function on the stack before registering
the ftrace handler. Also we unregister the ftrace handler
before removing the last entry from the stack.

AFAIK, unregister_ftrace_function() calls rcu_synchronize()
to make sure that no-one is inside the handler once finished.
Mirek knows more about it.

If this is not true, we have a problem. For example,
we call kfree(ops) after unregister_ftrace_function();
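
For reference, the unpatch path in question looks roughly like this
(a sketch of klp_unpatch_func(); treat the details as approximate):

        if (list_is_singular(&ops->func_stack)) {
                /* waits for in-flight handlers before ops can go away */
                WARN_ON(unregister_ftrace_function(&ops->fops));
                WARN_ON(ftrace_set_filter_ip(&ops->fops, func->old_addr, 1, 0));

                list_del_rcu(&func->stack_node);
                list_del(&ops->node);
                kfree(ops);
        } else {
                list_del_rcu(&func->stack_node);
        }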

BTW: I thought that this change was really needed because of
klp_try_complete_transition(). But I think that the WARN
could and should stay after all. See below.


> + /*
> +  * Enforce the order of the ops->func_stack and func->transition reads.
> +  * The corresp

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-21 Thread Josh Poimboeuf
On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > 
> > Signed-off-by: Josh Poimboeuf 
> > ---
> > diff --git a/Documentation/livepatch/livepatch.txt 
> > b/Documentation/livepatch/livepatch.txt
> > index 6c43f6e..f87e742 100644
> > --- a/Documentation/livepatch/livepatch.txt
> > +++ b/Documentation/livepatch/livepatch.txt
> 
> I like the description.
> 
> Just a note that we will also need to review the section about
> limitations. But I am not sure that we want to do it in this patch.
> It might open a long discussion on its own.
> 
> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 1a5a93c..8e06fe5 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -28,18 +28,40 @@
> >  
> >  #include 
> >  
> > +/* task patch states */
> > +#define KLP_UNDEFINED  -1
> > +#define KLP_UNPATCHED   0
> > +#define KLP_PATCHED 1
> > +
> >  /**
> >   * struct klp_func - function structure for live patching
> >   * @old_name:  name of the function to be patched
> >   * @new_func:  pointer to the patched function code
> >   * @old_sympos: a hint indicating which symbol position the old function
> >   * can be found (optional)
> > + * @immediate:  patch the func immediately, bypassing backtrace safety 
> > checks
> 
> There are more checks possible. I would use the same description
> as for klp_object.

Agreed.

> >   * @old_addr:  the address of the function being patched
> >   * @kobj:  kobject for sysfs resources
> >   * @stack_node:list node for klp_ops func_stack list
> >   * @old_size:  size of the old function
> >   * @new_size:  size of the new function
> >   * @patched:   the func has been added to the klp_ops list
> > + * @transition:the func is currently being applied or reverted
> > + *
> > @@ -86,6 +110,7 @@ struct klp_object {
> >   * struct klp_patch - patch structure for live patching
> >   * @mod:   reference to the live patch module
> >   * @objs:  object entries for kernel objects to be patched
> > + * @immediate:  patch all funcs immediately, bypassing safety mechanisms
> >   * @list:  list node for global list of registered patches
> >   * @kobj:  kobject for sysfs resources
> >   * @enabled:   the patch is enabled (but operation may be incomplete)
> 
> [...]
> 
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index fc160c6..22c0c01 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -424,7 +477,10 @@ static ssize_t enabled_store(struct kobject *kobj, 
> > struct kobj_attribute *attr,
> > goto err;
> > }
> >  
> > -   if (enabled) {
> > +   if (patch == klp_transition_patch) {
> > +   klp_reverse_transition();
> > +   mod_delayed_work(system_wq, &klp_transition_work, 0);
> 
> I would put this mod_delayed_work() into klp_reverse_transition().
> Also I would put that schedule_delayed_work() into
> klp_try_complete_transition().
> 
> If I did not miss anything, it will allow to move the
> klp_transition_work code to transition.c where it logically
> belongs.

Makes sense, I'll see if I can move all the klp_transition_work code to
transition.c.

> > +   } else if (enabled) {
> > ret = __klp_enable_patch(patch);
> > if (ret)
> > goto err;
> 
> [...]
> 
> > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > index 5efa262..e79ebb5 100644
> > --- a/kernel/livepatch/patch.c
> > +++ b/kernel/livepatch/patch.c
> > @@ -29,6 +29,7 @@
> >  #include 
> >  #include 
> >  #include "patch.h"
> > +#include "transition.h"
> >  
> >  static LIST_HEAD(klp_ops);
> >  
> > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  {
> > struct klp_ops *ops;
> > struct klp_func *func;
> > +   int patch_state;
> >  
> > ops = container_of(fops, struct klp_ops, fops);
> >  
> > rcu_read_lock();
> > +
> > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >   stack_node);
> > -   if (WARN_ON_ONCE(!func))
> > +
> > +   if (!func)
> > goto unlock;
> 
> Why do you removed the WARN_ON_ONCE(), please?
> 
> We still add the function on the stack before registering
> the ftrace handler. Also we unregister the ftrace handler
> before removing the the last entry from the stack.
> 
> AFAIK, unregister_ftrace_function() calls rcu_synchronize()'
> to make sure that no-one is inside the handler once finished.
> Mirek knows more about it.

Hm, this is news to me.  Mirek, please share :-)

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-22 Thread Petr Mladek
On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is the
> > > foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is the
> > > biggest remaining piece needed to make livepatch more generally useful.
> > > 
> > > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > > 
> > > --- /dev/null
> > > +++ b/kernel/livepatch/transition.c
> > > +/*
> > > + * Initialize the global target patch state and all tasks to the initial 
> > > patch
> > > + * state, and initialize all function transition states to true in 
> > > preparation
> > > + * for patching or unpatching.
> > > + */
> > > +void klp_init_transition(struct klp_patch *patch, int state)
> > > +{
> > > + struct task_struct *g, *task;
> > > + unsigned int cpu;
> > > + struct klp_object *obj;
> > > + struct klp_func *func;
> > > + int initial_state = !state;
> > > +
> > > + WARN_ON_ONCE(klp_target_state != KLP_UNDEFINED);
> > > +
> > > + klp_transition_patch = patch;
> > > +
> > > + /*
> > > +  * Set the global target patch state which tasks will switch to.  This
> > > +  * has no effect until the TIF_PATCH_PENDING flags get set later.
> > > +  */
> > > + klp_target_state = state;
> > > +
> > > + /*
> > > +  * If the patch can be applied or reverted immediately, skip the
> > > +  * per-task transitions.
> > > +  */
> > > + if (patch->immediate)
> > > + return;
> > > +
> > > + /*
> > > +  * Initialize all tasks to the initial patch state to prepare them for
> > > +  * switching to the target state.
> > > +  */
> > > + read_lock(&tasklist_lock);
> > > + for_each_process_thread(g, task) {
> > > + WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > > + task->patch_state = initial_state;
> > > + }
> > > + read_unlock(&tasklist_lock);
> > > +
> > > + /*
> > > +  * Ditto for the idle "swapper" tasks.
> > > +  */
> > > + get_online_cpus();
> > > + for_each_online_cpu(cpu) {
> > > + task = idle_task(cpu);
> > > + WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > > + task->patch_state = initial_state;
> > > + }
> > > + put_online_cpus();
> > 
> > We allow to add/remove CPUs here. I am afraid that we will also need
> > to add a cpu coming/going handler that will set the task->patch_state
> > the right way. We must not set the klp_target_state until all ftrace
> > handlers are ready.
> 
> What if we instead just change the above to use for_each_possible_cpu()?
> We could do the same in klp_complete_transition().

I like this idea. It seems that there is an idle task for each possible
cpu, see idle_threads_init().

IMHO, we should do the same every time we do anything with the idle
tasks. I mean in klp_start_transition(), klp_try_complete_transition()
and also complete_transition().

Then they will be handled like any other processes and we do not need
to think of any special races.
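
I.e. in klp_init_transition() (and analogously in the other places)
something like:

        /* Ditto for the idle "swapper" tasks; iterate over possible CPUs
         * so that CPU hotplug cannot race with us. */
        for_each_possible_cpu(cpu) {
                task = idle_task(cpu);
                WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
                task->patch_state = initial_state;
        }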


> > > + /*
> > > +  * Enforce the order of the task->patch_state initializations and the
> > > +  * func->transition updates to ensure that, in the enable path,
> > > +  * klp_ftrace_handler() doesn't see a func in transition with a
> > > +  * task->patch_state of KLP_UNDEFINED.
> > > +  */
> > > + smp_wmb();
> > > +
> > > + /*
> > > +  * Set the func transition states so klp_ftrace_handler() will know to
> > > +  * switch to the transition logic.
> > > +  *
> > > +  * When patching, the funcs aren't yet in the func_stack and will be
> > > +  * made visible to the ftrace handler shortly by the calls to
> > > +  * klp_patch_object().
> > > +  *
> > > +  * When unpatching, the funcs are already in the func_stack and so are
> > > +  * already visible to the ftrace handler.
> > > +  */
> > > + klp_for_each_object(patch, obj)
> > > + klp_for_each_func(obj, func)
> > > + func->transition = true;
> > > +}
> > > +
> > > +/*
> > > + * Start the transition to the specified target patch state so tasks can 
> > > begin
> > > + * switching to it.
> > > + */
> > > +void klp_start_transition(void)
> > > +{
> > > + struct task_struct *g, *task;
> > > + unsigned int cpu;
> > > +
> > > + WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > > +
> > > + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > > +   klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > > +
> > > + /*
> > > +  * If the patch can be applied or reverted immediately, skip the
> > > +  * per-task transitions.
> > > +  */
> > > + if (klp_transition_patch->immediate)
> > > + return;
> > > +
> > > + /*
> > > +  * Mark all normal tasks as needing a patch state update.  As they pass
> > > +  * through the syscall barrier they'll switch over to the target state
> > > +  * (unless we switch them in klp_try_complete_transition() first).
> > > +  */
> > > + r

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-22 Thread Josh Poimboeuf
On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > Change livepatch to use a basic per-task consistency model.  This is the
> > > > foundation which will eventually enable us to patch those ~10% of
> > > > security patches which change function or data semantics.  This is the
> > > > biggest remaining piece needed to make livepatch more generally useful.
> > > > 
> > > > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > > > 
> > > > --- /dev/null
> > > > +++ b/kernel/livepatch/transition.c
> > > > +/*
> > > > + * Initialize the global target patch state and all tasks to the 
> > > > initial patch
> > > > + * state, and initialize all function transition states to true in 
> > > > preparation
> > > > + * for patching or unpatching.
> > > > + */
> > > > +void klp_init_transition(struct klp_patch *patch, int state)
> > > > +{
> > > > +   struct task_struct *g, *task;
> > > > +   unsigned int cpu;
> > > > +   struct klp_object *obj;
> > > > +   struct klp_func *func;
> > > > +   int initial_state = !state;
> > > > +
> > > > +   WARN_ON_ONCE(klp_target_state != KLP_UNDEFINED);
> > > > +
> > > > +   klp_transition_patch = patch;
> > > > +
> > > > +   /*
> > > > +* Set the global target patch state which tasks will switch 
> > > > to.  This
> > > > +* has no effect until the TIF_PATCH_PENDING flags get set 
> > > > later.
> > > > +*/
> > > > +   klp_target_state = state;
> > > > +
> > > > +   /*
> > > > +* If the patch can be applied or reverted immediately, skip the
> > > > +* per-task transitions.
> > > > +*/
> > > > +   if (patch->immediate)
> > > > +   return;
> > > > +
> > > > +   /*
> > > > +* Initialize all tasks to the initial patch state to prepare 
> > > > them for
> > > > +* switching to the target state.
> > > > +*/
> > > > +   read_lock(&tasklist_lock);
> > > > +   for_each_process_thread(g, task) {
> > > > +   WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > > > +   task->patch_state = initial_state;
> > > > +   }
> > > > +   read_unlock(&tasklist_lock);
> > > > +
> > > > +   /*
> > > > +* Ditto for the idle "swapper" tasks.
> > > > +*/
> > > > +   get_online_cpus();
> > > > +   for_each_online_cpu(cpu) {
> > > > +   task = idle_task(cpu);
> > > > +   WARN_ON_ONCE(task->patch_state != KLP_UNDEFINED);
> > > > +   task->patch_state = initial_state;
> > > > +   }
> > > > +   put_online_cpus();
> > > 
> > > We allow to add/remove CPUs here. I am afraid that we will also need
> > > to add a cpu coming/going handler that will set the task->patch_state
> > > the right way. We must not set the klp_target_state until all ftrace
> > > handlers are ready.
> > 
> > What if we instead just change the above to use for_each_possible_cpu()?
> > We could do the same in klp_complete_transition().
> 
> I like this idea. It seems that there is idle task for each possible
> cpu, see idle_threads_init().
> 
> IMHO, we should do the same everytime we do anything with the idle
> tasks. I mean in klp_start_transition, klp_try_complete_transition()
> and also complete_transition().
> 
> Then they will be handled like any other processes and we do not need
> to think of any special races.

More on this below.

> > > > +   /*
> > > > +* Enforce the order of the task->patch_state initializations 
> > > > and the
> > > > +* func->transition updates to ensure that, in the enable path,
> > > > +* klp_ftrace_handler() doesn't see a func in transition with a
> > > > +* task->patch_state of KLP_UNDEFINED.
> > > > +*/
> > > > +   smp_wmb();
> > > > +
> > > > +   /*
> > > > +* Set the func transition states so klp_ftrace_handler() will 
> > > > know to
> > > > +* switch to the transition logic.
> > > > +*
> > > > +* When patching, the funcs aren't yet in the func_stack and 
> > > > will be
> > > > +* made visible to the ftrace handler shortly by the calls to
> > > > +* klp_patch_object().
> > > > +*
> > > > +* When unpatching, the funcs are already in the func_stack and 
> > > > so are
> > > > +* already visible to the ftrace handler.
> > > > +*/
> > > > +   klp_for_each_object(patch, obj)
> > > > +   klp_for_each_func(obj, func)
> > > > +   func->transition = true;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Start the transition to the specified target patch state so tasks 
> > > > can begin
> > > > + * switching to it.
> > > > + */
> > > > +void klp_start_transition(void)
> > > > +{
> > > > +   struct task_struct *g, 

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-23 Thread Miroslav Benes
> > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > index 5efa262..e79ebb5 100644
> > > --- a/kernel/livepatch/patch.c
> > > +++ b/kernel/livepatch/patch.c
> > > @@ -29,6 +29,7 @@
> > >  #include 
> > >  #include 
> > >  #include "patch.h"
> > > +#include "transition.h"
> > >  
> > >  static LIST_HEAD(klp_ops);
> > >  
> > > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned long 
> > > ip,
> > >  {
> > >   struct klp_ops *ops;
> > >   struct klp_func *func;
> > > + int patch_state;
> > >  
> > >   ops = container_of(fops, struct klp_ops, fops);
> > >  
> > >   rcu_read_lock();
> > > +
> > >   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > stack_node);
> > > - if (WARN_ON_ONCE(!func))
> > > +
> > > + if (!func)
> > >   goto unlock;
> > 
> > Why do you removed the WARN_ON_ONCE(), please?
> > 
> > We still add the function on the stack before registering
> > the ftrace handler. Also we unregister the ftrace handler
> > before removing the the last entry from the stack.
> > 
> > AFAIK, unregister_ftrace_function() calls rcu_synchronize()'
> > to make sure that no-one is inside the handler once finished.
> > Mirek knows more about it.
> 
> Hm, this is news to me.  Mirek, please share :-)

Well, I think the whole thing is well described in emails I exchanged with 
Steven few months ago. See [1].

[1] http://lkml.kernel.org/r/alpine.lnx.2.00.1608081041060.10...@pobox.suse.cz
 
> > If this is not true, we have a problem. For example,
> > we call kfree(ops) after unregister_ftrace_function();
> 
> Agreed.

TL;DR - we should be ok as long as we do not do crazy things in the 
handler, deliberate sleeping for example.

WARN_ON_ONCE() may be crazy too. I think we discussed it long ago and we 
came to an agreement to remove it.

Miroslav, very slowly going through the patch set


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2016-12-23 Thread Petr Mladek
On Fri 2016-12-23 10:24:35, Miroslav Benes wrote:
> > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > index 5efa262..e79ebb5 100644
> > > > --- a/kernel/livepatch/patch.c
> > > > +++ b/kernel/livepatch/patch.c
> > > > @@ -29,6 +29,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include "patch.h"
> > > > +#include "transition.h"
> > > >  
> > > >  static LIST_HEAD(klp_ops);
> > > >  
> > > > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned 
> > > > long ip,
> > > >  {
> > > > struct klp_ops *ops;
> > > > struct klp_func *func;
> > > > +   int patch_state;
> > > >  
> > > > ops = container_of(fops, struct klp_ops, fops);
> > > >  
> > > > rcu_read_lock();
> > > > +
> > > > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > >   stack_node);
> > > > -   if (WARN_ON_ONCE(!func))
> > > > +
> > > > +   if (!func)
> > > > goto unlock;
> > > 
> > > Why do you removed the WARN_ON_ONCE(), please?
> > > 
> > > We still add the function on the stack before registering
> > > the ftrace handler. Also we unregister the ftrace handler
> > > before removing the the last entry from the stack.
> > > 
> > > AFAIK, unregister_ftrace_function() calls rcu_synchronize()'
> > > to make sure that no-one is inside the handler once finished.
> > > Mirek knows more about it.
> > 
> > Hm, this is news to me.  Mirek, please share :-)
> 
> Well, I think the whole thing is well described in emails I exchanged with 
> Steven few months ago. See [1].
> 
> [1] http://lkml.kernel.org/r/alpine.lnx.2.00.1608081041060.10...@pobox.suse.cz
>  
> > > If this is not true, we have a problem. For example,
> > > we call kfree(ops) after unregister_ftrace_function();
> > 
> > Agreed.
> 
> TL;DR - we should be ok as long as we do not do crazy things in the 
> handler, deliberate sleeping for example.
> 
> WARN_ON_ONCE() may be crazy too. I think we discussed it long ago and we 
> came to an agreement to remove it.

There are definitely situations where this might hurt. For example,
when we redirect a function called under logbuf_lock.

On the other hand, there is work in progress[1][2] that will mitigate
this risk a lot. Also this warning would be printed only when
something goes wrong. IMHO, it is worth the risk. It will succeed
in 99.999% of cases and it might save us some headache when debugging
random crashes of the system.

Anyway, if there is a reason to remove the warning, it should be
described. And if it is not strictly related to this patch, it should
be handled separately.

[1] https://lkml.kernel.org/r/20161221143605.2272-1-sergey.senozhat...@gmail.com
[2] 
https://lkml.kernel.org/r/1461333180-2897-1-git-send-email-sergey.senozhat...@gmail.com

Best Regards,
Petr


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-04 Thread Miroslav Benes
On Thu, 8 Dec 2016, Josh Poimboeuf wrote:

> +void klp_start_transition(void)
> +{
> + struct task_struct *g, *task;
> + unsigned int cpu;
> +
> + WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> +
> + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> +   klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> +
> + /*
> +  * If the patch can be applied or reverted immediately, skip the
> +  * per-task transitions.
> +  */
> + if (klp_transition_patch->immediate)
> + return;
> +
> + /*
> +  * Mark all normal tasks as needing a patch state update.  As they pass
> +  * through the syscall barrier they'll switch over to the target state
> +  * (unless we switch them in klp_try_complete_transition() first).
> +  */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, task)
> + set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> + read_unlock(&tasklist_lock);
> +
> + /*
> +  * Ditto for the idle "swapper" tasks, though they never cross the
> +  * syscall barrier.  Instead they switch over in cpu_idle_loop().

...or we switch them in klp_try_complete_transition() first by looking at 
their stacks, right? I would add it to the comment.

> +  */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> + put_online_cpus();
> +}

[...]

> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -264,6 +265,9 @@ static void do_idle(void)
>  
>   sched_ttwu_pending();
>   schedule_preempt_disabled();
> +
> + if (unlikely(klp_patch_pending(current)))
> + klp_update_patch_state(current);
>  }

I think that (theoretically) this is not sufficient, if we patch a
function present on an idle task's stack and one of the two following
scenarios happens.

1. there is nothing to schedule on a cpu and an idle task does not leave the
loop in do_idle() for some time. It may be nonsense in practice, and if it
is not, we could solve it with schedule_on_each_cpu() on an empty stub
somewhere in our code.

2. there is a cpu-bound process running on one of the cpus. No chance of 
going to do_idle() there at all and the idle task would block the 
patching. We ran into it in kGraft and I tried to solve it with this new 
hunk in pick_next_task()...

+   /*
+* Patching is in progress, schedule an idle task to migrate it
+*/
+   if (kgr_in_progress_branch()) {
+   if (!test_bit(0, kgr_immutable) &&
+   klp_kgraft_task_in_progress(rq->idle)) {
+   p = idle_sched_class.pick_next_task(rq, prev);
+
+   return p;
+   }
+   }

(kgr_in_progress_branch() is a static key basically. kgr_immutable flag 
solves something we don't have a problem with in upstream livepatch thanks 
to a combination of task->patch_state and klp_func->transition. 
klp_kgraft_task_in_progress() checks the livepatching TIF of a task.)

It is not tested properly and it is a hack as hell, so take it as that.
Also note that the problem in kGraft is more serious as we don't have
stack checking there. So any livepatch could cause the issue easily.

I can imagine even crazier solutions but nothing nice and pretty (which is 
probably impossible because the whole idea to deliberately schedule an 
idle task is not nice and pretty).

Otherwise the patch looks good to me. I don't understand how Petr found 
those races there.

Regards,
Miroslav


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-05 Thread Miroslav Benes

> @@ -740,6 +809,14 @@ int klp_register_patch(struct klp_patch *patch)
>   return -ENODEV;
>  
>   /*
> +  * Architectures without reliable stack traces have to set
> +  * patch->immediate because there's currently no way to patch kthreads
> +  * with the consistency model.
> +  */
> + if (!klp_have_reliable_stack() && !patch->immediate)
> + return -ENOSYS;
> +

I think an error message (pr_err) would be appropriate here. 

$ insmod patch_1.ko
insmod: ERROR: could not insert module patch_1.ko: Function not implemented

is not very helpful :)
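
For illustration, a rough sketch of what such a check with an explicit
message could look like (the wording and placement are just my example,
not taken from the patch):

	if (!klp_have_reliable_stack() && !patch->immediate) {
		/* Tell the user why registration failed instead of a bare -ENOSYS. */
		pr_err("%s: this architecture doesn't have reliable stack traces, set patch->immediate\n",
		       patch->mod->name);
		return -ENOSYS;
	}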

Miroslav


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-06 Thread Josh Poimboeuf
On Fri, Dec 23, 2016 at 11:18:03AM +0100, Petr Mladek wrote:
> On Fri 2016-12-23 10:24:35, Miroslav Benes wrote:
> > > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > > index 5efa262..e79ebb5 100644
> > > > > --- a/kernel/livepatch/patch.c
> > > > > +++ b/kernel/livepatch/patch.c
> > > > > @@ -29,6 +29,7 @@
> > > > >  #include 
> > > > >  #include 
> > > > >  #include "patch.h"
> > > > > +#include "transition.h"
> > > > >  
> > > > >  static LIST_HEAD(klp_ops);
> > > > >  
> > > > > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned 
> > > > > long ip,
> > > > >  {
> > > > >   struct klp_ops *ops;
> > > > >   struct klp_func *func;
> > > > > + int patch_state;
> > > > >  
> > > > >   ops = container_of(fops, struct klp_ops, fops);
> > > > >  
> > > > >   rcu_read_lock();
> > > > > +
> > > > >   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > > > stack_node);
> > > > > - if (WARN_ON_ONCE(!func))
> > > > > +
> > > > > + if (!func)
> > > > >   goto unlock;
> > > > 
> > > > Why did you remove the WARN_ON_ONCE(), please?
> > > > 
> > > > We still add the function on the stack before registering
> > > > the ftrace handler. Also we unregister the ftrace handler
> > > > before removing the last entry from the stack.
> > > > 
> > > > AFAIK, unregister_ftrace_function() calls rcu_synchronize()'
> > > > to make sure that no-one is inside the handler once finished.
> > > > Mirek knows more about it.
> > > 
> > > Hm, this is news to me.  Mirek, please share :-)
> > 
> > Well, I think the whole thing is well described in emails I exchanged with 
> > Steven few months ago. See [1].
> > 
> > [1] 
> > http://lkml.kernel.org/r/alpine.lnx.2.00.1608081041060.10...@pobox.suse.cz
> >  
> > > > If this is not true, we have a problem. For example,
> > > > we call kfree(ops) after unregister_ftrace_function();
> > > 
> > > Agreed.
> > 
> > TL;DR - we should be ok as long as we do not do crazy things in the 
> > handler, deliberate sleeping for example.
> > 
> > WARN_ON_ONCE() may be crazy too. I think we discussed it long ago and we 
> > came to an agreement to remove it.
> 
> There are definitely situations where this might hurt. For example,
> when we redirect a function called under logbuf_lock.
> 
> On the other hand, there is a work in progress[1][2] that will mitigate
> this risk a lot. Also this warning would be printed only when
> something goes wrong. IMHO, it is worth the risk. It will succeed
> in 99,999% cases and it might save us some headache when debugging
> random crashes of the system.
> 
> Anyway, if there is a reason to remove the warning, it should be
> described. And if it is not strictly related to this patch, it should
> be handled separately.
> 
> [1] 
> https://lkml.kernel.org/r/20161221143605.2272-1-sergey.senozhat...@gmail.com
> [2] 
> https://lkml.kernel.org/r/1461333180-2897-1-git-send-email-sergey.senozhat...@gmail.com

Yeah, I'm thinking we should keep the warning to catch any bugs in case
any of our ftrace assumptions change.  Maybe I should add a comment:

/*
 * func can never be NULL because preemption should be disabled
 * here and unregister_ftrace_function() does the equivalent of
 * a synchronize_sched() before the func_stack removal.
 */
if (WARN_ON_ONCE(!func))
goto unlock;

-- 
Josh


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-06 Thread Josh Poimboeuf
On Wed, Jan 04, 2017 at 02:44:47PM +0100, Miroslav Benes wrote:
> On Thu, 8 Dec 2016, Josh Poimboeuf wrote:
> 
> > +void klp_start_transition(void)
> > +{
> > +   struct task_struct *g, *task;
> > +   unsigned int cpu;
> > +
> > +   WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > +
> > +   pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > + klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > +
> > +   /*
> > +* If the patch can be applied or reverted immediately, skip the
> > +* per-task transitions.
> > +*/
> > +   if (klp_transition_patch->immediate)
> > +   return;
> > +
> > +   /*
> > +* Mark all normal tasks as needing a patch state update.  As they pass
> > +* through the syscall barrier they'll switch over to the target state
> > +* (unless we switch them in klp_try_complete_transition() first).
> > +*/
> > +   read_lock(&tasklist_lock);
> > +   for_each_process_thread(g, task)
> > +   set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > +   read_unlock(&tasklist_lock);
> > +
> > +   /*
> > +* Ditto for the idle "swapper" tasks, though they never cross the
> > +* syscall barrier.  Instead they switch over in cpu_idle_loop().
> 
> ...or we switch them in klp_try_complete_transition() first by looking at 
> their stacks, right? I would add it to the comment.

Yeah, I guess the "ditto" was intended to include the "unless we switch
them in klp_try_complete_transition() first" statement from the previous
comment.  I'll try to make it clearer.

> > +*/
> > +   get_online_cpus();
> > +   for_each_online_cpu(cpu)
> > +   set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> > +   put_online_cpus();
> > +}
> 
> [...]
> 
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -9,6 +9,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  
> > @@ -264,6 +265,9 @@ static void do_idle(void)
> >  
> > sched_ttwu_pending();
> > schedule_preempt_disabled();
> > +
> > +   if (unlikely(klp_patch_pending(current)))
> > +   klp_update_patch_state(current);
> >  }
> 
> I think that (theoretically) this is not sufficient, if we patch a 
> function present on an idle task's stack and one of the two following 
> scenarios happen.

Agreed, though I'd argue that these are rare edge cases which can
probably be refined later, outside the scope of this patch set.

> 1. there is nothing to schedule on a cpu and an idle task does not leave a 
> loop in do_idle() for some time. It may be a nonsense practically and if 
> it is not we could solve with schedule_on_each_cpu() on an empty stub 
> somewhere in our code.

This might only be a theoretical issue, as it only happens when patching
one of the idle functions themselves.

If we decided that this were a real world problem, we could use
something like schedule_on_each_cpu() to flush them out as you
suggested.  Or it could even be done from user space by briefly running
a CPU-intensive program on the affected CPUs.
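
As a rough sketch (the stub name is made up and this is not part of the
patch set), the schedule_on_each_cpu() variant could be as simple as:

	/* Empty work function: we only care about the context switch it forces. */
	static void klp_sched_stub_fn(struct work_struct *work)
	{
	}

	/* Run the stub on every online CPU so each idle task schedules at least once. */
	schedule_on_each_cpu(klp_sched_stub_fn);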

> 2. there is a cpu-bound process running on one of the cpus. No chance of 
> going to do_idle() there at all and the idle task would block the 
> patching.

To clarify, I think this would only be an issue when trying to patch idle
code or schedule()/__schedule().

> We ran into it in kGraft and I tried to solve it with this new 
> hunk in pick_next_task()...
> 
> +   /*
> +* Patching is in progress, schedule an idle task to migrate it
> +*/
> +   if (kgr_in_progress_branch()) {
> +   if (!test_bit(0, kgr_immutable) &&
> +   klp_kgraft_task_in_progress(rq->idle)) {
> +   p = idle_sched_class.pick_next_task(rq, prev);
> +
> +   return p;
> +   }
> +   }
> 
> (kgr_in_progress_branch() is a static key basically. kgr_immutable flag 
> solves something we don't have a problem with in upstream livepatch thanks 
> to a combination of task->patch_state and klp_func->transition. 
> klp_kgraft_task_in_progress() checks the livepatching TIF of a task.)
> 
> It is not tested properly and it is a hack as hell so take it as that. 
> Also note that the problem in kGraft is more serious as we don't have a 
> stack checking there. So any livepatch could cause the issue easily.
> 
> I can imagine even crazier solutions but nothing nice and pretty (which is 
> probably impossible because the whole idea to deliberately schedule an 
> idle task is not nice and pretty).

Yeah, that looks hairy...

Since this is such a specialized case (patching the scheduler in an idle
task while CPU-intensive tasks are running) this might also be more
reasonably accomplished from user space by briefly SIGSTOPing the CPU
hog.

> Otherwise the patch looks good to me. I don't understand how Petr found 
> those races there.

Agreed, kudos to Petr :-)

-- 
Josh


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-06 Thread Josh Poimboeuf
On Thu, Jan 05, 2017 at 10:34:57AM +0100, Miroslav Benes wrote:
> 
> > @@ -740,6 +809,14 @@ int klp_register_patch(struct klp_patch *patch)
> > return -ENODEV;
> >  
> > /*
> > +* Architectures without reliable stack traces have to set
> > +* patch->immediate because there's currently no way to patch kthreads
> > +* with the consistency model.
> > +*/
> > +   if (!klp_have_reliable_stack() && !patch->immediate)
> > +   return -ENOSYS;
> > +
> 
> I think an error message (pr_err) would be appropriate here. 
> 
> $ insmod patch_1.ko
> insmod: ERROR: could not insert module patch_1.ko: Function not implemented
> 
> is not helpful much :)

Ok :-)

-- 
Josh


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-10 Thread Petr Mladek
On Fri 2017-01-06 14:07:34, Josh Poimboeuf wrote:
> On Fri, Dec 23, 2016 at 11:18:03AM +0100, Petr Mladek wrote:
> > On Fri 2016-12-23 10:24:35, Miroslav Benes wrote:
> > > > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > > > index 5efa262..e79ebb5 100644
> > > > > > --- a/kernel/livepatch/patch.c
> > > > > > +++ b/kernel/livepatch/patch.c
> > > > > > @@ -29,6 +29,7 @@
> > > > > >  #include 
> > > > > >  #include 
> > > > > >  #include "patch.h"
> > > > > > +#include "transition.h"
> > > > > >  
> > > > > >  static LIST_HEAD(klp_ops);
> > > > > >  
> > > > > > @@ -54,15 +55,53 @@ static void notrace klp_ftrace_handler(unsigned 
> > > > > > long ip,
> > > > > >  {
> > > > > > struct klp_ops *ops;
> > > > > > struct klp_func *func;
> > > > > > +   int patch_state;
> > > > > >  
> > > > > > ops = container_of(fops, struct klp_ops, fops);
> > > > > >  
> > > > > > rcu_read_lock();
> > > > > > +
> > > > > > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > > > >   stack_node);
> > > > > > -   if (WARN_ON_ONCE(!func))
> > > > > > +
> > > > > > +   if (!func)
> > > > > > goto unlock;
> > > > > 
 
> Yeah, I'm thinking we should keep the warning to catch any bugs in case
> any of our ftrace assumptions change.  Maybe I should add a comment:
> 
>   /*
>* func can never be NULL because preemption should be disabled
>* here and unregister_ftrace_function() does the equivalent of
>* a synchronize_sched() before the func_stack removal.
>*/
>   if (WARN_ON_ONCE(!func))
>   goto unlock;

Sounds reasonable to me.

Best Regards,
Petr


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-10 Thread Miroslav Benes

> > > --- a/kernel/sched/idle.c
> > > +++ b/kernel/sched/idle.c
> > > @@ -9,6 +9,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  
> > >  #include 
> > >  
> > > @@ -264,6 +265,9 @@ static void do_idle(void)
> > >  
> > >   sched_ttwu_pending();
> > >   schedule_preempt_disabled();
> > > +
> > > + if (unlikely(klp_patch_pending(current)))
> > > + klp_update_patch_state(current);
> > >  }
> > 
> > I think that (theoretically) this is not sufficient, if we patch a 
> > function present on an idle task's stack and one of the two following 
> > scenarios happen.
> 
> Agreed, though I'd argue that these are rare edge cases which can
> probably be refined later, outside the scope of this patch set.

You're right. They should be really rare and we can solve them (if we even 
want to) later.
 
> > 1. there is nothing to schedule on a cpu and an idle task does not leave a 
> > loop in do_idle() for some time. It may be a nonsense practically and if 
> > it is not we could solve with schedule_on_each_cpu() on an empty stub 
> > somewhere in our code.
> 
> This might only be a theoretical issue, as it only happens when patching
> one of the idle functions themselves.
> 
> If we decided that this were a real world problem, we could use
> something like schedule_on_each_cpu() to flush them out as you
> suggested.  Or it could even be done from user space by briefly running
> a CPU-intensive program on the affected CPUs.

Yes.

> > 2. there is a cpu-bound process running on one of the cpus. No chance of 
> > going to do_idle() there at all and the idle task would block the 
> > patching.
> 
> To clarify I think this would only be an issue when trying to patch idle
> code or schedule()/__schedule().

Yes.

> > We ran into it in kGraft and I tried to solve it with this new 
> > hunk in pick_next_task()...
> > 
> > +   /*
> > +* Patching is in progress, schedule an idle task to migrate it
> > +*/
> > +   if (kgr_in_progress_branch()) {
> > +   if (!test_bit(0, kgr_immutable) &&
> > +   klp_kgraft_task_in_progress(rq->idle)) {
> > +   p = idle_sched_class.pick_next_task(rq, prev);
> > +
> > +   return p;
> > +   }
> > +   }
> > 
> > (kgr_in_progress_branch() is a static key basically. kgr_immutable flag 
> > solves something we don't have a problem with in upstream livepatch thanks 
> > to a combination of task->patch_state and klp_func->transition. 
> > klp_kgraft_task_in_progress() checks the livepatching TIF of a task.)
> > 
> > It is not tested properly and it is a hack as hell so take it as that. 
> > Also note that the problem in kGraft is more serious as we don't have a 
> > stack checking there. So any livepatch could cause the issue easily.
> > 
> > I can imagine even crazier solutions but nothing nice and pretty (which is 
> > probably impossible because the whole idea to deliberately schedule an 
> > idle task is not nice and pretty).
> 
> Yeah, that looks hairy...
> 
> Since this is such a specialized case (patching the scheduler in an idle
> task while CPU-intensive tasks are running) this might also be more
> reasonably accomplished from user space by briefly SIGSTOPing the CPU
> hog.

Yes, that is true. I can imagine there are users who don't want to stop
the cpu hog at all, not even for a fraction of a second. HPC comes to mind.
But it is not worth solving it with something like the code above. Let's
add it to the very end of our TODO lists :).

Miroslav


Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-10 Thread Petr Mladek
On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > Change livepatch to use a basic per-task consistency model.  This is 
> > > > > the
> > > > > foundation which will eventually enable us to patch those ~10% of
> > > > > security patches which change function or data semantics.  This is the
> > > > > biggest remaining piece needed to make livepatch more generally 
> > > > > useful.
> > > > > 
> > > > > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > > > > 
> > > > > --- /dev/null
> > > > > +++ b/kernel/livepatch/transition.c
> > > > > + /*
> > > > > +  * Enforce the order of the task->patch_state initializations 
> > > > > and the
> > > > > +  * func->transition updates to ensure that, in the enable path,
> > > > > +  * klp_ftrace_handler() doesn't see a func in transition with a
> > > > > +  * task->patch_state of KLP_UNDEFINED.
> > > > > +  */
> > > > > + smp_wmb();
> > > > > +
> > > > > + /*
> > > > > +  * Set the func transition states so klp_ftrace_handler() will 
> > > > > know to
> > > > > +  * switch to the transition logic.
> > > > > +  *
> > > > > +  * When patching, the funcs aren't yet in the func_stack and 
> > > > > will be
> > > > > +  * made visible to the ftrace handler shortly by the calls to
> > > > > +  * klp_patch_object().
> > > > > +  *
> > > > > +  * When unpatching, the funcs are already in the func_stack and 
> > > > > so are
> > > > > +  * already visible to the ftrace handler.
> > > > > +  */
> > > > > + klp_for_each_object(patch, obj)
> > > > > + klp_for_each_func(obj, func)
> > > > > + func->transition = true;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Start the transition to the specified target patch state so tasks 
> > > > > can begin
> > > > > + * switching to it.
> > > > > + */
> > > > > +void klp_start_transition(void)
> > > > > +{
> > > > > + struct task_struct *g, *task;
> > > > > + unsigned int cpu;
> > > > > +
> > > > > + WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > > > > +
> > > > > + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > > > > +   klp_target_state == KLP_PATCHED ? "patching" : 
> > > > > "unpatching");
> > > > > +
> > > > > + /*
> > > > > +  * If the patch can be applied or reverted immediately, skip the
> > > > > +  * per-task transitions.
> > > > > +  */
> > > > > + if (klp_transition_patch->immediate)
> > > > > + return;
> > > > > +
> > > > > + /*
> > > > > +  * Mark all normal tasks as needing a patch state update.  As 
> > > > > they pass
> > > > > +  * through the syscall barrier they'll switch over to the 
> > > > > target state
> > > > > +  * (unless we switch them in klp_try_complete_transition() 
> > > > > first).
> > > > > +  */
> > > > > + read_lock(&tasklist_lock);
> > > > > + for_each_process_thread(g, task)
> > > > > + set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > > 
> > > > This is called also from klp_reverse_transition(). We should set it
> > > > only when the task need migration. Also we should clear it when
> > > > the task is in the right state already.
> > > > 
> > > > It is not only optimization. It actually solves a race between
> > > > klp_complete_transition() and klp_update_patch_state(), see below.
> > > 
> > > I agree about the race, but if I did:
> > > 
> > >   for_each_process_thread(g, task) {
> > >   if (task->patch_state != klp_target_state)
> > >   set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > >   else
> > >   clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > >   }
> > > 
> > > It would still leave a small window where TIF_PATCH_PENDING gets set for
> > > an already patched task, if klp_update_patch_state() is running at the
> > > same time.
> > 
> > I see your point. Well, it seems that it is more complicated:
> > 
> > The race would be possible only when this was called from
> > klp_reverse_transition(). But we need to call there
> > rcu_synchronize() to prevent races with klp_update_patch_state()
> > also to prevent premature patch completion.
> > 
> > The result is:
> > 
> > if (task->patch_state != klp_target_state) {
> > # it means that the task was already migrated before
> > # we reverted klp_target_state. It means that
> > # the TIF flag was already cleared, the related
> > # klp_update_patch_state() already finished (thanks
> > # to rcu_synchronize() and new one will be called
> > # only when we set the flag again
> > # => it is safe to set it
> > 
> > # we should also check and warn

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-10 Thread Josh Poimboeuf
On Tue, Jan 10, 2017 at 02:00:58PM +0100, Petr Mladek wrote:
> On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> > On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > > Change livepatch to use a basic per-task consistency model.  This 
> > > > > > is the
> > > > > > foundation which will eventually enable us to patch those ~10% of
> > > > > > security patches which change function or data semantics.  This is 
> > > > > > the
> > > > > > biggest remaining piece needed to make livepatch more generally 
> > > > > > useful.
> > > > > > 
> > > > > > [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> > > > > > 
> > > > > > --- /dev/null
> > > > > > +++ b/kernel/livepatch/transition.c
> > > > > > +   /*
> > > > > > +* Enforce the order of the task->patch_state initializations 
> > > > > > and the
> > > > > > +* func->transition updates to ensure that, in the enable path,
> > > > > > +* klp_ftrace_handler() doesn't see a func in transition with a
> > > > > > +* task->patch_state of KLP_UNDEFINED.
> > > > > > +*/
> > > > > > +   smp_wmb();
> > > > > > +
> > > > > > +   /*
> > > > > > +* Set the func transition states so klp_ftrace_handler() will 
> > > > > > know to
> > > > > > +* switch to the transition logic.
> > > > > > +*
> > > > > > +* When patching, the funcs aren't yet in the func_stack and 
> > > > > > will be
> > > > > > +* made visible to the ftrace handler shortly by the calls to
> > > > > > +* klp_patch_object().
> > > > > > +*
> > > > > > +* When unpatching, the funcs are already in the func_stack and 
> > > > > > so are
> > > > > > +* already visible to the ftrace handler.
> > > > > > +*/
> > > > > > +   klp_for_each_object(patch, obj)
> > > > > > +   klp_for_each_func(obj, func)
> > > > > > +   func->transition = true;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * Start the transition to the specified target patch state so 
> > > > > > tasks can begin
> > > > > > + * switching to it.
> > > > > > + */
> > > > > > +void klp_start_transition(void)
> > > > > > +{
> > > > > > +   struct task_struct *g, *task;
> > > > > > +   unsigned int cpu;
> > > > > > +
> > > > > > +   WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > > > > > +
> > > > > > +   pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > > > > > + klp_target_state == KLP_PATCHED ? "patching" : 
> > > > > > "unpatching");
> > > > > > +
> > > > > > +   /*
> > > > > > +* If the patch can be applied or reverted immediately, skip the
> > > > > > +* per-task transitions.
> > > > > > +*/
> > > > > > +   if (klp_transition_patch->immediate)
> > > > > > +   return;
> > > > > > +
> > > > > > +   /*
> > > > > > +* Mark all normal tasks as needing a patch state update.  As 
> > > > > > they pass
> > > > > > +* through the syscall barrier they'll switch over to the 
> > > > > > target state
> > > > > > +* (unless we switch them in klp_try_complete_transition() 
> > > > > > first).
> > > > > > +*/
> > > > > > +   read_lock(&tasklist_lock);
> > > > > > +   for_each_process_thread(g, task)
> > > > > > +   set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > > > 
> > > > > This is called also from klp_reverse_transition(). We should set it
> > > > > only when the task need migration. Also we should clear it when
> > > > > the task is in the right state already.
> > > > > 
> > > > > It is not only optimization. It actually solves a race between
> > > > > klp_complete_transition() and klp_update_patch_state(), see below.
> > > > 
> > > > I agree about the race, but if I did:
> > > > 
> > > > for_each_process_thread(g, task) {
> > > > if (task->patch_state != klp_target_state)
> > > > set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > > else
> > > > clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > > }
> > > > 
> > > > It would still leave a small window where TIF_PATCH_PENDING gets set for
> > > > an already patched task, if klp_update_patch_state() is running at the
> > > > same time.
> > > 
> > > I see your point. Well, it seems that it is more complicated:
> > > 
> > > The race would be possible only when this was called from
> > > klp_reverse_transition(). But we need to call there
> > > rcu_synchronize() to prevent races with klp_update_patch_state()
> > > also to prevent premature patch completion.
> > > 
> > > The result is:
> > > 
> > >   if (task->patch_state != klp_target_state) {
> > >   # it means that the task was already migrated before
> > >   # we reverted klp_target_state. It means that
> > >   # the TIF flag was already cleared, the related
> > >   

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-11 Thread Petr Mladek
On Tue 2017-01-10 14:46:46, Josh Poimboeuf wrote:
> On Tue, Jan 10, 2017 at 02:00:58PM +0100, Petr Mladek wrote:
> > On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> > > On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > > > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > > > + read_unlock(&tasklist_lock);
> > > > > > > +
> > > > > > > + /*
> > > > > > > +  * Ditto for the idle "swapper" tasks, though they never cross 
> > > > > > > the
> > > > > > > +  * syscall barrier.  Instead they switch over in 
> > > > > > > cpu_idle_loop().
> > > > > > > +  */
> > > > > > > + get_online_cpus();
> > > > > > > + for_each_online_cpu(cpu)
> > > > > > > + set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> > > > > > > + put_online_cpus();
> > > > > > 
> > > > > > Also this stage need to be somehow handled by CPU coming/going
> > > > > > handlers.
> > > > > 
> > > > > Here I think we could automatically switch any offline CPUs' idle 
> > > > > tasks.
> > > > > And something similar in klp_try_complete_transition().
> > > > 
> > > > We still need to make sure to do not race with the cpu_up()/cpu_down()
> > > > calls.
> > > 
> > > Hm, maybe we'd need to call cpu_hotplug_disable() before switching the
> > > offline idle tasks?
> > > 
> > > > I would use here the trick with for_each_possible_cpu() and let
> > > > the migration for the stack check.
> > > 
> > > There are a few issues with that:
> > > 
> > > 1) The idle task of a missing CPU doesn't *have* a stack, so it doesn't
> > >make much sense to try to check it.
> > > 
> > > 2) We can't rely *only* on the stack check, because not all arches have
> > >it.  The other way to migrate idle tasks is from the idle loop switch
> > >point.  But if the task's CPU is down, its idle loop isn't running so
> > >it can't migrate.
> > > 
> > >(Note this is currently a theoretical point: we currently don't allow
> > >such arches to use the consistency model anyway because there's no
> > >way for them to migrate kthreads.)
> > 
> > Good points. My only concern is that the transaction might take a long
> > time or even forever. I am not sure if it is wise to disable cpu_hotplug
> > for the entire transaction.
> > 
> > A compromise might be to disable cpu hotplug only when the task
> > state is manipulated a more complex way. Hmm, cpu_hotplug_disable()
> > looks like a rather costly function. We should not call it in
> > klp_try_complete_transition(). But we could do:
> > 
> >   1. When the patch is being enabled, disable cpu hotplug,
> >  go through each_possible_cpu and setup the transaction
> >  only for CPUs that are online. Then we could enable
> >  the hotplug again.
> > 
> >   2. Check only each_online_cpu in klp_try_complete_transition().
> >  If all tasks are migrated, disable cpu hotplug and re-check
> >  idle tasks on online CPUs. If any is not migrated, enable
> >  hotplug and return failure. Otherwise, continue with
> >  completion of the transaction. [*]
> > 
> >   3. In klp_complete_transition, update all tasks including
> >  the offline CPUs and enable cpu hotplug again.
> > 
> > If the re-check in the 2nd step looks ugly, we could add some hotplug
> > notifiers to make sure that enabled/disabled CPUs are in a reasonable
> > state. We still should disable the hotplug in the 1st and 3rd step.
> > 
> > BTW: There is a new API for the cpu hotplug callbacks. I was involved
> > in one conversion. You might take inspiration in
> > drivers/thermal/intel_powerclamp.c. See cpuhp_setup_state_nocalls()
> > there.
> 
> Backing up a bit, although I brought up cpu_hotplug_disable(), I think I
> misunderstood the race you mentioned.  I actually don't think
> cpu_hotplug_disable() is necessary.

Great backing! You made me study the difference. If I get it
correctly:

  + cpu_hotplug_disable() works like a writer lock. It gets
exclusive access via cpu_hotplug_begin(). A side effect
is that do_cpu_up() and do_cpu_down() do not wait. They
return -EBUSY if hotplug is disabled.

  + get_online_cpus() is a kind of reader lock. It makes sure
that all the hotplug operations are finished and "softly"
blocks further operations. By "softly" I mean that
the operations wait for the exclusive (write) access
in cpu_hotplug_begin().

IMHO, we really have to use get_online_cpus() and avoid the
"hard" blocking.


> What do you think about something like the following:
 
> In klp_start_transition:
> 
>   get_online_cpus();
>   for_each_possible_cpu(cpu)
>   set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
>   put_online_cpus();
>
> In klp_try_complete_transition:
> 
>   get_online_cpus();
>   for_each_possible_cpu(cpu) {
>   task = idle_task(cpu);
>   if (cpu_online(cpu)) {
> 

Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

2017-01-11 Thread Josh Poimboeuf
On Wed, Jan 11, 2017 at 04:18:28PM +0100, Petr Mladek wrote:
> On Tue 2017-01-10 14:46:46, Josh Poimboeuf wrote:
> > On Tue, Jan 10, 2017 at 02:00:58PM +0100, Petr Mladek wrote:
> > > On Thu 2016-12-22 12:31:37, Josh Poimboeuf wrote:
> > > > On Thu, Dec 22, 2016 at 03:34:52PM +0100, Petr Mladek wrote:
> > > > > On Wed 2016-12-21 15:25:05, Josh Poimboeuf wrote:
> > > > > > On Tue, Dec 20, 2016 at 06:32:46PM +0100, Petr Mladek wrote:
> > > > > > > On Thu 2016-12-08 12:08:38, Josh Poimboeuf wrote:
> > > > > > > > +   read_unlock(&tasklist_lock);
> > > > > > > > +
> > > > > > > > +   /*
> > > > > > > > +* Ditto for the idle "swapper" tasks, though they 
> > > > > > > > never cross the
> > > > > > > > +* syscall barrier.  Instead they switch over in 
> > > > > > > > cpu_idle_loop().
> > > > > > > > +*/
> > > > > > > > +   get_online_cpus();
> > > > > > > > +   for_each_online_cpu(cpu)
> > > > > > > > +   set_tsk_thread_flag(idle_task(cpu), 
> > > > > > > > TIF_PATCH_PENDING);
> > > > > > > > +   put_online_cpus();
> > > > > > > 
> > > > > > > Also this stage need to be somehow handled by CPU coming/going
> > > > > > > handlers.
> > > > > > 
> > > > > > Here I think we could automatically switch any offline CPUs' idle 
> > > > > > tasks.
> > > > > > And something similar in klp_try_complete_transition().
> > > > > 
> > > > > We still need to make sure to do not race with the cpu_up()/cpu_down()
> > > > > calls.
> > > > 
> > > > Hm, maybe we'd need to call cpu_hotplug_disable() before switching the
> > > > offline idle tasks?
> > > > 
> > > > > I would use here the trick with for_each_possible_cpu() and let
> > > > > the migration for the stack check.
> > > > 
> > > > There are a few issues with that:
> > > > 
> > > > 1) The idle task of a missing CPU doesn't *have* a stack, so it doesn't
> > > >make much sense to try to check it.
> > > > 
> > > > 2) We can't rely *only* on the stack check, because not all arches have
> > > >it.  The other way to migrate idle tasks is from the idle loop switch
> > > >point.  But if the task's CPU is down, its idle loop isn't running so
> > > >it can't migrate.
> > > > 
> > > >(Note this is currently a theoretical point: we currently don't allow
> > > >such arches to use the consistency model anyway because there's no
> > > >way for them to migrate kthreads.)
> > > 
> > > Good points. My only concern is that the transaction might take a long
> > > or even forever. I am not sure if it is wise to disable cpu_hotplug
> > > for the entire transaction.
> > > 
> > > A compromise might be to disable cpu hotplug only when the task
> > > state is manipulated a more complex way. Hmm, cpu_hotplug_disable()
> > > looks like a rather costly function. We should not call it in
> > > klp_try_complete_transition(). But we could do:
> > > 
> > >   1. When the patch is being enabled, disable cpu hotplug,
> > >  go through each_possible_cpu and setup the transaction
> > >  only for CPUs that are online. Then we could enable
> > >  the hotplug again.
> > > 
> > >   2. Check only each_online_cpu in klp_try_complete_transition().
> > >  If all tasks are migrated, disable cpu hotplug and re-check
> > >  idle tasks on online CPUs. If any is not migrated, enable
> > >  hotplug and return failure. Otherwise, continue with
> > >  completion of the transaction. [*]
> > > 
> > >   3. In klp_complete_transition, update all tasks including
> > >  the offline CPUs and enable cpu hotplug again.
> > > 
> > > If the re-check in the 2nd step looks ugly, we could add some hotplug
> > > notifiers to make sure that enabled/disabled CPUs are in a reasonable
> > > state. We still should disable the hotplug in the 1st and 3rd step.
> > > 
> > > BTW: There is a new API for the cpu hotplug callbacks. I was involved
> > > in one conversion. You might take inspiration in
> > > drivers/thermal/intel_powerclamp.c. See cpuhp_setup_state_nocalls()
> > > there.
> > 
> > Backing up a bit, although I brought up cpu_hotplug_disable(), I think I
> > misunderstood the race you mentioned.  I actually don't think
> > cpu_hotplug_disable() is necessary.
> 
> Great backing! You made me to study the difference. If I get it
> correctly:
> 
>   + cpu_hotplug_disable() works like a writer lock. It gets
> exclusive access via cpu_hotplug_begin(). A side effect
> is that do_cpu_up() and do_cpu_down() do not wait. They
> return -EBUSY if hotplug is disabled.
> 
>   + get_online_cpus() is kind of reader lock. It makes sure
> that all the hotplug operations are finished and "softly"
> blocks other further operation. By "softly" I mean that
> the operations wait for the exclusive (write) access
> in cpu_hotplug_begin().
> 
> IMHO, we really have to use get_online_cpus() and avoid the
> "hard" blocking.
> 
> 
> > What do you think about something like the following:
>  
> > In klp_start_transition:
> > 
> > 

Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-02 Thread Petr Mladek
On Thu 2017-01-19 09:46:21, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> diff --git a/Documentation/livepatch/livepatch.txt 
> b/Documentation/livepatch/livepatch.txt
> index 7f04e13..fb00d66 100644
> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt
> +3.1 Adding consistency model support to new architectures
> +-
> +
> +For adding consistency model support to new architectures, there are a
> +few options:
> +
> +1) Add CONFIG_HAVE_RELIABLE_STACKTRACE.  This means porting objtool, and
> +   for non-DWARF unwinders, also making sure there's a way for the stack
> +   tracing code to detect interrupts on the stack.
> +
> +2) Alternatively, figure out a way to patch kthreads without stack
> +   checking.  If all kthreads sleep in the same place, then we can
> +   designate that place as a patching point.  I think Petr M has been
> +   working on that?



Alternatively, every kthread has to explicitly switch
current->patch_state at a safe place. Kthreads are typically
"infinite" loops that do some action repeatedly. The safe
location is between the loop iterations, when there are no locks
taken and all data structures are in a well defined state.

The location is well defined when using the workqueues or
kthread worker API. These kthreads process "independent"
works in a generic loop.

It is much more complicated with kthreads using a custom
loop. There the safe place must be carefully localized case
by case.


> +  In that case, arches without
> +   HAVE_RELIABLE_STACKTRACE would still be able to use the
> +   non-stack-checking parts of the consistency model:
> +
> +   a) patching user tasks when they cross the kernel/user space
> +  boundary; and
> +
> +   b) patching kthreads and idle tasks at their designated patch points.
> +
> +   This option isn't as good as option 1 because it requires signaling
> +   most of the tasks to patch them.

Kthreads need to be woken because most of them ignore signals.

And idle tasks might need to be explicitly scheduled if the load
is too big. Mirek knows more about this.

Well, I am not sure if you want to get into such details.


> +  But it could still be a good backup
> +   option for those architectures which don't have reliable stack traces
> +   yet.
> +
> +In the meantime, patches for such architectures can bypass the
> +consistency model by setting klp_patch.immediate to true.

I would add that this is perfectly fine for patches that do not
change the semantics of the patched functions. In practice, this is
usable in about 90% of security and critical fixes.

>  4. Livepatch module
> @@ -134,7 +242,7 @@ Documentation/livepatch/module-elf-format.txt for more 
> details.
>  
>  
>  4.2. Metadata
> -
> +-
>  
>  The patch is described by several structures that split the information
>  into three levels:
> @@ -239,9 +347,15 @@ Registered patches might be enabled either by calling 
> klp_enable_patch() or
>  by writing '1' to /sys/kernel/livepatch//enabled. The system will
>  start using the new implementation of the patched functions at this stage.
>  
> -In particular, if an original function is patched for the first time, a
> -function specific struct klp_ops is created and an universal ftrace handler
> -is registered.
> +When a patch is enabled, livepatch enters into a transition state where
> +tasks are converging to the patched state.  This is indicated by a value
> +of '1' in /sys/kernel/livepatch//transition.  Once all tasks have
> +been patched, the 'transition' value changes to '0'.  For more
> +information about this process, see the "Consistency model" section.
> +
> +If an original function is patched for the first time, a function
> +specific struct klp_ops is created and an universal ftrace handler is
> +registered.
>  
>  Functions might be patched multiple times. The ftrace handler is registered
>  only once for the given function. Further patches just add an entry to the
> @@ -261,6 +375,12 @@ by writing '0' to /sys/kernel/livepatch//enabled. 
> At this stage
>  either the code from the previously enabled patch or even the original
>  code gets used.
>  
> +When a patch is disabled, livepatch enters into a transition state where
> +tasks are converging to the unpatched state.  This is indicated by a
> +value of '1' in /sys/kernel/livepatch//transition.  Once all tasks
> +have been unpatched, the 'transition' value changes to '0'.  For more
> +information about this process, see the "Consistency model" section.
> +
>  Here all the functions (struct klp_func) associated with the to-be-disabled
>  patch are removed from the corresponding struct klp_ops. The ftrace handler
>  

Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-02 Thread Petr Mladek
IMPORTANT: Please forget this version. It is a few days old,
incomplete and probably wrong.

I am sorry for the confusion.

Best Regards,
Petr


Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-02 Thread Petr Mladek
!!! This is the right version. I am sorry again for the confusion. !!!

> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> diff --git a/Documentation/livepatch/livepatch.txt 
> b/Documentation/livepatch/livepatch.txt
> index 7f04e13..fb00d66 100644
> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt
> +3.1 Adding consistency model support to new architectures
> +-
> +
> +For adding consistency model support to new architectures, there are a
> +few options:
> +
> +1) Add CONFIG_HAVE_RELIABLE_STACKTRACE.  This means porting objtool, and
> +   for non-DWARF unwinders, also making sure there's a way for the stack
> +   tracing code to detect interrupts on the stack.
> +
> +2) Alternatively, figure out a way to patch kthreads without stack
> +   checking.  If all kthreads sleep in the same place, then we can
> +   designate that place as a patching point.  I think Petr M has been
> +   working on that?

Here is a version with some more details:

Alternatively, every kthread has to explicitly switch
current->patch_state at a safe place. Kthreads are typically
"infinite" loops that do some action repeatedly. The safe
location is between the loop iterations, when there are no locks
taken and all data structures are in a well defined state.

The location is clear when using the workqueues or the kthread worker
API. These kthreads process "independent" works in a generic loop.

It is much more complicated with kthreads using a custom loop.
There the safe place must be carefully localized case by case.
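
To illustrate the idea, a sketch only (the kthread body and the helper
name are made up; klp_patch_pending()/klp_update_patch_state() are the
interfaces from this patch set):

	static int example_kthread_fn(void *data)
	{
		while (!kthread_should_stop()) {
			do_one_work_item();	/* made-up workload */

			/*
			 * Safe point: no locks held, data structures in a
			 * well defined state.  Switch the patch state here.
			 */
			if (klp_patch_pending(current))
				klp_update_patch_state(current);

			schedule_timeout_interruptible(HZ);
		}
		return 0;
	}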


> +  In that case, arches without
> +   HAVE_RELIABLE_STACKTRACE would still be able to use the
> +   non-stack-checking parts of the consistency model:
> +
> +   a) patching user tasks when they cross the kernel/user space
> +  boundary; and
> +
> +   b) patching kthreads and idle tasks at their designated patch points.
> +
> +   This option isn't as good as option 1 because it requires signaling
> +   most of the tasks to patch them.

Kthreads need to be woken because most of them ignore signals.

And idle tasks might need to be explicitly scheduled if the load
is too big. Mirek knows more about this.

Well, I am not sure if you want to get into such details.


> +  But it could still be a good backup
> +   option for those architectures which don't have reliable stack traces
> +   yet.
> +
> +In the meantime, patches for such architectures can bypass the
> +consistency model by setting klp_patch.immediate to true.

I would add that this is perfectly fine for patches that do not
change the semantics of the patched functions. In practice, this is
usable in about 90% of security and critical fixes.
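
For reference, a minimal sketch of a patch module opting out of the
consistency model with the new flag (modeled on the sample module,
illustrative only):

	static struct klp_patch patch = {
		.mod = THIS_MODULE,
		.objs = objs,
		/* Bypass the consistency model; patch/unpatch immediately. */
		.immediate = true,
	};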


>  4. Livepatch module
> @@ -134,7 +242,7 @@ Documentation/livepatch/module-elf-format.txt for more 
> details.
>  
>  
>  4.2. Metadata
> -
> +-
>  
>  The patch is described by several structures that split the information
>  into three levels:
> @@ -239,9 +347,15 @@ Registered patches might be enabled either by calling 
> klp_enable_patch() or
>  by writing '1' to /sys/kernel/livepatch//enabled. The system will
>  start using the new implementation of the patched functions at this stage.
>  
> -In particular, if an original function is patched for the first time, a
> -function specific struct klp_ops is created and an universal ftrace handler
> -is registered.
> +When a patch is enabled, livepatch enters into a transition state where
> +tasks are converging to the patched state.  This is indicated by a value
> +of '1' in /sys/kernel/livepatch//transition.  Once all tasks have
> +been patched, the 'transition' value changes to '0'.  For more
> +information about this process, see the "Consistency model" section.
> +
> +If an original function is patched for the first time, a function
> +specific struct klp_ops is created and an universal ftrace handler is
> +registered.
>  
>  Functions might be patched multiple times. The ftrace handler is registered
>  only once for the given function. Further patches just add an entry to the
> @@ -261,6 +375,12 @@ by writing '0' to /sys/kernel/livepatch//enabled. 
> At this stage
>  either the code from the previously enabled patch or even the original
>  code gets used.
>  
> +When a patch is disabled, livepatch enters into a transition state where
> +tasks are converging to the unpatched state.  This is indicated by a
> +value of '1' in /sys/kernel/livepatch//transition.  Once all tasks
> +have been unpatched, the 'transition' value changes to '0'.  For more
> +information about this process, see the "Consistency model" section.
> +
>  Here all the functions (struct klp_func) associated with the to-be-disabled
>  patch are removed fr

Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-03 Thread Miroslav Benes
On Thu, 2 Feb 2017, Petr Mladek wrote:

> > diff --git a/Documentation/livepatch/livepatch.txt 
> > b/Documentation/livepatch/livepatch.txt
> > index 7f04e13..fb00d66 100644
> > --- a/Documentation/livepatch/livepatch.txt
> > +++ b/Documentation/livepatch/livepatch.txt
> 
> > +  In that case, arches without
> > +   HAVE_RELIABLE_STACKTRACE would still be able to use the
> > +   non-stack-checking parts of the consistency model:
> > +
> > +   a) patching user tasks when they cross the kernel/user space
> > +  boundary; and
> > +
> > +   b) patching kthreads and idle tasks at their designated patch points.
> > +
> > +   This option isn't as good as option 1 because it requires signaling
> > +   most of the tasks to patch them.
> 
> Kthreads need to be waken because most of them ignore signals.
> 
> And idle tasks might need to be explicitely scheduled if there
> is too big load. Mirek knows more about this.

Yes, and we've already discussed it with Josh. The plan is not to do
anything now, because the described situations should be rare and/or
only theoretical. It should be on our TODO lists though.
 
> Well, I am not sure if you want to get into such details.

I would not bother about it.
 
[...]

> > +/*
> > + * Start the transition to the specified target patch state so tasks can 
> > begin
> > + * switching to it.
> > + */
> > +void klp_start_transition(void)
> > +{
> > +   struct task_struct *g, *task;
> > +   unsigned int cpu;
> > +
> > +   WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> > +
> > +   pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> > + klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> > +
> > +   /*
> > +* If the patch can be applied or reverted immediately, skip the
> > +* per-task transitions.
> > +*/
> > +   if (klp_transition_patch->immediate)
> 
> We should call klp_try_complete_transition() here. Otherwise, it will
> never be called and the transition will never get completed.
> 
> Alternative solution would be to move klp_try_complete_transition()
> from klp_start_transition() and explicitely call it from
> __klp_disable_patch() and klp_enable_patch(). It would actually
> solve one issue with klp_revert_patch(), see below.
> 
> I kind of like the alternative solution. I hope that it was not
> me who suggested to move klp_try_complete_transition() into
> klp_start_transition().

[...]

> Hmm, we should not call klp_try_complete_transition() when
> klp_start_transition() is called from here. I can't find a safe
> way to cancel klp_transition_work() when we own klp_mutex.
> It smells like a possible deadlock.
> 
> I suggest to move klp_try_complete_transition() outside
> klp_start_transition() and explicitly call it from
> __klp_disable_patch() and __klp_enable_patch().
> This would also fix the problem with immediate patches, see
> klp_start_transition().

I agree. I think the best would be to move klp_try_complete_transition() 
out from klp_start_transition() and call it explicitly. This would solve 
the immediate problem for free.

I agree we should not call klp_try_complete_transition() from
klp_reverse_transition() and should leave it entirely to our delayed work. We
discussed this in the past and the counter argument was that an explicit call
to klp_try_complete_transition() could make the process a bit faster.
Possibly true, but reversion is really a slow path and I would not care
about speed at all. I think it is cleaner and perhaps safer.
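
A rough sketch of the suggested flow (placement and details are my
assumption, not the actual patch):

	/* In __klp_enable_patch(): drive the transition explicitly. */
	klp_init_transition(patch, KLP_PATCHED);
	klp_start_transition();

	/*
	 * Try to complete right away; if some tasks have not converged yet,
	 * the periodic transition work keeps retrying later.
	 */
	if (!klp_try_complete_transition())
		pr_debug("'%s': transition in progress\n", patch->mod->name);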

Thanks,
Miroslav


Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-03 Thread Miroslav Benes

Petr has already mentioned the majority of things I found out too, so only
a couple of nits...

> diff --git a/Documentation/ABI/testing/sysfs-kernel-livepatch 
> b/Documentation/ABI/testing/sysfs-kernel-livepatch
> index da87f43..24b6570 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-livepatch
> +++ b/Documentation/ABI/testing/sysfs-kernel-livepatch
> @@ -25,6 +25,14 @@ Description:
>   code is currently applied.  Writing 0 will disable the patch
>   while writing 1 will re-enable the patch.
>  
> +What:/sys/kernel/livepatch//transition
> +Date:May 2016

'May 2016' looks strange, but maybe nobody cares about it...

> diff --git a/Documentation/livepatch/livepatch.txt 
> b/Documentation/livepatch/livepatch.txt
> index 7f04e13..fb00d66 100644
> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt
>  4. Livepatch module
> @@ -134,7 +242,7 @@ Documentation/livepatch/module-elf-format.txt for more 
> details.
>  
>  
>  4.2. Metadata
> -
> +-

klp_func and klp_patch have a new member, immediate. It should be documented
here in the "4.2. Metadata" section.

The section also contains this text under the klp_patch bullet. It seems
outdated:

"Also if a more complex consistency model is supported then a selected 
unit (thread, kernel as a whole) will see the new code from the entire 
patch only when it is in a safe state."

We now have a more complex consistency model.

And finally, the section "Limitations" has this text under the first 
bullet:

  + The patch must not change the semantic of the patched functions.

The current implementation guarantees only that either the old
or the new function is called. The functions are patched one
by one. It means that the patch must _not_ change the semantic
of the function.

I think it is confusing. The consistency model allows us to change the
semantics of a function, to a certain degree. Of course, there are cases that
cannot be patched, or have to be patched carefully. For example, if a
function takes a lock by calling foo_lock(), foo_lock() is not on the stack
afterwards. Then the locking semantics may be changed with a livepatch.
One has to make sure to also patch the caller of foo_lock() to enforce the
consistency. And so on... But I do not consider this a limitation of livepatch.
It is a feature of the consistency model, which is weaker than kGraft's or
kpatch's (or stronger, depending on your point of view).

So, I propose to remove this text and better describe the properties of
the consistency model above in section 3. Maybe a quote from an old
mail thread (Nov 2014) would be sufficient. I don't remember what was
mentioned and what was not.

What do you think?

> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> index 6602b34..ed90ad1 100644
> --- a/include/linux/livepatch.h
> +++ b/include/linux/livepatch.h
> @@ -68,7 +92,7 @@ struct klp_func {
>   * @funcs:   function entries for functions to be patched in the object
>   * @kobj:kobject for sysfs resources
>   * @mod: kernel module associated with the patched object
> - *   (NULL for vmlinux)
> + *   (NULL for vmlinux)

This looks superfluous.

(checking my notes)... and that's it. Aside from the discussion in Petr's 
subthread it looks good to me. Great job.

Thanks,
Miroslav


Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-03 Thread Josh Poimboeuf
On Thu, Feb 02, 2017 at 12:51:16PM +0100, Petr Mladek wrote:
> !!! This is the right version. I am sorry again for the confusion. !!!
> 
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > diff --git a/Documentation/livepatch/livepatch.txt 
> > b/Documentation/livepatch/livepatch.txt
> > index 7f04e13..fb00d66 100644
> > --- a/Documentation/livepatch/livepatch.txt
> > +++ b/Documentation/livepatch/livepatch.txt
> > +3.1 Adding consistency model support to new architectures
> > +-
> > +
> > +For adding consistency model support to new architectures, there are a
> > +few options:
> > +
> > +1) Add CONFIG_HAVE_RELIABLE_STACKTRACE.  This means porting objtool, and
> > +   for non-DWARF unwinders, also making sure there's a way for the stack
> > +   tracing code to detect interrupts on the stack.
> > +
> > +2) Alternatively, figure out a way to patch kthreads without stack
> > +   checking.  If all kthreads sleep in the same place, then we can
> > +   designate that place as a patching point.  I think Petr M has been
> > +   working on that?
> 
> Here is version with some more details:
> 
> Alternatively, every kthread has to explicitely switch
> current->patch_state on a safe place. Kthreads are typically
> "infinite" loops that do some action repeatedly. The safe
> location is between the loops when there are no locks
> taken and all data structures are in a well defined state.
> 
> The location is clear when using the workqueues or the kthread worker
> API. These kthreads process "independent" works in a generic loop.
> 
> It is much more complicated with kthreads using a custom loop.
> There the safe place must be carefully localized case by case.

Good clarification.

> > +  In that case, arches without
> > +   HAVE_RELIABLE_STACKTRACE would still be able to use the
> > +   non-stack-checking parts of the consistency model:
> > +
> > +   a) patching user tasks when they cross the kernel/user space
> > +  boundary; and
> > +
> > +   b) patching kthreads and idle tasks at their designated patch points.
> > +
> > +   This option isn't as good as option 1 because it requires signaling
> > +   most of the tasks to patch them.
> 
> Kthreads need to be waken because most of them ignore signals.
> 
> And idle tasks might need to be explicitely scheduled if there
> is too big load. Mirek knows more about this.
> 
> Well, I am not sure if you want to get into such details.
> 
> 
> > +  But it could still be a good backup
> > +   option for those architectures which don't have reliable stack traces
> > +   yet.
> > +
> > +In the meantime, patches for such architectures can bypass the
> > +consistency model by setting klp_patch.immediate to true.
> 
> I would add that this is perfectly fine for patches that do not
> change semantic of the patched functions. In practice, this is
> usable in about 90% of security and critical fixes.

Another good one.
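
For illustration, opting out of the consistency model then amounts to one
extra line in the patch description. The sketch below follows the layout of
samples/livepatch/livepatch-sample.c (the replacement function and its target
are just the sample's example); only the '.immediate = true' initializer is
the part under discussion here.

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/livepatch.h>
#include <linux/seq_file.h>

static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
{
        seq_printf(m, "%s\n", "this has been live patched");
        return 0;
}

static struct klp_func funcs[] = {
        {
                .old_name = "cmdline_proc_show",
                .new_func = livepatch_cmdline_proc_show,
        }, { }
};

static struct klp_object objs[] = {
        {
                /* name being NULL means vmlinux */
                .funcs = funcs,
        }, { }
};

static struct klp_patch patch = {
        .mod = THIS_MODULE,
        .objs = objs,
        /*
         * The fix does not change function or data semantics, so skip
         * the per-task transition and patch all tasks immediately.
         */
        .immediate = true,
};

static int livepatch_init(void)
{
        int ret;

        ret = klp_register_patch(&patch);
        if (ret)
                return ret;

        ret = klp_enable_patch(&patch);
        if (ret) {
                WARN_ON(klp_unregister_patch(&patch));
                return ret;
        }

        return 0;
}

static void livepatch_exit(void)
{
        WARN_ON(klp_unregister_patch(&patch));
}

module_init(livepatch_init);
module_exit(livepatch_exit);
MODULE_LICENSE("GPL");
MODULE_INFO(livepatch, "Y");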

> >  4. Livepatch module
> > @@ -134,7 +242,7 @@ Documentation/livepatch/module-elf-format.txt for more 
> > details.
> >  
> >  
> >  4.2. Metadata
> > -
> > +-
> >  
> >  The patch is described by several structures that split the information
> >  into three levels:
> > @@ -239,9 +347,15 @@ Registered patches might be enabled either by calling 
> > klp_enable_patch() or
> >  by writing '1' to /sys/kernel/livepatch/<patch>/enabled. The system will
> >  start using the new implementation of the patched functions at this stage.
> >  
> > -In particular, if an original function is patched for the first time, a
> > -function specific struct klp_ops is created and an universal ftrace handler
> > -is registered.
> > +When a patch is enabled, livepatch enters into a transition state where
> > +tasks are converging to the patched state.  This is indicated by a value
> > +of '1' in /sys/kernel/livepatch/<patch>/transition.  Once all tasks have
> > +been patched, the 'transition' value changes to '0'.  For more
> > +information about this process, see the "Consistency model" section.
> > +
> > +If an original function is patched for the first time, a function
> > +specific struct klp_ops is created and an universal ftrace handler is
> > +registered.
> >  
> >  Functions might be patched multiple times. The ftrace handler is registered
> >  only once for the given function. Further patches just add an entry to the
> > @@ -261,6 +375,12 @@ by writing '0' to 
> > /sys/kernel/livepatch/<patch>/enabled. At this stage
> >  either the code from the previously enabled patch or even the original
> >  code gets used.
> >  
> > +When a patch is disabled, livepatch enters into a transition state where
> > +tasks are converging to the unpatched state.  This is indicated by a
> > +v

Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-06 Thread Josh Poimboeuf
On Fri, Feb 03, 2017 at 05:41:28PM +0100, Miroslav Benes wrote:
> 
> Petr has already mentioned the majority of things I found too, so only 
> a couple of nits...
> 
> > diff --git a/Documentation/ABI/testing/sysfs-kernel-livepatch 
> > b/Documentation/ABI/testing/sysfs-kernel-livepatch
> > index da87f43..24b6570 100644
> > --- a/Documentation/ABI/testing/sysfs-kernel-livepatch
> > +++ b/Documentation/ABI/testing/sysfs-kernel-livepatch
> > @@ -25,6 +25,14 @@ Description:
> > code is currently applied.  Writing 0 will disable the patch
> > while writing 1 will re-enable the patch.
> >  
> > +What:  /sys/kernel/livepatch/<patch>/transition
> > +Date:  May 2016
> 
> 'May 2016' looks strange, but maybe nobody cares about it...

Will update the date to Feb 2017 and also the affected kernel version to
4.12.

> 
> > diff --git a/Documentation/livepatch/livepatch.txt 
> > b/Documentation/livepatch/livepatch.txt
> > index 7f04e13..fb00d66 100644
> > --- a/Documentation/livepatch/livepatch.txt
> > +++ b/Documentation/livepatch/livepatch.txt
> >  4. Livepatch module
> > @@ -134,7 +242,7 @@ Documentation/livepatch/module-elf-format.txt for more 
> > details.
> >  
> >  
> >  4.2. Metadata
> > -
> > +-
> 
> klp_func and klp_patch have a new member - immediate. It should be documented 
> here in the "4.2. Metadata" section.

Agreed.

> The section also contains this text under the klp_patch bullet. It seems 
> outdated:
> 
> "Also if a more complex consistency model is supported then a selected 
> unit (thread, kernel as a whole) will see the new code from the entire 
> patch only when it is in a safe state."
> 
> We now have a more complex consistency model.

Agreed.

> And finally, the section "Limitations" has this text under the first 
> bullet:
> 
>   + The patch must not change the semantic of the patched functions.
> 
> The current implementation guarantees only that either the old
> or the new function is called. The functions are patched one
> by one. It means that the patch must _not_ change the semantic
> of the function.
> 
> I think it is confusing. The consistency model allows us to change the 
> semantics of a function, to a certain degree. Of course, there are cases that 
> cannot be patched, or have to be patched carefully. For example, if a 
> function takes a lock by calling foo_lock(), foo_lock() is no longer on the 
> stack after it returns, so the locking semantics may be changed by a 
> livepatch without the stack check noticing. One has to make sure to also 
> patch the caller of foo_lock() to enforce the consistency. And so on... But 
> I do not consider this a limitation of livepatch. It is a feature of the 
> consistency model, which is weaker than kGraft's or kpatch's (or stronger; 
> it depends on your point of view).
> 
> So, I propose to remove this text and better describe the properties of 
> the consistency model above in the section 3. Maybe a quote from an old 
> mail thread (Nov 2014) would be sufficient. I don't remember what was 
> mentioned and what not.
> 
> What do you think?

I'll remove the above limitation.

I'm not sure how to improve the consistency model section.  It already
has at least some mentions of changed function semantics and locking
semantics.  I'll leave it alone for now, unless you have a specific
suggestion.
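
To spell out the foo_lock() example above in code (all names are made up for
illustration): once foo_lock() has returned it is no longer on any stack, so
a stack check of a task sleeping inside foo_update() cannot tell which
locking scheme that task entered the critical section with. A patch that
changes the locking scheme therefore has to cover the caller too.

#include <linux/spinlock.h>

struct foo {
        spinlock_t lock;
        int val;
};

/* Original (unpatched) code. */
static void foo_lock(struct foo *f)
{
        spin_lock(&f->lock);
}

static void foo_unlock(struct foo *f)
{
        spin_unlock(&f->lock);
}

static void foo_update(struct foo *f, int val)
{
        foo_lock(f);    /* once this returns, foo_lock() is not on the stack */
        f->val = val;
        foo_unlock(f);
}

/*
 * If a livepatch changes foo_lock()/foo_unlock() to a different locking
 * scheme, it should also patch foo_update().  A task then either runs
 * the old foo_update() with the old scheme or the new foo_update() with
 * the new one, and the stack check keeps the pair consistent; patching
 * only foo_lock() would let the two schemes mix in one critical section.
 */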

> > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > index 6602b34..ed90ad1 100644
> > --- a/include/linux/livepatch.h
> > +++ b/include/linux/livepatch.h
> > @@ -68,7 +92,7 @@ struct klp_func {
> >   * @funcs: function entries for functions to be patched in the object
> >   * @kobj:  kobject for sysfs resources
> >   * @mod:   kernel module associated with the patched object
> > - * (NULL for vmlinux)
> > + * (NULL for vmlinux)
> 
> This looks superfluous.

This is a minor whitespace fix -- remove a space before tab.  I figured
I'd go ahead and fix it since I'm already changing some of the
surrounding code.

> (checking my notes)... and that's it. Aside from the discussion in Petr's 
> subthread it looks good to me. Great job.

Thanks!

-- 
Josh


Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-06 Thread Petr Mladek
On Fri 2017-02-03 14:39:16, Josh Poimboeuf wrote:
> On Thu, Feb 02, 2017 at 12:51:16PM +0100, Petr Mladek wrote:
> > !!! This is the right version. I am sorry again for the confusion. !!!
> >
> > >  static int __klp_disable_patch(struct klp_patch *patch)
> > >  {
> > > - struct klp_object *obj;
> > > + if (klp_transition_patch)
> > > + return -EBUSY;
> > >  
> > >   /* enforce stacking: only the last enabled patch can be disabled */
> > >   if (!list_is_last(&patch->list, &klp_patches) &&
> > >   list_next_entry(patch, list)->enabled)
> > >   return -EBUSY;
> > >  
> > > - pr_notice("disabling patch '%s'\n", patch->mod->name);
> > > + klp_init_transition(patch, KLP_UNPATCHED);
> > >  
> > > - klp_for_each_object(patch, obj) {
> > > - if (obj->patched)
> > > - klp_unpatch_object(obj);
> > > - }
> > > + /*
> > > +  * Enforce the order of the klp_target_state write in
> > > +  * klp_init_transition() and the TIF_PATCH_PENDING writes in
> > > +  * klp_start_transition() to ensure that klp_update_patch_state()
> > > +  * doesn't set a task->patch_state to KLP_UNDEFINED.
> > > +  */
> > > + smp_wmb();
> > 
> > The description is not clear. The klp_target_state manipulation
> > is synchronized by another barrier inside klp_init_transition().
> 
> Yeah.  I should also update the barrier comment in klp_init_transition()
> to clarify that it also does this.
> 
> > A similar barrier is in __klp_enable_patch() and it is correctly
> > described there:
> > 
> >It enforces the order of the func->transition writes in
> >klp_init_transition() and the ops->func_stack writes in
> >klp_patch_object(). The corresponding barrier is in
> >klp_ftrace_handler().
> > 
> > But we do not modify ops->func_stack in __klp_disable_patch().
> > So we need another explanation.
> > 
> > Huh, I spent a few hours thinking about it. I am almost sure
> > that it is not needed. But I am not 100% sure. The many-times-rewritten
> > summary looks like:
> > 
> > /*
> >  * Enforce the order of func->transition write in
> >  * klp_init_transition() against TIF_PATCH_PENDING writes
> >  * in klp_start_transition(). It makes sure that
> >  * klp_ftrace_handler() will see func->transition set
> >  * after the task is migrated by klp_update_patch_state().
> >  *
> >  * The barrier probably is not needed because the task
> >  * must not be migrated when being inside klp_ftrace_handler()
> >  * and there is another full barrier in
> >  * klp_update_patch_state().
> >  * But this is a slow path, so better safe than sorry.
> >  */
> >  smp_wmb();
> 
> This mostly makes sense,  but I think the barrier *is* needed for
> ordering func->transition and TIF_PATCH_PENDING writes for the rare case
> where klp_ftrace_handler() is called right after
> klp_update_patch_state(), which could be possible in the idle loop, for
> example.
> 
> CPU0  CPU1
> __klp_disable_patch()
>   klp_init_transition()
> func->transition = true;
>   (...no barrier...)
>   klp_start_transition()
> set TIF_PATCH_PENDING
> 
>   klp_update_patch_state()
> if (test_and_clear(TIF_PATCH_PENDING))
>   task->patch_state = KLP_UNPATCHED;
>   ...
>   klp_ftrace_handler()
> smp_rmb();
> if (unlikely(func->transition)) <--- false 
> (patched)
>   ...
>   klp_ftrace_handler()
> smp_rmb();
> if (unlikely(func->transition)) <--- true 
> (unpatched)

You are right. I was able to find many scenarios where the barrier
was not needed. But it is needed in this one.

The first paragraph should be enough then:

/*
 * Enforce the order of func->transition write in
 * klp_init_transition() against TIF_PATCH_PENDING writes
 * in klp_start_transition(). It makes sure that
 * klp_ftrace_handler() will see func->transition set
 * after the task is migrated by klp_update_patch_state().
 */
 smp_wmb();


> So how about:
> 
>   /*
>* Enforce the order of the func->transition writes in
>* klp_init_transition() and the TIF_PATCH_PENDING writes in
>* klp_start_transition().  In the rare case where klp_ftrace_handler()
>* is called shortly after klp_update_patch_state() switches the task,
>* this ensures the handler sees func->transition is set.
>*/
>   smp_wmb();

Looks good to me.


> > > + klp_start_transition();
> > >   patch->enabled = false;
> > >  
> > >   return 0;
> > > @@ -337,6 +341,9 @@ static int __klp_enable_patch(struct klp_patch *patch)
> > >   struct klp_object *obj;
> > >   int ret;
> > >  
> > > + if (klp_transition_patch)

Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-06 Thread Josh Poimboeuf
On Mon, Feb 06, 2017 at 05:44:31PM +0100, Petr Mladek wrote:
> > > > @@ -347,22 +354,37 @@ static int __klp_enable_patch(struct klp_patch 
> > > > *patch)
> > > >  
> > > > pr_notice("enabling patch '%s'\n", patch->mod->name);
> > > >  
> > > > +   klp_init_transition(patch, KLP_PATCHED);
> > > > +
> > > > +   /*
> > > > +* Enforce the order of the func->transition writes in
> > > > +* klp_init_transition() and the ops->func_stack writes in
> > > > +* klp_patch_object(), so that klp_ftrace_handler() will see the
> > > > +* func->transition updates before the handler is registered 
> > > > and the
> > > > +* new funcs become visible to the handler.
> > > > +*/
> > > > +   smp_wmb();
> > > > +
> > > > klp_for_each_object(patch, obj) {
> > > > if (!klp_is_object_loaded(obj))
> > > > continue;
> > > >  
> > > > ret = klp_patch_object(obj);
> > > > -   if (ret)
> > > > -   goto unregister;
> > > > +   if (ret) {
> > > > +   pr_warn("failed to enable patch '%s'\n",
> > > > +   patch->mod->name);
> > > > +
> > > > +   klp_unpatch_objects(patch);
> > > 
> > > We should call here synchronize_rcu() here as we do
> > > in klp_try_complete_transition(). Some of the affected
> > > functions might have more versions on the stack and we
> > > need to make sure that klp_ftrace_handler() will _not_
> > > see the removed patch on the stack.
> > 
> > Even if the handler sees the new func on the stack, the
> > task->patch_state is still KLP_UNPATCHED, so it will still choose the
> > previous version of the function.  Or did I miss your point?
> 
> The barrier is needed for exactly the same reason as the one
> in klp_try_complete_transition()
> 
> CPU0  CPU1
> 
> __klp_enable_patch()
>   klp_init_transition()
> 
> for_each...
>   task->patch_state = KLP_UNPATCHED
> 
> for_each...
>   func->transition = true
> 
>   klp_for_each_object()
> klp_patch_object()
>   list_add_rcu()
> 
>   klp_ftrace_handler()
> func = list_first_...()
> 
> if (func->transition)
> 
> 
> ret = klp_patch_object()
> /* error */
> if (ret) {
>   klp_unpatch_objects()
> 
>   list_remove_rcu()
> 
>   klp_complete_transition()
> 
>   for_each_
> func->transition = false
> 
>   for_each_
> task->patch_state = KLP_UNDEFINED
> 
>   patch_state = current->patch_state;
>   WARN_ON_ONCE(patch_state
>   ==
>KLP_UNDEFINED);
> 
> BANG: The warning is triggered.
> 
> => we need to call synchronize_rcu(). It will make sure that
> no ftrace handler will see the removed func on the stack
> and then we can clear all the other values.

Makes sense.

Notice in this case that klp_target_state is KLP_PATCHED.  Which means
that klp_complete_transition() would not call synchronize_rcu() at the
right time, nor would it call module_put().  It can be fixed with:

@@ -387,7 +389,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
pr_warn("failed to enable patch '%s'\n",
patch->mod->name);
 
-   klp_unpatch_objects(patch);
+   klp_target_state = KLP_UNPATCHED;
klp_complete_transition();
 
return ret;

This assumes that the 'if (klp_target_state == KLP_UNPATCHED)' clause in
klp_try_complete_transition() gets moved to klp_complete_transition() as
you suggested.

> > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > index 5efa262..1a77f05 100644
> > > > --- a/kernel/livepatch/patch.c
> > > > +++ b/kernel/livepatch/patch.c
> > > > @@ -29,6 +29,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include "patch.h"
> > > > +#include "transition.h"
> > > >  
> > > >  static LIST_HEAD(klp_ops);
> > > >  
> > > > @@ -54,15 +55,58 @@ static void notrace klp_ftrace_handler(unsigned 
> > > > long ip,
> > > >  {
> > > > struct klp_ops *ops;
> > > > struct klp_func *func;
> > > > +   int patch_state;
> > > >  
> > > > ops = container_of(fops, struct klp_ops, fops);
> > > >  
> > > > rcu_read_lock();
> > > > +
> > > > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > >   stack_node);
> > > > +
> > > > +   /*
> > > > +* func should never be NULL because preemption should be 
> > > > disabled here
> > > > +* and unregister_ftrace_function() does the equivalen

Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-07 Thread Miroslav Benes

> > And finally, the section "Limitations" has this text under the first 
> > bullet:
> > 
> >   + The patch must not change the semantic of the patched functions.
> > 
> > The current implementation guarantees only that either the old
> > or the new function is called. The functions are patched one
> > by one. It means that the patch must _not_ change the semantic
> > of the function.
> > 
> > I think it is confusing. The consistency model allows us to change the 
> > semantics of a function, to a certain degree. Of course, there are cases that 
> > cannot be patched, or have to be patched carefully. For example, if a 
> > function takes a lock by calling foo_lock(), foo_lock() is no longer on the 
> > stack after it returns, so the locking semantics may be changed by a 
> > livepatch without the stack check noticing. One has to make sure to also 
> > patch the caller of foo_lock() to enforce the consistency. And so on... But 
> > I do not consider this a limitation of livepatch. It is a feature of the 
> > consistency model, which is weaker than kGraft's or kpatch's (or stronger; 
> > it depends on your point of view).
> > 
> > So, I propose to remove this text and better describe the properties of 
> > the consistency model above in the section 3. Maybe a quote from an old 
> > mail thread (Nov 2014) would be sufficient. I don't remember what was 
> > mentioned and what not.
> > 
> > What do you think?
> 
> I'll remove the above limitation.
> 
> I'm not sure how to improve the consistency model section.  It already
> has at least some mentions of changed function semantics and locking
> semantics.  I'll leave it alone for now, unless you have a specific
> suggestion.

Fair enough. Let's see if I can come up with something.
 
> > > diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > > index 6602b34..ed90ad1 100644
> > > --- a/include/linux/livepatch.h
> > > +++ b/include/linux/livepatch.h
> > > @@ -68,7 +92,7 @@ struct klp_func {
> > >   * @funcs:   function entries for functions to be patched in the 
> > > object
> > >   * @kobj:kobject for sysfs resources
> > >   * @mod: kernel module associated with the patched object
> > > - *   (NULL for vmlinux)
> > > + *   (NULL for vmlinux)
> > 
> > This looks superfluous.
> 
> This is a minor whitespace fix -- remove a space before tab.  I figured
> I'd go ahead and fix it since I'm already changing some of the
> surrounding code.

Ok, no problem.

Thanks,
Miroslav


Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-08 Thread Petr Mladek
On Mon 2017-02-06 13:51:48, Josh Poimboeuf wrote:
> On Mon, Feb 06, 2017 at 05:44:31PM +0100, Petr Mladek wrote:
> > > > > @@ -347,22 +354,37 @@ static int __klp_enable_patch(struct klp_patch 
> > > > > *patch)
> > > > >  
> > > > >   pr_notice("enabling patch '%s'\n", patch->mod->name);
> > > > >  
> > > > > + klp_init_transition(patch, KLP_PATCHED);
> > > > > +
> > > > > + /*
> > > > > +  * Enforce the order of the func->transition writes in
> > > > > +  * klp_init_transition() and the ops->func_stack writes in
> > > > > +  * klp_patch_object(), so that klp_ftrace_handler() will see the
> > > > > +  * func->transition updates before the handler is registered 
> > > > > and the
> > > > > +  * new funcs become visible to the handler.
> > > > > +  */
> > > > > + smp_wmb();
> > > > > +
> > > > >   klp_for_each_object(patch, obj) {
> > > > >   if (!klp_is_object_loaded(obj))
> > > > >   continue;
> > > > >  
> > > > >   ret = klp_patch_object(obj);
> > > > > - if (ret)
> > > > > - goto unregister;
> > > > > + if (ret) {
> > > > > + pr_warn("failed to enable patch '%s'\n",
> > > > > + patch->mod->name);
> > > > > +
> > > > > + klp_unpatch_objects(patch);
> > > > 
> > > > We should call here synchronize_rcu() here as we do
> > > > in klp_try_complete_transition(). Some of the affected
> > > > functions might have more versions on the stack and we
> > > > need to make sure that klp_ftrace_handler() will _not_
> > > > see the removed patch on the stack.
> > > 
> > > Even if the handler sees the new func on the stack, the
> > > task->patch_state is still KLP_UNPATCHED, so it will still choose the
> > > previous version of the function.  Or did I miss your point?
> > 
> > The barrier is needed for exactly the same reason as the one
> > in klp_try_complete_transition()
> > 
> > CPU0CPU1
> > 
> > __klp_enable_patch()
> >   klp_init_transition()
> > 
> > for_each...
> >   task->patch_state = KLP_UNPATCHED
> > 
> > for_each...
> >   func->transition = true
> > 
> >   klp_for_each_object()
> > klp_patch_object()
> >   list_add_rcu()
> > 
> > klp_ftrace_handler()
> >   func = list_first_...()
> > 
> >   if (func->transition)
> > 
> > 
> > ret = klp_patch_object()
> > /* error */
> > if (ret) {
> >   klp_unpatch_objects()
> > 
> > list_remove_rcu()
> > 
> >   klp_complete_transition()
> > 
> > for_each_
> >   func->transition = false
> > 
> > for_each_
> >   task->patch_state = KLP_UNDEFINED
> > 
> > patch_state = current->patch_state;
> > WARN_ON_ONCE(patch_state
> > ==
> >  KLP_UNDEFINED);
> > 
> > BANG: The warning is triggered.
> > 
> > => we need to call synchronize_rcu(). It will make sure that
> > no ftrace handler will see the removed func on the stack
> > and then we can clear all the other values.
> 
> Makes sense.
> 
> Notice in this case that klp_target_state is KLP_PATCHED.  Which means
> that klp_complete_transition() would not call synchronize_rcu() at the
> right time, nor would it call module_put().  It can be fixed with:
>
> @@ -387,7 +389,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
>   pr_warn("failed to enable patch '%s'\n",
>   patch->mod->name);
>  
> - klp_unpatch_objects(patch);
> + klp_target_state = KLP_UNPATCHED;
>   klp_complete_transition();
>  
>   return ret;

Great catch! Looks good to me.

> This assumes that the 'if (klp_target_state == KLP_UNPATCHED)' clause in
> klp_try_complete_transition() gets moved to klp_complete_transition() as
> you suggested.
> 
> > > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > > index 5efa262..1a77f05 100644
> > > > > --- a/kernel/livepatch/patch.c
> > > > > +++ b/kernel/livepatch/patch.c
> > > > > @@ -29,6 +29,7 @@
> > > > >  #include 
> > > > >  #include 
> > > > >  #include "patch.h"
> > > > > +#include "transition.h"
> > > > >  
> > > > >  static LIST_HEAD(klp_ops);
> > > > >  
> > > > > @@ -54,15 +55,58 @@ static void notrace klp_ftrace_handler(unsigned 
> > > > > long ip,
> > > > >  {
> > > > >   struct klp_ops *ops;
> > > > >   struct klp_func *func;
> > > > > + int patch_state;
> > > > >  
> > > > >   ops = container_of(fops, struct klp_ops, fops);
> > > > >  
> > > > >   rcu_read_lock();
> > > > > +
> > > > >   func = list_first_or_null_rcu(&ops->func_stack, struc

Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-08 Thread Josh Poimboeuf
On Wed, Feb 08, 2017 at 04:47:50PM +0100, Petr Mladek wrote:
> > Notice in this case that klp_target_state is KLP_PATCHED.  Which means
> > that klp_complete_transition() would not call synchronize_rcu() at the
> > right time, nor would it call module_put().  It can be fixed with:
> >
> > @@ -387,7 +389,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
> > pr_warn("failed to enable patch '%s'\n",
> > patch->mod->name);
> >  
> > -   klp_unpatch_objects(patch);
> > +   klp_target_state = KLP_UNPATCHED;
> > klp_complete_transition();
> >  
> > return ret;
> 
> Great catch! Looks good to me.

As it turns out, klp_target_state isn't accessible from this file, so
I'll make a klp_cancel_transition() helper function which reverses
klp_target_state and calls klp_complete_transition().

> > This assumes that the 'if (klp_target_state == KLP_UNPATCHED)' clause in
> > klp_try_complete_transition() gets moved to klp_complete_transition() as
> > you suggested.
> > 
> > > > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > > > index 5efa262..1a77f05 100644
> > > > > > --- a/kernel/livepatch/patch.c
> > > > > > +++ b/kernel/livepatch/patch.c
> > > > > > @@ -29,6 +29,7 @@
> > > > > >  #include 
> > > > > >  #include 
> > > > > >  #include "patch.h"
> > > > > > +#include "transition.h"
> > > > > >  
> > > > > >  static LIST_HEAD(klp_ops);
> > > > > >  
> > > > > > @@ -54,15 +55,58 @@ static void notrace klp_ftrace_handler(unsigned 
> > > > > > long ip,
> > > > > >  {
> > > > > > struct klp_ops *ops;
> > > > > > struct klp_func *func;
> > > > > > +   int patch_state;
> > > > > >  
> > > > > > ops = container_of(fops, struct klp_ops, fops);
> > > > > >  
> > > > > > rcu_read_lock();
> > > > > > +
> > > > > > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > > > >   stack_node);
> > > > > > +
> > > > > > +   /*
> > > > > > +* func should never be NULL because preemption should be 
> > > > > > disabled here
> > > > > > +* and unregister_ftrace_function() does the equivalent of a
> > > > > > +* synchronize_sched() before the func_stack removal.
> > > > > > +*/
> > > > > > +   if (WARN_ON_ONCE(!func))
> > > > > > +   goto unlock;
> > > > > > +
> > > > > > +   /*
> > > > > > +* Enforce the order of the ops->func_stack and 
> > > > > > func->transition reads.
> > > > > > +* The corresponding write barrier is in __klp_enable_patch().
> > > > > > +*/
> > > > > > +   smp_rmb();
> > > > > 
> > > > > I was curious why the comment did not mention __klp_disable_patch().
> > > > > It was related to the hours of thinking. I would like to avoid this
> > > > > in the future and add a comment like.
> > > > > 
> > > > >* This barrier probably is not needed when the patch is being
> > > > >* disabled. The patch is removed from the stack in
> > > > >* klp_try_complete_transition() and there we need to call
> > > > >* synchronize_rcu() to prevent seeing the patch on the stack
> > > > >* at all.
> > > > >*
> > > > >* Well, it still might be needed to see func->transition
> > > > >* when the patch is removed and the task is migrated. See
> > > > >* the write barrier in __klp_disable_patch().
> > > > 
> > > > Agreed, though as you mentioned earlier, there's also the implicit
> > > > barrier in klp_update_patch_state(), which would execute first in such a
> > > > scenario.  So I think I'll update the barrier comments in
> > > > klp_update_patch_state().
> > > 
> > > You inspired me to a scenario with 3 CPUs:
> > > 
> > > CPU0  CPU1CPU2
> > > 
> > > __klp_disable_patch()
> > > 
> > >   klp_init_transition()
> > > 
> > > func->transition = true
> > > 
> > >   smp_wmb()
> > > 
> > >   klp_start_transition()
> > > 
> > > set TIF_PATCH_PATCHPENDING
> > > 
> > >   klp_update_patch_state()
> > > 
> > > task->patch_state
> > >= KLP_UNPATCHED
> > > 
> > > smp_mb()
> > > 
> > >   klp_ftrace_handler()
> > > func = list_...
> > > 
> > > smp_rmb() /*needed?*/
> > > 
> > > if (func->transition)
> > > 
> > 
> > I think this isn't possible.  Remember the comment I added to
> > klp_update_patch_state():
> > 
> >  * NOTE: If task is not 'current', the caller must ensure the task is 
> > inactive.
> >  * Otherwise klp_ftrace_handler() might read the wrong 'patch_state' value.
> > 
> > Right now klp_update_patch_state() is only called for current.
> > klp_ftrace_handler() on CPU2 would be running in the context of a
> > different task.
> 
> I a

Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model

2017-02-09 Thread Petr Mladek
On Wed 2017-02-08 10:46:36, Josh Poimboeuf wrote:
> On Wed, Feb 08, 2017 at 04:47:50PM +0100, Petr Mladek wrote:
> > > Notice in this case that klp_target_state is KLP_PATCHED.  Which means
> > > that klp_complete_transition() would not call synchronize_rcu() at the
> > > right time, nor would it call module_put().  It can be fixed with:
> > >
> > > @@ -387,7 +389,7 @@ static int __klp_enable_patch(struct klp_patch *patch)
> > >   pr_warn("failed to enable patch '%s'\n",
> > >   patch->mod->name);
> > >  
> > > - klp_unpatch_objects(patch);
> > > + klp_target_state = KLP_UNPATCHED;
> > >   klp_complete_transition();
> > >  
> > >   return ret;
> > 
> > Great catch! Looks good to me.
> 
> As it turns out, klp_target_state isn't accessible from this file, so
> I'll make a klp_cancel_transition() helper function which reverses
> klp_target_state and calls klp_complete_transition().

Sound good to me.

> > > This assumes that the 'if (klp_target_state == KLP_UNPATCHED)' clause in
> > > klp_try_complete_transition() gets moved to klp_complete_transition() as
> > > you suggested.
> > > 
> > > > > > > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > > > > > > index 5efa262..1a77f05 100644
> > > > > > > --- a/kernel/livepatch/patch.c
> > > > > > > +++ b/kernel/livepatch/patch.c
> > > > > > > @@ -29,6 +29,7 @@
> > > > > > >  #include 
> > > > > > >  #include 
> > > > > > >  #include "patch.h"
> > > > > > > +#include "transition.h"
> > > > > > >  
> > > > > > >  static LIST_HEAD(klp_ops);
> > > > > > >  
> > > > > > > @@ -54,15 +55,58 @@ static void notrace 
> > > > > > > klp_ftrace_handler(unsigned long ip,
> > > > > > >  {
> > > > > > >   struct klp_ops *ops;
> > > > > > >   struct klp_func *func;
> > > > > > > + int patch_state;
> > > > > > >  
> > > > > > >   ops = container_of(fops, struct klp_ops, fops);
> > > > > > >  
> > > > > > >   rcu_read_lock();
> > > > > > > +
> > > > > > >   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> > > > > > > stack_node);
> > > > > > > +
> > > > > > > + /*
> > > > > > > +  * func should never be NULL because preemption should be 
> > > > > > > disabled here
> > > > > > > +  * and unregister_ftrace_function() does the equivalent of a
> > > > > > > +  * synchronize_sched() before the func_stack removal.
> > > > > > > +  */
> > > > > > > + if (WARN_ON_ONCE(!func))
> > > > > > > + goto unlock;
> > > > > > > +
> > > > > > > + /*
> > > > > > > +  * Enforce the order of the ops->func_stack and 
> > > > > > > func->transition reads.
> > > > > > > +  * The corresponding write barrier is in __klp_enable_patch().
> > > > > > > +  */
> > > > > > > + smp_rmb();
> > > > > > 
> > > > > > I was curious why the comment did not mention __klp_disable_patch().
> > > > > > It was related to the hours of thinking. I would like to avoid this
> > > > > > in the future and add a comment like.
> > > > > > 
> > > > > >  * This barrier probably is not needed when the patch is being
> > > > > >  * disabled. The patch is removed from the stack in
> > > > > >  * klp_try_complete_transition() and there we need to call
> > > > > >  * synchronize_rcu() to prevent seeing the patch on the stack
> > > > > >  * at all.
> > > > > >  *
> > > > > >  * Well, it still might be needed to see func->transition
> > > > > >  * when the patch is removed and the task is migrated. See
> > > > > >  * the write barrier in __klp_disable_patch().
> > > > > 
> > > > > Agreed, though as you mentioned earlier, there's also the implicit
> > > > > barrier in klp_update_patch_state(), which would execute first in 
> > > > > such a
> > > > > scenario.  So I think I'll update the barrier comments in
> > > > > klp_update_patch_state().
> > > > 
> > > > You inspired me to a scenario with 3 CPUs:
> > > > 
> > > > CPU0CPU1CPU2
> > > > 
> > > > __klp_disable_patch()
> > > > 
> > > >   klp_init_transition()
> > > > 
> > > > func->transition = true
> > > > 
> > > >   smp_wmb()
> > > > 
> > > >   klp_start_transition()
> > > > 
> > > > set TIF_PATCH_PATCHPENDING
> > > > 
> > > > klp_update_patch_state()
> > > > 
> > > >   task->patch_state
> > > >  = KLP_UNPATCHED
> > > > 
> > > >   smp_mb()
> > > > 
> > > > klp_ftrace_handler()
> > > >   func = list_...
> > > > 
> > > >   smp_rmb() /*needed?*/
> > > > 
> > > >   if (func->transition)
> > > > 
> > > 
> > > I think this isn't possible.  Remember the comment I added to
> > > klp_update_patch_state():
> > > 
> > >  * NOTE: If task is not 'current', th

Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2017-02-16 Thread Miroslav Benes

> @@ -347,22 +356,36 @@ static int __klp_enable_patch(struct klp_patch *patch)
>  
>   pr_notice("enabling patch '%s'\n", patch->mod->name);
>  
> + klp_init_transition(patch, KLP_PATCHED);
> +
> + /*
> +  * Enforce the order of the func->transition writes in
> +  * klp_init_transition() and the ops->func_stack writes in
> +  * klp_patch_object(), so that klp_ftrace_handler() will see the
> +  * func->transition updates before the handler is registered and the
> +  * new funcs become visible to the handler.
> +  */
> + smp_wmb();
> +
>   klp_for_each_object(patch, obj) {
>   if (!klp_is_object_loaded(obj))
>   continue;
>  
>   ret = klp_patch_object(obj);
> - if (ret)
> - goto unregister;
> + if (ret) {
> + pr_warn("failed to enable patch '%s'\n",
> + patch->mod->name);
> +
> + klp_cancel_transition();
> + return ret;
> + }

[...]

> +static void klp_complete_transition(void)
> +{
> + struct klp_object *obj;
> + struct klp_func *func;
> + struct task_struct *g, *task;
> + unsigned int cpu;
> +
> + if (klp_target_state == KLP_UNPATCHED) {
> + /*
> +  * All tasks have transitioned to KLP_UNPATCHED so we can now
> +  * remove the new functions from the func_stack.
> +  */
> + klp_unpatch_objects(klp_transition_patch);
> +
> + /*
> +  * Make sure klp_ftrace_handler() can no longer see functions
> +  * from this patch on the ops->func_stack.  Otherwise, after
> +  * func->transition gets cleared, the handler may choose a
> +  * removed function.
> +  */
> + synchronize_rcu();
> + }
> +
> + if (klp_transition_patch->immediate)
> + goto done;
> +
> + klp_for_each_object(klp_transition_patch, obj)
> + klp_for_each_func(obj, func)
> + func->transition = false;
> +
> + /* Prevent klp_ftrace_handler() from seeing KLP_UNDEFINED state */
> + if (klp_target_state == KLP_PATCHED)
> + synchronize_rcu();
> +
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, task) {
> + WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> + task->patch_state = KLP_UNDEFINED;
> + }
> + read_unlock(&tasklist_lock);
> +
> + for_each_possible_cpu(cpu) {
> + task = idle_task(cpu);
> + WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> + task->patch_state = KLP_UNDEFINED;
> + }
> +
> +done:
> + klp_target_state = KLP_UNDEFINED;
> + klp_transition_patch = NULL;
> +}
> +
> +/*
> + * This is called in the error path, to cancel a transition before it has
> + * started, i.e. klp_init_transition() has been called but
> + * klp_start_transition() hasn't.  If the transition *has* been started,
> + * klp_reverse_transition() should be used instead.
> + */
> +void klp_cancel_transition(void)
> +{
> + klp_target_state = !klp_target_state;
> + klp_complete_transition();
> +}

If we fail to enable a patch in __klp_enable_patch(), we call 
klp_cancel_transition() and get to klp_complete_transition(). If the patch 
has immediate set to true, the module would not be allowed to be removed (the 
changes are in the last patch unfortunately, but my remark is closely 
related to klp_cancel_transition() and __klp_enable_patch()). This could 
annoy a user if the failure was due to a trivial reason. So we could call 
module_put() in this case. It should be safe as no task could be in a new 
function thanks to the klp_ftrace_handler() logic.

Pity I have not spotted this earlier.

Putting module_put(patch->mod) right after klp_cancel_transition() call in 
__klp_enable_patch() would be the simplest fix (as a part of 15/15 patch). 
Maybe with a comment that it is safe to do it there.
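
In code, the proposed placement would look roughly like the fragment below.
This is only a sketch of the __klp_enable_patch() error path as suggested
here, not the final patch.

        ret = klp_patch_object(obj);
        if (ret) {
                pr_warn("failed to enable patch '%s'\n", patch->mod->name);

                klp_cancel_transition();

                /*
                 * Safe here: the transition never started, so no task can
                 * be running any of the new functions yet.
                 */
                module_put(patch->mod);

                return ret;
        }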

What do you think?

Miroslav


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2017-02-16 Thread Josh Poimboeuf
On Thu, Feb 16, 2017 at 03:33:26PM +0100, Miroslav Benes wrote:
> 
> > @@ -347,22 +356,36 @@ static int __klp_enable_patch(struct klp_patch *patch)
> >  
> > pr_notice("enabling patch '%s'\n", patch->mod->name);
> >  
> > +   klp_init_transition(patch, KLP_PATCHED);
> > +
> > +   /*
> > +* Enforce the order of the func->transition writes in
> > +* klp_init_transition() and the ops->func_stack writes in
> > +* klp_patch_object(), so that klp_ftrace_handler() will see the
> > +* func->transition updates before the handler is registered and the
> > +* new funcs become visible to the handler.
> > +*/
> > +   smp_wmb();
> > +
> > klp_for_each_object(patch, obj) {
> > if (!klp_is_object_loaded(obj))
> > continue;
> >  
> > ret = klp_patch_object(obj);
> > -   if (ret)
> > -   goto unregister;
> > +   if (ret) {
> > +   pr_warn("failed to enable patch '%s'\n",
> > +   patch->mod->name);
> > +
> > +   klp_cancel_transition();
> > +   return ret;
> > +   }
> 
> [...]
> 
> > +static void klp_complete_transition(void)
> > +{
> > +   struct klp_object *obj;
> > +   struct klp_func *func;
> > +   struct task_struct *g, *task;
> > +   unsigned int cpu;
> > +
> > +   if (klp_target_state == KLP_UNPATCHED) {
> > +   /*
> > +* All tasks have transitioned to KLP_UNPATCHED so we can now
> > +* remove the new functions from the func_stack.
> > +*/
> > +   klp_unpatch_objects(klp_transition_patch);
> > +
> > +   /*
> > +* Make sure klp_ftrace_handler() can no longer see functions
> > +* from this patch on the ops->func_stack.  Otherwise, after
> > +* func->transition gets cleared, the handler may choose a
> > +* removed function.
> > +*/
> > +   synchronize_rcu();
> > +   }
> > +
> > +   if (klp_transition_patch->immediate)
> > +   goto done;
> > +
> > +   klp_for_each_object(klp_transition_patch, obj)
> > +   klp_for_each_func(obj, func)
> > +   func->transition = false;
> > +
> > +   /* Prevent klp_ftrace_handler() from seeing KLP_UNDEFINED state */
> > +   if (klp_target_state == KLP_PATCHED)
> > +   synchronize_rcu();
> > +
> > +   read_lock(&tasklist_lock);
> > +   for_each_process_thread(g, task) {
> > +   WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> > +   task->patch_state = KLP_UNDEFINED;
> > +   }
> > +   read_unlock(&tasklist_lock);
> > +
> > +   for_each_possible_cpu(cpu) {
> > +   task = idle_task(cpu);
> > +   WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> > +   task->patch_state = KLP_UNDEFINED;
> > +   }
> > +
> > +done:
> > +   klp_target_state = KLP_UNDEFINED;
> > +   klp_transition_patch = NULL;
> > +}
> > +
> > +/*
> > + * This is called in the error path, to cancel a transition before it has
> > + * started, i.e. klp_init_transition() has been called but
> > + * klp_start_transition() hasn't.  If the transition *has* been started,
> > + * klp_reverse_transition() should be used instead.
> > + */
> > +void klp_cancel_transition(void)
> > +{
> > +   klp_target_state = !klp_target_state;
> > +   klp_complete_transition();
> > +}
> 
> If we fail to enable a patch in __klp_enable_patch(), we call 
> klp_cancel_transition() and get to klp_complete_transition(). If the patch 
> has immediate set to true, the module would not be allowed to be removed (the 
> changes are in the last patch unfortunately, but my remark is closely 
> related to klp_cancel_transition() and __klp_enable_patch()). This could 
> annoy a user if the failure was due to a trivial reason. So we could call 
> module_put() in this case. It should be safe as no task could be in a new 
> function thanks to the klp_ftrace_handler() logic.
> 
> Pity I have not spotted this earlier.
> 
> Putting module_put(patch->mod) right after klp_cancel_transition() call in 
> __klp_enable_patch() would be the simplest fix (as a part of 15/15 patch). 
> Maybe with a comment that it is safe to do it there.
> 
> What do you think?

Good catch.  I agree that 15/15 should have something like that.

Also, the module_put() will be needed for non-immediate patches which
have a func->immediate set.

What do you think about the following?  I tried to put the logic in
klp_complete_transition(), so the module_put()'s would be in one place.
But it was too messy, so I put it in klp_cancel_transition() instead.


diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index e96346e..bd1c1fd 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -121,8 +121,28 @@ static void klp_complete_transition(void)
  */
 void klp_cancel_transition(void)
 {
+   bool immediate_func = false;
+
klp_target_state = !klp_target_state;
   

Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2017-02-17 Thread Miroslav Benes
On Thu, 16 Feb 2017, Josh Poimboeuf wrote:

> On Thu, Feb 16, 2017 at 03:33:26PM +0100, Miroslav Benes wrote:
> > 
> > > @@ -347,22 +356,36 @@ static int __klp_enable_patch(struct klp_patch 
> > > *patch)
> > >  
> > >   pr_notice("enabling patch '%s'\n", patch->mod->name);
> > >  
> > > + klp_init_transition(patch, KLP_PATCHED);
> > > +
> > > + /*
> > > +  * Enforce the order of the func->transition writes in
> > > +  * klp_init_transition() and the ops->func_stack writes in
> > > +  * klp_patch_object(), so that klp_ftrace_handler() will see the
> > > +  * func->transition updates before the handler is registered and the
> > > +  * new funcs become visible to the handler.
> > > +  */
> > > + smp_wmb();
> > > +
> > >   klp_for_each_object(patch, obj) {
> > >   if (!klp_is_object_loaded(obj))
> > >   continue;
> > >  
> > >   ret = klp_patch_object(obj);
> > > - if (ret)
> > > - goto unregister;
> > > + if (ret) {
> > > + pr_warn("failed to enable patch '%s'\n",
> > > + patch->mod->name);
> > > +
> > > + klp_cancel_transition();
> > > + return ret;
> > > + }
> > 
> > [...]
> > 
> > > +static void klp_complete_transition(void)
> > > +{
> > > + struct klp_object *obj;
> > > + struct klp_func *func;
> > > + struct task_struct *g, *task;
> > > + unsigned int cpu;
> > > +
> > > + if (klp_target_state == KLP_UNPATCHED) {
> > > + /*
> > > +  * All tasks have transitioned to KLP_UNPATCHED so we can now
> > > +  * remove the new functions from the func_stack.
> > > +  */
> > > + klp_unpatch_objects(klp_transition_patch);
> > > +
> > > + /*
> > > +  * Make sure klp_ftrace_handler() can no longer see functions
> > > +  * from this patch on the ops->func_stack.  Otherwise, after
> > > +  * func->transition gets cleared, the handler may choose a
> > > +  * removed function.
> > > +  */
> > > + synchronize_rcu();
> > > + }
> > > +
> > > + if (klp_transition_patch->immediate)
> > > + goto done;
> > > +
> > > + klp_for_each_object(klp_transition_patch, obj)
> > > + klp_for_each_func(obj, func)
> > > + func->transition = false;
> > > +
> > > + /* Prevent klp_ftrace_handler() from seeing KLP_UNDEFINED state */
> > > + if (klp_target_state == KLP_PATCHED)
> > > + synchronize_rcu();
> > > +
> > > + read_lock(&tasklist_lock);
> > > + for_each_process_thread(g, task) {
> > > + WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> > > + task->patch_state = KLP_UNDEFINED;
> > > + }
> > > + read_unlock(&tasklist_lock);
> > > +
> > > + for_each_possible_cpu(cpu) {
> > > + task = idle_task(cpu);
> > > + WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> > > + task->patch_state = KLP_UNDEFINED;
> > > + }
> > > +
> > > +done:
> > > + klp_target_state = KLP_UNDEFINED;
> > > + klp_transition_patch = NULL;
> > > +}
> > > +
> > > +/*
> > > + * This is called in the error path, to cancel a transition before it has
> > > + * started, i.e. klp_init_transition() has been called but
> > > + * klp_start_transition() hasn't.  If the transition *has* been started,
> > > + * klp_reverse_transition() should be used instead.
> > > + */
> > > +void klp_cancel_transition(void)
> > > +{
> > > + klp_target_state = !klp_target_state;
> > > + klp_complete_transition();
> > > +}
> > 
> > If we fail to enable a patch in __klp_enable_patch(), we call 
> > klp_cancel_transition() and get to klp_complete_transition(). If the patch 
> > has immediate set to true, the module would not be allowed to be removed (the 
> > changes are in the last patch unfortunately, but my remark is closely 
> > related to klp_cancel_transition() and __klp_enable_patch()). This could 
> > annoy a user if the failure was due to a trivial reason. So we could call 
> > module_put() in this case. It should be safe as no task could be in a new 
> > function thanks to the klp_ftrace_handler() logic.
> > 
> > Pity I have not spotted this earlier.
> > 
> > Putting module_put(patch->mod) right after klp_cancel_transition() call in 
> > __klp_enable_patch() would be the simplest fix (as a part of 15/15 patch). 
> > Maybe with a comment that it is safe to do it there.
> > 
> > What do you think?
> 
> Good catch.  I agree that 15/15 should have something like that.
> 
> Also, the module_put() will be needed for non-immediate patches which
> have a func->immediate set.

Right. This further complicates it.
 
> What do you think about the following?  I tried to put the logic in
> klp_complete_transition(), so the module_put()'s would be in one place.
> But it was too messy, so I put it in klp_cancel_transition() instead.
>
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index e96346e..bd1c1fd 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/l

Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2017-02-21 Thread Josh Poimboeuf
On Fri, Feb 17, 2017 at 09:51:29AM +0100, Miroslav Benes wrote:
> On Thu, 16 Feb 2017, Josh Poimboeuf wrote:
> > What do you think about the following?  I tried to put the logic in
> > klp_complete_transition(), so the module_put()'s would be in one place.
> > But it was too messy, so I put it in klp_cancel_transition() instead.
> >
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > index e96346e..bd1c1fd 100644
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -121,8 +121,28 @@ static void klp_complete_transition(void)
> >   */
> >  void klp_cancel_transition(void)
> >  {
> > +   bool immediate_func = false;
> > +
> > klp_target_state = !klp_target_state;
> > klp_complete_transition();
> > +
> > +   if (klp_target_state == KLP_PATCHED)
> > +   return;
> 
> This is not needed, I think. We call klp_cancel_transition() in 
> __klp_enable_patch() only. klp_target_state is set to KLP_PATCHED there 
> (through klp_init_transition()) and negated here. We know it must be 
> KLP_UNPATCHED.

Yeah, I was trying to hedge against the possibility of future code
calling this function in the disable path.  But that probably won't
happen and it would probably be cleaner to just add a warning if
klp_target_state isn't KLP_PATCHED.

> Moreover, due to klp_complete_transition() klp_target_state is always 
> KLP_UNDEFINED after it.
> 
> > +
> > +   /*
> > +* In the enable error path, even immediate patches can be safely
> > +* removed because the transition hasn't been started yet.
> > +*
> > +* klp_complete_transition() doesn't have a module_put() for immediate
> > +* patches, so do it here.
> > +*/
> > +   klp_for_each_object(klp_transition_patch, obj)
> > +   klp_for_each_func(obj, func)
> > +   if (func->immediate)
> > +   immediate_func = true;
> > +
> > +   if (klp_transition_patch->immediate || immediate_func)
> > +   module_put(klp_transition_patch->mod);
> 
> > Almost correct. The only problem is that klp_transition_patch is NULL at 
> > this point. klp_complete_transition() sets it to NULL, and that should stay 
> > there in my opinion to keep it simple.
> 
> So we could either move all this to __klp_enable_patch(), where patch 
> variable is defined, or we could store klp_transition_patch to a local 
> variable here in klp_cancel_transition() before klp_complete_transition() 
> is called. That should be safe. I like the latter more, because it keeps 
> the code in klp_cancel_transition() where it belongs.

Good points.  Here's another try:

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index e96346e..a23c63c 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -121,8 +121,31 @@ static void klp_complete_transition(void)
  */
 void klp_cancel_transition(void)
 {
-   klp_target_state = !klp_target_state;
+   struct klp_patch *patch = klp_transition_patch;
+   struct klp_object *obj;
+   struct klp_func *func;
+   bool immediate_func = false;
+
+   if (WARN_ON_ONCE(klp_target_state != KLP_PATCHED))
+   return;
+
+   klp_target_state = KLP_UNPATCHED;
klp_complete_transition();
+
+   /*
+* In the enable error path, even immediate patches can be safely
+* removed because the transition hasn't been started yet.
+*
+* klp_complete_transition() doesn't have a module_put() for immediate
+* patches, so do it here.
+*/
+   klp_for_each_object(patch, obj)
+   klp_for_each_func(obj, func)
+   if (func->immediate)
+   immediate_func = true;
+
+   if (patch->immediate || immediate_func)
+   module_put(patch->mod);
 }
 
 /*


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2017-02-22 Thread Miroslav Benes
On Tue, 21 Feb 2017, Josh Poimboeuf wrote:

> On Fri, Feb 17, 2017 at 09:51:29AM +0100, Miroslav Benes wrote:
> > On Thu, 16 Feb 2017, Josh Poimboeuf wrote:
> > > What do you think about the following?  I tried to put the logic in
> > > klp_complete_transition(), so the module_put()'s would be in one place.
> > > But it was too messy, so I put it in klp_cancel_transition() instead.
> > >
> > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > index e96346e..bd1c1fd 100644
> > > --- a/kernel/livepatch/transition.c
> > > +++ b/kernel/livepatch/transition.c
> > > @@ -121,8 +121,28 @@ static void klp_complete_transition(void)
> > >   */
> > >  void klp_cancel_transition(void)
> > >  {
> > > + bool immediate_func = false;
> > > +
> > >   klp_target_state = !klp_target_state;
> > >   klp_complete_transition();
> > > +
> > > + if (klp_target_state == KLP_PATCHED)
> > > + return;
> > 
> > This is not needed, I think. We call klp_cancel_transition() in 
> > __klp_enable_patch() only. klp_target_state is set to KLP_PATCHED there 
> > (through klp_init_transition()) and negated here. We know it must be 
> > KLP_UNPATCHED.
> 
> Yeah, I was trying to hedge against the possibility of future code
> calling this function in the disable path.  But that probably won't
> happen and it would probably be cleaner to just add a warning if
> klp_target_state isn't KLP_PATCHED.
> 
> > Moreover, due to klp_complete_transition() klp_target_state is always 
> > KLP_UNDEFINED after it.
> > 
> > > +
> > > + /*
> > > +  * In the enable error path, even immediate patches can be safely
> > > +  * removed because the transition hasn't been started yet.
> > > +  *
> > > +  * klp_complete_transition() doesn't have a module_put() for immediate
> > > +  * patches, so do it here.
> > > +  */
> > > + klp_for_each_object(klp_transition_patch, obj)
> > > + klp_for_each_func(obj, func)
> > > + if (func->immediate)
> > > + immediate_func = true;
> > > +
> > > + if (klp_transition_patch->immediate || immediate_func)
> > > + module_put(klp_transition_patch->mod);
> > 
> > Almost correct. The only problem is that klp_transition_patch is NULL at 
> > this point. klp_complete_transition() sets it to NULL, and that should stay 
> > there in my opinion to keep it simple.
> > 
> > So we could either move all this to __klp_enable_patch(), where patch 
> > variable is defined, or we could store klp_transition_patch to a local 
> > variable here in klp_cancel_transition() before klp_complete_transition() 
> > is called. That should be safe. I like the latter more, because it keeps 
> > the code in klp_cancel_transition() where it belongs.
> 
> Good points.  Here's another try:
> 
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index e96346e..a23c63c 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -121,8 +121,31 @@ static void klp_complete_transition(void)
>   */
>  void klp_cancel_transition(void)
>  {
> - klp_target_state = !klp_target_state;
> + struct klp_patch *patch = klp_transition_patch;
> + struct klp_object *obj;
> + struct klp_func *func;
> + bool immediate_func = false;
> +
> + if (WARN_ON_ONCE(klp_target_state != KLP_PATCHED))
> + return;
> +
> + klp_target_state = KLP_UNPATCHED;
>   klp_complete_transition();
> +
> + /*
> +  * In the enable error path, even immediate patches can be safely
> +  * removed because the transition hasn't been started yet.
> +  *
> +  * klp_complete_transition() doesn't have a module_put() for immediate
> +  * patches, so do it here.
> +  */
> + klp_for_each_object(patch, obj)
> + klp_for_each_func(obj, func)
> + if (func->immediate)
> + immediate_func = true;
> +
> + if (patch->immediate || immediate_func)
> + module_put(patch->mod);
>  }

Looks ok.

Thanks,
Miroslav


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2017-03-07 Thread Miroslav Benes
On Mon, 13 Feb 2017, Josh Poimboeuf wrote:

> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> This code stems from the design proposal made by Vojtech [1] in November
> 2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
> consistency and syscall barrier switching combined with kpatch's stack
> trace switching.  There are also a number of fallback options which make
> it quite flexible.
> 
> Patches are applied on a per-task basis, when the task is deemed safe to
> switch over.  When a patch is enabled, livepatch enters into a
> transition state where tasks are converging to the patched state.
> Usually this transition state can complete in a few seconds.  The same
> sequence occurs when a patch is disabled, except the tasks converge from
> the patched state to the unpatched state.
> 
> An interrupt handler inherits the patched state of the task it
> interrupts.  The same is true for forked tasks: the child inherits the
> patched state of the parent.
> 
> Livepatch uses several complementary approaches to determine when it's
> safe to patch tasks:
> 
> 1. The first and most effective approach is stack checking of sleeping
>tasks.  If no affected functions are on the stack of a given task,
>the task is patched.  In most cases this will patch most or all of
>the tasks on the first try.  Otherwise it'll keep trying
>periodically.  This option is only available if the architecture has
>reliable stacks (HAVE_RELIABLE_STACKTRACE).
> 
> 2. The second approach, if needed, is kernel exit switching.  A
>task is switched when it returns to user space from a system call, a
>user space IRQ, or a signal.  It's useful in the following cases:
> 
>a) Patching I/O-bound user tasks which are sleeping on an affected
>   function.  In this case you have to send SIGSTOP and SIGCONT to
>   force it to exit the kernel and be patched.
>b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
>   then it will get patched the next time it gets interrupted by an
>   IRQ.
>c) In the future it could be useful for applying patches for
>   architectures which don't yet have HAVE_RELIABLE_STACKTRACE.  In
>   this case you would have to signal most of the tasks on the
>   system.  However this isn't supported yet because there's
>   currently no way to patch kthreads without
>   HAVE_RELIABLE_STACKTRACE.
> 
> 3. For idle "swapper" tasks, since they don't ever exit the kernel, they
>instead have a klp_update_patch_state() call in the idle loop which
>allows them to be patched before the CPU enters the idle state.
> 
>(Note there's not yet such an approach for kthreads.)
> 
> All the above approaches may be skipped by setting the 'immediate' flag
> in the 'klp_patch' struct, which will disable per-task consistency and
> patch all tasks immediately.  This can be useful if the patch doesn't
> change any function or data semantics.  Note that, even with this flag
> set, it's possible that some tasks may still be running with an old
> version of the function, until that function returns.
> 
> There's also an 'immediate' flag in the 'klp_func' struct which allows
> you to specify that certain functions in the patch can be applied
> without per-task consistency.  This might be useful if you want to patch
> a common function like schedule(), and the function change doesn't need
> consistency but the rest of the patch does.
> 
> For architectures which don't have HAVE_RELIABLE_STACKTRACE, the user
> must set patch->immediate which causes all tasks to be patched
> immediately.  This option should be used with care, only when the patch
> doesn't change any function or data semantics.
> 
> In the future, architectures which don't have HAVE_RELIABLE_STACKTRACE
> may be allowed to use per-task consistency if we can come up with
> another way to patch kthreads.
> 
> The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
> is in transition.  Only a single patch (the topmost patch on the stack)
> can be in transition at a given time.  A patch can remain in transition
> indefinitely, if any of the tasks are stuck in the initial patch state.
> 
> A transition can be reversed and effectively canceled by writing the
> opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
> the transition is in progress.  Then all the tasks will attempt to
> converge back to the original patch state.
> 
> [1] https://lkml.kernel.org/r/20141107140458.ga21...@suse.cz
> 
> Signed-off-by: Josh Poimboeuf 

I looked at the patch again and could not see any problem with it. I 
tested it with a couple of live patches too and it worked as expected. 
Good job.

Acked-by: Miroslav Benes 

Thanks,
Miroslav


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2018-01-25 Thread Peter Zijlstra
On Mon, Feb 13, 2017 at 07:42:40PM -0600, Josh Poimboeuf wrote:
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 6a4bae0..a8b3f1a 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -264,6 +265,9 @@ static void do_idle(void)
>  
>   sched_ttwu_pending();
>   schedule_preempt_disabled();
> +
> + if (unlikely(klp_patch_pending(current)))
> + klp_update_patch_state(current);
>  }

Can someone explain this one? This is a very weird place to add things.
What was the expectation?


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2018-01-25 Thread Petr Mladek
On Thu 2018-01-25 10:04:44, Peter Zijlstra wrote:
> On Mon, Feb 13, 2017 at 07:42:40PM -0600, Josh Poimboeuf wrote:
> > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > index 6a4bae0..a8b3f1a 100644
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -9,6 +9,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  
> > @@ -264,6 +265,9 @@ static void do_idle(void)
> >  
> > sched_ttwu_pending();
> > schedule_preempt_disabled();
> > +
> > +   if (unlikely(klp_patch_pending(current)))
> > +   klp_update_patch_state(current);
> >  }
> 
> Can someone explain this one? This is a very weird place to add things.
> What was the expectation?

AFAIK, it was the least ugly and most minimal solution that we came up with.

The tasks are migrated to the new patch when none of the patched
functions is on the stack. The stack can be checked safely only when
the task is not running. It might be very hard to catch the idle
task at such a place if we patch something that is used there.

If the idle task is currently scheduled, you would need to create some
fake load on the system, try to migrate the idle task, and then stop the
fake load on the CPU.

The above code makes the idle task migrate itself at a sane place.
You just need to schedule some minimal job on the CPU. The idle
task will do one loop, get migrated, and might be scheduled again
immediately.
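
[Editor's note: to make the last paragraph concrete, here is a hedged sketch of
the kind of "minimal job" described; klp_nudge_idle() is a hypothetical helper,
not part of the patch set.  It only queues an empty work item on the target CPU
so the idle task there gets scheduled out and runs one more iteration of
do_idle() when it comes back.]

#include <linux/workqueue.h>

static void klp_nudge_fn(struct work_struct *work)
{
	/* intentionally empty: waking a kworker on the CPU is all we need */
}

/* force one pass through do_idle() on @cpu so its idle task migrates itself */
static void klp_nudge_idle(int cpu)
{
	struct work_struct work;

	INIT_WORK_ONSTACK(&work, klp_nudge_fn);
	schedule_work_on(cpu, &work);
	flush_work(&work);
	destroy_work_on_stack(&work);
}

Calling klp_nudge_idle(cpu) for each CPU whose swapper task is stuck would then
let the transition finish.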

Best Regards,
Petr


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2018-01-25 Thread Peter Zijlstra
On Thu, Jan 25, 2018 at 11:24:14AM +0100, Petr Mladek wrote:
> On Thu 2018-01-25 10:04:44, Peter Zijlstra wrote:
> > On Mon, Feb 13, 2017 at 07:42:40PM -0600, Josh Poimboeuf wrote:
> > > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > > index 6a4bae0..a8b3f1a 100644
> > > --- a/kernel/sched/idle.c
> > > +++ b/kernel/sched/idle.c
> > > @@ -9,6 +9,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  
> > >  #include 
> > >  
> > > @@ -264,6 +265,9 @@ static void do_idle(void)
> > >  
> > >   sched_ttwu_pending();
> > >   schedule_preempt_disabled();
> > > +
> > > + if (unlikely(klp_patch_pending(current)))
> > > + klp_update_patch_state(current);
> > >  }
> > 
> > Can someone explain this one? This is a very weird place to add things.
> > What was the expectation?
> 
> AFAIK, it was the least ugly and minimalist solution that we came with.
> 
> The tasks are migrated to the new patch when neither of the patched
> functions is on the stack. The stack can be checked safely only when
> the task is not running. It might be very hard to catch the idle
> task on a such a place if we patch something that is used there.
> 
> If the idle task is scheduled, you would need to create some fake
> load on the system, try to migrate the idle task, stop the fake load
> on the CPU.
> 
> The above code makes the idle task to migrate itself on a sane place.
> You just need to schedule some minimalist job on the CPU. The idle
> task will do one loop, gets migrated, and might be scheduled again
> immediately.

What I was getting at, the klp stuff is the very first thing we run when
we schedule the idle task, but it's placed at the very end of the
function. This is confusing.

The above still doesn't help with solving that. Do you want to run
something before we go idle, or before we leave idle? In neither case
would I place it where it is.


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2018-01-25 Thread Petr Mladek
On Thu 2018-01-25 11:38:55, Peter Zijlstra wrote:
> On Thu, Jan 25, 2018 at 11:24:14AM +0100, Petr Mladek wrote:
> > On Thu 2018-01-25 10:04:44, Peter Zijlstra wrote:
> > > On Mon, Feb 13, 2017 at 07:42:40PM -0600, Josh Poimboeuf wrote:
> > > > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > > > index 6a4bae0..a8b3f1a 100644
> > > > --- a/kernel/sched/idle.c
> > > > +++ b/kernel/sched/idle.c
> > > > @@ -9,6 +9,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include 
> > > >  
> > > >  #include 
> > > >  
> > > > @@ -264,6 +265,9 @@ static void do_idle(void)
> > > >  
> > > > sched_ttwu_pending();
> > > > schedule_preempt_disabled();
> > > > +
> > > > +   if (unlikely(klp_patch_pending(current)))
> > > > +   klp_update_patch_state(current);
> > > >  }
> > > 
> > > Can someone explain this one? This is a very weird place to add things.
> > > What was the expectation?
> > 
> > AFAIK, it was the least ugly and minimalist solution that we came with.
> > 
> > The tasks are migrated to the new patch when neither of the patched
> > functions is on the stack. The stack can be checked safely only when
> > the task is not running. It might be very hard to catch the idle
> > task on a such a place if we patch something that is used there.
> > 
> > If the idle task is scheduled, you would need to create some fake
> > load on the system, try to migrate the idle task, stop the fake load
> > on the CPU.
> > 
> > The above code makes the idle task to migrate itself on a sane place.
> > You just need to schedule some minimalist job on the CPU. The idle
> > task will do one loop, gets migrated, and might be scheduled again
> > immediately.
> 
> What I was getting at, the klp stuff is the very first thing we run when
> we schedule the idle task, but its placed at the very end of the
> function. This is confusing.

I see.


> The above still doesn't help with solving that. Do you want to run
> something before we go idle, or before we leave idle, in neither cases
> would I place it where it is.

In fact, both ways are fine. We require the idle task to go
through the entire cycle anyway, because either situation,
idling for too long or never idling, would block finishing the patch
transition.

Feel free to move it right before schedule_idle() or
__current_set_polling().

Or should I send a patch?

Best Regards,
Petr


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2018-01-25 Thread Peter Zijlstra
On Thu, Jan 25, 2018 at 01:13:21PM +0100, Petr Mladek wrote:
> > What I was getting at, the klp stuff is the very first thing we run when
> > we schedule the idle task, but its placed at the very end of the
> > function. This is confusing.
> 
> I see.
> 
> 
> > The above still doesn't help with solving that. Do you want to run
> > something before we go idle, or before we leave idle, in neither cases
> > would I place it where it is.
> 
> In fact, both ways are fine. We require going the idle task
> through the entire cycle anyway. It is because both situations,
> too long idling or non-idling, would block finishing the patch
> transition.
> 
> Feel free to move it right before schedule_idle() or
> __current_set_polling().

OK, I'll move it. Thanks!
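
[Editor's note: a hedged sketch of what the agreed relocation could look like;
the exact placement in the eventual patch (before __current_set_polling()
versus before schedule_idle()) may differ.]

static void do_idle(void)
{
	/*
	 * Run the klp check first: it is the very first thing the idle task
	 * does after being (re)scheduled, so putting it at the top makes
	 * that explicit.
	 */
	if (unlikely(klp_patch_pending(current)))
		klp_update_patch_state(current);

	__current_set_polling();

	/* ... the rest of do_idle() stays as it is ... */
}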


Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

2017-04-11 Thread Petr Mladek
On Mon 2017-02-13 19:42:40, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> 
> Signed-off-by: Josh Poimboeuf 

Just for the record, this last version looks fine to me. I do not see
any problems any longer. Everything looks consistent now ;-)
It is great work. Feel free to use:

Reviewed-by: Petr Mladek 

Thanks a lot for patience.

Best Regards,
Petr


Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Petr Mladek
On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.

> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -76,6 +76,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned long 
> clone_flags,
>   p->parent_exec_id = current->self_exec_id;
>   }
>  
> + klp_copy_process(p);

I have doubts here. We copy the state from the parent here. It means
that the new process might still need to be converted. But at the same
time, print_context_stack_reliable() returns zero without printing
any stack trace when the TIF_FORK flag is set. It means that a freshly
forked task might get converted immediately. It seems that both
operations are always done when copy_process() is called, but
they contradict each other.

I guess that print_context_stack_reliable() should either return
-EINVAL when TIF_FORK is set, or it should try to print the
stack of the newly forked task.

Or do I miss something, please?

> +
>   spin_lock(¤t->sighand->siglock);
>  
>   /*

[...]

> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 000..92819bb
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> +/*
> + * This function can be called in the middle of an existing transition to
> + * reverse the direction of the target patch state.  This can be done to
> + * effectively cancel an existing enable or disable operation if there are 
> any
> + * tasks which are stuck in the initial patch state.
> + */
> +void klp_reverse_transition(void)
> +{
> + struct klp_patch *patch = klp_transition_patch;
> +
> + klp_target_state = !klp_target_state;
> +
> + /*
> +  * Ensure that if another CPU goes through the syscall barrier, sees
> +  * the TIF_PATCH_PENDING writes in klp_start_transition(), and calls
> +  * klp_patch_task(), it also sees the above write to the target state.
> +  * Otherwise it can put the task in the wrong universe.
> +  */
> + smp_wmb();
> +
> + klp_start_transition();
> + klp_try_complete_transition();

It is a bit strange that we keep the work scheduled. It might be
better to use

   mod_delayed_work(system_wq, &klp_work, 0);

Which triggers more ideas from the nitpicking department:

I would move the work definition from core.c to transition.c because
it is closely related to klp_try_complete_transition();

While at it, I would make it clearer that the work is related
to the transition. Also I would call queue_delayed_work() directly
instead of adding the klp_schedule_work() wrapper. The delay
might be defined using a constant, e.g.

#define KLP_TRANSITION_DELAY round_jiffies_relative(HZ)

queue_delayed_work(system_wq, &klp_transition_work, KLP_TRANSITION_DELAY);

Finally, the following is always called right after
klp_start_transition(), so I would call it from there.

if (!klp_try_complete_transition())
klp_schedule_work();
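
[Editor's note: pulling those suggestions together, a rough sketch might look
like the following.  The names are the suggested ones, klp_mutex locking is
omitted, and none of this is necessarily what ends up merged.]

#define KLP_TRANSITION_DELAY	round_jiffies_relative(HZ)

static void klp_transition_work_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);

static void klp_transition_work_fn(struct work_struct *work)
{
	/* klp_mutex handling omitted in this sketch */
	if (!klp_try_complete_transition())
		queue_delayed_work(system_wq, &klp_transition_work,
				   KLP_TRANSITION_DELAY);
}

void klp_start_transition(void)
{
	/* ... set TIF_PATCH_PENDING on all tasks as in the patch ... */

	if (!klp_try_complete_transition())
		queue_delayed_work(system_wq, &klp_transition_work,
				   KLP_TRANSITION_DELAY);
}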


> +
> + patch->enabled = !patch->enabled;
> +}
> +

It is really great work! I have been checking this patch from left, right, top,
and even bottom, and it all seems to work well together.

Best Regards,
Petr

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Josh Poimboeuf
On Wed, May 04, 2016 at 10:42:23AM +0200, Petr Mladek wrote:
> On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> 
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -76,6 +76,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  #include 
> > @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned long 
> > clone_flags,
> > p->parent_exec_id = current->self_exec_id;
> > }
> >  
> > +   klp_copy_process(p);
> 
> I am in doubts here. We copy the state from the parent here. It means
> that the new process might still need to be converted. But at the same
> point print_context_stack_reliable() returns zero without printing
> any stack trace when TIF_FORK flag is set. It means that a freshly
> forked task might get be converted immediately. I seems that boot
> operations are always done when copy_process() is called. But
> they are contradicting each other.
> 
> I guess that print_context_stack_reliable() should either return
> -EINVAL when TIF_FORK is set. Or it should try to print the
> stack of the newly forked task.
> 
> Or do I miss something, please?

Ok, I admit it's confusing.

A newly forked task doesn't *have* a stack (other than the pt_regs frame
it needs for the return to user space), which is why
print_context_stack_reliable() returns success with an empty array of
addresses.

For a little background, see the second switch_to() macro in
arch/x86/include/asm/switch_to.h.  When a newly forked task runs for the
first time, it returns from __switch_to() with no stack.  It then jumps
straight to ret_from_fork in entry_64.S, calls a few C functions, and
eventually returns to user space.  So, assuming we aren't patching entry
code or the switch_to() macro in __schedule(), it should be safe to
patch the task before it does all that.

With the current code, if an unpatched task gets forked, the child will
also be unpatched.  In theory, we could go ahead and patch the child
then.  In fact, that's what I did in v1.9.

But in v1.9 discussions it was pointed out that someday maybe the
ret_from_fork stuff will get cleaned up and instead the child stack will
be copied from the parent.  In that case the child should inherit its
parent's patched state.  So we decided to make it more future-proof by
having the child inherit the parent's patched state.
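
[Editor's note: for readers following along, a minimal sketch of how that
inheritance could look, based on the patch being discussed and not necessarily
the final code.]

void klp_copy_process(struct task_struct *child)
{
	/* the child starts out in whatever patch state its parent is in */
	child->patch_state = current->patch_state;

	/*
	 * TIF_PATCH_PENDING is part of the thread info that is copied from
	 * the parent elsewhere in copy_process(), so nothing else is needed
	 * here.
	 */
}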

So, having said all that, I'm really not sure what the best approach is
for print_context_stack_reliable().  Right now I'm thinking I'll change
it back to return -EINVAL for a newly forked task, so it'll be more
future-proof: better to have a false positive than a false negative.
Either way it will probably need to be changed again if the
ret_from_fork code gets cleaned up.

> > +
> > spin_lock(¤t->sighand->siglock);
> >  
> > /*
> 
> [...]
> 
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > new file mode 100644
> > index 000..92819bb
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.c
> > +/*
> > + * This function can be called in the middle of an existing transition to
> > + * reverse the direction of the target patch state.  This can be done to
> > + * effectively cancel an existing enable or disable operation if there are 
> > any
> > + * tasks which are stuck in the initial patch state.
> > + */
> > +void klp_reverse_transition(void)
> > +{
> > +   struct klp_patch *patch = klp_transition_patch;
> > +
> > +   klp_target_state = !klp_target_state;
> > +
> > +   /*
> > +* Ensure that if another CPU goes through the syscall barrier, sees
> > +* the TIF_PATCH_PENDING writes in klp_start_transition(), and calls
> > +* klp_patch_task(), it also sees the above write to the target state.
> > +* Otherwise it can put the task in the wrong universe.
> > +*/
> > +   smp_wmb();
> > +
> > +   klp_start_transition();
> > +   klp_try_complete_transition();
> 
> It is a bit strange that we keep the work scheduled. It might be
> better to use
> 
>mod_delayed_work(system_wq, &klp_work, 0);

True, I think that would be better.

> Which triggers more ideas from the nitpicking deparment:
> 
> I would move the work definition from core.c to transition.c because
> it is closely related to klp_try_complete_transition();

That could be good, but there's a slight problem: klp_work_fn() requires
klp_mutex, which is static to core.c.  It's kind of nice to keep the use
of the mutex in core.c only.

> When on it. I would make it more clear that the work is related
> to transition.

How would you recommend doing that?  How about:

- rename "klp_work" -> "klp_transition_work"
- rename "klp_work_fn" -> "klp_transition_work_fn" 

?

> Also I would call q

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-05 Thread Miroslav Benes
On Wed, 4 May 2016, Josh Poimboeuf wrote:

> On Wed, May 04, 2016 at 10:42:23AM +0200, Petr Mladek wrote:
> > On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is the
> > > foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is the
> > > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -76,6 +76,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  
> > >  #include 
> > >  #include 
> > > @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned 
> > > long clone_flags,
> > >   p->parent_exec_id = current->self_exec_id;
> > >   }
> > >  
> > > + klp_copy_process(p);
> > 
> > I am in doubts here. We copy the state from the parent here. It means
> > that the new process might still need to be converted. But at the same
> > point print_context_stack_reliable() returns zero without printing
> > any stack trace when TIF_FORK flag is set. It means that a freshly
> > forked task might get be converted immediately. I seems that boot
> > operations are always done when copy_process() is called. But
> > they are contradicting each other.
> > 
> > I guess that print_context_stack_reliable() should either return
> > -EINVAL when TIF_FORK is set. Or it should try to print the
> > stack of the newly forked task.
> > 
> > Or do I miss something, please?
> 
> Ok, I admit it's confusing.
> 
> A newly forked task doesn't *have* a stack (other than the pt_regs frame
> it needs for the return to user space), which is why
> print_context_stack_reliable() returns success with an empty array of
> addresses.
> 
> For a little background, see the second switch_to() macro in
> arch/x86/include/asm/switch_to.h.  When a newly forked task runs for the
> first time, it returns from __switch_to() with no stack.  It then jumps
> straight to ret_from_fork in entry_64.S, calls a few C functions, and
> eventually returns to user space.  So, assuming we aren't patching entry
> code or the switch_to() macro in __schedule(), it should be safe to
> patch the task before it does all that.
> 
> With the current code, if an unpatched task gets forked, the child will
> also be unpatched.  In theory, we could go ahead and patch the child
> then.  In fact, that's what I did in v1.9.
> 
> But in v1.9 discussions it was pointed out that someday maybe the
> ret_from_fork stuff will get cleaned up and instead the child stack will
> be copied from the parent.  In that case the child should inherit its
> parent's patched state.  So we decided to make it more future-proof by
> having the child inherit the parent's patched state.
> 
> So, having said all that, I'm really not sure what the best approach is
> for print_context_stack_reliable().  Right now I'm thinking I'll change
> it back to return -EINVAL for a newly forked task, so it'll be more
> future-proof: better to have a false positive than a false negative.
> Either way it will probably need to be changed again if the
> ret_from_fork code gets cleaned up.

I'd be for returning -EINVAL. It is a safe play for now.

[...]
 
> > Finally, the following is always called right after
> > klp_start_transition(), so I would call it from there.
> > 
> > if (!klp_try_complete_transition())
> > klp_schedule_work();

On the other hand it is quite nice to see the sequence

init
start
try_complete

there. Just my 2 cents.

> Except for when it's called by klp_reverse_transition().  And it really
> depends on whether we want to allow transition.c to use the mutex.  I
> don't have a strong opinion either way, I may need to think about it
> some more.

Miroslav

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-05 Thread Petr Mladek
On Wed 2016-05-04 10:51:21, Josh Poimboeuf wrote:
> On Wed, May 04, 2016 at 10:42:23AM +0200, Petr Mladek wrote:
> > On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is the
> > > foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is the
> > > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > > --- a/kernel/fork.c
> > > +++ b/kernel/fork.c
> > > @@ -76,6 +76,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  
> > >  #include 
> > >  #include 
> > > @@ -1586,6 +1587,8 @@ static struct task_struct *copy_process(unsigned 
> > > long clone_flags,
> > >   p->parent_exec_id = current->self_exec_id;
> > >   }
> > >  
> > > + klp_copy_process(p);
> > 
> > I am in doubts here. We copy the state from the parent here. It means
> > that the new process might still need to be converted. But at the same
> > point print_context_stack_reliable() returns zero without printing
> > any stack trace when TIF_FORK flag is set. It means that a freshly
> > forked task might get be converted immediately. I seems that boot
> > operations are always done when copy_process() is called. But
> > they are contradicting each other.
> > 
> > I guess that print_context_stack_reliable() should either return
> > -EINVAL when TIF_FORK is set. Or it should try to print the
> > stack of the newly forked task.
> > 
> > Or do I miss something, please?
> 
> Ok, I admit it's confusing.
> 
> A newly forked task doesn't *have* a stack (other than the pt_regs frame
> it needs for the return to user space), which is why
> print_context_stack_reliable() returns success with an empty array of
> addresses.
> 
> For a little background, see the second switch_to() macro in
> arch/x86/include/asm/switch_to.h.  When a newly forked task runs for the
> first time, it returns from __switch_to() with no stack.  It then jumps
> straight to ret_from_fork in entry_64.S, calls a few C functions, and
> eventually returns to user space.  So, assuming we aren't patching entry
> code or the switch_to() macro in __schedule(), it should be safe to
> patch the task before it does all that.

This is a great explanation. Thanks for it.

> So, having said all that, I'm really not sure what the best approach is
> for print_context_stack_reliable().  Right now I'm thinking I'll change
> it back to return -EINVAL for a newly forked task, so it'll be more
> future-proof: better to have a false positive than a false negative.
> Either way it will probably need to be changed again if the
> ret_from_fork code gets cleaned up.

I would prefer the -EINVAL. It might save some hair when anyone
is working on patching the switch_to stuff. Also it is not that
big a loss because most tasks will get migrated on the return to
userspace.

It might help a bit with the newly forked kthreads. But there should
be a safer location where the new kthreads might get migrated,
e.g. right before the main function gets called.
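
[Editor's note: purely as an illustration of that "safer location" idea;
nothing like this exists in the patch set.  The helper names are taken from
this series, but the placement is hypothetical.]

static int my_kthread_fn(void *data)
{
	/*
	 * A freshly forked kthread has no meaningful stack to check yet,
	 * so it could migrate itself right before entering its main loop.
	 */
	if (klp_patch_pending(current))
		klp_patch_task(current);

	while (!kthread_should_stop()) {
		/* ... the kthread's normal work ... */
		set_current_state(TASK_INTERRUPTIBLE);
		schedule();
	}
	__set_current_state(TASK_RUNNING);

	return 0;
}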


> > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > new file mode 100644
> > > index 000..92819bb
> > > --- /dev/null
> > > +++ b/kernel/livepatch/transition.c
> > > +/*
> > > + * This function can be called in the middle of an existing transition to
> > > + * reverse the direction of the target patch state.  This can be done to
> > > + * effectively cancel an existing enable or disable operation if there 
> > > are any
> > > + * tasks which are stuck in the initial patch state.
> > > + */
> > > +void klp_reverse_transition(void)
> > > +{
> > > + struct klp_patch *patch = klp_transition_patch;
> > > +
> > > + klp_target_state = !klp_target_state;
> > > +
> > > + /*
> > > +  * Ensure that if another CPU goes through the syscall barrier, sees
> > > +  * the TIF_PATCH_PENDING writes in klp_start_transition(), and calls
> > > +  * klp_patch_task(), it also sees the above write to the target state.
> > > +  * Otherwise it can put the task in the wrong universe.
> > > +  */
> > > + smp_wmb();
> > > +
> > > + klp_start_transition();
> > > + klp_try_complete_transition();
> > 
> > It is a bit strange that we keep the work scheduled. It might be
> > better to use
> > 
> >mod_delayed_work(system_wq, &klp_work, 0);
> 
> True, I think that would be better.
> 
> > Which triggers more ideas from the nitpicking deparment:
> > 
> > I would move the work definition from core.c to transition.c because
> > it is closely related to klp_try_complete_transition();
> 
> That could be good, but there's a slight problem: klp_work_fn() requires
> klp_mutex, which is static to core.c.  It's kind of nice to keep the use
> of the mutex in core.c only.

I see and am surprised that we take the lock only in core.c ;-)

I do not have a strong opinion then. Just a small one. The lock guards
also operations from the other .c files. I think that it i

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-06 Thread Petr Mladek
On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 782fbb5..b3b8639 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include "patch.h"
> +#include "transition.h"
>  
>  static LIST_HEAD(klp_ops);
>  
> @@ -58,11 +59,42 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>   ops = container_of(fops, struct klp_ops, fops);
>  
>   rcu_read_lock();
> +
>   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - if (WARN_ON_ONCE(!func))
> +
> + if (!func)
>   goto unlock;
>  
> + /*
> +  * See the comment for the 2nd smp_wmb() in klp_init_transition() for
> +  * an explanation of why this read barrier is needed.
> +  */
> + smp_rmb();
> +
> + if (unlikely(func->transition)) {
> +
> + /*
> +  * See the comment for the 1st smp_wmb() in
> +  * klp_init_transition() for an explanation of why this read
> +  * barrier is needed.
> +  */
> + smp_rmb();

I would add here:

WARN_ON_ONCE(current->patch_state == KLP_UNDEFINED);

We do not know in which context this is called, so the printk's are
not ideal. But it will get triggered only if there is a bug in
the livepatch implementation. It should happen at random locations
and rather early after a bug is introduced.

Anyway, better to die and catch the bug than to let the system run
in an undefined state and produce cryptic errors later on.


> + if (current->patch_state == KLP_UNPATCHED) {
> + /*
> +  * Use the previously patched version of the function.
> +  * If no previous patches exist, use the original
> +  * function.
> +  */
> + func = list_entry_rcu(func->stack_node.next,
> +   struct klp_func, stack_node);
> +
> + if (&func->stack_node == &ops->func_stack)
> + goto unlock;
> + }
> + }

I have been staring into the code for too long now. I need to step back for a
while. I'll take another look when you send the next version. Anyway,
you did great work. I speak mainly for the livepatch part and
I like it.

Best Regards,
Petr

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-06 Thread Josh Poimboeuf
On Fri, May 06, 2016 at 01:33:01PM +0200, Petr Mladek wrote:
> On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> > index 782fbb5..b3b8639 100644
> > --- a/kernel/livepatch/patch.c
> > +++ b/kernel/livepatch/patch.c
> > @@ -29,6 +29,7 @@
> >  #include 
> >  #include 
> >  #include "patch.h"
> > +#include "transition.h"
> >  
> >  static LIST_HEAD(klp_ops);
> >  
> > @@ -58,11 +59,42 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> > ops = container_of(fops, struct klp_ops, fops);
> >  
> > rcu_read_lock();
> > +
> > func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >   stack_node);
> > -   if (WARN_ON_ONCE(!func))
> > +
> > +   if (!func)
> > goto unlock;
> >  
> > +   /*
> > +* See the comment for the 2nd smp_wmb() in klp_init_transition() for
> > +* an explanation of why this read barrier is needed.
> > +*/
> > +   smp_rmb();
> > +
> > +   if (unlikely(func->transition)) {
> > +
> > +   /*
> > +* See the comment for the 1st smp_wmb() in
> > +* klp_init_transition() for an explanation of why this read
> > +* barrier is needed.
> > +*/
> > +   smp_rmb();
> 
> I would add here:
> 
>   WARN_ON_ONCE(current->patch_state == KLP_UNDEFINED);
> 
> We do not know in which context this is called, so the printk's are
> not ideal. But it will get triggered only if there is a bug in
> the livepatch implementation. It should happen on random locations
> and rather early when a bug is introduced.
> 
> Anyway, better to die and catch the bug that let the system running
> in an undefined state and produce cryptic errors later on.

Ok, makes sense.

> > +   if (current->patch_state == KLP_UNPATCHED) {
> > +   /*
> > +* Use the previously patched version of the function.
> > +* If no previous patches exist, use the original
> > +* function.
> > +*/
> > +   func = list_entry_rcu(func->stack_node.next,
> > + struct klp_func, stack_node);
> > +
> > +   if (&func->stack_node == &ops->func_stack)
> > +   goto unlock;
> > +   }
> > +   }
> 
> I am staring into the code for too long now. I need to step back for a
> while. I'll do another look when you send the next version. Anyway,
> you did a great work. I speak mainly for the livepatch part and
> I like it.

Thanks for the helpful reviews!  I'll be on vacation again next week so
I get a break too :-)

-- 
Josh

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-09 Thread Miroslav Benes

[...]

> +static int klp_target_state;

[...]

> +void klp_init_transition(struct klp_patch *patch, int state)
> +{
> + struct task_struct *g, *task;
> + unsigned int cpu;
> + struct klp_object *obj;
> + struct klp_func *func;
> + int initial_state = !state;
> +
> + klp_transition_patch = patch;
> +
> + /*
> +  * If the patch can be applied or reverted immediately, skip the
> +  * per-task transitions.
> +  */
> + if (patch->immediate)
> + return;
> +
> + /*
> +  * Initialize all tasks to the initial patch state to prepare them for
> +  * switching to the target state.
> +  */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, task)
> + task->patch_state = initial_state;
> + read_unlock(&tasklist_lock);
> +
> + /*
> +  * Ditto for the idle "swapper" tasks.
> +  */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + idle_task(cpu)->patch_state = initial_state;
> + put_online_cpus();
> +
> + /*
> +  * Ensure klp_ftrace_handler() sees the task->patch_state updates
> +  * before the func->transition updates.  Otherwise it could read an
> +  * out-of-date task state and pick the wrong function.
> +  */
> + smp_wmb();
> +
> + /*
> +  * Set the func transition states so klp_ftrace_handler() will know to
> +  * switch to the transition logic.
> +  *
> +  * When patching, the funcs aren't yet in the func_stack and will be
> +  * made visible to the ftrace handler shortly by the calls to
> +  * klp_patch_object().
> +  *
> +  * When unpatching, the funcs are already in the func_stack and so are
> +  * already visible to the ftrace handler.
> +  */
> + klp_for_each_object(patch, obj)
> + klp_for_each_func(obj, func)
> + func->transition = true;
> +
> + /*
> +  * Set the global target patch state which tasks will switch to.  This
> +  * has no effect until the TIF_PATCH_PENDING flags get set later.
> +  */
> + klp_target_state = state;

I am afraid there is a problem for (patch->immediate == true) patches. 
klp_target_state is not set for those and the comment is not entirely 
true, because klp_target_state has an effect in several places.

[...]

> +void klp_start_transition(void)
> +{
> + struct task_struct *g, *task;
> + unsigned int cpu;
> +
> + pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> +   klp_target_state == KLP_PATCHED ? "patching" : "unpatching");

Here...

> +
> + /*
> +  * If the patch can be applied or reverted immediately, skip the
> +  * per-task transitions.
> +  */
> + if (klp_transition_patch->immediate)
> + return;
> +

[...]

> +bool klp_try_complete_transition(void)
> +{
> + unsigned int cpu;
> + struct task_struct *g, *task;
> + bool complete = true;
> +
> + /*
> +  * If the patch can be applied or reverted immediately, skip the
> +  * per-task transitions.
> +  */
> + if (klp_transition_patch->immediate)
> + goto success;
> +
> + /*
> +  * Try to switch the tasks to the target patch state by walking their
> +  * stacks and looking for any to-be-patched or to-be-unpatched
> +  * functions.  If such functions are found on a stack, or if the stack
> +  * is deemed unreliable, the task can't be switched yet.
> +  *
> +  * Usually this will transition most (or all) of the tasks on a system
> +  * unless the patch includes changes to a very common function.
> +  */
> + read_lock(&tasklist_lock);
> + for_each_process_thread(g, task)
> + if (!klp_try_switch_task(task))
> + complete = false;
> + read_unlock(&tasklist_lock);
> +
> + /*
> +  * Ditto for the idle "swapper" tasks.
> +  */
> + get_online_cpus();
> + for_each_online_cpu(cpu)
> + if (!klp_try_switch_task(idle_task(cpu)))
> + complete = false;
> + put_online_cpus();
> +
> + /*
> +  * Some tasks weren't able to be switched over.  Try again later and/or
> +  * wait for other methods like syscall barrier switching.
> +  */
> + if (!complete)
> + return false;
> +
> +success:
> + /*
> +  * When unpatching, all tasks have transitioned to KLP_UNPATCHED so we
> +  * can now remove the new functions from the func_stack.
> +  */
> + if (klp_target_state == KLP_UNPATCHED) {

Here (this is the most important one I think).

> + klp_unpatch_objects(klp_transition_patch);
> +
> + /*
> +  * Don't allow any existing instances of ftrace handlers to
> +  * access any obsolete funcs before we reset the func
> +  * transition states to false.  Otherwise the handler may see
> +  * the deleted "new" func, see that it's not in tra

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-10 Thread Miroslav Benes
On Thu, 28 Apr 2016, Josh Poimboeuf wrote:

> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> This code stems from the design proposal made by Vojtech [1] in November
> 2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
> consistency and syscall barrier switching combined with kpatch's stack
> trace switching.  There are also a number of fallback options which make
> it quite flexible.
> 
> Patches are applied on a per-task basis, when the task is deemed safe to
> switch over.  When a patch is enabled, livepatch enters into a
> transition state where tasks are converging to the patched state.
> Usually this transition state can complete in a few seconds.  The same
> sequence occurs when a patch is disabled, except the tasks converge from
> the patched state to the unpatched state.
> 
> An interrupt handler inherits the patched state of the task it
> interrupts.  The same is true for forked tasks: the child inherits the
> patched state of the parent.
> 
> Livepatch uses several complementary approaches to determine when it's
> safe to patch tasks:
> 
> 1. The first and most effective approach is stack checking of sleeping
>tasks.  If no affected functions are on the stack of a given task,
>the task is patched.  In most cases this will patch most or all of
>the tasks on the first try.  Otherwise it'll keep trying
>periodically.  This option is only available if the architecture has
>reliable stacks (CONFIG_RELIABLE_STACKTRACE and
>CONFIG_STACK_VALIDATION).
> 
> 2. The second approach, if needed, is kernel exit switching.  A
>task is switched when it returns to user space from a system call, a
>user space IRQ, or a signal.  It's useful in the following cases:
> 
>a) Patching I/O-bound user tasks which are sleeping on an affected
>   function.  In this case you have to send SIGSTOP and SIGCONT to
>   force it to exit the kernel and be patched.
>b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
>   then it will get patched the next time it gets interrupted by an
>   IRQ.
>c) Applying patches for architectures which don't yet have
>   CONFIG_RELIABLE_STACKTRACE.  In this case you'll have to signal
>   most of the tasks on the system.  However this isn't a complete
>   solution, because there's currently no way to patch kthreads
>   without CONFIG_RELIABLE_STACKTRACE.
> 
>Note: since idle "swapper" tasks don't ever exit the kernel, they
>instead have a kpatch_patch_task() call in the idle loop which allows

s/kpatch_patch_task()/klp_patch_task()/

[...]

> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt
> @@ -72,7 +72,8 @@ example, they add a NULL pointer or a boundary check, fix a 
> race by adding
>  a missing memory barrier, or add some locking around a critical section.
>  Most of these changes are self contained and the function presents itself
>  the same way to the rest of the system. In this case, the functions might
> -be updated independently one by one.
> +be updated independently one by one.  (This can be done by setting the
> +'immediate' flag in the klp_patch struct.)
>  
>  But there are more complex fixes. For example, a patch might change
>  ordering of locking in multiple functions at the same time. Or a patch
> @@ -86,20 +87,103 @@ or no data are stored in the modified structures at the 
> moment.
>  The theory about how to apply functions a safe way is rather complex.
>  The aim is to define a so-called consistency model. It attempts to define
>  conditions when the new implementation could be used so that the system
> -stays consistent. The theory is not yet finished. See the discussion at
> -http://thread.gmane.org/gmane.linux.kernel/1823033/focus=1828189
> -
> -The current consistency model is very simple. It guarantees that either
> -the old or the new function is called. But various functions get redirected
> -one by one without any synchronization.
> -
> -In other words, the current implementation _never_ modifies the behavior
> -in the middle of the call. It is because it does _not_ rewrite the entire
> -function in the memory. Instead, the function gets redirected at the
> -very beginning. But this redirection is used immediately even when
> -some other functions from the same patch have not been redirected yet.
> -
> -See also the section "Limitations" below.
> +stays consistent.
> +
> +Livepatch has a consistency model which is a hybrid of kGraft and
> +kpatch:  it uses kGraft's per-task consistency and syscall barrier
> +switching combined with kpatch's stack trace switching.  There are also
> +a number of fallback options which make it quite flexible.
> +
> +Patches are applied on a 

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-16 Thread Josh Poimboeuf
On Mon, May 09, 2016 at 11:41:37AM +0200, Miroslav Benes wrote:
> > +void klp_init_transition(struct klp_patch *patch, int state)
> > +{
> > +   struct task_struct *g, *task;
> > +   unsigned int cpu;
> > +   struct klp_object *obj;
> > +   struct klp_func *func;
> > +   int initial_state = !state;
> > +
> > +   klp_transition_patch = patch;
> > +
> > +   /*
> > +* If the patch can be applied or reverted immediately, skip the
> > +* per-task transitions.
> > +*/
> > +   if (patch->immediate)
> > +   return;
> > +
> > +   /*
> > +* Initialize all tasks to the initial patch state to prepare them for
> > +* switching to the target state.
> > +*/
> > +   read_lock(&tasklist_lock);
> > +   for_each_process_thread(g, task)
> > +   task->patch_state = initial_state;
> > +   read_unlock(&tasklist_lock);
> > +
> > +   /*
> > +* Ditto for the idle "swapper" tasks.
> > +*/
> > +   get_online_cpus();
> > +   for_each_online_cpu(cpu)
> > +   idle_task(cpu)->patch_state = initial_state;
> > +   put_online_cpus();
> > +
> > +   /*
> > +* Ensure klp_ftrace_handler() sees the task->patch_state updates
> > +* before the func->transition updates.  Otherwise it could read an
> > +* out-of-date task state and pick the wrong function.
> > +*/
> > +   smp_wmb();
> > +
> > +   /*
> > +* Set the func transition states so klp_ftrace_handler() will know to
> > +* switch to the transition logic.
> > +*
> > +* When patching, the funcs aren't yet in the func_stack and will be
> > +* made visible to the ftrace handler shortly by the calls to
> > +* klp_patch_object().
> > +*
> > +* When unpatching, the funcs are already in the func_stack and so are
> > +* already visible to the ftrace handler.
> > +*/
> > +   klp_for_each_object(patch, obj)
> > +   klp_for_each_func(obj, func)
> > +   func->transition = true;
> > +
> > +   /*
> > +* Set the global target patch state which tasks will switch to.  This
> > +* has no effect until the TIF_PATCH_PENDING flags get set later.
> > +*/
> > +   klp_target_state = state;
> 
> I am afraid there is a problem for (patch->immediate == true) patches. 
> klp_target_state is not set for those and the comment is not entirely 
> true, because klp_target_state has an effect in several places.

Ah, you're right.  I moved this assignment here for v2.  It was
originally done before the patch->immediate check.  If I remember
correctly, I moved it closer to the barrier for better readability (but
I created a bug in the process).

> I guess we need to set klp_target_state even for immediate patches. Should 
> we also initialize it with KLP_UNDEFINED and set it to KLP_UNDEFINED in 
> klp_complete_transition()?

Yes, to both.
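
[Editor's note: so, roughly, the agreed change would look something like this;
a sketch only, and the exact placement in the next version may differ.]

static int klp_target_state = KLP_UNDEFINED;

void klp_init_transition(struct klp_patch *patch, int state)
{
	klp_transition_patch = patch;

	/* set the target state before the immediate check so it is always valid */
	klp_target_state = state;

	if (patch->immediate)
		return;

	/* ... per-task and per-func initialization as in the quoted code ... */
}

void klp_complete_transition(void)
{
	/* ... */
	klp_target_state = KLP_UNDEFINED;
}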

-- 
Josh

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-06-06 Thread Petr Mladek
On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.

> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 000..92819bb
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> +/*
> + * Try to safely switch a task to the target patch state.  If it's currently
> + * running, or it's sleeping on a to-be-patched or to-be-unpatched function, 
> or
> + * if the stack is unreliable, return false.
> + */
> +static bool klp_try_switch_task(struct task_struct *task)
> +{
> + struct rq *rq;
> + unsigned long flags;

This should be of type "struct rq_flags". Otherwise, I get compilation
warnings:

kernel/livepatch/transition.c: In function ‘klp_try_switch_task’:
kernel/livepatch/transition.c:349:2: warning: passing argument 2 of 
‘task_rq_lock’ from incompatible pointer type [enabled by default]
  rq = task_rq_lock(task, &flags);
  ^
In file included from kernel/livepatch/transition.c:24:0:
kernel/livepatch/../sched/sched.h:1468:12: note: expected ‘struct rq_flags *’ 
but argument is of type ‘long unsigned int *’
 struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
^
kernel/livepatch/transition.c:367:2: warning: passing argument 3 of 
‘task_rq_unlock’ from incompatible pointer type [enabled by default]
  task_rq_unlock(rq, task, &flags);
  ^
In file included from kernel/livepatch/transition.c:24:0:
kernel/livepatch/../sched/sched.h:1480:1: note: expected ‘struct rq_flags *’ 
but argument is of type ‘long unsigned int *’
 task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)


And even runtime warnings from lockdep:

[  212.847548] WARNING: CPU: 1 PID: 3847 at kernel/locking/lockdep.c:3532 
lock_release+0x431/0x480
[  212.847549] releasing a pinned lock
[  212.847550] Modules linked in: livepatch_sample(E+)
[  212.847555] CPU: 1 PID: 3847 Comm: modprobe Tainted: GE K 
4.7.0-rc1-next-20160602-4-default+ #336
[  212.847556] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Bochs 01/01/2011
[  212.847558]   880139823aa0 814388dc 
880139823af0
[  212.847562]   880139823ae0 8106fad1 
0dcc82b11390
[  212.847565]  88013fc978d8 810eea1e 8800ba0ed6d0 
0003
[  212.847569] Call Trace:
[  212.847572]  [] dump_stack+0x85/0xc9
[  212.847575]  [] __warn+0xd1/0xf0
[  212.847578]  [] ? klp_try_switch_task.part.3+0x5e/0x2b0
[  212.847580]  [] warn_slowpath_fmt+0x4f/0x60
[  212.847582]  [] lock_release+0x431/0x480
[  212.847585]  [] ? dump_trace+0x118/0x310
[  212.847588]  [] ? entry_SYSCALL_64_fastpath+0x1f/0xbd
[  212.847590]  [] _raw_spin_unlock+0x1f/0x30
[  212.847600]  [] klp_try_switch_task.part.3+0x5e/0x2b0
[  212.847603]  [] klp_try_complete_transition+0x84/0x190
[  212.847605]  [] __klp_enable_patch+0xb0/0x130
[  212.847607]  [] klp_enable_patch+0x55/0x80
[  212.847610]  [] ? livepatch_cmdline_proc_show+0x30/0x30 
[livepatch_sample]
[  212.847613]  [] livepatch_init+0x31/0x70 [livepatch_sample]
[  212.847615]  [] ? livepatch_cmdline_proc_show+0x30/0x30 
[livepatch_sample]
[  212.847617]  [] do_one_initcall+0x3d/0x160
[  212.847629]  [] ? do_init_module+0x27/0x1e4
[  212.847632]  [] ? rcu_read_lock_sched_held+0x62/0x70
[  212.847634]  [] ? kmem_cache_alloc_trace+0x282/0x340
[  212.847636]  [] do_init_module+0x60/0x1e4
[  212.847638]  [] load_module+0x1482/0x1d40
[  212.847640]  [] ? __symbol_put+0x40/0x40
[  212.847643]  [] SYSC_finit_module+0xa9/0xd0
[  212.847645]  [] SyS_finit_module+0xe/0x10
[  212.847647]  [] entry_SYSCALL_64_fastpath+0x1f/0xbd
[  212.847649] ---[ end trace e4e9f09d45443049 ]---


> + int ret;
> + bool success = false;
> +
> + /* check if this task has already switched over */
> + if (task->patch_state == klp_target_state)
> + return true;
> +
> + /*
> +  * For arches which don't have reliable stack traces, we have to rely
> +  * on other methods (e.g., switching tasks at the syscall barrier).
> +  */
> + if (!IS_ENABLED(CONFIG_RELIABLE_STACKTRACE))
> + return false;
> +
> + /*
> +  * Now try to check the stack for any to-be-patched or to-be-unpatched
> +  * functions.  If all goes well, switch the task to the target patch
> +  * state.
> +  */
> + rq = task_rq_lock(task, &flags);
> +
> + if (task_running(rq, task) && task != current) {
> + pr_debug("%s: pid %d (%s) is running\n", __func__, task->pid,
> +  task->comm);

Also, I thought about using printk_deferred() inside the rq lock, but
it is not strictly needed. We use only pr_debug() here, which
is a NOP when not enabled.

Best 

Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-06-06 Thread Josh Poimboeuf
On Mon, Jun 06, 2016 at 03:54:41PM +0200, Petr Mladek wrote:
> On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> 
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > new file mode 100644
> > index 000..92819bb
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.c
> > +/*
> > + * Try to safely switch a task to the target patch state.  If it's 
> > currently
> > + * running, or it's sleeping on a to-be-patched or to-be-unpatched 
> > function, or
> > + * if the stack is unreliable, return false.
> > + */
> > +static bool klp_try_switch_task(struct task_struct *task)
> > +{
> > +   struct rq *rq;
> > +   unsigned long flags;
> 
> This should be of type "struct rq_flags". Otherwise, I get compilation
> warnings:
> 
> kernel/livepatch/transition.c: In function ‘klp_try_switch_task’:
> kernel/livepatch/transition.c:349:2: warning: passing argument 2 of 
> ‘task_rq_lock’ from incompatible pointer type [enabled by default]
>   rq = task_rq_lock(task, &flags);
>   ^
> In file included from kernel/livepatch/transition.c:24:0:
> kernel/livepatch/../sched/sched.h:1468:12: note: expected ‘struct rq_flags *’ 
> but argument is of type ‘long unsigned int *’
>  struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
> ^
> kernel/livepatch/transition.c:367:2: warning: passing argument 3 of 
> ‘task_rq_unlock’ from incompatible pointer type [enabled by default]
>   task_rq_unlock(rq, task, &flags);
>   ^
> In file included from kernel/livepatch/transition.c:24:0:
> kernel/livepatch/../sched/sched.h:1480:1: note: expected ‘struct rq_flags *’ 
> but argument is of type ‘long unsigned int *’
>  task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
> 
> 
> And even runtime warnings from lockdep:
> 
> [  212.847548] WARNING: CPU: 1 PID: 3847 at kernel/locking/lockdep.c:3532 
> lock_release+0x431/0x480
> [  212.847549] releasing a pinned lock
> [  212.847550] Modules linked in: livepatch_sample(E+)
> [  212.847555] CPU: 1 PID: 3847 Comm: modprobe Tainted: GE K 
> 4.7.0-rc1-next-20160602-4-default+ #336
> [  212.847556] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> Bochs 01/01/2011
> [  212.847558]   880139823aa0 814388dc 
> 880139823af0
> [  212.847562]   880139823ae0 8106fad1 
> 0dcc82b11390
> [  212.847565]  88013fc978d8 810eea1e 8800ba0ed6d0 
> 0003
> [  212.847569] Call Trace:
> [  212.847572]  [] dump_stack+0x85/0xc9
> [  212.847575]  [] __warn+0xd1/0xf0
> [  212.847578]  [] ? klp_try_switch_task.part.3+0x5e/0x2b0
> [  212.847580]  [] warn_slowpath_fmt+0x4f/0x60
> [  212.847582]  [] lock_release+0x431/0x480
> [  212.847585]  [] ? dump_trace+0x118/0x310
> [  212.847588]  [] ? entry_SYSCALL_64_fastpath+0x1f/0xbd
> [  212.847590]  [] _raw_spin_unlock+0x1f/0x30
> [  212.847600]  [] klp_try_switch_task.part.3+0x5e/0x2b0
> [  212.847603]  [] klp_try_complete_transition+0x84/0x190
> [  212.847605]  [] __klp_enable_patch+0xb0/0x130
> [  212.847607]  [] klp_enable_patch+0x55/0x80
> [  212.847610]  [] ? livepatch_cmdline_proc_show+0x30/0x30 
> [livepatch_sample]
> [  212.847613]  [] livepatch_init+0x31/0x70 
> [livepatch_sample]
> [  212.847615]  [] ? livepatch_cmdline_proc_show+0x30/0x30 
> [livepatch_sample]
> [  212.847617]  [] do_one_initcall+0x3d/0x160
> [  212.847629]  [] ? do_init_module+0x27/0x1e4
> [  212.847632]  [] ? rcu_read_lock_sched_held+0x62/0x70
> [  212.847634]  [] ? kmem_cache_alloc_trace+0x282/0x340
> [  212.847636]  [] do_init_module+0x60/0x1e4
> [  212.847638]  [] load_module+0x1482/0x1d40
> [  212.847640]  [] ? __symbol_put+0x40/0x40
> [  212.847643]  [] SYSC_finit_module+0xa9/0xd0
> [  212.847645]  [] SyS_finit_module+0xe/0x10
> [  212.847647]  [] entry_SYSCALL_64_fastpath+0x1f/0xbd
> [  212.847649] ---[ end trace e4e9f09d45443049 ]---

Thanks, I also saw this when rebasing onto a newer linux-next.
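
[Editor's note: for reference, a sketch of the type fix.  Only the flags
declaration changes relative to the quoted code; the lock/unlock calls stay
the same, and the middle of the function is elided here.]

static bool klp_try_switch_task(struct task_struct *task)
{
	struct rq *rq;
	struct rq_flags flags;		/* was: unsigned long flags */
	bool success = false;

	/* check if this task has already switched over */
	if (task->patch_state == klp_target_state)
		return true;

	if (!IS_ENABLED(CONFIG_RELIABLE_STACKTRACE))
		return false;

	rq = task_rq_lock(task, &flags);
	/* ... running check and stack walk elided ... */
	task_rq_unlock(rq, task, &flags);

	return success;
}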

> > +   int ret;
> > +   bool success = false;
> > +
> > +   /* check if this task has already switched over */
> > +   if (task->patch_state == klp_target_state)
> > +   return true;
> > +
> > +   /*
> > +* For arches which don't have reliable stack traces, we have to rely
> > +* on other methods (e.g., switching tasks at the syscall barrier).
> > +*/
> > +   if (!IS_ENABLED(CONFIG_RELIABLE_STACKTRACE))
> > +   return false;
> > +
> > +   /*
> > +* Now try to check the stack for any to-be-patched or to-be-unpatched
> > +* functions.  If all goes well, switch the task to the target patch
> > +* state.
> > +*/
> > +   rq = task_rq_lock(task, &flags);
> > +
> > +   if (task_runnin

barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Petr Mladek
On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.

I spent a lot of time checking the memory barriers. It seems that
they are basically correct. Let me use my own words to show how
I understand it. I hope that it will help others with the review.

> diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
> index 782fbb5..b3b8639 100644
> --- a/kernel/livepatch/patch.c
> +++ b/kernel/livepatch/patch.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include "patch.h"
> +#include "transition.h"
>  
>  static LIST_HEAD(klp_ops);
>  
> @@ -58,11 +59,42 @@ static void notrace klp_ftrace_handler(unsigned long ip,
>   ops = container_of(fops, struct klp_ops, fops);
>  
>   rcu_read_lock();
> +
>   func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> stack_node);
> - if (WARN_ON_ONCE(!func))
> +
> + if (!func)
>   goto unlock;
>  
> + /*
> +  * See the comment for the 2nd smp_wmb() in klp_init_transition() for
> +  * an explanation of why this read barrier is needed.
> +  */
> + smp_rmb();

I would prefer to be more explicit, e.g.

/*
 * Read the right func->transition when the struct appeared on top of
 * func_stack.  See klp_init_transition and klp_patch_func().
 */

Note that this barrier is not really needed when the patch is being
disabled, see below.

> +
> + if (unlikely(func->transition)) {
> +
> + /*
> +  * See the comment for the 1st smp_wmb() in
> +  * klp_init_transition() for an explanation of why this read
> +  * barrier is needed.
> +  */
> + smp_rmb();

Similar here:

/*
 * Read the right initial state when func->transition was
 * enabled, see klp_init_transition().
 *
 * Note that the task must never be migrated to the target
 * state when being inside this ftrace handler.
 */

We might want to move the second paragraph on top of the function.
It is a basic and important fact. It actually explains why the first
read barrier is not needed when the patch is being disabled.

There are some more details below. I started to check and comment the
barriers from klp_init_transition().


> + if (current->patch_state == KLP_UNPATCHED) {
> + /*
> +  * Use the previously patched version of the function.
> +  * If no previous patches exist, use the original
> +  * function.
> +  */
> + func = list_entry_rcu(func->stack_node.next,
> +   struct klp_func, stack_node);
> +
> + if (&func->stack_node == &ops->func_stack)
> + goto unlock;
> + }
> + }
> +
>   klp_arch_set_pc(regs, (unsigned long)func->new_func);
>  unlock:
>   rcu_read_unlock();
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 000..92819bb
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> +/*
> + * klp_patch_task() - change the patched state of a task
> + * @task:The task to change
> + *
> + * Switches the patched state of the task to the set of functions in the 
> target
> + * patch state.
> + */
> +void klp_patch_task(struct task_struct *task)
> +{
> + clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> +
> + /*
> +  * The corresponding write barriers are in klp_init_transition() and
> +  * klp_reverse_transition().  See the comments there for an explanation.
> +  */
> + smp_rmb();

I would prefer to be more explicit, e.g.

/*
 * Read the correct klp_target_state when TIF_PATCH_PENDING was set
 * and this function was called.  See klp_init_transition() and
 * klp_reverse_transition().
 */
> +
> + task->patch_state = klp_target_state;
> +}

The function name confused me a few times when klp_target_state
was KLP_UNPATCHED. I suggest renaming it to klp_update_task()
or klp_transit_task().

> +/*
> + * Initialize the global target patch state and all tasks to the initial 
> patch
> + * state, and initialize all function transition states to true in 
> preparation
> + * for patching or unpatching.
> + */
> +void klp_init_transition(struct klp_patch *patch, int state)
> +{
> + struct task_struct *g, *task;
> + unsigned int cpu;
> + struct klp_object *obj;
> + struct klp_func *func;
> + int initial_state = !state;
> +
> + klp_transition_patch = patch;

klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Petr Mladek
On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> new file mode 100644
> index 000..92819bb
> --- /dev/null
> +++ b/kernel/livepatch/transition.c
> +/*
> + * klp_patch_task() - change the patched state of a task
> + * @task:  The task to change
> + *
> + * Switches the patched state of the task to the set of functions in the target
> + * patch state.
> + */
> +void klp_patch_task(struct task_struct *task)
> +{
> + clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> +
> + /*
> +  * The corresponding write barriers are in klp_init_transition() and
> +  * klp_reverse_transition().  See the comments there for an explanation.
> +  */
> + smp_rmb();
> +
> + task->patch_state = klp_target_state;
> +}
> +
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index bd12c6c..60d633f 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -266,6 +267,9 @@ static void cpu_idle_loop(void)
>  
>   sched_ttwu_pending();
>   schedule_preempt_disabled();
> +
> + if (unlikely(klp_patch_pending(current)))
> + klp_patch_task(current);
>   }

Some more ideas from the world of crazy races. I was shaking my head
over whether this was safe or not.

The problem might be if the task gets rescheduled between the check
for the pending flag and the klp_patch_task() call, or inside the
klp_patch_task() function itself. This will get even more important
when we use this construct in more locations, e.g. in some kthreads.

If the task is sleeping in one of these strange locations, it might
assign strange values at strange times.

I think that it is safe only because it is called with the
'current' parameter and in safe locations. It means that
the result is always safe and consistent. Also we could assign
an outdated value only when sleeping between reading klp_target_state
and storing task->patch_state. But if anyone modified
klp_target_state at this point, they also set TIF_PATCH_PENDING,
so the change will not get lost.

I think that we should document that klp_patch_task() must be
called only from a safe location from within the affected task.

I even suggest avoiding misuse by removing the struct task_struct *
parameter. It should always be called with current.

Best Regards,
Petr

Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Peter Zijlstra
On Wed, May 04, 2016 at 02:39:40PM +0200, Petr Mladek wrote:
> > +* This barrier also ensures that if another CPU goes through the
> > +* syscall barrier, sees the TIF_PATCH_PENDING writes in
> > +* klp_start_transition(), and calls klp_patch_task(), it also sees the
> > +* above write to the target state.  Otherwise it can put the task in
> > +* the wrong universe.
> > +*/
> 
> By other words, it makes sure that klp_patch_task() will assign the
> right patch_state. Where klp_patch_task() could not be called
> before we set TIF_PATCH_PENDING in klp_start_transition().
> 
> > +   smp_wmb();
> > +}

So I've not read the patch; but ending a function with an smp_wmb()
feels wrong.

A wmb orders two stores, and I feel both stores should be well visible
in the same function.
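
For illustration, the pattern being described (a toy, not livepatch code)
is that both stores around the barrier live in one place:

	int data;
	int ready;

	void publish(int value)
	{
		data = value;	/* store 1 */
		smp_wmb();	/* order store 1 before store 2 */
		ready = 1;	/* store 2, in the same function */
	}

whereas a wmb() as the last statement of a function orders a store against
some other store that only happens later, in a different function, which is
much harder to see and review.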

Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Petr Mladek
On Wed 2016-05-04 14:39:40, Petr Mladek wrote:
>*
>* Note that the task must never be migrated to the target
>* state when being inside this ftrace handler.
>*/
> 
> We might want to move the second paragraph on top of the function.
> It is a basic and important fact. It actually explains why the first
> read barrier is not needed when the patch is being disabled.

I wrote the statement partly intuitively. I think that it is really
somewhat important. And I am slightly in doubt whether we are on the safe side.

First, why is it important that the task->patch_state is not switched
when being inside the ftrace handler?

If we are inside the handler, we are kind-of inside the called
function. And the basic idea of this consistency model is that
we must not switch a task when it is inside a patched function.
This is normally decided by the stack.

The handler is a bit special because it is called right before the
function. If it was the only patched function on the stack, it would
not matter if we choose the new or old code. Both decisions would
be safe for the moment.

The fun starts when the function calls another patched function.
The other patched function must be called consistently with
the first one. If the first function was from the patch,
the other must be from the patch as well and vice versa.

This is why we must not switch task->patch_state dangerously
when being inside the ftrace handler.
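
A made-up example of the problem (hypothetical functions, not from any
real patch):

	static int bar(void)     { return 41; }		/* original */
	static int bar_new(void) { return 42; }		/* patched  */

	static int foo(void)     { return bar() + 1; }	/* original: 42 */
	static int foo_new(void) { return bar(); }	/* patched: relies on
							 * bar() being redirected
							 * to bar_new() */

If a task enters foo_new() through the ftrace handler but its
task->patch_state flips before the nested bar() call is redirected, it
ends up running foo_new() together with the original bar() and returns 41.
That mixed result is exactly the inconsistency the model must rule out.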

Now I am not sure if this condition is fulfilled. The ftrace handler
is called as the very first instruction of the function. Doesn't
that break the stack validity? Could we sleep inside the ftrace
handler? Will the patched function be detected on the stack?

Or is my brain already too far in the fantasy world?


Best regards,
Petr

Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Jiri Kosina
On Wed, 4 May 2016, Petr Mladek wrote:

> > +
> > +   if (unlikely(klp_patch_pending(current)))
> > +   klp_patch_task(current);
> > }
> 
> Some more ideas from the world of crazy races. I was shaking my head
> if this was safe or not.
> 
> The problem might be if the task get rescheduled between the check
> for the pending stuff 

The code in question is running with preemption disabled.

> or inside the klp_patch_task() function. 

We must make sure that this function doesn't go to sleep. It's only used 
to clear the task_struct flag anyway.

-- 
Jiri Kosina
SUSE Labs

Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Josh Poimboeuf
On Wed, May 04, 2016 at 03:53:29PM +0200, Peter Zijlstra wrote:
> On Wed, May 04, 2016 at 02:39:40PM +0200, Petr Mladek wrote:
> > > +  * This barrier also ensures that if another CPU goes through the
> > > +  * syscall barrier, sees the TIF_PATCH_PENDING writes in
> > > +  * klp_start_transition(), and calls klp_patch_task(), it also sees the
> > > +  * above write to the target state.  Otherwise it can put the task in
> > > +  * the wrong universe.

(oops, missed a "universe" -> "patch state" rename)

> > > +  */
> > 
> > By other words, it makes sure that klp_patch_task() will assign the
> > right patch_state. Where klp_patch_task() could not be called
> > before we set TIF_PATCH_PENDING in klp_start_transition().
> > 
> > > + smp_wmb();
> > > +}
> 
> So I've not read the patch; but ending a function with an smp_wmb()
> feels wrong.
> 
> A wmb orders two stores, and I feel both stores should be well visible
> in the same function.

Yeah, I would agree with that.  And also, it's probably a red flag that
the barrier needs *three* paragraphs to describe the various cases it's
needed for.

However, there are some complications:

1) The stores are in separate functions (which is generally a good
   thing as it greatly helps the readability of the code).

2) Which stores are being ordered depends on whether the function is
   called in the enable path or the disable path.

3) Either way it actually orders *two* separate pairs of stores.

Anyway I'm thinking I should move that barrier out of
klp_init_transition() and into its callers.  The stores will still be in
separate functions but at least there will be better visibility of where
the stores are occurring, and the comments can be a little more focused.
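
So, roughly, something like this on the enable side (a sketch of the
direction only; the caller shape here is hypothetical, not the final code):

	static int __klp_enable_patch(struct klp_patch *patch)
	{
		klp_init_transition(patch, KLP_PATCHED);

		/*
		 * Make the stores done in klp_init_transition()
		 * (klp_target_state, task->patch_state, func->transition)
		 * visible before the TIF_PATCH_PENDING stores done in
		 * klp_start_transition().
		 */
		smp_wmb();

		klp_start_transition();
		return 0;
	}

with an analogous barrier on the disable path, so that each wmb() sits
between the two groups of stores it is ordering.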

-- 
Josh

Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Josh Poimboeuf
On Wed, May 04, 2016 at 02:39:40PM +0200, Petr Mladek wrote:
> On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> 
> I spent a lot of time with checking the memory barriers. It seems that
> they are basically correct.  Let me use my own words to show how
> I understand it. I hope that it will help others with review.

[...snip a ton of useful comments...]

Thanks, this will help a lot!  I'll try to incorporate your barrier
comments into the code.

I also agree that klp_patch_task() is poorly named.  I was trying to
make it clear to external callers that "hey, the task is getting patched
now!", but it's internally inconsistent with livepatch code because we
make a distinction between patching and unpatching.

Maybe I'll do:

  klp_update_task_patch_state()

-- 
Josh

Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Josh Poimboeuf
On Wed, May 04, 2016 at 04:12:05PM +0200, Petr Mladek wrote:
> On Wed 2016-05-04 14:39:40, Petr Mladek wrote:
> >  *
> >  * Note that the task must never be migrated to the target
> >  * state when being inside this ftrace handler.
> >  */
> > 
> > We might want to move the second paragraph on top of the function.
> > It is a basic and important fact. It actually explains why the first
> > read barrier is not needed when the patch is being disabled.
> 
> I wrote the statement partly intuitively. I think that it is really
> somehow important. And I am slightly in doubts if we are on the safe side.
> 
> First, why is it important that the task->patch_state is not switched
> when being inside the ftrace handler?
> 
> If we are inside the handler, we are kind-of inside the called
> function. And the basic idea of this consistency model is that
> we must not switch a task when it is inside a patched function.
> This is normally decided by the stack.
> 
> The handler is a bit special because it is called right before the
> function. If it was the only patched function on the stack, it would
> not matter if we choose the new or old code. Both decisions would
> be safe for the moment.
> 
> The fun starts when the function calls another patched function.
> The other patched function must be called consistently with
> the first one. If the first function was from the patch,
> the other must be from the patch as well and vice versa.
> 
> This is why we must not switch task->patch_state dangerously
> when being inside the ftrace handler.
> 
> Now I am not sure if this condition is fulfilled. The ftrace handler
> is called as the very first instruction of the function. Does not
> it break the stack validity? Could we sleep inside the ftrace
> handler? Will the patched function be detected on the stack?
> 
> Or is my brain already too far in the fantasy world?

I think this isn't a possibility.

In today's code base, this can't happen because task patch states are
only switched when sleeping or when exiting the kernel.  The ftrace
handler doesn't sleep directly.

If it were preempted, it couldn't be switched there either because we
consider preempted stacks to be unreliable.

In theory, a DWARF stack trace of a preempted task *could* be reliable.
But then the DWARF unwinder should be smart enough to see that the
original function called the ftrace handler.  Right?  So the stack would
be reliable, but then livepatch would see the original function on the
stack and wouldn't switch the task.

Does that make sense?

-- 
Josh

Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-04 Thread Josh Poimboeuf
On Wed, May 04, 2016 at 04:48:54PM +0200, Petr Mladek wrote:
> On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > Change livepatch to use a basic per-task consistency model.  This is the
> > foundation which will eventually enable us to patch those ~10% of
> > security patches which change function or data semantics.  This is the
> > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > new file mode 100644
> > index 000..92819bb
> > --- /dev/null
> > +++ b/kernel/livepatch/transition.c
> > +/*
> > + * klp_patch_task() - change the patched state of a task
> > + * @task:  The task to change
> > + *
> > + * Switches the patched state of the task to the set of functions in the target
> > + * patch state.
> > + */
> > +void klp_patch_task(struct task_struct *task)
> > +{
> > +   clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > +
> > +   /*
> > +* The corresponding write barriers are in klp_init_transition() and
> > +* klp_reverse_transition().  See the comments there for an explanation.
> > +*/
> > +   smp_rmb();
> > +
> > +   task->patch_state = klp_target_state;
> > +}
> > +
> > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > index bd12c6c..60d633f 100644
> > --- a/kernel/sched/idle.c
> > +++ b/kernel/sched/idle.c
> > @@ -9,6 +9,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  
> > @@ -266,6 +267,9 @@ static void cpu_idle_loop(void)
> >  
> > sched_ttwu_pending();
> > schedule_preempt_disabled();
> > +
> > +   if (unlikely(klp_patch_pending(current)))
> > +   klp_patch_task(current);
> > }
> 
> Some more ideas from the world of crazy races. I was shaking my head
> if this was safe or not.
> 
> The problem might be if the task get rescheduled between the check
> for the pending stuff or inside the klp_patch_task() function.
> This will get even more important when we use this construct
> on some more locations, e.g. in some kthreads.
> 
> If the task is sleeping on this strange locations, it might assign
> strange values on strange times.
> 
> I think that it is safe only because it is called with the
> 'current' parameter and on a safe locations. It means that
> the result is always safe and consistent. Also we could assign
> an outdated value only when sleeping between reading klp_target_state
> and storing task->patch_state. But if anyone modified
> klp_target_state at this point, he also set TIF_PENDING_PATCH,
> so the change will not get lost.
> 
> I think that we should document that klp_patch_func() must be
> called only from a safe location from within the affected task.
> 
> I even suggest to avoid misuse by removing the struct *task_struct
> parameter. It should always be called with current.

Would the race involve two tasks trying to call klp_patch_task() for the
same task at the same time?  If so I don't think that would be a problem
since they would both write the same value for task->patch_state.

(Sorry if I'm being slow, I think I've managed to reach my quota of hard
thinking for the day and I don't exactly follow what the race would be.)

-- 
Josh

Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-05 Thread Petr Mladek
On Wed 2016-05-04 12:02:36, Josh Poimboeuf wrote:
> On Wed, May 04, 2016 at 02:39:40PM +0200, Petr Mladek wrote:
> > On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is the
> > > foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is the
> > > biggest remaining piece needed to make livepatch more generally useful.
> > 
> > I spent a lot of time with checking the memory barriers. It seems that
> > they are basically correct.  Let me use my own words to show how
> > I understand it. I hope that it will help others with review.
> 
> [...snip a ton of useful comments...]
> 
> Thanks, this will help a lot!  I'll try to incorporate your barrier
> comments into the code.

Thanks a lot.

> I also agree that kpatch_patch_task() is poorly named.  I was trying to
> make it clear to external callers that "hey, the task is getting patched
> now!", but it's internally inconsistent with livepatch code because we
> make a distinction between patching and unpatching.
> 
> Maybe I'll do:
> 
>   klp_update_task_patch_state()

I like it. It is long but it well describes the purpose.

Livepatch is using many state variables:

  + global:                klp_transition_patch, klp_target_state
  + per task specific:     TIF_PATCH_PENDING, patch_state
  + per each new function: transition, patched
  + per old function:      func_stack
  + per object:            patched, loaded
  + per patch:             enabled

The dependency between them and the workflow is important to
create a mental picture about the Livepatching. Good names
help with it.
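
For orientation, roughly where each of these lives (my reading of the
patch, approximate and not verbatim):

	/* global (kernel/livepatch/transition.c) */
	struct klp_patch *klp_transition_patch;
	int klp_target_state;		/* KLP_PATCHED or KLP_UNPATCHED */

	/* per task:              TIF_PATCH_PENDING flag, task->patch_state */
	/* per each new function: func->transition, func->patched           */
	/* per old function:      its entry on ops->func_stack              */
	/* per object:            patched, loaded                           */
	/* per patch:             enabled                                   */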

Best Regards,
Petr

Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-05 Thread Petr Mladek
On Wed 2016-05-04 12:25:17, Josh Poimboeuf wrote:
> On Wed, May 04, 2016 at 04:12:05PM +0200, Petr Mladek wrote:
> > On Wed 2016-05-04 14:39:40, Petr Mladek wrote:
> > >*
> > >* Note that the task must never be migrated to the target
> > >* state when being inside this ftrace handler.
> > >*/
> > > 
> > > We might want to move the second paragraph on top of the function.
> > > It is a basic and important fact. It actually explains why the first
> > > read barrier is not needed when the patch is being disabled.
> > 
> > I wrote the statement partly intuitively. I think that it is really
> > somehow important. And I am slightly in doubts if we are on the safe side.
> > 
> > First, why is it important that the task->patch_state is not switched
> > when being inside the ftrace handler?
> > 
> > If we are inside the handler, we are kind-of inside the called
> > function. And the basic idea of this consistency model is that
> > we must not switch a task when it is inside a patched function.
> > This is normally decided by the stack.
> > 
> > The handler is a bit special because it is called right before the
> > function. If it was the only patched function on the stack, it would
> > not matter if we choose the new or old code. Both decisions would
> > be safe for the moment.
> > 
> > The fun starts when the function calls another patched function.
> > The other patched function must be called consistently with
> > the first one. If the first function was from the patch,
> > the other must be from the patch as well and vice versa.
> > 
> > This is why we must not switch task->patch_state dangerously
> > when being inside the ftrace handler.
> > 
> > Now I am not sure if this condition is fulfilled. The ftrace handler
> > is called as the very first instruction of the function. Does not
> > it break the stack validity? Could we sleep inside the ftrace
> > handler? Will the patched function be detected on the stack?
> > 
> > Or is my brain already too far in the fantasy world?
> 
> I think this isn't a possibility.
> 
> In today's code base, this can't happen because task patch states are
> only switched when sleeping or when exiting the kernel.  The ftrace
> handler doesn't sleep directly.
> 
> If it were preempted, it couldn't be switched there either because we
> consider preempted stacks to be unreliable.

This was the missing piece.

> In theory, a DWARF stack trace of a preempted task *could* be reliable.
> But then the DWARF unwinder should be smart enough to see that the
> original function called the ftrace handler.  Right?  So the stack would
> be reliable, but then livepatch would see the original function on the
> stack and wouldn't switch the task.
> 
> Does that make sense?

Yup. I think that we are on the safe side. Thanks for explanation.

Best Regards,
Petr

Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-05 Thread Petr Mladek
On Wed 2016-05-04 12:57:00, Josh Poimboeuf wrote:
> On Wed, May 04, 2016 at 04:48:54PM +0200, Petr Mladek wrote:
> > On Thu 2016-04-28 15:44:48, Josh Poimboeuf wrote:
> > > Change livepatch to use a basic per-task consistency model.  This is the
> > > foundation which will eventually enable us to patch those ~10% of
> > > security patches which change function or data semantics.  This is the
> > > biggest remaining piece needed to make livepatch more generally useful.
> > > 
> > > diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> > > new file mode 100644
> > > index 000..92819bb
> > > --- /dev/null
> > > +++ b/kernel/livepatch/transition.c
> > > +/*
> > > + * klp_patch_task() - change the patched state of a task
> > > + * @task:  The task to change
> > > + *
> > > + * Switches the patched state of the task to the set of functions in the target
> > > + * patch state.
> > > + */
> > > +void klp_patch_task(struct task_struct *task)
> > > +{
> > > + clear_tsk_thread_flag(task, TIF_PATCH_PENDING);
> > > +
> > > + /*
> > > +  * The corresponding write barriers are in klp_init_transition() and
> > > +  * klp_reverse_transition().  See the comments there for an explanation.
> > > +  */
> > > + smp_rmb();
> > > +
> > > + task->patch_state = klp_target_state;
> > > +}
> > > +
> > > diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> > > index bd12c6c..60d633f 100644
> > > --- a/kernel/sched/idle.c
> > > +++ b/kernel/sched/idle.c
> > > @@ -9,6 +9,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  
> > >  #include 
> > >  
> > > @@ -266,6 +267,9 @@ static void cpu_idle_loop(void)
> > >  
> > >   sched_ttwu_pending();
> > >   schedule_preempt_disabled();
> > > +
> > > + if (unlikely(klp_patch_pending(current)))
> > > + klp_patch_task(current);
> > >   }
> > 
> > Some more ideas from the world of crazy races. I was shaking my head
> > if this was safe or not.
> > 
> > The problem might be if the task get rescheduled between the check
> > for the pending stuff or inside the klp_patch_task() function.
> > This will get even more important when we use this construct
> > on some more locations, e.g. in some kthreads.
> > 
> > If the task is sleeping on this strange locations, it might assign
> > strange values on strange times.
> > 
> > I think that it is safe only because it is called with the
> > 'current' parameter and on a safe locations. It means that
> > the result is always safe and consistent. Also we could assign
> > an outdated value only when sleeping between reading klp_target_state
> > and storing task->patch_state. But if anyone modified
> > klp_target_state at this point, he also set TIF_PENDING_PATCH,
> > so the change will not get lost.
> > 
> > I think that we should document that klp_patch_func() must be
> > called only from a safe location from within the affected task.
> > 
> > I even suggest to avoid misuse by removing the struct *task_struct
> > parameter. It should always be called with current.
> 
> Would the race involve two tasks trying to call klp_patch_task() for the
> same task at the same time?  If so I don't think that would be a problem
> since they would both write the same value for task->patch_state.

I have missed that the two commands are called with preemption
disabled. So, I had the following crazy scenario in mind:


CPU0                                    CPU1

klp_enable_patch()

  klp_target_state = KLP_PATCHED;

  for_each_task()
 set TIF_PENDING_PATCH

# task 123

if (klp_patch_pending(current)
  klp_patch_task(current)

clear TIF_PENDING_PATCH

smp_rmb();

# switch to assembly of
# klp_patch_task()

mov klp_target_state, %r12

# interrupt and schedule
# another task


  klp_reverse_transition();

klp_target_state = KLP_UNPATCHED;

klp_try_complete_transition()

  task = 123;
  if (task->patch_state == klp_target_state)
     return 0;

=> task 123 is in target state and does
not block conversion

  klp_complete_transition()


  # disable previous patch on the stack
  klp_disable_patch();

klp_target_state = KLP_UNPATCHED;
  
  
# task 123 gets scheduled again
lea %r12, task->patch_state

=> it happily stores an outdated
state


This is why the two functions should get called with preemption
disabled. We should document it at least. I imagine that we will
use them later also in another context and nobody will remember
this crazy scenario.

Well, even disabled preemption does not help. The process on
CPU1 might be also interrupted by an NMI and do some long
printk in it.

IMHO, the only safe approach is to call klp_patch_task()
only for "current" on a safe place. Then this race is harmless.
The switch happen on a safe place, so that it does not matter
into which state the process is switched.

By other words, the task state might be updated only

   + by the task itself on a safe place
   + by other task when the updated on is sleeping on a safe place

This should be well documented and the API should help to avoid
a misuse.

Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-06 Thread Josh Poimboeuf
On Thu, May 05, 2016 at 01:57:01PM +0200, Petr Mladek wrote:
> I have missed that the two commands are called with preemption
> disabled. So, I had the following crazy scenario in mind:
> 
> 
> CPU0  CPU1
> 
> klp_enable_patch()
> 
>   klp_target_state = KLP_PATCHED;
> 
>   for_each_task()
>  set TIF_PENDING_PATCH
> 
>   # task 123
> 
>   if (klp_patch_pending(current)
> klp_patch_task(current)
> 
> clear TIF_PENDING_PATCH
> 
>   smp_rmb();
> 
>   # switch to assembly of
>   # klp_patch_task()
> 
>   mov klp_target_state, %r12
> 
>   # interrupt and schedule
>   # another task
> 
> 
>   klp_reverse_transition();
> 
> klp_target_state = KLP_UNPATCHED;
> 
> klt_try_to_complete_transition()
> 
>   task = 123;
>   if (task->patch_state == klp_target_state;
>  return 0;
> 
> => task 123 is in target state and does
> not block conversion
> 
>   klp_complete_transition()
> 
> 
>   # disable previous patch on the stack
>   klp_disable_patch();
> 
> klp_target_state = KLP_UNPATCHED;
>   
>   
>   # task 123 gets scheduled again
>   lea %r12, task->patch_state
> 
>   => it happily stores an outdated
>   state
> 

Thanks for the clear explanation, this helps a lot.

> This is why the two functions should get called with preemption
> disabled. We should document it at least. I imagine that we will
> use them later also in another context and nobody will remember
> this crazy scenario.
> 
> Well, even disabled preemption does not help. The process on
> CPU1 might be also interrupted by an NMI and do some long
> printk in it.
> 
> IMHO, the only safe approach is to call klp_patch_task()
> only for "current" on a safe place. Then this race is harmless.
> The switch happen on a safe place, so that it does not matter
> into which state the process is switched.

I'm not sure about this solution.  When klp_complete_transition() is
called, we need all tasks to be patched, for good.  We don't want any of
them to randomly switch to the wrong state at some later time in the
middle of a future patch operation.  How would changing klp_patch_task()
to only use "current" prevent that?

> By other words, the task state might be updated only
> 
>+ by the task itself on a safe place
>+ by other task when the updated on is sleeping on a safe place
> 
> This should be well documented and the API should help to avoid
> a misuse.

I think we could fix it to be safe for future callers who might not have
preemption disabled with a couple of changes to klp_patch_task():
disabling preemption and testing/clearing the TIF_PATCH_PENDING flag
before changing the patch state:

  void klp_patch_task(struct task_struct *task)
  {
	preempt_disable();

	if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
		task->patch_state = READ_ONCE(klp_target_state);

	preempt_enable();
  }

We would also need a synchronize_sched() after the patching is complete,
either at the end of klp_try_complete_transition() or in
klp_complete_transition().  That would make sure that all existing calls
to klp_patch_task() are done.
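
As a sketch of that second part (placement hypothetical, assuming the
preempt-disabled klp_patch_task() above):

	static void klp_complete_transition(void)
	{
		/*
		 * klp_patch_task() tests the flag and stores
		 * task->patch_state with preemption disabled, so after
		 * synchronize_sched() no CPU can still be inside an old
		 * call that would store a stale klp_target_state.
		 */
		synchronize_sched();

		/* the existing completion work (clearing func->transition,
		 * resetting klp_transition_patch, ...) would follow here */
	}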

-- 
Josh

Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-09 Thread Petr Mladek
On Fri 2016-05-06 07:38:55, Josh Poimboeuf wrote:
> On Thu, May 05, 2016 at 01:57:01PM +0200, Petr Mladek wrote:
> > I have missed that the two commands are called with preemption
> > disabled. So, I had the following crazy scenario in mind:
> > 
> > 
> > CPU0CPU1
> > 
> > klp_enable_patch()
> > 
> >   klp_target_state = KLP_PATCHED;
> > 
> >   for_each_task()
> >  set TIF_PENDING_PATCH
> > 
> > # task 123
> > 
> > if (klp_patch_pending(current)
> >   klp_patch_task(current)
> > 
> > clear TIF_PENDING_PATCH
> > 
> > smp_rmb();
> > 
> > # switch to assembly of
> > # klp_patch_task()
> > 
> > mov klp_target_state, %r12
> > 
> > # interrupt and schedule
> > # another task
> > 
> > 
> >   klp_reverse_transition();
> > 
> > klp_target_state = KLP_UNPATCHED;
> > 
> > klt_try_to_complete_transition()
> > 
> >   task = 123;
> >   if (task->patch_state == klp_target_state;
> >  return 0;
> > 
> > => task 123 is in target state and does
> > not block conversion
> > 
> >   klp_complete_transition()
> > 
> > 
> >   # disable previous patch on the stack
> >   klp_disable_patch();
> > 
> > klp_target_state = KLP_UNPATCHED;
> >   
> >   
> > # task 123 gets scheduled again
> > lea %r12, task->patch_state
> > 
> > => it happily stores an outdated
> > state
> > 
> 
> Thanks for the clear explanation, this helps a lot.
> 
> > This is why the two functions should get called with preemption
> > disabled. We should document it at least. I imagine that we will
> > use them later also in another context and nobody will remember
> > this crazy scenario.
> > 
> > Well, even disabled preemption does not help. The process on
> > CPU1 might be also interrupted by an NMI and do some long
> > printk in it.
> > 
> > IMHO, the only safe approach is to call klp_patch_task()
> > only for "current" on a safe place. Then this race is harmless.
> > The switch happen on a safe place, so that it does not matter
> > into which state the process is switched.
> 
> I'm not sure about this solution.  When klp_complete_transition() is
> called, we need all tasks to be patched, for good.  We don't want any of
> them to randomly switch to the wrong state at some later time in the
> middle of a future patch operation.  How would changing klp_patch_task()
> to only use "current" prevent that?

You are right that it is a pity, but it really should be safe because
it is not entirely random.

If the race happens and assign an outdated value, there are two
situations:

1. It is assigned when there is no transition in progress.
   Then it is OK because it will be ignored by the ftrace handler.
   The right state will be set before the next transition starts.

2. It is assigned when some other transition is in progress.
   Then it is OK as long as the function is called from "current".
   The "wrong" state will be used consistently. It will switch
   to the right state at another safe place.


> > By other words, the task state might be updated only
> > 
> >+ by the task itself on a safe place
> >+ by other task when the updated on is sleeping on a safe place
> > 
> > This should be well documented and the API should help to avoid
> > a misuse.
> 
> I think we could fix it to be safe for future callers who might not have
> preemption disabled with a couple of changes to klp_patch_task():
> disabling preemption and testing/clearing the TIF_PATCH_PENDING flag
> before changing the patch state:
> 
>   void klp_patch_task(struct task_struct *task)
>   {
>   preempt_disable();
>   
>   if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
>   task->patch_state = READ_ONCE(klp_target_state);
>   
>   preempt_enable();
>   }

It reduces the race window a bit but it is still there. For example,
an NMI still might add a huge delay between reading klp_target_state
and assigning task->patch_state.

What about the following?

/*
 * This function might assign an outdated value if the transition
 * is reverted and finalized in parallel. But it is safe. If the value
 * is assigned outside of a transition, it is ignored and the next
 * transition will set the right one. Or if it gets assigned
 * inside another transition, it will repeat the cycle and
 * set the right state.
 */
void klp_update_current_patch_state(void)
{
	while (test_and_clear_tsk_thread_flag(current, TIF_PATCH_PENDING))
		current->patch_state = READ_ONCE(klp_target_state);
}
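
Usage would then mirror the idle-loop call site quoted earlier, e.g.:

	if (unlikely(klp_patch_pending(current)))
		klp_update_current_patch_state();

where the klp_patch_pending() check is only a fast-path optimization,
because the function itself re-tests and clears TIF_PATCH_PENDING.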

Note that the disabled preemption helped only partially,
so I think that 

Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-09 Thread Miroslav Benes
On Wed, 4 May 2016, Josh Poimboeuf wrote:

> On Wed, May 04, 2016 at 04:12:05PM +0200, Petr Mladek wrote:
> > On Wed 2016-05-04 14:39:40, Petr Mladek wrote:
> > >*
> > >* Note that the task must never be migrated to the target
> > >* state when being inside this ftrace handler.
> > >*/
> > > 
> > > We might want to move the second paragraph on top of the function.
> > > It is a basic and important fact. It actually explains why the first
> > > read barrier is not needed when the patch is being disabled.
> > 
> > I wrote the statement partly intuitively. I think that it is really
> > somehow important. And I am slightly in doubts if we are on the safe side.
> > 
> > First, why is it important that the task->patch_state is not switched
> > when being inside the ftrace handler?
> > 
> > If we are inside the handler, we are kind-of inside the called
> > function. And the basic idea of this consistency model is that
> > we must not switch a task when it is inside a patched function.
> > This is normally decided by the stack.
> > 
> > The handler is a bit special because it is called right before the
> > function. If it was the only patched function on the stack, it would
> > not matter if we choose the new or old code. Both decisions would
> > be safe for the moment.
> > 
> > The fun starts when the function calls another patched function.
> > The other patched function must be called consistently with
> > the first one. If the first function was from the patch,
> > the other must be from the patch as well and vice versa.
> > 
> > This is why we must not switch task->patch_state dangerously
> > when being inside the ftrace handler.
> > 
> > Now I am not sure if this condition is fulfilled. The ftrace handler
> > is called as the very first instruction of the function. Does not
> > it break the stack validity? Could we sleep inside the ftrace
> > handler? Will the patched function be detected on the stack?
> > 
> > Or is my brain already too far in the fantasy world?
> 
> I think this isn't a possibility.
> 
> In today's code base, this can't happen because task patch states are
> only switched when sleeping or when exiting the kernel.  The ftrace
> handler doesn't sleep directly.
> 
> If it were preempted, it couldn't be switched there either because we
> consider preempted stacks to be unreliable.

And IIRC ftrace handlers cannot sleep and are called with preemption 
disabled as of now. The code is a bit obscure, but see 
__ftrace_ops_list_func for example. This is the "main" ftrace handler that 
calls all the registered ones in case FTRACE_OPS_FL_DYNAMIC is set (which 
is always true for handlers coming from modules) and CONFIG_PREEMPT is 
on. If it is off and there is only one handler registered for a function, 
a dynamic trampoline is used. See commit 12cce594fa8f ("ftrace/x86: Allow 
!CONFIG_PREEMPT dynamic ops to use allocated trampolines"). I think 
Steven had a plan to implement dynamic trampolines even for the 
CONFIG_PREEMPT case but he still hasn't done it. It should use the 
RCU_TASKS infrastructure.
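
(For context, this is why FTRACE_OPS_FL_DYNAMIC applies here: the livepatch
ops registered in kernel/livepatch/patch.c look roughly like this, as far as
I remember, so it is the preemption-disabled list function that ends up
calling klp_ftrace_handler():)

	ops->fops.func = klp_ftrace_handler;
	ops->fops.flags = FTRACE_OPS_FL_SAVE_REGS |
			  FTRACE_OPS_FL_DYNAMIC |
			  FTRACE_OPS_FL_IPMODIFY;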

The reason for all the mess is that ftrace needs to be sure that no task 
is in the handler when the handler/trampoline is freed.

So we should be safe for now even from this side.

Miroslav

Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-16 Thread Josh Poimboeuf
On Mon, May 09, 2016 at 02:23:03PM +0200, Petr Mladek wrote:
> On Fri 2016-05-06 07:38:55, Josh Poimboeuf wrote:
> > On Thu, May 05, 2016 at 01:57:01PM +0200, Petr Mladek wrote:
> > > I have missed that the two commands are called with preemption
> > > disabled. So, I had the following crazy scenario in mind:
> > > 
> > > 
> > > CPU0  CPU1
> > > 
> > > klp_enable_patch()
> > > 
> > >   klp_target_state = KLP_PATCHED;
> > > 
> > >   for_each_task()
> > >  set TIF_PENDING_PATCH
> > > 
> > >   # task 123
> > > 
> > >   if (klp_patch_pending(current)
> > > klp_patch_task(current)
> > > 
> > > clear TIF_PENDING_PATCH
> > > 
> > >   smp_rmb();
> > > 
> > >   # switch to assembly of
> > >   # klp_patch_task()
> > > 
> > >   mov klp_target_state, %r12
> > > 
> > >   # interrupt and schedule
> > >   # another task
> > > 
> > > 
> > >   klp_reverse_transition();
> > > 
> > > klp_target_state = KLP_UNPATCHED;
> > > 
> > > klt_try_to_complete_transition()
> > > 
> > >   task = 123;
> > >   if (task->patch_state == klp_target_state;
> > >  return 0;
> > > 
> > > => task 123 is in target state and does
> > > not block conversion
> > > 
> > >   klp_complete_transition()
> > > 
> > > 
> > >   # disable previous patch on the stack
> > >   klp_disable_patch();
> > > 
> > > klp_target_state = KLP_UNPATCHED;
> > >   
> > >   
> > >   # task 123 gets scheduled again
> > >   lea %r12, task->patch_state
> > > 
> > >   => it happily stores an outdated
> > >   state
> > > 
> > 
> > Thanks for the clear explanation, this helps a lot.
> > 
> > > This is why the two functions should get called with preemption
> > > disabled. We should document it at least. I imagine that we will
> > > use them later also in another context and nobody will remember
> > > this crazy scenario.
> > > 
> > > Well, even disabled preemption does not help. The process on
> > > CPU1 might be also interrupted by an NMI and do some long
> > > printk in it.
> > > 
> > > IMHO, the only safe approach is to call klp_patch_task()
> > > only for "current" on a safe place. Then this race is harmless.
> > > The switch happen on a safe place, so that it does not matter
> > > into which state the process is switched.
> > 
> > I'm not sure about this solution.  When klp_complete_transition() is
> > called, we need all tasks to be patched, for good.  We don't want any of
> > them to randomly switch to the wrong state at some later time in the
> > middle of a future patch operation.  How would changing klp_patch_task()
> > to only use "current" prevent that?
> 
> You are right that it is pity but it really should be safe because
> it is not entirely random.
> 
> If the race happens and assign an outdated value, there are two
> situations:
> 
> 1. It is assigned when there is not transition in the progress.
>Then it is OK because it will be ignored by the ftrace handler.
>The right state will be set before the next transition starts.
> 
> 2. It is assigned when some other transition is in progress.
>Then it is OK as long as the function is called from "current".
>The "wrong" state will be used consistently. It will switch
>to the right state on another safe state.

Maybe it would be safe, though I'm not entirely convinced.  Regardless I
think we should avoid these situations entirely because they create
windows for future bugs and races.

> > > By other words, the task state might be updated only
> > > 
> > >+ by the task itself on a safe place
> > >+ by other task when the updated on is sleeping on a safe place
> > > 
> > > This should be well documented and the API should help to avoid
> > > a misuse.
> > 
> > I think we could fix it to be safe for future callers who might not have
> > preemption disabled with a couple of changes to klp_patch_task():
> > disabling preemption and testing/clearing the TIF_PATCH_PENDING flag
> > before changing the patch state:
> > 
> >   void klp_patch_task(struct task_struct *task)
> >   {
> > preempt_disable();
> >   
> > if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
> > task->patch_state = READ_ONCE(klp_target_state);
> >   
> > preempt_enable();
> >   }
> 
> It reduces the race window a bit but it is still there. For example,
> NMI still might add a huge delay between reading klp_target_state
> and assigning task->patch state.

Maybe you missed this paragraph from my last email:

| We would also need a synchronize_sched() after the patching is complete,
| either at the end of klp_try_complete_transition() or in
| klp_complete_transition().  That would make sure that all existing calls
| to klp_patch_task() are done.

Re: klp_task_patch: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-18 Thread Petr Mladek
On Mon 2016-05-16 13:12:50, Josh Poimboeuf wrote:
> On Mon, May 09, 2016 at 02:23:03PM +0200, Petr Mladek wrote:
> > On Fri 2016-05-06 07:38:55, Josh Poimboeuf wrote:
> > > On Thu, May 05, 2016 at 01:57:01PM +0200, Petr Mladek wrote:
> > > > I have missed that the two commands are called with preemption
> > > > disabled. So, I had the following crazy scenario in mind:
> > > > 
> > > > 
> > > > CPU0CPU1
> > > > 
> > > > klp_enable_patch()
> > > > 
> > > >   klp_target_state = KLP_PATCHED;
> > > > 
> > > >   for_each_task()
> > > >  set TIF_PENDING_PATCH
> > > > 
> > > > # task 123
> > > > 
> > > > if (klp_patch_pending(current)
> > > >   klp_patch_task(current)
> > > > 
> > > > clear TIF_PENDING_PATCH
> > > > 
> > > > smp_rmb();
> > > > 
> > > > # switch to assembly of
> > > > # klp_patch_task()
> > > > 
> > > > mov klp_target_state, %r12
> > > > 
> > > > # interrupt and schedule
> > > > # another task
> > > > 
> > > > 
> > > >   klp_reverse_transition();
> > > > 
> > > > klp_target_state = KLP_UNPATCHED;
> > > > 
> > > > klt_try_to_complete_transition()
> > > > 
> > > >   task = 123;
> > > >   if (task->patch_state == klp_target_state;
> > > >  return 0;
> > > > 
> > > > => task 123 is in target state and does
> > > > not block conversion
> > > > 
> > > >   klp_complete_transition()
> > > > 
> > > > 
> > > >   # disable previous patch on the stack
> > > >   klp_disable_patch();
> > > > 
> > > > klp_target_state = KLP_UNPATCHED;
> > > >   
> > > >   
> > > > # task 123 gets scheduled again
> > > > lea %r12, task->patch_state
> > > > 
> > > > => it happily stores an outdated
> > > > state
> > > > 
> > > 
> > > Thanks for the clear explanation, this helps a lot.
> > > 
> > > > This is why the two functions should get called with preemption
> > > > disabled. We should document it at least. I imagine that we will
> > > > use them later also in another context and nobody will remember
> > > > this crazy scenario.
> > > > 
> > > > Well, even disabled preemption does not help. The process on
> > > > CPU1 might be also interrupted by an NMI and do some long
> > > > printk in it.
> > > > 
> > > > IMHO, the only safe approach is to call klp_patch_task()
> > > > only for "current" on a safe place. Then this race is harmless.
> > > > The switch happen on a safe place, so that it does not matter
> > > > into which state the process is switched.
> > > 
> > > I'm not sure about this solution.  When klp_complete_transition() is
> > > called, we need all tasks to be patched, for good.  We don't want any of
> > > them to randomly switch to the wrong state at some later time in the
> > > middle of a future patch operation.  How would changing klp_patch_task()
> > > to only use "current" prevent that?
> > 
> > You are right that it is pity but it really should be safe because
> > it is not entirely random.
> > 
> > If the race happens and assign an outdated value, there are two
> > situations:
> > 
> > 1. It is assigned when there is not transition in the progress.
> >Then it is OK because it will be ignored by the ftrace handler.
> >The right state will be set before the next transition starts.
> > 
> > 2. It is assigned when some other transition is in progress.
> >Then it is OK as long as the function is called from "current".
> >The "wrong" state will be used consistently. It will switch
> >to the right state on another safe state.
> 
> Maybe it would be safe, though I'm not entirely convinced.  Regardless I
> think we should avoid these situations entirely because they create
> windows for future bugs and races.

Yup, I would prefer a cleaner solution as well.

> > > > By other words, the task state might be updated only
> > > > 
> > > >+ by the task itself on a safe place
> > > >+ by other task when the updated on is sleeping on a safe place
> > > > 
> > > > This should be well documented and the API should help to avoid
> > > > a misuse.
> > > 
> > > I think we could fix it to be safe for future callers who might not have
> > > preemption disabled with a couple of changes to klp_patch_task():
> > > disabling preemption and testing/clearing the TIF_PATCH_PENDING flag
> > > before changing the patch state:
> > > 
> > >   void klp_patch_task(struct task_struct *task)
> > >   {
> > >   preempt_disable();
> > >   
> > >   if (test_and_clear_tsk_thread_flag(task, TIF_PATCH_PENDING))
> > >   task->patch_state = READ_ON