Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-24 Thread Ingo Molnar
* Jiri Kosina wrote: > On Sat, 21 Feb 2015, Ingo Molnar wrote: > > > (It does have some other requirements, such as making > > all syscalls interruptible to a 'special' signalling > > method that only live patching triggers - even syscalls > > that are under the normal ABI uninterruptible, s

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-23 Thread Jiri Kosina
On Sat, 21 Feb 2015, Ingo Molnar wrote: > (It does have some other requirements, such as making all > syscalls interruptible to a 'special' signalling method > that only live patching triggers - even syscalls that are > under the normal ABI uninterruptible, such as sys_sync().) BTW I didn't re
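
A sketch of where that 'special signalling' idea points, loosely modelled on what much later landed upstream as klp_send_signals() in kernel/livepatch/transition.c -- an illustration, not the 2015 proposal itself; klp_patch_pending() stands in for whatever "still in the old universe" test the design would use. Ingo's caveat still applies: truly uninterruptible sleeps such as sys_sync() would first have to be made interruptible by this mechanism.

#include <linux/sched.h>
#include <linux/spinlock.h>

static void klp_send_fake_signals(void)
{
	struct task_struct *g, *task;

	read_lock(&tasklist_lock);
	for_each_process_thread(g, task) {
		if (!klp_patch_pending(task))	/* hypothetical helper */
			continue;

		if (task->flags & PF_KTHREAD) {
			/* kthreads cannot take signals; just wake them so
			 * they pass through their parking/safe point */
			wake_up_state(task, TASK_INTERRUPTIBLE);
		} else {
			/* "fake" signal: kicks the task out of an
			 * interruptible sleep and through the signal-delivery
			 * safe point without queueing a real signal */
			spin_lock_irq(&task->sighand->siglock);
			signal_wake_up(task, 0);
			spin_unlock_irq(&task->sighand->siglock);
		}
	}
	read_unlock(&tasklist_lock);
}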

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-22 Thread Jiri Kosina
[ live-patching@ ML added to CC here as well ] On Sun, 22 Feb 2015, Ingo Molnar wrote: > > BTW how exactly do you envision this will work? Do I understand your > > proposal correctly that EINTR will be "handled" somewhere in the "live > > patching special signal handler" and then have the inte

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-22 Thread Ingo Molnar
* Jiri Kosina wrote: > On Sat, 21 Feb 2015, Ingo Molnar wrote: > > > > Plus a lot of processes would see EINTR, causing more > > > havoc. > > > > Parking threads safely in user mode does not require > > the propagation of syscall interruption to user-space. > > BTW how exactly do you envisi

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-22 Thread Jiri Kosina
On Sun, 22 Feb 2015, Ingo Molnar wrote: > I am making specific technical arguments, but you attempted to redirect > my very specific arguments towards 'differences in philosophy' and > 'where to draw the line'. Let's discuss the merits and not brush them aside > as 'philosophical differences' or a m

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-22 Thread Jiri Kosina
On Sat, 21 Feb 2015, Ingo Molnar wrote: > > Plus a lot of processes would see EINTR, causing more havoc. > > Parking threads safely in user mode does not require the propagation of > syscall interruption to user-space. BTW how exactly do you envision this will work? Do I understand your propos

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-22 Thread Ingo Molnar
* Jiri Kosina wrote: > > > Or think of a kernel that has some 3rd party vendor > > > module loaded, and this module spawning a kthread > > > that is not capable of parking itself. > > > > The kernel will refuse to live patch until the module > > is fixed. It is a win by any measure. > > Depen

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-21 Thread Jiri Kosina
To make sure that this thread doesn't conclude in void, here's my take on it: - what's currently already there is the simplest-of-simplest methods; it allows you to apply context-less patches (such as adding bounds checking to the beginning of a syscall, etc), which turns out to cover vast port
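
For readers skimming the archive, a "context-less" patch in the above sense is one where the new function is safe no matter which version its callers saw, so no stack checking or per-task conversion is needed. A made-up illustration (all names hypothetical):

/* original, vulnerable function */
static long foo_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
	return foo_handlers[cmd](file, arg);	/* no validation of cmd */
}

/* live-patch replacement: only adds a check at the top and touches no
 * shared data, so old and new versions can run concurrently in
 * different tasks */
static long foo_ioctl_patched(struct file *file, unsigned int cmd,
			      unsigned long arg)
{
	if (cmd >= ARRAY_SIZE(foo_handlers))
		return -EINVAL;

	return foo_handlers[cmd](file, arg);
}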

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-21 Thread Jiri Kosina
On Sat, 21 Feb 2015, Ingo Molnar wrote: > > I see the difference, but I am afraid you are simplifying > > the situation a little bit too much. > > > > There will always be properties of patches that will make > > them inapplicable in a "live patching" way by design. > > Think of data structure

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-21 Thread Ingo Molnar
* Jiri Kosina wrote: > On Sat, 21 Feb 2015, Ingo Molnar wrote: > > > > But admittedly, if we reserve a special sort-of > > > signal for making the tasks pass through a safe > > > checkpoint (and make them queue there (your solution) > > > or make them just pass through it and continue > > >

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-21 Thread Jiri Kosina
On Sat, 21 Feb 2015, Ingo Molnar wrote: > > But admittedly, if we reserve a special sort-of signal > > for making the tasks pass through a safe checkpoint (and > > make them queue there (your solution) or make them just > > pass through it and continue (current kGraft)), it might > > reduce th

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-21 Thread Ingo Molnar
* Jiri Kosina wrote: > On Sat, 21 Feb 2015, Ingo Molnar wrote: > > > > This means that each and every sleeping task in the > > > system has to be woken up in some way (sending a > > > signal ...) to exit from a syscall it is sleeping in. > > > Same for CPU hogs. All kernel threads need to be

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-21 Thread Jiri Kosina
On Sat, 21 Feb 2015, Ingo Molnar wrote: > > This means that each and every sleeping task in the system has to be > > woken up in some way (sending a signal ...) to exit from a syscall it > > is sleeping in. Same for CPU hogs. All kernel threads need to be > > parked. > > Yes - although I'd not

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-21 Thread Ingo Molnar
* Josh Poimboeuf wrote: > On Fri, Feb 20, 2015 at 10:46:13PM +0100, Vojtech Pavlik wrote: > > On Fri, Feb 20, 2015 at 08:49:01PM +0100, Ingo Molnar wrote: > > > > > I.e. it's in essence the strong stop-all atomic > > > patching model of 'kpatch', combined with the > > > reliable avoidance of k

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-21 Thread Ingo Molnar
* Vojtech Pavlik wrote: > On Fri, Feb 20, 2015 at 08:49:01PM +0100, Ingo Molnar wrote: > > > > ... the choice the sysadmins have here is either have > > > the system running in an intermediate state, or have > > > the system completely dead for the *same time*. > > > Because to finish the tr

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-20 Thread Josh Poimboeuf
On Fri, Feb 20, 2015 at 10:46:13PM +0100, Vojtech Pavlik wrote: > On Fri, Feb 20, 2015 at 08:49:01PM +0100, Ingo Molnar wrote: > > I.e. it's in essence the strong stop-all atomic patching > > model of 'kpatch', combined with the reliable avoidance of > > kernel stacks that 'kgraft' uses. > > > T

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-20 Thread Vojtech Pavlik
On Fri, Feb 20, 2015 at 08:49:01PM +0100, Ingo Molnar wrote: > > ... the choice the sysadmins have here is either have the > > system running in an intermediate state, or have the > > system completely dead for the *same time*. Because to > > finish the transition successfully, all the tasks ha

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-20 Thread Josh Poimboeuf
On Fri, Feb 20, 2015 at 09:08:49PM +0100, Ingo Molnar wrote: > * Josh Poimboeuf wrote: > > On Fri, Feb 20, 2015 at 10:50:03AM +0100, Ingo Molnar wrote: > > > * Jiri Kosina wrote: > > > > > > > Alright, so to sum it up: > > > > > > > > - current stack dumping (even looking at /proc/<pid>/stack) is no

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-20 Thread Ingo Molnar
* Josh Poimboeuf wrote: > On Fri, Feb 20, 2015 at 10:50:03AM +0100, Ingo Molnar wrote: > > > > * Jiri Kosina wrote: > > > > > Alright, so to sum it up: > > > > > > - current stack dumping (even looking at /proc/<pid>/stack) is not > > > guaranteed to yield "correct" results in case the task is

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-20 Thread Ingo Molnar
* Jiri Kosina wrote: > > All fundamental pieces of the simple method are > > necessary to get guaranteed time transition from the > > complicated method: task tracking and transparent > > catching of them, handling kthreads, etc. > > > > My argument is that the simple method should be > > i

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-20 Thread Josh Poimboeuf
On Fri, Feb 20, 2015 at 09:49:32AM +0100, Jiri Kosina wrote: > Alright, so to sum it up: > > - current stack dumping (even looking at /proc/<pid>/stack) is not > guaranteed to yield "correct" results in case the task is running at the > time the stack is being examined > > - the only fool-proof

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-20 Thread Josh Poimboeuf
On Fri, Feb 20, 2015 at 10:50:03AM +0100, Ingo Molnar wrote: > > * Jiri Kosina wrote: > > > Alright, so to sum it up: > > > > - current stack dumping (even looking at /proc/<pid>/stack) is not > > guaranteed to yield "correct" results in case the task is running at the > > time the stack is be

Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-20 Thread Jiri Kosina
On Fri, 20 Feb 2015, Ingo Molnar wrote: > - the complicated method spread out over time: uses the > same essential mechanism plus the ftrace patching > machinery to detect whether all tasks have transitioned > through a version flip. [this is what kgraft does in > part.] The

live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

2015-02-20 Thread Ingo Molnar
* Jiri Kosina wrote: > On Fri, 20 Feb 2015, Ingo Molnar wrote: > > > So if your design is based on being able to discover > > > 'live' functions in the kernel stack dump of all tasks > > in the system, I think you need a serious reboot of the > > whole approach and get rid of that fragility

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-20 Thread Jiri Kosina
On Fri, 20 Feb 2015, Ingo Molnar wrote: > So if your design is based on being able to discover 'live' functions in > the kernel stack dump of all tasks in the system, I think you need a > serious reboot of the whole approach and get rid of that fragility > before any of that functionality gets

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-20 Thread Ingo Molnar
* Jiri Kosina wrote: > Alright, so to sum it up: > > - current stack dumping (even looking at /proc/<pid>/stack) is not > guaranteed to yield "correct" results in case the task is running at the > time the stack is being examined Don't even _think_ about trying to base something as dangerous

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-20 Thread Jiri Kosina
Alright, so to sum it up: - current stack dumping (even looking at /proc/<pid>/stack) is not guaranteed to yield "correct" results in case the task is running at the time the stack is being examined - the only fool-proof way is to send IPI-NMI to all CPUs, and synchronize the handlers between
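
(What kpatch actually shipped at the time was, as other messages in this thread note, the stop_machine() flavour of this idea rather than literal IPI-NMIs. A compressed sketch, with klp_check_stack() and klp_patch_funcs() as hypothetical helpers:)

#include <linux/stop_machine.h>
#include <linux/sched.h>

/* runs while every other CPU spins in stop_machine(), so no task is
 * executing and all stacks are frozen while we walk them */
static int klp_stop_machine_cb(void *data)
{
	struct task_struct *g, *task;

	for_each_process_thread(g, task)
		if (klp_check_stack(task))	/* patched function on stack? */
			return -EBUSY;		/* abort; caller may retry */

	klp_patch_funcs();	/* atomically flip the ftrace handlers */
	return 0;
}

/* caller: ret = stop_machine(klp_stop_machine_cb, NULL, NULL); */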

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Jiri Kosina
On Thu, 19 Feb 2015, Josh Poimboeuf wrote: > So I've looked at kgr_needs_lazy_migration(), but I still have no idea > how it works. > > First of all, I think reading the stack while it's being written to could > give you some garbage values, and a completely wrong nr_entries value > from save_stac

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Josh Poimboeuf
On Thu, Feb 19, 2015 at 10:26:09PM +0100, Jiri Kosina wrote: > On Thu, 19 Feb 2015, Josh Poimboeuf wrote: > > > How about with a TIF_IN_USERSPACE thread flag? It could be cleared/set > > right at the border. Then for running tasks it's as simple as: > > > > if (test_tsk_thread_flag(task, TIF_IN

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Josh Poimboeuf
On Thu, Feb 19, 2015 at 09:40:36PM +0100, Vojtech Pavlik wrote: > On Thu, Feb 19, 2015 at 11:32:55AM -0600, Josh Poimboeuf wrote: > > On Thu, Feb 19, 2015 at 06:19:29PM +0100, Vojtech Pavlik wrote: > > > On Thu, Feb 19, 2015 at 11:03:53AM -0600, Josh Poimboeuf wrote: > > > > On Thu, Feb 19, 2015 at

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Jiri Kosina
On Thu, 19 Feb 2015, Jiri Kosina wrote: > > How about with a TIF_IN_USERSPACE thread flag? It could be cleared/set > > right at the border. Then for running tasks it's as simple as: > > > > if (test_tsk_thread_flag(task, TIF_IN_USERSPACE)) > > klp_switch_task_universe(task); > > That's in

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Jiri Kosina
On Thu, 19 Feb 2015, Josh Poimboeuf wrote: > How about with a TIF_IN_USERSPACE thread flag? It could be cleared/set > right at the border. Then for running tasks it's as simple as: > > if (test_tsk_thread_flag(task, TIF_IN_USERSPACE)) > klp_switch_task_universe(task); That's in principle
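
Pulling the quoted fragment into one piece: TIF_IN_USERSPACE is a flag Josh is proposing here, not an existing one, and klp_switch_task_universe() comes from his consistency-model RFC -- so the following is a sketch of the proposal, not of merged code. The flag would be set on every kernel exit and cleared on every entry, making the check cheap for the patching side:

#include <linux/sched.h>

static void klp_switch_userspace_tasks(void)
{
	struct task_struct *g, *task;

	read_lock(&tasklist_lock);
	for_each_process_thread(g, task) {
		/* a task running purely in userspace has no kernel stack
		 * frames of interest: safe to flip immediately */
		if (test_tsk_thread_flag(task, TIF_IN_USERSPACE))
			klp_switch_task_universe(task);
	}
	read_unlock(&tasklist_lock);
}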

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Vojtech Pavlik
On Thu, Feb 19, 2015 at 11:32:55AM -0600, Josh Poimboeuf wrote: > On Thu, Feb 19, 2015 at 06:19:29PM +0100, Vojtech Pavlik wrote: > > On Thu, Feb 19, 2015 at 11:03:53AM -0600, Josh Poimboeuf wrote: > > > On Thu, Feb 19, 2015 at 05:33:59PM +0100, Vojtech Pavlik wrote: > > > > On Thu, Feb 19, 2015 at

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Vojtech Pavlik
On February 19, 2015 6:32:55 PM CET, Josh Poimboeuf wrote: >> Yes. I'm saying that rather than guaranteeing they don't enter the >> kernel (by having them spin) you can flip them in case they try to do >> that instead. That solves the race condition just as well. > >Ok, gotcha. > >We'd still need

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Josh Poimboeuf
On Thu, Feb 19, 2015 at 06:19:29PM +0100, Vojtech Pavlik wrote: > On Thu, Feb 19, 2015 at 11:03:53AM -0600, Josh Poimboeuf wrote: > > On Thu, Feb 19, 2015 at 05:33:59PM +0100, Vojtech Pavlik wrote: > > > On Thu, Feb 19, 2015 at 10:24:29AM -0600, Josh Poimboeuf wrote: > > > > > > > > No, these task

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Vojtech Pavlik
On Thu, Feb 19, 2015 at 11:03:53AM -0600, Josh Poimboeuf wrote: > On Thu, Feb 19, 2015 at 05:33:59PM +0100, Vojtech Pavlik wrote: > > On Thu, Feb 19, 2015 at 10:24:29AM -0600, Josh Poimboeuf wrote: > > > > > > No, these tasks will _never_ make syscalls. So you need to guarantee > > > > they don't

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Jiri Kosina
On Thu, 19 Feb 2015, Josh Poimboeuf wrote: > > > > No, these tasks will _never_ make syscalls. So you need to guarantee > > > > they don't accidentally enter the kernel while you flip them. Something > > > > like so should do. > > > > > > > > You set TIF_ENTER_WAIT on them, check they're still in

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Josh Poimboeuf
On Thu, Feb 19, 2015 at 05:33:59PM +0100, Vojtech Pavlik wrote: > On Thu, Feb 19, 2015 at 10:24:29AM -0600, Josh Poimboeuf wrote: > > > > No, these tasks will _never_ make syscalls. So you need to guarantee > > > they don't accidentally enter the kernel while you flip them. Something > > > like so

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Vojtech Pavlik
On Thu, Feb 19, 2015 at 10:24:29AM -0600, Josh Poimboeuf wrote: > > No, these tasks will _never_ make syscalls. So you need to guarantee > > they don't accidentally enter the kernel while you flip them. Something > > like so should do. > > > > You set TIF_ENTER_WAIT on them, check they're still i
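
Spelled out, the scheme Peter is quoted proposing looks roughly like the following. TIF_ENTER_WAIT never existed in mainline, and task_in_userspace() is a placeholder for whatever check the real patch would use; this is a sketch of the race-avoidance idea only:

/* kernel entry path: a flagged task spins at the boundary instead of
 * entering the kernel proper */
while (test_thread_flag(TIF_ENTER_WAIT))
	cpu_relax();

/* patching side, for a task that never makes syscalls: */
static int klp_flip_userspace_spinner(struct task_struct *task)
{
	int ret = 0;

	set_tsk_thread_flag(task, TIF_ENTER_WAIT);
	smp_mb();	/* flag must be visible before we sample the state */

	if (task_in_userspace(task))		/* placeholder check */
		klp_switch_task_universe(task);	/* safe: it cannot enter */
	else
		ret = -EBUSY;			/* raced with an entry; retry */

	clear_tsk_thread_flag(task, TIF_ENTER_WAIT);
	return ret;
}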

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Josh Poimboeuf
On Thu, Feb 19, 2015 at 11:16:07AM +0100, Peter Zijlstra wrote: > On Wed, Feb 18, 2015 at 10:17:53PM -0600, Josh Poimboeuf wrote: > > On Thu, Feb 19, 2015 at 01:20:58AM +0100, Peter Zijlstra wrote: > > > On Wed, Feb 18, 2015 at 11:12:56AM -0600, Josh Poimboeuf wrote: > > > > > The next line of att

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-19 Thread Peter Zijlstra
On Wed, Feb 18, 2015 at 10:17:53PM -0600, Josh Poimboeuf wrote: > On Thu, Feb 19, 2015 at 01:20:58AM +0100, Peter Zijlstra wrote: > > On Wed, Feb 18, 2015 at 11:12:56AM -0600, Josh Poimboeuf wrote: > > > The next line of attack is patching tasks when exiting the kernel to > > > user space (system

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-18 Thread Josh Poimboeuf
On Thu, Feb 19, 2015 at 01:20:58AM +0100, Peter Zijlstra wrote: > On Wed, Feb 18, 2015 at 11:12:56AM -0600, Josh Poimboeuf wrote: > > > So uhm, what happens if your target task is running? When will you > > > retry? The problem I see is that if you do a sample approach you might > > > never hit an

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-18 Thread Peter Zijlstra
On Wed, Feb 18, 2015 at 11:12:56AM -0600, Josh Poimboeuf wrote: > > > a) spend the time to ensure the unwinding code is correct and resilient > > > to errors; > > > > > > b) leave the consistency model compiled code out if !FRAME_POINTER and > > > allow users to patch without one (similar t

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-18 Thread Josh Poimboeuf
On Wed, Feb 18, 2015 at 04:21:00PM +0100, Peter Zijlstra wrote: > On Tue, Feb 17, 2015 at 03:25:32PM -0600, Josh Poimboeuf wrote: > > > And I'm assuming you're hard relying on CONFIG_FRAMEPOINTER here, > > > because otherwise x86 stacks are a mess too. > > > > Yeah, it'll rely on CONFIG_FRAME_POIN

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-18 Thread Peter Zijlstra
On Tue, Feb 17, 2015 at 03:25:32PM -0600, Josh Poimboeuf wrote: > > And I'm assuming you're hard relying on CONFIG_FRAMEPOINTER here, > > because otherwise x86 stacks are a mess too. > > Yeah, it'll rely on CONFIG_FRAME_POINTER. IIUC, the arches we care > about now (x86, power, s390, arm64) all h

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-17 Thread Josh Poimboeuf
On Tue, Feb 17, 2015 at 07:15:41PM +0100, Peter Zijlstra wrote: > On Tue, Feb 17, 2015 at 08:12:11AM -0600, Josh Poimboeuf wrote: > > On Tue, Feb 17, 2015 at 10:24:50AM +0100, Peter Zijlstra wrote: > > > So far stack unwinding has basically been a best effort debug output > > > kind of thing, you'r

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-17 Thread Peter Zijlstra
On Tue, Feb 17, 2015 at 08:12:11AM -0600, Josh Poimboeuf wrote: > On Tue, Feb 17, 2015 at 10:24:50AM +0100, Peter Zijlstra wrote: > > So far stack unwinding has basically been a best effort debug output > > kind of thing, you're wanting to make the integrity of the kernel depend > > on it. > > > >

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-17 Thread Josh Poimboeuf
On Tue, Feb 17, 2015 at 10:24:50AM +0100, Peter Zijlstra wrote: > On Mon, Feb 16, 2015 at 04:05:05PM -0600, Josh Poimboeuf wrote: > > Yeah, I can understand that. I definitely want to avoid touching the > > scheduler code. Basically I'm trying to find a way to atomically do the > > following: > >

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-17 Thread Peter Zijlstra
On Mon, Feb 16, 2015 at 04:05:05PM -0600, Josh Poimboeuf wrote: > Yeah, I can understand that. I definitely want to avoid touching the > scheduler code. Basically I'm trying to find a way to atomically do the > following: > > if (task is sleeping) { > walk the stack > if (certain set
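
The truncated pseudocode continues, per the rest of the thread, roughly as: walk the stack, and if none of the to-be-patched functions appear on it, switch the task to the new universe. Wired through the sched_task_call() primitive this thread proposes, that might look like the sketch below (helper names borrowed from the later consistency-model RFC; not the actual patch):

static void klp_try_switch_cb(struct task_struct *task, void *data)
{
	int *ret = data;

	/* sched_task_call() holds the task's rq lock, so "sleeping"
	 * cannot change underneath us */
	if (task_curr(task)) {
		*ret = -EBUSY;		/* running: stack is untrustworthy */
		return;
	}

	if (klp_check_stack(task))	/* patched function on its stack? */
		*ret = -EBUSY;		/* not at a safe point; retry later */
	else
		klp_switch_task_universe(task);
}

/* for each candidate task: */
sched_task_call(klp_try_switch_cb, task, &ret);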

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-16 Thread Josh Poimboeuf
On Mon, Feb 16, 2015 at 09:44:36PM +0100, Peter Zijlstra wrote: > On Mon, Feb 16, 2015 at 12:52:34PM -0600, Josh Poimboeuf wrote: > > +++ b/kernel/sched/core.c > > @@ -1338,6 +1338,23 @@ void kick_process(struct task_struct *p) > > EXPORT_SYMBOL_GPL(kick_process); > > #endif /* CONFIG_SMP */ > >

Re: [PATCH 1/3] sched: add sched_task_call()

2015-02-16 Thread Peter Zijlstra
On Mon, Feb 16, 2015 at 12:52:34PM -0600, Josh Poimboeuf wrote: > +++ b/kernel/sched/core.c > @@ -1338,6 +1338,23 @@ void kick_process(struct task_struct *p) > EXPORT_SYMBOL_GPL(kick_process); > #endif /* CONFIG_SMP */ > > +/*** > + * sched_task_call - call a function with a task's state locked
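
As best as it can be reconstructed from the quoted hunk, the primitive at the heart of this thread is essentially the following (a paraphrase of the RFC patch, not a verbatim copy):

typedef void (*sched_task_call_func_t)(struct task_struct *p, void *data);

/*
 * sched_task_call - call a function with a task's state locked
 *
 * The callback runs with the task's runqueue lock held, so the task can
 * neither start nor stop running while the callback inspects it.
 */
void sched_task_call(sched_task_call_func_t func, struct task_struct *p,
		     void *data)
{
	unsigned long flags;
	struct rq *rq;

	rq = task_rq_lock(p, &flags);
	func(p, data);
	task_rq_unlock(rq, p, &flags);
}
EXPORT_SYMBOL_GPL(sched_task_call);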