Re: [PATCH 2/3] static_call: Align static_call_is_init() patching condition
On Fri, Mar 19, 2021 at 03:40:46PM +0100, Rasmus Villemoes wrote: > On 19/03/2021 15.13, Peter Zijlstra wrote: > > >> Dunno, probably overkill, but perhaps we could have an atomic_t (or > >> refcount, whatever) init_ref inited to 1, with init_ref_get() doing an > >> inc_unless_zero, and iff you get a ref, you're free to call (/patch) > >> __init functions and access __initdata, but must do init_ref_put(), with > >> PID1 dropping its initial ref and waiting for it to drop to 0 before > >> doing the *free_initmem() calls. > > > > I'd as soon simply add another SYSTEM state. That way we don't have to > > worry about who else looks at RUNNING for what etc.. > > I don't understand. How would that solve the > > PID1 PIDX >ok = system_state < INIT_MEM_GONE; > system_state = INIT_MEM_GONE; > free_initmem(); > system_state = RUNNING; >if (ok) >poke init mem > > race? Argh, I meant to put it before SMP bringup, but then still have to run the smp_init calls :/ N/m, you're quite right.
Re: [PATCH 2/3] static_call: Align static_call_is_init() patching condition
On 19/03/2021 15.40, Rasmus Villemoes wrote: > On 19/03/2021 15.13, Peter Zijlstra wrote: > >>> Dunno, probably overkill, but perhaps we could have an atomic_t (or >>> refcount, whatever) init_ref inited to 1, with init_ref_get() doing an >>> inc_unless_zero, and iff you get a ref, you're free to call (/patch) >>> __init functions and access __initdata, but must do init_ref_put(), with >>> PID1 dropping its initial ref and waiting for it to drop to 0 before >>> doing the *free_initmem() calls. >> >> I'd as soon simply add another SYSTEM state. That way we don't have to >> worry about who else looks at RUNNING for what etc.. > > I don't understand. How would that solve the > > PID1 PIDX >ok = system_state < INIT_MEM_GONE; > system_state = INIT_MEM_GONE; > free_initmem(); > system_state = RUNNING; >if (ok) >poke init mem > > race? I really don't see any way arbitrary threads can reliably check > how far PID1 has progressed at one point in time and use that > information (a few lines) later to access init memory without > synchronizing with PID1. > > AFAICT, having an atomic_t object representing the init memory something like --- a/init/main.c +++ b/init/main.c @@ -1417,6 +1417,18 @@ void __weak free_initmem(void) free_initmem_default(POISON_FREE_INITMEM); } +static atomic_t init_mem_ref = ATOMIC_INIT(1); +static DECLARE_COMPLETION(init_mem_may_go); +bool init_mem_get(void) +{ + return atomic_inc_not_zero(_mem_ref); +} +void init_mem_put(void) +{ + if (atomic_dec_and_test(_mem_ref)) + complete(_mem_may_go); +} + static int __ref kernel_init(void *unused) { int ret; @@ -1424,6 +1436,8 @@ static int __ref kernel_init(void *unused) kernel_init_freeable(); /* need to finish all async __init code before freeing the memory */ async_synchronize_full(); + init_mem_put(); + wait_for_completion(_mem_may_go); kprobe_free_init_mem(); ftrace_free_init_mem(); kgdb_free_init_mem();
Re: [PATCH 2/3] static_call: Align static_call_is_init() patching condition
On 19/03/2021 15.13, Peter Zijlstra wrote: >> Dunno, probably overkill, but perhaps we could have an atomic_t (or >> refcount, whatever) init_ref inited to 1, with init_ref_get() doing an >> inc_unless_zero, and iff you get a ref, you're free to call (/patch) >> __init functions and access __initdata, but must do init_ref_put(), with >> PID1 dropping its initial ref and waiting for it to drop to 0 before >> doing the *free_initmem() calls. > > I'd as soon simply add another SYSTEM state. That way we don't have to > worry about who else looks at RUNNING for what etc.. I don't understand. How would that solve the PID1 PIDX ok = system_state < INIT_MEM_GONE; system_state = INIT_MEM_GONE; free_initmem(); system_state = RUNNING; if (ok) poke init mem race? I really don't see any way arbitrary threads can reliably check how far PID1 has progressed at one point in time and use that information (a few lines) later to access init memory without synchronizing with PID1. AFAICT, having an atomic_t object representing the init memory and a "no, you can't have a new reference" would guarantee correctness of the jump_label/static_call patching: If we get a reference, we do the patching just as for the rest of the kernel .text. If we don't, i.e. observe atomic_load(init_ref)==0, that may not necessarily mean that PID1 has actually discarded the memory yet, but no thread can currently or in the future actually run __init code, so it need not be patched. Rasmus
Re: [PATCH 2/3] static_call: Align static_call_is_init() patching condition
On Fri, Mar 19, 2021 at 02:31:19PM +0100, Rasmus Villemoes wrote: > On 18/03/2021 12.31, Peter Zijlstra wrote: > > The intent is to avoid writing init code after init (because the text > > might have been freed). The code is needlessly different between > > jump_label and static_call and not obviously correct. > > > > The existing code relies on the fact that the module loader clears the > > init layout, such that within_module_init() always fails, while > > jump_label relies on the module state which is more obvious and > > matches the kernel logic. > > > > Signed-off-by: Peter Zijlstra (Intel) > > --- > > kernel/static_call.c | 14 -- > > 1 file changed, 4 insertions(+), 10 deletions(-) > > > > --- a/kernel/static_call.c > > +++ b/kernel/static_call.c > > @@ -149,6 +149,7 @@ void __static_call_update(struct static_ > > }; > > > > for (site_mod = site_mod; site_mod = site_mod->next) { > > + bool init = system_state < SYSTEM_RUNNING; > > I recently had occasion to look at whether that would be a suitable > condition for knowing whether __init stuff was gone, but concluded that > it's not. Maybe I'm wrong. init/main.c: Ha, me too: https://lkml.kernel.org/r/YFMToXI/3qjlm...@hirez.programming.kicks-ass.net and I share your concern. > Dunno, probably overkill, but perhaps we could have an atomic_t (or > refcount, whatever) init_ref inited to 1, with init_ref_get() doing an > inc_unless_zero, and iff you get a ref, you're free to call (/patch) > __init functions and access __initdata, but must do init_ref_put(), with > PID1 dropping its initial ref and waiting for it to drop to 0 before > doing the *free_initmem() calls. I'd as soon simply add another SYSTEM state. That way we don't have to worry about who else looks at RUNNING for what etc..
Re: [PATCH 2/3] static_call: Align static_call_is_init() patching condition
On 18/03/2021 12.31, Peter Zijlstra wrote: > The intent is to avoid writing init code after init (because the text > might have been freed). The code is needlessly different between > jump_label and static_call and not obviously correct. > > The existing code relies on the fact that the module loader clears the > init layout, such that within_module_init() always fails, while > jump_label relies on the module state which is more obvious and > matches the kernel logic. > > Signed-off-by: Peter Zijlstra (Intel) > --- > kernel/static_call.c | 14 -- > 1 file changed, 4 insertions(+), 10 deletions(-) > > --- a/kernel/static_call.c > +++ b/kernel/static_call.c > @@ -149,6 +149,7 @@ void __static_call_update(struct static_ > }; > > for (site_mod = site_mod; site_mod = site_mod->next) { > + bool init = system_state < SYSTEM_RUNNING; I recently had occasion to look at whether that would be a suitable condition for knowing whether __init stuff was gone, but concluded that it's not. Maybe I'm wrong. init/main.c: free_initmem(); mark_readonly(); /* * Kernel mappings are now finalized - update the userspace page-table * to finalize PTI. */ pti_finalize(); system_state = SYSTEM_RUNNING; So ISTM there's window where system_state < SYSTEM_RUNNING but accessing init stuff is a bad idea. If you're PID1 it's all fine, but I suppose other kernel threads could end up calling static_call_update. And just moving the system_state setting up before the *free_initmem() calls doesn't really solve anything because TOCTOU. Dunno, probably overkill, but perhaps we could have an atomic_t (or refcount, whatever) init_ref inited to 1, with init_ref_get() doing an inc_unless_zero, and iff you get a ref, you're free to call (/patch) __init functions and access __initdata, but must do init_ref_put(), with PID1 dropping its initial ref and waiting for it to drop to 0 before doing the *free_initmem() calls. Rasmus