Re: [PATCH 2/2] livepatch: fix patched module loading race
On Fri, Mar 06, 2015 at 08:37:26PM +0900, Masami Hiramatsu wrote: > Actually, we can suppose this module unloading context is > not changing universe. thus it is expected behavior, isn't it? In the case of my proposed consistency model RFC, if the module unloading task gets preempted, or if mod->exit() calls schedule(), its task can switch to the new universe before it's done. And for many modules it could also be possible for other contexts to access the module's functions in the GOING state before mod->exit() disables them. -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
On Fri 2015-03-06 20:37:26, Masami Hiramatsu wrote: > (2015/03/06 19:51), Petr Mladek wrote: > > On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote: > >> (2015/03/05 23:18), Josh Poimboeuf wrote: > >>> On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: > (2015/03/04 22:17), Petr Mladek wrote: > > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: > >> It's possible for klp_register_patch() to see a module before the > >> COMING > >> notifier is called, or after the GOING notifier is called. > >> > >> That can cause all kinds of ugly races. As Pter Mladek reported: > >> > >> "The problem is that we do not keep the klp_mutex lock all the time > >> when > >> the module is being added or removed. > >> > >> First, the module is visible even before ftrace is ready. If we > >> enable a patch > >> in this time frame, adding ftrace ops will fail and the patch will > >> get rejected > >> just because bad timing. > > > > Ah, this is not true after all. I did not properly check when > > MODULE_STATE_COMING was set. I though that it was before ftrace was > > initialized but it was not true. > > > > > >> Second, if we are "lucky" and enable the patch for the coming module > >> when the > >> ftrace is ready but before the module notifier has been called. The > >> notifier > >> will try to enable the patch as well. It will detect that it is > >> already patched, > >> return error, and the module will get rejected just because bad > >> timing. The more serious problem is that it will not call the > >> notifier for > >> going module, so that the mess will stay there and we wont be able > >> to load > >> the module later. > > > > Ah, the race is there but the effect is not that serious in the > > end. It seems that errors from module notifiers are ignored. In fact, > > we do not propagate the error from klp_module_notify_coming(). It means > > that WARN() from klp_enable_object() will be printed but the module > > will be loaded and patched. > > > > I am sorry, I was confused by kGraft where kgr_module_init() was > > called directly from module_load(). The errors were propagated. It > > means that kGraft rejects module when the patch cannot be applied. > > > > Note that the current solution is perfectly fine for the simple > > consistency model. > > > > > >> Third, similar problems are there for going module. If a patch is > >> enabled after > >> the notifier finishes but before the module is removed from the list > >> of modules, > >> the new patch will be applied to the module. The module might > >> disappear at > >> anytime when the patch enabling is in progress, so there might be an > >> access out > >> of memory. Or the whole patch might be applied and some mess will be > >> left, > >> so it will not be possible to load/patch the module again." > > > > This is true. > > No, that's not true if you try_get_module() before patching. After the > module state goes GOING (more correctly say, after > try_release_module_ref() > succeeded), all try_get_module() must fail :) > So, please make sure to get module when applying patches. > >>> > >>> Hi Masami, > >>> > >>> As Jikos pointed out elsewhere, try_get_module() won't solve all the > >>> GOING races. > >>> > >>> The module can be in GOING before mod->exit() is called. If we apply a > >>> patch between GOING getting set and mod->exit(), try_module_get() will > >>> fail and the module won't be patched. But module code can still run > >>> before or during mod->exit(), so the unpatched module code might > >>> interact badly with new patched code elsewhere. > >> > >> Hmm, in that case, we'd better have new GONE state for the module. > >> At least kprobe needs it. > > > > What is the exact problem with kprobes, please? > > Ah, sorry, I miss understood the issue. > > > > Note that the notifiers for MODULE_STATE_GOING are called > > after mod->exit(). Therefore it is safe to kill kprobes > > the fast way when using the notifier. > > Right. > > /* Stop the machine so refcounts can't move and disable module. */ > ret = try_stop_module(mod, flags, ); > if (ret != 0) > goto out; > > mutex_unlock(_mutex); > /* Final destruction now no one is using it. */ > if (mod->exit != NULL) > mod->exit(); > blocking_notifier_call_chain(_notify_list, > MODULE_STATE_GOING, mod); > async_synchronize_full(); > > And, kprobes also can try to put a probe between mutex_unlock() > and mod->exit() on mod->exit() function. > Since kprobe tries to get module when registering, it will > fail but mod->exit() is called after that
Re: Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
(2015/03/06 19:51), Petr Mladek wrote: > On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote: >> (2015/03/05 23:18), Josh Poimboeuf wrote: >>> On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: (2015/03/04 22:17), Petr Mladek wrote: > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: >> It's possible for klp_register_patch() to see a module before the COMING >> notifier is called, or after the GOING notifier is called. >> >> That can cause all kinds of ugly races. As Pter Mladek reported: >> >> "The problem is that we do not keep the klp_mutex lock all the time >> when >> the module is being added or removed. >> >> First, the module is visible even before ftrace is ready. If we enable >> a patch >> in this time frame, adding ftrace ops will fail and the patch will get >> rejected >> just because bad timing. > > Ah, this is not true after all. I did not properly check when > MODULE_STATE_COMING was set. I though that it was before ftrace was > initialized but it was not true. > > >> Second, if we are "lucky" and enable the patch for the coming module >> when the >> ftrace is ready but before the module notifier has been called. The >> notifier >> will try to enable the patch as well. It will detect that it is >> already patched, >> return error, and the module will get rejected just because bad >> timing. The more serious problem is that it will not call the notifier >> for >> going module, so that the mess will stay there and we wont be able to >> load >> the module later. > > Ah, the race is there but the effect is not that serious in the > end. It seems that errors from module notifiers are ignored. In fact, > we do not propagate the error from klp_module_notify_coming(). It means > that WARN() from klp_enable_object() will be printed but the module > will be loaded and patched. > > I am sorry, I was confused by kGraft where kgr_module_init() was > called directly from module_load(). The errors were propagated. It > means that kGraft rejects module when the patch cannot be applied. > > Note that the current solution is perfectly fine for the simple > consistency model. > > >> Third, similar problems are there for going module. If a patch is >> enabled after >> the notifier finishes but before the module is removed from the list >> of modules, >> the new patch will be applied to the module. The module might >> disappear at >> anytime when the patch enabling is in progress, so there might be an >> access out >> of memory. Or the whole patch might be applied and some mess will be >> left, >> so it will not be possible to load/patch the module again." > > This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. >>> >>> Hi Masami, >>> >>> As Jikos pointed out elsewhere, try_get_module() won't solve all the >>> GOING races. >>> >>> The module can be in GOING before mod->exit() is called. If we apply a >>> patch between GOING getting set and mod->exit(), try_module_get() will >>> fail and the module won't be patched. But module code can still run >>> before or during mod->exit(), so the unpatched module code might >>> interact badly with new patched code elsewhere. >> >> Hmm, in that case, we'd better have new GONE state for the module. >> At least kprobe needs it. > > What is the exact problem with kprobes, please? Ah, sorry, I miss understood the issue. > Note that the notifiers for MODULE_STATE_GOING are called > after mod->exit(). Therefore it is safe to kill kprobes > the fast way when using the notifier. Right. /* Stop the machine so refcounts can't move and disable module. */ ret = try_stop_module(mod, flags, ); if (ret != 0) goto out; mutex_unlock(_mutex); /* Final destruction now no one is using it. */ if (mod->exit != NULL) mod->exit(); blocking_notifier_call_chain(_notify_list, MODULE_STATE_GOING, mod); async_synchronize_full(); And, kprobes also can try to put a probe between mutex_unlock() and mod->exit() on mod->exit() function. Since kprobe tries to get module when registering, it will fail but mod->exit() is called after that fail. So, there is also a race. However, I'm not sure that is actual serious issue. Actually, we can suppose this module unloading context is not changing universe. thus it is expected behavior, isn't it? Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology
Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote: > (2015/03/05 23:18), Josh Poimboeuf wrote: > > On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: > >> (2015/03/04 22:17), Petr Mladek wrote: > >>> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: > It's possible for klp_register_patch() to see a module before the COMING > notifier is called, or after the GOING notifier is called. > > That can cause all kinds of ugly races. As Pter Mladek reported: > > "The problem is that we do not keep the klp_mutex lock all the time > when > the module is being added or removed. > > First, the module is visible even before ftrace is ready. If we enable > a patch > in this time frame, adding ftrace ops will fail and the patch will get > rejected > just because bad timing. > >>> > >>> Ah, this is not true after all. I did not properly check when > >>> MODULE_STATE_COMING was set. I though that it was before ftrace was > >>> initialized but it was not true. > >>> > >>> > Second, if we are "lucky" and enable the patch for the coming module > when the > ftrace is ready but before the module notifier has been called. The > notifier > will try to enable the patch as well. It will detect that it is > already patched, > return error, and the module will get rejected just because bad > timing. The more serious problem is that it will not call the notifier > for > going module, so that the mess will stay there and we wont be able to > load > the module later. > >>> > >>> Ah, the race is there but the effect is not that serious in the > >>> end. It seems that errors from module notifiers are ignored. In fact, > >>> we do not propagate the error from klp_module_notify_coming(). It means > >>> that WARN() from klp_enable_object() will be printed but the module > >>> will be loaded and patched. > >>> > >>> I am sorry, I was confused by kGraft where kgr_module_init() was > >>> called directly from module_load(). The errors were propagated. It > >>> means that kGraft rejects module when the patch cannot be applied. > >>> > >>> Note that the current solution is perfectly fine for the simple > >>> consistency model. > >>> > >>> > Third, similar problems are there for going module. If a patch is > enabled after > the notifier finishes but before the module is removed from the list > of modules, > the new patch will be applied to the module. The module might > disappear at > anytime when the patch enabling is in progress, so there might be an > access out > of memory. Or the whole patch might be applied and some mess will be > left, > so it will not be possible to load/patch the module again." > >>> > >>> This is true. > >> > >> No, that's not true if you try_get_module() before patching. After the > >> module state goes GOING (more correctly say, after try_release_module_ref() > >> succeeded), all try_get_module() must fail :) > >> So, please make sure to get module when applying patches. > > > > Hi Masami, > > > > As Jikos pointed out elsewhere, try_get_module() won't solve all the > > GOING races. > > > > The module can be in GOING before mod->exit() is called. If we apply a > > patch between GOING getting set and mod->exit(), try_module_get() will > > fail and the module won't be patched. But module code can still run > > before or during mod->exit(), so the unpatched module code might > > interact badly with new patched code elsewhere. > > Hmm, in that case, we'd better have new GONE state for the module. > At least kprobe needs it. What is the exact problem with kprobes, please? Note that the notifiers for MODULE_STATE_GOING are called after mod->exit(). Therefore it is safe to kill kprobes the fast way when using the notifier. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Fri, Mar 06, 2015 at 08:37:26PM +0900, Masami Hiramatsu wrote: Actually, we can suppose this module unloading context is not changing universe. thus it is expected behavior, isn't it? In the case of my proposed consistency model RFC, if the module unloading task gets preempted, or if mod-exit() calls schedule(), its task can switch to the new universe before it's done. And for many modules it could also be possible for other contexts to access the module's functions in the GOING state before mod-exit() disables them. -- Josh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
(2015/03/06 19:51), Petr Mladek wrote: On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote: (2015/03/05 23:18), Josh Poimboeuf wrote: On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: (2015/03/04 22:17), Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. Hi Masami, As Jikos pointed out elsewhere, try_get_module() won't solve all the GOING races. The module can be in GOING before mod-exit() is called. If we apply a patch between GOING getting set and mod-exit(), try_module_get() will fail and the module won't be patched. But module code can still run before or during mod-exit(), so the unpatched module code might interact badly with new patched code elsewhere. Hmm, in that case, we'd better have new GONE state for the module. At least kprobe needs it. What is the exact problem with kprobes, please? Ah, sorry, I miss understood the issue. Note that the notifiers for MODULE_STATE_GOING are called after mod-exit(). Therefore it is safe to kill kprobes the fast way when using the notifier. Right. /* Stop the machine so refcounts can't move and disable module. */ ret = try_stop_module(mod, flags, forced); if (ret != 0) goto out; mutex_unlock(module_mutex); /* Final destruction now no one is using it. */ if (mod-exit != NULL) mod-exit(); blocking_notifier_call_chain(module_notify_list, MODULE_STATE_GOING, mod); async_synchronize_full(); And, kprobes also can try to put a probe between mutex_unlock() and mod-exit() on mod-exit() function. Since kprobe tries to get module when registering, it will fail but mod-exit() is called after that fail. So, there is also a race. However, I'm not sure that is actual serious issue. Actually, we can suppose this module unloading context is not changing universe. thus it is expected behavior, isn't it? Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
On Fri 2015-03-06 20:37:26, Masami Hiramatsu wrote: (2015/03/06 19:51), Petr Mladek wrote: On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote: (2015/03/05 23:18), Josh Poimboeuf wrote: On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: (2015/03/04 22:17), Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. Hi Masami, As Jikos pointed out elsewhere, try_get_module() won't solve all the GOING races. The module can be in GOING before mod-exit() is called. If we apply a patch between GOING getting set and mod-exit(), try_module_get() will fail and the module won't be patched. But module code can still run before or during mod-exit(), so the unpatched module code might interact badly with new patched code elsewhere. Hmm, in that case, we'd better have new GONE state for the module. At least kprobe needs it. What is the exact problem with kprobes, please? Ah, sorry, I miss understood the issue. Note that the notifiers for MODULE_STATE_GOING are called after mod-exit(). Therefore it is safe to kill kprobes the fast way when using the notifier. Right. /* Stop the machine so refcounts can't move and disable module. */ ret = try_stop_module(mod, flags, forced); if (ret != 0) goto out; mutex_unlock(module_mutex); /* Final destruction now no one is using it. */ if (mod-exit != NULL) mod-exit(); blocking_notifier_call_chain(module_notify_list, MODULE_STATE_GOING, mod); async_synchronize_full(); And, kprobes also can try to put a probe between mutex_unlock() and mod-exit() on mod-exit() function. Since kprobe tries to get module when registering, it will fail but mod-exit() is called after that fail. So, there is also a race. I think that it is not a problem if the Kprobe can not be registered at this stage. If anyone wants to register Kprobe on the mod-exit(), she should do it earlier. However, I'm not sure that is actual serious issue. Actually, we can suppose this module unloading context is not changing universe. thus it is expected behavior, isn't it? Live Patching is in different situation. The patch modifies many functions at once. If we allow semantic changes in functions, we need to have the patches
Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
On Fri 2015-03-06 10:24:27, Masami Hiramatsu wrote: (2015/03/05 23:18), Josh Poimboeuf wrote: On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: (2015/03/04 22:17), Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. Hi Masami, As Jikos pointed out elsewhere, try_get_module() won't solve all the GOING races. The module can be in GOING before mod-exit() is called. If we apply a patch between GOING getting set and mod-exit(), try_module_get() will fail and the module won't be patched. But module code can still run before or during mod-exit(), so the unpatched module code might interact badly with new patched code elsewhere. Hmm, in that case, we'd better have new GONE state for the module. At least kprobe needs it. What is the exact problem with kprobes, please? Note that the notifiers for MODULE_STATE_GOING are called after mod-exit(). Therefore it is safe to kill kprobes the fast way when using the notifier. Best Regards, Petr -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
(2015/03/05 23:18), Josh Poimboeuf wrote: > On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: >> (2015/03/04 22:17), Petr Mladek wrote: >>> On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: "The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. >>> >>> Ah, this is not true after all. I did not properly check when >>> MODULE_STATE_COMING was set. I though that it was before ftrace was >>> initialized but it was not true. >>> >>> Second, if we are "lucky" and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. >>> >>> Ah, the race is there but the effect is not that serious in the >>> end. It seems that errors from module notifiers are ignored. In fact, >>> we do not propagate the error from klp_module_notify_coming(). It means >>> that WARN() from klp_enable_object() will be printed but the module >>> will be loaded and patched. >>> >>> I am sorry, I was confused by kGraft where kgr_module_init() was >>> called directly from module_load(). The errors were propagated. It >>> means that kGraft rejects module when the patch cannot be applied. >>> >>> Note that the current solution is perfectly fine for the simple >>> consistency model. >>> >>> Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again." >>> >>> This is true. >> >> No, that's not true if you try_get_module() before patching. After the >> module state goes GOING (more correctly say, after try_release_module_ref() >> succeeded), all try_get_module() must fail :) >> So, please make sure to get module when applying patches. > > Hi Masami, > > As Jikos pointed out elsewhere, try_get_module() won't solve all the > GOING races. > > The module can be in GOING before mod->exit() is called. If we apply a > patch between GOING getting set and mod->exit(), try_module_get() will > fail and the module won't be patched. But module code can still run > before or during mod->exit(), so the unpatched module code might > interact badly with new patched code elsewhere. Hmm, in that case, we'd better have new GONE state for the module. At least kprobe needs it. Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Thu 2015-03-05 09:52:41, Masami Hiramatsu wrote: > (2015/03/04 22:17), Petr Mladek wrote: > > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: > >> It's possible for klp_register_patch() to see a module before the COMING > >> notifier is called, or after the GOING notifier is called. > >> > >> That can cause all kinds of ugly races. As Pter Mladek reported: > >> > >> "The problem is that we do not keep the klp_mutex lock all the time when > >> the module is being added or removed. > >> > >> First, the module is visible even before ftrace is ready. If we enable a > >> patch > >> in this time frame, adding ftrace ops will fail and the patch will get > >> rejected > >> just because bad timing. > > > > Ah, this is not true after all. I did not properly check when > > MODULE_STATE_COMING was set. I though that it was before ftrace was > > initialized but it was not true. > > > > > >> Second, if we are "lucky" and enable the patch for the coming module > >> when the > >> ftrace is ready but before the module notifier has been called. The > >> notifier > >> will try to enable the patch as well. It will detect that it is already > >> patched, > >> return error, and the module will get rejected just because bad > >> timing. The more serious problem is that it will not call the notifier > >> for > >> going module, so that the mess will stay there and we wont be able to > >> load > >> the module later. > > > > Ah, the race is there but the effect is not that serious in the > > end. It seems that errors from module notifiers are ignored. In fact, > > we do not propagate the error from klp_module_notify_coming(). It means > > that WARN() from klp_enable_object() will be printed but the module > > will be loaded and patched. > > > > I am sorry, I was confused by kGraft where kgr_module_init() was > > called directly from module_load(). The errors were propagated. It > > means that kGraft rejects module when the patch cannot be applied. > > > > Note that the current solution is perfectly fine for the simple > > consistency model. > > > > > >> Third, similar problems are there for going module. If a patch is > >> enabled after > >> the notifier finishes but before the module is removed from the list of > >> modules, > >> the new patch will be applied to the module. The module might disappear > >> at > >> anytime when the patch enabling is in progress, so there might be an > >> access out > >> of memory. Or the whole patch might be applied and some mess will be > >> left, > >> so it will not be possible to load/patch the module again." > > > > This is true. > > No, that's not true if you try_get_module() before patching. After the > module state goes GOING (more correctly say, after try_release_module_ref() > succeeded), all try_get_module() must fail :) > So, please make sure to get module when applying patches. The question is what to do if we could not get the module reference because the module is going. We must not ignore the module because there is a window when the module code could still get called, see https://lkml.org/lkml/2015/3/4/776 It would be possible to wait until the module is removed but it might take quite some time, especially if the mod->exit() function is complex and need to wait for something. Or we could refuse to add the patch but it is ugly. I tend to go back and use the original idea with a boolean that is updated when the module going notifier is called. It sets the border, when it is safe to ignore the going module. There is no waiting, no rejection. I am actually getting near to send rfc patch v2 with this implementation. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: > (2015/03/04 22:17), Petr Mladek wrote: > > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: > >> It's possible for klp_register_patch() to see a module before the COMING > >> notifier is called, or after the GOING notifier is called. > >> > >> That can cause all kinds of ugly races. As Pter Mladek reported: > >> > >> "The problem is that we do not keep the klp_mutex lock all the time when > >> the module is being added or removed. > >> > >> First, the module is visible even before ftrace is ready. If we enable a > >> patch > >> in this time frame, adding ftrace ops will fail and the patch will get > >> rejected > >> just because bad timing. > > > > Ah, this is not true after all. I did not properly check when > > MODULE_STATE_COMING was set. I though that it was before ftrace was > > initialized but it was not true. > > > > > >> Second, if we are "lucky" and enable the patch for the coming module > >> when the > >> ftrace is ready but before the module notifier has been called. The > >> notifier > >> will try to enable the patch as well. It will detect that it is already > >> patched, > >> return error, and the module will get rejected just because bad > >> timing. The more serious problem is that it will not call the notifier > >> for > >> going module, so that the mess will stay there and we wont be able to > >> load > >> the module later. > > > > Ah, the race is there but the effect is not that serious in the > > end. It seems that errors from module notifiers are ignored. In fact, > > we do not propagate the error from klp_module_notify_coming(). It means > > that WARN() from klp_enable_object() will be printed but the module > > will be loaded and patched. > > > > I am sorry, I was confused by kGraft where kgr_module_init() was > > called directly from module_load(). The errors were propagated. It > > means that kGraft rejects module when the patch cannot be applied. > > > > Note that the current solution is perfectly fine for the simple > > consistency model. > > > > > >> Third, similar problems are there for going module. If a patch is > >> enabled after > >> the notifier finishes but before the module is removed from the list of > >> modules, > >> the new patch will be applied to the module. The module might disappear > >> at > >> anytime when the patch enabling is in progress, so there might be an > >> access out > >> of memory. Or the whole patch might be applied and some mess will be > >> left, > >> so it will not be possible to load/patch the module again." > > > > This is true. > > No, that's not true if you try_get_module() before patching. After the > module state goes GOING (more correctly say, after try_release_module_ref() > succeeded), all try_get_module() must fail :) > So, please make sure to get module when applying patches. Hi Masami, As Jikos pointed out elsewhere, try_get_module() won't solve all the GOING races. The module can be in GOING before mod->exit() is called. If we apply a patch between GOING getting set and mod->exit(), try_module_get() will fail and the module won't be patched. But module code can still run before or during mod->exit(), so the unpatched module code might interact badly with new patched code elsewhere. -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: [PATCH 2/2] livepatch: fix patched module loading race
(2015/03/05 23:18), Josh Poimboeuf wrote: On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: (2015/03/04 22:17), Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. Hi Masami, As Jikos pointed out elsewhere, try_get_module() won't solve all the GOING races. The module can be in GOING before mod-exit() is called. If we apply a patch between GOING getting set and mod-exit(), try_module_get() will fail and the module won't be patched. But module code can still run before or during mod-exit(), so the unpatched module code might interact badly with new patched code elsewhere. Hmm, in that case, we'd better have new GONE state for the module. At least kprobe needs it. Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Thu 2015-03-05 09:52:41, Masami Hiramatsu wrote: (2015/03/04 22:17), Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. The question is what to do if we could not get the module reference because the module is going. We must not ignore the module because there is a window when the module code could still get called, see https://lkml.org/lkml/2015/3/4/776 It would be possible to wait until the module is removed but it might take quite some time, especially if the mod-exit() function is complex and need to wait for something. Or we could refuse to add the patch but it is ugly. I tend to go back and use the original idea with a boolean that is updated when the module going notifier is called. It sets the border, when it is safe to ignore the going module. There is no waiting, no rejection. I am actually getting near to send rfc patch v2 with this implementation. Best Regards, Petr -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Thu, Mar 05, 2015 at 09:52:41AM +0900, Masami Hiramatsu wrote: (2015/03/04 22:17), Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. Hi Masami, As Jikos pointed out elsewhere, try_get_module() won't solve all the GOING races. The module can be in GOING before mod-exit() is called. If we apply a patch between GOING getting set and mod-exit(), try_module_get() will fail and the module won't be patched. But module code can still run before or during mod-exit(), so the unpatched module code might interact badly with new patched code elsewhere. -- Josh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
(2015/03/04 22:17), Petr Mladek wrote: > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: >> It's possible for klp_register_patch() to see a module before the COMING >> notifier is called, or after the GOING notifier is called. >> >> That can cause all kinds of ugly races. As Pter Mladek reported: >> >> "The problem is that we do not keep the klp_mutex lock all the time when >> the module is being added or removed. >> >> First, the module is visible even before ftrace is ready. If we enable a >> patch >> in this time frame, adding ftrace ops will fail and the patch will get >> rejected >> just because bad timing. > > Ah, this is not true after all. I did not properly check when > MODULE_STATE_COMING was set. I though that it was before ftrace was > initialized but it was not true. > > >> Second, if we are "lucky" and enable the patch for the coming module when >> the >> ftrace is ready but before the module notifier has been called. The >> notifier >> will try to enable the patch as well. It will detect that it is already >> patched, >> return error, and the module will get rejected just because bad >> timing. The more serious problem is that it will not call the notifier for >> going module, so that the mess will stay there and we wont be able to load >> the module later. > > Ah, the race is there but the effect is not that serious in the > end. It seems that errors from module notifiers are ignored. In fact, > we do not propagate the error from klp_module_notify_coming(). It means > that WARN() from klp_enable_object() will be printed but the module > will be loaded and patched. > > I am sorry, I was confused by kGraft where kgr_module_init() was > called directly from module_load(). The errors were propagated. It > means that kGraft rejects module when the patch cannot be applied. > > Note that the current solution is perfectly fine for the simple > consistency model. > > >> Third, similar problems are there for going module. If a patch is enabled >> after >> the notifier finishes but before the module is removed from the list of >> modules, >> the new patch will be applied to the module. The module might disappear at >> anytime when the patch enabling is in progress, so there might be an >> access out >> of memory. Or the whole patch might be applied and some mess will be left, >> so it will not be possible to load/patch the module again." > > This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, Mar 04, 2015 at 11:02:59PM +0100, Jiri Kosina wrote: > On Wed, 4 Mar 2015, Josh Poimboeuf wrote: > > > Well, we could just get a reference on all patched modules to prevent them > > from being unloaded. > > Is that really a solution for cases where you are unloading a module (it > has "just" been switched to MODULE_STATE_GOING) and enable a patch which > patches one of its functions directly after it got switched to > MODULE_STATE_GOING? Ah, right. try_module_get() fails, we don't patch the module, and we end up in the same mod->exit() race. Never mind, again :-) -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, 4 Mar 2015, Josh Poimboeuf wrote: > Well, we could just get a reference on all patched modules to prevent them > from being unloaded. Is that really a solution for cases where you are unloading a module (it has "just" been switched to MODULE_STATE_GOING) and enable a patch which patches one of its functions directly after it got switched to MODULE_STATE_GOING? -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, Mar 04, 2015 at 05:36:11PM +0100, Petr Mladek wrote: > For example, let's have three patches (P1, P2, P3) for the functions a() and > b() > where a() is from vmcore and b() is from a module M. Something like: > > a() b() > P1a1()b1() > P2a2()b2() > P3a3()b3(3) > > If you load the module M after all patches are registered and enabled. > The ftrace ops for function a() and b() has listed the functions in this > order > > ops_a->func_stack -> list(a3,a2,a1) > ops_b->func_stack -> list(b3,b2,b1) > > , so the pointer to b3() is the first and will be used. > > Then you might have the following scenario. Let's start with state > when patches P1 and P2 are registered and enabled but the module M > is not loaded. Then ftrace ops for b() does not exist. Then we > get into the following race: > > > CPU0 CPU1 > > load_module(M) > > complete_formation() > > mod->state = MODULE_STATE_COMING; > mutex_unlock(_mutex); > > klp_register_patch(P3); > klp_enable_patch(P3); > > # STATE 1 > > > klp_module_notify(M) > klp_module_notify_coming(P1); > klp_module_notify_coming(P2); > klp_module_notify_coming(P3); > > # STATE 2 > > > The ftrace ops for a() and b() then looks: > > STATE1: > > ops_a->func_stack -> list(a3,a2,a1); > ops_b->func_stack -> list(b3); > > STATE2: > ops_a->func_stack -> list(a3,a2,a1); > ops_b->func_stack -> list(b2,b1,b3); > > therefore, b2() is used for the module but a3() is used for vmcore > because they were the last added. Thanks for the excellent explanation. That makes sense. > My plan is to fix this problem by calling klp_module_init() directly > in load_module() just after ftrace_module_init(). It will solve this > problem because it will be called in MODULE_STATE_UNFORMED. Ok, looking forward to that. > It will have another big advantage. It will allow to pass the error > code and refuse loading modules that could not get patched. This will > be needed for the more complex patches anyway. We have to prevent > running module code that is inconsistent with the patched system. Yeah, that does need to be fixed up too. > I am still in doubts how to best solve the problem for going modules. > Your suggested solution is fine for now. But we will need a better fix > after adding the more complex consistency model. Well, we could just get a reference on all patched modules to prevent them from being unloaded. -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed 2015-03-04 09:34:15, Josh Poimboeuf wrote: > On Wed, Mar 04, 2015 at 02:17:52PM +0100, Petr Mladek wrote: > > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: > > > It's possible for klp_register_patch() to see a module before the COMING > > > notifier is called, or after the GOING notifier is called. > > > > > > That can cause all kinds of ugly races. As Pter Mladek reported: > > > > > > "The problem is that we do not keep the klp_mutex lock all the time when > > > the module is being added or removed. > > > Second, if we are "lucky" and enable the patch for the coming module > > > when the > > > ftrace is ready but before the module notifier has been called. The > > > notifier > > > will try to enable the patch as well. It will detect that it is already > > > patched, > > > return error, and the module will get rejected just because bad > > > timing. The more serious problem is that it will not call the notifier > > > for > > > going module, so that the mess will stay there and we wont be able to > > > load > > > the module later. > > > > Ah, the race is there but the effect is not that serious in the > > end. It seems that errors from module notifiers are ignored. In fact, > > we do not propagate the error from klp_module_notify_coming(). It means > > that WARN() from klp_enable_object() will be printed but the module > > will be loaded and patched. > > > > I am sorry, I was confused by kGraft where kgr_module_init() was > > called directly from module_load(). The errors were propagated. It > > means that kGraft rejects module when the patch cannot be applied. > > True, but we should still eliminate the race. Initializing the object > twice could cause some sneaky bugs, if not now then in the future. sure > > Note that the current solution is perfectly fine for the simple > > consistency model. > > > > > > > Third, similar problems are there for going module. If a patch is > > > enabled after > > > the notifier finishes but before the module is removed from the list of > > > modules, > > > the new patch will be applied to the module. The module might disappear > > > at > > > anytime when the patch enabling is in progress, so there might be an > > > access out > > > of memory. Or the whole patch might be applied and some mess will be > > > left, > > > so it will not be possible to load/patch the module again." > > > > This is true. > > > > > > > Fix these races by letting the first one who sees the module do the > > > needed work. > > > > > > Reported-by: Petr Mladek > > > Signed-off-by: Josh Poimboeuf > > > --- > > > kernel/livepatch/core.c | 53 > > > + > > > 1 file changed, 49 insertions(+), 4 deletions(-) > > > > > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c > > > index a74e4e8..19a758c 100644 > > > --- a/kernel/livepatch/core.c > > > +++ b/kernel/livepatch/core.c > > > @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object > > > *obj) > > > /* sets obj->mod if object is not vmlinux and module is found */ > > > static void klp_find_object_module(struct klp_object *obj) > > > { > > > + struct module *mod; > > > + > > > if (!klp_is_module(obj)) > > > return; > > > > > > mutex_lock(_mutex); > > > + > > > /* > > >* We don't need to take a reference on the module here because we have > > >* the klp_mutex, which is also taken by the module notifier. This > > >* prevents any module from unloading until we release the klp_mutex. > > >*/ > > > - obj->mod = find_module(obj->name); > > > + mod = find_module(obj->name); > > > + > > > + /* > > > + * Be careful here to prevent races with the notifier: > > > + * > > > + * - MODULE_STATE_COMING: This may be before or after the notifier gets > > > + * called. If after, the notifier didn't see the object anyway > > > + * because it hadn't been added to the patch list yet. Either way, > > > + * ftrace is already initialized, so it's safe to just go ahead and > > > + * initialize the object here. > > > > Well, we need to be careful. This will handle only the newly > > registered patch. If there are other patches against the module > > on the stack, it might produce wrong order at func_stack, see below. > > Sorry, but I don't follow. Can you give an example? See below. > > > + * > > > + * - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be > > > + * before or after the notifier gets called. If after, the notifer > > > + * didn't see the object anyway. Either way it's safe to just > > > + * pretend that it's already gone and leave obj->mod at NULL. > > > > This is true for the current simple consistency model. > > > > Note that there will be a small race window if we allow dependency > > between patched functions and introduce more a complex consistency model > > with lazy patching. The problem is the following sequence: > > > >
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, Mar 04, 2015 at 04:51:39PM +0100, Jiri Kosina wrote: > On Wed, 4 Mar 2015, Josh Poimboeuf wrote: > > > > CPU0 CPU1 > > > > > > delete_module() #SYSCALL > > > > > >try_stop_module() > > > mod->state = MODULE_STATE_GOING; > > > > > >mutex_unlock(_mutex); > > > > > > klp_register_patch() > > > klp_enable_patch() > > > > > > #save place to switch universe > > > > > > b() # from module that is going > > > a() # from core (patched) > > > > > > > > >mod->exit(); > > > > > > > > > Note that the function b() can be called until we call mod->exit(). > > > > > > If we do not apply patch against b() because it is in > > > MODULE_STATE_GOING. It will call patched a() with modified semantic > > > and things might get wrong. > > > > > > Well, the above described race is rather theoretical. It will happen > > > only when a patched module is being removed and a module with a patch > > > is added at the same time. Also it means that some other CPU will > > > manage to register, enable the patch, switch the universe, and > > > call function from the patched module during the small race window. > > > > > > I am not sure if we need to handle such a type of race if the solution > > > adds too big complexity. > > > > But b() can't be called after the module is in MODULE_STATE_GOING, > > right? try_stop_module() makes sure it's not being used before changing > > its state. > > If b() is called from __exit() of the particular module, then you end up > in exactly the situation described above, don't you? Well, maybe. I guess it depends on the consistency model. The current "inconsistency" model doesn't allow for function semantic changes anyway. If we went to the per-task consistency model with stack checking (like my RFC), if mod->exit() calls schedule() before calling b(), the module task can potentially switch to the new universe, and call old b() which calls new a(). So yeah. Maybe this isn't the best approach. -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, 4 Mar 2015, Josh Poimboeuf wrote: > > CPU0CPU1 > > > > delete_module() #SYSCALL > > > >try_stop_module() > > mod->state = MODULE_STATE_GOING; > > > >mutex_unlock(_mutex); > > > > klp_register_patch() > > klp_enable_patch() > > > > #save place to switch universe > > > > b() # from module that is going > > a() # from core (patched) > > > > > >mod->exit(); > > > > > > Note that the function b() can be called until we call mod->exit(). > > > > If we do not apply patch against b() because it is in > > MODULE_STATE_GOING. It will call patched a() with modified semantic > > and things might get wrong. > > > > Well, the above described race is rather theoretical. It will happen > > only when a patched module is being removed and a module with a patch > > is added at the same time. Also it means that some other CPU will > > manage to register, enable the patch, switch the universe, and > > call function from the patched module during the small race window. > > > > I am not sure if we need to handle such a type of race if the solution > > adds too big complexity. > > But b() can't be called after the module is in MODULE_STATE_GOING, > right? try_stop_module() makes sure it's not being used before changing > its state. If b() is called from __exit() of the particular module, then you end up in exactly the situation described above, don't you? Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, Mar 04, 2015 at 02:17:52PM +0100, Petr Mladek wrote: > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: > > It's possible for klp_register_patch() to see a module before the COMING > > notifier is called, or after the GOING notifier is called. > > > > That can cause all kinds of ugly races. As Pter Mladek reported: > > > > "The problem is that we do not keep the klp_mutex lock all the time when > > the module is being added or removed. > > > > First, the module is visible even before ftrace is ready. If we enable a > > patch > > in this time frame, adding ftrace ops will fail and the patch will get > > rejected > > just because bad timing. > > Ah, this is not true after all. I did not properly check when > MODULE_STATE_COMING was set. I though that it was before ftrace was > initialized but it was not true. Ah, right. I didn't re-read your description after realizing that MODULE_STATE_COMING means ftrace is already initialized. > > Second, if we are "lucky" and enable the patch for the coming module when > > the > > ftrace is ready but before the module notifier has been called. The > > notifier > > will try to enable the patch as well. It will detect that it is already > > patched, > > return error, and the module will get rejected just because bad > > timing. The more serious problem is that it will not call the notifier for > > going module, so that the mess will stay there and we wont be able to load > > the module later. > > Ah, the race is there but the effect is not that serious in the > end. It seems that errors from module notifiers are ignored. In fact, > we do not propagate the error from klp_module_notify_coming(). It means > that WARN() from klp_enable_object() will be printed but the module > will be loaded and patched. > > I am sorry, I was confused by kGraft where kgr_module_init() was > called directly from module_load(). The errors were propagated. It > means that kGraft rejects module when the patch cannot be applied. True, but we should still eliminate the race. Initializing the object twice could cause some sneaky bugs, if not now then in the future. > Note that the current solution is perfectly fine for the simple > consistency model. > > > > Third, similar problems are there for going module. If a patch is enabled > > after > > the notifier finishes but before the module is removed from the list of > > modules, > > the new patch will be applied to the module. The module might disappear at > > anytime when the patch enabling is in progress, so there might be an > > access out > > of memory. Or the whole patch might be applied and some mess will be left, > > so it will not be possible to load/patch the module again." > > This is true. > > > > Fix these races by letting the first one who sees the module do the > > needed work. > > > > Reported-by: Petr Mladek > > Signed-off-by: Josh Poimboeuf > > --- > > kernel/livepatch/core.c | 53 > > + > > 1 file changed, 49 insertions(+), 4 deletions(-) > > > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c > > index a74e4e8..19a758c 100644 > > --- a/kernel/livepatch/core.c > > +++ b/kernel/livepatch/core.c > > @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj) > > /* sets obj->mod if object is not vmlinux and module is found */ > > static void klp_find_object_module(struct klp_object *obj) > > { > > + struct module *mod; > > + > > if (!klp_is_module(obj)) > > return; > > > > mutex_lock(_mutex); > > + > > /* > > * We don't need to take a reference on the module here because we have > > * the klp_mutex, which is also taken by the module notifier. This > > * prevents any module from unloading until we release the klp_mutex. > > */ > > - obj->mod = find_module(obj->name); > > + mod = find_module(obj->name); > > + > > + /* > > +* Be careful here to prevent races with the notifier: > > +* > > +* - MODULE_STATE_COMING: This may be before or after the notifier gets > > +* called. If after, the notifier didn't see the object anyway > > +* because it hadn't been added to the patch list yet. Either way, > > +* ftrace is already initialized, so it's safe to just go ahead and > > +* initialize the object here. > > Well, we need to be careful. This will handle only the newly > registered patch. If there are other patches against the module > on the stack, it might produce wrong order at func_stack, see below. Sorry, but I don't follow. Can you give an example? > > +* > > +* - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be > > +* before or after the notifier gets called. If after, the notifer > > +* didn't see the object anyway. Either way it's safe to just > > +* pretend that it's already gone and leave obj->mod at NULL. > > This is true for the current
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed 2015-03-04 14:17:52, Petr Mladek wrote: > On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: > > It's possible for klp_register_patch() to see a module before the COMING > > notifier is called, or after the GOING notifier is called. > > > > That can cause all kinds of ugly races. As Pter Mladek reported: > > > > "The problem is that we do not keep the klp_mutex lock all the time when > > the module is being added or removed. > > > > First, the module is visible even before ftrace is ready. If we enable a > > patch > > in this time frame, adding ftrace ops will fail and the patch will get > > rejected > > just because bad timing. > > Ah, this is not true after all. I did not properly check when > MODULE_STATE_COMING was set. I though that it was before ftrace was > initialized but it was not true. > > > > Second, if we are "lucky" and enable the patch for the coming module when > > the > > ftrace is ready but before the module notifier has been called. The > > notifier > > will try to enable the patch as well. It will detect that it is already > > patched, > > return error, and the module will get rejected just because bad > > timing. The more serious problem is that it will not call the notifier for > > going module, so that the mess will stay there and we wont be able to load > > the module later. > > Ah, the race is there but the effect is not that serious in the > end. It seems that errors from module notifiers are ignored. In fact, > we do not propagate the error from klp_module_notify_coming(). It means > that WARN() from klp_enable_object() will be printed but the module > will be loaded and patched. > > I am sorry, I was confused by kGraft where kgr_module_init() was > called directly from module_load(). The errors were propagated. It > means that kGraft rejects module when the patch cannot be applied. > > Note that the current solution is perfectly fine for the simple > consistency model. > > > > Third, similar problems are there for going module. If a patch is enabled > > after > > the notifier finishes but before the module is removed from the list of > > modules, > > the new patch will be applied to the module. The module might disappear at > > anytime when the patch enabling is in progress, so there might be an > > access out > > of memory. Or the whole patch might be applied and some mess will be left, > > so it will not be possible to load/patch the module again." > > This is true. > > > > Fix these races by letting the first one who sees the module do the > > needed work. > > > > Reported-by: Petr Mladek > > Signed-off-by: Josh Poimboeuf > > --- > > kernel/livepatch/core.c | 53 > > + > > @@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block > > *nb, unsigned long action, > > continue; > > > > if (action == MODULE_STATE_COMING) { > > + > > + /* > > +* Check for a small window where the register > > +* path already initialized the object. > > +*/ > s/path/patch/ > > > > > + if (obj->mod) > > + continue; > > This might break stacking. The recently registered patch might become > the last on the stack and thus unused. Going through the stack when registering new patch would be quite ugly. I am going to provide yet another solution. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: > It's possible for klp_register_patch() to see a module before the COMING > notifier is called, or after the GOING notifier is called. > > That can cause all kinds of ugly races. As Pter Mladek reported: > > "The problem is that we do not keep the klp_mutex lock all the time when > the module is being added or removed. > > First, the module is visible even before ftrace is ready. If we enable a > patch > in this time frame, adding ftrace ops will fail and the patch will get > rejected > just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. > Second, if we are "lucky" and enable the patch for the coming module when > the > ftrace is ready but before the module notifier has been called. The notifier > will try to enable the patch as well. It will detect that it is already > patched, > return error, and the module will get rejected just because bad > timing. The more serious problem is that it will not call the notifier for > going module, so that the mess will stay there and we wont be able to load > the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. > Third, similar problems are there for going module. If a patch is enabled > after > the notifier finishes but before the module is removed from the list of > modules, > the new patch will be applied to the module. The module might disappear at > anytime when the patch enabling is in progress, so there might be an access > out > of memory. Or the whole patch might be applied and some mess will be left, > so it will not be possible to load/patch the module again." This is true. > Fix these races by letting the first one who sees the module do the > needed work. > > Reported-by: Petr Mladek > Signed-off-by: Josh Poimboeuf > --- > kernel/livepatch/core.c | 53 > + > 1 file changed, 49 insertions(+), 4 deletions(-) > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c > index a74e4e8..19a758c 100644 > --- a/kernel/livepatch/core.c > +++ b/kernel/livepatch/core.c > @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj) > /* sets obj->mod if object is not vmlinux and module is found */ > static void klp_find_object_module(struct klp_object *obj) > { > + struct module *mod; > + > if (!klp_is_module(obj)) > return; > > mutex_lock(_mutex); > + > /* >* We don't need to take a reference on the module here because we have >* the klp_mutex, which is also taken by the module notifier. This >* prevents any module from unloading until we release the klp_mutex. >*/ > - obj->mod = find_module(obj->name); > + mod = find_module(obj->name); > + > + /* > + * Be careful here to prevent races with the notifier: > + * > + * - MODULE_STATE_COMING: This may be before or after the notifier gets > + * called. If after, the notifier didn't see the object anyway > + * because it hadn't been added to the patch list yet. Either way, > + * ftrace is already initialized, so it's safe to just go ahead and > + * initialize the object here. Well, we need to be careful. This will handle only the newly registered patch. If there are other patches against the module on the stack, it might produce wrong order at func_stack, see below. > + * > + * - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be > + * before or after the notifier gets called. If after, the notifer > + * didn't see the object anyway. Either way it's safe to just > + * pretend that it's already gone and leave obj->mod at NULL. This is true for the current simple consistency model. Note that there will be a small race window if we allow dependency between patched functions and introduce more a complex consistency model with lazy patching. The problem is the following sequence: CPU0CPU1 delete_module() #SYSCALL try_stop_module() mod->state = MODULE_STATE_GOING; mutex_unlock(_mutex); klp_register_patch() klp_enable_patch()
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, Mar 04, 2015 at 02:17:52PM +0100, Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Ah, right. I didn't re-read your description after realizing that MODULE_STATE_COMING means ftrace is already initialized. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. True, but we should still eliminate the race. Initializing the object twice could cause some sneaky bugs, if not now then in the future. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. Fix these races by letting the first one who sees the module do the needed work. Reported-by: Petr Mladek pmla...@suse.cz Signed-off-by: Josh Poimboeuf jpoim...@redhat.com --- kernel/livepatch/core.c | 53 + 1 file changed, 49 insertions(+), 4 deletions(-) diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index a74e4e8..19a758c 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj) /* sets obj-mod if object is not vmlinux and module is found */ static void klp_find_object_module(struct klp_object *obj) { + struct module *mod; + if (!klp_is_module(obj)) return; mutex_lock(module_mutex); + /* * We don't need to take a reference on the module here because we have * the klp_mutex, which is also taken by the module notifier. This * prevents any module from unloading until we release the klp_mutex. */ - obj-mod = find_module(obj-name); + mod = find_module(obj-name); + + /* +* Be careful here to prevent races with the notifier: +* +* - MODULE_STATE_COMING: This may be before or after the notifier gets +* called. If after, the notifier didn't see the object anyway +* because it hadn't been added to the patch list yet. Either way, +* ftrace is already initialized, so it's safe to just go ahead and +* initialize the object here. Well, we need to be careful. This will handle only the newly registered patch. If there are other patches against the module on the stack, it might produce wrong order at func_stack, see below. Sorry, but I don't follow. Can you give an example? +* +* - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be +* before or after the notifier gets called. If after, the notifer +* didn't see the object anyway. Either way it's safe to just +* pretend that it's already gone and leave obj-mod at NULL. This is true for the current simple consistency model. Note that there will be a small race window if we allow dependency between patched functions and introduce more a complex
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, 4 Mar 2015, Josh Poimboeuf wrote: CPU0CPU1 delete_module() #SYSCALL try_stop_module() mod-state = MODULE_STATE_GOING; mutex_unlock(module_mutex); klp_register_patch() klp_enable_patch() #save place to switch universe b() # from module that is going a() # from core (patched) mod-exit(); Note that the function b() can be called until we call mod-exit(). If we do not apply patch against b() because it is in MODULE_STATE_GOING. It will call patched a() with modified semantic and things might get wrong. Well, the above described race is rather theoretical. It will happen only when a patched module is being removed and a module with a patch is added at the same time. Also it means that some other CPU will manage to register, enable the patch, switch the universe, and call function from the patched module during the small race window. I am not sure if we need to handle such a type of race if the solution adds too big complexity. But b() can't be called after the module is in MODULE_STATE_GOING, right? try_stop_module() makes sure it's not being used before changing its state. If b() is called from __exit() of the particular module, then you end up in exactly the situation described above, don't you? Thanks, -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, Mar 04, 2015 at 04:51:39PM +0100, Jiri Kosina wrote: On Wed, 4 Mar 2015, Josh Poimboeuf wrote: CPU0 CPU1 delete_module() #SYSCALL try_stop_module() mod-state = MODULE_STATE_GOING; mutex_unlock(module_mutex); klp_register_patch() klp_enable_patch() #save place to switch universe b() # from module that is going a() # from core (patched) mod-exit(); Note that the function b() can be called until we call mod-exit(). If we do not apply patch against b() because it is in MODULE_STATE_GOING. It will call patched a() with modified semantic and things might get wrong. Well, the above described race is rather theoretical. It will happen only when a patched module is being removed and a module with a patch is added at the same time. Also it means that some other CPU will manage to register, enable the patch, switch the universe, and call function from the patched module during the small race window. I am not sure if we need to handle such a type of race if the solution adds too big complexity. But b() can't be called after the module is in MODULE_STATE_GOING, right? try_stop_module() makes sure it's not being used before changing its state. If b() is called from __exit() of the particular module, then you end up in exactly the situation described above, don't you? Well, maybe. I guess it depends on the consistency model. The current inconsistency model doesn't allow for function semantic changes anyway. If we went to the per-task consistency model with stack checking (like my RFC), if mod-exit() calls schedule() before calling b(), the module task can potentially switch to the new universe, and call old b() which calls new a(). So yeah. Maybe this isn't the best approach. -- Josh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, Mar 04, 2015 at 05:36:11PM +0100, Petr Mladek wrote: For example, let's have three patches (P1, P2, P3) for the functions a() and b() where a() is from vmcore and b() is from a module M. Something like: a() b() P1a1()b1() P2a2()b2() P3a3()b3(3) If you load the module M after all patches are registered and enabled. The ftrace ops for function a() and b() has listed the functions in this order ops_a-func_stack - list(a3,a2,a1) ops_b-func_stack - list(b3,b2,b1) , so the pointer to b3() is the first and will be used. Then you might have the following scenario. Let's start with state when patches P1 and P2 are registered and enabled but the module M is not loaded. Then ftrace ops for b() does not exist. Then we get into the following race: CPU0 CPU1 load_module(M) complete_formation() mod-state = MODULE_STATE_COMING; mutex_unlock(module_mutex); klp_register_patch(P3); klp_enable_patch(P3); # STATE 1 klp_module_notify(M) klp_module_notify_coming(P1); klp_module_notify_coming(P2); klp_module_notify_coming(P3); # STATE 2 The ftrace ops for a() and b() then looks: STATE1: ops_a-func_stack - list(a3,a2,a1); ops_b-func_stack - list(b3); STATE2: ops_a-func_stack - list(a3,a2,a1); ops_b-func_stack - list(b2,b1,b3); therefore, b2() is used for the module but a3() is used for vmcore because they were the last added. Thanks for the excellent explanation. That makes sense. My plan is to fix this problem by calling klp_module_init() directly in load_module() just after ftrace_module_init(). It will solve this problem because it will be called in MODULE_STATE_UNFORMED. Ok, looking forward to that. It will have another big advantage. It will allow to pass the error code and refuse loading modules that could not get patched. This will be needed for the more complex patches anyway. We have to prevent running module code that is inconsistent with the patched system. Yeah, that does need to be fixed up too. I am still in doubts how to best solve the problem for going modules. Your suggested solution is fine for now. But we will need a better fix after adding the more complex consistency model. Well, we could just get a reference on all patched modules to prevent them from being unloaded. -- Josh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed 2015-03-04 09:34:15, Josh Poimboeuf wrote: On Wed, Mar 04, 2015 at 02:17:52PM +0100, Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. True, but we should still eliminate the race. Initializing the object twice could cause some sneaky bugs, if not now then in the future. sure Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. Fix these races by letting the first one who sees the module do the needed work. Reported-by: Petr Mladek pmla...@suse.cz Signed-off-by: Josh Poimboeuf jpoim...@redhat.com --- kernel/livepatch/core.c | 53 + 1 file changed, 49 insertions(+), 4 deletions(-) diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index a74e4e8..19a758c 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj) /* sets obj-mod if object is not vmlinux and module is found */ static void klp_find_object_module(struct klp_object *obj) { + struct module *mod; + if (!klp_is_module(obj)) return; mutex_lock(module_mutex); + /* * We don't need to take a reference on the module here because we have * the klp_mutex, which is also taken by the module notifier. This * prevents any module from unloading until we release the klp_mutex. */ - obj-mod = find_module(obj-name); + mod = find_module(obj-name); + + /* + * Be careful here to prevent races with the notifier: + * + * - MODULE_STATE_COMING: This may be before or after the notifier gets + * called. If after, the notifier didn't see the object anyway + * because it hadn't been added to the patch list yet. Either way, + * ftrace is already initialized, so it's safe to just go ahead and + * initialize the object here. Well, we need to be careful. This will handle only the newly registered patch. If there are other patches against the module on the stack, it might produce wrong order at func_stack, see below. Sorry, but I don't follow. Can you give an example? See below. + * + * - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be + * before or after the notifier gets called. If after, the notifer + * didn't see the object anyway. Either way it's safe to just + * pretend that it's already gone and leave obj-mod at NULL. This is true for the current simple consistency model. Note that there will be a small race window if we allow dependency between patched functions and introduce more a complex consistency model with lazy patching. The problem is the following sequence: CPU0CPU1 delete_module() #SYSCALL try_stop_module() mod-state = MODULE_STATE_GOING; mutex_unlock(module_mutex); klp_register_patch()
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, 4 Mar 2015, Josh Poimboeuf wrote: Well, we could just get a reference on all patched modules to prevent them from being unloaded. Is that really a solution for cases where you are unloading a module (it has just been switched to MODULE_STATE_GOING) and enable a patch which patches one of its functions directly after it got switched to MODULE_STATE_GOING? -- Jiri Kosina SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed, Mar 04, 2015 at 11:02:59PM +0100, Jiri Kosina wrote: On Wed, 4 Mar 2015, Josh Poimboeuf wrote: Well, we could just get a reference on all patched modules to prevent them from being unloaded. Is that really a solution for cases where you are unloading a module (it has just been switched to MODULE_STATE_GOING) and enable a patch which patches one of its functions directly after it got switched to MODULE_STATE_GOING? Ah, right. try_module_get() fails, we don't patch the module, and we end up in the same mod-exit() race. Never mind, again :-) -- Josh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
(2015/03/04 22:17), Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. No, that's not true if you try_get_module() before patching. After the module state goes GOING (more correctly say, after try_release_module_ref() succeeded), all try_get_module() must fail :) So, please make sure to get module when applying patches. Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. Fix these races by letting the first one who sees the module do the needed work. Reported-by: Petr Mladek pmla...@suse.cz Signed-off-by: Josh Poimboeuf jpoim...@redhat.com --- kernel/livepatch/core.c | 53 + 1 file changed, 49 insertions(+), 4 deletions(-) diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index a74e4e8..19a758c 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj) /* sets obj-mod if object is not vmlinux and module is found */ static void klp_find_object_module(struct klp_object *obj) { + struct module *mod; + if (!klp_is_module(obj)) return; mutex_lock(module_mutex); + /* * We don't need to take a reference on the module here because we have * the klp_mutex, which is also taken by the module notifier. This * prevents any module from unloading until we release the klp_mutex. */ - obj-mod = find_module(obj-name); + mod = find_module(obj-name); + + /* + * Be careful here to prevent races with the notifier: + * + * - MODULE_STATE_COMING: This may be before or after the notifier gets + * called. If after, the notifier didn't see the object anyway + * because it hadn't been added to the patch list yet. Either way, + * ftrace is already initialized, so it's safe to just go ahead and + * initialize the object here. Well, we need to be careful. This will handle only the newly registered patch. If there are other patches against the module on the stack, it might produce wrong order at func_stack, see below. + * + * - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be + * before or after the notifier gets called. If after, the notifer + * didn't see the object anyway. Either way it's safe to just + * pretend that it's already gone and leave obj-mod at NULL. This is true for the current simple consistency model. Note that there will be a small race window if we allow dependency between patched functions and introduce more a complex consistency model with lazy patching. The problem is the following sequence: CPU0CPU1 delete_module() #SYSCALL try_stop_module() mod-state = MODULE_STATE_GOING; mutex_unlock(module_mutex); klp_register_patch() klp_enable_patch() #save place to switch universe
Re: [PATCH 2/2] livepatch: fix patched module loading race
On Wed 2015-03-04 14:17:52, Petr Mladek wrote: On Tue 2015-03-03 17:02:22, Josh Poimboeuf wrote: It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Ah, this is not true after all. I did not properly check when MODULE_STATE_COMING was set. I though that it was before ftrace was initialized but it was not true. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Ah, the race is there but the effect is not that serious in the end. It seems that errors from module notifiers are ignored. In fact, we do not propagate the error from klp_module_notify_coming(). It means that WARN() from klp_enable_object() will be printed but the module will be loaded and patched. I am sorry, I was confused by kGraft where kgr_module_init() was called directly from module_load(). The errors were propagated. It means that kGraft rejects module when the patch cannot be applied. Note that the current solution is perfectly fine for the simple consistency model. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. This is true. Fix these races by letting the first one who sees the module do the needed work. Reported-by: Petr Mladek pmla...@suse.cz Signed-off-by: Josh Poimboeuf jpoim...@redhat.com --- kernel/livepatch/core.c | 53 + @@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action, continue; if (action == MODULE_STATE_COMING) { + + /* +* Check for a small window where the register +* path already initialized the object. +*/ s/path/patch/ + if (obj-mod) + continue; This might break stacking. The recently registered patch might become the last on the stack and thus unused. Going through the stack when registering new patch would be quite ugly. I am going to provide yet another solution. Best Regards, Petr -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] livepatch: fix patched module loading race
It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: "The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Second, if we are "lucky" and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again." Fix these races by letting the first one who sees the module do the needed work. Reported-by: Petr Mladek Signed-off-by: Josh Poimboeuf --- kernel/livepatch/core.c | 53 + 1 file changed, 49 insertions(+), 4 deletions(-) diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index a74e4e8..19a758c 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj) /* sets obj->mod if object is not vmlinux and module is found */ static void klp_find_object_module(struct klp_object *obj) { + struct module *mod; + if (!klp_is_module(obj)) return; mutex_lock(_mutex); + /* * We don't need to take a reference on the module here because we have * the klp_mutex, which is also taken by the module notifier. This * prevents any module from unloading until we release the klp_mutex. */ - obj->mod = find_module(obj->name); + mod = find_module(obj->name); + + /* +* Be careful here to prevent races with the notifier: +* +* - MODULE_STATE_COMING: This may be before or after the notifier gets +* called. If after, the notifier didn't see the object anyway +* because it hadn't been added to the patch list yet. Either way, +* ftrace is already initialized, so it's safe to just go ahead and +* initialize the object here. +* +* - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be +* before or after the notifier gets called. If after, the notifer +* didn't see the object anyway. Either way it's safe to just +* pretend that it's already gone and leave obj->mod at NULL. +* +* MODULE_STATE_LIVE: The common case. The module already finished +* loading. Just like with MODULE_STATE_COMING, we know the notifier +* didn't see the object, so we init it here. +*/ + if (mod && (mod->state == MODULE_STATE_COMING || + mod->state == MODULE_STATE_LIVE)) + obj->mod = mod; + mutex_unlock(_mutex); } @@ -695,8 +721,6 @@ static void klp_free_object_loaded(struct klp_object *obj) { struct klp_func *func; - obj->mod = NULL; - for (func = obj->funcs; func->old_name; func++) func->old_addr = 0; } @@ -765,6 +789,7 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj) return -EINVAL; obj->state = KLP_DISABLED; + obj->mod = NULL; klp_find_object_module(obj); @@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action, continue; if (action == MODULE_STATE_COMING) { + + /* +* Check for a small window where the register +* path already initialized the object. +*/ + if (obj->mod) + continue; + obj->mod = mod; klp_module_notify_coming(patch, obj); - } else /* MODULE_STATE_GOING */ + } else { + /* MODULE_STATE_GOING */ + +
[PATCH 2/2] livepatch: fix patched module loading race
It's possible for klp_register_patch() to see a module before the COMING notifier is called, or after the GOING notifier is called. That can cause all kinds of ugly races. As Pter Mladek reported: The problem is that we do not keep the klp_mutex lock all the time when the module is being added or removed. First, the module is visible even before ftrace is ready. If we enable a patch in this time frame, adding ftrace ops will fail and the patch will get rejected just because bad timing. Second, if we are lucky and enable the patch for the coming module when the ftrace is ready but before the module notifier has been called. The notifier will try to enable the patch as well. It will detect that it is already patched, return error, and the module will get rejected just because bad timing. The more serious problem is that it will not call the notifier for going module, so that the mess will stay there and we wont be able to load the module later. Third, similar problems are there for going module. If a patch is enabled after the notifier finishes but before the module is removed from the list of modules, the new patch will be applied to the module. The module might disappear at anytime when the patch enabling is in progress, so there might be an access out of memory. Or the whole patch might be applied and some mess will be left, so it will not be possible to load/patch the module again. Fix these races by letting the first one who sees the module do the needed work. Reported-by: Petr Mladek pmla...@suse.cz Signed-off-by: Josh Poimboeuf jpoim...@redhat.com --- kernel/livepatch/core.c | 53 + 1 file changed, 49 insertions(+), 4 deletions(-) diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index a74e4e8..19a758c 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -89,16 +89,42 @@ static bool klp_is_object_loaded(struct klp_object *obj) /* sets obj-mod if object is not vmlinux and module is found */ static void klp_find_object_module(struct klp_object *obj) { + struct module *mod; + if (!klp_is_module(obj)) return; mutex_lock(module_mutex); + /* * We don't need to take a reference on the module here because we have * the klp_mutex, which is also taken by the module notifier. This * prevents any module from unloading until we release the klp_mutex. */ - obj-mod = find_module(obj-name); + mod = find_module(obj-name); + + /* +* Be careful here to prevent races with the notifier: +* +* - MODULE_STATE_COMING: This may be before or after the notifier gets +* called. If after, the notifier didn't see the object anyway +* because it hadn't been added to the patch list yet. Either way, +* ftrace is already initialized, so it's safe to just go ahead and +* initialize the object here. +* +* - MODULE_STATE_GOING: Similar to MODULE_STATE_COMING, this may be +* before or after the notifier gets called. If after, the notifer +* didn't see the object anyway. Either way it's safe to just +* pretend that it's already gone and leave obj-mod at NULL. +* +* MODULE_STATE_LIVE: The common case. The module already finished +* loading. Just like with MODULE_STATE_COMING, we know the notifier +* didn't see the object, so we init it here. +*/ + if (mod (mod-state == MODULE_STATE_COMING || + mod-state == MODULE_STATE_LIVE)) + obj-mod = mod; + mutex_unlock(module_mutex); } @@ -695,8 +721,6 @@ static void klp_free_object_loaded(struct klp_object *obj) { struct klp_func *func; - obj-mod = NULL; - for (func = obj-funcs; func-old_name; func++) func-old_addr = 0; } @@ -765,6 +789,7 @@ static int klp_init_object(struct klp_patch *patch, struct klp_object *obj) return -EINVAL; obj-state = KLP_DISABLED; + obj-mod = NULL; klp_find_object_module(obj); @@ -965,10 +990,30 @@ static int klp_module_notify(struct notifier_block *nb, unsigned long action, continue; if (action == MODULE_STATE_COMING) { + + /* +* Check for a small window where the register +* path already initialized the object. +*/ + if (obj-mod) + continue; + obj-mod = mod; klp_module_notify_coming(patch, obj); - } else /* MODULE_STATE_GOING */ + } else { + /* MODULE_STATE_GOING