Re: [RFC 10/10] kmod: add a sanity check on module loading
"Luis R. Rodriguez" writes: >> Maybe a similar hack for try_then_request_module(), but many places seem >> to open-code request_module() so it's not as trivial... Hi Luis, Jessica (who is the main module maintainer now), Back from break, sorry about delay. > Right, out of ~350 request_module() calls (not included try requests) > only ~46 check the return value. Hence a validation check, and come to > think of it, *this* was the issue that originally had me believing > that in some places we might end up in a null deref --if those open > coded request_module() calls assume the driver is loaded there could > be many places where a NULL is inevitable. Yes, assuming success == module loade is simply a bug. I wrote try_then_request_module() to attempt to encapsulate the correct logic into a single place; maybe we need other helpers to cover (most of?) the remaining cases? > Granted, I agree they > should be fixed, we could add a grammar rule to start nagging at > driver developers for started, but it does beg the question also of > what a tightly knit validation for modprobe might look like, and hence > this patch and now the completed not-yet-posted alias work. I really think aliases-in-kernel is too heavy a hammer, but a warning when modprobe "succeeds" and the module still isn't found would be a Good Thing. > Would it be worthy as a kconfig kmod debugging aide for now? I can > follow up with a semantic patch to nag about checking the return value > of request_module(), and we can have 0-day then also complain about > new invalid uses. Yeah, a warning about this would be win for sure. BTW, I wrote the original "check-for-module-before-loading" in module-init-tools, but I'm starting to wonder if it was a premature optimization. Have you thought about simply removing it and always trying to load the module? If it doesn't slow things down, perhaps simplicity FTW? Thanks, Rusty. -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 10/10] kmod: add a sanity check on module loading
"Luis R. Rodriguez" writes: > OK -- if userspace messes up again it may be a bit hard to prove > unless we have a validation debug thing in place, would such a thing > in debug form be reasonable ? That makes perfect sense. Untested hack: diff --git a/fs/filesystems.c b/fs/filesystems.c index c5618db110be..e5c90e80c7d3 100644 --- a/fs/filesystems.c +++ b/fs/filesystems.c @@ -275,9 +275,10 @@ struct file_system_type *get_fs_type(const char *name) int len = dot ? dot - name : strlen(name); fs = __get_fs_type(name, len); - if (!fs && (request_module("fs-%.*s", len, name) == 0)) + if (!fs && (request_module("fs-%.*s", len, name) == 0)) { fs = __get_fs_type(name, len); - + WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name); + } if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) { put_filesystem(fs); fs = NULL; Maybe a similar hack for try_then_request_module(), but many places seem to open-code request_module() so it's not as trivial... Cheers, Rusty. -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 10/10] kmod: add a sanity check on module loading
"Luis R. Rodriguez" writes: > On Dec 16, 2016 9:54 PM, "Rusty Russell" wrote: > > AFAICT the mistake here is that kmod is returning "done, OK" when the > > module it is trying to load is already loading (but not finished > > loading). That's the root problem; it's an attempt at optimization by > > kmod which goes awry. > > This is true! To be precise though the truth of the matter is that kmod'd > respective usermode helper: modprobe can be buggy and may lie to us. It may > allow request_module() to return 0 but since we don't validate it, any > assumption we make can be deadly. In the case of get_fs_type() its a null > dereference. Wait, what?? I can't see that in get_fs_type, which hasn't changed since 2013. If a caller is assuming get_fs_type() doesn't return NULL, they're broken and need fixing of course: struct file_system_type *get_fs_type(const char *name) { struct file_system_type *fs; const char *dot = strchr(name, '.'); int len = dot ? dot - name : strlen(name); fs = __get_fs_type(name, len); if (!fs && (request_module("fs-%.*s", len, name) == 0)) fs = __get_fs_type(name, len); if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) { put_filesystem(fs); fs = NULL; } return fs; } Where does this NULL-deref is the module isn't correctly loaded? > *Iff* we want a sanity check to verify kmod's umh is not lying to us we > need to verify after 0 was returned that it was not lying to us. Since kmod > accepts aliases but find_modules_all() only works on the real module name a > validation check cannot happen when all you have are aliases. request_module() should block until resolution, but that's fundamentally a userspace problem. Let's not paper over it in kernelspace. > *Iff* we are sure we don't want a validation (or another earlier > optimization to avoid calling out to modrobe if the alias requested is > already present, which does the time shaving I mentioned on the tests) then > naturally no request_module() calls returning 0 can assert information > about the requested module. I think we might need to change more code if we > accept we cannot trust request_module() calls, or we accept userspace > telling the kernel something may mean we sometimes crash. This later > predicament seems rather odd so hence the patch. > > Perhaps in some cases validation of work from a umh is not critical in > kernel but for request_module() I can tell you that today get_fs_type code > currently asserts the module found can never be NULL. OK, what am I missing in the code above? > > Looking at the code in the kernel, we *already* get this right: block if > > a module is still loading anyway. Once it succeeds we return -EBUSY if > > > > it fails we'll proceed to try to load it again. > > > > I don't understand what you're trying to fix with adding aliases > > in-kernel? > > Two fold now: > > a) validation on request_module() work when an alias is used But why? > b) since kmod accepts aliaes, if we get aliases support, it means we could > *also* preemptively avoid calling out to userspace for modules already > present. No, because once we have a module we don't request it: requesting is the fallback case. > >> FWIW a few things did occur to me: > >> > >> a) list_add_rcu() is used so new modules get added first > > > > Only after we're sure that there are no duplicates. > > > > > OK! This is a very critical assertion. I should be able to add a debug > WARN_ON() should two modules be on the modules list for the same module > then ? Yes, names must be unique. >> b) find_module_all() returns the last module which was added as it > traverses >>the module list > >> BTW should find_module_all() use rcu to traverse? > > Yes; the kallsyms code does this on Oops. Not really a big issue in > practice, but a nice fix. > > Ok, will bundle into my queue. Please submit to Jessica for her module queue, as it's orthogonal AFAICT. > I will note though that I still think there's a bug in this code -- > upon a failure other "spinning" requests can fail, I believe this may > be due to not having another state or informing pending modules too > early of a failure but I haven't been able to prove this conjecture > yet. That's possible, but I can't see it from quickly re-checking the code. The module should be fully usable
Re: [RFC 10/10] kmod: add a sanity check on module loading
"Luis R. Rodriguez" writes: > On Thu, Dec 15, 2016 at 10:57:42AM +1030, Rusty Russell wrote: >> "Luis R. Rodriguez" writes: >> > kmod has an optimization in place whereby if a some kernel code >> > uses request_module() on a module already loaded we never bother >> > userspace as the module already is loaded. This is not true for >> > get_fs_type() though as it uses aliases. >> >> Well, the obvious thing to do here is block kmod if we're currently >> loading the same module. > > OK thanks, I've now added this, it sure helps. Test cases 0008 and 0009 > require > hammering on the test over and over to see a failure on vanilla kernels, > an upper bound I found was about 150 times each test. Running test 0008 > 150 times with this enhancement you mentioned shaves off ~4 seconds. > For test 0009 it shaves off ~16 seconds, but as I note below the alias support > was needed as well. > >> Otherwise it has to do some weird spinning >> thing in userspace anyway. > > Right, but note that the get_fs_type() tests would still fail given > module.c was not alias-aware yet. AFAICT the mistake here is that kmod is returning "done, OK" when the module it is trying to load is already loading (but not finished loading). That's the root problem; it's an attempt at optimization by kmod which goes awry. Looking at the code in the kernel, we *already* get this right: block if a module is still loading anyway. Once it succeeds we return -EBUSY; if it fails we'll proceed to try to load it again. I don't understand what you're trying to fix with adding aliases in-kernel? > FWIW a few things did occur to me: > > a) list_add_rcu() is used so new modules get added first Only after we're sure that there are no duplicates. > b) find_module_all() returns the last module which was added as it traverses >the module list > BTW should find_module_all() use rcu to traverse? Yes; the kallsyms code does this on Oops. Not really a big issue in practice, but a nice fix. Thanks, Rusty. > > --- a/kernel/module.c > +++ b/kernel/module.c > @@ -594,7 +594,7 @@ static struct module *find_module_all(const char *name, > size_t len, > > module_assert_mutex_or_preempt(); > > - list_for_each_entry(mod, &modules, list) { > + list_for_each_entry_rcu(mod, &modules, list) { > if (!even_unformed && mod->state == MODULE_STATE_UNFORMED) > continue; > if (strlen(mod->name) == len && !memcmp(mod->name, name, len)) > @@ -3532,7 +3532,7 @@ static int add_unformed_module(struct module *mod) > goto out; > } > mod_update_bounds(mod); > - list_add_rcu(&mod->list, &modules); > + list_add_tail_rcu(&mod->list, &modules); > mod_tree_insert(mod); > err = 0; > -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 10/10] kmod: add a sanity check on module loading
"Luis R. Rodriguez" writes: > kmod has an optimization in place whereby if a some kernel code > uses request_module() on a module already loaded we never bother > userspace as the module already is loaded. This is not true for > get_fs_type() though as it uses aliases. Well, the obvious thing to do here is block kmod if we're currently loading the same module. Otherwise it has to do some weird spinning thing in userspace anyway. We already have module_wq for this, we just need a bit more code to share the return value; and there's a weird corner case there where we have "modprobe foo param=invalid" then "modprobe foo param=valid" and we fail both with -EINVAL, but it's probably not worth fixing. Cheers, Rusty. -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5] Add kernel parameter to blacklist modules
Prarit Bhargava writes: > Rusty, this looks like it will work. I tested this by adding Applied. Hey, we got there in the end! Thanks for your patience, Rusty. -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] Add kernel parameter to blacklist modules
Prarit Bhargava writes: > Blacklisting a module in linux has long been a problem. The current > procedure is to use rd.blacklist=module_name, however, that doesn't > cover the case after the initramfs and before a boot prompt (where one > is supposed to use /etc/modprobe.d/blacklist.conf to blacklist > runtime loading). Using rd.shell to get an early prompt is hit-or-miss, > and doesn't cover all situations AFAICT. > > This patch adds this functionality of permanently blacklisting a module > by its name via the kernel parameter module_blacklist=module_name. > > [v2]: Rusty, use core_param() instead of __setup() which simplifies > things. > > [v3]: Rusty, undo wreckage from strsep() There's no locking here, meaning that something can easily slip through the blacklist. This time for sure! (UNTESTED!) static bool blacklisted(char *module_name) { const char *p; size_t len; if (!module_blacklist) return false; for (p = module_blacklist; *p; p += len) { len = strcspn(p, ","); if (strlen(module_name) == len && !memcmp(module_name, p, len)) return true; if (p[len] == ',') len++; } return false; } Thanks, Rusty. > > Signed-off-by: Prarit Bhargava > Cc: Jonathan Corbet > Cc: Rusty Russell > Cc: linux-doc@vger.kernel.org > --- > Documentation/kernel-parameters.txt |3 +++ > kernel/module.c | 29 + > 2 files changed, 32 insertions(+) > > diff --git a/Documentation/kernel-parameters.txt > b/Documentation/kernel-parameters.txt > index 82b42c958d1c..c720b96f2efc 100644 > --- a/Documentation/kernel-parameters.txt > +++ b/Documentation/kernel-parameters.txt > @@ -2295,6 +2295,9 @@ bytes respectively. Such letter suffixes can also be > entirely omitted. > Note that if CONFIG_MODULE_SIG_FORCE is set, that > is always true, so this option does nothing. > > + module_blacklist= [KNL] Do not load a comma-separated list of > + modules. Useful for debugging problem modules. > + > mousedev.tap_time= > [MOUSE] Maximum time between finger touching and > leaving touchpad surface for touch to be considered > diff --git a/kernel/module.c b/kernel/module.c > index 5f71aa63ed2a..5240da88af79 100644 > --- a/kernel/module.c > +++ b/kernel/module.c > @@ -3155,6 +3155,32 @@ int __weak module_frob_arch_sections(Elf_Ehdr *hdr, > return 0; > } > > +/* module_blacklist is a comma-separated list of module names */ > +static char *module_blacklist; > +static bool blacklisted(char *module_name) > +{ > + char *str, *entry; > + > + if (!module_blacklist) > + return false; > + > + str = module_blacklist; > + do { > + if (str != module_blacklist) > + module_blacklist[strlen(str) - 1] = ','; > + entry = strsep(&str, ","); > + if (!strcmp(module_name, entry)) { > + pr_info("module %s is blacklisted\n", module_name); > + if (str != module_blacklist) > + module_blacklist[strlen(str) - 1] = ','; > + return true; > + } > + } while (str); > + > + return false; > +} > +core_param(module_blacklist, module_blacklist, charp, 0400); > + > static struct module *layout_and_allocate(struct load_info *info, int flags) > { > /* Module within temporary copy. */ > @@ -3165,6 +3191,9 @@ static struct module *layout_and_allocate(struct > load_info *info, int flags) > if (IS_ERR(mod)) > return mod; > > + if (blacklisted(mod->name)) > + return ERR_PTR(-EPERM); > + > err = check_modinfo(mod, info, flags); > if (err) > return ERR_PTR(err); > -- > 1.7.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Add kernel parameter to blacklist modules
Prarit Bhargava writes: > Blacklisting a module in linux has long been a problem. The current > procedure is to use rd.blacklist=module_name, however, that doesn't > cover the case after the initramfs and before a boot prompt (where one > is supposed to use /etc/modprobe.d/blacklist.conf to blacklist > runtime loading). Using rd.shell to get an early prompt is hit-or-miss, > and doesn't cover all situations AFAICT. > > This patch adds this functionality of permanently blacklisting a module > by its name via the kernel parameter module_blacklist=module_name. > > [v2]: Rusty, use core_param() instead of __setup(), and drop struct which > simplifies things. > > Signed-off-by: Prarit Bhargava > Cc: Jonathan Corbet > Cc: Rusty Russell > Cc: linux-doc@vger.kernel.org > --- > Documentation/kernel-parameters.txt |3 +++ > kernel/module.c | 25 + > 2 files changed, 28 insertions(+) > > diff --git a/Documentation/kernel-parameters.txt > b/Documentation/kernel-parameters.txt > index 82b42c958d1c..c720b96f2efc 100644 > --- a/Documentation/kernel-parameters.txt > +++ b/Documentation/kernel-parameters.txt > @@ -2295,6 +2295,9 @@ bytes respectively. Such letter suffixes can also be > entirely omitted. > Note that if CONFIG_MODULE_SIG_FORCE is set, that > is always true, so this option does nothing. > > + module_blacklist= [KNL] Do not load a comma-separated list of > + modules. Useful for debugging problem modules. > + > mousedev.tap_time= > [MOUSE] Maximum time between finger touching and > leaving touchpad surface for touch to be considered > diff --git a/kernel/module.c b/kernel/module.c > index 5f71aa63ed2a..5ff5287b19a8 100644 > --- a/kernel/module.c > +++ b/kernel/module.c > @@ -3155,6 +3155,28 @@ int __weak module_frob_arch_sections(Elf_Ehdr *hdr, > return 0; > } > > +/* module_blacklist is a comma-separated list of module names */ > +static char *module_blacklist; > +static bool blacklisted(char *module_name) > +{ > + char *str, *entry; > + > + if (!module_blacklist) > + return false; > + > + str = module_blacklist; > + do { > + entry = strsep(&str, ","); > + if (!strcmp(module_name, entry)) { > + pr_info("module %s is blacklisted\n", module_name); > + return true; > + } strsep mangles the string; this will only work once :) This is untested, and a little ugly: len = strlen(module_name); while ((p = strstr(p, module_name)) != NULL) { if ((p == module_blacklist || p[-1] == ',') && (p[len] == ',' || p[len] == '\0')) return true; p += len; } return false; Cheers, Rusty. -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Add kernel parameter to blacklist modules
Prarit Bhargava writes: > Blacklisting a module in linux has long been a problem. The process of > blacklisting a module has changed over time, and it seems that every OS > does it slightly differently and depends on the age of the init system > used on that OS. Hi Prarit, I don't mind the idea, but we can refine the implementation a little I think. 1) Use core_param(module_blacklist, module_blacklist, charp, 0400) instead of __setup(). I'd like to kill __setup(), really. 2) Simply search the blacklist string at load time, no new datastructures. Not completely sure about #2, but I think it'll be fairly neat. Thanks, Rusty. > The current Fedora/systemd procedure is to use rd.blacklist=module_name, > however, that doesn't cover the case after the initramfs and before a boot > prompt (where one is supposed to use /etc/modprobe.d/blacklist.conf to > blacklist runtime loading). Using rd.shell to get an early prompt is > hit-or-miss, and doesn't cover all situations AFAICT. Explaining all of this > to an end user isn't trivial. This becomes especially difficult when > attempting to blacklist a module during system install from RO media. > > This patch adds this functionality of blacklisting a module by its name > via the kernel parameter module_blacklist=module_name. The parameter > can also accept comma separated values. > > Usage example: > > [root@intel-mohonpeak-02 ~]# insmod /home/dummy-module/dummy-module.ko > insmod: ERROR: could not insert module > /home/rhel7/redhat/debug/dummy-module/dummy-module.ko: Operation not permitted > [root@intel-mohonpeak-02 ~]# dmesg | grep black > [0.00] Command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ > root=/dev/mapper/rhel_intel--mohonpeak--02-root ro crashkernel=auto > rd.lvm.lv=rhel_intel-mohonpeak-02/root rd.lvm.lv=rhel_intel-mohonpeak-02/swap > console=ttyS0,115200 LANG=en_US.UTF-8 module_blacklist=dummy_module > [0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+ > root=/dev/mapper/rhel_intel--mohonpeak--02-root ro crashkernel=auto > rd.lvm.lv=rhel_intel-mohonpeak-02/root rd.lvm.lv=rhel_intel-mohonpeak-02/swap > console=ttyS0,115200 LANG=en_US.UTF-8 module_blacklist=dummy_module > [0.00] blacklisting module dummy_module > [ 85.127433] module dummy_module has been blacklisted. This module will > not load. > > Signed-off-by: Prarit Bhargava > Cc: Jonathan Corbet > Cc: Rusty Russell > Cc: linux-doc@vger.kernel.org > --- > Documentation/kernel-parameters.txt |3 +++ > kernel/module.c | 37 > +++ > 2 files changed, 40 insertions(+) > > diff --git a/Documentation/kernel-parameters.txt > b/Documentation/kernel-parameters.txt > index 82b42c958d1c..c720b96f2efc 100644 > --- a/Documentation/kernel-parameters.txt > +++ b/Documentation/kernel-parameters.txt > @@ -2295,6 +2295,9 @@ bytes respectively. Such letter suffixes can also be > entirely omitted. > Note that if CONFIG_MODULE_SIG_FORCE is set, that > is always true, so this option does nothing. > > + module_blacklist= [KNL] Do not load a comma-separated list of > + modules. Useful for debugging problem modules. > + > mousedev.tap_time= > [MOUSE] Maximum time between finger touching and > leaving touchpad surface for touch to be considered > diff --git a/kernel/module.c b/kernel/module.c > index 5f71aa63ed2a..ed3076d98a32 100644 > --- a/kernel/module.c > +++ b/kernel/module.c > @@ -46,6 +46,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -3155,16 +3156,52 @@ int __weak module_frob_arch_sections(Elf_Ehdr *hdr, > return 0; > } > > +struct mod_blacklist_entry { > + struct list_head next; > + char *buf; > +}; > + > +static LIST_HEAD(blacklisted_modules); > +static int __init module_blacklist(char *str) > +{ > + char *str_entry; > + struct mod_blacklist_entry *entry; > + > + /* str argument is a comma-separated list of module names */ > + do { > + str_entry = strsep(&str, ","); > + if (str_entry) { > + pr_debug("blacklisting module %s\n", str_entry); > + entry = alloc_bootmem(sizeof(*entry)); > + entry->buf = alloc_bootmem(strlen(str_entry) + 1); > + strcpy(entry->buf, str_entry); > + list_add(&entry->next, &blacklisted_modules); > + } > + } while (str_entry); > + > + return 0; >
Re: [PATCH v6 2/5] module: preserve Elf information for livepatch modules
Jessica Yu writes: > For livepatch modules, copy Elf section, symbol, and string information > from the load_info struct in the module loader. Persist copies of the > original symbol table and string table. > > Livepatch manages its own relocation sections in order to reuse module > loader code to write relocations. Livepatch modules must preserve Elf > information such as section indices in order to apply livepatch relocation > sections using the module loader's apply_relocate_add() function. > > In order to apply livepatch relocation sections, livepatch modules must > keep a complete copy of their original symbol table in memory. Normally, a > stripped down copy of a module's symbol table (containing only "core" > symbols) is made available through module->core_symtab. But for livepatch > modules, the symbol table copied into memory on module load must be exactly > the same as the symbol table produced when the patch module was compiled. > This is because the relocations in each livepatch relocation section refer > to their respective symbols with their symbol indices, and the original > symbol indices (and thus the symtab ordering) must be preserved in order > for apply_relocate_add() to find the right symbol. > > Signed-off-by: Jessica Yu > Reviewed-by: Miroslav Benes > --- > include/linux/module.h | 25 ++ > kernel/module.c| 125 > +++++++-- > 2 files changed, 147 insertions(+), 3 deletions(-) Acked-by: Rusty Russell (I guess that takes precedence over "Reviewed-by", which I did too). Thanks, Rusty. > diff --git a/include/linux/module.h b/include/linux/module.h > index 2bb0c30..3daf2b3 100644 > --- a/include/linux/module.h > +++ b/include/linux/module.h > @@ -330,6 +330,15 @@ struct mod_kallsyms { > char *strtab; > }; > > +#ifdef CONFIG_LIVEPATCH > +struct klp_modinfo { > + Elf_Ehdr hdr; > + Elf_Shdr *sechdrs; > + char *secstrings; > + unsigned int symndx; > +}; > +#endif > + > struct module { > enum module_state state; > > @@ -456,7 +465,11 @@ struct module { > #endif > > #ifdef CONFIG_LIVEPATCH > + bool klp; /* Is this a livepatch module? */ > bool klp_alive; > + > + /* Elf information */ > + struct klp_modinfo *klp_info; > #endif > > #ifdef CONFIG_MODULE_UNLOAD > @@ -630,6 +643,18 @@ static inline bool module_requested_async_probing(struct > module *module) > return module && module->async_probe_requested; > } > > +#ifdef CONFIG_LIVEPATCH > +static inline bool is_livepatch_module(struct module *mod) > +{ > + return mod->klp; > +} > +#else /* !CONFIG_LIVEPATCH */ > +static inline bool is_livepatch_module(struct module *mod) > +{ > + return false; > +} > +#endif /* CONFIG_LIVEPATCH */ > + > #else /* !CONFIG_MODULES... */ > > /* Given an address, look for it in the exception tables. */ > diff --git a/kernel/module.c b/kernel/module.c > index 041200c..5f71aa6 100644 > --- a/kernel/module.c > +++ b/kernel/module.c > @@ -1973,6 +1973,83 @@ static void module_enable_nx(const struct module *mod) > { } > static void module_disable_nx(const struct module *mod) { } > #endif > > +#ifdef CONFIG_LIVEPATCH > +/* > + * Persist Elf information about a module. Copy the Elf header, > + * section header table, section string table, and symtab section > + * index from info to mod->klp_info. > + */ > +static int copy_module_elf(struct module *mod, struct load_info *info) > +{ > + unsigned int size, symndx; > + int ret; > + > + size = sizeof(*mod->klp_info); > + mod->klp_info = kmalloc(size, GFP_KERNEL); > + if (mod->klp_info == NULL) > + return -ENOMEM; > + > + /* Elf header */ > + size = sizeof(mod->klp_info->hdr); > + memcpy(&mod->klp_info->hdr, info->hdr, size); > + > + /* Elf section header table */ > + size = sizeof(*info->sechdrs) * info->hdr->e_shnum; > + mod->klp_info->sechdrs = kmalloc(size, GFP_KERNEL); > + if (mod->klp_info->sechdrs == NULL) { > + ret = -ENOMEM; > + goto free_info; > + } > + memcpy(mod->klp_info->sechdrs, info->sechdrs, size); > + > + /* Elf section name string table */ > + size = info->sechdrs[info->hdr->e_shstrndx].sh_size; > + mod->klp_info->secstrings = kmalloc(size, GFP_KERNEL); > + if (mod->klp_info->secstrings == NULL) { > + ret = -ENOMEM; > + goto free_sechdrs; > + } > + memcpy(mod->klp_inf
Re: [RFC PATCH v4 2/6] module: preserve Elf information for livepatch modules
Petr Mladek writes: > On Tue 2016-02-09 11:33:07, Jiri Kosina wrote: >> On Tue, 9 Feb 2016, Petr Mladek wrote: >> >> > > +#ifdef CONFIG_KALLSYMS >> > > +/* Make symtab and strtab available prior to module init call */ >> > > +mod->num_symtab = mod->core_num_syms; >> > > +mod->symtab = mod->core_symtab; >> > > +mod->strtab = mod->core_strtab; >> > > +#endif >> > >> > This should be done with module_mutex. Otherwise, it looks racy at least >> > against module_kallsyms_on_each_symbol(). >> >> Hmm, if this is the case, the comment describing the locking rules for >> module_mutex should be updated. >> >> > BTW: I wonder why even the original code is not racy for example against >> > module_get_kallsym. It is called without the mutex. This code sets the >> > number of entries before the pointer to the entries. >> > >> > Note that the module is in the list even in the UNFORMED state. >> >> module_kallsyms_on_each_symbol() gives up in such case though. > > The problem is that we set the three values above in the COMMING > state. They were even set in the LIVE state before this patch. True. Indeed, I have a fix for exactly this queued in my fixes tree, and am about to send Linus a pull req. Below is the fix I went with, for your reference (part of a series, but you get the idea). Thanks! Rusty. commit 8244062ef1e54502ef55f54cced659913f244c3e Author: Rusty Russell Date: Wed Feb 3 16:55:26 2016 +1030 modules: fix longstanding /proc/kallsyms vs module insertion race. For CONFIG_KALLSYMS, we keep two symbol tables and two string tables. There's one full copy, marked SHF_ALLOC and laid out at the end of the module's init section. There's also a cut-down version that only contains core symbols and strings, and lives in the module's core section. After module init (and before we free the module memory), we switch the mod->symtab, mod->num_symtab and mod->strtab to point to the core versions. We do this under the module_mutex. However, kallsyms doesn't take the module_mutex: it uses preempt_disable() and rcu tricks to walk through the modules, because it's used in the oops path. It's also used in /proc/kallsyms. There's nothing atomic about the change of these variables, so we can get the old (larger!) num_symtab and the new symtab pointer; in fact this is what I saw when trying to reproduce. By grouping these variables together, we can use a carefully-dereferenced pointer to ensure we always get one or the other (the free of the module init section is already done in an RCU callback, so that's safe). We allocate the init one at the end of the module init section, and keep the core one inside the struct module itself (it could also have been allocated at the end of the module core, but that's probably overkill). Reported-by: Weilong Chen Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541 Cc: sta...@kernel.org Signed-off-by: Rusty Russell diff --git a/include/linux/module.h b/include/linux/module.h index 4560d8f1545d..2bb0c3085706 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -324,6 +324,12 @@ struct module_layout { #define __module_layout_align #endif +struct mod_kallsyms { + Elf_Sym *symtab; + unsigned int num_symtab; + char *strtab; +}; + struct module { enum module_state state; @@ -405,15 +411,10 @@ struct module { #endif #ifdef CONFIG_KALLSYMS - /* -* We keep the symbol and string tables for kallsyms. -* The core_* fields below are temporary, loader-only (they -* could really be discarded after module init). -*/ - Elf_Sym *symtab, *core_symtab; - unsigned int num_symtab, core_num_syms; - char *strtab, *core_strtab; - + /* Protected by RCU and/or module_mutex: use rcu_dereference() */ + struct mod_kallsyms *kallsyms; + struct mod_kallsyms core_kallsyms; + /* Section attributes */ struct module_sect_attrs *sect_attrs; diff --git a/kernel/module.c b/kernel/module.c index 1e79d8157712..9537da37ce87 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -303,6 +303,9 @@ struct load_info { struct _ddebug *debug; unsigned int num_debug; bool sig_ok; +#ifdef CONFIG_KALLSYMS + unsigned long mod_kallsyms_init_off; +#endif struct { unsigned int sym, str, mod, vers, info, pcpu; } index; @@ -2480,10 +2483,21 @@ static void layout_symtab(struct module *mod, struct load_info *info) strsect->sh_flags |= SHF_ALLOC;