On 11/24/20 12:35 PM, Jazzoo Watchman wrote: > Hello all, > Hope this finds you well. > > Trying to run config2help it segfaults.
It doesn't for me, which means I can't reproduce this, which means I won't know if I've fixed it. You don't mention how you did it. Which distro? Which compiler version? Which build flags? I'm guessing you're using some kind of page poisoning? > Pulling the source from Toybox and building with > gcc -0 -g -Wall > > That too segfaults so I think I have the latest version (the source seems to > be > mostly contained in a single file scripts/config2help.c). > > Google(ing) turned up many hits of fairly recent (circa 2018+) conversations > on > or about this same complaint. > > Digging into the logic of the code, it seems there may be some cruft/age to > the > logic and some mishandling of pointer use AFTER free (as shown by valgrind). > In It's more that it's a quick and dirty build time tool that works on known inputs, so I got it working and then didn't care. The only way it could be "exploitable" is somehow tricking someone into running the build with different inputs...? The comment at the top of mkflags.c is a bit more explicit: // This is intentionally crappy code because we control the inputs. It leaks // memory like a sieve and segfaults if malloc returns null, but does the job. I have the following diff in my tree as a todo: --- a/scripts/config2help.c +++ b/scripts/config2help.c @@ -146,7 +146,8 @@ char *dlist_zap(struct double_list **help) struct double_list *dd = dlist_pop(help); char *s = dd->data; - free(dd); + memset(dd, 0xaa, sizeof(struct double_list)); +// free(dd); return s; } Which might or might not be related to what you're seeing because you didn't tell me how to reproduce what you're seeing. To be honest, the "fix" I'd do is just remove the free and leak the data because again: quick and dirty build tool working on known input? generated/flags.raw is like 10k of data, it can't leak enough to matter over the lifetime of the tool's run. (The only reason I have the patch is in case I got around to doing a better fix than that.) > "some" aforementioned Google(ing) complaints seemed so stem from an update of > their glibc. I can attest to the same as my builds were going smoothly > until I > did an apt update; apt upgrade . Ok, so you're using a debian derivative of some sort. This is the first information you've given me about your build environment. I'm guessing you "upgraded" gcc (or llvm?) to a newer version that enables an address sanitization flag by default? But that's just a guess because you didn't SAY. > My point. > > 1.) > Is the latest source (hopefully) with a fix for the "use after free" segfault > in > "https://github.com/landley/toybox/blob/master/scripts/config2help.c" Given that you apparently haven't built that, now I have to ask "what version of toybox _did_ you build"? I don't know what you built, _and_ I don't know how you built it. > 2.) > Following the code in 1.) it seems that some of the logic may be more complex > than the input would suggest. It turns out config2help.c did have a reason to exist, yes. > The code inspecting the generated "sym" list while looping through the entries > in ".config" and setting sym->enabled ... indicates to me that ".config" > controls which "processed" "symbols" get spit-out as "#defined HELPXXX". > > Is that true and or intended? The config controls what output is produced. More output has been added over time to a loop that already had selection logic in it. As I said: quick and dirty build tool, which has had its design goals shift out from under it a bit over the years. I have a post-1.0 todo item to rewrite kconfig (so I'm not using a stale snapshot of gplv2 kernel code), and as part of that all this build infrastructure might change, or go away and get completely redone. Dunno yet, haven't designed the replacement. > The flow seems to indicate that multiple "symbols" of the same "name" may/do > exist over multiple Config.in (and "sourced") files; and "duplicates" > need/should to be "collated". The nature of kconfig is there's dependency information: symbols depend on other symbols. I repurposed kconfig's compile-time help text to provide each command's --help output, which requires the help text of dependent symbols to be collated into a single help block with the main command symbol's help text (the one that controls whether or not a command by that name gets installed), and THAT requires the help text to have a common structure sections can be parsed out and processed differently. We used to make much heavier use of this (like busybox does), but these days I've ripped out most command sub-options, with only a few like CONFIG_LSPCI_TEXT remaining. If I unify all of them like that, then most of the help text processing can go away. But I haven't done that yet because it has side effects I need to think through. We might still need the FLAG_ enable and disable logic to let the dead code elimination part of the compiler's optimizer shrink commands in certain configurations, but at the same time without command options that would only apply to multiple commands in the same command.c file, and most of the time those FORCE_FLAGS to disable that behavior when it would apply (for reasons probably explained in code.html?) So MAYBE that goes too...? Given that LLVM also has __COUNTER__ possibly mkflags.c isn't needed at all? Except could that work with the re-include of flags.h to CLEANUP_xxx ? In theory I can snapshot the current value of __COUNTER__ and then subtract to get "how many times since last reset", and MAYBE grep/sed can chop out the optstr and expand it, but is that really cleaner...? tl;dr: I used to use certain techniques a lot, these days I don't and have removed most of the old instances of them, but there's still a few I haven't chased down. If I DO chase them down, I need to do a design and analysis pass to figure out what support infrastructure would still be needed, and I haven't yet. I've been busy with other things, and it's overshadowed by the need to write a new menuconfig from scratch. And none of this is pressing because the old one works for now and I have higher priority todo items. > In practice; I haven't seen "duplicate" "symbols" > and wonder if mashing "duplicates" together wouldn't lead to probable > error/nonsense. CONFIG_LSPCI_TEXT has the help text: usage: lspci [-n] [-i FILE ] -n Numeric output (repeat for readable and numeric) -i PCI ID database (default /usr/share/misc/pci.ids) Which needs to be spliced into the CONFIG_LSPCI help text. That sort of thing used to be a lot more common. I haven't yet decided to completely eliminate it (permanently hardwiring all such options on), and worked out what subset of config2help.c would still be needed if I did, because I'm busy with things like trying to finish toys/*/sh.c > 3.) > If the above are/is true; would a rewrite with less "dependencies" and a > "clean" > valgrind run be at-all welcome? I haven't done a new design pass on the build infrastructure, in part because deciding what's stale and what isn't requires janitorial work on the rest of the tree to make sure it's no longer used (eliminating the last few users), and then work out the best way to do what's left, which isn't necessarily a subset of the existing code but may be something entirely different. For one thing, I'm most of the way through writing a new shell with known behavior which I could then rely on to have things like "wait -n" so scripts/make.sh could more reliably keep multiple processors active instead of using the "portable" (to old bash versions at least) wait for specific $PID version which isn't necessarily the nex tone to finish because gcc instances take varying amounts of time to run depending on what .c file they were given... which means maybe I could write a config2help.c replacement as a shell function using variable arrays, I don't know yet? You seem to be volunteering to understand what toybox needs better than I do. > In closing, am I just lost in the woods? If so, could someone spin me around > and nudge me in the correct direction? You had a build break. You didn't tell me how to reproduce the build break you're seeing so I can tell whether or not I fixed it. That APPEARS to be the problem at hand. One quick and dirty fix to your build break is not to run address sanitization on build-time tools. (HOSTCC vs target CC.) Alternately, I can remove the free() so it just leaks that field, because it's just a compile time tool working on 10k of known input. Rob _______________________________________________ Toybox mailing list Toybox@lists.landley.net http://lists.landley.net/listinfo.cgi/toybox-landley.net