Hi,

Over on the dash mailing list, Christoph Anton Mitterer reported a segfault in dash when dealing with invalid substitutions: <https://lore.kernel.org/dash/fe9be702c676b43bba8e0136a94583f919a05a66.ca...@scientia.org/T/#t>

busybox ash being based on dash, despite the segmentation fault not triggering there with the original script, it can be triggered in busybox ash with a different script as well. I am reporting this here as requested in that thread. The below is with a build with 'make defconfig', except CONFIG_DEBUG=y and CONFIG_DEBUG_PESSIMIZE=y.

$ gdb --args ./busybox_unstripped sh -c 'f() { echo ${PWD-${PWD!}}; }; f'
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./busybox_unstripped...
(gdb) run
Starting program: /home/harald/busybox/busybox_unstripped sh -c f\(\)\ \{\ echo\ \$\{PWD-\$\{PWD\!\}\}\;\ \}\;\ f
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00348eab in argstr (p=0x1 <error: Cannot access memory at address 0x1>, flag=1089) at shell/ash.c:6819
6819                    if (*p == '~')
(gdb)

This happens because invalid substitutions (${PWD!}) are encoded using a null byte, but function definitions treat node text as C-style strings terminated by the first null byte, so we end up accessing the duplicated node text past the end of the buffer:

(gdb) b shell/ash.c:9148
Breakpoint 1 at 0x34ca03: file shell/ash.c, line 9148.
(gdb) run
Starting program: /home/harald/busybox/busybox_unstripped sh -c f\(\)\ \{\ echo\ \$\{PWD-\$\{PWD\!\}\}\;\ \}\;\ f
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, copynode (n=0x3923ec) at shell/ash.c:9148
9148                    new->narg.text = nodeckstrdup(n->narg.text);
(gdb) cont
Continuing.

Breakpoint 1, copynode (n=0x39240c) at shell/ash.c:9148
9148                    new->narg.text = nodeckstrdup(n->narg.text);
(gdb) p n->narg.text
$1 = 0x3923fc "\202\002PWD=\202"
(gdb) step
nodeckstrdup (s=0xffffd1c8 "") at shell/ash.c:9059
9059    {
(gdb) next
9060            funcstring_end -= SHELL_ALIGN(strlen(s) + 1);
(gdb)
9061            return strcpy(funcstring_end, s);
(gdb)
9062    }

Here, \202 is CTLVAR. We can see that n->narg.text ends in CTLVAR followed by a null byte, and it is copied using strlen() and strcpy(), so any bytes after that null byte will be left out.

There are two possible ways of fixing this, depending on the intended behaviour. Nothing has yet been said on the list to definitively know what the dash intended behaviour here is, but regardless, busybox may choose to act now.

1: If the intended behaviour is to raise an error:

--- a/shell/ash.c
+++ b/shell/ash.c
@@ -7465,9 +7465,6 @@ varvalue(char *name, int varflags, int flags, int quoted) int discard = (subtype == VSPLUS || subtype == VSLENGTH) | (flags & EXP_DISCARD);

        if (!subtype) {
-               if (discard)
-                       return -1;
-
                ifsfree();
                raise_error_syntax("bad substitution");
        }

This preserves the copynode() behaviour of cutting off the word, but it is okay as now a null byte is guaranteed to terminate the expansion.

2: If the intended behaviour is to ignore the invalid substitution as long as it is skipped:

--- a/shell/ash.c
+++ b/shell/ash.c
@@ -12981,6 +12981,8 @@ parsesub: {
                        synstack->dblquote = newsyn != BASESYNTAX;
                }

+               if (subtype == 0)
+                       subtype = VSNUL;
                ((unsigned char *)stackblock())[typeloc] = subtype;
                if (subtype != VSNORMAL) {
                        synstack->varnest++;

This encodes invalid substitutions using VSNUL, which when masked with VSTYPE will result in 0 like before, but does not result in all bits zero, so does not terminate the string.

I know my mail client will have mangled the formatting. Apologies for that.

I am not expecting either of these patches to be applied as is anyway, at this point I am more interested in getting the busybox views on whether either fix is wanted now already (before dash acts), and if so, which one. Especially the second one will likely have opportunities to clean up and reduce code size by making sure subtype is already set to VSNUL at this point, rather than 0, meaning it does not need to be patched up here.

Cheers,
Harald van Dijk
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox

Reply via email to