I will try to produce a small patch for the main issue (masking filenames 
when listing archive contents). My back-of-the-napkin idea is simply to replace 
the following bytes with "?" when printing them via the header list function:

0x08 (backspace)
0x09 (htab)
0x0a (line feed)
0x0b (vtab)
0x0c (form feed)
0x0d (carriage return)
0x1b (escape)
0x7f (del)

That should solve the problem without breaking multi-byte charsets, as far as I 
can tell.
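
As a rough illustration of the masking described above, here is a minimal 
sketch in C. It is not the actual patch: the names is_masked_byte and 
mask_control_bytes are made up for this example and are not existing busybox 
functions, and it assumes the name is available as a writable buffer.

```c
#include <stddef.h>

/* Hypothetical helper (not busybox code): nonzero for the bytes listed
 * above, i.e. 0x08..0x0d, 0x1b (escape) and 0x7f (del). */
static int is_masked_byte(unsigned char c)
{
	return (c >= 0x08 && c <= 0x0d) || c == 0x1b || c == 0x7f;
}

/* Replace each listed control byte in buf[0..len) with '?', in place.
 * Bytes >= 0x80 are never touched, so multi-byte sequences such as
 * UTF-8 pass through unchanged. */
static void mask_control_bytes(char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (is_masked_byte((unsigned char)buf[i]))
			buf[i] = '?';
	}
}
```

Calling mask_control_bytes() on a copy of the name just before it is printed 
would leave the name used for extraction untouched.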

PS: apologies for the top-posting; I hate Outlook!

From: busybox <busybox-boun...@busybox.net> on behalf of Kang-Che Sung 
<explore...@gmail.com>
Date: Thursday, 4 July 2024 at 02:20
To: Kang-Che Sung <explore...@gmail.com>, busybox@busybox.net 
<busybox@busybox.net>
Subject: [EXTERNAL] Re: Re busybox tar hidden filename exploit

(Sorry if my mail client messes up with the quoting. I use the mobile web 
version of Gmail.)

Steffen Nurpmeso <stef...@sdaoden.eu> wrote on Thursday, 4 July 2024:
>  |Kang-Che Sung wrote in
>  |
>  |Just FYI, there is a portable alternative to the $'' (dollar-single-quote)
>  |of passing special characters in the shell. It's $(printf '...') with
>  |command substitution.
>
> You mean the %q format?  That is not standardized.
>
>    %q     ARGUMENT is printed in a format that can be reused as  shell  in-
>           put,  escaping  non-printable  characters with the proposed POSIX
>           $'' syntax.
>
> Just like bash(1)s ${parameter@operator}:
>
>     Q      The expansion is a string that is the value  of  parameter
>            quoted in a format that can be reused as input.

I don't expect quoted, shell-escaped filename output to be reused as input. 
Such quoting and escaping may be useful for filtering well-known problematic 
characters (shell meta-characters, quotation marks, etc.), but it can never 
completely mitigate every possible attack using Unicode characters.

That's why I mentioned two use cases, and made them distinct. You can't win 
both.

> Well one could look for isatty(3) for example.
> Things are easier if you also know you are in a Unicode-aware
> environment, then you can simply add U+2400 aka do
>
>      if(!iswprint(wc) && wc != '\n' /*&& wc != '\r' && wc != '\b'*/ &&
>            wc != '\t'){
>         if ((wc & ~S(wchar_t,037)) == 0)
>            wc = isuni ? 0x2400 | wc : '?';
>         else if(wc == 0177)
>            wc = isuni ? 0x2421 : '?';
>         else
>            wc = isuni ? 0x2426 : '?';
>
> but in other cases have to be aware of L-TO-R and R-TO-R marks,
> zero width and non-characters, ie most brutal (where isuni tells
> us that the character set aka wchar_t is real Unicode).
>
>        }else if(isuni){ /* TODO ctext */
>           /* Need to filter out L-TO-R and R-TO-R marks TODO ctext */
>           if(wc == 0x200E || wc == 0x200F || (wc >= 0x202A && wc <= 0x202E))
>              continue;
>           /* And some zero-width messes */
>           if(wc == 0x00AD || (wc >= 0x200B && wc <= 0x200D))
>              continue;
>           /* Oh about the ISO C wide character interfaces, baby! */
>           if(wc == 0xFEFF)
>              continue;
>        }

This is the second use case I mentioned, that is, `--quoting-style=whatever`. 
We could make this the default when `stdout` is a terminal; I believe GNU 
utilities already do this.
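
To make the terminal-default idea concrete, here is a minimal sketch in C of 
the decision only. The helper name should_mask_names and its force parameter 
are hypothetical, invented for this example; the only real API used is 
isatty(3), as suggested in the quoted text.

```c
#include <unistd.h>

/* Hypothetical policy helper (not busybox code): decide whether names
 * written to fd should be masked or quoted.
 *   force > 0  always mask (an explicit option was given)
 *   force < 0  never mask
 *   force == 0 mask only when fd refers to a terminal */
static int should_mask_names(int fd, int force)
{
	if (force != 0)
		return force > 0;
	return isatty(fd) == 1;
}
```

With force == 0, output piped to another program stays byte-exact, while 
output shown on a terminal is filtered.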