On Tue, Dec 5, 2023 at 9:43 AM Stefan Schiller via internals
<internals@lists.php.net> wrote:
>
> On Mon, Dec 4, 2023 at 8:45 PM Alex <alexinbeij...@gmail.com> wrote:
> >
> > Stefan,
> >
> >>
> >> My biggest concern is that this quirk can cause security issues in
> >> user code. I came across this in the first place when discovering an
> >> exploitable security vulnerability in an application. From my point of
> >> view, this is not only about inconsistent behavior but also violates
> >> the documentation for specific functions like mb_strstr. I agree that
> >> a lot of mbstring operations should not be used on invalid strings,
> >> and an exception seems to be an appropriate answer despite the huge BC
> >> impact.
> >
> >
> > 1) I'm sure the vulnerable application is proprietary, but can you at least 
> > tell us which mbstring functions were directly involved in this 
> > vulnerability, give an overview of the mechanism by which the exploit 
> > worked (scrubbed of any specific details about the application), and let us 
> > know the general nature of the resulting compromise? (i.e. 
> > denial-of-service, information disclosure, allowing users to impersonate 
> > other users, etc)
> >
> > Real-life examples of actual security impact provide far more compelling 
> > grounds for a BC break than hypothetical security impact.
>
> The case I am referring to involves mb_strpos, which is used to
> determine the index of a specific character, and mb_substr, which is
> used to extract the substring until the first occurrence of this
> character. For this specific case, this results in a Cross-Site
> Scripting (XSS) vulnerability, which would allow an attacker to
> perform actions on behalf of an authenticated user. The actual code is
> more complex, but the following lines should be a suitable
> representation of the issue:
>
> $data = $_GET['data'];
> $pos = mb_strpos($data, "<");
> $out = $pos ? mb_substr($data, 0, $pos) : $data;
> echo $out;
>
> The developer's assumption here is that $out should never contain an
> opening HTML tag, and it is thus safe to echo it back in the response.
> Despite the fact that e.g. htmlspecialchars should be used to encode
> the output, this assumption seems reasonable to me. However, due to
> the inconsistent behavior, an attacker can break this assumption. If
> $data, e.g., contains the sequence "\xf0AAA<b>", str_pos will consider
> the index of "<" to be 4. However, mb_substr assumes the full sequence
> to have a length of 4 (1 x 4-byte Unicode character + 3 x 1-byte
> Unicode characters). Accordingly, the full sequence, including the
> "<b>" tag, is reflected in the response.

This feels like an exact case of where mb_* functions shouldn't be
used. HTML spec very clearly states that only "<" (\x3C LESS-THAN
SIGN) is considered for html opening tags. I don't think there is a
reason to consider multibyte strings here.

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php

Reply via email to