On Sat, Sep 05, 2015 at 01:26:18PM +0200, Stefan Sperling wrote:
> 
> > +static u_int32_t
> > +decode_utf8(const char *in, const char **nextc, int *had_error)
> > +{
> 
> Please make sure this function performs the same validation checks
> as src/lib/libc/citrus/citrus_utf8.c:_citrus_utf8_ctype_mbrtowc() 
> 
> And if that libc function is missing checks you're doing here we should
> discuss about aligning the two.
> 

The libc function is missing a check.

RFC 3629 limits the valid range of code points to 0x10FFFF:
https://tools.ietf.org/html/rfc3629#page-10

Currently, when a C string containing the bytes "f7 bf bf bf" is passed
to mbrtowc(3) [with a UTF-8 locale], the function returns 4 and produces
a wchar_t beyond the 0x10FFFF limit.
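
To show why those four bytes land outside the limit, here is a minimal
sketch that combines a 4-byte sequence by hand. It is not the libc code:
combine4() and in_unicode_range() are hypothetical helpers written for
this mail, and the sketch deliberately ignores overlong encodings and
surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from citrus_utf8.c: combine a 4-byte
 * UTF-8 sequence (lead byte 11110xxx followed by three 10xxxxxx bytes)
 * into a code point.  Returns UINT32_MAX if the bit patterns are wrong.
 * No check for overlong forms is done here.
 */
static uint32_t
combine4(const unsigned char *s)
{
	uint32_t cp;
	int i;

	if ((s[0] & 0xf8) != 0xf0)
		return UINT32_MAX;
	cp = s[0] & 0x07;		/* 3 payload bits from the lead byte */
	for (i = 1; i < 4; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return UINT32_MAX;
		cp = (cp << 6) | (s[i] & 0x3f);	/* 6 payload bits each */
	}
	return cp;
}

/* The range check under discussion: RFC 3629 stops at 0x10FFFF. */
static int
in_unicode_range(uint32_t cp)
{
	return cp <= 0x10ffff;
}

/* Small self-check of the three interesting inputs. */
static void
check_examples(void)
{
	const unsigned char too_big[] = { 0xf7, 0xbf, 0xbf, 0xbf };
	const unsigned char last_ok[] = { 0xf4, 0x8f, 0xbf, 0xbf };
	const unsigned char first_bad[] = { 0xf4, 0x90, 0x80, 0x80 };

	assert(combine4(too_big) == 0x1fffff);
	assert(!in_unicode_range(combine4(too_big)));
	assert(combine4(last_ok) == 0x10ffff);
	assert(in_unicode_range(combine4(last_ok)));
	assert(combine4(first_bad) == 0x110000);
	assert(!in_unicode_range(combine4(first_bad)));
}
```

A 4-byte sequence carries 21 payload bits, so the bit pattern alone
allows values up to 0x1FFFFF; an explicit comparison is needed to stop
at 0x10FFFF.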

With the following patch, the limit is checked and such input is
rejected as invalid (EILSEQ).

Comments? OK?
-- 
Sebastien Marie

Index: citrus_utf8.c
===================================================================
RCS file: /cvs/src/lib/libc/citrus/citrus_utf8.c,v
retrieving revision 1.8
diff -u -p -r1.8 citrus_utf8.c
--- citrus_utf8.c       16 Jan 2015 16:48:51 -0000      1.8
+++ citrus_utf8.c       5 Sep 2015 12:26:50 -0000
@@ -169,6 +169,13 @@ _citrus_utf8_ctype_mbrtowc(wchar_t * __r
                errno = EILSEQ;
                return ((size_t)-1);
        }
+       if (wch > 0x10ffff) {
+               /*
+                * Malformed input; invalid code points.
+                */
+               errno = EILSEQ;
+               return ((size_t)-1);
+       }
        if (pwc != NULL)
                *pwc = wch;
        us->want = 0;
