On 01/28/2018 11:58 PM, Daniel Stenberg wrote:
On Sun, 28 Jan 2018, Daniel Stenberg wrote:

Yes, I noticed this too and it is highly annoying since it now keeps making our CI builds red. I tried to reproduce it on my Mac but for some reason it doesn't fail when I run it! :-/

I managed to trigger the bug on macOS now. It turns out ISALNUM() returns TRUE on macOS and FALSE on Linux when given the byte 0xc3, the first byte of 'ÿ' in UTF-8, as input.

Further, isalnum() apparently depends on the locale. Both my Linux and macOS machines use the en_US.UTF-8 locale. However, if I set "LANG=C" before I run unit1307, the test case no longer fails!
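
For anyone who wants to check this on their own machine, here is a rough sketch of a probe. The helper name isalnum_in_locale() is made up for this mail and is not part of curl. It switches LC_CTYPE to the named locale, asks isalnum() about one byte value, and restores the previous locale. What it reports for en_US.UTF-8 depends on the platform, which is exactly the problem described above.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not curl code): report what isalnum() says about
   byte value c under the named locale. Returns 1 or 0, or -1 if the
   locale is not available on this system. The previous LC_CTYPE locale
   is restored before returning. */
static int isalnum_in_locale(const char *name, int c)
{
  int result;
  char *saved = NULL;
  const char *cur = setlocale(LC_CTYPE, NULL); /* query only, no change */

  if(cur) {
    /* copy the name: the returned string may be overwritten later */
    saved = malloc(strlen(cur) + 1);
    if(saved)
      strcpy(saved, cur);
  }
  if(!setlocale(LC_CTYPE, name)) {
    free(saved);
    return -1;
  }
  /* isalnum() needs a value representable as unsigned char (or EOF) */
  result = isalnum((unsigned char)c) ? 1 : 0;
  if(saved) {
    setlocale(LC_CTYPE, saved);
    free(saved);
  }
  return result;
}
```

Calling isalnum_in_locale("C", 0xc3) and isalnum_in_locale("en_US.UTF-8", 0xc3) side by side should show whether a given system has the discrepancy.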

I'm afraid there's but one conclusion to draw from this: we need to make and use a custom isalnum() function. And to be really sure, I figure we should cover the other is*() macros as well, as they may very well contain the same sort of weaknesses.

Totally agreed, I think it is the wisest thing we can do, not only for curl_fnmatch, but for the whole library.

In addition, if isalnum() considers its argument as 8-bit only, it cannot determine whether the first byte of a multibyte character in an arbitrary encoding represents an alphanum, whatever the locale is. I think our own implementation of ctypes.c equivalent functions should always return false for values < 0 or >= 0x80.
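
To make the proposal concrete, here is a minimal sketch of what such locale-independent replacements could look like. The ascii_* names are invented for illustration and are not existing curl functions, and the code assumes an ASCII execution character set, where letters and digits are contiguous. Because the comparisons are plain range checks on an int, any value < 0 or >= 0x80 falls outside every range and yields false.

```c
/* Hypothetical names, not the curl source. Assumes an ASCII execution
   character set. Each function takes an int so that negative values and
   bytes >= 0x80 are rejected by the range checks themselves. */
static int ascii_isdigit(int c)
{
  return c >= '0' && c <= '9';
}

static int ascii_isupper(int c)
{
  return c >= 'A' && c <= 'Z';
}

static int ascii_islower(int c)
{
  return c >= 'a' && c <= 'z';
}

static int ascii_isalpha(int c)
{
  return ascii_isupper(c) || ascii_islower(c);
}

static int ascii_isalnum(int c)
{
  return ascii_isalpha(c) || ascii_isdigit(c);
}
```

With this, the result for 0xc3 or 0xbf is the same on every platform and in every locale, since the locale is never consulted.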

Your isalnum() explanation made me find the real bug: in a set, it is not possible to follow an alphanum character with a non-alphanum. Try:

  { "[a@]",                    "a",                      MATCH },

This fails even on Linux.
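The same failure occurs on macOS with the "[!ÿ]" test if ISALNUM(0xBF) is false: 0xBF is the second byte of 'ÿ' in UTF-8, so an alphanum byte (0xC3) is followed by a non-alphanum one inside the set. The setcharset() function has to be changed to be much less "picky" about what is currently considered a pattern error.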
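
As a rough illustration of what "less picky" could mean, here is a sketch of a byte-wise set matcher that accepts any byte as a literal member, alphanumeric or not. This is not setcharset() and not a proposed patch: set_matches() is a hypothetical name, it takes only the text between '[' and ']', and it handles just a leading '!' for negation and simple x-y ranges. Character classes such as [:alpha:] and escapes are left out.

```c
/* Hypothetical sketch, not curl's setcharset(). 'set' is the text
   between '[' and ']'. Returns 1 if byte c is matched by the set,
   0 otherwise. */
static int set_matches(const char *set, unsigned char c)
{
  const unsigned char *p = (const unsigned char *)set;
  int negate = 0;
  int found = 0;

  if(*p == '!') {
    /* a leading '!' inverts the result */
    negate = 1;
    p++;
  }
  while(*p) {
    if(p[1] == '-' && p[2]) {
      /* range such as a-z: both endpoints are compared as raw bytes */
      if(c >= p[0] && c <= p[2])
        found = 1;
      p += 3;
    }
    else {
      /* any other byte, alphanumeric or not, is a literal member */
      if(c == *p)
        found = 1;
      p++;
    }
  }
  return negate ? !found : found;
}
```

With this approach the set "a@" matches both 'a' and '@', and the two bytes of a UTF-8 'ÿ' are simply two literal members, whatever ISALNUM() thinks of them.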
-------------------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette:   https://curl.haxx.se/mail/etiquette.html