On 01/28/2018 11:58 PM, Daniel Stenberg wrote:
On Sun, 28 Jan 2018, Daniel Stenberg wrote:
Yes, I noticed this too and it is highly annoying since it now keeps
making our CI builds red. I tried to reproduce it on my Mac but for
some reason it doesn't fail when I run it! :-/
I managed to trigger the bug on macOS now. It turns out ISALNUM()
returns TRUE on macOS and FALSE on Linux when given 0xc3, the first
byte of the UTF-8 encoding of 'ÿ' (0xc3 0xbf), as input.
Further, isalnum() apparently depends on the locale. Both my Linux
and macOS machines use the en_US.UTF-8 locale. However, if I set
"LANG=C" before I run unit1307, the test case no longer fails!
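The locale dependence described above can be checked with a small
standalone helper. This is only a sketch and not curl code: the name
is_alnum_in_locale() is made up for illustration, and the result for
any locale other than "C" depends on the platform's C library.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical helper (not part of curl): classify the byte value `c`
   with isalnum() while the named LC_CTYPE locale is active.
   Returns -1 if the locale is not available, otherwise 0 or 1. */
static int is_alnum_in_locale(const char *locale, unsigned char c)
{
  int result;

  if(!setlocale(LC_CTYPE, locale))
    return -1; /* locale not installed on this system */

  result = isalnum(c) ? 1 : 0;

  /* go back to the "C" locale, which is the program start-up default */
  setlocale(LC_CTYPE, "C");
  return result;
}
```

Calling is_alnum_in_locale("C", 0xc3) gives 0, since the "C" locale
only treats ASCII letters and digits as alphanumeric. Calling it with
"en_US.UTF-8" instead is where the platforms reportedly disagree.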
I'm afraid there's but one conclusion to draw from this: we need to
make and use a custom isalnum() function for this. And to be really
sure I figure we should cover the other is*() macros as well as they
may very well contain the same sort of weaknesses.
Totally agreed, I think it is the wisest thing we can do, not only for
curl_fnmatch, but for the whole library.
In addition, since isalnum() only looks at a single 8-bit value, it
cannot tell whether the first byte of a multibyte character, in
whatever encoding, represents an alphanumeric character, regardless of
the locale. I think our own implementations of the ctype-equivalent
functions should always return false for values < 0 or >= 0x80.
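A minimal sketch of such locale-independent functions could look like
the following. The ascii_* names are hypothetical and not an existing
curl API, and the range checks assume an ASCII execution character
set (they would not be right on EBCDIC systems).

```c
/* Hypothetical ASCII-only classifiers (names are illustrative only).
   They never consult the locale. Any value below 0 or at/above 0x80
   falls outside every range tested here, so it is always rejected. */
static int ascii_isdigit(int c)
{
  return c >= '0' && c <= '9';
}

static int ascii_isupper(int c)
{
  return c >= 'A' && c <= 'Z';
}

static int ascii_islower(int c)
{
  return c >= 'a' && c <= 'z';
}

static int ascii_isalpha(int c)
{
  return ascii_isupper(c) || ascii_islower(c);
}

static int ascii_isalnum(int c)
{
  return ascii_isalpha(c) || ascii_isdigit(c);
}
```

With these, ascii_isalnum(0xc3) is false on every platform and in
every locale, so a test like unit1307 would no longer depend on LANG.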
Your isalnum() explanation made me find the real bug: in a set, it is
not possible to follow an alphanum character with a non-alphanum. Try:
{ "[a@]", "a", MATCH },
This fails even on Linux.
This occurs identically on mac with the "[!ÿ]" test if ISALNUM(0xBF) is
false.
The setcharset() function has to be changed to be much less "picky"
about what is currently considered as a pattern error.
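To illustrate what "less picky" could mean, here is a sketch of a set
matcher that accepts any byte as a set member, so an alphanumeric
character may be followed by anything. It is not the actual
setcharset() code: the function name set_matches(), its return
values, and its handling of '!', '^', '-' and a leading ']' are
assumptions made for this example.

```c
/* Hypothetical helper (not curl's setcharset()): test whether the byte
   `c` matches the bracket expression at the start of `pattern`, such
   as "[a@]" or "[!0-9]". Every byte is allowed as a set member.
   Returns 1 on match, 0 on no match, -1 if the set is malformed
   (no leading '[' or no closing ']'). */
static int set_matches(const char *pattern, unsigned char c)
{
  const unsigned char *p = (const unsigned char *)pattern;
  int negate = 0;
  int found = 0;
  int first = 1;

  if(*p != '[')
    return -1;
  p++;

  if(*p == '!' || *p == '^') {
    negate = 1;
    p++;
  }

  /* a ']' in the first member position is taken literally */
  while(*p && (first || *p != ']')) {
    unsigned char lo = *p++;
    unsigned char hi = lo;
    first = 0;

    /* "x-y" is a range unless the '-' is the last member before ']' */
    if(p[0] == '-' && p[1] && p[1] != ']') {
      hi = p[1];
      p += 2;
    }
    if(c >= lo && c <= hi)
      found = 1;
  }

  if(*p != ']')
    return -1; /* unterminated set */

  return negate ? !found : found;
}
```

With this approach set_matches("[a@]", 'a') returns 1, and the bytes
of a multibyte character inside a set are simply treated as
individual members instead of triggering a pattern error.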