On 12.04.2011 21:37, William A. Rowe Jr. wrote:
> On 4/12/2011 11:56 AM, Jeff Trawick wrote:
>> On Tue, Apr 12, 2011 at 12:29 PM, William A. Rowe Jr.
>> <wr...@rowe-clan.net> wrote:
>>> I have one dev question for my apr_fnmatch() refactoring
>>>
>>> Today we lowercase the two characters (and don't support case-insensitive
>>> range matches at all, I won't change this apr-specific quirk). But IIRC
>>> there are language with multiple lower case representations of the same
>>> upper case character, but never (or at least, rarely) visa versa?
>>>
>>> Shouldn't we upcase both the text and match chars, instead, to better
>>> support non-ASCII locales? (Obviously, this ignores utf-8 issues, and
>>> I'm not going to enable MBCS in this next release, but will at least make
>>> it possible to enhance for MBCS later on, without changing fn prototypes).
>> No real answer, just some comments...
>>
>> * FWLIW, it is tolower() now "just because." It was originally toupper().
>> * For interesting text, it could change behavior, and we don't have
>>   bugs filed now, right?
>> * For interesting text, neither toupper() nor tolower() nor == is
>>   correct! (So don't bother changing behavior.)
> I think I found the answer to "just because", thanks Deutchlanders... from
> the linux manpage...
>
>   In some non-English locales, there are lowercase letters with no
>   corresponding uppercase equivalent; the German sharp s is one example.
>
> Still pondering.

The only marginally safe comparison would be strcoll on whole
non-wildcard subsequences of the pattern, and even that isn't guaranteed
to work because the filesystem (it's for fnmatch, right?) can have a
different collation than the current locale, thank you NTFS.

-- Brane
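
To make the strcoll idea concrete, here is a rough sketch in C. It is only an illustration: literal_run_matches() is a hypothetical helper, not anything in APR, and it assumes the caller has already split the pattern into wildcard-free runs and knows the run length in bytes. It also assumes a run and the text it matches have the same byte length, which need not hold in multibyte locales.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an APR function): compare one literal,
 * wildcard-free run of the pattern against an equally long slice of the
 * text using the collation of the current LC_COLLATE locale.
 * Returns 1 on match, 0 on mismatch, -1 if memory allocation fails. */
static int literal_run_matches(const char *pat, const char *text, size_t len)
{
    char *p, *t;
    int rc = -1;

    /* Either string ending before len bytes means the run cannot match. */
    if (memchr(pat, '\0', len) != NULL || memchr(text, '\0', len) != NULL)
        return 0;

    p = malloc(len + 1);
    t = malloc(len + 1);
    if (p != NULL && t != NULL) {
        /* strcoll() needs NUL-terminated strings, so copy the slices. */
        memcpy(p, pat, len);
        memcpy(t, text, len);
        p[len] = '\0';
        t[len] = '\0';
        rc = (strcoll(p, t) == 0);
    }
    free(p);
    free(t);
    return rc;
}
```

In the default "C" locale strcoll() behaves like strcmp(), so this only differs from a byte comparison after setlocale(LC_COLLATE, ...) has selected something else. It does nothing about the filesystem using a different collation than the locale.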