Re: [RFA/libiberty] Darwin has case-insensitive filesystems
> Looks OK to me.

Thanks, DJ. I've just checked the patch in on the GCC side. I will push it to the src/GDB CVS momentarily.

-- 
Joel
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
On Jun 14 18:01, DJ Delorie wrote:
> > This is wrong as not all FSs are case insensitive. In fact HFS+ can be case sensitive too. I think you need a better check than just saying all Darwin is case insensitive. This is just like using FAT32 on Linux. In fact I think HAVE_DOS_BASED_FILE_SYSTEM is incorrect also for NTFS as it can also be case sensitive.
>
> There's a difference between case preserving and case sensitive, though, and we really don't have a portable way to detect case-sensitivity on a per-directory basis, so how can we do better?

As Andrew points out, NTFS can be case-sensitive as well, and on Windows the case-sensitivity vs. case-preserving behaviour can be chosen for each file or directory descriptor at the time the file is opened.

IMHO it's actually a pity that the filename comparison behaves differently on different systems. I think it would make sense to behave identically on all systems. What about this:

  Always search case-sensitive. If file has been found, return.
  Otherwise, search case-insensitive.

Talking about case-insensitive comparison, the filename_cmp and filename_ncmp functions don't work for multibyte codesets, only for singlebyte codesets. Given that UTF-8 is standard nowadays, shouldn't these functions be replaced with multibyte-aware versions?

Along the same lines, the entire set of safe-ctype functions only work for ASCII and EBCDIC...

Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
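A minimal sketch of the two-pass lookup proposed in this message, assuming the candidates are already available as an array of names. The function name find_filename and the helpers are hypothetical and are not part of libiberty; the case folding is ASCII-only to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* ASCII-only case folding; deliberately ignores locale and multibyte text. */
static int ascii_lower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Returns nonzero when a and b are equal ignoring ASCII case. */
static int ascii_equal_nocase(const char *a, const char *b)
{
    while (*a != '\0'
           && ascii_lower((unsigned char)*a) == ascii_lower((unsigned char)*b)) {
        a++;
        b++;
    }
    return ascii_lower((unsigned char)*a) == ascii_lower((unsigned char)*b);
}

/* Hypothetical lookup: index of the best match for `wanted`, or -1. */
long find_filename(const char *const *names, size_t count, const char *wanted)
{
    size_t i;

    /* Pass 1: an exact, case-sensitive match always wins. */
    for (i = 0; i < count; i++)
        if (strcmp(names[i], wanted) == 0)
            return (long)i;

    /* Pass 2: fall back to the first case-insensitive match. */
    for (i = 0; i < count; i++)
        if (ascii_equal_nocase(names[i], wanted))
            return (long)i;

    return -1;
}
```

With the names "makefile", "Makefile" and "README", a request for "Makefile" resolves in the first pass, while a request for "MAKEFILE" only resolves in the second pass and picks whichever candidate comes first. The second pass is where a match can be returned on a case-sensitive filesystem even though no file with exactly that name exists.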
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
> Date: Wed, 15 Jun 2011 10:22:36 +0200
> From: Corinna Vinschen vinsc...@redhat.com
>
> On Jun 14 18:01, DJ Delorie wrote:
> > > This is wrong as not all FSs are case insensitive. In fact HFS+ can be case sensitive too. I think you need a better check than just saying all Darwin is case insensitive. This is just like using FAT32 on Linux. In fact I think HAVE_DOS_BASED_FILE_SYSTEM is incorrect also for NTFS as it can also be case sensitive.
> >
> > There's a difference between case preserving and case sensitive, though, and we really don't have a portable way to detect case-sensitivity on a per-directory basis, so how can we do better?
>
> As Andrew points out, NTFS can be case-sensitive as well, and on Windows the case-sensitivity vs. case-preserving behaviour can be chosen for each file or directory descriptor at the time the file is opened.
>
> IMHO it's actually a pity that the filename comparison behaves differently on different systems. I think it would make sense to behave identically on all systems. What about this:
>
>   Always search case-sensitive. If file has been found, return.
>   Otherwise, search case-insensitive.

Over my dead body. On a proper operating system filenames are case-sensitive. Your suggestion would create spurious matches.

Even on case-preserving filesystems I'd argue that treating them as case-sensitive is still the right approach. If that creates problems, it means somebody was sloppy and didn't type the proper name of the file or some piece of code in the toolchain arbitrarily changed the case of a filename. I don't mind punishing people for that. They have to learn that on a proper operating system file names are case-sensitive!

If you're still using an operating system with fully case-insensitive filesystems, I feel very, very sorry for you.

> Talking about case-insensitive comparison, the filename_cmp and filename_ncmp functions don't work for multibyte codesets, only for singlebyte codesets. Given that UTF-8 is standard nowadays, shouldn't these functions be replaced with multibyte-aware versions?

For UTF-8, that isn't necessary. Normal string manipulation functions work just fine on them, since UTF-8 strings don't contain any embedded NUL characters. It's only when you try to be too clever about case-insensitivity that you run into problems.

> Along the same lines, the entire set of safe-ctype functions only work for ASCII and EBCDIC...

That really should only matter for displaying filenames.

Anyway. I really don't care how deep a hole people have dug for themselves in trying to support Windows in all its various broken configurations. But on a native debugger for a UNIX-like system, or a cross debugger between such systems, filename_cmp() should simply do a strcmp().
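To make the two positions concrete, here is a small sketch under the assumption of an ASCII-based host and UTF-8 encoded names. Both function names are hypothetical. The first is the plain strcmp() behaviour argued for in this message; the second folds only the bytes 'A' to 'Z', which illustrates what a byte-at-a-time case fold can and cannot match.

```c
#include <string.h>

/* Byte-wise, case-sensitive comparison. This works for UTF-8 because an
   encoded string never contains an embedded NUL byte. */
int exact_filename_cmp(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Byte-wise comparison that folds only the ASCII letters A-Z.
   Assumes an ASCII-compatible execution character set. */
int ascii_fold_filename_cmp(const char *a, const char *b)
{
    for (;; a++, b++) {
        unsigned char ca = (unsigned char)*a;
        unsigned char cb = (unsigned char)*b;

        if (ca >= 'A' && ca <= 'Z')
            ca = (unsigned char)(ca - 'A' + 'a');
        if (cb >= 'A' && cb <= 'Z')
            cb = (unsigned char)(cb - 'A' + 'a');

        if (ca != cb)
            return ca < cb ? -1 : 1;
        if (ca == '\0')
            return 0;
    }
}
```

Identical UTF-8 names compare equal with either function, and "README" matches "readme" with the folding variant. A name starting with U+00C9 (bytes C3 89) does not match the same name starting with U+00E9 (bytes C3 A9), because no single byte of a multibyte sequence is an ASCII letter.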
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
On Jun 15 11:58, Mark Kettenis wrote:
> > Date: Wed, 15 Jun 2011 10:22:36 +0200
> > From: Corinna Vinschen ...

Please do not quote my email address in the body of your message. Thank you.

> > IMHO it's actually a pity that the filename comparison behaves differently on different systems. I think it would make sense to behave identically on all systems. What about this:
> >
> >   Always search case-sensitive. If file has been found, return.
> >   Otherwise, search case-insensitive.
>
> Over my dead body. On a proper operating system filenames are case-sensitive. Your suggestion would create spurious matches.

Indeed. Probably the case sensitivity should not be hardcoded in a low-level function at all. The application should decide if it wants case-sensitive or case-insensitive filename comparison. This way, the comparison could be based on OS, filesystem, or user choice.

> Even on case-preserving filesystems I'd argue that treating them as case-sensitive is still the right approach. If that creates problems, it means somebody was sloppy and didn't type the proper name of the file or some piece of code in the toolchain arbitrarily changed the case of a filename. I don't mind punishing people for that. They have to learn that on a proper operating system file names are case-sensitive!

I wasn't aware that gcc, gdb, and other tools using libiberty are supposed to punish people for the features of the OS they are working on. At one point I actually thought they were supposed to *help* developers. I seem to be wrong.

> > Talking about case-insensitive comparison, the filename_cmp and filename_ncmp functions don't work for multibyte codesets, only for singlebyte codesets. Given that UTF-8 is standard nowadays, shouldn't these functions be replaced with multibyte-aware versions?
>
> For UTF-8, that isn't necessary. Normal string manipulation functions work just fine on them, since UTF-8 strings don't contain any embedded NUL characters. It's only when you try to be too clever about case-insensitivity that you run into problems.

If you read the text you're replying to once more, you see that I'm explicitly talking about case-insensitive comparison. In that case, the functions won't work correctly, unless you use a singlebyte codeset. The tolower function on a single byte just doesn't make sense in multibyte charsets. The right thing to do would be something along the lines of

  mbstowcs (wide_a, a);
  mbstowcs (wide_b, b);
  return wcscasecmp (wide_a, wide_b);

> > Along the same lines, the entire set of safe-ctype functions only work for ASCII and EBCDIC...
>
> That really should only matter for displaying filenames.

It matters for case-insensitive filename comparison as well.

> Anyway. I really don't care how deep a hole people have dug for themselves in trying to support Windows in all its various broken configurations.

I can't help but notice that you seem to have a strained relationship with Windows. However, if you read the OP again, you'll notice that the patch was supposed to help developers on MacOS, not Windows. For Windows the function already performs case-insensitive comparison, albeit incorrectly.

Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
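The three-line outline in this message omits buffer handling (the standard mbstowcs() takes a third argument giving the destination size). Below is one way it could be fleshed out, offered as a sketch rather than as the code from any posted patch. The names to_wide and wide_filename_casecmp are hypothetical. Instead of wcscasecmp(), which is not available everywhere, the sketch folds each wide character with the standard towlower(). It falls back to strcmp() when a string cannot be converted in the current locale.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Convert a multibyte string to a freshly allocated wide string.
   Returns NULL on allocation or conversion failure; the caller frees. */
static wchar_t *to_wide(const char *s)
{
    size_t cap = strlen(s) + 1;   /* never more wide chars than bytes */
    wchar_t *w = malloc(cap * sizeof *w);

    if (w != NULL && mbstowcs(w, s, cap) == (size_t)-1) {
        free(w);
        w = NULL;
    }
    return w;
}

/* Hypothetical multibyte-aware, case-insensitive filename comparison. */
int wide_filename_casecmp(const char *a, const char *b)
{
    wchar_t *wa = to_wide(a);
    wchar_t *wb = to_wide(b);
    int result;

    if (wa == NULL || wb == NULL) {
        /* Not valid in the current locale: compare the raw bytes. */
        result = strcmp(a, b);
    } else {
        size_t i = 0;
        wint_t ca, cb;

        do {
            ca = towlower((wint_t)wa[i]);
            cb = towlower((wint_t)wb[i]);
            i++;
        } while (ca == cb && ca != 0);
        result = (ca > cb) - (ca < cb);
    }

    free(wa);
    free(wb);
    return result;
}
```

The conversion depends on the LC_CTYPE locale, so an application would have to call setlocale(LC_CTYPE, "") or similar before non-ASCII names can be folded; in the default "C" locale only ASCII names are guaranteed to convert. Case folding one wide character at a time is still an approximation for scripts where case mapping is not one-to-one.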
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
On Wed, 15 Jun 2011, Corinna Vinschen wrote:
> these functions be replaced with multibyte-aware versions? Along the same lines, the entire set of safe-ctype functions only work for ASCII and EBCDIC...

That's the whole point of safe-ctype: that code that is processing things such as C source code whose semantics do not depend on the host locale can examine character properties in a locale-independent way. Where C source code has multibyte characters, the correct handling depends in detail on the version of C and cannot be done by generic code.

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
On Wednesday 15 June 2011 11:44:19, Corinna Vinschen wrote:
> Indeed. Probably the case sensitivity should not be hardcoded in a low-level function at all. The application should decide if it wants case-sensitive or case-insensitive filename comparison. This way, the comparison could be based on OS, filesystem, or user choice.

http://sourceware.org/ml/gdb-patches/2010-12/msg00343.html

(that only handles filename comparison, not file opening)

-- 
Pedro Alves
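The idea quoted above, letting the application rather than the low-level function choose the comparison mode, could look roughly like the following. This is an illustrative sketch only: it is not taken from the linked patch, and the enum, its values, and filename_cmp_with_policy are hypothetical names. The insensitive branch folds ASCII letters only.

```c
#include <string.h>

/* Hypothetical policy type; the application picks a value at run time,
   for example from the host OS, the filesystem, or a user setting. */
enum filename_case_policy {
    FILENAME_CASE_SENSITIVE,
    FILENAME_CASE_INSENSITIVE
};

/* ASCII-only fold; assumes an ASCII-compatible character set. */
static int fold_ascii(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Compare two filenames according to the caller-supplied policy. */
int filename_cmp_with_policy(const char *a, const char *b,
                             enum filename_case_policy policy)
{
    if (policy == FILENAME_CASE_SENSITIVE)
        return strcmp(a, b);

    for (;; a++, b++) {
        int ca = fold_ascii((unsigned char)*a);
        int cb = fold_ascii((unsigned char)*b);

        if (ca != cb)
            return ca < cb ? -1 : 1;
        if (ca == '\0')
            return 0;
    }
}
```

Passing the policy as an argument keeps the library free of compile-time assumptions about the host, at the cost of every caller having to decide which policy applies to a given path.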
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
On Jun 15 10:45, Joseph S. Myers wrote:
> On Wed, 15 Jun 2011, Corinna Vinschen wrote:
> > these functions be replaced with multibyte-aware versions? Along the same lines, the entire set of safe-ctype functions only work for ASCII and EBCDIC...
>
> That's the whole point of safe-ctype: that code that is processing things such as C source code whose semantics do not depend on the host locale can examine character properties in a locale-independent way. Where C source code has multibyte characters, the correct handling depends in detail on the version of C and cannot be done by generic code.

Ok, I see. Just in this specific case it's about filenames, not C source. I don't think it makes sense to restrict filenames to ASCII or EBCDIC chars.

Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
On 6/15/2011 5:58 AM, Mark Kettenis wrote:
> Over my dead body. On a proper operating system filenames are case-sensitive. Your suggestion would create spurious matches.

Yes, we all know that Unix systems chose case sensitive, and are happy to have files differing only by case in the same directory. Obviously any proper software has to fully support such systems (if I was in the same mode as you and adding gratuitous flames to my comments, I would have preceded the word systems by brain-dead).

> Even on case-preserving filesystems I'd argue that treating them as case-sensitive is still the right approach.

Absolutely not, please don't visit your unix-borne prejudices on non-unix systems. There is nothing worse for Windows users than having to put up with silly decisions like this that visit unix nonsense (and it is nonsense in a windows environment) on windows software.

> If that creates problems, it means somebody was sloppy and didn't type the proper name of the file

The whole point in a system like Windows which is case preserving but not case sensitive is that you are NOT expected to type in the proper capitalization. In English, we recognize the words English and ENGLISH as equivalent, and windows users expect the same treatment. So the normal expectation in windows systems is that, yes, you can make nice capitalization like MyFile if you like, and it will be properly displayed. But any software that requires me to type MyFile rather than myfile is junk!

> If you're still using an operating system with fully case-insensitive filesystems, I feel very, very sorry for you.

You are allowed to have this opinion, I feel the same about people who have to tolerate case-sensitive file systems, but I am quite happy with software for Unix systems fully behaving (I would agree that any software that EVER did case insensitive matching, as suggested earlier in this thread, would be broken on Unix). But following your suggestion would be equally broken in Windows.

> or some piece of code in the toolchain arbitrarily changed the case of a filename. I don't mind punishing people for that. They have to learn that on a proper operating system file names are case-sensitive!

This kind of unix arrogance leads to junk unusable software on windows. It's really important not to visit your unix prejudices on windows users. After all we feel the same way in return, I find Unix systems complete junk for many reasons, one of which is the very annoying case sensitive viewpoint, but I do not translate my feelings into silly suggestions for making software malfunction on Unix. You should not make this mistake in a reverse direction.
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
> Date: Wed, 15 Jun 2011 06:59:11 -0400
> From: Robert Dewar de...@adacore.com
> CC: vinsc...@redhat.com, d...@redhat.com, pins...@gmail.com, brobec...@adacore.com, gcc-patches@gcc.gnu.org, gdb-patc...@sourceware.org
>
> > or some piece of code in the toolchain arbitrarily changed the case of a filename. I don't mind punishing people for that. They have to learn that on a proper operating system file names are case-sensitive!
>
> This kind of unix arrogance leads to junk unusable software on windows. It's really important not to visit your unix prejudices on windows users. After all we feel the same way in return, I find Unix systems complete junk for many reasons, one of which is the very annoying case sensitive viewpoint, but I do not translate my feelings into silly suggestions for making software malfunction on Unix. You should not make this mistake in a reverse direction.

I cannot agree more.
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
On Jun 15 20:27, Eli Zaretskii wrote:
> > Date: Wed, 15 Jun 2011 10:22:36 +0200
> > From: Corinna Vinschen ...
> >
> > Talking about case-insensitive comparison, the filename_cmp and filename_ncmp functions don't work for multibyte codesets, only for singlebyte codesets. Given that UTF-8 is standard nowadays, shouldn't these functions be replaced with multibyte-aware versions?
>
> I agree, but if we go that way, shouldn't we support UTF-16, which is used by the native Windows APIs? Windows does not use UTF-8 for file names.

I don't think so. UTF-16 is Windows' wchar_t (or WCHAR) codeset, but the applications calling the libiberty functions are using the char datatype with single- or multibyte codesets. If the filename_cmp function converts the multibyte input strings to wchar_t and compares the wide char strings case-insensitively (*), they would use UTF-16 under the hood on Windows anyway.

(*) As proposed in http://sourceware.org/ml/gdb-patches/2011-06/msg00210.html, basically like this:

  #ifdef _WIN32
  #define wcscasecmp _wcsicmp
  #endif

  mbstowcs (wide_a, a);
  mbstowcs (wide_b, b);
  return wcscasecmp (wide_a, wide_b);

Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
Looks OK to me.
Re: [RFA/libiberty] Darwin has case-insensitive filesystems
On Tue, Jun 14, 2011 at 2:33 PM, Joel Brobecker brobec...@adacore.com wrote:
> Hello,
>
> HFS+, the FS on Darwin, is case insensitive. So this patch adjusts filename_cmp.c to ignore the casing when comparing filenames on Darwin.

This is wrong as not all FSs are case insensitive. In fact HFS+ can be case sensitive too. I think you need a better check than just saying all Darwin is case insensitive. This is just like using FAT32 on Linux. In fact I think HAVE_DOS_BASED_FILE_SYSTEM is incorrect also for NTFS as it can also be case sensitive.

Thanks,
Andrew Pinski