Ganesan R <[EMAIL PROTECTED]> wrote:
>>>>>> "Steve" == Steve Greenland <[EMAIL PROTECTED]> writes:
>
>> On 31-Dec-01, 19:42 (CST), Ganesan R <[EMAIL PROTECTED]> wrote:
>>> Another thing that puzzles me since this whole debate started. If you look
>>> at the declaration of ctype.h functions (isalpha family), they take a int as
>>> an argument.
>
>> I don't know that I agree about needing to pass them unsigned char,
>> though. The char->int conversion should be value preserving. If you pass
>> a negative value, then you are making a domain error, and deserve what
>> you get.
>
> Are you saying that isalpha() etc should work for negative values or that
> you should never call isalpha() with negative values for chars? In a
> ISO8859-1 locale when I call isalpha() for accented characters I should get
> expected results without worrying about whether the accented character is a
> signed quantity. Unfortunately older C library implementations may break
> because an accented character will take a negative index on a character
> table without proper casting.
>
> I followed up a bit on this and found out that ISO C specifically states
> that ctype functions should work for all values of unsigned char as well as
> the default char type. In other words, if the default C char type is signed
> you can just call the functions without any cast and expect it to work.

If every system had up-to-date, standards-conforming ctype.h support,
we wouldn't have to worry much at all.  But even these days, pretty many
systems with buggy macros are still in use.

FYI, as far as I know, the most portable way to use the ctype macros
is to define wrapper macros (e.g., like those below, from
fileutils/src/sys2.h) and then use only the wrappers (upper-case names)
from your code.

Of course, the following assumes you have the right definitions for
STDC_HEADERS and HAVE_ISASCII.  You get those by using these autoconf
macros:

  AC_HEADER_STDC
  AC_CHECK_FUNCS(isascii)

Be careful when choosing between ISDIGIT and ISDIGIT_LOCALE.

Jim

----------------------------------------

#include "config.h"
#include <ctype.h>

/* [someone :-)] writes:

   "... Some ctype macros are valid only for character codes that
   isascii says are ASCII (SGI's IRIX-4.0.5 is one such system --when
   using /bin/cc or gcc but without giving an ansi option).  So, all
   ctype uses should be through macros like ISPRINT...  If
   STDC_HEADERS is defined, then autoconf has verified that the ctype
   macros don't need to be guarded with references to isascii. ...
   Defining isascii to 1 should let any compiler worth its salt
   eliminate the && through constant folding."

   Bruno Haible adds:

   "... Furthermore, isupper(c) etc. have an undefined result if c is
   outside the range -1 <= c <= 255. One is tempted to write isupper(c)
   with c being of type `char', but this is wrong if c is an 8-bit
   character >= 128 which gets sign-extended to a negative value.
   The macro ISUPPER protects against this as well."
*/

#if STDC_HEADERS || (!defined (isascii) && !HAVE_ISASCII)
# define IN_CTYPE_DOMAIN(c) 1
#else
# define IN_CTYPE_DOMAIN(c) isascii(c)
#endif

#ifdef isblank
# define ISBLANK(c) (IN_CTYPE_DOMAIN (c) && isblank (c))
#else
# define ISBLANK(c) ((c) == ' ' || (c) == '\t')
#endif
#ifdef isgraph
# define ISGRAPH(c) (IN_CTYPE_DOMAIN (c) && isgraph (c))
#else
# define ISGRAPH(c) (IN_CTYPE_DOMAIN (c) && isprint (c) && !isspace (c))
#endif

/* This is defined in <sys/euc.h> on at least Solaris2.6 systems.  */
#undef ISPRINT

#define ISPRINT(c) (IN_CTYPE_DOMAIN (c) && isprint (c))
#define ISALNUM(c) (IN_CTYPE_DOMAIN (c) && isalnum (c))
#define ISALPHA(c) (IN_CTYPE_DOMAIN (c) && isalpha (c))
#define ISCNTRL(c) (IN_CTYPE_DOMAIN (c) && iscntrl (c))
#define ISLOWER(c) (IN_CTYPE_DOMAIN (c) && islower (c))
#define ISPUNCT(c) (IN_CTYPE_DOMAIN (c) && ispunct (c))
#define ISSPACE(c) (IN_CTYPE_DOMAIN (c) && isspace (c))
#define ISUPPER(c) (IN_CTYPE_DOMAIN (c) && isupper (c))
#define ISXDIGIT(c) (IN_CTYPE_DOMAIN (c) && isxdigit (c))
#define ISDIGIT_LOCALE(c) (IN_CTYPE_DOMAIN (c) && isdigit (c))

#if STDC_HEADERS
# define TOLOWER(Ch) tolower (Ch)
# define TOUPPER(Ch) toupper (Ch)
#else
# define TOLOWER(Ch) (ISUPPER (Ch) ? tolower (Ch) : (Ch))
# define TOUPPER(Ch) (ISLOWER (Ch) ? toupper (Ch) : (Ch))
#endif

/* ISDIGIT differs from ISDIGIT_LOCALE, as follows:
   - Its arg may be any int or unsigned int; it need not be an unsigned char.
   - It's guaranteed to evaluate its argument exactly once.
   - It's typically faster.
   Posix 1003.2-1992 section 2.5.2.1 page 50 lines 1556-1558 says that
   only '0' through '9' are digits.  Prefer ISDIGIT to ISDIGIT_LOCALE unless
   it's important to use the locale's definition of `digit' even when the
   host does not conform to Posix.  */
#define ISDIGIT(c) ((unsigned) (c) - '0' <= 9)
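
----------------------------------------

For illustration only, here is a minimal sketch of how a caller might
avoid the sign-extension problem Bruno describes, and of how the
unsigned-subtraction trick in ISDIGIT behaves.  The helper names
count_alpha and digit_prefix_len are hypothetical and do not come from
sys2.h; ISDIGIT is repeated only so that the fragment compiles on its
own.  It assumes a hosted C compiler and the default "C" locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from sys2.h): count alphabetic bytes in a
   string.  Each char is converted to unsigned char before the ctype
   call, so a byte >= 128 is never passed as a negative int.  */
static size_t
count_alpha (const char *s)
{
  size_t n = 0;
  for (; *s != '\0'; s++)
    if (isalpha ((unsigned char) *s))
      n++;
  return n;
}

/* Same expression as ISDIGIT in the header above, repeated here so the
   sketch is self-contained.  A negative or large argument wraps to a
   large unsigned value, so the comparison is simply false.  */
#define ISDIGIT(c) ((unsigned) (c) - '0' <= 9)

/* Hypothetical helper: length of the leading run of ASCII digits.
   The terminating '\0' is not a digit, so the loop stops there.  */
static size_t
digit_prefix_len (const char *s)
{
  size_t n = 0;
  while (ISDIGIT (s[n]))
    n++;
  return n;
}
```

Because ISDIGIT never indexes a table, it should be safe to hand it a
plain (possibly negative) char, whereas the isalpha call relies on the
explicit cast.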