On Wed, Aug 27, 2003 at 04:28:27AM +0100, Glynn Clements wrote:
> > Attached is a properly internationalized implementation of
> > Foreign.C.String, along with some other routines which I feel would be
> > very at home in the FFI standard.
> >
> > Note that I am trying to solve a simpler problem than full generic i18n.
> > I just want the ability to work within the current locale, whatever it
> > might be.
>
> But bear in mind that other programmers may want to work in the "C"
> locale, regardless of the user's environment settings. This why the C
> library doesn't attempt to use those settings unless the program
> explicitly requests their use via setlocale().
>
> Also, some libraries may fail to cope with locales other than "C";
> particularly locales with multi-byte encodings.

Then this is a library interface issue, not a locale one. If a library
documents that an argument shall be nothing other than ASCII text, then call
withAsciiCString; if the library accepts localized text, then call
withCString (which is defined to return the localized version by the FFI
spec).

Frankly, the ability to not call setlocale was a hack to work around
migration issues in C programs. There is no need for Haskell programs to
inherit this complexity: if libraries need ASCII strings, explicitly pass
ASCII strings; if they need localized strings, pass localized ones; and so
forth. At worst, you can always do LANG=C foo to force a program to run in a
specific locale. (But if people are smart about writing their C bindings,
this sort of hack shouldn't ever be necessary.)

> > also, to a lesser extent I propose we add explicit utf8 routines:
> >
> > withUTF8String, withUTF8StringLen, newUTF8String,
> > newUTF8StringLen, peekUTF8String, peekUTF8StringLen
> >
> > there are several libraries (X11 being a major one) which export an
> > explicit utf8 based interface,
>
> Note that the Xutf8* functions are specific to XFree86's version of
> Xlib (and are only in 4.0.2 and later); they aren't in the vanilla
> OpenGroup version. They don't exist in vanilla X11R6, or in XFree86 3.x.
>
> Also those functions are redundant; you can always use the Xmb*
> functions with a UTF8-based locale instead.

Yeah, I was giving them as an example of a place where they would be handy.
Locale hacks are bad! What if the system doesn't have a UTF-8 locale? In a
multithreaded program, temporarily changing the locale can be disastrous.
Xutf8* lets us avoid this very nicely when they exist, and we only need to
fall back to less reliable locale tricks when necessary. This is similar to
my previous comment: interfaces either use data in the current locale or in
a specified one only.
Haskell is of the latter type, specifying Unicode as its character set, which
means that we can talk to interfaces which are locale independent (like the
Xutf8*) very easily with statically determined charset conversions. We should
take advantage of this ability whenever possible.

> Simon Marlow wrote:
>
> > In our new implementation of Data.Char.isUpper and friends, I made the
> > simplifying assumption that Char==wchar_t==Unicode. With glibc, this
> > appears to be valid as long as (a) you set LANG to something other than
> > "C" or "POSIX", and (b) you call setlocale() first.
>
> The glibc Info file says:
>
>     The wide character character set always is UCS4, at least on
>     GNU systems.

Yes. With glibc, wchar_t is always Unicode no matter what the locale. Better
yet, ISO C specifies a handy macro to test for this: if __STDC_ISO_10646__ is
defined, then wchar_t is always Unicode no matter what.

> > We now call setlocale() in the RTS startup code.
>
> So anyone who doesn't want to use the current locale now has to
> explicitly set it back to "C"?
>
> Also, this is just for LC_CTYPE, right?

If they want to use a different locale, they should change the environment
prior to running the program. If they want to use code which only supports
the C locale (meaning it only works with ASCII), then call withAsciiCString
and friends...

        John

PS: I made the implicit assumption that once *CString* was replaced by
localized versions, we would export the old versions under *AsciiCString*,
which just makes sense.

-- 
---------------------------------------------------------------------------
John Meacham - California Institute of Technology, Alum. - [EMAIL PROTECTED]
---------------------------------------------------------------------------
_______________________________________________
FFI mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/ffi
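
For reference, a minimal C sketch of two checks discussed in the message above:
testing __STDC_ISO_10646__ to see whether wchar_t is guaranteed to hold Unicode
code points, and verifying that a string is pure ASCII before handing it to a
library that only accepts ASCII. The function names wchar_is_unicode and
is_ascii are hypothetical and not part of any proposal in this thread; the
ASCII check only illustrates the kind of validation a routine such as
withAsciiCString might perform, which is an assumption.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if the implementation promises that
   wchar_t values are ISO 10646 (Unicode) code points in every locale,
   0 if it makes no such promise. */
int wchar_is_unicode(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: returns 1 if every byte of the NUL-terminated
   string s is in the 7-bit ASCII range, 0 otherwise. */
int is_ascii(const char *s)
{
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        if (*p > 0x7F)
            return 0;
    }
    return 1;
}
```

When wchar_is_unicode() returns 1, the conversion between a Haskell Char and
wchar_t can be a plain numeric cast; otherwise some locale-dependent
conversion would be needed.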