Ashley Yakeley wrote:

> > Simply claiming that values of type Char are Unicode characters
> > doesn't make it so.
>
> Actually, that's exactly what makes it so.

Hmm. I suppose that there's some validity to that perspective. OTOH,
it's one thing to state that it's true, but that's rather hollow if
nothing actually behaves as if it is.

It's a bit like saying "values of type Int are complex numbers; oh,
BTW, the implementation is currently broken". IOW, if it walks like a
duck, ...

> > Unless I'm missing something, the only "support" that GHC provides is
> > that Char is 4 bytes.
>
> No, on GHC a Char is a Unicode codepoint, which means it has only
> 17*2^16 possible values. This by itself is the most important aspect of
> Unicode support.

OK; by "Char is 4 bytes" I basically meant that it's "large enough".

> But most of the rest is missing.

AFAICT, *all*[1] of the rest is missing.

[1] With one rather useless exception:
fromEnum (maxBound :: Char) == 0x10ffff. I can't think of any other
aspect of GHC's behaviour which would indicate that Char is meant to
be Unicode.

> > If you use Char to store anything other than ISO
> > Latin-1 characters, none of the Haskell functions with Char in their
> > signature will be of any use.
>
> Actually, many of those functions ought to use Word8 instead.

But then:

1. Where would you get a Char from?
2. Where would you put it?

BTW, I agree that the IO functions *should* use Word8. And I really
wouldn't be that bothered if the standard were changed to just use
"type Char = Word8". Actually, I would prefer that to the current
fiction.

At least the problems with the Char functions are just implementation
bugs; those functions *could* be made to work correctly. The IO
problems are design bugs, and can't truly be fixed without breaking a
lot of existing code.

A workaround which preserves backward compatibility could result in a
rather ugly interface: either all of the relevant functions use a
default encoding (which will probably be the wrong one as often as
not), or the "right" functions have to have their names bastardised
because the "wrong" functions have already stolen the obvious names.

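For what it's worth, the two numeric claims above (17*2^16 values, and the upper bound of 0x10ffff) are easy to check, and the "where would you put it?" question can be made concrete. The following is only a minimal sketch: `charCount` and `encodeUtf8Char` are hypothetical names made up for illustration, not library functions, and UTF-8 is just one assumed choice of encoding. It shows a Char being turned into Word8 octets explicitly, rather than relying on whatever the IO functions happen to do.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Number of distinct Char values: code points 0 .. 0x10FFFF.
charCount :: Int
charCount = ord (maxBound :: Char) + 1

-- Hypothetical helper (not a library function): encode one Char as
-- UTF-8 octets, so that the byte-level representation is explicit.
-- Sketch only: it does not reject surrogate code points.
encodeUtf8Char :: Char -> [Word8]
encodeUtf8Char c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6, cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading octet: length tag plus the topmost payload bits.
    lead tag s = fromIntegral (tag .|. (n `shiftR` s))
    -- Continuation octet: 10xxxxxx carrying six payload bits.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))
```

In GHCi, `charCount == 17 * 2^16` evaluates to True, and `concatMap encodeUtf8Char "é"` gives `[195,169]`, i.e. the octets 0xC3 0xA9. The point of the sketch is only that the Char-to-octet step is a separate, explicit choice.
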
--
Glynn Clements <[EMAIL PROTECTED]>
_______________________________________________
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell