On Fri, 2002-08-09 at 08:40, Ashley Yakeley wrote:
> At 2002-08-08 23:10, Ken Shan wrote:
>
> > 1. Octets.
> > 2. C "char".
> > 3. Unicode code points.
> > 4. Unicode code values, useful only for UTF-16, which is seldom used.
> > 5. "What handles handle".
> ...
> >I suggest that the following Haskell types be used for the five items
> >above:
> >
> > 1. Word8
> > 2. CChar
> > 3. CodePoint
> > 4. Word16
> > 5. Char
>
> I disagree, they should be:
>
> 1. Word8
> 2. CChar
> 3. Char
> 4. Word16
> 5. Word8

Yes.

> >Let me elaborate. Files are funny because the information units they
> >contain can be treated as both numbers and characters.
>
> No, a file is always a list of octets. Nothing else (ignoring metadata,
> forks etc.). Of course, you can interpret those octets as text using
> "ASCII" or "UTF-8" or whatever, equally, you can interpret those octets
> as an image using "PNG", "JPEG" etc. But those are secondary
> transformations, separate from the business of reading from and writing
> to a file.

Ack!

> We should have Word8-based interfaces to file and network handles.
> Whether or not the old Char-based ones should be deprecated, or whatever,
> I don't know.

I think any notion of treating the _raw_ contents of a file as Chars
must go, because it is simply incorrect. It's like a typo someone made,
because for a moment, he got Haskell Char and C char mixed up.

> As for Unicode codepoints, if there's to be an internationalisation
> effort for Haskell, the type of character literals, Char, should be fixed
> as the type for Unicode codepoints, much as it already is in GHC.

Ack.

Sven Moritz
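
To make the split between "reading octets" and "interpreting them as text" concrete, here is a rough sketch. It is only an illustration, not a proposed library interface: the names readOctets, decodeASCII and encodeASCII are made up for this example, and it assumes a current GHC with the bytestring package available for the Word8-based file read.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)
import qualified Data.ByteString as B

-- Reading a file yields octets and nothing else.
-- (readOctets is a hypothetical helper; B.readFile comes from bytestring.)
readOctets :: FilePath -> IO [Word8]
readOctets path = B.unpack <$> B.readFile path

-- A secondary transformation: interpret octets as ASCII text.
-- Returns Nothing if any octet lies outside the 7-bit range.
decodeASCII :: [Word8] -> Maybe String
decodeASCII = traverse step
  where
    step w
      | w < 0x80  = Just (chr (fromIntegral w))
      | otherwise = Nothing

-- The inverse transformation: Chars (code points) back to octets.
-- Returns Nothing for any character that ASCII cannot represent.
encodeASCII :: String -> Maybe [Word8]
encodeASCII = traverse step
  where
    step c
      | ord c < 0x80 = Just (fromIntegral (ord c))
      | otherwise    = Nothing

main :: IO ()
main = do
  print (decodeASCII [72, 105])      -- Just "Hi"
  print (decodeASCII [0xC3, 0xA9])   -- Nothing: not valid ASCII
  print (encodeASCII "Hi")           -- Just [72,105]
```

The point of the sketch is only that the handle deals in Word8, while Char appears solely on the far side of an explicit decoding step that can fail. A UTF-8 or Latin-1 decoder would slot in next to decodeASCII with the same shape.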
