Thu 2003-07-10 at 04.56, Glynn Clements wrote:
> OTOH, existing implementations (at least GHC and Hugs) currently read
> and write "8-bit binary", i.e. characters 0-255 get read and written
> "as-is" and anything else breaks, and changing that would probably
> break a fair amount of existing code.
What I would like to see is a package for converting between different
encodings and character sets. Python has two types for strings: 'str'
(which is just a sequence of octets) and 'unicode'. You can encode and
decode between them, which I find pretty neat:
'foo åäö'.decode('latin1') -> unicode string
ustr.encode('latin1') -> string, breaks if there are non-latin1
characters in the string
ustr.encode('utf-8') -> UTF-8 representation of the string.
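(For reference, in modern Python 3 the same round-trip looks like this; the names have changed, with bytes.decode and str.encode playing the roles of the decode/encode calls above. The 'foo åäö' sample string is just an illustration, not from the original examples:)

```python
# Python 3 renamed the byte-sequence type 'str' to 'bytes'
# and the 'unicode' type to 'str'.
raw = b'foo \xe5\xe4\xf6'            # "foo åäö" encoded as Latin-1

text = raw.decode('latin1')          # bytes -> str (the "unicode" side)
assert text == 'foo åäö'

assert text.encode('latin1') == raw  # str -> bytes, round-trips cleanly

utf8 = text.encode('utf-8')          # UTF-8 representation of the string
assert utf8 == b'foo \xc3\xa5\xc3\xa4\xc3\xb6'

# Encoding to a charset that can't represent the characters breaks,
# just like encode('latin1') on non-Latin-1 text:
try:
    text.encode('ascii')
except UnicodeEncodeError:
    pass  # å, ä, ö have no ASCII representation
```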
If I recall correctly, the 'str' type is being renamed to highlight that
it's actually only a sequence of bytes, whereas 'unicode' is the Really
Nice string type...
Having something like this in Haskell would be wonderful; unfortunately,
I don't know much about Unicode beyond happily using it, so I don't have
any suggestions or anything. :)
/Martin
--
Martin Sjögren
[EMAIL PROTECTED]
Phone: +46 (0)31 7490880 Cell: +46 (0)739 169191
GPG key: http://www.strakt.com/~martin/gpg.html
