On Mon, Mar 26, 2012 at 5:08 AM, Christian Siefkes
<christ...@siefkes.net> wrote:
> On 03/26/2012 02:39 AM, Gabriel Dos Reis wrote:
>> True, but should the language definition default to a string type
>> that is one the most unsuited for text processing in the 21st
>> century where global multilingualism abounds? Even C has qualms
>> about that.
> ...
>> I have no doubt believing that if all texts my students have to
>> process are US ASCII, [Char] is more than sufficient. So, I have
>> sympathy for your position. However, I doubt [Char] would be
>> adequate if I ask them to shared texts from their diverse cultures.
>
> Uh, while a C char is (usually) just a byte (2^8 bits of information, like
> Word8 in Haskell), a Haskell Char is a Unicode character (2^21 bits of
> information).

It is not the precision of Char or char that is the issue here.
It has been clarified at several points that Char is not a Unicode
character, but a Unicode code point. Not every Unicode code point
represents a Unicode character, and not every sequence of Unicode
code points represents a character or a sequence of Unicode characters.

> A single C char cannot contain arbitrary Unicode character,
> while a Haskell Char can, and does. Hence [Char] is (efficiency issues
> aside) perfectly adequate for dealing with texts written in arbitrary
> languages.

See above.

-- Gaby

_______________________________________________
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime
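
For readers who want to see the code point versus character distinction from the message above in concrete terms, here is a minimal sketch. It assumes GHC and uses only Data.Char from base. The names decomposed, precomposed, and loneSurrogate are illustrative and do not come from the thread.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- The letter "é" written as 'e' followed by U+0301 COMBINING ACUTE ACCENT:
-- one user-perceived character, but two code points (two Chars).
decomposed :: String
decomposed = "e\x0301"

-- The same letter as the single precomposed code point U+00E9.
precomposed :: String
precomposed = "\x00E9"

-- U+D800 is a surrogate code point: a legal Char value in GHC,
-- but it does not represent a Unicode character.
loneSurrogate :: Char
loneSurrogate = '\xD800'

main :: IO ()
main = do
  print (length decomposed)                  -- 2
  print (length precomposed)                 -- 1
  print (decomposed == precomposed)          -- False
  print (generalCategory loneSurrogate)      -- Surrogate
  print (generalCategory (last decomposed))  -- NonSpacingMark
```

The two spellings of "é" look identical when rendered, yet they differ in length and compare unequal as [Char], and a lone surrogate is accepted as a Char even though it is not a character. Treating them as the same text would need normalization or grapheme segmentation, which base does not provide.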