Re: String != [Char]

Brandon Allbery Mon, 26 Mar 2012 10:13:25 -0700

On Mon, Mar 26, 2012 at 06:08, Christian Siefkes <[email protected]> wrote:


> On 03/26/2012 02:39 AM, Gabriel Dos Reis wrote:
> > True, but should the language definition default to a string type
> > that is one of the most unsuited for text processing in the 21st
> > century where global multilingualism abounds?  Even C has qualms
> > about that.
> ...
> > I have no doubt believing that if all texts my students have to
> > process are US ASCII, [Char] is more than sufficient.  So, I have
> > sympathy for your position.  However,  I doubt [Char] would be
> > adequate if I ask them to share texts from their diverse cultures.
>
> Uh, while a C char is (usually) just a byte (8 bits of information, like
> Word8 in Haskell), a Haskell Char is a Unicode character (21 bits of
> information). A single C char cannot contain an arbitrary Unicode character,
> while a Haskell Char can, and does. Hence [Char] is (efficiency issues
> aside) perfectly adequate for dealing with texts written in arbitrary
> languages.
>

...as long as you ignore combining characters and the like.  I claim
ignoring them in this way is just continuing the same "good enough for my
language" attitude that has plagued text handling ever since someone got
the notion that maybe text processing should consider more than just ISO
8859-1 and got roundly pooh-poohed by the community.
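
To make the combining-character point concrete, here is a minimal sketch of
how two spellings of the same visible character behave as [Char]. The names
precomposed, decomposed and visibleLength are made up for illustration, and
filtering on isMark is only a rough approximation of grapheme clusters, not
the full Unicode segmentation algorithm.

```haskell
module Main where

import Data.Char (isMark)

-- Precomposed form of "é": the single code point U+00E9.
precomposed :: String
precomposed = "\x00E9"

-- Decomposed form of "é": 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed :: String
decomposed = "e\x0301"

-- Hypothetical helper: count code points that are not combining marks.
-- This approximates "characters as a reader sees them" and nothing more.
visibleLength :: String -> Int
visibleLength = length . filter (not . isMark)

main :: IO ()
main = do
  print (length precomposed)          -- 1
  print (length decomposed)           -- 2
  print (precomposed == decomposed)   -- False: list equality compares code points
  print (visibleLength decomposed)    -- 1
  -- reverse moves the accent in front of the 'e', detaching it from its base
  print (reverse decomposed == ['\x0301', 'e'])  -- True
```

Both strings render as the same letter, yet length, (==) and reverse all
treat them as different lists of code points.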

-- 
brandon s allbery                                      [email protected]
wandering unix systems administrator (available)     (412) 475-9364 vm/sms

_______________________________________________
Haskell-prime mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-prime
