On 24 March 2012 22:33, Freddie Manners <f.mann...@gmail.com> wrote:
> To add my tuppence-worth on this, addressed to no-one in particular:
>
> (1) I think getting hung up on UTF-8 correctness is a distraction here. I
> can't imagine anyone suggesting that the C/C++ standards removed support for
> (char*) because it wasn't UTF-8 correct: sure, you'd recommend people use a
> different type when it matters, but the language standard itself shouldn't
> be driven by technical issues that don't affect most people most of the
> time. I'm sure it's good engineering practice to worry about these things,
> but the standard isn't there to encourage good engineering practice.

It doesn't really have anything to do with UTF-8. UTF-8 is just a
particular serialisation of a Unicode string. Here's a simple
illustration of the problems one faces:

Let's say you want to search for the string "fix". The problem is
that the sequence 'f','i' can be represented either as ['f', 'i'] or
as [chr 0xfb01] (the "fi" ligature). The text-icu package provides a
function to normalise a string such that only one of these forms can
occur in each string. Because the world's languages are rather
complex, there are many more such cases which need to be handled
properly (if you don't want to run into weird corner cases).

> (2) I'd suggest that a proposal that advocated overloaded string literals --
> of which [Char] was an option -- couldn't be much more confusing from a
> pedagogical perspective than the fact that numeric literals are overloaded.
> Since that seems to be one of the main biases in favour of [Char] in the
> current standard, that might be a possible incremental fix.

I agree that this proposal should probably include the standardisation
of the OverloadedStrings extension.

>
> Best,
> Freddie
>
>
> On 24 March 2012 22:15, Ian Lynagh <ig...@earth.li> wrote:
>>
>> On Sat, Mar 24, 2012 at 08:38:23PM +0000, Thomas Schilling wrote:
>> > On 24 March 2012 20:16, Ian Lynagh <ig...@earth.li> wrote:
>> > >
>> > >> Correctness
>> > >> ==========
>> > >>
>> > >> Using list-based operations on Strings are almost always wrong
>> > >
>> > > Data.Text seems to think that many of them are worth reimplementing for
>> > > Text. It looks like someone's systematically gone through Data.List.
>> >
>> > That's exactly what happened as part of the platform inclusion
>> > process. In fact, there was quite a bit of bike shedding whether the
>> > Text API should be compatible with the list API or not. In the end
>> > the decision was made to add all the list functions even if that
>> > encouraged running into unicode issues.
>> > I'm pretty sure you participated in that discussion.
>>
>> As far as I remember, a few functions were added to text and bytestring
>> during that, but mostly the discussion was about naming.
>>
>> Even in the first 0.1 release of text:
>>
>> http://hackage.haskell.org/packages/archive/text/0.1/doc/html/Data-Text.html
>> there is a large amount of Data.List covered, e.g. map, transpose,
>> foldl1', minimum, mapAccumR, groupBy.
>>
>>
>> Thanks
>> Ian
>>
>>
>> _______________________________________________
>> Haskell-prime mailing list
>> Haskell-prime@haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-prime

--
Push the envelope. Watch it bend.
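
To make the "fix" example from this message concrete, here is a minimal
Haskell sketch. It uses only Data.List from base. The names expandLigature
and containsNormalised are hypothetical, and expanding a single ligature is
only a toy stand-in for real Unicode normalisation, which covers many more
cases; with text-icu one would presumably call its normalize function with a
compatibility mode such as NFKC instead.

```haskell
import Data.List (isInfixOf)

-- Toy stand-in for compatibility normalisation (an assumption for
-- illustration): expands only U+FB01, the "fi" ligature.
expandLigature :: Char -> String
expandLigature '\xfb01' = "fi"
expandLigature c        = [c]

-- Hypothetical helper: bring needle and haystack into the same form
-- before doing an ordinary list-based substring search.
containsNormalised :: String -> String -> Bool
containsNormalised needle haystack =
  concatMap expandLigature needle `isInfixOf` concatMap expandLigature haystack

main :: IO ()
main = do
  -- 'p', 'r', 'e', U+FB01, 'x': renders like "prefix" but holds no 'f' or 'i'
  let haystack = "pre\xfb01x"
  print ("fix" `isInfixOf` haystack)         -- naive search: False
  print (containsNormalised "fix" haystack)  -- after expansion: True
```

The naive search misses the match because the list-based comparison works
code point by code point; once both sides are in the same form, the same
isInfixOf call succeeds.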