Hello, I am also not an expert but I got curious and did a bit of Wikipedia reading. Based on what I understood, here are two (related) questions that it might be nice to clarify in a future version of the report:
1. What is the alphabet used by the grammar in the Haskell report? My understanding is that the intended alphabet is Unicode code points (sometimes referred to as Unicode characters). There is no way to refer to specific code points by escaping as in Java (the link that Gaby shared); you just have to write the code points directly (and there are plenty of encodings for doing that, e.g. UTF-8).

2. Do we respect "Unicode equivalence" (http://en.wikipedia.org/wiki/Canonical_equivalence) in Haskell source code? The issue here is that, apparently, some sequences of Unicode code points/characters are supposed to be morally the same. For example, it would appear that there are two different ways to write the Spanish letter ñ: it has its own code point, but it can also be made by writing "n" followed by a combining modifier that puts the tilde on top.

I would guess that implementing "Unicode equivalence" would not be too hard: supposedly the Unicode standard specifies a "text normalization procedure". However, this would complicate the report's specification, because the alphabet would no longer be just Unicode code points but equivalence classes of code-point sequences.

Thoughts?
-Iavor

On Fri, Mar 16, 2012 at 4:49 PM, Ian Lynagh <ig...@earth.li> wrote:
>
> Hi Gaby,
>
> On Fri, Mar 16, 2012 at 06:29:24PM -0500, Gabriel Dos Reis wrote:
>>
>> OK, thanks! I guess a take away from this discussion is that what
>> is a punctuation is far less well defined than it appears...
>
> I'm not really sure what you're asking. Haskell's uniSymbol includes all
> Unicode characters (should that be codepoints? I'm not a Unicode expert)
> in the punctuation category; I'm not sure what the best reference is,
> but e.g. table 12 in
>     http://www.unicode.org/reports/tr44/tr44-8.html#Property_Values
> lists a number of Px categories, and a meta-category P "Punctuation".
>
> Thanks
> Ian
>
> _______________________________________________
> Haskell-prime mailing list
> Haskell-prime@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-prime
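
P.S. To make the ñ example from question 2 concrete, here is a small sketch. It uses Python's standard `unicodedata` module purely for illustration, since it ships a normalizer; it says nothing about how a Haskell lexer does or should behave. The helper name `canonically_equal` is my own invention, and I am assuming that NFC is the relevant normalization form.

```python
import unicodedata

# Two spellings of the Spanish letter ñ.
precomposed = "\u00f1"   # U+00F1 LATIN SMALL LETTER N WITH TILDE
decomposed = "n\u0303"   # "n" followed by U+0303 COMBINING TILDE


def code_points(s):
    """Return the code points of s as a list of integers."""
    return [ord(c) for c in s]


def canonically_equal(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # Compared code point by code point, the two spellings differ.
    print(code_points(precomposed), code_points(decomposed))
    print(precomposed == decomposed)
    # After normalization they compare equal.
    print(canonically_equal(precomposed, decomposed))
```

With a code-point alphabet the two spellings are different inputs (one code point versus two); only after a normalization pass do they coincide, which is the extra step the report would have to specify.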