I'd like to fix more of our Unicode mess while we're at it. For example, Mn (non-spacing combining marks) should be allowed in varid_{cont}, so it won't be like this:
h> let é=() in é
()
h> let x́=() in x́

<interactive>:6:6: lexical error at character '\769'

(That's because x́ is denormalized and is actually 2 code points.) [:Mc:] should probably be allowed too. Also, we could include the Unicode primes ′ ″ ‴ ⁗ (there's already a proposal for this, IIRC).

We have some prior work to look at: at least the Java Language Specification and UAX #31.

One problem is that doing it perfectly will require normalization (but there's always the Java way: just ignore it).

(I'm willing to formulate everything if there's some agreement to fix this.)

Edward Kmett <ekm...@gmail.com> writes:

> Back in 2008 or so, GHC changed the behavior of unicode characters in
> the parser that parse as OtherLetter to make them parse as lower case
> so that languages like Japanese that lack case could be used in
> identifier names:
>
> https://ghc.haskell.org/trac/ghc/ticket/1103
>
> In a recent thread on reddit Lennart Augustsson pointed out that this
> change was never backported to Haskell'.
>
> http://www.reddit.com/r/haskell/comments/2dce3d/%E0%B2%A0_%E0%B2%A0_string_a/cjo68ij
>
> Would it make sense to adopt this change in the language standard?
>
> Marlow when he made the change to GHC noted he was considering
> bringing it up to Haskell' but here we are 6 years later.

_______________________________________________
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime
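
P.S. To make the suggested varid_{cont} rule a bit more concrete, here is a rough sketch using only Data.Char from base. The name isVarIdCont is made up for illustration and is not anything in GHC's lexer; the letter/digit part is only an approximation of the Report's small/large/digit classes, and allowing Mc and the four primes is just the suggestion from this mail, not something agreed on. Normalization is ignored entirely (the "Java way").

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isAlpha)

-- Hypothetical predicate: may this character continue a varid?
-- Approximates the existing rule (letters, decimal digits, '_', '\'')
-- and adds the suggested Mn/Mc marks and Unicode primes.
isVarIdCont :: Char -> Bool
isVarIdCont c =
  isAlpha c
    || c == '_'
    || c == '\''
    || cat == DecimalNumber
    || cat `elem` [NonSpacingMark, SpacingCombiningMark] -- Mn, Mc (suggested)
    || c `elem` primes                                   -- suggested
  where
    cat = generalCategory c
    -- PRIME, DOUBLE PRIME, TRIPLE PRIME, QUADRUPLE PRIME
    primes = ['\x2032', '\x2033', '\x2034', '\x2057']

main :: IO ()
main = do
  -- U+0301 COMBINING ACUTE ACCENT, the character GHCi rejected
  print (generalCategory '\769')
  -- "x" followed by U+0301: both characters pass the sketched rule
  print (all isVarIdCont "x\769")
```

With a rule like this, the decomposed x́ would lex as a single identifier; whether it should then compare equal to a precomposed spelling of the same name is exactly the normalization question.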