I'd like to fix more of our unicode mess while we're at it.
For example, Mn (non-spacing combining marks) should be allowed in
varid_{cont}, so that we don't get this:
h let é=() in é
()
h let x́=() in x́
<interactive>:6:6: lexical error at character '\769'
(that’s because x́ is denormalized and is actually 2 code points).
[:Mc:] should probably be allowed too. We could also include the Unicode ′ ″ ‴ ⁗ primes
(there's already a proposal for this IIRC).
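
To make this a bit more concrete, here is a rough sketch using only
Data.Char from base. The names codePoints and isVaridCont are made up
for illustration, and the predicate is just one possible reading of the
suggestion above (letters, decimal digits, underscore, apostrophe, Mn,
Mc and the four primes); it is not what GHC's lexer or the Report
currently does.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, ord)

-- Code points of a string, to make decomposed sequences visible.
codePoints :: String -> [Int]
codePoints = map ord

-- Hypothetical predicate for characters allowed after the first
-- character of a varid. This is an illustrative assumption, not the
-- rule used by GHC or the Haskell Report.
isVaridCont :: Char -> Bool
isVaridCont c =
  c == '_'
    || c == '\''
    || c `elem` primes
    || generalCategory c `elem` allowed
  where
    -- U+2032 prime, U+2033 double prime, U+2034 triple prime,
    -- U+2057 quadruple prime
    primes = ['\x2032', '\x2033', '\x2034', '\x2057']
    allowed =
      [ UppercaseLetter
      , LowercaseLetter
      , TitlecaseLetter
      , OtherLetter
      , DecimalNumber
      , NonSpacingMark       -- Mn, e.g. '\769' (combining acute accent)
      , SpacingCombiningMark -- Mc
      ]
```

With these definitions, codePoints "x\769" is [120,769] and
generalCategory '\769' is NonSpacingMark, so isVaridCont '\769' holds
and the decomposed x́ would get through under this rule.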
We have some prior work to look at: at least the Java Language
Specification and UAX #31. One problem is that doing it perfectly will
require normalization (but there's always the Java way: just ignore
it).
(I'm willing to formulate everything if there's some agreement to fix
this.)
Edward Kmett ekm...@gmail.com writes:
Back in 2008 or so, GHC changed how its parser handles Unicode
characters in the OtherLetter category: they are now treated as lower
case, so that languages like Japanese that lack case can be used in
identifier names:
https://ghc.haskell.org/trac/ghc/ticket/1103
In a recent thread on reddit, Lennart Augustsson pointed out that this
change was never backported to Haskell'.
http://www.reddit.com/r/haskell/comments/2dce3d/%E0%B2%A0_%E0%B2%A0_string_a/cjo68ij
Would it make sense to adopt this change in the language standard?
When he made the change to GHC, Marlow noted that he was considering
bringing it up for Haskell', but here we are 6 years later.
___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime