Yeah, I specifically excluded the ASCII prime (') from special handling in jhc because its meaning is already overloaded in Haskell. I just added the subscript/superscript characters to the 'trailing' character class.
John

On Wed, Jun 25, 2014 at 12:54 PM, Mikhail Vorozhtsov
<mikhail.vorozht...@gmail.com> wrote:
> Isn't it weird that you can't write `a₁'`? I was considering proposing
>
> varid -> (small { small | large | digit | ' | primes } { subsup | primes })
>          (EXCEPT reservedid)
>
> but felt that it would be odd to allow primes in the middle of an identifier
> but not super/subscripts. I wish we could just abandon things like `a'bc'd`
> altogether...
>
>
> On 06/15/2014 03:58 AM, John Meacham wrote:
>>
>> I have this feature in jhc, where I have a 'trailing' character class
>> that can appear at the end of both symbols and ids.
>>
>> currently it consists of
>>
>> $trailing = [₀₁₂₃₄₅₆₇₈₉⁰¹²³⁴⁵⁶⁷⁸⁹₍₎⁽⁾₊₋]
>>
>> John
>>
>> On Sat, Jun 14, 2014 at 7:48 AM, Mikhail Vorozhtsov
>> <mikhail.vorozht...@gmail.com> wrote:
>>>
>>> Hello lists,
>>>
>>> As some of you may know, GHC's support for Unicode characters in lexemes
>>> is rather crude and hence prone to inconsistencies in their handling
>>> versus the ASCII counterparts. For example, APOSTROPHE is treated
>>> differently from PRIME:
>>>
>>> λ> data a +' b = Plus a b
>>> <interactive>:3:9:
>>>     Unexpected type ‘b’
>>>     In the data declaration for ‘+’
>>>     A data declaration should have form
>>>       data + a b c = ...
>>> λ> data a +′ b = Plus a b
>>>
>>> λ> let a' = 1
>>> λ> let a′ = 1
>>> <interactive>:10:8: parse error on input ‘=’
>>>
>>> Also some rather bizarre looking things are accepted:
>>>
>>> λ> let ᵤxᵤy = 1
>>>
>>> In the spirit of improving things little by little I would like to
>>> propose:
>>>
>>> 1. Handle single/double/triple/quadruple Unicode PRIMEs the same way as
>>> APOSTROPHE, meaning the following alterations to the lexer:
>>>
>>> primes  -> U+2032 | U+2033 | U+2034 | U+2057
>>> symbol  -> ascSymbol | uniSymbol (EXCEPT special | _ | " | ' | primes)
>>> graphic -> small | large | symbol | digit | special | " | ' | primes
>>> varid   -> (small { small | large | digit | ' | primes })
>>>            (EXCEPT reservedid)
>>> conid   -> large { small | large | digit | ' | primes }
>>>
>>> 2. Introduce a new lexer nonterminal "subsup" that would include the
>>> Unicode sub/superscript[1] versions of numbers, "-", "+", "=", "(", ")",
>>> Latin and Greek letters. And allow these characters to be used in names
>>> and operators:
>>>
>>> symbol  -> ascSymbol | uniSymbol
>>>            (EXCEPT special | _ | " | ' | primes | subsup)
>>> digit   -> ascDigit | uniDigit (EXCEPT subsup)
>>> small   -> ascSmall | uniSmall (EXCEPT subsup) | _
>>> large   -> ascLarge | uniLarge (EXCEPT subsup)
>>> graphic -> small | large | symbol | digit | special | " | ' | primes
>>>            | subsup
>>> varid   -> (small { small | large | digit | ' | primes | subsup })
>>>            (EXCEPT reservedid)
>>> conid   -> large { small | large | digit | ' | primes | subsup }
>>> varsym  -> (symbol (EXCEPT :) {symbol | subsup})
>>>            (EXCEPT reservedop | dashes)
>>> consym  -> (: {symbol | subsup}) (EXCEPT reservedop)
>>>
>>> If this proposal is received favorably, I'll write a patch for GHC based
>>> on my previous stab at the problem[2].
>>>
>>> P.S. I'm CC-ing Cafe for extra attention, but please keep the discussion
>>> to the GHC users list.
>>>
>>> [1] https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts
>>> [2] https://ghc.haskell.org/trac/ghc/ticket/5108
>>> _______________________________________________
>>> Glasgow-haskell-users mailing list
>>> Glasgow-haskell-users@haskell.org
>>> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>>
>>
>>
>

--
John Meacham - http://notanumber.net/
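For anyone who wants to experiment with the grammar variants discussed in this thread, here is a minimal Haskell sketch of the tentative rule `varid -> (small { small | large | digit | ' | primes } { subsup | primes }) (EXCEPT reservedid)`. It is not taken from GHC's or jhc's lexer. The names `isVarId`, `isBody`, `isSmall`, `isPrimeChar` and `isTrailing` are hypothetical, `small`/`large`/`digit` are restricted to ASCII to keep the behaviour easy to check, and `subsup` is approximated by the `$trailing` set quoted above, which has no sub/superscript letters.

```haskell
module VarIdSketch where

import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- Assumption: 'subsup' is approximated by jhc's $trailing class as quoted
-- in the thread (digits, parentheses, plus and minus; no letters).
isTrailing :: Char -> Bool
isTrailing c = c `elem` "₀₁₂₃₄₅₆₇₈₉⁰¹²³⁴⁵⁶⁷⁸⁹₍₎⁽⁾₊₋"

-- primes -> U+2032 | U+2033 | U+2034 | U+2057
isPrimeChar :: Char -> Bool
isPrimeChar c = c `elem` ['\x2032', '\x2033', '\x2034', '\x2057']

-- Simplification: only ASCII lowercase letters and '_' count as 'small'.
isSmall :: Char -> Bool
isSmall c = isAsciiLower c || c == '_'

-- small | large | digit | ' | primes, with ASCII-only letters and digits.
isBody :: Char -> Bool
isBody c =
  isSmall c || isAsciiUpper c || isDigit c || c == '\'' || isPrimeChar c

-- Reserved identifiers from the Haskell 2010 report.
reservedIds :: [String]
reservedIds =
  [ "case", "class", "data", "default", "deriving", "do", "else"
  , "foreign", "if", "import", "in", "infix", "infixl", "infixr"
  , "instance", "let", "module", "newtype", "of", "then", "type"
  , "where", "_"
  ]

-- A small start character, then body characters, then a tail made only of
-- sub/superscripts and Unicode primes; reserved identifiers are rejected.
isVarId :: String -> Bool
isVarId [] = False
isVarId s@(c : cs) =
  isSmall c && null afterTail && s `notElem` reservedIds
  where
    afterBody = dropWhile isBody cs
    afterTail = dropWhile (\x -> isTrailing x || isPrimeChar x) afterBody
```

Under these assumptions `isVarId "a₁"` and `isVarId "a'bc'd"` are `True`, while `isVarId "a₁'"` is `False` because the ASCII apostrophe is not allowed in the tail, which is the oddity raised in the reply. `isVarId "ᵤxᵤy"` is also `False`, since a subscript letter cannot start an identifier here.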