On Tuesday, 15 September 2020 at 02:23:31 UTC, Paul Backus wrote:
On Tuesday, 15 September 2020 at 01:49:13 UTC, James Blachly
wrote:
I wish to write a function including ∂x and ∂y (these are
trivial to type with appropriate keyboard shortcuts - alt+d on
Mac), but without a Unicode byte order mark at the beginning
of the file, the lexer rejects the tokens.
It is apparently not easy to insert such marks (AFAICT no
common tool does this specifically), while other languages
work fine (i.e., accept Unicode in their source) without one.
Is there a downside to at least presuming UTF-8?
According to the spec [1] this should Just Work. I'd recommend
filing a bug.
[1] https://dlang.org/spec/lex.html#source_text
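One way to check whether the BOM is really the deciding factor is to write the same source file with and without one and compile both. Below is a minimal Python sketch of that. The file names and the sample declaration are hypothetical; the only thing relied on is the standard `codecs.BOM_UTF8` constant.

```python
import codecs
from pathlib import Path


def write_source(path, text, with_bom):
    """Write `text` as UTF-8, optionally prefixed with the UTF-8 byte order mark."""
    data = text.encode("utf-8")
    if with_bom:
        data = codecs.BOM_UTF8 + data  # the three bytes EF BB BF
    Path(path).write_bytes(data)


def has_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    return Path(path).read_bytes().startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    # Hypothetical sample: one declaration that uses ∂ in an identifier.
    sample = "double ∂x = 0.0;\n"
    write_source("partial_bom.d", sample, with_bom=True)
    write_source("partial_nobom.d", sample, with_bom=False)
    print(has_bom("partial_bom.d"), has_bom("partial_nobom.d"))
```

If both files are rejected in the same way, the BOM is probably not what the lexer is objecting to.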
The identifiers section
(https://dlang.org/spec/lex.html#identifiers) describes
identifiers as:
Identifiers start with a letter, _, or universal alpha, and are
followed by any number of letters, _, digits, or universal
alphas. Universal alphas are as defined in ISO/IEC 9899:1999(E)
Appendix D of the C99 Standard.
I was unable to find the definition of a "universal alpha", or
to determine whether it includes non-ASCII alphabetic characters.
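As a first sanity check, it helps to look at what kind of character ∂ actually is. The sketch below uses Python's standard `unicodedata` module to report the code point, name, and general category of a few characters; the sample characters are chosen only for illustration. The general category is a rough proxy at best: the C99 list is a fixed table of code point ranges, not a rule based on Unicode categories, so this does not settle the question.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


if __name__ == "__main__":
    # An ASCII letter, a Greek letter, and the character from the question.
    for ch in ["x", "π", "∂"]:
        print(ch, *describe(ch))
```

Here ∂ is reported as U+2202 PARTIAL DIFFERENTIAL with category Sm (math symbol), whereas π is category Ll (lowercase letter). That suggests ∂ may simply not count as a universal alpha, independent of any BOM, though this would need to be confirmed against the C99 table itself.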