Issue: 54732
Summary: Clang 14 rejects certain Unicode characters in identifiers that are accepted by Clang 13 and the C++ Standard
Labels:
Assignees:
Reporter: tttapa
Clang 14 rejects some Unicode characters in identifiers, such as ₊ (U+208A) and other subscript characters. These characters are in the ranges that the `[lex.name]` section of the C++ Standard allows in identifiers. Recent versions of GCC and older versions of Clang accept them without raising any errors.

For example:
```cpp
double foo(double xₖ, double xₖ₊₁) {
  return xₖ₊₁ - xₖ;
}
```
```sh
$ clang++-14 -c unicode.cpp -std=c++20
unicode.cpp:1:36: error: character <U+208A> not allowed in an identifier
double foo(double xₖ, double xₖ₊₁) {
                               ^
unicode.cpp:1:39: error: character <U+2081> not allowed in an identifier
double foo(double xₖ, double xₖ₊₁) {
                                ^
unicode.cpp:2:14: error: character <U+208A> not allowed in an identifier
  return xₖ₊₁ - xₖ;
           ^
unicode.cpp:2:17: error: character <U+2081> not allowed in an identifier
  return xₖ₊₁ - xₖ;
            ^
4 errors generated.
```
```sh
$ clang++-14 --version
Ubuntu clang version 14.0.1-++20220402053234+23d08271a4b2-1~exp1~20220402053315.111
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```
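As a sanity check on the claim about the allowed ranges, here is a minimal sketch that tests the three subscript code points from the example against a single range. It assumes that the table of allowed identifier characters in `[lex.name]` (as published in C++11 through C++20) contains the range 2070-218F; the helper name `in_range_2070_218F` is hypothetical and exists only for this illustration. The sketch uses ASCII identifiers only, so it should compile with any of the compilers mentioned above.

```cpp
#include <cstdint>

// Hypothetical helper for illustration; not part of Clang or the Standard.
// Assumption: 2070-218F is one of the ranges listed as allowed in
// identifiers by the [lex.name] table in C++11 through C++20.
constexpr bool in_range_2070_218F(std::uint32_t code_point) {
  return code_point >= 0x2070 && code_point <= 0x218F;
}

// The three subscript characters used in unicode.cpp.
static_assert(in_range_2070_218F(0x2096), "U+2096 (subscript k)");
static_assert(in_range_2070_218F(0x208A), "U+208A (subscript plus)");
static_assert(in_range_2070_218F(0x2081), "U+2081 (subscript one)");

// A code point outside the range, to show the check is not vacuous.
static_assert(!in_range_2070_218F(0x0041), "U+0041 (ASCII 'A')");
```

Compiling this with `-c -std=c++20` only evaluates the `static_assert`s. It checks the range arithmetic, not the behavior of any particular compiler.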
Is this a deliberate change or a regression bug from Clang 13 to 14?
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
