[Bug c++/88507] utf8 not displayed

2018-12-17 Thread jg at jguk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88507

--- Comment #5 from Jonny Grant ---
Created attachment 45247
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45247&action=edit
C example of this UTF8 not displaying

2018-12-17 Thread jg at jguk dot org

--- Comment #4 from Jonny Grant ---
Clang gives an appropriate message and displays the UTF-8 correctly.

GCC just needs to catch up with Clang on this one...


#1 with x86-64 clang (trunk)
<source>:8:7: error: non-ASCII characters are not allowed outside of literals and identifiers
st£ing buf;
  ^
<source>:8:5: error: unknown type name 'st'
st£ing buf;
^
<source>:8:12: error: expected ';' at end of declaration
st£ing buf;
  ^
  ;
<source>:10:5: error: use of undeclared identifier 'buf'
buf = "£"
^
4 errors generated.
Compiler returned: 1

2018-12-17 Thread jg at jguk dot org

--- Comment #3 from Jonny Grant ---
ICC displays the UTF-8 correctly:


#1 with x86-64 icc 19.0.1
<source>(8): error: unrecognized token
  st£ing buf;
^

<source>(8): error: identifier "st" is undefined
  st£ing buf;
  ^

<source>(8): error: expected a ";"
  st£ing buf;
^

<source>(10): error: identifier "buf" is undefined
  buf = "£"
  ^

<source>(11): error: expected a ";"
  }
  ^

compilation aborted for <source> (code 2)
Compiler returned: 2

2018-12-17 Thread jg at jguk dot org

--- Comment #2 from Jonny Grant ---
(In reply to Jonathan Wakely from comment #1)
> (In reply to Jonny Grant from comment #0)
> > Hello
> > 
> > This is a typo in the word "string", just reporting as perhaps it could
> > show £ correctly, as it does in the line 10 error.
> 
> But then you couldn't have two separate caret locations pointing to the two
> invalid bytes, because it would only occupy a single column. You also assume
> the terminal is capable of showing UTF-8 characters.

OK. I would suggest it is worth displaying "st£ing" in full and saying that
‘st£ing’ is not a valid identifier (an identifier is made of Latin letters,
underscores and digits, and starts with a non-digit character), as per the
C/C++ specs?

Example expected output:

$ g++ -Wall -o string string.cpp
string.cpp: In function ‘int main()’:
string.cpp:8:5: error: ‘st£ing’ is not a valid identifier as it contains non-Latin characters
 st£ing buf;
 ^~
string.cpp:8:5: note: suggested alternative: ‘string’
 st£ing buf;
 ^~
 string
string.cpp:10:5: error: ‘buf’ was not declared in this scope
 buf = "£"
 ^~~

2018-12-17 Thread redi at gcc dot gnu.org

--- Comment #1 from Jonathan Wakely ---
(In reply to Jonny Grant from comment #0)
> Hello
> 
> This is a typo in the word "string", just reporting as perhaps it could
> show £ correctly, as it does in the line 10 error.

But then you couldn't have two separate caret locations pointing to the two
invalid bytes, because it would only occupy a single column. You also assume
the terminal is capable of showing UTF-8 characters.


> Perhaps could also show the
> stray bytes in hex as well? ie "0xF3C2"

I don't see why that would be an improvement.