[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2023-07-20 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

--- Comment #12 from Thiago Jung Bauermann ---
I confirmed that this fixed the failures I was seeing. Thanks again!

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2023-07-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Lewis Hyatt:

https://gcc.gnu.org/g:b2cfe5233e682fc04a9b6fc91f3d30685515630b

commit r14-2665-gb2cfe5233e682fc04a9b6fc91f3d30685515630b
Author: Lewis Hyatt 
Date:   Wed Jul 19 22:07:54 2023 -0400

testsuite: Fix C++ UDL tests failing on 32-bit arch [PR103902]

These tests need to use "size_t" rather than "unsigned long"
for the user-defined literal function arguments.

gcc/testsuite/ChangeLog:

PR preprocessor/103902
* g++.dg/cpp0x/udlit-extended-id-1.C: Change "unsigned long" to
"size_t" throughout.
* g++.dg/cpp0x/udlit-extended-id-3.C: Likewise.
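
For context on this fix: a string literal operator has to declare its length
parameter as std::size_t. The diagnostics in comment #8 show the 32-bit target
looking for an 'unsigned int' length argument, so a declaration spelled with
"unsigned long" is rejected there, even though it happens to be accepted on
targets where std::size_t is unsigned long. Below is a minimal sketch of the
portable spelling. The suffix name _len is made up for illustration and is not
taken from the testsuite.

```cpp
#include <cstddef>

// Hypothetical suffix "_len", for illustration only.
// The second parameter is std::size_t, which is the type the compiler
// passes for the literal's length on every target.
constexpr std::size_t operator""_len(const char*, std::size_t n)
{
  return n;
}

// The length excludes the terminating NUL.
static_assert("abc"_len == 3, "three characters");
```

Writing the parameter as "unsigned long" instead would only compile on targets
where that happens to be the same type as std::size_t.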

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2023-07-19 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

--- Comment #10 from Thiago Jung Bauermann ---
(In reply to Lewis Hyatt from comment #9)
> Thanks, sorry about that, I need to replace "unsigned long" with "size_t".
> Will fix it.

No problem. Thank you!

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2023-07-19 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

--- Comment #9 from Lewis Hyatt  ---
Thanks, sorry about that, I need to replace "unsigned long" with "size_t". Will
fix it.

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2023-07-19 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

Thiago Jung Bauermann  changed:

What|Removed |Added
----------------------------------------------
  CC|        |thiago.bauermann at linaro dot org

--- Comment #8 from Thiago Jung Bauermann ---
Hello,

The new tests udlit-extended-id-1.C and udlit-extended-id-3.C are failing on
armv8l-linux-gnueabihf (tested on Ubuntu 22.04):

Running g++:g++.dg/dg.exp ...
FAIL: g++.dg/cpp0x/udlit-extended-id-1.C -std=c++14 (test for excess errors)
UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C -std=c++14 compilation failed to produce executable
FAIL: g++.dg/cpp0x/udlit-extended-id-1.C -std=c++17 (test for excess errors)
UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C -std=c++17 compilation failed to produce executable
FAIL: g++.dg/cpp0x/udlit-extended-id-1.C -std=c++20 (test for excess errors)
UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C -std=c++20 compilation failed to produce executable
FAIL: g++.dg/cpp0x/udlit-extended-id-3.C -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/udlit-extended-id-3.C -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/udlit-extended-id-3.C -std=c++20 (test for excess errors)

Looking at g++.log, the errors are:

/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:24:14: error: 'const char* operator""_1\U03c3(const char*, long unsigned int)' has invalid argument list
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:29:14: error: 'const char* operator""_\U03a32(const char*, long unsigned int)' has invalid argument list
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:34:14: error: 'const char* operator""_\U00e61(const char*, long unsigned int)' has invalid argument list
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:39:13: error: 'const char* operator""_\U01532(const char*, long unsigned int)' has invalid argument list
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C: In function 'int main()':
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:56:15: error: unable to find string literal operator 'operator""_1\U03c3' with 'const char [4]', 'unsigned int' arguments
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:58:15: error: unable to find string literal operator 'operator""_\U03a32' with 'const char [5]', 'unsigned int' arguments
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:60:15: error: unable to find string literal operator 'operator""_1\U03c3' with 'const char [7]', 'unsigned int' arguments
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:62:15: error: unable to find string literal operator 'operator""_1\U03c3' with 'const char [8]', 'unsigned int' arguments
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:65:15: error: unable to find string literal operator 'operator""_\U03a32' with 'const char [7]', 'unsigned int' arguments
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:67:15: error: unable to find string literal operator 'operator""_\U00e61' with 'const char [4]', 'unsigned int' arguments
/home/thiago.bauermann/src/gcc/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C:69:15: error: unable to find string literal operator 'operator""_\U01532' with 'const char [4]', 'unsigned int' arguments
compiler exited with status 1

Any idea what could be going wrong? They do pass on aarch64-linux, so I wonder
if this is a 32-bit issue?

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2023-07-18 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

Lewis Hyatt  changed:

            What|Removed |Added
----------------------------------
          Status|NEW     |RESOLVED
Target Milestone|---     |14.0
      Resolution|---     |FIXED

--- Comment #7 from Lewis Hyatt  ---
Fixed for GCC 14.

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2023-07-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Lewis Hyatt:

https://gcc.gnu.org/g:1d3e4f4e2d19c3394dc018118a78c1f4b59cb5c2

commit r14-2629-g1d3e4f4e2d19c3394dc018118a78c1f4b59cb5c2
Author: Lewis Hyatt 
Date:   Tue Jul 18 17:16:08 2023 -0400

libcpp: Handle extended characters in user-defined literal suffix [PR103902]

The PR complains that we do not handle UTF-8 in the suffix for a
user-defined literal, such as:

bool operator ""_π (unsigned long long);

In fact we don't handle any extended identifier characters there, whether
UTF-8, UCNs, or the $ sign. We do handle it fine if the optional space after
the "" tokens is included, since then the identifier is lexed in the
"normal" way as its own token. But when it is lexed as part of the string
token, this is handled in lex_string() with a one-off loop that is not aware
of extended characters.

This patch fixes it by adding a new function scan_cur_identifier() that can
be used to lex an identifier while in the middle of lexing another token.

BTW, the other place that has been mis-lexing identifiers is
lex_identifier_intern(), which is used to implement #pragma push_macro
and #pragma pop_macro. This does not support extended characters either.
I will add that in a subsequent patch, because it can't directly reuse the
new function, but rather needs to lex from a string instead of a cpp_buffer.

With scan_cur_identifier(), we do also correctly warn about bidi and
normalization issues in the extended identifiers comprising the suffix.

libcpp/ChangeLog:

PR preprocessor/103902
* lex.cc (identifier_diagnostics_on_lex): New function refactoring
some common code.
(lex_identifier_intern): Use the new function.
(lex_identifier): Don't run identifier diagnostics here, rather let
the call site do it when needed.
(_cpp_lex_direct): Adjust the call sites of lex_identifier ()
accordingly.
(struct scan_id_result): New struct.
(scan_cur_identifier): New function.
(create_literal2): New function.
(lit_accum::create_literal2): New function.
(is_macro): Folded into new function...
(maybe_ignore_udl_macro_suffix): ...here.
(is_macro_not_literal_suffix): Folded likewise.
(lex_raw_string): Handle UTF-8 in UDL suffix via
scan_cur_identifier ().
(lex_string): Likewise.

gcc/testsuite/ChangeLog:

PR preprocessor/103902
* g++.dg/cpp0x/udlit-extended-id-1.C: New test.
* g++.dg/cpp0x/udlit-extended-id-2.C: New test.
* g++.dg/cpp0x/udlit-extended-id-3.C: New test.
* g++.dg/cpp0x/udlit-extended-id-4.C: New test.
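
The idea described in the commit message, scanning the suffix with a routine
that knows about extended characters, can be illustrated with a simplified
sketch. This is not the libcpp implementation: scan_suffix_extended and its
helpers are hypothetical names, every byte >= 0x80 is accepted without checking
that it forms a valid identifier character, and UCNs are checked only for their
shape (\u plus 4 hex digits, or \U plus 8).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration; not taken from libcpp.
inline bool is_ascii_id_start(unsigned char c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

inline bool is_hex_digit(unsigned char c)
{
  return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')
         || (c >= 'A' && c <= 'F');
}

// Number of bytes taken by one identifier character at POS, or 0 if the
// text at POS cannot continue an identifier.
inline std::size_t id_char_len(const std::string &s, std::size_t pos,
                               bool first)
{
  if (pos >= s.size())
    return 0;
  unsigned char c = s[pos];
  if (is_ascii_id_start(c) || (!first && c >= '0' && c <= '9'))
    return 1;
  // Simplification: accept '$' and any UTF-8 byte, one byte at a time.
  if (c == '$' || c >= 0x80)
    return 1;
  // UCN: \uXXXX or \UXXXXXXXX.
  if (c == '\\' && pos + 1 < s.size()
      && (s[pos + 1] == 'u' || s[pos + 1] == 'U'))
    {
      std::size_t digits = (s[pos + 1] == 'u') ? 4 : 8;
      if (pos + 2 + digits > s.size())
        return 0;
      for (std::size_t i = 0; i < digits; ++i)
        if (!is_hex_digit(s[pos + 2 + i]))
          return 0;
      return 2 + digits;
    }
  return 0;
}

// Length in bytes of the identifier-like suffix starting at POS.
inline std::size_t scan_suffix_extended(const std::string &s, std::size_t pos)
{
  std::size_t start = pos;
  bool first = true;
  while (std::size_t n = id_char_len(s, pos, first))
    {
      pos += n;
      first = false;
    }
  return pos - start;
}
```

With this sketch, the suffix in the text `_π (` encoded as UTF-8 is scanned as
three bytes (the underscore plus the two bytes of π), and `_\u03c0x` is scanned
as eight bytes.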

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2023-02-08 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

--- Comment #5 from Andrew Pinski  ---
*** Bug 108717 has been marked as a duplicate of this bug. ***

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2022-06-28 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

--- Comment #4 from Lewis Hyatt  ---
(In reply to Lewis Hyatt from comment #3)
> I can look into that.

Patch waiting for review here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596660.html

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2022-06-10 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

--- Comment #3 from Lewis Hyatt  ---
I can look into that.

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2022-06-10 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

Lewis Hyatt  changed:

What|Removed |Added
---------------------------------------------
  CC|        |maik.urbannek at cattatech dot de

--- Comment #2 from Lewis Hyatt  ---
*** Bug 104640 has been marked as a duplicate of this bug. ***

[Bug preprocessor/103902] GCC requires a space between string-literal and identifier in a literal-operator-id where the identifier is not in basic character set

2022-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103902

Andrew Pinski  changed:

            What|Removed     |Added
------------------------------------------
Last reconfirmed|            |2022-01-04
  Ever confirmed|0           |1
          Status|UNCONFIRMED |NEW
       Component|c++         |preprocessor

--- Comment #1 from Andrew Pinski  ---
From libcpp/lex.c:
  /* Grab user defined literal suffix.  */
  else if (ISIDST (*pos))
    {
      type = cpp_userdef_string_add_type (type);
      ++pos;

      while (ISIDNUM (*pos))
        ++pos;
    }


Hmm, that looks wrong for non-ASCII codes.

I suspect the preprocessor/lexer has other issues with non-ASCII code in other
areas too.
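
To make the problem with the quoted loop concrete, here is a small standalone
sketch of the same ASCII-only scan. The helpers is_idst and is_idnum are
hypothetical stand-ins for the ISIDST and ISIDNUM macros (letters and
underscore, plus digits for the latter), and ascii_suffix_len is an invented
name. The point is only that a UTF-8 encoded character starts with a byte
>= 0x80, which neither test accepts.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-ins for ISIDST / ISIDNUM: ASCII only.
inline bool is_idst(unsigned char c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

inline bool is_idnum(unsigned char c)
{
  return is_idst(c) || (c >= '0' && c <= '9');
}

// Same shape as the quoted loop: length of the suffix recognised at POS.
inline std::size_t ascii_suffix_len(const std::string &s, std::size_t pos)
{
  std::size_t start = pos;
  if (pos < s.size() && is_idst(s[pos]))
    {
      ++pos;
      while (pos < s.size() && is_idnum(s[pos]))
        ++pos;
    }
  return pos - start;
}
```

For the UTF-8 text `_π` the scan stops after the underscore, and for a suffix
that begins with π it recognises nothing at all, so the rest of the suffix is
left to be lexed as a separate token.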