Re: [PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-08-11 Thread Tom Honermann via Gcc-patches
If there are no further concerns, could a C++ or libcpp maintainer 
please commit this for me?


Thank you!

Tom.

On 8/4/22 12:42 PM, Tom Honermann via Gcc-patches wrote:
Are there any further concerns with this patch? If not, I extend my 
gratitude to anyone so kind as to commit this for me as I don't have 
commit access.


I just noticed that I neglected to add a ChangeLog entry for the 
comment addition to gcc/cp/parser.cc. Noted inline below. I can 
re-send the patch with that update if desired.


Tom.

On 8/1/22 2:49 PM, Tom Honermann wrote:

Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
   warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics
in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.


gcc/cp/ChangeLog:
    * parser.cc (cp_lexer_saving_tokens): Add comment regarding 
diagnostic requirements.




gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
  gcc/c-family/c-opts.cc |  7 +++
  gcc/c-family/c.opt |  2 +-
  gcc/cp/parser.cc   |  5 -
  gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
  gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
  libcpp/include/cpplib.h    |  4 
  libcpp/init.cc |  1 +
  7 files changed, 46 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
   else if (warn_narrowing == -1)
     warn_narrowing = 0;
 
+  if (cxx_dialect >= cxx20)
+    {
+      /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+      warn_cxx20_compat = 0;
+      cpp_opts->cpp_warn_cxx20_compat = 0;
+    }
+
   /* C++17 has stricter evaluation order requirements; let's use some of them
      for earlier C++ as well, so chaining works as expected.  */
   if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
 C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented
 
 Wc++20-compat
-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
 
 Wc++11-extensions
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
 /* Store the next token from the preprocessor in *TOKEN.  Return true
    if we reach EOF.  If LEXER is NULL, assume we are handling an
    initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strings.
+
+   Diagnostics issued from this function must have their controlling option (if
+   any) in c.opt annotated as a libcpp option via the CppReason property.  */
 
 static void
 cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-com

Re: [PATCH v4 2/2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-08-08 Thread Tom Honermann via Gcc-patches

On 8/2/22 6:14 PM, Joseph Myers wrote:

On Tue, 2 Aug 2022, Tom Honermann via Gcc-patches wrote:


This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).

OK in the absence of C++ maintainer objections within 72 hours.  (This is
the case where, when I added support for such literals for C (commit
7c5890cc0a0ecea0e88cc39e9fba6385fb579e61), I raised the question of
whether they should be unsigned in the preprocessor for C++ as well.)


Joseph, would you be so kind as to commit this patch series for me? I 
don't have commit access. Thank you in advance!


Tom.



Re: [PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-08-04 Thread Tom Honermann via Gcc-patches
Are there any further concerns with this patch? If not, I extend my 
gratitude to anyone so kind as to commit this for me as I don't have 
commit access.


I just noticed that I neglected to add a ChangeLog entry for the comment 
addition to gcc/cp/parser.cc. Noted inline below. I can re-send the 
patch with that update if desired.


Tom.

On 8/1/22 2:49 PM, Tom Honermann wrote:

Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
   warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics
in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.


gcc/cp/ChangeLog:
    * parser.cc (cp_lexer_saving_tokens): Add comment regarding 
diagnostic requirements.




gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
  gcc/c-family/c-opts.cc |  7 +++
  gcc/c-family/c.opt |  2 +-
  gcc/cp/parser.cc   |  5 -
  gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
  gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
  libcpp/include/cpplib.h|  4 
  libcpp/init.cc |  1 +
  7 files changed, 46 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
   else if (warn_narrowing == -1)
     warn_narrowing = 0;
 
+  if (cxx_dialect >= cxx20)
+    {
+      /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+      warn_cxx20_compat = 0;
+      cpp_opts->cpp_warn_cxx20_compat = 0;
+    }
+
   /* C++17 has stricter evaluation order requirements; let's use some of them
      for earlier C++ as well, so chaining works as expected.  */
   if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
 C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented
 
 Wc++20-compat
-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
 
 Wc++11-extensions
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
 /* Store the next token from the preprocessor in *TOKEN.  Return true
    if we reach EOF.  If LEXER is NULL, assume we are handling an
    initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strings.
+
+   Diagnostics issued from this function must have their controlling option (if
+   any) in c.opt annotated as a libcpp option via the CppReason property.  */
 
 static void
 cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-compat" }
+
+// Validate suppression of -Wc++11-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++11-compat"
+int alignof;
+int alignas;
+int constexpr;
+int decltype;
+int noexcept;
+int nullptr;
+int stat

[PATCH v4 2/2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-08-02 Thread Tom Honermann via Gcc-patches
This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).
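
For readers who want the selection rule in isolation, here is a minimal standalone model of it. This is a sketch, not GCC source: ModelOptions, make_options and literal_is_unsigned are hypothetical names that only mirror the unsigned_char and unsigned_utf8char options and the branch order in narrow_str_to_charconst shown in the diff below.

```cpp
// Standalone model of the signedness selection; not GCC source.
enum class CharLiteralKind { Ordinary, Utf8 };

// Hypothetical stand-in for the two relevant cpp_options fields.
struct ModelOptions {
  bool unsigned_char;      // plain char is unsigned (target default or -funsigned-char)
  bool unsigned_utf8char;  // u8 character literals are unsigned
};

// Mirrors the c-opts.cc assignment: char8_t modes force unsigned,
// otherwise u8 literals follow the signedness of plain char.
inline ModelOptions make_options(bool char8_t_enabled, bool plain_char_unsigned)
{
  return ModelOptions{plain_char_unsigned,
                      char8_t_enabled ? true : plain_char_unsigned};
}

// Mirrors the charset.cc branch order: multichar constants are of type
// int (signed), then UTF-8 literals, then ordinary literals.
inline bool literal_is_unsigned(const ModelOptions &opts, CharLiteralKind kind,
                                int char_count)
{
  if (char_count > 1)
    return false;
  if (kind == CharLiteralKind::Utf8)
    return opts.unsigned_utf8char;
  return opts.unsigned_char;
}
```

In this model, enabling char8_t changes only the UTF-8 case; ordinary and multichar literals keep their previous behavior.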

PR preprocessor/106426

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.

gcc/testsuite/ChangeLog:
* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.

libcpp/ChangeLog:
* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
literals based on unsigned_utf8char.
* include/cpplib.h (cpp_options): Add unsigned_utf8char.
* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
---
 gcc/c-family/c-opts.cc| 1 +
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +-
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 
 libcpp/charset.cc | 4 ++--
 libcpp/include/cpplib.h   | 4 ++--
 libcpp/init.cc| 1 +
 6 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 108adc5caf8..02ce1e86cdb 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename)
   /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
 flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
+  cpp_opts->unsigned_utf8char = flag_char8_t ? 1 : cpp_opts->unsigned_char;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
index 8ed85ccfdcd..2994dd38516 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
@@ -1,6 +1,6 @@
 // Test that UTF-8 character literals have type char if -fchar8_t is not enabled.
 // { dg-do compile }
-// { dg-options "-std=c++17 -fno-char8_t" }
+// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" }
 
 template
   struct is_same
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 > 0
+#error "UTF-8 character literals not signed in preprocessor"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
index 7861736689c..db4fe70046d 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 < 0
+#error "UTF-8 character literals not unsigned in preprocessor"
+#endif
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index ca8b7cf7aa5..12e31632228 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str,
   /* Multichar constants are of type int and therefore signed.  */
   if (i > 1)
 unsigned_p = 0;
-  else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus))
-unsigned_p = 1;
+  else if (type == CPP_UTF8CHAR)
+unsigned_p = CPP_OPTION (pfile, unsigned_utf8char);
   else
 unsigned_p = CPP_OPTION (pfile, unsigned_char);
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3eba6f74b57..f9c042db034 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -581,8 +581,8 @@ struct cpp_options
  ints and target wide characters, respectively.  */
   size_t precision, char_precision, int_precision, wchar_precision;
 
-  /* True means chars (wide chars) are unsigned.  */
-  bool unsigned_char, unsigned_wchar;
+  /* True means chars (wide chars, UTF-8 chars) are unsigned.  */
+  bool unsigned_char, unsigned_wchar, unsigned_utf8char;
 
   /* True if the most significant byte in a word has the lowest
  address in memory.  */
diff --git a/libcpp/init.cc b/libcpp/init.cc
index f4ab83d2145..0242da5f55c 100644
--- a/libcpp/init.cc
+++ b/libcpp/init.cc
@@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int);
   CPP_OPTION (pfile, unsigned_char) = 0;
   CPP_OPTION (pfile, unsigned_wchar) = 1;
+  CPP_OPTION (pfile, unsigned_utf8char) = 1;
   CPP_OPTION (pfile, bytes_big_endian) = 1;  /* does not matter */
 
   /* Default to no charset conversion.  */
-- 
2.32.0



[PATCH v4 1/2] C: Implement C2X N2653 char8_t and UTF-8 string literal changes

2022-08-02 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library
changes adopted for C2X via WG14 N2653.  The changes include:
- Change of type for UTF-8 string literals from array of const char to
  array of const char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

gcc/ChangeLog:

* ginclude/stdatomic.h (atomic_char8_t,
ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

* c-parser.c (c_parser_string_literal): Use char8_t as the type
of CPP_UTF8STRING when char8_t support is enabled.
* c-typeck.c (digest_init): Allow initialization of an array
of character type by a string literal with type array of
char8_t.

gcc/c-family/ChangeLog:

* c-lex.c (lex_string, lex_charconst): Use char8_t as the type
of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
enabled.
* c-opts.c (c_common_post_options): Set flag_char8_t if
targeting C2x.

gcc/testsuite/ChangeLog:
* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c11-utf8str-type.c: New test.
* gcc.dg/c17-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.
---
 gcc/c-family/c-lex.cc | 13 --
 gcc/c-family/c-opts.cc|  4 +-
 gcc/c/c-parser.cc | 16 ++-
 gcc/c/c-typeck.cc |  2 +-
 gcc/ginclude/stdatomic.h  |  6 +++
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c11-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c17-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 13 files changed, 170 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index 8bfa4f4024f..0b6f94e18a8 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (!c_dialect_cxx ())
-   type = unsigned_char_type_node;
-  else if (flag_char8_t)
+  if (flag_char8_t)
 type = char8_type_node;
   else
 type = char_type_node;
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..108adc5caf8 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
 flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
-flag_char8_t = (cxx_dialect >= cxx20);
+flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 92049d1a101..fa9395986de 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+

[PATCH v4 0/2] Implement C2X N2653 (char8_t) and correct UTF-8 character literal type in preprocessor directives for C++

2022-08-02 Thread Tom Honermann via Gcc-patches
This patch series provides an implementation and tests for the WG14 N2653
paper as adopted for C2X.

Additionally, a fix is included for the C++ preprocessor to treat UTF-8
character literals in preprocessor directives as an unsigned type in char8_t
enabled modes (in C++17 and earlier with -fchar8_t or in C++20 or later
without -fno-char8_t).
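
As an illustration of the preprocessor behavior in question, the following probe is a minimal sketch (not part of the series; the U8_PP_SIGNEDNESS macro and the helper function are hypothetical names). It reports how a given compiler and option set treat a u8 character literal inside #if; the result depends on the compiler version and flags in use.

```cpp
// Probe how the preprocessor treats u8 character literals in #if.
// In #if arithmetic an unsigned operand makes the expression unsigned,
// so u8'\0' - 1 wraps to a large positive value instead of -1.
#if __cplusplus >= 201703L
# if u8'\0' - 1 < 0
#  define U8_PP_SIGNEDNESS "signed"
# else
#  define U8_PP_SIGNEDNESS "unsigned"
# endif
#else
// u8 character literals are not available before C++17.
# define U8_PP_SIGNEDNESS "unavailable"
#endif

// Hypothetical helper exposing the result at run time.
inline const char *u8_preprocessor_signedness() { return U8_PP_SIGNEDNESS; }
```

This is the same comparison the updated char8_t-char-literal tests perform with #error, expressed as a value that can be printed or logged.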

Tom Honermann (2):
  C: Implement C2X N2653 char8_t and UTF-8 string literal changes
  preprocessor/106426: Treat u8 character literals as unsigned in
char8_t modes.

 gcc/c-family/c-lex.cc | 13 --
 gcc/c-family/c-opts.cc|  5 ++-
 gcc/c/c-parser.cc | 16 ++-
 gcc/c/c-typeck.cc |  2 +-
 gcc/ginclude/stdatomic.h  |  6 +++
 .../g++.dg/ext/char8_t-char-literal-1.C   |  6 ++-
 .../g++.dg/ext/char8_t-char-literal-2.C   |  4 ++
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c11-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c17-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 libcpp/charset.cc |  4 +-
 libcpp/include/cpplib.h   |  4 +-
 libcpp/init.cc|  1 +
 18 files changed, 185 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

-- 
2.32.0



Re: [PATCH 2/3 v3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-08-02 Thread Tom Honermann via Gcc-patches

On 8/2/22 12:53 PM, Joseph Myers wrote:

On Mon, 1 Aug 2022, Tom Honermann via Gcc-patches wrote:


This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

Could you please send a complete patch series?  I'm not sure what the
matching patches 1 and 3 are.  Also, I don't generally find it helpful for
tests to be separated from the patch making the changes they test, since
tests are necessary to review of that code.


Absolutely. I'll merge the implementation and test commits, so the next 
series (v4) will have just two commits; one for the C2X N2653 
implementation and the other for the C++ u8 preprocessor string type 
fix. Coming right up.


Tom.



[PATCH 2/3 v3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-08-01 Thread Tom Honermann via Gcc-patches
This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

gcc/testsuite/ChangeLog:
* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c2x-predefined-macros.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-predefined-macros.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.
---
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c11-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c17-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 .../gcc.dg/gnu2x-predefined-macros.c  |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 9 files changed, 143 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..1b692f55ed0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+#include 
+#include 
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)  \
+  do   \
+{  \
+  int r1 = MACRO;  \
+  int r2 = atomic_is_lock_free (&V1);  \
+  int r3 = atomic_is_lock_free (&V2);  \
+  if (r1 != 0 && r1 != 1 && r1 != 2)   \
+   abort ();   \
+  if (r2 != 0 && r2 != 1)  \
+   abort ();   \
+  if (r3 != 0 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r2 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 0 && r2 != 0)  \
+   abort ();   \
+  if (r1 == 0 && r3 != 0)  \
+   abort ();   \
+}  \
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..27a3cfe3552
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c11-utf8str-type.c b/gcc/testsuite/gcc.dg/c11-utf8str-type.c
new file mode 100644
index 000..8be9abb9686
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C11 UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, default: 2) == 1, "UTF-8 string literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c17-utf8str-type.c b/gcc/testsuite/gcc.dg/c17-utf8str-type.c
new file mode 100644
index 000..515c6db3970
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c17-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C17 UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c17" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, default: 2) == 1, "UTF-8 string 
literal elements 

Re: [PATCH 2/3 v2] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-08-01 Thread Tom Honermann via Gcc-patches

On 8/1/22 3:13 PM, Joseph Myers wrote:

On Mon, 1 Aug 2022, Tom Honermann via Gcc-patches wrote:


diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
new file mode 100644
index 000..3456105563a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
@@ -0,0 +1,11 @@
+/* Test C2X predefined macros.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif

These aren't macros defined by C2X.  You could argue that they are part of
the stable interface provided by GCC for e.g. libc implementations to use,
and so should be tested as such, but any such test shouldn't suggest it's
testing a standard feature (and should have a better name to describe what
it's actually testing rather than suggesting it's about predefined macros
in general).

Fair point. This test is redundant anyway; these macros are directly or 
indirectly exercised by the other tests. I'll just remove it.


Tom.



[PATCH 1/1 v2] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-08-01 Thread Tom Honermann via Gcc-patches
Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).
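
As a usage illustration (a minimal sketch, not part of the patch), code built in C++17 or earlier modes can wrap the offending declaration in a diagnostic push/pop region once the option is registered with the preprocessor. The typedef below is a stand-in for the one in the glibc 2.36 uchar.h header, and utf8_unit and first_code_unit are hypothetical names used only for this example.

```cpp
#include <cstddef>

// Suppress -Wc++20-compat only around the declarations that name char8_t.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++20-compat"
#if !defined(__cpp_char8_t)
// Stand-in for the glibc 2.36 <uchar.h> typedef (assumption for illustration);
// skipped when char8_t is already a keyword.
typedef unsigned char char8_t;
#endif
typedef char8_t utf8_unit;  // hypothetical alias so later code avoids the keyword
#pragma GCC diagnostic pop

// Hypothetical helper: first code unit of a UTF-8 buffer, or 0 if it is empty.
inline unsigned first_code_unit(const unsigned char *buf, std::size_t len)
{
  return len ? static_cast<utf8_unit>(buf[0]) : 0u;
}
```

In real code the region would typically surround the #include of the affected header rather than a local typedef.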

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics
in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.

gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
 gcc/c-family/c-opts.cc |  7 +++
 gcc/c-family/c.opt |  2 +-
 gcc/cp/parser.cc   |  5 -
 gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
 libcpp/include/cpplib.h|  4 
 libcpp/init.cc |  1 +
 7 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
   else if (warn_narrowing == -1)
 warn_narrowing = 0;
 
+  if (cxx_dialect >= cxx20)
+{
+  /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+  warn_cxx20_compat = 0;
+  cpp_opts->cpp_warn_cxx20_compat = 0;
+}
+
   /* C++17 has stricter evaluation order requirements; let's use some of them
  for earlier C++ as well, so chaining works as expected.  */
   if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
 C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented
 
 Wc++20-compat
-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
 
 Wc++11-extensions
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
 /* Store the next token from the preprocessor in *TOKEN.  Return true
if we reach EOF.  If LEXER is NULL, assume we are handling an
initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strings.
+
+   Diagnostics issued from this function must have their controlling option (if
+   any) in c.opt annotated as a libcpp option via the CppReason property.  */
 
 static void
 cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 00000000000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-compat" }
+
+// Validate suppression of -Wc++11-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++11-compat"
+int alignof;
+int alignas;
+int constexpr;
+int decltype;
+int noexcept;
+int nullptr;
+int static_assert;
+int thread_local;
+int _Alignas;
+int _Alignof;
+int _Thread_local;
diff --git a/gcc/testsuite/g++.dg/cpp2a/keywords2.C b/gcc/testsuite/g++.dg/cpp2a/keywords2.C
new file mode 100644
index 00000000000..8714a7b26b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/keywords2.C
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++17_down } }
+// { dg-options "-Wc++20-compat" }
+
+// Validate suppression of -Wc++20-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++20-compat"
+int constinit;
+int consteval;
+int re

[PATCH 2/3 v2] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-08-01 Thread Tom Honermann via Gcc-patches
This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

gcc/testsuite/ChangeLog:
* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c2x-predefined-macros.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-predefined-macros.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.
---
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++++++++++++++++++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c11-utf8str-type.c       |  6 +++
 gcc/testsuite/gcc.dg/c17-utf8str-type.c       |  6 +++
 gcc/testsuite/gcc.dg/c2x-predefined-macros.c  | 11 +++++
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c       |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c            | 34 +++++++++++++++
 .../gcc.dg/gnu2x-predefined-macros.c          |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c     |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c          | 34 +++++++++++++++
 10 files changed, 154 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c11-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c17-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 00000000000..1b692f55ed0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+#include <stdatomic.h>
+#include <stdint.h>
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)  \
+  do   \
+{  \
+  int r1 = MACRO;  \
+  int r2 = atomic_is_lock_free (&V1);  \
+  int r3 = atomic_is_lock_free (&V2);  \
+  if (r1 != 0 && r1 != 1 && r1 != 2)   \
+   abort ();   \
+  if (r2 != 0 && r2 != 1)  \
+   abort ();   \
+  if (r3 != 0 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r2 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 0 && r2 != 0)  \
+   abort ();   \
+  if (r1 == 0 && r3 != 0)  \
+   abort ();   \
+}  \
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 00000000000..27a3cfe3552
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c11-utf8str-type.c b/gcc/testsuite/gcc.dg/c11-utf8str-type.c
new file mode 100644
index 00000000000..8be9abb9686
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C11 UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, default: 2) == 1, "UTF-8 string literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c17-utf8str-type.c b/gcc/testsuite/gcc.dg/c17-utf8str-type.c
new file mode 100644
index 00000000000..515c6db3970
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c17-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C17 UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c17" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, default: 2) == 1, "UTF-8 string 
literals 

[PATCH 1/3 v2] C: Implement C2X N2653 char8_t and UTF-8 string literal changes

2022-08-01 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library
changes adopted for C2X via WG14 N2653.  The changes include:
- Change of type for UTF-8 string literals from array of const char to
  array of const char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.
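
As a rough illustration of the first item, the element type of a UTF-8 string literal can be inspected at compile time. The sketch below is C++ rather than C (an assumption made so that it builds without -std=c2x), so it mirrors the C2X behavior only by analogy: with char8_t enabled the element type is a distinct unsigned type, otherwise it is plain char. The names utf8_unit, utf8_unit_is_plain_char, and check_utf8_unit are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <type_traits>

// Element type of a UTF-8 string literal in the current language mode,
// with the reference and the const qualifier stripped.
typedef std::remove_cv<
    std::remove_reference<decltype(u8"x"[0])>::type>::type utf8_unit;

// True when UTF-8 string literals still use plain char (char8_t disabled).
constexpr bool utf8_unit_is_plain_char()
{
  return std::is_same<utf8_unit, char>::value;
}

#ifdef __cpp_char8_t
// char8_t enabled (C++20, or -fchar8_t): a distinct unsigned type.
static_assert(std::is_same<utf8_unit, char8_t>::value, "expected char8_t");
static_assert(std::is_unsigned<utf8_unit>::value, "char8_t is unsigned");
#else
// char8_t disabled: same element type as an ordinary string literal.
static_assert(utf8_unit_is_plain_char(), "expected plain char");
#endif

// Runtime spot check: in either mode a code unit occupies exactly one byte.
inline void check_utf8_unit()
{
  assert(sizeof(utf8_unit) == 1);
  assert(sizeof(u8"text") == 5);  // four code units plus the terminator
}
```

The C tests in this series make the equivalent check with _Generic, which has no direct C++ counterpart.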

gcc/ChangeLog:

* ginclude/stdatomic.h (atomic_char8_t,
ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_string_literal): Use char8_t as the type
of CPP_UTF8STRING when char8_t support is enabled.
* c-typeck.cc (digest_init): Allow initialization of an array
of character type by a string literal with type array of
char8_t.

gcc/c-family/ChangeLog:

* c-lex.cc (lex_string, lex_charconst): Use char8_t as the type
of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
enabled.
* c-opts.cc (c_common_post_options): Set flag_char8_t if
targeting C2X.
---
 gcc/c-family/c-lex.cc    | 13 +++++++++----
 gcc/c-family/c-opts.cc   |  4 ++--
 gcc/c/c-parser.cc        | 16 ++++++++++++++--
 gcc/c/c-typeck.cc        |  2 +-
 gcc/ginclude/stdatomic.h |  6 ++++++
 5 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index 8bfa4f4024f..0b6f94e18a8 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (!c_dialect_cxx ())
-   type = unsigned_char_type_node;
-  else if (flag_char8_t)
+  if (flag_char8_t)
 type = char8_type_node;
   else
 type = char_type_node;
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..108adc5caf8 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
 flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
-flag_char8_t = (cxx_dialect >= cxx20);
+flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 92049d1a101..fa9395986de 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -7472,9 +7479,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
 {
 default:
 case CPP_STRING:
-case CPP_UTF8STRING:
   TREE_TYPE (value) = char_array_type_node;
   break;
+case CPP_UTF8STRING:
+  if (flag_char8_t)
+   TREE_TYPE (value) = char8_array_type_node;
+  else
+   TREE_TYPE (value) = char_array_type_node;
+  break;
 case CPP_STRING16:
   TREE_TYPE (value) = char16_array_type_node;
   break;
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index fd0a7f81a7a..231f4e980b6 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -8045,7 +8045,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
 
  if (char_array)
{
- if (typ2 != char_type_node)
+ if (typ2 != char_type_node && typ2 != char8_type_node)
incompat_string_cst = true;
}
  else if (!comptypes (typ1, typ2))
diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index bfcfdf664c7..9f2475b739d 100644
--

Re: [PATCH 2/3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-07-31 Thread Tom Honermann via Gcc-patches

On 7/27/22 7:23 PM, Joseph Myers wrote:

On Mon, 25 Jul 2022, Tom Honermann via Gcc-patches wrote:


This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

I'd expect this patch also to add tests verifying that u8"" strings have
the old type for C11 (unless there are existing such tests, but I don't
see them).

Agreed, good catch. Thank you.



diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 00000000000..37ea4c8926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */

I don't think _ISOC2X_SOURCE belongs in any GCC tests.
That was necessary because the first patch in this series omitted the 
atomic_char8_t and ATOMIC_CHAR8_T_LOCK_FREE definitions unless one of 
_GNU_SOURCE or _ISOC2X_SOURCE was defined. Per review of that first 
patch, those conditions will be removed, so there will be no need to 
define them here.



diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 00000000000..a017b134817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */

Nor does _GNU_SOURCE (unless the test depends on glibc functionality
that's only available with _GNU_SOURCE, but in that case you also need
some effective-target conditionals to restrict it to appropriate glibc
targets).


Ditto.

I'll post new patches shortly.

Tom.



Re: [PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-07-31 Thread Tom Honermann via Gcc-patches

On 7/31/22 11:05 AM, Lewis Hyatt wrote:

On Sat, Jul 30, 2022 at 7:06 PM Tom Honermann via Gcc-patches wrote:

On 7/27/22 7:09 PM, Joseph Myers wrote:

On Sun, 24 Jul 2022, Tom Honermann via Gcc-patches wrote:


Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also

There are lots of C++ warning options, all of which should support pragma
suppression regardless of whether they are relevant to the preprocessor or
not.  Do they all need this kind of handling, or is it only -Wc++20-compat
that has some kind of problem?

I had only checked -Wc++20-compat when working on the patch.

I did some spot checking now and confirmed that suppression works as
expected for C++ for at least the following warnings:
-Wuninitialized
-Warray-compare
-Wbool-compare
-Wtautological-compare
-Wterminate

I don't know the diagnostic framework well. As best I can tell, this
issue is specific to the -Wc++20-compat option and to when the particular
diagnostic is issued (e.g., during lexing as opposed to during parsing).
The following call chains appear to be relevant.
cp_lexer_new_main -> cp_lexer_handle_early_pragma ->
c_invoke_early_pragma_handler
cp_parser_* -> cp_parser_pragma -> c_invoke_pragma_handler
(where * might be "declaration", "toplevel_declaration",
"class_head", "objc_interstitial_code", ...)

The -Wc++20-compat enabled warning regarding new keywords in C++20 is
issued from cp_lexer_get_preprocessor_token.

Tom.


I have been working on improving the handling of "#pragma GCC
diagnostic" lately. The behavior for C++ changed since r13-1544
(https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e46f4d7430c5210465791603735ab219ef263c51).
I have some more comments about the patch's approach on the PR
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c44).

"#pragma GCC diagnostic" formerly did not work in C++ at all, for
diagnostics generated by libcpp, because C++ obtains all the tokens
from libcpp first (including deferred pragmas), and then processes
them afterward, too late to take effect for diagnostics that libcpp
has already emitted. r13-1544 fixed this up by adding an early pragma
handler, which runs as soon as a deferred pragma token is seen and
handles diagnostic pragmas if they pertain to libcpp-controlled
diagnostics. Non-libcpp diagnostics still need to be handled later,
during parsing, or else they get processed too early and it leads to
other problems. Basically, now each diagnostic pragma is handled as
close in time as possible to the time the associated diagnostics might
be generated.

The early pragma handler determines that an option comes from libcpp,
and so should be subject to early processing, if it was marked as such
in the options definition file. Tom's patch points out that
-Wc++20-compat needs to be handled early, and so marking it as a
libcpp diagnostic in c-family/c.opt arranges for that to work as
intended. Now one potential objection here is that -Wc++20-compat
warnings are not technically generated by libcpp. They are generated
by the C++ frontend immediately after lexing an identifier token from
libcpp (cp_lexer_get_preprocessor_token()). But the distinction
between these two steps is rather blurry, and it seems logical to me
to denote this as a libcpp-related option. Also, the same is already
done for -Wc++11-compat. Otherwise, we would need to add some new
option property to indicate which ones need to be handled for pragmas
at lexing time rather than parsing time.

At the moment I don't see any other diagnostics issued from
cp_lexer_get_preprocessor_token() that would need similar adjustments.
Assuming the approach is OK, it might be nice to add a comment to that
function, indicating that any diagnostics emitted there should be
annotated as libcpp options in the .opt file?
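
To make the lexer-time case concrete, here is a minimal sketch of the kind of code the suppression is meant to cover. It imitates the sort of char8_t typedef this thread attributes to glibc 2.36's uchar.h; the typedef text and the helper names code_unit_value and check_code_unit_value are assumptions for illustration, not copied from glibc or GCC. The __cpp_char8_t guard keeps the file valid in modes where char8_t is already a keyword.

```cpp
#include <cassert>

#ifndef __cpp_char8_t
// char8_t is still an ordinary identifier in this mode, so declaring it is
// valid, but the C++ lexer reports that the identifier is a keyword in C++20.
// The pragma must already be in effect while tokens are lexed to silence it.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++20-compat"
typedef unsigned char char8_t;
#pragma GCC diagnostic pop
#endif

// Usable whether char8_t is the typedef above or the built-in keyword.
inline unsigned code_unit_value(char8_t c)
{
  return static_cast<unsigned>(c);
}

inline void check_code_unit_value()
{
  assert(code_unit_value(static_cast<char8_t>(0xC3)) == 0xC3u);
}
```

Compiled as C++17 with -Wc++20-compat, the typedef line is where the warning discussed in this thread would be expected to appear when the pragma is handled too late.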


Thank you for those details; I wasn't aware of that history.

If I'm interpreting your response correctly, it sounds like you agree 
with the direction of the patch.


If you like, I can add a comment as you suggested and re-post the patch. 
Perhaps:


diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 4f67441eeb1..c3584446827 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -924,7 +924,10 @@ cp_lexer_saving_tokens (const cp_lexer* lexer)
/* Store the next token from the preprocessor in *TOKEN.  Return true
   if we reach EOF.  If LEXER is NULL, assume we are handling an
   initial #pragma pch_preprocess, and thus want the lexer to return
-   processed strings.  */
+   processed strin

Re: [PATCH 1/3] C: Implement C2X N2653 char8_t and UTF-8 string literal changes

2022-07-30 Thread Tom Honermann via Gcc-patches

On 7/27/22 7:20 PM, Joseph Myers wrote:

On Mon, 25 Jul 2022, Tom Honermann via Gcc-patches wrote:


diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index bfcfdf664c7..75ed7965689 100644
--- a/gcc/ginclude/stdatomic.h
+++ b/gcc/ginclude/stdatomic.h
@@ -49,6 +49,10 @@ typedef _Atomic long atomic_long;
  typedef _Atomic unsigned long atomic_ulong;
  typedef _Atomic long long atomic_llong;
  typedef _Atomic unsigned long long atomic_ullong;
+#if (defined(__CHAR8_TYPE__) \
+ && (defined(_GNU_SOURCE) || defined(_ISOC2X_SOURCE)))
+typedef _Atomic __CHAR8_TYPE__ atomic_char8_t;
+#endif
  typedef _Atomic __CHAR16_TYPE__ atomic_char16_t;
  typedef _Atomic __CHAR32_TYPE__ atomic_char32_t;
  typedef _Atomic __WCHAR_TYPE__ atomic_wchar_t;

GCC headers don't test glibc feature test macros such as _GNU_SOURCE and
_ISOC2X_SOURCE; they base things only on the standard version (whether
directly, or indirectly as via __CHAR8_TYPE__) and standard-defined
feature test macros.
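
A small sketch of that convention, written in C++ for illustration: the declaration is keyed only on the compiler-predefined __CHAR8_TYPE__ macro, with no glibc feature test macro involved. The names code_unit8 and atomic_code_unit8, the unsigned char fallback, and the use of std::atomic in place of C's _Atomic are all assumptions of this sketch rather than what stdatomic.h does.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>

// Key the declaration on the compiler-predefined macro only; neither
// _GNU_SOURCE nor _ISOC2X_SOURCE is consulted.
#ifdef __CHAR8_TYPE__
typedef __CHAR8_TYPE__ code_unit8;   // hypothetical name
#else
typedef unsigned char code_unit8;    // fallback assumed for this sketch
#endif

// C++ stand-in for the C declaration "_Atomic __CHAR8_TYPE__".
typedef std::atomic<code_unit8> atomic_code_unit8;

static_assert(sizeof(code_unit8) == 1, "a UTF-8 code unit is one byte");
static_assert(std::is_unsigned<code_unit8>::value, "expected an unsigned type");

inline void check_atomic_code_unit8()
{
  atomic_code_unit8 unit(0x41);
  assert(unit.load() == 0x41);
}
```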


Ok, thank you, that makes sense. I'll follow up with a revised patch 
that removes the additional conditions.


Tom.



(There's one exception in glimits.h - testing __USE_GNU, the macro defined
internally by glibc's headers - but I don't think that's something we want
to emulate in new code.)



Re: [PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-07-30 Thread Tom Honermann via Gcc-patches

On 7/27/22 7:09 PM, Joseph Myers wrote:

On Sun, 24 Jul 2022, Tom Honermann via Gcc-patches wrote:


Gcc's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also

There are lots of C++ warning options, all of which should support pragma
suppression regardless of whether they are relevant to the preprocessor or
not.  Do they all need this kind of handling, or is it only -Wc++20-compat
that has some kind of problem?


I had only checked -Wc++20-compat when working on the patch.

I did some spot checking now and confirmed that suppression works as 
expected for C++ for at least the following warnings:

  -Wuninitialized
  -Warray-compare
  -Wbool-compare
  -Wtautological-compare
  -Wterminate

I don't know the diagnostic framework well. As best I can tell, this 
issue is specific to the -Wc++20-compat option and to when the particular 
diagnostic is issued (e.g., during lexing as opposed to during parsing). 
The following call chains appear to be relevant.
  cp_lexer_new_main -> cp_lexer_handle_early_pragma -> 
c_invoke_early_pragma_handler

  cp_parser_* -> cp_parser_pragma -> c_invoke_pragma_handler
  (where * might be "declaration", "toplevel_declaration", 
"class_head", "objc_interstitial_code", ...)


The -Wc++20-compat enabled warning regarding new keywords in C++20 is 
issued from cp_lexer_get_preprocessor_token.
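
For comparison, the parse-time path (the second call chain) can be exercised with one of the warnings from the spot check. This is only a sketch; equals_itself and check_equals_itself are hypothetical names.

```cpp
#include <cassert>

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wtautological-compare"
// Comparing a variable with itself normally triggers
// -Wtautological-compare (enabled by -Wall) when the expression is parsed.
inline bool equals_itself(int v)
{
  return v == v;
}
#pragma GCC diagnostic pop

inline void check_equals_itself()
{
  assert(equals_itself(7));
}
```

Because this warning is raised while parsing, the pragma is processed by the regular handler, and no preprocessor registration of the option should be needed.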


Tom.



Re: [PATCH 3/3] c++/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-07-25 Thread Tom Honermann via Gcc-patches

On 7/25/22 2:05 PM, Andrew Pinski wrote:

On Mon, Jul 25, 2022 at 11:01 AM Tom Honermann via Gcc-patches wrote:

This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).

Fixes https://gcc.gnu.org/PR106426.

The above mention of the PR # should just be:
preprocessor/106426

And then when this patch gets committed, it will be recorded in bugzilla also.


Thank you. I re-sent the patch with a revised subject line and commit 
message to reflect the component change in Bugzilla.


Tom.



Thanks,
Andrew Pinski



Re: [PATCH 3/3 v2] preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-07-25 Thread Tom Honermann via Gcc-patches
This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).
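
The preprocessor-level signedness can be probed with the same expression the updated tests use. The sketch below records the result in a macro and, separately, the signedness of the literal's type as the C++ front end sees it. The names U8_CHAR_SIGNED_IN_PP, u8_char_signed_in_language, u8_char_signedness_in_pp, and check_u8_char_signedness are hypothetical, and the outer guard assumes that u8 character literals are only available from C++17 onward.

```cpp
#include <cassert>
#include <type_traits>

#if __cplusplus >= 201703L
// In #if arithmetic, "literal - 1" is negative only if the literal is signed.
# if u8'\0' - 1 < 0
#  define U8_CHAR_SIGNED_IN_PP 1
# else
#  define U8_CHAR_SIGNED_IN_PP 0
# endif
// Signedness of the same literal as typed by the C++ front end.
constexpr bool u8_char_signed_in_language =
    std::is_signed<decltype(u8'\0')>::value;
#else
# define U8_CHAR_SIGNED_IN_PP (-1)  // no u8 character literals before C++17
#endif

// 1: signed in the preprocessor, 0: unsigned, -1: literal form unavailable.
constexpr int u8_char_signedness_in_pp()
{
  return U8_CHAR_SIGNED_IN_PP;
}

inline void check_u8_char_signedness()
{
  const int s = u8_char_signedness_in_pp();
  assert(s == -1 || s == 0 || s == 1);
}
```

With the patch applied, one would expect the macro to be 0 whenever char8_t is enabled and otherwise to agree with u8_char_signed_in_language; a compiler without the fix may report a mismatch, which is the defect described above.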

PR preprocessor/106426

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.

gcc/testsuite/ChangeLog:
* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.

libcpp/ChangeLog:
* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
literals based on unsigned_utf8char.
* include/cpplib.h (cpp_options): Add unsigned_utf8char.
* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
---
 gcc/c-family/c-opts.cc                            | 1 +
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +++++-
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 ++++
 libcpp/charset.cc                                 | 4 ++--
 libcpp/include/cpplib.h                           | 4 ++--
 libcpp/init.cc                                    | 1 +
 6 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 108adc5caf8..02ce1e86cdb 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename)
   /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
 flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
+  cpp_opts->unsigned_utf8char = flag_char8_t ? 1 : cpp_opts->unsigned_char;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
index 8ed85ccfdcd..2994dd38516 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
@@ -1,6 +1,6 @@
 // Test that UTF-8 character literals have type char if -fchar8_t is not enabled.
 // { dg-do compile }
-// { dg-options "-std=c++17 -fno-char8_t" }
+// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" }
 
 template
   struct is_same
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 > 0
+#error "UTF-8 character literals not signed in preprocessor"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
index 7861736689c..db4fe70046d 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 < 0
+#error "UTF-8 character literals not unsigned in preprocessor"
+#endif
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index ca8b7cf7aa5..12e31632228 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str,
   /* Multichar constants are of type int and therefore signed.  */
   if (i > 1)
 unsigned_p = 0;
-  else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus))
-unsigned_p = 1;
+  else if (type == CPP_UTF8CHAR)
+unsigned_p = CPP_OPTION (pfile, unsigned_utf8char);
   else
 unsigned_p = CPP_OPTION (pfile, unsigned_char);
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3eba6f74b57..f9c042db034 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -581,8 +581,8 @@ struct cpp_options
  ints and target wide characters, respectively.  */
   size_t precision, char_precision, int_precision, wchar_precision;
 
-  /* True means chars (wide chars) are unsigned.  */
-  bool unsigned_char, unsigned_wchar;
+  /* True means chars (wide chars, UTF-8 chars) are unsigned.  */
+  bool unsigned_char, unsigned_wchar, unsigned_utf8char;
 
   /* True if the most significant byte in a word has the lowest
  address in memory.  */
diff --git a/libcpp/init.cc b/libcpp/init.cc
index f4ab83d2145..0242da5f55c 100644
--- a/libcpp/init.cc
+++ b/libcpp/init.cc
@@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int);
   CPP_OPTION (pfile, unsigned_char) = 0;
   CPP_OPTION (pfile, unsigned_wchar) = 1;
+  CPP_OPTION (pfile, unsigned_utf8char) = 1;
   CPP_OPTION (pfile, bytes_big_endian) = 1;  /* does not matter */
 
   /* Default to no charset conversion.  */
-- 
2.32.0



[PATCH 2/3] testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal changes

2022-07-25 Thread Tom Honermann via Gcc-patches
This change provides new tests for the core language and compiler
dependent library changes adopted for C2X via WG14 N2653.

gcc/testsuite/ChangeLog:
* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c2x-predefined-macros.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-predefined-macros.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.
---
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++++++++++++++++++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c2x-predefined-macros.c  | 11 +++++
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c       |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c            | 34 +++++++++++++++
 .../gcc.dg/gnu2x-predefined-macros.c          |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c     |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c          | 34 +++++++++++++++
 8 files changed, 142 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 00000000000..37ea4c8926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */
+
+#include <stdatomic.h>
+#include <stdint.h>
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)  \
+  do   \
+{  \
+  int r1 = MACRO;  \
+  int r2 = atomic_is_lock_free (&V1);  \
+  int r3 = atomic_is_lock_free (&V2);  \
+  if (r1 != 0 && r1 != 1 && r1 != 2)   \
+   abort ();   \
+  if (r2 != 0 && r2 != 1)  \
+   abort ();   \
+  if (r3 != 0 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r2 != 1)  \
+   abort ();   \
+  if (r1 == 2 && r3 != 1)  \
+   abort ();   \
+  if (r1 == 0 && r2 != 0)  \
+   abort ();   \
+  if (r1 == 0 && r3 != 0)  \
+   abort ();   \
+}  \
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 00000000000..a017b134817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
new file mode 100644
index 00000000000..3456105563a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
@@ -0,0 +1,11 @@
+/* Test C2X predefined macros.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str-type.c b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
new file mode 100644
index 00000000000..1ae86955516
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C2X UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, unsigned char*: 2) == 2, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, unsigned char:  2) == 2, "UTF-8 string literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str.c 
b/gcc/testsuite/gcc.dg/c2x-utf

[PATCH 1/3] C: Implement C2X N2653 char8_t and UTF-8 string literal changes

2022-07-25 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library
changes adopted for C2X via WG14 N2653.  The changes include:
- Change of type for UTF-8 string literals from array of const char to
  array of const char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

gcc/ChangeLog:

* ginclude/stdatomic.h (atomic_char8_t,
ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_string_literal): Use char8_t as the type
of CPP_UTF8STRING when char8_t support is enabled.
* c-typeck.cc (digest_init): Allow initialization of an array
of character type by a string literal with type array of
char8_t.

gcc/c-family/ChangeLog:

* c-lex.cc (lex_string, lex_charconst): Use char8_t as the type
of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
enabled.
* c-opts.cc (c_common_post_options): Set flag_char8_t if
targeting C2X.
---
 gcc/c-family/c-lex.cc| 13 +
 gcc/c-family/c-opts.cc   |  4 ++--
 gcc/c/c-parser.cc| 16 ++--
 gcc/c/c-typeck.cc|  2 +-
 gcc/ginclude/stdatomic.h |  8 
 5 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index 8bfa4f4024f..0b6f94e18a8 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,9 +1432,7 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (!c_dialect_cxx ())
-   type = unsigned_char_type_node;
-  else if (flag_char8_t)
+  if (flag_char8_t)
 type = char8_type_node;
   else
 type = char_type_node;
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..108adc5caf8 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1059,9 +1059,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
 flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
-flag_char8_t = (cxx_dialect >= cxx20);
+flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 92049d1a101..fa9395986de 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -7447,7 +7447,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
default:
case CPP_STRING:
case CPP_UTF8STRING:
- value = build_string (1, "");
+ if (type == CPP_UTF8STRING && flag_char8_t)
+   {
+ value = build_string (TYPE_PRECISION (char8_type_node)
+   / TYPE_PRECISION (char_type_node),
+   "");  /* char8_t is 8 bits */
+   }
+ else
+   value = build_string (1, "");
  break;
case CPP_STRING16:
  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -7472,9 +7479,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
 {
 default:
 case CPP_STRING:
-case CPP_UTF8STRING:
   TREE_TYPE (value) = char_array_type_node;
   break;
+case CPP_UTF8STRING:
+  if (flag_char8_t)
+   TREE_TYPE (value) = char8_array_type_node;
+  else
+   TREE_TYPE (value) = char_array_type_node;
+  break;
 case CPP_STRING16:
   TREE_TYPE (value) = char16_array_type_node;
   break;
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index fd0a7f81a7a..231f4e980b6 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -8045,7 +8045,7 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
 
  if (char_array)
{
- if (typ2 != char_type_node)
+ if (typ2 != char_type_node && typ2 != char8_type_node)
incompat_string_cst = true;
}
  else if (!comptypes (typ1, typ2))
diff --git a/gcc/ginclude/stdatomic.h b/gcc/ginclude/stdatomic.h
index bfcfdf664c7..75ed7965689 100644

[PATCH 3/3] c++/106426: Treat u8 character literals as unsigned in char8_t modes.

2022-07-25 Thread Tom Honermann via Gcc-patches
This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).

Fixes https://gcc.gnu.org/PR106426.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.

gcc/testsuite/ChangeLog:
* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.

libcpp/ChangeLog:
* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
literals based on unsigned_utf8char.
* include/cpplib.h (cpp_options): Add unsigned_utf8char.
* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
---
 gcc/c-family/c-opts.cc| 1 +
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C | 6 +-
 gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C | 4 
 libcpp/charset.cc | 4 ++--
 libcpp/include/cpplib.h   | 4 ++--
 libcpp/init.cc| 1 +
 6 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 108adc5caf8..02ce1e86cdb 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1062,6 +1062,7 @@ c_common_post_options (const char **pfilename)
   /* char8_t support is implicitly enabled in C++20 and C2X.  */
   if (flag_char8_t == -1)
 flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
+  cpp_opts->unsigned_utf8char = flag_char8_t ? 1 : cpp_opts->unsigned_char;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
index 8ed85ccfdcd..2994dd38516 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-1.C
@@ -1,6 +1,6 @@
 // Test that UTF-8 character literals have type char if -fchar8_t is not enabled.
 // { dg-do compile }
-// { dg-options "-std=c++17 -fno-char8_t" }
+// { dg-options "-std=c++17 -fsigned-char -fno-char8_t" }
 
 template
   struct is_same
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 > 0
+#error "UTF-8 character literals not signed in preprocessor"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
index 7861736689c..db4fe70046d 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-char-literal-2.C
@@ -10,3 +10,7 @@ template
   { static const bool value = true; };
 
 static_assert(is_same::value, "Error");
+
+#if u8'\0' - 1 < 0
+#error "UTF-8 character literals not unsigned in preprocessor"
+#endif
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index ca8b7cf7aa5..12e31632228 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1960,8 +1960,8 @@ narrow_str_to_charconst (cpp_reader *pfile, cpp_string str,
   /* Multichar constants are of type int and therefore signed.  */
   if (i > 1)
 unsigned_p = 0;
-  else if (type == CPP_UTF8CHAR && !CPP_OPTION (pfile, cplusplus))
-unsigned_p = 1;
+  else if (type == CPP_UTF8CHAR)
+unsigned_p = CPP_OPTION (pfile, unsigned_utf8char);
   else
 unsigned_p = CPP_OPTION (pfile, unsigned_char);
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3eba6f74b57..f9c042db034 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -581,8 +581,8 @@ struct cpp_options
  ints and target wide characters, respectively.  */
   size_t precision, char_precision, int_precision, wchar_precision;
 
-  /* True means chars (wide chars) are unsigned.  */
-  bool unsigned_char, unsigned_wchar;
+  /* True means chars (wide chars, UTF-8 chars) are unsigned.  */
+  bool unsigned_char, unsigned_wchar, unsigned_utf8char;
 
   /* True if the most significant byte in a word has the lowest
  address in memory.  */
diff --git a/libcpp/init.cc b/libcpp/init.cc
index f4ab83d2145..0242da5f55c 100644
--- a/libcpp/init.cc
+++ b/libcpp/init.cc
@@ -231,6 +231,7 @@ cpp_create_reader (enum c_lang lang, cpp_hash_table *table,
   CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int);
   CPP_OPTION (pfile, unsigned_char) = 0;
   CPP_OPTION (pfile, unsigned_wchar) = 1;
+  CPP_OPTION (pfile, unsigned_utf8char) = 1;
   CPP_OPTION (pfile, bytes_big_endian) = 1;  /* does not matter */
 
   /* Default to no charset conversion.  */
-- 
2.32.0



[PATCH 0/3] Implement C2X N2653 (char8_t) and correct UTF-8 character literal type in preprocessor directives for C++

2022-07-25 Thread Tom Honermann via Gcc-patches
This patch series provides an implementation and tests for the WG14 N2653
paper as adopted for C2X.

Additionally, a fix is included for the C++ preprocessor to treat UTF-8
character literals in preprocessor directives as an unsigned type in char8_t
enabled modes (in C++17 and earlier with -fchar8_t or in C++20 or later
without -fno-char8_t).

Tom Honermann (3):
  C: Implement C2X N2653 char8_t and UTF-8 string literal changes
  testsuite: Add tests for C2X N2653 char8_t and UTF-8 string literal
changes
  c++/106426: Treat u8 character literals as unsigned in char8_t modes.

 gcc/c-family/c-lex.cc | 13 --
 gcc/c-family/c-opts.cc|  5 ++-
 gcc/c/c-parser.cc | 16 ++-
 gcc/c/c-typeck.cc |  2 +-
 gcc/ginclude/stdatomic.h  |  8 
 .../g++.dg/ext/char8_t-char-literal-1.C   |  6 ++-
 .../g++.dg/ext/char8_t-char-literal-2.C   |  4 ++
 .../atomic/c2x-stdatomic-lockfree-char8_t.c   | 42 +++
 .../atomic/gnu2x-stdatomic-lockfree-char8_t.c |  5 +++
 gcc/testsuite/gcc.dg/c2x-predefined-macros.c  | 11 +
 gcc/testsuite/gcc.dg/c2x-utf8str-type.c   |  6 +++
 gcc/testsuite/gcc.dg/c2x-utf8str.c| 34 +++
 .../gcc.dg/gnu2x-predefined-macros.c  |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c |  5 +++
 gcc/testsuite/gcc.dg/gnu2x-utf8str.c  | 34 +++
 libcpp/charset.cc |  4 +-
 libcpp/include/cpplib.h   |  4 +-
 libcpp/init.cc|  1 +
 18 files changed, 191 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-utf8str.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-predefined-macros.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str-type.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu2x-utf8str.c

-- 
2.32.0



[PATCH 1/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

2022-07-23 Thread Tom Honermann via Gcc-patches
GCC's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).

Fixes https://gcc.gnu.org/PR106423.

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat diagnostics
in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.

gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.
---
 gcc/c-family/c-opts.cc |  7 +++
 gcc/c-family/c.opt |  2 +-
 gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
 libcpp/include/cpplib.h|  4 
 libcpp/init.cc |  1 +
 6 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b9f01a65ed7..1ea37ba9742 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1046,6 +1046,13 @@ c_common_post_options (const char **pfilename)
   else if (warn_narrowing == -1)
 warn_narrowing = 0;
 
+  if (cxx_dialect >= cxx20)
+{
+  /* Don't warn about C++20 compatibility changes in C++20 or later.  */
+  warn_cxx20_compat = 0;
+  cpp_opts->cpp_warn_cxx20_compat = 0;
+}
+
   /* C++17 has stricter evaluation order requirements; let's use some of them
  for earlier C++ as well, so chaining works as expected.  */
   if (c_dialect_cxx ()
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44e1a60ce24..dfdebd596ef 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -455,7 +455,7 @@ Wc++2a-compat
 C++ ObjC++ Warning Alias(Wc++20-compat) Undocumented
 
 Wc++20-compat
-C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall)
+C++ ObjC++ Var(warn_cxx20_compat) Warning LangEnabledBy(C++ ObjC++,Wall) Init(0) CPP(cpp_warn_cxx20_compat) CppReason(CPP_W_CXX20_COMPAT)
 Warn about C++ constructs whose meaning differs between ISO C++ 2017 and ISO C++ 2020.
 
 Wc++11-extensions
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords2.C b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
new file mode 100644
index 000..d67d01e31ed
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords2.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-compat" }
+
+// Validate suppression of -Wc++11-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++11-compat"
+int alignof;
+int alignas;
+int constexpr;
+int decltype;
+int noexcept;
+int nullptr;
+int static_assert;
+int thread_local;
+int _Alignas;
+int _Alignof;
+int _Thread_local;
diff --git a/gcc/testsuite/g++.dg/cpp2a/keywords2.C b/gcc/testsuite/g++.dg/cpp2a/keywords2.C
new file mode 100644
index 000..8714a7b26b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/keywords2.C
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++17_down } }
+// { dg-options "-Wc++20-compat" }
+
+// Validate suppression of -Wc++20-compat diagnostics.
+#pragma GCC diagnostic ignored "-Wc++20-compat"
+int constinit;
+int consteval;
+int requires;
+int concept;
+int co_await;
+int co_yield;
+int co_return;
+int char8_t;
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 3eba6f74b57..9d90c18e4f2 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -547,6 +547,9 @@ struct cpp_options
   /* True if warn about differences between C++98 and C++11.  */
   bool cpp_warn_cxx11_compat;
 
+  /* True if warn about differences between C++17 and C++20.  */
+  bool cpp_warn_cxx20_compat;
+
   /* Nonzero if bidirectional control characters checking is on.  See enum
  cpp_bidirectional_level.  */
   unsigned char cpp_warn_bidirectional;
@@ -655,6 +658,7 @@ enum cpp_warning_reason {
   CPP_W_C90_C99_COMPAT,
   CPP_W_C11_C2X_COMPAT,
   CPP_W_CXX11_COMPAT,
+  

[PATCH 0/1] c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics

2022-07-23 Thread Tom Honermann via Gcc-patches
This change addresses the following issue raised on the libc-alpha mailing list:
  https://sourceware.org/pipermail/libc-alpha/2022-July/140825.html
Glibc 2.36 adds a char8_t typedef in C++ modes that do not enable the char8_t
builtin type (C++17 and earlier by default; subject to _GNU_SOURCE and use of
the -f[no-]char8_t option).  When -Wc++20-compat diagnostics are enabled, the
following warning is issued from the glibc uchar.h header.
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]
Such diagnostics are not desired from system headers, so glibc would like to
suppress the diagnostic using '#pragma GCC diagnostic ignored "-Wc++20-compat"',
but attempting to do so currently fails.  This patch corrects that.

Tom Honermann (1):
  c++/106423: Fix pragma suppression of -Wc++20-compat diagnostics.

 gcc/c-family/c-opts.cc |  7 +++
 gcc/c-family/c.opt |  2 +-
 gcc/testsuite/g++.dg/cpp0x/keywords2.C | 16 
 gcc/testsuite/g++.dg/cpp2a/keywords2.C | 13 +
 libcpp/include/cpplib.h|  4 
 libcpp/init.cc |  1 +
 6 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords2.C

-- 
2.32.0



Re: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library

2022-01-11 Thread Tom Honermann via Gcc-patches

On 1/10/22 4:38 PM, Jonathan Wakely wrote:

On Mon, 10 Jan 2022 at 21:24, Tom Honermann via Libstdc++
 wrote:

On 1/10/22 8:23 AM, Jonathan Wakely wrote:


On Sat, 8 Jan 2022 at 00:42, Tom Honermann via Libstdc++
mailto:libstdc%2b...@gcc.gnu.org>> wrote:

 This patch completes implementation of the C++20 proposal P0482R6
 [1] by
 adding declarations of std::c8rtomb() and std::mbrtoc8() in
 <cuchar> if
 provided by the C library in <uchar.h>.

 This patch addresses feedback provided in response to a previous
 patch
 submission [2].

 Autoconf changes determine if the C library declares c8rtomb and
 mbrtoc8
 at global scope when uchar.h is included and compiled with either
 -fchar8_t or -std=c++20. New
 _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T
 and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros
 reflect the probe results. The <cuchar> header declares these
 functions
 in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T
 configuration macro is defined (by default it is defined if the C++20
 __cpp_char8_t feature test macro is defined)

 Patches to glibc to implement c8rtomb and mbrtoc8 have been
 submitted [3].

 New tests validate the presence of these declarations. The tests pass
 trivially if the C library does not provide these functions.
 Otherwise
 they ensure that the functions are declared when <cuchar> is included
 and either -fchar8_t or -std=c++20 is enabled.

 Tested on Linux x86_64.

 libstdc++-v3/ChangeLog:

 2022-01-07  Tom Honermann  <t...@honermann.net>

 * acinclude.m4: Define config macros if uchar.h provides
 c8rtomb() and mbrtoc8().
 * config.h.in: Re-generate.
 * configure: Re-generate.
 * include/c_compatibility/uchar.h: Declare ::c8rtomb and
 ::mbrtoc8.
 * include/c_global/cuchar: Declare std::c8rtomb and
 std::mbrtoc8.
 * include/c_std/cuchar: Declare std::c8rtomb and std::mbrtoc8.
 * testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
 New test.
 *
 testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc:
 New test.



Thanks, Tom, this looks good and I'll get it committed for GCC 12.

Thank you!

My only concern is that the new tests depend on an internal macro:

+#if _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20
+  using std::mbrtoc8;
+  using std::c8rtomb;

I prefer if tests are written as "user code" when possible, and not
using our internal macros. That isn't always possible, and in this
case would require adding new effective-target keyword to
testsuite/lib/libstdc++.exp just for use in these two tests. I don't
think we should bother with that.

I went with this approach solely due to my unfamiliarity with the test
system. I knew there should be a way to conditionally make the test
"pass" as unsupported or as an expected failure, but didn't know how to
go about implementing that. I don't mind following up with an additional
patch if such a change is desirable. I took a look at
testsuite/lib/libstdc++.exp and it looks like it may be pretty
straightforward to add effective-target support. It would probably be a good
learning experience for me. I'll prototype and report back.

Yes, it's very easy to do. Take a look at the
check_effective_target_blah procs in that file, especially the later
ones that use v3_check_preprocessor_condition. You can use that to
define an effective target keyword for any preprocessor condition
(such as the new macros you're adding).

Then the test can do:
// { dg-do compile { target blah } }
which will make it UNSUPPORTED if the effective target proc doesn't return true.
See https://gcc.gnu.org/onlinedocs/gccint/Selectors.html#Selectors for
the docs on target selectors.

I'm just not sure it's worth adding a new keyword for just two tests.


Thank you for the implementation direction; this was quite easy!

Patch attached (to be applied after the original one).

libstdc++-v3/ChangeLog:

2022-01-11  Tom Honermann  

* testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
Modify to use new c8rtomb_mbrtoc8_cxx20 effective target.
* testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc:
Modify to use new c8rtomb_mbrtoc8_fchar8_t effective target.
* testsuite/lib/libstdc++.exp: Add new effective targets.

If you decide that the new keywords aren't worth adding, no worries; my 
feelings won't be hurt :)


Tom.

commit 0542361fe8cb5da146097f86ca8ea8bca86421e0
Author: Tom Honermann 
Date:   Tue Jan 11 14:57:51 2022 -0500

Add effective target support for tests of C++20 c8rtomb and mbrtoc8.

diff --git a/libstdc++-v3/testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc b/libstdc++-v3/testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc
index 7c152ed42b5..681c12127db 100644
--- a/libstdc

Re: [PATCH 0/2]: C N2653 char8_t implementation

2022-01-11 Thread Tom Honermann via Gcc-patches

On 1/10/22 9:23 PM, Joseph Myers wrote:

Please repost these patches after GCC 12 branches (updated as appropriate
depending on whether the feature is accepted at the two-week Jan/Feb WG14
meeting, which doesn't yet have an agenda), since we're currently
stabilizing for the release and so not considering new features.


Thank you, Joseph. Will do!

Tom.



Re: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library

2022-01-10 Thread Tom Honermann via Gcc-patches

On 1/10/22 8:23 AM, Jonathan Wakely wrote:



On Sat, 8 Jan 2022 at 00:42, Tom Honermann via Libstdc++ 
mailto:libstdc%2b...@gcc.gnu.org>> wrote:


This patch completes implementation of the C++20 proposal P0482R6
[1] by
adding declarations of std::c8rtomb() and std::mbrtoc8() in
<cuchar> if
provided by the C library in <uchar.h>.

This patch addresses feedback provided in response to a previous
patch
submission [2].

Autoconf changes determine if the C library declares c8rtomb and
mbrtoc8
at global scope when uchar.h is included and compiled with either
-fchar8_t or -std=c++20. New
_GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T
and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros
reflect the probe results. The <cuchar> header declares these
functions
in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T
configuration macro is defined (by default it is defined if the C++20
__cpp_char8_t feature test macro is defined)

Patches to glibc to implement c8rtomb and mbrtoc8 have been
submitted [3].

New tests validate the presence of these declarations. The tests pass
trivially if the C library does not provide these functions.
Otherwise
they ensure that the functions are declared when <cuchar> is included
and either -fchar8_t or -std=c++20 is enabled.

Tested on Linux x86_64.

libstdc++-v3/ChangeLog:

2022-01-07  Tom Honermann  <t...@honermann.net>

        * acinclude.m4: Define config macros if uchar.h provides
        c8rtomb() and mbrtoc8().
        * config.h.in: Re-generate.
        * configure: Re-generate.
        * include/c_compatibility/uchar.h: Declare ::c8rtomb and
        ::mbrtoc8.
        * include/c_global/cuchar: Declare std::c8rtomb and
        std::mbrtoc8.
        * include/c_std/cuchar: Declare std::c8rtomb and std::mbrtoc8.
        * testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
        New test.
        *
testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc:
        New test.



Thanks, Tom, this looks good and I'll get it committed for GCC 12.

Thank you!


My only concern is that the new tests depend on an internal macro:

+#if _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20
+  using std::mbrtoc8;
+  using std::c8rtomb;

I prefer if tests are written as "user code" when possible, and not 
using our internal macros. That isn't always possible, and in this 
case would require adding new effective-target keyword to 
testsuite/lib/libstdc++.exp just for use in these two tests. I don't 
think we should bother with that.
I went with this approach solely due to my unfamiliarity with the test 
system. I knew there should be a way to conditionally make the test 
"pass" as unsupported or as an expected failure, but didn't know how to 
go about implementing that. I don't mind following up with an additional 
patch if such a change is desirable. I took a look at 
testsuite/lib/libstdc++.exp and it looks like it may be pretty 
straightforward to add effective-target support. It would probably be a good 
learning experience for me. I'll prototype and report back.


I suppose strictly speaking we should not define __cpp_lib_char8_t 
unless these two functions are present in libc. But I'm not sure we 
want to change that now either.


All of libstdc++, libc++, and MS STL have been defining 
__cpp_lib_char8_t despite the absence of these functions, so yeah, I 
don't think we want to change that.


Tom.



[PATCH 2/2]: C N2653 char8_t: New tests

2022-01-07 Thread Tom Honermann via Gcc-patches
This patch provides new tests for the core language and compiler 
dependent library changes proposed in WG14 N2653 [1] for C2x.


Tested on Linux x86_64.

gcc/testsuite/ChangeLog:

2021-05-31  Tom Honermann  

* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/c2x-predefined-macros.c: New test.
* gcc.dg/c2x-utf8str-type.c: New test.
* gcc.dg/c2x-utf8str.c: New test.
* gcc.dg/gnu2x-predefined-macros.c: New test.
* gcc.dg/gnu2x-utf8str-type.c: New test.
* gcc.dg/gnu2x-utf8str.c: New test.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm
commit f4eee2bf403b62714d1ccb4542b8c85dc552a411
Author: Tom Honermann 
Date:   Sun Jan 2 00:26:17 2022 -0500

N2653 char8_t for C: New tests

This change provides new tests for the core language and compiler
dependent library changes proposed in WG14 N2653 for C.

diff --git a/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..37ea4c8926c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x -D_ISOC2X_SOURCE -pedantic-errors" } */
+
+#include 
+#include 
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)		\
+  do		\
+{		\
+  int r1 = MACRO;\
+  int r2 = atomic_is_lock_free (&V1);	\
+  int r3 = atomic_is_lock_free (&V2);	\
+  if (r1 != 0 && r1 != 1 && r1 != 2)	\
+	abort ();\
+  if (r2 != 0 && r2 != 1)			\
+	abort ();\
+  if (r3 != 0 && r3 != 1)			\
+	abort ();\
+  if (r1 == 2 && r2 != 1)			\
+	abort ();\
+  if (r1 == 2 && r3 != 1)			\
+	abort ();\
+  if (r1 == 0 && r2 != 0)			\
+	abort ();\
+  if (r1 == 0 && r3 != 0)			\
+	abort ();\
+}		\
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..a017b134817
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c
@@ -0,0 +1,5 @@
+/* Test atomic_is_lock_free for char8_t with -std=gnu2x.  */
+/* { dg-do run } */
+/* { dg-options "-std=gnu2x -D_GNU_SOURCE -pedantic-errors" } */
+
+#include "c2x-stdatomic-lockfree-char8_t.c"
diff --git a/gcc/testsuite/gcc.dg/c2x-predefined-macros.c b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
new file mode 100644
index 000..c88e51b54c5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-predefined-macros.c
@@ -0,0 +1,11 @@
+/* Test C2x predefined macros.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str-type.c b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
new file mode 100644
index 000..76559c0b19b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-utf8str-type.c
@@ -0,0 +1,6 @@
+/* Test C2x UTF-8 string literal type.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x" } */
+
+_Static_assert (_Generic (u8"text", char*: 1, unsigned char*: 2) == 2, "UTF-8 string literals have an unexpected type");
+_Static_assert (_Generic (u8"x"[0], char:  1, unsigned char:  2) == 2, "UTF-8 string literal elements have an unexpected type");
diff --git a/gcc/testsuite/gcc.dg/c2x-utf8str.c b/gcc/testsuite/gcc.dg/c2x-utf8str.c
new file mode 100644
index 000..712482c6569
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-utf8str.c
@@ -0,0 +1,34 @@
+/* Test initialization by UTF-8 string literal in C2x.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target wchar } */
+/* { dg-options "-std=c2x" } */
+
+typedef __CHAR8_TYPE__	char8_t;
+typedef __CHAR16_TYPE__	char16_t;
+typedef __CHAR32_TYPE__ char32_t;
+typedef __WCHAR_TYPE__	wchar_t;
+
+/* Test that char, signed char, unsigned char, and char8_t arrays can be
+   initialized by a UTF-8 string literal.  */
+const char cbuf1[] = u8"text";
+const char cbuf2[] = { u8"text" };
+const signed char scbuf1[] = u8"text";
+const signed char scbuf2[] = { u8"text" };
+const unsigned char ucbuf1[] = u8"text";
+const unsigned char ucbuf2[] = { u8"text" };
+const char8_t c8buf1[] = u8"text";
+const char8_t c8buf2[] = { u8"text" };
+
+/* Test that a diagnostic is issued for attempted initialization of
+   other character types by a UTF-8 string literal.  */
+const char16_t c16buf1[] = u8"text";		/* { dg-error "

[PATCH 1/2]: C N2653 char8_t: Language support

2022-01-07 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library 
changes proposed in WG14 N2653 [1] for C2x. The changes include:

- Change of type for UTF-8 string literals from array of char to array
  of char8_t (unsigned char) when targeting C2x.
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

Tested on Linux x86_64.

gcc/ChangeLog:

2022-01-07  Tom Honermann  

* ginclude/stdatomic.h (atomic_char8_t,
ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

2022-01-07  Tom Honermann  

* c-parser.c (c_parser_string_literal): Use char8_t as the type
of CPP_UTF8STRING when char8_t support is enabled.
* c-typeck.c (digest_init): Allow initialization of an array
of character type by a string literal with type array of
char8_t.

gcc/c-family/ChangeLog:

2022-01-07  Tom Honermann  

* c-lex.c (lex_string, lex_charconst): Use char8_t as the type
of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
enabled.
* c-opts.c (c_common_post_options): Set flag_char8_t if
targeting C2x.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm
commit c041cce5d262908349be3f1f2e361c824db15845
Author: Tom Honermann 
Date:   Sat Jan 1 18:10:41 2022 -0500

N2653 char8_t for C: Language support

This patch implements the core language and compiler dependent library
changes proposed in WG14 N2653 for C2X.  The changes include:
- Change of type for UTF-8 string literals from array of char to array
  of char8_t (unsigned char) when targeting C2X.
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 2651331e683..0b3debbb9bd 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1352,7 +1352,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	{
+	  value = build_string (TYPE_PRECISION (char8_type_node)
+/ TYPE_PRECISION (char_type_node),
+"");  /* char8_t is 8 bits */
+	}
+	  else
+	value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -1425,10 +1432,10 @@ lex_charconst (const cpp_token *token)
 type = char16_type_node;
   else if (token->type == CPP_UTF8CHAR)
 {
-  if (!c_dialect_cxx ())
-	type = unsigned_char_type_node;
-  else if (flag_char8_t)
+  if (flag_char8_t)
 type = char8_type_node;
+  else if (!c_dialect_cxx ())
+	type = unsigned_char_type_node;
   else
 type = char_type_node;
 }
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 4c20e44f5b5..bd96e1319ad 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -1060,9 +1060,9 @@ c_common_post_options (const char **pfilename)
   if (flag_sized_deallocation == -1)
 flag_sized_deallocation = (cxx_dialect >= cxx14);
 
-  /* char8_t support is new in C++20.  */
+  /* char8_t support is implicitly enabled in C++20 and C2x.  */
   if (flag_char8_t == -1)
-flag_char8_t = (cxx_dialect >= cxx20);
+flag_char8_t = (cxx_dialect >= cxx20) || flag_isoc2x;
 
   if (flag_extern_tls_init)
 {
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index b09ad307acd..4239633e295 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7439,7 +7439,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	{
+	  value = build_string (TYPE_PRECISION (char8_type_node)
+/ TYPE_PRECISION (char_type_node),
+"");  /* char8_t is 8 bits */
+	}
+	  else
+	value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
@@ -7464,9 +7471,14 @@ c_parser_string_literal (c_parser *parser, bool translate, bool wide_ok)
 {
 default:
 case CPP_STRING:
-case CPP_UTF8STRING:
   TREE_TYPE (value) = char_array_type_node;
   break;
+case CPP_UTF8STRING:
+  if (flag_char8_t)
+	TREE_TYPE (value) = char8_array_type_node;
+  else
+	TREE_TYPE (value) = char_array_type_node;
+  break;
 case CPP_STRING16:
   TREE_TYPE (value) = char16_array_type_node;
   break;
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 78a6c68aaa6..b4eeea545a9 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -8028,7 +8028,7 @@ digest_init (location_t init_loc, tree type, t

[PATCH 0/2]: C N2653 char8_t implementation

2022-01-07 Thread Tom Honermann via Gcc-patches
This series of patches implements the core language features for the 
WG14 N2653 [1] proposal to provide char8_t support in C. These changes 
are intended to align char8_t support in C with the support provided in 
C++20 via WG21 P0482R6 [2].


These patches address feedback provided in response to a previous 
submission [3][4].


These changes do not impact default gcc behavior. Per prior feedback by 
Joseph Myers, the existing -fchar8_t and -fno-char8_t options used to 
opt-in to or opt-out of char8_t support in C++ are NOT reused for C. 
Instead, the C related core language changes are enabled when targeting 
C2x. Note that N2653 has not yet been accepted by WG14 for C2x, but the 
patches enable these changes for C2x in order to avoid an additional 
language dialect flag (e.g., -fchar8_t).


Patch 1: Language support
Patch 2: New tests

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[2]: WG21 P0482R6
 "char8_t: A type for UTF-8 characters and strings (Revision 6)"
 https://wg21.link/p0482r6

[3]: [PATCH 0/3]: C N2653 char8_t implementation
 https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572022.html

[4]: [PATCH 1/3]: C N2653 char8_t: Language support
 https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572023.html


[PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library

2022-01-07 Thread Tom Honermann via Gcc-patches
This patch completes implementation of the C++20 proposal P0482R6 [1] by 
adding declarations of std::c8rtomb() and std::mbrtoc8() in <cuchar> if 
provided by the C library in <uchar.h>.


This patch addresses feedback provided in response to a previous patch 
submission [2].


Autoconf changes determine if the C library declares c8rtomb and mbrtoc8 
at global scope when uchar.h is included and compiled with either 
-fchar8_t or -std=c++20. New _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T 
and _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros 
reflect the probe results. The <cuchar> header declares these functions 
in the std namespace only if available and the _GLIBCXX_USE_CHAR8_T 
configuration macro is defined (by default it is defined if the C++20 
__cpp_char8_t feature test macro is defined).


Patches to glibc to implement c8rtomb and mbrtoc8 have been submitted [3].

New tests validate the presence of these declarations. The tests pass 
trivially if the C library does not provide these functions. Otherwise 
they ensure that the functions are declared when <cuchar> is included 
and either -fchar8_t or -std=c++20 is enabled.


Tested on Linux x86_64.

libstdc++-v3/ChangeLog:

2022-01-07  Tom Honermann  

* acinclude.m4: Define config macros if uchar.h provides
c8rtomb() and mbrtoc8().
* config.h.in: Re-generate.
* configure: Re-generate.
* include/c_compatibility/uchar.h: Declare ::c8rtomb and
::mbrtoc8.
* include/c_global/cuchar: Declare std::c8rtomb and
std::mbrtoc8.
* include/c_std/cuchar: Declare std::c8rtomb and std::mbrtoc8.
* testsuite/21_strings/headers/cuchar/functions_std_cxx20.cc:
New test.
* testsuite/21_strings/headers/cuchar/functions_std_fchar8_t.cc:
New test.

Tom.

[1]: WG21 P0482R6
 "char8_t: A type for UTF-8 characters and strings (Revision 6)"
 https://wg21.link/p0482r6

[2]: [PATCH] C++ P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8
 if provided by the C library
 https://gcc.gnu.org/pipermail/libstdc++/2021-June/052685.html

[3]: "C++20 P0482R6 and C2X N2653"
 [Patch 0/3]: 
https://sourceware.org/pipermail/libc-alpha/2022-January/135061.html
 [Patch 1/3]: 
https://sourceware.org/pipermail/libc-alpha/2022-January/135062.html
 [Patch 2/3]: 
https://sourceware.org/pipermail/libc-alpha/2022-January/135063.html
 [Patch 3/3]: 
https://sourceware.org/pipermail/libc-alpha/2022-January/135064.html


Tom.
commit 3d40bc9bf5c79343ea5a6cc355539542f4b56c9b
Author: Tom Honermann 
Date:   Sat Jan 1 17:26:31 2022 -0500

P0482R6 char8_t: declare std::c8rtomb and std::mbrtoc8 if provided by the C library.

This change completes implementation of the C++20 proposal P0482R6 by
adding declarations of std::c8rtomb() and std::mbrtoc8() if provided
by the C library.

Autoconf changes determine if the C library declares c8rtomb and mbrtoc8
at global scope when uchar.h is included and compiled with either -fchar8_t
or -std=c++20 enabled; new _GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T and
_GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_CXX20 configuration macros are defined
accordingly. The <cuchar> header declares these functions in the std
namespace only if available and the _GLIBCXX_USE_CHAR8_T configuration
macro is defined (by default it is defined if the C++20 __cpp_char8_t
feature test macro is defined).

New tests validate the presence of these declarations. The tests pass
trivially if the C library does not provide these functions. Otherwise they
ensure that the functions are declared when <cuchar> is included and
either -fchar8_t or -std=c++20 is enabled.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 635168d7e25..85235005c7e 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -2039,6 +2039,50 @@ AC_DEFUN([GLIBCXX_CHECK_UCHAR_H], [
 	  namespace std in <cuchar>.])
   fi
 
+  CXXFLAGS="$CXXFLAGS -fchar8_t"
+  if test x"$ac_has_uchar_h" = x"yes"; then
+AC_MSG_CHECKING([for c8rtomb and mbrtoc8 in <uchar.h> with -fchar8_t])
+AC_TRY_COMPILE([#include <uchar.h>
+		namespace test
+		{
+		  using ::c8rtomb;
+		  using ::mbrtoc8;
+		}
+		   ],
+		   [], [ac_uchar_c8rtomb_mbrtoc8_fchar8_t=yes],
+		   [ac_uchar_c8rtomb_mbrtoc8_fchar8_t=no])
+  else
+ac_uchar_c8rtomb_mbrtoc8_fchar8_t=no
+  fi
+  AC_MSG_RESULT($ac_uchar_c8rtomb_mbrtoc8_fchar8_t)
+  if test x"$ac_uchar_c8rtomb_mbrtoc8_fchar8_t" = x"yes"; then
+AC_DEFINE(_GLIBCXX_USE_UCHAR_C8RTOMB_MBRTOC8_FCHAR8_T, 1,
+	  [Define if c8rtomb and mbrtoc8 functions in <uchar.h> should be
+	  imported into namespace std in <cuchar> for -fchar8_t.])
+  fi
+
+  CXXFLAGS="$CXXFLAGS -std=c++20"
+  if test x"$ac_has_uchar_h" = x"yes"; then
+AC_MSG_CHECKING([for c8rtomb and mbrtoc8 in <uchar.h> with -std=c++20])
+AC_TRY_COMPILE([#include <uchar.h>
+		namespace test
+		{
+		  using ::c8rtomb;
+		  using ::mbrtoc8;
+		}
+		   ],
+		   [], [ac_uch

Re: [PATCH 0/3]: C N2653 char8_t implementation

2021-06-13 Thread Tom Honermann via Gcc-patches

On 6/11/21 1:27 PM, Joseph Myers wrote:

On Fri, 11 Jun 2021, Tom Honermann via Gcc-patches wrote:


The option is needed because it impacts core language backward compatibility
(for both C and C++, the type of u8 string literals; for C++, the type of u8
character literals and the new char8_t fundamental type).

Lots of new features in new standard versions can affect backward
compatibility.  We generally bundle all of those up into a single -std
option rather than having an explosion of different language variants with
different features enabled or disabled.  I don't think this feature, for
C, reaches the threshold that would justify having a separate option to
control it, especially given that people can use -Wno-pointer-sign or
pointer casts or their own local char8_t typedef as an intermediate step
if they want code using u8"" strings to work for both old and new standard
versions.
Ok, I'm happy to defer to your experience.  My perspective is likely 
biased by the C++20 changes being more disruptive for that language.
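
As an illustration of the intermediate step mentioned above (a local
typedef combined with pointer casts), the following is a minimal sketch
and not code from the patch series. The names my_char8_t, U8, and
u8_length are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical project-local stand-in for char8_t.  */
typedef unsigned char my_char8_t;

/* Cast a UTF-8 string literal to a pointer to my_char8_t.  The cast is
   valid whether the literal has type array of char (C11/C17) or array of
   unsigned char (C2x with the N2653 change), so no pointer-sign
   diagnostic is expected in either mode.  */
#define U8(lit) ((const my_char8_t *) u8"" lit)

/* Count the code units that precede the terminating null.  */
size_t
u8_length (const my_char8_t *s)
{
  size_t n = 0;
  while (s[n] != 0)
    n++;
  return n;
}
```

With this in place, a call such as u8_length (U8 ("text")) compiles the
same way for old and new standard versions.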


I don't think u8"" strings are widely used in C library headers in a way
where the choice of type matters.  (Use of a feature in library headers is
a key thing that can justify options such as -fgnu89-inline, because it
means the choice of language version is no longer fully under control of a
single project.)

That aligns with my expectations.


The only feature proposed for C2x that I think is likely to have
significant compatibility implications in practice for a lot of code is
making bool, true and false into keywords.  I still don't think a separate
option makes sense there.  (If that feature is accepted for C2x, what
would be useful is for people to do distribution rebuilds with -std=gnu2x
as the default to find and fix code that breaks, in advance of the default
actually changing in GCC.  But the workaround for not-yet-fixed code would
be -std=gnu11, not a separate option for that one feature.)

Ok, that comparison is helpful.



I think the whole patch series would best wait until after the proposal
has been considered by a WG14 meeting, in addition to not increasing the
number of language dialects supported.

As an opt-in feature, this is useful to gain implementation and deployment
experience for WG14.

I think this feature is one of the cases where experience in C++ is
sufficiently relevant for C (although there are certainly cases of other
language features where the languages are sufficiently different that
using C++ experience like that can be problematic).

E.g. we didn't need -fdigit-separators for C before digit separators were
added to C2x, and we don't need -fno-digit-separators now they are in C2x
(the feature is just enabled or disabled based on the language version),
although that's one of many features that do affect compatibility in
corner cases.


Got it, thanks again, that comparison is helpful.

Per this and prior messages, I'll revise the gcc patch series as follows 
(I'll likewise revise the glibc changes, but will detail that in the 
corresponding glibc mailing list thread).


1. Remove the proposed use of -fchar8_t and -fno-char8_t for C code.
2. Remove the updated documentation for the -fchar8_t option since it
   won't be applicable to C code.
3. Remove the _CHAR8_T_SOURCE macro.
4. Enable the change of u8 string literal type based on -std=[gnu|c]2x
   (by setting flag_char8_t if flag_isoc2x is set).
5. Condition the declarations of atomic_char8_t and
   __GCC_ATOMIC_CHAR8_T_LOCK_FREE on _GNU_SOURCE or _ISOC2X_SOURCE.
6. Remove the char8 data member from cpp_options that I had added and
   forgot to remove.
7. Revise the tests and rename them for consistency with other C2x tests.

If I've forgotten anything, please let me know.

Thank you for the thorough review!

Tom.



Re: [PATCH 1/3]: C N2653 char8_t: Language support

2021-06-13 Thread Tom Honermann via Gcc-patches

On 6/11/21 12:53 PM, Jakub Jelinek wrote:

On Fri, Jun 11, 2021 at 12:20:48PM -0400, Tom Honermann wrote:

I'm open to whatever signaling mechanism would be preferred.  It took me a
while to settle on _CHAR8_T_SOURCE as the mechanism to propose as I didn't
find much for other precedents.

I agree that having _CHAR8_T_SOURCE be implied by the -fchar8_t option is
unusual with respect to other feature test macros.  Is that what you find to
be weird and inconsistent?

Predefining __SIZEOF_CHAR8_T__ would be consistent with __SIZEOF_WCHAR_T__,
but kind of strange too since the size is always 1.

Perhaps a better approach would be to follow the __CHAR16_TYPE__ and
__CHAR32_TYPE__ precedent and define __CHAR8_TYPE__ to unsigned char.  That
is likewise a bit strange since the type would always be unsigned char, but
it does provide a bit more symmetry.  That could potentially have some use
as well; for C++, it could be defined as char8_t and thereby reflect the
difference between the two languages.  Perhaps it could be useful in the
future as well if WG14 were to add distinct char8_t, char16_t, and char32_t
types as C++ did (I'm not offering any prediction regarding the likelihood
of that happening).

C++ already predefines
#define __CHAR8_TYPE__ unsigned char
#define __CHAR16_TYPE__ short unsigned int
#define __CHAR32_TYPE__ unsigned int
for -std={c,gnu}++2{0,a,3,b} or -fchar8_t (unless -fno-char8_t), so I agree
just making sure __CHAR8_TYPE__ is defined to unsigned char even for C
is best.
And you probably don't need to do anything in the C patch for it,
void
c_stddef_cpp_builtins(void)
{
   builtin_define_with_value ("__SIZE_TYPE__", SIZE_TYPE, 0);
...
   if (flag_char8_t)
 builtin_define_with_value ("__CHAR8_TYPE__", CHAR8_TYPE, 0);
   builtin_define_with_value ("__CHAR16_TYPE__", CHAR16_TYPE, 0);
   builtin_define_with_value ("__CHAR32_TYPE__", CHAR32_TYPE, 0);
will do that.


Thank you; I had forgotten that I had already done that work.  I 
confirmed that the proposed changes result in __CHAR8_TYPE__ being 
defined (the tests included with the patch already enforced it).
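
A minimal sketch of how code might consume __CHAR8_TYPE__, offered as an
assumption about typical use rather than anything taken from the patch.
The names utf8_unit_t and have_char8_type_macro are hypothetical.

```c
#include <assert.h>

/* Use the compiler-provided type when __CHAR8_TYPE__ is predefined and
   fall back to unsigned char otherwise.  */
#ifdef __CHAR8_TYPE__
typedef __CHAR8_TYPE__ utf8_unit_t;
#else
typedef unsigned char utf8_unit_t;
#endif

_Static_assert (sizeof (utf8_unit_t) == 1,
                "a UTF-8 code unit occupies one byte");

/* Report whether the compiler predefined __CHAR8_TYPE__.  */
int
have_char8_type_macro (void)
{
#ifdef __CHAR8_TYPE__
  return 1;
#else
  return 0;
#endif
}
```

Either branch yields a one-byte unsigned type for C, which matches the
observation above that the type would always be unsigned char.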


Tom.



Jakub





Re: [PATCH 1/3]: C N2653 char8_t: Language support

2021-06-11 Thread Tom Honermann via Gcc-patches

On 6/11/21 12:01 PM, Jakub Jelinek wrote:

On Fri, Jun 11, 2021 at 11:52:41AM -0400, Tom Honermann via Gcc-patches wrote:

On 6/7/21 5:11 PM, Joseph Myers wrote:

On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:


When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro
is predefined.  This is the mechanism proposed to glibc to opt-in to
declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed
in N2653.  See [2].

I don't think glibc should have such a feature test macro, and I don't
think GCC should define such feature test macros either - _*_SOURCE macros
are generally for the *user* to define to decide what namespace they want
visible, not for the compiler to define.  Without proliferating new
language dialects, __STDC_VERSION__ ought to be sufficient to communicate
from the compiler to the library (including to GCC's own headers such as
stdatomic.h).
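
To make the __STDC_VERSION__ suggestion concrete, here is a minimal
sketch of a header fragment gated on the language version. It is an
assumption about how such gating could look, not text from glibc or GCC;
EXAMPLE_C2X_FEATURES, example_char8_t, and example_declares_char8_t are
hypothetical names.

```c
#include <assert.h>

/* C17 reports 201710L; any larger value denotes a later language mode.  */
#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
# define EXAMPLE_C2X_FEATURES 1
#else
# define EXAMPLE_C2X_FEATURES 0
#endif

/* Declare the typedef only when a post-C17 mode is selected, so that
   existing uses of the identifier in older code are not disturbed.  */
#if EXAMPLE_C2X_FEATURES
typedef unsigned char example_char8_t;
#endif

/* Report whether the typedef above was declared.  */
int
example_declares_char8_t (void)
{
  return EXAMPLE_C2X_FEATURES;
}
```

No user-defined or compiler-defined _*_SOURCE macro is involved; the
selected -std option alone decides what is declared.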


In general I agree, but I think an exception is warranted in this case for a
few reasons:

1. The feature includes both core language changes (the change of type
for u8 string literals) and library changes.  The library changes
are not actually dependent on the core language change, but they are
intended to be used together.
2. Existing use of the char8_t identifier can be found in existing open
source projects and likely exists in some closed source projects as
well.  An opt-in approach avoids conflict and the need to
conditionalize code based on gcc version.
3. An opt-in approach enables evaluation of the feature prior to any
WG14 approval.

But calling it _CHAR8_T_SOURCE is weird and inconsistent with everything
else.
In C++, there is __cpp_char8_t 201811L predefined macro for char8_t.
Using that in C is not right, sure.
Often we use __SIZEOF_type__ macros not just for sizeof(), but also for
presence check of the types, like
#ifdef __SIZEOF_INT128__
__int128 i;
#else
long long i;
#endif
etc., while char8_t has sizeof (char8_t) == 1, perhaps predefining
__SIZEOF_CHAR8_T__ 1
instead of _CHAR8_T_SOURCE would be better?


I'm open to whatever signaling mechanism would be preferred.  It took me 
a while to settle on _CHAR8_T_SOURCE as the mechanism to propose as I 
didn't find much for other precedents.


I agree that having _CHAR8_T_SOURCE be implied by the -fchar8_t option 
is unusual with respect to other feature test macros.  Is that what you 
find to be weird and inconsistent?


Predefining __SIZEOF_CHAR8_T__ would be consistent with 
__SIZEOF_WCHAR_T__, but kind of strange too since the size is always 1.


Perhaps a better approach would be to follow the __CHAR16_TYPE__ and 
__CHAR32_TYPE__ precedent and define __CHAR8_TYPE__ to unsigned char.  
That is likewise a bit strange since the type would always be unsigned 
char, but it does provide a bit more symmetry.  That could potentially 
have some use as well; for C++, it could be defined as char8_t and 
thereby reflect the difference between the two languages.  Perhaps it 
could be useful in the future as well if WG14 were to add distinct 
char8_t, char16_t, and char32_t types as C++ did (I'm not offering any 
prediction regarding the likelihood of that happening).


Tom.



Jakub





Re: [PATCH 1/3]: C N2653 char8_t: Language support

2021-06-11 Thread Tom Honermann via Gcc-patches

On 6/7/21 5:12 PM, Joseph Myers wrote:

Also, it seems odd to add a new field to cpp_options without any code in
libcpp that uses the value of that field.

Ah, thank you.  That appears to be leftover code from prior 
experimentation and I failed to identify it as such when preparing the 
patch.  I'll provide a revised patch.


Tom.



Re: [PATCH 1/3]: C N2653 char8_t: Language support

2021-06-11 Thread Tom Honermann via Gcc-patches

On 6/7/21 5:11 PM, Joseph Myers wrote:

On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:


When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro
is predefined.  This is the mechanism proposed to glibc to opt-in to
declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed
in N2653.  See [2].

I don't think glibc should have such a feature test macro, and I don't
think GCC should define such feature test macros either - _*_SOURCE macros
are generally for the *user* to define to decide what namespace they want
visible, not for the compiler to define.  Without proliferating new
language dialects, __STDC_VERSION__ ought to be sufficient to communicate
from the compiler to the library (including to GCC's own headers such as
stdatomic.h).

In general I agree, but I think an exception is warranted in this case 
for a few reasons:


1. The feature includes both core language changes (the change of type
   for u8 string literals) and library changes.  The library changes
   are not actually dependent on the core language change, but they are
   intended to be used together.
2. Existing use of the char8_t identifier can be found in existing open
   source projects and likely exists in some closed source projects as
   well.  An opt-in approach avoids conflict and the need to
   conditionalize code based on gcc version.
3. An opt-in approach enables evaluation of the feature prior to any
   WG14 approval.

Tom.



Re: [PATCH 0/3]: C N2653 char8_t implementation

2021-06-11 Thread Tom Honermann via Gcc-patches

On 6/7/21 5:03 PM, Joseph Myers wrote:

On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:


These changes do not impact default gcc behavior.  The existing -fchar8_t
option is extended to C compilation to enable the N2653 changes, and
-fno-char8_t is extended to explicitly disable them.  N2653 has not yet been
accepted by WG14, so no changes are made to handling of the C2X language
dialect.

Why is that option needed?  Normally I'd expect features to be enabled or
disabled based on the selected language version, rather than having
separate options to adjust the configuration for one very specific feature
in a language version.  Adding extra language dialects not corresponding
to any standard version but to some peculiar mix of versions (such as C17
with a changed type for u8"", or C2X with a changed type for u8'') needs a
strong reason for those language dialects to be useful (for example, the
-fgnu89-inline option was justified by widespread use of GNU-style extern
inline in headers).


The option is needed because it impacts core language backward 
compatibility (for both C and C++, the type of u8 string literals; for 
C++, the type of u8 character literals and the new char8_t fundamental 
type).


The ability to opt-in or opt-out of the feature eases migration by 
enabling source code compatibility.  C and C++ standards are not 
published at the same cadence.  A project that targets C++20 and C17 may 
therefore have a need to either opt-out of char8_t support on the C++ 
side (already possible via -fno-char8_t), or to opt-in to char8_t 
support on the C side until such time as the targets change to C++20(+) 
and C23(+); assuming WG14 approval at some point.




I think the whole patch series would best wait until after the proposal
has been considered by a WG14 meeting, in addition to not increasing the
number of language dialects supported.


As an opt-in feature, this is useful to gain implementation and 
deployment experience for WG14.


It would be appropriate to document this as an experimental feature 
pending WG14 approval.  If WG14 declines it or approves it with 
different behavior, the feature can then be removed or changed.


The option could also be introduced as -fexperimental-char8_t if that 
eases concerns, though I do not favor that approach due to misalignment 
with the existing option for C++.


Tom.



[PATCH 3/3]: C N2653 char8_t: Documentation updates

2021-06-06 Thread Tom Honermann via Gcc-patches
This patch updates documentation for the -fchar8_t and -fno-char8_t 
options to describe their effect on C code as proposed in WG14 N2653 [1].


Tested on Linux x86_64.

2021-05-31  Tom Honermann  

* doc/invoke.texi (-fchar8_t): Update for char8_t support for C.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm


commit d3cb3c6648cc15fe1beea6c9799e044cb722148a
Author: Tom Honermann 
Date:   Sun May 30 16:57:09 2021 -0400

N2653 char8_t for C: Documentation updates

This change updates documentation for the -fchar8_t option to describe
its effect on C code as proposed in WG14 N2653 for C.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5cd4e2d993c..ba4c60a6179 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2884,14 +2884,27 @@ This flag is enabled by default for @option{-std=c++17}.
 @itemx -fno-char8_t
 @opindex fchar8_t
 @opindex fno-char8_t
-Enable support for @code{char8_t} as adopted for C++20.  This includes
-the addition of a new @code{char8_t} fundamental type, changes to the
-types of UTF-8 string and character literals, new signatures for
-user-defined literals, associated standard library updates, and new
-@code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature test macros.
+Enable support for @code{char8_t} for C as proposed in N2653, and for
+C++ as adopted for C++20.
+
+For C, this changes the type of UTF-8 string literals from array of
+@code{char} to array of @code{unsigned char} and defines the
+@code{_CHAR8_T_SOURCE} macro to inform the C standard library that the
+@code{char8_t} typedef name and the @code{mbrtoc8} and @code{c8rtomb}
+functions should be declared by @code{<uchar.h>}, and that the
+@code{atomic_char8_t} typedef name and the @code{ATOMIC_CHAR8_T_LOCK_FREE}
+macro should be defined by @code{<stdatomic.h>}.
+
+For C++, this enables the @code{char8_t} fundamental type, changes the
+type of UTF-8 string literals from array of @code{char} to array of
+@code{char8_t}, changes the type of character literals from @code{char}
+to @code{char8_t}, adds additional @code{char8_t}-based signatures for
+user-defined literals, enables associated standard library updates, and
+defines the @code{__cpp_char8_t} and @code{__cpp_lib_char8_t} feature
+test macros.
 
 This option enables functions to be overloaded for ordinary and UTF-8
-strings:
+strings in C++:
 
 @smallexample
 int f(const char *);// #1





[PATCH 2/3]: C N2653 char8_t: New tests

2021-06-06 Thread Tom Honermann via Gcc-patches
This patch provides new tests for the core language and compiler 
dependent library changes proposed in WG14 N2653 [1] for C.


Most of the tests are provided in both a positive (-fchar8_t) and 
negative (-fno-char8_t) form to ensure behaviors are appropriately 
present or absent in each mode.


Tested on Linux x86_64.

gcc/testsuite/ChangeLog:

2021-05-31  Tom Honermann  

* gcc.dg/atomic/stdatomic-lockfree-char8_t.c: New test.
* gcc.dg/char8_t-init-string-literal-1.c: New test.
* gcc.dg/char8_t-predefined-macros-1.c: New test.
* gcc.dg/char8_t-predefined-macros-2.c: New test.
* gcc.dg/char8_t-string-literal-1.c: New test.
* gcc.dg/char8_t-string-literal-2.c: New test.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm


commit 900aa3507defd80339828e5791c215a28efd9fea
Author: Tom Honermann 
Date:   Sat Feb 13 10:02:41 2021 -0500

N2653 char8_t for C: New tests

This change provides new tests for the core language and compiler
dependent library changes proposed in WG14 N2653 for C.

Some of the tests are provided in both a positive (-fchar8_t) and
negative (-fno-char8_t) form to ensure behaviors are appropriately
present or absent in each mode.

diff --git a/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c b/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c
new file mode 100644
index 000..bb9eae84e83
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/stdatomic-lockfree-char8_t.c
@@ -0,0 +1,42 @@
+/* Test atomic_is_lock_free for char8_t.  */
+/* { dg-do run } */
+/* { dg-options "-std=c11 -fchar8_t -pedantic-errors" } */
+
+#include 
+#include 
+
+extern void abort (void);
+
+_Atomic __CHAR8_TYPE__ ac8a;
+atomic_char8_t ac8t;
+
+#define CHECK_TYPE(MACRO, V1, V2)		\
+  do		\
+{		\
+  int r1 = MACRO;\
+  int r2 = atomic_is_lock_free (&V1);	\
+  int r3 = atomic_is_lock_free (&V2);	\
+  if (r1 != 0 && r1 != 1 && r1 != 2)	\
+	abort ();\
+  if (r2 != 0 && r2 != 1)			\
+	abort ();\
+  if (r3 != 0 && r3 != 1)			\
+	abort ();\
+  if (r1 == 2 && r2 != 1)			\
+	abort ();\
+  if (r1 == 2 && r3 != 1)			\
+	abort ();\
+  if (r1 == 0 && r2 != 0)			\
+	abort ();\
+  if (r1 == 0 && r3 != 0)			\
+	abort ();\
+}		\
+  while (0)
+
+int
+main ()
+{
+  CHECK_TYPE (ATOMIC_CHAR8_T_LOCK_FREE, ac8a, ac8t);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c b/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c
new file mode 100644
index 000..4d587e90a26
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-init-string-literal-1.c
@@ -0,0 +1,13 @@
+/* Test that char, signed char, and unsigned char arrays can still be
+   initialized by UTF-8 string literals if -fchar8_t is enabled.  */
+/* { dg-do compile } */
+/* { dg-options "-fchar8_t" } */
+
+char cbuf1[] = u8"text";
+char cbuf2[] = { u8"text" };
+
+signed char scbuf1[] = u8"text";
+signed char scbuf2[] = { u8"text" };
+
+unsigned char ucbuf1[] = u8"text";
+unsigned char ucbuf2[] = { u8"text" };
diff --git a/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c
new file mode 100644
index 000..884c634990d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-1.c
@@ -0,0 +1,16 @@
+// Test that char8_t related predefined macros are not present when -fchar8_t is
+// not enabled.
+// { dg-do compile }
+// { dg-options "-fno-char8_t" }
+
+#if defined(_CHAR8_T_SOURCE)
+# error _CHAR8_T_SOURCE is defined!
+#endif
+
+#if defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is defined!
+#endif
+
+#if defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c
new file mode 100644
index 000..7f425357f57
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-predefined-macros-2.c
@@ -0,0 +1,16 @@
+// Test that char8_t related predefined macros are present when -fchar8_t is
+// enabled.
+// { dg-do compile }
+// { dg-options "-fchar8_t" }
+
+#if !defined(_CHAR8_T_SOURCE)
+# error _CHAR8_T_SOURCE is not defined!
+#endif
+
+#if !defined(__CHAR8_TYPE__)
+# error __CHAR8_TYPE__ is not defined!
+#endif
+
+#if !defined(__GCC_ATOMIC_CHAR8_T_LOCK_FREE)
+# error __GCC_ATOMIC_CHAR8_T_LOCK_FREE is not defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c b/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c
new file mode 100644
index 000..df94582ac1d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/char8_t-string-literal-1.c
@@ -0,0 +1,6 @@
+// Test that UTF-8 string literals have type char[] if -fchar8_t is not enabled.
+// { dg-do compile }
+// { dg-options "-std=c11 -fno-char8_t" }
+
+_Static_assert (_Generic (u8"text", char*: 1, unsi

[PATCH 1/3]: C N2653 char8_t: Language support

2021-06-06 Thread Tom Honermann via Gcc-patches
This patch implements the core language and compiler dependent library 
changes proposed in WG14 N2653 [1] for C.  The changes include:

- Use of the existing -fchar8_t and -fno-char8_t options to opt-in to
  (or opt-out of) the following changes when compiling C code.
- Change of type for UTF-8 string literals from array of char to array
  of char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of a new
  predefined __GCC_ATOMIC_CHAR8_T_LOCK_FREE macro.
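
For illustration only, the relationship between a *_LOCK_FREE macro and
atomic_is_lock_free can be sketched as below. Because atomic_char8_t and
ATOMIC_CHAR8_T_LOCK_FREE may not be provided by a given <stdatomic.h>,
this sketch substitutes _Atomic unsigned char and ATOMIC_CHAR_LOCK_FREE;
that substitution and the name lock_free_macro_is_consistent are
assumptions of the example, not part of the patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* unsigned char stands in for char8_t in this sketch.  */
static _Atomic unsigned char unit;

/* A *_LOCK_FREE macro is 0 (never lock-free), 1 (sometimes lock-free),
   or 2 (always lock-free).  Return 1 when the macro value and the
   run-time query agree, and 0 otherwise.  */
int
lock_free_macro_is_consistent (void)
{
  int macro = ATOMIC_CHAR_LOCK_FREE;
  int query = atomic_is_lock_free (&unit) ? 1 : 0;

  if (macro == 2)
    return query == 1;
  if (macro == 0)
    return query == 0;
  return macro == 1;
}
```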

When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE 
macro is predefined.  This is the mechanism proposed to glibc to opt-in 
to declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions 
proposed in N2653.  See [2].


Tested on Linux x86_64.

gcc/ChangeLog:

2021-05-31  Tom Honermann  

 * ginclude/stdatomic.h (atomic_char8_t, ATOMIC_CHAR8_T_LOCK_FREE):
   New typedef and macro.

gcc/c/ChangeLog:

2021-05-31  Tom Honermann  

 * c-parser.c (c_parser_string_literal): Use char8_t as the type of
   CPP_UTF8STRING when char8_t support is enabled.
 * c-typeck.c (digest_init): Handle initialization of an array
   of character type by a string literal with type array of
   unsigned char.

gcc/c-family/ChangeLog:

2021-05-31  Tom Honermann  

 * c-cppbuiltin.c (c_cpp_builtins): Define _CHAR8_T_SOURCE if
   char8_t support is enabled in non-C++ language modes.
 * c-lex.c (lex_string): Use char8_t as the type of
   CPP_UTF8STRING when char8_t support is enabled.
 * c-opts.c (c_common_handle_option): Inform the preprocessor if
   char8_t support is enabled.
 * c.opt (fchar8_t): Enable for C language modes.

libcpp/ChangeLog:

2021-05-31  Tom Honermann  

 * include/cpplib.h (cpp_options): Add char8.

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[2]: C++20 P0482R6 and C2X N2653: support for char8_t, mbrtoc8(), and
 c8rtomb().
 [Patch 0]: https://sourceware.org/pipermail/libc-alpha/2021-June/127230.html
 [Patch 1]: https://sourceware.org/pipermail/libc-alpha/2021-June/127231.html
 [Patch 2]: https://sourceware.org/pipermail/libc-alpha/2021-June/127232.html
 [Patch 3]: https://sourceware.org/pipermail/libc-alpha/2021-June/127233.html

commit c4260c7c49822522945377cc2fb93ee9830cefc8
Author: Tom Honermann 
Date:   Sat Feb 13 09:02:34 2021 -0500

N2653 char8_t for C: Language support

This patch implements the core language and compiler-dependent library
changes proposed in WG14 N2653 for C.  The changes include:
- Use of the existing -fchar8_t and -fno-char8_t options to opt in to
  (or opt out of) the following changes when compiling C code.
- Change of type for UTF-8 string literals from array of char to
  array of char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of a new
  predefined __GCC_ATOMIC_CHAR8_T_LOCK_FREE macro.

When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE
macro is predefined.  This is the mechanism proposed to glibc to opt in
to declarations of the char8_t typedef and the c8rtomb and mbrtoc8
functions proposed in N2653.

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 42b7604c9ac..3e944ec2b86 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1467,6 +1467,11 @@ c_cpp_builtins (cpp_reader *pfile)
   if (flag_iso)
     cpp_define (pfile, "__STRICT_ANSI__");
 
+  /* Express intent for char8_t support in C (not C++) to the C library if
+     requested.  */
+  if (!c_dialect_cxx () && flag_char8_t)
+    cpp_define (pfile, "_CHAR8_T_SOURCE");
+
   if (!flag_signed_char)
     cpp_define (pfile, "__CHAR_UNSIGNED__");
 
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index c44e7a13489..e30e44e9f5c 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1335,7 +1335,14 @@ lex_string (const cpp_token *tok, tree *valp, bool objc_string, bool translate)
 	default:
 	case CPP_STRING:
 	case CPP_UTF8STRING:
-	  value = build_string (1, "");
+	  if (type == CPP_UTF8STRING && flag_char8_t)
+	    {
+	      value = build_string (TYPE_PRECISION (char8_type_node)
+				    / TYPE_PRECISION (char_type_node),
+				    "");  /* char8_t is 8 bits */
+	    }
+	  else
+	    value = build_string (1, "");
 	  break;
 	case CPP_STRING16:
 	  value = build_string (TYPE_PRECISION (char16_type_node)
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 60b5802722c..eefc607dac6 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -718,6 +718,10 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
 case OPT_v:
   verbose = true;
   break;
+
+case OPT_fchar8_t:
+  c

[PATCH 0/3]: C N2653 char8_t implementation

2021-06-06 Thread Tom Honermann via Gcc-patches
This series of patches implements the core language features for the 
WG14 N2653 [1] proposal to provide char8_t support in C.  These changes 
are intended to align char8_t support in C with the support provided in 
C++20 via WG21 P0482R6 [2].


These changes do not impact default gcc behavior.  The existing 
-fchar8_t option is extended to C compilation to enable the N2653 
changes, and -fno-char8_t is extended to explicitly disable them.  N2653 
has not yet been accepted by WG14, so no changes are made to handling of 
the C2X language dialect.
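
Since the option only changes the element type of UTF-8 string literals,
code that initializes character arrays from them is expected to build
either way.  The sketch below is illustrative and not part of the patch
series; the array and function names are hypothetical.  It relies on the
C rule that an array of any character type may be initialized by a UTF-8
string literal, and on the digest_init change in patch 1 for the case
where the literal has type array of unsigned char.

```c
#include <assert.h>
#include <stddef.h>

/* Both initializations are intended to be valid whether u8"text" has
   type char[5] (-fno-char8_t) or unsigned char[5] (-fchar8_t).  */
static const unsigned char utf8_bytes[] = u8"text";
static const char plain_bytes[] = u8"text";

static_assert (sizeof utf8_bytes == sizeof plain_bytes,
	       "both arrays hold the same number of code units");

/* Hypothetical accessors: each array holds four code units plus the
   terminating NUL.  */
size_t
utf8_bytes_size (void)
{
  return sizeof utf8_bytes;
}

size_t
plain_bytes_size (void)
{
  return sizeof plain_bytes;
}
```

Code that instead passes a UTF-8 string literal where a char * is expected
is the kind of code whose behavior may differ between the two modes.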


Patch 1: Language support
Patch 2: New tests
Patch 3: Documentation updates

Tom.

[1]: WG14 N2653
 "char8_t: A type for UTF-8 characters and strings (Revision 1)"
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm

[2]: WG21 P0482R6
 "char8_t: A type for UTF-8 characters and strings (Revision 6)"
 https://wg21.link/p0482r6