date:20221207

Re: [PATCH] Fix aarch64 PR 99657: ICE with SVE types used without an error

2022-12-07 Thread Richard Sandiford via Gcc-patches

"Kewen.Lin"  writes:
> on 2022/12/7 20:55, Richard Sandiford wrote:
>> "Kewen.Lin"  writes:
>>> Hi Richard,
>>>
>>> on 2022/12/7 17:16, Richard Sandiford wrote:
>>>> "Kewen.Lin"  writes:
>>>>> Hi,
>>>>>
>>>>> In the recent discussion on how to make some built-in type only valid for
>>>>> some target features efficiently[1], Andrew mentioned this patch which he
>>>>> made previously (Thanks!).  I confirmed it can help rs6000 related issue,
>>>>> and noticed PR99657 is still opened, so I think we still want this to
>>>>> be reviewed.
>>>>
>>>> But does it work for things like:
>>>>
>>>> void f(foo_t *x, foo_t *y) { *x = *y; }
>>>>
>>>> where no variables are being created with foo_t type?
>>>>
>>>
>>> I think it can work for this case as it touches build_indirect_ref.
>> 
>> Ah, ok.  But indirecting through a pointer doesn't seem to match
>> TCTX_AUTO_STORAGE.
>> 
>
> Indeed. :)
>
>> I guess another case is where there are global variables of the type
>> that you want to forbid, compiled while the target feature is enabled,
>> and then a function tries to access those variables with the target
>> feature locally disabled (through a pragma or attribute).  Does that
>> case work?
>> 
>
> Thanks for pointing this out.  I tried the test case below:
>
> __vector_quad a1;
> __vector_quad a2;
>
> __attribute__((target("cpu=power8")))
> void foo ()
> {
>   a2 = a1;
> }
>
> As you suspected, verify_type_context doesn't catch it; I think
> it needs some enhancement somewhere.

FWIW, another possible case is:

  foo_t f();
  void g(foo_t);
  void h() { g(f()); }

I'm not aware of any verify_type_context checks that would catch this
for SVE (since it's valid for SVE types).

Thanks,
Richard

Re: AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

2022-12-07 Thread Richard Sandiford via Gcc-patches

"Pop, Sebastian"  writes:
> Hi Richard,
>
>
> Please find attached a patch that follows your recommendations to generate 
> the BTI_C instructions.
>
> Please let me know if the patch can be further improved.
>
> The patch passed bootstrap and regression tests on arm64-linux.

LGTM.  OK for trunk, thanks, and for release branches after a grace period.

Richard

> Thanks,
>
> Sebastian
>
> 
> From: Richard Sandiford 
> Sent: Wednesday, December 7, 2022 3:12:08 AM
> To: Pop, Sebastian
> Cc: gcc-patches@gcc.gnu.org; seb...@gmail.com; Kyrylo Tkachov
> Subject: RE: [EXTERNAL]AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>
> "Pop, Sebastian"  writes:
>> Thanks Richard for your review and for pointing out the issue with BTI.
>>
>>
>> The current patch removes the existing BTI instruction,
>>
>> and then adds the BTI hint when expanding the patchable_area pseudo.
>
> Thanks.  I still think...
>
>> The attached patch passed bootstrap and regression test on arm64-linux.
>>
>> Ok to commit to gcc trunk?
>>
>>
>> Thank you,
>> Sebastian
>>
>> 
>> From: Richard Sandiford 
>> Sent: Monday, December 5, 2022 5:34:40 AM
>> To: Pop, Sebastian
>> Cc: gcc-patches@gcc.gnu.org; seb...@gmail.com; Kyrylo Tkachov
>> Subject: RE: [EXTERNAL]AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>>
>> "Pop, Sebastian"  writes:
>>> Hi,
>>>
>>> Currently patchable area is at the wrong place on AArch64.  It is placed
>>> immediately after function label, before .cfi_startproc.  This patch
>>> adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
>>> modifies aarch64_print_patchable_function_entry to avoid placing
>>> patchable area before .cfi_startproc.
>>>
>>> The patch passed bootstrap and regression test on aarch64-linux.
>>> Ok to commit to trunk and backport to active release branches?
>>
>> Looks good, but doesn't the problem described in the PR then still
>> apply to the BTI emitted by:
>>
>>   if (cfun->machine->label_is_assembled
>>   && aarch64_bti_enabled ()
>>   && !cgraph_node::get (cfun->decl)->only_called_directly_p ())
>> {
>>   /* Remove the BTI that follows the patch area and insert a new BTI
>>  before the patch area right after the function label.  */
>>   rtx_insn *insn = next_real_nondebug_insn (get_insns ());
>>   if (insn
>>   && INSN_P (insn)
>>   && GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE
>>   && XINT (PATTERN (insn), 1) == UNSPECV_BTI_C)
>> delete_insn (insn);
>>   asm_fprintf (file, "\thint\t34 // bti c\n");
>> }
>>
>> ?  It seems like the BTI will be before the cfi_startproc and the
>> patchable entry afterwards.
>>
>> I guess we should keep the BTI instruction as-is (rather than printing
>> a .hint) and emit the new UNSPECV_PATCHABLE_AREA after the BTI rather
>> than before it.
>
> ...this approach would be slightly cleaner though.  The .hint asm string
> we're emitting here is exactly the same as the one emitted by the
> original bti_c instruction.  The only reason for deleting the
> instruction and emitting text was because we were emitting the
> patchable entry directly as text, and the BTI text had to come
> before the patchable entry text.
>
> Now that we're emitting the patchable entry via a normal instruction
> (a good thing!) we can keep the preceding bti_c as a normal instruction
> too.  That is, I think we should use emit_insn_after to emit the entry
> after the bti_c insn (if it exists) instead of before BB_HEAD.
>
> Thanks,
> Richard
>
>>> gcc/
>>> PR target/93492
>>> * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
>>> Declared.
>>> * config/aarch64/aarch64.cc 
>>> (aarch64_print_patchable_function_entry):
>>> Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
>>> (aarch64_output_patchable_area): New.
>>> * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
>>> (patchable_area): Define.
>>>
>>> gcc/testsuite/
>>> PR target/93492
>>> * gcc.target/aarch64/pr98776.c: New.
>>>
>>>
>>> From b9cf87bcdf65f515b38f1851eb95c18aaa180253 Mon Sep 17 00:00:00 2001
>>> From: Sebastian Pop 
>>> Date: Wed, 30 Nov 2022 19:45:24 +
>>> Subject: [PATCH] AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>>>
>>> Currently patchable area is at the wrong place on AArch64.  It is placed
>>> immediately after function label, before .cfi_startproc.  This patch
>>> adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
>>> modifies aarch64_print_patchable_function_entry to avoid placing
>>>

Re: [PATCH] docs: Suggest options to improve ASAN stack traces

2022-12-07 Thread Florian Weimer via Gcc-patches

* Marek Polacek via Gcc-patches:

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 726392409b6..2de14466dd3 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -16510,6 +16510,14 @@ The option cannot be combined with 
> @option{-fsanitize=thread} or
>  @option{-fsanitize=hwaddress}.  Note that the only target
>  @option{-fsanitize=hwaddress} is currently supported on is AArch64.
>  
> +To get more accurate stack traces, it is possible to use options such as
> +@option{-O} (which, for instance, prevents most function inlining),
> +@option{-fno-optimize-sibling-calls} (which prevents optimizing sibling
> +and tail recursive calls), or @option{-fno-ipa-icf} (which disables Identical
> +Code Folding for functions and read-only variables).  Since multiple runs
> +of the program may yield backtraces with different addresses due to ASLR,
> +it may be desirable to turn off ASLR: @samp{setarch `uname -m` -R ./prog}.

What about -fasynchronous-unwind-tables?  It should help if ASAN ever
reports stray segmentation faults.  Whether it also helps in general
depends on whether ASAN maintains ABI around its instrumentation.

Thanks,
Florian

[PATCH v4, rs6000] Enable have_cbranchcc4 on rs6000

2022-12-07 Thread HAO CHEN GUI via Gcc-patches

Hi,
  This patch enables "have_cbranchcc4" on rs6000 by defining
a "cbranchcc4" expander. "have_cbranchcc4" is a flag in ifcvt.cc
that indicates whether branching on CC bits is valid. With this
flag enabled, some branches can be optimized to conditional
moves.

  Compared to the last version, the main changes are to the test
cases: a test case is renamed and comments are modified.

  Bootstrapped and tested on powerpc64-linux BE and LE with no
regressions. Is this okay for trunk? Any recommendations? Thanks
a lot.

BR
Gui Haochen

ChangeLog
2022-12-07  Haochen Gui 

gcc/
* config/rs6000/rs6000.md (cbranchcc4): New expander.

gcc/testsuite
* gcc.target/powerpc/cbranchcc4-1.c: New.
* gcc.target/powerpc/cbranchcc4-2.c: New.

patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index e9e5cd1e54d..d7ddd96cc70 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11932,6 +11932,16 @@ (define_expand "cbranch4"
   DONE;
 })

+(define_expand "cbranchcc4"
+  [(set (pc)
+   (if_then_else (match_operator 0 "branch_comparison_operator"
+   [(match_operand 1 "cc_reg_operand")
+(match_operand 2 "zero_constant")])
+ (label_ref (match_operand 3))
+ (pc)))]
+  ""
+  "")
+
 (define_expand "cstore4_signed"
   [(use (match_operator 1 "signed_comparison_operator"
  [(match_operand:P 2 "gpc_reg_operand")
diff --git a/gcc/testsuite/gcc.target/powerpc/cbranchcc4-1.c 
b/gcc/testsuite/gcc.target/powerpc/cbranchcc4-1.c
new file mode 100644
index 000..6c2cd130b6d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/cbranchcc4-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Verify there is no ICE with cbranchcc4 enabled.  */
+
+int foo (double d)
+{
+  if (d == 0.0)
+return 0;
+
+  d = ((d) >= 0 ? (d) : -(d));
+
+  if (d < 1.0)
+return 1;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/cbranchcc4-2.c 
b/gcc/testsuite/gcc.target/powerpc/cbranchcc4-2.c
new file mode 100644
index 000..528ba1a878d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/cbranchcc4-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-ce1" } */
+/* { dg-final { scan-rtl-dump "noce_try_store_flag_constants" "ce1" } } */
+
+/* The inner branch should be detected by ifcvt then be converted to a setcc
+   with a plus by noce_try_store_flag_constants.  */
+
+int test (unsigned int a, unsigned int b)
+{
+return (a < b ? 0 : (a > b ? 2 : 1));
+}
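
As a rough illustration of what the second test is looking for, here is a
C-level sketch of the same three-way comparison written once with branches
and once as straight-line code that adds two comparison results.  The
function names are made up for this note, and the branch-free form is only
a source-level analogy for the "setcc with a plus" that the test comment
mentions; it is not the RTL that ifcvt actually produces.

```c
#include <assert.h>

/* Branchy form: the same expression as in cbranchcc4-2.c.  */
int
three_way_branchy (unsigned int a, unsigned int b)
{
  return (a < b ? 0 : (a > b ? 2 : 1));
}

/* Branch-free form: each comparison yields 0 or 1 and the two results
   are added.  Hypothetical helper, for illustration only.  */
int
three_way_flat (unsigned int a, unsigned int b)
{
  return (a >= b) + (a > b);
}

/* Checks that both forms agree for the three possible orderings.  */
void
check_three_way (void)
{
  assert (three_way_flat (1, 2) == three_way_branchy (1, 2));
  assert (three_way_flat (2, 2) == three_way_branchy (2, 2));
  assert (three_way_flat (3, 2) == three_way_branchy (3, 2));
}
```

Whether the compiler picks a conditional move, a setcc sequence, or keeps
the branch depends on the target and on cost decisions that this sketch
does not model.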

Re: [PATCH] Fix aarch64 PR 99657: ICE with SVE types used without an error

2022-12-07 Thread Kewen.Lin via Gcc-patches

on 2022/12/7 20:55, Richard Sandiford wrote:
> "Kewen.Lin"  writes:
>> Hi Richard,
>>
>> on 2022/12/7 17:16, Richard Sandiford wrote:
>>> "Kewen.Lin"  writes:
>>>> Hi,
>>>>
>>>> In the recent discussion on how to make some built-in type only valid for
>>>> some target features efficiently[1], Andrew mentioned this patch which he
>>>> made previously (Thanks!).  I confirmed it can help rs6000 related issue,
>>>> and noticed PR99657 is still opened, so I think we still want this to
>>>> be reviewed.
>>>
>>> But does it work for things like:
>>>
>>> void f(foo_t *x, foo_t *y) { *x = *y; }
>>>
>>> where no variables are being created with foo_t type?
>>>
>>
>> I think it can work for this case as it touches build_indirect_ref.
> 
> Ah, ok.  But indirecting through a pointer doesn't seem to match
> TCTX_AUTO_STORAGE.
> 

Indeed. :)

> I guess another case is where there are global variables of the type
> that you want to forbid, compiled while the target feature is enabled,
> and then a function tries to access those variables with the target
> feature locally disabled (through a pragma or attribute).  Does that
> case work?
> 

Thanks for pointing this out.  I tried the test case below:

__vector_quad a1;
__vector_quad a2;

__attribute__((target("cpu=power8")))
void foo ()
{
  a2 = a1;
}

As you suspected, verify_type_context doesn't catch it; I think
it needs some enhancement somewhere.

> That's not an issue for SVE because global variables can't have
> sizeless type.
> 
>>> That's not to say we shouldn't have the patch.  I'm just not sure
>>> it can be the complete solution.
>>
>> I'm not sure about that either, maybe Andrew have more insights.
>> But as you pointed out in [1], I doubted trying to find all invalid
>> uses of a built-in type is worthwhile, it seems catching those usual
>> cases is enough and practical.  So if this verify_type_context
>> framework can cover the most of uses, maybe it's a good direction
>> to go and extend.
> 
> IMO it depends on what we're trying to protect against.  If the
> compiler can handle these types correctly even when the target feature
> is disabled, and we're simply disallowing the types for policy rather
> than correctness reasons, then maybe just handling the usual cases is
> good enough.  But things are different if the compiler is going to ICE
> or generate invalid code when something slips through.  In that case,
> I think the niche cases matter too.
> 

Thanks for the clarification, good point, I agree!  It means we still
need some handling in movoo and movxo to avoid possible ICEs, which can
still be caused by cases like the one above or similar.  This
verify_type_context checking is only a nice add-on to improve the
diagnostics for invalid built-in types.  I'm going to fix the expanders;
that should be independent of this patch.

BR,
Kewen

Re: [PATCH] doc: Correct a clerical error in the document.

2022-12-07 Thread Lulu Cheng




On 2022/12/7 at 6:05 PM, Richard Sandiford wrote:
> Lulu Cheng  writes:
>> gcc/ChangeLog:
>>
>> * doc/rtl.texi: Correct a clerical error in the document.
>> ---
>>   gcc/doc/rtl.texi | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
>> index 43c9ee8bffe..44858d12892 100644
>> --- a/gcc/doc/rtl.texi
>> +++ b/gcc/doc/rtl.texi
>> @@ -2142,7 +2142,7 @@ stores the lower 2 bytes of @var{y} in @var{x} and 
>> discards the upper
>>   (set @var{z} (subreg:SI (reg:HI @var{x}) 0))
>>   @end smallexample
>>   
>> -would set the lower two bytes of @var{z} to @var{y} and set the upper
>> +would set the lower two bytes of @var{z} to @var{x} and set the upper
>>   two bytes to an unknown value assuming @code{SUBREG_PROMOTED_VAR_P} is
>>   false.
>
> Both versions are right in their way.  I think the intention of the
> original was to show the effect of moving y to z via a paradoxical
> subreg on x.
>
> How about:
>
>    would set the lower two bytes of @var{z} to @var{x} (which contains
>    the lower two bytes of @var{y}) and set the upper ...
>
> OK with that change if you agree.
>
> Richard


Ah, I see. The second test case follows the first test case. :-)

So I think this change is not necessary.

Thanks.

Add zstd support to libbacktrace

2022-12-07 Thread Ian Lance Taylor via Gcc-patches

This patch adds zstd support to libbacktrace, to support the new
linker option --compress-debug-sections=zstd.
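
For readers unfamiliar with how a reader tells the two formats apart: a
compressed ELF section starts with a compression header whose ch_type
field names the algorithm (1 for zlib, 2 for zstd in the ELF gABI).  The
sketch below shows that dispatch in isolation.  The struct and function
names are invented for this note and only mirror the 64-bit header
layout; this is not the code in elf.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ch_type values from the ELF gABI.  */
#define ELFCOMPRESS_ZLIB 1
#define ELFCOMPRESS_ZSTD 2

/* Hypothetical mirror of the 64-bit compression header that starts a
   compressed section.  */
struct chdr64_sketch
{
  uint32_t ch_type;      /* Compression algorithm.  */
  uint32_t ch_reserved;
  uint64_t ch_size;      /* Uncompressed size in bytes.  */
  uint64_t ch_addralign; /* Alignment of the uncompressed data.  */
};

/* Returns 1 if the header names an algorithm this sketch recognizes
   (zlib or zstd), 0 otherwise.  */
int
chdr_is_recognized (const struct chdr64_sketch *hdr)
{
  assert (hdr != NULL);
  switch (hdr->ch_type)
    {
    case ELFCOMPRESS_ZLIB:
    case ELFCOMPRESS_ZSTD:
      return 1;
    default:
      return 0;
    }
}
```

A real reader would go on to decompress ch_size bytes with the matching
decompressor; the sketch stops at the dispatch decision.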

The zstd format is fairly complicated, so it's likely that there are
some bugs here.  It does pass the tests, at least.

Unfortunately this decompressor only runs at about 1/3 the speed of
the zstd library decompressor.  Still, it's smaller and simpler, and I
think it uses less memory.  Plus of course it uses the signal-safe
libbacktrace memory allocator.  Perhaps people can make it a bit faster
over time.

Bootstrapped and ran libbacktrace and Go tests while using a linker
that compressed using zstd.

Committed to mainline.

Ian

Support decompressing --compress-debug-sections=zstd.
* configure.ac: Check for zstd library and
--compress-debug-sections=zstd linker option.
* Makefile.am (zstdtest_*): New targets.
(zstdtest_alloc_*, ctestzstd_*): New targets.
(BUILDTESTS): Add zstdtest, zstdtest_alloc, ctestzstd as
appropriate.
* elf.c (ELFCOMPRESS_ZSTD): Define.
(elf_fetch_bits): Rename from elf_zlib_fetch.  Update uses.
(elf_fetch_bits_backward): New static function.
(ZLIB_HUFFMAN_*): Rename from HUFFMAN_*.  Update uses.
(ZLIB_TABLE_*): Rename from ZDEBUG_TABLE_*.  Update uses.
(ZSTD_TABLE_*): Define.
(struct elf_zstd_fse_entry): Define.
(elf_zstd_read_fse): New static function.
(elf_zstd_build_fse): Likewise.
(lit): Define if BACKTRACE_GENERATE_ZSTD_FSE_TABLES.
(match, offset, next, print_table, main): Likewise.
(elf_zstd_lit_table): New static const array.
(elf_zstd_match_table, elf_zstd_offset_table): Likewise.
(elf_zstd_read_huff): New static function.
(struct elf_zstd_seq_decode): Define.
(elf_zstd_unpack_seq_decode): New static function.
(ZSTD_LIT_*): Define.
(struct elf_zstd_literals): Define.
(elf_zstd_literal_output): New static function.
(ZSTD_LITERAL_LENGTH_BASELINE_OFFSET): Define.
(elf_zstd_literal_length_baseline): New static const array.
(elf_zstd_literal_length_bits): Likewise.
(ZSTD_MATCH_LENGTH_BASELINE_OFFSET): Define.
(elf_zstd_match_length_baseline): New static const array.
(elf_zstd_match_length_bits): Likewise.
(elf_zstd_decompress): New static function.
(ZDEBUG_TABLE_SIZE): New definition.
(elf_uncompress_chdr): Support ELFCOMPRESS_ZSTD.
(backtrace_uncompress_zstd): New function.
(elf_add): Use ZLIB_TABLE_SIZE for zlib-gnu sections.
* internal.h (backtrace_uncompress_zstd): Declare.
* zstdtest.c: New file.
* configure, config.h.in, Makefile.in: Regenerate.
diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
index 9f8516d00e2..047b573c29a 100644
--- a/libbacktrace/Makefile.am
+++ b/libbacktrace/Makefile.am
@@ -368,6 +368,25 @@ ztest_alloc_CFLAGS = $(ztest_CFLAGS)
 
 BUILDTESTS += ztest_alloc
 
+zstdtest_SOURCES = zstdtest.c testlib.c
+zstdtest_CFLAGS = $(libbacktrace_TEST_CFLAGS) -DSRCDIR=\"$(srcdir)\"
+zstdtest_LDADD = libbacktrace.la
+zstdtest_alloc_LDADD = libbacktrace_alloc.la
+
+if HAVE_ZSTD
+zstdtest_LDADD += -lzstd
+zstdtest_alloc_LDADD += -lzstd
+endif
+zstdtest_LDADD += $(CLOCK_GETTIME_LINK)
+zstdtest_alloc_LDADD += $(CLOCK_GETTIME_LINK)
+
+BUILDTESTS += zstdtest
+
+zstdtest_alloc_SOURCES = $(zstdtest_SOURCES)
+zstdtest_alloc_CFLAGS = $(zstdtest_CFLAGS)
+
+BUILDTESTS += zstdtest_alloc
+
 endif HAVE_ELF
 
 edtest_SOURCES = edtest.c edtest2_build.c testlib.c
@@ -450,6 +469,17 @@ ctesta_LDADD = libbacktrace.la
 
 BUILDTESTS += ctestg ctesta
 
+if HAVE_COMPRESSED_DEBUG_ZSTD
+
+ctestzstd_SOURCES = btest.c testlib.c
+ctestzstd_CFLAGS = $(libbacktrace_TEST_CFLAGS)
+ctestzstd_LDFLAGS = -Wl,--compress-debug-sections=zstd
+ctestzstd_LDADD = libbacktrace.la
+
+BUILDTESTS += ctestzstd
+
+endif
+
 ctestg_alloc_SOURCES = $(ctestg_SOURCES)
 ctestg_alloc_CFLAGS = $(ctestg_CFLAGS)
 ctestg_alloc_LDFLAGS = $(ctestg_LDFLAGS)
diff --git a/libbacktrace/configure.ac b/libbacktrace/configure.ac
index 1daaa2f62d2..d0a0475cfa8 100644
--- a/libbacktrace/configure.ac
+++ b/libbacktrace/configure.ac
@@ -495,6 +495,21 @@ AC_LINK_IFELSE([AC_LANG_PROGRAM(,)],
 LDFLAGS=$LDFLAGS_hold])
 AM_CONDITIONAL(HAVE_COMPRESSED_DEBUG, test "$libgo_cv_ld_compress" = yes)
 
+AC_CHECK_LIB([zstd], [ZSTD_compress],
+[AC_DEFINE(HAVE_ZSTD, 1, [Define if -lzstd is available.])])
+AM_CONDITIONAL(HAVE_ZSTD, test "$ac_cv_lib_zstd_ZSTD_compress" = yes)
+
+dnl Test whether the linker supports --compress-debug-sections=zstd option.
+AC_CACHE_CHECK([whether --compress-debug-sections=zstd is supported],
+[libgo_cv_ld_compress_zstd],
+[LDFLAGS_hold=$LDFLAGS
+LDFLAGS="$LDFLAGS -Wl,--compress-debug-sections=zstd"
+AC_LINK_IFELSE([AC_LANG_PROGRAM(,)],
+[libgo_cv_ld_compress_zstd=yes],
+[libgo_cv_ld_compress_zstd=no])
+LDFLAGS=$LDFLAGS_hold])
+AM_CONDITIONAL(HAVE_COMPRESSED_DEBUG_ZSTD, test "$libgo_cv_ld_compress_zstd" = 
yes)
+
 AC_ARG_VAR(OBJCOPY, [location of objcopy])
 AC_CHECK_PROG(OBJCOPY, objcopy, objcopy,)
 AC_CHECK_PROG(READELF, readelf, readelf)
diff --git a/libbacktrace/elf.c b/libbacktrace/elf.c
index 181d195fe35..15e6f284db6 100644
--- a/libbacktrace/elf.c
+++ b/libbacktrace/elf.c
@@ -184,6 +184,7 @@

[committed] c: Diagnose auto constexpr used with a type

2022-12-07 Thread Joseph Myers

The constraints on auto in C2x disallow use with other storage-class
specifiers unless the type is inferred from an initializer.  That
includes constexpr; add the missing checks for this case (the
combination of auto, constexpr and a type specifier).
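
To make the rule concrete, here is a tiny model of the check, written as
ordinary C rather than as C2x declarations so that it compiles with older
compilers.  The struct and function are hypothetical and are not GCC's
c_declspecs machinery; they only encode the constraint described above,
that auto together with constexpr is accepted only when no type specifier
is present.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the specifiers seen in one declaration.  */
struct decl_specs_model
{
  bool has_auto;
  bool has_constexpr;
  bool has_type;
};

/* Returns true if the combination should be diagnosed: auto and
   constexpr may appear together only when the type is inferred.  */
bool
auto_constexpr_conflict (struct decl_specs_model s)
{
  return s.has_auto && s.has_constexpr && s.has_type;
}

/* Mirrors the declarations touched in the testsuite changes.  */
void
check_auto_constexpr_model (void)
{
  /* auto constexpr int fv1 = 4;  -> diagnosed.  */
  struct decl_specs_model with_type = { true, true, true };
  /* auto constexpr fv1 = 4;      -> accepted, type is inferred.  */
  struct decl_specs_model inferred = { true, true, false };
  assert (auto_constexpr_conflict (with_type));
  assert (!auto_constexpr_conflict (inferred));
}
```

The order of the three specifiers does not matter to the model, which
matches the six permutations exercised in c2x-constexpr-3.c.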

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (declspecs_add_type, declspecs_add_scspec): Check for
auto, constexpr and a type used together.

gcc/testsuite/
* gcc.dg/c2x-constexpr-1.c: Do not use auto, constexpr and a type
together.
* gcc.dg/c2x-constexpr-3.c: Add tests of auto, constexpr and type
used together.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 111f05e2a40..e47ca6718b3 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -11430,6 +11430,10 @@ declspecs_add_type (location_t loc, struct c_declspecs 
*specs,
   else if (specs->thread_p)
error ("%qs used with %",
   specs->thread_gnu_p ? "__thread" : "_Thread_local");
+  else if (specs->constexpr_p)
+   /* auto may only be used with another storage class specifier,
+  such as constexpr, if the type is inferred.  */
+   error ("%<auto%> used with %<constexpr%>");
   else
specs->storage_class = csc_auto;
 }
@@ -12363,6 +12367,10 @@ declspecs_add_scspec (location_t loc,
  return specs;
}
   n = csc_auto;
+  /* auto may only be used with another storage class specifier,
+such as constexpr, if the type is inferred.  */
+  if (specs->constexpr_p)
+   error ("%qE used with %<constexpr%>", scspec);
   break;
 case RID_EXTERN:
   n = csc_extern;
@@ -12393,6 +12401,10 @@ declspecs_add_scspec (location_t loc,
error ("%qE used with %", scspec);
   else if (specs->storage_class == csc_typedef)
error ("%qE used with %", scspec);
+  else if (specs->storage_class == csc_auto)
+   /* auto may only be used with another storage class specifier,
+  such as constexpr, if the type is inferred.  */
+   error ("%qE used with %<auto%>", scspec);
   else if (specs->thread_p)
error ("%qE used with %qs", scspec,
   specs->thread_gnu_p ? "__thread" : "_Thread_local");
diff --git a/gcc/testsuite/gcc.dg/c2x-constexpr-1.c 
b/gcc/testsuite/gcc.dg/c2x-constexpr-1.c
index f7f64e2d300..d43d95ddd7c 100644
--- a/gcc/testsuite/gcc.dg/c2x-constexpr-1.c
+++ b/gcc/testsuite/gcc.dg/c2x-constexpr-1.c
@@ -180,10 +180,10 @@ f0 ()
 {
   constexpr int fv0 = 3;
   static_assert (fv0 == 3);
-  auto constexpr int fv1 = 4;
+  auto constexpr fv1 = 4;
   static_assert (fv1 == 4);
   register constexpr float fv2 = 1.0;
-  constexpr auto int fv3 = 123;
+  constexpr auto fv3 = 123;
   static_assert (fv3 == 123);
   constexpr register void *fv4 = (void *) 0;
   const int *fv5 = &(constexpr int) { 234 };
diff --git a/gcc/testsuite/gcc.dg/c2x-constexpr-3.c 
b/gcc/testsuite/gcc.dg/c2x-constexpr-3.c
index 16e56db2835..29fedc03afd 100644
--- a/gcc/testsuite/gcc.dg/c2x-constexpr-3.c
+++ b/gcc/testsuite/gcc.dg/c2x-constexpr-3.c
@@ -225,4 +225,12 @@ f0 ()
   constexpr typeof (nullptr) not_npc = nullptr;
   int *ptr = 0;
   (void) (ptr == not_npc); /* { dg-error "invalid operands" } */
+  /* auto may only be used with another storage class specifier, such as
+ constexpr, if the type is inferred.  */
+  auto constexpr int a_c_t = 1; /* { dg-error "'auto' used with 'constexpr'" } 
*/
+  constexpr auto int c_a_t = 1; /* { dg-error "'auto' used with 'constexpr'" } 
*/
+  auto int constexpr a_t_c = 1; /* { dg-error "'constexpr' used with 'auto'" } 
*/
+  constexpr int auto c_t_a = 1; /* { dg-error "'auto' used with 'constexpr'" } 
*/
+  int auto constexpr t_a_c = 1; /* { dg-error "'constexpr' used with 'auto'" } 
*/
+  int constexpr auto t_c_a = 1; /* { dg-error "'auto' used with 'constexpr'" } 
*/
 }

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Fortran: handle zero-sized arrays in ctors with typespec [PR108010]

2022-12-07 Thread Steve Kargl via Gcc-patches

On Wed, Dec 07, 2022 at 09:57:20PM +0100, Harald Anlauf via Fortran wrote:
> Dear all,
> 
> we need to be careful about zero-sized arrays in arithmetic
> reductions (unary & binary), as we otherwise may hit a NULL
> pointer dereference on valid code.
> 
> The actual fix is straightforward, see attached patch.
> 
> Regtested on x86_64-pc-linux-gnu.  OK for mainline?
> 

Yes.  Thanks for the patch.

-- 
Steve

[PATCH] c++: modules and std::source_location::current() def arg [PR100881]

2022-12-07 Thread Patrick Palka via Gcc-patches

We currently declare __builtin_source_location with a const void* return
type instead of the true type (const std::source_location::__impl*), and
later when folding this builtin we just obtain the true type via name
lookup.

But the below testcase demonstrates this name lookup approach seems to
interact poorly with modules, since we may import an entity that uses
std::source_location::current() in a default argument (or DMI) without
also importing <source_location>, and thus the name lookup will fail
when folding the builtin at the call site unless we also import
<source_location>.

This patch fixes this by instead initially declaring __builtin_source_location
with an auto return type and updating it appropriately upon its first use.
Thus when folding calls to this builtin we can fish out the true return
type through the type of the CALL_EXPR and avoid needing to do name
lookup.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
reasonable?

PR c++/100881

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_builtin_function_call): Adjust calls
to fold_builtin_source_location.
* cp-gimplify.cc (cp_gimplify_expr): Likewise.
(cp_fold): Likewise.
(get_source_location_impl_type): Remove location_t parameter and
adjust accordingly.  No longer static.
(fold_builtin_source_location): Take a CALL_EXPR tree instead of a
location and obtain the impl type from its return type.
* cp-tree.h (enum cp_tree_index): Remove CPTI_SOURCE_LOCATION_IMPL
enumerator.
(source_location_impl): Remove.
(fold_builtin_source_location): Adjust parameter type.
(get_source_location_impl_type): Declare.
* decl.cc (cxx_init_decl_processing): Declare
__builtin_source_location with auto return type instead of
const void*.
(require_deduced_type): Update the return type of
__builtin_source_location.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/srcloc3.C: Adjust expected note s/evaluating/using.
* g++.dg/cpp2a/srcloc4.C: Likewise.
* g++.dg/cpp2a/srcloc5.C: Likewise.
* g++.dg/cpp2a/srcloc6.C: Likewise.
* g++.dg/cpp2a/srcloc7.C: Likewise.
* g++.dg/cpp2a/srcloc8.C: Likewise.
* g++.dg/cpp2a/srcloc9.C: Likewise.
* g++.dg/cpp2a/srcloc10.C: Likewise.
* g++.dg/cpp2a/srcloc11.C: Likewise.
* g++.dg/cpp2a/srcloc12.C: Likewise.
* g++.dg/cpp2a/srcloc13.C: Likewise.
* g++.dg/modules/pr100881_a.C: New test.
* g++.dg/modules/pr100881_b.C: New test.
---
 gcc/cp/constexpr.cc   |  2 +-
 gcc/cp/cp-gimplify.cc | 58 +++
 gcc/cp/cp-tree.h  |  8 +---
 gcc/cp/decl.cc| 25 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc10.C |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc11.C |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc12.C |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc13.C |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc3.C  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc4.C  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc5.C  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc6.C  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc7.C  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc8.C  |  2 +-
 gcc/testsuite/g++.dg/cpp2a/srcloc9.C  |  2 +-
 gcc/testsuite/g++.dg/modules/pr100881_a.C | 28 +++
 gcc/testsuite/g++.dg/modules/pr100881_b.C |  8 
 17 files changed, 101 insertions(+), 50 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/pr100881_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/pr100881_b.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 0b43ae4ece3..6ff994fd599 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1492,7 +1492,7 @@ cxx_eval_builtin_function_call (const constexpr_ctx *ctx, 
tree t, tree fun,
   temp_override ovr (current_function_decl);
   if (ctx->call && ctx->call->fundef)
current_function_decl = ctx->call->fundef->decl;
-  return fold_builtin_source_location (EXPR_LOCATION (t));
+  return fold_builtin_source_location (t);
 }
 
   int strops = 0;
diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index 983f2a566a6..6ad8458ab28 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -722,7 +722,7 @@ cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p)
break;
  case CP_BUILT_IN_SOURCE_LOCATION:
*expr_p
- = fold_builtin_source_location (EXPR_LOCATION (*expr_p));
+ = fold_builtin_source_location (*expr_p);
break;
  case CP_BUILT_IN_IS_CORRESPONDING_MEMBER:
*expr_p
@@ -2850,7 +2850,7 @@ cp_fold (tree x)
  case CP_BUILT_IN_IS_CONSTANT_EVALUATED:
break;
  case CP_BUILT_IN_SOURCE_LOCATION:
-   x = fold_builtin_source_location (EXPR_LOCATION (x));
+

Re: [PATCH] RISC-V: Produce better code with complex constants [PR95632] [PR106602]

2022-12-07 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 07, 2022 at 05:55:17PM -0300, Raphael Moreira Zinsly wrote:
> Due to RISC-V limitations on operations with big constants combine
> is failing to match such operations and is not being able to
> produce optimal code as it keeps splitting them. By pretending we
> can do those operations we can get more opportunities for
> simplification of surrounding instructions.
> 
> 2022-12-06 Raphael Moreira Zinsly 
>Jeff Law 

Just nits, not a proper review.
2 spaces after date and 2 spaces before <, rather than just 1.

> 
> gcc/Changelog:
>   PR target/95632
> PR target/106602
> * config/riscv/riscv.md: New pattern to simulate complex
> const_int loads.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/riscv/pr95632.c: New test.
> * gcc.target/riscv/pr106602.c: Likewise.

All lines in the ChangeLog should be tab indented, rather than just some of
them and others with 8 spaces.

> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1667,6 +1667,22 @@
> MAX_MACHINE_MODE, [3], TRUE);
>  })
>  
> +;; Pretend to have the ability to load complex const_int in order to get
> +;; better code generation around them.
> +(define_insn_and_split ""

define_insn_and_split patterns should preferably have a name, even if it
starts with *.  It makes dumps more readable, and you can refer to it
in the ChangeLog when it is added or changed etc.

> +  [(set (match_operand:GPR 0 "register_operand" "=r")
> +(match_operand:GPR 1 "splittable_const_int_operand" "i"))]
> +  "cse_not_expected"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
> +

Why the empty line?

> +{
> +  riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]),
> +   mode, TRUE);

You can just use <MODE>mode if there is only one iterator in the pattern.

Jakub

Re: [PATCH] RISC-V: Produce better code with complex constants [PR95632] [PR106602]

2022-12-07 Thread Jeff Law via Gcc-patches





On 12/7/22 13:55, Raphael Moreira Zinsly wrote:

> Due to RISC-V limitations on operations with big constants combine
> is failing to match such operations and is not being able to
> produce optimal code as it keeps splitting them. By pretending we
> can do those operations we can get more opportunities for
> simplification of surrounding instructions.
>
> 2022-12-06 Raphael Moreira Zinsly 
> Jeff Law 
>
> gcc/Changelog:
> PR target/95632
>  PR target/106602
>  * config/riscv/riscv.md: New pattern to simulate complex
>  const_int loads.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/pr95632.c: New test.
>  * gcc.target/riscv/pr106602.c: Likewise.

So to give a little background to others.

The core issue is that when we break down constants early, it can make 
it difficult for combine to reconstruct the constant and simplify code 
using the reconstructed constant -- you end up trying to do 4->3 or 
worse combination sequences which aren't supported by the combiner.


Usually this kind of scenario is handled with a "bridge" pattern.  Those 
are generally defined as patterns that exist solely for combine and may 
not correspond to any real instruction on the target.  "bridge" patterns 
are typically 2->1 or 3->1 combinations and are intermediate steps for 
4->N or even larger combination opportunities.  Obviously if the bridge 
doesn't allow subsequent simplifications, then the bridge pattern must 
generate correct code (either by generating suitable assembly or 
splitting later).


Raphael's patch introduces a bridge pattern that pretends we can load up 
splittable constants in a single insn.  We restrict the bridge pattern 
to be active from the point when CSE is no longer expected through the 
combiner up to the first splitter pass (where we'll break it down again 
if it's still in the IL).


So we get most of the benefit of splitting constants early (CSE, LICM, 
etc) while also getting the benefits of splitting late (combine 
simplifications).
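
To make the benefit concrete, here is a small C sketch built on the
pr95632.c test from Raphael's patch.  It only illustrates the kind of
source-level identity that becomes reachable once the constant is seen
as a single unit.  The names foo_folded and folded_matches are made up
for this example, and nothing here claims to be the exact RTL or
assembly that combine ends up producing.

```c
#include <stdint.h>

/* The function from gcc.target/riscv/pr95632.c, written with uint16_t
   in place of unsigned short.  */
uint16_t
foo (uint16_t crc)
{
  crc ^= 0x4002;
  crc >>= 1;
  crc |= 0x8000;
  return crc;
}

/* Hypothetical hand-folded equivalent.  Shifting the XOR mask right by
   one gives 0x2001, and bit 15 is always clear after the shift, so
   OR-ing in 0x8000 is the same as XOR-ing it in: 0x2001 ^ 0x8000 is
   0xa001.  */
uint16_t
foo_folded (uint16_t crc)
{
  return (uint16_t) ((crc >> 1) ^ 0xa001);
}

/* Return 1 if both versions agree on every 16-bit input, else 0.  */
int
folded_matches (void)
{
  for (uint32_t v = 0; v <= 0xffff; v++)
    if (foo ((uint16_t) v) != foo_folded ((uint16_t) v))
      return 0;
  return 1;
}
```

The folded form needs one shift plus one XOR with a constant that does
not fit a 12-bit immediate, which is the sort of constant the bridge
pattern keeps whole until the split pass.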


Given I was working with Raphael on the patch, it's probably best for 
someone else to do the review rather than me approving it :-)


Jeff

[PATCH 1/3] btf: add 'extern' linkage for variables [PR106773]

2022-12-07 Thread David Faust via Gcc-patches

Add support for the 'extern' linkage value for BTF_KIND_VAR records,
which is used for variables declared as extern in the source file.
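
As a rough illustration of why the btf_asm_varent change is needed, the
C sketch below mirrors the three linkage values from this patch.  The
enum and the two helper functions are hypothetical stand-ins written
for this example; they are not the code in btfout.cc.

```c
#include <stdint.h>

/* Same numeric values as the BTF_LINKAGE_* defines added by the patch.  */
enum btf_linkage_value
{
  LINKAGE_STATIC = 0,
  LINKAGE_GLOBAL = 1,
  LINKAGE_EXTERN = 2
};

/* Hypothetical model of the old emission: any nonzero visibility was
   written out as 1, so 'extern' could not be represented.  */
uint32_t
old_emitted_linkage (uint32_t visibility)
{
  return visibility ? 1 : 0;
}

/* Hypothetical model of the new emission: the stored value is written
   out unchanged, so 2 survives.  */
uint32_t
new_emitted_linkage (uint32_t visibility)
{
  return visibility;
}
```

With the old form, a variable marked LINKAGE_EXTERN would have been
emitted as 1 (global); with the new form it is emitted as 2.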

PR target/106773

gcc/

* btfout.cc (BTF_LINKAGE_STATIC): New define.
(BTF_LINKAGE_GLOBAL): Likewise.
(BTF_LINKAGE_EXTERN): Likewise.
(btf_collect_datasec): Mark extern variables as such.
(btf_asm_varent): Accommodate 'extern' linkage.

gcc/testsuite/

* gcc.dg/debug/btf/btf-variables-4.c: New test.

include/

* btf.h (struct btf_var): Update comment to note 'extern' linkage.
---
 gcc/btfout.cc |  9 ++-
 .../gcc.dg/debug/btf/btf-variables-4.c| 24 +++
 include/btf.h |  2 +-
 3 files changed, 33 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index aef9fd70a28..a1c6266a7db 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -66,6 +66,10 @@ static char btf_info_section_label[MAX_BTF_LABEL_BYTES];
 
 #define BTF_INVALID_TYPEID 0x
 
+#define BTF_LINKAGE_STATIC 0
+#define BTF_LINKAGE_GLOBAL 1
+#define BTF_LINKAGE_EXTERN 2
+
 /* Mapping of CTF variables to the IDs they will be assigned when they are
converted to BTF_KIND_VAR type records. Strictly accounts for the index
from the start of the variable type entries, does not include the number
@@ -314,6 +318,9 @@ btf_collect_datasec (ctf_container_ref ctfc)
continue;
 
   const char *section_name = node->get_section ();
+  /* Mark extern variables.  */
+  if (DECL_EXTERNAL (node->decl))
+   dvd->dvd_visibility = BTF_LINKAGE_EXTERN;
 
   if (section_name == NULL)
{
@@ -676,7 +683,7 @@ btf_asm_varent (ctf_dvdef_ref var)
   dw2_asm_output_data (4, var->dvd_name_offset, "btv_name");
   dw2_asm_output_data (4, BTF_TYPE_INFO (BTF_KIND_VAR, 0, 0), "btv_info");
   dw2_asm_output_data (4, get_btf_id (var->dvd_type), "btv_type");
-  dw2_asm_output_data (4, (var->dvd_visibility ? 1 : 0), "btv_linkage");
+  dw2_asm_output_data (4, var->dvd_visibility, "btv_linkage");
 }
 
 /* Asm'out a member description following a BTF_KIND_STRUCT or
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c b/gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c
new file mode 100644
index 000..d77600bae1c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c
@@ -0,0 +1,24 @@
+/* Test BTF generation for extern variables.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* Expect 4 variables.  */
+/* { dg-final { scan-assembler-times "\[\t \]0xe00\[\t \]+\[^\n\]*btv_info" 4 } } */
+
+/* 2 extern, 1 global, 1 static.  */
+/* { dg-final { scan-assembler-times "\[\t \]0\[\t \]+\[^\n\]*btv_linkage" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x1\[\t \]+\[^\n\]*btv_linkage" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x2\[\t \]+\[^\n\]*btv_linkage" 2 } } */
+
+extern int a;
+extern const int b;
+int c;
+static const int d = 5;
+
+int foo (int x)
+{
+  c = a + b + x;
+
+  return c + d;
+}
diff --git a/include/btf.h b/include/btf.h
index eba67f9d599..9a757ce5bc9 100644
--- a/include/btf.h
+++ b/include/btf.h
@@ -182,7 +182,7 @@ struct btf_param
information about the variable.  */
 struct btf_var
 {
-  uint32_t linkage;/* Currently only 0=static or 1=global.  */
+  uint32_t linkage;/* 0=static, 1=global, 2=extern.  */
 };
 
 /* BTF_KIND_DATASEC is followed by VLEN struct btf_var_secinfo entries,
-- 
2.38.1

[PATCH 3/3] btf: correct generation for extern funcs [PR106773]

2022-12-07 Thread David Faust via Gcc-patches

The eBPF loader expects to find entries for functions declared as extern
in the corresponding BTF_KIND_DATASEC record, but we were not generating
these entries.

This patch adds support for the 'extern' linkage of function types in
BTF, and creates entries for them in BTF_KIND_DATASEC records as needed.

PR target/106773

gcc/

* btfout.cc (get_section_name): New function.
(btf_collect_datasec): Use it here. Process functions, marking them
'extern' and generating DATASEC entries for them as appropriate. Move
creation of BTF_KIND_FUNC records to here...
(btf_dtd_emit_preprocess_cb): ... from here.

gcc/testsuite/

* gcc.dg/debug/btf/btf-datasec-2.c: New test.
* gcc.dg/debug/btf/btf-function-6.c: New test.

include/

* btf.h (struct btf_var_secinfo): Update comments with notes about
extern functions.
---
 gcc/btfout.cc | 129 --
 .../gcc.dg/debug/btf/btf-datasec-2.c  |  28 
 .../gcc.dg/debug/btf/btf-function-6.c |  19 +++
 include/btf.h |   9 +-
 4 files changed, 139 insertions(+), 46 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-function-6.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 05f3a3f9b6e..d7ead377ec5 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -294,7 +294,35 @@ btf_datasec_push_entry (ctf_container_ref ctfc, const char *secname,
   ds.entries.safe_push (info);
 
   datasecs.safe_push (ds);
-  num_types_created++;
+}
+
+
+/* Return the section name, as of interest to btf_collect_datasec, for the
+   given symtab node. Note that this deliberately returns NULL for objects
+   which do not go in a section btf_collect_datasec cares about.  */
+static const char *
+get_section_name (symtab_node *node)
+{
+  const char *section_name = node->get_section ();
+
+  if (section_name == NULL)
+{
+  switch (categorize_decl_for_section (node->decl, 0))
+   {
+   case SECCAT_BSS:
+ section_name = ".bss";
+ break;
+   case SECCAT_DATA:
+ section_name = ".data";
+ break;
+   case SECCAT_RODATA:
+ section_name = ".rodata";
+ break;
+   default:;
+   }
+}
+
+  return section_name;
 }
 
 /* Construct all BTF_KIND_DATASEC records for CTFC. One such record is created
@@ -305,7 +333,60 @@ btf_datasec_push_entry (ctf_container_ref ctfc, const char *secname,
 static void
 btf_collect_datasec (ctf_container_ref ctfc)
 {
-  /* See cgraph.h struct symtab_node, which varpool_node extends.  */
+  cgraph_node *func;
+  FOR_EACH_FUNCTION (func)
+{
+  dw_die_ref die = lookup_decl_die (func->decl);
+  if (die == NULL)
+   continue;
+
+  ctf_dtdef_ref dtd = ctf_dtd_lookup (ctfc, die);
+  if (dtd == NULL)
+   continue;
+
+  /* Functions actually get two types: a BTF_KIND_FUNC_PROTO, and
+also a BTF_KIND_FUNC. But the CTF container only allocates one
+type per function, which matches closely with BTF_KIND_FUNC_PROTO.
+For each such function, also allocate a BTF_KIND_FUNC entry.
+These will be output later.  */
+  ctf_dtdef_ref func_dtd = ggc_cleared_alloc ();
+  func_dtd->dtd_data = dtd->dtd_data;
+  func_dtd->dtd_data.ctti_type = dtd->dtd_type;
+  func_dtd->linkage = dtd->linkage;
+  func_dtd->dtd_type = num_types_added + num_types_created;
+
+  /* Only the BTF_KIND_FUNC type actually references the name. The
+BTF_KIND_FUNC_PROTO is always anonymous.  */
+  dtd->dtd_data.ctti_name = 0;
+
+  vec_safe_push (funcs, func_dtd);
+  num_types_created++;
+
+  /* Mark any 'extern' funcs and add DATASEC entries for them.  */
+  if (DECL_EXTERNAL (func->decl))
+   {
+ func_dtd->linkage = BTF_LINKAGE_EXTERN;
+
+ const char *section_name = get_section_name (func);
+ /* Note: get_section_name () returns NULL for functions in text
+section. This is intentional, since we do not want to generate
+DATASEC entries for them.  */
+ if (section_name == NULL)
+   continue;
+
+ struct btf_var_secinfo info;
+
+ /* +1 for the sentinel type not in the types map.  */
+ info.type = func_dtd->dtd_type + 1;
+
+ /* Both zero at compile time.  */
+ info.size = 0;
+ info.offset = 0;
+
+ btf_datasec_push_entry (ctfc, section_name, info);
+   }
+}
+
   varpool_node *node;
   FOR_EACH_VARIABLE (node)
 {
@@ -317,28 +398,13 @@ btf_collect_datasec (ctf_container_ref ctfc)
   if (dvd == NULL)
continue;
 
-  const char *section_name = node->get_section ();
   /* Mark extern variables.  */
   if (DECL_EXTERNAL (node->decl))
dvd->dvd_visibility = BTF_LINKAGE_EXTERN;
 
+  const char *section_name = get_section_name (node);
   if

[PATCH 2/3] btf: fix 'extern const void' variables [PR106773]

2022-12-07 Thread David Faust via Gcc-patches

The eBPF loader expects to find BTF_KIND_VAR records for references to
extern const void symbols. We were mistakenly identifying these as
unsupported types, and as a result skipping emitting VAR records for
them.

In addition, the internal DWARF representation from which BTF is
produced does not generate 'const' modifier DIEs for the void type,
which meant in BTF the 'const' qualifier was dropped for 'extern const
void' variables. This patch also adds support for generating a const
void type in BTF to correct emission for these variables.

PR target/106773

gcc/

* btfout.cc (btf_collect_datasec): Correct size of void entries.
(btf_dvd_emit_preprocess_cb): Do not skip emitting variables which
refer to void types.
(btf_init_postprocess): Create 'const void' type record if needed and
adjust variables to refer to it as appropriate.

gcc/testsuite/

* gcc.dg/debug/btf/btf-pr106773.c: New test.
---
 gcc/btfout.cc | 44 +--
 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c | 25 +++
 2 files changed, 65 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index a1c6266a7db..05f3a3f9b6e 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -354,6 +354,8 @@ btf_collect_datasec (ctf_container_ref ctfc)
   tree size = DECL_SIZE_UNIT (node->decl);
   if (tree_fits_uhwi_p (size))
info.size = tree_to_uhwi (size);
+  else if (VOID_TYPE_P (TREE_TYPE (node->decl)))
+   info.size = 1;
 
   /* Offset is left as 0 at compile time, to be filled in by loaders such
 as libbpf.  */
@@ -439,7 +441,7 @@ btf_dvd_emit_preprocess_cb (ctf_dvdef_ref *slot, ctf_container_ref arg_ctfc)
   ctf_dvdef_ref var = (ctf_dvdef_ref) * slot;
 
   /* Do not add variables which refer to unsupported types.  */
-  if (btf_removed_type_p (var->dvd_type))
+  if (!voids.contains (var->dvd_type) && btf_removed_type_p (var->dvd_type))
 return 1;
 
   arg_ctfc->ctfc_vars_list[num_vars_added] = var;
@@ -1073,15 +1075,49 @@ btf_init_postprocess (void)
 {
   ctf_container_ref tu_ctfc = ctf_get_tu_ctfc ();
 
-  size_t i;
-  size_t num_ctf_types = tu_ctfc->ctfc_types->elements ();
-
   holes.create (0);
   voids.create (0);
 
   num_types_added = 0;
   num_types_created = 0;
 
+  /* Workaround for 'const void' variables. These variables are sometimes used
+ in eBPF programs to address kernel symbols. DWARF does not generate const
+ qualifier on void type, so we would incorrectly emit these variables
+ without the const qualifier.
+ Unfortunately we need the TREE node to know it was const, and we need
+ to create the const modifier type (if needed) now, before making the types
+ list. So we can't avoid iterating with FOR_EACH_VARIABLE here, and then
+ again when creating the DATASEC entries.  */
+  ctf_id_t constvoid_id = CTF_NULL_TYPEID;
+  varpool_node *var;
+  FOR_EACH_VARIABLE (var)
+{
+  if (!var->decl)
+   continue;
+
+  tree type = TREE_TYPE (var->decl);
+  if (type && VOID_TYPE_P (type) && TYPE_READONLY (type))
+   {
+ dw_die_ref die = lookup_decl_die (var->decl);
+ if (die == NULL)
+   continue;
+
+ ctf_dvdef_ref dvd = ctf_dvd_lookup (tu_ctfc, die);
+ if (dvd == NULL)
+   continue;
+
+ /* Create the 'const' modifier type for void.  */
+ if (constvoid_id == CTF_NULL_TYPEID)
+   constvoid_id = ctf_add_reftype (tu_ctfc, CTF_ADD_ROOT,
+   dvd->dvd_type, CTF_K_CONST, NULL);
+ dvd->dvd_type = constvoid_id;
+   }
+}
+
+  size_t i;
+  size_t num_ctf_types = tu_ctfc->ctfc_types->elements ();
+
   if (num_ctf_types)
 {
   init_btf_id_map (num_ctf_types + 1);
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c b/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
new file mode 100644
index 000..f90fa773a4b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
@@ -0,0 +1,25 @@
+/* Test BTF generation for extern const void symbols.
+   BTF_KIND_VAR records should be emitted for such symbols if they are used,
+   as well as a corresponding entry in the appropriate DATASEC record.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* Expect 1 variable record only for foo, with 'extern' (2) linkage.  */
+/* { dg-final { scan-assembler-times "\[\t \]0xe00\[\t \]+\[^\n\]*btv_info" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x2\[\t \]+\[^\n\]*btv_linkage" 1 } } */
+
+/* { dg-final { scan-assembler-times "ascii \"foo.0\"\[\t \]+\[^\n\]*btf_string" 1 } } */
+
+/* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bts_offset" 1 } } */
+/* { dg-final { scan-assembler-times "1\[\t \]+\[^\n\]*bts_size" 1 } } */
+
+extern const void foo __attribute__((weak)) __attribute__((section (".ksyms")));
+extern

[PATCH 0/3] btf: fix BTF for extern items [PR106773]

2022-12-07 Thread David Faust via Gcc-patches

Hi,

This series fixes the issues reported in target/PR106773. I decided to
split it into three commits, as there are ultimately three distinct
issues and fixes. See each patch for details.

Tested on bpf-unknown-none and x86_64-linux-gnu, no known regressions.

OK to push?
Thanks.

David Faust (3):
  btf: add 'extern' linkage for variables [PR106773]
  btf: fix 'extern const void' variables [PR106773]
  btf: correct generation for extern funcs [PR106773]

 gcc/btfout.cc | 182 +-
 .../gcc.dg/debug/btf/btf-datasec-2.c  |  28 +++
 .../gcc.dg/debug/btf/btf-function-6.c |  19 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c |  25 +++
 .../gcc.dg/debug/btf/btf-variables-4.c|  24 +++
 include/btf.h |  11 +-
 6 files changed, 237 insertions(+), 52 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-datasec-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-function-6.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-variables-4.c

-- 
2.38.1

[PATCH] Fortran: handle zero-sized arrays in ctors with typespec [PR108010]

2022-12-07 Thread Harald Anlauf via Gcc-patches

Dear all,

we need to be careful about zero-sized arrays in arithmetic
reductions (unary & binary), as we otherwise may hit a NULL
pointer dereference on valid code.
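
For readers unfamiliar with the failure mode, the following C sketch
shows the shape of the guard.  The struct and function names are
invented for illustration and are much simpler than gfortran's
gfc_expr and gfc_constructor; the point is only that an empty
constructor has no first element to take the type from, so the type
must come from the operand instead.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for an array expression and its
   constructor list.  */
struct elem
{
  int type;
};

struct ctor_node
{
  struct elem *expr;
  struct ctor_node *next;
};

struct array_expr
{
  int type;                /* Type recorded on the array itself.  */
  struct ctor_node *head;  /* NULL for a zero-sized array.  */
};

/* Pick the result type: use the first element when there is one,
   otherwise fall back to the operand's own type instead of
   dereferencing a NULL first element.  */
int
result_type (const struct array_expr *op)
{
  const struct ctor_node *c = op->head;
  if (c == NULL)
    return op->type;
  return c->expr->type;
}
```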

The actual fix is straightforward, see attached patch.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald
From 02a8b7308d04dc84fb13b077bd3b2fe01e15c92e Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 7 Dec 2022 21:50:23 +0100
Subject: [PATCH] Fortran: handle zero-sized arrays in ctors with typespec
 [PR108010]

gcc/fortran/ChangeLog:

	PR fortran/108010
	* arith.cc (reduce_unary): Handle zero-sized arrays.
	(reduce_binary_aa): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/108010
	* gfortran.dg/pr108010.f90: New test.
---
 gcc/fortran/arith.cc   | 24 ++--
 gcc/testsuite/gfortran.dg/pr108010.f90 | 54 ++
 2 files changed, 74 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr108010.f90

diff --git a/gcc/fortran/arith.cc b/gcc/fortran/arith.cc
index c4ab75b401c..c0d12cfad9d 100644
--- a/gcc/fortran/arith.cc
+++ b/gcc/fortran/arith.cc
@@ -1342,8 +1342,16 @@ reduce_unary (arith (*eval) (gfc_expr *, gfc_expr **), gfc_expr *op,
   else
 {
   gfc_constructor *c = gfc_constructor_first (head);
-  r = gfc_get_array_expr (c->expr->ts.type, c->expr->ts.kind,
-			  >where);
+  if (c == NULL)
+	{
+	  /* Handle zero-sized arrays.  */
+	  r = gfc_get_array_expr (op->ts.type, op->ts.kind, >where);
+	}
+  else
+	{
+	  r = gfc_get_array_expr (c->expr->ts.type, c->expr->ts.kind,
+  >where);
+	}
   r->shape = gfc_copy_shape (op->shape, op->rank);
   r->rank = op->rank;
   r->value.constructor = head;
@@ -1501,8 +1509,16 @@ reduce_binary_aa (arith (*eval) (gfc_expr *, gfc_expr *, gfc_expr **),
   else
 {
   gfc_constructor *c = gfc_constructor_first (head);
-  r = gfc_get_array_expr (c->expr->ts.type, c->expr->ts.kind,
-			  >where);
+  if (c == NULL)
+	{
+	  /* Handle zero-sized arrays.  */
+	  r = gfc_get_array_expr (op1->ts.type, op1->ts.kind, >where);
+	}
+  else
+	{
+	  r = gfc_get_array_expr (c->expr->ts.type, c->expr->ts.kind,
+  >where);
+	}
   r->shape = gfc_copy_shape (op1->shape, op1->rank);
   r->rank = op1->rank;
   r->value.constructor = head;
diff --git a/gcc/testsuite/gfortran.dg/pr108010.f90 b/gcc/testsuite/gfortran.dg/pr108010.f90
new file mode 100644
index 000..303b2b98220
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr108010.f90
@@ -0,0 +1,54 @@
+! { dg-do run }
+! PR fortran/108010 - ICE in reduce_unary, reduce_binary_aa
+! Contributed by G.Steinmetz
+
+program p
+  implicit none
+  print *,   + [integer :: [real ::]]
+  print *,   - [integer :: [real ::]]
+  print *, 1 + [integer :: [real ::]]
+  print *, 1 - [integer :: [real ::]]
+  print *, 2 * [integer :: [real ::]]
+  print *,   - [real :: [real ::], 2]
+  print *,   + [integer :: [real ::], 2]
+  print *,   - [integer :: [real ::], 2]
+  print *, 1 + [integer :: [real ::], 2]
+  print *, 1 - [integer :: [real ::], 2]
+  print *, 2 * [integer :: [real ::], 2]
+  print *, [integer :: [real ::]] + [integer :: [real ::]]
+  print *, [integer :: [real ::]] - [integer :: [real ::]]
+  print *, [integer :: [real ::]] * [integer :: [real ::]]
+  print *, [integer :: [real ::], 2] + [real :: [real ::], 3]
+  print *, [integer :: [real ::], 2] - [real :: [real ::], 3]
+  print *, [integer :: [real ::], 2] * [real :: [real ::], 3]
+
+  ! Validate type of resulting arrays
+  if (.not. is_int ([integer :: [real ::]] )) stop 1
+  if (.not. is_int ([integer :: [real ::]] + [integer :: [real ::]])) stop 2
+  if (.not. is_real([real :: [integer ::]] )) stop 3
+  if (.not. is_real([real :: [integer ::]] + [real :: [integer ::]])) stop 4
+  if (.not. is_real([real :: [integer ::]] + [integer :: [real ::]])) stop 5
+  if (.not. is_real([integer :: [real ::]] + [real :: [integer ::]])) stop 6
+
+contains
+
+  logical function is_int (x)
+class(*) :: x(:)
+select type (x)
+type is (integer)
+   is_int = .true.
+class default
+   is_int = .false.
+end select
+  end function is_int
+
+  logical function is_real (x)
+class(*) :: x(:)
+select type (x)
+type is (real)
+   is_real = .true.
+class default
+   is_real = .false.
+end select
+  end function is_real
+end
--
2.35.3

[PATCH] RISC-V: Produce better code with complex constants [PR95632] [PR106602]

2022-12-07 Thread Raphael Moreira Zinsly

Due to RISC-V limitations on operations with large constants, combine
fails to match such operations and cannot produce optimal code because
it keeps splitting them.  By pretending we can do those operations, we
get more opportunities to simplify the surrounding instructions.
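
As an illustration, the C sketch below restates the pr106602.c test in
three equivalent ways.  It uses fixed-width types so that it does not
depend on the width of unsigned long; the names foo2_mask and
foo2_shifts are made up for this example.  The mask form needs a
constant that does not fit a 12-bit immediate, while the two-shift form
needs none, which is what the slli/srli expectations in the test are
looking for.  This is a source-level sketch, not a claim about the
exact RTL involved.

```c
#include <stdint.h>

/* foo2 from gcc.target/riscv/pr106602.c, with uint64_t standing in for
   the 64-bit unsigned long of rv64.  */
uint64_t
foo2 (uint64_t a)
{
  return (uint64_t) (uint32_t) a << 6;
}

/* Hypothetical mask form: shift first, then keep bits 6..37.  The mask
   0x3fffffffc0 is 0xffffffff shifted left by 6.  */
uint64_t
foo2_mask (uint64_t a)
{
  return (a << 6) & UINT64_C (0x3fffffffc0);
}

/* Hypothetical two-shift form: push the low 32 bits to the top, then
   bring them back down to bits 6..37 with a logical right shift.  */
uint64_t
foo2_shifts (uint64_t a)
{
  return (a << 32) >> 26;
}
```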

2022-12-06 Raphael Moreira Zinsly 
   Jeff Law 

gcc/ChangeLog:
PR target/95632
PR target/106602
* config/riscv/riscv.md: New pattern to simulate complex
const_int loads.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr95632.c: New test.
* gcc.target/riscv/pr106602.c: Likewise.
---
 gcc/config/riscv/riscv.md | 16 
 gcc/testsuite/gcc.target/riscv/pr106602.c | 14 ++
 gcc/testsuite/gcc.target/riscv/pr95632.c  | 15 +++
 3 files changed, 45 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr106602.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr95632.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index df57e2b0b4a..0a9b5ec22b0 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1667,6 +1667,22 @@
  MAX_MACHINE_MODE, [3], TRUE);
 })
 
+;; Pretend to have the ability to load complex const_int in order to get
+;; better code generation around them.
+(define_insn_and_split ""
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(match_operand:GPR 1 "splittable_const_int_operand" "i"))]
+  "cse_not_expected"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+
+{
+  riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]),
+ mode, TRUE);
+  DONE;
+})
+
 ;; 64-bit integer moves
 
 (define_expand "movdi"
diff --git a/gcc/testsuite/gcc.target/riscv/pr106602.c b/gcc/testsuite/gcc.target/riscv/pr106602.c
new file mode 100644
index 000..83b70877012
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr106602.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc" } */
+
+unsigned long
+foo2 (unsigned long a)
+{
+  return (unsigned long)(unsigned int) a << 6;
+}
+
+/* { dg-final { scan-assembler-times "slli\t" 1 } } */
+/* { dg-final { scan-assembler-times "srli\t" 1 } } */
+/* { dg-final { scan-assembler-not "\tli\t" } } */
+/* { dg-final { scan-assembler-not "addi\t" } } */
+/* { dg-final { scan-assembler-not "and\t" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/pr95632.c b/gcc/testsuite/gcc.target/riscv/pr95632.c
new file mode 100644
index 000..bd316ab1d7b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr95632.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv32imafc -mabi=ilp32f" } */
+
+unsigned short
+foo (unsigned short crc)
+{
+  crc ^= 0x4002;
+  crc >>= 1;
+  crc |= 0x8000;
+
+  return crc;
+}
+
+/* { dg-final { scan-assembler-times "srli\t" 1 } } */
+/* { dg-final { scan-assembler-not "slli\t" } } */
-- 
2.38.1

[PATCH] docs: Suggest options to improve ASAN stack traces

2022-12-07 Thread Marek Polacek via Gcc-patches

I got a complaint that while Clang docs suggest options that improve
the quality of the backtraces ASAN prints (cf.
), our docs
don't say anything to that effect.  This patch amends that with a new
paragraph.  (It deliberately doesn't mention -fno-omit-frame-pointer.)

gcc/ChangeLog:

* doc/invoke.texi (-fsanitize=address): Suggest options to improve
stack traces.
---
 gcc/doc/invoke.texi | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 726392409b6..2de14466dd3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16510,6 +16510,14 @@ The option cannot be combined with @option{-fsanitize=thread} or
 @option{-fsanitize=hwaddress}.  Note that the only target
 @option{-fsanitize=hwaddress} is currently supported on is AArch64.
 
+To get more accurate stack traces, it is possible to use options such as
+@option{-O} (which, for instance, prevents most function inlining),
+@option{-fno-optimize-sibling-calls} (which prevents optimizing sibling
+and tail recursive calls), or @option{-fno-ipa-icf} (which disables Identical
+Code Folding for functions and read-only variables).  Since multiple runs
+of the program may yield backtraces with different addresses due to ASLR,
+it may be desirable to turn off ASLR: @samp{setarch `uname -m` -R ./prog}.
+
 @item -fsanitize=kernel-address
 @opindex fsanitize=kernel-address
 Enable AddressSanitizer for Linux kernel.

base-commit: 3ad0f470c16d5528a5283060b007f8b419c33c92
-- 
2.38.1

[PATCH] c++: ICE with concepts TS multiple auto deduction [PR101886]

2022-12-07 Thread Patrick Palka via Gcc-patches

In extract_autos_r, we need to reset TYPE_CANONICAL for the template
type parameter after adjusting its index, otherwise we end up with a
comptypes ICE for the below testcase.  Note that such in-place type
adjustment isn't generally safe to do since the type could be the
TYPE_CANONICAL of another (unadjusted) type, but in this case the
canonical auto (of some level and 0 index) is the first auto (of that
level) that's created, and so any auto that we do end up adjusting can't
be the canonical one.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/101886

gcc/cp/ChangeLog:

* pt.cc (extract_autos_r): Reset TYPE_CANONICAL after
adjusting the template type parameter's index.  Simplify
by using TEMPLATE_TYPE_IDX.  Add some sanity checks.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/auto5.C: New test.
---
 gcc/cp/pt.cc  | 12 +---
 gcc/testsuite/g++.dg/concepts/auto5.C |  9 +
 2 files changed, 18 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/concepts/auto5.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 24ed718ffbb..d05a49b1c11 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -29164,18 +29164,24 @@ extract_autos_r (tree t, void *data)
 {
   /* All the autos were built with index 0; fix that up now.  */
   tree *p = hash.find_slot (t, INSERT);
-  unsigned idx;
+  int idx;
   if (*p)
/* If this is a repeated constrained-type-specifier, use the index we
   chose before.  */
-   idx = TEMPLATE_PARM_IDX (TEMPLATE_TYPE_PARM_INDEX (*p));
+   idx = TEMPLATE_TYPE_IDX (*p);
   else
{
  /* Otherwise this is new, so use the current count.  */
  *p = t;
  idx = hash.elements () - 1;
}
-  TEMPLATE_PARM_IDX (TEMPLATE_TYPE_PARM_INDEX (t)) = idx;
+  if (idx != TEMPLATE_TYPE_IDX (t))
+   {
+ gcc_checking_assert (TEMPLATE_TYPE_IDX (t) == 0);
+ gcc_checking_assert (TYPE_CANONICAL (t) != t);
+ TEMPLATE_TYPE_IDX (t) = idx;
+ TYPE_CANONICAL (t) = canonical_type_parameter (t);
+   }
 }
 
   /* Always keep walking.  */
diff --git a/gcc/testsuite/g++.dg/concepts/auto5.C b/gcc/testsuite/g++.dg/concepts/auto5.C
new file mode 100644
index 000..f1d653efd87
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/auto5.C
@@ -0,0 +1,9 @@
+// PR c++/101886
+// { dg-do compile { target c++17_only } }
+// { dg-options "-fconcepts-ts" }
+
+template struct A { };
+
+A a;
+A b1 = a;
+A b2 = a;
-- 
2.39.0.rc2

Re: [PATCH] libstdc++: Add error handler for

2022-12-07 Thread Jonathan Wakely via Gcc-patches

On Wed, 7 Dec 2022 at 17:58, François Dumont  wrote:
>
> Looks perfect to me, thanks.

OK thanks, it's pushed to trunk now.


>
> On 06/12/22 22:44, Jonathan Wakely wrote:
> > On Wed, 30 Nov 2022 at 18:00, François Dumont  wrote:
> >> On 30/11/22 14:07, Jonathan Wakely wrote:
> >>> On Wed, 30 Nov 2022 at 11:57, Jonathan Wakely  wrote:
> 
>  On Wed, 30 Nov 2022 at 11:54, Jonathan Wakely  wrote:
> >
> > On Wed, 30 Nov 2022 at 06:04, François Dumont via Libstdc++ 
> >  wrote:
> >> Good catch, then we also need this patch.
> > Is it worth printing an error? If we can't show the backtrace because 
> > of an error, we can just print nothing there.
> >> No strong opinion on that but if we do not print anything the output
> >> will be:
> >>
> >> Backtrace:
> >>
> >> Error: ...
> >>
> >> I just considered that it did not cost much to report the issue to the
> >> user that defined _GLIBCXX_DEBUG_BACKTRACE and so is expecting a backtrace.
> >>
> >> Maybe printing "Backtrace:\n" could be done in the normal callback
> >> leaving the user with the feeling that _GLIBCXX_DEBUG_BACKTRACE does not
> >> work.
> > OK, how's this?
> >
> > Tested x86_64-linux.
>
>

[committed] testsuite: Add test for C90 auto with implicit int

2022-12-07 Thread Joseph Myers

Add a test for the case of auto with implicit int in C90 mode, which
is incompatible with C2x semantics (I missed adding such a test when
implementing C2x auto).

Tested for x86_64-pc-linux-gnu.

* gcc.dg/c90-auto-1.c: New test.

diff --git a/gcc/testsuite/gcc.dg/c90-auto-1.c b/gcc/testsuite/gcc.dg/c90-auto-1.c
new file mode 100644
index 000..f00f767c50a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c90-auto-1.c
@@ -0,0 +1,12 @@
+/* Test auto with implicit int for C90.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c90 -pedantic-errors" } */
+
+void
+f (void)
+{
+  /* This should have type int following C90 rules, whereas in C2x it
+ would have type double.  */
+  auto x = 1.5;
+  int *p = 
+}

-- 
Joseph S. Myers
jos...@codesourcery.com

[committed] preprocessor: Enable __VA_OPT__ for C2x

2022-12-07 Thread Joseph Myers

C2x supports __VA_OPT__, so adjust libcpp not to pedwarn for uses of
it (or of not passing any variable arguments to a variable-arguments
macro) in standard C2x mode.

I didn't try to duplicate existing tests for the details of the
feature, just verified -pedantic-errors handling is as expected.  And
there's a reasonable argument (bug 98859) that __VA_OPT__ shouldn't be
diagnosed in older standard modes at all (as opposed to not passing
any variable arguments to a variable-arguments macro, for which older
versions of the C standard require a diagnostic as a constraint
violation); that argument applies to C as much as to C++, but I
haven't made any changes in that regard.
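
For reference, here is a small C sketch of the macro from the new
c2x-va-opt-1.c test in actual use.  It assumes a compiler that accepts
__VA_OPT__ in the language mode being used; the functions one and two
and the two wrapper functions are invented for this example.

```c
/* The CALL macro from gcc.dg/cpp/c2x-va-opt-1.c: the comma is emitted
   only when variable arguments are present.  */
#define CALL(F, ...) F (7 __VA_OPT__(,) __VA_ARGS__)

/* Hypothetical callees taking one and two arguments.  */
int
one (int x)
{
  return x;
}

int
two (int x, int y)
{
  return x + y;
}

/* Expands to one (7): no variable arguments, so no comma.  */
int
call_without_varargs (void)
{
  return CALL (one);
}

/* Expands to two (7, 1): the comma comes from __VA_OPT__.  */
int
call_with_varargs (void)
{
  return CALL (two, 1);
}
```

Without __VA_OPT__, the first call would either leave a stray comma or
need the GNU ", ## __VA_ARGS__" extension.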

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

libcpp/
* init.cc (lang_defaults): Enable va_opt for STDC2X.
* lex.cc (maybe_va_opt_error): Adjust diagnostic message for C.
* macro.cc (_cpp_arguments_ok): Update comment.

gcc/testsuite/
* gcc.dg/cpp/c11-vararg-1.c, gcc.dg/cpp/c2x-va-opt-1.c: New tests.

diff --git a/gcc/testsuite/gcc.dg/cpp/c11-vararg-1.c b/gcc/testsuite/gcc.dg/cpp/c11-vararg-1.c
new file mode 100644
index 000..6b1bc38bb2c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/c11-vararg-1.c
@@ -0,0 +1,9 @@
+/* Test error in C11 for no arguments passed for variable arguments to a
+   macro.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+#define M(X, ...) X
+
+M (x); /* { dg-error "requires at least one argument" } */
+M (x, y);
diff --git a/gcc/testsuite/gcc.dg/cpp/c2x-va-opt-1.c b/gcc/testsuite/gcc.dg/cpp/c2x-va-opt-1.c
new file mode 100644
index 000..bd438f74571
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/c2x-va-opt-1.c
@@ -0,0 +1,11 @@
+/* Test __VA_OPT__ and no "..." arguments in a call to a variable-arguments
+   macro accepted for C2X.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+#define CALL(F, ...) F (7 __VA_OPT__(,) __VA_ARGS__)
+#define M(X, ...) X
+
+CALL (a);
+CALL (b, 1);
+M (x);
diff --git a/libcpp/init.cc b/libcpp/init.cc
index 5f34e3515d2..ea683f0cfaf 100644
--- a/libcpp/init.cc
+++ b/libcpp/init.cc
@@ -114,7 +114,7 @@ static const struct lang_flags lang_defaults[] =
   /* STDC99   */  { 1,  0,  1,  1,  0,  0,1,  1,   0,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0,  0,  0,0 },
   /* STDC11   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0,  0,  0,0 },
   /* STDC17   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0,  0,  0,0 },
-  /* STDC2X   */  { 1,  0,  1,  1,  1,  1,1,  1,   1,   0,   0,1, 1, 0,   1,  0,   1, 1,   0,   1,  1,  0,1 },
+  /* STDC2X   */  { 1,  0,  1,  1,  1,  1,1,  1,   1,   0,   0,1, 1, 0,   1,  1,   1, 1,   0,   1,  1,  0,1 },
   /* GNUCXX   */  { 0,  1,  1,  1,  0,  1,0,  1,   0,   0,   0,0, 0, 0,   0,  1,   1, 0,   0,   0,  0,  0,1 },
   /* CXX98*/  { 0,  1,  0,  1,  0,  1,1,  1,   0,   0,   0,0, 0, 1,   0,  0,   1, 0,   0,   0,  0,  0,1 },
   /* GNUCXX11 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,0, 0, 0,   0,  1,   1, 0,   0,   0,  0,  0,1 },
diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index b1107920c94..9a21a3e9ecc 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -2135,8 +2135,14 @@ maybe_va_opt_error (cpp_reader *pfile)
   /* __VA_OPT__ should not be accepted at all, but allow it in
 system headers.  */
   if (!_cpp_in_system_header (pfile))
-   cpp_error (pfile, CPP_DL_PEDWARN,
-  "__VA_OPT__ is not available until C++20");
+   {
+ if (CPP_OPTION (pfile, cplusplus))
+   cpp_error (pfile, CPP_DL_PEDWARN,
+  "__VA_OPT__ is not available until C++20");
+ else
+   cpp_error (pfile, CPP_DL_PEDWARN,
+  "__VA_OPT__ is not available until C2X");
+   }
 }
   else if (!pfile->state.va_args_ok)
 {
diff --git a/libcpp/macro.cc b/libcpp/macro.cc
index 7d5a0d0fd2e..452e14a1e66 100644
--- a/libcpp/macro.cc
+++ b/libcpp/macro.cc
@@ -1093,7 +1093,7 @@ _cpp_arguments_ok (cpp_reader *pfile, cpp_macro *macro, const cpp_hashnode *node
 
   if (argc < macro->paramc)
 {
-  /* In C++20 (here the va_opt flag is used), and also as a GNU
+  /* In C++20 and C2X (here the va_opt flag is used), and also as a GNU
 extension, variadic arguments are allowed to not appear in
 the invocation at all.
 e.g. #define debug(format, args...) something

-- 
Joseph S. Myers
jos...@codesourcery.com
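
For readers following along, here is a minimal sketch of what the patch above enables in C2X mode: use of __VA_OPT__, and leaving out the variadic argument entirely. The macro and function names (HAS_ARGS, FORMAT_INTO, format_plain, format_number) are illustrative assumptions, not part of the patch, and a compiler or language mode without __VA_OPT__ support may warn about or reject this.

```c
#include <stdio.h>

/* Illustrative macro: expands to (0 + 1) when at least one variadic
   token is present, and to (0) otherwise.  */
#define HAS_ARGS(...) (0 __VA_OPT__(+ 1))

/* The comma before __VA_ARGS__ is emitted only when variadic
   arguments exist, so a call with just a format string stays
   well formed.  */
#define FORMAT_INTO(buf, size, fmt, ...) \
  snprintf (buf, size, fmt __VA_OPT__(,) __VA_ARGS__)

/* Hypothetical helper: the variadic argument is omitted entirely.  */
static int
format_plain (char *buf, size_t size)
{
  return FORMAT_INTO (buf, size, "hello");
}

/* Hypothetical helper: one variadic argument is supplied.  */
static int
format_number (char *buf, size_t size, int value)
{
  return FORMAT_INTO (buf, size, "value=%d", value);
}
```

The second macro is the usual motivation for __VA_OPT__: without it, the no-argument call would leave a trailing comma inside the snprintf call.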

[PATCH 2/2] OpenMP: Duplicate checking for map clauses in Fortran (PR107214)

2022-12-07 Thread Julian Brown

> Hi Julian,
> 
> I had a first quick look at this patch; I should have a closer look
> later. However, I stumbled over the following:
> 
> On 20.10.22 18:14, Julian Brown wrote:
> > typedef struct gfc_symbol
> > {
> > ...
> >struct gfc_symbol *old_symbol;
> >
> >unsigned mark:1, comp_mark:1, data_mark:1, dev_mark:1,
> > gen_mark:1; unsigned reduc_mark:1, gfc_new:1;
> >
> >struct gfc_symbol *tlink;
> >
> >unsigned equiv_built:1;
> >...  
> I know that this was the case before, but can you move the mark:1 etc.
> after 'tlink'? In that case all bitfields are grouped together. If I
> have not miscounted, we currently have 7 bits before and 9 bits after
> 'tlink', and grouping them together reduces pointless padding.
> 
> * * *
> > +  else if (n->sym->mark)
> > + gfc_error ("Symbol %qs present on both data and map clauses "
> > +"at %L", n->sym->name, &n->where);
> 
> I wonder whether that also rejects the following – which seems to be
> valid. The 'map' goes to 'target' and the 'firstprivate' to
> 'parallel', cf. OpenMP 5.2, "17.2 Clauses on Combined and Composite
> Constructs", [340:3-4 & 12-14]. (BTW: While some fixes went into 5.1
> regarding this section, similar wording is already in 5.0.)
> 
> (Testing showed: it gives an ICE without the patch and an error with.)

...and this patch avoids the error for combined directives, and
reorders the gfc_symbol bitfields.
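
To illustrate the padding point raised above, here is a small sketch comparing a layout where a pointer separates two runs of bitfields with one where all bitfields are adjacent. The struct names, the use of void * for the pointers, and the other_bits member (a stand-in for the remaining flags) are assumptions for illustration; this is not the real gfc_symbol.

```c
#include <stddef.h>

/* Bitfield runs split by a pointer, in the style of the quoted layout.  */
struct split_flags
{
  void *old_symbol;
  unsigned mark:1, comp_mark:1, data_mark:1, dev_mark:1, gen_mark:1;
  unsigned reduc_mark:1, gfc_new:1;
  void *tlink;
  unsigned equiv_built:1;
  unsigned other_bits:8;	/* Stand-in for the remaining flag bits.  */
};

/* Same members, with every bitfield placed after the pointers.  */
struct grouped_flags
{
  void *old_symbol;
  void *tlink;
  unsigned mark:1, comp_mark:1, data_mark:1, dev_mark:1, gen_mark:1;
  unsigned reduc_mark:1, gfc_new:1;
  unsigned equiv_built:1;
  unsigned other_bits:8;
};

/* Number of bytes saved by grouping on the current ABI (may be 0).  */
static size_t
bytes_saved_by_grouping (void)
{
  return sizeof (struct split_flags) - sizeof (struct grouped_flags);
}
```

On a typical LP64 ABI the split layout pads each bitfield run up to pointer alignment, so the grouped variant should come out smaller, but the exact sizes are ABI-dependent.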

--

This patch adds duplicate checking for OpenMP "map" clauses, taking some
cues from the implementation for C in c-typeck.cc:c_finish_omp_clauses
(and similar for C++).

In addition to the existing use of the "mark" and "comp_mark" bitfields
in the gfc_symbol structure, the patch adds several new bits handling
duplicate checking within various categories of clause types.  If "mark"
is being used for map clauses, we need to use different bits for other
clauses for cases where "map" and some other clause can refer to the
same symbol (e.g. "map(n) shared(n)").

This version of the patch avoids flagging variables that are listed on
both map and firstprivate clauses when they are on a combined directive,
as they get moved to separate nested directives later (see previous
patch in series).
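
The idea of one mark bit per clause category can be sketched roughly as follows. This is a simplified model under stated assumptions, not the logic in openmp.cc: the type and function names are hypothetical, only three clause kinds are modelled, and the rules are reduced to "same category twice is an error", "map plus shared is accepted", and "map plus firstprivate is accepted only on a combined directive".

```c
#include <stddef.h>

enum clause_kind { CLAUSE_MAP, CLAUSE_FIRSTPRIVATE, CLAUSE_SHARED };

/* Hypothetical symbol with one mark bit per clause category.  */
struct symbol
{
  const char *name;
  unsigned map_mark:1;
  unsigned firstprivate_mark:1;
  unsigned shared_mark:1;
};

struct clause_item
{
  enum clause_kind kind;
  struct symbol *sym;
};

/* Return the number of conflicts found in ITEMS.  */
static int
count_clause_conflicts (struct clause_item *items, size_t n,
			int combined_directive)
{
  int errors = 0;

  /* Clear all marks first, since symbols are shared between clauses.  */
  for (size_t i = 0; i < n; i++)
    {
      items[i].sym->map_mark = 0;
      items[i].sym->firstprivate_mark = 0;
      items[i].sym->shared_mark = 0;
    }

  for (size_t i = 0; i < n; i++)
    {
      struct symbol *sym = items[i].sym;
      switch (items[i].kind)
	{
	case CLAUSE_MAP:
	  if (sym->map_mark)
	    errors++;
	  else if (sym->firstprivate_mark && !combined_directive)
	    errors++;
	  sym->map_mark = 1;
	  break;
	case CLAUSE_FIRSTPRIVATE:
	  if (sym->firstprivate_mark || sym->shared_mark)
	    errors++;
	  else if (sym->map_mark && !combined_directive)
	    errors++;
	  sym->firstprivate_mark = 1;
	  break;
	case CLAUSE_SHARED:
	  if (sym->shared_mark || sym->firstprivate_mark)
	    errors++;
	  sym->shared_mark = 1;
	  break;
	}
    }
  return errors;
}
```

With a single shared "mark" bit, map(n) shared(n) would look like a duplicate; keeping the categories in separate bits is what lets the two be told apart.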

Tested with offloading to NVPTX alongside previous patch (and
dependencies).  OK?

2022-12-06  Julian Brown  

gcc/fortran/
PR fortran/107214
* gfortran.h (gfc_symbol): Add data_mark, dev_mark, gen_mark and
reduc_mark bitfields.
* openmp.cc (resolve_omp_clauses): Use above bitfields to improve
duplicate clause detection.

gcc/testsuite/
PR fortran/107214
* gfortran.dg/gomp/pr107214.f90: New test.
From fa6d1e273449aff61833064027fed3787c13121f Mon Sep 17 00:00:00 2001
From: Julian Brown 
Date: Tue, 6 Dec 2022 23:10:58 +
Subject: [PATCH 2/2] OpenMP: Duplicate checking for map clauses in Fortran
 (PR107214)

---
 gcc/fortran/gfortran.h  | 32 ++---
 gcc/fortran/openmp.cc   | 73 +
 gcc/testsuite/gfortran.dg/gomp/pr107214.f90 |  7 ++
 3 files changed, 90 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/pr107214.f90

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index df90ed39bea7..47a7f5552385 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1871,16 +1871,6 @@ typedef struct gfc_symbol

   gfc_namelist *namelist, *namelist_tail;

-  /* Change management fields.  Symbols that might be modified by the
- current statement have the mark member nonzero.  Of these symbols,
- symbols with old_symbol equal to NULL are symbols created within
- the current statement.  Otherwise, old_symbol points to a copy of
-

[PATCH 1/2] OpenMP/Fortran: Combined directives with map/firstprivate of same symbol

2022-12-07 Thread Julian Brown

On Wed, 26 Oct 2022 12:39:39 +0200
Tobias Burnus  wrote:

> The ICE seems to be because gcc/fortran/trans-openmp.cc's
> gfc_split_omp_clauses mishandles this as the dump shows the following:
> 
>#pragma omp target firstprivate(a) map(tofrom:a)
>  #pragma omp parallel firstprivate(a)
> 
>   * * *
> 
> In contrast, for the C testcase:
> 
> void foo(int x) {
> #pragma omp target parallel for simd map(x) firstprivate(x)
> for (int k = 0; k < 1; ++k)
>x = 1;
> }
> 
> the dump is as follows, which seems to be sensible:
> 
>#pragma omp target map(tofrom:x)
>  #pragma omp parallel firstprivate(x)
>#pragma omp for nowait
>  #pragma omp simd

First, here's a patch to address this bit...

This patch fixes a case where a combined directive (e.g. "!$omp target
parallel ...") contains both a map and a firstprivate clause for the
same variable.  When the combined directive is split into two nested
directives, the outer "target" gets the "map" clause, and the inner
"parallel" gets the "firstprivate" clause, like so:

  !$omp target parallel map(x) firstprivate(x)

  -->

  !$omp target map(x)
!$omp parallel firstprivate(x)
  ...

When there is no map of the same variable, the firstprivate is distributed
to both directives, e.g. for 'y' in:

  !$omp target parallel map(x) firstprivate(y)

  -->

  !$omp target map(x) firstprivate(y)
!$omp parallel firstprivate(y)
  ...

This is not a recent regression, but appears to fix a long-standing ICE.
(The included testcase is based on one by Tobias.)
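
The distribution rule described above can be modelled in a few lines. This is only a sketch under simplifying assumptions: clause lists are plain arrays of names, and the types and functions (name_list, clause_set, split_target_parallel) are hypothetical rather than the gfc_omp_clauses structures used in trans-openmp.cc.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 8

struct name_list
{
  const char *names[MAX_NAMES];
  size_t count;
};

struct clause_set
{
  struct name_list map;
  struct name_list firstprivate;
};

static int
list_contains (const struct name_list *list, const char *name)
{
  for (size_t i = 0; i < list->count; i++)
    if (strcmp (list->names[i], name) == 0)
      return 1;
  return 0;
}

static void
list_add (struct name_list *list, const char *name)
{
  if (list->count < MAX_NAMES)
    list->names[list->count++] = name;
}

/* Split the clauses of a combined "target parallel" into the outer
   TARGET and inner PARALLEL sets.  */
static void
split_target_parallel (const struct clause_set *combined,
		       struct clause_set *target,
		       struct clause_set *parallel)
{
  memset (target, 0, sizeof *target);
  memset (parallel, 0, sizeof *parallel);

  /* "map" always goes to the outer target.  */
  for (size_t i = 0; i < combined->map.count; i++)
    list_add (&target->map, combined->map.names[i]);

  /* "firstprivate" always goes to the inner parallel, and to the
     target as well unless the same name is already mapped.  */
  for (size_t i = 0; i < combined->firstprivate.count; i++)
    {
      const char *name = combined->firstprivate.names[i];
      list_add (&parallel->firstprivate, name);
      if (!list_contains (&combined->map, name))
	list_add (&target->firstprivate, name);
    }
}
```

For map(x) firstprivate(x) firstprivate(y) this gives the target map(x) firstprivate(y) and the parallel firstprivate(x) firstprivate(y), matching the two cases shown above.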

Tested with offloading to NVPTX, alongside previously-posted patches
(in review or approved but waiting for other patches), i.e.:

  OpenMP/OpenACC: Rework clause expansion and nested struct handling
  OpenMP/OpenACC: Refine condition for when map clause expansion happens
  OpenMP: Pointers and member mappings

and the patch following.  OK?

2022-12-06  Julian Brown  

gcc/fortran/
* trans-openmp.cc (gfc_add_firstprivate_if_unmapped): New function.
(gfc_split_omp_clauses): Call above.

libgomp/
* testsuite/libgomp.fortran/combined-directive-splitting-1.f90: New
test.
From c66db363066913ae4939f2aa706427338b109d71 Mon Sep 17 00:00:00 2001
From: Julian Brown 
Date: Tue, 6 Dec 2022 12:18:33 +
Subject: [PATCH 1/2] OpenMP/Fortran: Combined directives with map/firstprivate
 of same symbol

---
 gcc/fortran/trans-openmp.cc   | 37 -
 .../combined-directive-splitting-1.f90| 41 +++
 2 files changed, 76 insertions(+), 2 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.fortran/combined-directive-splitting-1.f90

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index e39f7b1cb273..c61cd1bf55de 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -6121,6 +6121,39 @@ gfc_add_clause_implicitly (gfc_omp_clauses *clauses_out,
 }
 }

+/* Kind of opposite to above, add firstprivate to CLAUSES_OUT if it is mapped
+   in CLAUSES_IN's FIRSTPRIVATE list but not its MAP list.  */
+
+static void
+gfc_add_firstprivate_if_unmapped (gfc_omp_clauses *clauses_out,
+  gfc_omp_clauses *clauses_in)
+{
+  gfc_omp_namelist *n = clauses_in->lists[OMP_LIST_FIRSTPRIVATE];
+  gfc_omp_namelist **tail = NULL;
+
+  for (; n != NULL; n = n->next)
+{
+  gfc_omp_namelist *n2 = clauses_out->lists[OMP_LIST_MAP];
+  for (; n2 != NULL; n2 = n2->next)
+	if (n->sym == n2->sym)
+	  break;
+  if (n2 == NULL)
+	{
+

Re: AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

2022-12-07 Thread Pop, Sebastian via Gcc-patches

Hi Richard,


Please find attached a patch that follows your recommendations to generate the 
BTI_C instructions.

Please let me know if the patch can be further improved.

The patch passed bootstrap and regression tests on arm64-linux.


Thanks,

Sebastian


From: Richard Sandiford 
Sent: Wednesday, December 7, 2022 3:12:08 AM
To: Pop, Sebastian
Cc: gcc-patches@gcc.gnu.org; seb...@gmail.com; Kyrylo Tkachov
Subject: RE: [EXTERNAL]AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]




"Pop, Sebastian"  writes:
> Thanks Richard for your review and for pointing out the issue with BTI.
>
>
> The current patch removes the existing BTI instruction,
>
> and then adds the BTI hint when expanding the patchable_area pseudo.

Thanks.  I still think...

> The attached patch passed bootstrap and regression test on arm64-linux.
>
> Ok to commit to gcc trunk?
>
>
> Thank you,
> Sebastian
>
> 
> From: Richard Sandiford 
> Sent: Monday, December 5, 2022 5:34:40 AM
> To: Pop, Sebastian
> Cc: gcc-patches@gcc.gnu.org; seb...@gmail.com; Kyrylo Tkachov
> Subject: RE: [EXTERNAL]AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>
>
>
>
> "Pop, Sebastian"  writes:
>> Hi,
>>
>> Currently patchable area is at the wrong place on AArch64.  It is placed
>> immediately after function label, before .cfi_startproc.  This patch
>> adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
>> modifies aarch64_print_patchable_function_entry to avoid placing
>> patchable area before .cfi_startproc.
>>
>> The patch passed bootstrap and regression test on aarch64-linux.
>> Ok to commit to trunk and backport to active release branches?
>
> Looks good, but doesn't the problem described in the PR then still
> apply to the BTI emitted by:
>
>   if (cfun->machine->label_is_assembled
>   && aarch64_bti_enabled ()
>   && !cgraph_node::get (cfun->decl)->only_called_directly_p ())
> {
>   /* Remove the BTI that follows the patch area and insert a new BTI
>  before the patch area right after the function label.  */
>   rtx_insn *insn = next_real_nondebug_insn (get_insns ());
>   if (insn
>   && INSN_P (insn)
>   && GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE
>   && XINT (PATTERN (insn), 1) == UNSPECV_BTI_C)
> delete_insn (insn);
>   asm_fprintf (file, "\thint\t34 // bti c\n");
> }
>
> ?  It seems like the BTI will be before the cfi_startproc and the
> patchable entry afterwards.
>
> I guess we should keep the BTI instruction as-is (rather than printing
> a .hint) and emit the new UNSPECV_PATCHABLE_AREA after the BTI rather
> than before it.

...this approach would be slightly cleaner though.  The .hint asm string
we're emitting here is exactly the same as the one emitted by the
original bti_c instruction.  The only reason for deleting the
instruction and emitting text was because we were emitting the
patchable entry directly as text, and the BTI text had to come
before the patchable entry text.

Now that we're emitting the patchable entry via a normal instruction
(a good thing!) we can keep the preceding bti_c as a normal instruction
too.  That is, I think we should use emit_insn_after to emit the entry
after the bti_c insn (if it exists) instead of before BB_HEAD.
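
The placement rule suggested here can be pictured with a toy model. To be clear about assumptions: this does not use GCC's rtx_insn API or emit_insn_after; the insn struct, the insn_kind enum and place_patchable_area are hypothetical stand-ins that only show the ordering "after a leading bti_c if there is one, otherwise at the head".

```c
#include <stddef.h>

enum insn_kind { INSN_BTI_C, INSN_PATCHABLE_AREA, INSN_OTHER };

/* Toy singly linked instruction stream.  */
struct insn
{
  enum insn_kind kind;
  struct insn *next;
};

/* Link ENTRY into the stream headed by *HEAD: after a leading BTI_C
   if present, otherwise before everything else.  */
static void
place_patchable_area (struct insn **head, struct insn *entry)
{
  struct insn *first = *head;

  if (first != NULL && first->kind == INSN_BTI_C)
    {
      entry->next = first->next;
      first->next = entry;
    }
  else
    {
      entry->next = first;
      *head = entry;
    }
}
```

Keeping both the BTI and the patchable entry as ordinary instructions means their relative order is decided in one place, rather than by mixing deleted insns with directly printed text.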

Thanks,
Richard

>> gcc/
>> PR target/93492
>> * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
>> Declared.
>> * config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
>> Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
>> (aarch64_output_patchable_area): New.
>> * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
>> (patchable_area): Define.
>>
>> gcc/testsuite/
>> PR target/93492
>> * gcc.target/aarch64/pr98776.c: New.
>>
>>
>> From b9cf87bcdf65f515b38f1851eb95c18aaa180253 Mon Sep 17 00:00:00 2001
>> From: Sebastian Pop 
>> Date: Wed, 30 Nov 2022 19:45:24 +
>> Subject: [PATCH] AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>>
>> Currently patchable area is at the wrong place on AArch64.  It is placed
>> immediately after function label, before .cfi_startproc.  This patch
>> adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
>> modifies aarch64_print_patchable_function_entry to avoid placing
>> patchable area before .cfi_startproc.
>>
>> gcc/
>>   PR target/93492
>>   * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
>>   Declared.
>>   * config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
>>   Emit an

Re: [PATCH v2 0/2] gcc: xtensa: allow dynamic configuration

2022-12-07 Thread Max Filippov via Gcc-patches

On Mon, Nov 28, 2022 at 4:46 PM Max Filippov  wrote:
>
> Hello,
>
> this series addresses the long-standing issue with xtensa configuration
> support by adding a way to configure the toolchain for a specific xtensa
> core at runtime using the xtensa-dynconfig [1] library as a plugin.
> On a platform with shared library support, a single toolchain binary
> becomes capable of building code for an arbitrary xtensa configuration.
> At the same time it fully preserves the traditional way of configuring
> the toolchain using the xtensa configuration overlay.
>
> Currently xtensa toolchain needs to be patched and rebuilt for every
> new xtensa processor configuration. This has a number of downsides:
> - toolchain builders need to change the toolchain source code, and
>   because xtensa configuration overlay is not a patch, this change is
>   special, embedding it into the toolchain build process gets
>   backpressure.
> - toolchain built for one configuration is usually not usable for any
>   other configuration. It's not possible for a distribution to provide
>   reusable prebuilt xtensa toolchain.
>
> This series allows building the toolchain (including target libraries)
> without its source code modification. Built toolchain takes configuration
> parameters from the shared object specified in the environment variable.
> That shared object may be built by the xtensa-dynconfig project [1].
>
> The same shared object is used for gcc, all binutils and for gdb.
> Xtensa core specific information needed to build that shared object is
> taken from the configuration overlay.
>
> Both gcc and binutils-gdb get a new shared header file,
> include/xtensa-dynconfig.h, that provides the definition of the
> configuration data structure and initialization macros, redefines the
> XCHAL_* macros to access this structure, and declares a function for
> loading the configuration dynamically.
>
> This is not the first submission of this series, it was first
> submitted in 2017 [2]. This version has improved configuration
> versioning and GPL-compatibility check that was suggested in comments
> for the v1.
>
> [1] https://github.com/jcmvbkbc/xtensa-dynconfig
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2017-May/475109.html
>
> Max Filippov (2):
>   gcc: xtensa: allow dynamic configuration
>   libgcc: xtensa: use built-in configuration
>
>  gcc/config.gcc   |   1 +
>  gcc/config/xtensa/t-xtensa   |   8 +-
>  gcc/config/xtensa/xtensa-dynconfig.c | 170 +++
>  gcc/config/xtensa/xtensa-protos.h|   1 +
>  gcc/config/xtensa/xtensa.h   |  22 +-
>  include/xtensa-dynconfig.h   | 442 +++
>  libgcc/config/xtensa/crti.S  |   2 +-
>  libgcc/config/xtensa/crtn.S  |   2 +-
>  libgcc/config/xtensa/lib1funcs.S |   2 +-
>  libgcc/config/xtensa/lib2funcs.S |   2 +-
>  libgcc/config/xtensa/xtensa-config-builtin.h | 198 +
>  11 files changed, 828 insertions(+), 22 deletions(-)
>  create mode 100644 gcc/config/xtensa/xtensa-dynconfig.c
>  create mode 100644 include/xtensa-dynconfig.h
>  create mode 100644 libgcc/config/xtensa/xtensa-config-builtin.h

Series regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max
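
As a rough sketch of the mechanism described in the cover letter (configuration macros redirected to a structure that can be swapped at run time), something like the following could be imagined. Everything here is an assumption for illustration: the struct, its field names, the placeholder values and the SKETCH_* macros are not taken from include/xtensa-dynconfig.h, and the step that actually loads the structure from a shared object is left out.

```c
#include <stddef.h>

/* Hypothetical configuration record; the real header defines its own.  */
struct xtensa_config_sketch
{
  int have_be;
  int have_density;
  int num_aregs;
};

/* Built-in defaults, used when nothing has been loaded.  The values
   are arbitrary placeholders.  */
static const struct xtensa_config_sketch builtin_config = { 0, 1, 32 };

/* Set once a configuration has been loaded (for example from a
   plugin); NULL means "use the built-in one".  */
static const struct xtensa_config_sketch *loaded_config;

static const struct xtensa_config_sketch *
current_config (void)
{
  return loaded_config != NULL ? loaded_config : &builtin_config;
}

static void
use_loaded_config (const struct xtensa_config_sketch *config)
{
  loaded_config = config;
}

/* Macros that used to be compile-time constants become accessors.  */
#define SKETCH_XCHAL_HAVE_BE (current_config ()->have_be)
#define SKETCH_XCHAL_NUM_AREGS (current_config ()->num_aregs)
```

Code that previously tested a constant keeps the same spelling at the use site, which is presumably why the approach needs so few changes outside the configuration headers.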

Re: [PATCH] libstdc++: Add error handler for

2022-12-07 Thread François Dumont via Gcc-patches


Looks perfect to me, thanks.

On 06/12/22 22:44, Jonathan Wakely wrote:

On Wed, 30 Nov 2022 at 18:00, François Dumont  wrote:

On 30/11/22 14:07, Jonathan Wakely wrote:

On Wed, 30 Nov 2022 at 11:57, Jonathan Wakely  wrote:


On Wed, 30 Nov 2022 at 11:54, Jonathan Wakely  wrote:


On Wed, 30 Nov 2022 at 06:04, François Dumont via Libstdc++ 
 wrote:

Good catch, then we also need this patch.

Is it worth printing an error? If we can't show the backtrace because of an 
error, we can just print nothing there.

No strong opinion on that but if we do not print anything the output
will be:

Backtrace:

Error: ...

I just considered that it did not cost much to report the issue to the
user that defined _GLIBCXX_DEBUG_BACKTRACE and so is expecting a backtrace.

Maybe printing "Backtrace:\n" could be done in the normal callback
leaving the user with the feeling that _GLIBCXX_DEBUG_BACKTRACE does not
work.

OK, how's this?

Tested x86_64-linux.

Re: [PATCH] range-op-float, v2: frange_arithmetic tweaks for MODE_COMPOSITE_P

2022-12-07 Thread Aldy Hernandez via Gcc-patches


OK, thanks.
Aldy

On 12/7/22 17:05, Jakub Jelinek wrote:

On Wed, Dec 07, 2022 at 04:38:14PM +0100, Aldy Hernandez wrote:

So, perhaps a combination of that, change frange_nextafter to do the above
and change frange_arithmetic for the initial inexact rounding only to
do it by hand using range_nextafter and starting from value.


Either way is fine.  Whatever is cleaner.


Now in patch form:

2022-12-07  Jakub Jelinek  

* range-op-float.cc (frange_nextafter): For MODE_COMPOSITE_P from
denormal or zero, use real_nextafter on DFmode with conversions
around it.
(frange_arithmetic): For mode_composite, on top of rounding in the
right direction accept extra 1ulp error for PLUS/MINUS_EXPR, extra
2ulps error for MULT_EXPR and extra 3ulps error for RDIV_EXPR.

--- gcc/range-op-float.cc.jj2022-12-07 12:46:01.536123757 +0100
+++ gcc/range-op-float.cc   2022-12-07 16:58:02.406062286 +0100
@@ -254,10 +254,21 @@ frange_nextafter (enum machine_mode mode
  REAL_VALUE_TYPE ,
  const REAL_VALUE_TYPE )
  {
-  const real_format *fmt = REAL_MODE_FORMAT (mode);
-  REAL_VALUE_TYPE tmp;
-  real_nextafter (, fmt, , );
-  value = tmp;
+  if (MODE_COMPOSITE_P (mode)
+  && (real_isdenormal (, mode) || real_iszero ()))
+{
+  // IBM extended denormals only have DFmode precision.
+  REAL_VALUE_TYPE tmp, tmp2;
+  real_convert (, DFmode, );
+  real_nextafter (, REAL_MODE_FORMAT (DFmode), , );
+  real_convert (, mode, );
+}
+  else
+{
+  REAL_VALUE_TYPE tmp;
+  real_nextafter (, REAL_MODE_FORMAT (mode), , );
+  value = tmp;
+}
  }
  
  // Like real_arithmetic, but round the result to INF if the operation

@@ -324,21 +335,40 @@ frange_arithmetic (enum tree_code code,
  }
if (round && (inexact || !real_identical (, )))
  {
-  if (mode_composite)
+  if (mode_composite
+ && (real_isdenormal (, mode) || real_iszero ()))
{
- if (real_isdenormal (, mode)
- || real_iszero ())
-   {
- // IBM extended denormals only have DFmode precision.
- REAL_VALUE_TYPE tmp;
- real_convert (, DFmode, );
- frange_nextafter (DFmode, tmp, inf);
- real_convert (, mode, );
- return;
-   }
+ // IBM extended denormals only have DFmode precision.
+ REAL_VALUE_TYPE tmp, tmp2;
+ real_convert (, DFmode, );
+ real_nextafter (, REAL_MODE_FORMAT (DFmode), , );
+ real_convert (, mode, );
}
-  frange_nextafter (mode, result, inf);
+  else
+   frange_nextafter (mode, result, inf);
  }
+  if (mode_composite)
+switch (code)
+  {
+  case PLUS_EXPR:
+  case MINUS_EXPR:
+   // ibm-ldouble-format documents 1ulp for + and -.
+   frange_nextafter (mode, result, inf);
+   break;
+  case MULT_EXPR:
+   // ibm-ldouble-format documents 2ulps for *.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+  case RDIV_EXPR:
+   // ibm-ldouble-format documents 3ulps for /.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+  default:
+   break;
+  }
  }
  
  // Crop R to [-INF, MAX] where MAX is the maximum representable number



Jakub

Re: [PATCH] range-op-float, v2: Fix up frange_arithmetic [PR107967]

2022-12-07 Thread Aldy Hernandez via Gcc-patches


OK, thanks.
Aldy

On 12/7/22 16:49, Jakub Jelinek wrote:

On Wed, Dec 07, 2022 at 04:26:14PM +0100, Aldy Hernandez wrote:

This chunk...

...is quite similar to this one.  Could you abstract this?


It differs in various small details, plus comment content.
Anyway, here it is reworked such that those various small details
are based on whether inf is negative or positive in a single piece
of code.
Is this ok if it passes bootstrap/regtest?
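
For context on what the new tests exercise, here is a small run-time sketch of overflow to infinity under the default round-to-nearest mode. The operand values are my own choices for illustration and are not the ones in the pr107967 testcases; the sketch assumes IEEE binary64 doubles, and the helper names are hypothetical.

```c
#include <float.h>

/* The volatile copies keep the operations from being folded at
   compile time, so the run-time rounding mode applies.  */
static double
multiply (double a, double b)
{
  volatile double x = a, y = b;
  return x * y;
}

static double
add (double a, double b)
{
  volatile double x = a, y = b;
  return x + y;
}

/* True for +Inf and -Inf, false for finite values and NaN.  */
static int
is_infinite (double v)
{
  return v > DBL_MAX || v < -DBL_MAX;
}
```

With round-to-nearest, 1e300 * 1e300 overflows to +Inf, and DBL_MAX plus a value above half an ulp of DBL_MAX (for example 0x1.8p+970) also rounds to +Inf, while DBL_MAX + 0x1p+969 stays at DBL_MAX. Under -frounding-math the compiler cannot assume that mode, which is why the range has to keep some room below infinity.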

2022-12-07  Jakub Jelinek  

PR tree-optimization/107967
* range-op-float.cc (frange_arithmetic): Fix a thinko - if
inf is negative, use nextafter if !real_less (, )
rather than if real_less (, ).  If result is +-INF
while value is finite and -fno-rounding-math, don't do rounding
if !inexact or if result is significantly above max representable
value or below min representable value.

* gcc.dg/pr107967-1.c: New test.
* gcc.dg/pr107967-2.c: New test.
* gcc.dg/pr107967-3.c: New test.

--- gcc/range-op-float.cc.jj2022-12-07 16:37:05.285143250 +0100
+++ gcc/range-op-float.cc   2022-12-07 16:41:58.500928517 +0100
@@ -287,9 +287,42 @@ frange_arithmetic (enum tree_code code,
  
// Be extra careful if there may be discrepancies between the

// compile and runtime results.
-  if ((mode_composite || (real_isneg () ? real_less (, )
- : !real_less (, )))
-  && (inexact || !real_identical (, )))
+  bool round = false;
+  if (mode_composite)
+round = true;
+  else
+{
+  bool low = real_isneg ();
+  round = (low ? !real_less (, )
+  : !real_less (, ));
+  if (real_isinf (, !low)
+ && !real_isinf ()
+ && !flag_rounding_math)
+   {
+ // Use just [+INF, +INF] rather than [MAX, +INF]
+ // even if value is larger than MAX and rounds to
+ // nearest to +INF.  Similarly just [-INF, -INF]
+ // rather than [-INF, +MAX] even if value is smaller
+ // than -MAX and rounds to nearest to -INF.
+ // Unless INEXACT is true, in that case we need some
+ // extra buffer.
+ if (!inexact)
+   round = false;
+ else
+   {
+ REAL_VALUE_TYPE tmp = result, tmp2;
+ frange_nextafter (mode, tmp, inf);
+ // TMP is at this point the maximum representable
+ // number.
+ real_arithmetic (, MINUS_EXPR, , );
+ if (real_isneg () != low
+ && (REAL_EXP () - REAL_EXP ()
+ >= 2 - REAL_MODE_FORMAT (mode)->p))
+   round = false;
+   }
+   }
+}
+  if (round && (inexact || !real_identical (, )))
  {
if (mode_composite)
{
--- gcc/testsuite/gcc.dg/pr107967-1.c.jj2022-12-07 16:38:07.519248686 
+0100
+++ gcc/testsuite/gcc.dg/pr107967-1.c   2022-12-07 16:38:07.519248686 +0100
@@ -0,0 +1,35 @@
+/* PR tree-optimization/107967 */
+/* { dg-do compile { target float64 } } */
+/* { dg-options "-O2 -frounding-math -fno-trapping-math -fdump-tree-optimized" 
} */
+/* { dg-add-options float64 } */
+/* { dg-final { scan-tree-dump-not "return\[ \t]\*-?Inf;" "optimized" } } */
+
+_Float64
+foo (void)
+{
+  const _Float64 huge = 1.0e+300f64;
+  return huge * huge;
+}
+
+_Float64
+bar (void)
+{
+  const _Float64 huge = 1.0e+300f64;
+  return huge * -huge;
+}
+
+_Float64
+baz (void)
+{
+  const _Float64 a = 0x1.fp+1023f64;
+  const _Float64 b = 0x1.fp+970f64;
+  return a + b;
+}
+
+_Float64
+qux (void)
+{
+  const _Float64 a = 0x1.fp+1023f64;
+  const _Float64 b = 0x1.fp+969f64;
+  return a + b;
+}
--- gcc/testsuite/gcc.dg/pr107967-2.c.jj2022-12-07 16:38:07.519248686 
+0100
+++ gcc/testsuite/gcc.dg/pr107967-2.c   2022-12-07 16:38:07.519248686 +0100
@@ -0,0 +1,35 @@
+/* PR tree-optimization/107967 */
+/* { dg-do compile { target float64 } } */
+/* { dg-options "-O2 -fno-rounding-math -fno-trapping-math 
-fdump-tree-optimized" } */
+/* { dg-add-options float64 } */
+/* { dg-final { scan-tree-dump-times "return\[ \t]\*-?Inf;" 3 "optimized" } } 
*/
+
+_Float64
+foo (void)
+{
+  const _Float64 huge = 1.0e+300f64;
+  return huge * huge;
+}
+
+_Float64
+bar (void)
+{
+  const _Float64 huge = 1.0e+300f64;
+  return huge * -huge;
+}
+
+_Float64
+baz (void)
+{
+  const _Float64 a = 0x1.fp+1023f64;
+  const _Float64 b = 0x1.fp+970f64;
+  return a + b;
+}
+
+_Float64
+qux (void)
+{
+  const _Float64 a = 0x1.fp+1023f64;
+  const _Float64 b = 0x1.fp+969f64;
+  return a + b;
+}
--- gcc/testsuite/gcc.dg/pr107967-3.c.jj2022-12-07 16:38:07.519248686 
+0100
+++ gcc/testsuite/gcc.dg/pr107967-3.c   2022-12-07 16:38:07.519248686 +0100
@@ -0,0 +1,53 @@
+/* PR tree-optimization/107967 */
+/* { dg-do compile { target float64 } } */
+/* { dg-options "-O2 -fno-rounding-math -fno-trapping-math 
-fdump-tree-optimized" } */
+/* {

Re: [PATCH] PR tree-optimization/107985 - Ensure arguments to range-op handler are supported.

2022-12-07 Thread Richard Biener via Gcc-patches

On Wed, Dec 7, 2022 at 5:45 PM Andrew MacLeod via Gcc-patches
 wrote:
>
> This patch invalidates a range-op handler object if an operand type in
> the statement is not supported.
>
> This also triggered a check in stmt dependency resolution which assumed
> there must be a valid handler for any stmt with an appropriate LHS
> type... which is a false assumption.
>
> This should do for now, but long term I will rework the dispatch code to
> ensure it matches the specifically supported patterns of operands. This
> will make the handler creation a little slower, but speed up the actual
> dispatch, especially as we add new range types next release.  It's also
> much more invasive... too much for this release I think.
>
> bootstraps on x86_64-pc-linux-gnu with no regressions.  OK?

+ if (!Value_Range::supports_type_p (TREE_TYPE (m_op1)) ||
+ !Value_Range::supports_type_p (TREE_TYPE (m_op2)))

The ||s go to the next line.  Since in a GIMPLE_COND both operand types
are compatible it's enough to check one of them.

Likewise for the GIMPLE_ASSIGN case I think - I don't know of any
binary operator that has operands that would not be both compatible
or not compatible (but it's less clear-cut here).

Otherwise looks straightforward.
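
The formatting point can be shown on a toy version of the check. The types and the supports_type_p function below are hypothetical stand-ins, not Value_Range::supports_type_p or GIMPLE statements; the sketch only shows the operator starting the continuation line.

```c
#include <stdbool.h>

enum type_kind { TYPE_INTEGER, TYPE_FLOAT, TYPE_VECTOR };

/* Toy binary statement with one type per operand.  */
struct binary_stmt
{
  enum type_kind op1_type;
  enum type_kind op2_type;
};

/* Stand-in predicate: pretend only scalar types are supported.  */
static bool
supports_type_p (enum type_kind kind)
{
  return kind == TYPE_INTEGER || kind == TYPE_FLOAT;
}

/* A handler is only valid if both operand types are supported.  Note
   that the "||" starts the second line instead of ending the first.  */
static bool
handler_valid_p (const struct binary_stmt *stmt)
{
  if (!supports_type_p (stmt->op1_type)
      || !supports_type_p (stmt->op2_type))
    return false;
  return true;
}
```

If both operand types are always compatible, as noted above for GIMPLE_COND, the second test is redundant and checking one operand would be enough.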

Thanks,
Richard.

> Andrew
>

RE: [PATCH] arm: fix mve intrinsics scan body tests for C++

2022-12-07 Thread Kyrylo Tkachov via Gcc-patches

Hi Andrea,

> -----Original Message-----
> From: Andrea Corallo 
> Sent: Wednesday, December 7, 2022 3:03 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Andrea Corallo 
> Subject: [PATCH] arm: fix mve intrinsics scan body tests for C++
> 
> Hi all,
> 
> this patch is to export the functions defined in these MVE tests as C
> so the body scan assembler works as expected also for our C++ tests.
> 
> Best Regards and sorry for the regression!

Ok.
Thanks,
Kyrill
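
For anyone unfamiliar with the idiom, exporting a test function "as C" usually looks like the sketch below. The function sample_abs_diff is a made-up example rather than one of the MVE tests; the point is only that, when the file is compiled as C++, the symbol keeps its unmangled name, so a check keyed on the function's label in the assembly can still find it.

```c
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical test function; with C linkage its assembly label is
   the plain name in both C and C++ builds.  */
int
sample_abs_diff (int a, int b)
{
  return a > b ? a - b : b - a;
}

#ifdef __cplusplus
}
#endif
```

In a C compilation the guards expand to nothing, so the same source serves both the C and C++ runs of the testsuite.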

> 
>   Andrea
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/mve/intrinsics/vabavq_p_s16.c: Extern functions
>   as "C".
>   * gcc.target/arm/mve/intrinsics/vabavq_p_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_p_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_p_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_p_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_p_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabavq_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_m_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_m_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_m_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_m_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_x_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_x_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_x_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabdq_x_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_m_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_m_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_m_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_x_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_x_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_x_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_x_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vabsq_x_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddlvaq_p_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddlvaq_p_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddlvaq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddlvaq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddlvq_p_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddlvq_p_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddlvq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddlvq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_m_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_m_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_m_n_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_m_n_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_m_n_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_m_n_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vaddq_m_n_s8.c: Likewise.
>   *

Re: [PATCH][AArch64] Cleanup move immediate code

2022-12-07 Thread Wilco Dijkstra via Gcc-patches

Hi Andreas,

Thanks for the report, I've committed the fix: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108006

Cheers,
Wilco

[COMMITTED] AArch64: Fix assert in aarch64_move_imm [PR108006]

2022-12-07 Thread Wilco Dijkstra via Gcc-patches

Ensure we only pass SI/DImode which fixes the assert.

Committed as obvious.

gcc/
        PR target/108006
        * config/aarch64/aarch64.cc (aarch64_expand_sve_const_vector):
        Fix call to aarch64_move_imm to use SI/DI.
---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 89bf0dff904b6b52b71841aec299541f01884f3d..27a814d862101ce244c52d4863c6158cf549f066 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -6513,7 +6513,8 @@ aarch64_expand_sve_const_vector (rtx target, rtx src)
  /* If the integer can be moved into a general register by a
 single instruction, do that and duplicate the result.  */
  if (CONST_INT_P (elt_value)
- && aarch64_move_imm (INTVAL (elt_value), elt_mode))
+ && aarch64_move_imm (INTVAL (elt_value),
+  encoded_bits <= 32 ? SImode : DImode))
{
  elt_value = force_reg (elt_mode, elt_value);
  return expand_vector_broadcast (mode, elt_value);
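
The mode selection in the hunk above can be sketched in isolation. This is only an illustration under stated assumptions: `Mode`, `pick_move_mode` and `accepts_mode` are hypothetical stand-ins, not GCC's machine_mode or aarch64_move_imm.

```cpp
#include <cassert>

// Hypothetical stand-ins for machine modes; not GCC's real enum.
enum class Mode { QI, HI, SI, DI };

// Toy predicate modelling a callee that only accepts SI/DI.
inline bool accepts_mode (Mode m)
{
  return m == Mode::SI || m == Mode::DI;
}

// Map an element width in bits to a mode the callee accepts:
// anything up to 32 bits goes through SI, wider values through DI.
inline Mode pick_move_mode (unsigned encoded_bits)
{
  Mode m = encoded_bits <= 32 ? Mode::SI : Mode::DI;
  assert (accepts_mode (m));
  return m;
}
```

With this shape, an 8-bit or 16-bit element never reaches the callee in a narrow mode, which is the situation the assert in the report presumably tripped on.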

[PATCH] PR tree-optimization/107985 - Ensure arguments to range-op handler are supported.

2022-12-07 Thread Andrew MacLeod via Gcc-patches

This patch invalidates a range-op handler object if an operand type in 
the statement is not supported.


This also triggered a check in stmt dependency resolution which assumed 
there must be a valid handler for any stmt with an appropriate LHS 
type... which is a false assumption.


This should do for now, but long term I will rework the dispatch code to 
ensure it matches the specifically supported patterns of operands. This 
will make the handler creation a little slower, but speed up the actual 
dispatch, especially as we add new range types next release.  It's also 
much more invasive... too much for this release I think.


bootstraps on x86_64-pc-linux-gnu with no regressions.  OK?

Andrew

From 966076046e5687937eeac61df762f89178aa17c7 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 6 Dec 2022 10:41:29 -0500
Subject: [PATCH 1/2] Ensure arguments to range-op handler are supported.

	PR tree-optimization/107985
	gcc/
	* gimple-range-op.cc
	(gimple_range_op_handler::gimple_range_op_handler): Check if type
	of the operands is supported.
	* gimple-range.cc (gimple_ranger::prefill_stmt_dependencies): Do
	not assert if there is no range-op handler.

	gcc/testsuite/
	* g++.dg/pr107985.C: New.
---
 gcc/gimple-range-op.cc  |  6 ++
 gcc/gimple-range.cc | 24 +---
 gcc/testsuite/g++.dg/pr107985.C | 18 ++
 3 files changed, 37 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr107985.C

diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 7764166d5fb..c36c49ac1da 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -148,6 +148,9 @@ gimple_range_op_handler::gimple_range_op_handler (gimple *s)
 	case GIMPLE_COND:
 	  m_op1 = gimple_cond_lhs (m_stmt);
 	  m_op2 = gimple_cond_rhs (m_stmt);
+	  if (!Value_Range::supports_type_p (TREE_TYPE (m_op1)) ||
+	  !Value_Range::supports_type_p (TREE_TYPE (m_op2)))
+	m_valid = false;
 	  return;
 	case GIMPLE_ASSIGN:
 	  m_op1 = gimple_range_base_of_assignment (m_stmt);
@@ -164,6 +167,9 @@ gimple_range_op_handler::gimple_range_op_handler (gimple *s)
 	}
 	  if (gimple_num_ops (m_stmt) >= 3)
 	m_op2 = gimple_assign_rhs2 (m_stmt);
+	  if ((m_op1 && !Value_Range::supports_type_p (TREE_TYPE (m_op1))) ||
+	  (m_op2 && !Value_Range::supports_type_p (TREE_TYPE (m_op2))))
+	m_valid = false;
 	  return;
 	default:
 	  gcc_unreachable ();
diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index ecd6039e0fd..8c055826e17 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -422,18 +422,20 @@ gimple_ranger::prefill_stmt_dependencies (tree ssa)
   else
 	{
 	  gimple_range_op_handler handler (stmt);
-	  gcc_checking_assert (handler);
-	  tree op = handler.operand2 ();
-	  if (op)
+	  if (handler)
 	{
-	  Value_Range r (TREE_TYPE (op));
-	  prefill_name (r, op);
-	}
-	  op = handler.operand1 ();
-	  if (op)
-	{
-	  Value_Range r (TREE_TYPE (op));
-	  prefill_name (r, op);
+	  tree op = handler.operand2 ();
+	  if (op)
+		{
+		  Value_Range r (TREE_TYPE (op));
+		  prefill_name (r, op);
+		}
+	  op = handler.operand1 ();
+	  if (op)
+		{
+		  Value_Range r (TREE_TYPE (op));
+		  prefill_name (r, op);
+		}
 	}
 	}
 }
diff --git a/gcc/testsuite/g++.dg/pr107985.C b/gcc/testsuite/g++.dg/pr107985.C
new file mode 100644
index 000..8d244b54efb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr107985.C
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -ftree-vrp -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre" } */
+
+struct B {
+  int f;
+};
+
+struct D : public B {
+};
+
+void foo() {
+  D d;
+  d.f = 7;
+
+  int B::* pfb = &B::f;
+  int D::* pfd = pfb;
+  int v = d.*pfd;
+}
-- 
2.38.1
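
The two halves of the patch above (mark the handler invalid, then test it instead of asserting) can be sketched with a toy model. Everything here is hypothetical: `TypeKind`, `Handler` and `prefill_count` only mimic the shape of the change and are not the ranger classes.

```cpp
#include <vector>

// Hypothetical operand kinds; only Integer and Float count as supported.
enum class TypeKind { Integer, Float, PointerToMember };

inline bool supports_type (TypeKind k)
{
  return k == TypeKind::Integer || k == TypeKind::Float;
}

// Toy handler: it becomes invalid when any operand type is unsupported.
class Handler
{
public:
  explicit Handler (const std::vector<TypeKind> &operands)
  {
    for (TypeKind k : operands)
      if (!supports_type (k))
        m_valid = false;
  }
  explicit operator bool () const { return m_valid; }

private:
  bool m_valid = true;
};

// The caller skips the statement when the handler is invalid, rather
// than asserting that a handler must exist.  Returns the number of
// operands that would be prefilled.
inline int prefill_count (const std::vector<TypeKind> &operands)
{
  Handler handler (operands);
  if (!handler)
    return 0;
  return static_cast<int> (operands.size ());
}
```

A statement with a pointer-to-member operand, as in the new testcase, would then simply contribute nothing to the prefill step.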

Re: [PATCH v5 3/4] OpenMP: Pointers and member mappings

2022-12-07 Thread Tobias Burnus


Hi Julian,

I think this patch is OK; however, at least for gimplify.cc Jakub needs to have 
a second look.

As remarked for the 2/4 patch, I believe mapping 'map(tofrom: var%f(2:3))'
should work without explicitly mapping 'map(tofrom: var%f)'
(→ [TR11 157:21-26] (approx. [5.2 154:22-27], [5.1 352:17-22], [5.0 320:22-27]).
→ https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608100.html
(+ previously in the thread).

Testing the patch, that seems to work fine (i.e. contrary to C/C++, cf. 2/4),
which matches the dump and, if I understood correctly, also your (Julian's)
expectation.
Thus, no need to modify the code part.

Regarding the testcases:
* I would prefer if you don't modify the existing
  libgomp.fortran/struct-elem-map-1.f90 testcase. However, you could add
  your version as another variant ('subroutine nine()', 'four_var()', or
  whatever the next free name is), possibly with a comment stating that it
  is 'four()' but with an added explicit base-pointer mapping.

* As the new version should map *less*, I wonder whether some
  -fdump-tree-{original,gimple,omplower} scan-tree-dump checks would be
  useful besides testing whether it works at run time.
  (Your decision regarding which tree, which testcases and whether at all.)

* Likewise, maybe a 'target enter/exit data' check? However, you might very
  well run into my 'omp target data exit' issue, cf.
  https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604887.html
  (needs to be revised based on Jakub's comments; I think those were on IRC
  only – the problem is that not only 'alloc' is affected but also 'from'
  etc.)

On 18.10.22 12:39, Julian Brown wrote:

Implementing the "omp declare mapper" functionality, I noticed some
cases where handling of derived type members that are pointers doesn't
seem to be quite right. At present, a type such as this:
...
   map(to: tvar%arrptr) map(tofrom: tvar%arrptr(3:8))

and then instead we should follow (OpenMP 5.2, 5.8.3 "map Clause"):
...
   2) map(tofrom: tvar%arrptr(3:8)   -->
   GOMP_MAP_TOFROM *tvar%arrptr%data(3)  (size 8-3+1, etc.)
   GOMP_MAP_TO_PSET  tvar%arrptr
   GOMP_MAP_ATTACH_DETACH  tvar%arrptr%data  (bias 3, etc.)

...
Additionally, the next patch in the series adds a runtime diagnostic
for the (illegal) case where 'i' and 'j' are different.

2022-10-18  Julian Brown  

gcc/fortran/
  * dependency.cc (gfc_omp_expr_prefix_same): New function.
  * dependency.h (gfc_omp_expr_prefix_same): Add prototype.
  * gfortran.h (gfc_omp_namelist): Add "duplicate_of" field to "u2"
  union.
  * trans-openmp.cc (dependency.h): Include.
  (gfc_trans_omp_array_section): Use GOMP_MAP_TO_PSET unconditionally for
  mapping array descriptors.
  (gfc_symbol_rooted_namelist): New function.
  (gfc_trans_omp_clauses): Check subcomponent and subarray/element
  accesses elsewhere in the clause list for pointers to derived types or
  array descriptors, and adjust or drop mapping nodes appropriately.

gcc/
  * gimplify.cc (omp_tsort_mapping_groups): Process nodes that have
  OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P set after those that don't.
  (omp_accumulate_sibling_list): Adjust GOMP_MAP_TO_PSET handling.
  Remove GOMP_MAP_ALWAYS_POINTER handling.

libgomp/
  * testsuite/libgomp.fortran/map-subarray.f90: New test.
  * testsuite/libgomp.fortran/map-subarray-2.f90: New test.
  * testsuite/libgomp.fortran/map-subarray-3.f90: New test.
  * testsuite/libgomp.fortran/map-subarray-4.f90: New test.
  * testsuite/libgomp.fortran/map-subarray-6.f90: New test.
  * testsuite/libgomp.fortran/map-subarray-7.f90: New test.
  * testsuite/libgomp.fortran/map-subcomponents.f90: New test.
  * testsuite/libgomp.fortran/struct-elem-map-1.f90: Adjust for
  descriptor-mapping changes.  Remove XFAIL.

...

--- a/libgomp/testsuite/libgomp.fortran/struct-elem-map-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/struct-elem-map-1.f90
@@ -229,7 +229,8 @@ contains

  !   !$omp target map(tofrom: var%d(4:7), var%f(2:3), var%str2(2:3)) &
  !   !$omp&   map(tofrom: var%str4(2:2), var%uni2(2:3), var%uni4(2:2))
-!$omp target map(tofrom: var%d(4:7), var%f(2:3), var%str2(2:3), var%uni2(2:3))
+!$omp target map(to: var%f) map(tofrom: var%d(4:7), var%f(2:3), &
+!$omp&   var%str2(2:3), var%uni2(2:3))

This adds 'to: var%f'  (to the existing 'var%f(2:3)') – where 'f' is a
POINTER. As discussed at the top, I prefer to leave it as is – and
possibly just add another test-function, replicating this function and
only there adding the basepointer as additional list item.

-!$omp target map(tofrom: var%f(2:3))
+!$omp target map(to: var%f) map(tofrom: var%f(2:3))

likewise.

-!$omp target map(tofrom: var%d(5), var%f(3), var%str2(3), var%uni2(3))
+!$omp target map(to: var%f) map(tofrom: var%d(5), var%f(3), &
+!$omp&  var%str2(3), var%uni2(3))

likewise.

-!$omp target map(tofrom:

Re: [PATCH v5 2/4] OpenMP/OpenACC: Rework clause expansion and nested struct handling

2022-12-07 Thread Tobias Burnus


Hi Julian,

On 07.12.22 16:16, Julian Brown wrote:

On Wed, 7 Dec 2022 15:54:42 +0100 Tobias Burnus  wrote:

If I understand Deepak's comment (on OpenMP.org's omp-lang list, sorry
it is a nonpublic list) correctly, the following wording implies that
a 'from: s.w[z:4]' for a pointer 's.w' also implies a mapping of
's.w' - if 's' is used inside the target region and, thus, gets
implicitly mapped.

[TR11 157:21-26] (approx. [5.2 154:22-27], [5.1 352:17-22], [5.0
320:22-27])

"If a list item with an implicit data-mapping attribute does not have
any corresponding storage in the device data environment prior to a
task encountering the construct associated with the map clause, and
one or more contiguous parts of the original storage are either list
items or base pointers to list items that are explicitly mapped on
the construct, only those parts of the original storage will have
corresponding storage in the device data environment as a result of
the map clauses on the construct."

Hmmm... IIRC that is a different conclusion than the one we have
understood previously, leading to e.g. the patch here (Chung-Lin CC'ed):

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html


This seems to be the "Target directive struct mapping question" omp-lang thread,
started on 2021-03-22.

I think we need to distinguish:

  #pragma omp target enter data map(to: s.w[:10])

from

  #pragma omp target map(tofrom: s.arr[:20])
s.arr[0] = 5;

As in the latter case 's' gets implicitly mapped and then applies to
the base pointer 's.arr' of 's.arr[:20]'. While in the former case,
only the pointee gets mapped without the pointer 's.arr' (and, hence,
there is also no pointer attachment).
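
To make the two cases concrete, here is a small sketch. The struct `S` and the function names are hypothetical, the member is called 'arr' in both cases for brevity, and the comments only restate the reading given above rather than settled spec semantics. Built without OpenMP support the pragmas are ignored and the code runs on the host.

```cpp
#include <cassert>

// Hypothetical struct with a pointer member, standing in for 's' above.
struct S
{
  int *arr;
};

// Case 1: 'target enter data' maps only the pointee s.arr[0:10].  Under
// the reading above, the pointer 's.arr' itself is neither mapped nor
// attached.  The matching 'exit data' releases the pointee again.
inline void stage_pointee (S &s)
{
#pragma omp target enter data map(to: s.arr[:10])
#pragma omp target exit data map(release: s.arr[:10])
  (void) s;
}

// Case 2: 's' is referenced inside the region, so it is implicitly
// mapped, and the explicit clause supplies the pointee for the base
// pointer 's.arr'.  DATA must point to at least 20 ints.
inline int write_first (int *data)
{
  assert (data != nullptr);
  S s{data};
#pragma omp target map(tofrom: s.arr[:20])
  {
    s.arr[0] = 5;
  }
  return s.arr[0];
}
```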

At least that's what I get from the wording above and reading Deepak's last
email - and it does not seem to clash with the discussion in the lengthy
omp-lang thread. (Maybe there are other threads – or I completely misread them.)

I think it makes sense to have a clarifying example in OpenMP; hence,
I filed the OpenMP.org example issue #342, starting with essentially
what I wrote above: 'target enter data' needs more work to get the pointer
handling done, 'target' + accessing 's' works as is.

I hope it makes sense.


Follow-on discussion then questioned whether the change was really the
intention of the spec, but we thought it was.  Has that changed now?


No idea – I find it difficult to track all the language changes and find
mapping complex and unclear.

However, it does seem to make sense in the way written above, without
contradicting the previous discussions, minus the common confusion.
(At least as far as I gathered from browsing both omp-lang and gcc-patches.)


(I think actually changing the behaviour is a matter of flipping a
switch, but let's make sure we choose the right setting!)


That sounds great!

Tobias


[PATCH] range-op-float, v2: frange_arithmetic tweaks for MODE_COMPOSITE_P

2022-12-07 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 07, 2022 at 04:38:14PM +0100, Aldy Hernandez wrote:
> > So, perhaps a combination of that, change frange_nextafter to do the above
> > and change frange_arithmetic for the initial inexact rounding only to
> > do it by hand using range_nextafter and starting from value.
> 
> Either way is fine.  Whatever is cleaner.

Now in patch form:

2022-12-07  Jakub Jelinek  

* range-op-float.cc (frange_nextafter): For MODE_COMPOSITE_P from
denormal or zero, use real_nextafter on DFmode with conversions
around it.
(frange_arithmetic): For mode_composite, on top of rounding in the
right direction accept extra 1ulp error for PLUS/MINUS_EXPR, extra
2ulps error for MULT_EXPR and extra 3ulps error for RDIV_EXPR.
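
As a rough illustration of the per-operation slack described in this ChangeLog, the following sketch widens a bound by the listed number of ulps. It assumes plain IEEE double and std::nextafter rather than IBM extended long double, and `Op`, `extra_ulps` and `widen_bound` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

enum class Op { Plus, Minus, Mult, Div, Other };

// Extra ulps of slack per operation, as listed in the ChangeLog above.
inline int extra_ulps (Op op)
{
  switch (op)
    {
    case Op::Plus:
    case Op::Minus:
      return 1;
    case Op::Mult:
      return 2;
    case Op::Div:
      return 3;
    default:
      return 0;
    }
}

// Move BOUND outwards by the slack for OP: towards +infinity for an
// upper bound, towards -infinity for a lower bound.
inline double widen_bound (double bound, Op op, bool upper)
{
  assert (!std::isnan (bound));
  const double inf = upper ? std::numeric_limits<double>::infinity ()
                           : -std::numeric_limits<double>::infinity ();
  for (int i = 0; i < extra_ulps (op); ++i)
    bound = std::nextafter (bound, inf);
  return bound;
}
```

Repeated single steps keep the sketch close to the three back-to-back frange_nextafter calls in the patch, at the cost of a short loop.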

--- gcc/range-op-float.cc.jj	2022-12-07 12:46:01.536123757 +0100
+++ gcc/range-op-float.cc   2022-12-07 16:58:02.406062286 +0100
@@ -254,10 +254,21 @@ frange_nextafter (enum machine_mode mode
  REAL_VALUE_TYPE &value,
  const REAL_VALUE_TYPE &inf)
 {
-  const real_format *fmt = REAL_MODE_FORMAT (mode);
-  REAL_VALUE_TYPE tmp;
-  real_nextafter (&tmp, fmt, &value, &inf);
-  value = tmp;
+  if (MODE_COMPOSITE_P (mode)
+  && (real_isdenormal (&value, mode) || real_iszero (&value)))
+{
+  // IBM extended denormals only have DFmode precision.
+  REAL_VALUE_TYPE tmp, tmp2;
+  real_convert (&tmp2, DFmode, &value);
+  real_nextafter (&tmp, REAL_MODE_FORMAT (DFmode), &tmp2, &inf);
+  real_convert (&value, mode, &tmp);
+}
+  else
+{
+  REAL_VALUE_TYPE tmp;
+  real_nextafter (&tmp, REAL_MODE_FORMAT (mode), &value, &inf);
+  value = tmp;
+}
 }
 
 // Like real_arithmetic, but round the result to INF if the operation
@@ -324,21 +335,40 @@ frange_arithmetic (enum tree_code code,
 }
   if (round && (inexact || !real_identical (&result, &value)))
 {
-  if (mode_composite)
+  if (mode_composite
+ && (real_isdenormal (&result, mode) || real_iszero (&result)))
{
- if (real_isdenormal (&result, mode)
- || real_iszero (&result))
-   {
- // IBM extended denormals only have DFmode precision.
- REAL_VALUE_TYPE tmp;
- real_convert (&tmp, DFmode, &value);
- frange_nextafter (DFmode, tmp, inf);
- real_convert (&result, mode, &tmp);
- return;
-   }
+ // IBM extended denormals only have DFmode precision.
+ REAL_VALUE_TYPE tmp, tmp2;
+ real_convert (&tmp2, DFmode, &value);
+ real_nextafter (&tmp, REAL_MODE_FORMAT (DFmode), &tmp2, &inf);
+ real_convert (&result, mode, &tmp);
}
-  frange_nextafter (mode, result, inf);
+  else
+   frange_nextafter (mode, result, inf);
 }
+  if (mode_composite)
+switch (code)
+  {
+  case PLUS_EXPR:
+  case MINUS_EXPR:
+   // ibm-ldouble-format documents 1ulp for + and -.
+   frange_nextafter (mode, result, inf);
+   break;
+  case MULT_EXPR:
+   // ibm-ldouble-format documents 2ulps for *.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+  case RDIV_EXPR:
+   // ibm-ldouble-format documents 3ulps for /.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+  default:
+   break;
+  }
 }
 
 // Crop R to [-INF, MAX] where MAX is the maximum representable number


Jakub

[PATCH] range-op-float, v2: Fix up frange_arithmetic [PR107967]

2022-12-07 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 07, 2022 at 04:26:14PM +0100, Aldy Hernandez wrote:
> This chunk...
> 
> ...is quite similar to this one.  Could you abstract this?

It differs in various small details, plus comment content.
Anyway, here it is reworked such that those various small details
are based on whether inf is negative or positive in a single piece
of code.
Is this ok if it passes bootstrap/regtest?

2022-12-07  Jakub Jelinek  

PR tree-optimization/107967
* range-op-float.cc (frange_arithmetic): Fix a thinko - if
inf is negative, use nextafter if !real_less (, )
rather than if real_less (, ).  If result is +-INF
while value is finite and -fno-rounding-math, don't do rounding
if !inexact or if result is significantly above max representable
value or below min representable value.

* gcc.dg/pr107967-1.c: New test.
* gcc.dg/pr107967-2.c: New test.
* gcc.dg/pr107967-3.c: New test.
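
The overflow case this ChangeLog is about can be reproduced at run time with ordinary doubles. The sketch below assumes IEEE binary64 and the default round-to-nearest mode; `sum_overflows` and `product_overflows` are hypothetical helpers, not part of the patch.

```cpp
#include <cmath>
#include <limits>

// True when A + B becomes an infinity although both inputs are finite.
inline bool sum_overflows (double a, double b)
{
  volatile double r = a + b;   // volatile forces a store in double format
  return std::isfinite (a) && std::isfinite (b) && std::isinf (r);
}

// Same check for a product.
inline bool product_overflows (double a, double b)
{
  volatile double r = a * b;
  return std::isfinite (a) && std::isfinite (b) && std::isinf (r);
}
```

Under round-to-nearest, adding half an ulp of the largest finite double (2^970) to that value already rounds to +Inf, whereas adding 2^969 rounds back to the largest finite double. That boundary is what the baz and qux functions in the new tests appear to probe.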

--- gcc/range-op-float.cc.jj	2022-12-07 16:37:05.285143250 +0100
+++ gcc/range-op-float.cc   2022-12-07 16:41:58.500928517 +0100
@@ -287,9 +287,42 @@ frange_arithmetic (enum tree_code code,
 
   // Be extra careful if there may be discrepancies between the
   // compile and runtime results.
-  if ((mode_composite || (real_isneg () ? real_less (, )
- : !real_less (, )))
-  && (inexact || !real_identical (, )))
+  bool round = false;
+  if (mode_composite)
+round = true;
+  else
+{
+  bool low = real_isneg ();
+  round = (low ? !real_less (, )
+  : !real_less (, ));
+  if (real_isinf (, !low)
+ && !real_isinf ()
+ && !flag_rounding_math)
+   {
+ // Use just [+INF, +INF] rather than [MAX, +INF]
+ // even if value is larger than MAX and rounds to
+ // nearest to +INF.  Similarly just [-INF, -INF]
+ // rather than [-INF, +MAX] even if value is smaller
+ // than -MAX and rounds to nearest to -INF.
+ // Unless INEXACT is true, in that case we need some
+ // extra buffer.
+ if (!inexact)
+   round = false;
+ else
+   {
+ REAL_VALUE_TYPE tmp = result, tmp2;
+ frange_nextafter (mode, tmp, inf);
+ // TMP is at this point the maximum representable
+ // number.
+ real_arithmetic (, MINUS_EXPR, , );
+ if (real_isneg () != low
+ && (REAL_EXP () - REAL_EXP ()
+ >= 2 - REAL_MODE_FORMAT (mode)->p))
+   round = false;
+   }
+   }
+}
+  if (round && (inexact || !real_identical (, )))
 {
   if (mode_composite)
{
--- gcc/testsuite/gcc.dg/pr107967-1.c.jj	2022-12-07 16:38:07.519248686 +0100
+++ gcc/testsuite/gcc.dg/pr107967-1.c   2022-12-07 16:38:07.519248686 +0100
@@ -0,0 +1,35 @@
+/* PR tree-optimization/107967 */
+/* { dg-do compile { target float64 } } */
+/* { dg-options "-O2 -frounding-math -fno-trapping-math -fdump-tree-optimized" } */
+/* { dg-add-options float64 } */
+/* { dg-final { scan-tree-dump-not "return\[ \t]\*-?Inf;" "optimized" } } */
+
+_Float64
+foo (void)
+{
+  const _Float64 huge = 1.0e+300f64;
+  return huge * huge;
+}
+
+_Float64
+bar (void)
+{
+  const _Float64 huge = 1.0e+300f64;
+  return huge * -huge;
+}
+
+_Float64
+baz (void)
+{
+  const _Float64 a = 0x1.fp+1023f64;
+  const _Float64 b = 0x1.fp+970f64;
+  return a + b;
+}
+
+_Float64
+qux (void)
+{
+  const _Float64 a = 0x1.fp+1023f64;
+  const _Float64 b = 0x1.fp+969f64;
+  return a + b;
+}
--- gcc/testsuite/gcc.dg/pr107967-2.c.jj	2022-12-07 16:38:07.519248686 +0100
+++ gcc/testsuite/gcc.dg/pr107967-2.c   2022-12-07 16:38:07.519248686 +0100
@@ -0,0 +1,35 @@
+/* PR tree-optimization/107967 */
+/* { dg-do compile { target float64 } } */
+/* { dg-options "-O2 -fno-rounding-math -fno-trapping-math -fdump-tree-optimized" } */
+/* { dg-add-options float64 } */
+/* { dg-final { scan-tree-dump-times "return\[ \t]\*-?Inf;" 3 "optimized" } } */
+
+_Float64
+foo (void)
+{
+  const _Float64 huge = 1.0e+300f64;
+  return huge * huge;
+}
+
+_Float64
+bar (void)
+{
+  const _Float64 huge = 1.0e+300f64;
+  return huge * -huge;
+}
+
+_Float64
+baz (void)
+{
+  const _Float64 a = 0x1.fp+1023f64;
+  const _Float64 b = 0x1.fp+970f64;
+  return a + b;
+}
+
+_Float64
+qux (void)
+{
+  const _Float64 a = 0x1.fp+1023f64;
+  const _Float64 b = 0x1.fp+969f64;
+  return a + b;
+}
--- gcc/testsuite/gcc.dg/pr107967-3.c.jj	2022-12-07 16:38:07.519248686 +0100
+++ gcc/testsuite/gcc.dg/pr107967-3.c   2022-12-07 16:38:07.519248686 +0100
@@ -0,0 +1,53 @@
+/* PR tree-optimization/107967 */
+/* { dg-do compile { target float64 } } */
+/* { dg-options "-O2 -fno-rounding-math -fno-trapping-math -fdump-tree-optimized" } */
+/* { dg-add-options float64 } */
+/* { dg-final { scan-tree-dump-times

Re: [PATCH 2/2] Corrected pr25521.c target matching.

2022-12-07 Thread Cupertino Miranda via Gcc-patches



> On 12/2/22 10:52, Cupertino Miranda via Gcc-patches wrote:
>> This commit is a follow up of bugzilla #107181.
>> The commit /a0aafbc/ changed the default implementation of the
>> SELECT_SECTION hook in order to match clang/llvm behaviour w.r.t the
>> placement of `const volatile' objects.
>> However, the following targets use target-specific selection functions
>> and they choke on the testcase pr25521.c:
>>   *rx - target sets its const variables as '.section C,"a",@progbits'.
> That's presumably a constant section.  We should instead twiddle the test to
> recognize that section.

Although @progbits is indeed a constant section, I believe it is
more interesting to detect if the `rx' starts selecting more
standard sections instead of the current @progbits.
That was the reason why I opted to XFAIL instead of PASSing it.
Can I keep it as such ?

>
>>   *powerpc - its 32bit version is eager to allocate globals in .sdata
>>  sections.
>> Normally, one can expect for the variable to be allocated in .srodata,
>> however, in case of powerpc-*-* or powerpc64-*-* (with -m32)
>> 'targetm.have_srodata_section == false' and the code in
>> categorize_decl_for_section(varasm.cc), forces it to allocate in .sdata.
>>/* If the target uses small data sections, select it.  */
>>else if (targetm.in_small_data_p (decl))
>>  {
>>if (ret == SECCAT_BSS)
>>  ret = SECCAT_SBSS;
>>else if (targetm.have_srodata_section && ret == SECCAT_RODATA)
>>  ret = SECCAT_SRODATA;
>>else
>>  ret = SECCAT_SDATA;
>>  }
> I'd just skip the test for 32bit ppc.  There should be suitable effective-target
> tests you can use.
>
> jeff

[PATCH] c++, TLS: Support cross-tu static initialization for targets without alias support [PR106435].

2022-12-07 Thread Iain Sandoe via Gcc-patches

 This has been tested on x86_64 and arm64 Darwin and on x86_64 linux gnu.
 The basic patch is live in the homebrew macOS support and so has had quite
 wide coverage on non-trivial codebases.
 
 OK for master?
 Iain
 
 Since this actually fixes wrong code, I wonder if we should also consider
 back-porting.
 
 --- >8 ---

The description below relates to the code path when TARGET_SUPPORTS_ALIASES is
false; current operation is maintained for targets with alias support and any
new support code should be DCEd in that case.

--

Currently, cross-tu static initialisation is not supported for targets without
alias support.

The patch adds support by building a shim function in place of the alias for
these targets; the shim simply calls the generic initialiser.  Although this is
slightly less efficient than the alias, in practice (for targets that allow
sibcalls) the penalty is a single jump when code is optimised.

From the perspective of a TU referencing an extern TLS variable, there is no
way to determine if it requires a guarded dynamic init.  So, in the referencing
TU, we build a weak reference to the potential init and check at runtime if the
init is present before calling it.  This strategy is fine for targets that have
ELF semantics, but fails at link time for Mach-O (which does not permit the
reference to be undefined in the static link).

The actual initialiser call is contained in a wrapper function, and to resolve
the Mach-O linker issue, in the TU that is referencing the var, we now generate
both the wrapper _and_ a weak definition of a dummy init function.  In the case
that there _is_ a dynamic init (in a different TU), that version will be
non-weak and will override the weak dummy one.  In the case that we have a trivial
static init (so no init in any other TU) the weak-defined dummy init will be
called (a single return insn for optimised code).  We mitigate the call to
the dummy init by reworking the wrapper code-gen path to remove the test for
the weak reference function (as it will always be true) since the static linker
will now determine the function to be called.
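
A single-TU sketch of the scheme described above, under stated assumptions: the names are hypothetical (the compiler emits mangled symbols), the `weak` attribute is the GCC/Clang extension, and the thread_local variable stands in for the extern TLS variable with a trivial static init.

```cpp
// Stands in for the extern TLS variable; hypothetical name.
static thread_local int demo_tls_value = 42;

// Weak dummy initialiser emitted in the referencing TU.  A non-weak
// definition in another TU, if one exists, replaces it at link time.
extern "C" __attribute__ ((weak)) void demo_tls_init ()
{
}

// Wrapper: no runtime test for the initialiser's presence is needed,
// because the symbol is always defined (the dummy or the real one).
int &demo_tls_wrapper ()
{
  demo_tls_init ();
  return demo_tls_value;
}
```

With no strong definition anywhere, the wrapper ends up calling the empty dummy, which matches the "single return insn" cost mentioned above.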

Signed-off-by: Iain Sandoe 

PR c++/106435

gcc/c-family/ChangeLog:

* c-opts.cc (c_common_post_options): Allow fextern-tls-init for targets
without alias support.

gcc/cp/ChangeLog:

* decl2.cc (get_tls_init_fn): Allow targets without alias support.
(handle_tls_init): Emit aliases for single init functions where the
target supports this, otherwise emit a stub function that calls the
main tls init function.  (generate_tls_dummy_init): New.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/pr106435-b.cc: New file.
* g++.dg/cpp0x/pr106435.C: New test.
* g++.dg/cpp0x/pr106435.h: New file.
---
 gcc/c-family/c-opts.cc   |  2 +-
 gcc/cp/decl2.cc  | 80 
 gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc | 22 +++
 gcc/testsuite/g++.dg/cpp0x/pr106435.C| 24 +++
 gcc/testsuite/g++.dg/cpp0x/pr106435.h| 27 
 5 files changed, 142 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435.h

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 70745aa4e7c..064645f980d 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1070,7 +1070,7 @@ c_common_post_options (const char **pfilename)
 
   if (flag_extern_tls_init)
 {
-  if (!TARGET_SUPPORTS_ALIASES || !SUPPORTS_WEAK)
+  if (!SUPPORTS_WEAK)
{
  /* Lazy TLS initialization for a variable in another TU requires
 alias and weak reference support.  */
diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index f95529a5c9a..c6550c0c2fc 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -3672,9 +3672,8 @@ get_tls_init_fn (tree var)
   if (!flag_extern_tls_init && DECL_EXTERNAL (var))
 return NULL_TREE;
 
-  /* If the variable is internal, or if we can't generate aliases,
- call the local init function directly.  */
-  if (!TREE_PUBLIC (var) || !TARGET_SUPPORTS_ALIASES)
+  /* If the variable is internal call the local init function directly.  */
+  if (!TREE_PUBLIC (var))
 return get_local_tls_init_fn (DECL_SOURCE_LOCATION (var));
 
   tree sname = mangle_tls_init_fn (var);
@@ -3811,8 +3810,12 @@ generate_tls_wrapper (tree fn)
   if (tree init_fn = get_tls_init_fn (var))
 {
   tree if_stmt = NULL_TREE;
-  /* If init_fn is a weakref, make sure it exists before calling.  */
-  if (lookup_attribute ("weak", DECL_ATTRIBUTES (init_fn)))
+
+  /* If init_fn is a weakref, make sure it exists before calling.
+If the target does not support aliases, then we will have generated
+a dummy weak function, so there is no need to test its existence.  */
+  if (TARGET_SUPPORTS_ALIASES &&
+

Re: [PATCH] range-op-float: frange_arithmetic tweaks for MODE_COMPOSITE_P

2022-12-07 Thread Aldy Hernandez via Gcc-patches





On 12/7/22 16:31, Jakub Jelinek wrote:

On Wed, Dec 07, 2022 at 04:21:09PM +0100, Aldy Hernandez wrote:

On 12/7/22 13:10, Jakub Jelinek wrote:

+ switch (code)
+   {
+   case PLUS_EXPR:
+   case MINUS_EXPR:
+ // ibm-ldouble-format documents 1ulp for + and -.
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   case MULT_EXPR:
+ // ibm-ldouble-format documents 2ulps for *.
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   case RDIV_EXPR:
+ // ibm-ldouble-format documents 3ulps for /.
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   default:
+ if (!inexact)
+   return;
+ break;


It looks like this chunk...



+   switch (code)
+ {
+ case PLUS_EXPR:
+ case MINUS_EXPR:
+   // ibm-ldouble-format documents 1ulp for + and -.
+   frange_nextafter (mode, result, inf);
+   break;
+ case MULT_EXPR:
+   // ibm-ldouble-format documents 2ulps for *.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+ case RDIV_EXPR:
+   // ibm-ldouble-format documents 3ulps for /.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+ default:
+   break;
+ }


...is the same as this chunk.  Plus, all this mode composite stuff is


It is not the same, there is the DFmode, tmp vs. mode, result difference.
But sure, we could either add an inline function which for
(code, mode, result, inf) set of options (or (code, DFmode, tmp, inf))
do those 0, 1, 2, 3 frange_nextafter calls (and return bool if it did any
- there is also the if (!inexact) return; case), or as you suggest
perhaps change frange_nextafter to handle MODE_COMPOSITE_P differently
and do there
   if (mode_composite && (real_isdenormal (, mode) || real_iszero 
()))
 {
   // IBM extended denormals only have DFmode precision.
   REAL_VALUE_TYPE tmp;
   real_convert (, DFmode, );
   frange_nextafter (DFmode, tmp, inf);
   real_convert (, mode, );
 }
   else
 frange_nextafter (mode, result, inf);
Though, that somewhat changes behavior, it will convert to DFmode and back
for every nextafter rather than just once (just slower compile time),
but also right now we start from value rather than result.

So, perhaps a combination of that, change frange_nextafter to do the above
and change frange_arithmetic for the initial inexact rounding only to
do it by hand using range_nextafter and starting from value.


Either way is fine.  Whatever is cleaner.

Aldy



Anyway, this patch is far less important than the previous one...

Jakub

Re: [PATCH] range-op-float: frange_arithmetic tweaks for MODE_COMPOSITE_P

2022-12-07 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 07, 2022 at 04:21:09PM +0100, Aldy Hernandez wrote:
> On 12/7/22 13:10, Jakub Jelinek wrote:
> > + switch (code)
> > +   {
> > +   case PLUS_EXPR:
> > +   case MINUS_EXPR:
> > + // ibm-ldouble-format documents 1ulp for + and -.
> > + frange_nextafter (DFmode, tmp, inf);
> > + break;
> > +   case MULT_EXPR:
> > + // ibm-ldouble-format documents 2ulps for *.
> > + frange_nextafter (DFmode, tmp, inf);
> > + frange_nextafter (DFmode, tmp, inf);
> > + break;
> > +   case RDIV_EXPR:
> > + // ibm-ldouble-format documents 3ulps for /.
> > + frange_nextafter (DFmode, tmp, inf);
> > + frange_nextafter (DFmode, tmp, inf);
> > + frange_nextafter (DFmode, tmp, inf);
> > + break;
> > +   default:
> > + if (!inexact)
> > +   return;
> > + break;
> 
> It looks like this chunk...
> 
> 
> > +   switch (code)
> > + {
> > + case PLUS_EXPR:
> > + case MINUS_EXPR:
> > +   // ibm-ldouble-format documents 1ulp for + and -.
> > +   frange_nextafter (mode, result, inf);
> > +   break;
> > + case MULT_EXPR:
> > +   // ibm-ldouble-format documents 2ulps for *.
> > +   frange_nextafter (mode, result, inf);
> > +   frange_nextafter (mode, result, inf);
> > +   break;
> > + case RDIV_EXPR:
> > +   // ibm-ldouble-format documents 3ulps for /.
> > +   frange_nextafter (mode, result, inf);
> > +   frange_nextafter (mode, result, inf);
> > +   frange_nextafter (mode, result, inf);
> > +   break;
> > + default:
> > +   break;
> > + }
> 
> ...is the same as this chunk.  Plus, all this mode composite stuff is

It is not the same, there is the DFmode, tmp vs. mode, result difference.
But sure, we could either add an inline function which for
(code, mode, result, inf) set of options (or (code, DFmode, tmp, inf))
do those 0, 1, 2, 3 frange_nextafter calls (and return bool if it did any
- there is also the if (!inexact) return; case), or as you suggest
perhaps change frange_nextafter to handle MODE_COMPOSITE_P differently
and do there
  if (mode_composite && (real_isdenormal (, mode) || real_iszero ()))
    {
      // IBM extended denormals only have DFmode precision.
      REAL_VALUE_TYPE tmp;
      real_convert (, DFmode, );
      frange_nextafter (DFmode, tmp, inf);
      real_convert (, mode, );
    }
  else
    frange_nextafter (mode, result, inf);
Though, that somewhat changes behavior, it will convert to DFmode and back
for every nextafter rather than just once (just slower compile time),
but also right now we start from value rather than result.

So, perhaps a combination of that, change frange_nextafter to do the above
and change frange_arithmetic for the initial inexact rounding only to
do it by hand using frange_nextafter and starting from value.

Anyway, this patch is far less important than the previous one...

Jakub

Re: [PATCH] range-op-float: Fix up frange_arithmetic [PR107967]

2022-12-07 Thread Aldy Hernandez via Gcc-patches





On 12/7/22 09:29, Jakub Jelinek wrote:

Hi!

The addition of PLUS/MINUS/MULT/RDIV_EXPR frange handlers causes
miscompilation of some of the libm routines, resulting in lots of
glibc test failures.  A part of them is purely PR107608 fold-overflow-1.c
etc. issues, say when the code does
   return -0.5 / 0.0;
and expects division by zero to be emitted, but we propagate -Inf
and avoid the operation.
But there are also various tests where we end up with a computed value
that differs from the expected one.  All those cases are like:
  is:          inf   inf
  should be:   1.18973149535723176502e+4932   0xf.fffffffffffffffp+16380
  is:          inf   inf
  should be:   1.18973149535723176508575932662800701e+4932   0x1.ffffffffffffffffffffffffffffp+16383
  is:          inf   inf
  should be:   1.7976931348623157e+308   0x1.fffffffffffffp+1023
  is:          inf   inf
  should be:   3.40282346e+38   0x1.fffffep+127
and the corresponding source looks like:
static const double huge = 1.0e+300;
double whatever (...) {
...
   return huge * huge;
...
}
which for rounding to nearest or +inf should and does return +inf, but
for rounding to -inf or 0 should instead return nextafter (inf, -inf);
The rules IEEE754 has are that operations on +-Inf operands are exact
and produce +-Inf (except for the invalid ones that produce NaN) regardless
of rounding mode, while overflows:
"a) roundTiesToEven and roundTiesToAway carry all overflows to ∞ with the
sign of the intermediate result.
b) roundTowardZero carries all overflows to the format’s largest finite
number with the sign of the intermediate result.
c) roundTowardNegative carries positive overflows to the format’s largest
finite number, and carries negative overflows to −∞.
d) roundTowardPositive carries negative overflows to the format’s most
negative finite number, and carries positive overflows to +∞."
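
As an illustration of the four overflow rules quoted above, here is a minimal
C sketch for the double format.  The enum and function names are hypothetical
(they are not GCC or glibc identifiers), and the function only models where
an overflow is carried to; it does not change the floating-point environment.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical names for the IEEE 754 rounding directions; TO_NEAREST
   covers both roundTiesToEven and roundTiesToAway (rule a).  */
enum rounding_dir { TO_NEAREST, TOWARD_ZERO, TOWARD_NEGATIVE, TOWARD_POSITIVE };

/* Return the value an overflowing double operation is carried to, given
   the rounding direction and the sign of the intermediate result.  */
static double
overflow_result (enum rounding_dir dir, int negative)
{
  double inf = negative ? -INFINITY : INFINITY;
  double max = negative ? -DBL_MAX : DBL_MAX;

  switch (dir)
    {
    case TO_NEAREST:		/* Rule a: always to infinity.  */
      return inf;
    case TOWARD_ZERO:		/* Rule b: largest finite magnitude.  */
      return max;
    case TOWARD_NEGATIVE:	/* Rule c.  */
      return negative ? inf : max;
    case TOWARD_POSITIVE:	/* Rule d.  */
      return negative ? max : inf;
    }
  assert (!"unknown rounding direction");
  return inf;
}
```

For the huge * huge example, overflow_result (TOWARD_NEGATIVE, 0) and
overflow_result (TOWARD_ZERO, 0) both give DBL_MAX, i.e. nextafter (inf, -inf),
while the other two directions give +inf.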

Overflows to -Inf or nextafter (-inf, inf) were actually handled
correctly: we'd construct [-INF, -MAX] ranges in those cases because
!real_less (&value, &result) holds there - value is finite but larger in
magnitude than what the format can represent (though GCC's internal
format can represent it), while result is -INF.
Overflows to +Inf or nextafter (inf, -inf), however, were handled
incorrectly: the code tested real_less (&result, &value) rather than
!real_less (&result, &value).  The former test is true when the
value -> result rounding has already rounded down, and in that case we
shouldn't round again; we should round down only when it didn't.
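
The direction test described above can be modelled outside GCC, with float as
the target format and double standing in for GCC's wider internal value.  This
is a simplified sketch under those assumptions, not the frange_arithmetic
code: step_float, lower_bound and upper_bound are hypothetical names, the
inexact flag is folded into the comparison, and the input is assumed to be
within float range.

```c
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Step X one ulp toward +inf (up != 0) or -inf (up == 0) by adjusting its
   bit pattern.  NaN and the infinity already in that direction are
   returned unchanged.  Hypothetical stand-in for frange_nextafter.  */
static float
step_float (float x, int up)
{
  uint32_t bits;

  if (x != x || x == (up ? INFINITY : -INFINITY))
    return x;
  if (x == 0.0f)
    bits = up ? 0x00000001u : 0x80000001u;	/* Smallest subnormal.  */
  else
    {
      memcpy (&bits, &x, sizeof bits);
      /* Moving away from zero increments the magnitude bits.  */
      if ((x > 0.0f) == (up != 0))
	bits += 1;
      else
	bits -= 1;
    }
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Lower bound: the value -> result conversion rounds to nearest, so step
   down only if it did not already land below VALUE.  */
static float
lower_bound (double value)
{
  float result = (float) value;
  int round = !((double) result < value);

  if (round && (double) result != value)
    result = step_float (result, 0);
  return result;
}

/* Upper bound: mirror image of the above.  */
static float
upper_bound (double value)
{
  float result = (float) value;
  int round = !(value < (double) result);

  if (round && (double) result != value)
    result = step_float (result, 1);
  return result;
}
```

With the un-negated test in lower_bound, a value such as 0.1 (whose float
conversion rounds up) would keep a lower bound above the true value, which is
the kind of mistake the one-character fix addresses.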

So, in theory this could be fixed just by adding one ! character,
-  if ((mode_composite || (real_isneg (&inf) ? real_less (&result, &value)
+  if ((mode_composite || (real_isneg (&inf) ? !real_less (&result, &value)
                          : !real_less (&value, &result)))
but the following patch goes further.  The distance between
nextafter (inf, -inf) and inf is large (infinite) and expressions like
1.0e+300 * 1.0e+300 always produce +inf in round to nearest mode by far,
so I think having low bound of nextafter (inf, -inf) in that case is
unnecessary.  But if it isn't multiplication but say addition and we are
inexact and very close to the boundary between rounding to nearest
maximum representable vs. rounding to nearest +inf, still using [MAX, +INF]
etc. ranges seems safer because we don't know exactly what we lost in the
inexact computation.

The following patch implements that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-12-07  Jakub Jelinek  

PR tree-optimization/107967
* range-op-float.cc (frange_arithmetic): Fix a thinko - if
inf is negative, use nextafter if !real_less (&result, &value)
rather than if real_less (&result, &value).  If result is +-INF
while value is finite and -fno-rounding-math, don't do rounding
if !inexact or if result is significantly above max representable
value or below min representable value.

* gcc.dg/pr107967-1.c: New test.
* gcc.dg/pr107967-2.c: New test.
* gcc.dg/pr107967-3.c: New test.

--- gcc/range-op-float.cc.jj2022-12-06 10:25:16.594848892 +0100
+++ gcc/range-op-float.cc   2022-12-06 20:53:47.751295689 +0100
@@ -287,9 +287,64 @@ frange_arithmetic (enum tree_code code,
  
   // Be extra careful if there may be discrepancies between the
   // compile and runtime results.
-  if ((mode_composite || (real_isneg (&inf) ? real_less (&result, &value)
-                          : !real_less (&value, &result)))
-      && (inexact || !real_identical (&result, &value)))
+  bool round = false;
+  if (mode_composite)
+    round = true;
+  else if (real_isneg (&inf))
+    {
+      round = !real_less (&result, &value);
+      if (real_isinf (&result, false)
+          && !real_isinf (&value)
+          && !flag_rounding_math)
+        {
+          // Use just [+INF, +INF] rather than [MAX, +INF]
+          // even if value is larger than MAX and rounds to
+          // nearest to +INF.  Unless INEXACT is true, in
+          // that case we need some extra buffer.
+          if (!inexact)
+            round = false;
+          else
+            {
+

Re: [PATCH V3] Use reg mode to move sub blocks for parameters and returns

2022-12-07 Thread Segher Boessenkool

Hi!

On Wed, Dec 07, 2022 at 08:00:08PM +0800, Jiufu Guo wrote:
> When assigning a parameter to a variable, or assigning a variable to
> a return value of struct type, a "block move" is used to expand
> the assignment. It would be better to use the register mode according
> to the target/ABI to move the blocks if the parameter/return value is
> passed through registers. This would then create more opportunities for
> other optimization passes (cse/dse/xprop).
> 
> For example (similar to the code in PR65421):
> 
> typedef struct SA {double a[3];} A;
> A ret_arg_pt (A *a) {return *a;} // on ppc64le, expect only 3 lfd(s)
> A ret_arg (A a) {return a;} // just empty fun body
> void st_arg (A a, A *p) {*p = a;} //only 3 stfd(s)

What is this like if you use [5] instead?  Or use an ABI without
homogeneous aggregates?

> +static void
> +move_sub_blocks (rtx to_rtx, tree from, machine_mode sub_mode, bool nontemporal)
> +{
> +  HOST_WIDE_INT size, sub_size;
> +  int len;
> +
> +  gcc_assert (MEM_P (to_rtx));
> +
> +  size = MEM_SIZE (to_rtx).to_constant ();
> +  sub_size = GET_MODE_SIZE (sub_mode).to_constant ();
> +  len = size / sub_size;

Unrelated, but a pet peeve: it is much more modern (and imo much better
taste) to not put all declarations at the start; just declare at first
use:

  gcc_assert (MEM_P (to_rtx));

  HOST_WIDE_INT size = MEM_SIZE (to_rtx).to_constant ();
  HOST_WIDE_INT sub_size = GET_MODE_SIZE (sub_mode).to_constant ();
  int len = size / sub_size;

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
> @@ -0,0 +1,15 @@
> +/* PR target/65421 */
> +/* { dg-options "-O2" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +
> +typedef struct SA
> +{
> +  double a[2];
> +  long l;
> +} A;
> +
> +/* std 3 param regs to return slot */
> +A ret_arg (A a) {return a;}
> +/* { dg-final { scan-assembler-times {\mstd 4,0\(3\)\s} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstd 5,8\(3\)\s} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstd 6,16\(3\)\s} 1 } } */

This is only correct on certain ABIs, probably only ELFv2 even.


We certainly can improve the homogeneous aggregates stuff, but please
make sure you don't degrade all other stuff?  Older, as well as when
things are not an homogeneous aggregate, for example too big.  Can you
please add tests for such cases?


Segher

Re: [PATCH] range-op-float: frange_arithmetic tweaks for MODE_COMPOSITE_P

2022-12-07 Thread Aldy Hernandez via Gcc-patches





On 12/7/22 13:10, Jakub Jelinek wrote:

Hi!

As mentioned in PR107967, ibm-ldouble-format documents that
+- has 1ulp accuracy, * 2ulps and / 3ulps.
So, even if the result is exact, we need to widen the range a little bit.

The following patch does that.  I just wonder what it means for reverse
division (the op1_range case), which we implement through multiplication,
when division has 3ulps error and multiplication just 2ulps.  In any case,
this format is a mess and for non-default rounding modes can't be trusted
at all, instead of +inf or something close to it it happily computes -inf.

2022-12-07  Jakub Jelinek  

* range-op-float.cc (frange_arithmetic): For mode_composite,
on top of rounding in the right direction accept extra 1ulp
error for PLUS/MINUS_EXPR, extra 2ulps error for MULT_EXPR
and extra 3ulps error for RDIV_EXPR.

--- gcc/range-op-float.cc.jj2022-12-07 12:46:01.536123757 +0100
+++ gcc/range-op-float.cc   2022-12-07 12:50:40.812085139 +0100
@@ -344,22 +344,70 @@ frange_arithmetic (enum tree_code code,
}
}
  }
-  if (round && (inexact || !real_identical (&result, &value)))
+  if (!inexact && !real_identical (&result, &value))
+inexact = true;
+  if (round && (inexact || mode_composite))
  {
if (mode_composite)
{
- if (real_isdenormal (&result, mode)
- || real_iszero (&result))
+ if (real_isdenormal (&result, mode) || real_iszero (&result))
{
  // IBM extended denormals only have DFmode precision.
  REAL_VALUE_TYPE tmp;
  real_convert (&tmp, DFmode, &value);
- frange_nextafter (DFmode, tmp, inf);
+ if (inexact)
+   frange_nextafter (DFmode, tmp, inf);
+ switch (code)
+   {
+   case PLUS_EXPR:
+   case MINUS_EXPR:
+ // ibm-ldouble-format documents 1ulp for + and -.
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   case MULT_EXPR:
+ // ibm-ldouble-format documents 2ulps for *.
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   case RDIV_EXPR:
+ // ibm-ldouble-format documents 3ulps for /.
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   default:
+ if (!inexact)
+   return;
+ break;


It looks like this chunk...



+   }
  real_convert (&result, mode, &tmp);
  return;
}
}
-  frange_nextafter (mode, result, inf);
+  if (inexact)
+   frange_nextafter (mode, result, inf);
+  if (mode_composite)
+   switch (code)
+ {
+ case PLUS_EXPR:
+ case MINUS_EXPR:
+   // ibm-ldouble-format documents 1ulp for + and -.
+   frange_nextafter (mode, result, inf);
+   break;
+ case MULT_EXPR:
+   // ibm-ldouble-format documents 2ulps for *.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+ case RDIV_EXPR:
+   // ibm-ldouble-format documents 3ulps for /.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+ default:
+   break;
+ }


...is the same as this chunk.  Plus, all this mode composite stuff is 
polluting what was a rather clean function.  Would it be possible to 
abstract this into an inline function, and then we could do:


if (mode_composite)
  frange_composite_nextafter (...);
else
  frange_nextafter (...);

or perhaps abstract the whole nextafter in frange_arithmetic into:

frange_arithmetic_nextafter () {
  if (mode_composite) { do ugly stuff }
  else frange_nextafter (...)
}

I'm most worried about maintainability, not correctness here, cause you 
obviously know what you're doing ;-).
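
One possible shape for such a helper, sketched in plain C rather than against
the GCC sources.  Everything here is an assumption made for illustration: the
enum, composite_extra_ulps, widen_bound and count_step are hypothetical names,
and count_step merely counts calls where the real code would call
frange_nextafter.  The ulp counts are the ones the patch cites from
ibm-ldouble-format.

```c
#include <stddef.h>

/* Hypothetical stand-ins for PLUS_EXPR, MINUS_EXPR, MULT_EXPR, RDIV_EXPR
   and any other tree code.  */
enum arith_code { CODE_PLUS, CODE_MINUS, CODE_MULT, CODE_RDIV, CODE_OTHER };

/* Extra ulps of slack for the composite (IBM long double) format:
   1 for + and -, 2 for *, 3 for /, none otherwise.  */
static int
composite_extra_ulps (enum arith_code code)
{
  switch (code)
    {
    case CODE_PLUS:
    case CODE_MINUS:
      return 1;
    case CODE_MULT:
      return 2;
    case CODE_RDIV:
      return 3;
    default:
      return 0;
    }
}

/* Callback that moves a bound one step toward the chosen infinity.  */
typedef void (*step_fn) (void *bound);

/* Demo stepper: counts how many steps were requested.  */
static void
count_step (void *bound)
{
  ++*(int *) bound;
}

/* Apply one step if the computation was inexact, plus the documented
   extra steps for composite modes.  Return nonzero if BOUND moved.  */
static int
widen_bound (enum arith_code code, int composite, int inexact,
	     void *bound, step_fn step)
{
  int steps = (inexact ? 1 : 0)
	      + (composite ? composite_extra_ulps (code) : 0);

  for (int i = 0; i < steps; i++)
    step (bound);
  return steps > 0;
}
```

Returning whether any step was taken lets the caller keep the early return
for the "exact result, no extra slack" case without repeating the switch in
both the DFmode and the full-mode paths.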


Aldy



  }
  }
  


Jakub

Re: [PATCH v5 2/4] OpenMP/OpenACC: Rework clause expansion and nested struct handling

2022-12-07 Thread Julian Brown

On Wed, 7 Dec 2022 15:54:42 +0100
Tobias Burnus  wrote:

> Hi Julian,
> 
> If I understand Deepak's comment (on OpenMP.org's omp-lang list, sorry
> it is a nonpublic list) correctly, the following wording implies that
> a 'from: s.w[z:4]' for a pointer 's.w' also implies a mapping of
> 's.w' - if 's' is used inside the target region and, thus, gets
> implicitly mapped.
> 
> [TR11 157:21-26] (approx. [5.2 154:22-27], [5.1 352:17-22], [5.0
> 320:22-27])
> 
> "If a list item with an implicit data-mapping attribute does not have
> any corresponding storage in the device data environment prior to a
> task encountering the construct associated with the map clause, and
> one or more contiguous parts of the original storage are either list
> items or base pointers to list items that are explicitly mapped on
> the construct, only those parts of the original storage will have
> corresponding storage in the device data environment as a result of
> the map clauses on the construct."

Hmmm... IIRC that is a different conclusion than the one we have
understood previously, leading to e.g. the patch here (Chung-Lin CC'ed):

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html

Follow-on discussion then questioned whether the change was really the
intention of the spec, but we thought it was.  Has that changed now?

(I think actually changing the behaviour is a matter of flipping a
switch, but let's make sure we choose the right setting!)

Thanks,

Julian

Re: [PATCH 1/2] select .rodata for const volatile variables.

2022-12-07 Thread Cupertino Miranda via Gcc-patches



Hi Jeff,

First of all thanks for your quick review.
Apologies for the delay replying, the message got lost in my inbox.

> On 12/2/22 10:52, Cupertino Miranda via Gcc-patches wrote:
>> Changed target code to select .rodata section for 'const volatile'
>> defined variables.
>> This change is in the context of the bugzilla #170181.
>> gcc/ChangeLog:
>>  v850.c(v850_select_section): Changed function.
> I'm not sure this is safe/correct.  ISTM that you need to look at the
> underlying TREE_TYPE to check for const-volatile rather than
> TREE_SIDE_EFFECTS.

I believe this was asked by Jose when he first sent the generic patches.
Please notice my change is influenced by his original patch that does
the same and was approved.

https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599348.html
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602374.html

>
> Of secondary importance is the ChangeLog.  Just saying "Changed function"
> provides no real information.  Something like this would be better:
>
>   * config/v850/v850.c (v850_select_section): Put const volatile
>   objects into read-only sections.
>
>
> Jeff
>
>
>
>
>> ---
>>   gcc/config/v850/v850.cc | 1 -
>>   1 file changed, 1 deletion(-)
>> diff --git a/gcc/config/v850/v850.cc b/gcc/config/v850/v850.cc
>> index c7d432990ab..e66893fede4 100644
>> --- a/gcc/config/v850/v850.cc
>> +++ b/gcc/config/v850/v850.cc
>> @@ -2865,7 +2865,6 @@ v850_select_section (tree exp,
>>   {
>> int is_const;
>> if (!TREE_READONLY (exp)
>> -  || TREE_SIDE_EFFECTS (exp)
>>|| !DECL_INITIAL (exp)
>>|| (DECL_INITIAL (exp) != error_mark_node
>>&& !TREE_CONSTANT (DECL_INITIAL (exp))))

Re: [PATCH] Fix a few incorrect accesses.

2022-12-07 Thread Andrew MacLeod via Gcc-patches




On 12/7/22 05:08, Thomas Schwinge wrote:

Hi Andrew!

On 2022-12-02T09:12:23-0500, Andrew MacLeod via Gcc-patches wrote:

This consists of 3 changes which stronger type checking has indicated
are non-compliant with the type field.

I'm curious what that "stronger type checking" is?



Remnants of an old project which replaces the uses of trees to represent 
types everywhere in GCC with a new type pointer. This provided much 
stronger compile time type checking.   These few places in the patch 
caused compile time errors because the accesses were not type nodes.


They were hanging around in an old branch from 2017 I was looking at, so 
I figured it was time to get them into trunk :-)



Andrew

Re: [PATCH][AArch64] Cleanup move immediate code

2022-12-07 Thread Andreas Schwab via Gcc-patches

FAIL: gcc.target/aarch64/sve/cond_arith_5.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/const_3.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/loop_add_5.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/mask_load_slp_1.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/mul_highpart_3.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/slp_13.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/slp_2.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/slp_8.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/slp_9.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/spill_4.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/spill_6.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/vcond_18.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/vcond_19.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/vcond_20.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/vcond_3.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/vcond_7.c (internal compiler error: in 
aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=c90 -O0 -g -DTEST_FULL 
(internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=c90 -O0 -g 
-DTEST_OVERLOADS (internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=c90 -O1 -g -DTEST_FULL 
(internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=c90 -O1 -g 
-DTEST_OVERLOADS (internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=c99 -O2 -g -DTEST_FULL 
(internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=c99 -O2 -g 
-DTEST_OVERLOADS (internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=c11 -O3 -g -DTEST_FULL 
(internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=c11 -O3 -g 
-DTEST_OVERLOADS (internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=gnu90 -O2 
-fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_FULL (internal compiler 
error: in aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=gnu90 -O2 
-fno-schedule-insns -DCHECK_ASM --save-temps -DTEST_OVERLOADS (internal 
compiler error: in aarch64_move_imm, at config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=gnu99 -Ofast -g 
-DTEST_FULL (internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=gnu99 -Ofast -g 
-DTEST_OVERLOADS (internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=gnu11 -Os -g -DTEST_FULL 
(internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_f16.c  -std=gnu11 -Os -g 
-DTEST_OVERLOADS (internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_s16.c  -std=c90 -O0 -g -DTEST_FULL 
(internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_s16.c  -std=c90 -O0 -g 
-DTEST_OVERLOADS (internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_s16.c  -std=c90 -O1 -g -DTEST_FULL 
(internal compiler error: in aarch64_move_imm, at 
config/aarch64/aarch64.cc:5692)
FAIL: gcc.target/aarch64/sve/acle/asm/dup_s16.c  -std=c90 -O1 -g 
-DTEST_OVERLOADS (internal compiler error: in aarch64_move_imm, at

Re: [PATCH v5 2/4] OpenMP/OpenACC: Rework clause expansion and nested struct handling

2022-12-07 Thread Tobias Burnus


Hi Julian,

If I understand Deepak's comment (on OpenMP.org's omp-lang list, sorry
it is a nonpublic list) correctly, the following wording implies that a
'from: s.w[z:4]' for a pointer 's.w' also implies a mapping of 's.w' -
if 's' is used inside the target region and, thus, gets implicitly mapped.

[TR11 157:21-26] (approx. [5.2 154:22-27], [5.1 352:17-22], [5.0 320:22-27])

"If a list item with an implicit data-mapping attribute does not have any 
corresponding storage in the device data environment prior to a task encountering the 
construct associated with the map clause, and one or more contiguous parts of the 
original storage are either list items or base pointers to list items that are explicitly 
mapped on the construct, only those parts of the original storage will have corresponding 
storage in the device data environment as a result of the map clauses on the 
construct."

Thus, the following change should not be required – but if I undo it, I see a 
libgomp runtime error. Hence, it looks as if you need to fix this:

On 18.10.22 12:39, Julian Brown wrote:

--- a/libgomp/testsuite/libgomp.c/target-22.c
+++ b/libgomp/testsuite/libgomp.c/target-22.c
@@ -21,7 +21,8 @@ main ()
s.v.b = a + 16;
s.w = c + 3;
int err = 0;
-  #pragma omp target map (to:s.v.b[0:z + 7], s.u[z + 1:z + 4]) \
+  #pragma omp target map (to: s.w, s.v.b, s.u, s.s) \
+  map (to:s.v.b[0:z + 7], s.u[z + 1:z + 4]) \
   map (tofrom:s.s[3:3]) \
   map (from: s.w[z:4], err) private (i)


Thanks,

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Re: [PATCH v2 1/2] RISC-V: Support _Float16 type.

2022-12-07 Thread Kito Cheng via Gcc-patches

Hi Maciej:

It wasn't intentional; I suspect it happened because I ported this from our
old internal GCC branch.  I'll send a patch to fix it later.  Thanks for
catching this!

On Mon, Dec 5, 2022 at 21:05, Maciej W. Rozycki wrote:

> Hi Kito,
>
>  I came across this issue while inspecting code and I have been wondering
> what the reason was to downgrade current FMV.X.W and FMV.W.X instructions
> to their older FMV.X.S and FMV.S.X variants here:
>
> On Wed, 10 Aug 2022, Kito Cheng wrote:
>
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 5a0adffb5ce..47e6110767c 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -2308,10 +2310,19 @@ riscv_output_move (rtx dest, rtx src)
> >if (dest_code == REG && GP_REG_P (REGNO (dest)))
> >  {
> >if (src_code == REG && FP_REG_P (REGNO (src)))
> > - return dbl_p ? "fmv.x.d\t%0,%1" : "fmv.x.w\t%0,%1";
> > + switch (width)
> > +   {
> > +   case 2:
> > + /* Using fmv.x.s + sign-extend to emulate fmv.x.h.  */
> > + return "fmv.x.s\t%0,%1;slli\t%0,%0,16;srai\t%0,%0,16";
> > +   case 4:
> > + return "fmv.x.s\t%0,%1";
> > +   case 8:
> > + return "fmv.x.d\t%0,%1";
> > +   }
>
> and here:
>
> > @@ -2353,18 +2364,24 @@ riscv_output_move (rtx dest, rtx src)
> >   return "mv\t%0,%z1";
> >
> > if (FP_REG_P (REGNO (dest)))
> > - {
> > -   if (!dbl_p)
> > - return "fmv.w.x\t%0,%z1";
> > -   if (TARGET_64BIT)
> > - return "fmv.d.x\t%0,%z1";
> > -   /* in RV32, we can emulate fmv.d.x %0, x0 using fcvt.d.w */
> > -   gcc_assert (src == CONST0_RTX (mode));
> > -   return "fcvt.d.w\t%0,x0";
> > - }
> > + switch (width)
> > +   {
> > +   case 2:
> > + /* High 16 bits should be all-1, otherwise HW will treated
> > +as a n-bit canonical NaN, but isn't matter for softfloat.  */
> > + return "fmv.s.x\t%0,%1";
> > +   case 4:
> > + return "fmv.s.x\t%0,%z1";
> > +   case 8:
> > + if (TARGET_64BIT)
> > +   return "fmv.d.x\t%0,%z1";
> > + /* in RV32, we can emulate fmv.d.x %0, x0 using fcvt.d.w */
>
> (Incorrect comment formatting here as well.)
>
> > + gcc_assert (src == CONST0_RTX (mode));
> > + return "fcvt.d.w\t%0,x0";
> > +   }
>
> Was it intentional or just an oversight in review?  If intentional, I'd
> expect such a change to happen on its own rather than sneaked in with a
> large functional update.
>
>   Maciej
>
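
For readers unfamiliar with the two 16-bit cases in the quoted hunks, the bit
manipulation can be sketched in C.  This models a 32-bit register with plain
integers and is only an illustration: the function names are hypothetical and
nothing here is taken from the RISC-V backend.

```c
#include <stdint.h>

/* Effect of "slli rd,rd,16; srai rd,rd,16" on a 32-bit register:
   sign-extend the low 16 bits.  Written without relying on
   implementation-defined right shifts of negative values.  */
static int32_t
sign_extend_16 (uint32_t reg)
{
  return (int32_t) ((reg & 0xffffu) ^ 0x8000u) - 0x8000;
}

/* NaN-box a half-precision bit pattern into a 32-bit FP register image:
   the upper 16 bits are all ones.  */
static uint32_t
nan_box_half (uint16_t h)
{
  return 0xffff0000u | h;
}

/* A 32-bit register image holds a valid boxed half only if its upper
   16 bits are all ones.  */
static int
is_nan_boxed_half (uint32_t reg)
{
  return (reg >> 16) == 0xffffu;
}
```

Here 0x3c00 is the half-precision encoding of 1.0 and 0xbc00 that of -1.0;
sign_extend_16 applied to a register holding 0xbc00 in its low half yields a
negative integer, while nan_box_half (0x3c00) yields 0xffff3c00.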

[PATCH] tree-optimization/106904 - bogus -Wstringopt-overflow with vectors

2022-12-07 Thread Richard Biener via Gcc-patches

The following avoids CSE of &ps->wp to &ps->wp.hwnd confusing
-Wstringop-overflow by making sure to produce addresses to the
biggest container from vectorization.  For this I introduce
strip_zero_offset_components which turns &ps->wp.hwnd into
&(*ps) and use that to base the vector data references on.
That will also work for addresses with variable components,
alternatively emitting pointer arithmetic via calling
get_inner_reference and gimplifying that would be possible
but likely more intrusive.
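
The idea of stripping zero-offset components can be illustrated with a toy
model in plain C.  This is a sketch under stated assumptions, not GCC code:
struct ref and strip_zero_offset are hypothetical, with struct ref standing in
for a chain of COMPONENT_REFs and byte_offset for the field offset.  The two
struct types are copied from the testcase in this patch.

```c
#include <stddef.h>

struct packed_windowpos { int hwnd; int pad1; int hwnd2; int pad2; };
struct packed_structs { struct packed_windowpos wp; };

/* Toy reference node: a field at BYTE_OFFSET inside BASE.  A node with
   BASE == NULL is the innermost object (the *ps of the testcase).  */
struct ref
{
  const struct ref *base;
  size_t byte_offset;
};

/* Walk outwards while the current component sits at offset zero, so the
   result names the biggest container sharing the same address.  */
static const struct ref *
strip_zero_offset (const struct ref *r)
{
  while (r->base != NULL && r->byte_offset == 0)
    r = r->base;
  return r;
}
```

A reference to wp.hwnd (offset 0 inside wp, itself at offset 0 inside the
object) strips all the way to the object, while a reference to wp.hwnd2 is
left alone because its offset is nonzero.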

This is by no means a complete fix for all of those issues
(avoiding ADDR_EXPRs in favor of pointer arithmetic might be).
Other passes will have similar issues.

In theory that might now cause false negatives.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Any opinion?

Thanks,
Richard.

PR tree-optimization/106904
* tree.h (strip_zero_offset_components): Declare.
* tree.cc (strip_zero_offset_components): Define.
* tree-vect-data-refs.cc (vect_create_addr_base_for_vector_ref):
Strip zero offset components before building the address.

* gcc.dg/Wstringop-overflow-pr106904.c: New testcase.
---
 .../gcc.dg/Wstringop-overflow-pr106904.c  | 30 +++
 gcc/tree-vect-data-refs.cc| 12 
 gcc/tree.cc   | 12 
 gcc/tree.h|  1 +
 4 files changed, 50 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wstringop-overflow-pr106904.c

diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-pr106904.c b/gcc/testsuite/gcc.dg/Wstringop-overflow-pr106904.c
new file mode 100644
index 00000000000..15e67c28c15
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-pr106904.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wstringop-overflow -fno-vect-cost-model" } */
+
+struct windowpos
+{
+  int hwnd;
+  int hwnd2;
+};
+
+struct packed_windowpos
+{
+  int hwnd;
+  int pad1;
+  int hwnd2;
+  int pad2;
+};
+
+struct packed_structs
+{
+  struct packed_windowpos wp;
+};
+
+void func(struct packed_structs *ps)
+{
+  struct windowpos wp;
+
+  wp.hwnd = ps->wp.hwnd;
+  wp.hwnd2 = ps->wp.hwnd2;
+  __builtin_memcpy(&ps->wp, &wp, sizeof(wp)); /* { dg-bogus "into a region" } */
+}
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 6c892791bd4..18b0f962670 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -4845,11 +4845,13 @@ vect_create_addr_base_for_vector_ref (vec_info *vinfo, stmt_vec_info stmt_info,
   if (loop_vinfo)
 addr_base = fold_build_pointer_plus (data_ref_base, base_offset);
   else
-{
-  addr_base = build1 (ADDR_EXPR,
- build_pointer_type (TREE_TYPE (DR_REF (dr))),
- unshare_expr (DR_REF (dr)));
-}
+addr_base = build1 (ADDR_EXPR,
+   build_pointer_type (TREE_TYPE (DR_REF (dr))),
+   /* Strip zero offset components since we don't need
+  them and they can confuse late diagnostics if
+  we CSE them wrongly.  See PR106904 for example.  */
+   unshare_expr (strip_zero_offset_components
+   (DR_REF (dr))));
 
   vect_ptr_type = build_pointer_type (TREE_TYPE (DR_REF (dr)));
   dest = vect_get_new_vect_var (vect_ptr_type, vect_pointer_var, base_name);
diff --git a/gcc/tree.cc b/gcc/tree.cc
index b40c95ae8c4..0a51f9ddb4d 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -12014,6 +12014,18 @@ strip_invariant_refs (const_tree op)
   return op;
 }
 
+/* Strip handled components with zero offset from OP.  */
+
+tree
+strip_zero_offset_components (tree op)
+{
+  while (TREE_CODE (op) == COMPONENT_REF
+&& integer_zerop (DECL_FIELD_OFFSET (TREE_OPERAND (op, 1)))
+&& integer_zerop (DECL_FIELD_BIT_OFFSET (TREE_OPERAND (op, 1))))
+op = TREE_OPERAND (op, 0);
+  return op;
+}
+
 static GTY(()) tree gcc_eh_personality_decl;
 
 /* Return the GCC personality function decl.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 1c810c0b21b..065ad527c3f 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5373,6 +5373,7 @@ extern bool tree_nop_conversion_p (const_tree, const_tree);
 extern tree tree_strip_nop_conversions (tree);
 extern tree tree_strip_sign_nop_conversions (tree);
 extern const_tree strip_invariant_refs (const_tree);
+extern tree strip_zero_offset_components (tree);
 extern tree lhd_gcc_personality (void);
 extern void assign_assembler_name_if_needed (tree);
 extern bool warn_deprecated_use (tree, tree);
-- 
2.35.3

Re: [PATCH] Fix aarch64 PR 99657: ICE with SVE types used without an error

2022-12-07 Thread Richard Sandiford via Gcc-patches

"Kewen.Lin"  writes:
> Hi Richard,
>
> on 2022/12/7 17:16, Richard Sandiford wrote:
>> "Kewen.Lin"  writes:
>>> Hi,
>>>
>>> In the recent discussion on how to make some built-in type only valid for
>>> some target features efficiently[1], Andrew mentioned this patch which he
>>> made previously (Thanks!).  I confirmed it can help with the related
>>> rs6000 issue, and noticed PR99657 is still open, so I think we still
>>> want this to be reviewed.
>> 
>> But does it work for things like:
>> 
>> void f(foo_t *x, foo_t *y) { *x = *y; }
>> 
>> where no variables are being created with foo_t type?
>> 
>
> I think it can work for this case as it touches build_indirect_ref.

Ah, ok.  But indirecting through a pointer doesn't seem to match
TCTX_AUTO_STORAGE.

I guess another case is where there are global variables of the type
that you want to forbid, compiled while the target feature is enabled,
and then a function tries to access those variables with the target
feature locally disabled (through a pragma or attribute).  Does that
case work?

That's not an issue for SVE because global variables can't have
sizeless type.

>> That's not to say we shouldn't have the patch.  I'm just not sure
>> it can be the complete solution.
>
> I'm not sure about that either; maybe Andrew has more insights.
> But as you pointed out in [1], I doubt that trying to find all invalid
> uses of a built-in type is worthwhile; catching the usual cases
> seems enough and practical.  So if this verify_type_context
> framework can cover most uses, maybe it's a good direction
> to go in and extend.

IMO it depends on what we're trying to protect against.  If the
compiler can handle these types correctly even when the target feature
is disabled, and we're simply disallowing the types for policy rather
than correctness reasons, then maybe just handling the usual cases is
good enough.  But things are different if the compiler is going to ICE
or generate invalid code when something slips through.  In that case,
I think the niche cases matter too.

Thanks,
Richard

[PATCH] range-op-float: frange_arithmetic tweaks for MODE_COMPOSITE_P

2022-12-07 Thread Jakub Jelinek via Gcc-patches

Hi!

As mentioned in PR107967, ibm-ldouble-format documents that
+- has 1ulp accuracy, * 2ulps and / 3ulps.
So, even if the result is exact, we need to widen the range a little bit.

The following patch does that.  I just wonder what it means for reverse
division (the op1_range case), which we implement through multiplication,
when division has 3ulps error and multiplication just 2ulps.  In any case,
this format is a mess and for non-default rounding modes can't be trusted
at all, instead of +inf or something close to it it happily computes -inf.
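
To make the widening concrete, here is a minimal sketch on plain IEEE double (not the IBM double-double format, and not GCC's frange_nextafter).  The names step_up, extra_ulps and widen_upper are hypothetical; the sketch only mirrors the rule described above: one step if the result was inexact, plus 1, 2 or 3 extra steps for +/-, * and / respectively.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical operation tags; these are not GCC's tree codes.  */
enum op { OP_PLUS, OP_MINUS, OP_MULT, OP_RDIV };

/* Step X to the next representable double toward +inf by adjusting
   its bit pattern.  NaN and +inf are returned unchanged.  */
static double
step_up (double x)
{
  uint64_t bits;

  if (x != x || x > DBL_MAX)
    return x;
  if (x == 0.0)
    bits = 1;			/* Smallest positive subnormal.  */
  else
    {
      memcpy (&bits, &x, sizeof bits);
      if (x > 0.0)
	bits++;			/* Larger magnitude.  */
      else
	bits--;			/* Smaller magnitude, i.e. toward +inf.  */
    }
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Extra ulps of slack assumed per operation (1 for + and -, 2 for *,
   3 for /), following the figures quoted above.  */
static int
extra_ulps (enum op code)
{
  switch (code)
    {
    case OP_PLUS:
    case OP_MINUS:
      return 1;
    case OP_MULT:
      return 2;
    case OP_RDIV:
      return 3;
    }
  return 0;
}

/* Widen an upper bound UB: one step if the computation was inexact,
   plus the per-operation slack.  */
static double
widen_upper (double ub, enum op code, int inexact)
{
  int steps = extra_ulps (code) + (inexact ? 1 : 0);

  assert (steps >= 0);
  for (int i = 0; i < steps; i++)
    ub = step_up (ub);
  return ub;
}
```

A lower bound would be handled symmetrically by stepping toward -inf.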

2022-12-07  Jakub Jelinek  

* range-op-float.cc (frange_arithmetic): For mode_composite,
on top of rounding in the right direction accept extra 1ulp
error for PLUS/MINUS_EXPR, extra 2ulps error for MULT_EXPR
and extra 3ulps error for RDIV_EXPR.

--- gcc/range-op-float.cc.jj2022-12-07 12:46:01.536123757 +0100
+++ gcc/range-op-float.cc   2022-12-07 12:50:40.812085139 +0100
@@ -344,22 +344,70 @@ frange_arithmetic (enum tree_code code,
}
}
 }
-  if (round && (inexact || !real_identical (, )))
+  if (!inexact && !real_identical (, ))
+inexact = true;
+  if (round && (inexact || mode_composite))
 {
   if (mode_composite)
{
- if (real_isdenormal (, mode)
- || real_iszero ())
+ if (real_isdenormal (, mode) || real_iszero ())
{
  // IBM extended denormals only have DFmode precision.
  REAL_VALUE_TYPE tmp;
  real_convert (, DFmode, );
- frange_nextafter (DFmode, tmp, inf);
+ if (inexact)
+   frange_nextafter (DFmode, tmp, inf);
+ switch (code)
+   {
+   case PLUS_EXPR:
+   case MINUS_EXPR:
+ // ibm-ldouble-format documents 1ulp for + and -.
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   case MULT_EXPR:
+ // ibm-ldouble-format documents 2ulps for *.
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   case RDIV_EXPR:
+ // ibm-ldouble-format documents 3ulps for /.
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ frange_nextafter (DFmode, tmp, inf);
+ break;
+   default:
+ if (!inexact)
+   return;
+ break;
+   }
  real_convert (, mode, );
  return;
}
}
-  frange_nextafter (mode, result, inf);
+  if (inexact)
+   frange_nextafter (mode, result, inf);
+  if (mode_composite)
+   switch (code)
+ {
+ case PLUS_EXPR:
+ case MINUS_EXPR:
+   // ibm-ldouble-format documents 1ulp for + and -.
+   frange_nextafter (mode, result, inf);
+   break;
+ case MULT_EXPR:
+   // ibm-ldouble-format documents 2ulps for *.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+ case RDIV_EXPR:
+   // ibm-ldouble-format documents 3ulps for /.
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   frange_nextafter (mode, result, inf);
+   break;
+ default:
+   break;
+ }
 }
 }
 

Jakub

[PATCH V3] Use reg mode to move sub blocks for parameters and returns

2022-12-07 Thread Jiufu Guo via Gcc-patches

Hi,

When assigning a parameter to a variable, or assigning a variable to a
return value of struct type, a "block move" is used to expand the
assignment.  It would be better to use the register mode according to
the target/ABI to move the blocks if the parameter/return value is
passed through registers.  This in turn would open up more
opportunities for other optimization passes (cse/dse/xprop).

For example (similar to the code in PR65421):

typedef struct SA {double a[3];} A;
A ret_arg_pt (A *a) {return *a;} // on ppc64le, expect only 3 lfd(s)
A ret_arg (A a) {return a;} // just empty fun body
void st_arg (A a, A *p) {*p = a;} //only 3 stfd(s)

This patch checks the "from" and "to" of an assignment in
"expand_assignment"; if it involves a parameter or return value that
may be passed via registers, then the register mode is used to move
sub-blocks for the assignment.
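
The idea can be sketched in plain C, outside of RTL expansion.  This is only an illustration under assumptions: move_sub_blocks_sketch is a hypothetical name, memcpy of one chunk stands in for a register-mode move, and the return value exists only so the behaviour can be observed.  As in the patch, a size that is not a multiple of the chunk size falls back to an ordinary block copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The struct from the example above.  */
typedef struct SA { double a[3]; } A;

/* Copy SIZE bytes from SRC to DST in CHUNK-sized pieces, where CHUNK
   stands in for the size of the register mode.  Returns the number of
   chunk moves, or 0 when it fell back to a single block copy because
   SIZE is not a multiple of CHUNK.  */
static size_t
move_sub_blocks_sketch (void *dst, const void *src, size_t size,
			size_t chunk)
{
  assert (chunk != 0);

  if (size % chunk != 0)
    {
      memcpy (dst, src, size);
      return 0;
    }

  size_t len = size / chunk;
  for (size_t i = 0; i < len; i++)
    memcpy ((char *) dst + i * chunk, (const char *) src + i * chunk,
	    chunk);
  return len;
}
```

For the struct above and a chunk of sizeof (double), this performs three moves, which corresponds to the three lfd/stfd instructions expected on ppc64le.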

This patch is based on the discussions of the previous versions:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606498.html
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607480.html

Compared with the previous version, this patch uses a flag to mark
whether a parameter is passed via registers and stored to the stack.
When expanding an assignment, this flag is checked.  This avoids an
inaccurate guess during expansion of the assignment.  In this version,
testcase pr65421-1.c is updated to remove unnecessary checks.

Bootstrap and regtest pass on ppc64{,le} and x86_64.
Is this ok for trunk?

BR,
Jeff (Jiufu)


PR target/65421

gcc/ChangeLog:

* cfgexpand.cc (expand_used_vars): Update to mark DECL_USEDBY_RETURN_P
for returns.
* expr.cc (move_sub_blocks): New function.
(expand_assignment): Update to call move_sub_blocks for returns or
parameters.
* function.cc (assign_parm_setup_block): Update to mark
DECL_STACK_REGS_P for parameter.
* tree-core.h (struct tree_decl_common): Add comment.
* tree.h (DECL_USEDBY_RETURN_P): New define.
(DECL_STACK_REGS_P): New define.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr65421-1.c: New test.
* gcc.target/powerpc/pr65421.c: New test.

---
 gcc/cfgexpand.cc | 14 
 gcc/expr.cc  | 81 
 gcc/function.cc  |  3 +
 gcc/tree-core.h  |  4 +-
 gcc/tree.h   |  9 +++
 gcc/testsuite/gcc.target/powerpc/pr65421-1.c | 15 
 gcc/testsuite/gcc.target/powerpc/pr65421.c   | 24 ++
 7 files changed, 149 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index dd29c03..09b8ec64cea 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -2158,6 +2158,20 @@ expand_used_vars (bitmap forced_stack_vars)
 frame_phase = off ? align - off : 0;
   }
 
+  /* Collect VARs on returns.  */
+  if (DECL_RESULT (current_function_decl))
+{
+  edge_iterator ei;
+  edge e;
+  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
+   if (greturn *ret = safe_dyn_cast <greturn *> (last_stmt (e->src)))
+ {
+   tree val = gimple_return_retval (ret);
+   if (val && VAR_P (val))
+ DECL_USEDBY_RETURN_P (val) = 1;
+ }
+}
+
   /* Set TREE_USED on all variables in the local_decls.  */
   FOR_EACH_LOCAL_DECL (cfun, i, var)
 TREE_USED (var) = 1;
diff --git a/gcc/expr.cc b/gcc/expr.cc
index d9407432ea5..d61669d5662 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -5559,6 +5559,55 @@ mem_ref_refers_to_non_mem_p (tree ref)
   return non_mem_decl_p (base);
 }
 
+/* Sub routine of expand_assignment, invoked when assigning from a
+   parameter or assigning to a return val on struct type which may
+   be passed through registers.  The mode of register is used to
+   move the content for the assignment.
+
+   This routine generates code for expression FROM which is BLKmode,
+   and moves the generated content to TO_RTX by sub-blocks in SUB_MODE.  */
+
+static void
+move_sub_blocks (rtx to_rtx, tree from, machine_mode sub_mode, bool 
nontemporal)
+{
+  HOST_WIDE_INT size, sub_size;
+  int len;
+
+  gcc_assert (MEM_P (to_rtx));
+
+  size = MEM_SIZE (to_rtx).to_constant ();
+  sub_size = GET_MODE_SIZE (sub_mode).to_constant ();
+  len = size / sub_size;
+
+  /* It would not be profitable to move through sub-modes if the size
+     is not a multiple of the register mode size.  */
+  if ((size % sub_size) != 0)
+{
+  push_temp_slots ();
+  rtx result = store_expr (from, to_rtx, 0, nontemporal, false);
+  preserve_temp_slots (result);
+  pop_temp_slots ();
+  return;
+}
+
+  push_temp_slots ();
+
+  rtx from_rtx;
+  from_rtx = expand_expr (from, NULL_RTX, GET_MODE (to_rtx), EXPAND_NORMAL);
+  for (int i = 0; i < len; i++)
+{
+  rtx temp = gen_reg_rtx (sub_mode);
+  rtx src =

[committed] MAINTAINERS: Add myself as Rust front-end maintainer

2022-12-07 Thread arthur . cohen

From: Arthur Cohen 

Changelog:
* MAINTAINERS: Add Arthur Cohen as Rust front-end maintainer.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 55c5ef95806..d2eea049667 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -165,6 +165,7 @@ D front end Iain Buclaw 

 go Ian Lance Taylor
 objective-c/c++Mike Stump  
 objective-c/c++Iain Sandoe 
+Rust   Arthur Cohen
 
Various Maintainers
 
-- 
2.38.1

Re: [PATCH] Fix aarch64 PR 99657: ICE with SVE types used without an error

2022-12-07 Thread Kewen.Lin via Gcc-patches

Hi Richard,

on 2022/12/7 17:16, Richard Sandiford wrote:
> "Kewen.Lin"  writes:
>> Hi,
>>
>> In the recent discussion on how to make some built-in type only valid for
>> some target features efficiently[1], Andrew mentioned this patch which he
>> made previously (Thanks!).  I confirmed it can help rs6000 related issue,
>> and noticed PR99657 is still opened, so I think we still want this to
>> be reviewed.
> 
> But does it work for things like:
> 
> void f(foo_t *x, foo_t *y) { *x = *y; }
> 
> where no variables are being created with foo_t type?
> 

I think it can work for this case as it touches build_indirect_ref.

> That's not to say we shouldn't have the patch.  I'm just not sure
> it can be the complete solution.

I'm not sure about that either; maybe Andrew has more insights.
But as you pointed out in [1], I doubt that trying to find all invalid
uses of a built-in type is worthwhile; catching the usual cases
seems enough and practical.  So if this verify_type_context
framework can cover most uses, maybe it's a good direction
to go in and extend.

[1] https://gcc.gnu.org/pipermail/gcc/2022-December/240218.html

BR,
Kewen

Re: [PATCH] i386: fix assert (__builtin_cpu_supports ("x86-64") >= 0)

2022-12-07 Thread Jakub Jelinek via Gcc-patches

On Fri, Nov 25, 2022 at 01:57:35PM +0100, Martin Liška wrote:
> PR target/107551
> 
> gcc/ChangeLog:
> 
>   * config/i386/i386-builtins.cc (fold_builtin_cpu): Use same path
>   as for PR103661.
>   * doc/extend.texi: Fix "x86-64" use.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/i386/builtin_target.c: Add more checks.
> +
> +  field_val = (1U << feature);

Just
  field_val = 1U << feature;
?

> +  final = build2 (BIT_AND_EXPR, unsigned_type_node, array_elt,
> +   build_int_cstu (unsigned_type_node, field_val));
> +  if (feature == (INT_TYPE_SIZE - 1))

Just
  if (feature == INT_TYPE_SIZE - 1)
?

> + return build2 (NE_EXPR, integer_type_node, final,
> +build_int_cst (unsigned_type_node, 0));
> +  else
> + return build1 (NOP_EXPR, integer_type_node, final);
>  }
>gcc_unreachable ();
>  }

Otherwise LGTM, though I must say the distinction for when
__builtin_cpu_is and __builtin_cpu_supports works looks completely random.
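
For readers following the NE_EXPR versus NOP_EXPR split in the quoted hunk, here is a small sketch of why the top bit needs special care.  The helper names are hypothetical and this is ordinary C, not the folded tree code.  Converting a masked unsigned value with the top bit set to int is implementation-defined; GCC wraps it to a negative number, which is what would break a `>= 0` assertion.

```c
#include <assert.h>
#include <limits.h>

/* Mask out one feature bit and convert the result straight to int,
   as a NOP_EXPR conversion would.  For the top bit the conversion is
   implementation-defined; with GCC the result is negative.  */
static int
bit_test_nop (unsigned int word, unsigned int feature)
{
  unsigned int field_val = 1U << feature;
  return (int) (word & field_val);
}

/* Mask out one feature bit and compare against zero, as the NE_EXPR
   form does.  The result is always 0 or 1.  */
static int
bit_test_ne (unsigned int word, unsigned int feature)
{
  unsigned int field_val = 1U << feature;
  return (word & field_val) != 0;
}
```

For every bit other than the top one, both forms yield a non-negative value, so only the last bit needs the comparison.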

Jakub

[PATCH] tree-optimization/104475 - bogus -Wstringop-overflow

2022-12-07 Thread Richard Biener via Gcc-patches

The following avoids a bogus -Wstringop-overflow diagnostic by
properly recognizing that >m_mutex cannot be nullptr in C++
even if m_mutex is at offset zero.  The frontend already diagnoses
a >m_mutex != nullptr comparison and the following transfers
this knowledge to the middle-end which sees >m_mutex as
simple pointer arithmetic.  The new ADDR_NONZERO flag on an
ADDR_EXPR is used to carry this information and it's checked in
the tree_expr_nonzero_p API which causes this to be folded early.

To avoid the bogus diagnostic this avoids separating the nullptr
path via jump-threading by eliminating the nullptr check.
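
As a small illustration of the "offset zero" point, here is a sketch in C (the patch itself concerns the C++ front end).  The function name is hypothetical.  It shows that the address of a first member is numerically the address of the object, which is why the middle-end sees nothing more than pointer arithmetic with a zero offset.

```c
#include <assert.h>
#include <stddef.h>

struct X { int i; };

/* Return 1 when the address of the first member equals the address of
   the enclosing object.  P must point to a valid object.  */
static int
first_member_aliases_object (struct X *p)
{
  assert (p != NULL);
  return (void *) &p->i == (void *) p;
}
```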

I'd appreciate C++ folks picking this up and putting the flag on
the appropriate ADDR_EXPRs - I've tried to avoid putting it on
all of them and didn't try hard to mimic what -Waddress warns
on (the code is big, maybe some refactoring would help, but I'm also
not sure what exactly the C++ standard constraints are here).

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Thanks,
Richard.

PR tree-optimization/104475
gcc/
* tree-core.h: Document use of nothrow_flag on ADDR_EXPR.
* tree.h (ADDR_NONZERO): New.
* fold-const.cc (tree_single_nonzero_warnv_p): Check
ADDR_NONZERO.

gcc/cp/
* typeck.cc (cp_build_addr_expr_1): Set ADDR_NONZERO
on the built address if it is of a COMPONENT_REF.

* g++.dg/opt/pr104475.C: New testcase.
---
 gcc/cp/typeck.cc|  3 +++
 gcc/fold-const.cc   |  4 +++-
 gcc/testsuite/g++.dg/opt/pr104475.C | 12 
 gcc/tree-core.h |  3 +++
 gcc/tree.h  |  4 
 5 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/pr104475.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 7dfe5acc67e..3563750803e 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -7232,6 +7232,9 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, 
tsubst_flags_t complain)
   gcc_assert (same_type_ignoring_top_level_qualifiers_p
  (TREE_TYPE (object), decl_type_context (field)));
   val = build_address (arg);
+  if (TREE_CODE (val) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (val, 0)) == COMPONENT_REF)
+   ADDR_NONZERO (val) = 1;
 }
 
   if (TYPE_PTR_P (argtype)
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index e80be8049e1..cdfe3f50ae3 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -15308,8 +15308,10 @@ tree_single_nonzero_warnv_p (tree t, bool 
*strict_overflow_p)
 
 case ADDR_EXPR:
   {
-   tree base = TREE_OPERAND (t, 0);
+   if (ADDR_NONZERO (t))
+ return true;
 
+   tree base = TREE_OPERAND (t, 0);
if (!DECL_P (base))
  base = get_base_address (base);
 
diff --git a/gcc/testsuite/g++.dg/opt/pr104475.C 
b/gcc/testsuite/g++.dg/opt/pr104475.C
new file mode 100644
index 000..013c70302c6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr104475.C
@@ -0,0 +1,12 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+// { dg-options "-O -Waddress -fdump-tree-original" }
+
+struct X { int i; };
+
+bool foo (struct X *p)
+{
+  return &p->i != nullptr; /* { dg-warning "never be NULL" } */
+}
+
+/* { dg-final { scan-tree-dump "return <retval> = 1;" "original" } } */
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index e146b133dbd..303e25b5df6 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1376,6 +1376,9 @@ struct GTY(()) tree_base {
TREE_THIS_NOTRAP in
   INDIRECT_REF, MEM_REF, TARGET_MEM_REF, ARRAY_REF, ARRAY_RANGE_REF
 
+   ADDR_NONZERO in
+ ADDR_EXPR
+
SSA_NAME_IN_FREE_LIST in
   SSA_NAME
 
diff --git a/gcc/tree.h b/gcc/tree.h
index 23223ca0c87..1c810c0b21b 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -876,6 +876,10 @@ extern void omp_clause_range_check_failed (const_tree, 
const char *, int,
   (TREE_CHECK5 (NODE, INDIRECT_REF, MEM_REF, TARGET_MEM_REF, ARRAY_REF,
\
ARRAY_RANGE_REF)->base.nothrow_flag)
 
+/* Nonzero means this ADDR_EXPR is not equal to NULL.  */
+#define ADDR_NONZERO(NODE) \
+  (TREE_CHECK (NODE, ADDR_EXPR)->base.nothrow_flag)
+
 /* In a VAR_DECL, PARM_DECL or FIELD_DECL, or any kind of ..._REF node,
nonzero means it may not be the lhs of an assignment.
Nonzero in a FUNCTION_DECL means this function should be treated
-- 
2.35.3

Re: [PATCH] i386: fix assert (__builtin_cpu_supports ("x86-64") >= 0)

2022-12-07 Thread Martin Liška

On 12/2/22 10:54, Uros Bizjak wrote:
> I'm not quite familiar with this part of the compiler, but if Jakub is
> OK with the patch, consider it rubber-stamped OK.

Thanks Uros, Jakub can you please approve it?

Thanks,
Martin

> 
> Thanks,
> Uros.

[PATCH] ipa/105676 - pure attribute suggestion for const function

2022-12-07 Thread Richard Biener via Gcc-patches

When a function is declared const (even though it technically
accesses memory), ipa-modref discovering pureness shouldn't end
up suggesting that attribute.  The following thus exempts
'const' functions from ipa_make_function_pure handling.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR ipa/105676
* ipa-pure-const.cc (ipa_make_function_pure): Skip also
for functions already being const.

* gcc.dg/pr105676.c: New testcase.
---
 gcc/ipa-pure-const.cc   |  5 +++--
 gcc/testsuite/gcc.dg/pr105676.c | 14 ++
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr105676.c

diff --git a/gcc/ipa-pure-const.cc b/gcc/ipa-pure-const.cc
index 572a6da274f..0b748eee0ee 100644
--- a/gcc/ipa-pure-const.cc
+++ b/gcc/ipa-pure-const.cc
@@ -1526,8 +1526,9 @@ ipa_make_function_pure (struct cgraph_node *node, bool 
looping, bool local)
 {
   bool cdtor = false;
 
-  if (DECL_PURE_P (node->decl)
-  && (looping || !DECL_LOOPING_CONST_OR_PURE_P (node->decl)))
+  if (TREE_READONLY (node->decl)
+  || (DECL_PURE_P (node->decl)
+ && (looping || !DECL_LOOPING_CONST_OR_PURE_P (node->decl))))
 return false;
   warn_function_pure (node->decl, !looping);
   if (local && skip_function_for_local_pure_const (node))
diff --git a/gcc/testsuite/gcc.dg/pr105676.c b/gcc/testsuite/gcc.dg/pr105676.c
new file mode 100644
index 000..077fc18a17f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr105676.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wsuggest-attribute=pure" } */
+
+__attribute__((const))
+extern int do_expensive_calculation(void);
+
+__attribute__((const))
+int getval(void) /* { dg-bogus "candidate for attribute" } */
+{
+  static int cache = -1;
+  if (cache == -1)
+cache = do_expensive_calculation();
+  return cache;
+}
-- 
2.35.3

Re: [PATCH] Fix a few incorrect accesses.

2022-12-07 Thread Thomas Schwinge

Hi Andrew!

On 2022-12-02T09:12:23-0500, Andrew MacLeod via Gcc-patches 
 wrote:
> This consists of 3 changes which stronger type checking has indicated
> are non-compliant with the type field.

I'm curious what that "stronger type checking" is?


Regards
 Thomas


> I doubt they are super important because there has not been a trap
> triggered by them, and they have been in the source base since sometime
> before 2017.  However, we should probably fix them.
>
> I also notice that those are all uses of VOID_TYPE_P, which
> coincidentally does not check if its a type node being checked:
>
> /* Nonzero if this type is the (possibly qualified) void type.  */
> #define VOID_TYPE_P(NODE) (TREE_CODE (NODE) == VOID_TYPE)
>
> So I guess it wouldn't trap anyway, just silently never trigger.
>
> Bootstraps on x86_64-pc-linux-gnu with no regressions.  OK for trunk?
>
> Andrew


> From d1003e853d1813105eef6e441578e5bea9de8d03 Mon Sep 17 00:00:00 2001
> From: Andrew MacLeod 
> Date: Tue, 29 Nov 2022 13:07:28 -0500
> Subject: [PATCH] Fix a few incorrect accesses.
>
> This consists of 3 changes which stronger type checking has indicated
> are incorrect.
>
>   gcc/
>   * fold-const.cc (fold_unary_loc): Check TREE_TYPE of node.
>   (tree_invalid_nonnegative_warnv_p): Likewise.
>
>   gcc/c-family/
>   * c-attribs.cc (handle_deprecated_attribute): Use type when
>   using TYPE_NAME.
> ---
>  gcc/c-family/c-attribs.cc | 2 +-
>  gcc/fold-const.cc | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> index 07bca68e9b9..b36dd97802b 100644
> --- a/gcc/c-family/c-attribs.cc
> +++ b/gcc/c-family/c-attribs.cc
> @@ -4240,7 +4240,7 @@ handle_deprecated_attribute (tree *node, tree name,
>if (type && TYPE_NAME (type))
>   {
> if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
> - what = TYPE_NAME (*node);
> + what = TYPE_NAME (type);
> else if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
>  && DECL_NAME (TYPE_NAME (type)))
>   what = DECL_NAME (TYPE_NAME (type));
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 114258fa182..e80be8049e1 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -9369,8 +9369,8 @@ fold_unary_loc (location_t loc, enum tree_code code, 
> tree type, tree op0)
> && TREE_CODE (tem) == COND_EXPR
> && TREE_CODE (TREE_OPERAND (tem, 1)) == code
> && TREE_CODE (TREE_OPERAND (tem, 2)) == code
> -   && ! VOID_TYPE_P (TREE_OPERAND (tem, 1))
> -   && ! VOID_TYPE_P (TREE_OPERAND (tem, 2))
> +   && ! VOID_TYPE_P (TREE_TYPE (TREE_OPERAND (tem, 1)))
> +   && ! VOID_TYPE_P (TREE_TYPE (TREE_OPERAND (tem, 2)))
> && (TREE_TYPE (TREE_OPERAND (TREE_OPERAND (tem, 1), 0))
> == TREE_TYPE (TREE_OPERAND (TREE_OPERAND (tem, 2), 0)))
> && (! (INTEGRAL_TYPE_P (TREE_TYPE (tem))
> @@ -15002,7 +15002,7 @@ tree_invalid_nonnegative_warnv_p (tree t, bool 
> *strict_overflow_p, int depth)
>
>   /* If the initializer is non-void, then it's a normal expression
>  that will be assigned to the slot.  */
> - if (!VOID_TYPE_P (t))
> + if (!VOID_TYPE_P (TREE_TYPE (t)))
> return RECURSE (t);
>
>   /* Otherwise, the initializer sets the slot in some way.  One common
> --
> 2.38.1

Re: [PATCH] doc: Correct a clerical error in the document.

2022-12-07 Thread Richard Sandiford via Gcc-patches

Lulu Cheng  writes:
> gcc/ChangeLog:
>
>   * doc/rtl.texi: Correct a clerical error in the document.
> ---
>  gcc/doc/rtl.texi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> index 43c9ee8bffe..44858d12892 100644
> --- a/gcc/doc/rtl.texi
> +++ b/gcc/doc/rtl.texi
> @@ -2142,7 +2142,7 @@ stores the lower 2 bytes of @var{y} in @var{x} and 
> discards the upper
>  (set @var{z} (subreg:SI (reg:HI @var{x}) 0))
>  @end smallexample
>  
> -would set the lower two bytes of @var{z} to @var{y} and set the upper
> +would set the lower two bytes of @var{z} to @var{x} and set the upper
>  two bytes to an unknown value assuming @code{SUBREG_PROMOTED_VAR_P} is
>  false.

Both versions are right in their way.  I think the intention of the
original was to show the effect of moving y to z via a paradoxical
subreg on x.

How about:

  would set the lower two bytes of @var{z} to @var{x} (which contains
  the lower two bytes of @var{y}) and set the upper ...

OK with that change if you agree.

Richard

Re: [committed] onlinedocs: Add documentation links to gdc

2022-12-07 Thread Iain Buclaw via Gcc-patches

Hi Gerald,

Excerpts from Gerald Pfeifer's message of December 6, 2022 2:13 pm:
> On Tue, 6 Dec 2022, Iain Buclaw wrote:
>> Now that the D front-end documentation has been generated and pushed to
>> the site after r13-4421, this can be added to the main index page.
>> 
>> This is a simple copy from other entries, so have gone ahead and
>> committed it.
> 
> Cool, thank you. And sorry, I applied the change on the gcc.gnu.org 
> system and then missed dropping you a note once it successfully ran 
> the first time.
> 
> With your web page patch, are we complete now? Or is anything missing?
> 

Looks like it's all there to me. I just need to write up more content.

Iain.

Re: [PATCH v2 1/2] Allow subtarget customization of CC1_SPEC

2022-12-07 Thread Richard Sandiford via Gcc-patches

Sebastian Huber  writes:
> On 06.12.22 22:06, Thomas Schwinge wrote:
>> Hi!
>> 
>> I suppose I just fail to see some detail here, but:
>> 
>> On 2022-11-21T08:25:25+0100, Sebastian 
>> Huber  wrote:
>>> gcc/ChangeLog:
>>>
>>>* gcc.cc (SUBTARGET_CC1_SPEC): Define if not defined.
>>>(cc1_spec): Append SUBTARGET_CC1_SPEC.
>>> ---
>>> v2: Append SUBTARGET_CC1_SPEC directly to cc1_spec and not through CC1_SPEC.
>>>  This avoids having to modify all the CC1_SPEC definitions in the 
>>> targets.
>>>
>>>   gcc/gcc.cc | 9 -
>>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/gcc.cc b/gcc/gcc.cc
>>> index 830ab88701f..4e1574a4df1 100644
>>> --- a/gcc/gcc.cc
>>> +++ b/gcc/gcc.cc
>>> @@ -706,6 +706,13 @@ proper position among the other output files.  */
>>>   #define CPP_SPEC ""
>>>   #endif
>>>
>>> +/* Subtargets can define SUBTARGET_CC1_SPEC to provide extra args to cc1 
>>> and
>>> +   cc1plus or extra switch-translations.  The SUBTARGET_CC1_SPEC is 
>>> appended
>>> +   to CC1_SPEC.  */
>>> +#ifndef SUBTARGET_CC1_SPEC
>>> +#define SUBTARGET_CC1_SPEC ""
>>> +#endif
>>> +
>>>   /* config.h can define CC1_SPEC to provide extra args to cc1 and cc1plus
>>>  or extra switch-translations.  */
>>>   #ifndef CC1_SPEC
>>> @@ -1174,7 +1181,7 @@ proper position among the other output files.  */
>>>   static const char *asm_debug = ASM_DEBUG_SPEC;
>>>   static const char *asm_debug_option = ASM_DEBUG_OPTION_SPEC;
>>>   static const char *cpp_spec = CPP_SPEC;
>>> -static const char *cc1_spec = CC1_SPEC;
>>> +static const char *cc1_spec = CC1_SPEC SUBTARGET_CC1_SPEC;
>>>   static const char *cc1plus_spec = CC1PLUS_SPEC;
>>>   static const char *link_gcc_c_sequence_spec = LINK_GCC_C_SEQUENCE_SPEC;
>>>   static const char *link_ssp_spec = LINK_SSP_SPEC;
>> ... doesn't this (at least potentially?) badly interact with any existing
>> 'SUBTARGET_CC1_SPEC' definitions -- which per above get appended to
>> 'cc1_spec'?
>> 
>>  gcc/config/loongarch/gnu-user.h-   and provides this hook instead.  */
>>  gcc/config/loongarch/gnu-user.h:#undef SUBTARGET_CC1_SPEC
>>  gcc/config/loongarch/gnu-user.h:#define SUBTARGET_CC1_SPEC 
>> GNU_USER_TARGET_CC1_SPEC
>>  gcc/config/loongarch/gnu-user.h-
>>  --
>>  gcc/config/loongarch/loongarch.h-#define EXTRA_SPECS \
>>  gcc/config/loongarch/loongarch.h:  {"subtarget_cc1_spec", 
>> SUBTARGET_CC1_SPEC}, \
>>  gcc/config/loongarch/loongarch.h-  {"subtarget_cpp_spec", 
>> SUBTARGET_CPP_SPEC}, \
>>  --
>>  gcc/config/mips/gnu-user.h-   and provides this hook instead.  */
>>  gcc/config/mips/gnu-user.h:#undef SUBTARGET_CC1_SPEC
>>  gcc/config/mips/gnu-user.h:#define SUBTARGET_CC1_SPEC 
>> GNU_USER_TARGET_CC1_SPEC
>>  gcc/config/mips/gnu-user.h-
>>  --
>>  gcc/config/mips/linux-common.h-
>>  gcc/config/mips/linux-common.h:#undef  SUBTARGET_CC1_SPEC
>>  gcc/config/mips/linux-common.h:#define SUBTARGET_CC1_SPEC   
>> \
>>  gcc/config/mips/linux-common.h-  LINUX_OR_ANDROID_CC 
>> (GNU_USER_TARGET_CC1_SPEC, \
>>  --
>>  gcc/config/mips/mips.h-
>>  gcc/config/mips/mips.h:/* SUBTARGET_CC1_SPEC is passed to the compiler 
>> proper.  It may be
>>  gcc/config/mips/mips.h-   overridden by subtargets.  */
>>  gcc/config/mips/mips.h:#ifndef SUBTARGET_CC1_SPEC
>>  gcc/config/mips/mips.h:#define SUBTARGET_CC1_SPEC ""
>>  gcc/config/mips/mips.h-#endif
>>  --
>>  gcc/config/mips/mips.h-#define EXTRA_SPECS  
>> \
>>  gcc/config/mips/mips.h:  { "subtarget_cc1_spec", SUBTARGET_CC1_SPEC },  
>> \
>>  gcc/config/mips/mips.h-  { "subtarget_cpp_spec", SUBTARGET_CPP_SPEC },  
>> \
>>  --
>>  gcc/config/mips/r3900.h-/* By default (if not mips-something-else) 
>> produce code for the r3900 */
>>  gcc/config/mips/r3900.h:#undef SUBTARGET_CC1_SPEC
>>  gcc/config/mips/r3900.h:#define SUBTARGET_CC1_SPEC "\
>>  gcc/config/mips/r3900.h-%{mhard-float:%e-mhard-float not supported} \
>
> Oh, I came up with the name SUBTARGET_CC1_SPEC after a discussion on the 
> mailing list and I have to admit that I didn't check that it was 
> actually already in use. What about renaming the loongarch/mips define 
> to LOONGARCH_CC1_SPEC and MIPS_CC1_SPEC?

One drawback to that is that people might have their own spec files
that reference the existing names.  Typically those would be produced
by using -dumpspecs and editing the output.  I think we only really
support that if people regenerate the specs files for each release,
but for targets like MIPS that don't see much activity, it's probably
easy to get away without doing that.

How about going back to Jose's suggestion from the original thread
of using OS_CC1_SPEC?  The patch is OK with that change if no-one
objects in 24 hours.
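
For reference, the appending mechanism in the patch relies on adjacent string-literal concatenation.  Below is a minimal sketch using the proposed OS_CC1_SPEC name; the spec strings and the names cc1_spec_sketch and get_cc1_spec_sketch are made up for illustration and are not taken from any real target.

```c
#include <assert.h>
#include <string.h>

/* A target header would provide this before the driver's defaults.
   The spec text here is invented for the example.  */
#define OS_CC1_SPEC " %{fexample:-fexample-os}"

/* Driver-side defaults, used only when the target defined nothing.  */
#ifndef CC1_SPEC
#define CC1_SPEC "%{fdemo:-fdemo}"
#endif
#ifndef OS_CC1_SPEC
#define OS_CC1_SPEC ""
#endif

/* Adjacent string literals are concatenated at compile time, so the
   OS-level spec is appended without touching any CC1_SPEC definition.  */
static const char *cc1_spec_sketch = CC1_SPEC OS_CC1_SPEC;

/* Accessor so the combined spec can be inspected.  */
static const char *
get_cc1_spec_sketch (void)
{
  return cc1_spec_sketch;
}
```

Because the extra macro is separate from the existing SUBTARGET_CC1_SPEC, specs files that reference subtarget_cc1_spec would be unaffected.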

Thanks,
Richard

Re: [PATCH v3 16/19] modula2 front end: bootstrap and documentation tools

2022-12-07 Thread Gaius Mulley via Gcc-patches

Martin Liška  writes:

> On 12/6/22 15:47, Gaius Mulley wrote:
>> Hi Martin, here is the revised patch having applied all previous
>> recommendations:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603436.html. Is
>> this ok now? Thanks for the improvement suggestions.
>
> Hello.
>
> It looks much better and I'm sending a small patch that resolves the remaining
> flake8 issue. I use the following plugins (some listed here: 
> https://gcc.gnu.org/codingconventions.html#python):
>
> $ flake8 --version
> 5.0.4 (flake8-bugbear: 22.10.27, flake8-builtins: 1.5.3,
> flake8-comprehensions: 3.4.0, flake8-import-order: 0.18.1,
> flake8-quotes: 3.3.1, mccabe: 0.7.0, pycodestyle: 2.9.1, pyflakes:
> 2.5.0) CPython 3.10.8 on Linux
>
> and I see:
>
> gcc/m2/tools-src> flake8
> ./boilerplate.py:108:66: E999 SyntaxError: invalid syntax
> ./tidydates.py:26:1: I100 Import statements are in the wrong order. 'import 
> pathlib' should be before 'import sys'
> ./tidydates.py:129:50: E128 continuation line under-indented for visual indent
> ./def2doc.py:49:5: E301 expected 1 blank line, found 0
> ./def2doc.py:49:18: E211 whitespace before '('
> ./def2doc.py:51:5: E301 expected 1 blank line, found 0
> ./def2doc.py:51:18: E211 whitespace before '('
> ./def2doc.py:53:5: E301 expected 1 blank line, found 0
> ./def2doc.py:55:5: E301 expected 1 blank line, found 0
> ./def2doc.py:57:5: E301 expected 1 blank line, found 0
> ./def2doc.py:59:5: E301 expected 1 blank line, found 0
> ./def2doc.py:61:5: E301 expected 1 blank line, found 0
> ./def2doc.py:65:5: E301 expected 1 blank line, found 0
> ./def2doc.py:70:5: E301 expected 1 blank line, found 0
> ./def2doc.py:72:5: E301 expected 1 blank line, found 0
> ./def2doc.py:191:80: E501 line too long (81 > 79 characters)
> ./def2doc.py:330:22: A002 argument "dir" is shadowing a python builtin
> ./def2doc.py:348:23: A002 argument "dir" is shadowing a python builtin
> ./def2doc.py:377:17: A002 argument "dir" is shadowing a python builtin
> ./def2doc.py:396:21: A002 argument "dir" is shadowing a python builtin
> ./def2doc.py:406:16: A002 argument "dir" is shadowing a python builtin
> ./def2doc.py:418:15: A002 argument "dir" is shadowing a python builtin
> ./def2doc.py:432:25: A002 argument "dir" is shadowing a python builtin
> ./def2doc.py:437:19: Q000 Double quotes found but single quotes preferred
> ./def2doc.py:439:19: Q000 Double quotes found but single quotes preferred
> ./def2doc.py:441:19: Q000 Double quotes found but single quotes preferred
> ./def2doc.py:468:18: Q001 Single quote multiline found but double quotes 
> preferred
>
> It seems the first one is a real syntax error. Anyway, feel free to apply the 
> suggested patch.
>
> And I would consider replacing the following static 'str.' calls:
>
> def2doc.py:output.write(str.replace(str.replace(str.rstrip(line),
> def2doc.py:output.write(str.replace(str.replace(line, '{', '@{'), 
> '}', '@}'))
>
> with line.rstrip().replace(...).replace(...)
>
> Cheers,
> Martin

Hi Martin,

many thanks for the patch and suggestions (and flake8 plugin output) - I
will apply the patch and change the str calls,

regards,
Gaius

[PATCH] preprocessor: __has_include_next should not error out [PR80755]

2022-12-07 Thread Helmut Grohne

If __has_include_next reaches the end of the search path, it causes an
error. The use of __has_include_next at the end of the search path is
legal and it should return false instead.

Bootstrapped and tested on x86_64-linux-gnu. Patched cross toolchain for
i686-gnu (hurd) built many packages.
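
The intended behaviour can be modelled with a toy search path.  This is a sketch under assumptions, not libcpp code: struct dir, search_head_next and has_header_next are hypothetical, and each directory holds at most one header for brevity.  The point is that reaching the end of the chain yields NULL, which the caller turns into "false" instead of an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry in a toy include search path.  */
struct dir
{
  const char *header;		/* Header provided by this directory.  */
  struct dir *next;		/* Next directory, or NULL at the end.  */
};

/* Where an include_next-style lookup starts: the directory after
   CURRENT.  Returns NULL when CURRENT is the last directory.  */
static struct dir *
search_head_next (struct dir *current)
{
  return current ? current->next : NULL;
}

/* Return 1 if FNAME is found after CURRENT in the search path, and 0
   otherwise, including when there is nothing left to search.  */
static int
has_header_next (struct dir *current, const char *fname)
{
  struct dir *start = search_head_next (current);

  if (!start)
    return 0;
  for (struct dir *d = start; d; d = d->next)
    if (d->header && strcmp (d->header, fname) == 0)
      return 1;
  return 0;
}
```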

gcc/ChangeLog:

PR preprocessor/80755
* libcpp/files.cc (search_path_head): Do not raise an error for
type IT_INCLUDE_NEXT.
* libcpp/files.cc (_cpp_has_header): Deal with NULL return from
search_path_head.

gcc/testsuite/ChangeLog:

PR preprocessor/80755
* testsuite/gcc.dg/cpp/pr80755.c: New test.
* gcc.dg/cpp/inc/pr80755.h: Added support file for test.

Signed-off-by: Helmut Grohne 
---
 gcc/testsuite/gcc.dg/cpp/inc/pr80755.h | 2 ++
 gcc/testsuite/gcc.dg/cpp/pr80755.c | 5 +
 libcpp/files.cc| 4 +++-
 3 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/cpp/inc/pr80755.h
 create mode 100644 gcc/testsuite/gcc.dg/cpp/pr80755.c

diff --git a/gcc/testsuite/gcc.dg/cpp/inc/pr80755.h 
b/gcc/testsuite/gcc.dg/cpp/inc/pr80755.h
new file mode 100644
index 000..32022cc691d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/inc/pr80755.h
@@ -0,0 +1,2 @@
+#if __has_include_next()
+#endif
diff --git a/gcc/testsuite/gcc.dg/cpp/pr80755.c 
b/gcc/testsuite/gcc.dg/cpp/pr80755.c
new file mode 100644
index 000..34ae995a6c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/pr80755.c
@@ -0,0 +1,5 @@
+/* PR preprocessor/80755 */
+/* { dg-do preprocess } */
+/* { dg-options "-idirafter $srcdir/gcc.dg/cpp/inc" } */
+
+#include "pr80755.h"
diff --git a/libcpp/files.cc b/libcpp/files.cc
index a18b1caf48d..606f53ed015 100644
--- a/libcpp/files.cc
+++ b/libcpp/files.cc
@@ -1042,7 +1042,7 @@ search_path_head (cpp_reader *pfile, const char *fname, 
int angle_brackets,
  path use the normal search logic.  */
   if (type == IT_INCLUDE_NEXT && file->dir
   && file->dir != &pfile->no_search_path)
-dir = file->dir->next;
+return file->dir->next;
   else if (angle_brackets)
 dir = pfile->bracket_include;
   else if (type == IT_CMDLINE)
@@ -2145,6 +2145,8 @@ _cpp_has_header (cpp_reader *pfile, const char *fname, 
int angle_brackets,
 enum include_type type)
 {
   cpp_dir *start_dir = search_path_head (pfile, fname, angle_brackets, type);
+  if (!start_dir)
+return false;
   _cpp_file *file = _cpp_find_file (pfile, fname, start_dir, angle_brackets,
_cpp_FFK_HAS_INCLUDE, 0);
   return file->err_no != ENOENT;
-- 
2.38.1

Re: [PATCH v3 16/19] modula2 front end: bootstrap and documentation tools

2022-12-07 Thread Martin Liška

On 12/6/22 15:47, Gaius Mulley wrote:
> Hi Martin, here is the revised patch having applied all previous 
> recommendations: 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603436.html. Is this 
> ok now? Thanks for the improvement suggestions.

Hello.

It looks much better and I'm sending a small patch that resolves the remaining
flake8 issue. I use the following plugins (some listed here: 
https://gcc.gnu.org/codingconventions.html#python):

$ flake8 --version
5.0.4 (flake8-bugbear: 22.10.27, flake8-builtins: 1.5.3, flake8-comprehensions: 
3.4.0, flake8-import-order: 0.18.1, flake8-quotes: 3.3.1, mccabe: 0.7.0, 
pycodestyle: 2.9.1, pyflakes: 2.5.0) CPython 3.10.8 on Linux

and I see:

gcc/m2/tools-src> flake8
./boilerplate.py:108:66: E999 SyntaxError: invalid syntax
./tidydates.py:26:1: I100 Import statements are in the wrong order. 'import 
pathlib' should be before 'import sys'
./tidydates.py:129:50: E128 continuation line under-indented for visual indent
./def2doc.py:49:5: E301 expected 1 blank line, found 0
./def2doc.py:49:18: E211 whitespace before '('
./def2doc.py:51:5: E301 expected 1 blank line, found 0
./def2doc.py:51:18: E211 whitespace before '('
./def2doc.py:53:5: E301 expected 1 blank line, found 0
./def2doc.py:55:5: E301 expected 1 blank line, found 0
./def2doc.py:57:5: E301 expected 1 blank line, found 0
./def2doc.py:59:5: E301 expected 1 blank line, found 0
./def2doc.py:61:5: E301 expected 1 blank line, found 0
./def2doc.py:65:5: E301 expected 1 blank line, found 0
./def2doc.py:70:5: E301 expected 1 blank line, found 0
./def2doc.py:72:5: E301 expected 1 blank line, found 0
./def2doc.py:191:80: E501 line too long (81 > 79 characters)
./def2doc.py:330:22: A002 argument "dir" is shadowing a python builtin
./def2doc.py:348:23: A002 argument "dir" is shadowing a python builtin
./def2doc.py:377:17: A002 argument "dir" is shadowing a python builtin
./def2doc.py:396:21: A002 argument "dir" is shadowing a python builtin
./def2doc.py:406:16: A002 argument "dir" is shadowing a python builtin
./def2doc.py:418:15: A002 argument "dir" is shadowing a python builtin
./def2doc.py:432:25: A002 argument "dir" is shadowing a python builtin
./def2doc.py:437:19: Q000 Double quotes found but single quotes preferred
./def2doc.py:439:19: Q000 Double quotes found but single quotes preferred
./def2doc.py:441:19: Q000 Double quotes found but single quotes preferred
./def2doc.py:468:18: Q001 Single quote multiline found but double quotes 
preferred

It seems the first one is a real syntax error. Anyway, feel free to apply the 
suggested patch.

And I would consider replacing the following static 'str.' calls:

def2doc.py:output.write(str.replace(str.replace(str.rstrip(line),
def2doc.py:output.write(str.replace(str.replace(line, '{', '@{'), '}', 
'@}'))

with line.rstrip().replace(...).replace(...)
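
A minimal sketch of the two styles side by side, assuming both calls escape braces the same way as the second line quoted above (the helper names are made up for this example and do not exist in def2doc.py):

```python
def escape_braces_static(line):
    # Style currently used: unbound str methods nested inside each other.
    return str.replace(str.replace(str.rstrip(line), '{', '@{'), '}', '@}')


def escape_braces_chained(line):
    # Suggested style: bound methods chained left to right.
    return line.rstrip().replace('{', '@{').replace('}', '@}')


sample = 'PROCEDURE foo {bar}  \n'
print(escape_braces_chained(sample))  # PROCEDURE foo @{bar@}
print(escape_braces_static(sample) == escape_braces_chained(sample))  # True
```

Both produce the same string; the chained form just reads in the order the operations are applied.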

Cheers,
Martin

diff --git a/gcc/m2/tools-src/boilerplate.py b/gcc/m2/tools-src/boilerplate.py
index f0b266f403f..99596529b4e 100644
--- a/gcc/m2/tools-src/boilerplate.py
+++ b/gcc/m2/tools-src/boilerplate.py
@@ -85,9 +85,9 @@ def analyse_comment(text, f):
 lic = 'GPL'
 elif text.find(GNU_LESSER_GENERAL) > 0:
 lic = 'LGPL'
-for license in Licenses.keys():
-if text.find(license) > 0:
-lic += Licenses[license]
+for license_ in Licenses.keys():
+if text.find(license_) > 0:
+lic += Licenses[license_]
 if text.find(GCC_RUNTIME_LIB_EXC) > 0:
 lic += 'x'
 now = datetime.datetime.now()
@@ -105,7 +105,7 @@ def analyse_comment(text, f):
 i = text.find(basename(f))
 j = text.find('. ', i)
 if j < 0:
-error('summary of the file does not finish with a '.'')
+error("summary of the file does not finish with a '.'")
 summary = text[i:]
 else:
 summary = text[i:j]
@@ -175,7 +175,7 @@ def add_stop(sentence):
 return sentence
 
 
-GPLv3 = '''
+GPLv3 = """
 %s
 
 Copyright (C) %s Free Software Foundation, Inc.
@@ -196,9 +196,9 @@ General Public License for more details.
 You should have received a copy of the GNU General Public License
 along with GNU Modula-2; see the file COPYING3.  If not see
 .
-'''
+"""
 
-GPLv3x = '''
+GPLv3x = """
 %s
 
 Copyright (C) %s Free Software Foundation, Inc.
@@ -224,9 +224,9 @@ You should have received a copy of the GNU General Public License and
 a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .
-'''
+"""
 
-LGPLv3 = '''
+LGPLv3 = """
 %s
 
 Copyright (C) %s Free Software Foundation, Inc.
@@ -246,9 +246,9 @@ Lesser General Public License for more details.
 
 You should have received a copy of the GNU Lesser General Public License
 along with GNU Modula-2.  If not, see .
-'''
+"""

Re: [PATCH] Fix aarch64 PR 99657: ICE with SVE types used without an error

2022-12-07 Thread Richard Sandiford via Gcc-patches

"Kewen.Lin"  writes:
> Hi,
>
> In the recent discussion on how to make some built-in type only valid for
> some target features efficiently[1], Andrew mentioned this patch which he
> made previously (Thanks!).  I confirmed it can help rs6000 related issue,
> and noticed PR99657 is still opened, so I think we still want this to
> be reviewed.

But does it work for things like:

void f(foo_t *x, foo_t *y) { *x = *y; }

where no variables are being created with foo_t type?

That's not to say we shouldn't have the patch.  I'm just not sure
it can be the complete solution.

Thanks,
Richard

>
> Could some C/C++ FE experts help to review it?
>
> Thanks in advance!
>
> BR,
> Kewen
>
> [1] https://gcc.gnu.org/pipermail/gcc/2022-December/240220.html
>
> on 2021/11/9 18:09, apinski--- via Gcc-patches wrote:
>> From: Andrew Pinski 
>> 
>> This fixes fully where SVE types were being used without sve being enabled.
>> Instead of trying to fix it such that we error out during RTL time, it is
>> better to error out in front-ends.  This expands verify_type_context to
>> have a context of auto storage decl which is used for both auto storage
>> decls and for indirection context.
>> 
>> A few testcases needed to be updated for the new error message; they were
>> already being rejected before hand.
>> 
>> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>> 
>> PR target/99657
>> gcc/c/ChangeLog:
>> 
>>  * c-decl.c (finish_decl): Call verify_type_context
>>  for all decls and not just global_decls.
>>  * c-typeck.c (build_indirect_ref): Call verify_type_context
>>  to check to see if the type is ok to be used.
>> 
>> gcc/ChangeLog:
>> 
>>  * config/aarch64/aarch64-sve-builtins.cc (verify_type_context):
>>  Add TCTX_AUTO_STORAGE support.
>>  * target.h (enum type_context_kind): Add TCTX_AUTO_STORAGE.
>> 
>> gcc/cp/ChangeLog:
>> 
>>  * decl.c (cp_finish_decl): Call verify_type_context
>>  for all decls and not just global_decls.
>>  * typeck.c (cp_build_indirect_ref_1): Call verify_type_context
>>  to check to see if the type is ok to be used.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/aarch64/sve/acle/general/nosve_1.c: Update test.
>>  * gcc.target/aarch64/sve/acle/general/nosve_4.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/general/nosve_5.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/general/nosve_6.c: Likewise.
>>  * gcc.target/aarch64/sve/pcs/nosve_2.c: Likewise.
>>  * gcc.target/aarch64/sve/pcs/nosve_3.c: Likewise.
>>  * gcc.target/aarch64/sve/pcs/nosve_4.c: Likewise.
>>  * gcc.target/aarch64/sve/pcs/nosve_5.c: Likewise.
>>  * gcc.target/aarch64/sve/pcs/nosve_6.c: Likewise.
>>  * gcc.target/aarch64/sve/pcs/nosve_9.c: New test.
>> ---
>>  gcc/c/c-decl.c| 14 +++---
>>  gcc/c/c-typeck.c  |  2 ++
>>  gcc/config/aarch64/aarch64-sve-builtins.cc| 14 ++
>>  gcc/cp/decl.c | 10 ++
>>  gcc/cp/typeck.c   |  4 
>>  gcc/target.h  |  3 +++
>>  .../gcc.target/aarch64/sve/acle/general/nosve_1.c |  1 +
>>  .../gcc.target/aarch64/sve/acle/general/nosve_4.c |  2 +-
>>  .../gcc.target/aarch64/sve/acle/general/nosve_5.c |  2 +-
>>  .../gcc.target/aarch64/sve/acle/general/nosve_6.c |  1 +
>>  .../gcc.target/aarch64/sve/pcs/nosve_2.c  |  2 +-
>>  .../gcc.target/aarch64/sve/pcs/nosve_3.c  |  2 +-
>>  .../gcc.target/aarch64/sve/pcs/nosve_4.c  |  3 +--
>>  .../gcc.target/aarch64/sve/pcs/nosve_5.c  |  3 +--
>>  .../gcc.target/aarch64/sve/pcs/nosve_6.c  |  3 +--
>>  .../gcc.target/aarch64/sve/pcs/nosve_9.c  | 15 +++
>>  16 files changed, 60 insertions(+), 21 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/nosve_9.c
>> 
>> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
>> index 186fa1692c1..b3583622475 100644
>> --- a/gcc/c/c-decl.c
>> +++ b/gcc/c/c-decl.c
>> @@ -5441,19 +5441,19 @@ finish_decl (tree decl, location_t init_loc, tree 
>> init,
>>  
>>if (VAR_P (decl))
>>  {
>> +  type_context_kind context = TCTX_AUTO_STORAGE;
>>if (init && TREE_CODE (init) == CONSTRUCTOR)
>>  add_flexible_array_elts_to_size (decl, init);
>>  
>>complete_flexible_array_elts (DECL_INITIAL (decl));
>>  
>>if (is_global_var (decl))
>> -{
>> -  type_context_kind context = (DECL_THREAD_LOCAL_P (decl)
>> -   ? TCTX_THREAD_STORAGE
>> -   : TCTX_STATIC_STORAGE);
>> -  if (!verify_type_context (input_location, context, TREE_TYPE (decl)))
>> -TREE_TYPE (decl) = error_mark_node;
>> -}
>> +context = (DECL_THREAD_LOCAL_P (decl)
>> +   ? TCTX_THREAD_STORAGE
>> +   : TCTX_STATIC_STORAGE);
>> +
>> +  if

Re: [PATCH Rust front-end v4 20/46] gccrs: Add wrapper for make_unique

2022-12-07 Thread Thomas Schwinge

Hi!

On 2022-12-07T09:50:40+0100, Arsen Arsenović via Gcc-patches 
 wrote:
> arthur.co...@embecosm.com writes:
>
>> This is a wrapper for make_unique. We can likely get rid of this, as there
>> are other implementations available, or simply keep using the unique_ptr
>> constructor.
>> ---
>>  gcc/rust/util/rust-make-unique.h | 35 
>>  1 file changed, 35 insertions(+)
>>  create mode 100644 gcc/rust/util/rust-make-unique.h
>>
>> diff --git a/gcc/rust/util/rust-make-unique.h 
>> b/gcc/rust/util/rust-make-unique.h
>> new file mode 100644
>> index 000..7b79e625ff1
>> --- /dev/null
>> +++ b/gcc/rust/util/rust-make-unique.h
>> @@ -0,0 +1,35 @@
>> +// Copyright (C) 2020-2022 Free Software Foundation, Inc.
>> +
>> +// This file is part of GCC.
>> +
>> +// GCC is free software; you can redistribute it and/or modify it under
>> +// the terms of the GNU General Public License as published by the Free
>> +// Software Foundation; either version 3, or (at your option) any later
>> +// version.
>> +
>> +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +// WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +// for more details.
>> +
>> +// You should have received a copy of the GNU General Public License
>> +// along with GCC; see the file COPYING3.  If not see
>> +// <http://www.gnu.org/licenses/>.
>> +
>> +#ifndef RUST_MAKE_UNIQUE_H
>> +#define RUST_MAKE_UNIQUE_H
>> +
>> +#include "rust-system.h"
>> +
>> +namespace Rust {
>> +
>> +template <typename T, typename... Ts>
>> +std::unique_ptr<T>
>> +make_unique (Ts &&...params)
>> +{
>> +  return std::unique_ptr<T> (new T (std::forward<Ts> (params)...));
>> +}
>> +}
>> +
>> +} // namespace Rust
>> +
>> +#endif // RUST_MAKE_UNIQUE_H
>
> I think this was added recently, see commit
> 00d7c8ff16e6838273cea808ffbe22e98104f9d5 and gcc/make-unique.h.

I too had seen that, but decided to wait for until after the GCC/Rust
merge, to not add more complexity to that one.  It's OK, in my opinion,
to add 'gcc/rust/util/rust-make-unique.h' now, and then later
re-factor/get rid of that, to use 'gcc/make-unique.h' instead.

Anyway, thanks for pointing that out, of course!  :-)


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Re: AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

2022-12-07 Thread Richard Sandiford via Gcc-patches

"Pop, Sebastian"  writes:
> Thanks Richard for your review and for pointing out the issue with BTI.
>
>
> The current patch removes the existing BTI instruction,
>
> and then adds the BTI hint when expanding the patchable_area pseudo.

Thanks.  I still think...

> The attached patch passed bootstrap and regression test on arm64-linux.
>
> Ok to commit to gcc trunk?
>
>
> Thank you,
> Sebastian
>
> 
> From: Richard Sandiford 
> Sent: Monday, December 5, 2022 5:34:40 AM
> To: Pop, Sebastian
> Cc: gcc-patches@gcc.gnu.org; seb...@gmail.com; Kyrylo Tkachov
> Subject: RE: [EXTERNAL]AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you can confirm the sender and know the 
> content is safe.
>
>
>
> "Pop, Sebastian"  writes:
>> Hi,
>>
>> Currently patchable area is at the wrong place on AArch64.  It is placed
>> immediately after function label, before .cfi_startproc.  This patch
>> adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
>> modifies aarch64_print_patchable_function_entry to avoid placing
>> patchable area before .cfi_startproc.
>>
>> The patch passed bootstrap and regression test on aarch64-linux.
>> Ok to commit to trunk and backport to active release branches?
>
> Looks good, but doesn't the problem described in the PR then still
> apply to the BTI emitted by:
>
>   if (cfun->machine->label_is_assembled
>   && aarch64_bti_enabled ()
>   && !cgraph_node::get (cfun->decl)->only_called_directly_p ())
> {
>   /* Remove the BTI that follows the patch area and insert a new BTI
>  before the patch area right after the function label.  */
>   rtx_insn *insn = next_real_nondebug_insn (get_insns ());
>   if (insn
>   && INSN_P (insn)
>   && GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE
>   && XINT (PATTERN (insn), 1) == UNSPECV_BTI_C)
> delete_insn (insn);
>   asm_fprintf (file, "\thint\t34 // bti c\n");
> }
>
> ?  It seems like the BTI will be before the cfi_startproc and the
> patchable entry afterwards.
>
> I guess we should keep the BTI instruction as-is (rather than printing
> a .hint) and emit the new UNSPECV_PATCHABLE_AREA after the BTI rather
> than before it.

...this approach would be slightly cleaner though.  The .hint asm string
we're emitting here is exactly the same as the one emitted by the
original bti_c instruction.  The only reason for deleting the
instruction and emitting text was because we were emitting the
patchable entry directly as text, and the BTI text had to come
before the patchable entry text.

Now that we're emitting the patchable entry via a normal instruction
(a good thing!) we can keep the preceding bti_c as a normal instruction
too.  That is, I think we should use emit_insn_after to emit the entry
after the bti_c insn (if it exists) instead of before BB_HEAD.

Thanks,
Richard

>> gcc/
>> PR target/93492
>> * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
>> Declared.
>> * config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
>> Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
>> (aarch64_output_patchable_area): New.
>> * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
>> (patchable_area): Define.
>>
>> gcc/testsuite/
>> PR target/93492
>> * gcc.target/aarch64/pr98776.c: New.
>>
>>
>> From b9cf87bcdf65f515b38f1851eb95c18aaa180253 Mon Sep 17 00:00:00 2001
>> From: Sebastian Pop 
>> Date: Wed, 30 Nov 2022 19:45:24 +
>> Subject: [PATCH] AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>>
>> Currently patchable area is at the wrong place on AArch64.  It is placed
>> immediately after function label, before .cfi_startproc.  This patch
>> adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
>> modifies aarch64_print_patchable_function_entry to avoid placing
>> patchable area before .cfi_startproc.
>>
>> gcc/
>>   PR target/93492
>>   * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
>>   Declared.
>>   * config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
>>   Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
>>   (aarch64_output_patchable_area): New.
>>   * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
>>   (patchable_area): Define.
>>
>> gcc/testsuite/
>>   PR target/93492
>>   * gcc.target/aarch64/pr98776.c: New.
>> ---
>>  gcc/config/aarch64/aarch64-protos.h|  2 ++
>>  gcc/config/aarch64/aarch64.cc  | 24 +-
>>  gcc/config/aarch64/aarch64.md  | 14 +
>>  gcc/testsuite/gcc.target/aarch64/pr98776.c | 11 ++
>>  4 files changed, 50 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr98776.c
>>
>> diff --git

Re: [PATCH Rust front-end v4 20/46] gccrs: Add wrapper for make_unique

2022-12-07 Thread Arsen Arsenović via Gcc-patches


arthur.co...@embecosm.com writes:

> This is a wrapper for make_unique. We can likely get rid of this, as there
> are other implementations available, or simply keep using the unique_ptr
> constructor.
> ---
>  gcc/rust/util/rust-make-unique.h | 35 
>  1 file changed, 35 insertions(+)
>  create mode 100644 gcc/rust/util/rust-make-unique.h
>
> diff --git a/gcc/rust/util/rust-make-unique.h 
> b/gcc/rust/util/rust-make-unique.h
> new file mode 100644
> index 000..7b79e625ff1
> --- /dev/null
> +++ b/gcc/rust/util/rust-make-unique.h
> @@ -0,0 +1,35 @@
> +// Copyright (C) 2020-2022 Free Software Foundation, Inc.
> +
> +// This file is part of GCC.
> +
> +// GCC is free software; you can redistribute it and/or modify it under
> +// the terms of the GNU General Public License as published by the Free
> +// Software Foundation; either version 3, or (at your option) any later
> +// version.
> +
> +// GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +// WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +// for more details.
> +
> +// You should have received a copy of the GNU General Public License
> +// along with GCC; see the file COPYING3.  If not see
> +// <http://www.gnu.org/licenses/>.
> +
> +#ifndef RUST_MAKE_UNIQUE_H
> +#define RUST_MAKE_UNIQUE_H
> +
> +#include "rust-system.h"
> +
> +namespace Rust {
> +
> +template <typename T, typename... Ts>
> +std::unique_ptr<T>
> +make_unique (Ts &&...params)
> +{
> +  return std::unique_ptr<T> (new T (std::forward<Ts> (params)...));
> +}
> +}
> +
> +} // namespace Rust
> +
> +#endif // RUST_MAKE_UNIQUE_H

I think this was added recently, see commit
00d7c8ff16e6838273cea808ffbe22e98104f9d5 and gcc/make-unique.h.

-- 
Arsen Arsenović



[PATCH] range-op-float: Fix up frange_arithmetic [PR107967]

2022-12-07 Thread Jakub Jelinek via Gcc-patches

Hi!

The addition of PLUS/MINUS/MULT/RDIV_EXPR frange handlers causes
miscompilation of some of the libm routines, resulting in lots of
glibc test failures.  A part of them is purely PR107608 fold-overflow-1.c
etc. issues, say when the code does
  return -0.5 / 0.0;
and expects division by zero to be emitted, but we propagate -Inf
and avoid the operation.
But there are also various tests where we end up with different computed
value from the expected ones.  All those cases are like:
 is:  inf   inf
 should be:   1.18973149535723176502e+4932   0xf.fff0p+16380
 is:  inf   inf
 should be:   1.18973149535723176508575932662800701e+4932   
0x1.p+16383
 is:  inf   inf
 should be:   1.7976931348623157e+308   0x1.fp+1023
 is:  inf   inf
 should be:   3.40282346e+38   0x1.fep+127
and the corresponding source looks like:
static const double huge = 1.0e+300;
double whatever (...) {
...
  return huge * huge;
...
}
which for rounding to nearest or +inf should and does return +inf, but
for rounding to -inf or 0 should instead return nextafter (inf, -inf);
The rules IEEE754 has are that operations on +-Inf operands are exact
and produce +-Inf (except for the invalid ones that produce NaN) regardless
of rounding mode, while overflows:
"a) roundTiesToEven and roundTiesToAway carry all overflows to ∞ with the
sign of the intermediate result.
b) roundTowardZero carries all overflows to the format’s largest finite
number with the sign of the intermediate result.
c) roundTowardNegative carries positive overflows to the format’s largest
finite number, and carries negative overflows to −∞.
d) roundTowardPositive carries negative overflows to the format’s most
negative finite number, and carries positive overflows to +∞."
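
The four overflow rules can be observed without touching the FPU rounding mode. The sketch below uses Python's decimal module as a stand-in: it is decimal rather than binary arithmetic and has nothing to do with GCC's real.cc, but it applies the same per-rounding-mode overflow rules. The toy format (9 digits, maximum exponent 99) and the helper name are arbitrary choices for this example; the product mirrors huge * huge above.

```python
from decimal import (Context, Decimal, ROUND_CEILING, ROUND_DOWN,
                     ROUND_FLOOR, ROUND_HALF_EVEN)


def overflowing_product(rounding, sign=1):
    # Toy format: 9 significant digits, maximum exponent 99.  Traps are
    # disabled so that an overflow yields a value instead of raising.
    ctx = Context(prec=9, Emax=99, Emin=-99, rounding=rounding, traps=[])
    huge = Decimal('1e60')
    return ctx.multiply(sign * huge, huge)


for name, mode in [('nearest-even', ROUND_HALF_EVEN),
                   ('toward zero', ROUND_DOWN),
                   ('toward -inf', ROUND_FLOOR),
                   ('toward +inf', ROUND_CEILING)]:
    print(name, overflowing_product(mode), overflowing_product(mode, -1))
```

Rounding to nearest gives an infinity of either sign, rounding toward zero gives the largest finite magnitude of either sign, and the two directed modes give a finite result on one side and an infinity on the other.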

The behavior around overflows to -Inf or nextafter (-inf, inf) was actually
handled correctly: we'd construct [-INF, -MAX] ranges in those cases
because !real_less (&value, &result) holds there; value is finite
but larger in magnitude than what the format can represent (but GCC's
internal format can), while result is -INF in that case.
But the overflow to +Inf or nextafter (inf, -inf) was handled
incorrectly: it tested real_less (&result, &value) rather than
!real_less (&result, &value).  The former test is true when the
rounding value -> result has already rounded down, and in that case we
shouldn't round again; we should round down when it didn't.

So, in theory this could be fixed just by adding one ! character,
-  if ((mode_composite || (real_isneg (&inf) ? real_less (&result, &value)
+  if ((mode_composite || (real_isneg (&inf) ? !real_less (&result, &value)
                          : !real_less (&value, &result)))
but the following patch goes further.  The distance between
nextafter (inf, -inf) and inf is large (infinite) and expressions like
1.0e+300 * 1.0e+300 always produce +inf in round to nearest mode by far,
so I think having low bound of nextafter (inf, -inf) in that case is
unnecessary.  But if it isn't multiplication but say addition and we are
inexact and very close to the boundary between rounding to nearest
maximum representable vs. rounding to nearest +inf, still using [MAX, +INF]
etc. ranges seems safer because we don't know exactly what we lost in the
inexact computation.

The following patch implements that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-12-07  Jakub Jelinek  

PR tree-optimization/107967
* range-op-float.cc (frange_arithmetic): Fix a thinko - if
inf is negative, use nextafter if !real_less (&result, &value)
rather than if real_less (&result, &value).  If result is +-INF
while value is finite and -fno-rounding-math, don't do rounding
if !inexact or if result is significantly above max representable
value or below min representable value.

* gcc.dg/pr107967-1.c: New test.
* gcc.dg/pr107967-2.c: New test.
* gcc.dg/pr107967-3.c: New test.

--- gcc/range-op-float.cc.jj2022-12-06 10:25:16.594848892 +0100
+++ gcc/range-op-float.cc   2022-12-06 20:53:47.751295689 +0100
@@ -287,9 +287,64 @@ frange_arithmetic (enum tree_code code,
 
   // Be extra careful if there may be discrepancies between the
   // compile and runtime results.
-  if ((mode_composite || (real_isneg (&inf) ? real_less (&result, &value)
- : !real_less (&value, &result)))
-  && (inexact || !real_identical (&result, &value)))
+  bool round = false;
+  if (mode_composite)
+round = true;
+  else if (real_isneg (&inf))
+{
+  round = !real_less (&result, &value);
+  if (real_isinf (&result, false)
+ && !real_isinf (&value)
+ && !flag_rounding_math)
+   {
+ // Use just [+INF, +INF] rather than [MAX, +INF]
+ // even if value is larger than MAX and rounds to
+ // nearest to +INF.  Unless INEXACT is true, in
+ // that case we need some extra buffer.
+ if (!inexact)
+   round = false;
+ else
+   {
+ REAL_VALUE_TYPE tmp = result, tmp2;
+ frange_nextafter

Re: [PATCH v2 1/2] Allow subtarget customization of CC1_SPEC

2022-12-07 Thread Iain Sandoe

Hi

> On 7 Dec 2022, at 07:54, Sebastian Huber  
> wrote:
> 
> 
> 
> On 07.12.22 08:10, Thomas Schwinge wrote:
>> Hi!
>> On 2022-12-07T07:04:10+0100, Sebastian Huber 
>>  wrote:
>>> On 06.12.22 22:06, Thomas Schwinge wrote:
>>> I suppose I just fail to see some detail here, but:
>>> 
 On 2022-11-21T08:25:25+0100, Sebastian 
 Huber  wrote:
> gcc/ChangeLog:
> 
>* gcc.cc (SUBTARGET_CC1_SPEC): Define if not defined.
>(cc1_spec): Append SUBTARGET_CC1_SPEC.
> ---
> v2: Append SUBTARGET_CC1_SPEC directly to cc1_spec and not through 
> CC1_SPEC.
>  This avoids having to modify all the CC1_SPEC definitions in the 
> targets.
> 
>   gcc/gcc.cc | 9 -
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> index 830ab88701f..4e1574a4df1 100644
> --- a/gcc/gcc.cc
> +++ b/gcc/gcc.cc
> @@ -706,6 +706,13 @@ proper position among the other output files.  */
>   #define CPP_SPEC ""
>   #endif
> 
> +/* Subtargets can define SUBTARGET_CC1_SPEC to provide extra args to cc1 
> and
> +   cc1plus or extra switch-translations.  The SUBTARGET_CC1_SPEC is 
> appended
> +   to CC1_SPEC.  */
> +#ifndef SUBTARGET_CC1_SPEC
> +#define SUBTARGET_CC1_SPEC ""
> +#endif
> +
>   /* config.h can define CC1_SPEC to provide extra args to cc1 and cc1plus
>  or extra switch-translations.  */
>   #ifndef CC1_SPEC
> @@ -1174,7 +1181,7 @@ proper position among the other output files.  */
>   static const char *asm_debug = ASM_DEBUG_SPEC;
>   static const char *asm_debug_option = ASM_DEBUG_OPTION_SPEC;
>   static const char *cpp_spec = CPP_SPEC;
> -static const char *cc1_spec = CC1_SPEC;
> +static const char *cc1_spec = CC1_SPEC SUBTARGET_CC1_SPEC;
>   static const char *cc1plus_spec = CC1PLUS_SPEC;
>   static const char *link_gcc_c_sequence_spec = LINK_GCC_C_SEQUENCE_SPEC;
>   static const char *link_ssp_spec = LINK_SSP_SPEC;
 
 ... doesn't this (at least potentially?) badly interact with any existing
 'SUBTARGET_CC1_SPEC' definitions -- which per above get appended to
 'cc1_spec'?
 
  gcc/config/loongarch/gnu-user.h-   and provides this hook instead.  */
  gcc/config/loongarch/gnu-user.h:#undef SUBTARGET_CC1_SPEC
  gcc/config/loongarch/gnu-user.h:#define SUBTARGET_CC1_SPEC 
 GNU_USER_TARGET_CC1_SPEC
  gcc/config/loongarch/gnu-user.h-
  --
  gcc/config/loongarch/loongarch.h-#define EXTRA_SPECS \
  gcc/config/loongarch/loongarch.h:  {"subtarget_cc1_spec", 
 SUBTARGET_CC1_SPEC}, \
  gcc/config/loongarch/loongarch.h-  {"subtarget_cpp_spec", 
 SUBTARGET_CPP_SPEC}, \
  --
  gcc/config/mips/gnu-user.h-   and provides this hook instead.  */
  gcc/config/mips/gnu-user.h:#undef SUBTARGET_CC1_SPEC
  gcc/config/mips/gnu-user.h:#define SUBTARGET_CC1_SPEC 
 GNU_USER_TARGET_CC1_SPEC
  gcc/config/mips/gnu-user.h-
  --
  gcc/config/mips/linux-common.h-
  gcc/config/mips/linux-common.h:#undef  SUBTARGET_CC1_SPEC
  gcc/config/mips/linux-common.h:#define SUBTARGET_CC1_SPEC 
   \
  gcc/config/mips/linux-common.h-  LINUX_OR_ANDROID_CC 
 (GNU_USER_TARGET_CC1_SPEC, \
  --
  gcc/config/mips/mips.h-
  gcc/config/mips/mips.h:/* SUBTARGET_CC1_SPEC is passed to the 
 compiler proper.  It may be
  gcc/config/mips/mips.h-   overridden by subtargets.  */
  gcc/config/mips/mips.h:#ifndef SUBTARGET_CC1_SPEC
  gcc/config/mips/mips.h:#define SUBTARGET_CC1_SPEC ""
  gcc/config/mips/mips.h-#endif
  --
  gcc/config/mips/mips.h-#define EXTRA_SPECS
   \
  gcc/config/mips/mips.h:  { "subtarget_cc1_spec", SUBTARGET_CC1_SPEC 
 },  \
  gcc/config/mips/mips.h-  { "subtarget_cpp_spec", SUBTARGET_CPP_SPEC 
 },  \
  --
  gcc/config/mips/r3900.h-/* By default (if not mips-something-else) 
 produce code for the r3900 */
  gcc/config/mips/r3900.h:#undef SUBTARGET_CC1_SPEC
  gcc/config/mips/r3900.h:#define SUBTARGET_CC1_SPEC "\
  gcc/config/mips/r3900.h-%{mhard-float:%e-mhard-float not supported} \
>>> 
>>> Oh, I came up with the name SUBTARGET_CC1_SPEC after a discussion on the
>>> mailing list
>> I've put Iain in CC.
>>> and I have to admit that I didn't check that it was
>>> actually already in use.
>> Always one of the first things I do.  ;-)
>>> What about renaming the loongarch/mips define
>>> to LOONGARCH_CC1_SPEC and MIPS_CC1_SPEC?
>> Also in use are a number of other 'SUBTARGET_[...]_SPEC' and
>> corresponding 'subtarget_[...]_spec' in 'EXTRA_SPECS', for

[Patch] libgomp.texi: Reverse-offload updates (was: [Patch] libgomp: Handle OpenMP's reverse offloads)

2022-12-07 Thread Tobias Burnus


On 06.12.22 08:45, Tobias Burnus wrote:

* As follow-up,  libgomp.texi must be updated


That is what the attached patch does – obviously, it is depending on the
main patch.

OK (once the main patch is in)?

Tobias

libgomp.texi: Reverse-offload updates

libgomp/
	* libgomp.texi (5.0 Impl. Status): Update 'requires' and 'ancestor'.
	(GCN): Add item about 'omp requires'.
	(nvptx): Likewise; add item about reverse offload.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index efa7d956a33..e9ab079ecf5 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -192,8 +192,8 @@ The OpenMP 4.5 specification is fully supported.
   env variable @tab Y @tab
 @item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab
 @item @code{requires} directive @tab P
-  @tab complete but no non-host devices provides @code{unified_address},
-  @code{unified_shared_memory} or @code{reverse_offload}
+  @tab complete but no non-host devices provides @code{unified_address} or
+  @code{unified_shared_memory}
 @item @code{teams} construct outside an enclosing target region @tab Y @tab
 @item Non-rectangular loop nests @tab Y @tab
 @item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
@@ -228,7 +228,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{allocate} clause @tab P @tab Initial support
 @item @code{use_device_addr} clause on @code{target data} @tab Y @tab
 @item @code{ancestor} modifier on @code{device} clause
-  @tab Y @tab See comment for @code{requires}
+  @tab Y @tab Host fallback with GCN devices
 @item Implicit declare target directive @tab Y @tab
 @item Discontiguous array section with @code{target update} construct
   @tab N @tab
@@ -288,7 +288,7 @@ The OpenMP 4.5 specification is fully supported.
   @code{append_args} @tab N @tab
 @item @code{dispatch} construct @tab N @tab
 @item device-specific ICV settings with environment variables @tab Y @tab
-@item @code{assume} directive @tab Y @tab
+@item @code{assume} and @code{assumes} directives @tab Y @tab
 @item @code{nothing} directive @tab Y @tab
 @item @code{error} directive @tab Y @tab
 @item @code{masked} construct @tab Y @tab
@@ -4455,6 +4455,9 @@ The implementation remark:
 @item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
   using the C library @code{printf} functions and the Fortran
   @code{print}/@code{write} statements.
+@item OpenMP code that has a requires directive with @code{unified_address},
+  @code{unified_shared_memory} or @code{reverse_offload} will remove
+  any GCN device from the list of available devices (``host fallback'').
 @end itemize
 
 
@@ -4504,6 +4507,13 @@ The implementation remark:
 @item Compilation OpenMP code that contains @code{requires reverse_offload}
   requires at least @code{-march=sm_35}, compiling for @code{-march=sm_30}
   is not supported.
+@item For code containing reverse offload (i.e. @code{target} regions with
+  @code{device(ancestor:1)}), there is a slight performance penalty
+  for @emph{all} target regions, consisting mostly of shutdown delay
+  between zero to one microsecond and a tiny device querying overhead.
+@item OpenMP code that has a requires directive with @code{unified_address}
+  or @code{unified_shared_memory} will remove any nvptx device from the
+  list of available devices (``host fallback'').
 @end itemize
