Re: [PATCH] PR116080: Fix test suite checks for musttail

2024-08-02 Thread Andi Kleen
Andi Kleen  writes:

> From: Andi Kleen 
>
> This is a new attempt to fix PR116080. The previous try was reverted
> because it just broke a bunch of tests, hiding the problem.

The previous version still had one failure on powerpc because
of a template call that needs a dg-error check for external_tail_call.
I have fixed that in the version below.

Okay for trunk? I would like to check that one in to avoid the noise
in the regression reports.

---

This is a new attempt to fix PR116080. The previous try was reverted
because it just broke a bunch of tests, hiding the problem.

- musttail behaves differently than tailcall at -O0. Some of the tests
run at -O0, so add separate effective target tests for musttail.
- New effective target tests need to use unique file names
to make dejagnu caching work
- Change the tests to use new targets
- Add an external_musttail test to check for the target's ability
to do tail calls between translation units. This covers some powerpc
ABIs.
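
As a rough illustration of the first point, here is a minimal C sketch. It assumes a compiler that accepts the musttail statement attribute; the MUSTTAIL macro and sum_to are made-up names for this example, not part of the patch. With the attribute the call has to be emitted as a tail call even at -O0, while a plain return of a call is only turned into a sibcall when optimizing, which is why the tail_call effective target cannot stand in for musttail.

```c
#include <assert.h>

/* Hypothetical helper macro: expands to the musttail statement attribute
   when the compiler advertises it, and to nothing otherwise.  */
#if defined(__has_attribute)
# if __has_attribute(musttail)
#  define MUSTTAIL __attribute__((musttail))
# endif
#endif
#ifndef MUSTTAIL
# define MUSTTAIL
#endif

/* Sum 1..N using an accumulator; the recursive call is in tail position
   and has the same signature as the caller.  */
static long
sum_to (long n, long acc)
{
  assert (n >= 0);
  if (n == 0)
    return acc;
  MUSTTAIL return sum_to (n - 1, acc + n);
}
```

On a target that cannot do the tail call, a compiler that honours the attribute is expected to reject the marked return rather than silently fall back, which is the behaviour the dg-error checks in these tests look for.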

gcc/testsuite/ChangeLog:

PR testsuite/116080
* c-c++-common/musttail1.c: Use musttail target.
* c-c++-common/musttail12.c: Use struct_musttail target.
* c-c++-common/musttail2.c: Use musttail target.
* c-c++-common/musttail3.c: Likewise.
* c-c++-common/musttail4.c: Likewise.
* c-c++-common/musttail7.c: Likewise.
* c-c++-common/musttail8.c: Likewise.
* g++.dg/musttail10.C: Likewise. Replace powerpc checks with
external_musttail.
* g++.dg/musttail11.C: Use musttail target.
* g++.dg/musttail6.C: Use musttail target. Replace powerpc
checks with external_musttail.
* g++.dg/musttail9.C: Use musttail target.
* lib/target-supports.exp: Add musttail, struct_musttail,
external_musttail targets. Remove optimization for musttail.
Use unique file names for musttail.
---
 gcc/testsuite/c-c++-common/musttail1.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail12.c |  2 +-
 gcc/testsuite/c-c++-common/musttail2.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail3.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail4.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail7.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail8.c  |  2 +-
 gcc/testsuite/g++.dg/musttail10.C   |  6 ++---
 gcc/testsuite/g++.dg/musttail11.C   |  2 +-
 gcc/testsuite/g++.dg/musttail6.C|  4 ++--
 gcc/testsuite/g++.dg/musttail9.C|  2 +-
 gcc/testsuite/lib/target-supports.exp   | 30 -
 12 files changed, 38 insertions(+), 20 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/musttail1.c b/gcc/testsuite/c-c++-common/musttail1.c
index 74efcc2a0bc6..51549672e02a 100644
--- a/gcc/testsuite/c-c++-common/musttail1.c
+++ b/gcc/testsuite/c-c++-common/musttail1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { musttail && { c || c++11 } } } } */
 /* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
 
 int __attribute__((noinline,noclone,noipa))
diff --git a/gcc/testsuite/c-c++-common/musttail12.c b/gcc/testsuite/c-c++-common/musttail12.c
index 4140bcd00950..475afc5af3f3 100644
--- a/gcc/testsuite/c-c++-common/musttail12.c
+++ b/gcc/testsuite/c-c++-common/musttail12.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { struct_tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { struct_musttail && { c || c++11 } } } } */
 /* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
 
 struct str
diff --git a/gcc/testsuite/c-c++-common/musttail2.c b/gcc/testsuite/c-c++-common/musttail2.c
index 86f2c3d77404..1970c4edd670 100644
--- a/gcc/testsuite/c-c++-common/musttail2.c
+++ b/gcc/testsuite/c-c++-common/musttail2.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { musttail && { c || c++11 } } } } */
 
 struct box { char field[256]; int i; };
 
diff --git a/gcc/testsuite/c-c++-common/musttail3.c b/gcc/testsuite/c-c++-common/musttail3.c
index ea9589c59ef2..7499fd6460b4 100644
--- a/gcc/testsuite/c-c++-common/musttail3.c
+++ b/gcc/testsuite/c-c++-common/musttail3.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { struct_musttail && { c || c++11 } } } } */
 
 extern int foo2 (int x, ...);
 
diff --git a/gcc/testsuite/c-c++-common/musttail4.c b/gcc/testsuite/c-c++-common/musttail4.c
index 23f4b5e1cd68..bd6effa4b931 100644
--- a/gcc/testsuite/c-c++-common/musttail4.c
+++ b/gcc/testsuite/c-c++-common/musttail4.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { musttail && { c || c++11 } } } } */
 
 struct box { char field[64]; int i; };
 
diff --git a/gcc/testsuite/c-c++-common/musttail7.c 
b/gcc/testsuite/c-c++-common/musttail

Re: [PATCH 0/1] Initial support for AVX10.2

2024-08-02 Thread Andi Kleen
> 
> INT8 is actually char per my understanding.
> 
> For FP8, currently there is no basic calculation insts yet. So we have no
> support for them in AVX10.2 currently, and treat them just as a piece
> of char.
> 
> Also there might be other issues for FP8 to discuss, like ABI issues, so
> we put the support aside for now. When everything is mature, we may
> add the support for that.

But then it's too late, isn't it? You wouldn't be able to change
the types of the existing intrinsics anymore, or you would end up
with two sets of intrinsics later, along with interoperability
problems with full computation.

Better to define proper types from the beginning.

-Andi


Re: [PATCH 0/1] Initial support for AVX10.2

2024-08-01 Thread Andi Kleen
Haochen Jiang  writes:

> Hi all,
>
> AVX10.2 tech details has been just published on July 31st in the
> following link:
>
> https://cdrdv2.intel.com/v1/dl/getContent/828965
>
> For new features and instructions, we could divide them into two parts.
> One is ymm rounding control, the other is the new instructions.
>
> In the following weeks, we plan to upstream ymm rounding part first,
> following by new instructions. After all of them upstreamed, we will
> also upstream several patches optimizing codegen with new AVX10.2
> instructions.

Are there plans to make INT8/FP8 types supported by the compiler,
or will they only be supported through intrinsics?

It seems explicit types would be much more convenient for developers
to use, although they have some drawbacks (like accuracy depending
on spills).

I realize it's likely a lot more work, but it might be worth it?

-Andi


Re: [PATCH] middle-end/114563 - improve release_pages

2024-07-31 Thread Andi Kleen
On Wed, Jul 31, 2024 at 04:02:22PM +0200, Richard Biener wrote:
> The following improves release_pages when using the madvise path
> to sort the freelist to get more page entries contiguous and possibly
> release them.  This populates the unused prev pointer so the reclaim
> can then easily unlink from the freelist without re-ordering it.
> The paths not having madvise do not keep the memory allocated, so
> I left them untouched.
> 
> Re-bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> I've CCed people messing with release_pages;  This doesn't really
> address PR114563 but I thought I post this patch anyway - the
> actual issue we run into for the PR is the linear search of
> G.free_pages when that list becomes large but a requested allocation
> cannot be served from it.
> 
>   PR middle-end/114563
>   * ggc-page.cc (page_sort): New qsort comparator.
>   (release_pages): Sort the free_pages list entries after their
>   memory block virtual address to improve contiguous memory
>   chunk release.

I saw this in a profile some time ago and tried a slightly different
patch. Instead of a full sort it uses an array to keep multiple free
lists. But I couldn't find any speedups in non-checking builds later.

My feeling is that an array is probably more efficient.

I guess we should compare both on that PR.
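
The lookup idea behind the array of free lists can be sketched standalone roughly as follows. This is a simplified model, not the ggc-page.cc code: struct page, lists and NUM_FREE_LISTS are stand-in names chosen for the example. Each slot is claimed by the first page size that asks for it, and slot 0 is the shared fallback once all dedicated slots are taken.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_FREE_LISTS 8          /* stand-in for num_free_list */

struct page;                      /* stand-in for page_entry */

struct free_list
{
  size_t bytes;                   /* page size served; 0 means unclaimed */
  struct page *free_pages;        /* head of the singly linked free list */
};

static struct free_list lists[NUM_FREE_LISTS];

/* Return the list dedicated to ENTRY_SIZE, claiming an unused slot on
   first use.  Fall back to slot 0 when every dedicated slot is taken.  */
static struct free_list *
find_free_list (size_t entry_size)
{
  assert (entry_size != 0);
  for (int i = 1; i < NUM_FREE_LISTS; i++)
    {
      if (lists[i].bytes == entry_size)
        return &lists[i];
      if (lists[i].bytes == 0)
        {
          lists[i].bytes = entry_size;
          return &lists[i];
        }
    }
  return &lists[0];
}
```

With this layout a search for a usable page only walks the list for the requested size, so it should avoid most of the linear scan over unrelated entries; only the fallback list can still mix sizes.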


diff --git a/gcc/ggc-page.cc b/gcc/ggc-page.cc
index 4245f843a29f..af1627b002c6 100644
--- a/gcc/ggc-page.cc
+++ b/gcc/ggc-page.cc
@@ -234,6 +234,8 @@ static struct
 }
 inverse_table[NUM_ORDERS];
 
+struct free_list;
+
 /* A page_entry records the status of an allocation page.  This
structure is dynamically sized to fit the bitmap in_use_p.  */
 struct page_entry
@@ -251,6 +253,9 @@ struct page_entry
  of the host system page size.)  */
   size_t bytes;
 
+  /* Free list of this page size.  */
+  struct free_list *free_list;
+
   /* The address at which the memory is allocated.  */
   char *page;
 
@@ -368,6 +373,15 @@ struct free_object
 };
 #endif
 
+constexpr int num_free_list = 8;
+
+/* A free_list for pages with BYTES size.  */
+struct free_list
+{
+  size_t bytes;
+  page_entry *free_pages;
+};
+
 /* The rest of the global variables.  */
 static struct ggc_globals
 {
@@ -412,8 +426,8 @@ static struct ggc_globals
   int dev_zero_fd;
 #endif
 
-  /* A cache of free system pages.  */
-  page_entry *free_pages;
+  /* A cache of free system pages. Entry 0 is fallback.  */
+  struct free_list free_lists[num_free_list];
 
 #ifdef USING_MALLOC_PAGE_GROUPS
   page_group *page_groups;
@@ -754,6 +768,26 @@ clear_page_group_in_use (page_group *group, char *page)
 }
 #endif
 
+/* Find a free list for ENTRY_SIZE.  */
+
+static inline struct free_list *
+find_free_list (size_t entry_size)
+{
+  int i;
+  for (i = 1; i < num_free_list; i++)
+{
+  if (G.free_lists[i].bytes == entry_size)
+   return &G.free_lists[i];
+  if (G.free_lists[i].bytes == 0)
+   {
+ G.free_lists[i].bytes = entry_size;
+ return &G.free_lists[i];
+   }
+}
+  /* Fallback.  */
+  return &G.free_lists[0];
+}
+
 /* Allocate a new page for allocating objects of size 2^ORDER,
and return an entry for it.  The entry is not added to the
appropriate page_table list.  */
@@ -770,6 +804,7 @@ alloc_page (unsigned order)
 #ifdef USING_MALLOC_PAGE_GROUPS
   page_group *group;
 #endif
+  struct free_list *free_list;
 
   num_objects = OBJECTS_PER_PAGE (order);
   bitmap_size = BITMAP_SIZE (num_objects + 1);
@@ -782,8 +817,10 @@ alloc_page (unsigned order)
   entry = NULL;
   page = NULL;
 
+  free_list = find_free_list (entry_size);
+
   /* Check the list of free pages for one we can use.  */
-  for (pp = &G.free_pages, p = *pp; p; pp = &p->next, p = *pp)
+  for (pp = &free_list->free_pages, p = *pp; p; pp = &p->next, p = *pp)
 if (p->bytes == entry_size)
   break;
 
@@ -816,7 +853,7 @@ alloc_page (unsigned order)
   /* We want just one page.  Allocate a bunch of them and put the
 extras on the freelist.  (Can only do this optimization with
 mmap for backing store.)  */
-  struct page_entry *e, *f = G.free_pages;
+  struct page_entry *e, *f = free_list->free_pages;
   int i, entries = GGC_QUIRE_SIZE;
 
   page = alloc_anon (NULL, G.pagesize * GGC_QUIRE_SIZE, false);
@@ -833,12 +870,13 @@ alloc_page (unsigned order)
  e = XCNEWVAR (struct page_entry, page_entry_size);
  e->order = order;
  e->bytes = G.pagesize;
+ e->free_list = free_list;
  e->page = page + (i << G.lg_pagesize);
  e->next = f;
  f = e;
}
 
-  G.free_pages = f;
+  free_list->free_pages = f;
 }
   else
 page = alloc_anon (NULL, entry_size, true);
@@ -904,12 +942,13 @@ alloc_page (unsigned order)
  e = XCNEWVAR (struct page_entry, page_entry_size);
  e->order = order;
  e->bytes = G.pagesize;
+ e->free_list = free_list;
  e->page = a;
  e->group 

Re: [PATCH] Add a bootstrap-native build config

2024-07-30 Thread Andi Kleen
> > +BOOT_CFLAGS := -march=native -mtune=native $(BOOT_CFLAGS)
> 
> I was under the impression that -mtune=native is useless with
> -march=native. Is that wrong?

On x86 it's right, but not sure about other architectures. I suppose
it doesn't hurt.

-Andi


Re: [PATCH 2/2] Add AVX2 code path to lexer

2024-07-30 Thread Andi Kleen
> Is that from some kind of rigorous measurement under perf? As you
> surely know, 0.6% wall-clock time can be from boost clock variation
> or just run-to-run noise on x86.

Yes, I compared it using hyperfine, which does rigorous measurements.
It was well above the run-to-run variability.

I had some other patches that didn't meet that bar, e.g.
I've been experimenting with more modern hashes for inchash
and multiple ggc free lists, but so far there are no results
above the noise.

> 
> I have looked at this code before. When AVX2 is available, so is SSSE3,
> and then a much more efficient approach is available: instead of comparing
> against \r \n \\ ? one-by-one, build a vector
> 
>     0  1  2  3  4  5  6  7  8  9   a    b   c     d    e   f
> { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, '\n', 0, '\\', '\r', 0, '?' }
> 
> where each character C we're seeking is at position (C % 16). Then
> you can match against them all at once using PSHUFB:
> 
>   t = _mm_shuffle_epi8 (lut, data);
>   t = t == data;

I thought the PSHUFB trick only worked for some bit patterns?

At least according to this paper: https://arxiv.org/pdf/1902.08318

But yes if it applies here it's a good idea.
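
A scalar model of the lookup makes it easy to check whether the trick applies to these four characters. The sketch below is plain C rather than intrinsics, and pshufb_lane and match_mask16 are names invented for the example. It models PSHUFB as "return 0 if the index byte has its top bit set, otherwise return the table entry selected by the low nibble". It works here because '\n' (0x0a), '\\' (0x5c), '\r' (0x0d) and '?' (0x3f) all have different low nibbles and none has the top bit set; entry 0 holds 1 so that a zero byte or any other byte with a zero low nibble never compares equal.

```c
#include <assert.h>
#include <stddef.h>

/* Lookup table indexed by the low nibble of the input byte.  */
static const unsigned char lut[16] = {
  1, 0, 0, 0, 0, 0, 0, 0, 0, 0, '\n', 0, '\\', '\r', 0, '?'
};

/* Scalar model of one PSHUFB lane: an index byte with the top bit set
   selects zero, otherwise the low four bits select a table entry.  */
static unsigned char
pshufb_lane (unsigned char index)
{
  return (index & 0x80) ? 0 : lut[index & 0x0f];
}

/* Return a bitmask with bit I set when DATA[I] is one of \n \r \\ ?.
   Models the shuffle followed by a byte-wise compare against DATA.  */
static unsigned int
match_mask16 (const unsigned char *data, size_t len)
{
  assert (len <= 16);
  unsigned int mask = 0;
  for (size_t i = 0; i < len; i++)
    if (pshufb_lane (data[i]) == data[i])
      mask |= 1u << i;
  return mask;
}
```

Running the model over all 256 byte values shows no false positives, so the restriction on bit patterns does not seem to bite for this particular character set.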


> 
> As you might recognize this handily beats the fancy SSE4.1 loop as well.
> I did not pursue this because I did not measure a substantial improvement
> (we're way into the land of diminishing returns here) and it seemed like
> maintainers might not like to be distracted with that, but if we are
> touching this code, might as well use the more efficient algorithm.
> I'll be happy to propose a patch if people think it's worthwhile.

Yes makes sense.

(of course it would be even better to teach the vectorizer about it,
although this will require fixing some other issues first, see PR116126)

-Andi


[PATCH] Add a bootstrap-native build config

2024-07-30 Thread Andi Kleen
From: Andi Kleen 

... that uses -march=native -mtune=native to build a compiler optimized
for the host.

config/ChangeLog:

* bootstrap-native.mk: New file.

gcc/ChangeLog:

* doc/install.texi: Document bootstrap-native.
---
 config/bootstrap-native.mk | 1 +
 gcc/doc/install.texi   | 6 ++
 2 files changed, 7 insertions(+)
 create mode 100644 config/bootstrap-native.mk

diff --git a/config/bootstrap-native.mk b/config/bootstrap-native.mk
new file mode 100644
index 000000000000..a4a3d8594089
--- /dev/null
+++ b/config/bootstrap-native.mk
@@ -0,0 +1 @@
+BOOT_CFLAGS := -march=native -mtune=native $(BOOT_CFLAGS)
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 4973f195daf9..29827c5106f8 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3052,6 +3052,12 @@ Removes any @option{-O}-started option from @code{BOOT_CFLAGS}, and adds
 @itemx @samp{bootstrap-Og}
 Analogous to @code{bootstrap-O1}.
 
+@item @samp{bootstrap-native}
+@itemx @samp{bootstrap-native}
+Optimize the compiler code for the build host, if supported by the
+architecture. Note this only affects the compiler, not the targeted
+code. If you want the latter, use @samp{--with-cpu}.
+
 @item @samp{bootstrap-lto}
 Enables Link-Time Optimization for host tools during bootstrapping.
 @samp{BUILD_CONFIG=bootstrap-lto} is equivalent to adding
-- 
2.45.2



Re: [PATCH 2/2] Add AVX2 code path to lexer

2024-07-30 Thread Andi Kleen
Andrew Pinski  writes:
>
> Using the builtin here seems wrong. Why not use the intrinsic
> _mm256_movemask_epi8 ?

I followed the rest of the vectorized code paths. The original reason was that
there was some incompatibility of the intrinsic header with the source
build. I don't know if it's still true, but I guess it doesn't hurt.

> Also it might make sense to remove the MMX version.

See the previous patch.

-Andi



[PATCH 1/2] Remove MMX code path in lexer

2024-07-30 Thread Andi Kleen
From: Andi Kleen 

Host systems with only MMX and no SSE2 should be really rare now.
Let's remove the MMX code path to keep the number of custom
implementations the same.

The SSE2 code path is also somewhat dubious now (nearly everything
should have SSE 4.2, which is more than 15 years old), but it is used
as the fallback for others, and Solaris apparently also uses it due to
tool chain deficiencies.

libcpp/ChangeLog:

* lex.cc (search_line_mmx): Remove function.
(init_vectorized_lexer): Remove search_line_mmx.
---
 libcpp/lex.cc | 75 ---
 1 file changed, 75 deletions(-)

diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index 16f2c23af1e1..1591dcdf151a 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -290,71 +290,6 @@ static const char repl_chars[4][16] __attribute__((aligned(16))) = {
 '?', '?', '?', '?', '?', '?', '?', '?' },
 };
 
-/* A version of the fast scanner using MMX vectorized byte compare insns.
-
-   This uses the PMOVMSKB instruction which was introduced with "MMX2",
-   which was packaged into SSE1; it is also present in the AMD MMX
-   extension.  Mark the function as using "sse" so that we emit a real
-   "emms" instruction, rather than the 3dNOW "femms" instruction.  */
-
-static const uchar *
-#ifndef __SSE__
-__attribute__((__target__("sse")))
-#endif
-search_line_mmx (const uchar *s, const uchar *end ATTRIBUTE_UNUSED)
-{
-  typedef char v8qi __attribute__ ((__vector_size__ (8)));
-  typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
-
-  const v8qi repl_nl = *(const v8qi *)repl_chars[0];
-  const v8qi repl_cr = *(const v8qi *)repl_chars[1];
-  const v8qi repl_bs = *(const v8qi *)repl_chars[2];
-  const v8qi repl_qm = *(const v8qi *)repl_chars[3];
-
-  unsigned int misalign, found, mask;
-  const v8qi *p;
-  v8qi data, t, c;
-
-  /* Align the source pointer.  While MMX doesn't generate unaligned data
- faults, this allows us to safely scan to the end of the buffer without
- reading beyond the end of the last page.  */
-  misalign = (uintptr_t)s & 7;
-  p = (const v8qi *)((uintptr_t)s & -8);
-  data = *p;
-
-  /* Create a mask for the bytes that are valid within the first
- 16-byte block.  The Idea here is that the AND with the mask
- within the loop is "free", since we need some AND or TEST
- insn in order to set the flags for the branch anyway.  */
-  mask = -1u << misalign;
-
-  /* Main loop processing 8 bytes at a time.  */
-  goto start;
-  do
-{
-  data = *++p;
-  mask = -1;
-
-start:
-  t = __builtin_ia32_pcmpeqb(data, repl_nl);
-  c = __builtin_ia32_pcmpeqb(data, repl_cr);
-  t = (v8qi) __builtin_ia32_por ((__m64)t, (__m64)c);
-  c = __builtin_ia32_pcmpeqb(data, repl_bs);
-  t = (v8qi) __builtin_ia32_por ((__m64)t, (__m64)c);
-  c = __builtin_ia32_pcmpeqb(data, repl_qm);
-  t = (v8qi) __builtin_ia32_por ((__m64)t, (__m64)c);
-  found = __builtin_ia32_pmovmskb (t);
-  found &= mask;
-}
-  while (!found);
-
-  __builtin_ia32_emms ();
-
-  /* FOUND contains 1 in bits for which we matched a relevant
- character.  Conversion to the byte index is trivial.  */
-  found = __builtin_ctz(found);
-  return (const uchar *)p + found;
-}
 
 /* A version of the fast scanner using SSE2 vectorized byte compare insns.  */
 
@@ -509,8 +444,6 @@ init_vectorized_lexer (void)
   minimum = 3;
 #elif defined(__SSE2__)
   minimum = 2;
-#elif defined(__SSE__)
-  minimum = 1;
 #endif
 
   if (minimum == 3)
@@ -521,14 +454,6 @@ init_vectorized_lexer (void)
 impl = search_line_sse42;
   else if (minimum == 2 || (edx & bit_SSE2))
impl = search_line_sse2;
-  else if (minimum == 1 || (edx & bit_SSE))
-   impl = search_line_mmx;
-}
-  else if (__get_cpuid (0x80000001, &eax, &ebx, &ecx, &edx))
-{
-  if (minimum == 1
- || (edx & (bit_MMXEXT | bit_CMOV)) == (bit_MMXEXT | bit_CMOV))
-   impl = search_line_mmx;
 }
 
   search_line_fast = impl;
-- 
2.45.2



[PATCH 2/2] Add AVX2 code path to lexer

2024-07-30 Thread Andi Kleen
From: Andi Kleen 

AVX2 is widely available on x86 and allows the scanner line check to
process 32 bytes at a time. The code is similar to the SSE2 code path,
just using AVX2 with 32 bytes at a time instead of SSE2 with 16 bytes.

Also adjust the code to allow inlining when the compiler
is built for an AVX2 host, following what other architectures
do.

I see about a 0.6% compile time improvement when compiling i386
insn-recog.i with -O0.
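
The structure of the scan can be modelled in scalar C as a sketch only. Here block_mask stands in for the vector compares, ORs and byte-mask extraction, and BLOCK, block_mask and search_line_model are names invented for this example. The model assumes the buffer starts on a 32-byte boundary, is padded to a multiple of 32 bytes, and always contains a terminating newline; the real code instead aligns the pointer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BLOCK = 32 };

/* Scalar stand-in for the vector compare and movemask steps: bit I of
   the result is set when P[I] is one of the four interesting bytes.  */
static uint32_t
block_mask (const unsigned char *p)
{
  uint32_t found = 0;
  for (int i = 0; i < BLOCK; i++)
    if (p[i] == '\n' || p[i] == '\r' || p[i] == '\\' || p[i] == '?')
      found |= (uint32_t) 1 << i;
  return found;
}

/* Return a pointer to the first interesting byte at or after S.
   BASE is the BLOCK-aligned start of the padded buffer.  */
static const unsigned char *
search_line_model (const unsigned char *base, const unsigned char *s)
{
  assert (s >= base);
  size_t misalign = (size_t) (s - base) % BLOCK;
  const unsigned char *p = s - misalign;
  /* Ignore matches in the bytes before S within the first block.  */
  uint32_t mask = ~(uint32_t) 0 << misalign;
  uint32_t found;

  while ((found = block_mask (p) & mask) == 0)
    {
      p += BLOCK;
      mask = ~(uint32_t) 0;
    }
  /* The lowest set bit is the byte index of the first match.  */
  return p + __builtin_ctz (found);
}
```

Only the first block needs the mask; every later block is checked in full, so the loop body stays the same as in the 16-byte variant with a wider block.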

libcpp/ChangeLog:

* config.in (HAVE_AVX2): Add.
* configure: Regenerate.
* configure.ac: Add HAVE_AVX2 check.
* lex.cc (repl_chars): Extend to 32 bytes.
(search_line_avx2): New function to scan line using AVX2.
(init_vectorized_lexer): Check for AVX2 in CPUID.
---
 libcpp/config.in|  3 ++
 libcpp/configure| 17 +
 libcpp/configure.ac |  3 ++
 libcpp/lex.cc   | 91 +++--
 4 files changed, 110 insertions(+), 4 deletions(-)

diff --git a/libcpp/config.in b/libcpp/config.in
index 253ef03a3dea..8fad6bd4b4f5 100644
--- a/libcpp/config.in
+++ b/libcpp/config.in
@@ -213,6 +213,9 @@
 /* Define to 1 if you can assemble SSE4 insns. */
 #undef HAVE_SSE4
 
+/* Define to 1 if you can assemble AVX2 insns. */
+#undef HAVE_AVX2
+
 /* Define to 1 if you have the <stddef.h> header file. */
 #undef HAVE_STDDEF_H
 
diff --git a/libcpp/configure b/libcpp/configure
index 32d6aaa30699..6d9286ac9601 100755
--- a/libcpp/configure
+++ b/libcpp/configure
@@ -9149,6 +9149,23 @@ if ac_fn_c_try_compile "$LINENO"; then :
 
 $as_echo "#define HAVE_SSE4 1" >>confdefs.h
 
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+asm ("vpcmpeqb %%ymm0, %%ymm4, %%ymm5" : : "i"(0))
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+
+$as_echo "#define HAVE_AVX2 1" >>confdefs.h
+
 fi
 rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
 esac
diff --git a/libcpp/configure.ac b/libcpp/configure.ac
index b883fec776fe..c06609827924 100644
--- a/libcpp/configure.ac
+++ b/libcpp/configure.ac
@@ -200,6 +200,9 @@ case $target in
 AC_TRY_COMPILE([], [asm ("pcmpestri %0, %%xmm0, %%xmm1" : : "i"(0))],
   [AC_DEFINE([HAVE_SSE4], [1],
 [Define to 1 if you can assemble SSE4 insns.])])
+AC_TRY_COMPILE([], [asm ("vpcmpeqb %%ymm0, %%ymm4, %%ymm5" : : "i"(0))],
+  [AC_DEFINE([HAVE_AVX2], [1],
+[Define to 1 if you can assemble AVX2 insns.])])
 esac
 
 # Enable --enable-host-shared.
diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index 1591dcdf151a..72f3402aac99 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -278,19 +278,31 @@ search_line_acc_char (const uchar *s, const uchar *end ATTRIBUTE_UNUSED)
 /* Replicated character data to be shared between implementations.
Recall that outside of a context with vector support we can't
define compatible vector types, therefore these are all defined
-   in terms of raw characters.  */
-static const char repl_chars[4][16] __attribute__((aligned(16))) = {
+   in terms of raw characters.
+   gcc constant propagates this and usually turns it into a
+   vector broadcast, so it actually disappears.  */
+
+static const char repl_chars[4][32] __attribute__((aligned(32))) = {
   { '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n',
+'\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n',
+'\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n',
 '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n' },
   { '\r', '\r', '\r', '\r', '\r', '\r', '\r', '\r',
+'\r', '\r', '\r', '\r', '\r', '\r', '\r', '\r',
+'\r', '\r', '\r', '\r', '\r', '\r', '\r', '\r',
 '\r', '\r', '\r', '\r', '\r', '\r', '\r', '\r' },
   { '\\', '\\', '\\', '\\', '\\', '\\', '\\', '\\',
+'\\', '\\', '\\', '\\', '\\', '\\', '\\', '\\',
+'\\', '\\', '\\', '\\', '\\', '\\', '\\', '\\',
 '\\', '\\', '\\', '\\', '\\', '\\', '\\', '\\' },
   { '?', '?', '?', '?', '?', '?', '?', '?',
+'?', '?', '?', '?', '?', '?', '?', '?',
+'?', '?', '?', '?', '?', '?', '?', '?',
 '?', '?', '?', '?', '?', '?', '?', '?' },
 };
 
 
+#ifndef __AVX2__
 /* A version of the fast scanner using SSE2 vectorized byte compare insns.  */
 
 static const uchar *
@@ -343,8 +355,9 @@ search_line_sse2 (const uchar *s, const uchar *end ATTRIBUTE_UNUSED)
   found = __builtin_ctz(found);
   return (const uchar *)p + found;
 }
+#endif
 
-#ifdef HAVE_SSE4
+#if defined(HAVE_SSE4) && !defined(__AVX2__)
 /* A version of the fast scanner using SSE 4.2 vectorized string insns.  */
 
 static const uchar *
@@ -425,6 +438,71 @@ search_line_sse42 (const uchar *s, const uchar *end)
 #define search_line_sse42 search_line_sse2
 #endif
 
+#ifdef HAVE_AVX2
+
+/* A version of the fast scanner using AVX2 vectorized byte compare insns.  */
+
+static const uchar *
+#

[PATCH] PR116080: Fix test suite checks for musttail

2024-07-29 Thread Andi Kleen
From: Andi Kleen 

This is a new attempt to fix PR116080. The previous try was reverted
because it just broke a bunch of tests, hiding the problem.

- musttail behaves differently than tailcall at -O0. Some of the tests
run at -O0, so add separate effective target tests for musttail.
- New effective target tests need to use unique file names
to make dejagnu caching work
- Change the tests to use new targets
- Add an external_musttail test to check for the target's ability
to do tail calls between translation units. This covers some powerpc
ABIs.

gcc/testsuite/ChangeLog:

PR testsuite/116080
* c-c++-common/musttail1.c: Use musttail target.
* c-c++-common/musttail12.c: Use struct_musttail target.
* c-c++-common/musttail2.c: Use musttail target.
* c-c++-common/musttail3.c: Likewise.
* c-c++-common/musttail4.c: Likewise.
* c-c++-common/musttail7.c: Likewise.
* c-c++-common/musttail8.c: Likewise.
* g++.dg/musttail10.C: Likewise. Replace powerpc checks with
external_musttail.
* g++.dg/musttail11.C: Use musttail target.
* g++.dg/musttail6.C: Use musttail target. Replace powerpc
checks with external_musttail.
* g++.dg/musttail9.C: Use musttail target.
* lib/target-supports.exp: Add musttail, struct_musttail,
external_musttail targets. Remove optimization for musttail.
Use unique file names for musttail.
---
 gcc/testsuite/c-c++-common/musttail1.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail12.c |  2 +-
 gcc/testsuite/c-c++-common/musttail2.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail3.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail4.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail7.c  |  2 +-
 gcc/testsuite/c-c++-common/musttail8.c  |  2 +-
 gcc/testsuite/g++.dg/musttail10.C   |  4 ++--
 gcc/testsuite/g++.dg/musttail11.C   |  2 +-
 gcc/testsuite/g++.dg/musttail6.C|  4 ++--
 gcc/testsuite/g++.dg/musttail9.C|  2 +-
 gcc/testsuite/lib/target-supports.exp   | 30 -
 12 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/musttail1.c b/gcc/testsuite/c-c++-common/musttail1.c
index 74efcc2a0bc6..51549672e02a 100644
--- a/gcc/testsuite/c-c++-common/musttail1.c
+++ b/gcc/testsuite/c-c++-common/musttail1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { musttail && { c || c++11 } } } } */
 /* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
 
 int __attribute__((noinline,noclone,noipa))
diff --git a/gcc/testsuite/c-c++-common/musttail12.c b/gcc/testsuite/c-c++-common/musttail12.c
index 4140bcd00950..475afc5af3f3 100644
--- a/gcc/testsuite/c-c++-common/musttail12.c
+++ b/gcc/testsuite/c-c++-common/musttail12.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { struct_tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { struct_musttail && { c || c++11 } } } } */
 /* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
 
 struct str
diff --git a/gcc/testsuite/c-c++-common/musttail2.c b/gcc/testsuite/c-c++-common/musttail2.c
index 86f2c3d77404..1970c4edd670 100644
--- a/gcc/testsuite/c-c++-common/musttail2.c
+++ b/gcc/testsuite/c-c++-common/musttail2.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { musttail && { c || c++11 } } } } */
 
 struct box { char field[256]; int i; };
 
diff --git a/gcc/testsuite/c-c++-common/musttail3.c b/gcc/testsuite/c-c++-common/musttail3.c
index ea9589c59ef2..7499fd6460b4 100644
--- a/gcc/testsuite/c-c++-common/musttail3.c
+++ b/gcc/testsuite/c-c++-common/musttail3.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { struct_musttail && { c || c++11 } } } } */
 
 extern int foo2 (int x, ...);
 
diff --git a/gcc/testsuite/c-c++-common/musttail4.c b/gcc/testsuite/c-c++-common/musttail4.c
index 23f4b5e1cd68..bd6effa4b931 100644
--- a/gcc/testsuite/c-c++-common/musttail4.c
+++ b/gcc/testsuite/c-c++-common/musttail4.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { musttail && { c || c++11 } } } } */
 
 struct box { char field[64]; int i; };
 
diff --git a/gcc/testsuite/c-c++-common/musttail7.c b/gcc/testsuite/c-c++-common/musttail7.c
index c753a3fe9b2a..d17cb71256d7 100644
--- a/gcc/testsuite/c-c++-common/musttail7.c
+++ b/gcc/testsuite/c-c++-common/musttail7.c
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-do compile { target { musttail && { c || c++11 } } } } */
 /* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
 
 void __attribute__((noipa)) f() {}
diff --git a/gcc/testsuite/c-c+

[gcc r15-2384] Revert "PR116080: Fix tail call dejagnu checks"

2024-07-29 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:a7d6f7327e9211fbb4a800c06d00c4555dbffcec

commit r15-2384-ga7d6f7327e9211fbb4a800c06d00c4555dbffcec
Author: Andi Kleen 
Date:   Mon Jul 29 10:17:43 2024 -0700

Revert "PR116080: Fix tail call dejagnu checks"

This reverts commit ee41cd863b7c38ee3bc415ea7154954aa6facca3.

Diff:
---
 gcc/testsuite/g++.dg/musttail10.C |  2 +-
 gcc/testsuite/g++.dg/musttail6.C  |  2 +-
 gcc/testsuite/lib/target-supports.exp | 14 +++---
 3 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/g++.dg/musttail10.C b/gcc/testsuite/g++.dg/musttail10.C
index bd75affa2220..ff7fcc7d8755 100644
--- a/gcc/testsuite/g++.dg/musttail10.C
+++ b/gcc/testsuite/g++.dg/musttail10.C
@@ -8,7 +8,7 @@ double g() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-cal
 
 template 
 __attribute__((noinline, noclone, noipa))
-T g1() { [[gnu::musttail]] return f(); } /* { dg-error "target is not able" "" { target { external_tail_call } } } */
+T g1() { [[gnu::musttail]] return f(); } /* { dg-error "target is not able" "" { target powerpc*-*-* } } */
 
 template 
 __attribute__((noinline, noclone, noipa))
diff --git a/gcc/testsuite/g++.dg/musttail6.C b/gcc/testsuite/g++.dg/musttail6.C
index 81f6d9f3ca77..5c6f69407ddb 100644
--- a/gcc/testsuite/g++.dg/musttail6.C
+++ b/gcc/testsuite/g++.dg/musttail6.C
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { struct_tail_call } } } */
-/* { dg-require-effective-target external_tail_call } */
 /* A lot of architectures will not build this due to PR115606 and PR115607 */
+/* { dg-skip-if "powerpc does not support sibcall to templates" { powerpc*-*-* } } */
 /* { dg-options "-std=gnu++11" } */
 /* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 0a3946e82d4b..d368251ef9a4 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12741,15 +12741,7 @@ proc check_effective_target_tail_call { } {
 return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
__attribute__((__noipa__)) void foo (void) { }
__attribute__((__noipa__)) void bar (void) { foo(); }
-} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
-}
-
-# Return 1 if the target can perform tail-calls for externals
-proc check_effective_target_external_tail_call { } {
-return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
-   extern __attribute__((__noipa__)) void foo (void);
-   __attribute__((__noipa__)) void bar (void) { foo(); }
-} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
+} {-O2 -fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
 }
 
 # Return 1 if the target can perform tail-call optimizations for structures
@@ -12759,9 +12751,9 @@ proc check_effective_target_struct_tail_call { } {
 return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
// C++
struct foo { int a, b; };
-   extern __attribute__((__noipa__)) struct foo foo (void);
+   __attribute__((__noipa__)) struct foo foo (void) { return {}; }
__attribute__((__noipa__)) struct foo bar (void) { return foo(); }
-} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
+} {-O2 -fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
 }
 
 # Return 1 if the target's calling sequence or its ABI


Re: [PATCH v1 1/2] PR116080: Fix tail call dejagnu checks

2024-07-29 Thread Andi Kleen



I'm going to revert the patch for now. There are two problems:

- The new effective-target checks don't have unique names, so the DejaGnu
result caching confuses the results.
- To test with -O2 we need explicit musttail checks because the tail call
optimization doesn't run at -O0 without musttail.
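
The caching problem in the first point can be illustrated with a toy model.
This is only a sketch in C, not DejaGnu's implementation (the real logic is
Tcl in lib/target-supports.exp); cached_check and the two probe functions
are hypothetical names, and the probe results are made up. It shows how two
different checks that share one cache key end up sharing one answer.

```c
#include <assert.h>
#include <string.h>

/* Toy result cache keyed by probe name (hypothetical model only). */
#define MAX_ENTRIES 8

struct cache_entry { const char *key; int result; };
static struct cache_entry cache[MAX_ENTRIES];
static int cache_len;

/* Return the cached result for KEY, or run PROBE once and remember it. */
int cached_check(const char *key, int (*probe)(void))
{
    for (int i = 0; i < cache_len; i++)
        if (strcmp(cache[i].key, key) == 0)
            return cache[i].result;   /* cache hit: PROBE is never run */
    assert(cache_len < MAX_ENTRIES);
    cache[cache_len].key = key;
    cache[cache_len].result = probe();
    return cache[cache_len++].result;
}

/* Made-up probe outcomes: local tail calls work, external ones do not. */
int probe_local_tail_call(void)    { return 1; }
int probe_external_tail_call(void) { return 0; }
```

Calling cached_check("tail_call", probe_local_tail_call) and then
cached_check("tail_call", probe_external_tail_call) returns 1 both times,
because the second call hits the entry stored by the first. Giving the
second check its own key avoids the stale answer.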



Re: [PATCH v1 1/2] PR116080: Fix tail call dejagnu checks

2024-07-29 Thread Andi Kleen
> ..., that means that a number of the new test cases are UNSUPPORTED, for
> example, x86_64 GNU/Linux:
> 
> +UNSUPPORTED: c-c++-common/musttail1.c  -Wc++-compat 
> +UNSUPPORTED: c-c++-common/musttail12.c  -Wc++-compat 
> +PASS: c-c++-common/musttail13.c  -Wc++-compat   (test for errors, line 4)
> +PASS: c-c++-common/musttail13.c  -Wc++-compat  (test for excess errors)
> +UNSUPPORTED: c-c++-common/musttail2.c  -Wc++-compat 
> +UNSUPPORTED: c-c++-common/musttail3.c  -Wc++-compat 
> +UNSUPPORTED: c-c++-common/musttail4.c  -Wc++-compat 
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for errors, line 17)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for warnings, line 10)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for warnings, line 11)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for warnings, line 12)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for warnings, line 24)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for warnings, line 25)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for warnings, line 26)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for warnings, line 5)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat   (test for warnings, line 6)
> +PASS: c-c++-common/musttail5.c  -Wc++-compat  (test for excess errors)
> +UNSUPPORTED: c-c++-common/musttail7.c  -Wc++-compat 
> +UNSUPPORTED: c-c++-common/musttail8.c  -Wc++-compat 
> 
> (Similarly for their C++ testing.)
> 
> +UNSUPPORTED: g++.dg/musttail10.C  
> +UNSUPPORTED: g++.dg/musttail11.C  
> +UNSUPPORTED: g++.dg/musttail6.C  
> +UNSUPPORTED: g++.dg/musttail9.C  
> 
> ..., and even a few existing test cases "regress" from PASS to
> UNSUPPORTED:
> 
> [-PASS:-]{+UNSUPPORTED:+} gcc.dg/plugin/must-tail-call-1.c -fplugin=./must_tail_call_plugin.so[-(test for excess errors)-]
> [-PASS:-]{+UNSUPPORTED:+} gcc.dg/plugin/must-tail-call-2.c -fplugin=./must_tail_call_plugin.so[-(test for errors, line 18)-]
> [-PASS: gcc.dg/plugin/must-tail-call-2.c -fplugin=./must_tail_call_plugin.so  (test for errors, line 33)-]
> [-PASS: gcc.dg/plugin/must-tail-call-2.c -fplugin=./must_tail_call_plugin.so  (test for errors, line 40)-]
> [-PASS: gcc.dg/plugin/must-tail-call-2.c -fplugin=./must_tail_call_plugin.so  (test for errors, line 49)-]
> [-PASS: gcc.dg/plugin/must-tail-call-2.c -fplugin=./must_tail_call_plugin.so  (test for errors, line 58)-]
> [-PASS: gcc.dg/plugin/must-tail-call-2.c -fplugin=./must_tail_call_plugin.so (test for excess errors)-]
> 
> Similarly for ppc64le GNU/Linux.
> 
> Is that intentional?

Thanks.  I will take a look. At least on x86_64-linux everything should
be supported. On powerpc and ARM I expect some tests to be unsupported.

But the results of the existing test cases shouldn't have changed. Maybe
we need more tail_call dejagnu tests that also enable -O2.

The whole area is unfortunately somewhat of a minefield because of
the many varying restrictions on tail calls, in both frontends
and targets.

-Andi


Re: [Linaro-TCWG-CI] gcc-15-2233-g8d1af8f904a: Failure on arm

2024-07-28 Thread Andi Kleen
On Sun, Jul 28, 2024 at 09:26:26AM +0400, Maxim Kuvyrkov wrote:
> Hi Andi,
> 
> The regression is ...
>   === g++ tests ===
> 
> Running g++:g++.dg/dg.exp ...
> FAIL: c-c++-common/musttail12.c -std=c++14 (test for excess errors)
> FAIL: c-c++-common/musttail12.c -std=c++17 (test for excess errors)
> FAIL: c-c++-common/musttail12.c -std=c++20 (test for excess errors)
> FAIL: g++.dg/musttail6.C (test for excess errors)
> 
> It wasn't included in the report due to typo in the scripts.

It should be fixed now.

-Andi


gcc-wwwdocs branch master updated. 823b04aa91ca48a9f1d73b1fa2eda4d7b34400cf

2024-07-28 Thread Andi Kleen via Gcc-cvs-wwwdocs
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gcc-wwwdocs".

The branch, master has been updated
   via  823b04aa91ca48a9f1d73b1fa2eda4d7b34400cf (commit)
  from  a4557009470684fe3cbb5b0ec4332ef840ae4aa0 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -
commit 823b04aa91ca48a9f1d73b1fa2eda4d7b34400cf
Author: Andi Kleen 
Date:   Sun Jul 28 21:23:14 2024 -0700

add manual links for constexpr asm/musttail

diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index 1d0cfa16..3b3a6c0b 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -62,14 +62,16 @@ a work-in-progress.
 
 
 
-  A musttail statement attribute was added to enforce tail 
calls.
+   A https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html#index-musttail-statement-attribute;>
+   musttail statement attribute was added to enforce 
tail calls.
 
 
 
 
 
-  Inline assembler statements now support constexpr generated 
strings,
-  analoguous to static_assert.
+   Inline assembler statements now support
+   https://gcc.gnu.org/onlinedocs/gcc/asm-constexprs.html;>constexpr
 generated strings,
+ analoguous to static_assert.
 
 
 

---

Summary of changes:
 htdocs/gcc-15/changes.html | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)


hooks/post-receive
-- 
gcc-wwwdocs


[gcc r15-2340] PR116019: Improve tail call error message

2024-07-26 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:899ee4815424a73a2b9d899591fab3fcc4520b61

commit r15-2340-g899ee4815424a73a2b9d899591fab3fcc4520b61
Author: Andi Kleen 
Date:   Thu Jul 25 13:54:50 2024 -0700

PR116019: Improve tail call error message

The "tail call must be same type" message is common on some
targets with C++, or without optimization. It is generated
when gcc believes there is an access of the return value
after the call. However, it usually does not actually correspond
to a type mismatch, but has other causes.

Make it slightly more vague to be less misleading.

gcc/ChangeLog:

PR c++/116019
* tree-tailcall.cc (find_tail_calls): Change tail call
error message.

Diff:
---
 gcc/tree-tailcall.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index a68079d4f507..1901b1a13f99 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -632,7 +632,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret, bool only_musttail,
   && may_be_aliased (result_decl)
   && ref_maybe_used_by_stmt_p (call, result_decl, false))
 {
-  maybe_error_musttail (call, _("tail call must be same type"));
+  maybe_error_musttail (call, _("return value used after call"));
   return;
 }


[gcc r15-2339] PR116080: Fix tail call dejagnu checks

2024-07-26 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:ee41cd863b7c38ee3bc415ea7154954aa6facca3

commit r15-2339-gee41cd863b7c38ee3bc415ea7154954aa6facca3
Author: Andi Kleen 
Date:   Wed Jul 24 20:18:56 2024 -0700

PR116080: Fix tail call dejagnu checks

- Run the effective-target tail_call checks without optimization to
match the actual test cases.
- Add an extra check for external tail calls to handle targets like
powerpc that cannot tail call between different object files.
This one will also cover templates.

gcc/testsuite/ChangeLog:

PR testsuite/116080
* g++.dg/musttail10.C: Use external tail call target check.
* g++.dg/musttail6.C: Ditto.
* lib/target-supports.exp: Add external_tail_call. Disable
optimization for tail call checks.

Diff:
---
 gcc/testsuite/g++.dg/musttail10.C |  2 +-
 gcc/testsuite/g++.dg/musttail6.C  |  2 +-
 gcc/testsuite/lib/target-supports.exp | 14 +++---
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/g++.dg/musttail10.C b/gcc/testsuite/g++.dg/musttail10.C
index ff7fcc7d8755..bd75affa2220 100644
--- a/gcc/testsuite/g++.dg/musttail10.C
+++ b/gcc/testsuite/g++.dg/musttail10.C
@@ -8,7 +8,7 @@ double g() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-cal
 
 template 
 __attribute__((noinline, noclone, noipa))
-T g1() { [[gnu::musttail]] return f(); } /* { dg-error "target is not able" "" { target powerpc*-*-* } } */
+T g1() { [[gnu::musttail]] return f(); } /* { dg-error "target is not able" "" { target { external_tail_call } } } */
 
 template 
 __attribute__((noinline, noclone, noipa))
diff --git a/gcc/testsuite/g++.dg/musttail6.C b/gcc/testsuite/g++.dg/musttail6.C
index 5c6f69407ddb..81f6d9f3ca77 100644
--- a/gcc/testsuite/g++.dg/musttail6.C
+++ b/gcc/testsuite/g++.dg/musttail6.C
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { struct_tail_call } } } */
+/* { dg-require-effective-target external_tail_call } */
 /* A lot of architectures will not build this due to PR115606 and PR115607 */
-/* { dg-skip-if "powerpc does not support sibcall to templates" { powerpc*-*-* } } */
 /* { dg-options "-std=gnu++11" } */
 /* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index d368251ef9a4..0a3946e82d4b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12741,7 +12741,15 @@ proc check_effective_target_tail_call { } {
 return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
__attribute__((__noipa__)) void foo (void) { }
__attribute__((__noipa__)) void bar (void) { foo(); }
-} {-O2 -fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
+} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
+}
+
+# Return 1 if the target can perform tail-calls for externals
+proc check_effective_target_external_tail_call { } {
+return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
+   extern __attribute__((__noipa__)) void foo (void);
+   __attribute__((__noipa__)) void bar (void) { foo(); }
+} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
 }
 
 # Return 1 if the target can perform tail-call optimizations for structures
@@ -12751,9 +12759,9 @@ proc check_effective_target_struct_tail_call { } {
 return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
// C++
struct foo { int a, b; };
-   __attribute__((__noipa__)) struct foo foo (void) { return {}; }
+   extern __attribute__((__noipa__)) struct foo foo (void);
__attribute__((__noipa__)) struct foo bar (void) { return foo(); }
-} {-O2 -fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
+} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
 }
 
 # Return 1 if the target's calling sequence or its ABI


[PATCH v1 2/2] PR116019: Improve tail call error message

2024-07-25 Thread Andi Kleen
From: Andi Kleen 

The "tail call must be same type" message is common on some
targets with C++, or without optimization. It is generated
when gcc believes there is an access of the return value
after the call. However, it usually does not actually correspond
to a type mismatch, but has other causes.

Make it slightly more vague to be less misleading.

gcc/ChangeLog:

PR c++/116019
* tree-tailcall.cc (find_tail_calls): Change tail call
error message.
---
 gcc/tree-tailcall.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index a68079d4f507..1901b1a13f99 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -632,7 +632,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret, bool only_musttail,
   && may_be_aliased (result_decl)
   && ref_maybe_used_by_stmt_p (call, result_decl, false))
 {
-  maybe_error_musttail (call, _("tail call must be same type"));
+  maybe_error_musttail (call, _("return value used after call"));
   return;
 }
 
-- 
2.45.2



[PATCH v1 1/2] PR116080: Fix tail call dejagnu checks

2024-07-25 Thread Andi Kleen
From: Andi Kleen 

- Run the effective-target tail_call checks without optimization to
match the actual test cases.
- Add an extra check for external tail calls to handle targets like
powerpc that cannot tail call between different object files.
This one will also cover templates.

gcc/testsuite/ChangeLog:

PR testsuite/116080
* g++.dg/musttail10.C: Use external tail call target check.
* g++.dg/musttail6.C: Ditto.
* lib/target-supports.exp: Add external_tail_call. Disable
optimization for tail call checks.
---
 gcc/testsuite/g++.dg/musttail10.C |  2 +-
 gcc/testsuite/g++.dg/musttail6.C  |  2 +-
 gcc/testsuite/lib/target-supports.exp | 14 +++---
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/g++.dg/musttail10.C b/gcc/testsuite/g++.dg/musttail10.C
index ff7fcc7d8755..bd75affa2220 100644
--- a/gcc/testsuite/g++.dg/musttail10.C
+++ b/gcc/testsuite/g++.dg/musttail10.C
@@ -8,7 +8,7 @@ double g() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-cal
 
 template 
 __attribute__((noinline, noclone, noipa))
-T g1() { [[gnu::musttail]] return f(); } /* { dg-error "target is not able" "" { target powerpc*-*-* } } */
+T g1() { [[gnu::musttail]] return f(); } /* { dg-error "target is not able" "" { target { external_tail_call } } } */
 
 template 
 __attribute__((noinline, noclone, noipa))
diff --git a/gcc/testsuite/g++.dg/musttail6.C b/gcc/testsuite/g++.dg/musttail6.C
index 5c6f69407ddb..81f6d9f3ca77 100644
--- a/gcc/testsuite/g++.dg/musttail6.C
+++ b/gcc/testsuite/g++.dg/musttail6.C
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { struct_tail_call } } } */
+/* { dg-require-effective-target external_tail_call } */
 /* A lot of architectures will not build this due to PR115606 and PR115607 */
-/* { dg-skip-if "powerpc does not support sibcall to templates" { powerpc*-*-* } } */
 /* { dg-options "-std=gnu++11" } */
 /* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index d368251ef9a4..0a3946e82d4b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12741,7 +12741,15 @@ proc check_effective_target_tail_call { } {
 return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
__attribute__((__noipa__)) void foo (void) { }
__attribute__((__noipa__)) void bar (void) { foo(); }
-} {-O2 -fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
+} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
+}
+
+# Return 1 if the target can perform tail-calls for externals
+proc check_effective_target_external_tail_call { } {
+return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
+   extern __attribute__((__noipa__)) void foo (void);
+   __attribute__((__noipa__)) void bar (void) { foo(); }
+} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
 }
 
 # Return 1 if the target can perform tail-call optimizations for structures
@@ -12751,9 +12759,9 @@ proc check_effective_target_struct_tail_call { } {
 return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
// C++
struct foo { int a, b; };
-   __attribute__((__noipa__)) struct foo foo (void) { return {}; }
+   extern __attribute__((__noipa__)) struct foo foo (void);
__attribute__((__noipa__)) struct foo bar (void) { return foo(); }
-} {-O2 -fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
+} {-fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed dump.
 }
 
 # Return 1 if the target's calling sequence or its ABI
-- 
2.45.2



[gcc r15-2234] Add documentation for musttail attribute

2024-07-23 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:8daae81113eeff37b4ae2e08a9797295fbc8b81e

commit r15-2234-g8daae81113eeff37b4ae2e08a9797295fbc8b81e
Author: Andi Kleen 
Date:   Tue Jan 23 23:38:23 2024 -0800

Add documentation for musttail attribute

gcc/ChangeLog:

PR c/83324
* doc/extend.texi: Document [[musttail]]

Diff:
---
 gcc/doc/extend.texi | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 4b77599380b5..b0273927b256 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9921,7 +9921,7 @@ same manner as the @code{deprecated} attribute.
 @section Statement Attributes
 @cindex Statement Attributes
 
-GCC allows attributes to be set on null statements.  @xref{Attribute Syntax},
+GCC allows attributes to be set on statements.  @xref{Attribute Syntax},
 for details of the exact syntax for using attributes.  Other attributes are
 available for functions (@pxref{Function Attributes}), variables
 (@pxref{Variable Attributes}), labels (@pxref{Label Attributes}), enumerators
@@ -9978,6 +9978,25 @@ foo (int x, int y)
 @code{y} is not actually incremented and the compiler can but does not
 have to optimize it to just @code{return 42 + 42;}.
 
+@cindex @code{musttail} statement attribute
+@item musttail
+
+The @code{gnu::musttail} or @code{clang::musttail} attribute
+can be applied to a @code{return} statement with a return-value expression
+that is a function call.  It asserts that the call must be a tail call that
+does not allocate extra stack space, so it is safe to use tail recursion
+to implement long running loops.
+
+@smallexample
+[[gnu::musttail]] return foo();
+@end smallexample
+
+If the compiler cannot generate a @code{musttail} tail call it will report
+an error. On some targets tail calls may never be supported.
+Tail calls cannot reference locals in memory, which may affect
+builds without optimization when passing small structures, or passing
+or returning large structures. Enabling -O1 or -O2 can improve
+the success of tail calls.
 @end table
 
 @node Attribute Syntax
@@ -10101,7 +10120,9 @@ the constant expression, if present.
 
 @subsubheading Statement Attributes
 In GNU C, an attribute specifier list may appear as part of a null
-statement.  The attribute goes before the semicolon.
+statement. The attribute goes before the semicolon.
+Some attributes in new style syntax are also supported
+on non-null statements.
 
 @subsubheading Type Attributes


[gcc r15-2233] Add tests for C/C++ musttail attributes

2024-07-23 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:8d1af8f904a0c08656d976cbf8ca56dba35197b0

commit r15-2233-g8d1af8f904a0c08656d976cbf8ca56dba35197b0
Author: Andi Kleen 
Date:   Tue Jan 23 23:54:56 2024 -0800

Add tests for C/C++ musttail attributes

Some are adapted from the existing C musttail plugin tests.
Also extends the ability to query the sibcall capabilities of the
target.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp:
(check_effective_target_struct_tail_call): New function.
* c-c++-common/musttail1.c: New test.
* c-c++-common/musttail12.c: New test.
* c-c++-common/musttail13.c: New test.
* c-c++-common/musttail2.c: New test.
* c-c++-common/musttail3.c: New test.
* c-c++-common/musttail4.c: New test.
* c-c++-common/musttail5.c: New test.
* c-c++-common/musttail7.c: New test.
* c-c++-common/musttail8.c: New test.
* g++.dg/musttail10.C: New test.
* g++.dg/musttail11.C: New test.
* g++.dg/musttail6.C: New test.
* g++.dg/musttail9.C: New test.

Diff:
---
 gcc/testsuite/c-c++-common/musttail1.c  | 14 
 gcc/testsuite/c-c++-common/musttail12.c | 15 +
 gcc/testsuite/c-c++-common/musttail13.c |  5 +++
 gcc/testsuite/c-c++-common/musttail2.c  | 33 ++
 gcc/testsuite/c-c++-common/musttail3.c  | 29 
 gcc/testsuite/c-c++-common/musttail4.c  | 17 ++
 gcc/testsuite/c-c++-common/musttail5.c  | 28 +++
 gcc/testsuite/c-c++-common/musttail7.c  | 14 
 gcc/testsuite/c-c++-common/musttail8.c  | 17 ++
 gcc/testsuite/g++.dg/musttail10.C   | 40 ++
 gcc/testsuite/g++.dg/musttail11.C   | 33 ++
 gcc/testsuite/g++.dg/musttail6.C| 60 +
 gcc/testsuite/g++.dg/musttail9.C| 10 ++
 gcc/testsuite/lib/target-supports.exp   | 12 +++
 14 files changed, 327 insertions(+)

diff --git a/gcc/testsuite/c-c++-common/musttail1.c b/gcc/testsuite/c-c++-common/musttail1.c
new file mode 100644
index ..74efcc2a0bc6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
+
+int __attribute__((noinline,noclone,noipa))
+callee (int i)
+{
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+caller (int i)
+{
+  [[gnu::musttail]] return callee (i + 1);
+}
diff --git a/gcc/testsuite/c-c++-common/musttail12.c b/gcc/testsuite/c-c++-common/musttail12.c
new file mode 100644
index ..4140bcd00950
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail12.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { struct_tail_call && { c || c++11 } } } } */
+/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
+
+struct str
+{
+  int a, b;
+};
+struct str
+cstruct (int x)
+{
+  if (x < 10)
+L:
+[[gnu::musttail]] return cstruct (x + 1);
+  return ((struct str){ x, 0 });
+}
diff --git a/gcc/testsuite/c-c++-common/musttail13.c b/gcc/testsuite/c-c++-common/musttail13.c
new file mode 100644
index ..6bd212fbeb8f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail13.c
@@ -0,0 +1,5 @@
+/* { dg-do compile { target { c || c++11 } } } */
+void f(void)
+{
+  [[gnu::musttail]] return; /* { dg-error "cannot tail-call.*return value must be a call" } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail2.c b/gcc/testsuite/c-c++-common/musttail2.c
new file mode 100644
index ..86f2c3d77404
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[256]; int i; };
+
+int __attribute__((noinline,noclone,noipa))
+test_2_callee (int i, struct box b)
+{
+  if (b.field[0])
+return 5;
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+test_2_caller (int i)
+{
+  struct box b;
+  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot tail-call: " } */
+}
+
+extern void setjmp (void);
+void
+test_3 (void)
+{
+  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
+}
+
+extern float f7(void);
+
+int
+test_6 (void)
+{
+  [[gnu::musttail]] return f7(); /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail3.c b/gcc/testsuite/c-c++-common/musttail3.c
new file mode 100644
index ..ea9589c59ef2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail3.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+extern int foo2 (int x, ...);
+
+struct str
+{
+  int a, b;
+};
+
+struct str
+cstruct (int x)
+{
+  if (x < 10)
+[

[gcc r15-2232] C: Implement musttail attribute for returns

2024-07-23 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:78bbdbd5352df527feccf0a8c2f862f25a2e88b4

commit r15-2232-g78bbdbd5352df527feccf0a8c2f862f25a2e88b4
Author: Andi Kleen 
Date:   Wed Jan 24 07:44:23 2024 -0800

C: Implement musttail attribute for returns

Implement a C23, clang-compatible musttail attribute in the C parser,
similar to the earlier C++ implementation.

gcc/c/ChangeLog:

PR c/83324
* c-parser.cc (struct attr_state): Define with musttail_p.
(c_parser_statement_after_labels): Handle [[musttail]].
(c_parser_std_attribute): Ditto.
(c_parser_handle_musttail): Ditto.
(c_parser_compound_statement_nostart): Ditto.
(c_parser_all_labels): Ditto.
(c_parser_statement): Ditto.
* c-tree.h (c_finish_return): Add musttail_p flag.
* c-typeck.cc (c_finish_return): Handle musttail_p flag.

Diff:
---
 gcc/c/c-parser.cc | 71 +--
 gcc/c/c-tree.h|  2 +-
 gcc/c/c-typeck.cc |  7 --
 3 files changed, 64 insertions(+), 16 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 12c5ed5d92c7..9b9284b1ba4d 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1621,6 +1621,12 @@ struct omp_for_parse_data {
   bool fail : 1;
 };
 
+struct attr_state
+{
+  /* True if we parsed a musttail attribute for return.  */
+  bool musttail_p;
+};
+
 static bool c_parser_nth_token_starts_std_attributes (c_parser *,
  unsigned int);
 static tree c_parser_std_attribute_specifier_sequence (c_parser *);
@@ -1665,7 +1671,8 @@ static location_t c_parser_compound_statement_nostart (c_parser *);
 static void c_parser_label (c_parser *, tree);
 static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
 static void c_parser_statement_after_labels (c_parser *, bool *,
-vec * = NULL);
+vec * = NULL,
+attr_state = {});
 static tree c_parser_c99_block_statement (c_parser *, bool *,
  location_t * = NULL);
 static void c_parser_if_statement (c_parser *, bool *, vec *);
@@ -6982,6 +6989,29 @@ c_parser_handle_directive_omp_attributes (tree ,
 }
 }
 
+/* Check if STD_ATTR contains a musttail attribute and remove if it
+   precedes a return.  PARSER is the parser and ATTR is the output
+   attr_state.  */
+
+static tree
+c_parser_handle_musttail (c_parser *parser, tree std_attrs, attr_state &attr)
+{
+  if (c_parser_next_token_is_keyword (parser, RID_RETURN))
+{
+  if (lookup_attribute ("gnu", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+ attr.musttail_p = true;
+   }
+  if (lookup_attribute ("clang", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+ attr.musttail_p = true;
+   }
+}
+  return std_attrs;
+}
+
 /* Parse a compound statement except for the opening brace.  This is
used for parsing both compound statements and statement expressions
(which follow different paths to handling the opening).  */
@@ -6998,6 +7028,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
   bool in_omp_loop_block
 = omp_for_parse_state ? omp_for_parse_state->want_nested_loop : false;
   tree sl = NULL_TREE;
+  attr_state a = {};
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
 {
@@ -7138,7 +7169,10 @@ c_parser_compound_statement_nostart (c_parser *parser)
= c_parser_nth_token_starts_std_attributes (parser, 1);
   tree std_attrs = NULL_TREE;
   if (have_std_attrs)
-   std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+   {
+ std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+ std_attrs = c_parser_handle_musttail (parser, std_attrs, a);
+   }
   if (c_parser_next_token_is_keyword (parser, RID_CASE)
  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
  || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7286,7 +7320,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
  last_stmt = true;
  mark_valid_location_for_stdc_pragma (false);
  if (!omp_for_parse_state)
-   c_parser_statement_after_labels (parser, NULL);
+   c_parser_statement_after_labels (parser, NULL, NULL, a);
  else
{
  /* In canonical loop nest form, nested loops can only appear
@@ -7328,15 +7362,20 @@ c_parser_compound_statement_nostart (c_parser *parser)
 /* Parse all consecutive labels, possibly preceded by standard
attributes.  In this context, a statement is required, not a
declaration, so attributes must be followed by a

[gcc r15-2231] C++: Support clang compatible [[musttail]] (PR83324)

2024-07-23 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:2bd8177256b6d87f6e75819218cf22c2c0bfc1ac

commit r15-2231-g2bd8177256b6d87f6e75819218cf22c2c0bfc1ac
Author: Andi Kleen 
Date:   Tue Jan 23 23:44:48 2024 -0800

C++: Support clang compatible [[musttail]] (PR83324)

This patch implements a clang compatible [[musttail]] attribute for
returns.

musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also detects clang::musttail

Passes bootstrap and full test

gcc/c-family/ChangeLog:

* c-attribs.cc (set_musttail_on_return): New function.
* c-common.h (set_musttail_on_return): Declare new function.

gcc/cp/ChangeLog:

PR c/83324
* cp-tree.h (AGGR_INIT_EXPR_MUST_TAIL): Add.
* parser.cc (cp_parser_statement): Handle musttail.
(cp_parser_jump_statement): Ditto.
* pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL.
* semantics.cc (simplify_aggr_init_expr): Handle musttail.

Diff:
---
 gcc/c-family/c-attribs.cc | 20 
 gcc/c-family/c-common.h   |  1 +
 gcc/cp/cp-tree.h  |  4 
 gcc/cp/parser.cc  | 32 +---
 gcc/cp/pt.cc  |  9 -
 gcc/cp/semantics.cc   |  1 +
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 5adc7b775eaf..685f212683f4 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -672,6 +672,26 @@ attribute_takes_identifier_p (const_tree attr_id)
 return targetm.attribute_takes_identifier_p (attr_id);
 }
 
+/* Set a musttail attribute MUSTTAIL_P on return expression RETVAL
+   at LOC.  */
+
+void
+set_musttail_on_return (tree retval, location_t loc, bool musttail_p)
+{
+  if (retval && musttail_p)
+{
+  tree t = retval;
+  if (TREE_CODE (t) == TARGET_EXPR)
+   t = TARGET_EXPR_INITIAL (t);
+  if (TREE_CODE (t) != CALL_EXPR)
+   error_at (loc, "cannot tail-call: return value must be a call");
+  else
+   CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+}
+  else if (musttail_p && !retval)
+error_at (loc, "cannot tail-call: return value must be a call");
+}
+
 /* Verify that argument value POS at position ARGNO to attribute NAME
applied to function FN (which is either a function declaration or function
type) refers to a function parameter at position POS and the expected type
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index adee822a3ae0..2510ee4dbc9d 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1648,6 +1648,7 @@ extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
+extern void set_musttail_on_return (tree, location_t, bool);
 
 /* In c-format.cc.  */
 extern bool valid_format_string_type_p (tree);
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 856699de82f2..e2cec2f2c16c 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4228,6 +4228,10 @@ templated_operator_saved_lookups (tree t)
 #define AGGR_INIT_FROM_THUNK_P(NODE) \
   (AGGR_INIT_EXPR_CHECK (NODE)->base.protected_flag)
 
+/* Nonzero means that the call was marked musttail.  */
+#define AGGR_INIT_EXPR_MUST_TAIL(NODE) \
+  (AGGR_INIT_EXPR_CHECK (NODE)->base.static_flag)
+
 /* AGGR_INIT_EXPR accessors.  These are equivalent to the CALL_EXPR
accessors, except for AGGR_INIT_EXPR_SLOT (which takes the place of
CALL_EXPR_STATIC_CHAIN).  */
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index efd5d6f29a71..1fa0780944b6 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree 

[gcc r15-2180] Revert "Add documentation for musttail attribute"

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:4a46ba2d905ed40116fb7e7a34809307dc8e37be

commit r15-2180-g4a46ba2d905ed40116fb7e7a34809307dc8e37be
Author: Andi Kleen 
Date:   Sat Jul 20 16:09:41 2024 -0700

Revert "Add documentation for musttail attribute"

This reverts commit 56f824cc206ff00d466aaeb11211d8005c4668bc.

Diff:
---
 gcc/doc/extend.texi | 25 ++---
 1 file changed, 2 insertions(+), 23 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b0273927b256..4b77599380b5 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9921,7 +9921,7 @@ same manner as the @code{deprecated} attribute.
 @section Statement Attributes
 @cindex Statement Attributes
 
-GCC allows attributes to be set on statements.  @xref{Attribute Syntax},
+GCC allows attributes to be set on null statements.  @xref{Attribute Syntax},
 for details of the exact syntax for using attributes.  Other attributes are
 available for functions (@pxref{Function Attributes}), variables
 (@pxref{Variable Attributes}), labels (@pxref{Label Attributes}), enumerators
@@ -9978,25 +9978,6 @@ foo (int x, int y)
 @code{y} is not actually incremented and the compiler can but does not
 have to optimize it to just @code{return 42 + 42;}.
 
-@cindex @code{musttail} statement attribute
-@item musttail
-
-The @code{gnu::musttail} or @code{clang::musttail} attribute
-can be applied to a @code{return} statement with a return-value expression
-that is a function call.  It asserts that the call must be a tail call that
-does not allocate extra stack space, so it is safe to use tail recursion
-to implement long running loops.
-
-@smallexample
-[[gnu::musttail]] return foo();
-@end smallexample
-
-If the compiler cannot generate a @code{musttail} tail call it will report
-an error. On some targets tail calls may never be supported.
-Tail calls cannot reference locals in memory, which may affect
-builds without optimization when passing small structures, or passing
-or returning large structures. Enabling -O1 or -O2 can improve
-the success of tail calls.
 @end table
 
 @node Attribute Syntax
@@ -10120,9 +10101,7 @@ the constant expression, if present.
 
 @subsubheading Statement Attributes
 In GNU C, an attribute specifier list may appear as part of a null
-statement. The attribute goes before the semicolon.
-Some attributes in new style syntax are also supported
-on non-null statements.
+statement.  The attribute goes before the semicolon.
 
 @subsubheading Type Attributes


[gcc r15-2179] Revert "Add tests for C/C++ musttail attributes"

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:8805ad29924f43d2005784efc8f7d29f10b7f6d9

commit r15-2179-g8805ad29924f43d2005784efc8f7d29f10b7f6d9
Author: Andi Kleen 
Date:   Sat Jul 20 16:09:25 2024 -0700

Revert "Add tests for C/C++ musttail attributes"

This reverts commit 37c4703ce84722b9c24db3e8e6d57ab6d3a7b5eb.

Diff:
---
 gcc/testsuite/c-c++-common/musttail1.c  | 14 
 gcc/testsuite/c-c++-common/musttail12.c | 15 -
 gcc/testsuite/c-c++-common/musttail13.c |  5 ---
 gcc/testsuite/c-c++-common/musttail2.c  | 33 --
 gcc/testsuite/c-c++-common/musttail3.c  | 29 
 gcc/testsuite/c-c++-common/musttail4.c  | 17 --
 gcc/testsuite/c-c++-common/musttail5.c  | 28 ---
 gcc/testsuite/c-c++-common/musttail7.c  | 14 
 gcc/testsuite/c-c++-common/musttail8.c  | 17 --
 gcc/testsuite/g++.dg/musttail10.C   | 40 --
 gcc/testsuite/g++.dg/musttail11.C   | 33 --
 gcc/testsuite/g++.dg/musttail6.C| 60 -
 gcc/testsuite/g++.dg/musttail9.C| 10 --
 gcc/testsuite/lib/target-supports.exp   |  9 -
 14 files changed, 324 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/musttail1.c 
b/gcc/testsuite/c-c++-common/musttail1.c
deleted file mode 100644
index 74efcc2a0bc6..
--- a/gcc/testsuite/c-c++-common/musttail1.c
+++ /dev/null
@@ -1,14 +0,0 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
-/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
-
-int __attribute__((noinline,noclone,noipa))
-callee (int i)
-{
-  return i * i;
-}
-
-int __attribute__((noinline,noclone,noipa))
-caller (int i)
-{
-  [[gnu::musttail]] return callee (i + 1);
-}
diff --git a/gcc/testsuite/c-c++-common/musttail12.c 
b/gcc/testsuite/c-c++-common/musttail12.c
deleted file mode 100644
index 4140bcd00950..
--- a/gcc/testsuite/c-c++-common/musttail12.c
+++ /dev/null
@@ -1,15 +0,0 @@
-/* { dg-do compile { target { struct_tail_call && { c || c++11 } } } } */
-/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
-
-struct str
-{
-  int a, b;
-};
-struct str
-cstruct (int x)
-{
-  if (x < 10)
-L:
-[[gnu::musttail]] return cstruct (x + 1);
-  return ((struct str){ x, 0 });
-}
diff --git a/gcc/testsuite/c-c++-common/musttail13.c 
b/gcc/testsuite/c-c++-common/musttail13.c
deleted file mode 100644
index 6bd212fbeb8f..
--- a/gcc/testsuite/c-c++-common/musttail13.c
+++ /dev/null
@@ -1,5 +0,0 @@
-/* { dg-do compile { target { c || c++11 } } } */
-void f(void)
-{
-  [[gnu::musttail]] return; /* { dg-error "cannot tail-call.*return value must 
be a call" } */
-}
diff --git a/gcc/testsuite/c-c++-common/musttail2.c 
b/gcc/testsuite/c-c++-common/musttail2.c
deleted file mode 100644
index 86f2c3d77404..
--- a/gcc/testsuite/c-c++-common/musttail2.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
-
-struct box { char field[256]; int i; };
-
-int __attribute__((noinline,noclone,noipa))
-test_2_callee (int i, struct box b)
-{
-  if (b.field[0])
-return 5;
-  return i * i;
-}
-
-int __attribute__((noinline,noclone,noipa))
-test_2_caller (int i)
-{
-  struct box b;
-  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot 
tail-call: " } */
-}
-
-extern void setjmp (void);
-void
-test_3 (void)
-{
-  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
-}
-
-extern float f7(void);
-
-int
-test_6 (void)
-{
-  [[gnu::musttail]] return f7(); /* { dg-error "cannot tail-call: " } */
-}
diff --git a/gcc/testsuite/c-c++-common/musttail3.c 
b/gcc/testsuite/c-c++-common/musttail3.c
deleted file mode 100644
index ea9589c59ef2..
--- a/gcc/testsuite/c-c++-common/musttail3.c
+++ /dev/null
@@ -1,29 +0,0 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
-
-extern int foo2 (int x, ...);
-
-struct str
-{
-  int a, b;
-};
-
-struct str
-cstruct (int x)
-{
-  if (x < 10)
-[[clang::musttail]] return cstruct (x + 1);
-  return ((struct str){ x, 0 });
-}
-
-int
-foo (int x)
-{
-  if (x < 10)
-[[clang::musttail]] return foo2 (x, 29);
-  if (x < 100)
-{
-  int k = foo (x + 1);
-  [[clang::musttail]] return k;/* { dg-error "cannot tail-call: " } */
-}
-  return x;
-}
diff --git a/gcc/testsuite/c-c++-common/musttail4.c 
b/gcc/testsuite/c-c++-common/musttail4.c
deleted file mode 100644
index 23f4b5e1cd68..
--- a/gcc/testsuite/c-c++-common/musttail4.c
+++ /dev/null
@@ -1,17 +0,0 @@
-/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
-
-struct box { char field[64]; int i; };
-
-struct box __attribute__((noinline,noclone,noipa))
-returns_struct (int i)
-{
-  struct box b;
-  b.i = i * i;
-  return b;
-}
-
-int __attri

[gcc r15-2177] Revert "C++: Support clang compatible [[musttail]] (PR83324)"

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:ff6994e483be5bd340bfacc50d3441bd21aba1c4

commit r15-2177-gff6994e483be5bd340bfacc50d3441bd21aba1c4
Author: Andi Kleen 
Date:   Sat Jul 20 16:07:41 2024 -0700

Revert "C++: Support clang compatible [[musttail]] (PR83324)"

This reverts commit 59dd1d7ab21ad9a7ebf641ec9aeea609c003ad2f.

Diff:
---
 gcc/c-family/c-attribs.cc | 20 
 gcc/c-family/c-common.h   |  1 -
 gcc/cp/cp-tree.h  |  4 
 gcc/cp/parser.cc  | 32 +++-
 gcc/cp/pt.cc  |  9 +
 gcc/cp/semantics.cc   |  1 -
 6 files changed, 4 insertions(+), 63 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 685f212683f4..5adc7b775eaf 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -672,26 +672,6 @@ attribute_takes_identifier_p (const_tree attr_id)
 return targetm.attribute_takes_identifier_p (attr_id);
 }
 
-/* Set a musttail attribute MUSTTAIL_P on return expression RETVAL
-   at LOC.  */
-
-void
-set_musttail_on_return (tree retval, location_t loc, bool musttail_p)
-{
-  if (retval && musttail_p)
-{
-  tree t = retval;
-  if (TREE_CODE (t) == TARGET_EXPR)
-   t = TARGET_EXPR_INITIAL (t);
-  if (TREE_CODE (t) != CALL_EXPR)
-   error_at (loc, "cannot tail-call: return value must be a call");
-  else
-   CALL_EXPR_MUST_TAIL_CALL (t) = 1;
-}
-  else if (musttail_p && !retval)
-error_at (loc, "cannot tail-call: return value must be a call");
-}
-
 /* Verify that argument value POS at position ARGNO to attribute NAME
applied to function FN (which is either a function declaration or function
type) refers to a function parameter at position POS and the expected type
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 2510ee4dbc9d..adee822a3ae0 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1648,7 +1648,6 @@ extern tree handle_noreturn_attribute (tree *, tree, 
tree, int, bool *);
 extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
-extern void set_musttail_on_return (tree, location_t, bool);
 
 /* In c-format.cc.  */
 extern bool valid_format_string_type_p (tree);
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index c6728328bc6f..609d8941cf72 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4236,10 +4236,6 @@ templated_operator_saved_lookups (tree t)
 #define AGGR_INIT_FROM_THUNK_P(NODE) \
   (AGGR_INIT_EXPR_CHECK (NODE)->base.protected_flag)
 
-/* Nonzero means that the call was marked musttail.  */
-#define AGGR_INIT_EXPR_MUST_TAIL(NODE) \
-  (AGGR_INIT_EXPR_CHECK (NODE)->base.static_flag)
-
 /* AGGR_INIT_EXPR accessors.  These are equivalent to the CALL_EXPR
accessors, except for AGGR_INIT_EXPR_SLOT (which takes the place of
CALL_EXPR_STATIC_CHAIN).  */
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 1fa0780944b6..efd5d6f29a71 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree cp_parser_perform_range_for_lookup
 static tree cp_parser_range_for_member_function
   (tree, tree);
 static tree cp_parser_jump_statement
-  (cp_parser *, tree &);
+  (cp_parser *);
 static void cp_parser_declaration_statement
   (cp_parser *);
 
@@ -12757,7 +12757,7 @@ cp_parser_statement (cp_parser* parser, tree 
in_statement_expr,
case RID_CO_RETURN:
case RID_GOTO:
  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
- statement = cp_parser_jump_statement (parser, std_attrs);
+ statement = cp_parser_jump_statement (parser);
  break;
 
  /* Objective-C++ exception-handling constructs.  */
@@ -14845,11 +14845,10 @@ cp_parser_init_statement (cp_parser *parser, tree 
*decl)
jump-statement:
  goto * expression ;
 
-   STD_ATTRS are the statement attributes. They can be modified.
Returns the new BREAK_STMT, CONTINUE_STMT, RETURN_EXPR, or GOTO_EXPR.  */
 
 static tree
-cp_parser_jump_statement (cp_parser* parser, tree _attrs)
+cp_parser_jump_statement (cp_parser* parser)
 {
   tree statement = error_mark_node;
   cp_token *token;
@@ -14926,31 +14925,6 @@ cp_parser_jump_statement (cp_parser* parser, tree 
_attrs)
  /* If the next token is a `;', then there is no
 expression.  */
  expr = NULL_TREE;
-
-   if (keyword == RID_RETURN)
- {
-   bool musttail_p = false;
-   if (lookup_attribute ("gnu", "musttail", std_attrs))
- {
-   musttail_p = true;
-   std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
- }
-   /* Support this for compatibility.  */
-   if (lookup_attribute ("clang", "musttail",

[gcc r15-2178] Revert "C: Implement musttail attribute for returns"

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:53660b102e69ddc6d9add22abaf03ce791babb44

commit r15-2178-g53660b102e69ddc6d9add22abaf03ce791babb44
Author: Andi Kleen 
Date:   Sat Jul 20 16:09:07 2024 -0700

Revert "C: Implement musttail attribute for returns"

This reverts commit 7db47f7b915c5f5d645fa536547e26b92290afe3.

Diff:
---
 gcc/c/c-parser.cc | 71 ++-
 gcc/c/c-tree.h|  2 +-
 gcc/c/c-typeck.cc |  7 ++
 3 files changed, 16 insertions(+), 64 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 9b9284b1ba4d..12c5ed5d92c7 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1621,12 +1621,6 @@ struct omp_for_parse_data {
   bool fail : 1;
 };
 
-struct attr_state
-{
-  /* True if we parsed a musttail attribute for return.  */
-  bool musttail_p;
-};
-
 static bool c_parser_nth_token_starts_std_attributes (c_parser *,
  unsigned int);
 static tree c_parser_std_attribute_specifier_sequence (c_parser *);
@@ -1671,8 +1665,7 @@ static location_t c_parser_compound_statement_nostart 
(c_parser *);
 static void c_parser_label (c_parser *, tree);
 static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
 static void c_parser_statement_after_labels (c_parser *, bool *,
-vec * = NULL,
-attr_state = {});
+vec * = NULL);
 static tree c_parser_c99_block_statement (c_parser *, bool *,
  location_t * = NULL);
 static void c_parser_if_statement (c_parser *, bool *, vec *);
@@ -6989,29 +6982,6 @@ c_parser_handle_directive_omp_attributes (tree ,
 }
 }
 
-/* Check if STD_ATTR contains a musttail attribute and remove if it
-   precedes a return.  PARSER is the parser and ATTR is the output
-   attr_state.  */
-
-static tree
-c_parser_handle_musttail (c_parser *parser, tree std_attrs, attr_state )
-{
-  if (c_parser_next_token_is_keyword (parser, RID_RETURN))
-{
-  if (lookup_attribute ("gnu", "musttail", std_attrs))
-   {
- std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
- attr.musttail_p = true;
-   }
-  if (lookup_attribute ("clang", "musttail", std_attrs))
-   {
- std_attrs = remove_attribute ("clang", "musttail", std_attrs);
- attr.musttail_p = true;
-   }
-}
-  return std_attrs;
-}
-
 /* Parse a compound statement except for the opening brace.  This is
used for parsing both compound statements and statement expressions
(which follow different paths to handling the opening).  */
@@ -7028,7 +6998,6 @@ c_parser_compound_statement_nostart (c_parser *parser)
   bool in_omp_loop_block
 = omp_for_parse_state ? omp_for_parse_state->want_nested_loop : false;
   tree sl = NULL_TREE;
-  attr_state a = {};
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
 {
@@ -7169,10 +7138,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
= c_parser_nth_token_starts_std_attributes (parser, 1);
   tree std_attrs = NULL_TREE;
   if (have_std_attrs)
-   {
- std_attrs = c_parser_std_attribute_specifier_sequence (parser);
- std_attrs = c_parser_handle_musttail (parser, std_attrs, a);
-   }
+   std_attrs = c_parser_std_attribute_specifier_sequence (parser);
   if (c_parser_next_token_is_keyword (parser, RID_CASE)
  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
  || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7320,7 +7286,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
  last_stmt = true;
  mark_valid_location_for_stdc_pragma (false);
  if (!omp_for_parse_state)
-   c_parser_statement_after_labels (parser, NULL, NULL, a);
+   c_parser_statement_after_labels (parser, NULL);
  else
{
  /* In canonical loop nest form, nested loops can only appear
@@ -7362,20 +7328,15 @@ c_parser_compound_statement_nostart (c_parser *parser)
 /* Parse all consecutive labels, possibly preceded by standard
attributes.  In this context, a statement is required, not a
declaration, so attributes must be followed by a statement that is
-   not just a semicolon.  Returns an attr_state.  */
+   not just a semicolon.  */
 
-static attr_state
+static void
 c_parser_all_labels (c_parser *parser)
 {
-  attr_state attr = {};
   bool have_std_attrs;
   tree std_attrs = NULL;
   if ((have_std_attrs = c_parser_nth_token_starts_std_attributes (parser, 1)))
-{
-  std_attrs = c_parser_std_attribute_specifier_sequence (parser);
-  std_attrs = c_parser_handle_musttail (parser, std_attrs, attr);
-}
-
+std_attrs = c_parser_std_attribute_specifier_sequence (parser);
   while (c_parser_next_token_

[gcc r15-2171] Add tests for C/C++ musttail attributes

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:37c4703ce84722b9c24db3e8e6d57ab6d3a7b5eb

commit r15-2171-g37c4703ce84722b9c24db3e8e6d57ab6d3a7b5eb
Author: Andi Kleen 
Date:   Tue Jan 23 23:54:56 2024 -0800

Add tests for C/C++ musttail attributes

Some are adapted from the existing C musttail plugin tests.
Also extend the testsuite's ability to query the sibcall capabilities
of the target.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp:
(check_effective_target_struct_tail_call): New function.
* c-c++-common/musttail1.c: New test.
* c-c++-common/musttail12.c: New test.
* c-c++-common/musttail13.c: New test.
* c-c++-common/musttail2.c: New test.
* c-c++-common/musttail3.c: New test.
* c-c++-common/musttail4.c: New test.
* c-c++-common/musttail5.c: New test.
* c-c++-common/musttail7.c: New test.
* c-c++-common/musttail8.c: New test.
* g++.dg/musttail10.C: New test.
* g++.dg/musttail11.C: New test.
* g++.dg/musttail6.C: New test.
* g++.dg/musttail9.C: New test.

Diff:
---
 gcc/testsuite/c-c++-common/musttail1.c  | 14 
 gcc/testsuite/c-c++-common/musttail12.c | 15 +
 gcc/testsuite/c-c++-common/musttail13.c |  5 +++
 gcc/testsuite/c-c++-common/musttail2.c  | 33 ++
 gcc/testsuite/c-c++-common/musttail3.c  | 29 
 gcc/testsuite/c-c++-common/musttail4.c  | 17 ++
 gcc/testsuite/c-c++-common/musttail5.c  | 28 +++
 gcc/testsuite/c-c++-common/musttail7.c  | 14 
 gcc/testsuite/c-c++-common/musttail8.c  | 17 ++
 gcc/testsuite/g++.dg/musttail10.C   | 40 ++
 gcc/testsuite/g++.dg/musttail11.C   | 33 ++
 gcc/testsuite/g++.dg/musttail6.C| 60 +
 gcc/testsuite/g++.dg/musttail9.C| 10 ++
 gcc/testsuite/lib/target-supports.exp   |  9 +
 14 files changed, 324 insertions(+)

diff --git a/gcc/testsuite/c-c++-common/musttail1.c 
b/gcc/testsuite/c-c++-common/musttail1.c
new file mode 100644
index ..74efcc2a0bc6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
+
+int __attribute__((noinline,noclone,noipa))
+callee (int i)
+{
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+caller (int i)
+{
+  [[gnu::musttail]] return callee (i + 1);
+}
diff --git a/gcc/testsuite/c-c++-common/musttail12.c 
b/gcc/testsuite/c-c++-common/musttail12.c
new file mode 100644
index ..4140bcd00950
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail12.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { struct_tail_call && { c || c++11 } } } } */
+/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
+
+struct str
+{
+  int a, b;
+};
+struct str
+cstruct (int x)
+{
+  if (x < 10)
+L:
+[[gnu::musttail]] return cstruct (x + 1);
+  return ((struct str){ x, 0 });
+}
diff --git a/gcc/testsuite/c-c++-common/musttail13.c 
b/gcc/testsuite/c-c++-common/musttail13.c
new file mode 100644
index ..6bd212fbeb8f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail13.c
@@ -0,0 +1,5 @@
+/* { dg-do compile { target { c || c++11 } } } */
+void f(void)
+{
+  [[gnu::musttail]] return; /* { dg-error "cannot tail-call.*return value must 
be a call" } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail2.c 
b/gcc/testsuite/c-c++-common/musttail2.c
new file mode 100644
index ..86f2c3d77404
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[256]; int i; };
+
+int __attribute__((noinline,noclone,noipa))
+test_2_callee (int i, struct box b)
+{
+  if (b.field[0])
+return 5;
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+test_2_caller (int i)
+{
+  struct box b;
+  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot 
tail-call: " } */
+}
+
+extern void setjmp (void);
+void
+test_3 (void)
+{
+  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
+}
+
+extern float f7(void);
+
+int
+test_6 (void)
+{
+  [[gnu::musttail]] return f7(); /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail3.c 
b/gcc/testsuite/c-c++-common/musttail3.c
new file mode 100644
index ..ea9589c59ef2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail3.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+extern int foo2 (int x, ...);
+
+struct str
+{
+  int a, b;
+};
+
+struct str
+cstruct (int x)
+{
+  if (x < 10)
+[

gcc-wwwdocs branch master updated. 5ae0f1dc042d644f3831a4dacfd0977a48491077

2024-07-20 Thread Andi Kleen via Gcc-cvs-wwwdocs
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gcc-wwwdocs".

The branch, master has been updated
   via  5ae0f1dc042d644f3831a4dacfd0977a48491077 (commit)
  from  d957b42266df71806b0173990eecf361e177f631 (commit)

- Log -
commit 5ae0f1dc042d644f3831a4dacfd0977a48491077
Author: Andi Kleen 
Date:   Fri Jul 19 23:35:01 2024 -0700

Document musttail and constexpr asm in release notes

diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index a121f40a..1d0cfa16 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -61,8 +61,16 @@ a work-in-progress.
 
 
 
+
+  A musttail statement attribute was added to enforce tail 
calls.
+
 
 
+
+
+  Inline assembler statements now support constexpr generated 
strings,
+  analoguous to static_assert.
+
 
 
 

---

Summary of changes:
 htdocs/gcc-15/changes.html | 8 
 1 file changed, 8 insertions(+)



[gcc r15-2170] C: Implement musttail attribute for returns

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:7db47f7b915c5f5d645fa536547e26b92290afe3

commit r15-2170-g7db47f7b915c5f5d645fa536547e26b92290afe3
Author: Andi Kleen 
Date:   Wed Jan 24 07:44:23 2024 -0800

C: Implement musttail attribute for returns

Implement a C23, clang-compatible musttail attribute in the C parser,
similar to the earlier C++ implementation.

gcc/c/ChangeLog:

PR c/83324
* c-parser.cc (struct attr_state): Define with musttail_p.
(c_parser_statement_after_labels): Handle [[musttail]].
(c_parser_std_attribute): Ditto.
(c_parser_handle_musttail): Ditto.
(c_parser_compound_statement_nostart): Ditto.
(c_parser_all_labels): Ditto.
(c_parser_statement): Ditto.
* c-tree.h (c_finish_return): Add musttail_p flag.
* c-typeck.cc (c_finish_return): Handle musttail_p flag.

Diff:
---
 gcc/c/c-parser.cc | 71 +--
 gcc/c/c-tree.h|  2 +-
 gcc/c/c-typeck.cc |  7 --
 3 files changed, 64 insertions(+), 16 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 12c5ed5d92c7..9b9284b1ba4d 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1621,6 +1621,12 @@ struct omp_for_parse_data {
   bool fail : 1;
 };
 
+struct attr_state
+{
+  /* True if we parsed a musttail attribute for return.  */
+  bool musttail_p;
+};
+
 static bool c_parser_nth_token_starts_std_attributes (c_parser *,
  unsigned int);
 static tree c_parser_std_attribute_specifier_sequence (c_parser *);
@@ -1665,7 +1671,8 @@ static location_t c_parser_compound_statement_nostart 
(c_parser *);
 static void c_parser_label (c_parser *, tree);
 static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
 static void c_parser_statement_after_labels (c_parser *, bool *,
-vec * = NULL);
+vec * = NULL,
+attr_state = {});
 static tree c_parser_c99_block_statement (c_parser *, bool *,
  location_t * = NULL);
 static void c_parser_if_statement (c_parser *, bool *, vec *);
@@ -6982,6 +6989,29 @@ c_parser_handle_directive_omp_attributes (tree ,
 }
 }
 
+/* Check if STD_ATTR contains a musttail attribute and remove if it
+   precedes a return.  PARSER is the parser and ATTR is the output
+   attr_state.  */
+
+static tree
+c_parser_handle_musttail (c_parser *parser, tree std_attrs, attr_state )
+{
+  if (c_parser_next_token_is_keyword (parser, RID_RETURN))
+{
+  if (lookup_attribute ("gnu", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+ attr.musttail_p = true;
+   }
+  if (lookup_attribute ("clang", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+ attr.musttail_p = true;
+   }
+}
+  return std_attrs;
+}
+
 /* Parse a compound statement except for the opening brace.  This is
used for parsing both compound statements and statement expressions
(which follow different paths to handling the opening).  */
@@ -6998,6 +7028,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
   bool in_omp_loop_block
 = omp_for_parse_state ? omp_for_parse_state->want_nested_loop : false;
   tree sl = NULL_TREE;
+  attr_state a = {};
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
 {
@@ -7138,7 +7169,10 @@ c_parser_compound_statement_nostart (c_parser *parser)
= c_parser_nth_token_starts_std_attributes (parser, 1);
   tree std_attrs = NULL_TREE;
   if (have_std_attrs)
-   std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+   {
+ std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+ std_attrs = c_parser_handle_musttail (parser, std_attrs, a);
+   }
   if (c_parser_next_token_is_keyword (parser, RID_CASE)
  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
  || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7286,7 +7320,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
  last_stmt = true;
  mark_valid_location_for_stdc_pragma (false);
  if (!omp_for_parse_state)
-   c_parser_statement_after_labels (parser, NULL);
+   c_parser_statement_after_labels (parser, NULL, NULL, a);
  else
{
  /* In canonical loop nest form, nested loops can only appear
@@ -7328,15 +7362,20 @@ c_parser_compound_statement_nostart (c_parser *parser)
 /* Parse all consecutive labels, possibly preceded by standard
attributes.  In this context, a statement is required, not a
declaration, so attributes must be followed by a

[gcc r15-2172] Add documentation for musttail attribute

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:56f824cc206ff00d466aaeb11211d8005c4668bc

commit r15-2172-g56f824cc206ff00d466aaeb11211d8005c4668bc
Author: Andi Kleen 
Date:   Tue Jan 23 23:38:23 2024 -0800

Add documentation for musttail attribute

gcc/ChangeLog:

PR c/83324
* doc/extend.texi: Document [[musttail]].

Diff:
---
 gcc/doc/extend.texi | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 4b77599380b5..b0273927b256 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9921,7 +9921,7 @@ same manner as the @code{deprecated} attribute.
 @section Statement Attributes
 @cindex Statement Attributes
 
-GCC allows attributes to be set on null statements.  @xref{Attribute Syntax},
+GCC allows attributes to be set on statements.  @xref{Attribute Syntax},
 for details of the exact syntax for using attributes.  Other attributes are
 available for functions (@pxref{Function Attributes}), variables
 (@pxref{Variable Attributes}), labels (@pxref{Label Attributes}), enumerators
@@ -9978,6 +9978,25 @@ foo (int x, int y)
 @code{y} is not actually incremented and the compiler can but does not
 have to optimize it to just @code{return 42 + 42;}.
 
+@cindex @code{musttail} statement attribute
+@item musttail
+
+The @code{gnu::musttail} or @code{clang::musttail} attribute
+can be applied to a @code{return} statement with a return-value expression
+that is a function call.  It asserts that the call must be a tail call that
+does not allocate extra stack space, so it is safe to use tail recursion
+to implement long running loops.
+
+@smallexample
+[[gnu::musttail]] return foo();
+@end smallexample
+
+If the compiler cannot generate a @code{musttail} tail call it will report
+an error. On some targets tail calls may never be supported.
+Tail calls cannot reference locals in memory, which may affect
+builds without optimization when passing small structures, or passing
+or returning large structures. Enabling -O1 or -O2 can improve
+the success of tail calls.
 @end table
 
 @node Attribute Syntax
@@ -10101,7 +10120,9 @@ the constant expression, if present.
 
 @subsubheading Statement Attributes
 In GNU C, an attribute specifier list may appear as part of a null
-statement.  The attribute goes before the semicolon.
+statement. The attribute goes before the semicolon.
+Some attributes in new style syntax are also supported
+on non-null statements.
 
 @subsubheading Type Attributes


[gcc r15-2169] C++: Support clang compatible [[musttail]] (PR83324)

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:59dd1d7ab21ad9a7ebf641ec9aeea609c003ad2f

commit r15-2169-g59dd1d7ab21ad9a7ebf641ec9aeea609c003ad2f
Author: Andi Kleen 
Date:   Tue Jan 23 23:44:48 2024 -0800

C++: Support clang compatible [[musttail]] (PR83324)

This patch implements a clang-compatible [[musttail]] attribute for
return statements.

musttail is useful as an alternative to computed goto for interpreters.
With computed goto, the interpreter function usually ends up very large,
which causes problems with register allocation and with other per-function
optimizations that do not scale. With musttail, the interpreter can instead
be written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth, this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be made a
tail call, which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto, it is also type-safe.
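
The interpreter structure described above can be sketched roughly as
follows. This is only an illustration, not code from the patch: the
opcode set, the handler names, the run() entry point and the MUSTTAIL
macro are all hypothetical. The macro uses the GNU spelling
__attribute__((musttail)) and expands to nothing on compilers that do
not report support for it, in which case the calls are ordinary calls
with no tail-call guarantee.

```c
/* Sketch of a tail-call-threaded interpreter.  The opcode set, handler
   names, run () and the MUSTTAIL macro are hypothetical.  */
#include <assert.h>
#include <stddef.h>

/* Use the attribute only where the compiler reports support for it;
   elsewhere the calls are ordinary calls with no tail-call guarantee.  */
#if defined(__has_attribute)
# if __has_attribute(musttail)
#  define MUSTTAIL __attribute__((musttail))
# endif
#endif
#ifndef MUSTTAIL
# define MUSTTAIL
#endif

enum opcode { OP_INC, OP_DOUBLE, OP_HALT };

/* Every handler shares one signature, so each dispatch is a plain
   sibling call.  */
typedef long handler_fn (const unsigned char *pc, long acc);

static handler_fn op_inc, op_double, op_halt;

/* Indexed by enum opcode.  */
static handler_fn *const handlers[] = { op_inc, op_double, op_halt };

/* Fetch the next opcode and transfer control to its handler.  */
static long
dispatch (const unsigned char *pc, long acc)
{
  MUSTTAIL return handlers[*pc] (pc + 1, acc);
}

static long
op_inc (const unsigned char *pc, long acc)
{
  MUSTTAIL return dispatch (pc, acc + 1);
}

static long
op_double (const unsigned char *pc, long acc)
{
  MUSTTAIL return dispatch (pc, acc * 2);
}

/* Ends the chain of tail calls and yields the accumulator.  */
static long
op_halt (const unsigned char *pc, long acc)
{
  (void) pc;
  return acc;
}

/* Run PROGRAM, which must end with OP_HALT, starting from START.  */
long
run (const unsigned char *program, long start)
{
  assert (program != NULL);
  return dispatch (program, start);
}
```

Each handler is a small function of its own, so register allocation
works on one opcode at a time, and the function-pointer table is
checked against handler_fn by the compiler.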

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The returned expression must be a direct call
(the compiler does not follow dependencies), which is similar to what
clang implements. It then uses the existing musttail infrastructure.
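
As a rough illustration of the direct-call requirement, the sketch
below shows an accepted form; the rejected form appears only in a
comment, since it would not compile. The function names and the
MUSTTAIL macro are hypothetical, and the macro expands to nothing on
compilers that do not report support for the attribute. The rejected
pattern mirrors the one exercised by the musttail3.c test.

```c
/* Sketch of the direct-call requirement; names are hypothetical.  */
#include <assert.h>

#if defined(__has_attribute)
# if __has_attribute(musttail)
#  define MUSTTAIL __attribute__((musttail))
# endif
#endif
#ifndef MUSTTAIL
# define MUSTTAIL
#endif

static int
callee (int i)
{
  return i * i;
}

/* Accepted: the returned expression is itself the call.  */
int
caller (int i)
{
  MUSTTAIL return callee (i + 1);
}

/* Rejected: returning a variable that merely holds the result of a
   call is diagnosed with an error starting "cannot tail-call:",
   because the compiler does not trace the value back to the call:

     int k = callee (i + 1);
     [[gnu::musttail]] return k;  */

/* Small self-check that can be called from a test driver.  */
void
check_caller (void)
{
  assert (caller (2) == 9);
}
```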

For compatibility, it also accepts clang::musttail.

Passes bootstrap and the full test suite.

gcc/c-family/ChangeLog:

* c-attribs.cc (set_musttail_on_return): New function.
* c-common.h (set_musttail_on_return): Declare new function.

gcc/cp/ChangeLog:

PR c/83324
* cp-tree.h (AGGR_INIT_EXPR_MUST_TAIL): Add.
* parser.cc (cp_parser_statement): Handle musttail.
(cp_parser_jump_statement): Ditto.
* pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL.
* semantics.cc (simplify_aggr_init_expr): Handle musttail.

Diff:
---
 gcc/c-family/c-attribs.cc | 20 
 gcc/c-family/c-common.h   |  1 +
 gcc/cp/cp-tree.h  |  4 
 gcc/cp/parser.cc  | 32 +---
 gcc/cp/pt.cc  |  9 -
 gcc/cp/semantics.cc   |  1 +
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 5adc7b775eaf..685f212683f4 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -672,6 +672,26 @@ attribute_takes_identifier_p (const_tree attr_id)
 return targetm.attribute_takes_identifier_p (attr_id);
 }
 
+/* Set a musttail attribute MUSTTAIL_P on return expression RETVAL
+   at LOC.  */
+
+void
+set_musttail_on_return (tree retval, location_t loc, bool musttail_p)
+{
+  if (retval && musttail_p)
+{
+  tree t = retval;
+  if (TREE_CODE (t) == TARGET_EXPR)
+   t = TARGET_EXPR_INITIAL (t);
+  if (TREE_CODE (t) != CALL_EXPR)
+   error_at (loc, "cannot tail-call: return value must be a call");
+  else
+   CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+}
+  else if (musttail_p && !retval)
+error_at (loc, "cannot tail-call: return value must be a call");
+}
+
 /* Verify that argument value POS at position ARGNO to attribute NAME
applied to function FN (which is either a function declaration or function
type) refers to a function parameter at position POS and the expected type
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index adee822a3ae0..2510ee4dbc9d 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1648,6 +1648,7 @@ extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
+extern void set_musttail_on_return (tree, location_t, bool);
 
 /* In c-format.cc.  */
 extern bool valid_format_string_type_p (tree);
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 609d8941cf72..c6728328bc6f 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4236,6 +4236,10 @@ templated_operator_saved_lookups (tree t)
 #define AGGR_INIT_FROM_THUNK_P(NODE) \
   (AGGR_INIT_EXPR_CHECK (NODE)->base.protected_flag)
 
+/* Nonzero means that the call was marked musttail.  */
+#define AGGR_INIT_EXPR_MUST_TAIL(NODE) \
+  (AGGR_INIT_EXPR_CHECK (NODE)->base.static_flag)
+
 /* AGGR_INIT_EXPR accessors.  These are equivalent to the CALL_EXPR
accessors, except for AGGR_INIT_EXPR_SLOT (which takes the place of
CALL_EXPR_STATIC_CHAIN).  */
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index efd5d6f29a71..1fa0780944b6 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree 

[gcc r15-2168] Add a musttail generic attribute to the c-attribs table

2024-07-20 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:5c4c1fe6df0f752764cdfd7404a60bfd2b4f5057

commit r15-2168-g5c4c1fe6df0f752764cdfd7404a60bfd2b4f5057
Author: Andi Kleen 
Date:   Wed May 15 19:38:43 2024 -0700

Add a musttail generic attribute to the c-attribs table

The actual handling is directly in the parser since the
generic mechanism doesn't support statement attributes,
but this gives basic error checking/detection on the attribute.

gcc/c-family/ChangeLog:

PR c/83324
* c-attribs.cc (handle_musttail_attribute): Add.
* c-common.h (handle_musttail_attribute): Add.

Diff:
---
 gcc/c-family/c-attribs.cc | 15 +++
 gcc/c-family/c-common.h   |  1 +
 2 files changed, 16 insertions(+)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index f9b229aba7fc..5adc7b775eaf 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -340,6 +340,8 @@ const struct attribute_spec c_common_gnu_attributes[] =
   { "common", 0, 0, true,  false, false, false,
  handle_common_attribute,
  attr_common_exclusions },
+  { "musttail",  0, 0, false, false, false,
+ false, handle_musttail_attribute, NULL },
   /* FIXME: logically, noreturn attributes should be listed as
  "false, true, true" and apply to function types.  But implementing this
  would require all the places in the compiler that use TREE_THIS_VOLATILE
@@ -1222,6 +1224,19 @@ handle_common_attribute (tree *node, tree name, tree ARG_UNUSED (args),
   return NULL_TREE;
 }
 
+/* Handle a "musttail" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+tree
+handle_musttail_attribute (tree ARG_UNUSED (*node), tree name, tree ARG_UNUSED (args),
+  int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  /* Currently only a statement attribute, handled directly in parser.  */
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+  return NULL_TREE;
+}
+
 /* Handle a "noreturn" attribute; arguments as in
struct attribute_spec.handler.  */
 
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ccaea27c2b9f..adee822a3ae0 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1645,6 +1645,7 @@ extern tree find_tm_attribute (tree);
 extern const struct attribute_spec::exclusions attr_cold_hot_exclusions[];
 extern const struct attribute_spec::exclusions attr_noreturn_exclusions[];
 extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
+extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);


Re: [PATCH v10 1/3] C++: Support clang compatible [[musttail]] (PR83324)

2024-07-18 Thread Andi Kleen


Updated patch with the !retval bug fix identified by Marek.

This patch implements a clang compatible [[musttail]] attribute for
returns.
  
musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.
   
It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also accepts the [[clang::musttail]] spelling.

Passes bootstrap and the full test suite.

gcc/c-family/ChangeLog:

* c-attribs.cc (set_musttail_on_return): New function.
* c-common.h (set_musttail_on_return): Declare new function.

gcc/cp/ChangeLog:
 
PR c/83324
* cp-tree.h (AGGR_INIT_EXPR_MUST_TAIL): Add.
* parser.cc (cp_parser_statement): Handle musttail.
(cp_parser_jump_statement): Ditto.
* pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL.
* semantics.cc (simplify_aggr_init_expr): Handle musttail.

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 5adc7b775eaf..685f212683f4 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -672,6 +672,26 @@ attribute_takes_identifier_p (const_tree attr_id)
 return targetm.attribute_takes_identifier_p (attr_id);
 }
 
+/* Set a musttail attribute MUSTTAIL_P on return expression RETVAL
+   at LOC.  */
+
+void
+set_musttail_on_return (tree retval, location_t loc, bool musttail_p)
+{
+  if (retval && musttail_p)
+{
+  tree t = retval;
+  if (TREE_CODE (t) == TARGET_EXPR)
+   t = TARGET_EXPR_INITIAL (t);
+  if (TREE_CODE (t) != CALL_EXPR)
+   error_at (loc, "cannot tail-call: return value must be a call");
+  else
+   CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+}
+  else if (musttail_p && !retval)
+error_at (loc, "cannot tail-call: return value must be a call");
+}
+
 /* Verify that argument value POS at position ARGNO to attribute NAME
applied to function FN (which is either a function declaration or function
type) refers to a function parameter at position POS and the expected type
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index adee822a3ae0..2510ee4dbc9d 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1648,6 +1648,7 @@ extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
+extern void set_musttail_on_return (tree, location_t, bool);
 
 /* In c-format.cc.  */
 extern bool valid_format_string_type_p (tree);
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index c6f102564ce0..67ba3274eb1b 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4236,6 +4236,10 @@ templated_operator_saved_lookups (tree t)
 #define AGGR_INIT_FROM_THUNK_P(NODE) \
   (AGGR_INIT_EXPR_CHECK (NODE)->base.protected_flag)
 
+/* Nonzero means that the call was marked musttail.  */
+#define AGGR_INIT_EXPR_MUST_TAIL(NODE) \
+  (AGGR_INIT_EXPR_CHECK (NODE)->base.static_flag)
+
 /* AGGR_INIT_EXPR accessors.  These are equivalent to the CALL_EXPR
accessors, except for AGGR_INIT_EXPR_SLOT (which takes the place of
CALL_EXPR_STATIC_CHAIN).  */
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index efd5d6f29a71..1fa0780944b6 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree cp_parser_perform_range_for_lookup
 static tree cp_parser_range_for_member_function
   (tree, tree);
 static tree cp_parser_jump_statement
-  (cp_parser *);
+  (cp_parser *, tree &);
 static void cp_parser_declaration_statement
   (cp_parser *);
 
@@ -12757,7 +12757,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
case RID_CO_RETURN:
case RID_GOTO:
  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
- statement = cp_parser_jump_statement (parser);
+ statement = cp_parser_jump_statement (parser, std_attrs);
  break;
 
  /* Objective-C++ exception-handling constructs.  */
@@ 

Re: [PATCH v10 2/3] C: Implement musttail attribute for returns

2024-07-18 Thread Andi Kleen
> > > > +  set_musttail_on_return (retval, xloc, musttail_p);
> > > > +
> > > >if (retval)
> > > >  {
> > > >tree semantic_type = NULL_TREE;
> > > 
> > > Is it deliberate that set_musttail_on_return is called outside the
> > > if (retval) block?  If it can be moved into it, set_musttail_on_return
> > > can be simplified to assume that retval is always non-null.
> > 
> > Yes it can be removed.

Actually, after double checking, I was wrong here. The !retval case is
needed to diagnose a [[musttail]] set on a plain return (which is not
allowed according to the clang spec).

So the call has to stay outside the if (retval) check.

The C frontend did this correctly, but the C++ part did not (fixed now).

-Andi
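
To make the control flow concrete, here is a standalone toy model of the
decision set_musttail_on_return makes. It is not GCC code: Expr, Kind, Result
and mark_musttail are invented for this sketch and only mimic the
TARGET_EXPR / CALL_EXPR checks in the patch. It shows why the helper must also
run when there is no return value.

```cpp
#include <cassert>

// Invented stand-ins for GCC tree codes; only the cases the helper checks.
enum class Kind { Call, TargetExpr, Other };

struct Expr
{
  Kind kind;
  Expr *initial;   // for TargetExpr: the wrapped initializer, else nullptr
  bool must_tail;  // for Call: models CALL_EXPR_MUST_TAIL_CALL
};

enum class Result { Unchanged, Marked, Error };

// Toy version of the logic: RETVAL may be null for a plain "return;".
Result mark_musttail (Expr *retval, bool musttail_p)
{
  if (!musttail_p)
    return Result::Unchanged;  // no attribute, nothing to do

  if (!retval)
    return Result::Error;      // [[musttail]] on a plain return

  Expr *t = retval;
  if (t->kind == Kind::TargetExpr && t->initial)
    t = t->initial;            // look through the temporary wrapper

  if (t->kind != Kind::Call)
    return Result::Error;      // e.g. "return k;" with k a variable

  t->must_tail = true;
  return Result::Marked;
}
```

If the helper were only called inside the if (retval) block, the second branch
could never fire and a [[musttail]] on a plain return would be accepted
silently.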


Re: [PATCH v10 2/3] C: Implement musttail attribute for returns

2024-07-18 Thread Andi Kleen
On Thu, Jul 18, 2024 at 02:19:21PM -0400, Marek Polacek wrote:
> On Wed, Jul 17, 2024 at 09:30:00PM -0700, Andi Kleen wrote:
> > Implement a C23 clang compatible musttail attribute similar to the earlier
> > C++ implementation in the C parser.
> > 
> > gcc/c/ChangeLog:
> > 
> > PR c/83324
> > * c-parser.cc (struct attr_state): Define with musttail_p.
> > (c_parser_statement_after_labels): Handle [[musttail]].
> > (c_parser_std_attribute): Ditto.
> > (c_parser_handle_musttail): Ditto.
> > (c_parser_compound_statement_nostart): Ditto.
> > (c_parser_all_labels): Ditto.
> > (c_parser_statement): Ditto.
> > * c-tree.h (c_finish_return): Add musttail_p flag.
> > * c-typeck.cc (c_finish_return): Handle musttail_p flag.
> > ---
> >  gcc/c/c-parser.cc | 70 ++-
> >  gcc/c/c-tree.h|  2 +-
> >  gcc/c/c-typeck.cc |  7 +++--
> >  3 files changed, 63 insertions(+), 16 deletions(-)
> > 
> > diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> > index 12c5ed5d92c7..a8848d01f21a 100644
> > --- a/gcc/c/c-parser.cc
> > +++ b/gcc/c/c-parser.cc
> > @@ -1621,6 +1621,12 @@ struct omp_for_parse_data {
> >bool fail : 1;
> >  };
> >  
> > +struct attr_state
> > +{
> > +  /* True if we parsed a musttail attribute for return.  */
> > +  bool musttail_p;
> > +};
> > +
> >  static bool c_parser_nth_token_starts_std_attributes (c_parser *,
> >   unsigned int);
> >  static tree c_parser_std_attribute_specifier_sequence (c_parser *);
> > @@ -1665,7 +1671,7 @@ static location_t c_parser_compound_statement_nostart (c_parser *);
> >  static void c_parser_label (c_parser *, tree);
> >  static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
> >  static void c_parser_statement_after_labels (c_parser *, bool *,
> > -vec<tree> * = NULL);
> > +vec<tree> * = NULL, attr_state = {});
> 
> Nit: the line seems to go over 80 columns.

Ok.

> >  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
> >  || (c_parser_next_token_is (parser, CPP_NAME)
> > @@ -7346,7 +7384,10 @@ c_parser_all_labels (c_parser *parser)
> >std_attrs = NULL;
> >if ((have_std_attrs = c_parser_nth_token_starts_std_attributes (parser,
> >   1)))
> > -   std_attrs = c_parser_std_attribute_specifier_sequence (parser);
> > +   {
> > + std_attrs = c_parser_std_attribute_specifier_sequence (parser);
> > + std_attrs = c_parser_handle_musttail (parser, std_attrs, attr);
> > +   }
> 
> Thanks, I believe this addresses the testcase I mentioned earlier:
> 
>   struct str
>   {
> int a, b;
>   };
> 
>   struct str
>   cstruct (int x)
>   {
> if (x < 10)
>   L: // <
>   [[gnu::musttail]] return cstruct (x + 1);
> return ((struct str){ x, 0 });
>   }
> 
> but I didn't see that being tested in your testsuite patch; apologies if
> I missed it.

It wasn't there. I will add it.

> 
> >  tree
> > -c_finish_return (location_t loc, tree retval, tree origtype)
> > +c_finish_return (location_t loc, tree retval, tree origtype, bool musttail_p)
> >  {
> >tree valtype = TREE_TYPE (TREE_TYPE (current_function_decl)), ret_stmt;
> >bool no_warning = false;
> > @@ -11742,6 +11743,8 @@ c_finish_return (location_t loc, tree retval, tree origtype)
> >  warning_at (xloc, 0,
> > "function declared %<noreturn%> has a %<return%> statement");
> >  
> > +  set_musttail_on_return (retval, xloc, musttail_p);
> > +
> >if (retval)
> >  {
> >tree semantic_type = NULL_TREE;
> 
> Is it deliberate that set_musttail_on_return is called outside the
> if (retval) block?  If it can be moved into it, set_musttail_on_return
> can be simplified to assume that retval is always non-null.

Yes it can be removed.

Is the patch ok with these changes?

-Andi


[PATCH v10 3/3] Add tests for C/C++ musttail attributes

2024-07-17 Thread Andi Kleen
Some are adapted from the existing C musttail plugin tests.
Also extends the ability to query the sibcall capabilities of the
target.

gcc/testsuite/ChangeLog:

* testsuite/lib/target-supports.exp
(check_effective_target_struct_tail_call): New function.
* c-c++-common/musttail1.c: New test.
* c-c++-common/musttail2.c: New test.
* c-c++-common/musttail3.c: New test.
* c-c++-common/musttail4.c: New test.
* c-c++-common/musttail7.c: New test.
* c-c++-common/musttail8.c: New test.
* g++.dg/musttail6.C: New test.
* g++.dg/musttail9.C: New test.
* g++.dg/musttail10.C: New test.
* g++.dg/musttail11.C: New test.
---
 gcc/testsuite/c-c++-common/musttail1.c | 14 ++
 gcc/testsuite/c-c++-common/musttail2.c | 33 ++
 gcc/testsuite/c-c++-common/musttail3.c | 29 +
 gcc/testsuite/c-c++-common/musttail4.c | 17 
 gcc/testsuite/c-c++-common/musttail5.c | 28 
 gcc/testsuite/c-c++-common/musttail7.c | 14 ++
 gcc/testsuite/c-c++-common/musttail8.c | 17 
 gcc/testsuite/g++.dg/musttail10.C  | 40 +
 gcc/testsuite/g++.dg/musttail11.C  | 33 ++
 gcc/testsuite/g++.dg/musttail6.C   | 60 ++
 gcc/testsuite/g++.dg/musttail9.C   | 10 +
 gcc/testsuite/lib/target-supports.exp  |  9 
 12 files changed, 304 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/musttail1.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail2.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail3.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail4.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail5.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail7.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail8.c
 create mode 100644 gcc/testsuite/g++.dg/musttail10.C
 create mode 100644 gcc/testsuite/g++.dg/musttail11.C
 create mode 100644 gcc/testsuite/g++.dg/musttail6.C
 create mode 100644 gcc/testsuite/g++.dg/musttail9.C

diff --git a/gcc/testsuite/c-c++-common/musttail1.c b/gcc/testsuite/c-c++-common/musttail1.c
new file mode 100644
index 000000000000..74efcc2a0bc6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
+
+int __attribute__((noinline,noclone,noipa))
+callee (int i)
+{
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+caller (int i)
+{
+  [[gnu::musttail]] return callee (i + 1);
+}
diff --git a/gcc/testsuite/c-c++-common/musttail2.c b/gcc/testsuite/c-c++-common/musttail2.c
new file mode 100644
index 000000000000..86f2c3d77404
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[256]; int i; };
+
+int __attribute__((noinline,noclone,noipa))
+test_2_callee (int i, struct box b)
+{
+  if (b.field[0])
+return 5;
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+test_2_caller (int i)
+{
+  struct box b;
+  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot tail-call: " } */
+}
+
+extern void setjmp (void);
+void
+test_3 (void)
+{
+  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
+}
+
+extern float f7(void);
+
+int
+test_6 (void)
+{
+  [[gnu::musttail]] return f7(); /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail3.c b/gcc/testsuite/c-c++-common/musttail3.c
new file mode 100644
index 000000000000..ea9589c59ef2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail3.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+extern int foo2 (int x, ...);
+
+struct str
+{
+  int a, b;
+};
+
+struct str
+cstruct (int x)
+{
+  if (x < 10)
+[[clang::musttail]] return cstruct (x + 1);
+  return ((struct str){ x, 0 });
+}
+
+int
+foo (int x)
+{
+  if (x < 10)
+[[clang::musttail]] return foo2 (x, 29);
+  if (x < 100)
+{
+  int k = foo (x + 1);
+  [[clang::musttail]] return k;/* { dg-error "cannot tail-call: " } */
+}
+  return x;
+}
diff --git a/gcc/testsuite/c-c++-common/musttail4.c b/gcc/testsuite/c-c++-common/musttail4.c
new file mode 100644
index 000000000000..23f4b5e1cd68
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[64]; int i; };
+
+struct box __attribute__((noinline,noclone,noipa))
+returns_struct (int i)
+{
+  struct box b;
+  b.i = i * i;
+  return b;
+}
+
+int __attribute__((noinline,noclone))
+test_1 (int i)
+{
+  [[gnu::musttail]] return returns_struct (i * 5).i; /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail5.c b/gcc/testsuite/c-c++-common/musttail5.c

[PATCH v10 2/3] C: Implement musttail attribute for returns

2024-07-17 Thread Andi Kleen
Implement a C23 clang compatible musttail attribute similar to the earlier
C++ implementation in the C parser.

gcc/c/ChangeLog:

PR c/83324
* c-parser.cc (struct attr_state): Define with musttail_p.
(c_parser_statement_after_labels): Handle [[musttail]].
(c_parser_std_attribute): Ditto.
(c_parser_handle_musttail): Ditto.
(c_parser_compound_statement_nostart): Ditto.
(c_parser_all_labels): Ditto.
(c_parser_statement): Ditto.
* c-tree.h (c_finish_return): Add musttail_p flag.
* c-typeck.cc (c_finish_return): Handle musttail_p flag.
---
 gcc/c/c-parser.cc | 70 ++-
 gcc/c/c-tree.h|  2 +-
 gcc/c/c-typeck.cc |  7 +++--
 3 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 12c5ed5d92c7..a8848d01f21a 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1621,6 +1621,12 @@ struct omp_for_parse_data {
   bool fail : 1;
 };
 
+struct attr_state
+{
+  /* True if we parsed a musttail attribute for return.  */
+  bool musttail_p;
+};
+
 static bool c_parser_nth_token_starts_std_attributes (c_parser *,
  unsigned int);
 static tree c_parser_std_attribute_specifier_sequence (c_parser *);
@@ -1665,7 +1671,7 @@ static location_t c_parser_compound_statement_nostart (c_parser *);
 static void c_parser_label (c_parser *, tree);
 static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
 static void c_parser_statement_after_labels (c_parser *, bool *,
-vec<tree> * = NULL);
+vec<tree> * = NULL, attr_state = {});
 static tree c_parser_c99_block_statement (c_parser *, bool *,
  location_t * = NULL);
 static void c_parser_if_statement (c_parser *, bool *, vec<tree> *);
@@ -6982,6 +6988,29 @@ c_parser_handle_directive_omp_attributes (tree &attrs,
 }
 }
 
+/* Check if STD_ATTR contains a musttail attribute and remove if it
+   precedes a return.  PARSER is the parser and ATTR is the output
+   attr_state.  */
+
+static tree
+c_parser_handle_musttail (c_parser *parser, tree std_attrs, attr_state &attr)
+{
+  if (c_parser_next_token_is_keyword (parser, RID_RETURN))
+{
+  if (lookup_attribute ("gnu", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+ attr.musttail_p = true;
+   }
+  if (lookup_attribute ("clang", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+ attr.musttail_p = true;
+   }
+}
+  return std_attrs;
+}
+
 /* Parse a compound statement except for the opening brace.  This is
used for parsing both compound statements and statement expressions
(which follow different paths to handling the opening).  */
@@ -6998,6 +7027,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
   bool in_omp_loop_block
 = omp_for_parse_state ? omp_for_parse_state->want_nested_loop : false;
   tree sl = NULL_TREE;
+  attr_state a = {};
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
 {
@@ -7138,7 +7168,10 @@ c_parser_compound_statement_nostart (c_parser *parser)
= c_parser_nth_token_starts_std_attributes (parser, 1);
   tree std_attrs = NULL_TREE;
   if (have_std_attrs)
-   std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+   {
+ std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+ std_attrs = c_parser_handle_musttail (parser, std_attrs, a);
+   }
   if (c_parser_next_token_is_keyword (parser, RID_CASE)
  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
  || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7286,7 +7319,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
  last_stmt = true;
  mark_valid_location_for_stdc_pragma (false);
  if (!omp_for_parse_state)
-   c_parser_statement_after_labels (parser, NULL);
+   c_parser_statement_after_labels (parser, NULL, NULL, a);
  else
{
  /* In canonical loop nest form, nested loops can only appear
@@ -7328,15 +7361,20 @@ c_parser_compound_statement_nostart (c_parser *parser)
 /* Parse all consecutive labels, possibly preceded by standard
attributes.  In this context, a statement is required, not a
declaration, so attributes must be followed by a statement that is
-   not just a semicolon.  */
+   not just a semicolon.  Returns an attr_state.  */
 
-static void
+static attr_state
 c_parser_all_labels (c_parser *parser)
 {
+  attr_state attr = {};
   bool have_std_attrs;
   tree std_attrs = NULL;
   if ((have_std_attrs = c_parser_nth_token_starts_std_attributes (parser, 1)))
-std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+{
+  std_attrs = 

[PATCH v10 1/3] C++: Support clang compatible [[musttail]] (PR83324)

2024-07-17 Thread Andi Kleen
This patch implements a clang compatible [[musttail]] attribute for
returns.

musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also accepts the [[clang::musttail]] spelling.

Passes bootstrap and the full test suite.

gcc/c-family/ChangeLog:

* c-attribs.cc (set_musttail_on_return): New function.
* c-common.h (set_musttail_on_return): Declare new function.

gcc/cp/ChangeLog:

PR c/83324
* cp-tree.h (AGGR_INIT_EXPR_MUST_TAIL): Add.
* parser.cc (cp_parser_statement): Handle musttail.
(cp_parser_jump_statement): Ditto.
* pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL.
* semantics.cc (simplify_aggr_init_expr): Handle musttail.
---
 gcc/c-family/c-attribs.cc | 20 
 gcc/c-family/c-common.h   |  1 +
 gcc/cp/cp-tree.h  |  4 
 gcc/cp/parser.cc  | 32 +---
 gcc/cp/pt.cc  |  9 -
 gcc/cp/semantics.cc   |  1 +
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 5adc7b775eaf..685f212683f4 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -672,6 +672,26 @@ attribute_takes_identifier_p (const_tree attr_id)
 return targetm.attribute_takes_identifier_p (attr_id);
 }
 
+/* Set a musttail attribute MUSTTAIL_P on return expression RETVAL
+   at LOC.  */
+
+void
+set_musttail_on_return (tree retval, location_t loc, bool musttail_p)
+{
+  if (retval && musttail_p)
+{
+  tree t = retval;
+  if (TREE_CODE (t) == TARGET_EXPR)
+   t = TARGET_EXPR_INITIAL (t);
+  if (TREE_CODE (t) != CALL_EXPR)
+   error_at (loc, "cannot tail-call: return value must be a call");
+  else
+   CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+}
+  else if (musttail_p && !retval)
+error_at (loc, "cannot tail-call: return value must be a call");
+}
+
 /* Verify that argument value POS at position ARGNO to attribute NAME
applied to function FN (which is either a function declaration or function
type) refers to a function parameter at position POS and the expected type
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index adee822a3ae0..2510ee4dbc9d 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1648,6 +1648,7 @@ extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
+extern void set_musttail_on_return (tree, location_t, bool);
 
 /* In c-format.cc.  */
 extern bool valid_format_string_type_p (tree);
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index c6f102564ce0..67ba3274eb1b 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4236,6 +4236,10 @@ templated_operator_saved_lookups (tree t)
 #define AGGR_INIT_FROM_THUNK_P(NODE) \
   (AGGR_INIT_EXPR_CHECK (NODE)->base.protected_flag)
 
+/* Nonzero means that the call was marked musttail.  */
+#define AGGR_INIT_EXPR_MUST_TAIL(NODE) \
+  (AGGR_INIT_EXPR_CHECK (NODE)->base.static_flag)
+
 /* AGGR_INIT_EXPR accessors.  These are equivalent to the CALL_EXPR
accessors, except for AGGR_INIT_EXPR_SLOT (which takes the place of
CALL_EXPR_STATIC_CHAIN).  */
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index efd5d6f29a71..71bffd4a9311 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree cp_parser_perform_range_for_lookup
 static tree cp_parser_range_for_member_function
   (tree, tree);
 static tree cp_parser_jump_statement
-  (cp_parser *);
+  (cp_parser *, tree &);
 static void cp_parser_declaration_statement
   (cp_parser *);
 
@@ -12757,7 +12757,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
case RID_CO_RETURN:
case RID_GOTO:
  std_attrs = process_stmt_hotness_attribute (std_attrs, 

Remaining frontend patches for musttail

2024-07-17 Thread Andi Kleen
This patchkit contains the remaining C/C++ frontend patches for the
[[musttail]] extension that still need approval for trunk. I already
committed the tree-ssa and RTL pieces.

C: I addressed Marek's feedback, but still need a final ack. Marek, can
you please take a look?

C++: Fixed support for AGGR_INIT_EXPR thanks to Jason's prodding.

Tests: Addressed Jason's feedback; the tests now hopefully cover all the
class passing cases. I split some tests to create a full set of errors,
otherwise frontend errors would stop the tree optimizers from running.

The class passing tests are showing another problem in the middle-end
code where implicit calls generated by the C++ frontend stop
tree-tailcall early, so it can't identify the user written tail call.
This results in cryptic "cannot tail-call: other reasons" fallback
errors, which is not ideal, but also not a show stopper.
Currently this is hidden in the test suite by running that test at -O0. 

-Andi





[gcc r15-2126] Mark expand musttail error messages for translation

2024-07-17 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:d062b0abf45cd54057352fc4b7827a3b1b9a160a

commit r15-2126-gd062b0abf45cd54057352fc4b7827a3b1b9a160a
Author: Andi Kleen 
Date:   Fri Jun 21 11:19:12 2024 -0700

Mark expand musttail error messages for translation

The musttail error messages are reported to the user, so must be
translated.
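
For readers unfamiliar with the convention, the marking works roughly like
the sketch below. Everything in it is a stand-in: toy_gettext and its
one-entry catalog are invented, and the macro is called TR here only to keep
the example self-contained (GCC spells the marker _() and, as far as I
understand, routes it to gettext). The "[xx]" translation is fake.

```cpp
#include <cassert>
#include <cstring>

// Invented stand-in for a gettext-style lookup with a one-entry catalog.
const char *toy_gettext (const char *msgid)
{
  struct Entry { const char *msgid; const char *translated; };
  static const Entry catalog[] = {
    { "callee returns twice", "[xx] callee returns twice" },
  };
  for (const Entry &e : catalog)
    if (std::strcmp (e.msgid, msgid) == 0)
      return e.translated;
  return msgid;  // messages without a catalog entry pass through unchanged
}

// Marker macro: tags the literal for extraction and looks it up at run time.
#define TR(msgid) toy_gettext (msgid)

// Hypothetical callers, loosely modeled on the reasons passed in calls.cc.
const char *reason_returns_twice ()
{
  return TR ("callee returns twice");
}

const char *reason_struct_return ()
{
  // Adjacent literals are concatenated into one msgid inside the marker,
  // which is why the patch wraps the whole multi-line string at once.
  return TR ("callee returns" " a structure");
}
```

The second function mirrors the shape of the multi-line messages in the diff
below: the marker goes around the complete concatenated string, not around
each fragment.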

gcc/ChangeLog:

PR c/83324
* calls.cc (initialize_argument_information): Mark messages
for translation.
(can_implement_as_sibling_call_p): Ditto.
(expand_call): Ditto.

Diff:
---
 gcc/calls.cc | 56 
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/gcc/calls.cc b/gcc/calls.cc
index 883eb9971257..f28c58217fdf 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -1420,9 +1420,9 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
{
  *may_tailcall = false;
  maybe_complain_about_tail_call (exp,
- "a callee-copied argument is"
- " stored in the current"
- " function's frame");
+ _("a callee-copied argument is"
+   " stored in the current"
+   " function's frame"));
}
 
  args[i].tree_value = build_fold_addr_expr_loc (loc,
@@ -1489,8 +1489,8 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
  type = TREE_TYPE (args[i].tree_value);
  *may_tailcall = false;
  maybe_complain_about_tail_call (exp,
- "argument must be passed"
- " by copying");
+ _("argument must be passed"
+   " by copying"));
}
  arg.pass_by_reference = true;
}
@@ -2508,8 +2508,8 @@ can_implement_as_sibling_call_p (tree exp,
 {
   maybe_complain_about_tail_call
(exp,
-"machine description does not have"
-" a sibcall_epilogue instruction pattern");
+_("machine description does not have"
+  " a sibcall_epilogue instruction pattern"));
   return false;
 }
 
@@ -2519,7 +2519,7 @@ can_implement_as_sibling_call_p (tree exp,
  sibling calls will return a structure.  */
   if (structure_value_addr != NULL_RTX)
 {
-  maybe_complain_about_tail_call (exp, "callee returns a structure");
+  maybe_complain_about_tail_call (exp, _("callee returns a structure"));
   return false;
 }
 
@@ -2528,8 +2528,8 @@ can_implement_as_sibling_call_p (tree exp,
   if (!targetm.function_ok_for_sibcall (fndecl, exp))
 {
   maybe_complain_about_tail_call (exp,
- "target is not able to optimize the"
- " call into a sibling call");
+ _("target is not able to optimize the"
+   " call into a sibling call"));
   return false;
 }
 
@@ -2537,18 +2537,18 @@ can_implement_as_sibling_call_p (tree exp,
  optimized.  */
   if (flags & ECF_RETURNS_TWICE)
 {
-  maybe_complain_about_tail_call (exp, "callee returns twice");
+  maybe_complain_about_tail_call (exp, _("callee returns twice"));
   return false;
 }
   if (flags & ECF_NORETURN)
 {
-  maybe_complain_about_tail_call (exp, "callee does not return");
+  maybe_complain_about_tail_call (exp, _("callee does not return"));
   return false;
 }
 
   if (TYPE_VOLATILE (TREE_TYPE (TREE_TYPE (addr))))
 {
-  maybe_complain_about_tail_call (exp, "volatile function type");
+  maybe_complain_about_tail_call (exp, _("volatile function type"));
   return false;
 }
 
@@ -2567,7 +2567,7 @@ can_implement_as_sibling_call_p (tree exp,
  the argument areas are shared.  */
   if (fndecl && decl_function_context (fndecl) == current_function_decl)
 {
-  maybe_complain_about_tail_call (exp, "nested function");
+  maybe_complain_about_tail_call (exp, _("nested function"));
   return false;
 }
 
@@ -2579,8 +2579,8 @@ can_implement_as_sibling_call_p (tree exp,
crtl->args.size - crtl->args.pretend_args_size))
 {
   maybe_complain_about_tail_call (exp,
- "callee required mo

[gcc r15-2125] Give better error messages for musttail

2024-07-17 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:81824596361cf4919d6dc026155160581c99b860

commit r15-2125-g81824596361cf4919d6dc026155160581c99b860
Author: Andi Kleen 
Date:   Tue May 21 07:01:57 2024 -0700

Give better error messages for musttail

When musttail is set, make tree-tailcall give error messages
when it cannot handle a call. This avoids vague "other reasons"
error messages later at expand time when it sees a musttail
function not marked tail call.

In various cases this requires delaying the error until
the call is discovered.

Also print more information on the failure to the dump file.
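
To illustrate the idea of reporting the first concrete reason instead of
a vague late failure, here is a small standalone model. It is only a
sketch: CallerInfo and first_tail_call_blocker are hypothetical names
invented for this example and are not part of GCC. Only the reason
strings and their order are taken from the diff below.

```cpp
#include <string>

// Hypothetical stand-in for the per-function state the pass consults.
// The field names loosely follow the cfun flags in the diff, but this
// struct is invented for illustration and is not a GCC type.
struct CallerInfo {
  bool stdarg = false;
  bool calls_alloca = false;
  bool sjlj_eh_with_handlers = false;
  bool calls_setjmp = false;
  bool calls_eh_return = false;
  bool param_address_taken = false;
};

// Returns an empty string if nothing blocks a tail call, otherwise the
// first blocking reason, checked in the same order as the patch.
std::string first_tail_call_blocker(const CallerInfo &c) {
  if (c.stdarg)
    return "caller uses stdargs";
  if (c.calls_alloca)
    return "caller uses alloca";
  if (c.sjlj_eh_with_handlers)
    return "caller uses sjlj exceptions";
  if (c.calls_setjmp)
    return "caller uses setjmp";
  if (c.calls_eh_return)
    return "caller uses __builtin_eh_return";
  if (c.param_address_taken)
    return "address of caller arguments taken";
  return "";
}
```

In this model a caller that both calls setjmp and takes the address of a
parameter is reported with the setjmp reason, since that check comes
first.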

gcc/ChangeLog:

PR c/83324
* tree-tailcall.cc (maybe_error_musttail): New function.
(suitable_for_tail_opt_p): Report error reason.
(suitable_for_tail_call_opt_p): Report error reason.
(find_tail_calls): Accept basic blocks with abnormal edges.
Delay reporting of errors until the call is discovered.
Move top level suitability checks to here.
(tree_optimize_tail_calls_1): Remove top level checks.

Diff:
---
 gcc/tree-tailcall.cc | 187 ++-
 1 file changed, 154 insertions(+), 33 deletions(-)

diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index 43e8c25215cb..a68079d4f507 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -40,9 +40,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-eh.h"
 #include "dbgcnt.h"
 #include "cfgloop.h"
+#include "intl.h"
 #include "common/common-target.h"
 #include "ipa-utils.h"
 #include "tree-ssa-live.h"
+#include "diagnostic-core.h"
 
 /* The file implements the tail recursion elimination.  It is also used to
analyze the tail calls in general, passing the results to the rtl level
@@ -131,14 +133,20 @@ static tree m_acc, a_acc;
 
 static bitmap tailr_arg_needs_copy;
 
+static void maybe_error_musttail (gcall *call, const char *err);
+
 /* Returns false when the function is not suitable for tail call optimization
-   from some reason (e.g. if it takes variable number of arguments).  */
+   from some reason (e.g. if it takes variable number of arguments). CALL
+   is call to report for.  */
 
 static bool
-suitable_for_tail_opt_p (void)
+suitable_for_tail_opt_p (gcall *call)
 {
   if (cfun->stdarg)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses stdargs"));
+  return false;
+}
 
   return true;
 }
@@ -146,35 +154,47 @@ suitable_for_tail_opt_p (void)
 /* Returns false when the function is not suitable for tail call optimization
for some reason (e.g. if it takes variable number of arguments).
This test must pass in addition to suitable_for_tail_opt_p in order to make
-   tail call discovery happen.  */
+   tail call discovery happen. CALL is call to report error for.  */
 
 static bool
-suitable_for_tail_call_opt_p (void)
+suitable_for_tail_call_opt_p (gcall *call)
 {
   tree param;
 
   /* alloca (until we have stack slot life analysis) inhibits
  sibling call optimizations, but not tail recursion.  */
   if (cfun->calls_alloca)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses alloca"));
+  return false;
+}
 
   /* If we are using sjlj exceptions, we may need to add a call to
  _Unwind_SjLj_Unregister at exit of the function.  Which means
  that we cannot do any sibcall transformations.  */
   if (targetm_common.except_unwind_info (&global_options) == UI_SJLJ
   && current_function_has_exception_handlers ())
-return false;
+{
+  maybe_error_musttail (call, _("caller uses sjlj exceptions"));
+  return false;
+}
 
   /* Any function that calls setjmp might have longjmp called from
  any called function.  ??? We really should represent this
  properly in the CFG so that this needn't be special cased.  */
   if (cfun->calls_setjmp)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses setjmp"));
+  return false;
+}
 
   /* Various targets don't handle tail calls correctly in functions
  that call __builtin_eh_return.  */
   if (cfun->calls_eh_return)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses __builtin_eh_return"));
+  return false;
+}
 
   /* ??? It is OK if the argument of a function is taken in some cases,
  but not in all cases.  See PR15387 and PR19616.  Revisit for 4.1.  */
@@ -182,7 +202,10 @@ suitable_for_tail_call_opt_p (void)
param;
param = DECL_CHAIN (param))
 if (TREE_ADDRESSABLE (param))
-  return false;
+  {
+   maybe_error_musttail (call, _("address of caller arguments taken"));
+   return false;
+  }
 
   return true;
 }
@@ -402,16 +425,42 @@ propagate_thr

[gcc r15-2124] Enable musttail tail conversion even when not optimizing

2024-07-17 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:b738a63e528db410a1c51fc27db37fe22f0cb397

commit r15-2124-gb738a63e528db410a1c51fc27db37fe22f0cb397
Author: Andi Kleen 
Date:   Wed May 15 19:57:22 2024 -0700

Enable musttail tail conversion even when not optimizing

Enable the tailcall optimization for non-optimizing builds,
but in this case only check calls that have the musttail attribute set.
This makes musttail work without optimization.

This is done with a new late musttail pass that is only active when
not optimizing. The new pass relies on tree-cfg to discover musttails.
This avoids a ~0.8% compiler run time penalty at -O0.
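
As a rough illustration of what this enables, the following sketch uses
mutual recursion that depends on the tail call being performed even in a
non-optimizing build. The MUSTTAIL macro and the function names are
assumptions made for this example. On compilers that do not advertise
the attribute the macro expands to nothing, so the code still builds but
loses the guarantee.

```cpp
// Pick a musttail spelling if the compiler advertises one; otherwise
// expand to nothing (the code still works for shallow recursion).
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(gnu::musttail)
#    define MUSTTAIL [[gnu::musttail]]
#  elif __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

bool is_odd(unsigned n);

// Both functions have the same signature, and each return is a direct
// call, which is what the attribute requires.
bool is_even(unsigned n) {
  if (n == 0)
    return true;
  MUSTTAIL return is_odd(n - 1);
}

bool is_odd(unsigned n) {
  if (n == 0)
    return false;
  MUSTTAIL return is_even(n - 1);
}
```

With the attribute honored, is_even should run in constant stack space
for large n regardless of the optimization level. Without it, deep
inputs may overflow the stack at -O0.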

gcc/ChangeLog:

PR c/83324
* function.h (struct function): Add has_musttail.
* lto-streamer-in.cc (input_struct_function_base): Stream
has_musttail.
* lto-streamer-out.cc (output_struct_function_base): Ditto.
* passes.def (pass_musttail): Add.
* tree-cfg.cc (notice_special_calls): Record has_musttail.
(clear_special_calls): Clear has_musttail.
* tree-pass.h (make_pass_musttail): Add.
* tree-tailcall.cc (find_tail_calls): Handle only_musttail
argument.
(tree_optimize_tail_calls_1): Pass on only_musttail.
(execute_tail_calls): Pass only_musttail as false.
(class pass_musttail): Add.
(make_pass_musttail): Add.

Diff:
---
 gcc/function.h  |  3 +++
 gcc/lto-streamer-in.cc  |  1 +
 gcc/lto-streamer-out.cc |  1 +
 gcc/passes.def  |  1 +
 gcc/tree-cfg.cc |  3 +++
 gcc/tree-pass.h |  1 +
 gcc/tree-tailcall.cc| 68 ++---
 7 files changed, 69 insertions(+), 9 deletions(-)

diff --git a/gcc/function.h b/gcc/function.h
index c0ba6cc1531a..fbeadeaf4104 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -430,6 +430,9 @@ struct GTY(()) function {
   /* Nonzero when the tail call has been identified.  */
   unsigned int tail_call_marked : 1;
 
+  /* Has musttail marked calls.  */
+  unsigned int has_musttail : 1;
+
   /* Nonzero if the current function contains a #pragma GCC unroll.  */
   unsigned int has_unroll : 1;
 
diff --git a/gcc/lto-streamer-in.cc b/gcc/lto-streamer-in.cc
index ad0ca24007a0..2e592be80823 100644
--- a/gcc/lto-streamer-in.cc
+++ b/gcc/lto-streamer-in.cc
@@ -1325,6 +1325,7 @@ input_struct_function_base (struct function *fn, class data_in *data_in,
   fn->calls_eh_return = bp_unpack_value (&bp, 1);
   fn->has_force_vectorize_loops = bp_unpack_value (&bp, 1);
   fn->has_simduid_loops = bp_unpack_value (&bp, 1);
+  fn->has_musttail = bp_unpack_value (&bp, 1);
   fn->assume_function = bp_unpack_value (&bp, 1);
   fn->va_list_fpr_size = bp_unpack_value (&bp, 8);
   fn->va_list_gpr_size = bp_unpack_value (&bp, 8);
diff --git a/gcc/lto-streamer-out.cc b/gcc/lto-streamer-out.cc
index 8b4bf9659cb3..c329ac8af958 100644
--- a/gcc/lto-streamer-out.cc
+++ b/gcc/lto-streamer-out.cc
@@ -2282,6 +2282,7 @@ output_struct_function_base (struct output_block *ob, struct function *fn)
   bp_pack_value (&bp, fn->calls_eh_return, 1);
   bp_pack_value (&bp, fn->has_force_vectorize_loops, 1);
   bp_pack_value (&bp, fn->has_simduid_loops, 1);
+  bp_pack_value (&bp, fn->has_musttail, 1);
   bp_pack_value (&bp, fn->assume_function, 1);
   bp_pack_value (&bp, fn->va_list_fpr_size, 8);
   bp_pack_value (&bp, fn->va_list_gpr_size, 8);
diff --git a/gcc/passes.def b/gcc/passes.def
index 94386ba54577..5e7f9395d84f 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -444,6 +444,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_tsan_O0);
   NEXT_PASS (pass_sanopt);
   NEXT_PASS (pass_cleanup_eh);
+  NEXT_PASS (pass_musttail);
   NEXT_PASS (pass_lower_resx);
   NEXT_PASS (pass_nrv);
   NEXT_PASS (pass_gimple_isel);
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 7fb7b92966be..e6fd1294b958 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -2290,6 +2290,8 @@ notice_special_calls (gcall *call)
 cfun->calls_alloca = true;
   if (flags & ECF_RETURNS_TWICE)
 cfun->calls_setjmp = true;
+  if (gimple_call_must_tail_p (call))
+cfun->has_musttail = true;
 }
 
 
@@ -2301,6 +2303,7 @@ clear_special_calls (void)
 {
   cfun->calls_alloca = false;
   cfun->calls_setjmp = false;
+  cfun->has_musttail = false;
 }
 
 /* Remove PHI nodes associated with basic block BB and all edges out of BB.  */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 84bc91b51e9d..3a0cf13089e2 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -368,6 +368,7 @@ extern gimple_opt_pass *make_pass_sra (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sra_early (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tail_recursion (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tail_calls (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_musttail (gcc::context *ctxt);
 extern gi

[gcc r15-2123] Fix pro_and_epilogue for sibcalls at -O0 (PR115255)

2024-07-17 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:983daf0e5fdaada5b930374c21455d42d34350be

commit r15-2123-g983daf0e5fdaada5b930374c21455d42d34350be
Author: Andi Kleen 
Date:   Sat Jun 1 22:04:41 2024 -0700

Fix pro_and_epilogue for sibcalls at -O0 (PR115255)

Some of the cfg fixups in pro_and_epilogue for sibcalls were dependent on
"optimize". Make them check cfun->tail_call_marked instead to handle the
-O0 musttail case. This fixes the musttail test cases on arm targets.

gcc/ChangeLog:

PR target/115255
* function.cc (thread_prologue_and_epilogue_insns): Check
cfun->tail_call_marked for sibcalls too.
(rest_of_handle_thread_prologue_and_epilogue): Ditto.

Diff:
---
 gcc/function.cc | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/function.cc b/gcc/function.cc
index 4edd4da12474..a6f6de349420 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -2231,6 +2231,7 @@ use_register_for_decl (const_tree decl)
   /* We don't set DECL_IGNORED_P for the function_result_decl.  */
   if (optimize)
return true;
+  /* Needed for [[musttail]] which can operate even at -O0 */
   if (cfun->tail_call_marked)
return true;
   /* We don't set DECL_REGISTER for the function_result_decl.  */
@@ -6259,8 +6260,11 @@ thread_prologue_and_epilogue_insns (void)
 }
 
   /* Threading the prologue and epilogue changes the artificial refs in the
- entry and exit blocks, and may invalidate DF info for tail calls.  */
+ entry and exit blocks, and may invalidate DF info for tail calls.
+ This is also needed for [[musttail]] conversion even when not
+ optimizing.  */
   if (optimize
+  || cfun->tail_call_marked
   || flag_optimize_sibling_calls
   || flag_ipa_icf_functions
   || in_lto_p)
@@ -6557,7 +6561,7 @@ rest_of_handle_thread_prologue_and_epilogue (function *fun)
 {
   /* prepare_shrink_wrap is sensitive to the block structure of the control
  flow graph, so clean it up first.  */
-  if (optimize)
+  if (cfun->tail_call_marked || optimize)
 cleanup_cfg (0);
 
   /* On some machines, the prologue and epilogue code, or parts thereof,


[gcc r15-2122] Improve must tail in RTL backend

2024-07-17 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:a6502accf381358173b19e615fdeb0aa17949c93

commit r15-2122-ga6502accf381358173b19e615fdeb0aa17949c93
Author: Andi Kleen 
Date:   Tue Jan 23 23:42:08 2024 -0800

Improve must tail in RTL backend

- Give error messages for all causes of non sibling call generation
- When giving error messages clear the musttail flag to avoid ICEs
- Error out when tree-tailcall failed to mark a must-tail call as a
sibcall. In this case expand doesn't know the true reason and can only
give a vague message.

gcc/ChangeLog:

PR c/83324
* calls.cc (maybe_complain_about_tail_call): Clear must tail
flag on error.
(expand_call): Give error messages for all musttail failures.

Diff:
---
 gcc/calls.cc | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/gcc/calls.cc b/gcc/calls.cc
index 21d78f9779fe..883eb9971257 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -1249,6 +1249,7 @@ maybe_complain_about_tail_call (tree call_expr, const char *reason)
 return;
 
   error_at (EXPR_LOCATION (call_expr), "cannot tail-call: %s", reason);
+  CALL_EXPR_MUST_TAIL_CALL (call_expr) = 0;
 }
 
 /* Fill in ARGS_SIZE and ARGS array based on the parameters found in
@@ -2650,7 +2651,13 @@ expand_call (tree exp, rtx target, int ignore)
   /* The type of the function being called.  */
   tree fntype;
   bool try_tail_call = CALL_EXPR_TAILCALL (exp);
-  bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp);
+  /* tree-tailcall decided not to do tail calls. Error for the musttail case,
+ unfortunately we don't know the reason so it's fairly vague.
+ When tree-tailcall reported an error it already cleared the flag,
+ so this shouldn't really happen unless the
+ musttail pass gave up walking before finding the call.  */
+  if (!try_tail_call)
+  maybe_complain_about_tail_call (exp, "other reasons");
   int pass;
 
   /* Register in which non-BLKmode value will be returned,
@@ -3022,10 +3029,21 @@ expand_call (tree exp, rtx target, int ignore)
  pushed these optimizations into -O2.  Don't try if we're already
  expanding a call, as that means we're an argument.  Don't try if
  there's cleanups, as we know there's code to follow the call.  */
-  if (currently_expanding_call++ != 0
-  || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp))
-  || args_size.var
-  || dbg_cnt (tail_call) == false)
+  if (currently_expanding_call++ != 0)
+{
+  maybe_complain_about_tail_call (exp, "inside another call");
+  try_tail_call = 0;
+}
+  if (!flag_optimize_sibling_calls
+   && !CALL_FROM_THUNK_P (exp)
+   && !CALL_EXPR_MUST_TAIL_CALL (exp))
+try_tail_call = 0;
+  if (args_size.var)
+{
+  maybe_complain_about_tail_call (exp, "variable size arguments");
+  try_tail_call = 0;
+}
+  if (dbg_cnt (tail_call) == false)
 try_tail_call = 0;
 
   /* Workaround buggy C/C++ wrappers around Fortran routines with
@@ -3046,13 +3064,15 @@ expand_call (tree exp, rtx target, int ignore)
if (MEM_P (*iter))
  {
try_tail_call = 0;
+   maybe_complain_about_tail_call (exp,
+   "hidden string length argument passed on stack");
break;
  }
}
 
   /* If the user has marked the function as requiring tail-call
  optimization, attempt it.  */
-  if (must_tail_call)
+  if (CALL_EXPR_MUST_TAIL_CALL (exp))
 try_tail_call = 1;
 
   /*  Rest of purposes for tail call optimizations to fail.  */


Re: [PATCH v9 08/10] Add tests for C/C++ musttail attributes

2024-07-17 Thread Andi Kleen
> Great. Does it also work in a non-template function?

Sadly it did not, because there needs to be more AGGR_INIT_EXPR handling,
as you predicted at some point. I fixed it now. Will send updated patches.

-Andi


Re: [PATCH v9 08/10] Add tests for C/C++ musttail attributes

2024-07-16 Thread Andi Kleen
On Tue, Jul 16, 2024 at 06:06:42PM -0400, Jason Merrill wrote:
> On 7/16/24 5:55 PM, Andi Kleen wrote:
> > On Tue, Jul 16, 2024 at 12:52:31PM -0700, Andi Kleen wrote:
> > > On Tue, Jul 16, 2024 at 02:51:13PM -0400, Jason Merrill wrote:
> > > > On 7/16/24 12:18 PM, Andi Kleen wrote:
> > > > > On Tue, Jul 16, 2024 at 11:17:14AM -0400, Jason Merrill wrote:
> > > > > > On 7/16/24 11:15 AM, Andi Kleen wrote:
> > > > > > > > In the adjusted test it looks like the types of f and g match,
> > > > > > > > so I wouldn't expect an error.
> > > > > > > 
> > > > > > > Good point! Missing the forest for the trees.
> > > > > > > 
> > > > > > > Anyways are the C++ patches ok with this change?
> > > > > > 
> > > > > > I'm still looking for a test which does error because the types are
> > > > > > different.
> > > > > 
> > > > > Like this?
> > > > 
> > > > Where the called function returns C and the calling function does not.
> > > 
> > > In this case the attribute seems to get lost and it succeeds.
> > 
> > This somewhat hackish patch fixes it here, to handle the case
> > of a TARGET_EXPR where the CALL_EXPR is in the cleanup. extract_call
> > bails on that.
> 
> The CALL_EXPR in the cleanup is calling the destructor, that's not what
> we're trying to tail-call.
> 
> I think the problem here is that the call to f is represented with an
> AGGR_INIT_EXPR instead of CALL_EXPR, so you need to handle the flag on that
> tree_code as well.


Okay, this seems to work.
(I had to adjust the test case because it now correctly errors out
on passing the class at -O0.)


diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 4bb3e9c4989b..5ec8102c1849 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4245,6 +4245,10 @@ templated_operator_saved_lookups (tree t)
 #define AGGR_INIT_FROM_THUNK_P(NODE) \
   (AGGR_INIT_EXPR_CHECK (NODE)->base.protected_flag)
 
+/* Nonzero means that the call was marked musttail.  */
+#define AGGR_INIT_EXPR_MUST_TAIL(NODE) \
+  (AGGR_INIT_EXPR_CHECK (NODE)->base.static_flag)
+
 /* AGGR_INIT_EXPR accessors.  These are equivalent to the CALL_EXPR
accessors, except for AGGR_INIT_EXPR_SLOT (which takes the place of
CALL_EXPR_STATIC_CHAIN).  */
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 3b914089a6e2..d668c5af6a23 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -21124,6 +21124,8 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
  CALL_EXPR_REVERSE_ARGS (call) = rev;
  if (TREE_CODE (call) == CALL_EXPR)
CALL_EXPR_MUST_TAIL_CALL (call) = mtc;
+ else if (TREE_CODE (call) == AGGR_INIT_EXPR)
+   AGGR_INIT_EXPR_MUST_TAIL (call) = mtc;
}
if (warning_suppressed_p (t, OPT_Wpessimizing_move))
  /* This also suppresses -Wredundant-move.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index cd3df13772db..fb45974cd90f 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -4979,6 +4979,7 @@ simplify_aggr_init_expr (tree *tp)
 = CALL_EXPR_OPERATOR_SYNTAX (aggr_init_expr);
   CALL_EXPR_ORDERED_ARGS (call_expr) = CALL_EXPR_ORDERED_ARGS (aggr_init_expr);
   CALL_EXPR_REVERSE_ARGS (call_expr) = CALL_EXPR_REVERSE_ARGS (aggr_init_expr);
+  CALL_EXPR_MUST_TAIL_CALL (call_expr) = AGGR_INIT_EXPR_MUST_TAIL (aggr_init_expr);
 
   if (style == ctor)
 {
diff --git a/gcc/testsuite/g++.dg/musttail10.C b/gcc/testsuite/g++.dg/musttail10.C
index e454a6238a06..93ec32db160a 100644
--- a/gcc/testsuite/g++.dg/musttail10.C
+++ b/gcc/testsuite/g++.dg/musttail10.C
@@ -14,9 +14,11 @@ template 
 __attribute__((noinline, noclone, noipa))
 T g2() { [[gnu::musttail]] return f(); }
 
+#if __OPTIMIZE__ >= 1
 template 
 __attribute__((noinline, noclone, noipa))
 T g3() { [[gnu::musttail]] return f(); }
+#endif
 
 template 
 __attribute__((noinline, noclone, noipa))
@@ -28,12 +30,20 @@ class C
 public:
   C(double x) : x(x) {}
   ~C() { asm("":::"memory"); }
+  operator int() { return x; } 
 };
 
+template 
+__attribute__((noinline, noclone, noipa))
+T g5() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */
+
 int main()
 {
   g1();
   g2();
+#if __OPTIMIZE__ >= 1
   g3();
+#endif
   g4();
+  g5();
 }



Re: [PATCH v9 08/10] Add tests for C/C++ musttail attributes

2024-07-16 Thread Andi Kleen
On Tue, Jul 16, 2024 at 12:52:31PM -0700, Andi Kleen wrote:
> On Tue, Jul 16, 2024 at 02:51:13PM -0400, Jason Merrill wrote:
> > On 7/16/24 12:18 PM, Andi Kleen wrote:
> > > On Tue, Jul 16, 2024 at 11:17:14AM -0400, Jason Merrill wrote:
> > > > On 7/16/24 11:15 AM, Andi Kleen wrote:
> > > > > > In the adjusted test it looks like the types of f and g match,
> > > > > > so I wouldn't expect an error.
> > > > > 
> > > > > Good point! Missing the forest for the trees.
> > > > > 
> > > > > Anyways are the C++ patches ok with this change?
> > > > 
> > > > I'm still looking for a test which does error because the types are
> > > > different.
> > > 
> > > Like this?
> > 
> > Where the called function returns C and the calling function does not.
> 
> In this case the attribute seems to get lost and it succeeds.

This somewhat hackish patch fixes it here, to handle the case
of a TARGET_EXPR where the CALL_EXPR is in the cleanup. extract_call
bails on that.

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 3b914089a6e2..8753aa51da52 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -21124,6 +21124,10 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
  CALL_EXPR_REVERSE_ARGS (call) = rev;
  if (TREE_CODE (call) == CALL_EXPR)
CALL_EXPR_MUST_TAIL_CALL (call) = mtc;
+ else if (mtc
+  && TREE_CODE (ret) == TARGET_EXPR
+  && TREE_CODE (TARGET_EXPR_CLEANUP (ret)) == CALL_EXPR)
+   CALL_EXPR_MUST_TAIL_CALL (TARGET_EXPR_CLEANUP (ret)) = mtc;
}
if (warning_suppressed_p (t, OPT_Wpessimizing_move))
  /* This also suppresses -Wredundant-move.  */


Re: [PATCH v9 08/10] Add tests for C/C++ musttail attributes

2024-07-16 Thread Andi Kleen
On Tue, Jul 16, 2024 at 02:51:13PM -0400, Jason Merrill wrote:
> On 7/16/24 12:18 PM, Andi Kleen wrote:
> > On Tue, Jul 16, 2024 at 11:17:14AM -0400, Jason Merrill wrote:
> > > On 7/16/24 11:15 AM, Andi Kleen wrote:
> > > > > In the adjusted test it looks like the types of f and g match,
> > > > > so I wouldn't expect an error.
> > > > 
> > > > Good point! Missing the forest for the trees.
> > > > 
> > > > Anyways are the C++ patches ok with this change?
> > > 
> > > I'm still looking for a test which does error because the types are
> > > different.
> > 
> > Like this?
> 
> Where the called function returns C and the calling function does not.

In this case the attribute seems to get lost and it succeeds.

diff --git a/gcc/testsuite/g++.dg/musttail10.C b/gcc/testsuite/g++.dg/musttail10.C
index e454a6238a06..39f0ec38253d 100644
--- a/gcc/testsuite/g++.dg/musttail10.C
+++ b/gcc/testsuite/g++.dg/musttail10.C
@@ -28,12 +28,18 @@ class C
 public:
   C(double x) : x(x) {}
   ~C() { asm("":::"memory"); }
+  operator int() { return x; } 
 };
 
+template 
+__attribute__((noinline, noclone, noipa))
+T g5() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */
+
 int main()
 {
   g1();
   g2();
   g3();
   g4();
+  g5();
 }



Re: [PATCH v9 08/10] Add tests for C/C++ musttail attributes

2024-07-16 Thread Andi Kleen
On Tue, Jul 16, 2024 at 11:17:14AM -0400, Jason Merrill wrote:
> On 7/16/24 11:15 AM, Andi Kleen wrote:
> > > In the adjusted test it looks like the types of f and g match,
> > > so I wouldn't expect an error.
> > 
> > Good point! Missing the forest for the trees.
> > 
> > Anyways are the C++ patches ok with this change?
> 
> I'm still looking for a test which does error because the types are
> different.

Like this?

-Andi


diff --git a/gcc/testsuite/g++.dg/musttail10.C b/gcc/testsuite/g++.dg/musttail10.C
index 6a8507784a14..e454a6238a06 100644
--- a/gcc/testsuite/g++.dg/musttail10.C
+++ b/gcc/testsuite/g++.dg/musttail10.C
@@ -4,7 +4,7 @@
 
 template  T f();
 
-double h() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */
+double g() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */
 
 template 
 __attribute__((noinline, noclone, noipa))
@@ -18,6 +18,10 @@ template 
 __attribute__((noinline, noclone, noipa))
 T g3() { [[gnu::musttail]] return f(); }
 
+template 
+__attribute__((noinline, noclone, noipa))
+T g4() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */
+
 class C
 {
   double x;
@@ -31,4 +35,5 @@ int main()
   g1();
   g2();
   g3();
+  g4();
 }


Re: [PATCH v9 08/10] Add tests for C/C++ musttail attributes

2024-07-16 Thread Andi Kleen
> In the adjusted test it looks like the types of f and g match, so I wouldn't
> expect an error.

Good point! Missing the forest for the trees.

Anyways are the C++ patches ok with this change?

Thanks,
-Andi


Re: [PATCH v9 08/10] Add tests for C/C++ musttail attributes

2024-07-15 Thread Andi Kleen
On Mon, Jul 15, 2024 at 06:57:57PM -0400, Jason Merrill wrote:
> On 7/8/24 12:56 PM, Andi Kleen wrote:
> > diff --git a/gcc/testsuite/g++.dg/musttail10.C b/gcc/testsuite/g++.dg/musttail10.C
> > new file mode 100644
> > index ..9b7043b8a306
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/musttail10.C
> > @@ -0,0 +1,34 @@
> > +/* { dg-do compile { target { tail_call } } } */
> > +/* { dg-options "-std=gnu++11" } */
> > +/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
> > +
> > +int f();
> > +
> > +double h() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */
> > +
> > +template 
> > +__attribute__((noinline, noclone, noipa))
> > +T g1() { [[gnu::musttail]] return f(); } /* { dg-error "target is not able" "" { target powerpc*-*-* } } */
> > +
> > +template 
> > +__attribute__((noinline, noclone, noipa))
> > +T g2() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */
> > +
> > +template 
> > +__attribute__((noinline, noclone, noipa))
> > +T g3() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */
> > +
> > +class C
> > +{
> > +  double x;
> > +public:
> > +  C(double x) : x(x) {}
> > +  ~C() { asm("":::"memory"); }
> > +};
> > +
> > +int main()
> > +{
> > +  g1();
> > +  g2();
> > +  g3();
> > +}
> 
> I had asked for this test to check the case where the function called with
> [[musttail]] returns a non-trivially-copyable class; the test now includes
> such a class, but all the [[musttail]] calls are still to a function that
> returns int.

Thanks Jason.

I fixed the test case, but now the musttail gets lost, no error for g2/g3.

That means the flag is still lost somewhere. Does something outside tsubst
need changes too?

Right now tsubst has only this:

@@ -21113,12 +21113,17 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
bool op = CALL_EXPR_OPERATOR_SYNTAX (t);
bool ord = CALL_EXPR_ORDERED_ARGS (t);
bool rev = CALL_EXPR_REVERSE_ARGS (t);
-   if (op || ord || rev)
+   bool mtc = false;
+   if (TREE_CODE (t) == CALL_EXPR)
+ mtc = CALL_EXPR_MUST_TAIL_CALL (t);
+   if (op || ord || rev || mtc)
  if (tree call = extract_call_expr (ret))
{
  CALL_EXPR_OPERATOR_SYNTAX (call) = op;
  CALL_EXPR_ORDERED_ARGS (call) = ord;
  CALL_EXPR_REVERSE_ARGS (call) = rev;
+ if (TREE_CODE (call) == CALL_EXPR)
+   CALL_EXPR_MUST_TAIL_CALL (call) = mtc;
}
if (warning_suppressed_p (t, OPT_Wpessimizing_move))
  /* This also suppresses -Wredundant-move.  */


Fixed test case:


template  T f();

double h() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */

template 
__attribute__((noinline, noclone, noipa))
T g1() { [[gnu::musttail]] return f(); } /* { dg-error "target is not able" "" { target powerpc*-*-* } } */

template 
__attribute__((noinline, noclone, noipa))
T g2() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */

template 
__attribute__((noinline, noclone, noipa))
T g3() { [[gnu::musttail]] return f(); } /* { dg-error "cannot tail-call" } */

class C
{
  double x;
public:
  C(double x) : x(x) {}
  ~C() { asm("":::"memory"); }
};

int main()
{
  g1();
  g2();
  g3();
}



Re: [PATCH v9 04/10] C++: Support clang compatible [[musttail]] (PR83324)

2024-07-13 Thread Andi Kleen


Updated version with common code for C/C++ extracted in c-family.
Other than that no changes.
Is this version ok to commit?

---


This patch implements a clang compatible [[musttail]] attribute for
returns.

musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.
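
A minimal sketch of that interpreter style follows. The opcode set, the
Insn layout, the handlers table and the MUSTTAIL macro are all
assumptions made for this example and are not taken from the patch. The
macro falls back to nothing on compilers that do not advertise the
attribute, in which case the calls are ordinary calls.

```cpp
#include <cassert>

#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(gnu::musttail)
#    define MUSTTAIL [[gnu::musttail]]
#  elif __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

enum Op : unsigned char { OP_ADD, OP_MUL, OP_HALT, OP_COUNT };

struct Insn {
  Op op;
  long arg;
};

// Every handler shares one signature, so each can tail-call any other.
using Handler = long (*)(const Insn *pc, long acc);

long op_add(const Insn *pc, long acc);
long op_mul(const Insn *pc, long acc);
long op_halt(const Insn *pc, long acc);

// Dispatch table indexed by opcode.
static const Handler handlers[OP_COUNT] = {op_add, op_mul, op_halt};

long op_add(const Insn *pc, long acc) {
  acc += pc->arg;
  ++pc;
  assert(pc->op < OP_COUNT);
  // The return is a direct call expression, as the attribute requires.
  MUSTTAIL return handlers[pc->op](pc, acc);
}

long op_mul(const Insn *pc, long acc) {
  acc *= pc->arg;
  ++pc;
  assert(pc->op < OP_COUNT);
  MUSTTAIL return handlers[pc->op](pc, acc);
}

long op_halt(const Insn *, long acc) { return acc; }

// Entry point: start executing at the first instruction.
long run(const Insn *program, long initial) {
  assert(program->op < OP_COUNT);
  return handlers[program->op](program, initial);
}
```

Each opcode lives in its own small function, so register allocation
works per handler rather than on one huge function, and a program of any
length should use constant stack as long as the tail calls are honored.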

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also detects clang::musttail.

Passes bootstrap and a full test suite run.

gcc/c-family/ChangeLog:

* c-attribs.cc (set_musttail_on_return): New function.
* c-common.h (set_musttail_on_return): Declare new function.

gcc/cp/ChangeLog:

PR c/83324
* parser.cc (cp_parser_statement): Handle musttail.
(cp_parser_jump_statement): Ditto.
* pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL.
---
 gcc/c-family/c-attribs.cc | 20 
 gcc/c-family/c-common.h   |  1 +
 gcc/cp/parser.cc  | 26 +++---
 gcc/cp/pt.cc  |  7 ++-
 4 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 5adc7b775eaf..685f212683f4 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -672,6 +672,26 @@ attribute_takes_identifier_p (const_tree attr_id)
 return targetm.attribute_takes_identifier_p (attr_id);
 }
 
+/* Set a musttail attribute MUSTTAIL_P on return expression RETVAL
+   at LOC.  */
+
+void
+set_musttail_on_return (tree retval, location_t loc, bool musttail_p)
+{
+  if (retval && musttail_p)
+{
+  tree t = retval;
+  if (TREE_CODE (t) == TARGET_EXPR)
+   t = TARGET_EXPR_INITIAL (t);
+  if (TREE_CODE (t) != CALL_EXPR)
+   error_at (loc, "cannot tail-call: return value must be a call");
+  else
+   CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+}
+  else if (musttail_p && !retval)
+error_at (loc, "cannot tail-call: return value must be a call");
+}
+
 /* Verify that argument value POS at position ARGNO to attribute NAME
applied to function FN (which is either a function declaration or function
type) refers to a function parameter at position POS and the expected type
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index e84c9c47513b..079c9dc5f08b 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1646,6 +1646,7 @@ extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
+extern void set_musttail_on_return (tree, location_t, bool);
 
 /* In c-format.cc.  */
 extern bool valid_format_string_type_p (tree);
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 31ae9c2fb54d..e2411ee7213c 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree cp_parser_perform_range_for_lookup
 static tree cp_parser_range_for_member_function
   (tree, tree);
 static tree cp_parser_jump_statement
-  (cp_parser *);
+  (cp_parser *, tree &);
 static void cp_parser_declaration_statement
   (cp_parser *);
 
@@ -12756,7 +12756,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
case RID_CO_RETURN:
case RID_GOTO:
  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
- statement = cp_parser_jump_statement (parser);
+ statement = cp_parser_jump_statement (parser, std_attrs);
  break;
 
  /* Objective-C++ exception-handling constructs.  */
@@ -14844,10 +14844,11 @@ cp_parser_init_statement (cp_parser *parser, tree *decl)
jump-statement:
  goto * expression ;
 
+   STD_ATTRS are the statement attributes. They can be modified.
Returns the new BREAK_STMT, CONTINUE_STMT, RETURN_EXPR, or GOTO_EXPR.  */
 
 static tree
-cp_parser_jump_statement (cp_parser* parser)
+cp_parser_jump_statement (cp_parser* parser, tree _attrs)
 {
   tree statement = error_mark_node;
   cp_token *token;
@@ -14924,6 

Re: [PATCH v9 05/10] C: Implement musttail attribute for returns

2024-07-13 Thread Andi Kleen


Here's an updated patch with your feedback addressed.
Is this version ok?

The common code is in the C++ patch.

---

Implement a C23, clang-compatible musttail attribute in the C parser,
similar to the earlier C++ implementation.

gcc/c/ChangeLog:

PR c/83324
* c-parser.cc (struct attr_state): Define with musttail_p.
(c_parser_statement_after_labels): Handle [[musttail]].
(c_parser_std_attribute): Ditto.
(c_parser_handle_musttail): Ditto.
(c_parser_compound_statement_nostart): Ditto.
(c_parser_all_labels): Ditto.
(c_parser_statement): Ditto.
* c-tree.h (c_finish_return): Add musttail_p flag.
* c-typeck.cc (c_finish_return): Handle musttail_p flag.
---
 gcc/c/c-parser.cc | 70 ++-
 gcc/c/c-tree.h|  2 +-
 gcc/c/c-typeck.cc |  7 +++--
 3 files changed, 63 insertions(+), 16 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 8c4e697a4e10..9cb4d5d932ad 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1621,6 +1621,12 @@ struct omp_for_parse_data {
   bool fail : 1;
 };
 
+struct attr_state
+{
+  /* True if we parsed a musttail attribute for return.  */
+  bool musttail_p;
+};
+
 static bool c_parser_nth_token_starts_std_attributes (c_parser *,
  unsigned int);
 static tree c_parser_std_attribute_specifier_sequence (c_parser *);
@@ -1665,7 +1671,7 @@ static location_t c_parser_compound_statement_nostart (c_parser *);
 static void c_parser_label (c_parser *, tree);
 static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
 static void c_parser_statement_after_labels (c_parser *, bool *,
-vec * = NULL);
+vec * = NULL, attr_state = {});
 static tree c_parser_c99_block_statement (c_parser *, bool *,
  location_t * = NULL);
 static void c_parser_if_statement (c_parser *, bool *, vec *);
@@ -6982,6 +6988,29 @@ c_parser_handle_directive_omp_attributes (tree ,
 }
 }
 
+/* Check if STD_ATTR contains a musttail attribute and remove if it
+   precedes a return.  PARSER is the parser and ATTR is the output
+   attr_state.  */
+
+static tree
+c_parser_handle_musttail (c_parser *parser, tree std_attrs, attr_state )
+{
+  if (c_parser_next_token_is_keyword (parser, RID_RETURN))
+{
+  if (lookup_attribute ("gnu", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+ attr.musttail_p = true;
+   }
+  if (lookup_attribute ("clang", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+ attr.musttail_p = true;
+   }
+}
+  return std_attrs;
+}
+
 /* Parse a compound statement except for the opening brace.  This is
used for parsing both compound statements and statement expressions
(which follow different paths to handling the opening).  */
@@ -6998,6 +7027,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
   bool in_omp_loop_block
 = omp_for_parse_state ? omp_for_parse_state->want_nested_loop : false;
   tree sl = NULL_TREE;
+  attr_state a = {};
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
 {
@@ -7138,7 +7168,10 @@ c_parser_compound_statement_nostart (c_parser *parser)
= c_parser_nth_token_starts_std_attributes (parser, 1);
   tree std_attrs = NULL_TREE;
   if (have_std_attrs)
-   std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+   {
+ std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+ std_attrs = c_parser_handle_musttail (parser, std_attrs, a);
+   }
   if (c_parser_next_token_is_keyword (parser, RID_CASE)
  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
  || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7286,7 +7319,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
  last_stmt = true;
  mark_valid_location_for_stdc_pragma (false);
  if (!omp_for_parse_state)
-   c_parser_statement_after_labels (parser, NULL);
+   c_parser_statement_after_labels (parser, NULL, NULL, a);
  else
{
  /* In canonical loop nest form, nested loops can only appear
@@ -7328,15 +7361,20 @@ c_parser_compound_statement_nostart (c_parser *parser)
 /* Parse all consecutive labels, possibly preceded by standard
attributes.  In this context, a statement is required, not a
declaration, so attributes must be followed by a statement that is
-   not just a semicolon.  */
+   not just a semicolon.  Returns an attr_state.  */
 
-static void
+static attr_state
 c_parser_all_labels (c_parser *parser)
 {
+  attr_state attr = {};
   bool have_std_attrs;
   tree std_attrs = NULL;
   if ((have_std_attrs = c_parser_nth_token_starts_std_attributes 

[PATCH v9 10/10] Mark expand musttail error messages for translation

2024-07-08 Thread Andi Kleen
The musttail error messages are reported to the user, so they must be
marked for translation.
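
A small sketch of why the marking is needed, under my own assumptions: the
diagnostic uses a fixed format string and splices the reason in through %s,
so the reason string has to go through the message catalogue on its own.
lookup_translation is a made-up stand-in for a gettext lookup with a
one-entry catalogue, and format_tail_call_error is a hypothetical helper,
not a GCC function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a gettext lookup: a one-entry catalogue with
   an obviously fake "translation".  */
static const char *
lookup_translation (const char *msgid)
{
  if (strcmp (msgid, "callee returns twice") == 0)
    return "[xx] callee returns twice";
  return msgid;  /* No entry: fall back to the original text.  */
}

/* Same spelling as the marker used in the patch, bound to the stub.  */
#define _(msgid) lookup_translation (msgid)

/* Mirrors the shape of the diagnostic: the format is fixed and the reason
   is inserted verbatim, so an unmarked reason stays untranslated.  */
static int
format_tail_call_error (char *buf, size_t size, const char *reason)
{
  return snprintf (buf, size, "cannot tail-call: %s", reason);
}
```

Passing _("callee returns twice") yields the catalogue entry, while passing
the bare literal leaves the English text in the output, which is the
situation the patch addresses.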

gcc/ChangeLog:

PR83324
* calls.cc (initialize_argument_information): Mark messages
for translation.
(can_implement_as_sibling_call_p): Ditto.
(expand_call): Ditto.
---
 gcc/calls.cc | 56 ++--
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/gcc/calls.cc b/gcc/calls.cc
index 883eb9971257..f28c58217fdf 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -1420,9 +1420,9 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
{
  *may_tailcall = false;
  maybe_complain_about_tail_call (exp,
- "a callee-copied argument is"
- " stored in the current"
- " function's frame");
+ _("a callee-copied argument is"
+   " stored in the current"
+   " function's frame"));
}
 
  args[i].tree_value = build_fold_addr_expr_loc (loc,
@@ -1489,8 +1489,8 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
  type = TREE_TYPE (args[i].tree_value);
  *may_tailcall = false;
  maybe_complain_about_tail_call (exp,
- "argument must be passed"
- " by copying");
+ _("argument must be passed"
+   " by copying"));
}
  arg.pass_by_reference = true;
}
@@ -2508,8 +2508,8 @@ can_implement_as_sibling_call_p (tree exp,
 {
   maybe_complain_about_tail_call
(exp,
-"machine description does not have"
-" a sibcall_epilogue instruction pattern");
+_("machine description does not have"
+  " a sibcall_epilogue instruction pattern"));
   return false;
 }
 
@@ -2519,7 +2519,7 @@ can_implement_as_sibling_call_p (tree exp,
  sibling calls will return a structure.  */
   if (structure_value_addr != NULL_RTX)
 {
-  maybe_complain_about_tail_call (exp, "callee returns a structure");
+  maybe_complain_about_tail_call (exp, _("callee returns a structure"));
   return false;
 }
 
@@ -2528,8 +2528,8 @@ can_implement_as_sibling_call_p (tree exp,
   if (!targetm.function_ok_for_sibcall (fndecl, exp))
 {
   maybe_complain_about_tail_call (exp,
- "target is not able to optimize the"
- " call into a sibling call");
+ _("target is not able to optimize the"
+   " call into a sibling call"));
   return false;
 }
 
@@ -2537,18 +2537,18 @@ can_implement_as_sibling_call_p (tree exp,
  optimized.  */
   if (flags & ECF_RETURNS_TWICE)
 {
-  maybe_complain_about_tail_call (exp, "callee returns twice");
+  maybe_complain_about_tail_call (exp, _("callee returns twice"));
   return false;
 }
   if (flags & ECF_NORETURN)
 {
-  maybe_complain_about_tail_call (exp, "callee does not return");
+  maybe_complain_about_tail_call (exp, _("callee does not return"));
   return false;
 }
 
   if (TYPE_VOLATILE (TREE_TYPE (TREE_TYPE (addr))))
 {
-  maybe_complain_about_tail_call (exp, "volatile function type");
+  maybe_complain_about_tail_call (exp, _("volatile function type"));
   return false;
 }
 
@@ -2567,7 +2567,7 @@ can_implement_as_sibling_call_p (tree exp,
  the argument areas are shared.  */
   if (fndecl && decl_function_context (fndecl) == current_function_decl)
 {
-  maybe_complain_about_tail_call (exp, "nested function");
+  maybe_complain_about_tail_call (exp, _("nested function"));
   return false;
 }
 
@@ -2579,8 +2579,8 @@ can_implement_as_sibling_call_p (tree exp,
crtl->args.size - crtl->args.pretend_args_size))
 {
   maybe_complain_about_tail_call (exp,
- "callee required more stack slots"
- " than the caller");
+ _("callee required more stack slots"
+   " than the caller"));
   return false;
 }
 
@@ -2594,15 +2594,15 @@ can_implement_as_sibling_call_p (tree exp,
crtl->args.size)))
 {
   maybe_complain_about_tail_call (exp,
- "inconsistent number of"
- " popped arguments");
+ _("inconsistent number of"
+ 

[PATCH v9 05/10] C: Implement musttail attribute for returns

2024-07-08 Thread Andi Kleen
Implement a C23, clang-compatible musttail attribute in the C parser,
similar to the earlier C++ implementation.

PR83324

gcc/c/ChangeLog:

* c-parser.cc (struct attr_state): Define with musttail_p.
(c_parser_statement_after_labels): Handle [[musttail]].
(c_parser_std_attribute): Ditto.
(c_parser_handle_musttail): Ditto.
(c_parser_compound_statement_nostart): Ditto.
(c_parser_all_labels): Ditto.
(c_parser_statement): Ditto.
* c-tree.h (c_finish_return): Add musttail_p flag.
* c-typeck.cc (c_finish_return): Handle musttail_p flag.
---
 gcc/c/c-parser.cc | 59 +--
 gcc/c/c-tree.h|  2 +-
 gcc/c/c-typeck.cc | 15 ++--
 3 files changed, 61 insertions(+), 15 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 8c4e697a4e10..ce1c2c2be835 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1621,6 +1621,11 @@ struct omp_for_parse_data {
   bool fail : 1;
 };
 
+struct attr_state
+{
+  bool musttail_p; // parsed a musttail for return
+};
+
 static bool c_parser_nth_token_starts_std_attributes (c_parser *,
  unsigned int);
 static tree c_parser_std_attribute_specifier_sequence (c_parser *);
@@ -1665,7 +1670,7 @@ static location_t c_parser_compound_statement_nostart (c_parser *);
 static void c_parser_label (c_parser *, tree);
 static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
 static void c_parser_statement_after_labels (c_parser *, bool *,
-vec * = NULL);
+vec * = NULL, attr_state = {});
 static tree c_parser_c99_block_statement (c_parser *, bool *,
  location_t * = NULL);
 static void c_parser_if_statement (c_parser *, bool *, vec *);
@@ -6982,6 +6987,28 @@ c_parser_handle_directive_omp_attributes (tree ,
 }
 }
 
+/* Check if STD_ATTR contains a musttail attribute and handle it
+   PARSER is the parser and A is the output attr_state.  */
+
+static tree
+c_parser_handle_musttail (c_parser *parser, tree std_attrs, attr_state )
+{
+  if (c_parser_next_token_is_keyword (parser, RID_RETURN))
+{
+  if (lookup_attribute ("gnu", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+ a.musttail_p = true;
+   }
+  if (lookup_attribute ("clang", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+ a.musttail_p = true;
+   }
+}
+  return std_attrs;
+}
+
 /* Parse a compound statement except for the opening brace.  This is
used for parsing both compound statements and statement expressions
(which follow different paths to handling the opening).  */
@@ -6998,6 +7025,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
   bool in_omp_loop_block
 = omp_for_parse_state ? omp_for_parse_state->want_nested_loop : false;
   tree sl = NULL_TREE;
+  attr_state a = {};
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
 {
@@ -7138,7 +7166,10 @@ c_parser_compound_statement_nostart (c_parser *parser)
= c_parser_nth_token_starts_std_attributes (parser, 1);
   tree std_attrs = NULL_TREE;
   if (have_std_attrs)
-   std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+   {
+ std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+ std_attrs = c_parser_handle_musttail (parser, std_attrs, a);
+   }
   if (c_parser_next_token_is_keyword (parser, RID_CASE)
  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
  || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7286,7 +7317,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
  last_stmt = true;
  mark_valid_location_for_stdc_pragma (false);
  if (!omp_for_parse_state)
-   c_parser_statement_after_labels (parser, NULL);
+   c_parser_statement_after_labels (parser, NULL, NULL, a);
  else
{
  /* In canonical loop nest form, nested loops can only appear
@@ -7328,15 +7359,18 @@ c_parser_compound_statement_nostart (c_parser *parser)
 /* Parse all consecutive labels, possibly preceded by standard
attributes.  In this context, a statement is required, not a
declaration, so attributes must be followed by a statement that is
-   not just a semicolon.  */
+   not just a semicolon.  Returns an attr_state.  */
 
-static void
+static attr_state
 c_parser_all_labels (c_parser *parser)
 {
+  attr_state a = {};
   bool have_std_attrs;
   tree std_attrs = NULL;
   if ((have_std_attrs = c_parser_nth_token_starts_std_attributes (parser, 1)))
-std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+std_attrs = c_parser_handle_musttail (parser,
+   

[PATCH v9 08/10] Add tests for C/C++ musttail attributes

2024-07-08 Thread Andi Kleen
Some are adapted from the existing C musttail plugin tests.

gcc/testsuite/ChangeLog:

* c-c++-common/musttail1.c: New test.
* c-c++-common/musttail2.c: New test.
* c-c++-common/musttail3.c: New test.
* c-c++-common/musttail4.c: New test.
* c-c++-common/musttail5.c: New test.
* c-c++-common/musttail7.c: New test.
* c-c++-common/musttail8.c: New test.
* g++.dg/musttail6.C: New test.
* g++.dg/musttail9.C: New test.
* g++.dg/musttail10.C: New test.
---
 gcc/testsuite/c-c++-common/musttail1.c | 14 ++
 gcc/testsuite/c-c++-common/musttail2.c | 33 ++
 gcc/testsuite/c-c++-common/musttail3.c | 29 
 gcc/testsuite/c-c++-common/musttail4.c | 17 +++
 gcc/testsuite/c-c++-common/musttail5.c | 28 
 gcc/testsuite/c-c++-common/musttail7.c | 14 ++
 gcc/testsuite/c-c++-common/musttail8.c | 17 +++
 gcc/testsuite/g++.dg/musttail10.C  | 34 ++
 gcc/testsuite/g++.dg/musttail6.C   | 61 ++
 gcc/testsuite/g++.dg/musttail9.C   | 10 +
 10 files changed, 257 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/musttail1.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail2.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail3.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail4.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail5.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail7.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail8.c
 create mode 100644 gcc/testsuite/g++.dg/musttail10.C
 create mode 100644 gcc/testsuite/g++.dg/musttail6.C
 create mode 100644 gcc/testsuite/g++.dg/musttail9.C

diff --git a/gcc/testsuite/c-c++-common/musttail1.c b/gcc/testsuite/c-c++-common/musttail1.c
new file mode 100644
index ..74efcc2a0bc6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
+
+int __attribute__((noinline,noclone,noipa))
+callee (int i)
+{
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+caller (int i)
+{
+  [[gnu::musttail]] return callee (i + 1);
+}
diff --git a/gcc/testsuite/c-c++-common/musttail2.c b/gcc/testsuite/c-c++-common/musttail2.c
new file mode 100644
index ..86f2c3d77404
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[256]; int i; };
+
+int __attribute__((noinline,noclone,noipa))
+test_2_callee (int i, struct box b)
+{
+  if (b.field[0])
+return 5;
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+test_2_caller (int i)
+{
+  struct box b;
+  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot tail-call: " } */
+}
+
+extern void setjmp (void);
+void
+test_3 (void)
+{
+  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
+}
+
+extern float f7(void);
+
+int
+test_6 (void)
+{
+  [[gnu::musttail]] return f7(); /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail3.c b/gcc/testsuite/c-c++-common/musttail3.c
new file mode 100644
index ..ea9589c59ef2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail3.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+extern int foo2 (int x, ...);
+
+struct str
+{
+  int a, b;
+};
+
+struct str
+cstruct (int x)
+{
+  if (x < 10)
+[[clang::musttail]] return cstruct (x + 1);
+  return ((struct str){ x, 0 });
+}
+
+int
+foo (int x)
+{
+  if (x < 10)
+[[clang::musttail]] return foo2 (x, 29);
+  if (x < 100)
+{
+  int k = foo (x + 1);
+  [[clang::musttail]] return k;/* { dg-error "cannot tail-call: " } */
+}
+  return x;
+}
diff --git a/gcc/testsuite/c-c++-common/musttail4.c b/gcc/testsuite/c-c++-common/musttail4.c
new file mode 100644
index ..23f4b5e1cd68
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[64]; int i; };
+
+struct box __attribute__((noinline,noclone,noipa))
+returns_struct (int i)
+{
+  struct box b;
+  b.i = i * i;
+  return b;
+}
+
+int __attribute__((noinline,noclone))
+test_1 (int i)
+{
+  [[gnu::musttail]] return returns_struct (i * 5).i; /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail5.c b/gcc/testsuite/c-c++-common/musttail5.c
new file mode 100644
index ..234da0d3f2a9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23" { target c } } */
+/* { dg-options "-std=gnu++11" { target c++ } } */
+
+[[musttail]] int j; /* { dg-warning "attribute" } */
+__attribute__((musttail)) int k; /* { dg-warning "attribute" } */
+
+void foo(void)

[PATCH v9 09/10] Add documentation for musttail attribute

2024-07-08 Thread Andi Kleen
gcc/ChangeLog:

PR83324
* doc/extend.texi: Document [[musttail]]
---
 gcc/doc/extend.texi | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b2e41a581dd1..f83e643da19c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9921,7 +9921,7 @@ same manner as the @code{deprecated} attribute.
 @section Statement Attributes
 @cindex Statement Attributes
 
-GCC allows attributes to be set on null statements.  @xref{Attribute Syntax},
+GCC allows attributes to be set on statements.  @xref{Attribute Syntax},
 for details of the exact syntax for using attributes.  Other attributes are
 available for functions (@pxref{Function Attributes}), variables
 (@pxref{Variable Attributes}), labels (@pxref{Label Attributes}), enumerators
@@ -9978,6 +9978,25 @@ foo (int x, int y)
 @code{y} is not actually incremented and the compiler can but does not
 have to optimize it to just @code{return 42 + 42;}.
 
+@cindex @code{musttail} statement attribute
+@item musttail
+
+The @code{gnu::musttail} or @code{clang::musttail} attribute
+can be applied to a @code{return} statement with a return-value expression
+that is a function call.  It asserts that the call must be a tail call that
+does not allocate extra stack space, so it is safe to use tail recursion
+to implement long running loops.
+
+@smallexample
+[[gnu::musttail]] return foo();
+@end smallexample
+
+If the compiler cannot generate a @code{musttail} tail call it will report
+an error. On some targets tail calls may never be supported.
+Tail calls cannot reference locals in memory, which may affect
+builds without optimization when passing small structures, or passing
+or returning large structures. Enabling -O1 or -O2 can improve
+the success of tail calls.
 @end table
 
 @node Attribute Syntax
@@ -10101,7 +10120,9 @@ the constant expression, if present.
 
 @subsubheading Statement Attributes
 In GNU C, an attribute specifier list may appear as part of a null
-statement.  The attribute goes before the semicolon.
+statement. The attribute goes before the semicolon.
+Some attributes in new style syntax are also supported
+on non-null statements.
 
 @subsubheading Type Attributes
 
-- 
2.45.2



[PATCH v9 07/10] Give better error messages for musttail

2024-07-08 Thread Andi Kleen
When musttail is set, make tree-tailcall give an error message
when it cannot handle a call. This avoids the vague "other reasons"
error message later at expand time, when expand sees a musttail
call that was not marked as a tail call.

In various cases this requires delaying the error until
the call is discovered.

Also print more information on the failure to the dump file.

gcc/ChangeLog:

PR83324
* tree-tailcall.cc (maybe_error_musttail): New function.
(suitable_for_tail_opt_p): Report error reason.
(suitable_for_tail_call_opt_p): Report error reason.
(find_tail_calls): Accept basic blocks with abnormal edges.
Delay reporting of errors until the call is discovered.
Move top level suitability checks to here.
(tree_optimize_tail_calls_1): Remove top level checks.
---
 gcc/tree-tailcall.cc | 187 +++
 1 file changed, 154 insertions(+), 33 deletions(-)

diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index 43e8c25215cb..a68079d4f507 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -40,9 +40,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-eh.h"
 #include "dbgcnt.h"
 #include "cfgloop.h"
+#include "intl.h"
 #include "common/common-target.h"
 #include "ipa-utils.h"
 #include "tree-ssa-live.h"
+#include "diagnostic-core.h"
 
 /* The file implements the tail recursion elimination.  It is also used to
analyze the tail calls in general, passing the results to the rtl level
@@ -131,14 +133,20 @@ static tree m_acc, a_acc;
 
 static bitmap tailr_arg_needs_copy;
 
+static void maybe_error_musttail (gcall *call, const char *err);
+
 /* Returns false when the function is not suitable for tail call optimization
-   from some reason (e.g. if it takes variable number of arguments).  */
+   from some reason (e.g. if it takes variable number of arguments). CALL
+   is call to report for.  */
 
 static bool
-suitable_for_tail_opt_p (void)
+suitable_for_tail_opt_p (gcall *call)
 {
   if (cfun->stdarg)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses stdargs"));
+  return false;
+}
 
   return true;
 }
@@ -146,35 +154,47 @@ suitable_for_tail_opt_p (void)
 /* Returns false when the function is not suitable for tail call optimization
for some reason (e.g. if it takes variable number of arguments).
This test must pass in addition to suitable_for_tail_opt_p in order to make
-   tail call discovery happen.  */
+   tail call discovery happen. CALL is call to report error for.  */
 
 static bool
-suitable_for_tail_call_opt_p (void)
+suitable_for_tail_call_opt_p (gcall *call)
 {
   tree param;
 
   /* alloca (until we have stack slot life analysis) inhibits
  sibling call optimizations, but not tail recursion.  */
   if (cfun->calls_alloca)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses alloca"));
+  return false;
+}
 
   /* If we are using sjlj exceptions, we may need to add a call to
  _Unwind_SjLj_Unregister at exit of the function.  Which means
  that we cannot do any sibcall transformations.  */
   if (targetm_common.except_unwind_info (_options) == UI_SJLJ
   && current_function_has_exception_handlers ())
-return false;
+{
+  maybe_error_musttail (call, _("caller uses sjlj exceptions"));
+  return false;
+}
 
   /* Any function that calls setjmp might have longjmp called from
  any called function.  ??? We really should represent this
  properly in the CFG so that this needn't be special cased.  */
   if (cfun->calls_setjmp)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses setjmp"));
+  return false;
+}
 
   /* Various targets don't handle tail calls correctly in functions
  that call __builtin_eh_return.  */
   if (cfun->calls_eh_return)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses __builtin_eh_return"));
+  return false;
+}
 
   /* ??? It is OK if the argument of a function is taken in some cases,
  but not in all cases.  See PR15387 and PR19616.  Revisit for 4.1.  */
@@ -182,7 +202,10 @@ suitable_for_tail_call_opt_p (void)
param;
param = DECL_CHAIN (param))
 if (TREE_ADDRESSABLE (param))
-  return false;
+  {
+   maybe_error_musttail (call, _("address of caller arguments taken"));
+   return false;
+  }
 
   return true;
 }
@@ -402,16 +425,42 @@ propagate_through_phis (tree var, edge e)
   return var;
 }
 
+/* Report an error for failing to tail convert must call CALL
+   with error message ERR. Also clear the flag to prevent further
+   errors.  */
+
+static void
+maybe_error_musttail (gcall *call, const char *err)
+{
+  if (gimple_call_must_tail_p (call))
+{
+  error_at (call->location, "cannot tail-call: %s", err);
+  /* Avoid another error. ??? If there are multiple reasons why tail
+calls fail it might be useful to report 

[PATCH v9 01/10] Improve must tail in RTL backend

2024-07-08 Thread Andi Kleen
- Give error messages for all causes of non-sibling-call generation.
- When giving an error message, clear the musttail flag to avoid ICEs.
- Error out when tree-tailcall failed to mark a must-tail call as a
sibcall. In this case expand doesn't know the true reason and only gives
a vague message.
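
The "report, then clear the flag" behaviour in the second item can be
sketched roughly as below. This is an illustration only: struct call_site,
errors_reported and complain_about_tail_call are invented names, and the
sketch models only the suppression of repeated errors, not the ICE that
motivated the change.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a call expression carrying a must-tail flag.  */
struct call_site
{
  const char *callee;
  bool must_tail;
};

/* Counts diagnostics so the effect of clearing the flag is observable.  */
static int errors_reported;

/* Complain only for must-tail calls, then clear the flag so that later
   checks on the same call stay silent.  */
static void
complain_about_tail_call (struct call_site *call, const char *reason)
{
  if (!call->must_tail)
    return;
  fprintf (stderr, "%s: cannot tail-call: %s\n", call->callee, reason);
  errors_reported++;
  call->must_tail = false;
}
```

Calling the helper twice on the same call site reports only the first
reason; the second call sees a cleared flag and returns early.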

PR83324

gcc/ChangeLog:

* calls.cc (maybe_complain_about_tail_call): Clear must tail
flag on error.
(expand_call): Give error messages for all musttail failures.
---
 gcc/calls.cc | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/gcc/calls.cc b/gcc/calls.cc
index 21d78f9779fe..883eb9971257 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -1249,6 +1249,7 @@ maybe_complain_about_tail_call (tree call_expr, const char *reason)
 return;
 
   error_at (EXPR_LOCATION (call_expr), "cannot tail-call: %s", reason);
+  CALL_EXPR_MUST_TAIL_CALL (call_expr) = 0;
 }
 
 /* Fill in ARGS_SIZE and ARGS array based on the parameters found in
@@ -2650,7 +2651,13 @@ expand_call (tree exp, rtx target, int ignore)
   /* The type of the function being called.  */
   tree fntype;
   bool try_tail_call = CALL_EXPR_TAILCALL (exp);
-  bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp);
+  /* tree-tailcall decided not to do tail calls. Error for the musttail case,
+ unfortunately we don't know the reason so it's fairly vague.
+ When tree-tailcall reported an error it already cleared the flag,
+ so this shouldn't really happen unless the
+ musttail pass gave up walking before finding the call.  */
+  if (!try_tail_call)
+  maybe_complain_about_tail_call (exp, "other reasons");
   int pass;
 
   /* Register in which non-BLKmode value will be returned,
@@ -3022,10 +3029,21 @@ expand_call (tree exp, rtx target, int ignore)
  pushed these optimizations into -O2.  Don't try if we're already
  expanding a call, as that means we're an argument.  Don't try if
  there's cleanups, as we know there's code to follow the call.  */
-  if (currently_expanding_call++ != 0
-  || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp))
-  || args_size.var
-  || dbg_cnt (tail_call) == false)
+  if (currently_expanding_call++ != 0)
+{
+  maybe_complain_about_tail_call (exp, "inside another call");
+  try_tail_call = 0;
+}
+  if (!flag_optimize_sibling_calls
+   && !CALL_FROM_THUNK_P (exp)
+   && !CALL_EXPR_MUST_TAIL_CALL (exp))
+try_tail_call = 0;
+  if (args_size.var)
+{
+  maybe_complain_about_tail_call (exp, "variable size arguments");
+  try_tail_call = 0;
+}
+  if (dbg_cnt (tail_call) == false)
 try_tail_call = 0;
 
   /* Workaround buggy C/C++ wrappers around Fortran routines with
@@ -3046,13 +3064,15 @@ expand_call (tree exp, rtx target, int ignore)
if (MEM_P (*iter))
  {
try_tail_call = 0;
+   maybe_complain_about_tail_call (exp,
+   "hidden string length argument passed on stack");
break;
  }
}
 
   /* If the user has marked the function as requiring tail-call
  optimization, attempt it.  */
-  if (must_tail_call)
+  if (CALL_EXPR_MUST_TAIL_CALL (exp))
 try_tail_call = 1;
 
   /*  Rest of purposes for tail call optimizations to fail.  */
-- 
2.45.2



[PATCH v9 06/10] Enable musttail tail conversion even when not optimizing

2024-07-08 Thread Andi Kleen
Enable the tail call optimization for non-optimizing builds,
but in this case only check calls that have the musttail attribute set.
This makes musttail work without optimization.

This is done with a new late musttail pass that is only active when
not optimizing. The new pass relies on tree-cfg to discover musttails.
This avoids a ~0.8% compiler run time penalty at -O0.
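
The cost argument can be pictured with a simplified model. Everything below
is hypothetical (struct call, struct function_info and run_musttail_pass are
not GCC names): a per-function flag is recorded once while the CFG is built,
and the late pass returns immediately for functions that never use musttail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a call statement.  */
struct call
{
  bool must_tail;
};

/* Hypothetical model of the per-function state.  */
struct function_info
{
  bool has_musttail;          /* recorded once while the CFG is built */
  const struct call *calls;
  size_t n_calls;
};

/* Returns how many calls the late pass would have to analyse.  */
static size_t
run_musttail_pass (const struct function_info *fn, bool optimizing)
{
  if (optimizing)
    return 0;                 /* the regular tail-call pass covers this */
  if (!fn->has_musttail)
    return 0;                 /* cheap early exit for most functions */

  size_t examined = 0;
  for (size_t i = 0; i < fn->n_calls; i++)
    if (fn->calls[i].must_tail)
      examined++;
  return examined;
}
```

In this model a function without any musttail call costs one flag test at
-O0, which is the kind of saving the run time figure above refers to.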

gcc/ChangeLog:

PR83324
* function.h (struct function): Add has_musttail.
* lto-streamer-in.cc (input_struct_function_base): Stream
has_musttail.
* lto-streamer-out.cc (output_struct_function_base): Ditto.
* passes.def (pass_musttail): Add.
* tree-cfg.cc (notice_special_calls): Record has_musttail.
(clear_special_calls): Clear has_musttail.
* tree-pass.h (make_pass_musttail): Add.
* tree-tailcall.cc (find_tail_calls): Handle only_musttail
  argument.
(tree_optimize_tail_calls_1): Pass on only_musttail.
(execute_tail_calls): Pass only_musttail as false.
(class pass_musttail): Add.
(make_pass_musttail): Add.
---
 gcc/function.h  |  3 ++
 gcc/lto-streamer-in.cc  |  1 +
 gcc/lto-streamer-out.cc |  1 +
 gcc/passes.def  |  1 +
 gcc/tree-cfg.cc |  3 ++
 gcc/tree-pass.h |  1 +
 gcc/tree-tailcall.cc| 68 +++--
 7 files changed, 69 insertions(+), 9 deletions(-)

diff --git a/gcc/function.h b/gcc/function.h
index c0ba6cc1531a..fbeadeaf4104 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -430,6 +430,9 @@ struct GTY(()) function {
   /* Nonzero when the tail call has been identified.  */
   unsigned int tail_call_marked : 1;
 
+  /* Has musttail marked calls.  */
+  unsigned int has_musttail : 1;
+
   /* Nonzero if the current function contains a #pragma GCC unroll.  */
   unsigned int has_unroll : 1;
 
diff --git a/gcc/lto-streamer-in.cc b/gcc/lto-streamer-in.cc
index ad0ca24007a0..2e592be80823 100644
--- a/gcc/lto-streamer-in.cc
+++ b/gcc/lto-streamer-in.cc
@@ -1325,6 +1325,7 @@ input_struct_function_base (struct function *fn, class data_in *data_in,
   fn->calls_eh_return = bp_unpack_value (, 1);
   fn->has_force_vectorize_loops = bp_unpack_value (, 1);
   fn->has_simduid_loops = bp_unpack_value (, 1);
+  fn->has_musttail = bp_unpack_value (, 1);
   fn->assume_function = bp_unpack_value (, 1);
   fn->va_list_fpr_size = bp_unpack_value (, 8);
   fn->va_list_gpr_size = bp_unpack_value (, 8);
diff --git a/gcc/lto-streamer-out.cc b/gcc/lto-streamer-out.cc
index d4f728094ed5..0be381abbd96 100644
--- a/gcc/lto-streamer-out.cc
+++ b/gcc/lto-streamer-out.cc
@@ -2290,6 +2290,7 @@ output_struct_function_base (struct output_block *ob, struct function *fn)
   bp_pack_value (, fn->calls_eh_return, 1);
   bp_pack_value (, fn->has_force_vectorize_loops, 1);
   bp_pack_value (, fn->has_simduid_loops, 1);
+  bp_pack_value (, fn->has_musttail, 1);
   bp_pack_value (, fn->assume_function, 1);
   bp_pack_value (, fn->va_list_fpr_size, 8);
   bp_pack_value (, fn->va_list_gpr_size, 8);
diff --git a/gcc/passes.def b/gcc/passes.def
index b8c21b1e4351..49ab89387552 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -444,6 +444,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_tsan_O0);
   NEXT_PASS (pass_sanopt);
   NEXT_PASS (pass_cleanup_eh);
+  NEXT_PASS (pass_musttail);
   NEXT_PASS (pass_lower_resx);
   NEXT_PASS (pass_nrv);
   NEXT_PASS (pass_gimple_isel);
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 7fb7b92966be..e6fd1294b958 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -2290,6 +2290,8 @@ notice_special_calls (gcall *call)
 cfun->calls_alloca = true;
   if (flags & ECF_RETURNS_TWICE)
 cfun->calls_setjmp = true;
+  if (gimple_call_must_tail_p (call))
+cfun->has_musttail = true;
 }
 
 
@@ -2301,6 +2303,7 @@ clear_special_calls (void)
 {
   cfun->calls_alloca = false;
   cfun->calls_setjmp = false;
+  cfun->has_musttail = false;
 }
 
 /* Remove PHI nodes associated with basic block BB and all edges out of BB.  */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 9843d189d27d..8093b363bf14 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -368,6 +368,7 @@ extern gimple_opt_pass *make_pass_sra (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sra_early (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tail_recursion (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tail_calls (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_musttail (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_fix_loops (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_loop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_no_loop (gcc::context *ctxt);
diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index e9f7f8a12b3a..43e8c25215cb 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -408,10 +408,10 @@ static live_vars_map *live_vars;
 static vec live_vars_vec;
 
 /* 

[PATCH v9 04/10] C++: Support clang compatible [[musttail]] (PR83324)

2024-07-08 Thread Andi Kleen
This patch implements a clang compatible [[musttail]] attribute for
returns.

musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.
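
As a rough illustration of the interpreter pattern described above (not
taken from the patch), each opcode can be a small function that tail-calls
the next handler. The names Vm, Op, dispatch, op_inc, op_dec, op_halt and
run, as well as the MUSTTAIL macro, are invented for this sketch; the macro
expands to nothing on compilers that do not know the attribute, in which
case the calls are ordinary calls.

```cpp
#include <cassert>

// Portability shim (an assumption of this sketch): use the attribute where
// the compiler advertises it, otherwise fall back to a plain return.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  elif __has_cpp_attribute(gnu::musttail)
#    define MUSTTAIL [[gnu::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

enum Op : unsigned char { OP_INC, OP_DEC, OP_HALT };

struct Vm { long acc; };

// Every handler shares one signature so that each call can be a sibling call.
using Handler = long (*)(Vm *vm, const unsigned char *pc);

static long op_inc (Vm *vm, const unsigned char *pc);
static long op_dec (Vm *vm, const unsigned char *pc);
static long op_halt (Vm *vm, const unsigned char *pc);

static const Handler handlers[] = { op_inc, op_dec, op_halt };

// Look up the handler for the current opcode and jump to it.
static long dispatch (Vm *vm, const unsigned char *pc)
{
  assert (*pc <= OP_HALT);
  MUSTTAIL return handlers[*pc] (vm, pc);
}

static long op_inc (Vm *vm, const unsigned char *pc)
{
  vm->acc += 1;
  MUSTTAIL return dispatch (vm, pc + 1);
}

static long op_dec (Vm *vm, const unsigned char *pc)
{
  vm->acc -= 1;
  MUSTTAIL return dispatch (vm, pc + 1);
}

// Terminates the chain: returns the accumulator instead of dispatching.
static long op_halt (Vm *vm, const unsigned char *)
{
  return vm->acc;
}

// Entry point: an ordinary (non-tail) call into the handler chain.
long run (const unsigned char *program)
{
  Vm vm{0};
  return dispatch (&vm, program);
}
```

With the attribute in effect the chain of handlers runs in constant stack
space regardless of program length; without it the sketch still computes
the same result but the stack grows with every executed opcode.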

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also accepts clang::musttail.

Passes bootstrap and the full test suite.

gcc/cp/ChangeLog:

PR c/83324
* parser.cc (cp_parser_statement): Handle musttail.
(cp_parser_jump_statement): Ditto.
* pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL.
---
 gcc/cp/parser.cc | 34 +++---
 gcc/cp/pt.cc |  7 ++-
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 31ae9c2fb54d..c8ed88f7a91b 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree cp_parser_perform_range_for_lookup
 static tree cp_parser_range_for_member_function
   (tree, tree);
 static tree cp_parser_jump_statement
-  (cp_parser *);
+  (cp_parser *, tree &);
 static void cp_parser_declaration_statement
   (cp_parser *);
 
@@ -12756,7 +12756,7 @@ cp_parser_statement (cp_parser* parser, tree 
in_statement_expr,
case RID_CO_RETURN:
case RID_GOTO:
  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
- statement = cp_parser_jump_statement (parser);
+ statement = cp_parser_jump_statement (parser, std_attrs);
  break;
 
  /* Objective-C++ exception-handling constructs.  */
@@ -14844,10 +14844,11 @@ cp_parser_init_statement (cp_parser *parser, tree 
*decl)
jump-statement:
  goto * expression ;
 
+   STD_ATTRS are the statement attributes. They can be modified.
Returns the new BREAK_STMT, CONTINUE_STMT, RETURN_EXPR, or GOTO_EXPR.  */
 
 static tree
-cp_parser_jump_statement (cp_parser* parser)
+cp_parser_jump_statement (cp_parser* parser, tree &std_attrs)
 {
   tree statement = error_mark_node;
   cp_token *token;
@@ -14924,6 +14925,33 @@ cp_parser_jump_statement (cp_parser* parser)
  /* If the next token is a `;', then there is no
 expression.  */
  expr = NULL_TREE;
+
+   if (keyword == RID_RETURN && expr)
+ {
+   bool musttail_p = false;
+   if (lookup_attribute ("gnu", "musttail", std_attrs))
+ {
+   musttail_p = true;
+   std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+ }
+   /* Support this for compatibility.  */
+   if (lookup_attribute ("clang", "musttail", std_attrs))
+ {
+   musttail_p = true;
+   std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+ }
+   if (musttail_p)
+ {
+   tree t = expr;
+   if (t && TREE_CODE (t) == TARGET_EXPR)
+ t = TARGET_EXPR_INITIAL (t);
+   if (t && TREE_CODE (t) != CALL_EXPR)
+ error_at (token->location, "cannot tail-call: return value must be a call");
+   else
+ CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+ }
+ }
+
/* Build the return-statement, check co-return first, since type
   deduction is not valid there.  */
if (keyword == RID_CO_RETURN)
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index d1316483e245..3b914089a6e2 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -21113,12 +21113,17 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
bool op = CALL_EXPR_OPERATOR_SYNTAX (t);
bool ord = CALL_EXPR_ORDERED_ARGS (t);
bool rev = CALL_EXPR_REVERSE_ARGS (t);
-   if (op || ord || rev)
+   bool mtc = false;
+   if (TREE_CODE (t) == CALL_EXPR)
+ mtc = CALL_EXPR_MUST_TAIL_CALL (t);
+   if (op || ord || rev || mtc)
  if (tree call = extract_call_expr (ret))
{
  CALL_EXPR_OPERATOR_SYNTAX (call) = op;
  CALL_EXPR_ORDERED_ARGS (call) = ord;
  

New musttail patchkit

2024-07-08 Thread Andi Kleen
This version addresses all the review feedback (Thanks everyone!)

It is getting close to the finish line. The only missing reviews now
are for the C frontend part (patch 5). Joseph and Marek, I would
appreciate it if you could take a look.

- Addressed Richie's feedback with various improvements
and better comments and commit messages.
- Squashed some tree-tailcall patches
- Fix some more test issues pointed out by the Linaro bot
[if there are other architectures with some but
not full tail call support like ARM the test cases
may need further adjustments to skip those]
- Some minor cleanups.

-Andi


[PATCH v9 03/10] Add a musttail generic attribute to the c-attribs table

2024-07-08 Thread Andi Kleen
The actual handling is directly in the parser since the
generic mechanism doesn't support statement attributes,
but this gives basic error checking/detection on the attribute.

gcc/c-family/ChangeLog:

PR83324
* c-attribs.cc (handle_musttail_attribute): Add.
* c-common.h (handle_musttail_attribute): Add.
---
 gcc/c-family/c-attribs.cc | 15 +++
 gcc/c-family/c-common.h   |  1 +
 2 files changed, 16 insertions(+)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index f9b229aba7fc..5adc7b775eaf 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -340,6 +340,8 @@ const struct attribute_spec c_common_gnu_attributes[] =
   { "common", 0, 0, true,  false, false, false,
  handle_common_attribute,
  attr_common_exclusions },
+  { "musttail",  0, 0, false, false, false,
+ false, handle_musttail_attribute, NULL },
   /* FIXME: logically, noreturn attributes should be listed as
  "false, true, true" and apply to function types.  But implementing this
  would require all the places in the compiler that use TREE_THIS_VOLATILE
@@ -1222,6 +1224,19 @@ handle_common_attribute (tree *node, tree name, tree 
ARG_UNUSED (args),
   return NULL_TREE;
 }
 
+/* Handle a "musttail" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+tree
+handle_musttail_attribute (tree ARG_UNUSED (*node), tree name, tree ARG_UNUSED 
(args),
+  int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  /* Currently only a statement attribute, handled directly in parser.  */
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+  return NULL_TREE;
+}
+
 /* Handle a "noreturn" attribute; arguments as in
struct attribute_spec.handler.  */
 
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 48c89b603bcd..e84c9c47513b 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1643,6 +1643,7 @@ extern tree find_tm_attribute (tree);
 extern const struct attribute_spec::exclusions attr_cold_hot_exclusions[];
 extern const struct attribute_spec::exclusions attr_noreturn_exclusions[];
 extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
+extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
 
-- 
2.45.2



[PATCH v9 02/10] Fix pro_and_epilogue for sibcalls at -O0 (PR115255)

2024-07-08 Thread Andi Kleen
Some of the cfg fixups in pro_and_epilogue for sibcalls were dependent on
"optimize". Make them check cfun->tail_call_marked instead to handle the
-O0 musttail case. This fixes the musttail test cases on arm targets.

gcc/ChangeLog:

PR target/115255
* function.cc (thread_prologue_and_epilogue_insns): Check
  cfun->tail_call_marked for sibcalls too.
(rest_of_handle_thread_prologue_and_epilogue): Ditto.
---
 gcc/function.cc | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/function.cc b/gcc/function.cc
index 4edd4da12474..a6f6de349420 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -2231,6 +2231,7 @@ use_register_for_decl (const_tree decl)
   /* We don't set DECL_IGNORED_P for the function_result_decl.  */
   if (optimize)
return true;
+  /* Needed for [[musttail]] which can operate even at -O0 */
   if (cfun->tail_call_marked)
return true;
   /* We don't set DECL_REGISTER for the function_result_decl.  */
@@ -6259,8 +6260,11 @@ thread_prologue_and_epilogue_insns (void)
 }
 
   /* Threading the prologue and epilogue changes the artificial refs in the
- entry and exit blocks, and may invalidate DF info for tail calls.  */
+ entry and exit blocks, and may invalidate DF info for tail calls.
+ This is also needed for [[musttail]] conversion even when not
+ optimizing.  */
   if (optimize
+  || cfun->tail_call_marked
   || flag_optimize_sibling_calls
   || flag_ipa_icf_functions
   || in_lto_p)
@@ -6557,7 +6561,7 @@ rest_of_handle_thread_prologue_and_epilogue (function 
*fun)
 {
   /* prepare_shrink_wrap is sensitive to the block structure of the control
  flow graph, so clean it up first.  */
-  if (optimize)
+  if (cfun->tail_call_marked || optimize)
 cleanup_cfg (0);
 
   /* On some machines, the prologue and epilogue code, or parts thereof,
-- 
2.45.2



Re: [PATCH v8 07/12] Enable musttail tail conversion even when not optimizing

2024-07-08 Thread Andi Kleen
On Mon, Jul 08, 2024 at 05:27:53PM +0200, Richard Biener wrote:
> 
> 
> > Am 08.07.2024 um 17:22 schrieb Andi Kleen :
> > 
> > On Mon, Jul 08, 2024 at 08:53:27AM +0200, Richard Biener wrote:
> >> Ah, I see.  So this pass is responsible for both -O0 and
> >> -fno-optimize-sibling-calls.
> >> But I'm quite sure the other pass doesn't run with -O0
> >> -foptimize-sibling-calls, does it?
> > 
> > It does run:
> > 
> > ./cc1 -O0 -fdump-passes -foptimize-sibling-calls t.c 2>&1 | grep tail
> > tree-tailr1   :  ON
> >  tree-tailr2  :  ON
> >  tree-tailc   :  ON
> 
> I would not trust -fdump-passes, IIRC that just executes the gate functions.

You're right, according to gdb it is not invoked.

So the musttail gate needs a || optimize == 0.
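
Combined with the has_musttail check discussed elsewhere in this thread,
the gate condition might look roughly like the predicate below. This is a
stand-alone model, not the GCC source: musttail_pass_gate is a hypothetical
name, and its plain parameters stand in for f->has_musttail,
flag_optimize_sibling_calls and optimize.

```cpp
#include <cassert>

// Stand-alone model of the gate discussed above; not the actual GCC code.
// The dedicated musttail pass should run only when the function contains a
// musttail call and the regular tail-call pass will not be executed, i.e.
// when sibling-call optimization is disabled or we are not optimizing.
bool musttail_pass_gate (bool has_musttail, bool optimize_sibling_calls,
                         int optimize)
{
  return has_musttail && (!optimize_sibling_calls || optimize == 0);
}
```

The -O0 -foptimize-sibling-calls combination is the case the extra
"optimize == 0" term is meant to cover.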

-Andi


Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-07-08 Thread Andi Kleen
> I have added a target hook for this in v4 of this patch. The hook
> receives all the information about the stores, the load, the estimated
> sequence cost and whether we expect to eliminate the load. With this
> information the target should be able to make an informed decision.
> 
> What you mention is also true for AArch64: some microbenchmarking I
> did shows that some cores efficiently handle 32bit->64bit store
> forwarding while others not, so creating a target hook is necessary
> for such cases.

Perhaps for the 32->64 case have a generic simple target flag. I presume it
will be common.

On x86 there are lots of other cases too and the details vary based on
the micro architecture. I wonder if there is an efficient way to encode
that in a table.

> This is still hard to tell. In some cases I have observed either
> improvement or regressions in benchmarks, which are highly susceptible
> to costing and the specific store-forwarding penalties of the CPU.
> I have seen cases where the store-forwarding instance is profitable to
> avoid but we get bad code generation due to other reasons (usually
> store_bit_field lowering not being good enough) and hence a
> regression.

I wonder if there could be some heuristic to avoid it for those cases.

> So I believe more time and testing is needed to really evaluate the
> speedups that can be achieved.

So for now it would be off by default?

-Andi


Re: [PATCH v8 08/12] Give better error messages for musttail

2024-07-08 Thread Andi Kleen
On Mon, Jul 08, 2024 at 09:06:21AM +0200, Richard Biener wrote:
> On Sat, Jul 6, 2024 at 8:45 PM Andi Kleen  wrote:
> >
> > > >if (!single_succ_p (bb))
> > > > -return;
> > > > +{
> > > > +  int num_eh, num_other;
> > > > +  bb_get_succ_edge_count (bb, num_eh, num_other);
> > > > +  /* Allow EH edges so that we can give a better
> > > > +error message later.  */
> > >
> > > Please instead use has_abnormal_or_eh_outgoing_edge_p (bb) instead
> >
> > That's not equivalent, need a num_other == 1 check too.
> 
> There can be at most one regular outgoing edge for a block with an
> outgoing EH or abnormal edge.

GIMPLE_CONDs cannot trigger EH?

> > Do you want me to move the function to a generic place?
> 
> Maybe you can use find_fallthru_edge () instead if you think
> has_abnormal_or_eh_outgoing_edge_p isn't good enough?  That will
> find the single_succ_edge when the BB isn't single_succ_p because
> of EH/abnormal edges.
> 
> I think both choices would be equivalent to your new function and its use.

Okay, will do the latter.
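
To make the edge argument concrete, here is a toy model (not GCC's CFG
API): a block's successor edges are a vector of flag words, and the helper
picks out the fall-through edge while skipping EH or abnormal edges. The
flag values and the names ToyEdge and toy_find_fallthru_edge are made up
for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical edge flags; the real EDGE_* values in GCC differ.
enum ToyEdgeFlags : unsigned
{
  TOY_FALLTHRU = 1u,
  TOY_ABNORMAL = 2u,
  TOY_EH = 4u
};

struct ToyEdge
{
  unsigned flags;
  int dest;   // index of the destination block
};

// Return the first successor edge marked fall-through, or nullptr if the
// block has none.  EH and abnormal edges are simply skipped, so a block
// with one regular edge plus EH edges still yields its regular successor.
const ToyEdge *toy_find_fallthru_edge (const std::vector<ToyEdge> &succs)
{
  for (const ToyEdge &e : succs)
    if (e.flags & TOY_FALLTHRU)
      return &e;
  return nullptr;
}
```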

> The comment above the check is a bit weird in how it talks about types, but
> "tail call must be same type" isn't very helpful and it isn't in any way 
> related
> to the actual check being performed.  "return slot" is supposed to be the
> storage used for return pointed to by the invisible reference parameter to
> space allocated by the caller.  Do you know a more C/C++ standard related
> naming for this?

I don't have a better name. Probably the right thing would be to use
whatever term the respective ABI uses, but that may not be the same for
every target. I used your suggestion.

-Andi


Re: [PATCH v8 09/12] Delay caller error reporting for musttail

2024-07-08 Thread Andi Kleen
> > Overall the logic in this pass is rather convoluted and
> > could deserve some cleanups and separation of concerns.
> > e.g. it would be better to separate tail calls and tail
> > recursion. But I'm not trying to rewrite the pass here.
> 
> Understood.  For a v9, can you squash the tree-tailcall.cc changes
> please?

I squashed all the tree-tailcall error changes. The new pass is still
a separate patch. I would prefer to keep it this way.

BTW I'm surprised you prefer that. Normally smaller patches are better
if bisecting is needed.

-Andi


Re: [PATCH v8 07/12] Enable musttail tail conversion even when not optimizing

2024-07-08 Thread Andi Kleen
On Mon, Jul 08, 2024 at 08:53:27AM +0200, Richard Biener wrote:
> Ah, I see.  So this pass is responsible for both -O0 and
> -fno-optimize-sibling-calls.
> But I'm quite sure the other pass doesn't run with -O0
> -foptimize-sibling-calls, does it?

It does run:

./cc1 -O0 -fdump-passes -foptimize-sibling-calls t.c 2>&1 | grep tail
 tree-tailr1   :  ON
  tree-tailr2  :  ON
  tree-tailc   :  ON

But I suspect without the earlier expand patch to adjust the cfg rebuild it may
ICE on some of the targets.

-Andi


Re: [PATCH v8 08/12] Give better error messages for musttail

2024-07-06 Thread Andi Kleen
> >if (!single_succ_p (bb))
> > -return;
> > +{
> > +  int num_eh, num_other;
> > +  bb_get_succ_edge_count (bb, num_eh, num_other);
> > +  /* Allow EH edges so that we can give a better
> > +error message later.  */
> 
> Please instead use has_abnormal_or_eh_outgoing_edge_p (bb) instead

That's not equivalent, need a num_other == 1 check too.

Do you want me to move the function to a generic place?

> to avoid adding another function like this.  Also only continue searching
> for a musttail call if cfun->has_musttail

Done (although I must say I liked the better dump messages even for non
tailcall)

> >if (gimple_references_memory_p (stmt)
> >   || gimple_has_volatile_ops (stmt))
> > -   return;
> > +   {
> > + bad_stmt = true;
> 
> break here when !cfun->has_musttail?

Done.

> >if (ass_var
> >&& !is_gimple_reg (ass_var)
> >&& !auto_var_in_fn_p (ass_var, cfun->decl))
> > -return;
> > +{
> > +  maybe_error_musttail (call, _("return value in memory"));
> > +  return;
> > +}
> > +
> > +  if (cfun->calls_setjmp)
> > +{
> > +  maybe_error_musttail (call, _("caller uses setjmp"));
> > +  return;
> > +}
> >
> >/* If the call might throw an exception that wouldn't propagate out of
> >   cfun, we can't transform to a tail or sibling call (82081).  */
> > -  if (stmt_could_throw_p (cfun, stmt)
> > -  && !stmt_can_throw_external (cfun, stmt))
> > +  if ((stmt_could_throw_p (cfun, stmt)
> > +   && !stmt_can_throw_external (cfun, stmt)) || !single_succ_p (bb))
> 
> This reports for the found stmt while above we reject any intermediate
> non-fallthru control flow.  I would suggest to, in the above BB check,
> record a gimple *last = last_stmt (bb) and if last == stmt report this reason
> but otherwise "control altering statement between call and return"?

Ok. I report "code between call and return" instead. I avoided the
"control altering" wording since "control" would imply control flow.

Also there is no last_stmt () or did I miss it? It couldn't be used
anyways because it still needs to skip the nops etc. But the backwards 
loop can easily discover it.

BTW I suspect some of the checks are redundant but it is hard to really
prove it, so I left everything in place.

> > +maybe_error_musttail (call,
> > + _("call may throw exception that does not 
> > propagate"));
> >  return;
> > +  }
> >
> >/* If the function returns a value, then at present, the tail call
> >   must return the same type of value.  There is conceptually a copy
> > @@ -524,7 +593,10 @@ find_tail_calls (basic_block bb, struct tailcall 
> > **ret, bool only_musttail)
> >if (result_decl
> >&& may_be_aliased (result_decl)
> >&& ref_maybe_used_by_stmt_p (call, result_decl, false))
> > -return;
> > +{
> > +  maybe_error_musttail (call, _("tail call must be same type"));
> 
> ?  "call uses the return slot"?
> 
> Otherwise looks OK.

Done. (I'm not sure what a return slot is myself, but maybe the users
can figure it out.)

-Andi


Re: [PATCH v8 09/12] Delay caller error reporting for musttail

2024-07-06 Thread Andi Kleen
On Fri, Jul 05, 2024 at 01:45:17PM +0200, Richard Biener wrote:
> On Sat, Jun 22, 2024 at 9:00 PM Andi Kleen  wrote:
> >
> > Move the error reporting for caller attributes to be
> > after the tail call discovery, so that we can give proper
> > error messages tagged to the calls.
> 
> Hmm.  This all gets a bit awkward.  I realize that early checking
> gets us less compile-time unnecessarily spent for searching for
> a tail call - but at least for the musttail case parsing constraints
> should put a practical limit on how far to look?

All the top level checks are for obscure situations, so it's unlikely
that it makes much difference for compile time either way.

> 
> So what I wonder is whether it would be better to separate
> searching for a (musttail) candidate separate from validation?
> 
> We could for example invoke find_tail_calls twice, once to
> find a musttail candidate (can there be multiple ones?) and once
> to validate and error?  Would that make the delaying less awkward?

There can be multiple musttails in a function, in theory
one for every return.
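
For example, a function along these lines has two musttail sites, one per
return. The function names are invented for this sketch, and MUSTTAIL is a
portability shim that expands to nothing where the attribute is not
available.

```cpp
#include <cassert>

// Portability shim (an assumption of this sketch).
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  elif __has_cpp_attribute(gnu::musttail)
#    define MUSTTAIL [[gnu::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

static int on_even (int x) { return x / 2; }
static int on_odd (int x) { return 3 * x + 1; }

// Two return statements, each carrying its own musttail request.
int step (int x)
{
  if (x % 2 == 0)
    MUSTTAIL return on_even (x);
  MUSTTAIL return on_odd (x);
}
```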

I'm not sure I see the awkward part? (other than perhaps
the not-quite-natural accumulation of opt_tailcalls). There
are a lot of checks before and after discovery. This just
moves them all to be after.

If the top level checks were done based on a discovered 
list you would need extra loops to walk the candidates 
later and error. It wouldn't be any simpler at least.

Overall the logic in this pass is rather convoluted and
could deserve some cleanups and separation of concerns.
e.g. it would be better to separate tail calls and tail
recursion. But I'm not trying to rewrite the pass here.

-Andi


Re: [PATCH v8 03/12] Add a musttail generic attribute to the c-attribs table

2024-07-06 Thread Andi Kleen
On Fri, Jul 05, 2024 at 12:44:47PM +0200, Richard Biener wrote:
> On Sat, Jun 22, 2024 at 8:57 PM Andi Kleen  wrote:
> >
> > It does nothing currently since statement attributes are handled
> > directly in the parser.
> 
> Is this needed at all?  a "'musttail' attribute ignored" diagnostic isn't
> much more helpful than "'foo' attribute directive ignored"?  Or does
> stmt attribute parsing rely on this table as well?

It avoids an extra check in the C/C++ parser. I will clarify the commit
message to say that.

-Andi


Re: [PATCH v8 07/12] Enable musttail tail conversion even when not optimizing

2024-07-06 Thread Andi Kleen
> > +class pass_musttail : public gimple_opt_pass
> > +{
> > +public:
> > +  pass_musttail (gcc::context *ctxt)
> > +: gimple_opt_pass (pass_data_musttail, ctxt)
> > +  {}
> > +
> > +  /* opt_pass methods: */
> > +  /* This pass is only used when not optimizing to make [[musttail]] still
> > + work.  */
> > +  bool gate (function *) final override { return 
> > !flag_optimize_sibling_calls; }
> 
> Shouldn't this check f->has_musttail only?  That is, I would expect
> -fno-optimize-sibling-calls to still tail-call [[musttail]]?  The comment says
> the pass only runs when not optimizing - so maybe you wanted to do
> return optimize == 0;?

When flag_optimize_sibling_calls is set the other tailcall pass will
take care of the musttails. The new pass is only needed when that one
doesn't run. So I think looking at that flag is correct.

But I did move the f->has_musttail check into the gate and clarified
the comment, because the pass is not specific to not optimizing.

Thanks,
-Andi


Re: [PATCH] x86: Update branch hint for Redwood Cove.

2024-07-02 Thread Andi Kleen
liuhongt  writes:

> From: "H.J. Lu" 
>
> According to Intel® 64 and IA-32 Architectures Optimization Reference
> Manual[1], Branch Hint is updated for Redwood Cove.
>
> cut from [1]-
> Starting with the Redwood Cove microarchitecture, if the predictor has
> no stored information about a branch, the branch has the Intel® SSE2
> branch taken hint (i.e., instruction prefix 3EH), When the codec
> decodes the branch, it flips the branch’s prediction from not-taken to
> taken. It then flushes the pipeline in front of it and steers this
> pipeline to fetch the taken path of the branch.
> cut end -
>
> For -mtune-ctrl=branch_prediction_hints, always generate branch hint for
> conditional branches, this tune is disabled by default.
>
> [1] 
> https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ready push to trunk.

So what does it do to code size?
You may not want to do it with -Os.

Maybe it should only be done when actual profile feedback data is
available. I'm not sure the builtin heuristics are good enough to
justify it, and there is a risk that the hint is very wrong.

Yes, as long as it's disabled by default none of that is a problem, but
it would need to be solved before enabling it.

-Andi


[PING] Re: Updated musttail patchkit

2024-07-01 Thread Andi Kleen
Andi Kleen  writes:

I wanted to ping this patch kit to add musttail support for C/C++,
to enable future Python versions and other users and keep up with clang.

https://gcc.gnu.org/pipermail/gcc-patches/2024-June/thread.html#655447

It unfortunately touches various different parts of the compiler.
All the previous feedback has been addressed, except for
- cannot make it a warning because that would defeat the purpose
- cannot move all of the checking to expand time (would be a wholesale
rewrite of the whole mechanism)

These are RTL level:
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655448.html
(got some feedback from the two Richards and Jakub earlier)
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655450.html
(got some feedback from Andrew)

C++:
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655449.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655451.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655453.html
(C++, already approved)

C:
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655452.html
(C, got some feedback from Joseph, but never got finally approved) 

https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655455.html

Unreviewed patches, touching both tree-ssa-tailcall and calls.c expand:
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655454.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655457.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655456.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655458.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655459.html

Thanks,
-Andi

> - Fix problems with encoding musttail in tree structure (Thanks Jakub and 
> Jason)
> - Fixes a miscompilation that would break bootstrap with 
> --enable-checking=release
> - Avoids a 0.8% compile time penalty at -O0 for the new musttail pass by 
> using a cfun flag
> that is discovered by tree-cfg
> - Enables translation of musttail error messages
> - Further improves error reporting, avoiding "other reasons" error messages
> for various cases and reporting the correct error in others.
> - Adjusted the test suite to powerpc sibcall limitations
> - Addressed C++ review feedback
> - Improves dump file output
> - Improves the documentation
> - Some random cleanups
> - Rebased on trunk
>
> Tested full bootstrap on x86_64-linux and powerpc64le-linux, as well
> as a x86_64 LTO profiled bootstrap and some x86_64 testing with
> --enable-checking=release.


[PATCH v8 08/12] Give better error messages for musttail

2024-06-22 Thread Andi Kleen
When musttail is set, make tree-tailcall give error messages
when it cannot handle a call. This avoids vague "other reasons"
error messages later at expand time when it sees a musttail
function not marked tail call.

In various cases this requires delaying the error until
the call is discovered.
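
The report-once behaviour described here can be modelled outside the
compiler as below. ToyCall, toy_maybe_error_musttail and the errors vector
are hypothetical stand-ins for gcall, the helper added by this patch and
the diagnostic machinery.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal stand-in for the two flags the helper looks at on a call.
struct ToyCall
{
  bool must_tail;
  bool tail;
};

// Record an error only for musttail calls, then clear both flags so that
// later checks on the same call stay silent.
void toy_maybe_error_musttail (ToyCall &call, const std::string &reason,
                               std::vector<std::string> &errors)
{
  if (!call.must_tail)
    return;
  errors.push_back ("cannot tail-call: " + reason);
  call.must_tail = false;
  call.tail = false;
}
```

Clearing the flag after the first report is what keeps a single failing
call from producing a second, vaguer error further down the pipeline.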

gcc/ChangeLog:

* tree-tailcall.cc (maybe_error_musttail): New function.
(bb_get_succ_edge_count): New function.
(find_tail_calls): Add error reporting. Handle EH edges
for error reporting.
---
 gcc/tree-tailcall.cc | 116 +--
 1 file changed, 102 insertions(+), 14 deletions(-)

diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index 0c6df10e64f7..4687e20e61d0 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -40,9 +40,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-eh.h"
 #include "dbgcnt.h"
 #include "cfgloop.h"
+#include "intl.h"
 #include "common/common-target.h"
 #include "ipa-utils.h"
 #include "tree-ssa-live.h"
+#include "diagnostic-core.h"
 
 /* The file implements the tail recursion elimination.  It is also used to
analyze the tail calls in general, passing the results to the rtl level
@@ -402,6 +404,41 @@ propagate_through_phis (tree var, edge e)
   return var;
 }
 
+/* Report an error for failing to tail convert must call CALL
+   with error message ERR. Also clear the flag to prevent further
+   errors.  */
+
+static void
+maybe_error_musttail (gcall *call, const char *err)
+{
+  if (gimple_call_must_tail_p (call))
+{
+  error_at (call->location, "cannot tail-call: %s", err);
+  /* Avoid another error. ??? If there are multiple reasons why tail
+calls fail it might be useful to report them all to avoid
+whack-a-mole for the user. But currently there is too much
+redundancy in the reporting, so keep it simple.  */
+  gimple_call_set_must_tail (call, false); /* Avoid another error.  */
+  gimple_call_set_tail (call, false);
+}
+}
+
+/* Count succ edges for BB and return in NUM_OTHER and NUM_EH.  */
+
+static void
+bb_get_succ_edge_count (basic_block bb, int &num_other, int &num_eh)
+{
+  edge e;
+  edge_iterator ei;
+  num_eh = 0;
+  num_other = 0;
+  FOR_EACH_EDGE (e, ei, bb->succs)
+if (e->flags & EDGE_EH)
+  num_eh++;
+else
+  num_other++;
+}
+
 /* Argument for compute_live_vars/live_vars_at_stmt and what compute_live_vars
returns.  Computed lazily, but just once for the function.  */
 static live_vars_map *live_vars;
@@ -426,8 +463,16 @@ find_tail_calls (basic_block bb, struct tailcall **ret, 
bool only_musttail)
   tree var;
 
   if (!single_succ_p (bb))
-return;
+{
+  int num_eh, num_other;
+  bb_get_succ_edge_count (bb, num_eh, num_other);
+  /* Allow EH edges so that we can give a better
+error message later.  */
+  if (num_other != 1)
+   return;
+}
 
+  bool bad_stmt = false;
   for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
 {
   stmt = gsi_stmt (gsi);
@@ -448,6 +493,12 @@ find_tail_calls (basic_block bb, struct tailcall **ret, 
bool only_musttail)
  /* Handle only musttail calls when not optimizing.  */
  if (only_musttail && !gimple_call_must_tail_p (call))
return;
+ if (bad_stmt)
+   {
+ maybe_error_musttail (call,
+ _("memory reference or volatile after call"));
+ return;
+   }
  ass_var = gimple_call_lhs (call);
  break;
}
@@ -462,9 +513,14 @@ find_tail_calls (basic_block bb, struct tailcall **ret, 
bool only_musttail)
   /* If the statement references memory or volatile operands, fail.  */
   if (gimple_references_memory_p (stmt)
  || gimple_has_volatile_ops (stmt))
-   return;
+   {
+ bad_stmt = true;
+   }
 }
 
+  if (bad_stmt)
+return;
+
   if (gsi_end_p (gsi))
 {
   edge_iterator ei;
@@ -489,13 +545,26 @@ find_tail_calls (basic_block bb, struct tailcall **ret, 
bool only_musttail)
   if (ass_var
   && !is_gimple_reg (ass_var)
   && !auto_var_in_fn_p (ass_var, cfun->decl))
-return;
+{
+  maybe_error_musttail (call, _("return value in memory"));
+  return;
+}
+
+  if (cfun->calls_setjmp)
+{
+  maybe_error_musttail (call, _("caller uses setjmp"));
+  return;
+}
 
   /* If the call might throw an exception that wouldn't propagate out of
  cfun, we can't transform to a tail or sibling call (82081).  */
-  if (stmt_could_throw_p (cfun, stmt)
-  && !stmt_can_throw_external (cfun, stmt))
+  if ((stmt_could_throw_p (cfun, stmt)
+   && !stmt_can_throw_external (cfun, stmt)) || !single_succ_p (bb))
+  {
+maybe_error_musttail (call,
+ _("call may throw exception that does not propagate"));
 return;
+  }
 
   /* If the function returns a value, then at present, the tail call
  must return 

[PATCH v8 11/12] Dump reason for missing tail call into dump file

2024-06-22 Thread Andi Kleen
gcc/ChangeLog:

* tree-tailcall.cc (maybe_error_musttail): Print reason to
dump_file.
(find_tail_calls): Print gimple stmt or other reasons that stop
the search for tail calls into dump file.
---
 gcc/tree-tailcall.cc | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index a77fa1511415..f69a9ad40bda 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -442,6 +442,11 @@ maybe_error_musttail (gcall *call, const char *err)
   gimple_call_set_must_tail (call, false); /* Avoid another error.  */
   gimple_call_set_tail (call, false);
 }
+  if (dump_file)
+{
+  print_gimple_stmt (dump_file, call, 0, TDF_SLIM);
+  fprintf (dump_file, "Cannot convert: %s\n", err);
+}
 }
 
 /* Count succ edges for BB and return in NUM_OTHER and NUM_EH.  */
@@ -492,7 +497,12 @@ find_tail_calls (basic_block bb, struct tailcall **ret, bool only_musttail,
   /* Allow EH edges so that we can give a better
 error message later.  */
   if (num_other != 1)
-   return;
+   {
+ if (dump_file)
+   fprintf (dump_file, "Basic block %d has %d eh / %d other edges\n",
+  bb->index, num_eh, num_other);
+ return;
+   }
 }
 
   bool bad_stmt = false;
@@ -537,6 +547,11 @@ find_tail_calls (basic_block bb, struct tailcall **ret, bool only_musttail,
   if (gimple_references_memory_p (stmt)
  || gimple_has_volatile_ops (stmt))
{
+ if (dump_file)
+   {
+ fprintf (dump_file, "Cannot handle ");
+ print_gimple_stmt (dump_file, stmt, 0);
+   }
  bad_stmt = true;
}
 }
-- 
2.45.2



[PATCH v8 10/12] Add documentation for musttail attribute

2024-06-22 Thread Andi Kleen
gcc/ChangeLog:

* doc/extend.texi: Document [[musttail]]
---
 gcc/doc/extend.texi | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b2e41a581dd1..f83e643da19c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9921,7 +9921,7 @@ same manner as the @code{deprecated} attribute.
 @section Statement Attributes
 @cindex Statement Attributes
 
-GCC allows attributes to be set on null statements.  @xref{Attribute Syntax},
+GCC allows attributes to be set on statements.  @xref{Attribute Syntax},
 for details of the exact syntax for using attributes.  Other attributes are
 available for functions (@pxref{Function Attributes}), variables
 (@pxref{Variable Attributes}), labels (@pxref{Label Attributes}), enumerators
@@ -9978,6 +9978,25 @@ foo (int x, int y)
 @code{y} is not actually incremented and the compiler can but does not
 have to optimize it to just @code{return 42 + 42;}.
 
+@cindex @code{musttail} statement attribute
+@item musttail
+
+The @code{gnu::musttail} or @code{clang::musttail} attribute
+can be applied to a @code{return} statement with a return-value expression
+that is a function call.  It asserts that the call must be a tail call that
+does not allocate extra stack space, so it is safe to use tail recursion
+to implement long running loops.
+
+@smallexample
+[[gnu::musttail]] return foo();
+@end smallexample
+
+If the compiler cannot generate a @code{musttail} tail call it will report
+an error. On some targets tail calls may never be supported.
+Tail calls cannot reference locals in memory, which may affect
+builds without optimization when passing small structures, or passing
+or returning large structures. Enabling -O1 or -O2 can improve
+the success of tail calls.
 @end table
 
 @node Attribute Syntax
@@ -10101,7 +10120,9 @@ the constant expression, if present.
 
 @subsubheading Statement Attributes
 In GNU C, an attribute specifier list may appear as part of a null
-statement.  The attribute goes before the semicolon.
+statement. The attribute goes before the semicolon.
+Some attributes in new style syntax are also supported
+on non-null statements.
 
 @subsubheading Type Attributes
 
-- 
2.45.2



[PATCH v8 09/12] Delay caller error reporting for musttail

2024-06-22 Thread Andi Kleen
Move the error reporting for caller attributes to be
after the tail call discovery, so that we can give proper
error messages tagged to the calls.
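
The idea can be illustrated with a toy model, assuming nothing about GCC's real data structures: the caller-level checks receive the call under consideration, so a failure is recorded against that call, and only when it was marked musttail. All `toy_` names and both structs are hypothetical stand-ins; only the "cannot tail-call: " prefix and the reason strings are taken from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the caller flags and the call statement.  */
struct toy_caller { bool uses_stdarg; bool calls_alloca; bool calls_setjmp; };
struct toy_call { bool must_tail; bool tail; char error[80]; };

/* Record ERR against CALL, but only if CALL was marked musttail.  */
void
toy_maybe_error_musttail (struct toy_call *call, const char *err)
{
  assert (call != NULL && err != NULL);
  if (call->must_tail)
    {
      snprintf (call->error, sizeof call->error, "cannot tail-call: %s", err);
      call->must_tail = false; /* Avoid another error.  */
      call->tail = false;
    }
}

/* Caller-level check that now takes the call to report against.  */
bool
toy_suitable_for_tail_call_opt_p (const struct toy_caller *caller,
                                  struct toy_call *call)
{
  if (caller->calls_alloca)
    {
      toy_maybe_error_musttail (call, "caller uses alloca");
      return false;
    }
  if (caller->calls_setjmp)
    {
      toy_maybe_error_musttail (call, "caller uses setjmp");
      return false;
    }
  return true;
}
```

An unmarked call in the same caller is rejected as well, but without any diagnostic attached to it.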

gcc/ChangeLog:

* tree-tailcall.cc (maybe_error_musttail): Declare.
(suitable_for_tail_opt_p): Take call and report errors.
(suitable_for_tail_call_opt_p): Take call and report errors.
(find_tail_calls): Report caller errors after discovery.
(tree_optimize_tail_calls_1): Remove caller suitableness check.
---
 gcc/tree-tailcall.cc | 62 ++--
 1 file changed, 43 insertions(+), 19 deletions(-)

diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index 4687e20e61d0..a77fa1511415 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -133,14 +133,20 @@ static tree m_acc, a_acc;
 
 static bitmap tailr_arg_needs_copy;
 
+static void maybe_error_musttail (gcall *call, const char *err);
+
 /* Returns false when the function is not suitable for tail call optimization
-   from some reason (e.g. if it takes variable number of arguments).  */
+   from some reason (e.g. if it takes variable number of arguments). CALL
+   is call to report for.  */
 
 static bool
-suitable_for_tail_opt_p (void)
+suitable_for_tail_opt_p (gcall *call)
 {
   if (cfun->stdarg)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses stdargs"));
+  return false;
+}
 
   return true;
 }
@@ -148,35 +154,47 @@ suitable_for_tail_opt_p (void)
 /* Returns false when the function is not suitable for tail call optimization
for some reason (e.g. if it takes variable number of arguments).
This test must pass in addition to suitable_for_tail_opt_p in order to make
-   tail call discovery happen.  */
+   tail call discovery happen. CALL is call to report error for.  */
 
 static bool
-suitable_for_tail_call_opt_p (void)
+suitable_for_tail_call_opt_p (gcall *call)
 {
   tree param;
 
   /* alloca (until we have stack slot life analysis) inhibits
  sibling call optimizations, but not tail recursion.  */
   if (cfun->calls_alloca)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses alloca"));
+  return false;
+}
 
   /* If we are using sjlj exceptions, we may need to add a call to
  _Unwind_SjLj_Unregister at exit of the function.  Which means
  that we cannot do any sibcall transformations.  */
   if (targetm_common.except_unwind_info (&global_options) == UI_SJLJ
   && current_function_has_exception_handlers ())
-return false;
+{
+  maybe_error_musttail (call, _("caller uses sjlj exceptions"));
+  return false;
+}
 
   /* Any function that calls setjmp might have longjmp called from
  any called function.  ??? We really should represent this
  properly in the CFG so that this needn't be special cased.  */
   if (cfun->calls_setjmp)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses setjmp"));
+  return false;
+}
 
   /* Various targets don't handle tail calls correctly in functions
  that call __builtin_eh_return.  */
   if (cfun->calls_eh_return)
-return false;
+{
+  maybe_error_musttail (call, _("caller uses __builtin_eh_return"));
+  return false;
+}
 
   /* ??? It is OK if the argument of a function is taken in some cases,
  but not in all cases.  See PR15387 and PR19616.  Revisit for 4.1.  */
@@ -184,7 +202,10 @@ suitable_for_tail_call_opt_p (void)
param;
param = DECL_CHAIN (param))
 if (TREE_ADDRESSABLE (param))
-  return false;
+  {
+   maybe_error_musttail (call, _("address of caller arguments taken"));
+   return false;
+  }
 
   return true;
 }
@@ -445,10 +466,12 @@ static live_vars_map *live_vars;
 static vec live_vars_vec;
 
 /* Finds tailcalls falling into basic block BB. The list of found tailcalls is
-   added to the start of RET. When ONLY_MUSTTAIL is set only handle musttail.  */
+   added to the start of RET. When ONLY_MUSTTAIL is set only handle musttail.
+   Update OPT_TAILCALLS as output parameter.  */
 
 static void
-find_tail_calls (basic_block bb, struct tailcall **ret, bool only_musttail)
+find_tail_calls (basic_block bb, struct tailcall **ret, bool only_musttail,
+bool &opt_tailcalls)
 {
   tree ass_var = NULL_TREE, ret_var, func, param;
   gimple *stmt;
@@ -526,11 +549,17 @@ find_tail_calls (basic_block bb, struct tailcall **ret, bool only_musttail)
   edge_iterator ei;
   /* Recurse to the predecessors.  */
   FOR_EACH_EDGE (e, ei, bb->preds)
-   find_tail_calls (e->src, ret, only_musttail);
+   find_tail_calls (e->src, ret, only_musttail, opt_tailcalls);
 
   return;
 }
 
+  if (!suitable_for_tail_opt_p (call))
+return;
+
+  if (!suitable_for_tail_call_opt_p (call))
+opt_tailcalls = false;
+
   /* If the LHS of our call is not just a simple register or local
  variable, we can't transform this into a tail or sibling call.
  This 

[PATCH v8 07/12] Enable musttail tail conversion even when not optimizing

2024-06-22 Thread Andi Kleen
Enable the tailcall optimization for non-optimizing builds,
but in this case only check calls that have the musttail attribute set.
This makes musttail work without optimization.

This is done with a new late musttail pass that is only active when
not optimizing. The new pass relies on tree-cfg to discover musttails.
This avoids a ~0.8% compiler run time penalty at -O0.
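
A toy model of the flag-based gating described above, as a sketch only. The structs and `toy_` functions are hypothetical; the field names mirror the `has_musttail` and `calls_setjmp` bits touched by the patch, and the exact gate condition (`optimize == 0 && has_musttail`) is an assumption drawn from the description rather than from the pass code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a call statement and per-function flags.  */
struct toy_call { bool must_tail; bool returns_twice; };
struct toy_function { bool calls_setjmp; bool has_musttail; };

/* Reset the per-function flags before the calls are scanned.  */
void
toy_clear_special_calls (struct toy_function *fn)
{
  fn->calls_setjmp = false;
  fn->has_musttail = false;
}

/* Record properties of CALL on FN while the CFG is being built.  */
void
toy_notice_special_calls (struct toy_function *fn, const struct toy_call *call)
{
  assert (fn != NULL && call != NULL);
  if (call->returns_twice)
    fn->calls_setjmp = true;
  if (call->must_tail)
    fn->has_musttail = true;
}

/* Assumed gate: the late pass only has work when not optimizing and
   the function contains at least one musttail call.  */
bool
toy_musttail_pass_needed (const struct toy_function *fn, int optimize)
{
  return optimize == 0 && fn->has_musttail;
}
```

Because the flag is set as a side effect of a scan that already happens, functions without musttail calls pay only for a single bit test.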

gcc/ChangeLog:

* function.h (struct function): Add has_musttail.
* lto-streamer-in.cc (input_struct_function_base): Stream
has_musttail.
* lto-streamer-out.cc (output_struct_function_base): Ditto.
* passes.def (pass_musttail): Add.
* tree-cfg.cc (notice_special_calls): Record has_musttail.
(clear_special_calls): Clear has_musttail.
* tree-pass.h (make_pass_musttail): Add.
* tree-tailcall.cc (find_tail_calls): Handle only_musttail
  argument.
(tree_optimize_tail_calls_1): Pass on only_musttail.
(execute_tail_calls): Pass only_musttail as false.
(class pass_musttail): Add.
(make_pass_musttail): Add.
---
 gcc/function.h  |  3 ++
 gcc/lto-streamer-in.cc  |  1 +
 gcc/lto-streamer-out.cc |  1 +
 gcc/passes.def  |  1 +
 gcc/tree-cfg.cc |  3 ++
 gcc/tree-pass.h |  1 +
 gcc/tree-tailcall.cc| 66 +++--
 7 files changed, 67 insertions(+), 9 deletions(-)

diff --git a/gcc/function.h b/gcc/function.h
index c0ba6cc1531a..fbeadeaf4104 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -430,6 +430,9 @@ struct GTY(()) function {
   /* Nonzero when the tail call has been identified.  */
   unsigned int tail_call_marked : 1;
 
+  /* Has musttail marked calls.  */
+  unsigned int has_musttail : 1;
+
   /* Nonzero if the current function contains a #pragma GCC unroll.  */
   unsigned int has_unroll : 1;
 
diff --git a/gcc/lto-streamer-in.cc b/gcc/lto-streamer-in.cc
index ad0ca24007a0..2e592be80823 100644
--- a/gcc/lto-streamer-in.cc
+++ b/gcc/lto-streamer-in.cc
@@ -1325,6 +1325,7 @@ input_struct_function_base (struct function *fn, class data_in *data_in,
   fn->calls_eh_return = bp_unpack_value (&bp, 1);
   fn->has_force_vectorize_loops = bp_unpack_value (&bp, 1);
   fn->has_simduid_loops = bp_unpack_value (&bp, 1);
+  fn->has_musttail = bp_unpack_value (&bp, 1);
   fn->assume_function = bp_unpack_value (&bp, 1);
   fn->va_list_fpr_size = bp_unpack_value (&bp, 8);
   fn->va_list_gpr_size = bp_unpack_value (&bp, 8);
diff --git a/gcc/lto-streamer-out.cc b/gcc/lto-streamer-out.cc
index d4f728094ed5..0be381abbd96 100644
--- a/gcc/lto-streamer-out.cc
+++ b/gcc/lto-streamer-out.cc
@@ -2290,6 +2290,7 @@ output_struct_function_base (struct output_block *ob, struct function *fn)
   bp_pack_value (&bp, fn->calls_eh_return, 1);
   bp_pack_value (&bp, fn->has_force_vectorize_loops, 1);
   bp_pack_value (&bp, fn->has_simduid_loops, 1);
+  bp_pack_value (&bp, fn->has_musttail, 1);
   bp_pack_value (&bp, fn->assume_function, 1);
   bp_pack_value (&bp, fn->va_list_fpr_size, 8);
   bp_pack_value (&bp, fn->va_list_gpr_size, 8);
diff --git a/gcc/passes.def b/gcc/passes.def
index 041229e47a68..5b5390e6ac0b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -444,6 +444,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_tsan_O0);
   NEXT_PASS (pass_sanopt);
   NEXT_PASS (pass_cleanup_eh);
+  NEXT_PASS (pass_musttail);
   NEXT_PASS (pass_lower_resx);
   NEXT_PASS (pass_nrv);
   NEXT_PASS (pass_gimple_isel);
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 7fb7b92966be..e6fd1294b958 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -2290,6 +2290,8 @@ notice_special_calls (gcall *call)
 cfun->calls_alloca = true;
   if (flags & ECF_RETURNS_TWICE)
 cfun->calls_setjmp = true;
+  if (gimple_call_must_tail_p (call))
+cfun->has_musttail = true;
 }
 
 
@@ -2301,6 +2303,7 @@ clear_special_calls (void)
 {
   cfun->calls_alloca = false;
   cfun->calls_setjmp = false;
+  cfun->has_musttail = false;
 }
 
 /* Remove PHI nodes associated with basic block BB and all edges out of BB.  */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index edebb2be245d..59e53558034f 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -368,6 +368,7 @@ extern gimple_opt_pass *make_pass_sra (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_sra_early (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tail_recursion (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tail_calls (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_musttail (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_fix_loops (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_loop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_no_loop (gcc::context *ctxt);
diff --git a/gcc/tree-tailcall.cc b/gcc/tree-tailcall.cc
index e9f7f8a12b3a..0c6df10e64f7 100644
--- a/gcc/tree-tailcall.cc
+++ b/gcc/tree-tailcall.cc
@@ -408,10 +408,10 @@ static live_vars_map *live_vars;
 static vec live_vars_vec;
 
 /* Finds tailcalls 

[PATCH v8 06/12] Add tests for C/C++ musttail attributes

2024-06-22 Thread Andi Kleen
Some are adapted from the existing C musttail plugin tests.

gcc/testsuite/ChangeLog:

* c-c++-common/musttail1.c: New test.
* c-c++-common/musttail2.c: New test.
* c-c++-common/musttail3.c: New test.
* c-c++-common/musttail4.c: New test.
* c-c++-common/musttail7.c: New test.
* c-c++-common/musttail8.c: New test.
* g++.dg/musttail6.C: New test.
* g++.dg/musttail9.C: New test.
* g++.dg/musttail10.C: New test.
---
 gcc/testsuite/c-c++-common/musttail1.c | 14 ++
 gcc/testsuite/c-c++-common/musttail2.c | 33 ++
 gcc/testsuite/c-c++-common/musttail3.c | 29 +
 gcc/testsuite/c-c++-common/musttail4.c | 17 
 gcc/testsuite/c-c++-common/musttail5.c | 28 
 gcc/testsuite/c-c++-common/musttail7.c | 14 ++
 gcc/testsuite/c-c++-common/musttail8.c | 17 
 gcc/testsuite/g++.dg/musttail10.C  | 34 +++
 gcc/testsuite/g++.dg/musttail6.C   | 59 ++
 gcc/testsuite/g++.dg/musttail9.C   | 10 +
 10 files changed, 255 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/musttail1.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail2.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail3.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail4.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail5.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail7.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail8.c
 create mode 100644 gcc/testsuite/g++.dg/musttail10.C
 create mode 100644 gcc/testsuite/g++.dg/musttail6.C
 create mode 100644 gcc/testsuite/g++.dg/musttail9.C

diff --git a/gcc/testsuite/c-c++-common/musttail1.c b/gcc/testsuite/c-c++-common/musttail1.c
new file mode 100644
index 000000000000..74efcc2a0bc6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
+
+int __attribute__((noinline,noclone,noipa))
+callee (int i)
+{
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+caller (int i)
+{
+  [[gnu::musttail]] return callee (i + 1);
+}
diff --git a/gcc/testsuite/c-c++-common/musttail2.c b/gcc/testsuite/c-c++-common/musttail2.c
new file mode 100644
index 000000000000..86f2c3d77404
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[256]; int i; };
+
+int __attribute__((noinline,noclone,noipa))
+test_2_callee (int i, struct box b)
+{
+  if (b.field[0])
+return 5;
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+test_2_caller (int i)
+{
+  struct box b;
+  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot tail-call: " } */
+}
+
+extern void setjmp (void);
+void
+test_3 (void)
+{
+  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
+}
+
+extern float f7(void);
+
+int
+test_6 (void)
+{
+  [[gnu::musttail]] return f7(); /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail3.c b/gcc/testsuite/c-c++-common/musttail3.c
new file mode 100644
index 000000000000..ea9589c59ef2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail3.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+extern int foo2 (int x, ...);
+
+struct str
+{
+  int a, b;
+};
+
+struct str
+cstruct (int x)
+{
+  if (x < 10)
+[[clang::musttail]] return cstruct (x + 1);
+  return ((struct str){ x, 0 });
+}
+
+int
+foo (int x)
+{
+  if (x < 10)
+[[clang::musttail]] return foo2 (x, 29);
+  if (x < 100)
+{
+  int k = foo (x + 1);
+  [[clang::musttail]] return k;/* { dg-error "cannot tail-call: " } */
+}
+  return x;
+}
diff --git a/gcc/testsuite/c-c++-common/musttail4.c b/gcc/testsuite/c-c++-common/musttail4.c
new file mode 100644
index 000000000000..23f4b5e1cd68
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[64]; int i; };
+
+struct box __attribute__((noinline,noclone,noipa))
+returns_struct (int i)
+{
+  struct box b;
+  b.i = i * i;
+  return b;
+}
+
+int __attribute__((noinline,noclone))
+test_1 (int i)
+{
+  [[gnu::musttail]] return returns_struct (i * 5).i; /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail5.c b/gcc/testsuite/c-c++-common/musttail5.c
new file mode 100644
index 000000000000..234da0d3f2a9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23" { target c } } */
+/* { dg-options "-std=gnu++11" { target c++ } } */
+
+[[musttail]] int j; /* { dg-warning "attribute" } */
+__attribute__((musttail)) int k; /* { dg-warning "attribute" } */
+
+void 

[PATCH v8 12/12] Mark expand musttail error messages for translation

2024-06-22 Thread Andi Kleen
The musttail error messages are reported to the user, so must be
translated.

gcc/ChangeLog:

* calls.cc (initialize_argument_information): Mark messages
for translation.
(can_implement_as_sibling_call_p): Ditto.
(expand_call): Ditto.
---
 gcc/calls.cc | 56 ++--
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/gcc/calls.cc b/gcc/calls.cc
index 883eb9971257..f28c58217fdf 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -1420,9 +1420,9 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
{
  *may_tailcall = false;
  maybe_complain_about_tail_call (exp,
- "a callee-copied argument is"
- " stored in the current"
- " function's frame");
+ _("a callee-copied argument is"
+   " stored in the current"
+   " function's frame"));
}
 
  args[i].tree_value = build_fold_addr_expr_loc (loc,
@@ -1489,8 +1489,8 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
  type = TREE_TYPE (args[i].tree_value);
  *may_tailcall = false;
  maybe_complain_about_tail_call (exp,
- "argument must be passed"
- " by copying");
+ _("argument must be passed"
+   " by copying"));
}
  arg.pass_by_reference = true;
}
@@ -2508,8 +2508,8 @@ can_implement_as_sibling_call_p (tree exp,
 {
   maybe_complain_about_tail_call
(exp,
-"machine description does not have"
-" a sibcall_epilogue instruction pattern");
+_("machine description does not have"
+  " a sibcall_epilogue instruction pattern"));
   return false;
 }
 
@@ -2519,7 +2519,7 @@ can_implement_as_sibling_call_p (tree exp,
  sibling calls will return a structure.  */
   if (structure_value_addr != NULL_RTX)
 {
-  maybe_complain_about_tail_call (exp, "callee returns a structure");
+  maybe_complain_about_tail_call (exp, _("callee returns a structure"));
   return false;
 }
 
@@ -2528,8 +2528,8 @@ can_implement_as_sibling_call_p (tree exp,
   if (!targetm.function_ok_for_sibcall (fndecl, exp))
 {
   maybe_complain_about_tail_call (exp,
- "target is not able to optimize the"
- " call into a sibling call");
+ _("target is not able to optimize the"
+   " call into a sibling call"));
   return false;
 }
 
@@ -2537,18 +2537,18 @@ can_implement_as_sibling_call_p (tree exp,
  optimized.  */
   if (flags & ECF_RETURNS_TWICE)
 {
-  maybe_complain_about_tail_call (exp, "callee returns twice");
+  maybe_complain_about_tail_call (exp, _("callee returns twice"));
   return false;
 }
   if (flags & ECF_NORETURN)
 {
-  maybe_complain_about_tail_call (exp, "callee does not return");
+  maybe_complain_about_tail_call (exp, _("callee does not return"));
   return false;
 }
 
   if (TYPE_VOLATILE (TREE_TYPE (TREE_TYPE (addr))))
 {
-  maybe_complain_about_tail_call (exp, "volatile function type");
+  maybe_complain_about_tail_call (exp, _("volatile function type"));
   return false;
 }
 
@@ -2567,7 +2567,7 @@ can_implement_as_sibling_call_p (tree exp,
  the argument areas are shared.  */
   if (fndecl && decl_function_context (fndecl) == current_function_decl)
 {
-  maybe_complain_about_tail_call (exp, "nested function");
+  maybe_complain_about_tail_call (exp, _("nested function"));
   return false;
 }
 
@@ -2579,8 +2579,8 @@ can_implement_as_sibling_call_p (tree exp,
crtl->args.size - crtl->args.pretend_args_size))
 {
   maybe_complain_about_tail_call (exp,
- "callee required more stack slots"
- " than the caller");
+ _("callee required more stack slots"
+   " than the caller"));
   return false;
 }
 
@@ -2594,15 +2594,15 @@ can_implement_as_sibling_call_p (tree exp,
crtl->args.size)))
 {
   maybe_complain_about_tail_call (exp,
- "inconsistent number of"
- " popped arguments");
+ _("inconsistent number of"
+ 

[PATCH v8 04/12] C++: Support clang compatible [[musttail]] (PR83324)

2024-06-22 Thread Andi Kleen
This patch implements a clang compatible [[musttail]] attribute for
returns.

musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big,
which causes problems with register allocation and with other per-function
optimizations that do not scale. With musttail the interpreter can instead
be written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called, which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also detects clang::musttail.

Passes bootstrap and full testing.

PR83324

gcc/cp/ChangeLog:

* parser.cc (cp_parser_statement): Handle musttail.
(cp_parser_jump_statement): Ditto.
* pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL.
---
 gcc/cp/parser.cc | 34 +++---
 gcc/cp/pt.cc |  7 ++-
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index e7409b856f11..c03c1aec2c01 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree cp_parser_perform_range_for_lookup
 static tree cp_parser_range_for_member_function
   (tree, tree);
 static tree cp_parser_jump_statement
-  (cp_parser *);
+  (cp_parser *, tree &);
 static void cp_parser_declaration_statement
   (cp_parser *);
 
@@ -12755,7 +12755,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
case RID_CO_RETURN:
case RID_GOTO:
  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
- statement = cp_parser_jump_statement (parser);
+ statement = cp_parser_jump_statement (parser, std_attrs);
  break;
 
  /* Objective-C++ exception-handling constructs.  */
@@ -14822,10 +14822,11 @@ cp_parser_init_statement (cp_parser *parser, tree *decl)
jump-statement:
  goto * expression ;
 
+   STD_ATTRS are the statement attributes. They can be modified.
Returns the new BREAK_STMT, CONTINUE_STMT, RETURN_EXPR, or GOTO_EXPR.  */
 
 static tree
-cp_parser_jump_statement (cp_parser* parser)
+cp_parser_jump_statement (cp_parser* parser, tree &std_attrs)
 {
   tree statement = error_mark_node;
   cp_token *token;
@@ -14902,6 +14903,33 @@ cp_parser_jump_statement (cp_parser* parser)
  /* If the next token is a `;', then there is no
 expression.  */
  expr = NULL_TREE;
+
+   if (keyword == RID_RETURN && expr)
+ {
+   bool musttail_p = false;
+   if (lookup_attribute ("gnu", "musttail", std_attrs))
+ {
+   musttail_p = true;
+   std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+ }
+   /* Support this for compatibility.  */
+   if (lookup_attribute ("clang", "musttail", std_attrs))
+ {
+   musttail_p = true;
+   std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+ }
+   if (musttail_p)
+ {
+   tree t = expr;
+   if (t && TREE_CODE (t) == TARGET_EXPR)
+ t = TARGET_EXPR_INITIAL (t);
+   if (t && TREE_CODE (t) != CALL_EXPR)
+ error_at (token->location, "cannot tail-call: return value must be a call");
+   else
+ CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+ }
+ }
+
/* Build the return-statement, check co-return first, since type
   deduction is not valid there.  */
if (keyword == RID_CO_RETURN)
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 607753ae6b7f..9addcc10bfd0 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -2,12 +2,17 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
bool op = CALL_EXPR_OPERATOR_SYNTAX (t);
bool ord = CALL_EXPR_ORDERED_ARGS (t);
bool rev = CALL_EXPR_REVERSE_ARGS (t);
-   if (op || ord || rev)
+   bool mtc = false;
+   if (TREE_CODE (t) == CALL_EXPR)
+ mtc = CALL_EXPR_MUST_TAIL_CALL (t);
+   if (op || ord || rev || mtc)
  if (tree call = extract_call_expr (ret))
{
  CALL_EXPR_OPERATOR_SYNTAX (call) = op;
  CALL_EXPR_ORDERED_ARGS (call) = ord;
  

[PATCH v8 05/12] C: Implement musttail attribute for returns

2024-06-22 Thread Andi Kleen
Implement a C23, clang-compatible musttail attribute in the C parser,
similar to the earlier C++ implementation.
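
The attribute handling in the parser can be pictured with a toy model, given here as a sketch only. The linked list, `toy_attr_state`, and every `toy_` function are hypothetical stand-ins for GCC's tree-based attribute lists; the behavior modeled is what the patch describes: when the next token is `return`, strip `gnu::musttail` and `clang::musttail` from the attribute list and remember that one was seen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an attribute list node.  */
struct toy_attr
{
  const char *ns;
  const char *name;
  struct toy_attr *next;
};

struct toy_attr_state { bool musttail_p; };

/* Unlink every NS::NAME entry from LIST; set *FOUND if any was removed.  */
static struct toy_attr *
toy_remove_attribute (const char *ns, const char *name,
                      struct toy_attr *list, bool *found)
{
  struct toy_attr **link = &list;
  while (*link != NULL)
    {
      if (strcmp ((*link)->ns, ns) == 0 && strcmp ((*link)->name, name) == 0)
        {
          *link = (*link)->next;
          *found = true;
        }
      else
        link = &(*link)->next;
    }
  return list;
}

/* Only a following return statement consumes the attribute; otherwise
   the list is handed back untouched for the usual diagnostics.  */
struct toy_attr *
toy_handle_musttail (bool next_is_return, struct toy_attr *attrs,
                     struct toy_attr_state *a)
{
  assert (a != NULL);
  if (next_is_return)
    {
      bool found = false;
      attrs = toy_remove_attribute ("gnu", "musttail", attrs, &found);
      attrs = toy_remove_attribute ("clang", "musttail", attrs, &found);
      if (found)
        a->musttail_p = true;
    }
  return attrs;
}
```

Removing the attribute once it has been consumed keeps later, generic attribute processing from seeing it on the statement.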

PR83324

gcc/c/ChangeLog:

* c-parser.cc (struct attr_state): Define with musttail_p.
(c_parser_statement_after_labels): Handle [[musttail]]
(c_parser_std_attribute): Ditto.
(c_parser_handle_musttail): Ditto.
(c_parser_compound_statement_nostart): Ditto.
(c_parser_all_labels): Ditto.
(c_parser_statement): Ditto.
* c-tree.h (c_finish_return): Add musttail_p flag.
* c-typeck.cc (c_finish_return): Handle musttail_p flag.
---
 gcc/c/c-parser.cc | 61 +--
 gcc/c/c-tree.h|  2 +-
 gcc/c/c-typeck.cc | 15 ++--
 3 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index e83e9c683f75..f47cde3c9d6e 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1621,6 +1621,11 @@ struct omp_for_parse_data {
   bool fail : 1;
 };
 
+struct attr_state
+{
+  bool musttail_p; // parsed a musttail for return
+};
+
 static bool c_parser_nth_token_starts_std_attributes (c_parser *,
  unsigned int);
 static tree c_parser_std_attribute_specifier_sequence (c_parser *);
@@ -1665,7 +1670,7 @@ static location_t c_parser_compound_statement_nostart (c_parser *);
 static void c_parser_label (c_parser *, tree);
 static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
 static void c_parser_statement_after_labels (c_parser *, bool *,
-vec<tree> * = NULL);
+vec<tree> * = NULL, attr_state = {});
 static tree c_parser_c99_block_statement (c_parser *, bool *,
  location_t * = NULL);
 static void c_parser_if_statement (c_parser *, bool *, vec<tree> *);
@@ -5763,6 +5768,8 @@ c_parser_std_attribute (c_parser *parser, bool for_tm)
}
   goto out;
 }
+  else if (is_attribute_p ("musttail", name))
+error ("%<musttail%> attribute has arguments");
   {
 location_t open_loc = c_parser_peek_token (parser)->location;
 matching_parens parens;
@@ -6985,6 +6992,28 @@ c_parser_handle_directive_omp_attributes (tree ,
 }
 }
 
+/* Check if STD_ATTR contains a musttail attribute and handle it
+   PARSER is the parser and A is the output attr_state.  */
+
+static tree
+c_parser_handle_musttail (c_parser *parser, tree std_attrs, attr_state &a)
+{
+  if (c_parser_next_token_is_keyword (parser, RID_RETURN))
+{
+  if (lookup_attribute ("gnu", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+ a.musttail_p = true;
+   }
+  if (lookup_attribute ("clang", "musttail", std_attrs))
+   {
+ std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+ a.musttail_p = true;
+   }
+}
+  return std_attrs;
+}
+
 /* Parse a compound statement except for the opening brace.  This is
used for parsing both compound statements and statement expressions
(which follow different paths to handling the opening).  */
@@ -7001,6 +7030,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
   bool in_omp_loop_block
 = omp_for_parse_state ? omp_for_parse_state->want_nested_loop : false;
   tree sl = NULL_TREE;
+  attr_state a = {};
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
 {
@@ -7141,7 +7171,10 @@ c_parser_compound_statement_nostart (c_parser *parser)
= c_parser_nth_token_starts_std_attributes (parser, 1);
   tree std_attrs = NULL_TREE;
   if (have_std_attrs)
-   std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+   {
+ std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+ std_attrs = c_parser_handle_musttail (parser, std_attrs, a);
+   }
   if (c_parser_next_token_is_keyword (parser, RID_CASE)
  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
  || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7289,7 +7322,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
  last_stmt = true;
  mark_valid_location_for_stdc_pragma (false);
  if (!omp_for_parse_state)
-   c_parser_statement_after_labels (parser, NULL);
+   c_parser_statement_after_labels (parser, NULL, NULL, a);
  else
{
  /* In canonical loop nest form, nested loops can only appear
@@ -7331,15 +7364,18 @@ c_parser_compound_statement_nostart (c_parser *parser)
 /* Parse all consecutive labels, possibly preceded by standard
attributes.  In this context, a statement is required, not a
declaration, so attributes must be followed by a statement that is
-   not just a semicolon.  */
+   not just a semicolon.  Returns an attr_state.  */
 
-static void
+static attr_state
 c_parser_all_labels (c_parser *parser)
 {
+  attr_state a = {};
   

[PATCH v8 02/12] Fix pro_and_epilogue for sibcalls at -O0

2024-06-22 Thread Andi Kleen
Some of the cfg fixups in pro_and_epilogue for sibcalls were dependent on
"optimize". Make them also check cfun->tail_call_marked to handle the
-O0 musttail case. This fixes the musttail test cases on arm targets.

PR115255

gcc/ChangeLog:

* function.cc (thread_prologue_and_epilogue_insns): Check
  cfun->tail_call_marked for sibcalls too.
(rest_of_handle_thread_prologue_and_epilogue): Ditto.
---
 gcc/function.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/function.cc b/gcc/function.cc
index 4edd4da12474..7c9b181423d4 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -6261,6 +6261,7 @@ thread_prologue_and_epilogue_insns (void)
   /* Threading the prologue and epilogue changes the artificial refs in the
  entry and exit blocks, and may invalidate DF info for tail calls.  */
   if (optimize
+  || cfun->tail_call_marked
   || flag_optimize_sibling_calls
   || flag_ipa_icf_functions
   || in_lto_p)
@@ -6557,7 +6558,7 @@ rest_of_handle_thread_prologue_and_epilogue (function *fun)
 {
   /* prepare_shrink_wrap is sensitive to the block structure of the control
  flow graph, so clean it up first.  */
-  if (optimize)
+  if (cfun->tail_call_marked || optimize)
 cleanup_cfg (0);
 
   /* On some machines, the prologue and epilogue code, or parts thereof,
-- 
2.45.2



Updated musttail patchkit

2024-06-22 Thread Andi Kleen
- Fix problems with encoding musttail in tree structure (Thanks Jakub and Jason)
- Fixes a miscompilation that would break bootstrap with
--enable-checking=release
- Avoids a 0.8% compile time penalty at -O0 for the new musttail pass by
using a cfun flag that is discovered by tree-cfg
- Enables translation of musttail error messages
- Further improves error reporting, avoiding "other reasons" error messages
for various cases and reporting the correct error in others.
- Adjusted the test suite to powerpc sibcall limitations
- Addressed C++ review feedback
- Improves dump file output
- Improves the documentation
- Some random cleanups
- Rebased on trunk

Tested full bootstrap on x86_64-linux and powerpc64le-linux, as well
as an x86_64 LTO profiled bootstrap and some x86_64 testing with
--enable-checking=release.



[PATCH v8 03/12] Add a musttail generic attribute to the c-attribs table

2024-06-22 Thread Andi Kleen
It does nothing currently since statement attributes are handled
directly in the parser.

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_musttail_attribute): Add.
* c-common.h (handle_musttail_attribute): Add.
---
 gcc/c-family/c-attribs.cc | 15 +++
 gcc/c-family/c-common.h   |  1 +
 2 files changed, 16 insertions(+)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index f9b229aba7fc..5adc7b775eaf 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -340,6 +340,8 @@ const struct attribute_spec c_common_gnu_attributes[] =
   { "common", 0, 0, true,  false, false, false,
  handle_common_attribute,
  attr_common_exclusions },
+  { "musttail",  0, 0, false, false, false,
+ false, handle_musttail_attribute, NULL },
   /* FIXME: logically, noreturn attributes should be listed as
  "false, true, true" and apply to function types.  But implementing this
  would require all the places in the compiler that use TREE_THIS_VOLATILE
@@ -1222,6 +1224,19 @@ handle_common_attribute (tree *node, tree name, tree ARG_UNUSED (args),
   return NULL_TREE;
 }
 
+/* Handle a "musttail" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+tree
+handle_musttail_attribute (tree ARG_UNUSED (*node), tree name, tree ARG_UNUSED (args),
+  int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  /* Currently only a statement attribute, handled directly in parser.  */
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+  return NULL_TREE;
+}
+
 /* Handle a "noreturn" attribute; arguments as in
struct attribute_spec.handler.  */
 
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 48c89b603bcd..e84c9c47513b 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1643,6 +1643,7 @@ extern tree find_tm_attribute (tree);
 extern const struct attribute_spec::exclusions attr_cold_hot_exclusions[];
 extern const struct attribute_spec::exclusions attr_noreturn_exclusions[];
 extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
+extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
 
-- 
2.45.2



[PATCH v8 01/12] Improve must tail in RTL backend

2024-06-22 Thread Andi Kleen
- Give error messages for all causes of non sibling call generation
- When giving error messages clear the musttail flag to avoid ICEs
- Error out when tree-tailcall failed to mark a must-tail call
sibcall. In this case it doesn't know the true reason and only gives
a vague message.
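
The idea of replacing one combined condition with separate checks that each
carry their own reason can be sketched outside of GCC as follows. This is only
an illustration under assumptions: CallFacts and tail_call_blocker are
hypothetical names, not GCC types or functions, and only the two reason
strings are taken from the patch below.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the facts examined while expanding a call.
struct CallFacts {
  bool nested_in_other_call;  // the call is an argument of another call
  bool variable_size_args;    // argument area size is not a constant
};

// Returns the first reason that rules out a tail call, or nothing if
// neither of the modelled conditions applies.
std::optional<std::string> tail_call_blocker(const CallFacts &facts)
{
  if (facts.nested_in_other_call)
    return "inside another call";
  if (facts.variable_size_args)
    return "variable size arguments";
  return std::nullopt;
}
```

With one test per condition, a caller can print a specific message instead of
falling back to a generic one.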

PR83324

gcc/ChangeLog:

* calls.cc (maybe_complain_about_tail_call): Clear must tail
flag on error.
(expand_call): Give error messages for all musttail failures.
---
 gcc/calls.cc | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/gcc/calls.cc b/gcc/calls.cc
index 21d78f9779fe..883eb9971257 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -1249,6 +1249,7 @@ maybe_complain_about_tail_call (tree call_expr, const char *reason)
 return;
 
   error_at (EXPR_LOCATION (call_expr), "cannot tail-call: %s", reason);
+  CALL_EXPR_MUST_TAIL_CALL (call_expr) = 0;
 }
 
 /* Fill in ARGS_SIZE and ARGS array based on the parameters found in
@@ -2650,7 +2651,13 @@ expand_call (tree exp, rtx target, int ignore)
   /* The type of the function being called.  */
   tree fntype;
   bool try_tail_call = CALL_EXPR_TAILCALL (exp);
-  bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp);
+  /* tree-tailcall decided not to do tail calls. Error for the musttail case,
+ unfortunately we don't know the reason so it's fairly vague.
+ When tree-tailcall reported an error it already cleared the flag,
+ so this shouldn't really happen unless the
+ musttail pass gave up walking before finding the call.  */
+  if (!try_tail_call)
+  maybe_complain_about_tail_call (exp, "other reasons");
   int pass;
 
   /* Register in which non-BLKmode value will be returned,
@@ -3022,10 +3029,21 @@ expand_call (tree exp, rtx target, int ignore)
  pushed these optimizations into -O2.  Don't try if we're already
  expanding a call, as that means we're an argument.  Don't try if
  there's cleanups, as we know there's code to follow the call.  */
-  if (currently_expanding_call++ != 0
-  || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp))
-  || args_size.var
-  || dbg_cnt (tail_call) == false)
+  if (currently_expanding_call++ != 0)
+{
+  maybe_complain_about_tail_call (exp, "inside another call");
+  try_tail_call = 0;
+}
+  if (!flag_optimize_sibling_calls
+   && !CALL_FROM_THUNK_P (exp)
+   && !CALL_EXPR_MUST_TAIL_CALL (exp))
+try_tail_call = 0;
+  if (args_size.var)
+{
+  maybe_complain_about_tail_call (exp, "variable size arguments");
+  try_tail_call = 0;
+}
+  if (dbg_cnt (tail_call) == false)
 try_tail_call = 0;
 
   /* Workaround buggy C/C++ wrappers around Fortran routines with
@@ -3046,13 +3064,15 @@ expand_call (tree exp, rtx target, int ignore)
if (MEM_P (*iter))
  {
try_tail_call = 0;
+   maybe_complain_about_tail_call (exp,
+   "hidden string length argument passed on stack");
break;
  }
}
 
   /* If the user has marked the function as requiring tail-call
  optimization, attempt it.  */
-  if (must_tail_call)
+  if (CALL_EXPR_MUST_TAIL_CALL (exp))
 try_tail_call = 1;
 
   /*  Rest of purposes for tail call optimizations to fail.  */
-- 
2.45.2



Re: [PATCH 4/7 v2] lto: Implement ltrans cache

2024-06-21 Thread Andi Kleen
On Fri, Jun 21, 2024 at 06:59:05PM +0200, Michal Jireš wrote:
> > The lockfiles scare me a bit. What happens when they get lost, e.g.
> > due to a compiler crash? You may need some recovery for that.
> > Perhaps it would be better to make the files self checking, so that
> > partial files can be detected when reading, and get rid of the locks.
> 
> It uses process-associated locks via fcntl, so if the compiler crashes,
> the locks will be released. If the compiler process crashes and leaves
> partially written file, the lto-wrapper deletes it in tool_cleanup.
> If a file is missing, the cache entry will be deleted.

Sounds good to me.

-Andi
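
For reference, the process-associated locking described in the quoted reply
can be sketched with plain POSIX calls. This is an assumption-laden
illustration, not the code from the ltrans cache patch: write_under_lock and
lock_demo are hypothetical helpers, and the scratch file lives under /tmp.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: takes an exclusive fcntl lock on FD, writes DATA,
// then unlocks again.
bool write_under_lock(int fd, const char *data)
{
  struct flock lock;
  std::memset(&lock, 0, sizeof lock);
  lock.l_type = F_WRLCK;    // exclusive (write) lock
  lock.l_whence = SEEK_SET;
  lock.l_start = 0;
  lock.l_len = 0;           // 0 means up to end of file, i.e. the whole file
  if (fcntl(fd, F_SETLKW, &lock) == -1)  // wait until the lock is granted
    return false;

  std::size_t len = std::strlen(data);
  bool ok = write(fd, data, len) == static_cast<ssize_t>(len);

  lock.l_type = F_UNLCK;    // explicit unlock; the kernel also drops the
                            // lock when the owning process exits
  if (fcntl(fd, F_SETLK, &lock) == -1)
    ok = false;
  return ok;
}

// Creates a scratch file, writes to it under the lock, and cleans up.
bool lock_demo()
{
  char path[] = "/tmp/lock-demo-XXXXXX";
  int fd = mkstemp(path);
  if (fd == -1)
    return false;
  bool ok = write_under_lock(fd, "cache entry\n");
  close(fd);
  unlink(path);
  return ok;
}
```

Because the lock belongs to the process rather than to a file on disk, nothing
stale is left behind after a crash; only a partially written data file can
remain.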


Re: [PATCH 4/7 v2] lto: Implement ltrans cache

2024-06-21 Thread Andi Kleen


FWIW I suspect not handling lockfile errors could be a show stopper
even for an initial implementation.  It's not that uncommon that people
press Ctrl-C. flock on systems that have it would be a safer
alternative.

> There are many things to do and I think it is better to do that in trunk
> rather than accumulating relatively complex changes on a branch.
> md5 is already supported by libiberty so it is kind of easy choice for
> first cut implementation.

At least use sha1 then. This is also in libiberty and it has hardware
acceleration on modern x86.

-Andi


[PING^3] Re: [PATCH v7 1/9] Improve must tail in RTL backend

2024-06-20 Thread Andi Kleen
Andi Kleen  writes:

PING^3 for the musttail patchkit at
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653319.html
(except the C++ patch which got approved) 

Thanks!
-Andi


> Andi Kleen  writes:
>
> PING^2
>
>> Need reviewers for the tree and middle-end parts, as well as the C frontend.
>>
>> Thanks!
>>
>> -Andi


Re: [PATCH 4/7 v2] lto: Implement ltrans cache

2024-06-20 Thread Andi Kleen
Michal Jires  writes:

No performance data?

> +
> +static const md5_checksum_t INVALID_CHECKSUM = {
> +  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> +};

There are much faster, better optimized modern hashes than MD5 that still give
good collision detection, especially when the hash doesn't need to be
cryptographically secure. Pick something from smhasher.

Also perhaps the checksum should be cached in the file? I assume it's
cheap to compute while writing. It could be written at the tail of the
file. Then it can be read by seeking to the end, which saves recomputing
it.

The lockfiles scare me a bit. What happens when they get lost, e.g.
due to a compiler crash? You may need some recovery for that.
Perhaps it would be better to make the files self checking, so that
partial files can be detected when reading, and get rid of the locks.
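
A rough sketch of what a self-checking file with the checksum at the tail
could look like, working on in-memory buffers to keep it short. Everything
here is an assumption for illustration: seal, unseal and fnv1a64 are made-up
names, and FNV-1a is only a simple stand-in, not the hash used in the patch
and not a recommendation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>

// FNV-1a, 64-bit: a simple non-cryptographic hash used as a stand-in.
std::uint64_t fnv1a64(const std::string &bytes)
{
  std::uint64_t h = 0xcbf29ce484222325ULL;  // offset basis
  for (unsigned char c : bytes) {
    h ^= c;
    h *= 0x100000001b3ULL;                  // FNV prime
  }
  return h;
}

// Returns PAYLOAD followed by its 8-byte checksum (host byte order).
std::string seal(const std::string &payload)
{
  std::uint64_t sum = fnv1a64(payload);
  std::string out = payload;
  out.append(reinterpret_cast<const char *>(&sum), sizeof sum);
  return out;
}

// Returns the payload if the trailing checksum matches, nothing otherwise.
std::optional<std::string> unseal(const std::string &file)
{
  if (file.size() < sizeof(std::uint64_t))
    return std::nullopt;                    // too short to hold a checksum
  std::size_t payload_len = file.size() - sizeof(std::uint64_t);
  std::uint64_t stored;
  std::memcpy(&stored, file.data() + payload_len, sizeof stored);
  std::string payload = file.substr(0, payload_len);
  if (fnv1a64(payload) != stored)
    return std::nullopt;                    // partial or corrupted contents
  return payload;
}
```

A reader that gets nothing back from unseal can treat the entry as missing and
regenerate it, without needing a lock to know whether the writer finished.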

-Andi


Re: [PING^2] Re: [PATCH v7 1/9] Improve must tail in RTL backend

2024-06-14 Thread Andi Kleen
Andi Kleen  writes:

PING^2

> Need reviewers for the tree and middle-end parts, as well as the C frontend.
>
> Thanks!
>
> -Andi


Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-06-13 Thread Andi Kleen
Manolis Tsamis  writes:
>
> Assembly like this can appear with bitfields or type punning / unions.
> On stress-ng when running the cpu-union microbenchmark the following speedups
> have been observed.
>
>   Neoverse-N1:      +29.4%
>   Intel Coffeelake: +13.1%
>   AMD 5950X:        +17.5%

It seems this should have some kind of target hook so that the target
can configure what forwards should be avoided. At least in x86 land
there is a trend to the hardware handling more and more cases with each
generation.

Also, is there any data on what this does to code size? Perhaps it should
only be done on hot blocks?

And did you see speedups on real applications?

-Andi
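
For context, the source-level shape that tends to produce such code (a narrow
store followed by a wider load of the same object) can be sketched as below.
replace_first_byte is a hypothetical name, and whether a given compiler really
emits a byte store followed by an 8-byte load depends on the target and the
optimization level; this is only meant to show the pattern.

```cpp
#include <cstdint>

// Overwrites one byte of WORD through a char pointer, then reads the
// whole word back.
std::uint64_t replace_first_byte(std::uint64_t word, std::uint8_t byte)
{
  unsigned char *bytes = reinterpret_cast<unsigned char *>(&word);
  bytes[0] = byte;  // narrow 1-byte store into the object
  return word;      // wide 8-byte load overlapping the store just made
}
```

Which byte of the value changes depends on the host byte order, so callers
should not rely on it being the least significant one.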


[gcc r15-1237] Fix error message

2024-06-12 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:64cd70e315ed2cf0653cfdde96ae80c3f90a07f4

commit r15-1237-g64cd70e315ed2cf0653cfdde96ae80c3f90a07f4
Author: Andi Kleen 
Date:   Wed Jun 12 09:15:47 2024 -0700

Fix error message

gcc/cp/ChangeLog:

* parser.cc (cp_parser_asm_string_expression): Use correct error
message.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-asm-3.C: Adjust for new message.

Diff:
---
 gcc/cp/parser.cc | 2 +-
 gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index adc4e6fc1aee..01a19080d6c7 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -22863,7 +22863,7 @@ cp_parser_asm_string_expression (cp_parser *parser)
   else if (!cp_parser_is_string_literal (tok))
 {
   error_at (tok->location,
-   "expected string-literal or constexpr in brackets");
+   "expected string-literal or constexpr in parentheses");
   return error_mark_node;
 }
   return cp_parser_string_literal (parser, false, false);
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
index ef8a35a0b3ba..0cf8940e109c 100644
--- a/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
@@ -26,7 +26,7 @@ constexpr std::string_view genclobber ()
 void f()
 {
   int a;
-  asm(genfoo () : /* { dg-error "expected string-literal or constexpr in brackets" } */
+  asm(genfoo () : /* { dg-error "expected string-literal or constexpr in parentheses" } */
   genoutput() (a) :
   geninput() (1) :
   genclobber());


[gcc r15-1236] Parse close paren even when constexpr extraction fails

2024-06-12 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:d0379809a45f77d2dedb93a443aa1dd96d13c6e5

commit r15-1236-gd0379809a45f77d2dedb93a443aa1dd96d13c6e5
Author: Andi Kleen 
Date:   Wed Jun 12 09:11:46 2024 -0700

Parse close paren even when constexpr extraction fails

To get better error recovery.
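
The general idea (keep consuming the closing delimiter even when the contents
turn out to be unusable, so the parser resumes at the right token) can be
shown with a toy parser. This is not the C++ front end code: Cursor and
parse_parenthesized_number are hypothetical, and the input is a plain
character string rather than a token stream.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

struct Cursor {
  std::string_view text;
  std::size_t pos = 0;
};

// Parses "(<digits>)". On bad contents it returns nothing, but it still
// consumes through the ')' so the caller resumes at the right place.
std::optional<int> parse_parenthesized_number(Cursor &cur)
{
  if (cur.pos >= cur.text.size() || cur.text[cur.pos] != '(')
    return std::nullopt;
  ++cur.pos;

  std::optional<int> value = 0;
  std::size_t digits = 0;
  while (cur.pos < cur.text.size() && cur.text[cur.pos] != ')') {
    char c = cur.text[cur.pos++];
    if (value && c >= '0' && c <= '9' && digits < 9) {
      value = *value * 10 + (c - '0');
      ++digits;
    } else {
      value.reset();  // remember the failure, keep scanning
    }
  }
  if (digits == 0)
    value.reset();    // empty or non-numeric contents

  if (cur.pos < cur.text.size())
    ++cur.pos;        // consume ')' even when extraction failed
  else
    value.reset();    // the closing ')' is missing
  return value;
}
```

After a failed parse of "(ab)x" the cursor sits on 'x', so only one error is
reported for the bad contents instead of a follow-up error about the stray
parenthesis.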

gcc/cp/ChangeLog:

* parser.cc (cp_parser_asm_string_expression): Parse close
paren when constexpr extraction fails.

Diff:
---
 gcc/cp/parser.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 98e8ca10ac40..adc4e6fc1aee 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -22856,7 +22856,7 @@ cp_parser_asm_string_expression (cp_parser *parser)
   if (!cstr.type_check (tok->location))
return error_mark_node;
   if (!cstr.extract (tok->location, string))
-   return error_mark_node;
+   string = error_mark_node;
   parens.require_close (parser);
   return string;
 }


[gcc r15-1235] Remove const char * support for asm constexpr

2024-06-12 Thread Andi Kleen via Gcc-cvs
https://gcc.gnu.org/g:6f1f1657cd7a8472b4a4aeef60f1c59606ee011b

commit r15-1235-g6f1f1657cd7a8472b4a4aeef60f1c59606ee011b
Author: Andi Kleen 
Date:   Wed Jun 12 09:09:37 2024 -0700

Remove const char * support for asm constexpr

asm constexpr now only accepts the same string types as C++26 static_assert,
e.g. string_view and string. Adjust test suite and documentation.
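
A minimal usage sketch, guarded so that it also builds with compilers that
lack the extension. gen_template and emit_blank_line are made-up names; the
guard macro is the one named in the extend.texi hunk below. A lone newline is
used as the template so the sketch does not depend on any target's
instruction set.

```cpp
#include <string_view>

// Hypothetical generator for the asm template string.
constexpr std::string_view gen_template() { return "\n"; }

void emit_blank_line()
{
#ifdef __GXX_CONSTEXPR_ASM__
  // Constant expression in parentheses, evaluated at compile time.
  asm((gen_template()));
#else
  // Fallback for compilers without asm constexpr support.
  asm("\n");
#endif
}
```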

gcc/cp/ChangeLog:

* parser.cc (cp_parser_asm_string_expression): Remove support
for const char * for asm constexpr.

gcc/ChangeLog:

* doc/extend.texi: Use std::string_view in asm constexpr
example.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-asm-1.C: Use std::string_view.
* g++.dg/cpp1z/constexpr-asm-3.C: Ditto.

Diff:
---
 gcc/cp/parser.cc |  7 ---
 gcc/doc/extend.texi  |  3 ++-
 gcc/testsuite/g++.dg/cpp1z/constexpr-asm-1.C | 12 +++-
 gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C | 12 +++-
 4 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index de5f0483c120..98e8ca10ac40 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -22852,13 +22852,6 @@ cp_parser_asm_string_expression (cp_parser *parser)
   tree string = cp_parser_constant_expression (parser);
   if (string != error_mark_node)
string = cxx_constant_value (string, tf_error);
-  if (TREE_CODE (string) == NOP_EXPR)
-   string = TREE_OPERAND (string, 0);
-  if (TREE_CODE (string) == ADDR_EXPR
- && TREE_CODE (TREE_OPERAND (string, 0)) == STRING_CST)
-   string = TREE_OPERAND (string, 0);
-  if (TREE_CODE (string) == VIEW_CONVERT_EXPR)
-   string = TREE_OPERAND (string, 0);
   cexpr_str cstr (string);
   if (!cstr.type_check (tok->location))
return error_mark_node;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 17e26c5004c1..ee3644a52645 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -10716,7 +10716,8 @@ message. Any string is converted to the character set of the source code.
 When this feature is available the @code{__GXX_CONSTEXPR_ASM__} cpp symbol is defined.
 
 @example
-constexpr const char *genfoo() @{ return "foo"; @}
+#include <string_view>
+constexpr std::string_view genfoo() @{ return "foo"; @}
 
 void function()
 @{
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-1.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-1.C
index 7cc6b37d6208..311209acb43b 100644
--- a/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-1.C
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-1.C
@@ -1,22 +1,24 @@
 /* { dg-do compile } */
-/* { dg-options "-std=gnu++11" } */
+/* { dg-options "-std=gnu++17" } */
 
-constexpr const char *genfoo ()
+#include <string_view>
+
+constexpr std::string_view genfoo ()
 {
   return "foo %1,%0";
 }
 
-constexpr const char *genoutput ()
+constexpr std::string_view genoutput ()
 {
   return "=r";
 }
 
-constexpr const char *geninput ()
+constexpr std::string_view geninput ()
 {
   return "r";
 }
 
-constexpr const char *genclobber ()
+constexpr std::string_view genclobber ()
 {
   return "memory";
 }
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
index d33631876bdc..ef8a35a0b3ba 100644
--- a/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
@@ -1,22 +1,24 @@
 /* { dg-do compile } */
-/* { dg-options "-std=gnu++11" } */
+/* { dg-options "-std=gnu++17" } */
 
-constexpr const char *genfoo ()
+#include <string_view>
+
+constexpr std::string_view genfoo ()
 {
   return "foo %1,%0";
 }
 
-constexpr const char *genoutput ()
+constexpr std::string_view genoutput ()
 {
   return "=r";
 }
 
-constexpr const char *geninput ()
+constexpr std::string_view geninput ()
 {
   return "r";
 }
 
-constexpr const char *genclobber ()
+constexpr std::string_view genclobber ()
 {
   return "memory";
 }


[PATCH 3/3] Fix error message

2024-06-12 Thread Andi Kleen
gcc/cp/ChangeLog:

* parser.cc (cp_parser_asm_string_expression): Use correct error
message.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-asm-3.C: Adjust for new message.
---
 gcc/cp/parser.cc | 2 +-
 gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index adc4e6fc1aee..01a19080d6c7 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -22863,7 +22863,7 @@ cp_parser_asm_string_expression (cp_parser *parser)
   else if (!cp_parser_is_string_literal (tok))
 {
   error_at (tok->location,
-   "expected string-literal or constexpr in brackets");
+   "expected string-literal or constexpr in parentheses");
   return error_mark_node;
 }
   return cp_parser_string_literal (parser, false, false);
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
index ef8a35a0b3ba..0cf8940e109c 100644
--- a/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-asm-3.C
@@ -26,7 +26,7 @@ constexpr std::string_view genclobber ()
 void f()
 {
   int a;
-  asm(genfoo () : /* { dg-error "expected string-literal or constexpr in brackets" } */
+  asm(genfoo () : /* { dg-error "expected string-literal or constexpr in parentheses" } */
   genoutput() (a) :
   geninput() (1) :
   genclobber());
-- 
2.45.1


