date:20200727

[PATCH, rs6000] Add non-relative jump table support for 64bit rs6000

2020-07-27 Thread HAO CHEN GUI via Gcc-patches


Hi,

This patch adds non-relative jump table support for 64-bit rs6000. It
implements ASM_OUTPUT_ADDR_VEC_ELT and the corresponding expansion of the
jump table instruction for 64-bit rs6000. We want to put the non-relative
jump table in the data.rel.ro section on rs6000, so I added a new target
hook, TARGET_ASM_SELECT_JUMPTABLE_SECTION, to choose which section the
jump table is placed in. It puts the jump table in data.rel.ro for 64-bit
rs6000. On other platforms it calls
targetm.asm_out.function_rodata_section, just as before. We also want an
option to use non-relative jump tables even when the PIC flag is set, so I
added a new rs6000 option, "mforce-nonrelative-jumptables". The feature is
enabled by default. The option takes effect in the target hook
TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC.


static bool
rs6000_gen_pic_addr_diff_vec (void)
{
  return TARGET_32BIT || !rs6000_force_nonrelative_jumptables;
}
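
As a rough illustration of the two table styles discussed above (this is not code from the patch), the following C sketch contrasts a table of full addresses with a table of offsets from a base. The handler functions and table names are hypothetical, and the relative table is filled in at run time only because portable C cannot express an address difference as a static initializer; a compiler would emit those offsets directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handlers standing in for the case labels of a switch.  */
static int case_zero (void) { return 10; }
static int case_one (void) { return 11; }
static int case_two (void) { return 12; }

typedef int (*handler_fn) (void);

#define N_CASES 3

/* Non-relative table: every entry is a full address.  In position
   independent code each entry needs a dynamic relocation, which is why
   such a table suits a section like .data.rel.ro.  */
static const handler_fn absolute_table[N_CASES]
  = { case_zero, case_one, case_two };

/* Relative table: every entry is an offset from a base address, so the
   table itself needs no relocations.  */
static intptr_t relative_table[N_CASES];

void
init_relative_table (void)
{
  /* The integer value of a function pointer is implementation-defined;
     this is for illustration only.  */
  intptr_t base = (intptr_t) absolute_table[0];
  for (size_t i = 0; i < N_CASES; i++)
    relative_table[i] = (intptr_t) absolute_table[i] - base;
}

/* Dispatch through the table of full addresses.  */
int
dispatch_absolute (unsigned idx)
{
  if (idx >= N_CASES)
    return -1;
  return absolute_table[idx] ();
}

/* Dispatch by adding the stored offset to the base address.  */
int
dispatch_relative (unsigned idx)
{
  if (idx >= N_CASES)
    return -1;
  intptr_t base = (intptr_t) absolute_table[0];
  handler_fn target = (handler_fn) (base + relative_table[idx]);
  return target ();
}
```

The non-relative dispatch saves the add at each jump, at the cost of one relocation per table entry when the code is position independent.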

The attachments are the patch diff file and change log file.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  
Is this okay for trunk? Any recommendations? Thanks a lot.


2019-11-30  Haochen Gui  

* config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Redefine.
* config/rs6000/rs6000.c 
(TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC,TARGET_ASM_SELECT_JUMPTABLE_SECTION): 
Implement two hooks.
* config/rs6000/rs6000.h 
(CASE_VECTOR_PC_RELATIVE,CASE_VECTOR_MODE): Redefine.
* config/rs6000/rs6000.md (nonrelative_tablejumpdi, 
nonrelative_tablejumpdi_nospec): Add two expansions.
* config/rs6000/rs6000.opt (mforce-nonrelative-jumptables): Add 
a new option for rs6000.
* doc/tm.texi.in (TARGET_ASM_SELECT_JUMPTABLE_SECTION): Likewise.
* doc/tm.texi: Regenerate.
* final.c (targetm.asm_out.select_jumptable_section): Replace 
targetm.asm_out.function_rodata_section with 
targetm.asm_out.select_jumptable_section in jumptable section selection.
* output.h (default_select_jumptable_section): Declare.
* target.def (default_select_jumptable_section): Likewise.
* varasm.c (default_select_jumptable_section): Implementation.

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 34776c8421e..e9a1fed43cd 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -324,7 +324,7 @@ extern int dot_symbols;
 
 /* Indicate that jump tables go in the text section.  */
 #undef  JUMP_TABLES_IN_TEXT_SECTION
-#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
+#define JUMP_TABLES_IN_TEXT_SECTION (TARGET_64BIT && flag_pic && 
!rs6000_force_nonrelative_jumptables)
 
 /* The linux ppc64 ABI isn't explicit on whether aggregates smaller
than a doubleword should be padded upward or downward.  You could
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 58f5d780603..d0c0ac6529a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1369,6 +1369,12 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #undef TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA
 #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA rs6000_output_addr_const_extra
 
+#undef  TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC
+#define TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC rs6000_gen_pic_addr_dif_vec
+
+#undef TARGET_ASM_SELECT_JUMPTABLE_SECTION
+#define TARGET_ASM_SELECT_JUMPTABLE_SECTION rs6000_select_jumptable_section
+
 #undef TARGET_LEGITIMIZE_ADDRESS
 #define TARGET_LEGITIMIZE_ADDRESS rs6000_legitimize_address
 
@@ -26494,6 +26500,68 @@ rs6000_cannot_substitute_mem_equiv_p (rtx mem)
   return false;
 }
 
+/* Implement TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC. Return false if 
rs6000_force_nonrelative_jumptables is set and target is 64bit. */
+
+static bool
+rs6000_gen_pic_addr_dif_vec (void)
+{
+  return !rs6000_force_nonrelative_jumptables;
+}
+
+/* Implement TARGET_ASM_SELECT_JUMPTABLE_SECTION. Put non-relative jump table 
in data.rel.ro section */
+
+section *
+rs6000_select_jumptable_section (tree decl)
+{
+  if (TARGET_32BIT)
+return default_function_rodata_section (decl);
+
+  if (decl != NULL_TREE && DECL_SECTION_NAME (decl))
+{
+  const char *name = DECL_SECTION_NAME (decl);
+
+  if (DECL_COMDAT_GROUP (decl) && HAVE_COMDAT_GROUP)
+{
+  const char *dot;
+  size_t len;
+  char* rname;
+
+  dot = strchr (name + 1, '.');
+  if (!dot)
+dot = name;
+  len = strlen (dot) + 13;
+  rname = (char *) alloca (len);
+
+  strcpy (rname, ".data.rel.ro");
+  strcat (rname, dot);
+  return get_section (rname, SECTION_WRITE | SECTION_RELRO | 
SECTION_LINKONCE, decl);
+}
+/* For .gnu.linkonce.t.foo we want to use .gnu.linkonce.r.foo.  */
+ else if (DECL_COMDAT_GROUP (decl)
+   && strncmp (name, ".gnu.linkonce.t.", 16) == 0)
+{
+  size_t len = strlen (name) + 1;
+  char *rname = (char *) alloca (len);
+
+  memcpy

RE: [PATCH] RISC-V: Add ZFINX support

2020-07-27 Thread wangtao (CH)

Hi Jim,

Thanks for your comments. We will try to solve the copyright problem and send 
the patch as soon as possible.

Thanks,
Tao Wang


> -----Original Message-----
> From: Jim Wilson [mailto:j...@sifive.com]
> Sent: Tuesday, July 28, 2020 6:36 AM
> To: wangtao (CH) 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] RISC-V: Add ZFINX support
> 
> On Sun, Jul 26, 2020 at 11:40 PM wangtao (CH) wrote:
> > This is the patch to support ZFINX of RISC-V, which option is like
> > -march=rv32gc_zfinx. The ZFINX means f-registers in x-registers under RV-F
> > and RV-D extension. For more details, please refer to
> > https://github.com/riscv/riscv-zfinx/blob/master/Zfinx_spec.adoc.
> > This patch mainly adds the ZFINX option and abi constraints, and when it’s
> > under ZFINX, makes the f-registers as FIXED_REGs to avoid allocating
> > f-registers to pseudo registers.
> > And for binutils support, it has been done and I will send it to
> > binutils-gdb community to review later.
> 
> Normally I'd expect to see the binutils patch first, since the gcc patch
> can't be tested without the binutils patch.  Looking at FSF Copyright
> assignments, I see that Huawei has corporate assignments for gcc and glibc,
> but I don't see one for binutils.  If Huawei is writing the binutils patch,
> and we can't accept the binutils patch due to a missing copyright
> assignment, then that makes the gcc patch mostly useless.
> 
> Current convention is that we only accept patches for ratified extensions,
> and zfinx is not ratified yet.  It is still a proposed extension that may
> change in incompatible ways before it is ratified.
> It is good to have binutils/gcc patches so that we can test it, but
> they can't be on the master branch with current conventions.  We can
> put them on a vendor branch in the FSF GCC tree.  Or we can put them on a
> branch in the github.com/riscv trees.  We do still need a copyright assignment
> from Huawei before we can use the github.com/riscv trees though, to avoid
> contaminating those trees with patches that can't be upstreamed.
> 
> I haven't tried reviewing the patch yet.  I took most of last week off, so
> this is now on my to do list and hopefully I can get to it soon.
> 
> Jim

Re: [PATCH v2] [RISC-V] Add support for TLS stack protector canary access

2020-07-27 Thread Kito Cheng via Gcc-patches

Adding the testcase later is OK with me.

On Tue, Jul 28, 2020 at 6:55 AM Jim Wilson  wrote:
>
> On Sun, Jul 19, 2020 at 7:04 PM cooper  wrote:
> > Ping
> >
> > On 2020/7/13 4:15 PM, cooper wrote:
> > > gcc/
> > >   * config/riscv/riscv-opts.h (stack_protector_guard): New enum.
> > >   * config/riscv/riscv.c (riscv_option_override): Handle
> > >   the new options.
> > >   * config/riscv/riscv.md (stack_protect_set): New pattern to handle
> > >   flexible stack protector guard settings.
> > >   (stack_protect_set_): Ditto.
> > >   (stack_protect_test): Ditto.
> > >   (stack_protect_test_): Ditto.
> > >   * config/riscv/riscv.opt (mstack-protector-guard=,
> > >   mstack-protector-guard-reg=, mstack-protector-guard-offset=): New
> > >   options.
> > >   * doc/invoke.texi (Option Summary) [RISC-V Options]:
> > >   Add -mstack-protector-guard=, -mstack-protector-guard-reg=, and
> > >   -mstack-protector-guard-offset=.
> > >   (RISC-V Options): Ditto.
>
> The v2 patch looks fine to me.  Meanwhile, Kito asked for testcases
> which would be nice to have but I don't think is critical considering
> that this has already been tested with a kernel build.  Maybe the
> testcases can be a follow on patch?  I'd like to see forward movement
> on this, even if we accept a patch without the testcases.
>
> Jim

Re: [PATCH] MIPS: Fix __builtin_longjmp (PR 64242)

2020-07-27 Thread Paul Hua via Gcc-patches

ping?

On Sun, Jul 12, 2020 at 2:27 PM Paul Hua  wrote:
>
> From 589dbe8a1c2397bfafefa4e84abe5ec6e6798928 Mon Sep 17 00:00:00 2001
> From: Andrew Pinski 
> Date: Wed, 12 Feb 2020 11:42:57 +
> Subject: [PATCH] MIPS: Fix __builtin_longjmp (PR 64242)
>
> The problem here is mips has its own builtin_longjmp
> pattern and it was not fixed when expand_builtin_longjmp
> was fixed.  We need to read the new fp and gp before
> restoring the stack as the buffer might be a local
> variable.
>
> Change-Id: I3416568e260e6bde3ad5cc470fb4f2ecfa207f05
> Signed-off-by: Andrew Pinski 
>
> This patch is from Andrew; I bootstrapped and tested it on mips64el-linux-gnu.
>
> OK for master ?
>
> gcc/ChangeLog:
>
> PR middle-end/64242
> * config/mips/mips.md (builtin_longjmp): Restore the frame pointer
>and stack pointer and gp.

Go patch committed: Pass only ptr and len to some runtime calls

2020-07-27 Thread Ian Lance Taylor via Gcc-patches

This patch changes the Go frontend and the libgo runtime package to
pass only ptr and len to some runtime calls, rather than passing an
entire slice.  This ports https://golang.org/cl/227163 to the Go
frontend.  This is a step toward moving up to the go1.15rc1 release.

The original change says "Some runtime calls accept a slice, but only
use ptr and len.  This change modifies most such routines to accept
only ptr and len."

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
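
As a rough C analogy of the calling-convention change described above (this is not code from the Go frontend or libgo), the sketch below contrasts a routine that takes a whole slice header with one that takes only the pointer and length it actually uses. The `struct go_slice` layout and both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical three-word slice header: data pointer, length, capacity.  */
struct go_slice
{
  const unsigned char *ptr;
  ptrdiff_t len;
  ptrdiff_t cap;
};

/* Before: the callee receives the whole header although it never looks
   at the capacity.  */
size_t
count_nonzero_slice (struct go_slice s)
{
  size_t n = 0;
  for (ptrdiff_t i = 0; i < s.len; i++)
    if (s.ptr[i] != 0)
      n++;
  return n;
}

/* After: the caller extracts ptr and len and passes just those two.  */
size_t
count_nonzero (const unsigned char *ptr, ptrdiff_t len)
{
  size_t n = 0;
  for (ptrdiff_t i = 0; i < len; i++)
    if (ptr[i] != 0)
      n++;
  return n;
}
```

Both versions compute the same result; the second simply stops passing the unused capacity word.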

Ian
108fdcc56ee49dd7dc8314ce5022191f406a125f
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 12e48c19932..64a655e911e 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-8b9c7fb00ccaf1d4bcc8d581a1a4d46a35771b77
+63bc2430187efe5ff47e9c7b9cd6d40b350ee7d7
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 327f9403b39..90f860bd735 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -4157,45 +4157,43 @@ 
Type_conversion_expression::do_get_backend(Translate_context* context)
   go_assert(e->integer_type() != NULL);
   go_assert(this->expr_->is_variable());
 
-  Runtime::Function code;
+  Expression* buf;
+  if (this->no_escape_ && !this->no_copy_)
+{
+  Type* byte_type = Type::lookup_integer_type("uint8");
+  Expression* buflen =
+Expression::make_integer_ul(tmp_string_buf_size, NULL, loc);
+  Type* array_type = Type::make_array_type(byte_type, buflen);
+  buf = Expression::make_allocation(array_type, loc);
+  buf->allocation_expression()->set_allocate_on_stack();
+  buf->allocation_expression()->set_no_zero();
+}
+  else
+buf = Expression::make_nil(loc);
+
   if (e->integer_type()->is_byte())
 {
+ Expression* ptr =
+   Expression::make_slice_info(this->expr_, SLICE_INFO_VALUE_POINTER,
+   loc);
+ Expression* len =
+   Expression::make_slice_info(this->expr_, SLICE_INFO_LENGTH, loc);
   if (this->no_copy_)
 {
   if (gogo->debug_optimization())
 go_debug(loc, "no copy string([]byte)");
-  Expression* ptr = Expression::make_slice_info(this->expr_,
-
SLICE_INFO_VALUE_POINTER,
-loc);
-  Expression* len = Expression::make_slice_info(this->expr_,
-SLICE_INFO_LENGTH,
-loc);
   Expression* str = Expression::make_string_value(ptr, len, loc);
   return str->get_backend(context);
 }
-  code = Runtime::SLICEBYTETOSTRING;
+ return Runtime::make_call(Runtime::SLICEBYTETOSTRING, loc, 3, buf,
+   ptr, len)->get_backend(context);
 }
   else
 {
   go_assert(e->integer_type()->is_rune());
-  code = Runtime::SLICERUNETOSTRING;
-}
-
-  Expression* buf;
-  if (this->no_escape_)
-{
-  Type* byte_type = Type::lookup_integer_type("uint8");
-  Expression* buflen =
-Expression::make_integer_ul(tmp_string_buf_size, NULL, loc);
-  Type* array_type = Type::make_array_type(byte_type, buflen);
-  buf = Expression::make_allocation(array_type, loc);
-  buf->allocation_expression()->set_allocate_on_stack();
-  buf->allocation_expression()->set_no_zero();
-}
-  else
-buf = Expression::make_nil(loc);
-  return Runtime::make_call(code, loc, 2, buf,
-   this->expr_)->get_backend(context);
+ return Runtime::make_call(Runtime::SLICERUNETOSTRING, loc, 2, buf,
+   this->expr_)->get_backend(context);
+   }
 }
   else if (type->is_slice_type() && expr_type->is_string_type())
 {
@@ -8397,8 +8395,16 @@ Builtin_call_expression::do_flatten(Gogo* gogo, 
Named_object* function,
 if (et->has_pointer())
   {
 Expression* td = Expression::make_type_descriptor(et, loc);
+   Expression* pd =
+ Expression::make_slice_info(arg1, SLICE_INFO_VALUE_POINTER, loc);
+   Expression* ld =
+ Expression::make_slice_info(arg1, SLICE_INFO_LENGTH, loc);
+   Expression* ps =
+ Expression::make_slice_info(arg2, SLICE_INFO_VALUE_POINTER, loc);
+   Expression* ls =
+ Expression::make_slice_info(arg2, SLICE_INFO_LENGTH, loc);
 ret = Runtime::make_call(Runtime::TYPEDSLICECOPY, loc,
- 3, td, arg1,

Go patch committed: turn global "a = b; b = x" to "a = x" when possible

2020-07-27 Thread Ian Lance Taylor via Gcc-patches

This patch to the Go frontend changes package-scope "a = b; b = x" to
just set "a = x".  This avoids requiring an init function to
initialize the variable.  This can only be done if x is a static
initializer.

The go1.15rc1 runtime package relies on this optimization.  The
package has a variable "var maxSearchAddr = maxOffAddr".  The
maxSearchAddr variable is used by code that runs before package
initialization is complete.
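
The forwarding rule can be sketched in C with a toy model of variable initializers. This is only an illustration of the idea, not the Go frontend's data structures: the `struct init` and `struct var` types and the `forward_static_init` helper are hypothetical, and only a constant counts as a static initializer here.

```c
#include <stdbool.h>
#include <stddef.h>

struct var;

enum init_kind { INIT_NONE, INIT_CONST, INIT_VAR_REF, INIT_DYNAMIC };

/* Toy initializer: nothing, a constant, a reference to another
   variable, or something that must run at init time.  */
struct init
{
  enum init_kind kind;
  int value;             /* Used when kind == INIT_CONST.  */
  const struct var *ref; /* Used when kind == INIT_VAR_REF.  */
};

struct var
{
  struct init init;
  bool has_pre_init;     /* Other code must run before this variable is set.  */
};

static bool
is_static_initializer (const struct init *i)
{
  return i->kind == INIT_CONST;
}

/* For "a = b; b = x": while the initializer merely names another
   variable whose own initializer is static, use that initializer.  */
const struct init *
forward_static_init (const struct init *init)
{
  while (init->kind == INIT_VAR_REF)
    {
      const struct var *target = init->ref;
      const struct init *target_init = &target->init;
      if (target->has_pre_init
          || target_init->kind == INIT_NONE
          || !is_static_initializer (target_init))
        break;
      init = target_init;
    }
  return init;
}
```

When the loop stops on a variable reference, the variable keeps its original initializer and still needs the init function.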

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
2b29792a4c40cde1f5a5d0858234162258d7c1b3
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 0fa32a43489..12e48c19932 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-e86f2cb5d6b1984fde345d6ade605e377fa38c04
+8b9c7fb00ccaf1d4bcc8d581a1a4d46a35771b77
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc
index c1021e5679c..4c8c55fcb14 100644
--- a/gcc/go/gofrontend/gogo.cc
+++ b/gcc/go/gofrontend/gogo.cc
@@ -1622,16 +1622,31 @@ Gogo::write_globals()
   // The initializer is constant if it is the zero-value of the
   // variable's type or if the initial value is an immutable value
   // that is not copied to the heap.
-  bool is_static_initializer = false;
-  if (var->init() == NULL)
+ Expression* init = var->init();
+
+ // If we see "a = b; b = x", and x is a static
+ // initializer, just set a to x.
+ while (init != NULL && init->var_expression() != NULL)
+   {
+ Named_object* ino = init->var_expression()->named_object();
+ if (!ino->is_variable() || ino->package() != NULL)
+   break;
+ Expression* ino_init = ino->var_value()->init();
+ if (ino->var_value()->has_pre_init()
+ || ino_init == NULL
+ || !ino_init->is_static_initializer())
+   break;
+ init = ino_init;
+   }
+
+  bool is_static_initializer;
+  if (init == NULL)
 is_static_initializer = true;
   else
 {
   Type* var_type = var->type();
-  Expression* init = var->init();
-  Expression* init_cast =
-  Expression::make_cast(var_type, init, var->location());
-  is_static_initializer = init_cast->is_static_initializer();
+  init = Expression::make_cast(var_type, init, 
var->location());
+  is_static_initializer = init->is_static_initializer();
 }
 
  // Non-constant variable initializations might need to create
@@ -1650,7 +1665,15 @@ Gogo::write_globals()
 }
  var_init_fn = init_fndecl;
}
-  Bexpression* var_binit = var->get_init(this, var_init_fn);
+
+ Bexpression* var_binit;
+ if (init == NULL)
+   var_binit = NULL;
+ else
+   {
+ Translate_context context(this, var_init_fn, NULL, NULL);
+ var_binit = init->get_backend(&context);
+   }
 
   if (var_binit == NULL)
;

gcc.dg/torture/pr39074-2.c, pr39074.c, pta-callused-1.c: Adjust for mmix.

2020-07-27 Thread Hans-Peter Nilsson

Committed.

These FAILs for mmix showed up as regressions for me due to a
flaw in the btest-gcc.sh test-results-accounting: a bug was
recently fixed regarding the naming of dump-files so the names
are again correct.  To wit, parts of the tests that were
UNRESOLVED, due to missing dump-files, and ignored in the
presence of other parts (execution, excess errors) PASSing,
became FAIL, trumping the PASSing parts of the tests.

As in a recent patch, the variables are "privatized" using
ASM_PN_FORMAT for MMIX and the lines to match look like:
 y::0_1 = { i }
 y::0_1, points-to NULL, points-to vars: { D.1465 } (nonlocal, escaped)
instead of e.g. for cris-elf:
 y.0_1 = { i }
 y.0_1, points-to NULL, points-to vars: { D.1433 } (nonlocal, escaped)
Also checked that the general pattern still matches for cris-elf.
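
The effect of the added optional character can be checked outside the testsuite. The sketch below uses POSIX extended regular expressions rather than the Tcl regexps that DejaGnu actually uses, so it is only an approximation; the `dump_line_matches` helper and the two pattern constants are hypothetical, and the braces are escaped because POSIX EREs treat them specially.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Old pattern: exactly two arbitrary characters between "y" and "_".  */
static const char old_pattern[] = "y.._. = \\{ i \\}";

/* New pattern: one extra optional character, so "y::0_1" also fits.  */
static const char new_pattern[] = "y.?.._. = \\{ i \\}";

/* Return true if LINE contains a match for the extended regular
   expression PATTERN; a pattern that fails to compile matches nothing.  */
bool
dump_line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

With these patterns, the cris-elf style line matches both the old and the new pattern, while the MMIX style line matches only the new one.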

gcc/testsuite:
* gcc.dg/torture/pr39074-2.c: Adjust for mmix.
* gcc.dg/torture/pr39074.c, gcc.dg/torture/pta-callused-1.c: Ditto.

--- gcc/gcc/testsuite/gcc.dg/torture/pr39074-2.c.orig   Tue Jul 28 01:04:26 2020
+++ gcc/gcc/testsuite/gcc.dg/torture/pr39074-2.c  Tue Jul 28 01:04:27 2020
@@ -30,5 +30,5 @@ int main()
   return 0;
 }

-/* { dg-final { scan-tree-dump "y.._. = { i }" "alias" } } */
-/* { dg-final { scan-tree-dump "y.._., points-to NULL, points-to vars: { 
D. }" "alias" } } */
+/* { dg-final { scan-tree-dump "y.\?.._. = { i }" "alias" } } */
+/* { dg-final { scan-tree-dump "y.\?.._., points-to NULL, points-to vars: { 
D. }" "alias" } } */
--- gcc/gcc/testsuite/gcc.dg/torture/pr39074.c.orig Tue Jul 28 01:12:39 2020
+++ gcc/gcc/testsuite/gcc.dg/torture/pr39074.c  Tue Jul 28 01:13:14 2020
@@ -29,5 +29,5 @@ int main()
   return 0;
 }

-/* { dg-final { scan-tree-dump "y.._. = { i }" "alias" } } */
-/* { dg-final { scan-tree-dump "y.._., points-to NULL, points-to vars: { 
D. }" "alias" } } */
+/* { dg-final { scan-tree-dump "y.\?.._. = { i }" "alias" } } */
+/* { dg-final { scan-tree-dump "y.\?.._., points-to NULL, points-to vars: { 
D. }" "alias" } } */
--- gcc/gcc/testsuite/gcc.dg/torture/pta-callused-1.c.orig  Tue Jul 28 01:27:36 2020
+++ gcc/gcc/testsuite/gcc.dg/torture/pta-callused-1.c   Tue Jul 28 01:27:55 2020
@@ -21,4 +21,4 @@ int main()
   return 0;
 }

-/* { dg-final { scan-tree-dump "p.._. = { i j }" "alias" } } */
+/* { dg-final { scan-tree-dump "p.\?.._. = { i j }" "alias" } } */

Re: [PATCH] aarch64: Delete duplicated option docs.

2020-07-27 Thread Jim Wilson

Ping.  ccing the aarch64 maintainers.  If I don't get a response, I
will just commit this as obvious.

Jim

On Sun, Jul 12, 2020 at 4:52 PM Jim Wilson  wrote:
>
> Noticed while reviewing the RISC-V -mstack-protector-guard docs.  The
> AArch64 section has two identical copies of the docs for this option.
>
> * doc/invoke.texi (AArch64 Options): Delete duplicate
> -mstack-protector-guard docs.
> ---
>  gcc/doc/invoke.texi | 18 --
>  1 file changed, 18 deletions(-)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 09bcc5b0f78..2d5803a781b 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -17126,24 +17126,6 @@ and from what offset from that base register. There 
> is no default
>  register or offset as this is entirely for use within the Linux
>  kernel.
>
> -@item -mstack-protector-guard=@var{guard}
> -@itemx -mstack-protector-guard-reg=@var{reg}
> -@itemx -mstack-protector-guard-offset=@var{offset}
> -@opindex mstack-protector-guard
> -@opindex mstack-protector-guard-reg
> -@opindex mstack-protector-guard-offset
> -Generate stack protection code using canary at @var{guard}.  Supported
> -locations are @samp{global} for a global canary or @samp{sysreg} for a
> -canary in an appropriate system register.
> -
> -With the latter choice the options
> -@option{-mstack-protector-guard-reg=@var{reg}} and
> -@option{-mstack-protector-guard-offset=@var{offset}} furthermore specify
> -which system register to use as base register for reading the canary,
> -and from what offset from that base register. There is no default
> -register or offset as this is entirely for use within the Linux
> -kernel.
> -
>  @item -mtls-dialect=desc
>  @opindex mtls-dialect=desc
>  Use TLS descriptors as the thread-local storage mechanism for dynamic 
> accesses
> --
> 2.17.1
>

Re: [PATCH v2] [RISC-V] Add support for TLS stack protector canary access

2020-07-27 Thread Jim Wilson

On Sun, Jul 19, 2020 at 7:04 PM cooper  wrote:
> Ping
>
> On 2020/7/13 4:15 PM, cooper wrote:
> > gcc/
> >   * config/riscv/riscv-opts.h (stack_protector_guard): New enum.
> >   * config/riscv/riscv.c (riscv_option_override): Handle
> >   the new options.
> >   * config/riscv/riscv.md (stack_protect_set): New pattern to handle
> >   flexible stack protector guard settings.
> >   (stack_protect_set_): Ditto.
> >   (stack_protect_test): Ditto.
> >   (stack_protect_test_): Ditto.
> >   * config/riscv/riscv.opt (mstack-protector-guard=,
> >   mstack-protector-guard-reg=, mstack-protector-guard-offset=): New
> >   options.
> >   * doc/invoke.texi (Option Summary) [RISC-V Options]:
> >   Add -mstack-protector-guard=, -mstack-protector-guard-reg=, and
> >   -mstack-protector-guard-offset=.
> >   (RISC-V Options): Ditto.

The v2 patch looks fine to me.  Meanwhile, Kito asked for testcases
which would be nice to have but I don't think is critical considering
that this has already been tested with a kernel build.  Maybe the
testcases can be a follow on patch?  I'd like to see forward movement
on this, even if we accept a patch without the testcases.

Jim

Re: [PATCH] RISC-V: Add ZFINX support

2020-07-27 Thread Jim Wilson

On Sun, Jul 26, 2020 at 11:40 PM wangtao (CH)  wrote:
> This is the patch to support ZFINX of RISC-V, which option is like 
> -march=rv32gc_zfinx. The ZFINX means f-registers in x-registers under RV-F 
> and RV-D extension. For more details, please refer to 
> https://github.com/riscv/riscv-zfinx/blob/master/Zfinx_spec.adoc.
> This patch mainly adds the ZFINX option and abi constraints, and when it’s 
> under ZFINX, makes the f-registers as FIXED_REGs to avoid allocating 
> f-registers to pseudo registers.
> And for binutils support, it has been done and I will send it to binutils-gdb 
> community to review later.

Normally I'd expect to see the binutils patch first, since the gcc
patch can't be tested without the binutils patch.  Looking at FSF
Copyright assignments, I see that Huawei has corporate assignments for
gcc and glibc, but I don't see one for binutils.  If Huawei is writing
the binutils patch, and we can't accept the binutils patch due to a
missing copyright assignment, then that makes the gcc patch mostly
useless.

Current convention is that we only accept patches for ratified
extensions, and zfinx is not ratified yet.  It is still a proposed
extension that may change in incompatible ways before it is ratified.
It is good to have binutils/gcc patches so that we can test it, but
they can't be on the master branch with current conventions.  We can
put them on a vendor branch in the FSF GCC tree.  Or we can put them
on a branch in the github.com/riscv trees.  We do still need a
copyright assignment from Huawei before we can use the
github.com/riscv trees though, to avoid contaminating those trees with
patches that can't be upstreamed.

I haven't tried reviewing the patch yet.  I took most of last week
off, so this is now on my to do list and hopefully I can get to it
soon.

Jim

Re: [PATCH] ipa/96291: don't crash on unoptimized lto functions

2020-07-27 Thread Sergei Trofimovich via Gcc-patches

On Mon, 27 Jul 2020 14:41:14 +0200
Martin Jambor  wrote:

> Hi,
> 
> On Mon, Jul 27 2020, Richard Biener via Gcc-patches wrote:
> > On Sat, Jul 25, 2020 at 8:35 PM Sergei Trofimovich via Gcc-patches
> >  wrote:  
> >>
> >> From: Sergei Trofimovich 
> >>
> >> In PR ipa/96291 the test contained an SCC with one
> >> unoptimized function. This tricked ipa-cp into NULL dereference.
> >>
> >> has_undead_caller_from_outside_scc_p() did not take into account
> >> that unoptimized functions don't have IPA summary analysis, and
> >> dereferenced a NULL pointer, causing an ICE.
> >
> > Can you create a single-unit testcase with a SCC with one function
> > having the no_ipa attribute?  
> 
> This bug is LTO specific because otherwise a summary (although marked as
> quite useless) will be left over from the summary building stage.

Yeah, I was not able to shrink the example even down to 2 files.

> So Sergei, if you can afford to spend an extra while providing a
> testcase, you'll need to add three files into gcc/testsuite/gcc.dg/lto,
> with either the second or third (numbered _1 or _2) having

Sounds good! I tried it as:
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550817.html

> /* { dg-lto-options { { -flto -O0 } } } */

I used "/* { dg-options {-O0} } */" looking at pr88077_1.c which uses
"/* { dg-options {-fcommon} } */".  But only now looked closer at your
suggestion.

Should I resend with "/* { dg-lto-options { { -flto -O0 } } } */"? I need
a bit of help to understand the difference :)

-- 

  Sergei

[PATCH v2] ipa/96291: don't crash on unoptimized lto functions

2020-07-27 Thread Sergei Trofimovich via Gcc-patches

From: Sergei Trofimovich 

In PR ipa/96291 the test contained an SCC with one
unoptimized function. This tricked ipa-cp into NULL dereference.

has_undead_caller_from_outside_scc_p() did not take into account
that unoptimized functions don't have IPA summary analysis, and
dereferenced a NULL pointer, causing an ICE.
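
The shape of the fix can be sketched in plain C. The types and names below are hypothetical stand-ins rather than GCC's cgraph or IPA structures: a caller without a summary is treated as undead instead of being dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function summary produced by the IPA analysis.  */
struct node_summary
{
  bool node_dead;
};

/* Hypothetical caller node; SUMMARY is NULL for a function that was
   compiled without optimization and therefore never analyzed.  */
struct caller_node
{
  const struct node_summary *summary;
};

/* Return true if CALLER, reached through an edge that leaves the SCC,
   must be treated as a live ("undead") caller.  */
bool
caller_is_undead_outside_scc (const struct caller_node *caller,
                              bool edge_within_scc)
{
  if (edge_within_scc)
    return false;
  /* Check for a missing summary first, so an unoptimized caller is
     counted as undead rather than dereferenced.  */
  return caller->summary == NULL || !caller->summary->node_dead;
}
```

Treating the unknown caller as live is the conservative choice: nothing is known about it, so it cannot be assumed dead.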

gcc/
PR ipa/96291
* ipa-cp.c (has_undead_caller_from_outside_scc_p): Consider
unoptimized callers as undead.

gcc/testsuite/
PR ipa/96291
* gcc.dg/lto/pr96291_0.c: New testcase.
* gcc.dg/lto/pr96291_1.c: Support file.
* gcc.dg/lto/pr96291_2.c: Likewise.
* gcc.dg/lto/pr96291.h: Likewise.
---
 gcc/ipa-cp.c |  5 +++--
 gcc/testsuite/gcc.dg/lto/pr96291.h   |  4 
 gcc/testsuite/gcc.dg/lto/pr96291_0.c | 11 +++
 gcc/testsuite/gcc.dg/lto/pr96291_1.c |  3 +++
 gcc/testsuite/gcc.dg/lto/pr96291_2.c |  7 +++
 5 files changed, 28 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr96291.h
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr96291_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr96291_1.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr96291_2.c

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index b0c8f405260..fe010ff457c 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -5667,8 +5667,9 @@ has_undead_caller_from_outside_scc_p (struct cgraph_node 
*node,
  (has_undead_caller_from_outside_scc_p, NULL, true))
   return true;
 else if (!ipa_edge_within_scc (cs)
-&& !IPA_NODE_REF (cs->caller)->node_dead)
-  return true;
+&& (!IPA_NODE_REF (cs->caller) /* Unoptimized caller.  */
+|| !IPA_NODE_REF (cs->caller)->node_dead))
+ return true;
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.dg/lto/pr96291.h 
b/gcc/testsuite/gcc.dg/lto/pr96291.h
new file mode 100644
index 000..70eb3cb71b8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr96291.h
@@ -0,0 +1,4 @@
+void e(void);
+void f(void);
+void a(void *, void *);
+void c(int);
diff --git a/gcc/testsuite/gcc.dg/lto/pr96291_0.c 
b/gcc/testsuite/gcc.dg/lto/pr96291_0.c
new file mode 100644
index 000..07e63038e03
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr96291_0.c
@@ -0,0 +1,11 @@
+/* { dg-lto-do link } */
+
+#include "pr96291.h"
+
+static void * b;
+void c(int d) {
+  f();
+  a(b, b);
+}
+
+void e(void) { c(0); }
diff --git a/gcc/testsuite/gcc.dg/lto/pr96291_1.c 
b/gcc/testsuite/gcc.dg/lto/pr96291_1.c
new file mode 100644
index 000..44744a94941
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr96291_1.c
@@ -0,0 +1,3 @@
+#include "pr96291.h"
+
+void f(void) { c(0); }
diff --git a/gcc/testsuite/gcc.dg/lto/pr96291_2.c 
b/gcc/testsuite/gcc.dg/lto/pr96291_2.c
new file mode 100644
index 000..5febffbb00c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr96291_2.c
@@ -0,0 +1,7 @@
+/* { dg-options {-O0} } */
+
+#include "pr96291.h"
+
+void a(void * a1, void * a2) { e(); }
+
+int main(){}
-- 
2.27.0

Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Jakub Jelinek via Gcc-patches

On Mon, Jul 27, 2020 at 09:53:31AM -0600, Martin Sebor via Gcc-patches wrote:
>   Return a pointer P to a NUL-terminated string containing
>   the sequence of bytes corresponding to the representation
>   of the object referred to by SRC (or a subsequence of such
>   bytes within it if SRC is a reference to an initialized
>   constant array plus some constant offset).
> 
> I.e., c_getstr returns a STRING_CST for arbitrary non-string
> constants.  This enables optimizations like the by-pieces
> expansion of calls to raw memory functions like memcpy, or
> the folding of other raw memory calls like memchr with non-
> string arguments.
> 
> c_getstr relies on string_constant for that.  Restricting
> the latter function to just character types prevents these
> optimizations for zero-initialized constants of other types.
> A test case that shows the difference to the by-pieces
> expansion goes something like this:

Having STRING_CST in the compiler with arbitrary array type is IMHO a very
bad idea, so if you want something like that, you should come up with a
different representation for that, not STRING_CSTs.
Because most of the compiler assumes STRING_CSTs are what it says, string
literals where elements are some kind of characters, don't have to be
narrow, but better should be integral.
Maybe returning a CONSTRUCTOR with no elements with the right type is a
better idea for that, that in the compiler stands for zero initialized
aggregate.

Jakub

c-family: Use strcmp to compare location file names

2020-07-27 Thread Nathan Sidwell

The logic to figure out where a missing #include should be inserted uses 
pointer equality to check filenames -- the routine even says so. But 
cpplib makes no such guarantee.  It happens to be true for input that it 
preprocesses[* see line zero below], but is not true for source that has 
already been preprocessed -- all those '# ...' line directives produce 
distinct filename strings.  That renders using -fdirectives-only as a 
prescanning stage (as I understand some people do), broken.


This patch changes to string comparisons, and explicitly rejects any 
line-zero location map that occurs at the beginning of a file.  The very 
first map of a file has a different string to the remaining maps, and we 
never tripped on that because of the pointer comparison.  The second 
testcase deploys -save-temps to cause an intermediate preprocessed 
output that is read back.
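
The difference between the two comparisons can be shown with a small standalone C sketch. The `struct toy_line_map` type and the `map_belongs_to_file` helper are hypothetical simplifications, not libcpp's line_map_ordinary; only the `to_file` and `to_line` field names are borrowed from the patch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, much simplified stand-in for an ordinary line map.  */
struct toy_line_map
{
  const char *to_file; /* File name this map refers to.  */
  unsigned to_line;    /* Starting line; 0 marks the introductory map.  */
};

/* Old behaviour: compare the file name pointers themselves.  */
bool
same_file_by_pointer (const struct toy_line_map *map, const char *file)
{
  return map->to_file == file;
}

/* New behaviour: compare the file names by content and reject any
   line-zero introductory map.  */
bool
map_belongs_to_file (const struct toy_line_map *map, const char *file)
{
  return map->to_line != 0 && strcmp (map->to_file, file) == 0;
}
```

Two separately allocated copies of the same name, as produced by line directives in preprocessed input, fail the pointer test but pass the strcmp test.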


gcc/c-family/
* c-common.c (try_to_locate_new_include_insertion_point): Use
strcmp, not pointer equality.
gcc/testsuite/
* g++.dg/lookup/missing-std-include-10.h: New.
* g++.dg/lookup/missing-std-include-10.C: New.
* g++.dg/lookup/missing-std-include-11.C: New.

pushing ...
--
Nathan Sidwell
diff --git c/gcc/c-family/c-common.c w/gcc/c-family/c-common.c
index 51ecde69f2d..98b80d56cae 100644
--- c/gcc/c-family/c-common.c
+++ w/gcc/c-family/c-common.c
@@ -8764,8 +8764,7 @@ c_family_tests (void)
 #endif /* #if CHECKING_P */
 
 /* Attempt to locate a suitable location within FILE for a
-   #include directive to be inserted before.  FILE should
-   be a string from libcpp (pointer equality is used).
+   #include directive to be inserted before.  
LOC is the location of the relevant diagnostic.
 
Attempt to return the location within FILE immediately
@@ -8800,13 +8799,17 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
 
   if (const line_map_ordinary *from
 	  = linemap_included_from_linemap (line_table, ord_map))
-	if (from->to_file == file)
+	/* We cannot use pointer equality, because with preprocessed
+	   input all filename strings are unique.  */
+	if (0 == strcmp (from->to_file, file))
 	  {
 	last_include_ord_map = from;
 	last_ord_map_after_include = NULL;
 	  }
 
-  if (ord_map->to_file == file)
+  /* Likewise, use strcmp, and reject any line-zero introductory
+	 map.  */
+  if (ord_map->to_line && 0 == strcmp (ord_map->to_file, file))
 	{
 	  if (!first_ord_map_in_file)
 	first_ord_map_in_file = ord_map;
diff --git c/gcc/testsuite/g++.dg/lookup/missing-std-include-10.C w/gcc/testsuite/g++.dg/lookup/missing-std-include-10.C
new file mode 100644
index 000..9dfa78fb60e
--- /dev/null
+++ w/gcc/testsuite/g++.dg/lookup/missing-std-include-10.C
@@ -0,0 +1,43 @@
+// { dg-do compile }
+// { dg-additional-options -fdiagnostics-show-caret }
+// comment
+
+
+
+
+
+
+// Intentional blank lines
+
+
+
+
+
+
+
+
+#include "missing-std-include-10.h"
+// HERE
+
+
+
+
+
+
+// Intentional blank lines
+
+
+
+
+
+
+
+
+
+int main ()
+{
+  return strcmp ("", "");
+}
+// { dg-additional-files "missing-std-include-10.h" }
+// { dg-regexp {[^\n]*: error: 'strcmp' was not declared in this scope\n *return strcmp [^\n]*;\n *\^~*\n} }
+// { dg-regexp {[^\n]* note: 'strcmp' is defined in header[^\n]*\n #include "missing-std-include-10.h"\n\+#include \n // HERE\n} }
diff --git c/gcc/testsuite/g++.dg/lookup/missing-std-include-10.h w/gcc/testsuite/g++.dg/lookup/missing-std-include-10.h
new file mode 100644
index 000..40a8c178f10
--- /dev/null
+++ w/gcc/testsuite/g++.dg/lookup/missing-std-include-10.h
@@ -0,0 +1 @@
+/* empty */
diff --git c/gcc/testsuite/g++.dg/lookup/missing-std-include-11.C w/gcc/testsuite/g++.dg/lookup/missing-std-include-11.C
new file mode 100644
index 000..ec2c494c557
--- /dev/null
+++ w/gcc/testsuite/g++.dg/lookup/missing-std-include-11.C
@@ -0,0 +1,43 @@
+// { dg-do compile }
+// { dg-additional-options {-fdiagnostics-show-caret -save-temps} }
+// comment  save-temps causes us to compile preprocessed output
+
+
+
+
+
+
+// Intentional blank lines
+
+
+
+
+
+
+
+
+#include "missing-std-include-10.h"
+// HERE
+
+
+
+
+
+
+// Intentional blank lines
+
+
+
+
+
+
+
+
+
+int main ()
+{
+  return strcmp ("", "");
+}
+// { dg-additional-files "missing-std-include-10.h" }
+// { dg-regexp {[^\n]*: error: 'strcmp' was not declared in this scope\n *return strcmp [^\n]*;\n *\^~*\n} }
+// { dg-regexp {[^\n]* note: 'strcmp' is defined in header[^\n]*\n #include "missing-std-include-10.h"\n\+#include \n // HERE\n} }

Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Sebor via Gcc-patches


On 7/27/20 12:54 PM, Martin Liška wrote:

On 7/27/20 5:53 PM, Martin Sebor wrote:

The tests I committed with the change didn't exercise any of
this so that's my bad.  I'm still not sure I understand how
the problem with the incomplete type comes up (I haven't had
a chance to look into the recent updates on the bug yet) but
to retain the optimization (and keep the comments in sync
with the code) I think a better solution than restricting
the function to integers is to limit it to complete types.
Beyond that, extending the function to also constant arrays
or nonzero aggregates will also enable the optimization for
those.


Hello.

I must admit that I'm not super-familiar with that code I modified.
Can you please assign the PR and propose a proper fix? I can then
test it on chromium and I'm also deferring backport of the patch I
installed to master.


Sure.  I've been trying to wrap something up and it's been taking
longer than I expected.  I'll look into this as soon as I'm done,
hopefully tomorrow.

Martin

V2 [PATCH] PKG_CHECK_MODULES: Check if $pkg_cv_[]$1[]_LIBS works

2020-07-27 Thread H.J. Lu via Gcc-patches

On Mon, Jul 27, 2020 at 12:14 PM H.J. Lu  wrote:
>
> On Mon, Jul 27, 2020 at 9:11 AM Aaron Merey  wrote:
> >
> > On Mon, Jul 27, 2020 at 11:32 AM H.J. Lu  wrote:
> > >
> > > On Sat, Jul 25, 2020 at 9:01 AM H.J. Lu  wrote:
> > > > This caused:
> > > >
> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=26301
> > > >
> > >
> > > It is quite normal to have debuginfod headers without libdebuginfod on
> > > multilib OSes.  Restore AC_CHECK_LIB to check if libdebuginfod exists.
> > > And always define HAVE_LIBDEBUGINFOD to 0 or 1 for
> > >
> > > binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> > > binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> > > binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> > > binutils/dwarf.h:#if HAVE_LIBDEBUGINFOD
> > > binutils/objdump.c:#if HAVE_LIBDEBUGINFOD
> > > binutils/objdump.c:#endif /* HAVE_LIBDEBUGINFOD */
> > > binutils/readelf.c:#if HAVE_LIBDEBUGINFOD
> > > binutils/readelf.c:#endif /* HAVE_LIBDEBUGINFOD */
> > > gdb/top.c:#if HAVE_LIBDEBUGINFOD
> > >
> > > OK for master?
> >
> > Thanks for spotting this. Normally PKG_CHECK_MODULES would correctly
> > detect whether the .so and header are installed and build accordingly,
> > but when cross compiling the AC_CHECK_LIB may be needed.
>
> I am not cross compiling.  I am simply using "gcc -m32".   The problem
> is PKG_CHECK_MODULES which doesn't check if $pkg_cv_[]$1[]_LIBS
> actually works.   Here is the updated patch to fix PKG_CHECK_MODULES.
> Any comments or objections?
>
>

HAVE_LIBDEBUGINFOD is a separate issue.  Here is the updated patch
which only adds AC_TRY_LINK to PKG_CHECK_MODULES to check if
$pkg_cv_[]$1[]_LIBS works.

-- 
H.J.
From 44682ce298a8ce2b795303d4054ec532847bfcae Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 27 Jul 2020 08:24:15 -0700
Subject: [PATCH] PKG_CHECK_MODULES: Check if $pkg_cv_[]$1[]_LIBS works

It is quite normal to have headers without library on multilib OSes.
Add AC_TRY_LINK to PKG_CHECK_MODULES to check if $pkg_cv_[]$1[]_LIBS
works.

config/

	PR binutils/26301
	* pkg.m4 (PKG_CHECK_MODULES): Add AC_TRY_LINK to check if
	$pkg_cv_[]$1[]_LIBS works.

binutils/

	PR binutils/26301
	* configure: Regenerated.

gdb/

	PR binutils/26301
	* configure: Regenerated.
---
 binutils/configure | 22 ++
 config/pkg.m4  |  6 ++
 gdb/configure  | 22 ++
 3 files changed, 50 insertions(+)

diff --git a/binutils/configure b/binutils/configure
index c9fc5108e0..4620a6b105 100755
--- a/binutils/configure
+++ b/binutils/configure
@@ -12439,6 +12439,28 @@ fi
 pkg_failed=untried
 fi
 
+pkg_save_LDFLAGS="$LDFLAGS"
+LDFLAGS="$LDFLAGS $pkg_cv_DEBUGINFOD_LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+return 0;
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pkg_failed=no
+else
+  pkg_failed=yes
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+LDFLAGS=$pkg_save_LDFLAGS
+
 
 
 if test $pkg_failed = yes; then
diff --git a/config/pkg.m4 b/config/pkg.m4
index 13a8890178..45587e97c8 100644
--- a/config/pkg.m4
+++ b/config/pkg.m4
@@ -147,6 +147,12 @@ AC_MSG_CHECKING([for $2])
 _PKG_CONFIG([$1][_CFLAGS], [cflags], [$2])
 _PKG_CONFIG([$1][_LIBS], [libs], [$2])
 
+dnl Check whether $pkg_cv_[]$1[]_LIBS works.
+pkg_save_LDFLAGS="$LDFLAGS"
+LDFLAGS="$LDFLAGS $pkg_cv_[]$1[]_LIBS"
+AC_TRY_LINK([],[return 0;], [pkg_failed=no], [pkg_failed=yes])
+LDFLAGS=$pkg_save_LDFLAGS
+
 m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS
 and $1[]_LIBS to avoid the need to call pkg-config.
 See the pkg-config man page for more details.])
diff --git a/gdb/configure b/gdb/configure
index adcfa49c63..eb38aaacfc 100755
--- a/gdb/configure
+++ b/gdb/configure
@@ -7037,6 +7037,28 @@ fi
 pkg_failed=untried
 fi
 
+pkg_save_LDFLAGS="$LDFLAGS"
+LDFLAGS="$LDFLAGS $pkg_cv_DEBUGINFOD_LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+return 0;
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pkg_failed=no
+else
+  pkg_failed=yes
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+LDFLAGS=$pkg_save_LDFLAGS
+
 
 
 if test $pkg_failed = yes; then
-- 
2.26.2

[PATCH] PKG_CHECK_MODULES: Check if $pkg_cv_[]$1[]_LIBS works

2020-07-27 Thread H.J. Lu via Gcc-patches

On Mon, Jul 27, 2020 at 9:11 AM Aaron Merey  wrote:
>
> On Mon, Jul 27, 2020 at 11:32 AM H.J. Lu  wrote:
> >
> > On Sat, Jul 25, 2020 at 9:01 AM H.J. Lu  wrote:
> > > This caused:
> > >
> > > https://sourceware.org/bugzilla/show_bug.cgi?id=26301
> > >
> >
> > It is quite normal to have debuginfod headers without libdebuginfod on
> > multilib OSes.  Restore AC_CHECK_LIB to check if libdebuginfod exists.
> > And always define HAVE_LIBDEBUGINFOD to 0 or 1 for
> >
> > binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> > binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> > binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> > binutils/dwarf.h:#if HAVE_LIBDEBUGINFOD
> > binutils/objdump.c:#if HAVE_LIBDEBUGINFOD
> > binutils/objdump.c:#endif /* HAVE_LIBDEBUGINFOD */
> > binutils/readelf.c:#if HAVE_LIBDEBUGINFOD
> > binutils/readelf.c:#endif /* HAVE_LIBDEBUGINFOD */
> > gdb/top.c:#if HAVE_LIBDEBUGINFOD
> >
> > OK for master?
>
> Thanks for spotting this. Normally PKG_CHECK_MODULES would correctly
> detect whether the .so and header are installed and build accordingly,
> but when cross compiling the AC_CHECK_LIB may be needed.

I am not cross compiling.  I am simply using "gcc -m32".   The problem
is PKG_CHECK_MODULES which doesn't check if $pkg_cv_[]$1[]_LIBS
actually works.   Here is the updated patch to fix PKG_CHECK_MODULES.
Any comments or objections?


-- 
H.J.
From 42d49b1444ad6c8475672f6a6a16810d9e7c15ef Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 27 Jul 2020 08:24:15 -0700
Subject: [PATCH] PKG_CHECK_MODULES: Check if $pkg_cv_[]$1[]_LIBS works

It is quite normal to have headers without library on multilib OSes.
Add AC_TRY_LINK to PKG_CHECK_MODULES to check if $pkg_cv_[]$1[]_LIBS
works.  And always define HAVE_LIBDEBUGINFOD to 0 or 1 for

binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.h:#if HAVE_LIBDEBUGINFOD
binutils/objdump.c:#if HAVE_LIBDEBUGINFOD
binutils/objdump.c:#endif /* HAVE_LIBDEBUGINFOD */
binutils/readelf.c:#if HAVE_LIBDEBUGINFOD
binutils/readelf.c:#endif /* HAVE_LIBDEBUGINFOD */
gdb/top.c:#if HAVE_LIBDEBUGINFOD

config/

	PR binutils/26301
	* debuginfod.m4 (AC_DEBUGINFOD): Always define HAVE_LIBDEBUGINFOD
	to 0 or 1.
	* pkg.m4 (PKG_CHECK_MODULES): Add AC_TRY_LINK to check if
	$pkg_cv_[]$1[]_LIBS works.

binutils/

	PR binutils/26301
	* configure: Regenerated.

gdb/

	PR binutils/26301
	* configure: Regenerated.
---
 binutils/configure   | 38 +-
 config/debuginfod.m4 |  9 +++--
 config/pkg.m4|  6 ++
 gdb/configure| 38 +-
 4 files changed, 79 insertions(+), 12 deletions(-)

diff --git a/binutils/configure b/binutils/configure
index c9fc5108e0..3f9ac88990 100755
--- a/binutils/configure
+++ b/binutils/configure
@@ -12398,6 +12398,7 @@ $as_echo_n "checking whether to use debuginfod... " >&6; }
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_debuginfod" >&5
 $as_echo "$with_debuginfod" >&6; }
 
+have_libdebuginfod=0
 if test "x$with_debuginfod" != xno; then
 
 pkg_failed=no
@@ -12439,6 +12440,28 @@ fi
 pkg_failed=untried
 fi
 
+pkg_save_LDFLAGS="$LDFLAGS"
+LDFLAGS="$LDFLAGS $pkg_cv_DEBUGINFOD_LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+return 0;
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pkg_failed=no
+else
+  pkg_failed=yes
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+LDFLAGS=$pkg_save_LDFLAGS
+
 
 
 if test $pkg_failed = yes; then
@@ -12458,7 +12481,8 @@ fi
 	# Put the nasty error message in config.log where it belongs
 	echo "$DEBUGINFOD_PKG_ERRORS" >&5
 
-	if test "x$with_debuginfod" = xyes; then
+	have_libdebuginfod=0
+ if test "x$with_debuginfod" = xyes; then
as_fn_error $? "\"--with-debuginfod was given, but libdebuginfod is missing or unusable.\"" "$LINENO" 5
  else
{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: libdebuginfod is missing or unusable; some features may be unavailable." >&5
@@ -12467,7 +12491,8 @@ $as_echo "$as_me: WARNING: libdebuginfod is missing or unusable; some features m
 elif test $pkg_failed = untried; then
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
 $as_echo "no" >&6; }
-	if test "x$with_debuginfod" = xyes; then
+	have_libdebuginfod=0
+ if test "x$with_debuginfod" = xyes; then
as_fn_error $? "\"--with-debuginfod was given, but libdebuginfod is missing or unusable.\"" "$LINENO" 5
  else
{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: libdebuginfod is missing or unusable; some features may be unavailable." >&5
@@ -12478,15 +12503,18 @@ else
 	DEBUGINFOD_LIBS=$pkg_cv_DEBUGINFOD_LIBS
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
 $as_echo "yes" >&6; }
-
-$as_echo "#define HAVE_LIBDEBUGINFOD 1" >>confdefs.h
-
+	have_libdebuginfod=1
 fi
 else
   { $as_echo

Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Liška


On 7/27/20 5:53 PM, Martin Sebor wrote:

The tests I committed with the change didn't exercise any of
this so that's my bad.  I'm still not sure I understand how
the problem with the incomplete type comes up (I haven't had
a chance to look into the recent updates on the bug yet) but
to retain the optimization (and keep the comments in sync
with the code) I think a better solution than restricting
the function to integers is to limit it to complete types.
Beyond that, extending the function to also constant arrays
or nonzero aggregates will also enable the optimization for
those.


Hello.

I must admit that I'm not super-familiar with that code I modified.
Can you please assign the PR and propose a proper fix? I can then
test it on chromium and I'm also deferring backport of the patch I installed
to master.

Thanks,
Martin

Re: [PATCH] libgomp: Add helper functions for memory handling.

2020-07-27 Thread y2s1982 via Gcc-patches

Hello Jakub,

Thanks for the reply. I apparently need further clarification.

On Mon, Jul 27, 2020 at 12:36 PM Jakub Jelinek  wrote:

> On Sat, Jul 25, 2020 at 11:03:27AM -0400, y2s1982 via Gcc-patches wrote:
> > This patch adds a few helper functions that aim to improve the readability
> > of fetching addresses, sizes, and values. It also proposes a syntax for
> > querying this information through the callback functions, similar to
> > that of LLVM implementation. The syntax format is _ > name>, or __. '_' is the
> > delimiter between fields. '', as currently defined in the
> > enum, is either gompd_query_address or gompd_query_size: the first
> > handles address or offset queries while the second handles the size of
> > the variable/member. '' refers to the variable type, and
> > '' refers to the data type of the member of the variable.
> > This code is incomplete: in particular, it currently lacks CUDA support,
> > as well as segment handling, and inlining of the functions.
>
> That assumes on the libgomp.so.1 side you want to add all the magic symbols
> to the dynamic! symbol table.  We do not want that, it is a big waste of
> system resources that way.
> Rather than that, as I said several times, there should be a single
> variable, perhaps with generated content, with a compact data format on
> which the two libraries agree, and libgompd should parse and use
> information from that single variable.
>

I do know you have said this several times, and I thought I understood it,
but it seems I am wrong each time. I just want to clarify my understanding
and what I had intended on doing on this and would like further explanation
on what you would like to see happen more specifically so that I may make
fewer mistakes.

My assumption in this patch was that, as the ompd_callback_symbol_addr_fn_t
callback function takes 2 context addresses and string query as input, and
ompd_address_t address as an output, I should give it:
  - the context addresses, which I assumed would contain a single data in
compact form generated from the tool's side having size, offset, and
address information,
  - and a string, which is basically a query that needs to be interpreted
by the callback function to determine which part of the data it needs to
use to generate to the returning pointer.
I wasn't sure what role the filename input played in this.
This seemed to fit what you meant by having a single compact data that
contains all the information while resolving my ongoing mystery as to what
the callback was expecting to have as the string identifying the symbol. To
further clarify, I assumed the callback would do string manipulation and
comparison to identify which information is being requested, refer to the
single data contained in appropriate context, and return the address
information.

I am more than willing to scrap my bad idea for a better one. I am
sincerely interested in learning better ways to implement and improve
myself. I just need to know more specifics of the design, particularly:
- where the compact data is stored (I assumed context, which means it might
not be defined in the library side, but should it be in the handle or
global to library?),
- how the information necessary for the compact data is gathered (if it is
done on the library side, should I just use the ompd_callback_sizeof_fn_t
to obtain primitive sizes, and from there, just build the offset
information for each variable members? How about variable address?)
- how the ompd_callback_symbol_addr_fn_t callback function would likely use
it given its arguments (input of 2 contexts, 1 file name, 1 string to
output 1 address),
- what are the expected strings for the callback that would correspond to
each variable (here, I assume, the types would be gomp_thread,
gompd_thread_pool, gomp_team, etc.) or each member of said variables,
(these, at least I expected, should be documented and agreed on between
library and tool),
among other things.


>
> So I'm afraid pretty much nothing from this patch is really usable.
>
> > +#define FOREACH_QUERYTYPE(TYPE)\
> > + TYPE (gompd_query_address)\
> > + TYPE (gompd_query_size)\
> > +
> Why the final \ ?  There should be space before the \ too.
> > +
> >  extern ompd_callbacks_t gompd_callbacks;
> >
> >  typedef struct _ompd_aspace_handle {
> > @@ -47,4 +53,51 @@ typedef struct _ompd_aspace_handle {
> >ompd_size_t ref_count;
> >  } ompd_address_space_handle_t;
> >
> > +typedef enum gompd_query_type {
> > +#define GENERATE_ENUM(ENUM) ENUM,
> > +  FOREACH_QUERYTYPE (GENERATE_ENUM)
> > +#undef GENERATE_ENUM
> > +} query_type;
>
> It is unclear what you want to achieve through this, right now it is
> a very fancy way to say
> typedef enum gompd_query_type {
>   gompd_query_address,
>   gompd_query_size,
> } query_type;
>

I wanted to have a consistent way to create an enum and a corresponding
array string that translates the enum to a string. I also wasn't sure if I
covered all the types the query should handle as the

Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-27 Thread Richard Sandiford

Zhongyunde  writes:
> I reconsidered the issue and attached an updated patch.
> Yes, if the kernel loop bb's information isn't used in regrename, it
> need not be collected either, which saves compile time.

Thanks.  The point about other callers was a good one though.
I think it would be OK to add a new parameter to regrename_analyze
that says whether it should process all blocks, ignoring
BB_DISABLE_SCHEDULE.  The default value could be true, with only
regrename itself passing false.
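
A minimal sketch of that suggestion follows. It does not use GCC's
basic_block type; struct block, BLOCK_DISABLE_SCHEDULE, and analyze_blocks
are hypothetical names used only for illustration. C has no default
arguments, so every caller passes the flag explicitly here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for BB_DISABLE_SCHEDULE.  */
#define BLOCK_DISABLE_SCHEDULE 0x1u

/* Hypothetical, simplified basic block.  */
struct block
{
  int index;
  unsigned flags;
};

/* Visit BLOCKS and store in ANALYZED the index of each block that is
   processed.  When INCLUDE_ALL_BLOCKS is false, blocks marked with
   BLOCK_DISABLE_SCHEDULE are skipped.  Returns the number of blocks
   processed.  */
static size_t
analyze_blocks (const struct block *blocks, size_t n,
                bool include_all_blocks, int *analyzed)
{
  size_t count = 0;

  assert (blocks != NULL && analyzed != NULL);
  for (size_t i = 0; i < n; i++)
    {
      if (!include_all_blocks
          && (blocks[i].flags & BLOCK_DISABLE_SCHEDULE) != 0)
        continue;  /* Leave the scheduled block untouched.  */
      analyzed[count++] = blocks[i].index;
    }
  return count;
}
```

Other callers would pass true and see every block, while the renaming pass
itself would pass false and leave the scheduled loop alone.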

Why do you need the BB_MODIFIED test?  The point in time that that
flag is measured from is somewhat arbitrary.  Also, the modification
that caused the flag to be set might not have invalidated the schedule.

Richard

>
>> -Original Message-
>> From: Zhongyunde
>> Sent: Sunday, July 26, 2020 3:29 PM
>> To: 'Richard Sandiford' 
>> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) 
>> Subject: RE: [PATCH PR95696] regrename creates overlapping register
>> allocations for vliw
>> 
>> 
>> > >> It's interesting that this is for a testcase using SMS.  One of the
>> > >> traditional problems with the GCC implementation of SMS has been
>> > >> ensuring that later passes don't mess up the scheduled loop.  So in
>> > >> your testcase, does register allocation succeed for the SMS loop
>> > >> without invalidating the bundling decisions?
>> > >
>> > > Yes.
>> > >
>> > >> If so, then it's probably better to avoid running regrename on it at 
>> > >> all.
>> > >> It mostly exists to help the second scheduling pass, but the second
>> > >> scheduling pass shouldn't be messing with an SMS loop anyway.
>> > >> Also, although the patch deals with one case in which regrename
>> > >> could disrupt the bundling, there are others too.
>> > >>
>> > >> So maybe one option would be to make regrename ignore blocks that
>> > >> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
>> > >> discounted
>> > >> already.)
>> > >
>> > > OK, following your advice, I have attached a new patch.
>> >
>> > Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
>> > separate patches though.
>> >
>> > For the SMS part, I think a better place to enforce the rule is in
>> > build_def_use.  If that function returns false early for
>> > BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
>> > block without wasting too much compile time on it, and we'll still
>> > keep the pass structures internally correct.  (It would also be good
>> > to have a dump_file message to say that that's what we're doing.)
>> 
>> > Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
>> > could you describe the (presumably non-SMS) cases that are affected?
>> 
>> Yes, the non-SMS basic block should not be affected.
>> The attached alternative avoids the REG_UNUSED stuff for BBs with
>> BB_DISABLE_SCHEDULE.
>> 
>> I didn't change build_def_use to return false early, because some other
>> optimizations reuse regrename_analyze to create def/use chain
>> info for the kernel loop body in our target.
>> 
>> > TBH, given that the bundling information is so uncertain at this
>> > stage, I think it would be better to have a mode in which regrename
>> > ignores REG_UNUSED notes altogether.  Perhaps we could put it under a
>> > --param, which targets could then set to whichever default they prefer.
>> > The default should be the current behaviour though.
>> 
>> 
>> > Thanks,
>> > Richard
>
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> index c38173a77..1df58e52d 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -737,6 +737,14 @@ regrename_analyze (bitmap bb_mask)
>if (dump_file)
>   fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
>  
> +  if ((bb1->flags & BB_DISABLE_SCHEDULE) != 0
> +   && (bb1->flags & BB_MODIFIED) == 0)
> + {
> +   if (dump_file)
> + fprintf (dump_file, "skip to avoid disrupting the sms schedule\n");
> +   continue;
> + }
> +
>init_rename_info (this_info, bb1);
>  
>success = build_def_use (bb1);

Re: [PATCH v4] driver: fix a problem with implementation of -falign-foo=0 [PR96247]

2020-07-27 Thread Richard Sandiford

Hu Jiangping  writes:
> Hi!
>
> This patch makes the -falign-foo=0 work as described in the
> documentation. Thanks for all the suggestions.
>
> v4: do changes for coding conventions
> v3: make change more readable and self-consistent
>
> Changelog:
> 2020-07-27  Hu Jiangping  
>
>   PR driver/96247
>   * opts.c (check_alignment_argument): Set the -falign-Name
>   on/off flag on and set the -falign-Name string value null,
>   when the command-line specified argument is zero.

Thanks, pushed to master.

Richard

>
> Tested on x86_64.
>
> Regards!
> Hujp
>
> ---
>  gcc/opts.c | 28 ++--
>  1 file changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 499eb900643..574b28416fb 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -2004,13 +2004,21 @@ parse_and_check_align_values (const char *flag,
>  }
>  
>  /* Check that alignment value FLAG for -falign-NAME is valid at a given
> -   location LOC.  */
> +   location LOC. OPT_STR points to the stored -falign-NAME=argument and
> +   OPT_FLAG points to the associated -falign-NAME on/off flag.  */
>  
>  static void
> -check_alignment_argument (location_t loc, const char *flag, const char *name)
> +check_alignment_argument (location_t loc, const char *flag, const char *name,
> +int *opt_flag, const char **opt_str)
>  {
>auto_vec align_result;
>parse_and_check_align_values (flag, name, align_result, true, loc);
> +
> +  if (align_result.length() >= 1 && align_result[0] == 0)
> +{
> +  *opt_flag = 1;
> +  *opt_str = NULL;
> +}
>  }
>  
>  /* Print help when OPT__help_ is set.  */
> @@ -2785,19 +2793,27 @@ common_handle_option (struct gcc_options *opts,
>break;
>  
>  case OPT_falign_loops_:
> -  check_alignment_argument (loc, arg, "loops");
> +  check_alignment_argument (loc, arg, "loops",
> +                            &opts->x_flag_align_loops,
> +                            &opts->x_str_align_loops);
>break;
>  
>  case OPT_falign_jumps_:
> -  check_alignment_argument (loc, arg, "jumps");
> +  check_alignment_argument (loc, arg, "jumps",
> +                            &opts->x_flag_align_jumps,
> +                            &opts->x_str_align_jumps);
>break;
>  
>  case OPT_falign_labels_:
> -  check_alignment_argument (loc, arg, "labels");
> +  check_alignment_argument (loc, arg, "labels",
> +                            &opts->x_flag_align_labels,
> +                            &opts->x_str_align_labels);
>break;
>  
>  case OPT_falign_functions_:
> -  check_alignment_argument (loc, arg, "functions");
> +  check_alignment_argument (loc, arg, "functions",
> +                            &opts->x_flag_align_functions,
> +                            &opts->x_str_align_functions);
>break;
>  
>  case OPT_ftabstop_:

Go patch committed: Scan all function literals for escape analysis

2020-07-27 Thread Ian Lance Taylor via Gcc-patches

This patch to the Go frontend scans all function literals for escape
analysis.  We were scanning only function literals with closures, but
not all function literals have closures.  The effect of this is a
missed optimization in some cases: we will allocate some variables on
the heap unnecessarily.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
a449a09f88f149f4c29fae09539ba0a909204a36
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index d23e7377306..0fa32a43489 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-587d4595e446c597efe97ccdc81b2f05cbc04a21
+e86f2cb5d6b1984fde345d6ade605e377fa38c04
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/escape.cc b/gcc/go/gofrontend/escape.cc
index 0d38858897d..8962f3f38d1 100644
--- a/gcc/go/gofrontend/escape.cc
+++ b/gcc/go/gofrontend/escape.cc
@@ -142,18 +142,22 @@ Node::ast_format(Gogo* gogo) const
   else if (this->expr() != NULL)
 {
   Expression* e = this->expr();
+
   bool is_call = e->call_expression() != NULL;
   if (is_call)
-   e->call_expression()->fn();
+   e = e->call_expression()->fn();
   Func_expression* fe = e->func_expression();;
-
-  bool is_closure = fe != NULL && fe->closure() != NULL;
-  if (is_closure)
+  if (fe != NULL)
{
- if (is_call)
-   return "(func literal)()";
- return "func literal";
+ Named_object* no = fe->named_object();
+ if (no->is_function() && no->func_value()->enclosing() != NULL)
+   {
+ if (is_call)
+   return "(func literal)()";
+ return "func literal";
+   }
}
+
   Ast_dump_context::dump_to_stream(this->expr(), );
 }
   else if (this->statement() != NULL)
@@ -1172,11 +1176,14 @@ Escape_discover_expr::expression(Expression** pexpr)
   // Method call or function call.
   fn = e->call_expression()->fn()->func_expression()->named_object();
 }
-  else if (e->func_expression() != NULL
-   && e->func_expression()->closure() != NULL)
+  else if (e->func_expression() != NULL)
 {
-  // Closure.
-  fn = e->func_expression()->named_object();
+  Named_object* no = e->func_expression()->named_object();
+  if (no->is_function() && no->func_value()->enclosing() != NULL)
+   {
+ // Nested function.
+ fn = no;
+   }
 }
 
   if (fn != NULL)

Re: [PATCH] libgomp: Add helper functions for memory handling.

2020-07-27 Thread Jakub Jelinek via Gcc-patches

On Sat, Jul 25, 2020 at 11:03:27AM -0400, y2s1982 via Gcc-patches wrote:
> This patch adds a few helper functions that aim to improve the readability
> of fetching addresses, sizes, and values. It also proposes a syntax for
> querying this information through the callback functions, similar to
> that of LLVM implementation. The syntax format is _ name>, or __. '_' is the
> delimiter between fields. '', as currently defined in the
> enum, is either gompd_query_address or gompd_query_size: the first
> handles address or offset queries while the second handles the size of
> the variable/member. '' refers to the variable type, and
> '' refers to the data type of the member of the variable.
> This code is incomplete: in particular, it currently lacks CUDA support,
> as well as segment handling, and inlining of the functions.

That assumes on the libgomp.so.1 side you want to add all the magic symbols
to the dynamic! symbol table.  We do not want that, it is a big waste of
system resources that way.
Rather than that, as I said several times, there should be a single
variable, perhaps with generated content, with a compact data format on
which the two libraries agree, and libgompd should parse and use
information from that single variable.
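
One possible shape for such a single variable is sketched below. This is
only an assumption about what a compact format could look like, not the
format libgomp uses or the one proposed in this thread: struct demo_thread,
enum demo_field, demo_layout_table, and demo_lookup are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runtime structure whose layout the debugger-side
   library needs to know.  */
struct demo_thread
{
  void *task;
  uint32_t team_id;
  uint64_t place;
};

/* Field identifiers that both libraries would agree on.  */
enum demo_field
{
  DEMO_THREAD_TASK,
  DEMO_THREAD_TEAM_ID,
  DEMO_THREAD_PLACE,
  DEMO_FIELD_COUNT
};

/* One compact entry: where the field lives and how large it is.  */
struct demo_field_desc
{
  uint16_t offset;
  uint16_t size;
};

/* The single exported variable.  Only this one symbol has to be
   looked up; everything else is read out of its contents.  */
const struct demo_field_desc demo_layout_table[DEMO_FIELD_COUNT] = {
  [DEMO_THREAD_TASK] = { offsetof (struct demo_thread, task),
                         sizeof (void *) },
  [DEMO_THREAD_TEAM_ID] = { offsetof (struct demo_thread, team_id),
                            sizeof (uint32_t) },
  [DEMO_THREAD_PLACE] = { offsetof (struct demo_thread, place),
                          sizeof (uint64_t) },
};

/* Reader side: fetch the descriptor for FIELD from a copy of the
   table holding N entries.  Returns 0 on success, -1 if FIELD is
   out of range.  */
static int
demo_lookup (const struct demo_field_desc *table, size_t n,
             enum demo_field field, struct demo_field_desc *out)
{
  assert (table != NULL && out != NULL);
  if ((size_t) field >= n)
    return -1;
  *out = table[field];
  return 0;
}
```

In this sketch the tool side would resolve one symbol, read the table from
the target's memory, and then index it locally instead of asking for one
symbol per query.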

So I'm afraid pretty much nothing from this patch is really usable.

> +#define FOREACH_QUERYTYPE(TYPE)\
> + TYPE (gompd_query_address)\
> + TYPE (gompd_query_size)\
> +
Why the final \ ?  There should be space before the \ too.
> +
>  extern ompd_callbacks_t gompd_callbacks;
>  
>  typedef struct _ompd_aspace_handle {
> @@ -47,4 +53,51 @@ typedef struct _ompd_aspace_handle {
>ompd_size_t ref_count;
>  } ompd_address_space_handle_t;
>  
> +typedef enum gompd_query_type {
> +#define GENERATE_ENUM(ENUM) ENUM,
> +  FOREACH_QUERYTYPE (GENERATE_ENUM)
> +#undef GENERATE_ENUM
> +} query_type;

It is unclear what you want to achieve through this, right now it is
a very fancy way to say
typedef enum gompd_query_type {
  gompd_query_address,
  gompd_query_size,
} query_type;
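
For reference, the macro list only does more than the plain enum when it is
expanded a second time, for instance into a parallel table of names. The
sketch below assumes that is the intent; gompd_query_count,
query_type_names, query_type_name, and query_type_from_name are
hypothetical additions, and the list macro is written without the trailing
backslash and with a space before each remaining one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Single list of enumerators; each expansion applies TYPE to every one.  */
#define FOREACH_QUERYTYPE(TYPE) \
  TYPE (gompd_query_address) \
  TYPE (gompd_query_size)

typedef enum gompd_query_type
{
#define GENERATE_ENUM(ENUM) ENUM,
  FOREACH_QUERYTYPE (GENERATE_ENUM)
#undef GENERATE_ENUM
  gompd_query_count  /* Hypothetical sentinel: number of query types.  */
} query_type;

/* Second expansion of the same list: the enumerator names as strings,
   in the same order as the enum.  */
static const char *const query_type_names[] = {
#define GENERATE_STRING(ENUM) #ENUM,
  FOREACH_QUERYTYPE (GENERATE_STRING)
#undef GENERATE_STRING
};

/* Map an enumerator to its name.  */
static const char *
query_type_name (query_type t)
{
  assert ((unsigned) t < (unsigned) gompd_query_count);
  return query_type_names[t];
}

/* Map a name back to its enumerator value, or return -1 if unknown.  */
static int
query_type_from_name (const char *name)
{
  assert (name != NULL);
  for (int i = 0; i < (int) gompd_query_count; i++)
    if (strcmp (query_type_names[i], name) == 0)
      return i;
  return -1;
}
```

If only the enum is needed, the plain definition shown above is equivalent
and easier to read.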

> +
> +ompd_rc_t gompd_getQueryStringSize (size_t *, query_type, const char*,
> + const char *);

GCC is not a CamelCaseShop, so please don't use such names.
Appropriate names would be gompd_get_query_string_size etc.

Jakub

Re: [PATCH] config/debuginfod.m4: Restore AC_CHECK_LIB check

2020-07-27 Thread Aaron Merey via Gcc-patches

On Mon, Jul 27, 2020 at 11:32 AM H.J. Lu  wrote:
>
> On Sat, Jul 25, 2020 at 9:01 AM H.J. Lu  wrote:
> > This caused:
> >
> > https://sourceware.org/bugzilla/show_bug.cgi?id=26301
> >
>
> It is quite normal to have debuginfod headers without libdebuginfod on
> multilib OSes.  Restore AC_CHECK_LIB to check if libdebuginfod exists.
> And always define HAVE_LIBDEBUGINFOD to 0 or 1 for
>
> binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
> binutils/dwarf.h:#if HAVE_LIBDEBUGINFOD
> binutils/objdump.c:#if HAVE_LIBDEBUGINFOD
> binutils/objdump.c:#endif /* HAVE_LIBDEBUGINFOD */
> binutils/readelf.c:#if HAVE_LIBDEBUGINFOD
> binutils/readelf.c:#endif /* HAVE_LIBDEBUGINFOD */
> gdb/top.c:#if HAVE_LIBDEBUGINFOD
>
> OK for master?

Thanks for spotting this. Normally PKG_CHECK_MODULES would correctly
detect whether the .so and header are installed and build accordingly,
but when cross compiling the AC_CHECK_LIB may be needed.

Aaron

Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Sebor via Gcc-patches


On 7/27/20 6:32 AM, Martin Liška wrote:

Hey.

As mentioned in the PR, we should not create a string constant for a type
that is different from char_type_node. Looking at expr.c, I was inspired
and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that
the underlying type is a character type.

The patch bootstraps on x86_64-linux-gnu and survives regression tests.
It also fixes the chromium build on the gcc-10 branch with the patch applied.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

 PR tree-optimization/96058
 * expr.c (string_constant): Build string_constant only
 for a type that is main variant of char_type_node.
---
  gcc/expr.c | 22 +-
  1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 5db0a7a8565..c3fdd82b319 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11828,17 +11828,21 @@ string_constant (tree arg, tree *ptr_offset, 
tree *mem_size, tree *decl)

  chartype = TREE_TYPE (chartype);
    while (TREE_CODE (chartype) == ARRAY_TYPE)
  chartype = TREE_TYPE (chartype);
-  /* Convert a char array to an empty STRING_CST having an array
- of the expected type and size.  */
-  if (!initsize)
-  initsize = integer_zero_node;

-  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
-  init = build_string_literal (size, NULL, chartype, size);
-  init = TREE_OPERAND (init, 0);
-  init = TREE_OPERAND (init, 0);
+  if (TYPE_MAIN_VARIANT (chartype) == char_type_node)


The change to c_getstr I recently committed made it clear that
the function can:

  Return a pointer P to a NUL-terminated string containing
  the sequence of bytes corresponding to the representation
  of the object referred to by SRC (or a subsequence of such
  bytes within it if SRC is a reference to an initialized
  constant array plus some constant offset).

I.e., c_getstr returns a STRING_CST for arbitrary non-string
constants.  This enables optimizations like the by-pieces
expansion of calls to raw memory functions like memcpy, or
the folding of other raw memory calls like memchr with non-
string arguments.

c_getstr relies on string_constant for that.  Restricting
the latter function to just character types prevents these
optimizations for zero-initialized constants of other types.
A test case that shows the difference to the by-pieces
expansion goes something like this:

  const struct { char a[64]; } x = { 0 };

  void f (void *d)
  {
    __builtin_memcpy (d, &x, sizeof x - 1);
  }

A test case for the memchr folding is similar:

  const struct { char a[64]; } x = { 0 };

  int f (void *d)
  {
    return __builtin_memchr (&x, 0, sizeof x) != 0;
  }

The tests I committed with the change didn't exercise any of
this so that's my bad.  I'm still not sure I understand how
the problem with the incomplete type comes up (I haven't had
a chance to look into the recent updates on the bug yet) but
to retain the optimization (and keep the comments in sync
with the code) I think a better solution than restricting
the function to integers is to limit it to complete types.
Beyond that, extending the function to also constant arrays
or nonzero aggregates will also enable the optimization for
those.

Martin
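
A self-checking variant of the two test cases above may help when experimenting. It is a sketch, not a testsuite entry: it uses the standard memcpy and memchr names, and copy_zeros and has_zero_byte are hypothetical function names. It only checks run-time behavior; whether GCC expands or folds the calls depends on the compiler version and optimization level and has to be confirmed from the dumps or the assembly.

```c
#include <assert.h>
#include <string.h>

/* Zero-initialized constant aggregate, as in the test cases.  */
static const struct { char a[64]; } x = { 0 };

/* Raw-memory copy from a non-string constant; a candidate for
   by-pieces expansion.  Copies all but the last byte of x.  */
void
copy_zeros (void *d)
{
  assert (d != NULL);
  memcpy (d, &x, sizeof x - 1);
}

/* Raw-memory search in the same constant; a candidate for folding
   to a constant result.  */
int
has_zero_byte (void)
{
  return memchr (&x, 0, sizeof x) != NULL;
}
```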


+    {
+  /* Convert a char array to an empty STRING_CST having an array
+ of the expected type and size.  */
+  if (!initsize)
+    initsize = integer_zero_node;
+
+  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
+  init = build_string_literal (size, NULL, chartype, size);
+  init = TREE_OPERAND (init, 0);
+  init = TREE_OPERAND (init, 0);

-  *ptr_offset = integer_zero_node;
+  *ptr_offset = integer_zero_node;
+    }
  }

    if (decl)

[PATCH] config/debuginfod.m4: Restore AC_CHECK_LIB check

2020-07-27 Thread H.J. Lu via Gcc-patches

On Sat, Jul 25, 2020 at 9:01 AM H.J. Lu  wrote:
>
> On Fri, Jul 24, 2020 at 1:04 PM Aaron Merey via Gcc-patches
>  wrote:
> >
> > On Tue, Jul 21, 2020 at 2:11 PM Aaron Merey  wrote:
> > >
> > > On Tue, Jul 21, 2020 at 11:20 AM Tom Tromey  wrote:
> > > >
> > > > Simon> Since it's debuginfo.m4 that is using PKG_CHECK_MODULES, can you 
> > > > put the include
> > > > Simon> of pkg.m4 in debuginfo.m4, instead of in 
> > > > {binutils,gdb}/configure.ac?
> > > >
> > > > Simon> Otherwise, from GDB's point of view I think it looks good, unless
> > > > Simon> Tom has some things to add.
> > > >
> > > > I'm happy with it.  Thanks for persevering.
> > >
> > > Great. I can push to binutils-gdb but not gcc. Should I just push to
> > > binutils-gdb for now or wait until the patch can be applied to both
> > > repos at once?
> >
> > I'm going to go ahead and push to binutils-gdb. Since these changes
> > should not affect gcc there shouldn't be any conflicts.
> >
>
> This caused:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=26301
>

It is quite normal to have debuginfod headers without libdebuginfod on
multilib OSes.  Restore AC_CHECK_LIB to check if libdebuginfod exists.
And always define HAVE_LIBDEBUGINFOD to 0 or 1 for

binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.h:#if HAVE_LIBDEBUGINFOD
binutils/objdump.c:#if HAVE_LIBDEBUGINFOD
binutils/objdump.c:#endif /* HAVE_LIBDEBUGINFOD */
binutils/readelf.c:#if HAVE_LIBDEBUGINFOD
binutils/readelf.c:#endif /* HAVE_LIBDEBUGINFOD */
gdb/top.c:#if HAVE_LIBDEBUGINFOD

OK for master?

-- 
H.J.
From 975f68898817f2db13c5d7061fb6e6a9147b06aa Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 27 Jul 2020 08:24:15 -0700
Subject: [PATCH] config/debuginfod.m4: Restore AC_CHECK_LIB check

It is quite normal to have debuginfod headers without libdebuginfod on
multilib OSes.  Restore AC_CHECK_LIB to check if libdebuginfod exists.
And always define HAVE_LIBDEBUGINFOD to 0 or 1 for

binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.c:#if HAVE_LIBDEBUGINFOD
binutils/dwarf.h:#if HAVE_LIBDEBUGINFOD
binutils/objdump.c:#if HAVE_LIBDEBUGINFOD
binutils/objdump.c:#endif /* HAVE_LIBDEBUGINFOD */
binutils/readelf.c:#if HAVE_LIBDEBUGINFOD
binutils/readelf.c:#endif /* HAVE_LIBDEBUGINFOD */
gdb/top.c:#if HAVE_LIBDEBUGINFOD

config/

	PR binutils/26301
	* debuginfod.m4 (AC_DEBUGINFOD): Restore AC_CHECK_LIB to check if
	libdebuginfod exists.

binutils/

	PR binutils/26301
	* configure: Regenerated.

gdb/

	PR binutils/26301
	* configure: Regenerated.
---
 binutils/configure   | 88 +---
 config/debuginfod.m4 | 14 +--
 gdb/configure| 88 +---
 3 files changed, 161 insertions(+), 29 deletions(-)

diff --git a/binutils/configure b/binutils/configure
index c9fc5108e0..02d33cb5ed 100755
--- a/binutils/configure
+++ b/binutils/configure
@@ -12398,6 +12398,7 @@ $as_echo_n "checking whether to use debuginfod... " >&6; }
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_debuginfod" >&5
 $as_echo "$with_debuginfod" >&6; }
 
+have_libdebuginfod=0
 if test "x$with_debuginfod" != xno; then
 
 pkg_failed=no
@@ -12458,35 +12459,96 @@ fi
 	# Put the nasty error message in config.log where it belongs
 	echo "$DEBUGINFOD_PKG_ERRORS" >&5
 
-	if test "x$with_debuginfod" = xyes; then
-   as_fn_error $? "\"--with-debuginfod was given, but libdebuginfod is missing or unusable.\"" "$LINENO" 5
- else
-   { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: libdebuginfod is missing or unusable; some features may be unavailable." >&5
-$as_echo "$as_me: WARNING: libdebuginfod is missing or unusable; some features may be unavailable." >&2;}
- fi
+	as_fn_error $? "Package requirements (libdebuginfod >= 0.179) were not met:
+
+$DEBUGINFOD_PKG_ERRORS
+
+Consider adjusting the PKG_CONFIG_PATH environment variable if you
+installed software in a non-standard prefix.
+
+Alternatively, you may set the environment variables DEBUGINFOD_CFLAGS
+and DEBUGINFOD_LIBS to avoid the need to call pkg-config.
+See the pkg-config man page for more details." "$LINENO" 5
 elif test $pkg_failed = untried; then
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
 $as_echo "no" >&6; }
-	if test "x$with_debuginfod" = xyes; then
-   as_fn_error $? "\"--with-debuginfod was given, but libdebuginfod is missing or unusable.\"" "$LINENO" 5
- else
-   { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: libdebuginfod is missing or unusable; some features may be unavailable." >&5
-$as_echo "$as_me: WARNING: libdebuginfod is missing or unusable; some features may be unavailable." >&2;}
- fi
+	{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+as_fn_error $? "The pkg-config script could not be found or is too old.  Make sure it
+is

Fwd: [PATCH 00/29] rs6000: Auto-generate builtins from descriptions [V2]

2020-07-27 Thread Bill Schmidt via Gcc-patches

I apologize for the useless "From" address (and lack of "Reply-To" 
address) on this patch series.  My usual machine is down for 
maintenance, so I ended up sending this from my laptop, which was 
clearly not configured well enough for that.  My bad.  I won't do that 
again.


Meantime, please reply to wschm...@linux.ibm.com for this patch series.

Thanks!

Bill


 Forwarded Message 
Subject: 	[PATCH 00/29] rs6000: Auto-generate builtins from descriptions 
[V2]

Date:   Mon, 27 Jul 2020 09:13:46 -0500
From:   Bill Schmidt 
To: gcc-patches@gcc.gnu.org
CC: 	seg...@kernel.crashing.org, dje@gmail.com, Bill Schmidt 





From: Bill Schmidt 

This is a slight reworking of the patches posted on June 17. I have
made a couple of improvements, but the general arrangement of the patches
is the same as before. Two major things to call out:

- I've introduced a uniform set of parsing error codes to make it easier
to follow some of the logic when certain conditions occur during parsing.

- I reorganized the treatment of built-in stanzas. Before, the stanza
conditions were checked prior to initializing entries in the built-in
table. Now, all built-ins are initialized, but we check the conditions
at expand time to determine whether they should be enabled. This
addresses a frequent problem we have with the existing methods, where
"#pragma target" doesn't work as expected when changing the target
CPU for a single function.
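
The second point can be pictured with a small sketch. Everything in it is hypothetical (the demo_* types and demo_bif_enabled do not come from the patches); only the stanza name ieee128-hw and the builtin name appear in the series. The idea shown is that every built-in is registered unconditionally and its stanza condition is evaluated against whatever target state is current when a call is expanded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stanza conditions.  "ieee128-hw" is a stanza name used
   in the series; the enum itself is invented for illustration.  */
enum demo_stanza
{
  DEMO_STANZA_ALWAYS,
  DEMO_STANZA_IEEE128_HW
};

/* Hypothetical table entry: always created, whatever the target.  */
struct demo_bif
{
  const char *name;
  enum demo_stanza stanza;
};

/* Hypothetical per-function target state, for example after a target
   pragma or attribute changed the CPU.  */
struct demo_target
{
  int ieee128_hw;
};

/* Evaluate the stanza condition at expand time instead of at
   initialization time.  */
int
demo_bif_enabled (const struct demo_bif *bif, const struct demo_target *target)
{
  assert (bif != NULL && target != NULL);
  switch (bif->stanza)
    {
    case DEMO_STANZA_ALWAYS:
      return 1;
    case DEMO_STANZA_IEEE128_HW:
      return target->ieee128_hw != 0;
    }
  return 0;
}
```

Because the check runs per call, the same table entry can be rejected in one function and accepted in another whose target settings differ.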

As described before, the current built-in support in the rs6000 back end
requires at least a master's degree in spelunking to comprehend. It's
full of cruft, redundancy, and unused bits of code, and long overdue for a
replacement. This is the first part of my project to do that.

My intent is to make adding new built-in functions as simple as adding a
few lines to a couple of files, and automatically generating as much of
the initialization, overload resolution, and expansion logic as possible.
This patch series establishes the format of the input files and creates
a new program (rs6000-gen-builtins) to:

* Parse the input files into an internal representation;
* Generate a file of #defines (rs6000-vecdefines.h) for eventual
inclusion into altivec.h; and
* Generate an initialization file to create and initialize tables of
built-in functions and overloads.

Patches 1, 3-7, and 9-19 contain the logic for rs6000-gen-builtins.
Patch 8 provides balanced tree search support for parsing scalability.
Patches 2 and 21-27 provide a first cut at the input files.
Patch 20 incorporates the new code into the GCC build.
Patch 28 adds comments to some existing files that will help during the
transition from the previous built-in mechanism.
Patch 29 turns on the initialization logic, while leaving GCC's behavior
unchanged otherwise.

The patch series is constructed so that any prefix set of the patches
can be upstreamed without breaking anything. There's still plenty of
work left, but I think it will be helpful to get this big chunk of
patches upstream to make further progress easier (translation: avoid
complex rebases like the one I just went through :-).

Following is some additional information about the present and future
design that may be of help.

The set of patches submitted upstream so far does the relatively
straightforward work of reading builtin descriptions from flat files and
generating initialization code for builtin and overload tables. It also
generates an include file meant to be included in altivec.h, which produces
the #defines that map vec_* to __builtin_* functions for external 
consumption.


Data structures are automatically initialized in rs6000_builtins.c:
rs6000_autoinit_builtins. Initialized data structures are:

- rs6000_gen_builtins: An enumeration of builtin identifiers, such as
RS6000_BIF_CPU_SUPPORTS. These names are deliberately different from the
existing builtin identifiers so they can co-exist for a while.

- rs6000_gen_overloads: An enumeration of overload identifiers, such as
RS6000_OVLD_MAX. Again, deliberately different from the old names. Right
now I have the two enumerations using nonoverlapping numbers. This is
because the two tables were part of one table in the old design, and I
haven't yet proven for sure that I can separate them without problems.
I think that I can, in which case I will have both enumerations start from
zero.

- A number of filescope variables representing TREE_TYPEs of functions.
These are named _ftype__ ... _ and
initialized as tree lists from the prototypes in the input files. The
naming scheme for types is described in the code.

- rs6000_builtin_info_x: An array indexed by rs6000_gen_builtins containing
all the fun stuff for each builtin. The "_x" is because we already have
rs6000_builtin_info as the old table, and they need to coexist for a while.

- rs6000_overload_info: An array indexed by rs6000_gen_overloads containing
all the fun stuff for each overload.

- bif_hash: A hash table mapping builtin function names to pointers

Re: [PATCH 8/9] [OpenACC] Fix standalone attach for Fortran assumed-shape array pointers

2020-07-27 Thread Julian Brown

On Fri, 17 Jul 2020 13:16:11 +0200
Thomas Schwinge  wrote:

> Hi Julian, Tobias!
> 
> On 2020-07-15T12:28:42+0200, Thomas Schwinge
>  wrote:
> > On 2020-07-14T13:43:37+0200, I wrote:  
> >> On 2020-06-16T15:39:44-0700, Julian Brown
> >>  wrote:  
> >>> As mentioned in the blurb for the previous patch, an "attach"
> >>> operation for a Fortran pointer with an array descriptor must
> >>> copy that array descriptor to the target.  
> >>
> >> Heh, I see -- I don't think I had read the OpenACC standard in
> >> that way, but I think I agree your interpretation is fine.
> >>
> >> This does not create some sort of memory leak -- everything
> >> implicitly allocated there will eventually be deallocated again,
> >> right?  
> 
> Unanswered -- but I may now have found this problem, and also found
> "the reverse problem" ('finalize'); see below.

Sorry, I didn't answer this explicitly -- the idea was to pair alloc
(present) and release mappings for the pointed-to data. In that way,
the idea was for the release mapping to perform that deallocation. That
was partly done so that the existing handling in gfc_trans_omp_clauses
could be used for this case without too much disruption to the code --
but actually, after Tobias's reorganisation of that function, that's
not really so much of an issue any more.

You can still get a "leak" if you try to attach a synthesized/temporary
array descriptor that goes out of scope before the pointed-to data it
refers to does -- that's a problem I've mentioned earlier, and is
kind-of unavoidable unless we do some more sophisticated analysis to
diagnose it as user error.

> >>> This patch arranges for that to be so.  
> >>
> >> In response to the new OpenACC/Fortran testcase that I'd submitted
> >> in
> >> <http://mid.mail-archive.com/87wo3co0tm.fsf@euler.schwinge.homeip.net>,
> >> you (Julian) correctly supposed in
> >> ,
> >> that this patch indeed does resolve that testcase, too.  That
> >> wasn't obvious to me.  So, similar to
> >> 'libgomp/testsuite/libgomp.oacc-c-c++-common/pr95270-{1.2}.c',
> >> please include my new OpenACC/Fortran testcase (if that makes
> >> sense to you), and reference PR95270 in the commit log.  
> >
> > My new OpenACC/Fortran testcase got again broken ('libgomp: pointer
> > target not mapped for attach') by Tobias' commit
> > 102502e32ea4e8a75d6b252ba319d09d735d9aa7 "[OpenMP, Fortran] Add
> > structure/derived-type element mapping",
> > <http://mid.mail-archive.com/c5b43e02-d1d5-e7cf-c11c-6daf1e8f33c5@codesourcery.com>.
> >
> > Similar ('libgomp: attempt to attach null pointer') for your new
> > 'libgomp.oacc-fortran/attach-descriptor-1.f90'.
> >
> > (Whether or not 'attach'ing 'NULL' should actually be allowed, is a
> > separate topic for discussion.)
> >
> > So this patch here will (obviously) need to be adapted to what
> > Tobias changed.  
> 
> I see what you pushed in commit
> 39dda0020801045d9a604575b2a2593c05310015 "openacc: Fix standalone
> attach for Fortran assumed-shape array pointers" indeed has become
> much smaller/simpler.  :-)

Yes, thank you.

> (But, (parts of?) Tobias' commit mentioned above (plus commit
> 524862db444b6544c6dc87c5f06f351100ecf50d "Fix goacc/finalize-1.f tree
> dump-scanning for -m32", if applicable) will then also need to be
> backported to releases/gcc-10 branch (once un-frozen).)
> 
> > (Plus my more general questions quoted above and below.)  
> 
> >>> OK?  
> >>
> >> Basically yes (for master and releases/gcc-10 branches), but please
> >> consider the following:
> >>  
> >>> --- a/gcc/fortran/trans-openmp.c
> >>> +++ b/gcc/fortran/trans-openmp.c
> >>> @@ -2573,8 +2573,44 @@ gfc_trans_omp_clauses (stmtblock_t *block,
> >>> gfc_omp_clauses *clauses, }
> >>>   }
> >>> if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))
> >>> -   && n->u.map_op != OMP_MAP_ATTACH
> >>> -   && n->u.map_op != OMP_MAP_DETACH)
> >>> +   && (n->u.map_op == OMP_MAP_ATTACH
> >>> +   || n->u.map_op == OMP_MAP_DETACH))
> >>> + {
> >>> +   tree type = TREE_TYPE (decl);
> >>> +   tree data = gfc_conv_descriptor_data_get
> >>> (decl);
> >>> +   if (present)
> >>> + data = gfc_build_cond_assign_expr
> >>> (block, present,
> >>> +data,
> >>> +
> >>> null_pointer_node);
> >>> +   tree ptr
> >>> + = fold_convert (build_pointer_type
> >>> (char_type_node),
> >>> + data);
> >>> +   ptr = build_fold_indirect_ref (ptr);
> >>> +   /* Standalone attach clauses used with
> >>> arrays with
> >>> +  descriptors must copy the descriptor to
> >>> the target,
> >>> +  else they won't have anything to
> >>> perform

Re: [PATCH 00/29] rs6000: Auto-generate builtins from descriptions [V2]

2020-07-27 Thread Bill Schmidt via Gcc-patches


Just a reminder this patch series exists and wants a review. :-)

Bill

On 7/27/20 9:13 AM, Bill Schmidt wrote:

From: Bill Schmidt 

This is a slight reworking of the patches posted on June 17.  I have
made a couple of improvements, but the general arrangement of the patches
is the same as before.  Two major things to call out:

  - I've introduced a uniform set of parsing error codes to make it easier
to follow some of the logic when certain conditions occur during parsing.

  - I reorganized the treatment of built-in stanzas.  Before, the stanza
conditions were checked prior to initializing entries in the built-in
table.  Now, all built-ins are initialized, but we check the conditions
at expand time to determine whether they should be enabled.  This
addresses a frequent problem we have with the existing methods, where
"#pragma target" doesn't work as expected when changing the target
CPU for a single function.

As described before, the current built-in support in the rs6000 back end
requires at least a master's degree in spelunking to comprehend.  It's
full of cruft, redundancy, and unused bits of code, and long overdue for a
replacement.  This is the first part of my project to do that.

My intent is to make adding new built-in functions as simple as adding a
few lines to a couple of files, and automatically generating as much of
the initialization, overload resolution, and expansion logic as possible.
This patch series establishes the format of the input files and creates
a new program (rs6000-gen-builtins) to:

  * Parse the input files into an internal representation;
  * Generate a file of #defines (rs6000-vecdefines.h) for eventual
inclusion into altivec.h; and
  * Generate an initialization file to create and initialize tables of
built-in functions and overloads.

Patches 1, 3-7, and 9-19 contain the logic for rs6000-gen-builtins.
Patch 8 provides balanced tree search support for parsing scalability.
Patches 2 and 21-27 provide a first cut at the input files.
Patch 20 incorporates the new code into the GCC build.
Patch 28 adds comments to some existing files that will help during the
transition from the previous built-in mechanism.
Patch 29 turns on the initialization logic, while leaving GCC's behavior
unchanged otherwise.

The patch series is constructed so that any prefix set of the patches
can be upstreamed without breaking anything.  There's still plenty of
work left, but I think it will be helpful to get this big chunk of
patches upstream to make further progress easier (translation: avoid
complex rebases like the one I just went through :-).

Following is some additional information about the present and future
design that may be of help.

The set of patches submitted upstream so far does the relatively
straightforward work of reading builtin descriptions from flat files and
generating initialization code for builtin and overload tables.  It also
generates an include file meant to be included in altivec.h, which produces
the #defines that map vec_* to __builtin_* functions for external consumption.

Data structures are automatically initialized in rs6000_builtins.c:
rs6000_autoinit_builtins.  Initialized data structures are:

  - rs6000_gen_builtins:  An enumeration of builtin identifiers, such as
RS6000_BIF_CPU_SUPPORTS.  These names are deliberately different from the
existing builtin identifiers so they can co-exist for a while.

  - rs6000_gen_overloads:  An enumeration of overload identifiers, such as
RS6000_OVLD_MAX.  Again, deliberately different from the old names.  Right
now I have the two enumerations using nonoverlapping numbers.  This is
because the two tables were part of one table in the old design, and I
haven't yet proven for sure that I can separate them without problems.
I think that I can, in which case I will have both enumerations start from
zero.

  - A number of filescope variables representing TREE_TYPEs of functions.
These are named _ftype__ ... _ and
initialized as tree lists from the prototypes in the input files.  The
naming scheme for types is described in the code.

  - rs6000_builtin_info_x:  An array indexed by rs6000_gen_builtins containing
all the fun stuff for each builtin.  The "_x" is because we already have
rs6000_builtin_info as the old table, and they need to coexist for a while.

  - rs6000_overload_info:  An array indexed by rs6000_gen_overloads containing
all the fun stuff for each overload.

  - bif_hash:  A hash table mapping builtin function names to pointers to
their rs6000_builtin_info_x entries.

  - ovld_hash:  A hash table mapping overload names to pointers to their
rs6000_overload_info entries.
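
The relationship between the enumeration, the info array and the name hash can be sketched in a few lines of C. This is an illustration under assumptions, not the generated code: all demo_* names, the table layout and the hash function are invented here, and only the three built-in names and their patterns are taken from the ieee128-hw stanza in patch 27.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the generated enumeration of builtin identifiers.  */
enum demo_builtin
{
  DEMO_BIF_ADDF128_ODD,
  DEMO_BIF_DIVF128_ODD,
  DEMO_BIF_MULF128_ODD,
  DEMO_BIF_MAX
};

/* Stand-in for one entry of the info array.  */
struct demo_builtin_info
{
  const char *name;    /* user-visible builtin name */
  const char *pattern; /* insn pattern name */
};

/* Array indexed by the enumeration.  */
static const struct demo_builtin_info demo_info[DEMO_BIF_MAX] = {
  [DEMO_BIF_ADDF128_ODD] = { "__builtin_vsx_addf128_round_to_odd", "addkf3_odd" },
  [DEMO_BIF_DIVF128_ODD] = { "__builtin_vsx_divf128_round_to_odd", "divkf3_odd" },
  [DEMO_BIF_MULF128_ODD] = { "__builtin_vsx_mulf128_round_to_odd", "mulkf3_odd" },
};

/* Open-addressing hash from names to info entries.  The size is a
   power of two larger than DEMO_BIF_MAX, so a free slot always
   exists and every probe sequence terminates.  */
#define DEMO_HASH_SIZE 8
static const struct demo_builtin_info *demo_hash[DEMO_HASH_SIZE];

static unsigned
demo_hash_name (const char *s)
{
  unsigned h = 5381;
  while (*s)
    h = h * 33 + (unsigned char) *s++;
  return h & (DEMO_HASH_SIZE - 1);
}

/* Counterpart of the auto-generated init function: fill the hash.  */
void
demo_autoinit_builtins (void)
{
  for (int i = 0; i < DEMO_HASH_SIZE; i++)
    demo_hash[i] = NULL;
  for (int i = 0; i < DEMO_BIF_MAX; i++)
    {
      unsigned slot = demo_hash_name (demo_info[i].name);
      while (demo_hash[slot] != NULL)
        slot = (slot + 1) & (DEMO_HASH_SIZE - 1);
      demo_hash[slot] = &demo_info[i];
    }
}

/* Look a builtin up by name; returns NULL when it is unknown.  */
const struct demo_builtin_info *
demo_lookup (const char *name)
{
  assert (name != NULL);
  unsigned slot = demo_hash_name (name);
  while (demo_hash[slot] != NULL)
    {
      if (strcmp (demo_hash[slot]->name, name) == 0)
        return demo_hash[slot];
      slot = (slot + 1) & (DEMO_HASH_SIZE - 1);
    }
  return NULL;
}
```

A caller would run demo_autoinit_builtins once and then resolve names with demo_lookup; an overload table would follow the same pattern with its own enumeration and hash.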

The new initialization code is called from rs6000_init_builtins. Currently
this function continues to do its existing initialization work, but also
initializes the new tables (and then ignores them).

The old initialization code contains a lot of ad hoc hackery to handle
different kinds of functions that

Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Stefan Schulze Frielinghaus via Gcc-patches

On Mon, Jul 27, 2020 at 12:29:11PM +0200, Richard Biener wrote:
> On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
>  wrote:
> >
> > Richard Biener  writes:
> > > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> > >  wrote:
> > >>
> > >> Richard Biener via Gcc-patches  writes:
> > >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> > >> > Gcc-patches  wrote:
> > >> >>
> > >> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> > >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> > >> >>
> > >> >> In case that an alignment constraint is less than the size of a
> > >> >> corresponding scalar type, ensure that we advance at least by one
> > >> >> iteration.  For example, on s390x we have for a long double an 
> > >> >> alignment
> > >> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> > >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> > >> >> can be reproduced by the following MWE:
> > >> >
> > >> > But we guard this case with vector_alignment_reachable_p, so we 
> > >> > shouldn't
> > >> > have ended up here and the patch looks bogus.
> > >>
> > >> The above sounds like it ought to count as reachable alignment though.
> > >> If a type requires a lower alignment than its size, then that's even
> > >> more easily reachable than a type that requires the same alignment as
> > >> the size.  I guess at one extreme, a target alignment of 1 is always
> > >> reachable.
> > >
> > > Well, if the element alignment is 8 but its size is 16 then when 
> > > presumably
> > > the desired vector alignment is a multiple of 16 we can never reach it.
> > > Isn't this the case here?
> >
> > If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
> > TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> > fixing wouldn't occur.  I agree that we might never be able to reach
> > that alignment if the pointer starts out misaligned by 8 bytes.
> >
> > But I think that's why it makes sense for the target to only ask
> > for 8-byte alignment for vectors too, if it can cope with it.  8-byte
> > alignment should always be achievable if the scalars are ABI-aligned.
> > And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> > DR_SIZE would be zero and the loop would never progress, which is the
> > problem that the patch is fixing.
> >
> > It would even make sense for the target to ask for 1-byte alignment,
> > if the target doesn't care about alignment at all.
> 
> Hmm, OK.  Guess I still think we should detect this somewhere upward
> and avoid this peeling compute at all.  Somehow.

I've been playing around with another solution which works for me by
changing vector_alignment_reachable_p to return also false if the
alignment requirements are already satisfied, i.e., by adding:

if (known_alignment_for_access_p (dr_info) && aligned_access_p (dr_info))
  return false;

Though, I'm not entirely sure whether this makes it better or not.
Strictly speaking if the alignment was reachable before peeling, then
reaching alignment with peeling is also possible but probably not what
was intended.  So I guess returning false in this case is sensible.  Any
comments?

Thanks,
Stefan

> 
> Richard.
> 
> > Thanks,
> > Richard
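
The arithmetic behind the hang is easy to reproduce outside the vectorizer. The sketch below is a toy model, not GCC code: naive_step, clamped_step and count_candidates are hypothetical names, and the bounded walk merely stands in for the loop over candidate peeling amounts. With the s390x long double numbers from the thread (alignment 8, size 16) the integer division yields a step of zero.

```c
#include <assert.h>

/* Elements to advance per step when computed as TARGET_ALIGN / DR_SIZE.
   Integer division gives 0 when the alignment is below the size.  */
unsigned
naive_step (unsigned target_align, unsigned dr_size)
{
  assert (dr_size > 0);
  return target_align / dr_size;
}

/* The idea of the proposed fix: advance by at least one iteration.  */
unsigned
clamped_step (unsigned target_align, unsigned dr_size)
{
  unsigned step = naive_step (target_align, dr_size);
  return step > 0 ? step : 1;
}

/* Walk the candidates 0, step, 2*step, ... below LIMIT and count them.
   Returns -1 for a zero step, where the walk would never terminate.  */
int
count_candidates (unsigned step, unsigned limit)
{
  if (step == 0)
    return -1;
  int n = 0;
  for (unsigned i = 0; i < limit; i += step)
    n++;
  return n;
}
```

For alignment 8 and size 16, naive_step returns 0 and the walk is flagged as non-terminating, while clamped_step returns 1 and the walk visits every candidate. Whether the clamp or an earlier bail-out in vector_alignment_reachable_p is the better fix is the open question in the thread.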

[PATCH 28/29] rs6000: Add comments to help with transition

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin.def: Add comments.
* config/rs6000/rs6000-call.c: Likewise.
---
 gcc/config/rs6000/rs6000-builtin.def |  15 +++
 gcc/config/rs6000/rs6000-call.c  | 166 +++
 2 files changed, 181 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index f7037552faf..83b31cbbbce 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1783,8 +1783,10 @@ BU_VSX_3 (XXPERMDI_4SF,   "xxpermdi_4sf",   CONST,   
vsx_xxpermdi_v4sf)
 BU_VSX_3 (XXPERMDI_4SI,   "xxpermdi_4si",   CONST, 
vsx_xxpermdi_v4si)
 BU_VSX_3 (XXPERMDI_8HI,   "xxpermdi_8hi",   CONST, 
vsx_xxpermdi_v8hi)
 BU_VSX_3 (XXPERMDI_16QI,  "xxpermdi_16qi",  CONST, 
vsx_xxpermdi_v16qi)
+/* Following one needs an _UNS version.  */
 BU_VSX_3 (SET_1TI,"set_1ti",CONST, vsx_set_v1ti)
 BU_VSX_3 (SET_2DF,"set_2df",CONST, vsx_set_v2df)
+/* Following one needs an _UNS version.  */
 BU_VSX_3 (SET_2DI,"set_2di",CONST, vsx_set_v2di)
 BU_VSX_3 (XXSLDWI_2DI,"xxsldwi_2di",CONST, 
vsx_xxsldwi_v2di)
 BU_VSX_3 (XXSLDWI_2DF,"xxsldwi_2df",CONST, 
vsx_xxsldwi_v2df)
@@ -1828,8 +1830,11 @@ BU_VSX_2 (CPSGNDP, "cpsgndp",CONST,  
vector_copysignv2df3)
 BU_VSX_2 (CPSGNSP,   "cpsgnsp",CONST,  vector_copysignv4sf3)
 
 BU_VSX_2 (CONCAT_2DF,"concat_2df", CONST,  vsx_concat_v2df)
+/* Following one needs an _UNS version.  */
 BU_VSX_2 (CONCAT_2DI,"concat_2di", CONST,  vsx_concat_v2di)
+/* Following two should be unary functions?!  */
 BU_VSX_2 (SPLAT_2DF, "splat_2df",  CONST,  vsx_splat_v2df)
+/* Following one also needs an _UNS version.  */
 BU_VSX_2 (SPLAT_2DI, "splat_2di",  CONST,  vsx_splat_v2di)
 BU_VSX_2 (XXMRGHW_4SF,   "xxmrghw",CONST,  vsx_xxmrghw_v4sf)
 BU_VSX_2 (XXMRGHW_4SI,   "xxmrghw_4si",CONST,  vsx_xxmrghw_v4si)
@@ -2009,6 +2014,12 @@ BU_VSX_X (ST_ELEMREV_V4SF,"st_elemrev_v4sf",  MEM)
 BU_VSX_X (ST_ELEMREV_V4SI,"st_elemrev_v4si",  MEM)
 BU_VSX_X (ST_ELEMREV_V8HI,"st_elemrev_v8hi",  MEM)
 BU_VSX_X (ST_ELEMREV_V16QI,   "st_elemrev_v16qi", MEM)
+/*  Following builtins appear to be orphaned with no implementation.
+   They are all marked as requiring special handling, but there is no
+   special handling for them.  For now I am not sure we really miss
+   anything by not having these; they all have regular float counterparts
+   and there is little utility in forcing these with builtins.  No plan
+   to put them in rs6000-builtins-new.def.  */
 BU_VSX_X (XSABSDP,   "xsabsdp",CONST)
 BU_VSX_X (XSADDDP,   "xsadddp",FP)
 BU_VSX_X (XSCMPODP,  "xscmpodp",   FP)
@@ -2033,6 +2044,7 @@ BU_VSX_X (XSNMADDMDP,   "xsnmaddmdp", FP)
 BU_VSX_X (XSNMSUBADP,"xsnmsubadp", FP)
 BU_VSX_X (XSNMSUBMDP,"xsnmsubmdp", FP)
 BU_VSX_X (XSSUBDP,   "xssubdp",FP)
+/*  End orphaned builtins.  */
 BU_VSX_X (VEC_INIT_V1TI,  "vec_init_v1ti", CONST)
 BU_VSX_X (VEC_INIT_V2DF,  "vec_init_v2df", CONST)
 BU_VSX_X (VEC_INIT_V2DI,  "vec_init_v2di", CONST)
@@ -2394,6 +2406,7 @@ BU_P9V_VSX_2 (VSCEDPLT,   "scalar_cmp_exp_dp_lt", CONST,  
xscmpexpdp_lt)
 BU_P9V_VSX_2 (VSCEDPEQ,"scalar_cmp_exp_dp_eq", CONST,  xscmpexpdp_eq)
 BU_P9V_VSX_2 (VSCEDPUO,"scalar_cmp_exp_dp_unordered",  CONST,  
xscmpexpdp_unordered)
 
+/* Shouldn't these be BU_FLOAT128_HW_VSX_2?  */
 BU_P9V_VSX_2 (VSCEQPGT,"scalar_cmp_exp_qp_gt", CONST,  
xscmpexpqp_gt_kf)
 BU_P9V_VSX_2 (VSCEQPLT,"scalar_cmp_exp_qp_lt", CONST,  
xscmpexpqp_lt_kf)
 BU_P9V_VSX_2 (VSCEQPEQ,"scalar_cmp_exp_qp_eq", CONST,  
xscmpexpqp_eq_kf)
@@ -2500,6 +2513,8 @@ BU_FLOAT128_HW_3 (FMAF128_ODD,   "fmaf128_round_to_odd",  
 FP, fmakf4_odd)
 
 /* 3 argument vector functions returning void, treated as SPECIAL,
added in ISA 3.0 (power9).  */
+/* ??? Why are these named __builtin_altivec_* when the corresponding
+   load builtins are named __builtin_vsx_*??  */
 BU_P9V_64BIT_AV_X (STXVL,  "stxvl",MISC)
 BU_P9V_64BIT_AV_X (XST_LEN_R,  "xst_len_r",MISC)
 
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 5ec3f2c55ad..70813a3bb9f 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -444,48 +444,56 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V16QI, 0, 0 },
 
   /* Binary AltiVec/VSX builtins.  */
+  /* Next 2 deprecated, not in rs6000-builtin-new.def.  */
   { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
 RS6000_BTI_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_ADD,

[PATCH 29/29] rs6000: Call rs6000_autoinit_builtins from rs6000_builtins

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-call.c (rs6000-builtins.h): New #include.
(rs6000_init_builtins): Call rs6000_autoinit_builtins.
---
 gcc/config/rs6000/rs6000-call.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 70813a3bb9f..032203f5e9a 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -71,6 +71,7 @@
 #include "opts.h"
 
 #include "rs6000-internal.h"
+#include "rs6000-builtins.h"
 
 #if TARGET_MACHO
 #include "gstab.h"  /* for N_SLINE */
@@ -12864,6 +12865,9 @@ rs6000_init_builtins (void)
   pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel",
 pixel_type_node, 8);
 
+  /* Execute the autogenerated initialization code for builtins.  */
+  rs6000_autoinit_builtins ();
+
   /* Create Altivec, VSX and MMA builtins on machines with at least the
  general purpose extensions (970 and newer) to allow the use of
  the target attribute.  */
-- 
2.17.1

[PATCH 27/29] rs6000: Add remaining builtins

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin-new.def: Add ieee128-hw, dfp,
crypto, and htm builtins.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 217 +++
 1 file changed, 217 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 1338f543a6a..ce2e3c2476b 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -2748,3 +2748,220 @@
 XL_LEN_R xl_len_r {}
 
 
+; Builtins requiring hardware support for IEEE-128 floating-point.
+[ieee128-hw]
+  fpmath _Float128 __builtin_vsx_addf128_round_to_odd (_Float128, _Float128);
+ADDF128_ODD addkf3_odd {}
+
+  fpmath _Float128 __builtin_vsx_divf128_round_to_odd (_Float128, _Float128);
+DIVF128_ODD divkf3_odd {}
+
+  fpmath _Float128 __builtin_vsx_fmaf128_round_to_odd (_Float128, _Float128, _Float128);
+FMAF128_ODD fmakf4_odd {}
+
+  fpmath _Float128 __builtin_vsx_mulf128_round_to_odd (_Float128, _Float128);
+MULF128_ODD mulkf3_odd {}
+
+  const unsigned long long __builtin_vsx_scalar_extract_expq (_Float128);
+VSEEQP xsxexpqp_kf {}
+
+  const unsigned __int128 __builtin_vsx_scalar_extract_sigq (_Float128);
+VSESQP xsxsigqp_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
+VSCEQPEQ xscmpexpqp_eq_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
+VSCEQPGT xscmpexpqp_gt_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
+VSCEQPLT xscmpexpqp_lt_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
+VSCEQPUO xscmpexpqp_unordered_kf {}
+
+  const _Float128 __builtin_vsx_scalar_insert_exp_q (unsigned __int128, unsigned long long);
+VSIEQP xsiexpqp_kf {}
+
+  const _Float128 __builtin_vsx_scalar_insert_exp_qp (_Float128, unsigned long long);
+VSIEQPF xsiexpqpf_kf {}
+
+  const unsigned int __builtin_vsx_scalar_test_data_class_qp (_Float128, signed int);
+VSTDCQP xststdcqp_kf {}
+
+  const unsigned int __builtin_vsx_scalar_test_neg_qp (_Float128);
+VSTDCNQP xststdcnegqp_kf {}
+
+  fpmath _Float128 __builtin_vsx_sqrtf128_round_to_odd (_Float128);
+SQRTF128_ODD sqrtkf2_odd {}
+
+  fpmath _Float128 __builtin_vsx_subf128_round_to_odd (_Float128, _Float128);
+SUBF128_ODD subkf3_odd {}
+
+  fpmath double __builtin_vsx_truncf128_round_to_odd (_Float128);
+TRUNCF128_ODD trunckfdf2_odd {}
+
+
+
+; Decimal floating-point builtins.
+[dfp]
+  const _Decimal64 __builtin_ddedpd (const int<0,2>, _Decimal64);
+DDEDPD dfp_ddedpd_dd {}
+
+  const _Decimal128 __builtin_ddedpdq (const int<0,2>, _Decimal128);
+DDEDPDQ dfp_ddedpd_td {}
+
+  const _Decimal64 __builtin_denbcd (const int<1>, _Decimal64);
+DENBCD dfp_denbcd_dd {}
+
+  const _Decimal128 __builtin_denbcdq (const int<1>, _Decimal128);
+DENBCDQ dfp_denbcd_td {}
+
+  const _Decimal64 __builtin_diex (signed long long, _Decimal64);
+DIEX dfp_diex_dd {}
+
+  const _Decimal128 __builtin_diexq (signed long long, _Decimal128);
+DIEXQ dfp_diex_td {}
+
+  const _Decimal64 __builtin_dscli (_Decimal64, const int<6>);
+DSCLI dfp_dscli_dd {}
+
+  const _Decimal128 __builtin_dscliq (_Decimal128, const int<6>);
+DSCLIQ dfp_dscli_td {}
+
+  const _Decimal64 __builtin_dscri (_Decimal64, const int<6>);
+DSCRI dfp_dscri_dd {}
+
+  const _Decimal128 __builtin_dscriq (_Decimal128, const int<6>);
+DSCRIQ dfp_dscri_td {}
+
+  const signed long long __builtin_dxex (_Decimal64);
+DXEX dfp_dxex_dd {}
+
+  const signed long long __builtin_dxexq (_Decimal128);
+DXEXQ dfp_dxex_td {}
+
+  const _Decimal128 __builtin_pack_dec128 (unsigned long long, unsigned long long);
+PACK_TD packtd {}
+
+  void __builtin_set_fpscr_drn (signed int);
+SET_FPSCR_DRN rs6000_set_fpscr_drn {}
+
+  const unsigned long long __builtin_unpack_dec128 (_Decimal128, const int<1>);
+UNPACK_TD unpacktd {}
+
+
+[crypto]
+  const vull __builtin_crypto_vcipher (vull, vull);
+VCIPHER crypto_vcipher_v2di {}
+
+  const vuc __builtin_crypto_vcipher_be (vuc, vuc);
+VCIPHER_BE crypto_vcipher_v16qi {}
+
+  const vull __builtin_crypto_vcipherlast (vull, vull);
+VCIPHERLAST crypto_vcipherlast_v2di {}
+
+  const vuc __builtin_crypto_vcipherlast_be (vuc, vuc);
+VCIPHERLAST_BE crypto_vcipherlast_v16qi {}
+
+  const vull __builtin_crypto_vncipher (vull, vull);
+VNCIPHER crypto_vncipher_v2di {}
+
+  const vuc __builtin_crypto_vncipher_be (vuc, vuc);
+VNCIPHER_BE crypto_vncipher_v16qi {}
+
+  const vull __builtin_crypto_vncipherlast (vull, vull);
+VNCIPHERLAST crypto_vncipherlast_v2di {}
+
+  const vuc __builtin_crypto_vncipherlast_be (vuc, vuc);
+VNCIPHERLAST_BE crypto_vncipherlast_v16qi {}
+
+  const vull __builtin_crypto_vsbox (vull);
+VSBOX crypto_vsbox_v2di {}
+
+  const vuc __builtin_crypto_vsbox_be (vuc);
+

[PATCH 25/29] rs6000: Add Power8 vector builtins

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin-new.def: Add power8-vector
builtins.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 417 +++
 1 file changed, 417 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 0a17cad446c..2f918c1d69e 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -1977,3 +1977,420 @@
 DIVDEU diveu_di {}
 
 
+; Power8 vector built-ins.
+[power8-vector]
+  const vsll __builtin_altivec_abs_v2di (vsll);
+ABS_V2DI absv2di2 {}
+
+  const vsc __builtin_altivec_eqv_v16qi (vsc, vsc);
+EQV_V16QI eqvv16qi3 {}
+
+  const vuc __builtin_altivec_eqv_v16qi_uns (vuc, vuc);
+EQV_V16QI_UNS eqvv16qi3 {}
+
+  const vsq __builtin_altivec_eqv_v1ti (vsq, vsq);
+EQV_V1TI eqvv1ti3 {}
+
+  const vuq __builtin_altivec_eqv_v1ti_uns (vuq, vuq);
+EQV_V1TI_UNS eqvv1ti3 {}
+
+  const vd __builtin_altivec_eqv_v2df (vd, vd);
+EQV_V2DF eqvv2df3 {}
+
+  const vsll __builtin_altivec_eqv_v2di (vsll, vsll);
+EQV_V2DI eqvv2di3 {}
+
+  const vull __builtin_altivec_eqv_v2di_uns (vull, vull);
+EQV_V2DI_UNS eqvv2di3 {}
+
+  const vf __builtin_altivec_eqv_v4sf (vf, vf);
+EQV_V4SF eqvv4sf3 {}
+
+  const vsi __builtin_altivec_eqv_v4si (vsi, vsi);
+EQV_V4SI eqvv4si3 {}
+
+  const vui __builtin_altivec_eqv_v4si_uns (vui, vui);
+EQV_V4SI_UNS eqvv4si3 {}
+
+  const vss __builtin_altivec_eqv_v8hi (vss, vss);
+EQV_V8HI eqvv8hi3 {}
+
+  const vus __builtin_altivec_eqv_v8hi_uns (vus, vus);
+EQV_V8HI_UNS eqvv8hi3 {}
+
+  const vsc __builtin_altivec_nand_v16qi (vsc, vsc);
+NAND_V16QI nandv16qi3 {}
+
+  const vuc __builtin_altivec_nand_v16qi_uns (vuc, vuc);
+NAND_V16QI_UNS nandv16qi3 {}
+
+  const vsq __builtin_altivec_nand_v1ti (vsq, vsq);
+NAND_V1TI nandv1ti3 {}
+
+  const vuq __builtin_altivec_nand_v1ti_uns (vuq, vuq);
+NAND_V1TI_UNS nandv1ti3 {}
+
+  const vd __builtin_altivec_nand_v2df (vd, vd);
+NAND_V2DF nandv2df3 {}
+
+  const vsll __builtin_altivec_nand_v2di (vsll, vsll);
+NAND_V2DI nandv2di3 {}
+
+  const vull __builtin_altivec_nand_v2di_uns (vull, vull);
+NAND_V2DI_UNS nandv2di3 {}
+
+  const vf __builtin_altivec_nand_v4sf (vf, vf);
+NAND_V4SF nandv4sf3 {}
+
+  const vsi __builtin_altivec_nand_v4si (vsi, vsi);
+NAND_V4SI nandv4si3 {}
+
+  const vui __builtin_altivec_nand_v4si_uns (vui, vui);
+NAND_V4SI_UNS nandv4si3 {}
+
+  const vss __builtin_altivec_nand_v8hi (vss, vss);
+NAND_V8HI nandv8hi3 {}
+
+  const vus __builtin_altivec_nand_v8hi_uns (vus, vus);
+NAND_V8HI_UNS nandv8hi3 {}
+
+  const vsc __builtin_altivec_neg_v16qi (vsc);
+NEG_V16QI negv16qi2 {}
+
+  const vd __builtin_altivec_neg_v2df (vd);
+NEG_V2DF negv2df2 {}
+
+  const vsll __builtin_altivec_neg_v2di (vsll);
+NEG_V2DI negv2di2 {}
+
+  const vf __builtin_altivec_neg_v4sf (vf);
+NEG_V4SF negv4sf2 {}
+
+  const vsi __builtin_altivec_neg_v4si (vsi);
+NEG_V4SI negv4si2 {}
+
+  const vss __builtin_altivec_neg_v8hi (vss);
+NEG_V8HI negv8hi2 {}
+
+  const vsc __builtin_altivec_orc_v16qi (vsc, vsc);
+ORC_V16QI orcv16qi3 {}
+
+  const vuc __builtin_altivec_orc_v16qi_uns (vuc, vuc);
+ORC_V16QI_UNS orcv16qi3 {}
+
+  const vsq __builtin_altivec_orc_v1ti (vsq, vsq);
+ORC_V1TI orcv1ti3 {}
+
+  const vuq __builtin_altivec_orc_v1ti_uns (vuq, vuq);
+ORC_V1TI_UNS orcv1ti3 {}
+
+  const vd __builtin_altivec_orc_v2df (vd, vd);
+ORC_V2DF orcv2df3 {}
+
+  const vsll __builtin_altivec_orc_v2di (vsll, vsll);
+ORC_V2DI orcv2di3 {}
+
+  const vull __builtin_altivec_orc_v2di_uns (vull, vull);
+ORC_V2DI_UNS orcv2di3 {}
+
+  const vf __builtin_altivec_orc_v4sf (vf, vf);
+ORC_V4SF orcv4sf3 {}
+
+  const vsi __builtin_altivec_orc_v4si (vsi, vsi);
+ORC_V4SI orcv4si3 {}
+
+  const vui __builtin_altivec_orc_v4si_uns (vui, vui);
+ORC_V4SI_UNS orcv4si3 {}
+
+  const vss __builtin_altivec_orc_v8hi (vss, vss);
+ORC_V8HI orcv8hi3 {}
+
+  const vus __builtin_altivec_orc_v8hi_uns (vus, vus);
+ORC_V8HI_UNS orcv8hi3 {}
+
+  const vsc __builtin_altivec_vclzb (vsc);
+VCLZB clzv16qi2 {}
+
+  const vsll __builtin_altivec_vclzd (vsll);
+VCLZD clzv2di2 {}
+
+  const vss __builtin_altivec_vclzh (vss);
+VCLZH clzv8hi2 {}
+
+  const vsi __builtin_altivec_vclzw (vsi);
+VCLZW clzv4si2 {}
+
+  const vsc __builtin_altivec_vgbbd (vsc);
+VGBBD p8v_vgbbd {}
+
+  const vsq __builtin_altivec_vaddcuq (vsq, vsq);
+VADDCUQ altivec_vaddcuq {}
+
+  const vsq __builtin_altivec_vaddecuq (vsq, vsq, vsq);
+VADDECUQ altivec_vaddecuq {}
+
+  const vuq __builtin_altivec_vaddeuqm (vuq, vuq, vuq);
+VADDEUQM altivec_vaddeuqm {}
+
+  const vsll __builtin_altivec_vaddudm (vsll, vsll);
+VADDUDM addv2di3 {}
+
+  const vsq __builtin_altivec_vadduqm (vsq, vsq);
+VADDUQM altivec_vadduqm {}
+
+  const vsll __builtin_altivec_vbpermq (vop, vsc);
+

[PATCH 26/29] rs6000: Add Power9 builtins

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin-new.def: Add power9,
power9-vector, and power9-64 builtins.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 354 +++
 1 file changed, 354 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 2f918c1d69e..1338f543a6a 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -2394,3 +2394,357 @@
 XSCVSPDPN vsx_xscvspdpn {}
 
 
+; Power9 vector builtins.
+[power9-vector]
+  const vus __builtin_altivec_convert_4f32_8i16 (vf, vf);
+CONVERT_4F32_8I16 convert_4f32_8i16 {}
+
+  const unsigned int __builtin_altivec_first_match_index_v16qi (vsc, vsc);
+VFIRSTMATCHINDEX_V16QI first_match_index_v16qi {}
+
+  const unsigned int __builtin_altivec_first_match_index_v8hi (vss, vss);
+VFIRSTMATCHINDEX_V8HI first_match_index_v8hi {}
+
+  const unsigned int __builtin_altivec_first_match_index_v4si (vsi, vsi);
+VFIRSTMATCHINDEX_V4SI first_match_index_v4si {}
+
+  const unsigned int __builtin_altivec_first_match_or_eos_index_v16qi (vsc, vsc);
+VFIRSTMATCHOREOSINDEX_V16QI first_match_or_eos_index_v16qi {}
+
+  const unsigned int __builtin_altivec_first_match_or_eos_index_v8hi (vss, vss);
+VFIRSTMATCHOREOSINDEX_V8HI first_match_or_eos_index_v8hi {}
+
+  const unsigned int __builtin_altivec_first_match_or_eos_index_v4si (vsi, vsi);
+VFIRSTMATCHOREOSINDEX_V4SI first_match_or_eos_index_v4si {}
+
+  const unsigned int __builtin_altivec_first_mismatch_index_v16qi (vsc, vsc);
+VFIRSTMISMATCHINDEX_V16QI first_mismatch_index_v16qi {}
+
+  const unsigned int __builtin_altivec_first_mismatch_index_v8hi (vss, vss);
+VFIRSTMISMATCHINDEX_V8HI first_mismatch_index_v8hi {}
+
+  const unsigned int __builtin_altivec_first_mismatch_index_v4si (vsi, vsi);
+VFIRSTMISMATCHINDEX_V4SI first_mismatch_index_v4si {}
+
+  const unsigned int __builtin_altivec_first_mismatch_or_eos_index_v16qi (vsc, vsc);
+VFIRSTMISMATCHOREOSINDEX_V16QI first_mismatch_or_eos_index_v16qi {}
+
+  const unsigned int __builtin_altivec_first_mismatch_or_eos_index_v8hi (vss, vss);
+VFIRSTMISMATCHOREOSINDEX_V8HI first_mismatch_or_eos_index_v8hi {}
+
+  const unsigned int __builtin_altivec_first_mismatch_or_eos_index_v4si (vsi, vsi);
+VFIRSTMISMATCHOREOSINDEX_V4SI first_mismatch_or_eos_index_v4si {}
+
+  const vuc __builtin_altivec_vadub (vuc, vuc);
+VADUB vaduv16qi3 {}
+
+  const vus __builtin_altivec_vaduh (vus, vus);
+VADUH vaduv8hi3 {}
+
+  const vui __builtin_altivec_vaduw (vui, vui);
+VADUW vaduv4si3 {}
+
+  const vull __builtin_altivec_vbpermd (vull, vuc);
+VBPERMD altivec_vbpermd {}
+
+  const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
+VCLZLSBB_V16QI vclzlsbb_v16qi {}
+
+  const signed int __builtin_altivec_vclzlsbb_v4si (vsi);
+VCLZLSBB_V4SI vclzlsbb_v4si {}
+
+  const signed int __builtin_altivec_vclzlsbb_v8hi (vss);
+VCLZLSBB_V8HI vclzlsbb_v8hi {}
+
+  const vsc __builtin_altivec_vctzb (vsc);
+VCTZB ctzv16qi2 {}
+
+  const vsll __builtin_altivec_vctzd (vsll);
+VCTZD ctzv2di2 {}
+
+  const vss __builtin_altivec_vctzh (vss);
+VCTZH ctzv8hi2 {}
+
+  const vsi __builtin_altivec_vctzw (vsi);
+VCTZW ctzv4si2 {}
+
+  const signed int __builtin_altivec_vctzlsbb_v16qi (vsc);
+VCTZLSBB_V16QI vctzlsbb_v16qi {}
+
+  const signed int __builtin_altivec_vctzlsbb_v4si (vsi);
+VCTZLSBB_V4SI vctzlsbb_v4si {}
+
+  const signed int __builtin_altivec_vctzlsbb_v8hi (vss);
+VCTZLSBB_V8HI vctzlsbb_v8hi {}
+
+  const signed int __builtin_altivec_vcmpaeb_p (vsc, vsc);
+VCMPAEB_P vector_ae_v16qi_p {pred}
+
+  const signed int __builtin_altivec_vcmpaed_p (vsll, vsll);
+VCMPAED_P vector_ae_v2di_p {pred}
+
+  const signed int __builtin_altivec_vcmpaedp_p (vd, vd);
+VCMPAEDP_P vector_ae_v2df_p {pred}
+
+  const signed int __builtin_altivec_vcmpaefp_p (vf, vf);
+VCMPAEFP_P vector_ae_v4sf_p {pred}
+
+  const signed int __builtin_altivec_vcmpaeh_p (vss, vss);
+VCMPAEH_P vector_ae_v8hi_p {pred}
+
+  const signed int __builtin_altivec_vcmpaew_p (vsi, vsi);
+VCMPAEW_P vector_ae_v4si_p {pred}
+
+  const vbc __builtin_altivec_vcmpneb (vsc, vsc);
+CMPNEB vcmpneb {}
+
+  const signed int __builtin_altivec_vcmpneb_p (vsc, vsc);
+VCMPNEB_P vector_ne_v16qi_p {pred}
+
+  const signed int __builtin_altivec_vcmpned_p (vsll, vsll);
+VCMPNED_P vector_ne_v2di_p {pred}
+
+  const signed int __builtin_altivec_vcmpnedp_p (vd, vd);
+VCMPNEDP_P vector_ne_v2df_p {pred}
+
+  const signed int __builtin_altivec_vcmpnefp_p (vf, vf);
+VCMPNEFP_P vector_ne_v4sf_p {pred}
+
+  const vbs __builtin_altivec_vcmpneh (vss, vss);
+CMPNEH vcmpneh {}
+
+  const signed int __builtin_altivec_vcmpneh_p (vss, vss);
+VCMPNEH_P vector_ne_v8hi_p {pred}
+
+  const vbi __builtin_altivec_vcmpnew (vsi, vsi);
+CMPNEW vcmpnew {}
+
+  const signed int

[PATCH 21/29] rs6000: Add remaining AltiVec builtins

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin-new.def: Add remaining AltiVec
builtins.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 843 +++
 1 file changed, 843 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 5fc7e1301c3..0b79f155389 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -177,3 +177,846 @@
   const vss __builtin_altivec_abs_v8hi (vss);
 ABS_V8HI absv8hi2 {}
 
+  const vsc __builtin_altivec_abss_v16qi (vsc);
+ABSS_V16QI altivec_abss_v16qi {}
+
+  const vsi __builtin_altivec_abss_v4si (vsi);
+ABSS_V4SI altivec_abss_v4si {}
+
+  const vss __builtin_altivec_abss_v8hi (vss);
+ABSS_V8HI altivec_abss_v8hi {}
+
+  const vf __builtin_altivec_copysignfp (vf, vf);
+COPYSIGN_V4SF vector_copysignv4sf3 {}
+
+  void __builtin_altivec_dss (const int<2>);
+DSS altivec_dss {}
+
+  void __builtin_altivec_dssall ();
+DSSALL altivec_dssall {}
+
+  void __builtin_altivec_dst (void *, const int, const int<2>);
+DST altivec_dst {}
+
+  void __builtin_altivec_dstst (void *, const int, const int<2>);
+DSTST altivec_dstst {}
+
+  void __builtin_altivec_dststt (void *, const int, const int<2>);
+DSTSTT altivec_dststt {}
+
+  void __builtin_altivec_dstt (void *, const int, const int<2>);
+DSTT altivec_dstt {}
+
+  fpmath vsi __builtin_altivec_fix_sfsi (vf);
+FIX_V4SF_V4SI fix_truncv4sfv4si2 {}
+
+  fpmath vui __builtin_altivec_fixuns_sfsi (vf);
+FIXUNS_V4SF_V4SI fixuns_truncv4sfv4si2 {}
+
+  fpmath vf __builtin_altivec_float_sisf (vsi);
+FLOAT_V4SI_V4SF floatv4siv4sf2 {}
+
+  pure vop __builtin_altivec_lvebx (signed long long, void *);
+LVEBX altivec_lvebx {ldvec}
+
+  pure vop __builtin_altivec_lvehx (signed long long, void *);
+LVEHX altivec_lvehx {ldvec}
+
+  pure vop __builtin_altivec_lvewx (signed long long, void *);
+LVEWX altivec_lvewx {ldvec}
+
+  pure vop __builtin_altivec_lvlx (signed long long, void *);
+LVLX altivec_lvlx {ldvec}
+
+  pure vop __builtin_altivec_lvlxl (signed long long, void *);
+LVLXL altivec_lvlxl {ldvec}
+
+  pure vop __builtin_altivec_lvrx (signed long long, void *);
+LVRX altivec_lvrx {ldvec}
+
+  pure vop __builtin_altivec_lvrxl (signed long long, void *);
+LVRXL altivec_lvrxl {ldvec}
+
+  pure vuc __builtin_altivec_lvsl (signed long long, void *);
+LVSL altivec_lvsl {ldvec}
+
+  pure vuc __builtin_altivec_lvsr (signed long long, void *);
+LVSR altivec_lvsr {ldvec}
+
+; Following LVX one is redundant, and I don't think we need to
+; keep it.  It only maps to LVX_V4SI.  Probably remove.
+  pure vop __builtin_altivec_lvx (signed long long, void *);
+LVX altivec_lvx_v4si {ldvec}
+
+  pure vsc __builtin_altivec_lvx_v16qi (signed long long, void *);
+LVX_V16QI altivec_lvx_v16qi {ldvec}
+
+  pure vf __builtin_altivec_lvx_v4sf (signed long long, void *);
+LVX_V4SF altivec_lvx_v4sf {ldvec}
+
+  pure vsi __builtin_altivec_lvx_v4si (signed long long, void *);
+LVX_V4SI altivec_lvx_v4si {ldvec}
+
+  pure vss __builtin_altivec_lvx_v8hi (signed long long, void *);
+LVX_V8HI altivec_lvx_v8hi {ldvec}
+
+  pure vsi __builtin_altivec_lvxl (signed long long, signed int *);
+LVXL altivec_lvxl_v4si {ldvec}
+
+  pure vsc __builtin_altivec_lvxl_v16qi (signed long long, void *);
+LVXL_V16QI altivec_lvxl_v16qi {ldvec}
+
+  pure vf __builtin_altivec_lvxl_v4sf (signed long long, void *);
+LVXL_V4SF altivec_lvxl_v4sf {ldvec}
+
+  pure vsi __builtin_altivec_lvxl_v4si (signed long long, void *);
+LVXL_V4SI altivec_lvxl_v4si {ldvec}
+
+  pure vss __builtin_altivec_lvxl_v8hi (signed long long, void *);
+LVXL_V8HI altivec_lvxl_v8hi {ldvec}
+
+  vuc __builtin_altivec_mask_for_load (long long, void *);
+MASK_FOR_LOAD altivec_lvsr_direct {ldstmask}
+
+  vuc __builtin_altivec_mask_for_store (long long, void *);
+MASK_FOR_STORE altivec_lvsr_direct {ldstmask}
+
+  vus __builtin_altivec_mfvscr ();
+MFVSCR altivec_mfvscr {}
+
+  void __builtin_altivec_mtvscr (vop);
+MTVSCR altivec_mtvscr {}
+
+  const vsc __builtin_altivec_nabs_v16qi (vsc);
+NABS_V16QI nabsv16qi2 {}
+
+  const vf __builtin_altivec_nabs_v4sf (vf);
+NABS_V4SF vsx_nabsv4sf2 {}
+
+  const vsi __builtin_altivec_nabs_v4si (vsi);
+NABS_V4SI nabsv4si2 {}
+
+  const vss __builtin_altivec_nabs_v8hi (vss);
+NABS_V8HI nabsv8hi2 {}
+
+  void __builtin_altivec_stvebx (vuc, signed long long, void *);
+STVEBX altivec_stvebx {stvec}
+
+  void __builtin_altivec_stvehx (vss, signed long long, void *);
+STVEHX_VSS altivec_stvehx {stvec}
+
+  void __builtin_altivec_stvewx (vsi, signed long long, void *);
+STVEWX altivec_stvewx {stvec}
+
+  void __builtin_altivec_stvlx (vop, signed long long, void *);
+STVLX altivec_stvlx {stvec}
+
+  void __builtin_altivec_stvlxl (vop, signed long long, void *);
+STVLXL

[PATCH 13/29] rs6000: Parsing of overload input file

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (ovld_stanza): New struct.
(MAXOVLDSTANZAS): New defined constant.
(ovld_stanzas): New filescope variable.
(curr_ovld_stanza): Likewise.
(MAXOVLDS): New defined constant.
(ovlddata): New struct.
(ovlds): New filescope variable.
(curr_ovld): Likewise.
(parse_ovld_entry): New function.
(parse_ovld_stanza): Likewise.
(parse_ovld): Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 202 +++-
 1 file changed, 201 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index c4bc5c724a3..5413cc2681e 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -360,8 +360,31 @@ struct bifdata {
 static bifdata bifs[MAXBIFS];
 static int num_bifs;
 static int curr_bif;
+
+/* Stanzas are groupings of built-in functions and overloads by some
+   common feature/attribute.  These definitions are for overload stanzas.  */
+struct ovld_stanza {
+  char *stanza_id;
+  char *extern_name;
+  char *intern_name;
+};
+
+#define MAXOVLDSTANZAS 256
+static ovld_stanza ovld_stanzas[MAXOVLDSTANZAS];
 static int num_ovld_stanzas;
+static int curr_ovld_stanza;
+
+#define MAXOVLDS 16384
+struct ovlddata {
+  int stanza;
+  prototype proto;
+  char *idname;
+  char *fndecl;
+};
+
+static ovlddata ovlds[MAXOVLDS];
 static int num_ovlds;
+static int curr_ovld;
 
 /* Exit codes for the shell.  */
 enum exit_codes {
@@ -1460,11 +1483,188 @@ parse_bif ()
   return result;
 }
 
+/* Parse one two-line entry in the overload file.  */
+static parse_codes
+parse_ovld_entry ()
+{
+  /* Check for end of stanza.  */
+  pos = 0;
+  consume_whitespace ();
+  if (linebuf[pos] == '[')
+return PC_EOSTANZA;
+
+  /* Allocate an entry in the overload table.  */
+  if (num_ovlds >= MAXOVLDS - 1)
+{
+  (*diag) ("too many overloads.\n");
+  return PC_PARSEFAIL;
+}
+
+  curr_ovld = num_ovlds++;
+  ovlds[curr_ovld].stanza = curr_ovld_stanza;
+
+  if (parse_prototype (&ovlds[curr_ovld].proto) == PC_PARSEFAIL)
+return PC_PARSEFAIL;
+
+  /* Now process line 2, which just contains the builtin id.  */
+  if (!advance_line (ovld_file))
+{
+  (*diag) ("unexpected EOF.\n");
+  return PC_EOFILE;
+}
+
+  pos = 0;
+  consume_whitespace ();
+  int oldpos = pos;
+  char *id = match_identifier ();
+  ovlds[curr_ovld].idname = id;
+  if (!id)
+{
+  (*diag) ("missing overload id at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+
+#ifdef DEBUG
+  (*diag) ("ID name is '%s'.\n", id);
+#endif
+
+  /* The builtin id has to match one from the bif file.  */
+  if (!rbt_find (&bif_rbt, id))
+{
+  (*diag) ("builtin ID '%s' not found in bif file.\n", id);
+  return PC_PARSEFAIL;
+}
+
+  /* Save the ID in a lookup structure.  */
+  if (!rbt_insert (&ovld_rbt, id))
+{
+  (*diag) ("duplicate function ID '%s' at column %d.\n", id, oldpos + 1);
+  return PC_PARSEFAIL;
+}
+
+  consume_whitespace ();
+  if (linebuf[pos] != '\n')
+{
+  (*diag) ("garbage at end of line at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+  return PC_OK;
+}
+
+/* Parse one stanza of the input overload file.  linebuf already contains the
+   first line to parse.  */
+static parse_codes
+parse_ovld_stanza ()
+{
+  /* Parse the stanza header.  */
+  pos = 0;
+  consume_whitespace ();
+
+  if (linebuf[pos] != '[')
+{
+  (*diag) ("ill-formed stanza header at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+  safe_inc_pos ();
+
+  char *stanza_name = match_identifier ();
+  if (!stanza_name)
+{
+  (*diag) ("no identifier found in stanza header.\n");
+  return PC_PARSEFAIL;
+}
+
+  /* Add the identifier to a table and set the number to be recorded
+ with subsequent overload entries.  */
+  if (num_ovld_stanzas >= MAXOVLDSTANZAS)
+{
+  (*diag) ("too many stanza headers.\n");
+  return PC_PARSEFAIL;
+}
+
+  curr_ovld_stanza = num_ovld_stanzas++;
+  ovld_stanza *stanza = &ovld_stanzas[curr_ovld_stanza];
+  stanza->stanza_id = stanza_name;
+
+  consume_whitespace ();
+  if (linebuf[pos] != ',')
+{
+  (*diag) ("missing comma at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+  safe_inc_pos ();
+
+  consume_whitespace ();
+  stanza->extern_name = match_identifier ();
+  if (!stanza->extern_name)
+{
+  (*diag) ("missing external name at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+
+  consume_whitespace ();
+  if (linebuf[pos] != ',')
+{
+  (*diag) ("missing comma at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+  safe_inc_pos ();
+
+  consume_whitespace ();
+  stanza->intern_name = match_identifier ();
+  if (!stanza->intern_name)
+{
+  (*diag) ("missing internal name at column %d.\n", pos + 1);

[PATCH 20/29] rs6000: Incorporate new builtins code into the build machinery

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config.gcc (powerpc*-*-*): Add rs6000-builtins.o to extra_objs.
* config/rs6000/t-rs6000 (rs6000-gen-builtins.o): New target.
(rbtree.o): Likewise.
(rs6000-gen-builtins): Likewise.
(rs6000-builtins.c): Likewise.
(rs6000-builtins.o): Likewise.
(rs6000-call.o): Add dependency on rs6000-builtins.c.
---
 gcc/config.gcc |  3 ++-
 gcc/config/rs6000/t-rs6000 | 25 -
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 30b51c3dc81..b6aa0c2d15d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -505,7 +505,8 @@ or1k*-*-*)
;;
 powerpc*-*-*)
cpu_type=rs6000
-   extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o rs6000-call.o"
+   extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
+   extra_objs="${extra_objs} rs6000-call.o rs6000-builtins.o"
extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"
diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index 1ddb5729cb2..6053721f76c 100644
--- a/gcc/config/rs6000/t-rs6000
+++ b/gcc/config/rs6000/t-rs6000
@@ -43,7 +43,30 @@ rs6000-logue.o: $(srcdir)/config/rs6000/rs6000-logue.c
$(COMPILE) $<
$(POSTCOMPILE)
 
-rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
+rs6000-gen-builtins.o: $(srcdir)/config/rs6000/rs6000-gen-builtins.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
+
+rbtree.o: $(srcdir)/config/rs6000/rbtree.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
+
+rs6000-gen-builtins: rs6000-gen-builtins.o rbtree.o
+   +$(LINKER_FOR_BUILD) $(BUILD_LINKERFLAGS) $(BUILD_LDFLAGS) -o $@ \
+   $(filter-out $(BUILD_LIBDEPS), $^) $(BUILD_LIBS)
+
+rs6000-builtins.c: rs6000-gen-builtins \
+  $(srcdir)/config/rs6000/rs6000-builtin-new.def \
+  $(srcdir)/config/rs6000/rs6000-overload.def
+   ./rs6000-gen-builtins $(srcdir)/config/rs6000/rs6000-builtin-new.def \
+   $(srcdir)/config/rs6000/rs6000-overload.def rs6000-builtins.h \
+   rs6000-builtins.c rs6000-vecdefines.h
+
+rs6000-builtins.o: rs6000-builtins.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
+
+rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c rs6000-builtins.c
$(COMPILE) $<
$(POSTCOMPILE)
 
-- 
2.17.1

[PATCH 24/29] rs6000: Add Power7 builtins

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin-new.def: Add power7 and
power7-64 builtins.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 39 
 1 file changed, 39 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 1cb019bd4fb..0a17cad446c 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -1938,3 +1938,42 @@
 XXSPLTD_V2DI vsx_xxspltd_v2di {}
 
 
+; Power7 builtins (ISA 2.06).
+[power7]
+  const unsigned int __builtin_addg6s (unsigned int, unsigned int);
+ADDG6S addg6s {}
+
+  const signed long long __builtin_bpermd (signed long long, signed long long);
+BPERMD bpermd_di {}
+
+  const unsigned int __builtin_cbcdtd (unsigned int);
+CBCDTD cbcdtd {}
+
+  const unsigned int __builtin_cdtbcd (unsigned int);
+CDTBCD cdtbcd {}
+
+  const signed int __builtin_divwe (signed int, signed int);
+DIVWE dive_si {}
+
+  const unsigned int __builtin_divweu (unsigned int, unsigned int);
+DIVWEU diveu_si {}
+
+  const vsq __builtin_pack_vector_int128 (unsigned long long, unsigned long long);
+PACK_V1TI packv1ti {}
+
+  void __builtin_ppc_speculation_barrier ();
+SPECBARR speculation_barrier {}
+
+  const unsigned long long __builtin_unpack_vector_int128 (vsq, const int<1>);
+UNPACK_V1TI unpackv1ti {}
+
+
+; Power7 builtins requiring 64-bit GPRs (even with 32-bit addressing).
+[power7-64]
+  const signed long long __builtin_divde (signed long long, signed long long);
+DIVDE dive_di {}
+
+  const unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
+DIVDEU diveu_di {}
+
+
-- 
2.17.1

[PATCH 17/29] rs6000: Write output to the builtins init file, part 1 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (write_fntype): New
function.
(write_fntype_init): New stub function.
(write_init_bif_table): Likewise.
(write_init_ovld_table): New function.
(write_init_file): Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 149 
 1 file changed, 149 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 67c7b22aad6..3ac199ff2e5 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -2055,6 +2055,18 @@ write_extern_fntype (char *str)
   fprintf (header_file, "extern tree %s;\n", str);
 }
 
+void
+write_fntype (char *str)
+{
+  fprintf (init_file, "tree %s;\n", str);
+}
+
+/* Write an initializer for a function type identified by STR.  */
+void
+write_fntype_init (char *str)
+{
+}
+
 /* Write everything to the header file (rs6000-builtins.h).  */
 static int
 write_header_file ()
@@ -2078,10 +2090,147 @@ write_header_file ()
   return 1;
 }
 
+/* Write code to initialize the built-in function table.  */
+static void
+write_init_bif_table ()
+{
+}
+
+/* Write code to initialize the overload table.  */
+static void
+write_init_ovld_table ()
+{
+  fprintf (init_file, "  int base = RS6000_BIF_MAX;\n\n");
+
+  for (int i = 0; i <= curr_ovld; i++)
+{
+  fprintf (init_file,
+  "  rs6000_overload_info[RS6000_OVLD_%s - base].bifname"
+  "\n= \"%s\";\n",
+  ovlds[i].idname, ovlds[i].proto.bifname);
+  fprintf (init_file,
+  "  rs6000_overload_info[RS6000_OVLD_%s - base].bifid"
+  "\n= RS6000_BIF_%s;\n",
+  ovlds[i].idname, ovlds[i].idname);
+  fprintf (init_file,
+  "  rs6000_overload_info[RS6000_OVLD_%s - base].fntype"
+  "\n= %s;\n",
+  ovlds[i].idname, ovlds[i].fndecl);
+  fprintf (init_file,
+  "  rs6000_overload_info[RS6000_OVLD_%s - base].next"
+  "\n= ", ovlds[i].idname);
+  if (i < curr_ovld
+ && !strcmp (ovlds[i+1].proto.bifname, ovlds[i].proto.bifname))
+   fprintf (init_file,
+"&rs6000_overload_info[RS6000_OVLD_%s - base];\n",
+ovlds[i+1].idname);
+  else
+   fprintf (init_file, "NULL;\n");
+
+  if (i == 0 || ovlds[i].stanza != ovlds[i-1].stanza)
+   {
+ fprintf (init_file, "\n");
+
+ fprintf (init_file,
+  "  ovldaddr = &rs6000_overload_info"
+  "[RS6000_OVLD_%s - base];\n",
+  ovlds[i].idname);
+ fprintf (init_file,
+  "  hash = rs6000_ovld_hasher::hash (ovldaddr);\n");
+ fprintf (init_file,
+  "  oslot = ovld_hash.find_slot_with_hash (\n");
+ fprintf (init_file,
+  "\"%s\", hash, INSERT\n",
+  ovlds[i].proto.bifname);
+ fprintf (init_file,
+  " );\n");
+ fprintf (init_file,
+  "  *oslot = ovldaddr;\n");
+   }
+
+  if (i < curr_ovld)
+   fprintf (init_file, "\n");
+}
+}
+
 /* Write everything to the initialization file (rs6000-builtins.c).  */
 static int
 write_init_file ()
 {
+  write_autogenerated_header (init_file);
+
+  fprintf (init_file, "#include \"config.h\"\n");
+  fprintf (init_file, "#include \"system.h\"\n");
+  fprintf (init_file, "#include \"coretypes.h\"\n");
+  fprintf (init_file, "#include \"backend.h\"\n");
+  fprintf (init_file, "#include \"rtl.h\"\n");
+  fprintf (init_file, "#include \"tree.h\"\n");
+  fprintf (init_file, "#include \"langhooks.h\"\n");
+  fprintf (init_file, "#include \"insn-codes.h\"\n");
+  fprintf (init_file, "#include \"rs6000-builtins.h\"\n");
+  fprintf (init_file, "\n");
+
+#ifdef DEBUG_NEW_BUILTINS
+  fprintf (init_file, "int new_builtins_are_live = 1;\n\n");
+#else
+  fprintf (init_file, "int new_builtins_are_live = 0;\n\n");
+#endif
+
+  fprintf (init_file,
+  "bifdata rs6000_builtin_info_x[RS6000_BIF_MAX];\n\n");
+  fprintf (init_file,
+  "ovlddata rs6000_overload_info[RS6000_OVLD_MAX"
+  " - RS6000_BIF_MAX];\n\n");
+
+  rbt_inorder_callback (&fntype_rbt, fntype_rbt.rbt_root, write_fntype);
+  fprintf (init_file, "\n");
+
+  fprintf (init_file, "hashval_t\n");
+  fprintf (init_file, "rs6000_bif_hasher::hash (bifdata *bd)\n");
+  fprintf (init_file, "{\n");
+  fprintf (init_file, "  return htab_hash_string (bd->bifname);\n");
+  fprintf (init_file, "}\n\n");
+
+  fprintf (init_file, "bool\n");
+  fprintf (init_file,
+  "rs6000_bif_hasher::equal (bifdata *bd, const char *name)\n");
+  fprintf (init_file, "{\n");
+  fprintf (init_file, "  return bd && name && !strcmp (bd->bifname, name);\n");
+  fprintf (init_file, "}\n\n");
+
+  fprintf (init_file, "hash_table<rs6000_bif_hasher> bif_hash (1024);\n\n");
+
+  fprintf (init_file, "hashval_t\n");
+

[PATCH 22/29] rs6000: Add VSX builtins

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin-new.def: Add VSX builtins.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 840 +++
 1 file changed, 840 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 0b79f155389..6c60177e4bb 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -1020,3 +1020,843 @@
   const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
 VEC_SET_V8HI nothing {set}
 
+
+; VSX builtins.
+[vsx]
+  pure vsq __builtin_altivec_lvx_v1ti (signed long long, void *);
+LVX_V1TI altivec_lvx_v1ti {ldvec}
+
+  pure vd __builtin_altivec_lvx_v2df (signed long long, void *);
+LVX_V2DF altivec_lvx_v2df {ldvec}
+
+  pure vsll __builtin_altivec_lvx_v2di (signed long long, void *);
+LVX_V2DI altivec_lvx_v2di {ldvec}
+
+  pure vd __builtin_altivec_lvxl_v2df (signed long long, void *);
+LVXL_V2DF altivec_lvxl_v2df {ldvec}
+
+  pure vsll __builtin_altivec_lvxl_v2di (signed long long, void *);
+LVXL_V2DI altivec_lvxl_v2di {ldvec}
+
+  const vd __builtin_altivec_nabs_v2df (vd);
+NABS_V2DF vsx_nabsv2df2 {}
+
+  const vsll __builtin_altivec_nabs_v2di (vsll);
+NABS_V2DI nabsv2di2 {}
+
+  void __builtin_altivec_stvx_v2df (vd, signed long long, void *);
+STVX_V2DF altivec_stvx_v2df {stvec}
+
+  void __builtin_altivec_stvx_v2di (vop, signed long long, void *);
+STVX_V2DI altivec_stvx_v2di {stvec}
+
+  void __builtin_altivec_stvxl_v2df (vd, signed long long, void *);
+STVXL_V2DF altivec_stvxl_v2df {stvec}
+
+  void __builtin_altivec_stvxl_v2di (vop, signed long long, void *);
+STVXL_V2DI altivec_stvxl_v2di {stvec}
+
+  const vd __builtin_altivec_vand_v2df (vd, vd);
+VAND_V2DF andv2df3 {}
+
+  const vsll __builtin_altivec_vand_v2di (vsll, vsll);
+VAND_V2DI andv2di3 {}
+
+  const vull __builtin_altivec_vand_v2di_uns (vull, vull);
+VAND_V2DI_UNS andv2di3 {}
+
+  const vd __builtin_altivec_vandc_v2df (vd, vd);
+VANDC_V2DF andcv2df3 {}
+
+  const vsll __builtin_altivec_vandc_v2di (vsll, vsll);
+VANDC_V2DI andcv2di3 {}
+
+  const vull __builtin_altivec_vandc_v2di_uns (vull, vull);
+VANDC_V2DI_UNS andcv2di3 {}
+
+  const vd __builtin_altivec_vnor_v2df (vd, vd);
+VNOR_V2DF norv2df3 {}
+
+  const vsll __builtin_altivec_vnor_v2di (vsll, vsll);
+VNOR_V2DI norv2di3 {}
+
+  const vull __builtin_altivec_vnor_v2di_uns (vull, vull);
+VNOR_V2DI_UNS norv2di3 {}
+
+  const vd __builtin_altivec_vor_v2df (vd, vd);
+VOR_V2DF iorv2df3 {}
+
+  const vsll __builtin_altivec_vor_v2di (vsll, vsll);
+VOR_V2DI iorv2di3 {}
+
+  const vull __builtin_altivec_vor_v2di_uns (vull, vull);
+VOR_V2DI_UNS iorv2di3 {}
+
+  const vd __builtin_altivec_vperm_2df (vd, vd, vuc);
+VPERM_2DF altivec_vperm_v2df {}
+
+  const vsll __builtin_altivec_vperm_2di (vsll, vsll, vuc);
+VPERM_2DI altivec_vperm_v2di {}
+
+  const vull __builtin_altivec_vperm_2di_uns (vull, vull, vuc);
+VPERM_2DI_UNS altivec_vperm_v2di_uns {}
+
+  const vd __builtin_altivec_vreve_v2df (vd);
+VREVE_V2DF altivec_vrevev2df2 {}
+
+  const vsll __builtin_altivec_vreve_v2di (vsll);
+VREVE_V2DI altivec_vrevev2di2 {}
+
+  const vd __builtin_altivec_vsel_2df (vd, vd, vop);
+VSEL_2DF vector_select_v2df {}
+
+  const vsll __builtin_altivec_vsel_2di (vsll, vsll, vsll, vbll);
+VSEL_2DI_B vector_select_v2di {}
+
+  const vull __builtin_altivec_vsel_2di_uns (vull, vull, vull);
+VSEL_2DI_UNS vector_select_v2di_uns {}
+
+  const vd __builtin_altivec_vsldoi_2df (vd, vd, const int<4>);
+VSLDOI_2DF altivec_vsldoi_v2df {}
+
+  const vsll __builtin_altivec_vsldoi_2di (vsll, vsll, const int<4>);
+VSLDOI_2DI altivec_vsldoi_v2di {}
+
+  const vd __builtin_altivec_vxor_v2df (vd, vd);
+VXOR_V2DF xorv2df3 {}
+
+  const vsll __builtin_altivec_vxor_v2di (vsll, vsll);
+VXOR_V2DI xorv2di3 {}
+
+  const vull __builtin_altivec_vxor_v2di_uns (vull, vull);
+VXOR_V2DI_UNS xorv2di3 {}
+
+  const vbc __builtin_vsx_cmpge_16qi (vsc, vsc);
+CMPGE_16QI vector_nltv16qi {}
+
+  const vbll __builtin_vsx_cmpge_2di (vsll, vsll);
+CMPGE_2DI vector_nltv2di {}
+
+  const vbi __builtin_vsx_cmpge_4si (vsi, vsi);
+CMPGE_4SI vector_nltv4si {}
+
+  const vbs __builtin_vsx_cmpge_8hi (vss, vss);
+CMPGE_8HI vector_nltv8hi {}
+
+  const vbc __builtin_vsx_cmpge_u16qi (vuc, vuc);
+CMPGE_U16QI vector_nltuv16qi {}
+
+  const vbll __builtin_vsx_cmpge_u2di (vull, vull);
+CMPGE_U2DI vector_nltuv2di {}
+
+  const vbi __builtin_vsx_cmpge_u4si (vui, vui);
+CMPGE_U4SI vector_nltuv4si {}
+
+  const vbs __builtin_vsx_cmpge_u8hi (vus, vus);
+CMPGE_U8HI vector_nltuv8hi {}
+
+  const vbc __builtin_vsx_cmple_16qi (vsc, vsc);
+CMPLE_16QI vector_ngtv16qi {}
+
+  const vbll __builtin_vsx_cmple_2di (vsll, vsll);
+CMPLE_2DI vector_ngtv2di {}
+
+  const vbi __builtin_vsx_cmple_4si (vsi, vsi);
+

[PATCH 15/29] rs6000: Write output to the vector definition include file

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (write_defines_file):
Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index b31b666e071..67336152550 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -1896,6 +1896,10 @@ write_init_file ()
 static int
 write_defines_file ()
 {
+  for (int i = 0; i < num_ovld_stanzas; i++)
+fprintf (defines_file, "#define %s %s\n",
+ovld_stanzas[i].extern_name,
+ovld_stanzas[i].intern_name);
   return 1;
 }
 
-- 
2.17.1

[PATCH 14/29] rs6000: Build and store function type identifiers

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (complete_vector_type): New
function.
(complete_base_type): Likewise.
(construct_fntype_id): Likewise.
(parse_bif_entry): Call construct_fntype_id.
(parse_ovld_entry): Likewise.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 211 
 1 file changed, 211 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 5413cc2681e..b31b666e071 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -1254,6 +1254,209 @@ htmspr = %d, htmcr = %d, mma = %d, no32bit = %d, cpu = %d, ldstmask = %d.\n",
   return PC_OK;
 }
 
+/* Convert a vector type into a mode string.  */
+static void
+complete_vector_type (typeinfo *typeptr, char *buf, int *bufi)
+{
+  if (typeptr->isbool)
+buf[(*bufi)++] = 'b';
+  buf[(*bufi)++] = 'v';
+  if (typeptr->ispixel)
+{
+  memcpy (&buf[*bufi], "p8hi", 4);
+  *bufi += 4;
+}
+  else
+{
+  switch (typeptr->base)
+   {
+   case BT_CHAR:
+ memcpy (&buf[*bufi], "16qi", 4);
+ *bufi += 4;
+ break;
+   case BT_SHORT:
+ memcpy (&buf[*bufi], "8hi", 3);
+ *bufi += 3;
+ break;
+   case BT_INT:
+ memcpy (&buf[*bufi], "4si", 3);
+ *bufi += 3;
+ break;
+   case BT_LONGLONG:
+ memcpy (&buf[*bufi], "2di", 3);
+ *bufi += 3;
+ break;
+   case BT_FLOAT:
+ memcpy (&buf[*bufi], "4sf", 3);
+ *bufi += 3;
+ break;
+   case BT_DOUBLE:
+ memcpy (&buf[*bufi], "2df", 3);
+ *bufi += 3;
+ break;
+   case BT_INT128:
+ memcpy (&buf[*bufi], "1ti", 3);
+ *bufi += 3;
+ break;
+   case BT_FLOAT128:
+ memcpy (&buf[*bufi], "1tf", 3);
+ *bufi += 3;
+ break;
+   default:
+ (*diag) ("unhandled basetype %d.\n", typeptr->base);
+ exit (EC_INTERR);
+   }
+}
+}
+
+/* Convert a base type into a mode string.  */
+static void
+complete_base_type (typeinfo *typeptr, char *buf, int *bufi)
+{
+  switch (typeptr->base)
+{
+case BT_CHAR:
+  memcpy (&buf[*bufi], "qi", 2);
+  break;
+case BT_SHORT:
+  memcpy (&buf[*bufi], "hi", 2);
+  break;
+case BT_INT:
+  memcpy (&buf[*bufi], "si", 2);
+  break;
+case BT_LONGLONG:
+  memcpy (&buf[*bufi], "di", 2);
+  break;
+case BT_FLOAT:
+  memcpy (&buf[*bufi], "sf", 2);
+  break;
+case BT_DOUBLE:
+  memcpy (&buf[*bufi], "df", 2);
+  break;
+case BT_INT128:
+  memcpy (&buf[*bufi], "ti", 2);
+  break;
+case BT_FLOAT128:
+  memcpy (&buf[*bufi], "tf", 2);
+  break;
+case BT_DECIMAL32:
+  memcpy (&buf[*bufi], "sd", 2);
+  break;
+case BT_DECIMAL64:
+  memcpy (&buf[*bufi], "dd", 2);
+  break;
+case BT_DECIMAL128:
+  memcpy (&buf[*bufi], "td", 2);
+  break;
+case BT_IBM128:
+  memcpy (&buf[*bufi], "if", 2);
+  break;
+default:
+  (*diag) ("unhandled basetype %d.\n", typeptr->base);
+  exit (EC_INTERR);
+}
+
+  *bufi += 2;
+}
+
+/* Build a function type descriptor identifier from the return type
+   and argument types described by PROTOPTR, and store it if it does
+   not already exist.  Return the identifier.  */
+static char *
+construct_fntype_id (prototype *protoptr)
+{
+  /* Determine the maximum space for a function type descriptor id.
+ Each type requires at most 8 characters (6 for the mode*, 1 for
+ the optional 'u' preceding the mode, and 1 for an underscore
+ following the mode).  We also need 5 characters for the string
+ "ftype" that separates the return mode from the argument modes.
+ The last argument doesn't need a trailing underscore, but we
+ count that as the one trailing "ftype" instead.  For the special
+ case of zero arguments, we need 8 for the return type and 7
+ for "ftype_v".  Finally, we need one character for the
+ terminating null.  Thus for a function with N arguments, we
+ need at most 8N+14 characters for N>0, otherwise 16.
+ 
+   *Worst case is bv16qi for "vector bool char".  */
+  int len = protoptr->nargs ? (protoptr->nargs + 1) * 8 + 6 : 16;
+  char *buf = (char *) malloc (len);
+  int bufi = 0;
+
+  if (protoptr->rettype.ispointer)
+{
+  assert (protoptr->rettype.isvoid);
+  buf[bufi++] = 'p';
+}
+  if (protoptr->rettype.isvoid)
+buf[bufi++] = 'v';
+  else
+{
+  if (protoptr->rettype.isopaque)
+   {
+ memcpy (&buf[bufi], "opaque", 6);
+ bufi += 6;
+   }
+  else
+   {
+ if (protoptr->rettype.isunsigned)
+   buf[bufi++] = 'u';
+ if (protoptr->rettype.isvector)
+   complete_vector_type (&protoptr->rettype, buf, &bufi);
+ else
+   complete_base_type (&protoptr->rettype, buf, &bufi);
+   }
+}
+
+  memcpy (&buf[bufi], "_ftype", 6);
+  bufi +=

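The naming scheme described in the construct_fntype_id comment (return mode, then "_ftype", then the argument modes separated by underscores, with "_v" standing in for an empty argument list) can be tried out in isolation. The following is only a sketch under stated assumptions: build_fntype_id is a hypothetical helper that is not part of the patch, and it takes mode strings that have already been computed rather than the patch's prototype and typeinfo structures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: join a return-mode
   string and NARGS argument-mode strings into an identifier of the
   form "<ret>_ftype_<arg1>_<arg2>...".  A function with no arguments
   gets the suffix "_ftype_v".  The caller frees the result.  */
static char *
build_fntype_id (const char *ret, const char **args, int nargs)
{
  assert (ret != NULL && nargs >= 0);

  /* Return mode + "_ftype" + terminating null, plus either "_v" or
     one underscore and one mode string per argument.  */
  size_t len = strlen (ret) + strlen ("_ftype") + 1;
  if (nargs == 0)
    len += strlen ("_v");
  for (int i = 0; i < nargs; i++)
    len += 1 + strlen (args[i]);

  char *buf = (char *) malloc (len);
  if (buf == NULL)
    return NULL;

  strcpy (buf, ret);
  strcat (buf, "_ftype");
  if (nargs == 0)
    strcat (buf, "_v");
  for (int i = 0; i < nargs; i++)
    {
      strcat (buf, "_");
      strcat (buf, args[i]);
    }
  return buf;
}
```

Called with the return mode "v2di" and the two argument modes "v2di" and "v2di", the helper produces "v2di_ftype_v2di_v2di". Unlike the patch, which sizes its buffer from a worst-case bound of 8 characters per type, this sketch computes the exact length from the strings it is given.
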
[PATCH 18/29] rs6000: Write output to the builtins init file, part 2 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (write_init_bif_table):
Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 153 
 1 file changed, 153 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 3ac199ff2e5..43d13b46a43 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -2094,6 +2094,159 @@ write_header_file ()
 static void
 write_init_bif_table ()
 {
+  const char *attr_string;
+
+  for (int i = 0; i <= curr_bif; i++)
+{
+  fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].bifname"
+  "\n= \"%s\";\n",
+  bifs[i].idname, bifs[i].proto.bifname);
+  fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].enable"
+  "\n= %s;\n",
+  bifs[i].idname, enable_string[bifs[i].stanza]);
+  fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].fntype"
+  "\n= %s;\n",
+  bifs[i].idname, bifs[i].fndecl);
+  fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].nargs"
+  "\n= %d;\n",
+  bifs[i].idname, bifs[i].proto.nargs);
+  fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].icode"
+  "\n= CODE_FOR_%s;\n",
+  bifs[i].idname, bifs[i].patname);
+  fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].bifattrs"
+  "\n= 0",
+  bifs[i].idname);
+  if (bifs[i].attrs.isinit)
+   fprintf (init_file, " | bif_init_bit");
+  if (bifs[i].attrs.isset)
+   fprintf (init_file, " | bif_set_bit");
+  if (bifs[i].attrs.isextract)
+   fprintf (init_file, " | bif_extract_bit");
+  if (bifs[i].attrs.isnosoft)
+   fprintf (init_file, " | bif_nosoft_bit");
+  if (bifs[i].attrs.isldvec)
+   fprintf (init_file, " | bif_ldvec_bit");
+  if (bifs[i].attrs.isstvec)
+   fprintf (init_file, " | bif_stvec_bit");
+  if (bifs[i].attrs.isreve)
+   fprintf (init_file, " | bif_reve_bit");
+  if (bifs[i].attrs.ispred)
+   fprintf (init_file, " | bif_pred_bit");
+  if (bifs[i].attrs.ishtm)
+   fprintf (init_file, " | bif_htm_bit");
+  if (bifs[i].attrs.ishtmspr)
+   fprintf (init_file, " | bif_htmspr_bit");
+  if (bifs[i].attrs.ishtmcr)
+   fprintf (init_file, " | bif_htmcr_bit");
+  if (bifs[i].attrs.ismma)
+   fprintf (init_file, " | bif_mma_bit");
+  if (bifs[i].attrs.isno32bit)
+   fprintf (init_file, " | bif_no32bit_bit");
+  if (bifs[i].attrs.iscpu)
+   fprintf (init_file, " | bif_cpu_bit");
+  if (bifs[i].attrs.isldstmask)
+   fprintf (init_file, " | bif_ldstmask_bit");
+  fprintf (init_file, ";\n");
+  for (int j = 0; j < 1; j++)
+   {
+ fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].restr_opnd[%d]"
+  "\n= %d;\n",
+  bifs[i].idname, j, bifs[i].proto.restr_opnd[j]);
+ if (bifs[i].proto.restr_opnd[j])
+   {
+ const char *res
+   = (bifs[i].proto.restr[j] == RES_BITS ? "RES_BITS"
+  : (bifs[i].proto.restr[j] == RES_RANGE ? "RES_RANGE"
+ : (bifs[i].proto.restr[j] == RES_VALUES ? "RES_VALUES"
+: (bifs[i].proto.restr[j] == RES_VAR_RANGE
+   ? "RES_VAR_RANGE" : "ERROR"))));
+ fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].restr[%d]"
+  "\n= %s;\n",
+  bifs[i].idname, j, res);
+ fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].restr_val1[%d]"
+  "\n= %d;\n",
+  bifs[i].idname, j, bifs[i].proto.restr_val1[j]);
+ fprintf (init_file,
+  "  rs6000_builtin_info_x[RS6000_BIF_%s].restr_val2[%d]"
+  "\n= %d;\n",
+  bifs[i].idname, j, bifs[i].proto.restr_val2[j]);
+   }
+ fprintf (init_file, "\n");
+   }
+
+  fprintf (init_file,
+  "  bifaddr = &rs6000_builtin_info_x[RS6000_BIF_%s];\n",
+  bifs[i].idname);
+  fprintf (init_file,
+  "  hash = rs6000_bif_hasher::hash (bifaddr);\n");
+  fprintf (init_file,
+  "  slot = bif_hash.find_slot_with_hash (\n");
+  fprintf (init_file,
+  "   \"%s\", hash, INSERT\n",
+  bifs[i].proto.bifname);
+  fprintf (init_file,
+  " );\n");
+  fprintf (init_file,
+  "  *slot = bifaddr;\n\n");
+
+  fprintf (init_file,
+  "  if (new_builtins_are_live)\n");
+  fprintf

[PATCH 16/29] rs6000: Write output to the builtins header file

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c
(write_autogenerated_header): New function.
(write_bif_enum): Likewise.
(write_ovld_enum): Likewise.
(write_decls): Likewise.
(write_extern_fntype): Likewise.
(write_header_file): Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 193 
 1 file changed, 193 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 67336152550..67c7b22aad6 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -1878,10 +1878,203 @@ parse_ovld ()
   return result;
 }
 
+/* Write a comment at the top of FILE about how the code was generated.  */
+static void
+write_autogenerated_header (FILE *file)
+{
+  fprintf (file, "/* Automatically generated by the program '%s'\n",
+  pgm_path);
+  fprintf (file, "   from the files '%s' and '%s'.  */\n\n",
+  bif_path, ovld_path);
+}
+
+/* Callback functions used in creating enumerations.  */
+void write_bif_enum (char *str)
+{
+  fprintf (header_file, "  RS6000_BIF_%s,\n", str);
+}
+
+void write_ovld_enum (char *str)
+{
+  fprintf (header_file, "  RS6000_OVLD_%s,\n", str);
+}
+
+/* Write declarations into the header file.  */
+static void
+write_decls ()
+{
+  fprintf (header_file, "enum rs6000_gen_builtins\n{\n  RS6000_BIF_NONE,\n");
+  rbt_inorder_callback (&bif_rbt, bif_rbt.rbt_root, write_bif_enum);
+  fprintf (header_file, "  RS6000_BIF_MAX\n};\n\n");
+
+  fprintf (header_file, "enum restriction {\n");
+  fprintf (header_file, "  RES_NONE,\n");
+  fprintf (header_file, "  RES_BITS,\n");
+  fprintf (header_file, "  RES_RANGE,\n");
+  fprintf (header_file, "  RES_VAR_RANGE,\n");
+  fprintf (header_file, "  RES_VALUES\n");
+  fprintf (header_file, "};\n\n");
+
+  fprintf (header_file, "enum bif_enable {\n");
+  fprintf (header_file, "  ENB_ALWAYS,\n");
+  fprintf (header_file, "  ENB_P5,\n");
+  fprintf (header_file, "  ENB_P6,\n");
+  fprintf (header_file, "  ENB_ALTIVEC,\n");
+  fprintf (header_file, "  ENB_VSX,\n");
+  fprintf (header_file, "  ENB_P7,\n");
+  fprintf (header_file, "  ENB_P7_64,\n");
+  fprintf (header_file, "  ENB_P8,\n");
+  fprintf (header_file, "  ENB_P8V,\n");
+  fprintf (header_file, "  ENB_P9,\n");
+  fprintf (header_file, "  ENB_P9_64,\n");
+  fprintf (header_file, "  ENB_P9V,\n");
+  fprintf (header_file, "  ENB_IEEE128_HW,\n");
+  fprintf (header_file, "  ENB_DFP,\n");
+  fprintf (header_file, "  ENB_CRYPTO,\n");
+  fprintf (header_file, "  ENB_HTM,\n");
+  fprintf (header_file, "  ENB_P10,\n");
+  fprintf (header_file, "  ENB_MMA\n");
+  fprintf (header_file, "};\n\n");
+
+  fprintf (header_file, "struct bifdata\n");
+  fprintf (header_file, "{\n");
+  fprintf (header_file, "  const char *bifname;\n");
+  fprintf (header_file, "  bif_enable enable;\n");
+  fprintf (header_file, "  tree fntype;\n");
+  fprintf (header_file, "  insn_code icode;\n");
+  fprintf (header_file, "  int  nargs;\n");
+  fprintf (header_file, "  int  bifattrs;\n");
+  fprintf (header_file, "  int  restr_opnd[2];\n");
+  fprintf (header_file, "  restriction restr[2];\n");
+  fprintf (header_file, "  int  restr_val1[2];\n");
+  fprintf (header_file, "  int  restr_val2[2];\n");
+  fprintf (header_file, "};\n\n");
+
+  fprintf (header_file, "#define bif_init_bit\t\t(0x0001)\n");
+  fprintf (header_file, "#define bif_set_bit\t\t(0x0002)\n");
+  fprintf (header_file, "#define bif_extract_bit\t\t(0x0004)\n");
+  fprintf (header_file, "#define bif_nosoft_bit\t\t(0x0008)\n");
+  fprintf (header_file, "#define bif_ldvec_bit\t\t(0x0010)\n");
+  fprintf (header_file, "#define bif_stvec_bit\t\t(0x0020)\n");
+  fprintf (header_file, "#define bif_reve_bit\t\t(0x0040)\n");
+  fprintf (header_file, "#define bif_pred_bit\t\t(0x0080)\n");
+  fprintf (header_file, "#define bif_htm_bit\t\t(0x0100)\n");
+  fprintf (header_file, "#define bif_htmspr_bit\t\t(0x0200)\n");
+  fprintf (header_file, "#define bif_htmcr_bit\t\t(0x0400)\n");
+  fprintf (header_file, "#define bif_mma_bit\t\t(0x0800)\n");
+  fprintf (header_file, "#define bif_no32bit_bit\t\t(0x1000)\n");
+  fprintf (header_file, "#define bif_cpu_bit\t\t(0x2000)\n");
+  fprintf (header_file, "#define bif_ldstmask_bit\t(0x4000)\n");
+  fprintf (header_file, "\n");
+  fprintf (header_file,
+  "#define bif_is_init(x)\t\t((x).bifattrs & bif_init_bit)\n");
+  fprintf (header_file,
+  "#define bif_is_set(x)\t\t((x).bifattrs & bif_set_bit)\n");
+  fprintf (header_file,
+  "#define bif_is_extract(x)\t((x).bifattrs & bif_extract_bit)\n");
+  fprintf (header_file,
+  "#define bif_is_nosoft(x)\t((x).bifattrs & bif_nosoft_bit)\n");
+  fprintf (header_file,
+  "#define bif_is_ldvec(x)\t\t((x).bifattrs & bif_ldvec_bit)\n");
+  fprintf (header_file,
+  "#define

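The attribute masks and predicate macros emitted by write_decls combine in the usual bit-flag way: the generated initializer starts from 0 and ORs in one mask per attribute, and each bif_is_* macro tests a single mask. The sketch below copies three mask values and two predicate macros from the patch; bifdata_sketch and make_attrs are hypothetical names introduced only for this illustration, and the struct is cut down to the fields the example needs.

```c
#include <assert.h>

/* Mask values copied from the #defines that write_decls emits.  */
#define bif_init_bit   (0x0001)
#define bif_ldvec_bit  (0x0010)
#define bif_stvec_bit  (0x0020)

/* Hypothetical cut-down stand-in for the generated bifdata struct.  */
struct bifdata_sketch
{
  const char *bifname;
  int bifattrs;
};

/* Predicate macros in the form the patch emits.  */
#define bif_is_init(x)   ((x).bifattrs & bif_init_bit)
#define bif_is_ldvec(x)  ((x).bifattrs & bif_ldvec_bit)

/* Hypothetical helper: build the attribute word the way the
   generated initializer does, starting from 0 and ORing in one
   mask per attribute that is set.  */
static int
make_attrs (int isinit, int isldvec, int isstvec)
{
  int attrs = 0;
  if (isinit)
    attrs |= bif_init_bit;
  if (isldvec)
    attrs |= bif_ldvec_bit;
  if (isstvec)
    attrs |= bif_stvec_bit;
  return attrs;
}
```

An entry tagged {ldvec} in the input file would thus end up with bifattrs equal to 0x0010, so bif_is_ldvec is nonzero for it and bif_is_init is zero. Note that the predicates yield the masked value, not a normalized 0 or 1.
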
[PATCH 12/29] rs6000: Parsing built-in input file, part 3 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (parse_bif_attrs):
Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 86 +
 1 file changed, 86 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 5265b591ec6..c4bc5c724a3 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -1142,6 +1142,92 @@ base = %d, restr = %d, val1 = %d, val2 = %d, pos = %d.\n",
 static parse_codes
 parse_bif_attrs (attrinfo *attrptr)
 {
+  consume_whitespace ();
+  if (linebuf[pos] != '{')
+{
+  (*diag) ("missing attribute set at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+  safe_inc_pos ();
+
+  memset (attrptr, 0, sizeof (*attrptr));
+  char *attrname = NULL;
+
+  do {
+consume_whitespace ();
+int oldpos = pos;
+attrname = match_identifier ();
+if (attrname)
+  {
+   if (!strcmp (attrname, "init"))
+ attrptr->isinit = 1;
+   else if (!strcmp (attrname, "set"))
+ attrptr->isset = 1;
+   else if (!strcmp (attrname, "extract"))
+ attrptr->isextract = 1;
+   else if (!strcmp (attrname, "nosoft"))
+ attrptr->isnosoft = 1;
+   else if (!strcmp (attrname, "ldvec"))
+ attrptr->isldvec = 1;
+   else if (!strcmp (attrname, "stvec"))
+ attrptr->isstvec = 1;
+   else if (!strcmp (attrname, "reve"))
+ attrptr->isreve = 1;
+   else if (!strcmp (attrname, "pred"))
+ attrptr->ispred = 1;
+   else if (!strcmp (attrname, "htm"))
+ attrptr->ishtm = 1;
+   else if (!strcmp (attrname, "htmspr"))
+ attrptr->ishtmspr = 1;
+   else if (!strcmp (attrname, "htmcr"))
+ attrptr->ishtmcr = 1;
+   else if (!strcmp (attrname, "mma"))
+ attrptr->ismma = 1;
+   else if (!strcmp (attrname, "no32bit"))
+ attrptr->isno32bit = 1;
+   else if (!strcmp (attrname, "cpu"))
+ attrptr->iscpu = 1;
+   else if (!strcmp (attrname, "ldstmask"))
+ attrptr->isldstmask = 1;
+   else
+ {
+   (*diag) ("unknown attribute at column %d.\n", oldpos + 1);
+   return PC_PARSEFAIL;
+ }
+
+   consume_whitespace ();
+   if (linebuf[pos] == ',')
+ safe_inc_pos ();
+   else if (linebuf[pos] != '}')
+ {
+   (*diag) ("arg not followed by ',' or '}' at column %d.\n",
+pos + 1);
+   return PC_PARSEFAIL;
+ }
+  }
+else
+  {
+   pos = oldpos;
+   if (linebuf[pos] != '}')
+ {
+   (*diag) ("badly terminated attr set at column %d.\n", pos + 1);
+   return PC_PARSEFAIL;
+ }
+   safe_inc_pos ();
+  }
+  } while (attrname);
+
+#ifdef DEBUG
+  (*diag) ("attribute set: init = %d, set = %d, extract = %d, \
+nosoft = %d, ldvec = %d, stvec = %d, reve = %d, pred = %d, htm = %d, \
+htmspr = %d, htmcr = %d, mma = %d, no32bit = %d, cpu = %d, ldstmask = %d.\n",
+  attrptr->isinit, attrptr->isset, attrptr->isextract,
+  attrptr->isnosoft, attrptr->isldvec, attrptr->isstvec,
+  attrptr->isreve, attrptr->ispred, attrptr->ishtm, attrptr->ishtmspr,
+  attrptr->ishtmcr, attrptr->ismma, attrptr->isno32bit,
+  attrptr->iscpu, attrptr->isldstmask);
+#endif
+
   return PC_OK;
 }
 
-- 
2.17.1

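For readers who want to experiment with the attribute-set grammar accepted by parse_bif_attrs ('{', a comma-separated list of known names, '}'), here is a self-contained sketch. It is not the patch's implementation: it swaps the if/else chain for a name table, works on a string argument instead of the generator's linebuf and pos globals, and returns a bit set instead of filling an attrinfo struct. The names parse_attr_set, skip_ws and attr_names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* The attribute names recognized by the patch.  In this sketch the
   bit assigned to a name is simply its index in the table.  */
static const char *const attr_names[] = {
  "init", "set", "extract", "nosoft", "ldvec", "stvec", "reve", "pred",
  "htm", "htmspr", "htmcr", "mma", "no32bit", "cpu", "ldstmask"
};
#define NUM_ATTRS (sizeof attr_names / sizeof attr_names[0])

static const char *
skip_ws (const char *p)
{
  while (*p == ' ' || *p == '\t')
    p++;
  return p;
}

/* Parse "{a, b, ...}" from S.  On success store one bit per
   attribute in *FLAGS and return 0.  Return -1 for malformed input
   or an unknown attribute name.  */
static int
parse_attr_set (const char *s, unsigned *flags)
{
  const char *p = skip_ws (s);
  *flags = 0;
  if (*p++ != '{')
    return -1;

  for (;;)
    {
      p = skip_ws (p);
      if (*p == '}')
        return 0;

      /* Scan one identifier.  */
      const char *start = p;
      while (isalnum ((unsigned char) *p) || *p == '_')
        p++;
      size_t len = (size_t) (p - start);
      if (len == 0)
        return -1;

      /* Look the identifier up in the table.  */
      size_t i;
      for (i = 0; i < NUM_ATTRS; i++)
        if (strlen (attr_names[i]) == len
            && !strncmp (attr_names[i], start, len))
          break;
      if (i == NUM_ATTRS)
        return -1;
      *flags |= 1u << i;

      /* Each name must be followed by ',' or '}'.  */
      p = skip_ws (p);
      if (*p == ',')
        p++;
      else if (*p != '}')
        return -1;
    }
}
```

With this sketch, "{ldvec, stvec}" yields bits 4 and 5, "{}" yields an empty set, and "{bogus}" or an unterminated "{cpu" is rejected. A table keeps the list of names in one place, at the cost of a linear search that is negligible for fifteen entries.
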
[PATCH 19/29] rs6000: Write output to the builtins init file, part 3 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (typemap): New struct.
(TYPE_MAP_SIZE): New defined constant.
(type_map): New filescope variable; initialize.
(map_token_to_type_node): New function.
(write_type_node): New function.
(write_fntype_init): Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 116 
 1 file changed, 116 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 43d13b46a43..d2c029c0fb5 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -417,6 +417,60 @@ static rbt_strings bif_rbt;
 static rbt_strings ovld_rbt;
 static rbt_strings fntype_rbt;
 
+/* Mapping from type tokens to type node names.  */
+struct typemap
+{
+  const char *key;
+  const char *value;
+};
+
+/* This table must be kept in alphabetical order, as we use binary
+   search for table lookups in map_token_to_type_node.  The table
+   maps tokens from a fntype string to a tree type.  For example,
+   in "si_ftype_hi" we would map "si" to "intSI_type_node" and
+   map "hi" to "intHI_type_node".  */
+#define TYPE_MAP_SIZE 37
+static typemap type_map[TYPE_MAP_SIZE] =
+  {
+{ "bv16qi","bool_V16QI" },
+{ "bv2di", "bool_V2DI" },
+{ "bv4si", "bool_V4SI" },
+{ "bv8hi", "bool_V8HI" },
+{ "dd","dfloat64" },
+{ "df","double" },
+{ "di","intDI" },
+{ "hi","intHI" },
+{ "if","ibm128_float" },
+{ "opaque", "opaque_V4SI" },
+{ "pv","ptr" },
+{ "qi","intQI" },
+{ "sd","dfloat32" },
+{ "sf","float" },
+{ "si","intSI" },
+{ "td","dfloat128" },
+{ "tf","long_double" },
+{ "ti","intTI" },
+{ "udi",   "unsigned_intDI" },
+{ "uhi",   "unsigned_intHI" },
+{ "uqi",   "unsigned_intQI" },
+{ "usi",   "unsigned_intSI" },
+{ "uti",   "unsigned_intTI" },
+{ "uv16qi","unsigned_V16QI" },
+{ "uv1ti", "unsigned_V1TI" },
+{ "uv2di", "unsigned_V2DI" },
+{ "uv4si", "unsigned_V4SI" },
+{ "uv8hi", "unsigned_V8HI" },
+{ "v", "void" },
+{ "v16qi", "V16QI" },
+{ "v1ti",  "V1TI" },
+{ "v2df",  "V2DF" },
+{ "v2di",  "V2DI" },
+{ "v4sf",  "V4SF" },
+{ "v4si",  "V4SI" },
+{ "v8hi",  "V8HI" },
+{ "vp8hi", "pixel_V8HI" },
+  };
+
 /* Pointer to a diagnostic function.  */
 void (*diag) (const char *, ...) __attribute__ ((format (printf, 1, 2)))
   = NULL;
@@ -2061,10 +2115,72 @@ write_fntype (char *str)
   fprintf (init_file, "tree %s;\n", str);
 }
 
+/* Look up TOK in the type map and return the corresponding string used
+   to build the type node.  */
+static const char *
+map_token_to_type_node (char *tok)
+{
+  int low = 0;
+  int high = TYPE_MAP_SIZE - 1;
+  int mid = (low + high) >> 1;
+  int cmp;
+
+  while ((cmp = strcmp (type_map[mid].key, tok)) && low < high)
+{
+  if (cmp < 0)
+   low = (low == mid ? mid + 1 : mid);
+  else
+   high = (high == mid ? mid - 1: mid);
+  mid = (low + high) >> 1;
+}
+
+  if (cmp)
+{
+  (*diag) ("token '%s' doesn't appear in the type map!\n", tok);
+  exit (EC_INTERR);
+}
+
+  return type_map[mid].value;
+}
+
+/* Write the type node corresponding to TOK.  */
+static void
+write_type_node (char *tok)
+{
+  const char *str = map_token_to_type_node (tok);
+  fprintf (init_file, "%s_type_node", str);
+}
+
 /* Write an initializer for a function type identified by STR.  */
 void
 write_fntype_init (char *str)
 {
+  char *tok;
+
+  /* Avoid side effects of strtok on the original string by using a copy.  */
+  char *buf = (char *) malloc (strlen (str) + 1);
+  strcpy (buf, str);
+
+  fprintf (init_file, "  %s\n= build_function_type_list (", buf);
+  tok = strtok (buf, "_");
+  write_type_node (tok);
+  tok = strtok (0, "_");
+  assert (tok);
+  assert (!strcmp (tok, "ftype"));
+
+  tok = strtok (0, "_");
+  if (tok)
+fprintf (init_file, ",\n\t\t\t\t");
+
+  /* Note:  A function with no arguments ends with '_ftype_v'.  */
+  while (tok && strcmp (tok, "v"))
+{
+  write_type_node (tok);
+  tok = strtok (0, "_");
+  fprintf (init_file, ",\n\t\t\t\t");
+}
+  fprintf (init_file, "NULL_TREE);\n");
+  free (buf);
 }
 
 /* Write everything to the header file (rs6000-builtins.h).  */
-- 
2.17.1

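The sorted-table lookup performed by map_token_to_type_node can also be expressed with the standard library's bsearch. The sketch below is an alternative under stated assumptions, not the patch's code: typemap_sketch, sketch_map, cmp_key and lookup_type_stem are hypothetical names, and the table holds only five of the patch's entries. It returns NULL for a token that is not in the table so that the caller can issue the diagnostic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct typemap_sketch
{
  const char *key;
  const char *value;
};

/* A few entries taken from the patch's type_map.  As in the patch,
   the table must stay in strcmp order for binary search to work.  */
static const struct typemap_sketch sketch_map[] = {
  { "df",   "double" },
  { "hi",   "intHI" },
  { "si",   "intSI" },
  { "v",    "void" },
  { "v2di", "V2DI" },
};

/* bsearch comparator: the first argument is the search key, the
   second is a table element.  */
static int
cmp_key (const void *key, const void *elem)
{
  return strcmp ((const char *) key,
                 ((const struct typemap_sketch *) elem)->key);
}

/* Return the type-node stem for TOK, or NULL if TOK is not in the
   table.  */
static const char *
lookup_type_stem (const char *tok)
{
  assert (tok != NULL);
  const struct typemap_sketch *hit
    = (const struct typemap_sketch *)
        bsearch (tok, sketch_map,
                 sizeof sketch_map / sizeof sketch_map[0],
                 sizeof sketch_map[0], cmp_key);
  return hit ? hit->value : NULL;
}
```

Looking up "si" gives "intSI", to which write_type_node would append "_type_node". Looking up a token such as "zz" gives NULL instead of a neighboring entry.
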
[PATCH 23/29] rs6000: Add available-everywhere and ancient builtins

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin-new.def: Add always, power5,
and power6 builtins.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 78 
 1 file changed, 78 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 6c60177e4bb..1cb019bd4fb 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -163,6 +163,84 @@
 ; a semicolon are also treated as blank lines.
 
 
+; Builtins that have been around since time immemorial or are just
+; considered available everywhere.
+[always]
+  void __builtin_cpu_init ();
+CPU_INIT nothing {cpu}
+
+  unsigned int __builtin_cpu_is (const char *);
+CPU_IS nothing {cpu}
+
+  unsigned int __builtin_cpu_supports (const char *);
+CPU_SUPPORTS nothing {cpu}
+
+  unsigned long long __builtin_ppc_get_timebase ();
+GET_TB rs6000_get_timebase {}
+
+  double __builtin_mffs ();
+MFFS rs6000_mffs {}
+
+  unsigned long long __builtin_ppc_mftb ();
+MFTB rs6000_mftb_di {}
+
+  void __builtin_mtfsb0 (const int<5>);
+MTFSB0 rs6000_mtfsb0 {}
+
+  void __builtin_mtfsb1 (const int<5>);
+MTFSB1 rs6000_mtfsb1 {}
+
+  void __builtin_mtfsf (const int<8>, double);
+MTFSF rs6000_mtfsf {}
+
+  const __ibm128 __builtin_pack_ibm128 (double, double);
+PACK_IF packif {}
+
+  void __builtin_set_fpscr_rn (signed int);
+SET_FPSCR_RN rs6000_set_fpscr_rn {}
+
+  const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
+UNPACK_IF unpackif {}
+
+
+; Builtins that have been around just about forever, but not quite.
+[power5]
+;  Not sure what to do with this one.  It is apparently another
+; name for __builtin_pack_ibm128 when long double == __ibm128.
+; There isn't a lot of sense in having pack and unpack for _Float128.
+; Inclined to deprecate, should discuss with Steve Munroe.
+;  const long double __builtin_pack_longdouble (double, double);
+;PACK_TF packtf {}
+
+  fpmath double __builtin_recipdiv (double, double);
+RECIP recipdf3 {}
+
+  fpmath float __builtin_recipdivf (float, float);
+RECIPF recipsf3 {}
+
+  fpmath double __builtin_rsqrt (double);
+RSQRT rsqrtdf2 {}
+
+  fpmath float __builtin_rsqrtf (float);
+RSQRTF rsqrtsf2 {}
+
+;  Not sure what to do with this one.  It is apparently another
+; name for __builtin_unpack_ibm128 when long double == __ibm128.
+; There isn't a lot of sense in having pack and unpack for _Float128.
+; Inclined to deprecate, should discuss with Steve Munroe.
+;  const double __builtin_unpack_longdouble (long double, const int<1>);
+;UNPACK_TF unpacktf {}
+
+
+; Power6 builtins.
+[power6]
+  const signed int __builtin_p6_cmpb (signed int, signed int);
+CMPB cmpbdi3 {}
+
+  const signed int __builtin_p6_cmpb_32 (signed int, signed int);
+CMPB_32 cmpbsi3 {}
+
+
 ; AltiVec builtins.
 [altivec]
   const vsc __builtin_altivec_abs_v16qi (vsc);
-- 
2.17.1

[PATCH 11/29] rs6000: Parsing built-in input file, part 2 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (parse_args): New function.
(parse_prototype): Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 143 
 1 file changed, 143 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index a8b0d8e4288..5265b591ec6 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -1053,6 +1053,91 @@ match_type (typeinfo *typedata, int voidok)
   return match_basetype (typedata);
 }
 
+/* Parse the argument list.  */
+static parse_codes
+parse_args (prototype *protoptr)
+{
+  typelist **argptr = &protoptr->args;
+  int *nargs = &protoptr->nargs;
+  int *restr_opnd = protoptr->restr_opnd;
+  restriction *restr = protoptr->restr;
+  int *val1 = protoptr->restr_val1;
+  int *val2 = protoptr->restr_val2;
+  int restr_cnt = 0;
+
+  int success;
+  *nargs = 0;
+
+  /* Start the argument list.  */
+  consume_whitespace ();
+  if (linebuf[pos] != '(')
+{
+  (*diag) ("missing '(' at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+  safe_inc_pos ();
+
+  do {
+consume_whitespace ();
+int oldpos = pos;
+typelist *argentry = (typelist *) malloc (sizeof (typelist));
+memset (argentry, 0, sizeof (*argentry));
+typeinfo *argtype = &argentry->info;
+success = match_type (argtype, VOID_NOTOK);
+if (success)
+  {
+   if (argtype->restr)
+ {
+   if (restr_cnt >= 2)
+ {
+   (*diag) ("More than two restricted operands\n");
+   return PC_PARSEFAIL;
+ }
+   restr_opnd[restr_cnt] = *nargs + 1;
+   restr[restr_cnt] = argtype->restr;
+   val1[restr_cnt] = argtype->val1;
+   val2[restr_cnt++] = argtype->val2;
+ }
+   (*nargs)++;
+   *argptr = argentry;
+   argptr = &argentry->next;
+   consume_whitespace ();
+   if (linebuf[pos] == ',')
+ safe_inc_pos ();
+   else if (linebuf[pos] != ')')
+ {
+   (*diag) ("arg not followed by ',' or ')' at column %d.\n",
+pos + 1);
+   return PC_PARSEFAIL;
+ }
+
+#ifdef DEBUG
+   (*diag) ("argument type: isvoid = %d, isconst = %d, isvector = %d, \
+issigned = %d, isunsigned = %d, isbool = %d, ispixel = %d, ispointer = %d, \
+base = %d, restr = %d, val1 = %d, val2 = %d, pos = %d.\n",
+argtype->isvoid, argtype->isconst, argtype->isvector,
+argtype->issigned, argtype->isunsigned, argtype->isbool,
+argtype->ispixel, argtype->ispointer, argtype->base,
+argtype->restr, argtype->val1, argtype->val2, pos + 1);
+#endif
+  }
+else
+  {
+   free (argentry);
+   *argptr = NULL;
+   pos = oldpos;
+   if (linebuf[pos] != ')')
+ {
+   (*diag) ("badly terminated arg list at column %d.\n", pos + 1);
+   return PC_PARSEFAIL;
+ }
+   safe_inc_pos ();
+  }
+  } while (success);
+
+  return PC_OK;
+}
+
 /* Parse the attribute list.  */
 static parse_codes
 parse_bif_attrs (attrinfo *attrptr)
@@ -1065,6 +1150,64 @@ parse_bif_attrs (attrinfo *attrptr)
 static parse_codes
 parse_prototype (prototype *protoptr)
 {
+  typeinfo *ret_type = &protoptr->rettype;
+  char **bifname = &protoptr->bifname;
+
+  /* Get the return type.  */
+  consume_whitespace ();
+  int oldpos = pos;
+  int success = match_type (ret_type, VOID_OK);
+  if (!success)
+{
+  (*diag) ("missing or badly formed return type at column %d.\n",
+  oldpos + 1);
+  return PC_PARSEFAIL;
+}
+
+#ifdef DEBUG
+  (*diag) ("return type: isvoid = %d, isconst = %d, isvector = %d, \
+issigned = %d, isunsigned = %d, isbool = %d, ispixel = %d, ispointer = %d, \
+base = %d, restr = %d, val1 = %d, val2 = %d, \
+pos = %d.\n",
+  ret_type->isvoid, ret_type->isconst, ret_type->isvector,
+  ret_type->issigned, ret_type->isunsigned, ret_type->isbool,
+  ret_type->ispixel, ret_type->ispointer, ret_type->base,
+  ret_type->restr, ret_type->val1, ret_type->val2, pos + 1);
+#endif
+
+  /* Get the bif name.  */
+  consume_whitespace ();
+  oldpos = pos;
+  *bifname = match_identifier ();
+  if (!*bifname)
+{
+  (*diag) ("missing function name at column %d.\n", oldpos + 1);
+  return PC_PARSEFAIL;
+}
+
+#ifdef DEBUG
+  (*diag) ("function name is '%s'.\n", *bifname);
+#endif
+
+  /* Process arguments.  */
+  if (parse_args (protoptr) == PC_PARSEFAIL)
+return PC_PARSEFAIL;
+
+  /* Process terminating semicolon.  */
+  consume_whitespace ();
+  if (linebuf[pos] != ';')
+{
+  (*diag) ("missing semicolon at column %d.\n", pos + 1);
+  return PC_PARSEFAIL;
+}
+  safe_inc_pos ();
+  consume_whitespace ();
+  if (linebuf[pos] != '\n')
+{
+  (*diag) ("garbage at end of line at column %d.\n", pos + 1);
+

[PATCH 04/29] rs6000: Add helper functions for parsing

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (MININT): New defined
constant.
(exit_codes): New enum.
(consume_whitespace): New function.
(advance_line): Likewise.
(safe_inc_pos): Likewise.
(match_identifier): Likewise.
(match_integer): Likewise.
(match_to_right_bracket): Likewise.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 121 
 1 file changed, 121 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index 8c8fad66edf..e2a9b28eb16 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -140,6 +140,10 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #include 
 
+/* Used as a sentinel for range constraints on integer fields.  No field can
+   be 32 bits wide, so this is a safe sentinel value.  */
+#define MININT INT32_MIN
+
 /* Input and output file descriptors and pathnames.  */
 static FILE *bif_file;
 static FILE *ovld_file;
@@ -162,6 +166,11 @@ static char linebuf[LINELEN];
 static int line;
 static int pos;
 
+/* Exit codes for the shell.  */
+enum exit_codes {
+  EC_INTERR
+};
+
 /* Pointer to a diagnostic function.  */
 void (*diag) (const char *, ...) __attribute__ ((format (printf, 1, 2)))
   = NULL;
@@ -187,3 +196,115 @@ ovld_diag (const char * fmt, ...)
   va_end (args);
 }
 
+/* Pass over unprintable characters and whitespace (other than a newline,
+   which terminates the scan).  */
+static void
+consume_whitespace ()
+{
+  while (pos < LINELEN && isspace(linebuf[pos]) && linebuf[pos] != '\n')
+pos++;
+  return;
+}
+
+/* Get the next nonblank, noncomment line, returning 0 on EOF, 1 otherwise.  */
+static int
+advance_line (FILE *file)
+{
+  while (1)
+{
+  /* Read ahead one line and check for EOF.  */
+  if (!fgets (linebuf, sizeof(linebuf), file))
+   return 0;
+  line++;
+  pos = 0;
+  consume_whitespace ();
+  if (linebuf[pos] != '\n' && linebuf[pos] != ';')
+   return 1;
+}
+}
+
+static inline void
+safe_inc_pos ()
+{
+  if (pos++ >= LINELEN)
+{
+  (*diag) ("line length overrun.\n");
+  exit (EC_INTERR);
+}
+}
+
+/* Match an identifier, returning NULL on failure, else a pointer to a
+   buffer containing the identifier.  */
+static char *
+match_identifier ()
+{
+  int lastpos = pos - 1;
+  while (isalnum (linebuf[lastpos + 1]) || linebuf[lastpos + 1] == '_')
+if (++lastpos >= LINELEN - 1)
+  {
+   (*diag) ("line length overrun.\n");
+   exit (EC_INTERR);
+  }
+
+  if (lastpos < pos)
+return 0;
+
+  char *buf = (char *) malloc (lastpos - pos + 2);
+  memcpy (buf, &linebuf[pos], lastpos - pos + 1);
+  buf[lastpos - pos + 1] = '\0';
+
+  pos = lastpos + 1;
+  return buf;
+}
+
+/* Match an integer and return its value, or MININT on failure.  */
+static int
+match_integer ()
+{
+  int startpos = pos;
+  if (linebuf[pos] == '-')
+safe_inc_pos ();
+
+  int lastpos = pos - 1;
+  while (isdigit (linebuf[lastpos + 1]))
+if (++lastpos >= LINELEN - 1)
+  {
+   (*diag) ("line length overrun in match_integer.\n");
+   exit (EC_INTERR);
+  }
+
+  if (lastpos < pos)
+return MININT;
+
+  pos = lastpos + 1;
+  char *buf = (char *) malloc (lastpos - startpos + 2);
+  memcpy (buf, &linebuf[startpos], lastpos - startpos + 1);
+  buf[lastpos - startpos + 1] = '\0';
+
+  int x;
+  sscanf (buf, "%d", &x);
+  return x;
+}
+
+static const char *
+match_to_right_bracket ()
+{
+  int lastpos = pos - 1;
+  while (linebuf[lastpos + 1] != ']')
+if (++lastpos >= LINELEN - 1)
+  {
+   (*diag) ("line length overrun.\n");
+   exit (EC_INTERR);
+  }
+
+  if (lastpos < pos)
+return 0;
+
+  char *buf = (char *) malloc (lastpos - pos + 2);
+  memcpy (buf, &linebuf[pos], lastpos - pos + 1);
+  buf[lastpos - pos + 1] = '\0';
+
+  pos = lastpos + 1;
+  return buf;
+}
+
-- 
2.17.1
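
As a reading aid for the helpers above, here is a minimal standalone
sketch of the same tokenizing idea: a global line buffer, a cursor, and
match functions that advance the cursor only past what they accept.  It
is not the patch's code.  set_line is a hypothetical helper, and
match_integer here uses strtol instead of the malloc/sscanf approach in
the patch and leaves pos untouched on failure, which differs slightly
from the original.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define LINELEN 1024
#define MININT INT_MIN   /* Failure sentinel, as in the patch.  */

static char linebuf[LINELEN];
static int pos;

/* Hypothetical helper: load TEXT into the line buffer and reset pos.  */
static void
set_line (const char *text)
{
  assert (strlen (text) < LINELEN);
  strcpy (linebuf, text);
  pos = 0;
}

/* Skip whitespace, stopping at a newline.  */
static void
consume_whitespace (void)
{
  while (pos < LINELEN && isspace ((unsigned char) linebuf[pos])
	 && linebuf[pos] != '\n')
    pos++;
}

/* Return a malloc'd copy of the identifier at pos, or NULL if none.  */
static char *
match_identifier (void)
{
  int end = pos;
  while (end < LINELEN - 1
	 && (isalnum ((unsigned char) linebuf[end]) || linebuf[end] == '_'))
    end++;
  if (end == pos)
    return NULL;

  size_t len = (size_t) (end - pos);
  char *buf = (char *) malloc (len + 1);
  assert (buf != NULL);
  memcpy (buf, &linebuf[pos], len);
  buf[len] = '\0';
  pos = end;
  return buf;
}

/* Return the integer at pos, or MININT if there is none.  Assumes the
   value fits in an int.  */
static int
match_integer (void)
{
  int end = pos;
  if (linebuf[end] == '-')
    end++;
  if (!isdigit ((unsigned char) linebuf[end]))
    return MININT;

  char *stop;
  long value = strtol (&linebuf[pos], &stop, 10);
  pos = (int) (stop - linebuf);
  return (int) value;
}
```

Because each match function reports failure without consuming input, a
caller can save pos, try one alternative, and fall back to another,
which is how the oldpos variables in the later patches are used.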

[PATCH 05/29] rs6000: Add functions for matching types, part 1 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (void_status): New enum.
(basetype): Likewise.
(typeinfo): New struct.
(handle_pointer): New function.
(match_basetype): New stub function.
(match_const_restriction): Likewise.
(match_type): New function.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 382 
 1 file changed, 382 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index e2a9b28eb16..ea1ebedfa52 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -166,6 +166,44 @@ static char linebuf[LINELEN];
 static int line;
 static int pos;
 
+/* Used to determine whether a type can be void (only return types).  */
+enum void_status {
+  VOID_NOTOK,
+  VOID_OK
+};
+
+/* Legal base types for an argument or return type.  */
+enum basetype {
+  BT_CHAR,
+  BT_SHORT,
+  BT_INT,
+  BT_LONGLONG,
+  BT_FLOAT,
+  BT_DOUBLE,
+  BT_INT128,
+  BT_FLOAT128,
+  BT_DECIMAL32,
+  BT_DECIMAL64,
+  BT_DECIMAL128,
+  BT_IBM128
+};
+
+/* Type modifiers for an argument or return type.  */
+struct typeinfo {
+  char isvoid;
+  char isconst;
+  char isvector;
+  char issigned;
+  char isunsigned;
+  char isbool;
+  char ispixel;
+  char ispointer;
+  char isopaque;
+  basetype base;
+  int val1;
+  int val2;
+};
+
 /* Exit codes for the shell.  */
 enum exit_codes {
   EC_INTERR
@@ -308,3 +346,347 @@ match_to_right_bracket ()
   return buf;
 }
 
+static inline void
+handle_pointer (typeinfo *typedata)
+{
+  consume_whitespace ();
+  if (linebuf[pos] == '*')
+{
+  typedata->ispointer = 1;
+  safe_inc_pos ();
+}
+}
+
+/* Match one of the allowable base types.  Consumes one token unless the
+   token is "long", which must be paired with a second "long".  Optionally
+   consumes a following '*' token for pointers.  Return 1 for success,
+   0 for failure.  */
+static int
+match_basetype (typeinfo *typedata)
+{
+  return 1;
+}
+
+/* A const int argument may be restricted to certain values.  This is
+   indicated by one of the following occurring after the "int" token:
+
+ <x>   restricts the constant to x bits, interpreted as unsigned
+ <x,y> restricts the constant to the inclusive range [x,y]
+ [x,y] restricts the constant to the inclusive range [x,y],
+   but only applies if the argument is constant.
+ {x,y} restricts the constant to one of two values, x or y.
+
+   Here x and y are integer tokens.  Note that the "const" token is a
+   lie when the restriction is [x,y], but this simplifies the parsing
+   significantly and is hopefully forgivable.
+
+   Return 1 for success, else 0.  */
+static int
+match_const_restriction (typeinfo *typedata)
+{
+  return 1;
+}
+
+/* Look for a type, which can be terminated by a token that is not part of
+   a type, a comma, or a closing parenthesis.  Place information about the
+   type in TYPEDATA.  Return 1 for success, 0 for failure.  */
+static int
+match_type (typeinfo *typedata, int voidok)
+{
+  /* A legal type is of the form:
+
+   [const] [[signed|unsigned] <basetype> | <vectype>] [*]
+
+ where "const" applies only to a <basetype> of "int".  Legal values
+ of <basetype> are (for now):
+
+   char
+   short
+   int
+   long long
+   float
+   double
+   __int128
+   _Float128
+   _Decimal32
+   _Decimal64
+   _Decimal128
+   __ibm128
+
+ Legal values of <vectype> are as follows, and are shorthand for
+ the associated meaning:
+
+   vsc vector signed char
+   vuc vector unsigned char
+   vbc vector bool char
+   vss vector signed short
+   vus vector unsigned short
+   vbs vector bool short
+   vsi vector signed int
+   vui vector unsigned int
+   vbi vector bool int
+   vsll vector signed long long
+   vull vector unsigned long long
+   vbll vector bool long long
+   vsq vector signed __int128
+   vuq vector unsigned __int128
+   vbq vector bool __int128
+   vp  vector pixel
+   vf  vector float
+   vd  vector double
+   vop opaque vector (matches all vectors)
+
+ For simplicity, we don't support "short int" and "long long int".
+ We don't support a <basetype> of "bool", "long double", or "_Float16",
+ but will add these if builtins require it.  "signed" and "unsigned"
+ only apply to integral base types.  The optional * indicates a pointer
+ type, which can be used with any base type, but is treated for type
+ signature purposes as a pointer to void.  */
+
+  consume_whitespace ();
+  memset (typedata, 0, sizeof(*typedata));
+  int oldpos = pos;
+
+  char *token = match_identifier ();
+  if (!token)
+return 0;
+
+  if (!strcmp (token, "void"))
+typedata->isvoid = 1;
+
+  if (!strcmp (token, "const"))
+{
+  typedata->isconst = 1;
+

[PATCH 08/29] rs6000: Red-black tree implementation for balanced tree search

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rbtree.c: New file.
* config/rs6000/rbtree.h: New file.
---
 gcc/config/rs6000/rbtree.c | 233 +
 gcc/config/rs6000/rbtree.h |  51 
 2 files changed, 284 insertions(+)
 create mode 100644 gcc/config/rs6000/rbtree.c
 create mode 100644 gcc/config/rs6000/rbtree.h

diff --git a/gcc/config/rs6000/rbtree.c b/gcc/config/rs6000/rbtree.c
new file mode 100644
index 000..50e5f57a50c
--- /dev/null
+++ b/gcc/config/rs6000/rbtree.c
@@ -0,0 +1,233 @@
+/* Partial red-black tree implementation for rs6000-gen-builtins.c.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Bill Schmidt, IBM 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include 
+#include 
+#include 
+#include 
+#include "rbtree.h"
+
+/* Create a new node to be inserted into the red-black tree.  An inserted
+   node starts out red.  */
+static struct rbt_string_node *
+rbt_create_node (struct rbt_strings *t, char *str)
+{
+  struct rbt_string_node *nodeptr
+= (struct rbt_string_node *) malloc (sizeof (struct rbt_string_node));
+  nodeptr->str = str;
+  nodeptr->left = t->rbt_nil;
+  nodeptr->right = t->rbt_nil;
+  nodeptr->par = NULL;
+  nodeptr->color = RBT_RED;
+  return nodeptr;
+}
+
+/* Perform a left-rotate operation on NODE in the red-black tree.  */
+static void
+rbt_left_rotate (struct rbt_strings *t, struct rbt_string_node *node)
+{
+  struct rbt_string_node *right = node->right;
+  assert (right);
+
+  /* Turn RIGHT's left subtree into NODE's right subtree.  */
+  node->right = right->left;
+  if (right->left != t->rbt_nil)
+right->left->par = node;
+
+  /* Link NODE's parent to RIGHT.  */
+  right->par = node->par;
+
+  if (node->par == t->rbt_nil)
+t->rbt_root = right;
+  else if (node == node->par->left)
+node->par->left = right;
+  else
+node->par->right = right;
+
+  /* Put NODE on RIGHT's left.  */
+  right->left = node;
+  node->par = right;
+}
+
+/* Perform a right-rotate operation on NODE in the red-black tree.  */
+static void
+rbt_right_rotate (struct rbt_strings *t, struct rbt_string_node *node)
+{
+  struct rbt_string_node *left = node->left;
+  assert (left);
+
+  /* Turn LEFT's right subtree into NODE's left subtree.  */
+  node->left = left->right;
+  if (left->right != t->rbt_nil)
+left->right->par = node;
+
+  /* Link NODE's parent to LEFT.  */
+  left->par = node->par;
+
+  if (node->par == t->rbt_nil)
+t->rbt_root = left;
+  else if (node == node->par->right)
+node->par->right = left;
+  else
+node->par->left = left;
+
+  /* Put NODE on LEFT's right.  */
+  left->right = node;
+  node->par = left;
+}
+
+/* Insert STR into the tree, returning 1 for success and 0 if STR already
+   appears in the tree.  */
+int
+rbt_insert (struct rbt_strings *t, char *str)
+{
+  struct rbt_string_node *curr = t->rbt_root;
+  struct rbt_string_node *trail = t->rbt_nil;
+
+  while (curr != t->rbt_nil)
+{
+  trail = curr;
+  int cmp = strcmp (str, curr->str);
+  if (cmp < 0)
+   curr = curr->left;
+  else if (cmp > 0)
+   curr = curr->right;
+  else
+   return 0;
+}
+
+  struct rbt_string_node *fresh = rbt_create_node (t, str);
+  fresh->par = trail;
+
+  if (trail == t->rbt_nil)
+t->rbt_root = fresh;
+  else if (strcmp (fresh->str, trail->str) < 0)
+trail->left = fresh;
+  else
+trail->right = fresh;
+
+  fresh->left = t->rbt_nil;
+  fresh->right = t->rbt_nil;
+
+  /* FRESH has now been inserted as a red leaf.  If we have invalidated
+ one of the following preconditions, we must fix things up:
+  (a) If a node is red, both of its children are black.
+  (b) The root must be black.
+ Note that only (a) or (b) applies at any given time during the
+ process.  This algorithm works up the tree from FRESH looking
+ for a red child with a red parent, and cleaning that up.  If the
+ root ends up red, it gets turned black at the end.  */
+  curr = fresh;
+  while (curr->par->color == RBT_RED)
+if (curr->par == curr->par->par->left)
+  {
+   struct rbt_string_node *uncle = curr->par->par->right;
+   if (uncle->color == RBT_RED)
+ {
+   curr->par->color = RBT_BLACK;
+   uncle->color = RBT_BLACK;
+   curr->par->par->color = RBT_RED;
+   curr =

[PATCH 02/29] rs6000: Add initial input files

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

This patch adds a tiny subset of the built-in and overload descriptions.

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-builtin-new.def: New.
* config/rs6000/rs6000-overload.def: New.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 179 +++
 gcc/config/rs6000/rs6000-overload.def|  57 
 2 files changed, 236 insertions(+)
 create mode 100644 gcc/config/rs6000/rs6000-builtin-new.def
 create mode 100644 gcc/config/rs6000/rs6000-overload.def

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def 
b/gcc/config/rs6000/rs6000-builtin-new.def
new file mode 100644
index 000..5fc7e1301c3
--- /dev/null
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -0,0 +1,179 @@
+; Built-in functions for PowerPC.
+; Copyright (C) 2020 Free Software Foundation, Inc.
+; Contributed by Bill Schmidt, IBM 
+;
+; This file is part of GCC.
+;
+; GCC is free software; you can redistribute it and/or modify it under
+; the terms of the GNU General Public License as published by the Free
+; Software Foundation; either version 3, or (at your option) any later
+; version.
+;
+; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+; WARRANTY; without even the implied warranty of MERCHANTABILITY or
+; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+; for more details.
+;
+; You should have received a copy of the GNU General Public License
+; along with GCC; see the file COPYING3.  If not see
+; <http://www.gnu.org/licenses/>.  */
+
+
+; Built-in functions in this file are organized into "stanzas", where
+; all built-ins in a given stanza are enabled together.  Each stanza
+; starts with a line identifying the circumstances in which the group of
+; functions is permitted, with the gating predicate in square brackets.
+; This is the only information allowed on the stanza header line, other
+; than whitespace.
+;
+; Following the stanza header are two lines for each function: the
+; prototype line and the attributes line.  The prototype line has
+; this format, where the square brackets indicate optional
+; information and angle brackets indicate required information:
+;
+;   [kind] <return-type> <bif-name> (<argument-list>);
+;
+; Here [kind] can be one of "const", "pure", or "fpmath";
+; <return-type> is a legal type for a built-in function result;
+; <bif-name> is the name by which the function can be called;
+; and <argument-list> is a comma-separated list of legal types
+; for built-in function arguments.  The argument list may be
+; empty, but the parentheses and semicolon are required.
+;
+; A legal type is of the form:
+;
+;   [const] [[signed|unsigned] <basetype> | <vectype>] [*]
+;
+; where "const" applies only to a <basetype> of "int".  Legal values
+; of <basetype> are (for now):
+;
+;   char
+;   short
+;   int
+;   long long
+;   float
+;   double
+;   __int128
+;   _Float128
+;   _Decimal32
+;   _Decimal64
+;   _Decimal128
+;   __ibm128
+;
+; Legal values of <vectype> are as follows, and are shorthand for
+; the associated meaning:
+;
+;   vsc    vector signed char
+;   vuc    vector unsigned char
+;   vbc    vector bool char
+;   vss    vector signed short
+;   vus    vector unsigned short
+;   vbs    vector bool short
+;   vsi    vector signed int
+;   vui    vector unsigned int
+;   vbi    vector bool int
+;   vsll   vector signed long long
+;   vull   vector unsigned long long
+;   vbll   vector bool long long
+;   vsq    vector signed __int128
+;   vuq    vector unsigned __int128
+;   vbq    vector bool __int128
+;   vp     vector pixel
+;   vf     vector float
+;   vd     vector double
+;   vop    opaque vector (matches all vectors)
+;
+; For simplicity, we don't support "short int" and "long long int".
+; We don't currently support a <basetype> of "bool", "long double",
+; or "_Float16".  "signed" and "unsigned" only apply to integral base
+; types.  The optional * indicates a pointer type, which can be used
+; only with "void" and "const char" in this file.  (More specific
+; pointer types are allowed in overload prototypes.)
+;
+; The attributes line looks like this:
+;
+;   <bif-id> <bif-pattern> {<attribute-list>}
+;
+; Here <bif-id> is a unique internal identifier for the built-in
+; function that will be used as part of an enumeration of all
+; built-in functions; <bif-pattern> is the define_expand or
+; define_insn that will be invoked when the call is expanded;
+; and <attribute-list> is a comma-separated list of special
+; conditions that apply to the built-in function.  The attribute
+; list may be empty, but the braces are required.
+;
+; Attributes are strings, and the allowed ones are listed below.
+;
+;   init     Process as a vec_init function
+;   set      Process as a vec_set function
+;   extract  Process as a vec_extract function
+;   nosoft   Not valid with -msoft-float
+;   ldvec    Needs special handling for vec_ld semantics
+;   stvec    Needs special handling for vec_st semantics
+;   reve Needs special

[PATCH 06/29] rs6000: Add functions for matching types, part 2 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (match_basetype):
Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index ea1ebedfa52..efc0b2dec65 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -364,6 +364,55 @@ handle_pointer (typeinfo *typedata)
 static int
 match_basetype (typeinfo *typedata)
 {
+  consume_whitespace ();
+  int oldpos = pos;
+  char *token = match_identifier ();
+  if (!token)
+{
+  (*diag) ("missing base type in return type at column %d\n", pos + 1);
+  return 0;
+}
+
+  if (!strcmp (token, "char"))
+typedata->base = BT_CHAR;
+  else if (!strcmp (token, "short"))
+typedata->base = BT_SHORT;
+  else if (!strcmp (token, "int"))
+typedata->base = BT_INT;
+  else if (!strcmp (token, "long"))
+{
+  consume_whitespace ();
+  char *mustbelong = match_identifier ();
+  if (!mustbelong || strcmp (mustbelong, "long"))
+   {
+ (*diag) ("incomplete 'long long' at column %d\n", oldpos + 1);
+ return 0;
+   }
+  typedata->base = BT_LONGLONG;
+}
+  else if (!strcmp (token, "float"))
+typedata->base = BT_FLOAT;
+  else if (!strcmp (token, "double"))
+typedata->base = BT_DOUBLE;
+  else if (!strcmp (token, "__int128"))
+typedata->base = BT_INT128;
+  else if (!strcmp (token, "_Float128"))
+typedata->base = BT_FLOAT128;
+  else if (!strcmp (token, "_Decimal32"))
+typedata->base = BT_DECIMAL32;
+  else if (!strcmp (token, "_Decimal64"))
+typedata->base = BT_DECIMAL64;
+  else if (!strcmp (token, "_Decimal128"))
+typedata->base = BT_DECIMAL128;
+  else if (!strcmp (token, "__ibm128"))
+typedata->base = BT_IBM128;
+  else
+{
+  (*diag) ("unrecognized base type at column %d\n", oldpos + 1);
+  return 0;
+}
+
+  handle_pointer (typedata);
   return 1;
 }
 
-- 
2.17.1
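
The if/else chain in match_basetype could also be written as a table
lookup.  The following is a minimal sketch of that alternative, not
what the patch does; lookup_basetype, lookup_basetype_pair, the
basetype_name struct, and the BT_NONE value are hypothetical additions
for illustration.  The type names and the BT_* enumerators are the ones
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum basetype {
  BT_CHAR, BT_SHORT, BT_INT, BT_LONGLONG, BT_FLOAT, BT_DOUBLE,
  BT_INT128, BT_FLOAT128, BT_DECIMAL32, BT_DECIMAL64, BT_DECIMAL128,
  BT_IBM128,
  BT_NONE   /* Hypothetical "no match" value, not in the patch.  */
};

struct basetype_name {
  const char *name;
  enum basetype bt;
};

/* Base types spelled with a single token.  "long long" needs two tokens
   and is handled separately below.  */
static const struct basetype_name single_token_types[] = {
  { "char", BT_CHAR },
  { "short", BT_SHORT },
  { "int", BT_INT },
  { "float", BT_FLOAT },
  { "double", BT_DOUBLE },
  { "__int128", BT_INT128 },
  { "_Float128", BT_FLOAT128 },
  { "_Decimal32", BT_DECIMAL32 },
  { "_Decimal64", BT_DECIMAL64 },
  { "_Decimal128", BT_DECIMAL128 },
  { "__ibm128", BT_IBM128 }
};

/* Map a single identifier token to a base type, or BT_NONE.  */
static enum basetype
lookup_basetype (const char *token)
{
  assert (token != NULL);
  size_t n = sizeof single_token_types / sizeof single_token_types[0];
  for (size_t i = 0; i < n; i++)
    if (!strcmp (token, single_token_types[i].name))
      return single_token_types[i].bt;
  return BT_NONE;
}

/* As above, but accept "long" only when SECOND is also "long".  SECOND
   may be NULL when no further identifier follows.  */
static enum basetype
lookup_basetype_pair (const char *first, const char *second)
{
  assert (first != NULL);
  if (!strcmp (first, "long"))
    return (second != NULL && !strcmp (second, "long"))
	   ? BT_LONGLONG : BT_NONE;
  return lookup_basetype (first);
}
```

With a table, adding a base type becomes a one-line change.  The chain
in the patch has the advantage of keeping the diagnostics next to the
case that produces them.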

[PATCH 09/29] rs6000: Main function with stubs for parsing and output

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (rbtree.h): New #include.
(num_bifs): New filescope variable.
(num_ovld_stanzas): Likewise.
(num_ovlds): Likewise.
(exit_codes): Add more enum values.
(parse_codes): New enum.
(bif_rbt): New filescope variable.
(ovld_rbt): Likewise.
(fntype_rbt): Likewise.
(parse_bif): New stub function.
(parse_ovld): Likewise.
(write_header_file): Likewise.
(write_init_file): Likewise.
(write_defines_file): Likewise.
(delete_output_files): New function.
(main): Likewise.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 210 
 1 file changed, 210 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index 18c67ce2202..d6058a8e73b 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -139,6 +139,7 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #include 
 #include 
+#include "rbtree.h"
 
 /* Used as a sentinel for range constraints on integer fields.  No field can
be 32 bits wide, so this is a safe sentinel value.  */
@@ -220,11 +221,41 @@ struct typeinfo {
   int val2;
 };
 
+static int num_bifs;
+static int num_ovld_stanzas;
+static int num_ovlds;
+
 /* Exit codes for the shell.  */
 enum exit_codes {
+  EC_OK,
+  EC_BADARGS,
+  EC_NOBIF,
+  EC_NOOVLD,
+  EC_NOHDR,
+  EC_NOINIT,
+  EC_NODEFINES,
+  EC_PARSEBIF,
+  EC_PARSEOVLD,
+  EC_WRITEHDR,
+  EC_WRITEINIT,
+  EC_WRITEDEFINES,
   EC_INTERR
 };
 
+/* Return codes for parsing routines.  */
+enum parse_codes {
+  PC_OK,
+  PC_EOFILE,
+  PC_EOSTANZA,
+  PC_PARSEFAIL
+};
+
+/* The red-black trees for built-in function identifiers, built-in
+   overload identifiers, and function type descriptors.  */
+static rbt_strings bif_rbt;
+static rbt_strings ovld_rbt;
+static rbt_strings fntype_rbt;
+
 /* Pointer to a diagnostic function.  */
 void (*diag) (const char *, ...) __attribute__ ((format (printf, 1, 2)))
   = NULL;
@@ -875,3 +906,182 @@ match_type (typeinfo *typedata, int voidok)
   return match_basetype (typedata);
 }
 
+/* Parse the built-in file.  */
+static parse_codes
+parse_bif ()
+{
+  return PC_OK;
+}
+
+/* Parse the overload file.  */
+static parse_codes
+parse_ovld ()
+{
+  return PC_OK;
+}
+
+/* Write everything to the header file (rs6000-builtins.h).  */
+static int
+write_header_file ()
+{
+  return 1;
+}
+
+/* Write everything to the initialization file (rs6000-builtins.c).  */
+static int
+write_init_file ()
+{
+  return 1;
+}
+
+/* Write everything to the include file (rs6000-vecdefines.h).  */
+static int
+write_defines_file ()
+{
+  return 1;
+}
+
+/* Close and delete output files after any failure, so that subsequent
+   build dependencies will fail.  */
+static void
+delete_output_files ()
+{
+  /* Depending on whence we're called, some of these may already be
+ closed.  Don't check for errors.  */
+  fclose (header_file);
+  fclose (init_file);
+  fclose (defines_file);
+
+  unlink (header_path);
+  unlink (init_path);
+  unlink (defines_path);
+}
+
+/* Main program to convert flat files into built-in initialization code.  */
+int
+main (int argc, const char **argv)
+{
+  if (argc != 6)
+{
+  fprintf (stderr,
+  "Five arguments required: two input files and three output "
+  "files.\n");
+  exit (EC_BADARGS);
+}
+
+  pgm_path = argv[0];
+  bif_path = argv[1];
+  ovld_path = argv[2];
+  header_path = argv[3];
+  init_path = argv[4];
+  defines_path = argv[5];
+
+  bif_file = fopen (bif_path, "r");
+  if (!bif_file)
+{
+  fprintf (stderr, "Cannot find input built-in file '%s'.\n", bif_path);
+  exit (EC_NOBIF);
+}
+  ovld_file = fopen (ovld_path, "r");
+  if (!ovld_file)
+{
+  fprintf (stderr, "Cannot find input overload file '%s'.\n", ovld_path);
+  exit (EC_NOOVLD);
+}
+  header_file = fopen (header_path, "w");
+  if (!header_file)
+{
+  fprintf (stderr, "Cannot open header file '%s' for output.\n",
+  header_path);
+  exit (EC_NOHDR);
+}
+  init_file = fopen (init_path, "w");
+  if (!init_file)
+{
+  fprintf (stderr, "Cannot open init file '%s' for output.\n", init_path);
+  exit (EC_NOINIT);
+}
+  defines_file = fopen (defines_path, "w");
+  if (!defines_file)
+{
+  fprintf (stderr, "Cannot open defines file '%s' for output.\n",
+  defines_path);
+  exit (EC_NODEFINES);
+}
+
+  /* Initialize the balanced trees containing built-in function ids,
+ overload function ids, and function type declaration ids.  */
+  bif_rbt.rbt_nil = (rbt_string_node *) malloc (sizeof (rbt_string_node));
+  bif_rbt.rbt_nil->color = RBT_BLACK;
+  bif_rbt.rbt_root = bif_rbt.rbt_nil;
+
+  ovld_rbt.rbt_nil = (rbt_string_node *) malloc (sizeof (rbt_string_node));
+

[PATCH 03/29] rs6000: Add file support and functions for diagnostic support

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (bif_file): New filescope
variable.
(ovld_file): Likewise.
(header_file): Likewise.
(init_file): Likewise.
(defines_file): Likewise.
(pgm_path): Likewise.
(bif_path): Likewise.
(ovld_path): Likewise.
(header_path): Likewise.
(init_path): Likewise.
(defines_path): Likewise.
(LINELEN): New defined constant.
(linebuf): New filescope variable.
(line): Likewise.
(pos): Likewise.
(diag): Likewise.
(bif_diag): New function.
(ovld_diag): New function.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index 462387f4b44..8c8fad66edf 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -139,3 +139,51 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #include 
 #include 
+
+/* Input and output file descriptors and pathnames.  */
+static FILE *bif_file;
+static FILE *ovld_file;
+static FILE *header_file;
+static FILE *init_file;
+static FILE *defines_file;
+
+static const char *pgm_path;
+static const char *bif_path;
+static const char *ovld_path;
+static const char *header_path;
+static const char *init_path;
+static const char *defines_path;
+
+/* Position information.  Note that "pos" is zero-indexed, but users
+   expect one-indexed column information, so representations of "pos"
+   as columns in diagnostic messages must be adjusted.  */
+#define LINELEN 1024
+static char linebuf[LINELEN];
+static int line;
+static int pos;
+
+/* Pointer to a diagnostic function.  */
+void (*diag) (const char *, ...) __attribute__ ((format (printf, 1, 2)))
+  = NULL;
+
+/* Custom diagnostics.  */
+static void __attribute__ ((format (printf, 1, 2)))
+bif_diag (const char * fmt, ...)
+{
+  va_list args;
+  fprintf (stderr, "%s:%d: ", bif_path, line);
+  va_start (args, fmt);
+  vfprintf (stderr, fmt, args);
+  va_end (args);
+}
+
+static void __attribute__ ((format (printf, 1, 2)))
+ovld_diag (const char * fmt, ...)
+{
+  va_list args;
+  fprintf (stderr, "%s:%d: ", ovld_path, line);
+  va_start (args, fmt);
+  vfprintf (stderr, fmt, args);
+  va_end (args);
+}
+
-- 
2.17.1
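
To illustrate how the diag pointer is meant to be used, here is a
minimal sketch under a few assumptions.  Shared parsing code calls
(*diag) and the driver points diag at bif_diag or ovld_diag depending
on which file is being read.  The emit helper, the last_msg buffer, the
report_missing_paren function, and the two path strings are
hypothetical and exist only so the result can be inspected; the patch
writes straight to stderr.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static const char *bif_path = "builtins.def";    /* Hypothetical path.  */
static const char *ovld_path = "overload.def";   /* Hypothetical path.  */
static int line;

/* Hypothetical: keeps a copy of the most recent diagnostic.  */
static char last_msg[256];

/* Format "PATH:LINE: " followed by the message, then print it.  */
static void
emit (const char *path, const char *fmt, va_list args)
{
  int n = snprintf (last_msg, sizeof last_msg, "%s:%d: ", path, line);
  assert (n >= 0 && (size_t) n < sizeof last_msg);
  vsnprintf (last_msg + n, sizeof last_msg - (size_t) n, fmt, args);
  fputs (last_msg, stderr);
}

static void
bif_diag (const char *fmt, ...)
{
  va_list args;
  va_start (args, fmt);
  emit (bif_path, fmt, args);
  va_end (args);
}

static void
ovld_diag (const char *fmt, ...)
{
  va_list args;
  va_start (args, fmt);
  emit (ovld_path, fmt, args);
  va_end (args);
}

/* Pointer to the diagnostic function for the file being parsed.  */
static void (*diag) (const char *, ...) = NULL;

/* Shared parsing code reports through the pointer and never needs to
   know which input file is active.  POS is zero-indexed, so add one
   for the user-visible column.  */
static void
report_missing_paren (int pos)
{
  (*diag) ("missing '(' at column %d.\n", pos + 1);
}
```

A driver would set diag = bif_diag before parsing the built-in file and
diag = ovld_diag before parsing the overload file, so the same routine
yields a message prefixed with the right path.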

[PATCH 07/29] rs6000: Add functions for matching types, part 3 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (restriction): New enum.
(typeinfo): Add restriction field.
(match_const_restriction): Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 136 
 1 file changed, 136 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index efc0b2dec65..18c67ce2202 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -188,6 +188,21 @@ enum basetype {
   BT_IBM128
 };
 
+/* Ways in which a const int value can be restricted.  RES_BITS indicates
+   that the integer is restricted to val1 bits, interpreted as an unsigned
+   number.  RES_RANGE indicates that the integer is restricted to values
+   between val1 and val2, inclusive.  RES_VAR_RANGE is like RES_RANGE, but
+   the argument may be variable, so it can only be checked if it is constant.
+   RES_VALUES indicates that the integer must have one of the values val1
+   or val2.  */
+enum restriction {
+  RES_NONE,
+  RES_BITS,
+  RES_RANGE,
+  RES_VAR_RANGE,
+  RES_VALUES
+};
+
 /* Type modifiers for an argument or return type.  */
 struct typeinfo {
   char isvoid;
@@ -200,6 +215,7 @@ struct typeinfo {
   char ispointer;
   char isopaque;
   basetype base;
+  restriction restr;
   int val1;
   int val2;
 };
@@ -433,6 +449,126 @@ match_basetype (typeinfo *typedata)
 static int
 match_const_restriction (typeinfo *typedata)
 {
+  int oldpos = pos;
+  if (linebuf[pos] == '<')
+    {
+      safe_inc_pos ();
+      oldpos = pos;
+      int x = match_integer ();
+      if (x == MININT)
+        {
+          (*diag) ("malformed integer at column %d.\n", oldpos + 1);
+          return 0;
+        }
+      consume_whitespace ();
+      if (linebuf[pos] == '>')
+        {
+          typedata->restr = RES_BITS;
+          typedata->val1 = x;
+          safe_inc_pos ();
+          return 1;
+        }
+      else if (linebuf[pos] != ',')
+        {
+          (*diag) ("malformed restriction at column %d.\n", pos + 1);
+          return 0;
+        }
+      safe_inc_pos ();
+      oldpos = pos;
+      int y = match_integer ();
+      if (y == MININT)
+        {
+          (*diag) ("malformed integer at column %d.\n", oldpos + 1);
+          return 0;
+        }
+      typedata->restr = RES_RANGE;
+      typedata->val1 = x;
+      typedata->val2 = y;
+
+      consume_whitespace ();
+      if (linebuf[pos] != '>')
+        {
+          (*diag) ("malformed restriction at column %d.\n", pos + 1);
+          return 0;
+        }
+      safe_inc_pos ();
+    }
+  else if (linebuf[pos] == '{')
+    {
+      safe_inc_pos ();
+      oldpos = pos;
+      int x = match_integer ();
+      if (x == MININT)
+        {
+          (*diag) ("malformed integer at column %d.\n", oldpos + 1);
+          return 0;
+        }
+      consume_whitespace ();
+      if (linebuf[pos] != ',')
+        {
+          (*diag) ("missing comma at column %d.\n", pos + 1);
+          return 0;
+        }
+      safe_inc_pos ();
+      consume_whitespace ();
+      oldpos = pos;
+      int y = match_integer ();
+      if (y == MININT)
+        {
+          (*diag) ("malformed integer at column %d.\n", oldpos + 1);
+          return 0;
+        }
+      typedata->restr = RES_VALUES;
+      typedata->val1 = x;
+      typedata->val2 = y;
+
+      consume_whitespace ();
+      if (linebuf[pos] != '}')
+        {
+          (*diag) ("malformed restriction at column %d.\n", pos + 1);
+          return 0;
+        }
+      safe_inc_pos ();
+    }
+  else
+    {
+      assert (linebuf[pos] == '[');
+      safe_inc_pos ();
+      oldpos = pos;
+      int x = match_integer ();
+      if (x == MININT)
+        {
+          (*diag) ("malformed integer at column %d.\n", oldpos + 1);
+          return 0;
+        }
+      consume_whitespace ();
+      if (linebuf[pos] != ',')
+        {
+          (*diag) ("missing comma at column %d.\n", pos + 1);
+          return 0;
+        }
+      safe_inc_pos ();
+      consume_whitespace ();
+      oldpos = pos;
+      int y = match_integer ();
+      if (y == MININT)
+        {
+          (*diag) ("malformed integer at column %d.\n", oldpos + 1);
+          return 0;
+        }
+      typedata->restr = RES_VAR_RANGE;
+      typedata->val1 = x;
+      typedata->val2 = y;
+
+      consume_whitespace ();
+      if (linebuf[pos] != ']')
+        {
+          (*diag) ("malformed restriction at column %d.\n", pos + 1);
+          return 0;
+        }
+      safe_inc_pos ();
+    }
+
   return 1;
 }
 
-- 
2.17.1
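
For readers following the restriction syntax, here is a minimal standalone sketch of the four forms that match_const_restriction accepts: <x>, <x,y>, {x,y} and [x,y].  It works on a plain string rather than on the generator's linebuf/pos state.  The names parse_restriction, struct restr_info and value_ok are hypothetical and are not part of the patch; the meaning of each RES_* value follows the comment on enum restriction.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Mirrors the enum added by the patch.  */
enum restriction { RES_NONE, RES_BITS, RES_RANGE, RES_VAR_RANGE, RES_VALUES };

/* Hypothetical holder for a parsed restriction.  */
struct restr_info { enum restriction restr; int val1; int val2; };

/* Skip blanks, parse a decimal integer, skip trailing blanks.
   Returns 1 on success, 0 if no digits were found.  */
static int
parse_int (const char **p, int *out)
{
  char *end;
  while (isspace ((unsigned char) **p))
    (*p)++;
  long v = strtol (*p, &end, 10);
  if (end == *p)
    return 0;
  *p = end;
  while (isspace ((unsigned char) **p))
    (*p)++;
  *out = (int) v;
  return 1;
}

/* Parse "<x>", "<x,y>", "{x,y}" or "[x,y]".  Returns 1 on success.  */
int
parse_restriction (const char *s, struct restr_info *r)
{
  char opener = *s++;
  char closer = opener == '<' ? '>'
                : opener == '{' ? '}'
                : opener == '[' ? ']' : 0;
  if (!closer || !parse_int (&s, &r->val1))
    return 0;
  if (opener == '<' && *s == '>')
    {
      r->restr = RES_BITS;
      r->val2 = 0;
      return 1;
    }
  if (*s != ',')
    return 0;
  s++;
  if (!parse_int (&s, &r->val2) || *s != closer)
    return 0;
  r->restr = opener == '<' ? RES_RANGE
             : opener == '{' ? RES_VALUES : RES_VAR_RANGE;
  return 1;
}

/* Check a constant V against a parsed restriction.  */
int
value_ok (const struct restr_info *r, long v)
{
  switch (r->restr)
    {
    case RES_BITS:
      /* val1 bits, interpreted as an unsigned number.  */
      assert (r->val1 > 0 && r->val1 < 31);
      return v >= 0 && v < (1L << r->val1);
    case RES_RANGE:
    case RES_VAR_RANGE:
      return v >= r->val1 && v <= r->val2;
    case RES_VALUES:
      return v == r->val1 || v == r->val2;
    default:
      return 1;
    }
}
```

The sketch treats RES_VAR_RANGE like RES_RANGE once a constant is in hand; deciding whether the argument is constant in the first place is left to the caller.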

[PATCH 01/29] rs6000: Initial create of rs6000-gen-builtins.c

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

Add header commentary explaining the purpose of rs6000-gen-builtins.c,
along with an initial set of includes.

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c: New.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 141 
 1 file changed, 141 insertions(+)
 create mode 100644 gcc/config/rs6000/rs6000-gen-builtins.c

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
new file mode 100644
index 000..462387f4b44
--- /dev/null
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -0,0 +1,141 @@
+/* Generate built-in function initialization and recognition for Power.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Bill Schmidt, IBM 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* This program generates built-in function initialization and
+   recognition code for Power targets, based on text files that
+   describe the built-in functions and vector overloads:
+
+     rs6000-builtin-new.def   Table of built-in functions
+     rs6000-overload.def      Table of overload functions
+
+   Both files group similar functions together in "stanzas," as
+   described below.
+
+   Each stanza in the built-in function file starts with a line
+   identifying the circumstances in which the group of functions is
+   permitted, with the gating predicate in square brackets.  For
+   example, this could be
+
+ [altivec]
+
+   or it could be
+
+ [power9]
+
+   The bracketed gating predicate is the only information allowed on
+   the stanza header line, other than whitespace.
+
+   Following the stanza header are two lines for each function: the
+   prototype line and the attributes line.  The prototype line has
+   this format, where the square brackets indicate optional
+   information and angle brackets indicate required information:
+
+     [kind] <return-type> <bif-name> (<argument-list>);
+
+   Here [kind] can be one of "const", "pure", or "fpmath";
+   <return-type> is a legal type for a built-in function result;
+   <bif-name> is the name by which the function can be called;
+   and <argument-list> is a comma-separated list of legal types
+   for built-in function arguments.  The argument list may be
+   empty, but the parentheses and semicolon are required.
+
+   The attributes line looks like this:
+
+     <bif-id> <bif-pattern> {<attribute-list>}
+
+   Here <bif-id> is a unique internal identifier for the built-in
+   function that will be used as part of an enumeration of all
+   built-in functions; <bif-pattern> is the define_expand or
+   define_insn that will be invoked when the call is expanded;
+   and <attribute-list> is a comma-separated list of special
+   conditions that apply to the built-in function.  The attribute
+   list may be empty, but the braces are required.
+
+   Attributes are strings, such as these:
+
+     init     Process as a vec_init function
+     set      Process as a vec_set function
+     extract  Process as a vec_extract function
+     nosoft   Not valid with -msoft-float
+     ldvec    Needs special handling for vec_ld semantics
+     stvec    Needs special handling for vec_st semantics
+     reve     Needs special handling for element reversal
+     pred     Needs special handling for comparison predicates
+     htm      Needs special handling for transactional memory
+     htmspr   HTM function using an SPR
+     htmcr    HTM function using a CR
+     mma      Needs special handling for MMA instructions
+     no32bit  Not valid for TARGET_32BIT
+     cpu      This is a "cpu_is" or "cpu_supports" builtin
+     ldstmask Altivec mask for load or store
+
+   An example stanza might look like this:
+
+     [altivec]
+       const vsc __builtin_altivec_abs_v16qi (vsc);
+         ABS_V16QI absv16qi2 {}
+       const vss __builtin_altivec_abs_v8hi (vss);
+         ABS_V8HI absv8hi2 {}
+
+   Here "vsc" and "vss" are shorthand for "vector signed char" and
+   "vector signed short" to shorten line lengths and improve readability.
+   Note the use of indentation, which is recommended but not required.
+
+   The overload file has more complex stanza headers.  Here the stanza
+   represents all functions with the same overloaded function name:
+
+     [<overload-id>, <abi-name>, <builtin-name>]
+
+   Here the square brackets are part of the syntax, <overload-id> is a
+   unique internal identifier for the overload that will be used as part
+   of an enumeration of all overloaded functions; <abi-name> is the name
+   that will appear as a #define in altivec.h; and <builtin-name> is the
+   name that is
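
As an illustration of the attributes line described in the header comment, the following is a minimal sketch of how such a line could be split into its identifier, pattern name, and attribute list.  It is not the parser from this series: parse_attr_line, the ATTR_* flags, and the four-entry attribute table are made up for the example, and only a subset of the attribute strings is recognized.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flags for a subset of the documented attributes.  */
enum { ATTR_INIT = 1, ATTR_SET = 2, ATTR_EXTRACT = 4, ATTR_NOSOFT = 8 };

static const struct { const char *name; unsigned flag; } attr_names[] = {
  { "init", ATTR_INIT }, { "set", ATTR_SET },
  { "extract", ATTR_EXTRACT }, { "nosoft", ATTR_NOSOFT }
};

/* Parse "<id> <pattern> {attr, attr, ...}".  ID and PAT must each have
   room for 64 bytes.  Returns 1 on success, 0 on malformed input or an
   unknown attribute.  The braces are required; the list may be empty.  */
int
parse_attr_line (const char *line, char *id, char *pat, unsigned *flags)
{
  int used = 0;
  assert (line && id && pat && flags);
  *flags = 0;
  /* %n is only stored if the literal '{' was matched.  */
  if (sscanf (line, " %63s %63s {%n", id, pat, &used) != 2 || used == 0)
    return 0;

  const char *p = line + used;
  for (;;)
    {
      /* This sketch is lenient about blanks and commas.  */
      while (*p == ' ' || *p == ',')
        p++;
      if (*p == '}')
        return 1;
      size_t len = strcspn (p, " ,}");
      if (len == 0)
        return 0;               /* End of string before '}'.  */
      size_t i, n = sizeof attr_names / sizeof attr_names[0];
      for (i = 0; i < n; i++)
        if (strlen (attr_names[i].name) == len
            && !strncmp (p, attr_names[i].name, len))
          break;
      if (i == n)
        return 0;               /* Unknown attribute.  */
      *flags |= attr_names[i].flag;
      p += len;
    }
}
```

Given the example stanza, the line "ABS_V16QI absv16qi2 {}" yields the identifier ABS_V16QI, the pattern absv16qi2, and an empty flag set.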

[PATCH 10/29] rs6000: Parsing built-in input file, part 1 of 3

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

2020-07-26  Bill Schmidt  

* config/rs6000/rs6000-gen-builtins.c (bif_stanza): New enum.
(curr_bif_stanza): Likewise.
(stanza_entry): New struct.
(stanza_map): New initialized filescope variable.
(enable_string): Likewise.
(fnkinds): New enum.
(typelist): New struct.
(attrinfo): Likewise.
(prototype): Likewise.
(MAXBIFS): New defined constant.
(bifdata): New struct.
(bifs): New filescope variable.
(curr_bif): Likewise.
(stanza_name_to_stanza): New function.
(parse_bif_attrs): New stub function.
(parse_prototype): Likewise.
(parse_bif_entry): New function.
(parse_bif_stanza): Likewise.
(parse_bif): Implement.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 320 +++-
 1 file changed, 319 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
b/gcc/config/rs6000/rs6000-gen-builtins.c
index d6058a8e73b..a8b0d8e4288 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -173,6 +173,93 @@ enum void_status {
   VOID_OK
 };
 
+/* Stanzas are groupings of built-in functions and overloads by some
+   common feature/attribute.  These definitions are for built-in function
+   stanzas.  */
+enum bif_stanza {
+  BSTZ_ALWAYS,
+  BSTZ_P5,
+  BSTZ_P6,
+  BSTZ_ALTIVEC,
+  BSTZ_VSX,
+  BSTZ_P7,
+  BSTZ_P7_64,
+  BSTZ_P8,
+  BSTZ_P8V,
+  BSTZ_P9,
+  BSTZ_P9_64,
+  BSTZ_P9V,
+  BSTZ_IEEE128_HW,
+  BSTZ_DFP,
+  BSTZ_CRYPTO,
+  BSTZ_HTM,
+  BSTZ_P10,
+  BSTZ_MMA,
+  NUMBIFSTANZAS
+};
+
+static bif_stanza curr_bif_stanza;
+
+struct stanza_entry
+{
+  const char *stanza_name;
+  bif_stanza stanza;
+};
+
+static stanza_entry stanza_map[NUMBIFSTANZAS] =
+  {
+    { "always",        BSTZ_ALWAYS     },
+    { "power5",        BSTZ_P5         },
+    { "power6",        BSTZ_P6         },
+    { "altivec",       BSTZ_ALTIVEC    },
+    { "vsx",           BSTZ_VSX        },
+    { "power7",        BSTZ_P7         },
+    { "power7-64",     BSTZ_P7_64      },
+    { "power8",        BSTZ_P8         },
+    { "power8-vector", BSTZ_P8V        },
+    { "power9",        BSTZ_P9         },
+    { "power9-64",     BSTZ_P9_64      },
+    { "power9-vector", BSTZ_P9V        },
+    { "ieee128-hw",    BSTZ_IEEE128_HW },
+    { "dfp",           BSTZ_DFP        },
+    { "crypto",        BSTZ_CRYPTO     },
+    { "htm",           BSTZ_HTM        },
+    { "power10",       BSTZ_P10        },
+    { "mma",           BSTZ_MMA        }
+  };
+
+static const char *enable_string[NUMBIFSTANZAS] =
+  {
+    "ENB_ALWAYS",
+    "ENB_P5",
+    "ENB_P6",
+    "ENB_ALTIVEC",
+    "ENB_VSX",
+    "ENB_P7",
+    "ENB_P7_64",
+    "ENB_P8",
+    "ENB_P8V",
+    "ENB_P9",
+    "ENB_P9_64",
+    "ENB_P9V",
+    "ENB_IEEE128_HW",
+    "ENB_DFP",
+    "ENB_CRYPTO",
+    "ENB_HTM",
+    "ENB_P10",
+    "ENB_MMA"
+  };
+
+/* Function modifiers provide special handling for const, pure, and fpmath
+   functions.  These are mutually exclusive, and therefore kept separate
+   from other bif attributes.  */
+enum fnkinds {
+  FNK_NONE,
+  FNK_CONST,
+  FNK_PURE,
+  FNK_FPMATH
+};
+
 /* Legal base types for an argument or return type.  */
 enum basetype {
   BT_CHAR,
@@ -221,7 +308,58 @@ struct typeinfo {
   int val2;
 };
 
+/* A list of argument types.  */
+struct typelist {
+  typeinfo info;
+  typelist *next;
+};
+
+/* Attributes of a builtin function.  */
+struct attrinfo {
+  char isinit;
+  char isset;
+  char isextract;
+  char isnosoft;
+  char isldvec;
+  char isstvec;
+  char isreve;
+  char ispred;
+  char ishtm;
+  char ishtmspr;
+  char ishtmcr;
+  char ismma;
+  char isno32bit;
+  char iscpu;
+  char isldstmask;
+};
+
+/* Fields associated with a function prototype (bif or overload).  */
+struct prototype {
+  typeinfo rettype;
+  char *bifname;
+  int nargs;
+  typelist *args;
+  int restr_opnd[2];
+  restriction restr[2];
+  int restr_val1[2];
+  int restr_val2[2];
+};
+
+/* Data associated with a builtin function, and a table of such data.  */
+#define MAXBIFS 16384
+struct bifdata {
+  int stanza;
+  fnkinds kind;
+  prototype proto;
+  char *idname;
+  char *patname;
+  attrinfo attrs;
+  char *fndecl;
+};
+
+static bifdata bifs[MAXBIFS];
 static int num_bifs;
+static int curr_bif;
 static int num_ovld_stanzas;
 static int num_ovlds;
 
@@ -404,6 +542,15 @@ handle_pointer (typeinfo *typedata)
 }
 }
 
+static bif_stanza
+stanza_name_to_stanza (const char *stanza_name)
+{
+  for (int i = 0; i < NUMBIFSTANZAS; i++)
+    if (!strcmp (stanza_name, stanza_map[i].stanza_name))
+      return stanza_map[i].stanza;
+  assert (false);
+}
+
 /* Match one of the allowable base types.  Consumes one token unless the
token is "long", which must be paired with a second "long".  Optionally
consumes a following '*' token for pointers.  Return 1
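
The stanza tables in this patch amount to a name-to-enum map plus a parallel array of ENB_* strings.  A minimal standalone sketch of that lookup follows, with the enum shortened to four entries for brevity.  Unlike the patch's stanza_name_to_stanza, which ends in assert (false), this variant returns a sentinel for an unknown name so that the caller can issue a diagnostic; that difference and the helper stanza_enable_string are assumptions of the sketch, not part of the series.

```c
#include <assert.h>
#include <string.h>

/* A shortened version of the patch's enum, for illustration only.  */
enum bif_stanza { BSTZ_ALWAYS, BSTZ_ALTIVEC, BSTZ_VSX, BSTZ_P9, NUMBIFSTANZAS };

struct stanza_entry { const char *stanza_name; enum bif_stanza stanza; };

static const struct stanza_entry stanza_map[NUMBIFSTANZAS] = {
  { "always",  BSTZ_ALWAYS  },
  { "altivec", BSTZ_ALTIVEC },
  { "vsx",     BSTZ_VSX     },
  { "power9",  BSTZ_P9      }
};

/* Indexed by enum bif_stanza, so the order must match the enum.  */
static const char *const enable_string[NUMBIFSTANZAS] = {
  "ENB_ALWAYS", "ENB_ALTIVEC", "ENB_VSX", "ENB_P9"
};

/* Return the stanza for NAME, or NUMBIFSTANZAS if NAME is unknown.  */
enum bif_stanza
stanza_name_to_stanza (const char *name)
{
  assert (name);
  for (int i = 0; i < NUMBIFSTANZAS; i++)
    if (!strcmp (name, stanza_map[i].stanza_name))
      return stanza_map[i].stanza;
  return NUMBIFSTANZAS;
}

/* Return the ENB_* spelling for NAME, or NULL if NAME is unknown.
   This helper is hypothetical.  */
const char *
stanza_enable_string (const char *name)
{
  enum bif_stanza s = stanza_name_to_stanza (name);
  return s == NUMBIFSTANZAS ? NULL : enable_string[s];
}
```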

[PATCH 00/29] rs6000: Auto-generate builtins from descriptions [V2]

2020-07-27 Thread Bill Schmidt

From: Bill Schmidt 

This is a slight reworking of the patches posted on June 17.  I have
made a couple of improvements, but the general arrangement of the patches
is the same as before.  Two major things to call out:

 - I've introduced a uniform set of parsing error codes to make it easier
   to follow some of the logic when certain conditions occur during parsing.

 - I reorganized the treatment of built-in stanzas.  Before, the stanza
   conditions were checked prior to initializing entries in the built-in
   table.  Now, all built-ins are initialized, but we check the conditions
   at expand time to determine whether they should be enabled.  This
   addresses a frequent problem we have with the existing methods, where
   "#pragma target" doesn't work as expected when changing the target
   CPU for a single function.
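
To make the second point concrete, here is a minimal sketch of the "register everything, check at expand time" idea.  It is a toy model under stated assumptions: the feature bits, the enable classes, bif_table, bif_enabled_p, and the two example_* built-in names are all invented for illustration and do not correspond to the generated tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enablement classes and target feature bits.  */
enum enable { ENB_ALWAYS, ENB_ALTIVEC, ENB_VSX };
enum { FEAT_ALTIVEC = 1, FEAT_VSX = 2 };

struct bif_entry { const char *name; enum enable enb; };

/* Every built-in is registered up front, whatever the initial target.
   The example_* names are placeholders.  */
static const struct bif_entry bif_table[] = {
  { "example_always_bif",          ENB_ALWAYS  },
  { "__builtin_altivec_abs_v16qi", ENB_ALTIVEC },
  { "example_vsx_bif",             ENB_VSX     }
};

/* Decide at expand time, using the features in effect for the function
   being compiled.  Returns 1 if NAME may be expanded, 0 if it is
   currently disabled, -1 if it is unknown.  */
int
bif_enabled_p (const char *name, unsigned cur_features)
{
  assert (name);
  for (size_t i = 0; i < sizeof bif_table / sizeof bif_table[0]; i++)
    if (!strcmp (name, bif_table[i].name))
      switch (bif_table[i].enb)
        {
        case ENB_ALWAYS:
          return 1;
        case ENB_ALTIVEC:
          return (cur_features & FEAT_ALTIVEC) != 0;
        case ENB_VSX:
          return (cur_features & FEAT_VSX) != 0;
        }
  return -1;
}
```

Because the table does not depend on the features that were active when it was built, a per-function change of target only changes the cur_features argument, which is the property the reorganization is after.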

As described before, the current built-in support in the rs6000 back end
requires at least a master's degree in spelunking to comprehend.  It's
full of cruft, redundancy, and unused bits of code, and long overdue for a
replacement.  This is the first part of my project to do that.

My intent is to make adding new built-in functions as simple as adding a
few lines to a couple of files, and automatically generating as much of
the initialization, overload resolution, and expansion logic as possible.
This patch series establishes the format of the input files and creates
a new program (rs6000-gen-builtins) to:

 * Parse the input files into an internal representation;
 * Generate a file of #defines (rs6000-vecdefines.h) for eventual
   inclusion into altivec.h; and
 * Generate an initialization file to create and initialize tables of
   built-in functions and overloads.

Patches 1, 3-7, and 9-19 contain the logic for rs6000-gen-builtins.
Patch 8 provides balanced tree search support for parsing scalability.
Patches 2 and 21-27 provide a first cut at the input files.
Patch 20 incorporates the new code into the GCC build.
Patch 28 adds comments to some existing files that will help during the
transition from the previous built-in mechanism.
Patch 29 turns on the initialization logic, while leaving GCC's behavior
unchanged otherwise.

The patch series is constructed so that any prefix set of the patches
can be upstreamed without breaking anything.  There's still plenty of
work left, but I think it will be helpful to get this big chunk of
patches upstream to make further progress easier (translation: avoid
complex rebases like the one I just went through :-).

Following is some additional information about the present and future
design that may be of help.

The set of patches submitted upstream so far does the relatively
straightforward work of reading builtin descriptions from flat files and
generating initialization code for builtin and overload tables.  It also
generates an include file meant to be included in altivec.h, which produces
the #defines that map vec_* to __builtin_* functions for external consumption.

Data structures are automatically initialized in rs6000_builtins.c:
rs6000_autoinit_builtins.  Initialized data structures are:

 - rs6000_gen_builtins:  An enumeration of builtin identifiers, such as
RS6000_BIF_CPU_SUPPORTS.  These names are deliberately different from the
existing builtin identifiers so they can co-exist for a while.

 - rs6000_gen_overloads:  An enumeration of overload identifiers, such as
RS6000_OVLD_MAX.  Again, deliberately different from the old names.  Right
now I have the two enumerations using nonoverlapping numbers.  This is
because the two tables were part of one table in the old design, and I
haven't yet proven for sure that I can separate them without problems.
I think that I can, in which case I will have both enumerations start from
zero.

 - A number of filescope variables representing TREE_TYPEs of functions.
These are named _ftype__ ... _ and
initialized as tree lists from the prototypes in the input files.  The
naming scheme for types is described in the code. 

 - rs6000_builtin_info_x:  An array indexed by rs6000_gen_builtins containing
all the fun stuff for each builtin.  The "_x" is because we already have
rs6000_builtin_info as the old table, and they need to coexist for a while.

 - rs6000_overload_info:  An array indexed by rs6000_gen_overloads containing
all the fun stuff for each overload.

 - bif_hash:  A hash table mapping builtin function names to pointers to
their rs6000_builtin_info_x entries.

 - ovld_hash:  A hash table mapping overload names to pointers to their
rs6000_overload_info entries. 

The new initialization code is called from rs6000_init_builtins. Currently
this function continues to do its existing initialization work, but also
initializes the new tables (and then ignores them). 

The old initialization code contains a lot of ad hoc hackery to handle
different kinds of functions that require extra behavior. The new design
removes as much of that as possible, and instead uses syntax in the flat
file to generate flags and

Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Jakub Jelinek via Gcc-patches

On Mon, Jul 27, 2020 at 04:12:09PM +0200, Martin Liška wrote:
> On 7/27/20 3:16 PM, Jakub Jelinek wrote:
> > On Mon, Jul 27, 2020 at 02:32:15PM +0200, Martin Liška wrote:
> > > As mentioned in the PR, we should not create a string constant for a type
> > > that is different from char_type_node. Looking at expr.c, I was inspired
> > > and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that 
> > > underlying
> > > type is a character type.
> > 
> > That doesn't look correct, there is char, signed char, unsigned char,
> > or say std::byte, and all of them are perfectly fine.
> > So, rather than requiring it is char and nothing else, you should instead
> > check that it is an INTEGRAL_TYPE_P (maybe better other than BOOLEAN_TYPE?),
> > which is complete and has the same TYPE_PRECISION as char_type_node.
> 
> All right, the following survives tests and bootstraps.

LGTM.

Jakub

Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Liška


On 7/27/20 3:16 PM, Jakub Jelinek wrote:
> On Mon, Jul 27, 2020 at 02:32:15PM +0200, Martin Liška wrote:
> > As mentioned in the PR, we should not create a string constant for a type
> > that is different from char_type_node. Looking at expr.c, I was inspired
> > and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that
> > underlying type is a character type.
>
> That doesn't look correct, there is char, signed char, unsigned char,
> or say std::byte, and all of them are perfectly fine.
> So, rather than requiring it is char and nothing else, you should instead
> check that it is an INTEGRAL_TYPE_P (maybe better other than BOOLEAN_TYPE?),
> which is complete and has the same TYPE_PRECISION as char_type_node.

All right, the following survives tests and bootstraps.

Ready to be installed?
Thanks,
Martin

>
> 	Jakub
>

From d40967eb2ef27b512ae177b1aee5e85ac2246acd Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 27 Jul 2020 12:30:24 +0200
Subject: [PATCH] expr: build string_constant only for a char type

gcc/ChangeLog:

	PR tree-optimization/96058
	* expr.c (string_constant): Build string_constant only
	for a type that has same precision as char_type_node
	and is an integral type.
---
 gcc/expr.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 5db0a7a8565..a150fa0d3b5 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11828,17 +11828,22 @@ string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
 	chartype = TREE_TYPE (chartype);
   while (TREE_CODE (chartype) == ARRAY_TYPE)
 	chartype = TREE_TYPE (chartype);
-  /* Convert a char array to an empty STRING_CST having an array
-	 of the expected type and size.  */
-  if (!initsize)
-	  initsize = integer_zero_node;
 
-  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
-  init = build_string_literal (size, NULL, chartype, size);
-  init = TREE_OPERAND (init, 0);
-  init = TREE_OPERAND (init, 0);
+  if (INTEGRAL_TYPE_P (chartype)
+	  && TYPE_PRECISION (chartype) == TYPE_PRECISION (char_type_node))
+	{
+	  /* Convert a char array to an empty STRING_CST having an array
+	 of the expected type and size.  */
+	  if (!initsize)
+	initsize = integer_zero_node;
+
+	  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
+	  init = build_string_literal (size, NULL, chartype, size);
+	  init = TREE_OPERAND (init, 0);
+	  init = TREE_OPERAND (init, 0);
 
-  *ptr_offset = integer_zero_node;
+	  *ptr_offset = integer_zero_node;
+	}
 }
 
   if (decl)
-- 
2.27.0
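
The condition the patch settles on can be restated on a toy type model, which may help when comparing it with the earlier 'TYPE_MAIN_VARIANT (chartype) == char_type_node' test.  This is only a sketch: struct toy_type, enum toy_code and char_like_p are invented stand-ins, not GCC's tree API, and CHAR_BIT stands in for TYPE_PRECISION (char_type_node).

```c
#include <assert.h>
#include <limits.h>

/* Toy stand-ins for GCC's tree type codes; not the real definitions.  */
enum toy_code { TOY_INTEGER, TOY_BOOLEAN, TOY_ENUMERAL, TOY_REAL, TOY_POINTER };

struct toy_type { enum toy_code code; unsigned precision; };

/* Rough analogue of INTEGRAL_TYPE_P: integer, boolean and enumeral
   types qualify.  */
static int
toy_integral_p (const struct toy_type *t)
{
  return t->code == TOY_INTEGER || t->code == TOY_BOOLEAN
         || t->code == TOY_ENUMERAL;
}

/* The test adopted in the patch, restated on the toy model: integral
   and exactly as wide as char.  */
int
char_like_p (const struct toy_type *t)
{
  assert (t);
  return toy_integral_p (t) && t->precision == CHAR_BIT;
}
```

Under this model char, signed char, unsigned char, and an enumeration with a char-sized underlying type (the std::byte case from the review) all pass, while wider integers and non-integral types do not.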

[PATCH] [PATCH][GCC] arm: Enable no-writeback vldr.16/vstr.16.

2020-07-27 Thread Joe Ramsay

Hi,

There was previously no way to specify that a register operand cannot
have any writeback modifiers, and as a result the argument to vldr.16
and vstr.16 could be erroneously output with post-increment. This
change adds an operand specifier which forbids all writeback, and
selects it in the relevant case for vldr.16 and vstr.16

Bootstrapped on arm-linux, gcc and CMSIS-DSP testsuites are clean.
Is this patch OK for trunk? If yes, please commit on my behalf as I don't
have commit rights.

Thanks,
Joe

gcc/ChangeLog:

2020-05-20  Joe Ramsay  

* config/arm/arm-protos.h (arm_coproc_mem_operand_no_writeback): 
Declare prototype.
(arm_mve_mode_and_operands_type_check): Declare prototype.
* config/arm/arm.c (arm_coproc_mem_operand): Refactor to use 
_arm_coproc_mem_operand.
(arm_coproc_mem_operand_wb): New function to cover full, limited and no 
writeback.
(arm_coproc_mem_operand_no_writeback): New constraint for memory 
operand with no writeback.
(arm_print_operand): Implement 'j' specifier for memory operand that 
does not support
writeback.
(arm_mve_mode_and_operands_type_check): New constraint check for MVE 
memory operands.
* config/arm/constraints.md: Add Uj constraint for VFP vldr.16 and 
vstr.16.
* config/arm/vfp.md (*mov_load_vfp_hf16): New pattern for vldr.16.
(*mov_store_vfp_hf16): New pattern for vstr.16.
(*mov_vfp_16): Remove MVE moves.

gcc/testsuite/ChangeLog:

2020-05-20  Joe Ramsay  

* gcc.target/arm/mve/intrinsics/mve-vldstr16-no-writeback.c: New test.

---
 gcc/config/arm/arm-protos.h|   3 +
 gcc/config/arm/arm.c   | 100 ++---
 gcc/config/arm/constraints.md  |   7 ++
 gcc/config/arm/vfp.md  |  28 --
 .../arm/mve/intrinsics/mve-vldstr16-no-writeback.c |  17 
 5 files changed, 135 insertions(+), 20 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/arm/mve/intrinsics/mve-vldstr16-no-writeback.c

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 33d162c..e811da4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -115,8 +115,11 @@ extern enum reg_class coproc_secondary_reload_class 
(machine_mode, rtx,
 extern bool arm_tls_referenced_p (rtx);
 
 extern int arm_coproc_mem_operand (rtx, bool);
+extern int arm_coproc_mem_operand_no_writeback (rtx);
+extern int arm_coproc_mem_operand_wb (rtx, int);
 extern int neon_vector_mem_operand (rtx, int, bool);
 extern int mve_vector_mem_operand (machine_mode, rtx, bool);
+bool arm_mve_mode_and_operands_type_check (machine_mode, rtx, rtx);
 extern int neon_struct_mem_operand (rtx);
 
 extern rtx *neon_vcmla_lane_prepare_operands (rtx *);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6b7ca82..ed080d2 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -13217,13 +13217,14 @@ neon_element_bits (machine_mode mode)
 /* Predicates for `match_operand' and `match_operator'.  */
 
 /* Return TRUE if OP is a valid coprocessor memory address pattern.
-   WB is true if full writeback address modes are allowed and is false
+   WB level is 2 if full writeback address modes are allowed, 1
if limited writeback address modes (POST_INC and PRE_DEC) are
-   allowed.  */
+   allowed and 0 if no writeback at all is supported.  */
 
 int
-arm_coproc_mem_operand (rtx op, bool wb)
+arm_coproc_mem_operand_wb (rtx op, int wb_level)
 {
+  gcc_assert (wb_level == 0 || wb_level == 1 || wb_level == 2);
   rtx ind;
 
   /* Reject eliminable registers.  */
@@ -13256,16 +13257,18 @@ arm_coproc_mem_operand (rtx op, bool wb)
 
   /* Autoincremment addressing modes.  POST_INC and PRE_DEC are
  acceptable in any case (subject to verification by
- arm_address_register_rtx_p).  We need WB to be true to accept
+ arm_address_register_rtx_p).  We need full writeback to accept
+ PRE_INC and POST_DEC, and at least restricted writeback for
  PRE_INC and POST_DEC.  */
-  if (GET_CODE (ind) == POST_INC
-  || GET_CODE (ind) == PRE_DEC
-  || (wb
- && (GET_CODE (ind) == PRE_INC
- || GET_CODE (ind) == POST_DEC)))
+  if (wb_level > 0
+  && (GET_CODE (ind) == POST_INC
+ || GET_CODE (ind) == PRE_DEC
+ || (wb_level > 1
+ && (GET_CODE (ind) == PRE_INC
+ || GET_CODE (ind) == POST_DEC))))
 return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
-  if (wb
+  if (wb_level > 1
   && (GET_CODE (ind) == POST_MODIFY || GET_CODE (ind) == PRE_MODIFY)
   && arm_address_register_rtx_p (XEXP (ind, 0), 0)
   && GET_CODE (XEXP (ind, 1)) == PLUS
@@ -13287,6 +13290,25 @@ arm_coproc_mem_operand (rtx op, bool wb)
   return FALSE;
 }
 
+/* Return TRUE if OP is a valid coprocessor memory address pattern.
+   WB is true if full writeback address modes are allowed and is false
+   if limited writeback
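
The three writeback levels introduced by arm_coproc_mem_operand_wb can be summarized by a small decision function.  The sketch below uses a toy enum in place of the RTL codes and deliberately ignores the register and offset checks that the real predicate performs; writeback_ok and enum addr_code are hypothetical names.

```c
#include <assert.h>

/* Toy address kinds standing in for the RTL codes of the same names.  */
enum addr_code
{
  PLAIN_REG, POST_INC, PRE_DEC, PRE_INC, POST_DEC, POST_MODIFY, PRE_MODIFY
};

/* Return 1 if an address of kind CODE is acceptable at WB_LEVEL:
   0 = no writeback, 1 = limited writeback, 2 = full writeback.  */
int
writeback_ok (enum addr_code code, int wb_level)
{
  assert (wb_level == 0 || wb_level == 1 || wb_level == 2);
  switch (code)
    {
    case PLAIN_REG:
      return 1;
    case POST_INC:
    case PRE_DEC:
      /* Limited writeback is enough for these two.  */
      return wb_level > 0;
    case PRE_INC:
    case POST_DEC:
    case POST_MODIFY:
    case PRE_MODIFY:
      /* Everything else needs full writeback.  */
      return wb_level > 1;
    }
  return 0;
}
```

With level 0, which is what the new no-writeback constraint for vldr.16 and vstr.16 requests, only the plain register form is accepted.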

Re: [PATCH v4] vect/rs6000: Support vector with length cost modeling

2020-07-27 Thread Richard Sandiford

"Kewen.Lin"  writes:
> Hi Richard,
>
> Thanks for the review again!
>
> on 2020/7/25 上午12:21, Richard Sandiford wrote:
>> "Kewen.Lin"  writes:
>> 
>> Thanks, the rearrangement of the existing code looks good.  Could you
>> split that and the new LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo) stuff
>> out into separate patches?
>> 
>
> Split out into https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550691.html.
>
> Err... that subject should have had the prefix "[PATCH] vect:".
>
> [snip ...] 
> (Some comments in the snipped content will be done in v4)
>
>>> +here.  */
>>> +
>>> +  /* For now we only operate length-based partial vectors on Power,
>>> +which has constant VF all the time, we need some tweakings below
>>> +if it doesn't hold in future.  */
>>> +  gcc_assert (LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ());
>> 
>> Where do you rely on this?  There didn't seem to be any obvious
>> to_constant uses.  Since this is “only” a cost calculation, we should
>> be using assumed_vf.
>
> Sorry for the confusion.  This was intended for the poly things like
> VF or nitems_per_ctrl which isn't constant during compilation time,
> then get people's attention on the possible runtime cost on things like
> scaling up for nitems_step etc.  But I just realized that the computations
> like the multiply with another constant can operate on the coefficient,
> it looks there is no runtime cost then?  If so, I think I thought too
> much before.  ;-)
>
>>> -  prologue_cost_vec.release ();
>>> -  epilogue_cost_vec.release ();
>>> +  (void) add_stmt_cost (loop_vinfo, target_cost_data, prol_cnt, 
>>> scalar_stmt,
>>> +   NULL, NULL_TREE, 0, vect_prologue);
>>> +  (void) add_stmt_cost (loop_vinfo, target_cost_data, body_cnt, 
>>> scalar_stmt,
>>> +   NULL, NULL_TREE, 0, vect_body);
>> 
>> IMO this seems to be reproducing too much of the functions that you
>> referred to.  And the danger with that is that they could easily
>> get out of sync later.
>
> Good point!  The original intention was to model as much as we can,
> to avoid bad decisions caused by unmodeled pieces, such as the case where
> the loop body is small and some computations become non-negligible.
> The risk of getting out of sync seems to apply to other code as well.  How
> about adding some "note" comments in those functions?
>
> The updated v4 is attached by addressing your comments as well as Segher's
> comments.
>
> BR,
> Kewen
> -
>
> gcc/ChangeLog:
>
>   * config/rs6000/rs6000.c (rs6000_adjust_vect_cost_per_loop): New
>   function.
>   (rs6000_finish_cost): Call rs6000_adjust_vect_cost_per_loop.
>   * tree-vect-loop.c (vect_estimate_min_profitable_iters): Add cost
>   modeling for vector with length.
>   * tree-vect-loop-manip.c (vect_set_loop_controls_directly): Update
>   function comment.
>   * tree-vect-stmts.c (vect_gen_len): Likewise.
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 009afc5f894..86ef584e09b 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -5177,6 +5177,34 @@ rs6000_add_stmt_cost (class vec_info *vinfo, void 
> *data, int count,
>return retval;
>  }
>  
> +/* For some target specific vectorization cost which can't be handled per 
> stmt,
> +   we check the requisite conditions and adjust the vectorization cost
> +   accordingly if satisfied.  One typical example is to model shift cost for
> +   vector with length by counting number of required lengths under condition
> +   LOOP_VINFO_FULLY_WITH_LENGTH_P.  */
> +
> +static void
> +rs6000_adjust_vect_cost_per_loop (rs6000_cost_data *data)
> +{
> +  struct loop *loop = data->loop_info;
> +  gcc_assert (loop);
> +  loop_vec_info loop_vinfo = loop_vec_info_for_loop (loop);
> +
> +  if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
> +    {
> +      rgroup_controls *rgc;
> +      unsigned int num_vectors_m1;
> +      unsigned int shift_cnt = 0;
> +      FOR_EACH_VEC_ELT (LOOP_VINFO_LENS (loop_vinfo), num_vectors_m1, rgc)
> +        if (rgc->type)
> +          /* Each length needs one shift to fill into bits 0-7.  */
> +          shift_cnt += num_vectors_m1 + 1;
> +
> +      rs6000_add_stmt_cost (loop_vinfo, (void *) data, shift_cnt, scalar_stmt,
> +                            NULL, NULL_TREE, 0, vect_body);
> +    }
> +}
> +
>  /* Implement targetm.vectorize.finish_cost.  */
>  
>  static void
> @@ -5186,7 +5214,10 @@ rs6000_finish_cost (void *data, unsigned 
> *prologue_cost,
>rs6000_cost_data *cost_data = (rs6000_cost_data*) data;
>  
>if (cost_data->loop_info)
> -rs6000_density_test (cost_data);
> +{
> +  rs6000_adjust_vect_cost_per_loop (cost_data);
> +  rs6000_density_test (cost_data);
> +}
>  
>/* Don't vectorize minimum-vectorization-factor, simple copy loops
>   that require versioning for any reason.  The vectorization is at
> diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
>
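
For reference, the shift counting done by rs6000_adjust_vect_cost_per_loop in the quoted patch reduces to the loop below.  This is a standalone sketch on a toy array rather than on LOOP_VINFO_LENS; struct toy_rgroup and count_length_shifts are invented names, and has_type stands in for the rgc->type test.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for rgroup_controls: HAS_TYPE is nonzero when the
   rgroup at this index is actually used.  */
struct toy_rgroup { int has_type; };

/* The entry at index I describes the rgroup that needs I + 1 length
   vectors, so each used entry contributes I + 1 shifts.  */
unsigned
count_length_shifts (const struct toy_rgroup *lens, size_t n)
{
  unsigned shift_cnt = 0;
  assert (lens || n == 0);
  for (size_t i = 0; i < n; i++)
    if (lens[i].has_type)
      shift_cnt += (unsigned) i + 1;
  return shift_cnt;
}
```

For example, if only the entries at index 0 and index 3 are in use, the loop is charged 1 + 4 = 5 extra scalar statements in the vector body.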

Re: [PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Jakub Jelinek via Gcc-patches

On Mon, Jul 27, 2020 at 02:32:15PM +0200, Martin Liška wrote:
> As mentioned in the PR, we should not create a string constant for a type
> that is different from char_type_node. Looking at expr.c, I was inspired
> and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that 
> underlying
> type is a character type.

That doesn't look correct, there is char, signed char, unsigned char,
or say std::byte, and all of them are perfectly fine.
So, rather than requiring it is char and nothing else, you should instead
check that it is an INTEGRAL_TYPE_P (maybe better other than BOOLEAN_TYPE?),
which is complete and has the same TYPE_PRECISION as char_type_node.

Jakub

Re: Refactor peel_iters_{pro,epi}logue cost model handlings

2020-07-27 Thread Richard Sandiford

"Kewen.Lin"  writes:
> Hi,
>
> As Richard S. suggested in the thread:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550633.html
>
> this patch is separated from the one of that thread, mainly to refactor the
> existing peel_iters_{pro,epi}logue cost model handlings.
>
> I've addressed Richard S.'s review comments there, moreover because of one
> failure of aarch64 testing, I updated it a bit more to keep the logic 
> unchanged
> as before first (refactor_cost.diff).

Heh, nice when a clean-up exposes an existing bug. ;-)  I agree the
updates look correct.  E.g. if vf is 1, we should assume that there
are no peeled iterations even if the number of iterations is unknown.

Both patches are OK with some very minor nits (sorry):

> gcc/ChangeLog:
>
>   * tree-vect-loop.c (vect_get_known_peeling_cost): Factor out some code
>   to determine peel_iters_epilogue to function ...
>   (vect_get_peel_iters_epilogue): ... this.  New function.

to determine peel_iters_epilogue to...
(vect_get_peel_iters_epilogue): ...this new function.

> +  if (dump_enabled_p ())
> + dump_printf (MSG_NOTE, "cost model: "
> +"prologue peel iters set to vf/2.\n");

Agree this looks nice, but the old formatting was correct: the string
should be indented under “MSG_NOTE, ”.  Same for the epilogue.

> +  /* If peeled iterations are unknown, count a taken branch and a not taken
> +  branch per peeled loop. Even if scalar loop iterations are known,
> +  vector iterations are not known since peeled prologue iterations are
> +  not known. Hence guards remain the same.  */

Should be two spaces rather than one space before “Hence”.

Thanks for doing this.

Richard
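
As a rough illustration of the estimate discussed above, here is a simplified
stand-alone model.  It is not the code from the patch:
toy_peel_iters_epilogue and its parameters are hypothetical, and the vf/2
guess for an unknown iteration count is assumed from the dump message quoted
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the epilogue peel estimate; not GCC code.
   VF is the vectorization factor and must be at least 1.  */
unsigned
toy_peel_iters_epilogue (bool niters_known, unsigned niters,
                         unsigned peel_iters_prologue, unsigned vf)
{
  assert (vf >= 1);

  /* Unknown iteration count: guess half a vector's worth of iterations.
     For vf == 1 the integer division yields 0, i.e. no peeling.  */
  if (!niters_known)
    return vf / 2;

  /* Known count: whatever is left after the prologue, modulo VF.  */
  if (peel_iters_prologue > niters)
    peel_iters_prologue = niters;
  return (niters - peel_iters_prologue) % vf;
}
```

In this model a vf of 1 never produces peeled iterations, whether or not the
iteration count is known, which matches the remark above.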

RE: [PATCH PR96053] Add "#pragma GCC no_reduc_chain"

2020-07-27 Thread zhoukaipeng (A)

Sorry for the late reply!

-Original Message-
From: Richard Biener [mailto:rguent...@suse.de]
Sent: Wednesday, July 22, 2020 3:02 PM

> First of all I think giving users more control over vectorization is 
> good.  Now as for "#pragma GCC no_reduc_chain" I'd like to avoid 
> negatives and terms internal to GCC.  I also would like to see 
> vectorization pragmas to be grouped somehow, also to avoid bit 
> explosion in struct loop.  There's already annot_expr_no_vector_kind 
> and annot_expr_vector_kind both only used by the fortran FE at the 
> moment.  Note ANNOTATE_EXPR already allows an extra argument thus only 
> annot_expr_vector_kind should prevail with its argument specifying a 
> bitmask of vectorizer hints.  We'd have an extra enum for those like
> 
> enum annot_vector_subkind {
>   annot_vector_never = 0,
>   annot_vector_auto = 1, // this is the default
>   annot_vector_always = 3,
>   your new flag
> };
> 
> and the user would specify it via
> 
> #pragma GCC vect [(never|always|auto)] [your new flag]

I'll add the code to complete the above modifications.
 
> now, I honestly have a difficulty in suggesting a better name than 
> no_reduc_chain.  Quoting the testcase:
> 
> +double f(double *a, double *b)
> +{
> +  double res1 = 0;
> +  double res0 = 0;
> +#pragma GCC no_reduc_chain
> +  for (int i = 0 ; i < 1000; i+=4) {
> +res0 += a[i] * b[i];
> +res1 += a[i+1] * b[i*1];
> +res0 += a[i+2] * b[i+2];
> +res1 += a[i+3] * b[i+3];
> +  }
> +  return res0 + res1;
> +}
> 
> for your case with IIRC V2DF vectors using reduction chains will 
> result in a vectorization factor of two while with a SLP reduction the 
> vectorization factor is one.

I reconfirmed the situation.  Using reduction chains also results in a
vectorization factor of one, the same as using reductions.
The relevant logs follow:

1. Using reduction chains
pr96053.c:6:3: note:   Final SLP tree for instance:
pr96053.c:6:3: note:   node 0x1a4f4ba0 (max_nunits=2, refcnt=2)
pr96053.c:6:3: note:     stmt 0 res1_37 = _14 + res1_48;
pr96053.c:6:3: note:     stmt 1 res1_39 = _28 + res1_37;
...
pr96053.c:6:3: note:   === vect_make_slp_decision ===
pr96053.c:6:3: note:   Decided to SLP 2 instances. Unrolling factor 1
pr96053.c:6:3: note:   === vect_detect_hybrid_slp ===
pr96053.c:6:3: note:   === vect_update_vf_for_slp ===
pr96053.c:6:3: note:   Loop contains only SLP stmts
pr96053.c:6:3: note:   Updating vectorization factor to 1.
pr96053.c:6:3: note:  vectorization_factor = 1, niters = 250

2. Using reductions
pr96053.c:6:3: note:   Final SLP tree for instance:
pr96053.c:6:3: note:   node 0x3a7f2cb0 (max_nunits=2, refcnt=2)
pr96053.c:6:3: note:     stmt 0 res0_38 = _21 + res0_36;
pr96053.c:6:3: note:     stmt 1 res1_39 = _28 + res1_37;
...
pr96053.c:6:3: note:   === vect_make_slp_decision ===
pr96053.c:6:3: note:   Decided to SLP 1 instances. Unrolling factor 1
pr96053.c:6:3: note:   === vect_detect_hybrid_slp ===
pr96053.c:6:3: note:   === vect_update_vf_for_slp ===
pr96053.c:6:3: note:   Loop contains only SLP stmts
pr96053.c:6:3: note:   Updating vectorization factor to 1.
pr96053.c:6:3: note:  vectorization_factor = 1, niters = 250

> So maybe it is better to give the user control over the vectorization factor?
> That's desirable in other cases where the user wants to force a larger 
> VF to get extra unrolling for example.  For the testcase above you'd 
> use
> 
> #pragma GCC vect vf(1)
> 
> or so (syntax to be discussed).  The side-effect would be that with a 
> reduction chain the VF request cannot be fulfilled but with a SLP 
> reduction it can.  Of course no_reduc_chain is much easier to actually 
> implement in a strict way while specifying VF will likely need to be 
> documented as a hint (with an eventual diagnostic if it wasn't 
> fulfilled)
> 
> Richard/Jakub, any thoughts?
> 
> Thanks,
> Richard.

Thanks,
Kaipeng Zhou

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Richard Biener via Gcc-patches

On Mon, Jul 27, 2020 at 2:54 PM Martin Liška  wrote:
>
> On 7/27/20 2:51 PM, Richard Biener wrote:
> > On Mon, Jul 27, 2020 at 2:50 PM Andreas Schwab  
> > wrote:
> >>
> >> On Jul 27 2020, Martin Liška wrote:
> >>
> >>> @Andreas: Is it a known issue?
> >>
> >> Which issue?
> >
> > I guess Martin means the checking glibc does looks excessive
> > (for the specific case of realloc). But yes, it's enabled in the build roots
> > so we just get what we ask for.
>
> Yes. I'm basically curious how this is enabled/disabled for glibc.
> I speak e.g. about:
>
> #0  0x77e886f0 in mem2chunk_check () from /lib64/libc.so.6
> #1  0x77e8cb8a in realloc_check () from /lib64/libc.so.6
> #2  0x50bf in main (argc=, argv=) at bench.c:7

It's /etc/profile.d/malloc-debug.sh
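
For anyone who wants to check this from inside a program: the realloc_check
and mem2chunk_check frames are glibc's malloc checking hooks, which are
switched on through the MALLOC_CHECK_ environment variable.  That the profile
script sets this variable is an assumption here, since its contents are not
shown in this thread.  A small sketch (the toy_* names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* True when VALUE, the content of MALLOC_CHECK_, requests checking.
   An empty value or "0" is treated as "off" in this sketch.  */
bool
toy_check_value_enabled_p (const char *value)
{
  assert (value != NULL);
  return value[0] != '\0' && strcmp (value, "0") != 0;
}

/* True when the current environment asks glibc for malloc checking.  */
bool
toy_malloc_check_enabled_p (void)
{
  const char *value = getenv ("MALLOC_CHECK_");
  return value != NULL && toy_check_value_enabled_p (value);
}
```

Calling toy_malloc_check_enabled_p () at the start of a benchmark makes it
easy to spot that timings were taken with the checking hooks active.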

> Martin
>
> >
> > Richard.
> >
> >> Andreas.
> >>
> >> --
> >> Andreas Schwab, sch...@linux-m68k.org
> >> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> >> "And now for something completely different."
>

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Martin Liška


On 7/27/20 2:51 PM, Richard Biener wrote:
> On Mon, Jul 27, 2020 at 2:50 PM Andreas Schwab  wrote:
> >
> > On Jul 27 2020, Martin Liška wrote:
> >
> > > @Andreas: Is it a known issue?
> >
> > Which issue?
>
> I guess Martin means the checking glibc does looks excessive
> (for the specific case of realloc). But yes, it's enabled in the build roots
> so we just get what we ask for.

Yes. I'm basically curious how this is enabled/disabled for glibc.
I speak e.g. about:

#0  0x77e886f0 in mem2chunk_check () from /lib64/libc.so.6
#1  0x77e8cb8a in realloc_check () from /lib64/libc.so.6
#2  0x50bf in main (argc=, argv=) at bench.c:7

Martin

>
> Richard.
>
> > Andreas.
> >
> > --
> > Andreas Schwab, sch...@linux-m68k.org
> > GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> > "And now for something completely different."

Re: [PATCH] Remove dead vector comparisons

2020-07-27 Thread Richard Biener via Gcc-patches

On Mon, Jul 27, 2020 at 2:50 PM Martin Liška  wrote:
>
> On 7/27/20 1:22 PM, Richard Biener wrote:
> > I wonder what happens if we make vector lowering not allow the compare
> > expanded via expand_vec_cond_expr_p?
>
> So the following patch survives bootstrap and tests on x86_64-linux-gnu:
>
> diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
> index f8bd26f2156..fe6477c8592 100644
> --- a/gcc/tree-vect-generic.c
> +++ b/gcc/tree-vect-generic.c
> @@ -419,8 +419,7 @@ expand_vector_comparison (gimple_stmt_iterator *gsi, tree 
> type, tree op0,
>   return NULL_TREE;
>
> tree t;
> -  if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
> -  && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
> +  if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code))
>   {
> if (VECTOR_BOOLEAN_TYPE_P (type)
>&& SCALAR_INT_MODE_P (TYPE_MODE (type))
>
> and provides a reasonable BIT_FIELD_REF expansion on the problematic s390x 
> test-case.
> Is it a way to go?

If it avoids doing this expansion when the compare feeds supported
VEC_COND_EXPRs, then I think yes.

Richard.

>
> Martin

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Richard Biener via Gcc-patches

On Mon, Jul 27, 2020 at 2:50 PM Andreas Schwab  wrote:
>
> On Jul 27 2020, Martin Liška wrote:
>
> > @Andreas: Is it a known issue?
>
> Which issue?

I guess Martin means the checking glibc does looks excessive
(for the specific case of realloc). But yes, it's enabled in the build roots
so we just get what we ask for.

Richard.

> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."

Re: [PATCH] Remove dead vector comparisons

2020-07-27 Thread Martin Liška


On 7/27/20 1:22 PM, Richard Biener wrote:
> I wonder what happens if we make vector lowering not allow the compare
> expanded via expand_vec_cond_expr_p?

So the following patch survives bootstrap and tests on x86_64-linux-gnu:

diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index f8bd26f2156..fe6477c8592 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -419,8 +419,7 @@ expand_vector_comparison (gimple_stmt_iterator *gsi, tree 
type, tree op0,
 return NULL_TREE;
 
   tree t;

-  if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
-  && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
+  if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code))
 {
   if (VECTOR_BOOLEAN_TYPE_P (type)
  && SCALAR_INT_MODE_P (TYPE_MODE (type))

and provides a reasonable BIT_FIELD_REF expansion on the problematic s390x 
test-case.
Is it a way to go?

Martin

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Andreas Schwab

On Jul 27 2020, Martin Liška wrote:

> @Andreas: Is it a known issue?

Which issue?

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: [PATCH] ipa/96291: don't crash on unoptimized lto functions

2020-07-27 Thread Martin Jambor

Hi,

On Mon, Jul 27 2020, Richard Biener via Gcc-patches wrote:
> On Sat, Jul 25, 2020 at 8:35 PM Sergei Trofimovich via Gcc-patches
>  wrote:
>>
>> From: Sergei Trofimovich 
>>
>> In PR ipa/96291 the test contained an SCC with one
>> unoptimized function. This tricked ipa-cp into NULL dereference.
>>
>> has_undead_caller_from_outside_scc_p() did not take into account
> >> that unoptimized functions don't have IPA summary analysis and
> >> dereferenced a NULL pointer, causing an ICE.
>
> Can you create a single-unit testcase with a SCC with one function
> having the no_ipa attribute?

This bug is LTO specific because otherwise a summary (although marked as
quite useless) will be left over from the summary building stage.

So Sergei, if you can afford to spend some extra time providing a
testcase, you'll need to add three files into gcc/testsuite/gcc.dg/lto,
with either the second or third (numbered _1 or _2) having

/* { dg-lto-options { { -flto -O0 } } } */

in them.

Thanks,

Martin



>
>> PR ipa/96291
>> * ipa-cp.c (has_undead_caller_from_outside_scc_p): Consider
>> unoptimized callers as undead.
>> ---
>>  gcc/ipa-cp.c | 12 +---
>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
>> index b0c8f405260..d5082576962 100644
>> --- a/gcc/ipa-cp.c
>> +++ b/gcc/ipa-cp.c
>> @@ -5666,9 +5666,15 @@ has_undead_caller_from_outside_scc_p (struct 
>> cgraph_node *node,
>> && cs->caller->call_for_symbol_thunks_and_aliases
>>   (has_undead_caller_from_outside_scc_p, NULL, true))
>>return true;
>> -else if (!ipa_edge_within_scc (cs)
>> -&& !IPA_NODE_REF (cs->caller)->node_dead)
>> -  return true;
>> +else if (!ipa_edge_within_scc (cs))
>> +  {
>> +   /* Unoptimized callers don't have IPA information.
>> +  Conservatively assume callers are undead.  */
>> +   if (!IPA_NODE_REF (cs->caller))
>> + return true;
>> +   if (!IPA_NODE_REF (cs->caller)->node_dead)
>> + return true;
>> +  }
>>return false;
>>  }
>>
>> --
>> 2.27.0
>>

Re: [PATCH] ipa/96291: don't crash on unoptimized lto functions

2020-07-27 Thread Martin Jambor

Hi,

On Sat, Jul 25 2020, Sergei Trofimovich wrote:
> From: Sergei Trofimovich 
>
> In PR ipa/96291 the test contained an SCC with one
> unoptimized function. This tricked ipa-cp into NULL dereference.
>
> has_undead_caller_from_outside_scc_p() did not take into account
> that unoptimized functions don't have IPA summary analysis and
> dereferenced a NULL pointer, causing an ICE.
>
>   PR ipa/96291
>   * ipa-cp.c (has_undead_caller_from_outside_scc_p): Consider
>   unoptimized callers as undead.
> ---
>  gcc/ipa-cp.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index b0c8f405260..d5082576962 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -5666,9 +5666,15 @@ has_undead_caller_from_outside_scc_p (struct 
> cgraph_node *node,
>   && cs->caller->call_for_symbol_thunks_and_aliases
> (has_undead_caller_from_outside_scc_p, NULL, true))
>return true;
> -else if (!ipa_edge_within_scc (cs)
> -  && !IPA_NODE_REF (cs->caller)->node_dead)
> -  return true;
> +else if (!ipa_edge_within_scc (cs))
> +  {
> + /* Unoptimized callers don't have IPA information.
> +Conservatively assume callers are undead.  */
> + if (!IPA_NODE_REF (cs->caller))
> +   return true;
> + if (!IPA_NODE_REF (cs->caller)->node_dead)
> +   return true;

I'd prefer a single condition, i.e.:

else if (!ipa_edge_within_scc (cs)
 && (!IPA_NODE_REF (cs->caller)
 || !IPA_NODE_REF (cs->caller)->node_dead))
  return true;


so OK with that change.

Thanks a lot for looking into this.

Martin
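
The combined condition relies on short-circuit evaluation: the summary is
only dereferenced after it has been checked for NULL.  A minimal stand-alone
sketch of that shape, with hypothetical toy_* types standing in for the call
graph edge and the IPA node summary (this is not the ipa-cp.c code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-node IPA summary.  */
struct toy_summary
{
  bool node_dead;
};

/* Hypothetical stand-in for a call graph edge.  CALLER_SUMMARY is NULL
   when the caller was not optimized and therefore has no summary.  */
struct toy_edge
{
  bool within_scc;
  const struct toy_summary *caller_summary;
};

/* True if the caller is outside the SCC and must be treated as undead.
   A missing summary is conservatively treated as "undead"; the || only
   evaluates its right-hand side when the pointer is non-NULL.  */
bool
toy_undead_caller_outside_scc_p (const struct toy_edge *e)
{
  assert (e != NULL);
  return (!e->within_scc
          && (e->caller_summary == NULL
              || !e->caller_summary->node_dead));
}
```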

[PATCH] expr: build string_constant only for a char type

2020-07-27 Thread Martin Liška


Hey.

As mentioned in the PR, we should not create a string constant for a type
that is different from char_type_node. Looking at expr.c, I was inspired
and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that 
underlying
type is a character type.

The patch bootstraps on x86_64-linux-gnu and survives regression tests.  With
the patch applied to the gcc-10 branch, it also fixes the chromium build.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR tree-optimization/96058
* expr.c (string_constant): Build string_constant only
for a type that is main variant of char_type_node.
---
 gcc/expr.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 5db0a7a8565..c3fdd82b319 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11828,17 +11828,21 @@ string_constant (tree arg, tree *ptr_offset, tree 
*mem_size, tree *decl)
chartype = TREE_TYPE (chartype);
   while (TREE_CODE (chartype) == ARRAY_TYPE)
chartype = TREE_TYPE (chartype);
-  /* Convert a char array to an empty STRING_CST having an array
-of the expected type and size.  */
-  if (!initsize)
- initsize = integer_zero_node;
 
-  unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);

-  init = build_string_literal (size, NULL, chartype, size);
-  init = TREE_OPERAND (init, 0);
-  init = TREE_OPERAND (init, 0);
+  if (TYPE_MAIN_VARIANT (chartype) == char_type_node)
+   {
+ /* Convert a char array to an empty STRING_CST having an array
+of the expected type and size.  */
+ if (!initsize)
+   initsize = integer_zero_node;
+
+ unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
+ init = build_string_literal (size, NULL, chartype, size);
+ init = TREE_OPERAND (init, 0);
+ init = TREE_OPERAND (init, 0);
 
-  *ptr_offset = integer_zero_node;

+ *ptr_offset = integer_zero_node;
+   }
 }
 
   if (decl)

--
2.27.0

[PATCH v4] driver: fix a problem with implementation of -falign-foo=0 [PR96247]

2020-07-27 Thread Hu Jiangping

Hi!

This patch makes -falign-foo=0 work as described in the
documentation. Thanks for all the suggestions.

v4: do changes for coding conventions
v3: make change more readable and self-consistent

Changelog:
2020-07-27  Hu Jiangping  

PR driver/96247
* opts.c (check_alignment_argument): Set the -falign-Name
on/off flag on and set the -falign-Name string value null,
when the command-line specified argument is zero.

Tested on x86_64.

Regards!
Hujp

---
 gcc/opts.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/gcc/opts.c b/gcc/opts.c
index 499eb900643..574b28416fb 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2004,13 +2004,21 @@ parse_and_check_align_values (const char *flag,
 }
 
 /* Check that alignment value FLAG for -falign-NAME is valid at a given
-   location LOC.  */
+   location LOC. OPT_STR points to the stored -falign-NAME=argument and
+   OPT_FLAG points to the associated -falign-NAME on/off flag.  */
 
 static void
-check_alignment_argument (location_t loc, const char *flag, const char *name)
+check_alignment_argument (location_t loc, const char *flag, const char *name,
+int *opt_flag, const char **opt_str)
 {
   auto_vec align_result;
   parse_and_check_align_values (flag, name, align_result, true, loc);
+
+  if (align_result.length() >= 1 && align_result[0] == 0)
+{
+  *opt_flag = 1;
+  *opt_str = NULL;
+}
 }
 
 /* Print help when OPT__help_ is set.  */
@@ -2785,19 +2793,27 @@ common_handle_option (struct gcc_options *opts,
   break;
 
 case OPT_falign_loops_:
-  check_alignment_argument (loc, arg, "loops");
+  check_alignment_argument (loc, arg, "loops",
+                            &opts->x_flag_align_loops,
+                            &opts->x_str_align_loops);
   break;
 
 case OPT_falign_jumps_:
-  check_alignment_argument (loc, arg, "jumps");
+  check_alignment_argument (loc, arg, "jumps",
+                            &opts->x_flag_align_jumps,
+                            &opts->x_str_align_jumps);
   break;
 
 case OPT_falign_labels_:
-  check_alignment_argument (loc, arg, "labels");
+  check_alignment_argument (loc, arg, "labels",
+                            &opts->x_flag_align_labels,
+                            &opts->x_str_align_labels);
   break;
 
 case OPT_falign_functions_:
-  check_alignment_argument (loc, arg, "functions");
+  check_alignment_argument (loc, arg, "functions",
+                            &opts->x_flag_align_functions,
+                            &opts->x_str_align_functions);
   break;
 
 case OPT_ftabstop_:
-- 
2.17.1
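
To show the intended effect of the change in isolation, here is a small
stand-alone model.  It is not the opts.c code: toy_align_opts and
toy_handle_align are hypothetical, and the model assumes the option machinery
has already turned the flag on and stored the argument before the check runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two per-option fields the patch touches.  */
struct toy_align_opts
{
  int flag;          /* models the -falign-NAME on/off flag */
  const char *str;   /* models the stored -falign-NAME=argument */
};

/* Record ARG for an -falign-NAME=ARG option.  When the first number in
   ARG is zero, keep the option enabled but drop the stored string, so
   that the default alignment is used.  */
void
toy_handle_align (struct toy_align_opts *opts, const char *arg)
{
  assert (opts != NULL && arg != NULL);
  char *end;
  unsigned long first = strtoul (arg, &end, 10);

  opts->flag = 1;
  opts->str = arg;
  if (end != arg && first == 0)
    opts->str = NULL;
}
```

Validation of malformed arguments is left out on purpose; in the patch that
is the job of parse_and_check_align_values.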

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Richard Biener via Gcc-patches

On Mon, Jul 27, 2020 at 1:24 PM Martin Liška  wrote:
>
> On 7/27/20 1:11 PM, Jan Hubicka wrote:
> >> On 7/27/20 9:11 AM, Richard Biener wrote:
> >>> OK.  I guess the previous code tried to use less memory.
> >>
> >> It did. But I didn't realize that such exact growth would lead
> >> to a massive reallocation for huge apps like chromium.
> >
> > I would consider it an API issue - it is not really at all that obvious
> > when vec API does auto reserve and when it does not.
>
> Fully agree here, it's super-confusing.
>
> >
> > Grepping for vec_safe_grow, rtl_create_basic_block, gimple_set_bb,
> > extend_h_i_d, stack_regs_mentioned, init_deps_data_vector
> > extend_insn_data, create_bb, move_block_to_fn logic has similar logic
> > but implemented by hand.  Perhaps we can switch it to the new API.
> >
> > combine_split_insns, combine_instructions, update_row_reg_save,
> > grow_label_align, update_uses, final_warning_record::grow_type_warnings,
> > sem_function::bb_dict_test, ::add_single_to_queue,
> > symtab_node::create_reference, mark_phi_for_rewrite, addr_for_mem_ref,
> > multiplier_allowed_in_address_p, get_address_cost_ainc,
> > make_ssa_name_fn, add_to_value, phi_translate_1,
> > optimize_range_tests_cmp_bitwise, set_strinfo,
> > ssa_name_values.safe_grow_cleared, vect_record_loop_mask has similarly
> > suspicious logic in it.
>
> You are talking about changing all '*grow*' calls to use exact=false, right?
> I can experiment with that.

No, add gro*_exact variants and replace the existing calls with them, then
switch to exact = false for the non-_exact variants.  Or add an exact=false
argument to all of them and make all existing calls explicitly pass true.

Only later, on a case by case basis, swap one for the other when obvious.

Richard.

> Martin
>
> >
> > Honza
> >>
> >> I'm going to backport the patch to older releases as well.
> >>
> >> Martin
>
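
The difference between the two growth modes can be seen in a small
stand-alone model.  This is not GCC's vec template: toy_vec and its functions
are hypothetical, and plain doubling is used as the non-exact policy only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy growable array; not GCC's vec.  */
struct toy_vec
{
  int *data;
  size_t len;        /* elements in use */
  size_t cap;        /* elements allocated */
  size_t reallocs;   /* number of realloc calls so far */
};

/* Grow V to LEN zero-filled elements.  With EXACT, allocate exactly LEN
   slots; otherwise at least double the capacity.  Returns false if the
   allocation fails.  */
bool
toy_vec_grow_cleared (struct toy_vec *v, size_t len, bool exact)
{
  assert (v != NULL && len >= v->len);
  if (len > v->cap)
    {
      size_t new_cap = len;
      if (!exact)
        {
          size_t geometric = v->cap ? v->cap * 2 : 4;
          if (geometric > new_cap)
            new_cap = geometric;
        }
      int *p = realloc (v->data, new_cap * sizeof *p);
      if (p == NULL)
        return false;
      v->data = p;
      v->cap = new_cap;
      v->reallocs++;
    }
  if (len > v->len)
    memset (v->data + v->len, 0, (len - v->len) * sizeof *v->data);
  v->len = len;
  return true;
}

/* Grow one element at a time up to N and report how often realloc ran.  */
size_t
toy_count_reallocs (size_t n, bool exact)
{
  struct toy_vec v = { NULL, 0, 0, 0 };
  for (size_t i = 1; i <= n; i++)
    if (!toy_vec_grow_cleared (&v, i, exact))
      break;
  free (v.data);
  return v.reallocs;
}
```

Growing to 1000 elements one at a time costs one realloc per step in exact
mode but only a handful with doubling, which is the behaviour the proposed
_exact split would make visible at each call site.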

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Martin Liška


On 7/27/20 1:11 PM, Jan Hubicka wrote:
> > On 7/27/20 9:11 AM, Richard Biener wrote:
> > > OK.  I guess the previous code tried to use less memory.
> >
> > It did. But I didn't realize that such exact growth would lead
> > to a massive reallocation for huge apps like chromium.
>
> I would consider it an API issue - it is not really at all that obvious
> when vec API does auto reserve and when it does not.

Fully agree here, it's super-confusing.

>
> Grepping for vec_safe_grow, rtl_create_basic_block, gimple_set_bb,
> extend_h_i_d, stack_regs_mentioned, init_deps_data_vector
> extend_insn_data, create_bb, move_block_to_fn logic has similar logic
> but implemented by hand.  Perhaps we can switch it to the new API.
>
> combine_split_insns, combine_instructions, update_row_reg_save,
> grow_label_align, update_uses, final_warning_record::grow_type_warnings,
> sem_function::bb_dict_test, ::add_single_to_queue,
> symtab_node::create_reference, mark_phi_for_rewrite, addr_for_mem_ref,
> multiplier_allowed_in_address_p, get_address_cost_ainc,
> make_ssa_name_fn, add_to_value, phi_translate_1,
> optimize_range_tests_cmp_bitwise, set_strinfo,
> ssa_name_values.safe_grow_cleared, vect_record_loop_mask has similarly
> suspicious logic in it.

You are talking about changing all '*grow*' calls to use exact=false, right?
I can experiment with that.

Martin

>
> Honza
> >
> > I'm going to backport the patch to older releases as well.
> >
> > Martin

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Martin Liška


On 7/27/20 1:03 PM, Jan Hubicka wrote:
> Did you do chroot to the chromium build?

Oh, you are right!

It really takes more than 60 seconds with:

35.43%  a.out  libc-2.31.so  [.] mem2chunk_check
26.54%  a.out  libc-2.31.so  [.] mem2mem_check
21.50%  a.out  libc-2.31.so  [.] realloc_check
 8.38%  a.out  libc-2.31.so  [.] mremap_chunk
 5.03%  a.out  libc-2.31.so  [.] realloc
 1.87%  a.out  a.out         [.] main

So there's really a hardening enabled in openSUSE:Factory chroot.
@Andreas: Is it a known issue?

> > Note that there may be an interleaving load on the machine.
> > Perf says:
> >
> > 55.40%  a.out  libc-2.26.so  [.] realloc
> > 36.01%  a.out  a.out         [.] realloc@plt
> >  4.98%  a.out  libc-2.26.so  [.] mremap_chunk
> >  3.60%  a.out  a.out         [.] main
>
> How one can do perfing on kunlun?

You can install perf with:
osc build -x perf -x kernel-default

and then you can run it in a chroot env.

Martin

Re: [PATCH] Remove dead vector comparisons

2020-07-27 Thread Richard Biener via Gcc-patches

On Mon, Jul 27, 2020 at 11:32 AM Martin Liška  wrote:
>
> On 7/10/20 10:24 AM, Richard Biener wrote:
> > On Fri, Jul 10, 2020 at 9:50 AM Martin Liška  wrote:
> >>
> >> As mentioned in the PR, we need to clean up orphan vector comparisons
> >> that tend to happen by gimplification of VEC_COND_EXPR.
> >>
> >> I've done that easily in expand_vector_comparison where I add these
> >> to a bitmap used in simple DCE.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >
> > I don't like this much - the reason for the dead code is that the 
> > gimplifier,
> > while it manages to DCE the VEC_COND_EXPR because the value is not
> > needed, does not apply the same for the operands where only side-effects
> > would need to be kept.
> >
> > But then if the vector comparisons would not be dead, the testcase
> > would still ICE even with your patch.
>
> Hello.
>
> The question here is whether one can write such a test-case.  I would think
> the target would lower a vector comparison directly to non-vector code?

Possibly you can't (looks like you need boolean vectors for this).  The
GIMPLE FE also does not support this yet (because it relies on the C
frontend's type capabilities).

> >  And the reason for the ICE is
> > that vector lowering checks
> >
> >if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
> >&& !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
> >  {
> >
> > while RTL expansion has
> >
> >/* For vector typed comparisons emit code to generate the desired
> >   all-ones or all-zeros mask.  */
> >if (TREE_CODE (ops->type) == VECTOR_TYPE)
> >  {
> >tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
> >if (VECTOR_BOOLEAN_TYPE_P (ops->type)
> >&& expand_vec_cmp_expr_p (TREE_TYPE (arg0), ops->type, 
> > ops->code))
> >  return expand_vec_cmp_expr (ops->type, ifexp, target);
> >else
> >  gcc_unreachable ();
> >
> > so vector lowering doesn't lower vector comparisons when we can expand
> > via a VEC_COND_EXPR but the RTL expansion code asserts this will not
> > be needed.
>
> Ah, ok, now I understand that.
>
> >
> > Thus this should either be fixed by re-instantiating the RTL expansion code
>
> Could you please explain what re-instantiating the RTL means?

Put the code back that predates the gcc_unreachable ().

> > or handling vector comparisons in gimple-isel.cc at least when they need
> > to be expanded as VEC_COND_EXPR.
>
> That's doable but by something like:
>
>_1 = v_5(D) > { 0, 0, 0, 0, 0, 0, 0, 0 };
>_10 = VEC_COND_EXPR <_1, { 0, 0, 0, 0, 0, 0, 0, 0 }, { 1, 1, 1, 1, 1, 1, 
> 1, 1 }>;
>
> which will be immediately expanded in ISEL to:
>
>_10 = .VCOND (v_5(D), { 0, 0, 0, 0, 0, 0, 0, 0 }, { 0, 0, 0, 0, 0, 0, 0, 0 
> }, { 1, 1, 1, 1, 1, 1, 1, 1 }, 109);
>
> But I would need to redirect all uses of _1 to _10, right? Do we prefer to do 
> this?

Not sure - it might be that _1 is always unused, at least in the form it
appears in the testcase (with the boolean vector).  Note that the boolean
vector case is not representable by the .VCOND because of the different
result type.  So the separate comparison looks like an artifact of the
split from the VEC_COND_EXPR after all ...

The GIMPLE verifier has

static bool
verify_gimple_comparison (tree type, tree op0, tree op1, enum tree_code code)
{
...
  /* Or a boolean vector type with the same element count
 as the comparison operand types.  */
  else if (TREE_CODE (type) == VECTOR_TYPE
   && TREE_CODE (TREE_TYPE (type)) == BOOLEAN_TYPE)
{

so it requires a BOOLEAN_TYPE vector.

I wonder what happens if we make vector lowering not allow the compare
expanded via expand_vec_cond_expr_p?

Richard.
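
For readers less familiar with what such a vector comparison produces at the
source level, here is a small sketch using the GNU C vector extension (not
ISO C; the toy_* names are hypothetical).  With integer vectors the result of
a comparison is a mask vector whose lanes are all ones (-1) where the
comparison holds and all zeros elsewhere, which is the mask the RTL expansion
comment quoted above refers to.

```c
#include <assert.h>

/* Four 32-bit ints in one 16-byte vector (GNU C extension).  */
typedef int toy_v4si __attribute__ ((vector_size (16)));

/* Lane-wise A > B: each result lane is -1 where true and 0 where false.  */
toy_v4si
toy_greater_mask (toy_v4si a, toy_v4si b)
{
  return a > b;
}

/* Count the lanes in which A is greater than B.  */
int
toy_count_greater (toy_v4si a, toy_v4si b)
{
  /* The loop below assumes exactly four int lanes.  */
  assert (sizeof (toy_v4si) == 4 * sizeof (int));
  toy_v4si mask = toy_greater_mask (a, b);
  int count = 0;
  for (int i = 0; i < 4; i++)
    count += (mask[i] != 0);
  return count;
}
```

This only shows the integer-mask form; the boolean vector case discussed in
this thread is internal to GIMPLE and has no direct spelling in this
extension.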

> Thanks,
> Martin
>
> >
> > Richard.
> >
> >> Thanks,
> >> Martin
> >>
> >> gcc/ChangeLog:
> >>
> >>  PR tree-optimization/96128
> >>  * tree-vect-generic.c (expand_vector_comparison): Remove vector
> >>  comparisons that don't have a usage.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>  PR tree-optimization/96128
> >>  * gcc.target/s390/vector/pr96128.c: New test.
> >> ---
> >>.../gcc.target/s390/vector/pr96128.c  | 35 +++
> >>gcc/tree-vect-generic.c   |  4 ++-
> >>2 files changed, 38 insertions(+), 1 deletion(-)
> >>create mode 100644 gcc/testsuite/gcc.target/s390/vector/pr96128.c
> >>
> >> diff --git a/gcc/testsuite/gcc.target/s390/vector/pr96128.c 
> >> b/gcc/testsuite/gcc.target/s390/vector/pr96128.c
> >> new file mode 100644
> >> index 000..20abe5e515c
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.target/s390/vector/pr96128.c
> >> @@ -0,0 +1,35 @@
> >> +/* PR tree-optimization/96128 */
> >> +/* { dg-options "-march=z13" } */
> >> +
> >> +#define B_TEST(TYPE) { TYPE v __attribute__((vector_size(16))); (void)((v 
> >> < v) < v); }
> >> +#ifdef __cplusplus
> >> +#define T_TEST(TYPE) { TYPE s; TYPE

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Jan Hubicka

> On 7/27/20 9:11 AM, Richard Biener wrote:
> > OK.  I guess the previous code tried to use less memory.
> 
> It did. But I didn't realize that such exact growth would lead
> to a massive reallocation for huge apps like chromium.

I would consider it an API issue - it is not really at all that obvious
when vec API does auto reserve and when it does not. 

Grepping for vec_safe_grow, rtl_create_basic_block, gimple_set_bb,
extend_h_i_d, stack_regs_mentioned, init_deps_data_vector
extend_insn_data, create_bb, move_block_to_fn logic has similar logic
but implemented by hand.  Perhaps we can switch it to the new API.  

combine_split_insns, combine_instructions, update_row_reg_save,
grow_label_align, update_uses, final_warning_record::grow_type_warnings,
sem_function::bb_dict_test, ::add_single_to_queue,
symtab_node::create_reference, mark_phi_for_rewrite, addr_for_mem_ref,
multiplier_allowed_in_address_p, get_address_cost_ainc,
make_ssa_name_fn, add_to_value, phi_translate_1,
optimize_range_tests_cmp_bitwise, set_strinfo,
ssa_name_values.safe_grow_cleared, vect_record_loop_mask has similarly
suspicious logic in it.  

Honza
> 
> I'm going to backport the patch to older releases as well.
> 
> Martin

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Richard Biener via Gcc-patches

On Mon, Jul 27, 2020 at 12:48 PM Jan Hubicka  wrote:
>
> > Yes, I verified that.
> >
> > > I guess I can try
> > > to do some profiling since this problem did not show on Firefox (that I
> > > find odd given that Firefox is just about half of the size).
> >
> > Yep, I'm also surprised about it.
> >
> > > Perhaps glibc has some stupid limit in realloc that makes it behave
> > > in a silly way for very large arrays?
> >
> > Dunno :P
> Seems like glibc issue. On my debian testing box:

I'm sure in your actual testcase you run into fragmentation and
eventually fall out of cache which should make things worse by
an order of magnitude.

> hubicka@lomikamen-jh:~$ cat t.c
> #include <stdlib.h>
> main(int argc, char **argv)
> {
>   char *a = malloc (1);
>   int i,n=atoi(argv[1]);
>   for (i=2;i<n;i++)
>     a = realloc (a,i);
> }
>
> hubicka@lomikamen-jh:~$ time ./a.out 10
>
> real    0m10.057s
> user    0m9.696s
> sys     0m0.356s
>
> And kunlun (which is a lot faster than my 2013 Bulldozer):
>
>
> abuild@kunlun:~> time ./a.out 10
>
> real    0m59.808s
> user    0m58.703s
> sys     0m1.080s
>
> GDB stops at:
> (gdb) bt
> #0  0x77e70bfe in realloc () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x50bf in main ()
> on debian while:
> (gdb) bt
> #0  0x77e7d6d0 in mem2mem_check () from /lib64/libc.so.6
> #1  0x77e81c7d in realloc_check () from /lib64/libc.so.6
> #2  0x50bf in main ()
> on kunlun.
>
> Perhaps someone enabled some cool security hardening feature without
> much benchmarking :) (but even the debian numbers seem like they can be
> improved)
>
> Honza
> >
> > Martin
> >
> > >
> > > Honza
> > > >
> > > > Martin
> >

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Jan Hubicka

> It runs for me in:
> 
> $ time ./a.out 10
> 
> real  0m10.048s
> user  0m9.742s
> sys   0m0.305s

Did you do chroot to the chromium build?
> 
> Note that there may be an interleaving load on the machine.
> Perf says:
> 
> 55.40%  a.out  libc-2.26.so  [.] realloc
> 36.01%  a.out  a.out         [.] realloc@plt
>  4.98%  a.out  libc-2.26.so  [.] mremap_chunk
>  3.60%  a.out  a.out         [.] main

How can one run perf on kunlun?
> 
> while on my machine I see:
> 
> real  0m4.998s
> user  0m4.947s
> sys   0m0.050s
> 
> 54.49%  a.out  libc-2.31.so  [.] realloc
> 37.63%  a.out  libc-2.31.so  [.] mremap_chunk
>  3.72%  a.out  a.out         [.] realloc@plt

Honza
> 
> Martin
> 
> > 
> > GDB stops at:
> > (gdb) bt
> > #0  0x77e70bfe in realloc () from /lib/x86_64-linux-gnu/libc.so.6
> > #1  0x50bf in main ()
> > on debian while:
> > (gdb) bt
> > #0  0x77e7d6d0 in mem2mem_check () from /lib64/libc.so.6
> > #1  0x77e81c7d in realloc_check () from /lib64/libc.so.6
> > #2  0x50bf in main ()
> > on kunlun.
> > 
> > Perhaps someone enabled some cool security hardening feature without
> > much benchmarking :) (but even the debian numbers seem like they can be
> > improved)
> > 
> > Honza
> > > 
> > > Martin
> > > 
> > > > 
> > > > Honza
> > > > > 
> > > > > Martin
> > > 
>

Re: [PATCH v3] driver: fix a problem with implementation of -falign-foo=0 [PR96247]

2020-07-27 Thread Martin Liška


On 7/27/20 12:25 PM, Richard Sandiford wrote:
> So I don't think there's a different value that parse_and_check_align_values
> could sensibly insert instead of zero.

All right, works for me.

Martin

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Martin Liška


On 7/27/20 12:48 PM, Jan Hubicka wrote:

Yes, I verified that.


I guess I can try
to do some profiling since this problem did not show on Firefox (that i
find odd given that Firefox is just about half of the size).


Yep, I'm also surprised about it.


Perhaps glibc has some stupid limit in realloc that makes it behave
in a silly way for very large arrays?


Dunno :P

Seems like a glibc issue. On my debian testing box:

hubicka@lomikamen-jh:~$ cat t.c
#include <stdlib.h>
main(int argc, char **argv)
{
   char *a = malloc (1);
   int i,n=atoi(argv[1]);
   for (i=2;i time ./a.out 10

real    0m59.808s
user    0m58.703s
sys     0m1.080s


It runs for me in:

$ time ./a.out 10

real    0m10.048s
user    0m9.742s
sys     0m0.305s

Note that there may be an interleaving load on the machine.
Perf says:

55.40%  a.out    libc-2.26.so  [.] realloc
36.01%  a.out    a.out         [.] realloc@plt
 4.98%  a.out    libc-2.26.so  [.] mremap_chunk
 3.60%  a.out    a.out         [.] main

while on my machine I see:

real    0m4.998s
user    0m4.947s
sys     0m0.050s

54.49%  a.out    libc-2.31.so  [.] realloc
37.63%  a.out    libc-2.31.so  [.] mremap_chunk
 3.72%  a.out    a.out         [.] realloc@plt

Martin



GDB stops at:
(gdb) bt
#0  0x77e70bfe in realloc () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x50bf in main ()
on debian while:
(gdb) bt
#0  0x77e7d6d0 in mem2mem_check () from /lib64/libc.so.6
#1  0x77e81c7d in realloc_check () from /lib64/libc.so.6
#2  0x50bf in main ()
on kunlun.

Perhaps someone enabled some cool security hardening feature without
much benchmarking :) (but even the Debian numbers seem like they can be
improved)

Honza


Martin



Honza


Martin

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Jan Hubicka

> Yes, I verified that.
> 
> > I guess I can try
> > to do some profiling since this problem did not show on Firefox (which I
> > find odd given that Firefox is just about half of the size).
> 
> Yep, I'm also surprised about it.
> 
> > Perhaps glibc has some stupid limit in realloc that makes it behave
> > in a silly way for very large arrays?
> 
> Dunno :P
Seems like a glibc issue. On my debian testing box:

hubicka@lomikamen-jh:~$ cat t.c
#include <stdlib.h>
main(int argc, char **argv)
{
  char *a = malloc (1);
  int i,n=atoi(argv[1]);
  for (i=2;i time ./a.out 10

real    0m59.808s
user    0m58.703s
sys     0m1.080s
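
The listing above lost part of its loop in the archive. A minimal sketch of what such a microbenchmark plausibly looks like, assuming the loop body grows the buffer by one byte per iteration via realloc (which fits the backtraces showing realloc called directly from main). The function name grow_by_one is hypothetical and not taken from the original mail:

```c
#include <stdlib.h>

/* Grow a buffer one byte at a time up to (but excluding) n bytes,
   calling realloc on every step.  Returns the number of realloc
   calls made, or -1 on allocation failure.  */
long
grow_by_one (long n)
{
  char *a = malloc (1);
  long calls = 0;

  if (!a)
    return -1;

  for (long i = 2; i < n; i++)
    {
      char *b = realloc (a, (size_t) i);
      if (!b)
        {
          free (a);
          return -1;
        }
      a = b;
      calls++;
    }

  free (a);
  return calls;
}
```

Timing a call with a large n on two systems would be one way to compare realloc behaviour between glibc builds, under the assumption stated above.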

GDB stops at:
(gdb) bt
#0  0x77e70bfe in realloc () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x50bf in main ()
on debian while:
(gdb) bt
#0  0x77e7d6d0 in mem2mem_check () from /lib64/libc.so.6
#1  0x77e81c7d in realloc_check () from /lib64/libc.so.6
#2  0x50bf in main ()
on kunlun.

Perhaps someone enabled some cool security hardening feature without
much benchmarking :) (but even the Debian numbers seem like they can be
improved)

Honza
> 
> Martin
> 
> > 
> > Honza
> > > 
> > > Martin
>

RE: [PATCH v3] driver: fix a problem with implementation of -falign-foo=0 [PR96247]

2020-07-27 Thread Hu, Jiangping

> > This patch makes the -falign-foo=0 work as described in the
> > documentation. Thanks for all the suggestions, Richard and Segher!
> 
> Hello.
> 
> I'm the author of the original code.
> 
> >
> > v3: make change more readable and self-consistent
> > v2: at a high level handles -falign-foo=0 like -falign-foo
> >
> > Regards!
> > Hujp
> >
> > ---
> >   gcc/opts.c | 24 +++-
> >   1 file changed, 19 insertions(+), 5 deletions(-)
> >
> > diff --git a/gcc/opts.c b/gcc/opts.c
> > index 499eb900643..dec5ba6d2be 100644
> > --- a/gcc/opts.c
> > +++ b/gcc/opts.c
> > @@ -2007,10 +2007,20 @@ parse_and_check_align_values (const char
> *flag,
> >  location LOC.  */
> >
> >   static void
> > -check_alignment_argument (location_t loc, const char *flag, const char
> *name)
> > +check_alignment_argument (location_t loc,
> > +const char *flag,
> > +const char *name,
> > +int *opt_flag,
> > +const char **opt_str)
> >   {
> > auto_vec<unsigned> align_result;
> > parse_and_check_align_values (flag, name, align_result, true, loc);
> > +
> > +  if (align_result.length() >= 1 && align_result[0] == 0)
> > +  {
> > +*opt_flag = 1;
> > +*opt_str = NULL;
> > +  }
> 
> Hm, shouldn't the code be placed in parse_and_check_align_values? Note that
> there's one other call in gcc/toplev.c (parse_N_M).
>
> I bet you likely want to modify result_values in case -falign-foo=0.
> About the -falign-foo=0:m:n2:m3, you can report an error in
> parse_and_check_align_values?
> 
Thanks Martin!

Reporting errors may break existing Makefiles and is not as good as handling
-falign-foo=0 like -falign-foo. We discussed this earlier.

And when parse_N_M is called, the values may have been overridden by
target code, so I think it is inappropriate to report errors at that time.

> Thanks for working on that.
> Btw. what's your use-case that you use the extended syntax of -falign-foo?
> Martin
> 
In fact, I was studying the CPU tuning structure of GCC, and when I read
the documentation of -falign-foo, I found this problem.

Thanks,
Hujp

> >   }
> >
> >   /* Print help when OPT__help_ is set.  */
> > @@ -2785,19 +2795,23 @@ common_handle_option (struct gcc_options
> *opts,
> > break;
> >
> >   case OPT_falign_loops_:
> > -  check_alignment_argument (loc, arg, "loops");
> > +  check_alignment_argument (loc, arg, "loops",
> > +  &opts->x_flag_align_loops, &opts->x_str_align_loops);
> > break;
> >
> >   case OPT_falign_jumps_:
> > -  check_alignment_argument (loc, arg, "jumps");
> > +  check_alignment_argument (loc, arg, "jumps",
> > +  &opts->x_flag_align_jumps, &opts->x_str_align_jumps);
> > break;
> >
> >   case OPT_falign_labels_:
> > -  check_alignment_argument (loc, arg, "labels");
> > +  check_alignment_argument (loc, arg, "labels",
> > +  &opts->x_flag_align_labels, &opts->x_str_align_labels);
> > break;
> >
> >   case OPT_falign_functions_:
> > -  check_alignment_argument (loc, arg, "functions");
> > +  check_alignment_argument (loc, arg, "functions",
> > +  &opts->x_flag_align_functions, &opts->x_str_align_functions);
> > break;
> >
> >   case OPT_ftabstop_:
> >
> 
>

Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Richard Biener via Gcc-patches

On Mon, Jul 27, 2020 at 11:45 AM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
> >  wrote:
> >>
> >> Richard Biener via Gcc-patches  writes:
> >> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
> >> > Gcc-patches  wrote:
> >> >>
> >> >> This is a follow up to commit 5c9669a0e6c respectively discussion
> >> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
> >> >>
> >> >> In case that an alignment constraint is less than the size of a
> >> >> corresponding scalar type, ensure that we advance at least by one
> >> >> iteration.  For example, on s390x we have for a long double an alignment
> >> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
> >> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
> >> >> can be reproduced by the following MWE:
> >> >
> >> > But we guard this case with vector_alignment_reachable_p, so we shouldn't
> >> > have ended up here and the patch looks bogus.
> >>
> >> The above sounds like it ought to count as reachable alignment though.
> >> If a type requires a lower alignment than its size, then that's even
> >> more easily reachable than a type that requires the same alignment as
> >> the size.  I guess at one extreme, a target alignment of 1 is always
> >> reachable.
> >
> > Well, if the element alignment is 8 but its size is 16 then when presumably
> > the desired vector alignment is a multiple of 16 we can never reach it.
> > Isn't this the case here?
>
> If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
> TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
> fixing wouldn't occur.  I agree that we might never be able to reach
> that alignment if the pointer starts out misaligned by 8 bytes.
>
> But I think that's why it makes sense for the target to only ask
> for 8-byte alignment for vectors too, if it can cope with it.  8-byte
> alignment should always be achievable if the scalars are ABI-aligned.
> And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
> DR_SIZE would be zero and the loop would never progress, which is the
> problem that the patch is fixing.
>
> It would even make sense for the target to ask for 1-byte alignment,
> if the target doesn't care about alignment at all.

Hmm, OK.  Guess I still think we should detect this somewhere upward
and avoid this peeling compute at all.  Somehow.

Richard.

> Thanks,
> Richard

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Martin Liška


On 7/27/20 11:51 AM, Jan Hubicka wrote:

On 7/27/20 9:11 AM, Richard Biener wrote:

OK.  I guess the previous code tried to use less memory.


It did. But I didn't realize that such exact growth would lead
to a massive reallocation for huge apps like chromium.
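
The cost of exact growth mentioned above can be illustrated with a small counting model. This is a minimal sketch and not GCC's vec implementation: count_regrows is a hypothetical helper, and the 1.5x growth factor is an assumption chosen for illustration.

```c
#include <stddef.h>

/* Count how many times the capacity has to change while a container
   grows from 0 to n elements, one element at a time.  With exact
   growth, every append enlarges the buffer.  With geometric growth,
   the capacity is multiplied by roughly 1.5 whenever it runs out.  */
size_t
count_regrows (size_t n, int geometric)
{
  size_t capacity = 0;
  size_t regrows = 0;

  for (size_t len = 1; len <= n; len++)
    if (len > capacity)
      {
        if (geometric)
          {
            size_t next = capacity + capacity / 2;
            /* Always make room for at least the requested length.  */
            capacity = next > len ? next : len;
          }
        else
          capacity = len;
        regrows++;
      }

  return regrows;
}
```

In this model the exact strategy regrows once per element, while the geometric strategy regrows only a logarithmic number of times, which is the kind of difference that would matter for very large vectors.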

I'm going to backport the patch to older releases as well.


Thank you!  Did it solve the Chromium problem?


Yes, I verified that.


I guess I can try
to do some profiling since this problem did not show on Firefox (which I
find odd given that Firefox is just about half of the size).


Yep, I'm also surprised about it.


Perhaps glibc has some stupid limit in realloc that makes it behave
in a silly way for very large arrays?


Dunno :P

Martin



Honza


Martin

Re: [PATCH v3] driver: fix a problem with implementation of -falign-foo=0 [PR96247]

2020-07-27 Thread Richard Sandiford

Martin Liška  writes:
> On 7/27/20 9:46 AM, Hu Jiangping wrote:
>> Hi!
>> 
>> This patch makes the -falign-foo=0 work as described in the
>> documentation. Thanks for all the suggestions, Richard and Segher!
>
> Hello.
>
> I'm the author of the original code.
>
>> 
>> v3: make change more readable and self-consistent
>> v2: at a high level handles -falign-foo=0 like -falign-foo
>> 
>> Regards!
>> Hujp
>> 
>> ---
>>   gcc/opts.c | 24 +++-
>>   1 file changed, 19 insertions(+), 5 deletions(-)
>> 
>> diff --git a/gcc/opts.c b/gcc/opts.c
>> index 499eb900643..dec5ba6d2be 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -2007,10 +2007,20 @@ parse_and_check_align_values (const char *flag,
>>  location LOC.  */
>>   
>>   static void
>> -check_alignment_argument (location_t loc, const char *flag, const char 
>> *name)
>> +check_alignment_argument (location_t loc,
>> +const char *flag,
>> +const char *name,
>> +int *opt_flag,
>> +const char **opt_str)
>>   {
>> auto_vec<unsigned> align_result;
>> parse_and_check_align_values (flag, name, align_result, true, loc);
>> +
>> +  if (align_result.length() >= 1 && align_result[0] == 0)
>> +  {
>> +*opt_flag = 1;
>> +*opt_str = NULL;
>> +  }
>
> Hm, shouldn't the code be placed in parse_and_check_align_values? Note that
> there's one other call in gcc/toplev.c (parse_N_M).

I think that's why check_alignment_argument is the right place.
The idea is that we're making -falign-foo=0 do two things:

- imply -falign-foo
- invalidate any previous -falign-foo=… option

It's therefore something that we should only do while iterating
through the command-line in order.
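
The two effects listed above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the patch itself: struct align_opt and apply_align_arg are hypothetical stand-ins for the flag/string pair that the patch updates, and the argument is assumed to have been validated already.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one option's state.  */
struct align_opt
{
  int flag;         /* Nonzero when plain -falign-foo is in effect.  */
  const char *str;  /* Explicit value string, or NULL for the default.  */
};

/* Apply one "-falign-foo=VALUE" argument, in command-line order.
   A leading value of 0 means "use the target default": it enables
   the plain flag and drops any value given by an earlier option.
   Any other value is simply recorded.  */
void
apply_align_arg (struct align_opt *opt, const char *value)
{
  char *end;
  unsigned long first = strtoul (value, &end, 10);

  if (end != value && first == 0)
    {
      opt->flag = 1;
      opt->str = NULL;
    }
  else
    opt->str = value;
}
```

Because each call overwrites the state left by the previous one, the outcome depends on the order of the options, which is why this only makes sense while walking the command line.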

> I bet you likely want to modify result_values in case -falign-foo=0.
> About the -falign-foo=0:m:n2:m3, you can report an error in 
> parse_and_check_align_values?

The problem is that we don't know in general what the target's default is.
It depends on whether this is aligning for functions, loops, labels, etc.
It also depends on the target core, which we might not know yet, and which
might change during compilation based on attributes and pragmas.

So I don't think there's a different value that parse_and_check_align_values
could sensibly insert instead of zero.

Thanks,
Richard

Re: [PATCH] Use vec::reserve before vec_safe_grow_cleared is called

2020-07-27 Thread Jan Hubicka

> On 7/27/20 9:11 AM, Richard Biener wrote:
> > OK.  I guess the previous code tried to use less memory.
> 
> It did. But I didn't realize that such exact growth would lead
> to a massive reallocation for huge apps like chromium.
> 
> I'm going to backport the patch to older releases as well.

Thank you!  Did it solve the Chromium problem?  I guess I can try
to do some profiling since this problem did not show on Firefox (which I
find odd given that Firefox is just about half of the size).
Perhaps glibc has some stupid limit in realloc that makes it behave
in a silly way for very large arrays?

Honza
> 
> Martin

Re: [committed] libstdc++: Add std::from_chars for floating-point types

2020-07-27 Thread Jonathan Wakely via Gcc-patches


On 27/07/20 11:41 +0200, Rainer Orth wrote:

Hi Jonathan,


This adds the missing std::from_chars overloads for floating-point
types, as required for C++17 conformance.

The implementation is a hack and not intended to be used in the long
term. Rather than parsing the string directly, this determines the
initial portion of the string that matches the pattern determined by the
chars_format parameter, then creates a NTBS to be parsed by strtod (or
strtold or strtof).

Because creating a NTBS requires allocating memory, but std::from_chars
is noexcept, we need to be careful to minimise allocation. Even after
being careful, allocation failure is still possible, and so a
non-conforming std::errc::not_enough_memory error code might be returned.

Because strtod et al depend on the current locale, but std::from_chars
does not, we change the current thread's locale to "C" using newlocale
and uselocale before calling strtod, and restore it afterwards.

Because strtod doesn't have the equivalent of a std::chars_format
parameter, it has to examine the input to determine the format in use,
even though the std::from_chars code has already parsed it once (or
twice for large input strings!)

By replacing the use of strtod we could avoid allocation, avoid changing
locale, and use optimised code paths specific to each std::chars_format
case. We would also get more portable behaviour, rather than depending
on the presence of uselocale, and on any bugs or quirks of the target
libc's strtod. Replacing strtod is a project for a later date.
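
The locale switch described above can be sketched in C as follows. This is only an illustration of the technique, not the libstdc++ code: strtod_c_locale is a hypothetical name, and the sketch assumes the POSIX.1-2008 functions newlocale, uselocale and freelocale are available on the target.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdlib.h>

/* Parse a double with strtod while the calling thread temporarily
   uses the "C" locale, then restore the previous thread locale.
   If the "C" locale object cannot be created, fall back to a plain
   strtod call in whatever locale is current.  */
double
strtod_c_locale (const char *s, char **end)
{
  locale_t c_loc = newlocale (LC_ALL_MASK, "C", (locale_t) 0);
  if (c_loc == (locale_t) 0)
    return strtod (s, end);

  locale_t old = uselocale (c_loc);  /* Switch this thread only.  */
  double value = strtod (s, end);
  uselocale (old);                   /* Restore the previous locale.  */
  freelocale (c_loc);
  return value;
}
```

A target without uselocale cannot build a helper like this, which is the portability cost the commit message points out.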


two of the new tests FAIL on Solaris 11.3 only:

+FAIL: 20_util/from_chars/4.cc (test for excess errors)
+UNRESOLVED: 20_util/from_chars/4.cc compilation failed to produce executable
+FAIL: 20_util/from_chars/5.cc (test for excess errors)
+UNRESOLVED: 20_util/from_chars/5.cc compilation failed to produce executable

Excess errors:
/vol/gcc/src/hg/master/local/libstdc++-v3/testsuite/20_util/from_chars/4.cc:41: error: 
no matching function for call to 'from_chars(char*, char*, double&, 
std::chars_format&)'
[...]

AFAICT that's due to the fact that Solaris 11.3 (unlike 11.4) lacks
uselocale.


Ah yes, the tests should not run unconditionally. I'll fix that today,
thanks.

Re: [PATCH] [RFC] vect: Fix infinite loop while determining peeling amount

2020-07-27 Thread Richard Sandiford

Richard Biener  writes:
> On Mon, Jul 27, 2020 at 11:09 AM Richard Sandiford
>  wrote:
>>
>> Richard Biener via Gcc-patches  writes:
>> > On Wed, Jul 22, 2020 at 5:18 PM Stefan Schulze Frielinghaus via
>> > Gcc-patches  wrote:
>> >>
>> >> This is a follow up to commit 5c9669a0e6c respectively discussion
>> >> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/549132.html
>> >>
>> >> In case that an alignment constraint is less than the size of a
>> >> corresponding scalar type, ensure that we advance at least by one
>> >> iteration.  For example, on s390x we have for a long double an alignment
>> >> constraint of 8 bytes whereas the size is 16 bytes.  Therefore,
>> >> TARGET_ALIGN / DR_SIZE equals zero resulting in an infinite loop which
>> >> can be reproduced by the following MWE:
>> >
>> > But we guard this case with vector_alignment_reachable_p, so we shouldn't
>> > have ended up here and the patch looks bogus.
>>
>> The above sounds like it ought to count as reachable alignment though.
>> If a type requires a lower alignment than its size, then that's even
>> more easily reachable than a type that requires the same alignment as
>> the size.  I guess at one extreme, a target alignment of 1 is always
>> reachable.
>
> Well, if the element alignment is 8 but its size is 16 then when presumably
> the desired vector alignment is a multiple of 16 we can never reach it.
> Isn't this the case here?

If the desired vector alignment (TARGET_ALIGN) is a multiple of 16 then
TARGET_ALIGN / DR_SIZE will be nonzero and the problem the patch is
fixing wouldn't occur.  I agree that we might never be able to reach
that alignment if the pointer starts out misaligned by 8 bytes.

But I think that's why it makes sense for the target to only ask
for 8-byte alignment for vectors too, if it can cope with it.  8-byte
alignment should always be achievable if the scalars are ABI-aligned.
And if the target does ask for only 8-byte alignment, TARGET_ALIGN /
DR_SIZE would be zero and the loop would never progress, which is the
problem that the patch is fixing.

It would even make sense for the target to ask for 1-byte alignment,
if the target doesn't care about alignment at all.
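
The failure mode discussed above comes down to a loop whose step is an integer division that can yield zero. A minimal sketch, assuming a simple counting loop: count_peel_candidates and its parameters are hypothetical names, not the vectorizer's actual code, and dr_size is assumed to be nonzero.

```c
/* Count the candidate peeling amounts visited when stepping from
   `start` up to (but excluding) `limit`.  The step is
   target_align / dr_size, clamped to at least 1 so that the loop
   always advances, even when the required alignment is smaller
   than the element size.  */
unsigned
count_peel_candidates (unsigned start, unsigned limit,
                       unsigned target_align, unsigned dr_size)
{
  unsigned step = target_align / dr_size;
  if (step == 0)
    step = 1;

  unsigned visited = 0;
  for (unsigned npeel = start; npeel < limit; npeel += step)
    visited++;

  return visited;
}
```

With an 8-byte alignment and a 16-byte element, the unclamped step would be 0 and the loop would never terminate; the clamp turns that case into a step of one iteration.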

Thanks,
Richard

Re: [committed] libstdc++: Add std::from_chars for floating-point types

2020-07-27 Thread Rainer Orth

Hi Jonathan,

> This adds the missing std::from_chars overloads for floating-point
> types, as required for C++17 conformance.
>
> The implementation is a hack and not intended to be used in the long
> term. Rather than parsing the string directly, this determines the
> initial portion of the string that matches the pattern determined by the
> chars_format parameter, then creates a NTBS to be parsed by strtod (or
> strtold or strtof).
>
> Because creating a NTBS requires allocating memory, but std::from_chars
> is noexcept, we need to be careful to minimise allocation. Even after
> being careful, allocation failure is still possible, and so a
> non-conforming std::errc::not_enough_memory error code might be returned.
>
> Because strtod et al depend on the current locale, but std::from_chars
> does not, we change the current thread's locale to "C" using newlocale
> and uselocale before calling strtod, and restore it afterwards.
>
> Because strtod doesn't have the equivalent of a std::chars_format
> parameter, it has to examine the input to determine the format in use,
> even though the std::from_chars code has already parsed it once (or
> twice for large input strings!)
>
> By replacing the use of strtod we could avoid allocation, avoid changing
> locale, and use optimised code paths specific to each std::chars_format
> case. We would also get more portable behaviour, rather than depending
> on the presence of uselocale, and on any bugs or quirks of the target
> libc's strtod. Replacing strtod is a project for a later date.

two of the new tests FAIL on Solaris 11.3 only:

+FAIL: 20_util/from_chars/4.cc (test for excess errors)
+UNRESOLVED: 20_util/from_chars/4.cc compilation failed to produce executable
+FAIL: 20_util/from_chars/5.cc (test for excess errors)
+UNRESOLVED: 20_util/from_chars/5.cc compilation failed to produce executable

Excess errors:
/vol/gcc/src/hg/master/local/libstdc++-v3/testsuite/20_util/from_chars/4.cc:41: 
error: no matching function for call to 'from_chars(char*, char*, double&, 
std::chars_format&)'
[...]

AFAICT that's due to the fact that Solaris 11.3 (unlike 11.4) lacks
uselocale.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH v3] driver: fix a problem with implementation of -falign-foo=0 [PR96247]

2020-07-27 Thread Martin Liška


On 7/27/20 9:46 AM, Hu Jiangping wrote:

Hi!

This patch makes the -falign-foo=0 work as described in the
documentation. Thanks for all the suggestions, Richard and Segher!


Hello.

I'm the author of the original code.



v3: make change more readable and self-consistent
v2: at a high level handles -falign-foo=0 like -falign-foo

Regards!
Hujp

---
  gcc/opts.c | 24 +++-
  1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/gcc/opts.c b/gcc/opts.c
index 499eb900643..dec5ba6d2be 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2007,10 +2007,20 @@ parse_and_check_align_values (const char *flag,
 location LOC.  */
  
  static void

-check_alignment_argument (location_t loc, const char *flag, const char *name)
+check_alignment_argument (location_t loc,
+const char *flag,
+const char *name,
+int *opt_flag,
+const char **opt_str)
  {
auto_vec<unsigned> align_result;
parse_and_check_align_values (flag, name, align_result, true, loc);
+
+  if (align_result.length() >= 1 && align_result[0] == 0)
+  {
+*opt_flag = 1;
+*opt_str = NULL;
+  }


Hm, shouldn't the code be placed in parse_and_check_align_values? Note that
there's one other call in gcc/toplev.c (parse_N_M).

I bet you likely want to modify result_values in case -falign-foo=0.
About the -falign-foo=0:m:n2:m3, you can report an error in 
parse_and_check_align_values?

Thanks for working on that.
Btw. what's your use-case that you use the extended syntax of -falign-foo?
Martin


  }
  
  /* Print help when OPT__help_ is set.  */

@@ -2785,19 +2795,23 @@ common_handle_option (struct gcc_options *opts,
break;
  
  case OPT_falign_loops_:

-  check_alignment_argument (loc, arg, "loops");
+  check_alignment_argument (loc, arg, "loops",
+  &opts->x_flag_align_loops, &opts->x_str_align_loops);
break;
  
  case OPT_falign_jumps_:

-  check_alignment_argument (loc, arg, "jumps");
+  check_alignment_argument (loc, arg, "jumps",
+  &opts->x_flag_align_jumps, &opts->x_str_align_jumps);
break;
  
  case OPT_falign_labels_:

-  check_alignment_argument (loc, arg, "labels");
+  check_alignment_argument (loc, arg, "labels",
+  &opts->x_flag_align_labels, &opts->x_str_align_labels);
break;
  
  case OPT_falign_functions_:

-  check_alignment_argument (loc, arg, "functions");
+  check_alignment_argument (loc, arg, "functions",
+  &opts->x_flag_align_functions, &opts->x_str_align_functions);
break;
  
  case OPT_ftabstop_:

Re: [PATCH] Remove dead vector comparisons

2020-07-27 Thread Martin Liška


On 7/10/20 10:24 AM, Richard Biener wrote:

On Fri, Jul 10, 2020 at 9:50 AM Martin Liška  wrote:


As mentioned in the PR, we need to clean up orphan vector comparisons
that tend to be produced by gimplification of VEC_COND_EXPR.

I've done that easily in expand_vector_comparison where I add these
to a bitmap used in simple DCE.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?


I don't like this much - the reason for the dead code is that the gimplifier,
while it manages to DCE the VEC_COND_EXPR because the value is not
needed, does not apply the same for the operands where only side-effects
would need to be kept.

But then if the vector comparisons would not be dead, the testcase
would still ICE even with your patch.


Hello.

The question here is whether one can write such a test-case. I would expect
the target to lower a vector comparison directly to non-vector code?


 And the reason for the ICE is
that vector lowering checks

   if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type, code)
   && !expand_vec_cond_expr_p (type, TREE_TYPE (op0), code))
 {

while RTL expansion has

   /* For vector typed comparisons emit code to generate the desired
  all-ones or all-zeros mask.  */
   if (TREE_CODE (ops->type) == VECTOR_TYPE)
 {
   tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
   if (VECTOR_BOOLEAN_TYPE_P (ops->type)
   && expand_vec_cmp_expr_p (TREE_TYPE (arg0), ops->type, ops->code))
 return expand_vec_cmp_expr (ops->type, ifexp, target);
   else
 gcc_unreachable ();

so vector lowering doesn't lower vector comparisons when we can expand
via a VEC_COND_EXPR but the RTL expansion code asserts this will not
be needed.


Ah, ok, now I understand that.



Thus this should either be fixed by re-instantiating the RTL expansion code


Could you please explain what re-instantiating the RTL means?


or handling vector comparisons in gimple-isel.cc at least when they need
to be expanded as VEC_COND_EXPR.


That's doable, by something like:

  _1 = v_5(D) > { 0, 0, 0, 0, 0, 0, 0, 0 };
  _10 = VEC_COND_EXPR <_1, { 0, 0, 0, 0, 0, 0, 0, 0 }, { 1, 1, 1, 1, 1, 1, 1, 1 }>;

which will be immediately expanded in ISEL to:

  _10 = .VCOND (v_5(D), { 0, 0, 0, 0, 0, 0, 0, 0 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, { 1, 1, 1, 1, 1, 1, 1, 1 }, 109);

But I would need to redirect all uses of _1 to _10, right? Do we prefer to do
this?

Thanks,
Martin



Richard.


Thanks,
Martin

gcc/ChangeLog:

 PR tree-optimization/96128
 * tree-vect-generic.c (expand_vector_comparison): Remove vector
 comparisons that don't have a usage.

gcc/testsuite/ChangeLog:

 PR tree-optimization/96128
 * gcc.target/s390/vector/pr96128.c: New test.
---
   .../gcc.target/s390/vector/pr96128.c  | 35 +++
   gcc/tree-vect-generic.c   |  4 ++-
   2 files changed, 38 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/gcc.target/s390/vector/pr96128.c

diff --git a/gcc/testsuite/gcc.target/s390/vector/pr96128.c 
b/gcc/testsuite/gcc.target/s390/vector/pr96128.c
new file mode 100644
index 000..20abe5e515c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/pr96128.c
@@ -0,0 +1,35 @@
+/* PR tree-optimization/96128 */
+/* { dg-options "-march=z13" } */
+
+#define B_TEST(TYPE) { TYPE v __attribute__((vector_size(16))); (void)((v < v) 
< v); }
+#ifdef __cplusplus
+#define T_TEST(TYPE) { TYPE s; TYPE v __attribute__((vector_size(16))); 
__typeof((v

Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-07-27 Thread Hongtao Liu via Gcc-patches

ping

On Mon, Jul 20, 2020 at 4:40 PM Hongtao Liu  wrote:
>
> Correct PR number in ChangeLog
> it's pr96243.
>
> On Mon, Jul 20, 2020 at 1:46 PM Hongtao Liu  wrote:
> >
> > Hi:
> >   For rtx like (eq:HI (V8SI 90) (V8SI 91)), cse will take it as a
> > boolean value and try to do some optimization. But it is not true for
> > vector compare, also other places in rtl passes hold the same
> > assumption.
> >
> > Bootstrap is ok, regression test is ok for i386 backend.
> >
> > 2020-07-20  Hongtao Liu  
> >
> > gcc/
> > PR target/96243
> > * config/i386/i386-expand.c (ix86_expand_sse_cmp): Refine for
> > maskcmp.
> > (ix86_expand_mask_vec_cmp): Change prototype.
> > * config/i386/i386-protos.h (ix86_expand_mask_vec_cmp): Change
> > prototype.
> > * config/i386/i386.c (ix86_print_operand): Remove operand
> > modifier 'I'.
> > * config/i386/sse.md
> > (*_cmp3,
> > *_cmp3,
> > *_ucmp3,
> > *_ucmp3,
> > avx512f_maskcmp3): Deleted.
> >
> > gcc/testsuite
> > * gcc.target/i386/pr92865-1.c: Adjust testcase.
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao

Re: [PATCH][AVX512][PR96246] Merge two define_insn: _blendm, _load_mask.

2020-07-27 Thread Hongtao Liu via Gcc-patches

ping

On Wed, Jul 22, 2020 at 12:59 PM Hongtao Liu  wrote:
>
>   Those two define_insns have same pattern, and
> _load_mask would always be matched since it show up
> earlier in the md file, and it may lose some opportunity in
> pass_reload since _load_mask only have constraint "0C"
> for operand2, and "v" constraint in _vblendm would never
> be matched.
>
> 2020-07-21  Hongtao Liu  
>
> gcc/
>PR target/96246
> * config/i386/sse.md (_load_mask,
> _load_mask): Extend to generate blendm
> instructions.
> (_blendm, _blendm): Change
> define_insn to define_expand.
>
> gcc/testsuite/
> * gcc.target/i386/avx512bw-pr96246-1.c: New test.
> * gcc.target/i386/avx512bw-pr96246-2.c: New test.
> * gcc.target/i386/avx512vl-pr96246-1.c: New test.
> * gcc.target/i386/avx512vl-pr96246-2.c: New test.
> * gcc.target/i386/avx512bw-vmovdqu16-1.c: New test.
> * gcc.target/i386/avx512bw-vmovdqu8-1.c: New test.
> * gcc.target/i386/avx512f-vmovapd-1.c: New test.
> * gcc.target/i386/avx512f-vmovaps-1.c: New test.
> * gcc.target/i386/avx512f-vmovdqa32-1.c: New test.
> * gcc.target/i386/avx512f-vmovdqa64-1.c: New test.
> * gcc.target/i386/avx512vl-pr92686-movcc-1.c: New test.
> * gcc.target/i386/avx512vl-pr96246-1.c: New test.
> * gcc.target/i386/avx512vl-pr96246-2.c: New test.
> * gcc.target/i386/avx512vl-vmovapd-1.c: New test.
> * gcc.target/i386/avx512vl-vmovaps-1.c: New test.
> * gcc.target/i386/avx512vl-vmovdqa32-1.c: New test.
> * gcc.target/i386/avx512vl-vmovdqa64-1.c: New test.
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao

Re: [PATCH] Using gen_int_mode instead of GEN_INT to avoid ICE caused by type promotion.

2020-07-27 Thread Hongtao Liu via Gcc-patches

ping

On Wed, Jul 22, 2020 at 3:57 PM Hongtao Liu  wrote:
>
>   Bootstrap is ok, regression test is ok for i386 backend.
>
> gcc/
> PR target/96262
> * config/i386/i386-expand.c
> (ix86_expand_vec_shift_qihi_constant): Refine.
>
> gcc/testsuite/
> * gcc.target/i386/pr96262-1.c: New test.
>
> ---
>  gcc/config/i386/i386-expand.c |  6 +++---
>  gcc/testsuite/gcc.target/i386/pr96262-1.c | 11 +++
>  2 files changed, 14 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr96262-1.c
>
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index e194214804b..d57d043106a 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -19537,7 +19537,7 @@ bool
>  ix86_expand_vec_shift_qihi_constant (enum rtx_code code, rtx dest,
> rtx op1, rtx op2)
>  {
>machine_mode qimode, himode;
> -  unsigned int and_constant, xor_constant;
> +  HOST_WIDE_INT and_constant, xor_constant;
>HOST_WIDE_INT shift_amount;
>rtx vec_const_and, vec_const_xor;
>rtx tmp, op1_subreg;
> @@ -19612,7 +19612,7 @@ ix86_expand_vec_shift_qihi_constant (enum
> rtx_code code, rtx dest, rtx op1, rtx
>emit_move_insn (dest, simplify_gen_subreg (qimode, tmp, himode, 0));
>emit_move_insn (vec_const_and,
>   ix86_build_const_vector (qimode, true,
> -  GEN_INT (and_constant)));
> +  gen_int_mode (and_constant,
> QImode)));
>emit_insn (gen_and (dest, dest, vec_const_and));
>
>/* For ASHIFTRT, perform extra operation like
> @@ -19623,7 +19623,7 @@ ix86_expand_vec_shift_qihi_constant (enum
> rtx_code code, rtx dest, rtx op1, rtx
>vec_const_xor = gen_reg_rtx (qimode);
>emit_move_insn (vec_const_xor,
>   ix86_build_const_vector (qimode, true,
> -  GEN_INT (xor_constant)));
> +  gen_int_mode
> (xor_constant, QImode)));
>emit_insn (gen_xor (dest, dest, vec_const_xor));
>emit_insn (gen_sub (dest, dest, vec_const_xor));
>  }
> diff --git a/gcc/testsuite/gcc.target/i386/pr96262-1.c
> b/gcc/testsuite/gcc.target/i386/pr96262-1.c
> new file mode 100644
> index 000..1825388072e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr96262-1.c
> @@ -0,0 +1,11 @@
> +/* PR target/96262 */
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512bw -O" } */
> +
> +typedef char __attribute__ ((__vector_size__ (64))) V;
> +
> +V
> +foo (V v)
> +{
> +  return ~(v << 1);
> +}
> --
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao
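
For readers unfamiliar with the distinction in the subject line: GCC expects an integer constant to be sign-extended from the width of its mode. GEN_INT wraps the value as given, while gen_int_mode first truncates it to the mode. The effect of that truncation can be sketched in plain C; canonicalize_for_width is a hypothetical helper written for illustration, not a GCC function, and it assumes a width between 1 and 64 bits.

```c
#include <stdint.h>

/* Keep only the low `bits` bits of `value` and sign-extend the
   result from bit `bits - 1`.  `bits` must be in the range 1..64.  */
int64_t
canonicalize_for_width (int64_t value, unsigned bits)
{
  uint64_t mask = bits >= 64 ? ~UINT64_C (0)
                             : (UINT64_C (1) << bits) - 1;
  uint64_t sign = UINT64_C (1) << (bits - 1);
  uint64_t low = (uint64_t) value & mask;

  /* (low ^ sign) - sign propagates the sign bit into the upper bits.  */
  return (int64_t) ((low ^ sign) - sign);
}
```

For an 8-bit width, a value such as 0xfe comes out as -2 rather than 254, which is the form a QImode constant is expected to have.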
