date:20181102

On Wed, Oct 31, 2018 at 5:40 PM Martin Liška  wrote:
>
> Hi.
>
> As seen in r265663 having htab_hash_string accepting const char * would
> report a compilation error. The void * argument is needed for old C-style
> htab used in libiberty. I'm suggesting to come up with htab_hash_string_vptr
> and change signature of the old one (htab_hash_string). And putting these into
> hashtab.h will make it inlinable in C++-style hash table.
>
> Patch survives regression tests on ppc64le-linux-gnu.
> Total cc1 change with the patch:
>   +0.0% +2.09Ki
>
> Hope it's acceptable.

What's the reason to inline the implementation?  I guess you want to
get a compilation error when calling htab_hash_string on a non-string?
But then libiberty isn't C++ and is used in other projects which you
may be breaking with your patch?

A solution might be to poison htab_hash_string and provide our own
variant.

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2018-10-31  Martin Liska  
>
> * gengtype-state.c (read_state): Use newly added
> htab_hash_string_vptr.
> * gensupport.c (gen_mnemonic_attr): Likewise.
> (check_define_attr_duplicates): Likewise.
> * godump.c (go_finish): Likewise.
>
> include/ChangeLog:
>
> 2018-10-31  Martin Liska  
>
> * hashtab.h (htab_hash_string): Change signature
> to const char * and make it static inline.
> (htab_hash_string_vptr): Likewise.
>
> libcpp/ChangeLog:
>
> 2018-10-31  Martin Liska  
>
> * files.c (_cpp_init_files): Use htab_hash_string_vptr.
>
> libiberty/ChangeLog:
>
> 2018-10-31  Martin Liska  
>
> * hashtab.c:
> (htab_hash_string): Move to header file.
> ---
>  gcc/gengtype-state.c |  2 +-
>  gcc/gensupport.c |  4 ++--
>  gcc/godump.c |  6 +++---
>  include/hashtab.h| 47 ++--
>  libcpp/files.c   |  2 +-
>  libiberty/hashtab.c  | 38 ---
>  6 files changed, 52 insertions(+), 47 deletions(-)
>
>
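
A minimal sketch of the split proposed above: a typed `const char *` hash plus a `void *` forwarding variant for old C-style hash tables. The `hashval_t` typedef is re-declared locally so the snippet stands alone (in GCC it comes from hashtab.h), and the mixing constants are illustrative assumptions, not a claim about what libiberty uses.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for libiberty's hashval_t (assumption for this sketch).  */
typedef unsigned int hashval_t;

/* Typed variant: passing a non-string pointer is diagnosed by the
   compiler (an error in C++, at least a warning in C).  */
static inline hashval_t
htab_hash_string (const char *s)
{
  assert (s != NULL);
  hashval_t r = 0;
  unsigned char c;

  while ((c = (unsigned char) *s++) != 0)
    r = r * 67 + c - 113;   /* illustrative mixing step */

  return r;
}

/* void * variant with the callback signature that old C-style hash
   tables expect; it only forwards to the typed function.  */
static inline hashval_t
htab_hash_string_vptr (const void *p)
{
  return htab_hash_string ((const char *) p);
}
```

Callers that hold a real string call the typed function directly, which is also what gives the compiler a chance to inline it; only table callbacks need the `void *` form.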

Re: [PATCH, libphobos] Fix libgphobos.spec in the wrong place with --enable-version-specific-runtime-libs

On Thu, Nov 1, 2018 at 12:57 AM Iain Buclaw  wrote:
>
> Hi,
>
> This adds --enable-version-specific-runtime-libs configure option to libphobos.
>
> Also uncovered that MULTISUBDIR wasn't being set correctly when this
> option was enabled.
>
> Built and checked with make install-target-libphobos.  Ok for trunk?

OK.

Richard.

> --
> Iain
>
> ---
> libphobos/ChangeLog:
>
> 2018-11-01  Iain Buclaw  
>
> PR d/87827
> * Makefile.in: Rebuild.
> * configure: Rebuild.
> * configure.ac: Properly set MULTISUBDIR.
> * d_rules.am: Set toolexecdir and toolexeclibdir.
> * libdruntime/Makefile.in: Rebuild.
> * m4/druntime.m4 (DRUNTIME_INSTALL_DIRECTORIES): Add
> --enable-version-specific-runtime-libs.
> * src/Makefile.in: Rebuild.
> * testsuite/Makefile.in: Rebuild.
>
> ---

Re: [PATCH] Come up with htab_hash_string_vptr and use string-specific if possible.

2018-11-02 Thread Martin Liška

On 11/2/18 9:02 AM, Richard Biener wrote:
> On Wed, Oct 31, 2018 at 5:40 PM Martin Liška  wrote:
>>
>> Hi.
>>
>> As seen in r265663 having htab_hash_string accepting const char * would
>> report a compilation error. The void * argument is needed for old C-style
>> htab used in libiberty. I'm suggesting to come up with htab_hash_string_vptr
>> and change signature of the old one (htab_hash_string). And putting these into
>> hashtab.h will make it inlinable in C++-style hash table.
>>
>> Patch survives regression tests on ppc64le-linux-gnu.
>> Total cc1 change with the patch:
>>   +0.0% +2.09Ki
>>
>> Hope it's acceptable.
> 
> What's the reason to inline the implementation?

Doing that can enable inlining of hash_table::* functions (find_slot_with_hash,
...) in gcc.

> I guess you want to
> get a compilation error when calling htab_hash_string on a non-string?

Yep.

> But then libiberty isn't C++ and is used in other projects which you
> may be breaking with your patch?

Ah, ok, that's new for me that it's used elsewhere :)

> 
> A solution might be to poison htab_hash_string and provide our own
> variant.

Will work on that.

Martin

> 
> Richard.
> 
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-10-31  Martin Liska  
>>
>> * gengtype-state.c (read_state): Use newly added
>> htab_hash_string_vptr.
>> * gensupport.c (gen_mnemonic_attr): Likewise.
>> (check_define_attr_duplicates): Likewise.
>> * godump.c (go_finish): Likewise.
>>
>> include/ChangeLog:
>>
>> 2018-10-31  Martin Liska  
>>
>> * hashtab.h (htab_hash_string): Change signature
>> to const char * and make it static inline.
>> (htab_hash_string_vptr): Likewise.
>>
>> libcpp/ChangeLog:
>>
>> 2018-10-31  Martin Liska  
>>
>> * files.c (_cpp_init_files): Use htab_hash_string_vptr.
>>
>> libiberty/ChangeLog:
>>
>> 2018-10-31  Martin Liska  
>>
>> * hashtab.c:
>> (htab_hash_string): Move to header file.
>> ---
>>  gcc/gengtype-state.c |  2 +-
>>  gcc/gensupport.c |  4 ++--
>>  gcc/godump.c |  6 +++---
>>  include/hashtab.h| 47 ++--
>>  libcpp/files.c   |  2 +-
>>  libiberty/hashtab.c  | 38 ---
>>  6 files changed, 52 insertions(+), 47 deletions(-)
>>
>>
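
The poisoning idea suggested above could look roughly like this, assuming GCC's `#pragma GCC poison`. The replacement names `hash_string` and `hash_string_vptr` are placeholders for "our own variant", and the hash body is illustrative only.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int hashval_t;   /* stand-in for the libiberty typedef */

/* Stand-in for the declaration that libiberty's hashtab.h provides.  */
hashval_t htab_hash_string (const void *);

/* GCC-private typed replacement (placeholder name).  */
static inline hashval_t
hash_string (const char *s)
{
  assert (s != NULL);
  hashval_t r = 0;
  unsigned char c;

  while ((c = (unsigned char) *s++) != 0)
    r = r * 67 + c - 113;   /* illustrative mixing step */

  return r;
}

/* Callback-compatible wrapper for C-style hash tables.  */
static inline hashval_t
hash_string_vptr (const void *p)
{
  return hash_string ((const char *) p);
}

/* From here on, any use of the libiberty name is a compile-time error,
   so new code cannot silently pass a non-string.  */
#pragma GCC poison htab_hash_string
```

This leaves libiberty itself untouched for other projects, while GCC sources that include the poisoning header are forced onto the typed functions.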

Re: [PATCH, testsuite] test case fixes for pdp11

2018-11-02 Thread Andreas Schwab

* gcc.c-torture/execute/20010904-2.c: Fix last change.
* gcc.dg/Wattributes-10.c: Likewise.

diff --git a/gcc/testsuite/gcc.c-torture/execute/20010904-2.c b/gcc/testsuite/gcc.c-torture/execute/20010904-2.c
index 7f3affe10f..a0f2626e76 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20010904-2.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20010904-2.c
@@ -6,7 +6,7 @@
 #define alignment 32
 #endif
 
-typedef struct x { int a; int b; } __attribute__((aligned(aligned))) X;
+typedef struct x { int a; int b; } __attribute__((aligned(alignment))) X;
 typedef struct y { X x; X y[31]; int c; } Y;
 
 Y y[2];
diff --git a/gcc/testsuite/gcc.dg/Wattributes-10.c b/gcc/testsuite/gcc.dg/Wattributes-10.c
index 37fd2c1b75..4dccaf3075 100644
--- a/gcc/testsuite/gcc.dg/Wattributes-10.c
+++ b/gcc/testsuite/gcc.dg/Wattributes-10.c
@@ -12,7 +12,7 @@ struct S
 
   int* __attribute__ ((aligned (16), packed)) qaligned;   /* { dg-warning "ignoring attribute .packed. because it conflicts with attribute .aligned." } */
   int* __attribute__ ((packed, aligned (16))) qpacked;/* { dg-warning ".packed. attribute ignored for type .int \\\*." } */
-} s;/* { dg-error "alignment of 's' is greater" { target pdp11*-*-* } } */
+} s;/* { dg-error "alignment of 's' is greater" "" { target pdp11*-*-* } } */
 
 
 void test (void)
-- 
2.19.1


-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
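
The first hunk fixes a typo where the attribute argument named the identifier `aligned` instead of the `alignment` macro. A reduced sketch of the corrected construct, assuming a GNU C compiler with C11 `_Alignof`; the helper `is_aligned` is hypothetical and not part of the test case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A target with a smaller maximum alignment can pre-define this.  */
#ifndef alignment
#define alignment 32
#endif

/* The attribute argument is the macro, which expands to a constant.  */
typedef struct x { int a; int b; } __attribute__((aligned(alignment))) X;
typedef struct y { X x; X y[31]; int c; } Y;

/* Returns nonzero when P satisfies the requested alignment.  */
static int
is_aligned (const void *p)
{
  assert (p != NULL);
  return ((uintptr_t) p % alignment) == 0;
}
```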

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-11-02 Thread Kugan Vivekanandarajah

Hi Richard,
Thanks for the review.
On Tue, 30 Oct 2018 at 01:25, Richard Biener  wrote:
>
> On Mon, Oct 29, 2018 at 2:06 AM Kugan Vivekanandarajah
>  wrote:
> >
> > Hi Richard and Jeff,
> >
> > Thanks for your comments.
> >
> > On Fri, 26 Oct 2018 at 19:40, Richard Biener wrote:
> > >
> > > On Fri, Oct 26, 2018 at 4:55 AM Jeff Law  wrote:
> > > >
> > > > On 10/25/18 4:33 PM, Kugan Vivekanandarajah wrote:
> > > > > Hi,
> > > > >
> > > > > PR87528 showed a case where libgcc generated popcount is causing
> > > > > regression for Skylake.
> > > > > We also have PR86677 where kernel build is failing because the kernel
> > > > > does not use the libgcc (when backend is not defining popcount
> > > > > pattern).  While I agree that the kernel should implement its own
> > > > > functionality when it is not using the libgcc, I am afraid that the
> > > > > implementation can have the same performance issues reported for
> > > > > Skylake in PR87528.
> > > > >
> > > > > Therefore, I would like to propose that we disable popcount detection
> > > > > when we don't have a pattern for that. The attached patch (based on
> > > > > previous discussions) does this.
> > > > >
> > > > > Bootstrapped and regression tested on x86_64-linux-gnu with no new
> > > > > regressions. We need to disable the popcount* testcases. I will have
> > > > > > to define an effective_target_with_popcount in
> > > > > gcc/testsuite/lib/target-supports.exp if this patch is OK?
> > > > > Thanks,
> > > > > Kugan
> > > > >
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > 2018-10-25  Kugan Vivekanandarajah  
> > > > >
> > > > > > * tree-scalar-evolution.c (expression_expensive_p): Make BUILTIN POPCOUNT
> > > > > > as expensive when backend does not define it.
> > > > >
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > > 2018-10-25  Kugan Vivekanandarajah  
> > > > >
> > > > > * gcc.target/aarch64/popcount4.c: New test.
> > > > >
> > > > FWIW, I've been disabling by checking direct_optab_handler elsewhere
> > > > (number_of_iterations_popcount) in my tester.  It may in fact be an old
> > > > patch from you.
> > > >
> > > > Richi argued that it's the kernel team's responsibility to provide a
> > > > popcount since they don't link with libgcc.  And I'm generally in
> > > > agreement with that position, though it does tend to generate some
> > > > friction with the kernel developers.  We also run the real risk of GCC 9
> > > > not being able to build the kernel which, IMHO, would be a disaster from
> > > > a PR standpoint.
> > > >
> > > > I'd like to hear from others here.  I fully realize we're beyond the
> > > > realm of what is strictly technically correct here from a review standpoint.
> > >
> > > As said final value replacement to a library call is probably not wanted
> > > for optimization purpose, so adjusting expression_expensive_p is OK with
> > > me.  It might not fully solve the (non-)issue in case another optimization
> > > pass chooses to materialize niter computation result.
> > >
> > > Few comments on the patch:
> > >
> > > +  tree fndecl = get_callee_fndecl (expr);
> > > +
> > > +  if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
> > > +   {
> > > + combined_fn cfn = as_combined_fn (DECL_FUNCTION_CODE (fndecl));
> > >
> > >   combined_fn cfn = gimple_call_combined_fn (expr);
> > >   switch (cfn)
> > > {
> >
> > Did you mean:
> > combined_fn cfn = get_call_combined_fn (expr);
>
> Yes.
>
> > > ...
> > >
> > > cfn will be CFN_LAST for a non-builtin/internal call.  I know Richard is
> > > mostly offline but eventually he knows whether there is a better way to query
> > >
> > > +   CASE_CFN_POPCOUNT:
> > > +     /* Check if opcode for popcount is available.  */
> > > +     if (optab_handler (popcount_optab,
> > > +                        TYPE_MODE (TREE_TYPE (CALL_EXPR_ARG (expr, 0))))
> > > +         == CODE_FOR_nothing)
> > > +       return true;
> > >
> > > note that we currently generate builtin calls rather than IFN calls
> > > (when a direct optab is supported).
> > >
> > > Another comment on the patch is that you probably have to adjust existing
> > > popcount testcases to add architecture specific flags enabling support for
> > > the instructions, otherwise you won't see loop replacement.
> > Indeed.
> > In lib/target-supports.exp, I will try to add support for
> > check_effective_target_popcount_long.
> > When I grep for the popcount pattern in md files, I see it is defined for:
> >
> > tilegx
> > tilepro
> > alpha
> > aarch64  when TARGET_SIMD
> > ia64
> > rs6000
> > i386  when TARGET_POPCOUNT
> > powerpcspe  when TARGET_POPCNTB || TARGET_POPCNTD
> > s390  when TARGET_Z196 && TARGET_64BIT
> > sparc when TARGET_POPC
> > arm when TARGET_NEON
> > mips when ISA_HAS_POP
> > spu
> > avr
> >
> > I can check these targets with the condition.
> > Another possibility is to check with a
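
For context, the loop shape at issue in this thread is the classic clear-lowest-set-bit counter. The sketch below (function names are hypothetical) shows the loop form next to `__builtin_popcountl`, assuming a GNU-compatible compiler. When the target has no popcount instruction, the builtin ends up as a library call, which is the case the patch wants `expression_expensive_p` to treat as expensive.

```c
#include <assert.h>

/* Loop form: iterates once per set bit, clearing the lowest one.  */
static int
popcount_loop (unsigned long x)
{
  int n = 0;

  while (x)
    {
      x &= x - 1;
      n++;
    }
  return n;
}

/* Closed form that final value replacement may substitute for the
   loop result.  */
static int
popcount_builtin (unsigned long x)
{
  int n = __builtin_popcountl (x);

  assert (n >= 0);
  return n;
}
```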

Re: [PATCH] Come up with htab_hash_string_vptr and use string-specific if possible.

2018-11-02 Thread Martin Liška

V2 of the patch.

Thoughts?
Martin
From 484d6d29292f210946f9d5091273eb9e1796c874 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 2 Nov 2018 10:15:10 +0100
Subject: [PATCH] Come up with htab_hash_string_vptr and use string-specific if
 possible.

gcc/ChangeLog:

2018-11-02  Martin Liska  

	* system.h (hash_string): New function.
	(hash_string_vptr): Likewise.
	Poison htab_hash_string.
	* attribs.c (struct excl_hash_traits): Use hash_string and
	hash_string_vptr instead of libiberty htab_hash_string.
	* config/darwin.c (indirection_hasher::hash): Likewise.
	(machopic_indirection_name): Likewise.
	(machopic_validate_stub_or_non_lazy_ptr): Likewise.
	* config/sol2.c (comdat_entry_hasher::hash): Likewise.
	* dwarf2out.c (indirect_string_hasher::hash): Likewise.
	(find_AT_string_in_table): Likewise.
	(addr_hasher::hash): Likewise.
	(external_ref_hasher::hash): Likewise.
	(dwarf_file_hasher::hash): Likewise.
	(lookup_filename): Likewise.
	(macinfo_entry_hasher::hash): Likewise.
	(prune_unused_types_update_strings): Likewise.
	* gengtype-state.c (read_state): Likewise.
	* gengtype.c (htab_hash_inputfile): Likewise.
	* genhooks.c (s_hook_hash): Likewise.
	* genmatch.c (id_base::id_base): Likewise.
	* genmodes.c (hash_mode): Likewise.
	* gensupport.c (gen_mnemonic_attr): Likewise.
	(check_define_attr_duplicates): Likewise.
	(hash_struct_pred_data): Likewise.
	* gentarget-def.c (insn_hasher::hash): Likewise.
	(def_target_insn): Likewise.
	(add_insn): Likewise.
	* godump.c (macro_hash_hashval): Likewise.
	(go_define): Likewise.
	(go_finish): Likewise.
	* hash-traits.h (string_hash::hash): Likewise.
	* lra.c (lra_rtx_hash): Likewise.
	* lto-section-in.c (hash_name): Likewise.
	* plugin.c (event_hasher::hash): Likewise.
	(htab_hash_plugin): Likewise.
	(add_new_plugin): Likewise.
	(parse_plugin_arg_opt): Likewise.
	(register_plugin_info): Likewise.
	(init_one_plugin): Likewise.
	* read-md.c (leading_string_hash): Likewise.
	* read-rtl.c (overloaded_name_hash): Likewise.
	* statistics.c (stats_counter_hasher::hash): Likewise.
	* symtab.c (symbol_table::decl_assembler_name_hash): Likewise.
	(section_name_hasher::hash): Likewise.
	(symtab_node::set_section_for_node): Likewise.
	* tlink.c (hash_string_hash): Likewise.
	(symbol_hash_lookup): Likewise.
	(file_hash_lookup): Likewise.
	(demangled_hash_lookup): Likewise.
	* varasm.c (section_hasher::hash): Likewise.
	(hash_section): Likewise.
	(get_section): Likewise.
	(const_rtx_hash_1): Likewise.

gcc/cp/ChangeLog:

2018-11-02  Martin Liska  

	* cp-ubsan.c (cp_ubsan_instrument_vptr): Use hash_string and
	hash_string_vptr instead of libiberty htab_hash_string.

gcc/fortran/ChangeLog:

2018-11-02  Martin Liska  

	* trans-decl.c (struct module_hasher): Use hash_string and
	hash_string_vptr instead of libiberty htab_hash_string.
	(module_decl_hasher::hash): Likewise.
	(gfc_find_module): Likewise.
	(gfc_module_add_decl): Likewise.
	(gfc_trans_use_stmts): Likewise.

gcc/lto/ChangeLog:

2018-11-02  Martin Liska  

	* lto.c (hash_name): Use hash_string and
	hash_string_vptr instead of libiberty htab_hash_string.

libcc1/ChangeLog:

2018-11-02  Martin Liska  

	* libcc1plugin.cc (struct string_hasher): Use hash_string and
	hash_string_vptr instead of libiberty htab_hash_string.
	* libcp1plugin.cc (struct string_hasher): Likewise.
---
 gcc/attribs.c|  4 ++--
 gcc/config/darwin.c  |  6 +++---
 gcc/config/sol2.c|  2 +-
 gcc/cp/cp-ubsan.c|  2 +-
 gcc/dwarf2out.c  | 16 
 gcc/fortran/trans-decl.c | 10 +-
 gcc/gengtype-state.c |  2 +-
 gcc/gengtype.c   |  2 +-
 gcc/genhooks.c   |  2 +-
 gcc/genmatch.c   |  2 +-
 gcc/genmodes.c   |  2 +-
 gcc/gensupport.c |  6 +++---
 gcc/gentarget-def.c  |  6 +++---
 gcc/godump.c | 10 +-
 gcc/hash-traits.h|  2 +-
 gcc/lra.c|  2 +-
 gcc/lto-section-in.c |  2 +-
 gcc/lto/lto.c|  2 +-
 gcc/plugin.c | 12 ++--
 gcc/read-md.c|  2 +-
 gcc/read-rtl.c   |  2 +-
 gcc/statistics.c |  2 +-
 gcc/symtab.c | 10 +-
 gcc/system.h | 24 
 gcc/tlink.c  |  8 
 gcc/varasm.c |  8 
 libcc1/libcc1plugin.cc   |  2 +-
 libcc1/libcp1plugin.cc   |  2 +-
 28 files changed, 88 insertions(+), 64 deletions(-)

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 8b721274d3b..11fe1e51b30 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -1828,8 +1828,8 @@ struct excl_hash_traits: typed_noop_remove
 
   static hashval_t hash (const value_type &x)
   {
-hashval_t h1 = htab_hash_string (x.first);
-hashval_t h2 = htab_hash_string (x.second);
+hashval_t h1 = hash_string (x.first);
+hashval_t h2 = hash_string (x.second);
 return h1 ^ h2;
   }
 
diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index aa2ef91c64a..1c1b930c174 100644
--- a/gcc/config/darwin.c
+++ b/

PR83750: CSE erf/erfc pair

2018-11-02 Thread Prathamesh Kulkarni

Hi,
This patch adds two transforms to match.pd to CSE erf/erfc pair.
erfc(x) is canonicalized to 1 - erf(x) and is then reversed to erfc(x)
when canonicalization is disabled and the result of erf(x) has a
single use within 1 - erf(x).

The patch regressed builtin-nonneg-1.c. The following test-case
reproduces the issue with patch:

void test(double d1) {
  if (signbit(erfc(d1)))
    link_failure_erfc();
}

ssa dump:

   :
  _5 = __builtin_erf (d1_4(D));
  _1 = 1.0e+0 - _5;
  _6 = _1 < 0.0;
  _2 = (int) _6;
  if (_2 != 0)
goto ; [INV]
  else
goto ; [INV]

   :
  link_failure_erfc ();

   :
  return;

As can be seen, erfc(d1) is folded to 1 - erf(d1).
forwprop then transforms the if condition from _2 != 0
to _5 > 1.0e+0, which defeats DCE and results in a link failure
with an undefined reference to link_failure_erfc().

So, the patch adds another transform erf(x) > 1 -> 0
which resolves the regression.

Bootstrapped+tested on x86_64-unknown-linux-gnu.
Cross-testing on arm and aarch64 variants in progress.
OK for trunk if it passes?

Thanks,
Prathamesh
2018-11-02  Prathamesh Kulkarni  

* match.pd (erfc(x) -> 1 - erf(x)): New pattern.
(1 - erf(x) -> erfc(x)): Likewise.
(erf(x) > 1 -> 0): Likewise.

testsuite/
* gcc.dg/tree-ssa/pr83750-1.c: New test
* gcc.dg/tree-ssa/pr83750-2.c: Likewise.

diff --git a/gcc/match.pd b/gcc/match.pd
index d07ceb7d087..03e9230a579 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4490,7 +4490,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (targetm.libc_has_function (function_c99_math_complex))
 (complex
  (mult (exps@1 (realpart @0)) (realpart (cexpis:type@2 (imagpart @0
- (mult @1 (imagpart @2)))
+ (mult @1 (imagpart @2))
+
+
+ /* Canonicalize erfc(x) -> 1 - erf(x) */
+ (simplify
+  (ERFC @0)
+   (minus { build_one_cst (TREE_TYPE (@0)); } (ERF @0)))
+
+(if (flag_unsafe_math_optimizations
+ && !canonicalize_math_p())
+
+ /* 1 - erf(x) -> erfc(x)
+This is only done if result of erf() has single use in 1 - erf(x). */
+ (simplify
+  (minus real_onep (ERF@1 @0))
+   (if (single_use (@1))
+(ERFC @0)))
+
+ /* erf(x) > 1 -> 0 */
+ (simplify
+  (gt (ERF @0) real_onep)
+  { integer_zero_node; }))
 
 (if (canonicalize_math_p ())
  /* floor(x) -> trunc(x) if x is nonnegative.  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c
new file mode 100644
index 000..c4d3e428f15
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target c99_runtime } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-optimized" } */
+
+float f1(float x)
+{
+  float g1(float, float);
+
+  float r = __builtin_erff (x);
+  float t = __builtin_erfcf (x);
+  return g1 (r, t);
+}
+
+double f2(double x)
+{
+  double g2(double, double);
+
+  double r = __builtin_erf (x);
+  double t = __builtin_erfc (x);
+  return g2 (r, t);
+}
+
+long double f3(long double x)
+{
+  long double g3(long double, long double);
+
+  long double r = __builtin_erfl (x);
+  long double t = __builtin_erfcl (x);
+  return g3(r, t);
+}
+
+/* { dg-final { scan-tree-dump-not "erfc" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c
new file mode 100644
index 000..60417b38681
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target c99_runtime } */
+/* { dg-options "-O2 -ffast-math -fdump-tree-optimized" } */
+
+/* Check that the canonicalized form 1 - erf(x) is folded to erfc(x). */
+
+float f1(float x)
+{
+  return __builtin_erfcf (x);
+}
+
+double f2(double x)
+{
+  return __builtin_erfc (x);
+}
+
+long double f3(long double x)
+{
+  return __builtin_erfcl (x); 
+}
+
+/* { dg-final { scan-tree-dump-times "erfc" 3 "optimized" } } */

Re: [ARM] Implement division using vrecpe, vrecps

2018-11-02 Thread Prathamesh Kulkarni

On Fri, 26 Oct 2018 at 10:34, Prathamesh Kulkarni
 wrote:
>
> Hi,
> This is a rebased version of patch that adds a pattern to neon.md for
> implementing division with multiplication by reciprocal using
> vrecpe/vrecps with -funsafe-math-optimizations excluding -Os.
> The newly added test-cases are not vectorized on armeb target with
> -O2. I posted the analysis for that here:
> https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01765.html
>
> Briefly, the difference between little and big-endian vectorizer is in
> arm_builtin_support_vector_misalignment() which calls
> default_builtin_support_vector_misalignment() for big-endian case, and
> that returns false because
> movmisalign_optab does not exist for V2SF mode. This isn't observed
> with -O3 because loop peeling for alignment gets enabled.
>
> It seems that the test cases in patch appear unsupported on armeb,
> after r221677 thus this patch requires no changes to
> target-supports.exp to adjust for armeb (unlike last time which
> stalled the patch).
>
> Bootstrap+tested on arm-linux-gnueabihf.
> Cross-tested on arm*-*-* variants.
> OK for trunk ?
ping: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01645.html

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
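
The approach described above, a reciprocal estimate refined by Newton-Raphson steps, can be sketched in scalar C. This is not the neon.md pattern: `recip_estimate` is a hypothetical stand-in for vrecpe that keeps roughly 8 mantissa bits, and `recip_step` models vrecps as computing 2 - a*b.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for vrecpe: a deliberately coarse 1/d (about 8 mantissa bits).
   A real target provides this as an instruction; here it is emulated.  */
static float
recip_estimate (float d)
{
  float r = 1.0f / d;
  uint32_t bits;

  memcpy (&bits, &r, sizeof bits);
  bits &= 0xFFFF8000u;          /* drop the low 15 mantissa bits */
  memcpy (&r, &bits, sizeof r);
  return r;
}

/* Model of vrecps: the Newton-Raphson correction factor 2 - a*b.  */
static float
recip_step (float a, float b)
{
  return 2.0f - a * b;
}

/* n / d computed as n * (1/d), with two refinement steps.  */
static float
div_by_reciprocal (float n, float d)
{
  assert (d != 0.0f);
  float r = recip_estimate (d);

  r = r * recip_step (d, r);    /* relative error e becomes e^2 */
  r = r * recip_step (d, r);
  return n * r;
}
```

Each step roughly doubles the number of correct bits, but the result is not guaranteed to be correctly rounded, which is consistent with the pattern being restricted to -funsafe-math-optimizations.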

Re: [PATCH] avoid -Wnonnull for printf format in dead code (PR 87041)

2018-11-02 Thread Christophe Lyon

On Tue, 30 Oct 2018 at 20:07, Jeff Law  wrote:
>
> On 10/29/18 3:59 PM, Martin Sebor wrote:
> > PR 87041 - -Wformat "reading through null pointer" on unreachable
> > code is a complaint about -Wformat false positives due to null
> > arguments to %s directives in unreachable printf calls.  The warning
> > is issued by the front end, too early to know whether or not the call
> > is ever made.
> >
> > The -Wformat-overflow has had the ability to detect null pointers
> > in %s and similar directives to sprintf calls since GCC 7 without
> > these false positives, but the warning doesn't consider stream or
> > file I/O functions like printf/fprintf.  To resolve the bug report
> > I have enhanced -Wformat-overflow to consider all printf-like
> > functions, including user-defined ones declared attribute format
> > (printf).
> >
> > Besides null pointers the enhancement also makes it possible to
> > detect other problems (like out-of-range arguments and output in
> > excess of INT_MAX bytes).  It also lays the groundwork for
> > checking user-defined printf-like functions for buffer overflow
> > (once a suitable attribute is added to indicate which arguments
> > are the destination buffer pointer and the buffer size).
> >
> > With that, I have removed the null checking from -Wformat (again,
> > only for printf-like functions).
> >
> > Martin
> >
> > gcc-87041.diff
> >
> > PR middle-end/87041 - -Wformat reading through null pointer on unreachable code
> >
> > gcc/ChangeLog:
> >
> >   PR middle-end/87041
> >   * gimple-ssa-sprintf.c (format_directive): Use %G to include
> >   inlining context.
> >   (sprintf_dom_walker::compute_format_length):
> >   Avoid setting POSUNDER4K here.
> >   (get_destination_size): Handle null argument values.
> >   (get_user_idx_format): New function.
> >   (sprintf_dom_walker::handle_gimple_call): Handle all printf-like
> >   functions, including user-defined with attribute format printf.
> >   Use %G to include inlining context.
> >   Set POSUNDER4K here.
> >
> > gcc/c-family/ChangeLog:
> >
> >   PR middle-end/87041
> >   * c-format.c (check_format_types): Avoid diagnosing null pointer
> >   arguments to printf-family of functions.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   PR middle-end/87041
> >   * gcc.c-torture/execute/fprintf-2.c: New test.
> >   * gcc.c-torture/execute/printf-2.c: Same.
> >   * gcc.c-torture/execute/user-printf.c: Same.
> >   * gcc.dg/tree-ssa/builtin-fprintf-warn-1.c: Same.
> >   * gcc.dg/tree-ssa/builtin-printf-2.c: Same.
> >   * gcc.dg/tree-ssa/builtin-printf-warn-1.c: Same.
> >   * gcc.dg/tree-ssa/user-printf-warn-1.c: Same.
> OK.
>

Hi,

I've noticed failures on targets using newlib (aarch64-elf and arm-eabi):
FAIL: gcc.c-torture/execute/printf-2.c
FAIL: gcc.c-torture/execute/user-printf.c

my gcc.log contains:
gcc.c-torture/execute/user-printf.c   -O0  execution test (reason: TCL LOOKUP CHANNEL exp5)
which is not very helpful.


> Note some folks might complain about dropping the warning from the
> front-end.  Their (largely reasonable) argument is that warning out of
> the front-end is stable across releases and doesn't depend on
> optimizations.  Of course the downside of warning out of the front-end
> is false positives like we see in this PR.
>
> jeff
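
A sketch of the kind of user-defined printf-like function that the enhanced -Wformat-overflow is described as covering, assuming GNU C's `format` attribute. The names `log_line` and `report` are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Argument 3 is the format string; variadic arguments start at 4.  */
static int log_line (char *dst, size_t size, const char *fmt, ...)
  __attribute__ ((format (printf, 3, 4)));

static int
log_line (char *dst, size_t size, const char *fmt, ...)
{
  va_list ap;
  int n;

  assert (dst != NULL && fmt != NULL);
  va_start (ap, fmt);
  n = vsnprintf (dst, size, fmt, ap);
  va_end (ap);
  return n;
}

/* Guarding the %s argument keeps a null pointer from ever reaching
   the formatted call.  */
static int
report (char *dst, size_t size, const char *name)
{
  if (name == NULL)
    return log_line (dst, size, "%s", "(null)");
  return log_line (dst, size, "%s", name);
}
```

Because the attribute tells the compiler which parameter is the format string, a middle-end check can reason about the `%s` argument after dead code has been removed, instead of warning on calls that are never made.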

Re: PR83750: CSE erf/erfc pair

2018-11-02 Thread Ulrich Drepper

On Fri, Nov 2, 2018 at 10:36 AM Prathamesh Kulkarni
 wrote:
> So, the patch adds another transform erf(x) > 1 -> 0
> which resolves the regression.

Why don't you match for any constant with absolute value >= 1.0
instead of just 1.0?
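
The bound behind this suggestion can be written out as follows. This is a short derivation from the definition of erf, not something taken from the patch.

```latex
\begin{align*}
\operatorname{erf}(x) &= \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt,
  \qquad -1 < \operatorname{erf}(x) < 1 \ \text{for all finite } x, \\
c \ge 1 &\;\Longrightarrow\; \operatorname{erf}(x) > c \ \text{is always false}, \\
c \le -1 &\;\Longrightarrow\; \operatorname{erf}(x) < c \ \text{is always false}.
\end{align*}
```

In floating point erf(x) rounds to exactly 1 or -1 for large |x|, so only the strict comparisons fold this way; `erf(x) >= 1` can be true at run time.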

Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-11-02 Thread Renlin Li


Hi Peter,

On 11/01/2018 10:07 PM, Peter Bergner wrote:
> On 11/1/18 1:50 PM, Renlin Li wrote:
>> Is there any update on this issue?
>> arm-none-linux-gnueabihf native toolchain has been mis-compiled for a while.
>
> From the analysis I've done, my commit is just exposing latent issues
> in LRA.

Yes, it looks like some latent issues are being exposed.

> Can you try the patch I submitted here to see if it helps?
>
>   https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01757.html

Thanks for the patch! I'll help to test the patch and let you know the status.

Thanks,
Renlin

> It survives on powerpc64le-linux, x86_64-linux and s390x-linux.
> Jeff threw it on his testers and said he saw an arm issue and was
> trying to come up with a test case for me to debug.
>
> The specific issue you mentioned with the inline asm and the casp insn
> is a bug in LRA where it will spill a user defined hard register and
> it shouldn't do that.  My patch above stops that.  The question is
> whether we've quashed the rest of the latent bugs.
>
> Peter

Re: [PATCH] x86: Update VFIXUPIMM* Intrinsics to align with the latest Intel SDM

2018-11-02 Thread Wei Xiao

Hi Uros and HJ,

I have updated the patch according to your remarks as attached.
Ok for trunk?

Thanks
Wei

gcc/
2018-11-2 Wei Xiao 

*config/i386/avx512fintrin.h: Update VFIXUPIMM* intrinsics.
(_mm512_fixupimm_round_pd): Update parameters and builtin.
(_mm512_maskz_fixupimm_round_pd): Ditto.
(_mm512_fixupimm_round_ps): Ditto.
(_mm512_maskz_fixupimm_round_ps): Ditto.
(_mm_fixupimm_round_sd): Ditto.
(_mm_maskz_fixupimm_round_sd): Ditto.
(_mm_fixupimm_round_ss): Ditto.
(_mm_maskz_fixupimm_round_ss): Ditto.
(_mm512_fixupimm_pd): Ditto.
(_mm512_maskz_fixupimm_pd): Ditto.
(_mm512_fixupimm_ps): Ditto.
(_mm512_maskz_fixupimm_ps): Ditto.
(_mm_fixupimm_sd): Ditto.
(_mm_maskz_fixupimm_sd): Ditto.
(_mm_fixupimm_ss): Ditto.
(_mm_maskz_fixupimm_ss): Ditto.
(_mm512_mask_fixupimm_round_pd): Update builtin.
(_mm512_mask_fixupimm_round_ps): Ditto.
(_mm_mask_fixupimm_round_sd): Ditto.
(_mm_mask_fixupimm_round_ss): Ditto.
(_mm512_mask_fixupimm_pd): Ditto.
(_mm512_mask_fixupimm_ps): Ditto.
(_mm_mask_fixupimm_sd): Ditto.
(_mm_mask_fixupimm_ss): Ditto.
*config/i386/avx512vlintrin.h:
(_mm256_fixupimm_pd): Update parameters and builtin.
(_mm256_maskz_fixupimm_pd): Ditto.
(_mm256_fixupimm_ps): Ditto.
(_mm256_maskz_fixupimm_ps): Ditto.
(_mm_fixupimm_pd): Ditto.
(_mm_maskz_fixupimm_pd): Ditto.
(_mm_fixupimm_ps): Ditto.
(_mm_maskz_fixupimm_ps): Ditto.
(_mm256_mask_fixupimm_pd): Update builtin.
(_mm256_mask_fixupimm_ps): Ditto.
(_mm_mask_fixupimm_pd): Ditto.
(_mm_mask_fixupimm_ps): Ditto.
*config/i386/i386-builtin-types.def: Add new types and remove
useless ones.
*config/i386/i386-builtin.def: Update builtin definitions.
*config/i386/i386.c: Handle new builtin types and remove useless ones.
*config/i386/sse.md: Update VFIXUPIMM* patterns.
(_fixupimm_maskz): Update.
(_fixupimm): Update.
(_fixupimm_mask): Update.
(avx512f_sfixupimm_maskz): Update.
(avx512f_sfixupimm): Update.
(avx512f_sfixupimm_mask): Update.
*config/i386/subst.md:
(round_saeonly_sd_mask_operand4): Add new subst_attr.
(round_saeonly_sd_mask_op4): Ditto.
(round_saeonly_expand_operand5): Ditto.
(round_saeonly_expand): Update.

gcc/testsuite
2018-11-2 Wei Xiao 

*gcc.target/i386/avx-1.c: Update tests for VFIXUPIMM* intrinsics.
*gcc.target/i386/avx512f-vfixupimmpd-1.c: Ditto.
*gcc.target/i386/avx512f-vfixupimmpd-2.c: Ditto.
*gcc.target/i386/avx512f-vfixupimmps-1.c: Ditto.
*gcc.target/i386/avx512f-vfixupimmsd-1.c: Ditto.
*gcc.target/i386/avx512f-vfixupimmsd-2.c: Ditto.
*gcc.target/i386/avx512f-vfixupimmss-1.c: Ditto.
*gcc.target/i386/avx512f-vfixupimmss-2.c: Ditto.
*gcc.target/i386/avx512vl-vfixupimmpd-1.c: Ditto.
*gcc.target/i386/avx512vl-vfixupimmps-1.c: Ditto.
*gcc.target/i386/sse-13.c: Ditto.
*gcc.target/i386/sse-14.c: Ditto.
*gcc.target/i386/sse-22.c: Ditto.
*gcc.target/i386/sse-23.c: Ditto.
*gcc.target/i386/testimm-10.c: Ditto.
*gcc.target/i386/testround-1.c: Ditto.

On Fri, Nov 2, 2018 at 1:27 AM Uros Bizjak wrote:
>
> On Tue, Oct 30, 2018 at 10:12 AM Wei Xiao  wrote:
> >
> > Hi,
> >
> > The attached patch updates VFIXUPIMM* Intrinsics to align with the
> > latest Intel® 64 and IA-32 Architectures Software Developer’s Manual
> > (SDM).
> > Tested with GCC regression test on x86, no regression.
>
> A couple of remarks:
>
> -_mm512_fixupimm_round_pd (__m512d __A, __m512d __B, __m512i __C,
> +_mm512_fixupimm_round_pd (__m512d __B, __m512i __C,
>
>  _mm512_mask_fixupimm_round_pd (__m512d __A, __mmask8 __U, __m512d __B,
> __m512i __C, const int __imm, const int __R)
>
> Some kind of the convention in avx512fintrin.h is that arguments are
> named like this:
>
> [ __m512. __W,] __mmask. __U, __m512x __A, __m512x __B, ..., const int
> _imm, const int __R]. Can we please keep the same approach here? I'm
> mostly concerned that argument names don't start with __A.
>
> -BDESC (OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_fixupimmv4df_mask,
> "__builtin_ia32_fixupimmpd256_mask", IX86_BUILTIN_FIXUPIMMPD256_MASK,
> UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI)
> ...
>
> You are removing the only users of e.g.
> V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI (and other definitions). If there
> are no users left, can you also remove the relevant definitions?
>
> > Is it ok?
>
> Please repost the patch with above remarks addressed. These builtins
> are mostly Intel affair, so in the hope that extensive testsuite in
> this area catches all issues, I will just rubber-stamp the
> updated patch OK and leave the final approval to HJ

[PATCH, OpenACC] Update documentation to mention OpenACC 2.5

2018-11-02 Thread Chung-Lin Tang


Hi Thomas,
this patch (mostly by yourself :) ) contains the documentation changes to now
state OpenACC 2.5 support.
I believe this is within your maintainership scope.

A part in libgomp/libgomp.texi mentions the ACC_PROFLIB variable. I assume we
are going to get the profiling patches applied in time, so I have left that as is.

Okay for trunk?

Thanks,
Chung-Lin
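
As an illustration of what the `_OPENACC` bump means for user code, the sketch below maps the two date stamps that appear in this patch to the specification versions named in the documentation change. The function names are hypothetical, and stamps other than these two are deliberately left unhandled.

```c
#include <assert.h>
#include <stddef.h>

/* Map an _OPENACC date stamp (yyyymm) to a specification version.
   Only the two values mentioned in the patch are recognized.  */
static const char *
openacc_spec_version (long stamp)
{
  assert (stamp > 0);
  switch (stamp)
    {
    case 201510L:
      return "2.5";
    case 201306L:
      return "2.0";
    default:
      return "unknown";
    }
}

/* Version this translation unit was compiled against, or NULL when
   -fopenacc was not given.  */
static const char *
openacc_compiled_version (void)
{
#ifdef _OPENACC
  return openacc_spec_version (_OPENACC);
#else
  return NULL;
#endif
}
```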

2018-11-xx  Thomas Schwinge 

gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510".

gcc/fortran/
* cpp.c (cpp_define_builtins): Update "_OPENACC" to "201510".
* gfortran.texi: Update for OpenACC 2.5.
* intrinsic.texi: Likewise.
* invoke.texi: Likewise.

gcc/testsuite/
* c-c++-common/cpp/openacc-define-3.c: Update.
* gfortran.dg/openacc-define-3.f90: Likewise.

gcc/
* doc/invoke.texi: Update for OpenACC 2.5.

libgomp/
* libgomp.texi: Update for OpenACC 2.5.
* openacc.f90 (openacc_version): Update to "201510".
* openacc_lib.h (openacc_version): Likewise.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Update.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Update.
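
As an illustration of what the "_OPENACC" bump means for user code, here is a
minimal sketch in C. The helper names are hypothetical and not part of the
patch; the only value taken from the patch is 201510, and the macro is assumed
to be defined only when compiling with -fopenacc.

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch: return the value of the
   _OPENACC macro, or 0 when this file is compiled without -fopenacc.  */
static long
openacc_macro_value (void)
{
#ifdef _OPENACC
  return _OPENACC;
#else
  return 0;
#endif
}

/* Hypothetical helper: true if VALUE advertises at least the level
   that this patch sets (201510).  */
static int
advertises_openacc_25 (long value)
{
  assert (value >= 0);
  return value >= 201510L;
}
```

Code that previously tested for 201306 with "==" would stop matching after
this change, so a ">=" comparison as above is the more robust form.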

Index: gcc/c-family/c-cppbuiltin.c
===
--- gcc/c-family/c-cppbuiltin.c (revision 265711)
+++ gcc/c-family/c-cppbuiltin.c (working copy)
@@ -1396,7 +1396,7 @@ c_cpp_builtins (cpp_reader *pfile)
 cpp_define (pfile, "__SSP__=1");
 
   if (flag_openacc)
-cpp_define (pfile, "_OPENACC=201306");
+cpp_define (pfile, "_OPENACC=201510");
 
   if (flag_openmp)
 cpp_define (pfile, "_OPENMP=201511");
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 265711)
+++ gcc/doc/invoke.texi (working copy)
@@ -2178,10 +2178,12 @@ freestanding and hosted environments.
 Enable handling of OpenACC directives @code{#pragma acc} in C/C++ and
 @code{!$acc} in Fortran.  When @option{-fopenacc} is specified, the
 compiler generates accelerated code according to the OpenACC Application
-Programming Interface v2.0 @w{@uref{https://www.openacc.org}}.  This option
+Programming Interface v2.5 @w{@uref{https://www.openacc.org}}.  This option
 implies @option{-pthread}, and thus is only supported on targets that
 have support for @option{-pthread}.
 
+See @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
+
 @item -fopenacc-dim=@var{geom}
 @opindex fopenacc-dim
 @cindex OpenACC accelerator programming
Index: gcc/fortran/cpp.c
===
--- gcc/fortran/cpp.c   (revision 265711)
+++ gcc/fortran/cpp.c   (working copy)
@@ -166,7 +166,7 @@ cpp_define_builtins (cpp_reader *pfile)
   cpp_define (pfile, "_LANGUAGE_FORTRAN=1");
 
   if (flag_openacc)
-cpp_define (pfile, "_OPENACC=201306");
+cpp_define (pfile, "_OPENACC=201510");
 
   if (flag_openmp)
 cpp_define (pfile, "_OPENMP=201511");
Index: gcc/fortran/gfortran.texi
===
--- gcc/fortran/gfortran.texi   (revision 265711)
+++ gcc/fortran/gfortran.texi   (working copy)
@@ -476,9 +476,7 @@ used on real-world programs.  In particular, the s
 include OpenMP, Cray-style pointers, some old vendor extensions, and several
 Fortran 2003 and Fortran 2008 features, including TR 15581.  However, it is
 still under development and has a few remaining rough edges.
-There also is initial support for OpenACC.
-Note that this is an experimental feature, incomplete, and subject to
-change in future versions of GCC.  See
+There also is support for OpenACC.  See
 @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
 
 At present, the GNU Fortran compiler passes the
@@ -538,10 +536,8 @@ status} and @ref{Fortran 2018 status} sections of
 Additionally, the GNU Fortran compilers supports the OpenMP specification
 (version 4.0 and most of the features of the 4.5 version,
 @url{http://openmp.org/@/wp/@/openmp-specifications/}).
-There also is initial support for the OpenACC specification (targeting
-version 2.0, @uref{http://www.openacc.org/}).
-Note that this is an experimental feature, incomplete, and subject to
-change in future versions of GCC.  See
+There also is support for the OpenACC specification (targeting
+version 2.5, @uref{http://www.openacc.org/}).  See
 @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
 
 @node Varying Length Character Strings
@@ -2178,7 +2174,7 @@ influence run-time behavior.
 
 GNU Fortran strives to be compatible to the
 @uref{http://www.openacc.org/, OpenACC Application Programming
-Interface v2.0}.
+Interface v2.5}.
 
 To enable the processing of the OpenACC directive @code{!$acc} in
 free-form source code; the @code{c$acc}, @code{*$acc} and @code{!$acc}
@@ -2194,9 +2190,7 @@ The OpenACC Fortran runtime library routines are p
 form of a Fortran 90 module named @code{o

Re: [PATCH v2] bring netbsd/arm support up to speed. eabi, etc.

On 31/10/2018 22:04, co...@sdf.org wrote:
> On Wed, Oct 31, 2018 at 03:23:27PM +, Richard Earnshaw (lists) wrote:
>> On 31/10/2018 14:10, co...@sdf.org wrote:
>>> +
>>> +# Currently there is a bug somewhere in GCC's alias analysis
>>> +# or scheduling code that is breaking _fpmul_parts in fp-bit.c.
>>> +# Disabling function inlining is a workaround for this problem.
>>> +HOST_LIBGCC2_CFLAGS += -fno-inline
>>
>> This needs to be investigated properly (and fixed if it's still a problem).
>>
>> R.
> 
> After some VCS digging, it turns out you committed this change:
> https://github.com/gcc-mirror/gcc/commit/cffb2a26c44c682185b6bb405d48fcbe1fbc0b37
> 
> NetBSD copied it over from existing GCC files, and it still exists in
> GCC trunk, in libgcc/config/arm/t-elf.
> 

Sorry about that.  You don't really expect me to remember every patch I
committed 18 years ago!

And pedantically, that was a branch merge patch.  The original commit
(back in the CVS days) was:

  revision 1.9.2.1
  date: 1999/10/25 17:47:02;  author: [redacted];  state: Exp;  lines: +34 -10
  Initial check in of merged arm/thumb backend.

However, the age of this makes me suspect that it quite likely is not
relevant any more and that we should investigate whether it is safe to
remove.  We're running some tests here, but can you test the NetBSD port
without that as well for another data point?

R.

Re: [PATCH/AARCH64] Add OcteonTX for -mcpu=

On 01/11/2018 01:52, Andrew Pinski wrote:
> On Tue, Oct 30, 2018 at 10:21 AM Richard Earnshaw (lists)
>  wrote:
>>
>> On 30/10/2018 17:06, Andrew Pinski wrote:
>>> Hi all,
>>>   There was a name change of the Products, ThunderX T81 and ThunderX
>>> T83 to OcteonTX family name.  This change was done a few years ago but
>>> I had not submitted the change at that time.  This is also the first
>>> patch in a series to add OcteonTX 2 support to GCC.
>>>
>>> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regression.
>>>
>>
>> You're missing a documentation update.
> 
> Oops.  I knew I missed that too.
> Here is the updated patch with the doc update included.
> 
> Thanks,
> Andrew Pinski
> 
> gcc/ChangeLog:
> * config/aarch64/aarch64-cores.def (octeontx): New.
> (octeontx81): Likewise.
> (octeontx83): Likewise.
> * config/aarch64/aarch64-tune.md: Regenerate.
> * doc/invoke.texi (AArch64 Options) [mtune]: Add octeontx, octeontx81
> and octeontx83.
> 

OK.

R.

>>
>> R.
>>
>>> Thanks,
>>> Andrew Pinski
>>>
>>> gcc/ChangeLog:
>>> * config/aarch64/aarch64-cores.def (octeontx): New.
>>> (octeontx81): Likewise.
>>> (octeontx83): Likewise.
>>> * config/aarch64/aarch64-tune.md: Regenerate.
>>>
>>>
>>> addoctx.diff.txt
>>>
>>> Index: gcc/config/aarch64/aarch64-cores.def
>>> ===
>>> --- gcc/config/aarch64/aarch64-cores.def  (revision 265605)
>>> +++ gcc/config/aarch64/aarch64-cores.def  (working copy)
>>> @@ -58,6 +58,12 @@
>>> this order is required to handle variant correctly. */
>>>  AARCH64_CORE("thunderxt88p1", thunderxt88p1, thunderx,  8A,  
>>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO,  
>>> thunderxt88,  0x43, 0x0a1, 0)
>>>  AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  8A,  
>>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderxt88,  
>>> 0x43, 0x0a1, -1)
>>> +
>>> +/* OcteonTX is the official name for T81/T83. */
>>> +AARCH64_CORE("octeontx",  octeontx,  thunderx,  8A,  
>>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
>>> 0x0a0, -1)
>>> +AARCH64_CORE("octeontx81",octeontxt81,   thunderx,  8A,  
>>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
>>> 0x0a2, -1)
>>> +AARCH64_CORE("octeontx83",octeontxt83,   thunderx,  8A,  
>>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
>>> 0x0a3, -1)
>>> +
>>>  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
>>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
>>> 0x0a2, -1)
>>>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
>>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
>>> 0x0a3, -1)
>>>
>>> Index: gcc/config/aarch64/aarch64-tune.md
>>> ===
>>> --- gcc/config/aarch64/aarch64-tune.md(revision 265605)
>>> +++ gcc/config/aarch64/aarch64-tune.md(working copy)
>>> @@ -1,5 +1,5 @@
>>>  ;; -*- buffer-read-only: t -*-
>>>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>>>  (define_attr "tune"
>>> - 
>>> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>>> + 
>>> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>>>   (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
>>>
>>
>>
>> addoctx.diff.txt
>>
>> Index: gcc/config/aarch64/aarch64-cores.def
>> ===
>> --- gcc/config/aarch64/aarch64-cores.def (revision 265702)
>> +++ gcc/config/aarch64/aarch64-cores.def (working copy)
>> @@ -58,6 +58,12 @@
>> this order is required to handle variant correctly. */
>>  AARCH64_CORE("thunderxt88p1", thunderxt88p1, thunderx,  8A,  
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderxt88,  
>> 0x43, 0x0a1, 0)
>>  AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  8A,  
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderxt88,  
>> 0x43, 0x0a1, -1)
>> +
>> +/* OcteonTX is the official name for T81/T83. */
>> +AARCH64_CORE("octeontx",  octeontx,  thunderx,  8A,  
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
>> 0x0a0, -1)
>> +AARCH64_CORE("octeontx81",octeontxt81,   thunderx,  8A,  
>> AARC

[C++ PATCH] refactor duplicate_decls

2018-11-02 Thread Nathan Sidwell

duplicate_decls is one of the more complex fns in decl.c, and I need to 
make it more complicated.  But first some refactoring, so it's a little 
more understandable.  Generally moving warning checks later when we know 
we've actually got a duplicate, and splitting up some conflict checking.


Applying to trunk after an x86_64-linux bootstrap.

nathan
--
Nathan Sidwell
2018-11-01  Nathan Sidwell  

	gcc/cp/
	* decl.c (duplicate_decls): Refactor checks.
	* name-lookup.c (name_lookup::process_module_binding): Only public
	namespaces are shared.

	gcc/testsuite/
	* g++.dg/lookup/crash6.C: Adjust error.
	* g++.dg/parse/crash38.C: Likewise.

Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c	(revision 265743)
+++ gcc/cp/decl.c	(working copy)
@@ -1370,47 +1370,6 @@ duplicate_decls (tree newdecl, tree oldd
   || TREE_TYPE (olddecl) == error_mark_node)
 return error_mark_node;
 
-  if (DECL_NAME (newdecl)
-  && DECL_NAME (olddecl)
-  && UDLIT_OPER_P (DECL_NAME (newdecl))
-  && UDLIT_OPER_P (DECL_NAME (olddecl)))
-{
-  if (TREE_CODE (newdecl) == TEMPLATE_DECL
-	  && TREE_CODE (olddecl) != TEMPLATE_DECL
-	  && check_raw_literal_operator (olddecl))
-	error_at (newdecl_loc,
-		  "literal operator template %qD conflicts with"
-		  " raw literal operator %qD", newdecl, olddecl);
-  else if (TREE_CODE (newdecl) != TEMPLATE_DECL
-	   && TREE_CODE (olddecl) == TEMPLATE_DECL
-	   && check_raw_literal_operator (newdecl))
-	error_at (newdecl_loc,
-		  "raw literal operator %qD conflicts with"
-		  " literal operator template %qD", newdecl, olddecl);
-}
-
-  /* True to merge attributes between the declarations, false to
- set OLDDECL's attributes to those of NEWDECL (for template
- explicit specializations that specify their own attributes
- independent of those specified for the primary template).  */
-  const bool merge_attr = (TREE_CODE (newdecl) != FUNCTION_DECL
-			   || !DECL_TEMPLATE_SPECIALIZATION (newdecl)
-			   || DECL_TEMPLATE_SPECIALIZATION (olddecl));
-
-  if (DECL_P (olddecl)
-  && TREE_CODE (newdecl) == FUNCTION_DECL
-  && TREE_CODE (olddecl) == FUNCTION_DECL
-  && merge_attr
-  && diagnose_mismatched_attributes (olddecl, newdecl))
-{
-  if (DECL_INITIAL (olddecl))
-	inform (olddecl_loc,
-		"previous definition of %qD was here", olddecl);
-  else
-	inform (olddecl_loc,
-		"previous declaration of %qD was here", olddecl);
-}
-
   /* Check for redeclaration and other discrepancies.  */
   if (TREE_CODE (olddecl) == FUNCTION_DECL
   && DECL_ARTIFICIAL (olddecl))
@@ -1634,38 +1593,45 @@ next_arg:;
   /* C++ Standard, 3.3, clause 4:
 	 "[Note: a namespace name or a class template name must be unique
 	 in its declarative region (7.3.2, clause 14). ]"  */
-  if (TREE_CODE (olddecl) != NAMESPACE_DECL
-	  && TREE_CODE (newdecl) != NAMESPACE_DECL
-	  && (TREE_CODE (olddecl) != TEMPLATE_DECL
-	  || TREE_CODE (DECL_TEMPLATE_RESULT (olddecl)) != TYPE_DECL)
-	  && (TREE_CODE (newdecl) != TEMPLATE_DECL
-	  || TREE_CODE (DECL_TEMPLATE_RESULT (newdecl)) != TYPE_DECL))
-	{
-	  if ((TREE_CODE (olddecl) == TYPE_DECL && DECL_ARTIFICIAL (olddecl)
-	   && TREE_CODE (newdecl) != TYPE_DECL)
-	  || (TREE_CODE (newdecl) == TYPE_DECL && DECL_ARTIFICIAL (newdecl)
-		  && TREE_CODE (olddecl) != TYPE_DECL))
-	{
-	  /* We do nothing special here, because C++ does such nasty
-		 things with TYPE_DECLs.  Instead, just let the TYPE_DECL
-		 get shadowed, and know that if we need to find a TYPE_DECL
-		 for a given name, we can look in the IDENTIFIER_TYPE_VALUE
-		 slot of the identifier.  */
-	  return NULL_TREE;
-	}
-	
-	if ((TREE_CODE (newdecl) == FUNCTION_DECL
-		 && DECL_FUNCTION_TEMPLATE_P (olddecl))
-		|| (TREE_CODE (olddecl) == FUNCTION_DECL
-		&& DECL_FUNCTION_TEMPLATE_P (newdecl)))
-	  return NULL_TREE;
+  if (TREE_CODE (olddecl) == NAMESPACE_DECL
+	  || TREE_CODE (newdecl) == NAMESPACE_DECL)
+	/* Namespace conflicts with not namespace.  */;
+  else if (DECL_TYPE_TEMPLATE_P (olddecl)
+	   || DECL_TYPE_TEMPLATE_P (newdecl))
+	/* Class template conflicts.  */;
+  else if ((TREE_CODE (newdecl) == FUNCTION_DECL
+		&& DECL_FUNCTION_TEMPLATE_P (olddecl))
+	   || (TREE_CODE (olddecl) == FUNCTION_DECL
+		   && DECL_FUNCTION_TEMPLATE_P (newdecl)))
+	{
+	  /* One is a function and the other is a template
+	 function.  */
+	  if (!UDLIT_OPER_P (DECL_NAME (newdecl)))
+	return NULL_TREE;
+
+	  /* There can only be one!  */
+	  if (TREE_CODE (newdecl) == TEMPLATE_DECL
+	  && check_raw_literal_operator (olddecl))
+	error_at (newdecl_loc,
+		  "literal operator %q#D conflicts with"
+		  " raw literal operator", newdecl);
+	  else if (check_raw_literal_operator (newdecl))
+	error_at (newdecl_loc,
+		  "raw literal operator %q#D conflicts with"
+		  " literal operator template", newdecl);
+	  else
+

Re: [PATCH] x86: Update VFIXUPIMM* Intrinsics to align with the latest Intel SDM

2018-11-02 Thread Uros Bizjak

On Fri, Nov 2, 2018 at 11:12 AM Wei Xiao  wrote:
>
> Hi Uros and HJ,
>
> I have updated the patch according to your remarks as attached.
> Ok for trunk?
>
> Thanks
> Wei
>
> gcc/
> 2018-11-2 Wei Xiao 
>
> *config/i386/avx512fintrin.h: Update VFIXUPIMM* intrinsics.
> (_mm512_fixupimm_round_pd): Update parameters and builtin.
> (_mm512_maskz_fixupimm_round_pd): Ditto.
> (_mm512_fixupimm_round_ps): Ditto.
> (_mm512_maskz_fixupimm_round_ps): Ditto.
> (_mm_fixupimm_round_sd): Ditto.
> (_mm_maskz_fixupimm_round_sd): Ditto.
> (_mm_fixupimm_round_ss): Ditto.
> (_mm_maskz_fixupimm_round_ss): Ditto.
> (_mm512_fixupimm_pd): Ditto.
> (_mm512_maskz_fixupimm_pd): Ditto.
> (_mm512_fixupimm_ps): Ditto.
> (_mm512_maskz_fixupimm_ps): Ditto.
> (_mm_fixupimm_sd): Ditto.
> (_mm_maskz_fixupimm_sd): Ditto.
> (_mm_fixupimm_ss): Ditto.
> (_mm_maskz_fixupimm_ss): Ditto.
> (_mm512_mask_fixupimm_round_pd): Update builtin.
> (_mm512_mask_fixupimm_round_ps): Ditto.
> (_mm_mask_fixupimm_round_sd): Ditto.
> (_mm_mask_fixupimm_round_ss): Ditto.
> (_mm512_mask_fixupimm_pd): Ditto.
> (_mm512_mask_fixupimm_ps): Ditto.
> (_mm_mask_fixupimm_sd): Ditto.
> (_mm_mask_fixupimm_ss): Ditto.
> *config/i386/avx512vlintrin.h:
> (_mm256_fixupimm_pd): Update parameters and builtin.
> (_mm256_maskz_fixupimm_pd): Ditto.
> (_mm256_fixupimm_ps): Ditto.
> (_mm256_maskz_fixupimm_ps): Ditto.
> (_mm_fixupimm_pd): Ditto.
> (_mm_maskz_fixupimm_pd): Ditto.
> (_mm_fixupimm_ps): Ditto.
> (_mm_maskz_fixupimm_ps): Ditto.
> (_mm256_mask_fixupimm_pd): Update builtin.
> (_mm256_mask_fixupimm_ps): Ditto.
> (_mm_mask_fixupimm_pd): Ditto.
> (_mm_mask_fixupimm_ps): Ditto.
> *config/i386/i386-builtin-types.def: Add new types and remove
> useless ones.
> *config/i386/i386-builtin.def: Update builtin definitions.
> *config/i386/i386.c: Handle new builtin types and remove useless ones.
> *config/i386/sse.md: Update VFIXUPIMM* patterns.
> (_fixupimm_maskz): Update.
> (_fixupimm): Update.
> (_fixupimm_mask): Update.
> (avx512f_sfixupimm_maskz): Update.
> (avx512f_sfixupimm): Update.
> (avx512f_sfixupimm_mask): Update.
> *config/i386/subst.md:
> (round_saeonly_sd_mask_operand4): Add new subst_attr.
> (round_saeonly_sd_mask_op4): Ditto.
> (round_saeonly_expand_operand5): Ditto.
> (round_saeonly_expand): Update.
>
> gcc/testsuite
> 2018-11-2 Wei Xiao 
>
> *gcc.target/i386/avx-1.c: Update tests for VFIXUPIMM* intrinsics.
> *gcc.target/i386/avx512f-vfixupimmpd-1.c: Ditto.
> *gcc.target/i386/avx512f-vfixupimmpd-2.c: Ditto.
> *gcc.target/i386/avx512f-vfixupimmps-1.c: Ditto.
> *gcc.target/i386/avx512f-vfixupimmsd-1.c: Ditto.
> *gcc.target/i386/avx512f-vfixupimmsd-2.c: Ditto.
> *gcc.target/i386/avx512f-vfixupimmss-1.c: Ditto.
> *gcc.target/i386/avx512f-vfixupimmss-2.c: Ditto.
> *gcc.target/i386/avx512vl-vfixupimmpd-1.c: Ditto.
> *gcc.target/i386/avx512vl-vfixupimmps-1.c: Ditto.
> *gcc.target/i386/sse-13.c: Ditto.
> *gcc.target/i386/sse-14.c: Ditto.
> *gcc.target/i386/sse-22.c: Ditto.
> *gcc.target/i386/sse-23.c: Ditto.
> *gcc.target/i386/testimm-10.c: Ditto.
> *gcc.target/i386/testround-1.c: Ditto.

Please also rename these:

 _mm512_mask_fixupimm_round_pd (__m512d __A, __mmask8 __U, __m512d __B,
__m512i __C, const int __imm, const int __R)

 _mm512_mask_fixupimm_round_ps (__m512 __A, __mmask16 __U, __m512 __B,
__m512i __C, const int __imm, const int __R)

 _mm_mask_fixupimm_round_sd (__m128d __A, __mmask8 __U, __m128d __B,
 __m128i __C, const int __imm, const int __R)

 _mm_mask_fixupimm_round_ss (__m128 __A, __mmask8 __U, __m128 __B,
 __m128i __C, const int __imm, const int __R)

 _mm512_mask_fixupimm_pd (__m512d __A, __mmask8 __U, __m512d __B,
  __m512i __C, const int __imm)

_mm512_mask_fixupimm_ps (__m512 __A, __mmask16 __U, __m512 __B,
  __m512i __C, const int __imm)

 _mm_mask_fixupimm_sd (__m128d __A, __mmask8 __U, __m128d __B,
   __m128i __C, const int __imm)

 _mm_mask_fixupimm_ss (__m128 __A, __mmask8 __U, __m128 __B,
   __m128i __C, const int __imm)

 _mm256_mask_fixupimm_pd (__m256d __A, __mmask8 __U, __m256d __B,
  __m256i __C, const int __imm)

 _mm256_mask_fixupimm_ps (__m256 __A, __mmask8 __U, __m256 __B,
  __m256i __C, const int __imm)

  _mm_mask_fixupimm_pd (__m128d __A, __mmask8 __U, __m128d __B,
   __m128i __C, const int __imm)

 _mm_mask_fixupimm_ps (__m128

[Patch][gdb] Initialise quiet flag for "info functions"

2018-11-02 Thread Matthew Malcomson

With this flag unset, using 'info functions' without a set quiet flag
was not deterministic and was causing some flaky test failures.

Failures seen in (at least):
gdb.base/info_qt.exp
gdb.dwarf2/dw2-case-insensitive.exp
gdb.base/info-fun.exp

Ok for trunk?
I don't have commit rights.

gdb/ChangeLog:

2018-11-02  Matthew Malcomson  

* symtab.c (info_functions_command): Initialise quiet flag.



### Attachment also inlined for ease of reply###


diff --git a/gdb/symtab.c b/gdb/symtab.c
index 
cd27a75e8ca2370a9d11ae6057d051ca6ce13f90..7649908d9c9341ad695626e0a22a085f2af302ef
 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -4760,7 +4760,7 @@ info_functions_command (const char *args, int from_tty)
 {
   std::string regexp;
   std::string t_regexp;
-  bool quiet;
+  bool quiet = false;
 
   while (args != NULL
 && extract_info_print_args (&args, &quiet, &regexp, &t_regexp))
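
The failure mode described above can be sketched in isolation. This is a
minimal illustration, not gdb code: parse_quiet is a hypothetical stand-in
for an option parser, and it is assumed (as the patch implies) that the real
parser writes the flag only when the option is actually present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an option parser: it sets *QUIET only when
   ARGS starts with "-q" and otherwise leaves *QUIET untouched.  */
static bool
parse_quiet (const char *args, bool *quiet)
{
  assert (quiet != NULL);
  if (args != NULL && strncmp (args, "-q", 2) == 0)
    {
      *quiet = true;
      return true;
    }
  return false;
}

/* Caller in the style of the fixed code.  Without the initialiser,
   QUIET would hold an indeterminate value whenever "-q" is absent.  */
static bool
info_command_is_quiet (const char *args)
{
  bool quiet = false;

  parse_quiet (args, &quiet);
  return quiet;
}
```

With the initialiser in place the result depends only on the arguments, which
is what makes the tests deterministic.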


[PATCH] Obstackify coalesce list, remove bitmap indirection



The following should improve memory locality.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-11-02  Richard Biener  

PR rtl-optimization/87852
* tree-ssa-coalesce.c (struct coalesce_list): Add obstack member.
(pop_cost_one_pair): Do not free pair.
(pop_best_coalesce): Likewise.
(create_coalesce_list): Initialize obstack.
(delete_coalesce_list): Free obstack.
(find_coalesce_pair): Obstack-allocate coalesce pairs.
(add_cost_one_coalesce): Likewise.
(struct live_track): Remove bitmap pointer indirections.
(new_live_track): Adjust.
(delete_live_track): Likewise.
(live_track_remove_partition): Likewise.
(live_track_add_partition): Likewise.
(live_track_live_p): Likewise.
(live_track_process_def): Likewise.
(live_track_clear_base_vars): Likewise.
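
To illustrate the idea behind the coalesce-list part of the change (many small
nodes carved out of one region and released together, instead of one
malloc/free per node), here is a simplified sketch. It is not GCC's obstack:
struct arena and its functions are hypothetical, and unlike an obstack this
arena is a single fixed-size block that never grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size arena; a real obstack grows in chunks.  */
struct arena
{
  unsigned char *base;
  size_t used;
  size_t size;
};

static int
arena_init (struct arena *a, size_t size)
{
  a->base = malloc (size);
  a->used = 0;
  a->size = a->base != NULL ? size : 0;
  return a->base != NULL;
}

/* Bump-allocate N bytes, rounded up to a 16-byte multiple so that
   consecutive objects stay aligned.  Returns NULL when the arena is full.  */
static void *
arena_alloc (struct arena *a, size_t n)
{
  const size_t align = 16;
  void *p;

  assert (a->used <= a->size);
  if (n > a->size)
    return NULL;
  n = (n + align - 1) & ~(align - 1);
  if (n > a->size - a->used)
    return NULL;
  p = a->base + a->used;
  a->used += n;
  return p;
}

/* Release every object allocated from the arena in one call.  */
static void
arena_release (struct arena *a)
{
  free (a->base);
  a->base = NULL;
  a->used = a->size = 0;
}

struct pair
{
  int first;
  int second;
  struct pair *next;
};

/* Build a COUNT-element list inside an arena, sum the FIRST fields and
   free the whole list with a single release.  Returns -1 on failure.  */
static int
sum_pairs_in_arena (int count)
{
  struct arena a;
  struct pair *head = NULL;
  int sum = 0;

  if (!arena_init (&a, 4096))
    return -1;
  for (int i = 0; i < count; i++)
    {
      struct pair *p = arena_alloc (&a, sizeof *p);
      if (p == NULL)
	break;
      p->first = i;
      p->second = i + 1;
      p->next = head;
      head = p;
    }
  for (struct pair *p = head; p != NULL; p = p->next)
    sum += p->first;
  arena_release (&a);
  return sum;
}
```

Because nodes are handed out from adjacent addresses, walking the list touches
contiguous memory, and popping a node no longer needs its own free call.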

diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 83b0c1fec8a..6ae9bb90efa 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -135,6 +135,7 @@ struct coalesce_list
   coalesce_pair **sorted;  /* List when sorted.  */
   int num_sorted;  /* Number in the sorted list.  */
   cost_one_pair *cost_one_list;/* Single use coalesces with cost 1.  */
+  obstack ob;
 };
 
 #define NO_BEST_COALESCE   -1
@@ -226,8 +227,6 @@ pop_cost_one_pair (coalesce_list *cl, int *p1, int *p2)
   *p2 = ptr->second_element;
   cl->cost_one_list = ptr->next;
 
-  free (ptr);
-
   return 1;
 }
 
@@ -251,7 +250,6 @@ pop_best_coalesce (coalesce_list *cl, int *p1, int *p2)
   *p1 = node->first_element;
   *p2 = node->second_element;
   ret = node->cost;
-  free (node);
 
   return ret;
 }
@@ -273,6 +271,7 @@ create_coalesce_list (void)
   list->sorted = NULL;
   list->num_sorted = 0;
   list->cost_one_list = NULL;
+  gcc_obstack_init (&list->ob);
   return list;
 }
 
@@ -287,6 +286,7 @@ delete_coalesce_list (coalesce_list *cl)
   cl->list = NULL;
   free (cl->sorted);
   gcc_assert (cl->num_sorted == 0);
+  obstack_free (&cl->ob, NULL);
   free (cl);
 }
 
@@ -328,7 +328,7 @@ find_coalesce_pair (coalesce_list *cl, int p1, int p2, bool 
create)
 
   if (!*slot)
 {
-  struct coalesce_pair * pair = XNEW (struct coalesce_pair);
+  struct coalesce_pair * pair = XOBNEW (&cl->ob, struct coalesce_pair);
   gcc_assert (cl->sorted == NULL);
   pair->first_element = p.first_element;
   pair->second_element = p.second_element;
@@ -346,7 +346,7 @@ add_cost_one_coalesce (coalesce_list *cl, int p1, int p2)
 {
   cost_one_pair *pair;
 
-  pair = XNEW (cost_one_pair);
+  pair = XOBNEW (&cl->ob, cost_one_pair);
   pair->first_element = p1;
   pair->second_element = p2;
   pair->next = cl->cost_one_list;
@@ -677,8 +677,8 @@ ssa_conflicts_dump (FILE *file, ssa_conflicts *ptr)
 struct live_track
 {
   bitmap_obstack obstack;  /* A place to allocate our bitmaps.  */
-  bitmap live_base_var;/* Indicates if a basevar is live.  */
-  bitmap *live_base_partitions;/* Live partitions for each basevar.  */
+  bitmap_head live_base_var;   /* Indicates if a basevar is live.  */
+  bitmap_head *live_base_partitions;   /* Live partitions for each basevar.  */
   var_map map; /* Var_map being used for partition mapping.  */
 };
 
@@ -695,14 +695,14 @@ new_live_track (var_map map)
   /* Make sure there is a partition view in place.  */
   gcc_assert (map->partition_to_base_index != NULL);
 
-  ptr = (live_track *) xmalloc (sizeof (live_track));
+  ptr = XNEW (live_track);
   ptr->map = map;
   lim = num_basevars (map);
   bitmap_obstack_initialize (&ptr->obstack);
-  ptr->live_base_partitions = (bitmap *) xmalloc (sizeof (bitmap *) * lim);
-  ptr->live_base_var = BITMAP_ALLOC (&ptr->obstack);
+  ptr->live_base_partitions = XNEWVEC (bitmap_head, lim);
+  bitmap_initialize (&ptr->live_base_var, &ptr->obstack);
   for (x = 0; x < lim; x++)
-ptr->live_base_partitions[x] = BITMAP_ALLOC (&ptr->obstack);
+bitmap_initialize (&ptr->live_base_partitions[x], &ptr->obstack);
   return ptr;
 }
 
@@ -713,8 +713,8 @@ static void
 delete_live_track (live_track *ptr)
 {
   bitmap_obstack_release (&ptr->obstack);
-  free (ptr->live_base_partitions);
-  free (ptr);
+  XDELETEVEC (ptr->live_base_partitions);
+  XDELETE (ptr);
 }
 
 
@@ -726,10 +726,10 @@ live_track_remove_partition (live_track *ptr, int 
partition)
   int root;
 
   root = basevar_index (ptr->map, partition);
-  bitmap_clear_bit (ptr->live_base_partitions[root], partition);
+  bitmap_clear_bit (&ptr->live_base_partitions[root], partition);
   /* If the element list is empty, make the base variable not live either.  */
-  if (bitmap_empty_p (ptr->live_base_partitions[root]))
-bitmap_clear_bit (ptr->live_base_var, root);
+  if (bitmap_empty_p (&ptr->live_base_partitions[root]))
+bitmap_clear_bit (&ptr->live_base_var, root);
 }
 
 
@@ -743,9 +743,9 @@ live_track_add_partition (live_tra


Re: [Patch][gdb] Initialise quiet flag for "info functions"

2018-11-02 Thread Matthew Malcomson

Oops -- wrong list -- please ignore.


On 02/11/18 11:27, Matthew Malcomson wrote:
> With this flag unset, using 'info functions' without a set quiet flag
> was not deterministic and was causing some flaky test failures.
>
> Failures seen in (at least).
> gdb.base/info_qt.exp
> gdb.dwarf2/dw2-case-insensitive.exp
> gdb.base/info-fun.exp
>
> Ok for trunk?
> I don't have commit rights.
>
> gdb/ChangeLog:
>
> 2018-11-02  Matthew Malcomson  
>
>   * symtab.c (info_functions_command): Initialise quiet flag.
>
>
>
> ### Attachment also inlined for ease of reply
> ###
>
>
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index 
> cd27a75e8ca2370a9d11ae6057d051ca6ce13f90..7649908d9c9341ad695626e0a22a085f2af302ef
>  100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -4760,7 +4760,7 @@ info_functions_command (const char *args, int from_tty)
>   {
> std::string regexp;
> std::string t_regexp;
> -  bool quiet;
> +  bool quiet = false;
>   
> while (args != NULL
>&& extract_info_print_args (&args, &quiet, &regexp, &t_regexp))
>

Re: [fortran, patch, committed] Adjust error message

2018-11-02 Thread Bernhard Reutner-Fischer

On Thu, 1 Nov 2018 13:05:39 +0100
Thomas Koenig  wrote:

> Hello world,
> 
> I just adjusted the error message for BIND(C) functions which
> return strings of length > 1.  I just used the suggestion from
> the PR.  Committed as obvious and simple (although I managed
> to use up three revisions to do it right :-)

heh, so you don't mind if we make this four, right :)

+ gfc_error ("Return type of BIND(C) function %qs of character "
+"type at %L must have length 1 ", tmp_sym->name,
 &(tmp_sym->declared_at));

Please remove the trailing whitespace (after "length 1").

TIA,

[aarch64] disable shrink wrapping when tracking speculative execution

Although there's no fundamental reason why shrink wrapping and
speculation tracking are incompatible, a phase-ordering requirement (we
need to do speculation tracking before the final basic block clean-up)
means that the shrink wrapping pass can undo some of the changes the
speculation tracking pass makes.  The result is that the tracking, while
still safe, is less comprehensive than we really want.

So to keep things simple, and because the tracking code is quite
expensive anyway, it seems best to just disable that pass when we are
tracking speculative execution.

* config/aarch64/aarch64.c (aarch64_override_options): Disable
shrink-wrapping when -mtrack-speculation.

Committed.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 54f57463e97..d356fa37823 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11346,6 +11346,12 @@ aarch64_override_options (void)
|| (aarch64_arch_string && valid_arch))
 gcc_assert (explicit_arch != aarch64_no_arch);
 
+  /* The pass to insert speculation tracking runs before
+ shrink-wrapping and the latter does not know how to update the
+ tracking status.  So disable it in this case.  */
+  if (aarch64_track_speculation)
+flag_shrink_wrap = 0;
+
   aarch64_override_options_internal (&global_options);
 
   /* Save these options as the default ones in case we push and pop them later

Re: [ARM] Implement division using vrecpe, vrecps

2018-11-02 Thread Wilco Dijkstra

Prathamesh Kulkarni wrote:

> This is a rebased version of patch that adds a pattern to neon.md for
> implementing division with multiplication by reciprocal using
> vrecpe/vrecps with -funsafe-math-optimizations excluding -Os.
> The newly added test-cases are not vectorized on armeb target with
> -O2. I posted the analysis for that here:
> https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01765.html

I don't think doing this unconditionally for any CPU is a good idea. On AArch64
we don't enable this for any core since it's not really faster (newer CPUs have
significantly improved division and the reciprocal instructions reduce
throughput of other FMAs). On wrf doing reciprocal square root is far better
than reciprocal division, but it's only faster on some specific CPUs, so it's
not enabled by default.

Wilco
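
For readers unfamiliar with the technique under discussion, the following is a
scalar sketch of division by reciprocal estimate plus Newton-Raphson
refinement. It is not the patch and uses no NEON intrinsics: recip_step models
the per-lane result of VRECPS (2.0 - a * b), the caller-supplied estimate
stands in for VRECPE, and the function names and step count are assumptions
made for illustration.

```c
#include <assert.h>

/* Models what VRECPS computes per lane: 2.0 - d * x.  */
static float
recip_step (float d, float x)
{
  return 2.0f - d * x;
}

/* Approximate N / D.  ESTIMATE is a rough guess for 1/D (a stand-in for
   the low-precision VRECPE result); each step roughly squares the
   relative error of the reciprocal.  */
static float
approx_div (float n, float d, float estimate, int steps)
{
  float x = estimate;

  assert (steps >= 0);
  for (int i = 0; i < steps; i++)
    x = x * recip_step (d, x);
  return n * x;
}
```

Each refinement costs two dependent multiply-type operations, which is one way
to see why this only pays off on cores where the divide instruction is slow.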

[PATCH][RTL] Fix PR87852



The following fixes PR87852, a latent bug in fwprop which when verifying
whether it may propagate a use from its definition site has a shortcut

  /* Check if the reg in USE has only one definition.  We already
 know that this definition reaches use, or we wouldn't be here.
 However, this is invalid for hard registers because if they are
 live at the beginning of the function it does not mean that we
 have an uninitialized access.  */
  regno = DF_REF_REGNO (use);
  def = DF_REG_DEF_CHAIN (regno);
  if (def
  && DF_REF_NEXT_REG (def) == NULL
  && regno >= FIRST_PSEUDO_REGISTER)
return false;

not considering the case of a loop where the def might not dominate
the use.  In fact earlier code in the very same function does
handle this case but only for the case where we'd try propagating
a later def into an earlier use.
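
The condition the fix adds can be modelled outside of GCC as follows. This is
a sketch under stated assumptions, not fwprop code: basic blocks are plain
integers, the dominator tree is an immediate-dominator array in which the
entry block is its own parent, LUIDs are integers that increase within a
block, and all names are hypothetical.

```c
#include <assert.h>

/* Example CFG: 0 (entry) -> 1 (loop header) -> 2 (loop body) -> 1,
   and 1 -> 3 (exit).  example_idom[b] is the immediate dominator of B.  */
static const int example_idom[4] = { 0, 0, 1, 1 };

/* Return 1 if block DOM dominates block BB, walking up the tree.  */
static int
dominated_by (const int *idom, int bb, int dom)
{
  assert (bb >= 0 && dom >= 0);
  for (;;)
    {
      if (bb == dom)
	return 1;
      if (idom[bb] == bb)	/* Reached the entry block.  */
	return 0;
      bb = idom[bb];
    }
}

/* Return 1 if the single definition of a register (in block REG_DEF_BB
   at position REG_DEF_LUID) is guaranteed to execute before the insn at
   DEF_INSN_LUID in block DEF_BB.  */
static int
single_def_is_before (const int *idom, int reg_def_bb, int reg_def_luid,
		      int def_bb, int def_insn_luid)
{
  if (reg_def_bb == def_bb)
    return reg_def_luid < def_insn_luid;
  return dominated_by (idom, def_bb, reg_def_bb);
}
```

In the example CFG a register whose only definition sits in the loop body
(block 2) does not pass the check for an insn in the loop header (block 1),
which is the loop case the shortcut used to miss.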

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK for trunk?

Thanks,
Richard.

2018-11-02  Richard Biener  

PR rtl-optimization/87852
* fwprop.c (use_killed_between): Only consider single-defs of the
use in the definition stmt that dominate it.

diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 0fca0f1edbc..cd44c0ef637 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -767,7 +767,11 @@ use_killed_between (df_ref use, rtx_insn *def_insn, rtx_insn *target_insn)
   def = DF_REG_DEF_CHAIN (regno);
   if (def
   && DF_REF_NEXT_REG (def) == NULL
-  && regno >= FIRST_PSEUDO_REGISTER)
+  && regno >= FIRST_PSEUDO_REGISTER
+  && (BLOCK_FOR_INSN (DF_REF_INSN (def)) == def_bb
+ ? DF_INSN_LUID (DF_REF_INSN (def)) < DF_INSN_LUID (def_insn)
+ : dominated_by_p (CDI_DOMINATORS,
+   def_bb, BLOCK_FOR_INSN (DF_REF_INSN (def)))))
 return false;
 
   /* Check locally if we are in the same basic block.  */

Re: [aarch64] disable shrink wrapping when tracking speculative execution

On Fri, Nov 2, 2018 at 2:38 PM Richard Earnshaw (lists)
 wrote:
>
> Although there's no fundamental reason why shrink wrapping and
> speculation tracking are incompatible, a phase-ordering requirement (we
> need to do speculation tracking before the final basic block clean-up)
> means that the shrink wrapping pass can undo some of the changes the
> speculation tracking pass makes.  The result is that the tracking, while
> still safe, is less comprehensive than we really want.
>
> So to keep things simple, and because the tracking code is quite
> expensive anyway, it seems best to just disable that pass when we are
> tracking speculative execution.

Shouldn't you be able to do this per function at least?

Richard.

> * config/aarch64/aarch64.c (aarch64_override_options): Disable
> shrink-wrapping when -mtrack-speculation.
>
> Committed.

Re: [PATCH libquadmath/PR68686]

2018-11-02 Thread Joseph Myers

On Wed, 24 Oct 2018, Jakub Jelinek wrote:

> On Tue, Oct 23, 2018 at 09:45:13PM -0400, Ed Smith-Rowland wrote:
> > Greetings,
> > 
> > This is an almost trivial patch to get the correct sign for tgammaq.
> 
> Doesn't look trivial to me.  What happens if x is a NaN?  Or if x is outside
> of the range of int?
> 
> Generally, libquadmath follows what glibc does, I don't see such a change in
> there.  So, the right fix is probably port all the upstream glibc *gamma*
> changes, in PR65757 as the 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65757#c18
> comment and my patch submission says, I've left those out because they were
> too large and I didn't have spare cycles for that.

I think it would be best to move to having a script to generate 
libquadmath sources automatically from glibc sources by appropriate 
substitutions, so that while you might need to update the script or 
quadmath-imp.h as part of updating libquadmath from glibc, you don't need 
to merge lots of changes manually.

Specifically, any comments on the patch below (quadmath-imp.h changes and 
new script shown, 6000 lines of diffs from running the script not shown)?  
It doesn't yet update the *gamma* sources, but could be extended to do so.  
(It also doesn't do anything with the parts of libquadmath outside of 
libquadmath/math/, but again could be extended for that.)  Specifically, 
the following files in libquadmath/math/ aren't yet updated by the script 
(a few of these, e.g. sqrtq.c, aren't actually based on glibc sources at 
all, while others just need the script to gain new features, or additional 
source files to be added to libquadmath): cacoshq.c cacosq.c casinhq.c 
complex.c expq.c fmaq.c ilogbq.c isinf_nsq.c lgammaq.c nanq.c rem_pio2q.c 
sqrtq.c tanq.c tgammaq.c x2y2m1q.c.

(The script does not aim to match formatting, comments etc. with the 
existing libquadmath code in all cases, but the results should be similar 
enough for comparison to be reasonable, especially if you ignore 
whitespace; an important question is whether it's losing deliberate local 
changes in the libquadmath code, that are either relevant for portability 
or affect the generated code.)

Index: libquadmath/quadmath-imp.h
===
--- libquadmath/quadmath-imp.h  (revision 265724)
+++ libquadmath/quadmath-imp.h  (working copy)
@@ -21,10 +21,15 @@
 #ifndef QUADMATH_IMP_H
 #define QUADMATH_IMP_H

+#include 
+#include 
 #include 
 #include 
 #include "quadmath.h"
 #include "config.h"
+#ifdef HAVE_FENV_H
+# include 
+#endif

 /* Under IEEE 754, an architecture may determine tininess of
@@ -36,6 +41,8 @@

 #define TININESS_AFTER_ROUNDING   1

+#define FIX_FLT128_LONG_CONVERT_OVERFLOW 0
+#define FIX_FLT128_LLONG_CONVERT_OVERFLOW 0

 /* Prototypes for internal functions.  */
 extern int32_t __quadmath_rem_pio2q (__float128, __float128 *);
@@ -227,4 +234,50 @@
 }  \
   while (0)

+/* Likewise, for both real and imaginary parts of a complex
+   result.  */
+#define math_check_force_underflow_complex(x)  \
+  do   \
+{  \
+  __typeof (x) force_underflow_complex_tmp = (x);  \
+  math_check_force_underflow (__real__ force_underflow_complex_tmp); \
+  math_check_force_underflow (__imag__ force_underflow_complex_tmp); \
+}  \
+  while (0)
+
+#ifndef HAVE_FENV_H
+# define feraiseexcept(arg) ((void) 0)
+typedef int fenv_t;
+# define feholdexcept(arg) ((void) 0)
+# define fesetround(arg) ((void) 0)
+# define feupdateenv(arg) ((void) (arg))
+# define fesetenv(arg) ((void) (arg))
+# define fetestexcept(arg) 0
+# define feclearexcept(arg) ((void) 0)
+#else
+# ifndef HAVE_FEHOLDEXCEPT
+#  define feholdexcept(arg) ((void) 0)
+# endif
+# ifndef HAVE_FESETROUND
+#  define fesetround(arg) ((void) 0)
+# endif
+# ifndef HAVE_FEUPDATEENV
+#  define feupdateenv(arg) ((void) (arg))
+# endif
+# ifndef HAVE_FESETENV
+#  define fesetenv(arg) ((void) (arg))
+# endif
+# ifndef HAVE_FETESTEXCEPT
+#  define fetestexcept(arg) 0
+# endif
 #endif
+
+#ifndef __glibc_likely
+# define __glibc_likely(cond)  __builtin_expect ((cond), 1)
+#endif
+
+#ifndef __glibc_unlikely
+# define __glibc_unlikely(cond)  __builtin_expect ((cond), 0)
+#endif
+
+#endif
Index: libquadmath/update-quadmath.py
===
--- libquadmath/update-quadmath.py  (nonexistent)
+++ libquadmath/update-quadmath.py  (working copy)
@@ -0,0 +1,198 @@
+#!/usr/bin/python3
+# Update libquadmath code from glibc sources.
+# Copyright (C) 2018 Free Software Foundation, Inc.
+# This file is part of the libquadmath library.
+#
+# Libquadmath is free software; you can redistribute it and/or
+# modify it u

Re: [aarch64] disable shrink wrapping when tracking speculative execution

On 02/11/2018 13:53, Richard Biener wrote:
> On Fri, Nov 2, 2018 at 2:38 PM Richard Earnshaw (lists)
>  wrote:
>>
>> Although there's no fundamental reason why shrink wrapping and
>> speculation tracking are incompatible, a phase-ordering requirement (we
>> need to do speculation tracking before the final basic block clean-up)
>> means that the shrink wrapping pass can undo some of the changes the
>> speculation tracking pass makes.  The result is that the tracking, while
>> still safe, is less comprehensive than we really want.
>>
>> So to keep things simple, and because the tracking code is quite
>> expensive anyway, it seems best to just disable that pass when we are
>> tracking speculative execution.
> 
> Shouldn't you be able to do this per function at least?
> 

do what per function?  track speculation?

R.

> Richard.
> 
>> * config/aarch64/aarch64.c (aarch64_override_options): Disable
>> shrink-wrapping when -mtrack-speculation.
>>
>> Committed.

[PATCH][rs6000] fix ICE for strncmp expansion on power6

2018-11-02 Thread Aaron Sawdey

This patch addresses an ICE for a missing instruction when targeting power6.
The issue is that we shouldn't generate x-form load rtx if TARGET_AVOID_XFORM
is true because it won't end up being matched. More generally, on big endian
we do not need to use ldbrx et al., which are indexed loads, but can just use
ld and other normal d-form loads. So this is going to generate better code for
BE in general, which is why I have changed it to do this for big endian or
TARGET_AVOID_XFORM.

Bootstrap/regtest passes on ppc32 and ppc64 (power 6/7/8), ok for trunk?

Thanks!
   Aaron


2018-11-02  Aaron Sawdey  

* config/rs6000/rs6000-string.c (expand_strncmp_gpr_sequence): Pay
attention to TARGET_AVOID_XFORM.

Index: gcc/config/rs6000/rs6000-string.c
===
--- gcc/config/rs6000/rs6000-string.c   (revision 265733)
+++ gcc/config/rs6000/rs6000-string.c   (working copy)
@@ -1798,12 +1798,18 @@
   rid of the extra bytes.  */
cmp_bytes = bytes_to_compare;

-  rtx offset_reg = gen_reg_rtx (Pmode);
-  emit_move_insn (offset_reg, GEN_INT (offset));
-
-  rtx addr1 = gen_rtx_PLUS (Pmode, src1_addr, offset_reg);
+  rtx offset_rtx;
+  if (BYTES_BIG_ENDIAN || TARGET_AVOID_XFORM)
+   offset_rtx = GEN_INT (offset);
+  else
+   {
+ offset_rtx = gen_reg_rtx (Pmode);
+ emit_move_insn (offset_rtx, GEN_INT (offset));
+   }
+  rtx addr1 = gen_rtx_PLUS (Pmode, src1_addr, offset_rtx);
+  rtx addr2 = gen_rtx_PLUS (Pmode, src2_addr, offset_rtx);
+   
   do_load_for_compare_from_addr (load_mode, tmp_reg_src1, addr1, orig_src1);
-  rtx addr2 = gen_rtx_PLUS (Pmode, src2_addr, offset_reg);
   do_load_for_compare_from_addr (load_mode, tmp_reg_src2, addr2, orig_src2);

   /* We must always left-align the data we read, and


-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Re: [patch][x86_64]: AMD znver2 enablement

2018-11-02 Thread Uros Bizjak

On Wed, Oct 31, 2018 at 6:25 AM Kumar, Venkataramanan
 wrote:
>
> Hi Maintainers,
>
> PFA, the patch that enables support for the next generation AMD Zen CPU via
> -march=znver2.
> As of now, znver2 is using the same costs and scheduler descriptions written
> for znver1.
>
> We will update scheduler descriptions and costing for znver2 later as we get
> more information.
>
> Ok for trunk?
>
> Regards,
> Venkat.
>
> ChangeLog gcc:
> * common/config/i386/i386-common.c (processor_alias_table): Add 
> znver2 entry.
>   * config.gcc (i[34567]86-*-linux* | ...): Add znver2.
>   (case ${target}): Add znver2.
>   * config/i386/driver-i386.c: (host_detect_local_cpu): Let
>   -march=native recognize znver2 processors.
>   * config/i386/i386-c.c (ix86_target_macros_internal): Add 
> znver2.
>   * config/i386/i386.c (m_znver2): New definition.
>   (m_ZNVER): New definition.
>   (m_AMD_MULTIPLE): Includes m_znver2.
>   (processor_cost_table): Add znver2 entry.
>   (processor_target_table): Add znver2 entry.
>   (get_builtin_code_for_version): Set priority for
>  PROCESSOR_ZNVER2.
> (processor_model): Add M_AMDFAM17H_ZNVER2.
> (arch_names_table): Ditto.
> (ix86_reassociation_width): Include znver2.
> * config/i386/i386.h (TARGET_znver2): New definition.
>   (struct ix86_size_cost): Add TARGET_ZNVER2.
>   (enum processor_type): Add PROCESSOR_ZNVER2.
>   * config/i386/i386.md (define_attr "cpu"): Add znver2.
> * config/i386/x86-tune-costs.h: (processor_costs) Add znver2 costs.
> * config/i386/x86-tune-sched.c: (ix86_issue_rate): Add znver2.
> (ix86_adjust_cost): Add znver2.
>   * config/i386/x86-tune.def:  Replace m_ZNVER1 by m_ZNVER
>   * gcc/doc/extend.texi: Add details about znver2.
>   * gcc/doc/invoke.texi: Add details about znver2.
>
> ChangeLog libgcc
>  * config/i386/cpuinfo.c: (get_amd_cpu): Add znver2.
>  (processor_subtypes): Ditto.


diff --git a/libgcc/config/i386/cpuinfo.h b/libgcc/config/i386/cpuinfo.h
index 0aa887b..86cb4ea 100644
--- a/libgcc/config/i386/cpuinfo.h
+++ b/libgcc/config/i386/cpuinfo.h
@@ -67,6 +67,7 @@ enum processor_subtypes
   AMDFAM15H_BDVER3,
   AMDFAM15H_BDVER4,
   AMDFAM17H_ZNVER1,
+  AMDFAM17H_ZNVER2,
   INTEL_COREI7_IVYBRIDGE,
   INTEL_COREI7_HASWELL,
   INTEL_COREI7_BROADWELL,

As the comment above these enums says:

/* Any new types or subtypes have to be inserted at the end. */

So, please add the new entry at the end of enum processor_subtypes.
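
A small sketch of why the position matters, assuming the numeric values are
shared with code that was built against the old header. The enum and function
names below are hypothetical and much shorter than the real lists in
cpuinfo.h:

```c
#include <assert.h>

/* Hypothetical numbering as already shipped.  */
enum toy_v1 { V1_ZNVER1 = 1, V1_IVYBRIDGE, V1_HASWELL };

/* New entry inserted in the middle: every later value shifts by one.  */
enum toy_mid { MID_ZNVER1 = 1, MID_ZNVER2, MID_IVYBRIDGE, MID_HASWELL };

/* New entry appended at the end: existing values are unchanged.  */
enum toy_end { END_ZNVER1 = 1, END_IVYBRIDGE, END_HASWELL, END_ZNVER2 };

/* Compile-time check that appending keeps an old value stable.  */
static_assert ((int) END_HASWELL == (int) V1_HASWELL,
               "appending must not renumber existing entries");

/* Nonzero if every pre-existing name kept its value after the
   mid-list insertion.  */
static int
toy_mid_insertion_is_compatible (void)
{
  return (int) MID_ZNVER1 == (int) V1_ZNVER1
         && (int) MID_IVYBRIDGE == (int) V1_IVYBRIDGE
         && (int) MID_HASWELL == (int) V1_HASWELL;
}

/* Same check for the insertion at the end.  */
static int
toy_end_insertion_is_compatible (void)
{
  return (int) END_ZNVER1 == (int) V1_ZNVER1
         && (int) END_IVYBRIDGE == (int) V1_IVYBRIDGE
         && (int) END_HASWELL == (int) V1_HASWELL;
}
```

In this toy numbering the mid-list insertion moves IVYBRIDGE from 2 to 3 and
HASWELL from 3 to 4, so a value stored by older code would be decoded as the
wrong CPU.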

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 963c7fc..bbe3bb3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -32269,6 +32276,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
 M_AMDFAM15H_BDVER3,
 M_AMDFAM15H_BDVER4,
 M_AMDFAM17H_ZNVER1,
+M_AMDFAM17H_ZNVER2,
 M_INTEL_COREI7_IVYBRIDGE,
 M_INTEL_COREI7_HASWELL,
 M_INTEL_COREI7_BROADWELL,

The above also has to be kept in sync with enum processor_subtypes.

Otherwise LGTM.

Uros.

Re: [PATCH][rs6000] fix ICE for strncmp expansion on power6

2018-11-02 Thread Segher Boessenkool

On Fri, Nov 02, 2018 at 09:58:50AM -0500, Aaron Sawdey wrote:
> This patch addresses an ICE for a missing instruction when targeting power6.
> The issue is that we shouldn't generate x-form load rtx if TARGET_AVOID_XFORM
> is true because it won't end up being matched. More generally, on big endian
> we do not need to use ldbrx et al., which are indexed loads, but can just use
> ld and other normal d-form loads. So this is going to generate better code for
> BE in general, which is why I have changed it to do this for big endian or
> TARGET_AVOID_XFORM.

Great :-)

> 2018-11-02  Aaron Sawdey  
> 
>   * config/rs6000/rs6000-string.c (expand_strncmp_gpr_sequence): Pay
>   attention to TARGET_AVOID_XFORM.

Also mention BIG_ENDIAN please?

Okay with that.  Thanks!


Segher

Re: [PATCH][RTL] Fix PR87852

2018-11-02 Thread Eric Botcazou

> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> OK for trunk?
> 
> Thanks,
> Richard.
> 
> 2018-11-02  Richard Biener  
> 
>   PR rtl-optimization/87852
>   * fwprop.c (use_killed_between): Only consider single-defs of the
>   use in the definition stmt that dominate it.

This looks OK to me, but this lacks commentary and I have a hard time parsing
the ChangeLog entry.  Maybe:

* fwprop.c (use_killed_between): Only consider single-defs of the use
whose definition statement dominates the use.

FWIW I've attached a patch that also fixes the head comment of the function.

-- 
Eric Botcazou

Index: fwprop.c
===
--- fwprop.c	(revision 265634)
+++ fwprop.c	(working copy)
@@ -731,14 +731,15 @@ local_ref_killed_between_p (df_ref ref,
 }
 
 
-/* Check if the given DEF is available in INSN.  This would require full
-   computation of available expressions; we check only restricted conditions:
-   - if DEF is the sole definition of its register, go ahead;
-   - in the same basic block, we check for no definitions killing the
- definition of DEF_INSN;
-   - if USE's basic block has DEF's basic block as the sole predecessor,
- we check if the definition is killed after DEF_INSN or before
+/* Check if USE is killed between DEF_INSN and TARGET_INSN.  This would
+   require full computation of available expressions; we check only a few
+   restricted conditions:
+   - if the reg in USE has only one definition, go ahead;
+   - in the same basic block, we check for no definitions killing the use;
+   - if TARGET_INSN's basic block has DEF_INSN's basic block as its sole
+ predecessor, we check if the use is killed after DEF_INSN or before
  TARGET_INSN insn, in their respective basic blocks.  */
+
 static bool
 use_killed_between (df_ref use, rtx_insn *def_insn, rtx_insn *target_insn)
 {
@@ -762,12 +763,17 @@ use_killed_between (df_ref use, rtx_insn
  know that this definition reaches use, or we wouldn't be here.
  However, this is invalid for hard registers because if they are
  live at the beginning of the function it does not mean that we
- have an uninitialized access.  */
+ have an uninitialized access.  And we have to check for the case
+ where a register may be used uninitialized in a loop as above.  */
   regno = DF_REF_REGNO (use);
   def = DF_REG_DEF_CHAIN (regno);
   if (def
   && DF_REF_NEXT_REG (def) == NULL
-  && regno >= FIRST_PSEUDO_REGISTER)
+  && regno >= FIRST_PSEUDO_REGISTER
+  && (BLOCK_FOR_INSN (DF_REF_INSN (def)) == def_bb
+	  ? DF_INSN_LUID (DF_REF_INSN (def)) < DF_INSN_LUID (def_insn)
+	  : dominated_by_p (CDI_DOMINATORS,
+			def_bb, BLOCK_FOR_INSN (DF_REF_INSN (def)))))
 return false;
 
   /* Check locally if we are in the same basic block.  */

[PATCH] i386: Remove duplicated AVX2/AVX512 vec_dup patterns

2018-11-02 Thread H.J. Lu

Remove duplicated AVX2/AVX512 vec_dup patterns and replace them with
subreg.  gcc.target/i386/avx2-vbroadcastss_ps256-1.c is changed by

 avx2_test:
.cfi_startproc
-   vmovaps x(%rip), %xmm1
-   vbroadcastss    %xmm1, %ymm0
+   vbroadcastss    x(%rip), %ymm0
vmovaps %ymm0, y(%rip)
vzeroupper
ret
.cfi_endproc

gcc.target/i386/avx512vl-vbroadcast-3.c is changed by

@@ -113,7 +113,7 @@ f10:
.cfi_startproc
vmovaps %ymm0, %ymm16
vpermilps   $85, %ymm16, %ymm16
-   vbroadcastss    %xmm16, %ymm16
+   vshuff32x4  $0x0, %ymm16, %ymm16, %ymm16
vzeroupper
ret
.cfi_endproc
@@ -153,8 +153,7 @@ f12:
 f13:
 .LFB12:
.cfi_startproc
-   vmovaps (%rdi), %ymm16
-   vbroadcastss    %xmm16, %ymm16
+   vbroadcastss    (%rdi), %ymm16
vzeroupper
ret
.cfi_endproc

OK for trunk?

Thanks.

H.J.
--
gcc/

* config/i386/i386-builtin.def: Replace CODE_FOR_avx2_vec_dupv4sf,
CODE_FOR_avx2_vec_dupv8sf and CODE_FOR_avx2_vec_dupv4df with
CODE_FOR_vec_dupv4sf, CODE_FOR_vec_dupv8sf and
CODE_FOR_vec_dupv4df, respectively.
* config/i386/i386.c (expand_vec_perm_1): Use subreg with vec_dup.
* config/i386/i386.md (SF to DF splitter): Replace
gen_avx512f_vec_dupv16sf_1 with gen_avx512f_vec_dupv16sf.
* config/i386/sse.md (VF48_AVX512VL): New.
(avx2_vec_dup): Removed.
(avx2_vec_dupv8sf_1): Likewise.
(avx512f_vec_dup_1): Likewise.
(avx2_pbroadcast_1): Likewise.
(avx2_vec_dupv4df): Likewise.
(_vec_dup_1): Likewise.
(*avx_vperm_broadcast_): Replace gen_avx2_vec_dupv8sf with
gen_vec_dupv8sf.

gcc/testsuite/

* gcc.target/i386/avx2-vbroadcastss_ps256-1.c: Updated.
* gcc.target/i386/avx512vl-vbroadcast-3.c: Likewise.
---
 gcc/config/i386/i386-builtin.def  |  6 +-
 gcc/config/i386/i386.c| 57 ++---
 gcc/config/i386/i386.md   |  2 +-
 gcc/config/i386/sse.md| 83 +--
 .../i386/avx2-vbroadcastss_ps256-1.c  |  3 +-
 .../gcc.target/i386/avx512vl-vbroadcast-3.c   |  5 +-
 6 files changed, 56 insertions(+), 100 deletions(-)

diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index df0f7e975ac..d217add8ee2 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -1194,9 +1194,9 @@ BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_interleave_lowv16hi, "__builtin_ia32_
 BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_interleave_lowv8si, "__builtin_ia32_punpckldq256", IX86_BUILTIN_PUNPCKLDQ256, UNKNOWN, (int) V8SI_FTYPE_V8SI_V8SI)
 BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_interleave_lowv4di, "__builtin_ia32_punpcklqdq256", IX86_BUILTIN_PUNPCKLQDQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI)
 BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_xorv4di3, "__builtin_ia32_pxor256", IX86_BUILTIN_PXOR256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI)
-BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_vec_dupv4sf, "__builtin_ia32_vbroadcastss_ps", IX86_BUILTIN_VBROADCASTSS_PS, UNKNOWN, (int) V4SF_FTYPE_V4SF)
-BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_vec_dupv8sf, "__builtin_ia32_vbroadcastss_ps256", IX86_BUILTIN_VBROADCASTSS_PS256, UNKNOWN, (int) V8SF_FTYPE_V4SF)
-BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_vec_dupv4df, "__builtin_ia32_vbroadcastsd_pd256", IX86_BUILTIN_VBROADCASTSD_PD256, UNKNOWN, (int) V4DF_FTYPE_V2DF)
+BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_vec_dupv4sf, "__builtin_ia32_vbroadcastss_ps", IX86_BUILTIN_VBROADCASTSS_PS, UNKNOWN, (int) V4SF_FTYPE_V4SF)
+BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_vec_dupv8sf, "__builtin_ia32_vbroadcastss_ps256", IX86_BUILTIN_VBROADCASTSS_PS256, UNKNOWN, (int) V8SF_FTYPE_V4SF)
+BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_vec_dupv4df, "__builtin_ia32_vbroadcastsd_pd256", IX86_BUILTIN_VBROADCASTSD_PD256, UNKNOWN, (int) V4DF_FTYPE_V2DF)
 BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_vbroadcasti128_v4di, "__builtin_ia32_vbroadcastsi256", IX86_BUILTIN_VBROADCASTSI256, UNKNOWN, (int) V4DI_FTYPE_V2DI)
 BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_pblenddv4si, "__builtin_ia32_pblendd128", IX86_BUILTIN_PBLENDD128, UNKNOWN, (int) V4SI_FTYPE_V4SI_V4SI_INT)
 BDESC (OPTION_MASK_ISA_AVX2, CODE_FOR_avx2_pblenddv8si, "__builtin_ia32_pblendd256", IX86_BUILTIN_PBLENDD256, UNKNOWN, (int) V8SI_FTYPE_V8SI_V8SI_INT)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 963c7fcbb34..6b95d774ad1 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -45963,28 +45963,41 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
{
  /* Use vpbroadcast{b,w,d}.  */
  rtx (*gen) (rtx, rtx) = NULL;
+ machine_mode smode = VOIDmode;
  switch (d->vmode)
{
case E_V64QImode:
  if (TARGET_AVX512BW)
-   gen = gen_avx512bw_vec_dupv64qi

[PATCH 1/3][GCC] Add new target hook asm_post_cfi_startproc

Hi all,

This patch adds a new target hook called "asm_post_cfi_startproc". This hook is
intended to be used by the aarch64 backend to emit a directive that enables
support for unwinding frames signed with the pointer authentication B-key. This
hook is triggered after the ".cfi_startproc" directive is emitted in
gcc/dwarf2out.c.
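
The calling pattern can be sketched in plain C as below. This is only an
illustration under stated assumptions: struct toy_asm_out and the toy_*
functions are hypothetical stand-ins for the target hook vector, the function
is identified by a name string instead of a tree, and the directive written
by the overriding hook is the one named in the third patch of this series.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-in for the assembler-output hooks.  */
struct toy_asm_out
{
  void (*post_cfi_startproc) (FILE *, const char *fn_name);
};

/* Default hook: emit nothing.  */
static void
toy_post_cfi_startproc_default (FILE *f, const char *fn_name)
{
  (void) f;
  (void) fn_name;
}

/* A target override that emits one extra directive.  */
static void
toy_post_cfi_startproc_b_key (FILE *f, const char *fn_name)
{
  (void) fn_name;
  fprintf (f, "\t.cfi_b_key_frame\n");
}

/* Emit .cfi_startproc, then give the target a chance to add its own
   directives immediately afterwards.  */
static void
toy_do_cfi_startproc (FILE *out, const struct toy_asm_out *hooks,
                      const char *fn_name)
{
  assert (hooks != NULL && hooks->post_cfi_startproc != NULL);
  fprintf (out, "\t.cfi_startproc\n");
  hooks->post_cfi_startproc (out, fn_name);
}
```

Targets that keep the default see no change in their output; only a target
that installs its own function gets the extra line after .cfi_startproc.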

Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf with no
regressions.

Ok for trunk?

gcc/
2018-11-02  Sam Tebbs

* doc/tm.texi (TARGET_ASM_POST_CFI_STARTPROC): Define.
* doc/tm.texi.in (TARGET_ASM_POST_CFI_STARTPROC): Define.
* dwarf2out.c (dwarf2out_do_cfi_startproc): Trigger the hook.
* hooks.c (hook_void_FILEptr_tree): Define.
* hooks.h (hook_void_FILEptr_tree): Define.
* target.def (post_cfi_startproc): Define.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f841527..e26c0a7 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9413,6 +9413,14 @@ If this macro is not defined, nothing special is output at the end of
 the jump-table.
 @end defmac
 
+@deftypefn {Target Hook} void TARGET_ASM_POST_CFI_STARTPROC (FILE *@var{}, @var{tree})
+This target hook is used to emit assembly strings required by the target
+after the .cfi_startproc directive.  The first argument is the file stream to
+write the strings to and the second argument is the function's declaration.
+
+The default is to not output any assembly strings.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_ASM_EMIT_UNWIND_LABEL (FILE *@var{stream}, tree @var{decl}, int @var{for_eh}, int @var{empty})
 This target hook emits a label at the beginning of each FDE@.  It
 should be defined on targets where FDEs need special labels, and it
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 967ef3a..7d933c0 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -6426,6 +6426,8 @@ If this macro is not defined, nothing special is output at the end of
 the jump-table.
 @end defmac
 
+@hook TARGET_ASM_POST_CFI_STARTPROC
+
 @hook TARGET_ASM_EMIT_UNWIND_LABEL
 
 @hook TARGET_ASM_EMIT_EXCEPT_TABLE_LABEL
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 30bbfee..6c1531a 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -969,6 +969,8 @@ dwarf2out_do_cfi_startproc (bool second)
 
   fprintf (asm_out_file, "\t.cfi_startproc\n");
 
+  targetm.asm_out.post_cfi_startproc (asm_out_file, current_function_decl);
+
   /* .cfi_personality and .cfi_lsda are only relevant to DWARF2
  eh unwinders.  */
   if (targetm_common.except_unwind_info (&global_options) != UI_DWARF2)
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 0ed5b95..bcfc231 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -82,6 +82,7 @@ extern void hook_void_FILEptr_constcharptr_const_tree (FILE *, const char *,
 		   const_tree);
 extern bool hook_bool_FILEptr_rtx_false (FILE *, rtx);
 extern void hook_void_rtx_tree (rtx, tree);
+extern void hook_void_FILEptr_tree (FILE *, tree);
 extern void hook_void_tree (tree);
 extern void hook_void_tree_treeptr (tree, tree *);
 extern void hook_void_int_int (int, int);
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 780cc1e..46bf2a8 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -277,6 +277,11 @@ hook_void_tree (tree)
 }
 
 void
+hook_void_FILEptr_tree (FILE *, tree)
+{
+}
+
+void
 hook_void_rtx_tree (rtx, tree)
 {
 }
diff --git a/gcc/target.def b/gcc/target.def
index ad27d35..3b0022d 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -87,6 +87,17 @@ when the relevant string is @code{NULL}.",
  bool, (rtx x, unsigned int size, int aligned_p),
  default_assemble_integer)
 
+/* Assembly strings required after the .cfi_startproc label.  */
+DEFHOOK
+(post_cfi_startproc,
+  "This target hook is used to emit assembly strings required by the target\n\
+after the .cfi_startproc directive.  The first argument is the file stream to\n\
+write the strings to and the second argument is the function\'s declaration.\n\
+\n\
+The default is to not output any assembly strings.",
+  void, (FILE *, tree),
+  hook_void_FILEptr_tree)
+
 /* Notify the backend that we have completed emitting the data for a
decl.  */
 DEFHOOK

[PATCH 2/3][GCC][AARCH64] Add new -mbranch-protection option to combine pointer signing and BTI

Hi all,

The -mbranch-protection option combines the functionality of
-msign-return-address and the BTI features new in Armv8.5 to better reflect
their relationship. This new option therefore supersedes and deprecates the
existing -msign-return-address option.

-mbranch-protection=[none|standard|] - Turns on different types of branch
protection available where:

 * "none": Turn off all types of branch protection
 * "standard" : Turns on all the types of protection to their respective
   standard levels.
 *  can be "+" separated protection types:

* "bti" : Branch Target Identification Mechanism.
* "pac-ret{+leaf+b-key}": Return Address Signing. The default return
  address signing is enabled by signing functions that save the return
  address to memory (non-leaf functions will practically always do this)
  using the a-key. The optional tuning arguments allow the user to
  extend the scope of return address signing to include leaf functions
  and to change the key to b-key. The tuning arguments must follow the
  protection type "pac-ret".

Thus -mbranch-protection=standard -> -mbranch-protection=bti+pac-ret.

Its mapping to -msign-return-address is as follows:

 * -mbranch-protection=none -> -msign-return-address=none
 * -mbranch-protection=standard -> -msign-return-address=non-leaf
 * -mbranch-protection=pac-ret -> -msign-return-address=non-leaf
 * -mbranch-protection=pac-ret+leaf -> -msign-return-address=all

This patch implements the option's skeleton and the "none", "standard" and
"pac-ret" types (along with its "leaf" subtype).

The previous patch in this series is here:
https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00103.html

Bootstrapped successfully and tested on aarch64-none-elf with no regressions.

OK for trunk?

gcc/ChangeLog:

2018-11-02  Sam Tebbs

* config/aarch64/aarch64.c (BRANCH_PROTEC_STR_MAX,
aarch64_parse_branch_protection,
struct aarch64_branch_protec_type,
aarch64_handle_no_branch_protection,
aarch64_handle_standard_branch_protection,
aarch64_validate_mbranch_protection,
aarch64_handle_pac_ret_protection,
aarch64_handle_attr_branch_protection,
accepted_branch_protection_string,
aarch64_pac_ret_subtypes,
aarch64_branch_protec_types,
aarch64_handle_pac_ret_leaf): Define.
(aarch64_override_options_after_change_1): Add check for
accepted_branch_protection_string.
(aarch64_override_options): Add check for
accepted_branch_protection_string.
(aarch64_option_save): Save accepted_branch_protection_string.
(aarch64_option_restore): Save
accepted_branch_protection_string.
* config/aarch64/aarch64.c (aarch64_attributes): Add branch-protection.
* config/aarch64/aarch64.opt: Add mbranch-protection. Deprecate
msign-return-address.
* doc/invoke.texi: Add mbranch-protection.

gcc/testsuite/ChangeLog:

2018-11-02  Sam Tebbs

* (gcc.target/aarch64/return_address_sign_1.c,
gcc.target/aarch64/return_address_sign_2.c,
gcc.target/aarch64/return_address_sign_3.c (__attribute__)): Change
option to -mbranch-protection.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b44ee40115dce526c7cc302b2a47c28ab8b41508..121348dbd909f42717efea163ea6b7c545f5b1c7 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -183,6 +183,12 @@ bool aarch64_pcrelative_literal_loads;
 /* Global flag for whether frame pointer is enabled.  */
 bool aarch64_use_frame_pointer;
 
+#define BRANCH_PROTEC_STR_MAX 255
+char *accepted_branch_protection_string = NULL;
+
+static enum aarch64_parse_opt_result
+aarch64_parse_branch_protection (const char*, char**);
+
 /* Support for command line parsing of boolean flags in the tuning
structures.  */
 struct aarch64_flag_desc
@@ -1108,6 +1114,80 @@ aarch64_cc;
 
 #define AARCH64_INVERSE_CONDITION_CODE(X) ((aarch64_cc) (((int) X) ^ 1))
 
+struct aarch64_branch_protec_type
+{
+  /* The type's name that the user passes to the branch-protection option
+string.  */
+  const char* name;
+  /* Function to handle the protection type and set global variables.
+First argument is the string token corresponding with this type and the
+second argument is the next token in the option string.
+Return values:
+* AARCH64_PARSE_OK: Handling was successful.
+* AARCH64_INVALID_ARG: The type is invalid in this context and the caller
+  should print an error.
+* AARCH64_INVALID_FEATURE: The type is invalid and the handler prints its
+  own error.  */
+  enum aarch64_parse_opt_result (*handler)(char*, char*);
+  /* A list of types that can follow this type in the option string.  */
+  const aarch64_branch_protec_type* subtypes;
+  unsigned int num_subtypes;
+};
+
+static enum aarch64_parse_opt_result
+aarch64_handle_no_branch_

[PATCH 3/3][GCC][AARCH64] Add support for pointer authentication B key

Hi all,

This patch adds support for the Armv8.3-A pointer authentication instructions
that use the B-key (pacib*, autib* and retab). This required adding builtins for
pacib1716 and autib1716, adding the "b-key" feature to the -mbranch-protection
option, and required emitting a new CFI directive ".cfi_b_key_frame" which
causes GAS to add 'B' to the CIE augmentation string. I also had to add a new
hook called ASM_POST_CFI_STARTPROC which is triggered when the .cfi_startproc
directive is emitted.

The libgcc stack unwinder has been amended to authenticate return addresses
with the B key when the function has been signed with the B key.
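
The key selection can be pictured with the simplified sketch below. It is not
the libgcc code: the toy_* names are hypothetical, and a real unwinder walks
the augmentation string character by character together with the other
augmentation data instead of searching it in one call.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical key selector for the example.  */
enum toy_pauth_key { TOY_KEY_A, TOY_KEY_B };

/* Return true if the CIE augmentation string AUG carries the 'B'
   marker that the assembler adds for .cfi_b_key_frame.  */
static bool
toy_cie_signed_with_b_key (const char *aug)
{
  assert (aug != NULL);
  return strchr (aug, 'B') != NULL;
}

/* Pick the key to authenticate a frame's return address with.  */
static enum toy_pauth_key
toy_key_for_frame (const char *aug)
{
  return toy_cie_signed_with_b_key (aug) ? TOY_KEY_B : TOY_KEY_A;
}
```

Frames whose CIE lacks the marker keep using the A key, so code built without
the new option is unwound as before.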

The previous patch in this series is here:
https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00104.html

Bootstrapped successfully and regression tested on aarch64-none-elf.

OK for trunk?

gcc/
2018-11-02  Sam Tebbs

* config/aarch64/aarch64-builtins.c (aarch64_builtins): Add
AARCH64_PAUTH_BUILTIN_AUTIB1716 and AARCH64_PAUTH_BUILTIN_PACIB1716.
* config/aarch64/aarch64-builtins.c (aarch64_init_pauth_hint_builtins):
Add autib1716 and pacib1716 initialisation.
* config/aarch64/aarch64-builtins.c (aarch64_expand_builtin): Add checks
for autib1716 and pacib1716.
* config/aarch64/aarch64-protos.h (aarch64_key_type,
aarch64_post_cfi_startproc): Define.
* config/aarch64/aarch64-protos.h (aarch64_ra_sign_key): Define extern.
* config/aarch64/aarch64.c (aarch64_return_address_signing_enabled): Add
check for b-key, remove frame.laid_out assertion.
* config/aarch64/aarch64.c (aarch64_ra_sign_key,
aarch64_post_cfi_startproc, aarch64_handle_pac_ret_b_key): Define.
* config/aarch64/aarch64.h (TARGET_ASM_POST_CFI_STARTPROC): Define.
* config/aarch64/aarch64.c (aarch64_pac_ret_subtypes): Add "b-key".
* config/aarch64/aarch64.md (unspec): Add UNSPEC_AUTIA1716,
UNSPEC_AUTIB1716, UNSPEC_AUTIASP, UNSPEC_AUTIBSP, UNSPEC_PACIA1716,
UNSPEC_PACIB1716, UNSPEC_PACIASP, UNSPEC_PACIBSP.
* config/aarch64/aarch64.md (do_return): Add check for b-key.
* config/aarch64/aarch64.md (sp): Add check for
signing key and scope selected.
* config/aarch64/aarch64.md (1716): Add check for
signing key and scope selected.
* config/aarch64/aarch64.opt (msign-return-address=): Deprecate.
* config/aarch64/iterators.md (PAUTH_LR_SP): Add UNSPEC_AUTIASP,
UNSPEC_AUTIBSP, UNSPEC_PACIASP, UNSPEC_PACIBSP.
* config/aarch64/iterators.md (PAUTH_17_16): Add UNSPEC_AUTIA1716,
UNSPEC_AUTIB1716, UNSPEC_PACIA1716, UNSPEC_PACIB1716.
* config/aarch64/iterators.md (pauth_mnem_prefix): Add UNSPEC_AUTIA1716,
UNSPEC_AUTIB1716, UNSPEC_PACIA1716, UNSPEC_PACIB1716, UNSPEC_AUTIASP,
UNSPEC_AUTIBSP, UNSPEC_PACIASP, UNSPEC_PACIBSP.
* config/aarch64/iterators.md (pauth_hint_num_a): Replace
UNSPEC_PACI1716 and UNSPEC_AUTI1716 with UNSPEC_PACIA1716 and
UNSPEC_AUTIA1716 respectively.
* config/aarch64/iterators.md (pauth_hint_num_b): New int attribute.

gcc/testsuite
2018-11-02  Sam Tebbs

* gcc.target/aarch64/return_address_sign_1.c (dg-final): Replace
"autiasp" and "paciasp" with "hint\t29 // autisp" and
"hint\t25 // pacisp" respectively.
* gcc.target/aarch64/return_address_sign_2.c (dg-final): Replace
"paciasp" with "hint\t25 // pacisp".
* gcc.target/aarch64/return_address_sign_3.c (dg-final): Replace
"paciasp" and "autiasp" with "pacisp" and "autisp" respectively.
* gcc.target/aarch64/return_address_sign_b_1.c: New file.
* gcc.target/aarch64/return_address_sign_b_2.c: New file.
* gcc.target/aarch64/return_address_sign_b_3.c: New file.
* gcc.target/aarch64/return_address_sign_b_exception.c: New file.
* gcc.target/aarch64/return_address_sign_builtin.c: New file

libgcc/
2018-11-02  Sam Tebbs

* config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key): New
function.
* config/aarch64/aarch64-unwind.h (aarch64_post_extract_frame_addr,
aarch64_post_frob_eh_handler_addr): Add check for b-key.
* unwind-dw2-fde.c (get_cie_encoding): Add check for 'B' in augmentation
string.
* unwind-dw2.c (extract_cie_info): Add check for 'B' in augmentation
string.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 8cced94567008e28b1761ec8771589a3925f2904..d676f36c157c486cc9cbe6bffe0a7389ba0ccdd8 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -398,6 +398,8 @@ enum aarch64_builtins
   /* ARMv8.3-A Pointer Authentication Builtins.  */
   AARCH64_PAUTH_BUILTIN_AUTIA1716,
   AARCH64_PAUTH_BUILTIN_PACIA1716,
+  AARCH64_PAUTH_BUILTIN_AUTIB1716,
+  AARCH64_PAUTH_BUILTIN_PACIB1716,
   AARCH64_PAUTH_BUILTIN_XPACLRI,
   AARCH64_BUILTIN_MAX
 };
@@ -971,6

Re: [PATCH v2] bring netbsd/arm support up to speed. eabi, etc.

2018-11-02 Thread coypu

On Fri, Nov 02, 2018 at 11:04:06AM +, Richard Earnshaw (lists) wrote:
> Sorry about that.  You don't really expect me to remember every patch I
> committed 18 years ago!
> 
> And pedantically, that was a branch merge patch.  The original commit
> (back in the CVS days) was:
> 
> 
>   revision 1.9.2.1
>   date: 1999/10/25 17:47:02;  author: [redacted];  state: Exp;  lines:
> +34 -10
>   Initial check in of merged arm/thumb backend.
> 
> However, the age of this makes me suspect that it quite likely is not
> relevant any more and that we should investigate whether it is safe to
> remove.  We're running some tests here, but can you test the NetBSD port
> without that as well for another data point?

I thought it's funny, sorry :-)
netbsd seems to only do this for OABI (and defaults to EABI).
I tried it anyway on netbsd's mutant GCC 6.4 and ran a full userland with
it. It works really well!
I'm surprised I can run code that file identifies as "ARMv1" on a machine
that can run Aarch64.

[doc PATCH] Fix weakref description.

2018-11-02 Thread Michael Ploujnikov

I came across this typo and also added a similar ld invocation for
illustration purposes as mentioned by Jakub on irc.
From 2df4903f04fbe68e9e6a1ae0eea460e7592a8512 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Fri, 2 Nov 2018 13:40:50 -0400
Subject: [PATCH] Fix weakref description.

gcc/ChangeLog:

2018-11-02  Michael Ploujnikov  

	* doc/extend.texi: Fix typo in the weakref description.
---
 gcc/doc/extend.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git gcc/doc/extend.texi gcc/doc/extend.texi
index 4dbb2da39e..027e0f75a1 100644
--- gcc/doc/extend.texi
+++ gcc/doc/extend.texi
@@ -3603,7 +3603,7 @@ symbol, not necessarily in the same translation unit.
 The effect is equivalent to moving all references to the alias to a
 separate translation unit, renaming the alias to the aliased symbol,
 declaring it as weak, compiling the two separate translation units and
-performing a reloadable link on them.
+performing a incremental link (like @code{ld -r}) on them.
 
 At present, a declaration to which @code{weakref} is attached can
 only be @code{static}.
-- 
2.19.1




Re: [PATCH 3/3][GCC][AARCH64] Add support for pointer authentication B key

On 11/02/2018 05:35 PM, Sam Tebbs wrote:

> Hi all,
>
> This patch adds support for the Armv8.3-A pointer authentication instructions
> that use the B-key (pacib*, autib* and retab). This required adding builtins for
> pacib1716 and autib1716, adding the "b-key" feature to the -mbranch-protection
> option, and required emitting a new CFI directive ".cfi_b_key_frame" which
> causes GAS to add 'B' to the CIE augmentation string. I also had to add a new
> hook called ASM_POST_CFI_STARTPROC which is triggered when the .cfi_startproc
> directive is emitted.
>
> The libgcc stack unwinder has been amended to authenticate return addresses
> with the B key when the function has been signed with the B key.
>
> The previous patch in this series is here:
> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00104.html
>
> Bootstrapped successfully and regression tested on aarch64-none-elf.
>
> OK for trunk?
>
> gcc/
> 2018-11-02  Sam Tebbs  
>
>   * config/aarch64/aarch64-builtins.c (aarch64_builtins): Add
>   AARCH64_PAUTH_BUILTIN_AUTIB1716 and AARCH64_PAUTH_BUILTIN_PACIB1716.
>   * config/aarch64/aarch64-builtins.c (aarch64_init_pauth_hint_builtins):
>   Add autib1716 and pacib1716 initialisation.
>   * config/aarch64/aarch64-builtins.c (aarch64_expand_builtin): Add checks
>   for autib1716 and pacib1716.
>   * config/aarch64/aarch64-protos.h (aarch64_key_type,
>   aarch64_post_cfi_startproc): Define.
>   * config/aarch64/aarch64-protos.h (aarch64_ra_sign_key): Define extern.
>   * config/aarch64/aarch64.c (aarch64_return_address_signing_enabled): Add
>   check for b-key, remove frame.laid_out assertion.
>   * config/aarch64/aarch64.c (aarch64_ra_sign_key,
>   aarch64_post_cfi_startproc, aarch64_handle_pac_ret_b_key): Define.
>   * config/aarch64/aarch64.h (TARGET_ASM_POST_CFI_STARTPROC): Define.
>   * config/aarch64/aarch64.c (aarch64_pac_ret_subtypes): Add "b-key".
>   * config/aarch64/aarch64.md (unspec): Add UNSPEC_AUTIA1716,
>   UNSPEC_AUTIB1716, UNSPEC_AUTIASP, UNSPEC_AUTIBSP, UNSPEC_PACIA1716,
>   UNSPEC_PACIB1716, UNSPEC_PACIASP, UNSPEC_PACIBSP.
>   * config/aarch64/aarch64.md (do_return): Add check for b-key.
>   * config/aarch64/aarch64.md (sp): Add check for
>   signing key and scope selected.
>   * config/aarch64/aarch64.md (1716): Add check for
>   signing key and scope selected.
>   * config/aarch64/aarch64.opt (msign-return-address=): Deprecate.
>   * config/aarch64/iterators.md (PAUTH_LR_SP): Add UNSPEC_AUTIASP,
>   UNSPEC_AUTIBSP, UNSPEC_PACIASP, UNSPEC_PACIBSP.
>   * config/aarch64/iterators.md (PAUTH_17_16): Add UNSPEC_AUTIA1716,
>   UNSPEC_AUTIB1716, UNSPEC_PACIA1716, UNSPEC_PACIB1716.
>   * config/aarch64/iterators.md (pauth_mnem_prefix): Add UNSPEC_AUTIA1716,
>   UNSPEC_AUTIB1716, UNSPEC_PACIA1716, UNSPEC_PACIB1716, UNSPEC_AUTIASP,
>   UNSPEC_AUTIBSP, UNSPEC_PACIASP, UNSPEC_PACIBSP.
>   * config/aarch64/iterators.md (pauth_hint_num_a): Replace
>   UNSPEC_PACI1716 and UNSPEC_AUTI1716 with UNSPEC_PACIA1716 and
>   UNSPEC_AUTIA1716 respectively.
>   * config/aarch64/iterators.md (pauth_hint_num_b): New int attribute.
>
> gcc/testsuite
> 2018-11-02  Sam Tebbs  
>
>   * gcc.target/aarch64/return_address_sign_1.c (dg-final): Replace
>   "autiasp" and "paciasp" with "hint\t29 // autisp" and
>   "hint\t25 // pacisp" respectively.
>   * gcc.target/aarch64/return_address_sign_2.c (dg-final): Replace
>   "paciasp" with "hint\t25 // pacisp".
>   * gcc.target/aarch64/return_address_sign_3.c (dg-final): Replace
>   "paciasp" and "autiasp" with "pacisp" and "autisp" respectively.
>   * gcc.target/aarch64/return_address_sign_b_1.c: New file.
>   * gcc.target/aarch64/return_address_sign_b_2.c: New file.
>   * gcc.target/aarch64/return_address_sign_b_3.c: New file.
>   * gcc.target/aarch64/return_address_sign_b_exception.c: New file.
>   * gcc.target/aarch64/return_address_sign_builtin.c: New file
>
> libgcc/
> 2018-11-02  Sam Tebbs  
>
>   * config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key): New
>   function.
>   * config/aarch64/aarch64-unwind.h (aarch64_post_extract_frame_addr,
>   aarch64_post_frob_eh_handler_addr): Add check for b-key.
>   * unwind-dw2-fde.c (get_cie_encoding): Add check for 'B' in augmentation
>   string.
>   * unwind-dw2.c (extract_cie_info): Add check for 'B' in augmentation
>   string.

Attached is an updated patch rebased on an improvement to the 
-mbranch-protection option documentation.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 8cced94..d676f36 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -398,6 +398,8 @@ enum aarch64_builtins
   /* ARMv8.3-A Pointer Authentication Builtins.  */
   AARCH64_PAUTH_BUILTIN_AUTIA1716,
   AARCH64_PAUTH_BUILTIN_PACIA1716,
+

Re: [PATCH 1/3][GCC] Add new target hook asm_post_cfi_startproc

On 11/02/2018 05:28 PM, Sam Tebbs wrote:

> Hi all,
>
> This patch adds a new target hook called "asm_post_cfi_startproc". This hook is
> intended to be used by the aarch64 backend to emit a directive that enables
> support for unwinding frames signed with the pointer authentication B-key. This
> hook is triggered after the ".cfi_startproc" directive is emitted in
> gcc/dwarf2out.c.
>
> Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf with no
> regressions.
>
> Ok for trunk?
>
> gcc/
> 2018-11-02  Sam Tebbs
>
>   * doc/tm.texi (TARGET_ASM_POST_CFI_STARTPROC): Define.
>   * doc/tm.texi.in (TARGET_ASM_POST_CFI_STARTPROC): Define.
>   * dwarf2out.c (dwarf2out_do_cfi_startproc): Trigger the hook.
>   * hooks.c (hook_void_FILEptr_tree): Define.
>   * hooks.h (hook_void_FILEptr_tree): Define.
>   * target.def (post_cfi_startproc): Define.

CCing global reviewers and dwarf maintainers.
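
For readers skimming the thread, here is a minimal sketch of the shape of
such a hook: a callback that runs right after ".cfi_startproc" is written.
All names below (fn_decl, emit_cfi_startproc, the two hook functions) are
hypothetical stand-ins and not the GCC implementation; the hooks return a
count only so the behaviour is easy to check, whereas the hook in the patch
is described as a void function taking a FILE pointer and a tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for GCC's `tree' function declaration.  */
struct fn_decl
{
  bool signed_with_b_key;
};

/* Hook type: called right after ".cfi_startproc" has been written.
   Returns the number of extra directives it emitted (sketch only).  */
typedef int (*post_cfi_startproc_hook) (FILE *, const struct fn_decl *);

/* Default hook: emit nothing.  */
int
default_post_cfi_startproc (FILE *f, const struct fn_decl *decl)
{
  (void) f;
  (void) decl;
  return 0;
}

/* Hook for a target that marks B-key signed frames.  */
int
b_key_post_cfi_startproc (FILE *f, const struct fn_decl *decl)
{
  if (!decl->signed_with_b_key)
    return 0;
  fputs ("\t.cfi_b_key_frame\n", f);
  return 1;
}

/* Write ".cfi_startproc", then let the target add its own directives.
   Returns the total number of directives written.  */
int
emit_cfi_startproc (FILE *f, const struct fn_decl *decl,
		    post_cfi_startproc_hook hook)
{
  assert (f != NULL && decl != NULL && hook != NULL);
  fputs ("\t.cfi_startproc\n", f);
  return 1 + hook (f, decl);
}
```

Calling emit_cfi_startproc with the B-key hook on a B-key signed function
writes both directives; with the default hook only ".cfi_startproc" appears.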

[PATCH, GCC, AARCH64, 2/6] Add new arch command line features from ARMv8.5-A

Hi

This patch is part of a series that enables ARMv8.5-A in GCC and
adds Branch Target Identification Mechanism.
(https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)

This patch adds all the command line features that are added by ARMv8.5.
Optional extensions to armv8.5-a:
+rng : Random number Generation Instructions.
+memtag : Memory Tagging Extension.

ARMv8.5-A features that are optional to older arch:
+sb : Speculation barrier instruction.
+ssbs: Speculative Store Bypass Safe instruction.
+predres: Execution and Data Prediction Restriction instructions.

All of the above only affect the assembler and have already (or almost,
for a couple of cases) gone into the trunk of binutils.
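
As a rough illustration of how these feature bits combine, the sketch below
mirrors the flag values from this patch in a standalone program. The short
FL_* names, has_feature and apply_modifier are made up for the example, and
the older architecture bits are left out of the armv8.5-a mask for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative copies of the bit positions used in this patch.  */
#define FL_V8_5     (1u << 22)
#define FL_RNG      (1u << 23)
#define FL_MEMTAG   (1u << 24)
#define FL_SB       (1u << 25)
#define FL_SSBS     (1u << 26)
#define FL_PREDRES  (1u << 27)

/* Bits implied by -march=armv8.5-a in this sketch; sb, ssbs and predres
   are on by default, rng and memtag stay optional.  */
#define FL_FOR_ARCH8_5 (FL_V8_5 | FL_SB | FL_SSBS | FL_PREDRES)

/* Return true if the single-bit FEATURE is set in ISA_FLAGS.  */
bool
has_feature (unsigned long isa_flags, unsigned long feature)
{
  /* A feature must be exactly one bit.  */
  assert (feature != 0 && (feature & (feature - 1)) == 0);
  return (isa_flags & feature) != 0;
}

/* Model a "+feature" (ENABLE true) or "+nofeature" (ENABLE false)
   modifier applied to ISA_FLAGS.  */
unsigned long
apply_modifier (unsigned long isa_flags, unsigned long feature, bool enable)
{
  return enable ? (isa_flags | feature) : (isa_flags & ~feature);
}
```

With these definitions, armv8.5-a alone has sb but not rng, while
"armv8.5-a+rng" sets the rng bit and "armv8.5-a+nossbs" clears ssbs.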

Bootstrapped and regression tested with aarch64-none-linux-gnu.

Is this ok for trunk?

Thanks
Sudi

*** gcc/ChangeLog ***

2018-xx-xx  Sudakshina Das  

* config/aarch64/aarch64-option-extensions.def: Define
AARCH64_OPT_EXTENSION for memtag, rng, sb, ssbs and predres.
* gcc/config/aarch64/aarch64.h (AARCH64_FL_RNG): New.
(AARCH64_FL_MEMTAG, AARCH64_FL_SB, AARCH64_FL_SSBS): New.
(AARCH64_FL_PREDRES): New.
(AARCH64_FL_FOR_ARCH8_5): Add AARCH64_FL_SB, AARCH64_FL_SSBS and
AARCH64_FL_PREDRES by default.
* gcc/doc/invoke.texi: Document rng, memtag, sb, ssbs and
predres.

diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 69ab796a4e1a959b89ebb55b599919c442cfb088..ed669a63061ba5e1595840943176077af7e69988 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -108,4 +108,19 @@ AARCH64_OPT_EXTENSION("sve", AARCH64_FL_SVE, AARCH64_FL_FP | AARCH64_FL_SIMD | A
 /* Enabling/Disabling "profile" does not enable/disable any other feature.  */
 AARCH64_OPT_EXTENSION("profile", AARCH64_FL_PROFILE, 0, 0, "")
 
+/* Enabling/Disabling "rng" only changes "rng".  */
+AARCH64_OPT_EXTENSION("rng", AARCH64_FL_RNG, 0, 0, "")
+
+/* Enabling/Disabling "memtag" only changes "memtag".  */
+AARCH64_OPT_EXTENSION("memtag", AARCH64_FL_MEMTAG, 0, 0, "")
+
+/* Enabling/Disabling "sb" only changes "sb".  */
+AARCH64_OPT_EXTENSION("sb", AARCH64_FL_SB, 0, 0, "")
+
+/* Enabling/Disabling "ssbs" only changes "ssbs".  */
+AARCH64_OPT_EXTENSION("ssbs", AARCH64_FL_SSBS, 0, 0, "")
+
+/* Enabling/Disabling "predres" only changes "predres".  */
+AARCH64_OPT_EXTENSION("predres", AARCH64_FL_PREDRES, 0, 0, "")
+
 #undef AARCH64_OPT_EXTENSION
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index b324cdd2fede33af13c03362750401f9eb1c9a90..60325bb1b16c71e951ef18319872e8b0911e8d12 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -172,10 +172,22 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_FL_RCPC8_4 (1 << 20)  /* Has ARMv8.4-a RCPC extensions.  */
 /* ARMv8.5-A architecture extensions.  */
 #define AARCH64_FL_V8_5	  (1 << 22)  /* Has ARMv8.5-A features.  */
+#define AARCH64_FL_RNG	  (1 << 23)  /* ARMv8.5-A Random Number Insns.  */
+#define AARCH64_FL_MEMTAG (1 << 24)  /* ARMv8.5-A Memory Tagging
+	Extensions.  */
 
 /* Statistical Profiling extensions.  */
 #define AARCH64_FL_PROFILE (1 << 21)
 
+/* Speculation Barrier instruction supported.  */
+#define AARCH64_FL_SB	  (1 << 25)
+
+/* Speculative Store Bypass Safe instruction supported.  */
+#define AARCH64_FL_SSBS	  (1 << 26)
+
+/* Execution and Data Prediction Restriction instructions supported.  */
+#define AARCH64_FL_PREDRES (1 << 27)
+
 /* Has FP and SIMD.  */
 #define AARCH64_FL_FPSIMD (AARCH64_FL_FP | AARCH64_FL_SIMD)
 
@@ -195,7 +207,8 @@ extern unsigned aarch64_architecture_version;
   (AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_V8_4 | AARCH64_FL_F16FML \
| AARCH64_FL_DOTPROD | AARCH64_FL_RCPC8_4)
 #define AARCH64_FL_FOR_ARCH8_5			\
-  (AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_V8_5)
+  (AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_V8_5	\
+   | AARCH64_FL_SB | AARCH64_FL_SSBS | AARCH64_FL_PREDRES)
 
 /* Macros to test ISA flags.  */
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0cf568b60dfb0fb260ca3708ea2d7e081d20cc8b..cc7420f3a84f9cd527c582114a9a96f406b63699 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15287,6 +15287,27 @@ Use of this option with architectures prior to Armv8.2-A is not supported.
 @item profile
 Enable the Statistical Profiling extension.  This option is only to enable the
 extension at the assembler level and does not affect code generation.
+@item rng
+Enable the Armv8.5-a Random Number instructions.  This option is only to
+enable the extension at the assembler level and does not affect code
+generation.
+@item memtag
+Enable the Armv8.5-a Memory Tagging Extensions.  This option is only to
+enable the extension at the assembler level and does not affect code
+generation.
+@item sb
+Enable the Armv8-a Speculation Barrier instructi

[PATCH, GCC, AARCH64, 1/6] Enable ARMv8.5-A in gcc

Hi

This patch is part of a series that enables ARMv8.5-A in GCC and
adds Branch Target Identification Mechanism.
(https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)

This patch adds the -march option for armv8.5-a.

Bootstrapped and regression tested with aarch64-none-linux-gnu.
Is this ok for trunk?

Thanks
Sudi


*** gcc/ChangeLog ***

2018-xx-xx  Sudakshina Das  

* config/aarch64/aarch64-arches.def: Define AARCH64_ARCH for
ARMv8.5-A.
* gcc/config/aarch64/aarch64.h (AARCH64_FL_V8_5): New.
(AARCH64_FL_FOR_ARCH8_5, AARCH64_ISA_V8_5): New.
* gcc/doc/invoke.texi: Document ARMv8.5-A.

diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index a37a5553894d6ab1d629017ea204478f69d8773d..7d05cd604093d15f27e5b197803a50c45a260e6e 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -35,5 +35,6 @@ AARCH64_ARCH("armv8.1-a", generic,	 8_1A,	8,  AARCH64_FL_FOR_ARCH8_1)
 AARCH64_ARCH("armv8.2-a", generic,	 8_2A,	8,  AARCH64_FL_FOR_ARCH8_2)
 AARCH64_ARCH("armv8.3-a", generic,	 8_3A,	8,  AARCH64_FL_FOR_ARCH8_3)
 AARCH64_ARCH("armv8.4-a", generic,	 8_4A,	8,  AARCH64_FL_FOR_ARCH8_4)
+AARCH64_ARCH("armv8.5-a", generic,	 8_5A,	8,  AARCH64_FL_FOR_ARCH8_5)
 
 #undef AARCH64_ARCH
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index fa9af26fd40fd23b1c9cd6da9b6300fd77089103..b324cdd2fede33af13c03362750401f9eb1c9a90 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -170,6 +170,8 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_FL_SHA3	  (1 << 18)  /* Has ARMv8.4-a SHA3 and SHA512.  */
 #define AARCH64_FL_F16FML (1 << 19)  /* Has ARMv8.4-a FP16 extensions.  */
 #define AARCH64_FL_RCPC8_4 (1 << 20)  /* Has ARMv8.4-a RCPC extensions.  */
+/* ARMv8.5-A architecture extensions.  */
+#define AARCH64_FL_V8_5	  (1 << 22)  /* Has ARMv8.5-A features.  */
 
 /* Statistical Profiling extensions.  */
 #define AARCH64_FL_PROFILE (1 << 21)
@@ -192,6 +194,8 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_FL_FOR_ARCH8_4			\
   (AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_V8_4 | AARCH64_FL_F16FML \
| AARCH64_FL_DOTPROD | AARCH64_FL_RCPC8_4)
+#define AARCH64_FL_FOR_ARCH8_5			\
+  (AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_V8_5)
 
 /* Macros to test ISA flags.  */
 
@@ -213,6 +217,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_SHA3	   (aarch64_isa_flags & AARCH64_FL_SHA3)
 #define AARCH64_ISA_F16FML	   (aarch64_isa_flags & AARCH64_FL_F16FML)
 #define AARCH64_ISA_RCPC8_4	   (aarch64_isa_flags & AARCH64_FL_RCPC8_4)
+#define AARCH64_ISA_V8_5	   (aarch64_isa_flags & AARCH64_FL_V8_5)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 06a00a29de73aa509b6a15ebb34dfc182cf94cd2..c76c4fc223f9c46e517213eb6ad292c70aa1c89f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15097,8 +15097,11 @@ more feature modifiers.  This option has the form
 @option{-march=@var{arch}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}.
 
 The permissible values for @var{arch} are @samp{armv8-a},
-@samp{armv8.1-a}, @samp{armv8.2-a}, @samp{armv8.3-a} or @samp{armv8.4-a}
-or @var{native}.
+@samp{armv8.1-a}, @samp{armv8.2-a}, @samp{armv8.3-a}, @samp{armv8.4-a},
+@samp{armv8.5-a} or @var{native}.
+
+The value @samp{armv8.5-a} implies @samp{armv8.4-a} and enables compiler
+support for the ARMv8.5-A architecture extensions.
 
 The value @samp{armv8.4-a} implies @samp{armv8.3-a} and enables compiler
 support for the ARMv8.4-A architecture extensions.

[PATCH], Remove power9 fusion support

2018-11-02 Thread Michael Meissner

As I discussed in my 2018 Cauldron talk, the PowerPC GCC compiler supported a
subset of the original design for fusion in the power9 hardware using peepholes
to fuse together ADDIS instructions and floating point load/store operations.

However, while fusion was part of the original power9 design, by the time the
machine came out, the fusion support was no longer part of the architecture.

This patch removes all of the so-called power9 fusion support for the GCC
compiler.  It leaves -mpower9-fusion as a deprecated switch in case somebody
used it (the switch was never documented).  If you do -mcpu=power9, it will
turn off the power8 fusion support.  If you do -mcpu=power8 and -mtune=power9,
it will also turn off the power8 fusion support.
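
The option interaction described above can be sketched as a small
predicate. The enum and the function name are hypothetical, and the sketch
assumes power8 fusion is on by default only when compiling for power8; it is
not the logic in rs6000_option_override_internal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical processor identifiers for this sketch.  */
enum proc { PROC_POWER7, PROC_POWER8, PROC_POWER9 };

/* Return true if power8 fusion stays enabled for -mcpu=CPU -mtune=TUNE.
   Assumption: fusion defaults to on only for -mcpu=power8.  */
bool
power8_fusion_p (enum proc cpu, enum proc tune)
{
  assert (cpu <= PROC_POWER9 && tune <= PROC_POWER9);

  /* Compiling or tuning for power9 turns power8 fusion off.  */
  if (cpu == PROC_POWER9 || tune == PROC_POWER9)
    return false;

  return cpu == PROC_POWER8;
}
```

So -mcpu=power8 keeps fusion, while -mcpu=power9 and
-mcpu=power8 -mtune=power9 both disable it.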

I have done a bootstrap on a power9 system and there were no regressions in the
tests, other than the two tests for power9 fusion that are now deleted as part
of this patch.  I have also done runs of the Spec 2006 benchmark on the same
machine with and without the patches.  The CactusADM benchmark gets a slight
bump (1.6%) if it doesn't do the power9 fusion.

Can I check these changes into the GCC 9 trunk?

[gcc]
2018-11-02  Michael Meissner  

* config/rs6000/constraints.md (wF constraint): Only document the
wF constraint for power8 fusion.  Remove documentation for power9
fusion.
* config/rs6000/predicates.md (p9_fusion_reg_operand): Delete, the
predicate only used for power9 fusion support.
(fusion_gpr_addis): Drop support to allow fusing offsets where the
top 11 bits weren't all 0 or all 1's.
(fusion_gpr_mem_load): Add comment about not allowing SFmode or
DFmode in power8 fusion.
(fusion_addis_mem_combo_load): Drop power9 fusion support.  Only
support power8 fusion.  Add comment about not allowing SFmode or
DFmode in power8 fusion.
(fusion_offsettable_mem_operand): Delete, the predicate only used
for power9 fusion support.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_NO_FUSION): New
option masks that is all of the ISA 2.07 bits except for fusion.
(ISA_2_7_MASKS_SERVER): Add fusion bits back in.
(ISA_3_0_MASKS_SERVER): Delete power9 fusion.  Use ISA 2.07 bits
without enabling power8 fusion.
(POWERPC_MASKS): Delete power9 fusion option mask.
* config/rs6000/rs6000-protos.h (emit_fusion_load_store): Delete
function declarations used for power9 fusion.
(fusion_p9_p): Likewise.
(expand_fusion_p9_load): Likewise.
(expand_fusion_p9_store): Likewise.
(emit_fusion_p9_load): Likewise.
(emit_fusion_p9_store): Likewise.
* config/rs6000/rs6000.c (rs6000_debug_reg_global): Delete power9
fusion debug print.
(rs6000_option_override_internal): Delete power9 fusion option
support.  If we do -mcpu=power8 -mtune=power9, turn off power8
fusion.
(rs6000_opt_masks): Delete power9 fusion option.
(emit_fusion_load): Rename function from emit_fusion_load_store.
Make the function static.  Delete support for fusing stores, since
we are deleting power9 fusion.
(fusion_p9_p): Delete power9 fusion support functions.
(expand_fusion_p9_load): Likewise.
(expand_fusion_p9_store): Likewise.
(emit_fusion_p9_load): Likewise.
(emit_fusion_p9_store): Likewise.
* config/rs6000/rs6000.h: Delete comment about power9 fusion.
* config/rs6000/rs6000.md (UNSPEC_FUSION_P9): Delete, no longer
used since power9 fusion support has been deleted.
(GPR_FUSION iterator): Likewise.
(FPR_FUSION iterator): Likewise.
(power9 fusion peephole2's): Likewise.
(fusion_gpr___load): Likewise.
(fusion_gpr___store): Likewise.
(fusion_vsx___load): Likewise.
(fusion_vsx___store): Likewise.
(fusion_p9__constant): Likewise.
* config/rs6000/rs6000.opt (-mpower9-fusion): Mark as deprecated.
* doc/md.texi (PowerPC constraints): Update wF documentation.

[gcc/testsuite]
2018-11-02  Michael Meissner  

* gcc.target/powerpc/fusion3.c: Delete power9 fusion.
* gcc.target/powerpc/fusion4.c: Likewise.


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 265537)
+++ gcc/config/rs6000/rs6000.md (.../gcc/config/rs6000) (working copy)
@@ -136,7 +136,6 @@ (define_c_enum "unspec"
UNSPEC_LSQ
UNSPEC_FUSION_GPR
UNSPEC_STACK_CHECK
-   UNSPEC_FUSION_P9
UNSPEC_ADD_ROUND_TO_ODD
UNSPEC_SUB_ROUND_TO_ODD
UNSPEC_MUL_ROUND_TO_ODD
@@ -349,19 +348,6 @@ (define_mode_iterator HSI [HI SI])
 ; SImode

[PATCH, GCC, AARCH64, 3/6] Restrict indirect tail calls to x16 and x17

Hi

This patch is part of a series that enables ARMv8.5-A in GCC and
adds Branch Target Identification Mechanism.
(https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)

This patch changes the registers that are allowed for indirect tail
calls. We are choosing to restrict these to only x16 or x17.

Indirect tail calls are special in that they convert a call
(BLR instruction) into a jump (BR instruction). To make the best
possible use of the Branch Target Identification Mechanism, we
would like to place a "BTI C" (call) at the beginning of the function,
which is only compatible with BLRs and BR X16/X17. To make
indirect tail calls compatible with this scenario, we are restricting
the TAILCALL_ADDR_REGS.
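
To make the register-class change concrete, here is a small sketch of how a
class containing only x16 and x17 maps to a bitmask. It assumes that general
register xN is bit N of the first 32-bit word of the class mask; the
function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: general register xN is bit N of the
   first 32-bit word of a register-class mask.  */
uint32_t
reg_bit (unsigned regno)
{
  assert (regno < 32);
  return UINT32_C (1) << regno;
}

/* Mask for a class that holds only x16 and x17.  */
uint32_t
tailcall_addr_regs_mask (void)
{
  return reg_bit (16) | reg_bit (17);
}

/* Return nonzero if REGNO may hold an indirect tail-call address.  */
int
tailcall_addr_reg_p (unsigned regno)
{
  return regno < 32 && (tailcall_addr_regs_mask () & reg_bit (regno)) != 0;
}
```

Under that assumption the mask works out to 0x00030000, and registers such
as x12 or x18 are rejected as tail-call address registers.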

To free up x16/x17 for this purpose, we also had to change how the
epilogue/prologue handling uses these registers. The epilogue and
prologue now use x12 and x13, named EP0_REGNUM and EP1_REGNUM, as
scratch registers.

Bootstrapped and regression tested with aarch64-none-linux-gnu. Updated
the test. Ran Spec2017 and saw no performance hit.

Is this ok for trunk?

Thanks
Sudi


*** gcc/ChangeLog ***

2018-xx-xx  Sudakshina Das  

  * config/aarch64/aarch64.c (aarch64_expand_prologue): Use new
  epilogue/prologue scratch registers EP0_REGNUM and EP1_REGNUM.
  (aarch64_expand_epilogue): Likewise.
  (aarch64_output_mi_thunk): Likewise.
  * config/aarch64/aarch64.h (REG_CLASS_CONTENTS): Change
  TAILCALL_ADDR_REGS to x16 and x17.
  * config/aarch64/aarch64.md: Define EP0_REGNUM and EP1_REGNUM.

*** gcc/testsuite/ChangeLog ***

2018-xx-xx  Sudakshina Das  

  * gcc.target/aarch64/test_frame_17.c: Update to check for
  EP0_REGNUM instead of IP0_REGNUM and add test case.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 94184049c9c77d858fd5b3e2a8970a48b70f7529..8e7a8d54351cf7eb1774a474bfbfbebf58070e31 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -579,7 +579,7 @@ enum reg_class
 #define REG_CLASS_CONTENTS		\
 {	\
   { 0x, 0x, 0x },	/* NO_REGS */		\
-  { 0x0004, 0x, 0x },	/* TAILCALL_ADDR_REGS */\
+  { 0x0003, 0x, 0x },	/* TAILCALL_ADDR_REGS */\
   { 0x7fff, 0x, 0x0003 },	/* GENERAL_REGS */	\
   { 0x8000, 0x, 0x },	/* STACK_REG */		\
   { 0x, 0x, 0x0003 },	/* POINTER_REGS */	\
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 27f81b654a2bae3ddd87b99e4b7926cc588a95f5..f9a81f1734e6885662f6a9e6c97bdbcdac24211b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5317,8 +5317,8 @@ aarch64_expand_prologue (void)
 	aarch64_emit_probe_stack_range (get_stack_check_protect (), frame_size);
 }
 
-  rtx ip0_rtx = gen_rtx_REG (Pmode, IP0_REGNUM);
-  rtx ip1_rtx = gen_rtx_REG (Pmode, IP1_REGNUM);
+  rtx tmp0_rtx = gen_rtx_REG (Pmode, EP0_REGNUM);
+  rtx tmp1_rtx = gen_rtx_REG (Pmode, EP1_REGNUM);
 
   /* In theory we should never have both an initial adjustment
  and a callee save adjustment.  Verify that is the case since the
@@ -5328,7 +5328,7 @@ aarch64_expand_prologue (void)
   /* Will only probe if the initial adjustment is larger than the guard
  less the amount of the guard reserved for use by the caller's
  outgoing args.  */
-  aarch64_allocate_and_probe_stack_space (ip0_rtx, ip1_rtx, initial_adjust,
+  aarch64_allocate_and_probe_stack_space (tmp0_rtx, tmp1_rtx, initial_adjust,
 	  true, false);
 
   if (callee_adjust != 0)
@@ -5346,7 +5346,7 @@ aarch64_expand_prologue (void)
 	}
   aarch64_add_offset (Pmode, hard_frame_pointer_rtx,
 			  stack_pointer_rtx, callee_offset,
-			  ip1_rtx, ip0_rtx, frame_pointer_needed);
+			  tmp1_rtx, tmp0_rtx, frame_pointer_needed);
   if (frame_pointer_needed && !frame_size.is_constant ())
 	{
 	  /* Variable-sized frames need to describe the save slot
@@ -5388,7 +5388,7 @@ aarch64_expand_prologue (void)
 
   /* We may need to probe the final adjustment if it is larger than the guard
  that is assumed by the called.  */
-  aarch64_allocate_and_probe_stack_space (ip1_rtx, ip0_rtx, final_adjust,
+  aarch64_allocate_and_probe_stack_space (tmp1_rtx, tmp0_rtx, final_adjust,
 	  !frame_pointer_needed, true);
 }
 
@@ -5426,8 +5426,8 @@ aarch64_expand_epilogue (bool for_sibcall)
   unsigned reg2 = cfun->machine->frame.wb_candidate2;
   rtx cfi_ops = NULL;
   rtx_insn *insn;
-  /* A stack clash protection prologue may not have left IP0_REGNUM or
- IP1_REGNUM in a usable state.  The same is true for allocations
+  /* A stack clash protection prologue may not have left EP0_REGNUM or
+ EP1_REGNUM in a usable state.  The same is true for allocations
  with an SVE component, since we then need both temporary registers
  for each all

[PATCH, GCC, AARCH64, 4/6] Enable BTI: Add new to -mbranch-protection.

Hi

This patch is part of a series that enables ARMv8.5-A in GCC and
adds Branch Target Identification Mechanism.
(https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)

NOTE: This patch is dependent on Sam Tebbs patch to deprecate
-msign-return-address and add new -mbranch-protection option
https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00104.html

This patch updates the CLI of -mbranch-protection to add "bti" as a new
type of branch protection and also adds it to the definitions of "none"
and "standard". Since the BTI instructions, just like the return address
signing instructions, are in the HINT space, this option is not limited
to the ARMv8.5-A architecture version.

The option does not do anything functional on its own; the functional
changes are in the next patch. I am initializing the target variable
aarch64_enable_bti to 2 because I am also adding a configure option in a
later patch, and a value different from 0 and 1 helps identify whether
it has already been updated.
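
A minimal sketch of the tri-state idea follows. How the later configure
patch consumes the value is not shown in this message, so resolve_bti and
the BTI_* names are assumptions made purely for illustration.

```c
#include <assert.h>

/* Hypothetical names; 2 stands for "not set on the command line".  */
enum { BTI_OFF = 0, BTI_ON = 1, BTI_UNSET = 2 };

/* Resolve the final setting: an explicit command-line choice wins,
   otherwise fall back to CONFIGURE_DEFAULT (0 or 1).  */
int
resolve_bti (int cmdline_value, int configure_default)
{
  assert (configure_default == BTI_OFF || configure_default == BTI_ON);
  if (cmdline_value == BTI_UNSET)
    return configure_default;
  return cmdline_value;
}

/* Mirrors the check in this patch: only an explicit 1 counts as
   enabled, so the initial value 2 reads as "not enabled".  */
int
bti_enabled_p (int value)
{
  return value == BTI_ON;
}
```

The point of the third value is that -mbranch-protection=none (0) can be
told apart from "the user said nothing" (2), so a configure-time default
would not override an explicit request.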

Bootstrapped and regression tested with aarch64-none-linux-gnu.
Is this ok for trunk?

Thanks
Sudi


*** gcc/ChangeLog ***

2018-xx-xx  Sudakshina Das  

* config/aarch64/aarch64-protos.h (aarch64_bti_enabled):
Declare.
* config/aarch64/aarch64.c
(aarch64_handle_no_branch_protection): Disable bti for
-mbranch-protection=none.
(aarch64_handle_standard_branch_protection): Enable bti for
-mbranch-protection=standard.
(aarch64_handle_bti_protection): Enable bti for "bti" in the
string to -mbranch-protection.
(aarch64_bti_enabled): Check if bti is enabled.
* config/aarch64/aarch64.opt: Declare target variable.
* doc/invoke.texi: Add bti to the -mbranch-protection
documentation.


diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index bba8204fa53083da49d00a8c2b29e62849bd233c..a5ccfe534b6c59c90bd91215f89c59d67fd88688 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -525,6 +525,7 @@ void aarch64_register_pragmas (void);
 void aarch64_relayout_simd_types (void);
 void aarch64_reset_previous_fndecl (void);
 bool aarch64_return_address_signing_enabled (void);
+bool aarch64_bti_enabled (void);
 void aarch64_save_restore_target_globals (tree);
 void aarch64_addti_scratch_regs (rtx, rtx, rtx *,
  rtx *, rtx *,
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 039aec828d7dae60918493abb0d044001ac0b366..836275ab58de894529a72be88ff226da503598dc 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1140,6 +1140,7 @@ static enum aarch64_parse_opt_result
 aarch64_handle_no_branch_protection (char* str ATTRIBUTE_UNUSED, char* rest)
 {
   aarch64_ra_sign_scope = AARCH64_FUNCTION_NONE;
+  aarch64_enable_bti = 0;
   if (rest)
 {
   error ("unexpected %<%s%> after %<%s%>", rest, str);
@@ -1154,6 +1155,7 @@ aarch64_handle_standard_branch_protection (char* str ATTRIBUTE_UNUSED,
 {
   aarch64_ra_sign_scope = AARCH64_FUNCTION_NON_LEAF;
   aarch64_ra_sign_key = AARCH64_KEY_A;
+  aarch64_enable_bti = 1;
   if (rest)
 {
   error ("unexpected %<%s%> after %<%s%>", rest, str);
@@ -1187,6 +1189,14 @@ aarch64_handle_pac_ret_b_key (char* str ATTRIBUTE_UNUSED,
   return AARCH64_PARSE_OK;
 }
 
+static enum aarch64_parse_opt_result
+aarch64_handle_bti_protection (char* str ATTRIBUTE_UNUSED,
+char* rest ATTRIBUTE_UNUSED)
+{
+  aarch64_enable_bti = 1;
+  return AARCH64_PARSE_OK;
+}
+
 static const struct aarch64_branch_protec_type aarch64_pac_ret_subtypes[] = {
   { "leaf", aarch64_handle_pac_ret_leaf, NULL, 0 },
   { "b-key", aarch64_handle_pac_ret_b_key, NULL, 0 },
@@ -1198,6 +1208,7 @@ static const struct aarch64_branch_protec_type aarch64_branch_protec_types[] = {
   { "standard", aarch64_handle_standard_branch_protection, NULL, 0 },
   { "pac-ret", aarch64_handle_pac_ret_protection, aarch64_pac_ret_subtypes,
 sizeof (aarch64_pac_ret_subtypes) / sizeof (aarch64_branch_protec_type) },
+  { "bti", aarch64_handle_bti_protection, NULL, 0 },
   { NULL, NULL, NULL, 0 }
 };
 
@@ -4581,6 +4592,13 @@ aarch64_return_address_signing_enabled (void)
 	  && cfun->machine->frame.reg_offset[LR_REGNUM] >= 0));
 }
 
+/* Return TRUE if Branch Target Identification Mechanism is enabled.  */
+bool
+aarch64_bti_enabled (void)
+{
+  return (aarch64_enable_bti == 1);
+}
+
 /* Emit code to save the callee-saved registers from register number START
to LIMIT to the stack at the location starting at offset START_OFFSET,
skipping any write-back candidates if SKIP_WB is true.  */
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 9460636d93b67af1525f028176aa78e6fed4e45f..fc2064bd688490765b977eca777245986274d268 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -33,6 +33,9 @@ const char *x_aarch64_override_tune_string
 TargetVariable

[PATCH, GCC, AARCH64, 5/6] Enable BTI : Add new pass for BTI.

Hi

This patch is part of a series that enables ARMv8.5-A in GCC and
adds Branch Target Identification Mechanism.
(https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)

This patch adds a new pass called "bti" which is triggered by the
command line argument -mbranch-protection whenever "bti" is turned on.

The pass iterates through the instructions and adds appropriate BTI
instructions based on the following:
* Add a new "BTI C" at the beginning of a function, unless it's already
  protected by a "PACIASP/PACIBSP". We exempt the functions that are
  only called directly.
* Add a new "BTI J" for every target of an indirect jump, jump table
  targets, non-local goto targets or labels that might be referenced
  by variables, constant pools, etc (NOTE_INSN_DELETED_LABEL)

Since we have already changed the use of indirect tail calls to only x16 
and x17, we do not have to use "BTI JC".
(check patch 3/6).
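
To make the two rules above concrete, here is a small C sketch of the kinds of code the pass is concerned with. All function names are made up for illustration, and the comments restate what the description above implies rather than verified compiler output; in particular, whether a switch is actually lowered to a jump table is up to the compiler.

```c
/* Its address is taken in run_example, so it may be reached through an
   indirect call; by the first rule it would need a "BTI C" at entry
   (unless a PACIASP/PACIBSP is already there).  */
int
add_one (int x)
{
  return x + 1;
}

/* Static and only ever called directly: the kind of function the
   description says is exempted.  */
static int
twice (int x)
{
  return 2 * x;
}

typedef int (*unary_fn) (int);

/* The call through FN is an indirect call.  */
int
apply (unary_fn fn, int x)
{
  return fn (x);
}

/* A dense switch may be compiled to a jump table; if it is, each case
   label becomes a jump table target and would need a "BTI J".  */
int
classify (int c)
{
  switch (c)
    {
    case 0: return twice (10);
    case 1: return twice (11);
    case 2: return twice (12);
    case 3: return twice (13);
    case 4: return twice (14);
    default: return -1;
    }
}

/* Takes the address of add_one and calls it indirectly.  */
int
run_example (void)
{
  return apply (add_one, 41);
}
```

Compiling something like this with -mbranch-protection=bti and reading the assembly is one way to check where the landing pads end up.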

Bootstrapped and regression tested with aarch64-none-linux-gnu. Added 
new tests.
Is this ok for trunk?

Thanks
Sudi

*** gcc/ChangeLog ***

2018-xx-xx  Sudakshina Das  
Ramana Radhakrishnan  

* config.gcc (aarch64*-*-*): Add aarch64-bti-insert.o.
* gcc/config/aarch64/aarch64.h: Update comment for
TRAMPOLINE_SIZE.
* config/aarch64/aarch64.c (aarch64_asm_trampoline_template):
Update if bti is enabled.
* config/aarch64/aarch64-bti-insert.c: New file.
* config/aarch64/aarch64-passes.def (INSERT_PASS_BEFORE): Insert
bti pass.
* config/aarch64/aarch64-protos.h (make_pass_insert_bti):
Declare the new bti pass.
* config/aarch64/aarch64.md (bti_nop): Define.
* config/aarch64/t-aarch64: Add rule for aarch64-bti-insert.o.

*** gcc/testsuite/ChangeLog ***

2018-xx-xx  Sudakshina Das  

* gcc.target/aarch64/bti-1.c: New test.
* gcc.target/aarch64/bti-2.c: New test.
* lib/target-supports.exp
(check_effective_target_aarch64_bti_hw): Add new check for
BTI hw.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index b108697cfc7b1c9c6dc1f30cca6fd1158182c29e..3e77f9df6ad6ca55fccca50387eab4b2501af647 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -317,7 +317,7 @@ aarch64*-*-*)
 	c_target_objs="aarch64-c.o"
 	cxx_target_objs="aarch64-c.o"
 	d_target_objs="aarch64-d.o"
-	extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o"
+	extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o aarch64-bti-insert.o"
 	target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.c"
 	target_has_targetm_common=yes
 	;;
diff --git a/gcc/config/aarch64/aarch64-bti-insert.c b/gcc/config/aarch64/aarch64-bti-insert.c
new file mode 100644
index 0000000000000000000000000000000000000000..efd57620d8803302e03ca643b9f2495e188dc19b
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-bti-insert.c
@@ -0,0 +1,195 @@
+/* Branch Target Identification for AArch64 architecture.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   Contributed by Arm Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#define INCLUDE_STRING
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "gimple.h"
+#include "tm_p.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "emit-rtl.h"
+#include "gimplify.h"
+#include "gimple-iterator.h"
+#include "dumpfile.h"
+#include "rtl-iter.h"
+#include "cfgrtl.h"
+#include "tree-pass.h"
+#include "cgraph.h"
+
+namespace {
+
+const pass_data pass_data_insert_bti =
+{
+  RTL_PASS, /* type.  */
+  "bti", /* name.  */
+  OPTGROUP_NONE, /* optinfo_flags.  */
+  TV_MACH_DEP, /* tv_id.  */
+  0, /* properties_required.  */
+  0, /* properties_provided.  */
+  0, /* properties_destroyed.  */
+  0, /* todo_flags_start.  */
+  0, /* todo_flags_finish.  */
+};
+
+/* Check if X (or any sub-rtx of X) is a PACIASP/PACIBSP instruction.  */
+static bool
+aarch64_pac_insn_p (rtx x)
+{
+  if (!INSN_P (x))
+    return false;
+
+  subrtx_var_iterator::array_type array;
+  FOR_EACH_SUBRTX_VAR (iter, array, PATTERN (x), ALL)
+{
+  rtx

[PATCH, GCC, AARCH64, 6/6] Enable BTI: Add configure option for BTI and PAC-RET

Hi

This patch is part of a series that enables ARMv8.5-A in GCC and
adds Branch Target Identification Mechanism.
(https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)

This patch adds a new configure option for enabling BTI and return
address signing by default with --enable-standard-branch-protection.
This is equivalent to -mbranch-protection=standard which would
imply -mbranch-protection=pac-ret+bti.
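
As a rough illustration of how "standard" relates to "pac-ret+bti", and of how a configure-time default only applies when no option is given, here is a simplified C model. The types and function names (bp_state, parse_branch_protection, effective_branch_protection) are hypothetical and are not GCC's; the model also leaves out the "b-key" subtype and the ILP32 check that appear in the patches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the option state; these are not GCC's real
   types or variables.  */
enum ra_scope { RA_NONE, RA_NON_LEAF, RA_ALL };

struct bp_state
{
  enum ra_scope ra_scope;  /* return address signing scope */
  bool bti;                /* branch target identification on/off */
};

/* Return true if the LEN-byte token at TOK spells NAME.  */
static bool
token_is (const char *tok, size_t len, const char *name)
{
  return strlen (name) == len && strncmp (tok, name, len) == 0;
}

/* Parse a -mbranch-protection= style string into ST.  Returns false on
   a token this model does not know; ST may then be partly updated.  */
bool
parse_branch_protection (const char *str, struct bp_state *st)
{
  st->ra_scope = RA_NONE;
  st->bti = false;

  if (strcmp (str, "none") == 0)
    return true;
  if (strcmp (str, "standard") == 0)
    {
      /* "standard" stands for pac-ret+bti.  */
      st->ra_scope = RA_NON_LEAF;
      st->bti = true;
      return true;
    }

  bool after_pac_ret = false;
  for (const char *p = str;;)
    {
      size_t len = strcspn (p, "+");
      if (token_is (p, len, "pac-ret"))
        {
          st->ra_scope = RA_NON_LEAF;
          after_pac_ret = true;
        }
      else if (after_pac_ret && token_is (p, len, "leaf"))
        st->ra_scope = RA_ALL;  /* subtype: sign leaf functions too */
      else if (token_is (p, len, "bti"))
        {
          st->bti = true;
          after_pac_ret = false;
        }
      else
        return false;

      if (p[len] == '\0')
        return true;
      p += len + 1;
    }
}

/* OPT is the command-line string, or NULL when the option was not
   given; in that case the configure-time default decides.  */
bool
effective_branch_protection (const char *opt, bool configured_standard,
                             struct bp_state *st)
{
  if (opt == NULL)
    opt = configured_standard ? "standard" : "none";
  return parse_branch_protection (opt, st);
}
```

In this model an explicit option always overrides the configured default, which matches the intent stated above that the configure switch only changes what happens "by default".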

Bootstrapped and regression tested with aarch64-none-linux-gnu with
and without the configure option turned on.
Also tested on aarch64-none-elf with and without configure option with a
BTI enabled aem. Only 2 regressions and these were because newlib
requires patches to protect hand coded libraries with BTI.

Is this ok for trunk?

Thanks
Sudi

*** gcc/ChangeLog ***

2018-xx-xx  Sudakshina Das  

* config/aarch64/aarch64.c (aarch64_override_options): Add case to check
configure option to set BTI and Return Address Signing.
* configure.ac: Add --enable-standard-branch-protection and
--disable-standard-branch-protection.
* configure: Regenerated.
* doc/install.texi: Document the same.

*** gcc/testsuite/ChangeLog ***

2018-xx-xx  Sudakshina Das  

* gcc.target/aarch64/bti-1.c: Update test to not add command
line option when configure with bti.
* gcc.target/aarch64/bti-2.c: Likewise.
* lib/target-supports.exp
(check_effective_target_default_branch_protection):
Add configure check for --enable-standard-branch-protection.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 12a55a640de4fdc5df21d313c7ea6841f1daf3f2..a1a5b7b464eaa2ce67ac66d9aea837159590aa07 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11558,6 +11558,26 @@ aarch64_override_options (void)
   if (!selected_tune)
 selected_tune = selected_cpu;
 
+  if (aarch64_enable_bti == 2)
+{
+#ifdef TARGET_ENABLE_BTI
+  aarch64_enable_bti = 1;
+#else
+  aarch64_enable_bti = 0;
+#endif
+}
+
+  /* No command-line option yet.  */
+  if (accepted_branch_protection_string == NULL && !TARGET_ILP32)
+{
+#ifdef TARGET_ENABLE_PAC_RET
+  aarch64_ra_sign_scope = AARCH64_FUNCTION_NON_LEAF;
+  aarch64_ra_sign_key = AARCH64_KEY_A;
+#else
+  aarch64_ra_sign_scope = AARCH64_FUNCTION_NONE;
+#endif
+}
+
 #ifndef HAVE_AS_MABI_OPTION
   /* The compiler may have been configured with 2.23.* binutils, which does
  not have support for ILP32.  */
diff --git a/gcc/configure b/gcc/configure
index 03461f1e27538a3a0791c2b61b0e75c3ff1a25be..a0f95106c22ee858bbf4516f14cd9d265dede272 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -947,6 +947,7 @@ with_plugin_ld
 enable_gnu_indirect_function
 enable_initfini_array
 enable_comdat
+enable_standard_branch_protection
 enable_fix_cortex_a53_835769
 enable_fix_cortex_a53_843419
 with_glibc_version
@@ -1677,6 +1678,14 @@ Optional Features:
   --enable-initfini-array	use .init_array/.fini_array sections
   --enable-comdat enable COMDAT group support
 
+  --enable-standard-branch-protection
+  enable Branch Target Identification Mechanism and
+  Return Address Signing by default for AArch64
+  --disable-standard-branch-protection
+  disable Branch Target Identification Mechanism and
+  Return Address Signing by default for AArch64
+
+
   --enable-fix-cortex-a53-835769
   enable workaround for AArch64 Cortex-A53 erratum
   835769 by default
@@ -18529,7 +18538,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18532 "configure"
+#line 18541 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18635,7 +18644,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18638 "configure"
+#line 18647 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -24939,6 +24948,25 @@ $as_echo "#define HAVE_AS_SMALL_PIC_RELOCS 1" >>confdefs.h
 
 fi
 
+# Enable Branch Target Identification Mechanism and Return Address
+# Signing by default.
+# Check whether --enable-standard-branch-protection was given.
+if test "${enable_standard_branch_protection+set}" = set; then :
+  enableval=$enable_standard_branch_protection;
+case $enableval in
+  yes)
+tm_defines="${tm_defines} TARGET_ENABLE_BTI=1 TARGET_ENABLE_PAC_RET=1"
+;;
+  no)
+;;
+  *)
+as_fn_error "'$enableval' is an invalid value for --enable-standard-branch-protection.\
+  Valid choices are 'yes' and 'no'." "$LINENO" 5
+;;
+esac
+
+fi
+
 # Enable default workaround for AArch64 Cortex-A53 erratum 835769.
 # Check whether --enable-fix-cortex-

Re: [PATCH, GCC, AARCH64, 1/6] Enable ARMv8.5-A in gcc

Hi

On 02/11/18 18:37, Sudakshina Das wrote:
> Hi
> 
> This patch is part of a series that enables ARMv8.5-A in GCC and
> adds Branch Target Identification Mechanism.
> (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)
>  
> 
> 
> This patch adds the march option for armv8.5-a.
> 
> Bootstrapped and regression tested with aarch64-none-linux-gnu.
> Is this ok for trunk?
> 
> Thanks
> Sudi
> 
> 
> *** gcc/ChangeLog ***
> 
> 2018-xx-xx  Sudakshina Das  
> 
>  * config/aarch64/aarch64-arches.def: Define AARCH64_ARCH for
>  ARMv8.5-A.
>  * gcc/config/aarch64/aarch64.h (AARCH64_FL_V8_5): New.
>  (AARCH64_FL_FOR_ARCH8_5, AARCH64_ISA_V8_5): New.
>  * gcc/doc/invoke.texi: Document ARMv8.5-A.
> 

As per an offline chat earlier with Richard, I was supposed to send
future patch series as a reply on a single thread. Sadly I forgot to
do that this time. So I am adding links of the other patches here to
make it easy to link the series:

[PATCH, GCC, AARCH64, 2/6] Add new arch command line features from
ARMv8.5-A : https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00111.html

[PATCH, GCC, AARCH64, 3/6] Restrict indirect tail calls to x16 and x17:
https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00113.html

[PATCH, GCC, AARCH64, 4/6] Enable BTI: Add new  to 
-mbranch-protection: 
https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00114.html

[PATCH, GCC, AARCH64, 5/6] Enable BTI : Add new pass for BTI:
https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00115.html

[PATCH, GCC, AARCH64, 6/6] Enable BTI: Add configure option for BTI and 
PAC-RET: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00116.html


Sorry!
Sudi

Re: [PATCH, testsuite] test case fixes for pdp11

2018-11-02 Thread Rainer Orth

Hi Paul,

> This patch fixes a number of test case failures on pdp11.  Some are too large 
> for the address space, some have dependencies on the float format that don't 
> match the DEC format, some add pdp11 to the targets that expect particular 
> compiler messages.

unfortunately, even apart from the two bugs in your patch Andreas
already fixed, there are more problems: with the patch one gets 20
warnings

WARNING: compat.exp does not support dg-skip-if

in mail-report.log for gcc.dg/compat.  While the message is misleading
and it took me a moment to understand what's wrong, you should have
found this in your testing.  A good way to make sure something like this
doesn't go unnoticed in a regtest is to run make mail-report.log in a
vanilla and a patched tree and compare the output.  Those WARNING and
ERROR lines are prominent there for a reason ;-)

While we give target maintainers quite a bit of leeway to apply
testsuite patches affecting only their targets, this needs to be
exercised with caution.  Best test the modified testsuite on a different
target, too, to check that it doesn't break there.

Besides, a bit more detail on the failures you observe without your
patch would have been helpful.  I noticed that some of the tests you
change already have dg-skip-if directives for avr with a comment of
"Program too big".  It's hard to tell if this is the same issue as your
"limited code space".  If so, it would be advisable and much more
expressive to introduce (yet another) effective-target keyword for this.

The problem is that in the g??.dg/compat testsuites, dg keywords except
for dg-options are only supposed to go into the *_main.c file.  The
following patch fixes this.  Tested on i386-pc-solaris2.11 and
sparc-sun-solaris2.11, installed on mainline.
 
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-11-02  Rainer Orth  

gcc/testsuite:
* gcc.dg/compat/pr83487-1_y.c: Move dg-skip-if ...
* gcc.dg/compat/pr83487-1_main.c: ... here.
* gcc.dg/compat/struct-by-value-10_main.c,
gcc.dg/compat/struct-by-value-10_x.c,
gcc.dg/compat/struct-by-value-11_main.c,
gcc.dg/compat/struct-by-value-11_x.c,
gcc.dg/compat/struct-by-value-12_main.c,
gcc.dg/compat/struct-by-value-12_x.c,
gcc.dg/compat/struct-by-value-13_main.c,
gcc.dg/compat/struct-by-value-13_x.c,
gcc.dg/compat/struct-by-value-14_main.c,
gcc.dg/compat/struct-by-value-14_x.c,
gcc.dg/compat/struct-by-value-15_main.c,
gcc.dg/compat/struct-by-value-15_x.c,
gcc.dg/compat/struct-by-value-17_main.c,
gcc.dg/compat/struct-by-value-17_x.c,
gcc.dg/compat/struct-by-value-18_main.c,
gcc.dg/compat/struct-by-value-18_x.c,
gcc.dg/compat/struct-by-value-2_main.c,
gcc.dg/compat/struct-by-value-2_x.c,
gcc.dg/compat/struct-by-value-22_main.c,
gcc.dg/compat/struct-by-value-22_x.c,
gcc.dg/compat/struct-by-value-3_main.c,
gcc.dg/compat/struct-by-value-3_x.c,
gcc.dg/compat/struct-by-value-4_main.c,
gcc.dg/compat/struct-by-value-4_x.c,
gcc.dg/compat/struct-by-value-5b_main.c,
gcc.dg/compat/struct-by-value-5b_x.c,
gcc.dg/compat/struct-by-value-6b_main.c,
gcc.dg/compat/struct-by-value-6b_x.c,
gcc.dg/compat/struct-by-value-6b_main.c,
gcc.dg/compat/struct-by-value-7b_x.c,
gcc.dg/compat/struct-by-value-7b_main.c,
gcc.dg/compat/struct-by-value-8_main.c,
gcc.dg/compat/struct-by-value-8_x.c,
gcc.dg/compat/struct-by-value-9_main.c,
gcc.dg/compat/struct-by-value-9_x.c,
gcc.dg/compat/struct-return-2_main.c,
gcc.dg/compat/struct-return-2_x.c: Likewise.

# HG changeset patch
# Parent  3b32766f9b2fd59fd2ced721fe452cceb2bd0f7c
Move gcc.dg/compat dg-skip-if to *_main.c files

diff --git a/gcc/testsuite/gcc.dg/compat/pr83487-1_main.c b/gcc/testsuite/gcc.dg/compat/pr83487-1_main.c
--- a/gcc/testsuite/gcc.dg/compat/pr83487-1_main.c
+++ b/gcc/testsuite/gcc.dg/compat/pr83487-1_main.c
@@ -1,3 +1,5 @@
+/* { dg-skip-if "no large alignment" { pdp11-*-* } } */
+
 extern void do_test (void);
 
 int
diff --git a/gcc/testsuite/gcc.dg/compat/pr83487-1_y.c b/gcc/testsuite/gcc.dg/compat/pr83487-1_y.c
--- a/gcc/testsuite/gcc.dg/compat/pr83487-1_y.c
+++ b/gcc/testsuite/gcc.dg/compat/pr83487-1_y.c
@@ -1,5 +1,3 @@
-/* { dg-skip-if "no large alignment" { pdp11-*-* } } */
-
 #include "pr83487-1.h"
 
 struct A a;
diff --git a/gcc/testsuite/gcc.dg/compat/struct-by-value-10_main.c b/gcc/testsuite/gcc.dg/compat/struct-by-value-10_main.c
--- a/gcc/testsuite/gcc.dg/compat/struct-by-value-10_main.c
+++ b/gcc/testsuite/gcc.dg/compat/struct-by-value-10_main.c
@@ -1,6 +1,7 @@
 /* Test structures passed by value, including to a function with a
variable-length argument lists.  All struct members are floatin

Re: [PATCH, testsuite] test case fixes for pdp11

2018-11-02 Thread Paul Koning

> On Nov 2, 2018, at 3:19 PM, Rainer Orth  wrote:
> 
> Hi Paul,
> 
>> This patch fixes a number of test case failures on pdp11.  Some are too 
>> large for the address space, some have dependencies on the float format that 
>> don't match the DEC format, some add pdp11 to the targets that expect 
>> particular compiler messages.
> 
> unfortunately, even apart from the two bugs in your patch Andreas
> already fixed, there are more problems: with the patch one gets 20
> warnings
> 
> WARNING: compat.exp does not support dg-skip-if
> 
> in mail-report.log for gcc.dg/compat.  While the message is misleading
> and it took me a moment to understand what's wrong, you should have
> found this in your testing.  A good way to make sure something like this
> doesn't go unnoticed in a regtest is to run make mail-report.log in a
> vanilla and a patched tree and compare the output.  Those WARNING and
> ERROR lines are prominent there for a reason ;-)

My apologies for these errors.  I will strive to learn from the feedback you 
and Andreas have given and do it better next time.

I wasn't aware of mail-report -- I have used contrib/test_summary in the past.
What makes it more difficult in this case is that most of these test cases have 
not run in a long time (if ever) so the volume is quite large.  I'm getting it 
down to a more manageable number, though 1000 out of 60k is still much higher 
than it should be.

> While we give target maintainers quite a bit of leeway to apply
> testsuite patches affecting only their targets, this needs to be
> exercised with caution.  Best test the modified testsuite on a different
> target, too, to check that it doesn't break there.

I understand, I will do so in the future.

> Besides, a bit more detail on the failures you observe without your
> patch would have been helpful.  I noticed that some of the tests you
> change already have dg-skip-if directives for avr with a comment of
> "Program too big".  It's hard to tell if this is the same issue as your
> "limited code space".  If so, it would be advisable and much more
> expressive to introduce (yet another) effective-target keyword for this.

I could use "! ptr32plus".  I'm a bit hesitant to do so because I don't know 
what other targets might match that.  msp430, based on the comment, and possibly
others.  For tests that allocate megabyte buffers that's an obvious fit.  For
tests that generate large blocks of code it isn't quite so obvious; the
gcc.dg/long_branch.c test is too big for pdp11 but it might be ok for other
"small" targets.

> The problem is that in the g??.dg/compat testsuites, dg keywords except
> for dg-options are only supposed to go into the *_main.c file.  The
> following patch fixes this.  Tested on i386-pc-solaris2.11 and
> sparc-sun-solaris2.11, installed on mainline.

Thank you.  How can I tell for a particular test case or test directory what 
the rules are for where "dg" directives go, or which ones are supported?  I 
know there is a fair amount of variability in this, but I don't yet understand 
what they all are.

paul

Re: [PATCH] diagnose built-in declarations without prototype (PR 83656)

2018-11-02 Thread Martin Sebor


I have reworked the patch to resolve any lingering concerns about
warnings in configure tests.  The attached revision only warns
with -Wextra and only for incompatible declarations of built-ins
that take arguments.  For void built-ins like abort() it only
warns with -Wpedantic (this required adjustments to several
tests that are being compiled with -pedantic-errors).

The revised patch also detects incompatibilities in uses of
built-ins declared with no prototype and warns for those.  This
includes argument passing and function pointer conversions.

The effect of these changes is that for example the following
snippet is only diagnosed with -Wpedantic because it isn't
incorrect (so only on the basis that the declaration style is
deprecated in C):

  void abort ();   // warning only with -Wpedantic here

  void f (void)
  {
abort ();  // safe, not diagnosed
  }

while the following is diagnosed by default because the call is
definitely undefined:

  char* strncpy ();  // warning only with -Wextra

  void f (char *d)
  {
strncpy (d, 4, "123");   // undefined, warning by default
  }
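
For contrast, here is a sketch of the same call written against the prototyped declaration from the standard header, which is the form none of the warnings discussed here are aimed at. The function name copy_123 is made up for this example.

```c
#include <string.h>  /* declares strncpy with a full prototype */

/* Copy "123" and its terminating NUL into D, which must have room for
   at least 4 bytes.  With the prototype in scope the compiler can
   check the argument order and types.  */
char *
copy_123 (char *d)
{
  return strncpy (d, "123", 4);
}
```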

Tested on x86_64-linux.

Martin

On 07/05/2018 01:44 PM, Jeff Law wrote:

On 07/04/2018 11:32 AM, Martin Sebor wrote:

On 07/03/2018 08:33 PM, Jeff Law wrote:




But since the number of warnings here hasn't changed, the ones
in GCC logs predate my changes.  So updating the tests seems
like an improvement to consider independently of the patch.

Agreed.  I'm still wary of proceeding given the general concerns about
configure tests.  It's good that GCC's configury bits aren't affected,
but I'm not sure we can generalize a whole lot from that.


So what's the next step?  I'm open to relaxing the warning
so it only triggers with -Wall or -Wextra and not by default
if that's considered necessary.

I'm not sure :-)  The problem is we have notable potential to break
things and do so in ways that are going to be painful to find.

Having them only turn on for -Wextra might be a compromise position.
But even if we do that I don't really see how we take the next step (ie,
adding it to Wall).




At the same time, the instances of the warning we have seen
have all been issued for the configure tests for years and
we have not seen any new instances of it as a result of
this change, so the concern that the patch might lead to some
more while at the same time accepting the ones we know about
doesn't make sense to me.

Again, I don't think we can generalize much from the GCC autoconf
scripts and the failure modes are going to be extremely painful to track
down to this change.

While we have this concern with every new warning or enhancements to
existing warnings, this specific instance is worse because of how it
interacts with relatively common configury code.

Jeff




PR c/83656 - missing -Wbuiltin-declaration-mismatch on declaration without prototype

gcc/c/ChangeLog:

	PR c/83656
	* c-decl.c (header_for_builtin_fn): Declare.
	(diagnose_mismatched_decls): Diagnose declarations of built-in
	functions without a prototype.
	* c-typeck.c (convert_argument): New function.
	(convert_arguments): Factor code out into convert_argument.
	Detect mismatches between built-in formal arguments in calls
	to built-in without prototype.
	(type_or_builtin_type): New function.
	(convert_for_assignment): Add argument.  Conditionally issue
	warnings instead of errors for mismatches.

gcc/testsuite/ChangeLog:

	PR c/83656
	* gcc.dg/20021006-1.c
	* gcc.dg/Wbuiltin-declaration-mismatch.c: New test.
	* gcc.dg/Wbuiltin-declaration-mismatch-2.c: New test.
	* gcc.dg/Wbuiltin-declaration-mismatch-3.c: New test.
	* gcc.dg/Walloca-16.c: Adjust.
	* gcc.dg/Wrestrict-4.c: Adjust.
	* gcc.dg/Wrestrict-5.c: Adjust.
	* gcc.dg/atomic/stdatomic-generic.c: Adjust.
	* gcc.dg/atomic/stdatomic-lockfree.c: Adjust.
	* gcc.dg/initpri1.c: Adjust.
	* gcc.dg/pr15698-1.c: Adjust.
	* gcc.dg/pr69156.c: Adjust.
	* gcc.dg/pr83463.c: Adjust.
	* gcc.dg/redecl-4.c: Adjust.
	* gcc.dg/tls/thr-init-2.c: Adjust.
	* gcc.dg/torture/pr55890-2.c: Adjust.
	* gcc.dg/torture/pr55890-3.c: Adjust.
	* gcc.dg/torture/pr67741.c: Adjust.
	* gcc.dg/torture/stackalign/sibcall-1.c: Adjust.
	* gcc.dg/torture/tls/thr-init-1.c: Adjust.

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index cbbf7eb..1572f45 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -604,6 +604,7 @@ static tree grokparms (struct c_arg_info *, bool);
 static void layout_array_type (tree);
 static void warn_defaults_to (location_t, int, const char *, ...)
 ATTRIBUTE_GCC_DIAG(3,4);
+static const char *header_for_builtin_fn (enum built_in_function);
 
 /* T is a statement.  Add it to the statement-tree.  This is the
C/ObjC version--C++ has a slightly different version of this
@@ -1887,12 +1888,25 @@ diagnose_mismatched_decls (tree newdecl, tree olddecl,
 	*oldtypep = oldtype = trytype;
 	  else
 	{
+	  const char *header
+		= header_for_builtin_fn (DECL_FUNCTION_CODE (olddecl));
+	  location_t loc = DECL_SOU

Re: [fortran, patch, committed] Adjust error message

2018-11-02 Thread Thomas Koenig


Please remove the trailing whitespace (after "length 1").


Done (r265732).

2018-11-02  Thomas Koenig  

PR fortran/46020
* decl.c (verify_bind_c_sym): Remove unnecessary space
in error message.

Index: decl.c
===
--- decl.c  (Revision 265732)
+++ decl.c  (Arbeitskopie)
@@ -5648,7 +5648,7 @@ verify_bind_c_sym (gfc_symbol *tmp_sym, gfc_typesp
|| tmp_sym->ts.u.cl->length->expr_type != EXPR_CONSTANT
|| mpz_cmp_si (tmp_sym->ts.u.cl->length->value.integer, 1) 
!= 0)

  gfc_error ("Return type of BIND(C) function %qs of character "
-"type at %L must have length 1 ", tmp_sym->name,
+"type at %L must have length 1", tmp_sym->name,
 &(tmp_sym->declared_at));
 }

Regards

Thomas

Re: [PATCH] Fix not properly nul-terminated string constants in JIT

On Sun, 2018-08-05 at 16:59 +, Bernd Edlinger wrote:
> Hi!
> 
> 
> My other patch which adds assertions to varasm.c regarding correct
> nul termination of string literals did make these incorrect string
> constants in JIT frontend fail.
> 
> The string constants are not nul terminated if their length exceeds
> 200 characters.  The test cases do not use strings of that size where
> that would make a difference.  But using a fixed index type is
> clearly
> wrong.
> 
> This patch removes the fixed char[200] array type from
> playback::context,
> and uses build_string_literal instead of using build_string directly.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

Sorry for the belated response.

Was this tested with --enable-host-shared and --enable-languages=jit ?
Note that "jit" is not included in --enable-languages=all.

The patch seems reasonable, but I'm a little confused over the meaning
of "len" in build_string_literal and build_string: does it refer to the
length or the size of the string?

> @@ -617,16 +616,9 @@ playback::rvalue *
>  playback::context::
>  new_string_literal (const char *value)
>  {
> -  tree t_str = build_string (strlen (value), value);
> -  gcc_assert (m_char_array_type_node);
> -  TREE_TYPE (t_str) = m_char_array_type_node;
> -
> -  /* Convert to (const char*), loosely based on
> - c/c-typeck.c: array_to_pointer_conversion,
> - by taking address of start of string.  */
> -  tree t_addr = build1 (ADDR_EXPR, m_const_char_ptr, t_str);
> +  tree t_str = build_string_literal (strlen (value) + 1, value);
>  
> -  return new rvalue (this, t_addr);
> +  return new rvalue (this, t_str);
>  }

In the above, the call to build_string with strlen is replaced with
build_string_literal with strlen + 1.

build_string's comment says:

"Note that for a C string literal, LEN should include the trailing
NUL."

but has:

  length = len + offsetof (struct tree_string, str) + 1;

and:

  TREE_STRING_LENGTH (s) = len;
  memcpy (s->string.str, str, len);
  s->string.str[len] = '\0';

suggesting that the "len" parameter is in fact the length *without* the
trailing NUL, and that a trailing NUL is added by build_string.

However build_string_literal has:

  t = build_string (len, str);
  elem = build_type_variant (char_type_node, 1, 0);
  index = build_index_type (size_int (len - 1));

suggesting that the len is passed directly to build_string (and thus
ought to be strlen), but the build_index_type uses len - 1 (which
suggests that len is the size of the string, rather than its length).

What's the intended meaning of len in these functions?
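
To make the two readings easier to compare, here is a small stand-alone C model built only from the lines quoted above. The struct and the model_* functions are hypothetical stand-ins, not GCC code, and the model does not settle what the comment on build_string intends; it only shows what each choice of len produces.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a STRING_CST; not GCC's tree_string.  */
struct model_string
{
  size_t length;       /* plays the role of TREE_STRING_LENGTH */
  size_t array_elems;  /* elements in the array type; 0 if none set */
  char *str;           /* LENGTH bytes followed by one extra NUL */
};

/* Mirrors the quoted build_string lines: copy LEN bytes, then append
   a NUL that is not counted in LENGTH.  */
struct model_string
model_build_string (size_t len, const char *str)
{
  struct model_string s;
  s.length = len;
  s.array_elems = 0;
  s.str = malloc (len + 1);
  if (s.str == NULL)
    abort ();
  memcpy (s.str, str, len);
  s.str[len] = '\0';
  return s;
}

/* Mirrors the quoted build_string_literal lines: the index type runs
   from 0 to LEN - 1, so the array type has LEN elements.  */
struct model_string
model_build_string_literal (size_t len, const char *str)
{
  struct model_string s = model_build_string (len, str);
  s.array_elems = (len - 1) + 1;
  return s;
}
```

In this model, passing strlen ("abc") + 1 gives a length of 4 and a 4-element array, so the copied NUL is inside the array type and build_string's extra NUL sits just past it. Passing strlen ("abc") gives a 3-element array with no room for the terminator.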

Thanks
Dave

Re: [PATCH v3 2/3] PR preprocessor/83173: New test

On Thu, 2018-11-01 at 11:56 -0400, Mike Gulick wrote:
> 2018-10-31  Mike Gulick  
> 
>   PR preprocessor/83173
>   * gcc.dg/plugin/location-overflow-test-pr83173.c: New test.
>   * gcc.dg/plugin/location-overflow-test-pr83173.h: Header for
>   pr83173.c.
>   * gcc.dg/plugin/location-overflow-test-pr83173-1.h: Header for
>   pr83173.c.
>   * gcc.dg/plugin/location-overflow-test-pr83173-2.h: Header for
>   pr83173.c.
>   * gcc.dg/plugin/location_overflow_plugin.c: Use PLUGIN_PRAGMAS
>   instead of PLUGIN_START_UNIT.
>   * gcc.dg/plugin/plugin.exp: Enable new test.

(sorry for the belated response on this)

This patch is OK once the other parts are approved, and assuming your
contributor paperwork is in place.

Dave

Re: [PATCH v3 3/3] PR preprocessor/83173: Enhance -fdump-internal-locations output

On Thu, 2018-11-01 at 11:56 -0400, Mike Gulick wrote:
> 2017-10-31  Mike Gulick  
> 
>   PR preprocessor/83173
>   * gcc/input.c (dump_location_info): Dump reason and
>   included_from fields from line_map_ordinary struct.  Fix
>   indentation when location > 5 digits.
> 
>   * libcpp/location-example.txt: Update example
>   -fdump-internal-locations output.
> ---
>  gcc/input.c |  49 +-
>  libcpp/location-example.txt | 333 +-
> --
>  2 files changed, 241 insertions(+), 141 deletions(-)

Sorry about the belated response.  This is a nice enhancement; some
nits below.

> diff --git a/gcc/input.c b/gcc/input.c
> index a94a010f353..f938a37f20e 100644
> --- a/gcc/input.c
> +++ b/gcc/input.c
> @@ -1075,6 +1075,17 @@ dump_labelled_location_range (FILE *stream,
>fprintf (stream, "\n");
>  }
>  
> +#define NUM_DIGITS(x) ((x) >= 1000000000 ? 10 : \
> +(x) >= 100000000 ? 9 : \
> +(x) >= 10000000 ? 8 : \
> +(x) >= 1000000 ? 7 : \
> +(x) >= 100000 ? 6 : \
> +(x) >= 10000 ? 5 : \
> +(x) >= 1000 ? 4 : \
> +(x) >= 100 ? 3 : \
> +(x) >= 10 ? 2 : \
> +1)

diagnostic-show-locus.c has a function "num_digits" (currently static)
and, fwiw, a unit test.  It would be good to share the implementation.
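
For reference, a loop-based helper of the kind a shared implementation could provide might look like the sketch below. It is deliberately named count_decimal_digits so as not to suggest it is the existing GCC function, and it uses plain assert where GCC code would use its own checking macros.

```c
#include <assert.h>

/* Return the number of decimal digits needed to print VALUE, which
   must be non-negative.  Zero still takes one digit.  */
int
count_decimal_digits (int value)
{
  assert (value >= 0);

  if (value == 0)
    return 1;

  int digits = 0;
  while (value > 0)
    {
      digits++;
      value /= 10;
    }
  return digits;
}
```

Unlike the macro, a function evaluates its argument once and has no hard-coded thresholds to keep in sync.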

>  /* Write a visualization of the locations in the line_table to
> STREAM.  */
>  
>  void
> @@ -1104,6 +1115,35 @@ dump_location_info (FILE *stream)
>  map->m_column_and_range_bits - map->m_range_bits);
>fprintf (stream, "  range bits: %i\n",
>  map->m_range_bits);
> +  const char * reason;
> +  switch (map->reason) {
> +  case LC_ENTER:
> + reason = "LC_ENTER";
> + break;
> +  case LC_LEAVE:
> + reason = "LC_LEAVE";
> + break;
> +  case LC_RENAME:
> + reason = "LC_RENAME";
> + break;
> +  case LC_RENAME_VERBATIM:
> + reason = "LC_RENAME_VERBATIM";
> + break;
> +  case LC_ENTER_MACRO:
> + reason = "LC_RENAME_MACRO";
> + break;
> +  default:
> + reason = "Unknown";
> +  }
> +  fprintf (stream, "  reason: %d (%s)\n", map->reason, reason);
> +
> +  const line_map_ordinary *includer_map
> + = linemap_included_from_linemap (line_table, map);
> +  fprintf (stream, "  included from map: %d\n",
> +includer_map ? int (includer_map - line_table-
> >info_ordinary.maps)
> +: -1);

I'm not a fan of "-1" here; it's a NULL pointer in the original data.
How about "n/a" for that case?


> +  fprintf (stream, "  included from location: %d\n",
> +linemap_included_from (map));

...or merging it with this line, for something like:

  included from location: 127 (in ordinary map 2)

vs:

  included from location: 0
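
A minimal sketch of the merged form, assuming a negative index stands for "no includer map". The helper name format_included_from and its parameters are hypothetical and only illustrate the output shape suggested above.

```c
#include <stdio.h>

/* Format the "included from" line into BUF (at most SIZE bytes).
   MAP_INDEX < 0 means there is no includer map, in which case only
   the location is printed.  Returns what snprintf returns.  */
int
format_included_from (char *buf, size_t size, unsigned int loc,
                      int map_index)
{
  if (map_index >= 0)
    return snprintf (buf, size,
                     "  included from location: %u (in ordinary map %d)\n",
                     loc, map_index);
  return snprintf (buf, size, "  included from location: %u\n", loc);
}
```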

[...snip...]

Other than that, this is OK for trunk, assuming your contributor
paperwork is in place.

Dave

Re: [PATCH v3 1/3] PR preprocessor/83173: Additional check before decrementing highest_location

On Thu, 2018-11-01 at 11:56 -0400, Mike Gulick wrote:
> 2018-10-31  Mike Gulick  
> 
>   PR preprocessor/83173
>   * libcpp/files.c (_cpp_stack_include): Check if
>   line_table->highest_location is past current line before
>   decrementing.
> ---
>  libcpp/files.c | 32 +++-
>  1 file changed, 23 insertions(+), 9 deletions(-)
> 
> diff --git a/libcpp/files.c b/libcpp/files.c
> index 08b7c647c91..c0165fe64e4 100644
> --- a/libcpp/files.c
> +++ b/libcpp/files.c
> @@ -1012,6 +1012,7 @@ _cpp_stack_include (cpp_reader *pfile, const
> char *fname, int angle_brackets,
>struct cpp_dir *dir;
>_cpp_file *file;
>bool stacked;
> +  bool decremented = false;
>  
>/* For -include command-line flags we have type == IT_CMDLINE.
>   When the first -include file is processed we have the case,
> where
> @@ -1035,20 +1036,33 @@ _cpp_stack_include (cpp_reader *pfile, const
> char *fname, int angle_brackets,
>  return false;
>  
>/* Compensate for the increment in linemap_add that occurs if
> -  _cpp_stack_file actually stacks the file.  In the case of a
> - normal #include, we're currently at the start of the line
> - *following* the #include.  A separate source_location for this
> - location makes no sense (until we do the LC_LEAVE), and
> - complicates LAST_SOURCE_LINE_LOCATION.  This does not apply if
> we
> - found a PCH file (in which case linemap_add is not called) or
> we
> - were included from the command-line.  */
> + _cpp_stack_file actually stacks the file.  In the case of a
> normal
> + #include, we're currently at the start of the line *following*
> the
> + #include.  A separate source_location for this location makes
> no
> + sense (until we do the LC_LEAVE), and complicates
> + LAST_SOURCE_LINE_LOCATION.  This does not apply if we found a
> PCH
> + file (in which case linemap_add is not called) or we were
> included
> + from the command-line.  In the case that the #include is the
> last
> + line in the file, highest_location still points to the current
> + line, not the start of the next line, so we do not decrement in
> + this case.  See plugin/location-overflow-test-pr83173.h for an
> + example.  */
>if (file->pchname == NULL && file->err_no == 0
>&& type != IT_CMDLINE && type != IT_DEFAULT)
> -pfile->line_table->highest_location--;
> +{
> +  int highest_line = linemap_get_expansion_line (pfile->line_table,
> +  pfile->line_table->highest_location);
> +  int source_line = linemap_get_expansion_line (pfile->line_table, loc);
> +  if (highest_line > source_line)
> + {
> +   pfile->line_table->highest_location--;
> +   decremented = true;
> + }
> +}
>  
>stacked = _cpp_stack_file (pfile, file, type == IT_IMPORT, loc);
>  
> -  if (!stacked)
> +  if (decremented && !stacked)
>  /* _cpp_stack_file didn't stack the file, so let's rollback the
> compensation dance we performed above.  */
>  pfile->line_table->highest_location++;

Sorry for the belated response.

This is OK for trunk (assuming your contributor paperwork is in place).

Thanks
Dave
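
The control flow of the approved change can be modelled in isolation. The sketch below is a toy, not libcpp code: toy_line_table and toy_stack_include are hypothetical names, the two line arguments stand in for the linemap_get_expansion_line results, and the stacked flag stands in for the return value of _cpp_stack_file.

```c
#include <stdbool.h>

/* Toy stand-in for the line table; only the field the patch touches.  */
struct toy_line_table
{
  unsigned int highest_location;
};

/* Decrement only when highest_location is already past the #include line,
   and roll back only if a decrement actually happened and the file was not
   stacked.  */
static void
toy_stack_include (struct toy_line_table *table, int highest_line,
                   int source_line, bool stacked)
{
  bool decremented = false;

  if (highest_line > source_line)
    {
      table->highest_location--;
      decremented = true;
    }

  /* ... the file would be stacked here ... */

  if (decremented && !stacked)
    table->highest_location++;
}
```

Tracking the decrement in a flag is what keeps the rollback from incrementing a location that was never decremented, which is the case the unguarded `if (!stacked)` got wrong when the #include is the last line of the file.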

Re: [PATCH v3 1/3] PR preprocessor/83173: Additional check before decrementing highest_location

2018-11-02 Thread Mike Gulick

On 11/2/18 5:13 PM, David Malcolm wrote:
> On Thu, 2018-11-01 at 11:56 -0400, Mike Gulick wrote:
>> 2018-10-31  Mike Gulick  
>>
>>  PR preprocessor/83173
>>  * libcpp/files.c (_cpp_stack_include): Check if
>>  line_table->highest_location is past current line before
>>  decrementing.
>> ---
>>  libcpp/files.c | 32 +++-
>>  1 file changed, 23 insertions(+), 9 deletions(-)
>>
>> diff --git a/libcpp/files.c b/libcpp/files.c
>> index 08b7c647c91..c0165fe64e4 100644
>> --- a/libcpp/files.c
>> +++ b/libcpp/files.c
>> @@ -1012,6 +1012,7 @@ _cpp_stack_include (cpp_reader *pfile, const
>> char *fname, int angle_brackets,
>>struct cpp_dir *dir;
>>_cpp_file *file;
>>bool stacked;
>> +  bool decremented = false;
>>  
>>/* For -include command-line flags we have type == IT_CMDLINE.
>>   When the first -include file is processed we have the case,
>> where
>> @@ -1035,20 +1036,33 @@ _cpp_stack_include (cpp_reader *pfile, const
>> char *fname, int angle_brackets,
>>  return false;
>>  
>>/* Compensate for the increment in linemap_add that occurs if
>> -  _cpp_stack_file actually stacks the file.  In the case of a
>> - normal #include, we're currently at the start of the line
>> - *following* the #include.  A separate source_location for this
>> - location makes no sense (until we do the LC_LEAVE), and
>> - complicates LAST_SOURCE_LINE_LOCATION.  This does not apply if
>> we
>> - found a PCH file (in which case linemap_add is not called) or
>> we
>> - were included from the command-line.  */
>> + _cpp_stack_file actually stacks the file.  In the case of a
>> normal
>> + #include, we're currently at the start of the line *following*
>> the
>> + #include.  A separate source_location for this location makes
>> no
>> + sense (until we do the LC_LEAVE), and complicates
>> + LAST_SOURCE_LINE_LOCATION.  This does not apply if we found a
>> PCH
>> + file (in which case linemap_add is not called) or we were
>> included
>> + from the command-line.  In the case that the #include is the
>> last
>> + line in the file, highest_location still points to the current
>> + line, not the start of the next line, so we do not decrement in
>> + this case.  See plugin/location-overflow-test-pr83173.h for an
>> + example.  */
>>if (file->pchname == NULL && file->err_no == 0
>>&& type != IT_CMDLINE && type != IT_DEFAULT)
>> -pfile->line_table->highest_location--;
>> +{
>> +  int highest_line = linemap_get_expansion_line (pfile->line_table,
>> + pfile->line_table->highest_location);
>> +  int source_line = linemap_get_expansion_line (pfile->line_table, loc);
>> +  if (highest_line > source_line)
>> +{
>> +  pfile->line_table->highest_location--;
>> +  decremented = true;
>> +}
>> +}
>>  
>>stacked = _cpp_stack_file (pfile, file, type == IT_IMPORT, loc);
>>  
>> -  if (!stacked)
>> +  if (decremented && !stacked)
>>  /* _cpp_stack_file didn't stack the file, so let's rollback the
>> compensation dance we performed above.  */
>>  pfile->line_table->highest_location++;
> 
> Sorry for the belated response.
> 
> This is OK for trunk (assuming your contributor paperwork is in place).
> 
> Thanks
> Dave
> 
Thanks Dave.  I don't have contributor paperwork in place for gcc.  I was under
the impression the request needed to be initiated by a maintainer, but if I can
make the request myself let me know and I will do so.  I do have an employer
copyright disclaimer already approved which includes gcc, so the process should
be quick.

Thanks,
Mike

Re: [wwwdocs] readings.html - add OpenRISC links

On Thu, Nov 01, 2018 at 04:43:08PM -0500, Segher Boessenkool wrote:
> On Fri, Nov 02, 2018 at 06:20:56AM +0900, Stafford Horne wrote:
> > As we were getting ready for OpenRISC gcc port upstreaming Segher pointed 
> > out
> > that we should be updating this.
> > 
> > I don't think I have CVS write access (only git binutils-gdb), can someone 
> > help to
> > review and commit if OK?
> 
> I committed this for you (as trivial and obvious).  Thanks!

Thank you!

Re: [PATCH] Fix not properly nul-terminated string constants in JIT

2018-11-02 Thread Bernd Edlinger

On 11/2/18 9:40 PM, David Malcolm wrote:
> On Sun, 2018-08-05 at 16:59 +, Bernd Edlinger wrote:
>> Hi!
>>
>>
>> My other patch, which adds assertions to varasm.c regarding correct
>> nul termination of string literals, made these incorrect string
>> constants in the JIT frontend fail.
>>
>> The string constants are not nul terminated if their length exceeds
>> 200 characters.  The test cases do not use strings of that size where
>> that would make a difference.  But using a fixed index type is
>> clearly
>> wrong.
>>
>> This patch removes the fixed char[200] array type from
>> playback::context,
>> and uses build_string_literal instead of using build_string directly.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> 
> Sorry for the belated response.
> 
> Was this tested with --enable-host-shared and --enable-languages=jit ?
> Note that "jit" is not included in --enable-languages=all.
> 

Yes, of course.  The test suite contains a few string constants, just
all of them are shorter than 200 characters.  But I think removing this
artificial limit enables the existing test cases to test that the
shorter string is in fact zero terminated.

> The patch seems reasonable, but I'm a little confused over the meaning
> of "len" in build_string_literal and build_string: does it refer to the
> length or the size of the string?
> 

build_string_literal:
For languages that use zero-terminated strings, len is strlen(str)+1, and
str is a zero terminated single-byte character string.
For languages that don't use zero-terminated strings, len is the size of
the string and str is not zero terminated.

build_string:
constructs a STRING_CST tree object, which is usable as is in some contexts,
like for asm constraints, but as a string literal it is incomplete, and
needs an index type.  The index type defines the memory size which must
be larger than the string precision.  Excess memory is implicitly cleared.

This means currently all jit strings shorter than 200 characters
are filled with zero up to the limit of 200 chars as imposed by
m_char_array_type_node.  Strings of exactly 200 chars are not zero terminated,
and larger strings should result in an assertion (excess precision was
previously allowed, but no zero termination was appended, when that is not
part of the original string constant).

Previously it was allowed to have memory size less than the string len, which
had complicated the STRING_CST semantics in the middle-end, but with the
string_cst semantic rework I did for gcc-9 this is no longer allowed and
results in (checking) assertions in varasm.c.
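
The effect of the fixed 200-element array type can be sketched with two small predicates. These are illustrative only: FIXED_ARRAY_SIZE, has_room_for_nul and implicit_zero_bytes are hypothetical names that model the description above, not anything in the jit sources.

```c
#include <stdbool.h>
#include <string.h>

/* Models the old fixed char[200] type used for every jit string.  */
enum { FIXED_ARRAY_SIZE = 200 };

/* True if VALUE, stored in the fixed-size array, is followed by at least
   one zero byte, i.e. the string is effectively nul terminated.  */
static bool
has_room_for_nul (const char *value)
{
  return strlen (value) < FIXED_ARRAY_SIZE;
}

/* Number of implicitly cleared bytes after the characters of VALUE.
   Zero means the string exactly fills (or overflows) the array.  */
static size_t
implicit_zero_bytes (const char *value)
{
  size_t len = strlen (value);
  return len < FIXED_ARRAY_SIZE ? FIXED_ARRAY_SIZE - len : 0;
}
```

A 3-character string gets 197 bytes of padding it does not need, while a 200-character string gets none, which is the unterminated case described above.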

>> @@ -617,16 +616,9 @@ playback::rvalue *
>>   playback::context::
>>   new_string_literal (const char *value)
>>   {
>> -  tree t_str = build_string (strlen (value), value);
>> -  gcc_assert (m_char_array_type_node);
>> -  TREE_TYPE (t_str) = m_char_array_type_node;
>> -
>> -  /* Convert to (const char*), loosely based on
>> - c/c-typeck.c: array_to_pointer_conversion,
>> - by taking address of start of string.  */
>> -  tree t_addr = build1 (ADDR_EXPR, m_const_char_ptr, t_str);
>> +  tree t_str = build_string_literal (strlen (value) + 1, value);
>>   
>> -  return new rvalue (this, t_addr);
>> +  return new rvalue (this, t_str);
>>   }
> 
> In the above, the call to build_string with strlen is replaced with
> build_string_literal with strlen + 1.
> 
> 
> build_string's comment says:
> 
> "Note that for a C string literal, LEN should include the trailing
> NUL."
> 
> but has:
> 
>length = len + offsetof (struct tree_string, str) + 1;
> 
> and:
> 
>TREE_STRING_LENGTH (s) = len;
>memcpy (s->string.str, str, len);
>s->string.str[len] = '\0';
> 
> suggesting that the "len" parameter is in fact the length *without* the
> trailing NUL, and that a trailing NUL is added by build_string.
> 

Yes, string constants in tree objects have another zero termination,
but varasm.c does something different, there the index range takes
precedence.
The index range is built in build_string_literal as follows:

   elem = build_type_variant (char_type_node, 1, 0);
   index = build_index_type (size_int (len - 1));
   type = build_array_type (elem, index);

therefore the string constant has the type char[0..len-1]
thus only len bytes are significant for code generation, the extra
nul is just for "convenience".
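
For a zero-terminated C string, the relationship between the len argument and the resulting index range can be written out as a small sketch. The struct and function names are hypothetical; only the arithmetic (len = strlen + 1, index range 0..len-1) comes from the explanation above.

```c
#include <string.h>

/* Hypothetical summary of the array type built for a string literal.  */
struct literal_shape
{
  size_t len;        /* value passed as LEN: characters plus the nul */
  size_t max_index;  /* upper bound of the index type, i.e. len - 1 */
};

static struct literal_shape
shape_for_c_string (const char *value)
{
  struct literal_shape shape;
  shape.len = strlen (value) + 1;   /* the terminating nul is counted */
  shape.max_index = shape.len - 1;  /* array type is char[0..len-1] */
  return shape;
}
```

So "abc" yields len 4 and the type char[0..3]: four significant bytes, the last of which is the nul that is part of the literal itself.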

> However build_string_literal has:
> 
>t = build_string (len, str);
>elem = build_type_variant (char_type_node, 1, 0);
>index = build_index_type (size_int (len - 1));
> 
> suggesting that the len is passed directly to build_string (and thus
> ought to be strlen), but the build_index_type uses len - 1 (which
> suggests that len is the size of the string, rather than its length).
> 
> What's the intended meaning of len in these functions?
> 

I hope this helps.

Thanks
Bernd.

> Thanks
> Dave
>

[PATCH] newlib/configure.host: Set have_init_fini to no for OpenRISC

The new GCC port for OpenRISC will use the init_fini_array only and not
provide the init() and fini() functions.  Disable the function usage by
default as it's no longer needed.

Signed-off-by: Stafford Horne 
---
 newlib/configure.host | 1 +
 1 file changed, 1 insertion(+)

diff --git a/newlib/configure.host b/newlib/configure.host
index 27bce36a1..6c49cb750 100644
--- a/newlib/configure.host
+++ b/newlib/configure.host
@@ -279,6 +279,7 @@ case "${host_cpu}" in
;;
   or1k*|or1knd*)
machine_dir=or1k
+   have_init_fini=no
;;
   powerpc*)
machine_dir=powerpc
-- 
2.17.2
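
As background for the change above, here is a hedged sketch of how start-up code is typically registered without init()/fini() functions. It assumes GCC or Clang and a target where constructors are collected into an init array; the names are illustrative and nothing here is taken from the OpenRISC port or newlib.

```c
/* Set by the constructor below, before main runs.  */
static int initialized_before_main;

/* With GCC-style toolchains, a function marked with the constructor
   attribute is recorded in a table of function pointers that the start-up
   code walks before calling main, so no hand-written init() is involved.  */
__attribute__ ((constructor))
static void
set_up (void)
{
  initialized_before_main = 1;
}
```

By the time main starts executing, initialized_before_main is expected to be 1 on such targets.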

Re: [PATCH][RTL] Fix PR87852

On 11/2/18 7:40 AM, Richard Biener wrote:
> 
> The following fixes PR87852, a latent bug in fwprop which when verifying
> whether it may propagate a use from its definition site has a shortcut
> 
>   /* Check if the reg in USE has only one definition.  We already
>  know that this definition reaches use, or we wouldn't be here.
>  However, this is invalid for hard registers because if they are
>  live at the beginning of the function it does not mean that we
>  have an uninitialized access.  */
>   regno = DF_REF_REGNO (use);
>   def = DF_REG_DEF_CHAIN (regno);
>   if (def
>   && DF_REF_NEXT_REG (def) == NULL
>   && regno >= FIRST_PSEUDO_REGISTER)
> return false;
> 
> not considering the case of a loop where the def might not dominate
> the use.  In fact earlier code in the very same function does
> handle this case but only for the case where we'd try propagating
> a later def into an earlier use.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> OK for trunk?
> 
> Thanks,
> Richard.
> 
> 2018-11-02  Richard Biener  
> 
>   PR rtl-optimization/87852
>   * fwprop.c (use_killed_between): Only consider single-defs of the
>   use in the definition stmt that dominate it.
You may have just saved me major headaches.  I'm in the middle of trying
to debug an unreported ARM codegen bug.  I just walked through the CSE
dump and everything looks OK, and it's mucked up in fwprop1.  The thing
that caught my eye was a pseudo where the def does not dominate a use in
a loop.  ie, on the first iteration of the loop the value is undefined
(but we'll never read it at runtime due to other checks)
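
A source-level picture of that situation, as a sketch only: the function below is a made-up example, not the code from the PR or from the ARM miscompile. The variable has a single definition inside the loop, the definition does not dominate the use, and a separate flag guarantees the use is never reached on the first iteration.

```c
/* Sum of products of adjacent elements of V.  PREV has exactly one
   definition, at the bottom of the loop body; its use at the top is
   guarded by HAVE_PREV, so it is never read before it is written.  */
static int
sum_adjacent_products (const int *v, int n)
{
  int prev;            /* deliberately left uninitialized */
  int have_prev = 0;
  int total = 0;

  for (int i = 0; i < n; i++)
    {
      if (have_prev)
        total += prev * v[i];   /* use: not dominated by the def below */
      prev = v[i];              /* the only definition of PREV */
      have_prev = 1;
    }
  return total;
}
```

Treating "only one definition" as "that definition reaches and dominates every use" would be wrong here, because on entry to the loop the use is reachable without passing through the definition.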

/me hopes this is the same thing and ultimately explains the
mis-compilation of python I'm seeing.

Jeff

Re: [PATCH] combine: Do not combine moves from hard registers

2018-11-02 Thread Renlin Li

Hi Segher,

I found a problem with your change to add make_more_copies.
I am investigating those regressions; a large number of them are wrong-code
generation.

One problem is that make_more_copies will split the assignment of fp to sfp.

From:
(insn 48 26 28 5 (set (reg/f:SI 102 sfp)
(reg/f:SI 11 fp)) -1
To:
(insn 51 32 26 5 (set (reg:SI 117)
(reg/f:SI 11 fp)) 646 {*arm_movsi_vfp}
(expr_list:REG_EQUIV (reg/f:SI 11 fp)
(nil)))
(insn 48 26 28 5 (set (reg/f:SI 102 sfp)
(reg:SI 117)) 646 {*arm_movsi_vfp}
(expr_list:REG_DEAD (reg:SI 117)
(nil)))

The original rtx is generated by expand_builtin_setjmp_receiver to adjust the
frame pointer.

Later, in LRA, it will try to eliminate frame_pointer, replacing it with the
hard frame pointer, as defined in ELIMINABLE_REGS.

Your change splits the insn into two.
As a result, it no longer matches the "from" and "to" regs defined in
ELIMINABLE_REGS.
The if statement that generates the adjustment insn is skipped,
and the original instruction is simply deleted!

Probably we don't want to split the move rtx if it involves entries
defined in ELIMINABLE_REGS?

Regards,
Renlin

On 10/24/2018 09:23 AM, Christophe Lyon wrote:

On Wed, 24 Oct 2018 at 00:26, Segher Boessenkool
wrote:

Hi Christophe,

On Tue, Oct 23, 2018 at 03:25:55PM +0200, Christophe Lyon wrote:

On Tue, 23 Oct 2018 at 14:29, Segher Boessenkool
wrote:

On Tue, Oct 23, 2018 at 12:14:27PM +0200, Christophe Lyon wrote:

I have noticed many regressions on arm and aarch64 between 265366 and
265408 (this commit is 265398).

I bisected at least one to this commit on aarch64:
FAIL: gcc.dg/ira-shrinkwrap-prep-1.c scan-rtl-dump ira "Split
live-range of register"
The same test also regresses on arm.

Many targets also fail gcc.dg/ira-shrinkwrap-prep-2.c; these tests fail
when random things in the RTL change, apparently.

This is PR87708 now.

For a whole picture of all the regressions I noticed during these two
commits, have a look at:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/265408/report-build-info.html

No thanks. I am not going to click on 111 links and whatever is behind
those. Please summarise, like, what was the diff in test_summary, and
then dig down into individual tests if you want. Or whatever else works
both for you and for me. This doesn't work for me.

OK this is not very practical for me either. There were 25 commits between
the two validations being compared,
25-28 gcc tests regressed on aarch64, depending on the exact target
177-206 gcc tests regressed on arm*, 7-29 gfortran regressions on arm*
so I could have to run many bisects to make sure every regression is
caused by the same commit.

So many, ouch! I didn't realise.

I've now got the results of validating your patch only, compared to the
previous revision, and it does cause all the regressions I noticed earlier.

Since these are all automated builds with everything discarded after
computing the regressions, it's quite time consuming to re-run the
tests manually on my side (probably at least as much as it is for you).

Running arm tests is very painful for me. But you say this is on aarch64
as well, I didn't realise that either; aarch64 should be easy to test,
we have many reasonable aarch64 machines in the cfarm.

I know this doesn't answer your question, but I thought you could run aarch64
tests easily, and that it would be more efficient for the project if you did
that directly rather than waiting for me to provide hardly any more
information.

Well, I'm not too familiar with aarch64, so if you can say "this Z is a
pretty simple test that should do X but now does Y" that would be a huge
help :-)

Maybe this will answer your question better:
List of aarch64-linux-gnu regressions:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/265408/aarch64-none-linux-gnu/diff-gcc-rh60-aarch64-none-linux-gnu-default-default-default.txt
List of arm-none-linux-gnueabihf regressions:
(gcc)
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/265408/arm-none-linux-gnueabihf/diff-gcc-rh60-arm-none-linux-gnueabihf-arm-cortex-a9-neon-fp16.txt
(gfortran)
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/265408/arm-none-linux-gnueabihf/diff-gfortran-rh60-arm-none-linux-gnueabihf-arm-cortex-a9-neon-fp16.txt

That may help yes, thanks!

To me it just highlights again that we need a validation system easier to
work with when we break something on a target we are not familiar with.

OTOH a patch like this is likely to break many target-specific tests, and
that should not prevent committing it imnsho. If it actively breaks things,
then of course it shouldn't go in as-is, or if it breaks bootstrap, etc.

I run post-commit validations as finely grained as possible with the CPU
resources I have access to, that's not enough and I think having a
developer-accessible gerrit+jenkins-like system would be very valuable
to test pat

Re: [PATCH v3 1/3] PR preprocessor/83173: Additional check before decrementing highest_location

On 11/2/18 3:34 PM, Mike Gulick wrote:
> On 11/2/18 5:13 PM, David Malcolm wrote:
>> On Thu, 2018-11-01 at 11:56 -0400, Mike Gulick wrote:
>>> 2018-10-31  Mike Gulick  
>>>
>>> PR preprocessor/83173
>>> * libcpp/files.c (_cpp_stack_include): Check if
>>> line_table->highest_location is past current line before
>>> decrementing.
>>> ---
>>>  libcpp/files.c | 32 +++-
>>>  1 file changed, 23 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/libcpp/files.c b/libcpp/files.c
>>> index 08b7c647c91..c0165fe64e4 100644
>>> --- a/libcpp/files.c
>>> +++ b/libcpp/files.c
>>> @@ -1012,6 +1012,7 @@ _cpp_stack_include (cpp_reader *pfile, const
>>> char *fname, int angle_brackets,
>>>struct cpp_dir *dir;
>>>_cpp_file *file;
>>>bool stacked;
>>> +  bool decremented = false;
>>>  
>>>/* For -include command-line flags we have type == IT_CMDLINE.
>>>   When the first -include file is processed we have the case,
>>> where
>>> @@ -1035,20 +1036,33 @@ _cpp_stack_include (cpp_reader *pfile, const
>>> char *fname, int angle_brackets,
>>>  return false;
>>>  
>>>/* Compensate for the increment in linemap_add that occurs if
>>> -  _cpp_stack_file actually stacks the file.  In the case of a
>>> - normal #include, we're currently at the start of the line
>>> - *following* the #include.  A separate source_location for this
>>> - location makes no sense (until we do the LC_LEAVE), and
>>> - complicates LAST_SOURCE_LINE_LOCATION.  This does not apply if
>>> we
>>> - found a PCH file (in which case linemap_add is not called) or
>>> we
>>> - were included from the command-line.  */
>>> + _cpp_stack_file actually stacks the file.  In the case of a
>>> normal
>>> + #include, we're currently at the start of the line *following*
>>> the
>>> + #include.  A separate source_location for this location makes
>>> no
>>> + sense (until we do the LC_LEAVE), and complicates
>>> + LAST_SOURCE_LINE_LOCATION.  This does not apply if we found a
>>> PCH
>>> + file (in which case linemap_add is not called) or we were
>>> included
>>> + from the command-line.  In the case that the #include is the
>>> last
>>> + line in the file, highest_location still points to the current
>>> + line, not the start of the next line, so we do not decrement in
>>> + this case.  See plugin/location-overflow-test-pr83173.h for an
>>> + example.  */
>>>if (file->pchname == NULL && file->err_no == 0
>>>&& type != IT_CMDLINE && type != IT_DEFAULT)
>>> -pfile->line_table->highest_location--;
>>> +{
>>> +  int highest_line = linemap_get_expansion_line (pfile->line_table,
>>> +pfile->line_table->highest_location);
>>> +  int source_line = linemap_get_expansion_line (pfile->line_table, loc);
>>> +  if (highest_line > source_line)
>>> +   {
>>> + pfile->line_table->highest_location--;
>>> + decremented = true;
>>> +   }
>>> +}
>>>  
>>>stacked = _cpp_stack_file (pfile, file, type == IT_IMPORT, loc);
>>>  
>>> -  if (!stacked)
>>> +  if (decremented && !stacked)
>>>  /* _cpp_stack_file didn't stack the file, so let's rollback the
>>> compensation dance we performed above.  */
>>>  pfile->line_table->highest_location++;
>>
>> Sorry for the belated response.
>>
>> This is OK for trunk (assuming your contributor paperwork is in place).
>>
>> Thanks
>> Dave
>>
> Thanks Dave.  I don't have contributor paperwork in place for gcc.  I was
> under the impression the request needed to be initiated by a maintainer, but
> if I can make the request myself let me know and I will do so.  I do have an
> employer copyright disclaimer already approved which includes gcc, so the
> process should be quick.
Contact ass...@gnu.org and indicate to them you need a past and future
copyright assignment for GCC.

jeff

Re: [PATCH] diagnose built-in declarations without prototype (PR 83656)

2018-11-02 Thread Joseph Myers

On Fri, 2 Nov 2018, Martin Sebor wrote:

> I have reworked the patch to resolve any lingering concerns about
> warnings in configure tests.  The attached revision only warns
> with -Wextra and only for incompatible declarations of built-ins
> that take arguments.  For void built-ins like abort() it only
> warns with -Wpedantic (this required adjustments to several
> tests that are being compiled with -pedantic-errors).

I don't think this use of -Wpedantic is appropriate.  -Wpedantic is not a 
catch-all for warnings we don't want to enable with some other option; 
it's specifically for programs doing something that is disallowed by ISO C 
(such warnings may or may not also be enabled by other relevant options).

Since this declaration is not disallowed by ISO C, -Wpedantic should not 
result in a warning for it.

(I do consider declarations with () for built-in functions without 
arguments to be more dubious than for user-defined functions without 
arguments, simply because good practice would be to include the standard 
header to get declarations of those functions, whereas for user-defined 
functions the code might simply be using C++ style for declaring functions 
without arguments.)

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] combine: Do not combine moves from hard registers

2018-11-02 Thread Segher Boessenkool

Hi!

On Fri, Nov 02, 2018 at 10:19:01PM +, Renlin Li wrote:
> I found a problem with your change to add make_more_copies.
> I am investigating those regressions; a large number of them are wrong-code
> generation.
> 
> One problem is that make_more_copies will split the assignment of fp to
> sfp.
> 
> From:
> (insn 48 26 28 5 (set (reg/f:SI 102 sfp)
> (reg/f:SI 11 fp)) -1
> To:
> (insn 51 32 26 5 (set (reg:SI 117)
> (reg/f:SI 11 fp)) 646 {*arm_movsi_vfp}
>  (expr_list:REG_EQUIV (reg/f:SI 11 fp)
> (nil)))
> (insn 48 26 28 5 (set (reg/f:SI 102 sfp)
> (reg:SI 117)) 646 {*arm_movsi_vfp}
>  (expr_list:REG_DEAD (reg:SI 117)
> (nil)))

I was looking at this just now :-)  (PR87871)

fp is a hard reg, but not a fixed reg, so make_more_copies thinks it is
fine to copy it to some pseudo, before copying it to the final dest.  And
that is just fine as far as I can see.

That final dest is sfp, and that final move is moved over the clobber of
fp, and yes eventually deleted as you say below.

> The original rtx is generated by expand_builtin_setjmp_receiver to adjust 
> the frame pointer.
> 
> Later, in LRA, it will try to eliminate frame_pointer, replacing it with
> the hard frame pointer, as defined in ELIMINABLE_REGS.
> 
> Your change splits the insn into two.
> As a result, it no longer matches the "from" and "to" regs defined in
> ELIMINABLE_REGS.
> The if statement that generates the adjustment insn is skipped,
> and the original instruction is simply deleted!

I don't follow why, or what should have prevented it from being deleted.

> Probably we don't want to split the move rtx if it involves entries
> defined in ELIMINABLE_REGS?

One thing I can easily do is not making an intermediate pseudo when copying
*to* a fixed reg, which sfp is.  Let me try if that helps the testcase I'm
looking at (setjmp-4.c).
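
The proposed restriction can be pictured as an extra condition on a predicate. This is a toy model built only from the description in this message: toy_reg and toy_wants_intermediate_pseudo are hypothetical, and the real logic in combine.c works on RTL and may differ.

```c
#include <stdbool.h>

/* Toy description of a register: whether it is a hard register and, if
   so, whether it is fixed.  */
struct toy_reg
{
  bool is_hard;
  bool is_fixed;
};

/* Should a copy SRC -> DEST be split through a new pseudo?  */
static bool
toy_wants_intermediate_pseudo (struct toy_reg src, struct toy_reg dest)
{
  /* Only copies from a non-fixed hard register are candidates.  */
  if (!src.is_hard || src.is_fixed)
    return false;

  /* The extra condition being tried: leave the copy alone when the
     destination is a fixed register (as sfp is).  */
  if (dest.is_hard && dest.is_fixed)
    return false;

  return true;
}
```

Under this model the fp to sfp move stays a single insn, so later passes that look for that exact pair would still see it.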

Segher

Re: [PATCH] diagnose built-in declarations without prototype (PR 83656)

2018-11-02 Thread Martin Sebor


On 11/02/2018 04:52 PM, Joseph Myers wrote:

On Fri, 2 Nov 2018, Martin Sebor wrote:


I have reworked the patch to resolve any lingering concerns about
warnings in configure tests.  The attached revision only warns
with -Wextra and only for incompatible declarations of built-ins
that take arguments.  For void built-ins like abort() it only
warns with -Wpedantic (this required adjustments to several
tests that are being compiled with -pedantic-errors).


I don't think this use of -Wpedantic is appropriate.  -Wpedantic is not a
catch-all for warnings we don't want to enable with some other option;
it's specifically for programs doing something that is disallowed by ISO C
(such warnings may or may not also be enabled by other relevant options).

Since this declaration is not disallowed by ISO C, -Wpedantic should not
result in a warning for it.

(I do consider declarations with () for built-in functions without
arguments to be more dubious than for user-defined functions without
arguments, simply because good practice would be to include the standard
header to get declarations of those functions, whereas for user-defined
functions the code might simply be using C++ style for declaring functions
without arguments.)


-Wpedantic alone doesn't cause a warning, only in conjunction
with -Wno-builtin-declaration-mismatch.

But I have no preference for what option to put it under, or
necessarily think that using -Wpedantic (or any other "group"
option) like this is a great idea (it doesn't work with #pragma
GCC diagnostic the way I think it should).  In fact, with
the latest approach of diagnosing unsafe calls to these functions
regardless of the declaration form it doesn't seem that important
that declarations of built-ins with no arguments be diagnosed at
all.  Either way, there aren't enough of them for it to matter
much.  I think there's just one: abort.  I'm fine with removing
this part of the patch.

Is there anything else?

Martin

Re: [PATCH libquadmath/PR68686]

2018-11-02 Thread Joseph Myers

On Fri, 2 Nov 2018, Joseph Myers wrote:

> I think it would be best to move to having a script to generate 
> libquadmath sources automatically from glibc sources by appropriate 
> substitutions, so that while you might need to update the script or 
> quadmath-imp.h as part of updating libquadmath from glibc, you don't need 
> to merge lots of changes manually.
> 
> Specifically, any comments on the patch below (quadmath-imp.h changes and 
> new script shown, 6000 lines of diffs from running the script not shown)?  
> It doesn't yet update the *gamma* sources, but could be extended to do so.  
> (It also doesn't do anything with the parts of libquadmath outside of 
> libquadmath/math/, but again could be extended for that.)  Specifically, 
> the following files in libquadmath/math/ aren't yet updated by the script 
> (a few of these, e.g. sqrtq.c, aren't actually based on glibc sources at 
> all, while others just need the script to gain new features, or additional 
> source files to be added to libquadmath): cacoshq.c cacosq.c casinhq.c 
> complex.c expq.c fmaq.c ilogbq.c isinf_nsq.c lgammaq.c nanq.c rem_pio2q.c 
> sqrtq.c tanq.c tgammaq.c x2y2m1q.c.

Here's an updated version of the patch that also updates most of the 
previously omitted libquadmath/math/ files that are based on glibc sources 
(not fmaq.c or rem_pio2q.c), including *gamma*.  It adds exp2q and 
issignalingq as new public interfaces, given how they are used in the 
current glibc versions of some of the functions already present in 
libquadmath, but doesn't add any other new functions from glibc.

Index: libquadmath/Makefile.am
===
--- libquadmath/Makefile.am (revision 265750)
+++ libquadmath/Makefile.am (working copy)
@@ -44,7 +44,7 @@
 libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
 
 libquadmath_la_SOURCES = \
-  math/x2y2m1q.c math/isinf_nsq.c math/acoshq.c math/fmodq.c \
+  math/x2y2m1q.c math/acoshq.c math/fmodq.c \
   math/acosq.c math/frexpq.c \
   math/rem_pio2q.c math/asinhq.c math/hypotq.c math/remainderq.c \
   math/asinq.c math/rintq.c math/atan2q.c math/isinfq.c \
@@ -58,6 +58,8 @@
   math/tanhq.c math/expq.c math/modfq.c math/tanq.c math/fabsq.c \
   math/nanq.c math/tgammaq.c math/finiteq.c math/nextafterq.c \
   math/truncq.c math/floorq.c math/powq.c math/fmaq.c math/logbq.c \
+  math/exp2q.c math/issignalingq.c math/lgammaq_neg.c math/lgammaq_product.c \
+  math/tanq_kernel.c math/tgammaq_product.c math/casinhq_kernel.c \
   math/cacoshq.c math/cacosq.c math/casinhq.c math/casinq.c \
   math/catanhq.c math/catanq.c math/cimagq.c math/conjq.c math/cprojq.c \
   math/crealq.c math/fdimq.c math/fmaxq.c math/fminq.c math/ilogbq.c \
Index: libquadmath/libquadmath.texi
===
--- libquadmath/libquadmath.texi(revision 265750)
+++ libquadmath/libquadmath.texi(working copy)
@@ -157,6 +157,7 @@
 @item @code{cosq}: cosine function
 @item @code{erfq}: error function
 @item @code{erfcq}: complementary error function
+@item @code{exp2q}: base 2 exponential function
 @item @code{expq}: exponential function
 @item @code{expm1q}: exponential minus 1 function
 @need 800
@@ -173,6 +174,7 @@
 @item @code{ilogbq}: get exponent of the value
 @item @code{isinfq}: check for infinity
 @item @code{isnanq}: check for not a number
+@item @code{issignalingq}: check for signaling not a number
 @item @code{j0q}: Bessel function of the first kind, first order
 @item @code{j1q}: Bessel function of the first kind, second order
 @item @code{jnq}: Bessel function of the first kind, @var{n}-th order
Index: libquadmath/quadmath-imp.h
===
--- libquadmath/quadmath-imp.h  (revision 265750)
+++ libquadmath/quadmath-imp.h  (working copy)
@@ -21,10 +21,16 @@
 #ifndef QUADMATH_IMP_H
 #define QUADMATH_IMP_H
 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include "quadmath.h"
 #include "config.h"
+#ifdef HAVE_FENV_H
+# include 
+#endif
 
 
 /* Under IEEE 754, an architecture may determine tininess of
@@ -36,7 +42,11 @@
 
 #define TININESS_AFTER_ROUNDING   1
 
+#define HIGH_ORDER_BIT_IS_SET_FOR_SNAN 0
 
+#define FIX_FLT128_LONG_CONVERT_OVERFLOW 0
+#define FIX_FLT128_LLONG_CONVERT_OVERFLOW 0
+
 /* Prototypes for internal functions.  */
 extern int32_t __quadmath_rem_pio2q (__float128, __float128 *);
 extern void __quadmath_kernel_sincosq (__float128, __float128, __float128 *,
@@ -43,9 +53,24 @@
   __float128 *, int);
 extern __float128 __quadmath_kernel_sinq (__float128, __float128, int);
 extern __float128 __quadmath_kernel_cosq (__float128, __float128);
+extern __float128 __quadmath_kernel_tanq (__float128, __float128, int);
+extern __float128 __quadmath_gamma_productq (__float128, __float128, int,
+__float128 *);
+extern __float128 __quadmath_g
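
For readers unfamiliar with the issignalingq entry and the HIGH_ORDER_BIT_IS_SET_FOR_SNAN define in the patch above, here is a minimal sketch of the idea for the narrower IEEE 754 binary64 format. It is not the libquadmath implementation: the helper name toy_is_signaling_bits is hypothetical, and it assumes the common convention (matching the value 0 of that define) that a set high-order mantissa bit marks a quiet NaN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of libquadmath.  Classifies a raw
   IEEE 754 binary64 bit pattern, assuming the high-order mantissa bit
   is the "quiet" bit (the HIGH_ORDER_BIT_IS_SET_FOR_SNAN == 0 case).  */
static bool
toy_is_signaling_bits (uint64_t bits)
{
  uint64_t exponent = (bits >> 52) & 0x7ffu;        /* 11 exponent bits */
  uint64_t mantissa = bits & 0xfffffffffffffULL;    /* 52 mantissa bits */

  /* A NaN has an all-ones exponent and a nonzero mantissa.  */
  bool is_nan = exponent == 0x7ffu && mantissa != 0;

  /* Bit 51 is the high-order mantissa bit.  */
  bool quiet_bit = ((mantissa >> 51) & 1u) != 0;

  return is_nan && !quiet_bit;
}
```

The sketch works on the bit pattern rather than on a double argument so that passing the value around cannot quiet the NaN on targets where a floating-point load or store does that.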

Re: [PATCH] combine: Do not combine moves from hard registers

2018-11-02 Thread Segher Boessenkool

On Fri, Nov 02, 2018 at 06:03:20PM -0500, Segher Boessenkool wrote:
> > The original rtx is generated by expand_builtin_setjmp_receiver to adjust
> > the frame pointer.
> > 
> > Later, in LRA, it will try to eliminate frame_pointer with the hard frame
> > pointer, which is defined in ELIMINABLE_REGS.
> > 
> > Your change splits the insn into two.
> > As a result, it no longer matches the "from" and "to" regs defined in
> > ELIMINABLE_REGS.
> > The if statement that generates the adjustment insn is skipped,
> > and the original instruction is simply deleted!
> 
> I don't follow why, or what should have prevented it from being deleted.
> 
> > Probably, we don't want to split the move rtx if it is related to
> > entries defined in ELIMINABLE_REGS?
> 
> One thing I can easily do is not making an intermediate pseudo when copying
> *to* a fixed reg, which sfp is.  Let me try if that helps the testcase I'm
> looking at (setjmp-4.c).

This indeed helps, see patch below.  Could you try that on the whole
testsuite?

Thanks,


Segher


p.s. It still is a problem in the arm backend, but this won't hurt combine,
so why not.


From 814ca23ce05384d017b3c2bff41ab61cf5446e46 Mon Sep 17 00:00:00 2001
Message-Id: <814ca23ce05384d017b3c2bff41ab61cf5446e46.1541202704.git.seg...@kernel.crashing.org>
From: Segher Boessenkool 
Date: Fri, 2 Nov 2018 23:33:32 +
Subject: [PATCH] combine: Don't break up copy from hard to fixed reg

---
 gcc/combine.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index dfb0b44..15e941a 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -14998,6 +14998,8 @@ make_more_copies (void)
continue;
  if (TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src)))
continue;
+ if (REG_P (dest) && TEST_HARD_REG_BIT (fixed_reg_set, REGNO (dest)))
+   continue;
 
  rtx new_reg = gen_reg_rtx (GET_MODE (dest));
  rtx_insn *new_insn = gen_move_insn (new_reg, src);
-- 
1.8.3.1
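
To make the effect of the added check easier to follow, here is a toy model of the decision the patched loop makes. This is only a sketch under stated assumptions: registers are plain numbers, the fixed register set is a 32-bit mask, and every name (TOY_NUM_HARD_REGS, toy_test_bit, toy_should_split_copy) is hypothetical. It also assumes, based on the subject line, that the loop only considers copies whose source is a hard register. The real code works on rtx objects and HARD_REG_SET.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy assumption: register numbers below this are hard registers,
   everything else is a pseudo.  */
#define TOY_NUM_HARD_REGS 32u

/* Toy stand-in for TEST_HARD_REG_BIT on a 32-bit mask.  Pseudos are
   never members of the set.  */
static bool
toy_test_bit (uint32_t set, unsigned regno)
{
  return regno < TOY_NUM_HARD_REGS && ((set >> regno) & 1u) != 0;
}

/* Returns true when the copy "dest = src" would be split into
   "new_pseudo = src; dest = new_pseudo".  */
static bool
toy_should_split_copy (uint32_t fixed_set, unsigned dest, unsigned src)
{
  if (src >= TOY_NUM_HARD_REGS)       /* source is not a hard register */
    return false;
  if (toy_test_bit (fixed_set, src))  /* source is a fixed register */
    return false;
  if (toy_test_bit (fixed_set, dest)) /* new: destination is a fixed register */
    return false;
  return true;
}
```

With register 31 marked fixed (standing in for sfp), a copy from hard register 3 to pseudo 40 is still split, while a copy from hard register 3 to register 31 is now left alone, so the original single move stays recognizable to later passes.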

Re: [PATCH][RTL] Fix PR87852

On 11/2/18 11:12 AM, Eric Botcazou wrote:
>> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>>
>> OK for trunk?
>>
>> Thanks,
>> Richard.
>>
>> 2018-11-02  Richard Biener  
>>
>>  PR rtl-optimization/87852
>>  * fwprop.c (use_killed_between): Only consider single-defs of the
>>  use in the definition stmt that dominate it.
> This looks OK to me, but this lacks commentary and I have a hard time parsing 
> the ChangeLog entry.  Maybe:
> 
>   * fwprop.c (use_killed_between): Only consider single-defs of the use
>   whose definition statement dominates the use.
> 
> FWIW I've attached a patch that also fixes the head comment of the function.
LGTM.  It does fix an armeb bug I was looking at from the c-torture
suite.  But sadly python still doesn't work :(


jeff

Re: [PATCH] Remove options that are not disabled with -Os (PR web/87829).

On 11/2/18 2:03 AM, Martin Liška wrote:
> Hi.
> 
> I would like to remove from the -Os documentation the options that are
> not disabled with -Os: -freorder-blocks and -freorder-blocks-and-partition.
> The option -freorder-blocks-and-partition is enabled on x86_64,
> thus I would not list it under the -Os option.  And
> -freorder-blocks-algorithm=algorithm chooses a different algorithm,
> so disabling such an option does not make sense.
> 
> Ready for trunk?
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-11-02  Martin Liska  
> 
>   PR web/87829
>   * doc/invoke.texi: Remove options that are
>   not disabled with -Os.
> ---
>  gcc/doc/invoke.texi | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> 
OK
jeff

Re: [PATCH] Verify that last argument of __builtin_expect_with_probability is a real cst (PR c/87811).

On 11/1/18 7:45 AM, Martin Liška wrote:
> On 11/1/18 1:15 PM, Jakub Jelinek wrote:
>> On Thu, Nov 01, 2018 at 01:09:16PM +0100, Martin Liška wrote:
>>> -range 0.0 to 1.0, inclusive.
>>> +range 0.0 to 1.0, inclusive.  The @var{probability} argument must be
>>> +a compiler time constant.
>> When you say must, I think error_at should be used rather than warning_at.
>> If others disagree I'm open for leaving it as is.
> Error is fine for me as well.
> 
>>> @@ -2474,6 +2481,11 @@ expr_expected_value_1 (tree type, tree op0, enum tree_code code,
>>>   *predictor = PRED_BUILTIN_EXPECT_WITH_PROBABILITY;
>>>   *probability = probi;
>>> }
>>> + else
>>> + warning_at (gimple_location (def), 0,
>>> + "probability argument %qE must be a in the "
>>> + "range 0.0 to 1.0", prob);
>> Wrong indentation.
>>
>> And, no diagnostics for -O0 (which should also be covered by a testcase).
> Test for that added.
> 
>>> +/* { dg-options "-O2 -fdump-tree-profile_estimate -frounding-math" } */
>> Why the -frounding-math option?
> I remember I had some issue with:
> tree r = fold_build2_initializer_loc (UNKNOWN_LOCATION,
>                                       MULT_EXPR, t, prob, base);
> 
> on targets with non-IEEE floating-point arithmetic (s390?).
> 
>> I think test coverage should handle both that and when that option is
>> not used, if that option makes any difference.
> It will eventually pop up if we install new tests w/o rounding math.
> 
>>  Jakub
>>
> 
> Martin
> 
> 
> 0001-Verify-that-last-argument-of-__builtin_expect_with_p.patch
> 
> From 7e0834a2ebe1a3fb83304197a843dca63332bc78 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Tue, 30 Oct 2018 14:01:59 +0100
> Subject: [PATCH] Verify that last argument of
>  __builtin_expect_with_probability is a real cst (PR c/87811).
> 
> gcc/ChangeLog:
> 
> 2018-10-30  Martin Liska  
> 
>   PR c/87811
>   * predict.c (expr_expected_value_1): Verify
>   that last argument is a real constant and emit
>   error.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-10-30  Martin Liska  
> 
>   PR c/87811
>   * gcc.dg/pr87811.c: New test.
>   * gcc.dg/pr87811-2.c: Likewise.
>   * gcc.dg/pr87811-3.c: Likewise.
> 
> gcc/ChangeLog:
> 
> 2018-10-30  Martin Liska  
> 
>   * doc/extend.texi: Update constraint about the last argument
>   of __builtin_expect_with_probability.
> ---
>  gcc/doc/extend.texi  |  3 ++-
>  gcc/predict.c| 12 
>  gcc/testsuite/gcc.dg/pr87811-2.c | 13 +
>  gcc/testsuite/gcc.dg/pr87811-3.c | 11 +++
>  gcc/testsuite/gcc.dg/pr87811.c   | 13 +
>  5 files changed, 51 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr87811-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr87811-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr87811.c
OK
jeff
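
As a rough illustration of the constraint discussed in this thread, the sketch below passes a literal probability to __builtin_expect_with_probability. The macro TOY_EXPECT_PROB and the function toy_sign_bucket are hypothetical names. The feature test relies on __has_builtin, which is an assumption about the compiler in use; where it is missing, the macro falls back to the plain expression.

```c
#include <assert.h>

/* Use the builtin only when the compiler reports it; otherwise fall
   back to the bare expression so the sketch still compiles.  */
#if defined(__has_builtin)
# if __has_builtin(__builtin_expect_with_probability)
#  define TOY_EXPECT_PROB(exp, c, p) \
     __builtin_expect_with_probability ((exp), (c), (p))
# endif
#endif
#ifndef TOY_EXPECT_PROB
# define TOY_EXPECT_PROB(exp, c, p) (exp)
#endif

/* Returns 1 for positive input and 0 otherwise.  The hint says the
   condition is expected to be true about 90% of the time.  The third
   argument is a literal in the range 0.0 to 1.0, i.e. a real constant,
   which is what the patch checks for.  */
static int
toy_sign_bucket (int x)
{
  if (TOY_EXPECT_PROB (x > 0, 1, 0.9))
    return 1;
  return 0;
}
```

The hint only influences branch prediction estimates; the function returns the same values with or without the builtin.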

Re: [PATCH] combine: Do not combine moves from hard registers

On 11/2/18 5:54 PM, Segher Boessenkool wrote:
> On Fri, Nov 02, 2018 at 06:03:20PM -0500, Segher Boessenkool wrote:
>>> The original rtx is generated by expand_builtin_setjmp_receiver to adjust
>>> the frame pointer.
>>>
>>> Later, in LRA, it will try to eliminate frame_pointer with the hard frame
>>> pointer, which is defined in ELIMINABLE_REGS.
>>>
>>> Your change splits the insn into two.
>>> As a result, it no longer matches the "from" and "to" regs defined in
>>> ELIMINABLE_REGS.
>>> The if statement that generates the adjustment insn is skipped,
>>> and the original instruction is simply deleted!
>> I don't follow why, or what should have prevented it from being deleted.
>>
>>> Probably, we don't want to split the move rtx if it is related to
>>> entries defined in ELIMINABLE_REGS?
>> One thing I can easily do is not making an intermediate pseudo when copying
>> *to* a fixed reg, which sfp is.  Let me try if that helps the testcase I'm
>> looking at (setjmp-4.c).
> This indeed helps, see patch below.  Could you try that on the whole
> testsuite?
> 
> Thanks,
> 
> 
> Segher
> 
> 
> p.s. It still is a problem in the arm backend, but this won't hurt combine,
> so why not.
> 
> 
> From 814ca23ce05384d017b3c2bff41ab61cf5446e46 Mon Sep 17 00:00:00 2001
> Message-Id: <814ca23ce05384d017b3c2bff41ab61cf5446e46.1541202704.git.seg...@kernel.crashing.org>
> From: Segher Boessenkool 
> Date: Fri, 2 Nov 2018 23:33:32 +
> Subject: [PATCH] combine: Don't break up copy from hard to fixed reg
> 
> ---
>  gcc/combine.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/gcc/combine.c b/gcc/combine.c
> index dfb0b44..15e941a 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -14998,6 +14998,8 @@ make_more_copies (void)
>   continue;
> if (TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src)))
>   continue;
> +   if (REG_P (dest) && TEST_HARD_REG_BIT (fixed_reg_set, REGNO (dest)))
> + continue;
>  
> rtx new_reg = gen_reg_rtx (GET_MODE (dest));
> rtx_insn *new_insn = gen_move_insn (new_reg, src);
> -- 1.8.3.1
It certainly helps the armeb test results.

Jeff

Ping^4 Re: [PATCH v3 0/6] [MIPS] Reorganize the loongson march and extensions instructions set

2018-11-02 Thread Paul Hua

Ping ?

On Fri, Oct 26, 2018 at 5:50 PM Paul Hua  wrote:
>
> Ping ?
>
> On Tue, Oct 23, 2018 at 9:16 AM Paul Hua  wrote:
> >
> > Ping ?
> >
> > On Fri, Oct 19, 2018 at 2:19 PM Paul Hua  wrote:
> > >
> > > Ping?
> > >
> > > I'd like to check in these patches before stage3.
> > >
> > > Thanks,
> > >
> > > On Tue, Oct 16, 2018 at 10:49 AM Paul Hua  wrote:
> > > >
> > > > Hi:
> > > >
> > > > The original version of the patches is here:
> > > > https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00099.html
> > > >
> > > > This is an updated version.  Please review, thanks.
> > > >
> > > > This patch series reorganizes the Loongson -march=xxx options and the
> > > > Loongson extension instruction sets.  For a long time, the Loongson
> > > > extension instruction sets have all been placed under the
> > > > -march=loongson3a option, so we cannot disable one of them when we
> > > > need to.
> > > >
> > > > Patch (1) splits the Loongson MultiMedia extensions Instructions (MMI)
> > > > out of loongson3a and adds the -mloongson-mmi/-mno-loongson-mmi options
> > > > to enable/disable them.
> > > >
> > > > Patch (2) splits the Loongson EXTensions (EXT) instructions out of
> > > > loongson3a and adds the -mloongson-ext/-mno-loongson-ext options to
> > > > enable/disable them.
> > > >
> > > > Patch (3) adds support for the Loongson EXTensions R2 (EXT2)
> > > > instructions and adds the -mloongson-ext2/-mno-loongson-ext2 options
> > > > to enable/disable them.
> > > >
> > > > Patch (4) adds Loongson 3A1000 processor support.  gs464 is the
> > > > codename of the 3A1000 microarchitecture.  It renames -march=loongson3a
> > > > to -march=gs464 and keeps -march=loongson3a as an alias of
> > > > -march=gs464 for compatibility.
> > > >
> > > > Patch (5) adds Loongson 3A2000/3A3000 processor support, including the
> > > > Loongson MMI, EXT and EXT2 instruction sets.
> > > >
> > > > Patch (6) adds Loongson 2K1000 processor support, including the
> > > > Loongson MMI, EXT, EXT2 and MSA instruction sets.
> > > >
> > > > The binutils patch has been upstreamed.
> > > >
> > > > There are six patches in this set, as follows.
> > > > 1) 0001-MIPS-Add-support-for-loongson-mmi-instructions.patch
> > > > 2) 0002-MIPS-Add-support-for-Loongson-EXT-istructions.patch
> > > > 3) 0003-MIPS-Add-support-for-Loongson-EXT2-istructions.patch
> > > > 4) 0004-MIPS-Add-support-for-Loongson-3A1000-proccessor.patch
> > > > 5) 0005-MIPS-Add-support-for-Loongson-3A2000-3A3000-proccess.patch
> > > > 6) 0006-MIPS-Add-support-for-Loongson-2K1000-proccessor.patch
> > > >
> > > > All patches were tested on mips64el-linux-gnu with no new regressions.
> > > >
> > > > OK to commit?
> > > >
> > > > Thanks,
> > > > Paul Hua

[PATCH] Add myself to MAINTAINERS

Committing this.

2018-11-02  Stafford Horne  

* MAINTAINERS (Write After Approval): Add myself.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 265762)
+++ MAINTAINERS (working copy)
@@ -415,6 +415,7 @@ Stuart Henderson

 Matthew Hiller 
 Kazu Hirata
 Manfred Hollstein  
+Stafford Horne 
 Cong Hou   
 Falk Hueffner  
 Andrew John Hughes

Re: [PATCH] Add myself to MAINTAINERS (not-committed)