Re: [RS6000] PowerPC64 soft-float

2018-11-29 Thread Alan Modra
On Thu, Nov 29, 2018 at 12:15:06PM -0600, Segher Boessenkool wrote:
> On Sun, Nov 25, 2018 at 10:50:27PM +1030, Alan Modra wrote:
> > This patch aims to prevent long sequences loading soft-float
> > constants.  For 32-bit, it makes sense to load values inline to a gpr
> > with lis, addi, but not so much for 64-bit where a 5 insn sequence
> > might be needed for each gpr.  For TFmode in particular, a 10 insn
> > sequence is reduced to 2 loads from memory plus 1 or 2 address setup
> > insns.
> > 
> > Bootstrapped etc. powerpc64le-linux and powerpc64-linux.  OK for
> > next stage1?
> 
> It's okay now, even.

Thanks!  Revised patch as per your other comments bootstrapped and
regression tested powerpc64le-linux.
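For readers following along, the per-gpr counting that drives the numbers above can be sketched roughly like this (not GCC code; a simplified stand-in for num_insns_constant_gpr assuming 64-bit gprs):

```cpp
#include <cassert>

// Rough sketch of num_insns_constant_gpr: how many of
// addi/lis/ori/oris/sldi are needed to build VALUE in one 64-bit gpr.
static int insns_for_gpr(long long value)
{
  // Signed 16-bit immediate: one addi.
  if ((unsigned long long) value + 0x8000 < 0x10000)
    return 1;
  // 16-bit immediate shifted left 16 (sign-extended): one lis.
  if ((value & 0xffff) == 0 && (value >> 31 == 0 || value >> 31 == -1))
    return 1;
  // Split into a sign-extended low 32-bit half and the remaining high part.
  long long low = ((value & 0xffffffffLL) ^ 0x80000000LL) - 0x80000000LL;
  long long high = value >> 31;
  if (high == 0 || high == -1)
    return 2;                          // lis; ori
  high >>= 1;
  if (low == 0)
    return insns_for_gpr(high) + 1;    // build high half; sldi
  if (high == 0)
    return insns_for_gpr(low) + 1;
  return insns_for_gpr(high) + insns_for_gpr(low) + 1;
}
```

On this model a general 64-bit constant such as 0x123456789abcdef0 costs 5 insns per gpr, which is the worst case the patch avoids by loading from memory instead.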

* config/rs6000/predicates.md (easy_fp_constant): Avoid long
dependent insn sequences.
* config/rs6000/rs6000.c (num_insns_constant): Support long
double constants.
* config/rs6000/rs6000.md (mov_softfloat128): Adjust length
attribute.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index cf07d5c6372..94feae28c02 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -564,9 +564,26 @@ (define_predicate "easy_fp_constant"
 {
   gcc_assert (GET_MODE (op) == mode && SCALAR_FLOAT_MODE_P (mode));
 
-  /* Consider all constants with -msoft-float to be easy.  */
+  /* Consider all constants with -msoft-float to be easy when regs are
+     32-bit and thus can be loaded with a maximum of 2 insns.  For
+     64-bit avoid long dependent insn sequences.  */
   if (TARGET_SOFT_FLOAT)
-    return 1;
+    {
+      if (!TARGET_POWERPC64)
+	return 1;
+
+      int size = GET_MODE_SIZE (mode);
+      if (size < 8)
+	return 1;
+
+      int load_from_mem_insns = 2;
+      if (size > 8)
+	load_from_mem_insns++;
+      if (TARGET_CMODEL != CMODEL_SMALL)
+	load_from_mem_insns++;
+      if (num_insns_constant (op, mode) <= load_from_mem_insns)
+	return 1;
+    }
 
   /* 0.0D is not all zero bits.  */
   if (DECIMAL_FLOAT_MODE_P (mode))
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index c438fdc64fe..60c319de467 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5940,6 +5940,25 @@ num_insns_constant (rtx op, machine_mode mode)
	val |= l[WORDS_BIG_ENDIAN ? 1 : 0] & 0xffffffffUL;
mode = DImode;
  }
+	else if (mode == TFmode || mode == TDmode
+		 || mode == KFmode || mode == IFmode)
+	  {
+	    long l[4];
+	    int insns;
+
+	    if (mode == TDmode)
+	      REAL_VALUE_TO_TARGET_DECIMAL128 (*rv, l);
+	    else
+	      REAL_VALUE_TO_TARGET_LONG_DOUBLE (*rv, l);
+
+	    val = (unsigned HOST_WIDE_INT) l[WORDS_BIG_ENDIAN ? 0 : 3] << 32;
+	    val |= l[WORDS_BIG_ENDIAN ? 1 : 2] & 0xffffffffUL;
+	    insns = num_insns_constant_multi (val, DImode);
+	    val = (unsigned HOST_WIDE_INT) l[WORDS_BIG_ENDIAN ? 2 : 1] << 32;
+	    val |= l[WORDS_BIG_ENDIAN ? 3 : 0] & 0xffffffffUL;
+	    insns += num_insns_constant_multi (val, DImode);
+	    return insns;
+	  }
else
  gcc_unreachable ();
   }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d2f6f11b3e5..797d5c32e64 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7729,7 +7729,7 @@ (define_insn_and_split "*mov_softfloat"
(const_string "8")
(const_string "16"))
(if_then_else (match_test "TARGET_POWERPC64")
-   (const_string "40")
+   (const_string "16")
(const_string "32"))
(if_then_else (match_test "TARGET_POWERPC64")
(const_string "8")

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-29 Thread Richard Biener
On Thu, Nov 29, 2018 at 9:34 PM Martin Sebor  wrote:
>
> On 11/16/2018 02:07 AM, Richard Biener wrote:
> > On Fri, Nov 16, 2018 at 4:12 AM Martin Sebor  wrote:
> >>
> >> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html
> >>
> >> Please let me know if there is something I need to change here
> >> to make the fix acceptable or if I should stop trying.
> >
> > I have one more comment about
> >
> > +  /* Defer warning (and folding) until the next statement in the basic
> > + block is reachable.  */
> > +  if (!gimple_bb (stmt))
> > +return false;
> > +
> >
> > it's not about the next statement in the basic-block being "reachable"
> > (even w/o a CFG you can use gsi_next()) but rather that the next
> > stmt isn't yet gimplified and thus not inserted into the gimple sequence,
> > right?
>
> No, it's about the current statement not being associated with
> a basic block yet when the warning code runs for the first time
> (during gimplify_expr), and so gsi_next() returning null.
>
> > You apply this to gimple_fold_builtin_strncpy but I'd rather
> > see us not sprinkling this over gimple-fold.c but instead do this
> > in gimplify.c:maybe_fold_stmt, delaying folding until say lowering.
> >
> > See the attached (untested).
>
> I would also prefer this solution.  I had tested it (in response
> to you first mentioning it back in September) and it causes quite
> a bit of fallout in tests that look for the folding to take place
> very early.  See the end of my reply here:
>
>https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01248.html
>
> But I'm willing to do the test suite cleanup if you think it's
> suitable for GCC 9.  (If you're thinking GCC 10 please let me
> know now.)

I very much prefer that to the hacks in gimple-fold.c.  If it doesn't
help now then I'd rather live with some bogus warnings for GCC 9
and fix it up properly for GCC 10.

I expect the fallout to be quite minimal (also considering my
suggestion to do the folding in gimple-low.c).

Richard.

> Thanks
> Martin
>
> >
> > Richard.
> >
> >
> >
> >> On 10/31/2018 10:33 AM, Martin Sebor wrote:
> >>> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html
> >>>
> >>> On 10/20/2018 06:01 PM, Martin Sebor wrote:
>  On 10/16/2018 03:21 PM, Jeff Law wrote:
> > On 10/4/18 9:51 AM, Martin Sebor wrote:
> >> On 10/04/2018 08:58 AM, Jeff Law wrote:
> >>> On 8/27/18 9:42 AM, Richard Biener wrote:
>  On Mon, Aug 27, 2018 at 5:32 PM Jeff Law  wrote:
> >
> > On 08/27/2018 02:29 AM, Richard Biener wrote:
> >> On Sun, Aug 26, 2018 at 7:26 AM Jeff Law  wrote:
> >>>
> >>> On 08/24/2018 09:58 AM, Martin Sebor wrote:
>  The warning suppression for -Wstringop-truncation looks for
>  the next statement after a truncating strncpy to see if it
>  adds a terminating nul.  This only works when the next
>  statement can be reached using the Gimple statement iterator
>  which isn't until after gimplification.  As a result, strncpy
>  calls that truncate their constant argument that are being
>  folded to memcpy this early get diagnosed even if they are
>  followed by the nul assignment:
> 
>    const char s[] = "12345";
>    char d[3];
> 
>    void f (void)
>    {
>  strncpy (d, s, sizeof d - 1);   // -Wstringop-truncation
>  d[sizeof d - 1] = 0;
>    }
> 
>  To avoid the warning I propose to defer folding strncpy to
>  memcpy until the pointer to the basic block the strnpy call
>  is in can be used to try to reach the next statement (this
>  happens as early as ccp1).  I'm aware of the preference to
>  fold things early but in the case of strncpy (a relatively
>  rarely used function that is often misused), getting
>  the warning right while folding a bit later but still fairly
>  early on seems like a reasonable compromise.  I fear that
>  otherwise, the false positives will drive users to adopt
>  other unsafe solutions (like memcpy) where these kinds of
>  bugs cannot be as readily detected.
> 
>  Tested on x86_64-linux.
> 
>  Martin
> 
>  PS There still are outstanding cases where the warning can
>  be avoided.  I xfailed them in the test for now but will
>  still try to get them to work for GCC 9.
> 
>  gcc-87028.diff
> 
> 
>  PR tree-optimization/87028 - false positive -Wstringop-truncation
>  strncpy with global variable source string
>  gcc/ChangeLog:
> 
>    PR tree-optimization/87028
>    * gimple-fold.c (gimple_fo

Re: [RS6000] num_insns_constant ICE

2018-11-29 Thread Alan Modra
On Thu, Nov 29, 2018 at 01:52:34PM -0600, Segher Boessenkool wrote:
> On Sun, Nov 25, 2018 at 10:49:12PM +1030, Alan Modra wrote:
> > +  while (nregs-- > 0)
> > +{
> > +  int ins = num_insns_constant_gpr (value);
> 
> You probably should mask "value" here so that it is only one GPR.
> Alternatively, make num_insns_constant_gpr handle that.  Consider what
> happens for a 64-bit number with 32-bit registers here?

Oh, right, the low part will always give an answer of 2 if the high
part isn't merely a sign extension.

> > +   val = l[WORDS_BIG_ENDIAN ? 0 : 1] << 32;
> 
> This doesn't work, as in the other patch: long can be 32 bit.

Huh, I originally had "high" and "low" HOST_WIDE_INTs which avoided
this sort of problem on 32-bit hosts.

Revised patch follows.  Bootstrapped and regression tested
powerpc64le-linux.  Hopefully this has removed all the stupidity.
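The masking fix can be seen in miniature: the revised per-word loop sign-extends each word before costing it, so the low word of a 64-bit value on a 32-bit target is treated as a 32-bit signed quantity.  A stand-in sketch for GCC's sext_hwi with BITS_PER_WORD assumed to be 32:

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low 32 bits of V to the full width, as
// sext_hwi (value, 32) would in the revised per-word loop.
static long long sext32(unsigned long long v)
{
  return (int32_t) (v & 0xffffffffULL);
}
```

Without this step, the cost routine would see the whole 64-bit value for each word and, as noted above, always answer 2 for the low part whenever the high part is not merely a sign extension.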

* config/rs6000/rs6000.c (num_insns_constant_gpr): Renamed from
num_insns_constant_wide.  Make static.  Revise comment.
(num_insns_constant_multi): New function.
(num_insns_constant): Formatting.  Correct CONST_WIDE_INT
calculation.  Simplify and extract code common to both
CONST_INT and CONST_DOUBLE.  Add gcc_unreachable for unhandled
const_double modes.
* config/rs6000/rs6000-protos.h (num_insns_constant_wide): Delete.

diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 1dfe7995ff1..dfee1f28aa9 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -36,7 +36,6 @@ extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
 extern int num_insns_constant (rtx, machine_mode);
-extern int num_insns_constant_wide (HOST_WIDE_INT);
 extern int small_data_operand (rtx, machine_mode);
 extern bool mem_operand_gpr (rtx, machine_mode);
 extern bool mem_operand_ds_form (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 02e69c103ec..c438fdc64fe 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5821,11 +5821,12 @@ direct_return (void)
   return 0;
 }
 
-/* Return the number of instructions it takes to form a constant in an
-   integer register.  */
+/* Helper for num_insns_constant.  Calculate number of instructions to
+   load VALUE to a single gpr using combinations of addi, addis, ori,
+   oris and sldi instructions.  */
 
-int
-num_insns_constant_wide (HOST_WIDE_INT value)
+static int
+num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
@@ -5847,85 +5848,108 @@ num_insns_constant_wide (HOST_WIDE_INT value)
   high >>= 1;
 
   if (low == 0)
-   return num_insns_constant_wide (high) + 1;
+   return num_insns_constant_gpr (high) + 1;
   else if (high == 0)
-   return num_insns_constant_wide (low) + 1;
+   return num_insns_constant_gpr (low) + 1;
   else
-   return (num_insns_constant_wide (high)
-   + num_insns_constant_wide (low) + 1);
+   return (num_insns_constant_gpr (high)
+   + num_insns_constant_gpr (low) + 1);
 }
 
   else
 return 2;
 }
 
+/* Helper for num_insns_constant.  Allow constants formed by the
+   num_insns_constant_gpr sequences, plus li -1, rldicl/rldicr/rlwinm,
+   and handle modes that require multiple gprs.  */
+
+static int
+num_insns_constant_multi (HOST_WIDE_INT value, machine_mode mode)
+{
+  int nregs = (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+  int total = 0;
+  while (nregs-- > 0)
+    {
+      HOST_WIDE_INT low = sext_hwi (value, BITS_PER_WORD);
+      int insns = num_insns_constant_gpr (low);
+      if (insns > 2
+	  /* We won't get more than 2 from num_insns_constant_gpr
+	     except when TARGET_POWERPC64 and mode is DImode or
+	     wider, so the register mode must be DImode.  */
+	  && rs6000_is_valid_and_mask (GEN_INT (low), DImode))
+	insns = 2;
+      total += insns;
+      value >>= BITS_PER_WORD;
+    }
+  return total;
+}
+
+/* Return the number of instructions it takes to form a constant in as
+   many gprs as are needed for MODE.  */
+
 int
 num_insns_constant (rtx op, machine_mode mode)
 {
-  HOST_WIDE_INT low, high;
+  HOST_WIDE_INT val;
 
   switch (GET_CODE (op))
 {
 case CONST_INT:
-  if ((INTVAL (op) >> 31) != 0 && (INTVAL (op) >> 31) != -1
- && rs6000_is_valid_and_mask (op, mode))
-   return 2;
-  else
-   return num_insns_constant_wide (INTVAL (op));
+  val = INTVAL (op);
+  break;
 
 case CONST_WIDE_INT:
   {
-   int i;
-   int ins = CONST_WIDE_INT_NUNITS (op) - 1;
-   for (i = 0; i < CONST_WIDE_INT_NUNITS (op); i++)
- ins += num_insns_constant_wide (CONST_WIDE_INT_ELT (op, i));
-   return ins;
+   int 

Re: [PATCH, libfortran] PR 88137 Initialize backtrace state once

2018-11-29 Thread Janne Blomqvist
PING!

(for the GCC-7 branch, I'll commit it after the 7.4 release)

On Thu, Nov 22, 2018 at 11:17 AM Janne Blomqvist 
wrote:

> From backtrace.h for backtrace_create_state:
>
>Calling this function allocates resources that can not be freed.
>There is no backtrace_free_state function.  The state is used to
>cache information that is expensive to recompute.  Programs are
>expected to call this function at most once and to save the return
>value for all later calls to backtrace functions.
>
> So instead of calling backtrace_create_state every time we wish to
> show a backtrace, do it once and store the result in a static
> variable.  libbacktrace allows multiple threads to access the state,
> so no need to use TLS.
>
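> The pattern, reduced to a sketch (the names here are illustrative
> stand-ins; the real call also passes __gthread_active_p () and an
> error callback):

```cpp
#include <cassert>

// Once-only lazy creation of a non-freeable state object, cached in a
// function-local static.  backtrace_state_stub and create_state_stub
// stand in for libbacktrace's state and backtrace_create_state.
struct backtrace_state_stub {};

static int create_calls = 0;

static backtrace_state_stub* create_state_stub()
{
  ++create_calls;                    // expensive; can never be freed
  static backtrace_state_stub s;
  return &s;
}

static backtrace_state_stub* get_state()
{
  static backtrace_state_stub* cached = nullptr;
  if (!cached)
    cached = create_state_stub();    // runs at most once
  return cached;
}
```

All later calls return the cached pointer, matching the backtrace.h contract quoted above.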
> Regtested on x86_64-pc-linux-gnu, Ok for trunk/8/7?
>
> libgfortran/ChangeLog:
>
> 2018-11-22  Janne Blomqvist  
>
> PR libfortran/88137
> * runtime/backtrace.c (show_backtrace): Make lbstate a static
> variable, initialize once.
> ---
>  libgfortran/runtime/backtrace.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/libgfortran/runtime/backtrace.c
> b/libgfortran/runtime/backtrace.c
> index e0c277044b6..3fc973a5e6d 100644
> --- a/libgfortran/runtime/backtrace.c
> +++ b/libgfortran/runtime/backtrace.c
> @@ -146,11 +146,15 @@ full_callback (void *data, uintptr_t pc, const char
> *filename,
>  void
>  show_backtrace (bool in_signal_handler)
>  {
> -  struct backtrace_state *lbstate;
> +  /* Note that libbacktrace allows the state to be accessed from
> + multiple threads, so we don't need to use a TLS variable for the
> + state here.  */
> +  static struct backtrace_state *lbstate;
>struct mystate state = { 0, false, in_signal_handler };
> -
> -  lbstate = backtrace_create_state (NULL, __gthread_active_p (),
> -   error_callback, NULL);
> +
> +  if (!lbstate)
> +lbstate = backtrace_create_state (NULL, __gthread_active_p (),
> + error_callback, NULL);
>
>if (lbstate == NULL)
>  return;
> --
> 2.17.1
>
>

-- 
Janne Blomqvist


Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-11-29 Thread Richard Biener
On Thu, Nov 29, 2018 at 5:44 PM Alexander Monakov  wrote:
>
> > On Mon, 15 Oct 2018, Richard Biener wrote:
> > > I think it's sound but also note that I think it is logically independent 
> > > of
> > > asm inline ().  While it may work for the inlining issue for some kernel
> > > examples to asm inline () is sth similar to always_inline for functions,
> > > that is, even though an asm _does_ have non-negligible .text size
> > > we do want to ignore that for the purpose of inlining (but not for the
> > > purpose of branch size estimation).
> >
> > My understanding is that kernel folks are not demanding "always_inline"
> > semantics though; they were unhappy with inlining cost mis-estimation.
>
> Ping - as I think this approach addresses the root of the problem, I wouldn't
> like it to be forgotten.

I agree this is also useful but it addresses another issue (that may appear to
be related).  asm inline is really a hint to the inliner estimates (with no way
to get semantics botched) while marking off-section parts is making the
asm text more precise also affecting code generation and thus has the
possibility to cause correctness issues (if you say mark everything as
off-section just to make it inline better).

I'm sympathetic to both patches but clearly the kernel folks have shown
need for the inline hint (arguably the kernel folks are the ones we could
expect to get usages of %` correct ...).  I haven't seen any comments
from possible users of %`.

Richard.
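For reference, the size estimate being refined here can be sketched as counting statement separators in the template (assuming ';' is the assembler's separator; GCC then multiplies by the target's longest insn length):

```cpp
#include <cassert>

// Rough model of GCC's asm_str_count: an empty template is zero insns,
// otherwise one insn plus one per newline or statement separator.
// The proposed %` markers would make such a loop skip separators that
// fall between a pair of markers.
static int count_asm_insns(const char* tmpl)
{
  if (*tmpl == '\0')
    return 0;
  int n = 1;
  for (const char* p = tmpl; *p; ++p)
    if (*p == '\n' || *p == ';')
      ++n;
  return n;
}
```

This is why a long .pushsection/.popsection block of data inflates the estimate: every line counts as an instruction even though none is emitted in the asm's own section.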

> > > Note in your docs you refer to "non-code" sections but it should
> > > equally apply to .text sections that are not the section of the asm
> > > context (.text.cold, for example).  So better wording there would
> > > be appreciated.
> >
> > I've tried to address this with the following incremental change. I think
> > it was the only suggestion, so the rest of the patch is unchanged.
> >
> > @@ -9849,11 +9849,11 @@ a label is unreachable.
> >  Likewise, it is possible for GCC to significantly over-estimate the
> >  number of instructions in an @code{asm}, resulting in suboptimal decisions
> >  when the estimate is used during inlining and branch range optimization.
> > -This can happen if the @code{asm} template has many lines that do not
> > -correspond to instructions, for example when the @samp{.pushsection}
> > -directive is used to emit auxiliary data in a non-code section.
> > +For instance, this can happen if a @samp{.pushsection} directive is used to
> > +temporarily switch to another section and emit auxiliary code or data 
> > there.
> >  For Extended @code{asm} statements, you can improve the estimate by
> > -wrapping the non-code portion in @samp{%` ... %`} delimiters.
> > +wrapping the parts where newlines and separators should not be counted in
> > +@samp{%` ... %`} delimiters.
> >
> >
> > * doc/extend.texi (Extended Asm): Document %` in template.
> > (Size of an Asm): Document intended use of %`.
> > * final.c (asm_insn_count): Adjust.
> > (asm_str_count): Add argument to distinguish basic and extended 
> > asms.
> > In extended asms, ignore separators inside of %` ... %`.
> > (output_asm_insn): Handle %`.
> > * rtl.h (asm_str_count): Adjust prototype.
> > * tree-inline.c (estimate_num_insns): Adjust.
> > * config/arm/arm.c (arm_rtx_costs_internal): Adjust.
> >
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index 34ecc4f8d14..dac4728375b 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -8622,6 +8622,11 @@ generates multiple assembler instructions.
> >  Outputs @samp{@{}, @samp{|}, and @samp{@}} characters (respectively)
> >  into the assembler code.  When unescaped, these characters have special
> >  meaning to indicate multiple assembler dialects, as described below.
> > +
> > +@item %`
> > +Signifies a boundary of a region where instruction separators are not
> > +counted towards its size (@pxref{Size of an asm}). Must be followed by
> > +a whitespace character.
> >  @end table
> >
> >  @subsubheading Multiple assembler dialects in @code{asm} templates
> > @@ -9830,7 +9835,7 @@ does this by counting the number of instructions in 
> > the pattern of the
> >  @code{asm} and multiplying that by the length of the longest
> >  instruction supported by that processor.  (When working out the number
> >  of instructions, it assumes that any occurrence of a newline or of
> > -whatever statement separator character is supported by the assembler --
> > +whatever statement separator character is supported by the assembler ---
> >  typically @samp{;} --- indicates the end of an instruction.)
> >
> >  Normally, GCC's estimate is adequate to ensure that correct
> > @@ -9841,6 +9846,15 @@ space in the object file than is needed for a single 
> > instruction.
> >  If this happens then the assembler may produce a diagnostic saying that
> >  a label is unreachable.
> >
> > +Likewise, it is possible for GCC to significantly over-estimate th

Re: [PATCH v3] Make function clone name numbering independent.

2018-11-29 Thread Richard Biener
On Wed, Nov 28, 2018 at 10:09 PM Michael Ploujnikov
 wrote:
>
> Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html
>
> It took longer than expected, but I've finally rebased this on top of
> the new clone_function_name* API and included the requested
> optimizations.
>
> There are a few remaining spots where we could probably apply similar
> optimizations:
>
> - gcc/multiple_target.c:create_target_clone
> - gcc/multiple_target.c:create_dispatcher_calls
> - gcc/omp-expand.c:grid_expand_target_grid_body
>
> But I've yet to figure out how these work in detail and with the new
> API these shouldn't block the main change from being merged.
>
> I've also included a small change to rs6000 which I'm pretty sure is
> safe, but I have no way of testing.
>
> I'm not sure what's the consensus on what's more appropriate, but I
> thought that it would be a good idea to keep these changes as separate
> patches since only the first one really changes behaviour, while the
> rest are optimizations. It's conceivable that someone who is
> backporting this to an older release might want to just backport the
> core bits of the change. I can re-submit it as one patch if that's
> more appropriate.
>
> Everything in these patches was bootstrapped and regression tested on
> x86_64.
>
> Ok for trunk?

+  unsigned &clone_number = lto_clone_numbers->get_or_insert (
+ IDENTIFIER_POINTER (DECL_NAME (decl)));
   name = maybe_rewrite_identifier (name);
   symtab->change_decl_assembler_name (decl,
-  clone_function_name_numbered (
-  name, "lto_priv"));
+  clone_function_name (
+  name, "lto_priv", clone_number));

since we use 'name' from maybe_rewrite_identifier in the end wouldn't it
make more sense to use that as a key for lto_clone_numbers?

For the ipa-hsa.c changes since we iterate over all defined functions
we at most create a single clone per cgraph_node.  That means your
hash-map keyed on cgraph_node is useless and you could use
an unnumbered clone (or a static zero number).

-  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
-   suffix));
+  SET_DECL_ASSEMBLER_NAME (new_decl,
+   clone_function_name (
+ IDENTIFIER_POINTER (
+   DECL_ASSEMBLER_NAME (old_decl)),
+ suffix,
+ num_suffix));

can you please hide the implementation detail of accessing the assembler name
by adding an overload to clone_function_name_numbered with an explicit
number?

OK with those changes.

Thanks,
Richard.



>
> - Michael
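The core behaviour change under review — numbering clones per base name rather than from one global counter — can be sketched with an ordinary map standing in for GCC's hash_map (names and the "base.suffix.N" format are illustrative):

```cpp
#include <cassert>
#include <map>
#include <string>

// Each base name gets its own counter, so creating a clone of one
// function no longer shifts the numbers assigned to every other
// function's clones.
static std::map<std::string, unsigned> clone_numbers;

static std::string clone_name(const std::string& base,
                              const std::string& suffix)
{
  unsigned& n = clone_numbers[base];   // default-constructed to 0
  return base + "." + suffix + "." + std::to_string(n++);
}
```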


[PATCH, libstdc++, C++20] Implement P0457R2 String Prefix and Suffix Checking.

2018-11-29 Thread Ed Smith-Rowland

Greetings,

This patch implements starts_with and ends_with for basic_string and 
basic_string_view for C++20.


This builds and tests cleanly on x86_64-linux.

Ed



2018-11-30  Edward Smith-Rowland  <3dw...@verizon.net>

Implement P0457R2 String Prefix and Suffix Checking.
* include/bits/basic_string.h: Add starts_with, ends_with members.
* include/std/string_view: Ditto.
* testsuite/21_strings/basic_string/operations/starts_with/
char/1.cc: New test.
* testsuite/21_strings/basic_string/operations/starts_with/
wchar_t/1.cc: New test.
* testsuite/21_strings/basic_string/operations/ends_with/
char/1.cc: New test.
* testsuite/21_strings/basic_string/operations/ends_with/
wchar_t/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/starts_with/
char/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/starts_with/
wchar_t/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/ends_with/
char/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/ends_with/
wchar_t/1.cc: New test.
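The new members reduce to string_view comparisons; as a usage-level sketch, here they are as free functions mirroring the patch's implementation (so no C++20 library is needed to try them):

```cpp
#include <cassert>
#include <string_view>

// starts_with: the prefix must fit and compare equal at position 0.
constexpr bool starts_with(std::string_view s, std::string_view x)
{
  return s.size() >= x.size() && s.compare(0, x.size(), x) == 0;
}

// ends_with: the suffix must fit and compare equal at the tail.
constexpr bool ends_with(std::string_view s, std::string_view x)
{
  return s.size() >= x.size()
         && s.compare(s.size() - x.size(), std::string_view::npos, x) == 0;
}
```

The char and const _CharT* overloads in the patch simply wrap their argument in a basic_string_view and delegate to these.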

Index: include/bits/basic_string.h
===
--- include/bits/basic_string.h (revision 266645)
+++ include/bits/basic_string.h (working copy)
@@ -3038,6 +3038,32 @@
   compare(size_type __pos, size_type __n1, const _CharT* __s,
  size_type __n2) const;
 
+#if __cplusplus > 201703L
+  bool
+  starts_with(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(const _CharT* __x) const
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  ends_with(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+
+  bool
+  ends_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+
+  bool
+  ends_with(const _CharT* __x) const
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+#endif // C++20
+
   // Allow basic_stringbuf::__xfer_bufptrs to call _M_length:
   template friend class basic_stringbuf;
 };
@@ -5884,6 +5910,32 @@
   compare(size_type __pos, size_type __n1, const _CharT* __s,
  size_type __n2) const;
 
+#if __cplusplus > 201703L
+  bool
+  starts_with(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(const _CharT* __x) const
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  ends_with(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+
+  bool
+  ends_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+
+  bool
+  ends_with(const _CharT* __x) const
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+#endif // C++20
+
 # ifdef _GLIBCXX_TM_TS_INTERNAL
   friend void
   ::_txnal_cow_string_C1_for_exceptions(void* that, const char* s,
Index: include/std/string_view
===
--- include/std/string_view (revision 266645)
+++ include/std/string_view (working copy)
@@ -227,7 +227,6 @@
__sv = __tmp;
   }
 
-
   // [string.view.ops], string operations:
 
   size_type
@@ -387,6 +386,36 @@
  traits_type::length(__str));
   }
 
+#if __cplusplus > 201703L
+  constexpr bool
+  starts_with(basic_string_view __x) const noexcept
+  { return this->size() >= __x.size()
+   && this->compare(0, __x.size(), __x) == 0; }
+
+  constexpr bool
+  starts_with(_CharT __x) const noexcept
+  { return this->starts_with(basic_string_view(&__x, 1)); }
+
+  constexpr bool
+  starts_with(const _CharT* __x) const
+  { return this->starts_with(basic_string_view(__x)); }
+
+  constexpr bool
+  ends_with(basic_string_view __x) const noexcept
+  {
+   return this->size() >= __x.size()
+   && this->compare(this->size() - __x.size(), npos, __x) == 0;
+  }
+
+  constexpr bool
+  ends_with(_CharT __x) const noexcept
+  { return this->ends_with(basic_string_view(&__x, 1)); }
+
+  constexpr bool
+  ends_with(const _CharT* __x) const
+  { return this-

Re: [PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-29 Thread Jeff Law
On 11/29/18 4:43 PM, Martin Sebor wrote:
>> The fallout on existing tests is minimal.  What's more concerning is
>> that it doesn't actually pass the new test from Martin's original
>> submission.  We get bogus warnings.
>>
>> At least part of the problem is weakness in maybe_diag_stxncpy_trunc.
>> It can't handle something like this:
>>
>> test_literal (char * d, struct S * s)
>> {
>>    strncpy (d, "1234567890", 3);
>>    _1 = d + 3;
>>    *_1 = 0;
>> }
>>
>>
>> Note the pointer arithmetic between the strncpy and storing the NUL
>> terminator.
> 
> Right.  I'm less concerned about this case because it involves
> a literal that's obviously longer than the destination but it
> would be nice if the suppression worked here as well in case
> the literal comes from macro expansion.  It will require
> another tweak.
OK.  If this isn't at the core of the regression BZ, then xfailing those
particular cases and coming back to them is fine.

> 
> But the test from my patch passes with the changes to calls.c
> from my patch, so that's an improvement.
> 
> I have done the test suite cleanup in the attached patch.  It
> was indeed minimal -- not sure why I saw so many failures with
> my initial approach.
Richi's does the folding as part of gimple lowering.  So it's still
pretty early -- basically it ends up hitting just a few tests that are
looking for folded stuff in the .gimple dump.

I had actually targeted this patch as one to work through and try to get
resolved today.  Kind of funny that we were poking at it at the same time.


> 
> I can submit a patch to handle the literal case above as
> a followup unless you would prefer it done at the same time.
Follow-up is fine by me.

jeff


Re: [PATCH 1/2] asm qualifiers (PR55681)

2018-11-29 Thread Segher Boessenkool
On Fri, Nov 30, 2018 at 12:11:30AM +, Joseph Myers wrote:
> On Thu, 29 Nov 2018, Segher Boessenkool wrote:
> 
> > So "asm const restrict" is allowed, but "asm const const restrict" isn't.
> 
> No, asm const restrict isn't allowed.  volatile is allowed; const and 
> restrict are allowed with warnings because that replicates what the old 
> bison parser allowed; but at most one qualifier is allowed at present.

I mean the wanted behaviour, not the current behaviour.

> > What do you want done with const and restrict (and _Atomic, which is
> > allowed by the current grammar)?
> 
> Don't allow _Atomic, since it's not allowed at present.  Either allow at 
> most one qualifier (being one of volatile / const / restrict) or remove 
> support for const and restrict there and just allow (at most one) 
> volatile.

I'll go for the latter.  That also harmonises C and C++ here.

Thanks!


Segher


Re: [PATCH 1/2] asm qualifiers (PR55681)

2018-11-29 Thread Joseph Myers
On Thu, 29 Nov 2018, Segher Boessenkool wrote:

> So "asm const restrict" is allowed, but "asm const const restrict" isn't.

No, asm const restrict isn't allowed.  volatile is allowed; const and 
restrict are allowed with warnings because that replicates what the old 
bison parser allowed; but at most one qualifier is allowed at present.

> What do you want done with const and restrict (and _Atomic, which is
> allowed by the current grammar)?

Don't allow _Atomic, since it's not allowed at present.  Either allow at 
most one qualifier (being one of volatile / const / restrict) or remove 
support for const and restrict there and just allow (at most one) 
volatile.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH, libstdc++] Uniform container erasure for c++20 returning number of erasures.

2018-11-29 Thread Ed Smith-Rowland
This version of erase_if/erase returning the number erased actually builds
and tests cleanly on x86_64-linux.


Is this what you had in mind for this?  I basically return the number of
erased items for all the containers.


Ed
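A sketch of the semantics in question — erase_if returning how many elements were removed.  The free function here is illustrative (shown for vector only; the patch applies this across the standard containers):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Erase all elements satisfying PRED and report how many were removed,
// using the classic remove_if/erase idiom.
template <typename T, typename Pred>
std::size_t erase_if_counting(std::vector<T>& v, Pred pred)
{
  auto it = std::remove_if(v.begin(), v.end(), pred);
  std::size_t n = static_cast<std::size_t>(v.end() - it);
  v.erase(it, v.end());
  return n;                  // number of erased elements
}
```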




Re: [PATCH 1/2] asm qualifiers (PR55681)

2018-11-29 Thread Segher Boessenkool
On Thu, Nov 29, 2018 at 11:14:45PM +, Joseph Myers wrote:
> On Thu, 29 Nov 2018, Segher Boessenkool wrote:
> 
> > > What's the basis for allowing duplicates for C but not for C++?
> > 
> > It is the status quo.  It would make sense to allow duplicates for C++ as
> > well, sure.  If that is preferred I can make a patch for it?
> 
> Duplicate qualifiers are allowed *in declarations* for C (as per C99).  

And I used type-qualifier-list[opt] there, to start with.

> They aren't allowed in asm.  I'd think the most obvious thing would be not 
> to allow duplicate qualifiers in asm at all (but still allow any ordering 
> of volatile, goto and inline).  Essentially, the use in asm is just 
> reusing a keyword in a different context, so I don't think duplicates 
> being allowed in declarations is relevant to allowing them in asm (any 
> more than it indicates that __attribute__ ((const const const)) should be 
> allowed just because const is a valid attribute).

So "asm const restrict" is allowed, but "asm const const restrict" isn't.
Hrm.  That means we'll have to keep track of the ignored qualifiers.
(There aren't any ignored asm qualifiers in C++ so no such issue there).

What do you want done with const and restrict (and _Atomic, which is
allowed by the current grammar)?


Segher


Re: [PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-29 Thread Martin Sebor

On 11/29/18 4:07 PM, Jeff Law wrote:

On 11/29/18 1:34 PM, Martin Sebor wrote:

On 11/16/2018 02:07 AM, Richard Biener wrote:

On Fri, Nov 16, 2018 at 4:12 AM Martin Sebor  wrote:


Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

Please let me know if there is something I need to change here
to make the fix acceptable or if I should stop trying.


I have one more comment about

+  /* Defer warning (and folding) until the next statement in the basic
+ block is reachable.  */
+  if (!gimple_bb (stmt))
+    return false;
+

it's not about the next statement in the basic-block being "reachable"
(even w/o a CFG you can use gsi_next()) but rather that the next
stmt isn't yet gimplified and thus not inserted into the gimple sequence,
right?


No, it's about the current statement not being associated with
a basic block yet when the warning code runs for the first time
(during gimplify_expr), and so gsi_next() returning null.


You apply this to gimple_fold_builtin_strncpy but I'd rather
see us not sprinkling this over gimple-fold.c but instead do this
in gimplify.c:maybe_fold_stmt, delaying folding until say lowering.

See the attached (untested).


I would also prefer this solution.  I had tested it (in response
to you first mentioning it back in September) and it causes quite
a bit of fallout in tests that look for the folding to take place
very early.  See the end of my reply here:

   https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01248.html

But I'm willing to do the test suite cleanup if you think it's
suitable for GCC 9.  (If you're thinking GCC 10 please let me
know now.)

The fallout on existing tests is minimal.  What's more concerning is
that it doesn't actually pass the new test from Martin's original
submission.  We get bogus warnings.

At least part of the problem is weakness in maybe_diag_stxncpy_trunc.
It can't handle something like this:

test_literal (char * d, struct S * s)
{
   strncpy (d, "1234567890", 3);
   _1 = d + 3;
   *_1 = 0;
}


Note the pointer arithmetic between the strncpy and storing the NUL
terminator.


Right.  I'm less concerned about this case because it involves
a literal that's obviously longer than the destination but it
would be nice if the suppression worked here as well in case
the literal comes from macro expansion.  It will require
another tweak.

But the test from my patch passes with the changes to calls.c
from my patch, so that's an improvement.

I have done the test suite cleanup in the attached patch.  It
was indeed minimal -- not sure why I saw so many failures with
my initial approach.

I can submit a patch to handle the literal case above as
a followup unless you would prefer it done at the same time.

Martin
PR tree-optimization/87028 - false positive -Wstringop-truncation strncpy with global variable source string

gcc/ChangeLog:

	PR tree-optimization/87028
	* calls.c (get_attr_nonstring_decl): Avoid setting *REF to
	SSA_NAME_VAR.
	* gimple-low.c (lower_stmt): Delay folding built-ins.
	* gimplify.c (maybe_fold_stmt): Avoid folding statements that
	don't belong to a basic block.
	* tree.h (SSA_NAME_VAR): Update comment.
	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Simplify.

gcc/testsuite/ChangeLog:

	PR tree-optimization/87028
	* c-c++-common/Wstringop-truncation.c: Remove xfails.
	* gcc.dg/Wstringop-truncation-5.c: New test.
	* gcc.dg/strcmpopt_1.c: Adjust.
	* gcc.dg/tree-ssa/pr79697.c: Same.

Index: gcc/calls.c
===
--- gcc/calls.c	(revision 266637)
+++ gcc/calls.c	(working copy)
@@ -1503,6 +1503,7 @@ tree
 get_attr_nonstring_decl (tree expr, tree *ref)
 {
   tree decl = expr;
+  tree var = NULL_TREE;
   if (TREE_CODE (decl) == SSA_NAME)
 {
   gimple *def = SSA_NAME_DEF_STMT (decl);
@@ -1515,17 +1516,25 @@ get_attr_nonstring_decl (tree expr, tree *ref)
 	  || code == VAR_DECL)
 	decl = gimple_assign_rhs1 (def);
 	}
-  else if (tree var = SSA_NAME_VAR (decl))
-	decl = var;
+  else
+	var = SSA_NAME_VAR (decl);
 }
 
   if (TREE_CODE (decl) == ADDR_EXPR)
 decl = TREE_OPERAND (decl, 0);
 
+  /* To simplify calling code, store the referenced DECL regardless of
+ the attribute determined below, but avoid storing the SSA_NAME_VAR
+ obtained above (it's not useful for dataflow purposes).  */
   if (ref)
 *ref = decl;
 
-  if (TREE_CODE (decl) == ARRAY_REF)
+  /* Use the SSA_NAME_VAR that was determined above to see if it's
+ declared nonstring.  Otherwise drill down into the referenced
+ DECL.  */
+  if (var)
+decl = var;
+  else if (TREE_CODE (decl) == ARRAY_REF)
 decl = TREE_OPERAND (decl, 0);
   else if (TREE_CODE (decl) == COMPONENT_REF)
 decl = TREE_OPERAND (decl, 1);
Index: gcc/gimple-low.c
===
--- gcc/gimple-low.c	(revision 266637)
+++ gcc/gimple-low.c	(working copy)
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "g

Re: [PATCH, rs6000] Fix PR87496: ICE in aggregate_value_p at gcc/function.c:2046

2018-11-29 Thread Peter Bergner
On 11/29/18 1:31 PM, Peter Bergner wrote:
> On 11/29/18 11:26 AM, Segher Boessenkool wrote:
>> On Wed, Nov 28, 2018 at 03:27:19PM -0600, Peter Bergner wrote:
>>> PR87496 shows a bug where we ICE if we attempt to use -mabi=ieeelongdouble
>>> and -mno-popcntd.  The IEEE128 support requires full ISA 2.06 (aka POWER7)
>>> support, so we really should throw an error when using those options
>>> together.  Ditto for -mabi=ieeelongdouble and -mno-vsx.  The patch below
>>> does that.
>>>
>>> Ok for mainline once bootstrap and regtesting are complete and clean?
>>
>> Okay.  Eventually we shouldn't allow selecting popcntd independently from
>> -mcpu=, but that day isn't here yet.  So, okay for trunk, and backports
>> if wanted.  Thanks!
> 
> Ok, committed to mainline.  It looks like GCC8 needs the same patch.
> I'll have to look closer at GCC7 on whether it needs it too, since the
> code seems to be a little different.

Ok, both GCC8 and GCC7 need the backport.  I committed the backport
to GCC8 and will do the same for GCC7 once the freeze is over.

Peter





PING [PATCH] x86: Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]

2018-11-29 Thread H.J. Lu
On Wed, Oct 31, 2018 at 12:42 PM H.J. Lu  wrote:
>
> On Thu, Sep 27, 2018 at 7:58 AM Richard Biener
>  wrote:
> >
> > On Thu, Sep 27, 2018 at 3:16 PM H.J. Lu  wrote:
> > >
> > > On Thu, Sep 27, 2018 at 6:08 AM, Szabolcs Nagy  
> > > wrote:
> > > > On 26/09/18 19:10, H.J. Lu wrote:
> > > >>
> > > >> Add -mzero-caller-saved-regs=[skip|used|all] command-line option and
> > > >> zero_caller_saved_regs("skip|used|all") function attribue:
> > > >>
> > > >> 1. -mzero-caller-saved-regs=skip and zero_caller_saved_regs("skip")
> > > >>
> > > >> Don't zero caller-saved integer registers upon function return.
> > > >>
> > > >> 2. -mzero-caller-saved-regs=used and zero_caller_saved_regs("used")
> > > >>
> > > >> Zero used caller-saved integer registers upon function return.
> > > >>
> > > >> 3. -mzero-caller-saved-regs=all and zero_caller_saved_regs("all")
> > > >>
> > > >> Zero all caller-saved integer registers upon function return.
> > > >>
> > > >> Tested on i686 and x86-64 with bootstrapping GCC trunk and
> > > >> -mzero-caller-saved-regs=used as well as -mzero-caller-saved-regs=all
> > > >> enabled by default.
> > > >>
> > > >
> > > > from this description and the documentation it's
> > > > not clear to me what this tries to achieve.
> > > >
> > > > is it trying to prevent information leak?
> > > > or some pcs hack the caller may rely on?
> > > >
> > > > if it's for information leak then i'd expect such
> > > > attribute to be used on crypto code.. however i'd
> > > > expect crypto code to use simd registers as well,
> > > > so integer only cleaning needs explanation.
> > >
> > > The target usage is in Linux kernel.
> >
> > Maybe still somehow encode that in the option since it otherwise raises
> > expectations that are not met?
> > -mzero-call-clobbered-regs=used-int|all-int|skip|used-simd|used-fp,etc.?
> > and sorry() on unimplemented ones?  Or simply zero also non-integer
> > regs the same way?  I suppose
> > there isn't sth like vzeroupper that zeros all SIMD regs and completely?
> >
>
> Here is the updated patch to zero caller-saved vector registers.   I don't
> mind a different option name if it is preferred.  I may be able to create
> some generic utility functions which can be used by other backends.  But
> actual implementation must be target specific.
>
> Any comments?

PING.

https://gcc.gnu.org/ml/gcc-patches/2018-10/msg02079.html

-- 
H.J.


Re: [PATCH 1/2] asm qualifiers (PR55681)

2018-11-29 Thread Joseph Myers
On Thu, 29 Nov 2018, Segher Boessenkool wrote:

> > What's the basis for allowing duplicates for C but not for C++?
> 
> It is the status quo.  It would make sense to allow duplicates for C++ as
> well, sure.  If that is preferred I can make a patch for it?

Duplicate qualifiers are allowed *in declarations* for C (as per C99).  
They aren't allowed in asm.  I'd think the most obvious thing would be not 
to allow duplicate qualifiers in asm at all (but still allow any ordering 
of volatile, goto and inline).  Essentially, the use in asm is just 
reusing a keyword in a different context, so I don't think duplicates 
being allowed in declarations is relevant to allowing them in asm (any 
more than it indicates that __attribute__ ((const const const)) should be 
allowed just because const is a valid attribute).

-- 
Joseph S. Myers
jos...@codesourcery.com


[SPARC] Fix PR target/87807

2018-11-29 Thread Eric Botcazou
This started as a simple fix for a small issue (passing floating-point vectors 
to variadic functions in 64-bit mode) and then evolved into a small cleanup of 
the code implementing the calling conventions of the 2 SPARC ABIs.

Tested and compat-regtested on SPARC/Solaris 11, applied on the mainline.


2018-11-29  Eric Botcazou  

PR target/87807
* config/sparc/sparc-modes.def: Minor tweak.
* config/sparc/sparc.c: Minor reordering.
(sparc_pass_by_reference): Move around.
(traverse_record_type): Change offset from HOST_WIDE_INT to int.
(classify_registers): Likewise for bitpos.
(function_arg_slotno): Remove dead test and tweak comments.
: Remove useless assertion and test whether the
parameter is named in order to pass it in FP registers.  Return
the regno for floating-point vector types.
(compute_int_layout): Change bitpos from HOST_WIDE_INT to int.
(compute_fp_layout): Likewise.
(count_registers): Likewise.
(assign_int_registers): Likewise.
(assign_fp_registers): Likewise.
(assign_registers): Likewise.
(function_arg_record_value): Change size from HOST_WIDE_INT to int
and use CEIL_NWORDS to compute the number of registers.
(function_arg_union_value): Minor tweaks.
(function_arg_vector_value): Add slotno and named parameters, use
CEIL_NWORDS to compute the number of registers.
(sparc_function_arg_1): Rework handling of vector types.  Change
size from HOST_WIDE_INT to int.
(sparc_arg_partial_bytes): Rework handling of 32-bit ABI and deal
with vector types for the 64-bit ABI.
(sparc_function_arg_advance): Likewise.
(sparc_return_in_memory): Add reference to -fpcc-struct-return.
(sparc_struct_value_rtx): Return NULL_RTX instead of 0.
(sparc_function_value_1): Rework handling of vector types.  Change
size from HOST_WIDE_INT to int.


2018-11-29  Eric Botcazou  

* gcc.target/sparc/20181129-1.c: New test.
* gcc.target/sparc/20181129-2.c: Likewise.

-- 
Eric Botcazou

Index: config/sparc/sparc-modes.def
===
--- config/sparc/sparc-modes.def	(revision 266506)
+++ config/sparc/sparc-modes.def	(working copy)
@@ -56,8 +56,8 @@ CC_MODE (CCFP);
 CC_MODE (CCFPE);
 
 /* Vector modes.  */
-VECTOR_MODES (INT, 16);   /* V16QI V8HI V4SI V2DI */
-VECTOR_MODES (INT, 8);/*   V8QI V4HI V2SI */
-VECTOR_MODES (INT, 4);/*   V4QI V2HI  */
-VECTOR_MODE (INT, DI, 1); /* V1DI */
-VECTOR_MODE (INT, SI, 1); /* V1SI */
+VECTOR_MODES (INT, 16);   /* V16QI V8HI V4SI V2DI  */
+VECTOR_MODES (INT, 8);/*   V8QI V4HI V2SI  */
+VECTOR_MODES (INT, 4);/*V4QI V2HI  */
+VECTOR_MODE (INT, DI, 1); /*  V1DI */
+VECTOR_MODE (INT, SI, 1); /*  V1SI */
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 266506)
+++ config/sparc/sparc.c	(working copy)
@@ -642,13 +642,8 @@ static rtx sparc_tls_got (void);
 static int sparc_register_move_cost (machine_mode,
  reg_class_t, reg_class_t);
 static bool sparc_rtx_costs (rtx, machine_mode, int, int, int *, bool);
-static rtx sparc_function_value (const_tree, const_tree, bool);
-static rtx sparc_libcall_value (machine_mode, const_rtx);
-static bool sparc_function_value_regno_p (const unsigned int);
-static rtx sparc_struct_value_rtx (tree, int);
 static machine_mode sparc_promote_function_mode (const_tree, machine_mode,
 		  int *, const_tree, int);
-static bool sparc_return_in_memory (const_tree, const_tree);
 static bool sparc_strict_argument_naming (cumulative_args_t);
 static void sparc_va_start (tree, rtx);
 static tree sparc_gimplify_va_arg (tree, tree, gimple_seq *, gimple_seq *);
@@ -674,6 +669,11 @@ static unsigned int sparc_function_arg_b
 		 const_tree);
 static int sparc_arg_partial_bytes (cumulative_args_t,
 machine_mode, tree, bool);
+static bool sparc_return_in_memory (const_tree, const_tree);
+static rtx sparc_struct_value_rtx (tree, int);
+static rtx sparc_function_value (const_tree, const_tree, bool);
+static rtx sparc_libcall_value (machine_mode, const_rtx);
+static bool sparc_function_value_regno_p (const unsigned int);
 static unsigned HOST_WIDE_INT sparc_asan_shadow_offset (void);
 static void sparc_output_dwarf_dtprel (FILE *, int, rtx) ATTRIBUTE_UNUSED;
 static void sparc_file_end (void);
@@ -806,18 +806,9 @@ char sparc_hard_reg_printed[8];
 
 #undef TARGET_PROMOTE_FUNCTION_MODE
 #define TARGET_PROMOTE_FUNCTION_MODE sparc_promote_function_mode
+#undef TARGET_STRICT_ARGUMENT_NAMING
+#define TARGET_STRICT_ARGUMENT_NAMING sparc_strict_argument_naming
 
-#undef TARGET_FUNCTION_VALUE
-#define TARGET_FUNCTION_VALUE

Re: [PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-29 Thread Jeff Law
On 11/29/18 1:34 PM, Martin Sebor wrote:
> On 11/16/2018 02:07 AM, Richard Biener wrote:
>> On Fri, Nov 16, 2018 at 4:12 AM Martin Sebor  wrote:
>>>
>>> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html
>>>
>>> Please let me know if there is something I need to change here
>>> to make the fix acceptable or if I should stop trying.
>>
>> I have one more comment about
>>
>> +  /* Defer warning (and folding) until the next statement in the basic
>> + block is reachable.  */
>> +  if (!gimple_bb (stmt))
>> +    return false;
>> +
>>
>> it's not about the next statement in the basic-block being "reachable"
>> (even w/o a CFG you can use gsi_next()) but rather that the next
>> stmt isn't yet gimplified and thus not inserted into the gimple sequence,
>> right?
> 
> No, it's about the current statement not being associated with
> a basic block yet when the warning code runs for the first time
> (during gimplify_expr), and so gsi_next() returning null.
> 
>> You apply this to gimple_fold_builtin_strncpy but I'd rather
>> see us not sprinkling this over gimple-fold.c but instead do this
>> in gimplify.c:maybe_fold_stmt, delaying folding until say lowering.
>>
>> See the attached (untested).
> 
> I would also prefer this solution.  I had tested it (in response
> to you first mentioning it back in September) and it causes quite
> a bit of fallout in tests that look for the folding to take place
> very early.  See the end of my reply here:
> 
>   https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01248.html
> 
> But I'm willing to do the test suite cleanup if you think it's
> suitable for GCC 9.  (If you're thinking GCC 10 please let me
> know now.)
The fallout on existing tests is minimal.  What's more concerning is
that it doesn't actually pass the new test from Martin's original
submission.  We get bogus warnings.

At least part of the problem is weakness in maybe_diag_stxncpy_trunc.
It can't handle something like this:

test_literal (char * d, struct S * s)
{
  strncpy (d, "1234567890", 3);
  _1 = d + 3;
  *_1 = 0;
}


Note the pointer arithmetic between the strncpy and storing the NUL
terminator.

jeff


Re: [PING^3] Re: [PATCH 1/3] Support instrumenting returns of instrumented functions

2018-11-29 Thread Jan Hubicka
> 
> I don't think it is because the instrumentation only adds calls, and
> calls don't get annotated in DWARF. The only issue I could think of
> if is something patches in push instructions through the nops, 
> but there is really nothing the compiler could do about that.
> 
> I tested gdb and it can backtrace through the return instrumentation.
> 
> Breakpoint 1, 0x00401182 in __return__ ()
> (gdb) bt
> #0  0x00401182 in __return__ ()
> #1  0x004011a3 in f2 ()
> #2  0x004011b7 in main ()
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)

:) Not the most user-friendly message though.
The patch is OK.

Honza
> 
> -Andi


Re: [PATCH 1/2] asm qualifiers (PR55681)

2018-11-29 Thread Segher Boessenkool
On Thu, Nov 29, 2018 at 09:13:13PM +, Joseph Myers wrote:
> I'd expect testcases to be added for the new syntax variants (duplicate 
> qualifiers / goto and new orderings thereof).

Okay.

> There's a description of the syntax in extend.texi:
> 
> @example
> asm @r{[}volatile@r{]} ( @var{AssemblerTemplate} 
>  : @var{OutputOperands} 
>  @r{[} : @var{InputOperands}
>  @r{[} : @var{Clobbers} @r{]} @r{]})
> 
> asm @r{[}volatile@r{]} goto ( @var{AssemblerTemplate} 
>   : 
>   : @var{InputOperands}
>   : @var{Clobbers}
>   : @var{GotoLabels})
> @end example
> 
> I'd expect this to be updated by this patch, and again by the "asm inline" 
> one.

That stuff needs to be rewritten :-(

But I'll see what I can do.

> What's the basis for allowing duplicates for C but not for C++?

It is the status quo.  It would make sense to allow duplicates for C++ as
well, sure.  If that is preferred I can make a patch for it?


Segher


Re: [PATCH] Improve combiner's find_split_point (PR target/54589)

2018-11-29 Thread Segher Boessenkool
Hi Jakub,

On Thu, Nov 29, 2018 at 10:49:21PM +0100, Jakub Jelinek wrote:
> The following patch attempts to improve find_split_point inside of
> complex MEM addresses, if the target supports REG + REG + const
> addressing, but doesn't support more_complex_rtx + REG + const,
> try to split it at more_complex_rtx rather than more_complex_rtx + REG.

> 2018-11-29  Jakub Jelinek  
> 
>   PR target/54589
>   * combine.c (find_split_point): For invalid memory address
>   nonobj + obj + const, if reg + obj + const is valid addressing
>   mode, split at nonobj.  Use if rather than else if for the
>   fallback.  Comment fixes.
> 
>   * gcc.target/i386/pr54589.c: New test.

That looks good, but let me try it on some bigger builds first.

Thanks


Segher


Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-29 Thread Ian Lance Taylor
Pedro Alves  writes:

> Hi Nick,
>
> On 11/29/2018 03:01 PM, Nick Clifton wrote:
>>  static struct demangle_component *
>>  d_function_type (struct d_info *di)
>>  {
>> -  struct demangle_component *ret;
>> +  static unsigned long recursion_level = 0;
>
> Did you consider making this be a part of struct d_info instead?
> IIRC, d_info is like a "this" pointer, passed around pretty
> much everywhere.
>
> I think going in the direction of making the demangler harder to use
> in an efficient thread-safe manner is undesirable, even if the feature
> is optional.  E.g., in GDB, loading big binaries, demangling is very high
> in profiles, and so we've kicked around the desire to parallelize
> it (e.g., by parallelizing the reading/interning of DSO files, instead of
> reading all of them sequentially).  Having to synchronize access to the
> demangler would be quite unfortunate.  If possible, it'd be great
> to avoid making work toward that direction harder.  (Keeping in mind that
> if this recursion detection feature is useful for binutils, then it should
> also be useful for GDB.)

I agree.  Using static variables here seems problematic.  Right now as
far as I know the demangler has no static variables at all.

Ian


Re: [PATCH] Optimize integer vector comparison followed by movmsk (PR target/88152, take 2)

2018-11-29 Thread Uros Bizjak
On Thu, Nov 29, 2018 at 10:42 PM Jakub Jelinek  wrote:
>
> On Thu, Nov 29, 2018 at 05:26:41PM +0100, Uros Bizjak wrote:
> > On Thu, Nov 29, 2018 at 3:36 PM Jakub Jelinek  wrote:
> > > Like blend, movmsk also only cares about the most significant bit,
> > > so prior < 0 comparisons or (happens also on the testcase below in some
> > > cases) arithmetic shift right (by any value) isn't needed before the 
> > > movmsk.
> > >
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > Same comment as with your lt+blend -> blend patch. I think that
> > pre-reload define_insn_and_split that splits the combination to movmsk
> > would be better here. We already implement similar approach to remove
> > useless maskings of shift operands (c.f. various "..._mask" insns in
> > i386.md).
>
> So like this?  Bootstrapped/regtested on x86_64-linux and i686-linux.
> Ok for trunk?
>
> 2018-11-29  Jakub Jelinek  
>
> PR target/88152
> * config/i386/sse.md (*_movmsk_lt,
> *_movmsk_zext_lt,
> *_movmsk_shift,
> *_movmsk_zext_shift,
> *_pmovmskb_lt, *_pmovmskb_zext_lt): New
> define_insn_and_split patterns.
>
> * g++.target/i386/pr88152.C: New test.

OK.

Thanks,
Uros.

> --- gcc/config/i386/sse.md.jj   2018-11-29 18:52:42.747904630 +0100
> +++ gcc/config/i386/sse.md  2018-11-29 19:21:44.371143252 +0100
> @@ -14653,6 +14653,78 @@ (define_insn "*_movmsk (set_attr "prefix" "maybe_vex")
> (set_attr "mode" "")])
>
> +(define_insn_and_split "*_movmsk_lt"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +   (unspec:SI
> + [(lt:VF_128_256
> +(match_operand: 1 "register_operand" "x")
> +(match_operand: 2 "const0_operand" "C"))]
> + UNSPEC_MOVMSK))]
> +  "TARGET_SSE"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +   (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
> +  "operands[1] = gen_lowpart (mode, operands[1]);"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "")])
> +
> +(define_insn_and_split "*_movmsk_zext_lt"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +   (zero_extend:DI
> + (unspec:SI
> +   [(lt:VF_128_256
> +  (match_operand: 1 "register_operand" "x")
> +  (match_operand: 2 "const0_operand" "C"))]
> +   UNSPEC_MOVMSK)))]
> +  "TARGET_64BIT && TARGET_SSE"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +   (zero_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
> +  "operands[1] = gen_lowpart (mode, operands[1]);"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "")])
> +
> +(define_insn_and_split "*_movmsk_shift"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +   (unspec:SI
> + [(subreg:VF_128_256
> +(ashiftrt:
> +  (match_operand: 1 "register_operand" "x")
> +  (match_operand:QI 2 "const_int_operand" "n")) 0)]
> + UNSPEC_MOVMSK))]
> +  "TARGET_SSE"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +   (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
> +  "operands[1] = gen_lowpart (mode, operands[1]);"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "")])
> +
> +(define_insn_and_split "*_movmsk_zext_shift"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +   (zero_extend:DI
> + (unspec:SI
> +   [(subreg:VF_128_256
> +  (ashiftrt:
> +(match_operand: 1 "register_operand" "x")
> +  (match_operand:QI 2 "const_int_operand" "n")) 0)]
> +   UNSPEC_MOVMSK)))]
> +  "TARGET_64BIT && TARGET_SSE"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +   (zero_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
> +  "operands[1] = gen_lowpart (mode, operands[1]);"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "")])
> +
>  (define_insn "_pmovmskb"
>[(set (match_operand:SI 0 "register_operand" "=r")
> (unspec:SI
> @@ -14680,6 +14752,49 @@ (define_insn "*_pmovmskb_zext
>[(set_attr "type" "ssemov")
> (set (attr "prefix_data16")
>   (if_then_else
> +   (match_test "TARGET_AVX")
> + (const_string "*")
> + (const_string "1")))
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "SI")])
> +
> +(define_insn_and_split "*_pmovmskb_lt"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +   (unspec:SI
> + [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
> +   (match_operand:VI1_AVX2 2 "const0_operand" "C"))]
> + UNSPEC_MOVMSK))]
> +  "TARGET_SSE2"
> +  "#"
> +  ""
> +  [(set (match_dup 0)
> +   (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
> +  ""
> +  [(set_attr "type" "ssemov")
> +   (set (attr "prefix_data16")
> + (if_then_else
> +

Re: [PATCH] Optimize integral lt + blend into just blend (PR target/54700)

2018-11-29 Thread Uros Bizjak
On Thu, Nov 29, 2018 at 10:40 PM Jakub Jelinek  wrote:
>
> On Thu, Nov 29, 2018 at 05:41:59PM +0100, Uros Bizjak wrote:
> > On Thu, Nov 29, 2018 at 5:28 PM Jakub Jelinek  wrote:
> > >
> > > On Thu, Nov 29, 2018 at 05:21:53PM +0100, Uros Bizjak wrote:
> > > > > > * g++.target/i386/avx2-check.h: New file.
> > > > > > * g++.target/i386/m128-check.h: New file.
> > > > > > * g++.target/i386/m256-check.h: New file.
> > > > > > * g++.target/i386/avx-os-support.h: New file.
> > > > >
> > > > > OK.
> > > >
> > > > On a second thought, should we rather use (pre-reload?)
> > > > define_insn_and_split to split the combination to the blend insn?
> > >
> > > I've already committed it.  But can work on a patch that does that 
> > > tomorrow.
> >
> > Thanks. You will probably need to split it after reload, since a
> > change from intvec->FPvec is needed.
>
> Like this?  Bootstrapped/regtested on x86_64-linux and i686-linux.
>
> 2018-11-29  Jakub Jelinek  
>
> PR target/54700
> * config/i386/sse.md
> (*_blendv_lt,
> *_blendv_ltint,
> *_pblendvb_lt): Change define_insn into
> define_insn_and_split.

OK.

Thanks,
Uros.

> --- gcc/config/i386/sse.md.jj   2018-11-29 15:32:27.597301378 +0100
> +++ gcc/config/i386/sse.md  2018-11-29 18:52:42.747904630 +0100
> @@ -15682,7 +15682,7 @@ (define_insn "sse4_1_blendv]
>(const_string "")))])
>
> -(define_insn "*_blendv_lt"
> +(define_insn_and_split "*_blendv_lt"
>[(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
> (unspec:VF_128_256
>   [(match_operand:VF_128_256 1 "register_operand" "0,0,x")
> @@ -15693,10 +15693,12 @@ (define_insn "*_blendv(match_operand: 4 "const0_operand" "C,C,C")) 0)]
>   UNSPEC_BLENDV))]
>"TARGET_SSE4_1"
> -  "@
> -   blendv\t{%3, %2, %0|%0, %2, %3}
> -   blendv\t{%3, %2, %0|%0, %2, %3}
> -   vblendv\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +   (unspec:VF_128_256
> +[(match_dup 1) (match_dup 2) (match_dup 3)] UNSPEC_BLENDV))]
> +  "operands[3] = gen_lowpart (mode, operands[3]);"
>[(set_attr "isa" "noavx,noavx,avx")
> (set_attr "type" "ssemov")
> (set_attr "length_immediate" "1")
> @@ -15712,7 +15714,7 @@ (define_mode_attr ssefltmodesuffix
>  (define_mode_attr ssefltvecmode
>[(V2DI "V2DF") (V4DI "V4DF") (V4SI "V4SF") (V8SI "V8SF")])
>
> -(define_insn "*_blendv_ltint"
> +(define_insn_and_split "*_blendv_ltint"
>[(set (match_operand: 0 "register_operand" "=Yr,*x,x")
> (unspec:
>   [(match_operand: 1 "register_operand" "0,0,x")
> @@ -15723,10 +15725,17 @@ (define_insn "*_blendv(match_operand:VI48_AVX 4 "const0_operand" "C,C,C")) 0)]
>   UNSPEC_BLENDV))]
>"TARGET_SSE4_1"
> -  "@
> -   blendv\t{%3, %2, %0|%0, %2, %3}
> -   blendv\t{%3, %2, %0|%0, %2, %3}
> -   vblendv\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 0)
> +   (unspec:
> +[(match_dup 1) (match_dup 2) (match_dup 3)] UNSPEC_BLENDV))]
> +{
> +  operands[0] = gen_lowpart (mode, operands[0]);
> +  operands[1] = gen_lowpart (mode, operands[1]);
> +  operands[2] = gen_lowpart (mode, operands[2]);
> +  operands[3] = gen_lowpart (mode, operands[3]);
> +}
>[(set_attr "isa" "noavx,noavx,avx")
> (set_attr "type" "ssemov")
> (set_attr "length_immediate" "1")
> @@ -15834,7 +15843,7 @@ (define_insn "_pblendvb"
> (set_attr "btver2_decode" "vector,vector,vector")
> (set_attr "mode" "")])
>
> -(define_insn "*_pblendvb_lt"
> +(define_insn_and_split "*_pblendvb_lt"
>[(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x")
> (unspec:VI1_AVX2
>   [(match_operand:VI1_AVX2 1 "register_operand"  "0,0,x")
> @@ -15843,10 +15852,12 @@ (define_insn "*_pblendvb_lt
> (match_operand:VI1_AVX2 4 "const0_operand" "C,C,C"))]
>   UNSPEC_BLENDV))]
>"TARGET_SSE4_1"
> -  "@
> -   pblendvb\t{%3, %2, %0|%0, %2, %3}
> -   pblendvb\t{%3, %2, %0|%0, %2, %3}
> -   vpblendvb\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  "#"
> +  ""
> +  [(set (match_dup 0)
> +   (unspec:VI1_AVX2
> +[(match_dup 1) (match_dup 2) (match_dup 3)] UNSPEC_BLENDV))]
> +  ""
>[(set_attr "isa" "noavx,noavx,avx")
> (set_attr "type" "ssemov")
> (set_attr "prefix_extra" "1")
>
>
> Jakub


[C++ PATCH] Fix xvalue COND_EXPR handling (PR c++/88103)

2018-11-29 Thread Jakub Jelinek
Hi!

On the following testcase, build_conditional_expr_1 tries hard to make sure
that if both arguments are xvalue_p (or one is and the other throw) the
result is still xvalue_p.  But, later on we call unary_complex_lvalue,
which does rationalize_conditional_expr which changes it from
cond ? x : y to *(cond ? &x : &y) and that change turns something formerly
xvalue_p into newly lvalue_p.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2018-11-29  Jakub Jelinek  

PR c++/88103
* typeck.c (unary_complex_lvalue): If a COND_EXPR is xvalue_p, make
sure the result is as well.

* g++.dg/cpp0x/rv-cond3.C: New test.

--- gcc/cp/typeck.c.jj  2018-11-27 09:48:58.506103668 +0100
+++ gcc/cp/typeck.c 2018-11-29 21:00:33.900636750 +0100
@@ -6503,7 +6503,16 @@ unary_complex_lvalue (enum tree_code cod
   /* Handle (a ? b : c) used as an "lvalue".  */
   if (TREE_CODE (arg) == COND_EXPR
   || TREE_CODE (arg) == MIN_EXPR || TREE_CODE (arg) == MAX_EXPR)
-return rationalize_conditional_expr (code, arg, tf_warning_or_error);
+{
+  tree ret = rationalize_conditional_expr (code, arg, tf_warning_or_error);
+  /* Preserve xvalue kind.  */
+  if (xvalue_p (arg))
+   {
+ tree reftype = cp_build_reference_type (TREE_TYPE (arg), true);
+ ret = cp_convert (reftype, ret, tf_warning_or_error);
+   }
+  return ret;
+}
 
   /* Handle (a = b), (++a), and (--a) used as an "lvalue".  */
   if (TREE_CODE (arg) == MODIFY_EXPR
--- gcc/testsuite/g++.dg/cpp0x/rv-cond3.C.jj	2018-11-29 21:04:48.228440774 +0100
+++ gcc/testsuite/g++.dg/cpp0x/rv-cond3.C	2018-11-29 21:06:22.315888491 +0100
@@ -0,0 +1,22 @@
+// PR c++/88103
+// { dg-do compile { target c++11 } }
+
+struct A {
+  A (int);
+  A&& foo () &&;
+  int i;
+};
+void free (A&&);
+
+void test_xvalue (A a){
+  A&& ref = true ? static_cast<A&&> (a) : static_cast<A&&> (a);
+  free (true ? static_cast<A&&> (a) : static_cast<A&&> (a));
+  (true ? static_cast<A&&> (a) : static_cast<A&&> (a)).foo ();
+  int&& k = (true ? static_cast<A&&> (a) : static_cast<A&&> (a)).i;
+}
+void test_prvalue (A a){
+  A&& ref = true ? static_cast<A&&> (a) : 1;
+  free (true ? static_cast<A&&> (a) : 1);
+  (true ? static_cast<A&&> (a) : 1).foo ();
+  int&& k = (true ? static_cast<A&&> (a) : 1).i;
+}

Jakub


[PATCH] Improve combiner's find_split_point (PR target/54589)

2018-11-29 Thread Jakub Jelinek
Hi!

The following patch attempts to improve find_split_point inside of
complex MEM addresses, if the target supports REG + REG + const
addressing, but doesn't support more_complex_rtx + REG + const,
try to split it at more_complex_rtx rather than more_complex_rtx + REG.

On the original testcase from the PR, the change is with x86_64 -m64 -O2:
movzbl  (%rdi), %eax
-   addq$1, %rax
salq$4, %rax
-   movl8(%rsp,%rax), %eax
+   movl24(%rsp,%rax), %eax
movl%eax, (%rsi)
ret
and -m32 -O2:
movl4116(%esp), %eax
movzbl  (%eax), %eax
-   addl$1, %eax
sall$4, %eax
-   movl4(%esp,%eax), %edx
+   movl20(%esp,%eax), %edx
movl4120(%esp), %eax
movl%edx, (%eax)
ret
(That testcase is not included, though, e.g. because trying to optimize a
single insn in code that passes a structure of more than 4KB as an argument
by value is pointless.)

Bootstrapped/regtested on x86_64-linux and i686-linux, bootstrapped
on powerpc64{,le}-linux (regtest still pending on those two), ok for trunk?

2018-11-29  Jakub Jelinek  

PR target/54589
* combine.c (find_split_point): For invalid memory address
nonobj + obj + const, if reg + obj + const is valid addressing
mode, split at nonobj.  Use if rather than else if for the
fallback.  Comment fixes.

* gcc.target/i386/pr54589.c: New test.

--- gcc/combine.c.jj2018-11-21 19:57:26.229726485 +0100
+++ gcc/combine.c   2018-11-29 17:57:48.069423874 +0100
@@ -4945,7 +4945,7 @@ find_split_point (rtx *loc, rtx_insn *in
}
 
   /* If we have a PLUS whose second operand is a constant and the
-address is not valid, perhaps will can split it up using
+address is not valid, perhaps we can split it up using
 the machine-specific way to split large constants.  We use
 the first pseudo-reg (one of the virtual regs) as a placeholder;
 it will not remain in the result.  */
@@ -4960,7 +4960,7 @@ find_split_point (rtx *loc, rtx_insn *in
 
  /* This should have produced two insns, each of which sets our
 placeholder.  If the source of the second is a valid address,
-we can make put both sources together and make a split point
+we can put both sources together and make a split point
 in the middle.  */
 
  if (seq
@@ -5001,14 +5001,51 @@ find_split_point (rtx *loc, rtx_insn *in
}
}
 
+ /* If that didn't work and we have a nested plus, like:
+((REG1 * CONST1) + REG2) + CONST2 and (REG1 + REG2) + CONST2
+is valid address, try to split (REG1 * CONST1).  */
+ if (GET_CODE (XEXP (XEXP (x, 0), 0)) == PLUS
+ && !OBJECT_P (XEXP (XEXP (XEXP (x, 0), 0), 0))
+ && OBJECT_P (XEXP (XEXP (XEXP (x, 0), 0), 1))
+ && ! (GET_CODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == SUBREG
+   && OBJECT_P (SUBREG_REG (XEXP (XEXP (XEXP (x, 0),
+0), 0)
+   {
+ rtx tem = XEXP (XEXP (XEXP (x, 0), 0), 0);
+ XEXP (XEXP (XEXP (x, 0), 0), 0) = reg;
+ if (memory_address_addr_space_p (GET_MODE (x), XEXP (x, 0),
+  MEM_ADDR_SPACE (x)))
+   {
+ XEXP (XEXP (XEXP (x, 0), 0), 0) = tem;
+ return &XEXP (XEXP (XEXP (x, 0), 0), 0);
+   }
+ XEXP (XEXP (XEXP (x, 0), 0), 0) = tem;
+   }
+ else if (GET_CODE (XEXP (XEXP (x, 0), 0)) == PLUS
+  && OBJECT_P (XEXP (XEXP (XEXP (x, 0), 0), 0))
+  && !OBJECT_P (XEXP (XEXP (XEXP (x, 0), 0), 1))
+  && ! (GET_CODE (XEXP (XEXP (XEXP (x, 0), 0), 1)) == SUBREG
+&& OBJECT_P (SUBREG_REG (XEXP (XEXP (XEXP (x, 0),
+ 0), 1)
+   {
+ rtx tem = XEXP (XEXP (XEXP (x, 0), 0), 1);
+ XEXP (XEXP (XEXP (x, 0), 0), 1) = reg;
+ if (memory_address_addr_space_p (GET_MODE (x), XEXP (x, 0),
+  MEM_ADDR_SPACE (x)))
+   {
+ XEXP (XEXP (XEXP (x, 0), 0), 1) = tem;
+ return &XEXP (XEXP (XEXP (x, 0), 0), 1);
+   }
+ XEXP (XEXP (XEXP (x, 0), 0), 1) = tem;
+   }
+
  /* If that didn't work, perhaps the first operand is complex and
 needs to be computed separately, so make a split point there.
 This will occur on machines that just support REG + CONST
 and have a constant moved through some previous computation.  */
-
- else if (!OBJECT_P (XEXP (XEXP (x, 0), 0))
-  && ! (GET_CODE (XEXP (XEXP (x, 0), 0)) == SUBREG
-&& OBJECT_P

Re: [patch,openacc] use existing local variable in cp_parser_oacc_enter_exit_data

2018-11-29 Thread Joseph Myers
On Thu, 29 Nov 2018, Julian Brown wrote:

> So, OK, or shall we just drop this? (Joseph?)

I've no comment on this patch; it seems to be in Thomas's area.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Optimize integer vector comparison followed by movmsk (PR target/88152, take 2)

2018-11-29 Thread Jakub Jelinek
On Thu, Nov 29, 2018 at 05:26:41PM +0100, Uros Bizjak wrote:
> On Thu, Nov 29, 2018 at 3:36 PM Jakub Jelinek  wrote:
> > Like blend, movmsk also only cares about the most significant bit,
> > so prior < 0 comparisons or (happens also on the testcase below in some
> > cases) arithmetic shift right (by any value) isn't needed before the movmsk.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Same comment as with your lt+blend -> blend patch. I think that
> pre-reload define_insn_and_split that splits the combination to movmsk
> would be better here. We already implement similar approach to remove
> useless maskings of shift operands (c.f. various "..._mask" insns in
> i386.md).

So like this?  Bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk?

2018-11-29  Jakub Jelinek  

PR target/88152
* config/i386/sse.md (*_movmsk_lt,
*_movmsk_zext_lt,
*_movmsk_shift,
*_movmsk_zext_shift,
*_pmovmskb_lt, *_pmovmskb_zext_lt): New
define_insn_and_split patterns.

* g++.target/i386/pr88152.C: New test.

--- gcc/config/i386/sse.md.jj   2018-11-29 18:52:42.747904630 +0100
+++ gcc/config/i386/sse.md  2018-11-29 19:21:44.371143252 +0100
@@ -14653,6 +14653,78 @@ (define_insn "*_movmsk")])
 
+(define_insn_and_split "*_movmsk_lt"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI
+ [(lt:VF_128_256
+(match_operand: 1 "register_operand" "x")
+(match_operand: 2 "const0_operand" "C"))]
+ UNSPEC_MOVMSK))]
+  "TARGET_SSE"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+   (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
+  "operands[1] = gen_lowpart (mode, operands[1]);"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "")])
+
+(define_insn_and_split "*_movmsk_zext_lt"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (unspec:SI
+   [(lt:VF_128_256
+  (match_operand: 1 "register_operand" "x")
+  (match_operand: 2 "const0_operand" "C"))]
+   UNSPEC_MOVMSK)))]
+  "TARGET_64BIT && TARGET_SSE"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+   (zero_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
+  "operands[1] = gen_lowpart (mode, operands[1]);"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "")])
+
+(define_insn_and_split "*_movmsk_shift"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI
+ [(subreg:VF_128_256
+(ashiftrt:
+  (match_operand: 1 "register_operand" "x")
+  (match_operand:QI 2 "const_int_operand" "n")) 0)]
+ UNSPEC_MOVMSK))]
+  "TARGET_SSE"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+   (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
+  "operands[1] = gen_lowpart (mode, operands[1]);"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "")])
+
+(define_insn_and_split "*_movmsk_zext_shift"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (unspec:SI
+   [(subreg:VF_128_256
+  (ashiftrt:
+(match_operand: 1 "register_operand" "x")
+  (match_operand:QI 2 "const_int_operand" "n")) 0)]
+   UNSPEC_MOVMSK)))]
+  "TARGET_64BIT && TARGET_SSE"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+   (zero_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
+  "operands[1] = gen_lowpart (mode, operands[1]);"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "")])
+
 (define_insn "_pmovmskb"
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
@@ -14680,6 +14752,49 @@ (define_insn "*_pmovmskb_zext
   [(set_attr "type" "ssemov")
(set (attr "prefix_data16")
  (if_then_else
+   (match_test "TARGET_AVX")
+ (const_string "*")
+ (const_string "1")))
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "SI")])
+
+(define_insn_and_split "*_pmovmskb_lt"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI
+ [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
+   (match_operand:VI1_AVX2 2 "const0_operand" "C"))]
+ UNSPEC_MOVMSK))]
+  "TARGET_SSE2"
+  "#"
+  ""
+  [(set (match_dup 0)
+   (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
+  ""
+  [(set_attr "type" "ssemov")
+   (set (attr "prefix_data16")
+ (if_then_else
+   (match_test "TARGET_AVX")
+ (const_string "*")
+ (const_string "1")))
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "SI")])
+
+(define_insn_and_split "*_pmovmskb_zext_lt"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (unspec:SI
+   [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
+ (match_op

Re: [PATCH] Optimize integral lt + blend into just blend (PR target/54700)

2018-11-29 Thread Jakub Jelinek
On Thu, Nov 29, 2018 at 05:41:59PM +0100, Uros Bizjak wrote:
> On Thu, Nov 29, 2018 at 5:28 PM Jakub Jelinek  wrote:
> >
> > On Thu, Nov 29, 2018 at 05:21:53PM +0100, Uros Bizjak wrote:
> > > > > * g++.target/i386/avx2-check.h: New file.
> > > > > * g++.target/i386/m128-check.h: New file.
> > > > > * g++.target/i386/m256-check.h: New file.
> > > > > * g++.target/i386/avx-os-support.h: New file.
> > > >
> > > > OK.
> > >
> > > On a second thought, should we rather use (pre-reload?)
> > > define_insn_and_split to split the combination to the blend insn?
> >
> > I've already committed it.  But can work on a patch that does that tomorrow.
> 
> Thanks. You will probably need to split it after reload, since a
> change from intvec->FPvec is needed.

Like this?  Bootstrapped/regtested on x86_64-linux and i686-linux.

2018-11-29  Jakub Jelinek  

PR target/54700
* config/i386/sse.md
(*_blendv_lt,
*_blendv_ltint,
*_pblendvb_lt): Change define_insn into
define_insn_and_split.

--- gcc/config/i386/sse.md.jj   2018-11-29 15:32:27.597301378 +0100
+++ gcc/config/i386/sse.md  2018-11-29 18:52:42.747904630 +0100
@@ -15682,7 +15682,7 @@ (define_insn "sse4_1_blendv")))])
 
-(define_insn "*_blendv_lt"
+(define_insn_and_split "*_blendv_lt"
   [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
(unspec:VF_128_256
  [(match_operand:VF_128_256 1 "register_operand" "0,0,x")
@@ -15693,10 +15693,12 @@ (define_insn "*_blendv 4 "const0_operand" "C,C,C")) 0)]
  UNSPEC_BLENDV))]
   "TARGET_SSE4_1"
-  "@
-   blendv\t{%3, %2, %0|%0, %2, %3}
-   blendv\t{%3, %2, %0|%0, %2, %3}
-   vblendv\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+   (unspec:VF_128_256
+[(match_dup 1) (match_dup 2) (match_dup 3)] UNSPEC_BLENDV))]
+  "operands[3] = gen_lowpart (mode, operands[3]);"
   [(set_attr "isa" "noavx,noavx,avx")
(set_attr "type" "ssemov")
(set_attr "length_immediate" "1")
@@ -15712,7 +15714,7 @@ (define_mode_attr ssefltmodesuffix
 (define_mode_attr ssefltvecmode
   [(V2DI "V2DF") (V4DI "V4DF") (V4SI "V4SF") (V8SI "V8SF")])
 
-(define_insn "*_blendv_ltint"
+(define_insn_and_split "*_blendv_ltint"
   [(set (match_operand: 0 "register_operand" "=Yr,*x,x")
(unspec:
  [(match_operand: 1 "register_operand" "0,0,x")
@@ -15723,10 +15725,17 @@ (define_insn "*_blendv\t{%3, %2, %0|%0, %2, %3}
-   blendv\t{%3, %2, %0|%0, %2, %3}
-   vblendv\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+   (unspec:
+[(match_dup 1) (match_dup 2) (match_dup 3)] UNSPEC_BLENDV))]
+{
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[2] = gen_lowpart (mode, operands[2]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+}
   [(set_attr "isa" "noavx,noavx,avx")
(set_attr "type" "ssemov")
(set_attr "length_immediate" "1")
@@ -15834,7 +15843,7 @@ (define_insn "_pblendvb"
(set_attr "btver2_decode" "vector,vector,vector")
(set_attr "mode" "")])
 
-(define_insn "*_pblendvb_lt"
+(define_insn_and_split "*_pblendvb_lt"
   [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x")
(unspec:VI1_AVX2
  [(match_operand:VI1_AVX2 1 "register_operand"  "0,0,x")
@@ -15843,10 +15852,12 @@ (define_insn "*_pblendvb_lt
(match_operand:VI1_AVX2 4 "const0_operand" "C,C,C"))]
  UNSPEC_BLENDV))]
   "TARGET_SSE4_1"
-  "@
-   pblendvb\t{%3, %2, %0|%0, %2, %3}
-   pblendvb\t{%3, %2, %0|%0, %2, %3}
-   vpblendvb\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  "#"
+  ""
+  [(set (match_dup 0)
+   (unspec:VI1_AVX2
+[(match_dup 1) (match_dup 2) (match_dup 3)] UNSPEC_BLENDV))]
+  ""
   [(set_attr "isa" "noavx,noavx,avx")
(set_attr "type" "ssemov")
(set_attr "prefix_extra" "1")


Jakub


Re: [patch,openacc] use existing local variable in cp_parser_oacc_enter_exit_data

2018-11-29 Thread Julian Brown
On Wed, 26 Sep 2018 11:21:33 -0700
Cesar Philippidis  wrote:

> This is an old gomp4 patch that updates the location of the clause for
> acc enter/exit data. Apparently, it didn't impact any test cases. Is
> this OK for trunk or should we drop it from OG8?
> 
> I bootstrapped and regtested it for x86_64 Linux with nvptx
> offloading.

At least at a glance, there is no actual change to behaviour in
this patch; it is just an extremely minor cleanup. I.e. in:

  location_t loc = pragma_tok->location;
  [...]
  SET_EXPR_LOCATION (stmt, pragma_tok->location);

the variable "loc" is used in the SET_EXPR_LOCATION instead. It doesn't
look like anything could mutate either variable in the interim.

So, OK, or shall we just drop this? (Joseph?)

Thanks,

Julian


Re: [patch,openacc] C, C++ OpenACC wait diagnostic change

2018-11-29 Thread Joseph Myers
On Thu, 29 Nov 2018, Julian Brown wrote:

> > But for C, it does not appear that c_parser_expr_list has a code path
> > that can return a zero-length list at all. So, we can elide the
> > diagnostic with no change to compiler behaviour. This patch does that,
> > and also changes the C++ diagnostic, leading to errors being reported
> > like:

The c-parser.c change is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch,openacc] Fix infinite recursion in OMP clause pretty-printing, default label

2018-11-29 Thread Joseph Myers
On Thu, 29 Nov 2018, Julian Brown wrote:

> On Thu, 20 Sep 2018 10:08:51 -0700
> Cesar Philippidis  wrote:
> 
> > Apparently, Tom ran into an ICE when we were adding support for new
> > clauses back in the gomp-4_0-branch days.  This patch shouldn't be
> > necessary because all of the clauses are fully implemented now, but
> > it may prevent similar bugs from occurring in the future at least
> > during development.
> > 
> > Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
> > Linux with nvptx offloading.
> 
> Joseph, could you take a look at this please?

Lots of other places in the same function use gcc_unreachable ().  I think 
using gcc_unreachable () here as well would be more appropriate than 
special-casing this one place in this function to use "unknown".

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch,openacc] C, C++ OpenACC wait diagnostic change

2018-11-29 Thread Julian Brown
On Fri, 28 Sep 2018 14:17:42 +0100
Julian Brown  wrote:

> On Wed, 26 Sep 2018 14:08:37 -0700
> Cesar Philippidis  wrote:
> 
> > On 09/26/2018 12:50 PM, Joseph Myers wrote:  
> > > On Wed, 26 Sep 2018, Cesar Philippidis wrote:
> > > 
> > >> Attached is an old patch which updated the C and C++ FEs to use
> > >> %<)%> for the right ')' symbol. It's mostly a cosmetic change.
> > >> All of the changes are self-contained to the OpenACC code
> > >> path.
> > > 
> > > Why is the "before ')'" included in the call to c_parser_error at
> > > all? c_parser_error calls c_parse_error which adds its own "
> > > before " and token description or expansion, so I'd expect the
> > > current error to result in a message ending in something of the
> > > form "before X before Y".
> 
> > Julian, I need to start working on deep copy in OpenACC. Can you
> > take over this patch? The error handling code in the C FE needs to
> > be removed because it's dead.  
> 
> I agree that the error-handling path in question in the C FE is dead.
> The difference is that in C, c_parser_oacc_wait_list parses the open
> parenthesis, the list and then the close parenthesis separately, and
> so a token sequence like:
> 
>(1
> 
> will return an expression list of length 1. In the C++ FE rather, a
> cp_parser_parenthesized_expression_list is parsed all in one go, and
> if the input is not that well-formed sequence then NULL is returned
> (or a zero-length vector for an empty list).
> 
> But for C, it does not appear that c_parser_expr_list has a code path
> that can return a zero-length list at all. So, we can elide the
> diagnostic with no change to compiler behaviour. This patch does that,
> and also changes the C++ diagnostic, leading to errors being reported
> like:
> 
> diag.c: In function 'int main(int, char*)':
> diag.c:6:59: error: expected ')' before end of line
> 6 | #pragma acc parallel copyin (a[0:N]) copy (b[0:N]) wait (1
>   | ~ ^
>   |   )
> diag.c:6:59: error: expected integer expression list before end of
> line 
> 
> Actually I'm not too sure how useful the second error line is. Maybe
> we should just remove it to improve consistency between C & C++?
> 
> The attached has been tested with offloading to nvptx and
> bootstrapped. OK?

Ping?

Thanks,

Julian


Re: [patch,openacc] Fix infinite recursion in OMP clause pretty-printing, default label

2018-11-29 Thread Julian Brown
On Thu, 20 Sep 2018 10:08:51 -0700
Cesar Philippidis  wrote:

> Apparently, Tom ran into an ICE when we were adding support for new
> clauses back in the gomp-4_0-branch days.  This patch shouldn't be
> necessary because all of the clauses are fully implemented now, but
> it may prevent similar bugs from occurring in the future at least
> during development.
> 
> Is this patch OK for trunk? I bootstrapped and regtested it for x86_64
> Linux with nvptx offloading.

Joseph, could you take a look at this please?

Thanks,

Julian


Re: [PATCH 1/2] asm qualifiers (PR55681)

2018-11-29 Thread Joseph Myers
I'd expect testcases to be added for the new syntax variants (duplicate 
qualifiers / goto and new orderings thereof).

There's a description of the syntax in extend.texi:

@example
asm @r{[}volatile@r{]} ( @var{AssemblerTemplate} 
 : @var{OutputOperands} 
 @r{[} : @var{InputOperands}
 @r{[} : @var{Clobbers} @r{]} @r{]})

asm @r{[}volatile@r{]} goto ( @var{AssemblerTemplate} 
  : 
  : @var{InputOperands}
  : @var{Clobbers}
  : @var{GotoLabels})
@end example

I'd expect this to be updated by this patch, and again by the "asm inline" 
one.

What's the basis for allowing duplicates for C but not for C++?

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] v2: C++: show namespaces for enum values (PR c++/88121)

2018-11-29 Thread Jason Merrill

On 11/29/18 2:29 PM, David Malcolm wrote:

On Wed, 2018-11-21 at 13:41 -0500, Jason Merrill wrote:

On 11/21/18 8:35 AM, David Malcolm wrote:

Consider this test case:

namespace json
{
enum { JSON_OBJECT };
}

void test ()
{
JSON_OBJECT;
}

which erroneously accesses an enum value in another namespace
without
qualifying the access.

GCC 6 through 8 issue a suggestion that doesn't mention the
namespace:

: In function 'void test()':
:8:3: error: 'JSON_OBJECT' was not declared in this scope
 JSON_OBJECT;
 ^~~
:8:3: note: suggested alternative:
:3:10: note:   'JSON_OBJECT'
 enum { JSON_OBJECT };
^~~

which is suboptimal.

I made the problem worse with r265610, as gcc 9 now consolidates
the single suggestion into the error, and emits:

: In function 'void test()':
:8:3: error: 'JSON_OBJECT' was not declared in this scope; did
 you mean 'JSON_OBJECT'?
  8 |   JSON_OBJECT;
|   ^~~
|   JSON_OBJECT
:3:10: note: 'JSON_OBJECT' declared here
  3 |   enum { JSON_OBJECT };
|  ^~~

where the message:
'JSON_OBJECT' was not declared in this scope; did you mean
'JSON_OBJECT'?
is nonsensical.

The root cause is that dump_scope doesn't print anything when
called for
CONST_DECL in a namespace: the scope is an ENUMERAL_TYPE, rather
than
a namespace.


Although that's only true for unscoped enums.


This patch tweaks dump_scope to detect ENUMERAL_TYPE, and to use
the
enclosing namespace, so that the CONST_DECL is dumped as
"json::JSON_OBJECT".
@@ -182,6 +182,12 @@ dump_scope (cxx_pretty_printer *pp, tree
scope, int flags)
 if (scope == NULL_TREE)
   return;

+  /* Enum values will be CONST_DECL with an ENUMERAL_TYPE as their
+ "scope".  Use CP_TYPE_CONTEXT of the ENUMERAL_TYPE, so as to
+ print the enclosing namespace.  */
+  if (TREE_CODE (scope) == ENUMERAL_TYPE)
+scope = CP_TYPE_CONTEXT (scope);


This needs to handle scoped enums differently.


I attempted to trigger this code path (printing a *value* within a
scoped enum via %qE, I believe), but wasn't able to), so I extended
the, ahem, scope of the patch a little, so that when scanning
namespaces for exact matches for a name, we also scan inside scoped
enums, to cover the case where someone doesn't supply the scope.

Hence with the updated patch given e.g.:

enum class vegetable { CARROT, TURNIP };

we're able to offer e.g.:

suggestions-scoped-enums.C:50:3: error: 'CARROT' was not declared in
   this scope; did you mean 'vegetable::CARROT'?
50 |   CARROT;
   |   ^~
   |   vegetable::CARROT

and this exercises the code path above.  The patch updates dump_scope
for scoped enums so that we print the scope when printing the
value ("vegetable::CARROT"), rather than just the name of the value
("CARROT").


diff --git a/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
new file mode 100644
index 000..2bf3ed6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-fdiagnostics-show-caret" }
+
+enum class vegetable { CARROT, TURNIP };
+
+void misspelled_value_in_scoped_enum ()
+{
+  vegetable::TURNUP; // { dg-error "'TURNUP' is not a member of 'vegetable'" }
+  /* { dg-begin-multiline-output "" }
+   vegetable::TURNUP;
+  ^~
+ { dg-end-multiline-output "" } */
+}


I don't see any suggestion in the expected output, and would hope for
it
to suggest vegetable::TURNIP.


The updated patch also adds spell-corrections within a scoped enum,
giving:

suggestions-scoped-enums.C:18:14: error: 'TURNUP' is not a member of
   'vegetable'; did you mean 'TURNIP'?
18 |   vegetable::TURNUP;
   |  ^~
   |  TURNIP

As before, the patch fixes the bogus suggestion in PR c++/88121.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?


OK, thanks.

Jason



Re: C++ PATCH for c++/88184, ICE when treating name as template-name

2018-11-29 Thread Jason Merrill

On 11/29/18 1:28 PM, Marek Polacek wrote:

On Wed, Nov 28, 2018 at 10:34:11PM -0500, Jason Merrill wrote:

On 11/28/18 10:48 AM, Marek Polacek wrote:

Since P0846R0 was implemented, a name will be treated as a template-name when
it is an unqualified-id followed by a < and name lookup finds either one or
more functions or finds nothing, in order to potentially cause ADL to be 
performed.

In this case, we had

f ();

where we'd found a decl for f (not a TEMPLATE_DECL) and < follows.


 From the backtrace in the PR, it seems as though we're treating f as
non-dependent, which is wrong.  type_dependent_expression_p only looks at
the arguments of a TEMPLATE_ID_EXPR if it has unknown_type_node, so we
probably want to give it that type.


That was my first attempt but it was crashing everywhere, so I abandoned it.
But I was setting unknown_type_node in cp_parser_template_name whereas this
patch sets it in cp_parser_postfix_expression, which works and is arguably
better because it reuses the diagnostic in finish_call_expr.


Would it work for lookup_template_function to always use unknown_type_node?

Jason



Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54589

2018-11-29 Thread Segher Boessenkool
On Thu, Nov 29, 2018 at 05:20:03PM +0100, Jakub Jelinek wrote:
> On Thu, Nov 29, 2018 at 05:10:33PM +0100, Uros Bizjak wrote:
> > Maybe a combine splitter can be used here? Please see documentation
> > from paragraph 17.16 onward:
> > 
> > --quote--
> >  The insn combiner phase also splits putative insns.  If three insns are
> > merged into one insn with a complex expression that cannot be matched by
> > some 'define_insn' pattern, the combiner phase attempts to split the
> > complex pattern into two insns that are recognized.  Usually it can
> > break the complex pattern into two patterns by splitting out some
> > subexpression.  However, in some other cases, such as performing an
> > addition of a large constant in two insns on a RISC machine, the way to
> > split the addition into two insns is machine-dependent.
> 
> Maybe, but not sure how the define_split would look like.
> We essentially want to match any RTL whatsoever that has some MEM in it
> and if the MEM address is of certain kind, move some part of it.
> 
> combine.c has also find_split_point function which does the right thing
> in the foo case, just doesn't for the bar case.

Yes, that is to find where to split an insn into two (with a new intermediate
register).  It uses heuristics and finds at most *one* place to try.

Maybe the heuristics can be improved here?

Something like change_zero_ext can be used to try a different form for the
same instruction, but trying something like that while producing more than
one insn is probably hard to do (and not very useful).


Segher


Re: [PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-29 Thread Martin Sebor

On 11/16/2018 02:07 AM, Richard Biener wrote:

On Fri, Nov 16, 2018 at 4:12 AM Martin Sebor  wrote:


Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

Please let me know if there is something I need to change here
to make the fix acceptable or if I should stop trying.


I have one more comment about

+  /* Defer warning (and folding) until the next statement in the basic
+ block is reachable.  */
+  if (!gimple_bb (stmt))
+return false;
+

it's not about the next statement in the basic-block being "reachable"
(even w/o a CFG you can use gsi_next()) but rather that the next
stmt isn't yet gimplified and thus not inserted into the gimple sequence,
right?


No, it's about the current statement not being associated with
a basic block yet when the warning code runs for the first time
(during gimplify_expr), and so gsi_next() returning null.


You apply this to gimple_fold_builtin_strncpy but I'd rather
see us not sprinkling this over gimple-fold.c but instead do this
in gimplify.c:maybe_fold_stmt, delaying folding until say lowering.

See the attached (untested).


I would also prefer this solution.  I had tested it (in response
to you first mentioning it back in September) and it causes quite
a bit of fallout in tests that look for the folding to take place
very early.  See the end of my reply here:

  https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01248.html

But I'm willing to do the test suite cleanup if you think it's
suitable for GCC 9.  (If you're thinking GCC 10 please let me
know now.)

Thanks
Martin



Richard.




On 10/31/2018 10:33 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

On 10/20/2018 06:01 PM, Martin Sebor wrote:

On 10/16/2018 03:21 PM, Jeff Law wrote:

On 10/4/18 9:51 AM, Martin Sebor wrote:

On 10/04/2018 08:58 AM, Jeff Law wrote:

On 8/27/18 9:42 AM, Richard Biener wrote:

On Mon, Aug 27, 2018 at 5:32 PM Jeff Law  wrote:


On 08/27/2018 02:29 AM, Richard Biener wrote:

On Sun, Aug 26, 2018 at 7:26 AM Jeff Law  wrote:


On 08/24/2018 09:58 AM, Martin Sebor wrote:

The warning suppression for -Wstringop-truncation looks for
the next statement after a truncating strncpy to see if it
adds a terminating nul.  This only works when the next
statement can be reached using the Gimple statement iterator
which isn't until after gimplification.  As a result, strncpy
calls that truncate their constant argument that are being
folded to memcpy this early get diagnosed even if they are
followed by the nul assignment:

  const char s[] = "12345";
  char d[3];

  void f (void)
  {
strncpy (d, s, sizeof d - 1);   // -Wstringop-truncation
d[sizeof d - 1] = 0;
  }

To avoid the warning I propose to defer folding strncpy to
memcpy until the pointer to the basic block the strnpy call
is in can be used to try to reach the next statement (this
happens as early as ccp1).  I'm aware of the preference to
fold things early but in the case of strncpy (a relatively
rarely used function that is often misused), getting
the warning right while folding a bit later but still fairly
early on seems like a reasonable compromise.  I fear that
otherwise, the false positives will drive users to adopt
other unsafe solutions (like memcpy) where these kinds of
bugs cannot be as readily detected.

Tested on x86_64-linux.

Martin

PS There still are outstanding cases where the warning can
be avoided.  I xfailed them in the test for now but will
still try to get them to work for GCC 9.

gcc-87028.diff


PR tree-optimization/87028 - false positive -Wstringop-truncation
strncpy with global variable source string
gcc/ChangeLog:

  PR tree-optimization/87028
  * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid
folding when
  statement doesn't belong to a basic block.
  * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Handle
MEM_REF on
  the left hand side of assignment.

gcc/testsuite/ChangeLog:

  PR tree-optimization/87028
  * c-c++-common/Wstringop-truncation.c: Remove xfails.
  * gcc.dg/Wstringop-truncation-5.c: New test.

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 07341eb..284c2fb 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy
(gimple_stmt_iterator *gsi,
   if (tree_int_cst_lt (ssize, len))
 return false;

+  /* Defer warning (and folding) until the next statement in the
basic
+ block is reachable.  */
+  if (!gimple_bb (stmt))
+return false;

I think you want cfun->cfg as the test here.  They should be
equivalent
in practice.


Please do not add 'cfun' references.  Note that the next stmt is
also accessible
when there is no CFG.  I guess the issue is that we fold this
during
gimplification where the next stmt is not yet "there" (but still in
GENERIC)?

That was my assumption.  I almost suggested peeking at gsi_next and
avoiding in that case.


So I'd rather add guards to maybe_fold_stmt in the gimplifier then.

So I think the concern with addin

Re: [C++ Patch] Improve compute_array_index_type locations

2018-11-29 Thread Jason Merrill

On 11/29/18 2:09 PM, Paolo Carlini wrote:

Hi,

On 29/11/18 17:13, Jason Merrill wrote:

+ error_at (cp_expr_loc_or_loc (osize, loc),


+    error_at (cp_expr_loc_or_loc (osize, input_location),


Let's compute the location we want to use once at the top of the 
function.  And maybe rename the 'loc' parameter to name_loc.


Makes sense. Thus I tested the attached.


OK.

Jason



Re: [RS6000] num_insns_constant ICE

2018-11-29 Thread Segher Boessenkool
Hi!

On Sun, Nov 25, 2018 at 10:49:12PM +1030, Alan Modra wrote:
> This patch came about from investigating an ICE that appeared when I
> was retesting an old half-baked patch of mine to rs6000_rtx_costs.
> If a const_double is fed to rs6000_is_valid_and_mask and from there to
> rs6000_is_valid_mask where INTVAL is used, gcc will ICE.
> 
> The num_insns_constant ICE was introduced with git commit f337168d97.
> However, the code was buggy before that.  There was no point in
> testing for a mask since the mask predicates only handle const_int.
> In fact, I don't think the function ever handled floating point
> constants that might match a load of minus one and mask.  It does now.
> I've added a few comments regarding splitters so the next person
> looking at this code can see how this works.
> 
> The patch also extracts code out of num_insns_constant that needed to
> handle multiple gprs for DFmode constants in 32-bit mode, to a
> function that handles multiple gprs a little more generally.  I don't
> think there is any need for anything but the 32-bit DFmode case
> currently, but this allows for possible future uses.  The
> CONST_WIDE_INT case is also not used currently, and needed fixing.
> Adding CONST_WIDE_INT_NUNITS - 1 only makes sense if the elements of
> the array were being shifted into a register of size larger than the
> element size (which is 64-bits).
> 
> Bootstrapped etc. powerpc64le-linux and powerpc64-linux.
> 
>   * config/rs6000/rs6000.c (num_insns_constant_gpr): Renamed from
>   num_insns_constant_wide.  Make static.  Revise comment.
>   (num_insns_constant_multi): New function.
>   (num_insns_constant): Formatting.  Correct CONST_WIDE_INT
>   calculation.  Simplify and extract code common to both
>   CONST_INT and CONST_DOUBLE.  Add gcc_unreachable for unhandled
>   const_double modes.
>   * config/rs6000/rs6000-protos.h (num_insns_const_wide): Delete.


> +/* Helper for num_insn_constant.  Allow constants formed by the
> +   num_insns_constant_gpr sequences, plus li -1, rldicl/rldicr/rlwinm,
> +   and handle modes that require multiple gprs.  */
> +
> +static int
> +num_insns_constant_multi (HOST_WIDE_INT value, machine_mode mode)
> +{
> +  int nregs = (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
> +  int total = 0;
> +  while (nregs-- > 0)
> +{
> +  int ins = num_insns_constant_gpr (value);

You probably should mask "value" here so that it is only one GPR.
Alternatively, make num_insns_constant_gpr handle that.  Consider what
happens for a 64-bit number with 32-bit registers here?

Also, s/ins/insns/ please.

> +  if (ins > 2
> +   /* We won't get more than 2 from num_insns_constant_gpr
> +  except when TARGET_POWERPC64 and mode is DImode or
> +  wider, so the register mode must be DImode.  */
> +   && rs6000_is_valid_and_mask (GEN_INT (value), DImode))
> + ins = 2;
> +  total += ins;

> +  value >>= 32;
> +  if (TARGET_POWERPC64)
> + value >>= 32;

That's just

  value >>= BITS_PER_WORD;
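The two review points above (mask the value down to one GPR before costing
it, and shift by the word size) can be sketched as follows.  This is an
illustration with stand-in names, not the real rs6000.c code: insns_for_gpr
is a toy cost model in place of num_insns_constant_gpr, and only the 32-bit
register case that the review flags is shown.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for num_insns_constant_gpr: pretend any 16-bit value costs
// one insn and anything wider costs two (the real function is more involved).
static int insns_for_gpr(int64_t v)
{
  return (v >= -0x8000 && v < 0x8000) ? 1 : 2;
}

// Split a constant across 32-bit registers and sum the per-register cost.
// The cast to int32_t is the masking step the review asks for: without it,
// the low chunk of a 64-bit constant would be costed at full 64-bit width.
static int insns_for_multi32(int64_t value, int mode_bytes)
{
  int nregs = (mode_bytes + 3) / 4;  // 32-bit registers
  int total = 0;
  while (nregs-- > 0)
    {
      total += insns_for_gpr((int32_t) value);  // mask to one GPR first
      value >>= 32;                             // i.e. value >>= BITS_PER_WORD
    }
  return total;
}
```

For example, 0x100000000 in a DImode-sized constant costs 1 insn per
register here (each 32-bit chunk is small) rather than being miscosted as
one wide value.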

> + long l[2];

> + val = l[WORDS_BIG_ENDIAN ? 0 : 1] << 32;

This doesn't work, as in the other patch: long can be 32 bit.
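A minimal sketch of the cast-based fix being suggested, with HOST_WIDE_INT
modeled as int64_t and illustrative names (this is not the patch itself):
each 32-bit target word stored in a host "long" is narrowed to uint32_t and
widened before shifting, so the code is correct even when long is 32 bits,
and any sign-extension garbage in the high host bits is masked away.

```cpp
#include <cassert>
#include <cstdint>

// Join two 32-bit target words (as REAL_VALUE_TO_TARGET_* style macros
// would produce them, one per "long") into a 64-bit value.
static int64_t join_words(const long l[2], bool big_endian)
{
  uint64_t hi = (uint64_t) (uint32_t) l[big_endian ? 0 : 1] << 32;
  uint64_t lo = (uint32_t) l[big_endian ? 1 : 0];
  return (int64_t) (hi | lo);
}
```

The uint32_t casts play the role of the `& 0xffffffffUL` mask in the patch,
and the widening to uint64_t avoids the undefined 32-bit shift of a long.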

Looks good otherwise.


Segher


Re: [PATCH, rs6000] Fix PR87496: ICE in aggregate_value_p at gcc/function.c:2046

2018-11-29 Thread Peter Bergner
On 11/29/18 11:26 AM, Segher Boessenkool wrote:
> On Wed, Nov 28, 2018 at 03:27:19PM -0600, Peter Bergner wrote:
>> PR87496 shows a bug where we ICE if we attempt to use -mabi=ieeelongdouble
>> and -mno-popcntd.  The IEEE128 support requires full ISA 2.06 (aka POWER7)
>> support, so we really should throw an error when using those options
>> together.  Ditto for -mabi=ieeelongdouble and -mno-vsx.  The patch below
>> does that.
>>
>> Ok for mainline once bootstrap and regtesting are complete and clean?
> 
> Okay.  Eventually we shouldn't allow selecting popcntd independently from
> -mcpu=, but that day isn't here yet.  So, okay for trunk, and backports
> if wanted.  Thanks!

Ok, committed to mainline.  It looks like GCC8 needs the same patch.
I'll have to look closer at GCC7 on whether it needs it too, since the
code seems to be a little different.

Thanks

Peter



[PATCH] Fix PR64242

2018-11-29 Thread Wilco Dijkstra
Fix PR64242 - the middle-end expansion for longjmp updates the
hard frame pointer before it reads the new stack pointer.  This
results in a corrupted stack pointer if the jmp buffer is a local.
The obvious fix is to insert a temporary.

AArch64 bootstrap & regress pass. OK to commit?

ChangeLog:
2018-11-29  Wilco Dijkstra  

gcc/
PR middle-end/64242
* builtins.c (expand_builtin_longjmp): Use a temporary when restoring
the frame pointer.
(expand_builtin_nonlocal_goto): Likewise.

testsuite/
PR middle-end/64242
* gcc.c-torture/execute/pr64242.c: New test.

---
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 
ebde2db6e6494cf7e4441f6834e65fcb81e1629c..5c80c60378fc4fbb3faf8e04672d7091ac071624
 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1142,8 +1142,11 @@ expand_builtin_longjmp (rtx buf_addr, rtx value)
  emit_clobber (gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode)));
  emit_clobber (gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx));
 
- emit_move_insn (hard_frame_pointer_rtx, fp);
+ /* Restore the frame pointer and stack pointer.  We must use a
+temporary since the setjmp buffer may be a local.  */
+ fp = copy_to_reg (fp);
  emit_stack_restore (SAVE_NONLOCAL, stack);
+ emit_move_insn (hard_frame_pointer_rtx, fp);
 
  emit_use (hard_frame_pointer_rtx);
  emit_use (stack_pointer_rtx);
@@ -1286,9 +1289,11 @@ expand_builtin_nonlocal_goto (tree exp)
   emit_clobber (gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode)));
   emit_clobber (gen_rtx_MEM (BLKmode, hard_frame_pointer_rtx));
 
-  /* Restore frame pointer for containing function.  */
-  emit_move_insn (hard_frame_pointer_rtx, r_fp);
+  /* Restore the frame pointer and stack pointer.  We must use a
+temporary since the setjmp buffer may be a local.  */
+  r_fp = copy_to_reg (r_fp);
   emit_stack_restore (SAVE_NONLOCAL, r_sp);
+  emit_move_insn (hard_frame_pointer_rtx, r_fp);
 
   /* USE of hard_frame_pointer_rtx added for consistency;
 not clear if really needed.  */
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr64242.c 
b/gcc/testsuite/gcc.c-torture/execute/pr64242.c
new file mode 100644
index 
..72dab5709203437b50514a70c523d104706e4bda
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr64242.c
@@ -0,0 +1,30 @@
+/* { dg-require-effective-target indirect_jumps } */
+
+extern void abort (void);
+
+__attribute ((noinline)) void
+broken_longjmp(void *p)
+{
+  void *buf[5];
+  __builtin_memcpy (buf, p, 5 * sizeof (void*));
+  /* Corrupts stack pointer...  */
+  __builtin_longjmp (buf, 1);
+}
+
+volatile int x = 0;
+volatile void *p;
+int
+main (void)
+{
+  void *buf[5];
+  p = __builtin_alloca (x);
+
+  if (!__builtin_setjmp (buf))
+broken_longjmp (buf);
+
+  /* Fails if stack pointer corrupted.  */
+  if (p != __builtin_alloca (x))
+abort();
+
+  return 0;
+}



Re: [C++ Patch] Improve compute_array_index_type locations

2018-11-29 Thread Paolo Carlini

Hi,

On 29/11/18 17:13, Jason Merrill wrote:

+ error_at (cp_expr_loc_or_loc (osize, loc),


+    error_at (cp_expr_loc_or_loc (osize, input_location),


Let's compute the location we want to use once at the top of the 
function.  And maybe rename the 'loc' parameter to name_loc.


Makes sense. Thus I tested the attached.

Thanks, Paolo.




Index: cp/decl.c
===
--- cp/decl.c   (revision 266634)
+++ cp/decl.c   (working copy)
@@ -9621,8 +9621,9 @@ fold_sizeof_expr (tree t)
an appropriate index type for the array.  If non-NULL, NAME is
the name of the entity being declared.  */
 
-tree
-compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
+static tree
+compute_array_index_type_loc (location_t name_loc, tree name, tree size,
+ tsubst_flags_t complain)
 {
   tree itype;
   tree osize = size;
@@ -9630,6 +9631,8 @@ fold_sizeof_expr (tree t)
   if (error_operand_p (size))
 return error_mark_node;
 
+  location_t loc = cp_expr_loc_or_loc (size, name ? name_loc : input_location);
+
   if (!type_dependent_expression_p (size))
 {
   osize = size = mark_rvalue_use (size);
@@ -9658,9 +9661,10 @@ fold_sizeof_expr (tree t)
  if (!(complain & tf_error))
return error_mark_node;
  if (name)
-   error ("size of array %qD has non-integral type %qT", name, type);
+   error_at (loc, "size of array %qD has non-integral type %qT",
+ name, type);
  else
-   error ("size of array has non-integral type %qT", type);
+   error_at (loc, "size of array has non-integral type %qT", type);
  size = integer_one_node;
}
 }
@@ -9689,8 +9693,14 @@ fold_sizeof_expr (tree t)
 {
   tree folded = cp_fully_fold (size);
   if (TREE_CODE (folded) == INTEGER_CST)
-   pedwarn (input_location, OPT_Wpedantic,
-"size of array is not an integral constant-expression");
+   {
+ if (name)
+   pedwarn (loc, OPT_Wpedantic, "size of array %qD is not an "
+"integral constant-expression", name);
+ else
+   pedwarn (loc, OPT_Wpedantic,
+"size of array is not an integral constant-expression");
+   }
   /* Use the folded result for VLAs, too; it will have resolved
 SIZEOF_EXPR.  */
   size = folded;
@@ -9706,9 +9716,9 @@ fold_sizeof_expr (tree t)
return error_mark_node;
 
  if (name)
-   error ("size of array %qD is negative", name);
+   error_at (loc, "size of array %qD is negative", name);
  else
-   error ("size of array is negative");
+   error_at (loc, "size of array is negative");
  size = integer_one_node;
}
   /* As an extension we allow zero-sized arrays.  */
@@ -9722,9 +9732,11 @@ fold_sizeof_expr (tree t)
  else if (in_system_header_at (input_location))
/* Allow them in system headers because glibc uses them.  */;
  else if (name)
-   pedwarn (input_location, OPT_Wpedantic, "ISO C++ forbids zero-size 
array %qD", name);
+   pedwarn (loc, OPT_Wpedantic,
+"ISO C++ forbids zero-size array %qD", name);
  else
-   pedwarn (input_location, OPT_Wpedantic, "ISO C++ forbids zero-size 
array");
+   pedwarn (loc, OPT_Wpedantic,
+"ISO C++ forbids zero-size array");
}
 }
   else if (TREE_CONSTANT (size)
@@ -9737,24 +9749,27 @@ fold_sizeof_expr (tree t)
return error_mark_node;
   /* `(int) &fn' is not a valid array bound.  */
   if (name)
-   error ("size of array %qD is not an integral constant-expression",
-  name);
+   error_at (loc,
+ "size of array %qD is not an integral constant-expression",
+ name);
   else
-   error ("size of array is not an integral constant-expression");
+   error_at (loc, "size of array is not an integral constant-expression");
   size = integer_one_node;
 }
   else if (pedantic && warn_vla != 0)
 {
   if (name)
-   pedwarn (input_location, OPT_Wvla, "ISO C++ forbids variable length 
array %qD", name);
+   pedwarn (name_loc, OPT_Wvla,
+"ISO C++ forbids variable length array %qD", name);
   else
-   pedwarn (input_location, OPT_Wvla, "ISO C++ forbids variable length 
array");
+   pedwarn (input_location, OPT_Wvla,
+"ISO C++ forbids variable length array");
 }
   else if (warn_vla > 0)
 {
   if (name)
-   warning (OPT_Wvla, 
- "variable length array %qD is used", name);
+   warning_at (name_loc, OPT_Wvla, 
+   "variable length array %qD is used", name);
   else
warning (OPT_Wvla, 
  "variable length array is used");
@@ -9821,6 +9836,1

[PATCH] v2: C++: show namespaces for enum values (PR c++/88121)

2018-11-29 Thread David Malcolm
On Wed, 2018-11-21 at 13:41 -0500, Jason Merrill wrote:
> On 11/21/18 8:35 AM, David Malcolm wrote:
> > Consider this test case:
> >
> > namespace json
> > {
> >enum { JSON_OBJECT };
> > }
> >
> > void test ()
> > {
> >JSON_OBJECT;
> > }
> >
> > which erroneously accesses an enum value in another namespace
> > without
> > qualifying the access.
> >
> > GCC 6 through 8 issue a suggestion that doesn't mention the
> > namespace:
> >
> > : In function 'void test()':
> > :8:3: error: 'JSON_OBJECT' was not declared in this scope
> > JSON_OBJECT;
> > ^~~
> > :8:3: note: suggested alternative:
> > :3:10: note:   'JSON_OBJECT'
> > enum { JSON_OBJECT };
> >^~~
> >
> > which is suboptimal.
> >
> > I made the problem worse with r265610, as gcc 9 now consolidates
> > the single suggestion into the error, and emits:
> >
> > : In function 'void test()':
> > :8:3: error: 'JSON_OBJECT' was not declared in this scope;
> > did
> > you mean 'JSON_OBJECT'?
> >  8 |   JSON_OBJECT;
> >|   ^~~
> >|   JSON_OBJECT
> > :3:10: note: 'JSON_OBJECT' declared here
> >  3 |   enum { JSON_OBJECT };
> >|  ^~~
> >
> > where the message:
> >'JSON_OBJECT' was not declared in this scope; did you mean
> > 'JSON_OBJECT'?
> > is nonsensical.
> >
> > The root cause is that dump_scope doesn't print anything when
> > called for
> > CONST_DECL in a namespace: the scope is an ENUMERAL_TYPE, rather
> > than
> > a namespace.
>
> Although that's only true for unscoped enums.
>
> > This patch tweaks dump_scope to detect ENUMERAL_TYPE, and to use
> > the
> > enclosing namespace, so that the CONST_DECL is dumped as
> > "json::JSON_OBJECT".
> > @@ -182,6 +182,12 @@ dump_scope (cxx_pretty_printer *pp, tree
> > scope, int flags)
> > if (scope == NULL_TREE)
> >   return;
> >
> > +  /* Enum values will be CONST_DECL with an ENUMERAL_TYPE as their
> > + "scope".  Use CP_TYPE_CONTEXT of the ENUMERAL_TYPE, so as to
> > + print the enclosing namespace.  */
> > +  if (TREE_CODE (scope) == ENUMERAL_TYPE)
> > +scope = CP_TYPE_CONTEXT (scope);
>
> This needs to handle scoped enums differently.

I attempted to trigger this code path (printing a *value* within a
scoped enum via %qE, I believe), but wasn't able to, so I extended
the, ahem, scope of the patch a little, so that when scanning
namespaces for exact matches for a name, we also scan inside scoped
enums, to cover the case where someone doesn't supply the scope.

Hence with the updated patch given e.g.:

enum class vegetable { CARROT, TURNIP };

we're able to offer e.g.:

suggestions-scoped-enums.C:50:3: error: 'CARROT' was not declared in
  this scope; did you mean 'vegetable::CARROT'?
   50 |   CARROT;
  |   ^~
  |   vegetable::CARROT

and this exercises the code path above.  The patch updates dump_scope
for scoped enums so that we print the scope when printing the
value ("vegetable::CARROT"), rather than just the name of the value
("CARROT").

> > diff --git a/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
> > b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
> > new file mode 100644
> > index 000..2bf3ed6
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
> > @@ -0,0 +1,13 @@
> > +// { dg-do compile { target c++11 } }
> > +// { dg-options "-fdiagnostics-show-caret" }
> > +
> > +enum class vegetable { CARROT, TURNIP };
> > +
> > +void misspelled_value_in_scoped_enum ()
> > +{
> > +  vegetable::TURNUP; // { dg-error "'TURNUP' is not a member of
> > 'vegetable'" }
> > +  /* { dg-begin-multiline-output "" }
> > +   vegetable::TURNUP;
> > +  ^~
> > + { dg-end-multiline-output "" } */
> > +}
>
> I don't see any suggestion in the expected output, and would hope for
> it
> to suggest vegetable::TURNIP.

The updated patch also adds spell-corrections within a scoped enum,
giving:

suggestions-scoped-enums.C:18:14: error: 'TURNUP' is not a member of
  'vegetable'; did you mean 'TURNIP'?
   18 |   vegetable::TURNUP;
  |  ^~
  |  TURNIP

As before, the patch fixes the bogus suggestion in PR c++/88121.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/88121
* cp-name-hint.h (suggest_alternative_in_scoped_enum): New decl.
* error.c (dump_scope): Ensure that we print any scope for values
of unscoped enums.  Print the scope of values of scoped enums.
(qualified_name_lookup_error): Offer suggestions for failures
within scoped enums by calling suggest_alternative_in_scoped_enum.
* name-lookup.c (class namespace_hints): Update comment to mention
scoped enums.
(namespace_hints::namespace_hints): Call
maybe_add_candidate_for_scoped_enum.
(namespace_hints::maybe_add_candidate_for_scoped_enum): New member
(suggest_alternatives_for): Update 

Re: [PATCH 4/4][libbacktrace] Add tests for unused formats

2018-11-29 Thread Ian Lance Taylor via gcc-patches
On Fri, Nov 23, 2018 at 12:56 PM, Tom de Vries  wrote:
>
> When building libbacktrace, we typically use elf.c, and don't build pecoff.c,
> xcoff.c or unknown.c
>
> Add testcases that use unused format to ensure that we also build and
> test those on a typical development setup.
>
> Bootstrapped and reg-tested on x86_64.
>
> OK for trunk?
>
> Thanks,
> - Tom
>
> [libbacktrace] Add tests for unused formats
>
> 2018-11-23  Tom de Vries  
>
> * configure.ac (NOT_HAVE_FORMAT_ELF, NOT_HAVE_FORMAT_PECOFF)
> (NOT_HAVE_FORMAT_UNKNOWN, NOT_HAVE_FORMAT_XCOFF_32)
> (NOT_HAVE_FORMAT_XCOFF_64): New AM_CONDITIONAL.
> * configure: Regenerate.
> * Makefile.am (check_PROGRAMS): Add test_elf, test_xcoff_32,
> test_xcoff_64, test_pecoff and test_unknown.
> * Makefile.in: Regenerate.
> * test_format.c: New file.


Again it seems feasible to avoid GNU make features, and skip the
negative conditionals and just build the tests for all formats
including the one on the current system.  It's not worth adding the
complexity to avoid building the test.  Thanks.

Ian


Re: C++ PATCH for c++/88184, ICE when treating name as template-name

2018-11-29 Thread Marek Polacek
On Wed, Nov 28, 2018 at 10:34:11PM -0500, Jason Merrill wrote:
> On 11/28/18 10:48 AM, Marek Polacek wrote:
> > Since P0846R0 was implemented, a name will be treated as a template-name 
> > when
> > it is an unqualified-id followed by a < and name lookup finds either one or
> > more functions or finds nothing, in order to potentially cause ADL to be 
> > performed.
> > 
> > In this case, we had
> > 
> >f ();
> > 
> > where we'd found a decl for f (not a TEMPLATE_DECL) and < follows.
> 
> From the backtrace in the PR, it seems as though we're treating f as
> non-dependent, which is wrong.  type_dependent_expression_p only looks at
> the arguments of a TEMPLATE_ID_EXPR if it has unknown_type_node, so we
> probably want to give it that type.

That was my first attempt but it was crashing everywhere, so I abandoned it.
But I was setting unknown_type_node in cp_parser_template_name whereas this
patch sets it in cp_parser_postfix_expression, which works and is arguably
better because it reuses the diagnostic in finish_call_expr.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-11-29  Marek Polacek  

PR c++/88184, ICE when treating name as template-name.
* parser.c (cp_parser_postfix_expression): Set the function decl type
to unknown type when the name-as-template-name doesn't have any
arguments.

* g++.dg/cpp2a/fn-template17.C: New test.
* g++.dg/cpp2a/fn-template18.C: New test.

diff --git gcc/cp/parser.c gcc/cp/parser.c
index 3ef1eda4518..fc328c28f69 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -7241,6 +7241,20 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
 complain);
  }
  }
+   else if (TREE_CODE (postfix_expression) == TEMPLATE_ID_EXPR
+&& (TREE_CODE (TREE_OPERAND (postfix_expression, 0))
+== FUNCTION_DECL)
+&& args->is_empty ())
+ {
+   /* We can get here when treating a name as a template name
+  (because it was an unqualified-id followed by a < and
+  name lookup found a decl), but now we see that there
+  are no arguments, so ADL isn't possible.  Tweak the type
+  so that the TEMPLATE_ID_EXPR is type-dependent and we get
+  proper diagnostic down in finish_call_expr.  */
+   TREE_TYPE (TREE_OPERAND (postfix_expression, 0))
+ = unknown_type_node;
+ }
  }
 
if (TREE_CODE (postfix_expression) == COMPONENT_REF)
diff --git gcc/testsuite/g++.dg/cpp2a/fn-template17.C 
gcc/testsuite/g++.dg/cpp2a/fn-template17.C
new file mode 100644
index 000..f0d24107682
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp2a/fn-template17.C
@@ -0,0 +1,21 @@
+// PR c++/88184
+// { dg-do compile }
+// { dg-options "-std=c++2a -fchecking=2" }
+
+namespace A
+{
+  void f ();
+}
+
+using A::f;
+
+template  void g ()
+{
+  f (); // { dg-error "no matching function for call" }
+}
+
+void
+fn ()
+{
+  g();
+}
diff --git gcc/testsuite/g++.dg/cpp2a/fn-template18.C 
gcc/testsuite/g++.dg/cpp2a/fn-template18.C
new file mode 100644
index 000..7fe5e89ace3
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp2a/fn-template18.C
@@ -0,0 +1,23 @@
+// PR c++/88184
+// { dg-do compile }
+// { dg-options "-std=c++2a -fchecking=2" }
+
+namespace A
+{
+  void f ();
+  void f (int);
+  void f (int, int);
+}
+
+using A::f;
+
+template  void g ()
+{
+  f (); // { dg-error "no matching function for call" }
+}
+
+void
+fn ()
+{
+  g();
+}


Re: [PATCH 1/4][libbacktrace] Test check_PROGRAMS without mmap

2018-11-29 Thread Ian Lance Taylor via gcc-patches
On Fri, Nov 23, 2018 at 12:47 PM, Tom de Vries  wrote:
> [ was: [PATCH 1/2][libbacktrace] Handle realloc returning NULL if size == 0 ]
>
> On Thu, Nov 22, 2018 at 01:35:43PM +0100, Tom de Vries wrote:
>> Hi,
>>
>> Build and tested on x86_64, with mmap.c replaced by alloc.c to ensure that
>> backtrace_vector_release in alloc.c is tested.
>
> Hi,
>
> I came up with a more structural solution.
>
> OK for trunk?
>
> Thanks,
> - Tom
>
> [libbacktrace] Test check_PROGRAMS without mmap
>
> When building libbacktrace, we typically use mmapio.c and mmap.c, and don't
> build read.c and alloc.c.
>
> Add testcases that use read.c and alloc.c to ensure that we also build and
> test those on a typical development setup.
>
> Bootstrapped and reg-tested on x86_64.
>
> 2018-11-23  Tom de Vries  
>
> * configure.ac (HAVE_MMAP): New AM_CONDITIONAL.
> * configure: Regenerate.
> * Makefile.am : Add _with_alloc version for each test in
> check_PROGRAMS.
> * Makefile.in: Regenerate.
>
> ---
>  libbacktrace/Makefile.am  |  45 
>  libbacktrace/Makefile.in  | 284 
> +++---
>  libbacktrace/configure|  18 ++-
>  libbacktrace/configure.ac |   1 +
>  4 files changed, 331 insertions(+), 17 deletions(-)
>
> diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
> index 3c1bd49dd7b..ee47daee415 100644
> --- a/libbacktrace/Makefile.am
> +++ b/libbacktrace/Makefile.am
> @@ -95,11 +95,25 @@ btest_CFLAGS = $(AM_CFLAGS) -g -O
>  btest_LDADD = libbacktrace.la
>
>  check_PROGRAMS += btest
> +if HAVE_MMAP
> +libbacktrace_with_alloc = $(libbacktrace_la_OBJECTS) \
> +   $(filter-out mmap.lo mmapio.lo,$(libbacktrace_la_LIBADD)) \
> +   alloc.lo read.lo


As far as I know libbacktrace does not currently rely on using GNU
make.  I'd rather not add that dependency for this purpose (I don't
mind adding this kind of testing but to me this seems only of mild
interest--I expect that all significant libbacktrace users have mmap).
You should be able to write something like

libbacktrace_alloc_la_SOURCES = $(libbacktrace_SOURCES)
libbacktrace_alloc_la_LIBADD = $(BACKTRACE_FILE) $(FORMAT_FILE) read.c alloc.c

Then I wouldn't bother with only running the tests with HAVE_MMAP,
just add unconditional tests for btest_alloc, etc.  It's OK to run
duplicate tests for the incredibly rare case of a host that doesn't
support mmap.

Ian


[PATCH, i386]: Fix two small MMX glitches

2018-11-29 Thread Uros Bizjak
2018-11-29  Uros Bizjak  

* config/i386/i386.c (inline_memory_move_cost):
Check "in" for 2 in MMX_CLASS_P case.
* config/i386/mmx.md (*mov_internal): Correct
TARGET_INTER_UNIT_MOVES_FROM_VEC and TARGET_INTER_UNIT_MOVES_TO_VEC
alternatives in preferred_for_speed attribute calculation.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 266593)
+++ config/i386/i386.c  (working copy)
@@ -39528,7 +39528,7 @@ inline_memory_move_cost (machine_mode mode, enum r
  default:
return 100;
}
-  if (in)
+  if (in == 2)
 return MAX (ix86_cost->mmx_load [index], ix86_cost->mmx_store [index]);
   return in ? ix86_cost->mmx_load [index] : ix86_cost->mmx_store [index];
 }
Index: config/i386/mmx.md
===
--- config/i386/mmx.md  (revision 266593)
+++ config/i386/mmx.md  (working copy)
@@ -208,9 +208,9 @@
   ]
   (const_string "DI")))
(set (attr "preferred_for_speed")
- (cond [(eq_attr "alternative" "10,15")
+ (cond [(eq_attr "alternative" "9,15")
  (symbol_ref "TARGET_INTER_UNIT_MOVES_FROM_VEC")
-   (eq_attr "alternative" "11,16")
+   (eq_attr "alternative" "10,16")
  (symbol_ref "TARGET_INTER_UNIT_MOVES_TO_VEC")
   ]
   (symbol_ref "true")))])


Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-29 Thread Pedro Alves
Hi Nick,

On 11/29/2018 03:01 PM, Nick Clifton wrote:
>  static struct demangle_component *
>  d_function_type (struct d_info *di)
>  {
> -  struct demangle_component *ret;
> +  static unsigned long recursion_level = 0;

Did you consider making this be a part of struct d_info instead?
IIRC, d_info is like a "this" pointer, passed around pretty
much everywhere.

I think going in the direction of making the demangler harder to use
in an efficient thread-safe manner is undesirable, even if the feature
is optional.  E.g., in GDB, loading big binaries, demangling is very high
in profiles, and so we've kicked around the desire to parallelize
it (e.g., by parallelizing the reading/interning of DSO files, instead of
reading all of them sequentially).  Having to synchronize access to the
demangler would be quite unfortunate.  If possible, it'd be great
to avoid making work toward that direction harder.  (Keeping in mind that
if this recursion detection feature is useful for binutils, then it should
also be useful for GDB.)
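A minimal sketch of the direction suggested here: carry the recursion
counter in the per-call state struct instead of a function-local static, so
concurrent demangles never share a counter.  The struct, the limit value,
and the depth parameter below are illustrative stand-ins, not libiberty's
actual d_info definition or its parsing logic.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for libiberty's d_info ("this" pointer passed everywhere).
struct d_info_sketch
{
  const char *next;               // parse position (illustrative)
  unsigned long recursion_level;  // was: a static inside d_function_type
};

static const unsigned long kRecursionLimit = 1024;  // assumed limit

// Fails cleanly once the nesting limit is exceeded; depth_to_fake stands
// in for however deeply the real grammar would recurse.
static bool d_function_type_sketch(d_info_sketch *di, unsigned depth_to_fake)
{
  if (++di->recursion_level > kRecursionLimit)
    {
      --di->recursion_level;
      return false;
    }
  bool ok = true;
  if (depth_to_fake > 0)
    ok = d_function_type_sketch(di, depth_to_fake - 1);
  --di->recursion_level;
  return ok;
}

// Convenience wrapper: each caller gets its own counter, so two threads
// demangling different symbols never touch the same state.
static bool demangle_ok(unsigned depth)
{
  d_info_sketch di{nullptr, 0};
  return d_function_type_sketch(&di, depth);
}
```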

Thanks,
Pedro Alves


Re: [PATCH 1/5][libbacktrace] Factor out backtrace_vector_free

2018-11-29 Thread Ian Lance Taylor via gcc-patches
On Thu, Nov 29, 2018 at 4:33 AM, Tom de Vries  wrote:
> On 29-11-18 00:26, Ian Lance Taylor wrote:
>> On Wed, Nov 28, 2018 at 3:15 PM, Tom de Vries  wrote:
>>>
>>> this patch factors out new function backtrace_vector_free.
>>>
>>> Bootstrapped and reg-tested on x86_64.
>>>
>>> OK for trunk?
>>
>> We should only add new files if we really absolutely must, as this
>> package is copied around to a lot of places (e.g.,
>> libsanitizer/libbacktrace) and adding files here requires
>> modifications in all those places.
>>
>
> I see, thanks for the explanation.
>
> How about his patch? It does not add a file, though it does add an
> external function which requires a rename in libsanitizer/libbacktrace
> (I don't know whether that requires changes in any other places).
>
> [ Also, it inlines backtrace-vector.c into alloc.c and mmap.c, so it
> duplicates code. If that is not acceptable, I could move it to
> internal.h as static inline or static ATTRIBUTE_UNUSED. ]

Yes, let's just use a static inline function or a macro.  Thanks.

Ian


Re: [RS6000] PowerPC64 soft-float

2018-11-29 Thread Segher Boessenkool
Hi!

On Sun, Nov 25, 2018 at 10:50:27PM +1030, Alan Modra wrote:
> This patch aims to prevent long sequences loading soft-float
> constants.  For 32-bit, it makes sense to load values inline to a gpr
> with lis, addi, but not so much for 64-bit where a 5 insn sequence
> might be needed for each gpr.  For TFmode in particular, a 10 insn
> sequence is reduced to 2 loads from memory plus 1 or 2 address setup
> insns.
> 
> Bootstrapped etc. powerpc64le-linux and powerpc64-linux.  OK for
> next stage1?

It's okay now, even.

>   * config/rs6000/predicates.md (easy_fp_constant): Avoid long
>   dependent insn sequences.
>   * config/rs6000/rs6000.c (num_insns_constant): Support long
>   double constants.
>   * config/rs6000/rs6000.md (mov_softfloat128) Adjust length
>   attribute.

> -  /* Consider all constants with -msoft-float to be easy.  */
> -  if (TARGET_SOFT_FLOAT)
> +  /* Consider all constants with -msoft-float to be easy when regs are
> + 32-bit and thus can be loaded with a maximum of 2 insns.  For
> + 64-bit avoid long dependent insn sequences.  */
> +  if (TARGET_SOFT_FLOAT
> +  && (!TARGET_POWERPC64
> +   || mode == SFmode || mode == SDmode
> +   || ((mode == DFmode || mode == DDmode)
> +   && (num_insns_constant (op, mode)
> +   <= (TARGET_CMODEL != CMODEL_SMALL ? 3 : 2)))
> +   || ((mode == TFmode || mode == TDmode
> +|| mode == KFmode || mode == IFmode)
> +   && (num_insns_constant (op, mode)
> +   <= (TARGET_CMODEL != CMODEL_SMALL ? 4 : 3)
>  return 1;

Maybe this can be written simpler with  MODE_SIZE (mode) >= 16  or such.

  if (TARGET_SOFT_FLOAT)
{
  if (!TARGET_POWERPC64)
return 1;

  int size = MODE_SIZE (mode);
  if (size == 4)
return 1;

  int max_insns = 3;
  if (TARGET_CMODEL == CMODEL_SMALL)
max_insns--;
  if (size >= 16)
max_insns++;
  if (num_insns_constant (op, mode) <= max_insns)
return 1;
}

Something like that, perhaps.  It's not really shorter :-/

> + else if (mode == TFmode || mode == TDmode
> +  || mode == KFmode || mode == IFmode)
> +   {
> + long l[4];
> + int ins;
> +
> + if (mode == TDmode)
> +   REAL_VALUE_TO_TARGET_DECIMAL128 (*rv, l);
> + else
> +   REAL_VALUE_TO_TARGET_LONG_DOUBLE (*rv, l);
> +
> + val = l[WORDS_BIG_ENDIAN ? 0 : 3] << 32;
> + val |= l[WORDS_BIG_ENDIAN ? 1 : 2] & 0xUL;

You can't shift a "long" left by 32 bits.  Cast to HOST_WIDE_INT?
Maybe there is some helper already?

> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -7729,7 +7729,7 @@ (define_insn_and_split "*mov_softfloat128"
>   (const_string "8")
>   (const_string "16"))
>   (if_then_else (match_test "TARGET_POWERPC64")
> - (const_string "40")
> + (const_string "16")
>   (const_string "32"))
>   (if_then_else (match_test "TARGET_POWERPC64")
>   (const_string "8")

I really would love to see this in stage3 still; it's arguably a bugfix.


Segher


Re: [PATCH, rs6000] Fix PR87496: ICE in aggregate_value_p at gcc/function.c:2046

2018-11-29 Thread Segher Boessenkool
On Wed, Nov 28, 2018 at 03:27:19PM -0600, Peter Bergner wrote:
> PR87496 shows a bug where we ICE if we attempt to use -mabi=ieeelongdouble
> and -mno-popcntd.  The IEEE128 support requires full ISA 2.06 (aka POWER7)
> support, so we really should throw an error when using those options
> together.  Ditto for -mabi=ieeelongdouble and -mno-vsx.  The patch below
> does that.
> 
> Ok for mainline once bootstrap and regtesting are complete and clean?

Okay.  Eventually we shouldn't allow selecting popcntd independently from
-mcpu=, but that day isn't here yet.  So, okay for trunk, and backports
if wanted.  Thanks!


Segher


>   PR target/87496
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Disallow
>   -mabi=ieeelongdouble without both -mpopcntd and -mvsx.
> 
> gcc/testsuite/
>   PR target/87496
>   * gcc.target/powerpc/pr87496.c: New test.


[PATCH][OBVIOUS] Fix thinko in transition to memop_ret type (PR middle-end/88246).

2018-11-29 Thread Martin Liška
Hi.

One obvious thinko I made when I converted the endp into a new enum.
The patch survives tests and bootstrap on x86_64-linux-gnu and
fixes the ICE in s390x bootstrap. I'm going to install it.

Martin

gcc/ChangeLog:

2018-11-29  Martin Liska  

PR middle-end/88246
* builtins.c (expand_movstr): Fix thinko introduced
when switching to the new enum.
---
 gcc/builtins.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/gcc/builtins.c b/gcc/builtins.c
index dcac49d8be1..537228cf3be 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3931,7 +3931,7 @@ expand_movstr (tree dest, tree src, rtx target, memop_ret retmode)
 
   dest_mem = get_memory_rtx (dest, NULL);
   src_mem = get_memory_rtx (src, NULL);
-  if (retmode != RETURN_BEGIN)
+  if (retmode == RETURN_BEGIN)
 {
   target = force_reg (Pmode, XEXP (dest_mem, 0));
   dest_mem = replace_equiv_address (dest_mem, target);



Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-29 Thread Scott Gayou
Thank you for looking into this, Nick. I've been staring at a few of these
CVEs off-and-on for a few days, and the following CVEs all look like
duplicates:

CVE-2018-17985: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87335
CVE-2018-18484: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87636
CVE-2018-18701: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87675
CVE-2018-18700: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87681

There may be more. I think Mitre is scanning the gnu bugzilla and assigning
CVEs? This does look like a legitimate very low criticality "denial of
service", but generating new CVEs for every unique poc file against the
same root cause doesn't seem useful. Perhaps some of these should be
rejected?

-- 
Scott Gayou / Red Hat Product Security


Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-29 Thread Ed Smith-Rowland

On 11/26/18 6:18 AM, Jonathan Wakely wrote:

On 24/11/18 13:54 -0500, Ed Smith-Rowland wrote:

All,

It's very late but uniform container erasure is, I think, the last 
little tidbit to graduate from fundamentals/v2 to std at the last 
meeting.  I think it would be a shame not to nudge this into gcc-9.  
The routines are very short so I just copied them. Ditto the 
testcases (with adjustments).  The node erasure tool was moved from 
experimental to std/bits and adjustments made to affected *set/*map 
headers.


The experimental code has been in our tree since 2015.

This builds and tests clean on x86_64-linux.


OK for trunk as it only touches experimental C++2a and TS material.
Thanks.

I pointed out to the committee that the erase_if functions added to
C++20 completely overlook P0646R1 "Improving the Return Value of
Erase-Like Algorithms" and so fail to preserve the useful information
returned by the members of the linked list containers.
Is *this* what you had in mind for this?  I basically return the number 
of erased items for all the things.


I expect that to get fixed as a defect. If you have time and
motivation, I think we should make that change proactively in
libstdc++. The Library Fundamentals versions can continue to return
void, but the ones in namespace std can return the number of elements
removed.


Ed



2018-11-29  Edward Smith-Rowland  <3dw...@verizon.net>

Pre-emptively support P0646R1 for std container erasure.
* include/bits/erase_if.h: Accumulate and return number of erased nodes.
* include/std/forward_list (): Return number of erased items.
* include/std/list (): Ditto.
* include/std/map (): Ditto.
* include/std/set (): Ditto.
* include/std/string (): Ditto.
* include/std/unordered_map (): Ditto.
* include/std/unordered_set (): Ditto.
* include/std/vector (): Ditto.
* testsuite/21_strings/basic_string/erasure.cc: Test number of erasures.
* testsuite/23_containers/deque/erasure.cc: Ditto.
* testsuite/23_containers/forward_list/erasure.cc: Ditto.
* testsuite/23_containers/list/erasure.cc: Ditto.
* testsuite/23_containers/map/erasure.cc: Ditto.
* testsuite/23_containers/set/erasure.cc: Ditto.
* testsuite/23_containers/unordered_map/erasure.cc: Ditto.
* testsuite/23_containers/unordered_set/erasure.cc: Ditto.
* testsuite/23_containers/vector/erasure.cc: Ditto.

Index: include/bits/erase_if.h
===
--- include/bits/erase_if.h (revision 266623)
+++ include/bits/erase_if.h (working copy)
@@ -41,17 +41,22 @@
   namespace __detail
   {
 template
-  void
+  typename _Container::size_type
   __erase_nodes_if(_Container& __cont, _Predicate __pred)
   {
+   typename _Container::size_type __num = 0;
for (auto __iter = __cont.begin(), __last = __cont.end();
 __iter != __last;)
-   {
- if (__pred(*__iter))
-   __iter = __cont.erase(__iter);
- else
-   ++__iter;
-   }
+ {
+   if (__pred(*__iter))
+ {
+   __iter = __cont.erase(__iter);
+   ++__num;
+ }
+   else
+ ++__iter;
+ }
+   return __num;
   }
   } // namespace __detail
 
Index: include/std/forward_list
===
--- include/std/forward_list(revision 266623)
+++ include/std/forward_list(working copy)
@@ -66,16 +66,17 @@
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
-inline void 
+inline typename forward_list<_Tp, _Alloc>::size_type 
 erase_if(forward_list<_Tp, _Alloc>& __cont, _Predicate __pred)
-{ __cont.remove_if(__pred); }
+{ return __cont.remove_if(__pred); }
 
   template
-inline void
+inline typename forward_list<_Tp, _Alloc>::size_type
 erase(forward_list<_Tp, _Alloc>& __cont, const _Up& __value)
 {
   using __elem_type = typename forward_list<_Tp, _Alloc>::value_type;
-  erase_if(__cont, [&](__elem_type& __elem) { return __elem == __value; });
+  return erase_if(__cont,
+ [&](__elem_type& __elem) { return __elem == __value; });
 }
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
Index: include/std/list
===
--- include/std/list(revision 266623)
+++ include/std/list(working copy)
@@ -90,16 +90,17 @@
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
-inline void
+inline typename list<_Tp, _Alloc>::size_type
 erase_if(list<_Tp, _Alloc>& __cont, _Predicate __pred)
-{ __cont.remove_if(__pred); }
+{ return __cont.remove_if(__pred); }
 
   template
-inline void
+inline typename list<_Tp, _Alloc>::size_type
 erase(list<_Tp, _Alloc>& __cont, const _Up& __value)
 {
   using __elem_type = typename li

Re: [PATCH, GCC, AARCH64, 3/6] Restrict indirect tail calls to x16 and x17

2018-11-29 Thread Sudakshina Das
Hi

On 02/11/18 18:37, Sudakshina Das wrote:
> Hi
> 
> This patch is part of a series that enables ARMv8.5-A in GCC and
> adds Branch Target Identification Mechanism.
> (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)
> 
> This patch changes the registers that are allowed for indirect tail
> calls. We are choosing to restrict these to only x16 or x17.
> 
> Indirect tail calls are special in that they convert a call
> statement (BLR instruction) to a jump statement (BR instruction). For
> the best possible use of Branch Target Identification Mechanism, we
> would like to place a "BTI C" (call) at the beginning of the function
> which is only compatible with BLRs and BR X16/X17. In order to make
> indirect tail calls compatible with this scenario, we are restricting
> the TAILCALL_ADDR_REGS.
> 
> In order to use x16/x17 for this purpose, we also had to change the use
> of these registers in the epilogue/prologue handling. For this purpose
> we are now using x12 and x13 named as EP0_REGNUM and EP1_REGNUM as
> scratch registers for epilogue and prologue.
> 
> Bootstrapped and regression tested with aarch64-none-linux-gnu. Updated
> test. Ran Spec2017 and no performance hit.
> 
> Is this ok for trunk?
> 
> Thanks
> Sudi
> 
> 
> *** gcc/ChangeLog***
> 
> 2018-xx-xx  Sudakshina Das  
> 
>* config/aarch64/aarch64.c (aarch64_expand_prologue): Use new
>epilogue/prologue scratch registers EP0_REGNUM and EP1_REGNUM.
>(aarch64_expand_epilogue): Likewise.
>(aarch64_output_mi_thunk): Likewise
>* config/aarch64/aarch64.h (REG_CLASS_CONTENTS): Change
>   TAILCALL_ADDR_REGS
>to x16 and x17.
>* config/aarch64/aarch64.md: Define EP0_REGNUM and EP1_REGNUM.
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2018-xx-xx  Sudakshina Das  
> 
>* gcc.target/aarch64/test_frame_17.c: Update to check for
>   EP0_REGNUM instead of IP0_REGNUM and add test case.
> 
I have edited the patch to take out a change that was not needed as part
of this patch in aarch64_expand_epilogue. The only change now happening
there is as mentioned in the ChangeLog to replace the uses of IP0/IP1.
ChangeLog still applies.

Thanks
Sudi
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 4bec6bd963d91c475a4e18f883955093e9268cfd..cc95be32d40268d3647c8280188f17ff8212a156 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -586,7 +586,7 @@ enum reg_class
 #define REG_CLASS_CONTENTS		\
 {	\
   { 0x, 0x, 0x },	/* NO_REGS */		\
-  { 0x0004, 0x, 0x },	/* TAILCALL_ADDR_REGS */\
+  { 0x0003, 0x, 0x },	/* TAILCALL_ADDR_REGS */\
   { 0x7fff, 0x, 0x0003 },	/* GENERAL_REGS */	\
   { 0x8000, 0x, 0x },	/* STACK_REG */		\
   { 0x, 0x, 0x0003 },	/* POINTER_REGS */	\
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index da7430f1fd88566c4f017a1b491f8de7dce724e8..f4ff300b883ce832335a4915b22bcbfefe64d9ae 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5357,8 +5357,8 @@ aarch64_expand_prologue (void)
 	aarch64_emit_probe_stack_range (get_stack_check_protect (), frame_size);
 }
 
-  rtx ip0_rtx = gen_rtx_REG (Pmode, IP0_REGNUM);
-  rtx ip1_rtx = gen_rtx_REG (Pmode, IP1_REGNUM);
+  rtx tmp0_rtx = gen_rtx_REG (Pmode, EP0_REGNUM);
+  rtx tmp1_rtx = gen_rtx_REG (Pmode, EP1_REGNUM);
 
   /* In theory we should never have both an initial adjustment
  and a callee save adjustment.  Verify that is the case since the
@@ -5368,7 +5368,7 @@ aarch64_expand_prologue (void)
   /* Will only probe if the initial adjustment is larger than the guard
  less the amount of the guard reserved for use by the caller's
  outgoing args.  */
-  aarch64_allocate_and_probe_stack_space (ip0_rtx, ip1_rtx, initial_adjust,
+  aarch64_allocate_and_probe_stack_space (tmp0_rtx, tmp1_rtx, initial_adjust,
 	  true, false);
 
   if (callee_adjust != 0)
@@ -5386,7 +5386,7 @@ aarch64_expand_prologue (void)
 	}
   aarch64_add_offset (Pmode, hard_frame_pointer_rtx,
 			  stack_pointer_rtx, callee_offset,
-			  ip1_rtx, ip0_rtx, frame_pointer_needed);
+			  tmp1_rtx, tmp0_rtx, frame_pointer_needed);
   if (frame_pointer_needed && !frame_size.is_constant ())
 	{
 	  /* Variable-sized frames need to describe the save slot
@@ -5428,7 +5428,7 @@ aarch64_expand_prologue (void)
 
   /* We may need to probe the final adjustment if it is larger than the guard
  that is assumed by the called.  */
-  aarch64_allocate_and_probe_stack_space (ip1_rtx, ip0_rtx, final_adjust,
+  aarch64_allocate_and_probe_stack_space (tmp1_rtx, tmp0_rtx, final_adjust,
 	  !frame_pointer_needed, true);
 }
 
@@ -5466,8 +5466,8 @@ aarch64_expand_epilogue (bool for_sibcall)
   unsigned reg2 = cfun->machine->frame.wb_candidate2;
   rtx cfi_ops

Re: [PATCH, GCC, AARCH64, 5/6] Enable BTI : Add new pass for BTI.

2018-11-29 Thread Sudakshina Das
Hi

On 13/11/18 14:47, Sudakshina Das wrote:
> Hi
> 
> On 02/11/18 18:38, Sudakshina Das wrote:
>> Hi
>>
>> This patch is part of a series that enables ARMv8.5-A in GCC and
>> adds Branch Target Identification Mechanism.
>> (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)
>>
>> This patch adds a new pass called "bti" which is triggered by the
>> command line argument -mbranch-protection whenever "bti" is turned on.
>>
>> The pass iterates through the instructions and adds appropriate BTI
>> instructions based on the following:
>>   * Add a new "BTI C" at the beginning of a function, unless it's already
>> protected by a "PACIASP/PACIBSP". We exempt the functions that are
>> only called directly.
>>   * Add a new "BTI J" for every target of an indirect jump, jump table
>> targets, non-local goto targets or labels that might be referenced
>> by variables, constant pools, etc (NOTE_INSN_DELETED_LABEL)
>>
>> Since we have already changed the use of indirect tail calls to only x16
>> and x17, we do not have to use "BTI JC".
>> (check patch 3/6).
>>
> 
> I missed out on the explanation for the changes to the trampoline code.
> The patch also updates the trampoline code in case BTI is enabled. Since
> the trampoline code is a target of an indirect branch, we need to add an
> appropriate BTI instruction at the beginning of it to avoid a branch
> target exception.
> 
>> Bootstrapped and regression tested with aarch64-none-linux-gnu. Added
>> new tests.
>> Is this ok for trunk?
>>
>> Thanks
>> Sudi
>>
>> *** gcc/ChangeLog ***
>>
>> 2018-xx-xx  Sudakshina Das  
>>  Ramana Radhakrishnan  
>>
>>  * config.gcc (aarch64*-*-*): Add aarch64-bti-insert.o.
>>  * gcc/config/aarch64/aarch64.h: Update comment for
>>  TRAMPOLINE_SIZE.
>>  * config/aarch64/aarch64.c (aarch64_asm_trampoline_template):
>>  Update if bti is enabled.
>>  * config/aarch64/aarch64-bti-insert.c: New file.
>>  * config/aarch64/aarch64-passes.def (INSERT_PASS_BEFORE): Insert
>>  bti pass.
>>  * config/aarch64/aarch64-protos.h (make_pass_insert_bti):
>>  Declare the new bti pass.
>>  * config/aarch64/aarch64.md (bti_nop): Define.
>>  * config/aarch64/t-aarch64: Add rule for aarch64-bti-insert.o.
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2018-xx-xx  Sudakshina Das  
>>
>>  * gcc.target/aarch64/bti-1.c: New test.
>>  * gcc.target/aarch64/bti-2.c: New test.
>>  * lib/target-supports.exp
>>  (check_effective_target_aarch64_bti_hw): Add new check for
>>  BTI hw.
>>
> 
> Updated patch attached with more comments and a bit of simplification
> in aarch64-bti-insert.c. ChangeLog still applies.
> 
> Thanks
> Sudi
> 

I found a missed case in the bti pass and edited the patch to include
it. This made me realize that the only 2 regressions I saw with the
BTI enabled model can now be avoided. (as quoted below from my 6/6
patch)
"Bootstrapped and regression tested with aarch64-none-linux-gnu with
and without the configure option turned on.
Also tested on aarch64-none-elf with and without configure option with a
BTI enabled aem. Only 2 regressions and these were because newlib
requires patches to protect hand coded libraries with BTI."

The ChangeLog still applies.

Sudi
diff --git a/gcc/config.gcc b/gcc/config.gcc
index b108697cfc7b1c9c6dc1f30cca6fd1158182c29e..3e77f9df6ad6ca55fccca50387eab4b2501af647 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -317,7 +317,7 @@ aarch64*-*-*)
 	c_target_objs="aarch64-c.o"
 	cxx_target_objs="aarch64-c.o"
 	d_target_objs="aarch64-d.o"
-	extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o"
+	extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o aarch64-bti-insert.o"
 	target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.c"
 	target_has_targetm_common=yes
 	;;
diff --git a/gcc/config/aarch64/aarch64-bti-insert.c b/gcc/config/aarch64/aarch64-bti-insert.c
new file mode 100644
index ..be604fb2fd5df052971cc81b7e6d7760880a6b79
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-bti-insert.c
@@ -0,0 +1,236 @@
+/* Branch Target Identification for AArch64 architecture.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   Contributed by Arm Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GN

Re: [PATCH] Come up with --param asan-stack-small-redzone (PR sanitizer/81715).

2018-11-29 Thread Jakub Jelinek
On Thu, Nov 29, 2018 at 05:37:11PM +0100, Martin Liška wrote:
>   0x10007fff7b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10007fff7b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10007fff7b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10007fff7b30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10007fff7b40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> =>0x10007fff7b50: 00 00 00 00 f1 f1 f1 f1[01]f2 00 00 01 f2 f2 f2
>   0x10007fff7b60: f2 f2 00 00 00 06 f2 f2 f2 f2 00 00 00 00 00 f3
>   0x10007fff7b70: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10007fff7b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10007fff7b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x10007fff7ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 
> As seen above, 06f2f2f2 is a completely unaligned write.

Sure, but the whole frame section where the vars from one function are
is properly aligned.  So, you shouldn't store 0xf2f2f206, but 0xf2f20600
one byte earlier, and then 0xf2f2 after it.  Of course, if we decide
that we want to do solely SImode stores.  If we want to complicate it
through determining costs and comparing if we should do a 1 or 2 or 4 byte
store and if unaligned is possible and not very expensive consider even
those in some cases, the code might be more complex.
I guess for now I'd be fine with SImode stores only and if we get complains,
we can address that incrementally (the question is if we'd want hooks or
what to determine it).

Jakub


Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-11-29 Thread Alexander Monakov
> On Mon, 15 Oct 2018, Richard Biener wrote:
> > I think it's sound but also note that I think it is logically independent of
> > asm inline ().  While it may work for the inlining issue for some kernel
> > examples, asm inline () is something similar to always_inline for functions,
> > that is, even though an asm _does_ have non-negligible .text size
> > we do want to ignore that for the purpose of inlining (but not for the
> > purpose of branch size estimation).
> 
> My understanding is that kernel folks are not demanding "always_inline"
> semantics though; they were unhappy with inlining cost mis-estimation.

Ping - as I think this approach addresses the root of the problem, I wouldn't
like it to be forgotten.

> > Note in your docs you refer to "non-code" sections but it should
> > equally apply to .text sections that are not the section of the asm
> > context (.text.cold, for example).  So better wording there would
> > be appreciated.
> 
> I've tried to address this with the following incremental change. I think
> it was the only suggestion, so the rest of the patch is unchanged.
> 
> @@ -9849,11 +9849,11 @@ a label is unreachable.
>  Likewise, it is possible for GCC to significantly over-estimate the
>  number of instructions in an @code{asm}, resulting in suboptimal decisions
>  when the estimate is used during inlining and branch range optimization.
> -This can happen if the @code{asm} template has many lines that do not
> -correspond to instructions, for example when the @samp{.pushsection}
> -directive is used to emit auxiliary data in a non-code section.
> +For instance, this can happen if a @samp{.pushsection} directive is used to
> +temporarily switch to another section and emit auxiliary code or data there.
>  For Extended @code{asm} statements, you can improve the estimate by
> -wrapping the non-code portion in @samp{%` ... %`} delimiters.
> +wrapping the parts where newlines and separators should not be counted in
> +@samp{%` ... %`} delimiters.
> 
> 
> * doc/extend.texi (Extended Asm): Document %` in template.
> (Size of an Asm): Document intended use of %`.
> * final.c (asm_insn_count): Adjust.
> (asm_str_count): Add argument to distinguish basic and extended asms.
> In extended asms, ignore separators inside of %` ... %`.
> (output_asm_insn): Handle %`.
> * rtl.h (asm_str_count): Adjust prototype.
> * tree-inline.c (estimate_num_insns): Adjust.
> * config/arm/arm.c (arm_rtx_costs_internal): Adjust.
> 
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 34ecc4f8d14..dac4728375b 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -8622,6 +8622,11 @@ generates multiple assembler instructions.
>  Outputs @samp{@{}, @samp{|}, and @samp{@}} characters (respectively)
>  into the assembler code.  When unescaped, these characters have special
>  meaning to indicate multiple assembler dialects, as described below.
> +
> +@item %`
> +Signifies a boundary of a region where instruction separators are not
> +counted towards its size (@pxref{Size of an asm}). Must be followed by
> +a whitespace character.
>  @end table
>  
>  @subsubheading Multiple assembler dialects in @code{asm} templates
> @@ -9830,7 +9835,7 @@ does this by counting the number of instructions in the 
> pattern of the
>  @code{asm} and multiplying that by the length of the longest
>  instruction supported by that processor.  (When working out the number
>  of instructions, it assumes that any occurrence of a newline or of
> -whatever statement separator character is supported by the assembler --
> +whatever statement separator character is supported by the assembler ---
>  typically @samp{;} --- indicates the end of an instruction.)
>  
>  Normally, GCC's estimate is adequate to ensure that correct
> @@ -9841,6 +9846,15 @@ space in the object file than is needed for a single 
> instruction.
>  If this happens then the assembler may produce a diagnostic saying that
>  a label is unreachable.
>  
> +Likewise, it is possible for GCC to significantly over-estimate the
> +number of instructions in an @code{asm}, resulting in suboptimal decisions
> +when the estimate is used during inlining and branch range optimization.
> +For instance, this can happen if a @samp{.pushsection} directive is used to
> +temporarily switch to another section and emit auxiliary code or data there.
> +For Extended @code{asm} statements, you can improve the estimate by
> +wrapping the parts where newlines and separators should not be counted in
> +@samp{%` ... %`} delimiters.
> +
>  @node Alternate Keywords
>  @section Alternate Keywords
>  @cindex alternate keywords
> diff --git a/gcc/final.c b/gcc/final.c
> index 6e61f1e17a8..1953293d379 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -1408,29 +1408,40 @@ static int
>  asm_insn_count (rtx body)
>  {
>const char *templ;
> +  bool basic = GET_CODE (body) == ASM_INPUT;
>  
> -  if (GET_CODE (body) == ASM_INP

Re: [PATCH] Optimize integral lt + blend into just blend (PR target/54700)

2018-11-29 Thread Uros Bizjak
On Thu, Nov 29, 2018 at 5:28 PM Jakub Jelinek  wrote:
>
> On Thu, Nov 29, 2018 at 05:21:53PM +0100, Uros Bizjak wrote:
> > > > * g++.target/i386/avx2-check.h: New file.
> > > > * g++.target/i386/m128-check.h: New file.
> > > > * g++.target/i386/m256-check.h: New file.
> > > > * g++.target/i386/avx-os-support.h: New file.
> > >
> > > OK.
> >
> > On a second thought, should we rather use (pre-reload?)
> > define_insn_and_split to split the combination to the blend insn?
>
> I've already committed it.  But can work on a patch that does that tomorrow.

Thanks. You will probably need to split it after reload, since a
change from intvec->FPvec is needed.

Uros.


Re: [PATCH] PR880088 Enable -Wtrampolines for no-exec-stack targets with -Wall.

2018-11-29 Thread Segher Boessenkool
Hi Mark,

On Thu, Nov 29, 2018 at 12:57:03PM +0100, Mark Wielaard wrote:
> On Wed, 2018-11-28 at 15:00 -0600, Segher Boessenkool wrote:
> > On Tue, Nov 27, 2018 at 08:54:07PM +0100, Mark Wielaard wrote:
> > > On Tue, 2018-11-27 at 12:37 -0600, Segher Boessenkool wrote:
> > > > > diff --git a/gcc/config/rs6000/sysv4.h
> > > > > b/gcc/config/rs6000/sysv4.h
> > > > > index 0c67634..9330acf 100644
> > > > > --- a/gcc/config/rs6000/sysv4.h
> > > > > +++ b/gcc/config/rs6000/sysv4.h
> > > > > @@ -972,6 +972,11 @@ ncrtn.o%s"
> > > > >  /* Generate entries in .fixup for relocatable addresses.  */
> > > > >  #define RELOCATABLE_NEEDS_FIXUP 1
> > > > >  
> > > > > +#if defined (POWERPC_LINUX) || defined (POWERPC_FREEBSD)
> > > > > +  #define TARGET_HAS_DEFAULT_NOEXEC_STACK (TARGET_32BIT \
> > > > > +|| DEFAULT_ABI ==
> > > > > ABI_ELFv2)
> > > > > +#endif
> > > > 
> > > > I don't think this belongs in sysv4.h .
> > > 
> > > I might have gotten lost in the tree of defines/macros.
> > > 
> > > There are two sysv4.h files gcc/config/rs6000/sysv4.h and
> > > gcc/config/powerpcspe/sysv4.h
> > 
> > Forget about powerpcspe please, I am talking about rs6000 only.
> 
> I don't know the differences between the config backends, but it looks
> like at least some of the configs were just copy/pasted which might
> explain why they both define things the same (in sysv4.h). And they

Yes, things are copied.  My point is, my review is for rs6000/ only :-)

> > You want linux.h and freebsd.h, maybe the "64" versions of those separately.
> > Or put this in rs6000.h.  sysv4.h is a random header for this, it doesn't
> > belong there.
> 
> The reason I added it to sysv4.h is because it matches where the
> TARGET_ASM_FILE_END hook is setup. It might make sense to have

Yeah, but it does not make sense to put it there, since you are handling
other ABIs as well.

> So if I understand correctly you would like to have:
> 
> rs6000/linux.h and rs6000/freebsd.h:
> #define TARGET_NEEDS_EXEC_STACK_FLAG_FOR_TRAMPOLINES 1
> 
> rs6000/linux64.h and rs6000/freebsd64.h:
> #define TARGET_NEEDS_EXEC_STACK_FLAG_FOR_TRAMPOLINES (DEFAULT_ABI == 
> ABI_ELFv2)

Something like that yes.  Or in rs6000.h, maybe.  Some location that makes
sense :-)


Segher


Re: [PATCH] Come up with --param asan-stack-small-redzone (PR sanitizer/81715).

2018-11-29 Thread Martin Liška
On 11/29/18 4:17 PM, Jakub Jelinek wrote:
> On Thu, Nov 29, 2018 at 04:03:42PM +0100, Martin Liška wrote:
>>> Two problems, it uses unconditionally unaligned stores, without
>>> checking if the target supports them at all (in this case it does).
>>> And, it doesn't check if it wouldn't be more efficient to use
>>> 32-bit stores. 
>>
>> Ok, so now we reduced the alignment of stack variables to 
>> ASAN_MIN_RED_ZONE_SIZE (16).
>> What options do we have when we emit the red zones? The only guarantee we
>> have is that in shadow memory we are aligned to at least 2. Should
>> byte-based emission be used for STRICT_ALIGNMENT targets?
> 
> While individual variables can have smaller alignment, why can't we as before
> ensure
> the whole block of all the vars is 32-byte aligned and (see below) also
> padded to 32 bytes?  Doesn't __asan_stack_malloc* align those as before and
> if not doing use after return sanitization, has your patch changed anything
> in how the whole block is aligned on the stack?  Of course, on non-strict
> alignment targets we might not care that much.

We can ensure the alignment and padding will be done to 32 bytes. However, the
current code jumps over the offsets and can end up doing a store to
1-byte-aligned memory.
See small demo:

$ cat asan-smaller.c
void bar (void *a, void *b, void *c, void *d, void *e, void *f, void *g)
{
  int *p = a;
  *p = 123;
}


int
baz (void)
{
  char a;
  char e[17];
  char e1[30];
  char e2[40];
  bar (&a, 0, 0, 0, &e[0], &e1[0], &e2[0]);
  return 0;
}

int main()
{
  baz();
}

$ gcc asan-smaller.c  -fsanitize=address  && ./a.out
flushing into: -256, values: f1f1f1f1
flushing into: -224, values: 01f2
flushing into: -192, values: 01f2f2f2
flushing into: -160, values: f2f2
flushing into: -120, values: 06f2f2f2
flushing into: -88, values: f200
flushing into: -40, values: f3f3f3f3
flushing into: -8, values: f300
=
==7793==ERROR: AddressSanitizer: stack-buffer-overflow on address 
0x7fffdac0 at pc 0x004011a1 bp 0x7fffda30 sp 0x7fffda28
WRITE of size 4 at 0x7fffdac0 thread T0
#0 0x4011a0 in bar (/home/marxin/Programming/testcases/a.out+0x4011a0)
#1 0x401293 in baz (/home/marxin/Programming/testcases/a.out+0x401293)
#2 0x40133b in main (/home/marxin/Programming/testcases/a.out+0x40133b)
#3 0x77019fea in __libc_start_main ../csu/libc-start.c:308
#4 0x401099 in _start (/home/marxin/Programming/testcases/a.out+0x401099)

Address 0x7fffdac0 is located in stack of thread T0 at offset 32 in frame
#0 0x4011bd in baz (/home/marxin/Programming/testcases/a.out+0x4011bd)

  This frame has 4 object(s):
[32, 33) 'a' (line 11) <== Memory access at offset 32 partially overflows 
this variable
[48, 65) 'e' (line 12)
[112, 142) 'e1' (line 13)
[176, 216) 'e2' (line 14)
HINT: this may be a false positive if your program uses some custom stack 
unwind mechanism, swapcontext or vfork
  (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow 
(/home/marxin/Programming/testcases/a.out+0x4011a0) in bar
Shadow bytes around the buggy address:
  0x10007fff7b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7b30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7b40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10007fff7b50: 00 00 00 00 f1 f1 f1 f1[01]f2 00 00 01 f2 f2 f2
  0x10007fff7b60: f2 f2 00 00 00 06 f2 f2 f2 f2 00 00 00 00 00 f3
  0x10007fff7b70: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fff7ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

As seen above, 06f2f2f2 is a completely unaligned write.

So the current code needs to be taught that once we move to another variable
we must start the offset at a 32B boundary. Let me take a look...

Martin

> 
>>>  It isn't that we want the gaps to have whatever
>>> value the shadow memory contained before, we want it to be 0 and just rely
>>> on it being zero from earlier.
>>> Another question is if it wouldn't be worth to ensure the variable area is
>>> always a multiple of 32 bytes (thus the corresponding shadow always multiple
>>> of 4 bytes).  Then, for this testcase, you'd store:
>>> $-235802127, 2147450880(%rbp)   // 0xf1f1f1f1
>>> $-234687999, 2147450884(%rbp)   // 0xf202f201
>>> $-218041852, 2147450888(%rbp)   // 0xf300f204
>>> $-202116109, 2147450892(%rbp)   // 0xf3f3f3f3
>>> For the single char/short/int/long var in a function case you'd still emit
>>> just f1 f1 f1 f1 0[1240] f3 f3 f3
>>> and the number of final f3 bytes would be from 3 to 6 (or 2 to 5), 1 is
>>> probably too few.
> 
> Here, the padding to 32 bytes.
> 
>   Jakub
> 



Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-29 Thread Jonathan Wakely

On 29/11/18 10:18 -0500, Ed Smith-Rowland wrote:

On 11/29/18 9:09 AM, Jonathan Wakely wrote:

On 29/11/18 08:47 -0500, Ed Smith-Rowland wrote:

Fixed with 266616.


Thanks!


Index: include/std/deque
===
--- include/std/deque    (revision 266567)
+++ include/std/deque    (working copy)
@@ -58,6 +58,7 @@
#pragma GCC system_header

#include 
+#include  // For remove and remove_if


Please only include  when __cplusplus > 201703L
otherwise we include hundreds of kilobytes of code just for these tiny
functions that aren't even defined in the default C++14 dialect.


Done with 266624.


Thanks for the quick turnaround.




Re: [PATCH] Optimize integral lt + blend into just blend (PR target/54700)

2018-11-29 Thread Jakub Jelinek
On Thu, Nov 29, 2018 at 05:21:53PM +0100, Uros Bizjak wrote:
> > > * g++.target/i386/avx2-check.h: New file.
> > > * g++.target/i386/m128-check.h: New file.
> > > * g++.target/i386/m256-check.h: New file.
> > > * g++.target/i386/avx-os-support.h: New file.
> >
> > OK.
> 
> On a second thought, should we rather use (pre-reload?)
> define_insn_and_split to split the combination to the blend insn?

I've already committed it.  But can work on a patch that does that tomorrow.

Jakub


Re: [PATCH] Optimize integer vector comparison followed by movmsk (PR target/88152)

2018-11-29 Thread Uros Bizjak
On Thu, Nov 29, 2018 at 3:36 PM Jakub Jelinek  wrote:
>
> Hi!
>
> Like blend, movmsk also only cares about the most significant bit,
> so a prior < 0 comparison or (as also happens in the testcase below in some
> cases) an arithmetic shift right (by any value) isn't needed before the movmsk.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Same comment as with your lt+blend -> blend patch. I think that
pre-reload define_insn_and_split that splits the combination to movmsk
would be better here. We already implement similar approach to remove
useless maskings of shift operands (c.f. various "..._mask" insns in
i386.md).

Uros.

> 2018-11-29  Jakub Jelinek  
>
> PR target/88152
> * config/i386/sse.md (*_movmsk_lt,
> *_movmsk_zext_lt,
> *_movmsk_shift,
> *_movmsk_zext_shift,
> *_pmovmskb_lt, *_pmovmskb_zext_lt): New
> patterns.
>
> * g++.target/i386/pr88152.C: New test.
>
> --- gcc/config/i386/sse.md.jj   2018-11-29 12:30:45.257028189 +0100
> +++ gcc/config/i386/sse.md  2018-11-29 13:16:34.111969513 +0100
> @@ -14653,6 +14653,62 @@ (define_insn "*_movmsk (set_attr "prefix" "maybe_vex")
> (set_attr "mode" "")])
>
> +(define_insn "*_movmsk_lt"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +   (unspec:SI
> + [(lt:VF_128_256
> +(match_operand: 1 "register_operand" "x")
> +(match_operand: 2 "const0_operand" "C"))]
> + UNSPEC_MOVMSK))]
> +  "TARGET_SSE"
> +  "%vmovmsk\t{%1, %0|%0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "")])
> +
> +(define_insn "*_movmsk_zext_lt"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +   (zero_extend:DI
> + (unspec:SI
> +   [(lt:VF_128_256
> +  (match_operand: 1 "register_operand" "x")
> +  (match_operand: 2 "const0_operand" "C"))]
> +   UNSPEC_MOVMSK)))]
> +  "TARGET_64BIT && TARGET_SSE"
> +  "%vmovmsk\t{%1, %k0|%k0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "")])
> +
> +(define_insn "*_movmsk_shift"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +   (unspec:SI
> + [(subreg:VF_128_256
> +(ashiftrt:
> +  (match_operand: 1 "register_operand" "x")
> +  (match_operand:QI 2 "const_int_operand" "n")) 0)]
> + UNSPEC_MOVMSK))]
> +  "TARGET_SSE"
> +  "%vmovmsk\t{%1, %0|%0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "")])
> +
> +(define_insn "*_movmsk_zext_shift"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +   (zero_extend:DI
> + (unspec:SI
> +   [(subreg:VF_128_256
> +  (ashiftrt:
> +(match_operand: 1 "register_operand" "x")
> +  (match_operand:QI 2 "const_int_operand" "n")) 0)]
> +   UNSPEC_MOVMSK)))]
> +  "TARGET_64BIT && TARGET_SSE"
> +  "%vmovmsk\t{%1, %k0|%k0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "")])
> +
>  (define_insn "_pmovmskb"
>[(set (match_operand:SI 0 "register_operand" "=r")
> (unspec:SI
> @@ -14677,6 +14733,41 @@ (define_insn "*_pmovmskb_zext
> UNSPEC_MOVMSK)))]
>"TARGET_64BIT && TARGET_SSE2"
>"%vpmovmskb\t{%1, %k0|%k0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set (attr "prefix_data16")
> + (if_then_else
> +   (match_test "TARGET_AVX")
> + (const_string "*")
> + (const_string "1")))
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "SI")])
> +
> +(define_insn "*_pmovmskb_lt"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +   (unspec:SI
> + [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
> +   (match_operand:VI1_AVX2 2 "const0_operand" "C"))]
> + UNSPEC_MOVMSK))]
> +  "TARGET_SSE2"
> +  "%vpmovmskb\t{%1, %0|%0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set (attr "prefix_data16")
> + (if_then_else
> +   (match_test "TARGET_AVX")
> + (const_string "*")
> + (const_string "1")))
> +   (set_attr "prefix" "maybe_vex")
> +   (set_attr "mode" "SI")])
> +
> +(define_insn "*_pmovmskb_zext_lt"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +   (zero_extend:DI
> + (unspec:SI
> +   [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
> + (match_operand:VI1_AVX2 2 "const0_operand" "C"))]
> +   UNSPEC_MOVMSK)))]
> +  "TARGET_64BIT && TARGET_SSE2"
> +  "%vpmovmskb\t{%1, %k0|%k0, %1}"
>[(set_attr "type" "ssemov")
> (set (attr "prefix_data16")
>   (if_then_else
> --- gcc/testsuite/g++.target/i386/pr88152.C.jj  2018-11-29 13:25:23.809113651 
> +0100
> +++ gcc/testsuite/g++.target/i386/pr88152.C 2018-11-29 13:26:20.362168048 
> +0100
> @@ -0,0 +1,44 @@
> +// PR target/88152
> +// { d

Re: [PATCH] Optimize integral lt + blend into just blend (PR target/54700)

2018-11-29 Thread Uros Bizjak
On Thu, Nov 29, 2018 at 10:54 AM Uros Bizjak  wrote:
>
> On Thu, Nov 29, 2018 at 9:00 AM Jakub Jelinek  wrote:
> >
> > Hi!
> >
> > The following patch optimizes
> > -   pxor%xmm3, %xmm3
> > -   pcmpgtb %xmm0, %xmm3
> > -   movdqa  %xmm3, %xmm0
> > pblendvb%xmm0, %xmm1, %xmm2
> > movdqa  %xmm2, %xmm0
> > ret
> >
> > -   vpxor   %xmm3, %xmm3, %xmm3
> > -   vpcmpgtq%ymm0, %ymm3, %ymm0
> > -   vpblendvb   %ymm0, %ymm2, %ymm1, %ymm0
> > +   vblendvpd   %ymm0, %ymm2, %ymm1, %ymm0
> > ret
> >
> > etc.  As the *blendv* instructions only look at the most significant
> > bit, we don't really need to perform pcmpgt* or vpcmpgt* instructions;
> > while they set also the other bits based on the most significant one,
> > the only consumer doesn't care about those other bits.
> >
> > I believe we can't do this for floating point comparisons even with
> > -ffast-math, because -fno-signed-zeros isn't a guarantee that -0.0 won't
> > appear, just that it will appear randomly when 0.0 is wanted and vice versa,
> > and having x < 0.0 be suddenly false if x is -0.0 would IMHO break too much
> > code.
>
> I agree with the above. This would mean that a comparison x < 0.0
> would be substituted with an equivalent to a signbit (). We don't do
> this even for -ffast-math or -funsafe-math-optimizations.
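
Both points in the quoted explanation can be sanity-checked numerically;
this is a plain-Python model, not tied to the patch itself:

```python
import struct

# Integer case: blendv/movmsk consume only each element's most
# significant bit, and (x < 0 ? -1 : 0) has the same MSB as x itself,
# so the pcmpgt can be dropped.
def msb(byte_val):
    return (byte_val & 0xff) >> 7

for b in range(-128, 128):
    lt_mask = -1 if b < 0 else 0        # what pcmpgtb would produce
    assert msb(lt_mask) == msb(b)

# Float case: x < 0.0 is not signbit(x), because of -0.0, so the same
# shortcut would change behaviour for FP compares.
assert not (-0.0 < 0.0)                      # the compare is false...
assert struct.pack(">d", -0.0)[0] >> 7 == 1  # ...but the sign bit is set
```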
>
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2018-11-28  Jakub Jelinek  
> >
> > PR target/54700
> > * config/i386/sse.md (ssebytemode): Add V16SI, V8SI and V4SI 
> > entries.
> > (ssefltmodesuffix, ssefltvecmode): New define_mode_attrs.
> > (*_blendv_lt,
> > *_blendv_ltint,
> > *_pblendvb_lt): New define_insns.
> >
> > * g++.target/i386/sse4_1-pr54700-1.C: New test.
> > * g++.target/i386/sse4_1-pr54700-2.C: New test.
> > * g++.target/i386/avx-pr54700-1.C: New test.
> > * g++.target/i386/avx-pr54700-2.C: New test.
> > * g++.target/i386/avx2-pr54700-1.C: New test.
> > * g++.target/i386/avx2-pr54700-2.C: New test.
> > * g++.target/i386/sse4_1-check.h: New file.
> > * g++.target/i386/avx-check.h: New file.
> > * g++.target/i386/avx2-check.h: New file.
> > * g++.target/i386/m128-check.h: New file.
> > * g++.target/i386/m256-check.h: New file.
> > * g++.target/i386/avx-os-support.h: New file.
>
> OK.

On second thought, should we rather use a (pre-reload?)
define_insn_and_split to split the combination to the blend insn?

Uros.


Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54589

2018-11-29 Thread Jakub Jelinek
On Thu, Nov 29, 2018 at 05:10:33PM +0100, Uros Bizjak wrote:
> Maybe a combine splitter can be used here? Please see documentation
> from paragraph 17.16 onward:
> 
> --quote--
>  The insn combiner phase also splits putative insns.  If three insns are
> merged into one insn with a complex expression that cannot be matched by
> some 'define_insn' pattern, the combiner phase attempts to split the
> complex pattern into two insns that are recognized.  Usually it can
> break the complex pattern into two patterns by splitting out some
> subexpression.  However, in some other cases, such as performing an
> addition of a large constant in two insns on a RISC machine, the way to
> split the addition into two insns is machine-dependent.

Maybe, but not sure how the define_split would look like.
We essentially want to match any RTL whatsoever that has some MEM in it
and if the MEM address is of certain kind, move some part of it.

combine.c has also find_split_point function which does the right thing
in the foo case, just doesn't for the bar case.

Jakub


Re: [PATCH][Version 3]Come up with -flive-patching master option.

2018-11-29 Thread Qing Zhao
the patch has been committed today as:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=266627 


I will try to update the gcc9 change page soon.

thanks.

Qing

> On Nov 28, 2018, at 2:24 PM, Qing Zhao  wrote:
> 
>> 
>> On Nov 28, 2018, at 9:52 AM, Jan Hubicka  wrote:
>> 
>>> 
>>> 2018-11-20  qing zhao  
>>> 
>>> * cif-code.def (EXTERN_LIVE_ONLY_STATIC): New CIF code.
>>> * common.opt: Add -flive-patching flag.
>>> * doc/invoke.texi: Document -flive-patching.
>>> * flag-types.h (enum live_patching_level): New enum.
>>> * ipa-inline.c (can_inline_edge_p): Disable external functions from
>>> inlining when flag_live_patching is LIVE_PATCHING_INLINE_ONLY_STATIC.
>>> * opts.c (control_options_for_live_patching): New function.
>>> (finish_options): Make flag_live_patching incompatible with flag_lto.
>>> Control IPA optimizations based on different levels of 
>>> flag_live_patching.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> 2018-11-20  qing zhao  
>>> 
>>> * gcc.dg/live-patching-1.c: New test.
>>> * gcc.dg/live-patching-2.c: New test.
>>> * gcc.dg/live-patching-3.c: New test.
>>> * gcc.dg/tree-ssa/writeonly-3.c: New test.
>>> * gcc.target/i386/ipa-stack-alignment-2.c: New test.
>>> 
>> 
>> I am still somewhat worried about possible use with C++ programs where
>> we will kill all inlining of comdats, but I guess we could discuss that
>> when it becomes an issue.
> 
> Okay. If this will be a problem later when we use live-patching in more C++ 
> applications, let’s 
> revisit it at that time. 
> 
>> +
>> +  /* FIXME: disable unreachable code removal.  */
>> 
>> Disabling unreachable code removal will really introduce a lot of extra
>> dead code, can't live patches just provide what they need if the code
>> was earlier removed.
>> +
>> +  /* discovery of functions/variables with no address taken.  */
>> +  if (opts_set->x_flag_ipa_reference_addressable
>> +  && opts->x_flag_ipa_reference_addressable)
>> +error_at (loc,
>> +  "%<-fipa-reference-addressable%> is incompatible with "
>> +  "%<-flive-patching=inline-only-static|inline-clone%>");
>> +  else
>> +opts->x_flag_ipa_reference_addressable = 0;
>> +
>> +  /* ipa stack alignment propagation.  */
>> +  if (opts_set->x_flag_ipa_stack_alignment
>> +  && opts->x_flag_ipa_stack_alignment)
>> +error_at (loc,
>> +  "%<-fipa-stack-alignment%> is incompatible with "
>> +  "%<-flive-patching=inline-only-static|inline-clone%>");
>> +  else
>> +opts->x_flag_ipa_stack_alignment = 0;
>> 
>> Shall we also disable nothrow, or will we worry about C++ only later?
> 
> This is also mainly for C++ applications, so currently should not be a 
> problem.
> But I can add a separate simple patch to add another flag to control nothrow 
> propagation and disable it when -flive-patching is ON.
> 
>> 
>> Patch is OK,
> 
> thanks for the review.
> 
> I will commit the patch very soon.
> 
> Qing
>> thanks!
>> Honza



Re: [C++ Patch] Improve compute_array_index_type locations

2018-11-29 Thread Jason Merrill

On 11/29/18 10:22 AM, Paolo Carlini wrote:

Hi,

On 29/11/18 03:01, Jason Merrill wrote:

On 11/6/18 4:01 AM, Paolo Carlini wrote:
when I improved create_array_type_for_decl I didn't notice that it 
calls compute_array_index_type as helper, which simply needs to have 
the location information propagated. Tested x86_64-linux.


This looks like it will use the declarator-id location in diagnostics 
about the array bound expression, which isn't optimal; if 'size' has a 
location, we should use that instead.


Yes. The below, for all the error messages talking about "size", uses 
the location of "size" and either the passed location or input_location 
as fall-back. This, afaics, also matches the behavior of clang. There 
are a few minor subtleties: 1- A first version of the patch didn't 
exploit the location of "size" in the unnamed cases, but doing that 
actually helps for testcases like the new constexpr-base6b.C; 2- It's 
important to use the original "size" - as saved in "osize" - otherwise 
after the folding the location information is lost in some cases (eg, 
happens for the ubsan testcase and a couple of g++.old-deja tests); 3- I 
didn't change the warnings about variable length arrays to exploit the 
location of "size", because they don't explicitly talk about the size - 
that, by the way, appears to match clang's behavior.



+   error_at (cp_expr_loc_or_loc (osize, loc),



+   error_at (cp_expr_loc_or_loc (osize, input_location),


Let's compute the location we want to use once at the top of the 
function.  And maybe rename the 'loc' parameter to name_loc.


Jason


Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54589

2018-11-29 Thread Uros Bizjak
On Thu, Nov 29, 2018 at 4:09 PM Jakub Jelinek  wrote:
>
> On Thu, Nov 29, 2018 at 08:13:48PM +0530, Umesh Kalappa wrote:
> > We are able to fix the subjected issue  with the  peephole patterns
> > (target specific) in the md file (attached the patch pr54589.patch).
> > While testing the fix ,we end up with some of the C constructs like
>
> The right thing for this sort of transformations is generally combiner
> and it works for most complex addressing cases.  The reason why it doesn't
> work in this case is that:
> Failed to match this instruction:
> (set (mem:SI (reg:DI 97) [2 *dst_7(D)+0 S4 A32])
> (vec_select:SI (mem:V4SI (plus:DI (plus:DI (mult:DI (reg:DI 90 [ 
> *src_4(D) ])
> (const_int 16 [0x10]))
> (reg/f:DI 16 argp))
> (const_int 16 [0x10])) [1 p.array S16 A128])
> (parallel [
> (const_int 0 [0])
> ])))
> Indeed, x86 doesn't have scale 16 addressing mode nor anything higher,
> so the above isn't recognized and combiner only tries
> Failed to match this instruction:
> (set (reg:DI 95)
> (plus:DI (ashift:DI (reg:DI 90 [ *src_4(D) ])
> (const_int 4 [0x4]))
> (reg/f:DI 16 argp)))
> after it, which doesn't help.
>
> For (x + N) << M, replacing it with (x << M) + (N << M) isn't generally
> a win, e.g. the (N << M) constant could be expensive to construct in a
> register, so optimizing that at GIMPLE level is not right, furthermore
> it isn't even visible at GIMPLE level, as it is just part of ARRAY_REF
> addressing (or could be).
>
> Not sure how hard would be to teach combiner to retry for the addressing
> case with the mult/ashift replaced with a register and if successful,
> emit that multiplication/shift as a separate instruction in 3->2 or 4->2
> combination.  Segher?

Maybe a combine splitter can be used here? Please see documentation
from paragraph 17.16 onward:

--quote--
 The insn combiner phase also splits putative insns.  If three insns are
merged into one insn with a complex expression that cannot be matched by
some 'define_insn' pattern, the combiner phase attempts to split the
complex pattern into two insns that are recognized.  Usually it can
break the complex pattern into two patterns by splitting out some
subexpression.  However, in some other cases, such as performing an
addition of a large constant in two insns on a RISC machine, the way to
split the addition into two insns is machine-dependent.

...

--/quote--

Uros.


Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54589

2018-11-29 Thread Jakub Jelinek
Hi!

Note, in other cases combine is successful.  E.g. consider:
struct S { int a, b, c, d; };
struct T { struct S e[16]; struct S f[1024]; } t;

int
foo (unsigned long x)
{
  return t.f[x + 5].a;
}

int
bar (struct T *x, unsigned long y)
{
  return x->f[y + 5].b;
}

On x86_64-linux with -O2, we have before combine in both cases
(idx + 21) << 4 and in foo combine says:
Trying 7, 8 -> 10:
7: {r87:DI=r91:DI+0x15;clobber flags:CC;}
  REG_DEAD r91:DI
  REG_UNUSED flags:CC
8: {r88:DI=r87:DI<<0x4;clobber flags:CC;}
  REG_DEAD r87:DI
  REG_UNUSED flags:CC
   10: r90:SI=[r88:DI+`t']
  REG_DEAD r88:DI
Failed to match this instruction:
(set (reg:SI 90 [ t.f[_1].a ])
(mem:SI (plus:DI (mult:DI (reg:DI 91)
(const_int 16 [0x10]))
(const:DI (plus:DI (symbol_ref:DI ("t") [flags 0x2]  )
(const_int 336 [0x150] [1 t.f[_1].a+0 S4 A32]))
Successfully matched this instruction:
(set (reg:DI 88)
(ashift:DI (reg:DI 91)
(const_int 4 [0x4])))
Successfully matched this instruction:
(set (reg:SI 90 [ t.f[_1].a ])
(mem:SI (plus:DI (reg:DI 88)
(const:DI (plus:DI (symbol_ref:DI ("t") [flags 0x2]  )
(const_int 336 [0x150] [1 t.f[_1].a+0 S4 A32]))
and it is merged the way we want:
salq$4, %rdi
movlt+336(%rdi), %eax
while in bar:
Failed to match this instruction:
(set (reg:SI 91 [ x_4(D)->f[_1].b ])
(mem:SI (plus:DI (plus:DI (mult:DI (reg:DI 93)
(const_int 16 [0x10]))
(reg:DI 92))
(const_int 340 [0x154])) [1 x_4(D)->f[_1].b+0 S4 A32]))
Failed to match this instruction:
(set (reg:DI 88)
(plus:DI (ashift:DI (reg:DI 93)
(const_int 4 [0x4]))
(reg:DI 92)))
and it is not merged:
addq$21, %rsi
salq$4, %rsi
movl4(%rsi,%rdi), %eax
when it could be:
salq$4, %rsi
movl340(%rsi,%rdi), %eax

It isn't x86 specific though, e.g. on powerpc64-linux we end up with:
addi 4,4,21
sldi 4,4,4
add 4,3,4
lwa 3,4(4)
for bar, where I guess:
sldi 4,4,4
add 4,3,4
lwa 3,340(4)
would work too.
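
The constants in the dumps above are straightforward address algebra and
can be double-checked directly (a minimal Python check based on the
testcase layout: struct S is 16 bytes, so f starts at offset 256 in
struct T):

```python
# Layout from the testcase: struct S { int a, b, c, d; } -> 16 bytes,
# struct T { struct S e[16]; struct S f[1024]; } -> f at offset 256.
SIZEOF_S = 16
OFF_F = 16 * SIZEOF_S              # 256

def addr_foo(x):                   # &t.f[x + 5].a, relative to &t
    return OFF_F + (x + 5) * SIZEOF_S + 0

def addr_bar(y):                   # &x->f[y + 5].b, relative to x
    return OFF_F + (y + 5) * SIZEOF_S + 4

for i in range(0, 1019):
    # combine's successful rewrite in foo: (x + 21) << 4 == (x << 4) + 336
    assert addr_foo(i) == (i << 4) + 336
    # the rewrite missed in bar would be (y << 4) + 340
    assert addr_bar(i) == (i << 4) + 340

assert 336 == 0x150                # the const_int seen in the foo dump
```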

Jakub


Re: [PATCH] PR libstdc++/67843 set shared_ptr lock policy at build-time

2018-11-29 Thread Christophe Lyon

On 28/11/2018 12:35, Jonathan Wakely wrote:

On 28/11/18 10:54 +0100, Christophe Lyon wrote:

On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely  wrote:


This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.

The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make additional atomic ops
available.

To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.

In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
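
A hand-rolled model of why the two policies are ABI-incompatible -- NOT
libstdc++ code, all names here are invented -- is that the
reference-count field of the control block simply has a different layout
under each policy:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// "Atomic" policy: the shared count is a lock-free atomic.
struct refcount_atomic {
    std::atomic<long> uses{1};
};

// "Mutex" policy: a plain count guarded by an embedded mutex --
// a different size and layout for the same logical control block.
struct refcount_mutex {
    long uses = 1;
    std::mutex lock;
};

// Two translation units that disagree about the policy would interpret
// the same control-block bytes differently: the ODR violation that
// fixing the policy at configure time prevents.
inline bool layouts_differ()
{
    return sizeof(refcount_mutex) != sizeof(refcount_atomic);
}
```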

    PR libstdc++/67843
    * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
    that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
    * config.h.in: Regenerate.
    * configure: Regenerate.
    * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
    * doc/xml/manual/configure.xml: Document new configure option.
    * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
    instead of shared_ptr.
    (recursive_directory_iterator): Likewise.
    (__shared_ptr<_Dir>): Add explicit instantiation declaration.
(__shared_ptr): Likewise.
    * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
    Add default template argument for _Lock_policy template parameter.
    * include/ext/concurrence.h (__default_lock_policy): Check macro
    _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
    target supports the builtins for compare-and-swap.
    * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
    instantiation definition.
(__shared_ptr): Likewise.
    (directory_iterator, recursive_directory_iterator): Use __make_shared
    instead of make_shared.

Tested x86_64-linux and aarch64-linux, committed to trunk.

I would appreciate testing on ARM, especially Christophe's
-march=armv5t set up.



Hi Jonathan,

Your patch passed my tests, thanks!


Thanks for checking it.


directory_iterator.cc tests used to be killed by a timeout until last
night where now they do PASS in my configurations involving
-march=armv5t.

That being said, I'm not sure I fully understand the fix. In my tests
involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9,
and I pass -march=armv5t as an override with the --target-board runtest option.


Yup, and that difference between the arch used to build libstdc++ and
the arch used to build the tests was what caused the FAILures. And
that's what my patch addresses, by ensuring that the arch used to
build libstdc++ is what dictates the type of synchronization for
shared_ptr.


In these cases libstdc++ configure detects support for "atomic", so I
suspect the tests pass only because I use QEMU with --cpu cortex-a9.
I think if I used a different QEMU cpu (eg arm926), the tests would
try to use atomics, and fail?


I don't know if a libstdc++ configured for cortex-a9 would run on an
arm926 QEMU. It might depend on cortex-a9 instructions (unrelated to
the use of atomics in my patch). See the end of this mail.


The reason I'm still using cortex-a9 at QEMU level is that many tests
override -mcpu/-march, and I used that as a compromise.


That means you're not fully testing armv5t, because you're still
linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9
instruction set. Usually that doesn't matter, because most libstdc++
code doesn't care about which specific instructions it gets compiled
to. That

However, I do have configurations using --with-cpu=arm10tdmi or
--with-cpu=default and QEMU --cpu arm926.
In these 2 cases, the tests do PASS, but they used to before your
patch. libstdc++ configure does detect "mutex" in these cases.


What matters is that the tests and the libstdc++ libraries ag

Re: [PATCH v4] Add sinh(atanh(x)) and cosh(atanh(x)) optimizations

2018-11-29 Thread Jeff Law
On 11/27/18 12:38 PM, Giuliano Augusto Faulin Belinassi wrote:
> Only do this optimization if funsafe-math and -fno-math-errno are
> enabled, as pointed in the previous iteration.
> 
> Also added one more test case to ensure that fno-math-errno is
> required for the optimization.
> 
> Special thanks to Wilco Dijkstra for all his help :-)
> 
> gcc/ChangeLog
> 2018-11-27  Giuliano Belinassi  
> 
> * match.pd (sinh (atanh (x))): New simplification rules.
> (cosh (atanh (x))): Likewise.
> 
> gcc/testsuite/ChangeLog
> 2018-11-27  Giuliano Belinassi  
> 
> * gcc.dg/sinhatanh-1.c: New test.
> * gcc.dg/sinhatanh-2.c: New test.
> * gcc.dg/sinhatanh-3.c: New test.
Arguably you'd want to extend the -2 and -3 tests to verify the number
of calls to sinh, cosh and atanh is what you expect.  An alternate would
be verifying there are no sqrt calls.  But I think that can be handled
as a follow-up if you wanted to improve the test.
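
For reference, the new rules rest on the identities
sinh(atanh(x)) = x/sqrt(1 - x^2) and cosh(atanh(x)) = 1/sqrt(1 - x^2)
for |x| < 1, which is also why -funsafe-math-optimizations is required:
the rewrite changes rounding.  A quick numerical check:

```python
import math

def sinh_atanh(x):                 # simplified form: x / sqrt(1 - x*x)
    return x / math.sqrt(1.0 - x * x)

def cosh_atanh(x):                 # simplified form: 1 / sqrt(1 - x*x)
    return 1.0 / math.sqrt(1.0 - x * x)

for x in [0.0, 0.1, 0.5, -0.5, 0.9, -0.999]:
    assert math.isclose(math.sinh(math.atanh(x)), sinh_atanh(x),
                        rel_tol=1e-12, abs_tol=1e-12)
    assert math.isclose(math.cosh(math.atanh(x)), cosh_atanh(x),
                        rel_tol=1e-12)
```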


I've installed the patch.  Thanks for your persistence.  This probably
took a lot longer than anyone anticipated.

jeff


Re: [PATCH v4] Add sinh(atanh(x)) and cosh(atanh(x)) optimizations

2018-11-29 Thread Jeff Law
On 11/29/18 3:34 AM, Giuliano Augusto Faulin Belinassi wrote:
> Hi,
> 
> Please check line 4102 in match.pd. Notice that
> 
> (if (flag_unsafe_math_optimizations)
> 
> And my code is inside that if :-)
Ugh.  I went backwards looking for something like that and missed it :-)
 I may have not gone back far enough...

I think this is ready to go.  I expect to commit it shortly.

jeff


Re: [C++ Patch] Improve compute_array_index_type locations

2018-11-29 Thread Paolo Carlini

Hi,

On 29/11/18 03:01, Jason Merrill wrote:

On 11/6/18 4:01 AM, Paolo Carlini wrote:
when I improved create_array_type_for_decl I didn't notice that it 
calls compute_array_index_type as helper, which simply needs to have 
the location information propagated. Tested x86_64-linux.


This looks like it will use the declarator-id location in diagnostics 
about the array bound expression, which isn't optimal; if 'size' has a 
location, we should use that instead.


Yes. The below, for all the error messages talking about "size", uses 
the location of "size" and either the passed location or input_location 
as fall-back. This, afaics, also matches the behavior of clang. There 
are a few minor subtleties: 1- A first version of the patch didn't 
exploit the location of "size" in the unnamed cases, but doing that 
actually helps for testcases like the new constexpr-base6b.C; 2- It's 
important to use the original "size" - as saved in "osize" - otherwise 
after the folding the location information is lost in some cases (eg, 
happens for the ubsan testcase and a couple of g++.old-deja tests); 3- I 
didn't change the warnings about variable length arrays to exploit the 
location of "size", because they don't explicitly talk about the size - 
that, by the way, appears to match clang's behavior.


Tested x86_64-linux.

Thanks, Paolo.

//

Index: cp/decl.c
===
--- cp/decl.c   (revision 266555)
+++ cp/decl.c   (working copy)
@@ -9621,8 +9621,9 @@ fold_sizeof_expr (tree t)
an appropriate index type for the array.  If non-NULL, NAME is
the name of the entity being declared.  */
 
-tree
-compute_array_index_type (tree name, tree size, tsubst_flags_t complain)
+static tree
+compute_array_index_type_loc (location_t loc, tree name, tree size,
+ tsubst_flags_t complain)
 {
   tree itype;
   tree osize = size;
@@ -9658,9 +9659,12 @@ fold_sizeof_expr (tree t)
  if (!(complain & tf_error))
return error_mark_node;
  if (name)
-   error ("size of array %qD has non-integral type %qT", name, type);
+   error_at (cp_expr_loc_or_loc (osize, loc),
+ "size of array %qD has non-integral type %qT",
+ name, type);
  else
-   error ("size of array has non-integral type %qT", type);
+   error_at (cp_expr_loc_or_loc (osize, input_location),
+ "size of array has non-integral type %qT", type);
  size = integer_one_node;
}
 }
@@ -9689,8 +9693,16 @@ fold_sizeof_expr (tree t)
 {
   tree folded = cp_fully_fold (size);
   if (TREE_CODE (folded) == INTEGER_CST)
-   pedwarn (input_location, OPT_Wpedantic,
-"size of array is not an integral constant-expression");
+   {
+ if (name)
+   pedwarn (cp_expr_loc_or_loc (osize, loc),
+OPT_Wpedantic, "size of array %qD is not an "
+"integral constant-expression", name);
+ else
+   pedwarn (cp_expr_loc_or_loc (osize, input_location),
+OPT_Wpedantic,
+"size of array is not an integral constant-expression");
+   }
   /* Use the folded result for VLAs, too; it will have resolved
 SIZEOF_EXPR.  */
   size = folded;
@@ -9706,9 +9718,11 @@ fold_sizeof_expr (tree t)
return error_mark_node;
 
  if (name)
-   error ("size of array %qD is negative", name);
+   error_at (cp_expr_loc_or_loc (osize, loc),
+ "size of array %qD is negative", name);
  else
-   error ("size of array is negative");
+   error_at (cp_expr_loc_or_loc (osize, input_location),
+ "size of array is negative");
  size = integer_one_node;
}
   /* As an extension we allow zero-sized arrays.  */
@@ -9722,9 +9736,12 @@ fold_sizeof_expr (tree t)
  else if (in_system_header_at (input_location))
/* Allow them in system headers because glibc uses them.  */;
  else if (name)
-   pedwarn (input_location, OPT_Wpedantic, "ISO C++ forbids zero-size 
array %qD", name);
+   pedwarn (cp_expr_loc_or_loc (osize, loc), OPT_Wpedantic,
+"ISO C++ forbids zero-size array %qD", name);
  else
-   pedwarn (input_location, OPT_Wpedantic, "ISO C++ forbids zero-size 
array");
+   pedwarn (cp_expr_loc_or_loc (osize, input_location),
+OPT_Wpedantic,
+"ISO C++ forbids zero-size array");
}
 }
   else if (TREE_CONSTANT (size)
@@ -9737,24 +9754,28 @@ fold_sizeof_expr (tree t)
return error_mark_node;
   /* `(int) &fn' is not a valid array bound.  */
   if (name)
-   error ("size of array %qD is not an integral constant-expression",
-  name);
+   error_at (cp_expr_loc_or_

Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-29 Thread Ed Smith-Rowland

On 11/29/18 9:09 AM, Jonathan Wakely wrote:

On 29/11/18 08:47 -0500, Ed Smith-Rowland wrote:

Fixed with 266616.


Thanks!


Index: include/std/deque
===
--- include/std/deque    (revision 266567)
+++ include/std/deque    (working copy)
@@ -58,6 +58,7 @@
#pragma GCC system_header

#include 
+#include  // For remove and remove_if


Please only include  when __cplusplus > 201703L
otherwise we include hundreds of kilobytes of code just for these tiny
functions that aren't even defined in the default C++14 dialect.


Done with 266624.

Ed


At some point I'll split std::remove and std::remove_if into their own
header, and we can just include that instead




2018-11-29  Edward Smith-Rowland  <3dw...@verizon.net>

Only include bits/stl_algo.h for C++20.
* include/std/deque: Only include bits/stl_algo.h for C++20.
* include/std/string: Ditto.
* include/std/vector: Ditto.

Index: include/std/deque
===
--- include/std/deque   (revision 266616)
+++ include/std/deque   (working copy)
@@ -58,7 +58,9 @@
 #pragma GCC system_header
 
 #include 
-#include  // For remove and remove_if
+#if __cplusplus > 201703L
+#  include  // For remove and remove_if
+#endif // C++20
 #include 
 #include 
 #include 
Index: include/std/string
===
--- include/std/string  (revision 266616)
+++ include/std/string  (working copy)
@@ -48,7 +48,9 @@
 #include  // For less
 #include 
 #include 
-#include  // For remove and remove_if
+#if __cplusplus > 201703L
+#  include  // For remove and remove_if
+#endif // C++20
 #include 
 #include 
 #include 
Index: include/std/vector
===
--- include/std/vector  (revision 266616)
+++ include/std/vector  (working copy)
@@ -58,7 +58,9 @@
 #pragma GCC system_header
 
 #include 
-#include  // For remove and remove_if
+#if __cplusplus > 201703L
+#  include  // For remove and remove_if
+#endif // C++20
 #include 
 #include 
 #include 


Re: [PATCH] Come up with --param asan-stack-small-redzone (PR sanitizer/81715).

2018-11-29 Thread Jakub Jelinek
On Thu, Nov 29, 2018 at 04:03:42PM +0100, Martin Liška wrote:
> > Two problems, it uses unconditionally unaligned stores, without
> > checking if the target supports them at all (in this case it does).
> > And, it doesn't check if it wouldn't be more efficient to use
> > 32-bit stores. 
> 
> Ok, so now we reduced the alignment of stack variables to 
> ASAN_MIN_RED_ZONE_SIZE (16).
> What options do we have when we emit the red zones? The only guarantee we 
> have is
> that in shadow memory we are aligned to at least 2. Should byte-based 
> emission be used
> for STRICT_ALIGNMENT targets?

While individual variables can have smaller alignment, why can't we, as
before, ensure
the whole block of all the vars is 32-byte aligned and (see below) also
padded to 32 bytes?  Doesn't __asan_stack_malloc* align those as before and
if not doing use after return sanitization, has your patch changed anything
in how the whole block is aligned on the stack?  Of course, on non-strict
alignment targets we might not care that much.

> >  It isn't that we want the gaps to have whatever
> > value the shadow memory contained before, we want it to be 0 and just rely
> > on it being zero from earlier.
> > Another question is if it wouldn't be worth to ensure the variable area is
> > always a multiple of 32 bytes (thus the corresponding shadow always multiple
> > of 4 bytes).  Then, for this testcase, you'd store:
> > $-235802127, 2147450880(%rbp)   // 0xf1f1f1f1
> > $-234687999, 2147450884(%rbp)   // 0xf202f201
> > $-218041852, 2147450888(%rbp)   // 0xf300f204
> > $-202116109, 2147450892(%rbp)   // 0xf3f3f3f3
> > For the single char/short/int/long var in a function case you'd still emit
> > just f1 f1 f1 f1 0[1240] f3 f3 f3
> > and the number of final f3 bytes would be from 3 to 6 (or 2 to 5), 1 is
> > probably too few.

Here, the padding to 32 bytes.
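
Decoding the suggested 32-bit immediates confirms they are just the ASan
shadow bytes packed little-endian (0xf1 left redzone, 0xf2 middle
redzone, 0xf3 right redzone, 0x01/0x02/0x04 partially addressable
granules); a quick check:

```python
import struct

# Decode the signed 32-bit store immediates back into ASan shadow
# bytes (low address first, one shadow byte per 8 application bytes).
cases = {
    -235802127: b"\xf1\xf1\xf1\xf1",  # left redzone
    -234687999: b"\x01\xf2\x02\xf2",  # a (1 byte) | mid rz | b (2) | mid rz
    -218041852: b"\x04\xf2\x00\xf3",  # c (4) | mid rz | d (8, all ok) | right rz
    -202116109: b"\xf3\xf3\xf3\xf3",  # right redzone
}

for imm, shadow_bytes in cases.items():
    assert struct.pack("<i", imm) == shadow_bytes
```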

Jakub


Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54589

2018-11-29 Thread Jakub Jelinek
On Thu, Nov 29, 2018 at 08:13:48PM +0530, Umesh Kalappa wrote:
> We are able to fix the subjected issue  with the  peephole patterns
> (target specific) in the md file (attached the patch pr54589.patch).
> While testing the fix ,we end up with some of the C constructs like

The right thing for this sort of transformations is generally combiner
and it works for most complex addressing cases.  The reason why it doesn't
work in this case is that:
Failed to match this instruction:
(set (mem:SI (reg:DI 97) [2 *dst_7(D)+0 S4 A32])
(vec_select:SI (mem:V4SI (plus:DI (plus:DI (mult:DI (reg:DI 90 [ *src_4(D) 
])
(const_int 16 [0x10]))
(reg/f:DI 16 argp))
(const_int 16 [0x10])) [1 p.array S16 A128])
(parallel [
(const_int 0 [0])
])))
Indeed, x86 doesn't have scale 16 addressing mode nor anything higher,
so the above isn't recognized and combiner only tries
Failed to match this instruction:
(set (reg:DI 95)
(plus:DI (ashift:DI (reg:DI 90 [ *src_4(D) ])
(const_int 4 [0x4]))
(reg/f:DI 16 argp)))
after it, which doesn't help.

For (x + N) << M, replacing it with (x << M) + (N << M) isn't generally
a win, e.g. the (N << M) constant could be expensive to construct in a
register, so optimizing that at GIMPLE level is not right, furthermore
it isn't even visible at GIMPLE level, as it is just part of ARRAY_REF
addressing (or could be).

Not sure how hard would be to teach combiner to retry for the addressing
case with the mult/ashift replaced with a register and if successful,
emit that multiplication/shift as a separate instruction in 3->2 or 4->2
combination.  Segher?

For your patch, not really sure if peephole is the best place, it is too
late (after RA), and your patch is in some cases too narrow (e.g. hardcoding
DImode for the addresses, Pmode might be better), especially because you
want to apply it for any kind of MEM, without caring about what mode that
mem has, nor where in the instruction it appears and what else the
instruction does.  It is equally useful if the mem already has some
immediate if the two can be combined, or if it doesn't have a base, etc.

Jakub


Re: [PATCH] Come up with --param asan-stack-small-redzone (PR sanitizer/81715).

2018-11-29 Thread Martin Liška
On 11/28/18 12:41 PM, Jakub Jelinek wrote:
> On Wed, Sep 26, 2018 at 11:33:25AM +0200, Martin Liška wrote:
>> 2018-09-26  Martin Liska  
>>
>>  PR sanitizer/81715
>>  * asan.c (asan_shadow_cst): Remove.
>>  (asan_emit_redzone_payload): New.
>>  (asan_emit_stack_protection): Make it more
>>  flexible to support arbitrary size of red zones.
>>  * asan.h (ASAN_MIN_RED_ZONE_SIZE): New.
>>  (asan_var_and_redzone_size): Likewise.
>>  * cfgexpand.c (expand_stack_vars): Reserve size
>>  for stack vars according to asan_var_and_redzone_size.
>>  (expand_used_vars): Make smaller offset based
>>  on ASAN_SHADOW_GRANULARITY.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-09-26  Martin Liska  
>>
>>  PR sanitizer/81715
>>  * c-c++-common/asan/asan-stack-small.c: New test.
> 
> Sorry for the delay.  I had a quick look.  Using:
> struct S { char a[32]; };
> void bar (void *, void *, void *, void *);
> 
> int
> foo (void)
> {
>   struct S a, b, c, d;
>   bar (&a, &b, &c, &d);
>   return 0;
> }
> 
> int
> baz (void)
> {
>   char a;
>   short b;
>   int c;
>   long long d;
>   bar (&a, &b, &c, &d);
>   return 0;
> }
> -O2 -fsanitize=address, I see that foo is identical before/after your patch,
> perfect.
> Looking at baz, I see issues though:
> .LC2:
> .string "4 32 1 4 a:15 48 2 4 b:16 64 4 4 c:17 80 8 4 d:18"
> ...
> movq    %rbx, %rbp
> movl    $-3580, %edx
> movl    $-3085, %ecx
> movq    $1102416563, (%rbx)
> shrq    $3, %rbp
> movq    $.LC2, 8(%rbx)
> leaq    48(%rbx), %rsi
> leaq    32(%rbx), %rdi
> movq    $.LASANPC1, 16(%rbx)
> movw    %dx, 2147450888(%rbp)
> leaq    64(%rbx), %rdx
> movw    %cx, 2147450891(%rbp)
> leaq    80(%rbx), %rcx
> movl    $-235802127, 2147450880(%rbp)
> movl    $-234687999, 2147450884(%rbp)
> movb    $-13, 2147450893(%rbp)
> call    bar
> cmpq    %rbx, %r12
> jne     .L15
> movq    $0, 2147450880(%rbp)
> xorl    %eax, %eax
> movl    $0, 2147450888(%rbp)
> movw    %ax, 2147450892(%rbp)
> So, looks like the shadow at (%rbx>>3)+0x7fff8000 is:
> / 2147450880(%rbp)
> |   / 2147450884(%rbp)
> |   |   / 2147450888(%rbp)
> |   |   |   / 2147450892(%rbp)
> |   |   |   |
> f1 f1 f1 f1 01 f2 02 f2 04 f2 00 f3 f3 f3

Hi.

> 
> Two problems, it uses unconditionally unaligned stores, without
> checking if the target supports them at all (in this case it does).
> And, it doesn't check if it wouldn't be more efficient to use
> 32-bit stores. 

Ok, so now we have reduced the alignment of stack variables to
ASAN_MIN_RED_ZONE_SIZE (16).
What options do we have when we emit the red zones?  The only guarantee we
have is that in shadow memory we are aligned to at least 2 bytes.  Should
byte-based emission be used for STRICT_ALIGNMENT targets?

>  It isn't that we want the gaps to have whatever
> value the shadow memory contained before, we want it to be 0 and just rely
> on it being zero from earlier.
> Another question is if it wouldn't be worth to ensure the variable area is
> always a multiple of 32 bytes (thus the corresponding shadow always multiple
> of 4 bytes).  Then, for this testcase, you'd store:
> $-235802127, 2147450880(%rbp) // 0xf1f1f1f1
> $-234687999, 2147450884(%rbp)   // 0xf202f201
> $-218041852, 2147450888(%rbp)   // 0xf300f204
> $-202116109, 2147450892(%rbp)   // 0xf3f3f3f3
> For the single char/short/int/long var in a function case you'd still emit
> just f1 f1 f1 f1 0[1240] f3 f3 f3
> and the number of final f3 bytes would be from 3 to 6 (or 2 to 5), 1 is
> probably too few.

I like the idea, so that can be achieved by setting alignment of the last stack
variable to 32, right?

> 
> Rather than special-casing the two very small adjacent vars case,
> just use a rolling handling of the shadow bytes, if you fill all 4,
> emit immediately, otherwise queue later and flush if the next shadow offset
> is outside of the 4 bytes.

I rewrote it and I'm sending an untested patch (it survives the asan.exp
tests, however).  I must agree the code is now much more readable.

>  Either always use SImode stores, or check rtx

I'm now using only SImode stores, which should be the same as before the patch.
The complication we now have is that the guaranteed alignment in shadow
memory is only 2 bytes.

Martin

> costs; if the 32-bit constants you'd store is as expensive or less than the
> 16-bit or 8-bit constant, use 32-bit store, otherwise use a 16-bit or 8-bit
> one.  If you want to complicate it further, allow unaligned stores on
> targets that do allow them, but use them with care, if you could use same
> amount of aligned stores with the same cost of constants, prefer aligned
> ones.
> 
>   Jakub
> 

From 6088a945ccbe9bf9c49c8238fe923f63688a778f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 25 Sep 2018 10:54:37 +0200
Subje

RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-29 Thread Nick Clifton
Hi Ian,

  Libiberty's demangler seems to be a favourite target of people looking
  to file new CVEs by fuzzing strings and I have finally gotten tired of
  reviewing them all.  So I would like to propose a patch to add support
  for placing a recursion limit on the demangling functions.

  The patch adds a new demangling option - DMGL_RECURSE_LIMIT - which
  needs to be present in order for the new code to take effect.  So
  current users of the libiberty library will not be affected[*].
  The patch also adds a new function - cplus_demangle_set_recursion_limit()
  - which can be used to probe and/or set the limit.

  When the option is in effect a few of the demangler functions will use
  static counters to make sure that they have not been called
  recursively too many times.  I only chose those functions for which I
  could find filed PRs that triggered this kind of problem.  But with
  the stack limiting framework in place it would be easy to add checks
  to other functions, should they prove necessary.

  I also encountered one binary that could trigger stack exhaustion in
  d_demangle_callback where it tries to allocate stack space to hold
  component and substitution arrays.  So the patch includes a check
  for this situation as well.
  
  There does not appear to be any error reporting framework for the
  demangler functions, so when a recursion limit is reached the
  functions just return a failure/NULL result.

  I have tested the patch with a variety of different binutils builds
  and also bootstrapped an x86_64-pc-linux-gnu toolchain.  No new
  failures were found.

  What do you think, is this approach reasonable ?

Cheers
  Nick

[*] Actually I also have a patch for the binutils to modify the
addr2line, c++filt, nm and objdump programs to make use of this new
feature, should it be accepted into libiberty.

Patches:

include/ChangeLog
2018-11-29  Nick Clifton  

* demangle.h (DMGL_RECURSE_LIMIT): Define.
(cplus_demangle_set_recursion_limit): Prototype.

libiberty/ChangeLog
2018-11-29  Nick Clifton  

PR 87675
PR 87636
* cp-demangle.c (demangle_recursion_limit): New static
variable.
(d_function_type): Add recursion counter.  If the recursion
limit is enabled and reached, return with a failure result.
(d_demangle_callback): If the recursion limit is enabled, check
for a mangled string that is so long that there is not enough
stack space for the local arrays.
(cplus_demangle_set_recursion): New function.  Sets and/or
returns the current stack recursion limit.
* cplus-dem.c (demangle_nested_args): Add recursion counter.  If
the recursion limit is enabled and reached, return with a
failure result.

diff --git a/include/demangle.h b/include/demangle.h
index b8d57cf295..426b5880aa 100644
--- a/include/demangle.h
+++ b/include/demangle.h
@@ -65,6 +65,8 @@ extern "C" {
 #define DMGL_DLANG	 (1 << 16)
 #define DMGL_RUST	 (1 << 17)	/* Rust wraps GNU_V3 style mangling.  */
 
+#define DMGL_RECURSE_LIMIT (1 << 18)	/* Enable limits on the depth of recursion in mangled strings.  */
+
 /* If none of these are set, use 'current_demangling_style' as the default. */
 #define DMGL_STYLE_MASK (DMGL_AUTO|DMGL_GNU|DMGL_LUCID|DMGL_ARM|DMGL_HP|DMGL_EDG|DMGL_GNU_V3|DMGL_JAVA|DMGL_GNAT|DMGL_DLANG|DMGL_RUST)
 
@@ -716,6 +718,15 @@ cplus_demangle_print_callback (int options,
struct demangle_component *tree,
demangle_callbackref callback, void *opaque);
 
+/* Set a limit on the amount of recursion allowed in mangled strings.
+   Only effective if the DMGL_RECURSE_LIMIT option has been set.
+   Returns the previous value of the recursion limit.
+   Ignores settings a limit of 0 or 1.
+   Note - setting the limit is not a thread-safe operation.  */
+
+extern unsigned long
+cplus_demangle_set_recursion_limit (unsigned long limit);
+  
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 3f2a097e7f..a6aeac58a3 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -568,6 +568,11 @@ static int d_demangle_callback (const char *, int,
 demangle_callbackref, void *);
 static char *d_demangle (const char *, int, size_t *);
 
+/* A limit on the amount of recursion allowed in mangled strings.
+   Only effective if the DMGL_RECURSE_LIMIT option has been set.
+   The initial value of 1024 is an arbitrary choice.  */
+static unsigned long demangle_recursion_limit = 1024;
+
 #define FNQUAL_COMPONENT_CASE\
 case DEMANGLE_COMPONENT_RESTRICT_THIS:		\
 case DEMANGLE_COMPONENT_VOLATILE_THIS:		\
@@ -2843,21 +2848,34 @@ d_ref_qualifier (struct d_info *di, struct demangle_component *sub)
 static struct demangle_component *
 d_function_type (struct d_info *di)
 {
-  struct demangle_component *ret;
+  static unsigned long recursion_level 

Re: [PATCH, ARM] Error out when -mfpu set and targeting Thumb-1

2018-11-29 Thread Richard Earnshaw (lists)
On 29/11/2018 10:51, Thomas Preudhomme wrote:
> Hi,
> 
> FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
> but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
> not set. Among other things, it makes some of the cmse tests (eg.
> gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
> -march=armv8-m.base -mfpu= -mfloat-abi=softfp. This patch
> errors out when a Thumb-1-like target is selected and an FPU is
> specified, thus making such tests be skipped.
> 
> ChangeLog entries are as follows:
> 
> *** gcc/ChangeLog ***
> 
> 2018-11-28  thomas Preud'homme  
> 
> * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Error out
> if targeting Thumb-1 with an FPU specified.
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2018-11-28  thomas Preud'homme  
> 
> * gcc.target/arm/thumb1_mfpu-1.c: New testcase.
> * gcc.target/arm/thumb1_mfpu-2.c: Likewise.
> 
> Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M.
> Fails as expected when targeting Armv6-M with an -mfpu or a default FPU.
> Succeeds without.
> 
> Is this ok for stage3?
> 

This doesn't sound right.  Specifically this bit...

+  else if (TARGET_THUMB1
+  && bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2))
+   error ("Thumb-1 does not allow FP instructions");

If I use

-mcpu=arm1176jzf-s -mfpu=auto -mfloat-abi=softfp -mthumb

then that shouldn't error, since softfp and thumb is, in reality, just
float-abi=soft (as there are no fp instructions in thumb).  We also want
it to work this way so that I can add the thumb/arm attribute to
specific functions and have the compiler use HW float instructions when
they are suitable.


R.

> Best regards,
> 
> Thomas
> 
> 
> thumb1_mfpu_error.patch
> 
> From 051e38552d7c596873e0303f6ec4272b26d50900 Mon Sep 17 00:00:00 2001
> From: Thomas Preud'homme 
> Date: Tue, 27 Nov 2018 15:52:38 +
> Subject: [PATCH] [PATCH, ARM] Error out when -mfpu set and targeting Thumb-1
> 
> Hi,
> 
> FP instructions are only enabled for TARGET_32BIT and TARGET_HARD_FLOAT
> but GCC only gives an error when TARGET_HARD_FLOAT is true and -mfpu is
> not set. Among other things, it makes some of the cmse tests (eg.
> gcc.target/arm/cmse/baseline/softfp.c) fail when targeting
> -march=armv8-m.base -mfpu= -mfloat-abi=softfp. This patch
> errors out when a Thumb-1-like target is selected and an FPU is
> specified, thus making such tests be skipped.
> 
> ChangeLog entries are as follows:
> 
> *** gcc/ChangeLog ***
> 
> 2018-11-28  thomas Preud'homme  
> 
>   * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Error out
>   if targeting Thumb-1 with an FPU specified.
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2018-11-28  thomas Preud'homme  
> 
>   * gcc.target/arm/thumb1_mfpu-1.c: New testcase.
>   * gcc.target/arm/thumb1_mfpu-2.c: Likewise.
> 
> Testing: No testsuite regression when targeting arm-none-eabi Armv6S-M.
> Fails as expected when targeting Armv6-M with an -mfpu or a default FPU.
> Succeeds without.
> 
> Is this ok for stage3?
> 
> Best regards,
> 
> Thomas
> ---
>  gcc/config/arm/arm.c | 3 +++
>  gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c | 7 +++
>  gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c | 8 
>  3 files changed, 18 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 40f0574e32e..1a205123cf5 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -3747,6 +3747,9 @@ arm_options_perform_arch_sanity_checks (void)
>  {
>if (arm_abi == ARM_ABI_IWMMXT)
>   arm_pcs_default = ARM_PCS_AAPCS_IWMMXT;
> +  else if (TARGET_THUMB1
> +&& bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2))
> + error ("Thumb-1 does not allow FP instructions");
>else if (TARGET_HARD_FLOAT_ABI)
>   {
> arm_pcs_default = ARM_PCS_AAPCS_VFP;
> diff --git a/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c 
> b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
> new file mode 100644
> index 000..5347e63f9b6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-1.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_thumb1_ok } */
> +/* { dg-skip-if "incompatible float ABI" { *-*-* } { "-mfloat-abi=*" } { 
> "-mfloat-abi=softfp" } } */
> +/* { dg-options "-mthumb -mfpu=vfp -mfloat-abi=softfp" } */
> +/* { dg-error "Thumb-1 does not allow FP instructions" "" { target *-*-* } 0 
> } */
> +
> +int foo;
> diff --git a/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c 
> b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c
> new file mode 100644
> index 000..941ed26ed01
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/thumb1_mfpu-2.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_thumb1

Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54589

2018-11-29 Thread Umesh Kalappa
Hi All,

We are able to fix the subject issue with the peephole patterns
(target specific) in the md file (the patch pr54589.patch is attached).
While testing the fix, we end up with some C constructs like

Testcase 1:
#include 
struct param {
int a, b, c, d;
__m128i array[256];
};
void func(struct param p, unsigned char *src, int *dst)
{
__m128i x = p.array[*src];
*dst = _mm_cvtsi128_si32(x);
}

compiles with -O2 for  x86-64 and asm looks like
func:
movzbl  (%rdi), %eax
addq    $1, %rax
salq    $4, %rax
movl    8(%rsp,%rax), %eax
movl    %eax, (%rsi)
ret

Testcase 2:
  #include 
  struct param {
int a, b, c, d;
__m128i array[256];
  };

struct param *p1;

void func(unsigned char *src, int *dst)
{
__m128i x = p1->array[*src];
*dst = _mm_cvtsi128_si32(x);
}

compiles with -O2 on x86-64 :
func:
movzbl  (%rdi), %eax
addq    $1, %rax
salq    $4, %rax
addq    p1(%rip), %rax
movl    (%rax), %eax
movl    %eax, (%rsi)
ret

So, we are thinking of extending our fix with a few more peephole
patterns in the md file.

Or would it be better to handle all such C constructs in the
combiner/fwprop pass?  Please let us know your suggestions or comments
on this to guide us in the right direction.

Thank you
~Umesh

On Fri, Nov 23, 2018 at 3:56 PM Umesh Kalappa  wrote:
>
> Hi Richard,
>
> for the subject issue, we found a few suggestions to handle it, such as:
>
>  1. be more conservative (target specific) and define peepholes in
> the md file to handle patterns like add, shl and movl as "shlq and
> movl"
>  2. handle it in fwprop/combiner, as you mentioned.
>
> we would like to know your inputs /suggestions before we go ahead with
> the patch.
>
> Thank you
> ~Umesh


pr54589.patch
Description: Binary data


[PATCH] Optimize integer vector comparison followed by movmsk (PR target/88152)

2018-11-29 Thread Jakub Jelinek
Hi!

Like blend, movmsk also only cares about the most significant bit,
so prior < 0 comparisons or (happens also on the testcase below in some
cases) arithmetic shift right (by any value) isn't needed before the movmsk.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-29  Jakub Jelinek  

PR target/88152
* config/i386/sse.md (*_movmsk_lt,
*_movmsk_zext_lt,
*_movmsk_shift,
*_movmsk_zext_shift,
*_pmovmskb_lt, *_pmovmskb_zext_lt): New
patterns.

* g++.target/i386/pr88152.C: New test.

--- gcc/config/i386/sse.md.jj   2018-11-29 12:30:45.257028189 +0100
+++ gcc/config/i386/sse.md  2018-11-29 13:16:34.111969513 +0100
@@ -14653,6 +14653,62 @@ (define_insn "*_movmsk")])
 
+(define_insn "*_movmsk_lt"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI
+ [(lt:VF_128_256
+(match_operand: 1 "register_operand" "x")
+(match_operand: 2 "const0_operand" "C"))]
+ UNSPEC_MOVMSK))]
+  "TARGET_SSE"
+  "%vmovmsk\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "")])
+
+(define_insn "*_movmsk_zext_lt"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (unspec:SI
+   [(lt:VF_128_256
+  (match_operand: 1 "register_operand" "x")
+  (match_operand: 2 "const0_operand" "C"))]
+   UNSPEC_MOVMSK)))]
+  "TARGET_64BIT && TARGET_SSE"
+  "%vmovmsk\t{%1, %k0|%k0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "")])
+
+(define_insn "*_movmsk_shift"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI
+ [(subreg:VF_128_256
+(ashiftrt:
+  (match_operand: 1 "register_operand" "x")
+  (match_operand:QI 2 "const_int_operand" "n")) 0)]
+ UNSPEC_MOVMSK))]
+  "TARGET_SSE"
+  "%vmovmsk\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "")])
+
+(define_insn "*_movmsk_zext_shift"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (unspec:SI
+   [(subreg:VF_128_256
+  (ashiftrt:
+(match_operand: 1 "register_operand" "x")
+  (match_operand:QI 2 "const_int_operand" "n")) 0)]
+   UNSPEC_MOVMSK)))]
+  "TARGET_64BIT && TARGET_SSE"
+  "%vmovmsk\t{%1, %k0|%k0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "")])
+
 (define_insn "_pmovmskb"
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
@@ -14677,6 +14733,41 @@ (define_insn "*_pmovmskb_zext
UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE2"
   "%vpmovmskb\t{%1, %k0|%k0, %1}"
+  [(set_attr "type" "ssemov")
+   (set (attr "prefix_data16")
+ (if_then_else
+   (match_test "TARGET_AVX")
+ (const_string "*")
+ (const_string "1")))
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "SI")])
+
+(define_insn "*_pmovmskb_lt"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI
+ [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
+   (match_operand:VI1_AVX2 2 "const0_operand" "C"))]
+ UNSPEC_MOVMSK))]
+  "TARGET_SSE2"
+  "%vpmovmskb\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set (attr "prefix_data16")
+ (if_then_else
+   (match_test "TARGET_AVX")
+ (const_string "*")
+ (const_string "1")))
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "SI")])
+
+(define_insn "*_pmovmskb_zext_lt"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (zero_extend:DI
+ (unspec:SI
+   [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
+ (match_operand:VI1_AVX2 2 "const0_operand" "C"))]
+   UNSPEC_MOVMSK)))]
+  "TARGET_64BIT && TARGET_SSE2"
+  "%vpmovmskb\t{%1, %k0|%k0, %1}"
   [(set_attr "type" "ssemov")
(set (attr "prefix_data16")
  (if_then_else
--- gcc/testsuite/g++.target/i386/pr88152.C.jj  2018-11-29 13:25:23.809113651 
+0100
+++ gcc/testsuite/g++.target/i386/pr88152.C 2018-11-29 13:26:20.362168048 
+0100
@@ -0,0 +1,44 @@
+// PR target/88152
+// { dg-do compile }
+// { dg-options "-O2 -mavx2 -std=c++11" }
+// { dg-final { scan-assembler-times "vpmovmskb\[^\n\r]*xmm" 6 } }
+// { dg-final { scan-assembler-times "vpmovmskb\[^\n\r]*ymm" 6 } }
+// { dg-final { scan-assembler-times "vmovmskps\[^\n\r]*xmm" 4 } }
+// { dg-final { scan-assembler-times "vmovmskps\[^\n\r]*ymm" 4 } }
+// { dg-final { scan-assembler-times "vmovmskpd\[^\n\r]*xmm" 4 } }
+// { dg-final { scan-assembler-times "vmovmskpd\[^\n\r]*ymm" 4 } }
+// { dg-final { scan-assembler-not "vpcmpgt|vpcmpeq|vpsra" } }
+
+#include 
+
+template 
+using V [[gnu::vector_size(N)]] = T;
+
+int f0 (V a) { return _mm_movemask_epi8 
(reinterpret_cast<__m128i>

Re: [PATCH] Fix vec_{add,sub} rs6000_gimple_fold_builtin folding (PR target/88234)

2018-11-29 Thread Segher Boessenkool
Hi Jakub,

On Thu, Nov 29, 2018 at 03:00:22PM +0100, Jakub Jelinek wrote:
> vec_add/sub with vector unsigned args is lowered to a builtin which
> has vector signed args and therefore if not -fwrapv it is undefined if
> signed integer overflow occurs in those vectors.
> 
> The following patch fixes it to make sure that those builtins are folded
> to PLUS/MINUS_EXPR done on unsigned vectors instead, so there is no UB.
> If it makes it through to RTL expansion, it makes no difference, but
> for UBSan it matters a lot and also I'd say if e.g. we'd extract just one
> scalar from the resulting vector, we'd optimize it just to a scalar +/- and
> could very well optimize based on lack of UB.
> 
> I've looked at a couple of other builtins, but e.g. with vec_mul* couldn't
> trigger anything problematic.
> 
> Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?

Okay for trunk, and backports too if you want any.  Thanks!


Segher


> 2018-11-29  Jakub Jelinek  
> 
>   PR target/88234
>   * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): For
>   vec_add and vec_sub builtins, perform PLUS_EXPR or MINUS_EXPR
>   in unsigned_type_for instead of vector integral type where overflow
>   doesn't wrap.
> 
>   * gcc.dg/ubsan/pr88234.c: New test.


Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-29 Thread Jakub Jelinek
On Thu, Nov 29, 2018 at 08:47:15AM -0500, Ed Smith-Rowland wrote:
> --- include/std/deque (revision 266567)
> +++ include/std/deque (working copy)
> @@ -58,6 +58,7 @@
>  #pragma GCC system_header
>  
>  #include 
> +#include  // For remove and remove_if

Isn't that too expensive, especially in non-C++20 modes?
stl_algo.h is larger than 200KB.

So, could these includes be either guarded with __cplusplus > 201703
if they are only needed for C++20, or better have bits/stl_remove.h
containing just those?

Jakub


Re: [PATCH v3] Make function clone name numbering independent.

2018-11-29 Thread Michael Ploujnikov
On 2018-11-28 5:49 p.m., Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Nov 28, 2018 at 04:09:14PM -0500, Michael Ploujnikov wrote:
>> I've also included a small change to rs6000 which I'm pretty sure is
>> safe, but I have no way of testing.
> 
> Do you have an account on the GCC Compile Farm?
> https://cfarm.tetaneutral.net/
> There are some nice Power machines in there!
> 
> Does this patch dependent on the rest of the series?
> 
> If it tests okay, it is okay for trunk of course.  Thanks!
> 
> One comment about your changelog:
> 
>> 2018-11-28  Michael Ploujnikov  
>>
>>  * config/rs6000/rs6000.c (make_resolver_func): Generate
>>resolver symbol with clone_function_name instead of
>>clone_function_name_numbered.
> 
> Those last two lines should not start with the spaces.  It should be:
> 
> 2018-11-28  Michael Ploujnikov  
> 
>   * config/rs6000/rs6000.c (make_resolver_func): Generate
>   resolver symbol with clone_function_name instead of
>   clone_function_name_numbered.
> 
> 
> Segher
> 

Thanks for the review and suggestion to use the cfarm! I've
successfully regtested this on power9 and committed the stand-alone
change to rs6000.c after fixing the changelog.

- Michael





Re: [testsuite] Require ucn support in gdc.test/compilable/ddoc12.d (PR d/88039)

2018-11-29 Thread Rainer Orth
Hi Iain,

> On Tue, 27 Nov 2018 at 20:32, Rainer Orth  
> wrote:
>>
>> Hi Mike,
>>
>> > On Nov 27, 2018, at 2:18 AM, Rainer Orth 
>> > wrote:
>> >>
>> >> Some assemblers, including the Solaris one, don't support UTF-8
>> >> identifiers, which breaks the gdc.test/compilable/ddoc12.d testcase as
>> >> reported in the PR.
>> >
>> > So, another style of fix, would be to change the binding from the language
>> > front-end to encode unsupported symbols using a fixed, documented, abi
>> > defined technique.
>>
>> there's been some discussion on this in the PR.  Joseph's suggestion was
>> to follow the system compilers if this were done, and indeed they do
>> encode UTF-8 identifiers in some way which could either be
>> reverse-engineered or a spec obtained.  However, given Iain's stance
>> towards UTF-8 identifiers in D, I very much doubt this is worth the
>> effort.  Ultimately, it's his call, of course.
>>
>
> Not only my stance, but how those maintaining the core language
> generally agree as a whole.  Encoding UTF-8 characters in symbols is
> not part of the D ABI, so that is something that would need convincing
> upstream.
>
> There is a third way however: all compilable/ddoc* tests don't
> actually require us to compile the module all the way down to object
> code; the only thing that really needs to be tested is the Ddoc
> generator itself.  That would mean setting 'dg-do-what compile' and
> building with the compiler option -fdoc; a dg-final check for the
> presence of the file ddoc12.html is the minimum that needs to be done
> for these tests to be considered passed.
>
> I'll have a look into doing it that way tomorrow.

that would be even better of course, also saving some testing time.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-29 Thread Jonathan Wakely

On 29/11/18 08:47 -0500, Ed Smith-Rowland wrote:

Fixed with 266616.


Thanks!


Index: include/std/deque
===
--- include/std/deque   (revision 266567)
+++ include/std/deque   (working copy)
@@ -58,6 +58,7 @@
#pragma GCC system_header

#include 
+#include  // For remove and remove_if


Please only include  when __cplusplus > 201703L
otherwise we include hundreds of kilobytes of code just for these tiny
functions that aren't even defined in the default C++14 dialect.

At some point I'll split std::remove and std::remove_if into their own
header, and we can just include that instead.



[PATCH] Fix vec_{add,sub} rs6000_gimple_fold_builtin folding (PR target/88234)

2018-11-29 Thread Jakub Jelinek
Hi!

vec_add/sub with vector unsigned args is lowered to a builtin which
has vector signed args and therefore if not -fwrapv it is undefined if
signed integer overflow occurs in those vectors.

The following patch fixes it to make sure that those builtins are folded
to PLUS/MINUS_EXPR done on unsigned vectors instead, so there is no UB.
If it makes it through to RTL expansion, it makes no difference, but
for UBSan it matters a lot and also I'd say if e.g. we'd extract just one
scalar from the resulting vector, we'd optimize it just to a scalar +/- and
could very well optimize based on lack of UB.

I've looked at a couple of other builtins, but e.g. with vec_mul* couldn't
trigger anything problematic.

Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?

2018-11-29  Jakub Jelinek  

PR target/88234
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): For
vec_add and vec_sub builtins, perform PLUS_EXPR or MINUS_EXPR
in unsigned_type_for instead of vector integral type where overflow
doesn't wrap.

* gcc.dg/ubsan/pr88234.c: New test.

--- gcc/config/rs6000/rs6000.c.jj   2018-11-29 08:41:29.753806139 +0100
+++ gcc/config/rs6000/rs6000.c  2018-11-29 11:39:04.783862074 +0100
@@ -15371,6 +15371,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_
   enum rs6000_builtins fn_code
 = (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl);
   tree arg0, arg1, lhs, temp;
+  enum tree_code bcode;
   gimple *g;
 
   size_t uns_fncode = (size_t) fn_code;
@@ -15409,10 +15410,32 @@ rs6000_gimple_fold_builtin (gimple_stmt_
 case P8V_BUILTIN_VADDUDM:
 case ALTIVEC_BUILTIN_VADDFP:
 case VSX_BUILTIN_XVADDDP:
+  bcode = PLUS_EXPR;
+do_binary:
   arg0 = gimple_call_arg (stmt, 0);
   arg1 = gimple_call_arg (stmt, 1);
   lhs = gimple_call_lhs (stmt);
-  g = gimple_build_assign (lhs, PLUS_EXPR, arg0, arg1);
+  if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (lhs)))
+ && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE (lhs))))
+   {
+ /* Ensure the binary operation is performed in a type
+that wraps if it is integral type.  */
+ gimple_seq stmts = NULL;
+ tree type = unsigned_type_for (TREE_TYPE (lhs));
+ tree uarg0 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+type, arg0);
+ tree uarg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+type, arg1);
+ tree res = gimple_build (&stmts, gimple_location (stmt), bcode,
+  type, uarg0, uarg1);
+ gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+ g = gimple_build_assign (lhs, VIEW_CONVERT_EXPR,
+  build1 (VIEW_CONVERT_EXPR,
+  TREE_TYPE (lhs), res));
+ gsi_replace (gsi, g, true);
+ return true;
+   }
+  g = gimple_build_assign (lhs, bcode, arg0, arg1);
   gimple_set_location (g, gimple_location (stmt));
   gsi_replace (gsi, g, true);
   return true;
@@ -15424,13 +15447,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_
 case P8V_BUILTIN_VSUBUDM:
 case ALTIVEC_BUILTIN_VSUBFP:
 case VSX_BUILTIN_XVSUBDP:
-  arg0 = gimple_call_arg (stmt, 0);
-  arg1 = gimple_call_arg (stmt, 1);
-  lhs = gimple_call_lhs (stmt);
-  g = gimple_build_assign (lhs, MINUS_EXPR, arg0, arg1);
-  gimple_set_location (g, gimple_location (stmt));
-  gsi_replace (gsi, g, true);
-  return true;
+  bcode = MINUS_EXPR;
+  goto do_binary;
 case VSX_BUILTIN_XVMULSP:
 case VSX_BUILTIN_XVMULDP:
   arg0 = gimple_call_arg (stmt, 0);
--- gcc/testsuite/gcc.dg/ubsan/pr88234.c.jj 2018-11-29 12:13:06.879735598 
+0100
+++ gcc/testsuite/gcc.dg/ubsan/pr88234.c2018-11-29 12:13:54.594937165 
+0100
@@ -0,0 +1,29 @@
+/* PR target/88234 */
+/* { dg-do run { target { powerpc*-*-* && vmx_hw } } } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-fsanitize=signed-integer-overflow 
-fno-sanitize-recover=signed-integer-overflow -O2 -maltivec" } */
+
+#include 
+
+__attribute__((noipa)) vector unsigned int
+f1 (vector unsigned int x, vector unsigned int y)
+{
+  return vec_add (x, y);
+}
+
+__attribute__((noipa)) vector unsigned int
+f2 (vector unsigned int x, vector unsigned int y)
+{
+  return vec_sub (x, y);
+}
+
+int
+main ()
+{
+  vector unsigned int x = { __INT_MAX__, -__INT_MAX__, __INT_MAX__ - 3, 
-__INT_MAX__ + 4 };
+  vector unsigned int y = { 1, -1, 4, -5 };
+  vector unsigned int z = f1 (x, y);
+  f2 (z, x);
+  f2 (z, y);
+  return 0;
+}

Jakub


Re: [PATCH] PR libstdc++/67843 set shared_ptr lock policy at build-time

2018-11-29 Thread Richard Biener
On Wed, Nov 28, 2018 at 3:09 PM Jonathan Wakely  wrote:
>
> On 28/11/18 14:46 +0100, Richard Biener wrote:
> >On Wed, Nov 28, 2018 at 1:30 PM Jonathan Wakely  wrote:
> >>
> >> On 28/11/18 11:11 +, Jonathan Wakely wrote:
> >> >On 28/11/18 11:46 +0100, Richard Biener wrote:
> >> >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely  
> >> >>wrote:
> >> >>>
> >> >>>This resolves a longstanding issue where the lock policy for shared_ptr
> >> >>>reference counting depends on compilation options when the header is
> >> >>>included, so that different -march options can cause ABI changes. For
> >> >>>example, objects compiled with -march=armv7 will use atomics to
> >> >>>synchronize reference counts, and objects compiled with -march=armv5t
> >> >>>will use a mutex. That means the shared_ptr control block will have a
> >> >>>different layout in different objects, causing ODR violations and
> >> >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as
> >> >>>well as PR libstdc++/67843.
> >> >>>
> >> >>>The solution is to decide on the lock policy at build time, when
> >> >>>libstdc++ is configured. The configure script checks for the
> >> >>>availability of the necessary atomic built-ins for the target and fixes
> >> >>>that choice permanently. Different -march flags used to compile user
> >> >>>code will not cause changes to the lock policy. This results in an ABI
> >> >>>change for certain compilations, but only where there was already an ABI
> >> >>>incompatibility between the libstdc++.so library and objects built with
> >> >>>an incompatible -march option. In general, this means a more stable ABI
> >> >>>that isn't silently altered when -march flags make additional atomic ops
> >> >>>available.
> >> >>>
> >> >>>To force a target to use "atomic" or "mutex" the new configure option
> >> >>>--with-libstdcxx-lock-policy can be used.
> >> >>>
> >> >>>In order to turn ODR violations into linker errors, the uses of
> >> >>>shared_ptr in filesystem directory iterators have been replaced
> >> >>>with __shared_ptr, and explicit instantiations are declared. This
> >> >>>ensures that object files using those types cannot link to libstdc++
> >> >>>libs unless they use the same lock policy.
> >> >>
> >> >>Would it be possible to have both in libstdc++ and with different 
> >> >>mangling of
> >> >>different kind ensure compatibility between different user CUs?  Or is
> >> >>that too awkward for the user to get right?
> >> >
> >> >It would mean duplicating a lot more code, which is already duplicated
> >> >once for the std::string ABI, so we'd have four permutations!
> >> >
> >> >It still wouldn't ensure compatibility between different user CUs,
> >> >only between any user CU and any build of libstdc++. Different user
> >> >CUs would still disagree on the ABI of the types, and so couldn't pass
> >> >them between CUs. I see no advantage to supporting that for the
> >> >std::filesystem library (unlike std::string and std::iostream, which
> >> >are probably used in the majority of CUs).
> >> >
> >> >I do not want to get to the point where every type in libstdc++ exists
> >> >multiple times and you select some combination via command-line flags.
> >> >It's already becoming unmanageable with multiple std::string and long
> >> >double ABIs.
> >>
> >> Also, in practice, I don't see a need. The common cases where this bug
> >> arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
> >> mean that you can always use "atomic". For bare metal ARM toolchains
> >> you probably don't want to be mixing CUs built against different
> >> versions of libstdc++ anyway. You have your toolchain for the board,
> >> and you use that. If it is configured to use "atomic", then that's
> >> what you get. If it's configured to use "mutex", then that's what you
> >> get. You probably don't want to use a toolchain configured for a
> >> different board, but you could use --with-libstdcxx-lock-policy to
> >> ensure a consistent policy across those toolchains if needed.
> >>
> >> It's also possible for problems to arise on i386. If you build GCC for
> >> i386 then it will choose "mutex" and be incompatible with a libstdc++
> >> built for i486 or later. In that case, you could use the option
> >> --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
> >> then you'd need to link to libatomic to provide the ops, as there are
> >> no kernel helpers for i386 in libgcc).
> >
> >I think the default should be part of the platform ABI somehow,
> >like the i386 incompatibility should better not arise.
>
> What do you suggest to ensure that?
>
> Just make i386 default to the "atomic" lock policy, and require anybody
> silly enough to configure for i386 to either link to libatomic, or
> explicitly use --with-libstdcxx-lock-policy=mutex if they really want
> to be incompatible with default i486 builds?

Something like that, yes.

> >I suppose
> >libstdc++ configury doesn't read in gcc/config.gcc but does it
> >have sth similar

Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-29 Thread Ed Smith-Rowland

On 11/28/18 7:25 PM, Jonathan Wakely wrote:

On 28/11/18 12:12 -0500, Ed Smith-Rowland wrote:

Index: testsuite/21_strings/basic_string/erasure.cc
===
--- testsuite/21_strings/basic_string/erasure.cc (nonexistent)
+++ testsuite/21_strings/basic_string/erasure.cc    (working copy)
@@ -0,0 +1,53 @@
+// { dg-do run { target c++2a } }
+


None of these new tests actually run by default, because they are
gated to only run for C++2a and the default is gnu++14. That means
they're all skipped as UNSUPPORTED.

(When I add new tests I always try to remember to check the
testsuite/libstdc++.sum file to make sure they are actually running).

The tests need an explicit -std option added via dg-options, which has
to come before the dg-do directive, otherwise the target check still
uses the default options i.e.

// { dg-options "-std=gnu++2a" }
// { dg-do run { target c++2a } }

With that added, most of them start to FAIL:

FAIL: 23_containers/deque/erasure.cc (test for excess errors)
UNRESOLVED: 23_containers/deque/erasure.cc compilation failed to produce executable

FAIL: 23_containers/unordered_set/erasure.cc (test for excess errors)
UNRESOLVED: 23_containers/unordered_set/erasure.cc compilation failed to produce executable

FAIL: 23_containers/vector/erasure.cc (test for excess errors)
UNRESOLVED: 23_containers/vector/erasure.cc compilation failed to produce executable

FAIL: experimental/deque/erasure.cc (test for excess errors)
UNRESOLVED: experimental/deque/erasure.cc compilation failed to produce executable

FAIL: experimental/forward_list/erasure.cc (test for excess errors)
UNRESOLVED: experimental/forward_list/erasure.cc compilation failed to produce executable

FAIL: experimental/list/erasure.cc (test for excess errors)
UNRESOLVED: experimental/list/erasure.cc compilation failed to produce executable

FAIL: experimental/vector/erasure.cc (test for excess errors)
UNRESOLVED: experimental/vector/erasure.cc compilation failed to produce executable



The errors are all like:

In file included from /home/jwakely/src/gcc/gcc/libstdc++-v3/testsuite/23_containers/vector/erasure.cc:21:
/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/vector: In function 'void std::erase_if(std::vector<_Tp, _Alloc>&, _Predicate)':
/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/vector:101: error: 'remove_if' is not a member of 'std'; did you mean 'remove_cv'?
/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/vector: In function 'void std::erase(std::vector<_Tp, _Alloc>&, const _Up&)':
/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/vector:109: error: 'remove' is not a member of 'std'; did you mean 'move'?


This is because std::remove and std::remove_if are defined in
<algorithm> which is not included.

Could you please fix this ASAP 



Sorry about that.

Fixed with 266616.

Ed



2018-11-29  Edward Smith-Rowland  <3dw...@verizon.net>

Fix erasure goofs.
* include/experimental/deque: Make inline.
* include/std/deque: Include bits/stl_algo.h.
(erase, erase_if): Make inline.
* include/std/string: Include bits/stl_algo.h.
* include/std/unordered_set: Add erase, erase_if!
* include/std/vector: Include bits/stl_algo.h.
* testsuite/21_strings/basic_string/erasure.cc:
Add { dg-options "-std=gnu++2a" }.
* testsuite/23_containers/deque/erasure.cc: Ditto.
* testsuite/23_containers/forward_list/erasure.cc: Ditto.
* testsuite/23_containers/list/erasure.cc: Ditto.
* testsuite/23_containers/map/erasure.cc: Ditto.
* testsuite/23_containers/set/erasure.cc: Ditto.
* testsuite/23_containers/unordered_map/erasure.cc: Ditto.
* testsuite/23_containers/unordered_set/erasure.cc: Ditto.
* testsuite/23_containers/vector/erasure.cc: Ditto.

Index: include/experimental/deque
===
--- include/experimental/deque  (revision 266566)
+++ include/experimental/deque  (working copy)
@@ -46,7 +46,7 @@
 inline namespace fundamentals_v2
 {
   template<typename _Tp, typename _Alloc, typename _Predicate>
-void
+inline void
 erase_if(deque<_Tp, _Alloc>& __cont, _Predicate __pred)
 {
   __cont.erase(std::remove_if(__cont.begin(), __cont.end(), __pred),
@@ -54,7 +54,7 @@
 }
 
   template<typename _Tp, typename _Alloc, typename _Up>
-void
+inline void
 erase(deque<_Tp, _Alloc>& __cont, const _Up& __value)
 {
   __cont.erase(std::remove(__cont.begin(), __cont.end(), __value),
Index: include/std/deque
===
--- include/std/deque   (revision 266567)
+++ include/std/deque   (working copy)
@@ -58,6 +58,7 @@
 #pragma GCC system_header
 
 #include 
+#include <bits/stl_algo.h> // For remove and remove_if
 #include 
 #include 
 #include 
@@ -92,7 +93,7 @@
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp, typename _Alloc, typename _Predicate>
-void
+inline void
 erase_if(deque<_Tp, 

Re: [PATCH 1/2] asm qualifiers (PR55681)

2018-11-29 Thread Segher Boessenkool
+cc: C and C++ maintainers.  Sorry I forgot before :-/

On Tue, Oct 30, 2018 at 05:30:33PM +, Segher Boessenkool wrote:
> PR55681 observes that currently only one qualifier is allowed for
> inline asm, so that e.g. "volatile asm" is allowed, "const asm" is also
> okay (with a warning), but "const volatile asm" gives an error.  Also
> "const const asm" is an error (while "const const int" is okay for C),
> "goto" has to be last, and "_Atomic" isn't handled at all.
> 
> This patch fixes all these.  It allows any order of qualifiers (and
> goto), allows them all for C, allows duplications for C.  For C++ it
> still allows only a single volatile and single goto, but in any order.
> 
> 
> 2018-10-30  Segher Boessenkool  
> 
> gcc/c/
>   PR inline-asm/55681
>   * c-parser.c (c_parser_asm_statement): Update grammar.  Allow any
>   combination of type-qualifiers and goto in any order, with repetitions
>   allowed.
> 
> gcc/cp/
>   PR inline-asm/55681
>   * parser.c (cp_parser_asm_definition): Update grammar.  Allow any
>   combination of volatile and goto in any order, without repetitions.
> 
> ---
>  gcc/c/c-parser.c | 66 +++-
>  gcc/cp/parser.c  | 77 +---
>  2 files changed, 89 insertions(+), 54 deletions(-)
> 
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index ee66ce8..ce9921e 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -6283,23 +6283,31 @@ c_parser_for_statement (c_parser *parser, bool ivdep, unsigned short unroll,
>  }
>  
>  /* Parse an asm statement, a GNU extension.  This is a full-blown asm
> -   statement with inputs, outputs, clobbers, and volatile tag
> +   statement with inputs, outputs, clobbers, and volatile and goto tag
> allowed.
>  
> +   asm-qualifier:
> + type-qualifier
> + goto
> +
> +   asm-qualifier-list:
> + asm-qualifier-list asm-qualifier
> + asm-qualifier
> +
> asm-statement:
> - asm type-qualifier[opt] ( asm-argument ) ;
> - asm type-qualifier[opt] goto ( asm-goto-argument ) ;
> + asm asm-qualifier-list[opt] ( asm-argument ) ;
>  
> asm-argument:
>   asm-string-literal
>   asm-string-literal : asm-operands[opt]
>   asm-string-literal : asm-operands[opt] : asm-operands[opt]
> - asm-string-literal : asm-operands[opt] : asm-operands[opt] : 
> asm-clobbers[opt]
> -
> -   asm-goto-argument:
> + asm-string-literal : asm-operands[opt] : asm-operands[opt] \
> +   : asm-clobbers[opt]
>   asm-string-literal : : asm-operands[opt] : asm-clobbers[opt] \
> : asm-goto-operands
>  
> +   The form with asm-goto-operands is valid if and only if the
> +   asm-qualifier-list contains goto, and is the only allowed form in that 
> case.
> Qualifiers other than volatile are accepted in the syntax but
> warned for.  */
>  
> @@ -6313,30 +6321,32 @@ c_parser_asm_statement (c_parser *parser)
>  
>gcc_assert (c_parser_next_token_is_keyword (parser, RID_ASM));
>c_parser_consume_token (parser);
> -  if (c_parser_next_token_is_keyword (parser, RID_VOLATILE))
> -{
> -  quals = c_parser_peek_token (parser)->value;
> -  c_parser_consume_token (parser);
> -}
> -  else if (c_parser_next_token_is_keyword (parser, RID_CONST)
> -|| c_parser_next_token_is_keyword (parser, RID_RESTRICT))
> -{
> -  warning_at (c_parser_peek_token (parser)->location,
> -   0,
> -   "%E qualifier ignored on asm",
> -   c_parser_peek_token (parser)->value);
> -  quals = NULL_TREE;
> -  c_parser_consume_token (parser);
> -}
> -  else
> -quals = NULL_TREE;
>  
> +  quals = NULL_TREE;
>is_goto = false;
> -  if (c_parser_next_token_is_keyword (parser, RID_GOTO))
> -{
> -  c_parser_consume_token (parser);
> -  is_goto = true;
> -}
> +  for (bool done = false; !done; )
> +switch (c_parser_peek_token (parser)->keyword)
> +  {
> +  case RID_VOLATILE:
> + quals = c_parser_peek_token (parser)->value;
> + c_parser_consume_token (parser);
> + break;
> +  case RID_CONST:
> +  case RID_RESTRICT:
> +  case RID_ATOMIC:
> + warning_at (c_parser_peek_token (parser)->location,
> + 0,
> + "%E qualifier ignored on asm",
> + c_parser_peek_token (parser)->value);
> + c_parser_consume_token (parser);
> + break;
> +  case RID_GOTO:
> + is_goto = true;
> + c_parser_consume_token (parser);
> + break;
> +  default:
> + done = true;
> +  }
>  
>/* ??? Follow the C++ parser rather than using the
>   lex_untranslated_string kludge.  */
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index ebe326e..d44fd4d 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -19196,22 +19196,34 @@ cp_parser_using_directive (cp_parser* parser)
>  
>  /* Parse an asm-definition.
>  
> +  asm-qualifier:
> + 

[PATCH] Fix PR88243

2018-11-29 Thread Richard Biener


The following adjusts the pattern def-seq generation to give all
stmts in the def-seq vect_internal_def type and not retain the
vect_nested_cycle_def or similar def type from the main stmt.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

From b2ff48c042f116ea2f0dd84da48daae36e3e2000 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Thu, 29 Nov 2018 12:50:45 +0100
Subject: [PATCH] fix-pr88243

2018-11-29  Richard Biener  

PR tree-optimization/88243
* tree-vect-patterns.c (vect_mark_pattern_stmts): Set the def
type of all pattern-sequence stmts to vect_internal_def.

* gcc.dg/torture/pr88243.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/torture/pr88243.c b/gcc/testsuite/gcc.dg/torture/pr88243.c
new file mode 100644
index 000..17ca38fb684
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr88243.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+
+int a, b, c;
+
+void d()
+{
+  int e, f;
+  for (; a; a++)
+{
+  e = (__UINTPTR_TYPE__)d;
+  b = 0;
+  for (; b < 2; b++)
+   {
+ f = e = e / 2;
+ c = c + f;
+   }
+}
+}
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 2b56d85afc5..39b6f822d19 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -4723,7 +4723,15 @@ vect_mark_pattern_stmts (stmt_vec_info orig_stmt_info, gimple *pattern_stmt,
   if (def_seq)
 for (gimple_stmt_iterator si = gsi_start (def_seq);
 !gsi_end_p (si); gsi_next (&si))
-  vect_init_pattern_stmt (gsi_stmt (si), orig_stmt_info, pattern_vectype);
+  {
+   stmt_vec_info pattern_stmt_info
+ = vect_init_pattern_stmt (gsi_stmt (si),
+   orig_stmt_info, pattern_vectype);
+   /* Stmts in the def sequence are not vectorizable cycle or
+  induction defs, instead they should all be vect_internal_def
+  feeding the main pattern stmt which retains this def type.  */
+   STMT_VINFO_DEF_TYPE (pattern_stmt_info) = vect_internal_def;
+  }
 
   if (orig_pattern_stmt)
 {


[PATCH] S/390: Add support for section anchors

2018-11-29 Thread Ilya Leoshkevich
Bootstrapped and regtested on s390x-redhat-linux and
x86_64-redhat-linux.

gcc/ChangeLog:

2018-09-12  Ilya Leoshkevich  

* common/config/s390/s390-common.c (s390_option_init_struct):
Use section anchors by default.
* config/s390/s390.c (s390_check_symref_alignment): Handle
anchors.
(TARGET_MAX_ANCHOR_OFFSET): Use short displacement.
* output.h (assemble_align): Pass `align' as unsigned int, so
that the value 0x80000000, which corresponds to `aligned(1 << 28)',
would pass the `align > BITS_PER_UNIT' check.
* varasm.c (assemble_align): Likewise.

gcc/testsuite/ChangeLog:

2018-09-12  Ilya Leoshkevich  

* gcc.target/s390/nodatarel-1.c: Expect .LANCHOR0@GOTENT instead
of a@GOTENT.
* gcc.target/s390/section-anchors.c: New test.
* gcc.target/s390/section-anchors2.c: New test.
* gcc.target/s390/section-anchors3.c: New test.
---
 gcc/common/config/s390/s390-common.c  |  3 +++
 gcc/config/s390/s390.c| 19 ++
 gcc/output.h  |  2 +-
 gcc/testsuite/gcc.target/s390/nodatarel-1.c   |  2 +-
 .../gcc.target/s390/section-anchors.c | 14 ++
 .../gcc.target/s390/section-anchors2.c| 26 +++
 .../gcc.target/s390/section-anchors3.c| 11 
 gcc/varasm.c  |  2 +-
 8 files changed, 76 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/section-anchors.c
 create mode 100644 gcc/testsuite/gcc.target/s390/section-anchors2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/section-anchors3.c

diff --git a/gcc/common/config/s390/s390-common.c b/gcc/common/config/s390/s390-common.c
index 2f728957e25..59b24654c82 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -74,6 +74,9 @@ s390_option_init_struct (struct gcc_options *opts)
   /* By default, always emit DWARF-2 unwind info.  This allows debugging
  without maintaining a stack frame back-chain.  */
   opts->x_flag_asynchronous_unwind_tables = 1;
+
+  /* Enable section anchors by default.  */
+  opts->x_flag_section_anchors = 1;
 }
 
 /* Implement TARGET_HANDLE_OPTION.  */
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 277d555440b..62868995ca6 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -4187,6 +4187,20 @@ s390_check_symref_alignment (rtx addr, HOST_WIDE_INT alignment)
 
   if (GET_CODE (symref) == SYMBOL_REF)
 {
+  /* s390_encode_section_info is not called for anchors, since they don't
+have corresponding VAR_DECLs.  Therefore, we cannot rely on
+SYMBOL_FLAG_NOTALIGN{2,4,8}_P returning useful information.  */
+  if (SYMBOL_REF_ANCHOR_P (symref))
+   {
+ HOST_WIDE_INT block_offset = SYMBOL_REF_BLOCK_OFFSET (symref);
+ unsigned int block_alignment = (SYMBOL_REF_BLOCK (symref)->alignment
+ / BITS_PER_UNIT);
+
+ gcc_assert (block_offset >= 0);
+ return ((block_offset & (alignment - 1)) == 0
+ && block_alignment >= alignment);
+   }
+
   /* We have load-relative instructions for 2-byte, 4-byte, and
 8-byte alignment so allow only these.  */
   switch (alignment)
@@ -16338,6 +16352,11 @@ s390_case_values_threshold (void)
 #undef TARGET_CASE_VALUES_THRESHOLD
 #define TARGET_CASE_VALUES_THRESHOLD s390_case_values_threshold
 
+/* Use only short displacement, since long displacement is not available for
+   the floating point instructions.  */
+#undef TARGET_MAX_ANCHOR_OFFSET
+#define TARGET_MAX_ANCHOR_OFFSET 0xfff
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-s390.h"
diff --git a/gcc/output.h b/gcc/output.h
index 89e484c..b2f0cc168eb 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -219,7 +219,7 @@ extern void assemble_external (tree);
 extern void assemble_zeros (unsigned HOST_WIDE_INT);
 
 /* Assemble an alignment pseudo op for an ALIGN-bit boundary.  */
-extern void assemble_align (int);
+extern void assemble_align (unsigned int);
 
 /* Assemble a string constant with the specified C string as contents.  */
 extern void assemble_string (const char *, int);
diff --git a/gcc/testsuite/gcc.target/s390/nodatarel-1.c b/gcc/testsuite/gcc.target/s390/nodatarel-1.c
index 1d589a10947..f53332f901d 100644
--- a/gcc/testsuite/gcc.target/s390/nodatarel-1.c
+++ b/gcc/testsuite/gcc.target/s390/nodatarel-1.c
@@ -29,7 +29,7 @@ bar (int b)
   a = b;
 }
 
-/* { dg-final { scan-assembler-times "a@GOTENT" 3 } } */
+/* { dg-final { scan-assembler-times "\\.LANCHOR\\d+@GOTENT" 3 } } */
 
 /* The exrl target is a label_ref which should not be affected at
all.  */
diff --git a/gcc/testsuite/gcc.target/s390/section-anchors.c b/gcc/testsuite/gcc.target/s390/section-anchors.c
new file mode 100644
index 000..68a6a393e31
--- /dev/null
+++ b/gcc/testsuite/

[PATCH] PR libstdc++/88119 use alignof in std::alignment_of, not __alignof__

2018-11-29 Thread Jonathan Wakely

Now that __alignof__ and alignof sometimes disagree it matters which one
we use. The standard says that std::alignment_of::value equals
alignof(T), so we need to use that.

Change the only uses of alignment_of to use __alignof__ to avoid a
change in alignment.

PR libstdc++/88119
* include/ext/aligned_buffer.h (__aligned_membuf): Add comment.
(__aligned_buffer): Use __alignof__ instead of std::alignment_of.
* include/std/type_traits (alignment_of): Use alignof instead of
__alignof__.
* testsuite/20_util/alignment_of/value.cc: Fix test to check values
match alignof not __alignof__, as required by the standard.

Tested powerpc64le-linux, committed to trunk.

This should be changed on gcc-8-branch too.


commit b8b2d1978d67bddd0caf115cff1bf437c6609eef
Author: Jonathan Wakely 
Date:   Thu Nov 29 12:07:48 2018 +

PR libstdc++/88119 use alignof in std::alignment_of, not __alignof__

Now that __alignof__ and alignof sometimes disagree it matters which one
we use. The standard says that std::alignment_of::value equals
alignof(T), so we need to use that.

Change the only uses of alignment_of to use __alignof__ to avoid a
change in alignment.

PR libstdc++/88119
* include/ext/aligned_buffer.h (__aligned_membuf): Add comment.
(__aligned_buffer): Use __alignof__ instead of std::alignment_of.
* include/std/type_traits (alignment_of): Use alignof instead of
__alignof__.
* testsuite/20_util/alignment_of/value.cc: Fix test to check values
match alignof not __alignof__, as required by the standard.

diff --git a/libstdc++-v3/include/ext/aligned_buffer.h b/libstdc++-v3/include/ext/aligned_buffer.h
index 81fb797723c..2fc8f12e62d 100644
--- a/libstdc++-v3/include/ext/aligned_buffer.h
+++ b/libstdc++-v3/include/ext/aligned_buffer.h
@@ -49,6 +49,8 @@ namespace __gnu_cxx
   // Target macro ADJUST_FIELD_ALIGN can produce different alignment for
   // types when used as class members. __aligned_membuf is intended
   // for use as a class member, so align the buffer as for a class member.
+  // Since GCC 8 we could just use alignof(_Tp) instead, but older
+  // versions of non-GNU compilers might still need this trick.
   struct _Tp2 { _Tp _M_t; };
 
   alignas(__alignof__(_Tp2::_M_t)) unsigned char _M_storage[sizeof(_Tp)];
@@ -86,11 +88,10 @@ namespace __gnu_cxx
   // This type is still used to avoid an ABI change.
   template<typename _Tp>
     struct __aligned_buffer
-    : std::aligned_storage<sizeof(_Tp), std::alignment_of<_Tp>::value>
+    : std::aligned_storage<sizeof(_Tp), __alignof__(_Tp)>
     {
       typename
-	std::aligned_storage<sizeof(_Tp), std::alignment_of<_Tp>::value>::type
-	_M_storage;
+	std::aligned_storage<sizeof(_Tp), __alignof__(_Tp)>::type _M_storage;
 
   __aligned_buffer() = default;
 
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 60094f9897b..727a5451c56 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1286,7 +1286,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// alignment_of
   template<typename _Tp>
     struct alignment_of
-    : public integral_constant<std::size_t, __alignof__(_Tp)> { };
+    : public integral_constant<std::size_t, alignof(_Tp)> { };
 
   /// rank
   template<typename>
diff --git a/libstdc++-v3/testsuite/20_util/alignment_of/value.cc b/libstdc++-v3/testsuite/20_util/alignment_of/value.cc
index 7c46a58bf28..f73008daabd 100644
--- a/libstdc++-v3/testsuite/20_util/alignment_of/value.cc
+++ b/libstdc++-v3/testsuite/20_util/alignment_of/value.cc
@@ -20,16 +20,22 @@
 #include <type_traits>
 #include <testsuite_tr1.h>
 
+template<typename T>
+constexpr bool test()
+{
+  return __gnu_test::test_property<std::alignment_of, T>(alignof(T));
+}
+
 void test01()
 {
-  using std::alignment_of;
-  using namespace __gnu_test;
-
-  static_assert(test_property<alignment_of, char>(__alignof__(char)), "");
-  static_assert(test_property<alignment_of, short>(__alignof__(short)), "");
-  static_assert(test_property<alignment_of, int>(__alignof__(int)), "");
-  static_assert(test_property<alignment_of, double>(__alignof__(double)), "");
-  static_assert(test_property<alignment_of, int[4]>(__alignof__(int[4])), "");
-  static_assert(test_property<alignment_of, ClassType>(__alignof__(ClassType)), "");
+  static_assert(test(), "");
+  static_assert(test(), "");
+  static_assert(test(), "");
+  static_assert(test(), "");
+  static_assert(test(), "");
+  static_assert(test(), "");
+  static_assert(test(), "");
+  static_assert(test(), "");
+  static_assert(test(), "");
+  static_assert(test<__gnu_test::ClassType>(), "");
 }


Re: [PATCH 1/5][libbacktrace] Factor out backtrace_vector_free

2018-11-29 Thread Tom de Vries
On 29-11-18 00:26, Ian Lance Taylor wrote:
> On Wed, Nov 28, 2018 at 3:15 PM, Tom de Vries  wrote:
>>
>> this patch factors out new function backtrace_vector_free.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> OK for trunk?
> 
> We should only add new files if we really absolutely must, as this
> package is copied around to a lot of places (e.g.,
> libsanitizer/libbacktrace) and adding files here requires
> modifications in all those places.
> 

I see, thanks for the explanation.

How about this patch? It does not add a file, though it does add an
external function which requires a rename in libsanitizer/libbacktrace
(I don't know whether that requires changes in any other places).

[ Also, it inlines backtrace-vector.c into alloc.c and mmap.c, so it
duplicates code. If that is not acceptable, I could move it to
internal.h as static inline or static ATTRIBUTE_UNUSED. ]

Thanks,
- Tom
[libbacktrace] Factor out backtrace_vector_free

Factor out new function backtrace_vector_free.

Bootstrapped and reg-tested on x86_64.

2018-11-28  Tom de Vries  

	* alloc.c (backtrace_vector_free): New function, factored out of ...
	* dwarf.c (read_line_info): ... here.
	* mmap.c (backtrace_vector_free): Copy from alloc.c.
	* internal.h (backtrace_vector_free): Declare.

	* libbacktrace/backtrace-rename.h (backtrace_vector_free): Define to
	__asan_backtrace_vector_free.

---
 libbacktrace/alloc.c | 12 
 libbacktrace/dwarf.c |  4 +---
 libbacktrace/internal.h  |  7 +++
 libbacktrace/mmap.c  | 12 
 libsanitizer/libbacktrace/backtrace-rename.h |  1 +
 5 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/libbacktrace/alloc.c b/libbacktrace/alloc.c
index 522b59dd59f..d442b5d1762 100644
--- a/libbacktrace/alloc.c
+++ b/libbacktrace/alloc.c
@@ -165,3 +165,15 @@ backtrace_vector_release (struct backtrace_state *state ATTRIBUTE_UNUSED,
 
   return 1;
 }
+
+/* Free the space managed by VEC.  */
+
+void
+backtrace_vector_free (struct backtrace_state *state,
+		   struct backtrace_vector *vec,
+		   backtrace_error_callback error_callback, void *data)
+{
+  vec->alc += vec->size;
+  vec->size = 0;
+  backtrace_vector_release (state, vec, error_callback, data);
+}
diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 4e93f120820..13d0aa4bcd8 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -2057,9 +2057,7 @@ read_line_info (struct backtrace_state *state, struct dwarf_data *ddata,
   return 1;
 
  fail:
-  vec.vec.alc += vec.vec.size;
-  vec.vec.size = 0;
-  backtrace_vector_release (state, &vec.vec, error_callback, data);
+  backtrace_vector_free (state, &vec.vec, error_callback, data);
   free_line_header (state, hdr, error_callback, data);
   *lines = (struct line *) (uintptr_t) -1;
   *lines_count = 0;
diff --git a/libbacktrace/internal.h b/libbacktrace/internal.h
index bff8ed470e4..710693bcf66 100644
--- a/libbacktrace/internal.h
+++ b/libbacktrace/internal.h
@@ -257,6 +257,13 @@ extern int backtrace_vector_release (struct backtrace_state *state,
  backtrace_error_callback error_callback,
  void *data);
 
+/* Free the space managed by VEC.  This will reset VEC.  */
+
+extern void backtrace_vector_free (struct backtrace_state *state,
+   struct backtrace_vector *vec,
+   backtrace_error_callback error_callback,
+   void *data);
+
 /* Read initial debug data from a descriptor, and set the
fileline_data, syminfo_fn, and syminfo_data fields of STATE.
Return the fileline_fn field in *FILELINE_FN--this is done this way so
diff --git a/libbacktrace/mmap.c b/libbacktrace/mmap.c
index 9f896a1bb99..080868c8a91 100644
--- a/libbacktrace/mmap.c
+++ b/libbacktrace/mmap.c
@@ -325,3 +325,15 @@ backtrace_vector_release (struct backtrace_state *state,
 vec->base = NULL;
   return 1;
 }
+
+/* Free the space managed by VEC.  */
+
+void
+backtrace_vector_free (struct backtrace_state *state,
+		   struct backtrace_vector *vec,
+		   backtrace_error_callback error_callback, void *data)
+{
+  vec->alc += vec->size;
+  vec->size = 0;
+  backtrace_vector_release (state, vec, error_callback, data);
+}
diff --git a/libsanitizer/libbacktrace/backtrace-rename.h b/libsanitizer/libbacktrace/backtrace-rename.h
index 2555fe508c2..e2494e3686d 100644
--- a/libsanitizer/libbacktrace/backtrace-rename.h
+++ b/libsanitizer/libbacktrace/backtrace-rename.h
@@ -15,6 +15,7 @@
 #define backtrace_vector_finish __asan_backtrace_vector_finish
 #define backtrace_vector_grow __asan_backtrace_vector_grow
 #define backtrace_vector_release __asan_backtrace_vector_release
+#define backtrace_vector_free __asan_backtrace_vector_free
 
 #define cplus_demangle_builtin_types __asan_cplus_demangle_builtin_types
 #define cplus_demangle_fill_ctor __asan_cplus_demangle_fill_ctor


[ping x3] Re: [PATCH 0/2] asm qualifiers (PR55681) and asm inline

2018-11-29 Thread Segher Boessenkool
Ping x3.

On Sat, Nov 17, 2018 at 08:52:58AM -0600, Segher Boessenkool wrote:
> Ping x2.
> 
> On Sat, Nov 10, 2018 at 06:33:37PM -0600, Segher Boessenkool wrote:
> > Ping.
> > 
> > On Tue, Oct 30, 2018 at 05:30:32PM +, Segher Boessenkool wrote:
> > > Hi!
> > > 
> > > This is the same "asm input" patch as before, but now preceded by a patch
> > > that makes all orderings of volatile/goto/inline valid, also the other 
> > > type
> > > qualifiers for C, and also repetitions for C.
> > > 
> > > Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?
> > > 
> > > 
> > > Segher
> > > 
> > > 
> > >  gcc/c/c-parser.c| 79 
> > > -
> > >  gcc/c/c-tree.h  |  3 +-
> > >  gcc/c/c-typeck.c|  3 +-
> > >  gcc/cp/cp-tree.h|  2 +-
> > >  gcc/cp/parser.c | 92 
> > > +
> > >  gcc/cp/pt.c |  2 +-
> > >  gcc/cp/semantics.c  |  3 +-
> > >  gcc/doc/extend.texi | 10 ++-
> > >  gcc/gimple-pretty-print.c   |  2 +
> > >  gcc/gimple.h| 24 ++-
> > >  gcc/gimplify.c  |  1 +
> > >  gcc/ipa-icf-gimple.c|  3 +
> > >  gcc/testsuite/c-c++-common/torture/asm-inline.c | 53 ++
> > >  gcc/tree-core.h |  3 +
> > >  gcc/tree-inline.c   |  3 +
> > >  gcc/tree.h  |  3 +
> > >  16 files changed, 221 insertions(+), 65 deletions(-)
> > >  create mode 100644 gcc/testsuite/c-c++-common/torture/asm-inline.c
> > > 
> > > -- 
> > > 1.8.3.1


Re: The GCC 7 branch is now frozen

2018-11-29 Thread Eric Botcazou
> Thanks for your support.

You're campaigning for something? ;-)

-- 
Eric Botcazou


Re: [PATCH] PR880088 Enable -Wtrampolines for no-exec-stack targets with -Wall.

2018-11-29 Thread Mark Wielaard
Hi Segher,

On Wed, 2018-11-28 at 15:00 -0600, Segher Boessenkool wrote:
> On Tue, Nov 27, 2018 at 08:54:07PM +0100, Mark Wielaard wrote:
> > On Tue, 2018-11-27 at 12:37 -0600, Segher Boessenkool wrote:
> > > > Define a new target macro TARGET_HAS_DEFAULT_NOEXEC_STACK for
> > > > those
> > > > targets
> > > > that have a non-executable default stack based on when they
> > > > call
> > > > file_end_indicate_exec_stack.
> > > 
> > > As Paul says, that name isn't so good.
> > > 
> > > TARGET_NEEDS_MAKING_THE_STACK_EXECUTABLE_FOR_TRAMPOLINES, or
> > > similar?
> > 
> > Would the slightly shorter
> > TARGET_NEEDS_EXEC_STACK_MARKER_FOR_TRAMPOLINES be descriptive
> > enough?
> 
> "MARKER", is that some official name for it?  If no, just "FLAG"?
> Fine with me, sure.

FLAG it is then! It is even a little shorter :)

> > > > diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h
> > > > index 0c67634..9330acf 100644
> > > > --- a/gcc/config/rs6000/sysv4.h
> > > > +++ b/gcc/config/rs6000/sysv4.h
> > > > @@ -972,6 +972,11 @@ ncrtn.o%s"
> > > >  /* Generate entries in .fixup for relocatable addresses.  */
> > > >  #define RELOCATABLE_NEEDS_FIXUP 1
> > > >  
> > > > +#if defined (POWERPC_LINUX) || defined (POWERPC_FREEBSD)
> > > > +  #define TARGET_HAS_DEFAULT_NOEXEC_STACK (TARGET_32BIT \
> > > > +  || DEFAULT_ABI ==
> > > > ABI_ELFv2)
> > > > +#endif
> > > 
> > > I don't think this belongs in sysv4.h .
> > 
> > I might have gotten lost in the tree of defines/macros.
> > 
> > There are two sysv4.h files gcc/config/rs6000/sysv4.h and
> > gcc/config/powerpcspe/sysv4.h
> 
> Forget about powerpcspe please, I am talking about rs6000 only.

I don't know the differences between the config backends, but it looks
like at least some of the configs were just copy/pasted, which might
explain why they both define things the same way (in sysv4.h). They
also both use the same TARGET_ASM_FILE_END hook (set to
rs6000_elf_file_end), although that function too seems copy/pasted
between powerpcspe.c and rs6000.c.

Could you explain why I should forget about powerpcspe, and/or why the
place and way to set up the new target macro would differ between the
two config backends?

> You want linux.h and freebsd.h, maybe the "64" versions of those separately.
> Or put this in rs6000.h.  sysv4.h is a random header for this, it doesn't
> belong there.

The reason I added it to sysv4.h is that it matches where the
TARGET_ASM_FILE_END hook is set up. It might make sense to have
specialized TARGET_ASM_FILE_END hooks for {linux,freebsd}(64).h too,
but that is probably a different discussion.

So if I understand correctly you would like to have:

rs6000/linux.h and rs6000/freebsd.h:
#define TARGET_NEEDS_EXEC_STACK_FLAG_FOR_TRAMPOLINES 1

rs6000/linux64.h and rs6000/freebsd64.h:
#define TARGET_NEEDS_EXEC_STACK_FLAG_FOR_TRAMPOLINES (DEFAULT_ABI == ABI_ELFv2)

If so, shouldn't the same be done for powerpcspe?
Sorry, you told me to forget about it, but I just cannot :)

Thanks,

Mark

