Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Richard Biener
On Wed, 18 Jul 2018, Martin Sebor wrote:

> > > > +  while (TREE_CODE (chartype) != INTEGER_TYPE)
> > > > +    chartype = TREE_TYPE (chartype);
> > > This is a bit concerning.  First under what conditions is chartype not
> > > going to be an INTEGER_TYPE?  And under what conditions will extracting
> > > its type ultimately lead to something that is an INTEGER_TYPE?
> > 
> > chartype is usually (maybe even always) a pointer type here:
> > 
> >   const char a[] = "123";
> >   extern int i;
> >   n = strlen (&a[i]);
> 
> But your hunch was correct that the loop isn't safe because
> the element type need not be an integer (I didn't know/forgot
> that the function is called for non-strings too).  The loop
> should be replaced by:
> 
>   while (TREE_CODE (chartype) == ARRAY_TYPE
>          || TREE_CODE (chartype) == POINTER_TYPE)
>     chartype = TREE_TYPE (chartype);

As this function may be called "late" you need to cope with
the middle-end ignoring type changes and thus happily
passing int *** directly rather than (char *) of that.

Also doesn't the above yield int for int *[]?

I guess you really want

   if (POINTER_TYPE_P (chartype))
     chartype = TREE_TYPE (chartype);
   while (TREE_CODE (chartype) == ARRAY_TYPE)
     chartype = TREE_TYPE (chartype);

?
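
To make the difference between the two stripping strategies concrete, here is a minimal sketch using a toy type tree. The `struct type`, `enum kind` and both function names are hypothetical stand-ins for this illustration only, not GCC's tree API.

```c
#include <stddef.h>

/* Toy stand-in for type nodes; an assumption for this sketch, not GCC's
   tree representation.  */
enum kind { KIND_INTEGER, KIND_POINTER, KIND_ARRAY };

struct type
{
  enum kind kind;
  const struct type *inner;  /* Pointed-to or element type; NULL for integers.  */
};

/* Strip pointers and arrays in any order, as in the quoted replacement loop.  */
const struct type *
strip_any (const struct type *t)
{
  while (t->kind == KIND_ARRAY || t->kind == KIND_POINTER)
    t = t->inner;
  return t;
}

/* Strip at most one pointer level, then any number of array levels.  */
const struct type *
strip_pointer_then_arrays (const struct type *t)
{
  if (t->kind == KIND_POINTER)
    t = t->inner;
  while (t->kind == KIND_ARRAY)
    t = t->inner;
  return t;
}

/* Models of int, int *, int *[] and a pointer to int *[].  */
const struct type int_type = { KIND_INTEGER, NULL };
const struct type int_ptr = { KIND_POINTER, &int_type };
const struct type int_ptr_array = { KIND_ARRAY, &int_ptr };
const struct type ptr_to_int_ptr_array = { KIND_POINTER, &int_ptr_array };

/* Models of char[] (an array of an integer type) and a pointer to it.  */
const struct type char_array = { KIND_ARRAY, &int_type };
const struct type ptr_to_char_array = { KIND_POINTER, &char_array };
```

In this model, `strip_any` reduces a pointer to `int *[]` all the way to the integer type, so a later integer check would wrongly accept it. `strip_pointer_then_arrays` stops at the inner pointer, which the integer check then rejects, while a pointer to a character array still reduces to an integer type.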

>   if (TREE_CODE (chartype) != INTEGER_TYPE)
>     return NULL;
> 
> I will update the patch before committing.
> 
> FWIW, it seems like it would be useful to extend the function to
> non-string arguments.  That way it would be able to return array
> initializers in cases like this:
> 
>   const struct A { int a[2]; }
>     a = { { 1, 2 } },
>     b = { { 1, 2 } };
> 
>   int f (void)
>   {
>     return __builtin_memcmp (&a, &b, sizeof a);
>   }
> 
> which would in turn make it possible to fold the result of
> such calls analogously to strlen or strcmp.
> 
> Martin
> 
> 
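
As a rough illustration of the result such a fold would have to reproduce, the sketch below evaluates the comparison at run time. The third object `c` and the two function names are additions made for this example; they are not part of the quoted test case.

```c
#include <string.h>

struct A { int a[2]; };

/* Two objects with identical initializers and one that differs.  */
static const struct A a = { { 1, 2 } };
static const struct A b = { { 1, 2 } };
static const struct A c = { { 1, 3 } };

/* Identical initializers: a folded call would have to yield 0.  */
int
compare_a_b (void)
{
  return memcmp (&a, &b, sizeof a);
}

/* Differing initializers: a folded call would have to yield nonzero.  */
int
compare_a_c (void)
{
  return memcmp (&a, &c, sizeof a);
}
```

Since `struct A` holds only an array of `int`, it has no padding bytes between members, so the byte-wise comparison depends only on the initializer values.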

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-19 Thread Martin Liška
On 07/18/2018 06:28 PM, Thomas Preudhomme wrote:
> Hi Martin,
> 
> Why is this needed when -mfpu does not seem to need it for instance?

Because -mfpu is an enum-typed option:

mfpu=
Target RejectNegative Joined Enum(arm_fpu) Var(arm_fpu_index) Init(TARGET_FPU_auto) Save
Specify the name of the target floating point hardware/format.

In contrast, -mtune is a string-typed option:

mtune=
Target RejectNegative ToLower Joined Var(arm_tune_string)
Tune code for the given processor.

That's why -mfpu enum values are automatically printed in --help=target output,
whereas -mtune values are not.
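
The filtering rule can be modelled with a small sketch. Everything here (`struct toy_enum`, `toy_enum_is_listed`, the `referenced` parameter) is a hypothetical simplification of the check in print_filtered_help with the proposed flag added, not the real data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an enum record; the real struct cl_enum has more fields.  */
struct toy_enum
{
  const char *help;  /* Heading text, or NULL if the enum is undocumented.  */
  bool force_help;   /* Models the proposed ForceHelp flag.  */
};

/* Return true if the enum's values would be listed in the help output.
   REFERENCED models the case where a displayed option was declared with
   Enum(...) and therefore marked this enum for printing.  */
bool
toy_enum_is_listed (const struct toy_enum *e, bool referenced)
{
  if (!referenced && !e->force_help)
    return false;
  if (e->help == NULL)
    return false;
  return true;
}
```

Under this model an enum tied to an Enum(...) option is listed without any flag, while an enum that is only consulted by a string-typed option is listed only once `force_help` is set.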

> Regarding the patch:
> 
>> -print "Name(processor_type) Type(enum processor_type)"
>> -print "Known ARM CPUs (for use with the -mcpu= and -mtune= options):\n"
>> +print "Name(processor_type) Type(enum processor_type) ForceHelp"
>> +print "Known ARM CPUs (for use with the -mtune= options):\n"
> 
> Why changing the text beyond adding ForceHelp?

That's probably wrong; you accept the same values for -mcpu as for -mtune,
right?

> 
>> +@item ForceHelp
>> +This property is optional.  If present, enum values is printed
>> +in @option{--help} output.
>> +
> 
> are printed

Yep.

I'm sending an updated version of the patch.

Martin

> 
> Thanks,
> 
> Thomas
> On Wed, 18 Jul 2018 at 16:50, Martin Liška  wrote:
>>
>> Hi.
>>
>> This introduces a new ForceHelp option flag that helps to
>> print valid option enum values that are not directly
>> used as a type of an option.
>>
>> May I please ask ARM folks to test the patch?
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-07-18  Martin Liska  
>>
>> PR driver/83193
>> * config/arm/arm-tables.opt: Add ForceHelp flag for
>> processor_type and arch_name enum types.
>> * config/arm/parsecpu.awk: Likewise.
>> * doc/options.texi: Document new flag ForceHelp.
>> * opt-read.awk: Parse ForceHelp and set it in construction.
>> * optc-gen.awk: Likewise.
>> * opts.c (print_filtered_help): Handle force_help option.
>> * opts.h (struct cl_enum): New field force_help.
>> ---
>>  gcc/config/arm/arm-tables.opt | 6 +++---
>>  gcc/config/arm/parsecpu.awk   | 6 +++---
>>  gcc/doc/options.texi  | 4 
>>  gcc/opt-read.awk  | 3 +++
>>  gcc/optc-gen.awk  | 3 ++-
>>  gcc/opts.c| 3 ++-
>>  gcc/opts.h| 3 +++
>>  7 files changed, 20 insertions(+), 8 deletions(-)
>>
>>

From af9140854ca089577a54cc12602d75b3cee6a3ad Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 20 Feb 2018 10:39:09 +0100
Subject: [PATCH] Show valid options for -march and -mtune in --help=target for
 arm32 (PR driver/83193).

gcc/ChangeLog:

2018-07-18  Martin Liska  

	PR driver/83193
	* config/arm/arm-tables.opt: Add ForceHelp flag for
	processor_type and arch_name enum types.
	* config/arm/parsecpu.awk: Likewise.
	* doc/options.texi: Document new flag ForceHelp.
	* opt-read.awk: Parse ForceHelp and set it in construction.
	* optc-gen.awk: Likewise.
	* opts.c (print_filtered_help): Handle force_help option.
	* opts.h (struct cl_enum): New field force_help.
---
 gcc/config/arm/arm-tables.opt | 6 +++---
 gcc/config/arm/parsecpu.awk   | 4 ++--
 gcc/doc/options.texi  | 4 
 gcc/opt-read.awk  | 3 +++
 gcc/optc-gen.awk  | 3 ++-
 gcc/opts.c| 3 ++-
 gcc/opts.h| 3 +++
 7 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index eacee746a39..cbaa67385d7 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -21,8 +21,8 @@
 ; .
 
 Enum
-Name(processor_type) Type(enum processor_type)
-Known ARM CPUs (for use with the -mcpu= and -mtune= options):
+Name(processor_type) Type(enum processor_type) ForceHelp
+Known ARM CPUs (for use with the -mtune= options):
 
 EnumValue
 Enum(processor_type) String(arm8) Value( TARGET_CPU_arm8)
@@ -298,7 +298,7 @@ EnumValue
 Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
 
 Enum
-Name(arm_arch) Type(int)
+Name(arm_arch) Type(int) ForceHelp
 Known ARM architectures (for use with the -march= option):
 
 EnumValue
diff --git a/gcc/config/arm/parsecpu.awk b/gcc/config/arm/parsecpu.awk
index aabe1b0c64c..c499a5ed0ce 100644
--- a/gcc/config/arm/parsecpu.awk
+++ b/gcc/config/arm/parsecpu.awk
@@ -441,7 +441,7 @@ function gen_opt () {
 boilerplate("md")
 
 print "Enum"
-print "Name(processor_type) Type(enum processor_type)"
+print "Name(processor_type) Type(enum processor_type) ForceHelp"
 print "Known ARM CPUs (for use with the -mcpu= and -mtune= options):\n"
 
 ncpus = split (cpu_list, cpus)
@@ -454,7 +454,7 @@ function gen_opt () {
 }
 
 print "Enum"
-print "Name(arm_arch) Type(int)"
+print "Name(arm_arch) Type(int) ForceHelp"
 print "Known ARM architectures (for use with the -march= option):\n"
 
 narchs = s

Re: [PATCH] Introduce instance discriminators

2018-07-19 Thread Alexandre Oliva
On Jul 18, 2018, Richard Biener  wrote:

> On Wed, Jul 18, 2018 at 8:53 AM Alexandre Oliva  wrote:
>> Instance discriminators are not compatible with LTO, in that the
>> instance mapping is not preserved in LTO dumps.  There are no plans to
>> preserve discriminators in them.

> Because...

... it follows existing practice.  BB discriminators are not saved for
LTO either.  They could be saved along with the CFG, but AFAICT they
aren't.  As for instance discriminators, we might be able to save them
along with LOC information, but that would be quite wasteful, and
because of the way ordinary maps are reconstructed when reading in the
LTO data, we'd end up with yet another internal representation for
line_maps.  I was told there was no interest from our customers in using
the coverage and monitoring aspects of instance discriminators when
performing link-time optimizations, and thus that it made sense to
follow existing practice.


I suspect there might be a way to assign instance discriminator numbers
to individual function DECLs, and then walk up the lexical block tree to
identify the DECL containing the block so as to obtain the discriminator
number.  This would be a lot less efficient, algorithmically speaking,
but, provided that LTO dumps discriminator numbers as part of the decls,
and enough info to reconstruct the lexical block trees, including the
inlined-function enclosing blocks, that should work even for LTO, at
least as long as different decls are maintained for each instance.
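
A minimal sketch of that lookup, assuming a toy block tree: `struct toy_block`, `struct toy_decl` and `toy_block_discriminator` are hypothetical names for illustration and do not correspond to GCC's BLOCK or DECL representation. Returning 0 for "no discriminator" is likewise an assumption of the sketch.

```c
#include <stddef.h>

/* Toy function instance carrying its discriminator number.  */
struct toy_decl
{
  unsigned discriminator;
};

/* Toy lexical block.  DECL is non-NULL only for the outermost block of a
   function instance, including the enclosing block of an inlined one.  */
struct toy_block
{
  const struct toy_block *parent;  /* Enclosing block, or NULL at the top.  */
  const struct toy_decl *decl;
};

/* Walk up from B to the nearest block that belongs to a function instance
   and return that instance's discriminator, or 0 if there is none.  */
unsigned
toy_block_discriminator (const struct toy_block *b)
{
  for (; b != NULL; b = b->parent)
    if (b->decl != NULL)
      return b->decl->discriminator;
  return 0;
}
```

The cost is one walk per query, proportional to the nesting depth, which is the algorithmic inefficiency mentioned above compared with reading a discriminator stored directly with the location.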

Indeed, if this is the case, code ranges of lexical blocks in inlined
subroutines would suffice to identify each instantiation, without the
need for discriminators.  It would be a lot more expensive to gather the
information from that debug info than from discriminators, though.

All this said, there doesn't seem to be much interest in that from Ada
users to justify by itself the pursuit of yet another internal
representation.  I wonder if this sort of discriminator might be of
interest for users of C++ templates as well.

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist


Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-19 Thread Martin Liška
This is the correct version of the patch. Anyway, I'm reconsidering the ForceHelp
attribute; I may implement it somewhat differently. Let me come up with another
version of the patch.

Martin

From 9bfc1400213911b4508e90198df7b2dd11efc85c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 20 Feb 2018 10:39:09 +0100
Subject: [PATCH] Show valid options for -march and -mtune in --help=target for
 arm32 (PR driver/83193).

gcc/ChangeLog:

2018-07-18  Martin Liska  

	PR driver/83193
	* config/arm/arm-tables.opt: Add ForceHelp flag for
	processor_type and arch_name enum types.
	* config/arm/parsecpu.awk: Likewise.
	* doc/options.texi: Document new flag ForceHelp.
	* opt-read.awk: Parse ForceHelp and set it in construction.
	* optc-gen.awk: Likewise.
	* opts.c (print_filtered_help): Handle force_help option.
	* opts.h (struct cl_enum): New field force_help.
---
 gcc/config/arm/arm-tables.opt | 4 ++--
 gcc/config/arm/parsecpu.awk   | 4 ++--
 gcc/doc/options.texi  | 4 
 gcc/opt-read.awk  | 3 +++
 gcc/optc-gen.awk  | 3 ++-
 gcc/opts.c| 3 ++-
 gcc/opts.h| 3 +++
 7 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index eacee746a39..c74229e27d7 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -21,7 +21,7 @@
 ; .
 
 Enum
-Name(processor_type) Type(enum processor_type)
+Name(processor_type) Type(enum processor_type) ForceHelp
 Known ARM CPUs (for use with the -mcpu= and -mtune= options):
 
 EnumValue
@@ -298,7 +298,7 @@ EnumValue
 Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
 
 Enum
-Name(arm_arch) Type(int)
+Name(arm_arch) Type(int) ForceHelp
 Known ARM architectures (for use with the -march= option):
 
 EnumValue
diff --git a/gcc/config/arm/parsecpu.awk b/gcc/config/arm/parsecpu.awk
index aabe1b0c64c..c499a5ed0ce 100644
--- a/gcc/config/arm/parsecpu.awk
+++ b/gcc/config/arm/parsecpu.awk
@@ -441,7 +441,7 @@ function gen_opt () {
 boilerplate("md")
 
 print "Enum"
-print "Name(processor_type) Type(enum processor_type)"
+print "Name(processor_type) Type(enum processor_type) ForceHelp"
 print "Known ARM CPUs (for use with the -mcpu= and -mtune= options):\n"
 
 ncpus = split (cpu_list, cpus)
@@ -454,7 +454,7 @@ function gen_opt () {
 }
 
 print "Enum"
-print "Name(arm_arch) Type(int)"
+print "Name(arm_arch) Type(int) ForceHelp"
 print "Known ARM architectures (for use with the -march= option):\n"
 
 narchs = split (arch_list, archs)
diff --git a/gcc/doc/options.texi b/gcc/doc/options.texi
index b3ca9f6fce6..af77ad78e8c 100644
--- a/gcc/doc/options.texi
+++ b/gcc/doc/options.texi
@@ -120,6 +120,10 @@ being described by this record.
 This property is required; it says what value (representable as
 @code{int}) should be used for the given string.
 
+@item ForceHelp
+This property is optional.  If present, enum values are printed
+in @option{--help} output.
+
 @item Canonical
 This property is optional.  If present, it says the present string is
 the canonical one among all those with the given value.  Other strings
diff --git a/gcc/opt-read.awk b/gcc/opt-read.awk
index 2072958e6ba..6d2be9e99d7 100644
--- a/gcc/opt-read.awk
+++ b/gcc/opt-read.awk
@@ -89,6 +89,9 @@ BEGIN {
 			enum_index[name] = n_enums
 			enum_unknown_error[name] = unknown_error
 			enum_help[name] = $3
+			enum_force_help[name] = test_flag("ForceHelp", props, "true")
+			if (enum_force_help[name] == "")
+			  enum_force_help[name] = "false"
 			n_enums++
 		}
 		else if ($1 == "EnumValue")  {
diff --git a/gcc/optc-gen.awk b/gcc/optc-gen.awk
index bf177e86330..5c4f4239db0 100644
--- a/gcc/optc-gen.awk
+++ b/gcc/optc-gen.awk
@@ -167,7 +167,8 @@ for (i = 0; i < n_enums; i++) {
 	print "cl_enum_" name "_data,"
 	print "sizeof (" enum_type[name] "),"
 	print "cl_enum_" name "_set,"
-	print "cl_enum_" name "_get"
+	print "cl_enum_" name "_get,"
+	print "" enum_force_help[name]
 	print "  },"
 }
 print "};"
diff --git a/gcc/opts.c b/gcc/opts.c
index b8ae8756b4f..214ef806cd5 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1337,7 +1337,8 @@ print_filtered_help (unsigned int include_flags,
 {
   unsigned int j, pos;
 
-  if (opts->x_help_enum_printed[i] != 1)
+  if (opts->x_help_enum_printed[i] != 1
+	  && !cl_enums[i].force_help)
 	continue;
   if (cl_enums[i].help == NULL)
 	continue;
diff --git a/gcc/opts.h b/gcc/opts.h
index 3723bdbf95b..c8777b3cd6a 100644
--- a/gcc/opts.h
+++ b/gcc/opts.h
@@ -193,6 +193,9 @@ struct cl_enum
 
   /* Function to get the value of a variable of this type.  */
   int (*get) (const void *var);
+
+  /* Force enum to be printed in help.  */
+  bool force_help;
 };
 
 extern const struct cl_enum cl_enums[];
-- 
2.18.0



Re: RFC: Patch to implement Aarch64 SIMD ABI

2018-07-19 Thread Richard Sandiford
Hi,

Thanks for doing this.

Steve Ellcey  writes:
> This is a patch to support the Aarch64 SIMD ABI [1] in GCC.  I intend
> to eventually follow this up with two more patches; one to define the
> TARGET_SIMD_CLONE* macros and one to improve the GCC register
> allocation/usage when calling SIMD functions.
>
> The significant difference between the standard ARM ABI and the SIMD ABI
> is that in the normal ABI a callee saves only the lower 64 bits of registers
> V8-V15, whereas in the SIMD ABI the callee must save all 128 bits of registers
> V8-V23.
>
> This patch checks for SIMD functions and saves the extra registers when
> needed.  It does not change the caller behaviour, so with just this patch
> there may be values saved by both the caller and callee.  This is not
> efficient, but it is correct code.
>
> This patch bootstraps and passes the GCC testsuite but that only verifies
> I haven't broken anything, it doesn't validate the handling of SIMD functions.
> I tried to write some tests, but I could never get GCC to generate code
> that would save the FP callee-save registers in the prologue.  Complex code
> might generate spills and fills but it never triggered the prologue/epilogue
> code to save V8-V23.  If anyone has ideas on how to write a test that would
> cause GCC to generate this code I would appreciate some ideas.  Just doing
> lots of calculations with lots of intermediate values doesn't seem to be enough.

Probably easiest to use asm clobbers, e.g.:

void __attribute__ ((aarch64_vector_pcs))
f (void)
{
  asm volatile ("" ::: "s8", "s13");
}

This also lets you control exactly which registers are saved.

> @@ -4105,7 +4128,8 @@ aarch64_layout_frame (void)
>{
>   /* If there is an alignment gap between integer and fp callee-saves,
>  allocate the last fp register to it if possible.  */
> - if (regno == last_fp_reg && has_align_gap && (offset & 8) == 0)
> + if (regno == last_fp_reg && has_align_gap
> + && !simd_function && (offset & 8) == 0)
> {
>   cfun->machine->frame.reg_offset[regno] = max_int_offset;
>   break;
> @@ -4117,7 +4141,7 @@ aarch64_layout_frame (void)
>   else if (cfun->machine->frame.wb_candidate2 == INVALID_REGNUM
>&& cfun->machine->frame.wb_candidate1 >= V0_REGNUM)
> cfun->machine->frame.wb_candidate2 = regno;
> - offset += UNITS_PER_WORD;
> + offset += simd_function ? UNITS_PER_VREG : UNITS_PER_WORD;
>}
>  
>offset = ROUND_UP (offset, STACK_BOUNDARY / BITS_PER_UNIT);
> @@ -4706,8 +4730,11 @@ aarch64_process_components (sbitmap components, bool prologue_p)
>while (regno != last_regno)
>  {
>/* AAPCS64 section 5.1.2 requires only the bottom 64 bits to be saved
> -  so DFmode for the vector registers is enough.  */
> -  machine_mode mode = GP_REGNUM_P (regno) ? E_DImode : E_DFmode;
> +  so DFmode for the vector registers is enough.  For simd functions
> + we want to save the entire register.  */
> +  machine_mode mode = GP_REGNUM_P (regno) ? E_DImode
> + : (aarch64_simd_function_p (cfun->decl) ? E_TFmode : E_DFmode);

This condition also occurs in aarch64_push_regs and aarch64_pop_regs.
It'd probably be worth splitting it out into a subfunction.
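
Such a helper might look roughly like the sketch below. The `toy_` names, the mode enumerators and the register numbering are assumptions made so the example is self-contained; they are not the aarch64 backend's definitions.

```c
#include <stdbool.h>

/* Stand-ins for E_DImode, E_DFmode and E_TFmode.  */
enum toy_mode { TOY_DImode, TOY_DFmode, TOY_TFmode };

/* Illustrative numbering only: general registers below this, vector
   registers from here on.  */
#define TOY_V0_REGNUM 32

/* Models GP_REGNUM_P.  */
static bool
toy_gp_regnum_p (unsigned regno)
{
  return regno < TOY_V0_REGNUM;
}

/* Return the mode in which REGNO should be saved and restored.
   SIMD_FUNCTION models aarch64_simd_function_p (cfun->decl).  */
enum toy_mode
toy_reg_save_mode (bool simd_function, unsigned regno)
{
  if (toy_gp_regnum_p (regno))
    return TOY_DImode;
  /* Vector register: full 128 bits for SIMD functions, low 64 otherwise.  */
  return simd_function ? TOY_TFmode : TOY_DFmode;
}
```

With one such function, the component, push and pop code paths would all pick the save mode from the same place instead of repeating the nested conditional.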

I think you also need to handle the writeback cases, which should work
for Q registers too.  This will mean extra loadwb_pair and storewb_pair
patterns.

LGTM otherwise FWIW.

> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index f284e74..d11474e 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -500,6 +500,8 @@ extern unsigned aarch64_architecture_version;
>  #define PR_LO_REGNUM_P(REGNO)\
>(((unsigned) (REGNO - P0_REGNUM)) <= (P7_REGNUM - P0_REGNUM))
>  
> +#define FP_SIMD_SAVED_REGNUM_P(REGNO)\
> +  (((unsigned) (REGNO - V8_REGNUM)) <= (V23_REGNUM - V8_REGNUM))

(We should probably rewrite these to use IN_RANGE at some point,
but I agree it's better to be consistent until then.)
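
For comparison, here is a sketch of the two styles side by side. `TOY_IN_RANGE` is a simplified stand-in written for this example rather than a copy of GCC's macro, and the register numbers are illustrative assumptions.

```c
#include <stdbool.h>

/* Illustrative register numbers; assumptions for this sketch.  */
enum { TOY_V8_REGNUM = 40, TOY_V23_REGNUM = 55 };

/* Simplified IN_RANGE-style check with inclusive bounds.  The unsigned
   subtraction wraps for values below LOWER, so one comparison suffices.  */
#define TOY_IN_RANGE(VALUE, LOWER, UPPER) \
  ((unsigned long) (VALUE) - (unsigned long) (LOWER) \
   <= (unsigned long) (UPPER) - (unsigned long) (LOWER))

/* The open-coded style used by the existing *_REGNUM_P macros.  */
bool
toy_saved_regnum_p_open_coded (int regno)
{
  return ((unsigned) (regno - TOY_V8_REGNUM))
	 <= (TOY_V23_REGNUM - TOY_V8_REGNUM);
}

/* The same predicate expressed through the range macro.  */
bool
toy_saved_regnum_p_in_range (int regno)
{
  return TOY_IN_RANGE (regno, TOY_V8_REGNUM, TOY_V23_REGNUM);
}
```

Both forms rely on the same trick and should agree for every register number; the macro form only names the intent.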

Thanks,
Richard


Re: cleanup cross product code in VRP

2018-07-19 Thread Richard Biener
On Wed, Jul 18, 2018 at 2:05 PM Aldy Hernandez  wrote:
>
> Hi again!
>
> Well, since this hasn't been reviewed and I'm about to overhaul the
> TYPE_OVERFLOW_WRAPS code anyhow, might as well lump it all in one patch.
>
> On 07/16/2018 09:19 AM, Aldy Hernandez wrote:
> > Howdy!
> >
> > I've abstracted out the cross product calculations into its own
> > function, and have adapted it to deal with wide ints so it's more
> > reusable.  It required some shuffling around, and implementing things a
> > bit differently, but things should behave as before.
> >
> > I also renamed vrp_int_const_binop to make its intent clearer,
> > especially now that it's really just a wrapper to wide_int_binop that
> > deals with overflow.
> >
> > (If wide_int_binop_overflow is generally useful, perhaps we could merge
> > it with wide_int_overflow.)
>
> This is the same as the previous patch, plus I'm abstracting the
> TYPE_OVERFLOW_WRAPS code as well.  With this, the code dealing with
> MULT_EXPR in vrp gets reduced to handling value_range specific stuff.
> Yay code re-use!
>
> A few notes:
>
> This is dead code.  I've removed it:
>
> -  /* If we have an unsigned MULT_EXPR with two VR_ANTI_RANGEs,
> -drop to VR_VARYING.  It would take more effort to compute a
> -precise range for such a case.  For example, if we have
> -op0 == 65536 and op1 == 65536 with their ranges both being
> -~[0,0] on a 32-bit machine, we would have op0 * op1 == 0, so
> -we cannot claim that the product is in ~[0,0].  Note that we
> -are guaranteed to have vr0.type == vr1.type at this
> -point.  */
> -  if (vr0.type == VR_ANTI_RANGE
> - && !TYPE_OVERFLOW_UNDEFINED (expr_type))
> -   {
> - set_value_range_to_varying (vr);
> - return;
> -   }
>
> Also, the vrp_int typedef has a weird name, especially when we have
> widest2_int in gimple-fold.c that does the exact same thing.  I've moved the
> common code to wide-int.h and tree.h so we can all share :).
>
> At some point we could move the wide_int_range* and wide_int_binop* code
> into its own file.

Yes.

> Tested on x86-64 Linux.
>
> OK?

+bool
+wide_int_range_cross_product_wrapping (wide_int &res_lb,
+  wide_int &res_ub,

Please rename this to something like wide_int_range_mult_wrapping,
because it only handles multiplication, so that it isn't confused
with the other function.

Otherwise OK (and sorry for the delay in reviewing).

Thanks,
Richard.
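
The arithmetic in the removed comment quoted above (two nonzero 32-bit operands whose wrapping product is zero) can be checked with a short sketch. `mul_wrap32` is a hypothetical helper written for this illustration.

```c
#include <stdint.h>

/* Multiply two 32-bit unsigned values with wrap-around modulo 2^32.
   The product is formed in 64 bits and then truncated, so the result
   does not depend on the width of int.  */
uint32_t
mul_wrap32 (uint32_t a, uint32_t b)
{
  return (uint32_t) ((uint64_t) a * b);
}
```

Here `mul_wrap32 (65536, 65536)` is 0 because 65536 * 65536 equals 2^32, which is why knowing that both operands are nonzero does not by itself show that a wrapping product is nonzero.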


Re: [PATCH][debug] Handle references to skipped params in remap_ssa_name

2018-07-19 Thread Richard Biener
On Wed, Jul 18, 2018 at 3:42 PM Tom de Vries  wrote:
>
> On 07/06/2018 12:28 PM, Richard Biener wrote:
> > On Thu, Jul 5, 2018 at 4:12 PM Tom de Vries  wrote:
> >>
> >> On 07/05/2018 01:39 PM, Richard Biener wrote:
> >>> On Thu, Jul 5, 2018 at 1:25 PM Tom de Vries  wrote:
> 
>  [ was: Re: [testsuite/guality, committed] Prevent optimization of local in
>  vla-1.c ]
> 
>  On Wed, Jul 04, 2018 at 02:32:27PM +0200, Tom de Vries wrote:
> > On 07/03/2018 11:05 AM, Tom de Vries wrote:
> >> On 07/02/2018 10:16 AM, Jakub Jelinek wrote:
> >>> On Mon, Jul 02, 2018 at 09:44:04AM +0200, Richard Biener wrote:
>  Given the array has size i + 1 its upper bound should be 'i' and 'i'
>  should be available via DW_OP_[GNU_]entry_value.
> 
>  I see it is
> 
>  <175>   DW_AT_upper_bound : 10 byte block: 75 1 8 20 24 8 20 26 31 1c
>  (DW_OP_breg5 (rdi): 1; DW_OP_const1u: 32; DW_OP_shl;
>  DW_OP_const1u: 32; DW_OP_shra; DW_OP_lit1; DW_OP_minus)
> 
>  and %rdi is 1.  Not sure why gdb fails to print its length.  Yes, the
>  storage itself doesn't have a location but the type specifies the size.
> 
>  (gdb) ptype a
>  type = char [variable length]
>  (gdb) p sizeof(a)
>  $3 = 0
> 
>  this looks like a gdb bug to me?
> 
> >>
> >> With gdb patch:
> >> ...
> >> diff --git a/gdb/findvar.c b/gdb/findvar.c
> >> index 8ad5e25cb2..ebaff923a1 100644
> >> --- a/gdb/findvar.c
> >> +++ b/gdb/findvar.c
> >> @@ -789,6 +789,8 @@ default_read_var_value
> >>break;
> >>
> >>  case LOC_OPTIMIZED_OUT:
> >> +  if (is_dynamic_type (type))
> >> +   type = resolve_dynamic_type (type, NULL,
> >> +/* Unused address.  */ 0);
> >>return allocate_optimized_out_value (type);
> >>
> >>  default:
> >> ...
> >>
> >> I get:
> >> ...
> >> $ ./gdb -batch -ex "b f1" -ex "r" -ex "p sizeof (a)" vla-1.exe
> >> Breakpoint 1 at 0x4004a8: file vla-1.c, line 17.
> >>
> >> Breakpoint 1, f1 (i=i@entry=5) at vla-1.c:17
> >> 17        return a[0];
> >> $1 = 6
> >> ...
> >>
> >
> > Well, for -O1 and -O2.
> >
> > For O3, I get instead:
> > ...
> > $ ./gdb vla-1.exe -q -batch -ex "b f1" -ex "run" -ex "p sizeof (a)"
> > Breakpoint 1 at 0x4004b0: f1. (2 locations)
> >
> > Breakpoint 1, f1 (i=5) at vla-1.c:17
> > 17        return a[0];
> > $1 = 0
> > ...
> >
> 
>  Hi,
> 
>  When compiling guality/vla-1.c with -O3 -g, vla 'a[i + 1]' in f1 is optimized
>  away, but f1 still contains a debug expression describing the upper bound of the
>  vla (D.1914):
>  ...
>   __attribute__((noinline))
>   f1 (intD.6 iD.1900)
>   {
> 
> saved_stack.1_2 = __builtin_stack_save ();
> # DEBUG BEGIN_STMT
> # DEBUG D#3 => i_1(D) + 1
> # DEBUG D#2 => (long intD.8) D#3
> # DEBUG D#1 => D#2 + -1
> # DEBUG D.1914 => (sizetype) D#1
>  ...
> 
>  Then f1 is cloned to a version f1.constprop with no parameters, eliminating
>  parameter i, and 'DEBUG D#3 => i_1(D) + 1' turns into 'D#3 => NULL'.
>  Consequently, 'print sizeof (a)' yields '0' in gdb.
> >>>
> >>> So does gdb correctly recognize there isn't any size available or do we somehow
> >>> generate invalid debug info, not recognizing that D#3 => NULL means
> >>> "optimized out" and thus all dependent expressions are "optimized out" as well?
> >>>
> >>> That is, shouldn't gdb do
> >>>
> >>> (gdb) print sizeof (a)
> >>> <optimized out>
> >>>
> >>> ?
> >>
> >> The type for the vla gcc is emitting is a DW_TAG_array_type with
> >> DW_TAG_subrange_type without DW_AT_upper_bound or DW_AT_count, which
> >> makes the upper bound value 'unknown'. So I'd say the debug info is valid.
> >
> > OK, that sounds reasonable.  I wonder if languages like Ada have a way
> > to declare an array type with unknown upper bound but known lower bound.
> > For
> >
> > typedef int arr[];
> > arr *x;
> >
> > we generate just
> >
> >  <1><2d>: Abbrev Number: 2 (DW_TAG_typedef)
> > <2e>   DW_AT_name: arr
> > <32>   DW_AT_decl_file   : 1
> > <33>   DW_AT_decl_line   : 1
> > <34>   DW_AT_decl_column : 13
> > <35>   DW_AT_type: <0x39>
> >  <1><39>: Abbrev Number: 3 (DW_TAG_array_type)
> > <3a>   DW_AT_type: <0x44>
> > <3e>   DW_AT_sibling : <0x44>
> >  <2><42>: Abbrev Number: 4 (DW_TAG_subrange_type)
> >  <2><43>: Abbrev Number: 0
> >
> > which does
> >
> > (gdb) ptype arr
> > type = int []
> > (gdb) ptype x
> > type = int (*)[]
> > (gdb) p sizeof (arr)
> > $1 = 0
> >
> > so I wonder whether the pat

Re: [PATCH 2/3] i386: Change indirect_return to function type attribute

2018-07-19 Thread Richard Biener
On Wed, Jul 18, 2018 at 5:33 PM H.J. Lu  wrote:
>
> In
>
> struct ucontext;
> typedef struct ucontext ucontext_t;
>
> extern int (*bar) (ucontext_t *__restrict __oucp,
>const ucontext_t *__restrict __ucp)
>   __attribute__((__indirect_return__));
>
> extern int res;
>
> void
> foo (ucontext_t *oucp, ucontext_t *ucp)
> {
>   res = bar (oucp, ucp);
> }
>
> bar() may return via indirect branch.  This patch changes indirect_return
> to a type attribute so that it is allowed on a variable or type of
> function pointer and ENDBR can be inserted after the call to bar().
>
> Tested on i386 and x86-64.  OK for trunk?

OK.

Richard.

> Thanks.
>
>
> H.J.
> ---
> gcc/
>
> PR target/86560
> * config/i386/i386.c (rest_of_insert_endbranch): Lookup
> indirect_return as function type attribute.
> (ix86_attribute_table): Change indirect_return to function
> type attribute.
> * doc/extend.texi: Update indirect_return attribute.
>
> gcc/testsuite/
>
> PR target/86560
> * gcc.target/i386/pr86560-1.c: New test.
> * gcc.target/i386/pr86560-2.c: Likewise.
> * gcc.target/i386/pr86560-3.c: Likewise.
> ---
>  gcc/config/i386/i386.c| 23 +++
>  gcc/doc/extend.texi   |  5 +++--
>  gcc/testsuite/gcc.target/i386/pr86560-1.c | 16 
>  gcc/testsuite/gcc.target/i386/pr86560-2.c | 16 
>  gcc/testsuite/gcc.target/i386/pr86560-3.c | 17 +
>  5 files changed, 67 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr86560-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr86560-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr86560-3.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index aec739c3974..ac27248370b 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2627,16 +2627,23 @@ rest_of_insert_endbranch (void)
> {
>   rtx call = get_call_rtx_from (insn);
>   rtx fnaddr = XEXP (call, 0);
> + tree fndecl = NULL_TREE;
>
>   /* Also generate ENDBRANCH for non-tail call which
>  may return via indirect branch.  */
> - if (MEM_P (fnaddr)
> - && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF)
> + if (GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF)
> +   fndecl = SYMBOL_REF_DECL (XEXP (fnaddr, 0));
> + if (fndecl == NULL_TREE)
> +   fndecl = MEM_EXPR (fnaddr);
> + if (fndecl
> + && TREE_CODE (TREE_TYPE (fndecl)) != FUNCTION_TYPE
> + && TREE_CODE (TREE_TYPE (fndecl)) != METHOD_TYPE)
> +   fndecl = NULL_TREE;
> + if (fndecl && TYPE_ARG_TYPES (TREE_TYPE (fndecl)))
> {
> - tree fndecl = SYMBOL_REF_DECL (XEXP (fnaddr, 0));
> - if (fndecl
> - && lookup_attribute ("indirect_return",
> -  DECL_ATTRIBUTES (fndecl)))
> + tree fntype = TREE_TYPE (fndecl);
> + if (lookup_attribute ("indirect_return",
> +   TYPE_ATTRIBUTES (fntype)))
> need_endbr = true;
> }
> }
> @@ -46101,8 +46108,8 @@ static const struct attribute_spec ix86_attribute_table[] =
>  ix86_handle_fndecl_attribute, NULL },
>{ "function_return", 1, 1, true, false, false, false,
>  ix86_handle_fndecl_attribute, NULL },
> -  { "indirect_return", 0, 0, true, false, false, false,
> -ix86_handle_fndecl_attribute, NULL },
> +  { "indirect_return", 0, 0, false, true, true, false,
> +NULL, NULL },
>
>/* End element.  */
>{ NULL, 0, 0, false, false, false, false, NULL, NULL }
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 8b4d3fd9de3..edeaec6d872 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -5861,8 +5861,9 @@ foo (void)
>  @item indirect_return
>  @cindex @code{indirect_return} function attribute, x86
>
> -The @code{indirect_return} attribute on a function is used to inform
> -the compiler that the function may return via indirect branch.
> +The @code{indirect_return} attribute can be applied to a function,
> +as well as variable or type of function pointer to inform the
> +compiler that the function may return via indirect branch.
>
>  @end table
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr86560-1.c b/gcc/testsuite/gcc.target/i386/pr86560-1.c
> new file mode 100644
> index 000..a2b702695c5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr86560-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fcf-protection" } */
> +/* { dg-final { scan-assembler-times {

Re: [C++ Patch] PR 59480 ("Missing error diagnostic: friend declaration specifying a default argument must be a definition")

2018-07-19 Thread Rainer Orth
Hi Paolo,

> the below resolves the bug report and its duplicates by implementing - 
> in a rather straightforward way, I believe - the resolution of DR 136,
> which also made it into C++17. Note that in the patch I used permerror instead
> of a plain error for consistency with the other check
> (check_redeclaration_no_default_args) which I added (rather) recently, and
> I'm exploiting that to allow two existing testcases to compile as they
> are. Tested x86_64-linux.

this patch caused

+FAIL: g++.old-deja/g++.mike/p784.C  -std=gnu++11 (test for excess errors)  
+FAIL: g++.old-deja/g++.mike/p784.C  -std=gnu++14 (test for excess errors)
+FAIL: g++.old-deja/g++.mike/p784.C  -std=gnu++98 (test for excess errors)

Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.old-deja/g++.mike/p784.C:1185:21: error: friend declaration of 'String common_prefix(const String&, const String&, int)' specifies default arguments and isn't a definition [-fpermissive]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.old-deja/g++.mike/p784.C:1187:21: error: friend declaration of 'String common_suffix(const String&, const String&, int)' specifies default arguments and isn't a definition [-fpermissive]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.old-deja/g++.mike/p784.C:1226:21: error: friend declaration of 'int readline(istream&, String&, char, int)' specifies default arguments and isn't a definition [-fpermissive]

I'm seeing it on i386-pc-solaris2.11 and sparc-sun-solaris2.11 with
-m32, but according to gcc-testresults it also happens on
i686-pc-linux-gnu, x86_64-pc-linux-gnu and a few others.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Compilation error in simple-object-elf.c

2018-07-19 Thread Richard Biener
On Wed, Jul 18, 2018 at 6:18 PM Eli Zaretskii  wrote:
>
> Hi,
>
> I've built the pretest of GDB 8.2 with MinGW today, and bumped into a
> compilation error in libiberty:
>
>  if [ x"" != x ]; then \
>    gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS  -I. -I./../include   -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic  -D_GNU_SOURCE   ./simple-object-elf.c -o noasan/simple-object-elf.o; \
>  else true; fi
>  gcc -c -DHAVE_CONFIG_H -O2 -gdwarf-4 -g3 -D__USE_MINGW_ACCESS  -I. -I./../include   -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic  -D_GNU_SOURCE ./simple-object-elf.c -o simple-object-elf.o
>  ./simple-object-elf.c: In function 'simple_object_elf_copy_lto_debug_sections':
>  ./simple-object-elf.c:1284:14: error: 'ENOTSUP' undeclared (first use in this function)
> *err = ENOTSUP;
>^~~
>  ./simple-object-elf.c:1284:14: note: each undeclared identifier is reported only once for each function it appears in
>
> Suggested fix:

Works for me, thus OK.  I'm going to check it in to make 8.2.

Richard.

> 2018-07-18  Eli Zaretskii  
>
> * libiberty/simple-object-elf.c (ENOTSUP): If not defined by
>   errno.h, redirect to ENOSYS.
>
> --- libiberty/simple-object-elf.c~0 2018-07-04 18:41:59.0 +0300
> +++ libiberty/simple-object-elf.c   2018-07-18 18:19:39.286654700 +0300
> @@ -22,6 +22,10 @@ Boston, MA 02110-1301, USA.  */
>  #include "simple-object.h"
>
>  #include 
> +/* mingw.org's MinGW doesn't have ENOTSUP.  */
> +#ifndef ENOTSUP
> +# define ENOTSUP ENOSYS
> +#endif
>  #include 
>
>  #ifdef HAVE_STDLIB_H
>
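For readers skimming the thread, the pattern in the quoted fix can be sketched on its own as below.  This is only a minimal illustration: fail_unsupported is a hypothetical helper made up here and is not part of libiberty.

```c
#include <errno.h>

/* Some C libraries (the thread names mingw.org's MinGW) do not define
   ENOTSUP; fall back to ENOSYS there, as the quoted patch does.  */
#ifndef ENOTSUP
# define ENOTSUP ENOSYS
#endif

/* Hypothetical stand-in for a routine that cannot do its job on this
   host: it stores the error code through ERR and also returns it.  */
static int
fail_unsupported (int *err)
{
  *err = ENOTSUP;
  return *err;
}
```

Because the fallback sits right after the errno.h include, any later use of ENOTSUP in the same file compiles whether or not the host defines it.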


Re: [SVE ACLE] Add initial support for arm_sve.h

2018-07-19 Thread Richard Biener
On Wed, Jul 18, 2018 at 8:08 PM Richard Sandiford
 wrote:
>
> This patch adds the target framework for handling the SVE ACLE,
> starting with four functions: svadd, svptrue, svsub and svsubr.
>
> The ACLE has both overloaded and non-overloaded names.  Without
> the equivalent of clang's __attribute__((overloadable)), a header
> file that declared all functions would need three sets of declarations:
>
> - the non-overloaded forms (used for both C and C++)
> - _Generic-based macros to handle overloading in C
> - normal overloaded inline functions for C++
>
> This would likely require a lot of cut-&-paste.  It would probably
> also lead to poor diagnostics and be slow to parse.
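To give a flavour of the second bullet above, a _Generic-based overload in plain C11 might look like the sketch below.  The names add_s32, add_f64 and add are hypothetical stand-ins chosen for illustration; they are not ACLE functions and say nothing about how arm_sve.h would really be laid out.

```c
#include <stdint.h>

/* Non-overloaded forms, one per element type (hypothetical names).  */
static int32_t
add_s32 (int32_t a, int32_t b)
{
  return a + b;
}

static double
add_f64 (double a, double b)
{
  return a + b;
}

/* Overloaded spelling for C: _Generic selects a function based on the
   type of the first argument, then the call is applied to (a, b).  */
#define add(a, b) \
  _Generic ((a), int32_t: add_s32, double: add_f64) (a, b)
```

With this, add ((int32_t) 2, (int32_t) 3) resolves to add_s32, while add (1.5, 2.0) resolves to add_f64.  Every overloaded function needs one such macro on top of its non-overloaded declarations, which is the kind of repetition the message is worried about.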
>
> Another consideration is that some functions require certain arguments
> to be integer constant expressions.  We can (sort of) enforce that
> for calls to built-in functions using resolve_overloaded_builtin,
> but it would be harder to enforce with inline forwarder functions.
>
> For these reasons and others, the patch takes the approach of adding
> a pragma that gets the compiler to insert the definitions itself.
> This requires a slight variation on the existing lang hooks for
> built-in functions, but otherwise it seems to just work.

I guess you did consider auto-generating the three variants from a template?

> It was easier to add the support without enumerating every function
> at build time.  This in turn meant that it was easier if the SVE
> builtins occupied a distinct numberspace from the existing AArch64 ones.
> The patch therefore divides the built-in functions codes into "major"
> and "minor" codes.  At present the major code is just "general" or "SVE".
>
> For now, the patch is only expected to work for fixed-length SVE.
> Some uses of the ACLE do manage to squeak through the front-end
> in the normal vector-length agnostic mode, but that's more by
> accident than design.  We're planning to work on proper frontend
> support for "sizeless" types in parallel with the backend changes.
>
> Other things not handled yet:
>
> - support for the SVE AAPCS
> - handling the built-ins correctly when the compiler is invoked
>   without SVE enabled (e.g. if SVE is enabled later by a pragma)
>
> Both of these are blockers to merging the support into trunk.
>
> The aim is to make sure when adding a function that the function
> produces the expected assembly output for all relevant combinations.
> The patch adds a new check-function-bodies test to try to make
> that easier.
>
> Tested on aarch64-linux-gnu (with and without SVE) and committed
> to aarch64/sve-acle-branch.
>
> Richard
>
>


Re: RFC: Patch to implement Aarch64 SIMD ABI

2018-07-19 Thread Ramana Radhakrishnan
On Thu, Jul 19, 2018 at 8:31 AM, Richard Sandiford
 wrote:
> Hi,
>
> Thanks for doing this.
>
> Steve Ellcey  writes:
>> This is a patch to support the Aarch64 SIMD ABI [1] in GCC.  I intend
>> to eventually follow this up with two more patches; one to define the
>> TARGET_SIMD_CLONE* macros and one to improve the GCC register
>> allocation/usage when calling SIMD functions.
>>
>> The significant difference between the standard ARM ABI and the SIMD ABI
>> is that in the normal ABI a callee saves only the lower 64 bits of registers
>> V8-V15, in the SIMD ABI the callee must save all 128 bits of registers
>> V8-V23.
>>
>> This patch checks for SIMD functions and saves the extra registers when
>> needed.  It does not change the caller behaviour, so with just this patch
>> there may be values saved by both the caller and callee.  This is not
>> efficient, but it is correct code.
>>
>> This patch bootstraps and passes the GCC testsuite but that only verifies
>> I haven't broken anything, it doesn't validate the handling of SIMD 
>> functions.
>> I tried to write some tests, but I could never get GCC to generate code
>> that would save the FP callee-save registers in the prologue.  Complex code
>> might generate spills and fills but it never triggered the prologue/epilogue
>> code to save V8-V23.  If anyone has ideas on how to write a test that would
>> cause GCC to generate this code I would appreciate some ideas.  Just doing
>> lots of calculations with lots of intermediate values doesn't seem to be 
>> enough.
>
> Probably easiest to use asm clobbers, e.g.:
>
> void __attribute__ ((aarch64_vector_pcs))
> f (void)
> {
>   asm volatile ("" ::: "s8", "s13");
> }
>
> This also lets you control exactly which registers are saved.

For just checking the save and restore the technique Richard suggests
is probably sufficient.

One technique I've used in the past is to force the whole testsuite to
be run with a command-line option added just for testing.  In this
case the testsuite's dependence on the C library isn't huge, and there
wouldn't be any vector PCS interfaces to the libraries needed to run
the testsuite, so things should work?

You could cross-check coverage by using lcov. Instructions thanks to
marxin are :

$> ../configure --disable-bootstrap --enable-coverage=opt \
     --enable-languages=c,c++,fortran,go,jit,lto --enable-host-shared
$> make
$> make check
$> find gcc/testsuite/ -name '*.gcda' -exec rm -rf {} \;
$> lcov -d . --capture --output-file gcc.info
$> lcov --remove gcc.info "/usr/*" "/opt/*" "*/gcc/gt-*" \
     "*/gcc/gtype-*" --output-file gcc.info
$> genhtml gcc.info --ignore-errors=source --output-directory html \
     --html-epilog epilog.txt

You will see a warning

genhtml: WARNING: cannot read $builddir/gcc/cfns.gperf

and you can ignore that.

Then just look at the html output that it produces, pretty neat and I
see about 80% coverage on an aarch64-none-linux-gnu test run with
c,c++,fortran,go,lto IIRC.


regards
Ramana

1.
>
>> @@ -4105,7 +4128,8 @@ aarch64_layout_frame (void)
>>{
>>   /* If there is an alignment gap between integer and fp callee-saves,
>>  allocate the last fp register to it if possible.  */
>> - if (regno == last_fp_reg && has_align_gap && (offset & 8) == 0)
>> + if (regno == last_fp_reg && has_align_gap
>> + && !simd_function && (offset & 8) == 0)
>> {
>>   cfun->machine->frame.reg_offset[regno] = max_int_offset;
>>   break;
>> @@ -4117,7 +4141,7 @@ aarch64_layout_frame (void)
>>   else if (cfun->machine->frame.wb_candidate2 == INVALID_REGNUM
>>&& cfun->machine->frame.wb_candidate1 >= V0_REGNUM)
>> cfun->machine->frame.wb_candidate2 = regno;
>> - offset += UNITS_PER_WORD;
>> + offset += simd_function ? UNITS_PER_VREG : UNITS_PER_WORD;
>>}
>>
>>offset = ROUND_UP (offset, STACK_BOUNDARY / BITS_PER_UNIT);
>> @@ -4706,8 +4730,11 @@ aarch64_process_components (sbitmap components, bool 
>> prologue_p)
>>while (regno != last_regno)
>>  {
>>/* AAPCS64 section 5.1.2 requires only the bottom 64 bits to be saved
>> -  so DFmode for the vector registers is enough.  */
>> -  machine_mode mode = GP_REGNUM_P (regno) ? E_DImode : E_DFmode;
>> +  so DFmode for the vector registers is enough.  For simd functions
>> + we want to save the entire register.  */
>> +  machine_mode mode = GP_REGNUM_P (regno) ? E_DImode
>> + : (aarch64_simd_function_p (cfun->decl) ? E_TFmode : E_DFmode);
>
> This condition also occurs in aarch64_push_regs and aarch64_pop_regs.
> It'd probably be worth splitting it out into a subfunction.
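As a rough idea of what such a subfunction could look like, here is a self-contained sketch.  The enum values are hypothetical stand-ins for E_DImode, E_DFmode and E_TFmode, and callee_save_mode / callee_save_size are names made up here; the real helper would live in aarch64.c and use GCC's own types.

```c
#include <stdbool.h>

/* Stand-ins for E_DImode, E_DFmode and E_TFmode (illustration only).  */
enum save_mode { SAVE_DI, SAVE_DF, SAVE_TF };

/* Pick the mode used to save one callee-saved register: general
   registers use the 64-bit integer mode, vector registers use the
   64-bit FP mode normally and the 128-bit mode in SIMD functions.  */
static enum save_mode
callee_save_mode (bool gp_reg_p, bool simd_function_p)
{
  if (gp_reg_p)
    return SAVE_DI;
  return simd_function_p ? SAVE_TF : SAVE_DF;
}

/* Size in bytes of a save slot for MODE, assuming 8-byte words and
   16-byte vector registers as in the quoted hunks.  */
static unsigned
callee_save_size (enum save_mode mode)
{
  return mode == SAVE_TF ? 16 : 8;
}
```

Keeping the decision in one place means the component code and the push/pop code cannot drift apart on which registers get the full 128-bit save.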
>
> I think you also need to handle the writeback cases, which should work
> for Q registers too.  This will mean extra loadwb_pair and storewb_pair
> patterns.
>
> LGTM otherwise FWIW.
>
>> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
>> index f284e74..d11474e 100644
>> --- a/gcc/config/aarch64/aarch64.h

Re: [PATCH] Introduce instance discriminators

2018-07-19 Thread Richard Biener
On Thu, Jul 19, 2018 at 9:21 AM Alexandre Oliva  wrote:
>
> On Jul 18, 2018, Richard Biener  wrote:
>
> > On Wed, Jul 18, 2018 at 8:53 AM Alexandre Oliva  wrote:
> >> Instance discriminators are not compatible with LTO, in that the
> >> instance mapping is not preserved in LTO dumps.  There are no plans to
> >> preserve discriminators in them.
>
> > Because...
>
> ... it follows existing practice.  BB discriminators are not saved for
> LTO either.

Oh, that probably wasn't omitted on purpose.  Cary said it was used
for profiling but I can't see any such use.

Is the instance discriminator stuff also used for profiling?  I agree
that coverage probably isn't depending on LTO.

>  They could be saved along with the CFG, but AFAICT they
> aren't.  As for instance discriminators, we might be able to save them
> along with LOC information, but that would be quite wasteful, and
> because of the way ordinary maps are reconstructed when reading in the
> LTO data, we'd end up with yet another internal representation for
> line_maps.  I was told there was no interest from our customers in using
> the coverage and monitoring aspects of instance discriminators when
> performing link-time optimizations, and thus that it made sense to
> follow existing practice.
>
>
> I suspect there might be a way to assign instance discriminator numbers
> to individual function DECLs, and then walk up the lexical block tree to
> identify the DECL containing the block so as to obtain the discriminator
> number.  This would be a lot less efficient, algorithmically speaking,
> but, provided that LTO dumps discriminator numbers as part of the decls,
> and enough info to reconstruct the lexical block trees, including the
> inlined-function enclosing blocks, that should work even for LTO, at
> least as long as different decls are maintained for each instance.
>
> Indeed, if this is the case, code ranges of lexical blocks in inlined
> subroutines would suffice to identify each instantiation, without the
> need for discriminators.  It would be a lot more expensive to gather the
> information from that debug info than from discriminators, though.
>
> All this said, there doesn't seem to be much interest in that from Ada
> users to justify by itself the pursuit of yet another internal
> representation.  I wonder if this sort of discriminator might be of
> interest for users of C++ templates as well.
>
> --
> Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
> Be the change, be Free! FSF Latin America board member
> GNU Toolchain EngineerFree Software Evangelist


Re: cleanup cross product code in VRP

2018-07-19 Thread Aldy Hernandez




On 07/19/2018 04:18 AM, Richard Biener wrote:

On Wed, Jul 18, 2018 at 2:05 PM Aldy Hernandez  wrote:


Hi again!

Well, since this hasn't been reviewed and I'm about to overhaul the
TYPE_OVERFLOW_WRAPS code anyhow, might as well lump it all in one patch.

On 07/16/2018 09:19 AM, Aldy Hernandez wrote:

Howdy!

I've abstracted out the cross product calculations into its own
function, and have adapted it to deal with wide ints so it's more
reusable.  It required some shuffling around, and implementing things a
bit differently, but things should behave as before.
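For readers unfamiliar with the term, the cross product idea for a multiplication range can be sketched as follows.  This is not the wide_int code from the patch: it is a simplified illustration on int64_t, the names mult_range and range_mult_cross_product are made up here, and it relies on the GCC/Clang __builtin_mul_overflow builtin to detect overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of multiplying two ranges; OK is false if any of the four
   corner products overflowed int64_t.  */
struct mult_range
{
  bool ok;
  int64_t lb;
  int64_t ub;
};

/* Compute [lb0, ub0] * [lb1, ub1] by taking the minimum and maximum
   of the four products of the range bounds.  */
static struct mult_range
range_mult_cross_product (int64_t lb0, int64_t ub0,
                          int64_t lb1, int64_t ub1)
{
  const int64_t op0[2] = { lb0, ub0 };
  const int64_t op1[2] = { lb1, ub1 };
  struct mult_range res = { true, 0, 0 };

  for (int i = 0; i < 2; i++)
    for (int j = 0; j < 2; j++)
      {
        int64_t prod;
        if (__builtin_mul_overflow (op0[i], op1[j], &prod))
          {
            /* Give up rather than report a wrong range.  */
            res.ok = false;
            res.lb = res.ub = 0;
            return res;
          }
        bool first = (i == 0 && j == 0);
        if (first || prod < res.lb)
          res.lb = prod;
        if (first || prod > res.ub)
          res.ub = prod;
      }
  return res;
}
```

For example, [-2, 3] * [4, 5] has corner products -8, -10, 12 and 15, so the resulting range is [-10, 15].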

I also renamed vrp_int_const_binop to make its intent clearer,
especially now that it's really just a wrapper to wide_int_binop that
deals with overflow.

(If wide_int_binop_overflow is generally useful, perhaps we could merge
it with wide_int_overflow.)


This is the same as the previous patch, plus I'm abstracting the
TYPE_OVERFLOW_WRAPS code as well.  With this, the code dealing with
MULT_EXPR in vrp gets reduced to handling value_range specific stuff.
Yay code re-use!

A few notes:

This is dead code.  I've removed it:

-  /* If we have an unsigned MULT_EXPR with two VR_ANTI_RANGEs,
-drop to VR_VARYING.  It would take more effort to compute a
-precise range for such a case.  For example, if we have
-op0 == 65536 and op1 == 65536 with their ranges both being
-~[0,0] on a 32-bit machine, we would have op0 * op1 == 0, so
-we cannot claim that the product is in ~[0,0].  Note that we
-are guaranteed to have vr0.type == vr1.type at this
-point.  */
-  if (vr0.type == VR_ANTI_RANGE
- && !TYPE_OVERFLOW_UNDEFINED (expr_type))
-   {
- set_value_range_to_varying (vr);
- return;
-   }
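The arithmetic in the removed comment can be checked with a small sketch; mul_wrap_u32 is a hypothetical helper written for this note, not something from vrp.

```c
#include <stdint.h>

/* Multiply two 32-bit unsigned values with wrap-around, i.e. keep only
   the low 32 bits of the full 64-bit product.  */
static uint32_t
mul_wrap_u32 (uint32_t op0, uint32_t op1)
{
  return (uint32_t) ((uint64_t) op0 * op1);
}
```

Here mul_wrap_u32 (65536, 65536) is 2^32 truncated to 32 bits, which is 0, even though both operands lie in ~[0,0].  That is why the product of two such anti-ranges cannot itself be claimed to be in ~[0,0].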

Also, the vrp_int typedef has a weird name, especially when we have
widest2_int in gimple-fold.c that does the exact same thing.  I've moved the
common code to wide-int.h and tree.h so we can all share :).

At some point we could move the wide_int_range* and wide_int_binop* code
into its own file.


Yes.


Sometime within the next couple rounds I'll come up with a file name 
that doesn't hurt my eyes.  It seems that the hardest part of 
programming is actually coming up with sensible file and variable names :-/.





Tested on x86-64 Linux.

OK?


+bool
+wide_int_range_cross_product_wrapping (wide_int &res_lb,
+  wide_int &res_ub,

please rename this to sth like wide_int_range_mult_wrapping
because it only handles multiplication to not confuse it with
the other function.


Done.

Thanks.
Aldy



Otherwise OK (and sorry for the delay in reviewing).

Thanks,
Richard.



GCC 8.2 Status Report (2018-07-19), branch frozen for release

2018-07-19 Thread Richard Biener


Status
==

The GCC 8 branch is frozen for preparation of the GCC 8.2 release.
All changes to the branch now require release manager approval.


Previous Report
===

https://gcc.gnu.org/ml/gcc/2018-07/msg00194.html


Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-19 Thread Richard Earnshaw (lists)
On 19/07/18 08:30, Martin Liška wrote:
> This is the correct version of the patch. Anyway, I'm thinking about the ForceHelp
> attribute. I may do it in a slightly different way. Let me come up with
> another version of the patch.
> 
> Martin
> 

I don't understand how this is supposed to work.  -mcpu, -march and
-mtune all take strings now and have to be parsed to identify various
sub-components of the parameter.  So why do you talk about these being
enum types?

R.

> 
> 0001-Show-valid-options-for-march-and-mtune-in-help-targe-v3.patch
> 
> 
> From 9bfc1400213911b4508e90198df7b2dd11efc85c Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Tue, 20 Feb 2018 10:39:09 +0100
> Subject: [PATCH] Show valid options for -march and -mtune in --help=target for
>  arm32 (PR driver/83193).
> 
> gcc/ChangeLog:
> 
> 2018-07-18  Martin Liska  
> 
> PR driver/83193
>   * config/arm/arm-tables.opt: Add ForceHelp flag for
> processor_type and arch_name enum types.
>   * config/arm/parsecpu.awk: Likewise.
>   * doc/options.texi: Document new flag ForceHelp.
>   * opt-read.awk: Parse ForceHelp and set it in construction.
>   * optc-gen.awk: Likewise.
>   * opts.c (print_filtered_help): Handle force_help option.
>   * opts.h (struct cl_enum): New field force_help.
> ---
>  gcc/config/arm/arm-tables.opt | 4 ++--
>  gcc/config/arm/parsecpu.awk   | 4 ++--
>  gcc/doc/options.texi  | 4 
>  gcc/opt-read.awk  | 3 +++
>  gcc/optc-gen.awk  | 3 ++-
>  gcc/opts.c| 3 ++-
>  gcc/opts.h| 3 +++
>  7 files changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
> index eacee746a39..c74229e27d7 100644
> --- a/gcc/config/arm/arm-tables.opt
> +++ b/gcc/config/arm/arm-tables.opt
> @@ -21,7 +21,7 @@
>  ; .
>  
>  Enum
> -Name(processor_type) Type(enum processor_type)
> +Name(processor_type) Type(enum processor_type) ForceHelp
>  Known ARM CPUs (for use with the -mcpu= and -mtune= options):
>  
>  EnumValue
> @@ -298,7 +298,7 @@ EnumValue
>  Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
>  
>  Enum
> -Name(arm_arch) Type(int)
> +Name(arm_arch) Type(int) ForceHelp
>  Known ARM architectures (for use with the -march= option):
>  
>  EnumValue
> diff --git a/gcc/config/arm/parsecpu.awk b/gcc/config/arm/parsecpu.awk
> index aabe1b0c64c..c499a5ed0ce 100644
> --- a/gcc/config/arm/parsecpu.awk
> +++ b/gcc/config/arm/parsecpu.awk
> @@ -441,7 +441,7 @@ function gen_opt () {
>  boilerplate("md")
>  
>  print "Enum"
> -print "Name(processor_type) Type(enum processor_type)"
> +print "Name(processor_type) Type(enum processor_type) ForceHelp"
>  print "Known ARM CPUs (for use with the -mcpu= and -mtune= options):\n"
>  
>  ncpus = split (cpu_list, cpus)
> @@ -454,7 +454,7 @@ function gen_opt () {
>  }
>  
>  print "Enum"
> -print "Name(arm_arch) Type(int)"
> +print "Name(arm_arch) Type(int) ForceHelp"
>  print "Known ARM architectures (for use with the -march= option):\n"
>  
>  narchs = split (arch_list, archs)
> diff --git a/gcc/doc/options.texi b/gcc/doc/options.texi
> index b3ca9f6fce6..af77ad78e8c 100644
> --- a/gcc/doc/options.texi
> +++ b/gcc/doc/options.texi
> @@ -120,6 +120,10 @@ being described by this record.
>  This property is required; it says what value (representable as
>  @code{int}) should be used for the given string.
>  
> +@item ForceHelp
> +This property is optional.  If present, enum values are printed
> +in @option{--help} output.
> +
>  @item Canonical
>  This property is optional.  If present, it says the present string is
>  the canonical one among all those with the given value.  Other strings
> diff --git a/gcc/opt-read.awk b/gcc/opt-read.awk
> index 2072958e6ba..6d2be9e99d7 100644
> --- a/gcc/opt-read.awk
> +++ b/gcc/opt-read.awk
> @@ -89,6 +89,9 @@ BEGIN {
>   enum_index[name] = n_enums
>   enum_unknown_error[name] = unknown_error
>   enum_help[name] = $3
> + enum_force_help[name] = test_flag("ForceHelp", props, 
> "true")
> + if (enum_force_help[name] == "")
> +   enum_force_help[name] = "false"
>   n_enums++
>   }
>   else if ($1 == "EnumValue")  {
> diff --git a/gcc/optc-gen.awk b/gcc/optc-gen.awk
> index bf177e86330..5c4f4239db0 100644
> --- a/gcc/optc-gen.awk
> +++ b/gcc/optc-gen.awk
> @@ -167,7 +167,8 @@ for (i = 0; i < n_enums; i++) {
>   print "cl_enum_" name "_data,"
>   print "sizeof (" enum_type[name] "),"
>   print "cl_enum_" name "_set,"
> - print "cl_enum_" name "_get"
> + print "cl_enum_" name "_get,"
> + print "" enum_force_help[name]
>   print "  },"
>  }
>  print "};"
> diff --git a/gcc/opts.c b/gcc/opts.c
> index b8ae8

Re: [C++ Patch] PR 59480 ("Missing error diagnostic: friend declaration specifying a default argument must be a definition")

2018-07-19 Thread Paolo Carlini

Hi,

On 19/07/2018 10:43, Rainer Orth wrote:

Hi Paolo,


the below resolves the bug report and its duplicates by implementing -
in a rather straightforward way, I believe - the resolution of DR 136,
which also made it into C++17. Note that in the patch I used permerror instead
of a plain error for consistency with the other check
(check_redeclaration_no_default_args) which I added (rather) recently, and
I'm exploiting that to allow two existing testcases to compile as they
are. Tested x86_64-linux.

this patch caused

+FAIL: g++.old-deja/g++.mike/p784.C  -std=gnu++11 (test for excess errors)
+FAIL: g++.old-deja/g++.mike/p784.C  -std=gnu++14 (test for excess errors)
+FAIL: g++.old-deja/g++.mike/p784.C  -std=gnu++98 (test for excess errors)

Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.old-deja/g++.mike/p784.C:1185:21: error: 
friend declaration of 'String common_prefix(const String&, const String&, int)' 
specifies default arguments and isn't a definition [-fpermissive]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.old-deja/g++.mike/p784.C:1187:21: error: 
friend declaration of 'String common_suffix(const String&, const String&, int)' 
specifies default arguments and isn't a definition [-fpermissive]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.old-deja/g++.mike/p784.C:1226:21: error: 
friend declaration of 'int readline(istream&, String&, char, int)' specifies 
default arguments and isn't a definition [-fpermissive]

I'm seeing it on i386-pc-solaris2.11 and sparc-sun-solaris2.11 with
-m32, but according to gcc-testresults it also happens on
i686-pc-linux-gnu, x86_64-pc-linux-gnu and a few others.
Thanks. I'm wondering why I didn't notice that. Anyway, I'm going to 
simply add -fpermissive to this testcase too.


Thanks again,
Paolo.


[wwwdocs] Mention LTO link-time issue fixed in gcc 8.2

2018-07-19 Thread Jan Hubicka
Hi,
since we now mention the problem with Intel tuning, I thought we may also mention
the LTO link-time issue that was fixed.  It was mentioned by several folks at
the phoronix forum.  (Basically, the partition size sometimes overflowed, which
made the partitioner put every symbol into a separate partition, causing a fork bomb
while streaming and overall very slow compile times.)

Honza

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.89
diff -u -r1.89 changes.html
--- changes.html15 Jul 2018 12:57:34 -  1.89
+++ changes.html19 Jul 2018 09:45:27 -
@@ -1321,6 +1321,12 @@
 complete (that is, it is possible that some PRs that have been fixed
 are not listed here).
 
+General Improvements
+  
+Fixed LTO link-time performance problems caused by an overflow
+   in the partitioning algorithm while building large binaries.
+  
+
 Target Specific Changes
 
 IA-32/x86-64


Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-19 Thread Martin Liška
On 07/19/2018 11:28 AM, Richard Earnshaw (lists) wrote:
> On 19/07/18 08:30, Martin Liška wrote:
>> This is the correct version of the patch. Anyway, I'm thinking about the
>> ForceHelp attribute. I may do it in a slightly different way. Let me come
>> up with another version of the patch.
>>
>> Martin
>>
> 
> I don't understand how this is supposed to work.  -mcpu, -march and
> -mtune all take strings now and have to be parsed to identify various
> sub-components of the parameter.  So why do you talk about these being
> enum types?

Yes, they are string types. But for purpose of --help output, it's nice
to present to a user a list of possible values. That's the enum type.

Please take a look at attached patch.

Thanks,
Martin


Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-19 Thread Richard Earnshaw (lists)
On 19/07/18 10:56, Martin Liška wrote:
> On 07/19/2018 11:28 AM, Richard Earnshaw (lists) wrote:
>> On 19/07/18 08:30, Martin Liška wrote:
>>> This is the correct version of the patch. Anyway, I'm thinking about the
>>> ForceHelp attribute. I may do it in a slightly different way. Let me come
>>> up with another version of the patch.
>>>
>>> Martin
>>>
>>
>> I don't understand how this is supposed to work.  -mcpu, -march and
>> -mtune all take strings now and have to be parsed to identify various
>> sub-components of the parameter.  So why do you talk about these being
>> enum types?
> 
> Yes, they are string types. But for purpose of --help output, it's nice
> to present to a user a list of possible values. That's the enum type.
> 
> Please take a look at attached patch.
> 

But that isn't the list of possible values.  Please see the manual.  A
valid CPU name can look something like

cortex-a53+crypto

and architectures names can be even more complex.

You can't get this from that list of enum values.
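To illustrate why a flat list of enum strings is not enough, here is a minimal sketch that splits such an argument at the first '+'.  The helper names are hypothetical and this is not how the arm backend parses -mcpu; it only shows that the text after the base name carries information a plain enum lookup on the whole string would miss.

```c
#include <stddef.h>
#include <string.h>

/* Length of the base CPU name, i.e. everything before the first '+'
   (or the whole string if there is no extension).  */
static size_t
cpu_base_name_len (const char *arg)
{
  return strcspn (arg, "+");
}

/* Pointer to the text after the first '+', or NULL if ARG has no
   extension part.  */
static const char *
cpu_first_extension (const char *arg)
{
  const char *plus = strchr (arg, '+');
  return plus ? plus + 1 : NULL;
}
```

For "cortex-a53+crypto" the base name is the first 10 characters ("cortex-a53") and the extension text is "crypto"; only the base name could match an entry in the enum list.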

R.


Re: [AArch64] Generate load-pairs when the last load clobbers the address register [2/2]

2018-07-19 Thread Jackson Woodruff

Hi Richard,


On 07/12/2018 05:35 PM, Richard Earnshaw (lists) wrote:

On 11/07/18 17:48, Jackson Woodruff wrote:

Hi Sudi,

On 07/10/2018 02:29 PM, Sudakshina Das wrote:

Hi Jackson


On Tuesday 10 July 2018 09:37 AM, Jackson Woodruff wrote:

Hi all,

This patch resolves PR86014.  It does so by noticing that the last
load may clobber the address register without issue (regardless of
where it exists in the final ldp/stp sequence). That check has been
changed so that the last register may be clobbered and the testcase
(gcc.target/aarch64/ldp_stp_10.c) now passes.

Bootstrap and regtest OK.

OK for trunk?

Jackson

Changelog:

gcc/

2018-06-25  Jackson Woodruff  

     PR target/86014
     * config/aarch64/aarch64.c
(aarch64_operands_adjust_ok_for_ldpstp):
     Remove address clobber check on last register.


This looks good to me but you will need a maintainer to approve it.
The only thing I would add is that you could move the comment on top
of the for loop to this patch. That is, keep the original
/* Check if the addresses are clobbered by load.  */
in your [1/2] and make the comment change in [2/2].

Thanks, change made.  OK for trunk?

Thanks,

Jackson

pr86014.patch


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index da44b33b2bc12f9aa2122cf5194e244437fb31a5..8a027974e9772cacf5f5cb8ec61e8ef62187e879 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -17071,9 +17071,10 @@ aarch64_operands_adjust_ok_for_ldpstp (rtx *operands, bool load,
return false;
  }
  
-  /* Check if addresses are clobbered by load.  */

+  /* Only the last register in the order in which they occur
+ may be clobbered by the load.  */
if (load)
-for (int i = 0; i < num_instructions; i++)
+for (int i = 0; i < num_instructions - 1; i++)
if (reg_mentioned_p (reg[i], mem[i]))
return false;
  


Can we have a new test for this?

I've added ldp_stp_13.c that tests for this.


Also, if rclass (which you calculate later) is FP_REGS, then the test is
redundant since mems can never use FP registers as a base register.


Yes, makes sense.  I've flipped the logic around so that the rclass is
calculated first and is then used to avoid the base register check if
it is not GENERAL_REGS.
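
To make the reordered check concrete, here is a minimal standalone sketch
of the logic described above.  It is not the GCC code: the enum, the struct
and the function name are hypothetical stand-ins, and plain register
numbers replace the reg_mentioned_p test on RTL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GCC's register classes.  */
enum reg_class_model { MODEL_GENERAL_REGS, MODEL_FP_REGS };

/* One load of a candidate ldp sequence: the destination register number
   and the base register number used by its address.  */
struct load_model
{
  int dest_regno;
  int base_regno;
};

/* Return true if the sequence passes the address clobber check.  */
bool
clobber_check_ok (const struct load_model *loads, size_t n,
                  enum reg_class_model rclass, bool is_load)
{
  /* FP destinations can never be a base register, and stores do not
     write their registers, so both cases skip the check.  */
  if (rclass != MODEL_GENERAL_REGS || !is_load)
    return true;

  /* Every load except the last must leave its own address register
     intact; the last one may clobber it.  */
  for (size_t i = 0; i + 1 < n; i++)
    if (loads[i].dest_regno == loads[i].base_regno)
      return false;

  return true;
}
```

With two loads from base register 0, the pair { {1, 0}, {0, 0} } is
accepted because only the final load overwrites the base, while
{ {0, 0}, {1, 0} } is rejected.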

Re-bootstrapped and regtested.

Is this OK for trunk?

Thanks,

Jackson



R.


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1369704da3ed8094c0d4612643794e6392dce05a..3dd891ebd00f24ffa4187f0125b306a3c6671bef 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -17084,9 +17084,26 @@ aarch64_operands_adjust_ok_for_ldpstp (rtx *operands, bool load,
 	return false;
 }
 
-  /* Check if addresses are clobbered by load.  */
-  if (load)
-for (int i = 0; i < num_insns; i++)
+  /* Check if the registers are of same class.  */
+  rclass = REG_P (reg[0]) && FP_REGNUM_P (REGNO (reg[0]))
+? FP_REGS : GENERAL_REGS;
+
+  for (int i = 1; i < num_insns; i++)
+if (REG_P (reg[i]) && FP_REGNUM_P (REGNO (reg[i])))
+  {
+	if (rclass != FP_REGS)
+	  return false;
+  }
+else
+  {
+	if (rclass != GENERAL_REGS)
+	  return false;
+  }
+
+  /* Only the last register in the order in which they occur
+ may be clobbered by the load.  */
+  if (rclass == GENERAL_REGS && load)
+for (int i = 0; i < num_insns - 1; i++)
   if (reg_mentioned_p (reg[i], mem[i]))
 	return false;
 
@@ -17126,22 +17143,6 @@ aarch64_operands_adjust_ok_for_ldpstp (rtx *operands, bool load,
   && MEM_ALIGN (mem[0]) < 8 * BITS_PER_UNIT)
 return false;
 
-  /* Check if the registers are of same class.  */
-  rclass = REG_P (reg[0]) && FP_REGNUM_P (REGNO (reg[0]))
-? FP_REGS : GENERAL_REGS;
-
-  for (int i = 1; i < num_insns; i++)
-if (REG_P (reg[i]) && FP_REGNUM_P (REGNO (reg[i])))
-  {
-	if (rclass != FP_REGS)
-	  return false;
-  }
-else
-  {
-	if (rclass != GENERAL_REGS)
-	  return false;
-  }
-
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_13.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_13.c
new file mode 100644
index ..9cc3942f153773e8ffe9bcaf07f6b32dc0d5f95e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_13.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabi=ilp32" } */
+
+long long
+load_long (long long int *arr)
+{
+  return arr[400] << 1 + arr[401] << 1 + arr[403] << 1 + arr[404] << 1;
+}
+
+/* { dg-final { scan-assembler-times "ldp\tx\[0-9\]+, x\[0-9\]+, " 2 } } */
+
+int
+load (int *arr)
+{
+  return arr[527] << 1 + arr[400] << 1 + arr[401] << 1 + arr[528] << 1;
+}
+
+/* { dg-final { scan-assembler-times "ldp\tw\[0-9\]+, w\[0-9\]+, " 2 } } */


Re: [SVE ACLE] Add initial support for arm_sve.h

2018-07-19 Thread Richard Sandiford
Richard Biener  writes:
> On Wed, Jul 18, 2018 at 8:08 PM Richard Sandiford
>  wrote:
>>
>> This patch adds the target framework for handling the SVE ACLE,
>> starting with four functions: svadd, svptrue, svsub and svsubr.
>>
>> The ACLE has both overloaded and non-overloaded names.  Without
>> the equivalent of clang's __attribute__((overloadable)), a header
>> file that declared all functions would need three sets of declarations:
>>
>> - the non-overloaded forms (used for both C and C++)
>> - _Generic-based macros to handle overloading in C
>> - normal overloaded inline functions for C++
>>
>> This would likely require a lot of cut-&-paste.  It would probably
>> also lead to poor diagnostics and be slow to parse.
>>
>> Another consideration is that some functions require certain arguments
>> to be integer constant expressions.  We can (sort of) enforce that
>> for calls to built-in functions using resolve_overloaded_builtin,
>> but it would be harder to enforce with inline forwarder functions.
>>
>> For these reasons and others, the patch takes the approach of adding
>> a pragma that gets the compiler to insert the definitions itself.
>> This requires a slight variation on the existing lang hooks for
>> built-in functions, but otherwise it seems to just work.
>
> I guess you did consider auto-generating the three variants from a template?

Yeah.  But that would only solve the cut-&-paste problem, not the others.
It would also be quite a lot more complicated overall.

E.g. scripting code to produce the right _Generics is much more complicated
than just implementing the overloading using resolve_overloaded_builtin
(which also produces better error messages).  And even just scripting
the declarations is more work: the backend has to register a built-in
function either way, so getting it to register the public name is
easier than having the backend register a __builtin_ function and
then scripting a header file declaration with the same prototype
and attributes.
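
For readers unfamiliar with the header-only alternative being rejected
here, this is roughly what a _Generic-based overload looks like in C11.
It is only an illustrative sketch: add_s32, add_f64 and the add macro are
made-up names, not ACLE functions, and real SVE types are not used.

```c
/* Hypothetical non-overloaded forms; the names are illustrative only.  */
static inline int add_s32 (int a, int b) { return a + b; }
static inline double add_f64 (double a, double b) { return a + b; }

/* C11 _Generic selects a function based on the type of the first
   argument; the selected function is then called with both arguments.  */
#define add(a, b) _Generic ((a), int: add_s32, double: add_f64) ((a), (b))
```

Every overloaded name needs one such macro with one association per type
combination, and a call with an unlisted type (say, a float first
argument) fails inside the macro expansion, which is the kind of
diagnostic the pragma approach is meant to avoid.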

Thanks,
Richard


Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-19 Thread Martin Liška
On 07/19/2018 12:01 PM, Richard Earnshaw (lists) wrote:
> On 19/07/18 10:56, Martin Liška wrote:
>> On 07/19/2018 11:28 AM, Richard Earnshaw (lists) wrote:
>>> On 19/07/18 08:30, Martin Liška wrote:
>>>> This is the correct version of the patch. Anyway, I'm thinking about the
>>>> ForceHelp attribute. I may do it in a slightly different way. Let me come
>>>> up with another version of the patch.
>>>>
>>>> Martin
>>>>
>>>
>>> I don't understand how this is supposed to work.  -mcpu, -march and
>>> -mtune all take strings now and have to be parsed to identify various
>>> sub-components of the parameter.  So why do you talk about these being
>>> enum types?
>>
>> Yes, they are string types. But for purpose of --help output, it's nice
>> to present to a user a list of possible values. That's the enum type.
>>
>> Please take a look at attached patch.
>>
> 
> But that isn't the list of possible values.  Please see the manual.  A
> valid CPU name can look something like
> 
>   cortex-a53+crypto
> 
> and architectures names can be even more complex.
> 
> You can't get this from that list of enum values.

I'm fully aware of the limitation, it's questionable whether you want to get:

@@ -56,6 +56,9 @@
   Known ARM ABIs (for use with the -mabi= option):
 aapcs aapcs-linux apcs-gnu atpcs iwmmxt
 
+  Known ARM architectures (for use with the -march= option):
+armv4 armv4t armv5t armv5te armv5tej armv6 armv6-m armv6j armv6k armv6kz 
armv6s-m armv6t2 armv6z armv6zk armv7 armv7-a armv7-m armv7-r armv7e-m armv7ve 
armv8-a armv8-m.base armv8-m.main armv8-r armv8.1-a armv8.2-a armv8.3-a 
armv8.4-a iwmmxt iwmmxt2 native
+
   Known __fp16 formats (for use with the -mfp16-format= option):
 alternative ieee none
 
@@ -68,6 +71,12 @@
   Known floating-point ABIs (for use with the -mfloat-abi= option):
 hard soft softfp
 
+  Known ARM CPUs (for use with the -mcpu= and -mtune= options):
+arm1020e arm1020t arm1022e arm1026ej-s arm10e arm10tdmi arm1136j-s 
arm1136jf-s arm1156t2-s arm1156t2f-s arm1176jz-s arm1176jzf-s arm710t arm720t 
arm740t arm7tdmi arm7tdmi-s arm8 arm810 arm9 arm920 arm920t arm922t arm926ej-s 
arm940t arm946e-s arm966e-s arm968e-s arm9e
+arm9tdmi cortex-a12 cortex-a15 cortex-a15.cortex-a7 cortex-a17 
cortex-a17.cortex-a7 cortex-a32 cortex-a35 cortex-a5 cortex-a53 cortex-a55 
cortex-a57 cortex-a57.cortex-a53 cortex-a7 cortex-a72 cortex-a72.cortex-a53 
cortex-a73 cortex-a73.cortex-a35 cortex-a73.cortex-a53
+cortex-a75 cortex-a75.cortex-a55 cortex-a76 cortex-a76.cortex-a55 
cortex-a8 cortex-a9 cortex-m0 cortex-m0.small-multiply cortex-m0plus 
cortex-m0plus.small-multiply cortex-m1 cortex-m1.small-multiply cortex-m23 
cortex-m3 cortex-m33 cortex-m4 cortex-m7 cortex-r4
+cortex-r4f cortex-r5 cortex-r52 cortex-r7 cortex-r8 ep9312 exynos-m1 fa526 
fa606te fa626 fa626te fa726te fmp626 generic-armv7-a iwmmxt iwmmxt2 marvell-pj4 
mpcore mpcorenovfp native strongarm strongarm110 strongarm1100 strongarm1110 
xgene1 xscale
+
   TLS dialect to use:
 gnu gnu2

I hope it's still beneficial for users.

Martin

> 
> R.
> 
>> Thanks,
>> Martin
>>
>>>
>>> R.
>>>

 0001-Show-valid-options-for-march-and-mtune-in-help-targe-v3.patch


 From 9bfc1400213911b4508e90198df7b2dd11efc85c Mon Sep 17 00:00:00 2001
 From: marxin 
 Date: Tue, 20 Feb 2018 10:39:09 +0100
 Subject: [PATCH] Show valid options for -march and -mtune in --help=target 
 for
  arm32 (PR driver/83193).

 gcc/ChangeLog:

 2018-07-18  Martin Liska  

 PR driver/83193
* config/arm/arm-tables.opt: Add ForceHelp flag for
 processor_type and arch_name enum types.
* config/arm/parsecpu.awk: Likewise.
* doc/options.texi: Document new flag ForceHelp.
* opt-read.awk: Parse ForceHelp and set it in construction.
* optc-gen.awk: Likewise.
* opts.c (print_filtered_help): Handle force_help option.
* opts.h (struct cl_enum): New field force_help.
 ---
  gcc/config/arm/arm-tables.opt | 4 ++--
  gcc/config/arm/parsecpu.awk   | 4 ++--
  gcc/doc/options.texi  | 4 
  gcc/opt-read.awk  | 3 +++
  gcc/optc-gen.awk  | 3 ++-
  gcc/opts.c| 3 ++-
  gcc/opts.h| 3 +++
  7 files changed, 18 insertions(+), 6 deletions(-)

 diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
 index eacee746a39..c74229e27d7 100644
 --- a/gcc/config/arm/arm-tables.opt
 +++ b/gcc/config/arm/arm-tables.opt
 @@ -21,7 +21,7 @@
  ; .
  
  Enum
 -Name(processor_type) Type(enum processor_type)
 +Name(processor_type) Type(enum processor_type) ForceHelp
  Known ARM CPUs (for use with the -mcpu= and -mtune= options):
  
  EnumValue
 @@ -298,7 +298,7 @@ EnumValue
  Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cort

Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-19 Thread Richard Earnshaw (lists)
On 19/07/18 11:22, Martin Liška wrote:
> On 07/19/2018 12:01 PM, Richard Earnshaw (lists) wrote:
>> On 19/07/18 10:56, Martin Liška wrote:
>>> On 07/19/2018 11:28 AM, Richard Earnshaw (lists) wrote:
>>>> On 19/07/18 08:30, Martin Liška wrote:
>>>>> This is the correct version of the patch. Anyway, I'm thinking about the
>>>>> ForceHelp attribute. I may do it in a slightly different way. Let me come
>>>>> up with another version of the patch.
>>>>>
>>>>> Martin
>>>>>
>>>>
>>>> I don't understand how this is supposed to work.  -mcpu, -march and
>>>> -mtune all take strings now and have to be parsed to identify various
>>>> sub-components of the parameter.  So why do you talk about these being
>>>> enum types?
>>>
>>> Yes, they are string types. But for purpose of --help output, it's nice
>>> to present to a user a list of possible values. That's the enum type.
>>>
>>> Please take a look at attached patch.
>>>
>>
>> But that isn't the list of possible values.  Please see the manual.  A
>> valid CPU name can look something like
>>
>>  cortex-a53+crypto
>>
>> and architectures names can be even more complex.
>>
>> You can't get this from that list of enum values.
> 
> I'm fully aware of the limitation, it's questionable whether you want to get:
> 
> @@ -56,6 +56,9 @@
>Known ARM ABIs (for use with the -mabi= option):
>  aapcs aapcs-linux apcs-gnu atpcs iwmmxt
>  
> +  Known ARM architectures (for use with the -march= option):
> +armv4 armv4t armv5t armv5te armv5tej armv6 armv6-m armv6j armv6k armv6kz 
> armv6s-m armv6t2 armv6z armv6zk armv7 armv7-a armv7-m armv7-r armv7e-m 
> armv7ve armv8-a armv8-m.base armv8-m.main armv8-r armv8.1-a armv8.2-a 
> armv8.3-a armv8.4-a iwmmxt iwmmxt2 native
> +
>Known __fp16 formats (for use with the -mfp16-format= option):
>  alternative ieee none
>  
> @@ -68,6 +71,12 @@
>Known floating-point ABIs (for use with the -mfloat-abi= option):
>  hard soft softfp
>  
> +  Known ARM CPUs (for use with the -mcpu= and -mtune= options):
> +arm1020e arm1020t arm1022e arm1026ej-s arm10e arm10tdmi arm1136j-s 
> arm1136jf-s arm1156t2-s arm1156t2f-s arm1176jz-s arm1176jzf-s arm710t arm720t 
> arm740t arm7tdmi arm7tdmi-s arm8 arm810 arm9 arm920 arm920t arm922t 
> arm926ej-s arm940t arm946e-s arm966e-s arm968e-s arm9e
> +arm9tdmi cortex-a12 cortex-a15 cortex-a15.cortex-a7 cortex-a17 
> cortex-a17.cortex-a7 cortex-a32 cortex-a35 cortex-a5 cortex-a53 cortex-a55 
> cortex-a57 cortex-a57.cortex-a53 cortex-a7 cortex-a72 cortex-a72.cortex-a53 
> cortex-a73 cortex-a73.cortex-a35 cortex-a73.cortex-a53
> +cortex-a75 cortex-a75.cortex-a55 cortex-a76 cortex-a76.cortex-a55 
> cortex-a8 cortex-a9 cortex-m0 cortex-m0.small-multiply cortex-m0plus 
> cortex-m0plus.small-multiply cortex-m1 cortex-m1.small-multiply cortex-m23 
> cortex-m3 cortex-m33 cortex-m4 cortex-m7 cortex-r4
> +cortex-r4f cortex-r5 cortex-r52 cortex-r7 cortex-r8 ep9312 exynos-m1 
> fa526 fa606te fa626 fa626te fa726te fmp626 generic-armv7-a iwmmxt iwmmxt2 
> marvell-pj4 mpcore mpcorenovfp native strongarm strongarm110 strongarm1100 
> strongarm1110 xgene1 xscale
> +
>TLS dialect to use:
>  gnu gnu2
> 
> I hope it's still beneficial for users.

Frankly, I find the list too long to be helpful.  I'd also prefer it if
we could come up with a more useful approach.  I've pondered if the
following were possible:

In general target help, print

For list of supported CPUs [Architectures] use -mcpu=help [-march=help]

And then, invoking the compiler gives that list in a more user-friendly
fashion.  Finally, at the end we could have:

For CPU [Architecture]-specific extensions use -mcpu=+help
[-march=+help]

and then it would show the specific extensions for that architecture.

It's relatively straightforward to do the back-end plumbing for this,
but either the help driver would have to know how to call into the
back-end, or the back-end would have to be able to report to the midend
that this was a help invocation, not a normal run.  I couldn't find a
simple way of doing that when I tried before.
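
As a rough illustration of why a flat list of enum values cannot describe
every valid argument, the sketch below splits a string such as
cortex-a53+crypto into a base name and an extension suffix before looking
the base up.  This is an assumption-laden toy, not the ARM back-end's
parser: lookup_cpu_base and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Look up the base name of ARG (the text before the first '+') in KNOWN.
   Return its index, or -1 if it is not listed.  *EXT is set to the
   extension suffix starting at '+', or to NULL when there is none.  */
int
lookup_cpu_base (const char *arg, const char *const *known, size_t n_known,
                 const char **ext)
{
  const char *plus = strchr (arg, '+');
  size_t base_len = plus ? (size_t) (plus - arg) : strlen (arg);

  *ext = plus;
  for (size_t i = 0; i < n_known; i++)
    if (strlen (known[i]) == base_len
        && strncmp (known[i], arg, base_len) == 0)
      return (int) i;
  return -1;
}
```

Only the base name can be matched against a printed list; whatever
follows the '+' still needs per-CPU or per-architecture knowledge, which
is what a dedicated help query would have to report.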

R.

> 
> Martin
> 
>>
>> R.
>>
>>> Thanks,
>>> Martin
>>>

 R.

>
> 0001-Show-valid-options-for-march-and-mtune-in-help-targe-v3.patch
>
>
> From 9bfc1400213911b4508e90198df7b2dd11efc85c Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Tue, 20 Feb 2018 10:39:09 +0100
> Subject: [PATCH] Show valid options for -march and -mtune in 
> --help=target for
>  arm32 (PR driver/83193).
>
> gcc/ChangeLog:
>
> 2018-07-18  Martin Liska  
>
> PR driver/83193
>   * config/arm/arm-tables.opt: Add ForceHelp flag for
> processor_type and arch_name enum types.
>   * config/arm/parsecpu.awk: Likewise.
>   * doc/options.texi: Document new flag ForceHelp.
>   * opt-read.awk: Parse ForceHelp and set it in construction.
>   * optc-gen.awk: Likewise.
>   * opts.c (print_filtered_help): Handle force_help opt

Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-19 Thread Kyrill Tkachov

Hi Thomas,

On 17/07/18 12:02, Thomas Preudhomme wrote:

Fixed in attached patch. ChangeLog entries are unchanged:

*** gcc/ChangeLog ***

2018-07-05  Thomas Preud'homme 

PR target/85434
* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to new require_pic_register prototype.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(stack_protect_combined_set): New insn_and_split pattern.
(stack_protect_set): New insn pattern.
(stack_protect_combined_test): New insn_and_split pattern.
(stack_protect_test): New insn pattern.
* config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
(UNSPEC_SP_TEST): Likewise.
* doc/md.texi (stack_protect_combined_set): Document new standard
pattern name.
(stack_protect_set): Clarify that the operand for guard's address is
legal.
(stack_protect_combined_test): Document new standard pattern name.
(stack_protect_test): Clarify that the operand for guard's address is
legal.

*** gcc/testsuite/ChangeLog ***

2018-07-05  Thomas Preud'homme 

PR target/85434
* gcc.target/arm/pr85434.c: New test.



Sorry for the delay. Some comments inline.

Kyrill

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index d6e3c382085..d1a893ac56e 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -6105,8 +6105,18 @@ stack_protect_prologue (void)
 {
   tree guard_decl = targetm.stack_protect_guard ();
   rtx x, y;
+  struct expand_operand ops[2];
 
   x = expand_normal (crtl->stack_protect_guard);

+  create_fixed_operand (&ops[0], x);
+  create_fixed_operand (&ops[1], DECL_RTL (guard_decl));
+  /* Allow the target to compute address of Y and copy it to X without
+ leaking Y into a register.  This combined address + copy pattern allows
+ the target to prevent spilling of any intermediate results by splitting
+ it after register allocator.  */
+  if (maybe_expand_insn (targetm.code_for_stack_protect_combined_set, 2, ops))
+return;
+
   if (guard_decl)
 y = expand_normal (guard_decl);
   else
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 8537262ce64..100844e659c 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -67,7 +67,7 @@ extern int const_ok_for_dimode_op (HOST_WIDE_INT, enum rtx_code);
 extern int arm_split_constant (RTX_CODE, machine_mode, rtx,
   HOST_WIDE_INT, rtx, rtx, int);
 extern int legitimate_pic_operand_p (rtx);
-extern rtx legitimize_pic_address (rtx, machine_mode, rtx);
+extern rtx legitimize_pic_address (rtx, machine_mode, rtx, rtx, bool);
 extern rtx legitimize_tls_address (rtx, rtx);
 extern bool arm_legitimate_address_p (machine_mode, rtx, bool);
 extern int arm_legitimate_address_outer_p (machine_mode, rtx, RTX_CODE, int);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index ec3abbcba9f..f4a970580c2 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7369,20 +7369,26 @@ legitimate_pic_operand_p (rtx x)
 }
 
 /* Record that the current function needs a PIC register.  Initialize

-   cfun->machine->pic_reg if we have not already done so.  */
+   cfun->machine->pic_reg if we have not already done so.
+
+   If not NULL, PIC_REG indicates which register to use as PIC register,
+   otherwise it is decided by register allocator.  COMPUTE_NOW forces the PIC
+   register to be loaded, irregardless of whether it was loaded previously.  */
 
 static void

-require_pic_register (void)
+require_pic_register (rtx pic_reg, bool compute_now)
 {
   /* A lot of the logic here is made obscure by the fact that this
  routine gets called as part of the rtx cost estimation process.
  We don't want those calls to affect any assumptions about the real
  function; and further, we can't call entry_of_function() until we
  start the real expansion process.  */
-  if (!crtl->uses_pic_offset_table)
+  if (!crtl->uses_pic_offset_table || compute_now)
 {
-  gcc_assert (can_create_pseudo_p ());
+  gcc_assert (can_create_pseudo_p ()
+ || (pic_reg != NULL_RTX && GET_MODE (pic_reg) == Pmode));
   if (arm_pic_register != INVALID_R

[PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread H.J. Lu
On Thu, Jul 19, 2018 at 10:35:27AM +0200, Richard Biener wrote:
> On Wed, Jul 18, 2018 at 5:33 PM H.J. Lu  wrote:
> >
> > In
> >
> > struct ucontext;
> > typedef struct ucontext ucontext_t;
> >
> > extern int (*bar) (ucontext_t *__restrict __oucp,
> >const ucontext_t *__restrict __ucp)
> >   __attribute__((__indirect_return__));
> >
> > extern int res;
> >
> > void
> > foo (ucontext_t *oucp, ucontext_t *ucp)
> > {
> >   res = bar (oucp, ucp);
> > }
> >
> > bar() may return via indirect branch.  This patch changes indirect_return
> > to type attribute to allow indirect_return attribute on variable or type
> > of function pointer so that ENDBR can be inserted after call to bar().
> >
> > Tested on i386 and x86-64.  OK for trunk?
> 
> OK.
> 

The new indirect_return attribute is intended to mark swapcontext in
.  This patch defines __HAVE_INDIRECT_RETURN_ATTRIBUTE__
so that it can be checked before using the indirect_return attribute
in .  It works when the indirect_return attribute is
backported to GCC 8.

OK for trunk?

Thanks.

H.J.
---
gcc/

PR target/86560
* config/i386/i386-c.c (ix86_target_macros): Define
__HAVE_INDIRECT_RETURN_ATTRIBUTE__.

gcc/testsuite/

PR target/86560
* gcc.target/i386/pr86560-4.c: New test.
---
 gcc/config/i386/i386-c.c  |  2 ++
 gcc/testsuite/gcc.target/i386/pr86560-4.c | 19 +++
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr86560-4.c

diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 005e1a5b308..041d47c3ee6 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -695,6 +695,8 @@ ix86_target_macros (void)
   if (flag_cf_protection != CF_NONE)
 cpp_define_formatted (parse_in, "__CET__=%d",
  flag_cf_protection & ~CF_SET);
+
+  cpp_define (parse_in, "__HAVE_INDIRECT_RETURN_ATTRIBUTE__");
 }
 
 
diff --git a/gcc/testsuite/gcc.target/i386/pr86560-4.c b/gcc/testsuite/gcc.target/i386/pr86560-4.c
new file mode 100644
index 000..46ea923fdfc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr86560-4.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection" } */
+/* { dg-final { scan-assembler-times {\mendbr} 2 } } */
+
+struct ucontext;
+
+extern int (*bar) (struct ucontext *)
+#ifdef __HAVE_INDIRECT_RETURN_ATTRIBUTE__
+  __attribute__((__indirect_return__))
+#endif
+;
+
+extern int res;
+
+void
+foo (struct ucontext *oucp)
+{
+  res = bar (oucp);
+}
-- 
2.17.1



Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread Jakub Jelinek
On Thu, Jul 19, 2018 at 04:21:26AM -0700, H.J. Lu wrote:
> The new indirect_return attribute is intended to mark swapcontext in
> .  This patch defines __HAVE_INDIRECT_RETURN_ATTRIBUTE__
> so that it can be used checked before using indirect_return attribute
> in .  It works when the indirect_return attribute is
> backported to GCC 8.
> 
> OK for trunk?

No.  Use
#ifdef __has_attribute
#if __has_attribute (indirect_return)
...
#endif
#endif
instead, like for any other attribute.  Predefined macros aren't zero cost.

Jakub


Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread Florian Weimer

On 07/19/2018 01:33 PM, Jakub Jelinek wrote:

On Thu, Jul 19, 2018 at 04:21:26AM -0700, H.J. Lu wrote:

The new indirect_return attribute is intended to mark swapcontext in
.  This patch defines __HAVE_INDIRECT_RETURN_ATTRIBUTE__
so that it can be used checked before using indirect_return attribute
in .  It works when the indirect_return attribute is
backported to GCC 8.

OK for trunk?


No.  Use
#ifdef __has_attribute
#if __has_attribute (indirect_return)
...
#endif
#endif
instead, like for any other attribute.


That doesn't work because indirect_return is not in the implementation 
namespace and expanded in this context. I assume that __has_attribute 
(__indirect_return__) would work, though.


Could we add:

#ifdef __has_attribute
# define __glibc_has_attribute(attr) __has_attribute (attr)
#else
# define __glibc_has_attribute 0
#endif

And then use this:

#if __glibc_has_attribute (__indirect_return__)

Would that still work?

Thanks,
Florian


Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread H.J. Lu
On Thu, Jul 19, 2018 at 01:39:04PM +0200, Florian Weimer wrote:
> On 07/19/2018 01:33 PM, Jakub Jelinek wrote:
> > On Thu, Jul 19, 2018 at 04:21:26AM -0700, H.J. Lu wrote:
> > > The new indirect_return attribute is intended to mark swapcontext in
> > > .  This patch defines __HAVE_INDIRECT_RETURN_ATTRIBUTE__
> > > so that it can be used checked before using indirect_return attribute
> > > in .  It works when the indirect_return attribute is
> > > backported to GCC 8.
> > > 
> > > OK for trunk?
> > 
> > No.  Use
> > #ifdef __has_attribute
> > #if __has_attribute (indirect_return)
> > ...
> > #endif
> > #endif
> > instead, like for any other attribute.
> 
> That doesn't work because indirect_return is not in the implementation
> namespace and expanded in this context. I assume that __has_attribute
> (__indirect_return__) would work, though.
> 
> Could we add:
> 
> #ifdef __has_attribute
> # define __glibc_has_attribute(attr) __has_attribute (attr)
> #else
> # define __glibc_has_attribute 0
> #endif
> 
> And then use this:
> 
> #if __glibc_has_attribute (__indirect_return__)
> 
> Would that still work?
> 

Both __has_attribute (indirect_return) and __has_attribute (__indirect_return__)
work here.


H.J.
---
The new indirect_return attribute is intended to mark swapcontext in
.  Test __has_attribute (indirect_return) so that it
can be backported to GCC 8.

PR target/86560
* gcc.target/i386/pr86560-4.c: New test.
* gcc.target/i386/pr86560-5.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/pr86560-4.c | 21 +
 gcc/testsuite/gcc.target/i386/pr86560-5.c | 21 +
 2 files changed, 42 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr86560-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr86560-5.c

diff --git a/gcc/testsuite/gcc.target/i386/pr86560-4.c b/gcc/testsuite/gcc.target/i386/pr86560-4.c
new file mode 100644
index 000..a623e3dcbeb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr86560-4.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection" } */
+/* { dg-final { scan-assembler-times {\mendbr} 2 } } */
+
+struct ucontext;
+
+extern int (*bar) (struct ucontext *)
+#ifdef __has_attribute
+# if __has_attribute (indirect_return)
+  __attribute__((__indirect_return__))
+# endif
+#endif
+;
+
+extern int res;
+
+void
+foo (struct ucontext *oucp)
+{
+  res = bar (oucp);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr86560-5.c b/gcc/testsuite/gcc.target/i386/pr86560-5.c
new file mode 100644
index 000..33b0f6424c2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr86560-5.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection" } */
+/* { dg-final { scan-assembler-times {\mendbr} 2 } } */
+
+struct ucontext;
+
+extern int (*bar) (struct ucontext *)
+#ifdef __has_attribute
+# if __has_attribute (__indirect_return__)
+  __attribute__((__indirect_return__))
+# endif
+#endif
+;
+
+extern int res;
+
+void
+foo (struct ucontext *oucp)
+{
+  res = bar (oucp);
+}
-- 
2.17.1



Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread Florian Weimer

On 07/19/2018 01:48 PM, H.J. Lu wrote:

Both __has_attribute (indirect_return) and __has_attribute (__indirect_return__)
work here.


Applications can have

#define indirect_return

so the variant without underscore mangling is definitely not correct.
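
The effect can be seen without any attribute support at all.  The sketch
below stringifies both spellings after an application-style
"#define indirect_return"; STR, STR_ and the two helper functions are
hypothetical names used only for this demonstration.

```c
/* Two-level stringification so the argument is macro-expanded first.  */
#define STR_(x) #x
#define STR(x) STR(x ## _EXPANDED_PLACEHOLDER)
#undef STR
#define STR(x) STR_(x)

/* An application is free to define this, since the name is not in the
   implementation namespace.  */
#define indirect_return

/* The plain spelling is replaced by the application's (empty) macro.  */
const char *plain_spelling (void) { return STR (indirect_return); }

/* The reserved spelling is left alone by conforming applications.  */
const char *reserved_spelling (void) { return STR (__indirect_return__); }
```

Here plain_spelling returns an empty string while reserved_spelling
returns "__indirect_return__", so a header that writes the plain name in
__has_attribute or __attribute__ sees whatever the application defined.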

Thanks,
Florian


Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread Jakub Jelinek
On Thu, Jul 19, 2018 at 01:39:04PM +0200, Florian Weimer wrote:
> On 07/19/2018 01:33 PM, Jakub Jelinek wrote:
> > On Thu, Jul 19, 2018 at 04:21:26AM -0700, H.J. Lu wrote:
> > > The new indirect_return attribute is intended to mark swapcontext in
> > > .  This patch defines __HAVE_INDIRECT_RETURN_ATTRIBUTE__
> > > so that it can be used checked before using indirect_return attribute
> > > in .  It works when the indirect_return attribute is
> > > backported to GCC 8.
> > > 
> > > OK for trunk?
> > 
> > No.  Use
> > #ifdef __has_attribute
> > #if __has_attribute (indirect_return)
> > ...
> > #endif
> > #endif
> > instead, like for any other attribute.
> 
> That doesn't work because indirect_return is not in the implementation
> namespace and expanded in this context. I assume that __has_attribute
> (__indirect_return__) would work, though.
> 
> Could we add:
> 
> #ifdef __has_attribute
> # define __glibc_has_attribute(attr) __has_attribute (attr)
> #else
> # define __glibc_has_attribute 0

# define __glibc_has_attribute(attr) 0
instead, otherwise you get errors.

> #endif
> 
> And then use this:
> 
> #if __glibc_has_attribute (__indirect_return__)
> 
> Would that still work?

Sure, you can do this.
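
Putting the corrected fallback together, a header could look roughly like
the sketch below.  The my_ and MY_ names, swap_hook and
have_indirect_return are hypothetical and chosen for illustration; this is
not the glibc definition.

```c
/* Function-like in both branches, so a use such as
   "#if my_has_attribute (x)" stays well-formed when the compiler has no
   __has_attribute.  */
#ifdef __has_attribute
# define my_has_attribute(attr) __has_attribute (attr)
#else
# define my_has_attribute(attr) 0
#endif

/* Use the reserved spelling so application macros cannot interfere.  */
#if my_has_attribute (__indirect_return__)
# define MY_ATTR_INDIRECT_RETURN __attribute__ ((__indirect_return__))
# define MY_HAVE_INDIRECT_RETURN 1
#else
# define MY_ATTR_INDIRECT_RETURN
# define MY_HAVE_INDIRECT_RETURN 0
#endif

struct ucontext;

/* Declaration only, mirroring the shape of the test case in this
   thread.  */
extern int (*swap_hook) (struct ucontext *) MY_ATTR_INDIRECT_RETURN;

/* Report at run time which branch was taken.  */
int have_indirect_return (void) { return MY_HAVE_INDIRECT_RETURN; }
```

On compilers or targets without the attribute, the macro expands to
nothing and the declaration still compiles unchanged.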

Jakub


Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread Jakub Jelinek
On Thu, Jul 19, 2018 at 01:54:46PM +0200, Florian Weimer wrote:
> On 07/19/2018 01:48 PM, H.J. Lu wrote:
> > Both __has_attribute (indirect_return) and __has_attribute 
> > (__indirect_return__)
> > work here.
> 
> Applications can have
> 
> #define indirect_return
> 
> so the variant without underscore mangling is definitely not correct.

Incorrect for what?  glibc header?  Yes.  The libsanitizer use, where we
control the headers and what we define?  No.

Jakub


Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread H.J. Lu
On Thu, Jul 19, 2018 at 4:56 AM, Jakub Jelinek  wrote:
> On Thu, Jul 19, 2018 at 01:54:46PM +0200, Florian Weimer wrote:
>> On 07/19/2018 01:48 PM, H.J. Lu wrote:
>> > Both __has_attribute (indirect_return) and __has_attribute 
>> > (__indirect_return__)
>> > work here.
>>
>> Applications can have
>>
>> #define indirect_return
>>
>> so the variant without underscore mangling is definitely not correct.
>
> Incorrect for what?  glibc header?  Yes.  The libsanitizer use, where we
> control the headers and what we define?  No.
>
> Jakub

I am checking my testcases to show how it works.

-- 
H.J.


Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Glen Fernandes
Updated patch to simplify the helper trait, and to include 
instead of  in the unit test for copy_uninitialized:

Use __builtin_memmove for trivially copy assignable types

2018-07-19  Glen Joseph Fernandes  

* include/bits/stl_algobase.h
(__is_simple_copy_move): Defined helper.
(__copy_move_a): Used helper.
(__copy_move_backward_a): Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
New test.
* testsuite/25_algorithms/copy/58982.cc: Updated tests.
* testsuite/25_algorithms/copy_n/58982.cc: Likewise.

Attached: patch.txt

Glen
commit 1af8d465545fda2451928fe100901db37c3e632c
Author: Glen Fernandes 
Date:   Thu Jul 19 07:40:17 2018 -0400

Use __builtin_memmove for trivially copy assignable types

2018-07-19  Glen Joseph Fernandes  

* include/bits/stl_algobase.h
(__is_simple_copy_move): Defined helper.
(__copy_move_a): Used helper.
(__copy_move_backward_a): Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
New test.
* testsuite/25_algorithms/copy/58982.cc: Updated tests.
* testsuite/25_algorithms/copy_n/58982.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index 16a3f83b6..4488207f0 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -72,10 +72,16 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+  template<typename _Tp>
+struct __is_simple_copy_move
+{
+  enum { __value = __is_trivially_assignable(_Tp, const _Tp&) };
+};
+
 #if __cplusplus < 201103L
   // See http://gcc.gnu.org/ml/libstdc++/2004-08/msg00167.html: in a
   // nutshell, we are partially implementing the resolution of DR 187,
   // when it's safe, i.e., the value_types are equal.
   template
@@ -389,11 +395,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __copy_move_a(_II __first, _II __last, _OI __result)
 {
   typedef typename iterator_traits<_II>::value_type _ValueTypeI;
   typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
   typedef typename iterator_traits<_II>::iterator_category _Category;
-  const bool __simple = (__is_trivial(_ValueTypeI)
+  const bool __simple = (__is_simple_copy_move<_ValueTypeI>::__value
 && __is_pointer<_II>::__value
 && __is_pointer<_OI>::__value
 && __are_same<_ValueTypeI, _ValueTypeO>::__value);
 
   return std::__copy_move<_IsMove, __simple,
@@ -591,11 +597,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __copy_move_backward_a(_BI1 __first, _BI1 __last, _BI2 __result)
 {
   typedef typename iterator_traits<_BI1>::value_type _ValueType1;
   typedef typename iterator_traits<_BI2>::value_type _ValueType2;
   typedef typename iterator_traits<_BI1>::iterator_category _Category;
-  const bool __simple = (__is_trivial(_ValueType1)
+  const bool __simple = (__is_simple_copy_move<_ValueType1>::__value
 && __is_pointer<_BI1>::__value
 && __is_pointer<_BI2>::__value
 && __are_same<_ValueType1, _ValueType2>::__value);
 
   return std::__copy_move_backward<_IsMove, __simple,
diff --git 
a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc 
b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc
new file mode 100644
index 0..ec681879f
--- /dev/null
+++ 
b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc
@@ -0,0 +1,38 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do compile { target c++11 } }
+
+#include 
+#include 
+
+struct T
+{
+  T() { }
+  T(const T&) = delete;
+};
+
+static_assert(std::is_trivially_copy_assignable<T>::value &&
+  !__is_trivial(T), "T is only trivially copy assignable");
+
+void
+test01(T* result)
+{
+  T t[1];
+  std::uninitialized_copy(t, t+1, result); // { dg-error "here" }
+}
+// { dg-prune-output "use of deleted function" }
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/58982.cc 
b/libstdc++-v3/testsuite/25_algorithms/copy/58982.cc
index

[PATCH] SCCVN data-structure TLC

2018-07-19 Thread Richard Biener


The following does away with copying hashtable elements when moving them
from the optimistic hashtable to the valid one.  It does that by
keeping only a single global obstack for all phi, ref and nary
elements and freeing things between iterations by unwinding that obstack.
In this process phis and refs no longer use an alloc-pool and I changed
phis to have arguments as a trailing array.

All nice and dandy but vn_nary_build_or_lookup_1 throws a wrench in
because of its need to insert to the valid tables.  So I need to
allocate those nary elements on a separate obstack.  Bah.

The patch is big enough, so further cleanups (making ref operands a trailing
array, doing away with the two sets of hashtables) will be follow-ups.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.
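
To make the two ideas concrete, here is a small stand-alone sketch, not the
code from the patch.  Arena is a hypothetical bump allocator standing in for
an obstack (GCC uses the real obstack API), and Phi is a made-up record whose
arguments live in a trailing array, so one allocation holds both the header
and the arguments.  Rewinding to a saved mark discards everything allocated
since, which is the "unwind between iterations" step described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical bump allocator standing in for an obstack.
class Arena
{
public:
  explicit Arena(std::size_t capacity)
    : base_(static_cast<unsigned char*>(std::malloc(capacity))),
      capacity_(capacity), used_(0)
  { assert(base_ != nullptr); }

  ~Arena() { std::free(base_); }

  Arena(const Arena&) = delete;
  Arena& operator=(const Arena&) = delete;

  // Hand out N bytes, rounded up so every block stays maximally aligned.
  void* alloc(std::size_t n)
  {
    const std::size_t align = alignof(std::max_align_t);
    n = (n + align - 1) / align * align;
    if (n > capacity_ - used_)
      return nullptr;
    void* p = base_ + used_;
    used_ += n;
    return p;
  }

  // Remember the current top, and later drop everything above it.
  std::size_t mark() const { return used_; }
  void rewind(std::size_t m) { assert(m <= used_); used_ = m; }

private:
  unsigned char* base_;
  std::size_t capacity_;
  std::size_t used_;
};

// Record with its arguments as a trailing array (really NARGS elements).
struct Phi
{
  unsigned nargs;
  int args[1];
};

// Allocate header and arguments in one block from the arena.
inline Phi* make_phi(Arena& arena, unsigned nargs)
{
  const std::size_t extra = nargs > 0 ? nargs - 1 : 0;
  void* mem = arena.alloc(sizeof(Phi) + extra * sizeof(int));
  if (mem == nullptr)
    return nullptr;
  Phi* phi = static_cast<Phi*>(mem);
  phi->nargs = nargs;
  for (unsigned i = 0; i < nargs; ++i)
    phi->args[i] = 0;
  return phi;
}
```

Nothing allocated this way needs an individual free or a hashtable remove
hook; elements that must survive a rewind have to come from a second arena,
which is the role of the separate insert obstack mentioned above.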

2018-07-19  Richard Biener  

* tree-ssa-sccvn.h (struct vn_phi_s): Make phiargs member
a trailing array.
* tree-ssa-sccvn.c: Remove alloc-pool.h use.
(vn_phi_hasher): Derive from nofree_ptr_hash and remove remove method.
(vn_reference_hasher): Likewise.
(struct vn_tables_s): Remove obstack and alloc-pool members.
(vn_tables_obstack, vn_tables_insert_obstack): New global obstacks.
(vn_nary_build_or_lookup_1): Manually build in vn_tables_insert_obstack.
(vn_reference_insert): Allocate from obstack instead of from alloc-pool.
(vn_reference_insert_pieces): Likewise.
(alloc_vn_nary_op_noinit): Adjust.
(vn_nary_op_insert_stmt): Allocate phiargs in-place.
(vn_phi_eq): Adjust.
(shared_lookup_phiargs): Remove.
(vn_phi_lookup): Allocate temporary vn_phi_s on the stack.
(vn_phi_insert): Allocate from obstack instead of from alloc-pool.
(visit_reference_op_call): Likewise.
(copy_nary, copy_phi, copy_reference): Remove.
(process_scc): Rewind the obstack when iterating.  Do not
copy the elements to valid_info but just move them from one
hashtable to the other.
(allocate_vn_table): Adjust.
(free_vn_table): Likewise.
(init_scc_vn): Likewise.
(free_scc_vn): Likewise.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 262871)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -25,7 +25,6 @@ along with GCC; see the file COPYING3.
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
-#include "alloc-pool.h"
 #include "ssa.h"
 #include "expmed.h"
 #include "insn-config.h"
@@ -169,11 +168,10 @@ typedef vn_nary_op_table_type::iterator
 static int
 vn_phi_eq (const_vn_phi_t const vp1, const_vn_phi_t const vp2);
 
-struct vn_phi_hasher : pointer_hash <vn_phi_s>
+struct vn_phi_hasher : nofree_ptr_hash <vn_phi_s>
 { 
   static inline hashval_t hash (const vn_phi_s *);
   static inline bool equal (const vn_phi_s *, const vn_phi_s *);
-  static inline void remove (vn_phi_s *);
 };
 
 /* Return the computed hashcode for phi operation P1.  */
@@ -192,14 +190,6 @@ vn_phi_hasher::equal (const vn_phi_s *vp
   return vn_phi_eq (vp1, vp2);
 }
 
-/* Free a phi operation structure VP.  */
-
-inline void
-vn_phi_hasher::remove (vn_phi_s *phi)
-{
-  phi->phiargs.release ();
-}
-
 typedef hash_table<vn_phi_hasher> vn_phi_table_type;
 typedef vn_phi_table_type::iterator vn_phi_iterator_type;
 
@@ -235,11 +225,10 @@ free_reference (vn_reference_s *vr)
 
 /* vn_reference hashtable helpers.  */
 
-struct vn_reference_hasher : pointer_hash <vn_reference_s>
+struct vn_reference_hasher : nofree_ptr_hash <vn_reference_s>
 {
   static inline hashval_t hash (const vn_reference_s *);
   static inline bool equal (const vn_reference_s *, const vn_reference_s *);
-  static inline void remove (vn_reference_s *);
 };
 
 /* Return the hashcode for a given reference operation P1.  */
@@ -256,26 +245,17 @@ vn_reference_hasher::equal (const vn_ref
   return vn_reference_eq (v, c);
 }
 
-inline void
-vn_reference_hasher::remove (vn_reference_s *v)
-{
-  free_reference (v);
-}
-
 typedef hash_table<vn_reference_hasher> vn_reference_table_type;
 typedef vn_reference_table_type::iterator vn_reference_iterator_type;
 
 
-/* The set of hashtables and alloc_pool's for their items.  */
+/* The set of VN hashtables.  */
 
 typedef struct vn_tables_s
 {
   vn_nary_op_table_type *nary;
   vn_phi_table_type *phis;
   vn_reference_table_type *references;
-  struct obstack nary_obstack;
-  object_allocator<vn_phi_s> *phis_pool;
-  object_allocator<vn_reference_s> *references_pool;
 } *vn_tables_t;
 
 
@@ -310,25 +290,26 @@ static hash_table *c
 static bitmap constant_value_ids;
 
 
+/* Obstack we allocate the vn-tables elements from.  */
+static obstack vn_tables_obstack;
+/* Special obstack we never unwind.  */
+static obstack vn_tables_insert_obstack;
+
 /* Valid hashtables storing information we have proven to be
correct.  */
-
 static vn_tables_t valid_info;
 
 /* Optimistic hashtables storing information we are making assumptions about
during iterations.  */
-
 static vn_tables_t optimistic_info;
 
 /*

Re: [PATCH] i386: Define __HAVE_INDIRECT_RETURN_ATTRIBUTE__

2018-07-19 Thread Florian Weimer

On 07/19/2018 01:56 PM, Jakub Jelinek wrote:
> On Thu, Jul 19, 2018 at 01:54:46PM +0200, Florian Weimer wrote:
>> On 07/19/2018 01:48 PM, H.J. Lu wrote:
>>> Both __has_attribute (indirect_return) and __has_attribute (__indirect_return__)
>>> work here.
>>
>> Applications can have
>>
>> #define indirect_return
>>
>> so the variant without underscore mangling is definitely not correct.
>
> Incorrect for what?  glibc header?  Yes.  The libsanitizer use, where we
> control the headers and what we define?  No.

Agreed.  I somehow missed that this wasn't for an installed glibc header.

Thanks,
Florian


Re: [PATCH 1/2] v5: Add "optinfo" framework

2018-07-19 Thread Richard Biener
On Wed, Jul 11, 2018 at 12:53 PM David Malcolm  wrote:
>
> Changes relative to v4:
> * eliminated optinfo subclasses as discussed
> * eliminated optinfo-internal.h, moving what remained into optinfo.h
> * added support for dump_gimple_expr_loc and dump_gimple_expr
> * more selftests
>
> This patch implements a way to consolidate dump_* calls into
> optinfo objects, as enabling work towards being able to write out
> optimization records to a file (I'm focussing on that destination
> in this patch kit, rather than diagnostic remarks).
>
> The patch adds the support for building optinfo instances from dump_*
> calls, but leaves implementing any *users* of them to followup patches.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk?

dump_context::get ().dump_symtab_node and friends are a bit
visually disturbing.  They are well-hidden so I guess I simply
look away for a second ;)

Otherwise looks very good now, thus...

... OK.

Thanks and sorry for the delay in reviewing.
Richard.
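
For readers who have not looked at the patch, the shape being discussed is
roughly the following.  This is a simplified sketch under assumptions, not
the code from dumpfile.c: the names dump_context, get, s_current, s_default
and temp_dump_context come from the ChangeLog below, but dump_text, pending
and the std::string buffer are invented here to keep the example
self-contained.

```cpp
#include <cassert>
#include <string>

namespace sketch {

class dump_context
{
public:
  // All dump calls go through whichever context is current.
  static dump_context& get() { return *s_current; }

  // Invented for this sketch: accumulate text instead of building an optinfo.
  void dump_text(const std::string& text) { m_pending += text; }
  const std::string& pending() const { return m_pending; }

private:
  friend class temp_dump_context;
  static dump_context* s_current;
  static dump_context s_default;
  std::string m_pending;
};

dump_context dump_context::s_default;
dump_context* dump_context::s_current = &dump_context::s_default;

// Free-function wrapper: callers never spell out dump_context::get ().
inline void dump_text(const std::string& text)
{
  dump_context::get().dump_text(text);
}

// RAII helper for tests: redirect dump calls to a private context.
class temp_dump_context
{
public:
  temp_dump_context() : m_saved(dump_context::s_current)
  { dump_context::s_current = &m_context; }

  ~temp_dump_context()
  {
    // Temporary contexts are expected to nest strictly.
    assert(dump_context::s_current == &m_context);
    dump_context::s_current = m_saved;
  }

  dump_context m_context;

private:
  dump_context* m_saved;
};

} // namespace sketch
```

With the wrapper in place, the get () indirection is confined to one spot,
and a selftest can capture dump calls by putting a temp_dump_context on the
stack for the duration of the test.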

> gcc/ChangeLog:
> * Makefile.in (OBJS): Add optinfo.o.
> * coretypes.h (class symtab_node): New forward decl.
> (struct cgraph_node): New forward decl.
> (class varpool_node): New forward decl.
> * dump-context.h: New file.
> * dumpfile.c: Include "optinfo.h", "dump-context.h", "cgraph.h",
> "tree-pass.h".
> (refresh_dumps_are_enabled): Use optinfo_enabled_p.
> (set_dump_file): Call dumpfile_ensure_any_optinfo_are_flushed.
> (set_alt_dump_file): Likewise.
> (dump_context::~dump_context): New dtor.
> (dump_gimple_stmt): Move implementation to...
> (dump_context::dump_gimple_stmt): ...this new member function.
> Add the stmt to any pending optinfo, creating one if need be.
> (dump_gimple_stmt_loc): Move implementation to...
> (dump_context::dump_gimple_stmt_loc): ...this new member function.
> Start a new optinfo and add the stmt to it.
> (dump_gimple_expr): Move implementation to...
> (dump_context::dump_gimple_expr): ...this new member function.
> Add the stmt to any pending optinfo, creating one if need be.
> (dump_gimple_expr_loc): Move implementation to...
> (dump_context::dump_gimple_expr_loc): ...this new member function.
> Start a new optinfo and add the stmt to it.
> (dump_generic_expr): Move implementation to...
> (dump_context::dump_generic_expr): ...this new member function.
> Add the tree to any pending optinfo, creating one if need be.
> (dump_generic_expr_loc): Move implementation to...
> (dump_context::dump_generic_expr_loc): ...this new member
> function.  Add the tree to any pending optinfo, creating one if
> need be.
> (dump_printf): Move implementation to...
> (dump_context::dump_printf_va): ...this new member function.  Add
> the text to any pending optinfo, creating one if need be.
> (dump_printf_loc): Move implementation to...
> (dump_context::dump_printf_loc_va): ...this new member function.
> Start a new optinfo and add the stmt to it.
> (dump_dec): Move implementation to...
> (dump_context::dump_dec): ...this new member function.  Add the
> value to any pending optinfo, creating one if need be.
> (dump_context::dump_symtab_node): New member function.
> (dump_context::get_scope_depth): New member function.
> (dump_context::begin_scope): New member function.
> (dump_context::end_scope): New member function.
> (dump_context::ensure_pending_optinfo): New member function.
> (dump_context::begin_next_optinfo): New member function.
> (dump_context::end_any_optinfo): New member function.
> (dump_context::s_current): New global.
> (dump_context::s_default): New global.
> (dump_scope_depth): Delete global.
> (dumpfile_ensure_any_optinfo_are_flushed): New function.
> (dump_symtab_node): New function.
> (get_dump_scope_depth): Reimplement in terms of dump_context.
> (dump_begin_scope): Likewise.
> (dump_end_scope): Likewise.
> (selftest::temp_dump_context::temp_dump_context): New ctor.
> (selftest::temp_dump_context::~temp_dump_context): New dtor.
> (selftest::verify_item): New function.
> (ASSERT_IS_TEXT): New macro.
> (ASSERT_IS_TREE): New macro.
> (ASSERT_IS_GIMPLE): New macro.
> (selftest::test_capture_of_dump_calls): New test.
> (selftest::dumpfile_c_tests): Call it.
> * dumpfile.h (dump_printf, dump_printf_loc, dump_basic_block)
> (dump_generic_expr_loc, dump_generic_expr, dump_gimple_stmt_loc)
> (dump_gimple_stmt, dump_dec): Gather these related decls and add a
> descriptive comment.
> (dump_function, print_combine_total_stats, enable_rtl_dump_file)
>

Re: [PATCH][GCC][AARCH64] Canonicalize aarch64 widening simd plus insns

2018-07-19 Thread Matthew Malcomson

Hi again.

Providing an updated patch to include the formatting suggestions.

Thanks,

Matthew


On 12/07/18 11:39, Sudakshina Das wrote:
> Hi Matthew
>
> On 12/07/18 11:18, Richard Sandiford wrote:
>> Looks good to me FWIW (not a maintainer), just a minor formatting thing:
>>
>> Matthew Malcomson  writes:
>>> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
>>> index aac5fa146ed8dde4507a0eb4ad6a07ce78d2f0cd..67b29cbe2cad91e031ee23be656ec61a403f2cf9 100644
>>> --- a/gcc/config/aarch64/aarch64-simd.md
>>> +++ b/gcc/config/aarch64/aarch64-simd.md
>>> @@ -3302,38 +3302,78 @@
>>>     DONE;
>>>   })
>>> -(define_insn "aarch64_w"
>>> +(define_insn "aarch64_subw"
>>>     [(set (match_operand: 0 "register_operand" "=w")
>>> -    (ADDSUB: (match_operand: 1 "register_operand" "w")
>>> -    (ANY_EXTEND:
>>> -  (match_operand:VD_BHSI 2 "register_operand" "w"]
>>> +    (minus:
>>> + (match_operand: 1 "register_operand" "w")
>>> + (ANY_EXTEND:
>>> +   (match_operand:VD_BHSI 2 "register_operand" "w"]
>>
>> The (minus should be under the "(match_operand":
>>
>> (define_insn "aarch64_subw"
>>    [(set (match_operand: 0 "register_operand" "=w")
>> (minus: (match_operand: 1 "register_operand" "w")
>>    (ANY_EXTEND:
>>  (match_operand:VD_BHSI 2 "register_operand" "w"]
>>
>> Same for the other patterns.
>>
>> Thanks,
>> Richard
>
> You will need a maintainer's approval but this looks good to me.
> Thanks for doing this. I would only point out one other nit which you
> can choose to ignore:
>
>> +/* Ensure
>> +   saddw2 and one saddw for the function add()
>> +   ssubw2 and one ssubw for the function subtract()
>> +   uaddw2 and one uaddw for the function uadd()
>> +   usubw2 and one usubw for the function usubtract() */
>> +
>> +/* { dg-final { scan-assembler-times "\[ \t\]ssubw2\[ \t\]+" 1 } } */
>> +/* { dg-final { scan-assembler-times "\[ \t\]ssubw\[ \t\]+" 1 } } */
>> +/* { dg-final { scan-assembler-times "\[ \t\]saddw2\[ \t\]+" 1 } } */
>> +/* { dg-final { scan-assembler-times "\[ \t\]saddw\[ \t\]+" 1 } } */
>> +/* { dg-final { scan-assembler-times "\[ \t\]usubw2\[ \t\]+" 1 } } */
>> +/* { dg-final { scan-assembler-times "\[ \t\]usubw\[ \t\]+" 1 } } */
>> +/* { dg-final { scan-assembler-times "\[ \t\]uaddw2\[ \t\]+" 1 } } */
>> +/* { dg-final { scan-assembler-times "\[ \t\]uaddw\[ \t\]+" 1 } } */
>
> The scan-assembly directives for the different
> functions can be placed right below each of them and that would
> make it easier to read the expected results in the test and you
> can get rid of the comments saying the same.
>
> Thanks
> Sudi


diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b5c551ad650e1a83416d5fbbbdd38e3fa3beb532..1f356d04d5b3542ac9ce51bc315c81d1fff91f21 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3303,38 +3303,74 @@
   DONE;
 })
 
-(define_insn "aarch64_w"
+(define_insn "aarch64_subw"
   [(set (match_operand: 0 "register_operand" "=w")
-(ADDSUB: (match_operand: 1 "register_operand" "w")
-			(ANY_EXTEND:
-			  (match_operand:VD_BHSI 2 "register_operand" "w"]
+	(minus: (match_operand: 1 "register_operand" "w")
+	  (ANY_EXTEND:
+	(match_operand:VD_BHSI 2 "register_operand" "w"]
   "TARGET_SIMD"
-  "w\\t%0., %1., %2."
-  [(set_attr "type" "neon__widen")]
+  "subw\\t%0., %1., %2."
+  [(set_attr "type" "neon_sub_widen")]
 )
 
-(define_insn "aarch64_w_internal"
+(define_insn "aarch64_subw_internal"
   [(set (match_operand: 0 "register_operand" "=w")
-(ADDSUB: (match_operand: 1 "register_operand" "w")
-			(ANY_EXTEND:
-			  (vec_select:
-			   (match_operand:VQW 2 "register_operand" "w")
-			   (match_operand:VQW 3 "vect_par_cnst_lo_half" "")]
+	(minus: (match_operand: 1 "register_operand" "w")
+	  (ANY_EXTEND:
+	(vec_select:
+	  (match_operand:VQW 2 "register_operand" "w")
+	  (match_operand:VQW 3 "vect_par_cnst_lo_half" "")]
   "TARGET_SIMD"
-  "w\\t%0., %1., %2."
-  [(set_attr "type" "neon__widen")]
+  "subw\\t%0., %1., %2."
+  [(set_attr "type" "neon_sub_widen")]
 )
 
-(define_insn "aarch64_w2_internal"
+(define_insn "aarch64_subw2_internal"
   [(set (match_operand: 0 "register_operand" "=w")
-(ADDSUB: (match_operand: 1 "register_operand" "w")
-			(ANY_EXTEND:
-			  (vec_select:
-			   (match_operand:VQW 2 "register_operand" "w")
-			   (match_operand:VQW 3 "vect_par_cnst_hi_half" "")]
+	(minus: (match_operand: 1 "register_operand" "w")
+	  (ANY_EXTEND:
+	(vec_select:
+	  (match_operand:VQW 2 "register_operand" "w")
+	  (match_operand:VQW 3 "vect_par_cnst_hi_half" "")]
   "TARGET_SIMD"
-  "w2\\t%0., %1., %2."
-  [(set_attr "type" "neon__widen")]
+  "subw2\\t%0., %1., %2."
+  [(set_attr "type" "neon_sub_widen")]
+)
+
+(define_insn "aarch64_addw"
+  [(set (match_operand: 0 "register_operand" "=w")
+	(plus:
+	  (ANY_EXTEND: (match_operand:VD_BHSI 2 "register_operand" "w"))
+	  (match_operand: 1 "register_operand" "w")))]
+  "TARGET_SIMD"
+  "addw\\t%0., %1., %2."
+  [

Re: [PATCH 2/2] Add "-fsave-optimization-record"

2018-07-19 Thread Richard Biener
On Wed, Jul 11, 2018 at 12:53 PM David Malcolm  wrote:
>
> This patch implements a -fsave-optimization-record option, which
> leads to a JSON file being written out, recording the dump_* calls
> made (via the optinfo infrastructure in the previous patch).
>
> The patch includes a minimal version of the JSON patch I posted last
> year, with just enough support needed for optimization records (I
> removed all of the parser code, leaving just the code for building
> in-memory JSON trees and writing them to a pretty_printer).
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk?

+@item -fsave-optimization-record
+@opindex fsave-optimization-record
+Write a SRCFILE.opt-record.json file detailing what optimizations
+were performed.
+

I guess this should note that it is experimental and in no way
complete.  Maybe list areas where reports will be generated,
like vectorization?

Did you check what happens with -flto -fsave-optimization-record?
Will the compile-phase emit a json file for each source (expected,
like with early inlining decisions)?  Will the WPA phase emit one
(IPA decisions?) or will IPA decisions be recorded in the LTRANS
one?  How will the LTRANS ones be named and where can they
be found?  You don't need to solve all the issues with this patch
but they should be eventually addressed somehow.

I don't question the use or implementation of JSON, I'll just
approve it.

The rest looks obvious enough, thus OK.

Some overall blurb in the documentation or changes.html
on how to use this would be nice of course.

Thanks,
Richard.
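
As a rough picture of what "building in-memory JSON trees and writing them
out" amounts to, here is a minimal sketch.  It is not the json.h from the
patch: the class names, the std::string sink (GCC writes to a
pretty_printer) and the record layout in the usage note are all assumptions
made for this example, and string escaping is deliberately incomplete.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Append TEXT to OUT as a quoted JSON string.  Only '"' and '\' are
// escaped; a real writer must also handle control characters.
inline void print_quoted(const std::string& text, std::string& out)
{
  out += '"';
  for (std::size_t i = 0; i < text.size(); ++i)
    {
      if (text[i] == '"' || text[i] == '\\')
	out += '\\';
      out += text[i];
    }
  out += '"';
}

struct value
{
  virtual ~value() { }
  virtual void print(std::string& out) const = 0;
};

struct string_value : value
{
  explicit string_value(const std::string& s) : text(s) { }
  void print(std::string& out) const override { print_quoted(text, out); }
  std::string text;
};

struct array_value : value
{
  void append(std::unique_ptr<value> v)
  {
    assert(v != nullptr);
    elements.push_back(std::move(v));
  }
  void print(std::string& out) const override
  {
    out += '[';
    for (std::size_t i = 0; i < elements.size(); ++i)
      {
	if (i != 0)
	  out += ", ";
	elements[i]->print(out);
      }
    out += ']';
  }
  std::vector<std::unique_ptr<value> > elements;
};

struct object_value : value
{
  // Members keep insertion order; duplicate keys are not checked here.
  void set(const std::string& key, std::unique_ptr<value> v)
  {
    assert(v != nullptr);
    members.emplace_back(key, std::move(v));
  }
  void print(std::string& out) const override
  {
    out += '{';
    for (std::size_t i = 0; i < members.size(); ++i)
      {
	if (i != 0)
	  out += ", ";
	print_quoted(members[i].first, out);
	out += ": ";
	members[i].second->print(out);
      }
    out += '}';
  }
  std::vector<std::pair<std::string, std::unique_ptr<value> > > members;
};

} // namespace sketch
```

A record for one dump call could then be an object_value with, say, a
"pass" string and a "message" array; the key names are made up here and the
actual file layout is whatever optinfo-emit-json.cc produces.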


> gcc/ChangeLog:
> * Makefile.in (OBJS): Add json.o and optinfo-emit-json.o.
> (CFLAGS-optinfo-emit-json.o): Define TARGET_NAME.
> * common.opt (fsave-optimization-record): New option.
> * coretypes.h (struct kv_pair): Move here from dumpfile.c.
> * doc/invoke.texi (-fsave-optimization-record): New option.
> * dumpfile.c: Include "optinfo-emit-json.h".
> (struct kv_pair): Move to coretypes.h.
> (optgroup_options): Make non-static.
> (dump_context::end_scope): Call
> optimization_records_maybe_pop_dump_scope.
> * dumpfile.h (optgroup_options): New decl.
> * json.cc: New file.
> * json.h: New file.
> * optinfo-emit-json.cc: New file.
> * optinfo-emit-json.h: New file.
> * optinfo.cc: Include "optinfo-emit-json.h".
> (optinfo::emit): Call optimization_records_maybe_record_optinfo.
> (optinfo_enabled_p): Check optimization_records_enabled_p.
> (optinfo_wants_inlining_info_p): Likewise.
> * optinfo.h: Update comment.
> * profile-count.c (profile_quality_as_string): New function.
> * profile-count.h (profile_quality_as_string): New decl.
> (profile_count::quality): New accessor.
> * selftest-run-tests.c (selftest::run_tests): Call json_cc_tests
> and optinfo_emit_json_cc_tests.
> * selftest.h (selftest::json_cc_tests): New decl.
> (selftest::optinfo_emit_json_cc_tests): New decl.
> * toplev.c: Include "optinfo-emit-json.h".
> (compile_file): Call optimization_records_finish.
> (do_compile): Call optimization_records_start.
> * tree-ssa-live.c: Include optinfo.h.
> (remove_unused_scope_block_p): Retain inlining information if
> optinfo_wants_inlining_info_p returns true.
> ---
>  gcc/Makefile.in  |   3 +
>  gcc/common.opt   |   4 +
>  gcc/coretypes.h  |   8 +
>  gcc/doc/invoke.texi  |   8 +-
>  gcc/dumpfile.c   |  15 +-
>  gcc/dumpfile.h   |   3 +
>  gcc/json.cc  | 293 
>  gcc/json.h   | 166 ++
>  gcc/optinfo-emit-json.cc | 568 
> +++
>  gcc/optinfo-emit-json.h  |  36 +++
>  gcc/optinfo.cc   |  11 +-
>  gcc/optinfo.h|   4 -
>  gcc/profile-count.c  |  28 +++
>  gcc/profile-count.h  |   5 +
>  gcc/selftest-run-tests.c |   2 +
>  gcc/selftest.h   |   2 +
>  gcc/toplev.c |   5 +
>  gcc/tree-ssa-live.c  |   4 +-
>  18 files changed, 1143 insertions(+), 22 deletions(-)
>  create mode 100644 gcc/json.cc
>  create mode 100644 gcc/json.h
>  create mode 100644 gcc/optinfo-emit-json.cc
>  create mode 100644 gcc/optinfo-emit-json.h
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index dd1dfc1..b871640 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1387,6 +1387,7 @@ OBJS = \
> ira-color.o \
> ira-emit.o \
> ira-lives.o \
> +   json.o \
> jump.o \
> langhooks.o \
> lcm.o \
> @@ -1428,6 +1429,7 @@ OBJS = \
> optabs-query.o \
> optabs-tree.o \
> optinfo.o \
> +   optinfo-emit-json.o \
> options-save.o \
> opts-global.o \
> passes.o \
> @@ -2251,6 +2253,7 @@ s-bversion: BASE-VER
> $(STA

Re: [PATCH] Call REAL(swapcontext) with indirect_return attribute on x86

2018-07-19 Thread H.J. Lu
On Wed, Jul 18, 2018 at 12:34:28PM -0700, Kostya Serebryany wrote:
> On Wed, Jul 18, 2018 at 12:29 PM H.J. Lu  wrote:
> >
> > On Wed, Jul 18, 2018 at 11:45 AM, Kostya Serebryany  wrote:
> > > On Wed, Jul 18, 2018 at 11:40 AM H.J. Lu  wrote:
> > >>
> > >> On Wed, Jul 18, 2018 at 11:18 AM, Kostya Serebryany  
> > >> wrote:
> > >> > What's ENDBR and do we really need to have it in compiler-rt?
> > >>
> > >> When shadow stack from Intel CET is enabled,  the first instruction of
> > >> all indirect branch targets must be a special instruction, ENDBR.  In
> > >> this case,
> > >
> > > I am confused.
> > > CET is a security mitigation feature (and ENDBR is a pretty weak form of
> > > such), while ASAN is a testing tool, rarely used in production and almost
> > > never as a mitigation (which it is not!).
> > > Why would anyone need to combine CET and ASAN in one process?
> > >
> >
> > CET is transparent to ASAN.  It is perfectly OK to use -fcf-protection to
> > enable CET together with ASAN.
> 
> It is ok, but does it make any sense?
> If anything, the current ASAN's interceptors are a large blob of
> security vulnerabilities.
> If we ever want to use ASAN (or, more likely, HWASAN) as a security
> mitigation feature,
> we will need to get rid of these interceptors entirely.
> 
> 
> >
> > > Also, CET doesn't exist in the hardware yet, at least not publicly 
> > > available.
> > > Which means there should be no rush (am I wrong?) and we can do things
> > > in the correct order:
> > > implement the Clang/LLVM support, make the compiler-rt change in LLVM,
> > > merge back to GCC.
> >
> > I am working with our LLVM people to address this.
> 
> Cool!
> 

I am testing this patch and will submit it upstream.

H.J.
---
asan/asan_interceptors.cc has

...
  int res = REAL(swapcontext)(oucp, ucp);
...

REAL(swapcontext) is a function pointer to swapcontext in libc.  Since
swapcontext may return via an indirect branch on x86 when shadow stack is
enabled, we need to call REAL(swapcontext) with the indirect_return attribute
on x86 so that the compiler can insert ENDBR after the REAL(swapcontext) call.

PR target/86560
* asan/asan_interceptors.cc (swapcontext): Call REAL(swapcontext)
with indirect_return attribute on x86 if indirect_return attribute
is available.
---
 libsanitizer/asan/asan_interceptors.cc | 9 +
 1 file changed, 9 insertions(+)

diff --git a/libsanitizer/asan/asan_interceptors.cc 
b/libsanitizer/asan/asan_interceptors.cc
index a8f4b72723f..3ae473f210a 100644
--- a/libsanitizer/asan/asan_interceptors.cc
+++ b/libsanitizer/asan/asan_interceptors.cc
@@ -267,7 +267,16 @@ INTERCEPTOR(int, swapcontext, struct ucontext_t *oucp,
   uptr stack, ssize;
   ReadContextStack(ucp, &stack, &ssize);
   ClearShadowMemoryForContextStack(stack, ssize);
+#if defined(__has_attribute) && (defined(__x86_64__) || defined(__i386__))
+  int (*real_swapcontext) (struct ucontext_t *, struct ucontext_t *)
+# if __has_attribute (__indirect_return__)
+__attribute__((__indirect_return__))
+# endif
+= REAL(swapcontext);
+  int res = real_swapcontext(oucp, ucp);
+#else
   int res = REAL(swapcontext)(oucp, ucp);
+#endif
   // swapcontext technically does not return, but program may swap context to
   // "oucp" later, that would look as if swapcontext() returned 0.
   // We need to clear shadow for ucp once again, as it may be in arbitrary
-- 
2.17.1
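
The same pattern can be tried outside libsanitizer.  The sketch below is an
illustration under assumptions, not part of the patch: SKETCH_INDIRECT_RETURN,
add_one and call_real are hypothetical names, and add_one merely stands in
for a libc function reached through a pointer.  The feature test mirrors the
one in the hunk above, so on compilers or targets without the attribute the
macro expands to nothing and the call is an ordinary indirect call.

```cpp
#include <cassert>

// Use the attribute only where the compiler reports it, and only on x86.
#if defined(__has_attribute) && (defined(__x86_64__) || defined(__i386__))
# if __has_attribute (__indirect_return__)
#  define SKETCH_INDIRECT_RETURN __attribute__((__indirect_return__))
# endif
#endif
#ifndef SKETCH_INDIRECT_RETURN
# define SKETCH_INDIRECT_RETURN
#endif

// Hypothetical stand-in for the real function behind the pointer.
inline int add_one(int x) { return x + 1; }

// Copy the target into a local pointer whose type carries the attribute
// (when available), then call through that pointer.
inline int call_real(int (*target)(int), int x)
{
  assert(target != nullptr);
  int (*fn)(int) SKETCH_INDIRECT_RETURN = target;
  return fn(x);
}
```

For a function like add_one the attribute makes no observable difference;
it only matters for callees such as swapcontext that may come back through
an indirect branch.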



Re: [PATCH] Show valid options for -march and -mtune in --help=target for arm32 (PR driver/83193).

2018-07-19 Thread Martin Liška
On 07/19/2018 12:31 PM, Richard Earnshaw (lists) wrote:
> On 19/07/18 11:22, Martin Liška wrote:
>> On 07/19/2018 12:01 PM, Richard Earnshaw (lists) wrote:
>>> On 19/07/18 10:56, Martin Liška wrote:
>>>> On 07/19/2018 11:28 AM, Richard Earnshaw (lists) wrote:
>>>>> On 19/07/18 08:30, Martin Liška wrote:
>>>>>> This is correct version of the patch. Anyway, I'm thinking about the
>>>>>> ForceHelp attribute. I may do it in a bit different version. Let me
>>>>>> come up with one another version of the patch.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>
>>>>> I don't understand how this is supposed to work.  -mcpu, -march and
>>>>> -mtune all take strings now and have to be parsed to identify various
>>>>> sub-components of the parameter.  So why do you talk about these being
>>>>> enum types?
>>>>
>>>> Yes, they are string types. But for purpose of --help output, it's nice
>>>> to present to a user a list of possible values. That's the enum type.
>>>>
>>>> Please take a look at attached patch.
>>>>
>>>
>>> But that isn't the list of possible values.  Please see the manual.  A
>>> valid CPU name can look something like
>>>
>>> cortex-a53+crypto
>>>
>>> and architectures names can be even more complex.
>>>
>>> You can't get this from that list of enum values.
>>
>> I'm fully aware of the limitation, it's questionable whether you want to get:
>>
>> @@ -56,6 +56,9 @@
>>Known ARM ABIs (for use with the -mabi= option):
>>  aapcs aapcs-linux apcs-gnu atpcs iwmmxt
>>  
>> +  Known ARM architectures (for use with the -march= option):
>> +armv4 armv4t armv5t armv5te armv5tej armv6 armv6-m armv6j armv6k 
>> armv6kz armv6s-m armv6t2 armv6z armv6zk armv7 armv7-a armv7-m armv7-r 
>> armv7e-m armv7ve armv8-a armv8-m.base armv8-m.main armv8-r armv8.1-a 
>> armv8.2-a armv8.3-a armv8.4-a iwmmxt iwmmxt2 native
>> +
>>Known __fp16 formats (for use with the -mfp16-format= option):
>>  alternative ieee none
>>  
>> @@ -68,6 +71,12 @@
>>Known floating-point ABIs (for use with the -mfloat-abi= option):
>>  hard soft softfp
>>  
>> +  Known ARM CPUs (for use with the -mcpu= and -mtune= options):
>> +arm1020e arm1020t arm1022e arm1026ej-s arm10e arm10tdmi arm1136j-s 
>> arm1136jf-s arm1156t2-s arm1156t2f-s arm1176jz-s arm1176jzf-s arm710t 
>> arm720t arm740t arm7tdmi arm7tdmi-s arm8 arm810 arm9 arm920 arm920t arm922t 
>> arm926ej-s arm940t arm946e-s arm966e-s arm968e-s arm9e
>> +arm9tdmi cortex-a12 cortex-a15 cortex-a15.cortex-a7 cortex-a17 
>> cortex-a17.cortex-a7 cortex-a32 cortex-a35 cortex-a5 cortex-a53 cortex-a55 
>> cortex-a57 cortex-a57.cortex-a53 cortex-a7 cortex-a72 cortex-a72.cortex-a53 
>> cortex-a73 cortex-a73.cortex-a35 cortex-a73.cortex-a53
>> +cortex-a75 cortex-a75.cortex-a55 cortex-a76 cortex-a76.cortex-a55 
>> cortex-a8 cortex-a9 cortex-m0 cortex-m0.small-multiply cortex-m0plus 
>> cortex-m0plus.small-multiply cortex-m1 cortex-m1.small-multiply cortex-m23 
>> cortex-m3 cortex-m33 cortex-m4 cortex-m7 cortex-r4
>> +cortex-r4f cortex-r5 cortex-r52 cortex-r7 cortex-r8 ep9312 exynos-m1 
>> fa526 fa606te fa626 fa626te fa726te fmp626 generic-armv7-a iwmmxt iwmmxt2 
>> marvell-pj4 mpcore mpcorenovfp native strongarm strongarm110 strongarm1100 
>> strongarm1110 xgene1 xscale
>> +
>>TLS dialect to use:
>>  gnu gnu2
>>
>> I hope it's still beneficial for users.
> 
> Frankly, I find the list too long to be helpful.  I'd also prefer it if

One justification for this is that, with a very simple patch, we can
have bash completion finish a -march option value.

> we could come up with a more useful approach.  I've pondered if the
> following were possible:
> 
> In general target help, print
> 
>   For list of supported CPUs [Architectures] use -mcpu=help [-march=help]
> 
> And then, invoking the compiler gives that list in a more user-friendly
> fashion.  Finally, at the end we could have:
> 
> For CPU [Architecture]-specific extensions use -mcpu=+help
> [-march=+help]
> 
> and then it would show the specific extensions for that architecture.
> 
> It's relatively straight forward to do the back-end plumbing for this,
> but the help driver would have to know how to call into the back-end or
> for the back-end to be able to report to the midend that this was a help
> invocation not a normal run.  I couldn't find a simple way of doing that
> when I tried before.

Would you be able to implement that as a target_common hook? These are defined
in gcc/common/common-target.def. If so, I can then provide an API that
will use it.

Martin
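
To illustrate why a flat list of names cannot describe every accepted value,
here is a small sketch of splitting an option value such as
cortex-a53+crypto into a base name and its extensions.  It is not the arm
back-end parser: cpu_spec, parse_cpu_spec and has_extension are hypothetical
names, and the only rule assumed is that '+' separates the base name from
each extension.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: a base CPU/architecture name plus extensions.
struct cpu_spec
{
  std::string base;
  std::vector<std::string> extensions;
};

// Split TEXT at each '+'; the first piece is the base name.
inline cpu_spec parse_cpu_spec(const std::string& text)
{
  cpu_spec spec;
  std::string::size_type plus = text.find('+');
  spec.base = text.substr(0, plus);	// npos means "up to the end"
  while (plus != std::string::npos)
    {
      const std::string::size_type start = plus + 1;
      plus = text.find('+', start);
      const std::string::size_type len
	= (plus == std::string::npos) ? std::string::npos : plus - start;
      spec.extensions.push_back(text.substr(start, len));
    }
  return spec;
}

// True if NAME is one of the parsed extensions.
inline bool has_extension(const cpu_spec& spec, const std::string& name)
{
  assert(!name.empty());
  for (std::size_t i = 0; i < spec.extensions.size(); ++i)
    if (spec.extensions[i] == name)
      return true;
  return false;
}
```

A completion helper could match the base name against the printed list and
only then offer extensions for that base, which is roughly what the proposed
-mcpu=help style output would have to expose.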


> 
> R.
> 
>>
>> Martin
>>
>>>
>>> R.
>>>
>>>> Thanks,
>>>> Martin
>>>>
>>>>>
>>>>> R.
>>>>>
>>>>>>
>>>>>> 0001-Show-valid-options-for-march-and-mtune-in-help-targe-v3.patch
>>>>>>
>>>>>>
>>>>>> From 9bfc1400213911b4508e90198df7b2dd11efc85c Mon Sep 17 00:00:00 2001
>>>>>> From: marxin 
>>>>>> Date: Tue, 20 Feb 2018 10:39:09 +0100
>>>>>> Subject: [PATCH] Show valid options for -march and -mtune in 
>>>>>> --help=target for
>>>>>>  arm32 (PR d

RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-19 Thread Tamar Christina
Hi Jeff,

> -Original Message-
> From: Tamar Christina 
> Sent: Thursday, July 12, 2018 18:45
> To: Jeff Law 
> Cc: gcc-patches@gcc.gnu.org; nd ;
> jos...@codesourcery.com; bonz...@gnu.org; d...@redhat.com;
> nero...@gcc.gnu.org; aol...@redhat.com; ralf.wildenh...@gmx.de
> Subject: Re: [PATCH][GCC][front-end][build-machinery][opt-framework]
> Allow setting of stack-clash via configure options. [Patch (4/6)]
> 
> Hi Jeff,
> 
> The 07/11/2018 20:21, Jeff Law wrote:
> > On 07/11/2018 05:22 AM, Tamar Christina wrote:
> > > Hi All,
> > >
> > > This patch defines a configure option to allow the setting of the
> > > default guard size via configure flags when building the target.
> > >
> > > The new flag is:
> > >
> > >  * --with-stack-clash-protection-guard-size=
> > >
> > > The value of configured based params are set very early on and allow
> > > the target to validate or reject the values as it sees fit.
> > >
> > > To do this the values for the parameter get set by configure through CPP
> defines.
> > > In case the back-end wants to know if a value was set or not the
> > > original default value is also passed down as a define.
> > >
> > > This allows a target to check if a param was changed by the user at
> configure time.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-
> gnu and no issues.
> > > Both targets were tested with stack clash on and off by default.
> > >
> > > Ok for trunk?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/
> > > 2018-07-11  Tamar Christina  
> > >
> > >   PR target/86486
> > >   * configure.ac: Add stack-clash-protection-guard-size.
> > >   * config.in (DEFAULT_STK_CLASH_GUARD_SIZE,
> STK_CLASH_GUARD_SIZE_DEFAULT,
> > >   STK_CLASH_GUARD_SIZE_MAX, STK_CLASH_GUARD_SIZE_MIN):
> New.
> > >   * params.def (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE):
> Use it.
> > >   * configure: Regenerate.
> > >   * Makefile.in (params.list, params.options): Add include dir for CPP.
> > >   * params-list.h: Include auto-host.h
> > >   * params-options.h: Likewise.
> > >
> > Something seems wrong here.
> >
> > What's the purpose of including auto-host in params-list and
> > params-options?  It seems like you're putting a property of the target
> > (guard size) into the wrong place (auto-host.h).
> >
> 
> The reason for this is because there's a test gcc.dg/params/blocksort-part.c
> that uses these params-options to generate test cases to perform parameter
> validation. However because now the params.def file can contain a CPP
> macro these would then fail.
> 
> CPP is already called to create params-options and params-list so the easiest
> way to fix this test was just to include auto-host which would get it the 
> values
> from configure.
> 
> This test is probably not needed anymore after my second patch series as
> parameters are validated by the front-end now, so they can never go out of
> range.
> 
> > It's also a bit unclear to me why this is necessary at all.  Are we
> > planning to support both the 4k and 64k guards?  My goal (once the
> > guard was configurable) was never for supporting multiple sizes on a
> > target but instead to allow experimentation to find the right default.
> >

Having talked to people I believe we do need to support both 4k and 64k guards.
For the Linux/Glibc world it wouldn't matter much, either 4 or 64k would do, 
though Glibc has settled on 64k pages.

However other systems like (open/free)BSD or musl based systems do not want
64k pages but want 4k ones.  So we're ending up having to support both as a 
compromise.

Regards,
Tamar

> 
> I will get back to you on this one.
> 
> Thanks,
> Tamar
> 
> > Jeff
> 
> --


Re: [GCC][PATCH][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-07-19 Thread Sam Tebbs

Hi Richard,

Thanks for the feedback. I find that using "is_left_consecutive" is more 
descriptive than checking for it being a power of 2 - 1, since it 
describes the requirement (having consecutive ones from the MSB) more 
explicitly. I would be happy to change it though if that is the consensus.


I have addressed your point about just returning the string instead of 
using output_asm_insn and have changed it locally. I'll send an updated 
patch soon.



On 07/17/2018 02:33 AM, Richard Henderson wrote:

On 07/16/2018 10:10 AM, Sam Tebbs wrote:

+++ b/gcc/config/aarch64/aarch64.c
@@ -1439,6 +1439,14 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, 
unsigned,
  return SImode;
  }
  
+/* Implement IS_LEFT_CONSECUTIVE.  Check if an integer's bits are consecutive

+   ones from the MSB.  */
+bool
+aarch64_is_left_consecutive (HOST_WIDE_INT i)
+{
+  return (i | (i - 1)) == HOST_WIDE_INT_M1;
+}
+

...

+(define_insn "*aarch64_bfxil"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(ior:DI (and:DI (match_operand:DI 1 "register_operand" "r")
+   (match_operand 3 "const_int_operand"))
+   (and:DI (match_operand:DI 2 "register_operand" "0")
+   (match_operand 4 "const_int_operand"))))]
+  "INTVAL (operands[3]) == ~INTVAL (operands[4])
+&& aarch64_is_left_consecutive (INTVAL (operands[4]))"

Better is to use a define_predicate to merge both that second test and the
const_int_operand.

(I'm not sure about the "left_consecutive" language either.
Isn't it more descriptive to say that op3 is a power of 2 minus 1?)

(define_predicate "pow2m1_operand"
   (and (match_code "const_int")
(match_test "exact_pow2 (INTVAL(op) + 1) > 0")))

and use

   (match_operand:DI 3 "pow2m1_operand")

and then just the

   INTVAL (operands[3]) == ~INTVAL (operands[4])

test.
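
The relationship between the two formulations can be sketched outside GCC. The standalone C++ below is only an illustration: is_left_consecutive mirrors the patch's check using unsigned arithmetic, and is_pow2_minus_1 is a hypothetical stand-in for the suggested predicate. The function names and the fixed 64-bit width are assumptions, not GCC code.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the patch's test: true when the set bits form a single run that
// starts at the most significant bit.  Note that 0 also passes this test.
bool is_left_consecutive(std::uint64_t i)
{
  return (i | (i - 1)) == ~std::uint64_t(0);
}

// Hypothetical stand-in for the suggested predicate: true when x + 1 is a
// power of two greater than 1, i.e. x is a run of ones starting at bit 0.
bool is_pow2_minus_1(std::uint64_t x)
{
  const std::uint64_t y = x + 1;  // wraps to 0 when x is all ones
  return y > 1 && (y & (y - 1)) == 0;
}

// A low mask and its complement should satisfy the two tests together.
bool masks_agree(std::uint64_t low_mask)
{
  return is_pow2_minus_1(low_mask) && is_left_consecutive(~low_mask);
}

// Usage: 0xffff is 2^16 - 1, and its complement is ones from the MSB.
inline void mask_demo()
{
  assert(masks_agree(0xffff));
}
```

With op3 == ~op4 already required by the insn condition, testing op3 for "power of 2 minus 1" or op4 for "ones from the MSB" accepts the same pairs, apart from the degenerate all-zero and all-one masks.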

Also, don't omit the modes for the constants.
Also, there's no reason this applies only to DI mode;
use the GPI iterator and % in the output template.


+HOST_WIDE_INT op3 = INTVAL (operands[3]);
+operands[3] = GEN_INT (ceil_log2 (op3));
+output_asm_insn ("bfxil\\t%0, %1, 0, %3", operands);
+return "";

You can just return the string that you passed to output_asm_insn.


+  }
+  [(set_attr "type" "bfx")]

The other aliases of the BFM insn use type "bfm";
"bfx" appears to be aliases of UBFM and SBFM.
Not that it appears to matter to the scheduling
descriptions, but it is inconsistent.


r~




Re: Compilation error in simple-object-elf.c

2018-07-19 Thread Eli Zaretskii
> From: Richard Biener 
> Date: Thu, 19 Jul 2018 10:46:01 +0200
> Cc: DJ Delorie , GCC Patches , 
> gdb-patc...@sourceware.org
> 
> > *err = ENOTSUP;
> >^~~
> >  ./simple-object-elf.c:1284:14: note: each undeclared identifier is 
> > reported only once for each function it appears in
> >
> > Suggested fix:
> 
> Works for me, thus OK.  I'm going to check it in to make 8.2.

Thanks.

Joel/Pedro, is this okay for GDB's copy of libiberty, master and
branch?


Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Bernd Edlinger
> @@ -633,12 +642,17 @@ c_strlen (tree src, int only_value)
>   return ssize_int (0);
>  
>/* We don't know the starting offset, but we do know that the string
> -  has no internal zero bytes.  We can assume that the offset falls
> -  within the bounds of the string; otherwise, the programmer deserves
> -  what he gets.  Subtract the offset from the length of the string,
> -  and return that.  This would perhaps not be valid if we were dealing
> -  with named arrays in addition to literal string constants.  */
> -  return size_diffop_loc (loc, size_int (maxelts * eltsize), byteoff);
> +  has no internal zero bytes.  If the offset falls within the bounds
> +  of the string subtract the offset from the length of the string,
> +  and return that.  Otherwise the length is zero.  Take care to
> +  use SAVE_EXPR in case the OFFSET has side-effects.  */
> +  tree offsave = TREE_SIDE_EFFECTS (byteoff) ? save_expr (byteoff) : 
> byteoff;
> +  offsave = fold_convert (ssizetype, offsave);
> +  tree condexp = fold_build2_loc (loc, LE_EXPR, boolean_type_node, 
> offsave,
> +   build_int_cst (ssizetype, len * eltsize));
> +  tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize), 
> offsave);
> +  return fold_build3_loc (loc, COND_EXPR, ssizetype, condexp, lenexp,
> +   build_zero_cst (ssizetype));


This computes the number of bytes.
c_strlen is supposed to return number of (wide) characters:

/* Compute the length of a null-terminated character string or wide
character string handling character sizes of 1, 2, and 4 bytes.
TREE_STRING_LENGTH is not the right way because it evaluates to
the size of the character array in bytes (as opposed to characters)
and because it can contain a zero byte in the middle.
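
To make the bytes versus characters distinction concrete, here is a small standalone sketch. It is not GCC code and does not use trees; remaining_bytes and remaining_chars are hypothetical helpers, and the byte offset is assumed to be a multiple of the element size.

```cpp
#include <cassert>
#include <cstddef>

// Remaining length in BYTES of a string of len_chars elements, each
// elt_size bytes wide, starting at byte offset byte_off.  Out-of-range
// offsets yield 0, matching the COND_EXPR in the patch.
std::size_t remaining_bytes(std::size_t len_chars, std::size_t elt_size,
                            std::size_t byte_off)
{
  const std::size_t total = len_chars * elt_size;
  return byte_off <= total ? total - byte_off : 0;
}

// Remaining length in CHARACTERS: the byte count scaled back down by the
// element size.  Assumes byte_off is a multiple of elt_size.
std::size_t remaining_chars(std::size_t len_chars, std::size_t elt_size,
                            std::size_t byte_off)
{
  return remaining_bytes(len_chars, elt_size, byte_off) / elt_size;
}

// Usage: a 3-element string of 4-byte characters, starting at element 1.
inline void length_demo()
{
  assert(remaining_bytes(3, 4, 4) == 8);  // bytes left
  assert(remaining_chars(3, 4, 4) == 2);  // characters left
}
```

For 1-byte characters the two results coincide, which is why the difference only shows up with wide strings.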


> @@ -11343,16 +11356,15 @@ string_constant (tree arg, tree *ptr_offset)
>  {
>if (TREE_CODE (TREE_TYPE (array)) != ARRAY_TYPE)
>   return NULL_TREE;
> -  if (tree eltsize = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))))
> - {
> -   /* Add the scaled variable index to the constant offset.  */
> -   tree eltoff = fold_build2 (MULT_EXPR, TREE_TYPE (offset),
> -  fold_convert (sizetype, varidx),
> -  eltsize);
> -   offset = fold_build2 (PLUS_EXPR, TREE_TYPE (offset), offset, eltoff);
> - }
> -  else
> - return NULL_TREE;
> +
> +  while (TREE_CODE (chartype) != INTEGER_TYPE)
> + chartype = TREE_TYPE (chartype);
> +
> +  /* Set the non-constant offset to the non-constant index scaled
> +  by the size of the character type.  */
> +  offset = fold_build2 (MULT_EXPR, TREE_TYPE (offset),
> + fold_convert (sizetype, varidx),
> + TYPE_SIZE_UNIT (chartype));

here you fix the computation for wide character strings,
but I see no test cases with wide character strings.

But down here you use a non-wide character function on a
wide character string:

   /* Avoid returning a string that doesn't fit in the array
  it is stored in, like
  const char a[4] = "abcde";
  but do handle those that fit even if they have excess
  initializers, such as in
  const char a[4] = "abc\000\000";
  The excess elements contribute to TREE_STRING_LENGTH()
  but not to strlen().  */
   unsigned HOST_WIDE_INT length
 = strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init));
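
A standalone sketch of why a byte-oriented scan gives the wrong answer on a wide string. This is an illustration only: byte_scan_length is a hypothetical helper that imitates strnlen using std::memchr, and char32_t stands in for a 4-byte character type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// What a byte-oriented scan reports: the index of the first zero byte,
// capped at n bytes.
std::size_t byte_scan_length(const void* data, std::size_t n)
{
  const char* bytes = static_cast<const char*>(data);
  const void* nul = std::memchr(bytes, 0, n);
  return nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - bytes)
             : n;
}

// Length in elements of a char32_t string, capped at n elements.
std::size_t wide_length(const char32_t* s, std::size_t n)
{
  std::size_t i = 0;
  while (i < n && s[i] != 0)
    ++i;
  return i;
}

// Usage: the wide string has 3 characters, but its first element already
// contains zero bytes, so the byte scan stops almost immediately.
inline void wide_scan_demo()
{
  const char32_t w[] = U"abc";
  assert(wide_length(w, 4) == 3);
  assert(byte_scan_length(w, sizeof w) <= 1);
}
```

The byte scan stops at the zero bytes inside the first wide character (after one byte on little-endian hosts, immediately on big-endian ones), so it cannot be used as-is to measure a wide string.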


Actually, I am beginning to wonder whether all this wide character stuff is
really so common that we have to optimize it.
The same goes for strlen(&a[0][i]): does this really happen so often that
it is worth the risk?


Bernd.


[PATCH] Merge Ignore and Deprecated in .opt files.

2018-07-19 Thread Martin Liška
Hi.

A few weeks ago I added a new Deprecated flag for options. Apparently, there's
a similar one called Ignore. Thus I moved all Deprecated to Ignore, and for
all Ignored options I emit a warning in the following format:

$ xgcc: warning: switch ‘-mmpx’ is no longer supported

After that there were quite a few uses of unsupported options in the test-suite.
When a test-case was very simple and obviously just testing a legacy option,
I removed it. Finally I added some sanity checking to
optc-gen.awk, where the Ignore flag is handled.


Patch can bootstrap on ppc64le-redhat-linux and x86_64-linux-gnu
and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2018-07-18  Martin Liska  

* common.opt: Remove Warn, Init and Report for options with
Ignore flag. Warning is done automatically.  Options with Deprecated
now use Ignore flag.
* config/i386/i386.opt: Use Ignore instead of Deprecated.
* config/ia64/ia64.opt: Remove Warn for Ignore flags.
* config/rs6000/rs6000.opt: Likewise.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Remove usage of flag_check_pointer_bounds.
* lto-wrapper.c (merge_and_complain): Do not handle
OPT_fcheck_pointer_bounds.
(append_compiler_options): Likewise.
* opt-functions.awk: Do not handle Deprecated.
* optc-gen.awk: Check that Var, Report and Init are not
used for an option with Ignore flag.
* opts-common.c (decode_cmdline_option): Do not report
CL_ERR_DEPRECATED.
(read_cmdline_option): Remove warning for OPT_SPECIAL_ignore
options.
* opts.h (struct cl_option): Remove cl_deprecated flag.
(CL_ERR_DEPRECATED): Remove error enum value.
* doc/options.texi: Remove entry for Deprecated and move
it to Ignore.

gcc/testsuite/ChangeLog:

2018-07-18  Martin Liska  

* g++.dg/opt/mpx.C: Fix scanned pattern.
* gcc.target/i386/mpx.c: Likewise.
* g++.dg/warn/Wunreachable-code-1.C: Remove.
* g++.dg/warn/Wunreachable-code-2.C: Likewise.
* gcc.dg/torture/pr52969.c: Likewise.
* g++.dg/warn/pr31246-2.C: Likewise.
* g++.dg/warn/pr31246.C: Likewise.
* gcc.dg/pr33092.c: Likewise.
* g++.dg/opt/eh1.C: Remove a deprecated option.
* g++.dg/template/inline1.C: Likewise.
* g++.dg/tree-ssa/pr81408.C: Likewise.
* gcc.dg/pr41837.c: Likewise.
* gcc.dg/pr41841.c: Likewise.
* gcc.dg/pr42250.c: Likewise.
* gcc.dg/pr43084.c: Likewise.
* gcc.dg/pr43317.c: Likewise.
* gcc.dg/pr51879-18.c: Likewise.
* gcc.dg/torture/pr36066.c: Likewise.
* gcc.dg/tree-ssa/ifc-8.c: Likewise.
* gcc.dg/tree-ssa/ifc-cd.c: Likewise.
* gcc.dg/tree-ssa/pr19210-1.c: Likewise.
* gcc.dg/tree-ssa/pr45122.c: Likewise.
* gcc.target/i386/pr45352-2.c: Likewise.
* gcc.target/i386/zee.c: Likewise.
* gfortran.dg/auto_char_len_2.f90: Likewise.
* gfortran.dg/auto_char_len_4.f90: Likewise.
* gfortran.dg/c_ptr_tests_15.f90: Likewise.
* gfortran.dg/char_array_structure_constructor.f90: Likewise.
* gfortran.dg/gomp/pr47331.f90: Likewise.
* gfortran.dg/pr40999.f: Likewise.
* gfortran.dg/pr41011.f: Likewise.
* gfortran.dg/pr42051.f03: Likewise.
* gfortran.dg/pr46804.f90: Likewise.
* gfortran.dg/pr83149_1.f90: Likewise.
* gfortran.dg/pr83149_b.f90: Likewise.
* gfortran.dg/whole_file_1.f90: Likewise.
* gfortran.dg/whole_file_10.f90: Likewise.
* gfortran.dg/whole_file_11.f90: Likewise.
* gfortran.dg/whole_file_12.f90: Likewise.
* gfortran.dg/whole_file_13.f90: Likewise.
* gfortran.dg/whole_file_14.f90: Likewise.
* gfortran.dg/whole_file_15.f90: Likewise.
* gfortran.dg/whole_file_16.f90: Likewise.
* gfortran.dg/whole_file_17.f90: Likewise.
* gfortran.dg/whole_file_18.f90: Likewise.
* gfortran.dg/whole_file_19.f90: Likewise.
* gfortran.dg/whole_file_2.f90: Likewise.
* gfortran.dg/whole_file_20.f03: Likewise.
* gfortran.dg/whole_file_3.f90: Likewise.
* gfortran.dg/whole_file_4.f90: Likewise.
* gfortran.dg/whole_file_5.f90: Likewise.
* gfortran.dg/whole_file_6.f90: Likewise.
* gfortran.dg/whole_file_7.f90: Likewise.
* gfortran.dg/whole_file_8.f90: Likewise.
* gfortran.dg/whole_file_9.f90: Likewise.
* gcc.dg/vect/vect.exp: Likewise.

gcc/c-family/ChangeLog:

2018-07-18  Martin Liska  

* c.opt: Remove Warn, Init and Report for options with
Ignore flag. Warning is done automatically.  Options with Deprecated
now use Ignore flag.
---
 gcc/c-family/c.opt| 96 +--
 gcc/common.opt| 10 +-
 gcc/config/i386/i386.opt  |  2 

Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Jonathan Wakely

On 19/07/18 07:59 -0400, Glen Fernandes wrote:

Updated patch to simplify the helper trait, and to include 
instead of  in the unit test for copy_uninitialized:

Use __builtin_memmove for trivially copy assignable types

2018-07-19  Glen Joseph Fernandes  

   * include/bits/stl_algobase.h
   (__is_simple_copy_move): Defined helper.
   (__copy_move_a): Used helper.
   (__copy_move_backward_a): Likewise.
   * testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
   New test.
   * testsuite/25_algorithms/copy/58982.cc: Updated tests.
   * testsuite/25_algorithms/copy_n/58982.cc: Likewise.

Attached: patch.txt

Glen



commit 1af8d465545fda2451928fe100901db37c3e632c
Author: Glen Fernandes 
Date:   Thu Jul 19 07:40:17 2018 -0400

   Use __builtin_memmove for trivially copy assignable types

   2018-07-19  Glen Joseph Fernandes  

   * include/bits/stl_algobase.h
   (__is_simple_copy_move): Defined helper.
   (__copy_move_a): Used helper.
   (__copy_move_backward_a): Likewise.
   * testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
   New test.
   * testsuite/25_algorithms/copy/58982.cc: Updated tests.
   * testsuite/25_algorithms/copy_n/58982.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index 16a3f83b6..4488207f0 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -72,10 +72,16 @@

namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION

+  template<typename _Tp>
+struct __is_simple_copy_move
+{
+  enum { __value = __is_trivially_assignable(_Tp, const _Tp&) };
+};
+
#if __cplusplus < 201103L
  // See http://gcc.gnu.org/ml/libstdc++/2004-08/msg00167.html: in a
  // nutshell, we are partially implementing the resolution of DR 187,
  // when it's safe, i.e., the value_types are equal.
  template
@@ -389,11 +395,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__copy_move_a(_II __first, _II __last, _OI __result)
{
  typedef typename iterator_traits<_II>::value_type _ValueTypeI;
  typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
  typedef typename iterator_traits<_II>::iterator_category _Category;
-  const bool __simple = (__is_trivial(_ValueTypeI)
+  const bool __simple = (__is_simple_copy_move<_ValueTypeI>::__value


Sorry for the delay in reviewing this properly, as I've only just
realised that this introduces undefined behaviour, doesn't it?

It's undefined to use memmove for a type that is not trivially
copyable. All trivial types are trivially copyable, so __is_trivial
was too conservative, but safe (IIRC we used it because there was no
__is_trivially_copyable trait at the time, so __is_trivial was the
best we had).

There are types which are trivially assignable but not trivially
copyable, and it's undefined to use memmove for such types. With your
patch applied I get a warning for this code, where there was none
before:

#include <memory>
#include <type_traits>

struct T
{
 T() { }
 T(const T&) { }
};

static_assert(std::is_trivially_copy_assignable<T>::value
   && !std::is_trivially_copyable<T>::value,
   "T is only trivially copy assignable, not trivially copyable");

void
test01(T* result)
{
 T t[1];
 std::copy(t, t+1, result);
}


In file included from /home/jwakely/gcc/9/include/c++/9.0.0/memory:62,
from copy.cc:1:
/home/jwakely/gcc/9/include/c++/9.0.0/bits/stl_algobase.h: In instantiation of 
'static _Tp* std::__copy_move<_IsMove, true, 
std::random_access_iterator_tag>::__copy_m(const _Tp*, const _Tp*, _Tp*) [with _Tp 
= T; bool _IsMove = false]':
/home/jwakely/gcc/9/include/c++/9.0.0/bits/stl_algobase.h:406:30:   required 
from '_OI std::__copy_move_a(_II, _II, _OI) [with bool _IsMove = false; _II = 
T*; _OI = T*]'
/home/jwakely/gcc/9/include/c++/9.0.0/bits/stl_algobase.h:443:30:   required 
from '_OI std::__copy_move_a2(_II, _II, _OI) [with bool _IsMove = false; _II = 
T*; _OI = T*]'
/home/jwakely/gcc/9/include/c++/9.0.0/bits/stl_algobase.h:476:7:   required 
from '_OI std::copy(_II, _II, _OI) [with _II = T*; _OI = T*]'
copy.cc:18:27:   required from here
/home/jwakely/gcc/9/include/c++/9.0.0/bits/stl_algobase.h:388:23: warning: 
'void* __builtin_memmove(void*, const void*, long unsigned int)' writing to an 
object of non-trivially copyable type 'struct T'; use copy-assignment or 
copy-initialization instead [-Wclass-memaccess]
 __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
 ~^~~
copy.cc:4:8: note: 'struct T' declared here
struct T
   ^


I think the best we can do here is simply replace __is_trivial with
__is_trivially_copyable, which will enable memmove for trivially
copyable types for which !is_trivially_default_constructible_v.
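
As a rough illustration of that rule, here is a standalone sketch that dispatches on std::is_trivially_copyable. It is not the libstdc++ implementation: copy_range, copy_impl and OnlyTriviallyAssignable are hypothetical names, and the real __copy_move_a also checks the iterator and value types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical type: copy assignment is trivial, but the user-provided
// copy constructor means the type is not trivially copyable.
struct OnlyTriviallyAssignable
{
  OnlyTriviallyAssignable() { }
  OnlyTriviallyAssignable(const OnlyTriviallyAssignable&) { }
  OnlyTriviallyAssignable& operator=(const OnlyTriviallyAssignable&) = default;
  int value = 0;
};

// Trivially copyable: a single memmove is permitted.
template<typename T>
T* copy_impl(const T* first, const T* last, T* out, std::true_type)
{
  const std::size_t n = static_cast<std::size_t>(last - first);
  if (n != 0)
    std::memmove(out, first, n * sizeof(T));
  return out + n;
}

// Everything else: fall back to element-wise assignment.
template<typename T>
T* copy_impl(const T* first, const T* last, T* out, std::false_type)
{
  for (; first != last; ++first, ++out)
    *out = *first;
  return out;
}

// Select the implementation from the trait, not from assignability.
template<typename T>
T* copy_range(const T* first, const T* last, T* out)
{
  typedef std::integral_constant<bool,
                                 std::is_trivially_copyable<T>::value> simple;
  return copy_impl(first, last, out, simple());
}

// Usage: int takes the memmove path.
inline void copy_range_demo()
{
  int a[3] = {1, 2, 3}, b[3] = {0, 0, 0};
  assert(copy_range(a, a + 3, b) == b + 3 && b[2] == 3);
}
```

Because the memmove overload is only instantiated for trivially copyable types, a type like OnlyTriviallyAssignable goes through plain assignment and should not trigger -Wclass-memaccess.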




Re: Compilation error in simple-object-elf.c

2018-07-19 Thread Pedro Alves
On 07/19/2018 02:06 PM, Eli Zaretskii wrote:
>> From: Richard Biener 
>> Date: Thu, 19 Jul 2018 10:46:01 +0200
>> Cc: DJ Delorie , GCC Patches , 
>> gdb-patc...@sourceware.org
>>
>>> *err = ENOTSUP;
>>>^~~
>>>  ./simple-object-elf.c:1284:14: note: each undeclared identifier is 
>>> reported only once for each function it appears in
>>>
>>> Suggested fix:
>>
>> Works for me, thus OK.  I'm going to check it in to make 8.2.
> 
> Thanks.
> 
> Joel/Pedro, is this okay for GDB's copy of libiberty, master and
> branch?

Yes.

Thanks,
Pedro Alves


Re: [PATCH] Merge Ignore and Deprecated in .opt files.

2018-07-19 Thread Jakub Jelinek
On Thu, Jul 19, 2018 at 03:25:15PM +0200, Martin Liška wrote:
> A few weeks ago I added a new Deprecated flag for options. Apparently, there's
> a similar one called Ignore. Thus I moved all Deprecated to Ignore, and for
> all Ignored options I emit a warning in the following format:
> 
> $ xgcc: warning: switch ‘-mmpx’ is no longer supported

Is that what we want for all the Ignore options?  Looking at {,*/}*.opt,
I see a lot of options that have those Warn(switch %qs is no longer supported)
and a lot of them that do not, while with your patch it will now warn all.
Especially when it is a warning without corresponding -W... option that can
be quite nasty.

Wouldn't it be better to just remove Deprecated and keep the Ignore behavior it
had?  Or make Deprecated effectively an alias for
Ignore Warn(switch %qs is no longer supported)
and use it for those options that did that?

Jakub


Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Glen Fernandes
On Thu, Jul 19, 2018 at 9:25 AM Jonathan Wakely  wrote:
> Sorry for the delay in reviewing this properly, as I've only just
> realised that this introduces undefined behaviour, doesn't it?
>
> It's undefined to use memmove for a type that is not trivially
> copyable. All trivial types are trivially copyable, so __is_trivial
> was too conservative, but safe (IIRC we used it because there was no
> __is_trivially_copyable trait at the time, so __is_trivial was the
> best we had).
>
> There are types which are trivially assignable but not trivially
> copyable, and it's undefined to use memmove for such types.

I was still unclear about that, but I forwarded you an e-mail from
Marshall with his answer when I asked whether libc++'s use of
TriviallyCopyAssignable here was incorrect. Let me know if it applies
here, and if not (and that interpretation of the standard is
incorrect), I'll update the patch to do as you suggest and run the
tests again.

Glen


Re: Compilation error in simple-object-elf.c

2018-07-19 Thread Eli Zaretskii
> Cc: d...@redhat.com, gcc-patches@gcc.gnu.org, gdb-patc...@sourceware.org
> From: Pedro Alves 
> Date: Thu, 19 Jul 2018 14:41:13 +0100
> 
> On 07/19/2018 02:06 PM, Eli Zaretskii wrote:
> >> From: Richard Biener 
> >> Date: Thu, 19 Jul 2018 10:46:01 +0200
> >> Cc: DJ Delorie , GCC Patches , 
> >> gdb-patc...@sourceware.org
> >>
> >>> *err = ENOTSUP;
> >>>^~~
> >>>  ./simple-object-elf.c:1284:14: note: each undeclared identifier is 
> >>> reported only once for each function it appears in
> >>>
> >>> Suggested fix:
> >>
> >> Works for me, thus OK.  I'm going to check it in to make 8.2.
> > 
> > Thanks.
> > 
> > Joel/Pedro, is this okay for GDB's copy of libiberty, master and
> > branch?

Thanks, done.


Re: [AArch64][PATCH 1/2] Fix addressing printing of LDP/STP

2018-07-19 Thread Andre Vieira (lists)
On 17/07/18 15:52, James Greenhalgh wrote:
> On Mon, Jun 25, 2018 at 03:48:13AM -0500, Andre Simoes Dias Vieira wrote:
>> On 18/06/18 09:08, Andre Simoes Dias Vieira wrote:
>>> Hi Richard,
>>>
>>> Sorry for the delay I have been on holidays.  I had a look and I think you 
>>> are right.  With these changes Umq and Uml seem to have the same 
>>> functionality though, so I would suggest using only one.  Maybe use a 
>>> different name for both, removing both Umq and Uml in favour of Umn, where 
>>> the n indicates it narrows the addressing mode.  How does that sound to you?
>>>
>>> I also had a look at Ump, but that one is used in the parallel pattern for 
>>> STP/LDP which does not use this "narrowing". So we should leave that one as 
>>> is.
>>>
>>> Cheers,
>>> Andre
>>>
>>> 
>>> From: Richard Sandiford 
>>> Sent: Thursday, June 14, 2018 12:28:16 PM
>>> To: Andre Simoes Dias Vieira
>>> Cc: gcc-patches@gcc.gnu.org; nd
>>> Subject: Re: [AArch64][PATCH 1/2] Fix addressing printing of LDP/STP
>>>
>>> Andre Simoes Dias Vieira  writes:
 @@ -5716,10 +5717,17 @@ aarch64_classify_address (struct 
 aarch64_address_info *info,
unsigned int vec_flags = aarch64_classify_vector_mode (mode);
bool advsimd_struct_p = (vec_flags == (VEC_ADVSIMD | VEC_STRUCT));
bool load_store_pair_p = (type == ADDR_QUERY_LDP_STP
 + || type == ADDR_QUERY_LDP_STP_N
   || mode == TImode
   || mode == TFmode
   || (BYTES_BIG_ENDIAN && advsimd_struct_p));

 +  /* If we are dealing with ADDR_QUERY_LDP_STP_N that means the incoming 
 mode
 + corresponds to the actual size of the memory being loaded/stored and 
 the
 + mode of the corresponding addressing mode is half of that.  */
 +  if (type == ADDR_QUERY_LDP_STP_N && known_eq (GET_MODE_SIZE (mode), 16))
 +mode = DFmode;
 +
bool allow_reg_index_p = (!load_store_pair_p
   && (known_lt (GET_MODE_SIZE (mode), 16)
   || vec_flags == VEC_ADVSIMD
>>>
>>> I don't know whether it matters in practice, but that description also
>>> applies to Umq, not just Uml.  It might be worth changing it too so
>>> that things stay consistent.
>>>
>>> Thanks,
>>> Richard
>>>
>> Hi all,
>>
>> This is a reworked patch, replacing Umq and Uml with Umn now.
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Is this OK for trunk?
> 
> OK. Does this also need backporting to 8?

It doesn't __need__ to, as the failure this fixes has not been observed
on gcc-8; I assume that is because of our restrictive LDP/STP
generation (see patch 2/2 of this series).  It wouldn't hurt
either, though, I think.
> 
> Thanks,
> James
> 
>>
>> gcc
>> 2018-06-25  Andre Vieira  
>>
>> * config/aarch64/aarch64-simd.md (aarch64_simd_mov):
>> Replace
>> Umq with Umn.
>> (store_pair_lanes): Likewise.
>> * config/aarch64/aarch64-protos.h (aarch64_addr_query_type): Add new
>> enum value 'ADDR_QUERY_LDP_STP_N'.
>> * config/aarch64/aarch64.c (aarch64_addr_query_type): Likewise.
>> (aarch64_print_address_internal): Add declaration.
>> (aarch64_print_ldpstp_address): Remove.
>> (aarch64_classify_address): Adapt mode for 'ADDR_QUERY_LDP_STP_N'.
>> (aarch64_print_operand): Change printing of 'y'.
>> * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Use
>> new enum value 'ADDR_QUERY_LDP_STP_N', don't hardcode mode and use
>> 'true' rather than '1'.
>> * gcc/config/aarch64/constraints.md (Uml): Likewise.
>> (Uml): Rename to Umn.
>> (Umq): Remove.
> 



Re: [PATCH] Merge Ignore and Deprecated in .opt files.

2018-07-19 Thread Martin Liška
On 07/19/2018 03:47 PM, Jakub Jelinek wrote:
> On Thu, Jul 19, 2018 at 03:25:15PM +0200, Martin Liška wrote:
>> A few weeks ago I added a new Deprecated flag for options. Apparently, there's
>> a similar one called Ignore. Thus I moved all Deprecated to Ignore, and for
>> all Ignored options I emit a warning in the following format:
>>
>> $ xgcc: warning: switch ‘-mmpx’ is no longer supported
> 
> Is that what we want for all the Ignore options?  Looking at {,*/}*.opt,
> I see a lot of options that have those Warn(switch %qs is no longer supported)
> and a lot of them that do not, while with your patch it will now warn all.

I must admit that was my intention :) In my eyes it makes it more consistent and
it gives consumers feedback about usage of an option that does nothing.
For x86_64 there's a list of options that are Ignore and don't produce a warning:

Wimport
Wunreachable-code
Wunsafe-loop-optimizations
fargument-alias
fargument-noalias
fargument-noalias-anything
fargument-noalias-global
fcse-skip-blocks
fdefault-inline
fdump-core
feliminate-dwarf2-dups
fforce-addr
fipa-cp-alignment
fipa-matrix-reorg
fipa-struct-reorg
floop-flatten
floop-optimize
foptimize-register-move
foptional-diags
fregmove
frerun-loop-opt
fsched2-use-traces
fsee
fstrength-reduce
ftree-coalesce-inlined-vars
ftree-copyrename
ftree-loop-if-convert-stores
ftree-salias
ftree-store-ccp
ftree-store-copy-prop
ftree-vect-loop-version
ftree-vectorizer-verbose=
funsafe-loop-optimizations
fwhole-file
fzee


> Especially when it is a warning without corresponding -W... option that can
> be quite nasty.

As discussed on IRC, we can always add -Wno-ignored-options that will suppress the
warnings. We'll do a distro rebuild before the next release, so if huge numbers
of warnings are seen, we can revert that.

Martin

> 
> Wouldn't it be better to just remove Deprecated and keep the Ignore behavior it
> had?  Or make Deprecated effectively an alias for
> Ignore Warn(switch %qs is no longer supported)
> and use it for those options that did that?
> 
>   Jakub
> 



Re: [PATCH] Merge Ignore and Deprecated in .opt files.

2018-07-19 Thread Jakub Jelinek
On Thu, Jul 19, 2018 at 04:16:10PM +0200, Martin Liška wrote:
> I must admit that was my intention :) In my eyes it makes it more consistent 
> and
> it gives consumers feedback about usage of an option that does nothing.
> For x86_64 there's list of options that are Ignore and don't produce a 
> warning:
> 
> Wimport
> Wunreachable-code
> Wunsafe-loop-optimizations

I'm especially worried about the above ones, we don't emit any warnings if
we do
-Wno-foobarbazqux
unless some other diagnostic is emitted.
Do we warn if we do
-Wno-unsafe-loop-optimizations
?  At least for -Wno-* Ignored options it would be nice to treat them similarly
to the non-existent -Wno-* options.

Jakub


Re: RFC: Patch to implement Aarch64 SIMD ABI

2018-07-19 Thread Wilco Dijkstra
Hi Steve,

> This patch checks for SIMD functions and saves the extra registers when
> needed.  It does not change the caller behavour, so with just this patch
> there may be values saved by both the caller and callee.  This is not
> efficient, but it is correct code.

I tried a few simple test cases. It seems calls to non-vector functions don't
mark the callee-saves as needing to be saved/restored:

void g(void);

void __attribute__ ((aarch64_vector_pcs))
f1 (void)
{ 
  g();
  g();
}

f1:
str x30, [sp, -16]!
bl  g
ldr x30, [sp], 16
b   g

Here I would expect q8-q23 to be preserved and no tailcall to g() since it is
not a vector function. This is important for correctness since f1 must
preserve q8-q23.


// compile with -O2 -ffixed-d1 -ffixed-d2 -ffixed-d3 -ffixed-d4 -ffixed-d5 -ffixed-d6 -ffixed-d7
float __attribute__ ((aarch64_vector_pcs))
f2 (float *p)
{
  float t0 = p[1];
  float t1 = p[3];
  float t2 = p[5]; 
  return t0 - t1 * (t1 + t0) + (t2 * t0);
}

f2:
stp d16, d17, [sp, -48]!
ldr s17, [x0, 4]
ldr s18, [x0, 12]
ldr s0, [x0, 20]
fadd    s16, s17, s18
fmsub   s16, s16, s18, s17
fmadd   s0, s17, s0, s16
ldp d16, d17, [sp], 48
ret

This uses s16-s18 when it should prefer to use s24-s31 first. Also it needs to
save q16-q18, not only d16 and d17.

Btw the -ffixed-d* is useful to block the register allocator from using certain
registers.

Wilco


Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Glen Fernandes
On Thu, Jul 19, 2018 at 10:01 AM Glen Fernandes wrote:
>
> I was still unclear about that, but I forwarded you an e-mail from
> Marshall with his answer when I asked whether libc++'s use of
> TriviallyCopyAssignable here was incorrect. Let me know if it applies
> here, and if not (and that interpretation of the standard is
> incorrect), I'll update the patch to do as you suggest and run the
> tests again.
>
> Glen

Attached: patch.txt

Use __builtin_memmove for trivially copyable types

2018-07-19  Glen Joseph Fernandes  

* include/bits/stl_algobase.h
(__copy_move_a): Used __is_trivially_copyable.
(__copy_move_backward_a): Likewise.

Tested x86_64-pc-linux-gnu.

Glen
commit 60bf5acca419df37752336c2008123558627ece7
Author: Glen Fernandes 
Date:   Thu Jul 19 10:27:54 2018 -0400

Use __builtin_memmove for trivially copyable types

2018-07-19  Glen Joseph Fernandes  

* include/bits/stl_algobase.h
(__copy_move_a): Used __is_trivially_copyable.
(__copy_move_backward_a): Likewise.

diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index 16a3f83b6..f0130bc41 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -389,11 +389,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __copy_move_a(_II __first, _II __last, _OI __result)
 {
   typedef typename iterator_traits<_II>::value_type _ValueTypeI;
   typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
   typedef typename iterator_traits<_II>::iterator_category _Category;
-  const bool __simple = (__is_trivial(_ValueTypeI)
+  const bool __simple = (__is_trivially_copyable(_ValueTypeI)
 && __is_pointer<_II>::__value
 && __is_pointer<_OI>::__value
 && __are_same<_ValueTypeI, _ValueTypeO>::__value);
 
   return std::__copy_move<_IsMove, __simple,
@@ -591,11 +591,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __copy_move_backward_a(_BI1 __first, _BI1 __last, _BI2 __result)
 {
   typedef typename iterator_traits<_BI1>::value_type _ValueType1;
   typedef typename iterator_traits<_BI2>::value_type _ValueType2;
   typedef typename iterator_traits<_BI1>::iterator_category _Category;
-  const bool __simple = (__is_trivial(_ValueType1)
+  const bool __simple = (__is_trivially_copyable(_ValueType1)
 && __is_pointer<_BI1>::__value
 && __is_pointer<_BI2>::__value
 && __are_same<_ValueType1, _ValueType2>::__value);
 
   return std::__copy_move_backward<_IsMove, __simple,


Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Jonathan Wakely

On 19/07/18 10:01 -0400, Glen Fernandes wrote:
>On Thu, Jul 19, 2018 at 9:25 AM Jonathan Wakely wrote:
>> Sorry for the delay in reviewing this properly, as I've only just
>> realised that this introduces undefined behaviour, doesn't it?
>>
>> It's undefined to use memmove for a type that is not trivially
>> copyable. All trivial types are trivially copyable, so __is_trivial
>> was too conservative, but safe (IIRC we used it because there was no
>> __is_trivially_copyable trait at the time, so __is_trivial was the
>> best we had).
>>
>> There are types which are trivially assignable but not trivially
>> copyable, and it's undefined to use memmove for such types.
>
>I was still unclear about that, but I forwarded you an e-mail from
>Marshall with his answer when I asked whether libc++'s use of
>TriviallyCopyAssignable here was incorrect. Let me know if it applies
>here, and if not (and that interpretation of the standard is
>incorrect), I'll update the patch to do as you suggest and run the
>tests again.

While I sympathise with Marshall's position (that std::copy only cares
about assignment not copying) that doesn't make it OK to use memmove
here.

Using memmove for a non-trivially copyable type is undefined. Period.

The fact GCC warns that it's undefined also means GCC might start
optimising based on the assumption that undefined behaviour isn't
reached at runtime. So it could (for example) assume that the input
range must be empty and remove the entire call to std::copy.

For a non-trivially copyable, trivially assignable type I think we
just have to rely on the compiler to transform the assignments into
optimal code (which might end up being a memmove, ironically).
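To make the distinction concrete, here is a minimal sketch (not the
libstdc++ code; Tracked, copy_sketch and copy_sketch_demo are made-up
names for illustration only) of a type that is trivially copy assignable
but not trivially copyable, and of a dispatch that only takes the
memmove path when the type is trivially copyable:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical type: its copy assignment is trivial, but the user-provided
// copy constructor means the type is not trivially copyable.
struct Tracked
{
  int value = 0;
  Tracked() = default;
  Tracked(const Tracked& other) : value(other.value) { }  // non-trivial
  Tracked& operator=(const Tracked&) = default;           // trivial
};

static_assert(std::is_trivially_copy_assignable<Tracked>::value,
              "copy assignment is trivial");
static_assert(!std::is_trivially_copyable<Tracked>::value,
              "but the type is not trivially copyable");

// Trivially copyable types: a block copy is well defined.
template<typename T>
typename std::enable_if<std::is_trivially_copyable<T>::value, T*>::type
copy_sketch(const T* first, const T* last, T* result)
{
  const std::ptrdiff_t n = last - first;
  if (n > 0)
    std::memmove(result, first, sizeof(T) * static_cast<std::size_t>(n));
  return result + n;
}

// Everything else: assign element by element and leave the rest to the
// optimiser.
template<typename T>
typename std::enable_if<!std::is_trivially_copyable<T>::value, T*>::type
copy_sketch(const T* first, const T* last, T* result)
{
  for (; first != last; ++first, ++result)
    *result = *first;
  return result;
}

// Small usage example covering both overloads.
inline void copy_sketch_demo()
{
  int src[3] = {1, 2, 3};
  int dst[3] = {0, 0, 0};
  copy_sketch(src, src + 3, dst);
  assert(dst[0] == 1 && dst[1] == 2 && dst[2] == 3);

  Tracked a[2];
  a[0].value = 7;
  a[1].value = 9;
  Tracked b[2];
  copy_sketch(a, a + 2, b);
  assert(b[0].value == 7 && b[1].value == 9);
}
```

For Tracked the loop overload is selected, and the compiler remains free
to turn those assignments into a block copy where it can prove that is
equivalent.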

Please do update the patch to use __is_trivially_copyable. I don't
think we need the __is_simple_copy_move helper in that case, just
change two uses of __is_trivial to __is_trivially_copyable.



Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Jonathan Wakely

On 19/07/18 10:32 -0400, Glen Fernandes wrote:
>On Thu, Jul 19, 2018 at 10:01 AM Glen Fernandes wrote:
>>
>> I was still unclear about that, but I forwarded you an e-mail from
>> Marshall with his answer when I asked whether libc++'s use of
>> TriviallyCopyAssignable here was incorrect. Let me know if it applies
>> here, and if not (and that interpretation of the standard is
>> incorrect), I'll update the patch to do as you suggest and run the
>> tests again.
>>
>> Glen
>
>Attached: patch.txt
>
>Use __builtin_memmove for trivially copyable types
>
>2018-07-19  Glen Joseph Fernandes
>
>   * include/bits/stl_algobase.h
>   (__copy_move_a): Used __is_trivially_copyable.
>   (__copy_move_backward_a): Likewise.
>
>Tested x86_64-pc-linux-gnu.

Ah, that was quick :-)

Can we keep the new test you added in the previous patch? It seems
useful to add anyway.




Re: cleanup cross product code in VRP

2018-07-19 Thread Jeff Law
On 07/19/2018 03:06 AM, Aldy Hernandez wrote:
> 
> 
> On 07/19/2018 04:18 AM, Richard Biener wrote:
>> On Wed, Jul 18, 2018 at 2:05 PM Aldy Hernandez  wrote:
>>>
>>> Hi again!
>>>
>>> Well, since this hasn't been reviewed and I'm about to overhaul the
>>> TYPE_OVERFLOW_WRAPS code anyhow, might as well lump it all in one patch.
>>>
>>> On 07/16/2018 09:19 AM, Aldy Hernandez wrote:
>>>> Howdy!
>>>>
>>>> I've abstracted out the cross product calculations into their own
>>>> function, and have adapted it to deal with wide ints so it's more
>>>> reusable.  It required some shuffling around, and implementing things a
>>>> bit differently, but things should behave as before.
>>>>
>>>> I also renamed vrp_int_const_binop to make its intent clearer,
>>>> especially now that it's really just a wrapper to wide_int_binop that
>>>> deals with overflow.
>>>>
>>>> (If wide_int_binop_overflow is generally useful, perhaps we could merge
>>>> it with wide_int_overflow.)
>>>
>>> This is the same as the previous patch, plus I'm abstracting the
>>> TYPE_OVERFLOW_WRAPS code as well.  With this, the code dealing with
>>> MULT_EXPR in vrp gets reduced to handling value_range specific stuff.
>>> Yay code re-use!
>>>
>>> A few notes:
>>>
>>> This is dead code.  I've removed it:
>>>
>>> -  /* If we have an unsigned MULT_EXPR with two VR_ANTI_RANGEs,
>>> -    drop to VR_VARYING.  It would take more effort to compute a
>>> -    precise range for such a case.  For example, if we have
>>> -    op0 == 65536 and op1 == 65536 with their ranges both being
>>> -    ~[0,0] on a 32-bit machine, we would have op0 * op1 == 0, so
>>> -    we cannot claim that the product is in ~[0,0].  Note that we
>>> -    are guaranteed to have vr0.type == vr1.type at this
>>> -    point.  */
>>> -  if (vr0.type == VR_ANTI_RANGE
>>> - && !TYPE_OVERFLOW_UNDEFINED (expr_type))
>>> -   {
>>> - set_value_range_to_varying (vr);
>>> - return;
>>> -   }
>>>
>>> Also, the vrp_int typedef has a weird name, especially when we have
>>> widest2_int in gimple-fold.c that does the exact same thing.  I've moved
>>> the common code to wide-int.h and tree.h so we can all share :).
>>>
>>> At some point we could move the wide_int_range* and wide_int_binop* code
>>> into its own file.
>>
>> Yes.
> 
> Sometime within the next couple rounds I'll come up with a file name
> that doesn't hurt my eyes.  It seems that the hardest part of
> programming is actually coming up with sensible file and variable names
> :-/.
I recall reading somewhere that once you have the right
function/method/variable names you're 90% of the way to having the
problem solved.  I don't totally agree, but I can see the point the
original author of the statement was trying to get across.

Internally when refactoring I usually start with calling stuff "foo",
"bar", "doit" and friends until I'm fairly happy with what got carved
out then I try to figure out reasonable names.

Jeff
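
For readers following the quoted patch description, the two ideas in it
can be sketched as below.  This is only an illustration and not the
wide_int code from the patch; Range, mul_range, wrapping_mul and
range_demo are hypothetical names, and the bounds are assumed to fit in
32 bits so the 64-bit products cannot overflow.  The first function takes
the cross product of the bounds for a multiplication, and the second
reproduces the 65536 * 65536 example from the removed comment, where two
non-zero 32-bit unsigned operands wrap to zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [lo, hi]; bounds assumed to fit in 32 bits.
struct Range
{
  std::int64_t lo;
  std::int64_t hi;
};

// Range of x * y for x in a and y in b: the extremes are always among the
// four products of the bounds (the "cross product").
inline Range mul_range(Range a, Range b)
{
  const std::int64_t p[4] = {a.lo * b.lo, a.lo * b.hi,
                             a.hi * b.lo, a.hi * b.hi};
  return Range{*std::min_element(p, p + 4), *std::max_element(p, p + 4)};
}

// 32-bit unsigned multiplication with wrap-around, done in 64 bits and
// then truncated.
inline std::uint32_t wrapping_mul(std::uint32_t x, std::uint32_t y)
{
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(x) * y);
}

// Small usage example.
inline void range_demo()
{
  const Range r = mul_range(Range{-2, 3}, Range{4, 5});
  assert(r.lo == -10 && r.hi == 15);
  // Both operands are non-zero, yet the wrapped product is zero.
  assert(wrapping_mul(65536u, 65536u) == 0u);
}
```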


Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Glen Fernandes
On Thu, Jul 19, 2018 at 10:40 AM Jonathan Wakely  wrote:
> On 19/07/18 10:32 -0400, Glen Fernandes wrote:
> >Attached: patch.txt
> >Use __builtin_memmove for trivially copyable types
> >2018-07-19  Glen Joseph Fernandes  
> >* include/bits/stl_algobase.h
> >(__copy_move_a): Used __is_trivially_copyable.
> >(__copy_move_backward_a): Likewise.
> >Tested x86_64-pc-linux-gnu.
>
> Ah, that was quick :-)
>
> Can we keep the new test you added in the previous patch? It seems
> useful to add anyway.

Affirmative. Attached: patch.txt

Use __builtin_memmove for trivially copyable types

2018-07-19  Glen Joseph Fernandes  

* include/bits/stl_algobase.h
(__copy_move_a): Used __is_trivially_copyable.
(__copy_move_backward_a): Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
New test.

Glen
commit f08d827c2e7e525f94b31d9ad5c22dab5a84e451
Author: Glen Fernandes 
Date:   Thu Jul 19 10:54:50 2018 -0400

Use __builtin_memmove for trivially copyable types

2018-07-19  Glen Joseph Fernandes  

* include/bits/stl_algobase.h
(__copy_move_a): Used __is_trivially_copyable.
(__copy_move_backward_a): Likewise.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
New test.

diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index 16a3f83b6..f0130bc41 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -389,11 +389,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __copy_move_a(_II __first, _II __last, _OI __result)
 {
   typedef typename iterator_traits<_II>::value_type _ValueTypeI;
   typedef typename iterator_traits<_OI>::value_type _ValueTypeO;
   typedef typename iterator_traits<_II>::iterator_category _Category;
-  const bool __simple = (__is_trivial(_ValueTypeI)
+  const bool __simple = (__is_trivially_copyable(_ValueTypeI)
 && __is_pointer<_II>::__value
 && __is_pointer<_OI>::__value
 && __are_same<_ValueTypeI, _ValueTypeO>::__value);
 
   return std::__copy_move<_IsMove, __simple,
@@ -591,11 +591,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __copy_move_backward_a(_BI1 __first, _BI1 __last, _BI2 __result)
 {
   typedef typename iterator_traits<_BI1>::value_type _ValueType1;
   typedef typename iterator_traits<_BI2>::value_type _ValueType2;
   typedef typename iterator_traits<_BI1>::iterator_category _Category;
-  const bool __simple = (__is_trivial(_ValueType1)
+  const bool __simple = (__is_trivially_copyable(_ValueType1)
 && __is_pointer<_BI1>::__value
 && __is_pointer<_BI2>::__value
 && __are_same<_ValueType1, _ValueType2>::__value);
 
   return std::__copy_move_backward<_IsMove, __simple,
diff --git 
a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc 
b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc
new file mode 100644
index 0..461d1f242
--- /dev/null
+++ 
b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc
@@ -0,0 +1,37 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do compile { target c++11 } }
+
+#include <memory>
+
+struct T
+{
+  T() { }
+  T(const T&) = delete;
+};
+
+static_assert(__is_trivially_assignable(T&, const T&) &&
+  !__is_trivial(T), "T is only trivially copy assignable");
+
+void
+test01(T* result)
+{
+  T t[1];
+  std::uninitialized_copy(t, t+1, result); // { dg-error "here" }
+}
+// { dg-prune-output "use of deleted function" }


Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Jeff Law
On 07/19/2018 08:39 AM, Jonathan Wakely wrote:
> On 19/07/18 10:01 -0400, Glen Fernandes wrote:
>> On Thu, Jul 19, 2018 at 9:25 AM Jonathan Wakely 
>> wrote:
>>> Sorry for the delay in reviewing this properly, as I've only just
>>> realised that this introduces undefined behaviour, doesn't it?
>>>
>>> It's undefined to use memmove for a type that is not trivially
>>> copyable. All trivial types are trivially copyable, so __is_trivial
>>> was too conservative, but safe (IIRC we used it because there was no
>>> __is_trivially_copyable trait at the time, so __is_trivial was the
>>> best we had).
>>>
>>> There are types which are trivially assignable but not trivially
>>> copyable, and it's undefined to use memmove for such types.
>>
>> I was still unclear about that, but I forwarded you an e-mail from
>> Marshall with his answer when I asked whether libc++'s use of
>> TriviallyCopyAssignable here was incorrect. Let me know if it applies
>> here, and if not (and that interpretation of the standard is
>> incorrect), I'll update the patch to do as you suggest and run the
>> tests again.
> 
> While I sympathise with Marshall's position (that std::copy only cares
> about assignment not copying) that doesn't make it OK to use memmove
> here.
> 
> Using memmove for a non-trivially copyable type is undefined. Period.
> 
> The fact GCC warns that it's undefined also means GCC might start
> optimising based on the assumption that undefined behaviour isn't
> reached at runtime. So it could (for example) assume that the input
> range must be empty and remove the entire call to std::copy.
Right.  In fact we have a pass which searches for a very small subset of
undefined behavior (null pointer dereferences, division by zero) and
when it finds them it replaces the offending operation with a trap and
does the obvious CFG cleanups.

While neither of those would affect this specific issue and we're pretty
conservative about adding more cases to this pass, certainly the right
thing to do is avoid undefined behavior :-)  So I'm in total agreement
with Jon here.

jeff


Re: [PATCH, GCC, AARCH64] Add support for +profile extension

2018-07-19 Thread Andre Vieira (lists)
On 17/07/18 16:23, James Greenhalgh wrote:
> On Mon, Jul 09, 2018 at 08:20:53AM -0500, Andre Vieira (lists) wrote:
>> Hi,
>>
>> This patch adds support for the Statistical Profiling Extension (SPE) on
>> AArch64. Even though the compiler will not generate code any differently
>> given this extension, it will need to pass it on to the assembler in
>> order to let it correctly assemble inline asm containing accesses to the
>> extension's system registers.  The same applies when using the
>> preprocessor on an assembly file as this first must pass through cc1.
>>
>> I left the hwcaps string for SPE empty as the kernel does not define a
>> feature string for this extension.  The current effect of this is that
>> the driver will disable the profile feature bit in GCC.  This is OK though
>> because we don't, nor do we ever, enable this feature bit, as codegen is
>> not affected by the SPE support and more importantly the driver will still
>> pass the extension down to the assembler regardless.
> 
> Please make these conditions clear in the documentation. Something like.
> 
>> +@item profile
>> +Enable the Statistical Profiling extension.  This option only changes
>> +the behavior of the assembler, and does not change code generation.
> 
> Maybe worded better...
> 
>>
>> Bootstrapped aarch64-none-linux-gnu and ran regression tests.
>>
>> Is it OK for trunk?
>>
>> gcc/ChangeLog:
>> 2018-07-09  Andre Vieira  
>>
>>  * config/aarch64/aarch64-option-extensions.def: New entry for profile
>>  extension.
>>  * config/aarch64/aarch64.h (AARCH64_FL_PROFILE): New.
>>  * doc/invoke.texi (aarch64-feature-modifiers): New entry for profile
>>  extension.
>>
>> gcc/testsuite/ChangeLog:
>> 2018-07-09 Andre Vieira 
>>
>>  * gcc.target/aarch64/profile.c: New test.
> 
> This test will fail for targets with old assemblers. That isn't ideal, we
> don't normally add these assembler tests for new instructions for that
> reason. Personally I'd drop the test down to a compile-only and scan the
> assembler for "+profile".
> 
> OK with those changes.
> 

Committed with changes in r262882.

> Thanks,
> James
> 
> 
>> diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
>> b/gcc/config/aarch64/aarch64-option-extensions.def
>> index 
>> 5fe5e3f7dddf622a48a5b9458ef30449a886f395..69ab796a4e1a959b89ebb55b599919c442cfb088
>>  100644
>> --- a/gcc/config/aarch64/aarch64-option-extensions.def
>> +++ b/gcc/config/aarch64/aarch64-option-extensions.def
>> @@ -105,4 +105,7 @@ AARCH64_OPT_EXTENSION("fp16fml", AARCH64_FL_F16FML, 
>> AARCH64_FL_FP | AARCH64_FL_F
>> Disabling "sve" just disables "sve".  */
>>  AARCH64_OPT_EXTENSION("sve", AARCH64_FL_SVE, AARCH64_FL_FP | 
>> AARCH64_FL_SIMD | AARCH64_FL_F16, 0, "sve")
>>  
>> +/* Enabling/Disabling "profile" does not enable/disable any other feature.  
>> */
>> +AARCH64_OPT_EXTENSION("profile", AARCH64_FL_PROFILE, 0, 0, "")
>> +
>>  #undef AARCH64_OPT_EXTENSION
>> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
>> index 
>> f284e74bfb8c9bab2aa22cc6c5a67750cbbba3c2..c1218503bab19323eee1cca8b7e4bea8fbfcf573
>>  100644
>> --- a/gcc/config/aarch64/aarch64.h
>> +++ b/gcc/config/aarch64/aarch64.h
>> @@ -158,6 +158,9 @@ extern unsigned aarch64_architecture_version;
>>  #define AARCH64_FL_SHA3   (1 << 18)  /* Has ARMv8.4-a SHA3 and 
>> SHA512.  */
>>  #define AARCH64_FL_F16FML (1 << 19)  /* Has ARMv8.4-a FP16 extensions.  
>> */
>>  
>> +/* Statistical Profiling extensions.  */
>> +#define AARCH64_FL_PROFILE    (1 << 20)
>> +
>>  /* Has FP and SIMD.  */
>>  #define AARCH64_FL_FPSIMD (AARCH64_FL_FP | AARCH64_FL_SIMD)
>>  
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 
>> 56cd122b0d7b420e2b16ceb02907860879d3b9d7..4ca68a563297482afc75abed4a31c106af38caf7
>>  100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -14813,6 +14813,8 @@ instructions. Use of this option with architectures 
>> prior to Armv8.2-A is not su
>>  @item sm4
>>  Enable the sm3 and sm4 crypto extension.  This also enables Advanced SIMD 
>> instructions.
>>  Use of this option with architectures prior to Armv8.2-A is not supported.
>> +@item profile
>> +Enable the Statistical Profiling extension.
>>  
>>  @end table
>>  
>> diff --git a/gcc/testsuite/gcc.target/aarch64/profile.c 
>> b/gcc/testsuite/gcc.target/aarch64/profile.c
>> new file mode 100644
>> index 
>> ..db51b4746dd60009d784bc0b37ea99b2f120d856
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/profile.c
>> @@ -0,0 +1,9 @@
>> +/* { dg-do assemble } */
>> +/* { dg-options "-std=gnu99 -march=armv8.2-a+profile" } */
>> +
>> +int foo (void)
>> +{
>> +  int ret;
>> +  asm ("mrs  %0, pmblimitr_el1" : "=r" (ret));
>> +  return ret;
>> +}
> 

From bc351f3940bdd26dfe91dd8fba2a00667705aa20 Mon Sep 17 00:00:00 2001
From: Andre Simoes Dias Vieira 
Date: Thu, 5 Jul 2018 10:27:37 +0100
Subject: [PATCH] [PATCH, GCC, AARCH64] Add support for +profile ex

Re: [PATCH][Aarch64] v2: Arithmetic overflow subv patterns [Patch 3/4]

2018-07-19 Thread James Greenhalgh
On Wed, Jun 13, 2018 at 03:06:05AM -0500, Michael Collison wrote:
> Updated previous patch:
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-06/msg00508.html
> 
> With coding style feedback from Richard Sandiford: (that also apply to this 
> patch)
> 
>  https://gcc.gnu.org/ml/gcc-patches/2018-06/msg00508.html
> 
> Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?

OK.

Thanks,
James

> 
> 2018-05-31  Michael Collison  
>   Richard Henderson 
> 
>   * config/aarch64/aarch64.md (subv4, usubv4): New patterns.
>   (subti): Handle op1 zero.
>   (subvti4, usub4ti4): New.
>   (*sub3_compare1_imm): New.
>   (sub3_carryinCV): New.
>   (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
>   (*sub3_carryinCV_z2, *sub3_carryinCV): New.



Re: [PATCH][Aarch64] v2: Arithmetic overflow addv patterns [Patch 2/4]

2018-07-19 Thread James Greenhalgh
On Wed, Jun 13, 2018 at 02:57:45AM -0500, Michael Collison wrote:
> Updated with Richard's style and mismatched mode comments.
> 
> Okay for trunk?

OK.

Thanks,
James



Re: [PATCH, ARM] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-07-19 Thread Thomas Preudhomme
[Dropping Jeff Law from the list since he already commented on the
middle end parts]

Hi Kyrill,

On Thu, 19 Jul 2018 at 12:02, Kyrill Tkachov
 wrote:
>
> Hi Thomas,
>
> On 17/07/18 12:02, Thomas Preudhomme wrote:
> > Fixed in attached patch. ChangeLog entries are unchanged:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme 
> >
> > PR target/85434
> > * target-insns.def (stack_protect_combined_set): Define new standard
> > pattern name.
> > (stack_protect_combined_test): Likewise.
> > * cfgexpand.c (stack_protect_prologue): Try new
> > stack_protect_combined_set pattern first.
> > * function.c (stack_protect_epilogue): Try new
> > stack_protect_combined_test pattern first.
> > * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
> > parameters to control which register to use as PIC register and force
> > reloading PIC register respectively.
> > (legitimize_pic_address): Expose above new parameters in prototype and
> > adapt recursive calls accordingly.
> > (arm_legitimize_address): Adapt to new legitimize_pic_address
> > prototype.
> > (thumb_legitimize_address): Likewise.
> > (arm_emit_call_insn): Adapt to new require_pic_register prototype.
> > * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
> > change.
> > * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
> > prototype change.
> > (stack_protect_combined_set): New insn_and_split pattern.
> > (stack_protect_set): New insn pattern.
> > (stack_protect_combined_test): New insn_and_split pattern.
> > (stack_protect_test): New insn pattern.
> > * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
> > (UNSPEC_SP_TEST): Likewise.
> > * doc/md.texi (stack_protect_combined_set): Document new standard
> > pattern name.
> > (stack_protect_set): Clarify that the operand for guard's address is
> > legal.
> > (stack_protect_combined_test): Document new standard pattern name.
> > (stack_protect_test): Clarify that the operand for guard's address is
> > legal.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-07-05  Thomas Preud'homme 
> >
> > PR target/85434
> > * gcc.target/arm/pr85434.c: New test.
> >
>
> Sorry for the delay. Some comments inline.
>
> Kyrill
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index d6e3c382085..d1a893ac56e 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -6105,8 +6105,18 @@ stack_protect_prologue (void)
>   {
> tree guard_decl = targetm.stack_protect_guard ();
> rtx x, y;
> +  struct expand_operand ops[2];
>
> x = expand_normal (crtl->stack_protect_guard);
> +  create_fixed_operand (&ops[0], x);
> +  create_fixed_operand (&ops[1], DECL_RTL (guard_decl));
> +  /* Allow the target to compute address of Y and copy it to X without
> + leaking Y into a register.  This combined address + copy pattern allows
> + the target to prevent spilling of any intermediate results by splitting
> + it after register allocator.  */
> +  if (maybe_expand_insn (targetm.code_for_stack_protect_combined_set, 2, 
> ops))
> +return;
> +
> if (guard_decl)
>   y = expand_normal (guard_decl);
> else
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 8537262ce64..100844e659c 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -67,7 +67,7 @@ extern int const_ok_for_dimode_op (HOST_WIDE_INT, enum 
> rtx_code);
>   extern int arm_split_constant (RTX_CODE, machine_mode, rtx,
>HOST_WIDE_INT, rtx, rtx, int);
>   extern int legitimate_pic_operand_p (rtx);
> -extern rtx legitimize_pic_address (rtx, machine_mode, rtx);
> +extern rtx legitimize_pic_address (rtx, machine_mode, rtx, rtx, bool);
>   extern rtx legitimize_tls_address (rtx, rtx);
>   extern bool arm_legitimate_address_p (machine_mode, rtx, bool);
>   extern int arm_legitimate_address_outer_p (machine_mode, rtx, RTX_CODE, 
> int);
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index ec3abbcba9f..f4a970580c2 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -7369,20 +7369,26 @@ legitimate_pic_operand_p (rtx x)
>   }
>
>   /* Record that the current function needs a PIC register.  Initialize
> -   cfun->machine->pic_reg if we have not already done so.  */
> +   cfun->machine->pic_reg if we have not already done so.
> +
> +   If not NULL, PIC_REG indicates which register to use as PIC register,
> +   otherwise it is decided by register allocator.  COMPUTE_NOW forces the PIC
> +   register to be loaded, irregardless of whether it was loaded previously.  
> */
>
>   static void
> -require_pic_register (void)
> +require_pic_register (rtx pic_reg, bool compute_now)
>   {
> /* A lot of the logic here is made obscure by the fact that this
>routine gets called as part of the rtx cost estimation process.
> 

[PATCH][Middle-end]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-19 Thread Qing Zhao
Hi, 

As Wilco mentioned in PR78809 after I checked in the last part of the
implementation of inline strcmp:

See http://www.iso-9899.info/n1570.html
section 7.24.4:

"The sign of a nonzero value returned by the comparison functions memcmp,
strcmp, and strncmp is determined by the sign of the difference between
the values of the first pair of characters (both interpreted as unsigned
char) that differ in the objects being compared."

Currently, my implementation uses char type when expanding
strcmp/strncmp, and unsigned char when expanding memcmp.

According to the C standard, we should use unsigned char for all of
strcmp/strncmp/memcmp.
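
As an illustration of why the element type matters (a sketch only, not
the expander code in builtins.c; compare_as_unsigned, compare_as_signed
and sign_demo are hypothetical names), the two loops below differ only in
how each byte is interpreted.  For bytes with the top bit set, the signed
interpretation reports the opposite sign from the one the standard
requires:

```cpp
#include <cassert>

// Compares two NUL-terminated strings with each character interpreted as
// unsigned char, which is what C11 7.24.4 asks for.
inline int compare_as_unsigned(const char* a, const char* b)
{
  for (;; ++a, ++b)
    {
      const unsigned char ua = static_cast<unsigned char>(*a);
      const unsigned char ub = static_cast<unsigned char>(*b);
      if (ua != ub)
        return ua < ub ? -1 : 1;
      if (ua == 0)
        return 0;
    }
}

// Same loop, but with each character interpreted as signed char.  This
// mimics what happens when plain char is signed on the target.
inline int compare_as_signed(const char* a, const char* b)
{
  for (;; ++a, ++b)
    {
      const signed char sa = static_cast<signed char>(*a);
      const signed char sb = static_cast<signed char>(*b);
      if (sa != sb)
        return sa < sb ? -1 : 1;
      if (sa == 0)
        return 0;
    }
}

// Small usage example: byte 0x80 is 128 as unsigned char and (on the usual
// two's complement targets) -128 as signed char.
inline void sign_demo()
{
  assert(compare_as_unsigned("\x80", "\x01") > 0);
  assert(compare_as_signed("\x80", "\x01") < 0);
  assert(compare_as_unsigned("abc", "abc") == 0);
}
```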

The change is quite simple, and I have tested it on x86, aarch64 and
PowerPC with no regressions.

Okay for trunk?

Qing

gcc/ChangeLog:

+2018-07-19  Qing Zhao  
+
+   * builtins.c (expand_builtin_memcmp): Delete the last parameter for
+   call to inline_expand_builtin_string_cmp.
+   (expand_builtin_strcmp): Likewise.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): Delete the last parameter, change char_type_node
+   to unsigned_char_type_node for strcmp/strncmp.
+   (inline_expand_builtin_string_cmp): Delete the last parameter.
+


78809C_uchar.patch
Description: Binary data


[PATCH] Simplify the base characteristics for some type traits

2018-07-19 Thread Jonathan Wakely

This removes some seemingly redundant conditions from a few traits. If
__is_trivially_assignable correctly checks the assignable condition as
well as triviality, then we don't need is_assignable explicitly.  Does
anybody see a problem with that?
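
As a quick illustration of the property being relied on (using the public
std:: traits rather than the intrinsic directly;
trivial_implies_assignable and assignable_demo are made-up names), a
trivially-assignable result should never be true where plain
assignability is false.  Scalar cases like these are the ones where a
trait that ignored the value category would go wrong:

```cpp
#include <cassert>
#include <type_traits>

// An lvalue int can be assigned from an int, and trivially so.
static_assert(std::is_trivially_assignable<int&, int>::value, "");

// A prvalue int or a const int lvalue cannot be assigned to at all, so
// the "trivially" variants must be false as well.
static_assert(!std::is_assignable<int, int>::value, "");
static_assert(!std::is_trivially_assignable<int, int>::value, "");
static_assert(!std::is_trivially_assignable<const int&, int>::value, "");
static_assert(!std::is_trivially_copy_assignable<const int>::value, "");

// Hypothetical helper: true unless the trivial trait claims more than the
// plain assignability trait does.
template<typename T, typename U>
constexpr bool trivial_implies_assignable()
{
  return !std::is_trivially_assignable<T, U>::value
         || std::is_assignable<T, U>::value;
}

// Small usage example repeating the checks at run time.
inline void assignable_demo()
{
  assert((trivial_implies_assignable<int&, int>()));
  assert((trivial_implies_assignable<int, int>()));
  assert((trivial_implies_assignable<const int&, int>()));
}
```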

I added some extra tests for cases that had been problematic with
__is_trivially_constructible.

We can definitely do that change for is_trivially_constructible,
because Ville fixed the intrinsic recently (PR 86398).

It also simplifies some others to replace
integral_constant<bool, foo<T>::value> with the equivalent
foo<T>::type, or to replace integral_constant<bool, !foo<T>::value>
with the equivalent __not_<foo<T>>::type.
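
The equivalence can be checked with a small sketch like the one below.
It assumes a hypothetical not_ template standing in for the internal
__not_ helper (not_, old_is_compound, new_is_compound and
equivalence_demo are illustrative names, not part of the library):

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the library-internal __not_ helper.
template<typename P>
struct not_ : std::integral_constant<bool, !bool(P::value)> { };

// foo<T>::type names the same integral_constant specialization as
// spelling it out from foo<T>::value.
static_assert(std::is_same<
                std::is_function<void()>::type,
                std::integral_constant<bool,
                                       std::is_function<void()>::value>
              >::value, "");

// Likewise for the negated form.
static_assert(std::is_same<
                not_<std::is_function<int>>::type,
                std::integral_constant<bool,
                                       !std::is_function<int>::value>
              >::value, "");

// The two spellings of an is_compound-like trait have the same base.
template<typename T>
struct old_is_compound
: std::integral_constant<bool, !std::is_fundamental<T>::value> { };

template<typename T>
struct new_is_compound
: not_<std::is_fundamental<T>>::type { };

// Small usage example.
inline void equivalence_demo()
{
  assert(old_is_compound<int*>::value == new_is_compound<int*>::value);
  assert(old_is_compound<int>::value == new_is_compound<int>::value);
}
```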

I also started a wholesale replacement of integral_constant<bool, X>
with __bool_constant<X> but backed that out again. It's a fair bit of
churn for not much benefit (the two cases I did change allow better
line wrapping which makes me a happy boy).

* include/std/type_traits (__is_member_object_pointer_helper): Use
__not_<is_function<_Tp>>::type instead of integral_constant.
(__is_member_function_pointer_helper): Likewise for
is_function<_Tp>::type.
(is_compound): Likewise for __not_<is_fundamental<_Tp>>::type.
(__do_is_nt_destructible_impl): Use __bool_constant and reindent.
(is_trivially_constructible): Remove redundant use of
is_constructible.
(__is_trivially_copy_assignable_impl): Remove redundant use of
is_copy_assignable.
(__is_trivially_move_assignable_impl): Remove redundant use of
is_move_assignable.
(is_trivially_destructible): Use __bool_constant.
* testsuite/20_util/is_trivially_assignable/value.cc: Add some more
tests for scalar types.

Tested powerpc64le-linux.


commit 82960cb6e64ca78b53fb799318087cb23b942079
Author: Jonathan Wakely 
Date:   Thu Jul 19 17:03:33 2018 +0100

Simplify the base characteristics for some type traits

* include/std/type_traits (__is_member_object_pointer_helper): Use
__not_<is_function<_Tp>>::type instead of integral_constant.
(__is_member_function_pointer_helper): Likewise for
is_function<_Tp>::type.
(is_compound): Likewise for __not_<is_fundamental<_Tp>>::type.
(__do_is_nt_destructible_impl): Use __bool_constant and reindent.
(is_trivially_constructible): Remove redundant use of
is_constructible.
(__is_trivially_copy_assignable_impl): Remove redundant use of
is_copy_assignable.
(__is_trivially_move_assignable_impl): Remove redundant use of
is_move_assignable.
(is_trivially_destructible): Use __bool_constant.
* testsuite/20_util/is_trivially_assignable/value.cc: Add some more
tests for scalar types.

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 4df82bf6d8c..aaa554c6200 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -396,7 +396,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 struct __is_member_object_pointer_helper<_Tp _Cp::*>
-: public integral_constant::value> { };
+: public __not_>::type { };
 
   /// is_member_object_pointer
   template
@@ -411,7 +411,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 struct __is_member_function_pointer_helper<_Tp _Cp::*>
-: public integral_constant::value> { };
+: public is_function<_Tp>::type { };
 
   /// is_member_function_pointer
   template
@@ -603,7 +603,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_compound
   template
 struct is_compound
-: public integral_constant::value> { };
+: public __not_>::type { };
 
   template
 struct __is_member_pointer_helper
@@ -826,8 +826,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   struct __do_is_nt_destructible_impl
   {
 template
-  static integral_constant().~_Tp())>
-__test(int);
+  static __bool_constant().~_Tp())>
+  __test(int);
 
 template
   static false_type __test(...);
@@ -1136,8 +1136,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_trivially_constructible
   template
 struct is_trivially_constructible
-: public __and_, __bool_constant<
- __is_trivially_constructible(_Tp, _Args...)>>::type
+: public __bool_constant<__is_trivially_constructible(_Tp, _Args...)>
 { };
 
   /// is_trivially_default_constructible
@@ -1235,9 +1234,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 struct __is_trivially_copy_assignable_impl<_Tp, true>
-: public __and_,
-   integral_constant>
+: public __bool_constant<__is_trivially_assignable(_Tp&, const _Tp&)>
 { };
 
   template
@@ -1256,9 +1253,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 struct __is_trivially_move_assignable_impl<_Tp, true>
-: public __and_,
-   integral_constant>
+: public __bool_constant<__is_trivially_assignable(_Tp&, _Tp&&)>
 { };
 
   template
@@ -1269,8 +1264,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 

Re: [PATCH][Middle-end]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-19 Thread Jakub Jelinek
On Thu, Jul 19, 2018 at 11:49:16AM -0500, Qing Zhao wrote:
> As Wilco mentioned in PR78809 after I checked in the last part of 
> implementation of inline strcmp:
> 
> See  http://www.iso-9899.info/n1570.html
>  section 7.24.4:
> 
> "The sign of a nonzero value returned by the comparison functions memcmp, 
> strcmp, and strncmp is determined 
> by the sign of the difference between the values of the first pair of 
> characters (both interpreted as unsigned char)
>  that differ in the objects being compared."
> 
> currently, in my implementation, I used char type when expanding 
> strcmp/strncmp, and unsigned char when expanding
> memcmp.
> 
> from the C standard, we should use unsigned char for all 
> strcmp/strncmp/memcmp.
> 
> the change is quite simple, and I have tested it on X86, aarch64 and powerPC, 
> no regressions.
> 
> Okay for trunk?

If you expand it as
  (int) ((unsigned char *)p)[n] - (int) ((unsigned char *)q)[n]
then aren't you relying on int type to have wider precision than unsigned char
(or unit_mode being narrower than mode)?  I don't see anywhere where you'd
give up on doing the inline expansion on targets where e.g. lowest
addressable unit would be 16-bit and int would be 16-bit too.
On targets where int is as wide as char, one would need to expand it instead
as something like:
if (((unsigned char *)p)[n] == ((unsigned char *)q)[n]) loop;
ret = ((unsigned char *)p)[n] < ((unsigned char *)q)[n] ? -1 : 1;
or similar or just use the library routine.
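
As an illustration of the comparison-based form described above, here is a minimal sketch in plain C. It is not the RTL expansion in builtins.c; the function name cmp_by_compare is hypothetical and exists only for this example.

```c
#include <stddef.h>

/* Hypothetical helper (not GCC code): compare two NUL-terminated
   strings without subtracting characters, so the sign of the result
   does not depend on int being wider than unsigned char.
   Returns -1, 0 or 1.  */
int
cmp_by_compare (const char *p, const char *q)
{
  const unsigned char *up = (const unsigned char *) p;
  const unsigned char *uq = (const unsigned char *) q;

  for (size_t n = 0; ; n++)
    {
      /* First differing pair decides the sign.  */
      if (up[n] != uq[n])
        return up[n] < uq[n] ? -1 : 1;
      /* Both strings ended at the same position.  */
      if (up[n] == 0)
        return 0;
    }
}
```

Because the result comes from a comparison rather than a subtraction, no intermediate value can wrap around, whatever the relative widths of int and unsigned char.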

Also:
  var_rtx
    = adjust_address (var_rtx_array, TYPE_MODE (unit_type_node), offset);
  const_rtx = c_readstr (const_str + offset, unit_mode);
  rtx op0 = (const_str_n == 1) ? const_rtx : var_rtx;
  rtx op1 = (const_str_n == 1) ? var_rtx : const_rtx;

  result = expand_simple_binop (mode, MINUS, op0, op1,
				result, is_memcmp ? 1 : 0, OPTAB_WIDEN);
doesn't look correct to me, var_rtx and const_rtx here are in unit_mode,
you need to convert those to mode before you can use those in
expand_simple_binop, using
  op0 = convert_modes (mode, unit_mode, op0, 1);
  op1 = convert_modes (mode, unit_mode, op1, 1);
before the expand_simple_binop.
While expand_simple_binop is called with an unsignedp argument, that is
meant for the cases where the expansion needs to widen it further, not for
calling expand_simple_binop with arguments with known incorrect mode;
furthermore, one of them being CONST_INT which has VOIDmode.

> +2018-07-19  Qing Zhao  
> +
> +   * builtins.c (expand_builtin_memcmp): Delete the last parameter for
> +   call to inline_expand_builtin_string_cmp.
> +   (expand_builtin_strcmp): Likewise.
> +   (expand_builtin_strncmp): Likewise.
> +   (inline_string_cmp): Delete the last parameter, change char_type_node
> +   to unsigned_char_type_node for strcmp/strncmp;
> +   (inline_expand_builtin_string_cmp): Delete the last parameter.
> +

Jakub


Re: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-19 Thread Jeff Law
On 07/19/2018 06:55 AM, Tamar Christina wrote:
>>>
>>> What's the purpose of including auto-host in params-list and
>>> params-options?  It seems like you're putting a property of the target
>>> (guard size) into the wrong place (auto-host.h).
>>>
>>
>> The reason for this is because there's a test gcc.dg/params/blocksort-part.c
>> that uses these params-options to generate test cases to perform parameter
>> validation. However because now the params.def file can contain a CPP
>> macro these would then fail.
>>
>> CPP is already called to create params-options and params-list so the easiest
>> way to fix this test was just to include auto-host which would get it the 
>> values
>> from configure.
>>
>> This test is probably not needed anymore after my second patch series as
>> parameters are validated by the front-end now, so they can never go out of
>> range.
Right, but I don't immediately see a way to avoid the test.  ie, it just
walks down everything in params.options and except for a couple
exceptional values the test gets run.

I wonder if all this is an indication that having CPP constants in the
options isn't going to work well as we're mixing the distinction between
host/target.


>>
>>> It's also a bit unclear to me why this is necessary at all.  Are we
>>> planning to support both the 4k and 64k guards?  My goal (once the
>>> guard was configurable) was never for supporting multiple sizes on a
>>> target but instead to allow experimentation to find the right default.
>>>
> 
> Having talked to people I believe we do need to support both 4k and 64k 
> guards.
> For the Linux/Glibc world it wouldn't matter much, either 4 or 64k would do, 
> though Glibc has settled on 64k pages.
> 
> However other systems like (open/free)BSD or musl based systems do not want
> 64k pages but want 4k ones.  So we're ending up having to support both as a 
> compromise.
Understood.  Thanks for verifying.  I wonder if we could just bury this
entirely in the aarch64 config files and not expose the default into
params.def?

jeff


[PATCH, obvious?] Some minor nits in string folding functions

2018-07-19 Thread Bernd Edlinger
Hi,


this fixes a few minor nits, which I spotted while
looking at the string folding functions:

string_constant: Remove impossible check: TREE_CODE (arg)
can't be COMPONENT_REF and MEM_REF at the same time.

c_strlen: maxelts is (signed) HOST_WIDE_INT, therefore
use tree_to_shwi.

c_getstr: tree_to_uhwi needs to be protected by
tree_fits_uhwi_p.

BTW: c_getstr uses string_constant which appears to be
able to extract wide char string initializers, but c_getstr
does not seem to be prepared for wide char strings:

   else if (string[string_length - 1] != '\0')
 {
   /* Support only properly NUL-terminated strings but handle
  consecutive strings within the same array, such as the six
  substrings in "1\0002\0003".  */
   return NULL;
 }
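
To illustrate why a last-byte test is not enough for wide strings, here is a small sketch in plain C. It does not use GCC internals; the helper name last_element_is_nul and its parameters are assumptions made up for this example.

```c
#include <stddef.h>

/* Hypothetical check (not GCC code): return 1 if the last ELTSIZE-byte
   element of the NBYTES-byte buffer S consists only of zero bytes,
   and 0 otherwise (including when NBYTES is not a non-zero multiple
   of ELTSIZE).  */
int
last_element_is_nul (const unsigned char *s, size_t nbytes, size_t eltsize)
{
  if (eltsize == 0 || nbytes < eltsize || nbytes % eltsize != 0)
    return 0;

  for (size_t i = nbytes - eltsize; i < nbytes; i++)
    if (s[i] != 0)
      return 0;
  return 1;
}
```

For the two bytes 'A', 0 read as one 2-byte element, the last byte is zero but the element is not NUL, so a byte-wise test would wrongly treat the buffer as terminated.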


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
2018-07-19  Bernd Edlinger  

	* builtins.c (c_strlen): Use tree_to_shwi.
	* expr.c (string_constant): Remove impossible check.
	* fold-const.c (c_getstr): Use tree_fits_uhwi_p.

Index: gcc/builtins.c
===
--- gcc/builtins.c	(revision 262861)
+++ gcc/builtins.c	(working copy)
@@ -610,7 +610,7 @@ c_strlen (tree src, int only_value)
   tree type = TREE_TYPE (src);
   if (tree size = TYPE_SIZE_UNIT (type))
 if (tree_fits_shwi_p (size))
-  maxelts = tree_to_uhwi (size);
+  maxelts = tree_to_shwi (size);
 
   maxelts = maxelts / eltsize - 1;
 
Index: gcc/expr.c
===
--- gcc/expr.c	(revision 262861)
+++ gcc/expr.c	(working copy)
@@ -11371,11 +11371,6 @@ string_constant (tree arg, tree *ptr_offset)
 return NULL_TREE;
   if (TREE_CODE (init) == CONSTRUCTOR)
 {
-  if (TREE_CODE (arg) != ARRAY_REF
-	  && TREE_CODE (arg) == COMPONENT_REF
-	  && TREE_CODE (arg) == MEM_REF)
-	return NULL_TREE;
-
   /* Convert the 64-bit constant offset to a wider type to avoid
 	 overflow.  */
   offset_int wioff;
Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 262861)
+++ gcc/fold-const.c	(working copy)
@@ -14613,7 +14613,7 @@ c_getstr (tree src, unsigned HOST_WIDE_INT *strlen
   unsigned HOST_WIDE_INT string_size = string_length;
   tree type = TREE_TYPE (src);
   if (tree size = TYPE_SIZE_UNIT (type))
-if (tree_fits_shwi_p (size))
+if (tree_fits_uhwi_p (size))
   string_size = tree_to_uhwi (size);
 
   if (strlen)


Re: [PATCH, obvious?] Some minor nits in string folding functions

2018-07-19 Thread Jeff Law
On 07/19/2018 12:04 PM, Bernd Edlinger wrote:
> Hi,
> 
> 
> this fixes a few minor nits, which I spotted while
> looking at the string folding functions:
> 
> string_constant: Remove impossible check: TREE_CODE (arg)
> can't be COMPONENT_REF and MEM_REF at the same time.
Shouldn't they all be != tests?  Though clearly this code isn't being
tested by anything.


> 
> c_strlen: maxelts is (signed) HOST_WIDE_INT, therefore
> use tree_to_shwi.
One could argue maxelts should be unsigned.

> 
> c_getstr: tree_to_uhwi needs to be protected by
> tree_fits_uhwi_p.
Looks correct to me.


> 
> BTW: c_getstr uses string_constant which appears to be
> able to extract wide char string initializers, but c_getstr
> does not seem to be prepared for wide char strings:
> 
>else if (string[string_length - 1] != '\0')
>  {
>/* Support only properly NUL-terminated strings but handle
>   consecutive strings within the same array, such as the six
>   substrings in "1\0002\0003".  */
>return NULL;
>  }
Seems like a goof to me.

jeff

Jeff


Re: [PATCH] Simplify the base characteristics for some type traits

2018-07-19 Thread Ville Voutilainen
On 19 July 2018 at 20:18, Jonathan Wakely  wrote:
> This removes some seemingly redundant conditions from a few traits. If
> __is_trivially_assignable correctly checks the assignable condition as
> well as triviality, then we don't need is_assignable explicitly.  Does
> anybody see a problem with that?

It should work; if it doesn't, that's a bug in the compiler. Both
is_constructible
and is_assignable and trivial variants thereof go to the same
"is_xible" code path,
so it should be fair for the library to expect that it all works. In
case it doesn't,
that's something that I will fix in the front-end.

> I added some extra tests for cases that had been problematic with
> __is_trivially_constructible.

That seems reasonable; some of those tests are now duplicated on the
compiler side,
but I think that's fine.


Re: [PATCH] Use __builtin_memmove for trivially copy assignable types

2018-07-19 Thread Jonathan Wakely

On 19/07/18 11:04 -0400, Glen Fernandes wrote:

On Thu, Jul 19, 2018 at 10:40 AM Jonathan Wakely  wrote:

On 19/07/18 10:32 -0400, Glen Fernandes wrote:
>Attached: patch.txt
>Use __builtin_memmove for trivially copyable types
>2018-07-19  Glen Joseph Fernandes  
>* include/bits/stl_algobase.h
>(__copy_move_a): Used __is_trivially_copyable.
>(__copy_move_backward_a): Likewise.
>Tested x86_64-pc-linux-gnu.

Ah, that was quick :-)

Can we keep the new test you added in the previous patch? It seems
useful to add anyway.


Affirmative. Attached: patch.txt

Use __builtin_memmove for trivially copyable types

2018-07-19  Glen Joseph Fernandes  

   * include/bits/stl_algobase.h
   (__copy_move_a): Used __is_trivially_copyable.
   (__copy_move_backward_a): Likewise.
   * testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
   New test.


Tested on x86_64-linux and committed to trunk - thanks!




Re: [PATCH][Middle-end]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-19 Thread Qing Zhao
Jakub,

thanks a lot for your review and comments.

> On Jul 19, 2018, at 12:31 PM, Jakub Jelinek  wrote:
> 
> On Thu, Jul 19, 2018 at 11:49:16AM -0500, Qing Zhao wrote:
>> As Wilco mentioned in PR78809 after I checked in the last part of 
>> implementation of inline strcmp:
>> 
>> See  http://www.iso-9899.info/n1570.html
>> section 7.24.4:
>> 
>> "The sign of a nonzero value returned by the comparison functions memcmp, 
>> strcmp, and strncmp is determined 
>> by the sign of the difference between the values of the first pair of 
>> characters (both interpreted as unsigned char)
>> that differ in the objects being compared."
>> 
>> currently, in my implementation, I used char type when expanding 
>> strcmp/strncmp, and unsigned char when expanding
>> memcmp.
>> 
>> from the C standard, we should use unsigned char for all 
>> strcmp/strncmp/memcmp.
>> 
>> the change is quite simple, and I have tested it on X86, aarch64 and 
>> powerPC, no regressions.
>> 
>> Okay for trunk?
> 
> If you expand it as (int) ((unsigned char *)p)[n] - (int) ((unsigned char 
> *)q)[n]
> then aren't you relying on int type to have wider precision than unsigned char
> (or unit_mode being narrower than mode)?

Do you mean that we should only expand it as
  (int) ((unsigned char *)p)[n] - (int) ((unsigned char *)q)[n]
when we are sure the int type is wider than unsigned char?

>  I don't see anywhere where you'd
> give up on doing the inline expansion on targets where e.g. lowest
> addressable unit would be 16-bit and int would be 16-bit too.

Even on these targets, is the char type still 8-bit?
If so, isn't the int type still wider than char?

> On targets where int is as wide as char, one would need to expand it instead
> as something like:
> if (((unsigned char *)p)[n] == ((unsigned char *)q)[n]) loop;
> ret = ((unsigned char *)p)[n] < ((unsigned char *)q)[n] ? -1 : 1;
> or similar or just use the library routine.


Even when the int type is as wide as char, expanding it as
  (int) ((unsigned char *)p)[n] - (int) ((unsigned char *)q)[n]
should still be correct (even though not optimal), shouldn't it?

Am I missing anything in this part?

> 
> Also:
>  var_rtx
>= adjust_address (var_rtx_array, TYPE_MODE (unit_type_node), offset);
>  const_rtx = c_readstr (const_str + offset, unit_mode);
>  rtx op0 = (const_str_n == 1) ? const_rtx : var_rtx;
>  rtx op1 = (const_str_n == 1) ? var_rtx : const_rtx;
> 
>  result = expand_simple_binop (mode, MINUS, op0, op1,
>result, is_memcmp ? 1 : 0, OPTAB_WIDEN);
> doesn't look correct to me, var_rtx and const_rtx here are in unit_mode,
> you need to convert those to mode before you can use those in
> expand_simple_binop, using
>  op0 = convert_modes (mode, unit_mode, op0, 1);
>  op1 = convert_modes (mode, unit_mode, op1, 1);
> before the expand_simple_binop.
> While expand_simple_binop is called with an unsignedp argument, that is
> meant for the cases where the expansion needs to widen it further, not for
> calling expand_simple_binop with arguments with known incorrect mode;
> furthermore, one of them being CONST_INT which has VOIDmode.

thank you for raising this issue, Yes, I will update this part of the code as 
you suggested.

Qing



Re: [PATCH] Simplify the base characteristics for some type traits

2018-07-19 Thread Jonathan Wakely

On 19/07/18 21:40 +0300, Ville Voutilainen wrote:

On 19 July 2018 at 20:18, Jonathan Wakely  wrote:

This removes some seemingly redundant conditions from a few traits. If
__is_trivially_assignable correctly checks the assignable condition as
well as triviality, then we don't need is_assignable explicitly.  Does
anybody see a problem with that?


It should work; if it doesn't, that's a bug in the compiler. Both
is_constructible
and is_assignable and trivial variants thereof go to the same
"is_xible" code path,
so it should be fair for the library to expect that it all works. In
case it doesn't,
that's something that I will fix in the front-end.


Yeah I agree. I guess my concern is that we introduce a regression and
don't notice for a while that the intrinsic is buggy.

I've committed it now, so we'll see :-)




Re: [PATCH][Middle-end]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-19 Thread Jakub Jelinek
On Thu, Jul 19, 2018 at 02:06:16PM -0500, Qing Zhao wrote:
> > If you expand it as (int) ((unsigned char *)p)[n] - (int) ((unsigned char 
> > *)q)[n]
> > then aren't you relying on int type to have wider precision than unsigned 
> > char
> > (or unit_mode being narrower than mode)?
> 
> Do you mean that we should only expand it as
>   (int) ((unsigned char *)p)[n] - (int) ((unsigned char *)q)[n]
> when we are sure the int type is wider than unsigned char?

Yes.

> >  I don't see anywhere where you'd
> > give up on doing the inline expansion on targets where e.g. lowest
> > addressable unit would be 16-bit and int would be 16-bit too.
> 
> Even on these targets, is the char type still 8-bit?
> If so, isn't the int type still wider than char?

C requires that int is at least 16-bit wide, so the sizeof (int) == sizeof
(char) case is only possible say with 16-bit char and 16-bit int, or 32-bit
char and 32-bit int etc.

> > On targets where int is as wide as char, one would need to expand it instead
> > as something like:
> > if (((unsigned char *)p)[n] == ((unsigned char *)q)[n]) loop;
> > ret = ((unsigned char *)p)[n] < ((unsigned char *)q)[n] ? -1 : 1;
> > or similar or just use the library routine.
> 
> 
> Even when the int type is as wide as char, expanding it as
>   (int) ((unsigned char *)p)[n] - (int) ((unsigned char *)q)[n]
> should still be correct (even though not optimal), shouldn't it?

No.  Consider p[n] being e.g. 1 and q[n] being __SCHAR_MAX__ + 3U and 16-bit
int and 16-bit char.  Then (unsigned char) 0x0001 < (unsigned char) 0x8002,
so it should return a negative number.  But (int) (0x0001U - 0x8002U) is
0x7fff, which is a positive int.  Now, if int is 17-bit and char is 16-bit,
this works fine, because the difference is then -0x8001 and thus negative.

The above really works only if int is at least one bit wider than unsigned
char.
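
The arithmetic above can be reproduced on an ordinary host by emulating the narrow types with fixed-width integers. This is only a sketch: it assumes the host int is at least 32 bits wide, and both function names are hypothetical.

```c
#include <stdint.h>

/* Emulate "int" and "unsigned char" both being 16 bits wide: the
   difference is reduced modulo 2^16 and then read as a signed
   (two's complement) 16-bit value.  */
int32_t
diff_same_width (uint16_t a, uint16_t b)
{
  uint16_t wrapped = (uint16_t) (a - b);
  return wrapped >= 0x8000u ? (int32_t) wrapped - 0x10000
                            : (int32_t) wrapped;
}

/* Emulate "int" being wider than "unsigned char": the difference of two
   16-bit values always fits, so its sign is reliable.  */
int32_t
diff_wider (uint16_t a, uint16_t b)
{
  return (int32_t) a - (int32_t) b;
}
```

With a = 0x0001 and b = 0x8002, diff_same_width yields 0x7fff (positive, the wrong sign) while diff_wider yields -0x8001 (negative, as required).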

Jakub


Re: [PATCH, obvious?] Some minor nits in string folding functions

2018-07-19 Thread Bernd Edlinger
On 07/19/18 20:11, Jeff Law wrote:
> On 07/19/2018 12:04 PM, Bernd Edlinger wrote:
>> Hi,
>>
>>
>> this fixes a few minor nits, which I spotted while
>> looking at the string folding functions:
>>
>> string_constant: Remove impossible check: TREE_CODE (arg)
>> can't be COMPONENT_REF and MEM_REF at the same time.
> Shouldn't they all be != tests?  Though clearly this code isn't being
> tested by anything.
> 

I think a COMPONENT_REF would be possible for strlen(&a.b).
It looks like removing this check will be the best option.
Unless Martin has a different idea of course.

> 
>>
>> c_strlen: maxelts is (signed) HOST_WIDE_INT, therefore
>> use tree_to_shwi.
> One could argue maxelts should be unsigned.
> 

I would agree, but a few lines later I see:

   /* Offset from the beginning of the string in elements.  */
   HOST_WIDE_INT eltoff;

   /* We have a known offset into the string.  Start searching there for
  a null character if we can represent it as a single HOST_WIDE_INT.  */
   if (byteoff == 0)
 eltoff = 0;
   else if (! tree_fits_shwi_p (byteoff))
 eltoff = -1;
   else
 eltoff = tree_to_shwi (byteoff) / eltsize;

   /* If the offset is known to be out of bounds, warn, and call strlen at
  runtime.  */
   if (eltoff < 0 || eltoff > maxelts)
 {

This would be doing signed/unsigned comparisons then.
Maybe that is the reason why?
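
A small sketch of what such a mixed comparison does in C. HOST_WIDE_INT is modeled here as long long, which is an assumption for the example, and the function names are hypothetical.

```c
/* Both operands signed: a negative offset is simply "not greater".  */
int
greater_signed (long long eltoff, long long maxelts)
{
  return eltoff > maxelts;
}

/* Bound unsigned: C's usual arithmetic conversions turn ELTOFF into an
   unsigned value before comparing.  The cast below spells out the
   conversion the compiler would otherwise apply implicitly.  */
int
greater_mixed (long long eltoff, unsigned long long maxelts)
{
  return (unsigned long long) eltoff > maxelts;
}
```

With eltoff = -1 and maxelts = 5 the signed form returns 0 and the mixed form returns 1, because -1 converts to the largest unsigned value. In the snippet above the preceding eltoff < 0 test would still catch that case, so the practical concern is mostly the sign-compare mismatch itself.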

>>
>> c_getstr: tree_to_uhwi needs to be protected by
>> tree_fits_uhwi_p.
> Looks correct to me.
> 
> 
>>
>> BTW: c_getstr uses string_constant which appears to be
>> able to extract wide char string initializers, but c_getstr
>> does not seem to be prepared for wide char strings:
>>
>> else if (string[string_length - 1] != '\0')
>>   {
>> /* Support only properly NUL-terminated strings but handle
>>consecutive strings within the same array, such as the six
>>substrings in "1\0002\0003".  */
>> return NULL;
>>   }
> Seems like a goof to me.
> 

Well, maybe this could be a gcc_assert instead.  Normal strings should always
be zero-terminated, even wide character strings.
Anyways probably for another time.


Bernd.


Re: [PATCH, obvious?] Some minor nits in string folding functions

2018-07-19 Thread Martin Sebor

On 07/19/2018 12:04 PM, Bernd Edlinger wrote:

Hi,


this fixes a few minor nits, which I spotted while
looking at the string folding functions:


Please hold off until the patch for bug 86532 has been reviewed,
approved, and committed.  I'm making changes in this area,
partly to address some of your comments on it, including some
of the same ones you are making here.  It doesn't help for you
to be making other changes to the same code at the same time.

Thanks
Martin



string_constant: Remove impossible check: TREE_CODE (arg)
can't be COMPONENT_REF and MEM_REF at the same time.

c_strlen: maxelts is (signed) HOST_WIDE_INT, therefore
use tree_to_shwi.

c_getstr: tree_to_uhwi needs to be protected by
tree_fits_uhwi_p.

BTW: c_getstr uses string_constant which appears to be
able to extract wide char string initializers, but c_getstr
does not seem to be prepared for wide char strings:

   else if (string[string_length - 1] != '\0')
 {
   /* Support only properly NUL-terminated strings but handle
  consecutive strings within the same array, such as the six
  substrings in "1\0002\0003".  */
   return NULL;
 }


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.





Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Martin Sebor

On 07/19/2018 01:17 AM, Richard Biener wrote:

On Wed, 18 Jul 2018, Martin Sebor wrote:


+  while (TREE_CODE (chartype) != INTEGER_TYPE)
+chartype = TREE_TYPE (chartype);

This is a bit concerning.  First under what conditions is chartype not
going to be an INTEGER_TYPE?  And under what conditions will extracting
its type ultimately lead to something that is an INTEGER_TYPE?


chartype is usually (maybe even always) pointer type here:

  const char a[] = "123";
  extern int i;
  n = strlen (&a[i]);


But your hunch was correct that the loop isn't safe because
the element type need not be an integer (I didn't know/forgot
that the function is called for non-strings too).  The loop
should be replaced by:

  while (TREE_CODE (chartype) == ARRAY_TYPE
 || TREE_CODE (chartype) == POINTER_TYPE)
chartype = TREE_TYPE (chartype);


As this function may be called "late" you need to cope with
the middle-end ignoring type changes and thus happily
passing int *** directly rather than (char *) of that.

Also doesn't the above yield int for int *[]?


I don't think it ever gets this far for either a pointer to
an array of int, or for an array of pointers to int.  So for
something like the following the function fails earlier:

  const int* const a[2] = { ... };
  const char* (const *p)[2] = &a;

  int f (void)
  {
return __builtin_memcmp (*p, "12345678", 8);
  }

(Assuming this is what you were asking about.)


I guess you really want

   if (POINTER_TYPE_P (chartype))
 chartype = TREE_TYPE (chartype);
   while (TREE_CODE (chartype) == ARRAY_TYPE)
 chartype = TREE_TYPE (chartype);

?


That seems to work too.  Attached is an update with this tweak.
The update also addresses some of Bernd's comments: it removes
the pointless second test in:

if (TREE_CODE (type) == ARRAY_TYPE
&& TREE_CODE (type) != INTEGER_TYPE)

the unused assignment to chartype in:

   else if (DECL_P (arg))
 {
   array = arg;
   chartype = TREE_TYPE (arg);
 }

and calls string_constant() instead of strnlen() to compute
the length of a generic string.
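
The difference between the two type-stripping loops discussed above can be sketched with a toy type representation. This is not GCC's tree API; every name below is made up for the illustration.

```c
#include <stddef.h>

enum toy_kind { TOY_INTEGER, TOY_POINTER, TOY_ARRAY };

struct toy_type
{
  enum toy_kind kind;
  const struct toy_type *inner;   /* pointed-to or element type */
};

/* Hand-built types: int, int *, int *[], int [], int (*)[].  */
static const struct toy_type toy_int = { TOY_INTEGER, NULL };
static const struct toy_type toy_int_ptr = { TOY_POINTER, &toy_int };
static const struct toy_type toy_int_ptr_array = { TOY_ARRAY, &toy_int_ptr };
static const struct toy_type toy_int_array = { TOY_ARRAY, &toy_int };
static const struct toy_type toy_ptr_to_int_array
  = { TOY_POINTER, &toy_int_array };

/* Strip every array and pointer layer (the first proposed loop).  */
const struct toy_type *
strip_all (const struct toy_type *t)
{
  while (t->kind == TOY_ARRAY || t->kind == TOY_POINTER)
    t = t->inner;
  return t;
}

/* Strip at most one pointer, then any array layers (the suggested
   replacement).  */
const struct toy_type *
strip_one_pointer_then_arrays (const struct toy_type *t)
{
  if (t->kind == TOY_POINTER)
    t = t->inner;
  while (t->kind == TOY_ARRAY)
    t = t->inner;
  return t;
}
```

For the toy equivalent of int *[] the first loop ends at int, while the second stops at int *, which a later INTEGER_TYPE test would reject. For a pointer to an array of int both reach int.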

Other improvements are possible in this area but they are
orthogonal to the bug I'm trying to fix so I'll post separate
patches for some of those.

Martin
PR tree-optimization/86532 - Wrong code due to a wrong strlen folding starting with r262522

gcc/ChangeLog:

	PR tree-optimization/86532
	* builtins.h (string_length): Declare.
	* builtins.c (c_strlen): Correct handling of non-constant offsets.	
	(check_access): Be prepared for non-constant length ranges.
	(string_length): Make extern.
	* expr.c (string_constant): Only handle the minor non-constant
	array index.  Use string_constant to compute the length of
	a generic string constant.

gcc/testsuite/ChangeLog:

	PR tree-optimization/86532
	* gcc.c-torture/execute/strlen-2.c: New test.
	* gcc.c-torture/execute/strlen-3.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index c069d66..ceb477d 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -517,11 +517,11 @@ get_pointer_alignment (tree exp)
   return align;
 }
 
-/* Return the number of non-zero elements in the sequence
+/* Return the number of leading non-zero elements in the sequence
[ PTR, PTR + MAXELTS ) where each element's size is ELTSIZE bytes.
ELTSIZE must be a power of 2 less than 8.  Used by c_strlen.  */
 
-static unsigned
+unsigned
 string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
 {
   gcc_checking_assert (eltsize == 1 || eltsize == 2 || eltsize == 4);
@@ -605,14 +605,21 @@ c_strlen (tree src, int only_value)
 
   /* Set MAXELTS to sizeof (SRC) / sizeof (*SRC) - 1, the maximum possible
  length of SRC.  Prefer TYPE_SIZE() to TREE_STRING_LENGTH() if possible
- in case the latter is less than the size of the array.  */
-  HOST_WIDE_INT maxelts = TREE_STRING_LENGTH (src);
+ in case the latter is less than the size of the array, such as when
+ SRC refers to a short string literal used to initialize a large array.
+ In that case, the elements of the array after the terminating NUL are
+ all NUL.  */
+  HOST_WIDE_INT strelts = TREE_STRING_LENGTH (src);
+  strelts = strelts / eltsize - 1;
+
+  HOST_WIDE_INT maxelts = strelts;
   tree type = TREE_TYPE (src);
   if (tree size = TYPE_SIZE_UNIT (type))
 if (tree_fits_shwi_p (size))
-  maxelts = tree_to_uhwi (size);
-
-  maxelts = maxelts / eltsize - 1;
+  {
+	maxelts = tree_to_uhwi (size);
+	maxelts = maxelts / eltsize - 1;
+  }
 
   /* PTR can point to the byte representation of any string type, including
  char* and wchar_t*.  */
@@ -620,10 +627,12 @@ c_strlen (tree src, int only_value)
 
   if (byteoff && TREE_CODE (byteoff) != INTEGER_CST)
 {
-  /* If the string has an internal zero byte (e.g., "foo\0bar"), we can't
-	 compute the offset to the following null if we don't know where to
+  /* If the string has an internal NUL character followed by any
+	 non-NUL characters (e.g.


Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Martin Sebor

On 07/19/2018 12:19 AM, Bernd Edlinger wrote:

  if (TREE_CODE (idx) != INTEGER_CST
  && TREE_CODE (argtype) == POINTER_TYPE)
{
  /* From a pointer (but not array) argument extract the variable
 index to prevent get_addr_base_and_unit_offset() from failing
 due to it.  Use it later to compute the non-constant offset
 into the string and return it to the caller.  */
  varidx = idx;
  ref = TREE_OPERAND (arg, 0);

  tree type = TREE_TYPE (arg);
  if (TREE_CODE (type) == ARRAY_TYPE
  && TREE_CODE (type) != INTEGER_TYPE)
return NULL_TREE;
}

the condition TREE_CODE(type) == ARRAY_TYPE
&& TREE_CODE (type) != INTEGER_TYPE looks funny.
Check for ARRAY_TYPE should imply != INTEGER_TYPE.


Yes, that other test was superfluous.  I've removed it in
the updated patch I just posted.



  else if (DECL_P (arg))
{
  array = arg;
  chartype = TREE_TYPE (arg);
}

chartype is only used in the if (varidx) block, but that is always zero
in this case.


True.  Chartype could be left uninitialized.  I did it mostly
out of an abundance of caution (I don't like leaving variables
with such big scope uninitialized).  In the latest update
I instead initialize chartype to null.



  while (TREE_CODE (chartype) == ARRAY_TYPE
 || TREE_CODE (chartype) == POINTER_TYPE)
chartype = TREE_TYPE (chartype);

you multiply sizeof(chartype) with varidx but you should probably
use the type of the TREE_OPERAND (arg, 0) above instead.


The offset returned to the caller is relative to the character
array so varidx must also be the index into the same array.
Otherwise it cannot be used.  The difference is this:

  const char a[][4] = { "12", "123" };

  int x = strlen (&a[0][i]);   // use i as the offset
  int y = strlen (&a[i][0]);   // bail
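
To make the distinction concrete, here is a sketch in plain C of the byte offsets the two expressions produce. The helper names are hypothetical.

```c
#include <stddef.h>

static const char a[][4] = { "12", "123" };

/* Byte offset of &a[0][i] from the start of a: the index counts
   characters, so it can serve directly as an offset into the string.  */
size_t
offset_inner (size_t i)
{
  return (size_t) ((const char *) &a[0][i] - (const char *) a);
}

/* Byte offset of &a[i][0] from the start of a: the index counts whole
   rows of 4 bytes, so it selects a different string rather than a
   position within one.  */
size_t
offset_outer (size_t i)
{
  return (size_t) ((const char *) &a[i] - (const char *) a);
}
```

With i = 1 the first form gives offset 1 (inside "12") and the second gives offset 4 (the start of "123").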



This is not in the patch, but I don't like it at all, because it compares the
size of a single initializer against the full size of the array.  But it should
be the innermost enclosing array:

  tree array_size = DECL_SIZE_UNIT (array);
  if (!array_size || TREE_CODE (array_size) != INTEGER_CST)
return NULL_TREE;

  /* Avoid returning a string that doesn't fit in the array
 it is stored in, like
 const char a[4] = "abcde";
 but do handle those that fit even if they have excess
 initializers, such as in
 const char a[4] = "abc\000\000";
 The excess elements contribute to TREE_STRING_LENGTH()
 but not to strlen().  */
  unsigned HOST_WIDE_INT length
= strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init));
  if (compare_tree_int (array_size, length + 1) < 0)
return NULL_TREE;


The use of strnlen here isn't right for wide strings and needs
to be replaced with a call to string_length() from builtins.c.
I've made that change in the updated patch.
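
For illustration, a width-aware length count might look like the sketch below. The parameter list mirrors the string_length() declaration shown in the patch, but the body is my own and is not GCC's implementation; the name toy_string_length is hypothetical, and ELTSIZE is assumed to be 1, 2 or 4.

```c
#include <string.h>

/* Count the leading non-zero elements in [PTR, PTR + MAXELTS), where
   each element is ELTSIZE bytes wide.  An element counts as NUL only
   if all of its bytes are zero.  */
unsigned
toy_string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
{
  static const unsigned char zeros[4] = { 0, 0, 0, 0 };
  const unsigned char *p = (const unsigned char *) ptr;
  unsigned n;

  for (n = 0; n < maxelts; n++)
    if (memcmp (p + (size_t) n * eltsize, zeros, eltsize) == 0)
      break;
  return n;
}
```

On the six bytes 'a', 0, 'b', 0, 0, 0 a byte-wise count (which is what strnlen does) stops after one byte, whereas counting 2-byte elements gives two characters.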

The remaining concern is orthogonal to the changes to fix
the wrong code.  As I mentioned, I opened a few bugs to
improve things in this area: 86434, 86572, 86552.  As
the first step I'm about to post a solution for 86552.



consider the following test case:
$ cat part.c
const char a[2][3][8] = { { "a", "bb", "ccc"},
  { "", "e", "ff" } };

int main ()
{
  int n = __builtin_strlen (&a[0][1][0]);

  if (n == 30)
__builtin_abort ();
}


With my patch for pr86552 GCC prints:

warning: initializer-string for array of chars is too long
 const char a[2][3][8] = { { "a", "bb", "ccc"},
  ^~~~
note: (near initialization for ‘a[0][1]’)
In function ‘main’:
warning: ‘__builtin_strlen’ argument missing terminating nul 
[-Wstringop-overflow=]

   int n = __builtin_strlen (&a[0][1][0]);
   ^~~~
note: referenced argument declared here
 const char a[2][3][8] = { { "a", "bb", "ccc"},

We can discuss what value, if any, might be more appropriate
to fold the length to in these undefined cases but I would
prefer to have that discussion separately from this review.

Martin


Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Martin Sebor

On 07/19/2018 07:23 AM, Bernd Edlinger wrote:

@@ -633,12 +642,17 @@ c_strlen (tree src, int only_value)
return ssize_int (0);

   /* We don't know the starting offset, but we do know that the string
-has no internal zero bytes.  We can assume that the offset falls
-within the bounds of the string; otherwise, the programmer deserves
-what he gets.  Subtract the offset from the length of the string,
-and return that.  This would perhaps not be valid if we were dealing
-with named arrays in addition to literal string constants.  */
-  return size_diffop_loc (loc, size_int (maxelts * eltsize), byteoff);
+has no internal zero bytes.  If the offset falls within the bounds
+of the string subtract the offset from the length of the string,
+and return that.  Otherwise the length is zero.  Take care to
+use SAVE_EXPR in case the OFFSET has side-effects.  */
+  tree offsave = TREE_SIDE_EFFECTS (byteoff) ? save_expr (byteoff) : 
byteoff;
+  offsave = fold_convert (ssizetype, offsave);
+  tree condexp = fold_build2_loc (loc, LE_EXPR, boolean_type_node, offsave,
+ build_int_cst (ssizetype, len * eltsize));
+  tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize), 
offsave);
+  return fold_build3_loc (loc, COND_EXPR, ssizetype, condexp, lenexp,
+ build_zero_cst (ssizetype));



This computes the number of bytes.
c_strlen is supposed to return number of (wide) characters:


You're right.  I guess that must mean the function isn't used
for wide character strings.  The original code also computes
a byte offset and so does the new expression.  I have no
problem changing it, but it feels like a change that should
be made independently of this bug fix.
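
To make the bytes-versus-characters point concrete, here is a small sketch
in plain C of the conditional the new expression builds (offset within
bounds ? length - offset : 0), together with one possible way of converting
the result to a character count.  The function names are hypothetical and
the division step is an assumption of this example, not something the patch
does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: bytes remaining after BYTEOFF in a string of STRBYTES
   bytes, or zero when the offset is past the end.  Negative offsets are
   not treated specially in this sketch.  */
static ptrdiff_t
remaining_bytes (ptrdiff_t byteoff, ptrdiff_t strbytes)
{
  return byteoff <= strbytes ? strbytes - byteoff : 0;
}

/* Hypothetical: the same quantity expressed in characters of ELTSIZE
   bytes each.  */
static ptrdiff_t
remaining_elts (ptrdiff_t byteoff, ptrdiff_t strelts, ptrdiff_t eltsize)
{
  assert (eltsize > 0);
  return remaining_bytes (byteoff, strelts * eltsize) / eltsize;
}
```

For 1-byte characters the two results coincide, which is consistent with
the problem going unnoticed for narrow strings.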



/* Compute the length of a null-terminated character string or wide
character string handling character sizes of 1, 2, and 4 bytes.
TREE_STRING_LENGTH is not the right way because it evaluates to
the size of the character array in bytes (as opposed to characters)
and because it can contain a zero byte in the middle.



@@ -11343,16 +11356,15 @@ string_constant (tree arg, tree *ptr_offset)
 {
   if (TREE_CODE (TREE_TYPE (array)) != ARRAY_TYPE)
return NULL_TREE;
-  if (tree eltsize = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array))))
-   {
- /* Add the scaled variable index to the constant offset.  */
- tree eltoff = fold_build2 (MULT_EXPR, TREE_TYPE (offset),
-fold_convert (sizetype, varidx),
-eltsize);
- offset = fold_build2 (PLUS_EXPR, TREE_TYPE (offset), offset, eltoff);
-   }
-  else
-   return NULL_TREE;
+
+  while (TREE_CODE (chartype) != INTEGER_TYPE)
+   chartype = TREE_TYPE (chartype);
+
+  /* Set the non-constant offset to the non-constant index scaled
+by the size of the character type.  */
+  offset = fold_build2 (MULT_EXPR, TREE_TYPE (offset),
+   fold_convert (sizetype, varidx),
+   TYPE_SIZE_UNIT (chartype));


Here you fix the computation for wide character strings,
but I see no test cases with wide character strings.


The change above corrects the offset when eltsize refers to
the size of an array rather than its character elements.
It wasn't made to fix how wide strings are handled.  I don't
know if the function is ever called for them in a way that
matters (there are no wide character built-ins).  But it's
something to look into.



But down here you use a non-wide character function on a
wide character string:

   /* Avoid returning a string that doesn't fit in the array
  it is stored in, like
  const char a[4] = "abcde";
  but do handle those that fit even if they have excess
  initializers, such as in
  const char a[4] = "abc\000\000";
  The excess elements contribute to TREE_STRING_LENGTH()
  but not to strlen().  */
   unsigned HOST_WIDE_INT length
 = strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init));


Actually I begin to wonder, if all this wide character stuff is
really so common that we have to optimize it.


I've replaced the strnlen() call with string_length() in
the updated patch.  The code is entered for wchar_t (and
other) strings but since there are no wide character
built-ins in GCC I suspect that optimizations that apply
to them that don't also apply to other arrays are few
and far between.

At the same time, as wide characters (beyond wchar_t) are
increasingly becoming used in software I wouldn't want to
preclude future optimizations from making use of those that
benefit narrow strings.  In any event, that too is
a discussion to have independently of this bug fix.

Martin



[PATCH] warn for strlen of arrays with missing nul (PR 86552)

2018-07-19 Thread Martin Sebor

In the discussion of my patch for pr86532 Bernd noted that
GCC silently accepts constant character arrays with no
terminating nul as arguments to strlen (and other string
functions).

The attached patch is a first step in detecting these kinds
of bugs in strlen calls by issuing -Wstringop-overflow.
The next step is to modify all other handlers of built-in
functions to detect the same problem (not part of this patch).
Yet another step is to detect these problems in arguments
initialized using the non-string form:

  const char a[] = { 'a', 'b', 'c' };
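
As a rough sketch of the property being checked (not of the patch itself),
an array of known size holds a valid string only if a nul appears somewhere
within its bounds.  The helper name below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: return true if the SIZE-byte array ARR contains a
   terminating nul within its bounds.  */
static bool
has_terminating_nul (const char *arr, size_t size)
{
  assert (arr != NULL);
  return memchr (arr, '\0', size) != NULL;
}
```

With the initializer above the array has exactly three elements and none of
them is a nul, so calling strlen on it reads past the end.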

This patch is meant to apply on top of the one for bug 86532
(I tested it with an earlier version of that patch so there
is code in the context that does not appear in the latest
version of the other diff).

Martin

PR tree-optimization/86552 - missing warning for reading past the end of non-string arrays

gcc/ChangeLog:

	PR tree-optimization/86552
	* builtins.c (warn_string_no_nul): New function.
	(string_length): Add argument and use it.
	(c_strlen): Same.
	(expand_builtin_strlen): Detect missing nul.
	(fold_builtin_1): Adjust.
	* builtins.h (c_strlen): Add argument.
	* expr.c (string_constant): Add arguments.  Detect missing nul
	terminator and outermost declaration it's missing in.
	* expr.h (string_constant): Add argument.
	* fold-const.c (c_getstr): Revert test.

gcc/testsuite/ChangeLog:

	PR tree-optimization/86552
	* gcc.dg/warn-string-no-nul.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 03cf012..9885c4b 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -150,7 +150,7 @@ static tree stabilize_va_list_loc (location_t, tree, int);
 static rtx expand_builtin_expect (tree, rtx);
 static tree fold_builtin_constant_p (tree);
 static tree fold_builtin_classify_type (tree);
-static tree fold_builtin_strlen (location_t, tree, tree);
+static tree fold_builtin_strlen (location_t, tree, tree, tree);
 static tree fold_builtin_inf (location_t, tree, int);
 static tree rewrite_call_expr (location_t, tree, int, tree, int, ...);
 static bool validate_arg (const_tree, enum tree_code code);
@@ -550,6 +550,36 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
   return n;
 }
 
+/* For a call expression EXP to a function that expects a string argument,
+   issue a diagnostic due to it being called with an argument NONSTR
+   that is a character array with no terminating NUL.  */
+
+static void
+warn_string_no_nul (location_t loc, tree exp, tree fndecl, tree nonstr)
+{
+  loc = expansion_point_location_if_in_system_header (loc);
+
+  bool warned;
+  if (exp)
+{
+  if (!fndecl)
+	fndecl = get_callee_fndecl (exp);
+  warned = warning_at (loc, OPT_Wstringop_overflow_,
+			   "%K%qD argument missing terminating nul",
+			   exp, fndecl);
+}
+  else
+{
+  gcc_assert (fndecl);
+  warned = warning_at (loc, OPT_Wstringop_overflow_,
+			   "%qD argument missing terminating nul",
+			   fndecl);
+}
+
+  if (warned && DECL_P (nonstr))
+inform (DECL_SOURCE_LOCATION (nonstr), "referenced argument declared here");
+}
+
 /* Compute the length of a null-terminated character string or wide
character string handling character sizes of 1, 2, and 4 bytes.
TREE_STRING_LENGTH is not the right way because it evaluates to
@@ -567,13 +597,17 @@ string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
accesses.  Note that this implies the result is not going to be emitted
into the instruction stream.
 
+   When ARR is non-null and the string is not properly nul-terminated,
+   set *ARR to the declaration of the outermost constant object whose
+   initializer (or one of its elements) is not nul-terminated.
+
The value returned is of type `ssizetype'.
 
Unfortunately, string_constant can't access the values of const char
arrays with initializers, so neither can we do so here.  */
 
 tree
-c_strlen (tree src, int only_value)
+c_strlen (tree src, int only_value, tree *arr /* = NULL */)
 {
   STRIP_NOPS (src);
   if (TREE_CODE (src) == COND_EXPR
@@ -581,24 +615,31 @@ c_strlen (tree src, int only_value)
 {
   tree len1, len2;
 
-  len1 = c_strlen (TREE_OPERAND (src, 1), only_value);
-  len2 = c_strlen (TREE_OPERAND (src, 2), only_value);
+  len1 = c_strlen (TREE_OPERAND (src, 1), only_value, arr);
+  len2 = c_strlen (TREE_OPERAND (src, 2), only_value, arr);
   if (tree_int_cst_equal (len1, len2))
 	return len1;
 }
 
   if (TREE_CODE (src) == COMPOUND_EXPR
   && (only_value || !TREE_SIDE_EFFECTS (TREE_OPERAND (src, 0))))
-return c_strlen (TREE_OPERAND (src, 1), only_value);
+return c_strlen (TREE_OPERAND (src, 1), only_value, arr);
 
   location_t loc = EXPR_LOC_OR_LOC (src, input_location);
 
   /* Offset from the beginning of the string in bytes.  */
   tree byteoff;
-  src = string_constant (src, &byteoff);
-  if (src == 0)
+  /* Set if array is nul-terminated, false otherwise.  */
+  bool nulterm;
+  src = string_constant (src, &byteoff, &nu

Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Bernd Edlinger
On 07/19/18 22:03, Martin Sebor wrote:
> On 07/19/2018 07:23 AM, Bernd Edlinger wrote:
>>> @@ -633,12 +642,17 @@ c_strlen (tree src, int only_value)
>>>  return ssize_int (0);
>>>
>>>    /* We don't know the starting offset, but we do know that the string
>>> - has no internal zero bytes.  We can assume that the offset falls
>>> - within the bounds of the string; otherwise, the programmer deserves
>>> - what he gets.  Subtract the offset from the length of the string,
>>> - and return that.  This would perhaps not be valid if we were dealing
>>> - with named arrays in addition to literal string constants.  */
>>> -  return size_diffop_loc (loc, size_int (maxelts * eltsize), byteoff);
>>> + has no internal zero bytes.  If the offset falls within the bounds
>>> + of the string subtract the offset from the length of the string,
>>> + and return that.  Otherwise the length is zero.  Take care to
>>> + use SAVE_EXPR in case the OFFSET has side-effects.  */
>>> +  tree offsave = TREE_SIDE_EFFECTS (byteoff) ? save_expr (byteoff) : 
>>> byteoff;
>>> +  offsave = fold_convert (ssizetype, offsave);
>>> +  tree condexp = fold_build2_loc (loc, LE_EXPR, boolean_type_node, 
>>> offsave,
>>> +  build_int_cst (ssizetype, len * eltsize));
>>> +  tree lenexp = size_diffop_loc (loc, ssize_int (strelts * eltsize), 
>>> offsave);
>>> +  return fold_build3_loc (loc, COND_EXPR, ssizetype, condexp, lenexp,
>>> +  build_zero_cst (ssizetype));
>>
>>
>> This computes the number of bytes.
>> c_strlen is supposed to return number of (wide) characters:
> 
> You're right.  I guess that must mean the function isn't used
> for wide character strings.  The original code also computes
> a byte offset and so does the new expression.  I have no
> problem changing it, but if feels like a change that should
> be made independently of this bug fix.
> 
>>
>> /* Compute the length of a null-terminated character string or wide
>>     character string handling character sizes of 1, 2, and 4 bytes.
>>     TREE_STRING_LENGTH is not the right way because it evaluates to
>>     the size of the character array in bytes (as opposed to characters)
>>     and because it can contain a zero byte in the middle.
>>
>>
>>> @@ -11343,16 +11356,15 @@ string_constant (tree arg, tree *ptr_offset)
>>>  {
>>>    if (TREE_CODE (TREE_TYPE (array)) != ARRAY_TYPE)
>>>  return NULL_TREE;
>>> -  if (tree eltsize = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (array
>>> -    {
>>> -  /* Add the scaled variable index to the constant offset.  */
>>> -  tree eltoff = fold_build2 (MULT_EXPR, TREE_TYPE (offset),
>>> - fold_convert (sizetype, varidx),
>>> - eltsize);
>>> -  offset = fold_build2 (PLUS_EXPR, TREE_TYPE (offset), offset, eltoff);
>>> -    }
>>> -  else
>>> -    return NULL_TREE;
>>> +
>>> +  while (TREE_CODE (chartype) != INTEGER_TYPE)
>>> +    chartype = TREE_TYPE (chartype);
>>> +
>>> +  /* Set the non-constant offset to the non-constant index scaled
>>> + by the size of the character type.  */
>>> +  offset = fold_build2 (MULT_EXPR, TREE_TYPE (offset),
>>> +    fold_convert (sizetype, varidx),
>>> +    TYPE_SIZE_UNIT (chartype));
>>
>> here you fix the computation for wide character strings,
>> but I see no test cases with wide character stings.
> 
> The change above corrects the offset when eltsize refers to
> the size of an array rather than its character elements.
> It wasn't made to fix how wide strings are handled.  I don't
> know if the function is ever called for them in a way that
> matters (there are no wide character built-ins).  But it's
> something to look into.
> 

Now I am surprised.
Your loop above always locates an INTEGER_TYPE;
in the later version it skips all ARRAY_TYPEs.
So TYPE_SIZE_UNIT is never the size of an array,
it is always the size of the element type (char, wchar_t, int *)
that the array is made of.
So you compute offset = varidx * sizeof (char),
but sizeof (char) is 1, so I naturally thought
you were concerned about sizeof (wchar_t).

And indeed I can write strlen((char*) & L"test"[i]).
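
A small sketch of the scaling in question, under the assumption that the
byte offset of element IDX is simply IDX times the element size (the helper
name is made up for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical: byte offset of element IDX in an array whose elements
   are ELTSIZE bytes wide.  */
static size_t
byte_offset (size_t idx, size_t eltsize)
{
  assert (eltsize != 0);
  return idx * eltsize;
}
```

For a char array the scale factor is 1 and the multiplication is a no-op;
for the wide literal in the cast expression above the factor is
sizeof (wchar_t).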


>>
>> But down here you use a non-wide character function on a
>> wide character string:
>>
>>    /* Avoid returning a string that doesn't fit in the array
>>   it is stored in, like
>>   const char a[4] = "abcde";
>>   but do handle those that fit even if they have excess
>>   initializers, such as in
>>   const char a[4] = "abc\000\000";
>>   The excess elements contribute to TREE_STRING_LENGTH()
>>   but not to strlen().  */
>>    unsigned HOST_WIDE_INT length
>>  = strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init));
>>
>>
>> Actually I begin to wonder, if all this wide character stuff is
>> really so common that we have to optimize it.
> 
> I've replaced the strnlen() call with string_length() in
> the updated 

Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Bernd Edlinger
>@@ -11413,8 +11429,10 @@ string_constant (tree arg, tree *ptr_offset)
>  const char a[4] = "abc\000\000";
>  The excess elements contribute to TREE_STRING_LENGTH()
>  but not to strlen().  */
>-  unsigned HOST_WIDE_INT length
>-= strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init));
>+  unsigned HOST_WIDE_INT charsize
>+= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (init;
>+  unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
>+  length = string_length (TREE_STRING_POINTER (init), charsize, length);
>   if (compare_tree_int (array_size, length + 1) < 0)
> return NULL_TREE;

But TREE_STRING_LENGTH is the length in bytes, including the NUL termination.
That length is then passed to string_length, which expects it in units of
charsize and returns the number of characters.
Then compare_tree_int compares that against array_size, which is in units of
bytes, but is not the size of the innermost enclosing array.
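
To spell out the unit bookkeeping, here is a sketch of a fit check that
keeps bytes and elements apart.  It is an illustration under assumed names
(string_fits_in_array and its parameters are hypothetical), not a proposed
replacement for the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: INIT holds INIT_BYTES bytes of initializer made of
   CHARSIZE-byte characters.  Return true if the string it contains,
   plus its terminator, fits in an array of ARRAY_BYTES bytes.  */
static bool
string_fits_in_array (const void *init, size_t init_bytes,
                      size_t charsize, size_t array_bytes)
{
  assert (charsize != 0);

  const unsigned char *p = init;
  size_t maxelts = init_bytes / charsize;   /* bytes -> elements */
  size_t len = 0;
  for (; len < maxelts; len++)
    {
      bool zero = true;
      for (size_t i = 0; i < charsize; i++)
        if (p[len * charsize + i] != 0)
          zero = false;
      if (zero)
        break;
    }

  /* LEN is in elements, so scale it back to bytes before comparing.  */
  return (len + 1) * charsize <= array_bytes;
}
```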

I really don't see why we need to support wide characters especially
when there is no reasonable test coverage, and no usable wstrlen
builtin first.


Bernd.


Re: [PATCH][Middle-end]change char type to unsigned char type when expanding strcmp/strncmp

2018-07-19 Thread Qing Zhao


> On Jul 19, 2018, at 2:24 PM, Jakub Jelinek  wrote:
> 
> On Thu, Jul 19, 2018 at 02:06:16PM -0500, Qing Zhao wrote:
>>> If you expand it as (int) ((unsigned char *)p)[n] - (int) ((unsigned char 
>>> *)q)[n]
>>> then aren't you relying on int type to have wider precision than unsigned 
>>> char
>>> (or unit_mode being narrower than mode)?
>> 
>> do you imply that we should only expand it as  (int) ((unsigned char *)p)[n] 
>> - (int) ((unsigned char *)q)[n] when we are sure
>> int type is wider than unsigned char? 
> 
> Yes.
> 
>>> I don't see anywhere where you'd
>>> give up on doing the inline expansion on targets where e.g. lowest
>>> addressable unit would be 16-bit and int would be 16-bit too.
>> 
>> even on this targets, is char type still 8-bit?
>> then int type is still wider than char?
> 
> C requires that int is at least 16-bit wide, so the sizeof (int) == sizeof
> (char) case is only possible say with 16-bit char and 16-bit int, or 32-bit
> char and 32-bit int etc.
> 
>>> On targets where int is as wide as char, one would need to expand it instead
>>> as something like:
>>> if (((unsigned char *)p)[n] == ((unsigned char *)q)[n]) loop;
>>> ret = ((unsigned char *)p)[n] < ((unsigned char *)q)[n] ? -1 : 1;
>>> or similar or just use the library routine.
>> 
>> 
>> even when int type is as wide as char,  expand it as (int) ((unsigned char 
>> *)p)[n] - (int) ((unsigned char *)q)[n]
>> should still be correct (even though not optimal), doesn’t it?
> 
> No.  Consider p[n] being e.g. 1 and q[n] being __SCHAR_MAX__ + 3U and 16-bit
> int and 16-bit char.  Then (unsigned char) 0x0001 < (unsigned char) 0x8002,
> so it should return a negative number.  But (int) (0x0001U - 0x8002U) is
> 0x7fff, which is a positive int.  Now, if int is 17-bit and char is 16-bit,
> this works fine, because is then -0x8001 and thus negative.

Okay, I see now.
I really appreciate your detailed explanation.
> 
> The above really works only if int is at least one bit wider than unsigned
> char.

Then I will add a check to exclude the inlining when int is NOT wider than 
unsigned char on the target.

Is the following the correct check?  (exp is the call to strcmp)

 if (CHAR_TYPE_SIZE >= TYPE_PRECISION (TREE_TYPE (exp)))
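
The arithmetic in the counterexample can be reproduced on an ordinary host
by simulating a 16-bit char and a 16-bit int with uint16_t and int16_t.
This is only a sketch; the function names are invented for the example.

```c
#include <stdint.h>

/* Hypothetical: subtraction-based comparison where the result type is
   no wider than the character type (both 16 bits here).  The final
   conversion is implementation-defined for values above INT16_MAX.  */
static int16_t
diff_same_width (uint16_t p, uint16_t q)
{
  return (int16_t) (uint16_t) (p - q);
}

/* Hypothetical: comparison-based result, which does not depend on the
   result type being wider than the character type.  */
static int
cmp_by_compare (uint16_t p, uint16_t q)
{
  return (p > q) - (p < q);
}
```

With p = 0x0001 and q = 0x8002 the subtraction wraps to 0x7fff, which is
positive even though p < q, while the comparison-based form yields a
negative value.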

 
Thanks.

Qing




Re: RFC: Patch to implement Aarch64 SIMD ABI

2018-07-19 Thread Steve Ellcey
On Thu, 2018-07-19 at 08:31 +0100, Richard Sandiford wrote:
> 
> > @@ -4706,8 +4730,11 @@ aarch64_process_components (sbitmap
> > components, bool prologue_p)
> >    while (regno != last_regno)
> >  {
> >    /* AAPCS64 section 5.1.2 requires only the bottom 64 bits to be saved
> > -  so DFmode for the vector registers is enough.  */
> > -  machine_mode mode = GP_REGNUM_P (regno) ? E_DImode : E_DFmode;
> > +  so DFmode for the vector registers is enough.  For simd functions
> > + we want to save the entire register.  */
> > +  machine_mode mode = GP_REGNUM_P (regno) ? E_DImode
> > + : (aarch64_simd_function_p (cfun->decl) ? E_TFmode : E_DFmode);
> This condition also occurs in aarch64_push_regs and aarch64_pop_regs.
> It'd probably be worth splitting it out into a subfunction.
> 
> I think you also need to handle the writeback cases, which should work
> for Q registers too.  This will mean extra loadwb_pair and storewb_pair
> patterns.
> 
> LGTM otherwise FWIW.

Yes, I see where I missed this in aarch64_push_regs
and aarch64_pop_regs.  I think that is why the second of
Wilco's two examples (f2) is wrong.  I am unclear about
exactly what is meant by writeback and why we have it and
how that and callee_adjust are used.  Any chance someone
could help me understand this part of the prologue/epilogue
code better?  The comments in aarch64.c/aarch64.h aren't
really helping me understand what the code is doing or
why it is doing it.

Steve Ellcey
sell...@cavium.com


Re: [PATCH] restore -Warray-bounds for string literals (PR 83776)

2018-07-19 Thread Jeff Law
On 07/13/2018 05:45 PM, Martin Sebor wrote:
>>
>> +  offset_int ofr[] = {
>> +   wi::to_offset (fold_convert (ptrdiff_type_node, vr->min)),
>> +   wi::to_offset (fold_convert (ptrdiff_type_node, vr->max))
>> +  };
>>
>> huh.  Do you maybe want to use widest_int for ofr[]?  What's
>> wrong with wi::to_offset (vr->min)?  Building another intermediate
>> tree node here looks just bogus.
> 
> I need to convert size_type indices to signed to keep their sign
> if it's negative and include it in warnings.  I've moved the code
> into a conditional where it's used to minimize the cost but other
> than that I don't know how else to convert it.
> 
>>
>> What are you trying to do in this loop anyways?
> 
> The loop It determines the range of the final index/offset for
> a series of POINTER_PLUS_EXPRs.  It handles cases like this:
> 
>   int g (int i, int j, int k)
>   {
> if (i < 1) i = 1;
> if (j < 1) j = 1;
> if (k < 1) k = 1;
> 
> const char *p0 = "123";
> const char *p1 = p0 + i;
> const char *p2 = p1 + j;
> const char *p3 = p2 + k;
> 
> // p2[3] and p3[1] are out of bounds
> return p0[4] + p1[3] + p2[2] + p3[1];
>   }
> 
>> I suppose
>> look at
>>
>>   p_1 = &obj[i_2];  // already bounds-checked, but with ignore_off_by_one
>>   ... = MEM[p_1 + CST];
>>
>> ?  But then
>>
>> +  if (TREE_CODE (varoff) != SSA_NAME)
>> +   break;
>>
>> you should at least handle INTEGER_CSTs here?
> 
> It's already handled (and set in CSTOFF).  There should be no
> more constant offsets after that (I haven't come across any.)
> 
>>
>> +  if (!vr || vr->type == VR_UNDEFINED || !vr->min || !vr->max)
>> +   break;
>>
>> please use positive tests here, VR_RANGE || VR_ANTI_RANGE.  As usual
>> the anti-range handling looks odd.  Please add comments so we can follow
>> what you were thinking when writing range merging code.  Even better if you
>>
>> can stick to use existing code rather than always re-inventing the wheel...
>>
> 
> The anti-range handling is to conservatively add
> [-MAXOBJSIZE -1, MAXOBJSIZE] to OFFRANGE.  I've added comments
> to make it clear.  I'd be more than happy to reuse existing
> code if I knew where to find it (if it exists).  It sure would
> save me lots of work to have a library of utility functions to
> call instead of rolling my own code each time.
Finding stuff is never easy :(  GCC's just gotten so big it's virtually
impossible for anyone to know all the APIs.

The suggestion I'd have would be to (when possible) factor this stuff
into routines you can reuse.  We (as a whole) have this tendency to
open-code all kinds of things rather than factoring the code into
reusable routines.

In addition to increasing the probability that you'll be able to reuse
code later, just reading a new function's header tends to make me (as a
reviewer) internally ask if there's already a routine we should be using
instead.  When it's open-coded it's significantly harder to spot those
cases (at least for me).
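
As an illustration of the kind of factoring meant here, the range
accumulation described in the quoted text could be expressed with a couple
of small helpers.  This is a sketch in plain C with made-up names
(struct offrange, offrange_add, offrange_outside); it does not use GCC's
value_range or wide-int APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: an inclusive range of possible offsets.  */
struct offrange { ptrdiff_t min, max; };

/* Add two offsets, saturating instead of overflowing.  */
static ptrdiff_t
sat_add (ptrdiff_t a, ptrdiff_t b)
{
  if (b > 0 && a > PTRDIFF_MAX - b)
    return PTRDIFF_MAX;
  if (b < 0 && a < PTRDIFF_MIN - b)
    return PTRDIFF_MIN;
  return a + b;
}

/* Range of the sum of an offset in A and an offset in B.  */
static struct offrange
offrange_add (struct offrange a, struct offrange b)
{
  struct offrange r = { sat_add (a.min, b.min), sat_add (a.max, b.max) };
  return r;
}

/* True if every offset in R is outside an array of NELTS elements.  */
static bool
offrange_outside (struct offrange r, size_t nelts)
{
  return r.max < 0 || (r.min >= 0 && (size_t) r.min >= nelts);
}
```

For the quoted example, each of i, j and k contributes the range
[1, PTRDIFF_MAX], so p3[1] has a minimum index of 4 into the 4-byte
literal and is definitely out of bounds.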


> 
>>
>> I think I commented on earlier variants but this doesn't seem to resemble
>> any of them.
> 
> I've reworked the patch (sorry) to also handle arrays.  For GCC
> 9 it seems I might as well do both in one go.
> 
> Attached is an updated patch with these changes.
> 
> Martin
> 
> gcc-83776.diff
> 
> 
> PR tree-optimization/84047 - missing -Warray-bounds on an out-of-bounds index 
> into an array
> PR tree-optimization/83776 - missing -Warray-bounds indexing past the end of 
> a string literal
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/84047
>   PR tree-optimization/83776
>   * tree-vrp.c (vrp_prop::check_mem_ref): New function.
>   (check_array_bounds): Call it.
>   * /gcc/tree-sra.c (get_access_for_expr): Fail for out-of-bounds
>   array offsets.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/83776
>   PR tree-optimization/84047
>   * gcc.dg/Warray-bounds-29.c: New test.
>   * gcc.dg/Warray-bounds-30.c: New test.
>   * gcc.dg/Warray-bounds-31.c: New test.
>   * gcc.dg/Warray-bounds-32.c: New test.
> 

> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 3e30f6b..8221a06 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -3110,6 +3110,19 @@ get_access_for_expr (tree expr)
>|| !DECL_P (base))
>  return NULL;
>  
> +  /* Fail for out-of-bounds array offsets.  */
> +  tree basetype = TREE_TYPE (base);
> +  if (TREE_CODE (basetype) == ARRAY_TYPE)
> +{
> +  if (offset < 0)
> + return NULL;
> +
> +  if (tree size = DECL_SIZE (base))
> + if (tree_fits_uhwi_p (size)
> + && tree_to_uhwi (size) < (unsigned HOST_WIDE_INT) offset)
> +   return NULL;
> +}
> +
>if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
>  return NULL;
So I'm a bit curious about this hunk.  Did we end up creating an access
structure that walked off the end of the array?  Presumably you are
suppressing SRA at this point so that you can see the array access later
a

commited: avoid extended initializer lists warnings

2018-07-19 Thread Martin Sebor

I've checked in the patch below as r262892 to avoid the many
warnings the new code was causing with GCC 6:

/ssd/src/gcc/svn/gcc/align.h:53:32: warning: extended initializer lists 
only available with -std=c++11 or -std=gnu++11


Martin

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 262891)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2018-07-19  Martin Sebor  
+
+   * align.h (align_flags): Use member initialization.
+
 2018-07-19  David Malcolm  

* Makefile.in (OBJS): Add optinfo.o.
Index: gcc/align.h
===
--- gcc/align.h (revision 262891)
+++ gcc/align.h (working copy)
@@ -50,8 +50,10 @@ struct align_flags
   /* Default constructor.  */
   align_flags (int log0 = 0, int maxskip0 = 0, int log1 = 0, int 
maxskip1 = 0)

   {
-levels[0] = {log0, maxskip0};
-levels[1] = {log1, maxskip1};
+levels[0].log = log0;
+levels[0].maxskip = maxskip0;
+levels[1].log = log1;
+levels[1].maxskip = maxskip1;
 normalize ();
   }

t


Re: [PATCH] specify large command line option arguments (PR 82063)

2018-07-19 Thread Jeff Law
On 06/24/2018 03:05 PM, Martin Sebor wrote:
> Storing integer command line option arguments in type int
> limits options such as -Wlarger-than= or -Walloca-larger-than
> to at most INT_MAX (see bug 71905).  Larger values wrap around
> zero.  The value zero is considered to disable the option,
> making it impossible to specify a zero limit.
> 
> To get around these limitations, the -Walloc-size-larger-than=
> option accepts a string argument that it then parses itself
> and interprets as HOST_WIDE_INT.  The option also accepts byte
> size suffixes like KB, MB, GiB, etc. to make it convenient to
> specify very large limits.
> 
> The int limitation is obviously less than ideal in a 64-bit
> world.  The treatment of zero as a toggle is just a minor wart.
> The special treatment to make it work for just a single option
> makes option handling inconsistent.  It should be possible for
> any option that takes an integer argument to use the same logic.
> 
> The attached patch enhances GCC option processing to do that.
> It changes the storage type of option arguments from int to
> HOST_WIDE_INT and extends the existing (although undocumented)
> option property Host_Wide_Int to specify wide option arguments.
> It also introduces the ByteSize property for options for which
> specifying the byte-size suffix makes sense.
> 
> To make it possible to consider zero as a meaningful argument
> value rather than a flag indicating that an option is disabled
> the patch also adds a CLVC_SIZE enumerator to the cl_var_type
> enumeration, and modifies how options of the kind are handled.
> 
> Warning options that take large byte-size arguments can be
> disabled by specifying a value equal to or greater than
> HOST_WIDE_INT_M1U.  For convenience, aliases in the form of
> -Wno-xxx-larger-than have been provided for all the affected
> options.
> 
> In the patch all the existing -larger-than options are set
> to PTRDIFF_MAX.  This makes them effectively enabled, but
> because the setting is exceedingly permissive, and because
> some of the existing warnings are already set to the same
> value and some other checks detect and reject such exceedingly
> large values with errors, this change shouldn't noticeably
> affect what constructs are diagnosed.
> 
> Although all the options are set to PTRDIFF_MAX, I think it
> would make sense to consider setting some of them lower, say
> to PTRDIFF_MAX / 2.  I'd like to propose that in a followup
> patch.
> 
> To minimize observable changes the -Walloca-larger-than and
> -Wvla-larger-than warnings required more extensive work to
> make of the new mechanism because of the "unbounded" argument
> handling (the warnings trigger for arguments that are not
> visibly constrained), and because of the zero handling
> (the warnings also trigger
> 
> 
> Martin
> 
> 
> gcc-82063.diff
> 
> 
> PR middle-end/82063 - issues with arguments enabled by -Wall
> 
> gcc/ada/ChangeLog:
> 
>   PR middle-end/82063
>   * gcc-interface/misc.c (gnat_handle_option): Change function argument
>   to HOST_WIDE_INT.
> 
> gcc/brig/ChangeLog:
>   * brig/brig-lang.c (brig_langhook_handle_option): Change function
>   argument to HOST_WIDE_INT.
> 
> gcc/c-family/ChangeLog:
> 
>   PR middle-end/82063
>   * c-common.h (c_common_handle_option): Change function argument
>   to HOST_WIDE_INT.
>   * c-opts.c (c_common_init_options): Same.
>   (c_common_handle_option): Same.  Remove special handling of
>   OPT_Walloca_larger_than_ and OPT_Wvla_larger_than_.
>   * c.opt (-Walloc-size-larger-than, -Walloca-larger-than): Change
>   options to take a HOST_WIDE_INT argument and accept a byte-size
>   suffix.  Initialize.
>   (-Wvla-larger-than): Same.
>   (-Wno-alloc-size-larger-than, -Wno-alloca-larger-than): New.
>   (-Wno-vla-larger-than): Same.
>   
> 
> gcc/fortran/ChangeLog:
> 
>   PR middle-end/82063
>   * gfortran.h (gfc_handle_option): Change function argument
>   to HOST_WIDE_INT.
>   * options.c (gfc_handle_option): Same.
> 
> gcc/go/ChangeLog:
> 
>   PR middle-end/82063
>   * go-lang.c (go_langhook_handle_option): Change function argument
>   to HOST_WIDE_INT.
> 
> gcc/lto/ChangeLog:
> 
>   PR middle-end/82063
>   * lto-lang.c (lto_handle_option): Change function argument
>   to HOST_WIDE_INT.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR middle-end/82063
>   * gcc.dg/Walloc-size-larger-than-16.c: Adjust.
>   * gcc.dg/Walloca-larger-than.c: New test.
>   * gcc.dg/Wframe-larger-than-2.c: New test.
>   * gcc.dg/Wlarger-than3.c: New test.
>   * gcc.dg/Wvla-larger-than-3.c: New test.
> 
> gcc/ChangeLog:
> 
>   PR middle-end/82063
>   * builtins.c (expand_builtin_alloca): Adjust.
>   * calls.c (alloc_max_size): Simplify.
>   * cgraphunit.c (cgraph_node::expand): Adjust.
>   * common.opt (larger_than_size, warn_frame_larger_than): Remove
>   variables.
>   (frame_larger_than_size): Sa

Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Martin Sebor

Here's one more update with tweaks addressing a couple more
of Bernd's comments:

1) correct the use of TREE_STRING_LENGTH() where a number of
array elements is expected and not bytes
2) set CHARTYPE as soon as it's first determined rather than
trying to extract it again later

On 07/19/2018 01:49 PM, Martin Sebor wrote:

On 07/19/2018 01:17 AM, Richard Biener wrote:

On Wed, 18 Jul 2018, Martin Sebor wrote:


+  while (TREE_CODE (chartype) != INTEGER_TYPE)
+chartype = TREE_TYPE (chartype);

This is a bit concerning.  First under what conditions is chartype not
going to be an INTEGER_TYPE?  And under what conditions will
extracting
its type ultimately lead to something that is an INTEGER_TYPE?


chartype is usually (maybe even always) pointer type here:

  const char a[] = "123";
  extern int i;
  n = strlen (&a[i]);


But your hunch was correct that the loop isn't safe because
the element type need not be an integer (I didn't know/forgot
that the function is called for non-strings too).  The loop
should be replaced by:

  while (TREE_CODE (chartype) == ARRAY_TYPE
 || TREE_CODE (chartype) == POINTER_TYPE)
chartype = TREE_TYPE (chartype);


As this function may be called "late" you need to cope with
the middle-end ignoring type changes and thus happily
passing int *** directly rather than (char *) of that.

Also doesn't the above yield int for int *[]?


I don't think it ever gets this far for either a pointer to
an array of int, or for an array of pointers to int.  So for
something like the following the function fails earlier:

  const int* const a[2] = { ... };
  const char* (const *p)[2] = &a;

  int f (void)
  {
return __builtin_memcmp (*p, "12345678", 8);
  }

(Assuming this is what you were asking about.)


I guess you really want

   if (POINTER_TYPE_P (chartype))
 chartype = TREE_TYPE (chartype);
   while (TREE_CODE (chartype) == ARRAY_TYPE)
 chartype = TREE_TYPE (chartype);

?
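
To make the difference between the two stripping strategies concrete, here is a minimal sketch using a toy type model.  The toy_* names are hypothetical stand-ins, not GCC's tree API; the point is only that stripping every pointer and array layer turns int *[] into int, while stripping at most one pointer and then arrays leaves a pointer, which a later INTEGER_TYPE test would reject.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for type trees.  These names are hypothetical and are
   not GCC's tree API.  */
enum toy_code { TOY_INTEGER, TOY_POINTER, TOY_ARRAY };

struct toy_type
{
  enum toy_code code;
  const struct toy_type *inner;  /* Pointed-to or element type, else NULL.  */
};

/* First proposal: strip every array and pointer layer.  */
static const struct toy_type *
strip_all (const struct toy_type *t)
{
  while (t->code == TOY_ARRAY || t->code == TOY_POINTER)
    t = t->inner;
  return t;
}

/* Suggested form: strip at most one pointer, then any arrays.  */
static const struct toy_type *
strip_one_pointer_then_arrays (const struct toy_type *t)
{
  if (t->code == TOY_POINTER)
    t = t->inner;
  while (t->code == TOY_ARRAY)
    t = t->inner;
  return t;
}

/* int, int *, int *[]  */
static const struct toy_type toy_int = { TOY_INTEGER, NULL };
static const struct toy_type toy_int_ptr = { TOY_POINTER, &toy_int };
static const struct toy_type toy_int_ptr_array = { TOY_ARRAY, &toy_int_ptr };

/* char, char[], char (*)[]  */
static const struct toy_type toy_char = { TOY_INTEGER, NULL };
static const struct toy_type toy_char_array = { TOY_ARRAY, &toy_char };
static const struct toy_type toy_char_array_ptr
  = { TOY_POINTER, &toy_char_array };
```

In this model a pointer to a character array still reduces to the character type under both strategies; only the array-of-pointers case differs.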


That seems to work too.  Attached is an update with this tweak.
The update also addresses some of Bernd's comments: it removes
the pointless second test in:

if (TREE_CODE (type) == ARRAY_TYPE
&& TREE_CODE (type) != INTEGER_TYPE)

the unused assignment to chartype in:

   else if (DECL_P (arg))
 {
   array = arg;
   chartype = TREE_TYPE (arg);
 }

and calls string_constant() instead of strnlen() to compute
the length of a generic string.

Other improvements are possible in this area but they are
orthogonal to the bug I'm trying to fix so I'll post separate
patches for some of those.

Martin


PR tree-optimization/86532 - Wrong code due to a wrong strlen folding starting with r262522

gcc/ChangeLog:

	PR tree-optimization/86532
	* builtins.h (string_length): Declare.
	* builtins.c (c_strlen): Correct handling of non-constant offsets.	
	(check_access): Be prepared for non-constant length ranges.
	(string_length): Make extern.
	* expr.c (string_constant): Only handle the minor non-constant
	array index.  Use string_constant to compute the length of
	a generic string constant.

gcc/testsuite/ChangeLog:

	PR tree-optimization/86532
	* gcc.c-torture/execute/strlen-2.c: New test.
	* gcc.c-torture/execute/strlen-3.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index c069d66..ceb477d 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -517,11 +517,11 @@ get_pointer_alignment (tree exp)
   return align;
 }
 
-/* Return the number of non-zero elements in the sequence
+/* Return the number of leading non-zero elements in the sequence
[ PTR, PTR + MAXELTS ) where each element's size is ELTSIZE bytes.
ELTSIZE must be a power of 2 less than 8.  Used by c_strlen.  */
 
-static unsigned
+unsigned
 string_length (const void *ptr, unsigned eltsize, unsigned maxelts)
 {
   gcc_checking_assert (eltsize == 1 || eltsize == 2 || eltsize == 4);
@@ -605,14 +605,21 @@ c_strlen (tree src, int only_value)
 
   /* Set MAXELTS to sizeof (SRC) / sizeof (*SRC) - 1, the maximum possible
  length of SRC.  Prefer TYPE_SIZE() to TREE_STRING_LENGTH() if possible
- in case the latter is less than the size of the array.  */
-  HOST_WIDE_INT maxelts = TREE_STRING_LENGTH (src);
+ in case the latter is less than the size of the array, such as when
+ SRC refers to a short string literal used to initialize a large array.
+ In that case, the elements of the array after the terminating NUL are
+ all NUL.  */
+  HOST_WIDE_INT strelts = TREE_STRING_LENGTH (src);
+  strelts = strelts / eltsize - 1;
+
+  HOST_WIDE_INT maxelts = strelts;
   tree type = TREE_TYPE (src);
   if (tree size = TYPE_SIZE_UNIT (type))
 if (tree_fits_shwi_p (size))
-  maxelts = tree_to_uhwi (size);
-
-  maxelts = maxelts / eltsize - 1;
+  {
+	maxelts = tree_to_uhwi (size);
+	maxelts = maxelts / eltsize - 1;
+  }
 
   /* PTR can point to the byte representation of any string type, including
  char* and wchar_t*.  */
@@ -620,10 +627,12 @@ c_strlen (tree src, int only_value)

Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Martin Sebor

On 07/19/2018 02:45 PM, Bernd Edlinger wrote:

@@ -11413,8 +11429,10 @@ string_constant (tree arg, tree *ptr_offset)
 const char a[4] = "abc\000\000";
 The excess elements contribute to TREE_STRING_LENGTH()
 but not to strlen().  */
-  unsigned HOST_WIDE_INT length
-= strnlen (TREE_STRING_POINTER (init), TREE_STRING_LENGTH (init));
+  unsigned HOST_WIDE_INT charsize
+= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (init))));
+  unsigned HOST_WIDE_INT length = TREE_STRING_LENGTH (init);
+  length = string_length (TREE_STRING_POINTER (init), charsize, length);
  if (compare_tree_int (array_size, length + 1) < 0)
return NULL_TREE;


But TREE_STRING_LENGTH is the length in bytes, including the NUL terminator.
length is then passed to string_length, which expects it in units of charsize
and returns the number of characters.
compare_tree_int then compares against array_size, which is in units of bytes
but is not the size of the innermost enclosing array.


Whoops.  I forgot TREE_STRING_LENGTH(s) is really sizeof(s),
not the length or size in characters.  Not the most fortunate
name.  The update to this patch I sent corrects this mistake.
The subsequent patch I sent to implement the warning whose
absence you lamented for non-nul-terminated strings tightens
up the bound on the array size.
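
The byte/element mix-up is easy to reproduce outside the compiler.  The following is only a sketch with plain C arrays: count_leading_nonzero and max_elts_from_bytes are hypothetical helpers written for this illustration (they are not the GCC functions), and the byte count plays the role that TREE_STRING_LENGTH plays in the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the leading non-zero elements in [PTR, PTR + MAXELTS), each
   ELTSIZE bytes wide.  MAXELTS is a count of elements, not bytes.  */
static size_t
count_leading_nonzero (const void *ptr, size_t eltsize, size_t maxelts)
{
  assert (eltsize == 1 || eltsize == 2 || eltsize == 4);
  const unsigned char *p = ptr;
  size_t n = 0;
  for (; n < maxelts; n++)
    {
      /* Only a zero test is needed, so byte order does not matter.  */
      uint32_t elt = 0;
      memcpy (&elt, p + n * eltsize, eltsize);
      if (elt == 0)
        break;
    }
  return n;
}

/* Convert a size in bytes that includes the terminating NUL (what
   sizeof yields for a string) into the maximum number of non-NUL
   elements.  */
static size_t
max_elts_from_bytes (size_t nbytes, size_t eltsize)
{
  assert (nbytes >= eltsize && nbytes % eltsize == 0);
  return nbytes / eltsize - 1;
}

/* A 16-bit "abc": 8 bytes, 4 elements, length 3.  */
static const uint16_t wide_abc[] = { 'a', 'b', 'c', 0 };
```

Passing sizeof wide_abc (8) directly as the element bound would let the scan read past the end of the array whenever the terminator is missing; dividing by the element size first gives the intended bound of 3.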


I really don't see why we need to support wide characters especially
when there is no reasonable test coverage, and no usable wstrlen
builtin first.


I'm not sure what special support you are talking about.
The string_constant function handles all kinds of strings
and the change above doesn't affect that.  The purpose of
the check above is to conservatively fail for constant
arrays with excess initializers such as in the comment:

const char a[4] = "abc\000\000";

Defining such strings is undefined, as is using out-of-bounds
offsets into such things, so a permissive check isn't inherently
wrong, nor would it be incorrect to remove it and accept and
fold them.  But neither is essential to the fix for this bug.

If you are referring to the handling of non-const offsets
here, those are already handled in c_strlen() for a subset of
cases (plain ARRAY_REFs).  This change just makes it consistent
for all references, including those to multi-dimensional arrays
and other aggregates.  (I noticed it while some of my tests
for the fix were failing.)
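
As a rough illustration of why a non-constant offset is only safe to fold when the string has no interior NUL, consider the sketch below.  fold_strlen_at and CANNOT_FOLD are hypothetical names for this example and do not reflect how c_strlen() is actually structured; the sketch only models the identity strlen (&s[i]) == strlen (s) - i.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CANNOT_FOLD ((size_t) -1)

/* S is a char array of NBYTES bytes (terminator included).  Return
   strlen (&s[i]) computed as LEN - I, or CANNOT_FOLD when that identity
   cannot be relied on: no NUL at all, a NUL followed by non-NUL
   characters, or I past the terminator.  */
static size_t
fold_strlen_at (const char *s, size_t nbytes, size_t i)
{
  const char *nul = memchr (s, '\0', nbytes);
  if (!nul)
    return CANNOT_FOLD;

  size_t len = (size_t) (nul - s);

  /* Any non-NUL character after the first NUL means the result depends
     on which side of that NUL the offset lands.  */
  for (const char *p = nul + 1; p < s + nbytes; p++)
    if (*p != '\0')
      return CANNOT_FOLD;

  if (i > len)
    return CANNOT_FOLD;

  return len - i;
}

static const char plain[] = "hello";
static const char embedded[] = "foo\0bar";
```

For "foo\0bar" the true answer is 3 - i for offsets up to 3 and 7 - i beyond that, so no single expression in i is correct without knowing its value.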

Martin



Re: [PATCH] restore -Warray-bounds for string literals (PR 83776)

2018-07-19 Thread Martin Sebor

On 07/19/2018 03:51 PM, Jeff Law wrote:

On 07/13/2018 05:45 PM, Martin Sebor wrote:


+  offset_int ofr[] = {
+   wi::to_offset (fold_convert (ptrdiff_type_node, vr->min)),
+   wi::to_offset (fold_convert (ptrdiff_type_node, vr->max))
+  };

huh.  Do you maybe want to use widest_int for ofr[]?  What's
wrong with wi::to_offset (vr->min)?  Building another intermediate
tree node here looks just bogus.


I need to convert size_type indices to signed to keep their sign
if it's negative and include it in warnings.  I've moved the code
into a conditional where it's used to minimize the cost but other
than that I don't know how else to convert it.
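
The effect of that conversion can be shown in plain C.  This is only an analogy for what converting to ptrdiff_type_node achieves; index_as_signed is a hypothetical helper, and the result of converting an out-of-range value to a signed type is implementation-defined in C, so the sketch assumes the usual two's complement wraparound.

```c
#include <assert.h>
#include <stddef.h>

/* Reinterpret an unsigned index as signed so that a wrapped-around
   negative value reads as negative again.  Assumes two's complement
   wraparound for the out-of-range case (implementation-defined).  */
static ptrdiff_t
index_as_signed (size_t idx)
{
  return (ptrdiff_t) idx;
}
```

Without the conversion, an index of -1 held in a size_t would be reported as the largest representable size rather than as -1.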



What are you trying to do in this loop anyways?


The loop determines the range of the final index/offset for
a series of POINTER_PLUS_EXPRs.  It handles cases like this:

  int g (int i, int j, int k)
  {
if (i < 1) i = 1;
if (j < 1) j = 1;
if (k < 1) k = 1;

const char *p0 = "123";
const char *p1 = p0 + i;
const char *p2 = p1 + j;
const char *p3 = p2 + k;

// p2[3] and p3[1] are out of bounds
return p0[4] + p1[3] + p2[2] + p3[1];
  }
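
The accumulation in the example above can be modelled with simple integer ranges.  This is a sketch under assumptions: off_range, range_add, at_least_one and TOY_MAX_OFFSET are hypothetical names, and the real code works on value ranges of SSA names rather than on a struct like this.

```c
#include <assert.h>

/* Hypothetical upper bound standing in for the maximum object size.  */
#define TOY_MAX_OFFSET 1000000L

struct off_range { long min, max; };

/* Range of an offset known only to be at least 1, like I, J and K.  */
static struct off_range
at_least_one (void)
{
  struct off_range r = { 1, TOY_MAX_OFFSET };
  return r;
}

/* Add the range of one more offset to the running total.  */
static struct off_range
range_add (struct off_range a, struct off_range b)
{
  struct off_range r = { a.min + b.min, a.max + b.max };
  return r;
}

/* Return nonzero if every index in RANGE + CST falls outside an object
   of SIZE elements, i.e. the access is certainly out of bounds.  */
static int
certainly_out_of_bounds (struct off_range range, long cst, long size)
{
  return range.min + cst >= size || range.max + cst < 0;
}
```

With "123" occupying 4 elements, the smallest possible offsets of p1, p2 and p3 are 1, 2 and 3, so each access in the return statement has a minimum index of 4 in this model.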


I suppose
look at

  p_1 = &obj[i_2];  // already bounds-checked, but with ignore_off_by_one
  ... = MEM[p_1 + CST];

?  But then

+  if (TREE_CODE (varoff) != SSA_NAME)
+   break;

you should at least handle INTEGER_CSTs here?


It's already handled (and set in CSTOFF).  There should be no
more constant offsets after that (I haven't come across any.)



+  if (!vr || vr->type == VR_UNDEFINED || !vr->min || !vr->max)
+   break;

please use positive tests here, VR_RANGE || VR_ANTI_RANGE.  As usual
the anti-range handling looks odd.  Please add comments so we can follow
what you were thinking when writing range merging code.  Even better if you
can stick to use existing code rather than always re-inventing the wheel...



The anti-range handling is to conservatively add
[-MAXOBJSIZE -1, MAXOBJSIZE] to OFFRANGE.  I've added comments
to make it clear.  I'd be more than happy to reuse existing
code if I knew where to find it (if it exists).  It sure would
save me lots of work to have a library of utility functions to
call instead of rolling my own code each time.

Finding stuff is never easy :(  GCC's just gotten so big it's virtually
impossible for anyone to know all the APIs.

The suggestion I'd have would be to (when possible) factor this stuff
into routines you can reuse.  We (as a whole) have this tendency to
open-code all kinds of things rather than factoring the code into
reusable routines.

In addition to increasing the probability that you'll be able to reuse
code later, just reading a new function's header tends to make me (as a
reviewer) internally ask if there's already a routine we should be using
instead.  When it's open-coded it's significantly harder to spot those
cases (at least for me).






I think I commented on earlier variants but this doesn't seem to resemble
any of them.


I've reworked the patch (sorry) to also handle arrays.  For GCC
9 it seems I might as well do both in one go.

Attached is an updated patch with these changes.

Martin

gcc-83776.diff


PR tree-optimization/84047 - missing -Warray-bounds on an out-of-bounds index 
into an array
PR tree-optimization/83776 - missing -Warray-bounds indexing past the end of a 
string literal

gcc/ChangeLog:

PR tree-optimization/84047
PR tree-optimization/83776
* tree-vrp.c (vrp_prop::check_mem_ref): New function.
(check_array_bounds): Call it.
* /gcc/tree-sra.c (get_access_for_expr): Fail for out-of-bounds
array offsets.

gcc/testsuite/ChangeLog:

PR tree-optimization/83776
PR tree-optimization/84047
* gcc.dg/Warray-bounds-29.c: New test.
* gcc.dg/Warray-bounds-30.c: New test.
* gcc.dg/Warray-bounds-31.c: New test.
* gcc.dg/Warray-bounds-32.c: New test.




diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 3e30f6b..8221a06 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -3110,6 +3110,19 @@ get_access_for_expr (tree expr)
   || !DECL_P (base))
 return NULL;

+  /* Fail for out-of-bounds array offsets.  */
+  tree basetype = TREE_TYPE (base);
+  if (TREE_CODE (basetype) == ARRAY_TYPE)
+{
+  if (offset < 0)
+   return NULL;
+
+  if (tree size = DECL_SIZE (base))
+   if (tree_fits_uhwi_p (size)
+   && tree_to_uhwi (size) < (unsigned HOST_WIDE_INT) offset)
+ return NULL;
+}
+
   if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
 return NULL;

So I'm a bit curious about this hunk.  Did we end up creating an access
structure that walked off the end of the array?  Presumably you are
suppressing SRA at this point so that you can see the array access later
and give a suitable warning.  Right?
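
For readers following along, the test in the hunk reduces to the following when written with plain integers.  offset_within_array is a hypothetical helper for illustration; it assumes OFFSET and SIZE are in the same unit and mirrors the hunk in accepting an offset equal to the size.

```c
#include <assert.h>

/* Return nonzero unless OFFSET is negative or greater than SIZE.  An
   offset equal to SIZE is accepted, as in the hunk above.  */
static int
offset_within_array (long long offset, unsigned long long size)
{
  if (offset < 0)
    return 0;
  return (unsigned long long) offset <= size;
}
```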


Yes, but I didn't make a note of the test case that triggered
it and I'm not able to trigger the code path now, so the change
might not be necessary.  I've re

[PATCH] Explicitly mark _S_ti() as default visibility to work around clang -fvisibility-inlines-hidden bug

2018-07-19 Thread Fāng-ruì Sòng via gcc-patches

clang (including trunk and many older versions) incorrectly marks static local 
variables (__tag) hidden when -fvisibility-inlines-hidden is used.

% cat b.cc
#include <memory>
std::shared_ptr<int> foo(int x) {
 return std::make_shared<int>(x);
}
% g++-8 -fvisibility-inlines-hidden -fno-rtti -c b.cc
% readelf -s b.o | grep _S_ti
  163:  1 OBJECT  UNIQUE DEFAULT   67 _ZZNSt19_Sp_make_shared_tag5_S_tiEvE5__tag
  164:  8 FUNC    WEAK   HIDDEN    68 _ZNSt19_Sp_make_shared_tag5_S_tiEv
% ~/Dev/llvm/static-release/bin/clang++ -fvisibility-inlines-hidden -fno-rtti -c b.cc
% readelf -s b.o | grep _S_ti
  129: 16 FUNC    WEAK   HIDDEN    34 _ZNSt19_Sp_make_shared_tag5_S_tiEv
  155:  1 OBJECT  WEAK   HIDDEN   202 _ZZNSt19_Sp_make_shared_tag5_S_tiEvE5__tag

This can lead to multiple instances of __tag when shared objects are used.
The function
virtual void* std::_Sp_counted_ptr_inplace::_M_get_deleter(const std::type_info& __ti) noexcept
may then return nullptr, causing std::make_shared() to return nullptr
(-fvisibility-inlines-hidden -fno-rtti).

After applying this patch (tagging _S_ti() with default visibility to override 
-fvisibility-inlines-hidden)

% readelf -s b.o | grep _S_ti
  129: 16 FUNC    WEAK   DEFAULT   34 _ZNSt19_Sp_make_shared_tag5_S_tiEv
  155:  1 OBJECT  WEAK   DEFAULT  202 _ZZNSt19_Sp_make_shared_tag5_S_tiEvE5__tag


This issue caused 10+ check-all tests of a -DUSE_SHARED_LLVM=On build of llvm 
(compiled with clang trunk) to SIGSEGV (because std::make_shared returned 
nullptr) and this patch fixes it.


   * include/bits/shared_ptr_base.h (_S_ti): Use
   _GLIBCXX_VISIBILITY(default)


--
宋方睿
From 6da8cec298766ce043d9c6dcda7b87142228dafb Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Thu, 19 Jul 2018 16:40:26 -0700
Subject: [PATCH 1/1] Explicitly mark _S_ti() as default visibility to work
 around clang -fvisibility-inlines-hidden bug

clang (including trunk and many older versions) incorrectly marks static
local variables (__tag) hidden when -fvisibility-inlines-hidden is used.
This can lead to multiple instances of __tag when shared objects are used.

* include/bits/shared_ptr_base.h (_S_ti): Use
_GLIBCXX_VISIBILITY(default)
---
 libstdc++-v3/include/bits/shared_ptr_base.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h 
b/libstdc++-v3/include/bits/shared_ptr_base.h
index f3994da158f..870aeb9bfda 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -508,7 +508,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   friend class _Sp_counted_ptr_inplace;
 
 static const type_info&
-_S_ti() noexcept
+_S_ti() noexcept _GLIBCXX_VISIBILITY(default)
 {
   alignas(type_info) static constexpr _Sp_make_shared_tag __tag;
   return reinterpret_cast<const type_info&>(__tag);
-- 
2.18.0



Stepping down, mostly...

2018-07-19 Thread DJ Delorie


This has been a long time coming, but as most of you know I've changed
groups at Red Hat and my new duties don't give me the time or focus I
used to have for all my various upstream maintainerships.  To be fair
to the community, I'm making this change official by stepping down
from the ones I can no longer give sufficient attention to:

m32c
rx
rl78
v850
msp430
libiberty
build machinery

I'm recommending Sebastian Perta  for the
Renesas targets (m32c, rx, rl78, v850) and Jozef Lawrynowicz
 for the TI target (msp430), if they so
desire.

I'll keep my DJGPP-related maintainerships of course :-)

Specific patches to MAINTAINERS attached.

Reply-to set to me due to cross-posting.



[gcc]
* MAINTAINERS (m32c, msp430, rl78, libiberty, build): Remove myself
as maintainer.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 262891)
+++ MAINTAINERS (working copy)
@@ -67,38 +67,35 @@ hppa port   John David Anglin   
 i386 port  Uros Bizjak 
 i386 vector ISA extns  Kirill Yukhin   
 ia64 port  Jim Wilson  
 iq2000 portNick Clifton
 lm32 port  Sebastien Bourdeauducq  
-m32c port  DJ Delorie  
 m32r port  Nick Clifton
 m68k port (?)  Jeff Law
 m68k port  Andreas Schwab  
 m68k-motorola-sysv portPhilippe De Muyter  
 mcore port Nick Clifton
 microblaze Michael Eager   
 mips port  Matthew Fortune 
 mmix port  Hans-Peter Nilsson  
 mn10300 port   Jeff Law
 mn10300 port   Alexandre Oliva 
 moxie port Anthony Green   
-msp430 portDJ Delorie  
 msp430 portNick Clifton
 nds32 port Chung-Ju Wu 
 nds32 port Shiva Chen  
 nios2 port Chung-Lin Tang  
 nios2 port Sandra Loosemore
 nvptx port Tom de Vries
 pdp11 port Paul Koning 
 powerpcspe portAndrew Jenner   

 riscv port Kito Cheng  
 riscv port Palmer Dabbelt  
 riscv port Andrew Waterman 
 riscv port Jim Wilson  
-rl78 port  DJ Delorie  
 rs6000/powerpc portDavid Edelsohn  
 rs6000/powerpc portSegher Boessenkool  
 rs6000 vector extnsAldy Hernandez  
 rx portNick Clifton
 s390 port  Hartmut Penner  
 s390 port  Ulrich Weigand  
@@ -162,13 +159,12 @@ libcppAll C and C++ front end maintai
 libcpp David Malcolm   
 fp-bit Ian Lance Taylor
 libdecnumber   Ben Elliston
 libgcc Ian Lance Taylor
 libgo  Ian Lance Taylor
 libgompJakub Jelinek   
-libiberty  DJ Delorie  
 libiberty  Ian Lance Taylor
 libitm Torvald Riegel  
 libobjcNicola Pero 

 libobjcAndrew Pinski   
 libquadmathTobias Burnus   
 libquadmathJakub Jelinek   
@@ -206,13 +202,12 @@ web pages Gerald Pfeifer  
 i18n   Philipp Thomas  
 i18n   Joseph Myers
 diagnostic messagesDodji Seketeli  
 diagnostic messagesDavid Malcolm   
 build machinery (*.in) Paolo Bonzini   
-build machinery (*.in) DJ Delorie  
 build machinery (*.in) Nathanael Nerode
 build machinery (*.in) Alexandre Oliva 
 build machinery (*.in) Ralf Wildenhues 
 docs co-maintainer Gerald Pfeifer  
 docs co-maintainer Joseph Myers
 docs co-maintainer Sandra Loosemore


[binutils]
* MAINTAINERS (RL78, RX): Remove myself as maintainer.

diff --git a/binutils/MAINTAINERS b/binutils/MAINTAINERS
index 8a1b152..5b3a6c8 100644
--- a/binutils/MAINTAINERS
+++ b/binutils/MAINTAINERS
@@ -119,8 +119,6 @@ responsibility among the other maintainers.
   PPC vector ext   Aldy Hernandez 
   RISC-V   Palmer Dabbelt 
   RISC-V   Andrew Waterman  
-  RL78 DJ Delorie 
-  RX   DJ Delorie 
   RX   Nick Clifton 
   s390, s390x Martin Schwidefsky 
   s390, s390x Andreas Krebbel 

[sim]
* MAINTAINERS (rl78, m32c, rx, v850): Remove myself as maintainer.

diff --git a/sim/MAINTAINERS b/sim/MAINTAINERS
index 62887d4..62afd

Re: [PATCH 11/11] Increase MAX_MAX_OPERANDS limit

2018-07-19 Thread Dimitar Dimitrov
On Saturday, 23 June 2018 20:35:23 EEST Jakub Jelinek wrote:
> On Sat, Jun 23, 2018 at 03:26:50PM +0300, Dimitar Dimitrov wrote:
> > I took arm/ldmstm.md as an inspiration. See attached machine description
> > for PRU that requires the increase. I omitted this machine-generated MD
> > file from my first patch set, but per comments will include it in v2.
> > 
> > PRU has a total of 32 32-bit registers with flexible subregister
> > addressing. The PRU GCC port represents the register file as 128
> > individual 8-bit registers. Rationale:
> > http://gcc.gnu.org/ml/gcc/2017-01/msg00217.html
> > 
> > Load/store instructions can load anywhere between 1 and 124 consecutive
> > 8-bit registers. The load/store-multiple patterns seem to require
> > const_int_operand offsets for each loaded register, hence the explosion
> > of operands.
> If it is consecutive only, then you could represent those that load a lot of
> registers using wider modes, so represent e.g. that 124 register load as 15
> DImode loads + 1 SImode.
> 
>   Jakub
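
Jakub's arithmetic can be checked with a short sketch.  split_into_modes and struct mode_split are hypothetical names, and the 8/4/2/1-byte pieces are assumed to correspond to DImode/SImode/HImode/QImode; whether every one of those modes suits the PRU port is not something this example settles.

```c
#include <assert.h>

/* Number of 8-, 4-, 2- and 1-byte pieces needed to cover a run of
   consecutive 8-bit registers.  */
struct mode_split { unsigned di, si, hi, qi; };

/* Greedily cover NBYTES consecutive bytes with the widest pieces.  */
static struct mode_split
split_into_modes (unsigned nbytes)
{
  struct mode_split s;
  s.di = nbytes / 8;
  nbytes %= 8;
  s.si = nbytes / 4;
  nbytes %= 4;
  s.hi = nbytes / 2;
  nbytes %= 2;
  s.qi = nbytes;
  return s;
}
```

For 124 registers this gives 15 eight-byte pieces plus one four-byte piece, i.e. 16 loads instead of 124 single-register operands.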
Jeff, Jakub, thank you for raising a concern that increasing MAX_MAX_OPERANDS 
is suspicious.

I think a better approach is to altogether avoid expansion, and instead 
declare define_insn. Advantages are that:
  - machine description is greatly simplified;
  - there is no machine-generated code;
  - I don't need to increase MAX_MAX_OPERANDS.

I'll revise the PRU port and send patch v2. Here is how I intend to implement 
the pattern:

(define_insn "load_multiple"
  [(unspec_volatile
[(parallel [(match_operand:QI 0 "register_operand" "=r")
(match_operand:BLK 1 "memory_operand" "m")
(match_operand:VOID 2 "const_int_operand" "i")])]
UNSPECV_LOAD_MULTPLE)]
  ""
  "lb%B1o\\t%b0, %1, %2"
  [(set_attr "type" "ld")
   (set_attr "length" "4")])

Regards,
Dimitar



Re: [PATCH] Make function clone name numbering independent.

2018-07-19 Thread Michael Ploujnikov
On 2018-07-17 04:25 PM, Michael Ploujnikov wrote:
> On 2018-07-17 06:02 AM, Richard Biener wrote:
>> On Tue, Jul 17, 2018 at 8:10 AM Bernhard Reutner-Fischer
>>  wrote:
>>>
>>> On 16 July 2018 21:38:36 CEST, Michael Ploujnikov 
>>>  wrote:
>>>> Hi,

>>>
>>>> +clone_fn_ids = hash_map::create_ggc
>>>> (1000);
>>>
>>> Isn't 1000 a bit excessive? What about 64 or thereabouts?
>>
>> I'm not sure we should throw memory at this "problem" in this way.
>> What specific issue
>> does this address?
> 
> This goes along with the general theme of preventing changes to one function 
> affecting codegen of others. What I'm seeing in this case is when a function 
> bar is modified codegen decides to create more clones of it (eg: during the 
> constprop pass). These extra clones cause the global counter to increment so 
> the clones of the unchanged function foo are named differently only because 
> of a source change to bar. I was hoping that the testcase would be a good 
> illustration, but perhaps not; is there anything else I can do to make this 
> clearer?
> 
> 
>>
>> If so, I believe forgoing the automatic counter addition is the way
>> to go, handing control of that to the callers (for example, the caller in
>> lto/lto-partition.c doesn't even seem to need that counter).
> 
> How can you tell that privatize_symbol_name_1 doesn't need the counter? I'm 
> assuming it has a good reason to call clone_function_name_1 rather than 
> appending ".lto_priv" itself.
> 
>> You also assume the string you key on persists - luckily the
>> lto-partition.c caller
>> leaks it but this makes your approach quite fragile in my eye (put the logic
>> into clone_function_name instead, where you at least know you are dealing
>> with a string from an identifier, which is never collected).
>>
>> Richard.
>>
> 
> Is this what you had in mind?:
> 
> diff --git gcc/cgraphclones.c gcc/cgraphclones.c
> index 6e84a31..f000420 100644
> --- gcc/cgraphclones.c
> +++ gcc/cgraphclones.c
> @@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count 
> prof_count,
>return new_node;
>  }
>  
> -static GTY(()) unsigned int clone_fn_id_num;
> +static GTY(()) hash_map *clone_fn_ids;
>  
>  /* Return a new assembler name for a clone with SUFFIX of a decl named
> NAME.  */
> @@ -521,14 +521,13 @@ tree
>  clone_function_name_1 (const char *name, const char *suffix)
>  {
>size_t len = strlen (name);
> -  char *tmp_name, *prefix;
> +  char *prefix;
>  
>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>memcpy (prefix, name, len);
>strcpy (prefix + len + 1, suffix);
>prefix[len] = symbol_table::symbol_suffix_separator ();
> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
> -  return get_identifier (tmp_name);
> +  return get_identifier (prefix);
>  }
>  
>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
> @@ -537,7 +536,17 @@ tree
>  clone_function_name (tree decl, const char *suffix)
>  {
>tree name = DECL_ASSEMBLER_NAME (decl);
> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
> +  char *decl_name = IDENTIFIER_POINTER (name);
> +  char *numbered_name;
> +  unsigned int *suffix_counter;
> +  if (!clone_fn_ids) {
> +/* Initialize the per-function counter hash table if this is the first 
> call */
> +clone_fn_ids = hash_map::create_ggc (64);
> +  }
> +  suffix_counter = &clone_fn_ids->get_or_insert(name);
> +  ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter);
> +  *suffix_counter = *suffix_counter + 1;
> +  return clone_function_name_1 (numbered_name, suffix);
>  }
>  
> - Michael
> 
> 
> 
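
The behaviour being asked for, numbering clones per function rather than from one global counter, can be sketched without any GCC data structures.  Everything here is hypothetical: a small fixed table stands in for the GC-allocated hash map, and the NAME.SUFFIX.N format is only illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 64

/* One counter per function name; a stand-in for the hash map.  */
struct name_counter { char name[64]; unsigned next; };
static struct name_counter counters[MAX_NAMES];
static unsigned n_counters;

/* Return the next clone number for NAME, starting at 0 for each name.  */
static unsigned
next_clone_id (const char *name)
{
  for (unsigned i = 0; i < n_counters; i++)
    if (strcmp (counters[i].name, name) == 0)
      return counters[i].next++;

  assert (n_counters < MAX_NAMES);
  assert (strlen (name) < sizeof counters[0].name);
  strcpy (counters[n_counters].name, name);
  counters[n_counters].next = 1;
  n_counters++;
  return 0;
}

/* Format "NAME.SUFFIX.N" into BUF, where N is per-NAME.  */
static void
clone_name (char *buf, size_t bufsize, const char *name, const char *suffix)
{
  snprintf (buf, bufsize, "%s.%s.%u", name, suffix, next_clone_id (name));
}
```

With a single global counter, creating an extra clone of bar would shift the number given to the next clone of foo; with per-name counters the foo sequence is unaffected.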

Ping, and below is the updated version of the full patch with changelogs:


gcc:
2018-07-16  Michael Ploujnikov  

   Make function clone name numbering independent.
   * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids.
   (clone_function_name_1): Move suffixing to clone_function_name
   and change it to use per-function clone_fn_ids.

testsuite:
2018-07-16  Michael Ploujnikov  

Clone id counters should be completely independent from one another.
* gcc/testsuite/gcc.dg/independent-cloneids-1.c: New test.

---
 gcc/cgraphclones.c| 19 ++
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++
 2 files changed, 52 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index 6e84a31..e1a77a2 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count 
prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map *clone_fn_ids;
 
 /* Return a new assembler name for a clone with SUFFIX of a decl named
NAME.  */
@@ -521,14 +521,13 @@ tree
 clone_function_name_1 (const char *name, const char *suffix)
 {
   size_t len = strl

Re: [PATCH] When using -fprofile-generate=/some/path mangle absolute path of file (PR lto/85759).

2018-07-19 Thread Bin.Cheng
On Fri, Jun 29, 2018 at 9:54 PM, Martin Liška  wrote:
> On 06/22/2018 10:35 PM, Jeff Law wrote:
>> On 05/16/2018 05:53 AM, Martin Liška wrote:
>>> On 12/21/2017 10:13 AM, Martin Liška wrote:
 On 12/20/2017 06:45 PM, Jakub Jelinek wrote:
> Another thing is that the "/" in there is wrong, so
>   const char dir_separator_str[] = { DIR_SEPARATOR, '\0' };
>   char *b = concat (profile_data_prefix, dir_separator_str, pwd, NULL);
> needs to be used instead.
 This looks much nicer, I forgot about DIR_SEPARATOR.

> Does profile_data_prefix have any dir separators stripped from the end?
 That's easy to achieve..

> Is pwd guaranteed to be relative in this case?
 .. however this is an absolute path, which would be problematic on a
 DOS-based FS.
 Maybe we should do the same path mangling as we do for purpose of gcov:

 https://github.com/gcc-mirror/gcc/blob/master/gcc/gcov.c#L2424
>>> Hi.
>>>
>>> I decided to implement that. Which means for:
>>>
>>> $ gcc -fprofile-generate=/tmp/myfolder empty.c -O2 && ./a.out
>>>
>>> we get following file:
>>> /tmp/myfolder/#home#marxin#Programming#testcases#tmp#empty.gcda
>>>
>>> That guarantees we have a unique file path. As seen in the PR it
>>> can produce a funny ICE.
>>>
>>> I've been testing the patch.
>>> Ready after it finishes tests?
>>>
>>> Martin
>>>
 What do you think about it?
 Regarding the string manipulation: I'm not an expert, but work with string 
 in C
 is for me always a pain :)

 Martin

>>>
>>> 0001-When-using-fprofile-generate-some-path-mangle-absolu.patch
>>>
>>>
>>> From 386a4561a4d1501e8959871791289e95f6a89af5 Mon Sep 17 00:00:00 2001
>>> From: marxin 
>>> Date: Wed, 16 Aug 2017 10:22:57 +0200
>>> Subject: [PATCH] When using -fprofile-generate=/some/path mangle absolute 
>>> path
>>>  of file (PR lto/85759).
>>>
>>> gcc/ChangeLog:
>>>
>>> 2018-05-16  Martin Liska  
>>>
>>>  PR lto/85759
>>>  * coverage.c (coverage_init): Mangle full path name.
>>>  * doc/invoke.texi: Document the change.
>>>  * gcov-io.c (mangle_path): New.
>>>  * gcov-io.h (mangle_path): Likewise.
>>>  * gcov.c (mangle_name): Use mangle_path for path mangling.
>> ISTM you can self-approve this now if you want it to move forward :-)
>>
>> jeff
>>
>
> Sure, let me install the patch then.
Hi,
I am a bit confused after the path mangling change.
Now with the command lines below:
$ ./gcc -O2 -fprofile-use=./ sort.c -o sort.c
or
$ ./gcc -O2 -fprofile-use=./sort.gcda sort.c -o sort.c

The da_file_name and the final name used in gcov_open are as follows:

$ p name
$11 = 0x2e63050
./#home#chengbin.cb#work#gcc-patches#trunk-orig#target.build#bin#sort.gcda
or
p da_file_name
$1 = 0x2e63050 
"sort.gcda/#home#chengbin.cb#work#gcc-patches#trunk-orig#target.build#bin#sort.gcda"


These are not valid paths?  Or how should I modify the command line options?
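
For reference, the names above are what one would expect from a scheme that appends the mangled absolute path to whatever prefix was given.  The sketch below is a simplification under that assumption: mangle_slashes and profile_data_name are hypothetical helpers that only replace '/' with '#', whereas the real mangling code may handle more cases.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy PATH into BUF, replacing each '/' with '#'.  */
static void
mangle_slashes (char *buf, size_t bufsize, const char *path)
{
  assert (strlen (path) < bufsize);
  size_t i;
  for (i = 0; path[i]; i++)
    buf[i] = path[i] == '/' ? '#' : path[i];
  buf[i] = '\0';
}

/* Build PREFIX "/" mangled (ABS_PATH) in BUF.  */
static void
profile_data_name (char *buf, size_t bufsize, const char *prefix,
                   const char *abs_path)
{
  char mangled[256];
  mangle_slashes (mangled, sizeof mangled, abs_path);
  snprintf (buf, bufsize, "%s/%s", prefix, mangled);
}
```

Under this model the prefix is always treated as a directory, which is why passing a file name such as sort.gcda as the prefix yields a path with that name as its leading component.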

Thanks,
bin
>
> Martin


Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-19 Thread Bernd Edlinger
>@@ -605,14 +605,21 @@ c_strlen (tree src, int only_value)
> 
>   /* Set MAXELTS to sizeof (SRC) / sizeof (*SRC) - 1, the maximum possible
>  length of SRC.  Prefer TYPE_SIZE() to TREE_STRING_LENGTH() if possible
>- in case the latter is less than the size of the array.  */
>-  HOST_WIDE_INT maxelts = TREE_STRING_LENGTH (src);
>+ in case the latter is less than the size of the array, such as when
>+ SRC refers to a short string literal used to initialize a large array.
>+ In that case, the elements of the array after the terminating NUL are
>+ all NUL.  */
>+  HOST_WIDE_INT strelts = TREE_STRING_LENGTH (src);
>+  strelts = strelts / eltsize - 1;
>+
>+  HOST_WIDE_INT maxelts = strelts;
>   tree type = TREE_TYPE (src);
>   if (tree size = TYPE_SIZE_UNIT (type))
> if (tree_fits_shwi_p (size))

If you already touch this area, please make at least the tree_fits_ and 
tree_to_ consistent.

>-  maxelts = tree_to_uhwi (size);
>-
>-  maxelts = maxelts / eltsize - 1;
>+  {
>+  maxelts = tree_to_uhwi (size);
>+  maxelts = maxelts / eltsize - 1;
>+  }
> 
>   /* PTR can point to the byte representation of any string type, including
>  char* and wchar_t*.  */
>@@ -620,10 +627,12 @@ c_strlen (tree src, int only_value)
> 
>   if (byteoff && TREE_CODE (byteoff) != INTEGER_CST)
> {

Please don't go into this if block when eltsize != 1; the folding as it stands
is incorrect for eltsize != 1.

>-  /* If the string has an internal zero byte (e.g., "foo\0bar"), we can't
>-   compute the offset to the following null if we don't know where to
>+  /* If the string has an internal NUL character followed by any
>+   non-NUL characters (e.g., "foo\0bar"), we can't compute
>+   the offset to the following NUL if we don't know where to
>start searching for it.  */
>-  if (string_length (ptr, eltsize, maxelts) < maxelts)
>+  unsigned len = string_length (ptr, eltsize, strelts);
>+  if (len < strelts)
>   {

>> I really don't see why we need to support wide characters especially
>> when there is no reasonable test coverage, and no usable wstrlen
>> builtin first.
>
>I'm not sure what special support are you talking about.
>The string_constant function handles all kinds of strings
>and the change above doesn't affect that.  The purpose of
>the check above is to conservatively fail for constant
>arrays with excess initializers such as in the comment:
>
> const char a[4] = "abc\000\000";

Yes, fine, but as I said, conservative is not folding stuff that
is too complicated, for instance for the variable index case
_together_ with wide character strings.

Out of curiosity: how often have you seen existing code
to use strlen(&a[i]) and similar?  I know emacs (pr86528),
but how common is that code?  Do you have numbers?



Bernd.
