date:20170802

Re: [PATCH] RTEMS: Add GCC Runtime Library Exception

2017-08-02 Thread Sebastian Huber


On 02/08/17 21:30, Jeff Law wrote:


On 07/24/2017 12:03 AM, Sebastian Huber wrote:

gcc/

PR libgcc/61152
* aarch64/rtems.h: Add GCC Runtime Library Exception.  Format
changes.
* arm/rtems.h: Likewise.
* bfin/rtems.h: Likewise.
* i386/rtemself.h: Likewise.
* lm32/rtems.h: Likewise.
* m32c/rtems.h: Likewise.
* m68k/rtemself.h: Likewise.
* microblaze/rtems.h: Likewise.
* mips/rtems.h: Likewise.
* moxie/rtems.h: Likewise.
* nios2/rtems.h: Likewise.
* powerpcspe/rtems.h: Likewise.
* rs6000/rtems.h: Likewise.
* rtems.h: Likewise.
* sh/rtems.h: Likewise.
* sh/rtemself.h: Likewise.
* sparc/rtemself.h: Likewise.

This seems horribly wrong.  Did anyone ack this change?  I'm fully
supportive of target maintainers taking care of their areas, but
licensing stuff probably should get explicitly ack'd.

I just reviewed all the rtems config files and I don't see anything in
any of them that deserves a runtime exception with the possible
exception of rs6000/rtems.h.

Seriously.  Redefining the CPP builtins?  LINK_SPEC?  #undefs?   Those
are not things we should be granting an exception for.

The one that looks marginal to me would be rs6000/rtems.h and its
definition of CRT_CALL_STATIC_FUNCTION.


I asked on the mailing list:

https://gcc.gnu.org/ml/gcc/2017-07/msg00171.html

Jakub Jelinek said that for header files included by libgcc it is 
important whether they have the runtime exception or not:


https://gcc.gnu.org/ml/gcc/2017-07/msg00176.html

There is also this PR61152 from 2014

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61152

which adds the runtime exception to a couple of gcc/ subdirectory files 
(including some RTEMS files). So, I had no explicit acknowledge, but my 
impression was that I did simply fix some left over files.


If you don't add the runtime exception to files included by libgcc, then 
the user of GCC must check that these files contain no content that 
deserves a copyright. Is this really user friendly?


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.

Re: [PING^4][PATCH v2] Generate reproducible output independently of the build-path

2017-08-02 Thread Yury Gribov


On 03.08.2017 3:06, Ximin Luo wrote:

Jeff Law:

On 07/21/2017 10:15 AM, Ximin Luo wrote:

(Please keep me on CC, I am not subscribed)


Proposal


This patch series adds a new environment variable BUILD_PATH_PREFIX_MAP. When
this is set, GCC will treat this as extra implicit "-fdebug-prefix-map=$value"
command-line arguments that precede any explicit ones. This makes the final
binary output reproducible, and also hides the unreproducible value (the source
path prefixes) from CFLAGS et. al. which many build tools (understandably)
embed as-is into their build output.

I'd *really* avoid doing this with magic environment variables.  Make it
a first class option to the compiler.  Yes, it means projects that want
this behavior have to arrange to pass that flag to their compiler, but
IMHO that's much preferred over environment variables.

Jeff



Hi Jeff,

If by "first class option" you meant a command-line flag, GCC *already has* that 
(-fdebug-prefix-map) > and it wasn't enough to achieve reproducibility in many cases we 
tested.


Shouldn't -fdebug-prefix-map be updated to use the same syntax as 
BUILD_PATH_PREFIX_MAP?


-Y

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread Yury Gribov


On 02.08.2017 23:04, H.J. Lu wrote:

On Wed, Aug 2, 2017 at 1:56 PM, Yury Gribov  wrote:

On 02.08.2017 21:48, H.J. Lu wrote:


On Wed, Aug 2, 2017 at 1:39 PM, Yury Gribov 
wrote:


On 02.08.2017 20:02, Jeff Law wrote:



On 08/02/2017 12:47 PM, Jakub Jelinek wrote:



On Wed, Aug 02, 2017 at 12:38:13PM -0600, Jeff Law wrote:



On 07/17/2017 01:23 AM, Yuri Gribov wrote:



I've rebased the previous patch to trunk per Andrew's suggestion.
Original patch description/motivation/questions are in
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html



Is his stuff used for exception handling?  If so, doesn't that make
the
performance a significant concern (ie that msync call?)




For the performance issue, I wonder if it wouldn't be better to just
compile the unwinder twice, basically include everything needed for the
verification backtrace in a single *.c file with special defines, and
not export anything from that *.o file except the single entrypoint.
That would also handle the ABI problems.



Yea.

Given that the vast majority of the time the addresses are valid, I
wonder if we could take advantage of that property to keep the overhead
lower in the most common cases.


For the concurrent calls of other threads unmapping stacks of running
threads (that is where most of the verification goes against), I'm
afraid
that is simply non-solveable, even if you don't cache anything, it will
still be racy.



Absolutely -- I think solving this stuff 100% reliably without races
isn't possible.  I think the question is can we make this significantly
better.




All,

First of all, thanks for the feedback.  This is indeed not a 100% robust
solution, just something to allow tools like Asan to produce more sane
backtraces on crash (currently when stack is corrupted they would crash
in
the unwinder without printing at least partial backtrace).



FWIW, glibc 2.26 is changed to abort as soon as possible when stack is
corrupted:

https://sourceware.org/bugzilla/show_bug.cgi?id=21752
https://sourceware.org/bugzilla/show_bug.cgi?id=12189

There is very little point to unwind stack when stack is corrupted.



I'd argue that situation with __stack_chk_fail is very special.  It's used
in production code where you'd want to halt as soon as corruption is
detected to prevent further damage so even printing message is an additional
risk.  Debugging tools (e.g. sanitizers) are different and would prefer to
print as many survived frames as possible.



But stack unwinder in libgcc is primarily for production use, not for debugging.


I can see it used in different projects to print backtraces on error 
(bind9, SimGrid, sanitizers).  backtrace(3) also sits on top of libgcc 
unwinder and seems to be meant primarily for debugging (at least many 
projects use it for this purpose: 
https://codesearch.debian.net/search?q=backtrace_symbols&perpkg=1). Same 
for libbacktrace which, as its README mentions, is meant to "print a 
detailed backtrace when an error occurs or to gather detailed profiling 
information".


Validation is also performed by libunwind (at least in some cases) which 
libgcc unwinder mimics.



Use it to unwind corrupted stack isn't appropriate.   Santizer can try similar
similar to valgrind to invoke a debugger to unwind corrupted stack.


-Y

Re: [PING^4][PATCH v2] Generate reproducible output independently of the build-path

2017-08-02 Thread Ximin Luo

Jeff Law:
> On 07/21/2017 10:15 AM, Ximin Luo wrote:
>> (Please keep me on CC, I am not subscribed)
>>
>>
>> Proposal
>> 
>>
>> This patch series adds a new environment variable BUILD_PATH_PREFIX_MAP. When
>> this is set, GCC will treat this as extra implicit 
>> "-fdebug-prefix-map=$value"
>> command-line arguments that precede any explicit ones. This makes the final
>> binary output reproducible, and also hides the unreproducible value (the 
>> source
>> path prefixes) from CFLAGS et. al. which many build tools (understandably)
>> embed as-is into their build output.
> I'd *really* avoid doing this with magic environment variables.  Make it
> a first class option to the compiler.  Yes, it means projects that want
> this behavior have to arrange to pass that flag to their compiler, but
> IMHO that's much preferred over environment variables.
> 
> Jeff
> 

Hi Jeff,

If by "first class option" you meant a command-line flag, GCC *already has* 
that (-fdebug-prefix-map) and it wasn't enough to achieve reproducibility in 
many cases we tested. dpkg-buildflags actually already adds these flags to 
CFLAGS CXXFLAGS etc on Debian. However, with this patch using the environment 
variable, we are able to reproduce 1800 more packages out of 26000.

GCC already supports a similar environment variable SOURCE_DATE_EPOCH, which 
was accepted about 2 years ago in a patch written by one of our GSoC students. 
We are not planning any more environment variables like this, and are committed 
to fixing other sources of non-determinism by patching the relevant build 
scripts.

However for build-paths, we think that this environment variable is the best 
practical approach here. It is even more prevalent than the timestamps issue - 
when SOURCE_DATE_EPOCH was accepted into GCC, we were able to reproduce about 
400 more packages. With BUILD_PATH_PREFIX_MAP, this number is about 1800.

I understand that environment variables have a bit of a "magical feel" to them, 
but I think that this scenario fits their use-cases well. The idea of an 
"environment" or a "context" is used in many scenarios in programming 
(including pure functional programming) when one wants to avoid passing the 
same arguments down multiple layers of function calls, especially when only 
some of these layers (and often, only the bottom one) use these arguments and 
the other layers totally ignore them (except to pass it down). I think this 
fits the situation here - patching many many build scripts simply to take a 
prefix-map and pass it downwards one layer in the "buildsystem stack", and do 
nothing else with this information, does not seem like a clean nor scalable 
solution to me. I can empathise that UNIX global environment variables is not 
the most elegant manifestation of this general idea of "an environment", but I 
think the overall end result in this case, is still cleaner than "boilerplate 
argument passing".

On top of all of this, we've seen that some build scripts will save GCC's 
command-line arguments into the build output. But this defeats the purpose of 
setting -fdebug-prefix-map for reproducibility purposes, since it contains an 
unreproducible value that is meant to be stripped out of the build output. 
Indeed, GCC itself used to do this in DW_AT_producer, until we submitted a 
patch over a year ago to specifically ignore -fdebug-prefix-map when writing 
DW_AT_producer. If we could only set -fdebug-prefix-map via the CLI and not 
this envvar, we'd have to teach this logic to all other build scripts that want 
to save compiler flags as well. By contrast, if we use a previously-unused 
envvar, and explicitly say that this should not be saved into build output, 
this problem goes away. (I think it is reasonable to save CLI flags into build 
output, because they very obviously affect what the program does. However, it 
is much less reasonable to save arbitrary envvars into build output, and we 
currently would treat this as a reproducibility bug and potential privacy leak.)

For all of these reasons, that is why I decided an envvar approach was best 
here. At least, there are even stronger arguments for this one, than for 
SOURCE_DATE_EPOCH.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

Re: [PATCH, rs6000] Clean up capitalized diagnostic messages

2017-08-02 Thread Segher Boessenkool

Hi Bill,

On Wed, Aug 02, 2017 at 10:29:20AM -0500, Bill Schmidt wrote:
> I don't anticipate backporting any of this.

Good :-)

> @@ -6802,7 +6802,7 @@ altivec_resolve_overloaded_builtin (location_t loc
>  if (unsupported_builtin)
>{
>   const char *name = rs6000_overloaded_builtin_name (fcode);
> - error ("Builtin function %s not supported in this compiler 
> configuration",
> + error ("builtin function %s not supported in this compiler 
> configuration",
>  name);

As Martin says, %qs for this and similar (see the documentation before
pp_format in pretty-print.c).  Can be a separate patch of course, this
one is big enough already.

> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c(revision 250791)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -4132,7 +4132,7 @@ rs6000_option_override_internal (bool global_init_
>|| rs6000_cpu == PROCESSOR_PPCE5500)
>  {
>if (TARGET_ALTIVEC)
> - error ("AltiVec not supported in this target");
> + error ("altivec not supported in this target");
>  }
>  
>/* If we are optimizing big endian systems for space, use the load/store

Let's either keep AltiVec or say -maltivec.  We only have this warning
because we allow -maltivec with CPUs that do not support it; and this
warning is only for some of the FSL CPUs.  It isn't very consistent.

> @@ -4250,7 +4250,7 @@ rs6000_option_override_internal (bool global_init_
> rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~ignore_masks);
>   }
> else
> - error ("Power9 target option is incompatible with -mcpu= for "
> + error ("power9 target option is incompatible with -mcpu= for "
>  " less than power9");
>   }
>else if ((ISA_3_0_MASKS_SERVER & rs6000_isa_flags_explicit)

See also PR79477.  Since many of these options are going away it is
probably not worth spending too much time on this, not until stage 3
or so anyway.

> @@ -11226,7 +11226,7 @@ rs6000_return_in_memory (const_tree type, const_tr
>static bool warned_for_return_big_vectors = false;
>if (!warned_for_return_big_vectors)
>   {
> -   warning (OPT_Wpsabi, "GCC vector returned by reference: "
> +   warning (OPT_Wpsabi, "gcc vector returned by reference: "
>  "non-standard ABI extension with no compatibility 
> guarantee");
> warned_for_return_big_vectors = true;
>   }

Maybe the warning should just say "big vector"?  Or "generic vector"?

(Vectors that fit in one VR, or in GPRs in 8 bytes or less, do not have
the problem this warns for.  Kind of hard to express tersely and
precisely though).

Approved for trunk with whichever of the suggested changes you think
are good.  Thanks,

Segher

[PATCH 5/6 v2] [i386] Modify SP realignment in ix86_expand_prologue, et. al.

2017-08-02 Thread Daniel Santos

My first version of this patch inited m->fs.sp_realigned_fp_last with
the value of m->fs.sp_offset prior to performing the stack realignment.
I had forgotten, however, that when we're saving GP regs using MOV that
we delay SP modification as long as possible so that the value of
m->fs.sp_offset at this point is correct when we've used push, but
incorrect when we've used mov.

This time I've bootstraped with --enable-checking=yes,rtl
--enable-languages=all and reg tested using the below command to test both 64-
and 32-bit code.

  make -kj8 RUNTESTFLAGS="--target_board=unix/\{,-m32\}" check

Original patch description:

The SP allocation calculation is now done in ix86_compute_frame_layout
and the result stored in ix86_frame::stack_realign_allocate.  This
change also updates comments for choose_baseaddr to clarify that the
alignment returned doesn't necessarily reflect the alignment of the
cfa_offset passed (e.g., you can pass cfa_offset 48 and it can return an
alignment of 64 bytes).

Since the alignment required may be more than 16-bytes, we cannot defer
SP allocation to ix86_emit_outlined_ms2sysv_save (when it's enabled), so
that function needs to be updated as well.

Signed-off-by: Daniel Santos 
---
 gcc/config/i386/i386.c | 58 --
 1 file changed, 32 insertions(+), 26 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 0dc366cf16e..a1f39cd714c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13289,10 +13289,13 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx 
&base_reg,
 }
 
 /* Return an RTX that points to CFA_OFFSET within the stack frame and
-   the alignment of address.  If align is non-null, it should point to
+   the alignment of address.  If ALIGN is non-null, it should point to
an alignment value (in bits) that is preferred or zero and will
-   recieve the alignment of the base register that was selected.  The
-   valid base registers are taken from CFUN->MACHINE->FS.  */
+   recieve the alignment of the base register that was selected,
+   irrespective of rather or not CFA_OFFSET is a multiple of that
+   alignment value.
+
+   The valid base registers are taken from CFUN->MACHINE->FS.  */
 
 static rtx
 choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
@@ -14338,35 +14341,35 @@ ix86_emit_outlined_ms2sysv_save (const struct 
ix86_frame &frame)
   rtx sym, addr;
   rtx rax = gen_rtx_REG (word_mode, AX_REG);
   const struct xlogue_layout &xlogue = xlogue_layout::get_instance ();
-  HOST_WIDE_INT rax_offset = xlogue.get_stub_ptr_offset () + m->fs.sp_offset;
-  HOST_WIDE_INT stack_alloc_size = frame.stack_pointer_offset - 
m->fs.sp_offset;
-  HOST_WIDE_INT stack_align_off_in = xlogue.get_stack_align_off_in ();
+  HOST_WIDE_INT allocate = frame.stack_pointer_offset - m->fs.sp_offset;
+
+  /* AL should only be live with sysv_abi.  */
+  gcc_assert (!ix86_eax_live_at_start_p ());
+
+  /* Setup RAX as the stub's base pointer.  We use stack_realign_offset rather
+ we've actually realigned the stack or not.  */
+  align = GET_MODE_ALIGNMENT (V4SFmode);
+  addr = choose_baseaddr (frame.stack_realign_offset
+ + xlogue.get_stub_ptr_offset (), &align);
+  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
+  emit_insn (gen_rtx_SET (rax, addr));
 
-  /* Verify that the incoming stack 16-byte alignment offset matches the
- layout we're using.  */
-  gcc_assert (stack_align_off_in == (m->fs.sp_offset & UNITS_PER_WORD));
+  /* Allocate stack if not already done.  */
+  if (allocate > 0)
+  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+   GEN_INT (-allocate), -1, false);
 
   /* Get the stub symbol.  */
   sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
  : XLOGUE_STUB_SAVE);
   RTVEC_ELT (v, vi++) = gen_rtx_USE (VOIDmode, sym);
 
-  /* Setup RAX as the stub's base pointer.  */
-  align = GET_MODE_ALIGNMENT (V4SFmode);
-  addr = choose_baseaddr (rax_offset, &align);
-  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
-  insn = emit_insn (gen_rtx_SET (rax, addr));
-
-  gcc_assert (stack_alloc_size >= xlogue.get_stack_space_used ());
-  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-GEN_INT (-stack_alloc_size), -1,
-m->fs.cfa_reg == stack_pointer_rtx);
   for (i = 0; i < ncregs; ++i)
 {
   const xlogue_layout::reginfo &r = xlogue.get_reginfo (i);
   rtx reg = gen_rtx_REG ((SSE_REGNO_P (r.regno) ? V4SFmode : word_mode),
 r.regno);
-  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);;
+  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);
 }
 
   gcc_assert (vi == (unsigned)GET_NUM_ELEM (v));
@@ -14621,14 +14624,15 @@ ix86_expand_prologue (void)
   gcc_assert (align_bytes > MIN_STACK_BOUNDARY / BI

Re: [PATCH 1/1] sparc: support for -mmisalign in the SPARC M8

2017-08-02 Thread David Miller

From: Qing Zhao 
Date: Wed, 2 Aug 2017 14:41:51 -0500

> so, could you please specify what kind of side effects will have
> when set STRICT_ALIGNMENT to true on TARGET_MISALIGN?

Why don't you read the code rather than just relying upon what
high level description is given by the documentation instead?

Thanks.

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread H.J. Lu

On Wed, Aug 2, 2017 at 1:56 PM, Yury Gribov  wrote:
> On 02.08.2017 21:48, H.J. Lu wrote:
>>
>> On Wed, Aug 2, 2017 at 1:39 PM, Yury Gribov 
>> wrote:
>>>
>>> On 02.08.2017 20:02, Jeff Law wrote:


 On 08/02/2017 12:47 PM, Jakub Jelinek wrote:
>
>
> On Wed, Aug 02, 2017 at 12:38:13PM -0600, Jeff Law wrote:
>>
>>
>> On 07/17/2017 01:23 AM, Yuri Gribov wrote:
>>>
>>>
>>> I've rebased the previous patch to trunk per Andrew's suggestion.
>>> Original patch description/motivation/questions are in
>>> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html
>>
>>
>> Is his stuff used for exception handling?  If so, doesn't that make
>> the
>> performance a significant concern (ie that msync call?)
>
>
>
> For the performance issue, I wonder if it wouldn't be better to just
> compile the unwinder twice, basically include everything needed for the
> verification backtrace in a single *.c file with special defines, and
> not export anything from that *.o file except the single entrypoint.
> That would also handle the ABI problems.


 Yea.

 Given that the vast majority of the time the addresses are valid, I
 wonder if we could take advantage of that property to keep the overhead
 lower in the most common cases.

> For the concurrent calls of other threads unmapping stacks of running
> threads (that is where most of the verification goes against), I'm
> afraid
> that is simply non-solveable, even if you don't cache anything, it will
> still be racy.


 Absolutely -- I think solving this stuff 100% reliably without races
 isn't possible.  I think the question is can we make this significantly
 better.
>>>
>>>
>>>
>>> All,
>>>
>>> First of all, thanks for the feedback.  This is indeed not a 100% robust
>>> solution, just something to allow tools like Asan to produce more sane
>>> backtraces on crash (currently when stack is corrupted they would crash
>>> in
>>> the unwinder without printing at least partial backtrace).
>>>
>>
>> FWIW, glibc 2.26 is changed to abort as soon as possible when stack is
>> corrupted:
>>
>> https://sourceware.org/bugzilla/show_bug.cgi?id=21752
>> https://sourceware.org/bugzilla/show_bug.cgi?id=12189
>>
>> There is very little point to unwind stack when stack is corrupted.
>
>
> I'd argue that situation with __stack_chk_fail is very special.  It's used
> in production code where you'd want to halt as soon as corruption is
> detected to prevent further damage so even printing message is an additional
> risk.  Debugging tools (e.g. sanitizers) are different and would prefer to
> print as many survived frames as possible.
>

But stack unwinder in libgcc is primarily for production use, not for debugging.
Use it to unwind corrupted stack isn't appropriate.   Santizer can try similar
similar to valgrind to invoke a debugger to unwind corrupted stack.

-- 
H.J.

Re: [patch][Ping #3] PR80929: Realistic PARALLEL cost in seq_cost.

2017-08-02 Thread Segher Boessenkool

On Wed, Aug 02, 2017 at 12:17:44PM -0600, Jeff Law wrote:
> So this has probably been hashed to death, but just a couple thoughts.
> 
> I think realistically one has to look at the entirety of the PARALLEL to
> get any reasonable costing.
> 
> Summing the individual components is wrong.  Taking the cost of any
> individual component is wrong.  You might be able to make a case for the
> maximum cost of the individual components, but even that is almost
> certain to be wrong in some cases.

I have tried all of these over the years.  Nothing is good for all
targets and all insns for a target, alas.

> Presumably this is part of what Segher is trying to address with his
> costing changes.

Sure, seq_cost is trivial to express in terms of insn_cost.  I'll add
a patch for it.

(set_rtx_cost, which is what seq_cost currently uses, is most often
used for RTL that isn't (yet) in an insn.  We'll have to look per case
how to best handle that.  Ideally rtx_cost will go away or only continue
in a much slimmed down form, for targets that implement insn_cost).

Segher

[PATCH, AArch64] Add falkor pipeline description.

2017-08-02 Thread Jim Wilson

This adds a pipeline description for the Qualcomm Falkor core.  This was
tested with a bootstrap and make check.  There were no regressions.  This
gives about 0.5% performance gain on SPEC CPU2006 on our internal tree, which
has a few other patches that aren't in the FSF tree yet.

OK?

Jim

gcc/
* config/aarch64/aarch64-cores.def (falkor): Use falkor pipeline.
(qdf24xx): Likewise.
* config/aarch64/aarch64.md: Include falkor.md.
* config/aarch64/falkor.md: New.
---
 gcc/config/aarch64/aarch64-cores.def |   4 +-
 gcc/config/aarch64/aarch64.md|   1 +
 gcc/config/aarch64/falkor.md | 681 +++
 3 files changed, 684 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/aarch64/falkor.md

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index b8d0ba6..1089332 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -65,8 +65,8 @@ AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH
 AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  AARCH64_FL_FOR_ARCH8, 
xgene1, 0x50, 0x000, -1)
 
 /* Qualcomm ('Q') cores. */
-AARCH64_CORE("falkor",  falkor,cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO | AARCH64_FL_RDMA, qdf24xx,   0x51, 0xC00, 
-1)
-AARCH64_CORE("qdf24xx", qdf24xx,   cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO | AARCH64_FL_RDMA, qdf24xx,   0x51, 0xC00, 
-1)
+AARCH64_CORE("falkor",  falkor,falkor,8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO | AARCH64_FL_RDMA, qdf24xx,   0x51, 0xC00, 
-1)
+AARCH64_CORE("qdf24xx", qdf24xx,   falkor,8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO | AARCH64_FL_RDMA, qdf24xx,   0x51, 0xC00, 
-1)
 
 /* Samsung ('S') cores. */
 AARCH64_CORE("exynos-m1",   exynosm1,  exynosm1,  8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1,  0x53, 0x001, -1)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index fc79947..51061b2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -231,6 +231,7 @@
 (include "../arm/cortex-a53.md")
 (include "../arm/cortex-a57.md")
 (include "../arm/exynos-m1.md")
+(include "falkor.md")
 (include "thunderx.md")
 (include "../arm/xgene1.md")
 (include "thunderx2t99.md")
diff --git a/gcc/config/aarch64/falkor.md b/gcc/config/aarch64/falkor.md
new file mode 100644
index 000..b422ab3
--- /dev/null
+++ b/gcc/config/aarch64/falkor.md
@@ -0,0 +1,681 @@
+;; Falkor pipeline description
+;; Copyright (C) 2017 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+(define_automaton "falkor")
+
+;; Complex int instructions (e.g. multiply and divide) execute in the X
+;; pipeline.  Simple int instructions execute in the X, Y, and Z pipelines.
+
+(define_cpu_unit "falkor_x" "falkor")
+(define_cpu_unit "falkor_y" "falkor")
+(define_cpu_unit "falkor_z" "falkor")
+
+;; Branches execute in the B pipeline or in one of the int pipelines depending
+;; on how complex it is.  Simple int insns (like movz) can also execute here.
+
+(define_cpu_unit "falkor_b" "falkor")
+
+;; Vector and FP insns execute in the VX and VY pipelines.
+
+(define_automaton "falkor_vfp")
+
+(define_cpu_unit "falkor_vx" "falkor_vfp")
+(define_cpu_unit "falkor_vy" "falkor_vfp")
+
+;; Loads execute in the LD pipeline.
+;; Stores execute in the ST, SD, and VSD pipelines, for address, data, and
+;; vector data.
+
+(define_automaton "falkor_mem")
+
+(define_cpu_unit "falkor_ld" "falkor_mem")
+(define_cpu_unit "falkor_st" "falkor_mem")
+(define_cpu_unit "falkor_sd" "falkor_mem")
+(define_cpu_unit "falkor_vsd" "falkor_mem")
+
+;; The GTOV and VTOG pipelines are for general to vector reg moves, and vice
+;; versa.
+
+(define_cpu_unit "falkor_gtov" "falkor")
+(define_cpu_unit "falkor_vtog" "falkor")
+
+;; Common reservation combinations.
+
+(define_reservation "falkor_vxvy" "falkor_vx|falkor_vy")
+(define_reservation "falkor_zb"   "falkor_z|falkor_b")
+(define_reservation "falkor_xyz"  "falkor_x|falkor_y|falkor_z")
+(define_reservation "falkor_xyzb" "falkor_x|falkor_y|falkor_z|falkor_b")
+
+;; SIMD Floating-Point Instructions
+
+(define_insn_reservation "falkor_afp_1_vx

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread Florian Weimer

* Jeff Law:

> Something like setup a signal handler when we first start unwinding that
> flags the error and tear it down when we're done unwinding?Obviously
> we can't do setup/tear down each time for each address.  Anyway, just
> thinking outloud here...

Linux doesn't have per-thread signal handlers, so this doesn't work
reliably.

If speed is not a concern, but reliability is, call fork (the system
call, not glibc's wrapper which calls fork handlers) and do the work
in a single-threaded copy of the process.  There, you can set up
signal handlers as you see fit, and the VM layout won't change
unexpectedly.

A completely different way to deal with this is to have the shell and
abrt/apport/systemd-coredumpd coordinate and generate the backtrace
from a userspace coredump handler.

To harden unwinding against corrupted tables or table locations, we'd
have to change ld.so to make all critical data read-only after loading
and remove the unwinder caches (with more help from ld.so instead).
It would make sense to move the unwinder implementation into ld.so.
With proper hardening, corrupted stacks would not be able to cause
crashes anymore, either.

Re: [C++ Patch] PR 71440 ("ICE on invalid C++ code in instantiate_type, at cp/class.c...")

2017-08-02 Thread Jason Merrill

OK.

On Wed, Aug 2, 2017 at 5:15 PM, Paolo Carlini  wrote:
> Hi,
>
> by and large this issue seems already fixed in mainline, ie, we don't ICE
> anymore during error recovery, but I noticed that we try to prettyprint
> constructor and destructor as expressions and we end up with ugliness like:
>
> error: taking address of constructor ‘((C*)this)->*A::__ct ’
>
> thus the below, tested x86_64-linux. Ok?
>
> Thanks,
> Paolo.
>
>

Re: [PATCH 3/5] combine: Use insn_cost instead of pattern_cost everywhere

2017-08-02 Thread Segher Boessenkool

On Wed, Aug 02, 2017 at 10:40:32AM -0600, Jeff Law wrote:
> On 07/31/2017 05:25 PM, Segher Boessenkool wrote:
> > This changes combine to always use insn_cost, not pattern_cost.  I don't
> > intend to commit it like this: it calls recog more often than necessary,
> > and it is very ugly (no, don't look at it).  But it's good enough to test
> > things with.
> So what do you expect this is going to look like when you're done?  The
> other target independent patches all look reasonable so I think it's
> really a matter of how you want to want to use the new infrastructure in
> combine.c (which will become the template for how other passes might use
> the infrastructure as well).

Just like this: use insn_cost everywhere instead of pattern_cost.
The change still needed is that in the current prototype here I have
to swap in and out the new patterns more often than necessary, and
call recog twice for every new pattern (the first time to see if
combine came up with insns that exist at all, the second time to see
how expensive they are).  This of course is a bit wasteful (recog isn't
cheap), and unnecessary.

Patch 2 already uses insn_cost (instead of pattern_cost / insn_rtx_cost)
in cfgrtl.c, dse.c, and in ifcvt.c in one place (the other three places
weren't trivial to convert).

Segher

[C++ Patch] PR 71440 ("ICE on invalid C++ code in instantiate_type, at cp/class.c...")

2017-08-02 Thread Paolo Carlini


Hi,

by and large this issue seems already fixed in mainline, ie, we don't 
ICE anymore during error recovery, but I noticed that we try to 
prettyprint constructor and destructor as expressions and we end up with 
ugliness like:


error: taking address of constructor ‘((C*)this)->*A::__ct ’

thus the below, tested x86_64-linux. Ok?

Thanks,
Paolo.


/cp
2017-08-02  Paolo Carlini  

PR c++/71440
* typeck.c (build_x_unary_op): Avoid pretty-printing constructor /
destructor as expressions.

/testsuite
2017-08-02  Paolo Carlini  

PR c++/71440
* g++.dg/template/crash127.C: New.
Index: cp/typeck.c
===
--- cp/typeck.c (revision 250787)
+++ cp/typeck.c (working copy)
@@ -5487,9 +5487,9 @@ build_x_unary_op (location_t loc, enum tree_code c
{
  if (complain & tf_error)
error (DECL_CONSTRUCTOR_P (fn)
-  ? G_("taking address of constructor %qE")
-  : G_("taking address of destructor %qE"),
-  xarg.get_value ());
+  ? G_("taking address of constructor %qD")
+  : G_("taking address of destructor %qD"),
+  fn);
  return error_mark_node;
}
}
Index: testsuite/g++.dg/template/crash127.C
===
--- testsuite/g++.dg/template/crash127.C(revision 0)
+++ testsuite/g++.dg/template/crash127.C(working copy)
@@ -0,0 +1,22 @@
+// PR c++/71440
+
+struct A 
+{
+  void f () {}
+};
+
+typedef void (A::*Ptr) ();
+
+template < Ptr > struct B {};
+
+template < class T > 
+struct C : public A
+{
+  void bar ()
+  {
+B < &A::A > b;  // { dg-error "taking address of constructor 'A::A" "" { 
target c++98_only } }
+// { dg-error "taking address of constructor 'constexpr A::A" "" { target 
c++11 } .-1 }
+  }
+};
+
+template class C < int >;

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread Yury Gribov


On 02.08.2017 21:48, H.J. Lu wrote:

On Wed, Aug 2, 2017 at 1:39 PM, Yury Gribov  wrote:

On 02.08.2017 20:02, Jeff Law wrote:


On 08/02/2017 12:47 PM, Jakub Jelinek wrote:


On Wed, Aug 02, 2017 at 12:38:13PM -0600, Jeff Law wrote:


On 07/17/2017 01:23 AM, Yuri Gribov wrote:


I've rebased the previous patch to trunk per Andrew's suggestion.
Original patch description/motivation/questions are in
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html


Is his stuff used for exception handling?  If so, doesn't that make the
performance a significant concern (ie that msync call?)



For the performance issue, I wonder if it wouldn't be better to just
compile the unwinder twice, basically include everything needed for the
verification backtrace in a single *.c file with special defines, and
not export anything from that *.o file except the single entrypoint.
That would also handle the ABI problems.


Yea.

Given that the vast majority of the time the addresses are valid, I
wonder if we could take advantage of that property to keep the overhead
lower in the most common cases.


For the concurrent calls of other threads unmapping stacks of running
threads (that is where most of the verification goes against), I'm afraid
that is simply non-solveable, even if you don't cache anything, it will
still be racy.


Absolutely -- I think solving this stuff 100% reliably without races
isn't possible.  I think the question is can we make this significantly
better.



All,

First of all, thanks for the feedback.  This is indeed not a 100% robust
solution, just something to allow tools like Asan to produce more sane
backtraces on crash (currently when stack is corrupted they would crash in
the unwinder without printing at least partial backtrace).



FWIW, glibc 2.26 is changed to abort as soon as possible when stack is
corrupted:

https://sourceware.org/bugzilla/show_bug.cgi?id=21752
https://sourceware.org/bugzilla/show_bug.cgi?id=12189

There is very little point to unwind stack when stack is corrupted.


I'd argue that situation with __stack_chk_fail is very special.  It's 
used in production code where you'd want to halt as soon as corruption 
is detected to prevent further damage so even printing message is an 
additional risk.  Debugging tools (e.g. sanitizers) are different and 
would prefer to print as many survived frames as possible.


-Y

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread H.J. Lu

On Wed, Aug 2, 2017 at 1:39 PM, Yury Gribov  wrote:
> On 02.08.2017 20:02, Jeff Law wrote:
>>
>> On 08/02/2017 12:47 PM, Jakub Jelinek wrote:
>>>
>>> On Wed, Aug 02, 2017 at 12:38:13PM -0600, Jeff Law wrote:

 On 07/17/2017 01:23 AM, Yuri Gribov wrote:
>
> I've rebased the previous patch to trunk per Andrew's suggestion.
> Original patch description/motivation/questions are in
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html

 Is his stuff used for exception handling?  If so, doesn't that make the
 performance a significant concern (ie that msync call?)
>>>
>>>
>>> For the performance issue, I wonder if it wouldn't be better to just
>>> compile the unwinder twice, basically include everything needed for the
>>> verification backtrace in a single *.c file with special defines, and
>>> not export anything from that *.o file except the single entrypoint.
>>> That would also handle the ABI problems.
>>
>> Yea.
>>
>> Given that the vast majority of the time the addresses are valid, I
>> wonder if we could take advantage of that property to keep the overhead
>> lower in the most common cases.
>>
>>> For the concurrent calls of other threads unmapping stacks of running
>>> threads (that is where most of the verification goes against), I'm afraid
>>> that is simply non-solveable, even if you don't cache anything, it will
>>> still be racy.
>>
>> Absolutely -- I think solving this stuff 100% reliably without races
>> isn't possible.  I think the question is can we make this significantly
>> better.
>
>
> All,
>
> First of all, thanks for the feedback.  This is indeed not a 100% robust
> solution, just something to allow tools like Asan to produce more sane
> backtraces on crash (currently when stack is corrupted they would crash in
> the unwinder without printing at least partial backtrace).
>

FWIW, glibc 2.26 is changed to abort as soon as possible when stack is
corrupted:

https://sourceware.org/bugzilla/show_bug.cgi?id=21752
https://sourceware.org/bugzilla/show_bug.cgi?id=12189

There is very little point to unwind stack when stack is corrupted.


-- 
H.J.

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread Yury Gribov

On 02.08.2017 20:02, Jeff Law wrote:

On 08/02/2017 12:47 PM, Jakub Jelinek wrote:

On Wed, Aug 02, 2017 at 12:38:13PM -0600, Jeff Law wrote:

On 07/17/2017 01:23 AM, Yuri Gribov wrote:

I've rebased the previous patch to trunk per Andrew's suggestion.
Original patch description/motivation/questions are in
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html

Is his stuff used for exception handling?  If so, doesn't that make the
performance a significant concern (ie that msync call?)

For the performance issue, I wonder if it wouldn't be better to just
compile the unwinder twice, basically include everything needed for the
verification backtrace in a single *.c file with special defines, and
not export anything from that *.o file except the single entrypoint.
That would also handle the ABI problems.

Yea.

Given that the vast majority of the time the addresses are valid, I
wonder if we could take advantage of that property to keep the overhead
lower in the most common cases.

For the concurrent calls of other threads unmapping stacks of running
threads (that is where most of the verification goes against), I'm afraid
that is simply non-solveable, even if you don't cache anything, it will
still be racy.

Absolutely -- I think solving this stuff 100% reliably without races
isn't possible.  I think the question is can we make this significantly
better.

All,

First of all, thanks for the feedback.  This is indeed not a 100% robust 
solution, just something to allow tools like Asan to produce more sane 
backtraces on crash (currently when stack is corrupted they would crash 
in the unwinder without printing at least partial backtrace).

I probly put this aside for now given that people in community don't 
find it too useful.  Should I close PR 67336 then?

> Something like setup a signal handler when we first start unwinding that
> flags the error and tear it down when we're done unwinding?Obviously
> we can't do setup/tear down each time for each address.  Anyway, just
> thinking outloud here...

Hm, I think I could sigsetjmp for SIGSEGV/SIGBUS before calling unwinder 
and then

* siglongjmp back on error (if signal is caught in unwinding thread)
* call users handler (for all other threads)

It's not clear how this is going to work in presense of nested signals 
though.  I could clear sigprocmask during execution of safe unwinding to 
prevent them but this sounds too gross...

-Y

Re: [PATCH 1/1] sparc: support for -mmisalign in the SPARC M8

2017-08-02 Thread Qing Zhao

Hi, David,

thanks a lot for your comment.

see my reply below

> STRICT_ALIGNMENT has a lot of implications.

from the definition of STRICT_ALIGNMENT:

/* Set this nonzero if move instructions will actually fail to work
   when given unaligned data.  */
#define STRICT_ALIGNMENT 1

for MISALIGN_TARGET,  it’s clear that move instructions will NOT fail to work 
when given unaligned data.
so, it’s reasonable to set STRICT_ALIGNMENT to 0 for MISALIGN_TARGET.

> 
> I think just because we happen to have misaligned loads and stores
> available doesn't mean we want all of the side effects associated
> with STRICT_ALIGNMENT being true.

so, could you please specify what kind of side effects will have when =
set STRICT_ALIGNMENT to true on TARGET_MISALIGN?=20

thanks a lot.

Qing

Re: [PATCH] Fix target attribute handling (PR c++/81355).

2017-08-02 Thread Martin Sebor


On 08/02/2017 01:04 PM, Jeff Law wrote:

On 07/28/2017 05:13 AM, Martin Liška wrote:

Hello.

Following patch skips empty strings in 'target' attribute.
Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2017-07-13  Martin Liska  

PR c++/81355
* attribs.c (sorted_attr_string): Skip empty strings.

gcc/testsuite/ChangeLog:

2017-07-13  Martin Liska  

PR c++/81355
* g++.dg/other/pr81355.C: New test.

OK.  THough one could legitimately ask if this ought to be a parsing error.


I would suggest to at least issue a warning.  It seems that
the empty string must almost certainly be a bug here, say due
to unintended macro expansion.

Otherwise, if it should remain silently (and intentionally)
accepted, I recommend to document it.  As it is, the manual
says that the "string" argument is equivalent to compiling
with -mstring which for "" would be rejected by the driver.

Martin

Re: [PATCH v2][RFC] Canonize names of attributes.

2017-08-02 Thread Jason Merrill

OK.

On Thu, Jul 13, 2017 at 9:48 AM, Martin Liška  wrote:
> On 07/11/2017 05:52 PM, Jason Merrill wrote:
>> On Tue, Jul 11, 2017 at 9:37 AM, Martin Liška  wrote:
>>> On 07/03/2017 11:00 PM, Jason Merrill wrote:
 On Mon, Jul 3, 2017 at 5:52 AM, Martin Liška  wrote:
> On 06/30/2017 09:34 PM, Jason Merrill wrote:
>>
>> On Fri, Jun 30, 2017 at 5:23 AM, Martin Liška  wrote:
>>>
>>> This is v2 of the patch, where just names of attributes are
>>> canonicalized.
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>> tests.
>>
>>
>> What is the purpose of the new "strict" parameter to cmp_attribs* ?  I
>> don't see any discussion of it.
>
>
> It's needed for arguments of attribute names, like:
>
> /usr/include/stdio.h:391:62: internal compiler error: in cmp_attribs, at
> tree.h:5523
>__THROWNL __attribute__ ((__format__ (__printf__, 3, 4)));
>

 Mm.  Although we don't want to automatically canonicalize all
 identifier arguments to attributes in the parser, we could still do it
 for specific attributes, e.g. in handle_format_attribute or
 handle_mode_attribute.
>>>
>>> Yep, that was done in my previous version of the patch
>>> (https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00996.html).
>>> Where only attribute that was preserved unchanged was 'cleanup':
>>>
>>> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
>>> index 8f638785e0e..08b4db5e5bd 100644
>>> --- a/gcc/cp/parser.c
>>> +++ b/gcc/cp/parser.c
>>> @@ -24765,7 +24765,8 @@ cp_parser_gnu_attribute_list (cp_parser* parser)
>>>   tree tv;
>>>   if (arguments != NULL_TREE
>>>   && ((tv = TREE_VALUE (arguments)) != NULL_TREE)
>>> - && TREE_CODE (tv) == IDENTIFIER_NODE)
>>> + && TREE_CODE (tv) == IDENTIFIER_NODE
>>> + && !id_equal (TREE_PURPOSE (attribute), "cleanup"))
>>> TREE_VALUE (arguments) = canonize_attr_name (tv);
>>>   release_tree_vector (vec);
>>> }
>>>
>>> Does it work for you to do it so?
>>
>> This is canonicalizing arguments by default; I want the default to be
>> not canonicalizing arguments.  I think we only want to canonicalize
>> arguments for format and mode, and we can do that in their handle_*
>> functions.
>
> Yep, done that in v3. I decided to move couple of functions to attribs.h and
> verified that it will not cause binary size increase of cc1 and cc1plus.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?
> Martin
>
>>
>> Jason
>>
>

Re: [PATCH] RTEMS: Add GCC Runtime Library Exception

2017-08-02 Thread Jeff Law

On 07/24/2017 12:03 AM, Sebastian Huber wrote:
> gcc/
> 
>   PR libgcc/61152
>   * aarch64/rtems.h: Add GCC Runtime Library Exception.  Format
>   changes.
>   * arm/rtems.h: Likewise.
>   * bfin/rtems.h: Likewise.
>   * i386/rtemself.h: Likewise.
>   * lm32/rtems.h: Likewise.
>   * m32c/rtems.h: Likewise.
>   * m68k/rtemself.h: Likewise.
>   * microblaze/rtems.h: Likewise.
>   * mips/rtems.h: Likewise.
>   * moxie/rtems.h: Likewise.
>   * nios2/rtems.h: Likewise.
>   * powerpcspe/rtems.h: Likewise.
>   * rs6000/rtems.h: Likewise.
>   * rtems.h: Likewise.
>   * sh/rtems.h: Likewise.
>   * sh/rtemself.h: Likewise.
>   * sparc/rtemself.h: Likewise.
This seems horribly wrong.  Did anyone ack this change?  I'm fully
supportive of target maintainers taking care of their areas, but
licensing stuff probably should get explicitly ack'd.

I just reviewed all the rtems config files and I don't see anything in
any of them that deserves a runtime exception with the possible
exception of rs6000/rtems.h.

Seriously.  Redefining the CPP builtins?  LINK_SPEC?  #undefs?   Those
are not things we should be granting an exception for.

The one that looks marginal to me would be rs6000/rtems.h and its
definition of CRT_CALL_STATIC_FUNCTION.

Jeff

Re: [PATCH 1/3] Use BUILD_PATH_PREFIX_MAP envvar for debug-prefix-map

2017-08-02 Thread Jeff Law

On 07/21/2017 10:15 AM, Ximin Luo wrote:
> Define the BUILD_PATH_PREFIX_MAP environment variable, and treat it as 
> implicit
> -fdebug-prefix-map CLI options specified before any explicit such options.
> 
> Much of the generic code for applying and parsing prefix-maps is implemented 
> in
> libiberty instead of the dwarf2 parts of the code, in order to make subsequent
> patches unrelated to debuginfo easier.
> 
> Acknowledgements
> 
> 
> Daniel Kahn Gillmor who wrote the patch for r231835, which saved me a lot of
> time figuring out what to edit.
> 
> HW42 for discussion on the details of the proposal, and for suggesting that we
> retain the ability to map the prefix to something other than ".".
> 
> Other contributors to the BUILD_PATH_PREFIX_MAP specification, see
> https://reproducible-builds.org/specs/build-path-prefix-map/
> 
> ChangeLogs
> --
> 
> include/ChangeLog:
> 
> 2017-07-21  Ximin Luo  
> 
>   * prefix-map.h: New file implementing the BUILD_PATH_PREFIX_MAP
>   specification; includes code from /gcc/final.c and code adapted from
>   examples attached to the specification.
> 
> libiberty/ChangeLog:
> 
> 2017-07-21  Ximin Luo  
> 
>   * prefix-map.c: New file implementing the BUILD_PATH_PREFIX_MAP
>   specification; includes code from /gcc/final.c and code adapted from
>   examples attached to the specification.
>   * Makefile.in: Update for new files.
> 
> gcc/ChangeLog:
> 
> 2017-07-21  Ximin Luo  
> 
>   * debug.h: Declare add_debug_prefix_map_from_envvar.
>   * final.c: Define add_debug_prefix_map_from_envvar, and refactor
>   prefix-map utilities to use equivalent code from libiberty instead.
>   * opts-global.c: (handle_common_deferred_options): Call
>   add_debug_prefix_map_from_envvar before processing options.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-07-21  Ximin Luo  
> 
>   * lib/gcc-dg.exp: Allow dg-set-compiler-env-var to take only one
>   argument in which case it unsets the given env var.
>   * gcc.dg/debug/dwarf2/build_path_prefix_map-1.c: New test.
>   * gcc.dg/debug/dwarf2/build_path_prefix_map-2.c: New test.
I think some revamping will be needed here to avoid using the
environment variable.

Conceptually I'm OK with doing much of the work in libiberty, driven by GCC.



> 
> Index: gcc-8-20170716/include/prefix-map.h
> ===
> --- /dev/null
> +++ gcc-8-20170716/include/prefix-map.h
> @@ -0,0 +1,94 @@
> +/* Declarations for manipulating filename prefixes.
> +   Written 2017 by Ximin Luo 
> +   This code is in the public domain. */
> +
> +#ifndef _PREFIX_MAP_H
> +#define _PREFIX_MAP_H
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#ifdef HAVE_STDLIB_H
> +#include 
> +#endif
> +
> +/* Linked-list of mappings from old prefixes to new prefixes.  */
Going to assume using a linked list for this stuff isn't a performance
concern in practice.  We shouldn't have enough prefixes for it to matter.


> +
> +struct prefix_map
> +{
> +  const char *old_prefix;
> +  const char *new_prefix;
> +  size_t old_len;
> +  size_t new_len;
> +  struct prefix_map *next;
> +};
> +
> +
> +/* Find a mapping suitable for the given OLD_NAME in the linked list MAP.\
> +
> +   If a mapping is found, writes a pointer to the non-matching suffix part of
> +   OLD_NAME in SUFFIX, and its length in SUF_LEN.
> +
> +   Returns NULL if there was no suitable mapping.  */
> +struct prefix_map *
> +prefix_map_find (struct prefix_map *map, const char *old_name,
> +  const char **suffix, size_t *suf_len);
> +
> +/* Prepend a prefix map before a given SUFFIX.
> +
> +   The remapped name is written to NEW_NAME and returned as a const pointer. 
> No
> +   allocations are performed; the caller must ensure it can hold at least
> +   MAP->NEW_LEN + SUF_LEN + 1 characters.  */
> +const char *
> +prefix_map_prepend (struct prefix_map *map, char *new_name,
> + const char *suffix, size_t suf_len);
> +
> +/* Remap a filename.
> +
> +   Returns OLD_NAME unchanged if there was no remapping, otherwise returns a
> +   pointer to newly-allocated memory for the remapped filename.  The memory 
> is
> +   allocated by the given ALLOC function, which also determines who is
> +   responsible for freeing it.  */
> +#define prefix_map_remap_alloc_(map_head, old_name, alloc)  \
> +  __extension__  
>\
> +  ({\
> +const char *__suffix;   \
> +size_t __suf_len;
>\
> +struct prefix_map *__map;
>\
> +(__map = prefix_map_find ((map_head), (old_name), &__suffix, 
> &__suf_len))  \
> +  ? prefix_map_prepend (__map,

Re: [PING^4][PATCH v2] Generate reproducible output independently of the build-path

2017-08-02 Thread Jeff Law

On 07/21/2017 10:15 AM, Ximin Luo wrote:
> (Please keep me on CC, I am not subscribed)
> 
> 
> Proposal
> 
> 
> This patch series adds a new environment variable BUILD_PATH_PREFIX_MAP. When
> this is set, GCC will treat this as extra implicit "-fdebug-prefix-map=$value"
> command-line arguments that precede any explicit ones. This makes the final
> binary output reproducible, and also hides the unreproducible value (the 
> source
> path prefixes) from CFLAGS et. al. which many build tools (understandably)
> embed as-is into their build output.
I'd *really* avoid doing this with magic environment variables.  Make it
a first class option to the compiler.  Yes, it means projects that want
this behavior have to arrange to pass that flag to their compiler, but
IMHO that's much preferred over environment variables.

Jeff

Re: [PATCH] Fix target attribute handling (PR c++/81355).

2017-08-02 Thread Jeff Law

On 07/28/2017 05:13 AM, Martin Liška wrote:
> Hello.
> 
> Following patch skips empty strings in 'target' attribute.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
> gcc/ChangeLog:
> 
> 2017-07-13  Martin Liska  
> 
>   PR c++/81355
>   * attribs.c (sorted_attr_string): Skip empty strings.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-07-13  Martin Liska  
> 
>   PR c++/81355
>   * g++.dg/other/pr81355.C: New test.
OK.  THough one could legitimately ask if this ought to be a parsing error.

jeff

Re: [PATCH] fix the handling of string precision in pretty printer (PR 81586)

2017-08-02 Thread Jeff Law

On 07/27/2017 05:29 PM, Martin Sebor wrote:
> The pretty printer treats precision in %s directives as a request
> to print exactly as many characters from the string argument when
> what precision normally (in C) means is the maximum number of
> characters to read from the string.  It doesn't mean to read
> past the terminating NUL.
> 
> The attached patch fixes that.  Tested on x86_64-linux.
> 
> Martin
> 
> gcc-81586.diff
> 
> 
> PR c++/81586 - valgrind error in output_buffer_append_r with -Wall
> 
> gcc/ChangeLog:
> 
>   PR c++/81586
>   * pretty-print.c (pp_format): Correct the handling of %s precision.
OK.
jeff

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread Jeff Law

On 08/02/2017 12:47 PM, Jakub Jelinek wrote:
> On Wed, Aug 02, 2017 at 12:38:13PM -0600, Jeff Law wrote:
>> On 07/17/2017 01:23 AM, Yuri Gribov wrote:
>>> I've rebased the previous patch to trunk per Andrew's suggestion.
>>> Original patch description/motivation/questions are in
>>> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html
>> Is his stuff used for exception handling?  If so, doesn't that make the
>> performance a significant concern (ie that msync call?)
> 
> For the performance issue, I wonder if it wouldn't be better to just
> compile the unwinder twice, basically include everything needed for the
> verification backtrace in a single *.c file with special defines, and
> not export anything from that *.o file except the single entrypoint.
> That would also handle the ABI problems.
Yea.

Given that the vast majority of the time the addresses are valid, I
wonder if we could take advantage of that property to keep the overhead
lower in the most common cases.

Something like setup a signal handler when we first start unwinding that
flags the error and tear it down when we're done unwinding?Obviously
we can't do setup/tear down each time for each address.  Anyway, just
thinking outloud here...

> 
> For the concurrent calls of other threads unmapping stacks of running
> threads (that is where most of the verification goes against), I'm afraid
> that is simply non-solveable, even if you don't cache anything, it will
> still be racy.
Absolutely -- I think solving this stuff 100% reliably without races
isn't possible.  I think the question is can we make this significantly
better.

Jeff

Re: [PATCH][2/2] early LTO debug, main part

2017-08-02 Thread Jason Merrill

On Wed, Aug 2, 2017 at 6:35 AM, Richard Biener  wrote:
> On Wed, 2 Aug 2017, Jason Merrill wrote:
>
>> On 05/19/2017 06:42 AM, Richard Biener wrote:
>> > + /* ???  In some cases the C++ FE (at least) fails to
>> > +set DECL_CONTEXT properly.  Simply globalize stuff
>> > +in this case.  For example
>> > +__dso_handle created via iostream line 74 col 25.  */
>> > + parent = comp_unit_die ();
>>
>> I've now fixed __dso_handle, so that can be removed from the comment, but it
>> still makes sense to have this fall-back for the (permitted) case of null
>> DECL_CONTEXT.
>
> Adjusted the comment.
>
>> > +   /* ???  LANG issue - DW_TAG_module for fortran.  Either look
>> > +at the input language (if we have enough DECL_CONTEXT to follow)
>> > +or use a bit in tree_decl_with_vis to record the distinction.  */
>>
>> Sure, you should be able to look at TRANSLATION_UNIT_LANGUAGE.
>
> Yeah, the comment says we might be able to walk DECL_CONTEXT up to
> a TRANSLATION_UNIT_DECL.  I've amended is_fortran similar to as I
> amended is_cxx, providing an overload for a decl, factoring out common
> code.  So this is now if (is_fortran (decl)) ... = new_die
> (DW_TAG_module,...).
>
>> > ! /* ???  We cannot unconditionally output die_offset if
>> > !non-zero - at least -feliminate-dwarf2-dups will
>> > !create references to those DIEs via symbols.  And we
>> > !do not clear its DIE offset after outputting it
>> > !(and the label refers to the actual DIEs, not the
>> > !DWARF CU unit header which is when using label + offset
>> > !would be the correct thing to do).
>>
>> As in our previous discussion, I think -feliminate-dwarf2-dups can go away
>> now.  But this is a more general issue: die_offset has meant the offset from
>> the beginning of the CU, but if with_offset is set it means an offset from
>> die_symbol.  Since with_offset changes the meaning of die_symbol and
>> die_offset, having different code here depending on that flag makes sense.
>>
>> It seems likely that when -fEDD goes away, we will never again want to do
>> direct symbolic references to DIEs, in which case we could drop the current
>> meaning of die_symbol, and so we wouldn't need the with_offset flag.
>
> Yeah, I've been playing with a patch to remove -fEDD but it has conflicts
> with the early LTO debug work and thus I wanted to postpone it until
> after that goes in to avoid further churn.  I hope that's fine, it's
> sth I'll tackle soon after this patch lands on trunk.

Sure.

>> > !   /* Don't output the symbol twice.  For LTO we want the label
>> > !  on the section beginning, not on the actual DIE.  */
>> > !   && (!flag_generate_lto
>> > ! || die->die_tag != DW_TAG_compile_unit))
>>
>> I think this check should just be !with_offset; if that flag is set the DIE
>> doesn't actually have its own symbol.
>
> with_offset is set only during LTRANS phase for the DIEs refering to
> the early DIEs via the CU label.  But the above is guarding the
> early phase when we do not want to output that CU label twice.
>
> Can we revisit this check when -fEDD has gone away?

Yes.

>> > !   if (old_die
>> > ! && (c = get_AT_ref (old_die, DW_AT_abstract_origin))
>> > ! /* ???  In LTO all origin DIEs still refer to the early
>> > !debug copy.  Detect that.  */
>> > ! && get_AT (c, DW_AT_inline))
>> ...
>> > !   /* "Unwrap" the decls DIE which we put in the imported unit 
>> > context.
>> > !   ???  If we finish dwarf2out_function_decl refactoring we can
>> > ! do this in a better way from the start and only lazily emit
>> > ! the early DIE references.  */
>>
>> It seems like in gen_subprogram_die you deliberately avoid reusing the DIE
>> from dwarf2out_register_external_die (since it doesn't have DW_AT_inline), 
>> and
>> then in add_abstract_origin_attribute you need to look through that redundant
>> die.  Why not reuse it?
>
> What we're doing here is dealing with the case of an inlined
> instance which is adjusted to point back to the early debug copy
> directly than to the wrapping DIE (supposed to eventually contain
> the concrete instance).

But we enter this block when we're emitting the concrete out-of-line
instance, and the DW_AT_inline check prevents us from using the
wrapping DIE for the out-of-line instance.

The comment should really change "inlined instance" to "concrete
instance"; inlined instances are handled in
gen_inlined_subroutine_die.

Jason

Re: [PATCH, rs6000] Clean up capitalized diagnostic messages

2017-08-02 Thread Bill Schmidt

On Aug 2, 2017, at 11:04 AM, Joseph Myers  wrote:
> 
> On Wed, 2 Aug 2017, Jakub Jelinek wrote:
> 
>> On Wed, Aug 02, 2017 at 10:29:20AM -0500, Bill Schmidt wrote:
>>> --- gcc/config/rs6000/rs6000.c  (revision 250791)
>>> +++ gcc/config/rs6000/rs6000.c  (working copy)
>>> @@ -4132,7 +4132,7 @@ rs6000_option_override_internal (bool global_init_
>>>   || rs6000_cpu == PROCESSOR_PPCE5500)
>>> {
>>>   if (TARGET_ALTIVEC)
>>> -   error ("AltiVec not supported in this target");
>>> +   error ("altivec not supported in this target");
>> 
>> If AltiVec is spelled that way always, then I think we want to keep it
>> capitalized, but CCing Joseph just to be sure.  Or the diagnostics
> 
> Yes, if it would be capitalized mid-sentence then it should be capitalized 
> at the start of a diagnostic (and if it wouldn't be capitalized 
> mid-sentence, it shouldn't be capitalized at the start of a diagnostic).

Honestly, if I had my way we would retroactively change all the
"altivec/Altivec/AltiVec" inconsistencies to "VMX". ;-)  But given the
ubiquity of  (lowercase there) I suppose that's not practical...

I'll return this one to AltiVec, and change "GCC vector" to "GNU vector"
if that's okay with everyone.

Bill

> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com
>

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread Jakub Jelinek

On Wed, Aug 02, 2017 at 12:38:13PM -0600, Jeff Law wrote:
> On 07/17/2017 01:23 AM, Yuri Gribov wrote:
> > I've rebased the previous patch to trunk per Andrew's suggestion.
> > Original patch description/motivation/questions are in
> > https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html
> Is his stuff used for exception handling?  If so, doesn't that make the
> performance a significant concern (ie that msync call?)

For the performance issue, I wonder if it wouldn't be better to just
compile the unwinder twice, basically include everything needed for the
verification backtrace in a single *.c file with special defines, and
not export anything from that *.o file except the single entrypoint.
That would also handle the ABI problems.

For the concurrent calls of other threads unmapping stacks of running
threads (that is where most of the verification goes against), I'm afraid
that is simply non-solveable, even if you don't cache anything, it will
still be racy.

Jakub

Re: [PATCH 1/2] Introduce testsuite support to run Python tests

2017-08-02 Thread Jeff Law

On 07/26/2017 10:48 AM, David Malcolm wrote:
> On Wed, 2017-07-26 at 18:35 +0200, Pierre-Marie de Rodat wrote:
>> On 07/26/2017 06:25 PM, David Malcolm wrote:
>>> str.format was introduced in Python 2.6, so presumably the minimum
>>> python 2 version here is at least 2.6+; for Python 3 I believe it
>>> was
>>> present in Python 3.0 onwards.
>>
>> Hm… Python 2.6 is fairly old: last binary release was ages ago, last 
>> source release was in 2013. Do you think it’s worth supporting it?
> 
> IIRC RHEL 6 has Python 2.6 as its /usr/bin/python (but Python 2.7 is
> available as a "software collection" add-on).
Given the age of RHEL 6, I wouldn't get too hung up on on supporting it
for this stuff.  Hell, it's EOL in just a few short years.


> 
> I don't know if gcc as a project would want to support 2.6+ or simply
> 2.7 for Python 2.
If it was trivial, then let's support 2.6.  But if it's nontrivial, I'd
support stepping to something more modern.

Jeff

Re: [RFC][PR 67336][PING^6] Verify pointers during stack unwind

2017-08-02 Thread Jeff Law

On 07/17/2017 01:23 AM, Yuri Gribov wrote:
> Hi all,
> 
> I've rebased the previous patch to trunk per Andrew's suggestion.
> Original patch description/motivation/questions are in
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01869.html
Is his stuff used for exception handling?  If so, doesn't that make the
performance a significant concern (ie that msync call?)

Though I wouldn't be a fan of parsing /proc/self/maps either.

Given something like msync or parsing /proc/self/maps, aren't you
essentially limiting this to Linux targets?  Which means you probably
ought to be using getpagesize() to get the page size rather than pulling
it from a config file or hardcoding it.

You're #including sys/mman.h, shouldn't that be protected with a
HAVE_SYS_MMAN_H and the autoconf hackery that implies?

Is the extension of struct _Unwind_Context an ABI change?  ie, can this
co-exist with existing code?

Note that I'm not entirely sure caching is appropriate/safe.  Consider
if we've got multiple threads.  One is in the middle of unwinding while
another happens to be unmapping pages.  I believe the Red Hat bz has a
bug of this nature that's still not fully analyzed.

Not a full review -- this stuff is well outside my areas of expertise.
BUt I do happen to have some state via the Red Hat BZ that I looked at a
while ago :-)

jeff

Re: [PATCH] Add missing edge probability in simd_clone_adjust

2017-08-02 Thread Jakub Jelinek

On Wed, Aug 02, 2017 at 08:19:41PM +0200, Tom de Vries wrote:
> 2017-08-02  Tom de Vries  
> 
>   * omp-simd-clone.c (simd_clone_adjust): Add missing edge probability.

Ok, thanks.

> diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
> index a1a563e..fbb122c 100644
> --- a/gcc/omp-simd-clone.c
> +++ b/gcc/omp-simd-clone.c
> @@ -1240,8 +1240,11 @@ simd_clone_adjust (struct cgraph_node *node)
>g = gimple_build_cond (EQ_EXPR, mask, build_zero_cst (TREE_TYPE 
> (mask)),
>NULL, NULL);
>gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
> -  make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
> -  FALLTHRU_EDGE (loop->header)->flags = EDGE_FALSE_VALUE;
> +  edge e = make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
> +  e->probability = profile_probability::unlikely ().guessed ();
> +  edge fallthru = FALLTHRU_EDGE (loop->header);
> +  fallthru->flags = EDGE_FALSE_VALUE;
> +  fallthru->probability = profile_probability::likely ().guessed ();
>  }
>  
>basic_block latch_bb = NULL;


Jakub

Re: GCC 7.2 Status Report (2017-08-02)

2017-08-02 Thread Ian Lance Taylor

On Wed, Aug 2, 2017 at 4:17 AM, Richard Biener  wrote:
>
> Status
> ==
>
> The GCC 7 branch is now frozen for the upcoming release candidate
> and release.  All changes require release manager approval.

Whoops, I committed a patch to the GCC 7 branch this morning (my time)
before I saw this e-mail message (SVN revision 250833).  Sorry about
that.  Let me know if you want me to revert it, but I think it's
pretty safe.

Ian

Re: [PATCH] Add missing edge probability in simd_clone_adjust

2017-08-02 Thread Tom de Vries


On 08/02/2017 06:29 PM, Jakub Jelinek wrote:

On Wed, Aug 02, 2017 at 06:15:31PM +0200, Tom de Vries wrote:

2017-08-02  Tom de Vries

* omp-simd-clone.c (simd_clone_adjust): Add missing edge probability.

I think that is not the right probability, when using masked simd, the point
is that it is called from conditional code where usually not all the bits in
the mask are set.  So I think it is better to use say likely () probability
for the EDGE_FALSE_VALUE edge (that mask != 0) and unlikely () probability
for the EDGE_TRUE_VALUE edge.
Hopefully that conditional goes away later when vectorized, but if it
doesn't, we shouldn't  say it is that unlikely it will be masked off.



Updated with likely/unlikely split.

OK?

Thanks,
- Tom
Add missing edge probability in simd_clone_adjust

Currently we generate an if with probability set on only one of the two edges:
   [0.00%] [count: INV]:
  _5 = mask.3[iter.6_3];
  if (_5 == 0)
goto ; [INV] [count: INV]
  else
goto ; [100.00%] [count: INV]

Add the missing edge probability, and set the split to unlikely/likely:
  if (_5 == 0)
goto ; [19.99%] [count: INV]
  else
goto ; [80.01%] [count: INV]

2017-08-02  Tom de Vries  

	* omp-simd-clone.c (simd_clone_adjust): Add missing edge probability.

---
 gcc/omp-simd-clone.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index a1a563e..fbb122c 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -1240,8 +1240,11 @@ simd_clone_adjust (struct cgraph_node *node)
   g = gimple_build_cond (EQ_EXPR, mask, build_zero_cst (TREE_TYPE (mask)),
 			 NULL, NULL);
   gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
-  make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
-  FALLTHRU_EDGE (loop->header)->flags = EDGE_FALSE_VALUE;
+  edge e = make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
+  e->probability = profile_probability::unlikely ().guessed ();
+  edge fallthru = FALLTHRU_EDGE (loop->header);
+  fallthru->flags = EDGE_FALSE_VALUE;
+  fallthru->probability = profile_probability::likely ().guessed ();
 }
 
   basic_block latch_bb = NULL;

Re: [patch][Ping #3] PR80929: Realistic PARALLEL cost in seq_cost.

2017-08-02 Thread Jeff Law

On 06/29/2017 02:51 AM, Georg-Johann Lay wrote:
> On 28.06.2017 22:18, Wilco Dijkstra wrote:
>> Georg-Johann Lay wrote:
>>
>> @@ -5300,6 +5300,9 @@ seq_cost (const rtx_insn *seq, bool spee
>> set = single_set (seq);
>> if (set)
>>   cost += set_rtx_cost (set, speed);
>> +  else if (INSN_P (seq)
>> +   && PARALLEL == GET_CODE (PATTERN (seq)))
>> +cost += insn_rtx_cost (PATTERN (seq), speed);
>> else
>>   cost++;
>>
>> insn_rtx_cost may return zero if it can't find something useful in the
>> parallel,
>> which means it may return a lower cost and even zero. Not sure whether
>> this
>> is important, but in eg. combine a cost of zero means infinite and so
>> could have
>> unintended consequences. So incrementing cost with a non-zero value
>> if insn_rtx_cost == 0 would seem safer.
> 
> Updated patch below, it just adds 1 (which is 1/4 of CONST_N_INSNS) to
> avoid zero.
> 
> 
>> Also why does the else do cost++ and not cost += COSTS_N_INSNS (1)?
>>
>> Wilco
> 
> Dunno, I didn't change this.  Maybe it's also just to escape 0.
> 
> Johann
> 
> gcc/
> PR middle-end/80929
> * rtlanal.c (seq_cost) [PARALLEL]: Get cost from insn_rtx_cost
> instead of assuming cost of 1.
So this has probably been hashed to death, but just a couple thoughts.

I think realistically one has to look at the entirety of the PARALLEL to
get any reasonable costing.

Summing the individual components is wrong.  Taking the cost of any
individual component is wrong.  You might be able to make a case for the
maximum cost of the individual components, but even that is almost
certain to be wrong in some cases.


Presumably this is part of what Segher is trying to address with his
costing changes.

jeff

Re: [PATCH 6/6] qsort comparator consistency checking

2017-08-02 Thread Jeff Law

On 08/02/2017 12:00 PM, Alexander Monakov wrote:
> On Wed, 2 Aug 2017, Jeff Law wrote:
>> Well, there's not *that* many qsort calls.  My quick grep shows 94 and
>> its a very mechanical change.  Then a poison in system.h to ensure raw
>> calls to qsort don't return.
> 
> Any suggestion for the non-poisoned replacement?  xqsort?  gcc_qsort?
qsort_chk/qsort_nochk for checked and non-checked?

> 
> Can you review the rest (the substantial functionality) of the patch
> without waiting for the mass-renaming change?
I think the rest is good as-is.  If we find a need to adjust the
checking intervals, then we can do that as a follow-up.

jeff

Re: [PATCH 6/6] qsort comparator consistency checking

2017-08-02 Thread Alexander Monakov

On Wed, 2 Aug 2017, Jeff Law wrote:
> Well, there's not *that* many qsort calls.  My quick grep shows 94 and
> its a very mechanical change.  Then a poison in system.h to ensure raw
> calls to qsort don't return.

Any suggestion for the non-poisoned replacement?  xqsort?  gcc_qsort?

Can you review the rest (the substantial functionality) of the patch
without waiting for the mass-renaming change?

Thanks.
Alexander

[PATCH 2/3] retire mem_signal_fence pattern

2017-08-02 Thread Alexander Monakov

Similar to mem_thread_fence issue from the patch 1/3, RTL representation of
__atomic_signal_fence must be a compiler barrier.  We have just one backend
offering this pattern, and it does not place a compiler barrier.

It does not appear useful to expand signal_fence to some kind of hardware
instruction, its documentation is wrong/outdated, and we are using it
incorrectly anyway.  So just remove the whole thing and simply emit a compiler
memory barrier in the optabs.c handler.

* config/s390/s390.md (mem_signal_fence): Remove.
* doc/md.texi (mem_signal_fence): Remove.
* optabs.c (expand_mem_signal_fence): Remove uses of mem_signal_fence.
Update comments.
* target-insns.def (mem_signal_fence): Remove.

---
 gcc/config/s390/s390.md |  9 -
 gcc/doc/md.texi | 13 -
 gcc/optabs.c| 17 +
 gcc/target-insns.def|  1 -
 4 files changed, 5 insertions(+), 35 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index d1ac0b8395d..1d63523d3b0 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -10084,15 +10084,6 @@ (define_insn "*basr_tls"
 ; memory barrier patterns.
 ;
 
-(define_expand "mem_signal_fence"
-  [(match_operand:SI 0 "const_int_operand")]   ;; model
-  ""
-{
-  /* The s390 memory model is strong enough not to require any
- barrier in order to synchronize a thread with itself.  */
-  DONE;
-})
-
 (define_expand "mem_thread_fence"
   [(match_operand:SI 0 "const_int_operand")]   ;; model
   ""
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 0be161a08dc..73d0e7d83c0 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -7058,19 +7058,6 @@ If this pattern is not specified, all memory models 
except
 @code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize}
 barrier pattern.
 
-@cindex @code{mem_signal_fence@var{mode}} instruction pattern
-@item @samp{mem_signal_fence@var{mode}}
-This pattern emits code required to implement a signal fence with
-memory model semantics.  Operand 0 is the memory model to be used.
-
-This pattern should impact the compiler optimizers the same way that
-mem_signal_fence does, but it does not need to issue any barrier
-instructions.
-
-If this pattern is not specified, all memory models except
-@code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize}
-barrier pattern.
-
 @cindex @code{get_thread_pointer@var{mode}} instruction pattern
 @cindex @code{set_thread_pointer@var{mode}} instruction pattern
 @item @samp{get_thread_pointer@var{mode}}
diff --git a/gcc/optabs.c b/gcc/optabs.c
index d568ca376fe..6884fd4b8aa 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6315,22 +6315,15 @@ expand_mem_thread_fence (enum memmodel model)
 }
 }
 
-/* This routine will either emit the mem_signal_fence pattern or issue a 
-   sync_synchronize to generate a fence for memory model MEMMODEL.  */
+/* Emit a signal fence with given memory model.  */
 
 void
 expand_mem_signal_fence (enum memmodel model)
 {
-  if (targetm.have_mem_signal_fence ())
-emit_insn (targetm.gen_mem_signal_fence (GEN_INT (model)));
-  else if (!is_mm_relaxed (model))
-{
-  /* By default targets are coherent between a thread and the signal
-handler running on the same thread.  Thus this really becomes a
-compiler barrier, in that stores must not be sunk past
-(or raised above) a given point.  */
-  expand_asm_memory_barrier ();
-}
+  /* No machine barrier is required to implement a signal fence, but
+ a compiler memory barrier must be issued, except for relaxed MM.  */
+  if (!is_mm_relaxed (model))
+expand_asm_memory_barrier ();
 }
 
 /* This function expands the atomic load operation:
diff --git a/gcc/target-insns.def b/gcc/target-insns.def
index fb92f72ac29..4669439c7e1 100644
--- a/gcc/target-insns.def
+++ b/gcc/target-insns.def
@@ -58,7 +58,6 @@ DEF_TARGET_INSN (indirect_jump, (rtx x0))
 DEF_TARGET_INSN (insv, (rtx x0, rtx x1, rtx x2, rtx x3))
 DEF_TARGET_INSN (jump, (rtx x0))
 DEF_TARGET_INSN (load_multiple, (rtx x0, rtx x1, rtx x2))
-DEF_TARGET_INSN (mem_signal_fence, (rtx x0))
 DEF_TARGET_INSN (mem_thread_fence, (rtx x0))
 DEF_TARGET_INSN (memory_barrier, (void))
 DEF_TARGET_INSN (movstr, (rtx x0, rtx x1, rtx x2))
-- 
2.13.3

[PATCH 3/3] optabs: ensure atomic_load/stores have compiler barriers

2017-08-02 Thread Alexander Monakov

Again, like in patch 1/3, a backend might expand C11 atomic load/store to a
volatile memory access if hardware memory ordering is sufficiently strong that
no machine barrier is required.  Nevertheless, we must ensure that compiler
memory barrier(s) are present before/after the access to prevent wrong code
transformations.  Simply place them as appropriate in optabs.c expansion code.

There's ALIAS_SET_MEMORY_BARRIER placed on MEMs accessed via atomics, but it's
probably not useful, it's very hard to audit all RTL passes and ensure they
respect it.  On the other hand, asm barriers must work or else much of
real-world code will break.

PR rtl-optimization/57448
PR target/67458
PR target/81316
* optabs.c (expand_atomic_load): Place compiler memory barriers if using
atomic_load pattern.
(expand_atomic_store): Likewise.
testsuite/
* gcc.dg/atomic/pr80640-2.c: New testcase.
* gcc.dg/atomic/pr81316.c: New testcase.
---
 gcc/optabs.c| 20 ++--
 gcc/testsuite/gcc.dg/atomic/pr80640-2.c | 32 
 gcc/testsuite/gcc.dg/atomic/pr81316.c   | 29 +
 3 files changed, 79 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/pr80640-2.c
 create mode 100644 gcc/testsuite/gcc.dg/atomic/pr81316.c

diff --git a/gcc/optabs.c b/gcc/optabs.c
index 6884fd4b8aa..11977d6d53f 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6343,12 +6343,20 @@ expand_atomic_load (rtx target, rtx mem, enum memmodel 
model)
   if (icode != CODE_FOR_nothing)
 {
   struct expand_operand ops[3];
+  rtx_insn *last = get_last_insn ();
+  if (is_mm_seq_cst (model))
+   expand_asm_memory_barrier ();
 
   create_output_operand (&ops[0], target, mode);
   create_fixed_operand (&ops[1], mem);
   create_integer_operand (&ops[2], model);
   if (maybe_expand_insn (icode, 3, ops))
-   return ops[0].value;
+   {
+ if (!is_mm_relaxed (model))
+   expand_asm_memory_barrier ();
+ return ops[0].value;
+   }
+  delete_insns_since (last);
 }
 
   /* If the size of the object is greater than word size on this target,
@@ -6393,11 +6401,19 @@ expand_atomic_store (rtx mem, rtx val, enum memmodel 
model, bool use_release)
   icode = direct_optab_handler (atomic_store_optab, mode);
   if (icode != CODE_FOR_nothing)
 {
+  rtx_insn *last = get_last_insn ();
+  if (!is_mm_relaxed (model))
+   expand_asm_memory_barrier ();
   create_fixed_operand (&ops[0], mem);
   create_input_operand (&ops[1], val, mode);
   create_integer_operand (&ops[2], model);
   if (maybe_expand_insn (icode, 3, ops))
-   return const0_rtx;
+   {
+ if (is_mm_seq_cst (model))
+   expand_asm_memory_barrier ();
+ return const0_rtx;
+   }
+  delete_insns_since (last);
 }
 
   /* If using __sync_lock_release is a viable alternative, try it.
diff --git a/gcc/testsuite/gcc.dg/atomic/pr80640-2.c 
b/gcc/testsuite/gcc.dg/atomic/pr80640-2.c
new file mode 100644
index 000..a735054090d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/pr80640-2.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-pthread" } */
+/* { dg-require-effective-target pthread } */
+
+#include 
+
+static volatile int sem1;
+static _Atomic  int sem2;
+
+static void *f(void *va)
+{
+  void **p = va;
+  if (*p) return *p;
+  sem1 = 1;
+  while (!__atomic_load_n(&sem2, __ATOMIC_ACQUIRE));
+  // GCC used to RTL-CSE this and the first load, causing 0 to be returned
+  return *p;
+}
+
+int main()
+{
+  void *p = 0;
+  pthread_t thr;
+  if (pthread_create(&thr, 0, f, &p))
+return 2;
+  while (!sem1);
+  __atomic_thread_fence(__ATOMIC_ACQUIRE);
+  p = &p;
+  __atomic_store_n(&sem2, 1, __ATOMIC_RELEASE);
+  pthread_join(thr, &p);
+  return !p;
+}
diff --git a/gcc/testsuite/gcc.dg/atomic/pr81316.c 
b/gcc/testsuite/gcc.dg/atomic/pr81316.c
new file mode 100644
index 000..ef10095718e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/pr81316.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-pthread" } */
+/* { dg-require-effective-target pthread } */
+
+#include 
+#include 
+
+static _Atomic int sem1;
+
+static void *f(void *va)
+{
+  void **p = va;
+  while (!__atomic_load_n(&sem1, __ATOMIC_ACQUIRE));
+  exit(!*p);
+}
+
+int main(int argc)
+{
+  void *p = 0;
+  pthread_t thr;
+  if (pthread_create(&thr, 0, f, &p))
+return 2;
+  // GCC used to RTL-DSE this store
+  p = &p;
+  __atomic_store_n(&sem1, 1, __ATOMIC_RELEASE);
+  int r = -1;
+  while (r < 0) asm("":"+r"(r));
+  return r;
+}
-- 
2.13.3

[PATCH 1/3] optabs: ensure mem_thread_fence is a compiler barrier

2017-08-02 Thread Alexander Monakov

As recently discussed in the previous thread for PR 80640, some targets have
sufficiently strong hardware memory ordering that implementation of C11 atomic
fences might not need machine barriers.  However, a compiler memory barrier
nevertheless must be present, and at least two targets (x86, s390) got this
wrong in their .md expanders.

Handle it in the middle-end by detecting empty expansion and emitting a
compiler memory barrier.  If the backend produced non-empty RTL, expect that
it's using the same RTX structure as in "memory_barrier" (__sync_synchronize)
expansion, which must be a compiler memory on its own.

A followup refactor is possible in expand_mem_thread_fence to make it start with

  if (is_mm_relaxed (model))
return;

as it's not useful to pass __ATOMIC_RELAXED model to gen_mem_thread_fence.

PR target/80640
* doc/md.texi (mem_thread_fence): Remove mention of mode.  Clarify that
empty expansion is handled by placing a compiler barrier.
* optabs.c (expand_mem_thread_fence): Place a compiler barrier if
targetm.gen_mem_thread_fence produced no instructions.
testsuite/
* gcc.dg/atomic/pr80640.c: New testcase.

---
 gcc/doc/md.texi   |  9 +++--
 gcc/optabs.c  |  7 ++-
 gcc/testsuite/gcc.dg/atomic/pr80640.c | 34 ++
 3 files changed, 47 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/pr80640.c

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index ea959208c98..0be161a08dc 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -7044,11 +7044,16 @@ If these patterns are not defined, attempts will be 
made to use
 counterparts.  If none of these are available a compare-and-swap
 loop will be used.
 
-@cindex @code{mem_thread_fence@var{mode}} instruction pattern
-@item @samp{mem_thread_fence@var{mode}}
+@cindex @code{mem_thread_fence} instruction pattern
+@item @samp{mem_thread_fence}
 This pattern emits code required to implement a thread fence with
 memory model semantics.  Operand 0 is the memory model to be used.
 
+Expanding this pattern may produce no instructions if no machine barrier
+is required on the target for given memory model.  In that case, unless
+memory model is @code{__ATOMIC_RELAXED}, a compiler memory barrier is
+emitted automatically.
+
 If this pattern is not specified, all memory models except
 @code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize}
 barrier pattern.
diff --git a/gcc/optabs.c b/gcc/optabs.c
index a9900657a58..d568ca376fe 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6298,7 +6298,12 @@ void
 expand_mem_thread_fence (enum memmodel model)
 {
   if (targetm.have_mem_thread_fence ())
-emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
+{
+  rtx_insn *last = get_last_insn ();
+  emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
+  if (last == get_last_insn () && !is_mm_relaxed (model))
+   expand_asm_memory_barrier ();
+}
   else if (!is_mm_relaxed (model))
 {
   if (targetm.have_memory_barrier ())
diff --git a/gcc/testsuite/gcc.dg/atomic/pr80640.c 
b/gcc/testsuite/gcc.dg/atomic/pr80640.c
new file mode 100644
index 000..fd17978a482
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/pr80640.c
@@ -0,0 +1,34 @@
+/* { dg-do run } */
+/* { dg-options "-pthread" } */
+/* { dg-require-effective-target pthread } */
+
+#include 
+
+static volatile int sem1;
+static volatile int sem2;
+
+static void *f(void *va)
+{
+  void **p = va;
+  if (*p) return *p;
+  sem1 = 1;
+  while (!sem2);
+  __atomic_thread_fence(__ATOMIC_ACQUIRE);
+  // GCC used to RTL-CSE this and the first load, causing 0 to be returned
+  return *p;
+}
+
+int main()
+{
+  void *p = 0;
+  pthread_t thr;
+  if (pthread_create(&thr, 0, f, &p))
+return 2;
+  while (!sem1);
+  __atomic_thread_fence(__ATOMIC_ACQUIRE);
+  p = &p;
+  __atomic_thread_fence(__ATOMIC_RELEASE);
+  sem2 = 1;
+  pthread_join(thr, &p);
+  return !p;
+}
-- 
2.13.3

Re: c-family PATCH to improve and simplify -Wmultistatement-macros (PR c/81448, c/81306)

2017-08-02 Thread Marek Polacek

On Wed, Aug 02, 2017 at 10:16:54AM -0700, Steve Ellcey wrote:
> On Wed, 2017-08-02 at 11:37 +, Joseph Myers wrote:
> > On Thu, 27 Jul 2017, Marek Polacek wrote:
> > 
> > > 
> > > Bootstrapped/regtested on x86_64-linux and ppc64le-linux, ok for
> > > trunk?
> > > 
> > > 2017-07-27  Marek Polacek  
> > > 
> > >   PR c/81448
> > >   PR c/81306
> > >   * c-warn.c (warn_for_multistatement_macros): Prevent bogus
> > >   warnings.  Avoid walking MACRO_MAP_LOCATIONS.   
> > >    
> > > 
> > >   * c-c++-common/Wmultistatement-macros-13.c: New test.
> > OK.  Steve, as I noted in 
> > , please see 
> > if this means you no longer need to use -Wno-multistatement-macros for 
> > certain glibc tests (if it's still needed, that suggests another related 
> > problem may still be present).
> 
> I have built and tested the latest glibc with the latest gcc and after
> removing the warnings from the Makefile, everything ran fine.  There
> were no build problems.
 
Excellent.  And sorry for the trouble in the first place.

Marek

Re: [PATCH 6/6] qsort comparator consistency checking

2017-08-02 Thread Jeff Law

On 07/15/2017 02:47 PM, Alexander Monakov wrote:
> This is the updated qsort comparator verifier.
> 
> Since we have vec::qsort(cmp), the patch uses the macro argument counting
> trick to redirect only the four-argument invocations of qsort to qsort_chk.
> I realize that won't win much sympathies, but a patch doing mass-renaming
> of qsort in the whole GCC codebase would win even fewer, I suspect.
> 
> The macro ENABLE_FULL_QSORT_CHECKING could be used to enable full O(n^2)
> checking, but there's no accompanying configure.ac change.  I'm not sure
> that would be useful in practice, so I'd rather turn it into #if 0.
> 
> The suppression in genmodes was needed because it's not linked with vec.o.
> 
>   * genmodes.c (calc_wider_mode): Suppress qsort macro.
> * system.h [CHECKING_P] (qsort): Redirect to qsort_chk.
> (qsort_nochk): New macro.
> (qsort_chk): Declare.
> (qsort_disable_checking): Declare.
> * vec.c (qsort_chk_error): New static function.
> (qsort_disable_checking): Define.
> (qsort_chk): New function.
Well, there's not *that* many qsort calls.  My quick grep shows 94 and
its a very mechanical change.  Then a poison in system.h to ensure raw
calls to qsort don't return.

The counting trick is, well, ugly.  There's also issues in that some
compilers don't properly implement the C99 pre-processor standard
properly (MSVC) and what happens when there's no arguments?  While not
likely an issue here, I think it highlights that this kind of
preprocessor hackery is just a bad idea.

I'd prefer to just bite the bullet and do the mechanical qsort change.

Jeff

Re: c-family PATCH to improve and simplify -Wmultistatement-macros (PR c/81448, c/81306)

2017-08-02 Thread Steve Ellcey

On Wed, 2017-08-02 at 11:37 +, Joseph Myers wrote:
> On Thu, 27 Jul 2017, Marek Polacek wrote:
> 
> > 
> > Bootstrapped/regtested on x86_64-linux and ppc64le-linux, ok for
> > trunk?
> > 
> > 2017-07-27  Marek Polacek  
> > 
> > PR c/81448
> > PR c/81306
> > * c-warn.c (warn_for_multistatement_macros): Prevent bogus
> > warnings.  Avoid walking MACRO_MAP_LOCATIONS.   
> >  
> > 
> > * c-c++-common/Wmultistatement-macros-13.c: New test.
> OK.  Steve, as I noted in 
> , please see 
> if this means you no longer need to use -Wno-multistatement-macros for 
> certain glibc tests (if it's still needed, that suggests another related 
> problem may still be present).

I have built and tested the latest glibc with the latest gcc and after
removing the warnings from the Makefile, everything ran fine.  There
were no build problems.

Since the mainline is now open I will go ahead and check in this patch.

Steve Ellcey
sell...@cavium.com


2017-08-02  Steve Ellcey  

* localedata/Makefile (CFLAGS-tst_iswalnum.c, CFLAGS-tst_iswalpha.c
CFLAGS-tst_iswcntrl.c, CFLAGS-tst_iswdigit.c, CFLAGS-tst_iswgraph.c,
CFLAGS-tst_iswlower.c, CFLAGS-tst_iswprint.c, CFLAGS-tst_iswpunct.c,
CFLAGS-tst_iswspace.c, CFLAGS-tst_iswupper.c, CFLAGS-tst_iswxdigit.c,
CFLAGS-tst_towlower.c, CFLAGS-tst_towupper.c): Remove.


diff --git a/localedata/Makefile b/localedata/Makefile
index 20c5921..9db9464 100644
--- a/localedata/Makefile
+++ b/localedata/Makefile
@@ -122,22 +122,6 @@ $(inst_i18ndir)/charmaps/%.gz: charmaps/% $(+force)
 # Install the locale source files in the appropriate directory.
 $(inst_i18ndir)/locales/%: locales/% $(+force); $(do-install)
 
-# These tests use multistatement macros from tests-mbwc/tst_funcs.h
-# and will not compile with GCC 8.1 without the warning turned off.
-CFLAGS-tst_iswalnum.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswalpha.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswcntrl.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswdigit.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswgraph.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswlower.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswprint.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswpunct.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswspace.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswupper.c  = -Wno-multistatement-macros
-CFLAGS-tst_iswxdigit.c = -Wno-multistatement-macros
-CFLAGS-tst_towlower.c  = -Wno-multistatement-macros
-CFLAGS-tst_towupper.c  = -Wno-multistatement-macros
-
 ifeq ($(run-built-tests),yes)
 generated-dirs += $(LOCALES)

Re: [PATCH 6/6] qsort comparator consistency checking

2017-08-02 Thread Jeff Law

On 07/31/2017 12:27 PM, Alexander Monakov wrote:
> On Mon, 31 Jul 2017, Jeff Law wrote:
>> I  must have missed something.  Can't you just define
>>
>> qsort (BASE, NMEMB, SIZE, COMPARE) into
>>
>> qsort_chk (BASE, NMEMB, SIZE, COMPARE)
>>
>> That shouldn't affect the qsort from vec?  Right?  Or am I missing something
> 
> If you do
> 
>   #define qsort(base, n, sz, cmp) qsort_chk (base, n, sz, cmp)
> 
> then all invocations of vec::qsort, i.e.
> 
>   myvec.qsort (mycmp);
> 
> will yield a preprocessing error due to wrong number of arguments supplied
> to the qsort macro (one instead of four).
Duh.  Hit me with a clue stick. I guess I have to get familiar with the
macro argument counting trick :-)

jeff

Re: [PING #2] [PATCH] enhance -Wrestrict to handle string built-ins (PR 78918)

2017-08-02 Thread Jeff Law

On 08/01/2017 03:25 AM, Richard Biener wrote:
> On Tue, Aug 1, 2017 at 11:23 AM, Richard Biener
>  wrote:
>> On Tue, Aug 1, 2017 at 4:27 AM, Martin Sebor  wrote:
>>> Richard,
>>>
>>> in discussing this work Jeff mentioned that your comments on
>>> the tree-ssa-alias.c parts would be helpful.  When you have
>>> a chance could you please give it a once over and let me know
>>> if you have any suggestions or concerns?  There are no visible
>>> changes to existing clients of the pass, just extensions that
>>> are relied on only by the new diagnostics.
>>>
>>>   https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01264.html
>>>
>>> I expect to revisit the sprintf %s patch mentioned below and
>>> see how to simplify it by taking advantage of the changes
>>> implemented here.
>>
>> Your ptr_deref_alias_decl_p returns true when must_alias is true
>> but there is no must-alias relationship.
>>
>> The ao_ref_init_from_ptr_and_size doesn't belong there.
>>
>> I dislike all of the changes related to returning an alias "size".
>>
>> Please keep away from the alias machinery.
>>
>> Warning during folding is similarly bad design.  Please don't add
>> more such things.
>>
>> Thanks for putting this on my radar though.
>> Richard.
> 
> Oh, for a constructive comment this all feels like sth for either
> sanitizers or a proper static analysis tool.  As I outlined repeatedly
> I belive GCC could host a static analysis tool but it surely should
> not be interwinded into it's optimization passes or tools.
I disagree strongly here.

Sanitiers catch problems after the fact and only if you've compiled your
code with sanitization and manage to find a way to trigger the problem path.

Other static analysis tools are useful, but they aren't an integral part
of the edit, compile, debug cycle and due to their cost are often run
long after code was committed.

Finding useful warnings that can be issued as part of the compile, edit,
debug cycle is, IMHO, far more useful than sanitizers or independent
static checkers.

--

I think finding a way to exploit information that our various analyzers
can provide to give precise warnings is a good thing.  So it seemed
natural that if we wanted to ask a must-alias question that we should be
querying the alias oracle rather than implementing that kind of query
within the sprintf warnings.  I'm not sure why you'd say "Please keep
away from the alias machinery".

I'm a little confused here...

Jeff

Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-08-02 Thread Jeff Law

On 07/31/2017 01:42 PM, Martin Sebor wrote:
>> So I *think* TYPE_SIZE_UNIT isn't necessarily guaranteed to be a
>> INTEGER_CST, it could be a non-constant expression for the size.  Are
>> the callers of compute_objsize prepared to handle that?  Just to be
>> clear, I'd prefer to return TYPE_SIZE_UNIT even if it's not an
>> INTEGER_CST, I'm just worried about the callers ability to handle that
>> correctly.
> 
> They should be prepared for it.  If not, it's a bug.  I've added
> a few more test cases though I'm not sure the case you're concerned
> about actually arises (VLA sizes are represented as gimple calls to
> __builtin_alloca_with_align so the code doesn't get this far).
It may in the end be a non-issue because VLAs are still represented as
alloca calls at this point.  BUt it looks like the result is typically
passed down into check_sizes which assumes the result is an INTEGER_CST
based on a quick scan.


So just to be future safe perhaps this:

if (TREE_CODE (type) == ARRAY_TYPE
&& TREE_CODE (TYPE_SIZE_UNIT (type)) == INTEGER_CST)
  return TYPE_SIZE_UNIT (type);

>>> +
>>> +@smallexample
>>> +void append (char *d)
>>> +@{
>>> +  strncat (d, ".txt", 4);
>>> +@}
>>> +@end smallexample
>> Sorry.  I don't follow this particular example.  Where's the truncation
>> when strlen (SRC) == N for strncat?  In that case aren't we going to
>> append SRC without the nul terminator, then nul terminate the result?
>> Isn't that the same as appending SRC with its nul terminator?  Am I
>> missing something here?
> 
> You're right that there is no truncation and the effect is
> the same but only in the unlikely case when the destination
> is empty.  Otherwise the result is truncated.
Maybe this is where I'm confused.  How does the destination play into
truncation issues?  I've always been under the impression that the
destination has to be large enough to hold the result, but that it
doesn't affect truncation of the source.


> 
>>
>>> @@ -2512,6 +2651,19 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
>>>case BUILT_IN_STPCPY_CHK_CHKP:
>>>  handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi);
>>>  break;
>>> +
>>> +  case BUILT_IN_STRNCAT:
>>> +  case BUILT_IN_STRNCAT_CHK:
>>> +check_builtin_strncat (DECL_FUNCTION_CODE (callee), gsi);
>>> +break;
>>> +
>>> +  case BUILT_IN_STPNCPY:
>>> +  case BUILT_IN_STPNCPY_CHK:
>>> +  case BUILT_IN_STRNCPY:
>>> +  case BUILT_IN_STRNCPY_CHK:
>>> +check_builtin_stxncpy (DECL_FUNCTION_CODE (callee), gsi);
>>> +break;
>>> +
>> So we've got calls to check the arguments, but not optimize here.  But
>> the containing function is "strlen_optimize_stmt".
>>
>> Would it make sense to first call strlen_optimize_stmt to handle the
>> optimization cases, then to call a new separate function to handle
>> warnings for the strn* family?
> 
> tree-ssa-strlen doesn't handle strncat or strncpy (for the latter
> I'm tracking the enhancement in bug 81433).  When the handling is
> added I expect check_builtin_{strncat,stxncpy} will be renamed to
> handle_builtin_{strncat,stxncpy} to match the existing functions,
> and the optimization done there.
> 
> Or did you have something else in mind and I missed it?
So do you envision doing both optimization and checking together?  If
so, then I think we'd want to rename strlen_optimize_stmt.  If
optimization and checking are expected to remain separate we'd want a
new function to handle checking of strncat stpncpy, strncpy and the
caller would look something like this:

  for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
{
  check_strstar_arguments (gsi);  // Or whatever we named it
  if (strlen_optimize_stmt (&gsi))
gsi_next (&gsi);
}

jeff

Re: [PATCH, rs6000] Clean up capitalized diagnostic messages

2017-08-02 Thread Martin Sebor


On 08/02/2017 09:29 AM, Bill Schmidt wrote:

Hi,

Jakub pointed out that we've had several error messages slip into the POWER 
back end
that incorrectly start with a capital letter.  This patch hunts those down and 
cleans
them up.


Others have already commented on some aspects of the patch but
let me suggest that using the option that controls it instead
of changing the spelling of AltiVec would bypass the whole
capitalization issue and might also provide an additional detail
that may otherwise not be immediately obvious.

Also, while making changes to the diagnostics, it would be great
to also add quoting around the built-in names and option names as
in:


--- gcc/config/rs6000/rs6000-c.c(revision 250791)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -6802,7 +6802,7 @@ altivec_resolve_overloaded_builtin (location_t loc
 if (unsupported_builtin)
   {
const char *name = rs6000_overloaded_builtin_name (fcode);
-   error ("Builtin function %s not supported in this compiler 
configuration",
+   error ("builtin function %s not supported in this compiler 
configuration",


use %qs instead:

  error ("builtin function %qs not supported in this compiler 
configuration", ...);


and similarly here:


+   error ("power9 target option is incompatible with -mcpu= for "
   " less than power9");


Here too, I suggest to consider using and quoting an option that
enables the to power9 target rather than using power9 as a plain
word.

Martin

Re: ping [PATCH] [MSP430] Fix PR78849: ICE on initialization of global struct containing __int20 array

2017-08-02 Thread Joseph Myers

On Wed, 2 Aug 2017, Jeff Law wrote:

> I think Joseph's suggestion for looking at partial float handling may be
> useful, though ia64's RFmode may be more interesting as it's not a
> multiple of 8 in bitsize.  IF/KF modes on the ppc port have similar
> properties.

The key issue those floating-point modes engage is the meaning of 
precision.  IFmode and KFmode have precision set to 106 and 113 to 
distinguish them.  But that's precision in the sense of number of mantissa 
bits; as normally understood in GCC, precision should be the number of 
significant bits, so 128 for both those modes (but having multiple 
different binary floating-point modes with the same precision would 
require changing how we deal with laying out long double, so the target 
specifies a mode for floating-point types rather than leaving it to be 
deduced from values such as LONG_DOUBLE_TYPE_SIZE).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 1/1] sparc: support for -mmisalign in the SPARC M8

2017-08-02 Thread David Miller

From: qinzhao 
Date: Wed,  2 Aug 2017 10:27:51 -0500

> This patch adds support to GCC for the misaligned load/store
> instructions introduced in the Oracle SPARC Architecture 2017 and
> implemented by the SPARC M8 processor.
> 
> A new command line option -mmisaligned is added, that activates the
> usage of the new instructions.
> 
> The SPARC backend is modified to use the misaligned load/store
> instructions when loading/storing data from/to addresses that are
> known to be misaligned at compile time (such as in packed structs).
> 
> New tests are added to check that the proper instructions are used
> when loading and storing from/to packed structs.
> 
> The GCC manual is expanded to cover the new command-line option.

STRICT_ALIGNMENT has a lot of implications.

I think just because we happen to have misaligned loads and stores
available doesn't mean we want all of the side effects associated
with STRICT_ALIGNMENT being true.

Re: [PATCH 3/5] combine: Use insn_cost instead of pattern_cost everywhere

2017-08-02 Thread Jeff Law

On 07/31/2017 05:25 PM, Segher Boessenkool wrote:
> This changes combine to always use insn_cost, not pattern_cost.  I don't
> intend to commit it like this: it calls recog more often than necessary,
> and it is very ugly (no, don't look at it).  But it's good enough to test
> things with.
So what do you expect this is going to look like when you're done?  The
other target independent patches all look reasonable so I think it's
really a matter of how you want to want to use the new infrastructure in
combine.c (which will become the template for how other passes might use
the infrastructure as well).

jeff

Re: [PATCH 4/5] Add targetm.insn_cost hook

2017-08-02 Thread Jeff Law

On 07/31/2017 05:25 PM, Segher Boessenkool wrote:
> This introduces a hook to implement insn_cost.  If a target does not
> implement the hook, the old function (i.e. pattern_cost) is used.
> 
> ---
>  gcc/doc/tm.texi| 12 
>  gcc/doc/tm.texi.in |  2 ++
>  gcc/rtlanal.c  |  3 +++
>  gcc/target.def | 14 ++
>  4 files changed, 31 insertions(+)
Looks reasonable as well.

jeff

Re: [PATCH 2/5] Replace insn_rtx_cost with insn_cost and pattern_cost

2017-08-02 Thread Jeff Law

On 07/31/2017 05:25 PM, Segher Boessenkool wrote:
> This renames insn_rtx_cost to pattern cost, and adds a new function
> insn_cost that takes an rtx_insn * instead of an instruction pattern
> as input.  It uses the latter function anywhere an instruction is
> readily available (instead of just an instruction pattern).
> 
> The actual implementation of insn_cost just calls pattern_cost on
> the pattern of the instruction; no functional change yet.
> 
> ---
>  gcc/cfgrtl.c  |  7 +++
>  gcc/combine.c | 17 -
>  gcc/dse.c |  2 +-
>  gcc/ifcvt.c   | 12 ++--
>  gcc/rtl.h |  3 ++-
>  gcc/rtlanal.c | 13 +++--
>  6 files changed, 31 insertions(+), 23 deletions(-)
Also looks good to me to go in once you've got a ChangeLog.

jeff

Re: [PATCH 1/5] Rename existing insn_cost to insn_sched_cost

2017-08-02 Thread Jeff Law

On 07/31/2017 05:25 PM, Segher Boessenkool wrote:
> haifa-sched exports an insn_cost function, but it is only used in a
> few places and specialised to scheduling.  This patch renames it to
> insn_sched_cost.
> 
> ---
>  gcc/haifa-sched.c  | 14 +++---
>  gcc/sched-int.h|  2 +-
>  gcc/sched-rgn.c|  4 ++--
>  gcc/sel-sched-ir.c |  2 +-
>  4 files changed, 11 insertions(+), 11 deletions(-)
This seems fine independent of the following patches.  So with a
ChangeLog it should probably go in now rather than later.

jeff

Re: ping [PATCH] [MSP430] Fix PR78849: ICE on initialization of global struct containing __int20 array

2017-08-02 Thread Jeff Law

On 08/01/2017 10:26 AM, Jozef Lawrynowicz wrote:
> On 01/08/2017 00:08, Joseph Myers wrote:
>> On Wed, 26 Jul 2017, Jeff Law wrote:
>>
>>> TYPE_SIZE, according to my understanding, should be a tree for the size
>>> of the expression in bits.
>>>
>>> The problem is for msp430 that size varies depending on where it's used.
>>>   ie, in a register an object might have a bitsize of 20 bits, but in
>>> memory its size is 32 bits.
>>>
>>> I don't think that TYPE_SIZE has any concept that the use context is
>>> relevant to selecting the proper size.
>>
>> TYPE_SIZE_UNIT is unambiguously the memory size, including padding; it's
>> used for sizeof.  TYPE_SIZE may be less clear.  We've had issues before
>> with unions with x87 long double, which has 80-bit precision in registers
>> but is 12-byte or 16-byte in memory; that was wrong code (bug 71522)
>> rather than an ICE, but the long double case may be useful for comparison
>> of what various type properties are set to in such cases.
>>
> 
> Thanks for the responses.
> 
> Could it be possible to use "variable_size" to help the compiler choose
> which size to use for TYPE_SIZE? Although the problem I see with this
> right away is that variable_size is never executed on an INTEGER_CST,
> perhaps this is part of the problem?
I suspect variable_size is more for things like VLAs where the size of
an array may vary at runtime.

What we're dealing here is more that the size of an object varies
depending on if it's in a register or in memory.


> Zooming out a bit from TYPE_SIZE, the value that ends up being incorrect
> as a result of TYPE_SIZE is rli->bitpos, this value is used a lot in
> stor_layout.c:place_field, perhaps I need to look deeper in there.
I think Joseph's suggestion for looking at partial float handling may be
useful, though ia64's RFmode may be more interesting as it's not a
multiple of 8 in bitsize.  IF/KF modes on the ppc port have similar
properties.

If you could declare a type that corresponds to one of those modes, then
build an array of them and follow how that code works it might give you
some hints on how to fix intXX.
jeff

Re: [PATCH] Add missing edge probability in simd_clone_adjust

2017-08-02 Thread Jakub Jelinek

On Wed, Aug 02, 2017 at 06:15:31PM +0200, Tom de Vries wrote:
> 2017-08-02  Tom de Vries  
> 
>   * omp-simd-clone.c (simd_clone_adjust): Add missing edge probability.

I think that is not the right probability, when using masked simd, the point
is that it is called from conditional code where usually not all the bits in
the mask are set.  So I think it is better to use say likely () probability
for the EDGE_FALSE_VALUE edge (that mask != 0) and unlikely () probability
for the EDGE_TRUE_VALUE edge.
Hopefully that conditional goes away later when vectorized, but if it
doesn't, we shouldn't  say it is that unlikely it will be masked off.

> diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
> index a1a563e..5b5d199 100644
> --- a/gcc/omp-simd-clone.c
> +++ b/gcc/omp-simd-clone.c
> @@ -1240,7 +1240,8 @@ simd_clone_adjust (struct cgraph_node *node)
>g = gimple_build_cond (EQ_EXPR, mask, build_zero_cst (TREE_TYPE 
> (mask)),
>NULL, NULL);
>gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
> -  make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
> +  edge e = make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
> +  e->probability = profile_probability::guessed_never ();
>FALLTHRU_EDGE (loop->header)->flags = EDGE_FALSE_VALUE;
>  }
>  

Jakub

Go patch committed: Only finalize embedded fields before finalizing methods

2017-08-02 Thread Ian Lance Taylor

In the Go frontend, when finalizing the methods of a named struct
type, we used to finalize all the field types first.  That can fail if
the field types refer indirectly to the named type.  Change it to just
finalize the embedded field types first, and the rest of the fields
later.  This fixes https://golang.org/issue/21253.  Bootstrapped and
ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline and
GCC 7 branch.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 250686)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-f7c36b27a49131f60eedde260896d310d735d408
+c1ac6bc99f988633c6bc68a5ca9ffad3487750ef
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 250548)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -3058,26 +3058,53 @@ Finalize_methods::type(Type* t)
 
 case Type::TYPE_NAMED:
   {
-   // We have to finalize the methods of the real type first.
-   // But if the real type is a struct type, then we only want to
-   // finalize the methods of the field types, not of the struct
-   // type itself.  We don't want to add methods to the struct,
-   // since it has a name.
Named_type* nt = t->named_type();
Type* rt = nt->real_type();
if (rt->classification() != Type::TYPE_STRUCT)
  {
+   // Finalize the methods of the real type first.
if (Type::traverse(rt, this) == TRAVERSE_EXIT)
  return TRAVERSE_EXIT;
+
+   // Finalize the methods of this type.
+   nt->finalize_methods(this->gogo_);
  }
else
  {
+   // We don't want to finalize the methods of a named struct
+   // type, as the methods should be attached to the named
+   // type, not the struct type.  We just want to finalize
+   // the field types.
+   //
+   // It is possible that a field type refers indirectly to
+   // this type, such as via a field with function type with
+   // an argument or result whose type is this type.  To
+   // avoid the cycle, first finalize the methods of any
+   // embedded types, which are the only types we need to
+   // know to finalize the methods of this type.
+   const Struct_field_list* fields = rt->struct_type()->fields();
+   if (fields != NULL)
+ {
+   for (Struct_field_list::const_iterator pf = fields->begin();
+pf != fields->end();
+++pf)
+ {
+   if (pf->is_anonymous())
+ {
+   if (Type::traverse(pf->type(), this) == TRAVERSE_EXIT)
+ return TRAVERSE_EXIT;
+ }
+ }
+ }
+
+   // Finalize the methods of this type.
+   nt->finalize_methods(this->gogo_);
+
+   // Finalize all the struct fields.
if (rt->struct_type()->traverse_field_types(this) == TRAVERSE_EXIT)
  return TRAVERSE_EXIT;
  }
 
-   nt->finalize_methods(this->gogo_);
-
// If this type is defined in a different package, then finalize the
// types of all the methods, since we won't see them otherwise.
if (nt->named_object()->package() != NULL && nt->has_any_methods())

Re: ping [PATCH] [MSP430] Fix PR78849: ICE on initialization of global struct containing __int20 array

2017-08-02 Thread Jeff Law

On 07/31/2017 05:08 PM, Joseph Myers wrote:
> On Wed, 26 Jul 2017, Jeff Law wrote:
> 
>> TYPE_SIZE, according to my understanding, should be a tree for the size
>> of the expression in bits.
>>
>> The problem is for msp430 that size varies depending on where it's used.
>>  ie, in a register an object might have a bitsize of 20 bits, but in
>> memory its size is 32 bits.
>>
>> I don't think that TYPE_SIZE has any concept that the use context is
>> relevant to selecting the proper size.
> 
> TYPE_SIZE_UNIT is unambiguously the memory size, including padding; it's 
> used for sizeof.  TYPE_SIZE may be less clear.  We've had issues before 
> with unions with x87 long double, which has 80-bit precision in registers 
> but is 12-byte or 16-byte in memory; that was wrong code (bug 71522) 
> rather than an ICE, but the long double case may be useful for comparison 
> of what various type properties are set to in such cases.
Thanks for the clarification.  I wandered around a bit before commenting
on Jozef's patch and didn't come across anything which made it
unambiguous.

And yes, x87 long doubles are a good example to compare/contrast against.

jeff

Re: [PATCH] Rewrite mklog in Python

2017-08-02 Thread Jeff Law

On 08/01/2017 01:00 PM, Yury Gribov wrote:
> Hi all,
> 
> This is a rewrite of contrib/mklog in Python.  I started adding features
> suggested by Trevor some time ago but this quickly turned into a full
> rewrite of existing Perl mklog and then I decided to just fully rewrite
> it in Python (given that this has been requested several times).
> 
> The code is mostly similar to existing version, except that I now
> separate parsing and generation of ChangeLog text (for clarity).  New
> script will also (try to) detect changes in prototypes, macro and
> supermacro.
> 
> I have a local testsuite for mklog which I used to verify functionality.
>  In all cases it was similar or better compared to Perl version.  And
> I'll obviously be around to fix regressions.
> 
> I suggest to replace existing contrib/mklog with this (Perl version can
> be moved to contrib/mklog.pl).
Sounds good to me.Please go ahead and install, keeping the old one
as mklog.pl for now.

Jeff

Re: [PATCH] minor readability tweaks to print_node

2017-08-02 Thread Jeff Law

On 08/01/2017 07:16 PM, Martin Sebor wrote:
> This is a small readability tweak to the tree printer to have
> it consistently use dashes and colons in tree attribute names
> in favor of spaces.  It's been tested on x86_64-linux.
> 
> The tree printer tends to avoid using spaces to separate tree
> attribute names (not the __attribute__ kind but things like
> needs-constructing or pointer_to_this).  Instead it uses dashes
> and underscores to separate words, and colons to separate integer
> values from the attribute names.
> 
> In a small number of cases it does use a space which can make it
> harder to read.  For instance a FUNCTION_DECL might be rendered
> like so:
> 
> type <...>
>  public external QI file /build/tmp/a.c line 14 col 13 user align 256>
> 
> In the last three attributes "user align 256" it's not completely
> clear whether it's user, followed by align, followed by 256, or
> user-align:256.
> 
> The attached patch replaces the handful of spaces between words
> I noticed with dashes, and between words and numbers with colons.
> It also fixes the above location to use the prevalent
> file:line:column notation.
> 
> Martin
> 
> gcc-print-tree.c.diff
> 
> 
> gcc/ChangeLog:
> 
>   * print-tree.c (print_node): Print location using the established
>   format %s:%i%i.
>   Replace spaces with colons.
>   (debug_raw, debug): Ditto.
OK.


Jeff

[PATCH] Add missing edge probability in simd_clone_adjust

2017-08-02 Thread Tom de Vries


[ Re: [PATCH] Verify edge probability consistency in verify_flow_info ]

On 08/02/2017 06:07 PM, Tom de Vries wrote:
I've written this patch to check for the missing probability more 
consistently. I'm not certain if we can require that the probability 
should always be set, so I'm just requiring that if it is set on one 
outgoing edge, it needs to be set on all outgoing edges.


Sofar I've build a c-only x86_64 non-bootstrap compiler and ran dg.exp. 
The only problem I ran into was in attr-simd{,-2,-4}.c. I've written a 
tentative patch for that, and will submit it as follow-up.


Jakub,

this patch adds a missing edge probability in simd_clone_adjust.

OK for trunk if bootstrap and reg-test on x86_64 succeeds?

Thanks,
- Tom
Add missing edge probability in simd_clone_adjust

Currently we generate an if with probability set on only one of the two edges:
   [0.00%] [count: INV]:
  _5 = mask.3[iter.6_3];
  if (_5 == 0)
goto ; [INV] [count: INV]
  else
goto ; [100.00%] [count: INV]

Add the missing edge probability:
goto ; [0.00%] [count: INV]

2017-08-02  Tom de Vries  

	* omp-simd-clone.c (simd_clone_adjust): Add missing edge probability.

---
 gcc/omp-simd-clone.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index a1a563e..5b5d199 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -1240,7 +1240,8 @@ simd_clone_adjust (struct cgraph_node *node)
   g = gimple_build_cond (EQ_EXPR, mask, build_zero_cst (TREE_TYPE (mask)),
 			 NULL, NULL);
   gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING);
-  make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
+  edge e = make_edge (loop->header, incr_bb, EDGE_TRUE_VALUE);
+  e->probability = profile_probability::guessed_never ();
   FALLTHRU_EDGE (loop->header)->flags = EDGE_FALSE_VALUE;
 }

[PATCH] Verify edge probability consistency in verify_flow_info

2017-08-02 Thread Tom de Vries


Hi,

I.

for target nvptx we recently ran into PR81442, an ICE in verify_flow_info:
...
error: verify_flow_info: REG_BR_PROB is set but cfg probability is not
...

We start out with a jump instruction:
...
(jump_insn 18 17 31 2 (set (pc)
(if_then_else (ne (reg:BI 83)
(const_int 0 [0]))
(label_ref 31)
(pc))) "reduction-5.c":52 104 {br_true}
 (expr_list:REG_DEAD (reg:BI 83)
(int_list:REG_BR_PROB 1 (nil)))
 -> 31)
...

The probability is set on the branch edge, but not on the fallthru edge.

After fixup_reorder_chain, the condition is reversed, and the 
probability reg-note update accordingly:

...
(jump_insn 18 17 32 2 (set (pc)
(if_then_else (eq (reg:BI 83)
(const_int 0 [0]))
(label_ref:DI 33)
(pc))) "reduction-5.c":52 105 {br_false}
 (expr_list:REG_DEAD (reg:BI 83)
(int_list:REG_BR_PROB 0 (nil)))
 -> 33)
...

The branch and fallthru edge are also reversed, which means now the 
probability is set on the fallthru edge, and not on the branch edge.


This is the immediate cause for the verify_flow_info error.


II.

We've fixed the ICE in the target by setting the probability on the 
fallthru edge when we introduce it (r250422).


We've noted other places where we were forgetting to set the probability 
(fixed in r250823), but that did not trigger the ICE.



III.

I've written this patch to check for the missing probability more 
consistently. I'm not certain if we can require that the probability 
should always be set, so I'm just requiring that if it is set on one 
outgoing edge, it needs to be set on all outgoing edges.


Sofar I've build a c-only x86_64 non-bootstrap compiler and ran dg.exp. 
The only problem I ran into was in attr-simd{,-2,-4}.c. I've written a 
tentative patch for that, and will submit it as follow-up.


Is this check a good idea?

OK for trunk if bootstrap and reg-test on x86_64 succeeds?

Thanks,
- Tom
Verify edge probability consistency in verify_flow_info

2017-08-02  Tom de Vries  

	* cfghooks.c (verify_flow_info): Verify that if one outgoing edge has
	probability set, all outgoing edges have probability set.

---
 gcc/cfghooks.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c
index 18dc49a..b973651 100644
--- a/gcc/cfghooks.c
+++ b/gcc/cfghooks.c
@@ -152,6 +152,8 @@ verify_flow_info (void)
 		 bb->index, bb->frequency);
 	  err = 1;
 	}
+  bool has_prob_uninit_edges = false;
+  bool has_prob_init_edges = false;
   FOR_EACH_EDGE (e, ei, bb->succs)
 	{
 	  if (last_visited [e->dest->index] == bb)
@@ -166,6 +168,10 @@ verify_flow_info (void)
 		 e->src->index, e->dest->index);
 	  err = 1;
 	}
+	  if (e->probability.initialized_p ())
+	has_prob_init_edges = true;
+	  else
+	has_prob_uninit_edges = true;
 	  if (!e->count.verify ())
 	{
 	  error ("verify_flow_info: Wrong count of edge %i->%i",
@@ -197,6 +203,12 @@ verify_flow_info (void)
 	  error ("wrong amount of branch edges after unconditional jump %i", bb->index);
 	  err = 1;
 	}
+  if (has_prob_uninit_edges && has_prob_init_edges)
+	{
+	  error ("Missing edge probability after unconditional jump in bb %i",
+		 bb->index);
+	  err = 1;
+	}
 
   FOR_EACH_EDGE (e, ei, bb->preds)
 	{

Re: [PATCH, rs6000] Clean up capitalized diagnostic messages

2017-08-02 Thread Joseph Myers

On Wed, 2 Aug 2017, Jakub Jelinek wrote:

> On Wed, Aug 02, 2017 at 10:29:20AM -0500, Bill Schmidt wrote:
> > --- gcc/config/rs6000/rs6000.c  (revision 250791)
> > +++ gcc/config/rs6000/rs6000.c  (working copy)
> > @@ -4132,7 +4132,7 @@ rs6000_option_override_internal (bool global_init_
> >|| rs6000_cpu == PROCESSOR_PPCE5500)
> >  {
> >if (TARGET_ALTIVEC)
> > -   error ("AltiVec not supported in this target");
> > +   error ("altivec not supported in this target");
> 
> If AltiVec is spelled that way always, then I think we want to keep it
> capitalized, but CCing Joseph just to be sure.  Or the diagnostics

Yes, if it would be capitalized mid-sentence then it should be capitalized 
at the start of a diagnostic (and if it wouldn't be capitalized 
mid-sentence, it shouldn't be capitalized at the start of a diagnostic).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] toplev: avoid recursive emergency_dump_function

2017-08-02 Thread Jeff Law

On 07/20/2017 08:40 AM, Alexander Monakov wrote:
> Hi,
> 
> Segher pointed out on IRC that ICE reporting with dumps enabled got worse:
> if emergency_dump_function itself leads to an ICE (e.g. by segfaulting),
> nested ICE reporting will invoke emergency_dump_function in exactly the
> same context, but not only would we uselessly announce current pass again,
> this time around the second SIGSEGV will just terminate cc1 because the
> signal handler is unregistered (so we won't print the backtrace).  Sorry
> for not really considering the implications when submitting that patch.
> 
> Solve this by substituting the callback for global_dc->internal_error;
> this avoids invoking emergency_dump_function, and also gives a
> convenient point to flush the dump file.  OK to apply?
> 
>   * topvel.c (dumpfile.h): New include.
>   (internal_error_reentered): New static function.  Use it...
>   (internal_error_function): ...here to handle reentered internal_error.
OK.
jeff

Re: [PATCH, rs6000] Clean up capitalized diagnostic messages

2017-08-02 Thread Jakub Jelinek

On Wed, Aug 02, 2017 at 10:29:20AM -0500, Bill Schmidt wrote:
> --- gcc/config/rs6000/rs6000.c(revision 250791)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -4132,7 +4132,7 @@ rs6000_option_override_internal (bool global_init_
>|| rs6000_cpu == PROCESSOR_PPCE5500)
>  {
>if (TARGET_ALTIVEC)
> - error ("AltiVec not supported in this target");
> + error ("altivec not supported in this target");

If AltiVec is spelled that way always, then I think we want to keep it
capitalized, but CCing Joseph just to be sure.  Or the diagnostics
could be reworded not to have AltiVec first.

> @@ -4250,7 +4250,7 @@ rs6000_option_override_internal (bool global_init_
> rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~ignore_masks);
>   }
> else
> - error ("Power9 target option is incompatible with -mcpu= for "
> + error ("power9 target option is incompatible with -mcpu= for "
>  " less than power9");
>   }
>else if ((ISA_3_0_MASKS_SERVER & rs6000_isa_flags_explicit)

Similarly, not sure about this one, but isn't it about -mpower9-minmax
being incompatible with -mcpu option not supporting power9 ISA?
The  and talking about less for strings that aren't necessarily ordered
is just weird.

> @@ -11226,7 +11226,7 @@ rs6000_return_in_memory (const_tree type, const_tr
>static bool warned_for_return_big_vectors = false;
>if (!warned_for_return_big_vectors)
>   {
> -   warning (OPT_Wpsabi, "GCC vector returned by reference: "
> +   warning (OPT_Wpsabi, "gcc vector returned by reference: "
>  "non-standard ABI extension with no compatibility 
> guarantee");
> warned_for_return_big_vectors = true;
>   }
> @@ -12773,7 +12773,7 @@ rs6000_pass_by_reference (cumulative_args_t cum AT
>   fprintf (stderr, "function_arg_pass_by_reference: synthetic vector\n");
>if (!warned_for_pass_big_vectors)
>   {
> -   warning (OPT_Wpsabi, "GCC vector passed by reference: "
> +   warning (OPT_Wpsabi, "gcc vector passed by reference: "
>  "non-standard ABI extension with no compatibility 
> guarantee");
> warned_for_pass_big_vectors = true;
>   }

GCC I think needs to be capitalized.  Not sure if talking about GCC vector
is clear, it is talking about the GNU vector extension, perhaps GNU instead?

The rest of the changes look good to me, but I'll defer review to the target
maintainers.

Jakub

Re: [PATCH 0/2] Python testcases to check DWARF output

2017-08-02 Thread Jeff Law

On 07/27/2017 01:52 AM, Richard Biener wrote:
> On Wed, Jul 26, 2017 at 11:25 PM, Mike Stump  wrote:
>> On Jul 26, 2017, at 9:00 AM, Pierre-Marie de Rodat  
>> wrote:
>>> At the last GNU Cauldron, Richard Biener and I talked about DWARF output
>>> testing. Except for guality tests, which are disabled on several
>>> targets, the only way tests check the DWARF is scanning the annotated
>>> assembly (-dA), making it hard to write reliable tests.
>>
>>> Anyway, Richard and I discussed about doing something similar in-tree,
>>> and here is a candidate set of patches to achieve that
>>
>> I'm fine with the direction if a reviewer wants to go in that direction.  I 
>> wish python didn't have a built-in speed penalty, that's the only downside I 
>> don't like about it.  Aside from that, even switching all of the testsuite 
>> to be python based isn't a terrible idea.
> 
> But is it worse than TCL?
I don't think Python is worse than TCL.  Few things would have that
property.  And I think we're a lot more likely to be able to find folks
that can hack Python as needed vs hacking TCL.

jeff

Re: [PATCH 0/2] Python testcases to check DWARF output

2017-08-02 Thread Jeff Law

On 07/26/2017 10:00 AM, Pierre-Marie de Rodat wrote:
> Hello,
> 
> At the last GNU Cauldron, Richard Biener and I talked about DWARF output
> testing. Except for guality tests, which are disabled on several
> targets, the only way tests check the DWARF is scanning the annotated
> assembly (-dA), making it hard to write reliable tests.
> 
> For instance, checking the number of times DW_AT_location is present in
> order to check that some specific variable is assigned one is fuzzy.
> Depending on the target and on the evolution of the compiler, the number
> of output variables, or which one is assigned a location can vary
> legitimately but still make the test fail.
> 
> On my side, I already had written an out-of-tree testsuite for the DWARF
> features I added for Ada. This testsuite uses a DWARF parser in order to
> perform checks on a tree:
> . I had to update it
> a couple of times, for instance when a change created a
> DW_TAG_const_type DIE or removed one somewhere in a type tree, but
> that’s very rare. I would say that I’m satisfied with the checks I could
> express, but I don’t remember I ever caught a regression with them, so I
> have no representative experience to share in this area. Maybe DWARF
> back-end developpers do a too good job. ;-)
> 
> Anyway, Richard and I discussed about doing something similar in-tree,
> and here is a candidate set of patches to achieve that:
> 
>   * The first patch installs DejaGNU scripts to run a Python interpreter
> in testcases.
> 
>   * The second one installs other DejaGNU scripts to detect DWARF
> dumping tools, plus a small Python library to parse and pattern
> match DIEs and their attributes. It also adds several C and Ada
> tests as examples; these are inspired by existing homonym tests
> based on assembly scanning.
> 
> For now, this supports only platforms where objdump is available for the
> current target, but extending it to other tools, such as otool on Darwin
> should be doable.
> 
> I would appreciate feedback about the idea and the implementation I
> propose. This is the first time I do more in the testsuite than just
> adding new tests, so thank you in advance for you patience in reviewing
> these. :-)
> 
> I tested these patches on x86_64-linux.
I hate to throw in a wrench at this point, but has anyone looked at
dwgrep from Petr Machata?  He's not doing much with it anymore, but it
might provide enough of a dwarf scanning framework to be useful for
testing purposes.

jeff

Re: [PATCH] Improve extraction of changed file in contrib/mklog

2017-08-02 Thread Jeff Law

On 07/27/2017 08:33 PM, Yuri Gribov wrote:
> On Wed, Jul 26, 2017 at 6:11 PM, Jeff Law  wrote:
>> On 07/09/2017 01:03 PM, Yuri Gribov wrote:
>>> Hi,
>>>
>>> Currently mklog will fail to analyze lines like this in patches:
>>> diff -rupN gcc/gcc/testsuite/lib/profopt.exp
>>> gcc-compare-checks/gcc/testsuite/lib/profopt.exp
>>> (it fails with "Error: failed to parse diff for ... and ...").
>>>
>>> This patch fixes it. Ok for trunk?
>>>
>>> -Y
>>>
>>>
>>> mklog-filename-fix-1.patch
>>>
>>>
>>> 2017-07-09  Yury Gribov  
>>>
>>> contrib/
>>> * mklog: Fix extraction of changed file name.
>> One could argue that given general directions python would be a better
>> match than perl, but I don't think it's reasonable to require a rewrite
>> to move forward.
> 
> Jeff,
> 
> Is this to make script more accessible for hacking by others?
Precisely.  In general I think folks here are more adept at python than
perl.

> Otherwise from technical standpoint there probly isn't much
> difference.
Agreed.

> 
> I'm fine with rewriting it in Python once I get some time later this month.
Thanks.  Saw that fly by, it's in the queue.

jeff

[PATCH, rs6000] Clean up capitalized diagnostic messages

2017-08-02 Thread Bill Schmidt

Hi,

Jakub pointed out that we've had several error messages slip into the POWER 
back end
that incorrectly start with a capital letter.  This patch hunts those down and 
cleans
them up.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this 
ok for
trunk?  I don't anticipate backporting any of this.

Thanks,
Bill


[gcc]

2017-08-02  Bill Schmidt  

* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Don't start diagnostic messages with a capital letter.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Likewise.
(rs6000_return_in_memory): Likewise.
(rs6000_pass_by_reference): Likewise.
(rs6000_invalid_builtin): Likewise.
(rs6000_trampoline_init): Likewise.

[gcc/testsuite]

2017-08-02  Bill Schmidt  

* gcc.dg/pr35442.c: Adjust for error messages that used to start
with a capital letter.
* gcc.target/powerpc/bfp/scalar-cmp-exp-eq-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-cmp-exp-gt-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-cmp-exp-lt-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-cmp-unordered-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-exp-1.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-exp-4.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-sig-1.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-sig-4.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-1.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-10.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-4.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-7.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-11.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-6.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-7.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise.
* gcc.target/powerpc/bfp/vec-extract-exp-2.c: Likewise.
* gcc.target/powerpc/bfp/vec-extract-exp-3.c: Likewise.
* gcc.target/powerpc/bfp/vec-extract-sig-2.c: Likewise.
* gcc.target/powerpc/bfp/vec-extract-sig-3.c: Likewise.
* gcc.target/powerpc/bfp/vec-insert-exp-2.c: Likewise.
* gcc.target/powerpc/bfp/vec-insert-exp-3.c: Likewise.
* gcc.target/powerpc/bfp/vec-insert-exp-6.c: Likewise.
* gcc.target/powerpc/bfp/vec-insert-exp-7.c: Likewise.
* gcc.target/powerpc/bfp/vec-test-data-class-2.c: Likewise.
* gcc.target/powerpc/bfp/vec-test-data-class-3.c: Likewise.
* gcc.target/powerpc/byte-in-either-range-1.c: Likewise.
* gcc.target/powerpc/byte-in-range-1.c: Likewise.
* gcc.target/powerpc/byte-in-set-1.c: Likewise.
* gcc.target/powerpc/crypto-builtin-2.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-1.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-11.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-16.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-21.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-26.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-31.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-36.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-41.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-46.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-51.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-56.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-6.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-61.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-66.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-71.c: Likewise.
* gcc.target/powerpc/dfp/dtstsfi-76.c: Likewise.
* gcc.target/powerpc/vsu/vec-all-nez-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-any-eqz-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-cmpnez-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-xl-len-12.c: Likewise.
* gcc.target/powerpc/vsu/vec-xlx-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-xrx-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-xst-len-12.c: Likewise.


Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 250791)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -6802,7 +6802,7 @@ altivec_resolve_overloaded_builtin (location_t loc
 if (unsupported_builtin)
   {
const char *name = rs6000_overloaded_builtin_name (fcode);
-   error ("Builtin function %s not supported in this compiler 
configuration",
+   error ("builtin function %s not supported in

[PATCH 1/1] sparc: support for -mmisalign in the SPARC M8

2017-08-02 Thread qinzhao

This patch adds support to GCC for the misaligned load/store
instructions introduced in the Oracle SPARC Architecture 2017 and
implemented by the SPARC M8 processor.

A new command line option -mmisaligned is added, that activates the
usage of the new instructions.

The SPARC backend is modified to use the misaligned load/store
instructions when loading/storing data from/to addresses that are
known to be misaligned at compile time (such as in packed structs).

New tests are added to check that the proper instructions are used
when loading and storing from/to packed structs.

The GCC manual is expanded to cover the new command-line option.

gcc/ChangeLog:

* config/sparc/constraints.md: New constraint B for memory
references whose addresses are misaligned.
* config/sparc/m8.md: New insn reservation for misaligned
load/store.
* config/sparc/sparc-protos.h (memory_is_misaligned): New.
* config/sparc/sparc.c (dump_target_flag_bits): Dump
MASK_MISALIGN.
(sparc_option_override): Honour MASK_MISALIGN, use command-line
option explicitly specified target_flags to control
target_flags.
(RTX_OK_FOR_OFFSET_P): For TARGET_MISALIGN treat 10-bits as
legal IMM.
(sparc_legitimate_address_p): Prohibit LO_SUM+IMM for
TARGET_MISALIGN.
(memory_is_misaligned): New
* config/sparc/sparc.h (STRICT_ALIGNMENT): Set to
!(TARGET_MISALIGN)
* config/sparc/sparc.md (cpu_feature): Add new feature misalign
(enabled): Handle misalign.
(type): New insn type load_mis, store_mis, fpload_mis,
fpstore_mis.
("*movhi_insn"): Add new alternatives for misaligned memory
accesses to use M8 misaligned load/store insns,update
corresponding attributes.
("*movsi_insn"): Likewise.
("*movdi_insn_sp32"): Likewise.
("*movdi_insn_sp64"): Likewise.
("*movsf_insn"): Likewise.
("*movdf_insn_sp32"): Likewise.
("*movdf_insn_sp64"): Likewise.
("*mov_insn"): Likewise.
("*mov_insn_sp64"): Likewise.
("*mov_insn_sp32"): Likewise.
(define_split): DI "memory_operand" from "const_zero_operand"
disable splitting for TARGET_MISALIGN.
(define_split): DF "memory_operand" from "const_zero_operand"
Likewise.
(define_split): VM64 "memory_operand" from "const_zero_operand"
Likewise.
* config/sparc/sparc.opt (mmisalign): New option.
* doc/invoke.texi (Option Summary): Document -mmisalign and
-mno-misalign.
(SPARC Optons): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/sparc/misalign-1.c: New test for misaligned ld/st.
* gcc.target/sparc/misalign-2.c: Likewise.
* gcc.target/sparc/misalign-3.c: Likewise.
* gcc.target/sparc/misalign-4.c: Likewise.
* gcc.target/sparc/misalign-5.c: Likewise.
* gcc.target/sparc/misalign-run-1.c: Likewise.
* gcc.target/sparc/misalign-run-2.c: Likewise.
* gcc.target/sparc/misalign-run-3.c: Likewise.
* lib/target-supports.exp (check_effective_target_misalign_hw):
New procedure.
---
 gcc/config/sparc/constraints.md |   12 ++-
 gcc/config/sparc/m8.md  |   16 +-
 gcc/config/sparc/sparc-protos.h |1 +
 gcc/config/sparc/sparc.c|   60 +++-
 gcc/config/sparc/sparc.h|5 +-
 gcc/config/sparc/sparc.md   |  183 +++
 gcc/config/sparc/sparc.opt  |4 +
 gcc/doc/invoke.texi |   10 ++
 gcc/testsuite/gcc.target/sparc/misalign-1.c |   45 ++
 gcc/testsuite/gcc.target/sparc/misalign-2.c |   23 +++
 gcc/testsuite/gcc.target/sparc/misalign-3.c |   42 +
 gcc/testsuite/gcc.target/sparc/misalign-4.c |   23 +++
 gcc/testsuite/gcc.target/sparc/misalign-5.c |   19 +++
 gcc/testsuite/gcc.target/sparc/misalign-run-1.c |   34 
 gcc/testsuite/gcc.target/sparc/misalign-run-2.c |   23 +++
 gcc/testsuite/gcc.target/sparc/misalign-run-3.c |   53 +++
 gcc/testsuite/lib/target-supports.exp   |   15 ++
 17 files changed, 485 insertions(+), 83 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/sparc/misalign-1.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/misalign-2.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/misalign-3.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/misalign-4.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/misalign-5.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/misalign-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/misalign-run-2.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/misalign-run-3.c

diff --git a/gcc/config/sparc/constraints.md b/gcc/config/sparc/constraints.md
index cff5a61..ba15233 100644
---

Re: Add include-guard to tree-vrp.h

2017-08-02 Thread Jeff Law

On 08/02/2017 08:16 AM, Prathamesh Kulkarni wrote:
> Add include-guard to tree-vrp.h.
> OK to commit ?
OK.
jeff

Re: [PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts

2017-08-02 Thread Michael Meissner

On Fri, Jul 28, 2017 at 04:08:50PM -0500, Segher Boessenkool wrote:
> 
> "emit" is not a good name for this: that is generally used for something
> that does emit_insn, i.e. put an insn in the instruction stream.  This
> function returns a string a define_insn can return.  For the rl* insns
> I called the similar functions rs6000_insn_for_*, maybe something like
> that is better here?

...

> I think calling this with the rtx elementN args makes this only more
> complicated (the function comment doesn't say what they are or what
> NULL means, btw).

...

> In this and the other testcase, should you test no other insns at all
> are generated?

Here are the revised patches.  I tested on a little endian power8 system and a
big endian power7 system.  Are these patches ok for the trunk?

[gcc]
2017-08-02  Michael Meissner  

PR target/81593
* config/rs6000/rs6000-protos.h (rs6000_output_xxpermdi): New
declaration.
* config/rs6000/rs6000.c (rs6000_output_xxpermdi): New function to
emit XXPERMDI accessing either double word in either vector
register inputs.
* config/rs6000/vsx.md (vsx_concat_, VSX_D iterator):
Rewrite VEC_CONCAT insn to call rs6000_output_xxpermdi.  Simplify
the constraints with the removal of the -mupper-regs-* switches.
(vsx_concat__1): New combiner insns to optimize CONCATs
where either register might have come from VEC_SELECT.
(vsx_concat__2): Likewise.
(vsx_concat__3): Likewise.
(vsx_set_, VSX_D iterator): Rewrite insn to generate a
VEC_CONCAT rather than use an UNSPEC to specify the option.

[gcc/testsuite]
2017-08-02  Michael Meissner  

PR target/81593
* gcc.target/powerpc/vsx-extract-6.c: New test.
* gcc.target/powerpc/vsx-extract-7.c: Likewise.


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 250793)
+++ gcc/config/rs6000/rs6000-protos.h   (.../gcc/config/rs6000) (working copy)
@@ -233,6 +233,7 @@ extern void rs6000_asm_output_dwarf_pcre
   const char *label);
 extern void rs6000_asm_output_dwarf_datarel (FILE *file, int size,
 const char *label);
+extern const char *rs6000_output_xxpermdi (rtx, rtx, rtx, rtx, rtx);
 
 /* Declare functions in rs6000-c.c */
 
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 250793)
+++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000) (working copy)
@@ -39007,6 +39007,60 @@ rs6000_optab_supported_p (int op, machin
   return true;
 }
 }
+
+
+/* Output a xxpermdi instruction that sets a 128-bit vector DEST combining two
+   inputs SRC1 and SRC2.
+
+   If ELEMENT1 is null, use the top 64-bit double word of ARG1.  If it is
+   non-NULL, it is a 0 or 1 constant that gives the vector element number to
+   use for extracting the 64-bit double word from ARG1.
+
+   If ELEMENT2 is null, use the top 64-bit double word of ARG2.  If it is
+   non-NULL, it is a 0 or 1 constant that gives the vector element number to
+   use for extracting the 64-bit double word from ARG2.
+
+   The element number is based on the user element ordering, set by the
+   endianess and by the -maltivec={le,be} options.  */
+
+const char *
+rs6000_output_xxpermdi (rtx dest,
+   rtx src1,
+   rtx src2,
+   rtx element1,
+   rtx element2)
+{
+  int op1_dword = (!element1) ? 0 : INTVAL (element1);
+  int op2_dword = (!element2) ? 0 : INTVAL (element2);
+  rtx xops[10];
+  const char *insn_string;
+
+  gcc_assert (IN_RANGE (op1_dword | op2_dword, 0, 1));
+  xops[0] = dest;
+  xops[1] = src1;
+  xops[2] = src2;
+
+  if (BYTES_BIG_ENDIAN)
+{
+  xops[3] = GEN_INT (2*op1_dword + op2_dword);
+  insn_string = "xxpermdi %x0,%x1,%x2,%3";
+}
+  else
+{
+  if (element1)
+   op1_dword = 1 - op1_dword;
+
+  if (element2)
+   op2_dword = 1 - op2_dword;
+
+  xops[3] = GEN_INT (op1_dword + 2*op2_dword);
+  insn_string = "xxpermdi %x0,%x2,%x1,%3";
+}
+
+  output_asm_insn (insn_string, xops);
+  return "";
+}
+
 
 struct gcc_target targetm = TARGET_INITIALIZER;
 
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 250793)
+++ gcc/config/rs6000/vsx.md(.../gcc/config/rs6000) (working copy)
@@

Add include-guard to tree-vrp.h

2017-08-02 Thread Prathamesh Kulkarni

Add include-guard to tree-vrp.h.
OK to commit ?

Thanks,
Prathamesh
2017-08-02  Prathamesh Kulkarni  

* tree-vrp.h: Add include guard.

diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h
index ef2c68a752b..f84403a0f83 100644
--- a/gcc/tree-vrp.h
+++ b/gcc/tree-vrp.h
@@ -17,6 +17,9 @@ You should have received a copy of the GNU General Public 
License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#ifndef GCC_TREE_VRP_H
+#define GCC_TREE_VRP_H
+
 /* Type of value ranges.  See value_range_d In tree-vrp.c for a
description of these types.  */
 enum value_range_type { VR_UNDEFINED, VR_RANGE,
@@ -57,3 +60,4 @@ extern void extract_range_from_unary_expr (value_range *vr,
   value_range *vr0_,
   tree op0_type);
 
+#endif /* GCC_TREE_VRP_H */

[PATCH, i386]: Fix PR81644, ICE in rtl_verify_bICE in rtl_verify_bb_insn, BBRO pass duplicates BB that ends with flow control insnb_insn, BBRO pass duplicates BB that ends with flow control insn

2017-08-02 Thread Uros Bizjak

It turned out we can't simply emit trap insn during epilogue
expansion, since epilogue is expanded on FALLTHRU edge to EXIT BB.
Later passes rightfully choke on control flow insn in BB with FALTHRU
exit edge.

So, instead of using TRAP_IF in the prologue, use UNSPEC_VOLATILE that
directly emits "ud2".

2017-08-02  Uros Bizjak  

PR target/81644
* config/i386/i386.md (unspecv): Add UNSPECV_UD2.
(ud2): New insn pattern.
* config/i386/i386.c (ix86_expand_epilogue):
For naked functions, generate ud2 instead of trap insn

testsuite/ChangeLog:

2017-08-02  Uros Bizjak  

PR target/81644
* gcc.target/i386/pr81644.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 250818)
+++ config/i386/i386.c  (working copy)
@@ -15199,7 +15199,7 @@ ix86_expand_epilogue (int style)
   if (ix86_function_naked (current_function_decl))
 {
   /* The program should not reach this point.  */
-  emit_insn (gen_trap ());
+  emit_insn (gen_ud2 ());
   return;
 }
 
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 250818)
+++ config/i386/i386.md (working copy)
@@ -201,6 +201,7 @@
 ])
 
 (define_c_enum "unspecv" [
+  UNSPECV_UD2
   UNSPECV_BLOCKAGE
   UNSPECV_STACK_PROBE
   UNSPECV_PROBE_STACK_RANGE
@@ -18606,6 +18607,18 @@
 }
   [(set_attr "length" "2")])
 
+(define_insn "ud2"
+  [(unspec_volatile [(const_int 0)] UNSPECV_UD2)]
+  ""
+{
+#ifdef HAVE_AS_IX86_UD2
+  return "ud2";
+#else
+  return ASM_SHORT "0x0b0f";
+#endif
+}
+  [(set_attr "length" "2")])
+
 (define_expand "prefetch"
   [(prefetch (match_operand 0 "address_operand")
 (match_operand:SI 1 "const_int_operand")
Index: testsuite/gcc.target/i386/pr81644.c
===
--- testsuite/gcc.target/i386/pr81644.c (nonexistent)
+++ testsuite/gcc.target/i386/pr81644.c (working copy)
@@ -0,0 +1,15 @@
+/* PR target/81644 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-mregparm=1" { target ia32 } } */
+
+void b (void);
+
+void
+__attribute__ ((naked))
+a (int z)
+{
+  if (z)
+return;
+  b ();
+}

Re: [PATCH][RFC] Make expansion of balanced binary trees of switches on tree level.

2017-08-02 Thread David Malcolm

On Wed, 2017-08-02 at 13:20 +0200, Martin Liška wrote:
> Hello.
> 
> After some discussions with Honza, I've decided to convert current
> code in stmt.c that
> is responsible for switch expansion. More precisely, I would like to
> convert the code
> to expand gswitch statements on tree level. Currently the newly
> created pass is executed
> at the end of tree optimizations.
> 
> My plan for future is to inspire in [1] and come up with some more
> sophisticated switch
> expansions. For that I've been working on a paper where I'll
> summarize statistics based
> on what I've collected in openSUSE distribution with specially
> instrumented GCC. If I'll be
> happy I can also fit in to schedule of this year's Cauldron with a
> talk.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression
> tests.
> 
> Thoughts?
> Martin
> 
> [1] https://www.nextmovesoftware.com/technology/SwitchOptimization.pd
> f
> 
> gcc/ChangeLog:
> 
> 2017-07-31  Martin Liska  
> 
[...]

>   * gimple-switch-low.c: New file.

Shouldn't new files have a .cc suffix these days?

[...]

[testsuite, committed] Use relative line number in gcc.dg/Walloca-14.c

2017-08-02 Thread Tom de Vries


[ was: Re: Avoid generating useless range info ]

On 06/28/2017 09:56 AM, Aldy Hernandez wrote:

void *q = __builtin_alloca (p); /* { dg-warning "passing argument 1" } */
+  /* { dg-warning "unbounded use of 'alloca'" "unbounded" { target *-*-* } 11 
} */


Hi,

Please use relative line number next time.

Committed.

Thanks,
- Tom
Use relative line number in gcc.dg/Walloca-14.c

2017-08-02  Tom de Vries  

	* gcc.dg/Walloca-14.c: Use relative line number.

---
 gcc/testsuite/gcc.dg/Walloca-14.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/Walloca-14.c b/gcc/testsuite/gcc.dg/Walloca-14.c
index f3e3f57..ea48227 100644
--- a/gcc/testsuite/gcc.dg/Walloca-14.c
+++ b/gcc/testsuite/gcc.dg/Walloca-14.c
@@ -9,6 +9,6 @@ g (int *p)
   extern void f (void *);
 
   void *q = __builtin_alloca (p); /* { dg-warning "passing argument 1" } */
-  /* { dg-warning "unbounded use of 'alloca'" "unbounded" { target *-*-* } 11 } */
+  /* { dg-warning "unbounded use of 'alloca'" "unbounded" { target *-*-* } .-1 } */
   f (q);
 }

[patch, fortran] Fix PR 60355, missing error for BIND(C) outside module scope

2017-08-02 Thread Thomas Koenig


Hello world,

the attached patch is a bit smaller than it looks, because most of
it is due to reformatting a large comment.  It is rather simple -
checking for an incorrectly placed BIND(C) variable was sometimes not
done because the test was mixed in with other tests where implicitly
typed variables were excluded.

Regression-tested. OK for trunk?

Regards

Thomas

2017-08-02  Thomas Koenig  

PR fortran/60355
* resolve.c (resolve_symbol): Adjust (and reformat)
comment.  Perform check if a BIND(C) is declared
at module level regardless of whether it is typed
implicitly or not.

2017-08-02  Thomas Koenig  

PR fortran/60355
* gfortran.dg (bind_c_usage_30): New test.
Index: resolve.c
===
--- resolve.c	(Revision 250720)
+++ resolve.c	(Arbeitskopie)
@@ -14397,17 +14397,18 @@ resolve_symbol (gfc_symbol *sym)
 	}
 }
 
-  /* If the symbol is marked as bind(c), verify it's type and kind.  Do not
- do this for something that was implicitly typed because that is handled
- in gfc_set_default_type.  Handle dummy arguments and procedure
- definitions separately.  Also, anything that is use associated is not
- handled here but instead is handled in the module it is declared in.
- Finally, derived type definitions are allowed to be BIND(C) since that
- only implies that they're interoperable, and they are checked fully for
- interoperability when a variable is declared of that type.  */
-  if (sym->attr.is_bind_c && sym->attr.implicit_type == 0 &&
-  sym->attr.use_assoc == 0 && sym->attr.dummy == 0 &&
-  sym->attr.flavor != FL_PROCEDURE && sym->attr.flavor != FL_DERIVED)
+  /* If the symbol is marked as bind(c), that it is declared at module level
+ scope and verify its type and kind.  Do not do the latter for symbols
+ that are implicitly typed because that is handled in
+ gfc_set_default_type.  Handle dummy arguments and procedure definitions
+ separately.  Also, anything that is use associated is not handled here
+ but instead is handled in the module it is declared in.  Finally, derived
+ type definitions are allowed to be BIND(C) since that only implies that
+ they're interoperable, and they are checked fully for interoperability
+ when a variable is declared of that type.  */
+  if (sym->attr.is_bind_c && sym->attr.use_assoc == 0
+  && sym->attr.dummy == 0 && sym->attr.flavor != FL_PROCEDURE
+  && sym->attr.flavor != FL_DERIVED)
 {
   bool t = true;
 
@@ -14421,11 +14422,11 @@ resolve_symbol (gfc_symbol *sym)
 		 "module level scope", sym->name, &(sym->declared_at));
 	  t = false;
 	}
-  else if (sym->common_head != NULL)
+  else if (sym->common_head != NULL && sym->attr.implicit_type == 0)
 {
   t = verify_com_block_vars_c_interop (sym->common_head);
 }
-  else
+  else if (sym->attr.implicit_type == 0)
 	{
 	  /* If type() declaration, we need to verify that the components
 	 of the given type are all C interoperable, etc.  */
! { dg-do compile }
! PR 60355 - there was no error message for implicitly typed variables
! Test case contributed by Vladimir Fuka
program main
  bind(c) test_BIND ! { dg-error "cannot be BIND" }
END

Re: PATCH for base_pool_allocator (PR other/81667)

2017-08-02 Thread Richard Biener

On Wed, Aug 2, 2017 at 2:45 PM, Marek Polacek  wrote:
> This PR points out that m_elt_size was left uninitialized in the member
> initializer list for base_pool_allocator.  I think it makes sense to 
> initialize
> it like this.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

Ok.

Richard.

> 2017-08-02  Marek Polacek  
>
> PR other/81667
> * alloc-pool.h (base_pool_allocator): Initialize m_elt_size.
>
> diff --git gcc/alloc-pool.h gcc/alloc-pool.h
> index a5236db3dae..1d04e5d2cfe 100644
> --- gcc/alloc-pool.h
> +++ gcc/alloc-pool.h
> @@ -240,8 +240,9 @@ base_pool_allocator 
> ::base_pool_allocator (
> const char *name, size_t size MEM_STAT_DECL):
>m_name (name), m_id (0), m_elts_per_block (0), m_returned_free_list (NULL),
>m_virgin_free_list (NULL), m_virgin_elts_remaining (0), m_elts_allocated 
> (0),
> -  m_elts_free (0), m_blocks_allocated (0), m_block_list (NULL), m_size 
> (size),
> -  m_initialized (false), m_location (ALLOC_POOL_ORIGIN, false PASS_MEM_STAT) 
> {}
> +  m_elts_free (0), m_blocks_allocated (0), m_block_list (NULL), m_elt_size 
> (0),
> +  m_size (size), m_initialized (false),
> +  m_location (ALLOC_POOL_ORIGIN, false PASS_MEM_STAT) {}
>
>  /* Initialize a pool allocator.  */
>
>
> Marek

Re: [PATCH][RFC] Make expansion of balanced binary trees of switches on tree level.

2017-08-02 Thread Martin Liška

On 08/02/2017 01:51 PM, Richard Biener wrote:
> On Wed, Aug 2, 2017 at 1:20 PM, Martin Liška  wrote:
>> Hello.
>>
>> After some discussions with Honza, I've decided to convert current code in 
>> stmt.c that
>> is responsible for switch expansion. More precisely, I would like to convert 
>> the code
>> to expand gswitch statements on tree level. Currently the newly created pass 
>> is executed
>> at the end of tree optimizations.
>>
>> My plan for future is to inspire in [1] and come up with some more 
>> sophisticated switch
>> expansions. For that I've been working on a paper where I'll summarize 
>> statistics based
>> on what I've collected in openSUSE distribution with specially instrumented 
>> GCC. If I'll be
>> happy I can also fit in to schedule of this year's Cauldron with a talk.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Thoughts?
> 
> First of all thanks.
> 
> I think part of switch expansion moved to switch-conversion some time ago
> (emit_case_bit_tests).  So maybe the full lowering should be in at least
> the same source file and it should maybe applied earlier for a subset of
> cases (very low number of cases for example).

Yep, good idea. I'll take a look.

> 
> Did you base the code on the RTL expansion code or did you re-write it from
> scratch?

It's based, I've just changed the function that create CFG.

Martin

> 
> No further comments sofar.
> 
> Richard.
> 
>> Martin
>>
>> [1] https://www.nextmovesoftware.com/technology/SwitchOptimization.pdf
>>
>> gcc/ChangeLog:
>>
>> 2017-07-31  Martin Liska  
>>
>> * Makefile.in: Add gimple-switch-low.o.
>> * passes.def: Include pass_lower_switch.
>> * stmt.c (dump_case_nodes): Remove and move to
>> gimple-switch-low.c.
>> (case_values_threshold): Likewise.
>> (expand_switch_as_decision_tree_p): Likewise.
>> (emit_case_decision_tree): Likewise.
>> (expand_case): Likewise.
>> (balance_case_nodes): Likewise.
>> (node_has_low_bound): Likewise.
>> (node_has_high_bound): Likewise.
>> (node_is_bounded): Likewise.
>> (emit_case_nodes): Likewise.
>> * timevar.def: Add TV_TREE_SWITCH_LOWERING.
>> * tree-pass.h: Add make_pass_lower_switch.
>> * gimple-switch-low.c: New file.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-07-31  Martin Liska  
>>
>> * gcc.dg/tree-prof/update-loopch.c: Scan patterns in
>> switchlower.
>> * gcc.dg/tree-ssa/vrp104.c: Likewise.
>> ---
>>  gcc/Makefile.in|1 +
>>  gcc/gimple-switch-low.c| 1226 
>> 
>>  gcc/passes.def |1 +
>>  gcc/stmt.c |  861 -
>>  gcc/testsuite/gcc.dg/tree-prof/update-loopch.c |   10 +-
>>  gcc/testsuite/gcc.dg/tree-ssa/vrp104.c |2 +-
>>  gcc/timevar.def|1 +
>>  gcc/tree-pass.h|1 +
>>  8 files changed, 1236 insertions(+), 867 deletions(-)
>>  create mode 100644 gcc/gimple-switch-low.c
>>
>>

Re: [PATCH] Fix PR81354 (rewrite gimple_split_edge)

2017-08-02 Thread Bill Schmidt

On Aug 2, 2017, at 3:09 AM, Richard Biener  wrote:
> 
> On Tue, Aug 1, 2017 at 5:56 PM, Bill Schmidt
>  wrote:
>> On Aug 1, 2017, at 8:50 AM, Bill Schmidt  wrote:
>>> 
>>> On Aug 1, 2017, at 7:44 AM, Bill Schmidt  
>>> wrote:
 
> 
> On Aug 1, 2017, at 3:46 AM, Richard Biener  
> wrote:
> 
> On Mon, Jul 31, 2017 at 4:03 PM, Bill Schmidt
>  wrote:
>> 
>>> On Jul 31, 2017, at 8:19 AM, Bill Schmidt  
>>> wrote:
>>> 
>>> That would certainly be much simpler!  I'll regstrap it and test it on 
>>> the other
>>> occurrence I've found to be certain.
>> 
>> Unfortunately, this fails bootstrap:
>> 
>> /home/wschmidt/gcc/gcc-mainline-test/gcc/calls.c: In function 'rtx_def* 
>> emit_library_call_value_1(int, rtx, rtx, libcall_type, machine_mode, 
>> int, va_list)':
>> /home/wschmidt/gcc/gcc-mainline-test/gcc/calls.c:4362:1: error: 
>> definition in block 214 does not dominate use in block 14
>> emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
>> ^
>> for SSA_NAME: slsr_772 in statement:
>> slsr_749 = PHI <_17(26), slsr_772(14), slsr_334(214)>
>> PHI argument
>> slsr_772
>> for PHI node
>> slsr_749 = PHI <_17(26), slsr_772(14), slsr_334(214)>
>> during GIMPLE pass: slsr
>> /home/wschmidt/gcc/gcc-mainline-test/gcc/calls.c:4362:1: internal 
>> compiler error: verify_ssa failed
>> 0x11567cf3 verify_ssa(bool, bool)
>> /home/wschmidt/gcc/gcc-mainline-test/gcc/tree-ssa.c:1186
>> 0x10fc3fff execute_function_todo
>> /home/wschmidt/gcc/gcc-mainline-test/gcc/passes.c:1997
>> 0x10fc277f do_per_function
>> /home/wschmidt/gcc/gcc-mainline-test/gcc/passes.c:1655
>> 0x10fc42a3 execute_todo
>> /home/wschmidt/gcc/gcc-mainline-test/gcc/passes.c:2044
>> Please submit a full bug report,
>> with preprocessed source if appropriate.
>> Please include the complete backtrace with any bug report.
>> See  for instructions.
>> 
>> Not terribly surprising given how sensitive this stuff is.  I can look 
>> into why
>> this fails, but looks like it can't be quite this simple, sadly.
> 
> Intersting ... a dg-torture.exp run was clean for me (all I
> tested...).  So yes, can you
> check what fails?  Maybe run the testsuite with the stage1 compiler.
 
 Looks like it's the stage1 that doesn't build.  I think the difference is
 that I was building trunk and you were building 6.  I'll try to look into
 it later today after I get through some meetings.
>>> 
>>> Sorry, no, it was stage 2 where the failure occurs.
>> 
>> Ah, "good" news.  I believe the bootstrap failure with this change is another
>> rediscovery of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81503, which
>> explains why it wasn't seen for GCC 6.  I'll work on getting that fixed and
>> then try this again.
> 
> Note I very much would like to "fix" split_edge on trunk, on release branches
> I'd appreciate if a simpler fix inside SLSR works, like using 
> gsi_insert_on_edge
> to avoid splitting edges during the iteration.

Sure, I'll look into it.

Bill
> 
> Richard.
> 
>> Bill
>> 
>>> 
>>> Bill
>>> 
 
 Bill
> 
> Richard.
> 
>> Bill
>> 
>>> 
>>> -- Bill
>>> 
 On Jul 31, 2017, at 4:15 AM, Richard Biener 
  wrote:
 
 On Sun, Jul 30, 2017 at 8:04 PM, Bill Schmidt
  wrote:
> Hi,
> 
> PR81354 identifies a latent bug that can happen in SLSR since the
> conditional candidate support was first added.  SLSR relies on the
> address of a GIMPLE PHI remaining constant during the course of the
> optimization pass, but it needs to split edges.  The use of
> make_single_succ_edge and reinstall_phi_args in gimple_split_edge
> causes GIMPLE PHI statements to be temporarily expanded to add a
> predecessor, and then rebuilt to have the original number of
> predecessors.  The expansion usually, if not always, causes the PHI
> statement to change address.  Thus gimple_split_edge needs to be
> rewritten to perform in-situ replacement of PHI arguments.
> 
> The required pieces of make_single_succ_edge have been extracted into
> two places:  make_replacement_pred_edge, and some fixup code at the
> end of gimple_split_edge.  The division is necessary because the
> destination of the original edge must remember its original
> predecessors for the switch processing in
> gimple_redirect_edge_and_branch_1 to work properly.
> 
> The function gimple_redirect_edge_and_branch was factored into two
> pieces so that most of it can be used by gimple_split_edge without
> calling ssa_redirect_edge, which also interferes with PHIs.  The
> useful bits of ssa_redire

PATCH for base_pool_allocator (PR other/81667)

2017-08-02 Thread Marek Polacek

This PR points out that m_elt_size was left uninitialized in the member
initializer list for base_pool_allocator.  I think it makes sense to initialize
it like this.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-08-02  Marek Polacek  

PR other/81667
* alloc-pool.h (base_pool_allocator): Initialize m_elt_size.

diff --git gcc/alloc-pool.h gcc/alloc-pool.h
index a5236db3dae..1d04e5d2cfe 100644
--- gcc/alloc-pool.h
+++ gcc/alloc-pool.h
@@ -240,8 +240,9 @@ base_pool_allocator ::base_pool_allocator (
const char *name, size_t size MEM_STAT_DECL):
   m_name (name), m_id (0), m_elts_per_block (0), m_returned_free_list (NULL),
   m_virgin_free_list (NULL), m_virgin_elts_remaining (0), m_elts_allocated (0),
-  m_elts_free (0), m_blocks_allocated (0), m_block_list (NULL), m_size (size),
-  m_initialized (false), m_location (ALLOC_POOL_ORIGIN, false PASS_MEM_STAT) {}
+  m_elts_free (0), m_blocks_allocated (0), m_block_list (NULL), m_elt_size (0),
+  m_size (size), m_initialized (false),
+  m_location (ALLOC_POOL_ORIGIN, false PASS_MEM_STAT) {}
 
 /* Initialize a pool allocator.  */
 

Marek

Re: [PATCH] Fix PR bootstrap/81638

2017-08-02 Thread David Edelsohn

To fix the false-positive uninitialized warning, this patch
initializes the "incl" variable.

* xcoff.c (xcoff_process_linenos): Initialize incl to NULL.

Index: xcoff.c
===
--- xcoff.c (revision 250806)
+++ xcoff.c (working copy)
@@ -774,7 +774,7 @@ xcoff_process_linenos (struct backtrace_state *sta
   const b_xcoff_lineno *lineno;
   const unsigned char *lineptr;
   const char *function;
-  struct xcoff_incl *incl;
+  struct xcoff_incl *incl = NULL;
   uintptr_t lnnoptr;
   uintptr_t pc;
   uint32_t lnno;

Re: C PATCH to further improve diagnostic for -Wsign-compare (PR c/81417)

2017-08-02 Thread Marek Polacek

On Tue, Aug 01, 2017 at 04:48:01PM -0400, David Malcolm wrote:
> I'm wondering if the messages could use a slight rewording, to give a
> clue to the user about the reason *why* the expression has changed
> signedness.  The old message "signed and unsigned type in conditional
> expression" gave the clue (but failed to underline the subexpression
> changing sign, and tell what the old/new types were).
> 
> A horribly verbose way to put it would be something like:
> 
> "operand of conditional expression with mixed signedness changes
> signedness from %qT to %qT due to promotion to unsigned to match
> unsignedness of other operand" (ugh)
> 
> (assuming I'm understanding the logic correctly)
> 
> or something like:
> 
> "operand of conditional expression changes signedness from %qT to %qT
> due to unsignedness of other operand"
> 
> or somesuch (am not 100% happy with that either).

Hmm, how about this, then?

"operand of ?: changes signedness from %qT to %qT due to unsignedness of other 
operand"

I couldn't come up with anything more brief yet conveying all the information.
I don't like adding "second"/"third"/... very much; we should offer a good
location already.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-08-02  Marek Polacek  

PR c/81417
* c-array-notation.c (fix_builtin_array_notation_fn): Update calls to
build_conditional_expr. 
* c-parser.c (c_parser_conditional_expression): Create locations for
EXP1 and EXP2 from their source ranges.  Pass the locations down to
build_conditional_expr.
* c-tree.h (build_conditional_expr): Update declaration.
* c-typeck.c (build_conditional_expr): Add location_t parameters.
For -Wsign-compare, also print the types.

* input.c (make_location): New overload.
* input.h (make_location): Declare.

* objc-next-runtime-abi-02.c (build_v2_build_objc_method_call): Update
a call to build_conditional_expr.

* Wsign-compare-1.c: New test.
* gcc.dg/compare1.c: Adjust dg-bogus.
* gcc.dg/compare2.c: Likewise.
* gcc.dg/compare3.c: Likewise.
* gcc.dg/compare7.c: Likewise.
* gcc.dg/compare8.c: Likewise.
* gcc.dg/compare9.c: Likewise.
* gcc.dg/pr11492.c: Likewise.

diff --git gcc/c/c-array-notation.c gcc/c/c-array-notation.c
index e430f5c681b..40f1cfdabb8 100644
--- gcc/c/c-array-notation.c
+++ gcc/c/c-array-notation.c
@@ -355,8 +355,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_NONZERO:
   new_var_init = build_modify_expr
@@ -375,8 +376,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_ZERO:
   new_var_init = build_modify_expr
@@ -394,8 +396,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));   
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_NONZERO:
   new_var_init = build_modify_expr
@@ -413,8 +416,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));   
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (

[PATCH] Fix PR bootstrap/81638

2017-08-02 Thread David Edelsohn

To fix the false-positive uninitialized warning, this patch
initializes the "incl" variable.

[nvptx, committed] Add missing probabilities in nvptx_lock{less,full}_update

2017-08-02 Thread Tom de Vries


[ was: Re: [PATCH] update edge profile info in nvptx.c ]

On 07/13/2017 06:53 PM, Cesar Philippidis wrote:

The recent basic block profiling changes broke a couple of libgomp
OpenACC execution tests involving reductions with nvptx offloading. For
gang and worker reductions, the nvptx BE updates the original reduction
variable using a lock-free atomic algorithm. This lock-free algorithm
utilizes a polling loop to check the state of the variable being
updated. This loop introduced a new basic block edge, but it wasn't
assigned a branch probability.


Right, and the same problem occurs in nvptx_lockfull_update.


Because of the highly threaded nature of
CUDA accelerators, I set the branch probability for that edge as even.


I don't have any experimental data to make a good estimation, so 50/50 
sounds fine to me.


As before, the setting of the probability of the non-loop edge was missing.

I've added that, and committed.

Thanks,
- Tom
Add missing probabilities in nvptx_lock{less,full}_update

2017-08-02  Tom de Vries  
	Cesar Philippidis  

	* config/nvptx/nvptx.c (nvptx_lockless_update, nvptx_lockfull_update):
	Add missing edge probabilities.

---
 gcc/config/nvptx/nvptx.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 208b115..0d21eb1 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5072,7 +5072,9 @@ nvptx_lockless_update (location_t loc, gimple_stmt_iterator *gsi,
   *gsi = gsi_for_stmt (gsi_stmt (*gsi));
 
   post_edge->flags ^= EDGE_TRUE_VALUE | EDGE_FALLTHRU;
+  post_edge->probability = profile_probability::even ();
   edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_FALSE_VALUE);
+  loop_edge->probability = profile_probability::even ();
   set_immediate_dominator (CDI_DOMINATORS, loop_bb, pre_bb);
   set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb);
 
@@ -5145,7 +5147,9 @@ nvptx_lockfull_update (location_t loc, gimple_stmt_iterator *gsi,
   
   /* Create the lock loop ... */
   locked_edge->flags ^= EDGE_TRUE_VALUE | EDGE_FALLTHRU;
-  make_edge (lock_bb, lock_bb, EDGE_FALSE_VALUE);
+  locked_edge->probability = profile_probability::even ();
+  edge loop_edge = make_edge (lock_bb, lock_bb, EDGE_FALSE_VALUE);
+  loop_edge->probability = profile_probability::even ();
   set_immediate_dominator (CDI_DOMINATORS, lock_bb, entry_bb);
   set_immediate_dominator (CDI_DOMINATORS, update_bb, lock_bb);

Re: c-family PATCH to improve and simplify -Wmultistatement-macros (PR c/81448, c/81306)

2017-08-02 Thread Marek Polacek

On Wed, Aug 02, 2017 at 11:37:27AM +, Joseph Myers wrote:
> On Thu, 27 Jul 2017, Marek Polacek wrote:
> 
> > Bootstrapped/regtested on x86_64-linux and ppc64le-linux, ok for trunk?
> > 
> > 2017-07-27  Marek Polacek  
> > 
> > PR c/81448
> > PR c/81306
> > * c-warn.c (warn_for_multistatement_macros): Prevent bogus
> > warnings.  Avoid walking MACRO_MAP_LOCATIONS.   
> >  
> > 
> > * c-c++-common/Wmultistatement-macros-13.c: New test.
> 
> OK.  Steve, as I noted in 
> , please see 
> if this means you no longer need to use -Wno-multistatement-macros for 
> certain glibc tests (if it's still needed, that suggests another related 
> problem may still be present).

I've committed the patch (r250822), so I hope all those bogus warnings will
disappear.

Marek

Re: [PATCH][RFC] Make expansion of balanced binary trees of switches on tree level.

2017-08-02 Thread Richard Biener

On Wed, Aug 2, 2017 at 1:20 PM, Martin Liška  wrote:
> Hello.
>
> After some discussions with Honza, I've decided to convert current code in 
> stmt.c that
> is responsible for switch expansion. More precisely, I would like to convert 
> the code
> to expand gswitch statements on tree level. Currently the newly created pass 
> is executed
> at the end of tree optimizations.
>
> My plan for future is to inspire in [1] and come up with some more 
> sophisticated switch
> expansions. For that I've been working on a paper where I'll summarize 
> statistics based
> on what I've collected in openSUSE distribution with specially instrumented 
> GCC. If I'll be
> happy I can also fit in to schedule of this year's Cauldron with a talk.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Thoughts?

First of all thanks.

I think part of switch expansion moved to switch-conversion some time ago
(emit_case_bit_tests).  So maybe the full lowering should be in at least
the same source file and it should maybe applied earlier for a subset of
cases (very low number of cases for example).

Did you base the code on the RTL expansion code or did you re-write it from
scratch?

No further comments sofar.

Richard.

> Martin
>
> [1] https://www.nextmovesoftware.com/technology/SwitchOptimization.pdf
>
> gcc/ChangeLog:
>
> 2017-07-31  Martin Liska  
>
> * Makefile.in: Add gimple-switch-low.o.
> * passes.def: Include pass_lower_switch.
> * stmt.c (dump_case_nodes): Remove and move to
> gimple-switch-low.c.
> (case_values_threshold): Likewise.
> (expand_switch_as_decision_tree_p): Likewise.
> (emit_case_decision_tree): Likewise.
> (expand_case): Likewise.
> (balance_case_nodes): Likewise.
> (node_has_low_bound): Likewise.
> (node_has_high_bound): Likewise.
> (node_is_bounded): Likewise.
> (emit_case_nodes): Likewise.
> * timevar.def: Add TV_TREE_SWITCH_LOWERING.
> * tree-pass.h: Add make_pass_lower_switch.
> * gimple-switch-low.c: New file.
>
> gcc/testsuite/ChangeLog:
>
> 2017-07-31  Martin Liska  
>
> * gcc.dg/tree-prof/update-loopch.c: Scan patterns in
> switchlower.
> * gcc.dg/tree-ssa/vrp104.c: Likewise.
> ---
>  gcc/Makefile.in|1 +
>  gcc/gimple-switch-low.c| 1226 
> 
>  gcc/passes.def |1 +
>  gcc/stmt.c |  861 -
>  gcc/testsuite/gcc.dg/tree-prof/update-loopch.c |   10 +-
>  gcc/testsuite/gcc.dg/tree-ssa/vrp104.c |2 +-
>  gcc/timevar.def|1 +
>  gcc/tree-pass.h|1 +
>  8 files changed, 1236 insertions(+), 867 deletions(-)
>  create mode 100644 gcc/gimple-switch-low.c
>
>

Re: C PATCH to fix a crash on invalid (PR c/81289)

2017-08-02 Thread Joseph Myers

On Wed, 2 Aug 2017, Marek Polacek wrote:

> 2017-08-02  Marek Polacek  
> 
>   PR c/81289
>   * c-parser.c (c_parser_unary_expression): Use set_error.
> 
>   * gcc.dg/noncompile/pr81289.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: c-family PATCH to improve and simplify -Wmultistatement-macros (PR c/81448, c/81306)

2017-08-02 Thread Joseph Myers

On Thu, 27 Jul 2017, Marek Polacek wrote:

> Bootstrapped/regtested on x86_64-linux and ppc64le-linux, ok for trunk?
> 
> 2017-07-27  Marek Polacek  
> 
>   PR c/81448
>   PR c/81306
>   * c-warn.c (warn_for_multistatement_macros): Prevent bogus
>   warnings.  Avoid walking MACRO_MAP_LOCATIONS.   
>  
> 
>   * c-c++-common/Wmultistatement-macros-13.c: New test.

OK.  Steve, as I noted in 
, please see 
if this means you no longer need to use -Wno-multistatement-macros for 
certain glibc tests (if it's still needed, that suggests another related 
problem may still be present).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Add -nolibc option

2017-08-02 Thread Joseph Myers

On Thu, 27 Jul 2017, Tristan Gingold wrote:

> Index: gcc/common.opt
> ===
> --- gcc/common.opt(revision 250563)
> +++ gcc/common.opt(working copy)
> @@ -2956,6 +2956,10 @@
>  nostdlib
>  Driver
> 
> +nolibc
> +Driver
> +Do not link with libc

Help texts should end with ".".  This may not cause a test failure in this 
particular case because help texts for Driver options aren't actually used 
(and the test for help text formatting checks the --help output, not the 
.opt files); you actually need to update gcc.c:display_help to get a help 
text in the driver's --help output.

Since none of the related options such as -nostdlib are mentioned in that 
--help output, either removing the help text or adding "." would seem 
appropriate.  Or add all those options to --help output separately.

> Index: gcc/config/arm/unknown-elf.h
> ===
> --- gcc/config/arm/unknown-elf.h  (revision 250563)
> +++ gcc/config/arm/unknown-elf.h  (working copy)
> @@ -91,6 +91,6 @@
>  /* The libgcc udivmod functions may throw exceptions.  If newlib is
> configured to support long longs in I/O, then printf will depend on
> udivmoddi4, which will depend on the exception unwind routines,
> -   which will depend on abort, which is defined in libc.  */
> +   which will depend on abort, which is defined in libc.  */
>  #undef LINK_GCC_C_SEQUENCE_SPEC
> -#define LINK_GCC_C_SEQUENCE_SPEC "--start-group %G %L --end-group"
> +#define LINK_GCC_C_SEQUENCE_SPEC "--start-group %G %{!nolibc:%L} --end-group"

There are lots of other LINK_GCC_C_SEQUENCE_SPEC definitions for 
particular targets I'd expect to be updated.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH v2][RFC] Canonize names of attributes.

2017-08-02 Thread Joseph Myers

On Thu, 13 Jul 2017, Martin Liška wrote:

> +/* For a given IDENTIFIER_NODE, strip leading and trailing '_' characters
> +   so that we have a canonical form of attribute names.  */
> +
> +static inline tree
> +canonicalize_attr_name (tree attr_name)
> +{
> +  const size_t l = IDENTIFIER_LENGTH (attr_name);
> +  const char *s = IDENTIFIER_POINTER (attr_name);
> +
> +  if (l > 4 && s[0] == '_')
> +{
> +  gcc_checking_assert (s[l - 2] == '_');
> +  return get_identifier_with_length (s + 2, l - 4);
> +}
> +
> +  return attr_name;

For this to (a) be correct, (b) not trigger the assertion, there must be a 
precondition that attr_name either starts and ends with __, or does not 
start with _, or has length 4 or less.  I don't see anything in the 
callers to ensure this precondition holds, so that, for example, 
__attribute__ ((_foobar)) does not trigger the assertion, and 
__attribute__ ((_xformat__)) is not wrongly interpreted as a format 
attribute (and similarly for attribute arguments when those are 
canonicalized).

> +/* Compare attribute identifiers ATTR1 and ATTR2 with length ATTR1_LEN and
> +   ATTR2_LEN.  */
> +
> +static inline bool
> +cmp_attribs (const char *attr1, size_t attr1_len,
> +  const char *attr2, size_t attr2_len)
> +{
> +  gcc_checking_assert (attr1_len == 0 || attr1[0] != '_');
> +  gcc_checking_assert (attr2_len == 0 || attr2[0] != '_');

Of course, once you only canonicalize attributes that both start and end 
with __, you have the possibility of a canonicalized attribute name that 
still starts with _, so can't assert on it here (unless this is only ever 
called with an attribute that is already known to be valid).  (And of 
course there are cases such as __attribute__((__)) that starts and ends 
with __ but is too short to canonicalize, etc., which arise even with the 
above canonicalize_attr_name.)

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH][RFC] Make expansion of balanced binary trees of switches on tree level.

2017-08-02 Thread Martin Liška

Hello.

After some discussions with Honza, I've decided to convert current code in 
stmt.c that
is responsible for switch expansion. More precisely, I would like to convert 
the code
to expand gswitch statements on tree level. Currently the newly created pass is 
executed
at the end of tree optimizations.

My plan for future is to inspire in [1] and come up with some more 
sophisticated switch
expansions. For that I've been working on a paper where I'll summarize 
statistics based
on what I've collected in openSUSE distribution with specially instrumented 
GCC. If I'll be
happy I can also fit in to schedule of this year's Cauldron with a talk.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Thoughts?
Martin

[1] https://www.nextmovesoftware.com/technology/SwitchOptimization.pdf

gcc/ChangeLog:

2017-07-31  Martin Liska  

* Makefile.in: Add gimple-switch-low.o.
* passes.def: Include pass_lower_switch.
* stmt.c (dump_case_nodes): Remove and move to
gimple-switch-low.c.
(case_values_threshold): Likewise.
(expand_switch_as_decision_tree_p): Likewise.
(emit_case_decision_tree): Likewise.
(expand_case): Likewise.
(balance_case_nodes): Likewise.
(node_has_low_bound): Likewise.
(node_has_high_bound): Likewise.
(node_is_bounded): Likewise.
(emit_case_nodes): Likewise.
* timevar.def: Add TV_TREE_SWITCH_LOWERING.
* tree-pass.h: Add make_pass_lower_switch.
* gimple-switch-low.c: New file.

gcc/testsuite/ChangeLog:

2017-07-31  Martin Liska  

* gcc.dg/tree-prof/update-loopch.c: Scan patterns in
switchlower.
* gcc.dg/tree-ssa/vrp104.c: Likewise.
---
 gcc/Makefile.in|1 +
 gcc/gimple-switch-low.c| 1226 
 gcc/passes.def |1 +
 gcc/stmt.c |  861 -
 gcc/testsuite/gcc.dg/tree-prof/update-loopch.c |   10 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp104.c |2 +-
 gcc/timevar.def|1 +
 gcc/tree-pass.h|1 +
 8 files changed, 1236 insertions(+), 867 deletions(-)
 create mode 100644 gcc/gimple-switch-low.c


diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index efca9169671..4a84f36c6b9 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1309,6 +1309,7 @@ OBJS = \
 	gimple-ssa-warn-alloca.o \
 	gimple-streamer-in.o \
 	gimple-streamer-out.o \
+	gimple-switch-low.o \
 	gimple-walk.o \
 	gimplify.o \
 	gimplify-me.o \
diff --git a/gcc/gimple-switch-low.c b/gcc/gimple-switch-low.c
new file mode 100644
index 000..a37c49b8494
--- /dev/null
+++ b/gcc/gimple-switch-low.c
@@ -0,0 +1,1226 @@
+/* Lower GIMPLE_SWITCH expressions to something more efficient than
+   a jump table.
+   Copyright (C) 2006-2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not, write to the Free
+Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301, USA.  */
+
+/* This file handles the lowering of GIMPLE_SWITCH to an indexed
+   load, or a series of bit-test-and-branch expressions.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "insn-codes.h"
+#include "rtl.h"
+#include "tree.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "tree-pass.h"
+#include "ssa.h"
+#include "optabs-tree.h"
+#include "cgraph.h"
+#include "gimple-pretty-print.h"
+#include "params.h"
+#include "fold-const.h"
+#include "varasm.h"
+#include "stor-layout.h"
+#include "cfganal.h"
+#include "gimplify.h"
+#include "gimple-iterator.h"
+#include "gimplify-me.h"
+#include "tree-cfg.h"
+#include "cfgloop.h"
+#include "target.h"
+#include "alloc-pool.h"
+#include "tree-into-ssa.h"
+
+struct case_node
+{
+  case_node		*left;	/* Left son in binary tree.  */
+  case_node		*right;	/* Right son in binary tree;
+   also node chain.  */
+  case_node		*parent; /* Parent of node in binary tree.  */
+  tree			low;	/* Lowest index value for this label.  */
+  tree			high;	/* Highest index value for this label.  */
+  basic_block		case_bb; /* Label to jump to when node matches.  */
+  tree			case_label; /* Label to jump to when node matches.  */
+  profile_probability   prob; /* Probability of taking this case.  */
+  profile_p

Re: [PATCH] Add macro DISABLE_COPY_AND_ASSIGN

2017-08-02 Thread Yao Qi

On Wed, Jul 26, 2017 at 9:55 AM, Yao Qi  wrote:
> On 17-07-19 10:30:45, Yao Qi wrote:
>> We have many classes that copy cotr and assignment operator are deleted
>> in different projects, gcc, gdb and gold.  So this patch adds a macro
>> to do this, and replace these existing mechanical code with macro
>> DISABLE_COPY_AND_ASSIGN.
>>
>> The patch was posted in gdb-patches,
>> https://sourceware.org/ml/gdb-patches/2017-07/msg00254.html but we
>> think it is better to put this macro in include/ansidecl.h so that
>> other projects can use it too.
>>
>> Boostrapped on x86_64-linux-gnu.  Is it OK?
>>
>> include:
>>
>> 2017-07-19  Yao Qi  
>>   Pedro Alves  
>>
>>   * ansidecl.h (DISABLE_COPY_AND_ASSIGN): New macro.
>>
>> gcc/cp:
>>
>> 2017-07-19  Yao Qi  
>>
>>   * cp-tree.h (class ovl_iterator): Use DISABLE_COPY_AND_ASSIGN.
>>   * name-lookup.c (struct name_loopup): Likewise.
>>
>> gcc:
>>
>> 2017-07-19  Yao Qi  
>>
>>   * tree-ssa-scopedtables.h (class avail_exprs_stack): Use
>>   DISABLE_COPY_AND_ASSIGN.
>>   (class const_and_copies): Likewise.
>
> Ping.
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01134.html
>

Ping.  It is a quite straightforward patch, can any one
take a look?

-- 
Yao (齐尧)

GCC 7.2 Status Report (2017-08-02)

2017-08-02 Thread Richard Biener


Status
==

The GCC 7 branch is now frozen for the upcoming release candidate
and release.  All changes require release manager approval.

Re: [PATCH][PR 59521] Respect probabilities when expanding switch statement

2017-08-02 Thread Martin Liška

On 08/02/2017 12:52 PM, Yury Gribov wrote:
> On Wed, Aug 2, 2017 at 11:42 AM, Martin Liška  > wrote:
> 
> On 08/02/2017 11:53 AM, Jan Hubicka wrote:
> > Hello,
> > sorry for not responding for a while.  Martin Liska has patch to move 
> switch
> > expansion to gimple level that will likely simplify the code 
> combinatoin.
> 
> Hello.
> 
> Yep, will land today to gcc-patches mailing list.
> 
> >
> >>
> >> combine_predictions_for_bb calculates final probability for edges of
> >> if-else or switch statements.
> >>
> >> For if-elses this is done by combining values computed by different
> >> predictors using Dempster-Shafer theory.  For switch statement DS is
> >> not used, mainly because we do not have heuristics for predicting
> >> which case will be taken (paper by Larus concluded that using if-else
> >> heuristics does not give good results).
> >>
> >> So until this patch we just used set_even_probabilities. The name of
> >> this function is misleading, in addition to setting even probabilities
> >> it can also understand that some edges are very unlikely and set
> >> unlikely probs for those.  With patch it now also understands that one
> >> edge is very likely.
> >
> > I am not sure that the conclusion of Ball&Larus paper applies to us 
> here.
> > In addition to usual if-then-else heuristics we have those based on walk
> > of CFG (such as ones predicting paths to unlikely calls) and those 
> should
> > work well on switch statements.
> >
> > We discussed adding predictor combining code for BBs with more than 2
> > successors. Martin, do you have some code for that?
> 
> This has been discussed and we decided to reject that as we're unable to
> apply DS theory as we can't evaluate what probability has a predictor for
> edges different from the edge which it can evaluate. Note that with 2 
> edges
> and probability x, one can calculate probability of the second edge
> simply by 1 - x. That's not doable if one has > 2 edges.
> 
> 
> Did you consider splitting 1 - x equally among alternatives?

That's quite obvious simplification. I'll take a look one more time what was 
problematic
there.

Thanks,
Martin

>  
> 
> That was reason
> why I decided to use DF theory for such situations and wrote just simple
> handling of very {un,}likely probabilities.
> 
> Maybe I overlooked something in understanding of DF theory?
> 
> Martin
> 
> >
> > I guess teaching even propbabilities about likely edges also works, but
> > perhaps doing more general prediction combining would be cleaner...
> >
> > Honza
> >
> 
>

Re: [PATCH][PR 59521] Respect probabilities when expanding switch statement

2017-08-02 Thread Martin Liška

On 08/02/2017 11:53 AM, Jan Hubicka wrote:
> Hello,
> sorry for not responding for a while.  Martin Liska has patch to move switch
> expansion to gimple level that will likely simplify the code combinatoin.

Hello.

Yep, will land today to gcc-patches mailing list.

> 
>>
>> combine_predictions_for_bb calculates final probability for edges of
>> if-else or switch statements.
>>
>> For if-elses this is done by combining values computed by different
>> predictors using Dempster-Shafer theory.  For switch statement DS is
>> not used, mainly because we do not have heuristics for predicting
>> which case will be taken (paper by Larus concluded that using if-else
>> heuristics does not give good results).
>>
>> So until this patch we just used set_even_probabilities. The name of
>> this function is misleading, in addition to setting even probabilities
>> it can also understand that some edges are very unlikely and set
>> unlikely probs for those.  With patch it now also understands that one
>> edge is very likely.
> 
> I am not sure that the conclusion of Ball&Larus paper applies to us here.
> In addition to usual if-then-else heuristics we have those based on walk
> of CFG (such as ones predicting paths to unlikely calls) and those should
> work well on switch statements. 
> 
> We discussed adding predictor combining code for BBs with more than 2
> successors. Martin, do you have some code for that?

This has been discussed and we decided to reject that as we're unable to
apply DS theory as we can't evaluate what probability has a predictor for
edges different from the edge which it can evaluate. Note that with 2 edges
and probability x, one can calculate probability of the second edge
simply by 1 - x. That's not doable if one has > 2 edges. That was reason
why I decided to use DF theory for such situations and wrote just simple
handling of very {un,}likely probabilities.

Maybe I overlooked something in understanding of DF theory?

Martin 

> 
> I guess teaching even propbabilities about likely edges also works, but
> perhaps doing more general prediction combining would be cleaner...
> 
> Honza
>

Re: [PATCH] Make mempcpy more optimal (PR middle-end/70140).

2017-08-02 Thread Bin.Cheng

On Wed, Aug 2, 2017 at 10:54 AM, Martin Liška  wrote:
> On 08/02/2017 11:45 AM, Bin.Cheng wrote:
>> Hi Martin,
>> With r250771, GCC failed to build glibc for arm/aarch64 linux cross 
>> toolchain:
>
> Hi.
>
> Sorry for the breakage, I accidentally installed wrong version of patch.
> Should be fixed in r250789.
Thanks!

Thanks,
bin
>
> M.

Re: [PATCH][2/2] early LTO debug, main part

2017-08-02 Thread Richard Biener

On Wed, 2 Aug 2017, Jason Merrill wrote:

> On 05/19/2017 06:42 AM, Richard Biener wrote:
> > + /* ???  In some cases the C++ FE (at least) fails to
> > +set DECL_CONTEXT properly.  Simply globalize stuff
> > +in this case.  For example
> > +__dso_handle created via iostream line 74 col 25.  */
> > + parent = comp_unit_die ();
> 
> I've now fixed __dso_handle, so that can be removed from the comment, but it
> still makes sense to have this fall-back for the (permitted) case of null
> DECL_CONTEXT.

Adjusted the comment.

> > +   /* ???  LANG issue - DW_TAG_module for fortran.  Either look
> > +at the input language (if we have enough DECL_CONTEXT to follow)
> > +or use a bit in tree_decl_with_vis to record the distinction.  */
> 
> Sure, you should be able to look at TRANSLATION_UNIT_LANGUAGE.

Yeah, the comment says we might be able to walk DECL_CONTEXT up to
a TRANSLATION_UNIT_DECL.  I've amended is_fortran similar to as I
amended is_cxx, providing an overload for a decl, factoring out common
code.  So this is now if (is_fortran (decl)) ... = new_die 
(DW_TAG_module,...).

> > ! /* ???  We cannot unconditionally output die_offset if
> > !non-zero - at least -feliminate-dwarf2-dups will
> > !create references to those DIEs via symbols.  And we
> > !do not clear its DIE offset after outputting it
> > !(and the label refers to the actual DIEs, not the
> > !DWARF CU unit header which is when using label + offset
> > !would be the correct thing to do).
> 
> As in our previous discussion, I think -feliminate-dwarf2-dups can go away
> now.  But this is a more general issue: die_offset has meant the offset from
> the beginning of the CU, but if with_offset is set it means an offset from
> die_symbol.  Since with_offset changes the meaning of die_symbol and
> die_offset, having different code here depending on that flag makes sense.
> 
> It seems likely that when -fEDD goes away, we will never again want to do
> direct symbolic references to DIEs, in which case we could drop the current
> meaning of die_symbol, and so we wouldn't need the with_offset flag.

Yeah, I've been playing with a patch to remove -fEDD but it has conflicts
with the early LTO debug work and thus I wanted to postpone it until
after that goes in to avoid further churn.  I hope that's fine, it's
sth I'll tackle soon after this patch lands on trunk.

> > !   unit_die->comdat_type_p = comdat_p;
> > ! }
> > ! ! static void
> > ! compute_section_prefix (dw_die_ref unit_die)
> > ! {
> > !   compute_section_prefix_1 (unit_die, true);
> > !   comdat_symbol_id = unit_die->die_id.die_symbol;
> > comdat_symbol_number = 0;
> >   }
> 
> Let's set the comdat_type_p flag in this function rather than add a parameter
> to the existing function.  And when -fEDD goes away, we don't need this entry
> point at all.
> 
> Also, for LTO debug, it seems you aren't actually using the symbol as a
> section prefix, so the name becomes inaccurate.  Maybe
> compute_comp_unit_symbol rather than compute_section_prefix_1?

Done.

> > +   /* For LTO cross unit DIE refs we want a symbol on the start of the
> > +  debuginfo section, not on the CU DIE.
> > +  ???  We could simply use the symbol as it would be output by
> > output_die
> > +  and account for the extra offset produced by the CU header which has
> > fixed
> > +  size.  OTOH it currently only supports linkonce globals which would
> > +  be less than ideal?.  */
> 
> I think the way you're doing it now is better than this alternative, since
> die_offset is relative to the beginning of the CU header.

Removed the ??? part of the comment.

> > !   /* Don't output the symbol twice.  For LTO we want the label
> > !  on the section beginning, not on the actual DIE.  */
> > !   && (!flag_generate_lto
> > ! || die->die_tag != DW_TAG_compile_unit))
> 
> I think this check should just be !with_offset; if that flag is set the DIE
> doesn't actually have its own symbol.

with_offset is set only during LTRANS phase for the DIEs refering to
the early DIEs via the CU label.  But the above is guarding the
early phase when we do not want to output that CU label twice.

Can we revisit this check when -fEDD has gone away?

> > !   if (old_die
> > ! && (c = get_AT_ref (old_die, DW_AT_abstract_origin))
> > ! /* ???  In LTO all origin DIEs still refer to the early
> > !debug copy.  Detect that.  */
> > ! && get_AT (c, DW_AT_inline))
> ...
> > !   /* "Unwrap" the decls DIE which we put in the imported unit context.
> > !   ???  If we finish dwarf2out_function_decl refactoring we can
> > ! do this in a better way from the start and only lazily emit
> > ! the early DIE references.  */
> 
> It seems like in gen_subprogram_die you deliberately avoid reusing the DIE
> from dwarf2out_register_external_die

Re: [PATCH] toplev: avoid recursive emergency_dump_function

2017-08-02 Thread Alexander Monakov

Hello,

On Thu, 20 Jul 2017, Alexander Monakov wrote:
> Segher pointed out on IRC that ICE reporting with dumps enabled got worse:
> if emergency_dump_function itself leads to an ICE (e.g. by segfaulting),
> nested ICE reporting will invoke emergency_dump_function in exactly the
> same context, but not only would we uselessly announce current pass again,
> this time around the second SIGSEGV will just terminate cc1 because the
> signal handler is unregistered (so we won't print the backtrace).  Sorry
> for not really considering the implications when submitting that patch.
> 
> Solve this by substituting the callback for global_dc->internal_error;
> this avoids invoking emergency_dump_function, and also gives a
> convenient point to flush the dump file.  OK to apply?
> 
>   * topvel.c (dumpfile.h): New include.
>   (internal_error_reentered): New static function.  Use it...
>   (internal_error_function): ...here to handle reentered internal_error.

David, could you review this please?  Segher indicated in the other
subthread that he's happy with this solution.

Thanks.
Alexander

> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index e6c69a4..67254fb 100644
> --- a/gcc/toplev.c
> +++ b/gcc/toplev.c
> @@ -80,6 +80,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "hsa-common.h"
>  #include "edit-context.h"
>  #include "tree-pass.h"
> +#include "dumpfile.h"
> 
>  #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
>  #include "dbxout.h"
> @@ -1064,11 +1065,22 @@ open_auxiliary_file (const char *ext)
>return file;
>  }
> 
> +/* Alternative diagnostics callback for reentered ICE reporting.  */
> +
> +static void
> +internal_error_reentered (diagnostic_context *, const char *, va_list *)
> +{
> +  /* Flush the dump file if emergency_dump_function itself caused an ICE.  */
> +  if (dump_file)
> +fflush (dump_file);
> +}
> +
>  /* Auxiliary callback for the diagnostics code.  */
> 
>  static void
>  internal_error_function (diagnostic_context *, const char *, va_list *)
>  {
> +  global_dc->internal_error = internal_error_reentered;
>warn_if_plugins ();
>emergency_dump_function ();
>  }

Re: [PATCH] Tweak tree-ssa/pr81588.c testcase (PR tree-optimization/81655)

2017-08-02 Thread Richard Biener

On Wed, Aug 2, 2017 at 12:11 PM, Jakub Jelinek  wrote:
> Hi!
>
> I have an improvement that handles tree-ssa/pr81588.c optimization
> even on branch cost 1, but as it is effectively an extension to the
> range var_bound optimization, I'm not sure it should be backported.
>
> Therefore, this patch treats this testcase similarly to other
> testcases that rely on branch cost of 2 (so that there is BIT_IOR_EXPR
> and not 2 separate jumps).
>
> Tested on x86_64-linux, ok for 7.2?

Ok.

Richard.

> 2017-08-02  Jakub Jelinek  
>
> PR tree-optimization/81655
> PR tree-optimization/81588
> * gcc.dg/tree-ssa/pr81588.c: Use -mbranch-cost=2 where possible,
> don't run the test on branch-cost=1 targets.
>
> --- gcc/testsuite/gcc.dg/tree-ssa/pr81588.c.jj  2017-08-01 10:28:50.0 
> +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr81588.c 2017-08-02 11:02:30.966184545 
> +0200
> @@ -1,7 +1,8 @@
>  /* PR tree-optimization/81588 */
> -/* { dg-do compile } */
> +/* { dg-do compile { target { ! "m68k*-*-* mmix*-*-* bfin*-*-* v850*-*-* 
> moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* xtensa*-*-* 
> hppa*-*-* nios2*-*-*" } } } */
>  /* { dg-options "-O2 -fdump-tree-reassoc1-details" } */
> -
> +/* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* 
> s390*-*-* i?86-*-* x86_64-*-* } } */
> +
>  extern long long int a, c;
>  extern unsigned short b;
>
>
> Jakub

[PATCH] Tweak tree-ssa/pr81588.c testcase (PR tree-optimization/81655)

2017-08-02 Thread Jakub Jelinek

Hi!

I have an improvement that handles tree-ssa/pr81588.c optimization
even on branch cost 1, but as it is effectively an extension to the
range var_bound optimization, I'm not sure it should be backported.

Therefore, this patch treats this testcase similarly to other
testcases that rely on branch cost of 2 (so that there is BIT_IOR_EXPR
and not 2 separate jumps).

Tested on x86_64-linux, ok for 7.2?

2017-08-02  Jakub Jelinek  

PR tree-optimization/81655
PR tree-optimization/81588
* gcc.dg/tree-ssa/pr81588.c: Use -mbranch-cost=2 where possible,
don't run the test on branch-cost=1 targets.

--- gcc/testsuite/gcc.dg/tree-ssa/pr81588.c.jj  2017-08-01 10:28:50.0 
+0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr81588.c 2017-08-02 11:02:30.966184545 
+0200
@@ -1,7 +1,8 @@
 /* PR tree-optimization/81588 */
-/* { dg-do compile } */
+/* { dg-do compile { target { ! "m68k*-*-* mmix*-*-* bfin*-*-* v850*-*-* 
moxie*-*-* cris*-*-* m32c*-*-* fr30*-*-* mcore*-*-* powerpc*-*-* xtensa*-*-* 
hppa*-*-* nios2*-*-*" } } } */
 /* { dg-options "-O2 -fdump-tree-reassoc1-details" } */
-   
+/* { dg-additional-options "-mbranch-cost=2" { target mips*-*-* avr-*-* 
s390*-*-* i?86-*-* x86_64-*-* } } */
+
 extern long long int a, c;
 extern unsigned short b;
 

Jakub

1 2 >

1 - 100 of 113 matches

Mail list logo