Re: [PATCH] fix cygwin builds

2017-10-30 Thread Andreas Schwab
This broke ia64:

In file included from ./tm_p.h:4:0,
 from ../../gcc/gimplify.c:30:
../../gcc/config/ia64/ia64-protos.h:49:13: error: use of enum 'memmodel' 
without previous declaration
enum memmodel);
 ^

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH] Implement omp async support for nvptx

2017-10-30 Thread Jakub Jelinek
On Fri, Oct 27, 2017 at 03:57:28PM +0200, Tom de Vries wrote:
> how about this approach:
> 1 - Move async_run from plugin-hsa.c to default_async_run
> 2 - Implement omp async support for nvptx
> ?
> 
> The first patch moves the GOMP_OFFLOAD_async_run implementation from
> plugin-hsa.c to target.c, making it the default implementation if the plugin
> does not define the GOMP_OFFLOAD_async_run symbol.
> 
> The second patch removes the GOMP_OFFLOAD_async_run symbol from the nvptx
> plugin, activating the default implementation, and makes sure
> GOMP_OFFLOAD_run can be called from a fresh thread.
> 
> I've tested this with libgomp.c/c.exp and the previously failing target-33.c
> and target-34.c are now passing, and there are no regressions.
> 
> OK for trunk after complete testing (and adding function comment for
> default_async_run)?

Can't PTX do better than this?  What I mean is that while we probably need
to take the device lock for the possible memory transfers and deallocation
at the end of the region and thus perform some action on the host in between
the end of the async target region and data copying/deallocation, can't we
have a single thread per device instead of one thread per async target
region, use CUDA async APIs and poll for all the pending async regions
together?  I mean, if we need to take the device lock, then we need to
serialize the finalization anyway and reusing the same thread would
significantly decrease the overhead if there are many async regions.

And, if it at least in theory can do better than that, then even if we
punt on that for now due to time/resource constraints, maybe it would be
better to do this inside of plugin where it can be more easily replaced
later.

Jakub


Re: [PATCH v3] Add asan and ubsan support on NetBSD/amd64

2017-10-30 Thread Jakub Jelinek
On Thu, Oct 26, 2017 at 09:50:43PM +0200, Kamil Rytarowski wrote:
> $ make check-asan
> $ make check-asan-dynamic
> $ make check-ubsan

That is testing of the upstream code, not of GCC and the libsanitizer
copy in GCC.  What I'm more interested to hear is whether
you've bootstrapped/regtested the gcc tree with this patch on
x86_64-*-netbsd*, as per https://gcc.gnu.org/contribute.html
I.e. /configure ...; make -jN bootstrap; make -jN -k check; 
/contrib/test_summary
and from there if there are any */asan/* or */ubsan/* FAILs.

> 2017-10-26  Kamil Rytarowski  
> 
>   * sanitizer_common/Makefile.am (sanitizer_common_files): Add
>   sanitizer_platform_limits_netbsd.cc.
>   * sanitizer_common/Makefile.in: Regenerated.
>   * configure.tgt: Enable asan and ubsan on x86_64-*-netbsd*.

Jakub


Re: [09/nn] Add a fixed_size_mode_pod class

2017-10-30 Thread Richard Sandiford
Trevor Saunders  writes:
> On Thu, Oct 26, 2017 at 09:37:31PM +0200, Eric Botcazou wrote:
>> > Can you figure what oldest GCC release supports the C++11/14 POD handling
>> > that would be required?
>> 
>> GCC needs to be buildable by other compilers than itself though.
>
> It sounds like people are mostly concerned about sun studio and xlc? It
> doesn't seem that hard to provide precompiled binaries for those two
> platforms, and maybe 4.8 binaries for people who want to compile theire
> own gcc from source.  If that would be enough to deal with people
> concerns it seems doable by next stage 1?

Would it be worth supporting a 4-stage bootstrap, with stage 0 being
built from older gcc sources?  We could include a contrib/ script that
downloads sources for gcc-4.7 or whatever and patches it to build with
modern as well as old compilers.  (When I tried gcc-4.7 last week,
I needed a couple of tweaks to get it to build.)

Not that I'd have time try that before GCC 9...

Thanks,
Richard


Re: [AArch64, testsuite] gfortran.dg/ieee/ieee_8.f90: xfail for aarch64+ilp32

2017-10-30 Thread Janne Blomqvist
On Tue, Oct 24, 2017 at 9:27 PM, Charles Baylis
 wrote:
> The test is already marked xfail for aarch64*-*-gnu, but this needs to
> be changed to aarch64*-*-gnu* in order to match
> aarch64-linux-gnu_ilp32.
>
> Test was previously xfail'd in [1].
>
> Shows the expected FAIL->XFAILs on aarch64-linux-gnu_ilp32.
>
> gcc/testsuite:
>   Charles Baylis  
>
> * gfortran.dg/ieee/ieee_8.f90: xfail for aarch64*-*-gnu*
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02188.html

Ok, thanks for the patch.


-- 
Janne Blomqvist


Re: [RFA][PATCH] Provide a class interface into substitute_and_fold.

2017-10-30 Thread Pedro Alves
On 10/25/2017 06:20 PM, Jeff Law wrote:

> My conclusion on the virtual dtor issue is that it's not strictly needed
> right  now.
> 
> IIUC the issue is you could do something like
> 
> base *foo = new derived ();
> [ ... ]
> delete foo;
> 
> If the base's destructor is not virtual and foo is a base * pointing to
> a derived object then the deletion invokes undefined behavior.
> 
> In theory we shouldn't be doing such things :-)  In fact, if there's a
> way to prevent this with some magic on the base class I'd love to know
> about it.

There is: make the base class destructor protected.

Thanks,
Pedro Alves



Re: [PATCH, rs6000] update vec_perm testcase

2017-10-30 Thread Segher Boessenkool
Hi!

On Fri, Oct 27, 2017 at 10:58:34AM -0500, Will Schmidt wrote:
> Update the vec-perm testcase to use 'long long' rather than 'long'.  This was 
> a missed typo
> from when i initially committed the test.
> 
> Credit given to Carl for noticing this one.
> 
> OK for trunk?

Yes please (with some suitable changelog).  Thanks to you both,


Segher


Re: [09/nn] Add a fixed_size_mode_pod class

2017-10-30 Thread Eric Botcazou
> It sounds like people are mostly concerned about sun studio and xlc? It
> doesn't seem that hard to provide precompiled binaries for those two
> platforms, and maybe 4.8 binaries for people who want to compile theire
> own gcc from source.

I'm not sure that we want to enter the business of precompiled binaries.
Moreover, if we want people to contribute to GCC's development, especially 
occasionally to fix a couple of bugs, we need to make it easier to build the 
compiler, not the other way around.

-- 
Eric Botcazou


Re: [RFC PATCH] Merge libsanitizer from upstream

2017-10-30 Thread Maxim Ostapenko

Hi,

sorry for the late response.

On 20/10/17 13:45, Christophe Lyon wrote:

Hi,

On 19 October 2017 at 13:17, Jakub Jelinek  wrote:

On Thu, Oct 19, 2017 at 02:07:24PM +0300, Maxim Ostapenko wrote:

Is the patch (the merge + this incremental) ok for trunk?

I think the patch is OK, just wondering about two things:

Richi just approved the patch on IRC, so I'll commit, then we can deal with
follow-ups.


Does anyone else run these tests on arm?
Since you applied this patch, I'm seeing lots of new errors and timeouts.
I have been ignoring regression reports for *san because of yyrandomness
in the results, but the timeouts are a  major inconvenience in testing
because it increases latency a lot in getting results, or worse I get no
result at all because the validation job is killed before completion.

Looking at some intermediate logs, I have noticed:
==24797==AddressSanitizer CHECK failed:
/libsanitizer/asan/asan_poisoning.cc:34
"((AddrIsAlignedByGranularity(addr))) != (0)" (0x0, 0x0)
 #0 0x408d7d65 in AsanCheckFailed /libsanitizer/asan/asan_rtl.cc:67
 #1 0x408ecd5d in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
/libsanitizer/sanitizer_common/sanitizer_termination.cc:77
 #2 0x408d22d5 in __asan::PoisonShadow(unsigned long, unsigned
long, unsigned char) /libsanitizer/asan/asan_poisoning.cc:34
 #3 0x4085409b in __asan_register_globals
/libsanitizer/asan/asan_globals.cc:368
 #4 0x109eb in _GLOBAL__sub_I_00099_1_ten
(/aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/obj-arm-none-linux-gnueabi/gcc3/gcc/testsuite/gcc/alloca_big_alignment.exe+0x109eb)

in MANY (193 in gcc) tests.

and many others (152 in gcc) just time out individually (eg
c-c++-common/asan/alloca_instruments_all_paddings.c) with no error in
the logs besides Dejagnu's
WARNING: program timed out.


Since I'm using an apparently unusual setup, maybe I have to update it
to cope with the new version,
so I'd like to know if others are seeing the same problems on arm?

I'm using qemu -R 0 to execute the test programs, encapsulated by
proot (similar to chroot, but does not require root privileges).

Am I missing something obvious?


I've caught the same error on my Arndale board. The issue seems to be 
quite obvious: after merge, ASan requires globals array to be aligned by 
shadow granularity.

This trivial patch seems to fix the issue. Could you check it on your setup?

Thanks,
-Maxim



Thanks,

Christophe



1) We have a bunch of GCC local patches, did you include them into a
cumulative patch (I guess yes)?

I have done some verification today, diff from upstream r285547 to unpatched
GCC (with the LLVM Compiler infrastructure two liners removed), attached P1,
and diff from upstream r315899 to patched GCC (again, two liners removed),
attached P2 and went through the changes in P1 and verified that except for
the ubsan backwards compatibility we had that can't work anymore everything
else was upstreamed, or remained in P2.  So P2 is the current diff from
upstream, with the sanitizer_common/sanitizer_symbolizer_libbacktrace.cc
changes now filed upstream.


2) Upstream has enabled LSan for x86 and ARM, is it worth to enable them in
GCC too?

Maybe, feel free to post patches.  For LSan we need to split off lsan_preinit
out of liblsan and link it into executables, will handle it next (there is a
PR about it, just wanted to wait until the merge is in).

 Jakub





gcc/ChangeLog:

2017-10-30  Maxim Ostapenko  

	* asan.c (asan_finish_file): Align asan globals array by shadow
	granularity.

diff --git a/gcc/asan.c b/gcc/asan.c
index 302ac4f..d5128aa 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2942,6 +2942,9 @@ asan_finish_file (void)
   TREE_CONSTANT (ctor) = 1;
   TREE_STATIC (ctor) = 1;
   DECL_INITIAL (var) = ctor;
+  SET_DECL_ALIGN (var, MAX (DECL_ALIGN (var),
+ASAN_SHADOW_GRANULARITY * BITS_PER_UNIT));
+
   varpool_node::finalize_decl (var);
 
   tree fn = builtin_decl_implicit (BUILT_IN_ASAN_REGISTER_GLOBALS);


Re: [RFC PATCH] Merge libsanitizer from upstream

2017-10-30 Thread Jakub Jelinek
On Mon, Oct 30, 2017 at 01:12:47PM +0300, Maxim Ostapenko wrote:
> I've caught the same error on my Arndale board. The issue seems to be quite
> obvious: after merge, ASan requires globals array to be aligned by shadow
> granularity.
> This trivial patch seems to fix the issue. Could you check it on your setup?

Indeed, it now wants to poison the arrays after the registration is over.

> 2017-10-30  Maxim Ostapenko  
> 
>   * asan.c (asan_finish_file): Align asan globals array by shadow
>   granularity.

Ok, thanks.

BTW, we should investigate the 
__asan_register_elf_globals/__asan_unregister_elf_globals
and whether it is something we should start using too.

> diff --git a/gcc/asan.c b/gcc/asan.c
> index 302ac4f..d5128aa 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -2942,6 +2942,9 @@ asan_finish_file (void)
>TREE_CONSTANT (ctor) = 1;
>TREE_STATIC (ctor) = 1;
>DECL_INITIAL (var) = ctor;
> +  SET_DECL_ALIGN (var, MAX (DECL_ALIGN (var),
> + ASAN_SHADOW_GRANULARITY * BITS_PER_UNIT));
> +
>varpool_node::finalize_decl (var);
>  
>tree fn = builtin_decl_implicit (BUILT_IN_ASAN_REGISTER_GLOBALS);


Jakub


Re: [006/nnn] poly_int: tree constants

2017-10-30 Thread Pedro Alves
On 10/27/2017 12:29 AM, Martin Sebor wrote:

> 
> IMO, a good rule of thumb to follow in class design is to have
> every class with any user-defined ctor either define a default
> ctor that puts the object into a determinate state, or make
> the default ctor inaccessible (or deleted in new C++ versions).
> If there is a use case for leaving newly constructed objects
> of a class in an uninitialized state that's an exception to
> the rule that can be accommodated by providing a special API
> (or in C++ 11, a defaulted ctor).

Yet another rule of thumb is to make classes that model
built-in types behave as close to the built-in types as
possible, making it easier to migrate between the custom
types and the built-in types (and vice versa), to follow
expectations, and to avoid pessimization around e.g., otherwise
useless forcing initialization of such types in containers/arrays
when you're going to immediately fill in the container/array with
real values.

BTW, there's a proposal for adding a wide_int class to C++20:

 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0539r1.html

and I noticed:

~~~
 26.??.2.?? wide_integer constructors [numeric.wide_integer.cons]

 constexpr wide_integer() noexcept = default;

 Effects: A Constructs an object with undefined value.
~~~

Thanks,
Pedro Alves


Re: [patch][x86] GFNI enabling [2/4]

2017-10-30 Thread Kirill Yukhin
On 17 Oct 12:58, Koval, Julia wrote:
> Hi, this is the second patch of enabling GFNI ISASET. It adds GF2P8AFFINEINV 
> instruction.
> The instruction is described here:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> 
> gcc/
>   * config.gcc: Add gfniintrin.h.
>   * config/i386/gfniintrin.h: New.
>   * config/i386/i386-builtin-types.def 
> (__builtin_ia32_vgf2p8affineinvqb_v64qi,
>   __builtin_ia32_vgf2p8affineinvqb_v64qi_mask, 
> __builtin_ia32_vgf2p8affineinvqb_v32qi
>   __builtin_ia32_vgf2p8affineinvqb_v32qi_mask, 
> __builtin_ia32_vgf2p8affineinvqb_v16qi,
>   __builtin_ia32_vgf2p8affineinvqb_v16qi_mask): New builtins.
>   * config/i386/i386-builtin.def (V64QI_FTYPE_V64QI_V64QI_INT_V64QI_UDI,
>   V32QI_FTYPE_V32QI_V32QI_INT_V32QI_USI, 
> V16QI_FTYPE_V16QI_V16QI_INT_V16QI_UHI,
>   V64QI_FTYPE_V64QI_V64QI_INT): New types.
>   * config/i386/i386.c (ix86_expand_args_builtin): Handle new types.
>   * config/i386/immintrin.h: Include gfniintrin.h.
>   * config/i386/sse.md (vgf2p8affineinvqb_*) New pattern.
> 
> gcc/testsuite/
>   * gcc.target/i386/avx-1.c: Handle new intrinsics.
>   * gcc.target/i386/avx512-check.h: Check GFNI bit.
>   * gcc.target/i386/avx512f-gf2p8affineinvqb-2.c: Runtime test.
>   * gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c: Runtime test.
>   * gcc.target/i386/gfni-1.c: New.
>   * gcc.target/i386/gfni-2.c: New.
>   * gcc.target/i386/gfni-3.c: New.
>   * gcc.target/i386/gfni-4.c: New.
>   * gcc.target/i386/i386.exp: (check_effective_target_gfni): New.
>   * gcc.target/i386/sse-13.c: Handle new intrinsics.
>   * gcc.target/i386/sse-23.c: Handle new intrinsics.
> 
> Ok for trunk?
Few comments:
1. Why copyright in config/i386/gfniintrin.h starts from 2014?

2. I think few tests updates are missing: g++.dg/other/i386-2,3.c + 
gcc.target/i386/sse-12,14.c

--
Thanks, K
> 
> Thanks,
> Julia




[patch] Remove old kludge in gcc.c

2017-10-30 Thread Eric Botcazou
Hi,

these lines in gcc.c date back to 2001:

/* By default there is no special suffix for target executables.  */
/* FIXME: when autoconf is fixed, remove the host check - dj */
#if defined(TARGET_EXECUTABLE_SUFFIX) && defined(HOST_EXECUTABLE_SUFFIX)
#define HAVE_TARGET_EXECUTABLE_SUFFIX
#endif

HOST_EXECUTABLE_SUFFIX and TARGET_EXECUTABLE_SUFFIX are defined only for VMS 
and DOS/Windows.  The effect is that ".exe" is forced on executables for these 
targets only if they are built as native and not if they are built as cross.

OK for the mainline?


2017-10-30  Eric Botcazou  

* gcc.c (HAVE_TARGET_EXECUTABLE_SUFFIX): Remove old kludge.

-- 
Eric BotcazouIndex: gcc.c
===
--- gcc.c	(revision 254193)
+++ gcc.c	(working copy)
@@ -170,9 +170,10 @@ env_manager::restore ()
 
 
 /* By default there is no special suffix for target executables.  */
-/* FIXME: when autoconf is fixed, remove the host check - dj */
-#if defined(TARGET_EXECUTABLE_SUFFIX) && defined(HOST_EXECUTABLE_SUFFIX)
+#ifdef TARGET_EXECUTABLE_SUFFIX
 #define HAVE_TARGET_EXECUTABLE_SUFFIX
+#else
+#define TARGET_EXECUTABLE_SUFFIX ""
 #endif
 
 /* By default there is no special suffix for host executables.  */
Index: gcc.c
===
--- gcc.c	(revision 254193)
+++ gcc.c	(working copy)
@@ -170,9 +170,10 @@ env_manager::restore ()
 
 
 /* By default there is no special suffix for target executables.  */
-/* FIXME: when autoconf is fixed, remove the host check - dj */
-#if defined(TARGET_EXECUTABLE_SUFFIX) && defined(HOST_EXECUTABLE_SUFFIX)
+#ifdef TARGET_EXECUTABLE_SUFFIX
 #define HAVE_TARGET_EXECUTABLE_SUFFIX
+#else
+#define TARGET_EXECUTABLE_SUFFIX ""
 #endif
 
 /* By default there is no special suffix for host executables.  */


[PATCH][GIMPLEFE] Parse ?: stmts

2017-10-30 Thread Richard Biener

We missed handling of conditional stmts in assignments so the
following adds support for it.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-10-30  Richard Biener  

c/
* gimple-parser.c (c_parser_gimple_statement): Parse conditional
stmts.

* gcc.dg/gimplefe-27.c: New testcase.

Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 254211)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -419,6 +419,23 @@ c_parser_gimple_statement (c_parser *par
   if (lhs.value != error_mark_node
   && rhs.value != error_mark_node)
 {
+  /* If we parsed a comparison and the next token is a '?' then
+ parse a conditional expression.  */
+  if (COMPARISON_CLASS_P (rhs.value)
+ && c_parser_next_token_is (parser, CPP_QUERY))
+   {
+ struct c_expr trueval, falseval;
+ c_parser_consume_token (parser);
+ trueval = c_parser_gimple_postfix_expression (parser);
+ falseval.set_error ();
+ if (c_parser_require (parser, CPP_COLON, "expected %<:%>"))
+   falseval = c_parser_gimple_postfix_expression (parser);
+ if (trueval.value == error_mark_node
+ || falseval.value == error_mark_node)
+   return;
+ rhs.value = build3_loc (loc, COND_EXPR, TREE_TYPE (trueval.value),
+ rhs.value, trueval.value, falseval.value);
+   }
   assign = gimple_build_assign (lhs.value, rhs.value);
   gimple_seq_add_stmt (seq, assign);
   gimple_set_location (assign, loc);
Index: gcc/testsuite/gcc.dg/gimplefe-27.c
===
--- gcc/testsuite/gcc.dg/gimplefe-27.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/gimplefe-27.c  (working copy)
@@ -0,0 +1,9 @@
+/* { dg-options "-O -fgimple" } */
+
+int __GIMPLE ()
+p (int n)
+{
+  int _2;
+  _2 = n_1(D) != 0 ? 2 : 0;
+  return _2;
+}


Re: [PATCH] Assorted store-merging improvements (PR middle-end/22141)

2017-10-30 Thread Richard Biener
On Fri, 27 Oct 2017, Jakub Jelinek wrote:

> Hi!
> 
> The following patch attempts to improve store merging, for the time being
> it still only optimizes constant stores to adjacent memory.
> 
> The biggest improvement is handling bitfields, it uses the get_bit_range
> helper to find the bounds of what can be modified when modifying the bitfield
> and instead of requiring all the stores to be adjacent it now only requires
> that their bitregion_* regions are adjacent.  If get_bit_range fails (e.g. for
> non-C/C++ languages), then it still rounds the boundaries down and up to whole
> bytes, as any kind of changes within the byte affect the rest.  At the end,
> if there are any gaps in between the stored values, the old value is loaded 
> from
> memory (had to set TREE_NO_WARNING on it, so that uninit doesn't complain)
> masked with mask, ored with the constant maked with negation of mask and 
> stored,
> pretty much what the expansion emits.  As incremental improvement, perhaps we
> could emit in some cases where all the stored bitfields in one load/store set
> are adjacent a BIT_INSERT_EXPR instead of doing the and/or.
> 
> Another improvement is related to alignment handling, previously the code
> was using get_object_alignment, which say for the store_merging_11.c testcase
> has to return 8-bit alignment, as the whole struct is 64-bit aligned,
> but the first store is 1 byte after that.  The old code would then on
> targets that don't allow unaligned stores or have them slow emit just byte
> stores (many of them).  The patch uses get_object_alignment_1, so that we
> get both the maximum known alignment and misalign, and computes alignment
> for that for every bitpos we try, such that for stores starting with
> 1 byte after 64-bit alignment we get 1 byte store, then 2 byte and then 4 byte
> and then 8 byte.
> 
> Another improvement is for targets that allow unaligned stores, the new
> code performs a dry run on split_group and if it determines that aligned
> stores are as many or fewer as unaligned stores, it prefers aligned stores.
> E.g. for the case in store_merging_11.c, where ptr is 64-bit aligned and
> we store 15 bytes, unaligned stores and unpatched gcc would choose to
> do an 8 byte, then 4 byte, then 2 byte and then one byte store.
> Aligned stores, 1, 2, 4, 8 in that order are also 4, so it is better
> to do those.
> 
> The patch also attempts to reuse original stores (well, just their lhs/rhs1),
> if we choose a split store that has a single original insn in it.  That way
> we don't lose ARRAY_REFs/COMPONENT_REFs etc. unnecessarily.  Furthermore, if
> there is a larger original store than the maximum we try (wordsize), e.g. when
> there is originally 8 byte long long store followed by 1 byte store followed 
> by
> 1 byte store, on 32-bit targets we'd previously try to split it into 4 byte
> store, 4 byte store and 2 byte store, figure out that is 3 stores like 
> previously
> and give up.  With the patch, if we see a single original larger store at the
> bitpos we want, we just reuse that store, so we get in that case an 8 byte
> original store (lhs/rhs1) followed by 2 byte store.
> 
> In find_constituent_stmts it optimizes by not walking unnecessarily 
> group->stores
> entries that are already known to be before the bitpos we ask.  It fixes the 
> comparisons
> which were off by one, so previously it often chose more original stores than 
> were really
> in the split store.
> 
> Another change is that output_merged_store used to emit the new stores into a 
> sequence
> and if it found out there are too many, released all ssa names and failed.
> That seems to be unnecessary to me, because we know before entering the loop
> how many split stores there are, so can just fail at that point and so only 
> start
> emitting something when we have decided to do the replacement.
> 
> I had to disable store merging in the g++.dg/pr71694.C testcase, but that is 
> just
> because the testcase doesn't test what it should.  In my understanding, it 
> wants to
> verify that the c.c store isn't using 32-bit RMW, because it would create 
> data race
> for c.d.  But it stores both c.c and c.d next to each other, so even when 
> c.c's
> bitregion is the first 3 bytes and c.d's bitregion is the following byte, we 
> are
> then touching bytes in both of the regions and thus a RMW cycle for the whole
> 32-bit word is fine, as c.d is written, it will store the new value and 
> ignore the
> old value of the c.d byte.  What is wrong, but store merging doesn't emit, is 
> what
> we emitted before, i.e. 32-bit RMW that just stored c.c, followed by c.d 
> store.
> Another thread could have stored value1 into c.d, we'd R it by 32-bit read, 
> modify,
> while another thread stored value2 into c.d, then we W the 32-bit word and 
> thus
> introduce a value1 store into c.d, then another thread reads it and finds
> a value1 instead of expected value2.  Finally we store into c.d value3.
> So, alternative to -fno-store-merging 

Re: [03/nn] [AArch64] Rework interface to add constant/offset routines

2017-10-30 Thread Richard Sandiford
Richard Sandiford  writes:
> The port had aarch64_add_offset and aarch64_add_constant routines
> that did similar things.  This patch replaces them with an expanded
> version of aarch64_add_offset that takes separate source and
> destination registers.  The new routine also takes a poly_int64 offset
> instead of a HOST_WIDE_INT offset, but it leaves the HOST_WIDE_INT
> case to aarch64_add_offset_1, which is basically a repurposed
> aarch64_add_constant_internal.  The SVE patch will put the handling
> of VL-based constants in aarch64_add_offset, while still using
> aarch64_add_offset_1 for the constant part.
>
> The vcall_offset == 0 path in aarch64_output_mi_thunk will use temp0
> as well as temp1 once SVE is added.
>
> A side-effect of the patch is that we now generate:
>
> mov x29, sp
>
> instead of:
>
> add x29, sp, 0
>
> in the pr70044.c test.

Sorry, I stupidly rebased the patch just before posting and so
introduced a last-minute bug.  Here's a fixed version that survives
testing on aarch64-linux-gnu.

Thanks,
Richard


2017-10-30  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* config/aarch64/aarch64.c (aarch64_force_temporary): Assert that
x exists before using it.
(aarch64_add_constant_internal): Rename to...
(aarch64_add_offset_1): ...this.  Replace regnum with separate
src and dest rtxes.  Handle the case in which they're different,
including when the offset is zero.  Replace scratchreg with an rtx.
Use 2 additions if there is no spare register into which we can
move a 16-bit constant.
(aarch64_add_constant): Delete.
(aarch64_add_offset): Replace reg with separate src and dest
rtxes.  Take a poly_int64 offset instead of a HOST_WIDE_INT.
Use aarch64_add_offset_1.
(aarch64_add_sp, aarch64_sub_sp): Take the scratch register as
an rtx rather than an int.  Take the delta as a poly_int64
rather than a HOST_WIDE_INT.  Use aarch64_add_offset.
(aarch64_expand_mov_immediate): Update uses of aarch64_add_offset.
(aarch64_allocate_and_probe_stack_space): Take the scratch register
as an rtx rather than an int.  Use Pmode rather than word_mode
in the loop code.  Update calls to aarch64_sub_sp.
(aarch64_expand_prologue): Update calls to aarch64_sub_sp,
aarch64_allocate_and_probe_stack_space and aarch64_add_offset.
(aarch64_expand_epilogue): Update calls to aarch64_add_offset
and aarch64_add_sp.
(aarch64_output_mi_thunk): Use aarch64_add_offset rather than
aarch64_add_constant.

gcc/testsuite/
* gcc.target/aarch64/pr70044.c: Allow "mov x29, sp" too.

Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2017-10-30 10:45:23.806582919 +
+++ gcc/config/aarch64/aarch64.c2017-10-30 10:46:47.278836421 +
@@ -1818,30 +1818,13 @@ aarch64_force_temporary (machine_mode mo
 return force_reg (mode, value);
   else
 {
-  x = aarch64_emit_move (x, value);
+  gcc_assert (x);
+  aarch64_emit_move (x, value);
   return x;
 }
 }
 
 
-static rtx
-aarch64_add_offset (scalar_int_mode mode, rtx temp, rtx reg,
-   HOST_WIDE_INT offset)
-{
-  if (!aarch64_plus_immediate (GEN_INT (offset), mode))
-{
-  rtx high;
-  /* Load the full offset into a register.  This
- might be improvable in the future.  */
-  high = GEN_INT (offset);
-  offset = 0;
-  high = aarch64_force_temporary (mode, temp, high);
-  reg = aarch64_force_temporary (mode, temp,
-gen_rtx_PLUS (mode, high, reg));
-}
-  return plus_constant (mode, reg, offset);
-}
-
 static int
 aarch64_internal_mov_immediate (rtx dest, rtx imm, bool generate,
scalar_int_mode mode)
@@ -1966,86 +1949,123 @@ aarch64_internal_mov_immediate (rtx dest
   return num_insns;
 }
 
-/* Add DELTA to REGNUM in mode MODE.  SCRATCHREG can be used to hold a
-   temporary value if necessary.  FRAME_RELATED_P should be true if
-   the RTX_FRAME_RELATED flag should be set and CFA adjustments added
-   to the generated instructions.  If SCRATCHREG is known to hold
-   abs (delta), EMIT_MOVE_IMM can be set to false to avoid emitting the
-   immediate again.
-
-   Since this function may be used to adjust the stack pointer, we must
-   ensure that it cannot cause transient stack deallocation (for example
-   by first incrementing SP and then decrementing when adjusting by a
-   large immediate).  */
+/* A subroutine of aarch64_add_offset that handles the case in which
+   OFFSET is known at compile time.  The arguments are otherwise the same.  */
 
 static void
-aarch64_add_constant_internal (scalar_int_mode mode, int regnum,
-  int scratchreg, HOST_WIDE_INT delta,
- 

Re: [RFC] Make 4-stage PGO bootstrap really working

2017-10-30 Thread Richard Biener
On Fri, Oct 27, 2017 at 4:31 PM, Markus Trippelsdorf
 wrote:
> On 2017.10.27 at 15:03 +0200, Martin Liška wrote:
>> > And BTW would it make sense to add -gtoggle to stage2 in bootstrap-lto?
>>
>> Why do you want to have it there? Am I right that we do not do a stage
>> comparison with LTO bootstrap?
>
> The idea was to trigger -g -flto at least during one stage, just as a
> sanity check.
> Comparison wouldn't make sense, because we would compare LTO object
> files.

I've always wanted to make us compare the actual binaries now that the checksums
do not cause miscompares...  of course linker behavior will then
factor in and I'm
not sure how reliable such compare would be.

And yes, at the moment we _do_ compare LTO bytecode... (which obviously will
fail with -g vs. -g0)

Richard.

> --
> Markus


Re: [PATCH] Change default to -fno-math-errno

2017-10-30 Thread Richard Biener
On Fri, Oct 27, 2017 at 6:28 PM, Joseph Myers  wrote:
> No existing glibc version defines math_errhandling based on
> __NO_MATH_ERRNO__.  I'd expect such a change to come with a glibc patch,
> and indeed a GCC execution test of the value of math_errhandling to make
> sure the compiler's behavior isn't contradicting what's declared by the
> runtime libraries.  Also, the default should not be changed for pre-C99 C
> standard modes (or pre-C++11 C++, I suppose).

Should we also get the __ieee764_ entries used if the compiler sets
__NO_MATH_ERRNO__?  That is, if the librari advertises not setting errno
via math_errhandling is it still allowed to set/clobber errno anyways?

Richard.

> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Richard Biener
On Sun, 29 Oct 2017, Martin Sebor wrote:

> In my work on -Wrestrict, to issue meaningful warnings, I found
> it important to detect both out of bounds array indices as well
> as offsets in calls to restrict-qualified functions like strcpy.
> GCC already detects some of these cases but my tests for
> the enhanced warning exposed a few gaps.
> 
> The attached patch enhances -Warray-bounds to detect more instances
> out-of-bounds indices and offsets to member arrays and non-array
> members.  For example, it detects the out-of-bounds offset in the
> call to strcpy below.
> 
> The patch is meant to be applied on top posted here but not yet
> committed:
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
> 
> Richard, since this also touches tree-vrp.c I look for your comments.

You fail to tell what you are changing and why - I have to reverse
engineer this from the patch which a) isn't easy in this case, b) feels
like a waste of time.  Esp. since the patch does many things.

My first question is why do you add a warning from forwprop?  It
_feels_ like you're trying to warn about arbitrary out-of-bound
addresses at the point they are folded to MEM_REFs.  And it looks
like you're warning about pointer arithmetic like &p->a + 6.
That doesn't look correct to me.  Pointer arithmetic in GIMPLE
is not restricted to operate within fields that are appearantly
accessed here - the only restriction is with respect to the
whole underlying pointed-to-object.

By doing the warning from forwprop you'll run into all such cases
introduced by GCC itself during quite late optimization passes.

You're trying to re-do __builtin_object_size even when that wasn't
used.

So it looks like you're on the wrong track.  Yes,

 strcpy (p->a + 6, "y");

_may_ be "invalid" C (I'm not even sure about that!) but it
is certainly not invalid GIMPLE.

Richard.

> Jeff, this is the enhancement you were interested in when we spoke
> last week.
> 
> Thanks
> Martin
> 
> $ cat a.c && gcc -O2 -S -Wall a.c
>   struct A { char a[4]; void (*pf)(void); };
> 
>   void f (struct A *p)
>   {
> p->a[5] = 'x';// existing -Warray-bounds
> 
> strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>   }
> 
>   a.c: In function ‘f’:
>   a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’ [-Warray-bounds]
>strcpy (p->a + 6, "y");
>^~
>   a.c:1:17: note: member declared here
>struct A { char a[4]; void (*pf)(void); };
>^
>   a.c:5:7: warning: array subscript 5 is above array bounds of ‘char[4]’
> [-Warray-bounds]
>  p->a[5] = 'x';
>  ^~~

Re: [PATCH] Implement omp async support for nvptx

2017-10-30 Thread Tom de Vries

On 10/30/2017 08:15 AM, Jakub Jelinek wrote:

On Fri, Oct 27, 2017 at 03:57:28PM +0200, Tom de Vries wrote:

how about this approach:
1 - Move async_run from plugin-hsa.c to default_async_run
2 - Implement omp async support for nvptx
?

The first patch moves the GOMP_OFFLOAD_async_run implementation from
plugin-hsa.c to target.c, making it the default implementation if the plugin
does not define the GOMP_OFFLOAD_async_run symbol.

The second patch removes the GOMP_OFFLOAD_async_run symbol from the nvptx
plugin, activating the default implementation, and makes sure
GOMP_OFFLOAD_run can be called from a fresh thread.

I've tested this with libgomp.c/c.exp and the previously failing target-33.c
and target-34.c are now passing, and there are no regressions.

OK for trunk after complete testing (and adding function comment for
default_async_run)?


Can't PTX do better than this?


It can.

I found your comment describing this implementation as a hack here ( 
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02726.html ) after sending 
this on Friday, and thought about things a little bit more. So let me 
try again.


This is not an optimal nvptx async implementation. This is a proposal to 
have a poor man's async implementation in the common code, rather than 
having libgomp accel ports implementing GOMP_OFFLOAD_async_run as abort 
at first.


AFAIU, the purpose of the async functionality is to have jobs executed 
concurrently and/or interleaved on the device. While this implementation 
does not offer jobs to the device in separate queues, such that the 
device can decide on concurrent and interleaved behaviour, it does 
present the device with a possibly interleaved job schedule (which is 
slightly better than having a poor mans async implementation that is 
just synchronous).


In order to have an optimal implementation, one would still need to 
implement the GOMP_OFFLOAD_async_run hook, which would bypass this 
implementation.


I'm not sure how useful this would be, but I can even imagine using this 
if all the accel ports have implemented the GOMP_OFFLOAD_async_run hook.

We could define a variable OMP_ASYNC with semantics:
- 0: ignore plugins GOMP_OFFLOAD_async_run hook, fall back on
 synchronous behaviour
- 1: ignore plugins GOMP_OFFLOAD_async_run hook, use poor man's
 implementation.
- 2: use plugins GOMP_OFFLOAD_async_run hook.
This could be helpful in debugging programs with async behaviour.


What I mean is that while we probably need
to take the device lock for the possible memory transfers and deallocation
at the end of the region and thus perform some action on the host in between
the end of the async target region and data copying/deallocation, can't we
have a single thread per device instead of one thread per async target
region, use CUDA async APIs and poll for all the pending async regions
together?  I mean, if we need to take the device lock, then we need to
serialize the finalization anyway and reusing the same thread would
significantly decrease the overhead if there are many async regions.



As for the poor mans implementation, it is indeed inefficient and could 
be improved, but I wonder if it's worth the effort. [ Perhaps though for 
debugging purposes the ability to control the interleaving in some way 
might be useful. ]


I imagine that an efficient nvptx implementation will use cuda streams, 
which are queues where both kernels and mem transfers can be queued. So 
rather than calling GOMP_PLUGIN_target_task_completion once the kernel 
is done, it would be more efficient to be able call a similar function 
that schedules the data transfers that need to happen, without assuming 
that the kernel is already done. However, AFAIU, that won't take care of 
deallocation. So I guess the first approach will be to use cuda events 
to poll whether a kernel has completed, and then call 
GOMP_PLUGIN_target_task_completion.



And, if it at least in theory can do better than that, then even if we
punt on that for now due to time/resource constraints, maybe it would be
better to do this inside of plugin where it can be more easily replaced
later.


I'm trying to argue the other way round: if there is no optimal 
implementation in the plugin, let's provide at least a non-optimal but 
non-synchronous implementation as default, and exercise the async code 
rather than having tests fail with a plugin abort.


Thanks,
- Tom


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-10-30 Thread Richard Biener
On Sun, Oct 29, 2017 at 5:15 PM, Martin Sebor  wrote:
> Ping -- please see my reply below.
>
>
> On 10/20/2017 09:57 AM, Richard Biener wrote:
>>
>> get_addr_base_and_unit_offset will return NULL if there's any
>>>
>>> variable
>>
>> component in 'ref'.  So as written it seems to be dead code (you
>> want to pass 'arg'?)
>
>
>
> Sorry, I'm not sure I understand what you mean.  What do you think
> is dead code?  The call to get_addr_base_and_unit_offset() is also
> made for an array of unspecified bound (up_bound is null) and for
> an array at the end of a struct.  For those the function returns
> non-null, and for the others (arrays of runtime bound) it returns
> null.  (I passed arg instead of ref but I see no difference in
> my tests.)


 If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
>>>
>>> it will

 return the offset of 'c'.  If you pass a.b[j].c it will still return
>>>
>>> NULL.

 You could use get_ref_base_and_extent which will return the offset
 of a.b[0].c in this case and sets max_size != size - but you are only
 interested in offset.  The disadvantage of get_ref_base_and_extent
 is it returns offset in bits thus if the offset is too large for a
>>>
>>> HWI

 you'll instead get offset == 0 and max_size == -1.

 Thus I'm saying this is dead code for variable array accesses
 (even for the array you are warning about).  Yes, for constant index
 and at-struct-end you'll get sth, but the warning is in VRP because
 of variable indexes.

 So I suggest to pass 'arg' and use get_ref_base_and_extent
 for some extra precision (and possible lossage for very very large
 structures).
>>>
>>>
>>> Computing bit offsets defeats the out-of-bounds flexible array
>>> index detection because the computation overflows (the function
>>> sets the offset to zero).
>>
>>
>> It shouldn't if you pass arg rather than ref.
>
>
> I suspect you missed my reply in IRC where I confirmed that this
> approach doesn't work for the reason you yourself mentioned above:
>
 The disadvantage of get_ref_base_and_extent
 is it returns offset in bits thus if the offset is too large
 for a HWI you'll instead get offset == 0 and max_size == -1.
>
> This means that using the function you suggest would either prevent
> detecting the out-of-bounds indices that overflow due to the bit
> offset, thus largely defeating the purpose of the patch (to detect
> excessively large indices), or not printing the value of the out-of
> bounds index in the warning in this case, which would in turn mean
> further changes to the rest of the logic.

Well, it would not handle the case of

a.b[offset-bringing-you-bitoffset-overflow].c[i]

but it would handle (compared to your version of the patch)

a.b[j].c[i]

with i being the unreasonably large offset.  That's beause we
look at the bit offset of a.b[j].c thus the _start_ of the array.

I still find the half-address-space case totally pointless to warn about
(even more to get "precise" here by subtracting the offset of the array).
There's not a single testcase that looks motivating to me (like an
error with some signed->unsigned conversion and then GCC magically
figuring out a range you'd warn for)

The diagnostic wording changes like

@@ -6718,7 +6750,8 @@ check_array_ref (location_t location, tree ref,
bool ignore_off_by_one)
   && tree_int_cst_le (low_sub, low_bound))
 {
   warning_at (location, OPT_Warray_bounds,
-  "array subscript is outside array bounds");
+  "array subscript [%E, %E] is outside array bounds of %qT",
+  low_sub, up_sub, artype);
   TREE_NO_WARNING (ref) = 1;
 }
 }

are ok for trunk.

Thanks,
Richard.


>>   I'll go ahead with the first version
>>>
>>> unless you have a different suggestion.
>
>
> But before I commit the patch as is I want to double-check to make
> sure you don't have further suggestions.
>
> Martin
>
> PS For reference the last version of the patch is here:
>
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html


Re: [PATCH 1/7] GCOV: document behavior of -fkeep-{static,inline}-functions (PR gcov-profile/82633).

2017-10-30 Thread Nathan Sidwell

On 10/26/2017 04:11 AM, marxin wrote:

gcc/ChangeLog:



+++ b/gcc/doc/gcov.texi
@@ -327,6 +327,11 @@ non-exceptional paths or only exceptional paths such as 
C++ exception
  handlers, respectively. Given @samp{-a} option, unexecuted blocks are
  marked @samp{$} or @samp{%}, depending on whether a basic block
  is reachable via non-exceptional or exceptional paths.
+Note that GCC can perform function removal for functions obviously not
+used in a compilation unit.  Such functions are marked with @samp{-}
+even though they contain a code.  Use @option{-fkeep-inline-functions} and
+@option{-fkeep-static-functions} in order to properly
+record @var{execution_count} of such functions.


This reads a little oddly.  How about:

Note that GCC can completely remove the bodies of functions that are not 
needed -- for instance if they are inlined everywhere.  Such functions 
are marked with @samp{-}, which can be confusing.  Use the 
@option{-fkeep-inline-functions} and @option{-fkeep-static-functions} 
options to retain these functions and allow gcov to properly show their 
@var{execution_count}.


Ok with that (or something approximating it).

nathan
--
Nathan Sidwell


Re: [PATCH 2/7] GCOV: introduce usage of terminal colors.

2017-10-30 Thread Nathan Sidwell

On 10/26/2017 04:11 AM, marxin wrote:

I consider using colors in context of gcov as very useful. There's
example for tramp3d:
https://pste.eu/p/Tl2D.html


nice!


gcc/ChangeLog:

2017-10-23  Martin Liska  

* color-macros.h: New file.
* diagnostic-color.c: Factor out color related to macros to
color-macros.h.
* doc/gcov.texi: Document -k option.
* gcov.c (INCLUDE_STRING): Include string.h.
(print_usage): Add -k option.
(process_args): Parse it.
(pad_count_string): New function.
(output_line_beginning): Likewise.
(DEFAULT_LINE_START): New macro.
(output_lines): Support color output.


The gcov changes are ok.  I guess David has the review ball for the 
diagnostic refactoring?


nathan

--
Nathan Sidwell


[Patch, fortran] PR80850 - Sourced allocate() fails to allocate a pointer

2017-10-30 Thread Paul Richard Thomas
Dear All,

This bug took a silly amount of effort to diagnose but once done, the
fix was obvious.

The bug is triggered in this function from the reporter's source file
gfc_graph.F90:

function GraphIterAppendVertex(this,vertex) result(ierr)
!Appends a new vertex to the graph.
 implicit none
 integer(INTD):: ierr   !out: error code
 class(graph_iter_t), intent(inout):: this  !inout:
graph iterator
 class(graph_vertex_t), intent(in), target:: vertex !in: new vertex
 type(vert_link_refs_t):: vlr

 ierr=this%vert_it%append(vertex) !<= right here!
snip
 return
end function GraphIterAppendVertex

'vertex' is being passed to a class(*) dummy. As you will see from the
attached patch and the ChangeLog, 'vertex' was being cast as unlimited
polymorphic and assigned to the passed actual argument. This left the
_len field indeterminate since it is not present in normal (limited?)
polymorphic objects.

Further down the way, in stsubs.F90(clone_object) an allocation is
being made using the unlimited version of 'vertex as a source. Since
the size passed to malloc for an unlimited source is, for  _len > 0,
the value of the _len multiplied by the vtable _size, the amount of
memory is also indeterminate and causes the operating system to flag a
failed allocation, pretty much at random.

The ChangeLog and the patch describe the fix sufficiently well as not
to require further explanation. I will write a testcase that tests the
tree dump for the _len field before committing the patch.

Bootstraps and regtests on FC23/x86_64 - OK for 7- and 8-branches?

If I do not receive any contrary comments, I will commit tonight.

Regards

Paul

2017-10-30  Paul Thomas  

PR fortran/80850
* trans_expr.c (gfc_conv_procedure_call): When passing a class
argument to an unlimited polymorphic dummy, it is wrong to cast
the passed expression as unlimited, unless it is unlimited. The
correct way is to assign to each of the fields and set the _len
field to zero.
Index: gcc/fortran/trans-expr.c
===
*** gcc/fortran/trans-expr.c(revision 254196)
--- gcc/fortran/trans-expr.c(working copy)
*** gfc_conv_procedure_call (gfc_se * se, gf
*** 5173,5182 
}
  else
{
! gfc_add_modify (&parmse.pre, var,
! fold_build1_loc (input_location,
!  VIEW_CONVERT_EXPR,
!  type, parmse.expr));
  parmse.expr = gfc_build_addr_expr (NULL_TREE, var);
}
}
--- 5173,5198 
}
  else
{
! if (!UNLIMITED_POLY (e) && UNLIMITED_POLY (fsym))
!   {
! tmp = gfc_class_data_get (var);
! gfc_add_modify (&parmse.pre, tmp,
! gfc_class_data_get (parmse.expr));
! tmp = gfc_class_vptr_get (var);
! gfc_add_modify (&parmse.pre, tmp,
! gfc_class_vptr_get (parmse.expr));
! tmp = gfc_class_len_get (var);
! gfc_add_modify (&parmse.pre, tmp,
! build_int_cst (TREE_TYPE (tmp), 
0));
!   }
! else
!   {
! tmp = fold_build1_loc (input_location,
!VIEW_CONVERT_EXPR,
!type, parmse.expr);
! gfc_add_modify (&parmse.pre, var, tmp);
! ;
!   }
  parmse.expr = gfc_build_addr_expr (NULL_TREE, var);
}
}


Re: [PATCH 3/7] GCOV: add support for lines with an unexecuted lines.

2017-10-30 Thread Nathan Sidwell

On 10/26/2017 04:11 AM, marxin wrote:

It's possible to have a line of code that has a non-zero coverage.
However, it can contain unexecuted blocks and I hope adding a
notification can be usefull. LLVM also does that:



1*:5:   int a = b < 1 ? (c < 3 ? d : c) : e;


A useful enhancement.




  function:@var{line_number},@var{execution_count},@var{function_name}
-lcount:@var{line number},@var{execution_count}
+lcount:@var{line number},@var{execution_count},@var{has_unexecuted_statement}


Is 'statement' the right phrase.  Pedantically it is 'basic block', 
which the documentation does discuss.  So perhaps 'has_unexecuted_block'?


  
@@ -341,6 +341,9 @@ used in a compilation unit.  Such functions are marked with @samp{-}

  even though they contain a code.  Use @option{-fkeep-inline-functions} and
  @option{-fkeep-static-functions} in order to properly
  record @var{execution_count} of such functions.
+Executed lines having a statement with zero @var{execution_count} end with
+@samp{*} character and are colored with magenta color with @option{-k}
+option.


Same comment.



+  unsigned has_unexecuted_block : 1;


Heh, and the code matches my thought :)

Looks good otherwise, WDYT?

nathan

--
Nathan Sidwell


Re: [PATCH] Change default to -fno-math-errno

2017-10-30 Thread Joseph Myers
On Mon, 30 Oct 2017, Richard Biener wrote:

> On Fri, Oct 27, 2017 at 6:28 PM, Joseph Myers  wrote:
> > No existing glibc version defines math_errhandling based on
> > __NO_MATH_ERRNO__.  I'd expect such a change to come with a glibc patch,
> > and indeed a GCC execution test of the value of math_errhandling to make
> > sure the compiler's behavior isn't contradicting what's declared by the
> > runtime libraries.  Also, the default should not be changed for pre-C99 C
> > standard modes (or pre-C++11 C++, I suppose).
> 
> Should we also get the __ieee764_ entries used if the compiler sets
> __NO_MATH_ERRNO__?  That is, if the librari advertises not setting errno
> via math_errhandling is it still allowed to set/clobber errno anyways?

See C11 7.12.1p7.  "If a domain, pole, or range error occurs and the 
integer expression math_errhandling & MATH_ERRNO is zero, then errno shall 
either be set to the value corresponding to the error or left unmodified. 
If no such error occurs, errno shall be left unmodified regardless of the 
setting of math_errhandling.".  (And, for  functions, "An 
implementation may set errno but is not required to.".)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 4/7] GCOV: add -j argument (human readable format).

2017-10-30 Thread Nathan Sidwell

On 10/26/2017 04:11 AM, marxin wrote:

Human readable format is quite useful in my opinion. There's example:

 -:1:unsigned
14.00K:2:loop (unsigned n, int value)


My first thought is 'why 2 decimal places'?  That seems excessive.  Zero 
surely suffices?



Question is do we want to do it by default, or a new option is fine?
Note that all external tools using gcov should use intermediate format
which is obviously unchanged.


I don't think other tools do it by default.



+ [@option{-j}|@option{--human-numbers}]


man ls:
  -h, --human-readable
  with -l and/or -s, print human readable sizes (e.g., 1K 
234M 2G)


Sadly '-h' is help (IIRC).  but we could at least copy the long form.



+  const char *units[] = {"", "K", "M", "G", "Y", "P", "E", "Z"};


Those aren't right  KMGTPEZY
http://www.npl.co.uk/reference/measurement-units/si-prefixes/



+  for (unsigned i = 0; i < sizeof (units); i++)
+{
+  if (v < 1000.0f)
+   {
+ sprintf (buffer, "%3.2f%s", v, units[i]);
+ return buffer;
+   }
+
+  v /= 1000.0f;
+}


that's going to fail on certain roundings.  You're doing multiple 
divisions by 1000, which itself will have roundings.  But more 
importantly, numbers after scaling like 999.999 will round to 1000, and 
you probably don't want that.  This kind of formatting is tricky.  My 
inclination is to keep it in the integer domain.  Determine a scaling 
factor.  Divide once by that and then explicitly round to nearest 
even[*] with a check for the 999.9 case.  Then print as an unsigned.


nathan

[*] I presume you're familiar with RNE?  That's probably overkill and 
plain RN would be adequate.


--
Nathan Sidwell


[PATCH] Fix PR82762

2017-10-30 Thread Richard Biener

Un-revert the reversion.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-10-30  Richard Biener  

PR tree-optimization/82762
Revert
2017-10-23  Richard Biener  

PR tree-optimization/82129
Revert
2017-08-01  Richard Biener  

PR tree-optimization/81181
* tree-ssa-pre.c (compute_antic_aux): Defer clean() to ...
(compute_antic): ... end of iteration here.


Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 254211)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -2029,7 +2039,8 @@ static sbitmap has_abnormal_preds;
  ANTIC_OUT[BLOCK] = phi_translate (ANTIC_IN[succ(BLOCK)])
 
ANTIC_IN[BLOCK] = clean(ANTIC_OUT[BLOCK] U EXP_GEN[BLOCK] - TMP_GEN[BLOCK])
-*/
+
+   Note that clean() is deferred until after the iteration.  */
 
 static bool
 compute_antic_aux (basic_block block, bool block_has_abnormal_pred_edge)
@@ -2165,7 +2176,8 @@ compute_antic_aux (basic_block block, bo
 bitmap_value_insert_into_set (ANTIC_IN (block),
  expression_for_id (bii));
 
-  clean (ANTIC_IN (block));
+  /* clean (ANTIC_IN (block)) is defered to after the iteration converged
+ because it can cause non-convergence, see for example PR81181.  */
 
   if (!bitmap_set_equal (old, ANTIC_IN (block)))
 changed = true;
@@ -2397,6 +2409,12 @@ compute_antic (void)
   gcc_checking_assert (num_iterations < 500);
 }
 
+  /* We have to clean after the dataflow problem converged as cleaning
+ can cause non-convergence because it is based on expressions
+ rather than values.  */
+  FOR_EACH_BB_FN (block, cfun)
+clean (ANTIC_IN (block));
+
   statistics_histogram_event (cfun, "compute_antic iterations",
  num_iterations);
 
Index: gcc/testsuite/gcc.dg/torture/pr82762.c
===
--- gcc/testsuite/gcc.dg/torture/pr82762.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr82762.c  (working copy)
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+
+int printf (const char *, ...);
+
+int b, c, d, e, f, g, j, k;
+char h, i;
+volatile int l;
+
+int m (int n, int o)
+{ 
+  return o < 0 || o > 1 ? n : o;
+}
+
+int p (int n, unsigned o)
+{ 
+  return n - o;
+}
+
+int q ()
+{ 
+  char r;
+  int a, s, t, u, v, w;
+L:
+  if (t)
+printf ("%d", d);
+  u = v;
+  while (j)
+{ 
+  while (e)
+   for (w = 0; w != 54; w += 6)
+ { 
+   l;
+   s = p (u < 1, i || c);
+   r = s < 0 || s > 1 ? 0 : 1 >> s;
+   v = r;
+   g = h;
+ }
+  if (h)
+   return f;
+  if (u)
+   for (a = 0; a != 54; a += 6)
+ f = m (2, -(k || b));
+}
+  d = t;
+  goto L;
+}


[PATCH] Fix PR82757

2017-10-30 Thread Richard Biener

The following fixes the gold linker still recognizing LTO bytecode
via an UNDEF __gnu_lto_* symbol.  So on "removal" of the symbol
make sure to strip two leading _s.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-10-30  Richard Biener  

PR lto/82757
* simple-object-elf.c (simple_object_elf_copy_lto_debug_sections):
Strip two leading _s from the __gnu_lto_* symbols.

Index: libiberty/simple-object-elf.c
===
--- libiberty/simple-object-elf.c   (revision 254211)
+++ libiberty/simple-object-elf.c   (working copy)
@@ -1384,7 +1384,12 @@ simple_object_elf_copy_lto_debug_section
  && p[1] == '_'
  && strncmp (p + (p[2] == '_'),
  "__gnu_lto_", 10) == 0)
-   other = STV_HIDDEN;
+   {
+ other = STV_HIDDEN;
+ ELF_SET_FIELD (type_functions, ei_class, Sym,
+ent, st_name, Elf_Word,
+st_name + 2);
+   }
}
}
  *st_other = other;


Re: [PATCH] Fix PR82757

2017-10-30 Thread Jakub Jelinek
On Mon, Oct 30, 2017 at 01:48:00PM +0100, Richard Biener wrote:
> 
> The following fixes the gold linker still recognizing LTO bytecode
> via an UNDEF __gnu_lto_* symbol.  So on "removal" of the symbol
> make sure to strip two leading _s.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Doesn't that make a gnu_lto_* symbol defined in the same shared library or
binary hidden, even when not hidden in the source?  Or are there . or $
characters in the name that make that impossible?

> 2017-10-30  Richard Biener  
> 
>   PR lto/82757
>   * simple-object-elf.c (simple_object_elf_copy_lto_debug_sections):
>   Strip two leading _s from the __gnu_lto_* symbols.
> 
> Index: libiberty/simple-object-elf.c
> ===
> --- libiberty/simple-object-elf.c (revision 254211)
> +++ libiberty/simple-object-elf.c (working copy)
> @@ -1384,7 +1384,12 @@ simple_object_elf_copy_lto_debug_section
> && p[1] == '_'
> && strncmp (p + (p[2] == '_'),
> "__gnu_lto_", 10) == 0)
> - other = STV_HIDDEN;
> + {
> +   other = STV_HIDDEN;
> +   ELF_SET_FIELD (type_functions, ei_class, Sym,
> +  ent, st_name, Elf_Word,
> +  st_name + 2);
> + }
>   }
>   }
> *st_other = other;

Jakub


Re: [PATCH] Fix PR82757

2017-10-30 Thread Richard Biener
On Mon, 30 Oct 2017, Jakub Jelinek wrote:

> On Mon, Oct 30, 2017 at 01:48:00PM +0100, Richard Biener wrote:
> > 
> > The following fixes the gold linker still recognizing LTO bytecode
> > via an UNDEF __gnu_lto_* symbol.  So on "removal" of the symbol
> > make sure to strip two leading _s.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> 
> Doesn't that make a gnu_lto_* symbol defined in the same shared library or
> binary hidden, even when not hidden in the source?  Or are there . or $
> characters in the name that make that impossible?

Asked Alan the same question and he said

"Of course on x86_64, you'll start off with "__gnu_lto_", skip two chars 
to give "gnu_lto_", which also ought to be fine.  Even if it matches some 
user symbol."

I also said that I'll likely end up writing proper symbol removal
code ... (but not during stage1).

Richard.


Re: [Patch, fortran] PR80850 - Sourced allocate() fails to allocate a pointer

2017-10-30 Thread Andre Vehreschild
Hi Paul,

whoopsie, I remember that I inserted the check for _len > 0 in allocate(). So
it was me causing the bug. Thanks that you found it. The patch looks good to
me. Thanks for the work.

- Andre

On Mon, 30 Oct 2017 12:20:20 +
Paul Richard Thomas  wrote:

> Dear All,
> 
> This bug took a silly amount of effort to diagnose but once done, the
> fix was obvious.
> 
> The bug is triggered in this function from the reporter's source file
> gfc_graph.F90:
> 
> function GraphIterAppendVertex(this,vertex) result(ierr)
> !Appends a new vertex to the graph.
>  implicit none
>  integer(INTD):: ierr   !out: error code
>  class(graph_iter_t), intent(inout):: this  !inout:
> graph iterator
>  class(graph_vertex_t), intent(in), target:: vertex !in: new vertex
>  type(vert_link_refs_t):: vlr
> 
>  ierr=this%vert_it%append(vertex) !<= right here!
> snip
>  return
> end function GraphIterAppendVertex
> 
> 'vertex' is being passed to a class(*) dummy. As you will see from the
> attached patch and the ChangeLog, 'vertex' was being cast as unlimited
> polymorphic and assigned to the passed actual argument. This left the
> _len field indeterminate since it is not present in normal (limited?)
> polymorphic objects.
> 
> Further down the way, in stsubs.F90(clone_object) an allocation is
> being made using the unlimited version of 'vertex as a source. Since
> the size passed to malloc for an unlimited source is, for  _len > 0,
> the value of the _len multiplied by the vtable _size, the amount of
> memory is also indeterminate and causes the operating system to flag a
> failed allocation, pretty much at random.
> 
> The ChangeLog and the patch describe the fix sufficiently well as not
> to require further explanation. I will write a testcase that tests the
> tree dump for the _len field before committing the patch.
> 
> Bootstraps and regtests on FC23/x86_64 - OK for 7- and 8-branches?
> 
> If I do not receive any contrary comments, I will commit tonight.
> 
> Regards
> 
> Paul
> 
> 2017-10-30  Paul Thomas  
> 
> PR fortran/80850
> * trans_expr.c (gfc_conv_procedure_call): When passing a class
> argument to an unlimited polymorphic dummy, it is wrong to cast
> the passed expression as unlimited, unless it is unlimited. The
> correct way is to assign to each of the fields and set the _len
> field to zero.


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] Implement omp async support for nvptx

2017-10-30 Thread Tom de Vries

On 10/30/2017 12:55 PM, Tom de Vries wrote:

On 10/30/2017 08:15 AM, Jakub Jelinek wrote:

On Fri, Oct 27, 2017 at 03:57:28PM +0200, Tom de Vries wrote:

how about this approach:
1 - Move async_run from plugin-hsa.c to default_async_run
2 - Implement omp async support for nvptx
?

The first patch moves the GOMP_OFFLOAD_async_run implementation from
plugin-hsa.c to target.c, making it the default implementation if the 
plugin

does not define the GOMP_OFFLOAD_async_run symbol.

The second patch removes the GOMP_OFFLOAD_async_run symbol from the 
nvptx

plugin, activating the default implementation, and makes sure
GOMP_OFFLOAD_run can be called from a fresh thread.

I've tested this with libgomp.c/c.exp and the previously failing 
target-33.c

and target-34.c are now passing, and there are no regressions.

OK for trunk after complete testing (and adding function comment for
default_async_run)?


Can't PTX do better than this?


It can.

I found your comment describing this implementation as a hack here ( 
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02726.html ) after sending 
this on Friday, and thought about things a little bit more. So let me 
try again.


This is not an optimal nvptx async implementation. This is a proposal to 
have a poor man's async implementation in the common code, rather than 
having libgomp accel ports implementing GOMP_OFFLOAD_async_run as abort 
at first.


AFAIU, the purpose of the async functionality is to have jobs executed 
concurrently and/or interleaved on the device. While this implementation 
does not offer jobs to the device in separate queues, such that the 
device can decide on concurrent and interleaved behaviour, it does 
present the device with a possibly interleaved job schedule (which is 
slightly better than having a poor mans async implementation that is 
just synchronous).


In order to have an optimal implementation, one would still need to 
implement the GOMP_OFFLOAD_async_run hook, which would bypass this 
implementation.


I'm not sure how useful this would be, but I can even imagine using this 
if all the accel ports have implemented the GOMP_OFFLOAD_async_run hook.

We could define a variable OMP_ASYNC with semantics:
- 0: ignore plugins GOMP_OFFLOAD_async_run hook, fall back on
  synchronous behaviour
- 1: ignore plugins GOMP_OFFLOAD_async_run hook, use poor man's
  implementation.
- 2: use plugins GOMP_OFFLOAD_async_run hook.
This could be helpful in debugging programs with async behaviour.


What I mean is that while we probably need
to take the device lock for the possible memory transfers and 
deallocation
at the end of the region and thus perform some action on the host in 
between
the end of the async target region and data copying/deallocation, 
can't we

have a single thread per device instead of one thread per async target
region, use CUDA async APIs and poll for all the pending async regions
together?  I mean, if we need to take the device lock, then we need to
serialize the finalization anyway and reusing the same thread would
significantly decrease the overhead if there are many async regions.



As for the poor mans implementation, it is indeed inefficient and could 
be improved, but I wonder if it's worth the effort. [ Perhaps though for 
debugging purposes the ability to control the interleaving in some way 
might be useful. ]


I imagine that an efficient nvptx implementation will use cuda streams, 
which are queues where both kernels and mem transfers can be queued. So 
rather than calling GOMP_PLUGIN_target_task_completion once the kernel 
is done, it would be more efficient to be able call a similar function 
that schedules the data transfers that need to happen, without assuming 
that the kernel is already done. However, AFAIU, that won't take care of 
deallocation. So I guess the first approach will be to use cuda events 
to poll whether a kernel has completed, and then call 
GOMP_PLUGIN_target_task_completion.



And, if it at least in theory can do better than that, then even if we
punt on that for now due to time/resource constraints, maybe it would be
better to do this inside of plugin where it can be more easily replaced
later.


I'm trying to argue the other way round: if there is no optimal 
implementation in the plugin, let's provide at least a non-optimal but 
non-synchronous implementation as default, and exercise the async code 
rather than having tests fail with a plugin abort.


And if you're concerned about not having tests failing to remind us that 
the default async support is not optimal and the GOMP_OFFLOAD_async_run 
hook is missing, we can add target-3[34]-opt.c testcases that force the 
use of the GOMP_OFFLOAD_async_run hook, and will abort if the hook is 
missing. Updated untested patch attached (implemented the the env var 
proposed above as OMP_TARGET_ASYNC).


Thanks,
- Tom
Move async_run from plugin-hsa.c to default_async_run

2017-10-27  Tom de Vries  

	* env.c (gomp_target_async, gomp_target_asy

[C++ Patch/RFC] PR 82085 ("[6/7/8 Regression] ICE: Template variable reference used in nested template alias")

2017-10-30 Thread Paolo Carlini

Hi,

lately I spent quite a bit of time triaging and resolving ICEs and 
noticed this recently filed regression which, in a wicked way, I find 
somehow funny. In short, we are seeing an ICE in tsubst_copy_and_build, 
when, for an INDIRECT_REF, TREE_TYPE (r) is found null:


case INDIRECT_REF:
   {
    tree r = RECUR (TREE_OPERAND (t, 0));

    if (REFERENCE_REF_P (t))
      {
        /* A type conversion to reference type will be enclosed in
       such an indirect ref, but the substitution of the cast
       will have also added such an indirect ref.  */
        if (TREE_CODE (TREE_TYPE (r)) == REFERENCE_TYPE)
      r = convert_from_reference (r);

Since Jason's r226642, which fixed a serious issue with multiple 
Bugzillas, we have lookup_template_variable doing:


  /* The type of the expression is NULL_TREE since the template-id 
could refer

 to an explicit or partial specialization. */
  tree type = NULL_TREE;

instead of setting type = unknown_type_node and that ultimately leads to 
the ICE for the testcase at issue. Now, I found in a way funny that the 
body of convert_from_reference has:


tree
convert_from_reference (tree val)
{
  if (TREE_TYPE (val)
  && TREE_CODE (TREE_TYPE (val)) == REFERENCE_TYPE)

thus, if only somebody had cleaned up with no functionality change the 
above bits of tsubst_copy_and_build to unconditionally call 
convert_from_reference we would not have PR82085. Or at least we would 
not have it in this form, maybe we are only uncovering a much deeper 
issue?!? Anyway, the trivial patchlet below certainly passes testing on 
x86_64-linux. What do you think?


Thanks, Paolo.

/

Index: cp/pt.c
===
--- cp/pt.c (revision 254198)
+++ cp/pt.c (working copy)
@@ -17079,8 +17079,7 @@ tsubst_copy_and_build (tree t,
/* A type conversion to reference type will be enclosed in
   such an indirect ref, but the substitution of the cast
   will have also added such an indirect ref.  */
-   if (TREE_CODE (TREE_TYPE (r)) == REFERENCE_TYPE)
- r = convert_from_reference (r);
+   r = convert_from_reference (r);
  }
else
  r = build_x_indirect_ref (input_location, r, RO_UNARY_STAR,
Index: testsuite/g++.dg/cpp1y/var-templ56.C
===
--- testsuite/g++.dg/cpp1y/var-templ56.C(nonexistent)
+++ testsuite/g++.dg/cpp1y/var-templ56.C(working copy)
@@ -0,0 +1,11 @@
+// PR c++/82085
+// { dg-do compile { target c++14 } }
+
+template 
+using char_sequence_t = int;
+
+template 
+constexpr char name_of_v = 'x';
+
+template 
+using type = char_sequence_t>;


Re: [PATCH][ARM] Remove DImode expansions for 1-bit shifts

2017-10-30 Thread Wilco Dijkstra
Kyrill Tkachov wrote:
> On 16/10/17 12:30, Wilco Dijkstra wrote:

> > DImode right shifts of 1 are rarely used (6 in total in the GCC binary),
> > so there is little benefit of the arm_ashrdi3_1bit and arm_lshrdi3_1bit
> > patterns.
>
> ... but it's still used, and the patterns were put there for a reason.
> Even if GCC itself doesn't use them much they may be used by other 
> applications.
> 
> So I'd support removing the left shift 1-bit expansions, but not the 
> right shift ones.

The purpose of removing the shift-by-1 cases is not just to cleanup code
but also to improve code generation. These shifts cannot expand early
and thus don't benefit from optimization (like shift merging). They also
suffer from the DImode register allocation issues.

As a simple example this loop runs >20% faster with my patch on both
Cortex-A53 and Cortex-A57 when built with -mfpu=vfp:

long long loop1 (long long x, long long y, int n)
{
  int i;
   for (i = 0; i < n; i++)
   {
  x >>= 1;
  x |= y;
  x >>= 1;
  x |= y;
  x >>= 1;
  x |= y;
  x >>= 1;
  x |= y;
   }
   return x;
}

So given these shifts are bad for performance, why have them at all?

Wilco

Re: [C++ Patch/RFC] PR 82085 ("[6/7/8 Regression] ICE: Template variable reference used in nested template alias")

2017-10-30 Thread Jason Merrill
OK.

On Mon, Oct 30, 2017 at 9:52 AM, Paolo Carlini  wrote:
> Hi,
>
> lately I spent quite a bit of time triaging and resolving ICEs and noticed
> this recently filed regression which, in a wicked way, I find somehow funny.
> In short, we are seeing an ICE in tsubst_copy_and_build, when, for an
> INDIRECT_REF, TREE_TYPE (r) is found null:
>
> case INDIRECT_REF:
>{
> tree r = RECUR (TREE_OPERAND (t, 0));
>
> if (REFERENCE_REF_P (t))
>   {
> /* A type conversion to reference type will be enclosed in
>such an indirect ref, but the substitution of the cast
>will have also added such an indirect ref.  */
> if (TREE_CODE (TREE_TYPE (r)) == REFERENCE_TYPE)
>   r = convert_from_reference (r);
>
> Since Jason's r226642, which fixed a serious issue with multiple Bugzillas,
> we have lookup_template_variable doing:
>
>   /* The type of the expression is NULL_TREE since the template-id could
> refer
>  to an explicit or partial specialization. */
>   tree type = NULL_TREE;
>
> instead of setting type = unknown_type_node and that ultimately leads to the
> ICE for the testcase at issue. Now, I found in a way funny that the body of
> convert_from_reference has:
>
> tree
> convert_from_reference (tree val)
> {
>   if (TREE_TYPE (val)
>   && TREE_CODE (TREE_TYPE (val)) == REFERENCE_TYPE)
>
> thus, if only somebody had cleaned up with no functionality change the above
> bits of tsubst_copy_and_build to unconditionally call convert_from_reference
> we would not have PR82085. Or at least we would not have it in this form,
> maybe we are only uncovering a much deeper issue?!? Anyway, the trivial
> patchlet below certainly passes testing on x86_64-linux. What do you think?
>
> Thanks, Paolo.
>
> /
>


Re: [RFC PATCH] Merge libsanitizer from upstream

2017-10-30 Thread Christophe Lyon

On 30/10/2017 11:12, Maxim Ostapenko wrote:

Hi,

sorry for the late response.

On 20/10/17 13:45, Christophe Lyon wrote:

Hi,

On 19 October 2017 at 13:17, Jakub Jelinek  wrote:

On Thu, Oct 19, 2017 at 02:07:24PM +0300, Maxim Ostapenko wrote:

Is the patch (the merge + this incremental) ok for trunk?

I think the patch is OK, just wondering about two things:

Richi just approved the patch on IRC, so I'll commit, then we can deal with
follow-ups.


Does anyone else run these tests on arm?
Since you applied this patch, I'm seeing lots of new errors and timeouts.
I have been ignoring regression reports for *san because of yyrandomness
in the results, but the timeouts are a  major inconvenience in testing
because it increases latency a lot in getting results, or worse I get no
result at all because the validation job is killed before completion.

Looking at some intermediate logs, I have noticed:
==24797==AddressSanitizer CHECK failed:
/libsanitizer/asan/asan_poisoning.cc:34
"((AddrIsAlignedByGranularity(addr))) != (0)" (0x0, 0x0)
 #0 0x408d7d65 in AsanCheckFailed /libsanitizer/asan/asan_rtl.cc:67
 #1 0x408ecd5d in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
/libsanitizer/sanitizer_common/sanitizer_termination.cc:77
 #2 0x408d22d5 in __asan::PoisonShadow(unsigned long, unsigned
long, unsigned char) /libsanitizer/asan/asan_poisoning.cc:34
 #3 0x4085409b in __asan_register_globals
/libsanitizer/asan/asan_globals.cc:368
 #4 0x109eb in _GLOBAL__sub_I_00099_1_ten
(/aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/obj-arm-none-linux-gnueabi/gcc3/gcc/testsuite/gcc/alloca_big_alignment.exe+0x109eb)

in MANY (193 in gcc) tests.

and many others (152 in gcc) just time out individually (eg
c-c++-common/asan/alloca_instruments_all_paddings.c) with no error in
the logs besides Dejagnu's
WARNING: program timed out.


Since I'm using an apparently unusual setup, maybe I have to update it
to cope with the new version,
so I'd like to know if others are seeing the same problems on arm?

I'm using qemu -R 0 to execute the test programs, encapsulated by
proot (similar to chroot, but does not require root privileges).

Am I missing something obvious?


I've caught the same error on my Arndale board. The issue seems to be quite 
obvious: after merge, ASan requires globals array to be aligned by shadow 
granularity.


Thanks for confirming. I've spent a lot of time investigating the timeout 
issues, that led to zombie processes and servers needing reboot. I've finally 
identified that going back to qemu-2.7 avoid the timeout issues (I've reported 
a qemu bug).


This trivial patch seems to fix the issue. Could you check it on your setup?


I was just about to finally start looking at this sanity check problem, so 
thank you very much for sharing your patch.
I manually tested it on the subset of my configs and it solves the assertion 
failure, thanks!
However, I can notice many regressions compared to before the merge:
c-c++-common/asan/alloca_instruments_all_paddings.c
c-c++-common/asan/alloca_loop_unpoisoning.c
c-c++-common/asan/alloca_safe_access.c
c-c++-common/asan/asan-interface-1.c
c-c++-common/asan/halt_on_error-1.c
c-c++-common/asan/pr59063-1.c
c-c++-common/asan/pr59063-2.c
c-c++-common/asan/pr63316.c
c-c++-common/asan/pr63888.c
c-c++-common/asan/pr70712.c
c-c++-common/asan/pr71480.c
c-c++-common/asan/pr79944.c
c-c++-common/asan/pr80308.c
c-c++-common/asan/swapcontext-test-1.c
gcc.dg/asan/nosanitize-and-inline.c
gcc.dg/asan/pr79196.c
gcc.dg/asan/pr80166.c
gcc.dg/asan/pr81186.c
gcc.dg/asan/use-after-scope-11.c
gcc.dg/asan/use-after-scope-4.c
gcc.dg/asan/use-after-scope-6.c
gcc.dg/asan/use-after-scope-7.c
gcc.dg/asan/use-after-scope-goto-1.c
gcc.dg/asan/use-after-scope-goto-2.c
gcc.dg/asan/use-after-scope-switch-1.c
gcc.dg/asan/use-after-scope-switch-2.c
gcc.dg/asan/use-after-scope-switch-3.c
gcc.dg/asan/use-after-scope-switch-4.c

out of which only
c-c++-common/asan/swapcontext-test-1.c
c-c++-common/asan/halt_on_error-1.c
print something in gcc.log

Do they pass for you?

Thanks,

Christophe


Thanks,
-Maxim



Thanks,

Christophe



1) We have a bunch of GCC local patches, did you include them into a
cumulative patch (I guess yes)?

I have done some verification today, diff from upstream r285547 to unpatched
GCC (with the LLVM Compiler infrastructure two liners removed), attached P1,
and diff from upstream r315899 to patched GCC (again, two liners removed),
attached P2 and went through the changes in P1 and verified that except for
the ubsan backwards compatibility we had that can't work anymore everything
else was upstreamed, or remained in P2.  So P2 is the current diff from
upstream, with the sanitizer_common/sanitizer_symbolizer_libbacktrace.cc
changes now filed upstream.


2) Upstream has enabled LSan for x86 and ARM, is it worth to enable them in
GCC too?

Maybe, feel free to post patches.  For LSan we need to split off lsan_preinit
out of liblsan and lin

Re: [v3 PATCH] Implement LWG 2485

2017-10-30 Thread Jonathan Wakely

On 29/10/17 12:46 +0200, Ville Voutilainen wrote:

On 29 October 2017 at 00:57, Ville Voutilainen
 wrote:

2017-10-29  Ville Voutilainen  

Implement LWG 2485



The full testsuite passes on Linux-PPC64. The debug mode tests for
array have been run manually
on Linux-x64.


Looks like there are a couple of Doxygen comments in  that
don't fit in 80 columns, OK for trunk with those fixed. Thanks.


One solution would be to replace the separate comments on the
different get overloads with a comment group:

/// Return a reference to the ith element of a tuple
/// @{
... define all four overloads
// @}

It seems fairly self-explanatory to me that if the tuple is a const
lvalue the returned reference is const lvalue reference, and if the
tuple is a non-const rvalue the returned tuple is ... blah blah.



Re: [PATCH 5/7] GCOV: std::vector refactoring.

2017-10-30 Thread Nathan Sidwell

On 10/26/2017 04:11 AM, marxin wrote:

gcc/ChangeLog:

2017-10-26  Martin Liska  

* gcov.c (struct source_info): Remove typedef.
(source_info::source_info): Add proper ctor.
(accumulate_line_counts): Use struct, not it's typedef.
(output_gcov_file): Likewise.
(output_lines): Likewise.
(main): Do not allocate an array.
(output_intermediate_file): Use size of vector container.
(process_file): Resize the vector.
(generate_results): Do not preallocate, use newly added vector
lines.
(release_structures): Do not release sources.
(find_source): Use vector methods.
(add_line_counts): Do not use typedef.


ok


--
Nathan Sidwell


Re: [PATCH 6/7] GCOV: Vector refactoring II

2017-10-30 Thread Nathan Sidwell

On 10/26/2017 04:11 AM, marxin wrote:

gcc/ChangeLog:

2017-10-26  Martin Liska  

* gcov.c (struct line_info): Remove it's typedef.
(line_info::line_info): Add proper ctor.
(line_info::has_block): Do not use a typedef.
(struct source_info): Do not use typedef.
(circuit): Likewise.
(get_cycles_count): Likewise.
(output_intermediate_file): Iterate via vector iterator.
(add_line_counts): Use std::vector methods.
(accumulate_line_counts): Likewise.
(output_lines): Likewise.


ok.


--
Nathan Sidwell


Re: [PATCH 7/7] GCOV: std::vector refactoring III

2017-10-30 Thread Nathan Sidwell

On 10/26/2017 04:11 AM, marxin wrote:

gcc/ChangeLog:

2017-10-26  Martin Liska  

* gcov.c (struct name_map): do not use typedef.
Define operator== and operator<.
(name_search): Remove.
(name_sort): Remove.
(main): Do not allocate names.
(process_file): Add vertical space.
(generate_results): Use std::find.
(release_structures): Do not release memory.
(find_source): Use std::find.


You're giving it operators, so perhaps rename it to class but with 
public accessors?


Ok with/without that change.

nathan

--
Nathan Sidwell


Re: [Patch, fortran] PR80850 - Sourced allocate() fails to allocate a pointer

2017-10-30 Thread Paul Richard Thomas
Hi Andre,

You didn't cause the bug. That was generated by neglect of the correct
transfer of actual to formal in gfc_conv_procedure_call. Your _len
check hid the bug :-) We presumably have been have been allocating
random multiples of the space required when passing class objects to
unlimited dummies. The bug only emerged in Dmitry's code because of
the heavy use of allocated memory arising exactly from such usage.
Quite simply the system could not allocate any more memory.

Cheers and thanks for the OK.

Paul

On 30 October 2017 at 13:39, Andre Vehreschild  wrote:
> Hi Paul,
>
> whoopsie, I remember that I inserted the check for _len > 0 in allocate(). So
> it was me causing the bug. Thanks that you found it. The patch looks good to
> me. Thanks for the work.
>
> - Andre
>
> On Mon, 30 Oct 2017 12:20:20 +
> Paul Richard Thomas  wrote:
>
>> Dear All,
>>
>> This bug took a silly amount of effort to diagnose but once done, the
>> fix was obvious.
>>
>> The bug is triggered in this function from the reporter's source file
>> gfc_graph.F90:
>>
>> function GraphIterAppendVertex(this,vertex) result(ierr)
>> !Appends a new vertex to the graph.
>>  implicit none
>>  integer(INTD):: ierr   !out: error code
>>  class(graph_iter_t), intent(inout):: this  !inout:
>> graph iterator
>>  class(graph_vertex_t), intent(in), target:: vertex !in: new vertex
>>  type(vert_link_refs_t):: vlr
>>
>>  ierr=this%vert_it%append(vertex) !<= right here!
>> snip
>>  return
>> end function GraphIterAppendVertex
>>
>> 'vertex' is being passed to a class(*) dummy. As you will see from the
>> attached patch and the ChangeLog, 'vertex' was being cast as unlimited
>> polymorphic and assigned to the passed actual argument. This left the
>> _len field indeterminate since it is not present in normal (limited?)
>> polymorphic objects.
>>
>> Further down the way, in stsubs.F90(clone_object) an allocation is
>> being made using the unlimited version of 'vertex as a source. Since
>> the size passed to malloc for an unlimited source is, for  _len > 0,
>> the value of the _len multiplied by the vtable _size, the amount of
>> memory is also indeterminate and causes the operating system to flag a
>> failed allocation, pretty much at random.
>>
>> The ChangeLog and the patch describe the fix sufficiently well as not
>> to require further explanation. I will write a testcase that tests the
>> tree dump for the _len field before committing the patch.
>>
>> Bootstraps and regtests on FC23/x86_64 - OK for 7- and 8-branches?
>>
>> If I do not receive any contrary comments, I will commit tonight.
>>
>> Regards
>>
>> Paul
>>
>> 2017-10-30  Paul Thomas  
>>
>> PR fortran/80850
>> * trans_expr.c (gfc_conv_procedure_call): When passing a class
>> argument to an unlimited polymorphic dummy, it is wrong to cast
>> the passed expression as unlimited, unless it is unlimited. The
>> correct way is to assign to each of the fields and set the _len
>> field to zero.
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [patch] Remove old kludge in gcc.c

2017-10-30 Thread Jeff Law
On 10/30/2017 04:30 AM, Eric Botcazou wrote:
> Hi,
> 
> these lines in gcc.c date back to 2001:
> 
> /* By default there is no special suffix for target executables.  */
> /* FIXME: when autoconf is fixed, remove the host check - dj */
> #if defined(TARGET_EXECUTABLE_SUFFIX) && defined(HOST_EXECUTABLE_SUFFIX)
> #define HAVE_TARGET_EXECUTABLE_SUFFIX
> #endif
> 
> HOST_EXECUTABLE_SUFFIX and TARGET_EXECUTABLE_SUFFIX are defined only for VMS 
> and DOS/Windows.  The effect is that ".exe" is forced on executables for 
> these 
> targets only if they are built as native and not if they are built as cross.
> 
> OK for the mainline?
> 
> 
> 2017-10-30  Eric Botcazou  
> 
>   * gcc.c (HAVE_TARGET_EXECUTABLE_SUFFIX): Remove old kludge.
> 
OK.

Jeff


Re: Fix basic block reordering glitches

2017-10-30 Thread Jeff Law
On 10/26/2017 02:55 PM, Eric Botcazou wrote:
> The attached Ada testcase triggers an ICE at -O3 when compiled for x86:
> 
> eric@polaris:~/build/gcc/native> gcc/xgcc -Bprev-gcc -S opt68.adb -O3 -m32
> opt68.adb: In function 'Opt68.Copy':
> opt68.adb:51:6: error: multiple hot/cold transitions found (bb 34)
> opt68.adb:51:6: error: multiple hot/cold transitions found (bb 64)
> +===GNAT BUG DETECTED==+
> | 8.0.0 20171026 (experimental) [trunk revision 254096] (x86_64-suse-linux) 
> GCC error:|
> | verify_flow_info failed  |
> 
> It's the RTL basic block reordering pass confusing itself, more precisely:
> 
> /* If the best destination has multiple predecessors, and can be
>duplicated cheaper than a jump, don't allow it to be added
>to a trace.  We'll duplicate it when connecting traces.  */
> if (best_edge && EDGE_COUNT (best_edge->dest->preds) >= 2
> && copy_bb_p (best_edge->dest, 0))
>   best_edge = NULL;
> 
> Now, if you're in the first round (reordering of the hot paritition) and the 
> basic block has 2 predecessors, a regular one (corresponding to best_edge) 
> and 
> another coming from the cold partition (i.e. EDGE_CROSSING), the duplication 
> will redirect best_edge and leave the block with only one predecessor coming 
> from the cold partition, which means that the block will eventually become 
> cold at the end of the pass if fixup_partitions is invoked.  But that's too 
> late since
> 
>   /* Signal that rtl_verify_flow_info_1 can now verify that there
>  is at most one switch between hot/cold sections.  */
>   crtl->bb_reorder_complete = true;
> 
> So we would need to re-run the pass in order to move the block from the hot 
> partition to the cold partition (re-running only a subpass wouldn't work 
> since 
> the above decision is made by find_traces but the duplication is triggered by 
> connect_traces), which seems overkill.
> 
> Therefore the attached patch detects this problematic case and doesn't reset 
> best_edge upon finding it.  This means that the destination will be added to 
> the current trace in the hot partition, which looks OK since all its other 
> predecessors are in the cold partition.  It also contains a fix for an off-by-
> one bug in the dump file and removes a redundant check from better_edge_p 
> (the 
> same check on the BB_PARTITION of the blocks is done in find_traces_1_round).
> 
> Bootstrapped/regtested on x86_64-suse-linux.  Any objection?
> 
> 
> 2017-10-26  Eric Botcazou  
> 
>   * bb-reorder.c (find_traces_1_round): Fix off-by-one index.
>   Move around comment.  Do not reset best_edge for a copiable
>   destination if the copy would cause a partition change.
>   (better_edge_p): Remove redundant check.
> 
> 
> 2017-10-26  Eric Botcazou  
> 
>   * gnat.dg/opt68.ad[sb]: New test.
OK.
jeff


Re: [07/nn] Add unique CONSTs

2017-10-30 Thread Jeff Law
On 10/27/2017 09:56 AM, Richard Sandiford wrote:
> Jeff Law  writes:
>> On 10/23/2017 05:21 AM, Richard Sandiford wrote:
>>> This patch adds a way of treating certain kinds of CONST as unique,
>>> so that pointer equality is equivalent to value equality.  For now it
>>> is restricted to VEC_DUPLICATE and VEC_SERIES, although the code to
>>> generate them remains in the else arm of an "if (1)" until a later
>>> patch.
>>>
>>> This is needed so that (const (vec_duplicate xx)) can used as the
>>> CONSTxx_RTX of a variable-length vector.
>> You're brave :-)  I know we looked at making CONST_INTs behave in this
>> manner eons ago in an effort to reduce memory consumption and it was
>> just plain painful.   There may still be comments from that project
>> littering the source code.
>>
>> I do wonder if we might want to revisit this again as we have better
>> infrastructure in place.
> 
> For vectors it isn't so bad, since we already do the same thing
> for CONST_VECTOR.  Fortunately CONST_VECTOR and CONST always have
> a mode, so there's no awkward sharing between modes...
> 
>>> 2017-10-23  Richard Sandiford  
>>> Alan Hayward  
>>> David Sherwood  
>>>
>>> gcc/
>>> * rtl.h (unique_const_p): New function.
>>> (gen_rtx_CONST): Declare.
>>> * emit-rtl.c (const_hasher): New struct.
>>> (const_htab): New variable.
>>> (init_emit_once): Initialize it.
>>> (const_hasher::hash, const_hasher::equal): New functions.
>>> (gen_rtx_CONST): New function.
>>> (spare_vec_duplicate, spare_vec_series): New variables.
>>> (gen_const_vec_duplicate_1): Add code for use (const (vec_duplicate)),
>>> but disable it for now.
>>> (gen_const_vec_series): Likewise (const (vec_series)).
>>> * gengenrtl.c (special_rtx): Return true for CONST.
>>> * rtl.c (shared_const_p): Return true if unique_const_p.
>> ISTM that you need an update the rtl.texi's structure sharing
>> assumptions section to describe the new rules around CONSTs.
> 
> Oops, yeah.  How about the attached?
OK.

> 
>> So what's the purpose of the sparc_vec_* stuff that you're going to use
>> in the future?  It looks like a single element cache to me.Am I
>> missing something?
> 
> No, that's right.  When looking up the const for (vec_duplicate x), say,
> it's easier to create the vec_duplicate rtx first.  But if the lookup
> succeeds (and so we already have an rtx with that value), we keep the
> discarded vec_duplicate around so that we can reuse it for the next
> lookup.
OK.

Jeff


Re: [PATCH 2/7] GCOV: introduce usage of terminal colors.

2017-10-30 Thread David Malcolm
On Mon, 2017-10-30 at 08:17 -0400, Nathan Sidwell wrote:
> On 10/26/2017 04:11 AM, marxin wrote:
> > I consider using colors in context of gcov as very useful. There's
> > example for tramp3d:
> > https://pste.eu/p/Tl2D.html
> 
> nice!
> 
> > gcc/ChangeLog:
> > 
> > 2017-10-23  Martin Liska  
> > 
> > * color-macros.h: New file.
> > * diagnostic-color.c: Factor out color related to macros to
> > color-macros.h.
> > * doc/gcov.texi: Document -k option.
> > * gcov.c (INCLUDE_STRING): Include string.h.
> > (print_usage): Add -k option.
> > (process_args): Parse it.
> > (pad_count_string): New function.
> > (output_line_beginning): Likewise.
> > (DEFAULT_LINE_START): New macro.
> > (output_lines): Support color output.
> 
> The gcov changes are ok.  I guess David has the review ball for the 
> diagnostic refactoring?
> 
> nathan

The comments beginning:
+/* Select Graphic Rendition (SGR, "\33[...m") strings.  */

and ending:
  It would be impractical for GCC to become a full-fledged
  terminal program linked against ncurses or the like, so it will
  not detect terminfo(5) capabilities.  */

are still in diagnostic-color.c after your patch, but they are
describing the macros, and in particular, if I'm reading them right,
are a rationale for why SGR_END contains a "\33[K".

Hence I think that those comments should also be moved to color-
macros.h.

Other than that the diagnostic changes mostly look good to me, but the
color-macros.h has:
+   Copyright (C) 2017 Free Software Foundation, Inc.

Shouldn't that copyright line express the full range of years for the
existing content that's being moved from diagnostic-color.c?

The macros there were introduced by the creation of diagnostic-color.c
in r197842 in April 2013 (aka dc604d41825b3cbd09045baeef09b1b88fc5a02),
which had the copyright line:

+   Copyright 2011-2013 Free Software Foundation, Inc.


FWIW, the macros seem to come from this patch by Manu:
"RFC: color diagnostics markers"
  https://gcc.gnu.org/ml/gcc-patches/2013-03/msg01365.html

where the code in question has:

/* Based on code from: */
+/* grep.c - main driver file for grep.
+   Copyright (C) 1992, 1997-2002, 2004-2013 Free Software Foundation,
Inc.

though as far as I can tell the only material taken directly from
"grep" are the comments I mentioned above beginning i.e. it appears to
me that the actual macros in the new file were written by Manu in 2013.

The comments mentioned above come from GNU grep; looking at the history
in grep's git repo shows it comes from commit
56623b5129d487cb6673fa5a582d094edc1fe983:

2005-06-20  Charles Levert  

   * src/grep.c: Extensively document the SGR/EL-to-Right issue.

Hence I believe the color-macros.h copyright line should range from 
2005-2017 if you move the comments (please do), or from 2013-2017 as-
is.

Diagnostic refactoring is OK otherwise.

Dave


[PATCH] Fix PR82765

2017-10-30 Thread Richard Biener

The following fixes an ICE in constant pool hashtable functions
for overly large (__int128) array indices.  The function doesn't
try to be correct in any way so simply truncate to HWI.

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok?

Thanks,
Richard.

2017-10-30  Richard Biener  

PR middle-end/82765
* varasm.c (decode_addr_const): Make offset HOST_WIDE_INT.
Truncate ARRAY_REF index and element size.

* gcc.dg/pr82765.c: New testcase.

Index: gcc/varasm.c
===
--- gcc/varasm.c(revision 254211)
+++ gcc/varasm.c(working copy)
@@ -2879,7 +2879,7 @@ static void
 decode_addr_const (tree exp, struct addr_const *value)
 {
   tree target = TREE_OPERAND (exp, 0);
-  int offset = 0;
+  HOST_WIDE_INT offset = 0;
   rtx x;
 
   while (1)
@@ -2893,8 +2893,9 @@ decode_addr_const (tree exp, struct addr
   else if (TREE_CODE (target) == ARRAY_REF
   || TREE_CODE (target) == ARRAY_RANGE_REF)
{
- offset += (tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (target)))
-* tree_to_shwi (TREE_OPERAND (target, 1)));
+ /* Truncate big offset.  */
+ offset += (TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (target)))
+* TREE_INT_CST_LOW (TREE_OPERAND (target, 1)));
  target = TREE_OPERAND (target, 0);
}
   else if (TREE_CODE (target) == MEM_REF
Index: gcc/testsuite/gcc.dg/pr82765.c
===
--- gcc/testsuite/gcc.dg/pr82765.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/pr82765.c  (working copy)
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -w" } */
+
+int a[1][1];
+int main() { int *b[] = {a, a[1820408606019012862278468], a, a, a}; }


[PATCH] Minor header reorganization for unordered containers

2017-10-30 Thread Jonathan Wakely

In the spirit of "include what you use" (and because I couldn't find
why  and  include  when they
don't use it ... it's because  uses it).

* include/bits/hashtable_policy.h: Include .
* include/std/unordered_map: Only include  instead
of  and .
* include/std/unordered_set: Likewise.

Tested powerpc64le-linux, committed to trunk.

commit 58390a166cfda1cb7db2ccc65c6515f3d48892ec
Author: Jonathan Wakely 
Date:   Mon Oct 30 14:40:51 2017 +

Minor header reorganization for unordered containers

* include/bits/hashtable_policy.h: Include .
* include/std/unordered_map: Only include  instead
of  and .
* include/std/unordered_set: Likewise.

diff --git a/libstdc++-v3/include/bits/hashtable_policy.h 
b/libstdc++-v3/include/bits/hashtable_policy.h
index da6d49c96c4..e19f92abd76 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -31,7 +31,9 @@
 #ifndef _HASHTABLE_POLICY_H
 #define _HASHTABLE_POLICY_H 1
 
-#include  // for std::min.
+#include// for std::tuple, std::forward_as_tuple
+#include  // for std::uint_fast64_t
+#include  // for std::min.
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/std/unordered_map 
b/libstdc++-v3/include/std/unordered_map
index 2cdcd377936..80773235b65 100644
--- a/libstdc++-v3/include/std/unordered_map
+++ b/libstdc++-v3/include/std/unordered_map
@@ -35,13 +35,12 @@
 # include 
 #else
 
-#include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
+#include 
 #include  // equal_to, _Identity, _Select1st
 #include 
 #include 
diff --git a/libstdc++-v3/include/std/unordered_set 
b/libstdc++-v3/include/std/unordered_set
index 2646c0f2f00..faf7ebe23ae 100644
--- a/libstdc++-v3/include/std/unordered_set
+++ b/libstdc++-v3/include/std/unordered_set
@@ -35,13 +35,12 @@
 # include 
 #else
 
-#include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
+#include 
 #include  // equal_to, _Identity, _Select1st
 #include 
 #include 


[PATCH] Minor tweak to libstdc++ FAQ

2017-10-30 Thread Jonathan Wakely

The C++17 standard renumbers the clauses, and I got annoyed again by
somebody writing "stdlibc++" ;-)

* doc/xml/faq.xml: Adjust "What is libstdc++?" answer slightly.

Committed to trunk.

commit e9f2efc7ae2ead47cebfefe3707e6bd73448f65e
Author: Jonathan Wakely 
Date:   Mon Oct 30 14:56:01 2017 +

Minor tweak to libstdc++ FAQ

* doc/xml/faq.xml: Adjust "What is libstdc++?" answer slightly.

diff --git a/libstdc++-v3/doc/xml/faq.xml b/libstdc++-v3/doc/xml/faq.xml
index 703ade5b20d..1a21e48007a 100644
--- a/libstdc++-v3/doc/xml/faq.xml
+++ b/libstdc++-v3/doc/xml/faq.xml
@@ -30,13 +30,18 @@
   
 
  The GNU Standard C++ Library v3 is an ongoing project to
- implement the ISO 14882 Standard C++ library as described in
- clauses 17 through 30 and annex D.  For those who want to see
+ implement the ISO 14882 C++ Standard Library as described in
+ clauses 20 through 33 and annex D (prior to the 2017 standard
+ the library clauses started with 17).  For those who want to see
  exactly how far the project has come, or just want the latest
  bleeding-edge code, the up-to-date source is available over
- anonymous SVN, and can be browsed over
- the http://www.w3.org/1999/xlink"; 
xlink:href="https://gcc.gnu.org/svn.html";>web.
+ anonymous SVN, and can be browsed over the
+ http://www.w3.org/1999/xlink"; 
xlink:href="https://gcc.gnu.org/svn.html";>web.
  
+
+
+N.B. The library is called libstdc++ not stdlibc++.
+
   
 
 


Re: [14/nn] Add helpers for shift count modes

2017-10-30 Thread Jeff Law
On 10/26/2017 06:06 AM, Richard Biener wrote:
> On Mon, Oct 23, 2017 at 1:25 PM, Richard Sandiford
>  wrote:
>> This patch adds a stub helper routine to provide the mode
>> of a scalar shift amount, given the mode of the values
>> being shifted.
>>
>> One long-standing problem has been to decide what this mode
>> should be for arbitrary rtxes (as opposed to those directly
>> tied to a target pattern).  Is it the mode of the shifted
>> elements?  Is it word_mode?  Or maybe QImode?  Is it whatever
>> the corresponding target pattern says?  (In which case what
>> should the mode be when the target doesn't have a pattern?)
>>
>> For now the patch picks word_mode, which should be safe on
>> all targets but could perhaps become suboptimal if the helper
>> routine is used more often than it is in this patch.  As it
>> stands the patch does not change the generated code.
>>
>> The patch also adds a helper function that constructs rtxes
>> for constant shift amounts, again given the mode of the value
>> being shifted.  As well as helping with the SVE patches, this
>> is one step towards allowing CONST_INTs to have a real mode.
> 
> I think gen_shift_amount_mode is flawed and while encapsulating
> constant shift amount RTX generation into a gen_int_shift_amount
> looks good to me I'd rather have that ??? in this function (and
> I'd use the mode of the RTX shifted, not word_mode...).
> 
> In the end it's up to insn recognizing to convert the op to the
> expected mode and for generic RTL it's us that should decide
> on the mode -- on GENERIC the shift amount has to be an
> integer so why not simply use a mode that is large enough to
> make the constant fit?
> 
> Just throwing in some comments here, RTL isn't my primary
> expertise.
I wonder if encapsulation + a target hook to specify the mode would be
better?  We'd then have to argue over word_mode, vs QImode vs something
else for the default, but at least we'd have a way for the target to
specify the mode is generally best when working on shift counts.

In the end I doubt there's a single definition that is overall better.
Largely because I suspect there are times when the narrowest mode is
best, or the mode of the operand being shifted.

So thoughts on doing the encapsulation with a target hook to specify the
desired mode?  Does that get us what we need for SVE and does it provide
us a path forward on this issue if we were to try to move towards
CONST_INTs with modes?

jeff



Re: [11/nn] Add narrower_subreg_mode helper function

2017-10-30 Thread Jeff Law
On 10/23/2017 05:24 AM, Richard Sandiford wrote:
> This patch adds a narrowing equivalent of wider_subreg_mode.  At present
> there is only one user.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * rtl.h (narrower_subreg_mode): New function.
>   * ira-color.c (update_costs_from_allocno): Use it.
OK.  I'm going to assume further uses will show up :-)

jeff


Re: [PATCH] Change default to -fno-math-errno

2017-10-30 Thread Wilco Dijkstra
Richard Biener wrote:
> Should we also get the __ieee764_ entries used if the compiler sets
> __NO_MATH_ERRNO__?  That is, if the librari advertises not setting errno
> via math_errhandling is it still allowed to set/clobber errno anyways?

That's a good question! I checked and the math wrappers currently use
__FINITE_MATH_ONLY__ to decide whether to use the IEEE754 version
or the wrapper which sets errno. However that's actually incorrect given
that the only purpose of the wrappers is to set errno, not to add IEEE
handling (which is done in the math implementation).

Note -ffinite-math-only appears to imply -fno-math-errno, eventhough
this is not clear from the documentation - is that expected? In principle
these options are orthogonal, setting errno due to a range error is still
valid even if you don't use infinities or NaNs.

Wilco


Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Martin Sebor

On 10/30/2017 05:45 AM, Richard Biener wrote:

On Sun, 29 Oct 2017, Martin Sebor wrote:


In my work on -Wrestrict, to issue meaningful warnings, I found
it important to detect both out of bounds array indices as well
as offsets in calls to restrict-qualified functions like strcpy.
GCC already detects some of these cases but my tests for
the enhanced warning exposed a few gaps.

The attached patch enhances -Warray-bounds to detect more instances
out-of-bounds indices and offsets to member arrays and non-array
members.  For example, it detects the out-of-bounds offset in the
call to strcpy below.

The patch is meant to be applied on top posted here but not yet
committed:
   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

Richard, since this also touches tree-vrp.c I look for your comments.


You fail to tell what you are changing and why - I have to reverse
engineer this from the patch which a) isn't easy in this case, b) feels
like a waste of time.  Esp. since the patch does many things.

My first question is why do you add a warning from forwprop?  It
_feels_ like you're trying to warn about arbitrary out-of-bound
addresses at the point they are folded to MEM_REFs.  And it looks
like you're warning about pointer arithmetic like &p->a + 6.
That doesn't look correct to me.  Pointer arithmetic in GIMPLE
is not restricted to operate within fields that are appearantly
accessed here - the only restriction is with respect to the
whole underlying pointed-to-object.

By doing the warning from forwprop you'll run into all such cases
introduced by GCC itself during quite late optimization passes.


I haven't run into any such cases.  What would a more appropriate
place to detect out-of-bounds offsets?  I'm having a hard time
distinguishing what is appropriate and what isn't.  For instance,
if it's okay to detect some out of bounds offsets/indices in vrp
why is it wrong to do a better job of it in forwprop?



You're trying to re-do __builtin_object_size even when that wasn't
used.


That's not the quite my intent, although it is close.



So it looks like you're on the wrong track.  Yes,

  strcpy (p->a + 6, "y");

_may_ be "invalid" C (I'm not even sure about that!) but it
is certainly not invalid GIMPLE.


Adding (or subtracting) an integer to/from a pointer to an array
is defined in both C and C++ only if the result points to an element
of the array or just past the last element of the array.  Otherwise
it's undefined. (A non-array object is considered an array of one
for this purpose.)

Martin



Richard.


Jeff, this is the enhancement you were interested in when we spoke
last week.

Thanks
Martin

$ cat a.c && gcc -O2 -S -Wall a.c
   struct A { char a[4]; void (*pf)(void); };

   void f (struct A *p)
   {
 p->a[5] = 'x';// existing -Warray-bounds

 strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
   }

   a.c: In function ‘f’:
   a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’ [-Warray-bounds]
strcpy (p->a + 6, "y");
^~
   a.c:1:17: note: member declared here
struct A { char a[4]; void (*pf)(void); };
^
   a.c:5:7: warning: array subscript 5 is above array bounds of ‘char[4]’
[-Warray-bounds]
  p->a[5] = 'x';
  ^~~


Re: [RFC PATCH] Merge libsanitizer from upstream

2017-10-30 Thread Maxim Ostapenko

On 30/10/17 17:08, Christophe Lyon wrote:

On 30/10/2017 11:12, Maxim Ostapenko wrote:

Hi,

sorry for the late response.

On 20/10/17 13:45, Christophe Lyon wrote:

Hi,

On 19 October 2017 at 13:17, Jakub Jelinek  wrote:

On Thu, Oct 19, 2017 at 02:07:24PM +0300, Maxim Ostapenko wrote:

Is the patch (the merge + this incremental) ok for trunk?

I think the patch is OK, just wondering about two things:
Richi just approved the patch on IRC, so I'll commit, then we can 
deal with

follow-ups.


Does anyone else run these tests on arm?
Since you applied this patch, I'm seeing lots of new errors and 
timeouts.
I have been ignoring regression reports for *san because of 
yyrandomness

in the results, but the timeouts are a  major inconvenience in testing
because it increases latency a lot in getting results, or worse I 
get no

result at all because the validation job is killed before completion.

Looking at some intermediate logs, I have noticed:
==24797==AddressSanitizer CHECK failed:
/libsanitizer/asan/asan_poisoning.cc:34
"((AddrIsAlignedByGranularity(addr))) != (0)" (0x0, 0x0)
 #0 0x408d7d65 in AsanCheckFailed /libsanitizer/asan/asan_rtl.cc:67
 #1 0x408ecd5d in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
/libsanitizer/sanitizer_common/sanitizer_termination.cc:77
 #2 0x408d22d5 in __asan::PoisonShadow(unsigned long, unsigned
long, unsigned char) /libsanitizer/asan/asan_poisoning.cc:34
 #3 0x4085409b in __asan_register_globals
/libsanitizer/asan/asan_globals.cc:368
 #4 0x109eb in _GLOBAL__sub_I_00099_1_ten
(/aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/obj-arm-none-linux-gnueabi/gcc3/gcc/testsuite/gcc/alloca_big_alignment.exe+0x109eb) 



in MANY (193 in gcc) tests.

and many others (152 in gcc) just time out individually (eg
c-c++-common/asan/alloca_instruments_all_paddings.c) with no error in
the logs besides Dejagnu's
WARNING: program timed out.


Since I'm using an apparently unusual setup, maybe I have to update it
to cope with the new version,
so I'd like to know if others are seeing the same problems on arm?

I'm using qemu -R 0 to execute the test programs, encapsulated by
proot (similar to chroot, but does not require root privileges).

Am I missing something obvious?


I've caught the same error on my Arndale board. The issue seems to be 
quite obvious: after merge, ASan requires globals array to be aligned 
by shadow granularity.


Thanks for confirming. I've spent a lot of time investigating the 
timeout issues, that led to zombie processes and servers needing 
reboot. I've finally identified that going back to qemu-2.7 avoid the 
timeout issues (I've reported a qemu bug).


This trivial patch seems to fix the issue. Could you check it on your 
setup?


I was just about to finally start looking at this sanity check 
problem, so thank you very much for sharing your patch.
I manually tested it on the subset of my configs and it solves the 
assertion failure, thanks!

However, I can notice many regressions compared to before the merge:
c-c++-common/asan/alloca_instruments_all_paddings.c
c-c++-common/asan/alloca_loop_unpoisoning.c
c-c++-common/asan/alloca_safe_access.c
c-c++-common/asan/asan-interface-1.c
c-c++-common/asan/halt_on_error-1.c
c-c++-common/asan/pr59063-1.c
c-c++-common/asan/pr59063-2.c
c-c++-common/asan/pr63316.c
c-c++-common/asan/pr63888.c
c-c++-common/asan/pr70712.c
c-c++-common/asan/pr71480.c
c-c++-common/asan/pr79944.c
c-c++-common/asan/pr80308.c
c-c++-common/asan/swapcontext-test-1.c
gcc.dg/asan/nosanitize-and-inline.c
gcc.dg/asan/pr79196.c
gcc.dg/asan/pr80166.c
gcc.dg/asan/pr81186.c
gcc.dg/asan/use-after-scope-11.c
gcc.dg/asan/use-after-scope-4.c
gcc.dg/asan/use-after-scope-6.c
gcc.dg/asan/use-after-scope-7.c
gcc.dg/asan/use-after-scope-goto-1.c
gcc.dg/asan/use-after-scope-goto-2.c
gcc.dg/asan/use-after-scope-switch-1.c
gcc.dg/asan/use-after-scope-switch-2.c
gcc.dg/asan/use-after-scope-switch-3.c
gcc.dg/asan/use-after-scope-switch-4.c

out of which only
c-c++-common/asan/swapcontext-test-1.c
c-c++-common/asan/halt_on_error-1.c
print something in gcc.log

Do they pass for you?


Ah, I see. The problem is that after this merge LSan was enabled for 
ARM. LSan sets atexit handler that calls internal_clone function that's 
not supported in QEMU.

That's why these tests pass on board, but fail under QEMU.
Could you try set ASAN_OPTIONS=detect_leaks=0 in your environment?

-Maxim



Thanks,

Christophe


Thanks,
-Maxim



Thanks,

Christophe



1) We have a bunch of GCC local patches, did you include them into a
cumulative patch (I guess yes)?
I have done some verification today, diff from upstream r285547 to 
unpatched
GCC (with the LLVM Compiler infrastructure two liners removed), 
attached P1,
and diff from upstream r315899 to patched GCC (again, two liners 
removed),
attached P2 and went through the changes in P1 and verified that 
except for
the ubsan backwards compatibility we had that can't work anymore 
everythin

Re: RFC: why do our std::ofsteam functions add ios_base::trunc ?

2017-10-30 Thread Jonathan Wakely

On 24/10/17 19:25 +0100, Jonathan Wakely wrote:

While adding the new C++17 overloads taking filesystem paths to
 I noticed that ofstream uses out|trunc for the default
arguments instead of out as the stadnard says. The effect is the same,
but my preference would be to do what the standard says. Is there a
good reason for deviating from what the standard says?

Although the out|trunc default arguments have no difference in
behaviour from just using out, the comments saying

"Calls @c std::basic_filebuf::open(__s,__mode|out|trunc)"

are definitely wrong. That would not be a conforming behaviour.

Also, I think we can remove the comments saying to use c_str() as
fstreams accept std::string arguments since C++11.

Any comments?


I've removed these redundant bits from the default arguments, and from
the misleading comments.

Tested powerpc64le-linux, committed to trunk.

commit 892871f7c6259919aecfb197ea69c013d9c19cb2
Author: Jonathan Wakely 
Date:   Mon Oct 30 15:00:49 2017 +

Remove ios_mode::trunc from basic_ofstream openmode arguments

* include/std/fstream (basic_ifstream, basic_ofstream, basic_fstream):
Remove outdated comments about calling c_str() to create a file stream
from a std::string.
(basic_ofstream::basic_ofstream, basic_ofstream::open): Remove
redundant ios_mode::trunc bits from default arguments and comments.

diff --git a/libstdc++-v3/include/std/fstream b/libstdc++-v3/include/std/fstream
index 3205f81fb47..a3324c004d7 100644
--- a/libstdc++-v3/include/std/fstream
+++ b/libstdc++-v3/include/std/fstream
@@ -506,9 +506,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @param  __mode  Open file in specified mode (see std::ios_base).
*
*  @c ios_base::in is automatically included in @a __mode.
-   *
-   *  Tip:  When using std::string to hold the filename, you must use
-   *  .c_str() before passing it to this constructor.
*/
   explicit
   basic_ifstream(const char* __s, ios_base::openmode __mode = ios_base::in)
@@ -622,9 +619,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*
*  Calls @c std::basic_filebuf::open(s,__mode|in).  If that function
*  fails, @c failbit is set in the stream's error state.
-   *
-   *  Tip:  When using std::string to hold the filename, you must use
-   *  .c_str() before passing it to this constructor.
*/
   void
   open(const char* __s, ios_base::openmode __mode = ios_base::in)
@@ -738,15 +732,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @param  __s  Null terminated string specifying the filename.
*  @param  __mode  Open file in specified mode (see std::ios_base).
*
-   *  @c ios_base::out | @c ios_base::trunc is automatically included in
-   *  @a __mode.
-   *
-   *  Tip:  When using std::string to hold the filename, you must use
-   *  .c_str() before passing it to this constructor.
+   *  @c ios_base::out is automatically included in @a __mode.
*/
   explicit
   basic_ofstream(const char* __s,
-		 ios_base::openmode __mode = ios_base::out|ios_base::trunc)
+		 ios_base::openmode __mode = ios_base::out)
   : __ostream_type(), _M_filebuf()
   {
 	this->init(&_M_filebuf);
@@ -759,12 +749,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @param  __s  std::string specifying the filename.
*  @param  __mode  Open file in specified mode (see std::ios_base).
*
-   *  @c ios_base::out | @c ios_base::trunc is automatically included in
-   *  @a __mode.
+   *  @c ios_base::out is automatically included in @a __mode.
*/
   explicit
   basic_ofstream(const std::string& __s,
-		 ios_base::openmode __mode = ios_base::out|ios_base::trunc)
+		 ios_base::openmode __mode = ios_base::out)
   : __ostream_type(), _M_filebuf()
   {
 	this->init(&_M_filebuf);
@@ -777,13 +766,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @param  __s  filesystem::path specifying the filename.
*  @param  __mode  Open file in specified mode (see std::ios_base).
*
-   *  @c ios_base::out | @c ios_base::trunc is automatically included in
-   *  @a __mode.
+   *  @c ios_base::out is automatically included in @a __mode.
*/
   template>>
-	basic_ofstream(const _Path& __s, ios_base::openmode __mode
-		   = ios_base::out|ios_base::trunc)
+	basic_ofstream(const _Path& __s,
+		   ios_base::openmode __mode = ios_base::out)
 	: basic_ofstream(__s.c_str(), __mode)
 	{ }
 #endif // C++17
@@ -857,15 +845,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @param  __s  The name of the file.
*  @param  __mode  The open mode flags.
*
-   *  Calls @c std::basic_filebuf::open(__s,__mode|out|trunc).  If that
+   *  Calls @c std::basic_filebuf::open(__s,__mode|out).  If that
*  function fails, @c failbit is set in the stream's error state.
-   *
-   *

Re: [PATCH] Change default to -fno-math-errno

2017-10-30 Thread Joseph Myers
On Mon, 30 Oct 2017, Wilco Dijkstra wrote:

> Richard Biener wrote:
> > Should we also get the __ieee764_ entries used if the compiler sets
> > __NO_MATH_ERRNO__?  That is, if the librari advertises not setting errno
> > via math_errhandling is it still allowed to set/clobber errno anyways?
> 
> That's a good question! I checked and the math wrappers currently use
> __FINITE_MATH_ONLY__ to decide whether to use the IEEE754 version
> or the wrapper which sets errno. However that's actually incorrect given
> that the only purpose of the wrappers is to set errno, not to add IEEE
> handling (which is done in the math implementation).

The semantics of __*_finite definitely include finite-math-only, as they 
aren't all just disabling the wrappers (e.g. sysdeps/i386/i686/fpu/e_log.S 
has __log_finite separate from __ieee754_log).

> Note -ffinite-math-only appears to imply -fno-math-errno, eventhough
> this is not clear from the documentation - is that expected? In principle
> these options are orthogonal, setting errno due to a range error is still
> valid even if you don't use infinities or NaNs.

Well, it's expected for glibc (as an empirical observation about what the 
__*_finite entry points do).  It may not necessarily be an implication for 
GCC.  And given appropriate documentation, it doesn't necessarily 
contradict math_errhandling & MATH_ERRNO being nonzero (in that errno 
setting on underflow is implementation-defined in that case - glibc 
documents semantics involving setting errno on underflow to 0 but not 
necessarily otherwise, but could document the effects of 
-ffinite-math-only potentially disabling the errno setting on underflow to 
0).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] fix cygwin builds

2017-10-30 Thread Jim Wilson
On Mon, 2017-10-30 at 08:07 +0100, Andreas Schwab wrote:
> This broke ia64:
> 
> In file included from ./tm_p.h:4:0,
>  from ../../gcc/gimplify.c:30:
> ../../gcc/config/ia64/ia64-protos.h:49:13: error: use of enum
> 'memmodel' without previous declaration
> enum memmodel);
>  ^

Looks like alpha, sparc, and tilegx have the same problem.  Fixed by
including memmodel.h before tm_p.h.  Tested with quick x86_64-linux C
only bootstrap, and checked in under the obvious rule.

Jim
2017-10-30  Jim Wilson  

	gcc/
	* gimplify.c: Include memmodel.h.

Index: gcc/gimplify.c
===
--- gcc/gimplify.c	(revision 254222)
+++ gcc/gimplify.c	(working copy)
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "rtl.h"
 #include "tree.h"
+#include "memmodel.h"
 #include "tm_p.h"
 #include "gimple.h"
 #include "gimple-predict.h"


Re: [RFC PATCH] Merge libsanitizer from upstream

2017-10-30 Thread Christophe Lyon

On 30/10/2017 16:21, Maxim Ostapenko wrote:

On 30/10/17 17:08, Christophe Lyon wrote:

On 30/10/2017 11:12, Maxim Ostapenko wrote:

Hi,

sorry for the late response.

On 20/10/17 13:45, Christophe Lyon wrote:

Hi,

On 19 October 2017 at 13:17, Jakub Jelinek  wrote:

On Thu, Oct 19, 2017 at 02:07:24PM +0300, Maxim Ostapenko wrote:

Is the patch (the merge + this incremental) ok for trunk?

I think the patch is OK, just wondering about two things:

Richi just approved the patch on IRC, so I'll commit, then we can deal with
follow-ups.


Does anyone else run these tests on arm?
Since you applied this patch, I'm seeing lots of new errors and timeouts.
I have been ignoring regression reports for *san because of yyrandomness
in the results, but the timeouts are a  major inconvenience in testing
because it increases latency a lot in getting results, or worse I get no
result at all because the validation job is killed before completion.

Looking at some intermediate logs, I have noticed:
==24797==AddressSanitizer CHECK failed:
/libsanitizer/asan/asan_poisoning.cc:34
"((AddrIsAlignedByGranularity(addr))) != (0)" (0x0, 0x0)
 #0 0x408d7d65 in AsanCheckFailed /libsanitizer/asan/asan_rtl.cc:67
 #1 0x408ecd5d in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
/libsanitizer/sanitizer_common/sanitizer_termination.cc:77
 #2 0x408d22d5 in __asan::PoisonShadow(unsigned long, unsigned
long, unsigned char) /libsanitizer/asan/asan_poisoning.cc:34
 #3 0x4085409b in __asan_register_globals
/libsanitizer/asan/asan_globals.cc:368
 #4 0x109eb in _GLOBAL__sub_I_00099_1_ten
(/aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/obj-arm-none-linux-gnueabi/gcc3/gcc/testsuite/gcc/alloca_big_alignment.exe+0x109eb)

in MANY (193 in gcc) tests.

and many others (152 in gcc) just time out individually (eg
c-c++-common/asan/alloca_instruments_all_paddings.c) with no error in
the logs besides Dejagnu's
WARNING: program timed out.


Since I'm using an apparently unusual setup, maybe I have to update it
to cope with the new version,
so I'd like to know if others are seeing the same problems on arm?

I'm using qemu -R 0 to execute the test programs, encapsulated by
proot (similar to chroot, but does not require root privileges).

Am I missing something obvious?


I've caught the same error on my Arndale board. The issue seems to be quite 
obvious: after merge, ASan requires globals array to be aligned by shadow 
granularity.


Thanks for confirming. I've spent a lot of time investigating the timeout 
issues, that led to zombie processes and servers needing reboot. I've finally 
identified that going back to qemu-2.7 avoid the timeout issues (I've reported 
a qemu bug).


This trivial patch seems to fix the issue. Could you check it on your setup?


I was just about to finally start looking at this sanity check problem, so 
thank you very much for sharing your patch.
I manually tested it on the subset of my configs and it solves the assertion 
failure, thanks!
However, I can notice many regressions compared to before the merge:
c-c++-common/asan/alloca_instruments_all_paddings.c
c-c++-common/asan/alloca_loop_unpoisoning.c
c-c++-common/asan/alloca_safe_access.c
c-c++-common/asan/asan-interface-1.c
c-c++-common/asan/halt_on_error-1.c
c-c++-common/asan/pr59063-1.c
c-c++-common/asan/pr59063-2.c
c-c++-common/asan/pr63316.c
c-c++-common/asan/pr63888.c
c-c++-common/asan/pr70712.c
c-c++-common/asan/pr71480.c
c-c++-common/asan/pr79944.c
c-c++-common/asan/pr80308.c
c-c++-common/asan/swapcontext-test-1.c
gcc.dg/asan/nosanitize-and-inline.c
gcc.dg/asan/pr79196.c
gcc.dg/asan/pr80166.c
gcc.dg/asan/pr81186.c
gcc.dg/asan/use-after-scope-11.c
gcc.dg/asan/use-after-scope-4.c
gcc.dg/asan/use-after-scope-6.c
gcc.dg/asan/use-after-scope-7.c
gcc.dg/asan/use-after-scope-goto-1.c
gcc.dg/asan/use-after-scope-goto-2.c
gcc.dg/asan/use-after-scope-switch-1.c
gcc.dg/asan/use-after-scope-switch-2.c
gcc.dg/asan/use-after-scope-switch-3.c
gcc.dg/asan/use-after-scope-switch-4.c

out of which only
c-c++-common/asan/swapcontext-test-1.c
c-c++-common/asan/halt_on_error-1.c
print something in gcc.log

Do they pass for you?


Ah, I see. The problem is that after this merge LSan was enabled for ARM. LSan 
sets atexit handler that calls internal_clone function that's not supported in 
QEMU.
That's why these tests pass on board, but fail under QEMU.
Could you try set ASAN_OPTIONS=detect_leaks=0 in your environment?



Ha, thanks it works!

The remaining failure is on c-c++-common/asan/halt_on_error-1.c, which seems to 
dump the expected messages, but exits with return code=4.

Any idea ?

Thanks,

Christophe


-Maxim



Thanks,

Christophe


Thanks,
-Maxim



Thanks,

Christophe



1) We have a bunch of GCC local patches, did you include them into a
cumulative patch (I guess yes)?

I have done some verification today, diff from upstream r285547 to unpatched
GCC (with the LLVM Compiler infrastructure two liners removed), attached P1

Re: [RFC PATCH] Merge libsanitizer from upstream

2017-10-30 Thread Maxim Ostapenko

On 30/10/17 18:46, Christophe Lyon wrote:

On 30/10/2017 16:21, Maxim Ostapenko wrote:

On 30/10/17 17:08, Christophe Lyon wrote:

On 30/10/2017 11:12, Maxim Ostapenko wrote:

Hi,

sorry for the late response.

On 20/10/17 13:45, Christophe Lyon wrote:

Hi,

On 19 October 2017 at 13:17, Jakub Jelinek  wrote:

On Thu, Oct 19, 2017 at 02:07:24PM +0300, Maxim Ostapenko wrote:

Is the patch (the merge + this incremental) ok for trunk?

I think the patch is OK, just wondering about two things:
Richi just approved the patch on IRC, so I'll commit, then we can 
deal with

follow-ups.


Does anyone else run these tests on arm?
Since you applied this patch, I'm seeing lots of new errors and 
timeouts.
I have been ignoring regression reports for *san because of 
yyrandomness
in the results, but the timeouts are a  major inconvenience in 
testing
because it increases latency a lot in getting results, or worse I 
get no

result at all because the validation job is killed before completion.

Looking at some intermediate logs, I have noticed:
==24797==AddressSanitizer CHECK failed:
/libsanitizer/asan/asan_poisoning.cc:34
"((AddrIsAlignedByGranularity(addr))) != (0)" (0x0, 0x0)
 #0 0x408d7d65 in AsanCheckFailed 
/libsanitizer/asan/asan_rtl.cc:67

 #1 0x408ecd5d in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
/libsanitizer/sanitizer_common/sanitizer_termination.cc:77
 #2 0x408d22d5 in __asan::PoisonShadow(unsigned long, unsigned
long, unsigned char) /libsanitizer/asan/asan_poisoning.cc:34
 #3 0x4085409b in __asan_register_globals
/libsanitizer/asan/asan_globals.cc:368
 #4 0x109eb in _GLOBAL__sub_I_00099_1_ten
(/aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/obj-arm-none-linux-gnueabi/gcc3/gcc/testsuite/gcc/alloca_big_alignment.exe+0x109eb) 



in MANY (193 in gcc) tests.

and many others (152 in gcc) just time out individually (eg
c-c++-common/asan/alloca_instruments_all_paddings.c) with no error in
the logs besides Dejagnu's
WARNING: program timed out.


Since I'm using an apparently unusual setup, maybe I have to 
update it

to cope with the new version,
so I'd like to know if others are seeing the same problems on arm?

I'm using qemu -R 0 to execute the test programs, encapsulated by
proot (similar to chroot, but does not require root privileges).

Am I missing something obvious?


I've caught the same error on my Arndale board. The issue seems to 
be quite obvious: after merge, ASan requires globals array to be 
aligned by shadow granularity.


Thanks for confirming. I've spent a lot of time investigating the 
timeout issues, that led to zombie processes and servers needing 
reboot. I've finally identified that going back to qemu-2.7 avoid 
the timeout issues (I've reported a qemu bug).


This trivial patch seems to fix the issue. Could you check it on 
your setup?


I was just about to finally start looking at this sanity check 
problem, so thank you very much for sharing your patch.
I manually tested it on the subset of my configs and it solves the 
assertion failure, thanks!

However, I can notice many regressions compared to before the merge:
c-c++-common/asan/alloca_instruments_all_paddings.c
c-c++-common/asan/alloca_loop_unpoisoning.c
c-c++-common/asan/alloca_safe_access.c
c-c++-common/asan/asan-interface-1.c
c-c++-common/asan/halt_on_error-1.c
c-c++-common/asan/pr59063-1.c
c-c++-common/asan/pr59063-2.c
c-c++-common/asan/pr63316.c
c-c++-common/asan/pr63888.c
c-c++-common/asan/pr70712.c
c-c++-common/asan/pr71480.c
c-c++-common/asan/pr79944.c
c-c++-common/asan/pr80308.c
c-c++-common/asan/swapcontext-test-1.c
gcc.dg/asan/nosanitize-and-inline.c
gcc.dg/asan/pr79196.c
gcc.dg/asan/pr80166.c
gcc.dg/asan/pr81186.c
gcc.dg/asan/use-after-scope-11.c
gcc.dg/asan/use-after-scope-4.c
gcc.dg/asan/use-after-scope-6.c
gcc.dg/asan/use-after-scope-7.c
gcc.dg/asan/use-after-scope-goto-1.c
gcc.dg/asan/use-after-scope-goto-2.c
gcc.dg/asan/use-after-scope-switch-1.c
gcc.dg/asan/use-after-scope-switch-2.c
gcc.dg/asan/use-after-scope-switch-3.c
gcc.dg/asan/use-after-scope-switch-4.c

out of which only
c-c++-common/asan/swapcontext-test-1.c
c-c++-common/asan/halt_on_error-1.c
print something in gcc.log

Do they pass for you?


Ah, I see. The problem is that after this merge LSan was enabled for 
ARM. LSan sets atexit handler that calls internal_clone function 
that's not supported in QEMU.

That's why these tests pass on board, but fail under QEMU.
Could you try set ASAN_OPTIONS=detect_leaks=0 in your environment?



Ha, thanks it works!

The remaining failure is on c-c++-common/asan/halt_on_error-1.c, which 
seems to dump the expected messages, but exits with return code=4.


Same here, halt_on_error-1.c just overwrites ASAN_OPTIONS via 
"halt_on_error=false" thus enabling LSan back.

Something like this (attached) should work.

-Maxim



Any idea ?

Thanks,

Christophe


-Maxim



Thanks,

Christophe


Thanks,
-Maxim



Thanks,

Christophe


1) We have a bunch of GCC loc

Re: [RFC PATCH] Merge libsanitizer from upstream

2017-10-30 Thread Christophe Lyon

On 30/10/2017 16:54, Maxim Ostapenko wrote:

On 30/10/17 18:46, Christophe Lyon wrote:

On 30/10/2017 16:21, Maxim Ostapenko wrote:

On 30/10/17 17:08, Christophe Lyon wrote:

On 30/10/2017 11:12, Maxim Ostapenko wrote:

Hi,

sorry for the late response.

On 20/10/17 13:45, Christophe Lyon wrote:

Hi,

On 19 October 2017 at 13:17, Jakub Jelinek  wrote:

On Thu, Oct 19, 2017 at 02:07:24PM +0300, Maxim Ostapenko wrote:

Is the patch (the merge + this incremental) ok for trunk?

I think the patch is OK, just wondering about two things:

Richi just approved the patch on IRC, so I'll commit, then we can deal with
follow-ups.


Does anyone else run these tests on arm?
Since you applied this patch, I'm seeing lots of new errors and timeouts.
I have been ignoring regression reports for *san because of yyrandomness
in the results, but the timeouts are a  major inconvenience in testing
because it increases latency a lot in getting results, or worse I get no
result at all because the validation job is killed before completion.

Looking at some intermediate logs, I have noticed:
==24797==AddressSanitizer CHECK failed:
/libsanitizer/asan/asan_poisoning.cc:34
"((AddrIsAlignedByGranularity(addr))) != (0)" (0x0, 0x0)
 #0 0x408d7d65 in AsanCheckFailed /libsanitizer/asan/asan_rtl.cc:67
 #1 0x408ecd5d in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)
/libsanitizer/sanitizer_common/sanitizer_termination.cc:77
 #2 0x408d22d5 in __asan::PoisonShadow(unsigned long, unsigned
long, unsigned char) /libsanitizer/asan/asan_poisoning.cc:34
 #3 0x4085409b in __asan_register_globals
/libsanitizer/asan/asan_globals.cc:368
 #4 0x109eb in _GLOBAL__sub_I_00099_1_ten
(/aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/obj-arm-none-linux-gnueabi/gcc3/gcc/testsuite/gcc/alloca_big_alignment.exe+0x109eb)

in MANY (193 in gcc) tests.

and many others (152 in gcc) just time out individually (eg
c-c++-common/asan/alloca_instruments_all_paddings.c) with no error in
the logs besides Dejagnu's
WARNING: program timed out.


Since I'm using an apparently unusual setup, maybe I have to update it
to cope with the new version,
so I'd like to know if others are seeing the same problems on arm?

I'm using qemu -R 0 to execute the test programs, encapsulated by
proot (similar to chroot, but does not require root privileges).

Am I missing something obvious?


I've caught the same error on my Arndale board. The issue seems to be quite 
obvious: after merge, ASan requires globals array to be aligned by shadow 
granularity.


Thanks for confirming. I've spent a lot of time investigating the timeout 
issues, that led to zombie processes and servers needing reboot. I've finally 
identified that going back to qemu-2.7 avoid the timeout issues (I've reported 
a qemu bug).


This trivial patch seems to fix the issue. Could you check it on your setup?


I was just about to finally start looking at this sanity check problem, so 
thank you very much for sharing your patch.
I manually tested it on the subset of my configs and it solves the assertion 
failure, thanks!
However, I can notice many regressions compared to before the merge:
c-c++-common/asan/alloca_instruments_all_paddings.c
c-c++-common/asan/alloca_loop_unpoisoning.c
c-c++-common/asan/alloca_safe_access.c
c-c++-common/asan/asan-interface-1.c
c-c++-common/asan/halt_on_error-1.c
c-c++-common/asan/pr59063-1.c
c-c++-common/asan/pr59063-2.c
c-c++-common/asan/pr63316.c
c-c++-common/asan/pr63888.c
c-c++-common/asan/pr70712.c
c-c++-common/asan/pr71480.c
c-c++-common/asan/pr79944.c
c-c++-common/asan/pr80308.c
c-c++-common/asan/swapcontext-test-1.c
gcc.dg/asan/nosanitize-and-inline.c
gcc.dg/asan/pr79196.c
gcc.dg/asan/pr80166.c
gcc.dg/asan/pr81186.c
gcc.dg/asan/use-after-scope-11.c
gcc.dg/asan/use-after-scope-4.c
gcc.dg/asan/use-after-scope-6.c
gcc.dg/asan/use-after-scope-7.c
gcc.dg/asan/use-after-scope-goto-1.c
gcc.dg/asan/use-after-scope-goto-2.c
gcc.dg/asan/use-after-scope-switch-1.c
gcc.dg/asan/use-after-scope-switch-2.c
gcc.dg/asan/use-after-scope-switch-3.c
gcc.dg/asan/use-after-scope-switch-4.c

out of which only
c-c++-common/asan/swapcontext-test-1.c
c-c++-common/asan/halt_on_error-1.c
print something in gcc.log

Do they pass for you?


Ah, I see. The problem is that after this merge LSan was enabled for ARM. LSan 
sets atexit handler that calls internal_clone function that's not supported in 
QEMU.
That's why these tests pass on board, but fail under QEMU.
Could you try set ASAN_OPTIONS=detect_leaks=0 in your environment?



Ha, thanks it works!

The remaining failure is on c-c++-common/asan/halt_on_error-1.c, which seems to 
dump the expected messages, but exits with return code=4.


Same here, halt_on_error-1.c just overwrites ASAN_OPTIONS via 
"halt_on_error=false" thus enabling LSan back.
Something like this (attached) should work.



Indeed, thanks a lot!

It looks like I'm going to have to apply local patches such as this one if I 
want to contin

Re: [PATCH, version 5a], Add support for _Float and _FloatX sqrt, fma, fmin, fmax built-in functions

2017-10-30 Thread Joseph Myers
This has broken building the mainline glibc testsuite with GCC mainline 
for platforms where long double has binary128 format.

Recall that _FloatN type names are not supported for C++, because of the 
expectation that any C++ bindings for such types would be class-based like 
the TR 24733 DFP support rather than using built-in types such as 
_Float128.  Thus glibc uses __float128 to define _Float128 internally for 
C++ where __float128 is available, but where it is not available but long 
double has that format, the headers use long double to implement 
_Float128.  Thus you get conflicting declarations, headers declaring 
sqrtf128 to use a typedef for long double but the built-in functions using 
the internal _Float128 type that has no built-in name referring directly 
to it for C++.

One obvious possible fix would be for the builtins.def macros never to 
define public names such as sqrtf128 for C++, only ever for C.  That would 
be on the basis that C++ code is expected to be using a sqrt overload 
anyway - so making the built-in function available under a public name, in 
the case where __float128 exists for C++, is a matter for libstdc++ 
headers.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Why do we create a broken symlink for include/bits/stamp-bits ?

2017-10-30 Thread Jonathan Wakely

On 25/10/17 13:18 +0100, Jonathan Wakely wrote:

In include/Makefile the stamp-bits-sup target creates a symlink called
stamp-bits:

stamp-bits-sup: stamp-bits ${bits_sup_headers}
@-cd ${bits_builddir} && $(LN_S) $? . 2>/dev/null
#
@$(STAMP) stamp-bits-sup

Should this use "$(LN_S) ${bits_sup_headers} ." instead, so it only
creates symlinks for the headers, not "stamp-bits" as well?

The broken symlink isn't a big deal but causes errors like:

libstdc++-v3$ grep -R foobar include/
grep: include/bits/stamp-bits: Too many levels of symbolic links



I've committed this to fix it. Tested powerpc64le-linux.

Another way to fi it would have been "$(LN_S) $(filter-out $<,$?) ."
but I think what I've committed is easier to grok and less fragile.

commit d2c0483f1cda88d6fb6cb76552765680fde28560
Author: Jonathan Wakely 
Date:   Mon Oct 30 15:39:06 2017 +

Don't create broken symlink in libstdc++-v3/include/bits

* include/Makefile.am (stamp-bits-sup): Do not create broken symlink
to stamp-bits.
* include/Makefile.in: Regenerate.

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 3e34dc00747..add2277df80 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -1031,7 +1031,7 @@ stamp-bits: ${bits_headers}
 	@$(STAMP) stamp-bits
 
 stamp-bits-sup: stamp-bits ${bits_sup_headers}
-	@-cd ${bits_builddir} && $(LN_S) $? . 2>/dev/null
+	@-cd ${bits_builddir} && $(LN_S) ${bits_sup_headers} . 2>/dev/null
 	@$(STAMP) stamp-bits-sup
 
 stamp-c_base: ${c_base_headers}


[committed] Unbreak big-endian bootstrap (PR middle-end/22141)

2017-10-30 Thread Jakub Jelinek
Hi!

Apparently I broke bootstrap or testing on big-endian targets,
I'm sorry for screwing up testing and not testing on any big-endian.

I've committed the following patch which fixes miscompilation on the
following short testcase:

struct S { char a, b, c, d; } s;
struct T { int a : 2, b : 5, c : 17, d : 8; } t, t2;

__attribute__((noipa)) void
foo (void)
{
  s.a = 1;
  s.b = 2;
  s.c = 3;
  s.d = 4;
}

__attribute__((noipa)) void
bar (void)
{
  t.a = 1;
  t.b = 2;
  t.c = 3;
  t.d = 4;
}

__attribute__((noipa)) void
baz (void)
{
  t.a = -2;
  t.c = 7;
  t.d = 8;
}

int
main ()
{
  foo ();
  if (s.a != 1 || s.b != 2 || s.c != 3 || s.d != 4)
__builtin_abort ();
  bar ();
  if (t.a != 1 || t.b != 2 || t.c != 3 || t.d != 4)
__builtin_abort ();
  baz ();
  if (t.a != -2 || t.b != 2 || t.c != 7 || t.d != 8)
__builtin_abort ();
  __builtin_memset (&t, 0, sizeof (t));
  baz ();
  if (t.a != -2 || t.b != 0 || t.c != 7 || t.d != 8)
__builtin_abort ();
  __builtin_memset (&t, -1, sizeof (t));
  baz ();
  if (t.a != -2 || t.b != -1 || t.c != 7 || t.d != 8)
__builtin_abort ();
  return 0;
}

as obvious to unbreak bootstrap.  David said his AIX bootstrap is past
the bootstrap failure point now with this change.

2017-10-30  Jakub Jelinek  

PR middle-end/22141
* gimple-ssa-store-merging.c (merged_store_group::apply_stores): Fix
arguments to clear_bit_region_be.

--- gcc/gimple-ssa-store-merging.c.jj   2017-10-30 12:03:56.601219516 +0100
+++ gcc/gimple-ssa-store-merging.c  2017-10-30 17:03:55.713149323 +0100
@@ -701,7 +701,9 @@ merged_store_group::apply_stores ()
return false;
   unsigned char *m = mask + (pos_in_buffer / BITS_PER_UNIT);
   if (BYTES_BIG_ENDIAN)
-   clear_bit_region_be (m, pos_in_buffer % BITS_PER_UNIT, info->bitsize);
+   clear_bit_region_be (m, (BITS_PER_UNIT - 1
+- (pos_in_buffer % BITS_PER_UNIT)),
+info->bitsize);
   else
clear_bit_region (m, pos_in_buffer % BITS_PER_UNIT, info->bitsize);
 }

Jakub


Re: [PATCH] fix AIX fortran builds

2017-10-30 Thread Thomas Koenig

Hi Jim,


If I add a call to gcc_unreachable after the longjmp call, then it
builds on both linux and AIX.  Anyone have a better idea on how to fix
this?  If I don't get any responses in a few days, I will check it in
under the obvious rule, since it fixes a build failure.


The patch is OK.

Thanks!

Regards

Thomas


Re: Fix pretty printers for versioned namespace

2017-10-30 Thread François Dumont

On 26/10/2017 22:41, Jonathan Wakely wrote:

On 26/10/17 21:37 +0100, Jonathan Wakely wrote:

On 26/10/17 21:30 +0100, Jonathan Wakely wrote:

On 26/10/17 22:19 +0200, François Dumont wrote:

@@ -1232,7 +1232,7 @@ class Printer(object):
  # Add a name using _GLIBCXX_BEGIN_NAMESPACE_CONTAINER.
  def add_container(self, base, name, function):
  self.add_version(base, name, function)
-    self.add_version(base + '__cxx1998::', name, function)
+    self.add_version(base, '__cxx1998::' + name, function)


I don't like this change.

Previously the arguments were the namespace(s) and the type. That's
nice and simple.


Not always, see updated patch.



Now it's the first namespace, and then all the other namespaces and
the type. That's not very clean.

There must be a way to keep the add_version and add_container calls
the same, and have it transparently handle the case where the
namespace is 'std::__8::foo' not 'std::foo'.


e.g. in add_version instead of:

  if _versioned_namespace:
  self.add(base + _versioned_namespace + name, function)

We should replace "std::" in base with "std::" + _versioned_namespace

That means we only have to change that function, not every call to it.


Something like this:

--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1227,7 +1227,8 @@ class Printer(object):
    def add_version(self, base, name, function):
    self.add(base + name, function)
    if _versioned_namespace:
-    self.add(base + _versioned_namespace + name, function)
+    vbase = re.sub('^std::', 'std::%s::' % 
_versioned_namespace, base)

+    self.add(vbase + name, function)

    # Add a name using _GLIBCXX_BEGIN_NAMESPACE_CONTAINER.
    def add_container(self, base, name, function)


Ok, I have adapted and generalized this approache to all symbols.

Tested under Linux x86_64 normal and versioned namespace modes.

Ok to commit ?

François
diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index b7b2a0f..2bbe458 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -973,8 +973,8 @@ class StdExpAnyPrinter(SingleObjContainerPrinter):
 "Print a std::any or std::experimental::any"
 
 def __init__ (self, typename, val):
-self.typename = re.sub('^std::experimental::fundamentals_v\d::', 'std::experimental::', typename, 1)
-self.typename = strip_versioned_namespace(self.typename)
+self.typename = strip_versioned_namespace(typename)
+self.typename = re.sub('^std::experimental::fundamentals_v\d::', 'std::experimental::', self.typename, 1)
 self.val = val
 self.contained_type = None
 contained_value = None
@@ -1021,8 +1021,8 @@ class StdExpOptionalPrinter(SingleObjContainerPrinter):
 
 def __init__ (self, typename, val):
 valtype = self._recognize (val.type.template_argument(0))
-self.typename = re.sub('^std::(experimental::|)(fundamentals_v\d::|)(.*)', r'std::\1\3<%s>' % valtype, typename, 1)
-self.typename = strip_versioned_namespace(self.typename)
+self.typename = strip_versioned_namespace(typename)
+self.typename = re.sub('^std::(experimental::|)(fundamentals_v\d::|)(.*)', r'std::\1\3<%s>' % valtype, self.typename, 1)
 if not self.typename.startswith('std::experimental'):
 val = val['_M_payload']
 self.val = val
@@ -1043,8 +1043,8 @@ class StdVariantPrinter(SingleObjContainerPrinter):
 
 def __init__(self, typename, val):
 alternatives = self._template_args(val)
-self.typename = "%s<%s>" % (typename, ', '.join([self._recognize(alt) for alt in alternatives]))
-self.typename = strip_versioned_namespace(self.typename)
+self.typename = strip_versioned_namespace(typename)
+self.typename = "%s<%s>" % (self.typename, ', '.join([self._recognize(alt) for alt in alternatives]))
 self.index = val['_M_index']
 if self.index >= len(alternatives):
 self.contained_type = None
@@ -1227,7 +1227,12 @@ class Printer(object):
 def add_version(self, base, name, function):
 self.add(base + name, function)
 if _versioned_namespace:
-self.add(base + _versioned_namespace + name, function)
+vbase = re.sub('^std::', 'std::%s' % _versioned_namespace, base)
+if vbase != base:
+self.add(vbase + name, function)
+vbase = re.sub('^__gnu_cxx::', '__gnu_cxx::%s' % _versioned_namespace, base)
+if vbase != base:
+self.add(vbase + name, function)
 
 # Add a name using _GLIBCXX_BEGIN_NAMESPACE_CONTAINER.
 def add_container(self, base, name, function):
@@ -1507,7 +1512,7 @@ def build_libstdcxx_dictionary ():
 # In order from:
 # http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a01847.html
 libstdcxx_p

Re: [20/nn] Make tree-ssa-dse.c:normalize_ref return a bool

2017-10-30 Thread Jeff Law
On 10/23/2017 05:29 AM, Richard Sandiford wrote:
> This patch moves the check for an overlapping byte to normalize_ref
> from its callers, so that it's easier to convert to poly_ints later.
> It's not really worth it on its own.
> 
> 
> 2017-10-23  Richard Sandiford  
> 
> gcc/
>   * tree-ssa-dse.c (normalize_ref): Check whether the ranges overlap
>   and return false if not.
>   (clear_bytes_written_by, live_bytes_read): Update accordingly.
OK.
jeff


Re: [PATCH][AArch64] Wrong type-attribute for stp and str

2017-10-30 Thread Dominik Inführ
Could you please also commit the patch? I don’t have commit rights.

Best
Dominik

> On 24 Oct 2017, at 16:58, Richard Earnshaw (lists)  
> wrote:
> 
> On 24/10/17 15:54, Dominik Inführ wrote:
>> 
>>> On 24 Oct 2017, at 11:40, Richard Earnshaw (lists) 
>>>  wrote:
>>> 
>>> On 23/10/17 17:36, Dominik Inführ wrote:
 I’ve added your suggestions. I would also like to propose to change the 
 type attribute from neon_stp to store_8 and store_16, this seems to be 
 more in line with respect to other patterns.
 
 Thanks,
 Dominik
 
 ChangeLog:
 2017-10-23  Dominik Infuehr  
 
* config/aarch64/aarch64-simd.md
(*aarch64_simd_mov): Fix type-attribute.
(*aarch64_simd_mov): Likewise.
>>> 
>>> I don't think we should conflate loads/stores to the simd registers with
>>> loads/stores to the gp registers.  They might have very different
>>> characteristics.  So no, I don't think we should change it that way.
>> 
>> I agree, but I don’t think my changes would conflate that. I only changed 
>> the types for the pattern alternatives `str xzr, %1` and `stp xzr, xzr, %1` 
>> to store_8 and store_16 since both instructions don’t actually involve 
>> SIMD/FP-registers. Using neon_stp for both `stp xzr, xzr, %0` and `stp %d1, 
>> %d3, %0` seems misleading because of possibly different characteristics. Am 
>> I missing something?
> 
> You're not missing anything.  I looking at an old version of the code
> and getting confused :-(
> 
> OK.
> 
> R.
> 
>> 
>>> 
>>> You've also missed the renaming of the ambiguous patterns from your
>>> changelog entry.  Finally, 'fix xxx' is generally frowned upon in
>>> ChangeLogs.  A better description would be 'Re-order type attributes to
>>> match pattern alternatives’.
>> 
>> Ok, thanks for pointing that out. Here is the updated ChangeLog:
>> 
>> 2017-10-24  Dominik Infuehr  
>> 
>>  * config/aarch64/aarch64-simd.md (*aarch64_simd_mov): Rename
>>  both identically named patterns to (*aarch64_simd_mov)
>>  and (*aarch64_simd_mov).
>>  (*aarch64_simd_mov): Change type attribute to match
>>  pattern alternative.
>>  (*aarch64_simd_mov): Re-order and change type
>>  attributes to match pattern alternative.
>> 
>>> 
>>> R.
>>> 
 —
 diff --git a/gcc/config/aarch64/aarch64-simd.md 
 b/gcc/config/aarch64/aarch64-simd.md
 index 49f615cfdbf..447ee3afd17 100644
 --- a/gcc/config/aarch64/aarch64-simd.md
 +++ b/gcc/config/aarch64/aarch64-simd.md
 @@ -102,7 +102,7 @@
  [(set_attr "type" "neon_dup")]
 )
 
 -(define_insn "*aarch64_simd_mov"
 +(define_insn "*aarch64_simd_mov"
  [(set (match_operand:VD 0 "nonimmediate_operand"
"=w, m,  m,  w, ?r, ?w, ?r, w")
(match_operand:VD 1 "general_operand"
 @@ -126,12 +126,12 @@
 default: gcc_unreachable ();
 }
 }
 -  [(set_attr "type" "neon_load1_1reg, neon_stp, neon_store1_1reg,\
 +  [(set_attr "type" "neon_load1_1reg, store_8, neon_store1_1reg,\
 neon_logic, neon_to_gp, f_mcr,\
 mov_reg, neon_move")]
 )
 
 -(define_insn "*aarch64_simd_mov"
 +(define_insn "*aarch64_simd_mov"
  [(set (match_operand:VQ 0 "nonimmediate_operand"
"=w, Umq,  m,  w, ?r, ?w, ?r, w")
(match_operand:VQ 1 "general_operand"
 @@ -160,8 +160,8 @@
gcc_unreachable ();
}
 }
 -  [(set_attr "type" "neon_load1_1reg, neon_store1_1reg,\
 -   neon_stp, neon_logic, multiple, multiple,\
 +  [(set_attr "type" "neon_load1_1reg, store_16, neon_store1_1reg,\
 +   neon_logic, multiple, multiple,\
 multiple, neon_move")
   (set_attr "length" "4,4,4,4,8,8,8,4")]
 )
 
 
> On 20 Oct 2017, at 16:07, Richard Earnshaw (lists) 
>  wrote:
> 
> On 16/10/17 14:26, Dominik Inführ wrote:
>> Hi,
>> 
>> it seems the type attributes for neon_stp and neon_store1_1reg should 
>> be the other way around.
>> 
> 
> Yes, I agree, but there's more
> 
> Firstly, we have two patterns that are named *aarch64_simd_mov,
> with different iterators.  That's slightly confusing.  I think they need
> to be renamed as:
> 
>   *aarch64_simd_mov
> 
> and
> 
>   *aarch64_simd_mov
> 
> to break the ambiguity.
> 
> Secondly it looks to me as though the attributes on the other one are
> also incorrect.  Could you check that one out as well, please.
> 
> Thanks,
> 
> R.
> 
>> Thanks
>> Dominik
>> 
>> ChangeLog:
>> 2017-10-16  Dominik Infuehr  
>> 
>>  * config/aarch64/aarch64-simd.md
>>  (*aarch64_simd_mov): Fix type-attribute.
>> --
>> diff --git a/gcc/config/aarch64/aarch64-simd.md 
>> b/gcc/config/aarch64/aarch64-simd.md
>> index 49f615cfdbf..409ad3502ff 100644
>> --- a/gcc/config/aarch64/aarch64-simd.m

Re: Fix pretty printers for versioned namespace

2017-10-30 Thread Jonathan Wakely

On 30/10/17 18:13 +0100, François Dumont wrote:

On 26/10/2017 22:41, Jonathan Wakely wrote:

On 26/10/17 21:37 +0100, Jonathan Wakely wrote:

On 26/10/17 21:30 +0100, Jonathan Wakely wrote:

On 26/10/17 22:19 +0200, François Dumont wrote:

@@ -1232,7 +1232,7 @@ class Printer(object):
  # Add a name using _GLIBCXX_BEGIN_NAMESPACE_CONTAINER.
  def add_container(self, base, name, function):
  self.add_version(base, name, function)
-    self.add_version(base + '__cxx1998::', name, function)
+    self.add_version(base, '__cxx1998::' + name, function)


I don't like this change.

Previously the arguments were the namespace(s) and the type. That's
nice and simple.


Not always, see updated patch.


Ah yes. Well it's nice to fix those cases then :-)


@@ -1227,7 +1227,12 @@ class Printer(object):
def add_version(self, base, name, function):
self.add(base + name, function)
if _versioned_namespace:
-self.add(base + _versioned_namespace + name, function)
+vbase = re.sub('^std::', 'std::%s' % _versioned_namespace, base)
+if vbase != base:
+self.add(vbase + name, function)
+vbase = re.sub('^__gnu_cxx::', '__gnu_cxx::%s' % 
_versioned_namespace, base)
+if vbase != base:
+self.add(vbase + name, function)


Only one re.sub is needed:

vbase = re.sub('^(std|__gnu_cxx)::', r'\1::%s' % _versioned_namespace, base)

OK for trunk with that change, assuming it works. Thanks.



Re: [PATCH, version 5a], Add support for _Float and _FloatX sqrt, fma, fmin, fmax built-in functions

2017-10-30 Thread Michael Meissner
On Mon, Oct 30, 2017 at 04:16:42PM +, Joseph Myers wrote:
> This has broken building the mainline glibc testsuite with GCC mainline 
> for platforms where long double has binary128 format.
> 
> Recall that _FloatN type names are not supported for C++, because of the 
> expectation that any C++ bindings for such types would be class-based like 
> the TR 24733 DFP support rather than using built-in types such as 
> _Float128.  Thus glibc uses __float128 to define _Float128 internally for 
> C++ where __float128 is available, but where it is not available but long 
> double has that format, the headers use long double to implement 
> _Float128.  Thus you get conflicting declarations, headers declaring 
> sqrtf128 to use a typedef for long double but the built-in functions using 
> the internal _Float128 type that has no built-in name referring directly 
> to it for C++.
> 
> One obvious possible fix would be for the builtins.def macros never to 
> define public names such as sqrtf128 for C++, only ever for C.  That would 
> be on the basis that C++ code is expected to be using a sqrt overload 
> anyway - so making the built-in function available under a public name, in 
> the case where __float128 exists for C++, is a matter for libstdc++ 
> headers.

Arghh, yes, I can restrict it to just the C language.  Would you prefer just
that?  Do we want a target hook to be able to over-ride this on a platform
basis (with the default behavior only enabling it for C)?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[PATCH] Fix gfortran.dg/dtio_13.f90 failure on at least FreeBSD

2017-10-30 Thread Steve Kargl
The attached patch fixes the failure of gfortran.dg/dtio_13.f90
on at least FreeBSD.  This test has been failing for a very long
time.  In resolve_transfer, gfortran needs to check for a BT_CLASS,
but failed to to do.  The patch allows one to remove a TODO in
dt90_13.f90 and more important the dg-error test for a spurious
error.  OK to commit?

2017-10-30  Steven G. Kargl   

* resolve.c (resolve_transfer): Set derived to correct symbol for 
BT_CLASS.

2017-10-30  Steven G. Kargl   

* gfortran.dg/dtio_13.f90: Remove TODO comment and dg-error test.

-- 
Steve
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 254232)
+++ gcc/fortran/resolve.c	(working copy)
@@ -9181,7 +9181,7 @@ resolve_transfer (gfc_code *code)
   if (dt && dt->dt_io_kind->value.iokind != M_INQUIRE
   && (ts->type == BT_DERIVED || ts->type == BT_CLASS))
 {
-  if (ts->type == BT_DERIVED)
+  if (ts->type == BT_DERIVED || ts->type == BT_CLASS)
 	derived = ts->u.derived;
   else
 	derived = ts->u.derived->components->ts.u.derived;
Index: gcc/testsuite/gfortran.dg/dtio_13.f90
===
--- gcc/testsuite/gfortran.dg/dtio_13.f90	(revision 254232)
+++ gcc/testsuite/gfortran.dg/dtio_13.f90	(working copy)
@@ -136,9 +136,7 @@ program test
character(3) :: a, b
class(t) :: chairman ! { dg-error "must be dummy, allocatable or pointer" }
open (unit=71, file='myunformatted_data.dat', form='unformatted')
-! The following error is spurious and is eliminated if previous error is corrected.
-! TODO Although better than an ICE, fix me.
-   read (71) a, chairman, b ! { dg-error "cannot be polymorphic" }
+   read (71) a, chairman, b 
close (unit=71)
 end
 


Re: Fix pretty printers for versioned namespace

2017-10-30 Thread Jonathan Wakely

On 30/10/17 17:59 +, Jonathan Wakely wrote:

On 30/10/17 18:13 +0100, François Dumont wrote:

On 26/10/2017 22:41, Jonathan Wakely wrote:

On 26/10/17 21:37 +0100, Jonathan Wakely wrote:

On 26/10/17 21:30 +0100, Jonathan Wakely wrote:

On 26/10/17 22:19 +0200, François Dumont wrote:

@@ -1232,7 +1232,7 @@ class Printer(object):
  # Add a name using _GLIBCXX_BEGIN_NAMESPACE_CONTAINER.
  def add_container(self, base, name, function):
  self.add_version(base, name, function)
-    self.add_version(base + '__cxx1998::', name, function)
+    self.add_version(base, '__cxx1998::' + name, function)


I don't like this change.

Previously the arguments were the namespace(s) and the type. That's
nice and simple.


Not always, see updated patch.


Ah yes. Well it's nice to fix those cases then :-)


@@ -1227,7 +1227,12 @@ class Printer(object):
   def add_version(self, base, name, function):
   self.add(base + name, function)
   if _versioned_namespace:
-self.add(base + _versioned_namespace + name, function)
+vbase = re.sub('^std::', 'std::%s' % _versioned_namespace, base)
+if vbase != base:
+self.add(vbase + name, function)
+vbase = re.sub('^__gnu_cxx::', '__gnu_cxx::%s' % 
_versioned_namespace, base)
+if vbase != base:
+self.add(vbase + name, function)


Only one re.sub is needed:

vbase = re.sub('^(std|__gnu_cxx)::', r'\1::%s' % _versioned_namespace, base)


Or using \g<0> to refer to the whole match:

vbase = re.sub('^(std|__gnu_cxx)::', r'\g<0>%s' % _versioned_namespace, base)


OK for trunk with that change, assuming it works. Thanks.



Re: [PATCH][ARM] Remove DImode expansions for 1-bit shifts

2017-10-30 Thread Kyrill Tkachov


On 30/10/17 13:54, Wilco Dijkstra wrote:

Kyrill Tkachov wrote:

On 16/10/17 12:30, Wilco Dijkstra wrote:

DImode right shifts of 1 are rarely used (6 in total in the GCC binary),
so there is little benefit of the arm_ashrdi3_1bit and arm_lshrdi3_1bit
patterns.

... but it's still used, and the patterns were put there for a reason.
Even if GCC itself doesn't use them much they may be used by other
applications.

So I'd support removing the left shift 1-bit expansions, but not the
right shift ones.

The purpose of removing the shift-by-1 cases is not just to cleanup code
but also to improve code generation. These shifts cannot expand early
and thus don't benefit from optimization (like shift merging). They also
suffer from the DImode register allocation issues.

As a simple example this loop runs >20% faster with my patch on both
Cortex-A53 and Cortex-A57 when built with -mfpu=vfp:

long long loop1 (long long x, long long y, int n)
{
   int i;
for (i = 0; i < n; i++)
{
   x >>= 1;
   x |= y;
   x >>= 1;
   x |= y;
   x >>= 1;
   x |= y;
   x >>= 1;
   x |= y;
}
return x;
}

So given these shifts are bad for performance, why have them at all?


Thanks, that makes sense.
Ok for trunk.

Thanks,
Kyrill


Wilco




Re: [10/nn] Widening optab cleanup

2017-10-30 Thread Jeff Law
On 10/23/2017 05:23 AM, Richard Sandiford wrote:
> widening_optab_handler had the comment:
> 
>   /* ??? Why does find_widening_optab_handler_and_mode attempt to
>  widen things that can't be widened?  E.g. add_optab... */
>   if (op > LAST_CONV_OPTAB)
> return CODE_FOR_nothing;
> 
> I think it comes from expand_binop using
> find_widening_optab_handler_and_mode for two things: to test whether
> a "normal" optab like add_optab is supported for a standard binary
> operation and to test whether a "convert" optab is supported for a
> widening operation like umul_widen_optab.  In the former case from_mode
> and to_mode must be the same, in the latter from_mode must be narrower
> than to_mode.
> 
> For the former case, find_widening_optab_handler_and_mode is only really
> testing the modes that are passed in.  permit_non_widening must be true
> here.
> 
> For the latter case, find_widening_optab_handler_and_mode should only
> really consider new from_modes that are wider than the original
> from_mode and narrower than the original to_mode.  Logically
> permit_non_widening should be false, since widening optabs aren't
> supposed to take operands that are the same width as the destination.
> We get away with permit_non_widening being true because no target
> would/should define a widening .md pattern with matching modes.
> 
> But really, it seems better for expand_binop to handle these two
> cases itself rather than pushing them down.  With that change,
> find_widening_optab_handler_and_mode is only ever called with
> permit_non_widening set to false and is only ever called with
> a "proper" convert optab.  We then no longer need widening_optab_handler,
> we can just use convert_optab_handler directly.
> 
> The patch also passes the instruction code down to expand_binop_directly.
> This should be more efficient and removes an extra call to
> find_widening_optab_handler_and_mode.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * optabs-query.h (convert_optab_p): New function, split out from...
>   (convert_optab_handler): ...here.
>   (widening_optab_handler): Delete.
>   (find_widening_optab_handler): Remove permit_non_widening parameter.
>   (find_widening_optab_handler_and_mode): Likewise.  Provide an
>   override that operates on mode class wrappers.
>   * optabs-query.c (widening_optab_handler): Delete.
>   (find_widening_optab_handler_and_mode): Remove permit_non_widening
>   parameter.  Assert that the two modes are the same class and that
>   the "from" mode is narrower than the "to" mode.  Use
>   convert_optab_handler instead of widening_optab_handler.
>   * expmed.c (expmed_mult_highpart_optab): Use convert_optab_handler
>   instead of widening_optab_handler.
>   * expr.c (expand_expr_real_2): Update calls to
>   find_widening_optab_handler.
>   * optabs.c (expand_widen_pattern_expr): Likewise.
>   (expand_binop_directly): Take the insn_code as a parameter.
>   (expand_binop): Only call find_widening_optab_handler for
>   conversion optabs; use optab_handler otherwise.  Update calls
>   to find_widening_optab_handler and expand_binop_directly.
>   Use convert_optab_handler instead of widening_optab_handler.
>   * tree-ssa-math-opts.c (convert_mult_to_widen): Update calls to
>   find_widening_optab_handler and use scalar_mode rather than
>   machine_mode.
>   (convert_plusminus_to_widen): Likewise.
OK.
jeff


Re: [PATCH, version 5a], Add support for _Float and _FloatX sqrt, fma, fmin, fmax built-in functions

2017-10-30 Thread Joseph Myers
On Mon, 30 Oct 2017, Michael Meissner wrote:

> > One obvious possible fix would be for the builtins.def macros never to 
> > define public names such as sqrtf128 for C++, only ever for C.  That would 
> > be on the basis that C++ code is expected to be using a sqrt overload 
> > anyway - so making the built-in function available under a public name, in 
> > the case where __float128 exists for C++, is a matter for libstdc++ 
> > headers.
> 
> Arghh, yes, I can restrict it to just the C language.  Would you prefer just
> that?  Do we want a target hook to be able to over-ride this on a platform

Yes, I think such a restriction is appropriate (that is, C and ObjC get 
both sqrtf128 and __builtin_sqrtf128 built-in functions, C++ and ObjC++ 
get only __builtin_sqrtf128).

> basis (with the default behavior only enabling it for C)?

If we think that the preferred way for C++ code to access the underlying 
functions is via function overloads for __float128 (not currently 
implemented in libstdc++, it seems) rather than suffixed functions, or 
through accessing the format as long double once that support is working, 
a target hook may not be neded.  If C++ code is expected to use 
f128-suffixed functions directly, presumably it's desirable for a C++ call 
to sqrtf128 to be inlined where possible and so you'd want a hook, or some 
other way of enabling the built-in functions for C++ only in the case 
where they are usefully distinct from the float / double / long double 
functions.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][AArch64] Wrong type-attribute for stp and str

2017-10-30 Thread Wilco Dijkstra
Dominik wrote:

> Could you please also commit the patch? I don’t have commit rights.

I've committed it as r254236.

Wilco

Re: [PATCH] Fix gfortran.dg/dtio_13.f90 failure on at least FreeBSD

2017-10-30 Thread Paul Richard Thomas
Hi Steve,

It looks good to me - OK.

Thanks

Paul

On 30 October 2017 at 18:05, Steve Kargl
 wrote:
> The attached patch fixes the failure of gfortran.dg/dtio_13.f90
> on at least FreeBSD.  This test has been failing for a very long
> time.  In resolve_transfer, gfortran needs to check for a BT_CLASS,
> but failed to to do.  The patch allows one to remove a TODO in
> dt90_13.f90 and more important the dg-error test for a spurious
> error.  OK to commit?
>
> 2017-10-30  Steven G. Kargl   
>
> * resolve.c (resolve_transfer): Set derived to correct symbol for
> BT_CLASS.
>
> 2017-10-30  Steven G. Kargl   
>
> * gfortran.dg/dtio_13.f90: Remove TODO comment and dg-error test.
>
> --
> Steve



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [PATCH, i386]: Fix PR82725, ICE in change_address_1, at emit-rtl.c

2017-10-30 Thread Paolo Carlini

Hi,

On 29/10/2017 18:59, H.J. Lu wrote:



2017-10-29  Uros Bizjak  

 PR target/82725
 * g++.dg/pr82725.C: New test.
The new testcase is failing for everybody in gnu++98 mode. I think you 
want to move it to the cpp0x directory and use the appropriate DejaGnu 
incantations.


Paolo.


[C++ PATCH] operator name cleanup prepatch

2017-10-30 Thread Nathan Sidwell
I've been fixing an itch to do with operator name encoding.  This patch 
is the preliminary cleanup to that.


1) A bunch of whitespace, indentation and line wrapping
2) the parser has a massive switch that maps cpp-tokens to 
operator-names.  I move the operator name lookup to after the switch.
3) grok_op_properties mixes semantic checking for errors with warning 
checks for effective C++ and similar.  It also has multiple checks for 
various operator new/delete handling.  Move all the warning checks to 
after validation, and bunch all the new/delete handling together.  Plus 
moving some argument checking to just before we know we need it.


applying to trunk.

nathan
--
Nathan Sidwell
2017-10-30  Nathan Sidwell  

	cp/
	* call.c (build_op_call_1): Test for FUNCTION_DECL in same manner
	as a few lines earlier.
	* cp-tree.h (PACK_EXPANSION_PATTERN): Fix white space.
	* decl.c (grokfndecl): Fix indentation.
	(compute_array_index_type): Use processing_template_decl_sentinel.
	(grok_op_properties): Move warnings to end.  Reorder other checks
	to group similar entities.  Tweak diagnostics.
	* lex.c (unqualified_name_lookup_error): No need to check name is
	not ERROR_MARK operator.
	* parser.c (cp_parser_operator): Select operator code before
	looking it up.
	* typeck.c (check_return_expr): Fix indentation and line wrapping.

	testsuite/

	* g++.dg/other/operator2.C: Adjust diagnostic.
	* g++.old-deja/g++.jason/operator.C: Likewise.

Index: cp/call.c
===
--- cp/call.c	(revision 254231)
+++ cp/call.c	(working copy)
@@ -4565,11 +4565,14 @@ build_op_call_1 (tree obj, vecfn))
+	  if (TREE_CODE (cand->fn) == FUNCTION_DECL)
 	obj = convert_like_with_context (cand->convs[0], obj, cand->fn,
 	 -1, complain);
 	  else
-	obj = convert_like (cand->convs[0], obj, complain);
+	{
+	  gcc_checking_assert (TYPE_P (cand->fn));
+	  obj = convert_like (cand->convs[0], obj, complain);
+	}
 	  obj = convert_from_reference (obj);
 	  result = cp_build_function_call_vec (obj, args, complain);
 	}
Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 254231)
+++ cp/cp-tree.h	(working copy)
@@ -3460,7 +3460,7 @@ extern void decl_shadowed_for_var_insert
 /* Extracts the type or expression pattern from a TYPE_PACK_EXPANSION or
EXPR_PACK_EXPANSION.  */
 #define PACK_EXPANSION_PATTERN(NODE)\
-  (TREE_CODE (NODE) == TYPE_PACK_EXPANSION? TREE_TYPE (NODE)\
+  (TREE_CODE (NODE) == TYPE_PACK_EXPANSION ? TREE_TYPE (NODE)\
: TREE_OPERAND (NODE, 0))
 
 /* Sets the type or expression pattern for a TYPE_PACK_EXPANSION or
Index: cp/decl.c
===
--- cp/decl.c	(revision 254231)
+++ cp/decl.c	(working copy)
@@ -8698,7 +8698,7 @@ grokfndecl (tree ctype,
 		  "deduction guide %qD must not have a function body", decl);
 }
   else if (IDENTIFIER_ANY_OP_P (DECL_NAME (decl))
-  && !grok_op_properties (decl, /*complain=*/true))
+	   && !grok_op_properties (decl, /*complain=*/true))
 return NULL_TREE;
   else if (UDLIT_OPER_P (DECL_NAME (decl)))
 {
@@ -9472,22 +9472,20 @@ compute_array_index_type (tree name, tre
 itype = build_min (MINUS_EXPR, sizetype, size, integer_one_node);
   else
 {
-  HOST_WIDE_INT saved_processing_template_decl;
-
   /* Compute the index of the largest element in the array.  It is
 	 one less than the number of elements in the array.  We save
 	 and restore PROCESSING_TEMPLATE_DECL so that computations in
 	 cp_build_binary_op will be appropriately folded.  */
-  saved_processing_template_decl = processing_template_decl;
-  processing_template_decl = 0;
-  itype = cp_build_binary_op (input_location,
-  MINUS_EXPR,
-  cp_convert (ssizetype, size, complain),
-  cp_convert (ssizetype, integer_one_node,
-	  complain),
-  complain);
-  itype = maybe_constant_value (itype);
-  processing_template_decl = saved_processing_template_decl;
+  {
+	processing_template_decl_sentinel s;
+	itype = cp_build_binary_op (input_location,
+MINUS_EXPR,
+cp_convert (ssizetype, size, complain),
+cp_convert (ssizetype, integer_one_node,
+		complain),
+complain);
+	itype = maybe_constant_value (itype);
+  }
 
   if (!TREE_CONSTANT (itype))
 	{
@@ -12909,25 +12907,14 @@ bool
 grok_op_properties (tree decl, bool complain)
 {
   tree argtypes = TYPE_ARG_TYPES (TREE_TYPE (decl));
-  tree argtype;
   int methodp = (TREE_CODE (TREE_TYPE (decl)) == METHOD_TYPE);
   tree name = DECL_NAME (decl);
-  enum tree_code operator_code;
-  int arity;
-  bool ellipsis_p;
-  tree class_type;
-
-  /* Count the number of arguments and check for ellipsis.  */
-  for (argtype = argtypes, arity = 0;
-   argtype && argtype != void_list_node;
-   argtype = TREE_CHAIN (argtype))
-++arity;
-  ellips

Re: [PATCH][PR 81376] Remove unnecessary float casts in comparisons

2017-10-30 Thread Jeff Law
On 10/01/2017 02:37 AM, Yuri Gribov wrote:
> Hi all,
> 
> (Previous mail was sent with spurious HTML, sorry!)
> 
> This patch gets rid of float casts in comparisons when all values of
> casted integral type are exactly representable by the float type
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81376).
> 
> Bootstrapped and regtested on x64_64. Ok to commit?
> 
> -Y
> 
> 
> pr81376-1.patch
> 
> 
> From f6081c8fc1d9c20b686af3f4d8b09a5f60491cba Mon Sep 17 00:00:00 2001
> From: Yury Gribov 
> Date: Fri, 29 Sep 2017 07:34:54 +0200
> Subject: [PATCH] 2017-09-29  Yury Gribov  
> 
>   PR middle-end/81376
> 
> gcc/
>   * real.c (format_helper::can_represent_integral_type_p): New function
>   * real.h (format_helper::can_represent_integral_type_p): New function.
>   * match.pd: New pattern.
> 
> gcc/testsuite/
>   * c-c++-common/pr81376.c: New test.
> ---
>  gcc/match.pd | 45 
> ++--
>  gcc/real.c   | 13 +++
>  gcc/real.h   |  1 +
>  gcc/testsuite/c-c++-common/pr81376.c | 39 +++
>  4 files changed, 91 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/pr81376.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 0863273..863c657 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -2906,6 +2906,42 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(if (! HONOR_NANS (@0))
>   (cmp @0 @1))
>  
> +/* Optimize various special cases of (FTYPE) N CMP (FTYPE) M.  */
> +(for cmp (lt le eq ne ge gt)
> + (simplify
> +  (cmp (float@0 @1) (float @2))
Since this runs on GENERIC as well as GIMPLE, I don't think you can rely
on the types of the float expressions being the same?!?  Which implies
subsequent code will need and fmt1/fmt2 because you don't know they're
necessarily the same.

Am I missing something?

jeff


Re: [PATCH,RFC] collect2 LTO for AIX

2017-10-30 Thread Jeff Law
On 10/13/2017 12:04 PM, David Edelsohn wrote:
> The attached patch is an incremental step toward GCC LTO on AIX.  The
> recent Libiberty Simple Object improvements for XCOFF provide more
> capabilities for operations on XCOFF object files, which are a
> prerequisite for GCC LTO functionality.
> 
> This patch adds the basic LTO scanning pass to the COFF support in
> collect2.  I don't believe that this change should affect other COFF
> targets adversely (do they even use collect2?), but I wanted to give
> people an opportunity to comment.
I honestly can't remember what the COFF targets do anymore :-)  Didn't
we delete them all a while back?  I see that the generic COFF targets
died back in gcc-4.4.


Jeff


Re: [PATCH] Change default optimization level to -Og

2017-10-30 Thread Eric Gallager
On Thu, Oct 26, 2017 at 2:16 PM, Eric Gallager  wrote:
> On 10/26/17, Wilco Dijkstra  wrote:
>> GCC's default optimization level is -O0.  Unfortunately unlike other
>> compilers,
>> GCC generates extremely inefficient code with -O0.  It is almost unusable
>> for
>> low-level debugging or manual inspection of generated code.  So a -O option
>> is
>> always required for compilation.  -Og not only allows for fast compilation,
>> but
>> also produces code that is efficient, readable as well as debuggable.
>> Therefore -Og makes for a much better default setting.
>>
>> Any comments?
>
> There are a number of bugs with -Og that I'd want to see fixed before
> making it the default; I'll follow this message up once I find them
> all.

update: I've filed bug 82738 to track them all:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82738
https://gcc.gnu.org/ml/gcc-bugs/2017-10/msg03058.html

>
>>
>> 2017-10-26  Wilco Dijkstra  
>>
>>   * opts.c (default_options_optimization): Set default to -Og.
>>
>> doc/
>>   * invoke.texi (-O0) Remove default mention.
>>   (-Og): Add mention of default setting.
>>
>> --
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index
>> 3328a3b5fafa6a98007eff52d2a26af520de9128..74c33ea35b9f320b419a3417e6007d2391536f1b
>> 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -7343,7 +7343,7 @@ by @option{-O2} and also turns on the following
>> optimization flags:
>>  @item -O0
>>  @opindex O0
>>  Reduce compilation time and make debugging produce the expected
>> -results.  This is the default.
>> +results.
>>
>>  @item -Os
>>  @opindex Os
>> @@ -7371,7 +7371,7 @@ Optimize debugging experience.  @option{-Og} enables
>> optimizations
>>  that do not interfere with debugging. It should be the optimization
>>  level of choice for the standard edit-compile-debug cycle, offering
>>  a reasonable level of optimization while maintaining fast compilation
>> -and a good debugging experience.
>> +and a good debugging experience.  This is the default.
>>  @end table
>>
>>  If you use multiple @option{-O} options, with or without level numbers,
>> diff --git a/gcc/opts.c b/gcc/opts.c
>> index
>> dfad955e220870a3250198640f3790c804b191e0..74511215309f11445685db4894be2ab6881695d3
>> 100644
>> --- a/gcc/opts.c
>> +++ b/gcc/opts.c
>> @@ -565,6 +565,12 @@ default_options_optimization (struct gcc_options
>> *opts,
>>int opt2;
>>bool openacc_mode = false;
>>
>> +  /* Set the default optimization to -Og.  */
>> +  opts->x_optimize_size = 0;
>> +  opts->x_optimize = 1;
>> +  opts->x_optimize_fast = 0;
>> +  opts->x_optimize_debug = 1;
>> +
>>/* Scan to see what optimization level has been specified.  That will
>>   determine the default value of many flags.  */
>>for (i = 1; i < decoded_options_count; i++)
>>
>>


Re: [PATCH][PR 81376] Remove unnecessary float casts in comparisons

2017-10-30 Thread Marc Glisse

On Mon, 30 Oct 2017, Jeff Law wrote:


+/* Optimize various special cases of (FTYPE) N CMP (FTYPE) M.  */
+(for cmp (lt le eq ne ge gt)
+ (simplify
+  (cmp (float@0 @1) (float @2))

Since this runs on GENERIC as well as GIMPLE, I don't think you can rely
on the types of the float expressions being the same?!?  Which implies
subsequent code will need and fmt1/fmt2 because you don't know they're
necessarily the same.


Why would GENERIC have comparisons of operands with different types? The 
front-ends first convert both sides to the same type before sending it 
anywhere near the folding machinery. I am pretty sure we have a number of 
other places in match.pd where we assume that the existence of the 
expression "a OP b" implies that the types of a and b are the same.


--
Marc Glisse


Re: [PATCH][PR 81376] Remove unnecessary float casts in comparisons

2017-10-30 Thread Richard Biener
On October 30, 2017 8:22:52 PM GMT+01:00, Marc Glisse  
wrote:
>On Mon, 30 Oct 2017, Jeff Law wrote:
>
>>> +/* Optimize various special cases of (FTYPE) N CMP (FTYPE) M.  */
>>> +(for cmp (lt le eq ne ge gt)
>>> + (simplify
>>> +  (cmp (float@0 @1) (float @2))
>> Since this runs on GENERIC as well as GIMPLE, I don't think you can
>rely
>> on the types of the float expressions being the same?!?  Which
>implies
>> subsequent code will need and fmt1/fmt2 because you don't know
>they're
>> necessarily the same.
>
>Why would GENERIC have comparisons of operands with different types?
>The 
>front-ends first convert both sides to the same type before sending it 
>anywhere near the folding machinery. I am pretty sure we have a number
>of 
>other places in match.pd where we assume that the existence of the 
>expression "a OP b" implies that the types of a and b are the same.

In fact GENERIC is more strict and requires equal types for operands.

Richard. 


Re: [PATCH][PR 81376] Remove unnecessary float casts in comparisons

2017-10-30 Thread Jeff Law
On 10/30/2017 01:28 PM, Richard Biener wrote:
> On October 30, 2017 8:22:52 PM GMT+01:00, Marc Glisse  
> wrote:
>> On Mon, 30 Oct 2017, Jeff Law wrote:
>>
 +/* Optimize various special cases of (FTYPE) N CMP (FTYPE) M.  */
 +(for cmp (lt le eq ne ge gt)
 + (simplify
 +  (cmp (float@0 @1) (float @2))
>>> Since this runs on GENERIC as well as GIMPLE, I don't think you can
>> rely
>>> on the types of the float expressions being the same?!?  Which
>> implies
>>> subsequent code will need and fmt1/fmt2 because you don't know
>> they're
>>> necessarily the same.
>>
>> Why would GENERIC have comparisons of operands with different types?
>> The 
>> front-ends first convert both sides to the same type before sending it 
>> anywhere near the folding machinery. I am pretty sure we have a number
>> of 
>> other places in match.pd where we assume that the existence of the 
>> expression "a OP b" implies that the types of a and b are the same.
> 
> In fact GENERIC is more strict and requires equal types for operands.
I always thought the type system didn't come into play until we hit
GIMPLE and that in GENERIC we could have mismatched types.  If I'm
wrong, then obviously ignore the question/concern :-)


jeff


Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Richard Biener
On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor  wrote:
>On 10/30/2017 05:45 AM, Richard Biener wrote:
>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>> 
>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>> it important to detect both out of bounds array indices as well
>>> as offsets in calls to restrict-qualified functions like strcpy.
>>> GCC already detects some of these cases but my tests for
>>> the enhanced warning exposed a few gaps.
>>>
>>> The attached patch enhances -Warray-bounds to detect more instances
>>> out-of-bounds indices and offsets to member arrays and non-array
>>> members.  For example, it detects the out-of-bounds offset in the
>>> call to strcpy below.
>>>
>>> The patch is meant to be applied on top posted here but not yet
>>> committed:
>>>https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>
>>> Richard, since this also touches tree-vrp.c I look for your
>comments.
>> 
>> You fail to tell what you are changing and why - I have to reverse
>> engineer this from the patch which a) isn't easy in this case, b)
>feels
>> like a waste of time.  Esp. since the patch does many things.
>> 
>> My first question is why do you add a warning from forwprop?  It
>> _feels_ like you're trying to warn about arbitrary out-of-bound
>> addresses at the point they are folded to MEM_REFs.  And it looks
>> like you're warning about pointer arithmetic like &p->a + 6.
>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>> is not restricted to operate within fields that are appearantly
>> accessed here - the only restriction is with respect to the
>> whole underlying pointed-to-object.
>> 
>> By doing the warning from forwprop you'll run into all such cases
>> introduced by GCC itself during quite late optimization passes.
>
>I haven't run into any such cases.  What would a more appropriate
>place to detect out-of-bounds offsets?  I'm having a hard time
>distinguishing what is appropriate and what isn't.  For instance,
>if it's okay to detect some out of bounds offsets/indices in vrp
>why is it wrong to do a better job of it in forwprop?
>
>> 
>> You're trying to re-do __builtin_object_size even when that wasn't
>> used.
>
>That's not the quite my intent, although it is close.
>
>> 
>> So it looks like you're on the wrong track.  Yes,
>> 
>>   strcpy (p->a + 6, "y");
>> 
>> _may_ be "invalid" C (I'm not even sure about that!) but it
>> is certainly not invalid GIMPLE.
>
>Adding (or subtracting) an integer to/from a pointer to an array
>is defined in both C and C++ only if the result points to an element
>of the array or just past the last element of the array.  Otherwise
>it's undefined. (A non-array object is considered an array of one
>for this purpose.)

On GIMPLE this is indistinguishable from (short *) (p->a) + 3.

GIMPLE is neither C nor C++. 

Richard. 

>
>Martin
>
>> 
>> Richard.
>> 
>>> Jeff, this is the enhancement you were interested in when we spoke
>>> last week.
>>>
>>> Thanks
>>> Martin
>>>
>>> $ cat a.c && gcc -O2 -S -Wall a.c
>>>struct A { char a[4]; void (*pf)(void); };
>>>
>>>void f (struct A *p)
>>>{
>>>  p->a[5] = 'x';// existing -Warray-bounds
>>>
>>>  strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>>>}
>>>
>>>a.c: In function ‘f’:
>>>a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’
>[-Warray-bounds]
>>> strcpy (p->a + 6, "y");
>>> ^~
>>>a.c:1:17: note: member declared here
>>> struct A { char a[4]; void (*pf)(void); };
>>> ^
>>>a.c:5:7: warning: array subscript 5 is above array bounds of
>‘char[4]’
>>> [-Warray-bounds]
>>>   p->a[5] = 'x';
>>>   ^~~



Re: [PATCH v3] Add asan and ubsan support on NetBSD/amd64

2017-10-30 Thread Kamil Rytarowski
On 30.10.2017 08:24, Jakub Jelinek wrote:
> On Thu, Oct 26, 2017 at 09:50:43PM +0200, Kamil Rytarowski wrote:
>> $ make check-asan
>> $ make check-asan-dynamic
>> $ make check-ubsan
> 
> That is testing of the upstream code, not of GCC and the libsanitizer
> copy in GCC.  What I'm more interested to hear is whether
> you've bootstrapped/regtested the gcc tree with this patch on
> x86_64-*-netbsd*, as per https://gcc.gnu.org/contribute.html
> I.e. /configure ...; make -jN bootstrap; make -jN -k check; 
> /contrib/test_summary
> and from there if there are any */asan/* or */ubsan/* FAILs.
> 

I've been executing GCC tests.

Some/many tests were hanging and I was killing them after a longer
period of time. There were certainly environment issues, like attempts
to execute non-existent 'python' (in pkgsrc/NetBSD we version python to
python2.7, python3.6 etc).

http://netbsd.org/~kamil/gcc/test_summary.log.8-20171022.txt

It looks like the tests complains for asan, and nothing complains for
ubsan. I expect that the reporting issue in asan/GCC is generating these
results.

>> 2017-10-26  Kamil Rytarowski  
>>
>>  * sanitizer_common/Makefile.am (sanitizer_common_files): Add
>>  sanitizer_platform_limits_netbsd.cc.
>>  * sanitizer_common/Makefile.in: Regenerated.
>>  * configure.tgt: Enable asan and ubsan on x86_64-*-netbsd*.
> 
>   Jakub
> 




signature.asc
Description: OpenPGP digital signature


Re: [PATCH, i386]: Fix PR82725, ICE in change_address_1, at emit-rtl.c

2017-10-30 Thread Uros Bizjak
On Mon, Oct 30, 2017 at 7:56 PM, Paolo Carlini  wrote:
> Hi,
>
> On 29/10/2017 18:59, H.J. Lu wrote:
>>
>>
>> 2017-10-29  Uros Bizjak  
>>
>>  PR target/82725
>>  * g++.dg/pr82725.C: New test.
>
> The new testcase is failing for everybody in gnu++98 mode. I think you want
> to move it to the cpp0x directory and use the appropriate DejaGnu
> incantations.

Thanks for heads up. Fixed with:

2017-10-30  Uros Bizjak  

* g++.dg/pr82725.C: Move to ...
* g++.dg/cpp0x/pr82725.C: ... here.  Add c++11 target directive.

Tested on x86_64-linux-gnu and committed.

Uros.

Index: g++.dg/cpp0x/pr82725.C
===
--- g++.dg/cpp0x/pr82725.C  (revision 254212)
+++ g++.dg/cpp0x/pr82725.C  (working copy)
@@ -1,4 +1,4 @@
-// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-do compile { target { { i?86-*-* x86_64-*-* } && c++11 } } }
// { dg-require-effective-target pie }
// { dg-options "-O2 -fpie -mtls-direct-seg-refs" }


Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Martin Sebor

On 10/30/2017 01:53 PM, Richard Biener wrote:

On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor  wrote:

On 10/30/2017 05:45 AM, Richard Biener wrote:

On Sun, 29 Oct 2017, Martin Sebor wrote:


In my work on -Wrestrict, to issue meaningful warnings, I found
it important to detect both out of bounds array indices as well
as offsets in calls to restrict-qualified functions like strcpy.
GCC already detects some of these cases but my tests for
the enhanced warning exposed a few gaps.

The attached patch enhances -Warray-bounds to detect more instances
out-of-bounds indices and offsets to member arrays and non-array
members.  For example, it detects the out-of-bounds offset in the
call to strcpy below.

The patch is meant to be applied on top posted here but not yet
committed:
https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

Richard, since this also touches tree-vrp.c I look for your

comments.


You fail to tell what you are changing and why - I have to reverse
engineer this from the patch which a) isn't easy in this case, b)

feels

like a waste of time.  Esp. since the patch does many things.

My first question is why do you add a warning from forwprop?  It
_feels_ like you're trying to warn about arbitrary out-of-bound
addresses at the point they are folded to MEM_REFs.  And it looks
like you're warning about pointer arithmetic like &p->a + 6.
That doesn't look correct to me.  Pointer arithmetic in GIMPLE
is not restricted to operate within fields that are appearantly
accessed here - the only restriction is with respect to the
whole underlying pointed-to-object.

By doing the warning from forwprop you'll run into all such cases
introduced by GCC itself during quite late optimization passes.


I haven't run into any such cases.  What would a more appropriate
place to detect out-of-bounds offsets?  I'm having a hard time
distinguishing what is appropriate and what isn't.  For instance,
if it's okay to detect some out of bounds offsets/indices in vrp
why is it wrong to do a better job of it in forwprop?



You're trying to re-do __builtin_object_size even when that wasn't
used.


That's not the quite my intent, although it is close.



So it looks like you're on the wrong track.  Yes,

   strcpy (p->a + 6, "y");

_may_ be "invalid" C (I'm not even sure about that!) but it
is certainly not invalid GIMPLE.


Adding (or subtracting) an integer to/from a pointer to an array
is defined in both C and C++ only if the result points to an element
of the array or just past the last element of the array.  Otherwise
it's undefined. (A non-array object is considered an array of one
for this purpose.)


On GIMPLE this is indistinguishable from (short *) (p->a) + 3.


Sure, they're both the same:

  pa_3 = &p_2(D)->a;
  _1 = pa_3 + 6;

and that's fine because the implementation of the warning sees and
uses the byte offset from the beginning of a, so I don't understand
the problem you are pointing out.  Can you clarify what you mean?

Martin



GIMPLE is neither C nor C++.

Richard.



Martin



Richard.


Jeff, this is the enhancement you were interested in when we spoke
last week.

Thanks
Martin

$ cat a.c && gcc -O2 -S -Wall a.c
struct A { char a[4]; void (*pf)(void); };

void f (struct A *p)
{
  p->a[5] = 'x';// existing -Warray-bounds

  strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
}

a.c: In function ‘f’:
a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’

[-Warray-bounds]

 strcpy (p->a + 6, "y");
 ^~
a.c:1:17: note: member declared here
 struct A { char a[4]; void (*pf)(void); };
 ^
a.c:5:7: warning: array subscript 5 is above array bounds of

‘char[4]’

[-Warray-bounds]
   p->a[5] = 'x';
   ^~~




Re: [patch, fortran, RFC] Interchange indices for FORALL and DO CONCURRENT if profitable

2017-10-30 Thread Steve Kargl
On Sat, Oct 28, 2017 at 01:23:30PM +0200, Thomas Koenig wrote:
> Hi Steve,
> 
> > On Sat, Oct 28, 2017 at 12:03:58AM +0200, Thomas Koenig wrote:
> >> +/* Callback function to determine if an expression is the
> >> +   corresponding variable.  */
> >> +
> >> +static int
> > static bool
> 
> Most of the functions in the patch are callback functions for
> gfc_code_walker or gfc_expr_walker, respectively. Their
> function arguments are given as
> 
> typedef int (*walk_code_fn_t) (gfc_code **, int *, void *);
> typedef int (*walk_expr_fn_t) (gfc_expr **, int *, void *);
> 
> respectively, so the types of the functions are fixed.
> 

Whoops, I didn't realize that the prototypes were related
to being call back functions.  I noticed the functions were
declared as int, but only returned a single value of 0.

In any event, if you have already applied the patch, it looks
ok to me.

-- 
Steve


Re: [PATCH PR79868 ][aarch64] Fix error calls in aarch64 code so they can be translated (version 2)

2017-10-30 Thread Steve Ellcey
On Thu, 2017-10-26 at 13:56 +0100, Richard Earnshaw (lists) wrote:
> 
> I can't help feeling that all this logic is somewhat excessive and
> changing the wording of each message to include "pragma or attribute"
> would solve it equally well.  With the new context highlighting it's
> trivial to tell which subcase of usage is being referred to.
> 
> R.

I have no problem with that.  Here is a version that uses "pragma or
attribute".

Tested on ToT with no regressions.  Ok to checkin?

Steve Ellcey
sell...@cavium.com



ChangeLog:

2017-10-30  Steve Ellcey  

PR target/79868
* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse):
Remove second argument from aarch64_process_target_attr call.
* config/aarch64/aarch64-protos.h (aarch64_process_target_attr):
Ditto.
* config/aarch64/aarch64.c (aarch64_attribute_info): Change
field type.
(aarch64_handle_attr_arch): Remove second argument.
(aarch64_handle_attr_cpu): Ditto.
(aarch64_handle_attr_tune): Ditto.
(aarch64_handle_attr_isa_flags): Ditto.
(aarch64_process_one_target_attr): Ditto.
(aarch64_process_target_attr): Ditto.
(aarch64_option_valid_attribute_p): Remove second argument.
on aarch64_process_target_attr call.


Testsuite ChangeLog:

2017-10-30  Steve Ellcey  

PR target/79868
* gcc.target/aarch64/spellcheck_1.c: Update dg-error string to match
new format.
* gcc.target/aarch64/spellcheck_2.c: Ditto.
* gcc.target/aarch64/spellcheck_3.c: Ditto.
* gcc.target/aarch64/target_attr_11.c: Ditto.
* gcc.target/aarch64/target_attr_12.c: Ditto.
* gcc.target/aarch64/target_attr_17.c: Ditto.diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index c7d866f..e18ec4a 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -166,7 +166,7 @@ aarch64_pragma_target_parse (tree args, tree pop_target)
  information that it specifies.  */
   if (args)
 {
-  if (!aarch64_process_target_attr (args, "pragma"))
+  if (!aarch64_process_target_attr (args))
 	return false;
 
   aarch64_override_options_internal (&global_options);
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 5d7c5df..345bfe8 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -457,7 +457,7 @@ bool aarch64_gen_adjusted_ldpstp (rtx *, bool, scalar_mode, RTX_CODE);
 
 void aarch64_init_builtins (void);
 
-bool aarch64_process_target_attr (tree, const char*);
+bool aarch64_process_target_attr (tree);
 void aarch64_override_options_internal (struct gcc_options *);
 
 rtx aarch64_expand_builtin (tree exp,
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1cc1043..fa2f86e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9554,9 +9554,8 @@ enum aarch64_attr_opt_type
ATTR_TYPE specifies the type of behavior of the attribute as described
in the definition of enum aarch64_attr_opt_type.
ALLOW_NEG is true if the attribute supports a "no-" form.
-   HANDLER is the function that takes the attribute string and whether
-   it is a pragma or attribute and handles the option.  It is needed only
-   when the ATTR_TYPE is aarch64_attr_custom.
+   HANDLER is the function that takes the attribute string as an argument
+   It is needed only when the ATTR_TYPE is aarch64_attr_custom.
OPT_NUM is the enum specifying the option that the attribute modifies.
This is needed for attributes that mirror the behavior of a command-line
option, that is it has ATTR_TYPE aarch64_attr_mask, aarch64_attr_bool or
@@ -9567,15 +9566,14 @@ struct aarch64_attribute_info
   const char *name;
   enum aarch64_attr_opt_type attr_type;
   bool allow_neg;
-  bool (*handler) (const char *, const char *);
+  bool (*handler) (const char *);
   enum opt_code opt_num;
 };
 
-/* Handle the ARCH_STR argument to the arch= target attribute.
-   PRAGMA_OR_ATTR is used in potential error messages.  */
+/* Handle the ARCH_STR argument to the arch= target attribute.  */
 
 static bool
-aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
+aarch64_handle_attr_arch (const char *str)
 {
   const struct processor *tmp_arch = NULL;
   enum aarch64_parse_opt_result parse_res
@@ -9592,15 +9590,14 @@ aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
   switch (parse_res)
 {
   case AARCH64_PARSE_MISSING_ARG:
-	error ("missing architecture name in 'arch' target %s", pragma_or_attr);
+	error ("missing name in % pragma or attribute");
 	break;
   case AARCH64_PARSE_INVALID_ARG:
-	error ("unknown value %qs for 'arch' target %s", str, pragma_or_attr);
+	error ("invalid name (\"%s\") in % pragma or attribute", str);
 	aarch64_print_hint_for_arch (str);
 	break;
   case AARCH64_PARSE_INVALID_FEATURE:
-	error ("invalid 

Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Richard Biener
On October 30, 2017 9:13:04 PM GMT+01:00, Martin Sebor  wrote:
>On 10/30/2017 01:53 PM, Richard Biener wrote:
>> On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor
> wrote:
>>> On 10/30/2017 05:45 AM, Richard Biener wrote:
 On Sun, 29 Oct 2017, Martin Sebor wrote:

> In my work on -Wrestrict, to issue meaningful warnings, I found
> it important to detect both out of bounds array indices as well
> as offsets in calls to restrict-qualified functions like strcpy.
> GCC already detects some of these cases but my tests for
> the enhanced warning exposed a few gaps.
>
> The attached patch enhances -Warray-bounds to detect more
>instances
> out-of-bounds indices and offsets to member arrays and non-array
> members.  For example, it detects the out-of-bounds offset in the
> call to strcpy below.
>
> The patch is meant to be applied on top posted here but not yet
> committed:
> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>
> Richard, since this also touches tree-vrp.c I look for your
>>> comments.

 You fail to tell what you are changing and why - I have to reverse
 engineer this from the patch which a) isn't easy in this case, b)
>>> feels
 like a waste of time.  Esp. since the patch does many things.

 My first question is why do you add a warning from forwprop?  It
 _feels_ like you're trying to warn about arbitrary out-of-bound
 addresses at the point they are folded to MEM_REFs.  And it looks
 like you're warning about pointer arithmetic like &p->a + 6.
 That doesn't look correct to me.  Pointer arithmetic in GIMPLE
 is not restricted to operate within fields that are appearantly
 accessed here - the only restriction is with respect to the
 whole underlying pointed-to-object.

 By doing the warning from forwprop you'll run into all such cases
 introduced by GCC itself during quite late optimization passes.
>>>
>>> I haven't run into any such cases.  What would a more appropriate
>>> place to detect out-of-bounds offsets?  I'm having a hard time
>>> distinguishing what is appropriate and what isn't.  For instance,
>>> if it's okay to detect some out of bounds offsets/indices in vrp
>>> why is it wrong to do a better job of it in forwprop?
>>>

 You're trying to re-do __builtin_object_size even when that wasn't
 used.
>>>
>>> That's not the quite my intent, although it is close.
>>>

 So it looks like you're on the wrong track.  Yes,

strcpy (p->a + 6, "y");

 _may_ be "invalid" C (I'm not even sure about that!) but it
 is certainly not invalid GIMPLE.
>>>
>>> Adding (or subtracting) an integer to/from a pointer to an array
>>> is defined in both C and C++ only if the result points to an element
>>> of the array or just past the last element of the array.  Otherwise
>>> it's undefined. (A non-array object is considered an array of one
>>> for this purpose.)
>> 
>> On GIMPLE this is indistinguishable from (short *) (p->a) + 3.
>
>Sure, they're both the same:
>
>   pa_3 = &p_2(D)->a;
>   _1 = pa_3 + 6;
>
>and that's fine because the implementation of the warning sees and
>uses the byte offset from the beginning of a, so I don't understand
>the problem you are pointing out.  Can you clarify what you mean?

It does not access the array but the underlying object. On GIMPLE it is just an 
address calculation without constraints. 

Richard. 

>Martin
>
>> 
>> GIMPLE is neither C nor C++.
>> 
>> Richard.
>> 
>>>
>>> Martin
>>>

 Richard.

> Jeff, this is the enhancement you were interested in when we spoke
> last week.
>
> Thanks
> Martin
>
> $ cat a.c && gcc -O2 -S -Wall a.c
> struct A { char a[4]; void (*pf)(void); };
>
> void f (struct A *p)
> {
>   p->a[5] = 'x';// existing -Warray-bounds
>
>   strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
> }
>
> a.c: In function ‘f’:
> a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’
>>> [-Warray-bounds]
>  strcpy (p->a + 6, "y");
>  ^~
> a.c:1:17: note: member declared here
>  struct A { char a[4]; void (*pf)(void); };
>  ^
> a.c:5:7: warning: array subscript 5 is above array bounds of
>>> ‘char[4]’
> [-Warray-bounds]
>p->a[5] = 'x';
>^~~
>> 



Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Martin Sebor

On 10/30/2017 02:56 PM, Richard Biener wrote:

On October 30, 2017 9:13:04 PM GMT+01:00, Martin Sebor  wrote:

On 10/30/2017 01:53 PM, Richard Biener wrote:

On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor

 wrote:

On 10/30/2017 05:45 AM, Richard Biener wrote:

On Sun, 29 Oct 2017, Martin Sebor wrote:


In my work on -Wrestrict, to issue meaningful warnings, I found
it important to detect both out of bounds array indices as well
as offsets in calls to restrict-qualified functions like strcpy.
GCC already detects some of these cases but my tests for
the enhanced warning exposed a few gaps.

The attached patch enhances -Warray-bounds to detect more

instances

out-of-bounds indices and offsets to member arrays and non-array
members.  For example, it detects the out-of-bounds offset in the
call to strcpy below.

The patch is meant to be applied on top posted here but not yet
committed:
 https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

Richard, since this also touches tree-vrp.c I look for your

comments.


You fail to tell what you are changing and why - I have to reverse
engineer this from the patch which a) isn't easy in this case, b)

feels

like a waste of time.  Esp. since the patch does many things.

My first question is why do you add a warning from forwprop?  It
_feels_ like you're trying to warn about arbitrary out-of-bound
addresses at the point they are folded to MEM_REFs.  And it looks
like you're warning about pointer arithmetic like &p->a + 6.
That doesn't look correct to me.  Pointer arithmetic in GIMPLE
is not restricted to operate within fields that are appearantly
accessed here - the only restriction is with respect to the
whole underlying pointed-to-object.

By doing the warning from forwprop you'll run into all such cases
introduced by GCC itself during quite late optimization passes.


I haven't run into any such cases.  What would a more appropriate
place to detect out-of-bounds offsets?  I'm having a hard time
distinguishing what is appropriate and what isn't.  For instance,
if it's okay to detect some out of bounds offsets/indices in vrp
why is it wrong to do a better job of it in forwprop?



You're trying to re-do __builtin_object_size even when that wasn't
used.


That's not the quite my intent, although it is close.



So it looks like you're on the wrong track.  Yes,

strcpy (p->a + 6, "y");

_may_ be "invalid" C (I'm not even sure about that!) but it
is certainly not invalid GIMPLE.


Adding (or subtracting) an integer to/from a pointer to an array
is defined in both C and C++ only if the result points to an element
of the array or just past the last element of the array.  Otherwise
it's undefined. (A non-array object is considered an array of one
for this purpose.)


On GIMPLE this is indistinguishable from (short *) (p->a) + 3.


Sure, they're both the same:

   pa_3 = &p_2(D)->a;
   _1 = pa_3 + 6;

and that's fine because the implementation of the warning sees and
uses the byte offset from the beginning of a, so I don't understand
the problem you are pointing out.  Can you clarify what you mean?


It does not access the array but the underlying object. On GIMPLE it is just an 
address calculation without constraints.


But the computation starts with the subobject and so is only
valid within the bounds of the subobject.  Or are you saying
that GCC emits such GIMPLE expressions for valid code?  If so,
can you give an example of such code?

Or, if that's not it, what exactly is your concern with this
enhancement?  If it's that it's implemented in forwprop, what
would be a better place, e.g., earlier in the optimization
phase?  If it's something something else, I'd appreciate it
if you could explain what.

Martin



Richard.


Martin



GIMPLE is neither C nor C++.

Richard.



Martin



Richard.


Jeff, this is the enhancement you were interested in when we spoke
last week.

Thanks
Martin

$ cat a.c && gcc -O2 -S -Wall a.c
 struct A { char a[4]; void (*pf)(void); };

 void f (struct A *p)
 {
   p->a[5] = 'x';// existing -Warray-bounds

   strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
 }

 a.c: In function ‘f’:
 a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’

[-Warray-bounds]

  strcpy (p->a + 6, "y");
  ^~
 a.c:1:17: note: member declared here
  struct A { char a[4]; void (*pf)(void); };
  ^
 a.c:5:7: warning: array subscript 5 is above array bounds of

‘char[4]’

[-Warray-bounds]
p->a[5] = 'x';
^~~






Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Jeff Law
On 10/30/2017 09:19 AM, Martin Sebor wrote:
> On 10/30/2017 05:45 AM, Richard Biener wrote:
>> On Sun, 29 Oct 2017, Martin Sebor wrote:
>>
>>> In my work on -Wrestrict, to issue meaningful warnings, I found
>>> it important to detect both out of bounds array indices as well
>>> as offsets in calls to restrict-qualified functions like strcpy.
>>> GCC already detects some of these cases but my tests for
>>> the enhanced warning exposed a few gaps.
>>>
>>> The attached patch enhances -Warray-bounds to detect more instances
>>> out-of-bounds indices and offsets to member arrays and non-array
>>> members.  For example, it detects the out-of-bounds offset in the
>>> call to strcpy below.
>>>
>>> The patch is meant to be applied on top posted here but not yet
>>> committed:
>>>    https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>>>
>>> Richard, since this also touches tree-vrp.c I look for your comments.
>>
>> You fail to tell what you are changing and why - I have to reverse
>> engineer this from the patch which a) isn't easy in this case, b) feels
>> like a waste of time.  Esp. since the patch does many things.
>>
>> My first question is why do you add a warning from forwprop?  It
>> _feels_ like you're trying to warn about arbitrary out-of-bound
>> addresses at the point they are folded to MEM_REFs.  And it looks
>> like you're warning about pointer arithmetic like &p->a + 6.
>> That doesn't look correct to me.  Pointer arithmetic in GIMPLE
>> is not restricted to operate within fields that are appearantly
>> accessed here - the only restriction is with respect to the
>> whole underlying pointed-to-object.
>>
>> By doing the warning from forwprop you'll run into all such cases
>> introduced by GCC itself during quite late optimization passes.
> 
> I haven't run into any such cases.  What would a more appropriate
> place to detect out-of-bounds offsets?  I'm having a hard time
> distinguishing what is appropriate and what isn't.  For instance,
> if it's okay to detect some out of bounds offsets/indices in vrp
> why is it wrong to do a better job of it in forwpropI think part of the 
> problem is there isn't a well defined place to do
this kind of warning.  I suspect it's currently in VRP simply because
that is where we had range information in the past.  It's still the
location with the most accurate range information.

In a world where we have an embedded context sensitive range analysis
engine, we should *really* look at pulling the out of bounds array
warnings out of any optimization pass an have a distinct pass to deal
with them.

I guess in the immediate term the question I would ask Martin is what is
it about forwprop that makes it interesting?  Is it because of the
lowering issues we touched on last week?  If so I wonder if we could
recreate an array form from a MEM_REF for the purposes of optimization.
Or if we could just handle MEM_REFs better within the existing warning.


> 
>>
>> You're trying to re-do __builtin_object_size even when that wasn't
>> used.
> 
> That's not the quite my intent, although it is close.
Wouldn't we be better off improving _b_o_s?

> 
>>
>> So it looks like you're on the wrong track.  Yes,
>>
>>   strcpy (p->a + 6, "y");
>>
>> _may_ be "invalid" C (I'm not even sure about that!) but it
>> is certainly not invalid GIMPLE.
> 
> Adding (or subtracting) an integer to/from a pointer to an array
> is defined in both C and C++ only if the result points to an element
> of the array or just past the last element of the array.  Otherwise
> it's undefined. (A non-array object is considered an array of one
> for this purpose.)
I think Richi's argument is that gimple allows things that are not
necessarily allowed by the C/C++ standard.  For example we support
virtual origins from Ada, which internally would look something like
invalid C code

OTOH, we currently have code in tree-vrp.c which warns if we compute the
address of an out of bounds array index which is very C/C++ centric.

jeff


Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Jeff Law
On 10/29/2017 10:01 AM, Martin Sebor wrote:
> In my work on -Wrestrict, to issue meaningful warnings, I found
> it important to detect both out of bounds array indices as well
> as offsets in calls to restrict-qualified functions like strcpy.
> GCC already detects some of these cases but my tests for
> the enhanced warning exposed a few gaps.
> 
> The attached patch enhances -Warray-bounds to detect more instances
> out-of-bounds indices and offsets to member arrays and non-array
> members.  For example, it detects the out-of-bounds offset in the
> call to strcpy below.
> 
> The patch is meant to be applied on top posted here but not yet
> committed:
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
> 
> Richard, since this also touches tree-vrp.c I look for your comments.
> 
> Jeff, this is the enhancement you were interested in when we spoke
> last week.
> 
> Thanks
> Martin
> 
> $ cat a.c && gcc -O2 -S -Wall a.c
>   struct A { char a[4]; void (*pf)(void); };
> 
>   void f (struct A *p)
>   {
> p->a[5] = 'x';// existing -Warray-bounds
> 
> strcpy (p->a + 6, "y");   // enhanced -Warray-bounds
>   }
> 
>   a.c: In function ‘f’:
>   a.c:7:3: warning: offset 6 is out of bounds of ‘char[4]’ [-Warray-bounds]
>    strcpy (p->a + 6, "y");
>    ^~
>   a.c:1:17: note: member declared here
>    struct A { char a[4]; void (*pf)(void); };
>    ^
>   a.c:5:7: warning: array subscript 5 is above array bounds of
> ‘char[4]’ [-Warray-bounds]
>  p->a[5] = 'x';
>  ^~~
> 
> gcc-82455.diff
> 
> 
> PR tree-optimization/82455 - missing -Warray-bounds on strcpy offset in an 
> out-of-bounds range
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/82455
>   * tree-ssa-forwprop.c (maybe_warn_offset_out_of_bounds): New function.
>   (forward_propagate_addr_expr_1): Call it.
>   * tree-vrp.c (check_array_ref): Return bool.  Handle flexible array
>   members, string literals, and inner indices.
>   (search_for_addr_array): Return bool.  Add an argument.  Add detail
>   to diagnostics.
>   (check_array_bounds): New function.
>   * tree-vrp.h (check_array_bounds): Declare it.
>   * wide-int.h (sign_mask): Assert a precondition.So one of the things I 
> see here is exposing the array bounds checking
bits from VRP to other passes.  I'm not inherently against that -- I'd
like to be able to do that in other passes for various reasons.

It also seems like you're improving that code.


So I'd probably start with pulling the improvements into their own
patch.  That shouldn't be terribly controversial.


I'd have to look more closely, but let's make sure we have a query API
that does not warn, but merely returns a boolean.  I think we'll have a
desire for that kind of query.

I wonder if all this would be better implemented in the context of
_b_o_s.  That may also make it easy to use a domwalk to gather the
ranges given potentially better ranges than we're getting out of VRP.

I would also request that we dig deep into the routines we use here.
I'm currently trying to pull apart the various dependencies within VRP
and wouldn't want to make that worse.  Even if it's just a proof of
concept, pull the routines into a different file to see what the "touch"
points are.  I'm rather dismayed at how intertwined various bits within
VRP are.




> +/* Check to see if the expression (char*)PTR + OFF is valid (i.e., it
> +   doesn't overflow and the result is within the bounds of the object
> +   pointed to by PTR.  OFF is an offset in bytes.  If it isn't valid,
> +   issue a -Warray-bounds warning.  */
> +
> +static bool
> +maybe_warn_offset_out_of_bounds (location_t loc, tree ptr, tree off)
> +{
> +  offset_int cstoff = 0;
> +
> +  bool ignore_off_by_1 = false;
> +
> +  if (TREE_CODE (ptr) == SSA_NAME)
> +{
> +  gimple *stmt = SSA_NAME_DEF_STMT (ptr);
> +  if (is_gimple_assign (stmt))
> + {
> +   tree_code code = gimple_assign_rhs_code (stmt);
> +   if (code == ADDR_EXPR)
> + {
> +   ptr = gimple_assign_rhs1 (stmt);
> +   ignore_off_by_1 = true;
> + }
> +   else if (code == POINTER_PLUS_EXPR)
> + {
> +   ptr = gimple_assign_rhs1 (stmt);
> +   tree offset = gimple_assign_rhs2 (stmt);
> +   if (maybe_warn_offset_out_of_bounds (loc, ptr, offset))
> + return true;
> +
> +   /* To do: handle ranges.  */
> +   if (TREE_CODE (offset) != INTEGER_CST)
> + return false;
Even if we don't handle full ranges, handling ranges that have collapsed
to singletons ought to be trivial.  Similarly handling a range that
lives entirely out of bounds shouldn't be too hard either.


> +
> +  if (check_array_bounds (ptr, ignore_off_by_1))
> +return true;
> +
> +  bool addr_expr = false;
> +
> +  if (TREE_CODE (ptr) == ADDR_EXPR)
> +{
> +  ptr = TREE_OPERAND (ptr, 0);
> +  addr_expr = true;
> +}
> +
> +  /* The type

Re: [Patch, fortran] PR80850 - Sourced allocate() fails to allocate a pointer

2017-10-30 Thread Paul Richard Thomas
Dear Andre,

Committed to trunk as revision 254244.

In order to debug the code, I was forced to use 7-branch for
development since there were dependencies that detected the change in
module number. 7-branch accepted the assignments without casts but I
was forced to include them in trunk. As advertised the testcase just
enforces the assignment to the _len field through a tree dump.

7-branch will come on Wednesday. Tomorrow is full of Halloween fun

Thanks

Paul


On 30 October 2017 at 13:39, Andre Vehreschild  wrote:
> Hi Paul,
>
> whoopsie, I remember that I inserted the check for _len > 0 in allocate(). So
> it was me causing the bug. Thanks that you found it. The patch looks good to
> me. Thanks for the work.
>
> - Andre
>
> On Mon, 30 Oct 2017 12:20:20 +
> Paul Richard Thomas  wrote:
>
>> Dear All,
>>
>> This bug took a silly amount of effort to diagnose but once done, the
>> fix was obvious.
>>
>> The bug is triggered in this function from the reporter's source file
>> gfc_graph.F90:
>>
>> function GraphIterAppendVertex(this,vertex) result(ierr)
>> !Appends a new vertex to the graph.
>>  implicit none
>>  integer(INTD):: ierr   !out: error code
>>  class(graph_iter_t), intent(inout):: this  !inout:
>> graph iterator
>>  class(graph_vertex_t), intent(in), target:: vertex !in: new vertex
>>  type(vert_link_refs_t):: vlr
>>
>>  ierr=this%vert_it%append(vertex) !<= right here!
>> snip
>>  return
>> end function GraphIterAppendVertex
>>
>> 'vertex' is being passed to a class(*) dummy. As you will see from the
>> attached patch and the ChangeLog, 'vertex' was being cast as unlimited
>> polymorphic and assigned to the passed actual argument. This left the
>> _len field indeterminate since it is not present in normal (limited?)
>> polymorphic objects.
>>
>> Further down the way, in stsubs.F90(clone_object) an allocation is
>> being made using the unlimited version of 'vertex as a source. Since
>> the size passed to malloc for an unlimited source is, for  _len > 0,
>> the value of the _len multiplied by the vtable _size, the amount of
>> memory is also indeterminate and causes the operating system to flag a
>> failed allocation, pretty much at random.
>>
>> The ChangeLog and the patch describe the fix sufficiently well as not
>> to require further explanation. I will write a testcase that tests the
>> tree dump for the _len field before committing the patch.
>>
>> Bootstraps and regtests on FC23/x86_64 - OK for 7- and 8-branches?
>>
>> If I do not receive any contrary comments, I will commit tonight.
>>
>> Regards
>>
>> Paul
>>
>> 2017-10-30  Paul Thomas  
>>
>> PR fortran/80850
>> * trans_expr.c (gfc_conv_procedure_call): When passing a class
>> argument to an unlimited polymorphic dummy, it is wrong to cast
>> the passed expression as unlimited, unless it is unlimited. The
>> correct way is to assign to each of the fields and set the _len
>> field to zero.
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [PATCH, version 5a], Add support for _Float and _FloatX sqrt, fma, fmin, fmax built-in functions

2017-10-30 Thread Michael Meissner
On Mon, Oct 30, 2017 at 06:32:44PM +, Joseph Myers wrote:
> On Mon, 30 Oct 2017, Michael Meissner wrote:
> 
> > > One obvious possible fix would be for the builtins.def macros never to 
> > > define public names such as sqrtf128 for C++, only ever for C.  That 
> > > would 
> > > be on the basis that C++ code is expected to be using a sqrt overload 
> > > anyway - so making the built-in function available under a public name, 
> > > in 
> > > the case where __float128 exists for C++, is a matter for libstdc++ 
> > > headers.
> > 
> > Arghh, yes, I can restrict it to just the C language.  Would you prefer just
> > that?  Do we want a target hook to be able to over-ride this on a platform
> 
> Yes, I think such a restriction is appropriate (that is, C and ObjC get 
> both sqrtf128 and __builtin_sqrtf128 built-in functions, C++ and ObjC++ 
> get only __builtin_sqrtf128).
> 
> > basis (with the default behavior only enabling it for C)?
> 
> If we think that the preferred way for C++ code to access the underlying 
> functions is via function overloads for __float128 (not currently 
> implemented in libstdc++, it seems) rather than suffixed functions, or 
> through accessing the format as long double once that support is working, 
> a target hook may not be neded.  If C++ code is expected to use 
> f128-suffixed functions directly, presumably it's desirable for a C++ call 
> to sqrtf128 to be inlined where possible and so you'd want a hook, or some 
> other way of enabling the built-in functions for C++ only in the case 
> where they are usefully distinct from the float / double / long double 
> functions.

This patch fixes exporting the non __builtin_ names to be done by default only
for the C language.  I added a target hook in case a port needs to enable
built-ins for C++ (either wholesale, or for particular built-in functions).

I have done a bootstrap on a little endian Power8 system and it had no
regressions.  The x86-64 bootstrap just entered stage3.  Assuming there are no
regressions on x86-64, can I check this patch into the trunk?

2017-10-30  Michael Meissner  

* builtins.def (DEF_FLOATN_BUILTIN): Change most _Float and
_FloatX built-in functions so that the variant without the
"__builtin_" prefix is only enabled for the GNU C language when it
is non-strict ANSI/ISO mode.
(DEF_EXT_LIB_FLOATN_NX_BUILTINS): Likewise.
* target.def (floatn_builtin_p): Add a target hook to control
whether _Float and _FloatX built-in functions without the
"__builtin_" prefix are enabled.  Include langhooks.h in
targhooks.c.
* targhooks.h (default_floatn_builtin_p): Likewise.
* targhooks.c (default_floatn_builtin_p): Likewise.
* doc/tm.texi.in (TARGET_FLOATN_BUILTIN_P): Document the
floatn_builtin_p target hook.
* doc/tm.texi (TARGET_FLOATN_BUILTIN_P): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/builtins.def
===
--- gcc/builtins.def(revision 254232)
+++ gcc/builtins.def(working copy)
@@ -130,18 +130,26 @@ along with GCC; see the file COPYING3.
 
 /* A set of GCC builtins for _FloatN and _FloatNx types.  TYPE_MACRO is called
with an argument such as FLOAT32 to produce the enum value for the type.  If
-   we are being fully conformant we ignore the version of these builtins that
-   does not being with __builtin_.  */
+   we are compiling for the C language with GNU extensions, we enable the name
+   without the __builtin_ prefix as well as the name with the __builtin_
+   prefix.  C++ does not enable these names by default because they don't have
+   the _Float and _FloatX keywords, and a class based library should use
+   the __builtin_ names.  */
+#undef DEF_FLOATN_BUILTIN
+#define DEF_FLOATN_BUILTIN(ENUM, NAME, TYPE, ATTRS)\
+  DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
+  targetm.floatn_builtin_p ((int) ENUM),   \
+  targetm.floatn_builtin_p ((int) ENUM), true, ATTRS, false, true)
 #undef DEF_EXT_LIB_FLOATN_NX_BUILTINS
-#define DEF_EXT_LIB_FLOATN_NX_BUILTINS(ENUM, NAME, TYPE_MACRO, ATTRS)  
\
-  DEF_EXT_LIB_BUILTIN (ENUM ## F16, NAME "f16", TYPE_MACRO (FLOAT16), ATTRS) \
-  DEF_EXT_LIB_BUILTIN (ENUM ## F32, NAME "f32", TYPE_MACRO (FLOAT32), ATTRS) \
-  DEF_EXT_LIB_BUILTIN (ENUM ## F64, NAME "f64", TYPE_MACRO (FLOAT64), ATTRS) \
-  DEF_EXT_LIB_BUILTIN (ENUM ## F128, NAME "f128", TYPE_MACRO (FLOAT128), 
ATTRS)\
-  DEF_EXT_LIB_BUILTIN (ENUM ## F32X, NAME "f32x", TYPE_MACRO (FLOAT32X), 
ATTRS)\
-  DEF_EXT_LIB_BUILTIN (ENUM ## F64X, NAME "f64x", TYPE_MACRO (FLOAT64X), 
ATTRS)\
-  DEF_EXT_LIB_BUILTIN (ENUM ## F128X, NAME "f128x", TYPE_MACRO (FLOAT128X), \
-   ATTRS)
+#define DEF_EXT_LIB_FLOATN_NX_BUILTINS(ENUM, NAM

[PATCH] Add fields to struct gomp_thread for debugging purposes

2017-10-30 Thread Kevin Buettner
This patch adds a new member named "pthread_id" to the gomp_thread
struct.  It is initialized in team.c.

It also adds a field named "parent" which is initialized to the thread
which created the thread in question.  For non-nested parallel
regions, this is always the master thread.

These new fields serve no purpose in a normally running OpenMP
program.  They are intended to be used by a debugger for identifying
threads and for finding the parent thread.

I've done a "make bootstrap" and have regression tested these changes
with no regressions found.

libgomp/ChangeLog:

* libgomp.h (struct gomp_thread): Add new member "pthread_id"
and "parent".
* team.c (struct gomp_thread_start_data): Add field "parent".
(gomp_thread_start): Set parent and pthread_id.
(gomp_team_start): Initialize master thread.  Initialize parent
in the start data.
---
 libgomp/libgomp.h |  7 +++
 libgomp/team.c| 13 +
 2 files changed, 20 insertions(+)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 940b5b8..7fa64f7 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -611,6 +611,13 @@ struct gomp_thread
  to any place.  */
   unsigned int place;
 
+  /* The pthread id associated with this thread.  This is required for
+ debugging.  */
+  pthread_t pthread_id;
+
+  /* Thread which spawned this one.  This is required for debugging.  */
+  struct gomp_thread *parent;
+
   /* User pthread thread pool */
   struct gomp_thread_pool *thread_pool;
 };
diff --git a/libgomp/team.c b/libgomp/team.c
index 676614a..17a0b3d 100644
--- a/libgomp/team.c
+++ b/libgomp/team.c
@@ -58,6 +58,7 @@ struct gomp_thread_start_data
   struct gomp_thread_pool *thread_pool;
   unsigned int place;
   bool nested;
+  struct gomp_thread *parent;
 };
 
 
@@ -89,9 +90,12 @@ gomp_thread_start (void *xdata)
   thr->ts = data->ts;
   thr->task = data->task;
   thr->place = data->place;
+  thr->parent = data->parent;
 
   thr->ts.team->ordered_release[thr->ts.team_id] = &thr->release;
 
+  thr->pthread_id = pthread_self ();
+
   /* Make thread pool local. */
   pool = thr->thread_pool;
 
@@ -718,6 +722,14 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned 
nthreads,
   attr = &thread_attr;
 }
 
+  /* Add the master thread to threads[] and record its pthread id too.  */
+  if (pool->threads[0] == NULL)
+{
+  pool->threads[0] = thr;
+  thr->pthread_id = pthread_self ();
+  thr->parent = NULL;
+}
+
   start_data = gomp_alloca (sizeof (struct gomp_thread_start_data)
* (nthreads-i));
 
@@ -812,6 +824,7 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned 
nthreads,
   team->implicit_task[i].icv.bind_var = bind_var;
   start_data->thread_pool = pool;
   start_data->nested = nested;
+  start_data->parent = thr;
 
   attr = gomp_adjust_thread_attr (attr, &thread_attr);
   err = pthread_create (&pt, attr, gomp_thread_start, start_data++);



Re: [PATCH] Add fields to struct gomp_thread for debugging purposes

2017-10-30 Thread Kevin Buettner
Below is some additional information about the work I've been doing. 
It may be useful in understanding where I'm going with my libgomp
patch and other patches still to come...

I've been working on improvements to gdb, gcc, and libgomp which make
GDB able to better access variables in an OpenMP program.  Access to
some variables was already possible even before starting my work.  One
case which did not work is access to variables which are not used in
an OpenMP parallel region.

Consider this program:

void foo (int a1) {}

int
main (void)
{
  static int s1 = -41;
  int i1 = 11, i2;

  for (i2 = 1; i2 <= 2; i2++)
{
  int pass = i2;
#pragma omp parallel num_threads (2) firstprivate (i1)
  {
foo (i1);
  }
  foo(pass);
}
  foo (s1); foo (i2);
}

When using GDB, if a breakpoint is placed on the "foo (i1)" line, GDB
was unable to access variables `s1', `i2', or `pass' in either of the
two threads that are created.

I recently committed a patch to GCC which now allows GDB to access all
of these variables, but at the moment, GDB is only able to find stack
based variables (e.g. i2 and pass) for the master thread.  Finding
static variables (e.g. s1) now works from any thread.

In order to find stack based variables from non-master thread(s), GDB
needs to be able to find the parent thread and then look for the
variables on the stack of the parent thread.

I've recently committed/pushed patches to GDB which map a thread
handle to one of GDB's internal thread identifiers.  This work
also exposes the mapping functionality via GDB's Python interface.
I provide an example of how this might be used later on.

On the GDB side of things, I have two other patches.

The first one implements an interface for a python function which I've
named `thread_parent'.  The idea here is that we implement
thread_parent in Python and that GDB then uses this implementation,
when appropriate, to locate a parent thread.

The second patch makes calls to the thread_parent to find the thread
to search for stack based variables which are not found on the stack
of the current thread under consideration.

I also have a patch to libgomp which adds pthread id and parent
fields to the gomp_thread struct.  These fields make it possible
for a debugger to know which pthread_t identifier corresponds to
a particular gomp thread.  It also makes it possible for the debugger
to discern the parent / child relationship among threads.

Using the above work, thread_parent can be implemented in Python
as follows:

def thr_parent (thr):
try:
h = gdb.parse_and_eval("gomp_tls_data->parent->pthread_id")
parent = gdb.selected_inferior().thread_from_thread_handle(h)
return parent
except:
return None

gdb.thread_parent = thr_parent

Note the ease with with the thread handle is obtained from libgomp. 
GDB's expression evaluator (which relies on having accurate DWARF
info) is used to fetch the thread handle via this expression:

gomp_tls_data->parent->pthread_id

This is the very expression that one might use within libgomp to
access the thread handle of the parent.  The master thread has a NULL
parent, so attempting to access pthread_id for NULL throws an
exception, causing None to be returned.

The file in which this python code resides is placed in one of the
standard locations for python plugins for GDB.  It must follow the
naming conventions which will cause it to be loaded when the libgomp
shared object is loaded.  For the current version of libgomp, the file
is named libgomp.so.1.0.0-gdb.py.  I've been placing it within the
same directory as libgomp.so, which is just a symlink to
libgomp.so.1.0.0.  This file may also be placed a location relative to
auto-load in GDB's installed data directory.  Also, it is my
understanding that this script could be placed in a .debug_gdb_scripts
section of the libgomp shared library.

It is expected that some of this work will (still) prove useful when
an OMPD library is implemented for libgomp.  Some of it will, of
course, have to be discarded, but in the interim, it will improve
GDB's capability for debugging OpenMP programs.


Re: [PATCH, version 5a], Add support for _Float and _FloatX sqrt, fma, fmin, fmax built-in functions

2017-10-30 Thread Michael Meissner
The x86-64 bootstrap/make check run has now finished, and there were no
regressions.  Can I check the changes into trunk?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455)

2017-10-30 Thread Martin Sebor

On 10/30/2017 03:48 PM, Jeff Law wrote:

On 10/30/2017 09:19 AM, Martin Sebor wrote:

On 10/30/2017 05:45 AM, Richard Biener wrote:

On Sun, 29 Oct 2017, Martin Sebor wrote:


In my work on -Wrestrict, to issue meaningful warnings, I found
it important to detect both out of bounds array indices as well
as offsets in calls to restrict-qualified functions like strcpy.
GCC already detects some of these cases but my tests for
the enhanced warning exposed a few gaps.

The attached patch enhances -Warray-bounds to detect more instances
out-of-bounds indices and offsets to member arrays and non-array
members.  For example, it detects the out-of-bounds offset in the
call to strcpy below.

The patch is meant to be applied on top posted here but not yet
committed:
    https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

Richard, since this also touches tree-vrp.c I look for your comments.


You fail to tell what you are changing and why - I have to reverse
engineer this from the patch which a) isn't easy in this case, b) feels
like a waste of time.  Esp. since the patch does many things.

My first question is why do you add a warning from forwprop?  It
_feels_ like you're trying to warn about arbitrary out-of-bound
addresses at the point they are folded to MEM_REFs.  And it looks
like you're warning about pointer arithmetic like &p->a + 6.
That doesn't look correct to me.  Pointer arithmetic in GIMPLE
is not restricted to operate within fields that are appearantly
accessed here - the only restriction is with respect to the
whole underlying pointed-to-object.

By doing the warning from forwprop you'll run into all such cases
introduced by GCC itself during quite late optimization passes.


I haven't run into any such cases.  What would a more appropriate
place to detect out-of-bounds offsets?  I'm having a hard time
distinguishing what is appropriate and what isn't.  For instance,
if it's okay to detect some out of bounds offsets/indices in vrp
why is it wrong to do a better job of it in forwpropI think part of the problem 
is there isn't a well defined place to do

this kind of warning.  I suspect it's currently in VRP simply because
that is where we had range information in the past.  It's still the
location with the most accurate range information.

In a world where we have an embedded context sensitive range analysis
engine, we should *really* look at pulling the out of bounds array
warnings out of any optimization pass an have a distinct pass to deal
with them.

I guess in the immediate term the question I would ask Martin is what is
it about forwprop that makes it interesting?  Is it because of the
lowering issues we touched on last week?  If so I wonder if we could
recreate an array form from a MEM_REF for the purposes of optimization.
Or if we could just handle MEM_REFs better within the existing warning.


I put it in forwprop only because that was the last stage where
there's still enough context before the POINTER_PLUS_EXPR is
folded into MEM_REF to tell an offset from the beginning of
a subobject from the one from the beginning of the bigger object
of which the subobject is a member.  I certainly don't mind moving
it somewhere else more appropriate if this isn't ideal, just as
long it doesn't cripple the detection (e.g., as long as we still
have range information).


You're trying to re-do __builtin_object_size even when that wasn't
used.


That's not the quite my intent, although it is close.

Wouldn't we be better off improving _b_o_s?





So it looks like you're on the wrong track.  Yes,

   strcpy (p->a + 6, "y");

_may_ be "invalid" C (I'm not even sure about that!) but it
is certainly not invalid GIMPLE.


Adding (or subtracting) an integer to/from a pointer to an array
is defined in both C and C++ only if the result points to an element
of the array or just past the last element of the array.  Otherwise
it's undefined. (A non-array object is considered an array of one
for this purpose.)

I think Richi's argument is that gimple allows things that are not
necessarily allowed by the C/C++ standard.  For example we support
virtual origins from Ada, which internally would look something like
invalid C code

OTOH, we currently have code in tree-vrp.c which warns if we compute the
address of an out of bounds array index which is very C/C++ centric.


I of course don't want to break anything.  I didn't see any fallout
in my testing and I normally test all the front ends, including Ada,
but let me check to make sure I tested it this time (I had made some
temporary changes to my build script and may have disabled it.)  Let
me double check it after I get back from my trip.

Martin


  1   2   >