Re: [PATCH GCC][4/5]Improve loop distribution to handle hmmer

2017-06-04 Thread Kugan Vivekanandarajah
Hi Bin,

Thanks for posting the patch. I haven't looked at it in detail yet, but I have
a couple of quick questions.

1. Shouldn’t the run-time alias check for versioning happen only when
vectorisation is enabled? You seem to be using
IFN_LOOP_DIST_ALIAS when vectorising, but you seem to be versioning
otherwise too.

2. As I understand it, loop distribution is mostly beneficial when the
transformations it enables succeed, such as the resulting loop being
vectorized.  I am wondering if we should only version when the loop’s
control flow is simple enough for it to be vectorizable. I know that you
are adding the internal function for this to some extent, but for some
cases we should be able to say in loop distribution itself that
the control flow will prevent the loop from being vectorized.

Btw, did you run SPEC2006 with this? Any notable changes?

Thanks,
Kugan




On 2 June 2017 at 21:51, Bin Cheng  wrote:
> Hi,
> This is the main patch of the change.  It improves loop distribution by
> versioning loop under runtime alias check conditions, as well as better
> partition fusion.  As described in comments, the patch basically implements
> distribution in the following steps:
>
>  1) Seed partitions with specific type statements.  For now we support
> two types of seed statements: statement defining variable used outside
> of loop; statement storing to memory.
>  2) Build reduced dependence graph (RDG) for loop to be distributed.
> The vertices (RDG:V) model all statements in the loop and the edges
> (RDG:E) model flow and control dependences between statements.
>  3) Apart from RDG, compute data dependences between memory references.
>  4) Starting from seed statement, build up partition by adding depended
> statements according to RDG's dependence information.  Partition is
> classified as parallel type if it can be executed parallelly; or as
> sequential type if it can't.  Parallel type partition is further
> classified as different builtin kinds if it can be implemented as
> builtin function calls.
>  5) Build partition dependence graph (PG) based on data dependences.
> The vertices (PG:V) model all partitions and the edges (PG:E) model
> all data dependences between every pair of partitions.  In general,
> data dependence is either known or unknown at compilation time.  In C
> family languages, there is quite a number of dependences unknown at
> compilation time because of possible alias relations of data references.
> We categorize PG's edges into two types: "true" edge that represents
> compilation time known data dependences; "alias" edge for all other
> data dependences.
>  6) Traverse subgraph of PG as if all "alias" edges don't exist.  Merge
> partitions in each strongly connected component (SCC) correspondingly.
> Build new PG for merged partitions.
>  7) Traverse PG again and this time with both "true" and "alias" edges
> included.  We try to break SCCs by removing some edges.  Because
> SCCs by "true" edges are all fused in step 6), we can break SCCs
> by removing some "alias" edges.  It's NP-hard to choose optimal
> edge set, fortunately simple approximation is good enough for us
> given the small problem scale.
>  8) Collect all data dependences of the removed "alias" edges.  Create
> runtime alias checks for collected data dependences.
>  9) Version loop under the condition of runtime alias checks.  Given
> loop distribution generally introduces additional overhead, it is
> only useful if vectorization is achieved in distributed loop.  We
> version loop with internal function call IFN_LOOP_DIST_ALIAS.  If
> no distributed loop can be vectorized, we simply remove distributed
> loops and recover to the original one.
>
> Also, there is more to improve in the future (which shouldn't be difficult):
> TODO:
>  1) We only distribute innermost loops now.  This pass should handle loop
> nests in the future.
>  2) We only fuse partitions in SCC now.  A better fusion algorithm is
> desired to minimize loop overhead, maximize parallelism and maximize
>
> This patch also fixes a couple of latent bugs in the original implementation.
>
> After this change, the kernel loop in hmmer can be distributed and vectorized
> as a result.  This gives an obvious performance improvement.  There is still
> an inefficient code generation issue, which I will try to fix in loop split.
> Apart from this, the next opportunity in hmmer is to eliminate a number of
> dead stores under proper alias information.
> Bootstrapped and tested at O2/O3 on x86_64 and AArch64.  Is it OK?
>
> Thanks,
> bin
> 2017-05-31  Bin Cheng  
>
> * cfgloop.h (struct loop): New field ldist_alias_id.
> * cfgloopmanip.c 
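
Steps 8) and 9) of the quoted description can be illustrated with a hand-written sketch. This is not the code GCC generates and it does not use IFN_LOOP_DIST_ALIAS; all function names are hypothetical, and the overlap test is a simplified stand-in for the runtime alias checks the patch creates. It assumes a loop with two partitions that are connected only by "alias" edges (the store to a[i] against the load from c[i] and the store to b[i]).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Original loop: two statements whose memory references may alias.
void original_loop(int *a, int *b, const int *c, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    a[i] = a[i] + 1;  // would become partition 1
    b[i] = c[i] * 2;  // would become partition 2
  }
}

// Hypothetical runtime check: true if [p, p+n) and [q, q+n) do not overlap.
bool ranges_disjoint(const int *p, const int *q, std::size_t n) {
  std::uintptr_t lo_p = reinterpret_cast<std::uintptr_t>(p);
  std::uintptr_t lo_q = reinterpret_cast<std::uintptr_t>(q);
  std::uintptr_t bytes = n * sizeof(int);
  return lo_p + bytes <= lo_q || lo_q + bytes <= lo_p;
}

// Versioned loop: distributed loops run only when the cross-partition
// references are proven disjoint at run time; otherwise fall back.
void versioned_loop(int *a, int *b, const int *c, std::size_t n) {
  if (ranges_disjoint(a, c, n) && ranges_disjoint(a, b, n)) {
    for (std::size_t i = 0; i < n; ++i)
      a[i] = a[i] + 1;
    for (std::size_t i = 0; i < n; ++i)
      b[i] = c[i] * 2;
  } else {
    original_loop(a, b, c, n);
  }
}

// Compares both versions on a disjoint case and on an overlapping case.
bool demo_versioning_matches_original() {
  std::vector<int> a1{1, 2, 3, 4}, b1(4), c1{5, 6, 7, 8};
  std::vector<int> a2 = a1, b2(4), c2 = c1;
  original_loop(a1.data(), b1.data(), c1.data(), 4);
  versioned_loop(a2.data(), b2.data(), c2.data(), 4);
  bool disjoint_ok = (a1 == a2) && (b1 == b2);

  // Here c points one element into a, so the alias check must fail.
  std::vector<int> x1{1, 2, 3, 4, 5}, y1(4);
  std::vector<int> x2 = x1, y2(4);
  original_loop(x1.data(), y1.data(), x1.data() + 1, 4);
  versioned_loop(x2.data(), y2.data(), x2.data() + 1, 4);
  return disjoint_ok && (x1 == x2) && (y1 == y2);
}
```

In the overlapping case the distributed form would read already-incremented elements through c, which is why the fallback copy of the original loop has to be kept.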

Re: [testsuite]MIPS remove duplicate div-x test

2017-06-04 Thread Paul Hua
Committed as r248868.

Thanks.
Paul.

On Mon, Jun 5, 2017 at 4:41 AM, Matthew Fortune
 wrote:
> Hi Paul,
>
> Paul Hua  writes:
>> cc: Matthew.
>>
>> ping.
>
> Sorry a little slow on the reply.
>
>> On Thu, Jun 1, 2017 at 3:35 PM, Paul Hua  wrote:
>> > Hi,
>> >
>> > There are duplicate testcase in gcc.target/mips dir.
>> >
>> > div-5.c same as div-9.c.
>> > div-6.c same as div-10.c.
>> > div-7.c same as div-11.c.
>> > div-8.c same as div-12.c.
>> >
>> > Is this deliberate?
>
> I see no evidence of this being deliberate, and it has been like this since
> the original commit.
>
>> > Otherwise, the attached patch fixing this.
>> >
>> >
>> > Paul.
>> >
>> > ***ChangeLog***
>> >
>> > 2017-06-01  Chenghua Xu
>> >
>> > Remove duplicate div-x testcase.
>
> This kind of comment doesn't tend to go in a changelog.
>
>> > * gcc.target/mips/div-9.c: Delete.
>
> You could say "Delete duplicate test" here if you want though.
>
>> > * gcc.target/mips/div-10.c: Ditto.
>> > * gcc.target/mips/div-11.c: Ditto.
>> > * gcc.target/mips/div-12.c: Ditto.
>
> Otherwise OK. I can't remember if you have write access; let me know
> if you need it committing. Thanks for finding this.
>
> Matthew


Re: [PATCH v8] add -fpatchable-function-entry=N,M option

2017-06-04 Thread Sandra Loosemore

On 05/29/2017 04:29 AM, Maxim Kuvyrkov wrote:


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 65308c9d933..6cbb77a8dc4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11382,6 +11382,32 @@ of the function name, it is considered to be a match.  For C99 and C++
  extended identifiers, the function name must be given in UTF-8, not
  using universal character names.

+@item -fpatchable-function-entry=@var{N}[,@var{M}]
+@opindex fpatchable-function-entry
+Generate @var{N} NOPs right at the beginning
+of each function, with the function entry point before the @var{M}th NOP.
+If @var{M} is omitted, it defaults to @code{0} so the
+function entry points to the address just at the first NOP.
+The NOP instructions reserve extra space which can be used to patch in
+any desired instrumentation at run time, provided that the code segment
+is writable.  The amount of space is controllable indirectly via
+the number of NOPs; the NOP instruction is by default the one created
+by @code{gen_nop ()}.


This is too implementor-speaky.  We shouldn't expect GCC users to read 
the source code to figure out what gen_nop is, or what it might do on 
any particular target.



+The NOP instructions are inserted at -- and maybe before, depending on
+@var{M} -- the function entry address, even before the prologue.


Texinfo nit: s / -- /---/g


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c4f2c893c8e..4a5317d73bb 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4566,6 +4566,10 @@ will select the smallest suitable mode.
  This section describes the macros that output function entry
  (@dfn{prologue}) and exit (@dfn{epilogue}) code.

+@deftypefn {Target Hook} void TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY (FILE *@var{file}, unsigned HOST_WIDE_INT @var{patch_area_size}, bool @var{record_p})
+Generate a patchable area at the function start
+@end deftypefn


This surely needs explanation of the parameters, and some punctuation.

-Sandra



Re: [PATCH] warn on mem calls modifying objects of non-trivial types (PR 80560)

2017-06-04 Thread Jason Merrill

On 06/02/2017 05:28 PM, Martin Sebor wrote:

On 05/31/2017 05:34 PM, Jason Merrill wrote:

On 05/27/2017 06:44 PM, Martin Sebor wrote:

+  /* True if the class is trivial and has a trivial non-deleted copy
+ assignment, copy ctor, and default ctor, respectively.  The last
+ one isn't used to issue warnings but only to decide what suitable
+ alternatives to offer as replacements for the raw memory operation.  */
+  bool trivial = trivial_type_p (desttype);


This comment seems out of date; there's only one variable here now.


+  /* True if the class is has a non-deleted trivial assignment.  Set


s/is//


+  /* True if the class has a (possibly deleted) trivial copy ctor.  */
+  bool trivcopy = trivially_copyable_p (desttype);


"True if the class is trivially copyable."


+  if (delassign)
+warnfmt = G_("%qD writing to an object of type %#qT with "
+ "deleted copy assignment");
+  else if (!trivassign)
+warnfmt = G_("%qD writing to an object of type %#qT with "
+ "no trivial copy assignment");
+  else if (!trivial)
+warnfmt = G_("%qD writing to an object of non-trivial "
+ "type %#qT; use assignment instead");


I'd still like the !trivial test to come first in all the memset cases,
!trivcopy in the copy cases.


The tests are in the order they're in to provide as much useful
detail in the diagnostics as necessary to understand the problem
and make the suggestion meaningful.  To what end do you want to
change it?


Mostly I was thinking that whether a class is trivial(ly copyable) is 
more to the point, but I guess what you have now is fine.



+static bool
+has_trivial_special_function (tree ctype, special_function_kind sfk,
+  bool *deleted_p)


This seems redundant with type_has_trivial_fn.  If that function is
giving the wrong answer for a class where all of the SFK are deleted,
let's fix that, under check_bases_and_members, rather than introduce a
new function.  I don't want to call synthesized_method_walk every time
we want to check whether a function is trivial.


A deleted special function can be trivial.


I believe that in the language, the triviality of a deleted function 
cannot be determined.  But I believe you're right about the behavior of 
type_has_trivial_fn, which is why I mentioned changing it.



Maybe I should use a different approach and instead of trying
to see if a function is deleted use trivially_xible to see if
it's usable.  That will mean changing the diagnostics from
"with a deleted special function" to "without trivial special
function" but it will avoid calling synthesized_method_walk
while still avoiding giving bogus suggestions.

Is this approach acceptable?


Yes, that makes sense.

Jason
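
The distinction discussed above (a type can be trivially copyable while one of its copy operations is deleted, so asking "is this operation usable and trivial?" gives a different answer from asking "is the type trivial?") can be sketched with the standard library traits. The traits are used here only as a library-level stand-in for GCC's internal predicates; the struct name is hypothetical.

```cpp
#include <type_traits>

// Hypothetical class: copy assignment is deleted, everything else is implicit.
struct NoAssign {
  int value;
  NoAssign &operator=(const NoAssign &) = delete;
};

// The type as a whole still counts as trivially copyable, because the
// implicit copy constructor is trivial and not deleted.
static_assert(std::is_trivially_copyable<NoAssign>::value,
              "trivially copyable despite the deleted assignment");

// Copy construction is usable and trivial.
static_assert(std::is_trivially_copy_constructible<NoAssign>::value,
              "copy constructor is trivial");

// Copy assignment is not usable at all, so the "trivially assignable"
// query is false even though no non-trivial code is involved.
static_assert(!std::is_trivially_copy_assignable<NoAssign>::value,
              "deleted assignment is not trivially usable");
```

A check based on the per-operation traits would therefore steer a diagnostic away from suggesting assignment for a class like this one, without needing to know whether the operator was deleted.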



Re: [PATCH doc] update documentation of x86 -mcx16 option

2017-06-04 Thread Sandra Loosemore

On 05/26/2017 12:48 PM, Alexander Monakov wrote:

Hi,

This patch fixes a few issues in documentation of -mcx16 x86 backend option:

- remove implementor-speak ('oword')
- mention alignment restriction and availability only in 64-bit mode
- improve usage example
 existing documentation uses a really silly example (128-bit integer
 counters), it was apparently taken from Wikipedia back in 2007; today,
 Wikipedia uses a far more realistic example ("This is useful for parallel
 algorithms that use compare and swap on data larger than the size of a
 pointer, common in lock-free and wait-free algorithms")
- mention that the instruction is used when expanding __sync builtins, and NOT
   used when expanding __atomic builtins. This is a quiet change in GCC-7, GCC-6
   and earlier used this instruction for __atomic builtins too.

OK for trunk?  For the gcc-7 branch?

Thanks.
Alexander

* doc/invoke.texi (mcx16): Rewrite.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 3308b63..0b3c296 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -25160,13 +25160,12 @@ This option instructs GCC to use 128-bit AVX instructions instead of

  @item -mcx16
  @opindex mcx16
-This option enables GCC to generate @code{CMPXCHG16B} instructions.
-@code{CMPXCHG16B} allows for atomic operations on 128-bit double quadword
-(or oword) data types.
-This is useful for high-resolution counters that can be updated
-by multiple processors (or cores).  This instruction is generated as part of
-atomic built-in functions: see @ref{__sync Builtins} or
-@ref{__atomic Builtins} for details.
+This option enables GCC to generate @code{CMPXCHG16B} instructions in 64-bit
+code to implement compare-and-exchange operations on 16-byte aligned 128-bit
+objects.  This is useful for atomic updates of data structures exceeding one
+machine word in size.  The compiler uses this instruction to implement
+@ref{__sync Builtins}.  However, for @ref{__atomic Builtins} operating on
+128-bit integers, a library call is always used.

  @item -msahf
  @opindex msahf



This is good, thanks.  I think it's fine for GCC 7 branch as well (I 
guess it's not a regression fix, but it seems unlikely to break anything).


-Sandra



Re: [PATCH] add more detail to -Wconversion and -Woverflow (PR 80731)

2017-06-04 Thread Martin Sebor

On 06/02/2017 11:11 AM, Renlin Li wrote:

Hi Martin,

I noticed the following failures after your change r248431.
FAIL: c-c++-common/Wfloat-conversion.c  -Wc++-compat   (test for warnings, line 42)
FAIL: c-c++-common/Wfloat-conversion.c  -Wc++-compat   (test for warnings, line 43)

It happens on the arm target, which is not a large_long_double target.
The patch here adds the missing target selector. After the change, those
tests won't be checked on the arm target.

Here I have a simple fix to it. Okay to commit?


r248431 wasn't meant to change when any of these warnings fire,
only their text and the amount of detail included in them.  The
removal of the { target large_long_double } selector was by
accident.

Please go ahead and commit your fix.  Thanks!



gcc/testsuite/ChangeLog:

2017-06-02 Renlin Li 

* c-c++-common/Wfloat-conversion.c: Add large_long_double target
selector to related line.



And there is another failure:
FAIL: gcc.dg/utf16-4.c  (test for warnings, line 15)

The warning message is slightly different from expected.
utf16-4.c:10:15: warning: character constant too long for its type
utf16-4.c:15:15: warning: conversion from 'long unsigned int' to
'char16_t {aka short unsigned int}' changes value from '410401' to '17185'


The test passes for me now.  The initial commit had introduced
bug 80731 but it's recently been fixed.  Can you please try
again with the latest sources?

Martin
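
The two values in the quoted utf16-4.c warning are consistent with a plain modulo-2^16 truncation, since 410401 = 6 * 65536 + 17185. A minimal sketch of that arithmetic follows; it assumes char16_t is exactly 16 bits wide (the standard only guarantees at least 16), and the helper name is hypothetical.

```cpp
#include <climits>

// Hypothetical helper: narrow an unsigned long to char16_t, as the
// conversion named in the warning does.
constexpr char16_t narrow_to_char16(unsigned long v) {
  return static_cast<char16_t>(v);
}

static_assert(sizeof(char16_t) * CHAR_BIT == 16,
              "sketch assumes a 16-bit char16_t");
static_assert(narrow_to_char16(410401UL) == 17185,
              "410401 mod 65536 == 17185");
```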


Re: [PATCH] have -Wformat-overflow handle -fexec-charset (PR 80503)

2017-06-04 Thread Martin Sebor

On 06/02/2017 09:38 AM, Renlin Li wrote:

Hi Martin,

After r247444, I saw the following two regressions in
arm-linux-gnueabihf environment:

FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-18.c  (test for warnings, line 119)
PASS: gcc.dg/tree-ssa/builtin-sprintf-warn-18.c  (test for warnings, line 121)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-18.c  (test for warnings, line 121)

The warning messages related to those two lines are:
testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-18.c:119:3: warning:
'%9223372036854775808i' directive width out of range [-Wformat-overflow=]

testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-18.c:121:3: warning:
'%.9223372036854775808i' directive precision out of range
[-Wformat-overflow=]

testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-18.c:121:3: warning:
'%.9223372036854775808i' directive precision out of range
[-Wformat-overflow=]

Did you notice similar things from your test environment, Christophe?


Looks like you're missing a couple of warnings.  I see the following
output with both my arm-linux-gnueabihf cross compiler and my native
x86_64 GCC, both in 32-bit and 64-bit modes, as expected by the test,
so I don't see the same issue in my environment.

/ssd/src/gcc/git/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-18.c:119:3: warning: ‘%9223372036854775808i’ directive width out of range [-Wformat-overflow=]
   T ("%9223372036854775808i", 0);/* { dg-warning "width out of range" } */
   ^
/ssd/src/gcc/git/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-18.c:119:3: warning: ‘%9223372036854775808i’ directive output of 9223372036854775807 bytes causes result to exceed ‘INT_MAX’ [-Wformat-overflow=]
/ssd/src/gcc/git/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-18.c:121:3: warning: ‘%.9223372036854775808i’ directive precision out of range [-Wformat-overflow=]
   T ("%.9223372036854775808i", 0);   /* { dg-warning "precision out of range" } */
   ^
/ssd/src/gcc/git/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-18.c:121:3: warning: ‘%.9223372036854775808i’ directive output of 9223372036854775807 bytes causes result to exceed ‘INT_MAX’ [-Wformat-overflow=]


Martin


RE: [testsuite]MIPS remove duplicate div-x test

2017-06-04 Thread Matthew Fortune
Hi Paul,

Paul Hua  writes:
> cc: Matthew.
> 
> ping.

Sorry a little slow on the reply.
 
> On Thu, Jun 1, 2017 at 3:35 PM, Paul Hua  wrote:
> > Hi,
> >
> > There are duplicate testcase in gcc.target/mips dir.
> >
> > div-5.c same as div-9.c.
> > div-6.c same as div-10.c.
> > div-7.c same as div-11.c.
> > div-8.c same as div-12.c.
> >
> > Is this deliberate?

I see no evidence of this being deliberate, and it has been like this since
the original commit.

> > Otherwise, the attached patch fixing this.
> >
> >
> > Paul.
> >
> > ***ChangeLog***
> >
> > 2017-06-01  Chenghua Xu
> >
> > Remove duplicate div-x testcase.

This kind of comment doesn't tend to go in a changelog.

> > * gcc.target/mips/div-9.c: Delete.

You could say "Delete duplicate test" here if you want though.

> > * gcc.target/mips/div-10.c: Ditto.
> > * gcc.target/mips/div-11.c: Ditto.
> > * gcc.target/mips/div-12.c: Ditto.

Otherwise OK. I can't remember if you have write access; let me know
if you need it committing. Thanks for finding this.

Matthew


Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.

2017-06-04 Thread David Miller
From: Eric Botcazou 
Date: Sun, 04 Jun 2017 10:32:47 +0200

>> This is an attempt to fix PR target/80968.  This bug has existed
>> basically forever.
>> 
>> The stack_tie sequence seems to be how other targets deal with this
>> issue.  I only emit this when alloca is used.  If there are other
>> conditions that potentially would necessitate such a barrier, just let
>> me know.
> 
> See my comment in the audit trail about the stack tie approach, let's just 
> emit a frame_blockage instead.

That seems to work as well, following is going through a testsuite
run right now:


[PATCH] sparc: Fix stack references in return delay slot.

gcc/

PR target/80968
* config/sparc/sparc.c (sparc_expand_prologue): Emit frame
blockage if function uses alloca.

gcc/testsuite/

* gcc.target/sparc/sparc-ret-3.c: New test.
---
 gcc/config/sparc/sparc.c |  3 ++
 gcc/testsuite/gcc.target/sparc/sparc-ret-3.c | 53 
 2 files changed, 56 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/sparc/sparc-ret-3.c

diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 6dfb269..95a64a4 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -5792,6 +5792,9 @@ sparc_expand_epilogue (bool for_eh)
 {
   HOST_WIDE_INT size = sparc_frame_size;
 
+  if (cfun->calls_alloca)
+emit_insn (gen_frame_blockage ());
+
   if (sparc_n_global_fp_regs > 0)
 emit_save_or_restore_global_fp_regs (sparc_frame_base_reg,
 sparc_frame_base_offset
diff --git a/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c 
b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c
new file mode 100644
index 000..7a151f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/sparc-ret-3.c
@@ -0,0 +1,53 @@
+/* PR target/80968 */
+/* { dg-do compile } */
+/* { dg-skip-if "no register windows" { *-*-* } { "-mflat" } { "" } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-mcpu=ultrasparc -O" } */
+
+/* Make sure references to the stack frame do not slip into the delay slot
+   of a return instruction.  */
+
+struct crypto_shash {
+   unsigned int descsize;
+};
+struct crypto_shash *tfm;
+
+struct shash_desc {
+   struct crypto_shash *tfm;
+   unsigned int flags;
+
+   void *__ctx[] __attribute__((aligned(8)));
+};
+
+static inline unsigned int crypto_shash_descsize(struct crypto_shash *tfm)
+{
+   return tfm->descsize;
+}
+
+static inline void *shash_desc_ctx(struct shash_desc *desc)
+{
+   return desc->__ctx;
+}
+
+#define SHASH_DESC_ON_STACK(shash, ctx)  \
+   char __##shash##_desc[sizeof(struct shash_desc) + \
+ crypto_shash_descsize(ctx)] __attribute__((aligned(8))); \
+   struct shash_desc *shash = (struct shash_desc *)__##shash##_desc
+
+extern int crypto_shash_update(struct shash_desc *, const void *, unsigned int);
+
+unsigned int bug(unsigned int crc, const void *address, unsigned int length)
+{
+   SHASH_DESC_ON_STACK(shash, tfm);
+   unsigned int *ctx = (unsigned int *)shash_desc_ctx(shash);
+   int err;
+
+   shash->tfm = tfm;
+   shash->flags = 0;
+   *ctx = crc;
+
+   err = crypto_shash_update(shash, address, length);
+
+   return *ctx;
+}
+/* { dg-final { scan-assembler "ld\[ \t\]*\\\[%i5\\+8\\\], %i0\n\[^\n\]*return\[ \t\]*%i7\\+8" } } */
-- 
2.1.2.532.g19b5d50



Containers default initialization

2017-06-04 Thread François Dumont

Hi

    I have eventually adapted the test to all containers, and the result
is successful for map/set/unordered_map/unordered_set. It fails for
deque/list/forward_list/vector/vector.


    I even tried to change the test to look at the difference between an
explicit call to the default constructor done through the placement new
call and an implicit call done on a normal declaration. I wondered if we
would have the same kind of difference we have between a int i; and a
int i(); I tried to set the stack to ~0 before declaring the instance. I
know there is no guarantee on the content of the stack for the following
declaration, but do you think it is reliable enough to commit it?


Ok to commit the successful tests ?

    Frankly, I don't understand the result of those tests. I would have
expected map/set to fail and the others to succeed. We might need help from
the compiler guys, no?


François
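
The language rule behind the "int i; versus value-initialized int" comparison can be sketched independently of the containers. The example below is only an illustration of value-initialization through placement new; the Probe type and the function name are hypothetical, and nothing here models how any libstdc++ container stores its allocator. Note that a value-initialized int is written `int i = int();` or `int i{};`, since `int i();` declares a function.

```cpp
#include <cstring>
#include <new>

// Hypothetical probe type with no user-provided constructor.
struct Probe {
  int state;
};

// Fill raw storage with 0xFF bytes, then value-initialize a Probe in it.
// Value-initialization of such a class zero-initializes the object first,
// so the earlier buffer content does not leak into `state`.
int value_init_after_fill() {
  alignas(Probe) unsigned char buf[sizeof(Probe)];
  std::memset(buf, 0xFF, sizeof buf);
  Probe *p = ::new (static_cast<void *>(buf)) Probe();  // note the ()
  int s = p->state;
  p->~Probe();
  return s;
}
```

Writing `::new (buf) Probe` without the parentheses would default-initialize instead and leave `state` indeterminate; reading it would then be undefined behaviour, so the sketch deliberately does not do that. The same caveat applies to any test that inspects storage before an object has been constructed in it.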



Index: testsuite/23_containers/map/allocator/default_init.cc
===
--- testsuite/23_containers/map/allocator/default_init.cc	(nonexistent)
+++ testsuite/23_containers/map/allocator/default_init.cc	(working copy)
@@ -0,0 +1,57 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run { target c++11 } }
+
+#include 
+#include 
+#include 
+
+#include 
+
+using T = int;
+
+using __gnu_test::default_init_allocator;
+
+void test01()
+{
+  typedef default_init_allocator alloc_type;
+  typedef std::map test_type;
+
+  {
+__gnu_cxx::__aligned_buffer buf;
+__builtin_memset(buf._M_addr(), ~0, sizeof(test_type));
+
+VERIFY( buf._M_ptr()->get_allocator().state != 0 );
+  
+test_type *tmp = ::new(buf._M_addr()) test_type();
+
+VERIFY( tmp->get_allocator().state == 0 );
+
+tmp->~test_type();
+__builtin_memset(buf._M_addr(), ~0, sizeof(test_type));
+  }
+
+  test_type tmp;
+  VERIFY( tmp.get_allocator().state == 0 );
+}
+
+int main()
+{
+  test01();
+  return 0;
+}
Index: testsuite/23_containers/set/allocator/default_init.cc
===
--- testsuite/23_containers/set/allocator/default_init.cc	(nonexistent)
+++ testsuite/23_containers/set/allocator/default_init.cc	(working copy)
@@ -0,0 +1,57 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run { target c++11 } }
+
+#include 
+#include 
+#include 
+
+#include 
+
+using T = int;
+
+using __gnu_test::default_init_allocator;
+
+void test01()
+{
+  typedef default_init_allocator alloc_type;
+  typedef std::set test_type;
+
+  {
+__gnu_cxx::__aligned_buffer buf;
+__builtin_memset(buf._M_addr(), ~0, sizeof(test_type));
+
+VERIFY( buf._M_ptr()->get_allocator().state != 0 );
+  
+test_type *tmp = ::new(buf._M_addr()) test_type();
+
+VERIFY( tmp->get_allocator().state == 0 );
+
+tmp->~test_type();
+__builtin_memset(buf._M_addr(), ~0, sizeof(test_type));
+  }
+
+  test_type tmp;
+  VERIFY( tmp.get_allocator().state == 0 );
+}
+
+int main()
+{
+  test01();
+  return 0;
+}
Index: testsuite/23_containers/unordered_map/allocator/default_init.cc
===
--- testsuite/23_containers/unordered_map/allocator/default_init.cc	(nonexistent)
+++ testsuite/23_containers/unordered_map/allocator/default_init.cc	(working copy)
@@ -0,0 +1,58 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This 

Re: [PATCH, gfortran] Cleanup the submodule tests

2017-06-04 Thread Segher Boessenkool
On Sun, Jun 04, 2017 at 02:52:36PM +0200, Dominique d'Humières wrote:
> >> -   regexp "(\[0-9\]+)\[ \t\]+(?:(?:#)?\[ \t\]*include\[ 
> >> \t\]+)\[\"\](\[^\"\]*)\[\"\]" $i dummy lineno include_file
> >> +   regexp -nocase 
> >> "(\[0-9\]+)\\s+(?:(?:#)?\\s*include\\s+)\[\"\'\](\[^\"\'\]*)\[\"\'\]" $i 
> >> dummy lineno include_file
> > 
> > My regex sorcery may be a bit rusty, but why does \\s need a double
> > escape while \t appears with single escape?
> 
> I have used the Segher’s tips.

Nah, that wasn't a tip.  A tip is: don't use double quotes unless you
actually want double quotes.  You can write this as:

  regexp -nocase {([0-9]+)\s+(?:(?:#)?\s*include\s+)["']([^"']*)["']} $i dummy lineno include_file

or even

  regexp -nocase {([0-9]+)\s+#?\s*include\s+["']([^"']*)["']} $i dummy lineno include_file

(where I removed the useless (?:) groups as well).

You want double quotes if you want command substitution ("bla[stuff]bla"),
variable substitution ("$something"), or backslash substitution (like "\t"
becomes a tab char).  In a regexp you do not usually have any use for this,
and it hurts so much.


Segher
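
The same "avoid the quoting layer that forces double escapes" idea has a C++ analogue in raw string literals. The sketch below ports the shorter pattern from the message to std::regex; it is an illustration only, Tcl's regular expressions and ECMAScript's are not identical in general, and the struct, function name and sample input format (a line number followed by an include directive) are assumptions.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for the sketch.
struct IncludeMatch {
  bool matched;
  std::string lineno;
  std::string file;
};

// Searches `line` for "<number> [#]include '<file>'" (either quote style),
// case-insensitively, like `regexp -nocase` in the message above.
IncludeMatch parse_include(const std::string &line) {
  // Raw string literal: backslashes reach the regex engine unchanged,
  // so \s does not have to be written as \\s.
  static const std::regex re(
      R"re(([0-9]+)\s+#?\s*include\s+["']([^"']*)["'])re",
      std::regex::ECMAScript | std::regex::icase);
  std::smatch m;
  if (std::regex_search(line, m, re))
    return IncludeMatch{true, m[1].str(), m[2].str()};
  return IncludeMatch{false, "", ""};
}
```

For an input such as `12 include 'foo.inc'` this yields line number `12` and file `foo.inc`.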


Re: Reorgnanization of profile count maintenance code, part 1

2017-06-04 Thread Jan Hubicka
Hello,
thanks for proofreading!
> On Thu, Jun 01, 2017 at 01:35:56PM +0200, Jan Hubicka wrote:
> 
> Just some very minor nits.
> 
> > Index: final.c
> > ===
> > --- final.c (revision 248684)
> > +++ final.c (working copy)
> > @@ -1951,9 +1951,11 @@ dump_basic_block_info (FILE *file, rtx_i
> >fprintf (file, "%s BLOCK %d", ASM_COMMENT_START, bb->index);
> >if (bb->frequency)
> >  fprintf (file, " freq:%d", bb->frequency);
> > -  if (bb->count)
> > -fprintf (file, " count:%" PRId64,
> > - bb->count);
> > +  if (bb->count.initialized_p ())
> > +   {
> > +  fprintf (file, " count");
> 
> Missing colon.
> s/count"/count:"/
Fixed!
> 
> I think i saw a_count = a_count + something above and assumed you didn't
> have a += operator. Could thus use the terse form in the snipped code
> above on the patch, maybe?

I did the conversion without those operators first because all those places
are subject to updating WRT uninitialized values.  I will re-add the
in-place operators incrementally.
> >  static bool
> >  check_counter (gimple *stmt, const char * name,
> > -  gcov_type *count, gcov_type *all, gcov_type bb_count)
> > +  gcov_type *count, gcov_type *all, profile_count bb_count_d)
> >  {
> > +  gcov_type bb_count = bb_count_d.to_gcov_type ();
> > +  return true;
> 
> On purpose?

No, it was a hack. I have dropped it now.
I will commit the patch after a bit of additional testing.  Note that it breaks
gcc.dg/tree-prof/section-attr-2.c because it uncovers a bug in loop-im, which
forgets to update the profile (thus we end up with uninitialized counts rather than
0 now and prevent hot/cold splitting).  I will submit fixes incrementally,
one by one, rather than mixing them with the actual reorg.

Honza


RFC: [PATCH] Add warn_if_not_aligned attribute

2017-06-04 Thread H.J. Lu
__attribute__((warn_if_not_aligned(N))) issues a warning if the field
in a struct or union is not aligned to N:

typedef unsigned long long __u64
  __attribute__((aligned(4),warn_if_not_aligned(8)));

struct foo
{
  int i1;
  int i2;
  __u64 x;
};

__u64 is aligned to 4 bytes.  But inside struct foo, __u64 should be
aligned at 8 bytes.  It is used to define struct foo in such a way that
struct foo has the same layout and x has the same alignment when __u64
is aligned at either 4 or 8 bytes.

Since struct foo is normally aligned to 4 bytes, a warning will be issued:

warning: alignment 4 of ‘struct foo’ is less than 8

Aligning struct foo to 8 bytes:

struct foo
{
  int i1;
  int i2;
  __u64 x;
} __attribute__((aligned(8)));

silences the warning.  It also warns about a field with a misaligned offset:

struct foo
{
  int i1;
  int i2;
  int i3;
  __u64 x;
} __attribute__((aligned(8)));

warning: ‘x’ offset 12 in ‘struct foo’ isn't aligned to 8

The same warning is also issued without the warn_if_not_aligned attribute
for the field with explicitly specified alignment in a packed struct or
union:

struct __attribute__ ((aligned (8))) S8 { char a[8]; };
struct __attribute__ ((packed)) S {
  struct S8 s8;
};

warning: alignment 1 of ‘struct S’ is less than 8
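
The layout facts behind the first two warnings can be checked without the new attribute. The sketch below assumes a GCC-compatible compiler (for the existing `aligned` attribute on a typedef) and a 4-byte int; the type names are hypothetical stand-ins for the `__u64` and `struct foo` of the description, and the proposed warn_if_not_aligned attribute itself is deliberately not used.

```cpp
#include <cstddef>

// Hypothetical stand-in for the 4-byte-aligned __u64 typedef.
typedef unsigned long long u64_a4 __attribute__((aligned(4)));

struct foo_two {  // two ints before x
  int i1;
  int i2;
  u64_a4 x;
};

struct foo_three {  // three ints before x
  int i1;
  int i2;
  int i3;
  u64_a4 x;
};

struct foo_two_a8 {  // same as foo_two, but the struct is 8-byte aligned
  int i1;
  int i2;
  u64_a4 x;
} __attribute__((aligned(8)));

// "alignment 4 of 'struct foo' is less than 8"
static_assert(alignof(foo_two) == 4, "struct is only 4-byte aligned");
static_assert(offsetof(foo_two, x) == 8, "x itself sits at a multiple of 8");

// "'x' offset 12 in 'struct foo' isn't aligned to 8"
static_assert(offsetof(foo_three, x) == 12, "x sits at offset 12");
static_assert(offsetof(foo_three, x) % 8 != 0, "12 is not a multiple of 8");

// Raising the struct alignment to 8 addresses the first case.
static_assert(alignof(foo_two_a8) == 8, "struct is now 8-byte aligned");
```

Both conditions matter: x is 8-byte aligned in memory only if the struct is 8-byte aligned and the offset of x within it is a multiple of 8.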

gcc/

PR c/53037
* c-decl.c (finish_enum): Also copy TYPE_WARN_IF_NOT_ALIGN.
* print-tree.c (print_node): Support TYPE_WARN_IF_NOT_ALIGN.
* stor-layout.c (handle_warn_if_not_align): New.
(place_union_field): Call handle_warn_if_not_align.
(place_field): Call handle_warn_if_not_align.  Copy
TYPE_WARN_IF_NOT_ALIGN.
(finish_builtin_struct): Copy TYPE_WARN_IF_NOT_ALIGN.
(layout_type): Likewise.
* tree.c (build_range_type_1): Likewise.
* tree-core.h (tree_type_common): Add warn_if_not_align.  Set
spare to 18.
* tree.h (TYPE_WARN_IF_NOT_ALIGN): New.
(SET_TYPE_WARN_IF_NOT_ALIGN): Likewise.
* c-family/c-attribs.c (handle_warn_if_not_aligned_attribute): New.
(c_common_attribute_table): Add warn_if_not_aligned.
(handle_aligned_attribute): Renamed to ...
(common_handle_aligned_attribute): Remove argument, name, and add
argument, warn_if_not_aligned.  Handle warn_if_not_aligned.
(handle_aligned_attribute): New.
* c/c-decl.c (finish_enum): Copy TYPE_WARN_IF_NOT_ALIGN.

gcc/testsuite/

PR c/53037
* gcc.dg/pr53037-1.c: New test.
---
 gcc/c-family/c-attribs.c | 55 +
 gcc/c/c-decl.c   |  2 ++
 gcc/print-tree.c |  6 ++--
 gcc/stor-layout.c| 45 ++
 gcc/testsuite/gcc.dg/pr53037-1.c | 59 
 gcc/tree-core.h  |  3 +-
 gcc/tree.c   |  1 +
 gcc/tree.h   | 10 +++
 8 files changed, 172 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr53037-1.c

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 695c58c..a76f9f7 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -85,6 +85,8 @@ static tree handle_destructor_attribute (tree *, tree, tree, int, bool *);
 static tree handle_mode_attribute (tree *, tree, tree, int, bool *);
 static tree handle_section_attribute (tree *, tree, tree, int, bool *);
 static tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
+static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree,
+ int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
@@ -208,6 +210,9 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_section_attribute, false },
   { "aligned",0, 1, false, false, false,
  handle_aligned_attribute, false },
+  { "warn_if_not_aligned",0, 1, false, false, false,
+ handle_warn_if_not_aligned_attribute,
+ false },
   { "weak",   0, 0, true,  false, false,
  handle_weak_attribute, false },
   { "noplt",   0, 0, true,  false, false,
@@ -1558,12 +1563,13 @@ check_cxx_fundamental_alignment_constraints (tree node,
   return !alignment_too_large_p;
 }
 
-/* Handle a "aligned" attribute; arguments as in
-   struct attribute_spec.handler.  */
+/* Common codes shared by handle_warn_if_not_aligned_attribute and
+   handle_aligned_attribute.  */
 
 static tree
-handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
- int flags, bool *no_add_attrs)
+common_handle_aligned_attribute (tree *node, tree args, int flags,
+ 

Re: [PATCH] handle bzero/bcopy in DSE and aliasing (PR 80933, 80934)

2017-06-04 Thread Bernhard Reutner-Fischer
On 2 June 2017 13:12:41 CEST, Richard Biener  wrote:

>Note I'd be _much_ more sympathetic to simply canonicalizing all of
>bzero and bcopy
>to memset / memmove and be done with all the above complexity.

Indeed, and even more so since SUSv3 marked them LEGACY and both were removed
in SUSv4.
thanks,


Fwd: [PATCH, gfortran] Cleanup the submodule tests

2017-06-04 Thread Dominique d'Humières


> Begin forwarded message:
> 
> From: Dominique d'Humières 
> Subject: Re: [PATCH, gfortran] Cleanup the submodule tests
> Date: 4 June 2017 at 14:52:36 UTC+2
> To: Janus Weil 
> Cc: Paul Richard Thomas , 
> seg...@kernel.crashing.org, gfortran , gcc-patches 
> 
> 
>> 
>> On 16 Apr 2017 at 22:13, Janus Weil  wrote:
>> 
>> Hi Dominique,
>> 
>>> I am currently testing the following patch that handles both modules and 
>>> submodules. It is a little bit clumsy and may not handle all the possible 
>>> syntax variants. Any comments welcome!-) Testing in progress.
>> 
>> so, how did the testing go?
> 
> It took more time than expected!-( Everything should be fixed now, except for 
> some cleanup-modules left as explained below.
> 
>>> proc list-module-names-1 { file } {
>>> set result {}
>>> -set tmp [grep $file "^\[ \t\]*((#)?\[ 
>>> \t\]*include|\[mM\]\[oO\]\[dD\]\[uU\]\[lL\]\[eE\](?!\[ 
>>> \t\]+\[pP\]\[rR\]\[oO\]\[cC\]\[eE\]\[dD\]\[uU\]\[rR\]\[eE\]\[ \t\]+))\[ 
>>> \t\]+.*" line]
>>> +if {[file isdirectory $file]} {return}
>>> +set tmp [igrep $file 
>>> "^\\s*((#)?\\s*include|(sub)?module(?!\\s+(recursive\\s+)?(procedure|subroutine|function)\\s*))\\s*.*"
>>>  line]
>> 
>> This is supposed to catch all lines including "module", but not
>> "module procedure", "module subroutine" etc, right? Why do you need
>> "recursive" here, but no other attributes like "pure" or "elemental"?
> 
> I need "recursive" because it is present in submodule_(4|14|16).f08. I have 
> added "PURE|(IMPURE\s+)?ELEMENTAL", but this is untested.
> 
>>> if {![string match "" $tmp]} {
>>>foreach i $tmp {
>>> -   regexp "(\[0-9\]+)\[ \t\]+(?:(?:#)?\[ \t\]*include\[ 
>>> \t\]+)\[\"\](\[^\"\]*)\[\"\]" $i dummy lineno include_file
>>> +   regexp -nocase 
>>> "(\[0-9\]+)\\s+(?:(?:#)?\\s*include\\s+)\[\"\'\](\[^\"\'\]*)\[\"\'\]" $i 
>>> dummy lineno include_file
>> 
>> My regex sorcery may be a bit rusty, but why does \\s need a double
>> escape while \t appears with single escape?
> 
> I have used Segher's tips.
> 
>>> @@ -99,7 +100,11 @@ proc list-module-names-1 { file } {
>>>}
>>>continue
>>>}
>>> -   regexp "(\[0-9\]+)\[ 
>>> \t\]+(?:(\[mM\]\[oO\]\[dD\]\[uU\]\[lL\]\[eE\]\[ 
>>> \t\]+(?!\[pP\]\[rR\]\[oO\]\[cC\]\[eE\]\[dD\]\[uU\]\[rR\]\[eE\]\[ 
>>> \t\]+)))(\[^ \t;\]*)" $i i lineno keyword mod
>>> +   regexp -nocase "(\[0-9\]+)\\s+(module|submodule)\\s*(\[^;\]*)" 
>>> $i i lineno keyword mod
>>> +   regsub "\\s*!.*" $mod "" mod
>>> +   regsub ":\[^)\]*" $mod "" mod
>>> +   regsub "\\(\\s*" $mod "" mod
>>> +   regsub "\\s*\\)\\s*" $mod "@" mod
>>>if {![info exists lineno]} {
>>>continue
>>>}
>> 
>> Not sure I understand this part. Guess I'm spending too little time
>> with regexps to find them readable :(
> 
> I have added a few comments.
> 
>> Cheers,
>> Janus
> 
> I had to leave the following files
> 
> gcc/testsuite/gfortran.dg/binding_label_tests_10_main.f03:! { dg-final { 
> cleanup-modules "binding_label_tests_10" } }
> gcc/testsuite/gfortran.dg/binding_label_tests_11_main.f03:! { dg-final { 
> cleanup-modules "binding_label_tests_11" } }
> gcc/testsuite/gfortran.dg/binding_label_tests_13_main.f03:! { dg-final { 
> cleanup-modules "binding_label_tests_13" } }
> gcc/testsuite/gfortran.dg/binding_label_tests_26b.f90:! { dg-final { 
> cleanup-modules "fg f" } }
> gcc/testsuite/gfortran.dg/class_45b.f03:! { dg-final { cleanup-modules 
> "G_Nodes" } }
> gcc/testsuite/gfortran.dg/coarray_29_2.f90:! { dg-final { cleanup-modules 
> "co_sum_module" } }
> gcc/testsuite/gfortran.dg/coarray_35a.f90:! { dg-final { cleanup-modules 
> "global_coarrays" } }
> gcc/testsuite/gfortran.dg/namelist_83.f90:! { dg-final { cleanup-modules 
> "gfcbug126" } }
> gcc/testsuite/gfortran.dg/pr37287-1.f90:! { dg-final { cleanup-modules 
> "pr37287_2" } }
> gcc/testsuite/gfortran.dg/test_common_binding_labels_2_main.f03:! { dg-final 
> { cleanup-modules "test_common_binding_labels_2" } }
> gcc/testsuite/gfortran.dg/test_common_binding_labels_3_main.f03:! { dg-final 
> { cleanup-modules "test_common_binding_labels_3" } }
> gcc/testsuite/gfortran.dg/whole_file_29.f90:! { dg-final { cleanup-modules 
> "iso_red" } }
> gcc/testsuite/gfortran.dg/whole_file_31.f90:! { dg-final { cleanup-modules 
> "system_defs_m" } }
> 
> with cleanup-modules because they are either called with 
> dg-compile-aux-modules or with dg-additional-sources. I am planning to fix 
> that once these patches have been accepted.
> 
> Cheers,
> 
> Dominique
> 


patch-cl
Description: Binary data

patch-exp
Description: Binary data

patch-tsts
Description: Binary data



Re: [PATCH][SPARC] PR target/80968 Prevent stack loads in return delay slot.

2017-06-04 Thread Eric Botcazou
> This is an attempt to fix PR target/80968.  This bug has existed
> basically forever.
> 
> The stack_tie sequence seems to be how other targets deal with this
> issue.  I only emit this when alloca is used.  If there are other
> conditions that potentially would necessitate such a barrier, just let
> me know.

See my comment in the audit trail about the stack tie approach; let's just 
emit a frame_blockage instead.

-- 
Eric Botcazou