Re: [PATCH GCC]Remove support for -funsafe-loop-optimizations

2016-07-19 Thread Richard Biener
On Mon, Jul 18, 2016 at 5:36 PM, Bin.Cheng  wrote:
> On Mon, Jul 18, 2016 at 4:28 PM, NightStrike  wrote:
>> On Mon, Jul 18, 2016 at 3:55 AM, Bin.Cheng  wrote:
>>> On Sat, Jul 16, 2016 at 6:28 PM, NightStrike  wrote:
>>>> On Fri, Jul 15, 2016 at 1:07 PM, Bin Cheng  wrote:
>>>>> Hi,
>>>>> This patch removes support for -funsafe-loop-optimizations, as well as
>>>>> -Wunsafe-loop-optimizations.  By its name, this option does unsafe
>>>>> optimizations by assuming all loops must terminate and don't wrap.
>>>>> Unfortunately, it's not as useful as expected because:
>>>>> 1) Simply assuming loop must terminate isn't enough.  What we really want
>>>>> is to analyze scalar evolution and loop niter bound under such
>>>>> assumptions.  This option does nothing in this aspect.
>>>>> 2) IIRC, this option generates bogus code for some common programs,
>>>>> that's why it's disabled by default even at Ofast level.
>>>>>
>>>>> After I sent patches handling possible infinite loops in both
>>>>> (scev/niter) analyzer and vectorizer, it's a natural step to remove such
>>>>> options in GCC.  This patch does so by deleting code for
>>>>> -funsafe-loop-optimizations, as well as -Wunsafe-loop-optimizations.  It
>>>>> also deletes the two now useless tests, while the option interface is
>>>>> preserved for backward compatibility.
>>>>
>>>> There are a number of bugs opened against those options, including one
>>>> that I just opened rather recently:
>>>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71769
>>>>
>>>> but some go back far, in this case 9 years:
>>>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34114
>>>>
>>>> If you are going to remove the options, you should address open bugs
>>>> related to those options.
>>> Hi,
>>> Thanks for pointing me to these PRs, I will have a look at them.
>>
>> I only highlighted two PRs, I was suggesting that you look for all of them.
>>
>>> IMHO, the old one reports weakness in loop niter analyzer, the issue
>>> exists whether I remove unsafe-loop-optimization or not.  The new one
>>> is a little bit trickier, I will put some comments on PR, and again,
>>> the issue (if it is) is in niter analyzer which has nothing to do with
>>> the option really.
>>
>> Well, one thing to note is that the warning is an easy way to get a
>> notice of a possible missed optimization (and I have many more
>> occurrences of it in a particular code base that I use).  If the
>> warning is highlighting potential issues that aren't due to the -f
>> option but are issues nonetheless, and we remove the warning, then how
>> should I go about finding these missed opportunities in the future?
>> Is there a different mechanism that does the same thing?
> Hmm, good point, I will iterate the patch to see if I can only remove
> -funsafe-loop-optimizations, while keeping -Wunsafe-loop-optimizations.

Of course the naming of -Wunsafe-loop-optimizations is misleading then.
Maybe provide an alias -Wmissed-loop-optimizations and re-word it to
say "disable _some_ loop optimizations" as I hope more loop optimizations
become aware of "assumptions" and deal with them.

Richard.

> Thanks,
> bin
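
A minimal sketch of the kind of loop the niter discussion above is about, assuming an unsigned induction variable and a `<=` exit test. The helper name `niter_le_loop` is hypothetical and is not GCC code; it only shows that the iteration count is computable exactly when the "loop terminates without wrapping" assumption holds.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not part of GCC).  Models the loop
//   for (unsigned i = lo; i <= hi; i++) body;
// Returns true and stores the iteration count in *niter when the loop
// provably terminates.  Returns false when hi == UINT_MAX: in that case
// "i <= hi" is always true and i wraps around instead of exiting.
bool
niter_le_loop (unsigned lo, unsigned hi, unsigned *niter)
{
  if (hi == UINT_MAX)
    return false;                     // count is only valid under an assumption
  *niter = lo <= hi ? hi - lo + 1 : 0; // cannot overflow since hi < UINT_MAX
  return true;
}
```

An analysis that records "hi != UINT_MAX" as an assumption can still use the count `hi - lo + 1`; a blanket "all loops terminate" switch gives no such record.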


Re: [PATCH GCC]Remove support for -funsafe-loop-optimizations

2016-07-19 Thread Richard Biener
On Tue, Jul 19, 2016 at 10:00 AM, Richard Biener
 wrote:
> On Mon, Jul 18, 2016 at 5:36 PM, Bin.Cheng  wrote:
>> On Mon, Jul 18, 2016 at 4:28 PM, NightStrike  wrote:
>>> On Mon, Jul 18, 2016 at 3:55 AM, Bin.Cheng  wrote:
>>>> On Sat, Jul 16, 2016 at 6:28 PM, NightStrike  wrote:
>>>>> On Fri, Jul 15, 2016 at 1:07 PM, Bin Cheng  wrote:
>>>>>> Hi,
>>>>>> This patch removes support for -funsafe-loop-optimizations, as well as
>>>>>> -Wunsafe-loop-optimizations.  By its name, this option does unsafe
>>>>>> optimizations by assuming all loops must terminate and don't wrap.
>>>>>> Unfortunately, it's not as useful as expected because:
>>>>>> 1) Simply assuming loop must terminate isn't enough.  What we really
>>>>>> want is to analyze scalar evolution and loop niter bound under such
>>>>>> assumptions.  This option does nothing in this aspect.
>>>>>> 2) IIRC, this option generates bogus code for some common programs,
>>>>>> that's why it's disabled by default even at Ofast level.
>>>>>>
>>>>>> After I sent patches handling possible infinite loops in both
>>>>>> (scev/niter) analyzer and vectorizer, it's a natural step to remove such
>>>>>> options in GCC.  This patch does so by deleting code for
>>>>>> -funsafe-loop-optimizations, as well as -Wunsafe-loop-optimizations.  It
>>>>>> also deletes the two now useless tests, while the option interface is
>>>>>> preserved for backward compatibility.
>>>>>
>>>>> There are a number of bugs opened against those options, including one
>>>>> that I just opened rather recently:
>>>>>
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71769
>>>>>
>>>>> but some go back far, in this case 9 years:
>>>>>
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34114
>>>>>
>>>>> If you are going to remove the options, you should address open bugs
>>>>> related to those options.
>>>> Hi,
>>>> Thanks for pointing me to these PRs, I will have a look at them.
>>>
>>> I only highlighted two PRs, I was suggesting that you look for all of them.
>>>
>>>> IMHO, the old one reports weakness in loop niter analyzer, the issue
>>>> exists whether I remove unsafe-loop-optimization or not.  The new one
>>>> is a little bit trickier, I will put some comments on PR, and again,
>>>> the issue (if it is) is in niter analyzer which has nothing to do with
>>>> the option really.
>>>
>>> Well, one thing to note is that the warning is an easy way to get a
>>> notice of a possible missed optimization (and I have many more
>>> occurrences of it in a particular code base that I use).  If the
>>> warning is highlighting potential issues that aren't due to the -f
>>> option but are issues nonetheless, and we remove the warning, then how
>>> should I go about finding these missed opportunities in the future?
>>> Is there a different mechanism that does the same thing?
>> Hmm, good point, I will iterate the patch to see if I can only remove
>> -funsafe-loop-optimizations, while keeping -Wunsafe-loop-optimizations.
>
> Of course the naming of -Wunsafe-loop-optimizations is misleading then.
> Maybe provide an alias -Wmissed-loop-optimizations and re-word it to
> say "disable _some_ loop optimizations" as I hope more loop optimizations
> become aware of "assumptions" and deal with them.

In which case a way to "re-introduce" -funsafe-loop-optimizations would be to
add a #pragma that can be used to annotate loops to tell GCC of various
properties like that it terminates without IV wrapping.

Richard.

> Richard.
>
>> Thanks,
>> bin
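
As a point of comparison for the per-loop annotation idea above: GCC already accepts one loop-specific pragma, `#pragma GCC ivdep`, which asserts that a loop has no loop-carried dependencies that would prevent vectorization. The sketch below only shows what such an annotation looks like at the source level. It does not express "terminates without IV wrapping"; a pragma for that property is the proposal in this thread, not an existing feature. The function name `scale_by_two` is made up for the example.

```cpp
#include <cassert>

// Existing GCC loop annotation, shown as an analogue only.  The pragma
// applies to the loop that immediately follows it.  Compilers that do not
// know the pragma ignore it, so the function behaves the same either way.
void
scale_by_two (int *dst, const int *src, int n)
{
#pragma GCC ivdep
  for (int i = 0; i < n; i++)
    dst[i] = src[i] * 2;
}
```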


Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Richard Biener
On Mon, Jul 18, 2016 at 6:07 PM, Bernd Schmidt  wrote:
> The motivating example for this patch was a change that was submitted for
> genattrtab last year, which would have made us generate
>
> switch (type = get_attr_type (insn))
>   {
>   ... some cases ...
>   default:
>     switch (type = get_attr_type (insn))
>       {
>       ... some other cases ...
>       }
>   }
>
> The idea was to optimize this by merging the code into a single switch. My
> expectation was that this was most likely to occur in machine-generated
> code, but there are a few instances of this pattern in the gcc sources
> themselves. One case is
>
>code = gimple_code (stmt);
>switch (code)
>  {
>  
>  default:
>if (is_gimple_omp (code))
>  {
>  }
>  }
>
> where is_gimple_omp expands into another switch. More cases exist in the
> compiler as shown by various bootstrap failures along the way; sometimes
> these are exposed after other optimizations. One is in the Ada runtime
> library somewhere, and another (which currently cannot be transformed by the
> patch) is in the Fortran frontend.
>
> In the future we could also look for if statements making another comparison
> of the variable in the default branch, that would be a minor extension.
>
> The motivating example currently can't be transformed because get_attr_type
> calls are in the way.
>
> Bootstrapped and tested on x86_64-linux. Ok?

This is not appropriate for CFG cleanup due to its complexity not
being O(# bbs + # edges).
I tried hard in the past to make it so (at least when no transform is done).

Please move this transform elsewhere.  I suggest the switch-conversion
pass or if that
is not a good fit then maybe if-combine (whose transforms are remotely related).

Not looking closer at the patch but missing some comments on how it deals with
common cases (you seem to handle fallthrus to the default label by
ignoring them?)

Thanks,
Richard.


>
> Bernd
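
A minimal sketch of the transformation described above, written at the source level with made-up case values. `classify_nested` has the "before" shape (the default case re-tests the same value in an inner switch) and `classify_merged` is the single switch one would hope to get. Both function names are hypothetical; the actual patch works on GIMPLE, not on source code.

```cpp
#include <cassert>

// "Before": the default branch of the outer switch contains a second
// switch on the same value.
int
classify_nested (int code)
{
  switch (code)
    {
    case 1:
      return 10;
    case 2:
      return 20;
    default:
      switch (code)
        {
        case 3:
          return 30;
        case 4:
          return 40;
        default:
          return -1;
        }
    }
}

// "After": the inner cases are hoisted into the outer switch, and the
// inner default becomes the only default.
int
classify_merged (int code)
{
  switch (code)
    {
    case 1:
      return 10;
    case 2:
      return 20;
    case 3:
      return 30;
    case 4:
      return 40;
    default:
      return -1;
    }
}
```

The merge is straightforward here because the outer default does nothing except enter the inner switch.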


Re: [PATCH] Add qsort comparator consistency checking (PR71702)

2016-07-19 Thread Richard Biener
On Mon, Jul 18, 2016 at 7:36 PM, Alexander Monakov  wrote:
> On Mon, 18 Jul 2016, Richard Biener wrote:
>> Ugh.  What impact does this have on stage2 compile-time?
>
> It doesn't seem to be high enough to be measured reliably.  I've made a trial
> run with -time=time.log in BOOT_CFLAGS, but there's a lot of variability in
> timings and the sum total of times ended up 1% lower on the patched compiler.
>
> However, this patch only runs checking for vec::qsort, while I'd like to have
> such checking on all qsort calls.  That would make it a bit more concerning.
>
> It is possible to consider other schemes of limiting the impact of this
> checking by restricting the subset of pairs being tested. For instance, it's
> possible to run all-pairs check on a really small prefix of the sorted array
> (e.g. 10, instead of 100 in the proposed patch), and for the rest of the
> elements, check only a logarithmic number of pairs. This would make this
> checking have time complexity O(n log n), matching qsort (but likely with a
> lower constant factor).
> Would this scheme be appropriate?

Yes.  The other option is to enable this checking not with ENABLE_CHECKING
but some new checking option, say ENABLE_CHECKING_ALGORITHMS, and
do full checking in that case.

[The -fchecking option was supposed to be eventually extended to cover more
checking subsets, thus allow -fchecking=yes,algorithms for example]

Richard.

> Thanks.
> Alexander
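
The scheme discussed above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: the names `sorted_array_consistent_p`, `pair_ok`, `int_cmp` and `always_greater_cmp` are hypothetical, and picking "elements at distance 1, 2, 4, ... behind the current one" is just one way to select a logarithmic number of pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef int (*cmp_fn) (const void *, const void *);

// For positions i < j of an already sorted array, a consistent comparator
// must report element i as not greater than element j, and vice versa.
static bool
pair_ok (const char *base, std::size_t size, std::size_t i, std::size_t j,
         cmp_fn cmp)
{
  const void *a = base + i * size;
  const void *b = base + j * size;
  return cmp (a, b) <= 0 && cmp (b, a) >= 0;
}

// Check a sorted array: all pairs within a small prefix, then O(log n)
// pairs for each remaining element, for O(n log n) comparisons overall.
bool
sorted_array_consistent_p (const void *vbase, std::size_t n, std::size_t size,
                           cmp_fn cmp, std::size_t prefix = 10)
{
  const char *base = static_cast<const char *> (vbase);
  std::size_t p = n < prefix ? n : prefix;

  for (std::size_t i = 0; i < p; i++)
    for (std::size_t j = i + 1; j < p; j++)
      if (!pair_ok (base, size, i, j, cmp))
        return false;

  for (std::size_t j = p; j < n; j++)
    for (std::size_t d = 1; d <= j; d *= 2)
      if (!pair_ok (base, size, j - d, j, cmp))
        return false;

  return true;
}

// A well-behaved comparator for ints.
int
int_cmp (const void *a, const void *b)
{
  int x = *static_cast<const int *> (a);
  int y = *static_cast<const int *> (b);
  return (x > y) - (x < y);
}

// A deliberately inconsistent comparator: claims a > b for every pair.
int
always_greater_cmp (const void *, const void *)
{
  return 1;
}
```

A caller would sort with `std::qsort` first and then run the check with the same comparator.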


Re: [patch] Add new hook to diagnose address space usage (take #2)

2016-07-19 Thread Georg-Johann Lay

On 18.07.2016 16:13, Bernd Schmidt wrote:

On 07/14/2016 05:11 PM, Georg-Johann Lay wrote:

The hook allows better diagnostics:  The address spaces are registered
with c_register_addr_space and if the parser comes across an address
space it provides the hook with the needed information, in particular
the location of the token so that the message would be something like


Looks reasonable, except...


+(diagnose_usage,
+ "Define this hook if the availability of an address space depends on\n\
+command line options and some diagnostics shall be printed when the\n\


"should", not "shall", I think.


Fixed.


+bool
+default_addr_space_diagnose_usage (addr_space_t ARG_UNUSED (as),
+   location_t ARG_UNUSED (loc))
+{
+  return false;
+}


The return value is not used, so it should return void. That would also match
the documentation you added (which says "does nothing" rather than "returns
false").


Fixed, the hook returns void now.

The idea was that in a future version the c-parser might make a decision
depending on whether an error has been issued.



Remove unused arg names in default hook implementations, I think.


Bernd




Done.  Attached is the updated version of the change, log entry is the same as 
before.


Johann


gcc/
* target.def (addr_space): Add new diagnose_usage to hook vector.
* targhooks.c (default_addr_space_diagnose_usage): Add default
implementation and...
* targhooks.h (default_addr_space_diagnose_usage): ... its prototype.
* c/c-parser.c (c_lex_one_token) [CPP_NAME]: If the token
is some address space, call targetm.addr_space.diagnose_usage.
* doc/tm.texi.in (Named Address Spaces): Add anchor for
TARGET_ADDR_SPACE_DIAGNOSE_USAGE documentation.
* doc/tm.texi: Regenerate.

Index: c/c-parser.c
===
--- c/c-parser.c	(revision 238425)
+++ c/c-parser.c	(working copy)
@@ -301,6 +301,9 @@ c_lex_one_token (c_parser *parser, c_tok
 	else if (rid_code >= RID_FIRST_ADDR_SPACE
 		 && rid_code <= RID_LAST_ADDR_SPACE)
 	  {
+		addr_space_t as;
+		as = (addr_space_t) (rid_code - RID_FIRST_ADDR_SPACE);
+		targetm.addr_space.diagnose_usage (as, token->location);
 		token->id_kind = C_ID_ADDRSPACE;
 		token->keyword = rid_code;
 		break;
Index: doc/tm.texi
===
--- doc/tm.texi	(revision 238425)
+++ doc/tm.texi	(working copy)
@@ -10431,6 +10431,17 @@ Define this to define how the address sp
 The result is the value to be used with @code{DW_AT_address_class}.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_ADDR_SPACE_DIAGNOSE_USAGE (addr_space_t @var{as}, location_t @var{loc})
+Define this hook if the availability of an address space depends on
+command line options and some diagnostics should be printed when the
+address space is used.  This hook is called during parsing and allows
+to emit a better diagnostic compared to the case where the address space
+was not registered with @code{c_register_addr_space}.  @var{as} is
+the address space as registered with @code{c_register_addr_space}.
+@var{loc} is the location of the address space qualifier token.
+The default implementation does nothing.
+@end deftypefn
+
 @node Misc
 @section Miscellaneous Parameters
 @cindex parameters, miscellaneous
Index: doc/tm.texi.in
===
--- doc/tm.texi.in	(revision 238425)
+++ doc/tm.texi.in	(working copy)
@@ -7486,6 +7486,8 @@ c_register_addr_space ("__ea", ADDR_SPAC
 
 @hook TARGET_ADDR_SPACE_DEBUG
 
+@hook TARGET_ADDR_SPACE_DIAGNOSE_USAGE
+
 @node Misc
 @section Miscellaneous Parameters
 @cindex parameters, miscellaneous
Index: target.def
===
--- target.def	(revision 238425)
+++ target.def	(working copy)
@@ -3241,6 +3241,20 @@ The result is the value to be used with
  int, (addr_space_t as),
  default_addr_space_debug)
 
+/* Function to emit custom diagnostic if an address space is used.  */
+DEFHOOK
+(diagnose_usage,
+ "Define this hook if the availability of an address space depends on\n\
+command line options and some diagnostics should be printed when the\n\
+address space is used.  This hook is called during parsing and allows\n\
+to emit a better diagnostic compared to the case where the address space\n\
+was not registered with @code{c_register_addr_space}.  @var{as} is\n\
+the address space as registered with @code{c_register_addr_space}.\n\
+@var{loc} is the location of the address space qualifier token.\n\
+The default implementation does nothing.",
+ void, (addr_space_t as, location_t loc),
+ default_addr_space_diagnose_usage)
+
 HOOK_VECTOR_END (addr_space)
 
 #undef HOOK_PREFIX
Index: targhooks.c
===
--- targhooks.c	(revision 238425)
+++ targhooks.c	(working copy)
@@ -1291,6 +1291,15 @@ defa

Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-19 Thread Richard Biener
On Tue, Jul 19, 2016 at 2:39 AM, Patrick Palka  wrote:
> On Mon, 18 Jul 2016, Segher Boessenkool wrote:
>
>> On Mon, Jul 18, 2016 at 06:35:11AM -0500, Segher Boessenkool wrote:
>> > Or, if using GNU ar, you can even use -S, if that helps (after testing
>> > for it in configure, of course).
>>
>> I meant -T.  Some day I will learn how to type, promise!
>
> According to the documentation of GNU ar,
>
>   "gnu ar can optionally create a thin archive, which contains a symbol
>   index and references to the original copies of the member files of the
>   archive. This is useful for building libraries for use within a local
>   build tree, where the relocatable objects are expected to remain
>   available, and copying the contents of each object would only waste time
>   and space."
>
> Since the objects which libbackend.a is composed of remain available
> throughout the build process I think it should be safe to make
> libbackend.a a thin archive.
>
> So here's a patch which builds libbackend.a as a thin archive if the
> toolchain supports it.  The time it takes to rebuild a
> --disable-bootstrap tree after touching a single source file is now 7.5s
> instead of 35+s -- a much better speedup than when simply eliding the
> call to ranlib since the archive is now 1-5MB in size instead of 450MB.
>
> Instead of changing AR_FLAGS, only the invocation of ar on libbackend.a
> is changed because that is by far the largest archive (by a factor of
> 20x) and it seems less risky this way.
>
> One thing that was not clear to me is whether the object file paths
> stored in a thin archive are relative or absolute paths.  If they are
> absolute paths then that would be a problem due to how the build system
> moves build directories in between stages (gcc/ -> prev-gcc/ etc).  But
> it looks like the object file paths are relative to the location of the
> archive which is compatible.
>
> Bootstrapped on x86_64-pc-linux-gnu.  Thoughts?

I like it.  Improving re-build time in my dev tree is very much
welcome, and yes,
libbackend build time is a big part of it usually (plus of course cc1
link time).

Richard.

> -- >8 --
>
> Subject: [PATCH] Build libbackend.a as a thin archive if possible
>
> gcc/ChangeLog:
>
> * configure.ac (thin_archive_support): New variable.  AC_SUBST it.
> * configure: Regenerate.
> * Makefile.in (THIN_ARCHIVE_SUPPORT): New variable.
> (USE_THIN_ARCHIVES): New variable.
> (libbackend.a): If USE_THIN_ARCHIVES then pass T to ar to build
> this archive as a thin archive.
> ---
>  gcc/Makefile.in  | 17 +
>  gcc/configure| 20 ++--
>  gcc/configure.ac | 13 +
>  3 files changed, 48 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 0786fa3..15a879b 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -275,6 +275,17 @@ else
>  LLINKER = $(LINKER)
>  endif
>
> +THIN_ARCHIVE_SUPPORT = @thin_archive_support@
> +
> +USE_THIN_ARCHIVES = no
> +ifeq ($(THIN_ARCHIVE_SUPPORT),yes)
> +ifeq ($(AR_FLAGS),rc)
> +ifeq ($(RANLIB_FLAGS),)
> +USE_THIN_ARCHIVES = yes
> +endif
> +endif
> +endif
> +
>  # ---
>  # Programs which operate on the build machine
>  # ---
> @@ -1882,8 +1893,14 @@ compilations: $(BACKEND)
>  # This archive is strictly for the host.
>  libbackend.a: $(OBJS)
> -rm -rf libbackend.a
> +   @# Build libbackend.a as a thin archive if possible, as doing so
> +   @# significantly reduces build times.
> +ifeq ($(USE_THIN_ARCHIVES),yes)
> +   $(AR) $(AR_FLAGS)T libbackend.a $(OBJS)
> +else
> $(AR) $(AR_FLAGS) libbackend.a $(OBJS)
> -$(RANLIB) $(RANLIB_FLAGS) libbackend.a
> +endif
>
>  libcommon-target.a: $(OBJS-libcommon-target)
> -rm -rf libcommon-target.a
> diff --git a/gcc/configure b/gcc/configure
> index ed44472..81c81b3 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -679,6 +679,7 @@ zlibinc
>  zlibdir
>  HOST_LIBS
>  enable_default_ssp
> +thin_archive_support
>  libgcc_visibility
>  gcc_cv_readelf
>  gcc_cv_objdump
> @@ -18475,7 +18476,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 18478 "configure"
> +#line 18479 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> @@ -18581,7 +18582,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 18584 "configure"
> +#line 18585 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> @@ -27846,6 +27847,21 @@ $as_echo "#define HAVE_AS_LINE_ZERO 1" >>confdefs.h
>
>  fi
>
> +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking support for thin archives" >&5
> +$as_echo_n "checking support for thin archives... " >&6; }
> +thin_archive_support=no
> +echo 'int main (void) { return 0; }' > conftest.c
> +if ($AR --ver

Re: [RFC][IPA-VRP] Add support for IPA VRP in ipa-cp/ipa-prop

2016-07-19 Thread kugan

Hi Martin,

Thanks for the review.  I have revised the patch based on the review. 
Please see the comments below.


On 15/07/16 22:23, Martin Jambor wrote:

Hi,

thanks for working on extending IPA-CP in this way.  I do have a few
comments though:

On Fri, Jul 15, 2016 at 02:46:50PM +1000, kugan wrote:

Hi,

This patch extends ipa-cp/ipa-prop infrastructure to handle propagation of
VR.
Thanks,

Kugan

gcc/testsuite/ChangeLog:

2016-07-14  Kugan Vivekanandarajah  

 * gcc.dg/ipa/vrp1.c: New test.
 * gcc.dg/ipa/vrp2.c: New test.
 * gcc.dg/ipa/vrp3.c: New test.

gcc/ChangeLog:

2016-07-14  Kugan Vivekanandarajah  

 * common.opt: New option -fipa-vrp.
 * ipa-cp.c (ipa_get_vr_lat): New.
 (ipcp_vr_lattice::print): Likewise.
 (print_all_lattices): Call ipcp_vr_lattice::print.
 (ipcp_vr_lattice::meet_with): New.
 (ipcp_vr_lattice::meet_with_1): Likewise.
 (ipcp_vr_lattice::top_p): Likewise.
 (ipcp_vr_lattice::bottom_p): Likewise.
 (ipcp_vr_lattice::set_to_bottom): Likewise.
 (set_all_contains_variable): Call VR set_to_bottom.
 (initialize_node_lattices): Init VR lattices.
 (propagate_vr_accross_jump_function): New.
 (propagate_constants_accross_call): Call
 propagate_vr_accross_jump_function.
 (ipcp_store_alignment_results): Rename to
 ipcp_store_alignment_and_vr_results and handle VR.
 * ipa-prop.c (ipa_set_jf_unknown):
 (ipa_compute_jump_functions_for_edge): Handle Value Range.
 (ipa_node_params_t::duplicate): Likewise.
 (ipa_write_jump_function): Likewise.
 (ipa_read_jump_function): Likewise.
 (write_ipcp_transformation_info): Likewise.
 (read_ipcp_transformation_info): Likewise.
 (ipcp_update_alignments): Rename to ipcp_update_vr_and_alignments
 and handle VR.




 From 092cbccd79c3859ff24846bb0e1892ef5d8086bc Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Tue, 21 Jun 2016 12:43:01 +1000
Subject: [PATCH 5/6] Add ipa vrp

---
  gcc/common.opt  |   4 +
  gcc/ipa-cp.c| 220 +++-
  gcc/ipa-prop.c  | 110 ++--
  gcc/ipa-prop.h  |  16 +++
  gcc/testsuite/gcc.dg/ipa/vrp1.c |  32 ++
  gcc/testsuite/gcc.dg/ipa/vrp2.c |  35 +++
  gcc/testsuite/gcc.dg/ipa/vrp3.c |  30 ++
  7 files changed, 433 insertions(+), 14 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/ipa/vrp1.c
  create mode 100644 gcc/testsuite/gcc.dg/ipa/vrp2.c
  create mode 100644 gcc/testsuite/gcc.dg/ipa/vrp3.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 29d0e4d..7bf7305 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2475,6 +2475,10 @@ ftree-evrp
  Common Report Var(flag_tree_early_vrp) Init(1) Optimization
  Perform Early Value Range Propagation on trees.

+fipa-vrp
+ommon Report Var(flag_ipa_vrp) Init(1) Optimization


Common


Done.




+Perform IPA Value Range Propagation on trees.


I think that nowadays we should omit the "on trees" part, they are not
particularly useful.



Done.


+
  fsplit-paths
  Common Report Var(flag_split_paths) Init(0) Optimization
  Split paths leading to loop backedges.
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 4b7f6bb..97cd04b 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "params.h"
  #include "ipa-inline.h"
  #include "ipa-utils.h"
+#include "tree-vrp.h"

  template  class ipcp_value;

@@ -266,6 +267,25 @@ private:
bool meet_with_1 (unsigned new_align, unsigned new_misalign);
  };

+/* Lattice of value ranges.  */
+
+class ipcp_vr_lattice
+{
+public:
+  value_range vr;
+
+  inline bool bottom_p () const;
+  inline bool top_p () const;
+  inline bool set_to_bottom ();
+  bool meet_with (const value_range *vr);


Please do not call the parameter the same name as that of a member
variable, that might become very confusing in future.


Done.


+  bool meet_with (const ipcp_vr_lattice &other);
+  void init () { vr.type = VR_UNDEFINED; }
+  void print (FILE * f);
+
+private:
+  bool meet_with_1 (const value_range *vr);


Likewise.

I know that no other classes in the file do, but if you want to
strictly follow the GCC coding style, member vr should be called m_vr.
Perhaps I should add the m_ prefixes to other classes as well, I am
beginning to appreciate them.



Done.


+};
+
  /* Structure containing lattices for a parameter itself and for pieces of
 aggregates that are passed in the parameter or by a reference in a parameter
 plus some other useful flags.  */
@@ -281,6 +301,8 @@ public:
ipcp_agg_lattice *aggs;
/* Lattice describing known alignment.  */
ipcp_alignment_lattice alignment;
+  /* Lattice describing value range.  */
+  ipcp_vr_lattice vr;
/* Number of aggregate lattices */
int aggs_count;
/* True if aggregate dat
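
To make the lattice operations named in the patch easier to follow, here is a small self-contained sketch of a value-range lattice with a `meet_with` that reports whether the lattice changed. It is not the patch's implementation: `simple_range`, `range_kind` and `vr_lattice_sketch` are hypothetical stand-ins for GCC's `value_range` and `ipcp_vr_lattice`, and the meet is simplified to the convex hull of two integer ranges. The member is named `m_vr`, following the naming suggestion in the review.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical, simplified stand-in for a value range.
enum range_kind { RK_UNDEFINED, RK_RANGE, RK_VARYING };

struct simple_range
{
  range_kind kind;
  long min;
  long max;
};

class vr_lattice_sketch
{
public:
  // Start at top: nothing is known yet.
  void init () { m_vr.kind = RK_UNDEFINED; m_vr.min = 0; m_vr.max = 0; }
  bool top_p () const { return m_vr.kind == RK_UNDEFINED; }
  bool bottom_p () const { return m_vr.kind == RK_VARYING; }

  // Drop to bottom; return true if this changed the lattice.
  bool set_to_bottom ()
  {
    if (bottom_p ())
      return false;
    m_vr.kind = RK_VARYING;
    return true;
  }

  // Meet with OTHER; return true if the lattice value changed.
  bool meet_with (const simple_range &other)
  {
    if (bottom_p () || other.kind == RK_UNDEFINED)
      return false;
    if (other.kind == RK_VARYING)
      return set_to_bottom ();
    if (top_p ())
      {
        m_vr = other;
        return true;
      }
    long lo = std::min (m_vr.min, other.min);
    long hi = std::max (m_vr.max, other.max);
    if (lo == m_vr.min && hi == m_vr.max)
      return false;
    m_vr.min = lo;
    m_vr.max = hi;
    return true;
  }

  const simple_range &range () const { return m_vr; }

private:
  simple_range m_vr;
};
```

Returning a "changed" flag from each operation is what lets a propagation loop know when it has reached a fixed point.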

Re: [PATCH] c++/58796 Make nullptr match exception handlers of pointer type

2016-07-19 Thread Jonathan Wakely

On 18/07/16 12:49 -0400, Jason Merrill wrote:

Perhaps the right answer is to drop support for catching nullptr as a
pointer to member from the language.


Yes, I've been drafting a ballot comment along those lines.




Re: [PATCH GCC]Remove support for -funsafe-loop-optimizations

2016-07-19 Thread Martin Jambor
Hi,

On Mon, Jul 18, 2016 at 11:28:48AM -0400, NightStrike wrote:
> Well, one thing to note is that the warning is an easy way to get a
> notice of a possible missed optimization (and I have many more
> occurrences of it in a particular code base that I use).  If the
> warning is highlighting potential issues that aren't due to the -f
> option but are issues nonetheless, and we remove the warning, then how
> should I go about finding these missed opportunities in the future?
> Is there a different mechanism that does the same thing?

Yes, -fopt-info and -fopt-info-OPTIONS switches.  It certainly seems
to be a more natural means for manual compiler-guided optimization
than warnings.

Martin


Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Bernd Schmidt

On 07/19/2016 10:07 AM, Richard Biener wrote:


This is not appropriate for CFG cleanup due to its complexity not
being O(# bbs + # edges).
I tried hard in the past to make it so (at least when no transform is done).


Why wouldn't it be, if no transform is done? Assuming we visit each bb 
once, we have at worst one walk over the successor edges of the 
predecessor (if we even find two switches on the same variable), and 
then we can decide whether or not to do the transformation.


When performing the transformation I could imagine one could construct a 
testcase where lots of these switches are nested inside each other, but 
I'm not convinced that's really a realistic worry.



Please move this transform elsewhere.  I suggest the switch-conversion
pass or if that
is not a good fit then maybe if-combine (whose transforms are remotely related).


One problem is that this triggers rarely, but when it does, it occurs at 
various stages in the compilation after other optimizations have been 
done. Moving it to any given point is likely to limit the effectiveness.



Not looking closer at the patch but missing some comments on how it deals with
common cases (you seem to handle fallthrus to the default label by
ignoring them?)


If you are thinking of

switch (a)
 {
 case n:
 case m:
 default:
   switch (a) {  }
 }

then the cases for n and m can simply be dropped when merging from the 
second switch into the first one. That's what happens, and there's a 
comment for it. So please elaborate what you mean.



Bernd



Re: [patch, fortran] PR66310 Problems with intrinsic repeat for large number of copies

2016-07-19 Thread Dominique d'Humières

> On 18 Jul 2016, at 02:02, Jerry DeLisle  wrote:
> 
> Please test this revised patch. See my comments in the PR.
> 
> I think we should commit this one.
> 
> Jerry

Jerry,

As said on IRC I think the limit should be documented and a TODO comment added 
to gcc/fortran/gfortran.h.

While trying to bootstrap with the patch I got

/opt/gcc/build_w/./prev-gcc/xg++ -B/opt/gcc/build_w/./prev-gcc/ 
-B/opt/gcc/gcc7w/x86_64-apple-darwin15.5.0/bin/ -nostdinc++ 
-B/opt/gcc/build_w/prev-x86_64-apple-darwin15.5.0/libstdc++-v3/src/.libs 
-B/opt/gcc/build_w/prev-x86_64-apple-darwin15.5.0/libstdc++-v3/libsupc++/.libs  
-I/opt/gcc/build_w/prev-x86_64-apple-darwin15.5.0/libstdc++-v3/include/x86_64-apple-darwin15.5.0
  -I/opt/gcc/build_w/prev-x86_64-apple-darwin15.5.0/libstdc++-v3/include  
-I/opt/gcc/work/libstdc++-v3/libsupc++ 
-L/opt/gcc/build_w/prev-x86_64-apple-darwin15.5.0/libstdc++-v3/src/.libs 
-L/opt/gcc/build_w/prev-x86_64-apple-darwin15.5.0/libstdc++-v3/libsupc++/.libs 
-fno-PIE -c  -DIN_GCC_FRONTEND -g -O2   -gtoggle -DIN_GCC -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
 -DHAVE_CONFIG_H -I. -Ifortran -I../../work/gcc -I../../work/gcc/fortran 
-I../../work/gcc/../include -I./../intl -I../../work/gcc/../libcpp/include 
-I/opt/mp-new/include  -I../../work/gcc/../libdecnumber 
-I../../work/gcc/../libdecnumber/dpd -I../libdecnumber 
-I../../work/gcc/../libbacktrace -I/opt/mp-new/include  -o fortran/simplify.o 
-MT fortran/simplify.o -MMD -MP -MF fortran/.deps/simplify.TPo 
../../work/gcc/fortran/simplify.c
../../work/gcc/fortran/simplify.c: In function 'gfc_expr* gfc_simplify_repeat(gfc_expr*, gfc_expr*)':
../../work/gcc/fortran/simplify.c:5089:11: error: variable 'i' set but not used [-Werror=unused-but-set-variable]
   int i;
   ^
cc1plus: all warnings being treated as errors

i.e., the lines

   int i;

and

   i = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);

must be deleted. Also, the comment

  /* Compute the maximum value allowed for NCOPIES:
huge(cl) - 1 / len.  */

should be updated.

Last point I have found, the limit does not seem to take into account 
CHARACTER(KIND=4):

[Book15] f90/bug% cat pr66310_par_4.f90
   program p
  character(len=2,kind=4), parameter :: z = 'yz'
  print *, repeat(z, 2**25)
   end
[Book15] f90/bug% gfc pr66310_par_4.f90
f951(3881,0x7fff74e38000) malloc: *** mach_vm_map(size=18446744073441116160) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

f951: out of memory allocating 18446744073441116160 bytes after a total of 0 bytes

Thanks for working on this PR.

Dominique
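
A hedged sketch of the kind of limit check the CHARACTER(KIND=4) case above seems to call for: the byte size of the result depends on the character width as well as on the length and the number of copies. In the test program, 2 characters of 4 bytes each repeated 2**25 times need 2**28 bytes. The helper `repeat_fits_p` and its parameters are hypothetical and are not taken from gfortran; the checks use division so that the size computation itself cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check (not gfortran code): would NCOPIES copies of a string
// of LEN characters, each CHAR_BYTES wide, fit within LIMIT bytes?
bool
repeat_fits_p (uint64_t len, uint64_t char_bytes, uint64_t ncopies,
               uint64_t limit)
{
  if (char_bytes == 0)
    return false;               // not a valid character width
  if (len == 0 || ncopies == 0)
    return true;                // empty result always fits
  if (len > limit / char_bytes)
    return false;               // a single copy is already too large
  uint64_t bytes_per_copy = len * char_bytes;
  return ncopies <= limit / bytes_per_copy;
}
```

With a limit that ignores the character width, the KIND=1 and KIND=4 variants of the same call would be treated alike even though the second needs four times the memory.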



Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Richard Biener
On Tue, Jul 19, 2016 at 11:52 AM, Bernd Schmidt  wrote:
> On 07/19/2016 10:07 AM, Richard Biener wrote:
>>
>>
>> This is not appropriate for CFG cleanup due to its complexity not
>> being O(# bbs + # edges).
>> I tried hard in the past to make it so (at least when no transform is
>> done).
>
>
> Why wouldn't it be, if no transform is done? Assuming we visit each bb once,
> we have at worst one walk over the successor edges of the predecessor (if we
> even find two switches on the same variable), and then we can decide whether
> or not to do the transformation.

I saw walks over stmts of a BB.  IMHO that's a no-go.

That said, CFG cleanup is not the place for this optimization.

The only trivial CFG cleanup transform for switches I can see is transforming
them to a simple if / else in case there is a single non-default
label.  And it's
not even doing that currently.

> When performing the transformation I could imagine one could construct a
> testcase where lots of these switches are nested inside each other, but I'm
> not convinced that's really a realistic worry.
>
>> Please move this transform elsewhere.  I suggest the switch-conversion
>> pass or if that
>> is not a good fit then maybe if-combine (whose transforms are remotely
>> related).
>
>
> One problem is that this triggers rarely, but when it does, it occurs at
> various stages in the compilation after other optimizations have been done.
> Moving it to any given point is likely to limit the effectiveness.

Well, that's true for all optimization passes we have.  The opportunity once it
arises is not likely to be removed by any pass.

>> Not looking closer at the patch but missing some comments on how it deals
>> with
>> common cases (you seem to handle fallthrus to the default label by
>> ignoring them?)
>
>
> If you are thinking of
>
> switch (a)
>  {
>  case n:
>  case m:
>  default:
>switch (a) {  }
>  }
>
> then the cases for n and m can simply be dropped when merging from the
> second switch into the first one. That's what happens, and there's a comment
> for it. So please elaborate what you mean.

I'm thinking of

  switch (a)
   {
   ...
   case n:
     do-stuff;
   default:
     switch (a)
       {
       case n:
         do-stuff;
       ...
       }
   }

yes, you can simply drop cases when there is no code in the outer switch.

Richard.

>
>
> Bernd
>


Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Bernd Schmidt

On 07/19/2016 12:09 PM, Richard Biener wrote:

> I saw walks over stmts of a BB.  IMHO that's a no-go.

Only to find the first or last nondebug one. Is that unacceptable?

> I'm thinking of
>
>   switch (a)
>    {
>    ...
>    case n:
>      do-stuff;
>    default:
>      switch (a)
>        {
>        case n:
>          do-stuff;
>        ...
>        }
>    }
>
> yes, you can simply drop cases when there is no code in the outer switch.


We check that the second switch has a single predecessor block so this 
case can't happen.
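
As a source-level illustration of the single-predecessor condition, consider the sketch below. The function names and return values are invented for the example. In nested(), the inner switch is reached only through the default edge of the outer one, so its block has a single predecessor. In the example quoted above, do-stuff falls through into the default label, which would give the inner switch a second predecessor.

```c
/* Hypothetical input: the inner switch can only be reached through the
   default label of the outer switch.  */
int
nested (int a)
{
  switch (a)
    {
    case 1:
      return 10;
    case 2:
      return 20;
    default:
      switch (a)
        {
        case 1:		/* Unreachable: a == 1 was handled above.  */
          return -1;
        case 3:
          return 30;
        default:
          return 0;
        }
    }
}

/* What a merged switch could look like: the inner "case 1" is dropped
   because the outer switch already handles that value.  */
int
merged (int a)
{
  switch (a)
    {
    case 1:
      return 10;
    case 2:
      return 20;
    case 3:
      return 30;
    default:
      return 0;
    }
}
```

The two functions agree for every value of a.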



Bernd



[PATCH 0/2] New selftests: sreal and fibonacci_heap

2016-07-19 Thread marxin
Hello.

The following small patch set adds selftests for sreal and fibonacci_heap.
For sreal, I basically transformed the existing tests, which were implemented
as a GCC plugin.

The current implementation of the fibonacci heap corrupts memory in
union_with; the patch fixes that and also applies a usability change to the
insert_node function.

The patches survive regression tests and bootstrap on x86_64-linux-gnu
and ppc64le-linux-gnu.  In addition, an aarch64 compiler can be built
with --disable-bootstrap.

Ready for trunk?
Martin

marxin (2):
  Add sreal to selftests
  Add selftests for fibonacci_heap

 gcc/Makefile.in                            |   1 +
 gcc/fibonacci_heap.c                       | 290 +
 gcc/fibonacci_heap.h                       |  37 +++-
 gcc/selftest-run-tests.c                   |   2 +
 gcc/selftest.h                             |   2 +
 gcc/sreal.c                                | 112 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp     |   1 -
 gcc/testsuite/gcc.dg/plugin/sreal-test-1.c |   8 -
 gcc/testsuite/gcc.dg/plugin/sreal_plugin.c | 170 -
 9 files changed, 435 insertions(+), 188 deletions(-)
 create mode 100644 gcc/fibonacci_heap.c
 delete mode 100644 gcc/testsuite/gcc.dg/plugin/sreal-test-1.c
 delete mode 100644 gcc/testsuite/gcc.dg/plugin/sreal_plugin.c

-- 
2.8.4



[PATCH 1/2] Add sreal to selftests

2016-07-19 Thread marxin
gcc/ChangeLog:

2016-07-12  Martin Liska  

* selftest-run-tests.c (selftest::run_tests): Call sreal_c_tests.
* selftest.h (sreal_c_tests): Declare.
* sreal.c (sreal_verify_basics): New function.
(verify_aritmetics): Likewise.
(sreal_verify_arithmetics): Likewise.
(verify_shifting): Likewise.
(sreal_verify_shifting): Likewise.
(sreal_c_tests): Likewise.

gcc/testsuite/ChangeLog:

2016-07-12  Martin Liska  

* gcc.dg/plugin/plugin.exp: Remove sreal test.
* gcc.dg/plugin/sreal-test-1.c: Remove.
* gcc.dg/plugin/sreal_plugin.c: Remove.
---
 gcc/selftest-run-tests.c                   |   1 +
 gcc/selftest.h                             |   1 +
 gcc/sreal.c                                | 112 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp     |   1 -
 gcc/testsuite/gcc.dg/plugin/sreal-test-1.c |   8 --
 gcc/testsuite/gcc.dg/plugin/sreal_plugin.c | 170 -
 6 files changed, 114 insertions(+), 179 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/plugin/sreal-test-1.c
 delete mode 100644 gcc/testsuite/gcc.dg/plugin/sreal_plugin.c

diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index bddf0b2..bb004cc 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -49,6 +49,7 @@ selftest::run_tests ()
   pretty_print_c_tests ();
   wide_int_cc_tests ();
   ggc_tests_c_tests ();
+  sreal_c_tests ();
 
   /* Mid-level data structures.  */
   input_c_tests ();
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 967e76b..c805386 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -86,6 +86,7 @@ extern void pretty_print_c_tests ();
 extern void rtl_tests_c_tests ();
 extern void spellcheck_c_tests ();
 extern void spellcheck_tree_c_tests ();
+extern void sreal_c_tests ();
 extern void tree_c_tests ();
 extern void tree_cfg_c_tests ();
 extern void vec_c_tests ();
diff --git a/gcc/sreal.c b/gcc/sreal.c
index a7c9c12..9c43b4e 100644
--- a/gcc/sreal.c
+++ b/gcc/sreal.c
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #include "coretypes.h"
 #include "sreal.h"
+#include "selftest.h"
 
 /* Print the content of struct sreal.  */
 
@@ -233,3 +234,114 @@ sreal::operator/ (const sreal &other) const
   r.normalize ();
   return r;
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Selftests for sreals.  */
+
+/* Verify basic sreal operations.  */
+
+static void
+sreal_verify_basics (void)
+{
+  sreal minimum = INT_MIN;
+  sreal maximum = INT_MAX;
+
+  sreal seven = 7;
+  sreal minus_two = -2;
+  sreal minus_nine = -9;
+
+  ASSERT_EQ (INT_MIN, minimum.to_int ());
+  ASSERT_EQ (INT_MAX, maximum.to_int ());
+
+  ASSERT_FALSE (minus_two < minus_two);
+  ASSERT_FALSE (seven < seven);
+  ASSERT_TRUE (seven > minus_two);
+  ASSERT_TRUE (minus_two < seven);
+  ASSERT_TRUE (minus_two != seven);
+  ASSERT_EQ (minus_two, -2);
+  ASSERT_EQ (seven, 7);
+  ASSERT_EQ ((seven << 10) >> 10, 7);
+  ASSERT_EQ (seven + minus_nine, -2);
+}
+
+/* Helper function that performs basic arithmetics and comparison
+   of given arguments A and B.  */
+
+static void
+verify_aritmetics (int64_t a, int64_t b)
+{
+  ASSERT_EQ (a, -(-(sreal (a))).to_int ());
+  ASSERT_EQ (a < b, sreal (a) < sreal (b));
+  ASSERT_EQ (a <= b, sreal (a) <= sreal (b));
+  ASSERT_EQ (a == b, sreal (a) == sreal (b));
+  ASSERT_EQ (a != b, sreal (a) != sreal (b));
+  ASSERT_EQ (a > b, sreal (a) > sreal (b));
+  ASSERT_EQ (a >= b, sreal (a) >= sreal (b));
+  ASSERT_EQ (a + b, (sreal (a) + sreal (b)).to_int ());
+  ASSERT_EQ (a - b, (sreal (a) - sreal (b)).to_int ());
+  ASSERT_EQ (b + a, (sreal (b) + sreal (a)).to_int ());
+  ASSERT_EQ (b - a, (sreal (b) - sreal (a)).to_int ());
+}
+
+/* Verify arithmetics for interesting numbers.  */
+
+static void
+sreal_verify_arithmetics (void)
+{
+  int values[] = {-14123413, -, -17, -10, -2, 0, 17, 139, 1234123};
+  unsigned c = sizeof (values) / sizeof (int);
+
+  for (unsigned i = 0; i < c; i++)
+for (unsigned j = 0; j < c; j++)
+  {
+   int a = values[i];
+   int b = values[j];
+
+   verify_aritmetics (a, b);
+  }
+}
+
+/* Helper function that performs various shifting test of a given
+   argument A.  */
+
+static void
+verify_shifting (int64_t a)
+{
+  sreal v = a;
+
+  for (unsigned i = 0; i < 16; i++)
+ASSERT_EQ (a << i, (v << i).to_int());
+
+  a = a << 16;
+  v = v << 16;
+
+  for (unsigned i = 0; i < 16; i++)
+ASSERT_EQ (a >> i, (v >> i).to_int());
+}
+
+/* Verify shifting for interesting numbers.  */
+
+static void
+sreal_verify_shifting (void)
+{
+  int values[] = {0, 17, 32, 139, 1024, 5, 1234123};
+  unsigned c = sizeof (values) / sizeof (int);
+
+  for (unsigned i = 0; i < c; i++)
+verify_shifting (values[i]);
+}
+
+/* Run all of the selftests within this file.  */
+
+void sreal_c_tests ()
+{
+  sreal_verify_basics ();
+  sreal_verify_arithmetics ();
+  sreal_verify_shifting ();
+}
+
+} // namespace selftest

[PATCH 2/2] Add selftests for fibonacci_heap

2016-07-19 Thread marxin
gcc/ChangeLog:

2016-07-13  Martin Liska  

* Makefile.in: Include fibonacci_heap.c
* fibonacci_heap.c: New file.
* fibonacci_heap.h (fibonacci_heap::insert): Use insert_node.
(fibonacci_heap::union_with): Fix deletion of the second heap.
* selftest-run-tests.c (selftest::run_tests): Incorporate
fibonacci heap tests.
* selftest.h: Declare fibonacci_heap_c_tests.
---
 gcc/Makefile.in          |   1 +
 gcc/fibonacci_heap.c     | 290 +++
 gcc/fibonacci_heap.h     |  37 --
 gcc/selftest-run-tests.c |   1 +
 gcc/selftest.h           |   1 +
 5 files changed, 321 insertions(+), 9 deletions(-)
 create mode 100644 gcc/fibonacci_heap.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 0786fa3..bfa467c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1259,6 +1259,7 @@ OBJS = \
explow.o \
expmed.o \
expr.o \
+   fibonacci_heap.o \
final.o \
fixed-value.o \
fold-const.o \
diff --git a/gcc/fibonacci_heap.c b/gcc/fibonacci_heap.c
new file mode 100644
index 0000000..db58417
--- /dev/null
+++ b/gcc/fibonacci_heap.c
@@ -0,0 +1,290 @@
+/* Fibonacci heap for GNU compiler.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by Martin Liska 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "fibonacci_heap.h"
+#include "selftest.h"
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Selftests.  */
+
+/* Verify that operations with empty heap work.  */
+
+typedef fibonacci_node <int, int> int_heap_node_t;
+typedef fibonacci_heap <int, int> int_heap_t;
+
+static void
+test_empty_heap ()
+{
+  int_heap_t *h1 = new int_heap_t (INT_MIN);
+
+  ASSERT_TRUE (h1->empty ());
+  ASSERT_EQ (0, h1->nodes ());
+  ASSERT_EQ (NULL, h1->min ());
+
+  int_heap_t *h2 = new int_heap_t (INT_MIN);
+
+  int_heap_t *r = h1->union_with (h2);
+  ASSERT_TRUE (r->empty ());
+  ASSERT_EQ (0, r->nodes ());
+  ASSERT_EQ (NULL, r->min ());
+
+  delete r;
+}
+
+#define TEST_HEAP_N 100
+#define TEST_CALCULATE_VALUE(i)  ((3 * i) + 1)
+
+/* Verify heap basic operations.  */
+
+static void
+test_basic_heap_operations ()
+{
+  int values[TEST_HEAP_N];
+  int_heap_t *h1 = new int_heap_t (INT_MIN);
+
+  for (unsigned i = 0; i < TEST_HEAP_N; i++)
+{
+  values[i] = TEST_CALCULATE_VALUE (i);
+  ASSERT_EQ (i, h1->nodes ());
+  h1->insert (i, &values[i]);
+  ASSERT_EQ (0, h1->min_key ());
+  ASSERT_EQ (values[0], *h1->min ());
+}
+
+  for (unsigned i = 0; i < TEST_HEAP_N; i++)
+{
+  ASSERT_EQ (TEST_HEAP_N - i, h1->nodes ());
+  ASSERT_EQ ((int)i, h1->min_key ());
+  ASSERT_EQ (values[i], *h1->min ());
+
+  h1->extract_min ();
+}
+
+  ASSERT_TRUE (h1->empty ());
+
+  delete h1;
+}
+
+/* Builds a simple heap with keys in the interval 0..TEST_HEAP_N-1, where the
+   value for each key is equal to 3 * key + 1.  BUFFER is used as storage
+   for the values and NODES points to the inserted nodes.  */
+
+static int_heap_t *
+build_simple_heap (int *buffer, int_heap_node_t **nodes)
+{
+  int_heap_t *h = new int_heap_t (INT_MIN);
+
+  for (unsigned i = 0; i < TEST_HEAP_N; i++)
+{
+  buffer[i] = TEST_CALCULATE_VALUE (i);
+  nodes[i] = h->insert (i, &buffer[i]);
+}
+
+  return h;
+}
+
+/* Verify that fibonacci_heap::replace_key works.  */
+
+static void
+test_replace_key ()
+{
+  int values[TEST_HEAP_N];
+  int_heap_node_t *nodes[TEST_HEAP_N];
+
+  int_heap_t *heap = build_simple_heap (values, nodes);
+
+  int N = 10;
+  for (unsigned i = 0; i < (unsigned)N; i++)
+heap->replace_key (nodes[i], 100 * 1000 + i);
+
+  ASSERT_EQ (TEST_HEAP_N, heap->nodes ());
+  ASSERT_EQ (N, heap->min_key ());
+  ASSERT_EQ (TEST_CALCULATE_VALUE (N), *heap->min ());
+
+  for (int i = 0; i < TEST_HEAP_N - 1; i++)
+heap->extract_min ();
+
+  ASSERT_EQ (1, heap->nodes ());
+  ASSERT_EQ (100 * 1000 + N - 1, heap->min_key ());
+
+  delete heap;
+}
+
+/* Verify that heap can handle duplicate keys.  */
+
+static void
+test_duplicite_keys ()
+{
+  int values[3 * TEST_HEAP_N];
+  int_heap_t *heap = new int_heap_t (INT_MIN);
+
+  for (unsigned i = 0; i < 3 * TEST_HEAP_N; i++)
+{
+  values[i] = TEST_CALCULATE_VALUE (i);
+  heap->insert (i / 3, &values[i]);
+}
+
+  ASSERT_EQ (3 * TEST_HEAP_N, heap->nodes ());
+  ASSER

Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Marc Glisse

On Tue, 19 Jul 2016, Bernd Schmidt wrote:

> On 07/19/2016 12:09 PM, Richard Biener wrote:
>
>> I saw walks over stmts of a BB.  IMHO that's a no-go.
>
> Only to find the first or last nondebug one. Is that unacceptable?

Does gsi_start_nondebug_after_labels_bb not fit?

--
Marc Glisse


Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Bernd Schmidt

On 07/19/2016 12:22 PM, Marc Glisse wrote:

> On Tue, 19 Jul 2016, Bernd Schmidt wrote:
>
>> On 07/19/2016 12:09 PM, Richard Biener wrote:
>>
>>> I saw walks over stmts of a BB.  IMHO that's a no-go.
>>
>> Only to find the first or last nondebug one. Is that unacceptable?
>
> Does gsi_start_nondebug_after_labels_bb not fit?

It might, if one realizes that such a thing exists. Will try.


Bernd



[patch,avr] Slightly better memory accesses on avr_tiny

2016-07-19 Thread Georg-Johann Lay
This patch tries to improve the bloated code we are currently generating for 
AVR_TINY.  It's mostly about printing the memory loads and stores and more 
usage of reg_unused_after to print shorter instruction sequences in some cases.


Ok for trunk?

I also played around with PLUS in legitimate_address_p and legitimize_address 
and got better code, but the problem with such changes is that almost all tests 
for such small devices are failing and no reasonable portion of the testsuite 
will pass.


I don't even know if anybody is using avr_tiny + avr-gcc or if users are 
resorting to assembler.


Johann


gcc/
* config/avr/avr.c (avr_legitimize_address) [AVR_TINY]: Force constant
addresses outside [0,0xc0] into a register.
(avr_out_movhi_r_mr_reg_no_disp_tiny): Pass insn.  And handle
cases where the base address register is unused after.
(avr_out_movhi_r_mr_reg_disp_tiny): Same.
(avr_out_movhi_mr_r_reg_disp_tiny): Same.
(avr_out_store_psi_reg_disp_tiny): Same.

gcc/testsuite/
* gcc.target/avr/torture/get-mem.c: New test.
* gcc.target/avr/torture/set-mem.c: New test.
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(revision 238425)
+++ config/avr/avr.c	(working copy)
@@ -1922,6 +1922,16 @@ avr_legitimize_address (rtx x, rtx oldx,
 
   x = oldx;
 
+  if (AVR_TINY)
+{
+  if (CONSTANT_ADDRESS_P (x)
+  && !(CONST_INT_P (x)
+   && IN_RANGE (INTVAL (x), 0, 0xc0 - GET_MODE_SIZE (mode))))
+{
+  x = force_reg (Pmode, x);
+}
+}
+
   if (GET_CODE (oldx) == PLUS
   && REG_P (XEXP (oldx, 0)))
 {
@@ -3510,7 +3520,7 @@ out_movqi_r_mr (rtx_insn *insn, rtx op[]
 /* Same as movhi_r_mr, but TINY does not have ADIW, SBIW and LDD */
 
 static const char*
-avr_out_movhi_r_mr_reg_no_disp_tiny (rtx op[], int *plen)
+avr_out_movhi_r_mr_reg_no_disp_tiny (rtx_insn *insn, rtx op[], int *plen)
 {
   rtx dest = op[0];
   rtx src = op[1];
@@ -3524,17 +3534,20 @@ avr_out_movhi_r_mr_reg_no_disp_tiny (rtx
 			"ld %B0,%1"  CR_TAB
 			"mov %A0,__tmp_reg__", op, plen, -3);
 
-  return avr_asm_len ("ld %A0,%1" CR_TAB
-  TINY_ADIW (%E1, %F1, 1) CR_TAB
-  "ld %B0,%1" CR_TAB
-  TINY_SBIW (%E1, %F1, 1), op, plen, -6);
+  avr_asm_len ("ld %A0,%1+"  CR_TAB
+   "ld %B0,%1", op, plen, -2);
+
+  if (!reg_unused_after (insn, base))
+avr_asm_len (TINY_SBIW (%E1, %F1, 1), op, plen, 2);
+
+  return "";
 }
 
 
 /* Same as movhi_r_mr, but TINY does not have ADIW, SBIW and LDD */
 
 static const char*
-avr_out_movhi_r_mr_reg_disp_tiny (rtx op[], int *plen)
+avr_out_movhi_r_mr_reg_disp_tiny (rtx_insn *insn, rtx op[], int *plen)
 {
   rtx dest = op[0];
   rtx src = op[1];
@@ -3552,10 +3565,14 @@ avr_out_movhi_r_mr_reg_disp_tiny (rtx op
 }
   else
 {
-  return avr_asm_len (TINY_ADIW (%I1, %J1, %o1) CR_TAB
-  "ld %A0,%b1+" CR_TAB
-  "ld %B0,%b1"  CR_TAB
-  TINY_SBIW (%I1, %J1, %o1+1), op, plen, -6);
+  avr_asm_len (TINY_ADIW (%I1, %J1, %o1) CR_TAB
+   "ld %A0,%b1+" CR_TAB
+   "ld %B0,%b1", op, plen, -4);
+
+  if (!reg_unused_after (insn, XEXP (base, 0)))
+avr_asm_len (TINY_SBIW (%I1, %J1, %o1+1), op, plen, 2);
+
+  return "";
 }
 }
 
@@ -3603,7 +3620,7 @@ out_movhi_r_mr (rtx_insn *insn, rtx op[]
   if (reg_base > 0)
 {
   if (AVR_TINY)
-return avr_out_movhi_r_mr_reg_no_disp_tiny (op, plen);
+return avr_out_movhi_r_mr_reg_no_disp_tiny (insn, op, plen);
 
   if (reg_dest == reg_base) /* R = (R) */
 return avr_asm_len ("ld __tmp_reg__,%1+" CR_TAB
@@ -3628,7 +3645,7 @@ out_movhi_r_mr (rtx_insn *insn, rtx op[]
   int reg_base = true_regnum (XEXP (base, 0));
 
   if (AVR_TINY)
-return avr_out_movhi_r_mr_reg_disp_tiny (op, plen);
+return avr_out_movhi_r_mr_reg_disp_tiny (insn, op, plen);
 
   if (disp > MAX_LD_OFFSET (GET_MODE (src)))
 {
@@ -4377,8 +4394,8 @@ avr_out_load_psi_reg_no_disp_tiny (rtx_i
 		   "ld %B0,%1+"  CR_TAB
 		   "ld %C0,%1", op, plen, -3);
 
-  if (reg_dest != reg_base - 2 &&
-  !reg_unused_after (insn, base))
+  if (reg_dest != reg_base - 2
+  && !reg_unused_after (insn, base))
 {
   avr_asm_len (TINY_SBIW (%E1, %F1, 2), op, plen, 2);
 }
@@ -4408,13 +4425,13 @@ avr_out_load_psi_reg_disp_tiny (rtx_insn
   else
 {
   avr_asm_len (TINY_ADIW (%I1, %J1, %o1)   CR_TAB
-  "ld %A0,%b1+"  CR_TAB
-  "ld %B0,%b1+"  CR_TAB
-  "ld %C0,%b1", op, plen, -5);
+   "ld %A0,%b1+"   CR_TAB
+   "ld %B0,%b1+"   CR_TAB

Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Richard Biener
On Tue, Jul 19, 2016 at 12:25 PM, Bernd Schmidt  wrote:
> On 07/19/2016 12:22 PM, Marc Glisse wrote:
>>
>> On Tue, 19 Jul 2016, Bernd Schmidt wrote:
>>
>>> On 07/19/2016 12:09 PM, Richard Biener wrote:
>>>
>>>> I saw walks over stmts of a BB.  IMHO that's a no-go.
>>>
>>>
>>> Only to find the first or last nondebug one. Is that unacceptable?
>>
>>
>> Does gsi_start_nondebug_after_labels_bb not fit?
>
>
> It might, if one realizes that such a thing exists. Will try.

I think that start/end_recording_case_labels also merged adjacent labels
via group_case_labels_stmt.  Not sure why you need to stop recording
case labels during the transform.  Is this because you are building a new
switch stmt?
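
For readers unfamiliar with the label merging mentioned above, the idea can be sketched roughly as follows. This is not GCC's code: struct case_range and merge_adjacent_cases are hypothetical stand-ins, and the sketch assumes the entries are sorted by value and that high + 1 does not overflow.

```c
#include <stddef.h>

/* Hypothetical stand-in for a case label: values LOW..HIGH jump to
   TARGET.  */
struct case_range
{
  long low;
  long high;
  int target;
};

/* Merge neighbouring entries of the sorted array CASES that are
   contiguous and share a target.  Return the new number of entries.  */
size_t
merge_adjacent_cases (struct case_range *cases, size_t n)
{
  if (n == 0)
    return 0;

  size_t out = 0;
  for (size_t i = 1; i < n; i++)
    {
      if (cases[i].target == cases[out].target
	  && cases[i].low == cases[out].high + 1)
	/* Extend the current range instead of keeping a new entry.  */
	cases[out].high = cases[i].high;
      else
	cases[++out] = cases[i];
    }
  return out + 1;
}
```

With this sketch, labels 1, 2 and 3..5 that all jump to the same target collapse into a single 1..5 range, while a label 7 with the same target stays separate because 6 is missing.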

Richard.


> Bernd
>


Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-19 Thread Aldy Hernandez

On 07/18/2016 11:14 PM, Martin Sebor wrote:

>> How does this look?
>
> I think it's 99% there.  You've addressed all of my comments so
> far -- thanks for that and for being so patient.  I realize it
> would be a lot more efficient to get all the feedback (or as much
> of it as possible) up front.  Unfortunately, some things don't get
> noticed until round 2 or 3 (or even 4).  Please take this in lieu
> of an apology for not spotting the issues below until now(*).

No problem, although I think we're getting to the point of diminishing
returns with regards to functionality.  It may be best to involve Jeff
or another global reviewer at this point for a review.  We can address
any more minor things as a follow-up patch.  (Unless you find any show
stoppers before then :)).




> For this code:
>
>    void f (void*);
>
>    void g (int n)
>    {
>      int a [n];
>      f (a);
>    }
>
> -Wvla-larger-than=32 prints:
>
>    warning: argument to variable-length array may be too large
>    note: limit is 32 bytes, but argument may be 18446744073709551612
>
> An int argument cannot be that large.  I suspect the printed value
> is actually the size of the VLA in bytes when N is -1, truncated
> to size_t, rather than the value of the VLA bound.  To avoid
> confusion the note should be corrected to say something like:
>
>    note: limit is 32 bytes, but the variable-length array may be
>    as large as 18446744073709551612

Note adjusted.
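
The arithmetic behind that number can be checked with a small sketch. It assumes a 4-byte int and a 64-bit size_t; vla_bytes is a made-up helper name, and fixed-width types are used so the result does not depend on the host.

```c
#include <stdint.h>

/* Size in bytes of "int a[n]" as a 64-bit size_t computation would
   produce it, assuming sizeof (int) == 4.  N is sign-extended first and
   the multiplication wraps modulo 2^64.  */
uint64_t
vla_bytes (int n)
{
  return (uint64_t) (int64_t) n * 4u;
}
```

For n == -1 this gives 2^64 - 4, i.e. 18446744073709551612, which matches the value in the note; for n == 8 it gives the 32-byte limit used above.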



> Also, the checker prints false positives for code like:
>
>    void f (void*);
>
>    void g (unsigned x, int *y)
>    {
>      if (1000 < x) return;
>
>      while (*y) {
>        char a [x];
>        f (a);
>      }
>    }
>
> With -Wvla-larger-than=1000 and greater it prints:
>
>    warning: unbounded use of variable-length array
>
> (Same thing with alloca).  There should be no warning for VLAs,
> and for alloca, the warning should say "use of variable-length
> array within a loop."  The VRP dump suggests the range information
> is available within the loop.  Is the get_range_info() function
> not returning the corresponding bounds?


This is a false positive, but there's little we can do with the current 
range infrastructure.  The range information becomes less precise the 
further down the optimization pipeline we get.  So, even though as far 
as *.c.126t.crited1, we still see appropriate range information:


   # RANGE [0, 1000] NONZERO 1023
   _10 = (sizetype) x_3(D);
...
   a.1_12 = __builtin_alloca_with_align (_10, 8);

The PRE pass cleans things up in such a way that we end up with:

   :
   if (x_3(D) > 1000)
 goto ;
   else
 goto ;
...
   :
   # VUSE <.MEM_2(D)>
   _16 = *y_1(D);
   if (_16 != 0)
 goto ;
   else
 goto ;

   :

   :
   # <-NO RANGE INFO->
   _4 = (sizetype) x_3(D);

...

a.1_12 = __builtin_alloca_with_align (_4, 8);

The -Walloca pass comes after PRE, which means we no longer have any 
range information for _4, and chasing the IL to glean this information 
would be fragile at best.  We will just have to live with this until we 
have better pervasive range information.


Updated patch tested on x86-64 Linux.

Aldy

gcc/

* Makefile.in (OBJS): Add gimple-ssa-warn-alloca.o.
* passes.def: Add two instances of pass_walloca.
* tree-pass.h (make_pass_walloca): New.
* gimple-ssa-warn-alloca.c: New file.
* opts.c (finish_options): Warn when using -Wvla-larger-than= and
-Walloca-larger-than= without -O2 or greater.
* doc/invoke.texi: Document -Walloca, -Walloca-larger-than=, and
-Wvla-larger-than= options.

gcc/c-family/

* c.opt (Walloca): New.
(Walloca-larger-than=): New.
(Wvla-larger-than=): New.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 776f6d7..2a13b8f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1284,6 +1284,7 @@ OBJS = \
gimple-ssa-nonnull-compare.o \
gimple-ssa-split-paths.o \
gimple-ssa-strength-reduction.o \
+   gimple-ssa-warn-alloca.o \
gimple-streamer-in.o \
gimple-streamer-out.o \
gimple-walk.o \
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index ff6339c..dc2be2d 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -376,6 +376,16 @@ c_common_handle_option (size_t scode, const char *arg, int value,
   cpp_opts->warn_num_sign_change = value;
   break;
 
+case OPT_Walloca_larger_than_:
+  if (!value)
+   inform (loc, "-Walloca-larger-than=0 is meaningless");
+  break;
+
+case OPT_Wvla_larger_than_:
+  if (!value)
+   inform (loc, "-Wvla-larger-than=0 is meaningless");
+  break;
+
 case OPT_Wunknown_pragmas:
   /* Set to greater than 1, so that even unknown pragmas in
 system headers will be warned about.  */
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 83fd84c..1d4ebf0 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -275,6 +275,16 @@ Wall
 C ObjC C++ ObjC++ Warning
 Enable most warning messages.
 
+Walloca
+C ObjC C++ ObjC+

Re: [PATCH v2] S/390: Add splitter for "and" with complement.

2016-07-19 Thread Andreas Krebbel
On 07/19/2016 11:37 AM, Dominik Vogt wrote:
>  ;
> +; And with complement
> +;
> +; c = ~b & a = (b & a) ^ a
> +
> +(define_insn_and_split "*andc_split"

Please append <mode> here to make the insn name unique.

> +  [(set (match_operand:GPR 0 "nonimmediate_operand" "")
> + (and:GPR (not:GPR (match_operand:GPR 1 "nonimmediate_operand" ""))
> +  (match_operand:GPR 2 "general_operand" "")))
> +   (clobber (reg:CC CC_REGNUM))]
> +  "TARGET_ZARCH && s390_logical_operator_ok_p (operands)"
> +  "#"
> +  "&& 1"
> +  [
> +  (parallel
> +   [(set (match_dup 3) (and:GPR (match_dup 1) (match_dup 2)))
> +   (clobber (reg:CC CC_REGNUM))])
> +  (parallel
> +   [(set (match_dup 0) (xor:GPR (match_dup 3) (match_dup 2)))
> +   (clobber (reg:CC CC_REGNUM))])]
> +{
> +  if (reg_overlap_mentioned_p (operands[0], operands[2]))
> +{
> +  gcc_assert (can_create_pseudo_p ());

Is it really safe to assume we will never get here after reload? I don't see
where this is prevented. Btw. the very same assertion is in gen_reg_rtx anyway
so no need to duplicate it.

> +  operands[3] = gen_reg_rtx (<MODE>mode);
> +}
> +  else
> +operands[3] = operands[0];
> +})
> +
> +; Convert "(xor (operand) (-1))" to "(not (operand))" for low optimization
> +; levels so that "*andc_split" matches.
> +(define_insn_and_split "*andc_split2"

<mode> missing

> +  [(set (match_operand:GPR 0 "nonimmediate_operand" "")
> +(and:GPR (xor:GPR (match_operand:GPR 1 "nonimmediate_operand" "")
> +   (const_int -1))
> +  (match_operand:GPR 2 "general_operand" "")))
> +(clobber (reg:CC CC_REGNUM))]
> +  "TARGET_ZARCH && s390_logical_operator_ok_p (operands)"
> +  "#"
> +  "&& 1"
> +  [(parallel
> +[(set (match_dup 0) (and:GPR (not:GPR (match_dup 1)) (match_dup 2)))
> +(clobber (reg:CC CC_REGNUM))])]
> +)
> +
> +;
>  ; Block and (NC) patterns.
>  ;
>

Looks like these testcases could be merged by putting the lp64 conditions at
the scan-assembler directives.
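
As a side note on the identity in the pattern comment, c = ~b & a = (b & a) ^ a, here is a small sketch of the two forms. The function names are invented for the example; the second function mirrors the and + xor pair the splitter is meant to emit.

```c
/* And-with-complement computed directly ...  */
unsigned long
andc_direct (unsigned long a, unsigned long b)
{
  return ~b & a;
}

/* ... and as an "and" followed by an "xor" with A.  Every bit that is
   set in both A and B is cleared by the xor; bits set only in A
   survive.  */
unsigned long
andc_split (unsigned long a, unsigned long b)
{
  unsigned long tmp = b & a;
  return tmp ^ a;
}
```

With a = 0xc00c and b = 0x500a, as they appear in main () of the test below, both forms yield 0x8004.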

> diff --git a/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c b/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
> new file mode 100644
> index 0000000..ed78921
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
> @@ -0,0 +1,61 @@
> +/* Machine description pattern tests.  */
> +
> +/* { dg-do run { target { lp64 } } } */
> +/* { dg-options "-mzarch -save-temps -dP" } */
> +/* Skip test if -O0 is present on the command line:
> +
> +{ dg-skip-if "" { *-*-* } { "-O0" } { "" } }
> +
> +   Skip test if the -O option is missing from the command line
> +{ dg-skip-if "" { *-*-* } { "*" } { "-O*" } }
> +*/
> +
> +__attribute__ ((noinline))
> +unsigned long andc_vv(unsigned long a, unsigned long b)
> +{ return ~b & a; }
> +/* { dg-final { scan-assembler ":15 .\* \{\\*anddi3\}" } } */
> +/* { dg-final { scan-assembler ":15 .\* \{\\*xordi3\}" } } */
> +
> +__attribute__ ((noinline))
> +unsigned long andc_pv(unsigned long *a, unsigned long b)
> +{ return ~b & *a; }
> +/* { dg-final { scan-assembler ":21 .\* \{\\*anddi3\}" } } */
> +/* { dg-final { scan-assembler ":21 .\* \{\\*xordi3\}" } } */
> +
> +__attribute__ ((noinline))
> +unsigned long andc_vp(unsigned long a, unsigned long *b)
> +{ return ~*b & a; }
> +/* { dg-final { scan-assembler ":27 .\* \{\\*anddi3\}" } } */
> +/* { dg-final { scan-assembler ":27 .\* \{\\*xordi3\}" } } */
> +
> +__attribute__ ((noinline))
> +unsigned long andc_pp(unsigned long *a, unsigned long *b)
> +{ return ~*b & *a; }
> +/* { dg-final { scan-assembler ":33 .\* \{\\*anddi3\}" } } */
> +/* { dg-final { scan-assembler ":33 .\* \{\\*xordi3\}" } } */
> +
> +/* { dg-final { scan-assembler-times "\tngr\?k\?\t" 4 } } */
> +/* { dg-final { scan-assembler-times "\txgr\?\t" 4 } } */
> +
> +int
> +main (void)
> +{
> +  unsigned long a = 0xc00cllu;
> +  unsigned long b = 0x500allu;
> +  unsigned long e = 0x8004llu;
> +  unsigned long c;
> +
> +  c = andc_vv (a, b);
> +  if (c != e)
> +__builtin_abort ();
> +  c = andc_pv (&a, b);
> +  if (c != e)
> +__builtin_abort ();
> +  c = andc_vp (a, &b);
> +  if (c != e)
> +__builtin_abort ();
> +  c = andc_pp (&a, &b);
> +  if (c != e)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c b/gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c
> new file mode 100644
> index 0000000..9e78335
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c
> @@ -0,0 +1,38 @@
> +/* Machine description pattern tests.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-mzarch -save-temps -dP" } */
> +/* Skip test if -O0 is present on the command line:
> +
> +{ dg-skip-if "" { *-*-* } { "-O0" } { "" } }
> +
> +   Skip test if the -O option is missing from the command line
> +{ dg-skip-if "" { *-*-* } { "*" } { "-O*" } }
> +*/
> +
> +__attribute__ ((noinline))
> +unsigned int andc_vv(unsigned int a, unsigned int b)
> +{ return ~b & a; }
> +/*

Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Bernd Schmidt

On 07/19/2016 12:35 PM, Richard Biener wrote:


> I think that start/end_recording_case_labels also merged adjacent labels
> via group_case_labels_stmt.  Not sure why you need to stop recording
> case labels during the transform.  Is this because you are building a new
> switch stmt?


It's because the cached mapping gets invalidated. Look in tree-cfg, it 
has a edge_to_cases map which I think cannot be maintained if you modify 
the structure. I certainly got lots of internal errors until I added 
that pair of calls.



Bernd



Do not ICE in resolve.c (PR fortran/71799)

2016-07-19 Thread Martin Liška
Hi.

The suggested patch is taken directly from the PR.

The patch bootstraps on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Martin
From e6c55da822bc5a7464edc9cedb9b08f56e310885 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 18 Jul 2016 16:47:27 +0200
Subject: [PATCH] Do not ICE in resolve.c (PR fortran/71799)

gcc/testsuite/ChangeLog:

2016-07-18  Martin Liska  

	* gfortran.dg/pr71799.f90: New test.

gcc/fortran/ChangeLog:

2016-07-18  Steven G. Kargl  

	* resolve.c (gfc_resolve_iterator): Generate error instead
	of ICE.
---
 gcc/fortran/resolve.c                 |  6 +++---
 gcc/testsuite/gfortran.dg/pr71799.f90 | 12 ++++++++++++
 2 files changed, 15 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr71799.f90

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 1fc540a..23da9ac 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -6515,15 +6515,15 @@ gfc_resolve_iterator (gfc_iterator *iter, bool real_ok, bool own_scope)
   /* Convert start, end, and step to the same type as var.  */
   if (iter->start->ts.kind != iter->var->ts.kind
   || iter->start->ts.type != iter->var->ts.type)
-gfc_convert_type (iter->start, &iter->var->ts, 2);
+gfc_convert_type (iter->start, &iter->var->ts, 1);
 
   if (iter->end->ts.kind != iter->var->ts.kind
   || iter->end->ts.type != iter->var->ts.type)
-gfc_convert_type (iter->end, &iter->var->ts, 2);
+gfc_convert_type (iter->end, &iter->var->ts, 1);
 
   if (iter->step->ts.kind != iter->var->ts.kind
   || iter->step->ts.type != iter->var->ts.type)
-gfc_convert_type (iter->step, &iter->var->ts, 2);
+gfc_convert_type (iter->step, &iter->var->ts, 1);
 
   if (iter->start->expr_type == EXPR_CONSTANT
   && iter->end->expr_type == EXPR_CONSTANT
diff --git a/gcc/testsuite/gfortran.dg/pr71799.f90 b/gcc/testsuite/gfortran.dg/pr71799.f90
new file mode 100644
index 0000000..4760cc3
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr71799.f90
@@ -0,0 +1,12 @@
+! PR fortran/71799
+! { dg-do compile }
+
+subroutine test2(array, s, block)
+integer(1) :: i, block(9), array(2)
+integer (8) :: s
+
+do i = 10, HUGE(i) - 10, 222 ! { dg-error "Arithmetic overflow converting INTEGER\\(4\\) to INTEGER\\(1\\)" }
+  s = s + 1
+end do
+
+end subroutine test2
-- 
2.8.4



Re: Merge switch statements in tree-cfgcleanup

2016-07-19 Thread Richard Biener
On Tue, Jul 19, 2016 at 1:07 PM, Bernd Schmidt  wrote:
> On 07/19/2016 12:35 PM, Richard Biener wrote:
>
>> I think that start/end_recording_case_labels also merged adjacent labels
>> via group_case_labels_stmt.  Not sure why you need to stop recording
>> case labels during the transform.  Is this because you are building a new
>> switch stmt?
>
>
> It's because the cached mapping gets invalidated. Look in tree-cfg, it has a
> edge_to_cases map which I think cannot be maintained if you modify the
> structure. I certainly got lots of internal errors until I added that pair
> of calls.

Yeah, I see that.  OTOH cfgcleanup relies on this cache to be efficient and
you (repeatedly) clear it.  Clearing parts of it should be sufficient and if you
used redirect_edge_and_branch instead of redirect_edge_pred it would have
maintained the cache as far as I can see, or you can make sure to maintain
it yourself or just clear the info associated with the edges you redirect from
one switch to another.

Btw,

+  gimple_stmt_iterator gsi1, gsi2;
+  gsi1 = gsi_last_nondebug_bb (pred_bb);
+  if (gsi_end_p (gsi1))
+return false;
+  gimple *pred_end = gsi_stmt (gsi1);
+  if (gimple_code (pred_end) != GIMPLE_SWITCH)

this is just

   gimple *pred_end = last_stmt (pred_bb);
   if (! pred_end || gimple_code (pred_end) != GIMPLE_SWITCH)
 ...

Richard.


Richard.

>
> Bernd
>


[BACKPORT 4.9/5] Fix compiling large files

2016-07-19 Thread Martin Liška
Hello.

As mentioned in PR71920, I would like to backport the change to GCC 4.9 and 5.

May I install the patch after proper testing?

Thanks,
Martin


Re: [BACKPORT 4.9/5] Fix compiling large files

2016-07-19 Thread Richard Biener
On Tue, Jul 19, 2016 at 1:19 PM, Martin Liška  wrote:
> Hello.
>
> As mentioned in PR71920, I would like to backport the change to GCC 4.9 and 5.
>
> May I install the patch after proper testing?

Yes, it's pretty obvious.

Richard.

> Thanks,
> Martin


Re: [PATCH GCC]Remove support for -funsafe-loop-optimizations

2016-07-19 Thread Bin.Cheng
On Tue, Jul 19, 2016 at 9:00 AM, Richard Biener wrote:
> On Mon, Jul 18, 2016 at 5:36 PM, Bin.Cheng  wrote:
>> On Mon, Jul 18, 2016 at 4:28 PM, NightStrike  wrote:
>>> On Mon, Jul 18, 2016 at 3:55 AM, Bin.Cheng  wrote:
>>>> On Sat, Jul 16, 2016 at 6:28 PM, NightStrike  wrote:
>>>>> On Fri, Jul 15, 2016 at 1:07 PM, Bin Cheng  wrote:
>>>>>> Hi,
>>>>>> This patch removes support for -funsafe-loop-optimizations, as well as
>>>>>> -Wunsafe-loop-optimizations.  By its name, this option does unsafe
>>>>>> optimizations by assuming all loops must terminate and doesn't wrap.
>>>>>> Unfortunately, it's not as useful as expected because:
>>>>>> 1) Simply assuming loop must terminate isn't enough.  What we really
>>>>>> want is to analyze scalar evolution and loop niter bound under such
>>>>>> assumptions.  This option does nothing in this aspect.
>>>>>> 2) IIRC, this option generates bogus code for some common programs,
>>>>>> that's why it's disabled by default even at Ofast level.
>>>>>>
>>>>>> After I sent patches handling possible infinite loops in both
>>>>>> (scev/niter) analyzer and vectorizer, it's a natural step to remove such
>>>>>> options in GCC.  This patch does so by deleting code for
>>>>>> -funsafe-loop-optimizations, as well as -Wunsafe-loop-optimizations.  It
>>>>>> also deletes the two now useless tests, while the option interface is
>>>>>> preserved for backward compatibility purpose.
>>>>>
>>>>> There are a number of bugs opened against those options, including one
>>>>> that I just opened rather recently:
>>>>>
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71769
>>>>>
>>>>> but some go back far, in this case 9 years:
>>>>>
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34114
>>>>>
>>>>> If you are going to remove the options, you should address open bugs
>>>>> related to those options.
>>>> Hi,
>>>> Thanks for pointing me to these PRs, I will have a look at them.
>>>
>>> I only highlighted two PRs, I was suggesting that you look for all of them.
>>>
>>>> IMHO, the old one reports weakness in loop niter analyzer, the issue
>>>> exists whether I remove unsafe-loop-optimization or not.  The new one
>>>> is a little bit trickier, I will put some comments on PR, and again,
>>>> the issue (if it is) is in niter analyzer which has nothing to do with
>>>> the option really.
>>>
>>> Well, one thing to note is that the warning is an easy way to get a
>>> notice of a possible missed optimization (and I have many more
>>> occurrences of it in a particular code base that I use).  If the
>>> warning is highlighted potential issues that aren't due to the -f
>>> option but are issues nonetheless, and we remove the warning, then how
>>> should I go about finding these missed opportunities in the future?
>>> Is there a different mechanism that does the same thing?
>> Hmm, good point, I will iterate the patch to see if I can only remove
>> -funsafe-loop-optimizations, while keep -Wunsafe-loop-optimizations.
>
> Of course the naming of -Wunsafe-loop-optimizations is misleading then.
> Maybe provide an alias -Wmissed-loop-optimizations and re-word it to
> say "disable _some_ loop optimizations" as I hope more loop optimizations
The current behavior is to warn only about possible missed loop
optimizations in IVOPT, which is effectively the last niter-related loop
optimization.  I would rather keep this behavior, because warning
against each specific optimization would be a real hassle.  This leads to
another problem, about precise warning messages: if a loop optimization
has already handled a loop with assumptions, we should not warn about
that loop afterwards.  This again reminds me of the patch adding the
flag/constraint (or whatever the name finally becomes) extension.  We can
annotate the loop structure once it's handled, so that the warning
won't be issued.  How about this?

> get aware of "assumptions" and deal with them.
The first question that needs to be answered is how we export assumptions
to the various loop optimizers.  For now, I have only added one specific
interface, number_of_iterations_exit_assumptions, and it will only be
used in the vectorizer in my following patches.  A generic method of
exporting assumptions in the loop structure would be great, but that's
difficult because sometimes we need not only the assumptions themselves,
but also to analyze scev under those assumptions.  This is another problem
though.

> In which case a way to "re-introduce" -funsafe-loop-optimizations would be to
> add a #pragma that can be used to annotate loops to tell GCC of various
> properties like that it terminates without IV wrapping.
Yeah, this could be easily done using the flag/constraint method.

Thanks,
bin
>
> Richard.
>
>> Thanks,
>> bin


Re: Implement C _FloatN, _FloatNx types [version 3]

2016-07-19 Thread James Greenhalgh
On Thu, Jun 23, 2016 at 02:19:52PM +, Joseph Myers wrote:



> No GCC port supports a floating-point format suitable for _Float128x.
> Although there is HFmode support for ARM and AArch64, use of that for
> _Float16 is not enabled.  Supporting _Float16 would require additional
> work on the excess precision aspects of TS 18661-3: there are new
> values of FLT_EVAL_METHOD, which are not currently supported in GCC,
> and FLT_EVAL_METHOD == 0 now means that operations and constants on
> types narrower than float are evaluated to the range and precision of
> float.  Implementing that, so that _Float16 gets evaluated with excess
> range and precision, would involve changes to the excess precision
> infrastructure so that the _Float16 case is enabled by default, unlike
> the x87 case which is only enabled for -fexcess-precision=standard.
> Other differences between _Float16 and __fp16 would also need to be
> disentangled.

Hi Joseph,

Thanks for the patch. Just to let you know, I'm interested in trying to
enable _Float16 for ARM/AArch64, and I'll be starting work on that once
Jiong and Matthew's support for the ARMv8.2-A 16-bit floating point
arithmetic extensions goes in.

These slightly complicate the description you give above as we now want
two behaviours. Where the 16-bit floating point extensions are available,
we want to use the native operations (FLT_EVAL_METHOD == 16). Where they
are not available we want to use promotion to float (FLT_EVAL_METHOD == 0).
I'm hoping that the excess precision mechanics will be able to handle
this, and from your description above it sounds like making it work in the
presence of the 16-bit arithmetic extensions will be easier as I won't have
to modify the excess precision machinery much.
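
The two behaviours described above can be sketched in C. This is only an illustration under assumptions: `float16_eval_type`, `report_eval_method` and the `eval_type` enum are hypothetical names, not GCC interfaces, and the mapping simply restates the FLT_EVAL_METHOD values mentioned in this thread (16 for native 16-bit arithmetic, 0 for promotion to float) together with the C99 values 1 and 2.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical classification of the type in which _Float16 arithmetic
   would be evaluated.  Not a GCC data structure.  */
enum eval_type
{
  EVAL_FLOAT16,      /* Native 16-bit arithmetic, no excess precision.  */
  EVAL_FLOAT,        /* Promoted to the range and precision of float.  */
  EVAL_DOUBLE,       /* Promoted to double.  */
  EVAL_LONG_DOUBLE,  /* Promoted to long double.  */
  EVAL_UNKNOWN       /* -1 (indeterminable) or any other value.  */
};

/* Hypothetical helper: map a FLT_EVAL_METHOD value to the evaluation
   type of _Float16 operations.  */
enum eval_type
float16_eval_type (int flt_eval_method)
{
  switch (flt_eval_method)
    {
    case 16:
      return EVAL_FLOAT16;
    case 0:
      return EVAL_FLOAT;
    case 1:
      return EVAL_DOUBLE;
    case 2:
      return EVAL_LONG_DOUBLE;
    default:
      return EVAL_UNKNOWN;
    }
}

/* Print what the current compiler and options report.  */
void
report_eval_method (void)
{
  static const char *const names[] =
    { "_Float16", "float", "double", "long double", "unknown" };

  printf ("FLT_EVAL_METHOD = %d, _Float16 would evaluate as %s\n",
	  (int) FLT_EVAL_METHOD,
	  names[float16_eval_type ((int) FLT_EVAL_METHOD)]);
}
```

Calling `report_eval_method ()` on a given target shows which of the two cases the compiler currently selects.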

As you say, I've also got a few things to do to disentangle _Float16 and
__fp16. __fp16 should continue to always evaluate in excess precision, and
we want it to continue to exist as a distinct type for compatibility. My
guess is that keeping this separation might not be too hard; I should
only need to change TARGET_PROMOTED_TYPE to promote if it sees an
__fp16 type, and do nothing otherwise.

I'm hoping that enabling _Float16 for ARM/AArch64 should not be too
difficult with the groundwork in your patches, but I would appreciate any
pointers on where I am likely to run into trouble; I haven't worked in the
front-end before.
 
Thanks,
James



[PATCH] Fix missed constant propagation

2016-07-19 Thread Richard Biener

The following patch fixes missing constant propagation in SSA propagators
and FRE.  Noticed this when working on something else.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-07-19  Richard Biener  

* gimple-fold.c (get_base_constructor): Add VIEW_CONVERT case,
handle all tcc_constant bases and valueize SSA names.
* tree-ssa-sccvn.c (fully_constant_vn_reference_p): Handle
tcc_constant bases.

* c-c++-common/vector-subscript-6.c: New testcase.
* c-c++-common/vector-subscript-7.c: Likewise.

Index: gcc/gimple-fold.c
===
*** gcc/gimple-fold.c   (revision 238426)
--- gcc/gimple-fold.c   (working copy)
*** get_base_constructor (tree base, HOST_WI
*** 5508,5513 
--- 5499,5507 
  return NULL_TREE;
base = TREE_OPERAND (base, 0);
  }
+   else if (valueize
+  && TREE_CODE (base) == SSA_NAME)
+ base = valueize (base);
  
/* Get a CONSTRUCTOR.  If BASE is a VAR_DECL, get its
   DECL_INITIAL.  If BASE is a nested reference into another
*** get_base_constructor (tree base, HOST_WI
*** 5529,5534 
--- 5523,5532 
return init;
}
  
+ case VIEW_CONVERT_EXPR:
+   return get_base_constructor (TREE_OPERAND (base, 0),
+  bit_offset, valueize);
+ 
  case ARRAY_REF:
  case COMPONENT_REF:
base = get_ref_base_and_extent (base, &bit_offset2, &size, &max_size,
*** get_base_constructor (tree base, HOST_WI
*** 5538,5548 
*bit_offset +=  bit_offset2;
return get_base_constructor (base, bit_offset, valueize);
  
- case STRING_CST:
  case CONSTRUCTOR:
return base;
  
  default:
return NULL_TREE;
  }
  }
--- 5536,5548 
*bit_offset +=  bit_offset2;
return get_base_constructor (base, bit_offset, valueize);
  
  case CONSTRUCTOR:
return base;
  
  default:
+   if (CONSTANT_CLASS_P (base))
+   return base;
+ 
return NULL_TREE;
  }
  }
Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c(revision 238426)
--- gcc/tree-ssa-sccvn.c(working copy)
*** fully_constant_vn_reference_p (vn_refere
*** 1331,1336 
--- 1337,1347 
unsigned i;
for (i = 0; i < operands.length (); ++i)
{
+ if (TREE_CODE_CLASS (operands[i].opcode) == tcc_constant)
+   {
+ ++i;
+ break;
+   }
  if (operands[i].off == -1)
return NULL_TREE;
  off += operands[i].off;
Index: gcc/testsuite/c-c++-common/vector-subscript-6.c
===
*** gcc/testsuite/c-c++-common/vector-subscript-6.c (revision 0)
--- gcc/testsuite/c-c++-common/vector-subscript-6.c (working copy)
***
*** 0 
--- 1,14 
+ /* { dg-do compile } */
+ /* { dg-options "-O -fno-tree-ccp -fdump-tree-fre1" } */
+ 
+ typedef int v4si __attribute__ ((vector_size (16)));
+ 
+ int
+ main (int argc, char** argv)
+ {
+   int i = 2;
+   int j = ((v4si){0, 1, 2, 3})[i];
+   return ((v4si){1, 2, 42, 0})[j];
+ }
+ 
+ /* { dg-final { scan-tree-dump "return 42;" "fre1" } } */
Index: gcc/testsuite/c-c++-common/vector-subscript-7.c
===
*** gcc/testsuite/c-c++-common/vector-subscript-7.c (revision 0)
--- gcc/testsuite/c-c++-common/vector-subscript-7.c (working copy)
***
*** 0 
--- 1,14 
+ /* { dg-do compile } */
+ /* { dg-options "-O -fdump-tree-ccp1" } */
+ 
+ typedef int v4si __attribute__ ((vector_size (16)));
+ 
+ int
+ main (int argc, char** argv)
+ {
+   int i = 2;
+   int j = ((v4si){0, 1, 2, 3})[i];
+   return ((v4si){1, 2, 42, 0})[j];
+ }
+ 
+ /* { dg-final { scan-tree-dump "return 42;" "ccp1" } } */


Re: [PATCH, vec-tails 10/10] Tests

2016-07-19 Thread Kirill Yukhin
Hi!
On 15 Jul 12:39, Ilya Enkovich wrote:
> 2016-07-14 20:32 GMT+03:00 Jeff Law :
> > On 07/05/2016 09:44 AM, Ilya Enkovich wrote:
> >>
> >> Hi,
> >>
> >> This patch adds several tests to check tails vectorization functionality.
> >>
> >> Thanks,
> >> Ilya
> >> --
> >> gcc/testsuite/
> >>
> >> 2016-07-05  Ilya Enkovich  
> >>
> >> * lib/target-supports.exp (check_avx2_hw_available): New.
> >> (check_effective_target_avx2_runtime): New.
> >> * gcc.dg/vect/vect-tail-combine-1.c: New test.
> >> * gcc.dg/vect/vect-tail-combine-2.c: New test.
> >> * gcc.dg/vect/vect-tail-combine-3.c: New test.
> >> * gcc.dg/vect/vect-tail-combine-4.c: New test.
> >> * gcc.dg/vect/vect-tail-combine-5.c: New test.
> >> * gcc.dg/vect/vect-tail-combine-6.c: New test.
> >> * gcc.dg/vect/vect-tail-combine-7.c: New test.
> >> * gcc.dg/vect/vect-tail-combine-9.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-1.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-2.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-3.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-4.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-5.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-6.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-7.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-8.c: New test.
> >> * gcc.dg/vect/vect-tail-mask-9.c: New test.
> >> * gcc.dg/vect/vect-tail-nomask-1.c: New test.
> >> * gcc.dg/vect/vect-tail-nomask-2.c: New test.
> >> * gcc.dg/vect/vect-tail-nomask-3.c: New test.
> >> * gcc.dg/vect/vect-tail-nomask-4.c: New test.
> >> * gcc.dg/vect/vect-tail-nomask-5.c: New test.
> >> * gcc.dg/vect/vect-tail-nomask-6.c: New test.
> >> * gcc.dg/vect/vect-tail-nomask-7.c: New test.
> >
> > This is fine when the rest of the patches go in.
> >
> >
> >> + unsigned int eax, ebx, ecx, edx;
> >> + if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx)
> >> + || ((ecx & bit_OSXSAVE) != bit_OSXSAVE))
> >> +   return 1;
> >> +
> >> + if (__get_cpuid_max (0, NULL) < 7)
> >> +   return 1;
> >> +
> >> + __cpuid_count (7, 0, eax, ebx, ecx, edx);
> >> +
> >> + return (ebx & bit_AVX2) != bit_AVX2;
> >
> > Ugh.  I'm going to trust this is correct.  I vaguely recall mucking around
> > with this stuff for the original AVX in glibc several years ago.
> 
> Actually I just copied some code from avx2-check.h.  Kirill should be able
> to review this piece of code.

LGTM.
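
For readers who want to try the quoted check outside the testsuite, here is a minimal standalone sketch. It assumes the `<cpuid.h>` header shipped with GCC on x86; `avx2_reported_by_cpuid` is a hypothetical name, and unlike the quoted helper (which returns 1 when AVX2 is unavailable) it returns 1 when both OSXSAVE and AVX2 are reported. The non-x86 fallback is an addition for portability and is not part of the patch.

```c
#include <stddef.h>

#if defined (__x86_64__) || defined (__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: return 1 if CPUID reports OSXSAVE and AVX2,
   0 otherwise (including on targets that are not x86).  */
int
avx2_reported_by_cpuid (void)
{
#if defined (__x86_64__) || defined (__i386__)
  unsigned int eax, ebx, ecx, edx;

  /* Leaf 1 must be available and must report OSXSAVE.  */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;
  if ((ecx & bit_OSXSAVE) != bit_OSXSAVE)
    return 0;

  /* The AVX2 bit lives in leaf 7, subleaf 0.  */
  if (__get_cpuid_max (0, NULL) < 7)
    return 0;

  __cpuid_count (7, 0, eax, ebx, ecx, edx);
  return (ebx & bit_AVX2) == bit_AVX2;
#else
  return 0;
#endif
}
```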

--
Thanks, K
> 
> Thanks,
> Ilya
> 
> >
> > jeff
> >


[PATCH] S/390: Xfail some tests in insv-[12].c.

2016-07-19 Thread Dominik Vogt
The attached patch XFAILs some of the "insv" testcases as
discussed internally.  Tested on s390x biarch and s390.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog

* gcc.target/s390/insv-1.c: Xfail some tests.
* gcc.target/s390/insv-2.c: Likewise.
From 6cfe287811766e751c5d94834e5314e97c6ab50d Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Tue, 19 Jul 2016 10:10:23 +0100
Subject: [PATCH] S/390: Xfail some tests in insv-[12].c.

---
 gcc/testsuite/gcc.target/s390/insv-1.c |  9 -
 gcc/testsuite/gcc.target/s390/insv-2.c | 15 ++-
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/insv-1.c b/gcc/testsuite/gcc.target/s390/insv-1.c
index e6c1b8b..8d464f5 100644
--- a/gcc/testsuite/gcc.target/s390/insv-1.c
+++ b/gcc/testsuite/gcc.target/s390/insv-1.c
@@ -108,4 +108,11 @@ foo4c (unsigned long a, unsigned long b)
 #endif
 }
 
-/* { dg-final { scan-assembler-times "risbg" 6 } } */
+/* The functions foo3, foo4, foo3b, foo4b no longer use risbg but rosbg instead.
+
+   On s390x, four risbg go away and four new ones appear in other functions ...
+ { dg-final { scan-assembler-times "risbg" 6 { target { s390x-*-* } } } }
+
+   but not on s390.
+ { dg-final { scan-assembler-times "risbg" 2 { target { s390-*-* } } } }
+*/
diff --git a/gcc/testsuite/gcc.target/s390/insv-2.c b/gcc/testsuite/gcc.target/s390/insv-2.c
index 2ba6d6c..70af123 100644
--- a/gcc/testsuite/gcc.target/s390/insv-2.c
+++ b/gcc/testsuite/gcc.target/s390/insv-2.c
@@ -108,4 +108,17 @@ foo4c (unsigned long a, unsigned long b)
 #endif
 }
 
-/* { dg-final { scan-assembler-times "risbgn" 6 } } */
+/* The functions foo3, foo4, foo3b, foo4b no longer use risbgn but rosbg instead
+   which is slightly worse.  Combine prefers to use the simpler two insn
+   combinations possible with rosbg instead of the more complicated three insn
+   combinations that result in risbgn.  This problem has been introduced with
+   the commit
+
+ S/390: Add patterns for rsbg instructions.
+
+   (3rd of May, 2016).  This should be fixed some time in the future, but for
+   now just adapt the expected result:
+
+   { dg-final { scan-assembler-times "risbgn" 6 { xfail { *-*-* } } } }
+   { dg-final { scan-assembler-times "risbgn" 2 } }
+*/
-- 
2.3.0



Re: [PATCH GCC]Improve no-overflow check in SCEV using value range info.

2016-07-19 Thread Richard Biener
On Mon, Jul 18, 2016 at 6:27 PM, Bin Cheng  wrote:
> Hi,
> Scalar evolution needs to prove no-overflow for source variable when handling 
> type conversion.  This is important because otherwise we would fail to 
> recognize result of the conversion as SCEV, resulting in missing loop 
> optimizations.  Take case added by this patch as an example, the loop can't 
> be distributed as memset call because address of memory reference is not 
> recognized.  At the moment, we rely on type overflow semantics and loop niter 
> info for no-overflow checking, unfortunately that's not enough.  This patch 
> introduces new method checking no-overflow using value range information.  As 
> commented in the patch, value range can only be used when source operand 
> variable evaluates on every loop iteration, rather than guarded by some 
> conditions.
>
> This together with patch improving loop niter analysis 
> (https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00736.html) can help various 
> loop passes like vectorization.
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

@@ -3187,7 +3187,8 @@ idx_infer_loop_bounds (tree base, tree *idx, void *dta)
   /* If access is not executed on every iteration, we must ensure that overlow
  may not make the access valid later.  */
   if (!dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (data->stmt))
-  && scev_probably_wraps_p (initial_condition_in_loop_num (ev, loop->num),
+  && scev_probably_wraps_p (NULL,

use NULL_TREE for the null pointer constant of tree.

+  /* Check if VAR evaluates in every loop iteration.  */
+  gimple *def;
+  if ((def = SSA_NAME_DEF_STMT (var)) != NULL

def is never NULL but it might be a GIMPLE_NOP which has a NULL gimple_bb.
Better check for ! SSA_DEFAULT_DEF_P (var)

+  if (TREE_CODE (step) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (var)))
+return false;

this looks like a cheaper test so please do that first.

+  step_wi = step;
+  type = TREE_TYPE (var);
+  if (tree_int_cst_sign_bit (step))
+{
+  diff = lower_bound_in_type (type, type);
+  diff = minv - diff;
+  step_wi = - step_wi;
+}
+  else
+{
+  diff = upper_bound_in_type (type, type);
+  diff = diff - maxv;
+}

this lacks a comment - it's not obvious to me what the gymnastics
with lower/upper_bound_in_type are supposed to achieve.
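
One plausible reading of the quoted hunk, offered only as a guess since the comparison itself is not shown here, is that it computes how much room is left between the known value range and the limit of the type, so that this room can be compared with the magnitude of the step. A minimal sketch of that idea for plain `int` follows; `range_leaves_room_for_step` is a hypothetical name, and the sketch assumes `long long` is wider than `int`.

```c
#include <limits.h>

/* Hypothetical helper, not the function from the patch.  Given that an
   int variable is known to lie in [MINV, MAXV], return 1 if adding STEP
   to any value in that range stays within the range of int.  */
int
range_leaves_room_for_step (int minv, int maxv, int step)
{
  long long room;

  if (step < 0)
    {
      /* Room between the smallest known value and INT_MIN.  */
      room = (long long) minv - INT_MIN;
      return -(long long) step <= room;
    }

  /* Room between the largest known value and INT_MAX.  */
  room = (long long) INT_MAX - maxv;
  return step <= room;
}
```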

As VRP uses niter analysis itself I wonder how this fires back-to-back between
VRP1 and VRP2?  If the def of var dominates the latch isn't it enough to do
a + 1 to check whether VRP bumped the range up to INT_MAX/MIN?  That is,
why do we need to add step if not for the TYPE_OVERFLOW_UNDEFINED case
of VRP handling the ranges optimistically?

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-07-15  Bin Cheng  
>
> * tree-chrec.c (convert_affine_scev): New parameter.  Pass new arg.
> (chrec_convert_1, chrec_convert): Ditto.
> * tree-chrec.h (chrec_convert, convert_affine_scev): New parameter.
> * tree-scalar-evolution.c (interpret_rhs_expr): Pass new arg.
> * tree-vrp.c (adjust_range_with_scev): Ditto.
> * tree-ssa-loop-niter.c (idx_infer_loop_bounds): Ditto.
> (scev_var_range_cant_overflow): New function.
> (scev_probably_wraps_p): New parameter.  Call above function.
> * tree-ssa-loop-niter.h (scev_probably_wraps_p): New parameter.
>
> gcc/testsuite/ChangeLog
> 2016-07-15  Bin Cheng  
>
> * gcc.dg/tree-ssa/scev-15.c: New.


Re: [PATCH 2/2] Add selftests for fibonacci_heap

2016-07-19 Thread David Malcolm
On Tue, 2016-07-12 at 15:17 +0200, marxin wrote:

Thanks for writing selftests!

FWIW, some spelling nits below (I'm not a reviewer, and not familiar
with our fibonacci heap implementation).

> gcc/ChangeLog:
> 
> 2016-07-13  Martin Liska  
> 
>   * Makefile.in: Include fibonacci_heap.c
>   * fibonacci_heap.c: New file.
>   * fibonacci_heap.h (fibonacci_heap::insert): Use insert_node.
>   (fibonacci_heap::union_with): Fix deletion of the second heap.
>   * selftest-run-tests.c (selftest::run_tests): Incroporate

Spelling nit: "Incroporate" -> "Incorporate".

>   fibonacci heap tests.
>   * selftest.h: Declare fibonacci_heap_c_tests.


> diff --git a/gcc/fibonacci_heap.c b/gcc/fibonacci_heap.c
> new file mode 100644
> index 000..db58417
> --- /dev/null
> +++ b/gcc/fibonacci_heap.c

[...]


> +/* Verify that heap can handle duplicite keys.  */

Spelling nit:
  "duplicite" -> "duplicate"

> +
> +static void
> +test_duplicite_keys ()

and here 

> diff --git a/gcc/fibonacci_heap.h b/gcc/fibonacci_heap.h
> index c6c2a45..602d5ee 100644
> --- a/gcc/fibonacci_heap.h
> +++ b/gcc/fibonacci_heap.h

[...]

> @@ -230,6 +230,9 @@ private:
>/* Insert new NODE given by KEY and DATA associated with the key. 
>  */
>fibonacci_node_t *insert (fibonacci_node_t *node, K key, V *data);
>  
> +  /* Insert new NODE that has alredy filled key and value.  */

Spelling nit: "alredy" -> "already".

> @@ -345,6 +348,15 @@ fibonacci_heap::insert (fibonacci_node_t
> +/* Insert new NODE that has alredy filled key and value.  */
> 

Likewise: "alredy" -> "already".



[PATCH] selftest.c: gracefully handle NULL in assert_streq

2016-07-19 Thread David Malcolm
If a NULL is passed in as the expected or actual value for an
ASSERT_STREQ, the call to strcmp within selftest::assert_streq
can segfault, leading to a failure of -fself-test without
indicating which test failed.

Handle this more gracefully by checking for NULL, so that
information on the failing test is printed to stderr if this
occurs.
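
The intended control flow can be sketched in plain C. This is an illustration under assumptions, not the selftest code: `streq_or_report` is a hypothetical stand-in that returns a status and prints to stderr, whereas the real selftest::assert_streq reports through fail_formatted, which aborts.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the NULL-safe comparison: return 1 if the
   two strings are equal, 0 otherwise.  A NULL on either side is
   reported instead of being passed to strcmp.  */
int
streq_or_report (const char *desc_expected, const char *desc_actual,
		 const char *val_expected, const char *val_actual)
{
  /* A NULL expected value means the test itself is buggy.  */
  if (val_expected == NULL)
    {
      fprintf (stderr, "ASSERT_STREQ (%s, %s) expected=NULL\n",
	       desc_expected, desc_actual);
      return 0;
    }

  /* A NULL actual value is an ordinary failure with its own message.  */
  if (val_actual == NULL)
    {
      fprintf (stderr, "ASSERT_STREQ (%s, %s) expected=\"%s\" actual=NULL\n",
	       desc_expected, desc_actual, val_expected);
      return 0;
    }

  /* Both pointers are valid, so strcmp is safe here.  */
  if (strcmp (val_expected, val_actual) == 0)
    return 1;

  fprintf (stderr, "ASSERT_STREQ (%s, %s) expected=\"%s\" actual=\"%s\"\n",
	   desc_expected, desc_actual, val_expected, val_actual);
  return 0;
}
```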

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
I also manually tested the various kinds of failure of
ASSERT_STREQ, and verified that each branch prints a sane
failure message before aborting.

OK for trunk?

gcc/ChangeLog:
* selftest.c (selftest::assert_streq): Handle NULL values of
val_actual and val_expected.
---
 gcc/selftest.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/selftest.c b/gcc/selftest.c
index ed6e517..76a4c41 100644
--- a/gcc/selftest.c
+++ b/gcc/selftest.c
@@ -60,13 +60,25 @@ selftest::fail_formatted (const location &loc, const char *fmt, ...)
   abort ();
 }
 
-/* Implementation detail of ASSERT_STREQ.  */
+/* Implementation detail of ASSERT_STREQ.
+   Compare val_expected and val_actual with strcmp.  They ought
+   to be non-NULL; fail gracefully if either are NULL.  */
 
 void
 selftest::assert_streq (const location &loc,
const char *desc_expected, const char *desc_actual,
const char *val_expected, const char *val_actual)
 {
+  /* If val_expected is NULL, the test is buggy.  Fail gracefully.  */
+  if (val_expected == NULL)
+::selftest::fail_formatted
+   (loc, "ASSERT_STREQ (%s, %s) expected=NULL",
+desc_expected, desc_actual);
+  /* If val_actual is NULL, fail with a custom error message.  */
+  if (val_actual == NULL)
+::selftest::fail_formatted
+   (loc, "ASSERT_STREQ (%s, %s) expected=\"%s\" actual=NULL",
+desc_expected, desc_actual, val_expected);
   if (0 == strcmp (val_expected, val_actual))
 ::selftest::pass (loc, "ASSERT_STREQ");
   else
-- 
1.8.5.3



[PATCH] Fix test problem for pr70729.

2016-07-19 Thread Yuri Rumyantsev
Hi All,

I was informed that the test pr70729.cc from g++.dg/vect fails on
non-x86 targets.
I made minor changes to remove target-specific code such as xmmintrin.h.

Is it OK for trunk?

Changelog:
2016-07-19  Yuri Rumyantsev  

PR tree-optimization/71734
gcc/testsuite/ChangeLog:
* g++.dg/vect/pr70729.cc: Delete target dependent stuff.


test.patch
Description: Binary data


Re: [PATCH, DOC] Enhance documentation of -fipa-ra option.

2016-07-19 Thread Martin Liška
On 07/13/2016 07:04 PM, Alexander Monakov wrote:
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -7260,7 +7260,9 @@ any called function.  In that case it is not necessary to save and restore
>>  them around calls.  This is only possible if called functions are part of
>>  same compilation unit as current function and they are compiled before it.
>>  
>> -Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
>> +Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}, however the option
>> +is disabled if profiler is active (@option{-p}, @option{-pg} or
> 
> I think this should say "if generated code will be instrumented for profiling"
> (or "is instrumented") instead of "if profiler is active".  Internal comments
> can be fuzzy, but user-facing documentation should be more rigorous.
> 
>> +@option{-fprofile})
> 
> Right now option -fprofile is not documented, so it's probably not ok to
> mention it here (I realize it won't be so if you document it as an alias).
> 
>> or a port does not emit prologue and epilogue as RTL.
> 
> May I suggest "or if callee's register usage cannot be known exactly (this
> happens on targets that do not expose prologues and epilogues in RTL)"?
> 
> (well, this is still not 100% helpful to the user because they can't easily know
> which targets do, but still a bit of an improvement)
> 
> Thanks for bringing this forward!  The bit about profiling is especially not
> obvious and nice to have documented.
> 
> Alexander
> 

Hi Alexander.

Thank you very much for the suggestions, I've addressed essentially all the
nits you spotted. I also noticed that there is actually one more alias of -p,
namely -profile. That comes from a commit more than 20 years old:

commit f3a30d9b5452e89b4700b857df7caa448c956489
Author: kenner 
Date:   Mon Jan 15 13:28:30 1996 +

(LIB_SPEC): Remove %{mieee-fp:-lieee}.
Use -lc_p for -profile.
(CC1_SPEC): New macro.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@10984 138bc75d-0d04-0410-961f-82ee72b054a4

So I decided to also mention this alias.

Thoughts?

Martin
From 2108f584b35331912a0dfa4fc30131c223c6844c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 13 Jul 2016 18:25:09 +0200
Subject: [PATCH] Enhance documentation of -fipa-ra option.

gcc/ChangeLog:

2016-07-13  Martin Liska  

	* doc/invoke.texi (-fipa-ra): Document when the option is
	disabled. Fix a typo.
	(profile): Document it as an alias.
	(fprofile): Likewise.
---
 gcc/doc/invoke.texi | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9a4db38..47d69c8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -422,7 +422,8 @@ Objective-C and Objective-C++ Dialects}.
 
 @item Program Instrumentation Options
 @xref{Instrumentation Options,,Program Instrumentation Options}.
-@gccoptlist{-p  -pg  -fprofile-arcs --coverage -ftest-coverage @gol
+@gccoptlist{-p -profile -fprofile -pg -fprofile-arcs --coverage @gol
+-ftest-coverage @gol
 -fprofile-dir=@var{path} -fprofile-generate -fprofile-generate=@var{path} @gol
 -fsanitize=@var{style} -fsanitize-recover -fsanitize-recover=@var{style} @gol
 -fasan-shadow-offset=@var{number} -fsanitize-sections=@var{s1},@var{s2},... @gol
@@ -7260,7 +7261,11 @@ any called function.  In that case it is not necessary to save and restore
 them around calls.  This is only possible if called functions are part of
 same compilation unit as current function and they are compiled before it.
 
-Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
+Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}, however the option
+is disabled if generated code will be instrumented for profiling
+(@option{-p}, @option{-profile}, @option{-fprofile} or @option{-pg})
+or if callee's register usage cannot be known exactly (this happens on targets
+that do not expose prologues and epilogues in RTL).
 
 @item -fconserve-stack
 @opindex fconserve-stack
@@ -7280,7 +7285,7 @@ Perform code hoisting.  Code hoisting tries to move the
 evaluation of expressions executed on all paths to the function exit
 as early as possible.  This is especially useful as a code size
 optimization, but it often helps for code speed as well.
-This flag is enabled by defailt at @option{-O2} and higher.
+This flag is enabled by default at @option{-O2} and higher.
 
 @item -ftree-pre
 @opindex ftree-pre
@@ -9804,6 +9809,16 @@ analysis program @command{prof}.  You must use this option when compiling
 the source files you want data about, and you must also use it when
 linking.
 
+@cindex @command{prof}
+@item -profile
+@opindex profile
+Alias of @option{-p}.
+
+@cindex @command{prof}
+@item -fprofile
+@opindex fprofile
+Alias of @option{-p}.
+
 @cindex @command{gprof}
 @item -pg
 @opindex pg
-- 
2.9.0



[PATCH]: Use HOST_WIDE_INT_{,M}1{,U} some more

2016-07-19 Thread Uros Bizjak
This patch is the result of some exercises with sed in the gcc/ directory.

2016-07-19  Uros Bizjak  

* builtins.c: Use HOST_WIDE_INT_1 instead of (HOST_WIDE_INT) 1,
HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1,
HOST_WIDE_INT_M1 instead of (HOST_WIDE_INT) -1 and
HOST_WIDE_INT_M1U instead of (unsigned HOST_WIDE_INT) -1.
* combine.c: Ditto.
* cse.c: Ditto.
* dojump.c: Ditto.
* double-int.c: Ditto.
* dse.c: Ditto.
* dwarf2out.c: Ditto.
* expmed.c: Ditto.
* expr.c: Ditto.
* fold-const.c: Ditto.
* function.c: Ditto.
* fwprop.c: Ditto.
* genmodes.c: Ditto.
* hwint.c: Ditto.
* hwint.h: Ditto.
* ifcvt.c: Ditto.
* loop-doloop.c: Ditto.
* loop-invariant.c: Ditto.
* loop-iv.c: Ditto.
* match.pd: Ditto.
* optabs.c: Ditto.
* real.c: Ditto.
* reload.c: Ditto.
* rtlanal.c: Ditto.
* simplify-rtx.c: Ditto.
* stor-layout.c: Ditto.
* toplev.c: Ditto.
* tree-ssa-loop-ivopts.c: Ditto.
* tree-vect-generic.c: Ditto.
* tree-vect-patterns.c: Ditto.
* tree.c: Ditto.
* tree.h: Ditto.
* ubsan.c: Ditto.
* varasm.c: Ditto.
* wide-int-print.cc: Ditto.
* wide-int.cc: Ditto.
* wide-int.h: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?

Uros.
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 5f1fd82..03a0dc8 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -668,11 +668,11 @@ target_char_cast (tree cst, char *p)
   val = TREE_INT_CST_LOW (cst);
 
   if (CHAR_TYPE_SIZE < HOST_BITS_PER_WIDE_INT)
-val &= (((unsigned HOST_WIDE_INT) 1) << CHAR_TYPE_SIZE) - 1;
+val &= (HOST_WIDE_INT_1U << CHAR_TYPE_SIZE) - 1;
 
   hostval = val;
   if (HOST_BITS_PER_CHAR < HOST_BITS_PER_WIDE_INT)
-hostval &= (((unsigned HOST_WIDE_INT) 1) << HOST_BITS_PER_CHAR) - 1;
+hostval &= (HOST_WIDE_INT_1U << HOST_BITS_PER_CHAR) - 1;
 
   if (val != hostval)
 return 1;
diff --git a/gcc/combine.c b/gcc/combine.c
index 4db11b0..1e5ee8e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -4882,7 +4882,7 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
  rtx dest = XEXP (SET_DEST (x), 0);
  machine_mode mode = GET_MODE (dest);
  unsigned HOST_WIDE_INT mask
-   = ((unsigned HOST_WIDE_INT) 1 << len) - 1;
+   = (HOST_WIDE_INT_1U << len) - 1;
  rtx or_mask;
 
  if (BITS_BIG_ENDIAN)
@@ -5016,7 +5016,7 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
  if (unsignedp && len <= 8)
{
  unsigned HOST_WIDE_INT mask
-   = ((unsigned HOST_WIDE_INT) 1 << len) - 1;
+   = (HOST_WIDE_INT_1U << len) - 1;
  SUBST (SET_SRC (x),
 gen_rtx_AND (mode,
  gen_rtx_LSHIFTRT
@@ -5852,7 +5852,7 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
  && ((GET_CODE (XEXP (XEXP (x, 0), 0)) == AND
   && CONST_INT_P (XEXP (XEXP (XEXP (x, 0), 0), 1))
   && (UINTVAL (XEXP (XEXP (XEXP (x, 0), 0), 1))
-  == ((unsigned HOST_WIDE_INT) 1 << (i + 1)) - 1))
+  == (HOST_WIDE_INT_1U << (i + 1)) - 1))
  || (GET_CODE (XEXP (XEXP (x, 0), 0)) == ZERO_EXTEND
  && (GET_MODE_PRECISION (GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)))
  == (unsigned int) i + 1
@@ -6168,7 +6168,7 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
   else if (SHIFT_COUNT_TRUNCATED && !REG_P (XEXP (x, 1)))
SUBST (XEXP (x, 1),
   force_to_mode (XEXP (x, 1), GET_MODE (XEXP (x, 1)),
- ((unsigned HOST_WIDE_INT) 1
+ (HOST_WIDE_INT_1U
   << exact_log2 (GET_MODE_BITSIZE (GET_MODE (x
  - 1,
  0));
@@ -7134,7 +7134,7 @@ expand_compound_operation (rtx x)
  simplify_shift_const (NULL_RTX, LSHIFTRT,
GET_MODE (x),
XEXP (x, 0), pos),
- ((unsigned HOST_WIDE_INT) 1 << len) - 1);
+ (HOST_WIDE_INT_1U << len) - 1);
   else
 /* Any other cases we can't handle.  */
 return x;
@@ -7261,7 +7261,7 @@ expand_field_assignment (const_rtx x)
   /* Now compute the equivalent expression.  Make a copy of INNER
 for the SET_DEST in case it is a MEM into which we will substitute;
 we don't want shared RTL in that case.  */
-  mask = gen_int_mode (((unsigned HOST_WIDE_INT) 1 << len) - 1,
+  mask = gen_int_mode ((HOST_WIDE_INT_1U << len) - 1,
   compute_mode);
   cleared = simplify_gen_binary (AND, compute_mode,
 simplify_gen_unary (NOT, compute_mode,
@@ -7447,7 +7447,7 @

Re: [PATCH] Fix test problem for pr70729.

2016-07-19 Thread Jakub Jelinek
On Tue, Jul 19, 2016 at 03:40:47PM +0300, Yuri Rumyantsev wrote:
> Hi All,
> 
> I was informed that the test pr70729.cc from g++.dg/vect is failed on
> non-x86 targets.
> I did minor changes to delete target specific stuff like xmmintrin.h.
> 
> Is it OK for trunk?

This is still wrong, aligned_alloc is a C11 API, not all C library stdlib.h
headers will declare it and even if they do, it might not be visible in C++
programs (the fact that for glibc g++ predefines -D_GNU_SOURCE by default is
a bug).

I think we should go with the following instead, there is no point in
including any headers, for the test you don't need it.

inline void* my_alloc (__SIZE_TYPE__ bytes) {return __builtin_aligned_alloc (bytes, 128);}

is a possibility too, of course.
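
For reference, a minimal sketch of an aligned allocator that needs no x86
headers.  It is not the code from the patch: it assumes a POSIX libc and calls
posix_memalign directly rather than through the builtin.  POSIX documents the
argument order as (memptr, alignment, size).  The names my_alloc_sketch,
my_free_sketch and is_aligned_128 are made up for illustration.

```cpp
#include <stdint.h>
#include <stdlib.h>

// Hypothetical helper: allocate BYTES bytes aligned to a 128-byte boundary.
// Returns a null pointer on failure.
inline void *
my_alloc_sketch (size_t bytes)
{
  void *ptr = 0;
  // posix_memalign (memptr, alignment, size) returns 0 on success.
  if (posix_memalign (&ptr, 128, bytes) != 0)
    return 0;
  return ptr;
}

// Memory obtained from posix_memalign is released with plain free.
inline void
my_free_sketch (void *memory)
{
  free (memory);
}

// Hypothetical check, used only to illustrate the alignment guarantee.
inline bool
is_aligned_128 (const void *p)
{
  return ((uintptr_t) p % 128) == 0;
}
```

A test only needs the alignment, so any allocator with this guarantee would
serve equally well.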

2016-07-19  Jakub Jelinek  

PR middle-end/71734
* g++.dg/vect/pr70729.cc: Don't include string.h or xmmintrin.h.
(my_alloc): Rewritten to use __builtin_posix_memalign and
__SIZE_TYPE__.
(my_free): Use __builtin_free instead of _mm_free.
(Vec::operator=): Use __builtin_memcpy.

--- gcc/testsuite/g++.dg/vect/pr70729.cc.jj 2016-07-18 19:42:48.0 +0200
+++ gcc/testsuite/g++.dg/vect/pr70729.cc 2016-07-19 13:31:04.611981641 +0200
@@ -3,11 +3,8 @@
 // { dg-additional-options "-msse2" { target x86_64-*-* i?86-*-* } }
 
 
-#include 
-#include 
-
-inline void* my_alloc (size_t bytes) {return _mm_malloc (bytes, 128);}
-inline void my_free (void* memory) {_mm_free (memory);}
+inline void* my_alloc (__SIZE_TYPE__ bytes) {void *ptr; __builtin_posix_memalign (&ptr, bytes, 128);}
+inline void my_free (void* memory) {__builtin_free (memory);}
 
 template 
 class Vec
@@ -23,7 +20,7 @@ public:
   Vec& operator = (const Vec& other)   
 {
   if (this != &other)
-   memcpy (data, other.data, isize*sizeof (T));
+   __builtin_memcpy (data, other.data, isize*sizeof (T));
   return *this;
 }
 


Jakub


Re: [PATCH]: Use HOST_WIDE_INT_{,M}1{,U} some more

2016-07-19 Thread Jakub Jelinek
On Tue, Jul 19, 2016 at 02:46:46PM +0200, Uros Bizjak wrote:
> The result of exercises with sed in gcc/ directory.
> 
> 2016-07-19  Uros Bizjak  
> 
> * builtins.c: Use HOST_WIDE_INT_1 instead of (HOST_WIDE_INT) 1,
> HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1,
> HOST_WIDE_INT_M1 instead of (HOST_WIDE_INT) -1 and
> HOST_WIDE_INT_M1U instead of (unsigned HOST_WIDE_INT) -1.
> * combine.c: Ditto.
> * cse.c: Ditto.
> * dojump.c: Ditto.
> * double-int.c: Ditto.
> * dse.c: Ditto.
> * dwarf2out.c: Ditto.
> * expmed.c: Ditto.
> * expr.c: Ditto.
> * fold-const.c: Ditto.
> * function.c: Ditto.
> * fwprop.c: Ditto.
> * genmodes.c: Ditto.
> * hwint.c: Ditto.
> * hwint.h: Ditto.
> * ifcvt.c: Ditto.
> * loop-doloop.c: Ditto.
> * loop-invariant.c: Ditto.
> * loop-iv.c: Ditto.
> * match.pd: Ditto.
> * optabs.c: Ditto.
> * real.c: Ditto.
> * reload.c: Ditto.
> * rtlanal.c: Ditto.
> * simplify-rtx.c: Ditto.
> * stor-layout.c: Ditto.
> * toplev.c: Ditto.
> * tree-ssa-loop-ivopts.c: Ditto.
> * tree-vect-generic.c: Ditto.
> * tree-vect-patterns.c: Ditto.
> * tree.c: Ditto.
> * tree.h: Ditto.
> * ubsan.c: Ditto.
> * varasm.c: Ditto.
> * wide-int-print.cc: Ditto.
> * wide-int.cc: Ditto.
> * wide-int.h: Ditto.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> 
> OK for mainline?

> @@ -546,7 +546,7 @@ div_and_round_double (unsigned code, int uns,
>if (quo_neg && (*lrem != 0 || *hrem != 0))   /* ratio < 0 && rem != 0 */
>   {
> /* quo = quo - 1;  */
> -   add_double (*lquo, *hquo, (HOST_WIDE_INT) -1, (HOST_WIDE_INT)  -1,
> +   add_double (*lquo, *hquo, HOST_WIDE_INT_M1, (HOST_WIDE_INT)  -1,
> lquo, hquo);
>   }
>else

This surely should be
  add_double (*lquo, *hquo, HOST_WIDE_INT_M1, HOST_WIDE_INT_M1,
  lquo, hquo);

> @@ -557,7 +557,7 @@ div_and_round_double (unsigned code, int uns,
>  case CEIL_MOD_EXPR:  /* round toward positive infinity */
>if (!quo_neg && (*lrem != 0 || *hrem != 0))  /* ratio > 0 && rem != 0 */
>   {
> -   add_double (*lquo, *hquo, (HOST_WIDE_INT) 1, (HOST_WIDE_INT) 0,
> +   add_double (*lquo, *hquo, HOST_WIDE_INT_1, (HOST_WIDE_INT) 0,
> lquo, hquo);
>   }
>else

Dunno here, either just use 0 instead of (HOST_WIDE_INT) 0, or define
HOST_WIDE_INT_0 macro and use that?  Though as add_double is a macro
that in the end calls a function with UHWI or HWI arguments, I wonder what
is the point in all these casts, whether just using -1, -1, or 1, 0, etc.
wouldn't be better.
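
To illustrate the distinction (a sketch only, with made-up names, and assuming
a 64-bit HOST_WIDE_INT): a plain int argument is converted to the parameter
type at the call, so a cast there changes nothing, while in a shift the type of
the left operand decides the width, so the 1 does have to be widened first.

```cpp
#include <stdint.h>

// Stand-ins for HOST_WIDE_INT and unsigned HOST_WIDE_INT, assumed to be
// 64 bits wide in this sketch.
typedef int64_t hwi_sketch;
typedef uint64_t uhwi_sketch;

// A plain int argument is converted to the parameter type at the call,
// so no cast is needed at the call site.
inline hwi_sketch take_hwi (hwi_sketch x) { return x; }
inline uhwi_sketch take_uhwi (uhwi_sketch x) { return x; }

// In a shift the left operand's type decides the width of the result,
// so the 1 has to be widened before shifting.  LEN must be below 64.
inline uhwi_sketch
low_mask (unsigned len)
{
  return ((uhwi_sketch) 1 << len) - 1;
}
```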

> @@ -590,10 +590,10 @@ div_and_round_double (unsigned code, int uns,
>   if (quo_neg)
> /* quo = quo - 1;  */
> add_double (*lquo, *hquo,
> -   (HOST_WIDE_INT) -1, (HOST_WIDE_INT) -1, lquo, hquo);
> +   HOST_WIDE_INT_M1, HOST_WIDE_INT_M1, lquo, hquo);
>   else
> /* quo = quo + 1; */
> -   add_double (*lquo, *hquo, (HOST_WIDE_INT) 1, (HOST_WIDE_INT) 0,
> +   add_double (*lquo, *hquo, HOST_WIDE_INT_1, (HOST_WIDE_INT) 0,
> lquo, hquo);
> }
>   else

> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> index f5c530a..4354b5b 100644
> --- a/gcc/simplify-rtx.c
> +++ b/gcc/simplify-rtx.c
> @@ -40,7 +40,7 @@ along with GCC; see the file COPYING3.  If not see
> occasionally need to sign extend from low to high as if low were a
> signed wide int.  */
>  #define HWI_SIGN_EXTEND(low) \
> -  ((((HOST_WIDE_INT) low) < 0) ? ((HOST_WIDE_INT) -1) : ((HOST_WIDE_INT) 0))
> +  ((((HOST_WIDE_INT) low) < 0) ? HOST_WIDE_INT_M1 : ((HOST_WIDE_INT) 0))
>  
>  static rtx neg_const_int (machine_mode, const_rtx);
>  static bool plus_minus_operand_p (const_rtx);

But then here we have yet another (HOST_WIDE_INT) 0 - HOST_WIDE_INT_0
candidate.

Otherwise LGTM.

Jakub


Re: [PATCH, DOC] Enhance documentation of -fipa-ra option.

2016-07-19 Thread Andreas Schwab
Martin Liška  writes:

> Thank you very much for the suggestions, I've basically followed all nits you
> spotted. I also noticed that there is actually one more alias of -p and that's
> -profile. That comes from >20 years old commit:

-profile is not the same as -p, as the former also adds -lc_p.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[committed] Fix copy&paste bug in function-tests.c

2016-07-19 Thread David Malcolm
I made a copy&paste error when writing
selftest::verify_three_block_rtl_cfg, forgetting to update the
basic block ptr when asserting the value of "flags".

Fixed thusly.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
Verified -fself-test of stage1 on powerpc-ibm-aix7.1.3.0.

Committed to trunk (r238470) under the "obvious" rule.

gcc/ChangeLog:
* function-tests.c (selftest::verify_three_block_rtl_cfg): Verify
the flags of the exit block and bb2, not just the entry block.
---
 gcc/function-tests.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/function-tests.c b/gcc/function-tests.c
index edd355f..a59a066 100644
--- a/gcc/function-tests.c
+++ b/gcc/function-tests.c
@@ -433,14 +433,14 @@ verify_three_block_rtl_cfg (function *fun)
 
   basic_block exit = EXIT_BLOCK_PTR_FOR_FN (fun);
   ASSERT_TRUE (exit != NULL);
-  ASSERT_EQ (BB_RTL, entry->flags & BB_RTL);
+  ASSERT_EQ (BB_RTL, exit->flags & BB_RTL);
   ASSERT_EQ (NULL, BB_HEAD (exit));
 
   /* The "real" basic block should be flagged as RTL, and have one
  or more insns.  */
   basic_block bb2 = get_real_block (fun);
   ASSERT_TRUE (bb2 != NULL);
-  ASSERT_EQ (BB_RTL, entry->flags & BB_RTL);
+  ASSERT_EQ (BB_RTL, bb2->flags & BB_RTL);
   ASSERT_TRUE (BB_HEAD (bb2) != NULL);
 }
 
-- 
1.8.5.3



Re: [PATCH v2] S/390: Add splitter for "and" with complement.

2016-07-19 Thread Dominik Vogt
On Wed, Apr 27, 2016 at 08:58:44AM +0100, Dominik Vogt wrote:
> The attached patch provides some improved patterns for "and with
> complement" to the s390 machine description.  Bootstrapped and
> regression tested on s390 and s390x.

Version 2 of the patch, reduced to the bare patterns.  Regression
tested on s390 and s390x.

While there are a few situations that result in worse assembly
language code, overall the results are good.
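
The splitter relies on the identity ~b & a == (b & a) ^ a.  A small sketch of
the two forms in plain C++, with made-up function names; it only illustrates
the arithmetic, not the RTL patterns:

```cpp
#include <stdint.h>

// Direct form: "and" with complement.
inline uint64_t
andc_direct (uint64_t a, uint64_t b)
{
  return ~b & a;
}

// Split form: an AND into a temporary followed by an XOR with A.
inline uint64_t
andc_split (uint64_t a, uint64_t b)
{
  uint64_t tmp = b & a;
  return tmp ^ a;
}
```

Every bit of A that is also set in B is cleared by the XOR, and every other
bit of A passes through unchanged, which is what ~b & a computes.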

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.md ("*andc_split", "*andc_split2"): New splitters
for and with complement.
gcc/testsuite/ChangeLog

* gcc.target/s390/md/andc-splitter-1.c: New test case.
* gcc.target/s390/md/andc-splitter-2.c: Likewise.
From ac31266ceba7fb69b49d5161f4b2f3c416a4242c Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Mon, 14 Mar 2016 17:48:17 +0100
Subject: [PATCH] S/390: Add splitter for "and" with complement.

Force splitting of logical operator expressions ...  with three operands, a
register destination and a memory operand because there are no instructions for
that and combine results in inefficient code.
---
 gcc/config/s390/s390.md| 46 
 gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c | 61 ++
 gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c | 38 ++
 3 files changed, 145 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index f8c61a8..aad62a1 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -7262,6 +7262,52 @@
(set_attr "z10prop" "z10_super_E1,z10_super,*")])
 
 ;
+; And with complement
+;
+; c = ~b & a = (b & a) ^ a
+
+(define_insn_and_split "*andc_split"
+  [(set (match_operand:GPR 0 "nonimmediate_operand" "")
+   (and:GPR (not:GPR (match_operand:GPR 1 "nonimmediate_operand" ""))
+(match_operand:GPR 2 "general_operand" "")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_ZARCH && s390_logical_operator_ok_p (operands)"
+  "#"
+  "&& 1"
+  [
+  (parallel
+   [(set (match_dup 3) (and:GPR (match_dup 1) (match_dup 2)))
+   (clobber (reg:CC CC_REGNUM))])
+  (parallel
+   [(set (match_dup 0) (xor:GPR (match_dup 3) (match_dup 2)))
+   (clobber (reg:CC CC_REGNUM))])]
+{
+  if (reg_overlap_mentioned_p (operands[0], operands[2]))
+{
+  gcc_assert (can_create_pseudo_p ());
+  operands[3] = gen_reg_rtx (<MODE>mode);
+}
+  else
+operands[3] = operands[0];
+})
+
+; Convert "(xor (operand) (-1))" to "(not (operand))" for low optimization
+; levels so that "*andc_split" matches.
+(define_insn_and_split "*andc_split2"
+  [(set (match_operand:GPR 0 "nonimmediate_operand" "")
+(and:GPR (xor:GPR (match_operand:GPR 1 "nonimmediate_operand" "")
+ (const_int -1))
+(match_operand:GPR 2 "general_operand" "")))
+(clobber (reg:CC CC_REGNUM))]
+  "TARGET_ZARCH && s390_logical_operator_ok_p (operands)"
+  "#"
+  "&& 1"
+  [(parallel
+[(set (match_dup 0) (and:GPR (not:GPR (match_dup 1)) (match_dup 2)))
+(clobber (reg:CC CC_REGNUM))])]
+)
+
+;
 ; Block and (NC) patterns.
 ;
 
diff --git a/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c 
b/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
new file mode 100644
index 000..ed78921
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
@@ -0,0 +1,61 @@
+/* Machine description pattern tests.  */
+
+/* { dg-do run { target { lp64 } } } */
+/* { dg-options "-mzarch -save-temps -dP" } */
+/* Skip test if -O0 is present on the command line:
+
+{ dg-skip-if "" { *-*-* } { "-O0" } { "" } }
+
+   Skip test if the -O option is missing from the command line
+{ dg-skip-if "" { *-*-* } { "*" } { "-O*" } }
+*/
+
+__attribute__ ((noinline))
+unsigned long andc_vv(unsigned long a, unsigned long b)
+{ return ~b & a; }
+/* { dg-final { scan-assembler ":15 .\* \{\\*anddi3\}" } } */
+/* { dg-final { scan-assembler ":15 .\* \{\\*xordi3\}" } } */
+
+__attribute__ ((noinline))
+unsigned long andc_pv(unsigned long *a, unsigned long b)
+{ return ~b & *a; }
+/* { dg-final { scan-assembler ":21 .\* \{\\*anddi3\}" } } */
+/* { dg-final { scan-assembler ":21 .\* \{\\*xordi3\}" } } */
+
+__attribute__ ((noinline))
+unsigned long andc_vp(unsigned long a, unsigned long *b)
+{ return ~*b & a; }
+/* { dg-final { scan-assembler ":27 .\* \{\\*anddi3\}" } } */
+/* { dg-final { scan-assembler ":27 .\* \{\\*xordi3\}" } } */
+
+__attribute__ ((noinline))
+unsigned long andc_pp(unsigned long *a, unsigned long *b)
+{ return ~*b & *a; }
+/* { dg-final { scan-assembler ":33 .\* \{\\*anddi3\}" } } */
+/* { dg-final { scan-assembler ":33 .\* \{\\*xordi3\}" } } */
+
+/* { dg-final { scan-assembler-times "\tngr\?k\?\t" 4 } } */
+/* { dg-final { scan-assembler-times "\txgr\?\t" 4 } } */
+
+int
+main (void)
+{

Re: [PATCH] Fix test problem for pr70729.

2016-07-19 Thread Yuri Rumyantsev
Thanks Jakub for your comments.
I changed the test as you proposed.

Yuri.

2016-07-19 15:50 GMT+03:00 Jakub Jelinek :
> On Tue, Jul 19, 2016 at 03:40:47PM +0300, Yuri Rumyantsev wrote:
>> Hi All,
>>
>> I was informed that the test pr70729.cc from g++.dg/vect is failed on
>> non-x86 targets.
>> I did minor changes to delete target specific stuff like xmmintrin.h.
>>
>> Is it OK for trunk?
>
> This is still wrong, aligned_alloc is a C11 API, not all C library stdlib.h
> headers will declare it and even if they do, it might not be visible in C++
> programs (the fact that for glibc g++ predefines -D_GNU_SOURCE by default is
> a bug).
>
> I think we should go with the following instead, there is no point in
> including any headers, for the test you don't need it.
>
> inline void* my_alloc (__SIZE_TYPE__ bytes) {return __builtin_aligned_alloc (bytes, 128);}
>
> is a possibility too, of course.
>
> 2016-07-19  Jakub Jelinek  
>
> PR middle-end/71734
> * g++.dg/vect/pr70729.cc: Don't include string.h or xmmintrin.h.
> (my_alloc): Rewritten to use __builtin_posix_memalign and
> __SIZE_TYPE__.
> (my_free): Use __builtin_free instead of _mm_free.
> (Vec::operator=): Use __builtin_memcpy.
>
> --- gcc/testsuite/g++.dg/vect/pr70729.cc.jj 2016-07-18 19:42:48.0 +0200
> +++ gcc/testsuite/g++.dg/vect/pr70729.cc 2016-07-19 13:31:04.611981641 +0200
> @@ -3,11 +3,8 @@
>  // { dg-additional-options "-msse2" { target x86_64-*-* i?86-*-* } }
>
>
> -#include 
> -#include 
> -
> -inline void* my_alloc (size_t bytes) {return _mm_malloc (bytes, 128);}
> -inline void my_free (void* memory) {_mm_free (memory);}
> +inline void* my_alloc (__SIZE_TYPE__ bytes) {void *ptr; __builtin_posix_memalign (&ptr, bytes, 128);}
> +inline void my_free (void* memory) {__builtin_free (memory);}
>
>  template 
>  class Vec
> @@ -23,7 +20,7 @@ public:
>Vec& operator = (const Vec& other)
>  {
>if (this != &other)
> -   memcpy (data, other.data, isize*sizeof (T));
> +   __builtin_memcpy (data, other.data, isize*sizeof (T));
>return *this;
>  }
>
>
>
> Jakub


test.patch
Description: Binary data


[Patch, testsuite, committed] Fix gcc.dg/params/blocksort-part.c for non 32-bit int targets

2016-07-19 Thread Senthil Kumar Selvaraj
Hi,

  The patch below conditionally defines Int32 and UInt32 to accommodate
  targets with sizeof(int) != 4.

  Regtested with x86_64 and avr. Committed as obvious.
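
  As a sketch of the same idea outside the test (the type names carry a
  Sketch suffix to mark them as illustrative; __SIZEOF_INT__ is the macro GCC
  predefines), the selection can be checked at compile time:

```cpp
#include <limits.h>

// Pick a type that is at least 32 bits wide: long where int is 16 bits,
// plain int otherwise.
#if __SIZEOF_INT__ == 2
typedef long Int32Sketch;
typedef unsigned long UInt32Sketch;
#else
typedef int Int32Sketch;
typedef unsigned int UInt32Sketch;
#endif

// Fail the build rather than miscompute if the assumption does not hold.
static_assert (sizeof (Int32Sketch) * CHAR_BIT >= 32,
               "Int32Sketch must hold at least 32 bits");
static_assert (sizeof (UInt32Sketch) * CHAR_BIT >= 32,
               "UInt32Sketch must hold at least 32 bits");

// Hypothetical helper reporting the chosen width in bits.
inline unsigned
int32_sketch_bits ()
{
  return sizeof (Int32Sketch) * CHAR_BIT;
}
```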

Regards
Senthil

2016-07-19  Senthil Kumar Selvaraj  

* gcc.dg/params/blocksort-part.c: Conditionally define Int32 
and UInt32 based on __SIZEOF_INT__.


--- gcc/testsuite/gcc.dg/params/blocksort-part.c
+++ gcc/testsuite/gcc.dg/params/blocksort-part.c
@@ -21,8 +21,13 @@
 typedef char            Char;
 typedef unsigned char   Bool;
 typedef unsigned char   UChar;
+#if __SIZEOF_INT__ == 2
+typedef long            Int32;
+typedef unsigned long   UInt32;
+#else
 typedef int             Int32;
 typedef unsigned int    UInt32;
+#endif
 typedef short           Int16;
 typedef unsigned short  UInt16;
 
-- 
2.7.4




Re: [PATCH] Add qsort comparator consistency checking (PR71702)

2016-07-19 Thread Alexander Monakov
On Tue, 19 Jul 2016, Richard Biener wrote:
> Yes.  The other option is to enable this checking not with ENABLE_CHECKING
> but some new checking option, say ENABLE_CHECKING_ALGORITHMS, and
> do full checking in that case.

Thanks - I'm going to fold in this idea when redoing the patch (i.e. check a
subset of pairs under normal checking, all pairs under this option macro).

While the topic is fresh, I'd like to mention a small complication with
extending this checking to cover all qsort calls.  I mentioned in the opening
mail that I was going to do that with a '#define qsort(..) qsort_chk (..)' in
gcc/system.h, but I missed that vec::qsort would be subject to this macro
expansion as well.

I see two possible solutions.  The first is to use the argument counting trick
to disambiguate between libc qsort(base, n, sz, cmp) and vec::qsort(cmp) on the
preprocessor level.  I don't see a reason it wouldn't work, but in this context
I'd consider that a last-resort measure rather than an appropriate solution.
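
For the record, a sketch of what the argument counting trick could look like.
All macro names here are hypothetical, and the expansions are reduced to
plain numbers so that the dispatch itself is visible; a real version would
expand the four-argument form to the checking wrapper and leave the
one-argument form as a member call.

```cpp
// Selects the fifth argument.  The names passed after __VA_ARGS__ shift
// into that slot depending on how many arguments the caller supplied.
#define SORT_PICK_5TH(a1, a2, a3, a4, name, ...) name

// Stand-ins for the two call shapes: libc qsort (base, n, size, cmp)
// and vec::qsort (cmp).
#define SORT_ARITY_4(base, n, size, cmp) 4
#define SORT_ARITY_1(cmp) 1
#define SORT_ARITY_OTHER(...) (-1)

// The trailing "unused" keeps the variadic part non-empty in every case.
#define SORT_ARITY(...) \
  SORT_PICK_5TH (__VA_ARGS__, SORT_ARITY_4, SORT_ARITY_OTHER, \
                 SORT_ARITY_OTHER, SORT_ARITY_1, unused) (__VA_ARGS__)

inline int arity_libc_style () { return SORT_ARITY (0, 0, 0, 0); }
inline int arity_vec_style () { return SORT_ARITY (0); }
```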

The second is to rename vec::qsort to vec::sort. While mass renaming is not very
nice, I hope it is acceptable in this case (I think formally vec::qsort
declaration in GCC is not portable, because it implicitly expects that stdlib.h
wouldn't shadow qsort with a function-like macro).


Actually, thinking about it more, instead of redirecting qsort in system.h, it
may be more appropriate to introduce gcc_qsort that wraps qsort and does
checking, add gcc_qsort_nochk as an escape hatch for cases where checking
shouldn't be done, and poison qsort in system.h (this again depends on doing the
vec::sort mass-rename).
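
A rough sketch of such a wrapper, under my own assumptions: the names carry a
_sketch suffix, the check covers only reflexivity and antisymmetry over all
pairs (so it is quadratic), and a failure simply aborts.  It is not the code
from the patch.

```cpp
#include <stddef.h>
#include <stdlib.h>

typedef int (*cmp_fn_sketch) (const void *, const void *);

static inline int sign_of (int x) { return (x > 0) - (x < 0); }

// Returns true if CMP is reflexive and antisymmetric on every pair of the
// N elements of SIZE bytes starting at BASE.  Quadratic in N.
inline bool
comparator_consistent_sketch (const void *base, size_t n, size_t size,
                              cmp_fn_sketch cmp)
{
  const char *p = (const char *) base;
  for (size_t i = 0; i < n; i++)
    {
      if (cmp (p + i * size, p + i * size) != 0)
        return false;
      for (size_t j = i + 1; j < n; j++)
        {
          int ab = sign_of (cmp (p + i * size, p + j * size));
          int ba = sign_of (cmp (p + j * size, p + i * size));
          if (ab != -ba)
            return false;
        }
    }
  return true;
}

// Hypothetical wrapper: sort, then abort if the comparator misbehaved.
inline void
gcc_qsort_sketch (void *base, size_t n, size_t size, cmp_fn_sketch cmp)
{
  qsort (base, n, size, cmp);
  if (!comparator_consistent_sketch (base, n, size, cmp))
    abort ();
}

// Example comparators: a sane one, and one that claims a > b for all pairs.
inline int
cmp_int_ok (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

inline int
cmp_int_broken (const void *, const void *)
{
  return 1;
}
```

Checking only a subset of pairs under normal checking would bound the cost,
at the price of missing some inconsistencies.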

Alexander


Re: [PATCH PR71503/PR71683]Fix ICE in tree-if-conv.c

2016-07-19 Thread Bin.Cheng
On Thu, Jul 14, 2016 at 6:49 PM, Jeff Law  wrote:
> On 07/14/2016 10:12 AM, Bin Cheng wrote:
>>
>> Hi,
>> This is a simple patch fixing ICE in tree-if-conv.c.  Existing code does
>> not setup a variable (cond) when predicate of basic block is true and it
>> asserts on the variable.  Interesting thing is dead code is not cleaned up
>> before ifcvt, but that's another story.
>> Bootstrap and test on x86_64.  Is it OK?
>>
>> Thanks,
>> bin
>>
>> 2016-07-13  Bin Cheng  
>>
>> PR tree-optimization/71503
>> PR tree-optimization/71683
>> * tree-if-conv.c (gen_phi_arg_condition): Set cond when predicate
>> is true.
>
> Maybe I'm missing something, but in the case where COND is already set
> and we encounter a true predicate later, shouldn't that make the result true
> as well?
>
> I don't think the code will though -- we just throw away the true condition.
> ISTM that the right thing to do is
>
> if (is_true_predicate (c))
>   {
> cond = c;
> continue;
>   }
>
> Can you see if you can trigger a case where we have an existing cond, then
> later find a true condition to verify the right thing happens?
Hi,
Attached is the updated patch; it breaks out of the loop once a TRUE
predicate is found.  I also built spec2k6/spec2k and did not find a
case like yours.  Is it OK?
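
To spell out the reasoning, here is a toy model of the loop.  It is a sketch
only: predicates are plain strings instead of GIMPLE trees, the literal
"true" stands for what is_true_predicate accepts, and gen_cond_sketch is a
made-up name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds the OR of all predicates, stopping at the first one that is
// known to be true.
inline std::string
gen_cond_sketch (const std::vector<std::string> &preds)
{
  std::string cond;
  for (std::size_t i = 0; i < preds.size (); i++)
    {
      if (preds[i] == "true")
        {
          // x || true is true, so nothing later can change the result.
          cond = preds[i];
          break;
        }
      cond = cond.empty () ? preds[i] : cond + " || " + preds[i];
    }
  return cond;
}
```

With a plain continue in place of the assignment and break, a list holding
only true predicates would leave cond empty, which corresponds to the unset
variable behind the ICE.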

Thanks,
bin
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index e5a3372..4253d19 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1687,7 +1687,10 @@ gen_phi_arg_condition (gphi *phi, vec *occur,
   e = gimple_phi_arg_edge (phi, (*occur)[i]);
   c = bb_predicate (e->src);
   if (is_true_predicate (c))
-   continue;
+   {
+ cond = c;
+ break;
+   }
   c = force_gimple_operand_gsi_1 (gsi, unshare_expr (c),
  is_gimple_condexpr, NULL_TREE,
  true, GSI_SAME_STMT);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr71503.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr71503.c
new file mode 100644
index 000..5a90abf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr71503.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" { target *-*-* } } */
+
+int a, b;
+unsigned long d;
+void fn1() {
+  unsigned long *h = &d;
+line1 : {
+  int i = 4;
+  for (; b; i++) {
+d = ((d + 6 ?: *h) ? a : 7) && (i &= 0 >= b);
+b += a;
+  }
+}
+  h = 0;
+  for (; *h;)
+goto line1;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr71683.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr71683.c
new file mode 100644
index 000..851be37
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr71683.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" { target *-*-* } } */
+
+short unsigned int ve;
+
+void
+u1 (void)
+{
+  int oq = 0;
+
+  while (ve != 0)
+{
+  int j4, w7 = oq;
+
+  oq = 0 / oq;
+  ve %= oq;
+  j4 = ve ^ 1;
+  ve ^= oq;
+  if (j4 != 0 ? j4 : ve)
+oq = ve;
+  else
+if (w7 != 0)
+  oq = ve;
+}
+}


Re: [PATCH, DOC] Enhance documentation of -fipa-ra option.

2016-07-19 Thread Alexander Monakov
> So I decided to also mention this alias.
> 
> Thoughts?

I'd like the new ipa-ra text to go in, but perhaps you should consider leaving
option aliases out of this patch, especially given that it's non-trivial, as
Andreas' comment has shown.

(I'd say it's rather confusing that -fprofile is quite different from
-fprofile-generate; I doubt it's worthwhile to document this alias at all, but I
won't argue much either way)

Thanks.
Alexander


Re: [PATCH], PR 71493, Fix PowerPC ABI breakage on GCC trunk/6.1

2016-07-19 Thread Segher Boessenkool
On Mon, Jul 18, 2016 at 11:40:03PM -0400, Michael Meissner wrote:
> > > +/* { dg-do compile { target { powerpc*-*-linux* && ilp32 } } } */
> > > +/* { dg-options "-O2 -m32 -msvr4-struct-return" } */

> > I think you can drop the ilp32.
> 
> You cannot use -m32 on a 64-bit little endian system, so the && ilp32 test
> guarantees it is only run on a system that supports 32-bit (a pure 32-bit
> system, or a big endian 64-bit system that still has the 32-bit libraries
> installed).

ilp32 tests for a system that is *now* compiling for 32-bit, not one
that *could*.

> I also imagine somebody could build a 64-bit big endian compiler that was
> configured with --disable-multilib, and you would not be able to do -m32.

Ah, right, we would need a test saying "can compile for PowerPC 32-bit
ELF" (this also includes powerpc-elf targets).  There is no such test
yet.  What you do now works, okay.

Thanks,


Segher


Implement C _FloatN, _FloatNx types [version 4]

2016-07-19 Thread Joseph Myers
[Version 4 is respun to apply without conflicts to current sources
and, as previously discussed, makes the PowerPC logic for selecting
machine modes match that used for *q constants.]


ISO/IEC TS 18661-3:2015 defines C bindings to IEEE interchange and
extended types, in the form of _FloatN and _FloatNx type names with
corresponding fN/FN and fNx/FNx constant suffixes and FLTN_* / FLTNX_*
<float.h> macros.  This patch implements support for this feature in
GCC.

The _FloatN types, for N = 16, 32, 64 or >= 128 and a multiple of 32,
are types encoded according to the corresponding IEEE interchange
format (endianness unspecified; may use either the NaN conventions
recommended in IEEE 754-2008, or the MIPS NaN conventions, since the
choice of convention is only an IEEE recommendation, not a
requirement).  The _FloatNx types, for N = 32, 64 and 128, are IEEE
"extended" types: types extending a narrower format with range and
precision at least as big as those specified in IEEE 754 for each
extended type (and with unspecified representation, but still
following IEEE semantics for their values and operations - and with
the set of values being determined by the precision and the maximum
exponent, which means that while Intel "extended" is suitable for
_Float64x, m68k "extended" is not).  These types are always distinct
from and not compatible with each other and the standard floating
types float, double, long double; thus, double, _Float64 and _Float32x
may all have the same ABI, but they are still three distinct types.
The type names may be used with _Complex to construct corresponding
complex types (unlike __float128, which acts more like a typedef name
than a keyword - thus, this patch may be considered to fix PR
c/32187).  The new suffixes can be combined with GNU "i" and "j"
suffixes for constants of complex types (e.g. 1.0if128, 2.0f64i).

The set of types supported is implementation-defined.  In this GCC
patch, _Float32 is SFmode if that is suitable; _Float32x and _Float64
are DFmode if that is suitable; _Float128 is TFmode if that is
suitable; _Float64x is XFmode if that is suitable, and otherwise
TFmode if that is suitable.  There is a target hook to override the
choices if necessary.  "Suitable" means both conforming to the
requirements of that type, and supported as a scalar type including in
libgcc.  The ABI is whatever the back end does for scalars of that
mode (but note that _Float32 is passed without promotion in variable
arguments, unlike float).  All the existing issues with exceptions and
rounding modes for existing types apply equally to the new type names.

No GCC port supports a floating-point format suitable for _Float128x.
Although there is HFmode support for ARM and AArch64, use of that for
_Float16 is not enabled.  Supporting _Float16 would require additional
work on the excess precision aspects of TS 18661-3: there are new
values of FLT_EVAL_METHOD, which are not currently supported in GCC,
and FLT_EVAL_METHOD == 0 now means that operations and constants on
types narrower than float are evaluated to the range and precision of
float.  Implementing that, so that _Float16 gets evaluated with excess
range and precision, would involve changes to the excess precision
infrastructure so that the _Float16 case is enabled by default, unlike
the x87 case which is only enabled for -fexcess-precision=standard.
Other differences between _Float16 and __fp16 would also need to be
disentangled.

GCC has some prior support for nonstandard floating-point types in the
form of __float80 and __float128.  Where these were previously types
distinct from long double, they are made by this patch into aliases
for _Float64x / _Float128 if those types have the required properties.

In principle the set of possible _FloatN types is infinite.  This
patch hardcodes the four such types for N <= 128, but with as much
code as possible using loops over types to minimize the number of
places with such hardcoding.  I don't think it's likely any further
such types will be of use in future (or indeed that formats suitable
for _Float128x will actually be implemented).  There is a corner case
that all _FloatN, for N >= 128 and a multiple of 32, should be treated
as keywords even when the corresponding type is not supported; I
intend to deal with that in a followup patch.

Tests are added for various functionality of the new types, mostly
using type-generic headers.  PowerPC maintainers should note that the
tests do not do anything regarding passing special options to enable
support for the types, either for the tests themselves or for the
corresponding effective-target tests.  Thus, to run the _Float128
tests on PowerPC, you will need to add such support, { dg-add-options
float128 } or similar and make sure it affects both the
effective-target tests and the tests themselves.  The complex
arithmetic support in libgcc will also be needed, as otherwise the
associated tests will fail.  (The same would apply to _Float16 on ARM
as well if that w

Re: [PATCH 2/2] Add selftests for fibonacci_heap

2016-07-19 Thread Martin Liška
Thank you for the nits, sending second version of the patch.

M.
From be12e60d4a2731cf4f1f68516e08f5bdb6c1ef77 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 12 Jul 2016 15:17:24 +0200
Subject: [PATCH 2/2] Add selftests for fibonacci_heap

gcc/ChangeLog:

2016-07-13  Martin Liska  

	* Makefile.in: Include fibonacci_heap.c
	* fibonacci_heap.c: New file.
	* fibonacci_heap.h (fibonacci_heap::insert): Use insert_node.
	(fibonacci_heap::union_with): Fix deletion of the second heap.
	* selftest-run-tests.c (selftest::run_tests): Incorporate
	fibonacci heap tests.
	* selftest.h: Declare fibonacci_heap_c_tests.
---
 gcc/Makefile.in  |   1 +
 gcc/fibonacci_heap.c | 290 +++
 gcc/fibonacci_heap.h |  37 --
 gcc/selftest-run-tests.c |   1 +
 gcc/selftest.h   |   1 +
 5 files changed, 321 insertions(+), 9 deletions(-)
 create mode 100644 gcc/fibonacci_heap.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 0786fa3..bfa467c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1259,6 +1259,7 @@ OBJS = \
 	explow.o \
 	expmed.o \
 	expr.o \
+	fibonacci_heap.o \
 	final.o \
 	fixed-value.o \
 	fold-const.o \
diff --git a/gcc/fibonacci_heap.c b/gcc/fibonacci_heap.c
new file mode 100644
index 000..afc8581
--- /dev/null
+++ b/gcc/fibonacci_heap.c
@@ -0,0 +1,290 @@
+/* Fibonacci heap for GNU compiler.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by Martin Liska 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "fibonacci_heap.h"
+#include "selftest.h"
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Selftests.  */
+
+/* Verify that operations with empty heap work.  */
+
+typedef fibonacci_node <int, int> int_heap_node_t;
+typedef fibonacci_heap <int, int> int_heap_t;
+
+static void
+test_empty_heap ()
+{
+  int_heap_t *h1 = new int_heap_t (INT_MIN);
+
+  ASSERT_TRUE (h1->empty ());
+  ASSERT_EQ (0, h1->nodes ());
+  ASSERT_EQ (NULL, h1->min ());
+
+  int_heap_t *h2 = new int_heap_t (INT_MIN);
+
+  int_heap_t *r = h1->union_with (h2);
+  ASSERT_TRUE (r->empty ());
+  ASSERT_EQ (0, r->nodes ());
+  ASSERT_EQ (NULL, r->min ());
+
+  delete r;
+}
+
+#define TEST_HEAP_N 100
+#define TEST_CALCULATE_VALUE(i)  ((3 * i) + 1)
+
+/* Verify heap basic operations.  */
+
+static void
+test_basic_heap_operations ()
+{
+  int values[TEST_HEAP_N];
+  int_heap_t *h1 = new int_heap_t (INT_MIN);
+
+  for (unsigned i = 0; i < TEST_HEAP_N; i++)
+{
+  values[i] = TEST_CALCULATE_VALUE (i);
+  ASSERT_EQ (i, h1->nodes ());
+  h1->insert (i, &values[i]);
+  ASSERT_EQ (0, h1->min_key ());
+  ASSERT_EQ (values[0], *h1->min ());
+}
+
+  for (unsigned i = 0; i < TEST_HEAP_N; i++)
+{
+  ASSERT_EQ (TEST_HEAP_N - i, h1->nodes ());
+  ASSERT_EQ ((int)i, h1->min_key ());
+  ASSERT_EQ (values[i], *h1->min ());
+
+  h1->extract_min ();
+}
+
+  ASSERT_TRUE (h1->empty ());
+
+  delete h1;
+}
+
+/* Builds a simple heap with values in interval 0..TEST_HEAP_N-1, where values
+   of each key is equal to 3 * key + 1.  BUFFER is used as a storage
+   of values and NODES points to inserted nodes.  */
+
+static int_heap_t *
+build_simple_heap (int *buffer, int_heap_node_t **nodes)
+{
+  int_heap_t *h = new int_heap_t (INT_MIN);
+
+  for (unsigned i = 0; i < TEST_HEAP_N; i++)
+{
+  buffer[i] = TEST_CALCULATE_VALUE (i);
+  nodes[i] = h->insert (i, &buffer[i]);
+}
+
+  return h;
+}
+
+/* Verify that fibonacci_heap::replace_key works.  */
+
+static void
+test_replace_key ()
+{
+  int values[TEST_HEAP_N];
+  int_heap_node_t *nodes[TEST_HEAP_N];
+
+  int_heap_t *heap = build_simple_heap (values, nodes);
+
+  int N = 10;
+  for (unsigned i = 0; i < (unsigned)N; i++)
+heap->replace_key (nodes[i], 100 * 1000 + i);
+
+  ASSERT_EQ (TEST_HEAP_N, heap->nodes ());
+  ASSERT_EQ (N, heap->min_key ());
+  ASSERT_EQ (TEST_CALCULATE_VALUE (N), *heap->min ());
+
+  for (int i = 0; i < TEST_HEAP_N - 1; i++)
+heap->extract_min ();
+
+  ASSERT_EQ (1, heap->nodes ());
+  ASSERT_EQ (100 * 1000 + N - 1, heap->min_key ());
+
+  delete heap;
+}
+
+/* Verify that heap can handle duplicate keys.  */
+
+static void
+test_duplicate_keys ()
+{
+  int values[3 * TEST_HEAP_N];
+  int_heap_t *heap = new int_heap_t (INT_MIN);
+
+  for (unsigned i = 0; i < 3 * TEST_HEAP_N; i++)
+

Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate concerns

2016-07-19 Thread Segher Boessenkool
On Mon, Jul 18, 2016 at 07:03:04PM +0200, Bernd Schmidt wrote:
> >>>+  /* The frequency of executing the prologue for this BB and all BBs
> >>>+ dominated by it.  */
> >>>+  gcov_type cost;
> >>
> >>Is this frequency consideration the only thing that attempts to prevent
> >>placing prologue insns into loops?
> >
> >Yes.  The algorithm makes sure the prologues are executed as infrequently
> >as possible.  If a block that would get a prologue has the same frequency
> >as a predecessor does, and that predecessor always has that first block as
> >eventual successor, the prologue is moved to the earlier block (this
> >handles the case where both have a frequency of zero, and other cases
> >where the range of freq is too limited).
> 
> Ugh, that is really scaring me. I'd much prefer a classification of 
> valid blocks based on cfg structure alone - I'll need serious convincing 
> that the frequency data is reliable enough for what you are trying to do.

But you need the profile to make even reasonably good decisions.

The standard example:

   1
  / \
 2   3
  \ /
   4
  / \
 5   6
  \ /
   7

where 3 and 6 need some prologue, the rest do not.
If freq(3) + freq(6) > freq(1), it is better to put the prologue at 1;
if not, it is better to place it at 3 and 6.

If you do not use the profile, you cannot do better than the status quo,
i.e. always place it at 1.

In the general case, you have the choice between putting the prologue at
some basic block X, or at certain blocks dominated by X.  This algorithm
chooses the case that has the prologue executed the least often in total,
and that is really all there is to it.
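
The choice described above can be sketched as a recursion over the dominator tree. This is only an illustration, not the code from the patch: `Block`, `best_cost` and `make_double_diamond` are hypothetical names, the frequencies are made up, and the real pass treats each prologue component separately and has to take the CFG edges into account as well.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a basic block.
struct Block {
  int64_t freq = 0;               // execution frequency of the block
  bool needs_prologue = false;    // the block itself requires the prologue
  std::vector<int> dom_children;  // blocks immediately dominated by this one
};

// Lowest total frequency at which the prologue has to run so that every
// block in the dominator subtree of B that needs it is covered.
int64_t best_cost(const std::vector<Block> &blocks, int b)
{
  const Block &blk = blocks[b];
  if (blk.needs_prologue)
    return blk.freq;
  int64_t below = 0;
  for (int child : blk.dom_children)
    below += best_cost(blocks, child);
  // Either one prologue at B, or separate ones further down the tree.
  return std::min(blk.freq, below);
}

// The double diamond from the example; index 0 is unused so that the
// indices match the block numbers in the picture.
std::vector<Block> make_double_diamond(int64_t freq3, int64_t freq6)
{
  std::vector<Block> b(8);
  b[1].freq = 100;  b[1].dom_children = {2, 3, 4};
  b[2].freq = 100 - freq3;
  b[3].freq = freq3;  b[3].needs_prologue = true;
  b[4].freq = 100;  b[4].dom_children = {5, 6, 7};
  b[5].freq = 100 - freq6;
  b[6].freq = freq6;  b[6].needs_prologue = true;
  b[7].freq = 100;
  return b;
}
```

With freq(1) = 100, freq(3) = 30 and freq(6) = 40 the sketch returns 70, i.e. prologues at 3 and 6; with freq(3) = 70 and freq(6) = 60 it returns 100, i.e. a single prologue at 1.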


Yes, our profile data sometimes is, uh, less than optimal.  But:
- All our other passes use it, too;
- What matters most here is comparing the execution frequency locally,
  and that is not usually messed up so badly;
- All our other passes use it, too;
- The important cases (loops, exceptional cases) normally have a pretty
  reasonable profile;
- All our other passes use it, too;
- Benchmarking shows big wins with this patch.


Segher


Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate concerns

2016-07-19 Thread Bernd Schmidt

On 07/19/2016 04:46 PM, Segher Boessenkool wrote:


> But you need the profile to make even reasonably good decisions.


I'm not worried about making cost decisions: as far as I'm concerned 
it's perfectly fine for that. I'm worried about correctness - you can't 
validly save registers inside a loop. So IMO there needs to be an 
additional cfg-based check that verifies whether the bb where we want to 
place parts of the prologue is guaranteed to be executed at most once.



Bernd


Re: [PATCH, DOC] Enhance documentation of -fipa-ra option.

2016-07-19 Thread Martin Liška
On 07/19/2016 03:46 PM, Alexander Monakov wrote:
>> So I decided to also mention this alias.
>>
>> Thoughts?
> 
> I'd like the new ipa-ra text to go in, but perhaps you should consider leaving
> out option aliases out of this patch, especially given that it's non-trivial, 
> as
> Andreas' comment has shown.
> 
> (I'd say it's rather confusing that -fprofile is quite different from
> -fprofile-generate; I doubt it's worthwhile to document this alias at all, 
> but I
> won't argue much either way)
> 
> Thanks.
> Alexander
> 

Thanks Andreas, you are right.

OK, so I'm sending a third version, which only enhances the documentation
of the -fipa-ra option.

Martin
>From 2894c36014d726fa0bfaebff261642d42550622c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 13 Jul 2016 18:25:09 +0200
Subject: [PATCH] Enhance documentation of -fipa-ra option.

gcc/ChangeLog:

2016-07-13  Martin Liska  

	* doc/invoke.texi (-fipa-ra): Document when the option is
	disabled. Fix a typo.
---
 gcc/doc/invoke.texi | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9a4db38..4435f54 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7260,7 +7260,11 @@ any called function.  In that case it is not necessary to save and restore
 them around calls.  This is only possible if called functions are part of
 same compilation unit as current function and they are compiled before it.
 
-Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
+Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}, however the option
+is disabled if generated code will be instrumented for profiling
+(@option{-p}, or @option{-pg}) or if callee's register usage cannot be known
+exactly (this happens on targets that do not expose prologues
+and epilogues in RTL).
 
 @item -fconserve-stack
 @opindex fconserve-stack
@@ -7280,7 +7284,7 @@ Perform code hoisting.  Code hoisting tries to move the
 evaluation of expressions executed on all paths to the function exit
 as early as possible.  This is especially useful as a code size
 optimization, but it often helps for code speed as well.
-This flag is enabled by defailt at @option{-O2} and higher.
+This flag is enabled by default at @option{-O2} and higher.
 
 @item -ftree-pre
 @opindex ftree-pre
-- 
2.9.0



Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-19 Thread Renlin Li

Hi Yuri,

I saw this test case runs on arm platforms, and maybe other platforms as 
well.


testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such file or directory


Before the change here, it was gated by the vect_simd_clones target selector,
which limits it to i?86/x86_64 platforms only.


Regards,
Renlin Li



On 08/07/16 15:07, Yuri Rumyantsev wrote:

Hi Richard,

Thanks for your help - your patch looks much better.
Here is a new patch in which an additional argument was added to determine
the source loop of the reference.

Bootstrap and regression testing did not show any new failures.

Is it OK for trunk?
ChangeLog:
2016-07-08  Yuri Rumyantsev  

PR tree-optimization/71734
* tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which
contains REF, use it to check safelen, assume that safelen value
must be greater 1, fix style.
(ref_indep_loop_p_2): Add REF_LOOP argument.
(ref_indep_loop_p): Pass LOOP as additional argument to
ref_indep_loop_p_2.
gcc/testsuite/ChangeLog:
 * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style.

2016-07-08 11:18 GMT+03:00 Richard Biener :

On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev  wrote:

I checked simd3.f90 and found out that my additional check rejects
independence of references

REF is independent in loop#3
.istart0.19, .iend0.20
which are defined in loop#1 which is outer for loop#3.
Note that these references are defined by
_103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20);
which is in loop#1.
It is clear that both these references can not be independent for loop#3.


Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops
of LOOP to catch memory references in those as well.  So the issue is really
that we look at the wrong loop for safelen and we _do_ want to apply safelen
to inner loops as well.

So better track the loop we are ultimately asking the question for, like in the
attached patch (fixes the testcase for me).

Richard.




2016-07-07 17:11 GMT+03:00 Richard Biener :

On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev  wrote:

I added this check because of new failures in the libgomp.fortran suite.
Here is a copy of Jakub's message:
--- Comment #29 from Jakub Jelinek  ---
The #c27 r237844 change looks bogus to me.
First of all, IMNSHO you can argue this way only if ref is a reference seen in
loop LOOP,


or inner loops of LOOP I guess.  I _think_ we never call ref_indep_loop_p_1 with
a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not make
sense to do that, it would be a waste of time).

So only if "or inner loops of LOOP" is not correct the check would be needed
but then my issue with unrolling an inner loop and turning a ref that safelen
does not apply to into a ref that it now applies to arises.

I don't fully get what Jakub is hinting at.

Can you install the safelen > 0 -> safelen > 1 fix please?  Jakub, can you
explain that bitmap check with a simple testcase?

Thanks,
Richard.


which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2
-fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - the
D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the outer loop
obviously can be dependent on many of the loads and/or stores in the loop, be
it "omp simd array" or not.
Say for
void
foo (int *p, int *q)
{
   #pragma omp simd
   for (int i = 0; i < 1024; i++)
 p[i] += q[0];
}
sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could write
something that changes its value, and then it would behave differently from
using VF = 1024, where everything is performed in parallel.
Though, actually, it can alias, just it would have to write the same value as
was there.  So, if this is used to determine if it is safe to hoist the load
before the loop, it is fine, if it is used to determine if &q[0] >= &p[0] &&
&q[0] <= &p[1023], then it is not fine.

For aliasing of q[0] and p[1023], I don't see why they couldn't alias in a
valid program.  #pragma omp simd I think guarantees that the last iteration is
executed last, it isn't necessarily executed last alone, it could be, or
together with one before last iteration, or (for simdlen INT_MAX) even all
iterations can be done concurrently, in hw or sw, so it is fine if it is
transformed into:
   int temp[1024], temp2[1024], temp3[1024];
   for (int i = 0; i < 1024; i++)
 temp[i] = p[i];
   for (int i = 0; i < 1024; i++)
 temp2[i] = q[0];
   /* The above two loops can be also swapped, or intermixed.  */
   for (int i = 0; i < 1024; i++)
 temp3[i] = temp[i] + temp2[i];
   for (int i = 0; i < 1024; i++)
 p[i] = temp3[i];
   /* Or the above loop reversed etc. */

If you have:
int
bar (int *p, int *q)
{
   q[0] = 0;
   #pragma omp simd
   for (int i = 0; i < 1024; i++)
 p[i]++;
   return q[0];
}
i.e. something similar to what misbehaves in simd3.f90 with the change, then
the answer is that q[0] isn't guaranteed to be independent of any references in
the simd loop.

2016-07

[PATCH 1/3][AArch64] Improve zero extend

2016-07-19 Thread Wilco Dijkstra
This patchset improves zero extend costs and code generation.

When zero extending a 32-bit register, we emit a "mov", but currently
report the cost of the "mov" incorrectly.

In terms of speed, we currently say the cost is that of an extend
operation. But the cost of a "mov" is the cost of 1 instruction, so fix
that.

In terms of size, we currently say that the "mov" takes 0 instructions.
Fix it by changing it to 1.

Bootstrapped and tested on aarch64-none-elf.

2016-07-19  Kristina Martsenko  

* config/aarch64/aarch64.c (aarch64_rtx_costs): Fix cost of zero extend.

---
 gcc/config/aarch64/aarch64.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d4c5665cf4d0b046a6129c35007fc2ae8265812f..bddffc3ab28cde3a996fd13c060de36227315fb5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6430,12 +6430,10 @@ cost_plus:
{
  int op_cost = rtx_cost (op0, VOIDmode, ZERO_EXTEND, 0, speed);
 
- if (!op_cost && speed)
-   /* MOV.  */
-   *cost += extra_cost->alu.extend;
- else
+ if (op_cost)
/* Free, the cost is that of the SI mode operation.  */
*cost = op_cost;
+ /* Otherwise MOV.  */
 
  return true;
}
-- 
2.1.4



[PATCH 2/3][AArch64] Improve zero extend

2016-07-19 Thread Wilco Dijkstra
When zero extending a 32-bit value to 64 bits, there should always be a
SET operation on the outside, according to the patterns in aarch64.md.
However, the mid-end can also ask for the cost of a made-up instruction,
where the zero-extend is part of another operation, not SET.

In this case we currently cost the zero extend operation as a uxtb/uxth.
Instead, cost it the same way we cost "normal" 32-to-64-bit zero
extends: as a "mov" or the cost of the inner operation.

Bootstrapped and tested on aarch64-none-elf.

2016-07-19  Kristina Martsenko  

* config/aarch64/aarch64.c (aarch64_rtx_costs): Fix cost of zero extend.

---
 gcc/config/aarch64/aarch64.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index bddffc3ab28cde3a996fd13c060de36227315fb5..a2621313d3278d39db0f1d5640b33201efefac21 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6421,12 +6421,11 @@ cost_plus:
 a 'w' register implicitly zeroes the upper bits of an 'x'
 register.  However, if this is
 
-  (set (reg) (zero_extend (reg)))
+  (zero_extend (reg))
 
 we must cost the explicit register move.  */
   if (mode == DImode
- && GET_MODE (op0) == SImode
- && outer == SET)
+ && GET_MODE (op0) == SImode)
{
  int op_cost = rtx_cost (op0, VOIDmode, ZERO_EXTEND, 0, speed);
 
-- 
2.1.4





[PATCH 3/3][AArch64] Improve zero extend

2016-07-19 Thread Wilco Dijkstra
On AArch64 the UXTB and UXTH instructions are aliases of UBFM,
which does a shift as part of its operation. An AND immediate is a
simpler operation, and might be faster on some implementations, so it is
better to emit this instead of UBFM.

Benchmarking showed no difference on implementations where UBFM has
the same performance as AND, and minor speedups across several
benchmarks on an implementation where UBFM is slower than AND.
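
To make the equivalence concrete, here is a small sketch that models both forms on 32-bit values. The helper names are hypothetical, and `ubfx` only models the shift-then-mask behaviour that UBFM has when it is used as a bitfield extract; it is not a full model of the instruction.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned bitfield extract: WIDTH bits of X starting at bit LSB.
// Requires 0 < width < 32 and lsb < 32.
uint32_t ubfx(uint32_t x, unsigned lsb, unsigned width)
{
  return (x >> lsb) & ((UINT32_C(1) << width) - 1);
}

// UXTB/UXTH expressed as a bitfield extract starting at bit 0.
uint32_t uxtb_via_ubfm(uint32_t x) { return ubfx(x, 0, 8); }
uint32_t uxth_via_ubfm(uint32_t x) { return ubfx(x, 0, 16); }

// The same zero extensions expressed as an AND with an immediate.
uint32_t uxtb_via_and(uint32_t x) { return x & 0xffu; }
uint32_t uxth_via_and(uint32_t x) { return x & 0xffffu; }
```

Both forms give the same result for every input; the AND form simply has no shift step.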

Bootstrapped and tested on aarch64-none-elf.

2016-07-19  Kristina Martsenko  
2016-07-19  Wilco Dijkstra  

* config/aarch64/aarch64.md
(zero_extend2_aarch64): Change output
statement and type.
(qihi2_aarch64): Likewise, and split into two.
(extendqihi2_aarch64): New.
(zero_extendqihi2_aarch64): New.
* config/aarch64/iterators.md (ldrxt): Remove.
* config/aarch64/aarch64.c (aarch64_rtx_costs): Change cost of
uxtb/uxth.
---

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c7249e8e98905bea4879bb2e2ee81d51a1004faa..e98e41521bfa8f807248b0147843de9e1f3447e3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6886,8 +6886,8 @@ cost_plus:
}
  else
{
- /* UXTB/UXTH.  */
- *cost += extra_cost->alu.extend;
+ /* We generate an AND instead of UXTB/UXTH.  */
+ *cost += extra_cost->alu.logical;
}
}
   return false;
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 64f9ca1c4d1bec64cef769c9dbef9e4b5b00ba9e..5e8b1a815515eabc7e69c75574c2c300f50a6fe4 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1580,10 +1580,10 @@
 (zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m,m")))]
   ""
   "@
-   uxt\t%0, %w1
+   and\t%0, %1, 
ldr\t%w0, %1
ldr\t%0, %1"
-  [(set_attr "type" "extend,load1,load1")]
+  [(set_attr "type" "logic_imm,load1,load1")]
 )
 
 (define_expand "qihi2"
@@ -1592,16 +1592,26 @@
   ""
 )
 
-(define_insn "*qihi2_aarch64"
+(define_insn "*extendqihi2_aarch64"
   [(set (match_operand:HI 0 "register_operand" "=r,r")
-(ANY_EXTEND:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
+   (sign_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
   ""
   "@
-   xtb\t%w0, %w1
-   b\t%w0, %1"
+   sxtb\t%w0, %w1
+   ldrsb\t%w0, %1"
   [(set_attr "type" "extend,load1")]
 )
 
+(define_insn "*zero_extendqihi2_aarch64"
+  [(set (match_operand:HI 0 "register_operand" "=r,r")
+   (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
+  ""
+  "@
+   and\t%w0, %w1, 255
+   ldrb\t%w0, %1"
+  [(set_attr "type" "logic_imm,load1")]
+)
+
 ;; ---
 ;; Simple arithmetic
 ;; ---
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index e8fbb1281dec2e8f37f58ef2ced792dd62e3b5aa..ef48ffda6f98a2d4aa29daaca206fef2bafcda48 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -888,9 +888,6 @@
 ;; Similar, but when not(op)
 (define_code_attr nlogical [(and "bic") (ior "orn") (xor "eon")])
 
-;; Sign- or zero-extending load
-(define_code_attr ldrxt [(sign_extend "ldrs") (zero_extend "ldr")])
-
 ;; Sign- or zero-extending data-op
 (define_code_attr su [(sign_extend "s") (zero_extend "u")
  (sign_extract "s") (zero_extract "u")



Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate concerns

2016-07-19 Thread Segher Boessenkool
On Tue, Jul 19, 2016 at 04:49:26PM +0200, Bernd Schmidt wrote:
> >But you need the profile to make even reasonably good decisions.
> 
> I'm not worried about making cost decisions: as far as I'm concerned 
> it's perfectly fine for that. I'm worried about correctness - you can't 
> validly save registers inside a loop.

Of course you can.  It needs to be paired with a restore; and we do
that just fine.

Pretty much *all* implementations in the literature do this, fwiw.

> So IMO there needs to be an 
> additional cfg-based check that verifies whether the bb where we want to 
> place parts of the prologue is guaranteed to be executed at most once.

That is equivalent to not doing this optimisation *at all*.


Segher


Re: [PATCH] simplify-rtx.c: start adding selftests (v2)

2016-07-19 Thread David Malcolm
On Wed, 2016-07-13 at 13:38 -0600, Jeff Law wrote:
> On 07/06/2016 01:30 PM, David Malcolm wrote:
> > > 
> > > This might be a bit confusing when more tests are added, since
> > > pointer equality is only useful in certain specific cases (e.g.
> > > when you know you're dealing with CONST_INTs or pseudo
> > > registers).
> > > How about making ASSERT_RTX_EQ check for rtx_equal_p equality and
> > > have something like ASSERT_RTX_PTR_EQ for cases where pointer
> > > equality really is needed?
> > 
> > > Also, how about using LAST_VIRTUAL_REGISTER + 1 as the base for
> > > register numbers?  DImode might not be valid for register 0 on
> > > all
> > > targets.
> > 
> > Thanks.  Here's an updated version which adds both ASSERT_RTX_EQ
> > and
> > ASSERT_RTX_PTR_EQ.  The simplify-rtx.c tests can use the stricter
> > pointer equality test, so I updated them to use ASSERT_RTX_PTR_EQ
> > condition.
> Richard S. is definitely right here WRT using pointer equality vs
> deeper 
> inspection.   The RTL structure sharing rules within GCC are
> something 
> you have to be cognizant of here.  Thankfully the RTL structure
> sharing 
> is reasonably well documented.
> > 
> > I added a selftest::make_test_reg to allocate pseudo regs, starting
> > at LAST_VIRTUAL_REGISTER + 1.
> Also the right thing to do.  There's hard registers, then virtuals,
> then 
> pseudos.
> 
> There's obviously much more we could do with the tests, but this is a
> reasonable start.
> 
> I note you iterate over all the modes -- which would include things
> like 
> FP modes, BImode, and CC special modes and such.  I don't think we
> can 
> necessarily be sure that the transformations you're testing are valid
> across all modes.
> 
> For example, does it even make sense to test (A & B) | A -> A for FP
> modes?
> 
> It almost seems like the iteration space has to be dependent on what 
> you're testing.  Ie, some tests you want to iterate over the standard
> integer modes.  Other tests you might reasonably include FP modes. 
>  CC 
> modes I think should be forbidden for these tests.  THere may be
> others 
> to ponder as well.

Thanks.  My thinking here is to have the iteration over all modes, and
then filter it within the tests, with something like this at the top of
a test:

  if (!INTEGRAL_MODE_P (mode))
 return;

That said, and this is probably my relative unfamiliarity with RTL
speaking, but I'm not sure exactly how the modes should be filtered.

For example, my naive "run in all modes" approach ran into the issue
that although (A + 0) -> A works, (0 + A) -> A currently doesn't work
for complex modes (where "0" is CONST0_RTX(mode)).  Is that a bug?  If
so, since the run-in-all-modes approach uncovered it, was the testing
useful?

If that's the case, would it make sense to have some kind of
NORMAL_MODE_P (mode) filter, say, or conversely SPECIAL_MODE_P (mode),
and do an early-reject in the loop over all modes?  (to reject CCmode,
VOIDmode, BImode, I think; any others?).

Dave



Re: [PATCH 1/3][AArch64] Improve zero extend

2016-07-19 Thread Richard Earnshaw (lists)
On 19/07/16 16:30, Wilco Dijkstra wrote:
> This patchset improves zero extend costs and code generation.
> 
> When zero extending a 32-bit register, we emit a "mov", but currently
> report the cost of the "mov" incorrectly.
> 
> In terms of speed, we currently say the cost is that of an extend
> operation. But the cost of a "mov" is the cost of 1 instruction, so fix
> that.
> 
> In terms of size, we currently say that the "mov" takes 0 instructions.
> Fix it by changing it to 1.
> 
> Bootstrapped and tested on aarch64-none-elf.
> 
> 2016-07-19  Kristina Martsenko  
> 
>   * config/aarch64/aarch64.c (aarch64_rtx_costs): Fix cost of zero extend.
> 
> ---
>  gcc/config/aarch64/aarch64.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> d4c5665cf4d0b046a6129c35007fc2ae8265812f..bddffc3ab28cde3a996fd13c060de36227315fb5
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -6430,12 +6430,10 @@ cost_plus:
>   {
> int op_cost = rtx_cost (op0, VOIDmode, ZERO_EXTEND, 0, speed);
>  
> -   if (!op_cost && speed)
> - /* MOV.  */
> - *cost += extra_cost->alu.extend;
> -   else
> +   if (op_cost)
>   /* Free, the cost is that of the SI mode operation.  */
>   *cost = op_cost;
> +   /* Otherwise MOV.  */

I don't think the comments help explain the logic here.  I think it
would be better to write something like:

/* If OP_COST is non-zero, then the cost of the zero extend
   is effectively the cost of the inner operation.  Otherwise
   we have a MOV instruction and we take the cost from the MOV
   itself.  This is true independently of whether we are
   optimizing for space or time.  */
if (op_cost)
...

OK with that change.

R.

>  
> return true;
>   }
> 



Re: [PATCH] selftest.c: gracefully handle NULL in assert_streq

2016-07-19 Thread Jeff Law

On 07/19/2016 07:04 AM, David Malcolm wrote:

If a NULL is passed in as the expected or actual value for an
ASSERT_STREQ, the call to strcmp within selftest::assert_streq
can segfault, leading to a failure of -fself-test without
indicating which test failed.

Handle this more gracefully by checking for NULL, so that
information on the failing test is printed to stderr if this
occurs.
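
The NULL check can be sketched as a standalone helper. This is not the code from the patch: `streq_or_both_null` is a hypothetical name, and treating two NULLs as equal is an assumption made for the sketch, since the message does not say how that case is reported.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when both strings compare equal, or when both
// pointers are NULL.  strcmp is only reached with two non-NULL arguments.
bool streq_or_both_null(const char *expected, const char *actual)
{
  if (expected == nullptr || actual == nullptr)
    return expected == actual;
  return std::strcmp(expected, actual) == 0;
}
```

A caller can then print a failure message for the mismatch case instead of crashing inside strcmp.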

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
I also manually tested the various kinds of failure of
ASSERT_STREQ, and verified that each branch prints a sane
failure message before aborting.

OK for trunk?

gcc/ChangeLog:
* selftest.c (selftest::assert_streq): Handle NULL values of
val_actual and val_expected.

OK.
jeff



Commit: M32R: Build crtinit.o and crtfini.o

2016-07-19 Thread Nick Clifton
Hi Guys,

  I am applying the patch below to fix a long standing snafu for the
  m32r target where the files crtinit.o and crtfini.o were not being
  built along with the rest of libgcc.

  Tested with no regressions and a lot of test case fixes using an
  m32r-elf toolchain.

Cheers
  Nick

libgcc/ChangeLog
2016-07-19  Nick Clifton  

* config.host (m32r): Add m32r/t-m32r to tmake_file.
Add crtinit.o and crtfini.o to extra_parts.

Index: libgcc/config.host
===
--- libgcc/config.host  (revision 238477)
+++ libgcc/config.host  (working copy)
@@ -787,7 +787,8 @@
 tmake_file="lm32/t-lm32 lm32/t-uclinux t-libgcc-pic t-softfp-sfdf t-softfp"
;;  
 m32r-*-elf*)
-   tmake_file=t-fdpbit
+   tmake_file="$tmake_file m32r/t-m32r t-fdpbit"
+   extra_parts="$extra_parts crtinit.o crtfini.o"
;;
 m32rle-*-elf*)
tmake_file=t-fdpbit


Re: [PATCH 2/2] Add selftests for fibonacci_heap

2016-07-19 Thread Jeff Law

On 07/19/2016 08:31 AM, Martin Liška wrote:

Thank you for the nits, sending second version of the patch.

M.


0002-Add-selftests-for-fibonacci_heap-v2.patch


From be12e60d4a2731cf4f1f68516e08f5bdb6c1ef77 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 12 Jul 2016 15:17:24 +0200
Subject: [PATCH 2/2] Add selftests for fibonacci_heap

gcc/ChangeLog:

2016-07-13  Martin Liska  

* Makefile.in: Include fibonacci_heap.c
* fibonacci_heap.c: New file.
* fibonacci_heap.h (fibonacci_heap::insert): Use insert_node.
(fibonacci_heap::union_with): Fix deletion of the second heap.
* selftest-run-tests.c (selftest::run_tests): Incorporate
fibonacci heap tests.
* selftest.h: Declare fibonacci_heap_c_tests.

OK.
jeff



Re: [PATCH 1/2] Add sreal to selftests

2016-07-19 Thread Jeff Law

On 07/11/2016 08:34 AM, marxin wrote:

gcc/ChangeLog:

2016-07-12  Martin Liska  

* selftest-run-tests.c (selftest::run_tests): New function.
* selftest.h (sreal_c_tests): Declare.
* sreal.c (sreal_verify_basics): New function.
(verify_aritmetics): Likewise.
(sreal_verify_arithmetics): Likewise.
(verify_shifting): Likewise.
(sreal_verify_shifting): Likewise.
(void sreal_c_tests): Likewise.

gcc/testsuite/ChangeLog:

2016-07-12  Martin Liska  

* gcc.dg/plugin/plugin.exp: Remove sreal test.
* gcc.dg/plugin/sreal-test-1.c: Remove.
* gcc.dg/plugin/sreal_plugin.c: Remove.
---
 gcc/selftest-run-tests.c   |   1 +
 gcc/selftest.h |   1 +
 gcc/sreal.c| 112 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp |   1 -
 gcc/testsuite/gcc.dg/plugin/sreal-test-1.c |   8 --
 gcc/testsuite/gcc.dg/plugin/sreal_plugin.c | 170 -
 6 files changed, 114 insertions(+), 179 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/plugin/sreal-test-1.c
 delete mode 100644 gcc/testsuite/gcc.dg/plugin/sreal_plugin.c

diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index bddf0b2..bb004cc 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -49,6 +49,7 @@ selftest::run_tests ()
   pretty_print_c_tests ();
   wide_int_cc_tests ();
   ggc_tests_c_tests ();
+  sreal_c_tests ();

   /* Mid-level data structures.  */
   input_c_tests ();
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 967e76b..c805386 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -86,6 +86,7 @@ extern void pretty_print_c_tests ();
 extern void rtl_tests_c_tests ();
 extern void spellcheck_c_tests ();
 extern void spellcheck_tree_c_tests ();
+extern void sreal_c_tests ();
 extern void tree_c_tests ();
 extern void tree_cfg_c_tests ();
 extern void vec_c_tests ();
diff --git a/gcc/sreal.c b/gcc/sreal.c
index a7c9c12..9c43b4e 100644
--- a/gcc/sreal.c
+++ b/gcc/sreal.c
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
 #include 
 #include "coretypes.h"
 #include "sreal.h"
+#include "selftest.h"

 /* Print the content of struct sreal.  */

@@ -233,3 +234,114 @@ sreal::operator/ (const sreal &other) const
   r.normalize ();
   return r;
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Selftests for sreals.  */
+
+/* Verify basic sreal operations.  */
+
+static void
+sreal_verify_basics (void)
+{
+  sreal minimum = INT_MIN;
+  sreal maximum = INT_MAX;
+
+  sreal seven = 7;
+  sreal minus_two = -2;
+  sreal minus_nine = -9;
+
+  ASSERT_EQ (INT_MIN, minimum.to_int ());
+  ASSERT_EQ (INT_MAX, maximum.to_int ());
+
+  ASSERT_FALSE (minus_two < minus_two);
+  ASSERT_FALSE (seven < seven);
+  ASSERT_TRUE (seven > minus_two);
+  ASSERT_TRUE (minus_two < seven);
+  ASSERT_TRUE (minus_two != seven);
+  ASSERT_EQ (minus_two, -2);
+  ASSERT_EQ (seven, 7);
+  ASSERT_EQ ((seven << 10) >> 10, 7);
+  ASSERT_EQ (seven + minus_nine, -2);
+}
+
+/* Helper function that performs basic arithmetics and comparison
+   of given arguments A and B.  */
+
+static void
+verify_aritmetics (int64_t a, int64_t b)

arithmetics rather than aritmetics?


OK with that change.

jeff



Re: [PATCH GCC]Improve no-overflow check in SCEV using value range info.

2016-07-19 Thread Bin.Cheng
On Tue, Jul 19, 2016 at 1:10 PM, Richard Biener
 wrote:
> On Mon, Jul 18, 2016 at 6:27 PM, Bin Cheng  wrote:
>> Hi,
>> Scalar evolution needs to prove no-overflow for source variable when 
>> handling type conversion.  This is important because otherwise we would fail 
>> to recognize result of the conversion as SCEV, resulting in missing loop 
>> optimizations.  Take case added by this patch as an example, the loop can't 
>> be distributed as memset call because address of memory reference is not 
>> recognized.  At the moment, we rely on type overflow semantics and loop 
>> niter info for no-overflow checking, unfortunately that's not enough.  This 
>> patch introduces new method checking no-overflow using value range 
>> information.  As commented in the patch, value range can only be used when 
>> source operand variable evaluates on every loop iteration, rather than 
>> guarded by some conditions.
>>
>> This together with patch improving loop niter analysis 
>> (https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00736.html) can help various 
>> loop passes like vectorization.
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> @@ -3187,7 +3187,8 @@ idx_infer_loop_bounds (tree base, tree *idx, void *dta)
>/* If access is not executed on every iteration, we must ensure that 
> overlow
>   may not make the access valid later.  */
>if (!dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (data->stmt))
> -  && scev_probably_wraps_p (initial_condition_in_loop_num (ev, 
> loop->num),
> +  && scev_probably_wraps_p (NULL,
>
> use NULL_TREE for the null pointer constant of tree.
>
> +  /* Check if VAR evaluates in every loop iteration.  */
> +  gimple *def;
> +  if ((def = SSA_NAME_DEF_STMT (var)) != NULL
>
> def is never NULL but it might be a GIMPLE_NOP which has a NULL gimple_bb.
> Better check for ! SSA_DEFAULT_DEF_P (var)
>
> +  if (TREE_CODE (step) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (var)))
> +return false;
>
> this looks like a cheaper test so please do that first.
>
> +  step_wi = step;
> +  type = TREE_TYPE (var);
> +  if (tree_int_cst_sign_bit (step))
> +{
> +  diff = lower_bound_in_type (type, type);
> +  diff = minv - diff;
> +  step_wi = - step_wi;
> +}
> +  else
> +{
> +  diff = upper_bound_in_type (type, type);
> +  diff = diff - maxv;
> +}
>
> this lacks a comment - it's not obvious to me what the gymnastics
> with lower/upper_bound_in_type are supposed to achieve.

Thanks for reviewing, I will prepare another version of patch.
>
> As VRP uses niter analysis itself I wonder how this fires back-to-back between
I am not sure I understood the question.  If the VRP
information comes from loop niter, I think it will not change loop
niter or VRP2 in turn, because that is the best information we had in
the first place in niter.  If the VRP information comes from other
places (guard conditions?), SCEV and loop niter after VRP1 might be
improved, and thus VRP2.  There should be no problem in either case,
as long as GCC breaks the recursive chain among niter/scev/vrp
correctly.
> VRP1 and VRP2?  If the def of var dominates the latch isn't it enough to do
> a + 1 to check whether VRP bumped the range up to INT_MAX/MIN?  That is,
> why do we need to add step if not for the TYPE_OVERFLOW_UNDEFINED case
> of VRP handling the ranges optimistically?
Again, please correct me if I misunderstood.  Consider a variable
whose type is unsigned int and whose scev is {0, 4}_loop: the value range
could be computed as [0, 0xfffc], thus MAX + 1 is smaller than
type_MAX, but the scev could still overflow.
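
The difference between checking MAX + 1 and MAX + step can be shown with a small sketch. It assumes a 32-bit unsigned type, a step of 4 and a range upper bound of 0xfffffffc; those concrete values and the helper name `add_wraps_u32` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// True if MAX + STEP does not fit in 32-bit unsigned arithmetic, i.e. the
// next value of the evolution would wrap around.
bool add_wraps_u32(uint32_t max, uint32_t step)
{
  return max > UINT32_MAX - step;
}
```

With an upper bound of 0xfffffffc, adding 1 does not wrap but adding the step of 4 does, so the check has to use the step.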

Thanks,
bin


Re: [RFC][IPA-VRP] Early VRP Implementation

2016-07-19 Thread Jeff Law

On 07/14/2016 10:52 PM, Andrew Pinski wrote:

On Thu, Jul 14, 2016 at 9:45 PM, kugan
 wrote:


Hi,



This patch adds a very simple early VRP implementation.  It visits the
basic blocks in dominance order and sets the value ranges (VR) for
SSA_NAMEs in the scope, uses these VRs to discover more VRs, and restores
the old VRs once the scope is exited.
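
The set-then-restore scheme can be sketched with an undo stack over a dominator-tree walk. This is a minimal illustration under assumptions, not the patch itself: `Range`, `DomBlock`, `ScopedRanges` and `walk` are hypothetical names, and ranges are keyed by plain strings rather than by SSA_NAMEs.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Range { long lo, hi; };

// Hypothetical block: ranges learned on entry plus dominator-tree children.
struct DomBlock {
  std::vector<std::pair<std::string, Range>> learned;
  std::vector<int> dom_children;
};

class ScopedRanges {
 public:
  // Current range of NAME; unknown names have the full range.
  Range get(const std::string &name) const
  {
    auto it = current_.find(name);
    return it == current_.end() ? Range{LONG_MIN, LONG_MAX} : it->second;
  }

  // Narrow NAME to its intersection with R, remembering the old range.
  void narrow(const std::string &name, Range r)
  {
    Range old = get(name);
    undo_.push_back({name, old});
    current_[name] = Range{std::max(old.lo, r.lo), std::min(old.hi, r.hi)};
  }

  std::size_t mark() const { return undo_.size(); }

  // Undo everything recorded since position M, newest first.
  void restore(std::size_t m)
  {
    while (undo_.size() > m)
      {
        current_[undo_.back().first] = undo_.back().second;
        undo_.pop_back();
      }
  }

 private:
  std::map<std::string, Range> current_;
  std::vector<std::pair<std::string, Range>> undo_;
};

// Visit blocks in dominance order.  SEEN[b] records the range of NAME
// while block B is being processed.
void walk(const std::vector<DomBlock> &blocks, int b, ScopedRanges &ranges,
          const std::string &name, std::vector<Range> &seen)
{
  std::size_t m = ranges.mark();
  for (const auto &l : blocks[b].learned)
    ranges.narrow(l.first, l.second);
  seen[b] = ranges.get(name);
  for (int child : blocks[b].dom_children)
    walk(blocks, child, ranges, name, seen);
  ranges.restore(m);  // leaving the scope of block B
}
```

A range learned in one block is visible in the blocks it dominates and disappears again when the walk leaves that block, so sibling blocks do not see it.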



Why separate out early VRP from tree-vrp?  Just a little curious.
I wouldn't mind seeing tree-vrp broken down a little -- it's quite large 
and there's at least 4 distinct things going on in that file.


1. ASSERT_EXPR handling.

2. Arithmetic on ranges

3. Propagation engine setup, callbacks, etc

4. Range management

There may be others, but it seems at least some of that ought to be 
factored out.


Jeff


[PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-19 Thread Bernd Edlinger
Hi!

As discussed at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71876,
we have a _very_ old hack in gcc, that recognizes certain functions by
name, and inserts in some cases unsafe attributes, that don't work for
a freestanding environment.

It is unsafe to return ECF_MAY_BE_ALLOCA, ECF_LEAF and ECF_NORETURN
from special_function_p, just by the name of the function, especially
for less well known functions, like "getcontext" or "savectx", which
could easily be used for something completely different.

Moreover, from the backend library we cannot check flag_hosted, or if
the function has "C" or "C++" binding.

So all these functions would get the leaf attribute, which makes it very
easy to construct some kind of wrong code examples.

This patch removes the unsafe flags, and adds them to built-in
functions instead.

The patch removes also support for some completely unknown function
names from the middle-end, because these functions were apparently in
use at some point in time, but are certainly dead by now.

It is however not possible to remove the special handling by name
altogether, because glibc still does not add the returns_twice function
attribute to _setjmp, __sigsetjmp and getcontext; a glibc
BZ is filed at: https://sourceware.org/bugzilla/show_bug.cgi?id=20382

Without the returns_twice attribute we would lose the -Wclobbered
warning, and some targets (sparc, ia64, maybe others too) would even
generate wrong code.
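
As a minimal sketch of what "returning twice" means for the compiler: the example below uses the standard setjmp/longjmp pair, and additionally shows how a declaration can carry the GCC attribute explicitly.  The names my_save_context, jump_back and count_returns are hypothetical and only serve the illustration; my_save_context is declared but never defined or called.

```c
#include <setjmp.h>

/* Hypothetical declaration showing the attribute spelled out; a libc
   header could annotate its own setjmp-like functions this way.  */
extern int my_save_context (void *ctx) __attribute__ ((returns_twice));

static jmp_buf env;

/* Jump back to the setjmp call in count_returns.  */
static void
jump_back (int value)
{
  longjmp (env, value);
}

/* Count how often the setjmp call returns: once directly and once
   through longjmp.  */
int
count_returns (void)
{
  /* volatile, so the value is well defined after the second return.  */
  volatile int returns = 0;

  if (setjmp (env) == 0)
    {
      returns++;
      jump_back (1);
    }
  else
    returns++;

  return returns;
}
```

Without the volatile qualifier the local would be exactly the kind of variable that -Wclobbered is meant to warn about.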


Boot-strapped and reg-tested on x86_64-pc-linux-gnu.
OK for trunk?


Thanks
Bernd.

2016-07-19  Bernd Edlinger  

PR middle-end/71876
* builtin-attrs.def (ATTR_RT_NOTHROW_LEAF_LIST): New return twice
attribute.
* builtins.def (BUILT_IN_GETCONTEXT, BUILT_IN_VFORK): New built-in
using ATTR_RT_NOTHROW_LEAF_LIST.
(BUILT_IN_SETJMP): Use ATTR_RT_NOTHROW_LEAF_LIST here.
* calls.c (special_function_p): Remove special handling of "alloca"
by name, as well as "setjmp_syscall", "savectx", "qsetjmp" and the
prefixes "__x" and "__builtin_".  Remove potentially unsafe ECF_LEAF
and ECF_NORETURN from here, use attributes of built-in instead.
Index: gcc/builtin-attrs.def
===
--- gcc/builtin-attrs.def	(revision 238382)
+++ gcc/builtin-attrs.def	(working copy)
@@ -131,6 +131,8 @@ DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LIST, AT
 			ATTR_NULL, ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\
 			ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
+DEF_ATTR_TREE_LIST (ATTR_RT_NOTHROW_LEAF_LIST, ATTR_RETURNS_TWICE,\
+			ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
 DEF_ATTR_TREE_LIST (ATTR_COLD_NOTHROW_LEAF_LIST, ATTR_COLD,\
 			ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
 DEF_ATTR_TREE_LIST (ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST, ATTR_COLD,\
Index: gcc/builtins.def
===
--- gcc/builtins.def	(revision 238382)
+++ gcc/builtins.def	(working copy)
@@ -796,6 +796,7 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_FINITED32, "finit
 DEF_EXT_LIB_BUILTIN(BUILT_IN_FINITED64, "finited64", BT_FN_INT_DFLOAT64, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_FINITED128, "finited128", BT_FN_INT_DFLOAT128, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_FPCLASSIFY, "fpclassify", BT_FN_INT_INT_INT_INT_INT_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_GETCONTEXT, "getcontext", BT_FN_INT_PTR, ATTR_RT_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_ISFINITE, "isfinite", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_GCC_BUILTIN(BUILT_IN_ISINF_SIGN, "isinf_sign", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_ISINF, "isinf", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC)
@@ -837,7 +838,7 @@ DEF_LIB_BUILTIN(BUILT_IN_REALLOC, "realloc
 DEF_GCC_BUILTIN(BUILT_IN_RETURN, "return", BT_FN_VOID_PTR, ATTR_NORETURN_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_RETURN_ADDRESS, "return_address", BT_FN_PTR_UINT, ATTR_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_SAVEREGS, "saveregs", BT_FN_PTR_VAR, ATTR_NULL)
-DEF_GCC_BUILTIN(BUILT_IN_SETJMP, "setjmp", BT_FN_INT_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_SETJMP, "setjmp", BT_FN_INT_PTR, ATTR_RT_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN(BUILT_IN_STRFMON, "strfmon", BT_FN_SSIZE_STRING_SIZE_CONST_STRING_VAR, ATTR_FORMAT_STRFMON_NOTHROW_3_4)
 DEF_LIB_BUILTIN(BUILT_IN_STRFTIME, "strftime", BT_FN_SIZE_STRING_SIZE_CONST_STRING_CONST_PTR, ATTR_FORMAT_STRFTIME_NOTHROW_3_0)
 DEF_GCC_BUILTIN(BUILT_IN_TRAP, "trap", BT_FN_VOID, ATTR_NORETURN_NOTHROW_LEAF_LIST)
@@ -849,6 +850,7 @@ DEF_GCC_BUILTIN(BUILT_IN_VA_END, "va_end",
 DEF_GCC_BUILTIN(BUILT_IN_VA_START, "va_start", BT_FN_VOID_VALIST_REF_VAR, ATTR_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN(BUILT_IN_VA_ARG_PACK, "va_arg_pack", BT_FN_I

Re: [PATCH]: Use HOST_WIDE_INT_{,M}1{,U} some more

2016-07-19 Thread Uros Bizjak
On Tue, Jul 19, 2016 at 2:58 PM, Jakub Jelinek  wrote:
> On Tue, Jul 19, 2016 at 02:46:46PM +0200, Uros Bizjak wrote:
>> The result of exercises with sed in gcc/ directory.
>>
>> 2016-07-19  Uros Bizjak  
>>
>> * builtins.c: Use HOST_WIDE_INT_1 instead of (HOST_WIDE_INT) 1,
>> HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1,
>> HOST_WIDE_INT_M1 instead of (HOST_WIDE_INT) -1 and
>> HOST_WIDE_INT_M1U instead of (unsigned HOST_WIDE_INT) -1.
>> * combine.c: Ditto.
>> * cse.c: Ditto.
>> * dojump.c: Ditto.
>> * double-int.c: Ditto.
>> * dse.c: Ditto.
>> * dwarf2out.c: Ditto.
>> * expmed.c: Ditto.
>> * expr.c: Ditto.
>> * fold-const.c: Ditto.
>> * function.c: Ditto.
>> * fwprop.c: Ditto.
>> * genmodes.c: Ditto.
>> * hwint.c: Ditto.
>> * hwint.h: Ditto.
>> * ifcvt.c: Ditto.
>> * loop-doloop.c: Ditto.
>> * loop-invariant.c: Ditto.
>> * loop-iv.c: Ditto.
>> * match.pd: Ditto.
>> * optabs.c: Ditto.
>> * real.c: Ditto.
>> * reload.c: Ditto.
>> * rtlanal.c: Ditto.
>> * simplify-rtx.c: Ditto.
>> * stor-layout.c: Ditto.
>> * toplev.c: Ditto.
>> * tree-ssa-loop-ivopts.c: Ditto.
>> * tree-vect-generic.c: Ditto.
>> * tree-vect-patterns.c: Ditto.
>> * tree.c: Ditto.
>> * tree.h: Ditto.
>> * ubsan.c: Ditto.
>> * varasm.c: Ditto.
>> * wide-int-print.cc: Ditto.
>> * wide-int.cc: Ditto.
>> * wide-int.h: Ditto.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>>
>> OK for mainline?
>
>> @@ -546,7 +546,7 @@ div_and_round_double (unsigned code, int uns,
>>if (quo_neg && (*lrem != 0 || *hrem != 0))   /* ratio < 0 && rem != 0 */
>>   {
>> /* quo = quo - 1;  */
>> -   add_double (*lquo, *hquo, (HOST_WIDE_INT) -1, (HOST_WIDE_INT)  -1,
>> +   add_double (*lquo, *hquo, HOST_WIDE_INT_M1, (HOST_WIDE_INT)  -1,
>> lquo, hquo);
>>   }
>>else
>
> This surely should be
>   add_double (*lquo, *hquo, HOST_WIDE_INT_M1, HOST_WIDE_INT_M1,
>   lquo, hquo);

Thanks, the replacement regexp missed the second one due to the extra space. Fixed.

>> @@ -557,7 +557,7 @@ div_and_round_double (unsigned code, int uns,
>>  case CEIL_MOD_EXPR:  /* round toward positive infinity */
>>if (!quo_neg && (*lrem != 0 || *hrem != 0))  /* ratio > 0 && rem != 0 */
>>   {
>> -   add_double (*lquo, *hquo, (HOST_WIDE_INT) 1, (HOST_WIDE_INT) 0,
>> +   add_double (*lquo, *hquo, HOST_WIDE_INT_1, (HOST_WIDE_INT) 0,
>> lquo, hquo);
>>   }
>>else
>
> Dunno here, either just use 0 instead of (HOST_WIDE_INT) 0, or define
> HOST_WIDE_INT_0 macro and use that?  Though as add_double is a macro
> that in the end calls a function with UHWI or HWI arguments, I wonder what
> is the point in all these casts, whether just using -1, -1, or 1, 0, etc.
> wouldn't be better.

I didn't want to add more complex cases (e.g. ~(unsigned
HOST_WIDE_INT) 0) to the patch, just the trivial replacements. I will
submit a follow-up patch that converts these cases. The patch will be
much shorter, but there are some cases that are non-trivial.

>> @@ -590,10 +590,10 @@ div_and_round_double (unsigned code, int uns,
>>   if (quo_neg)
>> /* quo = quo - 1;  */
>> add_double (*lquo, *hquo,
>> -   (HOST_WIDE_INT) -1, (HOST_WIDE_INT) -1, lquo, hquo);
>> +   HOST_WIDE_INT_M1, HOST_WIDE_INT_M1, lquo, hquo);
>>   else
>> /* quo = quo + 1; */
>> -   add_double (*lquo, *hquo, (HOST_WIDE_INT) 1, (HOST_WIDE_INT) 0,
>> +   add_double (*lquo, *hquo, HOST_WIDE_INT_1, (HOST_WIDE_INT) 0,
>> lquo, hquo);
>> }
>>   else
>
>> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
>> index f5c530a..4354b5b 100644
>> --- a/gcc/simplify-rtx.c
>> +++ b/gcc/simplify-rtx.c
>> @@ -40,7 +40,7 @@ along with GCC; see the file COPYING3.  If not see
>> occasionally need to sign extend from low to high as if low were a
>> signed wide int.  */
>>  #define HWI_SIGN_EXTEND(low) \
>> - ((((HOST_WIDE_INT) low) < 0) ? ((HOST_WIDE_INT) -1) : ((HOST_WIDE_INT) 0))
>> + ((((HOST_WIDE_INT) low) < 0) ? HOST_WIDE_INT_M1 : ((HOST_WIDE_INT) 0))
>>
>>  static rtx neg_const_int (machine_mode, const_rtx);
>>  static bool plus_minus_operand_p (const_rtx);
>
> But then here we have yet another (HOST_WIDE_INT) 0 - HOST_WIDE_INT_0
> candidate.

I will submit yet another patch introducing HOST_WIDE_INT_0.
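
For readers unfamiliar with these names, here is a rough, self-contained sketch of the constants under discussion.  It assumes HOST_WIDE_INT is long long, which is an assumption made for this example only; the real definitions live in hwint.h and depend on the host.  HOST_WIDE_INT_0 is the macro proposed in this thread and high_part_of is a hypothetical wrapper added for illustration.

```c
/* Assumption for this sketch only: HOST_WIDE_INT is long long.  */
#define HOST_WIDE_INT long long

#define HOST_WIDE_INT_0   ((HOST_WIDE_INT) 0)  /* proposed in this thread */
#define HOST_WIDE_INT_1   ((HOST_WIDE_INT) 1)
#define HOST_WIDE_INT_1U  ((unsigned HOST_WIDE_INT) 1)
#define HOST_WIDE_INT_M1  ((HOST_WIDE_INT) -1)
#define HOST_WIDE_INT_M1U ((unsigned HOST_WIDE_INT) -1)

/* Sign extend from the low word to the high word, as if LOW were a
   signed wide int.  */
#define HWI_SIGN_EXTEND(low) \
  ((((HOST_WIDE_INT) (low)) < 0) ? HOST_WIDE_INT_M1 : HOST_WIDE_INT_0)

/* Hypothetical wrapper: the high word that goes with LOW.  */
HOST_WIDE_INT
high_part_of (unsigned HOST_WIDE_INT low)
{
  return HWI_SIGN_EXTEND (low);
}
```

With these definitions, a call such as add_double (*lquo, *hquo, HOST_WIDE_INT_1, HOST_WIDE_INT_0, lquo, hquo) would carry no explicit casts at the call site.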

> Otherwise LGTM.

Thanks! Committed with the above mentioned fix.

Uros.


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-19 Thread Jakub Jelinek
On Tue, Jul 19, 2016 at 04:20:55PM +, Bernd Edlinger wrote:
> As discussed at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71876,
> we have a _very_ old hack in gcc, that recognizes certain functions by
> name, and inserts in some cases unsafe attributes, that don't work for
> a freestanding environment.
> 
> It is unsafe to return ECF_MAY_BE_ALLOCA, ECF_LEAF and ECF_NORETURN
> from special_function_p, just by the name of the function, especially
> for less well known functions, like "getcontext" or "savectx", which
> could easily be used for something completely different.
> 
> Moreover, from the backend library we cannot check flag_hosted, or if
> the function has "C" or "C++" binding.

I believe this will regress handling of various functions.
E.g. for alloca (as opposed to __builtin_alloca/__builtin_alloca_with_align),
this means ECF_MAY_BE_ALLOCA will not be set anymore.

_longjmp/siglongjmp will no longer be ECF_NORETURN (glibc
doesn't declare them as such), __sigsetjmp will no longer be ECF_LEAF.

Jakub


[PATCH] Fix memmove to memcpy folding (PR middle-end/71874)

2016-07-19 Thread Jakub Jelinek
Hi!

As mentioned in the PR and discussed on IRC, get_ref_base_and_extent
can return size != maxsize or maxsize == -1, and then we really can't trust
the offset for the purposes we want.  So this patch instead uses a different
function that just computes the base and offset if the offset is constant;
we don't really care about the access size.
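
To see why trustworthy byte offsets matter here: memmove may only be folded to memcpy when source and destination are known not to overlap.  The sketch below is a simplified model of that check for two accesses off the same base; byte_ranges_overlap is a hypothetical name and not the function GCC uses.

```c
#include <stdbool.h>

/* Hypothetical helper: true if the byte ranges
   [src_off, src_off + len) and [dest_off, dest_off + len), measured
   from the same base object, share at least one byte.  */
bool
byte_ranges_overlap (long long src_off, long long dest_off, long long len)
{
  if (len <= 0)
    return false;
  return src_off < dest_off + len && dest_off < src_off + len;
}
```

In the new test, the copy goes from str + 15 to str + 20 with a length of 11, so the ranges overlap and the call has to stay a memmove.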

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2,
for 5.5 the 4.9 patch applied to gimple-fold.c instead of builtins.c and for
4.9.4 the attached patch?

2016-07-19  Jakub Jelinek  

PR middle-end/71874
* gimple-fold.c (fold_builtin_memory_op): Use
get_addr_base_and_unit_offset instead of get_ref_base_and_extent.

* g++.dg/torture/pr71874.C: New test.

--- gcc/gimple-fold.c.jj2016-07-19 15:24:42.870004660 +0200
+++ gcc/gimple-fold.c   2016-07-19 15:26:55.449343027 +0200
@@ -796,22 +796,21 @@ gimple_fold_builtin_memory_op (gimple_st
{
  tree src_base, dest_base, fn;
  HOST_WIDE_INT src_offset = 0, dest_offset = 0;
- HOST_WIDE_INT size = -1;
- HOST_WIDE_INT maxsize = -1;
- bool reverse;
+ HOST_WIDE_INT maxsize;
 
  srcvar = TREE_OPERAND (src, 0);
- src_base = get_ref_base_and_extent (srcvar, &src_offset,
- &size, &maxsize, &reverse);
+ src_base = get_addr_base_and_unit_offset (srcvar, &src_offset);
+ if (src_base == NULL)
+   src_base = srcvar;
  destvar = TREE_OPERAND (dest, 0);
- dest_base = get_ref_base_and_extent (destvar, &dest_offset,
-  &size, &maxsize, &reverse);
+ dest_base = get_addr_base_and_unit_offset (destvar,
+&dest_offset);
+ if (dest_base == NULL)
+   dest_base = destvar;
  if (tree_fits_uhwi_p (len))
maxsize = tree_to_uhwi (len);
  else
maxsize = -1;
- src_offset /= BITS_PER_UNIT;
- dest_offset /= BITS_PER_UNIT;
  if (SSA_VAR_P (src_base)
  && SSA_VAR_P (dest_base))
{
--- gcc/testsuite/g++.dg/torture/pr71874.C.jj   2016-07-19 15:23:37.097832377 +0200
+++ gcc/testsuite/g++.dg/torture/pr71874.C  2016-07-19 15:23:37.097832377 +0200
@@ -0,0 +1,12 @@
+// PR middle-end/71874
+// { dg-do run }
+
+int
+main ()
+{
+  char str[] = "abcdefghijklmnopqrstuvwxyzABCDEF";
+  __builtin_memmove (str + 20, str + 15, 11);
+  if (__builtin_strcmp (str, "abcdefghijklmnopqrstpqrstuvwxyzF") != 0)
+__builtin_abort ();
+  return 0;
+}

Jakub
2016-07-19  Jakub Jelinek  

PR middle-end/71874
* builtins.c (fold_builtin_memory_op): Use
get_addr_base_and_unit_offset instead of get_ref_base_and_extent.

* g++.dg/torture/pr71874.C: New test.

--- gcc/builtins.c.jj   2016-04-28 21:47:35.0 +0200
+++ gcc/builtins.c  2016-07-19 13:13:41.563148783 +0200
@@ -8793,21 +8793,21 @@ fold_builtin_memory_op (location_t loc,
{
  tree src_base, dest_base, fn;
  HOST_WIDE_INT src_offset = 0, dest_offset = 0;
- HOST_WIDE_INT size = -1;
- HOST_WIDE_INT maxsize = -1;
+ HOST_WIDE_INT maxsize;
 
  srcvar = TREE_OPERAND (src, 0);
- src_base = get_ref_base_and_extent (srcvar, &src_offset,
- &size, &maxsize);
+ src_base = get_addr_base_and_unit_offset (srcvar, &src_offset);
+ if (src_base == NULL)
+   src_base = srcvar;
  destvar = TREE_OPERAND (dest, 0);
- dest_base = get_ref_base_and_extent (destvar, &dest_offset,
-  &size, &maxsize);
+ dest_base = get_addr_base_and_unit_offset (destvar,
+&dest_offset);
+ if (dest_base == NULL)
+   dest_base = destvar;
  if (tree_fits_uhwi_p (len))
maxsize = tree_to_uhwi (len);
  else
maxsize = -1;
- src_offset /= BITS_PER_UNIT;
- dest_offset /= BITS_PER_UNIT;
  if (SSA_VAR_P (src_base)
  && SSA_VAR_P (dest_base))
{
--- gcc/testsuite/g++.dg/torture/pr71874.C.jj   2016-07-19 12:58:54.816402724 +0200
+++ gcc/testsuite/g++.dg/torture/pr71874.C  2016-07-19 13:00:04.106521149 +0200
@@ -0,0 +1,12 @@
+// PR middle-end/71874
+// { dg-do run }
+
+int
+main ()
+{
+  char str[] = "abcdefghijklmnopqrstuvwxyzABCDEF";
+  __builtin_memmove (str + 20, str + 15, 11);
+  if (__builtin_strcmp (str, "abcdefghijklmnopqrstpqrstuvwxyzF") != 0)
+__builtin_abort ();
+  return 0;
+}


Re: [patch,avr] Slightly better memory accesses on avr_tiny

2016-07-19 Thread Denis Chertykov
2016-07-19 13:31 GMT+03:00 Georg-Johann Lay :
> This patch tries to improve the bloated code we are currently generating for
> AVR_TINY.  It's mostly about printing the memory loads and stores and more
> usage of reg_unused_after to print shorter instruction sequences in some
> cases.
>
> Ok for trunk?
>
> I also played around with PLUS in legitimate_address_p and
> legitimize_address and got better code, but the problem with such changes is
> that almost all tests for such small devices are failing and no reasonable
> portion of the testsuite will pass.
>
> I don't even know if anybody is using avr_tiny + avr-gcc or if users are
> resorting to assembler.
>

Keep Calm and Carry On
;-)

> Johann
>
>
> gcc/
> (avr_legitimize_address) [AVR_TINY]: Force constant addresses
> outside [0,0xc0] into a register.
> (avr_out_movhi_r_mr_reg_no_disp_tiny): Pass insn.  And handle
> cases where the base address register is unused after.
> (avr_out_movhi_r_mr_reg_disp_tiny): Same.
> (avr_out_movhi_mr_r_reg_disp_tiny): Same.
> (avr_out_store_psi_reg_disp_tiny): Same.
>
> gcc/testsuite/
> * gcc.target/avr/torture/get-mem.c: New test.
> * gcc.target/avr/torture/set-mem.c: New test.

Approved.


Re: [PATCH] Fix memmove to memcpy folding (PR middle-end/71874)

2016-07-19 Thread Richard Biener
On July 19, 2016 6:52:55 PM GMT+02:00, Jakub Jelinek  wrote:
>Hi!
>
>As mentioned in the PR and discussed on IRC, get_ref_base_and_extent
>can return size != maxsize or maxsize == -1, and then we really can't trust
>the offset for the purposes we want.  So this patch instead uses a
>different
>function that just computes the base and offset if the offset is
>constant;
>we don't really care about the access size.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
>trunk/6.2,
>for 5.5 the 4.9 patch applied to gimple-fold.c instead of builtins.c and
>for
>4.9.4 the attached patch?

OK.

Thanks
Richard.

>2016-07-19  Jakub Jelinek  
>
>   PR middle-end/71874
>   * gimple-fold.c (fold_builtin_memory_op): Use
>   get_addr_base_and_unit_offset instead of get_ref_base_and_extent.
>
>   * g++.dg/torture/pr71874.C: New test.
>
>--- gcc/gimple-fold.c.jj   2016-07-19 15:24:42.870004660 +0200
>+++ gcc/gimple-fold.c  2016-07-19 15:26:55.449343027 +0200
>@@ -796,22 +796,21 @@ gimple_fold_builtin_memory_op (gimple_st
>   {
> tree src_base, dest_base, fn;
> HOST_WIDE_INT src_offset = 0, dest_offset = 0;
>-HOST_WIDE_INT size = -1;
>-HOST_WIDE_INT maxsize = -1;
>-bool reverse;
>+HOST_WIDE_INT maxsize;
> 
> srcvar = TREE_OPERAND (src, 0);
>-src_base = get_ref_base_and_extent (srcvar, &src_offset,
>-&size, &maxsize, &reverse);
>+src_base = get_addr_base_and_unit_offset (srcvar, &src_offset);
>+if (src_base == NULL)
>+  src_base = srcvar;
> destvar = TREE_OPERAND (dest, 0);
>-dest_base = get_ref_base_and_extent (destvar, &dest_offset,
>- &size, &maxsize, &reverse);
>+dest_base = get_addr_base_and_unit_offset (destvar,
>+   &dest_offset);
>+if (dest_base == NULL)
>+  dest_base = destvar;
> if (tree_fits_uhwi_p (len))
>   maxsize = tree_to_uhwi (len);
> else
>   maxsize = -1;
>-src_offset /= BITS_PER_UNIT;
>-dest_offset /= BITS_PER_UNIT;
> if (SSA_VAR_P (src_base)
> && SSA_VAR_P (dest_base))
>   {
>--- gcc/testsuite/g++.dg/torture/pr71874.C.jj  2016-07-19
>15:23:37.097832377 +0200
>+++ gcc/testsuite/g++.dg/torture/pr71874.C 2016-07-19
>15:23:37.097832377 +0200
>@@ -0,0 +1,12 @@
>+// PR middle-end/71874
>+// { dg-do run }
>+
>+int
>+main ()
>+{
>+  char str[] = "abcdefghijklmnopqrstuvwxyzABCDEF";
>+  __builtin_memmove (str + 20, str + 15, 11);
>+  if (__builtin_strcmp (str, "abcdefghijklmnopqrstpqrstuvwxyzF") != 0)
>+__builtin_abort ();
>+  return 0;
>+}
>
>   Jakub




[PATCH] Don't consider zero succs empty blocks as forwarders (PR rtl-optimization/71916)

2016-07-19 Thread Jakub Jelinek
Hi!

Normally empty blocks without successors (result of __builtin_unreachable ()
somewhere in RTL) aren't considered as forwarder_block_p, because they
don't satisfy single_succ_p.  But e.g. during cross-jumping fake edges to
exit are added and then they suddenly satisfy this predicate, which confuses
the cross-jumping among other things (normally cross-jumping attempts to be
careful with no real successor blocks if there isn't a noreturn call
with REG_ARGS_SIZE note, but if it believes those are forwarders, it doesn't
check hard).
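
The idea can be modelled outside of GCC with a toy CFG.  Everything in the sketch below (the toy_ types, TOY_EDGE_FAKE and the function name) is hypothetical and merely mirrors the shape of the check: a block only counts as an empty forwarder candidate if it has exactly one successor edge and that edge is a real one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an edge flag marking fake edges.  */
#define TOY_EDGE_FAKE 1

struct toy_edge
{
  int flags;
};

struct toy_block
{
  size_t n_succs;           /* number of successor edges */
  struct toy_edge *succs;   /* array of N_SUCCS edges */
  size_t n_active_insns;    /* number of active instructions */
};

/* True if BB is empty and has a single, non-fake successor edge.  */
bool
toy_contains_no_active_insn_p (const struct toy_block *bb)
{
  if (bb->n_succs != 1
      || (bb->succs[0].flags & TOY_EDGE_FAKE) != 0)
    return false;

  return bb->n_active_insns == 0;
}
```

An empty block whose only successor is a fake edge to the exit block is thus rejected, the same as an empty block with no successors at all.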

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?

2016-07-19  Jakub Jelinek  

PR rtl-optimization/71916
* cfgrtl.c (contains_no_active_insn_p): Return false also for
bb which have a single succ fake edge.

* gcc.c-torture/compile/pr71916.c: New test.

--- gcc/cfgrtl.c.jj 2016-05-11 15:15:49.0 +0200
+++ gcc/cfgrtl.c2016-07-19 16:38:20.362303955 +0200
@@ -574,8 +574,10 @@ contains_no_active_insn_p (const_basic_b
 {
   rtx_insn *insn;
 
-  if (bb == EXIT_BLOCK_PTR_FOR_FN (cfun) || bb == ENTRY_BLOCK_PTR_FOR_FN (cfun)
-  || !single_succ_p (bb))
+  if (bb == EXIT_BLOCK_PTR_FOR_FN (cfun)
+  || bb == ENTRY_BLOCK_PTR_FOR_FN (cfun)
+  || !single_succ_p (bb)
+  || (single_succ_edge (bb)->flags & EDGE_FAKE) != 0)
 return false;
 
   for (insn = BB_HEAD (bb); insn != BB_END (bb); insn = NEXT_INSN (insn))
--- gcc/testsuite/gcc.c-torture/compile/pr71916.c.jj 2016-07-19 16:43:37.610371787 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr71916.c   2016-07-19 16:43:20.0 +0200
@@ -0,0 +1,36 @@
+/* PR rtl-optimization/71916 */
+
+int a, b, c, d, f, g;
+short h;
+
+short
+foo (short p1)
+{
+  return a >= 2 || p1 > 7 >> a ? p1 : p1 << a;
+}
+
+void
+bar (void)
+{
+  for (;;)
+{
+  int i, j[3];
+  h = b >= 2 ? d : d >> b;
+  if (foo (f > h ^ c))
+   {
+ d = 0;
+ while (f <= 2)
+   {
+ char k[2];
+ for (;;)
+   k[i++] = 7;
+   }
+   }
+  else
+   for (;;)
+ g = j[2];
+  if (g)
+   for (;;)
+ ;
+}
+}

Jakub


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-19 Thread Bernd Edlinger
On 07/19/16 18:37, Jakub Jelinek wrote:
> On Tue, Jul 19, 2016 at 04:20:55PM +, Bernd Edlinger wrote:
>> As discussed at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71876,
>> we have a _very_ old hack in gcc, that recognizes certain functions by
>> name, and inserts in some cases unsafe attributes, that don't work for
>> a freestanding environment.
>>
>> It is unsafe to return ECF_MAY_BE_ALLOCA, ECF_LEAF and ECF_NORETURN
>> from special_function_p, just by the name of the function, especially
>> for less well known functions, like "getcontext" or "savectx", which
>> could easily be used for something completely different.
>>
>> Moreover, from the backend library we cannot check flag_hosted, or if
>> the function has "C" or "C++" binding.
>
> I believe this will regress handling of various functions.
> E.g. for alloca (as opposed to __builtin_alloca/__builtin_alloca_with_align),
> this means ECF_MAY_BE_ALLOCA will not be set anymore.
>

That depends on which options are used: with -ansi and -ffreestanding, 
alloca is just a normal function, which is kind of correct.

If you include glibc's , alloca is directly defined to 
__builtin_alloca(x), which should work always.

If alloca is declared as void *alloca(size_t); it is also recognized as
built-in unless -ansi or -ffreestanding is used, so that handling was
in a way duplicated already.

So I see no regression here.

> _longjmp/siglongjmp will no longer be ECF_NORETURN (glibc
> doesn't declare them as such), __sigsetjmp will no longer be ECF_LEAF.

Which version of glibc do you refer to?

My 2.19-0ubuntu6.9 has:

extern void _longjmp (struct __jmp_buf_tag __env[1], int __val)
  __THROWNL __attribute__ ((__noreturn__));

extern int __sigsetjmp (struct __jmp_buf_tag __env[1], int __savemask) 
__THROWNL;

So yes, __THROWNL is "__attribute__ ((__nothrow__))".

But they also have __THROW around which is "__attribute__ ((__nothrow__
__LEAF))", so that is just a minor bug in the glibc header, the header
should declare it __THROW if it is no leaf.

If you are concerned about the leaf attribute, it would
be easy to add a builtin for _longjmp, and __sigsetjmp, as
the _ is reserved anyway.  However I considered it an implementation
detail of glibc, that could change, and I did not check newlib on that
either.

Should I add built-in for _longjmp and __sigsetjmp, and check if
that works for newlib too?


Thanks
Bernd.


Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-19 Thread Jeff Law

On 07/17/2016 09:52 AM, Manuel López-Ibáñez wrote:
> +  if (is_vla)
> +gcc_assert (warn_vla_limit > 0);
> +  if (!is_vla)
> +gcc_assert (warn_alloca_limit > 0);
>
> if-else ? Or perhaps:

Shouldn't really matter, except perhaps in a -O0 compilation.  Though I
think else-if makes it slightly clearer.

> gcc_assert (!is_vla || warn_vla_limit > 0);
> gcc_assert (is_vla || warn_alloca_limit > 0);

Would be acceptable as well.  I think any of the 3 is fine and leave it
to Aldy's discretion which to use.

Jeff


Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-19 Thread Jeff Law

On 07/19/2016 05:03 AM, Aldy Hernandez wrote:

(Same thing with alloca).  There should be no warning for VLAs,
and for alloca, the warning should say "use of variable-length
array within a loop."  The VRP dump suggests the range information
is available within the loop.  Is the get_range_info() function
not returning the corresponding bounds?


This is a false positive, but there's little we can do with the current
range infrastructure.  The range information becomes less precise the
further down the optimization pipeline we get.  So, even though as far
as *.c.126t.crited1, we still see appropriate range information:

[ ... ]
So I think we can live with the false positive as an XFAIL while we wait 
for improved infrastructure.


I will note that you could use a two-stage approach to help with this 
kind of issue.  You note the set of potential large allocations early 
(before PRE or anyone else messes it up).  Then you allow the other 
optimizers to run, then go back and recheck the allocations after the 
last optimizer pass.  You end up with


flagged early && flagged late    -> warn
flagged early && ! flagged late  -> optimization eliminated the false
                                    positive (which you can optionally
                                    issue a diagnostic for)
! flagged early                  -> never warn

I don't think you strictly need it here, but it's a way to approach some 
of these problems where you want to run a warning pass late (to allow 
the optimizers to eliminate false positives).
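
The two-stage scheme boils down to a small decision function.  The sketch below uses hypothetical names (the toy_ enum and toy_two_stage_decision) and does not correspond to any existing GCC interface:

```c
#include <stdbool.h>

enum toy_action
{
  TOY_NEVER_WARN,                  /* not flagged by the early pass */
  TOY_WARN,                        /* flagged early and still flagged late */
  TOY_FALSE_POSITIVE_ELIMINATED    /* flagged early, cleared by optimization */
};

/* Combine the result of the early check (before PRE and friends) with
   the result of the recheck after the last optimizer pass.  */
enum toy_action
toy_two_stage_decision (bool flagged_early, bool flagged_late)
{
  if (!flagged_early)
    return TOY_NEVER_WARN;

  return flagged_late ? TOY_WARN : TOY_FALSE_POSITIVE_ELIMINATED;
}
```

Note that an allocation flagged only by the late pass never warns, which is what keeps optimizer-introduced noise out of the diagnostics.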


jeff


Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-19 Thread Manuel López-Ibáñez
On 19 July 2016 at 18:47, Jeff Law  wrote:
> On 07/17/2016 09:52 AM, Manuel López-Ibáñez wrote:
>>
>> +  if (is_vla)
>> +gcc_assert (warn_vla_limit > 0);
>> +  if (!is_vla)
>> +gcc_assert (warn_alloca_limit > 0);
>>
>> if-else ? Or perhaps:
>
> Shouldn't really matter, except perhaps in a -O0 compilation.  Though I
> think else-if makes it slightly clearer.

Of course, I mentioned it because of clarity. It was difficult to
distinguish !i versus (i in my screen and I had to stop to read it
again.

Cheers,

Manuel.


[PATCH] nvptx: do not implicitly enable -ftoplevel-reorder

2016-07-19 Thread Alexander Monakov
Hi,

I've recently committed a middle-end patch that adds handling of undefined
variables (that the nvptx backend needs) under -fno-toplevel-reorder (svn rev.
238371).  With that change, it's no longer necessary to implicitly enable
-ftoplevel-reorder in the backend, and the following patch removes that.

Tested with nvptx-none-run, OK for trunk?

* config/nvptx/nvptx.c (nvptx_option_override): Do not set 
flag_toplevel_reorder.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 5e47002..8eba2ad 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -152,12 +152,6 @@ nvptx_option_override (void)
 {
   init_machine_status = nvptx_init_machine_status;

-  /* Set toplevel_reorder, unless explicitly disabled.  We need
- reordering so that we emit necessary assembler decls of
- undeclared variables. */
-  if (!global_options_set.x_flag_toplevel_reorder)
-flag_toplevel_reorder = 1;
-
   /* Set flag_no_common, unless explicitly disabled.  We fake common
  using .weak, and that's not entirely accurate, so avoid it
  unless forced.  */


[v3 PATCH] Implement std::string_view and P0254r2, Integrating std::string_view and std::string.

2016-07-19 Thread Ville Voutilainen
Tested on Linux-x64.

I'm quite sure there are cosmetic issues left in this patch, like in the section
references of the tests. I will send it out for review now anyway.

2016-07-19  Ville Voutilainen  

Implement std::string_view and P0254r2,
Integrating std::string_view and std::string.
* include/Makefile.am: Add string_view and string_view.tcc
to the exported headers.
* include/Makefile.in: Likewise.
* include/bits/basic_string.h: Include  in C++17 mode.
(__sv_type): New.
(basic_string(__sv_type, const _Alloc&)): Likewise.
(operator=(__sv_type)): Likewise.
(operator __sv_type()): Likewise.
(operator+=(__sv_type)): Likewise.
(append(__sv_type __sv)): Likewise.
(append(__sv_type, size_type, size_type)): Likewise.
(assign(__sv_type)): Likewise.
(assign(__sv_type, size_type, size_type)): Likewise.
(insert(size_type, __sv_type)): Likewise.
(insert(size_type, __sv_type, size_type, size_type)): Likewise.
(replace(size_type, size_type, __sv_type)): Likewise.
(replace(size_type, size_type, __sv_type, size_type, size_type)):
Likewise.
(replace(const_iterator, const_iterator, __sv_type)): Likewise.
(find(__sv_type, size_type)): Likewise.
(rfind(__sv_type, size_type)): Likewise.
(find_first_of(__sv_type, size_type)): Likewise.
(find_last_of(__sv_type, size_type)): Likewise.
(find_first_not_of(__sv_type, size_type)): Likewise.
(find_last_not_of(__sv_type, size_type)): Likewise.
(compare(__sv_type)): Likewise.
(compare(size_type, size_type, __sv_type)): Likewise.
(compare(size_type, size_type, __sv_type, size_type, size_type)):
Likewise.
* include/bits/string_view.tcc: New.
* include/std/string_view: Likewise.
* testsuite/21_strings/basic_string/cons/char/7.cc: Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/7.cc: Likewise.
* testsuite/21_strings/basic_string/modifiers/append/char/4.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/append/wchar_t/4.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/assign/char/4.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/assign/wchar_t/4.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/insert/char/3.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/insert/wchar_t/3.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/replace/char/7.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/replace/wchar_t/7.cc:
Likewise.
* testsuite/21_strings/basic_string/operations/compare/char/2.cc:
Likewise.
* testsuite/21_strings/basic_string/operations/compare/wchar_t/2.cc:
Likewise.
* testsuite/21_strings/basic_string/operations/find/char/5.cc:
Likewise.
* testsuite/21_strings/basic_string/operations/find/wchar_t/5.cc:
Likewise.
* testsuite/21_strings/basic_string/operators/char/5.cc: Likewise.
* testsuite/21_strings/basic_string/operators/wchar_t/5.cc: Likewise.
* testsuite/21_strings/basic_string_view/capacity/1.cc: Likewise.
* testsuite/21_strings/basic_string_view/cons/char/1.cc: Likewise.
* testsuite/21_strings/basic_string_view/cons/char/2.cc: Likewise.
* testsuite/21_strings/basic_string_view/cons/char/3.cc: Likewise.
* testsuite/21_strings/basic_string_view/cons/wchar_t/1.cc: Likewise.
* testsuite/21_strings/basic_string_view/cons/wchar_t/2.cc: Likewise.
* testsuite/21_strings/basic_string_view/cons/wchar_t/3.cc: Likewise.
* testsuite/21_strings/basic_string_view/element_access/char/1.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/char/2.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/char/empty.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/char/front_back.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/2.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/empty.cc:
Likewise.
* 
testsuite/21_strings/basic_string_view/element_access/wchar_t/front_back.cc:
Likewise.
* testsuite/21_strings/basic_string_view/include.cc: Likewise.
* testsuite/21_strings/basic_string_view/inserters/char/1.cc: Likewise.
* testsuite/21_strings/basic_string_view/inserters/char/2.cc: Likewise.
* testsuite/21_strings/basic_string_view/inserters/char/3.cc: Likewise.
* testsuite/21_strings/basic_string_view/inserters/pod/10081-out.cc:
Likewise.
* testsuite/21_strings/basic_string_view/inserters/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string_view/inserters/wchar_t/2.cc:
Likewise.
* testsuite/21_strings/basic_string_view/inserters/wchar_t/3.cc:
Likewise.
* testsuite/21_strings/basic_string_view/literals/types.cc: Likewise.
* testsuite/21_string

Re: [RFC][IPA-VRP] Early VRP Implementation

2016-07-19 Thread Richard Biener
On July 19, 2016 6:19:23 PM GMT+02:00, Jeff Law  wrote:
>On 07/14/2016 10:52 PM, Andrew Pinski wrote:
>> On Thu, Jul 14, 2016 at 9:45 PM, kugan
>>  wrote:
>>>
>>> Hi,
>>>
>>>
>>>
>>> This patch adds a very simple early vrp implementation. This visits
>the
>>> basic blocks in the dominance order and set the Value Ranges (VR)
>for
>>>
>>> SSA_NAMEs in the scope. Use this VR to discover more VRs. Restore
>the old VR
>>> once the scope is exit.
>>
>>
>> Why separate out early VRP from tree-vrp?  Just a little curious.
>I wouldn't mind seeing tree-vrp broken down a little -- it's quite
>large 
>and there's at least 4 distinct things going on in that file.
>
>1. ASSERT_EXPR handling.
>
>2. Arithmetic on ranges
>
>3. Propagation engine setup, callbacks, etc
>
>4. Range management
>
>There may be others, but it seems at least some of that ought to be 
>factored out.

Possibly, but not necessarily because of the proposed pass.

I'd like to see lattices and lattice entries becoming classes and the 
arithmetic on them being templated on them.

I do have some preliminary patches implementing an aggressive on-demand VRP for 
use in niter analysis, and the lattice is what makes sharing code difficult 
(it's a hash-map instead of an array there).

Richard.


>
>Jeff




Re: [PATCH]: Use HOST_WIDE_INT_{,M}1{,U} some more

2016-07-19 Thread Mike Stump
On Jul 19, 2016, at 5:46 AM, Uros Bizjak  wrote:
> 
> The result of exercises with sed in gcc/ directory.
> 
> 2016-07-19  Uros Bizjak  
> 
>* builtins.c: Use HOST_WIDE_INT_1 instead of (HOST_WIDE_INT) 1,
>HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1,
>HOST_WIDE_INT_M1 instead of (HOST_WIDE_INT) -1 and
>HOST_WIDE_INT_M1U instead of (unsigned HOST_WIDE_INT) -1.

Maybe it's just me, but I actually find the new forms you put in to be more 
complex, not less complex.  The reason is that there are 4 words whose meaning 
one has to learn instead of 1 word.  Learning 4 words is 4x more complex than 
learning 1 word.  The casting increases the complexity some, but not enough to 
overcome the loss of simplicity that all the new terms bring.  And that's 
_with_ my knowledge and experience of the new forms you use.  To a beginner, 
those forms are even more cryptic, I think.  I don't feel too strongly about 
this.
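
For anyone weighing the two spellings, here is a minimal sketch of what the four names stand for.  The toy_* typedefs and TOY_HWI_* macros are stand-ins made up for illustration, assuming a 64-bit wide integer; they are not the definitions GCC uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins, assuming HOST_WIDE_INT is a 64-bit type.  */
typedef int64_t toy_hwi;
typedef uint64_t toy_uhwi;

#define TOY_HWI_1   ((toy_hwi) 1)    /* role of HOST_WIDE_INT_1   */
#define TOY_HWI_1U  ((toy_uhwi) 1)   /* role of HOST_WIDE_INT_1U  */
#define TOY_HWI_M1  ((toy_hwi) -1)   /* role of HOST_WIDE_INT_M1  */
#define TOY_HWI_M1U ((toy_uhwi) -1)  /* role of HOST_WIDE_INT_M1U */

/* Return a mask with the low BITS bits set; BITS must be below 64.
   The wide unsigned one matters here: a plain `1 << bits' would be
   evaluated in int and overflow for large shift counts.  */
static toy_uhwi
toy_low_mask (unsigned bits)
{
  assert (bits < 64);
  return (TOY_HWI_1U << bits) - 1;
}
```

Whichever spelling is used, the point is that the constant already has the wide type before the shift happens.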

Re: [PATCH] Don't consider zero succs empty blocks as forwarders (PR rtl-optimization/71916)

2016-07-19 Thread Richard Biener
On July 19, 2016 7:27:30 PM GMT+02:00, Jakub Jelinek  wrote:
>Hi!
>
>Normally empty blocks without successors (result of __builtin_unreachable ()
>somewhere in RTL) aren't considered as forwarder_block_p, because they
>don't satisfy single_succ_p.  But e.g. during cross-jumping fake edges to
>exit are added and then they suddenly satisfy this predicate, which confuses
>the cross-jumping among other things (normally cross-jumping attempts to be
>careful with no real successor blocks if there isn't a noreturn call
>with REG_ARGS_SIZE note, but if it believes those are forwarders, it doesn't
>check hard).
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?

OK.

Thanks,
Richard.
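
For readers skimming the thread, the idea behind the check can be sketched in a few lines.  This is a toy model only: the toy_* structures and TOY_EDGE_FAKE are hypothetical and much simpler than the real CFG data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EDGE_FAKE 1u  /* hypothetical flag bit marking a fake edge */

struct toy_edge
{
  unsigned flags;
};

struct toy_block
{
  size_t n_succs;                /* number of outgoing edges */
  const struct toy_edge *succs;  /* outgoing edges, n_succs of them */
  int n_active_insns;            /* instructions that do real work */
};

/* An empty block is only a forwarder candidate if it has exactly one
   successor edge and that edge is a real one.  A fake edge to exit,
   added temporarily by some pass, must not make it look like one.  */
static bool
toy_forwarder_candidate_p (const struct toy_block *bb)
{
  assert (bb != NULL);
  if (bb->n_succs != 1
      || (bb->succs[0].flags & TOY_EDGE_FAKE) != 0)
    return false;
  return bb->n_active_insns == 0;
}
```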

>2016-07-19  Jakub Jelinek  
>
>   PR rtl-optimization/71916
>   * cfgrtl.c (contains_no_active_insn_p): Return false also for
>   bb which have a single succ fake edge.
>
>   * gcc.c-torture/compile/pr71916.c: New test.
>
>--- gcc/cfgrtl.c.jj2016-05-11 15:15:49.0 +0200
>+++ gcc/cfgrtl.c   2016-07-19 16:38:20.362303955 +0200
>@@ -574,8 +574,10 @@ contains_no_active_insn_p (const_basic_b
> {
>   rtx_insn *insn;
> 
>-  if (bb == EXIT_BLOCK_PTR_FOR_FN (cfun) || bb ==
>ENTRY_BLOCK_PTR_FOR_FN (cfun)
>-  || !single_succ_p (bb))
>+  if (bb == EXIT_BLOCK_PTR_FOR_FN (cfun)
>+  || bb == ENTRY_BLOCK_PTR_FOR_FN (cfun)
>+  || !single_succ_p (bb)
>+  || (single_succ_edge (bb)->flags & EDGE_FAKE) != 0)
> return false;
> 
>for (insn = BB_HEAD (bb); insn != BB_END (bb); insn = NEXT_INSN (insn))
>--- gcc/testsuite/gcc.c-torture/compile/pr71916.c.jj   2016-07-19
>16:43:37.610371787 +0200
>+++ gcc/testsuite/gcc.c-torture/compile/pr71916.c  2016-07-19
>16:43:20.0 +0200
>@@ -0,0 +1,36 @@
>+/* PR rtl-optimization/71916 */
>+
>+int a, b, c, d, f, g;
>+short h;
>+
>+short
>+foo (short p1)
>+{
>+  return a >= 2 || p1 > 7 >> a ? p1 : p1 << a;
>+}
>+
>+void
>+bar (void)
>+{
>+  for (;;)
>+{
>+  int i, j[3];
>+  h = b >= 2 ? d : d >> b;
>+  if (foo (f > h ^ c))
>+  {
>+d = 0;
>+while (f <= 2)
>+  {
>+char k[2];
>+for (;;)
>+  k[i++] = 7;
>+  }
>+  }
>+  else
>+  for (;;)
>+g = j[2];
>+  if (g)
>+  for (;;)
>+;
>+}
>+}
>
>   Jakub




Re: [Fortran, Patch] First patch for coarray FAILED IMAGES (TS 18508)

2016-07-19 Thread Mikael Morin

Hello,

This is mostly good in general, but is lacking tests.
In particular, tests for successful matching, and tests for every error 
you are adding in the patch (except maybe the -fcoarray= one).
Also tests that the code executes successfully with -fcoarray=single, 
and that it produces the right function calls with -fcoarray=lib.


More specific comments below.

Mikael

On 15/07/2016 at 19:34, Alessandro Fanfarillo wrote:

Third *PING*

2016-07-04 16:46 GMT-06:00 Alessandro Fanfarillo :

* PING *

2016-06-21 10:59 GMT-06:00 Alessandro Fanfarillo :

* PING *

2016-06-06 15:05 GMT-06:00 Alessandro Fanfarillo :

Dear all,

please find in attachment the first patch (of n) for the FAILED IMAGES
capability defined in the coarray TS 18508.
The patch adds support for three new intrinsic functions defined in
the TS for simulating a failure (fail image), checking an image status
(image_status) and getting the list of failed images (failed_images).
The patch has been built and regtested on x86_64-pc-linux-gnu.

Ok for trunk?

Alessandro


first_complete_patch.diff

commit b3bca5b09f4cbcf18f2409dae2485a16a7c06498
Author: Alessandro Fanfarillo 
Date:   Mon Jun 6 14:27:37 2016 -0600

First patch Failed Images CAF TS-18508

diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index f3a4a43..9f519ff 100644
--- a/gcc/fortran/match.c
+++ b/gcc/fortran/match.c
@@ -1594,6 +1594,7 @@ gfc_match_if (gfc_statement *if_type)
   match ("event post", gfc_match_event_post, ST_EVENT_POST)
   match ("event wait", gfc_match_event_wait, ST_EVENT_WAIT)
   match ("exit", gfc_match_exit, ST_EXIT)
+  match ("fail image", gfc_match_fail_image, ST_FAIL_IMAGE)
   match ("flush", gfc_match_flush, ST_FLUSH)
   match ("forall", match_simple_forall, ST_FORALL)
   match ("go to", gfc_match_goto, ST_GOTO)
@@ -3073,6 +3074,41 @@ gfc_match_event_wait (void)
   return event_statement (ST_EVENT_WAIT);
 }

+/* Match a FAIl IMAGE statement */
+
+static match
+fail_image_statement (gfc_statement st)
+{
+  if (flag_coarray == GFC_FCOARRAY_NONE)
+{
+  gfc_fatal_error ("Coarrays disabled at %C, use %<-fcoarray=%> to 
enable");
+  return MATCH_ERROR;
+}
+
+  if (gfc_match_char ('(') == MATCH_YES)
+goto syntax;
+
+  if(st == ST_FAIL_IMAGE)
+new_st.op = EXEC_FAIL_IMAGE;
+  else
+gcc_unreachable();

You can use
  gcc_assert (st == ST_FAIL_IMAGE);
  foo...;
instead of
  if (st == ST_FAIL_IMAGE)
    foo...;
  else
    gcc_unreachable ();

+
+  return MATCH_YES;
+
+ syntax:
+  gfc_syntax_error (st);
+
+  return MATCH_ERROR;
+}
+
+match
+gfc_match_fail_image (void)
+{
+  /* if (!gfc_notify_std (GFC_STD_F2008_TS, "FAIL IMAGE statement at %C")) */
+  /*   return MATCH_ERROR; */
+

Can this be uncommented?


+  return fail_image_statement (ST_FAIL_IMAGE);
+}

 /* Match LOCK/UNLOCK statement. Syntax:
  LOCK ( lock-variable [ , lock-stat-list ] )
diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 1aaf4e2..b2f5596 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -1647,6 +1647,24 @@ trans_this_image (gfc_se * se, gfc_expr *expr)
   m, lbound));
 }

+static void
+gfc_conv_intrinsic_image_status (gfc_se *se, gfc_expr *expr)
+{
+  unsigned int num_args;
+  tree *args,tmp;
+
+  num_args = gfc_intrinsic_argument_list_length (expr);
+  args = XALLOCAVEC (tree, num_args);
+
+  gfc_conv_intrinsic_function_args (se, expr, args, num_args);
+
+  if (flag_coarray == GFC_FCOARRAY_LIB)
+{

Can everything be put under the if?
Does it work with -fcoarray=single?


+  tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_image_status, 
2,
+args[0], build_int_cst (integer_type_node, 
-1));
+  se->expr = tmp;
+}
+}

 static void
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 7d3cf8c..ce0eae7 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -674,6 +674,31 @@ gfc_trans_stop (gfc_code *code, bool error_stop)
   return gfc_finish_block (&se.pre);
 }

+/* Translate the FAIL IMAGE statement.  We have to translate this statement
+   to a runtime library call.  */
+
+tree
+gfc_trans_fail_image (gfc_code *code ATTRIBUTE_UNUSED)
+{
+  tree gfc_int4_type_node = gfc_get_int_type (4);
+  gfc_se se;
+  tree tmp;
+
+  /* Start a new block for this statement.  */
+  gfc_init_se (&se, NULL);
+  gfc_start_block (&se.pre);
+
+  tmp = build_int_cst (gfc_int4_type_node, 0);

This tmp doesn't seem to be used.


+  tmp = build_call_expr_loc (input_location,
+gfor_fndecl_caf_fail_image, 1,
+build_int_cst (pchar_type_node, 0));
+
+  gfc_add_expr_to_block (&se.pre, tmp);
+
+  gfc_add_block_to_block (&se.pre, &se.post);
+
+  return gfc_finish_block (&se.pre);
+}

 tree
 gfc_trans_lock_unlock (gfc_code *code, gfc_exec_op op)




[PR debug/71855] avoid emitting DW_TAG_unspecified_parameters twice

2016-07-19 Thread Aldy Hernandez

Hi folks.

Ben brought this bug to my attention which was causing a failure in 
libabigail.


The problem is that varargs functions are getting two 
DW_TAG_unspecified_parameters DIEs, because they are being emitted in 
early debug and again in late debug.


This problem appears in GCC 6 and in mainline.

The attached patch fixes the problem everywhere.
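
In toy form, the idea is to gate the emission on the early phase, so that a second visit cannot add the child again.  The names below (toy_subprogram_die and friends) are hypothetical and only model the counting; they are not the dwarf2out.c data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the DIE for one function.  */
struct toy_subprogram_die
{
  bool is_varargs;
  int unspecified_param_children;
};

/* Called once in the early phase and once in the late phase.  Only the
   early call may add the child describing the unspecified parameters.  */
static void
toy_gen_subprogram_die (struct toy_subprogram_die *die, bool early)
{
  assert (die != NULL);
  if (early && die->is_varargs)
    die->unspecified_param_children++;
}

/* Run both phases over one function and report how many children
   describing unspecified parameters it ended up with.  */
static int
toy_count_after_both_phases (bool is_varargs)
{
  struct toy_subprogram_die die = { is_varargs, 0 };

  toy_gen_subprogram_die (&die, true);
  toy_gen_subprogram_die (&die, false);
  return die.unspecified_param_children;
}
```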

Tested on x86-64 Linux.

OK for mainline?

p.s. I don't know what the rules are for GCC 6 right now, but if it's 
open for bugfixes, I'd be more than happy to commit it there if the 
patch is approved.
commit c1531a5d6f2394e4ba350d216a19d84bc8796c12
Author: Aldy Hernandez 
Date:   Tue Jul 19 12:18:48 2016 -0400

PR debug/71855
* dwarf2out.c (gen_subprogram_die): Only call
gen_unspecified_parameters_die while dumping early dwarf.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index fe09868..45ed28c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -20730,14 +20730,17 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 void_type_node 2) an unprototyped function declaration (not a
 definition).  This just means that we have no info about the
 parameters at all.  */
-  if (prototype_p (TREE_TYPE (decl)))
+  if (early_dwarf)
{
- /* This is the prototyped case, check for  */
- if (stdarg_p (TREE_TYPE (decl)))
+ if (prototype_p (TREE_TYPE (decl)))
+   {
+ /* This is the prototyped case, check for  */
+ if (stdarg_p (TREE_TYPE (decl)))
+   gen_unspecified_parameters_die (decl, subr_die);
+   }
+ else if (DECL_INITIAL (decl) == NULL_TREE)
gen_unspecified_parameters_die (decl, subr_die);
}
-  else if (DECL_INITIAL (decl) == NULL_TREE)
-   gen_unspecified_parameters_die (decl, subr_die);
 }
 
   if (subr_die != old_die)
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/pr71855.c 
b/gcc/testsuite/gcc.dg/debug/dwarf2/pr71855.c
new file mode 100644
index 000..4fd8b74
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/pr71855.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -g -dA" } */
+
+// Test that there is only one DW_TAG_unspecified_parameters DIE.
+
+void
+foo (const char *format, ...)
+{
+}
+
+// { dg-final { scan-assembler-times "DIE.*DW_TAG_unspecified_parameters" 1 } }


Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-19 Thread Aldy Hernandez

On 07/19/2016 01:54 PM, Jeff Law wrote:

On 07/19/2016 05:03 AM, Aldy Hernandez wrote:

(Same thing with alloca).  There should be no warning for VLAs,
and for alloca, the warning should say "use of variable-length
array within a loop."  The VRP dump suggests the range information
is available within the loop.  Is the get_range_info() function
not returning the corresponding bounds?


This is a false positive, but there's little we can do with the current
range infrastructure.  The range information becomes less precise the
further down the optimization pipeline we get.  So, even though as far
as *.c.126t.crited1, we still see appropriate range information:

[ ... ]
So I think we can live with the false positive as an XFAIL while we wait
for improved infrastructure.

I will note that you could use a two-stage approach to help with this
kind of issue.  You note the set of potential large allocations early
(before PRE or anyone else messes it up).  Then you allow the other
optimizers to run, then go back and recheck the allocations after the
last optimizer pass.  You end up with

flagged early && flagged late    -> warn
flagged early && !flagged late   -> optimization eliminated the false
                                    positive (which you can optionally
                                    issue a diagnostic for)
!flagged early                   -> never warn

I don't think you strictly need it here, but it's a way to approach some
of these problems where you want to run a warning pass late (to allow
the optimizers to eliminate false positives).
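
The two-stage decision described above can be written down as a small function.  This is only a sketch of the table; the enum and function names are hypothetical, and how the early and late flags would actually be recorded is left open.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_verdict
{
  TOY_NEVER_WARN,     /* not flagged early */
  TOY_WARN,           /* flagged early and still flagged late */
  TOY_FALSE_POSITIVE  /* flagged early, cleared by later optimization */
};

/* Combine the early and late checks into one verdict.  */
static enum toy_verdict
toy_two_stage_verdict (bool flagged_early, bool flagged_late)
{
  if (!flagged_early)
    return TOY_NEVER_WARN;
  return flagged_late ? TOY_WARN : TOY_FALSE_POSITIVE;
}

/* Walk the decision table, one row per assertion.  */
static void
toy_check_decision_table (void)
{
  assert (toy_two_stage_verdict (true, true) == TOY_WARN);
  assert (toy_two_stage_verdict (true, false) == TOY_FALSE_POSITIVE);
  assert (toy_two_stage_verdict (false, true) == TOY_NEVER_WARN);
  assert (toy_two_stage_verdict (false, false) == TOY_NEVER_WARN);
}
```

Note that something flagged only by the late check never warns in this scheme, which is what keeps optimizer-introduced noise out.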


If you feel strongly about this I can certainly do so, but until we get 
better range info, I'd prefer to work on other stuff ;-).


Let me know.

Aldy



Re: RFA: new pass to warn on questionable uses of alloca() and VLAs

2016-07-19 Thread Aldy Hernandez

On 07/19/2016 01:47 PM, Jeff Law wrote:

On 07/17/2016 09:52 AM, Manuel López-Ibáñez wrote:

+  if (is_vla)
+gcc_assert (warn_vla_limit > 0);
+  if (!is_vla)
+gcc_assert (warn_alloca_limit > 0);

if-else ? Or perhaps:

Shouldn't really matter, except perhaps in a -O0 compilation.  Though I
think else-if makes it slightly clearer.





My preference would've been the if/else.  The missing else was an oversight.

However, since I really don't care, the last posted patch uses this:

>> gcc_assert (!is_vla || warn_vla_limit > 0);
>> gcc_assert (is_vla || warn_alloca_limit > 0);
> Would be acceptable as well.  I think any of the 3 is fine and leave it
> to Aldy's discretion which to use.
>
> Jeff

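For the record, the if/else spelling and the two single-line assertions check the same thing, since "A implies B" can be written as !A || B.  A small sketch with hypothetical helper names that compares the two spellings exhaustively over a few limit values:

```c
#include <assert.h>
#include <stdbool.h>

/* The if/else spelling: check the limit that applies to this kind of
   allocation.  */
static bool
toy_limit_ok_if_else (bool is_vla, int vla_limit, int alloca_limit)
{
  if (is_vla)
    return vla_limit > 0;
  else
    return alloca_limit > 0;
}

/* The same predicate written as two implications.  */
static bool
toy_limit_ok_implication (bool is_vla, int vla_limit, int alloca_limit)
{
  return (!is_vla || vla_limit > 0) && (is_vla || alloca_limit > 0);
}

/* Compare both spellings for every combination of a few limits.  */
static void
toy_check_forms_agree (void)
{
  for (int is_vla = 0; is_vla <= 1; is_vla++)
    for (int v = -1; v <= 1; v++)
      for (int a = -1; a <= 1; a++)
        assert (toy_limit_ok_if_else (is_vla, v, a)
                == toy_limit_ok_implication (is_vla, v, a));
}
```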


Re: [PR debug/71855] avoid emitting DW_TAG_unspecified_parameters twice

2016-07-19 Thread Richard Biener
On July 19, 2016 9:01:07 PM GMT+02:00, Aldy Hernandez  wrote:
>Hi folks.
>
>Ben brought this bug to my attention which was causing a failure in 
>libabigail.
>
>The problem is that varargs functions are getting two 
>DW_TAG_unspecified_parameters DIEs, because they are being emitted in 
>early debug and again in late debug.
>
>This problem appears in GCC 6 and in mainline.
>
>The attached patch fixes the problem everywhere.
>
>Tested on x86-64 Linux.
>
>OK for mainline?

OK everywhere.

Thanks,
Richard.

>p.s. I don't know what the rules are for GCC 6 right now, but if it's 
>open for bugfixes, I'd be more than happy to commit it there if the 
>patch is approved.




Re: [RFC][IPA-VRP] Early VRP Implementation

2016-07-19 Thread Jeff Law

On 07/19/2016 12:35 PM, Richard Biener wrote:

I wouldn't mind seeing tree-vrp broken down a little -- it's quite
large and there's at least 4 distinct things going on in that
file.

1. ASSERT_EXPR handling.

2. Arithmetic on ranges

3. Propagation engine setup, callbacks, etc

4. Range management

There may be others, but it seems at least some of that ought to be
 factored out.


Possibly, but not necessarily because of the proposed pass.
Right.  These are things that, in my mind, ought to be done regardless of 
the introduction of IPA-VRP.




I'd like to see lattices and lattice entries becoming classes and the
arithmetic on it being templated on it.

Seems reasonable as well.



I do have some preliminary patches implementing an aggressive
on-demand VRP for use in niter analysis, and the lattice is what
makes sharing code difficult (it's a hash-map instead of an array
there).
It's interesting you mention an on-demand VRP.  I've asked Andrew to 
poke at that some too.  It's for a different need, but interesting 
that we're both looking to take things in that direction.


Jeff


Re: [PATCH PR71503/PR71683]Fix ICE in tree-if-conv.c

2016-07-19 Thread Jeff Law

On 07/15/2016 02:33 AM, Bin.Cheng wrote:

On Thu, Jul 14, 2016 at 6:49 PM, Jeff Law  wrote:

On 07/14/2016 10:12 AM, Bin Cheng wrote:


Hi,
This is a simple patch fixing ICE in tree-if-conv.c.  Existing code does
not setup a variable (cond) when predicate of basic block is true and it
asserts on the variable.  Interesting thing is dead code is not cleaned up
before ifcvt, but that's another story.
Bootstrap and test on x86_64.  Is it OK?

Thanks,
bin

2016-07-13  Bin Cheng  

PR tree-optimization/71503
PR tree-optimization/71683
* tree-if-conv.c (gen_phi_arg_condition): Set cond when predicate
is true.


Maybe I'm missing something, but in the case where COND is already set
and we encounter a true predicate later, shouldn't that make the result true
as well?

Yes, this is my understanding too, if the case does exist.  To my
understanding, conditions for phi arguments should be complementary to
each other.  If one argument has a true predicate, then the others must(?)
be false, so the case doesn't exist.



I don't think the code will though -- we just throw away the true condition.
ISTM that the right thing to do is

if (is_true_predicate (c))
  {
    cond = c;
    continue;
  }

Anyway, I think we can make it more robust/efficient by using:
if (is_true_predicate (c))
  {
    cond = c;
    break;
  }
What do you think?  Actually we should discard other arguments if one
has true predicate.

That sounds even better.
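
A toy model of why breaking out is safe, assuming the per-argument conditions are combined by OR-ing them together: once one of them is the always-true predicate, nothing later can change the result.  Predicates are modelled as bit masks here and every name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy encoding: the all-ones mask stands for the always-true predicate;
   any other value is an opaque set of condition bits.  */
#define TOY_TRUE_PRED (~0u)

/* OR together the predicates of all arguments, stopping as soon as a
   true predicate is seen.  */
static unsigned
toy_gen_cond (const unsigned *preds, size_t n)
{
  unsigned cond = 0;

  assert (preds != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (preds[i] == TOY_TRUE_PRED)
        {
          /* The result is already true; the rest cannot matter.  */
          cond = preds[i];
          break;
        }
      cond |= preds[i];
    }
  return cond;
}
```

With `continue` instead of `break`, the later OR steps would have to preserve the true result; with `break` there is nothing left to get wrong.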



Can you see if you can trigger a case where we have an existing cond, then
later find a true condition to verify the right thing happens?

Failed to do so.  It should be quite rare, if not impossible.

Thanks for checking.

jeff


Re: [PATCH PR71503/PR71683]Fix ICE in tree-if-conv.c

2016-07-19 Thread Jeff Law

On 07/19/2016 07:39 AM, Bin.Cheng wrote:

On Thu, Jul 14, 2016 at 6:49 PM, Jeff Law  wrote:

On 07/14/2016 10:12 AM, Bin Cheng wrote:


Hi,
This is a simple patch fixing ICE in tree-if-conv.c.  Existing code does
not setup a variable (cond) when predicate of basic block is true and it
asserts on the variable.  Interesting thing is dead code is not cleaned up
before ifcvt, but that's another story.
Bootstrap and test on x86_64.  Is it OK?

Thanks,
bin

2016-07-13  Bin Cheng  

PR tree-optimization/71503
PR tree-optimization/71683
* tree-if-conv.c (gen_phi_arg_condition): Set cond when predicate
is true.


Maybe I'm missing something, but in the case where COND is already set
and we encounter a true predicate later, shouldn't that make the result true
as well?

I don't think the code will though -- we just throw away the true condition.
ISTM that the right thing to do is

if (is_true_predicate (c))
  {
    cond = c;
    continue;
  }

Can you see if you can trigger a case where we have an existing cond, then
later find a true condition to verify the right thing happens?

Hi,
Attached is the updated patch; it breaks out of the loop once a TRUE
predicate is found.  I also built spec2k6/spec2k and your case was not
found.  Is it OK?

This version is OK.  Thanks,
jeff


Re: [PATCH, DOC] Enhance documentation of -fipa-ra option.

2016-07-19 Thread Jeff Law

On 07/19/2016 08:26 AM, Martin Liška wrote:

On 07/19/2016 03:46 PM, Alexander Monakov wrote:

So I decided to also mention this alias.

Thoughts?


I'd like the new ipa-ra text to go in, but perhaps you should consider leaving
option aliases out of this patch, especially given that it's non-trivial, as
Andreas' comment has shown.

(I'd say it's rather confusing that -fprofile is quite different from
-fprofile-generate; I doubt it's worthwhile to document this alias at all, but I
won't argue much either way)

Thanks.
Alexander



Thanks Andreas, you are right.

OK, so I'm sending a third version which just enhances the documentation
of the -fipa-ra option.

OK.

Jeff



Re: [PATCH] Giant concepts patch

2016-07-19 Thread Jason Merrill
On Sun, Jul 10, 2016 at 11:20 AM, Andrew Sutton wrote:
> I did find another bug building cmcstl2, hence the attached
> disable-opt patch. For some reason, the memoization of concept
> satisfaction is giving memoized results for concept + args that have
> not yet been evaluated. This is exactly the same problem that made me
> disable the lookup/memoize_constraint_sat optimizations. Somehow I'm
> getting the same hash code for different arguments, and they also
> happen to compare equal.

This bug turned out to be e.g. substituting int into "requires
C", which fails because int has no foo member, and
therefore deciding that C is false.

After I fixed that, I tried turning on the constraint memos, which
didn't seem to break anything.

I've pushed to the jason/concepts-rewrite branch again.  See any
reason I shouldn't merge to trunk now?

Jason


Re: [RFC][IPA-VRP] Add support for IPA VRP in ipa-cp/ipa-prop

2016-07-19 Thread kugan

Hi Martin,


On 19/07/16 18:22, kugan wrote:

Hi Martin,

Thanks for the review.  I have revised the patch based on the review.
Please see the comments below.



Maybe it is better to separate value range and alignment summary 
writing/reading into different functions.  Here is another updated version 
which does this.
This should be more readable and easier to maintain.  Bootstrapped (LTO and 
normal) and regression tested on x86-64-linux.  There are a few test-case 
regressions which I am looking into:


Tests that now fail, but worked before:

27_io/basic_istream/get/char/1.cc execution test
g++.dg/ipa/pure-const-3.C  -std=gnu++11  scan-tree-dump optimized "barvar"
g++.dg/ipa/pure-const-3.C  -std=gnu++14  scan-tree-dump optimized "barvar"
g++.dg/ipa/pure-const-3.C  -std=gnu++98  scan-tree-dump optimized "barvar"
gcc.dg/guality/pr54519-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  line 20 y == 25
gcc.dg/guality/pr54519-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  line 23 y == 117
gcc.dg/torture/ftrapv-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  execution test


New tests that FAIL:

gcc.dg/guality/pr54519-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  line 20 z == 6
gcc.dg/guality/pr54519-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  line 23 z == 8
gcc.dg/tree-ssa/pr22117.c scan-tree-dump-times evrp "Folding predicate 
r_.* != 0B to 0" 1


I will send the patch with testcase fix and Changelog based on the 
preference.



Thanks,
Kugan
From 7fe2e29a6ade143c6826a5738e1430fcbe09b05e Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah 
Date: Tue, 21 Jun 2016 12:43:01 +1000
Subject: [PATCH 5/6] Add ipa vrp

---
 gcc/common.opt  |   4 +
 gcc/ipa-cp.c| 237 +++-
 gcc/ipa-prop.c  | 214 
 gcc/ipa-prop.h  |  17 +++
 gcc/testsuite/gcc.dg/ipa/vrp1.c |  32 ++
 gcc/testsuite/gcc.dg/ipa/vrp2.c |  35 ++
 gcc/testsuite/gcc.dg/ipa/vrp3.c |  30 +
 7 files changed, 548 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/vrp1.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/vrp2.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/vrp3.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 29d0e4d..7e3ab5f 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2475,6 +2475,10 @@ ftree-evrp
 Common Report Var(flag_tree_early_vrp) Init(1) Optimization
 Perform Early Value Range Propagation on trees.
 
+fipa-vrp
+Common Report Var(flag_ipa_vrp) Init(1) Optimization
+Perform IPA Value Range Propagation.
+
 fsplit-paths
 Common Report Var(flag_split_paths) Init(0) Optimization
 Split paths leading to loop backedges.
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 4b7f6bb..760e9da 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-vrp.h"
 
 template  class ipcp_value;
 
@@ -266,6 +267,25 @@ private:
   bool meet_with_1 (unsigned new_align, unsigned new_misalign);
 };
 
+/* Lattice of value ranges.  */
+
+class ipcp_vr_lattice
+{
+public:
+  value_range m_vr;
+
+  inline bool bottom_p () const;
+  inline bool top_p () const;
+  inline bool set_to_bottom ();
+  bool meet_with (const value_range *p_vr);
+  bool meet_with (const ipcp_vr_lattice &other);
+  void init () { m_vr.type = VR_UNDEFINED; }
+  void print (FILE * f);
+
+private:
+  bool meet_with_1 (const value_range *other_vr);
+};
+
 /* Structure containing lattices for a parameter itself and for pieces of
aggregates that are passed in the parameter or by a reference in a parameter
plus some other useful flags.  */
@@ -281,6 +301,8 @@ public:
   ipcp_agg_lattice *aggs;
   /* Lattice describing known alignment.  */
   ipcp_alignment_lattice alignment;
+  /* Lattice describing value range.  */
+  ipcp_vr_lattice m_value_range;
   /* Number of aggregate lattices */
   int aggs_count;
   /* True if aggregate data were passed by reference (as opposed to by
@@ -348,6 +370,16 @@ ipa_get_poly_ctx_lat (struct ipa_node_params *info, int i)
   return &plats->ctxlat;
 }
 
+/* Return the lattice corresponding to the value range of the Ith formal
+   parameter of the function described by INFO.  */
+
+static inline ipcp_vr_lattice *
+ipa_get_vr_lat (struct ipa_node_params *info, int i)
+{
+  struct ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+  return &plats->m_value_range;
+}
+
 /* Return whether LAT is a lattice with a single constant and without an
undefined value.  */
 
@@ -458,6 +490,14 @@ ipcp_alignment_lattice::print (FILE * f)
 fprintf (f, " Alignment %u, misalignment %u\n", align, misalign);
 }
 
+/* Print value range lattice to F.  */
+
+void
+ipcp_vr_lattice::print (FILE * f)
+{
+  dump_value_range (f, &m_vr);
+}
+
 /* Print all ipcp_lattices of all functions to F.  */

Re: [patch, Fortran] Fix some string temporaries

2016-07-19 Thread Thomas Koenig

Hi Mikael,


Then handle the GFC_DEP_ERROR here. Or initialize fin_dep with
GFC_DEP_NODEP at the beginning, as you prefer.
OK with either (and the unreachable assertions).


Here is the patch the way I committed it.

Thanks for the review and the comments.

Regards

Thomas

2016-07-19  Thomas Koenig  

PR fortran/71902
* dependency.c (gfc_check_dependency): Use dep_ref.  Handle case
if identical is true and two array element references differ.
(gfc_dep_resolver):  Move most of the code to dep_ref.
(dep_ref):  New function.
* frontend-passes.c (realloc_string_callback):  Name temporary
variable "realloc_string".

2016-07-19  Thomas Koenig  

PR fortran/71902
* gfortran.dg/dependency_47.f90:  New test.
Index: dependency.c
===
--- dependency.c	(Revision 238223)
+++ dependency.c	(Arbeitskopie)
@@ -54,6 +54,8 @@ enum gfc_dependency
 static gfc_dependency check_section_vs_section (gfc_array_ref *,
 		gfc_array_ref *, int);
 
+static gfc_dependency dep_ref (gfc_ref *, gfc_ref *, gfc_reverse *);
+
 /* Returns 1 if the expr is an integer constant value 1, 0 if it is not or
def if the value could not be determined.  */
 
@@ -1316,14 +1318,34 @@ gfc_check_dependency (gfc_expr *expr1, gfc_expr *e
 	  return 0;
 	}
 
-  if (identical)
-	return 1;
-
   /* Identical and disjoint ranges return 0,
 	 overlapping ranges return 1.  */
   if (expr1->ref && expr2->ref)
-	return gfc_dep_resolver (expr1->ref, expr2->ref, NULL);
+	{
+	  gfc_dependency dep;
+	  dep = dep_ref (expr1->ref, expr2->ref, NULL);
+	  switch (dep)
+	{
+	case GFC_DEP_EQUAL:
+	  return identical;
 
+	case GFC_DEP_FORWARD:
+	  return 0;
+
+	case GFC_DEP_BACKWARD:
+	  return 1;
+
+	case GFC_DEP_OVERLAP:
+	  return 1;
+
+	case GFC_DEP_NODEP:
+	  return 0;
+
+	default:
+	  gcc_unreachable();
+	}
+	}
+
   return 1;
 
 case EXPR_FUNCTION:
@@ -2052,11 +2074,39 @@ ref_same_as_full_array (gfc_ref *full_ref, gfc_ref
	2 : array references are overlapping but reversal of one or
 	more dimensions will clear the dependency.
	1 : array references are overlapping.
-   	0 : array references are identical or not overlapping.  */
+   	0 : array references are identical or can be handled in a forward loop.  */
 
 int
 gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gfc_reverse *reverse)
 {
+  enum gfc_dependency dep;
+  dep = dep_ref (lref, rref, reverse);
+  switch (dep)
+{
+case GFC_DEP_EQUAL:
+  return 0;
+
+case GFC_DEP_FORWARD:
+  return 0;
+
+case GFC_DEP_BACKWARD:
+  return 2;
+
+case GFC_DEP_OVERLAP:
+  return 1;
+
+case GFC_DEP_NODEP:
+  return 0;
+
+default:
+  gcc_unreachable();
+}
+}
+
+
+static gfc_dependency
+dep_ref (gfc_ref *lref, gfc_ref *rref, gfc_reverse *reverse)
+{
   int n;
   int m;
   gfc_dependency fin_dep;
@@ -2079,21 +2129,22 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gf
 	  /* The two ranges can't overlap if they are from different
 	 components.  */
 	  if (lref->u.c.component != rref->u.c.component)
-	return 0;
+	return GFC_DEP_NODEP;
 	  break;
 
 	case REF_SUBSTRING:
 	  /* Substring overlaps are handled by the string assignment code
 	 if there is not an underlying dependency.  */
-	  return (fin_dep == GFC_DEP_OVERLAP) ? 1 : 0;
 
+	  return fin_dep == GFC_DEP_ERROR ? GFC_DEP_NODEP : fin_dep;
+
 	case REF_ARRAY:
 
 	  if (ref_same_as_full_array (lref, rref))
-	return 0;
+	return GFC_DEP_EQUAL;
 
 	  if (ref_same_as_full_array (rref, lref))
-	return 0;
+	return GFC_DEP_EQUAL;
 
 	  if (lref->u.ar.dimen != rref->u.ar.dimen)
 	{
@@ -2104,7 +2155,7 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gf
 		fin_dep = gfc_full_array_ref_p (lref, NULL) ? GFC_DEP_EQUAL
 			: GFC_DEP_OVERLAP;
 	  else
-		return 1;
+		return GFC_DEP_OVERLAP;
 	  break;
 	}
 
@@ -2148,7 +2199,7 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gf
 
 	  /* If any dimension doesn't overlap, we have no dependency.  */
 	  if (this_dep == GFC_DEP_NODEP)
-		return 0;
+		return GFC_DEP_NODEP;
 
 	  /* Now deal with the loop reversal logic:  This only works on
 		 ranges and is activated by setting
@@ -2215,7 +2266,7 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gf
 	  /* Exactly matching and forward overlapping ranges don't cause a
 	 dependency.  */
 	  if (fin_dep < GFC_DEP_BACKWARD)
-	return 0;
+	return fin_dep == GFC_DEP_ERROR ? GFC_DEP_NODEP : fin_dep;
 
 	  /* Keep checking.  We only have a dependency if
 	 subsequent references also overlap.  */
@@ -2233,7 +2284,7 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gf
 
   /* Assume the worst if we nest to different depths.  */
   if (lref || rref)
-return 1;
+return GFC_DEP_OVERLAP;
 
-  return fin_dep == GFC_DEP_OVERLAP;
+  re

Re: [PATCH] libstdc++/71856 Define _GLIBCXX_PARALLEL_ASSERTIONS

2016-07-19 Thread Jonathan Wakely

On 13/07/16 18:26 +0100, Jonathan Wakely wrote:

This fixes a conflict between how Parallel Mode has always used the
_GLIBCXX_ASSERTIONS macro and the new meaning we gave it for GCC 6
(enabling the lightweight debug checks).

It doesn't make sense for Parallel Mode to own that macro, and it
might be useful to enable Parallel Mode assertions without the other
checks, so I've changed all the Parallel Mode headers to check
_GLIBCXX_PARALLEL_ASSERTIONS instead. If that's not defined then it
defaults to the value of _GLIBCXX_ASSERTIONS, to preserve the old
behaviour.
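
A minimal sketch of the defaulting scheme described above, using stand-in macro names (TOY_ASSERTIONS and TOY_PARALLEL_ASSERTIONS are hypothetical, chosen so the fragment compiles on its own).  It also shows why the base macro wants to be defined to 1 rather than empty: an empty definition would leave `#if' with no expression to evaluate.

```c
#include <assert.h>

/* Stand-in for the general assertions macro.  Defined to 1, not empty,
   so that it can be used in an #if expression.  */
#define TOY_ASSERTIONS 1

/* The parallel macro defaults to the general one unless the user set it
   explicitly, which preserves the old behaviour.  */
#ifndef TOY_PARALLEL_ASSERTIONS
# ifdef TOY_ASSERTIONS
#  define TOY_PARALLEL_ASSERTIONS TOY_ASSERTIONS
# else
#  define TOY_PARALLEL_ASSERTIONS 0
# endif
#endif

/* Return the length of [lo, hi).  The range check is compiled in only
   when the parallel assertions are enabled.  */
static int
toy_range_length (int lo, int hi)
{
#if TOY_PARALLEL_ASSERTIONS
  assert (lo <= hi);
#endif
  return hi - lo;
}
```

Defining TOY_PARALLEL_ASSERTIONS to 0 before this fragment would switch the parallel checks off while leaving the general ones alone.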

PR libstdc++/71856
* include/bits/c++config (_GLIBCXX_ASSERTIONS): Define to 1 not empty.
* include/parallel/compiletime_settings.h (_GLIBCXX_ASSERTIONS):
Rename to _GLIBCXX_PARALLEL_ASSERTIONS and make default value depend
on _GLIBCXX_ASSERTIONS.
* include/parallel/balanced_quicksort.h: Rename _GLIBCXX_ASSERTIONS.
Include  for sleep.
* include/parallel/losertree.h: Rename _GLIBCXX_ASSERTIONS.
* include/parallel/merge.h: Likewise.
* include/parallel/multiway_merge.h: Likewise.
* include/parallel/partition.h: Likewise.
* include/parallel/queue.h: Likewise.
* include/parallel/sort.h: Likewise.
* testsuite/25_algorithms/headers/algorithm/
parallel_algorithm_assert.cc: New.


Here is a smaller patch for the gcc-6-branch, which doesn't rename the
macro, but just makes it possible to include  with
_GLIBCXX_DEBUG defined.

Tested x86_64-linux, committed to gcc-6-branch.

commit 41cc05d22def68878d0d1b3ce87d46976098189a
Author: Jonathan Wakely 
Date:   Tue Jul 19 19:03:04 2016 +0100

Do not define _GLIBCXX_ASSERTIONS in Parallel Mode

	PR libstdc++/71856
	* include/bits/c++config (_GLIBCXX_ASSERTIONS): Define to 1 not empty.
	* include/parallel/balanced_quicksort.h: Include  for sleep.
	* include/parallel/compiletime_settings.h (_GLIBCXX_ASSERTIONS):
	Do not define here.

diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index 57024e4..4625607 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -414,7 +414,7 @@ namespace std
 
 // Debug Mode implies checking assertions.
 #ifdef _GLIBCXX_DEBUG
-# define _GLIBCXX_ASSERTIONS
+# define _GLIBCXX_ASSERTIONS 1
 #endif
 
 // Disable std::string explicit instantiation declarations in order to assert.
diff --git a/libstdc++-v3/include/parallel/balanced_quicksort.h b/libstdc++-v3/include/parallel/balanced_quicksort.h
index 65dec30..16ef1ef 100644
--- a/libstdc++-v3/include/parallel/balanced_quicksort.h
+++ b/libstdc++-v3/include/parallel/balanced_quicksort.h
@@ -53,6 +53,9 @@
 
 #if _GLIBCXX_ASSERTIONS
 #include 
+#ifdef _GLIBCXX_HAVE_UNISTD_H
+#include <unistd.h>
+#endif
 #endif
 
 namespace __gnu_parallel
diff --git a/libstdc++-v3/include/parallel/compiletime_settings.h b/libstdc++-v3/include/parallel/compiletime_settings.h
index f4fb404..c1758aa 100644
--- a/libstdc++-v3/include/parallel/compiletime_settings.h
+++ b/libstdc++-v3/include/parallel/compiletime_settings.h
@@ -55,12 +55,6 @@
 #define _GLIBCXX_SCALE_DOWN_FPU 0
 #endif
 
-#ifndef _GLIBCXX_ASSERTIONS
-/** @brief Switch on many _GLIBCXX_PARALLEL_ASSERTions in parallel code.
- *  Should be switched on only locally. */
-#define _GLIBCXX_ASSERTIONS 0
-#endif
-
 #ifndef _GLIBCXX_RANDOM_SHUFFLE_CONSIDER_L1
 /** @brief Switch on many _GLIBCXX_PARALLEL_ASSERTions in parallel code.
  *  Consider the size of the L1 cache for


Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-19 Thread Bernd Edlinger
On 07/19/16 19:30, Bernd Edlinger wrote:
> On 07/19/16 18:37, Jakub Jelinek wrote:
>> On Tue, Jul 19, 2016 at 04:20:55PM +, Bernd Edlinger wrote:
>>> As discussed at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71876,
>>> we have a _very_ old hack in gcc, that recognizes certain functions by
>>> name, and inserts in some cases unsafe attributes, that don't work for
>>> a freestanding environment.
>>>
>>> It is unsafe to return ECF_MAY_BE_ALLOCA, ECF_LEAF and ECF_NORETURN
>>> from special_function_p, just by the name of the function, especially
>>> for less well known functions, like "getcontext" or "savectx", which
>>> could easily be used for something completely different.
>>>
>>> Moreover, from the backend library we cannot check flag_hosted, or if
>>> the function has "C" or "C++" binding.
>>
>> I believe this will regress handling of various functions.
>> E.g. for alloca (as opposed to
>> __builtin_alloca/__builtin_alloca_with_align),
>> this means ECF_MAY_BE_ALLOCA will not be set anymore.
>>
>
> That depends on which options are used: with -ansi and -ffreestanding,
> alloca is just a normal function, which is kind of correct.
>
> If you include glibc's , alloca is directly defined to
> __builtin_alloca(x), which should work always.
>
> If alloca is declared as void *alloca(size_t); it is also recognized as
> built-in unless -ansi or -ffreestanding is used, so that handling was
> in a way duplicated already.
>
> So I see no regression here.
>
>> _longjmp/siglongjmp will no longer be ECF_NORETURN (glibc
>> doesn't declare them as such), __sigsetjmp will no longer be ECF_LEAF.
>
> Which version of glibc do you refer to?
>
> My 2.19-0ubuntu6.9 has:
>
> extern void _longjmp (struct __jmp_buf_tag __env[1], int __val)
>   __THROWNL __attribute__ ((__noreturn__));
>
> extern int __sigsetjmp (struct __jmp_buf_tag __env[1], int __savemask)
> __THROWNL;
>
> So yes, __THROWNL is "__attribute__ ((__nothrow__))".
>
> But they also have __THROW around which is "__attribute__ ((__nothrow__
> __LEAF))", so that is just a minor bug in the glibc header, the header
> should declare it __THROW if it is no leaf.
>
> If you are concerned about the leaf attribute, it would
> be easy to add a builtin for _longjmp, and __sigsetjmp, as
> the _ is reserved anyway.  However I considered it an implementation
> detail of glibc, that could change, and I did not check newlib on that
> either.
>
> Should I add built-ins for _longjmp and __sigsetjmp, and check if
> that works for newlib too?
>
>
> Thanks
> Bernd.


I have tried to find a test case with setjmp/longjmp where the leaf
attribute on the setjmp makes a difference, and the most aggressive
test case I could think of was this:

cat test1.c
static long env0[16], env1[16];
static int x = 0;

int jmp(void*) __attribute__((returns_twice, nothrow, leaf));
void go(void*, void*, int) __attribute__((nothrow));
void doit()
{
   x++;
}

int
test()
{
   static  int xx;
   xx = x;
   jmp (env0);
   xx += x;
   jmp (env1);
   go (env0, env1, xx + x);
   return xx;
}

gcc -O3 -S test1.c && inspect assembler code.

Here jmp is marked leaf and returns_twice, but go is not leaf
and can either call doit, or jump to env0 or env1, or simply
return. It would be wrong to expect "x" not to change between
the jmp calls.  gcc-4.8.4 generates invalid code out of this,
and correct code if leaf is not in the function declaration.
However on trunk the code is correct, and the assembler code is
completely identical with or without leaf.
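
For readers who want something that links and runs, here is a minimal sketch of the same effect using the standard setjmp/longjmp. All demo_* names are hypothetical and this is not the test case above. It only shows that a module static whose address is never taken can legitimately change between the first and second return of setjmp, which is why treating such a function as leaf is questionable.

```cpp
#include <csetjmp>

static std::jmp_buf demo_env;   // hypothetical names throughout
static int demo_x = 0;          // module static, address never taken

// Changes the static, then resumes execution at the setjmp call site.
[[noreturn]] static void demo_bump_and_jump()
{
  ++demo_x;
  std::longjmp(demo_env, 1);
}

// Returns how much demo_x changed between the first and the second
// return of setjmp.
int demo_change_across_setjmp()
{
  // Static, like xx in the test case, so the longjmp cannot leave it
  // with an indeterminate value.
  static int seen;
  demo_x = 0;
  seen = demo_x;
  if (setjmp(demo_env) == 0)
    demo_bump_and_jump();   // control reappears at the setjmp above
  return demo_x - seen;
}
```

A compiler that assumed demo_x could not change across the setjmp call would fold the result to 0; the value actually observed at run time is 1.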

I wondered when that was fixed...

So I googled a bit around "gcc returns_twice leaf", and found
something interesting, where you argued that it would
be wrong to put the leaf attribute on the setjmp function
in glibc: https://bugzilla.redhat.com/show_bug.cgi?id=752905

"Jakub Jelinek 2011-11-10 14:38:50 EST

setjmp/__sigsetjmp/_setjmp definitely must be __THROWNL rather than __THROW.
Similarly setcontext and swapcontext.  They all have side-effects which 
can modify module static variables that don't have address taken."


I completely agree with you.  But how can it be that special_function_p
seems to do the complete opposite, and adds leaf to the setjmp
declaration even if that is not written in the header file?


Thanks
Bernd.


Re: [PATCH] Avoid invoking ranlib on libbackend.a

2016-07-19 Thread Andrew Pinski
On Tue, Jul 19, 2016 at 1:20 AM, Richard Biener wrote:
> On Tue, Jul 19, 2016 at 2:39 AM, Patrick Palka  wrote:
>> On Mon, 18 Jul 2016, Segher Boessenkool wrote:
>>
>>> On Mon, Jul 18, 2016 at 06:35:11AM -0500, Segher Boessenkool wrote:
>>> > Or, if using GNU ar, you can even use -S, if that helps (after testing
>>> > for it in configure, of course).
>>>
>>> I meant -T.  Some day I will learn how to type, promise!
>>
>> According to the documentation of GNU ar,
>>
>>   "gnu ar can optionally create a thin archive, which contains a symbol
>>   index and references to the original copies of the member files of the
>>   archive. This is useful for building libraries for use within a local
>>   build tree, where the relocatable objects are expected to remain
>>   available, and copying the contents of each object would only waste time
>>   and space."
>>
>> Since the objects which libbackend.a is composed of remain available
>> throughout the build process I think it should be safe to make
>> libbackend.a a thin archive.
>>
>> So here's a patch which builds libbackend.a as a thin archive if the
>> toolchain supports it.  The time it takes to rebuild a
>> --disable-bootstrap tree after touching a single source file is now 7.5s
>> instead of 35+s -- a much better speedup than when simply eliding the
>> call to ranlib since the archive is now 1-5MB in size instead of 450MB.
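
As a side note on the size difference: a thin archive stores references to its members instead of copies, and it can be told apart from a regular archive by its first eight bytes. The sketch below assumes the usual GNU ar magic strings ("!<arch>\n" for a regular archive, "!<thin>\n" for a thin one). The ArchiveKind enum and classify_archive function are hypothetical names for illustration, not part of any tool.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical classification of an archive by its leading magic bytes.
enum class ArchiveKind { Regular, Thin, Unknown };

// Looks only at the first eight bytes of the buffer; it does not parse
// member headers or the symbol index.
ArchiveKind classify_archive(const char* data, std::size_t len)
{
  const std::size_t magic_len = 8;
  if (data == nullptr || len < magic_len)
    return ArchiveKind::Unknown;
  if (std::memcmp(data, "!<arch>\n", magic_len) == 0)
    return ArchiveKind::Regular;   // members are stored inside the file
  if (std::memcmp(data, "!<thin>\n", magic_len) == 0)
    return ArchiveKind::Thin;      // members are referenced by path
  return ArchiveKind::Unknown;
}
```

A build-system check could read the first eight bytes of libbackend.a and pass them to such a function to confirm which kind of archive was produced.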
>>
>> Instead of changing AR_FLAGS, only the invocation of ar on libbackend.a
>> is changed because that is by far the largest archive (by a factor of
>> 20x) and it seems less risky this way.
>>
>> One thing that was not clear to me is whether the object file paths
>> stored in a thin archive are relative or absolute paths.  If they are
>> absolute paths then that would be a problem due to how the build system
>> moves build directories in between stages (gcc/ -> prev-gcc/ etc).  But
>> it looks like the object file paths are relative to the location of the
>> archive which is compatible.
>>
>> Bootstrapped on x86_64-pc-linux-gnu.  Thoughts?
>
> I like it.  Improving re-build time in my dev tree is very much
> welcome, and yes,
> libbackend build time is a big part of it usually (plus of course cc1
> link time).

I like it too because a lot of my build time is spent creating
libbackend and linking.  I usually go and grab coffee during that time
due to the disk usage, and because the kernel likes to grind to a halt
at this point in the build.

Thanks,
Andrew

>
> Richard.
>
>> -- >8 --
>>
>> Subject: [PATCH] Build libbackend.a as a thin archive if possible
>>
>> gcc/ChangeLog:
>>
>> * configure.ac (thin_archive_support): New variable.  AC_SUBST it.
>> * configure: Regenerate.
>> * Makefile.in (THIN_ARCHIVE_SUPPORT): New variable.
>> (USE_THIN_ARCHIVES): New variable.
>> (libbackend.a): If USE_THIN_ARCHIVES then pass T to ar to build
>> this archive as a thin archive.
>> ---
>>  gcc/Makefile.in  | 17 +
>>  gcc/configure| 20 ++--
>>  gcc/configure.ac | 13 +
>>  3 files changed, 48 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>> index 0786fa3..15a879b 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -275,6 +275,17 @@ else
>>  LLINKER = $(LINKER)
>>  endif
>>
>> +THIN_ARCHIVE_SUPPORT = @thin_archive_support@
>> +
>> +USE_THIN_ARCHIVES = no
>> +ifeq ($(THIN_ARCHIVE_SUPPORT),yes)
>> +ifeq ($(AR_FLAGS),rc)
>> +ifeq ($(RANLIB_FLAGS),)
>> +USE_THIN_ARCHIVES = yes
>> +endif
>> +endif
>> +endif
>> +
>>  # ---
>>  # Programs which operate on the build machine
>>  # ---
>> @@ -1882,8 +1893,14 @@ compilations: $(BACKEND)
>>  # This archive is strictly for the host.
>>  libbackend.a: $(OBJS)
>> -rm -rf libbackend.a
>> +   @# Build libbackend.a as a thin archive if possible, as doing so
>> +   @# significantly reduces build times.
>> +ifeq ($(USE_THIN_ARCHIVES),yes)
>> +   $(AR) $(AR_FLAGS)T libbackend.a $(OBJS)
>> +else
>> $(AR) $(AR_FLAGS) libbackend.a $(OBJS)
>> -$(RANLIB) $(RANLIB_FLAGS) libbackend.a
>> +endif
>>
>>  libcommon-target.a: $(OBJS-libcommon-target)
>> -rm -rf libcommon-target.a
>> diff --git a/gcc/configure b/gcc/configure
>> index ed44472..81c81b3 100755
>> --- a/gcc/configure
>> +++ b/gcc/configure
>> @@ -679,6 +679,7 @@ zlibinc
>>  zlibdir
>>  HOST_LIBS
>>  enable_default_ssp
>> +thin_archive_support
>>  libgcc_visibility
>>  gcc_cv_readelf
>>  gcc_cv_objdump
>> @@ -18475,7 +18476,7 @@ else
>>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>>lt_status=$lt_dlunknown
>>cat > conftest.$ac_ext <<_LT_EOF
>> -#line 18478 "configure"
>> +#line 18479 "configure"
>>  #include "confdefs.h"
>>
>>  #if HAVE_DLFCN_H
>> @@ -18581,7 +18582,7 @@ else
>>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>>lt_status=$lt_dlunknown
>>cat > conftest.$ac_ext <<_

[VRP] Use alloc-pool and obstack for value_range and vr->equiv allocations

2016-07-19 Thread kugan

Hi Richard,

As discussed in IPA-VRP discussion, this patch makes tree-vrp 
allocations use alloc-pool and obstack for value_range and vr->equiv 
respectively. Other allocations are rare and left as they are.
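
The general idea of pool allocation with a single bulk release can be sketched as follows. This is an illustration only: DemoPool, DemoRange and demo_new_range are hypothetical names, and the code is not GCC's object_allocator or bitmap_obstack. It assumes trivially destructible objects, and it zeroes each object after allocation in the same spirit as the memset in the patch.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a small record such as a value range.
struct DemoRange { int kind; long min; long max; void* equiv; };

// Hands out objects from fixed-size blocks and frees them all at once.
template <typename T>
class DemoPool
{
  static_assert(std::is_trivially_destructible<T>::value,
                "this sketch never runs per-object destructors");
  enum : std::size_t { per_block = 64 };

public:
  // Returns storage for one object; contents are not zeroed here.
  T* allocate()
  {
    if (blocks_.empty() || used_ == per_block)
      {
        blocks_.push_back(std::unique_ptr<T[]>(new T[per_block]));
        used_ = 0;
      }
    return &blocks_.back()[used_++];
  }

  // Frees every block in one go; no per-object free is needed.
  void release()
  {
    blocks_.clear();
    used_ = 0;
  }

  std::size_t block_count() const { return blocks_.size(); }

private:
  std::vector<std::unique_ptr<T[]>> blocks_;
  std::size_t used_ = 0;
};

// Allocates a range from the pool and clears it explicitly.
DemoRange* demo_new_range(DemoPool<DemoRange>& pool)
{
  DemoRange* r = pool.allocate();
  std::memset(r, 0, sizeof *r);
  return r;
}
```

The design point is that teardown no longer walks every object: releasing the pool replaces a loop of individual frees, which mirrors the change to vrp_finalize in the diff.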


Bootstrapped and regression tested on x86-64-linux with no new 
regressions. Is this OK for trunk?


Thanks,
Kugan


gcc/ChangeLog:

2016-07-20  Kugan Vivekanandarajah  

* tree-vrp.c (set_value_range): Use vrp_equiv_obstack with
BITMAP_ALLOC.
(add_equivalence): Likewise.
(get_value_range): Allocate value range with vrp_value_range_pool.
(vrp_initialize): Initialize vrp_equiv_obstack for equiv allocation.
(vrp_finalize): Release vrp_equiv_obstack and vrp_value_range_pool.


diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 68f2e90..0f7bdf7 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "omp-low.h"
 #include "target.h"
 #include "case-cfn-macros.h"
+#include "alloc-pool.h"
 
 /* Range of values that can be associated with an SSA_NAME after VRP
has executed.  */
@@ -87,6 +88,10 @@ struct value_range
 
 #define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL }
 
+/* Allocation pools for tree-vrp allocations.  */
+static object_allocator<value_range> vrp_value_range_pool ("Tree VRP value ranges");
+static bitmap_obstack vrp_equiv_obstack;
+
 /* Set of SSA names found live during the RPO traversal of the function
for still active basic-blocks.  */
 static sbitmap *live;
@@ -406,7 +411,7 @@ set_value_range (value_range *vr, enum value_range_type t, tree min,
  bitmaps, only do it if absolutely necessary.  */
   if (vr->equiv == NULL
   && equiv != NULL)
-vr->equiv = BITMAP_ALLOC (NULL);
+vr->equiv = BITMAP_ALLOC (&vrp_equiv_obstack);
 
   if (equiv != vr->equiv)
 {
@@ -688,7 +693,8 @@ get_value_range (const_tree var)
 return CONST_CAST (value_range *, &vr_const_varying);
 
   /* Create a default value range.  */
-  vr_value[ver] = vr = XCNEW (value_range);
+  vr_value[ver] = vr = vrp_value_range_pool.allocate ();
+  memset (vr, 0, sizeof (*vr));
 
   /* Defer allocating the equivalence set.  */
   vr->equiv = NULL;
@@ -817,7 +823,7 @@ add_equivalence (bitmap *equiv, const_tree var)
   value_range *vr = vr_value[ver];
 
   if (*equiv == NULL)
-*equiv = BITMAP_ALLOC (NULL);
+*equiv = BITMAP_ALLOC (&vrp_equiv_obstack);
   bitmap_set_bit (*equiv, ver);
   if (vr && vr->equiv)
 bitmap_ior_into (*equiv, vr->equiv);
@@ -6882,6 +6888,7 @@ vrp_initialize (void)
   num_vr_values = num_ssa_names;
   vr_value = XCNEWVEC (value_range *, num_vr_values);
   vr_phi_edge_counts = XCNEWVEC (int, num_ssa_names);
+  bitmap_obstack_initialize (&vrp_equiv_obstack);
 
   FOR_EACH_BB_FN (bb, cfun)
 {
@@ -10206,15 +10213,10 @@ vrp_finalize (bool warn_array_bounds_p)
   identify_jump_threads ();
 
   /* Free allocated memory.  */
-  for (i = 0; i < num_vr_values; i++)
-if (vr_value[i])
-  {
-   BITMAP_FREE (vr_value[i]->equiv);
-   free (vr_value[i]);
-  }
-
   free (vr_value);
   free (vr_phi_edge_counts);
+  bitmap_obstack_release (&vrp_equiv_obstack);
+  vrp_value_range_pool.release ();
 
   /* So that we can distinguish between VRP data being available
  and not available.  */


Re: C++ PATCH for c++/67164 (error with variadic templates)

2016-07-19 Thread Jason Merrill
On Thu, Mar 3, 2016 at 8:41 PM, Jason Merrill  wrote:
> When we instantiate an element of a pack expansion, we replace the argument
> pack in the template argument vec with an ARGUMENT_PACK_SELECT which
> indicates the desired element of the vec.  If the args have been used to
> instantiate other templates as well, the args of those instances get
> modified as well, which can lead to strange results when we run into
> ARGUMENT_PACK_SELECT in inappropriate places.  This patch fixes this issue
> by making a copy of the template args before we start messing with them.

...so we don't need to deal with ARGUMENT_PACK_SELECT in the hash
tables anymore.
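
The aliasing problem described in the quoted text can be sketched in isolation. The names below (DemoArgs, demo_select_in_place, demo_select_on_copy) are hypothetical and strings stand in for template arguments; this is not the code in pt.c, only an illustration of why copying the argument vector before modifying it keeps other users of the same vector unaffected.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a template argument vector.
using DemoArgs = std::vector<std::string>;

// Problematic pattern: rewrites the shared vector in place, so every
// other holder of the same vector sees the temporary selection too.
void demo_select_in_place(const std::shared_ptr<DemoArgs>& args,
                          std::size_t slot, const std::string& element)
{
  (*args)[slot] = element;
}

// Safer pattern: copy first, then rewrite only the private copy.
DemoArgs demo_select_on_copy(const std::shared_ptr<DemoArgs>& args,
                             std::size_t slot, const std::string& element)
{
  DemoArgs local = *args;
  local[slot] = element;
  return local;
}
```

With the copying variant, code that hashes or compares the original vector never encounters the temporary placeholder, which is the property the follow-up commit relies on.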

commit b5cb8944e6feb6f4c53170c5d58bc50e4fa5503a
Author: Jason Merrill 
Date:   Tue Jul 19 12:49:53 2016 -0400

PR c++/67164 - clean up dead code

* pt.c (iterative_hash_template_arg, template_args_equal): Don't
handle ARGUMENT_PACK_SELECT.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 916fd7b..7c7024c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1704,9 +1704,7 @@ iterative_hash_template_arg (tree arg, hashval_t val)
 STRIP_NOPS (arg);
 
   if (TREE_CODE (arg) == ARGUMENT_PACK_SELECT)
-/* We can get one of these when re-hashing a previous entry in the middle
-   of substituting into a pack expansion.  Just look through it.  */
-arg = ARGUMENT_PACK_SELECT_FROM_PACK (arg);
+gcc_unreachable ();
 
   code = TREE_CODE (arg);
   tclass = TREE_CODE_CLASS (code);
@@ -7894,17 +7892,7 @@ template_args_equal (tree ot, tree nt)
   return 1;
 }
   else if (ot && TREE_CODE (ot) == ARGUMENT_PACK_SELECT)
-{
-  /* We get here probably because we are in the middle of substituting
- into the pattern of a pack expansion. In that case the
-ARGUMENT_PACK_SELECT temporarily replaces the pack argument we are
-interested in. So we want to use the initial pack argument for
-the comparison.  */
-  ot = ARGUMENT_PACK_SELECT_FROM_PACK (ot);
-  if (nt && TREE_CODE (nt) == ARGUMENT_PACK_SELECT)
-   nt = ARGUMENT_PACK_SELECT_FROM_PACK (nt);
-  return template_args_equal (ot, nt);
-}
+gcc_unreachable ();
   else if (TYPE_P (nt))
 {
   if (!TYPE_P (ot))

