date:20140610

[PATCH, cpp] Fix line directive bug‏

2014-06-10 Thread Nicholas Ormrod

PR preprocessor/60723

Description:

When line directives are inserted into the expansion of a macro, the line
directive may erroneously specify the file as being a system file. This
causes certain warnings to be suppressed in the rest of the file.

The fact that line directives are ever inserted into a macro is itself a
half-bug. Please see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60723 for
full details.


Patch:

Information for locations is, for similar pieces of data, read from a
LOCATION_* macro. The sysp read which was causing the error was using an
inconsistent method to read the data. Resolving this is a two-line fix.


Testing:

make check-gcc on a clean build generates 104754 expected passes, 13
unexpected failures, 245 expected failures, and 1624 unsupported tests. The
new test case, on the clean checkout, generates one expected pass and one
unexpected failure. Once the changes are in, the original test numbers are
unchanged, but the new test case generates two expected passes.


Details:

2014-06-10  Nicholas Ormrod  

PR preprocessor/60723
* input.h: Add LOCATION_* macro for sysp
* c-family/c-ppoutput.c (print_line_1): Use LOCATION_SYSP for
consistency with other LOCATION_* accesses.
* testsuite/gcc.dg/cpp/syshdr4.c: New test case.
* testsuite/gcc.dg/cpp/syshdr4.h: New test case.


diff --git a/gcc/c-family/c-ppoutput.c b/gcc/c-family/c-ppoutput.c
index f3b5fa4..21484d9 100644
--- a/gcc/c-family/c-ppoutput.c
+++ b/gcc/c-family/c-ppoutput.c
@@ -384,7 +384,7 @@ print_line_1 (source_location src_loc, const char
*special_flags, FILE *stream)
print.src_line == 0 ? 1 : print.src_line,
to_file_quoted, special_flags);

-  sysp = in_system_header_at (src_loc);
+  sysp = LOCATION_SYSP(src_loc);
   if (sysp == 2)
 fputs (" 3 4", stream);
   else if (sysp == 1)
diff --git a/gcc/input.h b/gcc/input.h
index d910bb8..ff42e04 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -51,6 +51,7 @@ extern location_t input_location;
#define LOCATION_FILE(LOC) ((expand_location (LOC)).file)
#define LOCATION_LINE(LOC) ((expand_location (LOC)).line)
#define LOCATION_COLUMN(LOC)((expand_location (LOC)).column)
+#define LOCATION_SYSP(LOC) ((expand_location (LOC)).sysp)
#define LOCATION_LOCUS(LOC) \
   ((IS_ADHOC_LOC (LOC)) ? get_location_from_adhoc_loc (line_table, LOC) \
: (LOC))
diff --git a/gcc/testsuite/gcc.dg/cpp/syshdr4.c
b/gcc/testsuite/gcc.dg/cpp/syshdr4.c
new file mode 100644
index 000..8296bed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/syshdr4.c
@@ -0,0 +1,23 @@
+/* Contributed by Nicholas Ormrod */
+/* Origin: PR preprocessor/60723 */
+
+/* This tests that multi-line macro callsites, which are defined
+   in system headers and whose expansion contains a builtin followed
+   by a non-builtin token, do not generate a line directive that
+   mark the current file as being a system file, when performing
+   non-integrated preprocessing. */
+/* System files suppress div-by-zero warnings, so the presence of
+   such indicates the lack of the bug. */
+
+/* { dg-do compile } */
+/* { dg-options -no-integrated-cpp } */
+
+#include "syshdr4.h"
+FOO(
+)
+
+int
+foo()
+{
+  return 1 / 0; /* { dg-warning "div-by-zero" } */
+}
diff --git a/gcc/testsuite/gcc.dg/cpp/syshdr4.h
b/gcc/testsuite/gcc.dg/cpp/syshdr4.h
new file mode 100644
index 000..e699299
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/syshdr4.h
@@ -0,0 +1,3 @@
+#pragma GCC system_header
+
+#define FOO() int line = __LINE__ ;


I have been working on a virtual Ubuntu 12.04 box.

gcc -v:
Using built-in specs.
COLLECT_GCC=./bin/gcc
COLLECT_LTO_WRAPPER=/home/njormrod/src/gcc/new_build/bin/../libexec/gcc/i686
-pc-linux-gnu/4.10.0/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/home/njormrod/src/gcc/_build
--disable-multilib --enable-languages=c,c++ --disable-libgcj
Thread model: posix
gcc version 4.10.0 20140610 (experimental) (GCC)



Cheers,
Nicholas Ormrod

Re: [Fortran-CAF][Fortran-DEV] Merge from the trunk into the branch

2014-06-10 Thread Tobias Burnus


Tobias Burnus wrote:
I have merged the trunk into the Fortran-caf branch (as Rev. 211423) 
and into the Fortran-dev branch (as Rev. 211427).


For fortran-dev, I had to fix a merge fallout, see attachment (committed 
as Rev. 211435).


Tobias


Index: trans-openmp.c
===
--- trans-openmp.c	(Revision 211427)
+++ trans-openmp.c	(Arbeitskopie)
@@ -1090,7 +1090,7 @@ gfc_trans_omp_udr_expr (gfc_omp_namelist *n, bool
   /* Enable loop reversal.  */
   for (i = 0; i < GFC_MAX_DIMENSIONS; i++)
 loop.reverse[i] = GFC_ENABLE_REVERSE;
-  gfc_conv_loop_setup (&loop, &ns->code->loc);
+  gfc_conv_loop_setup (&loop, &ns->code->loc, &syme->ts);
   gfc_copy_loopinfo_to_se (&symse, &loop);
   gfc_copy_loopinfo_to_se (&outerse, &loop);
   symse.ss = symss;

Re: Move DECL_SECTION_NAME into symtab

2014-06-10 Thread Jan Hubicka

Hi,
this patch proceeds with the conversion of sections to symtab. It adds the
verifier I described in previous mail and fixes the fallout. It also moves
implicit_section flag from decl to cgraph.

Next step I would like to do is to add an hashtable dictionary of sections used
and move representation out of STRING_CST. This is wastes a lot of tree nodes
especially with LTO where we load a lot of duplicated comdats and then quickly
remove them. STRING_CSTs are not shared.

Sadly the final version of patch doesn't seem to help AIX issues. I am looking
into them now.

Honza

* varasm.c (set_implicit_section): New function.
(resolve_unique_section): Use it to set implicit section
for aliases, too.
(get_named_text_section): Use symtab_get_node (decl)->implicit_section
(default_function_section): Likewise.
(decl_binds_to_current_def_p): Constify argument.
* varasm.h (decl_binds_to_current_def_p): Update prototype.
* asan.c (asan_protect_global): Use symtab_get_node 
(decl)->implicit_section.
* symtab.c (dump_symtab_base): Dump implicit sections.
(verify_symtab_base): Verify sanity of sectoins and comdats.
(symtab_resolve_alias): Alias share the section of its target.
(set_section_1): New function.
(symtab_node::set_section): Move here, recurse to aliases.
(verify_symtab): Check for duplicated symtab lists.
* tree-core.h (implicit_section_name_p): Remove.
* tree-vect-data-refs.c: Include varasm.h.
(vect_can_force_dr_alignment_p): Fix conditional on when
decl bints to current definition; use
symtab_get_node (decl)->implicit_section.
* cgraph.c (cgraph_make_node_local_1): Fix section set.
* cgraph.h (struct symtab_node): Add implicit_section.
(set_section): Rename to ...
(set_section_for_node): ... this one.
(set_section): Declare.
* tree.h (DECL_HAS_IMPLICIT_SECTION_NAME_P): Remove.
* lto-cgraph.c (lto_output_node, lto_output_varpool_node,
input_overwrite_node, input_varpool_node): Stream implicit_section.
* ipa.c (symtab_remove_unreachable_nodes): Do not check symtab before
removal; it will fail in LTO.

* vtable-class-hierarchy.c: Use symtab_get_node 
(var_decl)->implicit_section.
* optimize.c (cdtor_comdat_group): Fix handling of aliases.
(maybe_clone_body): Move symbol across comdat groups.
* method.c (use_thunk): Copy implicit section flag.

* go/go-gcc.cc (Gcc_backend::global_variable_set_init): Use
symtab_get_node(var_decl)->implicit_section.

* lto.c (read_cgraph_and_symbols): Remove unreachable symbols.
(do_whole_program_analysis): Use verify_symtab.
Index: asan.c
===
--- asan.c  (revision 211364)
+++ asan.c  (working copy)
@@ -1289,7 +1289,7 @@ asan_protect_global (tree decl)
 to be an array of such vars, putting padding in there
 breaks this assumption.  */
   || (DECL_SECTION_NAME (decl) != NULL_TREE
- && !DECL_HAS_IMPLICIT_SECTION_NAME_P (decl))
+ && !symtab_get_node (decl)->implicit_section)
   || DECL_SIZE (decl) == 0
   || ASAN_RED_ZONE_SIZE * BITS_PER_UNIT > MAX_OFILE_ALIGNMENT
   || !valid_constant_size_p (DECL_SIZE_UNIT (decl))
Index: cp/optimize.c
===
--- cp/optimize.c   (revision 211364)
+++ cp/optimize.c   (working copy)
@@ -191,7 +191,7 @@ cdtor_comdat_group (tree complete, tree
diff_seen = true;
   }
   grp_name[idx] = '\0';
-  gcc_assert (diff_seen);
+  gcc_assert (diff_seen || symtab_get_node (complete)->alias);
   return get_identifier (grp_name);
 }
 
@@ -553,6 +553,8 @@ maybe_clone_body (tree fn)
 *[CD][12]*.  */
  comdat_group = cdtor_comdat_group (fns[1], fns[0]);
  cgraph_get_create_node (fns[0])->set_comdat_group (comdat_group);
+ if (symtab_get_node (clone)->same_comdat_group)
+   symtab_remove_from_same_comdat_group (symtab_get_node (clone));
  symtab_add_to_same_comdat_group (symtab_get_node (clone),
   symtab_get_node (fns[0]));
}
Index: cp/vtable-class-hierarchy.c
===
--- cp/vtable-class-hierarchy.c (revision 211364)
+++ cp/vtable-class-hierarchy.c (working copy)
@@ -1249,7 +1249,7 @@ vtable_find_or_create_map_decl (tree bas
 
   set_decl_section_name (var_decl, build_string (strlen 
(".vtable_map_vars"),
  ".vtable_map_vars"));
-  DECL_HAS_IMPLICIT_SECTION_NAME_P (var_decl) = true;
+  symtab_get_node (var_decl)->implicit_section = true;
   DECL_INITIAL (var_decl) = initial_value;
 
   comdat_linkage (var_decl);
Ind

Re: [PING*2][PATCH] Extend mode-switching to support toggle (1/2)

2014-06-10 Thread Joern Rennecke

On 2 June 2014 13:34, Christian Bruel  wrote:
> Hello,
>
> Any feedback for this ? I'd like to commit only when OK for Epiphany.

>> Joern, is this new target macro interface OK with you ?

Yes, this interface should allow me to do switches between rounding
and truncating
floating-point modes with an add/subtract immediate.

However, the implentation, as posted, doesn't work - it causes memory
corruption.

It appears to work with the attached amendment patch.

=== gcc Summary ===

# of expected passes82184
# of unexpected failures41
# of unexpected successes   1
# of expected failures  90
# of unresolved testcases   2
# of unsupported tests  1585
/ssd/adapteva/bld-epiphany/gcc/xgcc  version 4.10.0 20140608
(experimental) (Epiphany toolchain (built 20140610))

This is the same as before applying the patch(es).

tmp
Description: Binary data

Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-10 Thread Jan Hubicka

> Note that I'm happy to revert the change.
> 
> I am hesitant to any approach that overloads TREE_ADDRESSABLE even more.
> It already is used for two (slightly) different things - first the
> "old" meaning that the address of the symbol is needed, second, that
> the symbol is aliased by pointers.  Those are of course related, but
> as you see they are not 100% equivalent.

An alternative is surely to add a flag to varpool.  But again, having several
flags of similar names and slightly different meanings doesn't make things more
maintaible either.
> 
> As I already added DECL_NONALIASED (for VAR_DECLs) to "fix" that
> coverage counter issue (those are TREE_STATIC but they have their
> address taken - still we know that no pointers alias the accesses),
> we can as well rely on that flag - but then we should set it whenever
> a TU-local decl does not have its address taken (!TREE_ADDRESSABLE).

I see, I did not notice this.  Will this help me with the situation where
address is taken, but it is only passed to external calls that do not capture
it (i.e. memset), so we know it does not appear in the points-to sets?
> 
> So it does impose some redundancy and possibility of things to go
> out-of-sync.
> 
> Btw, the C frontend doesn't call varpool_finalize_decl for externals,
> so setting TREE_ADDRESSABLE there doesn't work unfortunately.  It
> works with doing it in varpool_node_for_decl though.
> 
> Patch doing both attached (we may choose to do this in different
> places for DECL_EXTERNALs vs. TREE_PUBLIC && TREE_STATICs?).
> At LTO input time we directly call symtab_register_node which would
> side-step this thus an IPA pass could drop TREE_ADDRESSABLE from
> those decls.
> 
> Sofar untested.
> 
> Comments?

I think it may be easier to just set the flag as part of the ipa-visiblity
pass.  I.e. at the time we set externally_visile, we can also set
TREE_ADDRESSABLE for variable.  We don't use alias machinery before
that, right?

Honza
> 
> Thanks,
> Richard.
> 
> 2014-06-10  Richard Biener  
> 
>   * tree.h (TREE_ADDRESSABLE): Clarify.
>   * varpool.c (varpool_node_for_decl): Mark public or external
>   variables as TREE_ADDRESSABLE.
>   * cgraphunit.c (varpool_finalize_decl): Likewise.
> 
>   * gcc.dg/torture/20140610-1.c: New testcase.
>   * gcc.dg/torture/20140610-2.c: Likewise.
> 
> Index: gcc/tree.h
> ===
> --- gcc/tree.h(revision 211398)
> +++ gcc/tree.h(working copy)
> @@ -571,8 +571,9 @@ extern void omp_clause_range_check_faile
>  
>  /* Define many boolean fields that all tree nodes have.  */
>  
> -/* In VAR_DECL, PARM_DECL and RESULT_DECL nodes, nonzero means address
> -   of this is needed.  So it cannot be in a register.
> +/* In VAR_DECL, PARM_DECL and RESULT_DECL nodes, nonzero means the address
> +   of this is needed.  So it cannot be in a register.  If not set, then
> +   the address of this cannot be used to initialize an aliasing pointer.
> In a FUNCTION_DECL it has no meaning.
> In LABEL_DECL nodes, it means a goto for this label has been seen
> from a place outside all binding contours that restore stack levels.
> Index: gcc/varpool.c
> ===
> --- gcc/varpool.c (revision 211398)
> +++ gcc/varpool.c (working copy)
> @@ -149,6 +149,8 @@ varpool_node_for_decl (tree decl)
>if (node)
>  return node;
>  
> +  if (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl))
> +TREE_ADDRESSABLE (decl) = 1;
>node = varpool_create_empty_node ();
>node->decl = decl;
>symtab_register_node (node);
> Index: gcc/cgraphunit.c
> ===
> --- gcc/cgraphunit.c  (revision 211398)
> +++ gcc/cgraphunit.c  (working copy)
> @@ -818,6 +818,11 @@ varpool_finalize_decl (tree decl)
>  
>gcc_assert (TREE_STATIC (decl) || DECL_EXTERNAL (decl));
>  
> +  /* Mark all symbols visible to other TUs as possibly having their
> + address taken.  */
> +  if (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl))
> +TREE_ADDRESSABLE (decl) = 1;
> +
>if (node->definition)
>  return;
>notice_global_symbol (decl);
> Index: gcc/testsuite/gcc.dg/torture/20140610-1.c
> ===
> --- gcc/testsuite/gcc.dg/torture/20140610-1.c (revision 0)
> +++ gcc/testsuite/gcc.dg/torture/20140610-1.c (working copy)
> @@ -0,0 +1,15 @@
> +/* { dg-do run } */
> +/* { dg-additional-sources "20140610-2.c" } */
> +
> +extern int a;
> +extern int *p;
> +
> +void t

Re: [PATCH] GCC/MMIX: Remove orphan mmix_asm_output_source_line prototype

2014-06-10 Thread Hans-Peter Nilsson

On Tue, 10 Jun 2014, Maciej W. Rozycki wrote:
> Hi,
>
>  I've noticed mmix_asm_output_source_line is declared, but nowhere
> defined.  OK to remove the prototype?

Sure; in fact, obvious.

brgds, H-P

Re: [PATCH] libstdc++/testsuite: Fix a 4402.cc compilation error

2014-06-10 Thread Maciej W. Rozycki

On Wed, 11 Jun 2014, Jonathan Wakely wrote:

> > The reason is cout is a plain character stream and does not accept wide
> > characters.  An obvious fix is below, verified to produce correct output.
> > 
> > OK to apply?
> 
> Yes OK, thanks.

 Committed, thanks for your review.

  Maciej

Re: [PATCH] libstdc++/testsuite: Fix a 4402.cc compilation error

2014-06-10 Thread Jonathan Wakely


On 10/06/14 20:24 +0100, Maciej W. Rozycki wrote:


The reason is cout is a plain character stream and does not accept wide
characters.  An obvious fix is below, verified to produce correct output.

OK to apply?


Yes OK, thanks.

Re: ipa-visibility TLC 2/n

2014-06-10 Thread Jan Hubicka

> Honza,
> 
> I am not sure that the problem is caused only by aliases and thunks.
> The large increase in AIX linker warnings about branches not followed
> by nop also worry me.
> 
> Your patch was about visibility. How does the more aggressive

ipa-visibility is a pass that basically bring external symbols local whenever
it can (i.e. does privatizatoin), it is not that much about ELF visibilities.

> algorithm behave on a platform that does not support visibility? Is it
> defaulting to hidden? If the new algorithm is being too aggressive and
> incorrectly converting calls from global to local, it could cause
> serious problems for AIX because the GOT register will not be
> restored.

One of optimizations IPA visibility does is that it looks for global symbols
that
  1) are believed by target to be overwritable (interposed - I will probably
 update name here) by linker to a different definition (we have
 decl_binds_to_current_def_p that is target tweakable), and
  2) the interposition may not change semantics of the symbol (i.e. function
 body must be the same or variable initializer must match)

 For some symbols (such as inline functions, virtual tables, readonly
 variables with no address taken) we know it won't.
  3) the symbol's definition can not be optimized away by linker
 (by symtab_can_be_discarded)
If all conditons match, it creates a static alias (not hidden, just local to
the .o file) and redirects users of the symbol to the alias.  This should be
always win: we know that the representation of symbol will survive to final
binary (it is not discarded) and we replace expensive references through GOT by
cheap references to local symbol.

We did, for longer time, redirect calls.  The troublesome patch makes us to
redirect also references in virtual tables and newly we also consider virtual
tables themselves for aliases.

This should be a win, since virtual tables tends to be startup time hogs and
it is common to have virtual tables in one DSO to refer to comdats that are
shared with other DSO, but because they must be the same, we can just ignore
the sharing.

On AIX we observed interesting series of events.

 1) First the output machinery was not quite able to declare local static
alias for a symbol (this was about year ago when I introduced the first
change).
 2) AIX assembler seems to issue warning when jump happens to the local
static alias confusing it with the global symbol it is aliasing.
I do not know if the warning is just bogus or we output something
incorrectly.  This is the reason for NOP warning as I understand.
 3) Important difference we is that in AIX all COMDAT symbols are considered
non-discardable. This makes us to produce a lot more aliases than on ELF
system.

I am not sure if this is acurate and AIX linker really has no means
of removing duplicated bodies of COMDAT functions/initializers of variables.
If it has, we need to model it in symtab_can_be_discarded.
Currently we test whether symbol is in comdat group and in the case of
AIX it isn't. 

As disucssed earlier, I am thinking about making symtab_can_be_discarded
return true also for implicit sections (I have WIP patch for this to
commit today). Earlier version of the patch however did not solve
the warnings.

I also tested libstdc++.so sizes with current mainline, with the
local aliases disabled and with this change and current mainline
wins. Suggesting that perhaps there is really no way to discard
duplicated comdats or libstdc++ doesn't really have them.
Insight here would be welcome - I am sure it is easy to test if 
including and using inline __noinline__ function in two units
leads to two copies of that beast or not. It would be nice to know
how native toolchain handles it and if GCC does the same trick.
 4) Before 4.9 we hit bug in inliner dealing with aliases that gave me
a headache. It reproduced on AIX only because of 3)
 5) We hit problem with aliases to anchored sections, hopefully solved now
 6) We hit the problem that AIX assembler silently accepts but miscompile
when alias is declared before its target.  THis is also hopefully
fixed now.  We hit it twice - one for normal symbols about year ago
and now again for thunks.
 7) Given number of issues I ended up writting a verifier that checks
sanity of sections, aliases and comdat groups.  I check
  a) all symbols in comdat group have the same section
  b) alias and its target have same comdat group & section characteristics
 (obivously one can not place alias out of comdat)
  c) I check sanity of IMPLICIT_SECTION flag.
It turns out that C++ FE makes complete mess of those breaking all three
rules due to complex interactions in between comdat code, same body aliases
and the way thunks are produced.  I think this may confuse AIX output
macros since they are just given a contradictionary inf

Re: ipa-visibility TLC 2/n

2014-06-10 Thread David Edelsohn

On Tue, Jun 10, 2014 at 2:02 PM, Jan Hubicka  wrote:
>> Honza,
>>
>> Thanks for this patch which improves some of the G++ testsuite
>> failures, but most of the libstdc++ testsuite continues to fail on
>> AIX.
>
> Yep, I am still looking into this.  Just made new alias verifier that
> catches quite few nonsenses in how C++ builds thunks and same body
> aliases.  I am testing it on PPC now (it takes a while)
>>
>> This patch clearly was risky and should have been more thoroughly
>> tested on non-GNU/Linux systems. All of these failures make it
>> impossible to know if other failures have been introduced into the AIX
>> port.
>>
>> I will be happy to work with you to debug these failures, but I wish
>> to ask that the IPA visibility patch be reverted until AIX testsuite
>> results can return to a normal state with the patch applied.
>
> OK, the change does redirecting of functions and variables.  I will try
> to figure out which of those two changes actually breaks AIX and disable
> it for time being.

Honza,

I am not sure that the problem is caused only by aliases and thunks.
The large increase in AIX linker warnings about branches not followed
by nop also worry me.

Your patch was about visibility. How does the more aggressive
algorithm behave on a platform that does not support visibility? Is it
defaulting to hidden? If the new algorithm is being too aggressive and
incorrectly converting calls from global to local, it could cause
serious problems for AIX because the GOT register will not be
restored.

Thanks David

[PATCH] GCC/MMIX: Remove orphan mmix_asm_output_source_line prototype

2014-06-10 Thread Maciej W. Rozycki

Hi,

 I've noticed mmix_asm_output_source_line is declared, but nowhere 
defined.  OK to remove the prototype?

2014-06-10  Maciej W. Rozycki  

gcc/
* config/mmix/mmix-protos.h (mmix_asm_output_source_line): Remove 
prototype.

  Maciej

gcc-mmix-prototype-fix.patch
Index: gcc-fsf-trunk-quilt/gcc/config/mmix/mmix-protos.h
===
--- gcc-fsf-trunk-quilt.orig/gcc/config/mmix/mmix-protos.h  2013-02-07 
02:35:40.0 +
+++ gcc-fsf-trunk-quilt/gcc/config/mmix/mmix-protos.h   2013-07-13 
04:29:29.599981244 +0100
@@ -28,7 +28,6 @@ extern int mmix_reversible_cc_mode (enum
 extern const char *mmix_text_section_asm_op (void);
 extern const char *mmix_data_section_asm_op (void);
 extern void mmix_output_quoted_string (FILE *, const char *, int);
-extern void mmix_asm_output_source_line  (FILE *, int);
 extern void mmix_asm_output_ascii (FILE *, const char *, int);
 extern void mmix_asm_output_label (FILE *, const char *);
 extern void mmix_asm_output_internal_label (FILE *, const char *);

Re: [Patch ARM/testsuite 00/22] Neon intrinsics executable tests

2014-06-10 Thread Ramana Radhakrishnan

On Thu, Jun 5, 2014 at 11:04 PM, Christophe Lyon
 wrote:
> This is patch series is a more complete version of the patch I sent
> some time ago:
> https://gcc.gnu.org/ml/gcc-patches/2013-10/msg00624.html
>
> I have created a series of patches to help review.  The 1st one adds
> some documentation, the common .h files defining helpers used in the
> actual tests, and two real tests (vaba and vld1) to show how the
> various macros are used.
>
> The next patches add other tests (grouped when they use a common
> framework).
>
> Looking at the .exp file, you'll notice that the tests are performed twice:
> * once using c-torture-execute to make sure they execute correctly
>   under various levels of optimization. In this case dejagnu
>   directives embedded in each .c test file are ignored.
>
> * once using gcc-dg-runtest, which enables compiling with various
>   optimization levels and scanning the generated assembly for some
>   code sequences. Currently, only the vadd test contains some
>   scan-assembler-times directives, as an example. We can add such
>   directives to other tests later.

>
> Regarding the results of these tests on target
> arm-none-linux-gnueabihf, note that:
> * vclz tests currently fail at optimization levels starting with -O1
> * vqadd test fails when compiled with -Os
> * vadd scan-assembler fails for vadd.i64 (because the compiler uses
>   core registers instead of Neon ones. Not sure if this should be
>   considered as a bug or if the test should be changed)
> * this gives 1164 PASS and 18 FAIL
>

I am a bit ambivalent between getting folks to add scan-assembler
tests here and worrying between this and getting the behaviour
correct. Additionally if you add the complexity of scanning for
aarch64 as well this starts getting messy.

At this point I'm going to wait to see if any of the testsuite
maintainers step in and comment and if not I'll start looking at this
properly early next week.

regards
Ramana


> I have not looked at the results in detail on other arm* and aarch64*
> targets, but there are some other failures.
>
> I have many more tests to convert (currently 40 done, 96 remain), and
> my plan is to work on the rest once this set has been accepted.
>
> As of the ChangeLog entry, this patch only adds new files in
> testsuite/gcc.target/arm/neon-intrinsics (which is new too).
>
> OK for trunk?
>
> Thanks,
>
> Christophe.
>
> Christophe Lyon (22):
>   Neon intrinsics execution tests initial framework.
>   Add unary operators: vabs and vneg.
>   Add binary operators: vadd, vand, vbic, veor, vorn, vorr, vsub.
>   Add comparison operators: vceq, vcge, vcgt, vcle and vclt.
>   Add comparison operators with floating-point operands: vcage, vcagt,
>   vcale and cvalt.
>   Add unary saturating operators: vqabs and vqneg.
>   Add binary saturating operators: vqadd, vqsub.
>   Add vabal tests.
>   Add vabd tests.
>   Add vabdl tests.
>   Add vaddhn tests.
>   Add vaddl tests.
>   Add vaddw tests.
>   Add vbsl tests.
>   Add vclz tests.
>   Add vdup and vmov tests.
>   Add vld1_dup tests.
>   Add vld2/vld3/vld4 tests.
>   Add vld2_lane, vld3_lane and vld4_lane tests.
>   Add vmul tests.
>   Add vshl tests.
>   Add vuzp and vzip tests.

Re: [PATCH 8/8] Add a common .md file and define standard constraints there

2014-06-10 Thread Jeff Law


On 06/05/14 15:43, Richard Sandiford wrote:

This final patch uses a common .md file to define all standard
constraints except 'g'.  It then gets rid of explicit case statements
for the standard constraints, except in two cases:

(1) recog.c:asm_operand_ok still needs to handle 'o' specially for
 targets like ia64 that don't have offsettable addresses.  See the
 comment there for justification.

(2) the trickier cases in reload.  I'm not changing those more than I have to.
Can't argue with #2 ;-)  reload gets less and less important every day, 
so I see less and less value hacking too much on it.




I did wonder about defining a new rtl construct that could be used for 'g',
so that even that special case goes away.  In the end I think it would be
a false abstraction though.  No other constraint allows (or IMO should allow)
all three of a register class, a base-reloadable memory and a constant,
so handling it in the lookup_constraint paths would make things more
complicated rather than less.

OK.



Note that the s390 'e' constraint is TARGET_MEM_CONSTRAINT, which is now
defined in the common file.

I put the common .md file in the main gcc/ directory by analogy with
defaults.h and common.opt.  It could instead go in config/ or config/common/,
if those sound better.

Seems fine to me, I don't feel a need to bikeshed here.


Does the comment before indep_constraints in genoutput need updating? 
The constraints in common.md are machine independent, but aren't listed 
in indep_constraints in genoutput.c


Approved with whatever language you want to use for that comment.

Jeff

Re: [C++ Patch] PR 19200

2014-06-10 Thread Paolo Carlini


Hi,

On 06/10/2014 05:31 PM, Jason Merrill wrote:

On 06/10/2014 11:19 AM, Paolo Carlini wrote:
Back to you in a few hours, but I suspect we would have trouble with 
the famous


   struct S
   {
 friend S::S();
   };

compiled with -fpermissive.


I don't think so; that should be handled later in grokdeclarator by


  if (declarator
  && declarator->u.id.qualifying_scope
  && MAYBE_CLASS_TYPE_P (declarator->u.id.qualifying_scope))
{
  ctype = declarator->u.id.qualifying_scope;

Excellent, thanks for the clarification. In fact the below passed testing.

Thanks,
Paolo.

///
Index: cp/decl.c
===
--- cp/decl.c   (revision 211395)
+++ cp/decl.c   (working copy)
@@ -9686,7 +9686,7 @@ grokdeclarator (const cp_declarator *declarator,
if (ctype == NULL_TREE
&& decl_context == FIELD
&& funcdecl_p
-   && (friendp == 0 || dname == current_class_name))
+   && friendp == 0)
  ctype = current_class_type;
 
if (ctype && (sfk == sfk_constructor
Index: cp/parser.c
===
--- cp/parser.c (revision 211395)
+++ cp/parser.c (working copy)
@@ -2078,9 +2078,9 @@ static tree cp_parser_decltype
 static tree cp_parser_init_declarator
   (cp_parser *, cp_decl_specifier_seq *, vec *, 
bool, bool, int, bool *, tree *);
 static cp_declarator *cp_parser_declarator
-  (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool);
+  (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool, bool);
 static cp_declarator *cp_parser_direct_declarator
-  (cp_parser *, cp_parser_declarator_kind, int *, bool);
+  (cp_parser *, cp_parser_declarator_kind, int *, bool, bool);
 static enum tree_code cp_parser_ptr_operator
   (cp_parser *, tree *, cp_cv_quals *, tree *);
 static cp_cv_quals cp_parser_cv_qualifier_seq_opt
@@ -10014,7 +10014,8 @@ cp_parser_condition (cp_parser* parser)
   declarator = cp_parser_declarator (parser, CP_PARSER_DECLARATOR_NAMED,
 /*ctor_dtor_or_conv_p=*/NULL,
 /*parenthesized_p=*/NULL,
-/*member_p=*/false);
+/*member_p=*/false,
+/*friend_p=*/false);
   /* Parse the attributes.  */
   attributes = cp_parser_attributes_opt (parser);
   /* Parse the asm-specification.  */
@@ -14160,7 +14161,8 @@ cp_parser_explicit_instantiation (cp_parser* parse
= cp_parser_declarator (parser, CP_PARSER_DECLARATOR_NAMED,
/*ctor_dtor_or_conv_p=*/NULL,
/*parenthesized_p=*/NULL,
-   /*member_p=*/false);
+   /*member_p=*/false,
+   /*friend_p=*/false);
   if (declares_class_or_enum & 2)
cp_parser_check_for_definition_in_return_type (declarator,
   decl_specifiers.type,
@@ -16570,7 +16572,7 @@ cp_parser_init_declarator (cp_parser* parser,
 = cp_parser_declarator (parser, CP_PARSER_DECLARATOR_NAMED,
&ctor_dtor_or_conv_p,
/*parenthesized_p=*/NULL,
-   member_p);
+   member_p, /*friend_p=*/false);
   /* Gather up the deferred checks.  */
   stop_deferring_access_checks ();
 
@@ -16958,14 +16960,16 @@ cp_parser_init_declarator (cp_parser* parser,
If PARENTHESIZED_P is non-NULL, *PARENTHESIZED_P is set to true iff
the declarator is a direct-declarator of the form "(...)".
 
-   MEMBER_P is true iff this declarator is a member-declarator.  */
+   MEMBER_P is true iff this declarator is a member-declarator.
 
+   FRIEND_P is true iff this declarator is a friend.  */
+
 static cp_declarator *
 cp_parser_declarator (cp_parser* parser,
  cp_parser_declarator_kind dcl_kind,
  int* ctor_dtor_or_conv_p,
  bool* parenthesized_p,
- bool member_p)
+ bool member_p, bool friend_p)
 {
   cp_declarator *declarator;
   enum tree_code code;
@@ -17005,7 +17009,8 @@ cp_parser_declarator (cp_parser* parser,
   declarator = cp_parser_declarator (parser, dcl_kind,
 /*ctor_dtor_or_conv_p=*/NULL,
 /*parenthesized_p=*/NULL,
-/*member_p=*/false);
+/*member_p=*/false,
+friend_p);
 
   /* If we are parsing an abstract-declarator, we must handle the
 case where the dependent declarator is absent.  */
@@ -17024,7 +17029,7 @@ cp_parser_declarator (cp_parser* parser,

Re: [PR 61424] std::regex matches right to left, not leftmost longest

2014-06-10 Thread Tim Shen

On Tue, Jun 10, 2014 at 9:54 AM, Jonathan Wakely  wrote:
> I'm sure this is because I still don't understand all the regex code,
> but doesn't this change mean that for an "extended" mode regex with
> backrefs, the user could define _GLIBCXX_REGEX_USE_THOMPSON_NFA and
> backrefs wouldn't work?

Sorry I missed that basic POSIX (BRE) has back-references (damn!), but
extended POSIX (ERE) doesn't. So it should look like:
-  if (!__re._M_automaton->_M_has_backref
+  if (!(__re._M_automaton->_M_has_backref || (__re._M_flags &
regex_constants::ECMAScript))
...and all deleted _M_has_backref lines should be undeleted.

This patch is a temporary (I'm not sure how long though) workaround;
BFS's support for ECMAScript with no back-references shall be done
finally.


-- 
Regards,
Tim Shen

Re: [PATCH 4/8] Remove old macros and make lookup_constraint explicit

2014-06-10 Thread Jeff Law


On 06/05/14 15:32, Richard Sandiford wrote:

Now that all extra constraints are defined in .md files, there's no real
need for the old REG_CLASS_FROM_CONSTRAINT-style macros.  The macros also
seem dangerous performance-wise, since each one contains an embedded call to
lookup_constraint.  This means that code like:

if (REG_CLASS_FROM_CONSTRAINT (c, p) == NO_REGS)
  {
if (EXTRA_MEMORY_CONSTRAINT (c, p))
  ... EXTRA_CONSTRAINT_STR ...
if (EXTRA_ADDRESS_CONSTRAINT (c, p))
  ... EXTRA_CONSTRAINT_STR ...
... EXTRA_CONSTRAINT_STR ...
  }
...REG_CLASS_FROM_CONSTRAINT...

looks up the same constraint several times.

This patch replaces all uses of:

 REG_CLASS_FROM_CONSTRAINT
 REG_CLASS_FOR_CONSTRAINT
 EXTRA_CONSTRAINT_STR
 EXTRA_MEMORY_CONSTRAINT
 EXTRA_ADDRESS_CONSTRAINT

with separate calls to lookup_constraint and the underlying query function.
It poisons the old macros as a way of protecting against accidental use
(e.g. in #ifdef EXTRA_CONSTRAINT_STR blocks).

Several places want to handle each specific type of constraint in a
different way, so I added a convenience function for classifying constraints
into a type enum.  This also makes the range checks more efficient.
I've treated CONSTRAINT__UNKNOWN as a register constraint (the first type)
since that avoids one more range check and means that each consumer doesn't
have to handle non-constraints specially.  The range check in
reg_class_for_constraint already ensures that the CONSTRAINT__UNKNOWN->
NO_REGS mapping is inline.

Richard


gcc/
* system.h (REG_CLASS_FROM_CONSTRAINT): Poison.
(REG_CLASS_FOR_CONSTRAINT, EXTRA_CONSTRAINT_STR): Likewise.
(EXTRA_MEMORY_CONSTRAINT, EXTRA_ADDRESS_CONSTRAINT): Likewise.
* genpreds.c (print_type_tree): New function.
(write_tm_preds_h): Remove REG_CLASS_FROM_CONSTRAINT,
REG_CLASS_FOR_CONSTRAINT, EXTRA_MEMORY_CONSTRAINT,
EXTRA_ADDRESS_CONSTRAINT and EXTRA_CONSTRAINT_STR.
Write out enum constraint_type and get_constraint_type.
* lra-constraints.c (satisfies_memory_constraint_p): Take a
constraint_num rather than a constraint string.
(satisfies_address_constraint_p): Likewise.
(reg_class_from_constraints): Avoid old constraint macros.
(process_alt_operands, process_address_1): Likewise.
(curr_insn_transform): Likewise.
* ira-costs.c (record_reg_classes): Likewise.
(record_operand_costs): Likewise.
* ira-lives.c (single_reg_class): Likewise.
(ira_implicitly_set_insn_hard_regs): Likewise.
* ira.c (ira_setup_alts, ira_get_dup_out_num): Likewise.
* postreload.c (reload_cse_simplify_operands): Likewise.
* recog.c (asm_operand_ok, preprocess_constraints): Likewise.
(constrain_operands, peep2_find_free_register): Likewise.
* reload.c (push_secondary_reload, scratch_reload_class): Likewise.
(find_reloads, alternative_allows_const_pool_ref): Likewise.
* reload1.c (maybe_fix_stack_asms): Likewise.
* stmt.c (parse_output_constraint, parse_input_constraint): Likewise.
* targhooks.c (default_secondary_reload): Likewise.
* config/m32c/m32c.c (m32c_matches_constraint_p): Avoid reference
to EXTRA_CONSTRAINT_STR.
* config/sparc/constraints.md (U): Likewise REG_CLASS_FROM_CONSTRAINT.
Given the level of testing you've done, I only spot checked a few places 
after concluding the general direction you were going makes sense.  I 
don't expect any fallout, but I'm confident you'll deal with it if it 
happens.


Thanks.  OK for the trunk,

Jeff

[PATCH] libstdc++/testsuite: Fix a 4402.cc compilation error

2014-06-10 Thread Maciej W. Rozycki

Hi,

 I needed some diagnostics to sort out a failure observed on one of our 
targets in 27_io/basic_ostream/inserters_arithmetic/wchar_t/4402.cc in the 
libstdc++ test suite and defined the `TEST_NUMPUT_VERBOSE' macro referred 
there.  That resulted in a compilation error like below:

.../libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/wchar_t/4402.cc:
 In function 'void test02()':
.../libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/wchar_t/4402.cc:48:22:
 error: no match for 'operator<<' (operand types are 'std::basic_ostream' 
and 'std::basic_ostringstream::__string_type {aka 
std::basic_string}')
   cout << "result: " << os.str() << endl;
  ^

The reason is cout is a plain character stream and does not accept wide 
characters.  An obvious fix is below, verified to produce correct output.

 OK to apply?

2014-06-10  Maciej W. Rozycki  

libstdc++-v3/
* testsuite/27_io/basic_ostream/inserters_arithmetic/wchar_t/4402.cc
(test02) [TEST_NUMPUT_VERBOSE]: Use `wcout' rather than `cout'.

  Maciej

gcc-test-libstdcxx-4402.patch
Index: 
gcc-fsf-trunk-quilt/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/wchar_t/4402.cc
===
--- 
gcc-fsf-trunk-quilt.orig/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/wchar_t/4402.cc
2014-05-16 15:58:07.177522688 +0100
+++ 
gcc-fsf-trunk-quilt/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/wchar_t/4402.cc
 2014-05-19 04:49:29.168978165 +0100
@@ -42,8 +42,8 @@ test02()
   wchar_t largebuf[512];
   swprintf(largebuf, 512, L"%.*Le", prec, val);
 #ifdef TEST_NUMPUT_VERBOSE
-  cout << "expect: " << largebuf << endl;
-  cout << "result: " << os.str() << endl;
+  wcout << "expect: " << largebuf << endl;
+  wcout << "result: " << os.str() << endl;
 #endif
   VERIFY( os && os.str() == largebuf );
 
@@ -58,8 +58,8 @@ test02()
 
   swprintf(largebuf, 512, L"%.*f", 3, val2);
 #ifdef TEST_NUMPUT_VERBOSE
-  cout << "expect: " << largebuf << endl;
-  cout << "result: " << os2.str() << endl;
+  wcout << "expect: " << largebuf << endl;
+  wcout << "result: " << os2.str() << endl;
 #endif
   VERIFY( os2 && os2.str() == largebuf );
 }

[Fortran-CAF][Fortran-DEV] Merge from the trunk into the branch

2014-06-10 Thread Tobias Burnus

I have merged the trunk into the Fortran-caf branch (as Rev. 211423) and 
into the Fortran-dev branch (as Rev. 211427).


Tobias

Re: [PATCH 7/8] Remove 'I'-'P' and 'G'/'H' cases

2014-06-10 Thread Jeff Law


On 06/05/14 15:41, Richard Sandiford wrote:

After the previous patch, we can remove the separate 'I'-'P' and 'G'/'H'
cases without increasing compile time.  I didn't bother adding the kind of
fast-path for 'G'/'H' that I did for 'I'-'P' since it should be much rarer.

This removes the last use of CONST_DOUBLE_OK_FOR_CONSTRAINT_P, so I deleted
the code that defines it and added it to the poison list.  The only remaining
old-style macro is CONST_INT_OK_FOR_CONSTRAINT_P, which is used by the s390
backend.  If this series is OK I'll follow up with a patch to remove that
usage and poison CONST_INT_OK_FOR_CONSTRAINT_P too.

Richard


gcc/
* system.h (CONST_DOUBLE_OK_FOR_CONSTRAINT_P): Poison.
* genpreds.c (have_const_dbl_constraints): Delete.
(add_constraint): Don't set it.
(write_tm_preds_h): Don't call CONST_DOUBLE_OK_FOR_CONSTRAINT_P.
* ira-costs.c (record_reg_classes): Handle CONST_INT and CONST_DOUBLE
constraints using the lookup_constraint logic.
* ira-lives.c (single_reg_class): Likewise.
* ira.c (ira_setup_alts): Likewise.
* lra-constraints.c (process_alt_operands): Likewise.
* recog.c (asm_operand_ok, constrain_operands): Likewise.
* reload.c (find_reloads): Likewise.

OK once prerequisites have been OK'd.

Follow-up to remove last usage of CONST_INT_OK_FOR_CONSTRAINT_P 
pre-approved once prerequisites have gone in.  Just post it for archival 
purposes.


Jeff

Re: [PATCH 6/8] Treat 'I'-'P' as separate subtype

2014-06-10 Thread Jeff Law


On 06/05/14 15:37, Richard Sandiford wrote:

This patch extends patch 4 to have a CT_CONST_INT type for CONST_INT
constraints ('I'-'P'), which are already handled by things like
constraint_satisfied_p.  On its own this has little effect, since most
places handle 'I'-'P' as a separate case statement anyway.  It's really
just making way for the final patch.

It might be worth adding a define_const_int_constraint so that 'I'-'P'
are less special.

Richard


gcc/
* genpreds.c (const_int_start, const_int_end): New variables.
(choose_enum_order): Output CONST_INT constraints before memory
constraints.
(write_tm_preds_h): Always define insn_const_int_ok_for_constraint.
Add CT_CONST_INT.
* ira-costs.c (record_reg_classes): Handle CT_CONST_INT.
* ira.c (ira_setup_alts): Likewise.
* lra-constraints.c (process_alt_operands): Likewise.
* recog.c (asm_operand_ok, preprocess_constraints): Likewise.
* reload.c (find_reloads): Likewise.

OK once prerequisites are OK'd.

Jeff

Re: [PATCH 5/8] Remove unused operand_alternative fields

2014-06-10 Thread Jeff Law


On 06/05/14 15:33, Richard Sandiford wrote:

This patch just gets rid of some write-only operand_alternative fields,
which makes things easier for the later patches to preprocess_constraints.

Richard


gcc/
* recog.h (operand_alternative): Remove offmem_ok, nonffmem_ok,
decmem_ok and incmem_ok.  Reformat other bitfields for consistency.
* recog.c (preprocess_constraints): Update accordingly.

OK.
Jeff

Re: [PATCH 1/8] Faster checks for constraint types

2014-06-10 Thread Jeff Law


On 06/05/14 15:26, Richard Sandiford wrote:

genpreds.c defines routines insn_extra_memory_constraint and
insn_extra_address_constraint for testing whether a particular
constraint_num is a memory or address constraint.  At the moment it uses
an out-of-line switch-based function to do this, but if we organise the
constraint_num enum differently, we can use a simple range test instead.

Similarly, if we group register constraints together, we can handle
reg_class_for_constraint for non-register constraints inline.
The same goes for constraint_satisfied_p and register constraints.
The point is that constraints are either register constraints or things
that could be satisfied by constraint_satisfied_p, never both, and
exposing this helps with jump threading.  This becomes more important
with the last half of the series.

Richard


gcc/
* doc/md.texi (regclass_for_constraint): Rename to...
(reg_class_for_constraint): ...this.
* genpreds.c (num_constraints, enum_order, register_start)
(register_end, satisfied_start, memory_start, memory_end)
(address_start, address_end): New variables.
(add_constraint): Count the number of constraints.
(choose_enum_order): New function.
(write_enum_constraint_num): Iterate over enum_order.
(write_regclass_for_constraint): Rename to...
(write_reg_class_for_constraint_1): ...this and update output
accordingly.
(write_constraint_satisfied_p): Rename to...
(write_constraint_satisfied_p_1): ...this and update output
accordingly.  Do nothing if all extra constraints are register
constraints.
(write_insn_extra_memory_constraint): Delete.
(write_insn_extra_address_constraint): Delete.
(write_range_function): New function.
(write_tm_preds_h): Define constraint_satisfied_p and
reg_class_for_constraint as inline functions that do a range check
before calling the out-of-line function.  Use write_range_function
to implement insn_extra_{register,memory,address}_constraint,
the first of which is new.
(write_insn_preds_c): Update after above changes to write_* functions.
(main): Call choose_enum_order.

OK.
jeff

Re: [PATCH 3/8] Speed up constraint_satisfied_p

2014-06-10 Thread Jeff Law


On 06/05/14 15:30, Richard Sandiford wrote:

After the earlier changes, the only important function that switches on
constraint_num is constraint_satisfied_p (now constraint_satisfied_p_1).
Since constraint_num is a dense enum, it seems faster to use a jump
table instead.  In many cases this allows the handling function to be
a simple sibcall to something like a predicate.

Richard


gcc/
* genpreds.c (write_constraint_satisfied_p_1): Replace with...
(write_constraint_satisfied_p_array): ...this new function.
(write_tm_preds_h): Replace write_constraint_satisfied_p_1 with
an array.
(write_insn_preds_c): Update accordingly.

OK.
jeff

Re: [PATCH 2/8] Speed up lookup_constraint

2014-06-10 Thread Jeff Law


On 06/05/14 15:29, Richard Sandiford wrote:

lookup_constraint is also an out-of-line switch-based function.
Since most constraints are still single-letter ones, it should be
more efficient to have a lookup array for the single-character case
and an out-of-line function for the more complicated ones.  This becomes
even more important with the latter half of the series (which isn't as
much of a win otherwise).

Richard


gcc/
* genpreds.c (write_lookup_constraint): Rename to...
(write_lookup_constraint_1): ...this.
(write_lookup_constraint_array): New function.
(write_tm_preds_h): Define lookup_constraint as an inline function
that uses write_lookup_constraint_array where possible.
(write_insn_preds_c): Update for the changes above.

OK.
Jeff

Re: [PATCH, libbid]: Fix "variable ‘Ql’ set but not used" warnings

2014-06-10 Thread Uros Bizjak

On Mon, May 26, 2014 at 6:52 PM, Uros Bizjak  wrote:

> Attached patch fixes several "variable ‘Ql’ set but not used" warnings
> in bid128_div.c and bid64_div.c libbid sources. We can simply use
> __mul_128x128_high functions when lowpart is not needed.
>
> 2014-05-26  Uros Bizjak  
>
> * bid128_div.c (BID128_FUNCTION_ARG2): Remove unused variable 'Ql'.
> Call __mul_128x128_high instead of __mul_128x128_full.
> (TYPE0_FUNCTION_ARGTYPE1_ARGTYPE2): Ditto.
> (BID128_FUNCTION_ARGTYPE1_ARG128): Ditto.
> (BID128_FUNCTION_ARG128_ARGTYPE2): Ditto.
> * bid64_div.c (TYPE0_FUNCTION_ARGTYPE1_ARG128): Ditto.
> (TYPE0_FUNCTION_ARG128_ARGTYPE2): Ditto.
> (TYPE0_FUNCTION_ARG128_ARG128): Ditto.
>
> Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}.

The patch was OK'd offline by H.J.

Committed to mainline SVN.

Uros.

PING: Re: [PATCH] demangler, only access valid fields for DEMANGLE_COMPONENT_FIXED_TYPE.

2014-06-10 Thread Andrew Burgess

Ping!

Thanks for your time,
Andrew

On 29/05/2014 1:02 AM, Andrew Burgess wrote:
> On 28/05/2014 11:56 PM, Pedro Alves wrote:
>> On 05/28/2014 09:38 PM, Andrew Burgess wrote:
>>>
>>> diff --git a/libiberty/testsuite/demangle-expected 
>>> b/libiberty/testsuite/demangle-expected
>>> index 453f9a3..0e2bb12 100644
>>> --- a/libiberty/testsuite/demangle-expected
>>> +++ b/libiberty/testsuite/demangle-expected
>>> @@ -4343,3 +4343,9 @@ 
>>> cereal::detail::InputBindingMap::Serializers 
>>> cereal::p
>>>  --format=gnu-v3
>>>  _ZNSt9_Any_data9_M_accessIPZ4postISt8functionIFvvEEEvOT_EUlvE_EERS5_v
>>>  void post >(std::function>> ()>&&)::{lambda()#1}*& std::_Any_data::_M_access>> post >(void post 
>>> >(std::function&&)::{lambda()#1}*&&)::{lambda()#1}*>()
>>> +# The following input symbol was found during random, it caused a fault
>>
>> Could you add a single empty # above, to separate the tests?
>> I find that that makes it much easier to follow the file.
> 
> Done.
> 
>>> +# The following input symbol was found during random, it caused a fault
>>
>> "during random testing?"
>>
>>> +# within the demangler, it's not a symbol we'd expect in the real world.
>>
>> Why not?
> 
> Good point(s), that comment was out of date,  I've removed it.  The
> symbol is a perfectly good symbol which we could find in the real world,
> and should be able to handle.
> 
> Patch below only has changes to the tests.
> 
> Thanks,
> Andrew
> 
> libiberty/ChangeLog:
> 
>   * cp-demangle.c (d_dump): Only access field from s_fixed part of
>   the union for DEMANGLE_COMPONENT_FIXED_TYPE.
>   (d_count_templates_scopes): Likewise.
>   * testsuite/demangle-expected: New test case.
> 
> diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
> index 68d8ee1..a31dad4 100644
> --- a/libiberty/cp-demangle.c
> +++ b/libiberty/cp-demangle.c
> @@ -710,7 +710,9 @@ d_dump (struct demangle_component *dc, int indent)
>printf ("pointer to member type\n");
>break;
>  case DEMANGLE_COMPONENT_FIXED_TYPE:
> -  printf ("fixed-point type\n");
> +  printf ("fixed-point type, accum? %d, sat? %d\n",
> +  dc->u.s_fixed.accum, dc->u.s_fixed.sat);
> +  d_dump (dc->u.s_fixed.length, indent + 2)
>break;
>  case DEMANGLE_COMPONENT_ARGLIST:
>printf ("argument list\n");
> @@ -3869,7 +3871,13 @@ d_count_templates_scopes (int *num_templates, int 
> *num_scopes,
>  case DEMANGLE_COMPONENT_FUNCTION_TYPE:
>  case DEMANGLE_COMPONENT_ARRAY_TYPE:
>  case DEMANGLE_COMPONENT_PTRMEM_TYPE:
> +  goto recurse_left_right;
> +
>  case DEMANGLE_COMPONENT_FIXED_TYPE:
> +  d_count_templates_scopes (num_templates, num_scopes,
> +dc->u.s_fixed.length);
> +  break;
> +
>  case DEMANGLE_COMPONENT_VECTOR_TYPE:
>  case DEMANGLE_COMPONENT_ARGLIST:
>  case DEMANGLE_COMPONENT_TEMPLATE_ARGLIST:
> diff --git a/libiberty/testsuite/demangle-expected 
> b/libiberty/testsuite/demangle-expected
> index 453f9a3..63f6821 100644
> --- a/libiberty/testsuite/demangle-expected
> +++ b/libiberty/testsuite/demangle-expected
> @@ -4343,3 +4343,8 @@ 
> cereal::detail::InputBindingMap::Serializers 
> cereal::p
>  --format=gnu-v3
>  _ZNSt9_Any_data9_M_accessIPZ4postISt8functionIFvvEEEvOT_EUlvE_EERS5_v
>  void post >(std::function&&)::{lambda()#1}*& 
> std::_Any_data::_M_access >(void 
> post >(std::function ()>&&)::{lambda()#1}*&&)::{lambda()#1}*>()
> +#
> +--format=auto --no-params
> +_Z3xxxDFyuVb
> +xxx(unsigned long long _Fract, bool volatile)
> +xxx
> 
> 
>

[patch] libstdc++/61390 don't redeclare template-parameters

2014-06-10 Thread Jonathan Wakely


Yo dawg, I heard you like templates, so I renamed the
template-parameters of your template template-parameters so they are
not the same as the template-parameters of your templates.

G++ fails to diagnose this (PR17267) but Clang gives an error and EDG
gives a warning.

Tested x86_64-linux, committed to trunk.

commit 4e4fcc91a4edd9e0955dc3cd0412fa7e3f2fd93c
Author: Jonathan Wakely 
Date:   Tue Jun 10 18:52:37 2014 +0100

	PR libstdc++/61390
	* include/ext/pb_ds/detail/bin_search_tree_/traits.hpp
	(bin_search_tree_traits): Do not redeclare template-parameters.
	* testsuite/util/testsuite_iterators.h (test_container): Likewise.

diff --git a/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/traits.hpp b/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/traits.hpp
index d97b432..7ada365 100644
--- a/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/traits.hpp
+++ b/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/traits.hpp
@@ -55,7 +55,7 @@ namespace __gnu_pbds
 	 class Cmp_Fn,
 	 template
 	 class Node_Update,
 	 class Node,
@@ -161,7 +161,7 @@ namespace __gnu_pbds
 	 class Cmp_Fn,
 	 template
 	 class Node_Update,
 	 class Node,
diff --git a/libstdc++-v3/testsuite/util/testsuite_iterators.h b/libstdc++-v3/testsuite/util/testsuite_iterators.h
index c690581..6cf18b4 100644
--- a/libstdc++-v3/testsuite/util/testsuite_iterators.h
+++ b/libstdc++-v3/testsuite/util/testsuite_iterators.h
@@ -518,7 +518,7 @@ namespace __gnu_test
* It takes two pointers representing a range and presents them as 
* a container of iterators.
*/
-  template  class ItType>
+  template  class ItType>
   struct test_container
   {
 typename ItType::ContainerType bounds;

Re: [PATCH] Delete temporary string within demangler even in failure cases.

2014-06-10 Thread Andrew Burgess

On 27/05/2014 2:47 PM, Ian Lance Taylor wrote:
> On Tue, May 27, 2014 at 3:57 AM, Andrew Burgess  wrote:
>>
>> libiberty/ChangeLog
>>
>> * cplus-dem.c (do_type): Call string_delete even if the call to
>> demangle_template fails.
> 
> This is OK.
> 
> Thanks.
> 
> I have to ask: you know this code is not used, right?  You're looking
> at the old demangler, for symbols generated by versions of g++ before
> GCC 3.4 (released 2004).  The demangler for current versions of g++ is
> in cp-demangle.c.

Sorry for the delay.  Yes, I know it's not very old code, but it is
still shipped, and consumers such as gdb can still trigger this code
path.

Thanks for taking a look at this patch for me.

I don't have gcc write permissions, would you (or any other interested
maintainer) mind committing this for me please.

Thanks,
Andrew

Re: [gomp4] Add tables generation

2014-06-10 Thread Ilya Verbin

On 10 Jun 15:52, Bernd Schmidt wrote:
> On 04/17/2014 08:33 PM, Ilya Verbin wrote:
> >+{
> >+  /* Collect all omp-target global variables to offload_vars, if they have 
> >not
> >+ been gathered earlier by input_offload_tables.  */
> >+  if (vec_safe_is_empty (offload_vars))
> 
> What if a variable was entered into the table by something other
> than input_offload_tables? We'll skip this code entirely, which
> doesn't seem right. Can we even get here after input_offload_tables
> has been called, and if so, maybe this step of collecting variables
> belongs elsewhere?
> 
> Also, the previous code did the same for functions, and I can't find
> anything corresponding to that after the patch. Is this intentional?

I'll try to explain with an example bellow:

Suppose there are 2 source files: test1.c and test2.c.

1. During the compilation of test1.c:
  1.1. In expand_omp_target gcc adds new target functions into offload_funcs;
  1.2. In output_offload_tables gcc adds all target variables into offload_vars;
  1.3. In output_offload_tables gcc streams offload_funcs/vars into TARGET 
LTO_section_offload_table.
   And if there is -flto, it also streams them into the HOST 
LTO_section_offload_table;
  1.4. In omp_finish_file gcc writes addresses from offload_funcs/vars into 
test1.o.

2. The same steps happen for test2.c.

   3a. If there is no -flto, ld will join raw tables from test1.o and test2.o.
   And accel compiler will join tables from target 
LTO_section_offload_table.
   For now this mode isn't implemented, to run accel compiler we need -flto.
   3b. If there is -flto (let's consider WHOPR mode, since LTO mode is 
simpler), there are 2 stages:
  3.1. WPA:
3.1.1. In input_offload_tables gcc reads host LTO_section_offload_table from 
test1.o and test2.o;
3.1.2. In output_offload_tables gcc streams the joined tables into 
LTO_section_offload_table in the new partition xxx.ltrans0.o;
  3.2. LTRANS:
3.2.1. In input_offload_tables gcc reads host LTO_section_offload_table from 
xxx.ltrans0.o;
3.2.2. In omp_finish_file gcc writes addresses from offload_funcs/vars into the 
final xxx.ltrans0.ltrans.o.

So, the question is what is the right place for collecting decls into 
offload_funcs/vars?
I collect offload_funcs in expand_omp_target where they're created.
But for offload_vars I couldn't find a place better than output_offload_tables.
That's why I added "if (vec_safe_is_empty (offload_vars))".
If the var decls have been read by input_offload_tables on the step 3.1.1, 
there is no need to
collect them from FOR_EACH_DEFINED_VARIABLE on the step 3.1.2, because that 
order might be incorrect.

Thanks,
  -- Ilya

Re: ipa-visibility TLC 2/n

2014-06-10 Thread Jan Hubicka

> Honza,
> 
> Thanks for this patch which improves some of the G++ testsuite
> failures, but most of the libstdc++ testsuite continues to fail on
> AIX.

Yep, I am still looking into this.  Just made new alias verifier that
catches quite few nonsenses in how C++ builds thunks and same body
aliases.  I am testing it on PPC now (it takes a while)
> 
> This patch clearly was risky and should have been more thoroughly
> tested on non-GNU/Linux systems. All of these failures make it
> impossible to know if other failures have been introduced into the AIX
> port.
> 
> I will be happy to work with you to debug these failures, but I wish
> to ask that the IPA visibility patch be reverted until AIX testsuite
> results can return to a normal state with the patch applied.

OK, the change does redirecting of functions and variables.  I will try
to figure out which of those two changes actually breaks AIX and disable
it for time being.

We are running into existing problems with aliases (obvoiusly making alias
and redirecting uses to it should not break code). Do we care about aliases
working on release branches? (I can make backporatble patches for those fixes,
but given that no one noitced it so far means that probably no one uses them)

Honza
> 
> Thanks, David

Re: Move DECL_SECTION_NAME into symtab

2014-06-10 Thread Jan Hubicka

> On Mon, Jun 9, 2014 at 4:34 AM, Jan Hubicka  wrote:
> > Hi,
> > this patch follows the change to move DECL_COMDAT_GROUP by moving 
> > DECL_SECTION_NAME
> > into symtab nodes instead of keeping it in decl_with_vis. (I pla to proceed 
> > with
> > other symbol table related fields).
> >
> > It follows exactly same path as the previous patch. Notable change is adding
> > of node removal into duplicate_decl in c-decl.c.
> >
> > Memory usage wise the patch is small win for non-WPA, at WPA we actually
> > consume bit more memory (about 800K on Firefox).  We have more symtab nodes
> > than declarations because of inline clonning.  This will be solved by fixing
> > memory representation of symbol nodes (I plan to move rare items into on 
> > side
> > hashtables).  With accessors API it should be easy.
> 
> What I wondered about for some time is why 'clones' need to use the
> same structs as their origins.  They share some bits with their origin
> and they apply some changes.  Thus I think it would be nice to change
> the inheritance of a symtab entry to sth like
> 
>   symbol - cgraph-node-base - cgraph-node
>  |  \
>  |   cgraph-clone
>  varpool-node-base - varpool-node
>  \
>   varpool-clone (do we have those?)

Yep, revisiting hiearchy of symbol is on my TODO list. I want to clear APIs 
first
to make it easier.

We can also make difference in between external declarations, definitions,
thunks and aliases so we do not share data for all of them.

One problem is that we change one type of symbol into another.  We would need 
tool
for that - allocating duplicate node and redirecting datastructures.  Not a big 
deal,
but not that pretty either (I think we do not really dangle pointers to nodes, 
but
I am not 100% sure)

Still those rare data (i.e. things that are not set for most of nodes) probably 
should
sit in on side hashtable.

Honza

Re: [PR 61424] std::regex matches right to left, not leftmost longest

2014-06-10 Thread Jonathan Wakely


diff --git a/libstdc++-v3/include/bits/regex.tcc 
b/libstdc++-v3/include/bits/regex.tcc
index a81f517..6a1faaf 100644
--- a/libstdc++-v3/include/bits/regex.tcc
+++ b/libstdc++-v3/include/bits/regex.tcc
@@ -70,7 +70,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  // without defining a macro. Users should define
  // _GLIBCXX_REGEX_USE_THOMPSON_NFA if they need to use this approach.
  bool __ret;
-  if (!__re._M_automaton->_M_has_backref
+  if (!(__re._M_flags & regex_constants::ECMAScript)
#ifndef _GLIBCXX_REGEX_USE_THOMPSON_NFA
  && __policy == _RegexExecutorPolicy::_S_alternate
#endif


I'm sure this is because I still don't understand all the regex code,
but doesn't this change mean that for an "extended" mode regex with
backrefs, the user could define _GLIBCXX_REGEX_USE_THOMPSON_NFA and
backrefs wouldn't work?

Re: [PATCH][AArch64] Fix some reg-to-reg move scheduler types

2014-06-10 Thread Marcus Shawcroft

On 10 June 2014 16:37, Kyrill Tkachov  wrote:
> Hi all,
>
> This patch corrects the insn types used for scheduling for some of our move
> patterns.
> GP->FP moves have type f_mcr
> FP->GP moves have type f_mrc
> GP->GP moves have type mov_reg
> FP->FP moves have type fmov.
>
>
> Bootstrapped on aarch64-none-linux-gnu and tested aarch64-none-elf.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2014-06-10  Kyrylo Tkachov  
>
> * config/aarch64/aarch64-simd.md (move_lo_quad_):
> Change second alternative type to f_mcr.
> * config/aarch64/aarch64.md (*movsi_aarch64): Change 11th
> and 12th alternatives' types to f_mcr and f_mrc.
> (*movdi_aarch64): Same for 12th and 13th alternatives.
> (*movsf_aarch64): Change 9th alternatives' type to mov_reg.
> (aarch64_movtilow_tilow): Change type to fmov.

OK
/Marcus

[PATCH][AArch64] Fix some reg-to-reg move scheduler types

2014-06-10 Thread Kyrill Tkachov


Hi all,

This patch corrects the insn types used for scheduling for some of our 
move patterns.

GP->FP moves have type f_mcr
FP->GP moves have type f_mrc
GP->GP moves have type mov_reg
FP->FP moves have type fmov.


Bootstrapped on aarch64-none-linux-gnu and tested aarch64-none-elf.

Ok for trunk?

Thanks,
Kyrill

2014-06-10  Kyrylo Tkachov  

* config/aarch64/aarch64-simd.md (move_lo_quad_):
Change second alternative type to f_mcr.
* config/aarch64/aarch64.md (*movsi_aarch64): Change 11th
and 12th alternatives' types to f_mcr and f_mrc.
(*movdi_aarch64): Same for 12th and 13th alternatives.
(*movsf_aarch64): Change 9th alternatives' type to mov_reg.
(aarch64_movtilow_tilow): Change type to fmov.commit 988384e5d0555043cc95006107dca9d4aef9521a
Author: Kyrylo Tkachov 
Date:   Mon Jun 9 13:17:00 2014 +0100

[AArch64] Fix fmov insn types

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index decb1a7..522ca7f 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -937,41 +937,41 @@
  [(set (match_operand:VQ_S 0 "register_operand" "=w")
(MAXMIN:VQ_S (match_operand:VQ_S 1 "register_operand" "w")
 		(match_operand:VQ_S 2 "register_operand" "w")))]
  "TARGET_SIMD"
  "\t%0., %1., %2."
   [(set_attr "type" "neon_minmax")]
 )
 
 ;; Move into low-half clearing high half to 0.
 
 (define_insn "move_lo_quad_"
   [(set (match_operand:VQ 0 "register_operand" "=w,w,w")
 (vec_concat:VQ
 	  (match_operand: 1 "register_operand" "w,r,r")
 	  (vec_duplicate: (const_int 0]
   "TARGET_SIMD"
   "@
dup\\t%d0, %1.d[0]
fmov\\t%d0, %1
dup\\t%d0, %1"
-  [(set_attr "type" "neon_dup,fmov,neon_dup")
+  [(set_attr "type" "neon_dup,f_mcr,neon_dup")
(set_attr "simd" "yes,*,yes")
(set_attr "fp" "*,yes,*")
(set_attr "length" "4")]
 )
 
 ;; Move into high-half.
 
 (define_insn "aarch64_simd_move_hi_quad_"
   [(set (match_operand:VQ 0 "register_operand" "+w,w")
 (vec_concat:VQ
   (vec_select:
 (match_dup 0)
 (match_operand:VQ 2 "vect_par_cnst_lo_half" ""))
 	  (match_operand: 1 "register_operand" "w,r")))]
   "TARGET_SIMD"
   "@
ins\\t%0.d[1], %1.d[0]
ins\\t%0.d[1], %1"
   [(set_attr "type" "neon_ins")
(set_attr "length" "4")]
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 0564017..a4d8887 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -680,66 +680,66 @@
 (define_insn "*movsi_aarch64"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,*w,m,  m,r,r  ,*w, r,*w")
 	(match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,m, m,rZ,*w,S,Ush,rZ,*w,*w"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %1
ldr\\t%w0, %1
ldr\\t%s0, %1
str\\t%w1, %0
str\\t%s1, %0
adr\\t%x0, %a1
adrp\\t%x0, %A1
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1"
   [(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,load1,load1,store1,store1,\
- adr,adr,fmov,fmov,fmov")
+ adr,adr,f_mcr,f_mrc,fmov")
(set_attr "fp" "*,*,*,*,*,yes,*,yes,*,*,yes,yes,yes")]
 )
 
 (define_insn "*movdi_aarch64"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,*w,m,  m,r,r,  *w, r,*w,w")
 	(match_operand:DI 1 "aarch64_mov_operand"  " r,r,k,N,m, m,rZ,*w,S,Ush,rZ,*w,*w,Dd"))]
   "(register_operand (operands[0], DImode)
 || aarch64_reg_or_zero (operands[1], DImode))"
   "@
mov\\t%x0, %x1
mov\\t%0, %x1
mov\\t%x0, %1
mov\\t%x0, %1
ldr\\t%x0, %1
ldr\\t%d0, %1
str\\t%x1, %0
str\\t%d1, %0
adr\\t%x0, %a1
adrp\\t%x0, %A1
fmov\\t%d0, %x1
fmov\\t%x0, %d1
fmov\\t%d0, %d1
movi\\t%d0, %1"
   [(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,load1,load1,store1,store1,\
- adr,adr,fmov,fmov,fmov,fmov")
+ adr,adr,f_mcr,f_mrc,fmov,fmov")
(set_attr "fp" "*,*,*,*,*,yes,*,yes,*,*,yes,yes,yes,*")
(set_attr "simd" "*,*,*,*,*,*,*,*,*,*,*,*,*,yes")]
 )
 
 (define_insn "insv_imm"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
 			  (const_int 16)
 			  (match_operand:GPI 1 "const_int_operand" "n"))
 	(match_operand:GPI 2 "const_int_operand" "n"))]
   "UINTVAL (operands[1]) < GET_MODE_BITSIZE (mode)
&& UINTVAL (operands[1]) % 16 == 0"
   "movk\\t%0, %X2, lsl %1"
   [(set_attr "type" "mov_imm")]
 )
 
 (define_expand "movti"
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
 	(match_operand:TI 1 "general_operand" ""))]
   ""
   "
@@ -800,41 +800,41 @@
   operands[1] = force_reg (mode, operands[1]);
   "
 )
 
 (define_insn "*movsf_aarch64"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
 	(match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY

Re: [C++ Patch] PR 19200

2014-06-10 Thread Jason Merrill


On 06/10/2014 11:19 AM, Paolo Carlini wrote:

Back to you in a few hours, but I suspect we would have trouble with the famous

   struct S
   {
 friend S::S();
   };

compiled with -fpermissive.


I don't think so; that should be handled later in grokdeclarator by


  if (declarator
  && declarator->u.id.qualifying_scope
  && MAYBE_CLASS_TYPE_P (declarator->u.id.qualifying_scope))
{
  ctype = declarator->u.id.qualifying_scope;


Jason

Re: [C++ Patch] PR 19200

2014-06-10 Thread Paolo Carlini

Hi,

> On 10/giu/2014, at 16:32, Jason Merrill  wrote:
> 
>> On 06/10/2014 05:58 AM, Paolo Carlini wrote:
>>&& (friendp == 0 || dname == current_class_name))
> 
> Can't we just drop the dname condition here, rather than clear ctype later?  
> That seems to be specifically what we're fixing: a friend is not a member 
> function even if it has the same name as the class.

Back to you in a few hours, but I suspect we would have trouble with the famous

  struct S
  {
friend S::S();
  };

compiled with -fpermissive. If you ask me, assuming the idea otherwise works, 
I'm certainly in favor of turning the permerror into error and adjusting the 
testsuite, instead of further fiddling with grokdeclarator and complicating 
it... Did you consider this special case?!?

Thanks,
Paolo

Re: [RFC][ARM] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook

2014-06-10 Thread Ramana Radhakrishnan

On Tue, Jun 10, 2014 at 12:25 AM, Kugan
 wrote:
> On 30/05/14 18:35, Ramana Radhakrishnan wrote:
>>> +  if (!TARGET_VFP)
>>> +return;
>>> +
>>> +  /* Generate the equivalence of :
>>
>> s/equivalence/equivalent.
>>
>> Ok with that change and if no regressions.
>
> Hi Ramana,
>
> Sorry, I missed the thumb1 part. There are no mrc/mcr  versions of these
> instructions in thumb1. So these should be conditional on not being
> ARM_THUMB1.
>

No, this has nothing to do with TARGET_THUMB1 -  the real condition
should be TARGET_VFP && TARGET_HARD_FLOAT. These instructions only
work if TARGET_HARD_FLOAT is true. Thumb1 + VFP instructions is not
possible, similarly if generating code for -mfloat-abi=soft you don't
want these instructions being generated.

Ok if that works.

Ramana


> Is this OK. Regression tested with no new refression on qemu for
> arm-none-linux-gnueabi -march=armv7-a and on arm-none-linux-gnueabi
> --with-mode=thumb and -march=armv5t.
>
> Is this OK?
>
> Thanks,
> Kugan
>
> gcc/
>
> 2014-06-10  Kugan Vivekanandarajah  
>
> * config/arm/arm.c (arm_atomic_assign_expand_fenv): call
> default_atomic_assign_expand_fenv for TARGET_THUMB1.
> (arm_init_builtins) : Initialize builtins __builtins_arm_set_fpscr and
> __builtins_arm_get_fpscr only when !TARGET_THUMB1.
> * config/arm/vfp.md (set_fpscr): Make pattern conditional on
> !TARGERT_THUMB1.
> (get_fpscr) : Likewise.

Re: [C++ Patch] PR 19200

2014-06-10 Thread Jason Merrill


On 06/10/2014 05:58 AM, Paolo Carlini wrote:

&& (friendp == 0 || dname == current_class_name))


Can't we just drop the dname condition here, rather than clear ctype 
later?  That seems to be specifically what we're fixing: a friend is not 
a member function even if it has the same name as the class.


Jason

Re: [PATCH][AARCH64]Support full addressing modes for ldr/str in vectorization scenarios

2014-06-10 Thread Marcus Shawcroft

On 10 June 2014 15:29, Christophe Lyon  wrote:
> Hello,
>
> This commit (211211) causes gcc.target/aarch64/vect-mull.c execution
> test to FAIL for target aarch64_be-none-elf.
> (tested using qemu)

Yep, that is exactly what Bin said in his original submission..
/Marcus

>>> On Wed, May 28, 2014 at 3:02 PM, bin.cheng  wrote:

 The patch passes bootstrap and regression test on aarch64/little-endian.  
 It
 also passes regression test on aarch64/big-endian except for case
 "gcc.target/aarch64/vect-mull.c".  I analyzed the failed case and now
 believe it reveals a latent bug in vectorizer on aarch64/big-endian.  The
 analysis report is posted at
 https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00182.html.

Re: [PATCH][AArch64]Add testcases to cover various pro/epi stack layout

2014-06-10 Thread Marcus Shawcroft

On 10 June 2014 15:03, Jiong Wang  wrote:
> This patch add testcases for various aarch64 prologue/epilogue scenarios.
>
> It will make sure our later frame code refine and improvement will not cause
> any regression.
>
> OK for trunk?
>
> Thanks.
>
> gcc/testsuite/
>  * gcc.target/aarch64/test_frame_common.h: New pattern file.
>   * gcc.target/aarch64/test_frame_1.c: New testcase.
>   * gcc.target/aarch64/test_frame_2.c: Likewise.
>   * gcc.target/aarch64/test_frame_3.c: Likewise.
>   * gcc.target/aarch64/test_frame_4.c: Likewise.
>   * gcc.target/aarch64/test_frame_5.c: Likewise.
>   * gcc.target/aarch64/test_frame_6.c: Likewise.
>   * gcc.target/aarch64/test_frame_7.c: Likewise.
>   * gcc.target/aarch64/test_frame_8.c: Likewise.
>   * gcc.target/aarch64/test_frame_9.c: Likewise.
>   * gcc.target/aarch64/test_frame_10.c: Likewise.
>   * gcc.target/aarch64/test_frame_11.c: Likewise.
>   * gcc.target/aarch64/test_frame_12.c: Likewise.
>   * gcc.target/aarch64/test_frame_13.c: Likewise.
>   * gcc.target/aarch64/test_frame_14.c: Likewise.
>   * gcc.target/aarch64/test_frame_15.c: Likewise.

OK /MArcus

Re: [PATCH][AARCH64]Support full addressing modes for ldr/str in vectorization scenarios

2014-06-10 Thread Christophe Lyon

Hello,

This commit (211211) causes gcc.target/aarch64/vect-mull.c execution
test to FAIL for target aarch64_be-none-elf.
(tested using qemu)

Christophe.


On 3 June 2014 13:08, Marcus Shawcroft  wrote:
> On 28 May 2014 08:30, Bin.Cheng  wrote:
>> Missing patch.
>>
>> On Wed, May 28, 2014 at 3:02 PM, bin.cheng  wrote:
>>> Hi,
>>> I was surprised that GCC didn't support addressing modes like
>>> [REG+OFF]/[REG_REG] for instructions ldr/str in vectorization scenarios.
>>> The generated assembly is bad since all address expressions have to be
>>> computed outside of memory reference.  The root cause is because aarch64
>>> effectively rejects reg-indexing (and const-offset) addressing modes in
>>> aarch64_classify_address and miscellaneous simd patterns.
>>>
>>> By fixing this issue, performance of fp benchmarks can be obviously
>>> improved.  It can also help vectorized int cases.
>>>
>>> The patch passes bootstrap and regression test on aarch64/little-endian.  It
>>> also passes regression test on aarch64/big-endian except for case
>>> "gcc.target/aarch64/vect-mull.c".  I analyzed the failed case and now
>>> believe it reveals a latent bug in vectorizer on aarch64/big-endian.  The
>>> analysis report is posted at
>>> https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00182.html.
>>>
>>> So is it OK?
>>>
>>> Thanks,
>>> bin
>>>
>>>
>>> 2014-05-28  Bin Cheng  
>>>
>>> * config/aarch64/aarch64.c (aarch64_classify_address)
>>> (aarch64_legitimize_reload_address): Support full addressing modes
>>> for vector modes.
>>> * config/aarch64/aarch64.md (mov, movmisalign)
>>> (*aarch64_simd_mov, *aarch64_simd_mov): Relax
>>> predicates.
>
> OK Thanks /Marcus

[PATCH][AArch64]Add testcases to cover various pro/epi stack layout

2014-06-10 Thread Jiong Wang


This patch add testcases for various aarch64 prologue/epilogue scenarios.

It will make sure our later frame code refine and improvement will not cause 
any regression.

OK for trunk?

Thanks.

gcc/testsuite/
 * gcc.target/aarch64/test_frame_common.h: New pattern file.
  * gcc.target/aarch64/test_frame_1.c: New testcase.
  * gcc.target/aarch64/test_frame_2.c: Likewise.
  * gcc.target/aarch64/test_frame_3.c: Likewise.
  * gcc.target/aarch64/test_frame_4.c: Likewise.
  * gcc.target/aarch64/test_frame_5.c: Likewise.
  * gcc.target/aarch64/test_frame_6.c: Likewise.
  * gcc.target/aarch64/test_frame_7.c: Likewise.
  * gcc.target/aarch64/test_frame_8.c: Likewise.
  * gcc.target/aarch64/test_frame_9.c: Likewise.
  * gcc.target/aarch64/test_frame_10.c: Likewise.
  * gcc.target/aarch64/test_frame_11.c: Likewise.
  * gcc.target/aarch64/test_frame_12.c: Likewise.
  * gcc.target/aarch64/test_frame_13.c: Likewise.
  * gcc.target/aarch64/test_frame_14.c: Likewise.
  * gcc.target/aarch64/test_frame_15.c: Likewise.
commit ec07ddf26d31696b61d6ffbff52227bb5e86bd2a
Author: Jiong Wang 
Date:   Fri Jun 6 10:18:19 2014 +0100

[AArch64] Add a set of stack layout testcases.

  The following patch set will do various pro/epi code refine
and optimization.

  Before these, we check in a set of testcases to guarantee no
regression is caused by our following work.

diff --git a/gcc/testsuite/gcc.target/aarch64/test_frame_1.c b/gcc/testsuite/gcc.target/aarch64/test_frame_1.c
new file mode 100644
index 000..feea7a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/test_frame_1.c
@@ -0,0 +1,14 @@
+/* Verify:
+ * -fomit-frame-pointer.
+ * withoug outgoing.
+ * total frame size <= 256.
+ * number of callee-save reg == 1.
+ * optimized code should use "str !" for stack adjustment.  */
+
+/* { dg-do run } */
+/* { dg-options "-O2 -fomit-frame-pointer" } */
+
+#include "test_frame_common.h"
+
+t_frame_pattern (test1, 200, )
+t_frame_run (test1)
diff --git a/gcc/testsuite/gcc.target/aarch64/test_frame_10.c b/gcc/testsuite/gcc.target/aarch64/test_frame_10.c
new file mode 100644
index 000..2892c5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/test_frame_10.c
@@ -0,0 +1,16 @@
+/* Verify:
+ * -fomit-frame-pointer.
+ * with outgoing.
+ * total frame size > 512.
+   area except outgoing <= 512
+ * number of callee-saved reg >= 2.
+ * Split stack adjustment into two subtractions.
+   the first subtractions could be optimized into "stp !".  */
+
+/* { dg-do run } */
+/* { dg-options "-O2 -fomit-frame-pointer" } */
+
+#include "test_frame_common.h"
+
+t_frame_pattern_outgoing (test10, 480, "x19", 24, a[8], a[9], a[10])
+t_frame_run (test10)
diff --git a/gcc/testsuite/gcc.target/aarch64/test_frame_11.c b/gcc/testsuite/gcc.target/aarch64/test_frame_11.c
new file mode 100644
index 000..8b860dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/test_frame_11.c
@@ -0,0 +1,16 @@
+/* Verify:
+ * without outgoing.
+ * total frame size <= 512.
+ * number of callee-save reg >= 2.
+ * optimized code should use "stp !" for stack adjustment.  */
+
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps" } */
+
+#include "test_frame_common.h"
+
+t_frame_pattern (test11, 400, )
+t_frame_run (test11)
+
+/* { dg-final { scan-assembler-times "stp\tx29, x30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/test_frame_12.c b/gcc/testsuite/gcc.target/aarch64/test_frame_12.c
new file mode 100644
index 000..3649527
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/test_frame_12.c
@@ -0,0 +1,15 @@
+/* Verify:
+ * with outgoing.
+ * total frame size <= 512.
+ * number of callee-save reg >= 2.  */
+
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps" } */
+
+#include "test_frame_common.h"
+
+t_frame_pattern_outgoing (test12, 400, , 8, a[8])
+t_frame_run (test12)
+
+/* { dg-final { scan-assembler-times "sub\tsp, sp, #\[0-9\]+" 1 } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/test_frame_13.c b/gcc/testsuite/gcc.target/aarch64/test_frame_13.c
new file mode 100644
index 000..25df08b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/test_frame_13.c
@@ -0,0 +1,18 @@
+/* Verify:
+ * without outgoing.
+ * total frame size > 512.
+ * number of callee-save reg >= 2.
+ * split the stack adjustment into two substractions,
+   the second could be optimized into "stp !".  */
+
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps" } */
+
+#include "test_frame_common.h"
+
+t_frame_pattern (test13, 700, )
+t_frame_run (test13)
+
+/* { dg-final { scan-assembler-times "sub\tsp, sp, #\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "stp\tx29, x30, \\\[sp, -\[0-9\]+\\\]!" 2 } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/test_frame_14.c b/gcc/testsuite/

Re: [PING*2][PATCH] Extend mode-switching to support toggle (1/2)

2014-06-10 Thread Joern Rennecke

On 13 May 2014 22:41, Oleg Endo  wrote:

> Right.  I was thinking to add FPSCR.SZ mode switching to SH, in order to
> do float vector moves.  For that SZ and PR need to be switched both at
> the same time (only SH4A has both, fpchg and fschg).  So basically I'd
> add another mode entity, which would emit SZ mode changes in addition to
> the PR mode changes.  But then adjacent FPSCR-changing insns could be
> combined ... any idea/suggestion how to accomplish that?

If they are sufficiently adjacent, you can use a peephole2 pattern for this.

I see Cristian's patch addresses this in a different way - keeping size and
precision in the same entity, and emitting toggles as appropriate.

The problem get's a bit more interesting if you have some instruction patterns
that care about one setting but not the other.
Describing this exactly allows lazy code motion to be a bit more lazy, but OTOH
it can make it harder to combine mode switching instructions if you
still want to
do that.

Re: [gomp4] Add tables generation

2014-06-10 Thread Bernd Schmidt


On 04/17/2014 08:33 PM, Ilya Verbin wrote:

Could you please take a look at this patch?  It fixes the ordering issue in the
tables stated above, and passes all the tests that I have.  But I'm not sure
about its correctness from the architectural point of view.


I'm still skeptical relying on ordering is going to work in the long 
run, but in the meantime this looks better than what we have at the 
moment. So I think this should probably go in for now, but first it 
needs a few small changes:



--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -82,6 +82,8 @@ enum LTO_symtab_tags
LTO_symtab_last_tag
  };

+extern vec *offload_funcs, *offload_vars;


Declarations go into header files.


+void
+output_offload_tables (void)


All functions should have a comment.


+{
+  /* Collect all omp-target global variables to offload_vars, if they have not
+ been gathered earlier by input_offload_tables.  */
+  if (vec_safe_is_empty (offload_vars))


What if a variable was entered into the table by something other than 
input_offload_tables? We'll skip this code entirely, which doesn't seem 
right. Can we even get here after input_offload_tables has been called, 
and if so, maybe this step of collecting variables belongs elsewhere?


Also, the previous code did the same for functions, and I can't find 
anything corresponding to that after the patch. Is this intentional?



Bernd

Re: [PATCH, PR61446] Fix mode for register copy in REE pass

2014-06-10 Thread Dominique Dhumieres

> This patch fixes PR61446. ...

Confirmed, it also allows to bootstrap Core* targets.
Could it be reviewed and committed ASAP?

TIA

Dominique

[patch] implement std::experimental::any

2014-06-10 Thread Jonathan Wakely


This patch implements std::experimental::any from the Fundamentals TS,
including small-object optimisation, uses-allocator construction and
partial support for -fno-rtti (I should probably disable the
allocator-extended constructors when RTTI is disabled, or any_cast
doesn't work).

The allocator-extended copy constructor is not implemented, I don't
think it's possible!

It could probably do with some more tests before I commit it, but
posting for comments as I've had it sitting in my tree for some time
and I might as well share it.

diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 0ea3bb9..0e687d8 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,17 @@
+2014-06-10  Jonathan Wakely  
+
+   * include/Makefile.am: Add new header.
+   * include/Makefile.in: Regenerate.
+   * include/experimental/any: New.
+   * include/ext/aligned_buffer.h (__aligned_buffer(nullptr_t)): New
+   constructor.
+   * testsuite/experimental/any/cons/1.cc: New.
+   * testsuite/experimental/any/cons/2.cc: New.
+   * testsuite/experimental/any/cons/3.cc: New.
+   * testsuite/experimental/any/misc/any_cast.cc: New.
+   * testsuite/experimental/any/misc/any_cast_neg.cc: New.
+   * testsuite/experimental/any/misc/swap.cc: New.
+
 2014-06-09  Jonathan Wakely  
 
* doc/Makefile.am: Add missing file. Use generate.consistent.ids
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index a079ff6..8fe82da 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -638,6 +638,7 @@ decimal_headers = \
 experimental_srcdir = ${glibcxx_srcdir}/include/experimental
 experimental_builddir = ./experimental
 experimental_headers = \
+   ${experimental_srcdir}/any \
${experimental_srcdir}/optional \
${experimental_srcdir}/string_view \
${experimental_srcdir}/string_view.tcc
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 502f04e..51fde97 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -904,6 +904,7 @@ decimal_headers = \
 experimental_srcdir = ${glibcxx_srcdir}/include/experimental
 experimental_builddir = ./experimental
 experimental_headers = \
+   ${experimental_srcdir}/any \
${experimental_srcdir}/optional \
${experimental_srcdir}/string_view \
${experimental_srcdir}/string_view.tcc
diff --git a/libstdc++-v3/include/experimental/any 
b/libstdc++-v3/include/experimental/any
new file mode 100644
index 000..17538d0
--- /dev/null
+++ b/libstdc++-v3/include/experimental/any
@@ -0,0 +1,530 @@
+//  -*- C++ -*-
+
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file experimental/any
+ *  This is a TS C++ Library header.
+ */
+
+#ifndef _GLIBCXX_EXPERIMENTAL_ANY
+#define _GLIBCXX_EXPERIMENTAL_ANY 1
+
+// #pragma GCC system_header
+
+/**
+ * @defgroup experimental Experimental
+ *
+ * Components specified by various Technical Specifications.
+ */
+
+#if __cplusplus <= 201103L
+# include 
+#else
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+namespace experimental
+{
+inline namespace any_v1
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  /**
+   * @defgroup any Type-safe container of any type
+   * @ingroup experimental
+   *
+   * A type-safe container for single values of value types, as
+   * described in n3804 "Any Library Proposal (Revision 3)".
+   *
+   * @{
+   */
+
+  /**
+   *  @brief Exception class thrown when a disengaged optional object is
+   *  dereferenced.
+   *  @ingroup exceptions
+   */
+  class bad_any_cast : public bad_cast
+  {
+  public:
+virtual const char* what() const noexcept { return "bad_any_cast"; }
+  };
+
+  [[gnu::noreturn]] inline void __throw_bad_any_cast()
+  {
+#ifdef __EXCEPTIONS
+throw bad_any_cast{};
+#else
+__builtin_abort();
+#endif
+  }
+
+  /**

Re: [PATCH][AArch64] Add a big-endian lane flip at expand-time in saturating math patterns

2014-06-10 Thread Marcus Shawcroft

On 10 June 2014 09:53, Kyrill Tkachov  wrote:

> * config/aarch64/aarch64-simd.md (aarch64_sqdmulh_lane):
> New expander.
> (aarch64_sqrdmulh_lane): Likewise.
> (aarch64_sqdmulh_lane): Rename to...
> (aarch64_sqdmulh_lane_interna): ...this.
> (aarch64_sqdmulh_laneq): New expander.
> (aarch64_sqrdmulh_laneq): Likewise.
> (aarch64_sqdmulh_laneq): Rename to...
> (aarch64_sqdmulh_laneq_internal): ...this.
> (aarch64_sqdmulh_lane): New expander.
> (aarch64_sqrdmulh_lane): Likewise.
> (aarch64_sqdmulh_lane): Rename to...
> (aarch64_sqdmulh_lane_internal): ...this.
> (aarch64_sqdmlal_lane): Add lane flip for big-endian.
> (aarch64_sqdmlal_laneq): Likewise.
> (aarch64_sqdmlsl_lane): Likewise.
> (aarch64_sqdmlsl_laneq): Likewise.
> (aarch64_sqdmlal2_lane): Likewise.
> (aarch64_sqdmlal2_laneq): Likewise.
> (aarch64_sqdmlsl2_lane): Likewise.
> (aarch64_sqdmlsl2_laneq): Likewise.
> (aarch64_sqdmull_lane): Likewise.
> (aarch64_sqdmull_laneq): Likewise.
> (aarch64_sqdmull2_lane): Likewise.
> (aarch64_sqdmull2_laneq): Likewise.

OK /Marcus

[COMMITTED] [AArch64] Fix layout of frame related functions.

2014-06-10 Thread Marcus Shawcroft


Fixing various white space issues in the frame layout code.  Committed.

/Marcusdiff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e7f455b..3eb18e9 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1917,7 +1917,6 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
   rtx (*gen_mem_ref)(enum machine_mode, rtx)
 = (frame_pointer_needed)? gen_frame_mem : gen_rtx_MEM;
 
-
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
 {
   if (aarch64_register_saved_on_entry (regno))
@@ -1935,10 +1934,12 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
 	{
 	  /* Empty loop.  */
 	}
+
 	  if (regno2 <= V31_REGNUM &&
 	  aarch64_register_saved_on_entry (regno2))
 	{
 	  rtx mem2;
+
 	  /* Next highest register to be saved.  */
 	  mem2 = gen_mem_ref (DFmode,
   plus_constant
@@ -1964,10 +1965,10 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
 gen_rtx_REG (DFmode, regno2));
 		}
 
-		  /* The first part of a frame-related parallel insn
-		 is always assumed to be relevant to the frame
-		 calculations; subsequent parts, are only
-		 frame-related if explicitly marked.  */
+	  /* The first part of a frame-related parallel insn is
+		 always assumed to be relevant to the frame
+		 calculations; subsequent parts, are only
+		 frame-related if explicitly marked.  */
 	  RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0, 1)) = 1;
 	  regno = regno2;
 	  start_offset += increment * 2;
@@ -1987,15 +1988,14 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
 	  RTX_FRAME_RELATED_P (insn) = 1;
 	}
 }
-
 }
 
 
 /* offset from the stack pointer of where the saves and
restore's have to happen.  */
 static void
-aarch64_save_or_restore_callee_save_registers (HOST_WIDE_INT offset,
-	bool restore)
+aarch64_save_or_restore_callee_save_registers (HOST_WIDE_INT start_offset,
+	   bool restore)
 {
   rtx insn;
   rtx base_rtx = stack_pointer_rtx;
@@ -2027,6 +2027,7 @@ aarch64_save_or_restore_callee_save_registers (HOST_WIDE_INT offset,
 	  aarch64_register_saved_on_entry (regno2))
 	{
 	  rtx mem2;
+
 	  /* Next highest register to be saved.  */
 	  mem2 = gen_mem_ref (Pmode,
   plus_constant
@@ -2050,12 +2051,11 @@ aarch64_save_or_restore_callee_save_registers (HOST_WIDE_INT offset,
 		  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DImode, regno2));
 		}
 
-		  /* The first part of a frame-related parallel insn
-		 is always assumed to be relevant to the frame
-		 calculations; subsequent parts, are only
-		 frame-related if explicitly marked.  */
-	  RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0,
-	1)) = 1;
+	  /* The first part of a frame-related parallel insn is
+		 always assumed to be relevant to the frame
+		 calculations; subsequent parts, are only
+		 frame-related if explicitly marked.  */
+	  RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0, 1)) = 1;
 	  regno = regno2;
 	  start_offset += increment * 2;
 	}
@@ -2075,7 +2075,6 @@ aarch64_save_or_restore_callee_save_registers (HOST_WIDE_INT offset,
 }
 
   aarch64_save_or_restore_fprs (start_offset, increment, restore, base_rtx);
-
 }
 
 /* AArch64 stack frames generated by this compiler look like:

[AArch64] Fix REG_CFA_RESTORE mode.

2014-06-10 Thread Marcus Shawcroft


Looks like a copy n paste error originally.

Committed.

/Marcuscommit f6a9bafb21d26b2e7d767b392bea0f60c31701d5
Author: Marcus Shawcroft 
Date:   Fri Jun 6 14:26:50 2014 +0100

[AArch64] Fix REG_CFA_RESTORE mode.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index bf68f34..c510f44 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2014-06-10  Marcus Shawcroft  
+
+	* config/aarch64/aarch64.c (aarch64_save_or_restore_fprs): Fix
+	REG_CFA_RESTORE mode.
+
 2014-06-10  Kyrylo Tkachov  
 
 	* doc/arm-acle-intrinsics.texi: Specify when CRC32 intrinsics are
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a8b1523..e7f455b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1980,7 +1980,7 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
 		{
 		  insn = emit_move_insn (gen_rtx_REG (DFmode, regno), mem);
 		  add_reg_note (insn, REG_CFA_RESTORE,
-gen_rtx_REG (DImode, regno));
+gen_rtx_REG (DFmode, regno));
 		}
 	  start_offset += increment;
 	}

Re: ipa-visibility TLC 2/n

2014-06-10 Thread David Edelsohn

On Sun, Jun 8, 2014 at 12:58 PM, Jan Hubicka  wrote:
> Hi,
> this is the last part.  It makes DECL_VIRTUAL to be copied when creating 
> aliases.  This is
> needed to make sanity check in gimple-fold happy (it checks that vtables are 
> DECL_VIRTUAL).
> It also resets initializers of aliases to save memory.

Honza,

Thanks for this patch which improves some of the G++ testsuite
failures, but most of the libstdc++ testsuite continues to fail on
AIX.

This patch clearly was risky and should have been more thoroughly
tested on non-GNU/Linux systems. All of these failures make it
impossible to know if other failures have been introduced into the AIX
port.

I will be happy to work with you to debug these failures, but I wish
to ask that the IPA visibility patch be reverted until AIX testsuite
results can return to a normal state with the patch applied.

Thanks, David

Re: [PATCH, i386] Remove use of vpmacsdql instruction from multiplication.

2014-06-10 Thread Uros Bizjak

On Tue, Jun 10, 2014 at 12:30 PM, Gopalasubramanian, Ganesh
 wrote:
> Hi,
>
> The below patch fixes the issue with 64-bit multiplication.
> The instruction "vpmacsdql" does signed 32-bit multiplication.
> For V2DImode, we require widened unsigned multiplication.
> So, replacing the "vpmacsdql" instruction with "vpmuludq" and "vpaddq".
>
> This patch had been already discussed in 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52908
>
> With required change in the test xop-imul64-vector.c,  make check passes. Is 
> it OK for upstream?
>
> Regards
> Ganesh
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index d0a1253..c158612 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2014-06-10  Ganesh Gopalasubramanian 
> +
> +   * config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Issue instructions
> +"vpmuludq" and "vpaddq" instead of "vpmacsdql" for handling 32-bit
> +multiplication.
>

OK for mainline and release branches.

Thanks,
Uros.

Re: [PATCH, PR52252] Alternative way of vectorization for load groups of size 2 and 3.

2014-06-10 Thread Evgeny Stupachenko

ix86_reassociation_width checks INTEGRAL_MODE_P and FLOAT_MODE_P which
include vector mode.
I'll try to separate this into scalar and vector part, but it will
require more testing (under the testing now).
What about the rest of the patch?

Thanks,
Evgeny

On Thu, Jun 5, 2014 at 3:54 PM, Ramana Radhakrishnan
 wrote:
> On 06/05/14 12:43, Evgeny Stupachenko wrote:
>>
>> New hook is related to vector instructions only. Vector instructions
>> could be sequential in pipeline, but scalar - parallel. For x86
>> architectures TARGET_SCHED_REASSOC_WIDTH does not give required
>> differentiation.
>> General hooks could be potentially reused in other algorithms/by other
>> architectures.
>
>
> It already takes a "mode" argument. Couldn't you use a vector mode to work
> this out ?
>
> If it is not enough then please be more specific about the documentation of
> this hook about where it is useful so that it's easy for people reading the
> documentation to understand at a glance what purpose it serves.
>
>
> Ramana
>
>
>>
>> Thanks,
>> Evgeny
>>
>> On Thu, Jun 5, 2014 at 2:04 PM, Ramana Radhakrishnan
>>  wrote:
>>>
>>> On Wed, May 28, 2014 at 2:09 PM, Evgeny Stupachenko 
>>> wrote:

 Hi,

 The patch introduces alternative way of permutations for load groups
 of size 2 and 3 which should be faster on architectures with low
 parallelism.
 The patch gives 2 times gain on Silvermont to the test from PR52252
 (in addition to already committed 3 times gain).

 Patch passes bootstrap on x86. Make check is in progress.
>>>
>>>
>>> Why do we need a new hook ? Can't you derive this information from
>>> something which is equally badly named TARGET_SCHED_REASSOC_WIDTH
>>> though used in the reassociation logic but also serves a similar
>>> purpose ?
>>>
>>> Also the documentation of this hook is incomplete at best and wrong at
>>> worst as this is not applied everywhere in the vectorizer but just for
>>> this special case for load store permuting. Implying this is useful
>>> everywhere in the vectorizer does not appear to be correct.
>>>
>>> regards
>>> Ramana
>>>
>>>
>>>
>>>

 ChangeLog:

 2014-05-28  Evgeny Stupachenko  

  * config/i386/i386.c (ix86_have_vector_parallel_execution):
 New.
  (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New.
  * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
  * config/i386/x86-tune.def
 (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
  * target.def (have_vector_parallel_execution): New.
  * doc/tm.texi.in (have_vector_parallel_execution)): New.
  * doc/tm.texi: Regenerate.
  * targhooks.c (default_have_vector_parallel_execution): New.
  * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
  Introduces alternative way of loads group permutaions.
  (vect_transform_grouped_load): Try alternative way of
 permutaions.

 Evgeny
>>
>>
>

Re: [PATCH 1/7] Add missing documentation of four IPA-CP params

2014-06-10 Thread Gerald Pfeifer

On Wed, 21 May 2014, Martin Jambor wrote:
> +@item ipa-cp-loop-hint-bonus
> +When IPA-CP determines that a cloning candidate would make the number
> +of iterations of a loop known, it adds a bonus of
^
> +@option{ipa-cp-loop-hint-bonus} bonus to the profitability score of
> +the candidate.  ^

That's a bit much bonus in there. :-)

> +@item ipa-cp-array-index-hint-bonus
> +When IPA-CP determines that a cloning candidate would make the index of
> +an array access known, it adds a bonus of
> +@option{ipa-cp-array-index-hint-bonus} bonus to the profitability
> +score of the candidate.

In here, too.

Gerald

[Patch, GCC/Thumb-1]Mishandle the label type insn in function thumb1_reorg

2014-06-10 Thread Terry Guo

Hi There,

The thumb1_reorg function use macro INSN_CODE to find expected instructions.
But the macro INSN_CODE doesn’t work for label type instruction. The
INSN_CODE(label_insn) will return the label number. When we have a lot of
labels and current label_insn is the first insn of basic block, the
INSN_CODE(label_insn) could accidentally equal to CODE_FOR_cbranchsi4_insn
in this case. This leads to ICE due to SET_SRC(label_insn) in subsequent
code. In general we should skip all such improper insns. This is the purpose
of attached small patch.

Some failures in recent gcc regression test on thumb1 target are caused by
this reason. So with this patch, all of them passed and no new failures. Is
it ok to trunk?

BR,
Terry

2014-06-10  Terry Guo  

 * config/arm/arm.c (thumb1_reorg): Move to next basic block if the head
 of current basic block isn’t a proper insn.   diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index ccad548..3ebe424 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16939,7 +16939,8 @@ thumb1_reorg (void)
insn = PREV_INSN (insn);
 
   /* Find the last cbranchsi4_insn in basic block BB.  */
-  if (INSN_CODE (insn) != CODE_FOR_cbranchsi4_insn)
+  if (!NONDEBUG_INSN_P (insn)
+ || INSN_CODE (insn) != CODE_FOR_cbranchsi4_insn)
continue;
 
   /* Get the register with which we are comparing.  */

Re: [PING, PATCH2/2, PR52252] Vectorization for load/store groups of size 3.

2014-06-10 Thread Richard Biener

On Tue, 10 Jun 2014, Evgeny Stupachenko wrote:

> ping.
> The changes are similar to already committed on loads group.

Ok.

Thanks,
Richard.

> On Tue, Jun 3, 2014 at 5:22 PM, Evgeny Stupachenko  wrote:
> > I've added a bug report for the stores group case:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61403
> >
> >
> > On Wed, May 28, 2014 at 5:18 PM, Evgeny Stupachenko  
> > wrote:
> >> Ping.
> >> Test is modified according to the fix in the test for loads.
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-st.c
> >> b/gcc/testsuite/gcc.dg/vect/pr52252-st.c
> >> new file mode 100644
> >> index 000..e7161f7
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/vect/pr52252-st.c
> >> @@ -0,0 +1,21 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } 
> >> */
> >> +
> >> +#define byte unsigned char
> >> +
> >> +void
> >> +matrix_mul (byte *in, byte *out, int size)
> >> +{
> >> +  int i;
> >> +  for (i = 0; i < size; i++)
> >> +{
> >> +  out[0] = in[0] + in[1] + in[3];
> >> +  out[1] = in[0] + in[2] + in[4];
> >> +  out[2] = in[1] + in[2] + in[4];
> >> +  in += 4;
> >> +  out += 3;
> >> +}
> >> +}
> >> +
> >> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {
> >> target { i?86-*-* x86_64-*-* } } } } */
> >> +/* { dg-final { cleanup-tree-dump "vect" } } */
> >>
> >>
> >> On Tue, May 6, 2014 at 6:39 PM, Evgeny Stupachenko  
> >> wrote:
> >>> 2nd part of patch is on stores group.
> >>> Bootstrap and make check passed on x86.
> >>>
> >>> Is it ok?
> >>>
> >>> 2014-05-06  Evgeny Stupachenko  
> >>>
> >>> * tree-vect-data-refs.c (vect_grouped_store_supported): New
> >>> check for storess group of length 3.
> >>> (vect_permute_store_chain): New permutations for storess group of
> >>> length 3.
> >>> * tree-vect-stmts.c (vect_model_store_cost): Change cost
> >>> of vec_perm_shuffle for the new permutations.
> >>>
> >>> ChangeLog for testsuite:
> >>>
> >>> 2014-05-06  Evgeny Stupachenko  
> >>>
> >>>PR tree-optimization/52252
> >>>* gcc.dg/vect/pr52252-st.c: Test on stores group of size 3.
> >>>
> >>> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> >>> index ef710cf..fb0e30d 100644
> >>> --- a/gcc/tree-vect-data-refs.c
> >>> +++ b/gcc/tree-vect-data-refs.c
> >>> @@ -4365,13 +4365,14 @@ vect_grouped_store_supported (tree vectype,
> >>> unsigned HOST_WIDE_INT count)
> >>>  {
> >>>enum machine_mode mode = TYPE_MODE (vectype);
> >>>
> >>> -  /* vect_permute_store_chain requires the group size to be a power of 
> >>> two.  */
> >>> -  if (exact_log2 (count) == -1)
> >>> +  /* vect_permute_store_chain requires the group size to be equal to 3 or
> >>> + be a power of two.  */
> >>> +  if (count != 3 && exact_log2 (count) == -1)
> >>>  {
> >>>if (dump_enabled_p ())
> >>> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> >>> - "the size of the group of accesses"
> >>> - " is not a power of 2\n");
> >>> +"the size of the group of accesses"
> >>> +" is not a power of 2 or not eqaul to 3\n");
> >>>return false;
> >>>  }
> >>>
> >>> @@ -4380,23 +4381,76 @@ vect_grouped_store_supported (tree vectype,
> >>> unsigned HOST_WIDE_INT count)
> >>>  {
> >>>unsigned int i, nelt = GET_MODE_NUNITS (mode);
> >>>unsigned char *sel = XALLOCAVEC (unsigned char, nelt);
> >>> -  for (i = 0; i < nelt / 2; i++)
> >>> +
> >>> +  if (count == 3)
> >>> {
> >>> - sel[i * 2] = i;
> >>> - sel[i * 2 + 1] = i + nelt;
> >>> + unsigned int j0 = 0, j1 = 0, j2 = 0;
> >>> + unsigned int i, j;
> >>> +
> >>> + for (j = 0; j < 3; j++)
> >>> +   {
> >>> + int nelt0 = ((3 - j) * nelt) % 3;
> >>> + int nelt1 = ((3 - j) * nelt + 1) % 3;
> >>> + int nelt2 = ((3 - j) * nelt + 2) % 3;
> >>> + for (i = 0; i < nelt; i++)
> >>> +   {
> >>> + if (3 * i + nelt0 < nelt)
> >>> +   sel[3 * i + nelt0] = j0++;
> >>> + if (3 * i + nelt1 < nelt)
> >>> +   sel[3 * i + nelt1] = nelt + j1++;
> >>> + if (3 * i + nelt2 < nelt)
> >>> +   sel[3 * i + nelt2] = 0;
> >>> +   }
> >>> + if (!can_vec_perm_p (mode, false, sel))
> >>> +   {
> >>> + if (dump_enabled_p ())
> >>> +   dump_printf (MSG_MISSED_OPTIMIZATION,
> >>> +"permutaion op not supported by 
> >>> target.\n");
> >>> + return false;
> >>> +   }
> >>> +
> >>> + for (i = 0; i < nelt; i++)
> >>> +   {
> >>> + if (3 * i + nelt0 < nelt)
> >>> +   sel[3 * i + nelt0] = 3 * i + nelt

[PATCH] Fix PR61452

2014-06-10 Thread Richard Biener


The following fixes PR61452 - when keeping a lattice value at
VARYING we shouldn't adjust the lattice ->expr or ->has_constants.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2014-06-10  Richard Biener  

PR tree-optimization/61452
* tree-ssa-sccvn.c (visit_phi): Remove pointless setting of
expr and has_constants in case we found a leader.

* gcc.dg/torture/pr61452.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c(revision 211403)
--- gcc/tree-ssa-sccvn.c(working copy)
*** visit_phi (gimple phi)
*** 3140,3174 
/* If all value numbered to the same value, the phi node has that
   value.  */
if (allsame)
! {
!   if (is_gimple_min_invariant (sameval))
!   {
! VN_INFO (PHI_RESULT (phi))->has_constants = true;
! if (sameval != VN_TOP)
!   VN_INFO (PHI_RESULT (phi))->expr = sameval;
!   }
!   else
!   {
! VN_INFO (PHI_RESULT (phi))->has_constants = false;
! if (sameval != VN_TOP)
!   VN_INFO (PHI_RESULT (phi))->expr = sameval;
!   }
! 
!   if (TREE_CODE (sameval) == SSA_NAME)
!   return visit_copy (PHI_RESULT (phi), sameval);
! 
!   return set_ssa_val_to (PHI_RESULT (phi), sameval);
! }
  
/* Otherwise, see if it is equivalent to a phi node in this block.  */
result = vn_phi_lookup (phi);
if (result)
! {
!   if (TREE_CODE (result) == SSA_NAME)
!   changed = visit_copy (PHI_RESULT (phi), result);
!   else
!   changed = set_ssa_val_to (PHI_RESULT (phi), result);
! }
else
  {
vn_phi_insert (phi, PHI_RESULT (phi));
--- 3140,3151 
/* If all value numbered to the same value, the phi node has that
   value.  */
if (allsame)
! return set_ssa_val_to (PHI_RESULT (phi), sameval);
  
/* Otherwise, see if it is equivalent to a phi node in this block.  */
result = vn_phi_lookup (phi);
if (result)
! changed = set_ssa_val_to (PHI_RESULT (phi), result);
else
  {
vn_phi_insert (phi, PHI_RESULT (phi));
Index: gcc/testsuite/gcc.dg/torture/pr61452.c
===
*** gcc/testsuite/gcc.dg/torture/pr61452.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr61452.c  (working copy)
***
*** 0 
--- 1,31 
+ /* { dg-do run } */
+ 
+ int a, b;
+ short c, d;
+ char e, f;
+ 
+ int
+ fn1 (int p1, char p2)
+ {
+   return p1 || p2 ? 0 : p2;
+ }
+ 
+ void
+ fn2 ()
+ {
+   for (; a;)
+ {
+   int g;
+   g = c = e;
+   for (; a;)
+   b = fn1 (g = d = e, g);
+   f = g; 
+ }
+ }
+ 
+ int
+ main ()
+ {
+   fn2 (); 
+   return 0;
+ }

[PATCH] Fix PR61438

2014-06-10 Thread Richard Biener


The following fixes PR61438 - when moving the PHI insertion inhibiting
code I forgot to guard it so that it only runs for PRE.

Bootstrap/regtest pending on x86_64-unknown-linux-gnu.

Richard.

2014-06-10  Richard Biener  

PR tree-optimization/61438
* tree-ssa-pre.c (eliminate_dom_walker): Add do_pre member.
(eliminate_dom_walker::before_dom_children): Only try to inhibit
insertion of IVs if running PRE.
(eliminate): Adjust.
(pass_pre::execute): Likewise.
(pass_fre::execute): Likewise.

* gcc.dg/torture/pr61438.c: New testcase.

Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 211398)
--- gcc/tree-ssa-pre.c  (working copy)
*** eliminate_insert (gimple_stmt_iterator *
*** 3992,4001 
  class eliminate_dom_walker : public dom_walker
  {
  public:
!   eliminate_dom_walker (cdi_direction direction) : dom_walker (direction) {}
  
virtual void before_dom_children (basic_block);
virtual void after_dom_children (basic_block);
  };
  
  /* Perform elimination for the basic-block B during the domwalk.  */
--- 3992,4004 
  class eliminate_dom_walker : public dom_walker
  {
  public:
!   eliminate_dom_walker (cdi_direction direction, bool do_pre_)
!   : dom_walker (direction), do_pre (do_pre_) {}
  
virtual void before_dom_children (basic_block);
virtual void after_dom_children (basic_block);
+ 
+   bool do_pre;
  };
  
  /* Perform elimination for the basic-block B during the domwalk.  */
*** eliminate_dom_walker::before_dom_childre
*** 4192,4198 
 variable.  In other cases the vectorizer won't do anything
 anyway (either it's loop invariant or a complicated
 expression).  */
! if (flag_tree_loop_vectorize
  && gimple_assign_single_p (stmt)
  && TREE_CODE (sprime) == SSA_NAME
  && loop_outer (b->loop_father))
--- 4195,4202 
 variable.  In other cases the vectorizer won't do anything
 anyway (either it's loop invariant or a complicated
 expression).  */
! if (do_pre
! && flag_tree_loop_vectorize
  && gimple_assign_single_p (stmt)
  && TREE_CODE (sprime) == SSA_NAME
  && loop_outer (b->loop_father))
*** eliminate_dom_walker::after_dom_children
*** 4434,4440 
  /* Eliminate fully redundant computations.  */
  
  static unsigned int
! eliminate (void)
  {
gimple_stmt_iterator gsi;
gimple stmt;
--- 4438, 
  /* Eliminate fully redundant computations.  */
  
  static unsigned int
! eliminate (bool do_pre)
  {
gimple_stmt_iterator gsi;
gimple stmt;
*** eliminate (void)
*** 4448,4454 
el_avail.create (0);
el_avail_stack.create (0);
  
!   eliminate_dom_walker (CDI_DOMINATORS).walk (cfun->cfg->x_entry_block_ptr);
  
el_avail.release ();
el_avail_stack.release ();
--- 4452,4459 
el_avail.create (0);
el_avail_stack.create (0);
  
!   eliminate_dom_walker (CDI_DOMINATORS,
!   do_pre).walk (cfun->cfg->x_entry_block_ptr);
  
el_avail.release ();
el_avail_stack.release ();
*** pass_pre::execute (function *fun)
*** 4779,4785 
gsi_commit_edge_inserts ();
  
/* Remove all the redundant expressions.  */
!   todo |= eliminate ();
  
statistics_counter_event (fun, "Insertions", pre_stats.insertions);
statistics_counter_event (fun, "PA inserted", pre_stats.pa_insert);
--- 4784,4790 
gsi_commit_edge_inserts ();
  
/* Remove all the redundant expressions.  */
!   todo |= eliminate (true);
  
statistics_counter_event (fun, "Insertions", pre_stats.insertions);
statistics_counter_event (fun, "PA inserted", pre_stats.pa_insert);
*** pass_fre::execute (function *fun)
*** 4864,4870 
memset (&pre_stats, 0, sizeof (pre_stats));
  
/* Remove all the redundant expressions.  */
!   todo |= eliminate ();
  
todo |= fini_eliminate ();
  
--- 4869,4875 
memset (&pre_stats, 0, sizeof (pre_stats));
  
/* Remove all the redundant expressions.  */
!   todo |= eliminate (false);
  
todo |= fini_eliminate ();
  
Index: gcc/testsuite/gcc.dg/torture/pr61438.c
===
*** gcc/testsuite/gcc.dg/torture/pr61438.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr61438.c  (working copy)
***
*** 0 
--- 1,48 
+ /* { dg-do run } */
+ 
+ extern void abort (void);
+ 
+ int a, c, **d, e, g;
+ static int b = 1;
+ 
+ struct
+ {
+   int f0;
+ } f;
+ 
+ void
+ foo ()
+ {
+   int h, *i = &a;
+   for (; e;)
+ {
+   for (c = 0; c < 1; c++)
+   for (; b;)
+ ;
+   for (;;)
+   {
+ if (a)
+   {
+ for (; f.f0; f.

Re: [PATCH, loop2_invariant, 2/2] Change heuristics for identical invariants

2014-06-10 Thread Steven Bosscher

On Tue, Jun 10, 2014 at 11:23 AM, Zhenqiang Chen wrote:
> * loop-invariant.c (struct invariant): Add a new member: eqno;
> (find_identical_invariants): Update eqno;
> (create_new_invariant): Init eqno;
> (get_inv_cost): Compute comp_cost wiht eqno;
> (gain_for_invariant): Take spill cost into account.

Look OK except ...

> @@ -1243,7 +1256,13 @@ gain_for_invariant (struct invariant *inv,
> unsigned *regs_needed,
>  + IRA_LOOP_RESERVED_REGS
>  - ira_class_hard_regs_num[cl];
>if (size_cost > 0)
> -   return -1;
> +   {
> + int spill_cost = target_spill_cost [speed] * (int) regs_needed[cl];
> + if (comp_cost <= spill_cost)
> +   return -1;
> +
> + return 2;
> +   }
>else
> size_cost = 0;
>  }

... why "return 2", instead of just falling through to "return
comp_cost - size_cost;"?

Ciao!
Steven

Re: [PATCH, loop2_invariant, 1/2] Check only one register class

2014-06-10 Thread Steven Bosscher

On Tue, Jun 10, 2014 at 11:22 AM, Zhenqiang Chen wrote:
> Hi,
>
> For loop2-invariant pass, when flag_ira_loop_pressure is enabled,
> function gain_for_invariant checks the pressures of all register
> classes. This does not make sense since one invariant might impact
> only one register class.
>
> The patch enhances functions get_inv_cost and gain_for_invariant to
> check only the register pressure of the invariant if possible.

This patch may work for targets with more-or-less orthogonal reg
classes, but not if there is a lot of overlap between reg classes.

So I don't think this approach is OK.

Ciao!
Steven

Re: [PATCH AArch64] Remove from arm_neon.h functions not in the spec

2014-06-10 Thread Marcus Shawcroft

On 29 May 2014 17:47, Alan Lawrence  wrote:
> Patch retaining vfmaq_n_f64 attached, updated gcc/ChangeLog:
>
> * config/aarch64/arm_neon.h (vmlaq_n_f64, vmlsq_n_f64, vrsrtsq_f64,
>
> vcge_p8, vcgeq_p8, vcgez_p8, vcgez_u8, vcgez_u16, vcgez_u32,
> vcgez_u64,
> vcgezq_p8, vcgezq_u8, vcgezq_u16, vcgezq_u32, vcgezq_u64,
> vcgezd_u64,
> vcgt_p8, vcgtq_p8, vcgtz_p8, vcgtz_u8, vcgtz_u16, vcgtz_u32,
> vcgtz_u64,
> vcgtzq_p8, vcgtzq_u8, vcgtzq_u16, vcgtzq_u32, vcgtzq_u64,
> vcgtzd_u64,
> vcle_p8, vcleq_p8, vclez_p8, vclez_u64, vclezq_p8, vclezd_u64,
> vclt_p8,
> vcltq_p8, vcltz_p8, vcltzq_p8, vcltzd_u64): Remove functions as they
> are
> not in the spec.
>
>
> Alan Lawrence wrote:
>>
>> No, hold that, vfmaq_n_f64 has been added back in the latest version (to
>> which I linked). Hang on...
>>
>> --Alan
>>
>> Alan Lawrence wrote:
>>>
>>> arm_neon.h contains a bunch of functions (for example, the wonderful
>>> vcgez_u* intrinsics - that's an unsigned comparison of
>>> greater-than-or-equal-to zero) that are not present in the current ARM Neon
>>> Intrinsics spec:
>>>
>>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/index.html

OK... and I also think we should back port this before 4.9.1.
/Marcus

Re: [PATCH, loop2_invariant] Skip inv (marked as move) from depends_on

2014-06-10 Thread Steven Bosscher

On Tue, Jun 10, 2014 at 11:32 AM, Zhenqiang Chen wrote:
>
> * loop-invariant.c (get_inv_cost): Skip invariants, which are marked
> as "move", from depends_on.
>

This is OK.

Ciao!
Steven

Re: [PATCH, loop2_invariant] Pre-check invariants

2014-06-10 Thread Steven Bosscher

On Tue, Jun 10, 2014 at 11:55 AM, Zhenqiang Chen wrote:
>
> * loop-invariant.c (find_invariant_insn): Skip invariants, which
> can not make a valid insn during replacement in move_invariant_reg.
>
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -881,6 +881,35 @@ find_invariant_insn (rtx insn, bool
> always_reached, bool always_executed)
>|| HARD_REGISTER_P (dest))
>  simple = false;
>
> +  /* Pre-check candidate to skip the one which can not make a valid insn
> + during move_invariant_reg.  */
> +  if (flag_ira_loop_pressure && df_live && simple
> +  && REG_P (dest) && DF_REG_DEF_COUNT (REGNO (dest)) > 1)

Why only do this with (flag_ira_loop_pressure && df_live)? If the
invariant can't be moved, we should ignore it regardless of whether
register pressure is taken into account.


> +{
> +  df_ref use;
> +  rtx ref;
> +  unsigned int i = REGNO (dest);
> +  struct df_insn_info *insn_info;
> +  df_ref *def_rec;
> +
> +  for (use = DF_REG_USE_CHAIN (i); use; use = DF_REF_NEXT_REG (use))
> +   {
> + ref = DF_REF_INSN (use);
> + insn_info = DF_INSN_INFO_GET (ref);
> +
> + for (def_rec = DF_INSN_INFO_DEFS (insn_info); *def_rec; def_rec++)
> +   if (DF_REF_REGNO (*def_rec) == i)
> + {
> +   /* Multi definitions at this stage, most likely are due to
> +  instruction constrain, which requires both read and write
> +  on the same register.  Since move_invariant_reg is not
> +  powerful enough to handle such cases, just ignore the INV
> +  and leave the chance to others.  */
> +   return;
> + }
> +   }
> +}
> +
>if (!may_assign_reg_p (SET_DEST (set))
>|| !check_maybe_invariant (SET_SRC (set)))
>  return;


Can you put your new check between "may_assign_reg_p (dest)" and
"check_maybe_invariant"? The may_assign_reg_p check is cheap and
triggers quite often.

Looks good to me otherwise.

Ciao!
Steven

[PATCH, i386] Remove use of vpmacsdql instruction from multiplication.

2014-06-10 Thread Gopalasubramanian, Ganesh

Hi,

The below patch fixes the issue with 64-bit multiplication.
The instruction "vpmacsdql" does signed 32-bit multiplication.
For V2DImode, we require widened unsigned multiplication.
So, replacing the "vpmacsdql" instruction with "vpmuludq" and "vpaddq".

This patch had been already discussed in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52908

With required change in the test xop-imul64-vector.c,  make check passes. Is it 
OK for upstream?

Regards
Ganesh

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d0a1253..c158612 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2014-06-10  Ganesh Gopalasubramanian 
+
+   * config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Issue instructions
+"vpmuludq" and "vpaddq" instead of "vpmacsdql" for handling 32-bit
+multiplication.
+
 2014-06-07  Jan Hubicka  

* cgraphunit.c (assemble_thunks_and_aliases): Expand thunks before
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9105132..184d82d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -45205,8 +45205,10 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
   /* t4: ((B*E)+(A*F))<<32, ((D*G)+(C*H))<<32 */
   emit_insn (gen_ashlv2di3 (t4, t3, GEN_INT (32)));

-  /* op0: (((B*E)+(A*F))<<32)+(B*F), (((D*G)+(C*H))<<32)+(D*H) */
-  emit_insn (gen_xop_pmacsdql (op0, op1, op2, t4));
+  /* Multiply lower parts and add all */
+  t5 = gen_reg_rtx (V2DImode);
+  emit_insn (gen_vec_widen_umult_even_v4si (t5, gen_lowpart (V4SImode, 
op1), gen_lowpart (V4SImode, op2)));
+  op0 = expand_binop (mode, add_optab, t5, t4, op0, 1, OPTAB_DIRECT);
 }
   else
 {
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index a6913af..757d3e3 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-06-10 Ganesh Gopalasubramanian  
+
+   * gcc.target/i386/xop-imul64-vector.c: Remove the check for
+   vpmacsdql instruction.
+
 2014-06-07  Eric Botcazou  

* gnat.dg/opt38.adb: New test.
diff --git a/gcc/testsuite/gcc.target/i386/xop-imul64-vector.c 
b/gcc/testsuite/gcc.target/i386/xop-imul64-vector.c
index fbf605f..fc8c880 100644
--- a/gcc/testsuite/gcc.target/i386/xop-imul64-vector.c
+++ b/gcc/testsuite/gcc.target/i386/xop-imul64-vector.c
@@ -33,4 +33,3 @@ int main ()

 /* { dg-final { scan-assembler "vpmulld" } } */
 /* { dg-final { scan-assembler "vphadddq" } } */
-/* { dg-final { scan-assembler "vpmacsdql" } } */

[PATCH] Add testcases from PR57186

2014-06-10 Thread Richard Biener


The fix for PR59299 fixed two testcases referenced in PR57186.

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-06-10  Richard Biener  

PR tree-optimization/57186
PR tree-optimization/59299
* gcc.dg/tree-ssa/ssa-sink-11.c: New testcase.
* gcc.dg/tree-ssa/ssa-sink-12.c: Likewise.

Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-12.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-12.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-12.c (working copy)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define SIZE 64
+
+int foo (int v1[], int v2[])
+{
+  int r, i, j;
+
+  for (j = 0; j < SIZE; j++)
+for (i = 0; i < SIZE; i++)
+  r = v1[j] + v2[i];
+
+  return r;
+}
+
+/* { dg-final { scan-tree-dump "MEM\\\[.* \\+ 252B\\\]" "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define SIZE 64
+
+int foo (int v[], int a)
+{
+  int r, i;
+
+  for (i = 0; i < SIZE; i++)
+r = v[i] + a;
+
+  return r;
+}
+
+/* { dg-final { scan-tree-dump "MEM\\\[.* \\+ 252B\\\]" "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Re: [PATCH] Sink loads (fix PR59299)

2014-06-10 Thread Richard Biener

On Sat, 7 Jun 2014, Eric Botcazou wrote:

> > The following adds the missing capability to sink loads to
> > tree-ssa-sink.c.  This enables sinking of loads and dependent
> > expressions into code paths that uses them (thus performing
> > partial dead code elimination on loads).
> 
> There is a much heavier implementation in tree-ssa-loop-im.c attached to PR 
> tree-opt/57186 as well as 3 testcases.
> 
> > The algorithm is simple (similar to that sinking stores) to
> > be light-weight on compile-time thus it may miss some
> > opportunities but it fires quite a bit on GCC itself.
> 
> Is it sufficient for the aforementioned testcases?

It fixes testcase #1 and #2 but not #3 (needs store motion first,
but sinking runs before store motion - and sinking doesn't really
apply to store motion opportunities).

I'll add the first two testcases to the testsuite.

Thanks,
Richard.

Re: [PATCH, x86] Improves x86 permutation expand

2014-06-10 Thread Evgeny Stupachenko

The stability of the changes are covered in gcc.dg/vect/pr52252-ld.c
Test on "pblend" scan I'll add with the patch:
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00795.html

On Tue, Jun 10, 2014 at 12:19 AM, H.J. Lu  wrote:
> On Mon, Jun 9, 2014 at 12:49 PM, Richard Henderson  wrote:
>> On 06/09/2014 12:10 PM, Evgeny Stupachenko wrote:
>>> Nice catch.
>>> Patch with corresponding changes:
>>
>> Looks ok with an appropriate changelog.
>>
>
> It will be nice to include testcases to cover those changes.
>
> --
> H.J.

Re: [PATCH AArch64 / testsuite] Add V1DFmode, fixes PR/59843

2014-06-10 Thread Marcus Shawcroft

On 15 May 2014 17:12, Alan Lawrence  wrote:
> Oops, I missed:
>
> gcc/ChangeLog:
> 2014-05-15  Alan Lawrence  
>
> * config/aarch64/aarch64-modes.def: Add V1DFmode.
> * config/aarch64/aarch64.c (aarch64_vector_mode_supported_p):
> Support V1DFmode.
>
> gcc/testsuite/ChangeLog:
> 2014-05-15  Alan Lawrence  
>
> * gcc.dg/vect/vect-singleton_1.c: New file.

OK
/Marcus

Re: [PING, PATCH2/2, PR52252] Vectorization for load/store groups of size 3.

2014-06-10 Thread Evgeny Stupachenko

ping.
The changes are similar to already committed on loads group.

On Tue, Jun 3, 2014 at 5:22 PM, Evgeny Stupachenko  wrote:
> I've added a bug report for the stores group case:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61403
>
>
> On Wed, May 28, 2014 at 5:18 PM, Evgeny Stupachenko  
> wrote:
>> Ping.
>> Test is modified according to the fix in the test for loads.
>>
>> diff --git a/gcc/testsuite/gcc.dg/vect/pr52252-st.c
>> b/gcc/testsuite/gcc.dg/vect/pr52252-st.c
>> new file mode 100644
>> index 000..e7161f7
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/vect/pr52252-st.c
>> @@ -0,0 +1,21 @@
>> +/* { dg-do compile } */
>> +/* { dg-additional-options "-mssse3" { target { i?86-*-* x86_64-*-* } } } */
>> +
>> +#define byte unsigned char
>> +
>> +void
>> +matrix_mul (byte *in, byte *out, int size)
>> +{
>> +  int i;
>> +  for (i = 0; i < size; i++)
>> +{
>> +  out[0] = in[0] + in[1] + in[3];
>> +  out[1] = in[0] + in[2] + in[4];
>> +  out[2] = in[1] + in[2] + in[4];
>> +  in += 4;
>> +  out += 3;
>> +}
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {
>> target { i?86-*-* x86_64-*-* } } } } */
>> +/* { dg-final { cleanup-tree-dump "vect" } } */
>>
>>
>> On Tue, May 6, 2014 at 6:39 PM, Evgeny Stupachenko  
>> wrote:
>>> 2nd part of patch is on stores group.
>>> Bootstrap and make check passed on x86.
>>>
>>> Is it ok?
>>>
>>> 2014-05-06  Evgeny Stupachenko  
>>>
>>> * tree-vect-data-refs.c (vect_grouped_store_supported): New
>>> check for storess group of length 3.
>>> (vect_permute_store_chain): New permutations for storess group of
>>> length 3.
>>> * tree-vect-stmts.c (vect_model_store_cost): Change cost
>>> of vec_perm_shuffle for the new permutations.
>>>
>>> ChangeLog for testsuite:
>>>
>>> 2014-05-06  Evgeny Stupachenko  
>>>
>>>PR tree-optimization/52252
>>>* gcc.dg/vect/pr52252-st.c: Test on stores group of size 3.
>>>
>>> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>>> index ef710cf..fb0e30d 100644
>>> --- a/gcc/tree-vect-data-refs.c
>>> +++ b/gcc/tree-vect-data-refs.c
>>> @@ -4365,13 +4365,14 @@ vect_grouped_store_supported (tree vectype,
>>> unsigned HOST_WIDE_INT count)
>>>  {
>>>enum machine_mode mode = TYPE_MODE (vectype);
>>>
>>> -  /* vect_permute_store_chain requires the group size to be a power of 
>>> two.  */
>>> -  if (exact_log2 (count) == -1)
>>> +  /* vect_permute_store_chain requires the group size to be equal to 3 or
>>> + be a power of two.  */
>>> +  if (count != 3 && exact_log2 (count) == -1)
>>>  {
>>>if (dump_enabled_p ())
>>> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>>> - "the size of the group of accesses"
>>> - " is not a power of 2\n");
>>> +"the size of the group of accesses"
>>> +" is not a power of 2 or not eqaul to 3\n");
>>>return false;
>>>  }
>>>
>>> @@ -4380,23 +4381,76 @@ vect_grouped_store_supported (tree vectype,
>>> unsigned HOST_WIDE_INT count)
>>>  {
>>>unsigned int i, nelt = GET_MODE_NUNITS (mode);
>>>unsigned char *sel = XALLOCAVEC (unsigned char, nelt);
>>> -  for (i = 0; i < nelt / 2; i++)
>>> +
>>> +  if (count == 3)
>>> {
>>> - sel[i * 2] = i;
>>> - sel[i * 2 + 1] = i + nelt;
>>> + unsigned int j0 = 0, j1 = 0, j2 = 0;
>>> + unsigned int i, j;
>>> +
>>> + for (j = 0; j < 3; j++)
>>> +   {
>>> + int nelt0 = ((3 - j) * nelt) % 3;
>>> + int nelt1 = ((3 - j) * nelt + 1) % 3;
>>> + int nelt2 = ((3 - j) * nelt + 2) % 3;
>>> + for (i = 0; i < nelt; i++)
>>> +   {
>>> + if (3 * i + nelt0 < nelt)
>>> +   sel[3 * i + nelt0] = j0++;
>>> + if (3 * i + nelt1 < nelt)
>>> +   sel[3 * i + nelt1] = nelt + j1++;
>>> + if (3 * i + nelt2 < nelt)
>>> +   sel[3 * i + nelt2] = 0;
>>> +   }
>>> + if (!can_vec_perm_p (mode, false, sel))
>>> +   {
>>> + if (dump_enabled_p ())
>>> +   dump_printf (MSG_MISSED_OPTIMIZATION,
>>> +"permutaion op not supported by 
>>> target.\n");
>>> + return false;
>>> +   }
>>> +
>>> + for (i = 0; i < nelt; i++)
>>> +   {
>>> + if (3 * i + nelt0 < nelt)
>>> +   sel[3 * i + nelt0] = 3 * i + nelt0;
>>> + if (3 * i + nelt1 < nelt)
>>> +   sel[3 * i + nelt1] = 3 * i + nelt1;
>>> + if (3 * i + nelt2 < nelt)
>>> +   sel[3 * i + nelt2] = nelt + j2++;
>>> +   }
>>> + if (!can_vec_perm_p (mode, false, sel))
>>> +   {
>>> +

Re: [PATCH, Fortran] PR61234: -Wuse-no-only

2014-06-10 Thread Dominique Dhumieres

> This explicitly tests that no bogus error message is issued
> for a use statement that has an only qualifier ?

I don't see the need for '! { dg-bogus "has no ONLY qualifier" }'.
AFAICT there is no warning emitted for this line (unless you add -Wall)
and if some day it happens that an error/warning is issued, the test will fail.

Otherwise the new patch is OK for me.

Cheers,

Dominique

[PATCH][AArch64][2/2] Add CRC32 ACLE intrinsics testsuite

2014-06-10 Thread Kyrill Tkachov


Hi all,

This is the testsuite for the CRC32 ACLE intrinsics. They are done in 
much the same way as the aarch32 tests in gcc.target/arm/acle/acle.exp 
except that these are modified to run at -O2 optimisation level.


These pass now on aarch64.

Ok for trunk after patch [1/2]?

Thanks,
Kyrill

2014-06-10  Kyrylo Tkachov  

* gcc.target/aarch64/acle/acle.exp: New.
* gcc.target/aarch64/acle/crc32b.c: New test.
* gcc.target/aarch64/acle/crc32cb.c: Likewise.
* gcc.target/aarch64/acle/crc32cd.c: Likewise.
* gcc.target/aarch64/acle/crc32ch.c: Likewise.
* gcc.target/aarch64/acle/crc32cw.c: Likewise.
* gcc.target/aarch64/acle/crc32d.c: Likewise.
* gcc.target/aarch64/acle/crc32h.c: Likewise.
* gcc.target/aarch64/acle/crc32w.c: Likewise.commit 02d00c53305ebd5c21ce48a427d3a1b97563b504
Author: Kyrylo Tkachov 
Date:   Fri May 16 17:48:40 2014 +0100

[AArch64] CRC32 ACLE intrinsics testsuite.

diff --git a/gcc/testsuite/gcc.target/aarch64/acle/acle.exp b/gcc/testsuite/gcc.target/aarch64/acle/acle.exp
new file mode 100644
index 000..e820f6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/acle.exp
@@ -0,0 +1,35 @@
+# Copyright (C) 2014 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if ![istarget aarch64*-*-*] then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
+	"" ""
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/crc32b.c b/gcc/testsuite/gcc.target/aarch64/acle/crc32b.c
new file mode 100644
index 000..bf9a3d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/crc32b.c
@@ -0,0 +1,15 @@
+/* Test the crc32b ACLE intrinsic.  */
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps -O2 -march=armv8-a+crc" } */
+
+#include "arm_acle.h"
+
+uint32_t
+test_crc32b (uint32_t arg0, uint8_t arg1)
+{
+  return __crc32b (arg0, arg1);
+}
+
+/* { dg-final { scan-assembler "crc32b\tw..?, w..?, w..?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/crc32cb.c b/gcc/testsuite/gcc.target/aarch64/acle/crc32cb.c
new file mode 100644
index 000..a5a39b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/crc32cb.c
@@ -0,0 +1,15 @@
+/* Test the crc32cb ACLE intrinsic.  */
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps -O2 -march=armv8-a+crc" } */
+
+#include "arm_acle.h"
+
+uint32_t
+test_crc32cb (uint32_t arg0, uint8_t arg1)
+{
+  return __crc32cb (arg0, arg1);
+}
+
+/* { dg-final { scan-assembler "crc32cb\tw..?, w..?, w..?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/crc32cd.c b/gcc/testsuite/gcc.target/aarch64/acle/crc32cd.c
new file mode 100644
index 000..b50097a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/crc32cd.c
@@ -0,0 +1,15 @@
+/* Test the crc32cd ACLE intrinsic.  */
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps -O2 -march=armv8-a+crc" } */
+
+#include "arm_acle.h"
+
+uint32_t
+test_crc32cd (uint32_t arg0, uint64_t arg1)
+{
+  return __crc32cd (arg0, arg1);
+}
+
+/* { dg-final { scan-assembler "crc32cx\tw..?, w..?, x..?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/crc32ch.c b/gcc/testsuite/gcc.target/aarch64/acle/crc32ch.c
new file mode 100644
index 000..523faa2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/crc32ch.c
@@ -0,0 +1,15 @@
+/* Test the crc32ch ACLE intrinsic.  */
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps -O2 -march=armv8-a+crc" } */
+
+#include "arm_acle.h"
+
+uint32_t
+test_crc32ch (uint32_t arg0, uint16_t arg1)
+{
+  return __crc32ch (arg0, arg1);
+}
+
+/* { dg-final { scan-assembler "crc32ch\tw..?, w..?, w..?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/crc32cw.c b/gcc/testsuite/gcc.target/aarch64/acle/crc32cw.c
new file mode 100644
index 000..531e604
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/crc32cw.c
@@ -0,0 +1,15 @@
+/* Test the crc32cw ACLE intrinsic.  */
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps -O2 -march=a

[PATCH][AArch64][1/2] Implement CRC32 ACLE intrinsics

2014-06-10 Thread Kyrill Tkachov


Hi all,

This is an implementation of the ACLE intrinsics that can be used to 
access the CRC32 instructions. We have them already implemented in aarch32.


You can find their definition and documentation at
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf

The CRC32 intrinsics are non-AdvancedSIMD intrinsics that live in a 
header file called arm_acle.h
There's only 8 of them, so I didn't create a separate .def file for 
them. The ACLE predefine "__ARM_FEATURE_CRC32" is now defined when the 
+crc arch extension is used.
Builtins for each CRC instruction form are defined and the intrinsics 
map to them straightforwardly.


Documentation is included.

Bootstrapped and tested aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2014-06-10  Kyrylo Tkachov  

* config.gcc (aarch64*-*-*): Add arm_acle.h to extra headers.
* Makefile.in (TEXI_GCC_FILES): Add aarch64-acle-intrinsics.texi to
dependencies.
* config/aarch64/aarch64-builtins.c (AARCH64_CRC32_BUILTINS): Define.
(aarch64_crc_builtin_datum): New struct.
(aarch64_crc_builtin_data): New.
(aarch64_init_crc32_builtins): New function.
(aarch64_init_builtins): Initialise CRC32 builtins when appropriate.
(aarch64_crc32_expand_builtin): New.
(aarch64_expand_builtin): Add CRC32 builtin expansion case.
* config/aarch64/aarch64.h (TARGET_CPU_CPP_BUILTINS): Define
__ARM_FEATURE_CRC32 when appropriate.
(TARGET_CRC32): Define.
* config/aarch64/aarch64.md (UNSPEC_CRC32B, UNSPEC_CRC32H,
UNSPEC_CRC32W, UNSPEC_CRC32X, UNSPEC_CRC32CB, UNSPEC_CRC32CH,
UNSPEC_CRC32CW, UNSPEC_CRC32CX): New unspec values.
(aarch64_): New pattern.
* config/aarch64/arm_acle.h: New file.
* config/aarch64/iterators.md (CRC): New int iterator.
(crc_variant, crc_mode): New int attributes.
* doc/aarch64-acle-intrinsics.texi: New file.
* doc/extend.texi (aarch64): Document aarch64 ACLE intrinsics.
Include aarch64-acle-intrinsics.texi.commit e686eaa8ac08683969e53c4c0eb4e912e0a46d54
Author: Kyrylo Tkachov 
Date:   Fri May 16 15:38:03 2014 +0100

[AArch64] Implement CRC32 ACLE intrinsics

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 3350186..a6fba33 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2801,7 +2801,7 @@ TEXI_GCC_FILES = gcc.texi gcc-common.texi gcc-vers.texi frontends.texi	\
 	 contribute.texi compat.texi funding.texi gnu.texi gpl_v3.texi	\
 	 fdl.texi contrib.texi cppenv.texi cppopts.texi avr-mmcu.texi	\
 	 implement-c.texi implement-cxx.texi arm-neon-intrinsics.texi	\
-	 arm-acle-intrinsics.texi
+	 arm-acle-intrinsics.texi aarch64-acle-intrinsics.texi
 
 # we explicitly use $(srcdir)/doc/tm.texi here to avoid confusion with
 # the generated tm.texi; the latter might have a more recent timestamp,
diff --git a/gcc/config.gcc b/gcc/config.gcc
index c3f3ea6..80bb3db 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -302,7 +302,7 @@ m32c*-*-*)
 ;;
 aarch64*-*-*)
 	cpu_type=aarch64
-	extra_headers="arm_neon.h"
+	extra_headers="arm_neon.h arm_acle.h"
 	extra_objs="aarch64-builtins.o aarch-common.o"
 	target_has_targetm_common=yes
 	;;
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index fe4d392..a94ef52 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -411,6 +411,28 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = {
 #include "aarch64-simd-builtins.def"
 };
 
+/* There's only 8 CRC32 builtins.  Probably not worth their own .def file.  */
+#define AARCH64_CRC32_BUILTINS \
+  CRC32_BUILTIN (crc32b, QI) \
+  CRC32_BUILTIN (crc32h, HI) \
+  CRC32_BUILTIN (crc32w, SI) \
+  CRC32_BUILTIN (crc32x, DI) \
+  CRC32_BUILTIN (crc32cb, QI) \
+  CRC32_BUILTIN (crc32ch, HI) \
+  CRC32_BUILTIN (crc32cw, SI) \
+  CRC32_BUILTIN (crc32cx, DI)
+
+typedef struct
+{
+  const char *name;
+  enum machine_mode mode;
+  const enum insn_code icode;
+  unsigned int fcode;
+} aarch64_crc_builtin_datum;
+
+#define CRC32_BUILTIN(N, M) \
+  AARCH64_BUILTIN_##N,
+
 #undef VAR1
 #define VAR1(T, N, MAP, A) \
   AARCH64_SIMD_BUILTIN_##T##_##N##A,
@@ -428,9 +450,22 @@ enum aarch64_builtins
 #include "aarch64-simd-builtins.def"
   AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE
 			  + ARRAY_SIZE (aarch64_simd_builtin_data),
+  AARCH64_CRC32_BUILTIN_BASE,
+  AARCH64_CRC32_BUILTINS
+  AARCH64_CRC32_BUILTIN_MAX,
   AARCH64_BUILTIN_MAX
 };
 
+#undef CRC32_BUILTIN
+#define CRC32_BUILTIN(N, M) \
+  {"__builtin_aarch64_"#N, M##mode, CODE_FOR_aarch64_##N, AARCH64_BUILTIN_##N},
+
+static aarch64_crc_builtin_datum aarch64_crc_builtin_data[] = {
+  AARCH64_CRC32_BUILTINS
+};
+
+#undef CRC32_BUILTIN
+
 static GTY(()) tree aarch64_builtin_decls[AARCH64_BUILTIN_MAX];
 
 #define NUM_DREG_TYPES 6
@@ -802,6 +837,24 @@ aarch64_init_simd_builtins (void)
 }
 }
 
+static void
+aarch64_init_crc32_builtins ()
+{
+  tree usi_type = aarch64_build_unsigned_type (SImode);
+  unsigned in

Re: [C++ Patch] PR 19200

2014-06-10 Thread Paolo Carlini


Hi,

On 06/10/2014 03:40 AM, Jason Merrill wrote:

I think the parser approach is more correct.
I suspected that, in fact it's the first approach I tried, but the 
additional parameter in many places made me a little nervous and 
persisted ;)

On 06/09/2014 07:02 PM, Paolo Carlini wrote:

  !TYPE_WAS_ANONYMOUS (class_type)
+ && !friend_p
  && constructor_name_p (unqualified_name,
 class_type))


But here you also need to check qualifying_scope; 'friend' does not 
affect whether or not a qualified-id is a constructor.
Thanks. Fiddling with the qualifying_scope bit I noticed that we also 
crashed in constructor_name_p for friend10.C below, that is when 
CLASS_TYPE_P (qualifying_scope) is false, it's a namespace. When I 
figured out the right condition, I noticed that grokdeclarator also 
needed an adjustment: ctype, set by:


if (ctype == NULL_TREE
&& decl_context == FIELD
&& funcdecl_p
&& (friendp == 0 || dname == current_class_name))
  ctype = current_class_type;

means that, later on:

  if (ctype && TREE_CODE (type) == FUNCTION_TYPE && staticp < 2
  && !NEW_DELETE_OPNAME_P (unqualified_id))

is true and build_memfn_type is called. Then grokfndecl calls 
set_decl_namespace which fails because compares a FUNCTION_TYPE to a 
METHOD_TYPE. Simply clearing ctype when we know we are handling a 
friendp appears to do the trick and passes the testsuite...


Thanks!
Paolo.

//
/cp
2014-06-10  Paolo Carlini  

PR c++/19200
* parser.c (cp_parser_declarator): Add bool parameter.
(cp_parser_direct_declarator): Likewise, use it.
(cp_parser_member_declaration): Pass friend_p to cp_parser_declarator.
(cp_parser_condition, cp_parser_explicit_instantiation,
cp_parser_init_declarator, cp_parser_type_id_1,
cp_parser_parameter_declaration, cp_parser_exception_declaration,
cp_parser_cache_defarg, cp_parser_objc_class_ivars, 
cp_parser_objc_struct_declaration, cp_parser_omp_for_loop_init):
Adjust.
* decl.c (grokdeclarator): Fix handling of friend declared in
namespace scope.

/testsuite
2014-06-10  Paolo Carlini  

PR c++/19200
* g++.dg/parse/friend9.C: New.
* g++.dg/parse/friend10.C: Likewise.
* g++.dg/parse/friend7.C: Adjust.

Index: cp/decl.c
===
--- cp/decl.c   (revision 211395)
+++ cp/decl.c   (working copy)
@@ -9751,6 +9751,19 @@ grokdeclarator (const cp_declarator *declarator,
  }
else if (friendp)
  {
+   /* Ensure, eg:
+
+   namespace N {
+ void S();
+   }
+
+   struct S {
+ friend void N::S();
+   };
+
+  is handled correctly, clear ctype (c++/19200).  */
+   ctype = NULL_TREE;
+
if (initialized)
  error ("can%'t initialize friend function %qs", name);
if (virtualp)
Index: cp/parser.c
===
--- cp/parser.c (revision 211395)
+++ cp/parser.c (working copy)
@@ -2078,9 +2078,9 @@ static tree cp_parser_decltype
 static tree cp_parser_init_declarator
   (cp_parser *, cp_decl_specifier_seq *, vec *, 
bool, bool, int, bool *, tree *);
 static cp_declarator *cp_parser_declarator
-  (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool);
+  (cp_parser *, cp_parser_declarator_kind, int *, bool *, bool, bool);
 static cp_declarator *cp_parser_direct_declarator
-  (cp_parser *, cp_parser_declarator_kind, int *, bool);
+  (cp_parser *, cp_parser_declarator_kind, int *, bool, bool);
 static enum tree_code cp_parser_ptr_operator
   (cp_parser *, tree *, cp_cv_quals *, tree *);
 static cp_cv_quals cp_parser_cv_qualifier_seq_opt
@@ -10014,7 +10014,8 @@ cp_parser_condition (cp_parser* parser)
   declarator = cp_parser_declarator (parser, CP_PARSER_DECLARATOR_NAMED,
 /*ctor_dtor_or_conv_p=*/NULL,
 /*parenthesized_p=*/NULL,
-/*member_p=*/false);
+/*member_p=*/false,
+/*friend_p=*/false);
   /* Parse the attributes.  */
   attributes = cp_parser_attributes_opt (parser);
   /* Parse the asm-specification.  */
@@ -14160,7 +14161,8 @@ cp_parser_explicit_instantiation (cp_parser* parse
= cp_parser_declarator (parser, CP_PARSER_DECLARATOR_NAMED,
/*ctor_dtor_or_conv_p=*/NULL,
/*parenthesized_p=*/NULL,
-   /*member_p=*/false);
+   /*member_p=*/false,
+   /*friend_p=*/

[PATCH] Fix PR61456

2014-06-10 Thread Richard Biener


The following fixes an issue with nonoverlapping_component_refs_of_decl_p
(and the latent same issue in nonoverlapping_component_refs_p).  We
can't rely on all variant types having the same TYPE_FIELDS, so the
following simply uses DECL_FIELD_CONTEXT directly (which is either
the same for shared TYPE_FIELDS or not - in which case we can't
use TYPE_MAIN_VARIANT anyway).

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2014-06-10  Richard Biener  

PR middle-end/61456
* tree-ssa-alias.c (nonoverlapping_component_refs_of_decl_p):
Do not use the main variant for the type comparison.
(ncr_compar): Likewise.
(nonoverlapping_component_refs_p): Likewise.

* g++.dg/opt/pr61456.C: New testcase.

Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 211398)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -835,8 +835,8 @@ nonoverlapping_component_refs_of_decl_p
   /* ??? We cannot simply use the type of operand #0 of the refs here
 as the Fortran compiler smuggles type punning into COMPONENT_REFs
 for common blocks instead of using unions like everyone else.  */
-  tree type1 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field1));
-  tree type2 = TYPE_MAIN_VARIANT (DECL_CONTEXT (field2));
+  tree type1 = DECL_CONTEXT (field1);
+  tree type2 = DECL_CONTEXT (field2);
 
   /* We cannot disambiguate fields in a union or qualified union.  */
   if (type1 != type2 || TREE_CODE (type1) != RECORD_TYPE)
@@ -866,10 +866,8 @@ ncr_compar (const void *field1_, const v
 {
   const_tree field1 = *(const_tree *) const_cast (field1_);
   const_tree field2 = *(const_tree *) const_cast (field2_);
-  unsigned int uid1
-= TYPE_UID (TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field1)));
-  unsigned int uid2
-= TYPE_UID (TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field2)));
+  unsigned int uid1 = TYPE_UID (DECL_FIELD_CONTEXT (field1));
+  unsigned int uid2 = TYPE_UID (DECL_FIELD_CONTEXT (field2));
   if (uid1 < uid2)
 return -1;
   else if (uid1 > uid2)
@@ -893,7 +891,7 @@ nonoverlapping_component_refs_p (const_t
   while (TREE_CODE (x) == COMPONENT_REF)
 {
   tree field = TREE_OPERAND (x, 1);
-  tree type = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field));
+  tree type = DECL_FIELD_CONTEXT (field);
   if (TREE_CODE (type) == RECORD_TYPE)
fieldsx.safe_push (field);
   x = TREE_OPERAND (x, 0);
@@ -904,7 +902,7 @@ nonoverlapping_component_refs_p (const_t
   while (TREE_CODE (y) == COMPONENT_REF)
 {
   tree field = TREE_OPERAND (y, 1);
-  tree type = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field));
+  tree type = DECL_FIELD_CONTEXT (field);
   if (TREE_CODE (type) == RECORD_TYPE)
fieldsy.safe_push (TREE_OPERAND (y, 1));
   y = TREE_OPERAND (y, 0);
@@ -915,8 +913,8 @@ nonoverlapping_component_refs_p (const_t
   /* Most common case first.  */
   if (fieldsx.length () == 1
   && fieldsy.length () == 1)
-return ((TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldsx[0]))
-== TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldsy[0])))
+return ((DECL_FIELD_CONTEXT (fieldsx[0])
+== DECL_FIELD_CONTEXT (fieldsy[0]))
&& fieldsx[0] != fieldsy[0]
&& !(DECL_BIT_FIELD (fieldsx[0]) && DECL_BIT_FIELD (fieldsy[0])));
 
@@ -949,8 +947,8 @@ nonoverlapping_component_refs_p (const_t
 {
   const_tree fieldx = fieldsx[i];
   const_tree fieldy = fieldsy[j];
-  tree typex = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldx));
-  tree typey = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldy));
+  tree typex = DECL_FIELD_CONTEXT (fieldx);
+  tree typey = DECL_FIELD_CONTEXT (fieldy);
   if (typex == typey)
{
  /* We're left with accessing different fields of a structure,
Index: gcc/testsuite/g++.dg/opt/pr61456.C
===
--- gcc/testsuite/g++.dg/opt/pr61456.C  (revision 0)
+++ gcc/testsuite/g++.dg/opt/pr61456.C  (working copy)
@@ -0,0 +1,26 @@
+// { dg-do compile }
+// { dg-options "-O2 -std=c++11 -Werror=uninitialized" }
+
+int rand ();
+
+class Funcs
+{
+public:
+int *f1 ();
+int *f2 ();
+};
+typedef decltype (&Funcs::f1) pfunc;
+
+static int Set (Funcs * f, const pfunc & fp)
+{
+  (f->*fp) ();
+}
+
+void
+Foo ()
+{
+  pfunc fp = &Funcs::f1;
+  if (rand ())
+fp = &Funcs::f2;
+  Set (0, fp);
+}

[PATCH, loop2_invariant] Pre-check invariants

2014-06-10 Thread Zhenqiang Chen

Hi,

During tests, I found some invariants could not be replaced at the
last stage. If we can identify such invariants earlier, we can skip
them and give the chance to other invariants.  So the patch pre-checks
candidates to skip the one which can not make a valid insn during
replacement in move_invariant_reg.

Bootstrap and no make check regression on X86-64.
Bootstrap and no make check regression on X86-64 with
flag_ira_loop_pressure = true.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-06-10  Zhenqiang Chen  

* loop-invariant.c (find_invariant_insn): Skip invariants, which
can not make a valid insn during replacement in move_invariant_reg.

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index c43206a..7be4b29 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -881,6 +881,35 @@ find_invariant_insn (rtx insn, bool
always_reached, bool always_executed)
   || HARD_REGISTER_P (dest))
 simple = false;

+  /* Pre-check candidate to skip the one which can not make a valid insn
+ during move_invariant_reg.  */
+  if (flag_ira_loop_pressure && df_live && simple
+  && REG_P (dest) && DF_REG_DEF_COUNT (REGNO (dest)) > 1)
+{
+  df_ref use;
+  rtx ref;
+  unsigned int i = REGNO (dest);
+  struct df_insn_info *insn_info;
+  df_ref *def_rec;
+
+  for (use = DF_REG_USE_CHAIN (i); use; use = DF_REF_NEXT_REG (use))
+   {
+ ref = DF_REF_INSN (use);
+ insn_info = DF_INSN_INFO_GET (ref);
+
+ for (def_rec = DF_INSN_INFO_DEFS (insn_info); *def_rec; def_rec++)
+   if (DF_REF_REGNO (*def_rec) == i)
+ {
+   /* Multi definitions at this stage, most likely are due to
+  instruction constrain, which requires both read and write
+  on the same register.  Since move_invariant_reg is not
+  powerful enough to handle such cases, just ignore the INV
+  and leave the chance to others.  */
+   return;
+ }
+   }
+}
+
   if (!may_assign_reg_p (SET_DEST (set))
   || !check_maybe_invariant (SET_SRC (set)))
 return;

Re: [PATCH][ARM][doc] Improve description of AArch32 CRC32 intrinsics

2014-06-10 Thread Richard Earnshaw

On 09/06/14 11:06, Kyrill Tkachov wrote:
> Hi all,
> 
> The ACLE intrinsics documentation for arm can be improved a bit.
> 
> Since there are potentially other ACLE intrinsics besides the CRC32 ones 
> in the future, I moved the comment about their availability into the 
> CRC32 intrinsics subsection.
> 
> I removed the comment about the instruction form expected for AArch64 
> since that is a separate port and should be documented in the AArch64 
> section.
> 
> Tested by building the PDF doc and looking at it.
> 
> I think this should go for 4.9 as well as trunk as it is a clarification.
> 
> Ok?
> 

Ok both.

R.

> Thanks,
> Kyrill
> 
> 
> 2014-06-09  Kyrylo Tkachov  
> 
>  * doc/arm-acle-intrinsics.texi: Specify when CRC32 intrinsics are
>  available.
>  Simplify description of __crc32d and __crc32cd intrinsics.
>  * doc/extend.texi (ARM ACLE Intrinsics): Remove comment about CRC32
>  availability.
> 
> 
> arm-crc-doc-fix.patch
> 
> 
> diff --git a/gcc/doc/arm-acle-intrinsics.texi 
> b/gcc/doc/arm-acle-intrinsics.texi
> index e68f4cd..8c5523e 100644
> --- a/gcc/doc/arm-acle-intrinsics.texi
> +++ b/gcc/doc/arm-acle-intrinsics.texi
> @@ -4,6 +4,10 @@
>  
>  @subsubsection CRC32 intrinsics
>  
> +These intrinsics are available when the CRC32 architecture extension is
> +specified, e.g. when the @option{-march=armv8-a+crc} switch is used, or when
> +the target processor specified with @option{-mcpu} supports it.
> +
>  @itemize @bullet
>  @item uint32_t __crc32b (uint32_t, uint8_t)
>  @*@emph{Form of expected instruction(s):} @code{crc32b @var{r0}, @var{r0}, 
> @var{r0}}
> @@ -25,8 +29,7 @@
>  @itemize @bullet
>  @item uint32_t __crc32d (uint32_t, uint64_t)
>  @*@emph{Form of expected instruction(s):} Two @code{crc32w @var{r0}, 
> @var{r0}, @var{r0}}
> -instructions for AArch32. One @code{crc32w @var{w0}, @var{w0}, @var{x0}} 
> instruction for
> -AArch64.
> +instructions.
>  @end itemize
>  
>  @itemize @bullet
> @@ -50,6 +53,5 @@ AArch64.
>  @itemize @bullet
>  @item uint32_t __crc32cd (uint32_t, uint64_t)
>  @*@emph{Form of expected instruction(s):} Two @code{crc32cw @var{r0}, 
> @var{r0}, @var{r0}}
> -instructions for AArch32. One @code{crc32cw @var{w0}, @var{w0}, @var{x0}} 
> instruction for
> -AArch64.
> +instructions.
>  @end itemize
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index a2fe619..a68020a 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -10519,9 +10519,6 @@ when the @option{-mfpu=neon} switch is used:
>  @node ARM ACLE Intrinsics
>  @subsection ARM ACLE Intrinsics
>  
> -These built-in intrinsics for the ARMv8-A CRC32 extension are available when
> -the @option{-march=armv8-a+crc} switch is used:
> -
>  @include arm-acle-intrinsics.texi
>  
>  @node AVR Built-in Functions
>

[PATCH, loop2_invariant] Skip inv (marked as move) from depends_on

2014-06-10 Thread Zhenqiang Chen

Hi,

The patch skips an invariant from depends_on if it has been marked as
"move" since its register pressure and cost had been taken into
account in previous iterations.

Bootstrap and no make check regression on X86-64.
Bootstrap and no make check regression on X86-64 with
flag_ira_loop_pressure = true.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-06-10  Zhenqiang Chen  

* loop-invariant.c (get_inv_cost): Skip invariants, which are marked
as "move", from depends_on.

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index e822bb6..fca9c2f 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1148,6 +1148,10 @@ get_inv_cost (struct invariant *inv, int
*comp_cost, unsigned *regs_needed,

   dep = invariants[depno];

+  /* If DEP is moved out of the loop, it is not a depends_on any more.  */
+  if (dep->move)
+   continue;
+
   dep_ret = get_inv_cost (dep, &acomp_cost, aregs_needed, &dep_cl);

   if (! flag_ira_loop_pressure)

[PATCH, loop2_invariant, 2/2] Change heuristics for identical invariants

2014-06-10 Thread Zhenqiang Chen

Hi,

When analysing logs of loop2-invariant of eembc, I found the same
invariant occurred lots of times in a loop. But it was not selected
since its cost was not high and register pressure was high. Logs show
performance improvement by giving them higher priority to move.

The patch changes the heuristics to move identical invariants:
* add a new member eqno, which records the number of invariants eqto the inv.
* set its cost to: inv->cost * inv->eqno;
* compare with spill_cost if register pressure is high.

Bootstrap and no make check regression on X86-64.
Bootstrap and no make check regression on X86-64 with
flag_ira_loop_pressure = true.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-06-10  Zhenqiang Chen  

* loop-invariant.c (struct invariant): Add a new member: eqno;
(find_identical_invariants): Update eqno;
(create_new_invariant): Init eqno;
(get_inv_cost): Compute comp_cost wiht eqno;
(gain_for_invariant): Take spill cost into account.

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index 92388f5..c43206a 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -104,6 +104,9 @@ struct invariant
   /* The number of the invariant with the same value.  */
   unsigned eqto;

+  /* The number of invariants which eqto this.  */
+  unsigned eqno;
+
   /* If we moved the invariant out of the loop, the register that contains its
  value.  */
   rtx reg;
@@ -498,6 +501,7 @@ find_identical_invariants (invariant_htab_type eq,
struct invariant *inv)
   struct invariant *dep;
   rtx expr, set;
   enum machine_mode mode;
+  struct invariant *tmp;

   if (inv->eqto != ~0u)
 return;
@@ -513,7 +517,12 @@ find_identical_invariants (invariant_htab_type
eq, struct invariant *inv)
   mode = GET_MODE (expr);
   if (mode == VOIDmode)
 mode = GET_MODE (SET_DEST (set));
-  inv->eqto = find_or_insert_inv (eq, expr, mode, inv)->invno;
+
+  tmp = find_or_insert_inv (eq, expr, mode, inv);
+  inv->eqto = tmp->invno;
+
+  if (tmp->invno != inv->invno && inv->always_executed)
+tmp->eqno++;

   if (dump_file && inv->eqto != inv->invno)
 fprintf (dump_file,
@@ -725,6 +734,10 @@ create_new_invariant (struct def *def, rtx insn,
bitmap depends_on,

   inv->invno = invariants.length ();
   inv->eqto = ~0u;
+
+  /* Itself.  */
+  inv->eqno = 1;
+
   if (def)
 def->invno = inv->invno;
   invariants.safe_push (inv);
@@ -1107,7 +1120,7 @@ get_inv_cost (struct invariant *inv, int
*comp_cost, unsigned *regs_needed,

   if (!inv->cheap_address
   || inv->def->n_addr_uses < inv->def->n_uses)
-(*comp_cost) += inv->cost;
+(*comp_cost) += inv->cost * inv->eqno;

 #ifdef STACK_REGS
   {
@@ -1243,7 +1256,13 @@ gain_for_invariant (struct invariant *inv,
unsigned *regs_needed,
 + IRA_LOOP_RESERVED_REGS
 - ira_class_hard_regs_num[cl];
   if (size_cost > 0)
-   return -1;
+   {
+ int spill_cost = target_spill_cost [speed] * (int) regs_needed[cl];
+ if (comp_cost <= spill_cost)
+   return -1;
+
+ return 2;
+   }
   else
size_cost = 0;
 }

[PATCH, loop2_invariant, 1/2] Check only one register class

2014-06-10 Thread Zhenqiang Chen

Hi,

For loop2-invariant pass, when flag_ira_loop_pressure is enabled,
function gain_for_invariant checks the pressures of all register
classes. This does not make sense since one invariant might impact
only one register class.

The patch enhances functions get_inv_cost and gain_for_invariant to
check only the register pressure of the invariant if possible.

Bootstrap and no make check regression on X86-64.
Bootstrap and no make check regression on X86-64 with
flag_ira_loop_pressure = true.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-06-10  Zhenqiang Chen  

* loop-invariant.c (get_inv_cost): Handle register class.
(gain_for_invariant): Check the register pressure of the inv,
other than all.

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index 100a2c1..e822bb6 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1058,16 +1058,22 @@ get_pressure_class_and_nregs (rtx insn, int *nregs)
 }

 /* Calculates cost and number of registers needed for moving invariant INV
-   out of the loop and stores them to *COST and *REGS_NEEDED.  */
+   out of the loop and stores them to *COST and *REGS_NEEDED.  *CL will be
+   the REG_CLASS of INV.  Return
+ 0: if INV is invalid.
+ 1: if INV and its depends_on have same reg_class
+   > 1: if INV and its depends_on have different reg_classes.  */

-static void
-get_inv_cost (struct invariant *inv, int *comp_cost, unsigned *regs_needed)
+static int
+get_inv_cost (struct invariant *inv, int *comp_cost, unsigned *regs_needed,
+ enum reg_class *cl)
 {
   int i, acomp_cost;
   unsigned aregs_needed[N_REG_CLASSES];
   unsigned depno;
   struct invariant *dep;
   bitmap_iterator bi;
+  int ret = 2;

   /* Find the representative of the class of the equivalent invariants.  */
   inv = invariants[inv->eqto];
@@ -1083,7 +1089,7 @@ get_inv_cost (struct invariant *inv, int
*comp_cost, unsigned *regs_needed)

   if (inv->move
   || inv->stamp == actual_stamp)
-return;
+return 0;
   inv->stamp = actual_stamp;

   if (! flag_ira_loop_pressure)
@@ -1095,6 +1101,8 @@ get_inv_cost (struct invariant *inv, int
*comp_cost, unsigned *regs_needed)

   pressure_class = get_pressure_class_and_nregs (inv->insn, &nregs);
   regs_needed[pressure_class] += nregs;
+  *cl = pressure_class;
+  ret = 1;
 }
   if (!inv->cheap_address
@@ -1135,10 +1143,12 @@ get_inv_cost (struct invariant *inv, int
*comp_cost, unsigned *regs_needed)
   EXECUTE_IF_SET_IN_BITMAP (inv->depends_on, 0, depno, bi)
 {
   bool check_p;
+  enum reg_class dep_cl = NO_REGS;
+  int dep_ret;

   dep = invariants[depno];

-  get_inv_cost (dep, &acomp_cost, aregs_needed);
+  dep_ret = get_inv_cost (dep, &acomp_cost, aregs_needed, &dep_cl);

   if (! flag_ira_loop_pressure)
check_p = aregs_needed[0] != 0;
@@ -1148,6 +1158,11 @@ get_inv_cost (struct invariant *inv, int
*comp_cost, unsigned *regs_needed)
if (aregs_needed[ira_pressure_classes[i]] != 0)
  break;
  check_p = i < ira_pressure_classes_num;
+
+ if (dep_ret > 1)
+   ret += dep_ret;
+ else if ((dep_ret == 1) && (*cl != dep_cl))
+   ret++;
}
   if (check_p
  /* We need to check always_executed, since if the original value of
@@ -1181,6 +1196,7 @@ get_inv_cost (struct invariant *inv, int
*comp_cost, unsigned *regs_needed)
}
   (*comp_cost) += acomp_cost;
 }
+  return ret;
 }

 /* Calculates gain for eliminating invariant INV.  REGS_USED is the number
@@ -1195,10 +1211,12 @@ gain_for_invariant (struct invariant *inv,
unsigned *regs_needed,
bool speed, bool call_p)
 {
   int comp_cost, size_cost;
+  enum reg_class cl;
+  int ret;

   actual_stamp++;

-  get_inv_cost (inv, &comp_cost, regs_needed);
+  ret = get_inv_cost (inv, &comp_cost, regs_needed, &cl);

   if (! flag_ira_loop_pressure)
 {
@@ -1207,6 +1225,24 @@ gain_for_invariant (struct invariant *inv,
unsigned *regs_needed,
   - estimate_reg_pressure_cost (new_regs[0],
 regs_used, speed, call_p));
 }
+  else if (ret == 0)
+return -1;
+  else if (ret == 1)
+{
+  /* Hoist it anyway since it does not impact register pressure.  */
+  if (cl == NO_REGS)
+   return 1;
+
+  size_cost = (int) new_regs[cl]
++ (int) regs_needed[cl]
++ LOOP_DATA (curr_loop)->max_reg_pressure[cl]
++ IRA_LOOP_RESERVED_REGS
+- ira_class_hard_regs_num[cl];
+  if (size_cost > 0)
+   return -1;
+  else
+   size_cost = 0;
+}
   else
 {
   int i;

Re: Commit policy? Re: [PATCH 7/7] Plug ipa-prop escape analysis into gimple_call_arg_flags

2014-06-10 Thread Richard Biener

On Tue, Jun 10, 2014 at 8:47 AM, Thomas Schwinge
 wrote:
> Hi!
>
> On Tue, 3 Jun 2014 11:55:44 +0200, I wrote:
>> Ping -- OK to commit to trunk?
>
> Even though several of those who I'd consider regular GCC developers do
> agree with this patch (see also ), that is
> not enough "consensus" to consider this patch approved, right?  Wouldn't
> it be a good idea for GCC to move to a more "liberal" policy in this
> regard?  (That seems to work nicely for glibc.)

Ok.

Thanks,
Richard.

>
> Ping.
>
>
>> On Wed, 28 May 2014 23:55:31 +0200, Jan Hubicka  wrote:
>> > > On Mon, 26 May 2014 02:16:35 -0700, Andrew Pinski  
>> > > wrote:
>> > > > On Mon, May 26, 2014 at 1:59 AM, Dominique Dhumieres 
>> > > >  wrote:
>> > > > > r210901 breaks bootstrap on targets not supporting strnlen, e.g., 
>> > > > > darwin10.
>> > > > >
>> > > > > ../../_clean/gcc/lto-cgraph.c:976:68: error: 'strnlen' was not 
>> > > > > declared in this scope
>> > >
>> > > I'm seeing the same on MinGW, which also doesn't have strnlen (which is a
>> > > GNU extension).
>> > >
>> > > > strnlen should be declared in include/libiberty.h if there is no
>> > > > declaration as libiberty support is already there.  That should be a
>> > > > simple fix.
>> > >
>> > > Like this?
>> >
>> > This looks good to me (thoguh strnlen is posix).  I can not approve the 
>> > patch
>> > but I would preffer it over just hand implementing strnlen there (that is 
>> > easy,
>> > too)
>>
>> Patch is also considered good by testers in
>> .
>>
>> > > --- gcc/config.in
>> > > +++ gcc/config.in
>> > > [Regenerate.]
>> > > --- gcc/configure
>> > > +++ gcc/configure
>> > > [Regenerate.]
>> > > --- gcc/configure.ac
>> > > +++ gcc/configure.ac
>> > > @@ -1136,7 +1136,7 @@ CFLAGS="$CFLAGS -I${srcdir} -I${srcdir}/../include 
>> > > $GMPINC"
>> > >  saved_CXXFLAGS="$CXXFLAGS"
>> > >  CXXFLAGS="$CXXFLAGS -I${srcdir} -I${srcdir}/../include $GMPINC"
>> > >  gcc_AC_CHECK_DECLS(getenv atol asprintf sbrk abort atof getcwd getwd \
>> > > - strsignal strstr stpcpy strverscmp \
>> > > + stpcpy strnlen strsignal strstr strverscmp \
>> > >   errno snprintf vsnprintf vasprintf malloc realloc calloc \
>> > >   free basename getopt clock getpagesize ffs gcc_UNLOCKED_FUNCS, , ,[
>> > >  #include "ansidecl.h"
>> > > diff --git include/libiberty.h include/libiberty.h
>> > > index 7fd0703..56b8b43 100644
>> > > --- include/libiberty.h
>> > > +++ include/libiberty.h
>> > > @@ -636,6 +636,10 @@ extern int snprintf (char *, size_t, const char *, 
>> > > ...) ATTRIBUTE_PRINTF_3;
>> > >  extern int vsnprintf (char *, size_t, const char *, va_list) 
>> > > ATTRIBUTE_PRINTF(3,0);
>> > >  #endif
>> > >
>> > > +#if defined (HAVE_DECL_STRNLEN) && !HAVE_DECL_STRNLEN
>> > > +extern size_t strnlen (const char *, size_t);
>> > > +#endif
>> > > +
>> > >  #if defined(HAVE_DECL_STRVERSCMP) && !HAVE_DECL_STRVERSCMP
>> > >  /* Compare version strings.  */
>> > >  extern int strverscmp (const char *, const char *);
>>
>>
>> Grüße,
>>  Thomas

Re: [PATCH] Fix PR61375: cancel bswap optimization when value doesn't fit in a HOST_WIDE_INT

2014-06-10 Thread Richard Biener

On Tue, Jun 10, 2014 at 4:30 AM, Thomas Preud'homme
 wrote:
> When analyzing a bitwise AND with a constant as part of a bitwise OR,
> the bswap pass stores the constant in a int64_t variable without checking
> if it fits. As a result, we get ICE when the constant is an __int128 value.
> This affects GCC trunk but also GCC 4.9 and 4.8 (and possibly earlier
> version as well).
>
>
> ChangeLog are changed as follows:
>
> *** gcc/ChangeLog ***
>
> 2014-06-05  Thomas Preud'homme  
>
> PR tree-optimization/61375
> * tree-ssa-math-opts.c (init_symbolic_number): Cancel optimization if
> symbolic number cannot be represented in an unsigned HOST_WIDE_INT.
> (find_bswap_or_nop_1): Likewise.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2014-06-05  Thomas Preud'homme  
>
> PR tree-optimization/61375
> * gcc.c-torture/execute/pr61375-1.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61375.c 
> b/gcc/testsuite/gcc.c-torture/execute/pr61375.c
> new file mode 100644
> index 000..58df57a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr61375.c
> @@ -0,0 +1,34 @@
> +#ifdef __UINT64_TYPE__
> +typedef __UINT64_TYPE__ uint64_t;
> +#else
> +typedef unsigned long long uint64_t;
> +#endif
> +
> +#ifndef __SIZEOF_INT128__
> +#define __int128 long long
> +#endif
> +
> +/* Some version of bswap optimization would ICE when analyzing a mask 
> constant
> +   too big for an HOST_WIDE_INT (PR210931).  */
> +
> +__attribute__ ((noinline, noclone)) uint64_t
> +uint128_central_bitsi_ior (unsigned __int128 in1, uint64_t in2)
> +{
> +  __int128 mask = (__int128)0x << 56;
> +  return ((in1 & mask) >> 56) | in2;
> +}
> +
> +int main(int argc)
> +{
> +  __int128 in = 1;
> +#ifdef __SIZEOF_INT128__
> +  in <<= 64;
> +#endif
> +  if (sizeof (uint64_t) * __CHAR_BIT__ != 64)
> +return 0;
> +  if (sizeof (unsigned __int128) * __CHAR_BIT__ != 128)
> +return 0;
> +  if (uint128_central_bitsi_ior (in, 2) != 0x102)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
> index 658b341..95b3f25 100644
> --- a/gcc/tree-ssa-math-opts.c
> +++ b/gcc/tree-ssa-math-opts.c
> @@ -1717,6 +1717,8 @@ init_symbolic_number (struct symbolic_number *n, tree 
> src)
>if (n->size % BITS_PER_UNIT != 0)
>  return false;
>n->size /= BITS_PER_UNIT;
> +  if (n->size > (int)sizeof (unsigned HOST_WIDE_INT))

this should be sizeof (uint64_t) on the trunk and sizeof (unsigned
HOST_WIDEST_INT) on branches.

> +return false;
>n->range = n->size;
>n->n = CMPNOP;
>
> @@ -1883,6 +1885,8 @@ find_bswap_or_nop_1 (gimple stmt, struct 
> symbolic_number *n, int limit)
> type_size = TYPE_PRECISION (gimple_expr_type (stmt));
> if (type_size % BITS_PER_UNIT != 0)
>   return NULL_TREE;
> +   if (type_size > (int)sizeof (unsigned HOST_WIDE_INT) * 8)
> + return NULL_TREE;

Likewise.

> if (type_size / BITS_PER_UNIT < (int)(sizeof (int64_t)))
>   {
>
> Is this OK for trunk?

Ok for trunk with using uint64_t.

 What about backports for 4.8 and 4.9? Would a
> reworked patch for these versions be accepted? The change would
> be trivial: the code in init_symbolic_number now was moved from
> some other place.

Backports are welcome - please post a patch.

Thanks,
Richard.

> Best regards,
>
> Thomas
>
>

Re: [PATCH][AArch64] Add a big-endian lane flip at expand-time in saturating math patterns

2014-06-10 Thread Kyrill Tkachov



On 10/06/14 09:53, Kyrill Tkachov wrote:

Hi all,
On some of the saturating math expanders we need to perform a lane flip
on big-endian when expanding to RTL so that we keep consistent with
GCCs' view of lane numbering.
During assembly emission the pattern will perform another lane flip to
translate from GCCs' numbering to the architectural lane number.

To do this a few of the patterns were renamed to *_internal and given an
expander that will perform that first lane flip while the existing
expanders get a lane flip added to them.

The tests for these patterns will come soon in a separate patch.

With this patch, when the user uses something like vqdmlal_laneq_s16 (a,
b, c, 0) from arm_neon.h in big endian the resulting instruction will
access lane 0 of c now, whereas before it would access lane 7.

Tested and bootstrapped aarch64-none-linux-gnu and aarch64_be-none-elf.


I should clarify that on aarch64_be-none-elf it was just tested, not 
bootstrapped ;)


Kyrill



Ok for trunk?

Thanks,
Kyrill

2014-06-10  Kyrylo Tkachov  

  * config/aarch64/aarch64-simd.md (aarch64_sqdmulh_lane):
  New expander.
  (aarch64_sqrdmulh_lane): Likewise.
  (aarch64_sqdmulh_lane): Rename to...
  (aarch64_sqdmulh_lane_interna): ...this.
  (aarch64_sqdmulh_laneq): New expander.
  (aarch64_sqrdmulh_laneq): Likewise.
  (aarch64_sqdmulh_laneq): Rename to...
  (aarch64_sqdmulh_laneq_internal): ...this.
  (aarch64_sqdmulh_lane): New expander.
  (aarch64_sqrdmulh_lane): Likewise.
  (aarch64_sqdmulh_lane): Rename to...
  (aarch64_sqdmulh_lane_internal): ...this.
  (aarch64_sqdmlal_lane): Add lane flip for big-endian.
  (aarch64_sqdmlal_laneq): Likewise.
  (aarch64_sqdmlsl_lane): Likewise.
  (aarch64_sqdmlsl_laneq): Likewise.
  (aarch64_sqdmlal2_lane): Likewise.
  (aarch64_sqdmlal2_laneq): Likewise.
  (aarch64_sqdmlsl2_lane): Likewise.
  (aarch64_sqdmlsl2_laneq): Likewise.
  (aarch64_sqdmull_lane): Likewise.
  (aarch64_sqdmull_laneq): Likewise.
  (aarch64_sqdmull2_lane): Likewise.
  (aarch64_sqdmull2_laneq): Likewise.

Re: Move DECL_SECTION_NAME into symtab

2014-06-10 Thread Richard Biener

On Mon, Jun 9, 2014 at 4:34 AM, Jan Hubicka  wrote:
> Hi,
> this patch follows the change to move DECL_COMDAT_GROUP by moving 
> DECL_SECTION_NAME
> into symtab nodes instead of keeping it in decl_with_vis. (I pla to proceed 
> with
> other symbol table related fields).
>
> It follows exactly same path as the previous patch. Notable change is adding
> of node removal into duplicate_decl in c-decl.c.
>
> Memory usage wise the patch is small win for non-WPA, at WPA we actually
> consume bit more memory (about 800K on Firefox).  We have more symtab nodes
> than declarations because of inline clonning.  This will be solved by fixing
> memory representation of symbol nodes (I plan to move rare items into on side
> hashtables).  With accessors API it should be easy.

What I wondered about for some time is why 'clones' need to use the
same structs as their origins.  They share some bits with their origin
and they apply some changes.  Thus I think it would be nice to change
the inheritance of a symtab entry to sth like

  symbol - cgraph-node-base - cgraph-node
 |  \
 |   cgraph-clone
 varpool-node-base - varpool-node
 \
  varpool-clone (do we have those?)

Richard.

> Bootstrapped/regtested x86_64-linux, slightly earlier version of patch was
> tested also at PPC (linux/AIX)
>
> Honza
>
> * symtab.c (dump_symtab_base): Update dumping.
> (symtab_make_decl_local): Clear only DECL_COMDAT.
> * tree-vect-data-refs.c (Check that variable is static before
> tampering with sections.
> * cgraphclones.c (duplicate_thunk_for_node): Do not clear section 
> name.
> (cgraph_create_virtual_clone): Likewise.
> * tree.c (decl_comdat_group, decl_comdat_group_id): Constify argument.
> (decl_section_name, set_decl_section_name): New accessors.
> (find_decls_types_r): Do not walk section name
> * tree.h (DECL_SECTION_NAME): Implement using
> decl_section_name.
> (decl_comdat_group, decl_comdat_group_id): Constify.
> (decl_section_name, set_decl_section_name): Update.
> * varpool.c (varpool_finalize_named_section_flags): Use
> get_section.
> * cgraph.c (cgraph_add_thunk): Reset node instead of rebuilding.
> (cgraph_make_node_local_1): Clear section and comdat group.
> * cgraph.h (set_comdat_group): Sanity check.
> (get_section, set_section): New.
> * ipa-comdats.c (ipa_comdats): Use get_section.
> * ipa.c (ipa_discover_readonly_nonaddressable_var): Likewise.
> * lto-streamer-out.c: Do not follow section names.
> * c-family/c-common.c (handle_section_attribute):
> Update.
> * lto-cgraph.c (lto_output_node): Output section.
> (lto_output_varpool_node): Likewise.
> (read_comdat_group): Rename to ...
> (read_identifier): ... this one.
> (read_string_cst): New function.
> (input_node, input_varpool_node): Input section names.
> * tree-emutls.c (get_emutls_init_templ_addr): Update.
> (new_emutls_decl): Update.
> (secname_for_decl): Check section names only of static
> vars.
> * config/mep/mep.c (mep_unique_section): Use set_decl_section_name.
> * config/i386/winnt.c (i386_pe_unique_section): Likewise.
> * config/i386/i386.c (x86_64_elf_unique_section): Likewise.
> * config/c6x/c6x.c (c6x_elf_unique_section): Likewise.
> * config/rs6000/rs6000.c (rs6000_xcoff_unique_section): Likewise.
> * config/mcore/mcore.c (mcore_unique_section): Likewise.
> * config/mips/mips.c (mips16_build_function_stub): Likewise.
> * config/v850/v850.c (v850_insert_attributes): Likewise.
> * config/h8300/h8300.c: (h8300_handle_eightbit_data_attribute):
> Likewise.
> (h8300_handle_tiny_data_attribute): Likewise.
> * config/bfin/bfin.c (bfin_handle_l1_text_attribute): Likewise.
> (bfin_handle_l2_attribute): Likewise.
>
> * lto.c (mentions_vars_p_decl_with_vis, compare_tree_sccs_1,
> lto_fixup_prevailing_decls): Skip section names.
>
> * go/go-gcc.cc (global_variable_set_init): Use
> set_decl_section_name.
> * tree-streamer-in.c (lto_input_ts_decl_with_vis_tree_pointers): Do 
> not read section name.
>
> * gcc-interface/utils.c (process_attributes): Use it.
>
> * c-decl.c (merge_decls): Use set_decl_section_name.
> (duplicate_decls): Remove node if it exists.
>
> * class.c (build_utf8_ref): Use set_decl_section_name.
> (emit_register_classes_in_jcr_section): Likewise.
> (emit_register_classes_in_jcr_section): Likewise.
>
> * method.c (use_thunk): Use set_decl_section_name.
> * optimize.c (maybe_clone_body): Use set_decl_section_name.
> * decl.c (

RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap

2014-06-10 Thread Thomas Preud'homme

> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme

> 
> Is this OK for trunk? Does this bug qualify for a backport patch to
> 4.8 and 4.9 branches?

I forgot to mention that this was tested via bootstrap on
x86_64-linux-gnu target, the testsuite then showing no
regressions and the 3 tests added now passing.

Best regards,

Thomas

[PATCH][AArch64] Add a big-endian lane flip at expand-time in saturating math patterns

2014-06-10 Thread Kyrill Tkachov


Hi all,
On some of the saturating math expanders we need to perform a lane flip 
on big-endian when expanding to RTL so that we keep consistent with 
GCCs' view of lane numbering.
During assembly emission the pattern will perform another lane flip to 
translate from GCCs' numbering to the architectural lane number.


To do this a few of the patterns were renamed to *_internal and given an 
expander that will perform that first lane flip while the existing 
expanders get a lane flip added to them.


The tests for these patterns will come soon in a separate patch.

With this patch, when the user uses something like vqdmlal_laneq_s16 (a, 
b, c, 0) from arm_neon.h in big endian the resulting instruction will 
access lane 0 of c now, whereas before it would access lane 7.


Tested and bootstrapped aarch64-none-linux-gnu and aarch64_be-none-elf.

Ok for trunk?

Thanks,
Kyrill

2014-06-10  Kyrylo Tkachov  

* config/aarch64/aarch64-simd.md (aarch64_sqdmulh_lane):
New expander.
(aarch64_sqrdmulh_lane): Likewise.
(aarch64_sqdmulh_lane): Rename to...
(aarch64_sqdmulh_lane_interna): ...this.
(aarch64_sqdmulh_laneq): New expander.
(aarch64_sqrdmulh_laneq): Likewise.
(aarch64_sqdmulh_laneq): Rename to...
(aarch64_sqdmulh_laneq_internal): ...this.
(aarch64_sqdmulh_lane): New expander.
(aarch64_sqrdmulh_lane): Likewise.
(aarch64_sqdmulh_lane): Rename to...
(aarch64_sqdmulh_lane_internal): ...this.
(aarch64_sqdmlal_lane): Add lane flip for big-endian.
(aarch64_sqdmlal_laneq): Likewise.
(aarch64_sqdmlsl_lane): Likewise.
(aarch64_sqdmlsl_laneq): Likewise.
(aarch64_sqdmlal2_lane): Likewise.
(aarch64_sqdmlal2_laneq): Likewise.
(aarch64_sqdmlsl2_lane): Likewise.
(aarch64_sqdmlsl2_laneq): Likewise.
(aarch64_sqdmull_lane): Likewise.
(aarch64_sqdmull_laneq): Likewise.
(aarch64_sqdmull2_lane): Likewise.
(aarch64_sqdmull2_laneq): Likewise.commit 18ed07903bb21e7dea185a1618a130cd88ed9de7
Author: Kyrylo Tkachov 
Date:   Tue Jun 3 15:27:09 2014 +0100

[AArch64] Saturating math lane fixes

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 108bc8d..fc028f5 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2650,7 +2650,41 @@
 
 ;; sqdmulh_lane
 
-(define_insn "aarch64_sqdmulh_lane"
+(define_expand "aarch64_sqdmulh_lane"
+  [(match_operand:VDQHS 0 "register_operand" "")
+   (match_operand:VDQHS 1 "register_operand" "")
+   (match_operand: 2 "register_operand" "")
+   (match_operand:SI 3 "immediate_operand" "")]
+  "TARGET_SIMD"
+  {
+ aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode));
+ operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3])));
+ emit_insn (gen_aarch64_sqdmulh_lane_internal (operands[0],
+ operands[1],
+ operands[2],
+ operands[3]));
+ DONE;
+  }
+)
+
+(define_expand "aarch64_sqrdmulh_lane"
+  [(match_operand:VDQHS 0 "register_operand" "")
+   (match_operand:VDQHS 1 "register_operand" "")
+   (match_operand: 2 "register_operand" "")
+   (match_operand:SI 3 "immediate_operand" "")]
+  "TARGET_SIMD"
+  {
+ aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode));
+ operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3])));
+ emit_insn (gen_aarch64_sqrdmulh_lane_internal (operands[0],
+  operands[1],
+  operands[2],
+  operands[3]));
+ DONE;
+  }
+)
+
+(define_insn "aarch64_sqdmulh_lane_internal"
   [(set (match_operand:VDQHS 0 "register_operand" "=w")
 (unspec:VDQHS
 	  [(match_operand:VDQHS 1 "register_operand" "w")
@@ -2666,7 +2700,41 @@
   [(set_attr "type" "neon_sat_mul__scalar")]
 )
 
-(define_insn "aarch64_sqdmulh_laneq"
+(define_expand "aarch64_sqdmulh_laneq"
+  [(match_operand:VDQHS 0 "register_operand" "")
+   (match_operand:VDQHS 1 "register_operand" "")
+   (match_operand: 2 "register_operand" "")
+   (match_operand:SI 3 "immediate_operand" "")]
+  "TARGET_SIMD"
+  {
+ aarch64_simd_lane_bounds (operands[3], 0, GET_MODE_NUNITS (mode));
+ operands[3] = GEN_INT (ENDIAN_LANE_N (mode, INTVAL (operands[3])));
+ emit_insn (gen_aarch64_sqdmulh_laneq_internal (operands[0],
+  operands[1],
+  operands[2],
+  operands[3]));
+ DONE;
+   }
+)
+
+(define_expand "aarch64_sqrdmulh_laneq"
+  [(match_operand:VDQHS 0 "register_operand" "")
+   (match_operand:VDQHS 1 "register_operand" "")
+   (match_operand: 2 "register_operand" "")
+   (match_operand:SI 3

Re: ipa-visibility TLC 2/n

2014-06-10 Thread Richard Biener

On Sun, Jun 8, 2014 at 6:44 PM, Jan Hubicka  wrote:
>> Honza,
>>
>> I finally was able to bootstrap GCC on AIX yesterday. The bootstrap
>> works, but the testsuite results are in extremely bad shape: 250
>> failures in G++ testsuite and 470 failures in libstdc++ testsuite.
>
> David, this is first of the AIX fixes.
> It makes aliases to be output after thunks so they are not silently 
> miscompiled
> by the assembler.  I checked and the other ordering issue was actually
> Solaris linker not liking thunks before functions (that is a pity, because
> thunk just before function desn't need the tail jump: something I still want
> to get right, but incrementally).
>
> Bootstrapped/regtested on AIX, comitted.
>
> Honza
>
> * cgraphunit.c (assemble_thunks_and_aliases): Expand thunks before
> outputting aliases.
>
> Index: cgraphunit.c
> ===
> --- cgraphunit.c(revision 211106)
> +++ cgraphunit.c(working copy)
> @@ -1718,8 +1718,8 @@
> struct cgraph_node *thunk = e->caller;
>
> e = e->next_caller;
> +expand_thunk (thunk, true);
> assemble_thunks_and_aliases (thunk);

Please add a comment explaining the order of those calls.

> -expand_thunk (thunk, true);
>}
>  else
>e = e->next_caller;

RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap

2014-06-10 Thread Thomas Preud'homme

> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Wednesday, June 04, 2014 5:39 PM
> To: Thomas Preud'homme
> 

> 
> I'm failing to get why you possibly need two casts ... you should
> only need one, from the bswap/load result to the final type
> (zero-extended as you say - so the load type should simply be
> unsigned which it is already).

You are right indeed. I failed to realize that the problems I
encountered were caused by an initially wrong understanding of
the reason behind PR61306. All this code is not necessary.

> 
> So I think that the testcase in the patch is fixed already by
> doing the n->type change (and a proper sign-extension detection).
> 
> Can you please split that part out?

Doing so I realize the patch was incomplete. Sign extension can
be triggered in two distinct place in the code (right shift and cast)
that can both lead to incorrect code being generated. With some
efforts I managed to create two testcases that work both on
GCC trunk but also GCC 4.9 and 4.8.

ChangeLog entries are:

*** gcc/ChangeLog ***

2014-06-05  Thomas Preud'homme  

PR tree-optimization/61306
* tree-ssa-math-opts.c (struct symbolic_number): Store type of
expression instead of its size.
(do_shift_rotate): Adapt to change in struct symbolic_number. Return
false to prevent optimization when the result is unpredictable due to
arithmetic right shift of signed type with highest byte is set.
(verify_symbolic_number_p): Adapt to change in struct symbolic_number.
(init_symbolic_number): Likewise.
(find_bswap_or_nop_1): Likewise. Return NULL to prevent optimization
when the result is unpredictable due to sign extension.

*** gcc/testsuite/ChangeLog ***

2014-06-05  Thomas Preud'homme  

* gcc.c-torture/execute/pr61306-1.c: New test.
* gcc.c-torture/execute/pr61306-2.c: Likewise.
* gcc.c-torture/execute/pr61306-3.c: Likewise.


diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c
new file mode 100644
index 000..f6e8ff3
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c
@@ -0,0 +1,39 @@
+#ifdef __INT32_TYPE__
+typedef __INT32_TYPE__ int32_t;
+#else
+typedef int int32_t;
+#endif
+
+#ifdef __UINT32_TYPE__
+typedef __UINT32_TYPE__ uint32_t;
+#else
+typedef unsigned uint32_t;
+#endif
+
+#define __fake_const_swab32(x) ((uint32_t)(  \
+(((uint32_t)(x) & (uint32_t)0x00ffUL) << 24) |\
+(((uint32_t)(x) & (uint32_t)0xff00UL) <<  8) |\
+(((uint32_t)(x) & (uint32_t)0x00ffUL) >>  8) |\
+(( (int32_t)(x) &  (int32_t)0xff00UL) >> 24)))
+
+/* Previous version of bswap optimization failed to consider sign extension
+   and as a result would replace an expression *not* doing a bswap by a
+   bswap.  */
+
+__attribute__ ((noinline, noclone)) uint32_t
+fake_bswap32 (uint32_t in)
+{
+  return __fake_const_swab32 (in);
+}
+
+int
+main(void)
+{
+  if (sizeof (int32_t) * __CHAR_BIT__ != 32)
+return 0;
+  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
+return 0;
+  if (fake_bswap32 (0x87654321) != 0xff87)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c 
b/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c
new file mode 100644
index 000..6cbbd19
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c
@@ -0,0 +1,40 @@
+#ifdef __INT16_TYPE__
+typedef __INT16_TYPE__ int16_t;
+#else
+typedef short int16_t;
+#endif
+
+#ifdef __UINT32_TYPE__
+typedef __UINT32_TYPE__ uint32_t;
+#else
+typedef unsigned uint32_t;
+#endif
+
+#define __fake_const_swab32(x) ((uint32_t)(  \
+(((uint32_t) (x) & (uint32_t)0x00ffUL) << 24) |   \
+(((uint32_t)(int16_t)(x) & (uint32_t)0x0000UL) <<  8) |   \
+(((uint32_t) (x) & (uint32_t)0x00ffUL) >>  8) |   \
+(((uint32_t) (x) & (uint32_t)0xff00UL) >> 24)))
+
+
+/* Previous version of bswap optimization failed to consider sign extension
+   and as a result would replace an expression *not* doing a bswap by a
+   bswap.  */
+
+__attribute__ ((noinline, noclone)) uint32_t
+fake_bswap32 (uint32_t in)
+{
+  return __fake_const_swab32 (in);
+}
+
+int
+main(void)
+{
+  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
+return 0;
+  if (sizeof (int16_t) * __CHAR_BIT__ != 16)
+return 0;
+  if (fake_bswap32 (0x81828384) != 0xff838281)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c 
b/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c
new file mode 100644
index 000..6086e27
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c
@@ -0,0 +1,13 @@
+short a = -1;
+int b;
+char c;
+
+int
+main ()
+{
+  c = a;
+  b = a | c;
+  if (b != -1)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math

Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-10 Thread Richard Biener

On Mon, 9 Jun 2014, Eric Botcazou wrote:

> > I wonder if we want toupdate the frontends to set addressable flag with new
> > sense or we want symtab to simple set addressable on all global symbols or
> > invent a new flag.
> > I would preffer the first case - it seems to make most sense to me.
> 
> I think you need to explain why this change is desirable/necessary for LTO, 
> this would be a good starting point.  As for setting TREE_ADDRESSABLE on 
> every 
> single global symbol in every single front-end, why not, but this seems more 
> complicated than treating global symbols as so by default (in non-LTO mode).

Note that I'm happy to revert the change.

I am hesitant to any approach that overloads TREE_ADDRESSABLE even more.
It already is used for two (slightly) different things - first the
"old" meaning that the address of the symbol is needed, second, that
the symbol is aliased by pointers.  Those are of course related, but
as you see they are not 100% equivalent.

As I already added DECL_NONALIASED (for VAR_DECLs) to "fix" that
coverage counter issue (those are TREE_STATIC but they have their
address taken - still we know that no pointers alias the accesses),
we can as well rely on that flag - but then we should set it whenever
a TU-local decl does not have its address taken (!TREE_ADDRESSABLE).

So it does impose some redundancy and possibility of things to go
out-of-sync.

Btw, the C frontend doesn't call varpool_finalize_decl for externals,
so setting TREE_ADDRESSABLE there doesn't work unfortunately.  It
works with doing it in varpool_node_for_decl though.

Patch doing both attached (we may choose to do this in different
places for DECL_EXTERNALs vs. TREE_PUBLIC && TREE_STATICs?).
At LTO input time we directly call symtab_register_node which would
side-step this thus an IPA pass could drop TREE_ADDRESSABLE from
those decls.

Sofar untested.

Comments?

Thanks,
Richard.

2014-06-10  Richard Biener  

* tree.h (TREE_ADDRESSABLE): Clarify.
* varpool.c (varpool_node_for_decl): Mark public or external
variables as TREE_ADDRESSABLE.
    * cgraphunit.c (varpool_finalize_decl): Likewise.

    * gcc.dg/torture/20140610-1.c: New testcase.
* gcc.dg/torture/20140610-2.c: Likewise.

Index: gcc/tree.h
===
--- gcc/tree.h  (revision 211398)
+++ gcc/tree.h  (working copy)
@@ -571,8 +571,9 @@ extern void omp_clause_range_check_faile

 /* Define many boolean fields that all tree nodes have.  */

-/* In VAR_DECL, PARM_DECL and RESULT_DECL nodes, nonzero means address
-   of this is needed.  So it cannot be in a register.
+/* In VAR_DECL, PARM_DECL and RESULT_DECL nodes, nonzero means the address
+   of this is needed.  So it cannot be in a register.  If not set, then
+   the address of this cannot be used to initialize an aliasing pointer.
In a FUNCTION_DECL it has no meaning.
In LABEL_DECL nodes, it means a goto for this label has been seen
from a place outside all binding contours that restore stack levels.
Index: gcc/varpool.c
===
--- gcc/varpool.c   (revision 211398)
+++ gcc/varpool.c   (working copy)
@@ -149,6 +149,8 @@ varpool_node_for_decl (tree decl)
   if (node)
 return node;

+  if (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl))
+TREE_ADDRESSABLE (decl) = 1;
   node = varpool_create_empty_node ();
   node->decl = decl;
   symtab_register_node (node);
Index: gcc/cgraphunit.c
===
--- gcc/cgraphunit.c(revision 211398)
+++ gcc/cgraphunit.c(working copy)
@@ -818,6 +818,11 @@ varpool_finalize_decl (tree decl)

   gcc_assert (TREE_STATIC (decl) || DECL_EXTERNAL (decl));

+  /* Mark all symbols visible to other TUs as possibly having their
+ address taken.  */
+  if (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl))
+TREE_ADDRESSABLE (decl) = 1;
+
   if (node->definition)
 return;
   notice_global_symbol (decl);
Index: gcc/testsuite/gcc.dg/torture/20140610-1.c
===
--- gcc/testsuite/gcc.dg/torture/20140610-1.c   (revision 0)
+++ gcc/testsuite/gcc.dg/torture/20140610-1.c   (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-additional-sources "20140610-2.c" } */
+
+extern int a;
+extern int *p;
+
+void test (void);
+
+int main ()
+{
+  *p = 0;
+  a = 1;
+  test ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/torture/20140610-2.c
===
--- gcc/testsuite/gcc.dg/torture/20140610-2.c   (revision 0)
+++ gcc/testsuite/gcc.dg/torture/20140610-2.c   (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+
+extern void abort (void);
+
+int a;
+int *p = &a;
+
+void test (void)
+{
+  if (a != 1)
+abort ();
+}

[PATCH, PR61446] Fix mode for register copy in REE pass

2014-06-10 Thread Ilya Enkovich

Hi,

This patch fixes PR61446.  The problem appears when we insert value copies 
after transformations. We use the widest extension mode met in a chain, but it 
may be wider than original destination register size.  This patch checks it and 
use smaller mode if required.

Bootstrapped and tested on linux-x86_64.

Does it look OK?

Thanks,
Ilya
--
2014-06-09  Ilya Enkovich  

PR 61446
* ree.c (find_and_remove_re): Narrow mode for register copy
if required.

diff --git a/gcc/ChangeLog.pr61446 b/gcc/ChangeLog.pr61446
new file mode 100644
index 000..b9e2148
--- /dev/null
+++ b/gcc/ChangeLog.pr61446
@@ -0,0 +1,5 @@
+2014-06-09  Ilya Enkovich  
+
+   PR 61446
+   * ree.c (find_and_remove_re): Narrow mode for register copy
+   if required.
diff --git a/gcc/ree.c b/gcc/ree.c
index ade413e..6d34764 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -1088,14 +1088,24 @@ find_and_remove_re (void)
   /* Use the mode of the destination of the defining insn
 for the mode of the copy.  This is necessary if the
 defining insn was used to eliminate a second extension
-that was wider than the first.  */
+that was wider than the first.  Truncate mode if it is
+too wide for destination reg.  */
   rtx sub_rtx = *get_sub_rtx (def_insn);
   rtx pat = PATTERN (curr_insn);
-  rtx new_dst = gen_rtx_REG (GET_MODE (SET_DEST (sub_rtx)),
-REGNO (XEXP (SET_SRC (pat), 0)));
-  rtx new_src = gen_rtx_REG (GET_MODE (SET_DEST (sub_rtx)),
-REGNO (SET_DEST (pat)));
-  rtx set = gen_rtx_SET (VOIDmode, new_dst, new_src);
+  unsigned int regno = REGNO (XEXP (SET_SRC (pat), 0));
+  enum machine_mode mode = GET_MODE (SET_DEST (sub_rtx));
+  rtx new_dst, new_src, set;
+
+  if (HARD_REGNO_NREGS (regno, mode) != 1)
+   {
+ mode = GET_CLASS_NARROWEST_MODE (GET_MODE_CLASS (mode));
+ while (HARD_REGNO_NREGS (regno, GET_MODE_WIDER_MODE (mode)) == 1)
+   mode = GET_MODE_WIDER_MODE (mode);
+   }
+
+  new_dst = gen_rtx_REG (mode, REGNO (XEXP (SET_SRC (pat), 0)));
+  new_src = gen_rtx_REG (mode, REGNO (SET_DEST (pat)));
+  set = gen_rtx_SET (VOIDmode, new_dst, new_src);
   emit_insn_after (set, def_insn);
 }

Re: [PATCH] Add patch for debugging compiler ICEs

2014-06-10 Thread Yury Gribov


> This is the resurrected patch with added GCC version information
> into generated repro file.

I wonder whether we should also set ADDR_NO_RANDOMIZE and 
ADDR_COMPAT_LAYOUT in addition to -frandom-seed (according to 
https://gcc.gnu.org/wiki/Randomization).


-Y

RE: [PATCH, Fortran] PR61234: -Wuse-no-only

2014-06-10 Thread VandeVondele Joost

Attached the reworked patch. The only change is that the warning is now not 
part of -Wall, given the consensus on the list.

The patch has been bootstrapped and regtested on x86_64-unknown-linux-gnu. If 
OK, please apply to trunk.
gcc/fortran/ChangeLog:

2014-06-04 Joost VandeVondele  

PR fortran/61234
* lang.opt (Wuse-no-only): New flag.
* gfortran.h (gfc_option_t): Add it.
* invoke.texi: Document it.
* module.c (gfc_use_module): Warn if needed.
* options.c (gfc_init_options,gfc_handle_option): Init accordingly.

gcc/testsuite/ChangeLog:

2014-06-04 Joost VandeVondele  

* gfortran.dg/use_no_only_1.f90: New test.

Index: gcc/fortran/options.c
===
--- gcc/fortran/options.c	(revision 211094)
+++ gcc/fortran/options.c	(working copy)
@@ -105,6 +105,7 @@ gfc_init_options (unsigned int decoded_o
   gfc_option.warn_tabs = 1;
   gfc_option.warn_underflow = 1;
   gfc_option.warn_intrinsic_shadow = 0;
+  gfc_option.warn_use_no_only = 0;
   gfc_option.warn_intrinsics_std = 0;
   gfc_option.warn_align_commons = 1;
   gfc_option.warn_real_q_constant = 0;
@@ -730,6 +731,10 @@ gfc_handle_option (size_t scode, const c
   gfc_option.warn_intrinsic_shadow = value;
   break;
 
+case OPT_Wuse_no_only:
+  gfc_option.warn_use_no_only = value;
+  break;
+
 case OPT_Walign_commons:
   gfc_option.warn_align_commons = value;
   break;
Index: gcc/fortran/gfortran.h
===
--- gcc/fortran/gfortran.h	(revision 211022)
+++ gcc/fortran/gfortran.h	(working copy)
@@ -2321,6 +2321,7 @@ typedef struct
   int warn_tabs;
   int warn_underflow;
   int warn_intrinsic_shadow;
+  int warn_use_no_only;
   int warn_intrinsics_std;
   int warn_character_truncation;
   int warn_array_temp;
Index: gcc/fortran/lang.opt
===
--- gcc/fortran/lang.opt	(revision 211022)
+++ gcc/fortran/lang.opt	(working copy)
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard
 
+Wuse-no-only
+Fortran Warning
+Warn about USE statements that have no only qualifier
+
 Wopenmp-simd
 Fortran
 ; Documented in C
Index: gcc/fortran/invoke.texi
===
--- gcc/fortran/invoke.texi	(revision 211022)
+++ gcc/fortran/invoke.texi	(working copy)
@@ -142,7 +142,7 @@ and warnings}.
 @gccoptlist{-Waliasing -Wall -Wampersand -Warray-bounds
 -Wc-binding-type -Wcharacter-truncation @gol
 -Wconversion -Wfunction-elimination -Wimplicit-interface @gol
--Wimplicit-procedure -Wintrinsic-shadow -Wintrinsics-std @gol
+-Wimplicit-procedure -Wintrinsic-shadow -Wuse-no-only -Wintrinsics-std @gol
 -Wline-truncation -Wno-align-commons -Wno-tabs -Wreal-q-constant @gol
 -Wsurprising -Wunderflow -Wunused-parameter -Wrealloc-lhs -Wrealloc-lhs-all @gol
 -Wtarget-lifetime -fmax-errors=@var{n} -fsyntax-only -pedantic -pedantic-errors
@@ -896,6 +896,13 @@ intrinsic; in this case, an explicit int
 @code{INTRINSIC} declaration might be needed to get calls later resolved to
 the desired intrinsic/procedure.  This option is implied by @option{-Wall}.
 
+@item -Wuse-no-only
+@opindex @code{Wuse-no-only}
+@cindex warnings, use statements
+@cindex intrinsic
+Warn if a use statement has no only qualifier and thus implicitly imports
+all public entities of the used module.
+
 @item -Wunused-dummy-argument
 @opindex @code{Wunused-dummy-argument}
 @cindex warnings, unused dummy argument
Index: gcc/fortran/module.c
===
--- gcc/fortran/module.c	(revision 211022)
+++ gcc/fortran/module.c	(working copy)
@@ -6398,6 +6398,9 @@ gfc_use_module (gfc_use_list *module)
   gfc_rename_list = module->rename;
   only_flag = module->only_flag;
 
+  if (!only_flag && gfc_option.warn_use_no_only) 
+gfc_warning_now ("USE statement at %C has no ONLY qualifier");
+
   filename = XALLOCAVEC (char, strlen (module_name) + strlen (MODULE_EXTENSION)
 			   + 1);
   strcpy (filename, module_name);
Index: gcc/testsuite/gfortran.dg/use_no_only_1.f90
===
--- gcc/testsuite/gfortran.dg/use_no_only_1.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/use_no_only_1.f90	(revision 0)
@@ -0,0 +1,44 @@
+! PR fortran/61234 Warn for use-stmt without explicit only-list.
+! { dg-do compile }
+! { dg-options "-Wuse-no-only" }
+MODULE foo
+  INTEGER :: bar
+END MODULE
+
+MODULE testmod
+  USE foo ! { dg-warning "has no ONLY qualifier" }
+  IMPLICIT NONE
+CONTAINS
+  SUBROUTINE S1
+ USE foo ! { dg-warning "has no ONLY qualifier" }
+  END SUBROUTINE S1
+  SUBROUTINE S2
+ USE foo, ONLY: bar ! { dg-bogus "has no ONLY qualifier" }
+  END SUBROUTINE
+  SUBROUTINE S3
+ USE ISO_C_BINDING ! { dg-warning "has no ONLY qualifier" }
+  END SUBROUTINE

90 matches

Mail list logo