Re: [PATCH][i386]Fix PR 57756

2013-10-22 Thread Uros Bizjak
On Tue, Oct 22, 2013 at 11:05 PM, Sriraman Tallam  wrote:

> This simple patch fixes the -m32 -mno-sse bugs you reported. A few
> more places where I did not change references to global_options.
> Uros/Richard: Is this ok to commit?
>
> * config/i386/i386.c (ix86_option_override_internal):
> Change TARGET_SSE2 to TARGET_SSE2_P (opts->...)
> (ix86_valid_target_attribute_tree):
> Change TARGET_64BIT to TARGET_64BIT_P (opts->...)
> Change TARGET_SSE to TARGET_SSE_P (opts->...)

OK.

Thanks,
Uros.
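
For readers following the ChangeLog above, here is a minimal sketch of why a macro that takes the option flags as an argument differs from one that reads global state. All names ending in _sketch are hypothetical stand-ins and are not the real GCC structures or macros; the field behind "opts->..." is left unspecified in the mail and is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real GCC option structures differ.  */
struct opts_sketch { unsigned long isa_flags; };
#define ISA_SSE2_SKETCH (1UL << 0)

/* Global option state, playing the role of global_options.  */
static struct opts_sketch global_opts_sketch = { ISA_SSE2_SKETCH };

/* Global-reading form: always consults the global state.  */
#define TARGET_SSE2_SKETCH \
  ((global_opts_sketch.isa_flags & ISA_SSE2_SKETCH) != 0)

/* Parameterised form: consults whatever flags the caller passes.  */
#define TARGET_SSE2_P_SKETCH(flags) (((flags) & ISA_SSE2_SKETCH) != 0)

/* Reports whether SSE2 is enabled in OPTS, not in the global state.  */
static bool
sse2_enabled_in (const struct opts_sketch *opts)
{
  assert (opts != NULL);
  return TARGET_SSE2_P_SKETCH (opts->isa_flags);
}
```

With an option set that has SSE2 cleared, sse2_enabled_in returns false even though the global-reading macro still reports true. That kind of mismatch is what a function working on a non-global option set has to avoid.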


Re: [PING^2][PATCH][2 of 2] RTL expansion for zero sign extension elimination with VRP

2013-10-22 Thread Kugan

>>>   tem = (char) 255 + (char) 1;
>>>
>>> which has a value-range of [0,0] but clearly when computed in
>>> SImode the value-range is [256, 256].  That is, VRP computes
>>> value-ranges in the expression type, not in some arbitrary
>>> larger type.
>>>
>>> So what you'd have to do is take the value-ranges of the
>>> two operands of the plus and see whether the plus can overflow
>>> QImode when computed in SImode (for the example).
>>>
Ok, I will handle it as you have suggested here.

> Not sure if I understand what you are saying here.  As for the above
> case
> 
>>>   tem = (char) 255 + (char) 1;
> 
> tem is always of type 'char' in GIMPLE (even if later promoted
> via PROMOTE_MODE) the value-range is a 'char' value-range and thus
> never will exceed [CHAR_MIN, CHAR_MAX].  The only way you can
> use that directly is if you can rely on undefined behavior
> happening for signed overflow - but if you argue that way you
> can simply _always_ drop the (sext:SI (subreg:QI part and you
> do not need value ranges for this.  For unsigned operations
> for example [250, 254] + [8, 10] will simply wrap to [3, 7]
> (if I got the math correct) which is inside your [CHAR_MIN + 1,
> CHAR_MAX - 1] but if performed in SImode you can get 259 and
> thus clearly you cannot drop the (zext:SI (subreg:QI parts.
> The same applies to signed types if you do not want to rely
> on signed overflow being undefined of course.
> 

Thanks for the explanation. I now get it and I will rework the patch.

Thanks,
Kugan
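
The wrap-around discussed above can be checked with a small standalone helper. This is only an illustrative sketch, not part of the patch: the names sum_range and range_sketch are made up here, and the code assumes an 8-bit unsigned char.

```c
#include <assert.h>
#include <limits.h>

struct range_sketch { int lo, hi; };

/* Returns the range of a + b for a in [alo, ahi] and b in [blo, bhi].
   If WRAP8 is nonzero, each sum is first reduced to 8 bits, as an
   unsigned char computation would do (assumes CHAR_BIT == 8).  */
static struct range_sketch
sum_range (int alo, int ahi, int blo, int bhi, int wrap8)
{
  struct range_sketch r = { INT_MAX, INT_MIN };
  assert (alo <= ahi && blo <= bhi);
  for (int a = alo; a <= ahi; a++)
    for (int b = blo; b <= bhi; b++)
      {
        int s = a + b;
        if (wrap8)
          s = (unsigned char) s;   /* modulo-256 wrap */
        if (s < r.lo)
          r.lo = s;
        if (s > r.hi)
          r.hi = s;
      }
  return r;
}
```

Calling sum_range (250, 254, 8, 10, 1) yields [2, 8], while sum_range (250, 254, 8, 10, 0) yields [258, 264]. The exact 8-bit bounds differ slightly from the [3, 7] guessed in the quoted mail, but the conclusion is unchanged: the narrow range says nothing about the value computed in a wider mode.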


Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-10-22 Thread Jeff Law

On 10/16/13 15:49, Iyer, Balaji V wrote:

In ira.c:

+   /* We need a frame pointer for all Cilk Plus functions that use
+ Cilk keywords.  */
+   || (flag_enable_cilkplus && cfun->is_cilk_function)
Can you explain to me a bit more why you need a frame pointer?  I'm trying to
determine if it's best to leave this as-is or have this code detect a property
in the generated code for the function.  From a modularity standpoint it seems
pretty gross that we have to peek at this within IRA.



Cilk runtime functions change the stack pointer, so a frame pointer is
necessary.
Nevermind -- that seems to be the location where this is detected now. 
So this is fine.







In a couple places I saw this comment:
+  /* Cilk keywords currently need to replace some variables that
+     ordinary nested functions do not.  */
+  bool remap_var_for_cilk;
I didn't see anywhere that explained exactly why some variables that do not
ordinarily need replacing need replacing when cilk is enabled.  If it's in the
patch somewhere, just point me to it. If not, add documentation about why these
variables need remapping for cilk.



It is used in the cilk_outline function.
Thanks.  Presumably the comment "We don't want the private variables 
anymore" is the relevant code/comment?



Does anything actually ensure we don't have multiple syncs?




Well, _Cilk_sync expands to something like this:

if (!sync_occurred)
  __cilkrts_sync ();

So having multiple Cilk syncs does no harm; the then-branch of the
if-statement will simply not be taken.

OK.  Thanks.
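
A minimal sketch of the guarded expansion described above. The runtime call is replaced by a hypothetical counting stub, and the point at which the flag is set is an assumption of this sketch, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime entry point; it only counts
   how often it is called.  */
static int runtime_sync_calls_sketch;

static void
runtime_sync_sketch (void)
{
  runtime_sync_calls_sketch++;
}

/* Hypothetical per-function state holding the guard flag.  */
struct frame_sketch { bool sync_occurred; };

/* Guarded sync: the runtime is entered at most once per frame.
   Setting the flag here is an assumption made for this sketch.  */
static void
cilk_sync_sketch (struct frame_sketch *frame)
{
  assert (frame != NULL);
  if (!frame->sync_occurred)
    {
      runtime_sync_sketch ();
      frame->sync_occurred = true;
    }
}
```

Calling cilk_sync_sketch twice on the same frame enters the stub only once, which is the sense in which repeated syncs are harmless.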





What's the thinking behind parsing calls to cilk_spawn as a normal call if
there's an error?  Referring to this code in gimplify.c:
+   case CILK_SPAWN_STMT:
+ gcc_assert
+   (fn_contains_cilk_spawn_p (cfun)
+&& lang_hooks.cilkplus.cilk_detect_spawn_and_unwrap (expr_p));
+ if (!seen_error ())
+   {
+ ret = (enum gimplify_status)
+   lang_hooks.cilkplus.gimplify_cilk_spawn (expr_p, pre_p,
+post_p);
+ break;
+   }
+ /* If errors are seen, then just process it as a CALL_EXPR.
+ */
+



Well, if there is an error the compiler is not going to produce an executable.
So, I just let the compiler go as far as it can and catch all the other errors.
If the error is cilk related, we have already called them out on it. Adding
_Cilk_spawn specific routines would add additional complication.


I guess that's a reasonable fallback position in case of an error.


Meta-question, when we're not in cilk mode, should we be consuming the cilk
tokens?  I'm not familiar at all with our parser, so I'm not sure if we can
handle this gracefully.  Though I guess parsing the token and warning/error if
not in Cilk mode is probably the best course of action.



In the compiler, I couldn't make conditional tokens. When the parser hits a
_Cilk_spawn or _Cilk_sync token, it checks whether Cilk Plus is enabled and
complains if it is not. Now that I think about it in detail, I suppose this will
also block anyone who wants a variable named _Cilk_spawn or _Cilk_sync without
using -fcilkplus. But those names start with '_', so I guess it is not a normal
case.
Figured it was ugly at best to avoid consuming the cilk tokens when not 
in cilk mode.





Can you take a look at calls.c::special_function_p and determine if we need to
do something special for spawn here?



I will look into it and let you know.

Any word on this?

jeff



Re: [PATCH] Small *-alias dump fixes

2013-10-22 Thread Jeff Law

On 10/22/13 13:23, Jakub Jelinek wrote:

Hi!

I've noticed that some -fdump-tree-*-alias dumps look bad and contain stuff
like:
   :
   # # RANGE [0, 52]
   _1 = PHI <0(2), f2_12(3)>
   return _1;
or:
   :
   # PT = { D.1732 } (glob)
   # ALIGN = 32, MISALIGN = 0
   # p_1 = PHI <&MEM[(void *)&a + 128B](2), &MEM[(void *)&a + 512B](3)>
   PT = { D.1732 } (glob)
   # ALIGN = 32, MISALIGN = 0
   # p_3 = p_1 + 64;
Note that in both examples some lines get an extra "# " and some lack it
where they should have it.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2013-10-22  Jakub Jelinek  

* gimple-pretty-print.c (dump_ssaname_info): Always print "# " before
the info, not after it.
(dump_gimple_phi): Add COMMENT argument, if true, print "# " after
dump_ssaname_info call.
(pp_gimple_stmt_1): Adjust caller.
(dump_phi_nodes): Likewise.  Don't print "# " here.

OK.

Jeff



Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Jeff Law

On 09/17/13 02:18, Ilya Enkovich wrote:

Hi,

Here is a patch introducing new type and mode for bounds. It is a part of MPX 
ISA support patch (http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html).

Bootstrapped and tested on linux-x86_64. Is it OK for trunk?

Thanks,
Ilya
--

gcc/

2013-09-16  Ilya Enkovich  

* mode-classes.def (MODE_BOUND): New.
* tree.def (BOUND_TYPE): New.
* genmodes.c (complete_mode): Support MODE_BOUND.
(BOUND_MODE): New.
(make_bound_mode): New.
* machmode.h (BOUND_MODE_P): New.
* stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
(layout_type): Support BOUND_TYPE.
* tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
* tree.c (build_int_cst_wide): Support BOUND_TYPE.
(type_contains_placeholder_1): Likewise.
* tree.h (BOUND_TYPE_P): New.
* varasm.c (output_constant): Support BOUND_TYPE.
* doc/rtl.texi (MODE_BOUND): New.
Mostly OK.  Just a few minor things that should be fixed or at least 
clarified.







diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 1d62223..02b1214 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the 
@file{@var{machine}-modes.def}.
  @xref{Jump Patterns},
  also see @ref{Condition Code}.

+@findex MODE_BOUND
+@item MODE_BOUND
+Bound modes class.  Used to represent values of pointer bounds.
I can't help but feel more is needed here -- without going into the 
details of the MPX implementation we ought to say something about how 
these differ from the more normal integer modes.  Drawing from the brief 
discussion between Richard & myself earlier today should give some ideas 
on how to improve this.




I'd probably use MODE_POINTER_BOUNDS which is a bit more descriptive. 
We wouldn't want someone to (for example) think this stuff relates to 
array bounds.  Obviously a change to MODE_POINTER_BOUNDS would propagate 
into other places where you use "BOUND" without a "POINTER" 
qualification, such as "BOUND_MODE_P" which we'd change to 
POINTER_BOUNDS_MODE_P.



diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
index 7207ef7..c5ea215 100644
--- a/gcc/mode-classes.def
+++ b/gcc/mode-classes.def
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
   DEF_MODE_CLASS (MODE_RANDOM),       /* other */                        \
   DEF_MODE_CLASS (MODE_CC),           /* condition code in a register */ \
   DEF_MODE_CLASS (MODE_INT),          /* integer */                      \
+  DEF_MODE_CLASS (MODE_BOUND),        /* bounds */                       \
   DEF_MODE_CLASS (MODE_PARTIAL_INT),  /* integer with padding bits */    \
   DEF_MODE_CLASS (MODE_FRACT),        /* signed fractional number */     \
   DEF_MODE_CLASS (MODE_UFRACT),       /* unsigned fractional number */   \

Does genmodes do the right thing WRT MAX_INT_MODE and MIN_INT_MODE?

I'd be more comfortable if MODE_POINTER_BOUNDS wasn't sitting between 
MODE_INT and MODE_PARTIAL_INT.  I'm not aware of code that iterates over 
these things that would get confused, but ISTM putting 
MODE_POINTER_BOUNDS after MODE_PARTIAL_INT is marginally safer.
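
To illustrate the ordering concern, here is a hypothetical sketch. Neither enum is the real mode-class list, and no existing GCC code is claimed to do this range test; it only shows what kind of check would be confused by a class sitting between the two integer classes.

```c
#include <assert.h>
#include <stdbool.h>

/* Ordering as in the posted hunk: the bounds class sits between the
   two integer classes.  Names are hypothetical.  */
enum posted_order_sketch
{ P_RANDOM, P_CC, P_INT, P_BOUND, P_PARTIAL_INT, P_FRACT };

/* Ordering suggested in the review: bounds after partial int.  */
enum suggested_order_sketch
{ S_RANDOM, S_CC, S_INT, S_PARTIAL_INT, S_BOUND, S_FRACT };

/* A range test that assumes every class from INT to PARTIAL_INT is
   integer-like.  */
static bool
int_like_posted (enum posted_order_sketch c)
{
  return c >= P_INT && c <= P_PARTIAL_INT;
}

static bool
int_like_suggested (enum suggested_order_sketch c)
{
  return c >= S_INT && c <= S_PARTIAL_INT;
}

/* The bounds class is swept into the integer range only with the
   posted ordering.  */
static void
check_orderings_sketch (void)
{
  assert (int_like_posted (P_BOUND));
  assert (!int_like_suggested (S_BOUND));
}
```

With the suggested ordering such a range test would keep working unchanged, which is presumably why it is described as marginally safer.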





diff --git a/gcc/tree.c b/gcc/tree.c
index b469b97..bbbe16e 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -1197,6 +1197,7 @@ build_int_cst_wide (tree type, unsigned HOST_WIDE_INT 
low, HOST_WIDE_INT hi)

  case INTEGER_TYPE:
  case OFFSET_TYPE:
+case BOUND_TYPE:
if (TYPE_UNSIGNED (type))
{
  /* Cache 0..N */
So here you're effectively treating POINTER_BOUNDS_TYPE like an integer. 
 I'm guessing there's a number of flags that may not be relevant for 
your type and which you might want to repurpose (again, I haven't looked 
at the entire patchset).  If so, you want to be real careful here since 
you'll be looking at (for example) TYPE_UNSIGNED which may not have any 
real meaning for POINTER_BOUNDS_TYPE.


Overall, it seems fairly reasonable -- the biggest concern of mine is in 
the last comment.  Are you going to be repurposing various flag bits in 
the type?  If so, then we have to be more careful in code like above.



Jeff


[patch] Flatten tree-ssa.h

2013-10-22 Thread Andrew MacLeod
When moving all the prototypes out of tree-flow.h and into individual 
files, the "kitchen sink" #include list temporarily propagated its way 
into the new tree-ssa.h file.  tree-ssa.h is now in a position to 
contain only its own prototypes.


Its include list was:
#include "bitmap.h"
#include "gimple.h"
#include "gimple-ssa.h"
#include "cgraph.h"
#include "tree-cfg.h"
#include "tree-phinodes.h"
#include "ssa-iterators.h"
#include "tree-ssanames.h"
#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-dfa.h"

This patch flattens tree-ssa.h so that it no longer includes *any* of 
those include files.  The process, mostly automated:
1 - Takes all the #includes from tree-ssa.h and copies them immediately 
before #include "tree-ssa.h" in every .c file which includes it.
2 - Removes each of those include files (as well as tree-ssa.h) one at a 
time from the .c file (bottom up) and tries to compile it.
3 - Leaves out any whose removal causes no compilation failure.
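
The three steps above amount to a greedy pruning loop. The following is a rough sketch only: the actual tooling is not shown in this mail, and prune_includes, compiles_fn and the toy predicate are hypothetical names. The "does it still compile" check is abstracted as a callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Caller-supplied check: does the file still compile when only the
   includes whose keep[] entry is true are present?  */
typedef bool (*compiles_fn) (const bool *keep, size_t n);

/* Tries to drop each include in turn, last to first ("bottom up"),
   and keeps the drop whenever the file still compiles.  Returns the
   number of includes that remain.  */
static size_t
prune_includes (bool *keep, size_t n, compiles_fn compiles)
{
  size_t kept = 0;
  assert (keep != NULL && compiles != NULL);
  for (size_t i = n; i-- > 0; )
    {
      if (!keep[i])
        continue;
      keep[i] = false;          /* tentatively drop include i */
      if (!compiles (keep, n))
        keep[i] = true;         /* still needed: restore it */
    }
  for (size_t i = 0; i < n; i++)
    if (keep[i])
      kept++;
  return kept;
}

/* Toy predicate: pretend the file needs includes 1 and 3 only.  */
static bool
needs_1_and_3 (const bool *keep, size_t n)
{
  return n > 3 && keep[1] && keep[3];
}
```

Starting from five includes that are all kept, prune_includes with the toy predicate leaves exactly entries 1 and 3. A single greedy pass like this does not look for a globally minimal set; it only removes whatever can be removed in that order.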

This way each .c file gets only the includes it actually needed from 
tree-ssa.h, and often doesn't even require tree-ssa.h itself.  I haven't 
touched any existing includes unless they formed a duplicate of those 
copied from tree-ssa.h. (and that happens)


This touches 147 files (!!!) which included tree-ssa.h; only 40 
still need tree-ssa.h.
tree-ssa.h had 11 include files, and the average .c file ended up 
requiring only 3.


bootstraps on  x86_64-unknown-linux-gnu with no new regressions. OK?

Andrew



flat3.patch.gz
Description: application/gzip


Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-22 Thread pinskia


Sent from my iPad

> On Oct 22, 2013, at 7:23 PM, Cong Hou  wrote:
> 
> This patch aims at PR58762.
> 
> Currently GCC could not vectorize abs() operation for integers on x86
> with only SSE2 support. For int type, the reason is that the expand on
> abs() is not defined for vector type. This patch defines such an
> expand so that abs(int) will be vectorized with only SSE2.
> 
> For abs(char/short), type conversions are needed as the current abs()
> function/operation does not accept argument of char/short type.
> Therefore when we want to get the absolute value of a char_val using
> abs (char_val), it will be converted into abs ((int) char_val). It
> then can be vectorized, but the generated code is not efficient as
> lots of packings and unpackings are involved. But if we convert
> (char) abs ((int) char_val) to abs (char_val), the vectorizer will be
> able to generate better code. Same for short.
> 
> This conversion also enables vectorizing abs(char/short) operation
> with PABSB and PABSW instructions in SSSE3.
> 
> With only SSE2 support, I developed three methods to expand
> abs(char/short/int) separately:
> 
> 1. For 32 bit int value x, we can get abs (x) from (((signed) x >>
> (W-1)) ^ x) - ((signed) x >> (W-1)). This is better than max (x, -x),
> which needs bit masking.
> 
> 2. For 16 bit int value x, we can get abs (x) from max (x, -x), as
> SSE2 provides PMAXSW instruction.
> 
> 3. For 8 bit int value x, we can get abs (x) from min ((unsigned char)
> x, (unsigned char) (-x)), as SSE2 provides PMINUB instruction.
> 
> 
> The patch is pasted below. Please point out any problem in my patch
> and analysis.
> 
> 
> thanks,
> Cong
> 
> 
> 
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 8a38316..e0f33ee 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,13 @@
> +2013-10-22  Cong Hou  
> +
> + PR target/58762
> + * convert.c (convert_to_integer): Convert (char) abs ((int) char_val)
> + into abs (char_val).  Also convert (short) abs ((int) short_val)
> + into abs (short_val).

I don't like this optimization in convert.  I think it should be submitted 
separately and should be done in tree-ssa-forwprop.

Also I think you should have a generic (non x86) test case for the above 
optimization. 

Thanks,
Andrew


> + * config/i386/i386-protos.h (ix86_expand_sse2_absvxsi2): New function.
> + * config/i386/i386.c (ix86_expand_sse2_absvxsi2): New function.
> + * config/i386/sse.md: Add SSE2 support to abs (char/int/short).



> +
> 2013-10-14  David Malcolm  
> 
>  * dumpfile.h (gcc::dump_manager): New class, to hold state
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index 3ab2f3a..e85f663 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -238,6 +238,7 @@ extern void ix86_expand_mul_widen_evenodd (rtx,
> rtx, rtx, bool, bool);
> extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, bool, bool);
> extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
> extern void ix86_expand_sse2_mulvxdi3 (rtx, rtx, rtx);
> +extern void ix86_expand_sse2_absvxsi2 (rtx, rtx);
> 
> /* In i386-c.c  */
> extern void ix86_target_macros (void);
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 02cbbbd..8050e02 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -41696,6 +41696,53 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
>gen_rtx_MULT (mode, op1, op2));
> }
> 
> +void
> +ix86_expand_sse2_absvxsi2 (rtx op0, rtx op1)
> +{
> +  enum machine_mode mode = GET_MODE (op0);
> +  rtx tmp0, tmp1;
> +
> +  switch (mode)
> +{
> +  /* For 32-bit signed integer X, the best way to calculate the absolute
> + value of X is (((signed) X >> (W-1)) ^ X) - ((signed) X >> (W-1)).  */
> +  case V4SImode:
> + tmp0 = expand_simple_binop (mode, ASHIFTRT, op1,
> +GEN_INT (GET_MODE_BITSIZE
> + (GET_MODE_INNER (mode)) - 1),
> +NULL, 0, OPTAB_DIRECT);
> + if (tmp0)
> +  tmp1 = expand_simple_binop (mode, XOR, op1, tmp0,
> +  NULL, 0, OPTAB_DIRECT);
> + if (tmp0 && tmp1)
> +  expand_simple_binop (mode, MINUS, tmp1, tmp0,
> +   op0, 0, OPTAB_DIRECT);
> + break;
> +
> +  /* For 16-bit signed integer X, the best way to calculate the absolute
> + value of X is max (X, -X), as SSE2 provides the PMAXSW insn.  */
> +  case V8HImode:
> + tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
> + if (tmp0)
> +  expand_simple_binop (mode, SMAX, op1, tmp0, op0, 0,
> +   OPTAB_DIRECT);
> + break;
> +
> +  /* For 8-bit signed integer X, the best way to calculate the absolute
> + value of X is min ((unsigned char) X, (unsigned char) (-X)),
> + as SSE2 provides the PMINUB insn.  */
> +  case V16QImode:
> + tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
> + if (tmp0)
> +  expand_simple_binop (V16QImode, UMIN, op1, tmp0, op0, 0,
> +   OPTAB_DIRECT);
> + break;
> +
> +  default:
> + break;
> +}
> +}
> +
> /* Expand an insert into a vector register through p

Re: [PATCH, PR 53001] Re: Patch to split out new warning flag for floating point conversion

2013-10-22 Thread Joshua J Cogliati

Only one of these should be applied.  The
warn_float_patch_simple_trunk.diff just changes formatting from the
previous version.  The warn_float_patch_and_new_testcase.diff leaves
existing testcases alone, and adds a new one that tests
-Wfloat-conversion warnings.  warn_float_patch_and_new_testcase.diff
has been run thru make bootstrap and make check-gcc on revision 203640
of svn trunk.

Changelog for warn_float_patch_simple_trunk.diff:


Splitting out a -Wfloat-conversion from -Wconversion for
conversions that lower floating point number precision
or conversion from floating point numbers to integers
* c-family/c-common.c Switching unsafe_conversion_p to
return an enumeration with more detail, and conversion_warning
to use this information.
* c-family/c-common.h Adding conversion_safety enumeration
and switching return type of unsafe_conversion_p
* c-family/c.opt Adding new warning float-conversion and
enabling it with -Wconversion
* doc/invoke.texi Adding documentation about
-Wfloat-conversion
* testsuite/c-c++-common/Wconversion-real.c Switching tests
to use float-conversion
* testsuite/gcc.dg/Wconversion-real-integer.c Switching
tests to use float-conversion
* testsuite/gcc.dg/pr35635.c Switching tests to use
float-conversion

Changelog for warn_float_patch_and_new_testcase.diff:

Splitting out a -Wfloat-conversion from -Wconversion for
conversions that lower floating point number precision
or conversion from floating point numbers to integers
* c-family/c-common.c Switching unsafe_conversion_p to
return an enumeration with more detail, and conversion_warning
to use this information.
* c-family/c-common.h Adding conversion_safety enumeration
and switching return type of unsafe_conversion_p
* c-family/c.opt Adding new warning float-conversion and
enabling it with -Wconversion
* doc/invoke.texi Adding documentation about
-Wfloat-conversion
* testsuite/c-c++-common/Wfloat-conversion.c Copies relevant
tests from c-c++-common/Wconversion-real.c,
gcc.dg/Wconversion-real-integer.c and gcc.dg/pr35635.c into
new testcase for ones that are warned about by
-Wfloat-conversion

On 10/18/2013 09:21 AM, Joseph S. Myers wrote:
> On Fri, 18 Oct 2013, Joshua J Cogliati wrote:
> 
>> This patch does not change any of the non-commented c and c++
>> code. It changes the dg comments. Example:
>> -  fsi (3.1f); /* { dg-warning "conversion" } */
>> +  fsi (3.1f); /* { dg-warning "float-conversion" } */
>> 
>> If you want I can change it to (in separate files if desired):
>> 
>> fsi (3.1f); /* { dg-warning "conversion" } */
>> fsi (3.1f); /* { dg-warning "float-conversion" } */
>> 
>> so that now the tests are run both ways, but it would test the
>> exact same code path.
> 
> Really I think it's better for the dg-warning text to test the
> actual warning text rather than the name of the option that's also
> reported as part of the compiler output ("conversion" matches both,
> of course).

Is there an example of how this works or documentation?  I would
rather not make it overly detailed so it becomes very sensitive to
minor changes in the warning format.

> The problem isn't so much the change to dg-warning, though, as the
> change to dg-options.  Previously the test asserted that certain
> things warn with -Wconversion, by changing it you lose the
> assertion that -Wconversion enables those warnings.  So I think the
> test should remain as-is, verifying that -Wconversion causes
> certain warnings, and then be copied in a form using
> -Wfloat-conversion to verify that -Wfloat-conversion also causes
> the same warnings.

I am not fully convinced, so I just made the patch both ways. I don't
care that much which is used.
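
For illustration, a testcase in the copied form might look roughly like the sketch below. The function names are made up, and the dg-warning pattern reuses the "conversion" string from the discussion above instead of the full diagnostic text.

```c
/* { dg-do compile } */
/* { dg-options "-Wfloat-conversion" } */

/* Implicit double -> int conversion: the fractional part is dropped.  */
int
truncate_sketch (double d)
{
  return d; /* { dg-warning "conversion" } */
}

/* Implicit double -> float conversion: precision may be lost.  */
float
narrow_sketch (double d)
{
  return d; /* { dg-warning "conversion" } */
}
```

The unchanged original tests, run with -Wconversion, would then continue to check that -Wconversion still enables the same warnings.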

> Looking at
> , there
> are also some formatting problems, "if(" which should have a space
>  before the "(".
> 

Fixed.

Thanks for the comments.

Joshua Cogliati

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 203640)
+++ gcc/c-family/c-common.c	(working copy)
@@ -2518,7 +2518,7 @@ shorten_binary_op (tre

Re: [C++14] implement [[deprecated]].

2013-10-22 Thread Jason Merrill

I don't think this patch should be blocked on those bugs.

Jason


[PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-22 Thread Cong Hou
This patch aims at PR58762.

Currently GCC could not vectorize abs() operation for integers on x86
with only SSE2 support. For int type, the reason is that the expand on
abs() is not defined for vector type. This patch defines such an
expand so that abs(int) will be vectorized with only SSE2.

For abs(char/short), type conversions are needed as the current abs()
function/operation does not accept argument of char/short type.
Therefore when we want to get the absolute value of a char_val using
abs (char_val), it will be converted into abs ((int) char_val). It
then can be vectorized, but the generated code is not efficient as
lots of packings and unpackings are involved. But if we convert
(char) abs ((int) char_val) to abs (char_val), the vectorizer will be
able to generate better code. Same for short.

This conversion also enables vectorizing abs(char/short) operation
with PABSB and PABSW instructions in SSSE3.

With only SSE2 support, I developed three methods to expand
abs(char/short/int) separately:

1. For 32 bit int value x, we can get abs (x) from (((signed) x >>
(W-1)) ^ x) - ((signed) x >> (W-1)). This is better than max (x, -x),
which needs bit masking.

2. For 16 bit int value x, we can get abs (x) from max (x, -x), as
SSE2 provides PMAXSW instruction.

3. For 8 bit int value x, we can get abs (x) from min ((unsigned char)
x, (unsigned char) (-x)), as SSE2 provides PMINUB instruction.
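
The three methods can be written out per element in scalar C. This is only a sketch for checking the arithmetic, not the vector expansion in the patch; the function names are made up, and the 32-bit version assumes that >> on a negative value is an arithmetic shift, as GCC implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Method 1 (32-bit): ((x >> (W-1)) ^ x) - (x >> (W-1)).
   The xor and subtraction are done in unsigned arithmetic so that
   INT32_MIN does not trigger signed overflow.  */
static int32_t
abs32_sketch (int32_t x)
{
  int32_t mask = x >> 31;   /* 0 if x >= 0, -1 if x < 0 */
  return (int32_t) (((uint32_t) x ^ (uint32_t) mask) - (uint32_t) mask);
}

/* Method 2 (16-bit): max (x, -x), the scalar analogue of PMAXSW.  */
static int16_t
abs16_sketch (int16_t x)
{
  int16_t neg = (int16_t) (-x);
  return x > neg ? x : neg;
}

/* Method 3 (8-bit): min ((unsigned char) x, (unsigned char) (-x)),
   the scalar analogue of PMINUB.  */
static uint8_t
abs8_sketch (int8_t x)
{
  uint8_t ux = (uint8_t) x;
  uint8_t un = (uint8_t) (-x);
  return ux < un ? ux : un;
}

/* A few spot checks of the three helpers.  */
static void
check_abs_sketches (void)
{
  assert (abs32_sketch (-5) == 5 && abs32_sketch (7) == 7);
  assert (abs16_sketch (-300) == 300 && abs16_sketch (12) == 12);
  assert (abs8_sketch (-5) == 5 && abs8_sketch (5) == 5);
}
```

The unsigned minimum works for 8-bit values because exactly one of x and -x has the sign bit set when x is nonzero, so it reads as a large unsigned number and the minimum picks the other one.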


The patch is pasted below. Please point out any problem in my patch
and analysis.


thanks,
Cong




diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a38316..e0f33ee 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2013-10-22  Cong Hou  
+
+ PR target/58762
+ * convert.c (convert_to_integer): Convert (char) abs ((int) char_val)
+ into abs (char_val).  Also convert (short) abs ((int) short_val)
+ into abs (short_val).
+ * config/i386/i386-protos.h (ix86_expand_sse2_absvxsi2): New function.
+ * config/i386/i386.c (ix86_expand_sse2_absvxsi2): New function.
+ * config/i386/sse.md: Add SSE2 support to abs (char/int/short).
+
 2013-10-14  David Malcolm  

  * dumpfile.h (gcc::dump_manager): New class, to hold state
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 3ab2f3a..e85f663 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -238,6 +238,7 @@ extern void ix86_expand_mul_widen_evenodd (rtx,
rtx, rtx, bool, bool);
 extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, bool, bool);
 extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
 extern void ix86_expand_sse2_mulvxdi3 (rtx, rtx, rtx);
+extern void ix86_expand_sse2_absvxsi2 (rtx, rtx);

 /* In i386-c.c  */
 extern void ix86_target_macros (void);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 02cbbbd..8050e02 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -41696,6 +41696,53 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
gen_rtx_MULT (mode, op1, op2));
 }

+void
+ix86_expand_sse2_absvxsi2 (rtx op0, rtx op1)
+{
+  enum machine_mode mode = GET_MODE (op0);
+  rtx tmp0, tmp1;
+
+  switch (mode)
+{
+  /* For 32-bit signed integer X, the best way to calculate the absolute
+ value of X is (((signed) X >> (W-1)) ^ X) - ((signed) X >> (W-1)).  */
+  case V4SImode:
+ tmp0 = expand_simple_binop (mode, ASHIFTRT, op1,
+GEN_INT (GET_MODE_BITSIZE
+ (GET_MODE_INNER (mode)) - 1),
+NULL, 0, OPTAB_DIRECT);
+ if (tmp0)
+  tmp1 = expand_simple_binop (mode, XOR, op1, tmp0,
+  NULL, 0, OPTAB_DIRECT);
+ if (tmp0 && tmp1)
+  expand_simple_binop (mode, MINUS, tmp1, tmp0,
+   op0, 0, OPTAB_DIRECT);
+ break;
+
+  /* For 16-bit signed integer X, the best way to calculate the absolute
+ value of X is max (X, -X), as SSE2 provides the PMAXSW insn.  */
+  case V8HImode:
+ tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
+ if (tmp0)
+  expand_simple_binop (mode, SMAX, op1, tmp0, op0, 0,
+   OPTAB_DIRECT);
+ break;
+
+  /* For 8-bit signed integer X, the best way to calculate the absolute
+ value of X is min ((unsigned char) X, (unsigned char) (-X)),
+ as SSE2 provides the PMINUB insn.  */
+  case V16QImode:
+ tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
+ if (tmp0)
+  expand_simple_binop (V16QImode, UMIN, op1, tmp0, op0, 0,
+   OPTAB_DIRECT);
+ break;
+
+  default:
+ break;
+}
+}
+
 /* Expand an insert into a vector register through pinsr insn.
Return true if successful.  */

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c3f6c94..bd90f2d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8721,7 +8721,7 @@
(set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
(set_attr "mode" "DI")])

-(define_insn "abs2"
+(define_insn "*abs2"
   [(set (match_operand:VI124_AVX2_48_AVX512F 0 "register_operand" "=v")
  (abs:VI124_AVX2_48_AVX512F
   (match_operand:VI124_AVX2_48_AVX512F 1 "nonimmediate_operand" "vm")))]
@@ -8733,6 +87

Re: patch to enable LRA for ppc

2013-10-22 Thread Vladimir Makarov

On 13-10-22 10:21 AM, David Edelsohn wrote:

On Mon, Oct 21, 2013 at 10:42 PM, Vladimir Makarov  wrote:


I would say let's add -mlra, but make the default OFF for the time being.
We can always switch the default later.

Sure, if you know of some LRA problems it should not be on by default. Moreover,
if we still have the problems when releasing gcc4.9, I think we should
exclude any possibility for a user to use LRA for ppc.  I don't want to have
GCC-4.9 users blaming LRA.

But adding LRA to PPC on the trunk (switched OFF by default) earlier could
help me a lot to work on the issues.

My main concern was disrupting Mike. If Mike is comfortable with
adding LRA disabled by default, it is okay with me.

The patch mostly adds lra_in_progress, which will not have any effect
while LRA remains disabled.

My one question about the patch is:

-  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r")
+  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,&r")

which may cause register preferencing problems for bswap when LRA is not used.

The rest of the patch is okay.


Thanks, David.  I'll commit the patch this week without this change (and 
making LRA active only when -mlra is given).  The change was meant to fix 
a testsuite failure caused by bad code generation.


It can be fixed in another way that does not affect reload, by adding a 
modified copy of the insn definition active only when LRA is used and making 
the original definition active only when reload is used.  But I'll do it later.





[PATCH, rs6000] Fix mulv8hi3 pattern for little endian

2013-10-22 Thread Bill Schmidt
Hi,

The RTL generation for mulv8hi3 is slightly different for big and little
endian modes.  In the latter case, the operands of the vector-pack
instruction must be reversed to get the proper interleaving.

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
regressions.  This fixes 3 test case failures for the little endian
target.  Is this ok for trunk?

Thanks,
Bill


2013-10-22  Bill Schmidt  

* config/rs6000/altivec.md (mulv8hi3): Adjust for little endian.


Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md(revision 203923)
+++ gcc/config/rs6000/altivec.md(working copy)
@@ -681,7 +681,10 @@
emit_insn (gen_altivec_vmrghw (high, even, odd));
emit_insn (gen_altivec_vmrglw (low, even, odd));
 
-   emit_insn (gen_altivec_vpkuwum (operands[0], high, low));
+   if (BYTES_BIG_ENDIAN)
+ emit_insn (gen_altivec_vpkuwum (operands[0], high, low));
+   else
+ emit_insn (gen_altivec_vpkuwum (operands[0], low, high));
 
DONE;
 }")




[c++-concepts] small tidbits to get it to build

2013-10-22 Thread Ed Smith-Rowland

I had to get past two small bugs to get c++-concepts to build.
Take a good look because I'm not sure if they're right.  The solutions 
should be harmless though.


Ed


2013-10-23  Edward Smith-Rowland  <3dw...@verizon.net>

make concepts build.
* constraint.cc (make_constraints): Change variable assume to
assumption.  Is assume a new keyword?
* typeck.c (cp_build_function_call_vec): Use unused variable loc.
Index: constraint.cc
===
--- constraint.cc   (revision 203944)
+++ constraint.cc   (working copy)
@@ -522,16 +522,15 @@
   if (expr == error_mark_node)
 return error_mark_node;
 
-  // Decompose those expressions into lists of lists of atomic
-  // propositions.
-  tree assume = decompose_assumptions (expr);
+  // Decompose those expressions into lists of lists of atomic propositions.
+  tree assumption = decompose_assumptions (expr);
 
   // Build the constraint info.
   tree_constraint_info *cinfo = 
 (tree_constraint_info *)make_node (CONSTRAINT_INFO);
   cinfo->spelling = reqs;
   cinfo->requirements = expr;
-  cinfo->assumptions = assume;
+  cinfo->assumptions = assumption;
   return (tree)cinfo;
 }
 
@@ -1380,7 +1379,7 @@
   // Print the header for the requires expression.
   tree parms = TREE_OPERAND (subst, 0);
   if (!VOID_TYPE_P (TREE_VALUE (parms)))
-inform (loc, "  requiring syntax with values %Z", TREE_OPERAND (subst, 0));
+inform (loc, "  requiring syntax with values %qE", TREE_OPERAND (subst, 
0));/*%Z*/
 
   // Create a new local specialization binding for the arguments. 
   // This lets us instantiate sub-expressions separately from the 
Index: typeck.c
===
--- typeck.c(revision 203944)
+++ typeck.c(working copy)
@@ -3445,7 +3445,7 @@
 {
   location_t loc = DECL_SOURCE_LOCATION (function);
   error ("%qD is not a viable candidate", function);
-  diagnose_constraints (input_location, tmpl, args);
+  diagnose_constraints (loc, tmpl, args);/*input_location*/
   return error_mark_node;
 }
 }


Re: [C++14] implement [[deprecated]].

2013-10-22 Thread Ed Smith-Rowland

On 10/22/2013 12:00 PM, Jason Merrill wrote:

OK.

Jason

There is discussion about several bugs in gnu::deprecated upon which 
this is based over on the libstdc++ list.
I could see where we are with those bugs in a week or two.  Or just wait 
until they are fixed.
OTOH, I don't think my patch would change one way or the other.  I can't 
see a reason we'd want [[gnu::deprecated]] and [[deprecated]] to differ.
It's just that [[deprecated]] wouldn't quite work the way it should 
until the bugs are fixed.


What do you think?

Ed



Re: [google gcc-4_8] Reset the some cp states in LIPO

2013-10-22 Thread Xinliang David Li
ok after testing.

David

On Tue, Oct 22, 2013 at 4:43 PM, Rong Xu  wrote:
> you are right.
> fixed in the attached patch.
>
>
> On Tue, Oct 22, 2013 at 4:37 PM, Xinliang David Li 
> wrote:
>>
>> Why is it needed to encode module id in the name? If the statics are
>> promoted later, it will be inserted at that time.
>>
>> David
>>
>> On Tue, Oct 22, 2013 at 4:28 PM, Rong Xu  wrote:
>> > Hi,
>> >
>> > The attached patch fixes some linking errors (undefined symbols) in
>> > LIPO-use
>> > build.
>> > r203583 does not seem to be complete as we may create different names in
>> > auxiliary and primary compilation because the counters are not reset
>> > after
>> > popping the parsing stack.
>> >
>> > Patch passed with the failing program. Other tests are ongoing.
>> >
>> > Thanks,
>> >
>> > -Rong
>> >
>
>


Re: rename c1x-*.c to c11-*.c

2013-10-22 Thread Mike Stump
On Oct 22, 2013, at 2:29 PM, Joseph S. Myers  wrote:
> There's a whole g++.dg/cpp0x directory of tests, many of them using 
> -std=c++0x, for which the same principle would also suggest using the 
> non-deprecated option and the name for the actual standard….

I agree.  C++11 looks nicer to my eye as well, and easier on the brain as one 
doesn't have to think about c++03 v c++11…  The doc said it was the same as 
c++11, so, I did that.  My fingers really didn't like changing 0x to 11.

This is just ++0x --> ++11 in gcc/testsuite/g++.dg/cpp0x.

Tested on x86_64-apple-darwin12.

Committed revision 203939.

Index: Wliteral-suffix.C
===
--- Wliteral-suffix.C   (revision 203938)
+++ Wliteral-suffix.C   (working copy)
@@ -1,5 +1,5 @@
// { dg-do run }
-// { dg-options "-std=c++0x" }
+// { dg-options "-std=c++11" }

// Make sure -Wliteral-suffix is enabled by default and
// triggers as expected.

[ … ]  rest omitted for brevity (and to get past the mail filter).

[PATCH] more robust check for multiplier in addressing mode when determining the costs

2013-10-22 Thread Igor Shevlyakov
I'm working on a chip with some unusual addressing modes.
It does have [base_reg+msize*index_reg], and one can't omit base_reg the way
one can on x86.
But when ivopts tries to calculate the costs of different addressing modes
it only checks [mult*reg] to determine allowable multipliers.
I modified multiplier_allowed_in_address_p to check both cases.
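
To make the idea concrete, here is a minimal standalone sketch of the check,
not the ivopts code itself.  AddrShape, valid_multipliers and toy_target are
hypothetical names invented for this illustration; toy_target models a chip
that only accepts base + {1,2,4,8}*index.  A multiplier counts as allowed if
either address shape is legitimate.

```cpp
#include <cassert>
#include <functional>
#include <set>

// Hypothetical address shapes; these are not GCC RTL codes.
enum class AddrShape { ScaledIndex, BasePlusScaledIndex };

// Hypothetical target predicate: is "shape" legitimate with multiplier m?
using TargetPredicate = std::function<bool(AddrShape, long)>;

// Collect every multiplier in [-max_ratio, max_ratio] the target accepts.
// With check_base_form == false only [mult*reg] is probed (old behaviour);
// with true, [reg+mult*reg] is probed as well (behaviour proposed here).
std::set<long> valid_multipliers(const TargetPredicate& legit,
                                 long max_ratio, bool check_base_form)
{
  std::set<long> valid;
  for (long m = -max_ratio; m <= max_ratio; ++m)
    {
      bool ok = legit(AddrShape::ScaledIndex, m);
      if (check_base_form)
        ok = ok || legit(AddrShape::BasePlusScaledIndex, m);
      if (ok)
        valid.insert(m);
    }
  return valid;
}

// Toy target: the base register is mandatory, scales are 1, 2, 4 or 8.
bool toy_target(AddrShape shape, long m)
{
  return shape == AddrShape::BasePlusScaledIndex
         && (m == 1 || m == 2 || m == 4 || m == 8);
}

// Small self-check; call it from main() to run the sketch.
void demo()
{
  // Probing only [mult*reg] finds nothing on the toy target.
  assert(valid_multipliers(toy_target, 10, false).empty());
  // Probing both shapes finds the real scale factors.
  assert((valid_multipliers(toy_target, 10, true)
          == std::set<long>{1, 2, 4, 8}));
}
```

On such a target the old probe reports no usable multiplier at all, which is
why the cost model ends up preferring worse addressing.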

There's no actual testcase that fails; it's just that my back-end generated
sub-optimal code.
I tested it on my tree off 4.8.1 and it fixes the problem.
Rebuilt trunk for x86_64-linux-gnu.

Should it be included in trunk to make check more robust?

Thanks
Igor

2013-10-22  Igor Shevlyakov  

* tree-ssa-loop-ivopts.c (multiplier_allowed_in_address_p ): Check both
  [reg+mult*reg] and [mult*reg] to determine if multiplier is allowed.


Index: tree-ssa-loop-ivopts.c
===
--- tree-ssa-loop-ivopts.c (revision 203937)
+++ tree-ssa-loop-ivopts.c (working copy)
@@ -3108,16 +3108,19 @@ multiplier_allowed_in_address_p (HOST_WI
 {
   enum machine_mode address_mode = targetm.addr_space.address_mode (as);
   rtx reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1);
-  rtx addr;
+  rtx reg2 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2);
+  rtx addr, scaled;
   HOST_WIDE_INT i;

   valid_mult = sbitmap_alloc (2 * MAX_RATIO + 1);
   bitmap_clear (valid_mult);
-  addr = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX);
+  scaled = gen_rtx_fmt_ee (MULT, address_mode, reg1,   NULL_RTX);
+  addr   = gen_rtx_fmt_ee (PLUS, address_mode, scaled, reg2);
   for (i = -MAX_RATIO; i <= MAX_RATIO; i++)
  {
-  XEXP (addr, 1) = gen_int_mode (i, address_mode);
-  if (memory_address_addr_space_p (mode, addr, as))
+  XEXP (scaled, 1) = gen_int_mode (i, address_mode);
+  if (memory_address_addr_space_p (mode, addr, as) ||
+  memory_address_addr_space_p (mode, scaled, as))
 bitmap_set_bit (valid_mult, i + MAX_RATIO);
  }


Re: [google gcc-4_8] Reset the some cp states in LIPO

2013-10-22 Thread Xinliang David Li
Why is it needed to encode module id in the name? If the statics are
promoted later, it will be inserted at that time.

David

On Tue, Oct 22, 2013 at 4:28 PM, Rong Xu  wrote:
> Hi,
>
> The attached patch fixes some linking errors (undefined symbols) in LIPO-use
> build.
> r203583 does not seem to be complete as we may create different names in
> auxiliary and primary compilation because the counters are not reset after
> popping the parsing stack.
>
> Patch passed with the failing program. Other tests are ongoing.
>
> Thanks,
>
> -Rong
>


Re: Debug functions review

2013-10-22 Thread Paolo Carlini


Hi,

"François Dumont"  wrote:
>Hi
>
> Here is a patch to clean up some debug functions a little. I got
>rid of __check_singular_aux; simply playing with the __check_singular
>overloads was enough. I also added the missing __check_dereferenceable
>for safe local iterators.

This is probably straightforward but I want to be sure I understand your 
previous message + this one: do they mean that in some cases, due to that 
missing 'const', we weren't catching non-dereferenceable iterators? Thus, 
should we also add a testcase?
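
For what it's worth, the overload-resolution effect being asked about can be
reproduced outside the library.  The sketch below uses hypothetical names
(SafeIter, check_dereferenceable in two namespaces) and only assumes that the
pre-patch overload set had the shape shown on the '-' lines of the diff; it is
not the libstdc++ code.  With a non-const generic reference, a non-const safe
iterator binds better to the generic overload, so the real check is skipped.

```cpp
#include <cassert>

// Hypothetical stand-in for a debug-mode safe iterator.
template<typename Iterator>
struct SafeIter
{
  bool ok;
  bool dereferenceable() const { return ok; }
};

namespace before_patch
{
  // Generic fallback taking a non-const reference.
  template<typename Iterator>
  bool check_dereferenceable(Iterator&) { return true; }

  // Overload that actually asks the safe iterator.
  template<typename Iterator>
  bool check_dereferenceable(const SafeIter<Iterator>& x)
  { return x.dereferenceable(); }
}

namespace after_patch
{
  // Generic fallback now takes a const reference.
  template<typename Iterator>
  bool check_dereferenceable(const Iterator&) { return true; }

  template<typename Iterator>
  bool check_dereferenceable(const SafeIter<Iterator>& x)
  { return x.dereferenceable(); }
}

// Small self-check; call it from main() to run the sketch.
void demo()
{
  SafeIter<int*> it{false};         // non-const, not dereferenceable
  const SafeIter<int*> cit{false};  // const, not dereferenceable

  // Non-const argument: binding to Iterator& needs no added const, so the
  // generic overload wins and the problem goes unnoticed.
  assert(before_patch::check_dereferenceable(it) == true);
  // Const argument: both overloads bind equally well, the more specialized
  // one is chosen and the problem is reported.
  assert(before_patch::check_dereferenceable(cit) == false);

  // With const on the generic overload both cases are caught.
  assert(after_patch::check_dereferenceable(it) == false);
  assert(after_patch::check_dereferenceable(cit) == false);
}
```

If that matches what happened in the library, a testcase would only need a
non-const safe iterator that is not dereferenceable.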

Paolo


Ping Re: Add predefined macros for library use in defining __STDC_IEC_559*

2013-10-22 Thread Joseph S. Myers
Ping.  This patch 
 is pending 
review (hook addition, rs6000/powerpc changes).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: rename c1x-*.c to c11-*.c

2013-10-22 Thread Joseph S. Myers
There's a whole g++.dg/cpp0x directory of tests, many of them using 
-std=c++0x, for which the same principle would also suggest using the 
non-deprecated option and the name for the actual standard

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch, Fortran] PR58793 - reject passing TYPE(*) to CLASS(*)

2013-10-22 Thread Paul Richard Thomas
Dear Tobias,

This is OK for trunk.

Thanks for the patch.

Paul

On 21 October 2013 19:32, Tobias Burnus  wrote:
> The issue came up while reviewing the patch for PR58793.
>
> Built and regtested on x86-64-gnu-linux
> OK for the trunk?
>
> Tobias



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy


Re: [PATCH][i386]Fix PR 57756

2013-10-22 Thread Sriraman Tallam
Hi Jakub,

   This simple patch fixes the -m32 -mno-sse bugs you reported. A few
more places where I did not change references to global_options.
Uros/Richard: Is this ok to commit?

* config/i386/i386.c (ix86_option_override_internal):
Change TARGET_SSE2 to TARGET_SSE2_P (opts->...)
(ix86_valid_target_attribute_tree):
Change TARGET_64BIT to TARGET_64BIT_P (opts->...)
Change TARGET_SSE to TARGET_SSE_P (opts->...)

Thanks
Sri

On Fri, Oct 18, 2013 at 7:10 PM, Sriraman Tallam  wrote:
> On Thu, Oct 17, 2013 at 3:10 PM, Jakub Jelinek  wrote:
>> On Thu, Oct 17, 2013 at 12:30:46PM -0700, Sriraman Tallam wrote:
>>> I checked my build again for these tests and they all pass.
>>
>> Even on x86_64-linux I can reproduce all of those with
>> -m32 -mno-sse.
>
> Figured out why this happens in -m32 -mno-sse mode. This simple patch
> will fix it. Basically, there are a couple more places where
> references to global_options need to be fixed. I am consolidating
> the patch for all the x86 test case failures into one patch which I
> will later send out for review.
>
> Index: config/i386/i386.c
> ===
> --- config/i386/i386.c (revision 203830)
> +++ config/i386/i386.c (working copy)
> @@ -3810,7 +3810,7 @@ ix86_option_override_internal (bool main_args_p,
>   codegen.  We may switch to 387 with -ffast-math for size optimized
>   functions. */
>else if (fast_math_flags_set_p (&global_options)
> -   && TARGET_SSE2)
> +   && TARGET_SSE2_P (opts->x_ix86_isa_flags))
>  ix86_fpmath = FPMATH_SSE;
>else
>  opts->x_ix86_fpmath = TARGET_FPMATH_DEFAULT_P (opts->x_ix86_isa_flags);
> @@ -4566,7 +4566,8 @@ ix86_valid_target_attribute_tree (tree args,
>/* If fpmath= is not set, and we now have sse2 on 32-bit, use it.  */
>if (enum_opts_set.x_ix86_fpmath)
>   opts_set->x_ix86_fpmath = (enum fpmath_unit) 1;
> -  else if (!TARGET_64BIT && TARGET_SSE)
> +  else if (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
> +   && TARGET_SSE_P (opts->x_ix86_isa_flags))
>   {
>opts->x_ix86_fpmath = (enum fpmath_unit) (FPMATH_SSE | FPMATH_387);
>
>>
>> Jakub
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 203929)
+++ gcc/config/i386/i386.c  (working copy)
@@ -3798,7 +3798,7 @@ ix86_option_override_internal (bool main_args_p,
  codegen.  We may switch to 387 with -ffast-math for size optimized
  functions. */
   else if (fast_math_flags_set_p (&global_options)
-  && TARGET_SSE2)
+  && TARGET_SSE2_P (opts->x_ix86_isa_flags))
 ix86_fpmath = FPMATH_SSE;
   else
 opts->x_ix86_fpmath = TARGET_FPMATH_DEFAULT_P (opts->x_ix86_isa_flags);
@@ -4553,7 +4553,8 @@ ix86_valid_target_attribute_tree (tree args,
   /* If fpmath= is not set, and we now have sse2 on 32-bit, use it.  */
   if (enum_opts_set.x_ix86_fpmath)
opts_set->x_ix86_fpmath = (enum fpmath_unit) 1;
-  else if (!TARGET_64BIT && TARGET_SSE)
+  else if (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
+  && TARGET_SSE_P (opts->x_ix86_isa_flags))
{
  opts->x_ix86_fpmath = (enum fpmath_unit) (FPMATH_SSE | FPMATH_387);
  opts_set->x_ix86_fpmath = (enum fpmath_unit) 1;



Debug functions review

2013-10-22 Thread François Dumont

Hi

Here is a patch to clean up some debug functions a little. I got 
rid of __check_singular_aux; simply playing with the __check_singular 
overloads was enough. I also added the missing __check_dereferenceable 
for safe local iterators.


2013-10-22  François Dumont 

* include/debug/formatter.h (__check_singular): Add const on
iterator reference.
* include/debug/functions.h (__check_singular_aux): Delete.
(__check_singular(const _Ite&)): Add const on iterator reference.
(__check_singular(const _Safe_iterator<_Ite, _Seq>&)): Delete.
(__check_dereferenceable(const _Ite&)): Add const on iterator
reference.
(__check_dereferenceable(const _Safe_local_iterator<>&)): New.
* include/debug/safe_iterator.h (__check_singular_aux): Delete.
(__check_singular(const _Safe_iterator_base&)): New.

Tested under Linux x86_64 debug mode.

Ok to commit ?

François

Index: include/debug/formatter.h
===
--- include/debug/formatter.h	(revision 203909)
+++ include/debug/formatter.h	(working copy)
@@ -38,7 +38,7 @@
   using std::type_info;
 
   template
-bool __check_singular(_Iterator&);
+bool __check_singular(const _Iterator&);
 
   class _Safe_sequence_base;
 
Index: include/debug/functions.h
===
--- include/debug/functions.h	(revision 203909)
+++ include/debug/functions.h	(working copy)
@@ -45,20 +45,19 @@
   template
 class _Safe_iterator;
 
+  template
+class _Safe_local_iterator;
+
   template
 struct _Insert_range_from_self_is_safe
 { enum { __value = 0 }; };
 
-  // An arbitrary iterator pointer is not singular.
-  inline bool
-  __check_singular_aux(const void*) { return false; }
-
-  // We may have an iterator that derives from _Safe_iterator_base but isn't
-  // a _Safe_iterator.
+  /** Assume that some arbitrary iterator is not singular, because we
+  can't prove that it is. */
   template
 inline bool
-__check_singular(_Iterator& __x)
-{ return __check_singular_aux(&__x); }
+__check_singular(const _Iterator& __x)
+{ return false; }
 
   /** Non-NULL pointers are nonsingular. */
   template
@@ -66,17 +65,11 @@
 __check_singular(const _Tp* __ptr)
 { return __ptr == 0; }
 
-  /** Safe iterators know if they are singular. */
-  template
-inline bool
-__check_singular(const _Safe_iterator<_Iterator, _Sequence>& __x)
-{ return __x._M_singular(); }
-
   /** Assume that some arbitrary iterator is dereferenceable, because we
   can't prove that it isn't. */
   template
 inline bool
-__check_dereferenceable(_Iterator&)
+__check_dereferenceable(const _Iterator&)
 { return true; }
 
   /** Non-NULL pointers are dereferenceable. */
@@ -85,12 +78,19 @@
 __check_dereferenceable(const _Tp* __ptr)
 { return __ptr; }
 
-  /** Safe iterators know if they are singular. */
+  /** Safe iterators know if they are dereferenceable. */
   template
 inline bool
 __check_dereferenceable(const _Safe_iterator<_Iterator, _Sequence>& __x)
 { return __x._M_dereferenceable(); }
 
+  /** Safe local iterators know if they are dereferenceable. */
+  template
+inline bool
+__check_dereferenceable(const _Safe_local_iterator<_Iterator,
+		   _Sequence>& __x)
+{ return __x._M_dereferenceable(); }
+
   /** If the distance between two random access iterators is
*  nonnegative, assume the range is valid.
   */
Index: include/debug/safe_iterator.h
===
--- include/debug/safe_iterator.h	(revision 203909)
+++ include/debug/safe_iterator.h	(working copy)
@@ -56,13 +56,10 @@
   { return __it == __seq->_M_base().begin(); }
 };
 
-  /** Iterators that derive from _Safe_iterator_base but that aren't
-   *  _Safe_iterators can be determined singular or non-singular via
-   *  _Safe_iterator_base.
-   */
-  inline bool 
-  __check_singular_aux(const _Safe_iterator_base* __x)
-  { return __x->_M_singular(); }
+  /** _Safe_iterators can be determined singular or non-singular. */
+  inline bool
+  __check_singular(const _Safe_iterator_base& __x)
+  { return __x._M_singular(); }
 
   /** The precision to which we can calculate the distance between
*  two iterators.



Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Jeff Law

On 10/22/13 13:31, Richard Henderson wrote:


Yes, which is where I believe the new types come from as well.
OK.  Thanks for clarifying.  I'm about to go offline for a few hours, 
but will start working my way through the MPX stuff.


jeff


[SH] PR 52483 - Fix volatile mem stores

2013-10-22 Thread Oleg Endo
Hello,

The attached patch fixes volatile mem stores on SH so that they won't
result in redundant sign/zero extensions and will utilize available
addressing modes.  This is similar to what has been done to fix memory
loads in http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01315.html

Tested on rev 203909 with
make -k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures.
However, the test result summary showed a bunch of "WARNING: program
timed out."  Kaz, could you please add it to your test queue and let me
know if it's OK for trunk?

Cheers,
Oleg

gcc/ChangeLog:
PR target/52483
* config/sh/predicates.md (general_movdst_operand): Allow
reg+reg addressing, do not use general_operand for memory 
operands.

testsuite/ChangeLog:
PR target/52483
* gcc.target/sh/pr52483-1.c: Add tests for memory stores.
* gcc.target/sh/pr52483-2.c: Likewise.
* gcc.target/sh/pr52483-3.c: Likewise.
* gcc.target/sh/pr52483-4.c: Likewise.
Index: gcc/config/sh/predicates.md
===
--- gcc/config/sh/predicates.md	(revision 203857)
+++ gcc/config/sh/predicates.md	(working copy)
@@ -550,17 +550,36 @@
   && ! (reload_in_progress || reload_completed))
 return 0;
 
-  if ((mode == QImode || mode == HImode)
-  && mode == GET_MODE (op)
-  && (MEM_P (op)
-	  || (GET_CODE (op) == SUBREG && MEM_P (SUBREG_REG (op)
+  if (mode == GET_MODE (op)
+  && (MEM_P (op) || (GET_CODE (op) == SUBREG && MEM_P (SUBREG_REG (op)
 {
-  rtx x = XEXP ((MEM_P (op) ? op : SUBREG_REG (op)), 0);
+  rtx mem_rtx = MEM_P (op) ? op : SUBREG_REG (op);
+  rtx x = XEXP (mem_rtx, 0);
 
-  if (GET_CODE (x) == PLUS
+  if ((mode == QImode || mode == HImode)
+	  && GET_CODE (x) == PLUS
 	  && REG_P (XEXP (x, 0))
 	  && CONST_INT_P (XEXP (x, 1)))
 	return sh_legitimate_index_p (mode, XEXP (x, 1), TARGET_SH2A, false);
+
+  /* Allow reg+reg addressing here without validating the register
+	 numbers.  Usually one of the regs must be R0 or a pseudo reg.
+	 In some cases it can happen that arguments from hard regs are
+	 propagated directly into address expressions.  In this cases reload
+	 will have to fix it up later.  However, allow this only for native
+	 1, 2 or 4 byte addresses.  */
+  if (can_create_pseudo_p () && GET_CODE (x) == PLUS
+	  && GET_MODE_SIZE (mode) <= 4
+	  && REG_P (XEXP (x, 0)) && REG_P (XEXP (x, 1)))
+	return true;
+
+  /* 'general_operand' does not allow volatile mems during RTL expansion to
+	 avoid matching arithmetic that operates on mems, it seems.
+	 On SH this leads to redundant sign extensions for QImode or HImode
+	 stores.  Thus we mimic the behavior but allow volatile mems.  */
+if (memory_address_addr_space_p (GET_MODE (mem_rtx), x,
+	 MEM_ADDR_SPACE (mem_rtx)))
+	  return true;
 }
 
   return general_operand (op, mode);
Index: gcc/testsuite/gcc.target/sh/pr52483-1.c
===
--- gcc/testsuite/gcc.target/sh/pr52483-1.c	(revision 203857)
+++ gcc/testsuite/gcc.target/sh/pr52483-1.c	(working copy)
@@ -1,9 +1,9 @@
-/* Check that loads from volatile mems don't result in redundant sign
-   extensions.  */
+/* Check that loads/stores from/to volatile mems don't result in redundant
+   sign/zero extensions.  */
 /* { dg-do compile { target "sh*-*-*" } } */
-/* { dg-options "-O1" } */
+/* { dg-options "-O2" } */
 /* { dg-skip-if "" { "sh*-*-*" } { "-m5*"} { "" } }  */
-/* { dg-final { scan-assembler-not "exts" } } */
+/* { dg-final { scan-assembler-not "exts|extu" } } */
 
 int
 test_00 (volatile char* x)
@@ -11,20 +11,44 @@
   return *x;
 }
 
+void
+test_100 (volatile char* x, char y)
+{
+  *x = y;
+}
+
 int
 test_01 (volatile short* x)
 {
   return *x;
 }
 
+void
+test_101 (volatile unsigned char* x, unsigned char y)
+{
+  *x = y;
+}
+
 int
 test_02 (volatile unsigned char* x)
 {
   return *x == 0x80;
 }
 
+void
+test_102 (volatile short* x, short y)
+{
+  *x = y;
+}
+
 int
 test_03 (volatile unsigned short* x)
 {
   return *x == 0xFF80;
 }
+
+void
+test_103 (volatile unsigned short* x, unsigned short y)
+{
+  *x = y;
+}
Index: gcc/testsuite/gcc.target/sh/pr52483-2.c
===
--- gcc/testsuite/gcc.target/sh/pr52483-2.c	(revision 203857)
+++ gcc/testsuite/gcc.target/sh/pr52483-2.c	(working copy)
@@ -1,14 +1,15 @@
-/* Check that loads from volatile mems utilize displacement addressing
-   modes and do not result in redundant sign extensions. */
+/* Check that loads/stores from/to volatile mems utilize displacement
+   addressing modes and do not result in redundant sign/zero extensions. */
 /* { dg-do compile { target "sh*-*-*" } } */
 /* { dg-options "-O1" } */
 /* { dg-skip-if "" { "sh*-*-*" } { "-m5*"} { "" } }  */
-/* { dg-final { scan-assembler-times "@\\(5,"

Re: [PATCH, PR58805] Add missing check in stmt_local_def for tail-merge

2013-10-22 Thread Jeff Law

On 10/22/13 03:58, Tom de Vries wrote:

Richard,

This patch adds a missing check for gimple_vdef in stmt_local_def for the
tail-merge pass.

Bootstrapped and reg-tested on x86_64.

OK for trunk, gcc-4_8-branch?

Thanks,
- Tom

2013-10-22  Tom de Vries  

PR tree-optimization/58805
* tree-ssa-tail-merge.c (stmt_local_def): Add gimple_vdef check.

* gcc.dg/pr58805.c: New test.

Doesn't this test belong in an architecture specific directory?

Under what conditions can a statement have a VDEF but not be considered 
as having a side effect by gimple_has_side_effects?


It almost seems to me that gimple_has_side_effects may need updating.

jeff


Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Richard Henderson
On 10/22/2013 12:18 PM, Jeff Law wrote:
>> The only way I could think to positively ensure that normal operations
>> didn't get implemented via mpx insns is to describe the new patterns
>> with distinct modes.
> Presumably once we have a distinct mode, we do the right magic in
> HARD_REGNO_MODE_OK and that's how you get your guarantee.  I'm assuming we're
> exposing these to the register allocator (I haven't looked at the full series
> yet).

Yeah, the register allocator was supposed to be involved.
(And I need to review the series myself.)

> Presumably you need a distinct mode coming out of the front-end/gimple to
> ensure we get the new mode in RTL.

Yes, which is where I believe the new types come from as well.


r~


[PATCH] Use get_range_info in vect_recog_divmod_pattern

2013-10-22 Thread Jakub Jelinek
Hi!

If VRP tells us that oprnd is always >= 0 or always < 0, we can generate
better code for the divmod vectorization.
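
As a scalar illustration of why the range helps, here is a small sketch of
signed division by 3 via a high-part multiply.  The function names are made up
for this example, and the constant 0x55555556 is the usual 32-bit magic
multiplier for 3 (no add step, no post shift).  It assumes >> on negative
signed values is an arithmetic shift, which holds for GCC.  The general form
needs the sign correction t4 = n >> 31; when n is known to be non-negative,
t4 is 0 and the quotient is just the high-part product.

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of the signed 64-bit product a * b.
inline int32_t mulhi(int32_t a, int32_t b)
{
  return static_cast<int32_t>((static_cast<int64_t>(a) * b) >> 32);
}

// General case: q = t3 - t4, with t4 = n >> (prec - 1) being 0 or -1.
inline int32_t div3_general(int32_t n)
{
  int32_t t3 = mulhi(n, 0x55555556);
  int32_t t4 = n >> 31;
  return t3 - t4;
}

// Valid only when n >= 0 is known (e.g. from range info): t4 is 0, so
// both the shift and the subtraction disappear.
inline int32_t div3_nonneg(int32_t n)
{
  return mulhi(n, 0x55555556);
}

// Small self-check; call it from main() to run the sketch.
void demo()
{
  assert(div3_general(9) == 3);
  assert(div3_general(-9) == -3);
  assert(div3_general(-10) == -3);
  assert(div3_general(INT32_MIN) == INT32_MIN / 3);
  assert(div3_nonneg(100) == 33);
  assert(div3_nonneg(INT32_MAX) == INT32_MAX / 3);
}
```

The always-negative case is similar: t4 is the constant -1, so only the shift
goes away.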

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-10-22  Jakub Jelinek  

* tree-vect-patterns.c (vect_recog_divmod_pattern): Optimize
sequence based on get_range_info returned range.

--- gcc/tree-vect-patterns.c.jj 2013-09-20 09:42:43.048260891 +0200
+++ gcc/tree-vect-patterns.c    2013-10-22 18:21:21.518563159 +0200
@@ -2226,20 +2226,19 @@ vect_recog_divmod_pattern (vec *
   if (post_shift >= prec)
return NULL;
 
-  /* t1 = oprnd1 h* ml;  */
+  /* t1 = oprnd0 h* ml;  */
   t1 = vect_recog_temp_ssa_var (itype, NULL);
   def_stmt
= gimple_build_assign_with_ops (MULT_HIGHPART_EXPR, t1, oprnd0,
build_int_cst (itype, ml));
-  append_pattern_def_seq (stmt_vinfo, def_stmt);
 
   if (add)
{
  /* t2 = t1 + oprnd0;  */
+ append_pattern_def_seq (stmt_vinfo, def_stmt);
  t2 = vect_recog_temp_ssa_var (itype, NULL);
  def_stmt
= gimple_build_assign_with_ops (PLUS_EXPR, t2, t1, oprnd0);
- append_pattern_def_seq (stmt_vinfo, def_stmt);
}
   else
t2 = t1;
@@ -2247,27 +2246,57 @@ vect_recog_divmod_pattern (vec *
   if (post_shift)
{
  /* t3 = t2 >> post_shift;  */
+ append_pattern_def_seq (stmt_vinfo, def_stmt);
  t3 = vect_recog_temp_ssa_var (itype, NULL);
  def_stmt
= gimple_build_assign_with_ops (RSHIFT_EXPR, t3, t2,
build_int_cst (itype, post_shift));
- append_pattern_def_seq (stmt_vinfo, def_stmt);
}
   else
t3 = t2;
 
-  /* t4 = oprnd0 >> (prec - 1);  */
-  t4 = vect_recog_temp_ssa_var (itype, NULL);
-  def_stmt
-   = gimple_build_assign_with_ops (RSHIFT_EXPR, t4, oprnd0,
-   build_int_cst (itype, prec - 1));
-  append_pattern_def_seq (stmt_vinfo, def_stmt);
-
-  /* q = t3 - t4;  or q = t4 - t3;  */
-  q = vect_recog_temp_ssa_var (itype, NULL);
-  pattern_stmt
-   = gimple_build_assign_with_ops (MINUS_EXPR, q, d < 0 ? t4 : t3,
-   d < 0 ? t3 : t4);
+  double_int oprnd0_min, oprnd0_max;
+  int msb = 1;
+  if (get_range_info (oprnd0, &oprnd0_min, &oprnd0_max) == VR_RANGE)
+   {
+ if (!oprnd0_min.is_negative ())
+   msb = 0;
+ else if (oprnd0_max.is_negative ())
+   msb = -1;
+   }
+
+  if (msb == 0 && d >= 0)
+   {
+ /* q = t3;  */
+ q = t3;
+ pattern_stmt = def_stmt;
+   }
+  else
+   {
+ /* t4 = oprnd0 >> (prec - 1);
+or if we know from VRP that oprnd0 >= 0
+t4 = 0;
+or if we know from VRP that oprnd0 < 0
+t4 = -1;  */
+ append_pattern_def_seq (stmt_vinfo, def_stmt);
+ t4 = vect_recog_temp_ssa_var (itype, NULL);
+ if (msb != 1)
+   def_stmt
+ = gimple_build_assign_with_ops (INTEGER_CST,
+ t4, build_int_cst (itype, msb),
+ NULL_TREE);
+ else
+   def_stmt
+ = gimple_build_assign_with_ops (RSHIFT_EXPR, t4, oprnd0,
+ build_int_cst (itype, prec - 1));
+ append_pattern_def_seq (stmt_vinfo, def_stmt);
+
+ /* q = t3 - t4;  or q = t4 - t3;  */
+ q = vect_recog_temp_ssa_var (itype, NULL);
+ pattern_stmt
+   = gimple_build_assign_with_ops (MINUS_EXPR, q, d < 0 ? t4 : t3,
+   d < 0 ? t3 : t4);
+   }
 }
 
   if (rhs_code == TRUNC_MOD_EXPR)

Jakub


Re: [PING] 3 patches waiting for approval/review

2013-10-22 Thread Andreas Krebbel
On 16/10/13 22:25, Jeff Law wrote:
> On 10/11/13 11:23, Andreas Krebbel wrote:
>> On 10/10/13 18:41, Jeff Law wrote:
>>> On 10/10/13 04:00, Andreas Krebbel wrote:
>>>> On 09/10/13 21:46, Jeff Law wrote:
>>>>> On 08/21/13 03:21, Andreas Krebbel wrote:
>>>>>> [RFC] Allow functions calling mcount before prologue to be leaf functions
>>>>>> http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00993.html
>>>>> I don't think this is necessarily correct for all targets.  ISTM the
>>>>> ability to consider a function calling mcount as a leaf needs to be a
>>>>> property of the target.
>>>>
>>>> We have already "profile_before_prologue" as a target property. Shouldn't
>>>> this be enough to decide upon this? When a function calls mcount before
>>>> the prologue it shouldn't matter whether the function is leaf or not.
>>> I don't think so, I think it'd break the PA's 32 bit ABI, maybe the 64
>>> bit ABI as well.  It's the caller's responsibility to build a mini stack
>>> frame if the function makes any calls.  If the code in the prologue
>>> expander uses "leafness" to make the decision about whether or not to
>>> allocate the mini frame, then it'd do the wrong thing here.
>>
>> Since it seems to be about PROFILE_HOOKS vs FUNCTION_PROFILER targets what 
>> about the following patch:
> It's not about PROFILE_HOOKS vs FUNCTION_PROFILER, but the ABI and how 
> the backend changes its behavior on the return value of leaf_function_p.
> 
> Effectively you want to have functions which are not leaf functions 
> (they call mcount) be treated as leaf functions.  I have great concerns 
> about the safety of allowing that without experts on each port weighing 
> in on its correctness for their port.
> 
> 
> I still really feel this should be a target hook that is off by default 
> so that the target maintainers can audit their target to ensure it 
> operates correctly.
> 
> Maybe I'm missing something, so perhaps another approach.  Can you 
> explain why you think it is safe to ignore calls to mcount when trying 
> to determine if a function is a leaf or not?

In general it is not safe to ignore calls to mcount. But if a target does
insert the call to mcount before the function prologue then mcount will not
use the stack space allocated by the current function. It will instead use
the stack space allocated by the caller for the current function. The
current function can still be a leaf function since the call does not happen
within the scope of its stack frame.

The difference between PROFILE_HOOKS and FUNCTION_PROFILER is that for the
first the call to mcount is inserted always after the function prologue no
matter what the backend returns for the profile_before_prologue target hook.

Bye,

-Andreas-


> 
> 
> BTW, there is a hunk of this patch that can and should go forward now 
> rather than later.  Specifically removing the profile_arc_flag part of 
> the test.  If you wanted to push that forward immediately, I'd support 
> that and you should consider it pre-approved.

Ok. Thanks.

> 
> 
> jeff
> 



[PATCH] Fix up pr58508.c testcase

2013-10-22 Thread Jakub Jelinek
Hi!

I've noticed that this testcase fails on i686-linux.  The problem is
that the vect.exp infrastructure is terrible and dg-options overrides
all the required flags.  So, vect/ testcases either don't have to have
explicit options at all (and the way to express options is through magic
naming of the tests), or one can use dg-additional-options.  But, in this
case you just want to pass the defaults.

Regtested on x86_64-linux and i686-linux, ok for trunk?

2013-10-22  Jakub Jelinek  

* gcc.dg/vect/pr58508.c: Remove dg-options.

--- gcc/testsuite/gcc.dg/vect/pr58508.c.jj  2013-10-21 09:00:49.0 +0200
+++ gcc/testsuite/gcc.dg/vect/pr58508.c 2013-10-22 14:21:16.064261179 +0200
@@ -1,5 +1,4 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
 
 
 /* The GCC vectorizer generates loop versioning for the following loop

Jakub


[PATCH] Small *-alias dump fixes

2013-10-22 Thread Jakub Jelinek
Hi!

I've noticed that some -fdump-tree-*-alias dumps look bad and contain stuff
like:
  :
  # # RANGE [0, 52]
  _1 = PHI <0(2), f2_12(3)>
  return _1;
or:
  :
  # PT = { D.1732 } (glob)
  # ALIGN = 32, MISALIGN = 0
  # p_1 = PHI <&MEM[(void *)&a + 128B](2), &MEM[(void *)&a + 512B](3)>
  PT = { D.1732 } (glob)
  # ALIGN = 32, MISALIGN = 0
  # p_3 = p_1 + 64;
Note that in both examples some lines get an extra "# " and some don't get
one when they should.
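
The intended layout can be modelled with a few lines of standalone code.  This
is only a sketch with a hypothetical helper (dump_phi building a std::string),
not the pretty-printer API: every info line carries its own "# " prefix, and
the PHI itself gets a "# " only when the caller asks for the commented form.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the dump: "info" holds the SSA name annotations
// (points-to, alignment, range), "phi" the PHI text itself.
std::string dump_phi(const std::vector<std::string>& info,
                     const std::string& phi, bool comment)
{
  std::string out;
  // Each annotation is printed with "# " in front of it, never after it.
  for (const std::string& line : info)
    out += "# " + line + "\n";
  // The PHI is prefixed only in the commented (PHI node list) form.
  if (comment)
    out += "# ";
  out += phi + "\n";
  return out;
}

// Small self-check; call it from main() to run the sketch.
void demo()
{
  assert(dump_phi({"RANGE [0, 52]"}, "_1 = PHI <0(2), f2_12(3)>", true)
         == "# RANGE [0, 52]\n# _1 = PHI <0(2), f2_12(3)>\n");
  assert(dump_phi({"RANGE [0, 52]"}, "_1 = PHI <0(2), f2_12(3)>", false)
         == "# RANGE [0, 52]\n_1 = PHI <0(2), f2_12(3)>\n");
}
```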

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2013-10-22  Jakub Jelinek  

* gimple-pretty-print.c (dump_ssaname_info): Always print "# " before
the info, not after it.
(dump_gimple_phi): Add COMMENT argument, if true, print "# " after
dump_ssaname_info call.
(pp_gimple_stmt_1): Adjust caller.
(dump_phi_nodes): Likewise.  Don't print "# " here.

--- gcc/gimple-pretty-print.c.jj    2013-10-17 22:30:56.0 +0200
+++ gcc/gimple-pretty-print.c   2013-10-22 14:04:53.272329144 +0200
@@ -1714,7 +1714,7 @@ dump_ssaname_info (pretty_printer *buffe
 {
   unsigned int align, misalign;
   struct ptr_info_def *pi = SSA_NAME_PTR_INFO (node);
-  pp_string (buffer, "PT = ");
+  pp_string (buffer, "# PT = ");
   pp_points_to_solution (buffer, &pi->pt);
   newline_and_indent (buffer, spc);
   if (get_ptr_info_alignment (pi, &align, &misalign))
@@ -1722,7 +1722,6 @@ dump_ssaname_info (pretty_printer *buffe
  pp_printf (buffer, "# ALIGN = %u, MISALIGN = %u", align, misalign);
  newline_and_indent (buffer, spc);
}
-  pp_string (buffer, "# ");
 }
 
   if (!POINTER_TYPE_P (TREE_TYPE (node))
@@ -1732,7 +1731,7 @@ dump_ssaname_info (pretty_printer *buffe
   value_range_type range_type = get_range_info (node, &min, &max);
 
   if (range_type == VR_VARYING)
-   pp_printf (buffer, "# RANGE  VR_VARYING");
+   pp_printf (buffer, "# RANGE VR_VARYING");
   else if (range_type == VR_RANGE || range_type == VR_ANTI_RANGE)
{
  pp_printf (buffer, "# RANGE ");
@@ -1749,10 +1748,11 @@ dump_ssaname_info (pretty_printer *buffe
 
 /* Dump a PHI node PHI.  BUFFER, SPC and FLAGS are as in pp_gimple_stmt_1.
The caller is responsible for calling pp_flush on BUFFER to finalize
-   pretty printer.  */
+   pretty printer.  If COMMENT is true, print this after #.  */
 
 static void
-dump_gimple_phi (pretty_printer *buffer, gimple phi, int spc, int flags)
+dump_gimple_phi (pretty_printer *buffer, gimple phi, int spc, bool comment,
+int flags)
 {
   size_t i;
   tree lhs = gimple_phi_result (phi);
@@ -1760,6 +1760,9 @@ dump_gimple_phi (pretty_printer *buffer,
   if (flags & TDF_ALIAS)
 dump_ssaname_info (buffer, lhs, spc);
 
+  if (comment)
+pp_string (buffer, "# ");
+
   if (flags & TDF_RAW)
 dump_gimple_fmt (buffer, spc, flags, "%G <%T, ", phi,
 gimple_phi_result (phi));
@@ -2095,7 +2098,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer
   break;
 
 case GIMPLE_PHI:
-  dump_gimple_phi (buffer, gs, spc, flags);
+  dump_gimple_phi (buffer, gs, spc, false, flags);
   break;
 
 case GIMPLE_OMP_PARALLEL:
@@ -2271,8 +2274,7 @@ dump_phi_nodes (pretty_printer *buffer,
   if (!virtual_operand_p (gimple_phi_result (phi)) || (flags & TDF_VOPS))
 {
   INDENT (indent);
-  pp_string (buffer, "# ");
-  dump_gimple_phi (buffer, phi, indent, flags);
+ dump_gimple_phi (buffer, phi, indent, true, flags);
   pp_newline (buffer);
 }
 }

Jakub


Re: *PING* Re: [Patch, Fortran] Use ANNOTATE_EXPR annot_expr_ivdep_kind for DO CONCURRENT

2013-10-22 Thread Steve Kargl
On Tue, Oct 22, 2013 at 08:30:44PM +0200, Tobias Burnus wrote:
> Two weeks ago I submitted the patch, available at: 
> http://gcc.gnu.org/ml/fortran/2013-10/msg00022.html ; while the ME patch 
> is not yet approved, the C FE was approved (latest C/ME patch: 
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01752.html).

The Fortran part is OK.

-- 
Steve


Re: [Patch, C++] Add C++ FE support for #pragma ivdep

2013-10-22 Thread Tobias Burnus
*PING* - with a slightly updated patch attached. Changes: New test case 
for an error message plus matching the wording to the C version.


Tobias

On October 10, 2013, Tobias Burnus wrote:
This patch depends on the middle-end / C FE patch: 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00655.html


It adds parsing support for #pragma ivdep and annotates the condition 
of C-like for loops such that one can later set the vectorization safe 
length (loop->safelen) to INT_MAX.  I considered adding the annotation 
also to C++11's range-based loops, but as those are unlikely to 
vectorize, I didn't do so. (If one does, one might end up having a 
CLEANUP_POINT_EXPR around the annotate; that implies that the annotate 
internal function is not the direct predecessor of the GIMPLE_COND.)


Another question is how much diagnostic output to give, e.g. when there 
is no condition: should there be an error or a warning, or should the 
pragma then be silently ignored?


Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
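
For illustration, a minimal sketch of the kind of loop the pragma is aimed at. It assumes the "#pragma GCC ivdep" spelling that the patch's diagnostics refer to; the function name axpy and its parameters are hypothetical and not part of the patch. Compilers that do not know the pragma should simply ignore it (possibly with a warning).

```cpp
#include <cstddef>

// Scale-and-accumulate over two arrays.  The pragma asserts that no
// loop-carried dependence prevents consecutive iterations from being
// executed with SIMD instructions.  The loop must be a C-style for loop
// with a condition; the condition (i < n) is what gets annotated.
void axpy(float* y, const float* x, float a, std::size_t n)
{
#pragma GCC ivdep
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

The pragma is only a promise from the programmer; if y and x actually overlap in a way that creates a dependence, the result of a vectorized loop is not guaranteed to match the scalar one.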


2013-08-22  Tobias Burnus  

	PR other/33426
	* parser.c (cp_parser_iteration_statement,
	cp_parser_for, cp_parser_c_for, cp_parser_pragma): Handle
	IVDEP pragma.

	* g++.dg/parse/ivdep.C: New.
	* g++.dg/vect/pr33426-ivdep.cc: New.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 10a7b96..46f6d18 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -1970,13 +1970,13 @@ static tree cp_parser_selection_statement
 static tree cp_parser_condition
   (cp_parser *);
 static tree cp_parser_iteration_statement
-  (cp_parser *);
+  (cp_parser *, bool);
 static bool cp_parser_for_init_statement
   (cp_parser *, tree *decl);
 static tree cp_parser_for
-  (cp_parser *);
+  (cp_parser *, bool);
 static tree cp_parser_c_for
-  (cp_parser *, tree, tree);
+  (cp_parser *, tree, tree, bool);
 static tree cp_parser_range_for
   (cp_parser *, tree, tree, tree);
 static void do_range_for_auto_deduction
@@ -9231,7 +9231,7 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 	case RID_WHILE:
 	case RID_DO:
 	case RID_FOR:
-	  statement = cp_parser_iteration_statement (parser);
+	  statement = cp_parser_iteration_statement (parser, false);
 	  break;
 
 	case RID_BREAK:
@@ -9892,7 +9892,7 @@ cp_parser_condition (cp_parser* parser)
not included. */
 
 static tree
-cp_parser_for (cp_parser *parser)
+cp_parser_for (cp_parser *parser, bool ivdep)
 {
   tree init, scope, decl;
   bool is_range_for;
@@ -9906,11 +9906,11 @@ cp_parser_for (cp_parser *parser)
   if (is_range_for)
 return cp_parser_range_for (parser, scope, init, decl);
   else
-return cp_parser_c_for (parser, scope, init);
+return cp_parser_c_for (parser, scope, init, ivdep);
 }
 
 static tree
-cp_parser_c_for (cp_parser *parser, tree scope, tree init)
+cp_parser_c_for (cp_parser *parser, tree scope, tree init, bool ivdep)
 {
   /* Normal for loop */
   tree condition = NULL_TREE;
@@ -9924,7 +9924,20 @@ cp_parser_c_for (cp_parser *parser, tree scope, tree init)
 
   /* If there's a condition, process it.  */
   if (cp_lexer_next_token_is_not (parser->lexer, CPP_SEMICOLON))
-condition = cp_parser_condition (parser);
+{
+  condition = cp_parser_condition (parser);
+  if (ivdep)
+	{
+	  condition = build1 (ANNOTATE_EXPR, TREE_TYPE (condition), condition);
+	  SET_ANNOTATE_EXPR_ID (condition, annot_expr_ivdep_kind);
+	}
+}
+  else if (ivdep)
+{
+  cp_parser_error (parser, "missing loop condition in loop with "
+		   "%<GCC ivdep%> pragma");
+  condition = error_mark_node;
+}
   finish_for_cond (condition, stmt);
   /* Look for the `;'.  */
   cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
@@ -10287,7 +10300,7 @@ cp_parser_range_for_member_function (tree range, tree identifier)
Returns the new WHILE_STMT, DO_STMT, FOR_STMT or RANGE_FOR_STMT.  */
 
 static tree
-cp_parser_iteration_statement (cp_parser* parser)
+cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
 {
   cp_token *token;
   enum rid keyword;
@@ -10360,7 +10373,7 @@ cp_parser_iteration_statement (cp_parser* parser)
 	/* Look for the `('.  */
 	cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN);
 
-	statement = cp_parser_for (parser);
+	statement = cp_parser_for (parser, ivdep);
 
 	/* Look for the `)'.  */
 	cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN);
@@ -30906,6 +30919,20 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context)
 		"%<#pragma omp sections%> construct");
   break;
 
+case PRAGMA_IVDEP:
+  {
+	cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+	cp_token *tok;
+	tok = cp_lexer_peek_token (the_parser->lexer);
+	if (tok->type != CPP_KEYWORD || tok->keyword != RID_FOR)
+	  {
+	cp_parser_error (parser, "for statement expected");
+	return false;
+	  }
+	cp_parser_iteration_statement (parser, true);
+	return true;
+  }
+
 default:
   gcc_assert (id >= PRAGMA_FIRST_EXTERNAL);
   c_invoke_pragma_handler (id);
diff --git a/gcc/testsuite/g++.

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Jeff Law

On 10/22/13 13:12, Richard Henderson wrote:
> On 10/21/2013 11:10 AM, Jeff Law wrote:
>> So why are bounds distinct modes?  Is there some inherent reason why bounds
>> are something other than an integer mode (MODE_INT)?
>
> I suggested the distinct modes during the NDA phase.
>
> The primary reason for this is that MPX is designed to be kind of
> backward compatible with previous ISAs, operating as nops.  Thus
> we cannot allow the compiler to use the MPX registers for anything
> besides implementing the bounds checking.

Right.

> The only way I could think to positively ensure that normal operations
> didn't get implemented via mpx insns is to describe the new patterns
> with distinct modes.

Presumably once we have a distinct mode, we do the right magic in 
HARD_REGNO_MODE_OK and that's how you get your guarantee.  I'm assuming 
we're exposing these to the register allocator (I haven't looked at the 
full series yet).


Presumably you need a distinct mode coming out of the front-end/gimple 
to ensure we get the new mode in RTL.


It all seems reasonable -- I wasn't asking Ilya to change anything, I'm 
just trying to understand the rationale before going any further with 
the patch and this helps considerably.



jeff


Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Richard Henderson
On 10/21/2013 11:10 AM, Jeff Law wrote:
> So why are bounds distinct modes?  Is there some inherent reason why bounds
> are something other than an integer mode (MODE_INT)?

I suggested the distinct modes during the NDA phase.

The primary reason for this is that MPX is designed to be kind of
backward compatible with previous ISAs, operating as nops.  Thus
we cannot allow the compiler to use the MPX registers for anything
besides implementing the bounds checking.

The only way I could think to positively ensure that normal operations
didn't get implemented via mpx insns is to describe the new patterns
with distinct modes.


r~


Re: [Dwarf Patch] Generate -ggnu-pubnames

2013-10-22 Thread Sterling Augustine
On Tue, Oct 22, 2013 at 12:05 PM, Sterling Augustine
 wrote:
> On Thu, Oct 17, 2013 at 12:09 PM, Sterling Augustine
>  wrote:
>> The enclosed patch (which depends on
>> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01395.html), ports the
>> new -ggnu-pubnames option from google 4.8 to trunk.
>>
>> -ggnu-pubnames enables the gold linker to generate a better
>> .gdb_index section by including various bits of data about the symbol.
>>
>> OK for trunk?

For the record, ccout...@google.com approved this for trunk off line.


Re: [Dwarf Patch] Generate -ggnu-pubnames

2013-10-22 Thread Sterling Augustine
On Thu, Oct 17, 2013 at 12:09 PM, Sterling Augustine
 wrote:
> The enclosed patch (which depends on
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01395.html), ports the
> new -ggnu-pubnames option from google 4.8 to trunk.
>
> -ggnu-pubnames enables the gold linker to generate a better
> .gdb_index section by including various bits of data about the symbol.
>
> OK for trunk?
>
> Sterling
>
> 2013-10-17
>
> * doc/invoke.texi: Document -ggnu-pubnames.
> * common.opt: Add new option -ggnu-pubnames and modify -gpubnames
> logic.
> * dwarf2out.c: Include gdb/gdb-index.h.
> (DEBUG_PUBNAMES_SECTION, DEBUG_PUBTYPES_SECTION): Handle
> debug_generate_pub_sections.
> (is_java, output_pubtables, output_pubname): New functions.
> (include_pubname_in_output): Handle debug_generate_pub_sections at
> level 2.
> (size_of_pubnames): Use new local space_for_flags based on
> debug_generate_pub_sections.
> (output_pubnames): Unify pubnames and pubtypes output logic.
> Genericize comments.  Call output_pubname.
> (dwarf2out_finish): Move logic to output_pubnames and call it.

Checked in as below, which has a minor change to not ICE when
encountering unknown DW_TAGs.
Index: dwarf2out.c
===
--- dwarf2out.c	(revision 203930)
+++ dwarf2out.c	(working copy)
@@ -93,6 +93,7 @@
 #include "dumpfile.h"
 #include "opts.h"
 #include "tree-dfa.h"
+#include "gdb/gdb-index.h"
 
 static void dwarf2out_source_line (unsigned int, const char *, int, bool);
 static rtx last_var_location_insn;
@@ -3323,10 +3324,14 @@
 #define DEBUG_DWO_LOC_SECTION  ".debug_loc.dwo"
 #endif
 #ifndef DEBUG_PUBNAMES_SECTION
-#define DEBUG_PUBNAMES_SECTION	".debug_pubnames"
+#define DEBUG_PUBNAMES_SECTION	\
+  ((debug_generate_pub_sections == 2) \
+   ? ".debug_gnu_pubnames" : ".debug_pubnames")
 #endif
 #ifndef DEBUG_PUBTYPES_SECTION
-#define DEBUG_PUBTYPES_SECTION	".debug_pubtypes"
+#define DEBUG_PUBTYPES_SECTION	\
+  ((debug_generate_pub_sections == 2) \
+   ? ".debug_gnu_pubtypes" : ".debug_pubtypes")
 #endif
 #define DEBUG_NORM_STR_OFFSETS_SECTION ".debug_str_offsets"
 #define DEBUG_DWO_STR_OFFSETS_SECTION ".debug_str_offsets.dwo"
@@ -4569,6 +4574,16 @@
   return lang == DW_LANG_C_plus_plus || lang == DW_LANG_ObjC_plus_plus;
 }
 
+/* Return TRUE if the language is Java.  */
+
+static inline bool
+is_java (void)
+{
+  unsigned int lang = get_AT_unsigned (comp_unit_die (), DW_AT_language);
+
+  return lang == DW_LANG_Java;
+}
+
 /* Return TRUE if the language is Fortran.  */
 
 static inline bool
@@ -7960,6 +7975,12 @@
 static bool
 include_pubname_in_output (vec *table, pubname_entry *p)
 {
+  /* By limiting gnu pubnames to definitions only, gold can generate a
+ gdb index without entries for declarations, which don't include
+ enough information to be useful.  */
+  if (debug_generate_pub_sections == 2 && is_declaration_die (p->die))
+return false;
+
   if (table == pubname_table)
 {
   /* Enumerator names are part of the pubname table, but the
@@ -7989,11 +8010,12 @@
   unsigned long size;
   unsigned i;
   pubname_ref p;
+  int space_for_flags = (debug_generate_pub_sections == 2) ? 1 : 0;
 
   size = DWARF_PUBNAMES_HEADER_SIZE;
   FOR_EACH_VEC_ELT (*names, i, p)
 if (include_pubname_in_output (names, p))
-  size += strlen (p->name) + DWARF_OFFSET_SIZE + 1;
+  size += strlen (p->name) + DWARF_OFFSET_SIZE + 1 + space_for_flags;
 
   size += DWARF_OFFSET_SIZE;
   return size;
@@ -9146,6 +9168,76 @@
 }
 }
 
+/* Output a single entry in the pubnames table.  */
+
+static void
+output_pubname (dw_offset die_offset, pubname_entry *entry)
+{
+  dw_die_ref die = entry->die;
+  int is_static = get_AT_flag (die, DW_AT_external) ? 0 : 1;
+
+  dw2_asm_output_data (DWARF_OFFSET_SIZE, die_offset, "DIE offset");
+
+  if (debug_generate_pub_sections == 2)
+{
+  /* This logic follows gdb's method for determining the value of the flag
+ byte.  */
+  uint32_t flags = GDB_INDEX_SYMBOL_KIND_NONE;
+  switch (die->die_tag)
+  {
+case DW_TAG_typedef:
+case DW_TAG_base_type:
+case DW_TAG_subrange_type:
+  GDB_INDEX_SYMBOL_KIND_SET_VALUE(flags, GDB_INDEX_SYMBOL_KIND_TYPE);
+  GDB_INDEX_SYMBOL_STATIC_SET_VALUE(flags, 1);
+  break;
+case DW_TAG_enumerator:
+  GDB_INDEX_SYMBOL_KIND_SET_VALUE(flags,
+  GDB_INDEX_SYMBOL_KIND_VARIABLE);
+  if (!is_cxx () && !is_java ())
+GDB_INDEX_SYMBOL_STATIC_SET_VALUE(flags, 1);
+  break;
+case DW_TAG_subprogram:
+  GDB_INDEX_SYMBOL_KIND_SET_VALUE(flags,
+  GDB_INDEX_SYMBOL_KIND_FUNCTION);
+  if (!is_ada ())
+GDB_INDEX_SYMBOL_STATIC_SET_VALUE(flags, is_static);
+  break;
+case DW_TAG_constant:
+  GDB_INDEX_SYMBOL_KIND_SET_VALUE(flags,
+ 

Re: [RFC] Isolate & simplify paths with undefined behaviour

2013-10-22 Thread Jeff Law

On 10/18/13 14:31, Marc Glisse wrote:
> On Fri, 18 Oct 2013, Jeff Law wrote:
>
>> On 10/18/13 12:47, Marc Glisse wrote:
>>> * tree-vrp has a function infer_nonnull_range, do you think we could
>>> share it? We now store the VRP ranges for integers, but not for
>>> pointers. If we did (or maybe just a non-null bit), the pass could just
>>> test that bit on the variable found in the PHI.
>>
>> I'm not sure what can really be shared here -- this patch searches for
>> PHIs where one or more of the args is a NULL pointer.  The NULL
>> pointer will be explicit in the PHI arg.
>
> But once you have that pointer defined by a PHI containing a zero, you
> look at all its uses, trying to find one that proves the pointer is
> non-zero (only dereferences for now, but you have a comment about the
> non-null attribute). And infer_nonnull_range precisely says whether a
> statement proves that a pointer is non-zero (well, there may be a few
> subtle differences, and some tests might need to move between
> infer_value_range and infer_nonnull_range). I am just talking of
> replacing 20 lines of code with a function call, not a huge sharing I
> agree...
>
> Storing the VRP info might not actually help, since you need to know
> starting from which statement the pointer is non-zero.

So I was poking at this a bit.  It's trivial to use infer_nonnull_range 
and to teach infer_nonnull_range to use the returns_nonnull attribute to 
pick up that return x in an appropriately decorated function implies 
that x is non-null.


We'll need a better place to shove infer_nonnull_range so that it's 
available to both users.


I also looked at having VRP use the returns_nonnull to tighten ranges in 
the current function (when it's appropriately decorated).  However, 
that's a bit more problematic.  By the time we process the return 
statement, we've likely already taken the return value's range to 
VARYING.  Going back down the lattice (to ~[0, 0]) is generally 
considered bad.  Given how rarely I expect this to help, I'm dropping 
this part on the floor.



The hack I had to avoid processing a block multiple times if a single 
SSA_NAME has multiple uses in the block was totally unnecessary.  All 
the right things happen if that hack gets removed and isolate_path is 
slightly adjusted.  So if we have X that we've determined as a NULL 
value on some path, given a block like


x->y =
blah blah
x->z =
fu bar


If we find the x->z use first in the immediate use iterator, we'll first 
transform into

x->y =
blah blah
trap

Then we see the x->y use and transform into
trap


Which is exactly what we want.



jeff
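
A source-level sketch of the block shape discussed above, assuming the conditional expression ends up as a PHI with an explicit NULL argument. The names S and store_both are hypothetical and only serve the illustration; this is not code from the patch.

```cpp
struct S {
    int y;
    int z;
};

// x merges a NULL value and a real pointer, much like a PHI with an
// explicit NULL argument.  On the path where use_null is true, the first
// store (x->y) already dereferences NULL, so everything from that store
// onwards on that path could be replaced by a trap, which matches the
// final "trap" form in the message above.
int store_both(S* p, bool use_null)
{
    S* x = use_null ? nullptr : p;
    x->y = 1;
    x->z = 2;
    return x->y + x->z;
}
```

Calling store_both with use_null set to true is undefined behaviour, which is exactly what allows the isolated path to be simplified; only the non-NULL path has a defined result.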


Re: [PATCH] Fix various reassoc issues (PR tree-optimization/58791, tree-optimization/58775)

2013-10-22 Thread Jeff Law

On 10/22/13 07:09, Jakub Jelinek wrote:
> Hi!
>
> I've spent over two days looking at reassoc, fixing spots where
> we invalidly reused SSA_NAMEs (this results in wrong-debug, as the added
> guality testcases show, even some ICEs (pr58791-3.c) and wrong range info
> for SSA_NAMEs)

This is something we all need to remember: directly reusing an existing 
SSA_NAME for a new value and the like is bad.  It's far better to 
release the name back to the manager and grab a new one.

> and cleaning up the stmt scheduling stuff (e.g. all gsi_move*
> calls are gone, if we need to "move" something or set an SSA_NAME to
> different value than previously, we'll now always create new
> stmt and the old one depending on the case either remove or mark as visited
> zero uses, so that it will be removed later on by reassociate_bb.

Goodness.

> Of course some gimple_assign_set_rhs* etc. calls are still valid even
> without creating new stmts, optimizing some statement to equivalent
> computation is fine, but computing something different in an old SSA_NAME
> is not.

Right.


Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

For 4.8 a partial backport would be possible, but quite a lot of work,
for 4.7 I'd prefer not to backport given that there gsi_for_stmt isn't O(1).

2013-10-22  Jakub Jelinek  

PR tree-optimization/58775
PR tree-optimization/58791
* tree-ssa-reassoc.c (reassoc_stmt_dominates_stmt_p): New function.
(insert_stmt_after): Rewritten, don't move the stmt, but really
insert it.
(get_stmt_uid_with_default): Remove.
(build_and_add_sum): Use insert_stmt_after and
reassoc_stmt_dominates_stmt_p.  Fix up uid if bb contains only
labels.
(update_range_test): Set uid on stmts added by
force_gimple_operand_gsi.  Don't immediately modify statements
in inter-bb optimization, just update oe->op values.
(optimize_range_tests): Return bool whether any changes have
been made.
(update_ops): New function.
(struct inter_bb_range_test_entry): New type.
(maybe_optimize_range_tests): Perform statement changes here.
(not_dominated_by, appears_later_in_bb, get_def_stmt,
ensure_ops_are_available): Remove.
(find_insert_point): Rewritten.
(rewrite_expr_tree): Remove MOVED argument, add CHANGED argument,
return LHS of the (new resp. old) stmt.  Don't call
ensure_ops_are_available, don't reuse SSA_NAMEs, recurse first
instead of last, move new stmt at the right place.
(linearize_expr, repropagate_negates): Don't reuse SSA_NAMEs.
(negate_value): Likewise.  Set uids.
(break_up_subtract_bb): Initialize uids.
(reassociate_bb): Adjust rewrite_expr_tree caller.
(do_reassoc): Don't call renumber_gimple_stmt_uids.

* gcc.dg/guality/pr58791-1.c: New test.
* gcc.dg/guality/pr58791-2.c: New test.
* gcc.dg/guality/pr58791-3.c: New test.
* gcc.dg/guality/pr58791-4.c: New test.
* gcc.dg/guality/pr58791-5.c: New test.
* gcc.c-torture/compile/pr58775.c: New test.
* gcc.dg/tree-ssa/reassoc-28.c: Don't scan reassoc1 dump.

--- gcc/tree-ssa-reassoc.c.jj   2013-10-21 09:00:25.0 +0200
+++ gcc/tree-ssa-reassoc.c  2013-10-22 12:04:38.693490273 +0200
@@ -1143,12 +1143,80 @@ zero_one_operation (tree *def, enum tree
while (1);
  }

-/* Returns the UID of STMT if it is non-NULL. Otherwise return 1.  */
+/* Returns true if statement S1 dominates statement S2.  Like
+   stmt_dominates_stmt_p, but uses stmt UIDs to optimize.  */

-static inline unsigned
-get_stmt_uid_with_default (gimple stmt)
+static bool
+reassoc_stmt_dominates_stmt_p (gimple s1, gimple s2)
+{
+  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
+
+  if (!bb1 || s1 == s2)
+return true;
+
+  if (!bb2)
+return false;
Maybe this was carried over from somewhere else, but that looks awful 
strange.  When !bb1 you return true, but if !bb2 you return false.  At 
the least it deserves a comment.






+
+  if (bb1 == bb2)
+{
+  if (gimple_code (s2) == GIMPLE_PHI)
+   return false;
+
+  if (gimple_code (s1) == GIMPLE_PHI)
+   return true;
Deserves a comment.  I know what's going on here, but it's easier to 
read it in a comment rather than recalling the all-phis-in-parallel rule 
and verifying this handles that correctly.






+
+  if (gimple_uid (s1) < gimple_uid (s2))
+   return true;
+
+  if (gimple_uid (s1) > gimple_uid (s2))
+   return false;
So if one (but not both) of the UIDs isn't set yet, one of these two 
statements will return, which seems wrong since we don't know where the 
statement without a UID is relative to the statement with a UID.  Am I 
missing something?




+
+  gimple_stmt_iterator gsi = gsi_for_stmt (s1);
+  unsigned int uid = gimple_uid (s1);
+  for (gsi_next (&gsi); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+
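
The same-block ordering checks reviewed above can be modelled with a small toy, shown here only to make the PHI and UID rules explicit. ToyStmt and toy_dominates_in_block are hypothetical names; this is not GCC code, it assumes every uid is already set (so the unset-UID question raised above is not modelled), and it does not model the statement walk the patch uses when two UIDs are equal.

```cpp
// Hypothetical stand-in for a statement inside one basic block.
struct ToyStmt {
    bool is_phi;    // PHIs all execute "in parallel" at block entry
    unsigned uid;   // position in the block; assumed valid for every stmt
};

// Returns true if s1 is executed no later than s2 within the same block.
bool toy_dominates_in_block(const ToyStmt& s1, const ToyStmt& s2)
{
    if (&s1 == &s2)
        return true;
    // Nothing else in the block comes before a PHI, not even another PHI.
    if (s2.is_phi)
        return false;
    // A PHI comes before every non-PHI statement of its block.
    if (s1.is_phi)
        return true;
    // Ordinary statements: order by UID.  Equal UIDs would need a walk
    // over the statement list, which this sketch leaves out.
    return s1.uid < s2.uid;
}
```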

Re: [libstdc++,C++11] Library style for user-defined literal operators

2013-10-22 Thread Paolo Carlini


Hi

3dw...@verizon.net wrote:
>Let me try that again. Sorry for the dupe and the bad subject in the
>previous message.
>
>This patch fixes a small stylistic nit in the user-defined literal
>operators in the standard library.
>
>I propose we prefer:
>  operator""suf - with no space
>rather than:
>  operator"" suf - with space
>
>It is only strictly necessary to have no space between quotes and suffix
>identifier when the suffix identifier is a keyword. On the other hand,
>consistently using no space means never having to say you're sorry. It
>is also consistent with our style of not having space between operator
>and, say '+'. I believe that allowing the space in the first
>place might have been a mistake in the standard. The (to be committed
>tonight) literal operators for complex need to have no space
>because the suffix is 'if'. We might as well be consistent. I think
>this is better style too.

Looks Ok to me.

Thanks,
Paolo
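
A small sketch of the no-space style in user code, for illustration only. The suffix _kib is a made-up example; user code has to use a leading underscore, since suffixes without one are reserved for the standard library.

```cpp
// Declared with no space between the quotes and the suffix, the style
// proposed above.  With no space, the quotes and suffix form a single
// token, which is why a keyword-like library suffix such as 'if' can
// only be declared this way.
constexpr unsigned long long operator""_kib(unsigned long long n)
{
    return n * 1024ULL;
}
```

With this in scope, 4_kib evaluates to 4096 at compile time.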



Re: [GOOGLE] Check if varpool node exist for decl before checking if it's from auxiliary module.

2013-10-22 Thread Xinliang David Li
ok.

David

On Tue, Oct 22, 2013 at 10:51 AM, Dehao Chen  wrote:
> This fixes a LIPO bug when -fexceptions is on.
>
> When compilation is finished, compile_file calls
> dw2_output_indirect_constants, which may generate decls like
> DW.ref.__gxx_personality_v0 (generated in
> dw2_output_indirect_constant_1). This function is a global function,
> but does not have associated varpool node. So original code will
> segfault when checking if the varpool node is from auxiliary module.
>
> Verified that the segfault is fixed with the patch. Regression test on-going.
>
> OK for google-4_8 branch if reg test is green?
>
> Thanks,
> Dehao
>
> Index: gcc/varasm.c
> ===
> --- gcc/varasm.c (revision 203910)
> +++ gcc/varasm.c (working copy)
> @@ -1484,7 +1484,7 @@ notice_global_symbol (tree decl)
>if (L_IPO_COMP_MODE
>&& ((TREE_CODE (decl) == FUNCTION_DECL
> && cgraph_is_auxiliary (decl))
> -  || (TREE_CODE (decl) == VAR_DECL
> +  || (TREE_CODE (decl) == VAR_DECL && varpool_get_node (decl)
>&& varpool_is_auxiliary (varpool_get_node (decl)
>  return;


[PATCH, i386]: Fix PR 58779, wrong MINUS overflow checks

2013-10-22 Thread Uros Bizjak
Hello!

As explained in the PR [1], the pattern that implements MINUS overflow
checks is wrong. The combine pass is able to create
*subsi3_cconly_overflow pattern, but since the functionality of the
pattern depends on inverted conditions in put_condition_mode, the
final jump in the combined sequence gets the inverted condition.

The attached patch removes wrong definitions.

2013-10-22  Uros Bizjak  

PR target/58779
* config/i386/i386.c (put_condition_code) :
Remove CCCmode handling.
: Return 'c' suffix for CCCmode.
: Return 'nc' suffix for CCCmode.
(ix86_cc_mode) : Do not generate overflow checks.
* config/i386/i386.md (*sub3_cconly_overflow): Remove.
(*sub3_cc_overflow): Ditto.
(*subsi3_zext_cc_overflow): Ditto.

testsuite/ChangeLog:

2013-10-22  Uros Bizjak  

PR target/58779
* gcc.target/i386/pr30315.c: Remove MINUSCC, DECCC, MINUSCCONLY
and MINUSCCZEXT defines. Update scan-assembler dg directive.
* gcc.dg/torture/pr58779.c: New test.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32} for all default languages plus obj-c++ and go.

Patch was committed to mainline SVN and will be committed to release
branches in a couple of days.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58779#c5
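
For readers unfamiliar with the idiom, the check that ix86_cc_mode used to special-case (an unsigned comparison of a MINUS result against its first operand) corresponds roughly to the source-level test below. This is only a sketch of the arithmetic identity, with hypothetical function names; it says nothing about which instructions the compiler emits.

```cpp
// An unsigned subtraction wrapped around exactly when the result is
// larger than the minuend.
bool sub_wraps(unsigned a, unsigned b)
{
    return a - b > a;
}

// The same condition written directly: a borrow occurs exactly when a < b,
// which is the carry flag after comparing a with b.
bool sub_borrows(unsigned a, unsigned b)
{
    return a < b;
}
```

The two functions agree for all inputs, so the condition that ends up on the final jump has to be the "below" test on a and b, not its inverse.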

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 203860)
+++ config/i386/i386.c  (working copy)
@@ -14103,8 +14103,6 @@ put_condition_code (enum rtx_code code, enum machi
 Those same assemblers have the same but opposite lossage on cmov.  */
   if (mode == CCmode)
suffix = fp ? "nbe" : "a";
-  else if (mode == CCCmode)
-   suffix = "b";
   else
gcc_unreachable ();
   break;
@@ -14126,8 +14124,12 @@ put_condition_code (enum rtx_code code, enum machi
}
   break;
 case LTU:
-  gcc_assert (mode == CCmode || mode == CCCmode);
-  suffix = "b";
+  if (mode == CCmode)
+   suffix = "b";
+  else if (mode == CCCmode)
+   suffix = "c";
+  else
+   gcc_unreachable ();
   break;
 case GE:
   switch (mode)
@@ -14147,20 +14149,20 @@ put_condition_code (enum rtx_code code, enum machi
}
   break;
 case GEU:
-  /* ??? As above.  */
-  gcc_assert (mode == CCmode || mode == CCCmode);
-  suffix = fp ? "nb" : "ae";
+  if (mode == CCmode)
+   suffix = fp ? "nb" : "ae";
+  else if (mode == CCCmode)
+   suffix = "nc";
+  else
+   gcc_unreachable ();
   break;
 case LE:
   gcc_assert (mode == CCmode || mode == CCGCmode || mode == CCNOmode);
   suffix = "le";
   break;
 case LEU:
-  /* ??? As above.  */
   if (mode == CCmode)
suffix = "be";
-  else if (mode == CCCmode)
-   suffix = fp ? "nb" : "ae";
   else
gcc_unreachable ();
   break;
@@ -18862,12 +18864,7 @@ ix86_cc_mode (enum rtx_code code, rtx op0, rtx op1
return CCmode;
 case GTU:  /* CF=0 & ZF=0 */
 case LEU:  /* CF=1 | ZF=1 */
-  /* Detect overflow checks.  They need just the carry flag.  */
-  if (GET_CODE (op0) == MINUS
- && rtx_equal_p (op1, XEXP (op0, 0)))
-   return CCCmode;
-  else
-   return CCmode;
+  return CCmode;
   /* Codes possibly doable only with sign flag when
  comparing against zero.  */
 case GE:   /* SF=OF   or   SF=0 */
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 203860)
+++ config/i386/i386.md (working copy)
@@ -6486,7 +6486,7 @@
(set_attr "use_carry" "1")
(set_attr "mode" "")])
 
-;; Overflow setting add and subtract instructions
+;; Overflow setting add instructions
 
 (define_insn "*add3_cconly_overflow"
   [(set (reg:CCC FLAGS_REG)
@@ -6501,43 +6501,31 @@
   [(set_attr "type" "alu")
(set_attr "mode" "")])
 
-(define_insn "*sub3_cconly_overflow"
+(define_insn "*add3_cc_overflow"
   [(set (reg:CCC FLAGS_REG)
(compare:CCC
- (minus:SWI
-   (match_operand:SWI 0 "nonimmediate_operand" "m,")
-   (match_operand:SWI 1 "" ",m"))
- (match_dup 0)))]
-  ""
-  "cmp{}\t{%1, %0|%0, %1}"
-  [(set_attr "type" "icmp")
-   (set_attr "mode" "")])
-
-(define_insn "*3_cc_overflow"
-  [(set (reg:CCC FLAGS_REG)
-   (compare:CCC
-   (plusminus:SWI
-   (match_operand:SWI 1 "nonimmediate_operand" "0,0")
+   (plus:SWI
+   (match_operand:SWI 1 "nonimmediate_operand" "%0,0")
(match_operand:SWI 2 "" ",m"))
(match_dup 1)))
(set (match_operand:SWI 0 "nonimmediate_operand" "=m,")
-   (plusminus:SWI (match_dup 1) (match_dup 2)))]
-  "ix86_binary_operator_ok (, mode, operands)"
-  "{}\t{%2, %0|%0, %2}"
+   (plus:SWI (match_dup 1) (match_dup 2)))]
+  "ix86_binary_operator_ok (PLUS, mode, operands)

*PING* Re: [Patch, Fortran] Use ANNOTATE_EXPR annot_expr_ivdep_kind for DO CONCURRENT

2013-10-22 Thread Tobias Burnus
Two weeks ago I submitted the patch, available at: 
http://gcc.gnu.org/ml/fortran/2013-10/msg00022.html ; while the ME patch 
is not yet approved, the C FE was approved (latest C/ME patch: 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01752.html).


Additionally, I'd like to early ping for: 
http://gcc.gnu.org/ml/fortran/2013-10/msg00068.html


Tobias

PS: Actually, safelen (i.e. "GCC ivdep", "omp simd", Cilk's "simd") does 
not require that the result is independent of the loop-walking order. 
Just (quoting from my C patch for GCC ivdep): "With this pragma, the 
programmer asserts that there are no loop-carried dependencies which 
would prevent that consecutive iterations of the following loop can be 
executed concurrently with SIMD (single instruction multiple data) 
instructions."
Fortran's do concurrent requires more: "The DO CONCURRENT construct 
provides a means for the program to specify that individual loop
iterations have no interdependencies." (F2008, Introduction). Still, in 
terms of optimizing "do concurrent", setting safelen gives the ME all 
required information.



On Oct 08, 2013, Tobias Burnus wrote:
This patch requires my pending ME and C FE patch: 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00514.html


Using C/C++'s #pragma ivdep or – with the attached Fortran patch – "do 
concurrent", the loop condition is annotated such that later the 
loop's vectorization safelen is set to infinity (well, INT_MAX). The 
main purpose is to tell the compiler that the result is independent of 
the order in which the loop is walked. The typical case is pointer 
aliasing, in which the compiler either doesn't vectorize or adds a 
run-time aliasing check (loop versioning). With the annotation, the 
compiler simply assumes that there is no aliasing and avoids the 
versioning. – Contrary to C++ which does not even have the "restrict" 
qualifier (gcc and others do support __restrict) and cases where 
C/C++'s __restrict qualifier isn't sufficient/applicable, the effect 
on typical Fortran code should be smaller as most variables cannot 
alias. Still, in some cases it can help. (See test case for an example.)
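
To make the aliasing point concrete, here is a hedged C++ sketch of the two options mentioned above. All names (add_restrict, Buffers, add_ivdep) are hypothetical, and the "#pragma GCC ivdep" spelling is assumed from the C patch quoted earlier. The first function uses the __restrict extension on the parameters; the second reaches its pointers through a struct, a case where qualifying function parameters does not apply, and uses the pragma instead.

```cpp
#include <cstddef>

// Variant 1: __restrict (a compiler extension in C++) promises that dst
// and src do not overlap, so no run-time alias check is needed.
void add_restrict(int* __restrict dst, const int* __restrict src,
                  std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] += src[i];
}

// Variant 2: the pointers live in a struct, so there is no parameter to
// qualify.  The pragma makes the no-dependence promise for this loop only.
struct Buffers {
    int* dst;
    const int* src;
    std::size_t n;
};

void add_ivdep(const Buffers& b)
{
#pragma GCC ivdep
    for (std::size_t i = 0; i < b.n; ++i)
        b.dst[i] += b.src[i];
}
```

In both variants the promise is the caller's responsibility; passing overlapping buffers breaks it.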


There is an alternative to ivdep, which is lower level [1]: 
OpenMPv4's "omp simd" (with safelen=) for C/C++/Fortran and – for 
C/C++ – Cilk Plus's #pragma simd.


Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias

PS: I think the same annotation could be also used with FORALL and 
implied loops with whole-array/array-section assignments, when the FE 
knows that there is no aliasing between the LHS and RHS side. (In some 
cases, the FE knows this while the ME doesn't.)


PPS: My personal motivation is my long-standing wish to pass this 
information to the middle end for DO CONCURRENT but also to use the 
pragma for a specific C++ code.


[1] The OpenMPv4 support for C/C++ will be merged soon, for Fortran it 
will take a while (maybe still in 4.9, maybe only later). See 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00502.html / The relevant 
Cilk Plus patch has been posted at 
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01626.html




[Google 4.8 Patch] For gnu-pubnames, Ignore unrecognized DW_TAGs to match gdb behavior

2013-10-22 Thread Sterling Augustine
I have checked in the enclosed patch, which makes GCC ignore
unrecognized DW_TAGs when building gnu-pubnames, instead of ICEing.
This matches GDB behavior.

Google ref:11307370

Sterling
Index: dwarf2out.c
===
--- dwarf2out.c	(revision 203902)
+++ dwarf2out.c	(working copy)
@@ -9249,7 +9249,8 @@
 	GDB_INDEX_SYMBOL_STATIC_SET_VALUE(flags, 1);
   break;
 default:
-  gcc_unreachable ();
+  /* An unusual tag.  Leave the flag-byte empty.  */
+  break;
   }
   dw2_asm_output_data (1, flags >> GDB_INDEX_CU_BITSIZE,
"GDB-index flags");


Re: [patch, mips] Fix optimization bug involving nor instruction

2013-10-22 Thread Steve Ellcey
On Tue, 2013-10-22 at 19:12 +0100, Richard Sandiford wrote:

> >> Richard
> >
> > OK, but I am curious why you put parentheses around the right hand side
> > of the total expression.  I.e.  *total = ();
> 
> That's the "emacs formatting" rule:

OK, I have checked in the patch with the parentheses (and your other
changes).

Steve




Re: Fwd: [PATCH] Add gdb/gdb-index.h to gcc tree.

2013-10-22 Thread Sterling Augustine
On Fri, Oct 18, 2013 at 4:45 PM, DJ Delorie  wrote:
>
> I'm not sure either, but if it's been approved in gdb and you're
> willing to cede control of it to gcc's policies, I'm OK with it.
>
> Note that this will be a new directory in gcc, and I think the
> automerge scripts will automatically pick it up.  Which means, after
> committing it, any future changes must go to gcc's repo first (if even
> by a few seconds ;)

The GDB project is on board with the change in procedure. And I have
checked this in as posted.

Please let me know if there are any merge issues.

Thanks,

Sterling


Re: [patch, mips] Fix optimization bug involving nor instruction

2013-10-22 Thread Richard Sandiford
Steve Ellcey  writes:
> On Tue, 2013-10-22 at 18:34 +0100, Richard Sandiford wrote:
>> Good spot!
>
>> We should use set_src_cost for both operands (the idea being that only
>> a register is allowed, and that anything else will end up being a SET_SRC).
>> I think the formatting should be something like:
>> 
>>   /* (AND (NOT op0) (NOT op1)) is a NOR operation that can be done in
>>   a single instruction.  */
>>   if (!TARGET_MIPS16
>>&& GET_CODE (XEXP (x, 0)) == NOT
>>&& GET_CODE (XEXP (x, 1)) == NOT)
>>  {
>>cost = GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1;
>>*total = (COSTS_N_INSNS (cost)
>>  + set_src_cost (XEXP (XEXP (x, 0), 0), speed)
>>  + set_src_cost (XEXP (XEXP (x, 1), 0), speed));
>>return true;
>>  }
>> 
>> OK with that change, thanks.
>> 
>> Richard
>
> OK, but I am curious why you put parentheses around the right hand side
> of the total expression.  I.e.  *total = ();

That's the "emacs formatting" rule:

  Insert extra parentheses so that Emacs will indent the code
  properly. For example, the following indentation looks nice if you do
  it by hand,

  v = rup->ru_utime.tv_sec*1000 + rup->ru_utime.tv_usec/1000
  + rup->ru_stime.tv_sec*1000 + rup->ru_stime.tv_usec/1000;

  but Emacs would alter it. Adding a set of parentheses produces
  something that looks equally nice, and which Emacs will preserve:

  v = (rup->ru_utime.tv_sec*1000 + rup->ru_utime.tv_usec/1000
   + rup->ru_stime.tv_sec*1000 + rup->ru_stime.tv_usec/1000);

Thanks,
Richard


Re: [GOOGLE] Check if varpool node exist for decl before checking if it's from auxiliary module.

2013-10-22 Thread Rong Xu
seems fine to me for google branches.

-Rong

On Tue, Oct 22, 2013 at 10:51 AM, Dehao Chen  wrote:
> This fixes a LIPO bug that occurs when -fexceptions is on.
>
> When compilation is finished, compile_file calls
> dw2_output_indirect_constants, which may generate decls like
> DW.ref.__gxx_personality_v0 (generated in
> dw2_output_indirect_constant_1). This function is a global function,
> but does not have an associated varpool node. So the original code will
> segfault when checking if the varpool node is from an auxiliary module.
>
> Verified that the segfault is fixed with the patch. Regression test on-going.
>
> OK for google-4_8 branch if reg test is green?
>
> Thanks,
> Dehao
>
> Index: gcc/varasm.c
> ===
> --- gcc/varasm.c (revision 203910)
> +++ gcc/varasm.c (working copy)
> @@ -1484,7 +1484,7 @@ notice_global_symbol (tree decl)
>if (L_IPO_COMP_MODE
>&& ((TREE_CODE (decl) == FUNCTION_DECL
> && cgraph_is_auxiliary (decl))
> -  || (TREE_CODE (decl) == VAR_DECL
> +  || (TREE_CODE (decl) == VAR_DECL && varpool_get_node (decl)
>&& varpool_is_auxiliary (varpool_get_node (decl)
>  return;


[GOOGLE] Check if varpool node exist for decl before checking if it's from auxiliary module.

2013-10-22 Thread Dehao Chen
This fixes a LIPO bug that occurs when -fexceptions is on.

When compilation is finished, compile_file calls
dw2_output_indirect_constants, which may generate decls like
DW.ref.__gxx_personality_v0 (generated in
dw2_output_indirect_constant_1). This function is a global function,
but does not have an associated varpool node. So the original code will
segfault when checking if the varpool node is from an auxiliary module.

Verified that the segfault is fixed with the patch. Regression test on-going.

OK for google-4_8 branch if reg test is green?

Thanks,
Dehao

Index: gcc/varasm.c
===
--- gcc/varasm.c (revision 203910)
+++ gcc/varasm.c (working copy)
@@ -1484,7 +1484,7 @@ notice_global_symbol (tree decl)
   if (L_IPO_COMP_MODE
   && ((TREE_CODE (decl) == FUNCTION_DECL
&& cgraph_is_auxiliary (decl))
-  || (TREE_CODE (decl) == VAR_DECL
+  || (TREE_CODE (decl) == VAR_DECL && varpool_get_node (decl)
   && varpool_is_auxiliary (varpool_get_node (decl)
 return;
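
The fix relies on && short-circuiting: the lookup result is tested before it
is handed to the auxiliary check. Below is a minimal standalone sketch of that
idiom, not the GCC code. node_sketch, decl_sketch, is_auxiliary and
should_skip_symbol are hypothetical stand-ins for the varpool node, the decl,
varpool_is_auxiliary and the condition in notice_global_symbol.

```cpp
// Hypothetical stand-ins; none of these are real GCC types or functions.
struct node_sketch { bool auxiliary; };

struct decl_sketch {
  bool is_var;              // models TREE_CODE (decl) == VAR_DECL
  const node_sketch *node;  // models varpool_get_node (decl); may be null
};

// Dereferences its argument unconditionally, so a null node would crash here.
inline bool is_auxiliary(const node_sketch *n) { return n->auxiliary; }

// Guarded form: && evaluates left to right and stops at the first false
// operand, so is_auxiliary is never reached with a null node.
inline bool should_skip_symbol(const decl_sketch &d) {
  return d.is_var && d.node != nullptr && is_auxiliary(d.node);
}
```

A decl with no node is simply not skipped, which appears to be the behaviour
the patch aims for.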


Re: [patch, mips] Fix optimization bug involving nor instruction

2013-10-22 Thread Steve Ellcey
On Tue, 2013-10-22 at 18:34 +0100, Richard Sandiford wrote:
> Good spot!

> We should use set_src_cost for both operands (the idea being that only
> a register is allowed, and that anything else will end up being a SET_SRC).
> I think the formatting should be something like:
> 
>   /* (AND (NOT op0) (NOT op1)) is a NOR operation that can be done in
>a single instruction.  */
>   if (!TARGET_MIPS16
> && GET_CODE (XEXP (x, 0)) == NOT
> && GET_CODE (XEXP (x, 1)) == NOT)
>   {
> cost = GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1;
> *total = (COSTS_N_INSNS (cost)
>   + set_src_cost (XEXP (XEXP (x, 0), 0), speed)
>   + set_src_cost (XEXP (XEXP (x, 1), 0), speed));
> return true;
>   }
> 
> OK with that change, thanks.
> 
> Richard

OK, but I am curious why you put parentheses around the right-hand side
of the total expression.  I.e.  *total = ();

Steve Ellcey




Re: [patch, mips] Fix optimization bug involving nor instruction

2013-10-22 Thread Richard Sandiford
Good spot!

"Steve Ellcey "  writes:
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index 5993aab..ffb0b53 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -3796,6 +3796,18 @@ mips_rtx_costs (rtx x, int code, int outer_code, int opno ATTRIBUTE_UNUSED,
> return true;
>   }
>   }
> +  /* (AND (NOT op0) (NOT op1) is a nor operation that can be done in
> +  a single instruction.  */
> +  if (!TARGET_MIPS16 && (GET_CODE (XEXP (x, 0)) == NOT)
> +   && (GET_CODE (XEXP (x, 1)) == NOT))
> + {
> +   rtx op0 = XEXP (x, 0);
> +   rtx op1 = XEXP (x, 1);
> +   cost = GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1;
> +  *total = COSTS_N_INSNS (cost) + set_src_cost (XEXP (op0, 0), speed)
> ++ rtx_cost (XEXP (op1, 0), GET_CODE (op1), 1, speed);
> +   return true;
> + }

We should use set_src_cost for both operands (the idea being that only
a register is allowed, and that anything else will end up being a SET_SRC).
I think the formatting should be something like:

  /* (AND (NOT op0) (NOT op1)) is a NOR operation that can be done in
 a single instruction.  */
  if (!TARGET_MIPS16
  && GET_CODE (XEXP (x, 0)) == NOT
  && GET_CODE (XEXP (x, 1)) == NOT)
{
  cost = GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1;
  *total = (COSTS_N_INSNS (cost)
+ set_src_cost (XEXP (XEXP (x, 0), 0), speed)
+ set_src_cost (XEXP (XEXP (x, 1), 0), speed));
  return true;
}

OK with that change, thanks.

Richard


[jit] Add gcc_jit_context_get_first_error

2013-10-22 Thread David Malcolm
Committed to dmalcolm/jit:

Add a way to query the first error message that occurred on a context.

I expect there often will be a cascade of followup errors: the first
error will often lead to a NULL return from an API call, which will then
erroneously (but safely) be used for further API calls.  Under such
circumstances I believe the first error message will be the one that the
user cares about.
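
A rough standalone sketch of the "remember only the first error" idea follows.
It is not the libgccjit implementation: error_log_sketch and its members are
hypothetical names, the buffer size is arbitrary, and returning a null pointer
when no error has occurred is an assumption made for the example.

```cpp
#include <cstdarg>
#include <cstdio>
#include <cstring>

// Hypothetical error collector; only the first message is retained.
class error_log_sketch {
public:
  void add_error(const char *fmt, ...) {
    char buf[1024];
    va_list ap;
    va_start(ap, fmt);
    std::vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (m_error_count == 0) {
      // strncpy does not guarantee termination, so terminate explicitly.
      std::strncpy(m_first_error, buf, sizeof m_first_error);
      m_first_error[sizeof m_first_error - 1] = '\0';
    }
    ++m_error_count;
  }

  // Assumption: report "no error yet" as a null pointer.
  const char *first_error() const {
    return m_error_count ? m_first_error : nullptr;
  }

  int error_count() const { return m_error_count; }

private:
  int m_error_count = 0;
  char m_first_error[1024] = {0};
};
```

Later errors still bump the count, so a caller can tell that a cascade
happened while reading only the message that started it.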

gcc/jit/
* internal-api.c (gcc::jit::context::add_error_va): Record the
first error that occurs on a context.
(gcc::jit::context::get_first_error): New.
* internal-api.h (gcc::jit::context::get_first_error): New.
(gcc::jit::context::m_first_error_str): New.
* libgccjit.c (gcc_jit_context_get_first_error): New.
* libgccjit.h (gcc_jit_context_get_first_error): New.
* libgccjit.map (gcc_jit_context_get_first_error): New.

gcc/testsuite/
* jit.dg/harness.h (verify_code): Add context param so that
test cases of failure can query errors on it.
(CHECK_STRING_VALUE): New.
(check_string_value): New.
(test_jit): Add user_data param and pass it to the code factory.
Pass context to verify_code, calling it before releasing said
context.
(main): Add NULL user_data to test_jit call.
* jit.dg/test-accessing-struct.c (verify_code): Add context
param.
* jit.dg/test-calling-external-function.c (verify_code):
Likewise.
* jit.dg/test-combination.c (verify_code): Likewise.
* jit.dg/test-dot-product.c (verify_code): Likewise.
* jit.dg/test-expressions.c (verify_code): Likewise.
* jit.dg/test-factorial.c (verify_code): Likewise.
* jit.dg/test-failure.c (verify_code): Likewise.
* jit.dg/test-fibonacci.c (verify_code): Likewise.
* jit.dg/test-hello-world.c (verify_code): Likewise.
* jit.dg/test-string-literal.c (verify_code): Likewise.
* jit.dg/test-sum-of-squares.c (verify_code): Likewise.
* jit.dg/test-types.c (verify_code): Likewise.
* jit.dg/test-using-global.c (verify_code): Likewise.
* jit.dg/test-null-passed-to-api.c (verify_code): Likewise;
use context to verify that the library provides a sane error
message to the client code.
---
 gcc/jit/ChangeLog.jit  | 11 ++
 gcc/jit/internal-api.c | 16 
 gcc/jit/internal-api.h |  6 ++-
 gcc/jit/libgccjit.c|  8 
 gcc/jit/libgccjit.h|  9 +
 gcc/jit/libgccjit.map  |  1 +
 gcc/testsuite/ChangeLog.jit| 29 ++
 gcc/testsuite/jit.dg/harness.h | 45 ++
 gcc/testsuite/jit.dg/test-accessing-struct.c   |  2 +-
 .../jit.dg/test-calling-external-function.c|  2 +-
 gcc/testsuite/jit.dg/test-combination.c| 24 ++--
 gcc/testsuite/jit.dg/test-dot-product.c|  2 +-
 gcc/testsuite/jit.dg/test-expressions.c|  2 +-
 gcc/testsuite/jit.dg/test-factorial.c  |  2 +-
 gcc/testsuite/jit.dg/test-failure.c|  2 +-
 gcc/testsuite/jit.dg/test-fibonacci.c  |  2 +-
 gcc/testsuite/jit.dg/test-hello-world.c|  2 +-
 gcc/testsuite/jit.dg/test-null-passed-to-api.c |  6 ++-
 gcc/testsuite/jit.dg/test-string-literal.c |  2 +-
 gcc/testsuite/jit.dg/test-sum-of-squares.c |  2 +-
 gcc/testsuite/jit.dg/test-types.c  |  2 +-
 gcc/testsuite/jit.dg/test-using-global.c   |  2 +-
 22 files changed, 146 insertions(+), 33 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 2cf5c8d..5e8d0f9 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,14 @@
+2013-10-22  David Malcolm  
+
+   * internal-api.c (gcc::jit::context::add_error_va): Record the
+   first error that occurs on a context.
+   (gcc::jit::context::get_first_error): New.
+   * internal-api.h (gcc::jit::context::get_first_error): New.
+   (gcc::jit::context::m_first_error_str): New.
+   * libgccjit.c (gcc_jit_context_get_first_error): New.
+   * libgccjit.h (gcc_jit_context_get_first_error): New.
+   * libgccjit.map (gcc_jit_context_get_first_error): New.
+
 2013-10-21  David Malcolm  
 
* internal-api.c (gcc::jit::context::compile): Correctly cleanup
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index d90f001..639cd49 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -1478,9 +1478,25 @@ add_error_va (const char *fmt, va_list ap)
 
   error ("%s\n", buf);
 
+  if (!m_error_count)
+{
+  strncpy (m_first_error_str, buf, sizeof(m_first_error_str));
+  m_first_error_str[sizeof(m_first_error_str) - 1] = '\0';
+}
+
   m_error_count++;
 }
 
+const char *
+gcc::jit::context::

rename c1x-*.c to c11-*.c

2013-10-22 Thread Mike Stump
In gcc/testsuite/gcc.dg, we renamed c1x-*.c to c11-*.c.

Tested on x86_64-apple-darwin12.

Committed revision 203929.


[patch, mips] Fix optimization bug involving nor instruction

2013-10-22 Thread Steve Ellcey
While looking at some MIPS code I found that GCC was not making optimal
use of the MIPS nor instruction.  If you compile this function:

int f (int a, int b) { return ~(a|b); }

It generates:

or  $2,$4,$5
nor $2,$0,$2

instead of just:

nor $2,$4,$5

The problem is that mips_rtx_costs does not know that (AND (NOT op1) (NOT op2))
can be done in a single operation, so it calculates the cost of the expression
as the cost of 3 operations.  This results in combine thinking that
(NOT (OR op1 op2)) is cheaper.

This patch changes mips_rtx_costs to change the cost calculation of a nor to
be the cost of one operation (plus the cost of getting the operands into
registers) instead of the cost of three operations.
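
For reference, the two RTL shapes above compute the same value by De Morgan's
law. The following small sketch checks this at compile time; the function
names are made up for illustration and have nothing to do with the GCC
sources.

```cpp
#include <cstdint>

// ~(a | b): the form written in the source function f.
constexpr std::uint32_t nor_from_or(std::uint32_t a, std::uint32_t b) {
  return ~(a | b);
}

// (~a) & (~b): the (AND (NOT a) (NOT b)) shape discussed for the cost hook.
constexpr std::uint32_t nor_from_and(std::uint32_t a, std::uint32_t b) {
  return ~a & ~b;
}

// De Morgan: both forms agree, so one nor instruction can implement either.
static_assert(nor_from_or(0x0Fu, 0xF0u) == nor_from_and(0x0Fu, 0xF0u),
              "both forms must agree");
static_assert(nor_from_or(0u, 0u) == 0xFFFFFFFFu, "nor of zeros is all ones");
```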

Tested with no regressions, OK to checkin?

Steve Ellcey
sell...@mips.com


2013-10-22  Steve Ellcey  

* config/mips/mips.c (mips_rtx_costs):  Fix cost estimate for nor
(AND (NOT OP1) (NOT OP2)).


2013-10-22  Steve Ellcey  

* gcc.target/mips/nor.c: New.


diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 5993aab..ffb0b53 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -3796,6 +3796,18 @@ mips_rtx_costs (rtx x, int code, int outer_code, int opno ATTRIBUTE_UNUSED,
  return true;
}
}
+  /* (AND (NOT op0) (NOT op1) is a nor operation that can be done in
+a single instruction.  */
+  if (!TARGET_MIPS16 && (GET_CODE (XEXP (x, 0)) == NOT)
+ && (GET_CODE (XEXP (x, 1)) == NOT))
+   {
+ rtx op0 = XEXP (x, 0);
+ rtx op1 = XEXP (x, 1);
+ cost = GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1;
+  *total = COSTS_N_INSNS (cost) + set_src_cost (XEXP (op0, 0), speed)
+  + rtx_cost (XEXP (op1, 0), GET_CODE (op1), 1, speed);
+ return true;
+   }

   /* Fall through.  */
 
diff --git a/gcc/testsuite/gcc.target/mips/nor.c b/gcc/testsuite/gcc.target/mips/nor.c
new file mode 100644
index 000..e71791b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/nor.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
+/* { dg-final { scan-assembler-times "\tnor\t" 1 } } */
+/* { dg-final { scan-assembler-not "\tor" } } */
+
+/* Test that we generate a 'nor' instruction and no 'or' instructions.  */
+
+NOMIPS16 int f (int a, int b)
+{
+   return ~(a|b);
+}



Re: [libstdc++,C++11] Library style for user-defined literal operators

2013-10-22 Thread Mike Stump
On Oct 22, 2013, at 9:27 AM, 3dw...@verizon.net wrote:

Can you arrange for your emails to have your name in them?  Thanks.


c1x --> c11

2013-10-22 Thread Mike Stump
One last straggler I found:

Tested on x86_64-apple-darwin12.

Committed revision 203928.

Index: builtin-complex-err-1.c
===
--- builtin-complex-err-1.c (revision 203926)
+++ builtin-complex-err-1.c (working copy)
@@ -1,6 +1,6 @@
 /* Test __builtin_complex errors.  */
 /* { dg-do compile } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
 typedef double D;
 



c1x --> c11

2013-10-22 Thread Mike Stump
This updates test cases to use the non-deprecated form of the option.  c1x --> c11

Tested on x86_64-apple-darwin12.

Committed revision 203927.

Index: c1x-align-1.c
===
--- c1x-align-1.c   (revision 203926)
+++ c1x-align-1.c   (working copy)
@@ -1,6 +1,6 @@
-/* Test C1X alignment support.  Test valid code.  */
+/* Test C11 alignment support.  Test valid code.  */
 /* { dg-do compile } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
 #include 
 
Index: c1x-align-2.c
===
--- c1x-align-2.c   (revision 203926)
+++ c1x-align-2.c   (working copy)
@@ -1,6 +1,6 @@
-/* Test C1X alignment support.  Test valid code using stdalign.h.  */
+/* Test C11 alignment support.  Test valid code using stdalign.h.  */
 /* { dg-do run } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
 #include 
 #include 
Index: c1x-align-3.c
===
--- c1x-align-3.c   (revision 203926)
+++ c1x-align-3.c   (working copy)
@@ -1,6 +1,6 @@
-/* Test C1X alignment support.  Test invalid code.  */
+/* Test C11 alignment support.  Test invalid code.  */
 /* { dg-do compile } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
 int a = _Alignof (void (void)); /* { dg-error "function" } */
 struct s;
Index: c1x-align-4.c
===
--- c1x-align-4.c   (revision 203926)
+++ c1x-align-4.c   (working copy)
@@ -1,7 +1,7 @@
-/* Test C1X alignment support.  Test reducing alignment (assumes there
+/* Test C11 alignment support.  Test reducing alignment (assumes there
are at least some alignment constraints).  */
 /* { dg-do compile } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 /* { dg-skip-if "no alignment constraints" { "avr-*-*" } { "*" } { "" } } */
 
 #include 
Index: c1x-align-5.c
===
--- c1x-align-5.c   (revision 203926)
+++ c1x-align-5.c   (working copy)
@@ -1,6 +1,6 @@
-/* Test C1X alignment support.  Test invalid code.  */
+/* Test C11 alignment support.  Test invalid code.  */
 /* { dg-do compile } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
 void foo (int []);
 void bar1 (int [_Alignas (double) 10]); /* { dg-error "expected expression before" } */
Index: c1x-anon-struct-1.c
===
--- c1x-anon-struct-1.c (revision 203926)
+++ c1x-anon-struct-1.c (working copy)
@@ -1,6 +1,6 @@
-/* Test for anonymous structures and unions in C1X.  */
+/* Test for anonymous structures and unions in C11.  */
 /* { dg-do compile } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
 #include 
 
Index: c1x-anon-struct-2.c
===
--- c1x-anon-struct-2.c (revision 203926)
+++ c1x-anon-struct-2.c (working copy)
@@ -1,7 +1,7 @@
-/* Test for anonymous structures and unions in C1X.  Test for invalid
+/* Test for anonymous structures and unions in C11.  Test for invalid
cases.  */
 /* { dg-do compile } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
 typedef struct s0
 {
Index: c1x-anon-struct-3.c
===
--- c1x-anon-struct-3.c (revision 203926)
+++ c1x-anon-struct-3.c (working copy)
@@ -1,7 +1,7 @@
-/* Test for anonymous structures and unions in C1X.  Test for invalid
+/* Test for anonymous structures and unions in C11.  Test for invalid
cases: typedefs disallowed by N1549.  */
 /* { dg-do compile } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
 typedef struct
 {
Index: c1x-float-1.c
===
--- c1x-float-1.c   (revision 203926)
+++ c1x-float-1.c   (working copy)
@@ -1,9 +1,9 @@
-/* Test for  C1X macros.  */
+/* Test for  C11 macros.  */
 /* Origin: Joseph Myers  */
 /* { dg-do preprocess } */
-/* { dg-options "-std=c1x -pedantic-errors" } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
 
-/* This test checks that the C1X macros are defined;
+/* This test checks that the C11 macros are defined;
it does not check the correctness of their values.  */
 
 #include 
Index: c1x-noreturn-1.c
===
--- c1x-noreturn-1.c(revision 203926)
+++ c1x-noreturn-1.c(working copy)
@@ -1,6 +1,6 @@
-/* Test C1X _Noreturn.  Test valid code.  */
+/* Test C11 _Noreturn.  T

Re: [C++14] implement [[deprecated]].

2013-10-22 Thread Jason Merrill

OK.

Jason


Re: [RFC] By default if-convert only basic blocks that will be vectorized (take 3)

2013-10-22 Thread Sergey Ostanevich
still fails on 403 et al.

regrename.c: In function 'regrename_optimize':
regrename.c:200:1: internal compiler error: Segmentation fault
 regrename_optimize ()
 ^
0x96698f crash_signal
../../gcc/toplev.c:335
0x9a2657 ssa_default_def(function*, tree_node*)
../../gcc/tree-dfa.c:310
0x9a2c68 get_or_create_ssa_default_def(function*, tree_node*)
../../gcc/tree-dfa.c:362
0x9d0773 get_reaching_def
../../gcc/tree-into-ssa.c:1188
0x9d0773 get_reaching_def
../../gcc/tree-into-ssa.c:1175
0x9d24f8 rewrite_update_phi_arguments
../../gcc/tree-into-ssa.c:2017
0x9d24f8 rewrite_update_dom_walker::before_dom_children(basic_block_def*)
../../gcc/tree-into-ssa.c:2133
0x9d24f8 rewrite_update_dom_walker::before_dom_children(basic_block_def*)
../../gcc/tree-into-ssa.c:2069
0xddcbea dom_walker::walk(basic_block_def*)
../../gcc/domwalk.c:176
0x9ceba2 rewrite_blocks
../../gcc/tree-into-ssa.c:2188
0x9d604e update_ssa(unsigned int)
../../gcc/tree-into-ssa.c:3311
0x9b729f version_loop_for_if_conversion
../../gcc/tree-if-conv.c:1962
0x9bad10 tree_if_conversion
../../gcc/tree-if-conv.c:2042
0x9bad10 main_tree_if_conversion
../../gcc/tree-if-conv.c:2097
0x9bad10 execute
../../gcc/tree-if-conv.c:2147
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
specmake: *** [regrename.o] Error 1

On Tue, Oct 22, 2013 at 6:08 PM, Sergey Ostanevich  wrote:
> I use
> -march=core-avx2 -static -Ofast -flto -funroll-loops
> with all 3 patches applied to git version I mentioned
>
> the same is in progress for 'take 3'
>
> On Tue, Oct 22, 2013 at 5:26 PM, Jakub Jelinek  wrote:
>> On Tue, Oct 22, 2013 at 05:16:29PM +0400, Sergey Ostanevich wrote:
>>> ouch.. html got me!
>>>
>>> applying to the same as before
>>>
>>> commit 0e3dfadd374c3045a926afa6d06af276cee2108d
>>> Author: rguenth 
>>> Date:   Fri Oct 18 08:36:28 2013 +
>>>
>>> trying '06 build.
>>>
>>>
>>> for take 2 on HSW there were mixed results for passed tests:
>>>
>>> 400.perlbench   +0.25%
>>> 401.bzip2       -0.40%
>>> 429.mcf         -1.67%
>>> 456.hmmer       +0%
>>> 458.sjeng       -1.65%
>>> 471.omnetpp     -1.72%
>>> 483.xalancbmk   -0.25%
>>> 410.bwaves      +0%
>>> 433.milc        -0.28%
>>> 444.namd        -1.13%
>>> 450.soplex      -1.01%
>>> 459.GemsFDTD    -0.25%
>>> 470.lbm         +1.44%
>>> 482.sphinx3     +0.61%
>>
>> Thanks.
>>
>> If this is with AVX or AVX2, there can be multiple things.
>>
>> One is whether if-conversion in its current form is really so harmful
>> for generated code quality (we have various testcases where what we get with
>> it is very bad, but perhaps there are still testcases where it improves
>> non-vectorized loops), another one is when the masked loads/stores/gather is
>> actually beneficial and whether we don't want some better tuning for it.
>> The patchset as whole does both things, so determining what is what is hard.
>>
>> If only the first two patches are applied and not the third, then it should
>> give a picture on whether it is a win to have if-conversion done in its
>> current form only for vectorized bbs.
>>
>> And, if all 3 patches are applied, but the condition in tree_if_conversion:
>>if ((flag_tree_loop_vectorize || loop->force_vect)
>>&& (flag_tree_loop_if_convert == -1 || any_mask_load_store)
>>&& !version_loop_for_if_conversion (loop, &version_outer_loop,
>>any_mask_load_store))
>>  goto cleanup;
>> is changed into just
>>   if (any_mask_load_store
>>   && !version_loop_for_if_conversion (loop, &version_outer_loop, true))
>> goto cleanup;
>> then if-conversion would happen as before, just in cases where previously
>> we wouldn't if-convert because of conditional loads/stores, but now we
>> could, those changes would be limited to vectorized loops only.
>>
>> Jakub


[libstdc++,C++11] Library style for user-defined literal operators

2013-10-22 Thread 3dw4rd
Let me try that again. Sorry for the dupe and the bad subject in the previous
message.

This patch fixes a small stylistic nit in the user-defined literal operators in
the standard library.

I propose we prefer:
  operator""suf - with no space
rather than:
  operator"" suf - with space

It is only strictly necessary to have no space between quotes and suffix
identifier when the suffix identifier is a keyword. On the other hand,
consistently using no space means never having to say you're sorry. It is also
consistent with our style of not having space between operator and, say '+'.
I believe that first allowing the space in the first place might have been a
mistake in the standard. The (to be committed tonight) literal operators for
complex need to have no space because the suffix is 'if'. We might as well be
consistent. I think this is better style too.

Ed
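
For illustration, here is the no-space form applied to a user-level literal
operator. This is only a sketch: the _deg suffix is hypothetical and is not
part of the patch. User code has to start its suffix with an underscore;
suffixes without one, such as 'if', are reserved for the standard library.

```cpp
// Hypothetical user-defined suffix _deg: converts degrees to radians.
// Written with no space between the quotes and the suffix identifier.
constexpr long double operator""_deg(long double degrees) {
  return degrees * 3.14159265358979323846L / 180.0L;
}
```

With this in scope, 180.0_deg evaluates to approximately pi.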




CL_lit_oper_style
Description: Binary data


patch_lit_oper_style
Description: Binary data


Re: Patch: Add #pragma ivdep support to the ME and C FE

2013-10-22 Thread Frederic Riss
On 21 October 2013 22:09, Tobias Burnus  wrote:
> attached is a new version of the patch. Changes:
> * "#pragma GCC ivdep" instead of "#pragma ivdep"
> * Corrections to the error message in c-parser.c and a test case for it
> * New wording in the .texi and examples

Not that I have any authority here, but I like the new wording better. And
BTW, in the last exchange you pointed to a set of slides by B. Dinechin
about the ivdep directive; it was in fact a conversation with him that
prompted me to ask you about the semantics of your patch. I discussed
the issue with him again today and he agreed that mapping that to
safelen makes sense.
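
For readers who have not seen the directive, a minimal usage sketch follows.
The function add_arrays is made up for the example. The pragma is a promise by
the programmer that the loop has no loop-carried dependencies; the compiler
does not check it, so the caller must not pass overlapping arrays.

```cpp
// Hypothetical example function: element-wise a[i] = b[i] + c[i].
// The pragma asserts that iterations are independent, which is only true
// if a does not overlap b or c.
void add_arrays(int n, int *a, const int *b, const int *c) {
#pragma GCC ivdep
  for (int i = 0; i < n; ++i)
    a[i] = b[i] + c[i];
}
```

Whether the loop is actually vectorized still depends on the optimization
flags and the target; the pragma only removes the assumed dependence.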

Fred


Re: [c++-concepts] bitfield reference bugfix

2013-10-22 Thread Jason Merrill

On 10/22/2013 08:54 AM, Andrew Sutton wrote:

This fixes the longstanding problem with bitfield references. The
default dialect was set to cxx1y, which was resulting in different
conversions for bitfield references. I'm not sure if there's a change
in semantics for 1y or if that's a separate bug, but it's not related
to concepts.


Sounds like a bug.  OK.

Jason



New prologue/epilogue code for i386 string functions

2013-10-22 Thread Jan Hubicka
Hi,
this patch adds code to produce prologues/epilogues as suggested by Ondrej Bilka
(I described the approach in more detail in
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02082.html).
This patch is an updated and cleaned-up version after Mikhail's changes merging
the memset/memcpy generation code.  (I will continue with some incremental
cleanups for the code duplication we ended up with.)

For now I don't have value range code in, but all logic is in place once
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02011.html
gets reviewed.

Bootstrapped/regtested on x86_64-linux, also with -minline-all-stringops, and
tested on SPEC2k6.
I will commit it later today after more testing.
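
As a rough illustration of the misaligned-move idea (a sketch only, not the
code the patch emits): a block whose length lies between one and two move
widths can be handled with two possibly overlapping moves, one anchored at the
start and one anchored at the end, with no byte loop. The helper name and the
fixed 8-byte width are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies n bytes, where 8 <= n <= 16, using two 8-byte
// moves that may overlap in the middle. Each fixed-size memcpy stands in for
// a single unaligned load or store.
inline void copy_8_to_16(unsigned char *dst, const unsigned char *src,
                         std::size_t n) {
  std::uint64_t head, tail;
  std::memcpy(&head, src, 8);          // first 8 bytes
  std::memcpy(&tail, src + n - 8, 8);  // last 8 bytes, may overlap head
  std::memcpy(dst, &head, 8);
  std::memcpy(dst + n - 8, &tail, 8);
}
```

Both loads happen before either store, so the overlapping middle bytes are
written twice with the same value.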

Honza

* i386.h (TARGET_MISALIGNED_MOVE_STRING_PROLOGUES_EPILOGUES): New 
tuning flag.
* x86-tune.def (TARGET_MISALIGNED_MOVE_STRING_PROLOGUES): Define it.
* i386.c (expand_small_movmem_or_setmem): New function.
(expand_set_or_movmem_prologue_epilogue_by_misaligned_moves): New 
function
(alg_usable_p): Add support for value ranges; cleanup.
(ix86_expand_set_or_movmem): Add support for misaligned moves.
Index: i386.h
===
--- i386.h  (revision 203888)
+++ i386.h  (working copy)
@@ -350,6 +350,8 @@ extern unsigned char ix86_tune_features[
 #define TARGET_PROMOTE_QImode  ix86_tune_features[X86_TUNE_PROMOTE_QIMODE]
 #define TARGET_FAST_PREFIX ix86_tune_features[X86_TUNE_FAST_PREFIX]
 #define TARGET_SINGLE_STRINGOP ix86_tune_features[X86_TUNE_SINGLE_STRINGOP]
+#define TARGET_MISALIGNED_MOVE_STRING_PROLOGUES_EPILOGUES \
+   ix86_tune_features[TARGET_MISALIGNED_MOVE_STRING_PROLOGUES]
 #define TARGET_QIMODE_MATH ix86_tune_features[X86_TUNE_QIMODE_MATH]
 #define TARGET_HIMODE_MATH ix86_tune_features[X86_TUNE_HIMODE_MATH]
 #define TARGET_PROMOTE_QI_REGS ix86_tune_features[X86_TUNE_PROMOTE_QI_REGS]
Index: x86-tune.def
===
--- x86-tune.def(revision 203888)
+++ x86-tune.def(working copy)
@@ -239,6 +239,15 @@ DEF_TUNE (X86_TUNE_AVOID_MEM_OPND_FOR_CM
as MOVS and STOS (without a REP prefix) to move/set sequences of bytes.  */
 DEF_TUNE (X86_TUNE_SINGLE_STRINGOP, "single_stringop", m_386 | m_P4_NOCONA)
 
+/* TARGET_MISALIGNED_MOVE_STRING_PROLOGUES: Enable generation of compact
+   prologues and epilogues by issuing misaligned moves.  This requires the
+   target to handle misaligned moves and partial memory stalls reasonably
+   well.
+   FIXME: This actually may be a win on more targets than listed here.  */
+DEF_TUNE (TARGET_MISALIGNED_MOVE_STRING_PROLOGUES,
+ "misaligned_move_string_prologues",
+ m_386 | m_486 | m_CORE_ALL | m_AMD_MULTIPLE | m_GENERIC)
+
 /* X86_TUNE_USE_SAHF: Controls use of SAHF.  */
 DEF_TUNE (X86_TUNE_USE_SAHF, "use_sahf",
   m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_ATOM | m_SLM | m_K6_GEODE
Index: i386.c
===
--- i386.c  (revision 203888)
+++ i386.c  (working copy)
@@ -22734,6 +22734,314 @@ expand_set_or_movmem_prologue (rtx destm
   return destmem;
 }
 
+/* Test if COUNT&SIZE is nonzero and if so, expand movmem
+   or setmem sequence that is valid for SIZE..2*SIZE-1 bytes
+   and jump to DONE_LABEL.  */
+static void
+expand_small_movmem_or_setmem (rtx destmem, rtx srcmem,
+  rtx destptr, rtx srcptr,
+  rtx value, rtx vec_value,
+  rtx count, int size,
+  rtx done_label, bool issetmem)
+{
+  rtx label = ix86_expand_aligntest (count, size, false);
+  enum machine_mode mode = mode_for_size (size * BITS_PER_UNIT, MODE_INT, 1);
+  rtx modesize;
+  int n;
+
+  /* If we do not have vector value to copy, we must reduce size.  */
+  if (issetmem)
+{
+  if (!vec_value)
+   {
+ if (GET_MODE (value) == VOIDmode && size > 8)
+   mode = Pmode;
+ else if (GET_MODE_SIZE (mode) > GET_MODE_SIZE (GET_MODE (value)))
+   mode = GET_MODE (value);
+   }
+  else
+   mode = GET_MODE (vec_value), value = vec_value;
+}
+  else
+{
+  /* Choose appropriate vector mode.  */
+  if (size >= 32)
+   mode = TARGET_AVX ? V32QImode : TARGET_SSE ? V16QImode : DImode;
+  else if (size >= 16)
+   mode = TARGET_SSE ? V16QImode : DImode;
+  srcmem = change_address (srcmem, mode, srcptr);
+}
+  destmem = change_address (destmem, mode, destptr);
+  modesize = GEN_INT (GET_MODE_SIZE (mode));
+  gcc_assert (GET_MODE_SIZE (mode) <= size);
+  for (n = 0; n * GET_MODE_SIZE (mode) < size; n++)
+{
+  if (issetmem)
+   emit_move_insn (destmem, gen_lowpart (mode, value));
+  else
+   {
+  emit_move_insn (destmem, srcmem);
+  srcmem = offset_address (srcmem, modesize, GET_MODE_SIZE (mode));
+   }
+  destmem = offset_address (destmem, modesize, GET_MODE_

Re: patch to enable LRA for ppc

2013-10-22 Thread Michael Meissner
On Tue, Oct 22, 2013 at 10:21:32AM -0400, David Edelsohn wrote:
> On Mon, Oct 21, 2013 at 10:42 PM, Vladimir Makarov  
> wrote:
> 
> >> I would say let's add -mlra, but make the default OFF for the time being.
> >> We can always switch the default later.
> >
> > Sure, if you know some LRA problems it should not be on by default. Moreover,
> > if we still have the problems when releasing gcc4.9, I think we should
> > exclude any possibility for a user to use LRA for ppc.  I don't want to have
> > GCC-4.9 users blaming LRA.
> >
> > But adding LRA to PPC on the trunk (switched OFF by default) earlier could
> > help me a lot to work on the issues.
> 
> My main concern was disrupting Mike. If Mike is comfortable with
> adding LRA disabled by default, it is okay with me.
> 
> The patch mostly adds lra_in_progress, which will not have any effect
> while LRA remains disabled.
> 
> My one question about the patch is:
> 
> -  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r")
> +  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,&r")
> 
> which may cause register preferencing problems for bswap when LRA is not used.
> 
> The rest of the patch is okay.
> 
> Thanks, David

Yeah, I can see a whole round of tuning issues, and everywhere
reload_in_progress is used, add lra_in_progress.  Because of the Advance
Toolchain, RHEL, and SLES, we will need to still deal with the original
register allocator.

Vlad, this is part of a message I had sent David, and I thought you were on the
CC list about LRA.

I haven't looked in detail at what the changes are at this point.  I did do some
builds and comparisons.  It looks like there are definitely problems with
32-bit Fortran and decimal floating point (and likely long double using IBM's
double double format).  If somebody has some cycles, it may be useful to dig
into why we get these failures.

Note, I have some sort of configuration problem in running dealII, so it
isn't run right now:

Spec 2006, 64-bit, 3 runs, picking the middle, power7 options:

Benchmark                   Type    Percent
400.perlbench               int      96.74%
401.bzip2                   int     100.09%
403.gcc                     int      99.94%
429.mcf                     int      99.21%
445.gobmk                   int      99.33%
456.hmmer                   int      98.34%
458.sjeng                   int      99.68%
462.libquantum              int     101.48%
464.h264ref                 int     101.40%
471.omnetpp                 int     100.28%
473.astar                   int     100.09%
483.xalancbmk               int      98.28%
410.bwaves                  fp       98.11%
416.gamess                  fp      101.31%
433.milc                    fp       99.43%
434.zeusmp                  fp      103.53%
435.gromacs                 fp      109.63%
436.cactusADM               fp       99.53%
437.leslie3d                fp      101.23%
444.namd                    fp      103.42%
447.dealII                  fp      --
450.soplex                  fp       99.14%
453.povray                  fp       99.66%
454.calculix                fp       97.17%
459.GemsFDTD                fp      100.88%
465.tonto                   fp      101.18%
470.lbm                     fp       99.83%
481.wrf                     fp       93.38%
482.sphinx3                 fp      100.82%
Spec INT                    int      99.57%
Spec FP except 447.dealII   fp      100.43%

Perlbench, calculix, and wrf are slower.  Zeusmp, gromacs, and Namd are
faster.

Unfortunately, the profiling tools on my system seem to abort when I run 32-bit
benchmarks, so I haven't gotten the numbers recently (nor had time to get the
tools team to look at it).

In terms of building 32-bit, 3 benchmarks don't build with LRA: gamess, dealII
(note in 64-bit dealII builds, it just doesn't run correctly), and wrf.

Let's see.  In gamess, I see:

/home/meissner/fsf-install-ppc64/gcc-4_9-lra/bin/gfortran -c -o 
ormas1.fppized.o -g -save-temps=obj -ffast-math -Ofast -mveclibabi=mass 
-mcpu=power7 -mrecip=rsqrt -fpeel-loops -funroll-loops -ftree-vectorize 
-fvect-cost-model -fno-aggressive-loop-optimizations -mlra -m32 ormas1.fppized.f
ormas1.fppized.f: In function 'maktabs':
ormas1.fppized.f:2281:0: internal compiler error: in check_rtl, at lra.c:2036
   END
 ^
0x105a08ef check_rtl
/home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2036
0x105a2bcb lra(_IO_FILE*)
/home/meissner/fsf-src/gcc-4_9-lra/gcc/lra.c:2432
0x10552933 do_reload
/home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4686
0x10552933 rest_of_handle_reload
/home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4815
0x10552933 execute
/home/meissner/fsf-src/gcc-4_9-lra/gcc/ira.c:4844
Please submit a full bug report,


In dealII we see:

/home/meissner/fsf-install-ppc64/gcc-4_9-lra/bin/g++ -c -o 
sparse_matrix_ez.float.o -DSPEC_CPU -DNDEBUG  -Iinclude -DBOOST_DISABLE_THREADS 
-Ddeal_II_dimension=3 -g -save-temps=obj -ffast-math -Ofast -mveclibabi=mass 
-mcpu=po

Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-22 Thread Richard Henderson
On 10/22/2013 07:42 AM, Kirill Yukhin wrote:
> Hello Richard,
> Thanks for the remarks, they all seem reasonable.
> 
> One question
> 
> On 21 Oct 16:01, Richard Henderson wrote:
>>> +(define_insn "avx512f_moves_mask"
>>> +  [(set (match_operand:VF_128 0 "register_operand" "=v")
>>> +   (vec_merge:VF_128
>>> + (vec_merge:VF_128
>>> +   (match_operand:VF_128 2 "register_operand" "v")
>>> +   (match_operand:VF_128 3 "vector_move_operand" "0C")
>>> +   (match_operand: 4 "register_operand" "k"))
>>> + (match_operand:VF_128 1 "register_operand" "v")
>>> + (const_int 1)))]
>>> +  "TARGET_AVX512F"
>>> +  "vmov\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}"
>>> +  [(set_attr "type" "ssemov")
>>> +   (set_attr "prefix" "evex")
>>> +   (set_attr "mode" "")])
>>
>> Nested vec_merge?  That seems... odd to say the least.
>> How in the world does this get matched?
> 
> This is a generic approach for all scalar `masked' instructions.
> 
> The reason is that we must save the higher bits of the vector (outer vec_merge)
> and apply single-bit mask (inner vec_merge).
> 
> 
> We may do it with unspecs though... But is it really better?
> 
> What do you think?

What I think is that while it's an instruction that exists in the ISA,
does that mean we must model it in the compiler?

How would this pattern be used?


r~



Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-22 Thread Kirill Yukhin
Hello Richard,
Thanks for the remarks, they all seem reasonable.

One question

On 21 Oct 16:01, Richard Henderson wrote:
> > +(define_insn "avx512f_moves_mask"
> > +  [(set (match_operand:VF_128 0 "register_operand" "=v")
> > +   (vec_merge:VF_128
> > + (vec_merge:VF_128
> > +   (match_operand:VF_128 2 "register_operand" "v")
> > +   (match_operand:VF_128 3 "vector_move_operand" "0C")
> > +   (match_operand: 4 "register_operand" "k"))
> > + (match_operand:VF_128 1 "register_operand" "v")
> > + (const_int 1)))]
> > +  "TARGET_AVX512F"
> > +  "vmov\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}"
> > +  [(set_attr "type" "ssemov")
> > +   (set_attr "prefix" "evex")
> > +   (set_attr "mode" "")])
> 
> Nested vec_merge?  That seems... odd to say the least.
> How in the world does this get matched?

This is a generic approach for all scalar `masked' instructions.

The reason is that we must save the higher bits of the vector (outer vec_merge)
and apply single-bit mask (inner vec_merge).
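
To make the nesting concrete, here is a small C model of the semantics
described above.  It is only a sketch and not GCC code: the names
model_vec_merge and model_masked_scalar_move and the element count NELTS are
made up for illustration.  It assumes the usual RTL meaning of vec_merge,
where element i is taken from the first vector when bit i of the mask is set
and from the second vector otherwise.

```c
#include <assert.h>

#define NELTS 4  /* assumed element count, e.g. four floats in 128 bits */

/* Model of RTL (vec_merge a b mask): element i comes from A when bit i of
   MASK is set, otherwise from B.  */
static void
model_vec_merge (const float *a, const float *b, unsigned mask, float *out)
{
  for (int i = 0; i < NELTS; i++)
    out[i] = ((mask >> i) & 1) ? a[i] : b[i];
}

/* Model of the nested form used in the pattern:
   (vec_merge (vec_merge op2 op3 k) op1 (const_int 1)).  */
static void
model_masked_scalar_move (const float *op1, const float *op2,
                          const float *op3, unsigned k, float *out)
{
  float inner[NELTS];

  assert (out != 0);
  model_vec_merge (op2, op3, k, inner);  /* inner: apply the mask K */
  model_vec_merge (inner, op1, 1, out);  /* outer: keep element 0 only */
}
```

In this model only bit 0 of k matters: element 0 is op2[0] or op3[0]
depending on that bit, and all higher elements come from op1.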


We may do it with unspecs though... But is it really better?

What do you think?

--
Thanks, K


Re: Patch: Add #pragma ivdep support to the ME and C FE

2013-10-22 Thread Joseph S. Myers
On Mon, 21 Oct 2013, Tobias Burnus wrote:

> Dear all,
> 
> attached is a new version of the patch. Changes:
> * "#pragma GCC ivdep" instead of "#pragma ivdep"
> * Corrections to the error message in c-parser.c and a test case for it
> * New wording in the .texi and examples
> 
> I am still not completely happy with the wording, and I am open for
> suggestions. In the example, I played safe and mention k < -m and k >=m; even
> if k >= 0 probably works.
> 
> I also didn't know how to best state the reason for requiring a condition.
> (Internal reason: The annotation is attached to the condition - thus, it has
> to be present. External reason: For vectorization, there shouldn't be a
> branching in the body of the loop and without a condition in either the "for"
> header or in its body, one has an endless loop.)
> 
> Do you have suggestions for a better wording? If not, is the patch okay for
> the trunk?
> Built and regtested (C only). [An all-language bootstrap + regtesting is
> underway.]

The C front-end changes in this version are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com
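
For readers who have not seen the pragma, a minimal sketch of a loop in the
spirit of the k >= m case mentioned in the quoted mail follows.  The function
name scale_from_offset is hypothetical and is not taken from the patch or
its documentation.  The pragma is a promise by the programmer; the assert
only documents the condition under which that promise holds in this sketch.

```c
#include <assert.h>

/* Hypothetical helper: scale a window of A that starts K elements ahead
   into the first M elements.  The caller promises K >= M, so the elements
   read (a[k] .. a[k+m-1]) never overlap the elements written
   (a[0] .. a[m-1]).  */
void
scale_from_offset (int *a, int k, int c, int m)
{
  assert (k >= m);
#pragma GCC ivdep
  for (int i = 0; i < m; i++)
    a[i] = a[i + k] * c;
}
```

Without the pragma the compiler cannot know the relation between k and m, so
it has to assume a possible loop-carried dependence.  Whether the loop is
actually vectorized still depends on the target and the options used.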

Re: [C PATCH] Warn for _Alignas in an array declarator (PR c/58267)

2013-10-22 Thread Joseph S. Myers
On Mon, 21 Oct 2013, Marek Polacek wrote:

> Tested via make check -C gcc RUNTESTFLAGS=dg.exp=c1x-*.c, ok for
> trunk?
> 
> 2013-10-21  Marek Polacek  
> 
> c/
>   * c-parser.c (c_parser_struct_declaration): Add a comment.
>   (c_parser_declarator): Don't allow _Alignas here.
> testsuite/
>   * gcc.dg/c1x-align-5.c: Add more testing.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFC: Add of type-demotion pass

2013-10-22 Thread Jakub Jelinek
On Fri, Oct 18, 2013 at 12:06:35PM +0200, Richard Biener wrote:
> You can't move type conversion "out of the way" in most cases as
> GIMPLE is strongly typed
> and data sources and sinks can obviously not be "promoted" (nor can
> function arguments).
> So you'll very likely not be able to remove the code from the
> optimizers, it will only maybe
> trigger less often.

My take on the type demotion and promotion is that we badly need it and the
question is just in which pass to do it.

The benefit of type demotion is code canonicalization and removing
unnecessary computation that e.g. only affects the upper bits that are going
to be thrown away anyway, the disadvantage of type demotion of signed
operations is that we need to perform them in unsigned type instead and thus
we can't perform some loop optimizations based on undefined behavior etc.
See e.g.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c8
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45397#c10
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47477#c16
for some testcases where type demotion can improve generated code.
If types are demoted, upper bits of constants go away, SCCVN can find
equivalences between SSA_NAMEs that wouldn't be considered before, etc.

But given the issue with signed operation type demotion, I think before loop
optimizations we should only be doing type demotions that don't result
in defining previously undefined behavior operations.  I guess passes like
forwprop, gimple-fold etc. could easily handle the easy cases, where there
is a tree of has_single_use SSA_NAMEs that can be demoted, but handling
a more complicated web would be harder.  Say in:
unsigned int a, b, c, d, e, f; unsigned char h, i, j;
void
foo (void)
{
  unsigned int k = a * 2 + b + 0x1234;
  unsigned int l = c * 4 + d + 0x23456700;
  unsigned int m = e * 5 + f, n = k + l - m, o = k - l + m, p = -k + 1;
  h = n; i = o; j = p;
}
k, l, m all have multiple imm uses, but still pretty much everything in this
function could be demoted to unsigned char, the two large constants could go
away as additions of zero, etc.  Perhaps that can be seen as little benefit,
but what if the above is all
s/unsigned int/unsigned long long/;s/unsigned char/unsigned int/ on 32-bit
target?  RTL subreg pass might help a little bit, but that is too late.
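
The claim that everything in foo can be computed in the narrow type can be
checked with a small stand-alone sketch.  The names compute_wide,
compute_narrow and same_results are made up for illustration, and the sketch
assumes 8-bit unsigned char and 32-bit unsigned int.  It relies only on the
fact that addition, subtraction, negation and multiplication commute with
truncation modulo 256.  With these particular constants, 0x23456700
truncates to 0 and 0x1234 truncates to 0x34.

```c
#include <assert.h>

/* Wide version: mirrors k, l, m, n, o, p from foo above and truncates only
   at the end, as the stores to h, i, j do.  */
static void
compute_wide (const unsigned in[6], unsigned char out[3])
{
  unsigned k = in[0] * 2 + in[1] + 0x1234;
  unsigned l = in[2] * 4 + in[3] + 0x23456700;
  unsigned m = in[4] * 5 + in[5];
  out[0] = (unsigned char) (k + l - m);
  out[1] = (unsigned char) (k - l + m);
  out[2] = (unsigned char) (-k + 1);
}

/* Demoted version: inputs, constants and every intermediate are truncated
   to unsigned char.  */
static void
compute_narrow (const unsigned in[6], unsigned char out[3])
{
  unsigned char v[6];
  for (int i = 0; i < 6; i++)
    v[i] = (unsigned char) in[i];
  unsigned char k = (unsigned char) (v[0] * 2 + v[1] + 0x34);
  unsigned char l = (unsigned char) (v[2] * 4 + v[3]);  /* + 0 dropped */
  unsigned char m = (unsigned char) (v[4] * 5 + v[5]);
  out[0] = (unsigned char) (k + l - m);
  out[1] = (unsigned char) (k - l + m);
  out[2] = (unsigned char) (-k + 1);
}

/* Returns 1 when both versions agree on all three results.  */
static int
same_results (const unsigned in[6])
{
  unsigned char w[3], n[3];
  assert (in != 0);
  compute_wide (in, w);
  compute_narrow (in, n);
  return w[0] == n[0] && w[1] == n[1] && w[2] == n[2];
}
```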

For the demotion which changes undefined overflow operations to defined
ones, I wonder when is the last pass that usefully makes use of that
information, if e.g. we could do the full type demotion already before
vectorization somewhere in the loop optimization queue, or if that is still
too early.

Where type demotion and promotion is very important is IMHO vectorization,
the code we generate for mixed types vectorization is just huge and
terrible.  If we can help it by not computing useless upper bits, or on the
other side sometimes not doing parts of computations in smaller types, which
lead to all the other computations on wider types to be done with bigger
vectorization factor, we could improve generated code quality.

I wonder if for vectorizations we couldn't use the same thing I wrote
recently for if-conversion, for bbs potentially suitable for vectorization
(with the right loop form etc.), that is, if we don't do full type demotion
before vectorization, check if we'd demote anything and if so, work only on
the vectorization only loop copy (or create it), and then try to do some
type promotion to minimize number of type sizes in the loop,
see the http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47477#c16 (admittedly
artificial) testcase for what I mean.  After demotion, we could replace the
cast of short to char and back just with and (for zero extension) or signed
shift right + shift left (for sign extension), etc.
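
As a sketch of that last replacement (helper names are invented for
illustration): the zero-extending round trip short -> unsigned char -> short
is a single AND, and the sign-extending one can be done with a pair of
shifts, written here as a left shift followed by an arithmetic right shift.
The shift version assumes GCC's documented behaviour for converting
out-of-range values to signed types and for right-shifting negative values;
ISO C leaves both implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension of the low 8 bits: same value as
   (short) (unsigned char) x.  */
static int16_t
zext8_via_and (int16_t x)
{
  return (int16_t) (x & 0xff);
}

/* Sign extension of the low 8 bits: same value as (short) (signed char) x.
   Assumes modulo conversion to signed types and arithmetic right shift.  */
static int16_t
sext8_via_shifts (int16_t x)
{
  uint32_t up = (uint32_t) (uint16_t) x << 24;  /* low byte moved to the top */
  return (int16_t) ((int32_t) up >> 24);        /* shifted back, sign copied */
}

/* Compare both helpers against the casts for every 16-bit value and return
   the number of disagreements.  */
static int
count_mismatches (void)
{
  int bad = 0;
  for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
    {
      int16_t x = (int16_t) v;
      if (zext8_via_and (x) != (int16_t) (uint8_t) x)
        bad++;
      if (sext8_via_shifts (x) != (int16_t) (int8_t) x)
        bad++;
    }
  return bad;
}
```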

And, finally, the question is whether we generate good code if we just expand
RTL from the demoted types.  We'd better, because the user could have written
the code in the narrower types from the beginning (well, C implicit promotions
make that harder, but fold-const already demotes some computations that
appear in a single statement).  Or, if there are advantages to promoting some
types, the question is what algorithm to use for that, what cost model, what
target hooks, etc.

Jakub


Re: [C++ Patch, obvious?] PR 58816

2013-10-22 Thread Paolo Carlini

On 10/22/2013 01:02 PM, Jason Merrill wrote:

OK, thanks.

Applied.

Shall I apply this one too:

http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01166.html

?

Thanks,
Paolo.


Re: [PATCH, rs6000] Be careful with special permute masks for little endian, take 2

2013-10-22 Thread David Edelsohn
On Mon, Oct 21, 2013 at 8:03 PM, Bill Schmidt
 wrote:
> Hi,
>
> This is a revision of my earlier patch on the subject, expanded to catch
> a few more cases and with some attendant test-case adjustments:
>
> In altivec_expand_vec_perm_const, we look for special masks that match
> the behavior of specific instructions, so we can use those instructions
> rather than load a constant control vector and perform a permute.  Some
> of the masks must be treated differently for little endian mode.
>
> The masks that represent merge-high and merge-low operations have
> reversed meanings in little-endian, because of the reversed ordering of
> the vector elements.
>
> The masks that represent vector-pack operations remain correct when the
> mode of the input operands matches the natural mode of the instruction,
> but not otherwise.  This is because the pack instructions always select
> the rightmost, low-order bits of the vector element.  There are cases
> where we use this, for example, with a V8SI vector matching a vpkuwum
> mask in order to select the odd-numbered elements of the vector.  In
> little endian mode, this instruction will get us the even-numbered
> elements instead.  There is no alternative instruction with the desired
> behavior, so I've just disabled use of those masks for little endian
> when the mode isn't natural.
>
> This requires adjusting the altivec-perm-1.c test case.  The vector pack
> tests are moved to a new altivec-perm-3.c test, which is restricted to
> big-endian targets.
>
> These changes fix 49 failures in the test suite for little endian mode
> (9 vector failures left to go!).  Bootstrapped and tested on
> powerpc64{,le}-unknown-linux-gnu with no new failures.  Is this ok for
> trunk?
>
> Thanks,
> Bill
>
>
> gcc:
>
> 2013-10-21  Bill Schmidt  
>
> * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
> meaning of merge-high and merge-low masks for little endian; avoid
> use of vector-pack masks for little endian for mismatched modes.
>
> gcc/testsuite:
>
> 2013-10-21  Bill Schmidt  
>
> * gcc.target/powerpc/altivec-perm-1.c: Move the two vector pack
> tests into...
> * gcc.target/powerpc/altivec-perm-3.c: ...this new test, which is
> restricted to big-endian targets.

Okay.

Thanks, David
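
The vector-pack point in Bill's description above can be illustrated with a
toy model.  This is not the AltiVec instruction or intrinsic: the function
model_pack_low_halves is invented for illustration and looks at a single
input vector only.  It views eight halfword elements as four word elements
and keeps the low-order halfword of each word, which is the part a
pack-modulo operation selects.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: ELEMS holds eight halfword elements in element-number order.
   Word W is made of elements 2W and 2W+1.  In big-endian element order the
   second element of the pair holds the low-order bits of the word; in
   little-endian order the first one does.  */
static void
model_pack_low_halves (const uint16_t elems[8], int big_endian,
                       uint16_t out[4])
{
  assert (elems != 0 && out != 0);
  for (int w = 0; w < 4; w++)
    out[w] = big_endian ? elems[2 * w + 1] : elems[2 * w];
}
```

With elements numbered 0 to 7, the big-endian case yields the odd-numbered
elements and the little-endian case the even-numbered ones, which is why a
mask that asks for the odd elements cannot be mapped to the pack instruction
on little endian when the mode does not match.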


Re: patch to enable LRA for ppc

2013-10-22 Thread David Edelsohn
On Mon, Oct 21, 2013 at 10:42 PM, Vladimir Makarov  wrote:

>> I would say let's add -mlra, but make the default OFF for the time being.
>> We
>> can always switch the default later.
>
> Sure, if you know some LRA problems it should not be on default. Moreover,
> if we still have the problems when releasing gcc4.9, I think we should
> exclude any possibility for a user to use LRA for ppc.  I don't want to have
> GCC-4.9 users blaming LRA.
>
> But adding LRA to PPC on the trunk (switched OFF by default) earlier could
> help me a lot to work on the issues.

My main concern was disrupting Mike. If Mike is comfortable with
adding LRA disabled by default, it is okay with me.

The patch mostly adds lra_in_progress, which will not have any effect
while LRA remains disabled.

My one question about the patch is:

-  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r")
+  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,&r")

which may cause register preferencing problems for bswap when LRA is not used.

The rest of the patch is okay.

Thanks, David


Re: [RFC] By default if-convert only basic blocks that will be vectorized (take 3)

2013-10-22 Thread Sergey Ostanevich
I use
-march=core-avx2 -static -Ofast -flto -funroll-loops
with all 3 patches applied to git version I mentioned

the same is in progress for 'take 3'

On Tue, Oct 22, 2013 at 5:26 PM, Jakub Jelinek  wrote:
> On Tue, Oct 22, 2013 at 05:16:29PM +0400, Sergey Ostanevich wrote:
>> ouch.. html got me!
>>
>> applying to the same as before
>>
>> commit 0e3dfadd374c3045a926afa6d06af276cee2108d
>> Author: rguenth 
>> Date:   Fri Oct 18 08:36:28 2013 +
>>
>> trying '06 build.
>>
>>
>> for take 2 on HSW there were mixed results for passed tests:
>>
>> 400.perlbench +0.25%
>> 401.bzip2 -0.40%
>> 429.mcf   -1.67%
>> 456.hmmer +0%
>> 458.sjeng -1.65%
>> 471.omnetpp   -1.72%
>> 483.xalancbmk -0.25%
>> 410.bwaves+0%
>> 433.milc  -0.28%
>> 444.namd  -1.13%
>> 450.soplex-1.01%
>> 459.GemsFDTD  -0.25%
>> 470.lbm   +1.44%
>> 482.sphinx3   +0.61%
>
> Thanks.
>
> If this is with AVX or AVX2, there can be multiple things.
>
> One is whether if-conversion in its current form is really so harmful
> for generated code quality (we have various testcases where what we get with
> it is very bad, but perhaps there are still testcases where it improves
> non-vectorized loops), another one is when the masked loads/stores/gather is
> actually beneficial and whether we don't want some better tuning for it.
> The patchset as whole does both things, so determining what is what is hard.
>
> If only the first two patches are applied and not the third, then it should
> give a picture on whether it is a win to have if-conversion done in its
> current form only for vectorized bbs.
>
> And, if all 3 patches are applied, but the condition in tree_if_conversion:
>   if ((flag_tree_loop_vectorize || loop->force_vect)
>       && (flag_tree_loop_if_convert == -1 || any_mask_load_store)
>       && !version_loop_for_if_conversion (loop, &version_outer_loop,
>                                           any_mask_load_store))
>     goto cleanup;
> is changed into just
>   if (any_mask_load_store
>       && !version_loop_for_if_conversion (loop, &version_outer_loop, true))
>     goto cleanup;
> then if-conversion would happen as before, just in cases where previously
> we wouldn't if-convert because of conditional loads/stores, but now we
> could, those changes would be limited to vectorized loops only.
>
> Jakub


Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.

2013-10-22 Thread Kirill Yukhin
Hello,
On 20 Oct 11:55, Uros Bizjak wrote:
> Please also add back expanders with operand fixups and insn
> constraints, as is the case with other commutative operators. They are
> needed to hoist operand loads out of the loops (reload and later
> passes won't hoist memory loads out of the loops when fixing up
> operands).
Whoops. I didn't know how git diff works for a set of patches.
The updated patch is at the bottom.


> The patch is OK with this change, but please wait for rths final approval.

Richard, are you ok?

Bootstrap pass.

--
Thanks, K

---
 gcc/config/i386/i386.md   |   5 +
 gcc/config/i386/predicates.md |  40 ++
 gcc/config/i386/sse.md| 932 +-
 3 files changed, 971 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 10ca6cb..e7e9f2d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -831,6 +831,11 @@
 (define_code_attr s [(sign_extend "s") (zero_extend "u")])
 (define_code_attr u_bool [(sign_extend "false") (zero_extend "true")])
 
+;; Used in signed and unsigned truncations.
+(define_code_iterator any_truncate [ss_truncate truncate us_truncate])
+;; Instruction suffix for truncations.
+(define_code_attr trunsuffix [(ss_truncate "s") (truncate "") (us_truncate "us")])
+
 ;; Used in signed and unsigned fix.
 (define_code_iterator any_fix [fix unsigned_fix])
 (define_code_attr fixsuffix [(fix "") (unsigned_fix "u")])
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 06b2914..999d8ab 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -752,6 +752,11 @@
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 6, 7)")))
 
+;; Match 8 to 9.
+(define_predicate "const_8_to_9_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 8, 9)")))
+
 ;; Match 8 to 11.
 (define_predicate "const_8_to_11_operand"
   (and (match_code "const_int")
@@ -762,16 +767,51 @@
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 8, 15)")))
 
+;; Match 10 to 11.
+(define_predicate "const_10_to_11_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 10, 11)")))
+
+;; Match 12 to 13.
+(define_predicate "const_12_to_13_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 12, 13)")))
+
 ;; Match 12 to 15.
 (define_predicate "const_12_to_15_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 12, 15)")))
 
+;; Match 14 to 15.
+(define_predicate "const_14_to_15_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 14, 15)")))
+
+;; Match 16 to 19.
+(define_predicate "const_16_to_19_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 16, 19)")))
+
 ;; Match 16 to 31.
 (define_predicate "const_16_to_31_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 16, 31)")))
 
+;; Match 20 to 23.
+(define_predicate "const_20_to_23_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 20, 23)")))
+
+;; Match 24 to 27.
+(define_predicate "const_24_to_27_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 24, 27)")))
+
+;; Match 28 to 31.
+(define_predicate "const_28_to_31_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 28, 31)")))
+
 ;; True if this is a constant appropriate for an increment or decrement.
 (define_predicate "incdec_operand"
   (match_code "const_int")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 623e919..c429855 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -87,6 +87,7 @@
   ;; For AVX512F support
   UNSPEC_VPERMI2
   UNSPEC_VPERMT2
+  UNSPEC_UNSIGNED_FIX_NOTRUNC
   UNSPEC_UNSIGNED_PCMP
   UNSPEC_TESTM
   UNSPEC_TESTNM
@@ -2994,6 +2995,34 @@
(set_attr "prefix" "maybe_vex")
(set_attr "mode" "DI")])
 
+(define_insn "cvtusi232"
+  [(set (match_operand:VF_128 0 "register_operand" "=v")
+   (vec_merge:VF_128
+ (vec_duplicate:VF_128
+   (unsigned_float:
+ (match_operand:SI 2 "nonimmediate_operand" "rm")))
+ (match_operand:VF_128 1 "register_operand" "v")
+ (const_int 1)))]
+  "TARGET_AVX512F"
+  "vcvtusi2\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sseicvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "cvtusi264"
+  [(set (match_operand:VF_128 0 "register_operand" "=v")
+   (vec_merge:VF_128
+ (vec_duplicate:VF_128
+   (unsigned_float:
+ (match_operand:DI 2 "nonimmediate_operand" "rm")))
+ (match_operand:VF_128 1 "register_operand" "v")
+ (const_int 1)))]
+  "TARGET_AVX512F && TARGET_64BIT"
+  "vcvtusi2\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sseicvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "

Re: [C++14] implement [[deprecated]].

2013-10-22 Thread Paolo Carlini

Hi,

On 10/22/2013 03:07 PM, Iain Sandoe wrote:

Is PR17729 still the most significant? (there are related test-suite ones).
For example, PR33911: still no warning for the line "goo f2;" of 
the original testcase submitted by Benjamin. Funny that apparently 
current clang has the same issue; icc is fine. There is also PR40075, 
related to 17729. And more.


Paolo.


RE: [PATCH, PR 57748] Check for out of bounds access

2013-10-22 Thread Bernd Edlinger
Hi, 

On Tue, 17 Sep 2013 01:09:45, Martin Jambor wrote:
>> @@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
>>        if (MEM_P (to_rtx)
>>        && GET_MODE (to_rtx) == BLKmode
>>        && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
>> +      && bitregion_start == 0
>> +      && bitregion_end == 0
>>        && bitsize> 0
>>        && (bitpos % bitsize) == 0
>>        && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0
>>
> ...
>
> I'm not sure to what extent the hunk adding tests for bitregion_start
> and bitregion_end being zero is connected to this issue. I do not see
> any of the testcases exercising that path. If it is indeed another
> problem, I think it should be submitted (and potentially committed) as
> a separate patch, preferably with a testcase.
>

Meanwhile I am able to give an example where that code is executed
with bitpos = 64, bitsize=32, bitregion_start = 32, bitregion_end = 95.

Afterwards bitpos=0, bitsize=32, which is completely outside
bitregion_start=32, bitregion_end=95.

However, this can only be seen in the debugger, as the store_field
goes through a code path that does not look at bitregion_start/end.

Well, that is at least extremely ugly, and I would not rule out that
someone could come up with a sample that crashes or generates wrong code.

Currently I think that maybe the best way to fix that would be this:

--- gcc/expr.c    2013-10-21 08:27:09.546035668 +0200
+++ gcc/expr.c    2013-10-22 15:19:56.749476525 +0200
@@ -4762,6 +4762,9 @@ expand_assignment (tree to, tree from, b
   && MEM_ALIGN (to_rtx) == GET_MODE_ALIGNMENT (mode1))
     {
   to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
+      bitregion_start = 0;
+      if (bitregion_end>= (unsigned HOST_WIDE_INT) bitpos)
+        bitregion_end -= bitpos;
   bitpos = 0;
     }
 
Any suggestions?
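
To check the arithmetic of the proposed hunk against the numbers given above
(bitpos = 64, bitsize = 32, bitregion_start = 32, bitregion_end = 95), here
is a stand-alone sketch.  The struct bit_access, the typedef uhwi and both
functions are made up for illustration; only the three assignments mirror
the hunk.

```c
#include <assert.h>

typedef unsigned long long uhwi;  /* stand-in for unsigned HOST_WIDE_INT */

struct bit_access
{
  long long bitpos;      /* start of the field, in bits */
  uhwi bitsize;          /* width of the field, in bits */
  uhwi bitregion_start;  /* first bit that may be touched */
  uhwi bitregion_end;    /* last bit that may be touched */
};

/* Mirrors the added lines plus the existing "bitpos = 0": once the address
   has been advanced by bitpos, the region is expressed relative to the new
   base.  */
static void
rebase_after_adjust_address (struct bit_access *acc)
{
  assert (acc->bitpos >= 0);
  acc->bitregion_start = 0;
  if (acc->bitregion_end >= (uhwi) acc->bitpos)
    acc->bitregion_end -= (uhwi) acc->bitpos;
  acc->bitpos = 0;
}

/* Returns 1 when the field lies completely inside the bit region.  */
static int
field_inside_region (const struct bit_access *acc)
{
  uhwi first = (uhwi) acc->bitpos;
  uhwi last = first + acc->bitsize - 1;
  return first >= acc->bitregion_start && last <= acc->bitregion_end;
}
```

In this model the region becomes [0, 31] after the adjustment and again
contains the 32-bit field at bitpos 0, whereas resetting bitpos alone leaves
the field outside [32, 95].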



Regards
Bernd.

extern void abort (void);

struct x{
  int a;
  int :32;
  volatile int b:32;
};

struct s
{
  int a,b,c,d;
  struct x xx[1];
};

struct s ss;
volatile int k;

int main()
{
  ss.xx[k].b = 1;
//  asm volatile("":::"memory");
  if ( ss.xx[k].b != 1)
    abort ();
  return 0;
}


Re: [RFC] By default if-convert only basic blocks that will be vectorized (take 3)

2013-10-22 Thread Jakub Jelinek
On Tue, Oct 22, 2013 at 05:16:29PM +0400, Sergey Ostanevich wrote:
> ouch.. html got me!
> 
> applying to the same as before
> 
> commit 0e3dfadd374c3045a926afa6d06af276cee2108d
> Author: rguenth 
> Date:   Fri Oct 18 08:36:28 2013 +
> 
> trying '06 build.
> 
> 
> for take 2 on HSW there were mixed results for passed tests:
> 
> 400.perlbench +0.25%
> 401.bzip2 -0.40%
> 429.mcf   -1.67%
> 456.hmmer +0%
> 458.sjeng -1.65%
> 471.omnetpp   -1.72%
> 483.xalancbmk -0.25%
> 410.bwaves+0%
> 433.milc  -0.28%
> 444.namd  -1.13%
> 450.soplex-1.01%
> 459.GemsFDTD  -0.25%
> 470.lbm   +1.44%
> 482.sphinx3   +0.61%

Thanks.

If this is with AVX or AVX2, there can be multiple things.

One is whether if-conversion in its current form is really so harmful
for generated code quality (we have various testcases where what we get with
it is very bad, but perhaps there are still testcases where it improves
non-vectorized loops), another one is when the masked loads/stores/gather is
actually beneficial and whether we don't want some better tuning for it.
The patchset as whole does both things, so determining what is what is hard.

If only the first two patches are applied and not the third, then it should
give a picture on whether it is a win to have if-conversion done in its
current form only for vectorized bbs.

And, if all 3 patches are applied, but the condition in tree_if_conversion:
  if ((flag_tree_loop_vectorize || loop->force_vect)
      && (flag_tree_loop_if_convert == -1 || any_mask_load_store)
      && !version_loop_for_if_conversion (loop, &version_outer_loop,
                                          any_mask_load_store))
    goto cleanup;
is changed into just
  if (any_mask_load_store
      && !version_loop_for_if_conversion (loop, &version_outer_loop, true))
    goto cleanup;
then if-conversion would happen as before, just in cases where previously
we wouldn't if-convert because of conditional loads/stores, but now we
could, those changes would be limited to vectorized loops only.
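
For context, the kind of loop meant by "conditional loads/stores" can be
sketched as below.  The function copy_positive is a hypothetical example and
not one of the testcases from the patchset.  The store to dst[i] happens
only when the condition holds, so turning the body into straight-line code
needs either a masked store or a proof that an unconditional store is safe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example loop with a conditional store: elements of DST are
   overwritten only where SRC is positive.  */
void
copy_positive (int *dst, const int *src, size_t n)
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < n; i++)
    if (src[i] > 0)
      dst[i] = src[i];
}
```

Whether a given GCC build if-converts and vectorizes such a loop depends on
the target, the options and the patches applied, so this is only meant to
show the shape of the code being discussed.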

Jakub


Re: [RFC] By default if-convert only basic blocks that will be vectorized (take 3)

2013-10-22 Thread Sergey Ostanevich
ouch.. html got me!

applying to the same as before

commit 0e3dfadd374c3045a926afa6d06af276cee2108d
Author: rguenth 
Date:   Fri Oct 18 08:36:28 2013 +

trying '06 build.


for take 2 on HSW there were mixed results for passed tests:

400.perlbench +0.25%
401.bzip2 -0.40%
429.mcf   -1.67%
456.hmmer +0%
458.sjeng -1.65%
471.omnetpp   -1.72%
483.xalancbmk -0.25%
410.bwaves+0%
433.milc  -0.28%
444.namd  -1.13%
450.soplex-1.01%
459.GemsFDTD  -0.25%
470.lbm   +1.44%
482.sphinx3   +0.61%

Sergos

On Tue, Oct 22, 2013 at 5:12 PM, Sergey Ostanevich  wrote:
> applying to the same as before
>
> commit 0e3dfadd374c3045a926afa6d06af276cee2108d
> Author: rguenth 
> Date:   Fri Oct 18 08:36:28 2013 +
>
> trying '06 build.
>
>
> for take 2 on HSW there were mixed results for passed tests:
>
> 400.perlbench +0.25%
> 401.bzip2 -0.40%
> 429.mcf   -1.67%
> 456.hmmer +0%
> 458.sjeng -1.65%
> 471.omnetpp   -1.72%
> 483.xalancbmk -0.25%
> 410.bwaves+0%
> 433.milc  -0.28%
> 444.namd  -1.13%
> 450.soplex-1.01%
> 459.GemsFDTD  -0.25%
> 470.lbm   +1.44%
> 482.sphinx3   +0.61%
>
> Sergos
>
>
> On Tue, Oct 22, 2013 at 2:56 PM, Jakub Jelinek  wrote:
>>
>> Hi!
>>
>> On Fri, Oct 18, 2013 at 05:45:15PM +0400, Sergey Ostanevich wrote:
>> > failed on 403 of '06:
>> > regclass.c: In function 'init_reg_sets':
>> > regclass.c:277:1: internal compiler error: tree check: expected
>> > ssa_name, have var_decl in copy_ssa_name_fn, at tree-ssanames.c:393
>> >  init_reg_sets ()
>> >  ^
>>
>> That is because for -ftree-loop-if-convert-stores or masked loads/stores
>> we need TODO_update_ssa_only_virtuals and loop versioning relies on no
>> SSA updates being needed in the loop being versioned.
>>
>> Attached is updated patchset (the fix is in the second patch, the third
>> one needed adjusting because of the fix and also apparently Andrew's
>> header reshuffling requires now additional include in the third patch).
>>
>> Will bootstrap/regtest this after my current bootstrap/regtest finishes.
>>
>> Jakub
>
>


[PATCH] Fix various reassoc issues (PR tree-optimization/58791, tree-optimization/58775)

2013-10-22 Thread Jakub Jelinek
Hi!

I've spent over two days looking at reassoc, fixing spots where
we invalidly reused SSA_NAMEs (this results in wrong-debug, as the added
guality testcases show, even some ICEs (pr58791-3.c) and wrong range info
for SSA_NAMEs) and cleaning up the stmt scheduling stuff.  E.g. all gsi_move*
calls are gone; if we need to "move" something or set an SSA_NAME to a
different value than previously, we'll now always create a new stmt, and
the old one is, depending on the case, either removed or marked as visited
with zero uses, so that it will be removed later on by reassociate_bb.
Of course some gimple_assign_set_rhs* etc. calls are still valid even
without creating new stmts, optimizing some statement to equivalent
computation is fine, but computing something different in an old SSA_NAME
is not.

I've also noticed that build_and_add_sum was using different framework from
rewrite_expr_tree, the former was using stmt_dominates_stmt_p (which is IMHO
quite clean interface, but with the added uid stuff in reassoc can be
unnecessarily slow on large basic blocks) and rewrite_expr_tree was using
worse APIs, but using the uids.  So, the patch also unifies that, into
a new reassoc_stmt_dominates_stmt_p that has the same semantics as the
tree-ssa-loop-niter.c function, but uses uids internally.  rewrite_expr_tree
is changed so that it recurses first, then handles current level (which is
needed if the recursion needs to create new stmt and give back a new
SSA_NAME), which allowed removing the ensure_ops_are_available recursive
stuff.  Also, uids are now computed in break_up_subtract_bb (and are per-bb,
starting with 1, we never compare uids from different bbs), which allows
us to get rid of an extra whole IL walk.
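
The idea of answering same-block dominance queries from per-bb uids can be
sketched with toy data structures.  Nothing below is GCC code: struct
toy_stmt, the idom array and both functions are invented for illustration,
and the real function has more cases than this sketch covers.

```c
#include <assert.h>

/* Toy stand-in for a statement; not a GCC type.  */
struct toy_stmt
{
  int bb;        /* index of the containing basic block */
  unsigned uid;  /* position within that block, starting at 1 */
};

/* IDOM[b] is the immediate dominator of block b; the entry block is its
   own immediate dominator.  Returns 1 when B1 dominates B2.  */
static int
toy_bb_dominates (const int *idom, int b1, int b2)
{
  for (;;)
    {
      if (b1 == b2)
        return 1;
      if (idom[b2] == b2)
        return 0;
      b2 = idom[b2];
    }
}

/* Same block: a uid comparison replaces walking the statement list.
   Different blocks: fall back to block-level dominance.  */
static int
toy_stmt_dominates (const int *idom, const struct toy_stmt *s1,
                    const struct toy_stmt *s2)
{
  assert (s1->uid != 0 && s2->uid != 0);
  if (s1->bb == s2->bb)
    return s1->uid <= s2->uid;
  return toy_bb_dominates (idom, s1->bb, s2->bb);
}
```

Because uids are only compared within one block, they can restart at 1 in
every block, which matches the per-bb numbering described above.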

For the inter-bb optimization, I had to stop modifying stmts right away
in update_range_test, because we don't want to reuse SSA_NAMEs and if we
modified there, we'd need to modify potentially many dependent SSA_NAMEs
and sometimes many times.  So, now it instead just updates oe->op values
and maybe_optimize_range_tests just looks at those values and updates
what is needed.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

For 4.8 a partial backport would be possible, but quite a lot of work,
for 4.7 I'd prefer not to backport given that there gsi_for_stmt isn't O(1).

2013-10-22  Jakub Jelinek  

PR tree-optimization/58775
PR tree-optimization/58791
* tree-ssa-reassoc.c (reassoc_stmt_dominates_stmt_p): New function.
(insert_stmt_after): Rewritten, don't move the stmt, but really
insert it.
(get_stmt_uid_with_default): Remove.
(build_and_add_sum): Use insert_stmt_after and
reassoc_stmt_dominates_stmt_p.  Fix up uid if bb contains only
labels.
(update_range_test): Set uid on stmts added by
force_gimple_operand_gsi.  Don't immediately modify statements
in inter-bb optimization, just update oe->op values.
(optimize_range_tests): Return bool whether any changes have
been made.
(update_ops): New function.
(struct inter_bb_range_test_entry): New type.
(maybe_optimize_range_tests): Perform statement changes here.
(not_dominated_by, appears_later_in_bb, get_def_stmt,
ensure_ops_are_available): Remove.
(find_insert_point): Rewritten.
(rewrite_expr_tree): Remove MOVED argument, add CHANGED argument,
return LHS of the (new resp. old) stmt.  Don't call
ensure_ops_are_available, don't reuse SSA_NAMEs, recurse first
instead of last, move new stmt at the right place.
(linearize_expr, repropagate_negates): Don't reuse SSA_NAMEs.
(negate_value): Likewise.  Set uids.
(break_up_subtract_bb): Initialize uids.
(reassociate_bb): Adjust rewrite_expr_tree caller.
(do_reassoc): Don't call renumber_gimple_stmt_uids.

* gcc.dg/guality/pr58791-1.c: New test.
* gcc.dg/guality/pr58791-2.c: New test.
* gcc.dg/guality/pr58791-3.c: New test.
* gcc.dg/guality/pr58791-4.c: New test.
* gcc.dg/guality/pr58791-5.c: New test.
* gcc.c-torture/compile/pr58775.c: New test.
* gcc.dg/tree-ssa/reassoc-28.c: Don't scan reassoc1 dump.

--- gcc/tree-ssa-reassoc.c.jj   2013-10-21 09:00:25.0 +0200
+++ gcc/tree-ssa-reassoc.c  2013-10-22 12:04:38.693490273 +0200
@@ -1143,12 +1143,80 @@ zero_one_operation (tree *def, enum tree
   while (1);
 }
 
-/* Returns the UID of STMT if it is non-NULL. Otherwise return 1.  */
+/* Returns true if statement S1 dominates statement S2.  Like
+   stmt_dominates_stmt_p, but uses stmt UIDs to optimize.  */
 
-static inline unsigned
-get_stmt_uid_with_default (gimple stmt)
+static bool
+reassoc_stmt_dominates_stmt_p (gimple s1, gimple s2)
+{
+  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
+
+  if (!bb1 || s1 == s2)
+return true;
+
+  if (!bb2)
+return false;
+
+  if (bb1 == bb2)
+{
+  if (gimple_code (s2) ==

Re: [C++14] implement [[deprecated]].

2013-10-22 Thread Iain Sandoe
Hi Ed, Paolo,

On 22 Oct 2013, at 13:55, Ed Smith-Rowland wrote:

> On 10/22/2013 08:37 AM, Paolo Carlini wrote:
>> On 10/22/2013 02:28 PM, Ed Smith-Rowland wrote:
>>> I think this is pretty easy - gnu::deprecated has the same semantics.
>> Unfortunately however, gnu::deprecated has a number of long standing issues 
>> (just search Bugzilla), personally I'm not sure we want to say everybody 
>> that we have got [[deprecated]] implemented, until those are solved. Just my 
>> personal opinion.
>> 

> I'll check bugzilla.  We'll hold…

Is PR17729 still the most significant? (there are related test-suite ones).

Unfortunately, I still don't have time to address this, but perhaps someone
could update the attempt I made before?

It would be nice to fix it and to add the associated "unavailable" error 
attribute ...

Iain



Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Ilya Enkovich
2013/10/21 Jeff Law :
> On 10/15/13 07:31, Ilya Enkovich wrote:
>>
>> Hey guys,
>>
>> could someone please look at this small patch? It blocks the approved MPX
>> ISA support on the i386 target.
>
>

 diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
 index 1d62223..02b1214 100644
 --- a/gcc/doc/rtl.texi
 +++ b/gcc/doc/rtl.texi
 @@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the
 @file{@var{machine}-modes.def}.
   @xref{Jump Patterns},
   also see @ref{Condition Code}.

 +@findex MODE_BOUND
 +@item MODE_BOUND
 +Bound modes class.  Used to represent values of pointer bounds.
 +
   @findex MODE_RANDOM
   @item MODE_RANDOM
   This is a catchall mode class for modes which don't fit into the above
>
> So why are bounds distinct modes?  Is there some inherent reason why
> bounds are something other than an integer mode (MODE_INT)?
>
> Similarly what's the rationale behind having new types for bounds?  Is there
> some reason why they couldn't be implemented with one of the existing types?
>
> ISTM the entire patch is gated on being able to answer those two questions.
>

Hello Jeff,

Before introducing a new type and mode we tried to implement everything
using existing ones. We tried integers, pointers, complex with a pointer
type as the base, and also a structure of two pointers. The problem is that
the semantics of bounds differ from everything we have for the base
types. None of the operators (exprs) we have for existing types are
applicable to bounds. We could probably use some existing type/mode, but
it would still require an additional flag to mark bounds, and almost
every place that handles the chosen basic type would have to
check whether it is working with bounds. I do not think many GCC
developers (at least in the near future) will care about
instrumented code while writing their patches. That means many
developers may break instrumented code by adding any sort of
manipulation of values of the type/mode we choose as the basis for bounds.
I'm sure having a proper type is much more convenient and natural.

In addition to everything said about the bound type, the bound mode may
also have a different binary format. On the i386 target bounds have a
special binary format that is not equal to a pair of pointers. In many
places (ABI, insn templates, etc.) we need to know whether we are working
with bounds. E.g. passing 'long long' and bounds in register(s) is
different even if the size is the same.

In short: why use the same base type/mode for totally different things?
I do not know whether it is possible to implement everything using existing
types and modes. It probably is, but to me it does not seem
the right way to go.

Thanks,
Ilya

>
> jeff
>


Re: [C++14] implement [[deprecated]].

2013-10-22 Thread Ed Smith-Rowland

On 10/22/2013 08:37 AM, Paolo Carlini wrote:

On 10/22/2013 02:28 PM, Ed Smith-Rowland wrote:

I think this is pretty easy - gnu::deprecated has the same semantics.
Unfortunately, however, gnu::deprecated has a number of long-standing
issues (just search Bugzilla). Personally, I'm not sure we want to tell
everybody that we have [[deprecated]] implemented until those are
solved. Just my personal opinion.


Paolo.


I'll check bugzilla.  We'll hold...



[c++-concepts] bitfield reference bugfix

2013-10-22 Thread Andrew Sutton
This fixes the longstanding problem with bitfield references. The
default dialect was set to cxx1y, which was resulting in different
conversions for bitfield references. I'm not sure if there's a change
in semantics for 1y or if that's a separate bug, but it's not related
to concepts.

Also prevent an ICE for invalid constrained friends.

2013-10-16  Andrew Sutton  
* gcc/c-family/c-common.c (cxx_dialect): Make the default
language C++11.
* gcc/cp/constraint.cc (check_constrained_friend): Don't assert
on error_mark_node.

Andrew


bugfix-2.patch
Description: Binary data


Re: [PATCH][buildrobot] tilepro/tilegx: fallout after tree.h refactoring (was: Re-factor inclusion of tree.h)

2013-10-22 Thread Diego Novillo
On Tue, Oct 22, 2013 at 4:22 AM, Jan-Benedict Glaw  wrote:
> On Mon, 2013-10-21 15:36:49 -0400, Diego Novillo  wrote:
>> Can anyone think of some way that we can use to automatically block
>> inclusions of tree.h from header files? Code review is the only way
>> that comes to mind.
>
> Grep once, then install a commit hook.

Hm, that may work.  Thanks.

> I get some fallout for tilepro-linux, see
> http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=21851 .

Sorry about that.  I missed it in my testing because tilepro-linux-gnu
fails early in libcpp with:

../../../gcc/libcpp/macro.c:2965:58: error: format not a string
literal and no format arguments [-Werror=format-security]
../../../gcc/libcpp/macro.c:2978:58: error: format not a string
literal and no format arguments [-Werror=format-security]


> This fixes it:
>
> 2013-10-22  Jan-Benedict Glaw  
>
> * config/tilepro/tilepro.c: Include "tree.h".

Sure.  It qualifies as obvious.  Thanks.


Diego.


Re: [C++14] implement [[deprecated]].

2013-10-22 Thread Paolo Carlini

On 10/22/2013 02:28 PM, Ed Smith-Rowland wrote:

I think this is pretty easy - gnu::deprecated has the same semantics.
Unfortunately, however, gnu::deprecated has a number of long-standing
issues (just search Bugzilla). Personally, I'm not sure we want to tell
everybody that we have [[deprecated]] implemented until those are
solved. Just my personal opinion.


Paolo.


m68k: handle register conflict with PRE_DEC in notice_update_cc

2013-10-22 Thread Andreas Schwab
When compiling libxml2 with -O2 -fomit-frame-pointer the following pair
of insns is generated:

(insn 113 112 114 (set (mem:SI (pre_dec:SI (reg/f:SI 15 %sp)) [0 S4 A16])
(mem/c:SI (plus:SI (reg/f:SI 15 %sp)
(const_int 44 [0x2c])) [5 strict+0 S4 A32])) xpath.i:269 38 
{*movsi_m68k2}
 (expr_list:REG_ARGS_SIZE (const_int 12 [0xc])
(nil)))
(insn 114 113 115 (set (cc0)
(compare (mem/c:SI (plus:SI (reg/f:SI 15 %sp)
(const_int 44 [0x2c])) [5 inf+0 S4 A32])
(const_int 0 [0]))) xpath.i:269 4 {*tstsi_internal_68020_cf}
 (nil))

For rtl_equal_p (mem/c:SI (plus:SI (reg/f:SI 15 %sp) (const_int 44
[0x2c])) [5 strict+0 S4 A32]) and (mem/c:SI (plus:SI (reg/f:SI 15 %sp)
(const_int 44 [0x2c])) [5 inf+0 S4 A32]) look the same, so
final_scan_insn thinks the latter insn is a redundant test.  But since
pre_dec has modified %sp the two mems refer to two different stack
slots.  notice_update_cc should not register the src operand in the
first insn as a CC status setter in this case.
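
The address arithmetic behind this can be sketched in C++ with a toy stack model. This is only an illustration, not GCC code: `ToyStack`, `push_pre_dec`, `load_sp_plus`, and the 4-byte slot size are assumptions made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the m68k stack: sp is a byte offset into a word array.
// All names here are hypothetical; this is not GCC code.
struct ToyStack {
  std::int32_t words[64] = {};
  unsigned sp = 128;  // byte offset; the stack grows downwards

  // Value of (mem:SI (plus:SI sp (const_int off))).
  std::int32_t load_sp_plus(unsigned off) const {
    return words[(sp + off) / 4];
  }

  // (set (mem:SI (pre_dec:SI sp)) src): decrement sp first, then store.
  void push_pre_dec(std::int32_t src) {
    sp -= 4;
    words[sp / 4] = src;
  }
};

// Models insn 113 followed by insn 114.  Returns true when the address
// tested by insn 114 is the same address that insn 113 read from.
inline bool same_slot_after_push(ToyStack &s) {
  unsigned addr_before = s.sp + 44;    // address read by insn 113
  s.push_pre_dec(s.load_sp_plus(44));  // insn 113: source read, then pre_dec
  unsigned addr_after = s.sp + 44;     // address read by insn 114
  return addr_before == addr_after;
}
```

In this model the two `sp + 44` expressions are textually identical but resolve to addresses 4 bytes apart, which is why the source of the first insn must not be recorded as describing the condition codes.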

Bootstrapped and tested on m68k-suse-linux, committed to trunk.

Andreas.

* config/m68k/m68k.c (notice_update_cc): Handle register conflict
with PRE_DEC.

diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index 5e3236f..7035504 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -4209,6 +4209,13 @@ notice_update_cc (rtx exp, rtx insn)
   && cc_status.value2
   && reg_overlap_mentioned_p (cc_status.value1, cc_status.value2))
 cc_status.value2 = 0;
+  /* Check for PRE_DEC in dest modifying a register used in src.  */
+  if (cc_status.value1 && GET_CODE (cc_status.value1) == MEM
+  && GET_CODE (XEXP (cc_status.value1, 0)) == PRE_DEC
+  && cc_status.value2
+  && reg_overlap_mentioned_p (XEXP (XEXP (cc_status.value1, 0), 0),
+ cc_status.value2))
+cc_status.value2 = 0;
   if (((cc_status.value1 && FP_REG_P (cc_status.value1))
       || (cc_status.value2 && FP_REG_P (cc_status.value2))))
 cc_status.flags = CC_IN_68881;
-- 
1.8.4.1

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[C++14] implement [[deprecated]].

2013-10-22 Thread Ed Smith-Rowland

I think this is pretty easy - gnu::deprecated has the same semantics.

Bootstrapped and tested on x86_64-linux.

OK?


gcc/cp:

2013-10-22  Edward Smith-Rowland  <3dw...@verizon.net>

* parser.c (cp_parser_std_attribute): Interpret [[deprecated]]
as [[gnu::deprecated]].

gcc/testsuite:

2013-10-22  Edward Smith-Rowland  <3dw...@verizon.net>

* g++.dg/cpp1y/attr-deprecated.C: New.
* g++.dg/cpp1y/attr-deprecated-neg.C: New.

Index: cp/parser.c
===
--- cp/parser.c (revision 203915)
+++ cp/parser.c (working copy)
@@ -21426,6 +21443,9 @@
   /* C++11 noreturn attribute is equivalent to GNU's.  */
   if (is_attribute_p ("noreturn", attr_id))
TREE_PURPOSE (TREE_PURPOSE (attribute)) = get_identifier ("gnu");
+  /* C++14 deprecated attribute is equivalent to GNU's.  */
+  else if (cxx_dialect >= cxx1y && is_attribute_p ("deprecated", attr_id))
+   TREE_PURPOSE (TREE_PURPOSE (attribute)) = get_identifier ("gnu");
 }
 
   /* Now parse the optional argument clause of the attribute.  */
Index: testsuite/g++.dg/cpp1y/attr-deprecated.C
===
--- testsuite/g++.dg/cpp1y/attr-deprecated.C(revision 0)
+++ testsuite/g++.dg/cpp1y/attr-deprecated.C(working copy)
@@ -0,0 +1,59 @@
+// { dg-do compile }
+// { dg-options -std=c++1y }
+
+class [[deprecated]] A
+{
+};
+
+[[deprecated]]
+int
+foo(int n)
+{
+  return 42 + n;
+}
+
+class [[deprecated("B has been superceded by C")]] B
+{
+};
+
+[[deprecated("bar is unsafe; use foobar instead")]]
+int
+bar(int n)
+{
+  return 42 + n - 1;
+}
+
+#if __cplusplus > 201103L
+
+//  Deprecate C for C++14 onwards.
+class [[deprecated]] C;
+
+//  Deprecate foobar for C++14 onwards.
+[[deprecated]]
+int
+foobar(int n);
+
+#endif
+
+class C
+{
+};
+
+int
+foobar(int n)
+{
+  return 43 + n - 1;
+}
+
+int
+main()
+{
+  A aaa; // { dg-warning "is deprecated" }
+  int n = foo(12); // { dg-warning "is deprecated" }
+
+  B bbb; // { dg-warning "is deprecated" "B has been superceded by C" }
+  int m = bar(666); // { dg-warning "is deprecated" "bar is unsafe; use foobar instead" }
+
+  C ccc; // { dg-warning "is deprecated" }
+  int l = foobar(8); // { dg-warning "is deprecated" }
+}
Index: testsuite/g++.dg/cpp1y/attr-deprecated-neg.C
===
--- testsuite/g++.dg/cpp1y/attr-deprecated-neg.C(revision 0)
+++ testsuite/g++.dg/cpp1y/attr-deprecated-neg.C(working copy)
@@ -0,0 +1,59 @@
+// { dg-do compile }
+// { dg-options -std=c++11 }
+
+class [[deprecated]] A // { dg-warning "attribute directive ignored" }
+{
+};
+
+[[deprecated]]
+int
+foo(int n) // { dg-warning "attribute directive ignored" }
+{
+  return 42 + n;
+}
+
+class [[deprecated("B has been superceded by C")]] B // { dg-warning "attribute directive ignored" }
+{
+};
+
+[[deprecated("bar is unsafe; use foobar instead")]]
+int
+bar(int n) // { dg-warning "attribute directive ignored" }
+{
+  return 42 + n - 1;
+}
+
+#if __cplusplus > 201103L
+
+//  Deprecate C for C++14 onwards.
+class [[deprecated]] C;
+
+//  Deprecate foobar for C++14 onwards.
+[[deprecated]]
+int
+foobar(int n);
+
+#endif
+
+class C
+{
+};
+
+int
+foobar(int n)
+{
+  return 43 + n - 1;
+}
+
+int
+main()
+{
+  A aaa;
+  int n = foo(12);
+
+  B bbb;
+  int m = bar(666);
+
+  C ccc;
+  int l = foobar(8);
+}


Re: User-define literals for std::complex.

2013-10-22 Thread Jonathan Wakely
On 20 October 2013 22:18, Ed Smith-Rowland wrote:
> On 09/27/2013 05:39 AM, Jonathan Wakely wrote:
>>
>> On 27 September 2013 05:17, Ed Smith-Rowland wrote:
>>>
>>> The complex user-defined literals finally passed (n3779) with the
>>> resolution
>>> to DR1473 allowing the suffix id to touch the quotes (Can't find it but I
>>> put it in not too long ago).
>>
>> I think it's been approved by the LWG and looks like it will go to a
>> vote by the full committee, but let's wait for that to pass before
>> making any changes.
>>
>
> Now that this is in the working paper can we go ahead?

Yes, thanks for waiting.  The patch you posted is OK, although I think
"inline constexpr" on the new operators is redundant, as constexpr
functions are implicitly inline.
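
A small sketch of the point about redundancy, using a hypothetical toy type and suffix (`Imag` and `_toy_i` are made up here and are not the operators from the patch):

```cpp
#include <cassert>

// Hypothetical toy type standing in for std::complex<double>.
struct Imag {
  double value;
};

// No "inline" keyword: a constexpr function is implicitly inline, so this
// definition can live in a header included from several translation units.
// The suffix touches the quotes, which the DR1473 resolution permits.
constexpr Imag operator""_toy_i(long double v) {
  return Imag{static_cast<double>(v)};
}

static_assert((2.5_toy_i).value == 2.5, "usable in constant expressions");
```

Writing `inline constexpr` on such an operator is harmless, it just repeats what `constexpr` already implies for functions.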


Re: [C++ Patch, obvious?] PR 58816

2013-10-22 Thread Jason Merrill

OK, thanks.

Jason


[Doc patch, committed] Update contrib.texi

2013-10-22 Thread Paolo Carlini

Hi,

some due updates.

Thanks,
Paolo.

/
2013-10-22  Paolo Carlini  

* doc/contrib.texi ([Fran@,{c}ois Dumont], [Tim Shen],
[Ed Smith-Rowland]): New entries.
([Stephen M. Webb]): Update.
Index: doc/contrib.texi
===
--- doc/contrib.texi(revision 203915)
+++ doc/contrib.texi(working copy)
@@ -238,6 +238,10 @@ maintaining @code{complex<>}, sanity checking and
 architecture, libio maintenance, and early math work.
 
 @item
+Fran@,{c}ois Dumont for his work on libstdc++-v3, especially maintaining and
+improving @code{debug-mode} and associative and unordered containers.
+
+@item
 Zdenek Dvorak for a new loop unroller and various fixes.
 
 @item
@@ -838,6 +842,9 @@ Lars Segerlund for work on GNU Fortran.
 Dodji Seketeli for numerous C++ bug fixes and debug info improvements.
 
 @item
+Tim Shen for major work on @code{}.
+
+@item
 Joel Sherrill for his direction via the steering committee, RTEMS
 contributions and RTEMS testing.
 
@@ -873,6 +880,10 @@ Danny Smith for his major efforts on the Mingw (an
 Randy Smith finished the Sun FPA support.
 
 @item
+Ed Smith-Rowland for his continuous work on libstdc++-v3, special functions,
+@code{}, and various improvements to C++11 features.
+
+@item
 Scott Snyder for queue, iterator, istream, and string fixes and libstdc++
 testsuite entries.  Also for providing the patch to G77 to add
 rudimentary support for @code{INTEGER*1}, @code{INTEGER*2}, and
@@ -995,7 +1006,7 @@ Feng Wang for contributions to GNU Fortran.
 @item
 Stephen M. Webb for time and effort on making libstdc++ shadow files
 work with the tricky Solaris 8+ headers, and for pushing the build-time
-header tree.
+header tree. Also, for starting and driving the @code{} effort.
 
 @item
 John Wehle for various improvements for the x86 code generator,


RE: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-10-22 Thread Bernd Edlinger
Hi,

> On Tue, 8 Oct 2013 22:50:21, Eric Botcazou wrote:
>>
>>> I agree, that assigning a non-BLKmode to structures with zero-sized arrays
>>> should be considered a bug.
>>
>> Fine, then let's apply Martin's patch, on mainline at least.
>>
>
> That would definitely be a good move. Maybe someone should approve it?
>
>> But this testcase is invalid on STRICT_ALIGNMENT platforms: xx is a pointer
>> to a type with 4-byte alignment so its value must be a multiple of 4.
>
> Then you probably win. But I still have some doubts.
>
> I had to use this silly alignment/pack(4) to circumvent this statement
> in compute_record_mode:
>
>   /* If structure's known alignment is less than what the scalar
>  mode would need, and it matters, then stick with BLKmode.  */
>   if (TYPE_MODE (type) != BLKmode
>   && STRICT_ALIGNMENT
>   && ! (TYPE_ALIGN (type)>= BIGGEST_ALIGNMENT
>         || TYPE_ALIGN (type)>= GET_MODE_ALIGNMENT (TYPE_MODE (type))))
> {
>   /* If this is the only reason this type is BLKmode, then
>  don't force containing types to be BLKmode.  */
>   TYPE_NO_FORCE_BLK (type) = 1;
>   SET_TYPE_MODE (type, BLKmode);
> }
>
> But there are at least two targets where STRICT_ALIGNMENT = 0
> and SLOW_UNALIGNED_ACCESS != 0: rs6000 and alpha.
>
> This example with a byte-aligned structure will on one of these targets
> likely execute this code path in  expand_expr_real_1/case MEM_REF:
>
> else if (SLOW_UNALIGNED_ACCESS (mode, align))
>   temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode),
> 0, TYPE_UNSIGNED (TREE_TYPE (exp)),
> (modifier == EXPAND_STACK_PARM
>  ? NULL_RTX : target),
> mode, mode);
>
> This looks wrong, but unfortunately I cannot test on these targets...
>

Hmm, well,

the condition that would be necessary to execute that code path
would be

STRICT_ALIGNMENT = 0
and SLOW_UNALIGNED_ACCESS != 0 for any integer mode.

The only target that comes close to hitting this "bug" is rs6000:

#define STRICT_ALIGNMENT 0

#define SLOW_UNALIGNED_ACCESS(MODE, ALIGN)                              \
  (STRICT_ALIGNMENT                                                     \
   || (((MODE) == SFmode || (MODE) == DFmode || (MODE) == TFmode        \
        || (MODE) == SDmode || (MODE) == DDmode || (MODE) == TDmode)    \
       && (ALIGN) < 32)                                                 \
   || (VECTOR_MODE_P ((MODE)) && (((int)(ALIGN)) < VECTOR_ALIGN (MODE))))

but, luckily this is 0 for all integer modes.
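
As a rough check of that reading, the macro's shape can be mirrored in C++. This is a sketch under assumptions: the `Mode` enum covers only a few modes, and the 128-bit `vector_align` value is a placeholder, not the real VECTOR_ALIGN.

```cpp
#include <cassert>

// Hypothetical stand-ins for a few machine modes; not GCC's real enum.
enum class Mode { QI, HI, SI, DI, SF, DF, V4SI };

constexpr bool strict_alignment = false;  // rs6000: STRICT_ALIGNMENT is 0

constexpr bool is_scalar_float(Mode m) {
  return m == Mode::SF || m == Mode::DF;
}
constexpr bool is_vector(Mode m) { return m == Mode::V4SI; }
// Assumption: vectors want 128-bit alignment in this toy model.
constexpr int vector_align(Mode) { return 128; }

// Mirrors the shape of the rs6000 SLOW_UNALIGNED_ACCESS macro quoted above.
constexpr bool slow_unaligned_access(Mode m, int align_bits) {
  return strict_alignment
         || (is_scalar_float(m) && align_bits < 32)
         || (is_vector(m) && align_bits < vector_align(m));
}
```

With `strict_alignment` false, none of the integer modes can satisfy either remaining clause, whatever the alignment.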

So I am now convinced, there won't be any valid example with unions that
executes this code path.

Therefore I updated Martin's previous patch, to meet Eric's request:
That is to only handle zero-sized arrays at the end of the structure.

Boot-strapped and regression-tested on x86_64-linux-gnu.

Ok for trunk?

Regards
Bernd.

2013-10-22  Martin Jambor
            Bernd Edlinger

PR middle-end/57748
* stor-layout.c (compute_record_mode): Treat trailing zero-sized array
fields like incomplete types.

testsuite:
2013-10-22  Bernd Edlinger  

PR middle-end/57748
* gcc.dg/torture/pr57748-3.c: New test.



patch-pr57748-2.diff
Description: Binary data


[PATCH, PR58805] Add missing check in stmt_local_def for tail-merge

2013-10-22 Thread Tom de Vries
Richard,

This patch adds a missing check for gimple_vdef in stmt_local_def for the
tail-merge pass.

Bootstrapped and reg-tested on x86_64.

OK for trunk, gcc-4_8-branch?

Thanks,
- Tom

2013-10-22  Tom de Vries  

PR tree-optimization/58805
* tree-ssa-tail-merge.c (stmt_local_def): Add gimple_vdef check.

* gcc.dg/pr58805.c: New test.
diff --git a/gcc/testsuite/gcc.dg/pr58805.c b/gcc/testsuite/gcc.dg/pr58805.c
new file mode 100644
index 000..6e6eba5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr58805.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-tail-merge -fdump-tree-pre" } */
+
+static inline void bar(unsigned long *r)
+{
+  unsigned long t;
+  __asm__ (
+"movq $42, %[t]\n\t"
+"movq %[t], %[r]\n\t"
+: [t] "=&r" (t), [r] "=r" (*r)
+  );
+}
+
+void foo(int n, unsigned long *x, unsigned long *y)
+{
+  if (n == 0)
+bar(x);
+  else
+bar(y);
+}
+
+/* { dg-final { scan-tree-dump-times "__asm__" 2 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index 9094935..0090da6 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -300,7 +300,8 @@ stmt_local_def (gimple stmt)
   tree val;
   def_operand_p def_p;
 
-  if (gimple_has_side_effects (stmt))
+  if (gimple_has_side_effects (stmt)
+  || gimple_vdef (stmt) != NULL_TREE)
 return false;
 
   def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);


RE: [PATCH 1/n] Add conditional compare support

2013-10-22 Thread Zhenqiang Chen
Hi,

The patch is updated according to the comments. Changes include:
* Rewrite the code according to Richard Biener's comments.
* Change the algorithm to recursively combine compares, so it can handle any
number of compares.
* Add a set of instruction patterns in the ARM backend to match the conditional
compares.

With this patch, the conditional compare sequence is like

 CC1 = CCMP (CMP (a, b), CMP (c, d));
 CC2 = CCMP (NE (CC1, 0), CMP (e, f));
 ...
 CCn/reg = CCMP (NE (CCn-1, 0), CMP (...));
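
The chaining above can be modelled in a few lines of C++. This is a simplified sketch, not the expander from the patch: `ccmp`, `chain_ccmp`, and `Combine` are hypothetical names, compare results are plain booleans rather than condition-code registers, and one combining operator is assumed for the whole chain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Combine { And, Ior };

// Toy CCMP: combine a previous condition with a new compare result.
inline bool ccmp(bool prev, bool cmp, Combine op) {
  return op == Combine::And ? (prev && cmp) : (prev || cmp);
}

// Recursively folds the first n compares the way the sequence chains them:
// the result for n compares is CCMP (result for n-1 compares, compare n).
inline bool chain_ccmp(const std::vector<bool> &compares, std::size_t n,
                       Combine op) {
  assert(n >= 1 && n <= compares.size());
  if (n == 1)
    return compares[0];
  return ccmp(chain_ccmp(compares, n - 1, op), compares[n - 1], op);
}
```

Because each step only needs the previous condition and one new compare, the recursion handles any number of compares.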

To test more than two compares, you need the patch discussed in the thread.
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg63743.html

Bootstrapped with no make check regressions in both ARM and Thumb modes on an
ARM Chromebook.

ChangeLog:
2013-10-22  Zhenqiang Chen  

* config/arm/arm.c (arm_fixed_condition_code_regs, arm_ccmode_to_code,
arm_select_dominance_ccmp_mode): New functions.
(arm_select_dominance_cc_mode_1): New function extracted from
arm_select_dominance_cc_mode.
(arm_select_dominance_cc_mode): Call arm_select_dominance_cc_mode_1.
* config/arm/arm.md (ccmp, cbranchcc4, ccmp_and, ccmp_ior,
ccmp_ior_scc_scc, ccmp_ior_scc_scc_cmp, ccmp_and_scc_scc,
ccmp_and_scc_scc_cmp): New.
* config/arm/arm-protos.h (arm_select_dominance_ccmp_mode): New.
* expr.c (ccmp_candidate_p, used_in_cond_stmt_p, expand_ccmp_expr_2,
expand_ccmp_expr_3, expand_ccmp_expr_1, expand_ccmp_expr): New.
(expand_expr_real_1): Handle ccmp.
* optabs.c: Include gimple.h.
(expand_ccmp_op): New.
(get_rtx_code): Handle BIT_AND_EXPR and BIT_IOR_EXPR.
* optabs.def (ccmp): New.
* optabs.h (expand_ccmp_op): New.
* doc/md.texi (ccmp): New index.

Thanks!
-Zhenqiang

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Zhenqiang Chen
> Sent: Monday, September 23, 2013 2:50 PM
> To: Richard Earnshaw
> Cc: 'Richard Biener'; GCC Patches
> Subject: RE: [PATCH 1/n] Add conditional compare support
> 
> 
> 
> > -Original Message-
> > From: Richard Earnshaw
> > Sent: Thursday, September 19, 2013 5:13 PM
> > To: Zhenqiang Chen
> > Cc: 'Richard Biener'; GCC Patches
> > Subject: Re: [PATCH 1/n] Add conditional compare support
> >
> > On 18/09/13 10:45, Zhenqiang Chen wrote:
> > >
> > >> -Original Message-
> > >> From: Richard Biener [mailto:richard.guent...@gmail.com]
> > >> Sent: Tuesday, August 27, 2013 8:18 PM
> > >> To: Richard Earnshaw
> > >> Cc: Zhenqiang Chen; GCC Patches
> > >> Subject: Re: [PATCH 1/n] Add conditional compare support
> > >>
> > >> On Tue, Aug 27, 2013 at 1:56 PM, Richard Earnshaw wrote:
> > >>> On 27/08/13 12:10, Richard Biener wrote:
> > >>>> What's this for and what's the desired semantics? I don't like
> > >>>> having extra tree codes for this.  Is this for a specific
> > >>>> instruction set feature?
> > >>>
> > >>> The background is to support the conditional compare instructions
> > >>> in ARM (more effectively) and AArch64 at all.
> > >>>
> > >>> The current method used in ARM is to expand into a series of
> > >>> store-flag instructions and then hope that combine can optimize
> > >>> them away (though that fails far too often, particularly when the
> > >>> first instruction in the sequence is combined into another
> > >>> pattern).  To make it work at all the compiler has to lie about
> > >>> the costs of various store-flag type operations which overall
> > >>> risks producing worse code and means we also have to support many
> > >>> more complex multi-instruction patterns than is desirable.  I
> > >>> really don't want to go down the same route for AArch64.
> > >>>
> > >>> The idea behind all this is to capture potential conditional
> > >>> compare operations early enough in the mid end that we can keep
> > >>> track of them until RTL expand time and then to emit the correct
> > >>> logic on all targets depending on what is the best thing for that
> > >>> target.  The current method of lowering into store-flag sequences
> > >>> doesn't really cut it.
> > >>
> > >> It seems to me that then the initial instruction selection process
> > >> (aka RTL expansion) needs to be improved.  As we are expanding with
> > >> having the CFG around it should be easy enough to detect AND/ORIF
> > >> cases and do better here.  Yeah, I suppose this asks to turn existing
> > >> jump expansion optimizations up-side-down to optimize with the
> > >> GIMPLE CFG in mind.
> > >>
> > >> The current way of LOGICAL_OP_NON_SHORT_CIRCUIT is certainly bogus -
> > >> fold-const.c is way too early to decide this.  Similar to the
> > >> ongoing work of expanding / building-up switch expressions in a
> > >> GIMPLE pass, moving expand complexity up the pipeline this asks for
> > >> a GIMPLE phase that moves this decision down closer to RTL expansion.
> > >

Re: [PATCH] Generate fused widening multiply-and-accumulate operations only when the widening multiply has single use

2013-10-22 Thread Andrew Stubbs

On 21/10/13 23:01, Yufeng Zhang wrote:

Hi,

This patch changes the widening_mul pass to fuse the widening multiply
with accumulate only when the multiply has single use.  The widening_mul
pass currently does the conversion regardless of the number of the uses,
which can cause poor code-gen in cases like the following:


This seems reasonable to me, not that I have authority to approve it.

You should probably create at least one test case, ideally for both 
AArch32 and AArch64. There are already AArch32 testcases in gcc.target 
for the widening multiplies cases, but as they are single use only, you 
shouldn't need to adjust them.


Andrew


[C++ Patch, obvious?] PR 58816

2013-10-22 Thread Paolo Carlini

Hi,

this issue, where we should be using get_attribute_name, which works 
both with GNU and C++11 attributes, instead of TREE_PURPOSE, shows up as 
many unexpected fails of the cpp0x/gen-attrs* tests on AIX. David tested 
the patchlet on AIX and I double checked it on x86_64-linux. I think the 
change qualifies as obvious and I will commit it later today if nobody 
objects.


Thanks!
Paolo.


2013-10-22  Paolo Carlini  

PR c++/58816
* pt.c (apply_late_template_attributes): Use get_attribute_name,
not TREE_PURPOSE.
Index: pt.c
===
--- pt.c(revision 203915)
+++ pt.c(working copy)
@@ -8610,7 +8610,7 @@ apply_late_template_attributes (tree *decl_p, tree
 pass it through tsubst.  Attributes like mode, format,
 cleanup and several target specific attributes expect it
 unmodified.  */
- else if (attribute_takes_identifier_p (TREE_PURPOSE (t)))
+ else if (attribute_takes_identifier_p (get_attribute_name (t)))
{
  tree chain
= tsubst_expr (TREE_CHAIN (TREE_VALUE (t)), args, complain,


[PATCH][buildrobot] tilepro/tilegx: fallout after tree.h refactoring (was: Re-factor inclusion of tree.h)

2013-10-22 Thread Jan-Benedict Glaw
On Mon, 2013-10-21 15:36:49 -0400, Diego Novillo  wrote:
> Can anyone think of some way that we can use to automatically block
> inclusions of tree.h from header files? Code review is the only way
> that comes to mind.

Grep once, then install a commit hook.

> Committed both patches to trunk.

I get some fallout for tilepro-linux, see
http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=21851 .

This fixes it:

2013-10-22  Jan-Benedict Glaw  

* config/tilepro/tilepro.c: Include "tree.h".

diff --git a/gcc/config/tilepro/tilepro.c b/gcc/config/tilepro/tilepro.c
index 99ce5a0..12adda3 100644
--- a/gcc/config/tilepro/tilepro.c
+++ b/gcc/config/tilepro/tilepro.c
@@ -40,6 +40,7 @@
 #include "function.h"
 #include "dwarf2.h"
 #include "timevar.h"
+#include "tree.h"
 #include "gimple.h"
 #include "cfgloop.h"
 #include "tilepro-builtins.h"

Ok?

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of:  "really soon now": an unspecified period of time, likely to
the second  :  be greater than any reasonable definition
               of "soon".




RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2

2013-10-22 Thread Bernd Edlinger
Well,

one more point where the current patch is probably wrong:

the AAPCS states that for volatile bit-field access:

"For a write operation the read must always occur even if the entire contents 
of the container will be replaced"

that means 
struct s
{
  volatile int a:32;
} ss;

ss.a=1; //needs to read the value exactly once and write the new value.

currently we just store.
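
A sketch of the access pattern that sentence asks for, written as a C++ model rather than as what GCC emits. `Container`, `load`, `store`, and `aapcs_bitfield_write` are hypothetical names, and the read/write counters exist only to make the accesses visible.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the 32-bit container holding "volatile int a:32".
struct Container {
  std::uint32_t bits = 0;
  int reads = 0;
  int writes = 0;

  std::uint32_t load() { ++reads; return bits; }
  void store(std::uint32_t v) { ++writes; bits = v; }
};

// Writes a bit-field of the given width at the given bit offset, always
// loading the container first, as the AAPCS sentence quoted above requires.
inline void aapcs_bitfield_write(Container &c, unsigned offset,
                                 unsigned width, std::uint32_t value) {
  std::uint32_t mask =
      (width >= 32 ? 0xFFFFFFFFu : ((1u << width) - 1u)) << offset;
  std::uint32_t old = c.load();  // one read, even if the mask covers all bits
  c.store((old & ~mask) | ((value << offset) & mask));
}
```

For the `ss.a=1` case the mask covers the whole container, so the loaded value contributes nothing to the result, yet the load still has to happen exactly once before the single store.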

Bernd.