date:20120718

Re: [patch][rfc] Clean up CFG dumping

2012-07-18 Thread Tobias Burnus


Steven,

I think your patch broke bootstrapping with Graphite enabled.

Tobias

PS: Possible patch, I haven't checked whether "0" makes sense or 
something else should be used.


--- a/gcc/graphite-poly.c
+++ b/gcc/graphite-poly.c
@@ -675,3 +675,3 @@ print_pbb_body (FILE *file, poly_bb_p pbb, int verbosity,

   fprintf (file, "{\n");
-  dump_bb (pbb_bb (pbb), file, 0);
+  dump_bb (file, pbb_bb (pbb), 0, 0);
   fprintf (file, "}\n");



/projects/tob/gcc-git/gcc/gcc/graphite-poly.c: In function 'print_pbb_body':
/projects/tob/gcc-git/gcc/gcc/graphite-poly.c:676:3: warning: passing 
argument 1 of 'dump_bb' from incompatible pointer type [enabled by default]

   dump_bb (pbb_bb (pbb), file, 0);
   ^
In file included from /projects/tob/gcc-git/gcc/gcc/basic-block.h:832:0,
 from /projects/tob/gcc-git/gcc/gcc/tree-flow.h:27,
 from /projects/tob/gcc-git/gcc/gcc/graphite-poly.c:38:
/projects/tob/gcc-git/gcc/gcc/cfghooks.h:144:13: note: expected 'struct 
FILE *' but argument is of type 'basic_block'

 extern void dump_bb (FILE *, basic_block, int, int);
 ^
/projects/tob/gcc-git/gcc/gcc/graphite-poly.c:676:3: warning: passing 
argument 2 of 'dump_bb' from incompatible pointer type [enabled by default]

   dump_bb (pbb_bb (pbb), file, 0);
   ^
In file included from /projects/tob/gcc-git/gcc/gcc/basic-block.h:832:0,
 from /projects/tob/gcc-git/gcc/gcc/tree-flow.h:27,
 from /projects/tob/gcc-git/gcc/gcc/graphite-poly.c:38:
/projects/tob/gcc-git/gcc/gcc/cfghooks.h:144:13: note: expected 
'basic_block' but argument is of type 'struct FILE *'

 extern void dump_bb (FILE *, basic_block, int, int);
 ^
/projects/tob/gcc-git/gcc/gcc/graphite-poly.c:676:3: error: too few 
arguments to function 'dump_bb'

   dump_bb (pbb_bb (pbb), file, 0);
   ^
In file included from /projects/tob/gcc-git/gcc/gcc/basic-block.h:832:0,
 from /projects/tob/gcc-git/gcc/gcc/tree-flow.h:27,
 from /projects/tob/gcc-git/gcc/gcc/graphite-poly.c:38:
/projects/tob/gcc-git/gcc/gcc/cfghooks.h:144:13: note: declared here
 extern void dump_bb (FILE *, basic_block, int, int);
 ^
make[3]: *** [graphite-poly.o] Error 1
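
The diagnostics above come down to argument order: the new prototype takes the FILE * first, then the basic block, then two ints. A minimal sketch of the corrected call shape follows. It uses a mock dump_bb rather than GCC's own; the basic_block stand-in, the parameter names "indent" and "flags", and print_body_sketch are all assumptions made for illustration, since the quoted prototype only gives the types.

```c
#include <stdio.h>

/* Hypothetical stand-in for GCC's basic_block; only an index is kept.  */
typedef struct basic_block_def { int index; } *basic_block;

/* Mock with the prototype quoted from cfghooks.h: FILE * first, then the
   block, then two ints.  The names "indent" and "flags" are assumptions.  */
void
dump_bb (FILE *outf, basic_block bb, int indent, int flags)
{
  (void) flags;   /* Unused in this mock.  */
  fprintf (outf, "%*s<bb %d>\n", indent, "", bb->index);
}

/* Mirrors the corrected call in print_pbb_body: stream first, block
   second, and 0 for both trailing ints.  */
void
print_body_sketch (FILE *file, basic_block bb)
{
  fprintf (file, "{\n");
  dump_bb (file, bb, 0, 0);
  fprintf (file, "}\n");
}
```

Calling print_body_sketch (stdout, &some_block) prints the block between braces, which is the shape print_pbb_body produces once the call compiles again.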

Commit: ARM: Document -munaligned-access

2012-07-18 Thread Nick Clifton

Hi Guys,

  I am checking in this patch to the mainline to document the ARM port's
  -munaligned-access command line option.

Cheers
  Nick

gcc/ChangeLog
2012-07-18  Nick Clifton  

* doc/invoke.texi (ARM Options): Document -munaligned-access.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 189603)
+++ gcc/doc/invoke.texi (working copy)
@@ -497,7 +497,8 @@
 -mcaller-super-interworking  -mcallee-super-interworking @gol
 -mtp=@var{name} -mtls-dialect=@var{dialect} @gol
 -mword-relocations @gol
--mfix-cortex-m3-ldrd}
+-mfix-cortex-m3-ldrd @gol
+-munaligned-access}
 
 @emph{AVR Options}
 @gccoptlist{-mmcu=@var{mcu} -maccumulate-args -mbranch-cost=@var{cost} @gol
@@ -11049,6 +11050,23 @@
 generating these instructions.  This option is enabled by default when
 @option{-mcpu=cortex-m3} is specified.
 
+@item -munaligned-access
+@itemx -mno-unaligned-access
+@opindex munaligned-access
+@opindex mno-unaligned-access
+Enables (or disables) reading and writing of 16- and 32- bit values
+from addresses that are not 16- or 32- bit aligned.  By default
+unaligned access is disabled for all pre-ARMv6 and all ARMv6-M
+architectures, and enabled for all other architectures.  If unaligned
+access is not enabled then words in packed data structures will be
+accessed a byte at a time.
+
+The ARM attribute @code{Tag_CPU_unaligned_access} will be set in the
+generated object file to either true or false, depending upon the
+setting of this option.  If unaligned access is enabled then the
+preprocessor symbol @code{__ARM_FEATURE_UNALIGNED} will also be
+defined.
+
 @end table
 
 @node AVR Options
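
As an illustration of the documented behaviour, here is a small sketch of a packed structure whose 32-bit member sits at an odd offset. The struct layout and function names are hypothetical; only __ARM_FEATURE_UNALIGNED comes from the text above. On an ARM target, the compiler would be free to use a single word load for read_value when unaligned access is enabled, and would fall back to byte accesses otherwise.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: the 32-bit member lands at offset 1, so it is
   never naturally aligned.  */
struct __attribute__ ((packed)) record
{
  uint8_t tag;
  uint32_t value;
};

/* Reading the member through the packed type lets the compiler pick an
   access sequence that is safe for the selected option.  */
uint32_t
read_value (const struct record *r)
{
  return r->value;
}

/* Reports which way the preprocessor symbol was set for this build.  */
const char *
unaligned_access_note (void)
{
#ifdef __ARM_FEATURE_UNALIGNED
  return "unaligned access enabled";
#else
  return "unaligned access not enabled (or not an ARM target)";
#endif
}
```

The source code is the same in both configurations; only the generated access sequence differs, which is why the preprocessor symbol is mainly useful for hand-written fast paths.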

Re: [SH] Add test case for PR 38621

2012-07-18 Thread Oleg Endo

On Tue, 2012-07-17 at 15:26 -0700, Mike Stump wrote:
> On Jul 17, 2012, at 1:06 PM, Oleg Endo wrote:
> > The attached patch adds the test case from comment #3 of PR 38621 to the
> > test suite.
> > 
> > Tested with
> > make check-gcc RUNTESTFLAGS="compile.exp=pr38621.c --target_board=sh-sim
> > \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb,-m4-single/-ml,
> > -m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}"
> > 
> > OK?
> 
> Gosh, the code looks so, portable.  :-)
> 
> Ok.  I'd say nix the blank line at the end of the file.

Committed without the blank line as rev 189605.

Cheers,
Oleg

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread Richard Guenther

On Tue, 17 Jul 2012, William J. Schmidt wrote:

> I overlooked adding a pass-control flag for strength reduction, added
> here.  I named it -ftree-slsr for consistency with other -ftree- flags,
> but could change it to -fgimple-slsr if you prefer that for a pass named
> gimple-ssa-...
> 
> Bootstrapped and tested on powerpc-unknown-linux-gnu with no new
> regressions.  Ok for trunk?

The switch needs documentation in doc/invoke.texi.  Other than that
it's fine to stick with -ftree-..., even that exposes details to our
users that are not necessary (RTL passes didn't have -frtl-... either).
So in the end, why not re-use -fstrength-reduce that is already available
(but stubbed out)?

Comments from other folks?

Thanks,
Richard.

> Thanks,
> Bill
> 
> 
> 2012-07-17  Bill Schmidt  
> 
>   * opts.c (default_option): Make -ftree-slsr default at -O1 and above.
>   * gimple-ssa-strength-reduction.c (gate_strength_reduction): Use
>   flag_tree_slsr.
>   * common.opt: Add -ftree-slsr with flag_tree_slsr.
> 
> 
> Index: gcc/opts.c
> ===
> --- gcc/opts.c(revision 189574)
> +++ gcc/opts.c(working copy)
> @@ -452,6 +452,7 @@ static const struct default_options default_option
>  { OPT_LEVELS_1_PLUS, OPT_ftree_ch, NULL, 1 },
>  { OPT_LEVELS_1_PLUS, OPT_fcombine_stack_adjustments, NULL, 1 },
>  { OPT_LEVELS_1_PLUS, OPT_fcompare_elim, NULL, 1 },
> +{ OPT_LEVELS_1_PLUS, OPT_ftree_slsr, NULL, 1 },
>  
>  /* -O2 optimizations.  */
>  { OPT_LEVELS_2_PLUS, OPT_finline_small_functions, NULL, 1 },
> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c   (revision 189574)
> +++ gcc/gimple-ssa-strength-reduction.c   (working copy)
> @@ -1501,7 +1501,7 @@ execute_strength_reduction (void)
>  static bool
>  gate_strength_reduction (void)
>  {
> -  return optimize > 0;
> +  return flag_tree_slsr;
>  }
>  
>  struct gimple_opt_pass pass_strength_reduction =
> Index: gcc/common.opt
> ===
> --- gcc/common.opt(revision 189574)
> +++ gcc/common.opt(working copy)
> @@ -2080,6 +2080,10 @@ ftree-sink
>  Common Report Var(flag_tree_sink) Optimization
>  Enable SSA code sinking on trees
>  
> +ftree-slsr
> +Common Report Var(flag_tree_slsr) Optimization
> +Perform straight-line strength reduction
> +
>  ftree-sra
>  Common Report Var(flag_tree_sra) Optimization
>  Perform scalar replacement of aggregates
> 
> 
> 
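
The effect of the quoted patch can be sketched with a toy model: the flag gets a default from the optimization level unless the user set it, and the pass gate then tests only the flag. The struct and the apply_defaults helper below are assumptions for illustration and are not GCC's option machinery; only the names flag_tree_slsr and gate_strength_reduction come from the patch.

```c
#include <stdbool.h>

/* Toy option state.  flag_tree_slsr is -1 while unset, 0 after
   -fno-tree-slsr, and 1 after -ftree-slsr.  */
struct toy_options
{
  int optimize;
  int flag_tree_slsr;
};

/* Stand-in for the OPT_LEVELS_1_PLUS table entry: default the flag to on
   at -O1 and above, but never override an explicit user setting.  */
void
apply_defaults (struct toy_options *opts)
{
  if (opts->flag_tree_slsr < 0)
    opts->flag_tree_slsr = opts->optimize >= 1;
}

/* After the patch the gate consults the flag, not the optimize level.  */
bool
gate_strength_reduction (const struct toy_options *opts)
{
  return opts->flag_tree_slsr != 0;
}
```

With this split, -O2 -fno-tree-slsr disables the pass and -O0 -ftree-slsr requests it, neither of which the old "optimize > 0" gate could express.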

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

Re: [PATCH 1/2] gcc symbol database

2012-07-18 Thread Dodji Seketeli

Yunfeng ZHANG  writes:

>> It took me a couple of minutes to understand what you meant here, so
>> please let me re-phrase to make sure I got it.
>>
>> You are saying that the callback function of the cb_lex_token event is
>> set by the callback function of the macro_start_expand event.
>>
>> Is that correct?
>
> Yes.

Thank you for making this clear.

>> And this makes me wonder why you'd need the second parameter of
>> macro_start_expand (the token).  I believe you should have all the
>> information you need with the first, second, and last parameter.  As I
>> said in my previous email, you can get the file offset of the token in
>> your client code by doing 'file_offset = line + column'.  So that token
>> should not be needed.  Thus, calling macro_start_expand from inside
>> funlike_invocation_p once you are sure the expansion of the macro is
>> going to take place, is possible.
>
> The only thing is the file-offset or source_location of the macro
> leader token.

To try to avoid confusion, I think what you call "source_location of the
macro leader token" is actually the spelling location of the macro.

> I don't know how to get it when macro_start_expand is called in
> funlike_invocation_p intern.

[...]

> BTW, I can change my plugin to use line/column instead of file offset; there
> isn't a design limitation, only time.

I think you can add a new source_location parameter to
funlike_invocation_p function, and pass it the result->src_loc that you
need.

-- 
Dodji

Re: [patch][rfc] Clean up CFG dumping

2012-07-18 Thread Steven Bosscher

On Wed, Jul 18, 2012 at 2:24 AM, H.J. Lu  wrote:
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54008

Yes, they failed in my testing, too. I must have been blind to overlook them...

I received some comments in private about the "new look" of the dumps,
that I'll be addressing with a patch later today. I'll fix these two
test cases with that patch also.

Ciao!
Steven

Re: [patch][rfc] Clean up CFG dumping

2012-07-18 Thread Steven Bosscher

On Wed, Jul 18, 2012 at 9:00 AM, Tobias Burnus  wrote:
> Steven,
>
> I think your patch broke bootstrapping with Graphite enabled.

Yes it did. That's twice in one week, because Graphite isn't enabled
for builds on the compile farm. I'll see if I can install the
necessary libraries on the machines I use to avoid breaking Graphite
in the future.

> Tobias
>
> PS: Possible patch, I haven't checked whether "0" makes sense or something
> else should be used.

I thought this morning that 0 is fine, and I've committed the same
patch as yours a few minutes ago.

Ciao!
Steven

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread Steven Bosscher

On Wed, Jul 18, 2012 at 9:59 AM, Richard Guenther  wrote:
> On Tue, 17 Jul 2012, William J. Schmidt wrote:
>
>> I overlooked adding a pass-control flag for strength reduction, added
>> here.  I named it -ftree-slsr for consistency with other -ftree- flags,
>> but could change it to -fgimple-slsr if you prefer that for a pass named
>> gimple-ssa-...
>>
>> Bootstrapped and tested on powerpc-unknown-linux-gnu with no new
>> regressions.  Ok for trunk?
>
> The switch needs documentation in doc/invoke.texi.  Other than that
> it's fine to stick with -ftree-..., even that exposes details to our
> users that are not necessary (RTL passes didn't have -frtl-... either).
> So in the end, why not re-use -fstrength-reduce that is already available
> (but stubbed out)?

In the past, -fstrength-reduce applied to loop strength reduction in
loop.c. I don't think it should be re-used for a completely different
code transformation.

Ciao!
Steven

Re: [ARM Patch 1/3]PR53189: optimizations of 64bit logic operation with constant

2012-07-18 Thread Carrot Wei

On Tue, Jul 17, 2012 at 9:47 PM, Ramana Radhakrishnan
 wrote:
> Carrot,
>
> Sorry about the delayed response.
>
> On 3 July 2012 12:28, Carrot Wei  wrote:
>> On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan
>>  wrote:
>>> On 28 May 2012 11:08, Carrot Wei  wrote:
>>>> Hi
>>>>
>>>> This is the second part of the patches that deals with 64bit and. It
>>>> directly
>>>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
>>>> constant operands.
>>>>
>>>
>>> Comments about const_di_ok_for_op still apply from my review of your add 
>>> patch.
>>>
>>> However I don't see why and /ior / xor with constants that have either
>>> the low or high parts set can't be expanded directly into ands of
>>> subregs with moves of zero's or the original value especially if you
>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>> used for 64 bit arithmetic it gets more interesting.
>>>
>>> Finally this should target PR target/53189.
>>>
>>
>> Hi Ramana
>>
>> Thanks for the review. Following is the updated patch according to
>> your comments.
>
> You've missed answering this part of my review :)
>
>>> However I don't see why and /ior / xor with constants that have either
>>> the low or high parts set can't be expanded directly into ands of
>>> subregs with moves of zero's or the original value especially if you
>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>> used for 64 bit arithmetic it gets more interesting.
>
It has been handled by the const_ok_for_dimode_op and the output part
of corresponding SI mode insn. Let's take the IOR case as an example.

In the const_ok_for_dimode_op patch

--- arm.c   (revision 189278)
+++ arm.c   (working copy)
@@ -2524,6 +2524,16 @@
 case PLUS:
   return arm_not_operand (hi, SImode) && arm_add_operand (lo, SImode);

+case IOR:
+  if ((const_ok_for_arm (hi_val) || (hi_val == 0x))
+ && (const_ok_for_arm (lo_val) || (lo_val == 0x)))
+   return 1;
+  if (TARGET_THUMB2
+ && (const_ok_for_arm (lo_val) || const_ok_for_arm (~lo_val))
+ && (const_ok_for_arm (hi_val) || const_ok_for_arm (~hi_val)))
+   return 1;
+  return 0;
+
 default:
   return 0;
 }

The 0x is not valid arm mode immediate, but ior 0X
results in all 1's, so it is also allowed in an iordi3 insn. And the
patch in iorsi3_insn pattern explicitly check the all 0's and all 1's
cases, and output either a simple register mov instruction or
instruction mov all 1's to the destination.

@@ -2902,10 +2915,29 @@
(ior:SI (match_operand:SI 1 "s_register_operand" "%r,r,r")
(match_operand:SI 2 "reg_or_int_operand" "rI,K,?n")))]
   "TARGET_32BIT"
-  "@
-   orr%?\\t%0, %1, %2
-   orn%?\\t%0, %1, #%B2
-   #"
+  "*
+  {
+if (CONST_INT_P (operands[2]))
+  {
+   HOST_WIDE_INT i = INTVAL (operands[2]) & 0x;
+   if (i == 0x)
+ return \"mvn%?\\t%0, #0\";
+   if (i == 0)
+ {
+   if (!rtx_equal_p (operands[0], operands[1]))
+ return \"mov%?\\t%0, %1\";
+   else
+ return \"\";
+ }
+  }
+
+switch (which_alternative)
+  {
+  case 0: return \"orr%?\\t%0, %1, %2\";
+  case 1: return \"orn%?\\t%0, %1, #%B2\";
+  case 2: return \"#\";
+  }
+  }"
   "TARGET_32BIT
&& GET_CODE (operands[2]) == CONST_INT
&& !(const_ok_for_arm (INTVAL (operands[2]))


> Is there any reason why we don't split such cases earlier into the
> constituent moves and the associated ands earlier than reload in the
> non-Neon case?
>
I followed the arm_adddi3 pattern, which is split after reload_completed.
The arm_subdi3 pattern is not split at all. I guess that keeping the
original constant longer may benefit some optimizations involving
constants, but it may also lose flexibility in other cases.

>  In addition, it would be good to have some tests for Thumb2 that deal
> with the replicated constants for Thumb2 . Can you have a look at
> creating some tests similar to the thumb2*replicated*.c tests in
> gcc.target/arm but for 64 bit constants ?
>

The new test cases involving Thumb2 replicated constants are added as follows.

thanks
Carrot



2012-07-18  Wei Guozhi  

PR target/53189
* gcc.target/arm/pr53189-10.c: New testcase.
* gcc.target/arm/pr53189-11.c: New testcase.
* gcc.target/arm/pr53189-12.c: New testcase.



Index: pr53189-10.c
===
--- pr53189-10.c(revision 0)
+++ pr53189-10.c(revision 0)
@@ -0,0 +1,9 @@
+/* { dg-options "-mthumb -O2" }  */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { scan-assembler-not "mov" } } */
+/* { dg-final { scan-assembler "and.*#-16843010" } } */
+
+void t0p(long long * p)
+{
+  *p &= 0x9fefefefe;
+}


Index: pr53189-11.c
===
--- pr53

Re: [PATCH] Re-work get_object_alignment (again)

2012-07-18 Thread Eric Botcazou

> Now, back to PR53970, where #pragma pack() is used to pack a
> struct.  With #pragma pack() no part of the type or field-decls
> have a hint that packing took place (well, their align information
> tell you), which means the vectorizers use of contains_packed_reference
> is not conservative enough, nor is expand_exprs use:
> 
> case BIT_FIELD_REF:
> case ARRAY_RANGE_REF:
> normal_inner_ref:
>   {
> ...
> if (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (exp, 0)))
>     || (TREE_CODE (TREE_OPERAND (exp, 1)) == FIELD_DECL
>         && DECL_PACKED (TREE_OPERAND (exp, 1))))
>   packedp = true;
> 
> I'm not sure if this flag is required for correctness - it's only
> passed to extract_bit_field - but if it is the above code does
> not work for #pragma packed structs.  I suppose what should be
> checked is (a few lines below the above test, after we expanded tem)
> 
> if (MEM_P (op0)
> && GET_MODE (op0) != BLKmode
>   && MEM_ALIGN (op0) < GET_MODE_ALIGNMENT (GET_MODE (op0)))
>   packedp = true;
> 
> ?  I suppose packedp was computed for STRICT_ALIGNMENT targets only.
> I'm not changing the above, but Eric, you should be able to produce a
> #pragma packed testcase that fails on a STRICT_ALIGNMENT target?

This is the -fstrict-volatile-bitfields business though, and its documentation 
explicitly refers to the packed attribute:

 If the target requires strict alignment, and honoring the field
 type would require violating this alignment, a warning is issued.
 If the field has `packed' attribute, the access is done without
 honoring the field type.  If the field doesn't have `packed'
 attribute, the access is done honoring the field type.  In both
 cases, GCC assumes that the user knows something about the target
 hardware that it is unaware of.

so I'm a little reluctant to touch that.  But, yes, generally speaking, testing 
TYPE_PACKED or DECL_PACKED to drive code generation is wrong.
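
To make the point concrete, here is a small sketch of a struct that is packed through the pragma alone. Nothing in its declaration carries the packed attribute, yet its int member is under-aligned, so only the alignment information reveals the packing. The struct and helper names are hypothetical.

```c
#include <stddef.h>

/* Packed through the pragma only: no __attribute__ ((packed)) appears on
   the type or on any field.  */
#pragma pack(push, 1)
struct by_pragma
{
  char c;
  int i;
};
#pragma pack(pop)

/* Offset of the int member; 1 here, so the member is under-aligned.  */
size_t
by_pragma_int_offset (void)
{
  return offsetof (struct by_pragma, i);
}

/* Alignment the compiler assigns to the whole struct.  */
size_t
by_pragma_alignment (void)
{
  return _Alignof (struct by_pragma);
}
```

A check based on comparing an access's alignment with its size catches this case, whereas a check keyed on packed markers has nothing to look at.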

> Oh, and this does not yet fix PR53970 - but I hope that I can
> remove contains_packed_reference ;)

Right, it should definitely go away.

-- 
Eric Botcazou

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread Richard Guenther

On Wed, 18 Jul 2012, Steven Bosscher wrote:

> On Wed, Jul 18, 2012 at 9:59 AM, Richard Guenther  wrote:
> > On Tue, 17 Jul 2012, William J. Schmidt wrote:
> >
> >> I overlooked adding a pass-control flag for strength reduction, added
> >> here.  I named it -ftree-slsr for consistency with other -ftree- flags,
> >> but could change it to -fgimple-slsr if you prefer that for a pass named
> >> gimple-ssa-...
> >>
> >> Bootstrapped and tested on powerpc-unknown-linux-gnu with no new
> >> regressions.  Ok for trunk?
> >
> > The switch needs documentation in doc/invoke.texi.  Other than that
> > it's fine to stick with -ftree-..., even that exposes details to our
> > users that are not necessary (RTL passes didn't have -frtl-... either).
> > So in the end, why not re-use -fstrength-reduce that is already available
> > (but stubbed out)?
> 
> In the past, -fstrength-reduce applied to loop strength reduction in
> loop.c. I don't think it should be re-used for a completely different
> code transformation.

Ok.  I suppose -ftree-slsr is ok then.

Thanks,
Richard.

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread Eric Botcazou

> In the past, -fstrength-reduce applied to loop strength reduction in
> loop.c. I don't think it should be re-used for a completely different
> code transformation.

Seconded.

-- 
Eric Botcazou

Re: [Fortran, Patch] Fix #line parsing

2012-07-18 Thread Tobias Burnus


On 07/17/2012 11:29 PM, Tobias Burnus wrote:

Mikael Morin wrote:

On 17/07/2012 20:52, Tobias Burnus wrote:

Built and regtested on x86-64-gnu-linux.
OK for the trunk?


I should have bootstrapped and not just built the patch. It then fails
in libgfortran:


Warning: libgfortran/kinds-override.h:47: file ./kinds.inc left but not 
entered
Warning: libgfortran/kinds-override.h:47: file 
/projects/tob/gcc-git/gcc/libgfortran/generated/_abs_c4.F90 left but not 
entered
Warning: ./c99_protos.inc:1: file libgfortran/generated/_abs_c4.F90 left 
but not entered


Hence, I retract the patch. Someone should look closer at those, but I 
won't do this in the near future.


Tobias

Re: [ARM Patch 1/3]PR53189: optimizations of 64bit logic operation with constant

2012-07-18 Thread Ramana Radhakrishnan

On 18 July 2012 09:20, Carrot Wei  wrote:
> On Tue, Jul 17, 2012 at 9:47 PM, Ramana Radhakrishnan
>  wrote:
>> Carrot,
>>
>> Sorry about the delayed response.
>>
>> On 3 July 2012 12:28, Carrot Wei  wrote:
>>> On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan
>>>  wrote:
>>>> On 28 May 2012 11:08, Carrot Wei  wrote:
>>>>> Hi
>>>>>
>>>>> This is the second part of the patches that deals with 64bit and. It
>>>>> directly
>>>>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
>>>>> constant operands.
>>>>>
>>>>
>>>> Comments about const_di_ok_for_op still apply from my review of your add
>>>> patch.
>>>>
>>>> However I don't see why and /ior / xor with constants that have either
>>>> the low or high parts set can't be expanded directly into ands of
>>>> subregs with moves of zero's or the original value especially if you
>>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>>> used for 64 bit arithmetic it gets more interesting.
>>>>
>>>> Finally this should target PR target/53189.
>>>>
>>>
>>> Hi Ramana
>>>
>>> Thanks for the review. Following is the updated patch according to
>>> your comments.
>>
>> You've missed answering this part of my review :)
>>
>>>> However I don't see why and /ior / xor with constants that have either
>>>> the low or high parts set can't be expanded directly into ands of
>>>> subregs with moves of zero's or the original value especially if you
>>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>>> used for 64 bit arithmetic it gets more interesting.
>>
> It has been handled by the const_ok_for_dimode_op and the output part
> of corresponding SI mode insn. Let's take the IOR case as an example.
>

I noticed that - If I wasn't clear enough, the question was more
towards generating a subreg move at expand time rather than a split
and handling while outputting asm if you see what I mean.

regards,
Ramana

Re: Commit: ARM: Document -munaligned-access

2012-07-18 Thread Ramana Radhakrishnan

On 18 July 2012 07:51, Nick Clifton  wrote:
> Hi Guys,
>
>   I am checking in this patch to the mainline to document the ARM port's
>   -munaligned-access command line option.

Could you ask whether the RMs object to backporting this to the 4.7 branch?

Thanks,
ramana

>
> Cheers
>   Nick
>
> gcc/ChangeLog
> 2012-07-18  Nick Clifton  
>
> * doc/invoke.texi (ARM Options): Document -munaligned-access.
>
> Index: gcc/doc/invoke.texi
> ===
> --- gcc/doc/invoke.texi (revision 189603)
> +++ gcc/doc/invoke.texi (working copy)
> @@ -497,7 +497,8 @@
>  -mcaller-super-interworking  -mcallee-super-interworking @gol
>  -mtp=@var{name} -mtls-dialect=@var{dialect} @gol
>  -mword-relocations @gol
> --mfix-cortex-m3-ldrd}
> +-mfix-cortex-m3-ldrd @gol
> +-munaligned-access}
>
>  @emph{AVR Options}
>  @gccoptlist{-mmcu=@var{mcu} -maccumulate-args -mbranch-cost=@var{cost} @gol
> @@ -11049,6 +11050,23 @@
>  generating these instructions.  This option is enabled by default when
>  @option{-mcpu=cortex-m3} is specified.
>
> +@item -munaligned-access
> +@itemx -mno-unaligned-access
> +@opindex munaligned-access
> +@opindex mno-unaligned-access
> +Enables (or disables) reading and writing of 16- and 32- bit values
> +from addresses that are not 16- or 32- bit aligned.  By default
> +unaligned access is disabled for all pre-ARMv6 and all ARMv6-M
> +architectures, and enabled for all other architectures.  If unaligned
> +access is not enabled then words in packed data structures will be
> +accessed a byte at a time.
> +
> +The ARM attribute @code{Tag_CPU_unaligned_access} will be set in the
> +generated object file to either true or false, depending upon the
> +setting of this option.  If unaligned access is enabled then the
> +preprocessor symbol @code{__ARM_FEATURE_UNALIGNED} will also be
> +defined.
> +
>  @end table
>
>  @node AVR Options

Fwd: Re: Commit: ARM: Document -munaligned-access

2012-07-18 Thread nick clifton


Hi Richard,

  I have a documentation update for the 4.7 branch.  Is it OK to apply 
this ?


Cheers
  Nick


gcc/ChangeLog
2012-07-18  Nick Clifton  

* doc/invoke.texi (ARM Options): Document -munaligned-access.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 189603)
+++ gcc/doc/invoke.texi (working copy)
@@ -497,7 +497,8 @@
 -mcaller-super-interworking  -mcallee-super-interworking @gol
 -mtp=@var{name} -mtls-dialect=@var{dialect} @gol
 -mword-relocations @gol
--mfix-cortex-m3-ldrd}
+-mfix-cortex-m3-ldrd @gol
+-munaligned-access}

 @emph{AVR Options}
 @gccoptlist{-mmcu=@var{mcu} -maccumulate-args -mbranch-cost=@var{cost} @gol
@@ -11049,6 +11050,23 @@
 generating these instructions.  This option is enabled by default when
 @option{-mcpu=cortex-m3} is specified.

+@item -munaligned-access
+@itemx -mno-unaligned-access
+@opindex munaligned-access
+@opindex mno-unaligned-access
+Enables (or disables) reading and writing of 16- and 32- bit values
+from addresses that are not 16- or 32- bit aligned.  By default
+unaligned access is disabled for all pre-ARMv6 and all ARMv6-M
+architectures, and enabled for all other architectures.  If unaligned
+access is not enabled then words in packed data structures will be
+accessed a byte at a time.
+
+The ARM attribute @code{Tag_CPU_unaligned_access} will be set in the
+generated object file to either true or false, depending upon the
+setting of this option.  If unaligned access is enabled then the
+preprocessor symbol @code{__ARM_FEATURE_UNALIGNED} will also be
+defined.
+
 @end table

 @node AVR Options

Re: Fwd: Re: Commit: ARM: Document -munaligned-access

2012-07-18 Thread Richard Guenther

On Wed, 18 Jul 2012, nick clifton wrote:

> Hi Richard,
> 
>   I have a documentation update for the 4.7 branch.  Is it OK to apply this ?

Sure.

Thanks,
Richard.

> Cheers
>   Nick
> 
> > gcc/ChangeLog
> > 2012-07-18  Nick Clifton  
> > 
> > * doc/invoke.texi (ARM Options): Document -munaligned-access.
> > 
> > Index: gcc/doc/invoke.texi
> > ===
> > --- gcc/doc/invoke.texi (revision 189603)
> > +++ gcc/doc/invoke.texi (working copy)
> > @@ -497,7 +497,8 @@
> >  -mcaller-super-interworking  -mcallee-super-interworking @gol
> >  -mtp=@var{name} -mtls-dialect=@var{dialect} @gol
> >  -mword-relocations @gol
> > --mfix-cortex-m3-ldrd}
> > +-mfix-cortex-m3-ldrd @gol
> > +-munaligned-access}
> > 
> >  @emph{AVR Options}
> >  @gccoptlist{-mmcu=@var{mcu} -maccumulate-args -mbranch-cost=@var{cost} @gol
> > @@ -11049,6 +11050,23 @@
> >  generating these instructions.  This option is enabled by default when
> >  @option{-mcpu=cortex-m3} is specified.
> > 
> > +@item -munaligned-access
> > +@itemx -mno-unaligned-access
> > +@opindex munaligned-access
> > +@opindex mno-unaligned-access
> > +Enables (or disables) reading and writing of 16- and 32- bit values
> > +from addresses that are not 16- or 32- bit aligned.  By default
> > +unaligned access is disabled for all pre-ARMv6 and all ARMv6-M
> > +architectures, and enabled for all other architectures.  If unaligned
> > +access is not enabled then words in packed data structures will be
> > +accessed a byte at a time.
> > +
> > +The ARM attribute @code{Tag_CPU_unaligned_access} will be set in the
> > +generated object file to either true or false, depending upon the
> > +setting of this option.  If unaligned access is enabled then the
> > +preprocessor symbol @code{__ARM_FEATURE_UNALIGNED} will also be
> > +defined.
> > +
> >  @end table
> > 
> >  @node AVR Options
> 
> 
> 
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

[PATCH] Fix PR53970

2012-07-18 Thread Richard Guenther


With get_object_alignment and get_object_or_type_alignment fused
it is now easy to fix PR53970 and remove the bogus
contains_packed_reference function.  The vectorizer wants to know
whether the scalar access is aligned in a way that peeling
can eventually reach VF * scalar alignment (thus, vector alignment).
So, just ask that - whether the scalar access is aligned to at least
its size.
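
The reasoning can be illustrated with a little arithmetic: peeling advances the address by one scalar element per iteration, so it can only reach vector alignment when the scalar access is already aligned to its own size. The helper below is a sketch of that idea, not code from the vectorizer; its name and parameters are hypothetical, and it assumes power-of-two sizes with the vector size a multiple of the element size.

```c
#include <stdint.h>

/* Number of scalar iterations to peel so that the access at ADDR reaches
   VEC_BYTES alignment, or -1 if peeling can never get there.
   ELEM_BYTES and VEC_BYTES are assumed to be powers of two, with
   VEC_BYTES a multiple of ELEM_BYTES.  */
int
peel_iterations (uintptr_t addr, unsigned elem_bytes, unsigned vec_bytes)
{
  /* Each peeled iteration moves ADDR by ELEM_BYTES, so a scalar access
     that is not aligned to its size stays misaligned forever.  */
  if (addr % elem_bytes != 0)
    return -1;

  /* Bytes left to the next vector boundary, in units of elements.  */
  return (int) (((vec_bytes - addr % vec_bytes) % vec_bytes) / elem_bytes);
}
```

For a 4-byte element at address 4 and 16-byte vectors, three peeled iterations reach address 16; at address 2 no amount of peeling helps, which is the situation the new size-versus-alignment test is meant to detect.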

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2012-07-18  Richard Guenther  

PR tree-optimization/53970
* tree.h (contains_packed_reference): Remove.
* expr.c (contains_packed_reference): Likewise.
* tree-vect-data-refs.c (not_size_aligned): New function.
(vector_alignment_reachable_p): Use it.
(vect_supportable_dr_alignment): Likewise.

* g++.dg/torture/pr53970.C: New testcase.

Index: gcc/tree.h
===
*** gcc/tree.h  (revision 189607)
--- gcc/tree.h  (working copy)
*** extern tree get_inner_reference (tree, H
*** 5068,5079 
 tree *, enum machine_mode *, int *, int *,
 bool);
  
- /* Given an expression EXP that may be a COMPONENT_REF, an ARRAY_REF or an
-ARRAY_RANGE_REF, look for whether EXP or any nested component-refs within
-EXP is marked as PACKED.  */
- 
- extern bool contains_packed_reference (const_tree exp);
- 
  /* Return a tree of sizetype representing the size, in bytes, of the element
 of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
  
--- 5068,5073 
Index: gcc/expr.c
===
*** gcc/expr.c  (revision 189607)
--- gcc/expr.c  (working copy)
*** get_inner_reference (tree exp, HOST_WIDE
*** 6709,6755 
return exp;
  }
  
- /* Given an expression EXP that may be a COMPONENT_REF, an ARRAY_REF or an
-ARRAY_RANGE_REF, look for whether EXP or any nested component-refs within
-EXP is marked as PACKED.  */
- 
- bool
- contains_packed_reference (const_tree exp)
- {
-   bool packed_p = false;
- 
-   while (1)
- {
-   switch (TREE_CODE (exp))
-   {
-   case COMPONENT_REF:
- {
-   tree field = TREE_OPERAND (exp, 1);
-   packed_p = DECL_PACKED (field)
-  || TYPE_PACKED (TREE_TYPE (field))
-  || TYPE_PACKED (TREE_TYPE (exp));
-   if (packed_p)
- goto done;
- }
- break;
- 
-   case BIT_FIELD_REF:
-   case ARRAY_REF:
-   case ARRAY_RANGE_REF:
-   case REALPART_EXPR:
-   case IMAGPART_EXPR:
-   case VIEW_CONVERT_EXPR:
- break;
- 
-   default:
- goto done;
-   }
-   exp = TREE_OPERAND (exp, 0);
- }
-  done:
-   return packed_p;
- }
- 
  /* Return a tree of sizetype representing the size, in bytes, of the element
 of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
  
--- 6709,6714 
Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c   (revision 189607)
--- gcc/tree-vect-data-refs.c   (working copy)
*** vect_verify_datarefs_alignment (loop_vec
*** 1131,1136 
--- 1131,1148 
return true;
  }
  
+ /* Given an memory reference EXP return whether its alignment is less
+than its size.  */
+ 
+ static bool
+ not_size_aligned (tree exp)
+ {
+   if (!host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1))
+ return true;
+ 
+   return (tree_low_cst (TYPE_SIZE (TREE_TYPE (exp)), 1)
+ > get_object_alignment (exp));
+ }
  
  /* Function vector_alignment_reachable_p
  
*** vector_alignment_reachable_p (struct dat
*** 1184,1195 
  
if (!known_alignment_for_access_p (dr))
  {
!   tree type = (TREE_TYPE (DR_REF (dr)));
!   bool is_packed = contains_packed_reference (DR_REF (dr));
! 
!   if (compare_tree_int (TYPE_SIZE (type), TYPE_ALIGN (type)) > 0)
!   is_packed = true;
! 
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed);
if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
--- 1196,1203 
  
if (!known_alignment_for_access_p (dr))
  {
!   tree type = TREE_TYPE (DR_REF (dr));
!   bool is_packed = not_size_aligned (DR_REF (dr));
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed);
if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
*** vect_supportable_dr_alignment (struct da
*** 4863,4869 
return dr_explicit_realign_optimized;
}
if (!known_alignment_for_access_p (dr))
!   is_packed = contains_packed_reference (DR_REF (dr));
  
if (targetm.vectorize.
  support_vector_misalignment (mode, type,
--- 4871,4877

Re: [ARM Patch 1/3]PR53189: optimizations of 64bit logic operation with constant

2012-07-18 Thread Carrot Wei

On Wed, Jul 18, 2012 at 5:39 PM, Ramana Radhakrishnan
 wrote:
> On 18 July 2012 09:20, Carrot Wei  wrote:
>> On Tue, Jul 17, 2012 at 9:47 PM, Ramana Radhakrishnan
>>  wrote:
>>> Carrot,
>>>
>>> Sorry about the delayed response.
>>>
>>> On 3 July 2012 12:28, Carrot Wei  wrote:
>>>> On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan
>>>>  wrote:
>>>>> On 28 May 2012 11:08, Carrot Wei  wrote:
>>>>>> Hi
>>>>>>
>>>>>> This is the second part of the patches that deals with 64bit and. It directly
>>>>>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
>>>>>> constant operands.
>>>>>>
>>>>>
>>>>> Comments about const_di_ok_for_op still apply from my review of your add patch.
>>>>>
>>>>> However I don't see why and /ior / xor with constants that have either
>>>>> the low or high parts set can't be expanded directly into ands of
>>>>> subregs with moves of zero's or the original value especially if you
>>>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>>>> used for 64 bit arithmetic it gets more interesting.
>>>>>
>>>>> Finally this should target PR target/53189.
>>>>>
>>>>
>>>> Hi Ramana
>>>>
>>>> Thanks for the review. Following is the updated patch according to
>>>> your comments.
>>>
>>> You've missed answering this part of my review :)
>>>
>>>>> However I don't see why and /ior / xor with constants that have either
>>>>> the low or high parts set can't be expanded directly into ands of
>>>>> subregs with moves of zero's or the original value especially if you
>>>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>>>> used for 64 bit arithmetic it gets more interesting.
>>>
>> It has been handled by the const_ok_for_dimode_op and the output part
>> of corresponding SI mode insn. Let's take the IOR case as an example.
>>
>
> I noticed that - If I wasn't clear enough, the question was more
> towards generating a subreg move at expand time rather than a split
> and handling while outputting asm if you see what I mean.
>
I see your point now. I don't know how much better it would be to handle
it earlier. Even if I generate a subreg move for non-neon code at expand
time, the later output handling is still necessary for neon insns. Do
you think it deserves the extra expand handling?
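
To make the trade-off concrete, here is a minimal sketch in plain C (not GCC RTL) of the decomposition under discussion. The helper name ior64_by_halves is hypothetical and is not part of the patch. It only illustrates that a 64bit IOR with a constant reduces to two independent 32bit operations, each of which degenerates to a move when the matching half of the constant is zero or all ones.

```c
#include <stdint.h>

/* Hypothetical illustration, not code from the patch: IOR a 64-bit value
   with a constant by working on the two 32-bit halves separately.  */
uint64_t
ior64_by_halves (uint64_t x, uint64_t c)
{
  uint32_t x_lo = (uint32_t) x;
  uint32_t x_hi = (uint32_t) (x >> 32);
  uint32_t c_lo = (uint32_t) c;
  uint32_t c_hi = (uint32_t) (c >> 32);
  uint32_t r_lo, r_hi;

  /* Low half: a zero constant keeps the original value (a plain move),
     an all-ones constant yields a constant, anything else needs a real
     32-bit IOR.  */
  if (c_lo == 0)
    r_lo = x_lo;
  else if (c_lo == UINT32_MAX)
    r_lo = UINT32_MAX;
  else
    r_lo = x_lo | c_lo;

  /* High half: the same three cases.  */
  if (c_hi == 0)
    r_hi = x_hi;
  else if (c_hi == UINT32_MAX)
    r_hi = UINT32_MAX;
  else
    r_hi = x_hi | c_hi;

  return ((uint64_t) r_hi << 32) | r_lo;
}
```

Whether this case analysis happens at expand time or while outputting the asm, the per-half logic is the same; the question is only where it lives.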

thanks
Carrot

Re: [PATCH 4/6] Thread pointer built-in functions, s390

2012-07-18 Thread Andreas Krebbel

On 07/12/2012 08:52 AM, Chung-Lin Tang wrote:
> * config/s390/s390.c (s390_builtin,code_for_builtin_64,
> code_for_builtin_31,s390_init_builtins,s390_expand_builtin):
> Remove.
> (s390_expand_builtin_thread_pointer): Add hook function for
> TARGET_EXPAND_BUILTIN_THREAD_POINTER.
> (s390_expand_builtin_set_thread_pointer): Add hook function for
> TARGET_EXPAND_BUILTIN_SET_THREAD_POINTER.

I've tested your patches on s390x. No regressions.

The patch is ok.

Bye,

-Andreas-

[Ada] Fix spurious 'noreturn' function does return warning at -O0 (1)

2012-07-18 Thread Eric Botcazou

This fixes a spurious 'noreturn' function does return warning at -O0 on code 
involving controlled types.

Tested on x86_64-suse-linux, applied on the mainline.


2012-07-18  Eric Botcazou  

* gcc-interface/trans.c (stmt_group_may_fallthru): New function.
(gnat_to_gnu) <N_Block_Statement>: Use it to find out whether the
block needs to be translated.


2012-07-18  Eric Botcazou  

* gnat.dg/noreturn4.ad[sb]: New test.
* gnat.dg/noreturn4_pkg.ads: New helper.


-- 
Eric Botcazou
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 189607)
+++ gcc-interface/trans.c	(working copy)
@@ -244,6 +244,7 @@ static void add_cleanup (tree, Node_Id);
 static void add_stmt_list (List_Id);
 static void push_exception_label_stack (VEC(tree,gc) **, Entity_Id);
 static tree build_stmt_group (List_Id, bool);
+static inline bool stmt_group_may_fallthru (void);
 static enum gimplify_status gnat_gimplify_stmt (tree *);
 static void elaborate_all_entities (Node_Id);
 static void process_freeze_entity (Node_Id);
@@ -6197,12 +6198,18 @@ gnat_to_gnu (Node_Id gnat_node)
   break;
 
 case N_Block_Statement:
-  start_stmt_group ();
-  gnat_pushlevel ();
-  process_decls (Declarations (gnat_node), Empty, Empty, true, true);
-  add_stmt (gnat_to_gnu (Handled_Statement_Sequence (gnat_node)));
-  gnat_poplevel ();
-  gnu_result = end_stmt_group ();
+  /* The only way to enter the block is to fall through to it.  */
+  if (stmt_group_may_fallthru ())
+	{
+	  start_stmt_group ();
+	  gnat_pushlevel ();
+	  process_decls (Declarations (gnat_node), Empty, Empty, true, true);
+	  add_stmt (gnat_to_gnu (Handled_Statement_Sequence (gnat_node)));
+	  gnat_poplevel ();
+	  gnu_result = end_stmt_group ();
+	}
+  else
+	gnu_result = alloc_stmt_list ();
   break;
 
 case N_Exit_Statement:
@@ -7240,6 +7247,17 @@ end_stmt_group (void)
   return gnu_retval;
 }
 
+/* Return whether the current statement group may fall through.  */
+
+static inline bool
+stmt_group_may_fallthru (void)
+{
+  if (current_stmt_group->stmt_list)
+return block_may_fallthru (current_stmt_group->stmt_list);
+  else
+return true;
+}
+
 /* Add a list of statements from GNAT_LIST, a possibly-empty list of
statements.*/
 
-- { dg-do compile }

with Noreturn4_Pkg; use Noreturn4_Pkg;

package body Noreturn4 is

  procedure P1 (Msg : String) is
  begin
 P1 (Msg, 0);
  end;
  procedure P1 (Msg : String; Val : Integer) is
  begin
 Fatal_Error (Value (It));
  end;

  procedure Fatal_Error (X : Integer) is
  begin
 raise PRogram_Error;
  end;

end Noreturn4;
with Ada.Finalization; use Ada.Finalization;

package Noreturn4_Pkg is

  type Priv is private;
  function It return Priv;
  function Value (Obj : Priv) return Integer;
  function OK (Obj : Priv) return Boolean;

private
  type Priv is new Controlled with record
 Value : Integer := 15;
  end record;

  procedure Adjust   (Obj : in out Priv);
  procedure Finalize (Obj : in out Priv);

end Noreturn4_Pkg;
package Noreturn4 is

  procedure P1 (Msg : String);
  procedure P1 (Msg : String; Val : Integer);
  pragma No_Return (P1);

  procedure Fatal_Error (X : Integer);
  pragma No_Return (Fatal_Error);

end Noreturn4;

[patch] Fix spurious 'noreturn' function does return warning at -O0 (2)

2012-07-18 Thread Eric Botcazou

This fixes a spurious 'noreturn' function does return warning at -O0 on code 
involving an exception block.  I overlooked this case when I implemented the 
mechanism in gimple-low.c during the 4.5 development phase.
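
As a rough sketch of the fall-through rule involved (plain C with hypothetical helper names, not the gimple-low.c code itself): a try-finally falls through only if both the try clause and the finally clause fall through, whereas a try-catch may fall through as soon as the try body or any one of the handlers may.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers, for illustration only.  */

/* try-finally: the finally clause always runs after the try clause, so
   both must fall through for the whole construct to fall through.  */
bool
try_finally_may_fallthru (bool try_may_fallthru, bool finally_may_fallthru)
{
  return try_may_fallthru && finally_may_fallthru;
}

/* try-catch: control leaves through the try body or through one of the
   handlers, so it is enough for any one of them to fall through.  */
bool
try_catch_may_fallthru (bool try_may_fallthru,
                        const bool *handler_may_fallthru, size_t n_handlers)
{
  size_t i;

  if (try_may_fallthru)
    return true;
  for (i = 0; i < n_handlers; i++)
    if (handler_may_fallthru[i])
      return true;
  return false;
}
```

A 'noreturn' function whose body is a try-catch in which neither the try body nor any handler falls through should therefore not trigger the warning.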

Tested on x86_64-suse-linux, OK for the mainline?


2012-07-18  Eric Botcazou  

* gimple-low.c (lower_try_catch): New function.
(lower_stmt) <GIMPLE_TRY>: Use it to lower GIMPLE_TRY_CATCH.
<GIMPLE_CATCH>: Delete.
<GIMPLE_EH_FILTER>: Likewise.


2012-07-18  Eric Botcazou  

* gnat.dg/noreturn5.ad[sb]: New test.


-- 
Eric Botcazou
Index: gimple-low.c
===
--- gimple-low.c	(revision 189525)
+++ gimple-low.c	(working copy)
@@ -76,6 +76,7 @@ struct lower_data
 
 static void lower_stmt (gimple_stmt_iterator *, struct lower_data *);
 static void lower_gimple_bind (gimple_stmt_iterator *, struct lower_data *);
+static void lower_try_catch (gimple_stmt_iterator *, struct lower_data *);
 static void lower_gimple_return (gimple_stmt_iterator *, struct lower_data *);
 static void lower_builtin_setjmp (gimple_stmt_iterator *);
 
@@ -373,31 +374,28 @@ lower_stmt (gimple_stmt_iterator *gsi, s
   return;
 
 case GIMPLE_TRY:
-  {
-	bool try_cannot_fallthru;
-	lower_sequence (gimple_try_eval_ptr (stmt), data);
-	try_cannot_fallthru = data->cannot_fallthru;
-	data->cannot_fallthru = false;
-	lower_sequence (gimple_try_cleanup_ptr (stmt), data);
-	/* See gimple_stmt_may_fallthru for the rationale.  */
-	if (gimple_try_kind (stmt) == GIMPLE_TRY_FINALLY)
-	  {
-	data->cannot_fallthru |= try_cannot_fallthru;
-	gsi_next (gsi);
-	return;
-	  }
-  }
-  break;
-
-case GIMPLE_CATCH:
-  data->cannot_fallthru = false;
-  lower_sequence (gimple_catch_handler_ptr (stmt), data);
-  break;
-
-case GIMPLE_EH_FILTER:
-  data->cannot_fallthru = false;
-  lower_sequence (gimple_eh_filter_failure_ptr (stmt), data);
-  break;
+  if (gimple_try_kind (stmt) == GIMPLE_TRY_CATCH)
+	lower_try_catch (gsi, data);
+  else
+	{
+	  /* It must be a GIMPLE_TRY_FINALLY.  */
+	  bool cannot_fallthru;
+	  lower_sequence (gimple_try_eval_ptr (stmt), data);
+	  cannot_fallthru = data->cannot_fallthru;
+
+	  /* The finally clause is always executed after the try clause,
+	 so if it does not fall through, then the try-finally will not
+	 fall through.  Otherwise, if the try clause does not fall
+	 through, then when the finally clause falls through it will
+	 resume execution wherever the try clause was going.  So the
+	 whole try-finally will only fall through if both the try
+	 clause and the finally clause fall through.  */
+	  data->cannot_fallthru = false;
+	  lower_sequence (gimple_try_cleanup_ptr (stmt), data);
+	  data->cannot_fallthru |= cannot_fallthru;
+	  gsi_next (gsi);
+	}
+  return;
 
 case GIMPLE_EH_ELSE:
   lower_sequence (gimple_eh_else_n_body_ptr (stmt), data);
@@ -520,6 +518,67 @@ lower_gimple_bind (gimple_stmt_iterator
   gsi_remove (gsi, false);
 }
 
+/* Same as above, but for a GIMPLE_TRY_CATCH.  */
+
+static void
+lower_try_catch (gimple_stmt_iterator *gsi, struct lower_data *data)
+{
+  bool cannot_fallthru;
+  gimple stmt = gsi_stmt (*gsi);
+  gimple_stmt_iterator i;
+
+  /* We don't handle GIMPLE_TRY_FINALLY.  */
+  gcc_assert (gimple_try_kind (stmt) == GIMPLE_TRY_CATCH);
+
+  lower_sequence (gimple_try_eval_ptr (stmt), data);
+  cannot_fallthru = data->cannot_fallthru;
+
+  i = gsi_start (*gimple_try_cleanup_ptr (stmt));
+  switch (gimple_code (gsi_stmt (i)))
+{
+case GIMPLE_CATCH:
+  /* We expect to see a sequence of GIMPLE_CATCH stmts, each with a
+	 catch expression and a body.  The whole try/catch may fall
+	 through iff any of the catch bodies falls through.  */
+  for (; !gsi_end_p (i); gsi_next (&i))
+	{
+	  data->cannot_fallthru = false;
+	  lower_sequence (gimple_catch_handler_ptr (gsi_stmt (i)), data);
+	  if (!data->cannot_fallthru)
+	cannot_fallthru = false;
+	}
+  break;
+
+case GIMPLE_EH_FILTER:
+  /* The exception filter expression only matters if there is an
+	 exception.  If the exception does not match EH_FILTER_TYPES,
+	 we will execute EH_FILTER_FAILURE, and we will fall through
+	 if that falls through.  If the exception does match
+	 EH_FILTER_TYPES, the stack unwinder will continue up the
+	 stack, so we will not fall through.  We don't know whether we
+	 will throw an exception which matches EH_FILTER_TYPES or not,
+	 so we just ignore EH_FILTER_TYPES and assume that we might
+	 throw an exception which doesn't match.  */
+  data->cannot_fallthru = false;
+  lower_sequence (gimple_eh_filter_failure_ptr (gsi_stmt (i)), data);
+  if (!data->cannot_fallthru)
+	cannot_fallthru = false;
+  break;
+
+default:
+  /* This case represents statements to be executed when an
+	 exception occurs.  Those statements are implicitly followed
+	 by a GIMPLE_RESX to resume execution after the excep

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread William J. Schmidt

On Wed, 2012-07-18 at 11:01 +0200, Richard Guenther wrote:
> On Wed, 18 Jul 2012, Steven Bosscher wrote:
> 
> > On Wed, Jul 18, 2012 at 9:59 AM, Richard Guenther  wrote:
> > > On Tue, 17 Jul 2012, William J. Schmidt wrote:
> > >
> > >> I overlooked adding a pass-control flag for strength reduction, added
> > >> here.  I named it -ftree-slsr for consistency with other -ftree- flags,
> > >> but could change it to -fgimple-slsr if you prefer that for a pass named
> > >> gimple-ssa-...
> > >>
> > >> Bootstrapped and tested on powerpc-unknown-linux-gnu with no new
> > >> regressions.  Ok for trunk?
> > >
> > > The switch needs documentation in doc/invoke.texi.  Other than that
> > > it's fine to stick with -ftree-..., even that exposes details to our
> > > users that are not necessary (RTL passes didn't have -frtl-... either).
> > > So in the end, why not re-use -fstrength-reduce that is already available
> > > (but stubbed out)?
> > 
> > In the past, -fstrength-reduce applied to loop strength reduction in
> > loop.c. I don't think it should be re-used for a completely different
> > code transformation.
> 
> Ok.  I suppose -ftree-slsr is ok then.

It turns out I was looking at a very old copy of the manual, and the
-ftree... stuff is not as prevalent now as it once was.  I'll just go
with -fslsr to be consistent with -fgcse, -fipa-sra, etc.

Thanks for the pointer to doc/invoke.texi -- it appears I also failed to
document -fhoist-adjacent-loads, so I will go ahead and do that as well.

Thanks!
Bill

> 
> Thanks,
> Richard.
>

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread William J. Schmidt

On Wed, 2012-07-18 at 08:24 -0500, William J. Schmidt wrote:
> On Wed, 2012-07-18 at 11:01 +0200, Richard Guenther wrote:
> > On Wed, 18 Jul 2012, Steven Bosscher wrote:
> > 
> > > On Wed, Jul 18, 2012 at 9:59 AM, Richard Guenther  
> > > wrote:
> > > > On Tue, 17 Jul 2012, William J. Schmidt wrote:
> > > >
> > > >> I overlooked adding a pass-control flag for strength reduction, added
> > > >> here.  I named it -ftree-slsr for consistency with other -ftree- flags,
> > > >> but could change it to -fgimple-slsr if you prefer that for a pass 
> > > >> named
> > > >> gimple-ssa-...
> > > >>
> > > >> Bootstrapped and tested on powerpc-unknown-linux-gnu with no new
> > > >> regressions.  Ok for trunk?
> > > >
> > > > The switch needs documentation in doc/invoke.texi.  Other than that
> > > > it's fine to stick with -ftree-..., even that exposes details to our
> > > > users that are not necessary (RTL passes didn't have -frtl-... either).
> > > > So in the end, why not re-use -fstrength-reduce that is already 
> > > > available
> > > > (but stubbed out)?
> > > 
> > > In the past, -fstrength-reduce applied to loop strength reduction in
> > > loop.c. I don't think it should be re-used for a completely different
> > > code transformation.
> > 
> > Ok.  I suppose -ftree-slsr is ok then.
> 
> It turns out I was looking at a very old copy of the manual, and the
> -ftree... stuff is not as prevalent now as it once was.  I'll just go
> with -fslsr to be consistent with -fgcse, -fipa-sra, etc.

Well, posted too fast.  Paging down I see that isn't true, sorry.  I'll
use the tree- for consistency even though it is useless information.

Thanks,
Bill

> 
> Thanks for the pointer to doc/invoke.texi -- it appears I also failed to
> document -fhoist-adjacent-loads, so I will go ahead and do that as well.
> 
> Thanks!
> Bill
> 
> > 
> > Thanks,
> > Richard.
> > 
>

[PATCH] [MIPS] Support for -mmcu and -mno-mcu

2012-07-18 Thread Moore, Catherine

Hi Richard,

This patch adds support for the -mmcu option to gcc.  Okay to commit?

Thanks,
Catherine

2012-07-18  Maciej W. Rozycki  
Chao-ying Fu  

* config/mips/mips.opt (mmcu): New option.
* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Define 
_mips_mcu when TARGET_MCU.
(ASM_SPEC): Pass mcu options to the assembler.
* doc/invoke.texi (MIPS Options): Document -mmcu and -mno-mcu.


mcu.patch
Description: mcu.patch

Re: Fix PR c++/19351 (operator new[] overflow)

2012-07-18 Thread Jason Merrill


On 06/26/2012 10:29 AM, Florian Weimer wrote:

+  /* Set to (size_t)-1 if the size check fails.  */
+  if (size_check != NULL_TREE)
+*size = fold_build3 (COND_EXPR, sizetype, size_check,
+original_size, TYPE_MAX_VALUE (sizetype));
VEC_safe_insert (tree, gc, *args, 0, *size);
*args = resolve_args (*args, complain);
if (*args == NULL)
@@ -4022,7 +4030,11 @@ build_operator_new_call (tree fnname, VEC(tree,gc) 
**args,
 if (use_cookie)
 {
   /* Update the total size.  */
-  *size = size_binop (PLUS_EXPR, *size, *cookie_size);
+  *size = size_binop (PLUS_EXPR, original_size, *cookie_size);
+  /* Set to (size_t)-1 if the size check fails.  */
+  gcc_assert (size_check != NULL_TREE);
+  *size = fold_build3 (COND_EXPR, sizetype, size_check,
+   *size, TYPE_MAX_VALUE (sizetype));


Looks like you're evaluating the size_check twice for types that use 
cookies.



+  /* Unconditionally substract the array size.  This decreases the
+maximum object size and is safe even if we choose not to use
+a cookie after all.  */


"cookie size"

But since we're going to be deciding whether or not to use a cookie in 
this function anyway, why not do it here?


Jason

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread William J. Schmidt

Here's the patch with documentation changes included.  I also cleaned up
missing work from a couple of my previous patches, so
-fhoist-adjacent-loads is documented now, and -fvect-cost-model is added
to the list of options on by default at -O3.
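
For readers unfamiliar with the pass, the C sketch below shows the kind of rewrite that the new -ftree-slsr documentation describes as replacing related multiplications by less expensive calculations. The two functions are hypothetical hand-written "before" and "after" forms, not compiler output, and it is an assumption that the pass would perform exactly this rewrite on a given target.

```c
/* Before: three related multiplications by the same stride S.
   Unsigned arithmetic is used so that both forms agree even on
   wrap-around.  */
unsigned long
sum_before (unsigned long b, unsigned long s)
{
  unsigned long y0 = b * s;
  unsigned long y1 = (b + 1) * s;
  unsigned long y2 = (b + 2) * s;
  return y0 + y1 + y2;
}

/* After: only the first multiplication remains; the other two values
   are derived from it by adding the stride.  */
unsigned long
sum_after (unsigned long b, unsigned long s)
{
  unsigned long y0 = b * s;
  unsigned long y1 = y0 + s;
  unsigned long y2 = y1 + s;
  return y0 + y1 + y2;
}
```

Both functions compute the same result for all inputs; the second simply trades two multiplications for two additions.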

Ok for trunk?

Thanks,
Bill


2012-07-18  Bill Schmidt  

* doc/invoke.texi: Add -fhoist-adjacent-loads and -ftree-slsr to list
of flags controlling optimization; add -ftree-slsr to list of flags
enabled by default at -O; add -fhoist-adjacent-loads to list of flags
enabled by default at -O2; add -fvect-cost-model to list of flags
enabled by default at -O3; document -fhoist-adjacent-loads and
-ftree-slsr.
* opts.c (default_option): Make -ftree-slsr default at -O1 and above.
* gimple-ssa-strength-reduction.c (gate_strength_reduction): Use
flag_tree_slsr.
* common.opt: Add -ftree-slsr with flag_tree_slsr.


Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 189574)
+++ gcc/doc/invoke.texi (working copy)
@@ -364,7 +364,8 @@ Objective-C and Objective-C++ Dialects}.
 -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} 
@gol
 -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
 -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
--fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
+-fgcse-sm -fhoist-adjacent-loads -fif-conversion @gol
+-fif-conversion2 -findirect-inlining @gol
 -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
 -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg @gol
 -fipa-pta -fipa-profile -fipa-pure-const -fipa-reference @gol
@@ -413,8 +414,8 @@ Objective-C and Objective-C++ Dialects}.
 -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
 -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
 -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol
--ftree-reassoc @gol
--ftree-sink -ftree-sra -ftree-switch-conversion -ftree-tail-merge @gol
+-ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
+-ftree-switch-conversion -ftree-tail-merge @gol
 -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
 -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
@@ -6259,6 +6260,7 @@ compilation time.
 -ftree-forwprop @gol
 -ftree-fre @gol
 -ftree-phiprop @gol
+-ftree-slsr @gol
 -ftree-sra @gol
 -ftree-pta @gol
 -ftree-ter @gol
@@ -6286,6 +6288,7 @@ also turns on the following optimization flags:
 -fdevirtualize @gol
 -fexpensive-optimizations @gol
 -fgcse  -fgcse-lm  @gol
+-fhoist-adjacent-loads @gol
 -finline-small-functions @gol
 -findirect-inlining @gol
 -fipa-sra @gol
@@ -6311,6 +6314,7 @@ Optimize yet more.  @option{-O3} turns on all opti
 by @option{-O2} and also turns on the @option{-finline-functions},
 @option{-funswitch-loops}, @option{-fpredictive-commoning},
 @option{-fgcse-after-reload}, @option{-ftree-vectorize},
+@option{-fvect-cost-model},
 @option{-ftree-partial-pre} and @option{-fipa-cp-clone} options.
 
 @item -O0
@@ -7129,6 +7133,13 @@ This flag is enabled by default at @option{-O} and
 Perform hoisting of loads from conditional pointers on trees.  This
 pass is enabled by default at @option{-O} and higher.
 
+@item -fhoist-adjacent-loads
+@opindex hoist-adjacent-loads
+Speculatively hoist loads from both branches of an if-then-else if the
+loads are from adjacent locations in the same structure and the target
+architecture has a conditional move instruction.  This flag is enabled
+by default at @option{-O2} and higher.
+
 @item -ftree-copy-prop
 @opindex ftree-copy-prop
 Perform copy propagation on trees.  This pass eliminates unnecessary
@@ -7529,6 +7540,13 @@ defining expression.  This results in non-GIMPLE c
 much more complex trees to work on resulting in better RTL generation.  This is
 enabled by default at @option{-O} and higher.
 
+@item -ftree-slsr
+@opindex ftree-slsr
+Perform straight-line strength reduction on trees.  This recognizes related
+expressions involving multiplications and replaces them by less expensive
+calculations when possible.  This is enabled by default at @option{-O} and
+higher.
+
 @item -ftree-vectorize
 @opindex ftree-vectorize
 Perform loop vectorization on trees. This flag is enabled by default at
@@ -7550,7 +7568,8 @@ except at level @option{-Os} where it is disabled.
 
 @item -fvect-cost-model
 @opindex fvect-cost-model
-Enable cost model for vectorization.
+Enable cost model for vectorization.  This option is enabled by default at
+@option{-O3}.
 
 @item -ftree-vrp
 @opindex ftree-vrp
Index: gcc/opts.c
===
--- gcc/opts.c  (revision 189574)
+++ gcc/opts.c  (working copy)
@@ -452,6 +452,7 @@ static const struct defa

[patch] Fix ICE in set_lattice_value

2012-07-18 Thread Eric Botcazou

This is a regression present on mainline and 4.7 branch for targets using SJLJ 
exceptions by default in Ada (e.g. ARM).  The error message is:

+===GNAT BUG DETECTED==+
| 4.8.0 20120716 (experimental) [trunk revision 189525] (x86_64-suse-linux) GCC error:|
| in set_lattice_value, at tree-ssa-ccp.c:452  |
| Error detected around p.adb:16:4  

It's valid_lattice_transition returning false on a transition from INTEGER_CST 
to a constant &x.  It occurs for an array reference with non-constant index: on 
the first round, &x + i is non-constant so the algorithm computes an alignment 
factor which is an INTEGER_CST; on the second round, i is 0 so the new value is 
the constant &x.

valid_lattice_transition accepts the reverse transition.  The attached patch 
makes the function accept this transition as well.
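
Put differently, the new condition asks whether exactly one of the two values is an INTEGER_CST. A minimal stand-alone sketch of that predicate, using a hypothetical two-value enum in place of real tree codes:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tree codes involved.  */
enum value_kind { KIND_INTEGER_CST, KIND_ADDR_EXPR };

/* Accept a transition in either direction between an INTEGER_CST and a
   non-INTEGER_CST constant such as &x.  */
bool
mixed_constant_transition_p (enum value_kind old_kind,
                             enum value_kind new_kind)
{
  return (old_kind == KIND_INTEGER_CST) != (new_kind == KIND_INTEGER_CST);
}
```

When both values are INTEGER_CSTs, or neither is, the predicate is false and the remaining checks in the real function decide.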

Tested on x86_64-suse-linux, OK for the mainline and 4.7 branch?


2012-07-18  Eric Botcazou  

* tree-ssa-ccp.c (valid_lattice_transition): Allow transitioning from
as well as to INTEGER_CST.


2012-07-18  Eric Botcazou  

* gnat.dg/loop_optimization11.adb: New test.
* gnat.dg/loop_optimization11_pkg.ads: New helper.


-- 
Eric Botcazou
Index: tree-ssa-ccp.c
===
--- tree-ssa-ccp.c	(revision 189525)
+++ tree-ssa-ccp.c	(working copy)
@@ -405,9 +405,9 @@ valid_lattice_transition (prop_value_t o
 
   /* Now both lattice values are CONSTANT.  */
 
-  /* Allow transitioning from &x to &x & ~3.  */
-  if (TREE_CODE (old_val.value) != INTEGER_CST
-  && TREE_CODE (new_val.value) == INTEGER_CST)
+  /* Allow transitioning from &x to &x & ~3 and vice versa.  */
+  if ((TREE_CODE (old_val.value) == INTEGER_CST)
+  != (TREE_CODE (new_val.value) == INTEGER_CST))
 return true;
 
   /* Bit-lattices have to agree in the still valid bits.  */
-- { dg-do compile }
-- { dg-options "-O" }

with Loop_Optimization11_Pkg; use Loop_Optimization11_Pkg;

procedure Loop_Optimization11 is
   Arr : array (Prot, Mem) of Integer := (others => (others => 0));
begin
   Put_Line (Img (0) & " ");
   for I in Arr'Range (1) loop
  for J in Arr'Range (2) loop
 declare
Elem : Integer renames Arr (I, J);
 begin
Put_Line (Img (Elem));
 end;
  end loop;
   end loop;
end;
package Loop_Optimization11_Pkg is

   function Img (X : Integer) return String;

   procedure Put_Line (Data : String);

   type Prot is (Execute, Execute_Read, Execute_Read_Write);

   type Mem is (Mem_Image, Mem_Mapped, Mem_Private, Unknown);

end Loop_Optimization11_Pkg;

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread Steven Bosscher

On Wed, Jul 18, 2012 at 3:24 PM, William J. Schmidt
 wrote:
> It turns out I was looking at a very old copy of the manual, and the
> -ftree... stuff is not as prevalent now as it once was.  I'll just go
> with -fslsr to be consistent with -fgcse, -fipa-sra, etc.

Sadly, it is more prevalent than it ever was!
It's IMHO very unfortunate that such an internal detail is exposed to
the user...

Ciao!
Steven

[patch] Fix GIMPLE verification failure on CONSTRUCTOR

2012-07-18 Thread Eric Botcazou

This is a regression present on mainline and 4.7 branch.  The error message is:

p.adb: In function 'P.Proc':
p.adb:3:4: error: non-trivial conversion at assignment
system__address
void (*) (void)
r.callback.callback.address = q__proc;

+===GNAT BUG DETECTED==+
| 4.8.0 20120716 (experimental) [trunk revision 189525] (x86_64-suse-linux) GCC error:|
| verify_gimple failed |
| Error detected around p.adb:3:4 

We lose a cast in an initializer before gimplification, hence type mismatch.
This happens as follows: a CONSTRUCTOR used as the initializer of a global 
constant and whose only value contains the cast is embedded (shared) in a 
CONSTRUCTOR used as the initializer of a second global constant, which is in 
turn embedded (shared) in a CONSTRUCTOR used as the initializer of a local 
variable.

The sharing is fine, since we have an unsharing pass running right before 
gimplification.  The problem is that, since:

r171903 | matz | 2011-04-03 13:13:09 +0200 (Sun, 03 Apr 2011) | 7 lines

* cgraphbuild.c (record_reference): Canonicalize constructor
values.
* gimple-fold.c (canonicalize_constructor_val): Accept being called
without function context.
* cgraphunit.c (cgraph_finalize_compilation_unit): Clear
current_function_decl and cfun.

record_reference can modify the contents of CONSTRUCTORs _before_ the unsharing 
pass is run and yield invalid GENERIC and later invalid GIMPLE.

Tested on x86_64-suse-linux, OK for the mainline and 4.7 branch?


2012-07-18  Eric Botcazou  

* gimple-fold.c (canonicalize_constructor_val): Strip only useless type
conversions.


2012-07-18  Eric Botcazou  

* gnat.dg/aggr20.ad[sb]: New test.
* gnat.dg/aggr20_pkg.ads: New helper.


-- 
Eric Botcazou
Index: gimple-fold.c
===
--- gimple-fold.c	(revision 189525)
+++ gimple-fold.c	(working copy)
@@ -139,7 +139,7 @@ can_refer_decl_in_current_unit_p (tree d
 tree
 canonicalize_constructor_val (tree cval, tree from_decl)
 {
-  STRIP_NOPS (cval);
+  STRIP_USELESS_TYPE_CONVERSION (cval);
   if (TREE_CODE (cval) == POINTER_PLUS_EXPR
   && TREE_CODE (TREE_OPERAND (cval, 1)) == INTEGER_CST)
 {
-- { dg-do compile }

package body Aggr20 is

   procedure Proc (R : out Rec3) is
   begin
  R := (Callback => Nil_Rec2);
   end;

end Aggr20;
with Aggr20_Pkg; use Aggr20_Pkg;
with System;

package Aggr20 is

   type Rec1 is record
  Address : System.Address;
   end record;

   Nil_Rec1 : constant Rec1 := (Address => Default_Nil_Address);

   type Rec2 is record
  Callback : Rec1;
   end record;

   Nil_Rec2 : constant Rec2 := (Callback => Nil_Rec1);

   type Rec3 is record
  Callback : Rec2;
   end record;

   procedure Proc (R : out Rec3);

end Aggr20;
with System;

package Aggr20_Pkg is

   procedure Proc;

   Default_Nil_Address : constant System.Address := Proc'Address;

end Aggr20_Pkg;

Re: Fix PR c++/19351 (operator new[] overflow)

2012-07-18 Thread Florian Weimer


On 07/18/2012 03:55 PM, Jason Merrill wrote:

On 06/26/2012 10:29 AM, Florian Weimer wrote:

+  /* Set to (size_t)-1 if the size check fails.  */
+  if (size_check != NULL_TREE)
+*size = fold_build3 (COND_EXPR, sizetype, size_check,
+ original_size, TYPE_MAX_VALUE (sizetype));
VEC_safe_insert (tree, gc, *args, 0, *size);
*args = resolve_args (*args, complain);
if (*args == NULL)
@@ -4022,7 +4030,11 @@ build_operator_new_call (tree fnname,
VEC(tree,gc) **args,
 if (use_cookie)
   {
 /* Update the total size.  */
-   *size = size_binop (PLUS_EXPR, *size, *cookie_size);
+   *size = size_binop (PLUS_EXPR, original_size, *cookie_size);
+   /* Set to (size_t)-1 if the size check fails.  */
+   gcc_assert (size_check != NULL_TREE);
+   *size = fold_build3 (COND_EXPR, sizetype, size_check,
+*size, TYPE_MAX_VALUE (sizetype));


Looks like you're evaluating the size_check twice for types that use
cookies.


I try to avoid this by using original_size instead of size on the first 
assignment under the use_cookie case.
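
In plain C terms, the intended run-time behaviour is roughly the following. This is a hypothetical sketch, not the tree-building code in the patch, and the function name and parameters are made up for illustration: the requested size is the element payload plus the cookie, and if computing it would overflow, the result is (size_t)-1, which the allocation function is then expected to reject.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return N_ELTS * ELT_SIZE + COOKIE_SIZE, or
   (size_t)-1 if that computation would overflow.  */
size_t
checked_array_size (size_t n_elts, size_t elt_size, size_t cookie_size)
{
  size_t payload;

  /* The multiplication must not wrap around.  */
  if (elt_size != 0 && n_elts > SIZE_MAX / elt_size)
    return SIZE_MAX;
  payload = n_elts * elt_size;

  /* Neither must adding the cookie.  */
  if (payload > SIZE_MAX - cookie_size)
    return SIZE_MAX;

  return payload + cookie_size;
}
```

The point of starting again from original_size in the cookie case is the same as here: the overflow fallback is applied once to the final sum rather than layered on top of an already-guarded value.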



+  /* Unconditionally substract the array size.  This decreases the
+ maximum object size and is safe even if we choose not to use
+ a cookie after all.  */


"cookie size"


Okay, I will fix that.


But since we're going to be deciding whether or not to use a cookie in
this function anyway, why not do it here?


The decision to use a cookie is currently split between the two 
functions and there are several special cases (Java, ABI compatibility 
flags).  I did not want to disturb this code too much because we do not 
have much test suite coverage in this area.


--
Florian Weimer / Red Hat Product Security Team

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread Richard Guenther

On Wed, 18 Jul 2012, William J. Schmidt wrote:

> Here's the patch with documentation changes included.  I also cleaned up
> missing work from a couple of my previous patches, so
> -fhoist-adjacent-loads is documented now, and -fvect-cost-model is added
> to the list of options on by default at -O3.
> 
> Ok for trunk?

Ok if it bootstraps / tests ok.

Thanks,
Richard.

> Thanks,
> Bill
> 
> 
> 2012-07-18  Bill Schmidt  
> 
>   * doc/invoke.texi: Add -fhoist-adjacent-loads and -ftree-slsr to list
>   of flags controlling optimization; add -ftree-slsr to list of flags
>   enabled by default at -O; add -fhoist-adjacent-loads to list of flags
>   enabled by default at -O2; add -fvect-cost-model to list of flags
>   enabled by default at -O3; document -fhoist-adjacent-loads and
>   -ftree-slsr.
>   * opts.c (default_option): Make -ftree-slsr default at -O1 and above.
>   * gimple-ssa-strength-reduction.c (gate_strength_reduction): Use
>   flag_tree_slsr.
>   * common.opt: Add -ftree-slsr with flag_tree_slsr.
> 
> 
> Index: gcc/doc/invoke.texi
> ===
> --- gcc/doc/invoke.texi   (revision 189574)
> +++ gcc/doc/invoke.texi   (working copy)
> @@ -364,7 +364,8 @@ Objective-C and Objective-C++ Dialects}.
>  -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} 
> @gol
>  -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol
>  -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
> --fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
> +-fgcse-sm -fhoist-adjacent-loads -fif-conversion @gol
> +-fif-conversion2 -findirect-inlining @gol
>  -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
>  -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg @gol
>  -fipa-pta -fipa-profile -fipa-pure-const -fipa-reference @gol
> @@ -413,8 +414,8 @@ Objective-C and Objective-C++ Dialects}.
>  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
>  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
>  -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta 
> @gol
> --ftree-reassoc @gol
> --ftree-sink -ftree-sra -ftree-switch-conversion -ftree-tail-merge @gol
> +-ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
> +-ftree-switch-conversion -ftree-tail-merge @gol
>  -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
>  -funit-at-a-time -funroll-all-loops -funroll-loops @gol
>  -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
> @@ -6259,6 +6260,7 @@ compilation time.
>  -ftree-forwprop @gol
>  -ftree-fre @gol
>  -ftree-phiprop @gol
> +-ftree-slsr @gol
>  -ftree-sra @gol
>  -ftree-pta @gol
>  -ftree-ter @gol
> @@ -6286,6 +6288,7 @@ also turns on the following optimization flags:
>  -fdevirtualize @gol
>  -fexpensive-optimizations @gol
>  -fgcse  -fgcse-lm  @gol
> +-fhoist-adjacent-loads @gol
>  -finline-small-functions @gol
>  -findirect-inlining @gol
>  -fipa-sra @gol
> @@ -6311,6 +6314,7 @@ Optimize yet more.  @option{-O3} turns on all opti
>  by @option{-O2} and also turns on the @option{-finline-functions},
>  @option{-funswitch-loops}, @option{-fpredictive-commoning},
>  @option{-fgcse-after-reload}, @option{-ftree-vectorize},
> +@option{-fvect-cost-model},
>  @option{-ftree-partial-pre} and @option{-fipa-cp-clone} options.
>  
>  @item -O0
> @@ -7129,6 +7133,13 @@ This flag is enabled by default at @option{-O} and
>  Perform hoisting of loads from conditional pointers on trees.  This
>  pass is enabled by default at @option{-O} and higher.
>  
> +@item -fhoist-adjacent-loads
> +@opindex hoist-adjacent-loads
> +Speculatively hoist loads from both branches of an if-then-else if the
> +loads are from adjacent locations in the same structure and the target
> +architecture has a conditional move instruction.  This flag is enabled
> +by default at @option{-O2} and higher.
> +
>  @item -ftree-copy-prop
>  @opindex ftree-copy-prop
>  Perform copy propagation on trees.  This pass eliminates unnecessary
> @@ -7529,6 +7540,13 @@ defining expression.  This results in non-GIMPLE c
>  much more complex trees to work on resulting in better RTL generation.  This 
> is
>  enabled by default at @option{-O} and higher.
>  
> +@item -ftree-slsr
> +@opindex ftree-slsr
> +Perform straight-line strength reduction on trees.  This recognizes related
> +expressions involving multiplications and replaces them by less expensive
> +calculations when possible.  This is enabled by default at @option{-O} and
> +higher.
> +
>  @item -ftree-vectorize
>  @opindex ftree-vectorize
>  Perform loop vectorization on trees. This flag is enabled by default at
> @@ -7550,7 +7568,8 @@ except at level @option{-Os} where it is disabled.
>  
>  @item -fvect-cost-model
>  @opindex fvect-cost-model
> -Enable cost model for vectorization.
> +Enable cost

Re: [PATCH] Add flag to control straight-line strength reduction

2012-07-18 Thread Richard Guenther

On Wed, 18 Jul 2012, Steven Bosscher wrote:

> On Wed, Jul 18, 2012 at 3:24 PM, William J. Schmidt
>  wrote:
> > It turns out I was looking at a very old copy of the manual, and the
> > -ftree... stuff is not as prevalent now as it once was.  I'll just go
> > with -fslsr to be consistent with -fgcse, -fipa-sra, etc.
> 
> Sadly, it is more prevalent than it ever was!
> It's IMHO very unfortunate that such an internal detail is exposed to
> the user...

Indeed...  not sure if we want a set of aliases without the tree-
or rtl- part.

Richard.

[patch] Fix ICE during out-of-SSA

2012-07-18 Thread Eric Botcazou

This is a regression present on mainline and 4.7 branch for targets using SJLJ 
exceptions by default in Ada (e.g. ARM).  The error message is:

Unable to coalesce ssa_names 2 and 20 which are marked as MUST COALESCE.
b1_2(ab) and  b1_20(ab)
+===GNAT BUG DETECTED==+
| 4.8.0 20120716 (experimental) [trunk revision 189525] (x86_64-suse-linux) GCC error:|
| SSA corruption   |
| Error detected around p.adb:4:1 

It's the usual case of overlapping live ranges for (ab) SSA names.

Tested on x86_64-suse-linux, OK for the mainline and 4.7 branch?


2012-07-18  Eric Botcazou  

* tree-ssa-forwprop.c (combine_conversions): Punt if the RHS of the
defining statement is a SSA name that occurs in abnormal PHIs.


2012-07-18  Eric Botcazou  

* gnat.dg/aggr20.ad[sb]: New test.
* gnat.dg/aggr20_pkg.ads: New helper.


-- 
Eric Botcazou
Index: tree-ssa-forwprop.c
===
--- tree-ssa-forwprop.c	(revision 189525)
+++ tree-ssa-forwprop.c	(working copy)
@@ -2584,6 +2584,11 @@ combine_conversions (gimple_stmt_iterato
   unsigned int final_prec = TYPE_PRECISION (type);
   int final_unsignedp = TYPE_UNSIGNED (type);
 
+  /* Don't propagate ssa names that occur in abnormal phis.  */
+  if (TREE_CODE (defop0) == SSA_NAME
+	  && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (defop0))
+	return 0;
+
   /* In addition to the cases of two conversions in a row
 	 handled below, if we are converting something to its own
 	 type via an object of identical or wider precision, neither
-- { dg-do compile }

package body Aggr20 is

   procedure Proc (R : out Rec3) is
   begin
      R := (Callback => Nil_Rec2);
   end;

end Aggr20;
with Aggr20_Pkg; use Aggr20_Pkg;
with System;

package Aggr20 is

   type Rec1 is record
      Address : System.Address;
   end record;

   Nil_Rec1 : constant Rec1 := (Address => Default_Nil_Address);

   type Rec2 is record
      Callback : Rec1;
   end record;

   Nil_Rec2 : constant Rec2 := (Callback => Nil_Rec1);

   type Rec3 is record
      Callback : Rec2;
   end record;

   procedure Proc (R : out Rec3);

end Aggr20;
with System;

package Aggr20_Pkg is

   procedure Proc;

   Default_Nil_Address : constant System.Address := Proc'Address;

end Aggr20_Pkg;

Re: [patch] Fix ICE in set_lattice_value

2012-07-18 Thread Richard Guenther

On Wed, Jul 18, 2012 at 4:05 PM, Eric Botcazou  wrote:
> This is a regression present on mainline and 4.7 branch for targets using SJLJ
> exceptions by default in Ada (e.g. ARM).  The error message is:
>
> +===GNAT BUG DETECTED==+
> | 4.8.0 20120716 (experimental) [trunk revision 189525] (x86_64-suse-linux) GCC error:|
> | in set_lattice_value, at tree-ssa-ccp.c:452  |
> | Error detected around p.adb:16:4
>
> It's valid_lattice_transition returning false on a transition from INTEGER_CST
> to a constant &x.  It occurs for an array reference with non-constant index: 
> on
> the first round, &x + i is non-constant so the algorithm computes an alignment
> factor which is an INTEGER_CST; on the second round, i is 0 so the new value 
> is
> the constant &x.
>
> valid_lattice_transition accepts the reverse transition.  The attached patch
> makes the function accept this transition as well.
>
> Tested on x86_64-suse-linux, OK for the mainline and 4.7 branch?

Hmm, the point is of course to not allow transitions that could form a cycle,
which is why the reverse transition is not allowed.
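To make the cycle argument concrete, here is a minimal sketch of a one-way lattice check. It is not the code in tree-ssa-ccp.c: the type and function names (toy_level, toy_value, toy_valid_transition) are made up for illustration, and the toy lattice only has three levels.

```c
#include <stdbool.h>

/* Toy lattice levels, ordered from top to bottom.  These names are
   illustrative and are not the ones used in tree-ssa-ccp.c.  */
enum toy_level { TOY_UNDEFINED = 0, TOY_CONSTANT = 1, TOY_VARYING = 2 };

struct toy_value {
  enum toy_level level;
  long cst;           /* Meaningful only when level == TOY_CONSTANT.  */
};

/* Accept a transition only if it keeps the value where it is or moves it
   strictly down the lattice.  Because the level never decreases, a value
   can change level at most twice, so propagation cannot cycle.  */
static bool
toy_valid_transition (struct toy_value old_val, struct toy_value new_val)
{
  if (old_val.level < new_val.level)
    return true;                        /* Strictly lower: always fine.  */
  if (old_val.level > new_val.level)
    return false;                       /* Moving back up could cycle.  */
  if (old_val.level == TOY_CONSTANT)
    return old_val.cst == new_val.cst;  /* A constant must not change.  */
  return true;
}
```

If both directions between two distinct states were accepted, a value could bounce between them forever; with a one-way rule like the above, every value settles after a bounded number of steps.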

Let me have a closer look here.

Richard.

> 2012-07-18  Eric Botcazou  
>
> * tree-ssa-ccp.c (valid_lattice_transition): Allow transitioning from
> as well as to INTEGER_CST.
>
>
> 2012-07-18  Eric Botcazou  
>
> * gnat.dg/loop_optimization11.adb: New test.
> * gnat.dg/loop_optimization11_pkg.ads: New helper.
>
>
> --
> Eric Botcazou

Re: [patch] Fix GIMPLE verification failure on CONSTRUCTOR

2012-07-18 Thread Richard Guenther

On Wed, Jul 18, 2012 at 4:25 PM, Eric Botcazou  wrote:
> This is a regression present on mainline and 4.7 branch.  The error message 
> is:
>
> p.adb: In function 'P.Proc':
> p.adb:3:4: error: non-trivial conversion at assignment
> system__address
> void (*) (void)
> r.callback.callback.address = q__proc;
>
> +===GNAT BUG DETECTED==+
> | 4.8.0 20120716 (experimental) [trunk revision 189525] (x86_64-suse-linux) GCC error:|
> | verify_gimple failed |
> | Error detected around p.adb:3:4
>
> We lose a cast in an initializer before gimplification, hence type mismatch.
> This happens as follows: a CONSTRUCTOR used as the initializer of a global
> constant and whose only value contains the cast is embedded (shared) in a
> CONSTRUCTOR used as the initializer of a second global constant, which is in
> turn embedded (shared) in a CONSTRUCTOR used as the initializer of a local
> variable.
>
> The sharing is fine, since we have an unsharing pass running right before
> gimplification.  The problem is that, since:
>
> r171903 | matz | 2011-04-03 13:13:09 +0200 (Sun, 03 Apr 2011) | 7 lines
>
> * cgraphbuild.c (record_reference): Canonicalize constructor
> values.
> * gimple-fold.c (canonicalize_constructor_val): Accept being called
> without function context.
> * cgraphunit.c (cgraph_finalize_compilation_unit): Clear
> current_function_decl and cfun.
>
> record_reference can modify the contents of CONSTRUCTORs _before_ the 
> unsharing
> pass is run and yield invalid GENERIC and later invalid GIMPLE.
>
> Tested on x86_64-suse-linux, OK for the mainline and 4.7 branch?

Ok.

Thanks,
Richard.

>
> 2012-07-18  Eric Botcazou  
>
> * gimple-fold.c (canonicalize_constructor_val): Strip only useless 
> type
> conversions.
>
>
> 2012-07-18  Eric Botcazou  
>
> * gnat.dg/aggr20.ad[sb]: New test.
> * gnat.dg/aggr20_pkg.ads: New helper.
>
>
> --
> Eric Botcazou

Re: [patch] Fix ICE during out-of-SSA

2012-07-18 Thread Richard Guenther

On Wed, Jul 18, 2012 at 4:54 PM, Eric Botcazou  wrote:
> This is a regression present on mainline and 4.7 branch for targets using SJLJ
> exceptions by default in Ada (e.g. ARM).  The error message is:
>
> Unable to coalesce ssa_names 2 and 20 which are marked as MUST COALESCE.
> b1_2(ab) and  b1_20(ab)
> +===GNAT BUG DETECTED==+
> | 4.8.0 20120716 (experimental) [trunk revision 189525] (x86_64-suse-linux) GCC error:|
> | SSA corruption   |
> | Error detected around p.adb:4:1
>
> It's the usual case of overlapping live ranges for (ab) SSA names.
>
> Tested on x86_64-suse-linux, OK for the mainline and 4.7 branch?

Ok.

Thanks,
Richard.

>
> 2012-07-18  Eric Botcazou  
>
> * tree-ssa-forwprop.c (combine_conversions): Punt if the RHS of the
> defining statement is a SSA name that occurs in abnormal PHIs.
>
>
> 2012-07-18  Eric Botcazou  
>
> * gnat.dg/aggr20.ad[sb]: New test.
> * gnat.dg/aggr20_pkg.ads: New helper.
>
>
> --
> Eric Botcazou

Re: [patch] Add v850-*-rtems

2012-07-18 Thread Jeff Law


On 07/17/12 17:11, Ralf Corsepius wrote:

> Hi,
>
> The patch below adds a v850-*-rtems* target configuration to GCC.
> It's a slightly modified copy of the v850*-*-* target, with some
> RTEMS-specific changes added.
>
> I would like to apply this patch to trunk and gcc-4_7-branch.
>
> OK to commit?

Yes.  This is fine.

jeff

[Patch, Fortran] Allow assumed-shape arrays with BIND(C) for TS29113

2012-07-18 Thread Tobias Burnus

This patch was written on top of the big assumed-shape patch.* However, 
it should also work by itself.


Bootstrapped and regtested on x86-64-linux.
OK for the trunk?

Tobias

* http://gcc.gnu.org/ml/fortran/2012-07/msg00052.html
2012-07-18  Tobias Burnus  

	* decl.c (gfc_verify_c_interop_param): Allow assumed-shape
	with -std=f2008ts.

2012-07-18  Tobias Burnus  

	* gfortran.dg/bind_c_array_params_2.f90: New.
	* gfortran.dg/bind_c_array_params.f03: Add -std=f2003
	and update dg-error.

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 01693ad..4184608 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -1092,29 +1096,15 @@ gfc_verify_c_interop_param (gfc_symbol *sym)
 	retval = FAILURE;
 
   /* Make sure that if it has the dimension attribute, that it is
-	 either assumed size or explicit shape.  */
-	  if (sym->as != NULL)
-	{
-	  if (sym->as->type == AS_ASSUMED_SHAPE)
-		{
-		  gfc_error ("Assumed-shape array '%s' at %L cannot be an "
-			 "argument to the procedure '%s' at %L because "
-			 "the procedure is BIND(C)", sym->name,
-			 &(sym->declared_at), sym->ns->proc_name->name,
-			 &(sym->ns->proc_name->declared_at));
-		  retval = FAILURE;
-		}
-
-	  if (sym->as->type == AS_DEFERRED)
-		{
-		  gfc_error ("Deferred-shape array '%s' at %L cannot be an "
-			 "argument to the procedure '%s' at %L because "
-			 "the procedure is BIND(C)", sym->name,
-			 &(sym->declared_at), sym->ns->proc_name->name,
- 			 &(sym->ns->proc_name->declared_at));
-		  retval = FAILURE;
-		}
-	  }
+	 either assumed size or explicit shape. Deferred shape is already
+	 covered by the pointer/allocatable attribute.  */
+	  if (sym->as != NULL && sym->as->type == AS_ASSUMED_SHAPE
+	  && gfc_notify_std (GFC_STD_F2008_TS, "Assumed-shape array '%s' "
+			  "at %L as dummy argument to the BIND(C) "
+			  "procedure '%s' at %L", sym->name,
+			  &(sym->declared_at), sym->ns->proc_name->name,
+			  &(sym->ns->proc_name->declared_at)) == FAILURE)
+	retval = FAILURE;
 	}
 }
 
diff --git a/gcc/testsuite/gfortran.dg/bind_c_array_params.f03 b/gcc/testsuite/gfortran.dg/bind_c_array_params.f03
index 6590db1..810f642 100644
--- a/gcc/testsuite/gfortran.dg/bind_c_array_params.f03
+++ b/gcc/testsuite/gfortran.dg/bind_c_array_params.f03
@@ -1,10 +1,11 @@
 ! { dg-do compile }
+! { dg-options "-std=f2003" }
 module bind_c_array_params
 use, intrinsic :: iso_c_binding
 implicit none
 
 contains
-  subroutine sub0(assumed_array) bind(c) ! { dg-error "cannot be an argument" }
+  subroutine sub0(assumed_array) bind(c) ! { dg-error "TS 29113: Assumed-shape array 'assumed_array' at .1. as dummy argument to the BIND.C. procedure 'sub0'" }
 integer(c_int), dimension(:) :: assumed_array
   end subroutine sub0
 
--- /dev/null	2012-07-18 07:03:52.759757921 +0200
+++ gcc/gcc/testsuite/gfortran.dg/bind_c_array_params_2.f90	2012-07-18 00:14:13.0 +0200
@@ -0,0 +1,19 @@
+! { dg-do compile }
+! { dg-options "-std=f2008ts -fdump-tree-original" }
+!
+! Check that assumed-shape variables are correctly passed to BIND(C)
+! as defined in TS 29113
+! 
+interface
+  subroutine test (xx) bind(C, name="myBindC")
+type(*), dimension(:,:) :: xx
+  end subroutine test
+end interface
+
+integer :: aa(4,4)
+call test(aa)
+end
+
+! { dg-final { scan-tree-dump-times "test \\\(&parm\\." 1 "original" } }
+! { dg-final { scan-assembler-times "myBindC" 1 } }
+

Re: [patch] Add v850-*-rtems

2012-07-18 Thread Ralf Corsepius


On 07/18/2012 05:02 PM, Jeff Law wrote:

> On 07/17/12 17:11, Ralf Corsepius wrote:
>> Hi,
>>
>> The patch below adds a v850-*-rtems* target configuration to GCC.
>> It's a slightly modified copy of the v850*-*-* target, with some
>> RTEMS-specific changes added.
>>
>> I would like to apply this patch to trunk and gcc-4_7-branch.
>>
>> OK to commit?
>
> Yes.  This is fine.


Thanks - Done. Patch is now in trunk and 4_7-branch.

Ralf

Re: [PATCH] New fdo summary-based icache sensitive unrolling (issue6351086)

2012-07-18 Thread Teresa Johnson

Ping (retrying ping in plain text mode so that it goes through properly).

Thanks,
Teresa

On Wed, Jul 11, 2012 at 10:42 AM, Teresa Johnson  wrote:
> Ports some patches related to improving FDO program summary information
> and using it to guide loop unrolling from google branches to mainline.
> The patch is enhanced to add additional summary information to aid
> in determining hot/cold decisions.
>
> The original patch description is at:
>   http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00437.html
> and further discussion about incorporating onto mainline is at:
>   http://gcc.gnu.org/ml/gcc-patches/2012-06/threads.html#00414
>
> Honza, can you take a look to see if this patch meets your needs?
>
> Full description:
>
> This patch adds new program summary information to the gcov
> profile files that indicate how many profiled counts compose
> the majority of the program's execution time. This is used to
> provide an indication of the overall code size of the program.
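A rough sketch of the "how many counters make up most of the execution count" computation follows. It is not the code from the patch (which is truncated in this archive); toy_count_t, toy_cmp_desc and toy_num_hot_counters are hypothetical names, and the cutoff is passed in as a per-mille value rather than read from the environment.

```c
#include <stdlib.h>

typedef long long toy_count_t;   /* Stand-in for gcov_type.  */

/* qsort comparator: descending order.  */
static int
toy_cmp_desc (const void *pa, const void *pb)
{
  toy_count_t a = *(const toy_count_t *) pa;
  toy_count_t b = *(const toy_count_t *) pb;
  return (b > a) - (b < a);
}

/* Sort VALUES (N entries) in place and return how many of the largest
   entries are needed for their sum to reach PERMILLE/1000 of the total.
   If MIN_HOT is non-null, store the smallest value in that hot set.  */
static size_t
toy_num_hot_counters (toy_count_t *values, size_t n, unsigned permille,
                      toy_count_t *min_hot)
{
  toy_count_t total = 0, cum = 0, cutoff;
  size_t i;

  for (i = 0; i < n; i++)
    total += values[i];

  /* Toy code: assumes TOTAL * PERMILLE does not overflow.  */
  cutoff = total * permille / 1000;

  qsort (values, n, sizeof values[0], toy_cmp_desc);
  for (i = 0; i < n && cum < cutoff; i++)
    {
      cum += values[i];
      if (min_hot)
        *min_hot = values[i];
    }
  return i;
}
```

A small return value relative to N suggests that a few counters dominate the profile; a large one suggests the execution count is spread over a lot of code, which is the situation where unrolling and peeling are more likely to hurt the icache.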
>
> The new profile summary information is then used to guide
> codesize based unroll and peel decisions, to prevent those
> optimizations from increasing code size too much when the
> program may be sensitive to icache effects.
>
> This patch also pulls in dependent portions of google/main r187660 that cache
> additional loop analysis results in the niter_desc auxiliary information
> hanging off the loop structure (the optimization portions of that
> change are not included here, and have an outstanding review request
> for mainline).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
>
> Thanks,
> Teresa
>
> 2012-07-11  Teresa Johnson  
>
> * libgcc/libgcov.c (sort_by_reverse_gcov_value): New function.
> (gcov_compute_cutoff_values): Ditto.
> (gcov_exit): Call gcov_compute_cutoff_values and merge new summary
> information.
> * gcc/doc/invoke.texi (roll much): Document new options
> -fpeel-codesize-limit and -funroll-codesize-limit, and new params
> codesize-hotness-threshold and unrollpeel-hotness-threshold.
> * gcc/gcov-io.c (gcov_write_summary): Write new summary info.
> (gcov_read_summary): Read new summary info.
> * gcc/gcov-io.h (GCOV_TAG_SUMMARY_LENGTH): Update for new summary 
> info.
> (struct gcov_ctr_summary): Add new summary info: num_hot_counters and
> hot_cutoff_value.
> * gcc/loop-unroll.c (code_size_limit_factor): New function.
> (decide_unroll_runtime_iterations): Call code_size_limit_factor
> to control the unroll factor, and retrieve number of branches from
> niter_desc instead of via function that walks loop.
> (decide_peel_simple, decide_unroll_stupid): Ditto.
> * gcc/coverage.c (read_counts_file): Propagate new summary info.
> * gcc/loop-iv.c (get_simple_loop_desc): Invoke new analyze_loop_insns
> function, and add guards to enable this function to work for the
> outermost loop.
> * gcc/common.opt: Add -fpeel-codesize-limit and
> -funroll-codesize-limit.
> * gcc/cfgloop.c (insn_has_fp_set, analyze_loop_insns): New functions.
> (num_loop_branches): Remove.
> * gcc/cfgloop.h (struct niter_desc): Added new fields to cache
> additional loop analysis information.
> (num_loop_branches): Remove.
> (analyze_loop_insns): Declare.
> * gcc/params.def (PARAM_UNROLLPEEL_CODESIZE_THRESHOLD): Add.
> (PARAM_UNROLLPEEL_HOTNESS_THRESHOLD): Ditto.
> * gcc/gcov-dump.c (tag_summary): Dump new summary info.
>
> Index: libgcc/libgcov.c
> ===
> --- libgcc/libgcov.c(revision 189413)
> +++ libgcc/libgcov.c(working copy)
> @@ -276,6 +276,120 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned
>return 1;
>  }
>
> +/* Used by qsort to sort gcov values in descending order.  */
> +
> +static int
> +sort_by_reverse_gcov_value (const void *pa, const void *pb)
> +{
> +  const gcov_type a = *(gcov_type const *)pa;
> +  const gcov_type b = *(gcov_type const *)pb;
> +
> +  if (b > a)
> +return 1;
> +  else if (b == a)
> +return 0;
> +  else
> +return -1;
> +}
> +
> +/* Determines the number of counters required to cover a given percentage
> +   of the total sum of execution counts in the summary, which is then also
> +   recorded in SUM.  */
> +
> +static void
> +gcov_compute_cutoff_values (struct gcov_summary *sum)
> +{
> +  struct gcov_info *gi_ptr;
> +  const struct gcov_fn_info *gfi_ptr;
> +  const struct gcov_ctr_info *ci_ptr;
> +  struct gcov_ctr_summary *cs_ptr;
> +  unsigned t_ix, f_ix, i, ctr_info_ix, index;
> +  gcov_unsigned_t c_num;
> +  gcov_type *value_array;
> +  gcov_type cum, cum_cutoff;
> +  char *cutoff_str;
> +  unsigned cutoff_perc;
> +
> +#define CUM_CUTOFF_PERCENT_TIMES_10 999
> +  cutoff_str = getenv ("GCOV_HOTCODE_CUTOFF_TIMES_10");
> +  if (cutoff_str && strlen (cutoff_str

Re: [PATCH, testsuite] Skip 20101011-1.c for bare-metal m68k

2012-07-18 Thread Mike Stump

On Jul 17, 2012, at 7:00 PM, Sandra Loosemore wrote:
> Like the subject line says; this is consistent with the existing test to bail
> out for MIPS bare-metal.  OK for mainline?

Ok.

Re: [PATCH] Define FFI_SIZEOF_JAVA_RAW to 4 for x32

2012-07-18 Thread H.J. Lu

On Mon, Jul 16, 2012 at 01:14:08PM -0700, H.J. Lu wrote:
> Hi,
> 
> This patch defines FFI_SIZEOF_JAVA_RAW to 4 for x32, similar to MIPS n32.
> It fixed:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53982
> 
> 

Hi,

Here is the patch with updated ChangeLog entry.  X32 has the same issue
as MIPS n32, which was fixed by FFI_SIZEOF_JAVA_RAW:

http://gcc.gnu.org/ml/gcc-patches/2007-11/msg01612.html
http://gcc.gnu.org/ml/gcc-patches/2007-12/msg5.html

The same fix is needed for x32.  OK for trunk?
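For reference, the preprocessor test the patch relies on can be tried in isolation. The helper below is only an illustration (toy_x86_data_model is a made-up name, not part of libffi); it classifies the target using the same `__x86_64__` / `__ILP32__` check.

```c
#include <stddef.h>

/* Name the x86 ABI the compiler is targeting, using the same
   predefined-macro test as the patch.  Hypothetical helper, not part
   of libffi.  */
static const char *
toy_x86_data_model (void)
{
#if defined __x86_64__ && defined __ILP32__
  return "x32";        /* 64-bit ISA with 32-bit pointers.  */
#elif defined __x86_64__
  return "x86-64";     /* 64-bit ISA with 64-bit pointers.  */
#elif defined __i386__
  return "ia32";
#else
  return "other";
#endif
}
```

On an x32 build this returns "x32" and `sizeof (void *)` is 4, which is the case where a 4-byte FFI_SIZEOF_JAVA_RAW is wanted even though FFI_SIZEOF_ARG stays at 8.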

Thanks.


H.J.
---
2012-07-16  H.J. Lu  

* src/x86/ffitarget.h: Check __ILP32__ instead of __LP64__ for
x32.
(FFI_SIZEOF_JAVA_RAW): Defined to 4 for x32.

diff --git a/src/x86/ffitarget.h b/src/x86/ffitarget.h
index f442654..46f294c 100644
--- a/src/x86/ffitarget.h
+++ b/src/x86/ffitarget.h
@@ -61,8 +61,9 @@ typedef unsigned long long ffi_arg;
 typedef long long  ffi_sarg;
 #endif
 #else
-#if defined __x86_64__ && !defined __LP64__
+#if defined __x86_64__ && defined __ILP32__
 #define FFI_SIZEOF_ARG 8
+#define FFI_SIZEOF_JAVA_RAW  4
 typedef unsigned long long ffi_arg;
 typedef long long  ffi_sarg;
 #else
-- 
1.7.10.4

Re: PR libjava/53973: Check and skip 67h address size prefix for x32

2012-07-18 Thread H.J. Lu

On Mon, Jul 16, 2012 at 02:04:41PM -0700, H.J. Lu wrote:
> Hi,
> 
> Since x32 may generate 64-bit integer divide like
> 
> 67 48 f7 bd a0 fe ff ff idivq  -0x160(%ebp)
> 
> we need to check and skip 67h address size for x32.  OK for trunk if
> there are no regressions on Linux/x86-64?
> 
> Thanks.
> 
> H.J.
> 
> 2012-07-16  H.J. Lu  
> 
>   PR libjava/53973
>   * include/x86_64-signal.h (CHECK_67H_PREFIX): New.
>   (HANDLE_DIVIDE_OVERFLOW): Check and skip 67h address size
>   prefix if CHECK_67H_PREFIX is 1.
> 

Here is the patch with the updated ChangeLog entry.  OK for trunk?
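As a stand-alone illustration of the prefix handling (a sketch only, not the macro from x86_64-signal.h; toy_prefix_info and toy_skip_prefixes are made-up names), the decoding step looks roughly like this:

```c
#include <stdbool.h>

/* Result of looking at the first bytes of an instruction.  */
struct toy_prefix_info {
  const unsigned char *opcode;  /* First byte after the prefixes handled here.  */
  bool is_64_bit;               /* True if a REX prefix with W set was seen.  */
};

/* Optionally skip a 67h address-size prefix, then consume a REX prefix
   if one is present.  Toy code: no other prefixes are recognised.  */
static struct toy_prefix_info
toy_skip_prefixes (const unsigned char *ip, bool check_67h)
{
  struct toy_prefix_info info = { ip, false };

  if (check_67h && info.opcode[0] == 0x67)
    info.opcode++;

  /* A REX prefix is any byte in the range 40h..4Fh; bit 3 is REX.W.  */
  if ((info.opcode[0] & 0xf0) == 0x40)
    {
      info.is_64_bit = (info.opcode[0] & 0x08) != 0;
      info.opcode++;
    }
  return info;
}
```

With the byte sequence from the original report (67 48 f7 bd ...), skipping the prefix leaves the opcode pointer at f7 with the 64-bit flag set; without the skip, the 67h byte is mistaken for the opcode and the REX byte is never seen.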

Thanks.

H.J.
---
2012-07-16  H.J. Lu  

PR libjava/53973
* include/x86_64-signal.h (CHECK_67H_PREFIX): New.
(HANDLE_DIVIDE_OVERFLOW): Check and skip 67h address size
prefix if CHECK_67H_PREFIX is 1.  Use ULL suffix for 64-bit
integer.

diff --git a/libjava/include/x86_64-signal.h b/libjava/include/x86_64-signal.h
index 4bd8a36..84907c3 100644
--- a/libjava/include/x86_64-signal.h
+++ b/libjava/include/x86_64-signal.h
@@ -21,6 +21,12 @@ details.  */
 #define HANDLE_SEGV 1
 #define HANDLE_FPE 1
 
+#ifdef __ILP32__
+# define CHECK_67H_PREFIX 1
+#else
+# define CHECK_67H_PREFIX 0
+#endif
+
 #define SIGNAL_HANDLER(_name)  \
 static void _Jv_##_name (int, siginfo_t *, \
 void *_p __attribute__ ((__unused__)))
@@ -47,6 +53,10 @@ do   
\
\
   bool _is_64_bit = false; \
\
+  /* Check and skip 67h address size prefix if needed.  */ \
+  if (CHECK_67H_PREFIX && _rip[0] == 0x67) \
+_rip++;\
+   \
   if ((_rip[0] & 0xf0) == 0x40)  /* REX byte present.  */  \
 {  \
   unsigned char _rex = _rip[0] & 0x0f; \
@@ -64,10 +74,10 @@ do  
\
{   \
  if (_is_64_bit)   \
_min_value_dividend =   \
- _gregs[REG_RAX] == (greg_t)0x8000000000000000UL;  \
+ _gregs[REG_RAX] == (greg_t)0x8000000000000000ULL; \
  else  \
_min_value_dividend =   \
- (_gregs[REG_RAX] & 0xffffffff) == (greg_t)0x80000000UL;   \
+ (_gregs[REG_RAX] & 0xffffffff) == (greg_t)0x80000000ULL;  \
}   \
\
   if (_min_value_dividend) \

[RESEND-2][PATCH] Allow printing of escaped curly braces in assembler directives with operands

2012-07-18 Thread Siddhesh Poyarekar

Hi,

Resending. I did not get any responses the last two times and I too
forgot about it. Can someone please review this?

Thanks,
Siddhesh

Begin forwarded message:

Date: Tue, 3 Apr 2012 18:46:53 +0530
From: Siddhesh Poyarekar 
To: gcc-patches@gcc.gnu.org
Subject: Fw: [PATCH] Allow printing of escaped curly braces in
assembler directives with operands


Hi,

ping?

--
Siddhesh

Begin forwarded message:

Date: Tue, 27 Mar 2012 10:51:37 +0530
From: Siddhesh Poyarekar 
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Allow printing of escaped curly braces in assembler
directives with operands


Hi,

An assembler directive with an operand is filtered through
output_asm_insn (or asm_fprintf for gcc internal asm() directives) to
expand the operand values in the assembler as well as to choose
dialects if present. This patch is concerned primarily with the
dialects, since their syntax prevents assembler strings from containing
curly braces: the braces are interpreted as dialect delimiters.

The attached patch allows printing of curly braces in assembler by
preceding them with a \\. So to print the following code into assembler:

.pushsection ".foo"; .asciz "div { width : 50%% | height=10px }"; .long 42; .popsection

The following code needs to be used with this patch:

void f()
{
  asm ( ".pushsection \".foo\"; .asciz \"div \\{ width : 50%% | height=10px \\} \"; .long %c0; .popsection" : : "i"(42) );
}

An alternative to \\ (which does not look very clean) would be to use %
as the escape character, but I was not sure whether that looks better
or worse. I don't mind resubmitting the patch to use %{ and %} to print
curly braces if that is preferred.

It is still possible to print curly braces in asm string literals
without operands since they do not undergo any transformation.
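To illustrate the intended behaviour outside of final.c, here is a toy template expander. It is a sketch under simplifying assumptions, not the patched output_asm_insn: toy_expand_template is a made-up name, operands (%) are not handled, and only dialect selection plus the proposed \{ and \} escapes are modelled.

```c
#include <stddef.h>

/* Expand TEMPL into OUT (capacity OUT_SIZE), keeping only dialect
   alternative number DIALECT of each {a|b|...} group and emitting "\{"
   and "\}" as literal braces.  Returns the number of characters written,
   not counting the terminating NUL.  Toy code: no operand (%) handling.  */
static size_t
toy_expand_template (const char *templ, int dialect, char *out,
                     size_t out_size)
{
  size_t n = 0;
  int in_group = 0;   /* Nonzero while inside a {...} group.  */
  int alt = 0;        /* Index of the alternative being scanned.  */

  if (out_size == 0)
    return 0;

  for (const char *p = templ; *p != '\0'; p++)
    {
      char c = *p;

      if (c == '\\' && (p[1] == '{' || p[1] == '}'))
        c = *++p;                 /* Escaped brace: take it literally.  */
      else if (c == '{' && !in_group)
        { in_group = 1; alt = 0; continue; }
      else if (c == '|' && in_group)
        { alt++; continue; }
      else if (c == '}' && in_group)
        { in_group = 0; continue; }

      if (in_group && alt != dialect)
        continue;                 /* Text of an unselected alternative.  */
      if (n + 1 < out_size)
        out[n++] = c;
    }
  out[n] = '\0';
  return n;
}
```

In this sketch, unescaped braces still delimit dialect alternatives, while an escaped brace is copied through, so a CSS-like string can sit next to a real dialect group in the same template.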

The patch does not introduce any regressions. I have tested this with
x86_64 and i686 and it works well with both of them.

Regards,
Siddhesh


gcc/ChangeLog:

2012-03-27  Siddhesh Poyarekar  

* final.c (output_asm_insn, asm_fprintf): Print curly braces if
preceded by an escaped backslash (\\).

testsuite/ChangeLog:

2012-03-27  Siddhesh Poyarekar  

* gcc.dg/asm-braces.c: New test case.
diff --git a/gcc/final.c b/gcc/final.c
index 718caf1..2393c0f 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -3444,6 +3444,12 @@ output_asm_insn (const char *templ, rtx *operands)
 	  output_operand_lossage ("invalid %%-code");
 	break;
 
+  /* Escaped braces. Print them as is. */
+  case '\\':
+if (*p == '{' || *p == '}')
+  c = *p++;
+/* FALLTHROUGH */
+
   default:
 	putc (c, asm_out_file);
   }
@@ -3955,6 +3961,12 @@ asm_fprintf (FILE *file, const char *p, ...)
 	  }
 	break;
 
+  /* Escaped braces. Print them as is. */
+  case '\\':
+if (*p == '{' || *p == '}')
+  c = *p++;
+/* FALLTHROUGH */
+
   default:
 	putc (c, file);
   }
diff --git a/gcc/testsuite/gcc.dg/asm-braces.c b/gcc/testsuite/gcc.dg/asm-braces.c
new file mode 100644
index 000..4f428c8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asm-braces.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+void f()
+{
+  asm ( ".pushsection \".foo\"; .asciz \"div \\{ width : 50%% | height = 10px \\} \"; .long %c0; .popsection" : : "i"(42) );
+}
+
+/* { dg-final { scan-assembler "div { width : 50%% | height = 10px }" } } */
-- 
1.7.7.6

RE: [committed] PR 51931: force non-MIPS16ness for long-branch tests (NOW RFA: MIPS16 Long Branch Patch)

2012-07-18 Thread Moore, Catherine

Hi Richard,

Now that we are in the window for 4.8, I'd like to discuss the possibility of 
applying this patch.  Have you had a chance to think about it?

Thanks,
Catherine

> -Original Message-
> From: Richard Sandiford [mailto:rdsandif...@googlemail.com]
> Sent: Monday, February 06, 2012 2:51 PM
> To: Moore, Catherine
> Cc: Tang, Chung-Lin; gcc-patches@gcc.gnu.org
> Subject: Re: [committed] PR 51931: force non-MIPS16ness for long-branch tests
> (NOW RFA: MIPS16 Long Branch Patch)
> 
> Thanks for the patch.
> 
> "Moore, Catherine"  writes:
> >> -Original Message-
> >> From: Chung-Lin Tang [mailto:clt...@codesourcery.com]
> >> Sent: Monday, January 30, 2012 4:36 AM
> >> To: gcc-patches@gcc.gnu.org; rdsandif...@googlemail.com
> >> Cc: Moore, Catherine
> >> Subject: Re: [committed] PR 51931: force non-MIPS16ness for
> >> long-branch tests
> >>
> >> On 2012/1/22 06:33 PM, Richard Sandiford wrote:
> >> > The MIPS16 port has never handled long branches properly; see PR
> >> > 51931 for the details.  It isn't easy to xfail MIPS16-specific
> >> > problems at the dejagnu level because of -mflip-mips16, so the
> >> > patch below forces a nomips16 attribute instead.
> >> >
> >> > Tested on mips64-linux-gnu and applied.
> >> >
> >> > Richard
> >>
> >> CCing Catherine, I think we have a fix for this?
> >>
> >
> > I do have a patch.  It's a heuristic and will not work in all
> > instances, but it does allow many additional programs to successfully
> > compile.  For example, this scheme allowed me to build glibc in
> > MIPS16-mode for a MIPS-Linux toolchain.
> >
> > The patch causes reorg to examine mips16 branches.  For branches that
> > are out-of-range, reorg will look for branches to the same target.  If
> > that branch is in range, the destination of the original branch
> > becomes the new branch.  If branches to the same target do not exist,
> > then reorg will search for barriers that are in range and insert
> > label+ branch at the barrier.
> >
> > Of the test cases mentioned in the bug report,
> > gcc.c-torture/compile/20001226-1.c still fails due to a lack of
> > barriers in the instruction stream.  g++.dg/opt/longbranch1.C will
> > pass.
> >
> > I've set off a test run with my patch applied against mainline.  In
> > the meantime, here's the patch.  Richard, what do you think?
> 
> Yeah, it's difficult.  On the one hand, this is probably more efficient (both 
> in
> terms of code size and speed) than a MIPS16 equivalent of the
> non-MIPS16 fallback, which uses a label load followed by an indirect jump.
> On the other hand, it can suffer from degenerate cases where we need so many
> new branches that even the trampolines become out of range.
> (Maybe that's what's happening in the 20001226-1.c case.)
> 
> Since this isn't a regression, the patch would need to wait for 4.8 anyway.
> I'll have a think about it before then (or at least try to remember to...)
> 
> Thanks,
> Richard

[patch] More cleanups for CFG dumping

2012-07-18 Thread Steven Bosscher

On Wed, Jul 18, 2012 at 10:08 AM, Steven Bosscher  wrote:
> On Wed, Jul 18, 2012 at 2:24 AM, H.J. Lu  wrote:
>> This caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54008
>
> Yes, they failed in my testing, too. I must have been blind to overlook 
> them...
>
> I received some comments in private about the "new look" of the dumps,
> that I'll be addressing with a patch later today. I'll fix these two
> test cases with that patch also.

Hello,

The attached patch further consolidates RTL and GIMPLE CFG dumping. It
also fixes a couple of formatting issues, and it fixes the two broken
test case patterns from yesterday's patch.

I also ran into a bug in the GIMPLE CFG dumping hooks while working on
value profiling cleanups (that have been oh so long on my TODO
list...). It is necessary to flush the pretty-printer buffer to the
dump file before printing directly to the pretty-printer stream with
dump_histograms_for_stmt. For good measure, I replaced pp_newline with
pp_flush in places where that seemed to be more appropriate, and added
comments to the functions that do _not_ flush the buffer (this can
happen e.g. because a user of those functions wants to print
additional information before the newline that pp_flush adds).
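The ordering problem is easy to reproduce with a toy buffered printer. This is only a sketch: toy_pp and its functions are invented for the example and do not mirror the real pretty-printer API (in particular, this toy flush does not append a newline).

```c
#include <stdio.h>
#include <string.h>

/* Toy pretty-printer: text accumulates in BUF until it is flushed.  */
struct toy_pp {
  FILE *stream;
  char buf[256];
  size_t len;
};

static void
toy_pp_string (struct toy_pp *pp, const char *s)
{
  size_t n = strlen (s);
  if (n > sizeof pp->buf - pp->len)
    n = sizeof pp->buf - pp->len;     /* Toy code: silently truncate.  */
  memcpy (pp->buf + pp->len, s, n);
  pp->len += n;
}

/* Write the pending text to the stream and empty the buffer.  */
static void
toy_pp_flush (struct toy_pp *pp)
{
  fwrite (pp->buf, 1, pp->len, pp->stream);
  pp->len = 0;
}

/* Print a statement through the buffer, then extra data directly to the
   stream.  With FLUSH_FIRST zero, the direct output overtakes the
   buffered statement text.  */
static void
toy_dump_stmt (struct toy_pp *pp, int flush_first)
{
  toy_pp_string (pp, "x = y / z;\n");
  if (flush_first)
    toy_pp_flush (pp);
  fputs ("# histogram for the statement above\n", pp->stream);
  toy_pp_flush (pp);
}
```

Calling toy_dump_stmt with flush_first set to 0 writes the histogram line before the statement it belongs to, which is the kind of misordering the flush is meant to prevent.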

Bootstrapped and tested on powerpc64-unknown-linux-gnu. This patch doesn't
affect Graphite, so perhaps it won't break this time :-)
OK for trunk?

Ciao!
Steven

cleanup_cfg_dump_2.diff
Description: Binary data

Re: [PATCH] [MIPS] Support for -mmcu and -mno-mcu

2012-07-18 Thread Richard Sandiford

"Moore, Catherine"  writes:
> +mmcu
> +Target Report Mask(MCU)
> +Use MCU instructions

Please use Var(TARGET_MCU) instead, in order to avoid eating up
target_flags.  OK with that change, thanks.

Richard

Re: [Patch, mips] Fix compiler abort with -mips32r2 -mips16 -msynci

2012-07-18 Thread Richard Sandiford

"Steve Ellcey "  writes:
> While working on my favorite mips option (-msynci) I noticed an odd thing.
> If I compile with '-mips32 -mips16 -msynci' I got a warning about synci not
> being supported but if I compiled with '-mips32r2 -mips16 -msynci' I did not
> get a warning, even though -mips16 mode does not support synci.  Furthermore
> if I compiled a program that called __builtin___clear_cache with '-mips32r2
> -mips16 -msynci', the compiler would abort.

The abort sounds like the bug here.  It's deliberate that things like
-msynci, -mbranch-likely, etc., are OK with -mips16.  On the one hand,
you could compile with -mips16 but have an __attribute__((nomips16))
function that could benefit from using SYNCI.  On the other, you could
compile without -mips16 but have an __attribute__((mips16)) function
that needs to avoid SYNCI.

-mips16 really just sets the default ISA mode for functions that don't
specify one.  That's why override_options hides mips16ness so early on,
like you say.

Richard

Re: [PATCH 1/2] if-to-switch conversion pass

2012-07-18 Thread Bernhard Reutner-Fischer

On Tue, Jul 17, 2012 at 01:21:00PM +0200, Tom de Vries wrote:

> /* The root of the compilation pass tree, once constructed.  */
> extern struct opt_pass *all_passes, *all_small_ipa_passes, 
> *all_lowering_passes,
>Index: gcc/tree-if-switch-conversion.c
>===
>--- /dev/null (new file)
>+++ gcc/tree-if-switch-conversion.c (revision 0)

>+/* Convert all trees in RANGES to TYPE.  */
>+
>+static bool
>+convert_ranges (tree type, VEC (range, gc) *ranges)
>+{
>+  unsigned int ix;
>+  range r;
>+
>+  for (ix = 0; VEC_iterate (range, ranges, ix, r); ix++)
>+{
>+  r->low = fold_convert (type, r->low);
>+  if (TREE_TYPE (r->low) != type)
>+  return false;
>+
>+  if (r->high == NULL_TREE)
>+  continue;
>+
>+  r->high = fold_convert (type, r->high);
>+  if (TREE_TYPE (r->low) != type)

low, not high? This is not immediately obvious to me, please explain?

>+  return false;
>+}
>+
>+  return true;
>+}

>+/* Analyze BB and store results in ifsc_info_def struct.  */
>+
>+static void
>+analyze_bb (basic_block bb)
>+{
>+  gimple stmt = last_stmt (bb);
>+  tree lhs, rhs, var, constant;
>+  edge true_edge, false_edge;
>+  enum tree_code cond_code;
>+  VEC (range, gc) *ranges = NULL;
>+  unsigned int nr_stmts = 0;
>+  bool swap_edges = false;
>+  tree low, high;
>+
>+  /* We currently only handle bbs with GIMPLE_COND.  */
>+  if (!stmt || gimple_code (stmt) != GIMPLE_COND)
>+return;
>+
>+  cond_code = gimple_cond_code (stmt);
>+  switch (cond_code)
>+{
>+case EQ_EXPR:
>+case NE_EXPR:
>+case LE_EXPR:
>+case GE_EXPR:
>+  break;
>+case LT_EXPR:
>+case GT_EXPR:
>+  /* Todo.  */
>+  return;
>+default:
>+  return;
>+}
>+
>+  lhs = gimple_cond_lhs (stmt);
>+  rhs = gimple_cond_rhs (stmt);
>+
>+  /* The comparison needs to be against a constant.  */
>+  if (!TREE_CONSTANT (lhs)
>+  && !TREE_CONSTANT (rhs))
>+return;
>+
>+  /* Normalize comparison into (var cond_code constant).  */
>+  var = TREE_CONSTANT (lhs) ? rhs : lhs;
>+  constant = TREE_CONSTANT (lhs)? lhs : rhs;

missing space

[]
>+/* Convert every if-chain in CHAINS into a switch statement.  */
>+
>+static void
>+convert_chains (VEC (if_chain, gc) *chains)
>+{
>+  unsigned int ix;
>+  if_chain chain;
>+
>+  if (VEC_empty (if_chain, chains))
>+return;
>+
>+  for (ix = 0; VEC_iterate (if_chain, chains, ix, chain); ix++)

shouldn't this be FOR_EACH_VEC_ELT nowadays? Everywhere.
>+{
>+  if (dump_file)
>+  dump_if_chain (chain);
>+
>+  convert_if_chain_to_switch (chain);
>+
>+  update_cfg (chain);
>+}
>+
>+  /* Force recalculation of dominance info.  */
>+  free_dominance_info (CDI_DOMINATORS);
>+  free_dominance_info (CDI_POST_DOMINATORS);
>+}

>Index: gcc/Makefile.in
>===
>--- gcc/Makefile.in (revision 189508)
>+++ gcc/Makefile.in (working copy)
>@@ -1391,6 +1391,7 @@ OBJS = \
>   tree-profile.o \
>   tree-scalar-evolution.o \
>   tree-sra.o \
>+  tree-if-switch-conversion.o \
>   tree-switch-conversion.o \
>   tree-ssa-address.o \
>   tree-ssa-alias.o \
>@@ -3013,7 +3014,12 @@ tree-sra.o : tree-sra.c $(CONFIG_H) $(SY
>$(IPA_PROP_H) $(DIAGNOSTIC_H) statistics.h $(TREE_DUMP_H) $(TIMEVAR_H) \
>$(PARAMS_H) $(TARGET_H) $(FLAGS_H) \
>$(DBGCNT_H) $(TREE_INLINE_H) $(GIMPLE_PRETTY_PRINT_H)
>+tree-if-switch-conversion.o : tree-if-switch-conversion.c $(CONFIG_H) \
>+$(SYSTEM_H) $(TREE_H) $(TM_P_H) $(TREE_FLOW_H) $(DIAGNOSTIC_H) \
>+$(TREE_INLINE_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
>+$(GIMPLE_H) $(TREE_PASS_H) $(FLAGS_H) $(EXPR_H) $(BASIC_BLOCK_H) output.h 
>\
>+$(GGC_H) $(OBSTACK_H) $(PARAMS_H) $(CPPLIB_H) $(PARAMS_H)

I think this list needs updating.

Nice to see if improvements, finally! :)
TIA && cheers,

Re: PR libjava/53973: Check and and skip 67h address size prefix for x32

2012-07-18 Thread Andrew Haley

On 07/18/2012 05:30 PM, H.J. Lu wrote:
> 2012-07-16  H.J. Lu  
> 
>   PR libjava/53973
>   * include/x86_64-signal.h (CHECK_67H_PREFIX): New.
>   (HANDLE_DIVIDE_OVERFLOW): Check and and skip 67h address size
>   prefix if CHECK_67H_PREFIX is 1.  Use ULL suffix for 64-bit
>   integer.

OK.  I would have thought it was OK to skip the 67h address size
whether or not it was x32, though.

Andrew.

[patch] PR debug/53948

2012-07-18 Thread Steven Bosscher

Hello,

This is my proposed fix for PR53948. We don't want to put user
variables in callee-clobbered registers, but obviously function
arguments are OK there. REG_USERVAR_P is set on PARM_DECLs and on user
variables, so it can't be used to distinguish between the two.

As it turns out, I can hi-jack a bit for that: 'unchanging' (currently
incorrectly documented as used on REG) for a new macro
REG_FUNCTION_PARM_P. I found one obvious place where this bit can be
used instead of REG_USERVAR_P, and probably there are a few more
places where this is useful (TBD, I'm going to look at all places
where RTL code looks at tree's PARM_DECL later).
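
A toy model of the idea, not the patch itself: the struct, the function names, and the assumption that marking a parameter sets both bits are hypothetical. It only shows why a second bit is needed to tell parameters apart from other user variables:

```c
#include <assert.h>
#include <stddef.h>

/* Two flag bits on a toy register.  */
struct toy_reg
{
  unsigned int user_var : 1;       /* modeled on REG_USERVAR_P */
  unsigned int function_parm : 1;  /* modeled on REG_FUNCTION_PARM_P */
};

/* A user variable only gets the first bit.  */
static void
toy_mark_user_reg (struct toy_reg *reg)
{
  assert (reg != NULL);
  reg->user_var = 1;
}

/* Assumption: a parameter is still a user variable, plus the new bit.  */
static void
toy_mark_function_parm_reg (struct toy_reg *reg)
{
  assert (reg != NULL);
  reg->user_var = 1;
  reg->function_parm = 1;
}

/* User variables are kept out of call-clobbered registers unless they
   are function parameters.  */
static int
toy_ok_in_call_clobbered_reg (const struct toy_reg *reg)
{
  return !reg->user_var || reg->function_parm;
}

static int
toy_demo_user_variable (void)
{
  struct toy_reg reg = { 0, 0 };
  toy_mark_user_reg (&reg);
  return toy_ok_in_call_clobbered_reg (&reg);
}

static int
toy_demo_function_parm (void)
{
  struct toy_reg reg = { 0, 0 };
  toy_mark_function_parm_reg (&reg);
  return toy_ok_in_call_clobbered_reg (&reg);
}
```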

Bootstrapped&tested on powerpc64-unknown-linux-gnu.
OK for trunk?

Ciao!
Steven

PR debug/53948
* rtl.h (REG_FUNCTION_PARM_P): New flag on a REG.  Re-use 'unchanging'.
* emit-rtl.c (mark_function_parm_reg): New function.
* function.c (assign_parm_setup_reg): Use mark_function_parm_reg
instead of mark_user_reg.
* combine.c (can_change_dest_mode): Preserve REG_FUNCTION_PARM_P.
* web.c (entry_register): Likewise.
* reload1.c (reload): Likewise.
* ira-emit.c (ira_create_new_reg): Likewise.
* reginfo.c (reg_scan_mark_refs): Likewise.
* optabs.c (emit_libcall_block_1): Use REG_FUNCTION_PARM_P instead
of REG_USERVAR_P.
* regstat.c (dump_reg_info): Print REG_FUNCTION_PARM_P.
* doc/rtl.texi (REG_FUNCTION_PARM_P): Document it.
('unchanging' flag): Fix documentation.


PR53948.diff
Description: Binary data

Re: PR libjava/53973: Check and and skip 67h address size prefix for x32

2012-07-18 Thread Uros Bizjak

On Wed, Jul 18, 2012 at 7:34 PM, Andrew Haley  wrote:
> On 07/18/2012 05:30 PM, H.J. Lu wrote:
>> 2012-07-16  H.J. Lu  
>>
>>   PR libjava/53973
>>   * include/x86_64-signal.h (CHECK_67H_PREFIX): New.
>>   (HANDLE_DIVIDE_OVERFLOW): Check and and skip 67h address size
>>   prefix if CHECK_67H_PREFIX is 1.  Use ULL suffix for 64-bit
>>   integer.
>
> OK.  I would have thought it was OK to skip the 67h address size
> whether or not it was x32, though.

You can just skip the prefix unconditionally.
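
A minimal sketch of the unconditional check, outside of the signal handler macro. 0x67 is the x86 address-size override prefix; the function names and sample byte sequences are hypothetical and only illustrate stepping over that one prefix byte before the instruction is decoded:

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the first byte after an optional 67h
   address-size prefix.  */
static const unsigned char *
toy_skip_addr_size_prefix (const unsigned char *ip)
{
  assert (ip != NULL);
  if (ip[0] == 0x67)
    ip++;
  return ip;
}

/* Number of prefix bytes stepped over: 0 or 1.  */
static size_t
toy_prefix_bytes_skipped (const unsigned char *insn)
{
  return (size_t) (toy_skip_addr_size_prefix (insn) - insn);
}

/* idivl (%edi): the form with 32-bit addressing carries the prefix.  */
static const unsigned char toy_idiv_with_prefix[] = { 0x67, 0xf7, 0x3f };

/* idivl (%rdi): no prefix, so nothing is skipped.  */
static const unsigned char toy_idiv_without_prefix[] = { 0xf7, 0x3f };
```

Because the check is harmless when the prefix is absent, it does not need to depend on whether the code was built for x32.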

Uros.

Re: [patch] PR debug/53948

2012-07-18 Thread Jan Kratochvil

Hello Steven,

On Wed, 18 Jul 2012 19:46:16 +0200, Steven Bosscher wrote:
> This is my proposed fix for PR53948.

I can't speak for the GCC code but could it have a GCC testcase?


Thanks,
Jan

Re: PR libjava/53973: Check and and skip 67h address size prefix for x32

2012-07-18 Thread H.J. Lu

On Wed, Jul 18, 2012 at 10:47 AM, Uros Bizjak  wrote:
> On Wed, Jul 18, 2012 at 7:34 PM, Andrew Haley  wrote:
>> On 07/18/2012 05:30 PM, H.J. Lu wrote:
>>> 2012-07-16  H.J. Lu  
>>>
>>>   PR libjava/53973
>>>   * include/x86_64-signal.h (CHECK_67H_PREFIX): New.
>>>   (HANDLE_DIVIDE_OVERFLOW): Check and and skip 67h address size
>>>   prefix if CHECK_67H_PREFIX is 1.  Use ULL suffix for 64-bit
>>>   integer.
>>
>> OK.  I would have thought it was OK to skip the 67h address size
>> whether or not it was x32, though.
>
> You can just skip the prefix unconditionally.
>
> Uros.

I will check in this patch shortly.

Thanks.

-- 
H.J.
---
2012-07-18  H.J. Lu  

PR libjava/53973
* include/x86_64-signal.h (CHECK_67H_PREFIX): Removed.
(HANDLE_DIVIDE_OVERFLOW): Check and skip 67h address size
prefix unconditionally.

diff --git a/libjava/include/x86_64-signal.h b/libjava/include/x86_64-signal.h
index 84907c3..aa5a903 100644
--- a/libjava/include/x86_64-signal.h
+++ b/libjava/include/x86_64-signal.h
@@ -21,12 +21,6 @@ details.  */
 #define HANDLE_SEGV 1
 #define HANDLE_FPE 1

-#ifdef __ILP32__
-# define CHECK_67H_PREFIX 1
-#else
-# define CHECK_67H_PREFIX 0
-#endif
-
 #define SIGNAL_HANDLER(_name)  \
 static void _Jv_##_name (int, siginfo_t *, \
 void *_p __attribute__ ((__unused__)))
@@ -53,8 +47,8 @@ do
\
\
   bool _is_64_bit = false; \
\
-  /* Check and skip 67h address size prefix if needed.  */ \
-  if (CHECK_67H_PREFIX && _rip[0] == 0x67) \
+  /* Check and skip 67h address size prefix.  */   \
+  if (_rip[0] == 0x67) \
 _rip++;\
\

Re: [patch] PR debug/53948

2012-07-18 Thread Steven Bosscher

On Wed, Jul 18, 2012 at 7:55 PM, Jan Kratochvil
 wrote:
> Hello Steven,
>
> On Wed, 18 Jul 2012 19:46:16 +0200, Steven Bosscher wrote:
>> This is my proposed fix for PR53948.
>
> I can't speak for the GCC code but could it have a GCC testcase?

I wouldn't know what to test for. Looking for a .loc marker seems a bit fragile.

Ciao!
Steven

[PATCH]: Add entity argument to MODE_AFTER macro

2012-07-18 Thread Uros Bizjak

Hello!

As with all other mode switching macros, we need to pass the entity index
to the MODE_AFTER macro as well. In a multi-entity mode switching case, we
usually don't have the same modes for all entities, and we should be able
to return the mode that applies to a specific entity. It looks like the
epiphany port has already tripped over this issue.
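
A toy sketch of why the entity has to be an explicit macro argument. Everything here is hypothetical (the insn representation, the names, the two-entity setup); it only shows that each entity tracks its own mode and that the macro no longer relies on a variable from the caller's scope:

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_N_ENTITIES = 2 };

/* A toy insn that switches exactly one entity to a new mode.  */
struct toy_insn
{
  int sets_entity;
  int new_mode;
};

/* Mode of ENTITY after INSN: changed only if INSN touches ENTITY.  */
static int
toy_mode_after (int entity, int last_mode, const struct toy_insn *insn)
{
  assert (entity >= 0 && entity < TOY_N_ENTITIES);
  assert (insn != NULL);
  return insn->sets_entity == entity ? insn->new_mode : last_mode;
}

/* The entity is passed in, rather than picked up from a variable that
   happens to be named 'e' at the point of use.  */
#define TOY_MODE_AFTER(ENTITY, MODE, INSN) \
  (toy_mode_after ((ENTITY), (MODE), (INSN)))

/* Run one insn that sets entity 1 to mode 3 and report the resulting
   mode of QUERIED_ENTITY.  */
static int
toy_demo_mode_of (int queried_entity)
{
  int last_mode[TOY_N_ENTITIES] = { 0, 0 };
  struct toy_insn insn = { 1, 3 };
  int e;

  for (e = 0; e < TOY_N_ENTITIES; e++)
    last_mode[e] = TOY_MODE_AFTER (e, last_mode[e], &insn);
  return last_mode[queried_entity];
}
```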

2012-07-18  Uros Bizjak  

* doc/tm.texi.in (MODE_AFTER): Add entity as the first macro argument.
* doc/tm.texi: Regenerate.
* mode-switching.c (optimize_mode_switching): Update MODE_AFTER call.
* config/sh/sh.h (MODE_AFTER): Update.
* config/epiphany/epiphany.h (MODE_AFTER): Update.

Patch was bootstrapped on x86_64-pc-linux-gnu. Also, a functional C
crosscompiler was built for sh-elf and epiphany-elf targets.

Bordering on obvious, OK for mainline?

Uros.
Index: mode-switching.c
===
--- mode-switching.c(revision 189491)
+++ mode-switching.c(working copy)
@@ -534,7 +534,7 @@ optimize_mode_switching (void)
  RESET_BIT (transp[bb->index], j);
}
 #ifdef MODE_AFTER
- last_mode = MODE_AFTER (last_mode, insn);
+ last_mode = MODE_AFTER (e, last_mode, insn);
 #endif
  /* Update LIVE_NOW.  */
  for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
Index: doc/tm.texi
===
--- doc/tm.texi (revision 189491)
+++ doc/tm.texi (working copy)
@@ -9708,8 +9708,9 @@ return an integer value not larger than the corres
 be switched into prior to the execution of @var{insn}.
 @end defmac
 
-@defmac MODE_AFTER (@var{mode}, @var{insn})
-If this macro is defined, it is evaluated for every @var{insn} during
+@defmac MODE_AFTER (@var{entity}, @var{mode}, @var{insn})
+@var{entity} is an integer specifying a mode-switched entity.  If
+this macro is defined, it is evaluated for every @var{insn} during
 mode switching.  It determines the mode that an insn results in (if
 different from the incoming mode).
 @end defmac
Index: doc/tm.texi.in
===
--- doc/tm.texi.in  (revision 189491)
+++ doc/tm.texi.in  (working copy)
@@ -9587,8 +9587,9 @@ return an integer value not larger than the corres
 be switched into prior to the execution of @var{insn}.
 @end defmac
 
-@defmac MODE_AFTER (@var{mode}, @var{insn})
-If this macro is defined, it is evaluated for every @var{insn} during
+@defmac MODE_AFTER (@var{entity}, @var{mode}, @var{insn})
+@var{entity} is an integer specifying a mode-switched entity.  If
+this macro is defined, it is evaluated for every @var{insn} during
 mode switching.  It determines the mode that an insn results in (if
 different from the incoming mode).
 @end defmac
Index: config/sh/sh.h
===
--- config/sh/sh.h  (revision 189491)
+++ config/sh/sh.h  (working copy)
@@ -2351,7 +2351,7 @@ extern int current_function_interrupt;
? get_attr_fp_mode (INSN)   \
: FP_MODE_NONE)
 
-#define MODE_AFTER(MODE, INSN)  \
+#define MODE_AFTER(ENTITY, MODE, INSN) \
  (TARGET_HITACHI   \
   && recog_memoized (INSN) >= 0\
   && get_attr_fp_set (INSN) != FP_SET_NONE  \
Index: config/epiphany/epiphany.h
===
--- config/epiphany/epiphany.h  (revision 189491)
+++ config/epiphany/epiphany.h  (working copy)
@@ -888,8 +888,8 @@ enum epiphany_function_type
 
 #define MODE_ENTRY(ENTITY) (epiphany_mode_entry_exit ((ENTITY), false))
 #define MODE_EXIT(ENTITY) (epiphany_mode_entry_exit ((ENTITY), true))
-#define MODE_AFTER(LAST_MODE, INSN) \
-  (epiphany_mode_after (e, (LAST_MODE), (INSN)))
+#define MODE_AFTER(ENTITY, LAST_MODE, INSN) \
+  (epiphany_mode_after ((ENTITY), (LAST_MODE), (INSN)))
 
 #define TARGET_INSERT_MODE_SWITCH_USE epiphany_insert_mode_switch_use

Re: [PATCH] Define FFI_SIZEOF_JAVA_RAW to 4 for x32

2012-07-18 Thread Uros Bizjak

On Wed, Jul 18, 2012 at 6:27 PM, H.J. Lu  wrote:

>> This patch defines FFI_SIZEOF_JAVA_RAW to 4 for x32, similar to MIPS n32.
>> It fixed:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53982
>
> Here is the patch with updated ChangeLog entry.  X32 has the same issue
> as MIPS n32, which was fixed by FFI_SIZEOF_JAVA_RAW:
>
> http://gcc.gnu.org/ml/gcc-patches/2007-11/msg01612.html
> http://gcc.gnu.org/ml/gcc-patches/2007-12/msg5.html
>
> The same fix is needed for x32.  OK for trunk?
>
> 2012-07-16  H.J. Lu  
>
> * src/x86/ffitarget.h: Check __ILP32__ instead of __LP64__ for
> x32.
> (FFI_SIZEOF_JAVA_RAW): Defined to 4 for x32.
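
A sketch of the kind of preprocessor check involved, not the libffi header itself. The TOY_ macros are hypothetical, and the sketch assumes that argument slots are 8 bytes wide on every x86-64 ABI (including x32) while a raw Java slot should match the pointer size. x32 is detected by __x86_64__ and __ILP32__ both being defined:

```c
#include <assert.h>
#include <stddef.h>

/* Width of one argument slot.  Assumption: 8 bytes for all x86-64
   ABIs, pointer-sized elsewhere.  */
#if defined (__x86_64__)
# define TOY_SIZEOF_ARG 8
#else
# define TOY_SIZEOF_ARG sizeof (void *)
#endif

/* x32 has 64-bit registers but 32-bit pointers, so the raw Java slot
   size cannot simply default to the argument slot size there.  */
#if defined (__x86_64__) && defined (__ILP32__)
# define TOY_SIZEOF_JAVA_RAW 4
#else
# define TOY_SIZEOF_JAVA_RAW TOY_SIZEOF_ARG
#endif

static size_t
toy_sizeof_java_raw (void)
{
  size_t size = TOY_SIZEOF_JAVA_RAW;

  assert (size <= TOY_SIZEOF_ARG);
  return size;
}
```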

How did you test this patch? Does it fix all the problems from
PR53982, including the one that is not 100% reproducible?

Uros.

Re: [patch] PR debug/53948

2012-07-18 Thread Jan Kratochvil

On Wed, 18 Jul 2012 20:05:46 +0200, Steven Bosscher wrote:
> I wouldn't know what to test for. Looking for a .loc marker seems a bit 
> fragile.

What is fragile about
// { dg-final { scan-assembler-times "\\.loc\t1 3 0\\r\\n\t\[^.\]" 6 } }

or something like that.  Line numbers are constant for the testcase.


Thanks,
Jan

Re: PR #53525 - track-macro-expansion performance regression

2012-07-18 Thread Dodji Seketeli

Hello Dimitrios,

> With the attached patches I introduce four new obstacks in struct
> cpp_reader to substitute malloc's/realloc's when expanding
> macros. Numbers have been posted in the PR, but to summarize:
> 
> before: 0.785 s or  2201 M instr
> after:  0.760 s or  2108 M instr
> 
> Memory overhead is some tens kilobytes worst case. Tested on x86, no
> regressions. I think this version of the patch is OK to merge, besides
> some TODO comments (I'd appreciate opinions on them) and the following
> maybe:

Thank you for your time and dedication.

I am not a maintainer of any kind, so I can not approve or deny your
patch.  I am just chiming in to help with a few comments and CC the
maintainers.

> IMHO the new obstack_{mark,release} functions are the ones that will
> be harder to apply, because they make gcc's obstacks even more
> different than libc's. I sent the patch to libc-alpha but the feedback
> I got was that I should first make the two obstack versions (gcc,libc)
> identical, and then try to push changes. I've noted the primary
> differences and plan on tackling this, but it's not as trivial as it
> seems.
> 
> So if it's not OK, where could the new obstack_{mark,release} go?

I am letting the maintainers reply to this one.  :-)

Please find my comments below.

For each sub-project that your patch modifies, you need to add a
GNU-style ChangeLog entry.  The custom is to add it separately (e.g.,
inline or as an attachment), not as part of the diff.  That way,
applying the patch is less likely to trigger conflicts.

> === modified file 'include/libiberty.h'
> --- include/libiberty.h   2011-09-28 19:04:30 +
> +++ include/libiberty.h   2012-07-07 16:04:01 +

[...]

> -/* Type-safe obstack allocator.  */
> +/* Type-safe obstack allocator. You must first initialize the obstack with
> +   obstack_init() or _obstack_begin(). */

Why not just use the _obstack_begin function?  Why introduce the
use of the obstack_init macro?

>  #define XOBNEW(O, T) ((T *) obstack_alloc ((O), sizeof (T)))
>  #define XOBNEWVEC(O, T, N)   ((T *) obstack_alloc ((O), sizeof (T) * (N)))
>  #define XOBNEWVAR(O, T, S)   ((T *) obstack_alloc ((O), (S)))
> -#define XOBFINISH(O, T) ((T) obstack_finish ((O)))

I think this change is not needed.  You basically remove this line to
replace it with the hunk below, that comes later in the patch:

> +#define XOBFINISH(O, PT) ((PT) obstack_finish ((O)))

So I believe you could do away with the change.

> +#define XOBDELETE(O, T, P)   (obstack_free ((O), (P)))

If you are not using the T parameter in the definition of the macro,
then you might as well remove it from the macro parameters.

> +
> +#define XOBGROW(O, T, D) obstack_grow ((O), (D), sizeof (T))
> +#define XOBGROWVEC(O, T, D, N)   obstack_grow ((O), (D), sizeof (T) * 
> (N))
> +#define XOBSHRINK(O, T)  obstack_blank ((O), -1 * sizeof (T))
> +#define XOBSHRINKVEC(O, T, N)obstack_blank ((O), -1 * sizeof (T) * 
> (N))

Maybe these new macros could use some comments at least to make it
easier to figure out what these O, T, D, N parameters mean.  I
understand that it is not done that way for the existing macros, but I
guess we could use some improvement here.  :-)
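
A sketch of what such comments could look like, assuming the glibc <obstack.h> interface (obstack_init, obstack_grow, obstack_object_size, obstack_finish, obstack_free). The two macros are copied from the hunk above; the parameter descriptions and the demo function are an illustration, not the wording of the patch:

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

/* obstack_init expects these two names to be defined by the user.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Parameters used by the macros below:
   O - pointer to an initialized struct obstack
   T - element type of the object being grown
   D - pointer to the data to append
   N - number of elements of type T to append  */

/* Append one element of type T, read from D, to the growing object.  */
#define XOBGROW(O, T, D)        obstack_grow ((O), (D), sizeof (T))

/* Append N elements of type T, read from D, to the growing object.  */
#define XOBGROWVEC(O, T, D, N)  obstack_grow ((O), (D), sizeof (T) * (N))

/* Grow an object of four ints in two steps and return their sum.  */
static int
toy_sum_grown_ints (void)
{
  struct obstack ob;
  int one = 1;
  int more[3] = { 2, 3, 4 };
  int *v;
  int sum = 0;
  size_t i, n;

  obstack_init (&ob);
  XOBGROW (&ob, int, &one);
  XOBGROWVEC (&ob, int, more, 3);

  n = obstack_object_size (&ob) / sizeof (int);
  assert (n == 4);

  v = (int *) obstack_finish (&ob);
  for (i = 0; i < n; i++)
    sum += v[i];

  /* Passing NULL frees everything allocated in the obstack.  */
  obstack_free (&ob, NULL);
  return sum;
}
```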

[...]

> === modified file 'libcpp/init.c'
> --- libcpp/init.c 2012-04-30 11:43:43 +
> +++ libcpp/init.c 2012-07-07 16:04:01 +

Missing ChangeLog for these changes to libcpp.

> @@ -241,10 +241,20 @@ cpp_create_reader (enum c_lang lang, has
>/* The expression parser stack.  */
>_cpp_expand_op_stack (pfile);
>  
> +#define obstack_chunk_alloc ((void *(*) (long)) xmalloc)
> +#define obstack_chunk_free  ((void (*) (void *)) free)
> +
>/* Initialize the buffer obstack.  */
> -  _obstack_begin (&pfile->buffer_ob, 0, 0,
> -   (void *(*) (long)) xmalloc,
> -   (void (*) (void *)) free);
> +  obstack_init(&pfile->buffer_ob);

Same comment as earlier.  Why obstack_init instead of just using
_obstack_begin?

> +
> +  /* Initialize the macro obstacks. */
> +  obstack_init (&pfile->exp_ob);
> +  if (CPP_OPTION (pfile, track_macro_expansion))
> +{
> +  obstack_init (&pfile->virt_locs_ob);
> +  obstack_init (&pfile->arg_locs_ob);
> +  obstack_init (&pfile->exp_locs_ob);
> +}

Same comment as above.

[...]

> === modified file 'libcpp/internal.h'
> --- libcpp/internal.h 2012-05-29 09:36:29 +
> +++ libcpp/internal.h 2012-07-07 17:18:53 +

[...]

> @@ -555,6 +555,13 @@ struct cpp_reader

[...]

> +  /* Obstacks used to speed up macro expansion and virt_locs tracking. */

I'd say something like:

/* Obstacks used for fast memory allocation during macro expansion and
   virtual location tracking. */

> +  struct obstack exp_ob;   /* for expanding macro arguments */

I'd rather call this field args_exp_ob, to make the name more
meaningful.

> +  struct obstack exp_locs_ob;   /* virt_locs of expanded macro arguments */

Likewise, I'd call this field args_exp_

Re: [patch v2] support for multiarch systems

2012-07-18 Thread Thomas Schwinge

Hi!

On Sun, 08 Jul 2012 20:48:23 +0200, Matthias Klose  wrote:
> Please find attached v2 of the patch updated for trunk 20120706, x86 only, 
> tested on
> x86-linux-gnu, KFreeBSD and the Hurd.

As suggested by Diego Novillo I have now attached this patch to
.

> I left in the comment about the multiarch names, but I'm fine to 
> change/discard
> it. It was first required by Joseph Myers, then not found necessary by Paolo
> Bonzini. The patch includes the changes suggested by Thomas Schwinge.

I'm confirming that this version of that patch is identical to the patch
that I have been using "ever since", only this current one includes
additional documentation changes, and the s%eight%ninth documentation
change is missing from gcc/genmultilib.

> Ok for the trunk?
> 
>   Matthias
> 2012-06-25  Matthias Klose  
> 
>   * doc/invoke.texi: Document -print-multiarch.
>   * doc/install.texi: Document --enable-multiarch.
>   * doc/fragments.texi: Document MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME.
>   * configure.ac: Add --enable-multiarch option.
>   * configure.in: Regenerate.
>   * Makefile.in (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib.
>   enable_multiarch, with_float: New macros.
>   if_multiarch: New macro, define in terms of enable_multiarch.
>   * genmultilib: Add new argument for the multiarch name.
>   * gcc.c (multiarch_dir): Define.
>   (for_each_path): Search for multiarch suffixes.
>   (driver_handle_option): Handle multiarch option.
>   (do_spec_1): Pass -imultiarch if defined.
>   (main): Print multiarch.
>   (set_multilib_dir): Separate multilib and multiarch names
>   from multilib_select.
>   (print_multilib_info): Ignore multiarch names in multilib_select.
>   * incpath.c (add_standard_paths): Search the multiarch include dirs.
>   * cppdefault.h (default_include): Document multiarch in multilib
>   member.
>   * cppdefault.c: [LOCAL_INCLUDE_DIR, STANDARD_INCLUDE_DIR] Add an
> include directory for multiarch directories.
>   * common.opt: New options --print-multiarch and -imultilib.
>   * config.gcc  (tmake_file):
>   Include i386/t-linux.
>(tmake_file):
>   Include i386/t-kfreebsd.
>(tmake_file): Include i386/t-gnu.
>   * config/i386/t-linux64: Add multiarch names in
>   MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME.
>   * config/i386/t-gnu: New file.
>   * config/i386/t-kfreebsd: Likewise.
>   * config/i386/t-linux: Likewise.


Grüße,
 Thomas



Re: [patch] PR debug/53948

2012-07-18 Thread Jakub Jelinek

On Wed, Jul 18, 2012 at 08:55:17PM +0200, Jan Kratochvil wrote:
> On Wed, 18 Jul 2012 20:05:46 +0200, Steven Bosscher wrote:
> > I wouldn't know what to test for. Looking for a .loc marker seems a bit 
> > fragile.
> 
> What is fragile about
> // { dg-final { scan-assembler-times "\\.loc\t1 3 0\\r\\n\t\[^.\]" 6 } }
> 
> or something like that.  Line numbers are constant for the testcase.

Not all assemblers support .loc markers, sometimes .debug_line is emitted by
gcc directly, and not all targets use dwarf.  For the latter we can just put
the test into gcc.dg/debug/dwarf/, for the former we would need a new
dejagnu test.  But more importantly, on different arches I'd guess the
number of .loc times might be different.
Better than that would be a guality testcase limited to -O0.

Jakub

Re: [PATCH] Define FFI_SIZEOF_JAVA_RAW to 4 for x32

2012-07-18 Thread H.J. Lu

On Wed, Jul 18, 2012 at 11:44 AM, Uros Bizjak  wrote:
> On Wed, Jul 18, 2012 at 6:27 PM, H.J. Lu  wrote:
>
>>> This patch defines FFI_SIZEOF_JAVA_RAW to 4 for x32, similar to MIPS n32.
>>> It fixed:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53982
>>
>> Here is the patch with updated ChangeLog entry.  X32 has the same issue
>> as MIPS n32, which was fixed by FFI_SIZEOF_JAVA_RAW:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2007-11/msg01612.html
>> http://gcc.gnu.org/ml/gcc-patches/2007-12/msg5.html
>>
>> The same fix is needed for x32.  OK for trunk?
>>
>> 2012-07-16  H.J. Lu  
>>
>> * src/x86/ffitarget.h: Check __ILP32__ instead of __LP64__ for
>> x32.
>> (FFI_SIZEOF_JAVA_RAW): Defined to 4 for x32.
>
> How did you test this patch? Does it fix all the problems from
> PR53982, including the one that is not 100% reproducible?
>

I tested it on Linux/x32. All libjava tests are passing now for x32 with
this patch.


-- 
H.J.

Re: [PATCH] Define FFI_SIZEOF_JAVA_RAW to 4 for x32

2012-07-18 Thread Uros Bizjak

On Wed, Jul 18, 2012 at 9:10 PM, H.J. Lu  wrote:

>>>> This patch defines FFI_SIZEOF_JAVA_RAW to 4 for x32, similar to MIPS n32.
>>>> It fixed:
>>>>
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53982
>>>
>>> Here is the patch with updated ChangeLog entry.  X32 has the same issue
>>> as MIPS n32, which was fixed by FFI_SIZEOF_JAVA_RAW:
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2007-11/msg01612.html
>>> http://gcc.gnu.org/ml/gcc-patches/2007-12/msg5.html
>>>
>>> The same fix is needed for x32.  OK for trunk?
>>>
>>> 2012-07-16  H.J. Lu  
>>>
>>> * src/x86/ffitarget.h: Check __ILP32__ instead of __LP64__ for
>>> x32.
>>> (FFI_SIZEOF_JAVA_RAW): Defined to 4 for x32.
>>
>> How did you test this patch? Does it fix all the problems from
>> PR53982, including the one that is not 100% reproducible?
>>
>
> I tested it on Linux/x32. All libjava tests are passing now for x32 with
> this patch.

OK then.

Thanks,
Uros.

Re: cosmetic change - simplify cse.c:preferable()

2012-07-18 Thread Dodji Seketeli

Hey Dimitrios,

I can't say much about your patch, so I am CC-ing the maintainers.

Thanks.

Dimitrios Apostolou  a écrit:

> Hello,
>
> I've had this patch some time now, it's simple and cosmetic only, I
> had done it while trying to understand expression costs in CSE. I
> think it's more readable than the previous one. FWIW it passed all
> tests on x86.
>
>
> Thanks,
> Dimitris
>
> === modified file 'gcc/cse.c'
> --- gcc/cse.c 2012-06-15 09:22:00 +
> +++ gcc/cse.c 2012-07-08 07:28:52 +
> @@ -713,32 +713,25 @@ approx_reg_cost (rtx x)
>  static int
>  preferable (int cost_a, int regcost_a, int cost_b, int regcost_b)
>  {
> -  /* First, get rid of cases involving expressions that are entirely
> - unwanted.  */
> -  if (cost_a != cost_b)
> -{
> -  if (cost_a == MAX_COST)
> - return 1;
> -  if (cost_b == MAX_COST)
> - return -1;
> -}
> +  int cost_diff = cost_a - cost_b;
> +  int regcost_diff = regcost_a - regcost_b;
>  
> -  /* Avoid extending lifetimes of hardregs.  */
> -  if (regcost_a != regcost_b)
> +  if (cost_diff != 0)
>  {
> -  if (regcost_a == MAX_COST)
> - return 1;
> -  if (regcost_b == MAX_COST)
> - return -1;
> +  /* If none of the expressions are entirely unwanted */
> +  if ((cost_a != MAX_COST) && (cost_b != MAX_COST)
> +   /* AND only one of the regs is HARD_REG */
> +   && (regcost_diff != 0)
> +   && ((regcost_a == MAX_COST) || (regcost_b == MAX_COST))
> +   )
> + /* Then avoid extending lifetime of HARD_REG */
> + return regcost_diff;
> +
> +  return cost_diff;
>  }
>  
> -  /* Normal operation costs take precedence.  */
> -  if (cost_a != cost_b)
> -return cost_a - cost_b;
> -  /* Only if these are identical consider effects on register pressure.  */
> -  if (regcost_a != regcost_b)
> -return regcost_a - regcost_b;
> -  return 0;
> +  /* cost_a == costb, consider effects on register pressure */
> +  return regcost_diff;
>  }
>  
>  /* Internal function, to compute cost when X is not a register; called
>
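
One way to check that the rewrite is cosmetic only is to compare the sign of both versions over a few sample costs. The sketch below transcribes the old and new bodies from the diff above under toy names; it assumes MAX_COST is INT_MAX and that costs are non-negative, so the subtractions cannot overflow:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_MAX_COST INT_MAX

/* The body removed by the patch.  */
static int
toy_preferable_old (int cost_a, int regcost_a, int cost_b, int regcost_b)
{
  if (cost_a != cost_b)
    {
      if (cost_a == TOY_MAX_COST)
        return 1;
      if (cost_b == TOY_MAX_COST)
        return -1;
    }
  if (regcost_a != regcost_b)
    {
      if (regcost_a == TOY_MAX_COST)
        return 1;
      if (regcost_b == TOY_MAX_COST)
        return -1;
    }
  if (cost_a != cost_b)
    return cost_a - cost_b;
  if (regcost_a != regcost_b)
    return regcost_a - regcost_b;
  return 0;
}

/* The body added by the patch.  */
static int
toy_preferable_new (int cost_a, int regcost_a, int cost_b, int regcost_b)
{
  int cost_diff = cost_a - cost_b;
  int regcost_diff = regcost_a - regcost_b;

  if (cost_diff != 0)
    {
      if (cost_a != TOY_MAX_COST && cost_b != TOY_MAX_COST
          && regcost_diff != 0
          && (regcost_a == TOY_MAX_COST || regcost_b == TOY_MAX_COST))
        return regcost_diff;
      return cost_diff;
    }
  return regcost_diff;
}

static int
toy_sign (int v)
{
  return (v > 0) - (v < 0);
}

/* Count sample inputs for which the two versions disagree in sign.  */
static int
toy_count_sign_mismatches (void)
{
  static const int samples[] = { 0, 1, 7, TOY_MAX_COST };
  const size_t n = sizeof samples / sizeof samples[0];
  size_t a, ra, b, rb;
  int mismatches = 0;

  for (a = 0; a < n; a++)
    for (ra = 0; ra < n; ra++)
      for (b = 0; b < n; b++)
        for (rb = 0; rb < n; rb++)
          {
            assert (samples[a] >= 0 && samples[b] >= 0);
            if (toy_sign (toy_preferable_old (samples[a], samples[ra],
                                              samples[b], samples[rb]))
                != toy_sign (toy_preferable_new (samples[a], samples[ra],
                                                 samples[b], samples[rb])))
              mismatches++;
          }
  return mismatches;
}
```

Only the sign is compared, since the old version returns 1 or -1 in places where the new one returns a difference.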

-- 
Dodji

cp-demangle PATCH to handle C++ casts

2012-07-18 Thread Jason Merrill

When I ran the C++ testsuite with -fabi-version defaulting to 0, I 
noticed a couple of tests that failed because they were expecting the =2 
mangling.  I also noticed that the demangler didn't understand the 
correct mangling for new-style casts.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 295d8111c7b0ee35933d201d1001798be4fc0b68
Author: Jason Merrill 
Date:   Wed Jul 18 15:28:02 2012 -0400

	* cp-demangle.c (cplus_demangle_operators): Add *_cast.
	(op_is_new_cast): New.
	(d_expression, d_print_comp): Check it.

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 27cc323..258aaa7 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1582,11 +1582,13 @@ const struct demangle_operator_info cplus_demangle_operators[] =
   { "an", NL ("&"), 2 },
   { "at", NL ("alignof "),   1 },
   { "az", NL ("alignof "),   1 },
+  { "cc", NL ("const_cast"), 2 },
   { "cl", NL ("()"),2 },
   { "cm", NL (","), 2 },
   { "co", NL ("~"), 1 },
   { "dV", NL ("/="),2 },
   { "da", NL ("delete[] "), 1 },
+  { "dc", NL ("dynamic_cast"), 2 },
   { "de", NL ("*"), 1 },
   { "dl", NL ("delete "),   1 },
   { "ds", NL (".*"),2 },
@@ -1626,8 +1628,10 @@ const struct demangle_operator_info cplus_demangle_operators[] =
   { "qu", NL ("?"), 3 },
   { "rM", NL ("%="),2 },
   { "rS", NL (">>="),   2 },
+  { "rc", NL ("reinterpret_cast"), 2 },
   { "rm", NL ("%"), 2 },
   { "rs", NL (">>"),2 },
+  { "sc", NL ("static_cast"), 2 },
   { "st", NL ("sizeof "),   1 },
   { "sz", NL ("sizeof "),   1 },
   { "tr", NL ("throw"), 0 },
@@ -2809,6 +2813,18 @@ d_exprlist (struct d_info *di, char terminator)
   return list;
 }
 
+/* Returns nonzero iff OP is an operator for a C++ cast: const_cast,
+   dynamic_cast, static_cast or reinterpret_cast.  */
+
+static int
+op_is_new_cast (struct demangle_component *op)
+{
+  const char *code = op->u.s_operator.op->code;
+  return (code[1] == 'c'
+	  && (code[0] == 's' || code[0] == 'd'
+	  || code[0] == 'c' || code[0] == 'r'));
+}
+
 /* <expression> ::= <(unary) operator-name> <expression>
                 ::= <(binary) operator-name> <expression> <expression>
                 ::= <(trinary) operator-name> <expression> <expression> <expression>
@@ -2971,7 +2987,10 @@ d_expression (struct d_info *di)
 	struct demangle_component *left;
 	struct demangle_component *right;
 
-	left = d_expression (di);
+	if (op_is_new_cast (op))
+	  left = cplus_demangle_type (di);
+	else
+	  left = d_expression (di);
 	if (!strcmp (code, "cl"))
 	  right = d_exprlist (di, 'E');
 	else if (!strcmp (code, "dt") || !strcmp (code, "pt"))
@@ -4455,6 +4474,17 @@ d_print_comp (struct d_print_info *dpi, int options,
 	  return;
 	}
 
+  if (op_is_new_cast (d_left (dc)))
+	{
+	  d_print_expr_op (dpi, options, d_left (dc));
+	  d_append_char (dpi, '<');
+	  d_print_comp (dpi, options, d_left (d_right (dc)));
+	  d_append_string (dpi, ">(");
+	  d_print_comp (dpi, options, d_right (d_right (dc)));
+	  d_append_char (dpi, ')');
+	  return;
+	}
+
   /* We wrap an expression which uses the greater-than operator in
 	 an extra layer of parens so that it does not get confused
 	 with the '>' which ends the template parameters.  */
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index 58c1368..6b55d30 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -4081,6 +4081,9 @@ decltype (new auto({parm#1})) f(int)
 --format=gnu-v3
 _Z1fIiERDaRKT_S1_
 auto& f(int const&, int)
+--format=gnu-v3
+_Z1gILi1EEvR1AIXT_EER1BIXscbT_EE
+void g<1>(A<1>&, B<static_cast<bool>(1)>&)
 #
 # Ada (GNAT) tests.
 #
diff --git a/gcc/testsuite/g++.dg/abi/mangle3-2.C b/gcc/testsuite/g++.dg/abi/mangle3-2.C
new file mode 100644
index 000..ac85fb0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle3-2.C
@@ -0,0 +1,20 @@
+// Test mangling of type casts
+// { dg-options "-fabi-version=0" }
+// { dg-do compile }
+
+template<int i> class A {};
+template<bool b> class B {};
+
+template<int i> void f(A<i> &, B<bool(i)> &) {}
+template<int i> void g(A<i> &, B<static_cast<bool>(i)> &) {}
+
+int main()
+{
+  A<1> a;
+  B<true> b;
+  f(a, b);
+  g(a, b);
+}
+
+// { dg-final { scan-assembler "\n_?_Z1fILi1EEvR1AIXT_EER1BIXcvbT_EE\[: \t\n\]" } }
+// { dg-final { scan-assembler "\n_?_Z1gILi1EEvR1AIXT_EER1BIXscbT_EE\[: \t\n\]" } }
diff --git a/gcc/testsuite/g++.dg/abi/mangle3.C b/gcc/testsuite/g++.dg/abi/mangle3.C
index a20b877..5f44f76 100644
--- a/gcc/testsuite/g++.dg/abi/mangle3.C
+++ b/gcc/testsuite/g++.dg/abi/mangle3.C
@@ -1,4 +1,5 @@
 // Test mangling of type casts
+// { dg-options "-fabi-version=2" }
 // { dg-do compile }
 
 template<int i> class A {};
diff --git a/gcc/testsuite/g++.dg/debug/nullptr01.C b/gcc/testsuite/g++.dg/debug/nullptr01.C
index ab08588..63c16ac 100644
--- a/gcc/testsuite/g++.dg/debug/nullptr01.C
+++ b/gcc/testsuite/g++.dg/debug/nullptr01.C
@@ -1,5 +1,5 @@
 // Test that debugging backends don't crash on NULLPTR_TYPE.
-// { dg-options "-std=c++0x" }

Re: [PATCH 1/2] if-to-switch conversion pass

2012-07-18 Thread Tom de Vries

Bernhard,

thanks for the review.

On 18/07/12 19:32, Bernhard Reutner-Fischer wrote:
> On Tue, Jul 17, 2012 at 01:21:00PM +0200, Tom de Vries wrote:
> 
>> /* The root of the compilation pass tree, once constructed.  */
>> extern struct opt_pass *all_passes, *all_small_ipa_passes, 
>> *all_lowering_passes,
>> Index: gcc/tree-if-switch-conversion.c
>> ===
>> --- /dev/null (new file)
>> +++ gcc/tree-if-switch-conversion.c (revision 0)
> 
>> +/* Convert all trees in RANGES to TYPE.  */
>> +
>> +static bool
>> +convert_ranges (tree type, VEC (range, gc) *ranges)
>> +{
>> +  unsigned int ix;
>> +  range r;
>> +
>> +  for (ix = 0; VEC_iterate (range, ranges, ix, r); ix++)
>> +{
>> +  r->low = fold_convert (type, r->low);
>> +  if (TREE_TYPE (r->low) != type)
>> +return false;
>> +
>> +  if (r->high == NULL_TREE)
>> +continue;
>> +
>> +  r->high = fold_convert (type, r->high);
>> +  if (TREE_TYPE (r->low) != type)
> 
> low, not high? This is not immediately obvious to me, please explain?
> 

It's not obvious because it's wrong, thanks for spotting that. This will be
fixed in the next submission.
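
A toy model of the corrected check, where the type of the converted high bound is tested rather than the low bound a second time. The types and the conversion rule are hypothetical stand-ins (real trees and fold_convert behave differently); the model only makes a failed conversion of the high bound observable:

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_TYPE_LONG = 0, TOY_TYPE_UCHAR = 1 };

struct toy_value
{
  int type;
  long value;
};

struct toy_range
{
  struct toy_value low;
  struct toy_value high;
  int has_high;          /* 0 models high == NULL_TREE */
};

/* Toy conversion: a value that does not fit the 8-bit type keeps its
   old type, which is how a failure shows up to the caller.  */
static struct toy_value
toy_convert (int type, struct toy_value v)
{
  if (type == TOY_TYPE_UCHAR && (v.value < 0 || v.value > 255))
    return v;
  v.type = type;
  return v;
}

/* Convert all bounds in RANGES to TYPE; return 0 on the first failure.  */
static int
toy_convert_ranges (int type, struct toy_range *ranges, size_t n)
{
  size_t ix;

  assert (ranges != NULL || n == 0);
  for (ix = 0; ix < n; ix++)
    {
      struct toy_range *r = &ranges[ix];

      r->low = toy_convert (type, r->low);
      if (r->low.type != type)
        return 0;

      if (!r->has_high)
        continue;

      r->high = toy_convert (type, r->high);
      if (r->high.type != type)   /* high, not low */
        return 0;
    }
  return 1;
}

/* The range 1..300 does not fit: only the high bound fails.  */
static int
toy_demo_high_out_of_range (void)
{
  struct toy_range r = { { TOY_TYPE_LONG, 1 }, { TOY_TYPE_LONG, 300 }, 1 };
  return toy_convert_ranges (TOY_TYPE_UCHAR, &r, 1);
}

/* The range 1..200 fits entirely.  */
static int
toy_demo_in_range (void)
{
  struct toy_range r = { { TOY_TYPE_LONG, 1 }, { TOY_TYPE_LONG, 200 }, 1 };
  return toy_convert_ranges (TOY_TYPE_UCHAR, &r, 1);
}
```

With the low bound tested twice, the first demo would wrongly report success.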

>> +return false;
>> +}
>> +
>> +  return true;
>> +}
> 
>> +/* Analyze BB and store results in ifsc_info_def struct.  */
>> +
>> +static void
>> +analyze_bb (basic_block bb)
>> +{
>> +  gimple stmt = last_stmt (bb);
>> +  tree lhs, rhs, var, constant;
>> +  edge true_edge, false_edge;
>> +  enum tree_code cond_code;
>> +  VEC (range, gc) *ranges = NULL;
>> +  unsigned int nr_stmts = 0;
>> +  bool swap_edges = false;
>> +  tree low, high;
>> +
>> +  /* We currently only handle bbs with GIMPLE_COND.  */
>> +  if (!stmt || gimple_code (stmt) != GIMPLE_COND)
>> +return;
>> +
>> +  cond_code = gimple_cond_code (stmt);
>> +  switch (cond_code)
>> +{
>> +case EQ_EXPR:
>> +case NE_EXPR:
>> +case LE_EXPR:
>> +case GE_EXPR:
>> +  break;
>> +case LT_EXPR:
>> +case GT_EXPR:
>> +  /* Todo.  */
>> +  return;
>> +default:
>> +  return;
>> +}
>> +
>> +  lhs = gimple_cond_lhs (stmt);
>> +  rhs = gimple_cond_rhs (stmt);
>> +
>> +  /* The comparison needs to be against a constant.  */
>> +  if (!TREE_CONSTANT (lhs)
>> +  && !TREE_CONSTANT (rhs))
>> +return;
>> +
>> +  /* Normalize comparison into (var cond_code constant).  */
>> +  var = TREE_CONSTANT (lhs) ? rhs : lhs;
>> +  constant = TREE_CONSTANT (lhs)? lhs : rhs;
> 
> missing space
> 
> []

This will be fixed in the next submission.

>> +/* Convert every if-chain in CHAINS into a switch statement.  */
>> +
>> +static void
>> +convert_chains (VEC (if_chain, gc) *chains)
>> +{
>> +  unsigned int ix;
>> +  if_chain chain;
>> +
>> +  if (VEC_empty (if_chain, chains))
>> +return;
>> +
>> +  for (ix = 0; VEC_iterate (if_chain, chains, ix, chain); ix++)
> 
> shouldn't this be FOR_EACH_VEC_ELT nowadays? Everywhere.

Same.
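
As far as I recall, FOR_EACH_VEC_ELT in vec.h is a thin wrapper around the same VEC_iterate loop, so the change is mostly cosmetic. A standalone model of the idea follows; the toy vector type and the `_model` names are invented for illustration and are not GCC's VEC API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy vector of ints; a hypothetical stand-in for VEC (T, gc).  */
struct int_vec { size_t length; const int *data; };

/* Model of VEC_iterate: store element IX in *P and return true, or
   return false once IX runs past the end (or the vector is NULL).  */
static bool
vec_iterate_model (const struct int_vec *v, size_t ix, int *p)
{
  if (v != NULL && ix < v->length)
    {
      *p = v->data[ix];
      return true;
    }
  return false;
}

/* Model of FOR_EACH_VEC_ELT: the same loop header, spelled once.  */
#define FOR_EACH_VEC_ELT_MODEL(V, I, P) \
  for ((I) = 0; vec_iterate_model ((V), (I), &(P)); ++(I))

/* Sum the elements using the macro form.  */
static int
sum_elements (const struct int_vec *v)
{
  size_t ix;
  int elt;
  int sum = 0;

  FOR_EACH_VEC_ELT_MODEL (v, ix, elt)
    sum += elt;
  return sum;
}
```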

>> +{
>> +  if (dump_file)
>> +dump_if_chain (chain);
>> +
>> +  convert_if_chain_to_switch (chain);
>> +
>> +  update_cfg (chain);
>> +}
>> +
>> +  /* Force recalculation of dominance info.  */
>> +  free_dominance_info (CDI_DOMINATORS);
>> +  free_dominance_info (CDI_POST_DOMINATORS);
>> +}
> 
>> Index: gcc/Makefile.in
>> ===
>> --- gcc/Makefile.in (revision 189508)
>> +++ gcc/Makefile.in (working copy)
>> @@ -1391,6 +1391,7 @@ OBJS = \
>>  tree-profile.o \
>>  tree-scalar-evolution.o \
>>  tree-sra.o \
>> +tree-if-switch-conversion.o \
>>  tree-switch-conversion.o \
>>  tree-ssa-address.o \
>>  tree-ssa-alias.o \
>> @@ -3013,7 +3014,12 @@ tree-sra.o : tree-sra.c $(CONFIG_H) $(SY
>>$(IPA_PROP_H) $(DIAGNOSTIC_H) statistics.h $(TREE_DUMP_H) $(TIMEVAR_H) \
>>$(PARAMS_H) $(TARGET_H) $(FLAGS_H) \
>>$(DBGCNT_H) $(TREE_INLINE_H) $(GIMPLE_PRETTY_PRINT_H)
>> +tree-if-switch-conversion.o : tree-if-switch-conversion.c $(CONFIG_H) \
>> +$(SYSTEM_H) $(TREE_H) $(TM_P_H) $(TREE_FLOW_H) $(DIAGNOSTIC_H) \
>> +$(TREE_INLINE_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
>> +$(GIMPLE_H) $(TREE_PASS_H) $(FLAGS_H) $(EXPR_H) $(BASIC_BLOCK_H) output.h \
>> +$(GGC_H) $(OBSTACK_H) $(PARAMS_H) $(CPPLIB_H) $(PARAMS_H)
> 
> I think this list needs updating.
> 

I went over the list just now and all the elements appear in other makerules as
well, so I don't see any obvious ones that should be removed. I think the
PARAMS_H is not necessary. Do you have concerns about anything else?

Thanks,
- Tom

> Nice to see if improvements, finally! :)
> TIA && cheers,
>

Re: [PATCH 1/2] if-to-switch conversion pass

2012-07-18 Thread Steven Bosscher

On Wed, Jul 18, 2012 at 11:30 PM, Tom de Vries  wrote:
>>> +tree-if-switch-conversion.o : tree-if-switch-conversion.c $(CONFIG_H) \
>>> +$(SYSTEM_H) $(TREE_H) $(TM_P_H) $(TREE_FLOW_H) $(DIAGNOSTIC_H) \
>>> +$(TREE_INLINE_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
>>> +$(GIMPLE_H) $(TREE_PASS_H) $(FLAGS_H) $(EXPR_H) $(BASIC_BLOCK_H) output.h \
>>> +$(GGC_H) $(OBSTACK_H) $(PARAMS_H) $(CPPLIB_H) $(PARAMS_H)
>>
>> I think this list needs updating.
>>
>
> I went over the list just now and all the elements appear in other makerules as
> well, so I don't see any obvious ones that should be removed. I think the
> PARAMS_H is not necessary. Do you have concerns about anything else?

Why would the other make rules matter? You're adding new make rules
for your new file. Make it depend only on what your new file needs.

Makefile.in is a mess. One of these days, someone (hi, Tromey) will
hopefully get annoyed enough with this again to finish some tool to
auto-generate the dependences list. Until that time, let's try to
avoid proliferating the messy rules from old files to new ones.

> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tm.h"

Dearohdearohdear. You're going to look at target macros in this pass?
Surely not. So, let's drop this header,

> +
> +#include "params.h"

-ENONEEDFORTHIS

> +#include "flags.h"

You get flags.h for free from tree.h, but...

> +#include "tree.h"
> +#include "basic-block.h"
> +#include "tree-ssa-operands.h"

You get these if you be a nice GIMPLE pass and include gimple.h
instead of these three headers.

> +#include "tree-flow.h"
> +#include "tree-flow-inline.h"

You don't need tree-flow-inline.h. tree-flow.h includes it already for you.

> +#include "diagnostic.h"

I don't think you're emitting diagnostics.
Except maybe: "warning: trying to out-smart developer if I do this
without profile info" :-)

> +#include "tree-pass.h"
> +#include "tree-dump.h"

You don't need tree-dump.h

> +#include "timevar.h"

You don't need timevar.h, either.

> +#include "tree-pretty-print.h"

You want gimple-pretty-print.h in a GIMPLE pass.

So you're left with:

+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+
+#include "gimple.h"
+#include "gimple-pretty-print.h"
+#include "tree-flow.h"
+#include "tree-pass.h"

and

+tree-if-switch-conversion.o : tree-if-switch-conversion.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+   $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(TREE_FLOW_H) $(TREE_PASS_H)

Looks a lot nicer to me.

Please do feel free to join my headless header-hunt and help clean up
all those other files that have needlessly complex #includes and
correspondingly silly Makefile.in rules!

Ciao!
Steven

Re: [Patch, mips] Fix compiler abort with -mips32r2 -mips16 -msynci

2012-07-18 Thread Steve Ellcey

On Wed, 2012-07-18 at 18:30 +0100, Richard Sandiford wrote:

> The abort sounds like the bug here.  It's deliberate that things like
> -msynci, -mbranch-likely, etc., are OK with -mips16.  On the one hand,
> you could compile with -mips16 but have an __attribute__((nomips16))
> function that could benefit from using SYNCI.  On the other, you could
> compile without -mips16 but have an __attribute__((mips16)) function
> that needs to avoid SYNCI.

OK, I think that makes sense.

> -mips16 really just sets the default ISA mode for functions that don't
> specify one.  That's why override_options hides mips16ness so early on,
> like you say.

Ah, I didn't really understand why we were hiding the -mips16 setting,
now I do.

I will see if I can figure out why we abort.  The clear_cache insn in
mips.md looks a bit odd to me, there is the part that is executed when
TARGET_SYNCI is true and then a part that is only executed if
mips_cache_flush_func is defined.  It looks like if
mips_cache_flush_func is not defined then we do nothing and I was
wondering if that is correct or not?  Should mips_cache_flush_func
being NULL be an error?  I am not even sure if you can make it NULL
given that it is given a default value in mips.opt.

My test case is:

void f()
{
  int size = 40;
  char *memory = __builtin_alloca(size);
  __builtin___clear_cache(memory, memory + size);
}

And the abort with -mips32r2 -mips16 -msynci is:

x.c: In function ‘f’:
x.c:6:1: error: unrecognizable insn:
 }
 ^
(jump_insn 22 21 38 2 (set (pc)
(if_then_else (eq (reg:SI 207)
(reg/f:SI 196 [ D.1415 ]))
(label_ref 33)
(pc))) x.c:5 -1
 (nil)
 -> 33)
x.c:6:1: internal compiler error: in extract_insn, at recog.c:2129
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

If I can't figure out what is going on I will file a bug report.

Steve Ellcey
sell...@mips.com

Re: [PATCH] Fix PR53970

2012-07-18 Thread John David Anglin

On Wed, 18 Jul 2012, Richard Guenther wrote:

> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> Richard.
> 
> 2012-07-18  Richard Guenther  
> 
>   PR tree-optimization/53970
>   * tree.h (contains_packed_reference): Remove.
>   * expr.c (contains_packed_reference): Likewise.
>   * tree-vect-data-refs.c (not_size_aligned): New function.

../../gcc/gcc/tree-vect-data-refs.c: In function ‘not_size_aligned’:
../../gcc/gcc/tree-vect-data-refs.c:1144:4: warning: comparison between signed a

This causes a bootstrap failure on hppa.

Dave
-- 
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)

Re: [patch v2] support for multiarch systems

2012-07-18 Thread John David Anglin

On Sun, 08 Jul 2012, Matthias Klose wrote:

> Please find attached v2 of the patch updated for trunk 20120706, x86 only,
> tested on x86-linux-gnu, KFreeBSD and the Hurd.

Currently, Debian gcc packages for hppa contain multiarch support.  Because
of this, I have used a multiarch patch for testing Debian Linux for hppa for
a long time.  My test system is multiarch.

It would make my life easier if the change included the multiarch bits for
hppa.  I imagine other Debian ports are in a similar situation.

Dave
-- 
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)

Re: [PATCH 1/2] gcc symbol database

2012-07-18 Thread Yunfeng ZHANG

To Dodji Seketeli:

Thanks for checking my patch; I will release it again later.

Yunfeng

Re: [C++ RFC / Patch] PR 51213 ("access control under SFINAE")

2012-07-18 Thread Jason Merrill


On 07/12/2012 07:06 PM, Jason Merrill wrote:

I notice that your patch changes the behavior of C++98/03 mode as well,
which seems wrong to me; I think this is a big enough change that we
should limit it to C++11 mode.


...except that I can't figure out what the semantics before this DR were 
really supposed to be; G++, EDG and Clang all handle this stuff 
differently.  After poking at it for a while I think the only sensible 
thing is to have C++03 work the same as C++11.  I've done that, and I'm 
going to check in the full patch.


The first attachment is the full patch; the second is the changes 
relative to your earlier patch.  I changed tf_error to 
tf_warning_or_error, went back to treating C++03 and C++11 the same, 
made the FNDECL_RECHECK_ACCESS_P change I mentioned, and streamlined 
LOOKUP_SPECULATIVE handling.
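
The common thread in these changes is that the access-check routines take a `complain` argument and report failure through their return value, so a caller in a SFINAE context can pass flags without tf_error and get a quiet failure. A minimal standalone model of that pattern follows; the flag values, the access enum and `enforce_access_model` are simplified stand-ins for illustration, not the real cp-tree.h definitions.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for tsubst_flags_t values.  */
enum complain_model
{
  tf_none_model = 0,
  tf_error_model = 1 << 0,
  tf_warning_model = 1 << 1,
  tf_warning_or_error_model = tf_error_model | tf_warning_model
};

enum access_model { ACCESS_PUBLIC, ACCESS_PROTECTED, ACCESS_PRIVATE };

/* Number of diagnostics emitted so far, so callers can observe silence.  */
static int diagnostics_emitted;

/* Return true if NAME is accessible.  On failure, print a diagnostic
   only when COMPLAIN includes the error bit; always return false.  */
static bool
enforce_access_model (const char *name, enum access_model access,
                      int complain)
{
  if (access == ACCESS_PUBLIC)
    return true;

  if (complain & tf_error_model)
    {
      fprintf (stderr, "%s is %s\n", name,
               access == ACCESS_PRIVATE ? "private" : "protected");
      diagnostics_emitted++;
    }
  return false;
}
```

A caller doing ordinary compilation would pass the warning-or-error value and get the diagnostic; a caller in a substitution context would pass the none value and simply see `false`, which it can turn into a deduction failure.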


The third attachment is to fix another problem with my earlier DR 1402 
patch.  I'm applying this one to 4.7 as well.


Finally, the fourth attachment is to remove the substitution checking 
from instantiate_decl, as it seems to be redundant.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 32570a0651e12c51bdc6a976b2f981792ea8e4c8
Author: Jason Merrill 
Date:   Tue Jul 17 14:10:31 2012 -0400

	DR 1170
	PR c++/51213
	* semantics.c (perform_access_checks): Add complain parm, return bool.
	(perform_deferred_access_checks): Likewise.
	(perform_or_defer_access_check): Likewise.
	(speculative_access_check): Remove.
	* call.c (enforce_access): Add complain parm, return bool.
	* decl.c, friend.c, class.c, init.c, parser.c: Adjust callers.
	* search.c: Adjust callers.
	* cp-tree.h (TINFO_RECHECK_ACCESS_P): New macro.
	(FNDECL_RECHECK_ACCESS_P): New macro.
	* method.c (synthesized_method_walk): Stop deferring access checks.
	* pt.c (recheck_decl_substitution): New.
	(instantiate_template_1): Set and check FNDECL_RECHECK_ACCESS_P.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 72394f4..5b3245f 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -5515,7 +5515,8 @@ build_op_delete_call (enum tree_code code, tree addr, tree size,
   /* If the FN is a member function, make sure that it is
 	 accessible.  */
   if (BASELINK_P (fns))
-	perform_or_defer_access_check (BASELINK_BINFO (fns), fn, fn);
+	perform_or_defer_access_check (BASELINK_BINFO (fns), fn, fn,
+   complain);
 
   /* Core issue 901: It's ok to new a type with deleted delete.  */
   if (DECL_DELETED_FN (fn) && alloc_fn)
@@ -5573,19 +5574,23 @@ build_op_delete_call (enum tree_code code, tree addr, tree size,
the declaration to use in the error diagnostic.  */
 
 bool
-enforce_access (tree basetype_path, tree decl, tree diag_decl)
+enforce_access (tree basetype_path, tree decl, tree diag_decl,
+		tsubst_flags_t complain)
 {
   gcc_assert (TREE_CODE (basetype_path) == TREE_BINFO);
 
   if (!accessible_p (basetype_path, decl, true))
 {
-  if (TREE_PRIVATE (decl))
-	error ("%q+#D is private", diag_decl);
-  else if (TREE_PROTECTED (decl))
-	error ("%q+#D is protected", diag_decl);
-  else
-	error ("%q+#D is inaccessible", diag_decl);
-  error ("within this context");
+  if (complain & tf_error)
+	{
+	  if (TREE_PRIVATE (decl))
+	error ("%q+#D is private", diag_decl);
+	  else if (TREE_PROTECTED (decl))
+	error ("%q+#D is protected", diag_decl);
+	  else
+	error ("%q+#D is inaccessible", diag_decl);
+	  error ("within this context");
+	}
   return false;
 }
 
@@ -6510,14 +6515,9 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 	access_fn = DECL_TI_TEMPLATE (fn);
   else
 	access_fn = fn;
-  if (flags & LOOKUP_SPECULATIVE)
-	{
-	  if (!speculative_access_check (cand->access_path, access_fn, fn,
-	 complain & tf_error))
-	return error_mark_node;
-	}
-  else
-	perform_or_defer_access_check (cand->access_path, access_fn, fn);
+  if (!perform_or_defer_access_check (cand->access_path, access_fn,
+	  fn, complain))
+	return error_mark_node;
 }
 
   /* If we're checking for implicit delete, don't bother with argument
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 82c28fa..96a7420 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -1189,7 +1189,8 @@ alter_access (tree t, tree fdecl, tree access)
 }
   else
 {
-  perform_or_defer_access_check (TYPE_BINFO (t), fdecl, fdecl);
+  perform_or_defer_access_check (TYPE_BINFO (t), fdecl, fdecl,
+ tf_warning_or_error);
   DECL_ACCESS (fdecl) = tree_cons (t, access, DECL_ACCESS (fdecl));
   return 1;
 }
@@ -7147,7 +7148,8 @@ resolve_address_of_overloaded_function (tree target_type,
   && DECL_FUNCTION_MEMBER_P (fn))
 {
   gcc_assert (access_path);
-  perform_or_defer_access_check (access_path, fn, fn);
+  perform_or_defer_access_check (access_path, fn, fn,
+ tf_warning_or_error);
 }
 
   if (TYPE_PTRFN_P (target_type) || TYPE_PTRMEMFUNC_P (target_

Re: PR53914, rs6000 constraints and reload queries

2012-07-18 Thread Alan Modra

Thanks very much Uli for verifying my conclusions about reload,
operand predicates and constraints, and particularly the general
unusability of the "o" constraint.

Re http://gcc.gnu.org/ml/gcc/2012-07/msg00142.html, this patch adds
the missing secondary reload patterns, corrects constraints I got
wrong ("?*d", not "*?d"), and fixes pr54009.

Uli said:
> An address involving pseudos should be
> considered "legitimate" if there exists an assignment of hard
> registers that makes it strictly legitimate (not if *any* such
> assignment would be strictly legitimate).  [ It might make sense
> in some cases to make the check stricter; for example if we know
> that an address would nearly always require a reload, we might
> choose to completely reject it if that actually increases performance.
> But that would be just performance tuning, not required for
> correctness ... ]
So there is quite a bit more work in rs6000.c to fully implement this.
See ??? comments that I added on code handling lo_sum, and I'll admit
to not even trying to relax rs6000_legitimate_offset_address_p
conditions for e500.  That can wait for another day.  The patch is
large enough already.

Some notes:
- word_offset_memref_operand isn't used as a predicate and as both Uli
and I noted, constraints calling predicates lead to trouble with
reload_legitimize_address output.  So move it out of predicates.md to
rs6000.c (renamed as mem_operand_gpr and without checks more suited to
predicates).
- where I changed a bunch of mode tests to GET_MODE_SIZE checks, the
original mode list missing TImode is irrelevant for 32-bit, since
TImode isn't supported on 32-bit (Why do we have 32-bit TImode insns?)
- reordering insn alternatives in some cases is cosmetic.  As the
comments say, putting r->Y and Y->r before r->r is necessary, but
reordering d->m,m->d,d->d isn't strictly necessary.  I did that for
consistency, and future proofing should the m constraint need to be
changed.  Putting r->Y before Y->r is also cosmetic but I prefer it
that way for insns that land in reload as pseudo->pseudo ie. mem->mem,
where both load and store alternatives match with reloading.  I think
it's nicer to choose input reloads rather than output reloads, so put
the store first.
- I haven't actually seen the 32-bit gpr secondary reload patterns
trigger (it's hard to make a testcase), so that code is largely
untested.  Fortunately the code is very similar to the 64-bit gpr
secondary reload code.
- movdf_hardfloat32 insn lengths looked wrong to me, so I fixed that.
gpr load and store ought to be just two insns, not four.  I also took
out the ?? kludge since the offsettable address problem is now fixed.
- I don't really like disparaging fprs in a number of DImode insns,
but without that reload prefers to reload inputs.  So you get code
like stw 10,xxx(1); stw 11,xxx+4(1); lfd 0,xxx(1); stfd 0,32764(9);
rather than addi 9,9,32764; stw 10,0(9); stw 11,4(9);  The former is
slower and requires a stack frame.
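
The last note turns on the 16-bit signed displacement of D-form loads and stores: a single stfd at 32764(9) is fine, but a pair of stw would need displacements 32764 and 32768, and the latter does not fit. A standalone sketch of that arithmetic follows. `offsets_fit` is a name invented here, not a function from the patch, and it deliberately ignores the DS-form alignment rules that apply to 64-bit ld/std.

```c
#include <stdbool.h>

/* Return true if every register-sized piece of a SIZE-byte access at
   OFFSET(base) can be addressed with a 16-bit signed displacement, when
   the access is split into REG_SIZE-byte loads or stores.  */
static bool
offsets_fit (long offset, unsigned size, unsigned reg_size)
{
  /* Displacement of the last piece; nothing extra when one insn suffices.  */
  long last = offset + (size > reg_size ? (long) (size - reg_size) : 0);

  return offset >= -32768 && last <= 32767;
}
```

Under these assumptions `offsets_fit (32764, 8, 8)` holds (one 8-byte fpr store) while `offsets_fit (32764, 8, 4)` does not (two 4-byte gpr stores), which is why the gpr sequence needs the addi first.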

Bootstrapped and regression tested powerpc-linux.  OK to apply?

PR target/53914
PR target/54009
* config/rs6000/constraints.md (Y): Use mem_operand_gpr.
(wY): New constraint using mem_operand_fpr.
* config/rs6000/predicates.md (word_offset_memref_operand): Delete.
Adjust all rs6000_legitimate_offset_address_p calls.
* config/rs6000/rs6000-protos.h (mem_operand_gpr): Declare.
(mem_operand_fpr, rs6000_secondary_reload_gpr): Declare.
(rs6000_legitimate_offset_address_p): Update prototype.
(rs6000_offsettable_memref_p): Delete.
(rs6000_secondary_reload_ppc64): Delete.
* config/rs6000/rs6000.c (address_offset): New function.
(mem_operand_gpr, mem_operand_fpr): Likewise.
(rs6000_legitimate_offset_address_p): Add worst_case param.  When
not worst_case assume class of regs with least restrictive offsets.
Adjust all calls.
(legitimate_lo_sum_address_p): Simplify register mode tests.
(rs6000_legitimize_address): Likewise.  Assume best case offset
addressing.  Combine ELF and MACHO lo_sum code.
(rs6000_mode_dependent_address): Correct offset addressing limits.
(rs6000_offsettable_memref_p): Make static, add reg_mode param.
Use reg_mode to help rs6000_legitimate_offset_address_p.
(rs6000_secondary_reload): Use address_offset.  Handle 32-bit multi
gpr load/store when offset too large.
(rs6000_secondary_reload_gpr): Renamed rs6000_secondary_reload_ppc64.
(rs6000_split_multireg_move): Adjust rs6000_offsettable_memref_p calls.
* config/rs6000/rs6000.md (movdf_hardfloat32): Use 'Y' constraint
for gpr load/store.  Order alternatives as r->Y,Y->r,r->r and
d->m,m->d,d->d.  Correct size of gpr load/store.
(movdf_softfloat32): Use 'Y' constraint for gpr load/store.  Order
alternatives.
(movti_ppc64): Likewise.
(movdi_internal32): Likewise.  Al

Re: [patch v2] support for multiarch systems

2012-07-18 Thread David Miller

From: John David Anglin 
Date: Wed, 18 Jul 2012 21:12:26 -0400

> I imagine other Debian ports are in a similar situation.

GCC hacking has been extremely painful on sparc for me because of this
debian multiarch situation, so yes I'm in this group as well.

[PATCH] Intrinsics for ADCX, ADOX, RDSEED and PREFETCHW

2012-07-18 Thread Michael Zolotukhin

Hi,
This patch adds new intrinsics for new ADCX, ADOX, RDSEED and
PREFETCHW instructions, introduced here:
http://software.intel.com/en-us/avx/

Bootstrapped on x86-64, testing is in progress.

Is it ok for trunk?

Changelog entry:
2012-07-17  Michael Zolotukhin  

* common/config/i386/i386-common.c (OPTION_MASK_ISA_RDSEED_SET): New.
(OPTION_MASK_ISA_ADX_SET): Likewise.
(OPTION_MASK_ISA_PRFCHW_SET): Likewise.
(OPTION_MASK_ISA_RDSEED_UNSET): Likewise.
(OPTION_MASK_ISA_ADX_UNSET): Likewise.
(OPTION_MASK_ISA_PRFCHW_UNSET): Likewise.
(ix86_handle_option): Handle mrdseed, madx and mprfchw options.
* config.gcc (i[34567]86-*-*): Add rdseedintrin.h, adxintrin.h and
prfchwintrin.h.
(x86_64-*-*): Likewise.
* config/i386/adxintrin.h: New header.
* config/i386/prfchwintrin.h: Likewise.
* config/i386/rdseedintrin.h: Likewise.
* config/i386/cpuid.h (bit_RDSEED): New.
(bit_ADX): Likewise.
(bit_PRFCHW): Likewise.
(bit_BMI): Formatting fix.
(bit_HLE): Likewise.
(bit_RTM): Likewise.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect ADCX/ADOX,
RDSEED and PREFETCHW support.
* config/i386/i386-builtin-types.def
(UCHAR_FTYPE_UCHAR_UINT_UINT_PINT): New function type.
(UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT): Likewise.
* config/i386/i386-c.c: Define __RDSEED__, __ADX__, and __PRFCHW__ if
needed.
* config/i386/i386.c (ix86_target_string): Define -mrdseed, -madx,
-mprfchw options. Formatting fixes.
(PTA_HLE): Formatting fix.
(PTA_RDSEED): New.
(PTA_ADX): Likewise.
(PTA_PRFCHW): Likewise.
(ix86_option_override_internal): Handle new options.
(ix86_valid_target_attribute_inner_p): Add OPT_mrdseed, OPT_madx, and
OPT_mprfchw.
(ix86_builtins): Add IX86_BUILTIN_ADDXCARRY32,
IX86_BUILTIN_ADDXCARRY64, IX86_BUILTIN_RDSEED16,
IX86_BUILTIN_RDSEED32, IX86_BUILTIN_RDSEED64.
(ix86_init_mmx_sse_builtins): Define corresponding built-ins.
(ix86_expand_builtin): Handle these built-ins.
(ix86_expand_args_builtin): Handle new function types.
* config/i386/i386.h (TARGET_RDSEED): New.
(TARGET_ADX): Likewise.
(TARGET_PRFCHW): Likewise.
* config/i386/i386.md (UNSPEC_ADCX): New.
(UNSPEC_RDSEED): Likewise.
(attributes): Add rdseed, adx, prfchw.
(adcx): New define_insn.
(rdseed): Likewise.
(prefetch): Enable for TARGET_PRFCHW.
(prefetchw_): New define_insn for write-prefetch.
(prefetch_3dnow_): Keep only read-prefetch here.
* config/i386/i386.opt (mrdseed): New.
(madx): Likewise.
(mprfchw): Likewise.
* config/i386/mm3dnow.h: Move _m_prefetchw from here to
prfchwintrin.h.
* config/i386/x86intrin.h: Include prfchwintrin.h, rdseedintrin.h,
adxintrin.h.

testsuite/Changelog entry:
2012-07-17  Michael Zolotukhin  

* gcc.target/i386/adx-addxcarry32-1.c: New.
* gcc.target/i386/adx-addxcarry32-2.c: New.
* gcc.target/i386/adx-addxcarry64-1.c: New.
* gcc.target/i386/adx-addxcarry64-2.c: New.
* gcc.target/i386/adx-check.h: New.
* gcc.target/i386/i386.exp: New.
* gcc.target/i386/prefetchw-1.c: New.
* gcc.target/i386/rdseed16-1.c: New.
* gcc.target/i386/rdseed32-1.c: New.
* gcc.target/i386/rdseed64-1.c: New.
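
For readers unfamiliar with ADCX/ADOX: both add two operands plus a carry-in and produce a carry-out, which is what makes them useful for multi-word arithmetic. The function type UCHAR_FTYPE_UCHAR_UINT_UINT_PINT above suggests a built-in that takes a carry-in, two operands and an output pointer, and returns the carry-out. A portable C model of that arithmetic follows; the `_model` names are invented here and this is not the intrinsic from adxintrin.h.

```c
#include <stdint.h>

/* Compute A + B + CARRY_IN, store the low 32 bits in *SUM_OUT and
   return the carry out of bit 31 (0 or 1).  */
static unsigned char
addcarry_u32_model (unsigned char carry_in, uint32_t a, uint32_t b,
                    uint32_t *sum_out)
{
  uint64_t wide = (uint64_t) a + b + (carry_in ? 1 : 0);

  *sum_out = (uint32_t) wide;
  return (unsigned char) (wide >> 32);
}

/* Add two 64-bit numbers held as little-endian pairs of 32-bit limbs,
   chaining the carry between limbs; return the final carry-out.  */
static unsigned char
add_2limb_model (const uint32_t a[2], const uint32_t b[2], uint32_t out[2])
{
  unsigned char c = addcarry_u32_model (0, a[0], b[0], &out[0]);

  return addcarry_u32_model (c, a[1], b[1], &out[1]);
}
```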

-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.


adx_rdseed_prefetchw_intrin.patch
Description: Binary data
