[PING] [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM

2013-11-14 Thread Bernd Edlinger
Hello,

could you please approve this patch: 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02664.html

As it looks like, it fixes the problem reported under PR59134,
which is the expander enters infinite recursion while it tries to violate the 
C++ memory model.

The memory context has still the structure mode, which is QImode, and due to 
this patch the
memory accesses are no longer tried in SImode but in QImode, which is fixes the 
ICE here.

Maybe I should add the test case from PR59134 ?


Thanks
Bernd.

>
> Hello,
>
> meanwhile, I have added a test case to that patch.
>
> Boot-strapped and regression-tested as usual.
>
> OK for trunk?
>
> Bernd.
>
>> Hi,
>>
>> On Fri, 25 Oct 2013 11:26:20, Richard Biener wrote:
>>>
>>> On Fri, Oct 25, 2013 at 10:40 AM, Bernd Edlinger
>>>  wrote:
 Hello,

 this patch fixes the recently discovered data store race on arm-eabi-gcc 
 with -fno-strict-volatile-bitfields
 for structures like this:

 #define test_type unsigned short

 typedef struct s{
 unsigned char Prefix[1];
 test_type Type;
 }__attribute((__packed__,__aligned__(4))) ss;

 volatile ss v;

 void __attribute__((noinline))
 foo (test_type u)
 {
 v.Type = u;
 }

 test_type __attribute__((noinline))
 bar (void)
 {
 return v.Type;
 }


 I've manually confirmed the correct code generation using variations of the
 example above on an ARM cross-compiler for -fno-strict-volatile-bitfields.

 Note, that this example is still causes ICE's for 
 -fstrict-volatile-bitfields,
 but I'd like to fix that separately.

 Boot-strapped and regression-tested on x86_64-linux-gnu.

 Ok for trunk?
>>>
>>> Isn't it more appropriate to fix it here:
>>>
>>> if (TREE_CODE (to) == COMPONENT_REF
>>> && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
>>> get_bit_range (&bitregion_start, &bitregion_end, to, &bitpos, &offset);
>>>
>>> ?
>>>
>>
>> Honestly, I'd call this is a work-around, not a design.
>>
>> Therefore I would not move that workaround to expr.c.
>>
>> Also the block below is only a work-around IMHO.
>>
>> if (MEM_P (str_rtx) && bitregion_start> 0)
>> {
>> enum machine_mode bestmode;
>> HOST_WIDE_INT offset, size;
>>
>> gcc_assert ((bitregion_start % BITS_PER_UNIT) == 0);
>>
>> offset = bitregion_start / BITS_PER_UNIT;
>> bitnum -= bitregion_start;
>> size = (bitnum + bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
>> bitregion_end -= bitregion_start;
>> bitregion_start = 0;
>> bestmode = get_best_mode (bitsize, bitnum,
>> bitregion_start, bitregion_end,
>> MEM_ALIGN (str_rtx), VOIDmode,
>> MEM_VOLATILE_P (str_rtx));
>> str_rtx = adjust_bitfield_address_size (str_rtx, bestmode, offset, size);
>> }
>>
>> Here, if bitregion_start = 8, we have a 4 byte aligned memory context,
>> and whoops, now it is only 1 byte aligned.
>>
>> this example:
>>
>> struct s
>> {
>> char a;
>> int b:24;
>> };
>>
>> struct s ss;
>>
>> void foo(int b)
>> {
>> ss.b = b;
>> }
>>
>>
>> gets compiled (at -O3) to:
>>
>> foo:
>> @ Function supports interworking.
>> @ args = 0, pretend = 0, frame = 0
>> @ frame_needed = 0, uses_anonymous_args = 0
>> @ link register save eliminated.
>> ldr r3, .L2
>> mov r1, r0, lsr #8
>> mov r2, r0, lsr #16
>> strb r1, [r3, #2]
>> strb r0, [r3, #1]
>> strb r2, [r3, #3]
>> bx lr
>>
>> while...
>>
>> struct s
>> {
>> char a;
>> int b:24;
>> };
>>
>> struct s ss;
>>
>> void foo(int b)
>> {
>> ss.b = b;
>> }
>>
>>
>> gets compiled (at -O3) to
>>
>> foo:
>> @ Function supports interworking.
>> @ args = 0, pretend = 0, frame = 0
>> @ frame_needed = 0, uses_anonymous_args = 0
>> @ link register save eliminated.
>> ldr r3, .L2
>> mov r2, r0, lsr #16
>> strb r2, [r3, #2]
>> strh r0, [r3] @ movhi
>> bx lr
>>
>> which is more efficient, but only because the memory context is still
>> aligned in this case.
>>
>>> Btw, the C++ standard doesn't cover packed or aligned attributes so
>>> we could declare this a non-issue. Any opinion on that?
>>>
>>> Thanks,
>>> Richard.
>>>
 Thanks
 Bernd.   

Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2013 at 06:29:50PM -0700, Aldy Hernandez wrote:
> >Well, if you don't change anything in omp-low.c, then it wouldn't diagnose
> >setjmp call in #pragma simd, but given that also the OpenMP 4.0 spec
> >requires that #pragma omp simd doesn't contain calls to setjmp or longjmp
> >(ditto for #pragma omp declare simd functions), then scan_omp_1_stmt
> >should be changed to also call check_omp_nesting_restrictions for
> >setjmp/longjmp calls (the GIMPLE_CALL case then in
> >check_omp_nesting_restrictions can't assume all calls it sees are
> >BUILT_IN_NORMAL).
> 
> Fixed in scan_omp_1_stmt.

Well, setjmp_call_p is not just setjmp, but various other functions,
including getcontext, fork, vfork and many others, but it isn't longjmp.
I'd say we should just follow the spec and look solely for setjmp/longjmp,
for the others perhaps we can warn (though I think it isn't a big deal,
we are never going to vectorize those), but not error.

> >Perhaps some bool is_cilkplus = false argument to
> >cp_parser_omp_clause_reduction would work for me (and for C too).
> 
> Ok, I'm at a loss here, what parts of cp_parser_omp_clause_reduction
> are the user-defined reductions?  I'm an OpenMP weenie.

I guess it depends on what the Cilk+ spec says about reduction clause,
and from what I saw it is just too vague.
http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/compiler/cpp-win/index.htm#GUID-44B505B6-01AF-4865-8DF4-AF851F51DDA1.htm
just mentions
reduction(oper:var1 [,var2]…) oper is a reduction operator.

So, the question is which reduction operators does it allow, for what
types, and what is the exact grammar for oper in that case.

E.g. shall it only allow the +/-/*/&/&&/|/|| that OpenMP 2.5 had?
Or also min/max that OpenMP 3.1 added?
Shall it (for C++) allow stuff like reduction(operator +:var1) ?
And UDRs?  Shall it allow something OpenMP doesn't allow?

Depending to answer to those questions, the changes will differ.
E.g. if you only allow the OpenMP 2.5 stuff, then you'd fail in
cp_parser_omp_clause_reduction in
switch (cp_lexer_peek_token (parser->lexer)->type)'s
default: break; - default: if (is_cilkplus) goto resync_fail; break;
with some error message.  If you want also min/max, you'd need
to add the fail after min/max recognition, but before recognition
of operator XYZ, if you want to allow even that, but not UDRs,
you'd fail before id = omp_reduction_id (code, id, NULL_TREE); if
code is still ERROR_MARK.

Jakub


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-14 Thread Jeff Law

On 11/14/13 15:19, Diego Novillo wrote:

On Thu, Nov 14, 2013 at 5:12 PM, Jeff Law  wrote:

On 11/14/13 13:28, Diego Novillo wrote:


Functions in each corresponding .c file got moved to those
headers and others that already existed. I wanted to make this
patch as mechanical as possible, so I made no attempt to fix
problems like having build_addr defined in tree-inline.c. I left
that for later.


This seems backwards to me and just ensures double-churn. Once to move it
now, then again to its final resting spot.

If this change is being made via some automated script, then, well, I guess
it is what it is and we'll have to come back to them.  But if you're doing
this by hand it seems to me that leaving it in its original location,
possibly grouped with its friends, with a FIXME would be better.


Most of it was automated.  I want to stage it in, and I worked pretty
hard at not making additional changes. Particularly since it is not
clear where we will want some of these functions to end up in.  So, we
will still need several passes.  Making each pass self contained makes
sense to me.

OK, I won't object to this part.





- Some header files always need another header file. I chose to
#include that header in the file. At this stage we want to do
the opposite, but this would've added even more bulk to the
change, so I left a FIXME marker for the next pass.


This seems a bit like a mistake.  How much of this patch would be blocked if
we didn't allow this right now.


A good chunk.  I'm doing these FIXMEs in the next sequence of patches,
so we won't have them for long. Again, I was going for an orderly
transition here.
However, I'm much more concerned about this.  It really feels like a 
step backwards.


jeff


[RFA][PATCH] Fix an ia64 Ada bootstrap problem

2013-11-14 Thread Jeff Law



I'm still unable to bootstrap the ia64 port if I back out Kirill's most 
recent patch.


The erroneous-path-isolation code has exposed a latent bug in the RTL 
if-conversion pass which runs after combine (ie no longer in cfglayout 
mode).


We have this as we leave combine:





(insn 107 106 108 10 (set (reg:BI 443)
(eq:BI (reg/v/f:DI 399 [ gnu_expr ])
(const_int 0 [0]))) 
../../gcc/gcc/ada/gcc-interface/decl.c:6238 311 {*cmpdi_normal}

 (nil))
(jump_insn 108 107 109 10 (set (pc)
(if_then_else (ne (reg:BI 443)
(const_int 0 [0]))
(label_ref 424)
(pc))) ../../gcc/gcc/ada/gcc-interface/decl.c:6238 318 
{*br_true}

 (expr_list:REG_DEAD (reg:BI 443)
(int_list:REG_BR_PROB 5359 (nil)))
 -> 424)

[ lots of insns ]

(code_label 424 414 425 52 1553 "" [1 uses])
(note 425 424 427 52 [bb 52] NOTE_INSN_BASIC_BLOCK)
(insn 427 425 430 52 (set (reg:HI 581 [ MEM[(union tree_node 
*)0B].base.code ])
(mem/v/j:HI (reg/v/f:DI 399 [ gnu_expr ]) [0 MEM[(union 
tree_node *)0B].base.code+0 S2 A128])) 
../../gcc/gcc/ada/gcc-interface/decl.c:6246 4 {movhi_internal}

 (expr_list:REG_DEAD (reg/v/f:DI 399 [ gnu_expr ])
(expr_list:REG_UNUSED (reg:HI 581 [ MEM[(union tree_node 
*)0B].base.code ])
(expr_list:REG_EQUAL (mem/v/j:HI (const_int 0 [0]) [0 
MEM[(union tree_node *)0B].base.code+0 S2 A128])

(nil)
(insn 430 427 476 52 (trap_if (const_int 1 [0x1])
(const_int 0 [0])) 363 {*trap}
 (nil))


Something deletes/moves insn 427 out of the way (most likely predicated 
and shoved into bb10).


After that point bb52 just contains the trap.




We get into ifcvt.c::find_cond_trap.

We determine that trap_bb is else_bb, insert the conditional trap into 
test_bb and cleanup/remove the trap block.  All that's left to do is 
cleanup test_bb.  Remember that the THEN edge is canonically the 
fallthru edge so in this case when the branch was taken it reached the trap.


That cleanup code looks like:


 /* Wire together the blocks again.  */
  if (current_ir_type () == IR_RTL_CFGLAYOUT)
single_succ_edge (test_bb)->flags |= EDGE_FALLTHRU;
  else
{
  rtx lab, newjump;

  lab = JUMP_LABEL (jump);
  newjump = emit_jump_insn_after (gen_jump (lab), jump);
  LABEL_NUSES (lab) += 1;
  JUMP_LABEL (newjump) = lab;
  emit_barrier_after (newjump);
}
  delete_insn (jump);


We're running after combine and thus we're not in cfglayout mode.

The code creates a new unconditional jump to the label referenced in the 
original conditional jump.  We then emit the new unconditional jump into 
the IL and delete the conditional jump.  That is fine if the trap was in 
the then (fallthru) block.


But that makes *no* sense when the trap is in the else block.  The label 
has been deleted from the insn chain and more importantly, we want to 
fallthru if we do not trap!


Thankfully the the CFG checking code detected this inconsistency.  It's 
been latent since 2002!  Clearly we aren't doing a lot of optimizing 
conditional jumps over/to traps into conditional traps!


Anyway, the fix is trivial.  When trap_bb == then_bb, run the code as 
is.  When trap_bb == else_bb we only want to remove the conditinoal jump 
as we want to fallthru if the conditional trap doesn't trigger.


With this patch applied and Kirill's patch removed, I can almost 
bootstrap the ia64 port with Ada enabled (comparison failure that AFAICT 
is not related to the isolate-erroneous-paths optimization)




OK for the trunk if it passes a bootstrap & regtest on x86_64 overnight?

Jeff

ps.  An Itanic with 108 processors is still, well, an Itanic.
* ifcvt.c (find_cond_trap): Properly handle case where
trap_bb == else_bb.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index fafff9d..17d26c5 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3694,7 +3694,7 @@ find_cond_trap (basic_block test_bb, edge then_edge, edge 
else_edge)
   /* Wire together the blocks again.  */
   if (current_ir_type () == IR_RTL_CFGLAYOUT)
 single_succ_edge (test_bb)->flags |= EDGE_FALLTHRU;
-  else
+  else if (trap_bb == then_bb)
 {
   rtx lab, newjump;
 


Re: Clean up configure glibc version detection, add --with-glibc-version

2013-11-14 Thread Jeff Law

On 11/06/13 11:33, Joseph S. Myers wrote:

When bootstrapping a cross toolchain with GCC and glibc, it's
desirable to keep down the number of GCC builds needed: to be able to
build an initial static-only C-only GCC, use that to build glibc, and
then build the full compiler with the resulting glibc.  The aim is
that if glibc were then rebuilt with the full compiler, the results
would be identical to the glibc built with the initial compiler.  (See
 for more
on how ideally this might work; really it should only be target
libraries that need rebuilding, not the compiler at all.)

Understandable.


Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  OK to
commit?

2013-11-06  Joseph Myers  

* acinclude.m4 (GCC_GLIBC_VERSION_GTE_IFELSE): New configure
macro.
* configure.ac: Determine target_header_dir earlier.
(--with-glibc-version): New configure option.
Use GCC_GLIBC_VERSION_GTE_IFELSE in enable_gnu_unique_object,
gcc_cv_libc_provides_ssp and gcc_cv_target_ldbl128 tests.
* configure: Regenerate.
* doc/install.texi (--enable-gnu-unique-object): Don't refer to
native toolchains for default.
(--with-glibc-version): Document.
Seems reasonable, and as long as you've verified the fallbacks work it's 
fine for the trunk.



Jeff



Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-14 Thread H.J. Lu
On Thu, Nov 14, 2013 at 1:16 PM, H.J. Lu  wrote:
> On Thu, Nov 14, 2013 at 6:16 AM, Arnaud Charlet  wrote:
>>> I also changed s-osinte-posix.adb and s-osprim-posix.adb
>>> for x32.  They aren't Linux specific.  What should I do with
>>> them?
>>
>> I would use the time_t type defined in s-osinte* (all POSIX implementations
>> of s-osinte* have such definition, or if they don't, it's easy to add), and
>> in the s-osinte-linux version we can have a renaming:
>>
>>subtype time_t is System.Linux.time_t
>>
>> and in System.Linux have either:
>>
>>type time_t is new Long_Integer;
>>
>> or
>>
>>type time_t is new Long_Long_Integer;
>>
>> depending on the variant.
>>
>> Arno
>
> Another problem.  s-osprim-posix.adb has
>
>--  ??? These definitions are duplicated from System.OS_Interface
>--  because we don't want to depend on any package. Consider removing
>--  these declarations in System.OS_Interface and move these ones in
>--  the spec.
>
> I can't use time_t from s-osinte-linux.ads since System.OS_Interface
> isn't available. What should I do?
>

This is what I got.  I added s-posix-time.ads which declares
System.OS_Time.time_t.  I use it instead long for time_t.  I
didn't add time_t to s-linux.ads since it isn't used by
s-osprim-posix.adb.

It passes all tests with -m32, -mx32 and -m64 on Linux/x86-64
I don't know if I do it right.  If it isn't right, please tell me exactly
how to fix it since I don't know Ada.

Thanks.

-- 
H.J.
---
2013-11-14  H.J. Lu  

PR ada/54040
* s-osinte-linux.ads (time_t): Replace long with
System.OS_Time.time_t.
(timespec): Replace long with time_t.
* s-osinte-posix.adb (To_Timespec): Likewise.
* s-osprim-posix.adb (time_t): Replace Long_Integer with
System.OS_Time.time_t.
(timespec): Replace Long_Integer with time_t.
(timeval): Likewise.
(To_Timespec): Likewise.
* s-posix-time-x32.ads: New file.
* s-posix-time.ads: Likewise.
* s-taprop-linux.adb (timeval): Replace C.long with
System.OS_Time.time_t.
* gcc-interface/Makefile.in (LIBGNAT_TARGET_PAIRS): Add
s-ostime.adsdiff --git a/gcc/ada/gcc-interface/Makefile.in 
b/gcc/ada/gcc-interface/Makefile.in
index 91778c5..18d3974 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -1007,6 +1007,7 @@ ifeq ($(strip $(filter-out arm% 
linux-androideabi,$(target_cpu) $(target_os))),)
   s-inmaop.adb S,
-   tv_nsec => long (Long_Long_Integer (F * 10#1#E9)));
+   tv_nsec => time_t (Long_Long_Integer (F * 10#1#E9)));
end To_Timespec;
 
 end System.OS_Interface;
diff --git a/gcc/ada/s-osprim-posix.adb b/gcc/ada/s-osprim-posix.adb
index e03a132..a3717a7 100644
--- a/gcc/ada/s-osprim-posix.adb
+++ b/gcc/ada/s-osprim-posix.adb
@@ -31,6 +31,8 @@
 
 --  This version is for POSIX-like operating systems
 
+with System.OS_Time;
+
 package body System.OS_Primitives is
 
--  ??? These definitions are duplicated from System.OS_Interface
@@ -38,11 +40,11 @@ package body System.OS_Primitives is
--  these declarations in System.OS_Interface and move these ones in
--  the spec.
 
-   type time_t is new Long_Integer;
+   type time_t is new System.OS_Time.time_t;
 
type timespec is record
   tv_sec  : time_t;
-  tv_nsec : Long_Integer;
+  tv_nsec : time_t;
end record;
pragma Convention (C, timespec);
 
@@ -54,7 +56,7 @@ package body System.OS_Primitives is
---
 
function Clock return Duration is
-  type timeval is array (1 .. 2) of Long_Integer;
+  type timeval is array (1 .. 2) of time_t;
 
   procedure timeval_to_duration
 (T: not null access timeval;
@@ -118,7 +120,7 @@ package body System.OS_Primitives is
 
   return
 timespec'(tv_sec  => S,
-  tv_nsec => Long_Integer (Long_Long_Integer (F * 10#1#E9)));
+  tv_nsec => time_t (Long_Long_Integer (F * 10#1#E9)));
end To_Timespec;
 
-
diff --git a/gcc/ada/s-posix-time-x32.ads b/gcc/ada/s-posix-time-x32.ads
new file mode 100644
index 000..2f71869
--- /dev/null
+++ b/gcc/ada/s-posix-time-x32.ads
@@ -0,0 +1,44 @@
+--
+--  --
+--GNU ADA RUN-TIME LIBRARY (GNARL) COMPONENTS   --
+--  --
+--  S Y S T E M .  O S _ T i m e--
+--  --
+--  S p e c --
+--  --
+--  Copyright (C) 2013, Free Software Foundation, Inc.  --
+--  --
+-- G

Re: [PATCH, i386] Fix -mpreferred-stack-boundary

2013-11-14 Thread Sriraman Tallam
On Wed, Nov 13, 2013 at 5:12 PM, Bernd Edlinger
 wrote:
> On Tue, 12 Nov 2013 17:50:27, Sriraman Tallam wrote:
>>
>> On Tue, Nov 12, 2013 at 5:17 PM, Sriraman Tallam  wrote:
>>> On Tue, Nov 12, 2013 at 2:53 PM, Bernd Edlinger
>>>  wrote:
 Hi,


 On Tue, 12 Nov 2013 10:30:16, Sriraman Tallam wrote:
>
> On Mon, Nov 11, 2013 at 11:30 PM, Uros Bizjak  wrote:
>> There was something wrong with Bernd's address, retrying.
>>
 Currently on trunk the option -mpreferred-stack-boundary does not work
 together with #pragma GCC target("sse") or 
 __attribute__((target("sse"))).

 There is already a test case that detects this: 
 gcc.target/i386/fastcall-sseregparm.c

 The attached patch fixes this test case under i686-pc-linux-gnu.

 Boot-strapped and regression-tested under i686-pc-linux-gnu.

 OK for trunk?
>>>
>>> No, this is not what I had in mind. This is simply reverting my
>>> refactoring work which was to make ix86_option_override_internal get
>>> rid of the global_options dependency. Here is the problem:
>>> global_options gets some flags set after command-line options are read
>>> (ix86_preferred_stack_boundary_arg in this case). But, this does not
>>> get saved into target_option_default_node because there is no
>>> corresponding field in cl_target_option for
>>> ix86_preferred_stack_boundary_arg. So, when you restore
>>> target_option_default_node to func_options in
>>> ix86_valid_target_attribute_p, this particular flag does not get
>>> copied. So, you can either copy this explicitly to func_options which
>>> was your first patch or you could extend cl_target_option to include
>>> this field too which is done by making
>>> ix86_preferred_stack_boundary_arg a Variable in i386.opt. The latter
>>> is cleaner because it always saves the default flags into
>>> target_option_default_node.
>>
>> I quickly hacked up what I had in mind and attached the patch. Can you
>> check if this fixes your problem?
>>
>> Thanks
>> Sri
>>
>>
>
> Well, this way it could be fixed too.
>
> But opts->x_ix86_preferred_stack_bounary_arg is not dependent on any
> pragma or target attribute. Correct me if that is wrong.

That seems correct.

>
> And this code
>
>   if (opts_set->x_ix86_preferred_stack_boundary_arg)
> {
>   int min = (TARGET_64BIT_P (opts->x_ix86_isa_flags)
>  ? (TARGET_SSE_P (opts->x_ix86_isa_flags) ? 4 : 3) : 2);
>   int max = (TARGET_SEH ? 4 : 12);
>
>   if (opts->x_ix86_preferred_stack_boundary_arg < min
>   || opts->x_ix86_preferred_stack_boundary_arg> max)
>
> checks func_options against global_options_set:
>
>   new_target = ix86_valid_target_attribute_tree (args, &func_options,
>  &global_options_set);
>
> So this code as it is will fail if this option was ever made target specific.

This is correct. But, right now global_options_set is capturing only
the command line options that are set and does not seem to be
modified. If this option were to be made target specific we should not
access this field off global_options_set. We should add a MASK to
target flags and get it from there just like any other target flag
that is function specific does it.

> There is still a reason why this check needs to be executed each time
> the opts->x_ix86_isa_flags changes.
>
> Because of this I still would prefer my second attempt of fixing this issue,
> because it is simple and it removes the different handling between
> -mpreferred-stack-boundary and -mincoming-stack-boundary.

I understand your problem better now. I still do not think we should
make ix86_option_override_internal should read global_options flags
directly. That is overriding opts passed in as a parameter. I am fine
with Patch 1 which is explicitly copying global_options
preferred_stack_boundary_arg fields onto func_options. FYI, I cannot
approve any patches and you still have to get it approved by the
maintainers. I will sweep these copies and save it into
cl_target_option as a cleanup if more of these emerge.

Thanks for the patience,
Sri


>
> If that should be re-factored for any reason, I think all similar options
> should be changed on one sweep. But in that case the global_options_set
> must somehow also become target specific.
> And we need to invent something like a target pragma to change this options
> because it must somehow be possible to test this code.
>
> Thanks
> Bernd.
>
>>>
>>> Thanks
>>> Sri
>>>
>>>
>>>
>>> I'm not experienced enough in this new option handling stuff, let's
>>> ask Sriraman for his opinion on the patch.
>
>
> I do not think this is the right fix, I am wondering how many other
> target flags we may have to copy this way from global_options. I
> notice that other flags like ix86_regparm and
> ix86_incoming_stack_boundary_arg are very similar. Why should this
> need to be restore

Re: Recent Go patch broke Alpha bootstrap

2013-11-14 Thread Ian Lance Taylor
On Wed, Nov 13, 2013 at 7:25 AM, Uros Bizjak  wrote:
> On Tue, Nov 12, 2013 at 8:52 AM, Uros Bizjak  wrote:
>
 panic: runtime error: invalid memory address or nil pointer dereference
 [signal 0xb code=0x1 addr=0x1c]
>>
 FAIL: runtime/pprof
 gmake[2]: *** [runtime/pprof/check] Error 1

 This one is new, I have to look into it a bit deeper.
>>>
>>>
>>> I don't know what is happening here.  I can't recreate it.  There was
>>> a different problem that could arise in runtime/pprof, that was fixed
>>> by a patch I submitted on Saturday
>>> (http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01016.html).  So it's
>>> possible that this is fixed now.
>>
>> The failure is specific to !USING_SPLIT_STACK targets:
>
> The same error triggered on CentOS 5.10 x86_64 (another
> !USING_SPLIT_STACK target) for 32bit lib (net, runtime). The panic:
> string is the same, only addr=0x9f. There are also a couple of
> segfaults (database/sql, net/http) and abort in sync/atomic.

Could you check to see if this patch fixes the problem?  Thanks.

Ian
diff -r 9b2a1ae08a21 libgo/runtime/proc.c
--- a/libgo/runtime/proc.c	Thu Nov 14 14:29:49 2013 -0800
+++ b/libgo/runtime/proc.c	Thu Nov 14 18:42:54 2013 -0800
@@ -1983,7 +1983,10 @@
 #endif
 	gp->gcnext_sp = nil;
 	runtime_memclr(&gp->gcregs, sizeof gp->gcregs);
-	m->p->syscalltick++;
+
+	// Don't refer to m again, we might be running on a different
+	// thread after returning from runtime_mcall.
+	runtime_m()->p->syscalltick++;
 }
 
 static bool


Re: Factor unrelated declarations out of tree.h (2/2)

2013-11-14 Thread Andrew MacLeod

On 11/14/2013 05:16 PM, Joseph S. Myers wrote:

On Thu, 14 Nov 2013, Diego Novillo wrote:


This patch contains the mechanical side-effects from
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01663.html

There are rather a lot of "Include tm.h" changes here - especially in
front ends, where we've tried to eliminate tm.h calls, and put comments on
some of those remaining saying exactly what target macros are used to make
clear what's needed to eliminate them.  Putting in these includes, without
clear comments explaining how to eliminate them, seems a step backwards.
The problem is larger than that...  function.h includes tm.h as well... 
and something like 140ish files include function.h, not to mention 
another 5 include files bring it in...  basic-block.h, cfgloop.h, 
cgraph.h, expr.h, and gimple-streamer.h


so pretty much every file in existence gets tm.h one way or another 
:-P   Aren't our includes spectacular?


I' ve been thinking that the only way to really tackle this is to 
flatten *everything* so that nothing but .c files have #includes, and 
then trim out all the includes that each .c requires, and then see where 
we sit.  .h files bringing in other .h files really muck things up.


I was contemplating giving that a go over the weekend or maybe next week 
to see what it looks like... I have some scripts that flatten includes 
into the .c files and then try to trim out the ones which aren't needed 
from each .c file.


Andrew

btw, I ran tm.h through the include removal script for the c family 
front end files... The attached patch compiles on x64 and removes 37 
includes from the front end files those are just the extraneous 
ones...   but it may be helpful...
diff -cpN D2/c/c-convert.c c/c-convert.c
*** D2/c/c-convert.c	2013-11-14 21:20:19.045366205 -0500
--- c/c-convert.c	2013-11-14 21:34:12.118616282 -0500
*** along with GCC; see the file COPYING3.  
*** 26,32 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tm.h"
  #include "tree.h"
  #include "flags.h"
  #include "convert.h"
--- 26,31 
diff -cpN D2/c/c-lang.c c/c-lang.c
*** D2/c/c-lang.c	2013-11-14 21:20:19.046366178 -0500
--- c/c-lang.c	2013-11-14 21:34:12.120616352 -0500
*** along with GCC; see the file COPYING3.  
*** 21,27 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tm.h"
  #include "tree.h"
  #include "fold-const.h"
  #include "c-tree.h"
--- 21,26 
diff -cpN D2/cp/call.c cp/call.c
*** D2/cp/call.c	2013-11-14 21:20:19.055366196 -0500
--- cp/call.c	2013-11-14 21:34:12.128616282 -0500
*** along with GCC; see the file COPYING3.  
*** 25,31 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tm.h"
  #include "tree.h"
  #include "stor-layout.h"
  #include "trans-mem.h"
--- 25,30 
diff -cpN D2/cp/class.c cp/class.c
*** D2/cp/class.c	2013-11-14 21:20:19.056366297 -0500
--- cp/class.c	2013-11-14 21:34:12.129616280 -0500
*** along with GCC; see the file COPYING3.  
*** 24,30 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tm.h"
  #include "tree.h"
  #include "stringpool.h"
  #include "stor-layout.h"
--- 24,29 
diff -cpN D2/cp/cp-gimplify.c cp/cp-gimplify.c
*** D2/cp/cp-gimplify.c	2013-11-14 21:20:19.057366193 -0500
--- cp/cp-gimplify.c	2013-11-14 21:34:12.130616220 -0500
*** along with GCC; see the file COPYING3.  
*** 22,28 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tm.h"
  #include "tree.h"
  #include "stor-layout.h"
  #include "cp-tree.h"
--- 22,27 
diff -cpN D2/cp/cp-lang.c cp/cp-lang.c
*** D2/cp/cp-lang.c	2013-11-14 21:20:19.057366193 -0500
--- cp/cp-lang.c	2013-11-14 21:34:12.130616220 -0500
*** along with GCC; see the file COPYING3.  
*** 21,27 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tm.h"
  #include "tree.h"
  #include "cp-tree.h"
  #include "c-family/c-common.h"
--- 21,26 
diff -cpN D2/cp/cp-objcp-common.c cp/cp-objcp-common.c
*** D2/cp/cp-objcp-common.c	2013-11-14 21:20:19.057366193 -0500
--- cp/cp-objcp-common.c	2013-11-14 21:34:12.130616220 -0500
*** along with GCC; see the file COPYING3.  
*** 21,27 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tm.h"
  #include "tree.h"
  #include "cp-tree.h"
  #include "c-family/c-common.h"
--- 21,26 
diff -cpN D2/cp/cvt.c cp/cvt.c
*** D2/cp/cvt.c	2013-11-14 21:20:19.058366239 -0500
--- cp/cvt.c	2013-11-14 21:34:12.132616292 -0500
*** along with GCC; see the file COPYING3.  
*** 27,33 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tm.h"
  #include "tree.h"
  #include "stor-layout.h"
  #include "flags.h"
--- 27,32 
diff -cpN D2/cp/cxx-pretty-print.c cp/cxx-pretty-print.c
*** D2/cp/cxx-pretty-print.c	2013-11-14 21:20:19.059366182 -0500
--- cp/cxx-pretty-prin

Re: [RFC PATCH] add auto_bitmap

2013-11-14 Thread Trevor Saunders
On Thu, Nov 14, 2013 at 02:33:00PM -0700, Jeff Law wrote:
> On 11/14/13 14:14, Richard Biener wrote:
> >>
> >>I'm just pointed out that of all the stuff you changed, these were the
> >>only ones I saw where lifetimes were changed significantly.
> >
> >I still ask why we need a new type and cannot put this functionality into 
> >bitmap_head itself.
> Given that bitmap is just a *bitmap_head_def aren't we suggesting
> the same thing?

I think bitmap_head some_bitmap; is a little funny name wise, but
auto_bitmap some_bitmap; is kind of funny too, so I think just having
people use bitmap_head is fine if that's what people prefer.  Its
unfortunate bitmap itself isn't available, but I don't have a plan for
dealing with bitmaps allocated in gc memory right now, so I guess they
need to stay as they are.

Trev

 
> 
> jeff


Fix C99 checks for UCN digits at start of identifiers

2013-11-14 Thread Joseph S. Myers
C99, but not C11, C++98, C++03 or C++11, disallows universal character
names for digits starting identifiers.  The cpplib logic for this gets
the "digit" property from Unicode data, but that data disagrees with
C99 Annex D, which considers Roman numerals (2160-2182), IDEOGRAPHIC
NUMBER ZERO (3007) and Suzhou numerals (3021-3029) to be special
characters instead of digits.

This patch fixes cpplib to follow C99's definition of digit.
C++98/C++03 have no restrictions on initial characters.  C11 and C++11
have identical list of permitted characters, and forbidden initial
characters, different from the lists in C99 and C++98/C++03; this
patch is preliminary to implementing support for the C11/C++11 lists.
In those lists, the forbidden initial characters appear to be
combining characters instead of digits.  (So I'll probably change the
C99, DIG, CXX flags in the followup to C99, N99 (meaning non-initial
character in C99), CXX (i.e. C++98/C++03), C11, N11.)

The new lists generally include large ranges of characters which may
not all be allocated in a particular Unicode version (meaning it will
be necessary to update the character composition information for
-Wnormalized= from Unicode from time to time, whereas that hasn't
mattered so much with the old smaller lists of characters).

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  Applied
to mainline.

gcc/testsuite:
2013-11-15  Joseph Myers  

* gcc.dg/cpp/ucnid-9.c: New test.

libcpp:
2013-11-15  Joseph Myers  

* ucnid.tab: Mark C99 digits as [C99DIG].
* makeucnid.c (read_ucnid): Handle [C99DIG].
(read_table): Don't check for digit characters.
* ucnid.h: Regenerate.

Index: libcpp/makeucnid.c
===
--- libcpp/makeucnid.c  (revision 204827)
+++ libcpp/makeucnid.c  (working copy)
@@ -66,6 +66,8 @@ read_ucnid (const char *fname)
break;
   if (strcmp (line, "[C99]\n") == 0)
fl = C99;
+  if (strcmp (line, "[C99DIG]\n") == 0)
+   fl = C99|digit;
   else if (strcmp (line, "[CXX]\n") == 0)
fl = CXX;
   else if (isxdigit (line[0]))
@@ -104,10 +106,10 @@ read_ucnid (const char *fname)
   fclose (f);
 }
 
-/* Read UnicodeData.txt and set the 'digit' flag, and
-   also fill in the 'decomp' table to be the decompositions of
-   characters for which both the character decomposed and all the code
-   points in the decomposition are either C99 or CXX.  */
+/* Read UnicodeData.txt and fill in the 'decomp' table to be the
+   decompositions of characters for which both the character
+   decomposed and all the code points in the decomposition are either
+   C99 or CXX.  */
 
 static void
 read_table (char *fname)
@@ -135,11 +137,7 @@ read_table (char *fname)
   do {
l++;
   } while (*l != ';');
-  /* Category value; things starting with 'N' are numbers of some
-kind.  */
-  if (*++l == 'N')
-   flags[codepoint] |= digit;
-
+  /* Category value.  */
   do {
l++;
   } while (*l != ';');
Index: libcpp/ucnid.h
===
--- libcpp/ucnid.h  (revision 204827)
+++ libcpp/ucnid.h  (working copy)
@@ -714,13 +714,12 @@
 {   0|  0|  0|CID|NFC|NKC|  0,   0, 0x2132 },
 { C99|  0|  0|CID|NFC|  0|  0,   0, 0x2138 },
 {   0|  0|  0|CID|NFC|  0|  0,   0, 0x215f },
-{ C99|DIG|  0|CID|NFC|  0|  0,   0, 0x217f },
-{ C99|DIG|  0|CID|NFC|NKC|  0,   0, 0x2182 },
+{ C99|  0|  0|CID|NFC|  0|  0,   0, 0x217f },
+{ C99|  0|  0|CID|NFC|NKC|  0,   0, 0x2182 },
 {   0|  0|  0|CID|NFC|NKC|  0,   0, 0x3004 },
-{ C99|  0|  0|CID|NFC|NKC|  0,   0, 0x3006 },
-{ C99|DIG|  0|CID|NFC|NKC|  0,   0, 0x3007 },
+{ C99|  0|  0|CID|NFC|NKC|  0,   0, 0x3007 },
 {   0|  0|  0|CID|NFC|NKC|  0,   0, 0x3020 },
-{ C99|DIG|  0|CID|NFC|NKC|  0,   0, 0x3029 },
+{ C99|  0|  0|CID|NFC|NKC|  0,   0, 0x3029 },
 {   0|  0|  0|CID|NFC|NKC|  0,   0, 0x3040 },
 { C99|  0|CXX|CID|NFC|NKC|  0,   0, 0x3093 },
 {   0|  0|CXX|CID|NFC|NKC|  0,   0, 0x3094 },
Index: libcpp/ucnid.tab
===
--- libcpp/ucnid.tab(revision 204827)
+++ libcpp/ucnid.tab(working copy)
@@ -119,7 +119,7 @@ ac00-d7a3
 0b3d 1fbe 203f-2040 2102 2107 210a-2113 2115 2118-211d 2124 2126 2128
 212a-2131 2133-2138 2160-2182 3005-3007 3021-3029
 
-; Digits
+[C99DIG]
 0660-0669 06f0-06f9 0966-096f 09e6-09ef 0a66-0a6f 0ae6-0aef 0b66-0b6f
 0be7-0bef 0c66-0c6f 0ce6-0cef 0d66-0d6f 0e50-0e59 0ed0-0ed9 0f20-0f33
 
Index: gcc/testsuite/gcc.dg/cpp/ucnid-9.c
===
--- gcc/testsuite/gcc.dg/cpp/ucnid-9.c  (revision 0)
+++ gcc/testsuite/gcc.dg/cpp/ucnid-9.c  (revision 0)
@@ -0,0 +1,8 @@
+/* { dg-do preprocess } */
+/* { dg-options "-std=c99 -pedantic -fextended-identifiers" } */
+
+\u2160
+\u2182
+\u3007
+\u3021
+\u3029

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.

2013-11-14 Thread Cong Hou
Hi

This patch adds the support to two non-isomorphic operations addsub
and subadd for SLP vectorizer. More non-isomorphic operations can be
added later, but the limitation is that operations on even/odd
elements should still be isomorphic. Once such an operation is
detected, the code of the operation used in vectorized code is stored
and later will be used during statement transformation. Two new GIMPLE
opeartions VEC_ADDSUB_EXPR and VEC_SUBADD_EXPR are defined. And also
new optabs for them. They are also documented.

The target supports for SSE/SSE2/SSE3/AVX are added for those two new
operations on floating points. SSE3/AVX provides ADDSUBPD and ADDSUBPS
instructions. For SSE/SSE2, those two operations are emulated using
two instructions (selectively negate then add).

With this patch the following function will be SLP vectorized:


float a[4], b[4], c[4];  // double also OK.

void subadd ()
{
  c[0] = a[0] - b[0];
  c[1] = a[1] + b[1];
  c[2] = a[2] - b[2];
  c[3] = a[3] + b[3];
}

void addsub ()
{
  c[0] = a[0] + b[0];
  c[1] = a[1] - b[1];
  c[2] = a[2] + b[2];
  c[3] = a[3] - b[3];
}


Boostrapped and tested on an x86-64 machine.


thanks,
Cong





diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2c0554b..656d5fb 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,31 @@
+2013-11-14  Cong Hou  
+
+ * tree-vect-slp.c (vect_create_new_slp_node): Initialize
+ SLP_TREE_OP_CODE.
+ (slp_supported_non_isomorphic_op): New function.  Check if the
+ non-isomorphic operation is supported or not.
+ (vect_build_slp_tree_1): Consider non-isomorphic operations.
+ (vect_build_slp_tree): Change argument.
+ * tree-vect-stmts.c (vectorizable_operation): Consider the opcode
+ for non-isomorphic operations.
+ * optabs.def (vec_addsub_optab, vec_subadd_optab): New optabs.
+ * tree.def (VEC_ADDSUB_EXPR, VEC_SUBADD_EXPR): New operations.
+ * expr.c (expand_expr_real_2): Add support to VEC_ADDSUB_EXPR and
+ VEC_SUBADD_EXPR.
+ * gimple-pretty-print.c (dump_binary_rhs): Likewise.
+ * optabs.c (optab_for_tree_code): Likewise.
+ * tree-cfg.c (verify_gimple_assign_binary): Likewise.
+ * tree-vectorizer.h (struct _slp_tree): New data member.
+ * config/i386/i386-protos.h (ix86_sse_expand_fp_addsub_operator):
+ New funtion.  Expand addsub/subadd operations for SSE2.
+ * config/i386/i386.c (ix86_sse_expand_fp_addsub_operator): Likewise.
+ * config/i386/sse.md (UNSPEC_SUBADD, UNSPEC_ADDSUB): New RTL operation.
+ (vec_subadd_v4sf3, vec_subadd_v2df3, vec_subadd_3,
+ vec_addsub_v4sf3, vec_addsub_v2df3, vec_addsub_3):
+ Expand addsub/subadd operations for SSE/SSE2/SSE3/AVX.
+ * doc/generic.texi (VEC_ADDSUB_EXPR, VEC_SUBADD_EXPR): New doc.
+ * doc/md.texi (vec_addsub_@var{m}3, vec_subadd_@var{m}3): New doc.
+
 2013-11-12  Jeff Law  

  * tree-ssa-threadedge.c (thread_around_empty_blocks): New
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index fdf9d58..b02b757 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -117,6 +117,7 @@ extern rtx ix86_expand_adjust_ufix_to_sfix_si (rtx, rtx *);
 extern enum ix86_fpcmp_strategy ix86_fp_comparison_strategy (enum rtx_code);
 extern void ix86_expand_fp_absneg_operator (enum rtx_code, enum machine_mode,
 rtx[]);
+extern void ix86_sse_expand_fp_addsub_operator (bool, enum
machine_mode, rtx[]);
 extern void ix86_expand_copysign (rtx []);
 extern void ix86_split_copysign_const (rtx []);
 extern void ix86_split_copysign_var (rtx []);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 5287b49..76f38f5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -18702,6 +18702,51 @@ ix86_expand_fp_absneg_operator (enum rtx_code
code, enum machine_mode mode,
 emit_insn (set);
 }

+/* Generate code for addsub or subadd on fp vectors for sse/sse2.  The flag
+   SUBADD indicates if we are generating code for subadd or addsub.  */
+
+void
+ix86_sse_expand_fp_addsub_operator (bool subadd, enum machine_mode mode,
+rtx operands[])
+{
+  rtx mask;
+  rtx neg_mask32 = GEN_INT (0x8000);
+  rtx neg_mask64 = GEN_INT ((HOST_WIDE_INT)1 << 63);
+
+  switch (mode)
+{
+case V4SFmode:
+  if (subadd)
+ mask = gen_rtx_CONST_VECTOR (V4SImode, gen_rtvec (4,
+ neg_mask32, const0_rtx, neg_mask32, const0_rtx));
+  else
+ mask = gen_rtx_CONST_VECTOR (V4SImode, gen_rtvec (4,
+ const0_rtx, neg_mask32, const0_rtx, neg_mask32));
+  break;
+
+case V2DFmode:
+  if (subadd)
+ mask = gen_rtx_CONST_VECTOR (V2DImode, gen_rtvec (2,
+ neg_mask64, const0_rtx));
+  else
+ mask = gen_rtx_CONST_VECTOR (V2DImode, gen_rtvec (2,
+ const0_rtx, neg_mask64));
+  break;
+
+default:
+  gcc_unreachable ();
+}
+
+  rtx tmp = gen_reg_rtx (mode);
+  convert_move (tmp, mask, false);
+
+  rtx tmp2 = gen_reg_rtx (mode);
+  tmp2 = expand_simple_binop (mode, XOR, tmp, operands[2],
+  tmp2, 0, OPTAB_DIRECT);
+  expand_simple_binop (mode, PLUS, operands[1], tmp2,
+   operands[0], 0, OPTAB_DIRECT);
+}
+
 /* Ex

[4.7] Fix libiberty install-pdf

2013-11-14 Thread Joseph S. Myers
I've applied this patch backport to 4.7 branch to fix "make install-pdf" 
there with newer Texinfo tools.

2013-11-15  Joseph Myers  

Backport from mainline:
2012-06-29  Andreas Schwab  

* copying-lib.texi (Library Copying): Don't use @heading inside
@enumerate.

Index: copying-lib.texi
===
--- copying-lib.texi(revision 204833)
+++ copying-lib.texi(working copy)
@@ -476,12 +476,7 @@
 of all derivatives of our free software and of promoting the sharing
 and reuse of software generally.
 
-@iftex
-@heading NO WARRANTY
-@end iftex
-@ifinfo
 @center NO WARRANTY
-@end ifinfo
 
 @item
 BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Andrew MacLeod

On 11/14/2013 06:56 PM, Jeff Law wrote:

On 11/14/13 16:00, Andrew MacLeod wrote:



  I was bootstrapping Ada as well until the end of last week. It seems
to be broken right now, so I had turned Ada off until the issue is
resolved.
Ada should be working again...  At least on x86_64.  I'm still looking 
at it on Itanic, but I suspect you aren't building Ada on that :-)


jeff

I'll check again. it was still busted earlier this afternoon

Andrew


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Jeff Law

On 11/14/13 16:00, Andrew MacLeod wrote:



  I was bootstrapping Ada as well until the end of last week. It seems
to be broken right now, so I had turned Ada off until the issue is
resolved.
Ada should be working again...  At least on x86_64.  I'm still looking 
at it on Itanic, but I suspect you aren't building Ada on that :-)


jeff


Re: [PATCH] Fix lto-profiledbootstrap [was: Merge cgraph_get_create_node and cgraph_get_create_real_symbol_node]

2013-11-14 Thread Jan Hubicka
> > 2013-11-14  Uros Bizjak  
> >
> > * lto-streamer-in.c (input function): Call cgraph_create_node if
> > cgraph_get_node failed.
> >
> > Tested with lto-profiledbootstrap on x86_64-pc-linux-gnu, regression
> > tested also with -m32 [1].
> >
> > OK for mainline?

OK (though it is a bit ugly - but lto streaming is very much a special case),
thanks!
Honza
> >
> > [1] http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg00989.html
> >
> > Uros.

> Index: lto-streamer-in.c
> ===
> --- lto-streamer-in.c (revision 204807)
> +++ lto-streamer-in.c (working copy)
> @@ -917,7 +917,8 @@ input_function (tree fn_decl, struct data_in *data
>gimple_register_cfg_hooks ();
>  
>node = cgraph_get_node (fn_decl);
> -  gcc_checking_assert (node);
> +  if (!node)
> +node = cgraph_create_node (fn_decl);
>input_struct_function_base (fn, data_in, ib);
>input_cfg (ib_cfg, fn, node->count_materialization_scale);
>  



[PATCH] MIPS: MIPS32r2 FP reciprocal instruction set support

2013-11-14 Thread Maciej W. Rozycki
Hi,

 Complementing the recent change to enable FP MADD instructions on 
MIPS32r2 processors in the 32-bit FPR mode (CP0.Status.FR=0) here's one to 
enable FP reciprocal instructions (RECIP.fmt and RSQRT.fmt) in that case 
as well.  Architecture documents have been amended to make it unambiguous 
that these instructions are supported in that FPU configuration[1][2].

 My understanding also is only a single implementation of a strict 32-bit 
MIPS32r2 FPU (CP1.FIR.F64=0) has been ever made and that chip actually 
supports these instructions, and no future 32-bit FPUs are supposed to be 
made as the architecture no longer allows it[3][4].

 Note that these instructions were allowed in either FPU mode in the MIPS 
IV ISA, but for forward ISA compatibility this change does not enable them 
for -march=mips4 in the 32-bit FPR mode because the original revision of 
the MIPS64 ISA did not support it.

 I have regression-tested this change with the mips-linux-gnu target and 
the mips32r2/o32 multilib.  I have also verified that the instructions 
affected were absent across the binaries produced by the testsuite before 
applying this change and present afterwards -- although only RECIP.D and 
RECIP.S are produced and only once each, by gcc.dg/builtins-24.c and 
gcc.dg/pr41963.c respectively.  Neither RSQRT.D nor RSQRT.S have coverage 
in our testsuite.

 OK to apply?

 References:

[1] "MIPS Architecture For Programmers, Volume II-A: The MIPS32
Instruction Set", Document Number: MD00086, Revision 5.03, Sept. 9,
2013

[2] "MIPS Architecture for Programmers, Volume II-B: The microMIPS32
Instruction Set, Document Number: MD00582, Revision 5.03, Sept. 9,
2013

[3] "MIPS Architecture For Programmers, Volume I-A: Introduction to the
MIPS32 Architecture", Document Number: MD00082, Revision 5.03, Sept.
9, 2013

[4] "MIPSR Architecture For Programmers, Volume I-B: Introduction to the
microMIPS32 Architecture", Document Number: MD00741, Revision 5.03,
Sept. 9, 2013

2013-11-14  Maciej W. Rozycki  

gcc/
* config/mips/mips.h (ISA_HAS_FP_RECIP_RSQRT): New macro.
* config/mips/mips.c (mips_rtx_costs) : Check for
ISA_HAS_FP_RECIP_RSQRT rather than ISA_HAS_FP4.
* config/mips/mips.md (recip_condition): Remove mode attribute.
(div3): Use ISA_HAS_FP_RECIP_RSQRT rather than 
.
(*recip3, *rsqrta, *rsqrtb): Likewise.

  Maciej

gcc-mips32r2-recip.patch
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c
===
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c 2013-11-12 
15:32:19.767952530 +
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c  2013-11-12 15:33:22.277646941 
+
@@ -3967,7 +3967,7 @@ mips_rtx_costs (rtx x, int code, int out
 case DIV:
   /* Check for a reciprocal.  */
   if (float_mode_p
- && ISA_HAS_FP4
+ && ISA_HAS_FP_RECIP_RSQRT (mode)
  && flag_unsafe_math_optimizations
  && XEXP (x, 0) == CONST1_RTX (mode))
{
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.h
===
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.h 2013-11-12 
15:31:46.758734464 +
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.h  2013-11-12 15:33:22.277646941 
+
@@ -921,6 +921,21 @@ struct mips_cpu_info {
'c = -((a * b) [+-] c)'.  */
 #define ISA_HAS_NMADD3_NMSUB3  TARGET_LOONGSON_2EF
 
+/* ISA has floating-point RECIP.fmt and RSQRT.fmt instructions.  The
+   MIPS64 rev. 1 ISA says that RECIP.D and RSQRT.D are unpredictable when
+   doubles are stored in pairs of FPRs, so for safety's sake, we apply
+   this restriction to the MIPS IV ISA too.  */
+#define ISA_HAS_FP_RECIP_RSQRT(MODE)   \
+   (((ISA_HAS_FP4  \
+  || (ISA_MIPS32R2 && !TARGET_MIPS16)) \
+ && ((MODE) == SFmode  \
+ || ((TARGET_FLOAT64   \
+  || !(ISA_MIPS4   \
+   || ISA_MIPS64)) \
+ && (MODE) == DFmode)))\
+|| ((TARGET_SB1 && !TARGET_MIPS16) \
+&& (MODE) == V2SFmode))
+
 /* ISA has count leading zeroes/ones instruction (not implemented).  */
 #define ISA_HAS_CLZ_CLO((ISA_MIPS32
\
  || ISA_MIPS32R2   \
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.md
===
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.md2013-11-12 
15:31:46.758734464 +
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.md 2013-11-12 15:33:22.277

[PATCH] MIPS: MIPS32r2 FP indexed access instruction set support

2013-11-14 Thread Maciej W. Rozycki
Hi,

 Complementing the recent change to enable FP MADD instructions on 
MIPS32r2 processors in the 32-bit FPR mode (CP0.Status.FR=0) here's one to 
enable FP ordinary indexed memory access (i.e. LWXC1, SWXC1, LDXC1 and 
SDXC1) instructions in that case as well.  Architecture documents have 
been amended to make it unambiguous that these instructions are supported 
in that FPU configuration[1][2].  Note that PREFX is already handled like 
this even though old architecture documents did not allow it.

 My understanding also is only a single implementation of a strict 32-bit 
MIPS32r2 FPU (CP1.FIR.F64=0) has been ever made and that chip actually 
supports these instructions, and no future 32-bit FPUs are supposed to be 
made as the architecture no longer allows it[3][4].

 Please note that these instructions continue being generated for MIPS IV 
ISA processors regardless of the FPR mode selected and continue being 
avoided for original MIPS32 ISA revision processors.  We may consider 
changing that separately (for MIPS32 that is), for the sake of emulated 
code such as under Linux (I believe no MIPS32 original revision FPU 
hardware has been ever made; I'll be happy to get corrected if I am wrong 
though).

 As a side effect a number of macros that rely on ISA_HAS_FP4 can be 
simplified and MIPS32r2 special-casing removed.

 I have regression-tested this change with the mips-linux-gnu target and 
the mips32r2/o32 multilib.  I have also verified that the instructions 
affected were absent across the binaries produced by the testsuite before 
applying this change and present afterwards (at 899 instances across the 
testsuite all these instructions are extensively covered; PREFX is 
covered too).

 OK to apply?

 References:

[1] "MIPS Architecture For Programmers, Volume II-A: The MIPS32
Instruction Set", Document Number: MD00086, Revision 5.03, Sept. 9,
2013

[2] "MIPS Architecture for Programmers, Volume II-B: The microMIPS32 
Instruction Set, Document Number: MD00582, Revision 5.03, Sept. 9, 
2013

[3] "MIPS Architecture For Programmers, Volume I-A: Introduction to the
MIPS32 Architecture", Document Number: MD00082, Revision 5.03, Sept.
9, 2013

[4] "MIPSR Architecture For Programmers, Volume I-B: Introduction to the 
microMIPS32 Architecture", Document Number: MD00741, Revision 5.03,
Sept. 9, 2013

2013-11-14  Maciej W. Rozycki  

gcc/
* config/mips/mips.h (ISA_HAS_FP4): Remove TARGET_FLOAT64 
restriction for ISA_MIPS32R2.
(ISA_HAS_FP_MADD4_MSUB4): Remove ISA_MIPS32R2 special-casing.
(ISA_HAS_NMADD4_NMSUB4): Likewise.
(ISA_HAS_FP_RECIP_RSQRT): Likewise.
(ISA_HAS_PREFETCHX): Redefine in terms of ISA_HAS_FP4.

  Maciej

gcc-mips32r2-index.patch
Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.h
===
--- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.h 2013-11-12 
15:33:22.277646941 +
+++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.h  2013-11-12 15:33:43.788707112 
+
@@ -884,7 +884,7 @@ struct mips_cpu_info {
FP madd and msub instructions, and the FP recip and recip sqrt
instructions.  */
 #define ISA_HAS_FP4((ISA_MIPS4 \
- || (ISA_MIPS32R2 && TARGET_FLOAT64)   \
+ || ISA_MIPS32R2   \
  || ISA_MIPS64 \
  || ISA_MIPS64R2)  \
 && !TARGET_MIPS16)
@@ -906,16 +906,14 @@ struct mips_cpu_info {
 #define GENERATE_MADD_MSUB (TARGET_IMADD && !TARGET_MIPS16)
 
 /* ISA has floating-point madd and msub instructions 'd = a * b [+-] c'.  */
-#define ISA_HAS_FP_MADD4_MSUB4  (ISA_HAS_FP4   \
-|| (ISA_MIPS32R2 && !TARGET_MIPS16))
+#define ISA_HAS_FP_MADD4_MSUB4  ISA_HAS_FP4
 
 /* ISA has floating-point madd and msub instructions 'c = a * b [+-] c'.  */
 #define ISA_HAS_FP_MADD3_MSUB3  TARGET_LOONGSON_2EF
 
 /* ISA has floating-point nmadd and nmsub instructions
'd = -((a * b) [+-] c)'.  */
-#define ISA_HAS_NMADD4_NMSUB4  (ISA_HAS_FP4\
-|| (ISA_MIPS32R2 && !TARGET_MIPS16))
+#define ISA_HAS_NMADD4_NMSUB4  ISA_HAS_FP4
 
 /* ISA has floating-point nmadd and nmsub instructions
'c = -((a * b) [+-] c)'.  */
@@ -926,8 +924,7 @@ struct mips_cpu_info {
doubles are stored in pairs of FPRs, so for safety's sake, we apply
this restriction to the MIPS IV ISA too.  */
 #define ISA_HAS_FP_RECIP_RSQRT(MODE)   \
-   (((ISA_HAS_FP4  \
-  || (ISA_MIPS32R2 && !TARGET_MIPS16)) \
+   ((ISA_HAS_FP4   \

Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Andrew MacLeod

On 11/14/2013 05:06 PM, H.J. Lu wrote:

On Thu, Nov 14, 2013 at 8:34 AM, Andrew MacLeod  wrote:

On 11/14/2013 11:23 AM, Michael Matz wrote:

Hi,

On Thu, 14 Nov 2013, Andrew MacLeod wrote:


I think if following through with the whole plan there would (and
should) be nothing remaining that could be called a gimple expression.

very possibly, i just haven't gotten to those parts yet. I can change
the name back to gimple-decl.[ch] or some such thing if you like that
better.

-object? -operand? -stuff? ;-)  Will all of these splits land at trunk,
i.e. 4.9?  Why the hurry when not even such high-level things are clear?
I mean how can you think about rearchitecting the gimple data structures
without having looked at the current details.  It's clear that not every
detail of the design can be fixated at this point, but basic questions
like "what's the operands?", "will there be expressions?", "how do we
iterate?", "recursive structures or not?" should at least get some answer
before really starting grind work, shouldn't they?


The splits are for header file cleanup and re-structuring into logical
components.  As I mentioned in the original post,  the file is needed to
break dependency cycles between gimple.h (the statements) , the iterators,
and gimplification.  It is for the gimple stuff which doesn't need any of
those things but is consumed by them.

This really has nothing to do with my future plans, other than the fact that
I also said whatever is in this file is will eventually be split into more
things, but I'm not ready to do those splits yet, thus the gimple-blah name
doesn't matter to me.  gimple-expr seemed convenient at the time but clearly
you don't like it, and I'll happily call it whatever you want.  It's a grab
bag of all the gimple values which are still trees...

maybe the suggested  gimple-val.[ch] is ok?

Andrew

It breaks Ada build.  I checked in the following patch to unbreak
it.



Thanks.

 I was bootstrapping Ada as well until the end of last week. It seems 
to be broken right now, so I had turned Ada off until the issue is resolved.


Andrew


Re: [PATCH] Document -mabi=elfv[12] (Re: [PATCH, rs6000] ELFv2 ABI 1/8: Add options and infrastructure)

2013-11-14 Thread David Edelsohn
On Thu, Nov 14, 2013 at 5:07 PM, Ulrich Weigand  wrote:

> Here's a patch to add documentation along the lines of what we have
> for the longdouble switches.
>
> Doc build tested on powerpc64-linux.
>
> David, would that be OK for mainline, or do have other suggestions?

I don't think that the wording is correct because -mabi=elfv1 and
-mabi=elfv2 are "options" for either endian.

> ChangeLog:
>
> * doc/invoke.texi (-mabi=elfv1, -mabi=elfv2): Document.
>
> Index: gcc/gcc/doc/invoke.texi
> ===
> --- gcc.orig/gcc/doc/invoke.texi
> +++ gcc/gcc/doc/invoke.texi
> @@ -18846,7 +18846,8 @@ SVR4 ABI)@.
>  @opindex mabi
>  Extend the current ABI with a particular extension, or remove such extension.
>  Valid values are @var{altivec}, @var{no-altivec}, @var{spe},
> -@var{no-spe}, @var{ibmlongdouble}, @var{ieeelongdouble}@.
> +@var{no-spe}, @var{ibmlongdouble}, @var{ieeelongdouble},
> +@var{elfv1}, @var{elfv2}@.
>
>  @item -mabi=spe
>  @opindex mabi=spe
> @@ -18868,6 +18869,16 @@ This is a PowerPC 32-bit SYSV ABI option
>  Change the current ABI to use IEEE extended-precision long double.
>  This is a PowerPC 32-bit Linux ABI option.
>
> +@item -mabi=elfv1
> +@opindex mabi=elfv1
> +Change the current ABI to use the ELFv1 ABI.
> +This is a little-endian PowerPC 64-bit Linux ABI option.

Delete the second line and replace it with something like

"This is the default ABI for big-endian PowerPC 64-bit Linux.
Overriding the default ABI requires special system support and is
likely to fail in spectacular ways."

> +
> +@item -mabi=elfv2
> +@opindex mabi=elfv2
> +Change the current ABI to use the ELFv2 ABI.
> +This is a big-endian PowerPC 64-bit Linux ABI option.

Similarly, delete the second line and replace it with something like

"This is the default ABI for little-endian PowerPC 64-bit Linux.
Overriding the default ABI requires special system support and is
likely to fail in spectacular ways."

> +
>  @item -mprototype
>  @itemx -mno-prototype
>  @opindex mprototype


Re: XFAIL a couple of gnat.dg testcases on MIPS

2013-11-14 Thread H.J. Lu
On Thu, Nov 14, 2013 at 3:51 AM, Eric Botcazou  wrote:
>> Here is a patch.  OK to install?
>
> Yes, thanks.
>
> --
> Eric Botcazou

It doesn't work:

ERROR: gnat.dg/specs/addr1.ads: syntax error in target selector "xfail
mips*-*-* { { i?86-*-* x86_64-*-* } && x32 }" for " dg-bogus 24
"(alignment|erroneous)" "" { xfail mips*-*-* { { i?86-*-* x86_64-*-* }
&& x32 } } "

I couldn't find a way to make it to xfail for mips or x32.
I reverted it.

-- 
H.J.


Re: Factor unrelated declarations out of tree.h (2/2)

2013-11-14 Thread Joseph S. Myers
On Thu, 14 Nov 2013, Diego Novillo wrote:

> These are due to builtins.h.  The structs defined in there need
> FIRST_PSEUDO_REGISTER.  This means that we have parts of builtins.h
> that are OK for FEs and others that aren't.  This is not good.
> 
> The best alternative for this change is to leave the declarations for
> builtins.h inside tree.h and then decide what to do about builtins.h
> itself. We clearly need it to declare everything related to builtins,
> but from what you're stating about tm.h, we will need to have an FE
> variant and an ME/BE variant?

I imagine that FIRST_PSEUDO_REGISTER will be one of the harder parts of 
the back-end interface to move away from macros, so, yes, it will need 
splitting (and GIMPLE optimizers, as well as front ends, should avoid tm.h 
where possible - target macros they use are generally among the more 
straightforward to convert to hooks - it's only the RTL parts of the 
compiler where we're a long way from being able to eliminate tm.h).

So, put all the new prototypes in a new tree-builtins.h (for example).  
Everything in the existing header needs tm.h (SWITCHABLE_TARGET also needs 
tm.h - most flags.h users manage to get away without it because they don't 
use those bits of flags.h, but really users of flags.h should move to 
options.h and other headers as needed).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3))

2013-11-14 Thread Jeff Law

On 11/14/13 10:49, David Malcolm wrote:


FWIW, I prefer the downcasts to adding virtual functions; what I've
tried to do is create a very direct mapping from the status quo,
introducing inheritance to gain the benefits listed earlier in the
thread, whilst only changing "surface syntax".
I understand (and I probably encouraged you to stay close as close to 
the status quo as possible while still moving this stuff forward ;-)





It seems to me that we're considering the general problem of type-safe
code dispatch: given a type hierarchy, and various sites that want to
vary behavior based on what types they see, how best to invoke the
appropriate code, ensuring that the code that's called "knows" that its
dealing with the appropriate subclass i.e. in a typesafe manner.
Right.  Certainly in my quick browsing, this is primarily a dispatch 
problem.  I think Andrew had one in his Cauldron slide deck as well.




There are various idioms for doing this kind of dispatch in C++, a
non-exhaustive list is:

(a) switches and if/then tests on the GIMPLE_CODE (stmt) - the status
quo,
Right.  And probably appropriate for now.  But I do want us to think 
about better ways to handle this.




(b) adding virtual functions to gimple would be another way to handle
type-safe dispatch, but they carry costs:
   (i) they would implicitly add a vtable ptr to the top of every
gimple statement, increasing the memory consumption of the process

Thats my biggest concern.


   (ii) it's my belief that a virtual function call is more expensive
than the kinds of switch/if+then branching that we're currently doing on
the code - though I don't have measurements to back this up
The virtual call is probably more expensive, but probably not as much as 
you might think as the switch likely compiles down to a multi-way branch 
which is on-par with an indirect call.





   (c) the "Visitor" design pattern [1] - rather than adding virtual
functions to gimple, instead add them to a visitor class e.g.:
Basically this just puts the vtable in a different class.  But doesn't 
the wrapping visitor need to know about the underlying details of the 
gimple statement class?   If we're trying to encapsulate things better, 
doesn't a visitor break the encapsulation?




(the above isn't *exactly* the Visitor pattern from the Gang of Four
book, I'm doing things in the visitor in order avoiding adding vfuncs to
gimple).
Right.  My mental model when I wrote my last message as a visit method 
which dispatched to the statement specific bits, but with the method as 
a part of the gimple base class.




This approach avoids adding an implicit vtable field to the top of
gimple [(i) above], and keeps the vtables with the code using them
[(iii) above].

Right.



   However it still would mean (ii) changing from switch/if-then control
flow to vfunc calls, with unknown impact on performance.  I'd be nervous
about adding virtual functions anywhere where we're not already jumping
though function ptrs.
As noted above, jumping through a function pointer probably isn't much 
different performance-wise than a multi-way branch.


Anyway, just wanted to get the conversation around this started as 
cleaning this stuff up is a natural follow-on at some point.




(nods).   Note that I don't regard the downcasting as inherently bad,
just one approach to the generic issue of typesafe dynamic code
dispatch.   Yes, in many OO textbooks, it's regarded as a code smell,
but then again "goto" has its uses :)
Again, my opinions come from working on large codes which did a lot of 
downcasting (and sadly upcasting too) and it was a major PITA to get 
sorted out.


Thanks,
jeff


libgo patch committed: Don't use filename without '/' for backtrace

2013-11-14 Thread Ian Lance Taylor
This patch changes libgo so that if the executable filename does not
have a '/', it does not use that name for the backtrace library.  This
is the case when an executable is found on PATH.  In that case, there is
no particular reason to believe that a file with that name in the
current directory is the same as the executable itself.  This is
http://golang.org/issue/6715.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 87729ea100b9 libgo/runtime/go-caller.c
--- a/libgo/runtime/go-caller.c	Thu Nov 14 14:12:12 2013 -0800
+++ b/libgo/runtime/go-caller.c	Thu Nov 14 14:28:40 2013 -0800
@@ -101,6 +101,13 @@
   const char *filename;
 
   filename = (const char *) runtime_progname ();
+
+  /* If there is no '/' in FILENAME, it was found on PATH, and
+	 might not be the same as the file with the same name in the
+	 current directory.  */
+  if (__builtin_strchr (filename, '/') == NULL)
+	filename = NULL;
+
   back_state = backtrace_create_state (filename, 1, error_callback, NULL);
 }
   runtime_unlock (&back_state_lock);


Re: Factor unrelated declarations out of tree.h (2/2)

2013-11-14 Thread Diego Novillo
On Thu, Nov 14, 2013 at 5:16 PM, Joseph S. Myers
 wrote:
> On Thu, 14 Nov 2013, Diego Novillo wrote:
>
>> This patch contains the mechanical side-effects from
>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01663.html
>
> There are rather a lot of "Include tm.h" changes here - especially in
> front ends, where we've tried to eliminate tm.h calls, and put comments on
> some of those remaining saying exactly what target macros are used to make
> clear what's needed to eliminate them.  Putting in these includes, without
> clear comments explaining how to eliminate them, seems a step backwards.

These are due to builtins.h.  The structs defined in there need
FIRST_PSEUDO_REGISTER.  This means that we have parts of builtins.h
that are OK for FEs and others that aren't.  This is not good.

The best alternative for this change is to leave the declarations for
builtins.h inside tree.h and then decide what to do about builtins.h
itself. We clearly need it to declare everything related to builtins,
but from what you're stating about tm.h, we will need to have an FE
variant and an ME/BE variant?


Diego.


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-14 Thread Diego Novillo
On Thu, Nov 14, 2013 at 5:12 PM, Jeff Law  wrote:
> On 11/14/13 13:28, Diego Novillo wrote:
>>
>> Functions in each corresponding .c file got moved to those
>> headers and others that already existed. I wanted to make this
>> patch as mechanical as possible, so I made no attempt to fix
>> problems like having build_addr defined in tree-inline.c. I left
>> that for later.
>
> This seems backwards to me and just ensures double-churn. Once to move it
> now, then again to its final resting spot.
>
> If this change is being made via some automated script, then, well, I guess
> it is what it is and we'll have to come back to them.  But if you're doing
> this by hand it seems to me that leaving it in its original location,
> possibly grouped with its friends, with a FIXME would be better.

Most of it was automated.  I want to stage it in, and I worked pretty
hard at not making additional changes. Particularly since it is not
clear where we will want some of these functions to end up in.  So, we
will still need several passes.  Making each pass self contained makes
sense to me.


>> - Some header files always need another header file. I chose to
>>#include that header in the file. At this stage we want to do
>>the opposite, but this would've added even more bulk to the
>>change, so I left a FIXME marker for the next pass.
>
> This seems a bit like a mistake.  How much of this patch would be blocked if
> we didn't allow this right now.

A good chunk.  I'm doing these FIXMEs in the next sequence of patches,
so we won't have them for long. Again, I was going for an orderly
transition here.

Diego.


Re: Factor unrelated declarations out of tree.h (2/2)

2013-11-14 Thread Joseph S. Myers
On Thu, 14 Nov 2013, Diego Novillo wrote:

> This patch contains the mechanical side-effects from
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01663.html

There are rather a lot of "Include tm.h" changes here - especially in 
front ends, where we've tried to eliminate tm.h calls, and put comments on 
some of those remaining saying exactly what target macros are used to make 
clear what's needed to eliminate them.  Putting in these includes, without 
clear comments explaining how to eliminate them, seems a step backwards.

As far as I can see, your previous patch did not add any declarations to 
tm.h itself, so I guess this is because files are now including some other 
header that has a tm.h requirement.  This indicates that this other header 
needs to be split up, with the parts needing tm.h separate from those that 
don't (well - a more logical split would be better than one based on 
"needing tm.h"), to avoid regressing so much in the elimination of tm.h 
from front ends.

(FWIW, I consider tm.h one of the worst headers in modularity terms, and 
one of the most important to eliminate includes of.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Add a new header to PLUGIN_HEADERS

2013-11-14 Thread Diego Novillo
I did not add all headers factored out of tree.h because it is unclear
(and impossible to tell) what plugins need.  This adds the one header
used by the plugins in the testsuite.

This will be changing quite dramatically as we progress with the
header refactoring.


2013-11-14  Diego Novillo  

* Makefile.in (PLUGIN_HEADERS): Add stringpool.h.

testsuite/ChangeLog

* gcc.dg/plugin/selfassign.c: Include stringpool.h.
* gcc.dg/plugin/start_unit_plugin.c: Likewise.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 806b6ca..44b3eb4 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3114,7 +3114,7 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) 
coretypes.h $(TM_H) \
   cppdefault.h flags.h $(MD5_H) params.def params.h prefix.h tree-inline.h \
   $(GIMPLE_PRETTY_PRINT_H) realmpfr.h \
   $(IPA_PROP_H) $(TARGET_H) $(RTL_H) $(TM_P_H) $(CFGLOOP_H) $(EMIT_RTL_H) \
-  version.h
+  version.h stringpool.h
 
 # generate the 'build fragment' b-header-vars
 s-header-vars: Makefile
diff --git a/gcc/testsuite/gcc.dg/plugin/selfassign.c 
b/gcc/testsuite/gcc.dg/plugin/selfassign.c
index 2498153..cdab74a 100644
--- a/gcc/testsuite/gcc.dg/plugin/selfassign.c
+++ b/gcc/testsuite/gcc.dg/plugin/selfassign.c
@@ -8,6 +8,7 @@
 #include "coretypes.h"
 #include "tm.h"
 #include "tree.h"
+#include "stringpool.h"
 #include "toplev.h"
 #include "basic-block.h"
 #include "gimple.h"
diff --git a/gcc/testsuite/gcc.dg/plugin/start_unit_plugin.c 
b/gcc/testsuite/gcc.dg/plugin/start_unit_plugin.c
index 257aad8..39f4462 100644
--- a/gcc/testsuite/gcc.dg/plugin/start_unit_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/start_unit_plugin.c
@@ -11,6 +11,7 @@
 #include "coretypes.h"
 #include "tm.h"
 #include "tree.h"
+#include "stringpool.h"
 #include "toplev.h"
 #include "basic-block.h"
 #include "gimple.h"
-- 
1.8.4.1



Go patch committed: Use backend interface for comparisons

2013-11-14 Thread Ian Lance Taylor
This patch from Chris Manghane changes the Go frontend to use the
backend interface for comparisons.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 8a3737dd11fd go/expressions.cc
--- a/go/expressions.cc	Thu Nov 14 12:18:10 2013 -0800
+++ b/go/expressions.cc	Thu Nov 14 14:12:04 2013 -0800
@@ -5321,7 +5321,7 @@
 	}
 }
 
-  // Lower struct and array comparisons.
+  // Lower struct, array, and some interface comparisons.
   if (op == OPERATOR_EQEQ || op == OPERATOR_NOTEQ)
 {
   if (left->type()->struct_type() != NULL)
@@ -5329,6 +5329,11 @@
   else if (left->type()->array_type() != NULL
 	   && !left->type()->is_slice_type())
 	return this->lower_array_comparison(gogo, inserter);
+  else if ((left->type()->interface_type() != NULL
+&& right->type()->interface_type() == NULL)
+   || (left->type()->interface_type() == NULL
+   && right->type()->interface_type() != NULL))
+	return this->lower_interface_value_comparison(gogo, inserter);
 }
 
   return this;
@@ -5457,6 +5462,57 @@
   return ret;
 }
 
+// Lower an interface to value comparison.
+
+Expression*
+Binary_expression::lower_interface_value_comparison(Gogo*,
+Statement_inserter* inserter)
+{
+  Type* left_type = this->left_->type();
+  Type* right_type = this->right_->type();
+  Interface_type* ift;
+  if (left_type->interface_type() != NULL)
+{
+  ift = left_type->interface_type();
+  if (!ift->implements_interface(right_type, NULL))
+return this;
+}
+  else
+{
+  ift = right_type->interface_type();
+  if (!ift->implements_interface(left_type, NULL))
+return this;
+}
+  if (!Type::are_compatible_for_comparison(true, left_type, right_type, NULL))
+return this;
+
+  Location loc = this->location();
+
+  if (left_type->interface_type() == NULL
+  && left_type->points_to() == NULL
+  && !this->left_->is_addressable())
+{
+  Temporary_statement* temp =
+  Statement::make_temporary(left_type, NULL, loc);
+  inserter->insert(temp);
+  this->left_ =
+  Expression::make_set_and_use_temporary(temp, this->left_, loc);
+}
+
+  if (right_type->interface_type() == NULL
+  && right_type->points_to() == NULL
+  && !this->right_->is_addressable())
+{
+  Temporary_statement* temp =
+  Statement::make_temporary(right_type, NULL, loc);
+  inserter->insert(temp);
+  this->right_ =
+  Expression::make_set_and_use_temporary(temp, this->right_, loc);
+}
+
+  return this;
+}
+
 // Lower a struct or array comparison to a call to memcmp.
 
 Expression*
@@ -5919,8 +5975,7 @@
 case OPERATOR_GT:
 case OPERATOR_GE:
   return Expression::comparison_tree(context, this->type_, this->op_,
-	 this->left_->type(), left,
-	 this->right_->type(), right,
+	 this->left_, this->right_,
 	 this->location());
 
 case OPERATOR_OROR:
@@ -6417,12 +6472,16 @@
 
 tree
 Expression::comparison_tree(Translate_context* context, Type* result_type,
-			Operator op, Type* left_type, tree left_tree,
-			Type* right_type, tree right_tree,
-			Location location)
-{
-  Type* int_type = Type::lookup_integer_type("int");
-  tree int_type_tree = type_to_tree(int_type->get_backend(context->gogo()));
+			Operator op, Expression* left_expr,
+			Expression* right_expr, Location location)
+{
+  Type* left_type = left_expr->type();
+  Type* right_type = right_expr->type();
+
+  mpz_t zval;
+  mpz_init_set_ui(zval, 0UL);
+  Expression* zexpr = Expression::make_integer(&zval, NULL, location);
+  mpz_clear(zval);
 
   enum tree_code code;
   switch (op)
@@ -6449,21 +6508,17 @@
   go_unreachable();
 }
 
+  // FIXME: Computing the tree here means it will be computed multiple times,
+  // which is wasteful.  This is a temporary modification until all tree code
+  // here can be replaced with frontend expressions.
+  tree left_tree = left_expr->get_tree(context);
+  tree right_tree = right_expr->get_tree(context);
   if (left_type->is_string_type() && right_type->is_string_type())
 {
-  Type* st = Type::make_string_type();
-  tree string_type = type_to_tree(st->get_backend(context->gogo()));
-  static tree string_compare_decl;
-  left_tree = Gogo::call_builtin(&string_compare_decl,
- location,
- "__go_strcmp",
- 2,
- int_type_tree,
- string_type,
- left_tree,
- string_type,
- right_tree);
-  right_tree = build_int_cst_type(int_type_tree, 0);
+  Expression* strcmp_call = Runtime::make_call(Runtime::STRCMP, location, 2,
+   left_expr, right_expr);
+  left_tree = strcmp_call->get_tree(context);
+  right_tree = zexpr->get_tree(context);
 }
   else if ((left_type->interface_type() != NULL
 	&& right_type->inte

Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-14 Thread Jeff Law

On 11/14/13 13:28, Diego Novillo wrote:

Functions in each corresponding .c file got moved to those
headers and others that already existed. I wanted to make this
patch as mechanical as possible, so I made no attempt to fix
problems like having build_addr defined in tree-inline.c. I left
that for later.
This seems backwards to me and just ensures double-churn. Once to move 
it now, then again to its final resting spot.


If this change is being made via some automated script, then, well, I 
guess it is what it is and we'll have to come back to them.  But if 
you're doing this by hand it seems to me that leaving it in its original 
location, possibly grouped with its friends, with a FIXME would be better.





There were some declarations that I could not move out of tree.h
because of header poisoning. We forbid the inclusion of things
like expr.h from FE files. While that's a reasonable idea, the FE
file *still* manage to at expr.c functionality because the
declarations they want to use were defined in tree.h.

If that functionality is allowed to be accessed from the FEs,
then I will later move those functions out of expr.c into tree.c.
I have moved these declarations to the bottom of tree.h so they
are easy to identify later.
Yea :(  Hell, this seems like a no-brainer that ought to go in as is, 
right now.  The ability to identify these warts quickly I'm sure will be 
useful.





There is a namespace collision with libcpp. The file gcc/symtab.c
cannot use gcc/symtab.h because the #include command picks up
libcpp/include/symtab.h first. So I named this file gcc-symtab.h
for now.

Seems reasonable.


- Some header files always need another header file. I chose to
   #include that header in the file. At this stage we want to do
   the opposite, but this would've added even more bulk to the
   change, so I left a FIXME marker for the next pass.
This seems a bit like a mistake.  How much of this patch would be 
blocked if we didn't allow this right now.


I'm keen to avoid you and Andrew stomping on each other, so I'd rather 
not go backwards on something like this.




[PATCH] Document -mabi=elfv[12] (Re: [PATCH, rs6000] ELFv2 ABI 1/8: Add options and infrastructure)

2013-11-14 Thread Ulrich Weigand
Joseph Myers wrote:
> On Tue, 12 Nov 2013, Ulrich Weigand wrote:
> > > > Therefore, it is introduces via a new pair of options
> > > >-mabi=elfv1 / -mabi=elfv2
> > > > where -mabi=elfv1 select the current Linux ABI, and -mabi=elfv2
> > > > selects the new one.
> > > 
> > > New command-line options need invoke.texi documentation.
> > 
> > As mentioned above, it's probably best to mark it undocumented.
> 
> No, even always-warning options like -mabi=ibmlongdouble are documented.

Ah, indeed; thanks for pointing that out.  Well, that's certainly
fine with me too.

Here's a patch to add documentation along the lines of what we have
for the longdouble switches.

Doc build tested on powerpc64-linux.

David, would that be OK for mainline, or do have other suggestions?

Bye,
Ulrich

ChangeLog:

* doc/invoke.texi (-mabi=elfv1, -mabi=elfv2): Document.

Index: gcc/gcc/doc/invoke.texi
===
--- gcc.orig/gcc/doc/invoke.texi
+++ gcc/gcc/doc/invoke.texi
@@ -18846,7 +18846,8 @@ SVR4 ABI)@.
 @opindex mabi
 Extend the current ABI with a particular extension, or remove such extension.
 Valid values are @var{altivec}, @var{no-altivec}, @var{spe},
-@var{no-spe}, @var{ibmlongdouble}, @var{ieeelongdouble}@.
+@var{no-spe}, @var{ibmlongdouble}, @var{ieeelongdouble},
+@var{elfv1}, @var{elfv2}@.
 
 @item -mabi=spe
 @opindex mabi=spe
@@ -18868,6 +18869,16 @@ This is a PowerPC 32-bit SYSV ABI option
 Change the current ABI to use IEEE extended-precision long double.
 This is a PowerPC 32-bit Linux ABI option.
 
+@item -mabi=elfv1
+@opindex mabi=elfv1
+Change the current ABI to use the ELFv1 ABI.
+This is a little-endian PowerPC 64-bit Linux ABI option.
+
+@item -mabi=elfv2
+@opindex mabi=elfv2
+Change the current ABI to use the ELFv2 ABI.
+This is a big-endian PowerPC 64-bit Linux ABI option.
+
 @item -mprototype
 @itemx -mno-prototype
 @opindex mprototype

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread H.J. Lu
On Thu, Nov 14, 2013 at 8:34 AM, Andrew MacLeod  wrote:
> On 11/14/2013 11:23 AM, Michael Matz wrote:
>>
>> Hi,
>>
>> On Thu, 14 Nov 2013, Andrew MacLeod wrote:
>>
 I think if following through with the whole plan there would (and
 should) be nothing remaining that could be called a gimple expression.
>>>
>>> very possibly, i just haven't gotten to those parts yet. I can change
>>> the name back to gimple-decl.[ch] or some such thing if you like that
>>> better.
>>
>> -object? -operand? -stuff? ;-)  Will all of these splits land at trunk,
>> i.e. 4.9?  Why the hurry when not even such high-level things are clear?
>> I mean how can you think about rearchitecting the gimple data structures
>> without having looked at the current details.  It's clear that not every
>> detail of the design can be fixated at this point, but basic questions
>> like "what's the operands?", "will there be expressions?", "how do we
>> iterate?", "recursive structures or not?" should at least get some answer
>> before really starting grind work, shouldn't they?
>>
> The splits are for header file cleanup and re-structuring into logical
> components.  As I mentioned in the original post,  the file is needed to
> break dependency cycles between gimple.h (the statements) , the iterators,
> and gimplification.  It is for the gimple stuff which doesn't need any of
> those things but is consumed by them.
>
> This really has nothing to do with my future plans, other than the fact that
> I also said whatever is in this file is will eventually be split into more
> things, but I'm not ready to do those splits yet, thus the gimple-blah name
> doesn't matter to me.  gimple-expr seemed convenient at the time but clearly
> you don't like it, and I'll happily call it whatever you want.  It's a grab
> bag of all the gimple values which are still trees...
>
> maybe the suggested  gimple-val.[ch] is ok?
>
> Andrew

It breaks Ada build.  I checked in the following patch to unbreak
it.


-- 
H.J.
---
Index: ada/ChangeLog
===
--- ada/ChangeLog(revision 204825)
+++ ada/ChangeLog(working copy)
@@ -1,3 +1,7 @@
+2013-11-14  H.J. Lu  
+
+* gcc-interface/trans.c: Include gimple.h and pointer-set.h.
+
 2013-11-12  Andrew MacLeod  

 * gcc-interface/trans.c: Include gimplify.h.
Index: ada/gcc-interface/trans.c
===
--- ada/gcc-interface/trans.c(revision 204825)
+++ ada/gcc-interface/trans.c(working copy)
@@ -33,7 +33,9 @@
 #include "output.h"
 #include "libfuncs.h"/* For set_stack_check_libfunc.  */
 #include "tree-iterator.h"
+#include "gimple.h"
 #include "gimplify.h"
+#include "pointer-set.h"
 #include "bitmap.h"
 #include "cgraph.h"
 #include "diagnostic.h"


Re: [patch] Fix stack allocation oddity

2013-11-14 Thread Jeff Law

On 11/14/13 04:52, Eric Botcazou wrote:

Hi,

we have a test in the gnat.dg testsuite (stack_usage1.adb) which checks that
the allocation of big temporaries created in non-overlapping blocks on the
stack is optimal, i.e. that they share a stack slot.  It is run at -O0 and
passes.  If you run it at -O2, it also passes.  Now, if you run it at -O1, it
fails and that's a regression from the pre-TREE_CLOBBER_P era.

The problem is that, when optimization is enabled, DECL_IGNORED_P variables
are removed from blocks by remove_unused_scope_block_p and moved to the
toplevel.  Now defer_stack_allocation has:

   /* Variables in the outermost scope automatically conflict with
  every other variable.  The only reason to want to defer them
  at all is that, after sorting, we can more efficiently pack
  small variables in the stack frame.  Continue to defer at -O2.  */
   if (toplevel && optimize < 2)
 return false;

The comment is slightly obsolete in the TREE_CLOBBER_P era, since toplevel
variables don't necessarily conflict with each other, for example the above
variables moved to toplevel by remove_unused_scope_block_p.

We don't think that we need to tweak again remove_unused_scope_block_p in the
TREE_CLOBBER_P era; instead we can defer the allocation of big DECL_IGNORED_P
variables at toplevel from defer_stack_allocation.

Tested on x86_64-suse-linux, OK for the mainline?


2013-11-14  Olivier Hainque  

* cfgexpand.c (defer_stack_allocation): When optimization is enabled,
defer allocation of DECL_IGNORED_P variables at toplevel unless really
small.  Factorize size threshold computation from the existing one.
(expand_used_vars): Refine comment.


2013-11-14  Eric Botcazou  

* gnat.dg/stack_usage1b.adb: New test.
* gnat.dg/stack_usage1c.adb: Likewise.

This looks fine to me.

Thanks,
jeff


Re: [PATCH 1/6] Convert gimple types from a union to C++ inheritance

2013-11-14 Thread Jeff Law

On 10/31/13 10:26, David Malcolm wrote:

* Makefile.in (GIMPLE_H): Add dep on is-a.h.
Not asking you, but I'd like to hope many of the *_H things in 
Makefile.in should be going away...






diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index cc88fb8..7fbb533 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -887,7 +887,7 @@ BASIC_BLOCK_H = basic-block.h $(PREDICT_H) $(VEC_H) 
$(FUNCTION_H) \
cfg-flags.def cfghooks.h
  GIMPLE_H = gimple.h gimple.def gsstruct.def pointer-set.h $(VEC_H) \
$(GGC_H) $(BASIC_BLOCK_H) $(TREE_H) tree-ssa-operands.h \
-   tree-ssa-alias.h $(INTERNAL_FN_H) $(HASH_TABLE_H)
+   tree-ssa-alias.h $(INTERNAL_FN_H) $(HASH_TABLE_H) is-a.h
  GCOV_IO_H = gcov-io.h gcov-iov.h auto-host.h
  RECOG_H = recog.h
  EMIT_RTL_H = emit-rtl.h
Ugh.  OK I guess.  I hate all these _H thingies.  Ideally they'll go 
away at some point.  I think their biggest use now is for 
PLUGIN_HEADERS.  But that's not an issue for this patch to go forward.





diff --git a/gcc/ggc.h b/gcc/ggc.h
index b31bc80..bb8f939 100644
--- a/gcc/ggc.h
+++ b/gcc/ggc.h
@@ -269,10 +269,10 @@ ggc_alloc_cleared_tree_node_stat (size_t s MEM_STAT_DECL)
return (union tree_node *) ggc_internal_cleared_alloc_stat (s 
PASS_MEM_STAT);
  }

-static inline union gimple_statement_d *
-ggc_alloc_cleared_gimple_statement_d_stat (size_t s MEM_STAT_DECL)
+static inline struct gimple_statement_base *
+ggc_alloc_cleared_gimple_statement_stat (size_t s MEM_STAT_DECL)
  {
-  return (union gimple_statement_d *)
+  return (struct gimple_statement_base *)
  ggc_internal_cleared_alloc_stat (s PASS_MEM_STAT);
  }
Didn't I see something in the last 48hrs indicating that we don't need 
"static inline" anymore, just "inline"?  If so, can you drop the static 
here since you're changing it already.



With that, this, IMO is OK and a definite step forward.

Given the contention over this, please give other maintainers 24hrs to 
object before installing the set.


jeff




Re: [c] Remove unnecessary host_integerp check

2013-11-14 Thread Jeff Law

On 11/14/13 13:46, Richard Sandiford wrote:

pp_c_character_constant only calls pp_p_char for values that fit into
a HWI of the constant's signedness (i.e. an unsigned HWI if TYPE_UNSIGNED
and a signed HWI otherwise).  But pp_c_character_constant is only called by:

 case INTEGER_CST:
   {
tree type = TREE_TYPE (e);
 ...
else if (type == char_type_node)
  pp_c_character_constant (this, e);

and in practice a character constant is always going to fit into a HWI.
The current !host_integerp case simply truncates the constant to an
unsigned int anyway.

Maybe the type == wchar_type_node test is dead too, I'm not sure.
I'm happy to remove it at the same time if that seems like the right
thing to do.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/c-family/
* c-pretty-print.c (pp_c_character_constant): Remove unnecessary
host_integerp check.

Fine by me.  Your call on the type == wchar_type_code.

jeff



Re: [RFC PATCH] add auto_bitmap

2013-11-14 Thread Jeff Law

On 11/14/13 14:14, Richard Biener wrote:


I'm just pointed out that of all the stuff you changed, these were the
only ones I saw where lifetimes were changed significantly.


I still ask why we need a new type and cannot put this functionality into 
bitmap_head itself.
Given that bitmap is just a *bitmap_head_def aren't we suggesting the 
same thing?


jeff


Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-14 Thread H.J. Lu
On Thu, Nov 14, 2013 at 6:16 AM, Arnaud Charlet  wrote:
>> I also changed s-osinte-posix.adb and s-osprim-posix.adb
>> for x32.  They aren't Linux specific.  What should I do with
>> them?
>
> I would use the time_t type defined in s-osinte* (all POSIX implementations
> of s-osinte* have such definition, or if they don't, it's easy to add), and
> in the s-osinte-linux version we can have a renaming:
>
>subtype time_t is System.Linux.time_t
>
> and in System.Linux have either:
>
>type time_t is new Long_Integer;
>
> or
>
>type time_t is new Long_Long_Integer;
>
> depending on the variant.
>
> Arno

Another problem.  s-osprim-posix.adb has

   --  ??? These definitions are duplicated from System.OS_Interface
   --  because we don't want to depend on any package. Consider removing
   --  these declarations in System.OS_Interface and move these ones in
   --  the spec.

I can't use time_t from s-osinte-linux.ads since System.OS_Interface
isn't available. What should I do?

-- 
H.J.


Re: [RFC PATCH] add auto_bitmap

2013-11-14 Thread Richard Biener
Jeff Law  wrote:
>On 11/14/13 04:04, tsaund...@mozilla.com wrote:
>> From: Trevor Saunders 
>>
>> Hi,
>>
>> this patch adds and starts to use a class auto_bitmap, which is a
>very thin
>> wrapper around bitmap.  Its advantage is that it takes care of
>delocation
>> automatically.  So you can do things like
>>
>> int
>> f ()
>> {
>>auto_bitmap x;
>>// do stuff with x
>> }
>>
>> Another advantage of this class is it puts the bitmap_head struct on
>the stack
>> instead of mallocing it or using a obstack.
>>
>> I Think the biggest question is if I should make auto_bitmap a full
>c++ified
>> wrapper around   bitmap or if I should contiune just taking the
>address of it
>> and passing it as a bitmap, but other comments are of course welcome
>too.
>I'd prefer to see it fully c++ified.
>
>In response to one of Richi's comments, I spot checked the patch and 
>only found two occurrences where this lengthened the lifetime of the 
>bitmap in any significant way.  The vast majority of the time any 
>increase in length was trivial.
>
>Those instances are in tree-ssa-loop-ivopts.c and the other in 
>tree-ssa-strlen.c.  I don't think you need to change anything for them,
>
>I'm just pointed out that of all the stuff you changed, these were the 
>only ones I saw where lifetimes were changed significantly.

I still ask why we need a new type and cannot put this functionality into 
bitmap_head itself.

Richard.

>jeff




Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-14 Thread Richard Biener
Sergey Ostanevich  wrote:
>this is only for the whole file? I mean to have a particular loop
>vectorized in a
>file while all others - up to compiler's cost model. is there such a
>machinery?

No, there is not.

Richard.

>Sergos
>
>On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener 
>wrote:
>> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
>>
>>> I will get some tests.
>>> As for cost analysis - simply consider the pragma as a request to
>>> vectorize. How can I - as a developer - enforce it beyond the
>pragma?
>>
>> You can disable the cost model via -fvect-cost-model=unlimited
>>
>> Richard.
>>
>>> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener 
>wrote:
>>> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
>>> >
>>> >> The reason patch was in its original state is because we want
>>> >> to notify user that his assumption of profitability may be wrong.
>>> >> This is not a part of any spec and as far as I know ICC does not
>>> >> notify user about the case. Still it can be a good hint for those
>>> >> users who tries to get as much as possible performance.
>>> >>
>>> >> Richard's comment on the vectorization problems is about the same
>-
>>> >> to inform user that his attempt to force vectorization is failed.
>>> >>
>>> >> As for profitable or not - sometimes I believe it's impossible to
>be
>>> >> precise. For OMP we have case of a vector version of a function
>>> >> and we have no chance to figure out whether it is profitable to
>use
>>> >> it or to loose it. If we can't map the loop for any vector length
>>> >> other than 1 - I believe in this case we have to bail out and
>report.
>>> >> Is it about 'never profitable'?
>>> >
>>> > For example.  I think we should report non-vectorized loops
>>> > that are marked with force_vect anyway, with
>-Wdisabled-optimization.
>>> > Another case is that a loop may be profitable to vectorize if
>>> > the ISA supports a gather instruction but otherwise not.  Or if
>the
>>> > ISA supports efficient vector construction from N not loop
>>> > invariant scalars (for vectorization of strided loads).
>>> >
>>> > Simply disregarding all of the cost analysis sounds completely
>>> > bogus to me.
>>> >
>>> > I'd simply go for the diagnostic for now, not changing anything
>else.
>>> > We want to have a good understanding about why the cost model is
>>> > so bad that we have to force to ignore it for #pragma simd - thus
>we
>>> > want testcases.
>>> >
>>> > Richard.
>>> >
>>> >>
>>> >> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener
> wrote:
>>> >> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
>>> >> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich
>wrote:
>>> >> >>> ivdep just substitutes all cross-iteration data analysis,
>>> >> >>> nothing related to cost model. ICC does not cancel its
>>> >> >>> cost model in case of #pragma ivdep
>>> >> >>>
>>> >> >>> as for the safelen - OMP standart treats it as a limitation
>>> >> >>> for the vector length. this means if no safelen is present
>>> >> >>> an arbitrary vector length can be used.
>>> >> >>
>>> >> >> I was talking about GCC loop->safelen, which is INT_MAX for
>#pragma omp simd
>>> >> >> without safelen clause or #pragma simd without vectorlength
>clause.
>>> >> >>
>>> >> >>> so I believe loop->force_vect is the only trigger to
>disregard
>>> >> >>> the cost model
>>> >> >>
>>> >> >> Anyway, in that case I think the originally posted patch is
>wrong,
>>> >> >> if we want to treat force_vect as disregard all the cost model
>and
>>> >> >> force vectorization (well, the name of the field already kind
>of suggest
>>> >> >> that), then IMHO we should treat it the same as
>-fvect-cost-model=unlimited
>>> >> >> for those loops.
>>> >> >
>>> >> > Err - the user may have a specific sub-architecture in mind
>when using
>>> >> > #pragma simd, if you say we should completely ignore the cost
>model
>>> >> > then should we also sorry () if we cannot vectorize the loop
>(either
>>> >> > because of GCC deficiencies or lack of sub-target support)?
>>> >> >
>>> >> > That said, at least in the cases that the cost model says the
>loop
>>> >> > is never profitable to vectorize we should follow its advice.
>>> >> >
>>> >> > Richard.
>>> >> >
>>> >> >> Thus (untested):
>>> >> >>
>>> >> >> 2013-11-12  Jakub Jelinek  
>>> >> >>
>>> >> >>   * tree-vect-loop.c (vect_estimate_min_profitable_iters):
>Use
>>> >> >>   unlimited cost model also for force_vect loops.
>>> >> >>
>>> >> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.0
>+0100
>>> >> >> +++ gcc/tree-vect-loop.c  2013-11-12 15:11:43.821404330
>+0100
>>> >> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
>>> >> >>void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA
>(loop_vinfo);
>>> >> >>
>>> >> >>/* Cost model disabled.  */
>>> >> >> -  if (unlimited_cost_model ())
>>> >> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP
>(loop_vinfo)->force_vect)
>>> >> >>  {
>>> >> >>dump_printf_loc (MSG_NOTE, vect_location, "cost model

Remove host_integerp tests with variable signedness

2013-11-14 Thread Richard Sandiford
Apart from the case in the C front end, there are 4 calls to host_integerp
with a variable "pos" argument.  These "pos" arguments are all taken from
TYPE_UNSIGNED.  In the dwarf2out.c case we go on to require:

  simple_type_size_in_bits (TREE_TYPE (value)) <= HOST_BITS_PER_WIDE_INT
  || host_integerp (value, 0)

The host_integerp (value, 0) makes the first host_integerp trivially
redundant for !TYPE_UNSIGNED.  Checking that the precision is
<= HOST_BITS_PER_WIDE_INT makes the first host_integerp redundant
for TYPE_UNSIGNED too, since all unsigned types of those precisions
will fit in an unsigned HWI.  We already know that we're dealing with
an INTEGER_CST, so there's no need for a code check either.

vect_recog_divmod_pattern is similar in the sense that we already know
that we have an INTEGER_CST and that we specifically check for precisions
<= HOST_BITS_PER_WIDE_INT.

In the other two cases we don't know whether we're dealing with an
INTEGER_CST but we do check for precisions <= HOST_BITS_PER_WIDE_INT.
So these host_integerps reduce to code tests.  (The precision check
for expand_vector_divmod doesn't show up in the context but is at
the top of the function:

  if (prec > HOST_BITS_PER_WIDE_INT)
return NULL_TREE;
)

I also replaced the associated tree_low_csts with TREE_INT_CST_LOWs.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* dwarf2out.c (gen_enumeration_type_die): Remove unnecessary
host_integerp test.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Likewise.
Use TREE_INT_CST_LOW rather than tree_low_cst when reading the
constant.
* fold-const.c (fold_binary_loc): Replace a host_integerp/tree_low_cst
pair with a TREE_CODE test and TREE_INT_CST_LOW.
* tree-vect-generic.c (expand_vector_divmod): Likewise.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c 2013-11-14 20:21:27.183058648 +
+++ gcc/dwarf2out.c 2013-11-14 20:22:18.128829681 +
@@ -17321,9 +17321,8 @@ gen_enumeration_type_die (tree type, dw_
  if (TREE_CODE (value) == CONST_DECL)
value = DECL_INITIAL (value);
 
- if (host_integerp (value, TYPE_UNSIGNED (TREE_TYPE (value)))
- && (simple_type_size_in_bits (TREE_TYPE (value))
- <= HOST_BITS_PER_WIDE_INT || host_integerp (value, 0)))
+ if (simple_type_size_in_bits (TREE_TYPE (value))
+ <= HOST_BITS_PER_WIDE_INT || host_integerp (value, 0))
/* DWARF2 does not provide a way of indicating whether or
   not enumeration constants are signed or unsigned.  GDB
   always assumes the values are signed, so we output all
Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2013-11-14 20:21:27.183058648 +
+++ gcc/tree-vect-patterns.c2013-11-14 20:22:18.129829676 +
@@ -2064,9 +2064,8 @@ vect_recog_divmod_pattern (vec *
   return pattern_stmt;
 }
 
-  if (!host_integerp (oprnd1, TYPE_UNSIGNED (itype))
-  || integer_zerop (oprnd1)
-  || prec > HOST_BITS_PER_WIDE_INT)
+  if (prec > HOST_BITS_PER_WIDE_INT
+  || integer_zerop (oprnd1))
 return NULL;
 
   if (!can_mult_highpart_p (TYPE_MODE (vectype), TYPE_UNSIGNED (itype)))
@@ -2078,8 +2077,8 @@ vect_recog_divmod_pattern (vec *
 {
   unsigned HOST_WIDE_INT mh, ml;
   int pre_shift, post_shift;
-  unsigned HOST_WIDE_INT d = tree_low_cst (oprnd1, 1)
-& GET_MODE_MASK (TYPE_MODE (itype));
+  unsigned HOST_WIDE_INT d = (TREE_INT_CST_LOW (oprnd1)
+ & GET_MODE_MASK (TYPE_MODE (itype)));
   tree t1, t2, t3, t4;
 
   if (d >= ((unsigned HOST_WIDE_INT) 1 << (prec - 1)))
@@ -2195,7 +2194,7 @@ vect_recog_divmod_pattern (vec *
 {
   unsigned HOST_WIDE_INT ml;
   int post_shift;
-  HOST_WIDE_INT d = tree_low_cst (oprnd1, 0);
+  HOST_WIDE_INT d = TREE_INT_CST_LOW (oprnd1);
   unsigned HOST_WIDE_INT abs_d;
   bool add = false;
   tree t1, t2, t3, t4;
Index: gcc/fold-const.c
===
--- gcc/fold-const.c2013-11-14 20:21:27.183058648 +
+++ gcc/fold-const.c2013-11-14 20:22:18.124829699 +
@@ -12032,16 +12032,15 @@ fold_binary_loc (location_t loc,
 if the new mask might be further optimized.  */
   if ((TREE_CODE (arg0) == LSHIFT_EXPR
   || TREE_CODE (arg0) == RSHIFT_EXPR)
- && host_integerp (TREE_OPERAND (arg0, 1), 1)
- && host_integerp (arg1, TYPE_UNSIGNED (TREE_TYPE (arg1)))
- && tree_low_cst (TREE_OPERAND (arg0, 1), 1)
-< TYPE_PRECISION (TREE_TYPE (arg0))
  && TYPE_PRECISION (TREE_TYPE (arg0)) <= HOST_BITS_PER_WIDE_INT
- && tree_low_cst (TREE_OPERAND (arg0, 1), 1) > 0)
+ && TREE_CODE (arg1) == INTEGER_CST
+  

[PATCH][PR middle-end/59127] Fix Ada bootstrap on x86_64-unknown-linux-gnu

2013-11-14 Thread Jeff Law


This patch fixes two issues, the most important issue is the related to 
the Ada build failures on the trunk.


When non-call-exceptions is on, most memory references potentially 
throw.  As a result those statements end basic blocks.  This causes
checking failures when the __builtin_trap is placed immediately after 
the memory reference because we find the memory reference in the middle 
of a basic block.


While I think we could support this with some more work, it just doesn't 
seem worth the effort.  It certainly isn't something that's occurring 
with any regularity AFAICT when buliding the Ada compiler/runtime system.


It's easiest to just disallow optimization when the statement that 
triggers undefined behaviour ends a block -- with the exception of 
GIMPLE_RETURN.  That captures the key issue, namely that the code is not 
currently prepared to have the trap in a separate block from the 
statement triggering undefined behaviour.


I didn't make a testcase for this failure because it triggers during 
bootstrapping Ada and my Ada-fu has severely eroded since the 80s when I 
was forced to learn Ada.


The second issue is when a block has outgoing abnormal edges, out of an 
abundance of caution, we should simply not apply the optimization.  That 
may be a bit too cautious, but it's clearly the safe thing to do.  I 
don't have a testcase for this.


In addition to the usual bootstrap & regression test on 
x86_64-unknown-linux-gnu, this patch fixes the Ada bootstrap on 
x86_64-unknown-linux-gnu and shows no regressions in the Ada test suite 
when compared to a compiler with the optimization totally disabled.  ie, 
a poor mans regression test for Ada given that Ada wasn't bootstrapping 
w/o this patch.


Applied to the trunk.

Jeff

ps.  Obviously the next thing to verify is that Ada is bootstrapping on 
Itanic, but there's other potential issues on Itanic that might get in 
the way.


* basic-block.h (has_abnormal_outgoing_edge_p): Moved here from...
* tree-inline.c (has_abnormal_outgoing_edge_p): Remove.
* gimple-ssa-isolate-paths.c: Include tree-cfg.h.
(find_implicit_erroneous_behaviour): If a block has abnormal outgoing
edges, then ignore it.  If the statement exhibiting erroneous
behaviour ends basic blocks, with the exception of GIMPLE_RETURNs,
then we can not optimize.
(find_explicit_erroneous_behaviour): Likewise.

 
diff --git a/gcc/basic-block.h b/gcc/basic-block.h
index 9c28f14..b7e3b50 100644
--- a/gcc/basic-block.h
+++ b/gcc/basic-block.h
@@ -1008,4 +1008,19 @@ inverse_probability (int prob1)
   check_probability (prob1);
   return REG_BR_PROB_BASE - prob1;
 }
+
+/* Return true if BB has at least one abnormal outgoing edge.  */
+
+static inline bool
+has_abnormal_outgoing_edge_p (basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+if (e->flags & EDGE_ABNORMAL)
+  return true;
+
+  return false;
+}
 #endif /* GCC_BASIC_BLOCK_H */
diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
index 108b98e..66c13f4 100644
--- a/gcc/gimple-ssa-isolate-paths.c
+++ b/gcc/gimple-ssa-isolate-paths.c
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ssa-iterators.h"
 #include "cfgloop.h"
 #include "tree-pass.h"
+#include "tree-cfg.h"
 
 
 static bool cfg_altered;
@@ -215,6 +216,17 @@ find_implicit_erroneous_behaviour (void)
 {
   gimple_stmt_iterator si;
 
+  /* Out of an abundance of caution, do not isolate paths to a
+block where the block has any abnormal outgoing edges.
+
+We might be able to relax this in the future.  We have to detect
+when we have to split the block with the NULL dereference and
+the trap we insert.  We have to preserve abnormal edges out
+of the isolated block which in turn means updating PHIs at
+the targets of those abnormal outgoing edges.  */
+  if (has_abnormal_outgoing_edge_p (bb))
+   continue;
+
   /* First look for a PHI which sets a pointer to NULL and which
 is then dereferenced within BB.  This is somewhat overly
 conservative, but probably catches most of the interesting
@@ -256,8 +268,15 @@ find_implicit_erroneous_behaviour (void)
{
  /* We only care about uses in BB.  Catching cases in
 in other blocks would require more complex path
-isolation code.  */
- if (gimple_bb (use_stmt) != bb)
+isolation code. 
+
+If the statement must end a block and is not a
+GIMPLE_RETURN, then additional work would be
+necessary to isolate the path.  Just punt it for
+now.  */
+ if (gimple_bb (use_stmt) != bb
+ || (stmt_ends_bb_p (use_stmt)
+ && gimple_code (use_stmt) != GIMPLE_RETURN))
  

Re: [PATCH] PR ada/54040: [x32] Incorrect timeval and timespec

2013-11-14 Thread H.J. Lu
On Thu, Nov 14, 2013 at 6:16 AM, Arnaud Charlet  wrote:
>> I also changed s-osinte-posix.adb and s-osprim-posix.adb
>> for x32.  They aren't Linux specific.  What should I do with
>> them?
>
> I would use the time_t type defined in s-osinte* (all POSIX implementations
> of s-osinte* have such definition, or if they don't, it's easy to add), and
> in the s-osinte-linux version we can have a renaming:
>
>subtype time_t is System.Linux.time_t
>
> and in System.Linux have either:
>
>type time_t is new Long_Integer;
>
> or
>
>type time_t is new Long_Long_Integer;
>
> depending on the variant.
>
> Arno

Please bear with me.  I know next to nothing about Ada.
I tried:

diff --git a/gcc/ada/s-linux.ads b/gcc/ada/s-linux.ads
index c8a7ad1..6244ca7 100644
--- a/gcc/ada/s-linux.ads
+++ b/gcc/ada/s-linux.ads
@@ -38,6 +38,24 @@
 package System.Linux is
pragma Preelaborate;

+   
+   -- time_t --
+   
+
+   type time_t is new Long_Integer;
+
+   ---
+   -- tv_nsec_t --
+   ---
+
+   type tv_nsec_t is new Long_Integer;
+
+   ---
+   -- timeval_element_t --
+   ---
+
+   type timeval_element_t is new Long_Integer;
+
---
-- Errno --
---
diff --git a/gcc/ada/s-osinte-linux.ads b/gcc/ada/s-osinte-linux.ads
index a99c4e5..6fbe13a 100644
--- a/gcc/ada/s-osinte-linux.ads
+++ b/gcc/ada/s-osinte-linux.ads
@@ -596,11 +596,12 @@ private

type pid_t is new int;

-   type time_t is new long;
+   subtype time_t is System.Linux.time_t;
+   subtype tv_nsec_t is System.Linux.tv_nsec_t;

type timespec is record
   tv_sec  : time_t;
-  tv_nsec : long;
+  tv_nsec : tv_nsec_t;
end record;
pragma Convention (C, timespec);


and got

s-osinte.adb:102:17: operator for type "System.Linux.time_t" is not
directly visible
s-osinte.adb:102:17: use clause would make operation legal
s-osinte.adb:107:35: expected type "System.Linux.tv_nsec_t"
s-osinte.adb:107:35: found type "Interfaces.C.long"
make[7]: *** [s-osinte.o] Error 1

How do I resolve this?

Thanks.

-- 
H.J.


Re: [PATCH] Fix lto-profiledbootstrap [was: Merge cgraph_get_create_node and cgraph_get_create_real_symbol_node]

2013-11-14 Thread Uros Bizjak
Now with the patch attached.

On Thu, Nov 14, 2013 at 9:49 PM, Uros Bizjak  wrote:
> On Thu, Nov 14, 2013 at 7:16 PM, Uros Bizjak  wrote:
>
> Attached patch fixes lto-profiledbootstrap, introduced by:
>
> 2013-11-12  Martin Jambor  
>
> * cgraph.c (cgraph_get_create_node): Do what
> cgraph_get_create_real_symbol_node used to do.
> (cgraph_get_create_real_symbol_node): Removed.  Changed all users to
> call cgraph_get_create_node.
> * cgraph.h (cgraph_get_create_real_symbol_node): Removed.
> * lto-streamer-in.c (input_function): Call cgraph_get_node instead of
> cgraph_get_create_node.  Assert we get a node.
>
> The patch reverts lto-streamer-in.c functionality back to what was
> doing before the above patch.
>
> 2013-11-14  Uros Bizjak  
>
> * lto-streamer-in.c (input function): Call cgraph_create_node if
> cgraph_get_node failed.
>
> Tested with lto-profiledbootstrap on x86_64-pc-linux-gnu, regression
> tested also with -m32 [1].
>
> OK for mainline?
>
> [1] http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg00989.html
>
> Uros.
Index: lto-streamer-in.c
===
--- lto-streamer-in.c   (revision 204807)
+++ lto-streamer-in.c   (working copy)
@@ -917,7 +917,8 @@ input_function (tree fn_decl, struct data_in *data
   gimple_register_cfg_hooks ();
 
   node = cgraph_get_node (fn_decl);
-  gcc_checking_assert (node);
+  if (!node)
+node = cgraph_create_node (fn_decl);
   input_struct_function_base (fn, data_in, ib);
   input_cfg (ib_cfg, fn, node->count_materialization_scale);
 


[PATCH] Fix lto-profiledbootstrap [was: Merge cgraph_get_create_node and cgraph_get_create_real_symbol_node]

2013-11-14 Thread Uros Bizjak
On Thu, Nov 14, 2013 at 7:16 PM, Uros Bizjak  wrote:

Attached patch fixes lto-profiledbootstrap, introduced by:

 2013-11-12  Martin Jambor  

 * cgraph.c (cgraph_get_create_node): Do what
 cgraph_get_create_real_symbol_node used to do.
 (cgraph_get_create_real_symbol_node): Removed.  Changed all users to
 call cgraph_get_create_node.
 * cgraph.h (cgraph_get_create_real_symbol_node): Removed.
 * lto-streamer-in.c (input_function): Call cgraph_get_node instead of
 cgraph_get_create_node.  Assert we get a node.

The patch reverts lto-streamer-in.c functionality back to what was
doing before the above patch.

2013-11-14  Uros Bizjak  

* lto-streamer-in.c (input function): Call cgraph_create_node if
cgraph_get_node failed.

Tested with lto-profiledbootstrap on x86_64-pc-linux-gnu, regression
tested also with -m32 [1].

OK for mainline?

[1] http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg00989.html

Uros.


[c] Remove unnecessary host_integerp check

2013-11-14 Thread Richard Sandiford
pp_c_character_constant only calls pp_p_char for values that fit into
a HWI of the constant's signedness (i.e. an unsigned HWI if TYPE_UNSIGNED
and a signed HWI otherwise).  But pp_c_character_constant is only called by:

case INTEGER_CST:
  {
tree type = TREE_TYPE (e);
...
else if (type == char_type_node)
  pp_c_character_constant (this, e);

and in practice a character constant is always going to fit into a HWI.
The current !host_integerp case simply truncates the constant to an
unsigned int anyway.

Maybe the type == wchar_type_node test is dead too, I'm not sure.
I'm happy to remove it at the same time if that seems like the right
thing to do.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/c-family/
* c-pretty-print.c (pp_c_character_constant): Remove unnecessary
host_integerp check.

Index: gcc/c-family/c-pretty-print.c
===
--- gcc/c-family/c-pretty-print.c   2013-11-14 20:21:27.183058648 +
+++ gcc/c-family/c-pretty-print.c   2013-11-14 20:22:20.664818284 +
@@ -954,10 +954,7 @@ pp_c_character_constant (c_pretty_printe
   if (type == wchar_type_node)
 pp_character (pp, 'L');
   pp_quote (pp);
-  if (host_integerp (c, TYPE_UNSIGNED (type)))
-pp_c_char (pp, tree_low_cst (c, TYPE_UNSIGNED (type)));
-  else
-pp_scalar (pp, "\\x%x", (unsigned) TREE_INT_CST_LOW (c));
+  pp_c_char (pp, (unsigned) TREE_INT_CST_LOW (c));
   pp_quote (pp);
 }
 


Re: [RFC PATCH] add auto_bitmap

2013-11-14 Thread Jeff Law

On 11/14/13 07:52, Richard Biener wrote:

Another advantage of this class is it puts the bitmap_head struct on the stack

instead of mallocing it or using a obstack.


Hm, but then eventually you increase the lifetime of the bitmap
until the scope closes.
Yea, but often that's when we're releasing them anyway ;-)  I don't 
think we've gone to too much trouble to try and release the bitmaps as 
soon as they're no longer needed.  Releasing at end of scope seems fine 
to me.


I'm a fan of RAII style code.  It's less error prone and often results 
in code that is easier to understand because the code isn't cluttered 
with resource de-allocation either in if bodies or in blocks reached by 
gotos.


Jeff



Re: [RFC PATCH] add auto_bitmap

2013-11-14 Thread Jeff Law

On 11/14/13 04:04, tsaund...@mozilla.com wrote:

From: Trevor Saunders 

Hi,

this patch adds and starts to use a class auto_bitmap, which is a very thin
wrapper around bitmap.  Its advantage is that it takes care of delocation
automatically.  So you can do things like

int
f ()
{
   auto_bitmap x;
   // do stuff with x
}

Another advantage of this class is it puts the bitmap_head struct on the stack
instead of mallocing it or using a obstack.

I Think the biggest question is if I should make auto_bitmap a full c++ified
wrapper around   bitmap or if I should contiune just taking the address of it
and passing it as a bitmap, but other comments are of course welcome too.

I'd prefer to see it fully c++ified.

In response to one of Richi's comments, I spot checked the patch and 
only found two occurrences where this lengthened the lifetime of the 
bitmap in any significant way.  The vast majority of the time any 
increase in length was trivial.


Those instances are in tree-ssa-loop-ivopts.c and the other in 
tree-ssa-strlen.c.  I don't think you need to change anything for them, 
I'm just pointed out that of all the stuff you changed, these were the 
only ones I saw where lifetimes were changed significantly.


jeff



Re: [gomp4, WIP] Elementals improvements

2013-11-14 Thread Jakub Jelinek
On Fri, Nov 15, 2013 at 06:26:28AM +1000, Richard Henderson wrote:
> On 11/15/2013 06:13 AM, Jakub Jelinek wrote:
> > On Fri, Nov 15, 2013 at 05:48:27AM +1000, Richard Henderson wrote:
> >> Pointers are certainly a decent fallback that would always be compatible,
> >> but I wonder if we need go that far.
> >>
> >> Each target will have a (set of) natural simdlen to which it vectorizes.  
> >> This
> >> is the set returned by autovectorize_vector_sizes.  That means we've got
> >> registers of those sizes, and probably parameter passing of those sizes 
> >> will be
> >> efficient.  It's easy to split input parameters into multiples, as you've 
> >> done;
> >> no reason this can't apply generically.
> > 
> > The problem is that if a target doesn't support target attribute (all
> > targets except x86_64/i686/powerpc* right now), what do you do if command
> > line options when compiling the #pragma omp declare simd definition don't
> > include target options needed for use of the supposedly vector registers the
> > ABI wants to pass the arguments or return value in?
> 
> Error or sorry.  We really have no other choice.

Well, that other choice is the pointer passing, perhaps tiny bit slower, but
it will just work.

In the patch right now a target hook let's decide what to do (with default
no SIMD clones at all).  We can easily provide say two generic
definitions of the target hook, one which uses pointers, one which uses
vector arguments, and let the target maintainers choose what is best for
them.

> I think it should be an array of vectors, ensuring that we can perform
> efficient aligned accesses to the array on both sides of the call.
> 
> I believe that Ada can return an ARRAY_TYPE.

Ok, will try that (though, likely only early next week, want to spend
another day on Asan tomorrow).

Jakub


Re: Recent Go patch broke Solaris bootstrap

2013-11-14 Thread Ian Lance Taylor
On Tue, Nov 12, 2013 at 7:16 AM, Rainer Orth
 wrote:
> Ian Lance Taylor  writes:
>
>> On Tue, Nov 12, 2013 at 6:43 AM, Rainer Orth
>>  wrote:
>>>
>>> works on Solaris 11, but not on Solaris 9 and 10 which lack
>>> TCP_KEEPALIVE_THRESHOLD.
>>
>> Do they have any facility for changing the keepalive timers?
>
> There's TCP_KEEPALIVE which is already used in tcpsockopt_darwin.go.

OK, great.  This patch changes the Solaris code to use the existing
Darwin code.  Bootstrapped on x86_64-unknown-linux-gnu, not that that
proves much.  Committed to mainline.

Ian
diff -r bfe1d96993b8 libgo/Makefile.am
--- a/libgo/Makefile.am	Thu Nov 14 12:13:29 2013 -0800
+++ b/libgo/Makefile.am	Thu Nov 14 12:16:15 2013 -0800
@@ -752,7 +752,7 @@
 go_net_tcpsockopt_file = go/net/tcpsockopt_darwin.go
 else
 if LIBGO_IS_SOLARIS
-go_net_tcpsockopt_file = go/net/tcpsockopt_solaris.go
+go_net_tcpsockopt_file = go/net/tcpsockopt_darwin.go
 else
 go_net_tcpsockopt_file =  go/net/tcpsockopt_unix.go
 endif
diff -r bfe1d96993b8 libgo/go/net/tcpsockopt_solaris.go
--- a/libgo/go/net/tcpsockopt_solaris.go	Thu Nov 14 12:13:29 2013 -0800
+++ /dev/null	Thu Jan 01 00:00:00 1970 +
@@ -1,25 +0,0 @@
-// Copyright 2009 The Go Authors.  All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-package net
-
-import (
-	"os"
-	"syscall"
-	"time"
-)
-
-// Set keep alive period.
-func setKeepAlivePeriod(fd *netFD, d time.Duration) error {
-	if err := fd.incref(); err != nil {
-		return err
-	}
-	defer fd.decref()
-
-	// The kernel expects milliseconds so round to next highest millisecond.
-	d += (time.Millisecond - time.Nanosecond)
-	msecs := int(d.Nanoseconds() / time.Millisecond)
-
-	return os.NewSyscallError("setsockopt", syscall.SetsockoptInt(fd.sysfd, syscall.IPPROTO_TCP, syscall.TCP_KEEPALIVE_THRESHOLD, msecs))
-}


Re: [gomp4, WIP] Elementals improvements

2013-11-14 Thread Richard Henderson
On 11/15/2013 06:13 AM, Jakub Jelinek wrote:
> On Fri, Nov 15, 2013 at 05:48:27AM +1000, Richard Henderson wrote:
>> Pointers are certainly a decent fallback that would always be compatible,
>> but I wonder if we need go that far.
>>
>> Each target will have a (set of) natural simdlen to which it vectorizes.  
>> This
>> is the set returned by autovectorize_vector_sizes.  That means we've got
>> registers of those sizes, and probably parameter passing of those sizes will 
>> be
>> efficient.  It's easy to split input parameters into multiples, as you've 
>> done;
>> no reason this can't apply generically.
> 
> The problem is that if a target doesn't support target attribute (all
> targets except x86_64/i686/powerpc* right now), what do you do if command
> line options when compiling the #pragma omp declare simd definition don't
> include target options needed for use of the supposedly vector registers the
> ABI wants to pass the arguments or return value in?

Error or sorry.  We really have no other choice.

There is an element to these declare simd declarations that is inherently
non-portable.  You simply cannot use the same declarations for ARM that you can
for AVX2.

>> It's the return value wider than the register size that's tricky.  Here I 
>> think
>> we may be best off returning a struct/array and letting the base calling
>> convention handle it.  Normally that _will_ be via a pointer, but sometimes
>> that pointer will be in some special non-parameter register.  Thus I think
>> we're best off not performing the hidden argument conversion manually.
> 
> Shall it be array of vectors, array of the scalar types, struct with such
> arrays?  I mean, do targets handle returning ARRAY_TYPE at all (does any FE
> produce those)?

I think it should be an array of vectors, ensuring that we can perform
efficient aligned accesses to the array on both sides of the call.

I believe that Ada can return an ARRAY_TYPE.

But if in testing we find that fails for some reason, we can wrap the array in
a struct.


r~


Re: Recent Go patch broke Solaris bootstrap

2013-11-14 Thread Ian Lance Taylor
On Fri, Nov 8, 2013 at 4:01 AM, Rainer Orth  
wrote:
> The recent Go patch (couldn't find the submission on gcc-patches) broke
> Solaris bootstrap: on Solaris 10/x86 I get
>
> /vol/gcc/src/hg/trunk/local/libgo/go/net/fd_select.go:90:30: error: use of 
> undefined type 'pollServer'
>  func (p *pollster) WaitFD(s *pollServer, nsec int64) (fd int, mode int, err 
> error) {
>   ^
> /vol/gcc/src/hg/trunk/local/libgo/go/net/fd_select.go:113:5: error: reference 
> to field 'Unlock' in object which has no fields or methods
> s.Unlock()
>  ^
> /vol/gcc/src/hg/trunk/local/libgo/go/net/fd_select.go:115:5: error: reference 
> to field 'Lock' in object which has no fields or methods
> s.Lock()

Network polling is different and more efficient in the updated
library.  This patch adds the select-based polling that Solaris has to
use to the new system.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu both using the select polling and using the
default epoll polling.  Committed to mainline.

Ian
diff -r 0ba9f772ad41 libgo/Makefile.am
--- a/libgo/Makefile.am	Thu Nov 14 12:03:31 2013 -0800
+++ b/libgo/Makefile.am	Thu Nov 14 12:09:04 2013 -0800
@@ -413,12 +413,12 @@
 endif
 
 if LIBGO_IS_LINUX
-runtime_netpoll_files = netpoll.c runtime/netpoll_epoll.c
+runtime_netpoll_files = runtime/netpoll_epoll.c
 else
-if LIBGO_IS_DARWIN
-runtime_netpoll_files = netpoll.c runtime/netpoll_kqueue.c
+if LIBGO_IS_SOLARIS
+runtime_netpoll_files = runtime/netpoll_select.c
 else
-runtime_netpoll_files = runtime/netpoll_stub.c
+runtime_netpoll_files = runtime/netpoll_kqueue.c
 endif
 endif
 
@@ -515,6 +515,7 @@
 	malloc.c \
 	map.c \
 	mprof.c \
+	netpoll.c \
 	reflect.c \
 	runtime1.c \
 	sema.c \
@@ -670,26 +671,6 @@
 	go/mime/type.go \
 	go/mime/type_unix.go
 
-if LIBGO_IS_RTEMS
-go_net_fd_os_file = go/net/fd_select.go
-go_net_newpollserver_file = go/net/newpollserver_rtems.go
-else # !LIBGO_IS_RTEMS
-if LIBGO_IS_LINUX
-go_net_fd_os_file =
-go_net_newpollserver_file =
-else # !LIBGO_IS_LINUX && !LIBGO_IS_RTEMS
-if LIBGO_IS_NETBSD
-go_net_fd_os_file =
-go_net_newpollserver_file =
-else # !LIBGO_IS_NETBSD && !LIBGO_IS_LINUX && !LIBGO_IS_RTEMS
-# By default use select with pipes.  Most systems should have
-# something better.
-go_net_fd_os_file = go/net/fd_select.go
-go_net_newpollserver_file =
-endif # !LIBGO_IS_NETBSD
-endif # !LIBGO_IS_LINUX
-endif # !LIBGO_IS_RTEMS
-
 if LIBGO_IS_LINUX
 go_net_cgo_file = go/net/cgo_linux.go
 go_net_sock_file = go/net/sock_linux.go
@@ -787,10 +768,8 @@
 	go/net/dnsclient_unix.go \
 	go/net/dnsconfig_unix.go \
 	go/net/dnsmsg.go \
-	$(go_net_newpollserver_file) \
 	go/net/fd_mutex.go \
 	go/net/fd_unix.go \
-	$(go_net_fd_os_file) \
 	go/net/file_unix.go \
 	go/net/hosts.go \
 	go/net/interface.go \
diff -r 0ba9f772ad41 libgo/runtime/malloc.h
--- a/libgo/runtime/malloc.h	Thu Nov 14 12:03:31 2013 -0800
+++ b/libgo/runtime/malloc.h	Thu Nov 14 12:09:04 2013 -0800
@@ -515,3 +515,4 @@
 
 void	runtime_proc_scan(void (*)(Obj));
 void	runtime_time_scan(void (*)(Obj));
+void	runtime_netpoll_scan(void (*)(Obj));
diff -r 0ba9f772ad41 libgo/runtime/mgc0.c
--- a/libgo/runtime/mgc0.c	Thu Nov 14 12:03:31 2013 -0800
+++ b/libgo/runtime/mgc0.c	Thu Nov 14 12:09:04 2013 -0800
@@ -1491,6 +1491,7 @@
 	runtime_proc_scan(addroot);
 	runtime_MProf_Mark(addroot);
 	runtime_time_scan(addroot);
+	runtime_netpoll_scan(addroot);
 
 	// MSpan.types
 	allspans = runtime_mheap.allspans;
diff -r 0ba9f772ad41 libgo/runtime/netpoll_epoll.c
--- a/libgo/runtime/netpoll_epoll.c	Thu Nov 14 12:03:31 2013 -0800
+++ b/libgo/runtime/netpoll_epoll.c	Thu Nov 14 12:09:04 2013 -0800
@@ -11,6 +11,7 @@
 
 #include "runtime.h"
 #include "defs.h"
+#include "malloc.h"
 
 #ifndef EPOLLRDHUP
 #define EPOLLRDHUP 0x2000
@@ -156,3 +157,9 @@
 		goto retry;
 	return gp;
 }
+
+void
+runtime_netpoll_scan(void (*addroot)(Obj))
+{
+	USED(addroot);
+}
diff -r 0ba9f772ad41 libgo/runtime/netpoll_kqueue.c
--- a/libgo/runtime/netpoll_kqueue.c	Thu Nov 14 12:03:31 2013 -0800
+++ b/libgo/runtime/netpoll_kqueue.c	Thu Nov 14 12:09:04 2013 -0800
@@ -5,8 +5,8 @@
 // +build darwin dragonfly freebsd netbsd openbsd
 
 #include "runtime.h"
-#include "defs_GOOS_GOARCH.h"
-#include "os_GOOS.h"
+#include "defs.h"
+#include "malloc.h"
 
 // Integrated network poller (kqueue-based implementation).
 
@@ -102,3 +102,9 @@
 		goto retry;
 	return gp;
 }
+
+void
+runtime_netpoll_scan(void (*addroot)(Obj))
+{
+	USED(addroot);
+}
diff -r 0ba9f772ad41 libgo/runtime/netpoll_select.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/runtime/netpoll_select.c	Thu Nov 14 12:09:04 2013 -0800
@@ -0,0 +1,223 @@
+// Copyright 2013 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build solaris
+
+#include "config.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef HAVE_SYS_SELECT_H
+#include 
+#endif
+
+#include "runtime.h"
+#include "m

Re: [gomp4, WIP] Elementals improvements

2013-11-14 Thread Jakub Jelinek
On Fri, Nov 15, 2013 at 05:48:27AM +1000, Richard Henderson wrote:
> Pointers are certainly a decent fallback that would always be compatible,
> but I wonder if we need go that far.
> 
> Each target will have a (set of) natural simdlen to which it vectorizes.  This
> is the set returned by autovectorize_vector_sizes.  That means we've got
> registers of those sizes, and probably parameter passing of those sizes will 
> be
> efficient.  It's easy to split input parameters into multiples, as you've 
> done;
> no reason this can't apply generically.

The problem is that if a target doesn't support target attribute (all
targets except x86_64/i686/powerpc* right now), what do you do if command
line options when compiling the #pragma omp declare simd definition don't
include target options needed for use of the supposedly vector registers the
ABI wants to pass the arguments or return value in?  What simd clones we
emit should not depend on the compiler ISA options, unless those are ABI
incompatible.
I admit I'm not very familiar with vector support on targets not listed
above (just fuzzy memories from sparc, where it seems from a quick test
that the vector arguments are passed in normal floating point registers,
right?  Thus it wouldn't need pointer fallback).  Absolutely no idea about
ARM, I'm always lost in the tons of ABI changing (and some non-ABI changing)
options there, aarch64 (does it always have vector registers?), mips, what
else has vectorization?  I guess it doesn't make sense to emit simd clones
on targets that don't support vectorization at all.

> It's the return value wider than the register size that's tricky.  Here I 
> think
> we may be best off returning a struct/array and letting the base calling
> convention handle it.  Normally that _will_ be via a pointer, but sometimes
> that pointer will be in some special non-parameter register.  Thus I think
> we're best off not performing the hidden argument conversion manually.

Shall it be array of vectors, array of the scalar types, struct with such
arrays?  I mean, do targets handle returning ARRAY_TYPE at all (does any FE
produce those)?

> We could generically use log2(vector_byte_size) + 'a' as the abi letter.
> 
> I'll look at the patches themselves later.

Jakub


libgo patch committed: Fix flag when allocating cgo memory

2013-11-14 Thread Ian Lance Taylor
This patch to libgo fixes a flag when allocating memory from cgo.  I
misunderstood the meanings of the flags.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 2544e5a0261f libgo/runtime/malloc.goc
--- a/libgo/runtime/malloc.goc	Thu Nov 14 10:14:48 2013 -0800
+++ b/libgo/runtime/malloc.goc	Thu Nov 14 12:03:00 2013 -0800
@@ -75,7 +75,7 @@
 		runtime_exitsyscall();
 		m = runtime_m();
 		incallback = true;
-		flag |= FlagNoGC;
+		flag |= FlagNoInvokeGC;
 	}
 
 	if(runtime_gcwaiting() && g != m->g0 && m->locks == 0 && !(flag & FlagNoInvokeGC)) {


Re: [gomp4, WIP] Elementals improvements

2013-11-14 Thread Richard Henderson
On 11/14/2013 02:30 AM, Jakub Jelinek wrote:
> As discussed earlier, if we strictly follow the Intel ABI for simds,
> we run into various issues.  The clones then have to use __regcall calling
> convention which e.g. mandates that on x86_64 up to 16 vector arguments
> are passed in xmm/ymm registers (problem, because the dynamic linker
> during lazy binding can clobber ymm8 through ymm15), requires up to 16
> vector values returned in xmm/ymm registers (for e.g.
> #pragma omp declare simd simdlen(16)
> _Complex double foo (double);
> ) - we don't have infrastructure for that plus we'd need to teach backend(s)
> about that new calling convention, and declares {x,y}mm4-7 for 32-bit
> and {x,y}mm8-15 for 64-bit to be call saved (on 64-bit again there is a
> problem with that because the dynamic linker may clobber that, plus
> it is an issue for bt/up in the debugger (we don't save/restore those in
> unwind info and how big vectors would we save; note, elementals aren't
> allowed to throw or setjmp/longjmp (the standard doesn't mention
> setcontext/swapcontext etc. though)).

Sadly, the last time I reviewed Intel's document, I only looked at the mangling
itself, and ignored the calling convention addition.

I agree with you that the __regcall convention is broken as written.
I think we should ignore it until it gets fixed.

> So, shall we just use different ISA letters to make it clear we are ABI
> incompatible with ICC?

Yes, that is also prudent.

> I wonder if the generic representation
> just shouldn't be ISA 'a', which would pass all non-uniform/non-linear
> arguments as pointers to array of simdlen elements, and ditto for return
> value through first hidden argument.  For x86_64/i?86, because (at least on
> a tiny benchmark I've tried) the pointer arguments variant is somewhat
> slower, we would use ISA 'b', 'c', 'd' for SSE2/AVX/AVX2 (shall we do
> anything for AVX512-F too?) if simdlen is in between 2 and 16, otherwise
> we'd use 'a' and arrays too.

Pointers are certainly a decent fallback that would always be compatible,
but I wonder if we need go that far.

Each target will have a (set of) natural simdlen to which it vectorizes.  This
is the set returned by autovectorize_vector_sizes.  That means we've got
registers of those sizes, and probably parameter passing of those sizes will be
efficient.  It's easy to split input parameters into multiples, as you've done;
no reason this can't apply generically.

It's the return value wider than the register size that's tricky.  Here I think
we may be best off returning a struct/array and letting the base calling
convention handle it.  Normally that _will_ be via a pointer, but sometimes
that pointer will be in some special non-parameter register.  Thus I think
we're best off not performing the hidden argument conversion manually.

We could generically use log2(vector_byte_size) + 'a' as the abi letter.

I'll look at the patches themselves later.


r~


[PATCH] Avoid HOST cflags polluting BUILD cflags

2013-11-14 Thread Henderson, Stuart
Hi,
I noticed a Canadian cross failure in 4.8 which was down to BUILD_CXXFLAGS 
being set to ALL_FLAGS even when build != host.  Obviously this has only become 
apparent with 4.8.

Thanks,
Stu


2013-11-14  Stuart Henderson 

* configure (BUILD_CXXFLAGS): Set appropriately when build != host.
* configure.ac (BUILD_CXXFLAGS): Likewise

diff --git a/gcc/configure b/gcc/configure
index fbdcd89..a2791a3 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -11704,6 +11704,7 @@ STMP_FIXINC=stmp-fixinc
 if test x$build != x$host || test "x$coverage_flags" != x
 then
 BUILD_CFLAGS='$(INTERNAL_CFLAGS) $(T_CFLAGS) $(CFLAGS_FOR_BUILD)'
+BUILD_CXXFLAGS='$(INTERNAL_CFLAGS) $(T_CFLAGS) $(CXXFLAGS_FOR_BUILD)'
 BUILD_LDFLAGS='$(LDFLAGS_FOR_BUILD)'
 fi

diff --git a/gcc/configure.ac b/gcc/configure.ac
index 773cb5d..5d7d18b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1887,6 +1887,7 @@ STMP_FIXINC=stmp-fixinc   AC_SUBST(STMP_FIXINC)
 if test x$build != x$host || test "x$coverage_flags" != x
 then
 BUILD_CFLAGS='$(INTERNAL_CFLAGS) $(T_CFLAGS) $(CFLAGS_FOR_BUILD)'
+BUILD_CXXFLAGS='$(INTERNAL_CFLAGS) $(T_CFLAGS) $(CXXFLAGS_FOR_BUILD)'
 BUILD_LDFLAGS='$(LDFLAGS_FOR_BUILD)'
 fi



upstream.patch
Description: upstream.patch


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-14 Thread Sergey Ostanevich
this is only for the whole file? I mean to have a particular loop
vectorized in a
file while all others - up to compiler's cost model. is there such a machinery?

Sergos

On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener  wrote:
> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
>
>> I will get some tests.
>> As for cost analysis - simply consider the pragma as a request to
>> vectorize. How can I - as a developer - enforce it beyond the pragma?
>
> You can disable the cost model via -fvect-cost-model=unlimited
>
> Richard.
>
>> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener  wrote:
>> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
>> >
>> >> The reason patch was in its original state is because we want
>> >> to notify user that his assumption of profitability may be wrong.
>> >> This is not a part of any spec and as far as I know ICC does not
>> >> notify user about the case. Still it can be a good hint for those
>> >> users who tries to get as much as possible performance.
>> >>
>> >> Richard's comment on the vectorization problems is about the same -
>> >> to inform user that his attempt to force vectorization is failed.
>> >>
>> >> As for profitable or not - sometimes I believe it's impossible to be
>> >> precise. For OMP we have case of a vector version of a function
>> >> and we have no chance to figure out whether it is profitable to use
>> >> it or to loose it. If we can't map the loop for any vector length
>> >> other than 1 - I believe in this case we have to bail out and report.
>> >> Is it about 'never profitable'?
>> >
>> > For example.  I think we should report non-vectorized loops
>> > that are marked with force_vect anyway, with -Wdisabled-optimization.
>> > Another case is that a loop may be profitable to vectorize if
>> > the ISA supports a gather instruction but otherwise not.  Or if the
>> > ISA supports efficient vector construction from N not loop
>> > invariant scalars (for vectorization of strided loads).
>> >
>> > Simply disregarding all of the cost analysis sounds completely
>> > bogus to me.
>> >
>> > I'd simply go for the diagnostic for now, not changing anything else.
>> > We want to have a good understanding about why the cost model is
>> > so bad that we have to force to ignore it for #pragma simd - thus we
>> > want testcases.
>> >
>> > Richard.
>> >
>> >>
>> >> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener  wrote:
>> >> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
>> >> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich wrote:
>> >> >>> ivdep just substitutes all cross-iteration data analysis,
>> >> >>> nothing related to cost model. ICC does not cancel its
>> >> >>> cost model in case of #pragma ivdep
>> >> >>>
>> >> >>> as for the safelen - OMP standart treats it as a limitation
>> >> >>> for the vector length. this means if no safelen is present
>> >> >>> an arbitrary vector length can be used.
>> >> >>
>> >> >> I was talking about GCC loop->safelen, which is INT_MAX for #pragma 
>> >> >> omp simd
>> >> >> without safelen clause or #pragma simd without vectorlength clause.
>> >> >>
>> >> >>> so I believe loop->force_vect is the only trigger to disregard
>> >> >>> the cost model
>> >> >>
>> >> >> Anyway, in that case I think the originally posted patch is wrong,
>> >> >> if we want to treat force_vect as disregard all the cost model and
>> >> >> force vectorization (well, the name of the field already kind of 
>> >> >> suggest
>> >> >> that), then IMHO we should treat it the same as 
>> >> >> -fvect-cost-model=unlimited
>> >> >> for those loops.
>> >> >
>> >> > Err - the user may have a specific sub-architecture in mind when using
>> >> > #pragma simd, if you say we should completely ignore the cost model
>> >> > then should we also sorry () if we cannot vectorize the loop (either
>> >> > because of GCC deficiencies or lack of sub-target support)?
>> >> >
>> >> > That said, at least in the cases that the cost model says the loop
>> >> > is never profitable to vectorize we should follow its advice.
>> >> >
>> >> > Richard.
>> >> >
>> >> >> Thus (untested):
>> >> >>
>> >> >> 2013-11-12  Jakub Jelinek  
>> >> >>
>> >> >>   * tree-vect-loop.c (vect_estimate_min_profitable_iters): Use
>> >> >>   unlimited cost model also for force_vect loops.
>> >> >>
>> >> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.0 +0100
>> >> >> +++ gcc/tree-vect-loop.c  2013-11-12 15:11:43.821404330 +0100
>> >> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
>> >> >>void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
>> >> >>
>> >> >>/* Cost model disabled.  */
>> >> >> -  if (unlimited_cost_model ())
>> >> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP 
>> >> >> (loop_vinfo)->force_vect)
>> >> >>  {
>> >> >>dump_printf_loc (MSG_NOTE, vect_location, "cost model 
>> >> >> disabled.\n");
>> >> >>*ret_min_profitable_niters = 0;
>> >> >>
>> >> >>   Jakub
>> >> >>
>> >> >
>> >>
>> >>
>> >
>> > --
>>

Re: [PATCH] Do not set flag_complex_method to 2 for C++ by default.

2013-11-14 Thread Xinliang David Li
On Thu, Nov 14, 2013 at 10:17 AM, Andrew Pinski  wrote:
> On Thu, Nov 14, 2013 at 8:25 AM, Xinliang David Li  wrote:
>> Can we revisit the decision for this? Here are the reasons:
>>
>> 1) It seems that the motivation to make C++ consistent with c99 is to
>> avoid confusing users who build the C source with both C and C++
>> compilers. Why should C++'s default behavior be tuned for this niche
>> case?
>
> It is not a niche case.  It is confusing for people who write C++ code
> to rewrite their code to C99

Compared with people who just work on C++ or C but do not worry about
rewrite nor cross language comparison?

>and find that C is much slower because of
> correctness?  I think they have this backwards here.  C++ should be
> consistent with C here.

Correctness by what definition?

>
>> 2) It is very confusing for users who see huge performance difference
>> between compiler generated code for Complex multiplication vs manually
>> expanded code
>
> I don't see why this is an issue if they understand how complex
> multiplication works for correctness.  I am sorry but correctness over
> speed is a good argument of why this should stay this way.
>
>> 3) The default setting can also block potential vectorization
>> opportunities for complex operations
>
> Yes so again this is about a correctness issue over a speed issue.
>
>> 4) GCC is about the only compiler which has this default -- very few
>> user knows about GCC's strict default, and will think GCC performs
>> poorly.
>
>
> Correctness over speed is better.  I am sorry GCC is the only one
> which gets it correct here.  If people don't like there is a flag to
> disable it.

You can say the same thing that people who find C is slower can use
the flag to disable it.

thanks,

David

>
> Thanks,
> Andrew Pinski
>
>>
>> thanks,
>>
>> David
>>
>>
>> On Wed, Nov 13, 2013 at 9:07 PM, Andrew Pinski  wrote:
>>> On Wed, Nov 13, 2013 at 5:26 PM, Cong Hou  wrote:
 This patch is for PR58963.

 In the patch http://gcc.gnu.org/ml/gcc-patches/2005-02/msg00560.html,
 the builtin function is used to perform complex multiplication and
 division. This is to comply with C99 standard, but I am wondering if
 C++ also needs this.

 There is no complex keyword in C++, and no content in C++ standard
 about the behavior of operations on complex types. The 
 header file is all written in source code, including complex
 multiplication and division. GCC should not do too much for them by
 using builtin calls by default (although we can set -fcx-limited-range
 to prevent GCC doing this), which has a big impact on performance
 (there may exist vectorization opportunities).

 In this patch flag_complex_method will not be set to 2 for C++.
 Bootstraped and tested on an x86-64 machine.
>>>
>>> I think you need to look into this issue deeper as the original patch
>>> only enabled it for C99:
>>> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01483.html .
>>>
>>> Just a little deeper will find
>>> http://gcc.gnu.org/ml/gcc/2007-07/msg00124.html which says yes C++
>>> needs this.
>>>
>>> Thanks,
>>> Andrew Pinski
>>>


 thanks,
 Cong


 Index: gcc/c-family/c-opts.c
 ===
 --- gcc/c-family/c-opts.c (revision 204712)
 +++ gcc/c-family/c-opts.c (working copy)
 @@ -198,8 +198,10 @@ c_common_init_options_struct (struct gcc
opts->x_warn_write_strings = c_dialect_cxx ();
opts->x_flag_warn_unused_result = true;

 -  /* By default, C99-like requirements for complex multiply and divide.  
 */
 -  opts->x_flag_complex_method = 2;
 +  /* By default, C99-like requirements for complex multiply and divide.
 + But for C++ this should not be required.  */
 +  if (c_language != clk_cxx && c_language != clk_objcxx)
 +opts->x_flag_complex_method = 2;
  }

  /* Common initialization before calling option handlers.  */
 Index: gcc/c-family/ChangeLog
 ===
 --- gcc/c-family/ChangeLog (revision 204712)
 +++ gcc/c-family/ChangeLog (working copy)
 @@ -1,3 +1,8 @@
 +2013-11-13  Cong Hou  
 +
 + * c-opts.c (c_common_init_options_struct): Don't let C++ comply with
 + C99-like requirements for complex multiply and divide.
 +
  2013-11-12  Joseph Myers  

   * c-common.c (c_common_reswords): Add _Thread_local.


Committed: arc.md: Remove extra alignment in doloop_begin_i

2013-11-14 Thread Joern Rennecke

That extra alignment causes some branches to go out of range.
2013-11-14  Joern Rennecke  

* config/arc/arc.md (doloop_begin_i): Remove extra alignment;
use (.&-4) idiom.

Index: config/arc/arc.md
===
--- config/arc/arc.md   (revision 204806)
+++ config/arc/arc.md   (working copy)
@@ -4789,8 +4789,7 @@ (define_insn "doloop_begin_i"
 {
   /* ??? Can do better for when a scratch register
 is known.  But that would require extra testing.  */
-  arc_clear_unalign ();
-  return ".p2align 2\;push_s r0\;add r0,pcl,%4-.+2\;sr r0,[2]; 
LP_START\;add r0,pcl,.L__GCC__LP%1-.+2\;sr r0,[3]; LP_END\;pop_s r0";
+  return "push_s r0\;add r0,pcl,%4-(.&-4)\;sr r0,[2]; LP_START\;add 
r0,pcl,.L__GCC__LP%1-(.&-4)\;sr r0,[3]; LP_END\;pop_s r0";
 }
   /* Check if the loop end is in range to be set by the lp instruction.  */
   size = INTVAL (operands[3]) < 2 ? 0 : 2048;


Re: [PATCH] Do not set flag_complex_method to 2 for C++ by default.

2013-11-14 Thread Cong Hou
See the following code:


#include 
using std::complex;

template
complex<_Tp>&
mult_assign (complex<_Tp>& __y, const complex<_Up>& __z)
{
  _Up& _M_real = __y.real();
  _Up& _M_imag = __y.imag();
  const _Tp __r = _M_real * __z.real() - _M_imag * __z.imag();
  _M_imag = _M_real * __z.imag() + _M_imag * __z.real();
  _M_real = __r;
  return __y;
}

void foo (complex& c1, complex& c2)
{ c1 *= c2; }

void bar (complex& c1, complex& c2)
{ mult_assign(c1, c2); }


The function mult_assign is written almost by copying the
implementation of operator *= from . They have exactly the
same behavior from the view of the source code. However, the compiled
results of foo() and bar() are different: foo() is using builtin
function for multiplication but bar() is not. Just because of a name
change the final behavior is changed? This should not be how a
compiler is working.


thanks,
Cong


On Thu, Nov 14, 2013 at 10:17 AM, Andrew Pinski  wrote:
> On Thu, Nov 14, 2013 at 8:25 AM, Xinliang David Li  wrote:
>> Can we revisit the decision for this? Here are the reasons:
>>
>> 1) It seems that the motivation to make C++ consistent with c99 is to
>> avoid confusing users who build the C source with both C and C++
>> compilers. Why should C++'s default behavior be tuned for this niche
>> case?
>
> It is not a niche case.  It is confusing for people who write C++ code
> to rewrite their code to C99 and find that C is much slower because of
> correctness?  I think they have this backwards here.  C++ should be
> consistent with C here.
>
>> 2) It is very confusing for users who see huge performance difference
>> between compiler generated code for Complex multiplication vs manually
>> expanded code
>
> I don't see why this is an issue if they understand how complex
> multiplication works for correctness.  I am sorry but correctness over
> speed is a good argument of why this should stay this way.
>
>> 3) The default setting can also block potential vectorization
>> opportunities for complex operations
>
> Yes so again this is about a correctness issue over a speed issue.
>
>> 4) GCC is about the only compiler which has this default -- very few
>> user knows about GCC's strict default, and will think GCC performs
>> poorly.
>
>
> Correctness over speed is better.  I am sorry GCC is the only one
> which gets it correct here.  If people don't like there is a flag to
> disable it.
>
> Thanks,
> Andrew Pinski
>
>>
>> thanks,
>>
>> David
>>
>>
>> On Wed, Nov 13, 2013 at 9:07 PM, Andrew Pinski  wrote:
>>> On Wed, Nov 13, 2013 at 5:26 PM, Cong Hou  wrote:
 This patch is for PR58963.

 In the patch http://gcc.gnu.org/ml/gcc-patches/2005-02/msg00560.html,
 the builtin function is used to perform complex multiplication and
 division. This is to comply with C99 standard, but I am wondering if
 C++ also needs this.

 There is no complex keyword in C++, and no content in C++ standard
 about the behavior of operations on complex types. The 
 header file is all written in source code, including complex
 multiplication and division. GCC should not do too much for them by
 using builtin calls by default (although we can set -fcx-limited-range
 to prevent GCC doing this), which has a big impact on performance
 (there may exist vectorization opportunities).

 In this patch flag_complex_method will not be set to 2 for C++.
 Bootstraped and tested on an x86-64 machine.
>>>
>>> I think you need to look into this issue deeper as the original patch
>>> only enabled it for C99:
>>> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01483.html .
>>>
>>> Just a little deeper will find
>>> http://gcc.gnu.org/ml/gcc/2007-07/msg00124.html which says yes C++
>>> needs this.
>>>
>>> Thanks,
>>> Andrew Pinski
>>>


 thanks,
 Cong


 Index: gcc/c-family/c-opts.c
 ===
 --- gcc/c-family/c-opts.c (revision 204712)
 +++ gcc/c-family/c-opts.c (working copy)
 @@ -198,8 +198,10 @@ c_common_init_options_struct (struct gcc
opts->x_warn_write_strings = c_dialect_cxx ();
opts->x_flag_warn_unused_result = true;

 -  /* By default, C99-like requirements for complex multiply and divide.  
 */
 -  opts->x_flag_complex_method = 2;
 +  /* By default, C99-like requirements for complex multiply and divide.
 + But for C++ this should not be required.  */
 +  if (c_language != clk_cxx && c_language != clk_objcxx)
 +opts->x_flag_complex_method = 2;
  }

  /* Common initialization before calling option handlers.  */
 Index: gcc/c-family/ChangeLog
 ===
 --- gcc/c-family/ChangeLog (revision 204712)
 +++ gcc/c-family/ChangeLog (working copy)
 @@ -1,3 +1,8 @@
 +2013-11-13  Cong Hou  
 +
 + * c-opts.c (c_common_init_options_struct): Don't let C++ comply with
 + 

Re: [C++ Patch/RFC PR c++/57887 (and dups)

2013-11-14 Thread Jason Merrill

On 11/14/2013 11:37 AM, Paolo Carlini wrote:

+   ? TREE_TYPE (CLASSTYPE_TEMPLATE_INFO (DECL_CONTEXT (decl)))


Use CLASSTYPE_TI_TEMPLATE instead.  OK with that change.

Jason



[committed] Unbreak asan on ppc (PR sanitizer/59122)

2013-11-14 Thread Jakub Jelinek
Hi!

I've committed following fix to unbreak asan on -fsection-anchors targets.
If anyone has better ideas how to represent for RTL artificial label
we emit from final.c at the beginning of current function, I'd appreciate
it.

2013-11-14  Jakub Jelinek  

PR sanitizer/59122
* asan.c (asan_emit_stack_protection): Ensure -fsection-anchors
isn't confused by the artificial decl.

--- gcc/asan.c  (revision 204800)
+++ gcc/asan.c  (working copy)
@@ -1002,6 +1002,9 @@ asan_emit_stack_protection (rtx base, HO
   TREE_STATIC (decl) = 1;
   TREE_PUBLIC (decl) = 0;
   TREE_USED (decl) = 1;
+  DECL_INITIAL (decl) = decl;
+  TREE_ASM_WRITTEN (decl) = 1;
+  TREE_ASM_WRITTEN (id) = 1;
   emit_move_insn (mem, expand_normal (build_fold_addr_expr (decl)));
   shadow_base = expand_binop (Pmode, lshr_optab, base,
  GEN_INT (ASAN_SHADOW_SHIFT),

Jakub


Re: [PATCH, PR 10474] Take two on splitting live-ranges of function arguments to help shrink-wrapping

2013-11-14 Thread Martin Jambor
Hi,

On Thu, Nov 14, 2013 at 12:18:24PM +, Matthew Leach wrote:
> Martin Jambor  writes:
> 
> > Hi,
> 
> Hi Martin,
> 
> [...]
> 
> >
> > 2013-11-04  Martin Jambor  
> >
> > PR rtl-optimization/10474
> > * ira.c (interesting_dest_for_shprep): New function.
> > (split_live_ranges_for_shrink_wrap): Likewise.
> > (find_moveable_pseudos): Move calculation of dominance info,
> > df_analysios and the final anlyses to...
> > (ira): ...here, call split_live_ranges_for_shrink_wrap.
> >
> > testsuite/
> > * gcc.dg/pr10474.c: New testcase.
> > * gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
> > * gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
> 
> This patch seems to breaks stage-3 bootstrap for
> armv7l-unknown-linux-gnueabihf. genmddeps and genhooks seem to be in
> an infinite loop (I've left them running for approx 1h 30m now).
> 
> My configure flags are:
> 
> ../gcc/configure --with-cpu=cortex-a15 --with-tune=cortex-a15 --disable-nla 
> --enable-shared --with-float=hard --with-fpu=neon-vfpv4
> 

I hope all the issues are just different manisfestations of PR 59099.
I hope to have fixed that by the patch below but I've just only
started bootstrapping it (and I'd like to do that on multiple
platforms before proposing it).  But of course everybody can test if
it helps with their issues (and does not create new ones), I'd be
grateful if you do.

The problem in that PR was with reload (or perhaps I should say LRA)
getting confused by information in array ira_reg_equiv which was not
updated by my patch.  Rather than updating it, I have decided to move
the transformation before its computation.

The reason my previous attempts t this failed was that I use
ira_create_new_reg which resizes that array too and so I needed to
move its allocation upwards as well (what a stupid thing to debug for
a couple of hours, sigh).

Thanks and sorry again, but it is a tough area for me,

Martin


2013-11-14  Martin Jambor  

* ira.c (find_moveable_pseudos): Put back various analyses from ira()
here.
(ira): Move init_reg_equiv and call to
split_live_ranges_for_shrink_wrap up, remove analyses around call
to find_moveable_pseudos.

diff --git a/gcc/ira.c b/gcc/ira.c
index 2ef69cb..a171761 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -4515,6 +4515,9 @@ find_moveable_pseudos (void)
   pseudo_replaced_reg.release ();
   pseudo_replaced_reg.safe_grow_cleared (max_regs);
 
+  df_analyze ();
+  calculate_dominance_info (CDI_DOMINATORS);
+
   i = 0;
   bitmap_initialize (&live, 0);
   bitmap_initialize (&used, 0);
@@ -4827,6 +4830,14 @@ find_moveable_pseudos (void)
   free (bb_moveable_reg_sets);
 
   last_moveable_pseudo = max_reg_num ();
+
+  fix_reg_equiv_init ();
+  expand_reg_info ();
+  regstat_free_n_sets_and_refs ();
+  regstat_free_ri ();
+  regstat_init_n_sets_and_refs ();
+  regstat_compute_ri ();
+  free_dominance_info (CDI_DOMINATORS);
 }
 
 
@@ -5187,7 +5198,19 @@ ira (FILE *f)
 #endif
   df_analyze ();
 
+  init_reg_equiv ();
+  if (ira_conflicts_p)
+{
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  if (split_live_ranges_for_shrink_wrap ())
+   df_analyze ();
+
+  free_dominance_info (CDI_DOMINATORS);
+}
+
   df_clear_flags (DF_NO_INSN_RESCAN);
+
   regstat_init_n_sets_and_refs ();
   regstat_compute_ri ();
 
@@ -5205,7 +5228,6 @@ ira (FILE *f)
   if (resize_reg_info () && flag_ira_loop_pressure)
 ira_set_pseudo_classes (true, ira_dump_file);
 
-  init_reg_equiv ();
   rebuild_p = update_equiv_regs ();
   setup_reg_equiv ();
   setup_reg_equiv_init ();
@@ -5228,22 +5250,7 @@ ira (FILE *f)
  allocation because of -O0 usage or because the function is too
  big.  */
   if (ira_conflicts_p)
-{
-  df_analyze ();
-  calculate_dominance_info (CDI_DOMINATORS);
-
-  find_moveable_pseudos ();
-  if (split_live_ranges_for_shrink_wrap ())
-   df_analyze ();
-
-  fix_reg_equiv_init ();
-  expand_reg_info ();
-  regstat_free_n_sets_and_refs ();
-  regstat_free_ri ();
-  regstat_init_n_sets_and_refs ();
-  regstat_compute_ri ();
-  free_dominance_info (CDI_DOMINATORS);
-}
+find_moveable_pseudos ();
 
   max_regno_before_ira = max_reg_num ();
   ira_setup_eliminable_regset (true);


Re: [PATCH] Do not set flag_complex_method to 2 for C++ by default.

2013-11-14 Thread Andrew Pinski
On Thu, Nov 14, 2013 at 8:25 AM, Xinliang David Li  wrote:
> Can we revisit the decision for this? Here are the reasons:
>
> 1) It seems that the motivation to make C++ consistent with c99 is to
> avoid confusing users who build the C source with both C and C++
> compilers. Why should C++'s default behavior be tuned for this niche
> case?

It is not a niche case.  It is confusing for people who write C++ code
to rewrite their code to C99 and find that C is much slower because of
correctness?  I think they have this backwards here.  C++ should be
consistent with C here.

> 2) It is very confusing for users who see huge performance difference
> between compiler generated code for Complex multiplication vs manually
> expanded code

I don't see why this is an issue if they understand how complex
multiplication works for correctness.  I am sorry but correctness over
speed is a good argument of why this should stay this way.

> 3) The default setting can also block potential vectorization
> opportunities for complex operations

Yes so again this is about a correctness issue over a speed issue.

> 4) GCC is about the only compiler which has this default -- very few
> user knows about GCC's strict default, and will think GCC performs
> poorly.


Correctness over speed is better.  I am sorry GCC is the only one
which gets it correct here.  If people don't like there is a flag to
disable it.

Thanks,
Andrew Pinski

>
> thanks,
>
> David
>
>
> On Wed, Nov 13, 2013 at 9:07 PM, Andrew Pinski  wrote:
>> On Wed, Nov 13, 2013 at 5:26 PM, Cong Hou  wrote:
>>> This patch is for PR58963.
>>>
>>> In the patch http://gcc.gnu.org/ml/gcc-patches/2005-02/msg00560.html,
>>> the builtin function is used to perform complex multiplication and
>>> division. This is to comply with C99 standard, but I am wondering if
>>> C++ also needs this.
>>>
>>> There is no complex keyword in C++, and no content in C++ standard
>>> about the behavior of operations on complex types. The 
>>> header file is all written in source code, including complex
>>> multiplication and division. GCC should not do too much for them by
>>> using builtin calls by default (although we can set -fcx-limited-range
>>> to prevent GCC doing this), which has a big impact on performance
>>> (there may exist vectorization opportunities).
>>>
>>> In this patch flag_complex_method will not be set to 2 for C++.
>>> Bootstraped and tested on an x86-64 machine.
>>
>> I think you need to look into this issue deeper as the original patch
>> only enabled it for C99:
>> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01483.html .
>>
>> Just a little deeper will find
>> http://gcc.gnu.org/ml/gcc/2007-07/msg00124.html which says yes C++
>> needs this.
>>
>> Thanks,
>> Andrew Pinski
>>
>>>
>>>
>>> thanks,
>>> Cong
>>>
>>>
>>> Index: gcc/c-family/c-opts.c
>>> ===
>>> --- gcc/c-family/c-opts.c (revision 204712)
>>> +++ gcc/c-family/c-opts.c (working copy)
>>> @@ -198,8 +198,10 @@ c_common_init_options_struct (struct gcc
>>>opts->x_warn_write_strings = c_dialect_cxx ();
>>>opts->x_flag_warn_unused_result = true;
>>>
>>> -  /* By default, C99-like requirements for complex multiply and divide.  */
>>> -  opts->x_flag_complex_method = 2;
>>> +  /* By default, C99-like requirements for complex multiply and divide.
>>> + But for C++ this should not be required.  */
>>> +  if (c_language != clk_cxx && c_language != clk_objcxx)
>>> +opts->x_flag_complex_method = 2;
>>>  }
>>>
>>>  /* Common initialization before calling option handlers.  */
>>> Index: gcc/c-family/ChangeLog
>>> ===
>>> --- gcc/c-family/ChangeLog (revision 204712)
>>> +++ gcc/c-family/ChangeLog (working copy)
>>> @@ -1,3 +1,8 @@
>>> +2013-11-13  Cong Hou  
>>> +
>>> + * c-opts.c (c_common_init_options_struct): Don't let C++ comply with
>>> + C99-like requirements for complex multiply and divide.
>>> +
>>>  2013-11-12  Joseph Myers  
>>>
>>>   * c-common.c (c_common_reswords): Add _Thread_local.


libgo patch committed: Fill in list of gccgo architectures

2013-11-14 Thread Ian Lance Taylor
This patch expands the recent one from Dave Cheney to add all the known
gccgo architectures.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 5d4a0b7216b6 libgo/go/go/build/syslist.go
--- a/libgo/go/go/build/syslist.go	Thu Nov 14 10:07:00 2013 -0800
+++ b/libgo/go/go/build/syslist.go	Thu Nov 14 10:11:53 2013 -0800
@@ -5,4 +5,4 @@
 package build
 
 const goosList = "darwin dragonfly freebsd linux netbsd openbsd plan9 windows solaris "
-const goarchList = "386 amd64 arm "
+const goarchList = "386 amd64 arm alpha m68k mipso32 mipsn32 mipsn64 mipso64 ppc ppc64 sparc sparc64 "


Re: [PATCH] Merge cgraph_get_create_node and cgraph_get_create_real_symbol_node

2013-11-14 Thread Uros Bizjak
On Thu, Nov 14, 2013 at 8:51 AM, Uros Bizjak  wrote:
> On Wed, Nov 13, 2013 at 6:18 PM, Uros Bizjak  wrote:
>
>>> as discussed with Honza on many occasions, all users of
>>> cgraph_get_create_node really want cgraph_get_create_real_symbol_node,
>>> i.e. they are not interested in inline nodes and should get a
>>> standalone node instead.  So this patch changes cgraph_get_create_node
>>> to do what cgraph_get_create_real_symbol_node currently does and
>>> removes the latter altogether.
>>>
>>> I had to change a call to cgraph_get_create_node to cgraph_get_node in
>>> lto-streamer-in.c so that it does not make the node it operates on a
>>> clone of another one because this made ipa_pta_execute abort on assert
>>> after calling cgraph_get_body (visionary points to Richi for putting
>>> the assert there).
>>>
>>> The patch successfully passed bootstrap and testing ("all" languages +
>>> Ada) and LTO-bootstrap (C and C++ only) on x86_64-linux.
>>>
>>> 2013-11-12  Martin Jambor  
>>>
>>> * cgraph.c (cgraph_get_create_node): Do what
>>> cgraph_get_create_real_symbol_node used to do.
>>> (cgraph_get_create_real_symbol_node): Removed.  Changed all users to
>>> call cgraph_get_create_node.
>>> * cgraph.h (cgraph_get_create_real_symbol_node): Removed.
>>> * lto-streamer-in.c (input_function): Call cgraph_get_node instead of
>>> cgraph_get_create_node.  Assert we get a node.
>>
>> This patch breaks lto-profiledbootstrap on x86_64-pc-linux-gnu with:
>>
>> In function ‘colorize_start’:
>> lto1: internal compiler error: in input_function, at lto-streamer-in.c:919
>> 0xa585c1 input_function
>> /home/uros/gcc-svn/trunk/gcc/lto-streamer-in.c:919
>> 0xa585c1 lto_read_body
>> /home/uros/gcc-svn/trunk/gcc/lto-streamer-in.c:1067
>> 0xa585c1 lto_input_function_body(lto_file_decl_data*, cgraph_node*, char 
>> const*)
>> /home/uros/gcc-svn/trunk/gcc/lto-streamer-in.c:1109
>> 0x66eb2c cgraph_get_body(cgraph_node*)
>> /home/uros/gcc-svn/trunk/gcc/cgraph.c:2967
>> 0x999339 ipa_merge_profiles(cgraph_node*, cgraph_node*)
>> /home/uros/gcc-svn/trunk/gcc/ipa-utils.c:699
>> 0x5979a6 lto_cgraph_replace_node
>> /home/uros/gcc-svn/trunk/gcc/lto/lto-symtab.c:82
>> 0x598079 lto_symtab_merge_symbols_1
>> /home/uros/gcc-svn/trunk/gcc/lto/lto-symtab.c:561
>> 0x598079 lto_symtab_merge_symbols()
>> /home/uros/gcc-svn/trunk/gcc/lto/lto-symtab.c:589
>> 0x586fad read_cgraph_and_symbols
>> /home/uros/gcc-svn/trunk/gcc/lto/lto.c:2945
>> 0x586fad lto_main()
>> /home/uros/gcc-svn/trunk/gcc/lto/lto.c:3254
>>
>> You will need patches from Teresa [1],[2] to get up to there in the
>> lto-profiledbootstrap.
>
> These patches are now in mainline, the failure is confirmed by HJ's
> buildboot at http://gcc.gnu.org/ml/gcc-regression/2013-11/msg00350.html

I was able to finish LTO profiledbootstrap with a partial revert of:

* lto-streamer-in.c (input_function): Call cgraph_get_node instead of
  cgraph_get_create_node.  Assert we get a node.

Index: lto-streamer-in.c
===
--- lto-streamer-in.c   (revision 204792)
+++ lto-streamer-in.c   (working copy)
@@ -916,8 +916,7 @@ input_function (tree fn_decl, struct data_in *data

   gimple_register_cfg_hooks ();

-  node = cgraph_get_node (fn_decl);
-  gcc_checking_assert (node);
+  node = cgraph_get_create_node (fn_decl);
   input_struct_function_base (fn, data_in, ib);
   input_cfg (ib_cfg, fn, node->count_materialization_scale);

But it looks that this blind revert introduced a couple of failures in
the testsuite:

FAIL: gcc.dg/torture/pr43879_1.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error)
FAIL: gcc.dg/torture/pr43879_1.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
UNRESOLVED: gcc.dg/torture/pr43879_1.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  compilation failed to produce executable
FAIL: gcc.dg/torture/pr47426-1.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error)
FAIL: gcc.dg/torture/pr47426-1.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
UNRESOLVED: gcc.dg/torture/pr47426-1.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  compilation failed to produce executable

FAIL: g++.dg/torture/pr43879-1_1.C  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error)
FAIL: g++.dg/torture/pr43879-1_1.C  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
UNRESOLVED: g++.dg/torture/pr43879-1_1.C  -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  compilation failed to
produce executable
FAIL: g++.dg/torture/pr49394.C  -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
FAIL: g++.dg/torture/pr49394.C  -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: g++.dg/torture/pr49394.C  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-

Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-11-14 Thread Aldy Hernandez



Well, if you don't change anything in omp-low.c, then it wouldn't diagnose
setjmp call in #pragma simd, but given that also the OpenMP 4.0 spec
requires that #pragma omp simd doesn't contain calls to setjmp or longjmp
(ditto for #pragma omp declare simd functions), then scan_omp_1_stmt
should be changed to also call check_omp_nesting_restrictions for
setjmp/longjmp calls (the GIMPLE_CALL case then in
check_omp_nesting_restrictions can't assume all calls it sees are
BUILT_IN_NORMAL).


Well, as you can see from my latest patch I left the check for setjmp() 
in c_validate_cilk_plus_loop, so it's still working as advertised.  But 
I can certainly extend the functionality to OMP simd as you suggest.



Aldy


[rx] extend mode-dependent address offsets

2013-11-14 Thread DJ Delorie

All RX opcodes which take a dsp:8 also take a dsp:16 so we can relax
this offset check.  Committed.

* config/rx/rx.c (rx_mode_dependent_address_p): Allow offsets up
to 16 bits.

Index: config/rx/rx.c
===
--- config/rx/rx.c  (revision 204792)
+++ config/rx/rx.c  (working copy)
@@ -341,15 +341,15 @@ rx_mode_dependent_address_p (const_rtx a
case REG:
  /* REG+REG only works in SImode.  */
  return true;
 
case CONST_INT:
  /* REG+INT is only mode independent if INT is a
-multiple of 4, positive and will fit into 8-bits.  */
+multiple of 4, positive and will fit into 16-bits.  */
  if (((INTVAL (addr) & 3) == 0)
- && IN_RANGE (INTVAL (addr), 4, 252))
+ && IN_RANGE (INTVAL (addr), 4, 0xfffc))
return false;
  return true;
 
case SYMBOL_REF:
case LABEL_REF:
  return true;


Re: libsanitizer merge from upstream r191666

2013-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2013 at 07:08:17PM +0100, Jakub Jelinek wrote:
> On Thu, Nov 14, 2013 at 09:56:36PM +0400, Konstantin Serebryany wrote:
> > I thought about alignment but did not reflect it anywhere in the
> > interface/comments.
> > The alignment should be min(4096, N), which is enough for most purposes.
> 
> You mean max(4096, N), right?

Oops, of course min.

> And, what exactly is N?  1 << (class_id + 6)?

But this is valid.

Jakub


Re: libsanitizer merge from upstream r191666

2013-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2013 at 09:56:36PM +0400, Konstantin Serebryany wrote:
> I thought about alignment but did not reflect it anywhere in the
> interface/comments.
> The alignment should be min(4096, N), which is enough for most purposes.

You mean max(4096, N), right?  And, what exactly is N?  1 << (class_id + 6)?

> But let me double-check (and add a CHECK) tomorrow.

Jakub


libgo patch committed: Fix list of supported os's and arch's

2013-11-14 Thread Ian Lance Taylor
This patch from Dave Cheney fixes the list of supported operating
systems and architectures in libgo.  syslist.go used to be a generated
file in the master Go library, but it was changed a while back to a
fixed list.  This patch makes the same change to libgo.  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
mainline.

Ian

diff -r f1d01dcd0442 libgo/Makefile.am
--- a/libgo/Makefile.am	Thu Nov 14 10:06:17 2013 -0800
+++ b/libgo/Makefile.am	Thu Nov 14 10:06:37 2013 -0800
@@ -1300,7 +1300,7 @@
 	go/go/build/build.go \
 	go/go/build/doc.go \
 	go/go/build/read.go \
-	syslist.go
+	go/go/build/syslist.go
 go_go_doc_files = \
 	go/go/doc/comment.go \
 	go/go/doc/doc.go \
@@ -2777,15 +2777,6 @@
 	@$(CHECK)
 .PHONY: go/build/check
 
-syslist.go: s-syslist; @true
-s-syslist: Makefile
-	echo '// Generated automatically by make.' >syslist.go.tmp
-	echo 'package build' >>syslist.go.tmp
-	echo 'const goosList = "$(GOOS)"' >>syslist.go.tmp
-	echo 'const goarchList = "$(GOARCH)"' >>syslist.go.tmp
-	$(SHELL) $(srcdir)/../move-if-change syslist.go.tmp syslist.go
-	$(STAMP) $@
-
 @go_include@ go/doc.lo.dep
 go/doc.lo.dep: $(go_go_doc_files)
 	$(BUILDDEPS)
diff -r f1d01dcd0442 libgo/go/go/build/syslist.go
--- /dev/null	Thu Jan 01 00:00:00 1970 +
+++ b/libgo/go/go/build/syslist.go	Thu Nov 14 10:06:37 2013 -0800
@@ -0,0 +1,8 @@
+// Copyright 2011 The Go Authors.  All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package build
+
+const goosList = "darwin dragonfly freebsd linux netbsd openbsd plan9 windows solaris "
+const goarchList = "386 amd64 arm "


[PATCH] Generalize thread_through_normal_block

2013-11-14 Thread Jeff Law


As part of the generalized FSA optimization work we need the ability to 
start a jump threading path with a joiner, then later in the path have a 
normal jump threading block (ie, has side effects and thus requires 
duplication).


thread_through_normal_block needs one tweak to make that possible. 
Namely it should only push the EDGE_START_JUMP_THREAD marker if the jump 
threading path is currently empty.


Right now we don't call thread_through_normal_block after we've 
processed a joiner block, but we will soon :-)  So at least today, this 
patch has no impact on the code we generate.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
Installed on the trunk.




* tree-ssa-threadedge.c (thread_through_normal_block): Only push
the EDGE_START_JUMP_THREAD marker if the jump threading path is
empty.

diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index c9b2c69..cabfc82 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -940,12 +940,18 @@ thread_through_normal_block (edge e,
  || bitmap_bit_p (visited, dest->index))
return false;
 
-  jump_thread_edge *x
-   = new jump_thread_edge (e, EDGE_START_JUMP_THREAD);
- path->safe_push (x);
- *backedge_seen_p |= ((e->flags & EDGE_DFS_BACK) != 0);
+ /* Only push the EDGE_START_JUMP_THREAD marker if this is
+first edge on the path.  */
+ if (path->length () == 0)
+   {
+  jump_thread_edge *x
+   = new jump_thread_edge (e, EDGE_START_JUMP_THREAD);
+ path->safe_push (x);
+ *backedge_seen_p |= ((e->flags & EDGE_DFS_BACK) != 0);
+   }
 
- x = new jump_thread_edge (taken_edge, EDGE_COPY_SRC_BLOCK);
+ jump_thread_edge *x
+   = new jump_thread_edge (taken_edge, EDGE_COPY_SRC_BLOCK);
  path->safe_push (x);
  *backedge_seen_p |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
 
@@ -953,7 +959,7 @@ thread_through_normal_block (edge e,
 secondary effects of threading without having to re-run DOM or
 VRP.  */
  if (!*backedge_seen_p
-  || ! cond_arg_set_in_bb (taken_edge, e->dest))
+ || ! cond_arg_set_in_bb (taken_edge, e->dest))
{
  /* We don't want to thread back to a block we have already
 visited.  This may be overly conservative.  */


Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2013 at 10:39:02AM -0700, Aldy Hernandez wrote:
> On 11/14/13 10:05, Jakub Jelinek wrote:
> 
> [Balaji, see below for question.]
> 
> >On Thu, Nov 14, 2013 at 09:49:41AM -0700, Aldy Hernandez wrote:
> >>+case OMP_PARALLEL:
> >>+case OMP_TASK:
> >>+case OMP_FOR:
> >>+case OMP_SIMD:
> >>+case OMP_SECTIONS:
> >>+case OMP_SINGLE:
> >>+case OMP_SECTION:
> >>+case OMP_MASTER:
> >>+case OMP_ORDERED:
> >>+case OMP_CRITICAL:
> >>+case OMP_ATOMIC:
> >>+case OMP_ATOMIC_READ:
> >>+case OMP_ATOMIC_CAPTURE_OLD:
> >>+case OMP_ATOMIC_CAPTURE_NEW:
> >
> >This is only a subset of OpenMP statements.
> >You are missing OMP_DISTRIBUTE, OMP_TARGET, OMP_TARGET_DATA, OMP_TEAMS,
> >OMP_TARGET_UPDATE, OMP_TASKGROUP.
> >Also, CALL_EXPRs to
> >   case BUILT_IN_GOMP_BARRIER:
> >   case BUILT_IN_GOMP_CANCEL:
> >   case BUILT_IN_GOMP_CANCELLATION_POINT:
> >   case BUILT_IN_GOMP_TASKYIELD:
> >   case BUILT_IN_GOMP_TASKWAIT:
> >are OpenMP statements.  For OpenMP we diagnose this later on, in
> >check_omp_nesting_restrictions in omp-low.c, wouldn't it be better to
> >handle it there too?
> 
> Woah, indeed.  I removed all of this section in favor of the error
> in check_omp_nesting_restriction, and adjusted the testcase error
> accordingly.

Well, if you don't change anything in omp-low.c, then it wouldn't diagnose
setjmp call in #pragma simd, but given that also the OpenMP 4.0 spec
requires that #pragma omp simd doesn't contain calls to setjmp or longjmp
(ditto for #pragma omp declare simd functions), then scan_omp_1_stmt
should be changed to also call check_omp_nesting_restrictions for
setjmp/longjmp calls (the GIMPLE_CALL case then in
check_omp_nesting_restrictions can't assume all calls it sees are
BUILT_IN_NORMAL).

> >I'm surprised here, does the Cilk+ reduction clause really want to grok
> >OpenMP user defined reductions etc.?
> 
> Hmm, I doubt it.  Balaji, OpenMP user defined reductions are not
> allowed for Cilk Plus, right?
> 
> If so, Jakub what do you suggest, disallowing parsing of user
> defined reductions in cp_parser_omp_clause_reduction() when some
> Cilk Plus flag, or did you have something else in mind?

Perhaps some bool is_cilkplus = false argument to
cp_parser_omp_clause_reduction would work for me (and for C too).

Jakub


Re: libsanitizer merge from upstream r191666

2013-11-14 Thread Konstantin Serebryany
On Thu, Nov 14, 2013 at 7:33 PM, Jakub Jelinek  wrote:
> On Tue, Oct 29, 2013 at 07:05:49AM -0700, Konstantin Serebryany wrote:
>> The calls are emitted by default, but the __asan_stack_malloc call is
>> done under a run-time flag
>> __asan_option_detect_stack_use_after_return.
>> So, to use the stack-use-after-return feature you should simply
>> compile with -fsanitize=address and then at run-time
>> pass ASAN_OPTIONS=detect_stack_use_after_return=1
>> For small stack frame sizes the call to __asan_stack_free is inlined
>> (as a performance optimization, not mandatory).
>
> Ok, here is heavily untested implementation of the use after return
> sanitization.  Tested just by eyeballing assembly generated for some
> testcases and on stack-use-after-return.cc and testing asan.exp (but that
> has no use-after-return tests, right?).

Nice! I haven't tried gcc-asan on heavy things for a while.
I'll try to give it a shot (first w/o this patch, then with it) on
chrome, where stack-use-after-return is known to just work.


>
> What I'm not very happy about is that __asan_stack_malloc_N doesn't have
> align argument, is there at least some guaranteed alignment on what it
> returns that gcc

I thought about alignment but did not reflect it anywhere in the
interface/comments.
The alignment should be min(4096, N), which is enough for most purposes.
But let me double-check (and add a CHECK) tomorrow.

--kcc

> can hardcode (and if the required alignment is bigger, just
> don't call __asan_stack_malloc_N)?  I mean, e.g. on x86_64/i686, right now
> say for -mavx code quite often 256-bit alignment is needed, with -mavx512f
> 512-bit alignment will be needed from time to time.  So if for now
> libasan could guarantee 64-byte alignment of the stack chunks or at least
> 32-byte alignment, I could pass the required alignment to
> asan_emit_stack_protection and force no __asan_stack_malloc_N use if
> the required alignment is smaller than the one guaranteed by libasan.
>
> 2013-11-14  Jakub Jelinek  
>
> * cfgexpand.c (struct stack_vars_data): Add asan_base field.
> (expand_stack_vars): For -fsanitize=address, use (and set initially)
> data->asan_base as base for vars.
> (expand_used_vars): Initialize data.asan_base.  Pass it to
> asan_emit_stack_protection.
> * asan.c (asan_detect_stack_use_after_return): New variable.
> (asan_emit_stack_protection): Add pbase argument.  Implement use
> after return sanitization.
> * asan.h (asan_emit_stack_protection): Adjust prototype.
> (ASAN_STACK_MAGIC_USE_AFTER_RET, ASAN_STACK_RETIRED_MAGIC): Define.
>
> --- gcc/cfgexpand.c.jj  2013-11-14 09:10:08.0 +0100
> +++ gcc/cfgexpand.c 2013-11-14 12:42:25.194538823 +0100
> @@ -879,6 +879,9 @@ struct stack_vars_data
>
>/* Vector of partition representative decls in between the paddings.  */
>vec asan_decl_vec;
> +
> +  /* Base pseudo register for Address Sanitizer protected automatic vars.  */
> +  rtx asan_base;
>  };
>
>  /* A subroutine of expand_used_vars.  Give each partition representative
> @@ -963,6 +966,7 @@ expand_stack_vars (bool (*pred) (size_t)
>alignb = stack_vars[i].alignb;
>if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
> {
> + base = virtual_stack_vars_rtx;
>   if ((flag_sanitize & SANITIZE_ADDRESS) && pred)
> {
>   HOST_WIDE_INT prev_offset = frame_offset;
> @@ -991,10 +995,12 @@ expand_stack_vars (bool (*pred) (size_t)
>   if (repr_decl == NULL_TREE)
> repr_decl = stack_vars[i].decl;
>   data->asan_decl_vec.safe_push (repr_decl);
> + if (data->asan_base == NULL)
> +   data->asan_base = gen_reg_rtx (Pmode);
> + base = data->asan_base;
> }
>   else
> offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
> - base = virtual_stack_vars_rtx;
>   base_align = crtl->max_used_stack_slot_alignment;
> }
>else
> @@ -1768,6 +1774,7 @@ expand_used_vars (void)
>
>data.asan_vec = vNULL;
>data.asan_decl_vec = vNULL;
> +  data.asan_base = NULL_RTX;
>
>/* Reorder decls to be protected by iterating over the variables
>  array multiple times, and allocating out of each phase in turn.  */
> @@ -1800,8 +1807,9 @@ expand_used_vars (void)
>
>   var_end_seq
> = asan_emit_stack_protection (virtual_stack_vars_rtx,
> + data.asan_base,
>   data.asan_vec.address (),
> - data.asan_decl_vec. address (),
> + data.asan_decl_vec.address (),
>   data.asan_vec.length ());
> }
>
> --- gcc/asan.c.jj   2013-11-14 09:10:08.0 +0100
> +++ gcc/asan.c  2013-11-14 16:06:42.203713457 +

Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Richard Biener
Diego Novillo  wrote:
>On Thu, Nov 14, 2013 at 11:13 AM, Andrew MacLeod 
>wrote:
>
>> very possibly, i just haven't gotten to those parts yet. I can change
>the
>> name back to gimple-decl.[ch] or some such thing if you like that
>better.
>
>As much as I hate to paint name sheds: gimple-val.[ch].

Yeah, everything that is a gimple value ...

Richard.

>
>Diego.




Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3))

2013-11-14 Thread David Malcolm
On Thu, 2013-11-14 at 00:13 -0700, Jeff Law wrote:
> On 11/08/13 12:02, David Malcolm wrote:
> >> I wouldn't mind seeing a small example proof of concept posted to help
> >> those who don't see where this is going understand the goal.  I would
> >> recommend against posting another large patch for inclusion at this time.
> > Attached is a proof-of-concept patch which uses the
> > gimple_statement_switch subclass (as a "gimple_switch" typedef).  This
> > is one of the subclasses that the earlier patch added, which has no new
> > fields, but which carries the invariant that, if non-NULL,
> > gimple_code (gs) == GIMPLE_SWITCH.
> [ ... ]
> 
> Thanks.  It's pretty much what I expected.  Obviously for other codes 
> there may be a lot more changes that you have to slog through, but I 
> think this example shows the main concepts.
> 
> Presumably in this new world order, the various gimple statement types 
> will continue to inherit from a base class.  That seems somewhat 
> inevitable and implies a certain amount of downcasting (via whatever 
> means we agree upon).  The worry, in my mind is how pervasive the 
> downcasting will be and how much of it we can get rid of over time.
> 
> I may be wrong, but ISTM some of the downcasting is a result of not 
> providing certain capabilities via (pure?) virtual methods.  For 
> example, expand_gimple_stmt_1 seems ripe for implementing as virtual 
> methods.   ISTM you could also have virtuals to build the statements, 
> dump/pretty-print them, verify them, branch/edge redirection, 
> estimations for inlining, etc.  ISTM that would eliminate a good chunk 
> of the downcasting.

FWIW, I prefer the downcasts to adding virtual functions; what I've
tried to do is create a very direct mapping from the status quo,
introducing inheritance to gain the benefits listed earlier in the
thread, whilst only changing "surface syntax".

It seems to me that we're considering the general problem of type-safe
code dispatch: given a type hierarchy, and various sites that want to
vary behavior based on what types they see, how best to invoke the
appropriate code, ensuring that the code that's called "knows" that its
dealing with the appropriate subclass i.e. in a typesafe manner.

There are various idioms for doing this kind of dispatch in C++, a
non-exhaustive list is:

   (a) switches and if/then tests on the GIMPLE_CODE (stmt) - the status
quo, and what my proposed patch continues to do, albeit gaining some
compile-time typechecking using as_a<> for the switch and dyn_cast<> for
the if/then.  This is changing some surface syntax without making major
changes, and gains us compile-time typesafety and IMHO more readable
code, though clearly opinions vary here.   In my (brief) testing, (a)
has no significant effect on compiler performance.

   (b) adding virtual functions to gimple would be another way to handle
type-safe dispatch, but they carry costs:
  (i) they would implicitly add a vtable ptr to the top of every
gimple statement, increasing the memory consumption of the process
  (ii) it's my belief that a virtual function call is more expensive
than the kinds of switch/if+then branching that we're currently doing on
the code - though I don't have measurements to back this up
  (iii) what I call "link-time granularity". A vtable references
every method within it.  I'd love to have a libgimple.so, but to do so,
every vtable would need to be populated with a particular set of
operations at link time - where do we draw the line for "core" gimple
operations, the dispatches performed by every core pass?   The set of
operations will never be complete: some plugin may want to add a new set
of per-gimple-subclass behaviors for some custom gimple pass.

  (c) the "Visitor" design pattern [1] - rather than adding virtual
functions to gimple, instead add them to a visitor class e.g.:

 class gimple_visitor
 {
 public:
/* Visit a statement.  This will dispatch to the appropriate
   handler below, based on GIMPLE_CODE (stmt), encapsulating
   the appropriate downcast within a big switch statement.  */
void visit_stmt (gimple stmt);

 protected:
/* Each gimple code gets its own handler.  This class
   provides an empty implementation of each.  If we want
   to force overrides, we could have an abstract_gimple_visitor
   base class above this one that has all of these be pure
   virtual.  */
virtual void visit_cond (gimple_cond stmt) {}
virtual void visit_switch (gimple_switch stmt) {}
virtual void visit_assign (gimple_assign stmt) {}
virtual void visit_phi (gimple_phi phi) {}
/* etc */
 };

   Example of a subclass:

 class gimple_pretty_printer : public gimple_visitor
 {
 protected:
/* Each of these implements the subclass-specific
   pretty-printing logic.  */
void visit_cond (gimple_cond stmt);
void visit_swit

Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-11-14 Thread Aldy Hernandez

On 11/14/13 09:57, Joseph S. Myers wrote:

Where you have

+   error ("break statement within <#pragma simd> loop body");
+   error ("continue statement within <#pragma simd> loop body");

I think you mean %< and %> (i.e. quotes) not < and >.



Indeed.  Fixed.

Thank you.


Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2013 at 09:49:41AM -0700, Aldy Hernandez wrote:
> +case OMP_PARALLEL:
> +case OMP_TASK:
> +case OMP_FOR:
> +case OMP_SIMD:
> +case OMP_SECTIONS:
> +case OMP_SINGLE:
> +case OMP_SECTION:
> +case OMP_MASTER:
> +case OMP_ORDERED:
> +case OMP_CRITICAL:
> +case OMP_ATOMIC:
> +case OMP_ATOMIC_READ:
> +case OMP_ATOMIC_CAPTURE_OLD:
> +case OMP_ATOMIC_CAPTURE_NEW:

This is only a subset of OpenMP statements.
You are missing OMP_DISTRIBUTE, OMP_TARGET, OMP_TARGET_DATA, OMP_TEAMS,
OMP_TARGET_UPDATE, OMP_TASKGROUP.
Also, CALL_EXPRs to
  case BUILT_IN_GOMP_BARRIER:
  case BUILT_IN_GOMP_CANCEL:
  case BUILT_IN_GOMP_CANCELLATION_POINT:
  case BUILT_IN_GOMP_TASKYIELD:
  case BUILT_IN_GOMP_TASKWAIT:
are OpenMP statements.  For OpenMP we diagnose this later on, in
check_omp_nesting_restrictions in omp-low.c, wouldn't it be better to
handle it there too?

> +  error_at (EXPR_LOCATION (*tp), "OpenMP statements are not allowed "
> + "within loops annotated with #pragma simd");
> +  *valid = false;
> +  *walk_subtrees = 0;
> +  break;
> --- a/gcc/c-family/c-pragma.c
> +++ b/gcc/c-family/c-pragma.c
> @@ -1380,6 +1380,12 @@ init_pragma (void)
> omp_pragmas_simd[i].id, true, true);
>  }
>  
> +  if (flag_enable_cilkplus && !flag_preprocess_only)
> +{
> +  cpp_register_deferred_pragma (parse_in, NULL, "simd", 
> + PRAGMA_CILK_SIMD, true, false);
> +}

Unnecessary {}s (or do you expect further cilk+ pragmas?

> @@ -11543,6 +11553,9 @@ c_parser_omp_for_loop (location_t loc, c_parser 
> *parser, enum tree_code code,
>   case LT_EXPR:
>   case LE_EXPR:
> break;
> + case NE_EXPR:
> +   if (code == CILK_SIMD)
> + break;

Add /* FALLTHRU */ here?

>   default:
> /* Can't be cond = error_mark_node, because we want to preserve
>the location until c_finish_omp_for.  */

> +  else if (c_kind == PRAGMA_CILK_CLAUSE_REDUCTION)
> + /* Use the OMP 4.0 equivalent function.  */
> + clauses = cp_parser_omp_clause_reduction (parser, clauses);

I'm surprised here, does the Cilk+ reduction clause really want to grok
OpenMP user defined reductions etc.?

Jakub


Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-11-14 Thread Joseph S. Myers
Where you have

+   error ("break statement within <#pragma simd> loop body");
+   error ("continue statement within <#pragma simd> loop body");

I think you mean %< and %> (i.e. quotes) not < and >.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch 1/2] add gimplfy-be.[ch] for iterator-aware BE-only gimplification routines.

2013-11-14 Thread Andrew MacLeod

On 11/14/2013 10:41 AM, Richard Biener wrote:

On Thu, Nov 14, 2013 at 4:10 PM, Andrew MacLeod  wrote:

This patch splits out the force_gimple_operand  parts of gimplify.[ch] into
their own file which will prevent the front ends from having to see
iterators, and breaks the annoying dependency cycle between gimple.h,
gimplify.h and gimple-iterator.h.  I suspect more stuff may end up here, but
this is all that is needed for now.

There were also a few gimplification related things still hanging around in
gimple.h so I moved those to gimplify.h as well.

This also allows gimple-iterator.h to finally take ownership of "enum
gsi_iterator_update".

When I originally created gimplify.h, I included gimple.h right from
gimplify.h thinking it would be better for the front end files, but really,
no. It just complicates things, so I flatten gimplify.h as well here...
That's the primary reason for the #include churn.  Now gimple.h is included
where it is needed rather than blanket including it with gimplfy.h.

I also trimmed the #include list in gimplify.c and gimplify-be.c to only
include what is actually required.

Next I will clean up what remains of gimple.h, and flatten it. Then the
gimple refactoring is done for now.

patch 1 is the core changes
patch2 contains the resulting include changes.

Bootstrapped on x86_64-unknown-linux-gnu with no new regressions, and stage
1 built for all targets to confirm those changes.

OK?

Eh, it's not "backend", it's "middle-end" please.  And that should include
gimple_regimplify_operands.


OK, I have to expose rhs_predicate_for() in gimplify.h to do that, but 
its not a big deal.  done.




 GS_ALL_DONE = 1 /* The expression is fully gimplified.  */
   };
+ /* Gimplify hashtable helper.  */
+
+ struct gimplify_hasher : typed_free_remove 
+ {
+   typedef elt_t value_type;

watch out for missing vertical space when cut & pasting (just look over
your own patches).


Just missed that one.   These gimlpify-hasher routines seem to move 
safely into gimplify.c.




Why put this in a header?  That's super-ugly - this all should be
private to gimplifciation.

+ /* Return true if gimplify_one_sizepos doesn't need to gimplify
+expr (when in TYPE_SIZE{,_UNIT} and similar type/decl size/bitsize
+fields).  */
+ static inline bool
+ is_gimple_sizepos (tree expr)

likewise.  And in C++ times it's now plain 'inline', not 'static inline'.
Yeah, except this routine is also used outside of gimplifcation at the 
moment, in tree.c of all places... for variably_modified_type_p().  
Tackling that issue is a little outside the scope of what I'm doing at 
this moment :-P.


So does that mean when "processing" an include file, we generally ought 
to strip out any 'static inline' and just make them 'inline' now?




Oh, I see you moved it from gimple.h - oh well.

Thus, ok with s/gimplify-be.[ch]/gimplify-me.[ch]/ (still ugly name).


yeah... I'm open to alternate suggestions :-)  I eventually hope to be 
rid of the file anyway


Andrew




Re: [PATCH, rs6000] ELFv2 ABI 1/8 - 8/8

2013-11-14 Thread Ulrich Weigand
David Edelsohn wrote:
> 
> > To avoid having a partial ABI implementation in tree, it seems best
> > to commit this whole patch series as a single commit.
> 
> > Is the series OK for mainline?
> 
> The 8 patch series implementing and enabling ELFv2 is okay, including
> the implementation of command line switches.

Thanks for the review, David!

> There appear to be a few new regressions on AIX for
> compat/struct-layout that need to be analyzed and addressed.

I've looked into these a bit more.  I'm seeing new failures:
FAIL: tmpdir-g++.dg-struct-layout-1/t029 cp_compat_x_tst.o-cp_compat_y_tst.o 
execute
FAIL: tmpdir-g++.dg-struct-layout-1/t030 cp_compat_x_tst.o-cp_compat_y_tst.o 
execute

and looking into these, there are a total of 7 sub-tests that fail:
t029: test2852
t029: test2865
t030: test2915
t030: test2961
t030: test2962
t030: test2966
t030: test2984

Now, every single one of them fails not on any ABI test, but already
on the preliminary check that the structure is properly aligned:

in g++.dg/compat/struct-layout-1_x1.h:

void test##n (void) \
{   \
  int i, j; \
  memset (&s##n, '\0', sizeof (s##n));  \
  memset (a##n, '\0', sizeof (a##n));   \
  memset (&info, '\0', sizeof (info));  \
  info.sp = &s##n;  \
  info.a0p = &a##n[0];  \
  info.a3p = &a##n[3];  \
  info.sz = sizeof (s##n);  \
  info.als = __alignof__ (s##n);\
  info.ala0 = __alignof__ (a##n[0]);\
  info.ala3 = __alignof__ (a##n[3]);\
  if (((long) (__SIZE_TYPE__) &a##n[3]) & (info.als - 1))   
\
FAIL (n, 1);\

In all seven cases, info.als is 64, so for some reason the
structure is supposed to be 64-byte aligned, but the actual
location of the struct in memory does not respect that,
so the FAIL is invoked here.

Interestingly, I've seen the same test fail with the 4.6.0
baseline compiler installed on the system.  My assumption
would be that this is simply a pre-existing bug (either
the alignment is computed incorrectly, or it is not being
respected properly throughout the toolchain), and you were
seeing successful runs in the past simply because the structs
just happened to end up at aligned addresses ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [C++ Patch/RFC PR c++/57887 (and dups)

2013-11-14 Thread Paolo Carlini

Hi,

On 11/14/2013 03:41 PM, Jason Merrill wrote:
I don't think we need a new parameter; just pass the FIELD_DECL into 
maybe_end_member_template_processing and adjust it appropriately.


Also, call m_e_m_t_p from cp_parser_late_parsing_nsdmi rather than 
cp_parser_class_specifier_1.

Thanks, much simpler indeed. I'm finishing testing the below.

Paolo.


/cp
2013-11-14  Paolo Carlini  

PR c++/57887
* parser.c (cp_parser_late_parsing_nsdmi): Call
maybe_begin_member_template_processing.
* pt.c (maybe_begin_member_template_processing): Handle NSDMIs.
(inline_needs_template_parms): Adjust.

/testsuite
2013-11-14  Paolo Carlini  

PR c++/57887
* g++.dg/cpp0x/nsdmi-template3.C: New.
* g++.dg/cpp0x/nsdmi-template4.C: Likewise.
Index: cp/parser.c
===
--- cp/parser.c (revision 204780)
+++ cp/parser.c (working copy)
@@ -23378,12 +23378,16 @@ cp_parser_late_parsing_nsdmi (cp_parser *parser, t
 {
   tree def;
 
+  maybe_begin_member_template_processing (field);
+
   push_unparsed_function_queues (parser);
   def = cp_parser_late_parse_one_default_arg (parser, field,
  DECL_INITIAL (field),
  NULL_TREE);
   pop_unparsed_function_queues (parser);
 
+  maybe_end_member_template_processing ();
+
   DECL_INITIAL (field) = def;
 }
 
Index: cp/pt.c
===
--- cp/pt.c (revision 204780)
+++ cp/pt.c (working copy)
@@ -151,7 +151,7 @@ static int for_each_template_parm (tree, tree_fn_t
   struct pointer_set_t*, bool);
 static tree expand_template_argument_pack (tree);
 static tree build_template_parm_index (int, int, int, tree, tree);
-static bool inline_needs_template_parms (tree);
+static bool inline_needs_template_parms (tree, bool);
 static void push_inline_template_parms_recursive (tree, int);
 static tree retrieve_local_specialization (tree);
 static void register_local_specialization (tree, tree);
@@ -377,9 +377,9 @@ template_class_depth (tree type)
Returns true if processing DECL needs us to push template parms.  */
 
 static bool
-inline_needs_template_parms (tree decl)
+inline_needs_template_parms (tree decl, bool nsdmi)
 {
-  if (! DECL_TEMPLATE_INFO (decl))
+  if (!decl || (!nsdmi && ! DECL_TEMPLATE_INFO (decl)))
 return false;
 
   return (TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (most_general_template (decl)))
@@ -448,16 +448,23 @@ push_inline_template_parms_recursive (tree parmlis
 }
 }
 
-/* Restore the template parameter context for a member template or
-   a friend template defined in a class definition.  */
+/* Restore the template parameter context for a member template, a
+   friend template defined in a class definition, or a non-template
+   member of template class.  */
 
 void
 maybe_begin_member_template_processing (tree decl)
 {
   tree parms;
   int levels = 0;
+  bool nsdmi = TREE_CODE (decl) == FIELD_DECL;
 
-  if (inline_needs_template_parms (decl))
+  if (nsdmi)
+decl = (CLASSTYPE_TEMPLATE_INFO (DECL_CONTEXT (decl))
+   ? TREE_TYPE (CLASSTYPE_TEMPLATE_INFO (DECL_CONTEXT (decl)))
+   : NULL_TREE);
+
+  if (inline_needs_template_parms (decl, nsdmi))
 {
   parms = DECL_TEMPLATE_PARMS (most_general_template (decl));
   levels = TMPL_PARMS_DEPTH (parms) - processing_template_decl;
Index: testsuite/g++.dg/cpp0x/nsdmi-template3.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template3.C(revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template3.C(working copy)
@@ -0,0 +1,16 @@
+// PR c++/58760
+// { dg-do compile { target c++11 } }
+
+enum en
+{
+  a,b,c
+};
+ 
+struct B
+{
+  template
+  struct A
+  {
+const int X = N;
+  };
+};
Index: testsuite/g++.dg/cpp0x/nsdmi-template4.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-template4.C(revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-template4.C(working copy)
@@ -0,0 +1,24 @@
+// PR c++/57887
+// { dg-do compile { target c++11 } }
+
+struct B
+{
+  template
+  struct A
+  {
+int X = N;
+  };
+};
+
+template
+struct C
+{
+  int Y = M;
+
+  template
+  struct A
+  {
+int X = N;
+int Y = M;
+  };
+};


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Andrew MacLeod

On 11/14/2013 11:23 AM, Michael Matz wrote:

Hi,

On Thu, 14 Nov 2013, Andrew MacLeod wrote:


I think if following through with the whole plan there would (and
should) be nothing remaining that could be called a gimple expression.

very possibly, i just haven't gotten to those parts yet. I can change
the name back to gimple-decl.[ch] or some such thing if you like that
better.

-object? -operand? -stuff? ;-)  Will all of these splits land at trunk,
i.e. 4.9?  Why the hurry when not even such high-level things are clear?
I mean how can you think about rearchitecting the gimple data structures
without having looked at the current details.  It's clear that not every
detail of the design can be fixated at this point, but basic questions
like "what's the operands?", "will there be expressions?", "how do we
iterate?", "recursive structures or not?" should at least get some answer
before really starting grind work, shouldn't they?

The splits are for header file cleanup and re-structuring into logical 
components.  As I mentioned in the original post,  the file is needed to 
break dependency cycles between gimple.h (the statements) , the 
iterators, and gimplification.  It is for the gimple stuff which doesn't 
need any of those things but is consumed by them.


This really has nothing to do with my future plans, other than the fact 
that I also said whatever is in this file is will eventually be split 
into more things, but I'm not ready to do those splits yet, thus the 
gimple-blah name doesn't matter to me.  gimple-expr seemed convenient at 
the time but clearly you don't like it, and I'll happily call it 
whatever you want.  It's a grab bag of all the gimple values which are 
still trees...


maybe the suggested  gimple-val.[ch] is ok?

Andrew


Re: [PATCH] Do not set flag_complex_method to 2 for C++ by default.

2013-11-14 Thread Xinliang David Li
Can we revisit the decision for this? Here are the reasons:

1) It seems that the motivation to make C++ consistent with c99 is to
avoid confusing users who build the C source with both C and C++
compilers. Why should C++'s default behavior be tuned for this niche
case?
2) It is very confusing for users who see huge performance difference
between compiler generated code for Complex multiplication vs manually
expanded code
3) The default setting can also block potential vectorization
opportunities for complex operations
4) GCC is about the only compiler which has this default -- very few
user knows about GCC's strict default, and will think GCC performs
poorly.

thanks,

David


On Wed, Nov 13, 2013 at 9:07 PM, Andrew Pinski  wrote:
> On Wed, Nov 13, 2013 at 5:26 PM, Cong Hou  wrote:
>> This patch is for PR58963.
>>
>> In the patch http://gcc.gnu.org/ml/gcc-patches/2005-02/msg00560.html,
>> the builtin function is used to perform complex multiplication and
>> division. This is to comply with C99 standard, but I am wondering if
>> C++ also needs this.
>>
>> There is no complex keyword in C++, and no content in C++ standard
>> about the behavior of operations on complex types. The 
>> header file is all written in source code, including complex
>> multiplication and division. GCC should not do too much for them by
>> using builtin calls by default (although we can set -fcx-limited-range
>> to prevent GCC doing this), which has a big impact on performance
>> (there may exist vectorization opportunities).
>>
>> In this patch flag_complex_method will not be set to 2 for C++.
>> Bootstraped and tested on an x86-64 machine.
>
> I think you need to look into this issue deeper as the original patch
> only enabled it for C99:
> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01483.html .
>
> Just a little deeper will find
> http://gcc.gnu.org/ml/gcc/2007-07/msg00124.html which says yes C++
> needs this.
>
> Thanks,
> Andrew Pinski
>
>>
>>
>> thanks,
>> Cong
>>
>>
>> Index: gcc/c-family/c-opts.c
>> ===
>> --- gcc/c-family/c-opts.c (revision 204712)
>> +++ gcc/c-family/c-opts.c (working copy)
>> @@ -198,8 +198,10 @@ c_common_init_options_struct (struct gcc
>>opts->x_warn_write_strings = c_dialect_cxx ();
>>opts->x_flag_warn_unused_result = true;
>>
>> -  /* By default, C99-like requirements for complex multiply and divide.  */
>> -  opts->x_flag_complex_method = 2;
>> +  /* By default, C99-like requirements for complex multiply and divide.
>> + But for C++ this should not be required.  */
>> +  if (c_language != clk_cxx && c_language != clk_objcxx)
>> +opts->x_flag_complex_method = 2;
>>  }
>>
>>  /* Common initialization before calling option handlers.  */
>> Index: gcc/c-family/ChangeLog
>> ===
>> --- gcc/c-family/ChangeLog (revision 204712)
>> +++ gcc/c-family/ChangeLog (working copy)
>> @@ -1,3 +1,8 @@
>> +2013-11-13  Cong Hou  
>> +
>> + * c-opts.c (c_common_init_options_struct): Don't let C++ comply with
>> + C99-like requirements for complex multiply and divide.
>> +
>>  2013-11-12  Joseph Myers  
>>
>>   * c-common.c (c_common_reswords): Add _Thread_local.


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Diego Novillo
On Thu, Nov 14, 2013 at 11:13 AM, Andrew MacLeod  wrote:

> very possibly, i just haven't gotten to those parts yet. I can change the
> name back to gimple-decl.[ch] or some such thing if you like that better.

As much as I hate to paint name sheds: gimple-val.[ch].


Diego.


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Michael Matz
Hi,

On Thu, 14 Nov 2013, Andrew MacLeod wrote:

> > I think if following through with the whole plan there would (and 
> > should) be nothing remaining that could be called a gimple expression.
> 
> very possibly, i just haven't gotten to those parts yet. I can change 
> the name back to gimple-decl.[ch] or some such thing if you like that 
> better.

-object? -operand? -stuff? ;-)  Will all of these splits land at trunk, 
i.e. 4.9?  Why the hurry when not even such high-level things are clear?  
I mean how can you think about rearchitecting the gimple data structures 
without having looked at the current details.  It's clear that not every 
detail of the design can be fixated at this point, but basic questions 
like "what's the operands?", "will there be expressions?", "how do we 
iterate?", "recursive structures or not?" should at least get some answer 
before really starting grind work, shouldn't they?


Ciao,
Michael.


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Andrew MacLeod

On 11/14/2013 10:57 AM, Michael Matz wrote:

Hi,

On Thu, 14 Nov 2013, Andrew MacLeod wrote:


That's why I think talking about a gimple expression as if they were
somehow some stand-alone concept is fairly confusing, and introducing it
now as if it would somewhen exist would lead to going down some inferior
design paths.

Well, for gimple expressions I was thinking more about the addressing
expressions we currently leave as trees... MEM stuff... that where most
of the remaining 'expressions' are I guess, so perhaps gimple-addressing
is a better term...

in any case, it refers mostly to the parts of trees which are
tcc_expression and are not subsumed by gimple_statement contructs. So I
use expression for lack of a better term since I don't know what exact
uses there are yet.

Well, I can precisely name you the set of things you mean then, and I
wouldn't call any of them expressions (all of them either represent a
(sub)object or a statement of fact, not a computation (except as part of
how to get at the specified object)):

   CODE_CLASS == tcc_reference
  : COMPONENT_REF, BIT_FIELD_REF, ARRAY_REF, ARRAY_RANGE_REF,
REALPART_EXPR, IMAGPART_EXPR, VIEW_CONVERT_EXPR, INDIRECT_REF,
TARGET_MEM_REF, MEM_REF
   CODE ==
  : CONSTRUCTOR, OBJ_TYPE_REF, ASSERT_EXPR, ADDR_EXPR, WITH_SIZE_EXPR

The rest is the trivial SSA_NAME, tcc_constant and tcc_declaration, what I
called singletons.

Most of the codes above have a shallow one- or two-level structure in what
they operate on, or can be made so by some more lowering.  A few of them
can contain arbitrarily deep recursive structures (but not of all possible
trees), and I think that's the only thing that would remain if the above
would be better melded into serveral new statement types.  That's of
course also the difficult part to get sensibly rid of, because the
recursive structure lends itself to something that terribly looks like
'tree'.

I think if following through with the whole plan there would (and should)
be nothing remaining that could be called a gimple expression.


very possibly, i just haven't gotten to those parts yet. I can change 
the name back to gimple-decl.[ch] or some such thing if you like that 
better.


Andrew



Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Michael Matz
Hi,

On Thu, 14 Nov 2013, Andrew MacLeod wrote:

> > That's why I think talking about a gimple expression as if they were
> > somehow some stand-alone concept is fairly confusing, and introducing it
> > now as if it would somewhen exist would lead to going down some inferior
> > design paths.
> 
> Well, for gimple expressions I was thinking more about the addressing 
> expressions we currently leave as trees... MEM stuff... that where most 
> of the remaining 'expressions' are I guess, so perhaps gimple-addressing 
> is a better term...
> 
> in any case, it refers mostly to the parts of trees which are 
> tcc_expression and are not subsumed by gimple_statement contructs. So I 
> use expression for lack of a better term since I don't know what exact 
> uses there are yet.

Well, I can precisely name you the set of things you mean then, and I 
wouldn't call any of them expressions (all of them either represent a 
(sub)object or a statement of fact, not a computation (except as part of 
how to get at the specified object)):

  CODE_CLASS == tcc_reference
 : COMPONENT_REF, BIT_FIELD_REF, ARRAY_REF, ARRAY_RANGE_REF,
   REALPART_EXPR, IMAGPART_EXPR, VIEW_CONVERT_EXPR, INDIRECT_REF,
   TARGET_MEM_REF, MEM_REF
  CODE ==
 : CONSTRUCTOR, OBJ_TYPE_REF, ASSERT_EXPR, ADDR_EXPR, WITH_SIZE_EXPR

The rest is the trivial SSA_NAME, tcc_constant and tcc_declaration, what I 
called singletons.

Most of the codes above have a shallow one- or two-level structure in what 
they operate on, or can be made so by some more lowering.  A few of them 
can contain arbitrarily deep recursive structures (but not of all possible 
trees), and I think that's the only thing that would remain if the above 
would be better melded into serveral new statement types.  That's of 
course also the difficult part to get sensibly rid of, because the 
recursive structure lends itself to something that terribly looks like 
'tree'.

I think if following through with the whole plan there would (and should) 
be nothing remaining that could be called a gimple expression.


Ciao,
Michael.


Re: [PATCH, rs6000] ELFv2 ABI 1/8 - 8/8

2013-11-14 Thread David Edelsohn
> To avoid having a partial ABI implementation in tree, it seems best
> to commit this whole patch series as a single commit.

> Is the series OK for mainline?

The 8 patch series implementing and enabling ELFv2 is okay, including
the implementation of command line switches.

There appear to be a few new regressions on AIX for
compat/struct-layout that need to be analyzed and addressed.

Thanks, David


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Andrew MacLeod

On 11/14/2013 10:37 AM, Diego Novillo wrote:

On Thu, Nov 14, 2013 at 10:34 AM, Andrew MacLeod  wrote:

On 11/14/2013 10:26 AM, Michael Matz wrote:

Hi,

On Wed, 13 Nov 2013, Andrew MacLeod wrote:


There needs to be a place which has gimple componentry that is not
related to or require a statement.  gimple.h is becoming the home for
just 0gimple statements.  There are 3 (for the moment) major classes of
things that are in statements and are also used by other parts of the
compiler .. Types, Decls, and Expressions.


E.g. there won't ever be something like a gimple arithmetic addition
expression (or I hope there never will be).  It's always a gimple
statement that assigns the addition result somewhere.  Even the
non-singleton objects don't exist in isolation, they're always part of
some action (i.e. statement) to operate on those objects.

That's why I think talking about a gimple expression as if they were
somehow some stand-alone concept is fairly confusing, and introducing it
now as if it would somewhen exist would lead to going down some inferior
design paths.

Well, for gimple expressions I was thinking more about the addressing
expressions we currently leave as trees... MEM stuff... that where most of
the remaining 'expressions' are I guess, so perhaps gimple-addressing is a
better term...

in any case, it refers mostly to the parts of trees which are tcc_expression
and are not subsumed by gimple_statement contructs. So I use expression for
lack of a better term since I don't know what exact uses there are yet.

I think we'll end up with a hierarchy that will have some generic
"value" at its base, with constants, symbols, aggregates, arrays,
memrefs, etc as children. But perhaps we can wait until we have a
better idea of how we want it to look like.


That is pretty much what my prototypes at th cauldron had...   I just 
hadn't fit expressions into it yet... so just left the term. but that 
why it is lumped in with decls and types... things that will have their 
tree replaced.  In any case... thats all to come...


Andrew


Re: [patch 1/2] add gimplfy-be.[ch] for iterator-aware BE-only gimplification routines.

2013-11-14 Thread Richard Biener
On Thu, Nov 14, 2013 at 4:10 PM, Andrew MacLeod  wrote:
> This patch splits out the force_gimple_operand  parts of gimplify.[ch] into
> their own file which will prevent the front ends from having to see
> iterators, and breaks the annoying dependency cycle between gimple.h,
> gimplify.h and gimple-iterator.h.  I suspect more stuff may end up here, but
> this is all that is needed for now.
>
> There were also a few gimplification related things still hanging around in
> gimple.h so I moved those to gimplify.h as well.
>
> This also allows gimple-iterator.h to finally take ownership of "enum
> gsi_iterator_update".
>
> When I originally created gimplify.h, I included gimple.h right from
> gimplify.h thinking it would be better for the front end files, but really,
> no. It just complicates things, so I flatten gimplify.h as well here...
> That's the primary reason for the #include churn.  Now gimple.h is included
> where it is needed rather than blanket including it with gimplfy.h.
>
> I also trimmed the #include list in gimplify.c and gimplify-be.c to only
> include what is actually required.
>
> Next I will clean up what remains of gimple.h, and flatten it. Then the
> gimple refactoring is done for now.
>
> patch 1 is the core changes
> patch2 contains the resulting include changes.
>
> Bootstrapped on x86_64-unknown-linux-gnu with no new regressions, and stage
> 1 built for all targets to confirm those changes.
>
> OK?

Eh, it's not "backend", it's "middle-end" please.  And that should include
gimple_regimplify_operands.

GS_ALL_DONE = 1 /* The expression is fully gimplified.  */
  };
+ /* Gimplify hashtable helper.  */
+
+ struct gimplify_hasher : typed_free_remove 
+ {
+   typedef elt_t value_type;

watch out for missing vertical space when cut & pasting (just look over
your own patches).

Why put this in a header?  That's super-ugly - this all should be
private to gimplifciation.

+ /* Return true if gimplify_one_sizepos doesn't need to gimplify
+expr (when in TYPE_SIZE{,_UNIT} and similar type/decl size/bitsize
+fields).  */
+ static inline bool
+ is_gimple_sizepos (tree expr)

likewise.  And in C++ times it's now plain 'inline', not 'static inline'.

Oh, I see you moved it from gimple.h - oh well.

Thus, ok with s/gimplify-be.[ch]/gimplify-me.[ch]/ (still ugly name).

Thanks,
Richard.


> Andrew
>
> PS Interestingly, many back end files still need gimplify.h, and the vast
> majority of them are actually only looking for 'unshare_expr'... which is
> used by both the front ends and back ends.  (Some front end files are also
> including gimplify.h simply to get that routine.)
>
> It is obviously a very tree specific thing since it duplicates most of a
> tree, but is only required because gimple uses unshared trees for
> expressions and such. Its not really part of the gimplification process, but
> it is utilized by it.  I am wondering if perhaps we ought to split this code
> out as well and make "tree-unshare.[ch]"...


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Diego Novillo
On Thu, Nov 14, 2013 at 10:34 AM, Andrew MacLeod  wrote:
> On 11/14/2013 10:26 AM, Michael Matz wrote:
>>
>> Hi,
>>
>> On Wed, 13 Nov 2013, Andrew MacLeod wrote:
>>
>>> There needs to be a place which has gimple componentry that is not
>>> related to or require a statement.  gimple.h is becoming the home for
>>> just 0gimple statements.  There are 3 (for the moment) major classes of
>>> things that are in statements and are also used by other parts of the
>>> compiler .. Types, Decls, and Expressions.
>>
>>
>> E.g. there won't ever be something like a gimple arithmetic addition
>> expression (or I hope there never will be).  It's always a gimple
>> statement that assigns the addition result somewhere.  Even the
>> non-singleton objects don't exist in isolation, they're always part of
>> some action (i.e. statement) to operate on those objects.
>>
>> That's why I think talking about a gimple expression as if they were
>> somehow some stand-alone concept is fairly confusing, and introducing it
>> now as if it would somewhen exist would lead to going down some inferior
>> design paths.
>
> Well, for gimple expressions I was thinking more about the addressing
> expressions we currently leave as trees... MEM stuff... that where most of
> the remaining 'expressions' are I guess, so perhaps gimple-addressing is a
> better term...
>
> in any case, it refers mostly to the parts of trees which are tcc_expression
> and are not subsumed by gimple_statement contructs. So I use expression for
> lack of a better term since I don't know what exact uses there are yet.

I think we'll end up with a hierarchy that will have some generic
"value" at its base, with constants, symbols, aggregates, arrays,
memrefs, etc as children. But perhaps we can wait until we have a
better idea of how we want it to look like.


Diego.


Re: libsanitizer merge from upstream r191666

2013-11-14 Thread Jakub Jelinek
On Tue, Oct 29, 2013 at 07:05:49AM -0700, Konstantin Serebryany wrote:
> The calls are emitted by default, but the __asan_stack_malloc call is
> done under a run-time flag
> __asan_option_detect_stack_use_after_return.
> So, to use the stack-use-after-return feature you should simply
> compile with -fsanitize=address and then at run-time
> pass ASAN_OPTIONS=detect_stack_use_after_return=1
> For small stack frame sizes the call to __asan_stack_free is inlined
> (as a performance optimization, not mandatory).

Ok, here is heavily untested implementation of the use after return
sanitization.  Tested just by eyeballing assembly generated for some
testcases and on stack-use-after-return.cc and testing asan.exp (but that
has no use-after-return tests, right?).

What I'm not very happy about is that __asan_stack_malloc_N doesn't have
align argument, is there at least some guaranteed alignment on what it
returns that gcc can hardcode (and if the required alignment is bigger, just
don't call __asan_stack_malloc_N)?  I mean, e.g. on x86_64/i686, right now
say for -mavx code quite often 256-bit alignment is needed, with -mavx512f
512-bit alignment will be needed from time to time.  So if for now
libasan could guarantee 64-byte alignment of the stack chunks or at least
32-byte alignment, I could pass the required alignment to
asan_emit_stack_protection and force no __asan_stack_malloc_N use if
the required alignment is smaller than the one guaranteed by libasan.

2013-11-14  Jakub Jelinek  

* cfgexpand.c (struct stack_vars_data): Add asan_base field.
(expand_stack_vars): For -fsanitize=address, use (and set initially)
data->asan_base as base for vars.
(expand_used_vars): Initialize data.asan_base.  Pass it to
asan_emit_stack_protection.
* asan.c (asan_detect_stack_use_after_return): New variable.
(asan_emit_stack_protection): Add pbase argument.  Implement use
after return sanitization.
* asan.h (asan_emit_stack_protection): Adjust prototype.
(ASAN_STACK_MAGIC_USE_AFTER_RET, ASAN_STACK_RETIRED_MAGIC): Define.

--- gcc/cfgexpand.c.jj  2013-11-14 09:10:08.0 +0100
+++ gcc/cfgexpand.c 2013-11-14 12:42:25.194538823 +0100
@@ -879,6 +879,9 @@ struct stack_vars_data
 
   /* Vector of partition representative decls in between the paddings.  */
   vec asan_decl_vec;
+
+  /* Base pseudo register for Address Sanitizer protected automatic vars.  */
+  rtx asan_base;
 };
 
 /* A subroutine of expand_used_vars.  Give each partition representative
@@ -963,6 +966,7 @@ expand_stack_vars (bool (*pred) (size_t)
   alignb = stack_vars[i].alignb;
   if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
{
+ base = virtual_stack_vars_rtx;
  if ((flag_sanitize & SANITIZE_ADDRESS) && pred)
{
  HOST_WIDE_INT prev_offset = frame_offset;
@@ -991,10 +995,12 @@ expand_stack_vars (bool (*pred) (size_t)
  if (repr_decl == NULL_TREE)
repr_decl = stack_vars[i].decl;
  data->asan_decl_vec.safe_push (repr_decl);
+ if (data->asan_base == NULL)
+   data->asan_base = gen_reg_rtx (Pmode);
+ base = data->asan_base;
}
  else
offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
- base = virtual_stack_vars_rtx;
  base_align = crtl->max_used_stack_slot_alignment;
}
   else
@@ -1768,6 +1774,7 @@ expand_used_vars (void)
 
   data.asan_vec = vNULL;
   data.asan_decl_vec = vNULL;
+  data.asan_base = NULL_RTX;
 
   /* Reorder decls to be protected by iterating over the variables
 array multiple times, and allocating out of each phase in turn.  */
@@ -1800,8 +1807,9 @@ expand_used_vars (void)
 
  var_end_seq
= asan_emit_stack_protection (virtual_stack_vars_rtx,
+ data.asan_base,
  data.asan_vec.address (),
- data.asan_decl_vec. address (),
+ data.asan_decl_vec.address (),
  data.asan_vec.length ());
}
 
--- gcc/asan.c.jj   2013-11-14 09:10:08.0 +0100
+++ gcc/asan.c  2013-11-14 16:06:42.203713457 +0100
@@ -226,6 +226,9 @@ alias_set_type asan_shadow_set = -1;
alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Decl for __asan_option_detect_stack_use_after_return.  */
+static GTY(()) tree asan_detect_stack_use_after_return;
+
 /* Hashtable support for memory references used by gimple
statements.  */
 
@@ -939,20 +942,25 @@ asan_function_start (void)
and DECLS is an array of representative decls for each var partition.
LENGTH is the length of the OFFSETS array, DECLS array is LENGTH / 2 - 1
elements long (OFFSETS include gap before the first variabl

Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Andrew MacLeod

On 11/14/2013 10:26 AM, Michael Matz wrote:

Hi,

On Wed, 13 Nov 2013, Andrew MacLeod wrote:


There needs to be a place which has gimple componentry that is not
related to or require a statement.  gimple.h is becoming the home for
just 0gimple statements.  There are 3 (for the moment) major classes of
things that are in statements and are also used by other parts of the
compiler .. Types, Decls, and Expressions.


E.g. there won't ever be something like a gimple arithmetic addition
expression (or I hope there never will be).  It's always a gimple
statement that assigns the addition result somewhere.  Even the
non-singleton objects don't exist in isolation, they're always part of
some action (i.e. statement) to operate on those objects.

That's why I think talking about a gimple expression as if they were
somehow some stand-alone concept is fairly confusing, and introducing it
now as if it would somewhen exist would lead to going down some inferior
design paths.
Well, for gimple expressions I was thinking more about the addressing 
expressions we currently leave as trees... MEM stuff... that where most 
of the remaining 'expressions' are I guess, so perhaps gimple-addressing 
is a better term...


in any case, it refers mostly to the parts of trees which are 
tcc_expression and are not subsumed by gimple_statement contructs. So I 
use expression for lack of a better term since I don't know what exact 
uses there are yet.


Andrew




Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Diego Novillo
On Thu, Nov 14, 2013 at 10:26 AM, Michael Matz  wrote:

> Put another way: what do you envision that gimple expressions would be.
> For example what would you propose we could do with them?

The only expressions I have in mind are memory references and
aggregates, which can get pretty convoluted.

Perhaps we could label them something different than expressions. They
would be in the same taxonomy of "gimple values".  They are operands
to gimple statements, they can have multiple symbol references inside
and they may have a tree like structure.


Diego.


Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2

2013-11-14 Thread Richard Biener
On Thu, Nov 14, 2013 at 11:16 AM, Bernd Edlinger
 wrote:
> Hi,
>
> sorry, for the delay.
> Sandra seems to be even more busy than me...
>
> Attached is a combined patch of the original part 1, and the update,
> in diff -up format.
>
> On Mon, 11 Nov 2013 13:10:45, Richard Biener wrote:
>>
>> On Thu, Oct 31, 2013 at 1:46 AM, Sandra Loosemore
>>  wrote:
>>> On 10/29/2013 02:51 AM, Bernd Edlinger wrote:


 On Mon, 28 Oct 2013 21:29:24, Sandra Loosemore wrote:
>
> On 10/28/2013 03:20 AM, Bernd Edlinger wrote:
>>
>> I have attached an update to your patch, that should
>> a) fix the recursion problem.
>> b) restrict the -fstrict-volatile-bitfields to not violate the C++
>> memory model.
>>>
>>>
>>> Here's a new version of the update patch.
>>>
>>>
> Alternatively, if strict_volatile_bitfield_p returns false but
> flag_strict_volatile_bitfields> 0, then always force to word_mode and
> change the -fstrict-volatile-bitfields documentation to indicate that's
> the fallback if the insertion/extraction cannot be done in the declared
> mode, rather than claiming that it tries to do the same thing as if
> -fstrict-volatile-bitfields were not enabled at all.
>>>
>>>
>>> I decided that this approach was more expedient, after all.
>>>
>>> I've tested this patch (in conjunction with my already-approved but
>>> not-yet-applied patch) on mainline for arm-none-eabi, x86_64-linux-gnu, and
>>> mips-linux gnu. I also backported the entire series to GCC 4.8 and tested
>>> there on arm-none-eabi and x86_64-linux-gnu. OK to apply?
>>
>> Hm, I can't seem to find the context for
>>
>> @@ -923,6 +935,14 @@
>> store_fixed_bit_field (str_rtx, bitsize, bitnum, 0, 0, value);
>> return;
>> }
>> + else if (MEM_P (str_rtx)
>> + && MEM_VOLATILE_P (str_rtx)
>> + && flag_strict_volatile_bitfields> 0)
>> + /* This is a case where -fstrict-volatile-bitfields doesn't apply
>> + because we can't do a single access in the declared mode of the field.
>> + Since the incoming STR_RTX has already been adjusted to that mode,
>> + fall back to word mode for subsequent logic. */
>> + str_rtx = adjust_address (str_rtx, word_mode, 0);
>>
>> /* Under the C++0x memory model, we must not touch bits outside the
>> bit region. Adjust the address to start at the beginning of the
>>
>> and the other similar hunk. I suppose they apply to earlier patches
>> in the series? I suppose the above applies to store_bit_field (diff -p
>> really helps!). Why would using word_mode be any good as
>> fallback? That is, why is "Since the incoming STR_RTX has already
>> been adjusted to that mode" not the thing to fix?
>>
>
> Well, this hunk does not force the access to be in word_mode.
>
> Instead it allows get_best_mode to choose the access to be in any mode from
> QI to word_mode.
>
> It is there to revert the effect of this weird code in expr.c 
> (expand_assigment):
>
>   if (volatilep && flag_strict_volatile_bitfields> 0)
> to_rtx = adjust_address (to_rtx, mode1, 0);
>
> Note that this does not even check if the access is on a bit-field !

Then why not remove that ...

> The problem with the strict_volatile_bitfields is that it is used already
> before the code reaches store_bit_field or extract_bit_field.
>
> It starts in get_inner_reference, (which is not only used in expand_assignment
> and expand_expr_real_1)
>
> Then this,
>
> if (volatilep && flag_strict_volatile_bitfields> 0)
>   op0 = adjust_address (op0, mode1, 0);

... and this ...

> and then this,
>
> /* If the field is volatile, we always want an aligned
>access.  Do this in following two situations:
>1. the access is not already naturally
>aligned, otherwise "normal" (non-bitfield) volatile fields
>become non-addressable.
>2. the bitsize is narrower than the access size. Need
>to extract bitfields from the access.  */
> || (volatilep && flag_strict_volatile_bitfields> 0
> && (bitpos % GET_MODE_ALIGNMENT (mode) != 0
> || (mode1 != BLKmode
> && bitsize < GET_MODE_SIZE (mode1) * BITS_PER_UNIT)))

... or this ...

> As a result, a read access to an unaligned volatile data member does
> not even reach the expand_bit_field if flag_strict_volatile_bitfields <= 0,
> and instead goes through convert_move (target, op0, unsignedp).
>
> I still believe the proposed patch is guaranteed to not change anything if
> -fno-strict-volatile-bitfields is used, and even if we can not guarantee
> that it creates exactly the same code for cases where the 
> strict-volatile-bitfields
> does not apply, it certainly generates valid code, where we had invalid code,
> or ICEs without the patch.
>
> OK for trunk?

Again, most of the patch is ok (and nice), the
store_bit_field/extract_bit_field changes
point to the above issues which we should rather fix than 

Re: [RFC PATCH] add auto_bitmap

2013-11-14 Thread Michael Matz
Hi,

On Thu, 14 Nov 2013, Richard Biener wrote:

> Why not give bitmap_head a constructor/destructor and allow auto use of 
> that.  Isn't that exactly what should get 'auto' handling automagically?

auto != c++98 :-/


Ciao,
Michael.


Re: [patch 3/4] Separate gimple.[ch] and gimplify.[ch] - front end files

2013-11-14 Thread Michael Matz
Hi,

On Wed, 13 Nov 2013, Andrew MacLeod wrote:

> There needs to be a place which has gimple componentry that is not 
> related to or require a statement.  gimple.h is becoming the home for 
> just 0gimple statements.  There are 3 (for the moment) major classes of 
> things that are in statements and are also used by other parts of the 
> compiler .. Types, Decls, and Expressions.

Actually I wouldn't say gimple statements contain expressions.  They 
refer to objects (and unfortunately also sometimes to types directly, 
namely in the call stmt and in the exception statements, though the latter 
don't exist anymore after lowering).  It's those objects which have types.

Currently those objects are trees and hence can be more complex than just 
singletons (which are the decls, constants and ssa names), that is most of 
the SINGLE_RHS objects.  If you want to get rid of trees you somehow need 
to represent those objects in a different way, but they still aren't 
expressions in the common meaning.

E.g. there won't ever be something like a gimple arithmetic addition 
expression (or I hope there never will be).  It's always a gimple 
statement that assigns the addition result somewhere.  Even the 
non-singleton objects don't exist in isolation, they're always part of 
some action (i.e. statement) to operate on those objects.

That's why I think talking about a gimple expression as if they were 
somehow some stand-alone concept is fairly confusing, and introducing it 
now as if it would somewhen exist would lead to going down some inferior 
design paths.

> Its true that gimple-tree would in fact be a more appropriate name at 
> the moment, but these gimple-* files are the core ones I'll be changing 
> first, so the tree part would no longer be meaningful.  the 'expr' part 
> is suppose to represent the abstract purpose...  The stuff required to 
> represent an expression in gimple IL.  And yes, that is currently a tree 
> :-)

Put another way: what do you envision that gimple expressions would be.  
For example what would you propose we could do with them?


Ciao,
Michael.


Re: [RFC] Masked load/store vectorization (take 5)

2013-11-14 Thread Richard Biener
On Thu, 14 Nov 2013, Jakub Jelinek wrote:

> On Wed, Nov 13, 2013 at 01:21:03PM +0100, Jakub Jelinek wrote:
> > Sergey has kindly tested the patch on SPEC2k6, but on 4 tests it revealed
> > an ICE.  Here is an incremental fix for that, for now it just punts on
> > those.  In theory the invariant conditional loads could be handled e.g.
> > using gather, or perhaps V{,P}MOVMSKP{S,D,B} on the mask followed by
> > conditional scalar load of the value if any bits in the mask are set,
> > then broadcasting it.
> 
> And another issue, ICE in predcom.  The problem is that the patch creates
> DR_READ data ref for the MASK_LOAD internal call and !DR_READ data ref
> for the MASK_STORE, which is handled properly in the vectorizer, but
> apparently other data ref consumers aren't prepared for it.
> 
> So, in the spirit of the recommended change to do some dataref processing
> privately in tree-vect-data-refs.c, the first patch (bootstrapped/regtested
> on x86_64-linux and i686-linux) handles MASK_LOAD and MASK_STORE solely in
> the vectorizer and leaves those as failing the data ref analysis otherwise.
> 
> Or, alternatively, we would need to handle those or punt in the other
> consumers.  Untested second patch does the punting in predcom and phiopt,
> but there are others still unverified.  Apparently e.g. predcom isn't
> prepared to handle data refs on any calls at all, which can happen already
> before my patches, and apparently other passes aren't either.
> E.g. I've tried to create a testcase for pcom (which asserts that DR_STMT
> is either gimple assign or PHI):
> 
> struct S { int i; };
> __attribute__((const, noinline, noclone))
> struct S foo (int x)
> {
>   struct S s;
>   s.i = x;
>   return s;
> }
> 
> int a[2048], b[2048], c[2048], d[2048];
> struct S e[2048];
> 
> __attribute__((noinline, noclone)) void
> bar (void)
> {
>   int i;
>   for (i = 0; i < 1024; i++)
> {
>   e[i] = foo (i);
>   a[i+2] = a[i] + a[i+1];
>   b[10] = b[10] + i;
>   c[i] = c[2047 - i];
>   d[i] = d[i + 1];
> }
> }
> 
> int
> main ()
> {
>   int i;
>   bar ();
>   for (i = 0; i < 1024; i++)
> if (e[i].i != i)
>   __builtin_abort ();
>   return 0;
> }
> 
> but to my surprise pcom didn't fail on it (because the data reference
> doesn't have a gimple type and thus isn't suitable_reference_p),
> but e.g. ldist miscompiled it (completely ignored it as if there wasn't
> e[i] = foo (i); and let it out of the loop, so now it fails at runtime).

Ah, yeah - that's because I removed the "rest of the stores" handling
and rely on all stores be requested to distribute.  Fixed by

Index: tree-loop-distribution.c
===
--- tree-loop-distribution.c(revision 204787)
+++ tree-loop-distribution.c(working copy)
@@ -1723,8 +1723,7 @@ tree_loop_distribution (void)
  if (stmt_has_scalar_dependences_outside_loop (loop, stmt))
;
  /* Otherwise only distribute stores for now.  */
- else if (!gimple_assign_single_p (stmt)
-  || is_gimple_reg (gimple_assign_lhs (stmt)))
+ else if (gimple_vdef (stmt))
continue;
 
  work_list.safe_push (stmt);

> Richard, any preferences on whether you'd prefer MASK_LOAD/MASK_STORE
> to be handled as data references only for the vectorization, or also
> other passes?  Even in the former case, we need to investigate the
> non-vectorizer data ref users how they handle calls and fix the ldist
> bug.

Well certainly calls with "known" data references should be handled
in the passes (I remember improving it at some point but never got
around committing it because nobody made use of the improvement).

So if you can generate valid DRs for MASK_LOAD/MASK_STORE then do
so.  Valid as in "conservative correct" - that should be enough
for calls.

Richard.


[PATCH] LRA: check_rtl modifies RTL instruction stream

2013-11-14 Thread Robert Suchanek
Hi, 

This patch follows up the problem outlined here:
http://gcc.gnu.org/ml/gcc/2013-11/msg00152.html

The patch attached prevents adding clobbers when LRA is running
in insn_invalid_p function and lra_in_progress variable is being set
just before check_rtl call. This should stop modification of insns stream
in cases where there is a insn match after adding clobbers. 

The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu (revision 
204787).

Regards,
Robert

2013-11-13  Robert Suchanek  

* lra.c (lra): Set lra_in_progress before check_rtl call.
* recog.c (insn_invalid_p): Add !lra_in_progress to prevent
adding clobber regs when LRA is running

diff --git a/gcc/lra.c b/gcc/lra.c
index 1aea599..83d45b6 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -2238,6 +2238,10 @@ lra (FILE *f)
 
   init_insn_recog_data ();
 
+  /* We can not set up reload_in_progress because it prevents new
+ pseudo creation.  */
+  lra_in_progress = 1;
+
 #ifdef ENABLE_CHECKING
   check_rtl (false);
 #endif
@@ -2248,10 +2252,6 @@ lra (FILE *f)
 
   setup_reg_spill_flag ();
 
-  /* We can not set up reload_in_progress because it prevents new
- pseudo creation.  */
-  lra_in_progress = 1;
-
   /* Function remove_scratches can creates new pseudos for clobbers --
  so set up lra_constraint_new_regno_start before its call to
  permit changing reg classes for pseudos created by this
diff --git a/gcc/recog.c b/gcc/recog.c
index c8594bb..5c0ec16 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -314,7 +314,9 @@ insn_invalid_p (rtx insn, bool in_group)
  clobbers.  */
   int icode = recog (pat, insn,
 (GET_CODE (pat) == SET
- && ! reload_completed && ! reload_in_progress)
+ && ! reload_completed 
+  && ! reload_in_progress
+  && ! lra_in_progress)
 ? &num_clobbers : 0);
   int is_asm = icode < 0 && asm_noperands (PATTERN (insn)) >= 0;




Re: [patch 2/2] add gimplfy-be.[ch] for iterator-aware BE-only gimplification routines. Include changes

2013-11-14 Thread Andrew MacLeod



patch2 contains the resulting include changes.


the remaining changes...
Andrew

	* asan.c: Include only gimplify.h, gimplify-be.h, and/or gimple.h as
	required.
	* cfgloopmanip.c: Likewise.
	* cgraphunit.c: Likewise.
	* cilk-common.c: Likewise.
	* fold-const.c: Likewise.
	* function.c: Likewise.
	* gimple-expr.c: Likewise.
	* gimple-fold.c: Likewise.
	* gimple-ssa-strength-reduction.c: Likewise.
	* gimple.c: Likewise.
	* graphite-clast-to-gimple.c: Likewise.
	* graphite-sese-to-poly.c: Likewise.
	* ipa-prop.c: Likewise.
	* ipa-split.c: Likewise.
	* ipa.c: Likewise.
	* langhooks.c: Likewise.
	* omp-low.c: Likewise.
	* sese.c: Likewise.
	* stor-layout.c: Likewise.
	* targhooks.c: Likewise.
	* trans-mem.c: Likewise.
	* tree-affine.c: Likewise.
	* tree-cfg.c: Likewise.
	* tree-cfgcleanup.c: Likewise.
	* tree-complex.c: Likewise.
	* tree-if-conv.c: Likewise.
	* tree-inline.c: Likewise.
	* tree-loop-distribution.c: Likewise.
	* tree-nested.c: Likewise.
	* tree-parloops.c: Likewise.
	* tree-predcom.c: Likewise.
	* tree-profile.c: Likewise.
	* tree-scalar-evolution.c: Likewise.
	* tree-sra.c: Likewise.
	* tree-ssa-address.c: Likewise.
	* tree-ssa-ccp.c: Likewise.
	* tree-ssa-dce.c: Likewise.
	* tree-ssa-forwprop.c: Likewise.
	* tree-ssa-ifcombine.c: Likewise.
	* tree-ssa-loop-im.c: Likewise.
	* tree-ssa-loop-ivopts.c: Likewise.
	* tree-ssa-loop-manip.c: Likewise.
	* tree-ssa-loop-niter.c: Likewise.
	* tree-ssa-loop-prefetch.c: Likewise.
	* tree-ssa-loop-unswitch.c: Likewise.
	* tree-ssa-math-opts.c: Likewise.
	* tree-ssa-phiopt.c: Likewise.
	* tree-ssa-phiprop.c: Likewise.
	* tree-ssa-pre.c: Likewise.
	* tree-ssa-propagate.c: Likewise.
	* tree-ssa-reassoc.c: Likewise.
	* tree-ssa-sccvn.c: Likewise.
	* tree-ssa-strlen.c: Likewise.
	* tree-ssa.c: Likewise.
	* tree-switch-conversion.c: Likewise.
	* tree-tailcall.c: Likewise.
	* tree-vect-data-refs.c: Likewise.
	* tree-vect-generic.c: Likewise.
	* tree-vect-loop-manip.c: Likewise.
	* tree-vect-loop.c: Likewise.
	* tree-vect-patterns.c: Likewise.
	* tree-vect-stmts.c: Likewise.
	* tree.c: Likewise.
	* tsan.c: Likewise.
	* value-prof.c: Likewise.
	* config/aarch64/aarch64.c: Likewise.
	* config/alpha/alpha.c: Likewise.
	* config/darwin.c: Likewise.
	* config/i386/i386.c: Likewise.
	* config/ia64/ia64.c: Likewise.
	* config/mep/mep.c: Likewise.
	* config/mips/mips.c: Likewise.
	* config/rs6000/rs6000.c: Likewise.
	* config/s390/s390.c: Likewise.
	* config/sh/sh.c: Likewise.
	* config/sparc/sparc.c: Likewise.
	* config/spu/spu.c: Likewise.
	* config/stormy16/stormy16.c: Likewise.
	* config/tilegx/tilegx.c: Likewise.
	* config/tilepro/tilepro.c: Likewise.
	* config/xtensa/xtensa.c: Likewise.

	* c/c-typeck.c: Include only gimplify.h and gimple.h as needed.
	* c-family/c-common.c: Likewise.
	* c-family/c-gimplify.c: Likewise.
	* c-family/cilk.c: Likewise.

	* cp/class.c: Include only gimplify.h and gimple.h as needed.
	* cp/cp-gimplify.c: Likewise.
	* cp/error.c: Likewise.
	* cp/init.c: Likewise.
	* cp/optimize.c: Likewise.
	* cp/pt.c: Likewise.
	* cp/semantics.c: Likewise.
	* cp/tree.c: Likewise.
	* cp/vtable-class-hierarchy.c: Likewise.

	* fortran/trans-expr.c: Include only gimplify.h and gimple.h as needed.
	* fortran/trans-openmp.c: Likewise.

	* go/go-lang.c: Include only gimplify.h and gimple.h as needed.

	* java/java-gimplify.c: Include only gimplify.h and gimple.h as needed.

	* objc/objc-act.c: Include only gimplify.h and gimple.h as needed.

Index: asan.c
===
*** asan.c	(revision 204763)
--- asan.c	(working copy)
*** along with GCC; see the file COPYING3.
*** 23,28 
--- 23,29 
  #include "system.h"
  #include "coretypes.h"
  #include "tree.h"
+ #include "gimple.h"
  #include "gimplify.h"
  #include "gimple-iterator.h"
  #include "tree-iterator.h"
Index: cfgloopmanip.c
===
*** cfgloopmanip.c	(revision 204763)
--- cfgloopmanip.c	(working copy)
*** along with GCC; see the file COPYING3.
*** 25,32 
  #include "basic-block.h"
  #include "cfgloop.h"
  #include "tree.h"
! #include "gimplify.h"
  #include "gimple-iterator.h"
  #include "tree-ssa-loop-manip.h"
  #include "dumpfile.h"
  
--- 25,33 
  #include "basic-block.h"
  #include "cfgloop.h"
  #include "tree.h"
! #include "gimple.h"
  #include "gimple-iterator.h"
+ #include "gimplify-be.h"
  #include "tree-ssa-loop-manip.h"
  #include "dumpfile.h"
  
Index: cgraphunit.c
===
*** cgraphunit.c	(revision 204763)
--- cgraphunit.c	(working copy)
*** along with GCC; see the file COPYING3.
*** 164,171 
--- 164,173 
  #include "tree.h"
  #include "output.h"
  #include "rtl.h"
+ #include "gimple.h"
  #include "gimplify.h"
  #include "gimple-iterator.h"
+ #include "gimplify-be.h"
  #include "gimple-ssa.h"
  #include "tree-cfg.h"
  #include "tree-into-ssa.h"
Index: cilk-commo

  1   2   >