date:20110420

Re: [PATCH] use build_function_type_list in the avr backend

2011-04-20 Thread Denis Chertykov

2011/4/20 Nathan Froyd :
> As $SUBJECT suggests.  Tested with cross to avr-elf.  OK to commit?
>
> -Nathan
>
>        * config/avr/avr.c (avr_init_builtins): Call
>        build_function_type_list instead of build_function_type.
>

Please, commit.

Denis.

Ping: Make 128 bits the default vector size for NEON

2011-04-20 Thread Ira Rosen

http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02172.html

The last version:

ChangeLog:

 * doc/invoke.texi (preferred-vector-size): Document.
 * params.h (PREFERRED_VECTOR_SIZE): Define.
 * config/arm/arm.c (arm_preferred_simd_mode): Use param
 PREFERRED_VECTOR_SIZE instead of
 TARGET_NEON_VECTORIZE_QUAD. Make 128 bits the default.
 (arm_autovectorize_vector_sizes): Likewise.
 * config/arm/arm.opt (NEON_VECTORIZE_QUAD): Add
 RejectNegative.
 * params.def (PARAM_PREFERRED_VECTOR_SIZE): Define.

testsuite/ChangeLog:

 * lib/target-supports.exp (check_effective_target_vect_multiple_sizes):
 New procedure.
 (add_options_for_quad_vectors): Replace with ...
 (add_options_for_double_vectors): ... this.
 * gfortran.dg/vect/pr19049.f90: Expect more printings on targets that
 support multiple vector sizes since the vectorizer attempts to
 vectorize with both vector sizes.
 * gcc.dg/vect/slp-reduc-6.c, gcc.dg/vect/no-vfa-vect-79.c,
 gcc.dg/vect/no-vfa-vect-102a.c, gcc.dg/vect/vect-outer-1a.c,
 gcc.dg/vect/vect-outer-1b.c, gcc.dg/vect/vect-outer-2b.c,
 gcc.dg/vect/vect-outer-3a.c, gcc.dg/vect/no-vfa-vect-37.c,
 gcc.dg/vect/vect-outer-3b.c, gcc.dg/vect/no-vfa-vect-101.c,
 gcc.dg/vect/no-vfa-vect-102.c, gcc.dg/vect/vect-reduc-dot-s8b.c,
 gcc.dg/vect/vect-outer-1.c, gcc.dg/vect/vect-104.c: Likewise.
 * gcc.dg/vect/vect-16.c: Rename to...
 * gcc.dg/vect/no-fast-math-vect-16.c: ... this to ensure that it runs
 without -ffast-math.
 * gcc.dg/vect/vect-42.c: Run with 64 bit vectors if applicable.
 * gcc.dg/vect/vect-multitypes-6.c, gcc.dg/vect/vect-52.c,
 gcc.dg/vect/vect-54.c, gcc.dg/vect/vect-46.c, gcc.dg/vect/vect-48.c,
 gcc.dg/vect/vect-96.c, gcc.dg/vect/vect-multitypes-3.c,
 gcc.dg/vect/vect-40.c: Likewise.
 * gcc.dg/vect/vect-outer-5.c: Remove quad-vectors option as
 redundant.
 * gcc.dg/vect/vect-109.c, gcc.dg/vect/vect-peel-1.c,
 gcc.dg/vect/vect-peel-2.c, gcc.dg/vect/slp-25.c,
 gcc.dg/vect/vect-multitypes-1.c, gcc.dg/vect/slp-3.c,
 gcc.dg/vect/no-vfa-pr29145.c, gcc.dg/vect/vect-multitypes-4.c:
 Likewise.
 * gcc.dg/vect/vect.exp: Run no-fast-math-vect*.c tests with
 -fno-fast-math.

Thanks,
Ira
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 171723)
+++ doc/invoke.texi (working copy)
@@ -8874,6 +8874,10 @@ The maximum number of conditional stores paires th
 if either vectorization (@option{-ftree-vectorize}) or if-conversion
 (@option{-ftree-loop-if-convert}) is disabled.  The default is 2.
 
+@item preferred-vector-size
+Preferred vector size in bits for targets that support multiple vector sizes.
+Invalid values are ignored.  The default is 128.
+
 @end table
 @end table
 
Index: params.h
===
--- params.h(revision 171723)
+++ params.h(working copy)
@@ -204,6 +204,8 @@ extern void init_param_values (int *params);
   PARAM_VALUE (PARAM_PREFETCH_MIN_INSN_TO_MEM_RATIO)
 #define MIN_NONDEBUG_INSN_UID \
   PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID)
+#define PREFERRED_VECTOR_SIZE \
+  PARAM_VALUE (PARAM_PREFERRED_VECTOR_SIZE)
 #define MAX_STORES_TO_SINK \
   PARAM_VALUE (PARAM_MAX_STORES_TO_SINK)
 #endif /* ! GCC_PARAMS_H */
Index: testsuite/lib/target-supports.exp
===
--- testsuite/lib/target-supports.exp   (revision 171723)
+++ testsuite/lib/target-supports.exp   (working copy)
@@ -3203,6 +3203,24 @@ proc check_effective_target_vect_strided_wide { }
 return $et_vect_strided_wide_saved
 }
 
+# Return 1 if the target supports multiple vector sizes
+
+proc check_effective_target_vect_multiple_sizes { } {
+global et_vect_multiple_sizes_saved
+
+if [info exists et_vect_multiple_sizes_saved] {
+verbose "check_effective_target_vect_multiple_sizes: using cached result" 2
+} else {
+set et_vect_multiple_sizes_saved 0
+if { ([istarget arm*-*-*] && [check_effective_target_arm_neon]) } {
+   set et_vect_multiple_sizes_saved 1
+}
+}
+
+verbose "check_effective_target_vect_multiple_sizes: returning $et_vect_multiple_sizes_saved" 2
+return $et_vect_multiple_sizes_saved
+}
+
 # Return 1 if the target supports section-anchors
 
 proc check_effective_target_section_anchors { } {
@@ -3585,9 +3603,9 @@ proc add_options_for_bind_pic_locally { flags } {
 
 # Add to FLAGS the flags needed to enable 128-bit vectors.
 
-proc add_options_for_quad_vectors { flags } {
+proc add_options_for_double_vectors { flags } {
 if [is-effective-target arm_neon_ok] {
-   return "$flags -mvectorize-with-neon-quad"
+   return "$flags --param preferred-vector-size=64"
 }
 
 return $flags
Index: testsuite/gfortran.dg/vect/pr19049.f90
===
--- testsui

Go patch committed: Use mpfr_prec_round to constrain floats

2011-04-20 Thread Ian Lance Taylor

This patch to the Go frontend changes it to use mpfr_prec_round rather
than real_convert to constrain untyped float or complex constants to
their actual type.  This is a simplification and also eliminates a call
from the frontend to a gcc-specific middle-end function.  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
mainline.
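
As a rough analogy only (this is not the Go frontend code and it does not use MPFR), constraining a higher-precision value to the precision of its declared type can be sketched in plain C by rounding a double through float. The helper name constrain_to_float32 is made up for illustration.

```c
/* Hypothetical helper, for illustration only: round a double to the
   nearest value representable as a float, then widen it back.  This
   mirrors the idea of constraining an untyped constant to the
   precision of its actual type; it is not the MPFR-based code. */
static double constrain_to_float32(double value)
{
    float narrowed = (float) value;  /* rounds to float precision */
    return (double) narrowed;        /* widening back is exact */
}
```

A value such as 0.5 is unchanged by this, while 0.1 comes back slightly different because it is not exactly representable at the narrower precision.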

Ian

diff -r f05cc4c5ad22 go/expressions.cc
--- a/go/expressions.cc	Tue Apr 19 16:19:05 2011 -0700
+++ b/go/expressions.cc	Wed Apr 20 22:52:36 2011 -0700
@@ -1909,13 +1909,7 @@
 {
   Float_type* ftype = type->float_type();
   if (ftype != NULL && !ftype->is_abstract())
-{
-  tree type_tree = ftype->type_tree();
-  REAL_VALUE_TYPE rvt;
-  real_from_mpfr(&rvt, val, type_tree, GMP_RNDN);
-  real_convert(&rvt, TYPE_MODE(type_tree), &rvt);
-  mpfr_from_real(val, &rvt, GMP_RNDN);
-}
+mpfr_prec_round(val, ftype->bits(), GMP_RNDN);
 }
 
 // Return a floating point constant value.
@@ -2158,16 +2152,8 @@
   Complex_type* ctype = type->complex_type();
   if (ctype != NULL && !ctype->is_abstract())
 {
-  tree type_tree = ctype->type_tree();
-
-  REAL_VALUE_TYPE rvt;
-  real_from_mpfr(&rvt, real, TREE_TYPE(type_tree), GMP_RNDN);
-  real_convert(&rvt, TYPE_MODE(TREE_TYPE(type_tree)), &rvt);
-  mpfr_from_real(real, &rvt, GMP_RNDN);
-
-  real_from_mpfr(&rvt, imag, TREE_TYPE(type_tree), GMP_RNDN);
-  real_convert(&rvt, TYPE_MODE(TREE_TYPE(type_tree)), &rvt);
-  mpfr_from_real(imag, &rvt, GMP_RNDN);
+  mpfr_prec_round(real, ctype->bits() / 2, GMP_RNDN);
+  mpfr_prec_round(imag, ctype->bits() / 2, GMP_RNDN);
 }
 }

Re: FDO usability patch -- cfg and lineno checksum

2011-04-20 Thread Xinliang David Li

On Tue, Apr 19, 2011 at 3:20 PM, Jan Hubicka  wrote:
> I can not review tree.c changes.  I would probably suggest making crc_byte
> inline.

Diego, can you review this change? This is just a simple refactoring.


>
>> +#if IN_LIBGCOV
>> +
>> +/* These functions are guarded by #if to avoid compile time warning.  */
>> +
>> +/* Return the number of words STRING would need including the length
>> +   field in the output stream itself.  This should be identical to
>> +   "alloc" calculation in gcov_write_string().  */
>
> Hmm, probably better to make gcov_write_string to use gcov_string_length then.


yes.

>> @@ -238,13 +265,15 @@ gcov_write_words (unsigned words)
>>
>>    gcc_assert (gcov_var.mode < 0);
>>  #if IN_LIBGCOV
>> -  if (gcov_var.offset >= GCOV_BLOCK_SIZE)
>> +  if (gcov_var.offset + words >= GCOV_BLOCK_SIZE)
>>      {
>> -      gcov_write_block (GCOV_BLOCK_SIZE);
>> +      gcov_write_block (MIN (gcov_var.offset, GCOV_BLOCK_SIZE));
>>        if (gcov_var.offset)
>>       {
>> -       gcc_assert (gcov_var.offset == 1);
>> -       memcpy (gcov_var.buffer, gcov_var.buffer + GCOV_BLOCK_SIZE, 4);
>> +       gcc_assert (gcov_var.offset < GCOV_BLOCK_SIZE);
>> +       memcpy (gcov_var.buffer,
>> +                  gcov_var.buffer + GCOV_BLOCK_SIZE,
>> +                  gcov_var.offset << 2);
>
> I don't really follow the logic here.  buffer is allocated to be size of
> block+4 and it is expected that gcov_write_words is not executed on size
> greater than 4.  Since gcov_write_string now seems to be expected to handle
> strings of bigger size, I think you actually need to make write_string to write
> in chunks when you reach block boundary?

gcov_write_words is used to reserve words*4 bytes in the buffer for data
to be written later. The old logic is wrong: if 'words' is large enough,
it will lead to an out-of-bounds access.
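
To make the failure mode concrete, here is a toy sketch of a reserve-then-write word buffer. It is not the libgcov code: BLOCK_WORDS, struct word_buffer and reserve_words are hypothetical names, and "flushing" is only simulated by a counter. The point it illustrates is that the check has to account for the size of the request, not just the current offset.

```c
#include <stddef.h>

enum { BLOCK_WORDS = 8 };            /* toy stand-in for a block size */

struct word_buffer {
    unsigned words[BLOCK_WORDS];
    size_t offset;                   /* words currently buffered */
    size_t flushed;                  /* words handed to the pretend file */
};

/* Reserve COUNT words and return a pointer to them, flushing first if
   the request would run past the end of the buffer.  Returns NULL when
   the request can never fit, so the caller has to write in chunks. */
static unsigned *reserve_words(struct word_buffer *buf, size_t count)
{
    if (count > BLOCK_WORDS)
        return NULL;
    if (buf->offset + count > BLOCK_WORDS) {
        buf->flushed += buf->offset; /* pretend to write pending words */
        buf->offset = 0;
    }
    unsigned *slot = &buf->words[buf->offset];
    buf->offset += count;
    return slot;
}
```

Checking only the offset against the block size, as the old code did, lets a large request start near the end of the buffer and run past it.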

>
> As soon as you decide to not write out in the GCOV_BLOCK_SIZE chunks, the
> MIN computation above seems unnecessary (and bogus, since we won't let
> gcov_var.offset to grow past GCOV_BLOCK_SIZE).

Yes, using gcov_var.offset should be good enough.

>
> What gets into libgcov is very problematic business for embedded targets,
> where you really want libgcov to be small.  Why do you need to store
> strings now?

It is needed to store the assembler name for functions. It allows
lookup of the profile data using assembler name as key instead of
using function ids. For gcc, the total size of gcda files is about 59k
bytes without storing the names, and about 65k with names -- about 10%
increase. For eon, the size changes from 27k to 35k bytes.

I can split the patches into two parts -- one with cfg checksum and
one with the name string.

>> Index: gcov-io.h
>> ===
>> --- gcov-io.h (revision 172693)
>> +++ gcov-io.h (working copy)
>> @@ -101,9 +101,10 @@ see the files COPYING3 and COPYING.RUNTI
>>
>>     The basic block graph file contains the following records
>>       note: unit function-graph*
>> -     unit: header int32:checksum string:source
>> +     unit: header int32:checksum int32: string:source
>>       function-graph: announce_function basic_blocks {arcs | lines}*
>> -     announce_function: header int32:ident int32:checksum
>> +     announce_function: header int32:ident
>> +             int32:lineno_checksum int32:cfg_checksum
>>               string:name string:source int32:lineno
>>       basic_block: header int32:flags*
>>       arcs: header int32:block_no arc*
>> @@ -132,7 +133,9 @@ see the files COPYING3 and COPYING.RUNTI
>>          data: {unit function-data* summary:object summary:program*}*
>>       unit: header int32:checksum
>>          function-data:       announce_function arc_counts
>> -     announce_function: header int32:ident int32:checksum
>> +     announce_function: header int32:ident
>> +             int32:lineno_checksum int32:cfg_checksum
>> +             string:name
>>       arc_counts: header int64:count*
>>       summary: int32:checksum {count-summary}GCOV_COUNTERS
>>       count-summary:  int32:num int32:runs int64:sum
>
> We also need to bump gcov version, right?

Yes -- but the version is currently derived from the gcc version number
and phase number, which is wrong: different versions of gcc may
have compatible coverage data formats.  Any suggestions to change this?
Probably just hard code the GCOV_VERSION string?


>
>> @@ -411,11 +417,20 @@ struct gcov_summary
>>  /* Information about a single function.  This uses the trailing array
>>     idiom. The number of counters is determined from the counter_mask
>>     in gcov_info.  We hold an array of function info, so have to
>> -   explicitly calculate the correct array stride.  */
>> +   explicitly calculate the correct array stride.
>> +
>> +   "ident" is no longer used in hash computation.  Additionally,
>> +   "name" is used in hash computation.  This makes the profile data
>> +   not compatible across function name changes.
>> +   Also, the fu

Some small C++ PATCHes

2011-04-20 Thread Jason Merrill

1) While looking at 48530 I noticed that my recent change to put array 
literals in static storage again needed to check for non-trivial 
destructors.


2) I also added support for trivial destructors to build_over_call, 
though it isn't currently used by anything.


3) lookup_fnfields_slot wasn't causing the type to be completed, which 
isn't currently an issue on the trunk, but was when backporting to 4.4.


Tested x86_64-pc-linux-gnu, applied to trunk.  Patch 1 also applied to 4.6.
commit cc1fefa687f03b99b3ac88782d2264561f603401
Author: Jason Merrill 
Date:   Wed Apr 20 13:05:15 2011 -0700

* semantics.c (finish_compound_literal): Don't put an array
with a dtor in a static variable.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index e9b1907..7763ae0 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2383,6 +2383,7 @@ finish_compound_literal (tree type, tree compound_literal,
  represent class temporaries with TARGET_EXPR so we elide copies.  */
   if ((!at_function_scope_p () || CP_TYPE_CONST_P (type))
   && TREE_CODE (type) == ARRAY_TYPE
+  && !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type)
   && initializer_constant_valid_p (compound_literal, type))
 {
   tree decl = create_temporary_var (type);
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist47.C b/gcc/testsuite/g++.dg/cpp0x/initlist47.C
new file mode 100644
index 000..b76fb58
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist47.C
@@ -0,0 +1,9 @@
+// { dg-options -std=c++0x }
+
+struct A { ~A() = delete; };   // { dg-error "declared" }
+
+int main()
+{
+  typedef const A cA[2];
+  cA{};// { dg-error "deleted" }
+}
commit 3bde00a1e03ff09f14bd70454d1b4ceb5047b8b3
Author: Jason Merrill 
Date:   Wed Apr 20 13:04:50 2011 -0700

* call.c (build_over_call): Handle trivial dtor.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 78104b1..cf8e1a5 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6411,7 +6411,11 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 
   return val;
 }
-  /* FIXME handle trivial default constructor and destructor, too.  */
+  else if (DECL_DESTRUCTOR_P (fn)
+  && trivial_fn_p (fn)
+  && !DECL_DELETED_FN (fn))
+return fold_convert (void_type_node, argarray[0]);
+  /* FIXME handle trivial default constructor, too.  */
 
   if (!already_used)
 mark_used (fn);
diff --git a/gcc/testsuite/g++.dg/init/dtor4.C b/gcc/testsuite/g++.dg/init/dtor4.C
new file mode 100644
index 000..4bca69e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/dtor4.C
@@ -0,0 +1,9 @@
+// { dg-final { scan-assembler-not "_ZN1AD2Ev" } }
+
+struct A { };
+
+int main()
+{
+  A a;
+  a.~A();
+}
commit a2ff11d1ecbd5f238080fd6e3f95fa13d3cefff2
Author: Jason Merrill 
Date:   Wed Apr 20 15:59:38 2011 -0700

* search.c (lookup_fnfields_slot): Call complete_type.

diff --git a/gcc/cp/search.c b/gcc/cp/search.c
index 9ec6fc3..e7d2048 100644
--- a/gcc/cp/search.c
+++ b/gcc/cp/search.c
@@ -1451,7 +1451,7 @@ lookup_fnfields_1 (tree type, tree name)
 tree
 lookup_fnfields_slot (tree type, tree name)
 {
-  int ix = lookup_fnfields_1 (type, name);
+  int ix = lookup_fnfields_1 (complete_type (type), name);
   if (ix < 0)
 return NULL_TREE;
   return VEC_index (tree, CLASSTYPE_METHOD_VEC (type), ix);

Re: fix memory leak in gengtype

2011-04-20 Thread Laurynas Biveinis

Dimitrios -

The patch is OK with a ChangeLog entry. Also a patch to fix the same
in gengtype.c:matching_file_name_substitute is pre-approved (but it
looks like Jeff will beat you to this :)

> P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu) but
> unfortunately the sparcstation crashed and burned after this, so I can't
> continue the build and report back :-(

:( Why don't you get yourself a compile farm account?
http://gcc.gnu.org/wiki/CompileFarm

-- 
Laurynas

Re: C++ PATCH for c++/48594 (failure with overloaded ->* in template)

2011-04-20 Thread Jason Merrill

While backporting this to 4.5 and 4.4 I noticed that making the object 
non-dependent shouldn't be conditionalized.


Tested x86_64-pc-linux-gnu, applying to 4.6 and trunk.  Applying the two 
patches folded together on 4.4 and 4.5.
commit a39f5c2859bb16af16945830f3c0802c40441b70
Author: Jason Merrill 
Date:   Wed Apr 20 12:48:33 2011 -0700

PR c++/48594
* decl2.c (build_offset_ref_call_from_tree): Move
non-dependency of object outside condition.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 1217e42..89e03c0 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4079,9 +4079,9 @@ build_offset_ref_call_from_tree (tree fn, VEC(tree,gc) **args)
 parameter.  That must be done before the FN is transformed
 because we depend on the form of FN.  */
   make_args_non_dependent (*args);
+  object = build_non_dependent_expr (object);
   if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
{
- object = build_non_dependent_expr (object);
  if (TREE_CODE (fn) == DOTSTAR_EXPR)
object = cp_build_addr_expr (object, tf_warning_or_error);
  VEC_safe_insert (tree, gc, *args, 0, object);

Re: [RFA] [PowerPC]

2011-04-20 Thread Segher Boessenkool


The test and-1.c has wrong logic.
In the formula:
y & ~(y & -y)

The part (y & -y) is always a mask with one bit set, which corresponds
to the least significant "1" bit in y.
The final result is y with that bit cleared (y & ~mask).

There is no boolean simplification possible, and the compiler always
produces a nand instruction.


The formula is equal to  y & (y-1) , maybe the testcase is testing that?
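
The claimed equivalence is easy to check by brute force. A minimal sketch in C, with hypothetical function names and unsigned arithmetic so that negation is well defined:

```c
#include <stdint.h>

/* y & ~(y & -y): isolate the lowest set bit, then clear it. */
static uint32_t clear_lowest_via_mask(uint32_t y)
{
    uint32_t lowest = y & (0u - y);  /* unsigned negation wraps */
    return y & ~lowest;
}

/* y & (y - 1): the usual shorter form of the same operation. */
static uint32_t clear_lowest_via_decrement(uint32_t y)
{
    return y & (y - 1u);
}
```

Both functions return y with its least significant set bit cleared, and both return 0 for y == 0.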


Segher

Re: Ping^2 Re: Target header etc. cleanup patch

2011-04-20 Thread Paul Koning


On Apr 20, 2011, at 5:09 PM, Joseph S. Myers wrote:

> Ping^2.  This patch 
>  is still pending 
> review.  This version applies cleanly to current trunk.
> ...

pdp11 is fine.  Thanks!

paul

Re: [PATCH] use build_function_type_list in the rs6000 backend

2011-04-20 Thread David Edelsohn

On Wed, Apr 20, 2011 at 3:49 PM, Nathan Froyd  wrote:
> As $SUBJECT suggests.  The only tricky part is in builtin_function_type,
> where we fill in unused args with NULL_TREE so that passing extra
> arguments to build_function_type_list doesn't matter.
>
> Tested with cross to powerpc-eabi.  OK to commit?
>
> -Nathan
>
>        * config/rs6000/rs6000.c (spe_init_builtins): Call
>        build_function_type_list instead of build_function_type.
>        (paired_init_builtins, altivec_init_builtins): Likewise.
>        (builtin_function_type): Likewise.

Okay.

Thanks, David
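
For readers unfamiliar with the idiom: build_function_type_list takes a variable argument list terminated by NULL_TREE, which is why padding unused slots with NULL_TREE is harmless. The same sentinel pattern can be sketched in standalone C; count_until_sentinel is a hypothetical function used only for illustration and is not part of GCC.

```c
#include <stdarg.h>
#include <stddef.h>

/* Count the string arguments that appear before the first NULL
   sentinel.  Anything passed after the sentinel is never read. */
static size_t count_until_sentinel(const char *first, ...)
{
    size_t count = 0;
    va_list args;

    va_start(args, first);
    for (const char *arg = first; arg != NULL;
         arg = va_arg(args, const char *))
        count++;
    va_end(args);
    return count;
}
```

A call such as count_until_sentinel("a", "b", (const char *) NULL, "ignored", (const char *) NULL) stops at the first NULL, so the trailing arguments do not matter.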

Re: fix memory leak in gengtype

2011-04-20 Thread Dimitrios Apostolou


On Wed, 20 Apr 2011, Jeff Law wrote:

> On 04/20/11 15:08, Dimitrios Apostolou wrote:
>> Hello list,
>>
>> while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM
>> killed. That's when I noticed that its RAM usage peaks at 150MB, which
>> is a bit excessive for parsing a ~500K text file.
>>
>> The attached patch fixes the leak and gengtype now uses a peak of 4MB
>> heap. Hopefully I don't do something wrong, since it took me a while to
>> understand those obstacks...
>
> The code in question creates an obstack, allocates (and grows) a single
> object on the obstack, then frees the object.  This leaks the underlying
> obstack structure itself and potentially any chunks that were too small
> to hold the object.

Plus a whole page which is preallocated by the obstack, if I understand
correctly. As a result, for each word in the text file we consume 4KB,
which is never freed.

> It turns out there's a similar leak in gengtype.c which is fixed in the
> same way.

Nice, thanks for looking deeper into this, I just stopped when memory
utilisation seemed ok.

> A quick valgrind test shows that prior to your change gengtype leaked
> roughly 200M, after your change it leaks about 1.3M and after fixing
> gengtype it leaks a little under 300k.
>
> I'll run those changes through the usual tests and check in the changes
> assuming they pass those tests.
>
> Thanks for the patch!
>
>> P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu)
>> but unfortunately the sparcstation crashed and burned after this, so I
>> can't continue the build and report back :-(
>
> :(  My old PA box has similar problems, though it merely overheats
> before a bootstrap can complete, so in theory I could coax it to finish
> a bootstrap.   Luckily others (particularly John) have stepped in over
> the last decade and taken excellent care of the PA port.

If by PA you mean PA-RISC, I remember when I had access to a Visualize
C200 with Gentoo on it. I loved the machine, but it had an important issue:
whether it would power up when you pressed the power button was completely
random. But as long as we never turned it off, it worked ok :-)



Dimitris

Re: [patch testsuite committed] Skip gcc.dg/torture/pr37868.c on sh

2011-04-20 Thread Mike Stump

On Apr 20, 2011, at 5:22 AM, Kaz Kojima wrote:
> Mike Stump  wrote:
>> I'd pre-approve hoisting these up into the lib/.exp files and checking a 
>> generic target requirement...  :-)
>> 
>>> -/* { dg-skip-if "unaligned access" { sparc*-*-* } "*" "" } */
>>> +/* { dg-skip-if "unaligned access" { sparc*-*-* sh*-*-* } "*" "" } */
> 
> I've thought the same thing when reading the recent HP's comment
> about changes of testcases for avr, but gave up after grepping
> STRICT_ALIGNMENT in gcc/config/*/*.h.

Oh, I had even less work in mind.  If the test makes non-portable assumptions 
about alignment, just tag it as unportable due to alignment, and then for sparc 
and sh, just set that flag.  As others find other testcases for other machines 
that suffer the same general problem, they eventually would find and switch to 
the more maintainable form.  As someone did a new port, and saw the testcase 
fail, they would glance at it, see the non-portable due to alignment, say, 
yeah, that applies to me as well, and then just add their port to the list.  
Once they did this, then presto, all the testcases would suddenly turn off, 
which is what they want to happen.

Re: fix memory leak in gengtype

2011-04-20 Thread Jeff Law

On 04/20/11 15:08, Dimitrios Apostolou wrote:
> Hello list,
> 
> while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM
> killed. That's when I noticed that its RAM usage peaks at 150MB, which
> is a bit excessive for parsing a ~500K text file.
> 
> The attached patch fixes the leak and gengtype now uses a peak of 4MB
> heap. Hopefully I don't do something wrong, since it took me a while to
> understand those obstacks...
The code in question creates an obstack, allocates (and grows) a single
object on the obstack, then frees the object.  This leaks the underlying
obstack structure itself and potentially any chunks that were too small
to hold the object.

It turns out there's a similar leak in gengtype.c which is fixed in the
same way.
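
For illustration, the pattern described above can be sketched as follows. This assumes glibc's <obstack.h> and is not the gengtype code; join_with_obstack is a hypothetical helper. The key step is the final obstack_free with a NULL object, which releases every chunk the obstack owns rather than just the one object.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* obstack.h expects the user to name the chunk allocator. */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Build "head" + "tail" in a temporary obstack and return a malloc'ed
   copy.  The caller frees the result with free(). */
static char *join_with_obstack(const char *head, const char *tail)
{
    struct obstack scratch;

    obstack_init(&scratch);
    obstack_grow(&scratch, head, strlen(head));
    obstack_grow0(&scratch, tail, strlen(tail));  /* appends a NUL too */
    size_t size = obstack_object_size(&scratch);
    char *object = obstack_finish(&scratch);

    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, object, size);

    /* Passing NULL frees the whole obstack, including its first chunk.
       Freeing only OBJECT would leave the chunks allocated. */
    obstack_free(&scratch, NULL);
    return copy;
}
```

If obstack_free were instead called with the object pointer, the preallocated chunk would stay allocated each time a function like this runs, which is the kind of per-call leak discussed in this thread.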

A quick valgrind test shows that prior to your change gengtype leaked
roughly 200M, after your change it leaks about 1.3M and after fixing
gengtype it leaks a little under 300k.

I'll run those changes through the usual tests and check in the changes
assuming they pass those tests.

Thanks for the patch!

> 
> P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu)
> but unfortunately the sparcstation crashed and burned after this, so I
> can't continue the build and report back :-(
:(  My old PA box has similar problems, though it merely overheats
before a bootstrap can complete, so in theory I could coax it to finish
a bootstrap.   Luckily others (particularly John) have stepped in over
the last decade and taken excellent care of the PA port.

Jeff

Re: Fix PR48703: segfault in mangler due to -g

2011-04-20 Thread Michael Matz

Hi,

I wrote:

> Basically we have to set assembler names early also for TYPE_DECLs, we 
> can't rely on the frontends langhook to do that after free_lang_data.
> 
> Okay for trunk assuming regstrapping on x86_64-linux works?

Patch retracted; it doesn't even survive the testsuite.  The problem is that we 
can't simply accept TYPE_DECLs for generating assembler names, because the 
frontends other than C++ can't deal with that (they use the default  
set_decl_assembler_name hook).  Even conditionalizing on 
  lang_hooks.set_decl_assembler_name == lhd_set_decl_assembler_name
doesn't work, because mysteriously for C++ we'll get ICEs in the C++ 
frontend itself when it is asked to mangle some TYPE_DECLs (namely when 
flag_abi_version is set, mangle_decl unconditionally calls make_alias_for, 
which in turn doesn't work with TYPE_DECLs).

It's all quite messy, and it's a wonder that -g worked somewhat with -flto at all 
for so long :-(

Ciao,
Michael.

Re: [patch] Split Parse Timevar (issue4378056)

2011-04-20 Thread Jason Merrill


On 04/12/2011 11:49 AM, Lawrence Crowl wrote:

This patch is available for review at http://codereview.appspot.com/4378056


I tried to comment there, but it didn't seem to be working; looking at 
the side-by-side diffs didn't show any changes, and double-clicking on a 
line in the patch form didn't let me add a comment.



+  timevar_start (TV_RESOLVE_OVERLOAD);


Putting this in perform_overload_resolution isn't enough; only a couple 
of cases of overload resolution actually use it.  Any function that 
calls tourney will also need this.



+lookup_template_class (tree d1, tree arglist, tree in_decl, tree context,
+  int entering_scope, tsubst_flags_t complain)
+{
+  tree ret;
+  bool subtime = timevar_cond_start (TV_NAME_LOOKUP);


Let's count this as TV_INSTANTIATE_TEMPLATE instead.


@@ -17194,7 +17225,7 @@ instantiate_decl (tree d, int defer_ok,
-  timevar_push (TV_PARSE);
+  timevar_push (TV_PARSE_GLOBAL);


This too.


@@ -1911,7 +1911,7 @@ ggc_collect (void)
-  timevar_push (TV_GC);
+  timevar_start (TV_GC);


Why this change?  GC time shouldn't be counted against whatever we 
happen to be parsing when it happens.



+DEFTIMEVAR (TV_PHASE_C_WRAPUP_CHECK  , "phase C wrapup & check")
+DEFTIMEVAR (TV_PHASE_CP_DEFERRED , "phase C++ deferred")


Why do these need to be different timevars?


+DEFTIMEVAR (TV_PARSE_INMETH  , "parser inl. meth. body")


Is it really important to distinguish this from other functions?


-DEFTIMEVAR (TV_NAME_LOOKUP   , "name lookup")
-DEFTIMEVAR (TV_OVERLOAD  , "overload resolution")
-DEFTIMEVAR (TV_TEMPLATE_INSTANTIATION, "template instantiation")
+DEFTIMEVAR (TV_INSTANTIATE_TEMPLATE  , "instantiate template")
+DEFTIMEVAR (TV_NAME_LOOKUP   , "|name lookup")
+DEFTIMEVAR (TV_RESOLVE_OVERLOAD  , "|overload resolution")


Why these changes?


@@ -564,6 +564,8 @@ compile_file (void)
+  timevar_start (TV_PHASE_PARSING);


Why does this happen before...


+  timevar_push (TV_PARSE_GLOBAL);


...this?  I would think the bits in there should be part of _SETUP.


@@ -16760,6 +16770,7 @@ cp_parser_class_specifier (cp_parser* parser)
+  timevar_pop (TV_PARSE_STRUCT);
+  timevar_pop (TV_PARSE_STRUCT);
+  timevar_pop (TV_PARSE_STRUCT);
+  timevar_pop (TV_PARSE_STRUCT);


Why not factor this out like you did with so many functions outside the 
parser?


Jason

Re: [PATCH] use build_function_type_list in the ia64 backend

2011-04-20 Thread Steve Ellcey

On Wed, 2011-04-20 at 17:25 -0400, Nathan Froyd wrote:
> On Wed, Apr 20, 2011 at 02:09:49PM -0700, Steve Ellcey wrote:
> > I am not sure what the patch would look like then.  You removed the
> > assignment to decl, so what are you putting in ia64_builtins?  Can you
> > send the full correct patch.
> 
> Sure.  Updated patch below, which probably looks somewhat more sane.
> 
> -Nathan

OK, that looks good.

Steve Ellcey
s...@cup.hp.com

Re: [PATCH] use build_function_type_list in the ia64 backend

2011-04-20 Thread Nathan Froyd

On Wed, Apr 20, 2011 at 02:09:49PM -0700, Steve Ellcey wrote:
> I am not sure what the patch would look like then.  You removed the
> assignment to decl, so what are you putting in ia64_builtins?  Can you
> send the full correct patch.

Sure.  Updated patch below, which probably looks somewhat more sane.

-Nathan

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 5f22b17..880aa8d 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -10165,7 +10165,7 @@ ia64_init_builtins (void)
   (*lang_hooks.types.register_builtin_type) (float128_type, "__float128");
 
   /* TFmode support builtins.  */
-  ftype = build_function_type (float128_type, void_list_node);
+  ftype = build_function_type_list (float128_type, NULL_TREE);
   decl = add_builtin_function ("__builtin_infq", ftype,
   IA64_BUILTIN_INFQ, BUILT_IN_MD,
   NULL, NULL_TREE);
@@ -10212,13 +10212,13 @@ ia64_init_builtins (void)
   NULL, NULL_TREE)
 
   decl = def_builtin ("__builtin_ia64_bsp",
-  build_function_type (ptr_type_node, void_list_node),
-  IA64_BUILTIN_BSP);
+ build_function_type_list (ptr_type_node, NULL_TREE),
+ IA64_BUILTIN_BSP);
   ia64_builtins[IA64_BUILTIN_BSP] = decl;
 
   decl = def_builtin ("__builtin_ia64_flushrs",
-  build_function_type (void_type_node, void_list_node),
-  IA64_BUILTIN_FLUSHRS);
+ build_function_type_list (void_type_node, NULL_TREE),
+ IA64_BUILTIN_FLUSHRS);
   ia64_builtins[IA64_BUILTIN_FLUSHRS] = decl;
 
 #undef def_builtin

Re: Ping^2 Re: Target header etc. cleanup patch

2011-04-20 Thread DJ Delorie


The m32c one is OK

Ping Re: Don't use linux.h for non-Linux targets

2011-04-20 Thread Joseph S. Myers

Ping.  This patch 
 is pending 
review.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] use build_function_type_list in the ia64 backend

2011-04-20 Thread Steve Ellcey

On Wed, 2011-04-20 at 13:03 -0700, Nathan Froyd wrote:
> On Wed, Apr 20, 2011 at 03:29:19PM -0400, Nathan Froyd wrote:
> > As $SUBJECT suggests.  Tested with cross to ia64-linux-gnu.  OK to
> > commit?
> >
> > -  ftype = build_function_type (float128_type, void_list_node);
> > -  decl = add_builtin_function ("__builtin_infq", ftype,
> > -  IA64_BUILTIN_INFQ, BUILT_IN_MD,
> > -  NULL, NULL_TREE);
> > -  ia64_builtins[IA64_BUILTIN_INFQ] = decl;
> > +  ftype = build_function_type_list (float128_type, NULL_TREE);
> > +  add_builtin_function ("__builtin_infq", ftype,
> > +   IA64_BUILTIN_INFQ, BUILT_IN_MD,
> > +   NULL, NULL_TREE);
> 
> Of course, the patch I tested didn't delete the assignment to
> ia64_builtins.  Please disregard that bit.
> 
> -Nathan

I am not sure what the patch would look like then.  You removed the
assignment to decl, so what are you putting in ia64_builtins?  Can you
send the full correct patch.

Steve Ellcey
s...@cup.hp.com

Ping^2 Re: Target header etc. cleanup patch

2011-04-20 Thread Joseph S. Myers

Ping^2.  This patch 
 is still pending 
review.  This version applies cleanly to current trunk.

2011-04-20  Joseph Myers  

* config/alpha/alpha.c (struct machine_function): Use rtx, not
struct rtx_def *.
* config/bfin/bfin-protos.h (Mmode): Don't define.  Expand
definition where used.
* config/bfin/bfin.h (bfin_cc_rtx, bfin_rets_rtx): Use rtx, not
struct rtx_def *.
* config/cris/cris-protos.h (STDIO_INCLUDED): Don't define.
* config/fr30/fr30-protos.h (Mmode): Don't define.
* config/fr30/fr30.h (inhibit_libc): Don't define.
* config/h8300/h8300.h (struct cum_arg): Use rtx, not struct
rtx_def *.
* config/i386/cygming.h (union tree_node, TREE): Don't define or
undefine.
(FILE): Don't undefine.
* config/iq2000/iq2000.h (struct iq2000_args): Use rtx, not struct
rtx_def *.
* config/m32c/m32c-protos.h (MM, UINT): Don't define.  Expand
definitions where used.
* config/m32r/m32r-protos.h (Mmode): Don't define.  Expand
definition where used.
* config/microblaze/microblaze.h (struct microblaze_args): Use
rtx, not struct rtx_def *.
* config/mn10300/mn10300-protos.h (Mmode, Cstar, Rclas): Don't
define.  Expand definitions where used.
* config/pa/pa-protos.h (return_addr_rtx): Use rtx, not struct
rtx_def *.
* config/pa/pa.h (hppa_pic_save_rtx): Use rtx, not struct rtx_def
*.
* config/pdp11/pdp11.h (cc0_reg_rtx): Use rtx, not struct rtx_def
*.
* config/rx/rx-protos.h (Mmode, Fargs, Rcode): Don't define.
Expand definitions where used.
* config/rx/rx.c (rx_is_legitimate_address, rx_function_arg_size,
rx_function_arg, rx_function_arg_advance,
rx_function_arg_boundary): Expand definitions of those macros.
* config/sh/sh-protos.h (sfunc_uses_reg, get_fpscr_rtx): Use rtx,
not struct rtx_def *.
* config/sh/sh.h (sh_compare_op0, sh_compare_op1): Use rtx, not
struct rtx_def *.
* config/spu/spu-protos.h (spu_float_const): Use rtx, not struct
rtx_def *.
* config/spu/spu.c (spu_float_const): Use rtx, not struct rtx_def
*.
* config/v850/v850-protos.h (Mmode): Don't define.  Expand
definition where used.
* config/v850/v850.h (GHS_default_section_names,
GHS_current_section_names): Use tree, not union tree_node *.

Index: gcc/config/alpha/alpha.c
===================================================================
--- gcc/config/alpha/alpha.c (revision 172767)
+++ gcc/config/alpha/alpha.c (working copy)
@@ -4606,7 +4606,7 @@ struct GTY(()) machine_function
   const char *some_ld_name;
 
   /* For TARGET_LD_BUGGY_LDGP.  */
-  struct rtx_def *gp_save_rtx;
+  rtx gp_save_rtx;
 
   /* For VMS condition handlers.  */
   bool uses_condition_handler;  
Index: gcc/config/m32c/m32c-protos.h
===================================================================
--- gcc/config/m32c/m32c-protos.h   (revision 172767)
+++ gcc/config/m32c/m32c-protos.h   (working copy)
@@ -1,5 +1,5 @@
 /* Target Prototypes for R8C/M16C/M32C
-   Copyright (C) 2005, 2007, 2008, 2010
+   Copyright (C) 2005, 2007, 2008, 2010, 2011
Free Software Foundation, Inc.
Contributed by Red Hat.
 
@@ -19,12 +19,9 @@
along with GCC; see the file COPYING3.  If not see
.  */
 
-#define MM enum machine_mode
-#define UINT unsigned int
-
 void m32c_conditional_register_usage (void);
 int  m32c_const_ok_for_constraint_p (HOST_WIDE_INT, char, const char *);
-UINT m32c_dwarf_frame_regnum (int);
+unsigned int m32c_dwarf_frame_regnum (int);
 int  m32c_eh_return_data_regno (int);
 void m32c_emit_epilogue (void);
 void m32c_emit_prologue (void);
@@ -47,8 +44,8 @@ int  m32c_trampoline_size (void);
 
 #ifdef RTX_CODE
 
-int  m32c_cannot_change_mode_class (MM, MM, int);
-int  m32c_class_max_nregs (int, MM);
+int  m32c_cannot_change_mode_class (enum machine_mode, enum machine_mode, int);
+int  m32c_class_max_nregs (int, enum machine_mode);
 rtx  m32c_eh_return_stackadj_rtx (void);
 void m32c_emit_eh_epilogue (rtx);
 int  m32c_expand_cmpstr (rtx *);
@@ -60,20 +57,20 @@ void m32c_expand_neg_mulpsi3 (rtx *);
 int  m32c_expand_setmemhi (rtx *);
 int  m32c_extra_constraint_p (rtx, char, const char *);
 int  m32c_extra_constraint_p2 (rtx, char, const char *);
-int  m32c_hard_regno_nregs (int, MM);
-int  m32c_hard_regno_ok (int, MM);
+int  m32c_hard_regno_nregs (int, enum machine_mode);
+int  m32c_hard_regno_ok (int, enum machine_mode);
 bool m32c_illegal_subreg_p (rtx);
-bool m32c_immd_dbl_mov (rtx *, MM);
+bool m32c_immd_dbl_mov (rtx *, enum machine_mode);
 rtx  m32c_incoming_return_addr_rtx (void);
 int  m32c_legitimate_constant_p (rtx);
-int  m32c_legitimize_reload_address (rtx *, MM, int, int, int);
-int  m32c_limit_re

Re: [PATCH] use build_function_type_list a few places in the ObjC frontend

2011-04-20 Thread Mike Stump

On Apr 20, 2011, at 10:27 AM, Nathan Froyd wrote:
> Tested on x86_64-unknown-linux-gnu.  IIUC the changes to
> objc-next-runtime-abi-02.c would not be tested on that platform, so it
> would be helpful to have a Darwin tester double-check my work.

Just check http://gcc.gnu.org/regtest/HEAD/ after about 10 hours.

> OK to commit?

Ok.

fix memory leak in gengtype

2011-04-20 Thread Dimitrios Apostolou


Hello list,

while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM 
killed. That's when I noticed that its RAM usage peaks at 150MB, which is 
a bit excessive for parsing a ~500K text file.


The attached patch fixes the leak and gengtype now uses a peak of 4MB 
heap. Hopefully I don't do something wrong, since it took me a while to 
understand those obstacks...
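
For anyone else puzzling over obstack_free, here is a minimal sketch of
the two ways of calling it, assuming glibc's <obstack.h>.  The helper
names (intern_copy, obstack_demo) are hypothetical and not part of
gengtype.  As I read the glibc manual, passing an object pointer frees
that object and everything allocated after it while keeping the obstack
usable, whereas passing NULL frees everything and leaves the obstack
uninitialized.

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* glibc's obstack macros expect these two names to be defined.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical helper: build NAME on a scratch obstack, copy it to the
   heap, then release the scratch space.  */
static char *
intern_copy (struct obstack *scratch, const char *name)
{
  char *tmp, *copy;

  assert (scratch != NULL && name != NULL);
  obstack_grow (scratch, name, strlen (name));
  obstack_1grow (scratch, '\0');
  tmp = (char *) obstack_finish (scratch);

  copy = (char *) malloc (strlen (tmp) + 1);
  if (copy != NULL)
    strcpy (copy, tmp);

  /* Frees TMP and anything allocated after it; SCRATCH stays usable.  */
  obstack_free (scratch, tmp);
  return copy;
}

/* Returns 1 if two consecutive strings survive the scratch reuse.  */
static int
obstack_demo (void)
{
  struct obstack scratch;
  char *a, *b;
  int ok;

  obstack_init (&scratch);
  a = intern_copy (&scratch, "alpha");
  b = intern_copy (&scratch, "beta");
  ok = (a != NULL && b != NULL
        && strcmp (a, "alpha") == 0 && strcmp (b, "beta") == 0);

  /* Passing NULL frees every chunk.  The obstack must be initialized
     again before any further use, so this is done only at teardown.  */
  obstack_free (&scratch, NULL);
  free (a);
  free (b);
  return ok;
}
```

Whether the NULL form is safe in a given caller depends on whether the
obstack is used again afterwards without a fresh obstack_init.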



Thanks,
Dimitris


P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu) but 
unfortunately the sparcstation crashed and burned after this, so I can't 
continue the build and report back :-(
--- gcc/gengtype-state.c.orig   2011-04-20 23:06:29.000000000 +0300
+++ gcc/gengtype-state.c        2011-04-20 23:12:43.000000000 +0300
@@ -303,7 +303,7 @@
   obstack_1grow (&id_obstack, (char) 0);
   ids = XOBFINISH (&id_obstack, char *);
   sid = state_ident_by_name (ids, INSERT);
-  obstack_free (&id_obstack, ids);
+  obstack_free (&id_obstack, NULL);
   ids = NULL;
   tk = XCNEW (struct state_token_st);
   tk->stok_kind = STOK_NAME;
@@ -408,7 +408,7 @@
   tk->stok_file = state_path;
   tk->stok_next = NULL;
   strcpy (tk->stok_un.stok_string, cstr);
-  obstack_free (&bstring_obstack, cstr);
+  obstack_free (&bstring_obstack, NULL);
 
   return tk;
 }

Re: [patch] Do not generate discriminator directive in strict mode

2011-04-20 Thread Richard Henderson

On 04/20/2011 12:09 PM, Eric Botcazou wrote:
>> How is this not redundant with the existing
>>
>>   /* The discriminator column was added in dwarf4.  Simplify the below
>>  by simply removing it if we're not supposed to output it.  */
>>   if (dwarf_version < 4 && dwarf_strict)
>> discriminator = 0;
>>
>> check near the top of the function?
> 
> Obviously I missed this recent change, sorry.  So the question is: would the 
> change be appropriate for the release branches, where we emit the directive 
> unconditionally, i.e. 4.5 and 4.6 branches, or would mine be safer for them?

Let's try to keep the branches more similar than not.  It's just as safe, since
prior to mainline we ignore the discriminator when not emitting via gas 
directive.


r~

Re: Second ping for cannot_force_const_mem & LEGITIMATE_CONSTANT_P changes

2011-04-20 Thread Richard Henderson

On 04/18/2011 02:30 AM, Richard Sandiford wrote:
> Ping for these two changes:
> 
> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00194.html
> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00195.html
> 
Both ok.


r~

[PATCH] use build_function_type_list in the sh backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  The only tricky bit is the initialization of
`args' to NULL_TREEs so that we can safely pass all of the relevant args
to build_function_type_list, regardless of whether the function type in
question has that many args.
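
The padding idea can be sketched in plain C, independent of GCC's tree
types.  This is only an illustration: count_arg_types is a hypothetical
stand-in for a NULL-terminated variadic builder such as
build_function_type_list, and strings stand in for type nodes.  Because
the callee stops at the first NULL, unused slots that were initialized
to NULL simply end the list early.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for a NULL-terminated variadic builder: it
   counts the argument types seen before the first NULL.  */
static int
count_arg_types (const char *ret_type, ...)
{
  va_list ap;
  int n = 0;

  (void) ret_type;
  va_start (ap, ret_type);
  while (va_arg (ap, const char *) != NULL)
    n++;
  va_end (ap);
  return n;
}

/* Describe a signature with N_USED real arguments (at most 3) and
   always pass all three slots, as the patch does with args[0..2].  */
static int
count_for_signature (int n_used)
{
  const char *args[3];
  int i;

  assert (n_used >= 0 && n_used <= 3);

  /* Pad every slot first so unused ones act as terminators.  */
  for (i = 0; i < 3; i++)
    args[i] = NULL;
  for (i = 0; i < n_used; i++)
    args[i] = "int";

  return count_arg_types ("void", args[0], args[1], args[2],
                          (const char *) NULL);
}
```

Without the padding loop, an unused slot would hold an indeterminate
value and the callee could read past the intended end of the list.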

Tested with cross to sh-elf.  OK to commit?

-Nathan

* config/sh/sh.c (sh_media_init_builtins): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c
index 78f6f0f..0f158d5 100644
--- a/gcc/config/sh/sh.c
+++ b/gcc/config/sh/sh.c
@@ -11222,6 +11222,7 @@ sh_media_init_builtins (void)
   else
{
  int has_result = signature_args[signature][0] != 0;
+ tree args[3];
 
  if ((signature_args[signature][1] & 8)
  && (((signature_args[signature][1] & 1) && TARGET_SHMEDIA32)
@@ -11230,7 +11231,8 @@ sh_media_init_builtins (void)
  if (! TARGET_FPU_ANY
  && FLOAT_MODE_P (insn_data[d->icode].operand[0].mode))
continue;
- type = void_list_node;
+ for (i = 0; i < (int) ARRAY_SIZE (args); i++)
+   args[i] = NULL_TREE;
  for (i = 3; ; i--)
{
  int arg = signature_args[signature][i];
@@ -11248,9 +11250,10 @@ sh_media_init_builtins (void)
arg_type = void_type_node;
  if (i == 0)
break;
- type = tree_cons (NULL_TREE, arg_type, type);
+ args[i-1] = arg_type;
}
- type = build_function_type (arg_type, type);
+ type = build_function_type_list (arg_type, args[0], args[1],
+  args[2], NULL_TREE);
  if (signature < SH_BLTIN_NUM_SHARED_SIGNATURES)
shared[signature] = type;
}

[PATCH] use build_function_type_list in the stormy16 backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  For safety's sake, we initialize all the
arguments to NULL before passing them to build_function_type_list.  This
is not necessary currently, as we always completely fill in the args
array, but it might save some future coder from quite some grief...
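
As a rough sketch of the shape of this loop outside of GCC: a signature
string whose first character is the return type and whose remaining
characters are argument types gets decoded into a fixed-size array.  The
type letters, MAX_ARGS, and all function names below are hypothetical;
the real letter table lives in the backend.  The point is that the whole
array is cleared first and the length is checked before any slot is
written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 2

/* Hypothetical letter-to-type table, for illustration only.  */
static const char *
type_for_letter (char c)
{
  switch (c)
    {
    case 'v': return "void";
    case 'i': return "int";
    case 'l': return "long";
    default:  return NULL;
    }
}

/* Decode SIG: sig[0] is the return type, the rest are arguments.
   Unused slots of ARGS are left NULL.  Returns the argument count, or
   -1 if SIG is empty, too long, or uses an unknown letter.  */
static int
decode_signature (const char *sig, const char **ret,
                  const char *args[MAX_ARGS])
{
  size_t len = strlen (sig);
  size_t a;

  if (len == 0 || len - 1 > MAX_ARGS)
    return -1;

  /* Clear every slot, not just the ones this signature uses.  */
  for (a = 0; a < MAX_ARGS; a++)
    args[a] = NULL;

  *ret = type_for_letter (sig[0]);
  if (*ret == NULL)
    return -1;

  for (a = 1; a < len; a++)
    {
      args[a - 1] = type_for_letter (sig[a]);
      if (args[a - 1] == NULL)
        return -1;
    }
  return (int) (len - 1);
}

/* Convenience wrapper that supplies the scratch storage.  */
static int
decoded_arg_count (const char *sig)
{
  const char *ret = NULL;
  const char *args[MAX_ARGS];

  assert (sig != NULL);
  return decode_signature (sig, &ret, args);
}
```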

Tested with cross to xstormy16-elf.  OK to commit?

-Nathan

* config/stormy16/stormy16.c (xstormy16_init_builtins): Call
build_function_type_list instead of build_function_type.
Rearrange initialization of `args' to do so.

diff --git a/gcc/config/stormy16/stormy16.c b/gcc/config/stormy16/stormy16.c
index 052285c..1a90e16 100644
--- a/gcc/config/stormy16/stormy16.c
+++ b/gcc/config/stormy16/stormy16.c
@@ -2255,15 +2255,21 @@ static struct
 static void
 xstormy16_init_builtins (void)
 {
-  tree args, ret_type, arg;
-  int i, a;
+  tree args[2], ret_type, arg = NULL_TREE, ftype;
+  int i, a, n_args;
 
   ret_type = void_type_node;
 
   for (i = 0; s16builtins[i].name; i++)
 {
-  args = void_list_node;
-  for (a = strlen (s16builtins[i].arg_types) - 1; a >= 0; a--)
+  n_args = strlen (s16builtins[i].arg_types) - 1;
+
+  gcc_assert (n_args <= (int) ARRAY_SIZE (args));
+
+  for (a = 0; a < (int) ARRAY_SIZE (args); a++)
+   args[a] = NULL_TREE;
+
+  for (a = n_args; a >= 0; a--)
{
  switch (s16builtins[i].arg_types[a])
{
@@ -2276,10 +2282,10 @@ xstormy16_init_builtins (void)
  if (a == 0)
ret_type = arg;
  else
-   args = tree_cons (NULL_TREE, arg, args);
+   args[a-1] = arg;
}
-  add_builtin_function (s16builtins[i].name,
-   build_function_type (ret_type, args),
+  ftype = build_function_type_list (ret_type, args[0], args[1], NULL_TREE);
+  add_builtin_function (s16builtins[i].name, ftype,
i, BUILT_IN_MD, NULL, NULL);
 }
 }

[PATCH] use build_function_type_list in the spu backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  The only tricky bit is initializing all the args
to NULL_TREE so that we can safely pass all the args to
build_function_type_list.

Tested with cross to spu-elf; I couldn't build all of libgcc, but that
appears to be a pre-existing problem.  OK to commit?

-Nathan

* config/spu/spu.c (spu_init_builtins): Call build_function_type_list
instead of build_function_type.  Rearrange gathering of args to
do so.
* config/spu/spu-builtins.def (SPU_MAX_ARGS_TO_BUILTIN): Define.

diff --git a/gcc/config/spu/spu-builtins.def b/gcc/config/spu/spu-builtins.def
index 4d01d94..6dfdf8c 100644
--- a/gcc/config/spu/spu-builtins.def
+++ b/gcc/config/spu/spu-builtins.def
@@ -23,6 +23,8 @@
 #define _A3(a,b,c)   {a, b, c, SPU_BTI_END_OF_PARAMS}
 #define _A4(a,b,c,d) {a, b, c, d, SPU_BTI_END_OF_PARAMS}
 
+#define SPU_MAX_ARGS_TO_BUILTIN 3
+
 /* definitions to support si intrinsic functions: (These and other builtin
  * definitions must precede definitions of the overloaded generic intrinsics */
 
diff --git a/gcc/config/spu/spu.c b/gcc/config/spu/spu.c
index 941194b..ea9d580 100644
--- a/gcc/config/spu/spu.c
+++ b/gcc/config/spu/spu.c
@@ -5777,9 +5777,10 @@ spu_init_builtins (void)
  sure nodes are shared. */
   for (i = 0, d = spu_builtins; i < NUM_SPU_BUILTINS; i++, d++)
 {
-  tree p;
+  tree ftype;
   char name[64];   /* build_function will make a copy. */
-  int parm;
+  int parm, j;
+  tree args[SPU_MAX_ARGS_TO_BUILTIN];
 
   if (d->name == 0)
 continue;
@@ -5788,15 +5789,23 @@ spu_init_builtins (void)
   for (parm = 1; d->parm[parm] != SPU_BTI_END_OF_PARAMS; parm++)
 ;
 
-  p = void_list_node;
+  gcc_assert (parm <= (SPU_MAX_ARGS_TO_BUILTIN + 1));
+
+  for (j = 0; j < (int) ARRAY_SIZE (args); j++)
+   args[j] = NULL_TREE;
+
   while (parm > 1)
-   p = tree_cons (NULL_TREE, spu_builtin_types[d->parm[--parm]], p);
+   {
+ tree arg = spu_builtin_types[d->parm[--parm]];
+ args[parm-1] = arg;
+   }
 
-  p = build_function_type (spu_builtin_types[d->parm[0]], p);
+  ftype = build_function_type_list (spu_builtin_types[d->parm[0]],
+   args[0], args[1], args[2], NULL_TREE);
 
   sprintf (name, "__builtin_%s", d->name);
   spu_builtin_decls[i] =
-   add_builtin_function (name, p, i, BUILT_IN_MD, NULL, NULL_TREE);
+   add_builtin_function (name, ftype, i, BUILT_IN_MD, NULL, NULL_TREE);
   if (d->fcode == SPU_MASK_FOR_LOAD)
TREE_READONLY (spu_builtin_decls[i]) = 1;

Re: [PATCH] use build_function_type_list in the pa backend

2011-04-20 Thread John David Anglin

>   * config/pa/pa.c (pa_init_builtins): Call build_function_type_list
>   instead of build_function_type.

Ok.

Dave
-- 
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)

Re: [PATCH] use build_function_type_list in the ia64 backend

2011-04-20 Thread Nathan Froyd

On Wed, Apr 20, 2011 at 03:29:19PM -0400, Nathan Froyd wrote:
> As $SUBJECT suggests.  Tested with cross to ia64-linux-gnu.  OK to
> commit?
>
> -  ftype = build_function_type (float128_type, void_list_node);
> -  decl = add_builtin_function ("__builtin_infq", ftype,
> -IA64_BUILTIN_INFQ, BUILT_IN_MD,
> -NULL, NULL_TREE);
> -  ia64_builtins[IA64_BUILTIN_INFQ] = decl;
> +  ftype = build_function_type_list (float128_type, NULL_TREE);
> +  add_builtin_function ("__builtin_infq", ftype,
> + IA64_BUILTIN_INFQ, BUILT_IN_MD,
> + NULL, NULL_TREE);

Of course, the patch I tested didn't delete the assignment to
ia64_builtins.  Please disregard that bit.

-Nathan

[PATCH] use build_function_type_list in the arm backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  There's one remaining use of build_function_type,
but replacing that will have to wait until we have a better
FUNCTION_TYPE-building interface.

Tested with cross to arm-eabi.  OK to commit?

-Nathan

* config/arm/arm.c (arm_init_iwmmxt_builtins): Call
build_function_type_list instead of build_function_type.
Delete variable `endlink'.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5f964d6..9f10ac4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -18915,196 +18915,137 @@ arm_init_iwmmxt_builtins (void)
 {
   const struct builtin_description * d;
   size_t i;
-  tree endlink = void_list_node;
 
   tree V2SI_type_node = build_vector_type_for_mode (intSI_type_node, V2SImode);
   tree V4HI_type_node = build_vector_type_for_mode (intHI_type_node, V4HImode);
   tree V8QI_type_node = build_vector_type_for_mode (intQI_type_node, V8QImode);
 
   tree int_ftype_int
-= build_function_type (integer_type_node,
-  tree_cons (NULL_TREE, integer_type_node, endlink));
+= build_function_type_list (integer_type_node,
+   integer_type_node, NULL_TREE);
   tree v8qi_ftype_v8qi_v8qi_int
-= build_function_type (V8QI_type_node,
-  tree_cons (NULL_TREE, V8QI_type_node,
- tree_cons (NULL_TREE, V8QI_type_node,
-tree_cons (NULL_TREE,
-   integer_type_node,
-   endlink))));
+= build_function_type_list (V8QI_type_node,
+   V8QI_type_node, V8QI_type_node,
+   integer_type_node, NULL_TREE);
   tree v4hi_ftype_v4hi_int
-= build_function_type (V4HI_type_node,
-  tree_cons (NULL_TREE, V4HI_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-endlink)));
+= build_function_type_list (V4HI_type_node,
+   V4HI_type_node, integer_type_node, NULL_TREE);
   tree v2si_ftype_v2si_int
-= build_function_type (V2SI_type_node,
-  tree_cons (NULL_TREE, V2SI_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-endlink)));
+= build_function_type_list (V2SI_type_node,
+   V2SI_type_node, integer_type_node, NULL_TREE);
   tree v2si_ftype_di_di
-= build_function_type (V2SI_type_node,
-  tree_cons (NULL_TREE, long_long_integer_type_node,
- tree_cons (NULL_TREE,
-long_long_integer_type_node,
-endlink)));
+= build_function_type_list (V2SI_type_node,
+   long_long_integer_type_node,
+   long_long_integer_type_node,
+   NULL_TREE);
   tree di_ftype_di_int
-= build_function_type (long_long_integer_type_node,
-  tree_cons (NULL_TREE, long_long_integer_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-endlink)));
+= build_function_type_list (long_long_integer_type_node,
+   long_long_integer_type_node,
+   integer_type_node, NULL_TREE);
   tree di_ftype_di_int_int
-= build_function_type (long_long_integer_type_node,
-  tree_cons (NULL_TREE, long_long_integer_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-tree_cons (NULL_TREE,
-   integer_type_node,
-   endlink))));
+= build_function_type_list (long_long_integer_type_node,
+   long_long_integer_type_node,
+   integer_type_node,
+   integer_type_node, NULL_TREE);
   tree int_ftype_v8qi
-= build_function_type (integer_type_node,
-  tree_cons (NULL_TREE, V8QI_type_node,
- endlink));
+= build_function_type_list (integer_type_node,
+   V8QI_type_node, NULL_TREE);
   tree int_ftype_v4hi
-= build_function_type (integer_type_node,
-  tree_cons (NULL_TREE, V4HI_type_node,
- endlink));
+= build_function_type_list (integer_type_node,
+   V4HI_type_node, NULL_TREE);
   tree int_ftype_v2si
-= build_function_ty

[PATCH] use build_function_type_list in the pa backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to hppa-linux-gnu.  OK to
commit?

-Nathan

* config/pa/pa.c (pa_init_builtins): Call build_function_type_list
instead of build_function_type.

diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c
index e05cf19..aeb8061 100644
--- a/gcc/config/pa/pa.c
+++ b/gcc/config/pa/pa.c
@@ -641,7 +641,7 @@ pa_init_builtins (void)
   TREE_READONLY (decl) = 1;
   pa_builtins[PA_BUILTIN_COPYSIGNQ] = decl;
 
-  ftype = build_function_type (long_double_type_node, void_list_node);
+  ftype = build_function_type_list (long_double_type_node, NULL_TREE);
   decl = add_builtin_function ("__builtin_infq", ftype,
   PA_BUILTIN_INFQ, BUILT_IN_MD,
   NULL, NULL_TREE);

[PATCH] use build_function_type_list in the avr backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to avr-elf.  OK to commit?

-Nathan

* config/avr/avr.c (avr_init_builtins): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
index 500a5b2..6dbf8b4 100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -6535,7 +6535,7 @@ static void
 avr_init_builtins (void)
 {
   tree void_ftype_void
-= build_function_type (void_type_node, void_list_node);
+= build_function_type_list (void_type_node, NULL_TREE);
   tree uchar_ftype_uchar
 = build_function_type_list (unsigned_char_type_node, 
 unsigned_char_type_node,

[PATCH, i386]: Expand insv pattern to pinsr{b,w,d,q} insn

2011-04-20 Thread Uros Bizjak

Hello!

The attached patch enhances the fix for PR target/48678 to generate a
pinsr{b,w,d,q} insn when a value is inserted into a vector register.
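
To make the size/position arithmetic concrete, here is a small model in
plain C rather than RTL.  It is a sketch under stated assumptions, not
the code from the patch: pinsr_lane, insert_word, and struct vec8hi are
hypothetical names, and the rule that a bit-field insert maps to an
element insert only when the field is exactly one aligned element wide
is my simplification of what such an expander has to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_BITS 128

/* Return the element index for inserting SIZE bits at bit offset POS
   into a vector of ELEM_BITS-wide elements, or -1 if the field is not
   exactly one aligned element.  */
static int
pinsr_lane (unsigned int size, unsigned int pos, unsigned int elem_bits)
{
  if (size == 0 || size != elem_bits
      || pos % size != 0 || pos + size > VEC_BITS)
    return -1;
  return (int) (pos / size);
}

/* Model of a 128-bit register viewed as eight 16-bit elements.  */
struct vec8hi
{
  uint16_t w[8];
};

/* Word-sized insert: returns 1 on success, 0 if the request does not
   describe a whole 16-bit element.  */
static int
insert_word (struct vec8hi *dst, unsigned int size, unsigned int pos,
             uint16_t src)
{
  int lane;

  assert (dst != NULL);
  lane = pinsr_lane (size, pos, 16);
  if (lane < 0)
    return 0;
  dst->w[lane] = src;
  return 1;
}
```

A request that fails these checks would fall through to the existing
insv handling, which is why the expander returns a bool.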

2011-04-20  Uros Bizjak  

PR target/48678
* config/i386/i386.md (insv): Change operand 0 constraint to
"register_operand".  Change operand 1 and 2 constraint to
"const_int_operand".  Expand to pinsr{b,w,d,q} when appropriate.
* config/i386/sse.md (sse4_1_pinsrb): Export.
(sse2_pinsrw): Ditto.
(sse4_1_pinsrd): Ditto.
(sse4_1_pinsrq): Ditto.
* config/i386/i386-protos.h (ix86_expand_pinsr): Add prototype.
* config/i386/i386.c (ix86_expand_pinsr): New.

testsuite/ChangeLog:

2011-04-20  Uros Bizjak  

PR target/48678
* gcc.target/i386/sse2-pinsrw.c: New test.
* gcc.target/i386/avx-vpinsrw.c: Ditto.
* gcc.target/i386/sse4_1-insvqi.c: Ditto.
* gcc.target/i386/sse2-insvhi.c: Ditto.
* gcc.target/i386/sse4_1-insvsi.c: Ditto.
* gcc.target/i386/sse4_1-insvdi.c: Ditto.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{-m32}.  Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 172780)
+++ config/i386/i386.md (working copy)
@@ -10393,14 +10393,17 @@
 })
 
 (define_expand "insv"
-  [(set (zero_extract (match_operand 0 "ext_register_operand" "")
- (match_operand 1 "const8_operand" "")
- (match_operand 2 "const8_operand" ""))
+  [(set (zero_extract (match_operand 0 "register_operand" "")
+ (match_operand 1 "const_int_operand" "")
+ (match_operand 2 "const_int_operand" ""))
 (match_operand 3 "register_operand" ""))]
   ""
 {
   rtx (*gen_mov_insv_1) (rtx, rtx);
 
+  if (ix86_expand_pinsr (operands))
+DONE;
+
   /* Handle insertions to %ah et al.  */
   if (INTVAL (operands[1]) != 8 || INTVAL (operands[2]) != 8)
 FAIL;
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md  (revision 172780)
+++ config/i386/sse.md  (working copy)
@@ -6051,7 +6051,7 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "TI")])
 
-(define_insn "*sse4_1_pinsrb"
+(define_insn "sse4_1_pinsrb"
   [(set (match_operand:V16QI 0 "register_operand" "=x,x,x,x")
(vec_merge:V16QI
  (vec_duplicate:V16QI
@@ -6083,7 +6083,7 @@
(set_attr "prefix" "orig,orig,vex,vex")
(set_attr "mode" "TI")])
 
-(define_insn "*sse2_pinsrw"
+(define_insn "sse2_pinsrw"
   [(set (match_operand:V8HI 0 "register_operand" "=x,x,x,x")
(vec_merge:V8HI
  (vec_duplicate:V8HI
@@ -6117,7 +6117,7 @@
(set_attr "mode" "TI")])
 
 ;; It must come before sse2_loadld since it is preferred.
-(define_insn "*sse4_1_pinsrd"
+(define_insn "sse4_1_pinsrd"
   [(set (match_operand:V4SI 0 "register_operand" "=x,x")
(vec_merge:V4SI
  (vec_duplicate:V4SI
@@ -6145,7 +6145,7 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "TI")])
 
-(define_insn "*sse4_1_pinsrq"
+(define_insn "sse4_1_pinsrq"
   [(set (match_operand:V2DI 0 "register_operand" "=x,x")
(vec_merge:V2DI
  (vec_duplicate:V2DI
Index: config/i386/i386-protos.h
===================================================================
--- config/i386/i386-protos.h   (revision 172780)
+++ config/i386/i386-protos.h   (working copy)
@@ -203,6 +203,7 @@ extern void ix86_expand_vector_extract (
 extern void ix86_expand_reduc_v4sf (rtx (*)(rtx, rtx, rtx), rtx, rtx);
 
 extern void ix86_expand_vec_extract_even_odd (rtx, rtx, rtx, unsigned);
+extern bool ix86_expand_pinsr (rtx *);
 
 /* In i386-c.c  */
 extern void ix86_target_macros (void);
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c  (revision 172780)
+++ config/i386/i386.c  (working copy)
@@ -34106,6 +34106,88 @@ ix86_expand_vec_extract_even_odd (rtx ta
   /* ... or we use the special-case patterns.  */
   expand_vec_perm_even_odd_1 (&d, odd);
 }
+
+/* Expand an insert into a vector register through pinsr insn.
+   Return true if successful.  */
+
+bool
+ix86_expand_pinsr (rtx *operands)
+{
+  rtx dst = operands[0];
+  rtx src = operands[3];
+
+  unsigned int size = INTVAL (operands[1]);
+  unsigned int pos = INTVAL (operands[2]);
+
+  if (GET_CODE (dst) == SUBREG)
+{
+  pos += SUBREG_BYTE (dst) * BITS_PER_UNIT;
+  dst = SUBREG_REG (dst);
+}
+
+  if (GET_CODE (src) == SUBREG)
+src = SUBREG_REG (src);
+
+  switch (GET_MODE (dst))
+{
+case V16QImode:
+case V8HImode:
+case V4SImode:
+case V2DImode:
+  {
+   enum machine_mode srcmode, dstmode;
+   rtx (*pinsr)(rtx, rtx, rtx, rtx);
+
+   srcmode = mode_for_size (size, MODE_INT, 0);
+
+   switch (srcmode)
+ {
+ case QImode:
+   if (!TARGET_SSE4_1)
+ return

[PATCH] use build_function_type_list in the picochip backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to picochip-elf.  OK to commit?

-Nathan

* config/picochip/picochip.c (picochip_init_builtins): Call
build_function_type_list instead of build_function_type.
Delete `endlink' variable.

diff --git a/gcc/config/picochip/picochip.c b/gcc/config/picochip/picochip.c
index 1ca95b4..4442d1e 100644
--- a/gcc/config/picochip/picochip.c
+++ b/gcc/config/picochip/picochip.c
@@ -4216,18 +4216,6 @@ void
 picochip_init_builtins (void)
 {
   tree noreturn;
-  tree endlink = void_list_node;
-  tree int_endlink = tree_cons (NULL_TREE, integer_type_node, endlink);
-  tree unsigned_endlink = tree_cons (NULL_TREE, unsigned_type_node, endlink);
-  tree long_endlink = tree_cons (NULL_TREE, long_integer_type_node, endlink);
-  tree int_int_endlink =
-tree_cons (NULL_TREE, integer_type_node, int_endlink);
-  tree int_int_int_endlink =
-tree_cons (NULL_TREE, integer_type_node, int_int_endlink);
-  tree int_long_endlink =
-tree_cons (NULL_TREE, integer_type_node, long_endlink);
-  tree long_int_int_int_endlink =
-tree_cons (NULL_TREE, long_integer_type_node, int_int_int_endlink);
 
   tree int_ftype_int, int_ftype_int_int;
   tree long_ftype_int, long_ftype_int_int_int;
@@ -4236,36 +4224,51 @@ picochip_init_builtins (void)
   tree void_ftype_void, unsigned_ftype_unsigned;
 
   /* void func (void) */
-  void_ftype_void = build_function_type (void_type_node, endlink);
+  void_ftype_void = build_function_type_list (void_type_node, NULL_TREE);
 
   /* int func (int) */
-  int_ftype_int = build_function_type (integer_type_node, int_endlink);
+  int_ftype_int = build_function_type_list (integer_type_node,
+   integer_type_node, NULL_TREE);
 
   /* unsigned int func (unsigned int) */
-  unsigned_ftype_unsigned = build_function_type (unsigned_type_node, unsigned_endlink);
+  unsigned_ftype_unsigned
+= build_function_type_list (unsigned_type_node,
+   unsigned_type_node, NULL_TREE);
 
   /* int func(int, int) */
   int_ftype_int_int
-= build_function_type (integer_type_node, int_int_endlink);
+= build_function_type_list (integer_type_node,
+   integer_type_node, integer_type_node,
+   NULL_TREE);
 
   /* long func(int) */
-  long_ftype_int = build_function_type (long_integer_type_node, int_endlink);
+  long_ftype_int = build_function_type_list (long_integer_type_node,
+integer_type_node, NULL_TREE);
 
   /* long func(int, int, int) */
   long_ftype_int_int_int
-= build_function_type (long_integer_type_node, int_int_int_endlink);
+= build_function_type_list (long_integer_type_node,
+   integer_type_node, integer_type_node,
+   integer_type_node, NULL_TREE);
 
   /* int func(int, int, int) */
   int_ftype_int_int_int
-= build_function_type (integer_type_node, int_int_int_endlink);
+= build_function_type_list (integer_type_node,
+   integer_type_node, integer_type_node,
+   integer_type_node, NULL_TREE);
 
   /* void func(int, long) */
   void_ftype_int_long
-= build_function_type (void_type_node, int_long_endlink);
+= build_function_type_list (void_type_node,
+   integer_type_node, long_integer_type_node,
+   NULL_TREE);
 
   /* void func(long, int, int, int) */
   void_ftype_long_int_int_int
-= build_function_type (void_type_node, long_int_int_int_endlink);
+= build_function_type_list (void_type_node,
+   long_integer_type_node, integer_type_node,
+   integer_type_node, integer_type_node,
+   NULL_TREE);
 
   /* Initialise the sign-bit-count function. */
   add_builtin_function ("__builtin_sbc", int_ftype_int,

[PATCH] use build_function_type_list in the rs6000 backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  The only tricky part is in builtin_function_type,
where we fill in unused args with NULL_TREE so that passing extra
arguments to build_function_type_list doesn't matter.

Tested with cross to powerpc-eabi.  OK to commit?

-Nathan

* config/rs6000/rs6000.c (spe_init_builtins): Call
build_function_type_list instead of build_function_type.
(paired_init_builtins, altivec_init_builtins): Likewise.
(builtin_function_type): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8182bf0..c08c16e 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -12824,107 +12824,97 @@ enable_mask_for_builtins (struct builtin_description 
*desc, int size,
 static void
 spe_init_builtins (void)
 {
-  tree endlink = void_list_node;
   tree puint_type_node = build_pointer_type (unsigned_type_node);
   tree pushort_type_node = build_pointer_type (short_unsigned_type_node);
   struct builtin_description *d;
   size_t i;
 
   tree v2si_ftype_4_v2si
-= build_function_type
-(opaque_V2SI_type_node,
- tree_cons (NULL_TREE, opaque_V2SI_type_node,
-   tree_cons (NULL_TREE, opaque_V2SI_type_node,
-  tree_cons (NULL_TREE, opaque_V2SI_type_node,
- tree_cons (NULL_TREE, opaque_V2SI_type_node,
-endlink)))));
+= build_function_type_list (opaque_V2SI_type_node,
+opaque_V2SI_type_node,
+opaque_V2SI_type_node,
+opaque_V2SI_type_node,
+opaque_V2SI_type_node,
+NULL_TREE);
 
   tree v2sf_ftype_4_v2sf
-= build_function_type
-(opaque_V2SF_type_node,
- tree_cons (NULL_TREE, opaque_V2SF_type_node,
-   tree_cons (NULL_TREE, opaque_V2SF_type_node,
-  tree_cons (NULL_TREE, opaque_V2SF_type_node,
- tree_cons (NULL_TREE, opaque_V2SF_type_node,
-endlink)))));
+= build_function_type_list (opaque_V2SF_type_node,
+opaque_V2SF_type_node,
+opaque_V2SF_type_node,
+opaque_V2SF_type_node,
+opaque_V2SF_type_node,
+NULL_TREE);
 
   tree int_ftype_int_v2si_v2si
-= build_function_type
-(integer_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-   tree_cons (NULL_TREE, opaque_V2SI_type_node,
-  tree_cons (NULL_TREE, opaque_V2SI_type_node,
- endlink))));
+= build_function_type_list (integer_type_node,
+integer_type_node,
+opaque_V2SI_type_node,
+opaque_V2SI_type_node,
+NULL_TREE);
 
   tree int_ftype_int_v2sf_v2sf
-= build_function_type
-(integer_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-   tree_cons (NULL_TREE, opaque_V2SF_type_node,
-  tree_cons (NULL_TREE, opaque_V2SF_type_node,
- endlink))));
+= build_function_type_list (integer_type_node,
+integer_type_node,
+opaque_V2SF_type_node,
+opaque_V2SF_type_node,
+NULL_TREE);
 
   tree void_ftype_v2si_puint_int
-= build_function_type (void_type_node,
-  tree_cons (NULL_TREE, opaque_V2SI_type_node,
- tree_cons (NULL_TREE, puint_type_node,
-tree_cons (NULL_TREE,
-   integer_type_node,
-   endlink))));
+= build_function_type_list (void_type_node,
+opaque_V2SI_type_node,
+puint_type_node,
+integer_type_node,
+NULL_TREE);
 
   tree void_ftype_v2si_puint_char
-= build_function_type (void_type_node,
-  tree_cons (NULL_TREE, opaque_V2SI_type_node,
- tree_cons (NULL_TREE, puint_type_node,
-tree_cons (NULL_TREE,
-   char_type_node,
-   endlink))));
+= build_function_type_list (void_type_node,
+opaque_V2SI_type_node,
+puint_type_node,
+char_t

Re: [PATCH] use build_function_type_list in the mips backend

2011-04-20 Thread Richard Sandiford

Nathan Froyd  writes:
>   * config/mips/mips.c (mips16_build_function_stub): Call
>   build_function_type_list instead of build_function_type.
>   (mips16_build_call_stub): Likewise.

OK, thanks, but:

> -  build_function_type (void_type_node, NULL_TREE));
> +  build_function_type_list (void_type_node, 
> NULL_TREE));

please split the long line.

Richard

[PATCH] use build_function_type_list in the xtensa backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to xtensa-elf.  OK to commit?

-Nathan

* config/xtensa/xtensa.c (xtensa_init_builtins): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c
index fe70270..574e08e 100644
--- a/gcc/config/xtensa/xtensa.c
+++ b/gcc/config/xtensa/xtensa.c
@@ -3083,7 +3083,7 @@ xtensa_init_builtins (void)
 
   if (TARGET_THREADPTR)
 {
-  ftype = build_function_type (ptr_type_node, void_list_node);
+  ftype = build_function_type_list (ptr_type_node, NULL_TREE);
   decl = add_builtin_function ("__builtin_thread_pointer", ftype,
   XTENSA_BUILTIN_THREAD_POINTER, BUILT_IN_MD,
   NULL, NULL_TREE);

[PATCH] use build_function_type_list in the sparc backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to sparc-elf.  OK to commit?

-Nathan

* config/sparc/sparc.c (sparc_file_end): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 03b5e66..e7dd75b 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -9501,8 +9501,8 @@ sparc_file_end (void)
{
  tree decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
  get_identifier (name),
- build_function_type (void_type_node,
-  void_list_node));
+ build_function_type_list (void_type_node,
+NULL_TREE));
  DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
   NULL_TREE, void_type_node);
  TREE_STATIC (decl) = 1;

[PATCH] use build_function_type_list in the s390 backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to s390-linux-gnu.  OK to
commit?

-Nathan

* config/s390/s390.c (s390_init_builtins): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index caee077..adacfa3 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -9172,7 +9172,7 @@ s390_init_builtins (void)
 {
   tree ftype;
 
-  ftype = build_function_type (ptr_type_node, void_list_node);
+  ftype = build_function_type_list (ptr_type_node, NULL_TREE);
   add_builtin_function ("__builtin_thread_pointer", ftype,
S390_BUILTIN_THREAD_POINTER, BUILT_IN_MD,
NULL, NULL_TREE);

C++ PATCH for c++/48657 (rejects-valid with local variable used as non-type template argument)

2011-04-20 Thread Jason Merrill

The problem in this testcase was that we were recognizing a local const 
variable with a constant initializer as a constant expression, but we 
weren't doing the necessary adjustments to convert the initializer to 
the type of the variable.


But some of the other bits of cp_finish_decl caused problems for 
variables with function scope.  After some investigation, it seemed to 
me that the only part of cp_finish_decl that we really want for 
constants in templates is the initializer processing, so rather than 
mess with clearing processing_template_decl and going through all the 
other pieces, we can just call check_initializer directly and then be done.


For 4.6 I made a smaller change that only affects local variables.

Tested x86_64-pc-linux-gnu, applying to trunk and 4.6.

commit 56ee3cf091b9b349ddbcdc8afb62e4ec4cf0eae0
Author: Jason Merrill 
Date:   Tue Apr 19 23:25:21 2011 -0700

PR c++/48657
* decl.c (cp_finish_decl): Simplify template handling.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 6309648..cf4a40e 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -5750,7 +5750,6 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
   const char *asmspec = NULL;
   int was_readonly = 0;
   bool var_definition_p = false;
-  int saved_processing_template_decl;
   tree auto_node;
 
   if (decl == error_mark_node)
@@ -5772,7 +5771,6 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 
   /* Assume no cleanup is required.  */
   cleanup = NULL_TREE;
-  saved_processing_template_decl = processing_template_decl;
 
   /* If a name was specified, get the string.  */
   if (global_scope_p (current_binding_level))
@@ -5878,39 +5876,24 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 template is instantiated.  But, if DECL is a variable constant
 then it can be used in future constant expressions, so its value
 must be available. */
-  if (!(init
-   && DECL_CLASS_SCOPE_P (decl)
-   /* We just set TREE_CONSTANT appropriately; see above.  */
-   && TREE_CONSTANT (decl)
-   && !type_dependent_p
-   /* FIXME non-value-dependent constant expression  */
-   && !value_dependent_init_p (init)))
+  if (init
+ && init_const_expr_p
+ && !type_dependent_p
+ && decl_maybe_constant_var_p (decl)
+ && !value_dependent_init_p (init))
{
- if (init)
-   DECL_INITIAL (decl) = init;
- if (TREE_CODE (decl) == VAR_DECL
- && !DECL_PRETTY_FUNCTION_P (decl)
- && !type_dependent_p)
-   maybe_deduce_size_from_array_init (decl, init);
- goto finish_end;
+ tree init_code = check_initializer (decl, init, flags, &cleanup);
+ if (init_code == NULL_TREE)
+   init = NULL_TREE;
}
+  else if (TREE_CODE (decl) == VAR_DECL
+  && !DECL_PRETTY_FUNCTION_P (decl)
+  && !type_dependent_p)
+   maybe_deduce_size_from_array_init (decl, init);
 
-  if (TREE_CODE (init) == TREE_LIST)
-   {
- /* If the parenthesized-initializer form was used (e.g.,
-"int A::i(X)"), then INIT will be a TREE_LIST of initializer
-arguments.  (There is generally only one.)  We convert them
-individually.  */
- tree list = init;
- for (; list; list = TREE_CHAIN (list))
-   {
- tree elt = TREE_VALUE (list);
- TREE_VALUE (list) = fold_non_dependent_expr (elt);
-   }
-   }
-  else
-   init = fold_non_dependent_expr (init);
-  processing_template_decl = 0;
+  if (init)
+   DECL_INITIAL (decl) = init;
+  return;
 }
 
   /* Take care of TYPE_DECLs up front.  */
@@ -5933,7 +5916,7 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 
   rest_of_decl_compilation (decl, DECL_FILE_SCOPE_P (decl),
at_eof);
-  goto finish_end;
+  return;
 }
 
   /* A reference will be modified here, as it is initialized.  */
@@ -6057,8 +6040,7 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
   else if (TREE_CODE (type) == ARRAY_TYPE)
layout_type (type);
 
-  if (!processing_template_decl
- && TREE_STATIC (decl)
+  if (TREE_STATIC (decl)
  && !at_function_scope_p ()
  && current_function_decl == NULL)
/* So decl is a global variable or a static member of a
@@ -6078,9 +6060,8 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 
   /* Let the middle end know about variables and functions -- but not
  static data members in uninstantiated class templates.  */
-  if (!saved_processing_template_decl
-  && (TREE_CODE (decl) == VAR_DECL 
- || TREE_CODE (decl) == FUNCTION_DECL))
+  if (TREE_CODE (decl) == VAR_DECL
+  || TREE_CODE (decl) == FUNCTION_DECL)
 {
   if (TREE_CODE (decl) == VAR_DECL)
{
@@ -6

Re: [PATCH] use build_function_type_list in the mep backend

2011-04-20 Thread DJ Delorie


>   * config/mep/mep.c (mep_init_builtins): Call build_function_type_list
>   instead of build_function_type.

Ok.

[PATCH] use build_function_type_list in the mips backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to mips-elf.  OK to commit?

-Nathan

* config/mips/mips.c (mips16_build_function_stub): Call
build_function_type_list instead of build_function_type.
(mips16_build_call_stub): Likewise.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index e075c4f..4d4d639 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -6075,7 +6075,7 @@ mips16_build_function_stub (void)
   /* Build a decl for the stub.  */
   stubdecl = build_decl (BUILTINS_LOCATION,
 FUNCTION_DECL, get_identifier (stubname),
-build_function_type (void_type_node, NULL_TREE));
+build_function_type_list (void_type_node, NULL_TREE));
   DECL_SECTION_NAME (stubdecl) = build_string (strlen (secname), secname);
   DECL_RESULT (stubdecl) = build_decl (BUILTINS_LOCATION,
   RESULT_DECL, NULL_TREE, void_type_node);
@@ -6321,7 +6321,7 @@ mips16_build_call_stub (rtx retval, rtx *fn_ptr, rtx args_size, int fp_code)
   stubid = get_identifier (stubname);
   stubdecl = build_decl (BUILTINS_LOCATION,
 FUNCTION_DECL, stubid,
-build_function_type (void_type_node, NULL_TREE));
+build_function_type_list (void_type_node, NULL_TREE));
   DECL_SECTION_NAME (stubdecl) = build_string (strlen (secname), secname);
   DECL_RESULT (stubdecl) = build_decl (BUILTINS_LOCATION,
   RESULT_DECL, NULL_TREE,

[PATCH] use build_function_type_list in the mep backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to mep-elf.  OK to commit?

-Nathan

* config/mep/mep.c (mep_init_builtins): Call build_function_type_list
instead of build_function_type.

diff --git a/gcc/config/mep/mep.c b/gcc/config/mep/mep.c
index 02c825a..b8ef440 100644
--- a/gcc/config/mep/mep.c
+++ b/gcc/config/mep/mep.c
@@ -6133,7 +6133,7 @@ mep_init_builtins (void)
if (cgen_insns[i].cret_p)
  ret_type = mep_cgen_regnum_to_type (cgen_insns[i].regnums[0].type);
 
-   bi_type = build_function_type (ret_type, 0);
+   bi_type = build_function_type_list (ret_type, NULL_TREE);
add_builtin_function (cgen_intrinsics[cgen_insns[i].intrinsic],
  bi_type,
  cgen_insns[i].intrinsic, BUILT_IN_MD, NULL, NULL);

[PATCH] use build_function_type_list in the iq2000 backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to iq2000-elf.  OK to commit?

-Nathan

* config/iq2000/iq2000.c (iq2000_init_builtins): Call
build_function_type_list instead of build_function_type.
Delete `endlink' variable.

diff --git a/gcc/config/iq2000/iq2000.c b/gcc/config/iq2000/iq2000.c
index 2d69085..aa63674 100644
--- a/gcc/config/iq2000/iq2000.c
+++ b/gcc/config/iq2000/iq2000.c
@@ -2466,7 +2466,6 @@ iq2000_output_conditional_branch (rtx insn, rtx * operands, int two_operands_p,
 static void
 iq2000_init_builtins (void)
 {
-  tree endlink = void_list_node;
   tree void_ftype, void_ftype_int, void_ftype_int_int;
   tree void_ftype_int_int_int;
   tree int_ftype_int, int_ftype_int_int, int_ftype_int_int_int;
@@ -2474,76 +2473,55 @@ iq2000_init_builtins (void)
 
   /* func () */
   void_ftype
-= build_function_type (void_type_node,
-  tree_cons (NULL_TREE, void_type_node, endlink));
+= build_function_type_list (void_type_node, NULL_TREE);
 
   /* func (int) */
   void_ftype_int
-= build_function_type (void_type_node,
-  tree_cons (NULL_TREE, integer_type_node, endlink));
+= build_function_type_list (void_type_node, integer_type_node, NULL_TREE);
 
   /* void func (int, int) */
   void_ftype_int_int
-= build_function_type (void_type_node,
-   tree_cons (NULL_TREE, integer_type_node,
-  tree_cons (NULL_TREE, integer_type_node,
- endlink)));
+= build_function_type_list (void_type_node,
+integer_type_node,
+integer_type_node,
+NULL_TREE);
 
   /* int func (int) */
   int_ftype_int
-= build_function_type (integer_type_node,
-   tree_cons (NULL_TREE, integer_type_node, endlink));
+= build_function_type_list (integer_type_node,
+integer_type_node, NULL_TREE);
 
   /* int func (int, int) */
   int_ftype_int_int
-= build_function_type (integer_type_node,
-   tree_cons (NULL_TREE, integer_type_node,
-  tree_cons (NULL_TREE, integer_type_node,
- endlink)));
+= build_function_type_list (integer_type_node,
+integer_type_node,
+integer_type_node,
+NULL_TREE);
 
   /* void func (int, int, int) */
-void_ftype_int_int_int
-= build_function_type
-(void_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-   tree_cons (NULL_TREE, integer_type_node,
-  tree_cons (NULL_TREE,
- integer_type_node,
- endlink))));
-
-  /* int func (int, int, int, int) */
-  int_ftype_int_int_int_int
-= build_function_type
-(integer_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-   tree_cons (NULL_TREE, integer_type_node,
-  tree_cons (NULL_TREE,
- integer_type_node,
- tree_cons (NULL_TREE,
-integer_type_node,
-endlink)))));
+  void_ftype_int_int_int
+= build_function_type_list (void_type_node,
+integer_type_node,
+integer_type_node,
+integer_type_node,
+NULL_TREE);
 
   /* int func (int, int, int) */
   int_ftype_int_int_int
-= build_function_type
-(integer_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-   tree_cons (NULL_TREE, integer_type_node,
-  tree_cons (NULL_TREE,
- integer_type_node,
- endlink))));
+= build_function_type_list (integer_type_node,
+integer_type_node,
+integer_type_node,
+integer_type_node,
+NULL_TREE);
 
   /* int func (int, int, int, int) */
   int_ftype_int_int_int_int
-= build_function_type
-(integer_type_node,
- tree_cons (NULL_TREE, integer_type_node,
-   tree_cons (NULL_TREE, integer_type_node,
-  tree_cons (NULL_TREE,
- integer_type_node,
- tree_cons (NULL_TREE,
-integer_type_node,
-endlink)))));
+= build_function_type_list (integer_type_node,
+integer_type_node,
+

[PATCH] use build_function_type_list in the ia64 backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to ia64-linux-gnu.  OK to
commit?

-Nathan

* config/ia64/ia64.c (ia64_init_builtins): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 5f22b17..166ec43 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -10165,11 +10165,10 @@ ia64_init_builtins (void)
   (*lang_hooks.types.register_builtin_type) (float128_type, "__float128");
 
   /* TFmode support builtins.  */
-  ftype = build_function_type (float128_type, void_list_node);
-  decl = add_builtin_function ("__builtin_infq", ftype,
-  IA64_BUILTIN_INFQ, BUILT_IN_MD,
-  NULL, NULL_TREE);
-  ia64_builtins[IA64_BUILTIN_INFQ] = decl;
+  ftype = build_function_type_list (float128_type, NULL_TREE);
+  add_builtin_function ("__builtin_infq", ftype,
+   IA64_BUILTIN_INFQ, BUILT_IN_MD,
+   NULL, NULL_TREE);
 
   decl = add_builtin_function ("__builtin_huge_valq", ftype,
   IA64_BUILTIN_HUGE_VALQ, BUILT_IN_MD,
@@ -10211,15 +10210,13 @@ ia64_init_builtins (void)
   add_builtin_function ((name), (type), (code), BUILT_IN_MD,   \
   NULL, NULL_TREE)
 
-  decl = def_builtin ("__builtin_ia64_bsp",
-  build_function_type (ptr_type_node, void_list_node),
+  def_builtin ("__builtin_ia64_bsp",
+  build_function_type_list (ptr_type_node, NULL_TREE),
   IA64_BUILTIN_BSP);
-  ia64_builtins[IA64_BUILTIN_BSP] = decl;
 
-  decl = def_builtin ("__builtin_ia64_flushrs",
-  build_function_type (void_type_node, void_list_node),
+  def_builtin ("__builtin_ia64_flushrs",
+  build_function_type_list (void_type_node, NULL_TREE),
   IA64_BUILTIN_FLUSHRS);
-  ia64_builtins[IA64_BUILTIN_FLUSHRS] = decl;
 
 #undef def_builtin

[PATCH] use build_function_type_list in the i386 backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  There's still one use of build_function_type;
replacing that type will have to wait for an improved
FUNCTION_TYPE-building interface.

Tested on x86_64-unknown-linux-gnu.  OK to commit?

-Nathan

* config/i386/i386.c (ix86_code_end): Call build_function_type_list
instead of build_function_type.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b6d41f0..40151f4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -8833,7 +8833,7 @@ ix86_code_end (void)
 
   decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
 get_identifier (name),
-build_function_type (void_type_node, void_list_node));
+build_function_type_list (void_type_node, NULL_TREE));
   DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
   NULL_TREE, void_type_node);
   TREE_PUBLIC (decl) = 1;

[PATCH] use build_function_type_list in the frv backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to frv-elf.  OK to commit?

-Nathan

* config/frv/frv.c (frv_init_builtins): Delete `endlink' variable.
Call build_function_type_list instead of build_function_type.
(UNARY, BINARY, TRINARY, QUAD): Likewise.

diff --git a/gcc/config/frv/frv.c b/gcc/config/frv/frv.c
index 0913765..564baa0 100644
--- a/gcc/config/frv/frv.c
+++ b/gcc/config/frv/frv.c
@@ -8390,7 +8390,6 @@ static struct builtin_description bdesc_stores[] =
 static void
 frv_init_builtins (void)
 {
-  tree endlink = void_list_node;
   tree accumulator = integer_type_node;
   tree integer = integer_type_node;
   tree voidt = void_type_node;
@@ -8405,24 +8404,18 @@ frv_init_builtins (void)
   tree iacc   = integer_type_node;
 
 #define UNARY(RET, T1) \
-  build_function_type (RET, tree_cons (NULL_TREE, T1, endlink))
+  build_function_type_list (RET, T1, NULL_TREE)
 
 #define BINARY(RET, T1, T2) \
-  build_function_type (RET, tree_cons (NULL_TREE, T1, \
-   tree_cons (NULL_TREE, T2, endlink)))
+  build_function_type_list (RET, T1, T2, NULL_TREE)
 
 #define TRINARY(RET, T1, T2, T3) \
-  build_function_type (RET, tree_cons (NULL_TREE, T1, \
-   tree_cons (NULL_TREE, T2, \
-   tree_cons (NULL_TREE, T3, endlink))))
+  build_function_type_list (RET, T1, T2, T3, NULL_TREE)
 
 #define QUAD(RET, T1, T2, T3, T4) \
-  build_function_type (RET, tree_cons (NULL_TREE, T1, \
-   tree_cons (NULL_TREE, T2, \
-   tree_cons (NULL_TREE, T3, \
-   tree_cons (NULL_TREE, T4, endlink)))))
+  build_function_type_list (RET, T1, T2, T3, T4, NULL_TREE)
 
-  tree void_ftype_void = build_function_type (voidt, endlink);
+  tree void_ftype_void = build_function_type_list (voidt, NULL_TREE);
 
   tree void_ftype_acc = UNARY (voidt, accumulator);
   tree void_ftype_uw4_uw1 = BINARY (voidt, uword4, uword1);
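
For readers following this series, the shape of the conversion can be illustrated outside GCC. The sketch below is a toy model, not GCC code: `toy_cons`, `toy_function_type_list`, `TOY_END` and the three structs are hypothetical stand-ins for `tree_cons`, `build_function_type_list`, `NULL_TREE` and tree nodes. It only tries to show why a flat, sentinel-terminated call is easier to read than nested cons cells, and how leaving out one argument quietly yields a shorter list.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for tree nodes; these are NOT GCC's real types.  */
struct toy_type { const char *name; };
struct toy_args { const struct toy_type *value; struct toy_args *next; };
struct toy_fn_type { const struct toy_type *ret; struct toy_args *args; };

/* Sentinel with the right pointer type, so it is safe to pass through "...".  */
#define TOY_END ((const struct toy_type *) 0)

/* Hypothetical analogue of tree_cons: prepend VALUE to REST.  */
static struct toy_args *
toy_cons (const struct toy_type *value, struct toy_args *rest)
{
  struct toy_args *cell = malloc (sizeof *cell);
  if (cell == NULL)
    abort ();
  cell->value = value;
  cell->next = rest;
  return cell;
}

/* Hypothetical analogue of build_function_type_list: RET is followed by
   the argument types in order, terminated by TOY_END.  */
static struct toy_fn_type
toy_function_type_list (const struct toy_type *ret, ...)
{
  struct toy_fn_type fn = { ret, NULL };
  struct toy_args **tail = &fn.args;
  const struct toy_type *t;
  va_list ap;

  va_start (ap, ret);
  while ((t = va_arg (ap, const struct toy_type *)) != TOY_END)
    {
      *tail = toy_cons (t, NULL);   /* append, keeping source order */
      tail = &(*tail)->next;
    }
  va_end (ap);
  return fn;
}

/* Number of argument types recorded in FN.  */
static size_t
toy_arg_count (const struct toy_fn_type *fn)
{
  size_t n = 0;
  for (const struct toy_args *a = fn->args; a != NULL; a = a->next)
    n++;
  return n;
}

/* Release the argument cells owned by FN.  */
static void
toy_free_args (struct toy_fn_type *fn)
{
  while (fn->args != NULL)
    {
      struct toy_args *next = fn->args->next;
      free (fn->args);
      fn->args = next;
    }
}
```

In this model, `toy_function_type_list (&void_t, &int_t, &int_t, TOY_END)` builds the same two-element list as `toy_cons (&int_t, toy_cons (&int_t, NULL))`. A four-argument helper that forwards only three of its arguments would build a three-element list with no diagnostic, which is why macros such as QUAD deserve a second look when converting.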

[PATCH] use build_function_type_list in the bfin backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to bfin-elf.  OK to commit?

-Nathan

* config/bfin/bfin.c (bfin_init_builtins): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c
index 5d08437..03a833d 100644
--- a/gcc/config/bfin/bfin.c
+++ b/gcc/config/bfin/bfin.c
@@ -5967,7 +5967,7 @@ bfin_init_builtins (void)
 {
   tree V2HI_type_node = build_vector_type_for_mode (intHI_type_node, V2HImode);
   tree void_ftype_void
-= build_function_type (void_type_node, void_list_node);
+= build_function_type_list (void_type_node, NULL_TREE);
   tree short_ftype_short
 = build_function_type_list (short_integer_type_node, short_integer_type_node,
NULL_TREE);

[PATCH] use build_function_type_list in the alpha backend

2011-04-20 Thread Nathan Froyd

As $SUBJECT suggests.  Tested with cross to alpha-elf.  OK to commit?

-Nathan

* config/alpha/alpha.c (alpha_init_builtins): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 5e85e2b..237e9b3 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -6409,7 +6409,7 @@ alpha_init_builtins (void)
   implicit_built_in_decls[(int) BUILT_IN_FWRITE_UNLOCKED] = NULL_TREE;
 #endif
 
-  ftype = build_function_type (dimode_integer_type_node, void_list_node);
+  ftype = build_function_type_list (dimode_integer_type_node, NULL_TREE);
   alpha_add_builtins (zero_arg_builtins, ARRAY_SIZE (zero_arg_builtins),
  ftype);
 
@@ -6424,7 +6424,7 @@ alpha_init_builtins (void)
   alpha_add_builtins (two_arg_builtins, ARRAY_SIZE (two_arg_builtins),
  ftype);
 
-  ftype = build_function_type (ptr_type_node, void_list_node);
+  ftype = build_function_type_list (ptr_type_node, NULL_TREE);
   alpha_builtin_function ("__builtin_thread_pointer", ftype,
  ALPHA_BUILTIN_THREAD_POINTER, ECF_NOTHROW);

Re: FDO usability: pid handling

2011-04-20 Thread Xinliang David Li

Please review the latest patch. SPEC2k FDO testing pass.

Thanks,

David

On Wed, Apr 20, 2011 at 11:22 AM, Xinliang David Li  wrote:
> Here is the revised patch. Basic FDO testing went fine. I still saw
> the ipa-inline assertion in SPEC FDO. Will run it when the regression
> is fixed.
>
> Thanks,
>
> David
>
> On Tue, Apr 19, 2011 at 5:33 PM, Jan Hubicka  wrote:
>>> On Tue, Apr 19, 2011 at 4:49 PM, Jan Hubicka  wrote:
>>> >> Actually, among all the choices, funcdef_no is probably the most dense
>>> >> one -- it is for function decl with definition only.  In LIPO, the
>>> >
>>> > Yes, funcdef_no is densiest, but we don't really need great density here
>>> > (in many other places we index arrays by cgraph_uid - it is intended for
>>> > that purpose and we pay some attention to recycle unused nodes).
>>> >
>>>
>>> That does not mean it is right to use sparse ids:)  DECL_UID will be
>>> the worst amongst them.
>>
>> Sure, that is why I suggested cgraph->uid.  That one is kept dense and it also
>> tracks cgraph node creation order. Unlike pid it counts also functions w/o
>> bodies.
>>> > We only care to avoid divergence in the indexes in between instrumentation
>>> > and feedback compilation.  With the IPA pass organization the compiler
>>> > doesn't really diverge until the profile pass, so it seems to me that all
>>> > should be safe.
>>>
>>> When I said 'fragile' -- I meant it depends on the optimization pass
>>> phase ordering which can change in the future.
>>
>> Well, that is the case of all of them (passes can create function bodies that
>> can make funcdef_no also diverge).  This is the case of couple passes
>> already, especially OMP lowering and friends.
>>
>>> Ok, I will throw away pid and use funcdef_no for now.  For future
>>> replacement for the function ids, please consider the following
>>> desired properties:
>>>
>>> 1) The id sequence does not depend on optimization passes -- only
>>> depend on source/parsing order;
>>
>> It depends on optimization, too.  This is why we actually have cgraph->order
>> that is used for -fno-toplevel-reorder and is similar to funcdef_no, but
>> assigned at finalization time.
>>
>>> 2) It is dense and sequential for defined functions
>>>    a) it has proven to be very useful to use nice looking, sequential
>>> ids in triaging coverage mismatch related problems (the tree dump
>>> should also show the function id);
>>
>> You get the cgraph uids in the dumps already.
>>>    b) it can be very useful in bug triaging via binary search by
>>> specifying ranges of function ids (to enable optimizations etc).
>>
>> But as you wish, we can proceed with funcdef_no first and then discuss
>> removal of that field later.
>>
>> Honza
>>>
>>> Thanks,
>>>
>>> David
>>>
>>>
>>> >
>>> > Honza
>>> >>
>>> >> David
>>> >>
>>> >>
>>> >>
>>> >> >
>>> >> > Honza
>>> >> >
>>> >
>>
>
Index: cgraph.c
===
--- cgraph.c	(revision 172781)
+++ cgraph.c	(working copy)
@@ -142,9 +142,6 @@ int cgraph_max_uid;
 /* Maximal uid used in cgraph edges.  */
 int cgraph_edge_max_uid;
 
-/* Maximal pid used for profiling */
-int cgraph_max_pid;
-
 /* Set when whole unit has been analyzed so we can access global info.  */
 bool cgraph_global_info_ready = false;
 
@@ -472,7 +469,6 @@ cgraph_create_node_1 (void)
   struct cgraph_node *node = cgraph_allocate_node ();
 
   node->next = cgraph_nodes;
-  node->pid = -1;
   node->order = cgraph_order++;
   if (cgraph_nodes)
 cgraph_nodes->previous = node;
@@ -1827,8 +1823,7 @@ dump_cgraph_node (FILE *f, struct cgraph
   struct cgraph_edge *edge;
   int indirect_calls_count = 0;
 
-  fprintf (f, "%s/%i(%i)", cgraph_node_name (node), node->uid,
-	   node->pid);
+  fprintf (f, "%s/%i", cgraph_node_name (node), node->uid);
   dump_addr (f, " @", (void *)node);
   if (DECL_ASSEMBLER_NAME_SET_P (node->decl))
 fprintf (f, " (asm: %s)", IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (node->decl)));
Index: cgraph.h
===
--- cgraph.h	(revision 172781)
+++ cgraph.h	(working copy)
@@ -200,9 +200,6 @@ struct GTY((chain_next ("%h.next"), chai
   /* Ordering of all cgraph nodes.  */
   int order;
 
-  /* unique id for profiling. pid is not suitable because of different
- number of cfg nodes with -fprofile-generate and -fprofile-use */
-  int pid;
   enum ld_plugin_symbol_resolution resolution;
 
   /* Set when function must be output for some reason.  The primary
@@ -472,7 +469,6 @@ extern GTY(()) struct cgraph_node *cgrap
 extern GTY(()) int cgraph_n_nodes;
 extern GTY(()) int cgraph_max_uid;
 extern GTY(()) int cgraph_edge_max_uid;
-extern GTY(()) int cgraph_max_pid;
 extern bool cgraph_global_info_ready;
 enum cgraph_state
 {
@@ -730,6 +726,8 @@ void cgraph_clone_inlined_nodes (struct 
 void compute_inline_parameters (struct cgraph_node *);
 cgraph_inline_failed_t cgraph_edge_inlinable_p (struct cgraph_edge *);
 
+void cgraph_init_n

Re: [patch] Do not generate discriminator directive in strict mode

2011-04-20 Thread Eric Botcazou

> How is this not redundant with the existing
>
>   /* The discriminator column was added in dwarf4.  Simplify the below
>  by simply removing it if we're not supposed to output it.  */
>   if (dwarf_version < 4 && dwarf_strict)
> discriminator = 0;
>
> check near the top of the function?

Obviously I missed this recent change, sorry.  So the question is: would the
change be appropriate for the release branches where we emit the directive
unconditionally, i.e. the 4.5 and 4.6 branches, or would mine be safer for
them?  This directive apparently confuses (some versions of) the Wind River
debugger.

-- 
Eric Botcazou

Re: Allow union variables to share stack slots with -fno-strict-aliasing (issue4444051)

2011-04-20 Thread Easwaran Raman

On Wed, Apr 20, 2011 at 2:12 AM, Eric Botcazou  wrote:
>> 2011-04-19  Easwaran Raman  
>>
>>       * gcc/testsuite/gcc.dg/stack-layout-1.c: New
>>       * gcc/cfgexpand.c (add_alias_set_conflicts): Add conflicts
>>       with a variable containing union type only with
>>       -fstrict-aliasing.
>
> You need an entry for each relevant ChangeLog, without prefixes:
>
>
> 2011-04-20  Easwaran Raman  
>
>        * cfgexpand.c (add_alias_set_conflicts): Add conflicts with a variable
>        containing union type only with -fstrict-aliasing.
>
>
> 2011-04-20  Easwaran Raman  
>
>        * gcc.dg/stack-layout-1.c: New test.
>
>
> --
> Eric Botcazou
>

Thanks. I've added them to the respective ChangeLog files and
committed the patch.

-Easwaran
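
As a rough illustration of the kind of code this change concerns, here is a minimal sketch with two block-scoped locals whose lifetimes do not overlap, one of which has a type containing a union. The type and function names are hypothetical and are not taken from gcc.dg/stack-layout-1.c. Going by the ChangeLog entry, the conflict between such variables is now added only with -fstrict-aliasing, so with -fno-strict-aliasing they become candidates for sharing a stack slot. Whether a given compiler actually shares the slot depends on its version and options, so the code only demonstrates the source pattern.

```c
#include <string.h>

/* Hypothetical types: a struct whose layout includes a union.  */
union word_or_bytes
{
  unsigned int word;
  unsigned char bytes[sizeof (unsigned int)];
};

struct tagged
{
  int tag;
  union word_or_bytes u;
};

/* Two locals in disjoint scopes.  Neither is live while the other is,
   so a compiler is free to place them in the same stack slot if its
   conflict analysis allows it.  */
static unsigned int
sum_two_scopes (unsigned int seed)
{
  unsigned int total = 0;

  {
    struct tagged a;            /* type contains a union */
    a.tag = 1;
    a.u.word = seed;
    total += a.u.word + (unsigned int) a.tag;
  }

  {
    unsigned char scratch[64];  /* plain array, no union involved */
    memset (scratch, 1, sizeof scratch);
    total += scratch[0] + scratch[63];
  }

  return total;
}
```

The function's result does not depend on where the two locals live, which is the point: slot sharing is purely a stack-size optimization here.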

Re: FDO usability: pid handling

2011-04-20 Thread Xinliang David Li

Discard this version of the patch. I have not updated source properly
and the build/test was invalid.

David

On Wed, Apr 20, 2011 at 11:22 AM, Xinliang David Li  wrote:
> Here is the revised patch. Basic FDO testing went fine. I still saw
> the ipa-inline assertion in SPEC FDO. Will run it when the regression
> is fixed.
>
> Thanks,
>
> David
>
> On Tue, Apr 19, 2011 at 5:33 PM, Jan Hubicka  wrote:
>>> On Tue, Apr 19, 2011 at 4:49 PM, Jan Hubicka  wrote:
>>> >> Actually, among all the choices, funcdef_no is probably the most dense
>>> >> one -- it is for function decl with definition only.  In LIPO, the
>>> >
>>> > Yes, funcdef_no is densiest, but we don't really need great density here
>>> > (in many other places we index arrays by cgraph_uid - it is intended for
>>> > that purpose and we pay some attention to recycle unused nodes).
>>> >
>>>
>>> That does not mean it is right to use sparse ids:)  DECL_UID will be
>>> the worst amongst them.
>>
>> Sure, that is why I suggested cgraph->uid.  That one is kept dense and it also
>> tracks cgraph node creation order. Unlike pid it counts also functions w/o
>> bodies.
>>> > We only care to avoid divergence in the indexes in between instrumentation
>>> > and feedback compilation.  With the IPA pass organization the compiler
>>> > doesn't really diverge until the profile pass, so it seems to me that all
>>> > should be safe.
>>>
>>> When I said 'fragile' -- I meant it depends on the optimization pass
>>> phase ordering which can change in the future.
>>
>> Well, that is the case of all of them (passes can create function bodies that
>> can make funcdef_no also diverge).  This is the case of couple passes
>> already, especially OMP lowering and friends.
>>
>>> Ok, I will throw away pid and use funcdef_no for now.  For future
>>> replacement for the function ids, please consider the following
>>> desired properties:
>>>
>>> 1) The id sequence does not depend on optimization passes -- only
>>> depend on source/parsing order;
>>
>> It depends on optimization, too.  This is why we actually have cgraph->order
>> that is used for -fno-toplevel-reorder and is similar to funcdef_no, but
>> assigned at finalization time.
>>
>>> 2) It is dense and sequential for defined functions
>>>    a) it has proven to be very useful to use nice looking, sequential
>>> ids in triaging coverage mismatch related problems (the tree dump
>>> should also show the function id);
>>
>> You get the cgraph uids in the dumps already.
>>>    b) it can be very useful in bug triaging via binary search by
>>> specifying ranges of function ids (to enable optimizations etc).
>>
>> But as you wish, we can proceed with funcdef_no first and then discuss
>> removal of that field later.
>>
>> Honza
>>>
>>> Thanks,
>>>
>>> David
>>>
>>>
>>> >
>>> > Honza
>>> >>
>>> >> David
>>> >>
>>> >>
>>> >>
>>> >> >
>>> >> > Honza
>>> >> >
>>> >
>>
>
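
Point 2b in the quoted discussion (bug triaging by binary search over ranges of function ids) can be sketched as follows. This is an illustration under stated assumptions, not GCC code: it assumes ids are dense in [0, n_ids), that exactly one function is the culprit, and that some external oracle can report whether the problem reproduces when an optimization is enabled only for ids in a half-open range. The names `bisect_culprit`, `range_fails_fn` and `culprit_in_range` are hypothetical.

```c
#include <stdbool.h>

/* Oracle type: returns true if the failure reproduces when the
   optimization is enabled only for function ids in [lo, hi).  */
typedef bool (*range_fails_fn) (int lo, int hi, void *ctx);

/* Return the single id in [0, n_ids) responsible for the failure,
   or -1 if the failure does not reproduce on the full range.
   Assumes dense ids and exactly one culprit.  */
static int
bisect_culprit (int n_ids, range_fails_fn fails, void *ctx)
{
  int lo = 0, hi = n_ids;

  if (n_ids <= 0 || !fails (lo, hi, ctx))
    return -1;

  /* Invariant: the culprit lies in [lo, hi).  */
  while (hi - lo > 1)
    {
      int mid = lo + (hi - lo) / 2;
      if (fails (lo, mid, ctx))
        hi = mid;
      else
        lo = mid;
    }
  return lo;
}

/* Stand-in oracle for demonstration: CTX points at the culprit id.
   A real oracle would rebuild and rerun the test with a range option.  */
static bool
culprit_in_range (int lo, int hi, void *ctx)
{
  int culprit = *(const int *) ctx;
  return lo <= culprit && culprit < hi;
}
```

With dense, sequential ids the search needs about log2(n_ids) rebuilds. With sparse ids the same loop still works, but most probes would cover ranges containing no functions at all.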

unnecessary test before free changes committed

2011-04-20 Thread Jim Meyering

FYI, I've just pushed the following two change sets.
I verified that "make check" on x86_64 produced the same set of 92
failures without as with my changes.  However, when I ran
"make check MALLOC_PERTURB_=0 MALLOC_CHECK_=0", I saw only 91 failures.
(normally those MALLOC_*_ variables are set to nonzero values in my environment)

This was the culprit:

FAIL: gcc.dg/matrix/transpose-3.c execution,-fprofile-use 
-fipa-matrix-reorg -fdump-ipa-matrix-reorg -O3 -fwhole-program -fno-tree-fre
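
For reference, the pattern the two change sets target looks like the sketch below. The `struct buffer` type and its helpers are hypothetical and exist only for illustration. The one fact relied on is that standard C defines `free (NULL)` as a no-op, so the unguarded calls are safe.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical owner of a heap string; DATA may be NULL.  */
struct buffer
{
  char *data;
  size_t len;
};

/* Release B's storage.  No "if (b->data)" guard is needed:
   free (NULL) does nothing.  */
static void
buffer_clear (struct buffer *b)
{
  free (b->data);
  b->data = NULL;
  b->len = 0;
}

/* Replace B's contents with a copy of S.  Returns 0 on success and
   -1 on allocation failure, in which case B is left unchanged.  */
static int
buffer_set (struct buffer *b, const char *s)
{
  size_t n = strlen (s);
  char *copy = malloc (n + 1);

  if (copy == NULL)
    return -1;
  memcpy (copy, s, n + 1);

  free (b->data);               /* fine even on first use, when it is NULL */
  b->data = copy;
  b->len = n;
  return 0;
}
```

Calling `buffer_clear` on a zero-initialized `struct buffer` is therefore valid, and calling it twice in a row is also harmless because the pointer is reset to NULL after the first call.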

From 7e50b781d25170cf5bbe5f6247607c5dca879009 Mon Sep 17 00:00:00 2001
From: Jim Meyering 
Date: Mon, 3 Jan 2011 16:52:37 +0100
Subject: [PATCH 1/2] discourage unnecessary use of if before free

* README.Portability: Explain why "if (P) free (P)" is best avoided.
---
 gcc/README.Portability |   27 ---
 1 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/gcc/README.Portability b/gcc/README.Portability
index 32a33e2..4101a2f 100644
--- a/gcc/README.Portability
+++ b/gcc/README.Portability
@@ -51,14 +51,28 @@ foo (bar, )
 needs to be coded in some other way.


-free and realloc
-
+Avoid unnecessary test before free
+--

-Some implementations crash upon attempts to free or realloc the null
-pointer.  Thus if mem might be null, you need to write
+Since SunOS 4 stopped being a reasonable portability target,
+(which happened around 2007) there has been no need to guard
+against "free (NULL)".  Thus, any guard like the following
+constitutes a redundant test:
+
+  if (P)
+free (P);
+
+It is better to avoid the test.[*]
+Instead, simply free P, regardless of whether it is NULL.
+
+[*] However, if your profiling exposes a test like this in a
+performance-critical loop, say where P is nearly always NULL, and
+the cost of calling free on a NULL pointer would be prohibitively
+high, consider using __builtin_expect, e.g., like this:
+
+  if (__builtin_expect (ptr != NULL, 0))
+free (ptr);

-  if (mem)
-free (mem);


 Trigraphs
@@ -194,4 +208,3 @@ o Passing incorrect types to fprintf and friends.

 o Adding a function declaration for a module declared in another file to
   a .c file instead of to a .h file.
-
-- 
1.7.5.rc2.295.g19c42


From 08544935e6fcfd6a1d1cba6d302ccede02e13681 Mon Sep 17 00:00:00 2001
From: Jim Meyering 
Date: Fri, 15 Apr 2011 20:47:40 +0200
Subject: [PATCH 2/2] remove useless if-before-free tests

Change "if (E) free (E);" to "free (E);" everywhere except in the
libgo/, intl/, zlib/ and classpath/ directories.
Also transform equivalent variants like
"if (E != NULL) free (E);" and allow an extra cast on the
argument to free.  Otherwise, the tested and freed "E"
expressions must be identical, modulo white space.
---
 gcc/ChangeLog   |   39 +
 gcc/ada/ChangeLog   |4 ++
 gcc/ada/initialize.c|3 +-
 gcc/c-family/ChangeLog  |7 +++-
 gcc/c-family/c-format.c |6 +--
 gcc/calls.c |   15 ++
 gcc/cfgcleanup.c|3 +-
 gcc/collect2.c  |3 +-
 gcc/config/i386/i386.c  |3 +-
 gcc/config/mcore/mcore.c|3 +-
 gcc/coverage.c  |3 +-
 gcc/cp/ChangeLog|4 ++
 gcc/cp/tree.c   |3 +-
 gcc/cse.c   |6 +--
 gcc/cselib.c|3 +-
 gcc/df-core.c   |   15 ++
 gcc/fortran/ChangeLog   |7 +++
 gcc/fortran/expr.c  |3 +-
 gcc/fortran/gfortranspec.c  |5 +-
 gcc/fortran/interface.c |3 +-
 gcc/fortran/trans-openmp.c  |3 +-
 gcc/function.c  |3 +-
 gcc/gcc.c   |   15 ++
 gcc/gcov.c  |6 +--
 gcc/gensupport.c|   12 ++
 gcc/graphite-clast-to-gimple.c  |3 +-
 gcc/graphite-sese-to-poly.c |3 +-
 gcc/haifa-sched.c   |3 +-
 gcc/ipa-prop.c  |3 +-
 gcc/ipa-pure-const.c|3 +-
 gcc/ipa-reference.c |3 +-
 gcc/ira-costs.c |   12 ++
 gcc/ira.c   |   18 +++-
 gcc/java/ChangeLog  |6 ++-
 gcc/java/jcf-parse.c|3 +-
 gcc/matrix-reorg.c  |9 +---
 gcc/prefix.c|3 +-
 gcc/profile.c   |3 +-
 gcc/reload1.c   |6 +--
 gcc/sched-deps.c|3 +-
 gcc/sel-sched-ir.c  |3 +-
 gcc/sese.c  |6 +--
 gcc/tree-data-ref.c |6 +--
 gcc/tree-eh.c   |3 +-
 gcc/tree-ssa-coalesce.c |3 +-
 gcc/tree-ssa-live.c |6 +--
 gcc/tree-ssa-loop-ivopts.c

Added myself to MAINTAINERS (write after approval)

2011-04-20 Thread Easwaran Raman

Changelog:
2011-04-20  Easwaran Raman  

* MAINTAINERS (Write After Approval): Add myself.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 172782)
+++ MAINTAINERS (working copy)
@@ -455,6 +455,7 @@
 Yao Qi y...@codesourcery.com
 Jerry Quinnjlqu...@optonline.net
 Ramana Radhakrishnan   ramana.radhakrish...@arm.com
+Easwaran Raman era...@google.com
 Rolf Rasmussen rol...@gcc.gnu.org
 Volker Reicheltv.reich...@netcologne.de
 Joern Rennecke amyl...@spamcop.net

Re: FDO usability: pid handling

2011-04-20 Thread Xinliang David Li

Here is the revised patch. Basic FDO testing went fine. I still saw
the ipa-inline assertion in SPEC FDO. Will run it when the regression
is fixed.

Thanks,

David

On Tue, Apr 19, 2011 at 5:33 PM, Jan Hubicka  wrote:
>> On Tue, Apr 19, 2011 at 4:49 PM, Jan Hubicka  wrote:
>> >> Actually, among all the choices, funcdef_no is probably the most dense
>> >> one -- it is for function decl with definition only.  In LIPO, the
>> >
>> > Yes, funcdef_no is densiest, but we don't really need great density here
>> > (in many other places we index arrays by cgraph_uid - it is intended for
>> > that purpose and we pay some attention to recycle unused nodes).
>> >
>>
>> That does not mean it is right to use sparse ids:)  DECL_UID will be
>> the worst amongst them.
>
> Sure, that is why I suggested cgraph->uid.  That one is kept dense and it also
> tracks cgraph node creation order. Unlike pid it counts also functions w/o
> bodies.
>> > We only care to avoid divergence in the indexes in between instrumentation
>> > and feedback compilation.  With the IPA pass organization the compiler
>> > doesn't really diverge until the profile pass, so it seems to me that all
>> > should be safe.
>>
>> When I said 'fragile' -- I meant it depends on the optimization pass
>> phase ordering which can change in the future.
>
> Well, that is the case of all of them (passes can create function bodies that
> can make funcdef_no also diverge).  This is the case of couple passes already,
> especially OMP lowering and friends.
>
>> Ok, I will throw away pid and use funcdef_no for now.  For future
>> replacement for the function ids, please consider the following
>> desired properties:
>>
>> 1) The id sequence does not depend on optimization passes -- only
>> depend on source/parsing order;
>
> It depends on optimization, too.  This is why we actually have cgraph->order
> that is used for -fno-toplevel-reorder and is similar to funcdef_no, but
> assigned at finalization time.
>
>> 2) It is dense and sequential for defined functions
>>    a) it has proven to be very useful to use nice looking, sequential
>> ids in triaging coverage mismatch related problems (the tree dump
>> should also show the function id);
>
> You get the cgraph uids in the dumps already.
>>    b) it can be very useful in bug triaging via binary search by
>> specifying ranges of function ids (to enable optimizations etc).
>
> But as you wish, we can proceed with funcdef_no first and then discuss
> removal of that field later.
>
> Honza
>>
>> Thanks,
>>
>> David
>>
>>
>> >
>> > Honza
>> >>
>> >> David
>> >>
>> >>
>> >>
>> >> >
>> >> > Honza
>> >> >
>> >
>
Index: cgraph.c
===
--- cgraph.c	(revision 172781)
+++ cgraph.c	(working copy)
@@ -142,9 +142,6 @@ int cgraph_max_uid;
 /* Maximal uid used in cgraph edges.  */
 int cgraph_edge_max_uid;
 
-/* Maximal pid used for profiling */
-int cgraph_max_pid;
-
 /* Set when whole unit has been analyzed so we can access global info.  */
 bool cgraph_global_info_ready = false;
 
@@ -472,7 +469,6 @@ cgraph_create_node_1 (void)
   struct cgraph_node *node = cgraph_allocate_node ();
 
   node->next = cgraph_nodes;
-  node->pid = -1;
   node->order = cgraph_order++;
   if (cgraph_nodes)
 cgraph_nodes->previous = node;
@@ -1827,8 +1823,7 @@ dump_cgraph_node (FILE *f, struct cgraph
   struct cgraph_edge *edge;
   int indirect_calls_count = 0;
 
-  fprintf (f, "%s/%i(%i)", cgraph_node_name (node), node->uid,
-	   node->pid);
+  fprintf (f, "%s/%i", cgraph_node_name (node), node->uid);
   dump_addr (f, " @", (void *)node);
   if (DECL_ASSEMBLER_NAME_SET_P (node->decl))
 fprintf (f, " (asm: %s)", IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (node->decl)));
Index: cgraph.h
===
--- cgraph.h	(revision 172781)
+++ cgraph.h	(working copy)
@@ -200,9 +200,6 @@ struct GTY((chain_next ("%h.next"), chai
   /* Ordering of all cgraph nodes.  */
   int order;
 
-  /* unique id for profiling. pid is not suitable because of different
- number of cfg nodes with -fprofile-generate and -fprofile-use */
-  int pid;
   enum ld_plugin_symbol_resolution resolution;
 
   /* Set when function must be output for some reason.  The primary
@@ -472,7 +469,6 @@ extern GTY(()) struct cgraph_node *cgrap
 extern GTY(()) int cgraph_n_nodes;
 extern GTY(()) int cgraph_max_uid;
 extern GTY(()) int cgraph_edge_max_uid;
-extern GTY(()) int cgraph_max_pid;
 extern bool cgraph_global_info_ready;
 enum cgraph_state
 {
@@ -730,6 +726,8 @@ void cgraph_clone_inlined_nodes (struct 
 void compute_inline_parameters (struct cgraph_node *);
 cgraph_inline_failed_t cgraph_edge_inlinable_p (struct cgraph_edge *);
 
+void cgraph_init_node_map (void);
+void cgraph_del_node_map (void);
 
 /* Create a new static variable of type TYPE.  */
 tree add_new_static_var (tree type);
Index: value-prof.c
===

[Patch, Fortran, committed] PR 48692 - Fix gfortran.dg/module_write_1.f90 failure

2011-04-20 Thread Tobias Burnus

The commit of the patch for PR 48588 caused
gfortran.dg/module_write_1.f90 to fail with an ICE.


After some debugging (cf. PR 48692) it turned out that the test only
worked by chance before. The attached patch fixes the issue more
properly. (The ICE occurred because a check that all symbols had been
committed failed.)


The patch was built, tested and regtested on x86-64-linux; it is rather
obvious and has also been approved by Steve in a private email.


Tobias
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 172781)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,10 @@
+2011-04-19  Tobias Burnus  
+
+	PR fortran/48588
+	PR fortran/48692
+
+	* module.c (fix_mio_expr): Commit created symbol.
+
 2011-04-19  Janne Blomqvist  
 
 	* scanner.c (load_file): Use XCNEWVAR instead of xcalloc.
Index: gcc/fortran/module.c
===
--- gcc/fortran/module.c	(Revision 172781)
+++ gcc/fortran/module.c	(Arbeitskopie)
@@ -3013,6 +3013,7 @@ fix_mio_expr (gfc_expr *e)
   sym->attr.flavor = FL_PROCEDURE;
   sym->attr.generic = 1;
   e->symtree = gfc_find_symtree (gfc_current_ns->sym_root, fname);
+  gfc_commit_symbol (sym);
 }
 }

Re: [PATCH] Stop in note_eh_region_may_contain_throw after ERT_MUST_NOT_THROW (PR tree-optimization/48611)

2011-04-20 Thread Richard Henderson

On 04/18/2011 02:35 PM, Jakub Jelinek wrote:
>   PR tree-optimization/48611
>   * tree-eh.c (note_eh_region_may_contain_throw): Don't propagate
>   beyond ERT_MUST_NOT_THROW region.

Ok.


r~

Re: Improve stack layout heuristic.

2011-04-20 Thread Easwaran Raman

On Wed, Apr 20, 2011 at 6:53 AM, Michael Matz  wrote:
> Hi,
>
> On Tue, 19 Apr 2011, Easwaran Raman wrote:
>
>> > That is correct but is also what the use of stack_vars[u].representative
>> > achieves alone, ...
>> >
>> >> I am adding a check to that effect.
>> >
>> > ... without any check.
>> >
>> > @@ -596,7 +581,8 @@
>> >   if (vb->conflicts)
>> >     {
>> >       EXECUTE_IF_SET_IN_BITMAP (vb->conflicts, 0, u, bi)
>> > -       add_stack_var_conflict (a, stack_vars[u].representative);
>> > +        if (stack_vars[u].next == EOC && stack_vars[u].representative == 
>> > u)
>> > +          add_stack_var_conflict (a, u);
>> >       BITMAP_FREE (vb->conflicts);
>> >     }
>> >  }
>> >
>> > What's your objective with this change?  I find the original code clearer.
>>
>> Let us say we try to merge 'a' to 'b' and 'a' has conflicts with many
>> members of an existing partition C. It is not necessary to add all
>> those conflicts to 'b' since they will be never considered in the call
>> to union_stack_vars.
>
> Right, that's why I was objecting to your initial change.
I agree that my initial change - adding a conflict with u -  was wrong.
> The original
> code (adding stack_vars[u].representative to the conflicts of A) made sure
> the target conflict bitmap only got representatives added.
In my above example, it is not even necessary to add a conflict
between 'b' and representative(C) since it is already in a partition.
But you're right - not adding that conflict doesn't actually reduce
the size of bit maps. Reverting back to what was there originally.

Thanks,
Easwaran

> That's why I was asking why you changed this area at all.
>
>> I was motivated by your comment on bit-vector bloat to try this, but if
>> this affects readability I'll happily revert back to what it was before.
>
>
> Ciao,
> Michael.
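
For readers following the exchange above, the simplified merge step can be
modelled outside GCC roughly as follows. This is only a sketch under stated
assumptions: the toy_var struct, the fixed array size and all toy_ names are
hypothetical stand-ins for the stack_vars array in cfgexpand.c, and conflict
bitmaps, sizes and alignment are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define EOC ((size_t) -1)   /* end-of-chain marker, named as in the patch */
#define TOY_NVARS 8         /* hypothetical fixed capacity for this sketch */

/* Hypothetical, cut-down stand-in for struct stack_var.  */
struct toy_var
{
  size_t representative;    /* index of the partition's representative */
  size_t next;              /* next member of the partition, or EOC */
};

static struct toy_var toy_vars[TOY_NVARS];

/* Put every variable into its own singleton partition.  */
static void
toy_init (void)
{
  for (size_t i = 0; i < TOY_NVARS; i++)
    {
      toy_vars[i].representative = i;
      toy_vars[i].next = EOC;
    }
}

/* Merge the singleton partition B into partition A by linking B directly
   after A's representative.  No offsets are tracked at this point.  */
static void
toy_union (size_t a, size_t b)
{
  assert (toy_vars[b].next == EOC);
  toy_vars[b].next = toy_vars[a].next;
  toy_vars[b].representative = a;
  toy_vars[a].next = b;
}

/* Count the members of the partition whose representative is A.  */
static size_t
toy_partition_len (size_t a)
{
  size_t n = 0;
  for (size_t i = a; i != EOC; i = toy_vars[i].next)
    n++;
  return n;
}
```

Because B is asserted to be a singleton, the merge is a constant-time list
insertion and only ever needs the representative of each partition, which is
the property the discussion about conflict bitmaps relies on.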
Index: gcc/testsuite/gcc.dg/stack-layout-2.c
===
--- gcc/testsuite/gcc.dg/stack-layout-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/stack-layout-2.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+void bar( char *);
+int foo()
+{
+  int i=0;
+  {
+char a[8000];
+bar(a);
+i += a[0];
+  }
+  {
+char a[8192];
+char b[32];
+bar(a);
+i += a[0];
+bar(b);
+i += a[0];
+  }
+  return i;
+}
+/* { dg-final { scan-rtl-dump "size 8192" "expand" } } */
+/* { dg-final { scan-rtl-dump "size 32" "expand" } } */
Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c	(revision 171954)
+++ gcc/cfgexpand.c	(working copy)
@@ -158,11 +158,6 @@
   /* The Variable.  */
   tree decl;
 
-  /* The offset of the variable.  During partitioning, this is the
- offset relative to the partition.  After partitioning, this
- is relative to the stack frame.  */
-  HOST_WIDE_INT offset;
-
   /* Initially, the size of the variable.  Later, the size of the partition,
  if this variable becomes it's partition's representative.  */
   HOST_WIDE_INT size;
@@ -267,7 +262,6 @@
   v = &stack_vars[stack_vars_num];
 
   v->decl = decl;
-  v->offset = 0;
   v->size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1);
   /* Ensure that all variables have size, so that &a != &b for any two
  variables that are simultaneously live.  */
@@ -403,9 +397,9 @@
 return (int)largeb - (int)largea;
 
   /* Secondary compare on size, decreasing  */
-  if (sizea < sizeb)
-return -1;
   if (sizea > sizeb)
+return -1;
+  if (sizea < sizeb)
 return 1;
 
   /* Tertiary compare on true alignment, decreasing.  */
@@ -564,28 +558,19 @@
 
 /* A subroutine of partition_stack_vars.  The UNION portion of a UNION/FIND
partitioning algorithm.  Partitions A and B are known to be non-conflicting.
-   Merge them into a single partition A.
+   Merge them into a single partition A.  */
 
-   At the same time, add OFFSET to all variables in partition B.  At the end
-   of the partitioning process we've have a nice block easy to lay out within
-   the stack frame.  */
-
 static void
-union_stack_vars (size_t a, size_t b, HOST_WIDE_INT offset)
+union_stack_vars (size_t a, size_t b)
 {
-  size_t i, last;
   struct stack_var *vb = &stack_vars[b];
   bitmap_iterator bi;
   unsigned u;
 
-  /* Update each element of partition B with the given offset,
- and merge them into partition A.  */
-  for (last = i = b; i != EOC; last = i, i = stack_vars[i].next)
-{
-  stack_vars[i].offset += offset;
-  stack_vars[i].representative = a;
-}
-  stack_vars[last].next = stack_vars[a].next;
+  gcc_assert (stack_vars[b].next == EOC);
+   /* Add B to A's partition.  */
+  stack_vars[b].next = stack_vars[a].next;
+  stack_vars[b].representative = a;
   stack_vars[a].next = b;
 
   /* Update the required alignment of partition A to account for B.  */
@@ -605,16 +590,13 @@
partitions constrained by the interference graph.  The overall
algorithm used is a

Re: [patch] Do not generate discriminator directive in strict mode

2011-04-20 Thread Richard Henderson

On 04/19/2011 06:40 AM, Eric Botcazou wrote:
> -  if (SUPPORTS_DISCRIMINATOR && discriminator != 0)
> +  if (SUPPORTS_DISCRIMINATOR
> +   && discriminator != 0
> +   && (dwarf_version >= 4 || !dwarf_strict))
>   fprintf (asm_out_file, " discriminator %d", discriminator);

How is this not redundant with the existing

  /* The discriminator column was added in dwarf4.  Simplify the below
 by simply removing it if we're not supposed to output it.  */
  if (dwarf_version < 4 && dwarf_strict)
discriminator = 0;

check near the top of the function?

r~
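
The redundancy asked about above can be checked with a small stand-alone
model. This is a sketch, not the dwarf2out.c code: the toy_ functions are
hypothetical, SUPPORTS_DISCRIMINATOR is left out because it appears in both
versions of the condition, and only a small range of inputs is enumerated.

```c
#include <stdbool.h>

/* Model of the existing early check: drop the discriminator when strict
   DWARF older than version 4 is requested.  */
static unsigned
toy_filter_discriminator (int dwarf_version, bool dwarf_strict,
                          unsigned discriminator)
{
  if (dwarf_version < 4 && dwarf_strict)
    discriminator = 0;
  return discriminator;
}

/* Model of the condition before the proposed change.  */
static bool
toy_emit_old (unsigned discriminator)
{
  return discriminator != 0;
}

/* Model of the condition after the proposed change.  */
static bool
toy_emit_new (int dwarf_version, bool dwarf_strict, unsigned discriminator)
{
  return discriminator != 0 && (dwarf_version >= 4 || !dwarf_strict);
}

/* Return true if both conditions agree for every enumerated input once
   the early filter has run.  */
static bool
toy_checks_agree (void)
{
  for (int version = 2; version <= 5; version++)
    for (int strict = 0; strict <= 1; strict++)
      for (unsigned disc = 0; disc < 4; disc++)
        {
          unsigned filtered
            = toy_filter_discriminator (version, strict != 0, disc);
          if (toy_emit_old (filtered)
              != toy_emit_new (version, strict != 0, filtered))
            return false;
        }
  return true;
}
```

If the early filter always runs before the fprintf, the extra clause never
changes the outcome, which is the point of the question.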

[Patch,AVR]: Solve PR42210

2011-04-20 Thread Georg-Johann Lay

This solves some missed optimization that can be seen when moving
around bits.

There are 4 combiner patterns that operate on regs and one that uses
them as intermediate patterns and works on I/O. Even if just an
intermediate pattern matches, it's still an improvement because it
avoids a shift.

Tested on some home-brew example.

Ok if I see no regressions?

Johann
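
As an illustration of the kind of C source such patterns are aimed at, here
is a sketch of two equivalent ways to write a single-bit move. The function
names are hypothetical and nothing is claimed about the code avr-gcc emits
for them; the two shapes simply mirror the RTL of the ".a" pattern (shift,
then mask with a single-one constant) and the ".b" pattern (mask with 1,
then shift) in the patch.

```c
#include <stdint.h>

/* Shape of the ".a" pattern: shift SRC left by N, then keep only bit N.  */
static uint8_t
toy_bit_a (uint8_t dst, uint8_t src, unsigned n)
{
  uint8_t mask = (uint8_t) (1u << n);
  return (uint8_t) ((dst & (uint8_t) ~mask) | ((src << n) & mask));
}

/* Shape of the ".b" pattern: keep only bit 0 of SRC, then shift it to N.  */
static uint8_t
toy_bit_b (uint8_t dst, uint8_t src, unsigned n)
{
  uint8_t mask = (uint8_t) (1u << n);
  return (uint8_t) ((dst & (uint8_t) ~mask) | ((src & 1u) << n));
}

/* Return 1 if both spellings agree for all 8-bit inputs and N in 1..6.  */
static int
toy_bit_forms_agree (void)
{
  for (unsigned n = 1; n <= 6; n++)
    for (unsigned dst = 0; dst < 256; dst++)
      for (unsigned src = 0; src < 256; src++)
        if (toy_bit_a ((uint8_t) dst, (uint8_t) src, n)
            != toy_bit_b ((uint8_t) dst, (uint8_t) src, n))
          return 0;
  return 1;
}
```

Both functions compute the same value, which is why the patch needs more
than one pattern: the RTL the combiner sees depends on which spelling the
user chose.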

2011-04-20  Georg-Johann Lay  

PR target/42210

* config/avr/avr.md ("*movbitqi.1-6.a", "*movbitqi.1-6.b",
"*movbitqi.0", "*movbitqi.7", "*movbitqi.io"): New insns.
Index: config/avr/avr.md
===
--- config/avr/avr.md	(Revision 172770)
+++ config/avr/avr.md	(Arbeitskopie)
@@ -3388,3 +3388,81 @@ (define_insn "fmulsu"
 	clr __zero_reg__"
   [(set_attr "length" "3")
(set_attr "cc" "clobber")])
+
+
+;; Some combiner patterns dealing with bits.
+;; See PR42210
+
+;; Move bit $3.$4 into bit $0.$4
+(define_insn "*movbitqi.1-6.a"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (ior:QI (and:QI (match_operand:QI 1 "register_operand" "0")
+                        (match_operand:QI 2 "single_zero_operand" "n"))
+                (and:QI (ashift:QI (match_operand:QI 3 "register_operand" "r")
+                                   (match_operand:QI 4 "const_int_operand" "n"))
+                        (match_operand:QI 5 "single_one_operand" "n"))))]
+  "optimize
+   && INTVAL(operands[4]) == exact_log2 (INTVAL(operands[5]) & GET_MODE_MASK (QImode))
+   && INTVAL(operands[4]) == exact_log2 (~INTVAL(operands[2]) & GET_MODE_MASK (QImode))"
+  "bst %3,%4\;bld %0,%4"
+  [(set_attr "length" "2")
+   (set_attr "cc" "none")])
+
+;; Move bit $3.$4 into bit $0.$4
+;; Variation of above. Unfortunately, there is no canonicalized representation
+;; of moving around bits.  So what we see here depends on how user writes down
+;; bit manipulations.
+(define_insn "*movbitqi.1-6.b"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (ior:QI (and:QI (match_operand:QI 1 "register_operand" "0")
+                        (match_operand:QI 2 "single_zero_operand" "n"))
+                (ashift:QI (and:QI (match_operand:QI 3 "register_operand" "r")
+                                   (const_int 1))
+                           (match_operand:QI 4 "const_int_operand" "n"))))]
+  "optimize
+   && INTVAL(operands[4]) == exact_log2 (~INTVAL(operands[2]) & GET_MODE_MASK (QImode))"
+  "bst %3,%4\;bld %0,%4"
+  [(set_attr "length" "2")
+   (set_attr "cc" "none")])
+
+;; Move bit $3.0 into bit $0.0.
+;; For bit 0, combiner generates slightly different pattern.
+(define_insn "*movbitqi.0"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (ior:QI (and:QI (match_operand:QI 1 "register_operand" "0")
+                        (match_operand:QI 2 "single_zero_operand" "n"))
+                (and:QI (match_operand:QI 3 "register_operand" "r")
+                        (const_int 1))))]
+  "optimize
+   && 0 == exact_log2 (~INTVAL(operands[2]) & GET_MODE_MASK (QImode))"
+  "bst %3,0\;bld %0,0"
+  [(set_attr "length" "2")
+   (set_attr "cc" "none")])
+
+;; Move bit $2.7 into bit $0.7.
+;; For bit 7, combiner generates slightly different pattern
+(define_insn "*movbitqi.7"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (ior:QI (and:QI (match_operand:QI 1 "register_operand" "0")
+                        (const_int 127))
+                (ashift:QI (match_operand:QI 2 "register_operand" "r")
+                           (const_int 7))))]
+  "optimize"
+  "bst %2,7\;bld %0,7"
+  [(set_attr "length" "2")
+   (set_attr "cc" "none")])
+
+;; Combiner transforms above four pattern into ZERO_EXTRACT if it sees MEM
+;; and input/output match.  We provide a special pattern for this, because
+;; in contrast to a IN/BST/BLD/OUT sequence we need less registers and the
+;; operation on I/O is atomic.
+(define_insn "*movbitqi.io"
+  [(set (zero_extract:QI (mem:QI (match_operand 0 "low_io_address_operand" "n"))
+ (const_int 1)   ;; width
+ (match_operand 1 "const_int_operand"  "n")) ;; pos
+(match_operand:QI 2 "register_operand" "r"))]
+  "optimize
+   && IN_RANGE (INTVAL(operands[1]), 0, 7)"
+  "sbrc %2,0\;sbi %m0-0x20,%1\;sbrs %2,0\;cbi %m0-0x20,%1"
+  [(set_attr "length" "4")
+   (set_attr "cc" "none")])

Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)

2011-04-20 Thread Kai Tietz

2011/4/20 Kai Tietz :
> 2011/4/20 Richard Henderson :
>> On 04/20/2011 08:50 AM, Kai Tietz wrote:
>>> +      if (TREE_CODE (arg0) == TREE_CODE (arg1)
>>> +       && TREE_CODE (arg1) == TRUTH_AND_EXPR)
>>
>> Ok with these both explicitly testing TRUTH_AND_EXPR now.
>>
>>
>> r~
>>
>
> Committed at revision 172776 with explicit testing for TRUTH_AND_EXPR.
>
> Kai

Fixed encoding issue of backslashes in testcases at revision 172781.
Committed as obvious.

Kai
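
For reference, the expression in the subject line is the long-hand spelling
of exclusive or. The sketch below checks that identity for unsigned bit
operations; it assumes, without the thread stating it here, that x ^ y is
the form the fold is meant to reach, and the function names are
hypothetical.

```c
/* The long-hand form from the subject line.  */
static unsigned
toy_long_form (unsigned x, unsigned y)
{
  return (x & ~y) | (~x & y);
}

/* The short form it is equivalent to.  */
static unsigned
toy_xor_form (unsigned x, unsigned y)
{
  return x ^ y;
}

/* Return 1 if both forms agree for all pairs of 8-bit values.  */
static int
toy_xor_forms_agree (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      if (toy_long_form (x, y) != toy_xor_form (x, y))
        return 0;
  return 1;
}
```

The identity holds bit by bit: a result bit is set exactly when the
corresponding bits of x and y differ.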

Re: [pph] Namespaces, step 1. Trace formatting. (issue4433054)

2011-04-20 Thread Lawrence Crowl

On 4/20/11, dnovi...@google.com  wrote:
> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c
> File gcc/cp/pph-streamer.c (right):
>
> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c#newcode144
> gcc/cp/pph-streamer.c:144: return;
> +  if ((type == PPH_TRACE_TREE || type == PPH_TRACE_CHAIN)
> +  && !data && flag_pph_tracer <= 3)
> +return;
>
> Line up the predicates vertically.

Can you be more specific?

>
> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c#newcode172
> gcc/cp/pph-streamer.c:172: fprintf (pph_logfile, ", code=%s",
> tree_code_name[TREE_CODE (t)]);
> case PPH_TRACE_REF:
> +  {
> + const_tree t = (const_tree) data;
> + if (t)
> +   {
> + print_generic_expr (pph_logfile, CONST_CAST (union tree_node *,
> t),
> + 0);
> + fprintf (pph_logfile, ", code=%s", tree_code_name[TREE_CODE (t)]);
>
>
> But how are we going to tell if this is a REF instead of a tree?

The type_s array is indexed by PPH_TRACE_REF.

> The output seems identical to the PPH_TRACE_TREE case.

Well, the case in those branches is identical.  The splitting was
a bit preemptive, as I was planning to see what changes I needed
after seeing what items were refs.  None actually were refs, so
the distinction isn't there.

> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h
> File gcc/cp/pph-streamer.h (right):
>
> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h#newcode149
> gcc/cp/pph-streamer.h:149: }
> pph_output_tree_lst (pph_stream *stream, tree t, bool ref_p)
> +{
> +  if (flag_pph_tracer >= 2)
> +pph_stream_trace_tree (stream, t, ref_p);
> +  lto_output_tree (stream->ob, t, ref_p);
> +}
>
> I don't really like all this code duplication.  Wouldn't it be better if
> instead of having pph_output_tree_aux and pph_output_tree_lst, we added
> another argument to pph_output_tree?  The argument would be an enum and
> we could have a default 'DONT_CARE' value.

I'm not sure that would save much code.  It would induce some
runtime overhead (unless the compiler specialized routines).
It would also change the callbacks.

> http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h#newcode298
> gcc/cp/pph-streamer.h:298: pph_stream_trace_tree (stream, t, false); /*
> FIXME pph: always false? */
> @@ -285,7 +295,7 @@ pph_input_tree (pph_stream *stream)
>   {
> tree t = lto_input_tree (stream->ib, stream->data_in);
> if (flag_pph_tracer >= 4)
> -pph_stream_trace_tree (stream, t);
> +pph_stream_trace_tree (stream, t, false); /* FIXME pph: always
> false?
>
> Yes, on input we can't tell if we read a reference or a real tree.  We
> could, but not at this level.  That's inside the actual LTO streaming
> code.

It would be nice to have an indication, but it is not something I want
to do now.

>
> http://codereview.appspot.com/4433054/

-- 
Lawrence Crowl

Re: [PATCH] Optimize (x * 8) | 5 and (x << 3) ^ 3 to use lea (PR target/48688)

2011-04-20 Thread Richard Henderson

On 04/20/2011 09:09 AM, Jakub Jelinek wrote:
> Hi!
> 
> This splitter allows us to optimize (x {* {2,4,8},<< {1,2,3}}) {|,^} y
> for constant integer y <= {1ULL,3ULL,7ULL} using lea{l,q} (| or ^ in
> that case, when the low bits are known to be all 0, is like plus).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2011-04-20  Jakub Jelinek  
> 
>   PR target/48688
>   * config/i386/i386.md (*lea_general_4): New define_insn_and_split.

Any chance you could do this in combine instead?  Shift-and-add patterns
are a fairly common architectural feature...


r~

[PATCH] use build_function_type_list a few places in the ObjC frontend

2011-04-20 Thread Nathan Froyd

Just as $SUBJECT suggests.  All the other uses of
build_function_type_list are tied up with get_arg_type_list and will
therefore have to wait for a better FUNCTION_TYPE builder.
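
To illustrate the calling convention this conversion relies on, here is a
stand-alone sketch of a NULL-terminated variadic builder. It is not GCC
code: toy_fn_type, TOY_MAX_ARGS and the use of strings in place of tree
nodes are assumptions made purely for the example.

```c
#include <stdarg.h>
#include <stddef.h>

#define TOY_MAX_ARGS 8   /* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for a FUNCTION_TYPE node; types are just names.  */
struct toy_fn_type
{
  const char *ret;
  const char *args[TOY_MAX_ARGS];
  size_t nargs;
};

/* Collect argument types until the NULL sentinel, in the style of
   build_function_type_list (return type first, then the arguments).  */
static struct toy_fn_type
toy_build_function_type_list (const char *ret, ...)
{
  struct toy_fn_type t = { ret, { NULL }, 0 };
  va_list ap;

  va_start (ap, ret);
  for (;;)
    {
      const char *arg = va_arg (ap, const char *);
      if (arg == NULL || t.nargs == TOY_MAX_ARGS)
        break;
      t.args[t.nargs++] = arg;
    }
  va_end (ap);
  return t;
}
```

A call such as toy_build_function_type_list ("void", "id", (const char *)
NULL) replaces the nested list construction with a flat argument list. The
explicit cast on the sentinel matters in portable C, since a bare NULL
passed through "..." need not have pointer type.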

Tested on x86_64-unknown-linux-gnu.  IIUC the changes to
objc-next-runtime-abi-02.c would not be tested on that platform, so it
would be helpful to have a Darwin tester double-check my work.

OK to commit?

-Nathan

* objc-act.c (synth_module_prologue): Call build_function_type_list
instead of build_function_type.
* objc-next-runtime-abi-02.c (next_runtime_02_initialize):
Likewise.

diff --git a/gcc/objc/objc-act.c b/gcc/objc/objc-act.c
index b48f179..0b6b793 100644
--- a/gcc/objc/objc-act.c
+++ b/gcc/objc/objc-act.c
@@ -2995,8 +2995,8 @@ synth_module_prologue (void)
   build_fast_enumeration_state_template ();
   
   /* void objc_enumeration_mutation (id) */
-  type = build_function_type (void_type_node,
- tree_cons (NULL_TREE, objc_object_type, NULL_TREE));
+  type = build_function_type_list (void_type_node,
+  objc_object_type, NULL_TREE);
   objc_enumeration_mutation_decl 
     = add_builtin_function (TAG_ENUMERATION_MUTATION, type, 0, NOT_BUILT_IN, NULL, NULL_TREE);
diff --git a/gcc/objc/objc-next-runtime-abi-02.c b/gcc/objc/objc-next-runtime-abi-02.c
index 4ce0159..f3cf359 100644
--- a/gcc/objc/objc-next-runtime-abi-02.c
+++ b/gcc/objc/objc-next-runtime-abi-02.c
@@ -492,9 +492,8 @@ static void next_runtime_02_initialize (void)
   build_v2_ehtype_template ();
 
   /* void * objc_begin_catch (void *) */
-  type = build_function_type (ptr_type_node,
- tree_cons (NULL_TREE, ptr_type_node,
- OBJC_VOID_AT_END));
+  type = build_function_type_list (ptr_type_node,
+  ptr_type_node, NULL_TREE);
 
   objc2_begin_catch_decl = add_builtin_function ("objc_begin_catch",
 type, 0, NOT_BUILT_IN,
@@ -502,14 +501,13 @@ static void next_runtime_02_initialize (void)
   TREE_NOTHROW (objc2_begin_catch_decl) = 0;
 
   /* void objc_end_catch () */
-  type = build_function_type (void_type_node, OBJC_VOID_AT_END);
+  type = build_function_type_list (void_type_node, NULL_TREE);
   objc2_end_catch_decl = add_builtin_function ("objc_end_catch",
type, 0, NOT_BUILT_IN,
NULL, NULL_TREE);
   TREE_NOTHROW (objc2_end_catch_decl) = 0;
 
   /* void objc_exception_rethrow (void) */
-  type = build_function_type (void_type_node, OBJC_VOID_AT_END);
   objc_rethrow_exception_decl = 
add_builtin_function ("objc_exception_rethrow",
  type, 0, NOT_BUILT_IN,

Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-20 Thread Richard Henderson

On 04/19/2011 02:52 AM, Denis Chertykov wrote:
> 2011/4/19 Georg-Johann Lay :
>> How can add, sub etc. be split? This would need an explicit
>> representation of carry.
> 
> Yes.
> 
> Look at http://gcc.gnu.org/ml/gcc/2005-03/msg00871.html

Well, sort-of, but not really.

It gets a tad ugly, but have a look at the adddi3* patterns in
the mn10300 and rx ports.

In particular note how both the inputs and outputs to the insn
(not the expander) are all SImode, allowing for lower_subreg to
do its job.  The patterns are split post-reload -- some of that
is for scheduling, some of that simply makes computing the 
individual insn lengths significantly easier -- but you wouldn't
really have to do that for AVR.

For AVR things would become even trickier.  You might consider

(define_predicate "concat_operator"
  (match_code "concat"))

(define_insn "addsi3_"
  [(set (match_operand:QI 0 "register_operand" "=r")
(truncate:QI
  (plus:SI
(match_operator:SI 12 "concat_operator"
   [(match_operand:QI  4 "register_operand" "0")
(match_operand:QI  5 "register_operand" "1")
(match_operand:QI  6 "register_operand" "2")
(match_operand:QI  7 "register_operand" "3")])
(match_operator:SI 13 "concat_operator"
   [(match_operand:QI  8 "reg_or_0_operand" "rL")
(match_operand:QI  9 "reg_or_0_operand" "rL")
(match_operand:QI 10 "reg_or_0_operand" "rL")
(match_operand:QI 11 "reg_or_0_operand" "rL")]))))
   (set (match_operand:QI 1 "register_operand" "=r")
(truncate:QI
  (lshiftrt:SI
(plus:SI (match_dup 24) (match_dup 25))
(const_int 8))))
   (set (match_operand:QI 2 "register_operand" "=r")
(truncate:QI
  (lshiftrt:SI
(plus:SI (match_dup 24) (match_dup 25))
(const_int 16))))
   (set (match_operand:QI 3 "register_operand" "=r")
(truncate:QI
  (lshiftrt:SI
(plus:SI (match_dup 24) (match_dup 25))
(const_int 24))))]
  ""
  "add %0,%Z8\;adc %1,%Z9\;adc %2,%Z10\;adc %3,%Z11"
  [(set_attr "length" "4")]
)

This may require a little bit of code inside combine to handle
CONCAT in a reasonable way, but that should be fairly minimal.

It may also want some more special case patterns and/or peep2s
to more efficiently handle constants, particularly considering
adiw and subic.  But I think it's at least worth investigating.
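
The add/adc chain in the output template above corresponds to the usual
byte-wise addition with a propagated carry. As a rough illustration in
plain C (the function name is hypothetical, and this models only the
arithmetic, not register allocation or the CONCAT handling):

```c
#include <stdint.h>

/* Add two 32-bit values one byte at a time, passing the carry from each
   byte to the next, as an add followed by three adc instructions would.  */
static uint32_t
toy_add32_bytewise (uint32_t a, uint32_t b)
{
  uint32_t result = 0;
  unsigned carry = 0;

  for (unsigned i = 0; i < 4; i++)
    {
      unsigned sum = ((a >> (8 * i)) & 0xffu)
                     + ((b >> (8 * i)) & 0xffu)
                     + carry;
      result |= (uint32_t) (sum & 0xffu) << (8 * i);
      carry = sum >> 8;   /* 0 or 1, consumed by the next byte */
    }
  return result;
}
```

The carry is the only state shared between the four byte operations, which
is why a split representation has to make it explicit in some form.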

r~

[PATCH] make Ada runtime function building use build_function_type_list

2011-04-20 Thread Nathan Froyd

This patch changes most of the uses of build_function_type in the Ada
front end to use build_function_type_list.  There are a handful of
build_function_type calls left; replacing those will have to wait until
we get a build_function_type_{n,vec} interface.

Tested on x86_64-unknown-linux-gnu.  OK to commit?

-Nathan

* gcc-interface/trans.c (gigi): Call build_function_type_list
instead of build_function_type.  Adjust calls to...
(build_raise_check): ...this.  Do not take a void_tree parameter.
Call build_function_type_list instead of build_function_type.

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index 378f88c..05e2842 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -214,7 +214,7 @@ static void set_expr_location_from_node (tree, Node_Id);
 static bool set_end_locus_from_node (tree, Node_Id);
 static void set_gnu_expr_location_from_node (tree, Node_Id);
 static int lvalue_required_p (Node_Id, tree, bool, bool, bool);
-static tree build_raise_check (int, tree, enum exception_info_kind);
+static tree build_raise_check (int, enum exception_info_kind);
 
 /* Hooks for debug info back-ends, only supported and used in a restricted set
of configurations.  */
@@ -236,7 +236,7 @@ gigi (Node_Id gnat_root, int max_gnat_node, int number_name ATTRIBUTE_UNUSED,
   Entity_Id standard_exception_type, Int gigi_operating_mode)
 {
   Entity_Id gnat_literal;
-  tree long_long_float_type, exception_type, t;
+  tree long_long_float_type, exception_type, t, ftype;
   tree int64_type = gnat_type_for_size (64, 0);
   struct elab_info *info;
   int i;
@@ -344,47 +344,39 @@ gigi (Node_Id gnat_root, int max_gnat_node, int number_name ATTRIBUTE_UNUSED,
   DECL_IGNORED_P (t) = 1;
   save_gnu_tree (gnat_literal, t, false);
 
-  void_ftype = build_function_type (void_type_node, NULL_TREE);
+  void_ftype = build_function_type_list (void_type_node, NULL_TREE);
   ptr_void_ftype = build_pointer_type (void_ftype);
 
   /* Now declare run-time functions.  */
-  t = tree_cons (NULL_TREE, void_type_node, NULL_TREE);
 
+  ftype = build_function_type_list (ptr_void_type_node, sizetype, NULL_TREE);
   /* malloc is a function declaration tree for a function to allocate
  memory.  */
   malloc_decl
 = create_subprog_decl (get_identifier ("__gnat_malloc"), NULL_TREE,
-  build_function_type (ptr_void_type_node,
-   tree_cons (NULL_TREE,
-  sizetype, t)),
-  NULL_TREE, false, true, true, NULL, Empty);
+  ftype, NULL_TREE, false, true, true, NULL, Empty);
   DECL_IS_MALLOC (malloc_decl) = 1;
 
   /* malloc32 is a function declaration tree for a function to allocate
  32-bit memory on a 64-bit system.  Needed only on 64-bit VMS.  */
   malloc32_decl
 = create_subprog_decl (get_identifier ("__gnat_malloc32"), NULL_TREE,
-  build_function_type (ptr_void_type_node,
-   tree_cons (NULL_TREE,
-  sizetype, t)),
-  NULL_TREE, false, true, true, NULL, Empty);
+  ftype, NULL_TREE, false, true, true, NULL, Empty);
   DECL_IS_MALLOC (malloc32_decl) = 1;
 
   /* free is a function declaration tree for a function to free memory.  */
+  ftype = build_function_type_list (void_type_node,
+   ptr_void_type_node, NULL_TREE);
   free_decl
 = create_subprog_decl (get_identifier ("__gnat_free"), NULL_TREE,
-  build_function_type (void_type_node,
-   tree_cons (NULL_TREE,
-  ptr_void_type_node,
-  t)),
-  NULL_TREE, false, true, true, NULL, Empty);
+  ftype, NULL_TREE, false, true, true, NULL, Empty);
 
   /* This is used for 64-bit multiplication with overflow checking.  */
+  ftype = build_function_type_list (int64_type,
+   int64_type, int64_type, NULL_TREE);
   mulv64_decl
 = create_subprog_decl (get_identifier ("__gnat_mulv64"), NULL_TREE,
-  build_function_type_list (int64_type, int64_type,
-int64_type, NULL_TREE),
-  NULL_TREE, false, true, true, NULL, Empty);
+  ftype, NULL_TREE, false, true, true, NULL, Empty);
 
   /* Name of the _Parent field in tagged record types.  */
   parent_name_id = get_identifier (Get_Name_String (Name_uParent));
@@ -401,61 +393,54 @@ gigi (Node_Id gnat_root, int max_gnat_node, int number_name ATTRIBUTE_UNUSED,
   jmpbuf_ptr_type = build_pointer_type (jmpbuf_type);
 
   /* F

Re: [PATCH][ARM] New testcases for NEON

2011-04-20 Thread Andrew Stubbs


On 19/04/11 15:58, Richard Earnshaw wrote:

OK.

2008-12-03  Daniel Jacobowitz

 gcc/testsuite/
 * gcc.dg/vect/vect-shift-3.c, gcc.dg/vect/vect-shift-4.c: New.
 * lib/target-supports.exp (check_effective_target_vect_shift_char): New
 function.


Committed, thanks.

Andrew

Re: [PATCH][ARM] Clean up movw support

2011-04-20 Thread Andrew Stubbs


On 20/04/11 16:46, Richard Earnshaw wrote:

2011-04-20  Andrew Stubbs

 gcc/
 * config/arm/arm.c (arm_gen_constant): Move mowv support 
 (const_ok_for_op): ... to here.

it's movw (not mowv :)

Otherwise OK.


Committed, thanks.

Andrew

Re: FDO usage: -Wcoverage-mismatch should not ignore -Wno-error

2011-04-20 Thread Xinliang David Li

This would work if there is a way to set Werror=coverage-mismatch
without having to explicitly set the option classification as
DK_ERROR.   Does this mechanism exist?

Thanks,

David

On Tue, Apr 19, 2011 at 12:52 AM, Richard Guenther
 wrote:
> On Tue, Apr 19, 2011 at 9:13 AM, Xinliang David Li  wrote:
>> -Wcoverage-mismatch is enabled by default, and the warning is promoted
>> to error by default. However in the current implementation -Wno-error
>> can not demote the error back to warning. The patch was ported from
>> one contributed by Neil.
>>
>> OK for trunk after regression testing?
>
> I am sure there is a better way to achieve this, like making
> Werror=coverage-mismatch
> the default.  Joseph?
>
> Richard.
>
>>
>> 2011-04-18  Neil Vachharajani  
>>
>>    * flags.c:  New flag variable.
>>    * opts.c (common_handle_options): Set flag_werror_set.
>>    * opts-global.c (decode_options): Delay Werror decision
>>    for Wcoverage-mismatch until after options are parsed.
>>
>> The following test case can be added, but the test harness does not
>> like the extra warnings -- how can they be pruned?
>>
>> Thanks,
>>
>> David
>>
>> /* { dg-options "-O2 -Wcoverage-mismatch -Wno-error" } */
>>
>> int __attribute__((noinline)) bar (void)
>> {
>> }
>>
>> #ifdef _PROFILE_USE
>> int foo (int i)
>> {
>>  if (i)
>>    bar ();
>>  else
>>   bar ();
>>  return 0;
>> }
>> #else
>> int foo (int i)
>> {
>>  if (i)
>>    bar ();
>>  return 0;
>> }
>> #endif
>>
>> int main(int argc, char **argv)
>> {
>>  foo (argc);
>>  return 0;
>> }
>>
>

Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)

2011-04-20 Thread Kai Tietz

2011/4/20 Richard Henderson :
> On 04/20/2011 08:50 AM, Kai Tietz wrote:
>> +      if (TREE_CODE (arg0) == TREE_CODE (arg1)
>> +       && TREE_CODE (arg1) == TRUTH_AND_EXPR)
>
> Ok with these both explicitly testing TRUTH_AND_EXPR now.
>
>
> r~
>

Committed at revision 172776 with explicit testing for TRUTH_AND_EXPR.

Kai

Re: [PATCH][ARM] Remove redundant code in arm.c

2011-04-20 Thread Andrew Stubbs


On 20/04/11 16:34, Richard Earnshaw wrote:

On Wed, 2011-04-20 at 13:55 +0100, Andrew Stubbs wrote:

This patch removes some redundant code that caused me some confusion.

It's not possible to construct a constant from multiple ORN
instructions, just as it's not possible to do it with multiple AND
instructions.

OK?

Andrew


OK.


Committed, thanks.

Andrew

[PATCH] Optimize (x * 8) | 5 and (x << 3) ^ 3 to use lea (PR target/48688)

2011-04-20 Thread Jakub Jelinek

Hi!

This splitter allows us to optimize (x {* {2,4,8},<< {1,2,3}}) {|,^} y
for constant integer y <= {1ULL,3ULL,7ULL} using lea{l,q} (| or ^ in
that case, when the low bits are known to be all 0, is like plus).
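
The remark in parentheses can be checked directly: when the constant fits
entirely in the bits cleared by the shift, OR, XOR and PLUS all produce the
same value. The following is a small sketch of that check with a
hypothetical helper name; it enumerates 16-bit inputs only.

```c
#include <stdint.h>

/* Return 1 if (x << SHIFT) | Y, (x << SHIFT) ^ Y and (x << SHIFT) + Y
   agree for every 16-bit X, and 0 otherwise.  */
static int
toy_or_xor_plus_agree (unsigned shift, uint32_t y)
{
  for (uint32_t x = 0; x <= 0xffffu; x++)
    {
      uint32_t scaled = x << shift;
      if ((scaled | y) != scaled + y || (scaled ^ y) != scaled + y)
        return 0;
    }
  return 1;
}
```

For a shift of 3 the three operations agree for constants up to 7 and stop
agreeing at 8, because 8 overlaps the lowest bit that the shifted value can
have set.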

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-04-20  Jakub Jelinek  

PR target/48688
* config/i386/i386.md (*lea_general_4): New define_insn_and_split.

* gcc.target/i386/pr48688.c: New test.

--- gcc/config/i386/i386.md.jj  2011-04-19 14:08:55.0 +0200
+++ gcc/config/i386/i386.md 2011-04-20 14:34:50.0 +0200
@@ -6646,6 +6646,40 @@ (define_insn_and_split "*lea_general_3_z
 }
   [(set_attr "type" "lea")
(set_attr "mode" "SI")])
+
+(define_insn_and_split "*lea_general_4"
+  [(set (match_operand:SWI 0 "register_operand" "=r")
+   (any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
+   (match_operand:SWI 2 "const_int_operand" "n"))
+   (match_operand 3 "const_int_operand" "n")))]
+  "(<MODE>mode == DImode
+|| <MODE>mode == SImode
+|| !TARGET_PARTIAL_REG_STALL
+|| optimize_function_for_size_p (cfun))
+   && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
+   && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
+   <= ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rtx pat;
+  if (<MODE>mode != DImode)
+operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[1] = gen_lowpart (Pmode, operands[1]);
+  operands[2] = GEN_INT (1 << INTVAL (operands[2]));
+  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
+  INTVAL (operands[3]));
+  if (Pmode != SImode && <MODE>mode != DImode)
+pat = gen_rtx_SUBREG (SImode, pat, 0);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
+  DONE;
+}
+  [(set_attr "type" "lea")
+   (set (attr "mode")
+  (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
+   (const_string "SI")
+   (const_string "DI")))])
 
 ;; Subtract instructions
 
--- gcc/testsuite/gcc.target/i386/pr48688.c.jj  2011-04-20 14:55:37.0 +0200
+++ gcc/testsuite/gcc.target/i386/pr48688.c 2011-04-20 14:57:03.0 +0200
@@ -0,0 +1,24 @@
+/* PR target/48688 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int fn1 (int x) { return (x << 3) | 5; }
+int fn2 (int x) { return (x * 8) | 5; }
+int fn3 (int x) { return (x << 3) + 5; }
+int fn4 (int x) { return (x * 8) + 5; }
+int fn5 (int x) { return (x << 3) ^ 5; }
+int fn6 (int x) { return (x * 8) ^ 5; }
+long fn7 (long x) { return (x << 3) | 5; }
+long fn8 (long x) { return (x * 8) | 5; }
+long fn9 (long x) { return (x << 3) + 5; }
+long fn10 (long x) { return (x * 8) + 5; }
+long fn11 (long x) { return (x << 3) ^ 5; }
+long fn12 (long x) { return (x * 8) ^ 5; }
+long fn13 (unsigned x) { return (x << 3) | 5; }
+long fn14 (unsigned x) { return (x * 8) | 5; }
+long fn15 (unsigned x) { return (x << 3) + 5; }
+long fn16 (unsigned x) { return (x * 8) + 5; }
+long fn17 (unsigned x) { return (x << 3) ^ 5; }
+long fn18 (unsigned x) { return (x * 8) ^ 5; }
+
+/* { dg-final { scan-assembler-not "\[ \t\]x?or\[bwlq\]\[ \t\]" } } */

Jakub

Fix PR48703: segfault in mangler due to -g

2011-04-20 Thread Michael Matz

Hi,

as noted in the bug trail the fix for PR48207 broke compilation of C++ 
programs with -g.  This variant fixes the bug too without breaking -g.

Basically we have to set assembler names early also for TYPE_DECLs; we
can't rely on the frontend's langhook to do that after free_lang_data.

Okay for trunk assuming regstrapping on x86_64-linux works?


Ciao,
Michael.

PR debug/48703
* dwarf2out.c (retry_incomplete_types): Export.  Clear
incomplete_types.
* dwarf2out.h (retry_incomplete_types): Declare.
* tree.c (need_assembler_name_p): Also handle TYPE_DECLs.
(free_lang_data_in_cgraph): Call retry_incomplete_types.
(free_lang_data): Reset set_decl_assembler_name langhook.
* Makefile.in (tree.o): Depend on dwarf2out.h.
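
The dwarf2out.c hunk below takes a local copy of incomplete_types and clears the global before walking it. Here is a minimal sketch of that detach-then-walk pattern, using made-up names and a plain array instead of GCC's VEC. Whether re-entrant additions are the actual motivation in dwarf2out.c is an assumption on my part:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical global worklist, standing in for incomplete_types.  */
static int pending[MAX_PENDING];
static size_t pending_len;
static int processed_count;

static void
enqueue (int item)
{
  assert (pending_len < MAX_PENDING);
  pending[pending_len++] = item;
}

/* Processing an item may register new pending work.  */
static void
process (int item)
{
  processed_count++;
  if (item > 100)
    enqueue (item - 100);
}

/* Detach the list first, then walk the private copy.  Items queued
   during the walk land in the fresh, empty global list.  */
static void
retry_pending (void)
{
  int snapshot[MAX_PENDING];
  size_t n = pending_len;

  for (size_t i = 0; i < n; i++)
    snapshot[i] = pending[i];
  pending_len = 0;

  for (size_t i = n; i-- > 0; )
    process (snapshot[i]);
}
```

With this shape a second call does not revisit items that were already handled, and anything queued during the walk is picked up by the next call.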

Index: tree.c
===
--- tree.c  (revision 172769)
+++ tree.c  (working copy)
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.
 #include "except.h"
 #include "debug.h"
 #include "intl.h"
+#include "dwarf2out.h"
 
 /* Tree code classes.  */
 
@@ -4500,7 +4501,8 @@ need_assembler_name_p (tree decl)
 {
   /* Only FUNCTION_DECLs and VAR_DECLs are considered.  */
   if (TREE_CODE (decl) != FUNCTION_DECL
-  && TREE_CODE (decl) != VAR_DECL)
+  && TREE_CODE (decl) != VAR_DECL
+  && TREE_CODE (decl) != TYPE_DECL)
 return false;
 
   /* If DECL already has its assembler name set, it does not need a
@@ -4538,6 +4540,11 @@ need_assembler_name_p (tree decl)
return false;
 }
 
+  if (TREE_CODE (decl) == TYPE_DECL)
+{
+  if (TYPE_DECL_SUPPRESS_DEBUG (decl))
+   return false;
+}
   return true;
 }
 
@@ -5111,6 +5118,8 @@ free_lang_data_in_cgraph (void)
   FOR_EACH_VEC_ELT (tree, fld.decls, i, t)
 assign_assembler_name_if_neeeded (t);
 
+  retry_incomplete_types ();
+
   /* Traverse every decl found freeing its language data.  */
   FOR_EACH_VEC_ELT (tree, fld.decls, i, t)
 free_lang_data_in_decl (t);
@@ -5182,6 +5191,7 @@ free_lang_data (void)
  name and only produce assembler names for local symbols.  Or rather
  make sure we never call decl_assembler_name on local symbols and
  devise a separate, middle-end private scheme for it.  */
+  lang_hooks.set_decl_assembler_name = lhd_set_decl_assembler_name;
 
   /* Reset diagnostic machinery.  */
   diagnostic_starter (global_dc) = default_tree_diagnostic_starter;
Index: dwarf2out.c
===
--- dwarf2out.c (revision 172769)
+++ dwarf2out.c (working copy)
@@ -6575,7 +6575,6 @@ static dw_die_ref force_type_die (tree);
 static dw_die_ref setup_namespace_context (tree, dw_die_ref);
 static dw_die_ref declare_in_namespace (tree, dw_die_ref);
 static struct dwarf_file_data * lookup_filename (const char *);
-static void retry_incomplete_types (void);
 static void gen_type_die_for_member (tree, tree, dw_die_ref);
 static void gen_generic_params_dies (tree);
 static void gen_tagged_type_die (tree, dw_die_ref, enum debug_info_usage);
@@ -18497,15 +18496,17 @@ gen_entry_point_die (tree decl, dw_die_r
 /* Walk through the list of incomplete types again, trying once more to
emit full debugging info for them.  */
 
-static void
+void
 retry_incomplete_types (void)
 {
   int i;
-
-  for (i = VEC_length (tree, incomplete_types) - 1; i >= 0; i--)
-if (should_emit_struct_debug (VEC_index (tree, incomplete_types, i),
+  VEC(tree,gc) *types = incomplete_types;
+  incomplete_types = NULL;
+  for (i = VEC_length (tree, types) - 1; i >= 0; i--)
+if (should_emit_struct_debug (VEC_index (tree, types, i),
  DINFO_USAGE_DIR_USE))
-  gen_type_die (VEC_index (tree, incomplete_types, i), comp_unit_die ());
+  gen_type_die (VEC_index (tree, types, i), comp_unit_die ());
+  types = NULL;
 }
 
 /* Determine what tag to use for a record type.  */
Index: dwarf2out.h
===
--- dwarf2out.h (revision 172769)
+++ dwarf2out.h (working copy)
@@ -25,6 +25,8 @@ extern void dwarf2out_cfi_begin_epilogue
 extern void dwarf2out_frame_debug_restore_state (void);
 extern void dwarf2out_flush_queued_reg_saves (void);
 
+extern void retry_incomplete_types (void);
+
 extern void debug_dwarf (void);
 struct die_struct;
 extern void debug_dwarf_die (struct die_struct *);
Index: Makefile.in
===
--- Makefile.in (revision 172769)
+++ Makefile.in (working copy)
@@ -2354,7 +2354,7 @@ langhooks.o : langhooks.c $(CONFIG_H) $(
 tree.o: tree.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
all-tree.def $(FLAGS_H) $(FUNCTION_H) $(PARAMS_H) \
toplev.h $(DIAGNOSTIC_CORE_H) $(GGC_H) $(HASHTAB_H) $(TARGET_H) output.h $(TM_P_H) \
-   langhooks.h gt-tree.h $(TREE_INLINE_H) tree-iterator.h \
+   langhooks.h gt-tree.h $(TREE_INLINE_H) tree-iterator.h dwarf2

Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)

2011-04-20 Thread Kai Tietz

2011/4/20 Jakub Jelinek :
> On Wed, Apr 20, 2011 at 05:22:31PM +0200, Kai Tietz wrote:
>> --- gcc.orig/gcc/fold-const.c 2011-04-20 17:10:39.478091900 +0200
>> +++ gcc/gcc/fold-const.c      2011-04-20 17:11:22.901039400 +0200
>> @@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
>>         && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
>>       return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
>>
>> +      /* (X & ~Y) | (~X & Y) is X ^ Y */
>> +      if (TREE_CODE (arg0) == BIT_AND_EXPR
>> +       && TREE_CODE (arg1) == BIT_AND_EXPR)
>> +        {
>> +       tree a0, a1, l0, l1, n0, n1;
>> +
>> +       a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
>> +       a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
>> +
>> +       l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
>> +       l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
>> +
>> +       n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
>> +       n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
>> +
>> +       if ((operand_equal_p (n0, a0, 0)
>> +            && operand_equal_p (n1, a1, 0))
>> +           || (operand_equal_p (n0, a1, 0)
>> +               && operand_equal_p (n1, a0, 0)))
>> +         return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
>> +     }
>> +
>
> I must say I don't like first folding/building new trees, then testing
> and then maybe optimizing, that is slow and creates unnecessary garbage
> in the likely case the optimization can't do anything.
>
> Wouldn't something like:
>    int arg0_not = TREE_CODE (TREE_OPERAND (arg0, 1)) == BIT_NOT_EXPR;
>    int arg1_not = TREE_CODE (TREE_OPERAND (arg1, 1)) == BIT_NOT_EXPR;
>    if (TREE_CODE (TREE_OPERAND (arg0, arg0_not)) == BIT_NOT_EXPR
>        && TREE_CODE (TREE_OPERAND (arg1, arg1_not)) == BIT_NOT_EXPR
>        && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg0, arg0_not), 0),
>                            TREE_OPERAND (arg1, 1 - arg1_not), 0)
>        && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg1, arg1_not), 0),
>                            TREE_OPERAND (arg0, 1 - arg0_not), 0))
>      return fold_build2_loc (loc, TRUTH_XOR_EXPR, type,
>                              fold_convert_loc (loc, type,
>                                                TREE_OPERAND (arg0, 1 - arg0_not)),
>                              fold_convert_loc (loc, type,
>                                                TREE_OPERAND (arg1, 1 - arg1_not)));
> work better?
>
>        Jakub
>

Well, as a special case we could use that, but here we also have to
handle integer values, so I used fold to make sure I get the inverse.
Also there might be some transformations which otherwise might not be
caught, like !(X || Y) == !X && !Y ...

Regards,
Kai



Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)

2011-04-20 Thread Richard Henderson

On 04/20/2011 08:50 AM, Kai Tietz wrote:
> +  if (TREE_CODE (arg0) == TREE_CODE (arg1)
> +   && TREE_CODE (arg1) == TRUTH_AND_EXPR)

Ok with these both explicitly testing TRUTH_AND_EXPR now.


r~

Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)

2011-04-20 Thread Kai Tietz

2011/4/20 Richard Henderson :
> On 04/20/2011 08:22 AM, Kai Tietz wrote:
>> +      if (TREE_CODE (arg0) == BIT_AND_EXPR
>> +       && TREE_CODE (arg1) == BIT_AND_EXPR)
>> +        {
>> +       tree a0, a1, l0, l1, n0, n1;
>> +
>> +       a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
>> +       a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
>> +
>> +       l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
>> +       l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
>> +
>> +       n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
>> +       n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
>> +
>> +       if ((operand_equal_p (n0, a0, 0)
>> +            && operand_equal_p (n1, a1, 0))
>> +           || (operand_equal_p (n0, a1, 0)
>> +               && operand_equal_p (n1, a0, 0)))
>> +         return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
>
> First, you typoed BIT_XOR_EXPR in this first block.

Duh, corrected.

> Second, I don't see how you're arbitrarily choosing L0 and N1 in the
> expansion.  If you write the expression the other way around,
>
>  (~x & y) | (x & ~y)
>
> don't you wind up with
>
>  (~x ^ ~y)
>
> ?  Or do the extra NOT expressions get folded away anyway?

No, I didn't wind up there. First, ~X ^ ~Y gives the same result as
X ^ Y, and for this I used the explicit folding here. Well, it might be
a bit slower, but it has the advantage of comparing equivalent
transformations when in doubt.
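
Both identities under discussion can be checked by brute force. The following is a small sketch with a made-up function name that walks every pair of 8-bit values and asserts that (x & ~y) | (~x & y), the swapped form (~x & y) | (x & ~y), and ~x ^ ~y all equal x ^ y:

```c
#include <assert.h>
#include <stdint.h>

/* Checks every pair of 8-bit values; returns the number of pairs checked.  */
static unsigned
check_xor_identities (void)
{
  unsigned checked = 0;

  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        uint8_t a = (uint8_t) x, b = (uint8_t) y;
        uint8_t expect = (uint8_t) (a ^ b);

        /* The casts truncate the promoted int results back to 8 bits.  */
        assert ((uint8_t) ((a & ~b) | (~a & b)) == expect);
        assert ((uint8_t) ((~a & b) | (a & ~b)) == expect);
        assert ((uint8_t) (~a ^ ~b) == expect);
        checked++;
      }
  return checked;
}
```

This only demonstrates the bitwise identities themselves. It says nothing about which trees fold actually produces for a given input.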

>> +      if (TREE_CODE (arg0) == TREE_CODE (arg1)
>> +       && (TREE_CODE (arg1) == TRUTH_AND_EXPR
>> +           || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR))
>
> I don't believe you want to apply this transformation with ANDIF.

Yes, it is superfluous. I removed it.

>
> r~
>

Adjusted patch attached.

Kai
Index: gcc/gcc/fold-const.c
===
--- gcc.orig/gcc/fold-const.c   2011-04-20 17:10:39.478091900 +0200
+++ gcc/gcc/fold-const.c        2011-04-20 17:41:23.427677200 +0200
@@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
 
+  /* (X & ~Y) | (~X & Y) is X ^ Y */
+  if (TREE_CODE (arg0) == BIT_AND_EXPR
+ && TREE_CODE (arg1) == BIT_AND_EXPR)
+{
+ tree a0, a1, l0, l1, n0, n1;
+
+ a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+ a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+ l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+ 
+ n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
+ n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
+ 
+ if ((operand_equal_p (n0, a0, 0)
+  && operand_equal_p (n1, a1, 0))
+ || (operand_equal_p (n0, a1, 0)
+ && operand_equal_p (n1, a0, 0)))
+   return fold_build2_loc (loc, BIT_XOR_EXPR, type, l0, n1);
+   }
+
   t1 = distribute_bit_expr (loc, code, type, arg0, arg1);
   if (t1 != NULL_TREE)
return t1;
@@ -12039,6 +12061,27 @@ fold_binary_loc (location_t loc,
  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
return omit_one_operand_loc (loc, type, integer_one_node, arg0);
 
+  /* (X && !Y) || (!X && Y) is X ^ Y */
+  if (TREE_CODE (arg0) == TREE_CODE (arg1)
+ && TREE_CODE (arg1) == TRUTH_AND_EXPR)
+{
+ tree a0, a1, l0, l1, n0, n1;
+
+ a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+ a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+ l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+ 
+ n0 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l0);
+ n1 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l1);
+ 
+ if ((operand_equal_p (n0, a0, 0)
+  && operand_equal_p (n1, a1, 0))
+ || (operand_equal_p (n0, a1, 0)
+ && operand_equal_p (n1, a0, 0)))
+   return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
+   }
   goto truth_andor;
 
 case TRUTH_XOR_EXPR:
Index: gcc/gcc/testsuite/gcc.dg/binop-xor1.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ gcc/gcc/testsuite/gcc.dg/binop-xor1.c   2011-04-20 17:11:22.905039900 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  return ((a && !b && c) || (!a && b && c));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+   it in the real test.  */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 5 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/

[PATCH] Fix PR47892

2011-04-20 Thread Richard Guenther


This fixes PR47892: we are failing to if-convert function calls,
even those we can vectorize.  This includes pow(), which we
canonicalize x*x to with -ffast-math (yeah, I know ...).
There is no reason not to if-convert at least const builtins.
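
As a rough illustration of what if-conversion means here, the sketch below writes the same loop twice by hand: once with a branch, and once with the arithmetic executed unconditionally and the result picked by a select. The function names are made up and this is not the transformation GCC performs internally, only its effect at source level. Running the right-hand side unconditionally is only safe when it has no side effects, which is why restricting the change to const builtins is plausible:

```c
#include <assert.h>
#include <stddef.h>

/* Branchy form: the multiply and subtract only run when v > 0.  */
static void
update_branchy (float *arr, size_t len)
{
  assert (arr != NULL || len == 0);
  for (size_t i = 0; i < len; ++i)
    {
      float v = arr[i];
      if (v > 0.0f)
        arr[i] = 2.0f - v * v;
      else
        arr[i] = 0.0f;
    }
}

/* Hand if-converted form: compute unconditionally, then select.
   A straight-line loop body like this is what a vectorizer wants.  */
static void
update_if_converted (float *arr, size_t len)
{
  assert (arr != NULL || len == 0);
  for (size_t i = 0; i < len; ++i)
    {
      float v = arr[i];
      float t = 2.0f - v * v;
      arr[i] = v > 0.0f ? t : 0.0f;
    }
}
```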

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-04-20  Richard Guenther  

PR tree-optimization/47892
* tree-if-conv.c (if_convertible_stmt_p): Const builtins
are if-convertible.

* gcc.dg/vect/fast-math-ifcvt-1.c: New testcase.

Index: gcc/tree-if-conv.c
===
*** gcc/tree-if-conv.c  (revision 172759)
--- gcc/tree-if-conv.c  (working copy)
*** if_convertible_stmt_p (gimple stmt, VEC
*** 719,724 
--- 719,740 
  case GIMPLE_ASSIGN:
return if_convertible_gimple_assign_stmt_p (stmt, refs);
  
+ case GIMPLE_CALL:
+   {
+   tree fndecl = gimple_call_fndecl (stmt);
+   if (fndecl)
+ {
+   int flags = gimple_call_flags (stmt);
+   if ((flags & ECF_CONST)
+   && !(flags & ECF_LOOPING_CONST_OR_PURE)
+   /* We can only vectorize some builtins at the moment,
+  so restrict if-conversion to those.  */
+   && DECL_BUILT_IN (fndecl))
+ return true;
+ }
+   return false;
+   }
+ 
  default:
/* Don't know what to do with 'em so don't do anything.  */
if (dump_file && (dump_flags & TDF_DETAILS))
Index: gcc/testsuite/gcc.dg/vect/fast-math-ifcvt-1.c
===
*** gcc/testsuite/gcc.dg/vect/fast-math-ifcvt-1.c   (revision 0)
--- gcc/testsuite/gcc.dg/vect/fast-math-ifcvt-1.c   (revision 0)
***
*** 0 
--- 1,18 
+ /* PR 47892 */
+ /* { dg-do compile } */
+ /* { dg-require-effective-target vect_float } */
+ /* { dg-require-effective-target vect_condition } */
+ 
+ void
+ bestseries9 (float * __restrict__ arr, int len)
+ {
+   int i;
+   for (i = 0; i < len; ++i)
+ {
+   float or = arr[i];
+   arr[i] = (or > 0.0f) * (2 - or * or);
+ }
+ }
+ 
+ /* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */

Re: [PATCH][ARM] Clean up movw support

2011-04-20 Thread Richard Earnshaw

On Wed, 2011-04-20 at 15:20 +0100, Andrew Stubbs wrote:
> This patch doesn't change the compiler behaviour; it merely moves the 
> support for MOVW's 16-bit immediate constant to const_ok_for_op.
> 
> This patch is broken out of my previous (rejected) Thumb2-constants 
> patch. I'll be posting v2 of that patch soon, and this clean up will be 
> required then.
> 
> OK?
> 
> Andrew

2011-04-20  Andrew Stubbs  

gcc/
* config/arm/arm.c (arm_gen_constant): Move mowv support 
(const_ok_for_op): ... to here.

it's movw (not mowv :)

Otherwise OK.

R.

Re: [PATCH][ARM] Add support for ADDW and SUBW instructions

2011-04-20 Thread Andrew Stubbs


On 20/04/11 16:27, Andrew Stubbs wrote:

(*arm_subsi3_insn): Add subw support.


Oh, I should probably say that I've added subw support to arm_subsi3 
even though it's not obvious that anything will ever use this.


The existing implementation of arm_subsi3 (sans 'w') supports 
immediates, so I added subw to match.


If there are any objections, I expect I can remove that hunk of the patch.

Andrew

Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)

2011-04-20 Thread Jakub Jelinek

On Wed, Apr 20, 2011 at 05:22:31PM +0200, Kai Tietz wrote:
> --- gcc.orig/gcc/fold-const.c 2011-04-20 17:10:39.478091900 +0200
> +++ gcc/gcc/fold-const.c  2011-04-20 17:11:22.901039400 +0200
> @@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
> && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
>   return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
>  
> +  /* (X & ~Y) | (~X & Y) is X ^ Y */
> +  if (TREE_CODE (arg0) == BIT_AND_EXPR
> +   && TREE_CODE (arg1) == BIT_AND_EXPR)
> +{
> +   tree a0, a1, l0, l1, n0, n1;
> +
> +   a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> +   a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> +
> +   l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> +   l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> +   
> +   n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
> +   n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
> +   
> +   if ((operand_equal_p (n0, a0, 0)
> +&& operand_equal_p (n1, a1, 0))
> +   || (operand_equal_p (n0, a1, 0)
> +   && operand_equal_p (n1, a0, 0)))
> + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
> + }
> +

I must say I don't like first folding/building new trees, then testing
and then maybe optimizing, that is slow and creates unnecessary garbage
in the likely case the optimization can't do anything.

Wouldn't something like:
    int arg0_not = TREE_CODE (TREE_OPERAND (arg0, 1)) == BIT_NOT_EXPR;
    int arg1_not = TREE_CODE (TREE_OPERAND (arg1, 1)) == BIT_NOT_EXPR;
    if (TREE_CODE (TREE_OPERAND (arg0, arg0_not)) == BIT_NOT_EXPR
        && TREE_CODE (TREE_OPERAND (arg1, arg1_not)) == BIT_NOT_EXPR
        && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg0, arg0_not), 0),
                            TREE_OPERAND (arg1, 1 - arg1_not), 0)
        && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg1, arg1_not), 0),
                            TREE_OPERAND (arg0, 1 - arg0_not), 0))
      return fold_build2_loc (loc, TRUTH_XOR_EXPR, type,
                              fold_convert_loc (loc, type,
                                                TREE_OPERAND (arg0, 1 - arg0_not)),
                              fold_convert_loc (loc, type,
                                                TREE_OPERAND (arg1, 1 - arg1_not)));
work better?
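
The "inspect first, build only on a match" idea can be sketched outside GCC with a toy expression tree. Everything below (the node type, same_var, is_xor_idiom) is hypothetical and only mirrors the shape of the check above. same_var stands in for operand_equal_p and nothing is allocated unless the pattern matches:

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression tree, for illustration only.  */
enum kind { VAR, NOT, AND, OR };

struct node
{
  enum kind kind;
  const struct node *op[2];   /* NOT uses op[0]; AND/OR use both */
  int id;                     /* variable identity, VAR only */
};

/* Stand-in for operand_equal_p: only compares plain variables.  */
static int
same_var (const struct node *a, const struct node *b)
{
  return a->kind == VAR && b->kind == VAR && a->id == b->id;
}

/* Return 1 if N has the shape (X & ~Y) | (~X & Y), with the operands
   of each AND in either order, and store X and Y.  Builds nothing.  */
static int
is_xor_idiom (const struct node *n,
              const struct node **x, const struct node **y)
{
  assert (n != NULL);
  if (n->kind != OR || n->op[0]->kind != AND || n->op[1]->kind != AND)
    return 0;

  const struct node *a0 = n->op[0], *a1 = n->op[1];
  int n0 = a0->op[1]->kind == NOT;   /* index of the NOT operand, if any */
  int n1 = a1->op[1]->kind == NOT;

  if (a0->op[n0]->kind != NOT || a1->op[n1]->kind != NOT)
    return 0;
  if (!same_var (a0->op[n0]->op[0], a1->op[1 - n1])
      || !same_var (a1->op[n1]->op[0], a0->op[1 - n0]))
    return 0;

  *x = a0->op[1 - n0];
  *y = a1->op[1 - n1];
  return 1;
}
```

A caller would construct the XOR node only after is_xor_idiom returns 1, so the common non-matching case costs a few pointer comparisons and no garbage.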

Jakub

Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)

2011-04-20 Thread Richard Henderson

On 04/20/2011 08:22 AM, Kai Tietz wrote:
> +  if (TREE_CODE (arg0) == BIT_AND_EXPR
> +   && TREE_CODE (arg1) == BIT_AND_EXPR)
> +{
> +   tree a0, a1, l0, l1, n0, n1;
> +
> +   a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> +   a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> +
> +   l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> +   l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> +   
> +   n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
> +   n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
> +   
> +   if ((operand_equal_p (n0, a0, 0)
> +&& operand_equal_p (n1, a1, 0))
> +   || (operand_equal_p (n0, a1, 0)
> +   && operand_equal_p (n1, a0, 0)))
> + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);

First, you typoed BIT_XOR_EXPR in this first block.

Second, I don't see how you're arbitrarily choosing L0 and N1 in the
expansion.  If you write the expression the other way around,

  (~x & y) | (x & ~y)

don't you wind up with

  (~x ^ ~y)

?  Or do the extra NOT expressions get folded away anyway?

> +  if (TREE_CODE (arg0) == TREE_CODE (arg1)
> +   && (TREE_CODE (arg1) == TRUTH_AND_EXPR
> +   || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR))

I don't believe you want to apply this transformation with ANDIF.


r~

Re: [PATCH][ARM] Remove redundant code in arm.c

2011-04-20 Thread Richard Earnshaw


On Wed, 2011-04-20 at 13:55 +0100, Andrew Stubbs wrote:
> This patch removes some redundant code that caused me some confusion.
> 
> It's not possible to construct a constant from multiple ORN 
> instructions, just as it's not possible to do it with multiple AND 
> instructions.
> 
> OK?
> 
> Andrew

OK.

R.

Re: [patch, ARM] PR48250, rehaul arm_legitimize_reload_address()

2011-04-20 Thread Chung-Lin Tang

On 2011/4/20 11:12 PM, Richard Earnshaw wrote:
> 
> On Wed, 2011-04-20 at 23:06 +0800, Chung-Lin Tang wrote:
>> On 2011/4/20 09:24 PM, Richard Sandiford wrote:
>>> Hi Chung-Lin,
>>>
>>> I'm seeing an ICE with this patch, specifically;
>>>
>>> Chung-Lin Tang  writes:
>>>> +  if (coproc_p)
>>>> +  low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
>>>
>>> We generate:
>>>
>>> Reload 1: reload_out (V4SI) = (mem/c:V4SI (plus:SI (plus:SI (reg/f:SI 11 fp)
>>> (const_int -6144 [0xe800]))
>>> (const_int 1020 [0x3fc])) [43 %sfp+-5024 S16 A64])
>>>
>>> but 1020 isn't a legitimate offset for V4SI:
>>>
>>>   /* For quad modes, we restrict the constant offset to be slightly less
>>>  than what the instruction format permits.  We do this because for
>>>  quad mode moves, we will actually decompose them into two separate
>>>  double-mode reads or writes.  INDEX must therefore be a valid
>>>  (double-mode) offset and so should INDEX+8.  */
>>>   if (TARGET_NEON && VALID_NEON_QREG_MODE (mode))
>>> return (code == CONST_INT
>>> && INTVAL (index) < 1016
>>> && INTVAL (index) > -1024
>>> && (INTVAL (index) & 3) == 0);
>>>
>>> A simple "fix" would be to use 9 instead of 10, but something a little
>>> more subtle might be preferred :-)
>>>
>>> Richard
>>
>> Oh dear, for some reason I mistakenly thought that NEON had a quad-word
>> load/store, sorry :P
>>
>> Reducing from 10 to 9 may be a possible solution, if restricted to the
>> necessary cases. For example:
>>
>> -low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
>> +{
>> +  low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
>> +
>> +  /* NEON quad-word load/stores are made of two double-word accesses,
>> + so the valid index range is reduced by 8. Treat as 9-bit range if
>> + we go over it.  */
>> +  if (TARGET_NEON && VALID_NEON_QREG_MODE (mode) && low >= 1016)
>> +low = SIGN_MAG_LOW_ADDR_BITS (val, 9);
>> +}
>>
>> To Richard Earnshaw, what do you think of a fix like this? Or should we
>> just simply return false under this out-of-range case (it should be rare
>> I hope).
>>
> 
> I don't think it matters a great deal.  The above is fine.
> 
> Note, that some targets don't have LDRD either.  Do we do the right
> thing if we're going to fall back to two LDR instructions?
> 
> R.

The current non-TARGET_LDRD case goes through this path:
...
else
  /* For pre-ARMv5TE (without ldrd), we use ldm/stm(db/da/ib)
 to access doublewords. The supported load/store offsets are
 -8, -4, and 4, which we try to produce here.  */
  low = ((val & 0xf) ^ 0x8) - 0x8;

which uses ldm/stm. This should be safe.
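
For reference, the expression ((val & 0xf) ^ 0x8) - 0x8 sign-extends the low four bits of val, so the low part always lands in [-8, 7] and the remainder is a multiple of 16. Here is a standalone sketch, with a made-up function name, that can be used to try values:

```c
#include <assert.h>

/* Sign-extend the low 4 bits of VAL, as in the non-LDRD path above.  */
static long
low_part_no_ldrd (long val)
{
  long low = ((val & 0xf) ^ 0x8) - 0x8;

  assert (low >= -8 && low <= 7);
  assert ((val - low) % 16 == 0);   /* what is left is 16-aligned */
  return low;
}
```

For example, an offset of 12 splits into a high part of 16 and a low part of -4, while an offset of 4 stays as it is.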

As for pre-ARMv4 ldrh, this is special cased as:
if (arm_arch4)
  low = SIGN_MAG_LOW_ADDR_BITS (val, 8);
else
  {
 /* The storehi/movhi_bytes fallbacks can use only
[-4094,+4094] of the full ldrb/strb index range.  */
 low = SIGN_MAG_LOW_ADDR_BITS (val, 12);
 if (low == 4095 || low == -4095)
   return false;
  }

Although to be frank, I haven't really tested a pre-ARMv4 config; not
very easy to do so in an EABI world :)

I'll take the above NEON QREG mode fix as approved.
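
The quad-mode restriction quoted earlier in this thread can be restated as a small predicate to show why 1020 trips it. The function names are made up; the three conditions are copied from the quoted check:

```c
#include <assert.h>

/* Mirrors the quoted test: INDEX and INDEX + 8 must both be usable
   double-mode offsets, so the upper bound is 1016 rather than 1024.  */
static int
neon_quad_offset_ok (long index)
{
  return index < 1016 && index > -1024 && (index & 3) == 0;
}

/* A few spot checks around the boundaries.  */
static void
check_neon_quad_offsets (void)
{
  assert (!neon_quad_offset_ok (1020));   /* the offset from the ICE */
  assert (!neon_quad_offset_ok (1016));
  assert (neon_quad_offset_ok (1012));
  assert (neon_quad_offset_ok (-1020));
  assert (!neon_quad_offset_ok (-1024));
  assert (!neon_quad_offset_ok (6));      /* not a multiple of 4 */
}
```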

Chung-Lin

[PATCH][ARM] Add support for ADDW and SUBW instructions

2011-04-20 Thread Andrew Stubbs


This patch adds basic support for the Thumb ADDW and SUBW instructions.

The patch permits the compiler to use the new instructions for constants 
that can be loaded with a single instruction (i.e. 16-bit unshifted), 
but does not support use of addw with split-constants; I have a patch 
for that coming soon.
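
As a sketch of the range test involved, assuming the 12-bit unsigned immediate (0..4095) of the Thumb-2 ADDW/SUBW encodings: a constant qualifies if either it or its negation has no bits set above bit 11. The function name is made up, and unsigned arithmetic is used so that negating the most negative value is well defined:

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if I, or -I, fits in a 12-bit unsigned immediate.  */
static int
fits_addw_or_subw (int32_t i)
{
  uint32_t u = (uint32_t) i;
  uint32_t neg = 0u - u;          /* two's complement negation, no overflow */

  return (u & 0xfffff000u) == 0 || (neg & 0xfffff000u) == 0;
}

/* A few spot checks around the boundaries.  */
static void
check_addw_subw_range (void)
{
  assert (fits_addw_or_subw (0));
  assert (fits_addw_or_subw (4095));      /* addw #4095 */
  assert (fits_addw_or_subw (-4095));     /* subw #4095 */
  assert (!fits_addw_or_subw (4096));
  assert (!fits_addw_or_subw (-4096));
}
```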


This patch requires that my previously posted patch for MOVW is applied 
first.


OK?

Andrew
2011-04-20  Andrew Stubbs  

	gcc/
	* config/arm/arm-protos.h (const_ok_for_op): Add prototype.
	* config/arm/arm.c (const_ok_for_op): Add support for addw/subw.
	Remove prototype. Remove static function type.
	* config/arm/arm.md (*arm_addsi3): Add addw/subw support.
	Add arch attribute.
	(*arm_subsi3_insn): Add subw support.
	Add arch attribute.
	* config/arm/constraints.md (Pj, PJ): New constraints.

--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -46,6 +46,7 @@ extern bool arm_vector_mode_supported_p (enum machine_mode);
 extern bool arm_small_register_classes_for_mode_p (enum machine_mode);
 extern int arm_hard_regno_mode_ok (unsigned int, enum machine_mode);
 extern int const_ok_for_arm (HOST_WIDE_INT);
+extern int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
 extern int arm_split_constant (RTX_CODE, enum machine_mode, rtx,
 			   HOST_WIDE_INT, rtx, rtx, int);
 extern RTX_CODE arm_canonicalize_comparison (RTX_CODE, rtx *, rtx *);
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -82,7 +82,6 @@ inline static int thumb1_index_register_rtx_p (rtx, int);
 static bool arm_legitimate_address_p (enum machine_mode, rtx, bool);
 static int thumb_far_jump_used_p (void);
 static bool thumb_force_lr_save (void);
-static int const_ok_for_op (HOST_WIDE_INT, enum rtx_code);
 static rtx emit_sfm (int, int);
 static unsigned arm_size_return_regs (void);
 static bool arm_assemble_integer (rtx, unsigned int, int);
@@ -2453,7 +2452,7 @@ const_ok_for_arm (HOST_WIDE_INT i)
 }
 
 /* Return true if I is a valid constant for the operation CODE.  */
-static int
+int
 const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code)
 {
   if (const_ok_for_arm (i))
@@ -2469,6 +2468,13 @@ const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code)
 	return 0;
 
 case PLUS:
+  /* See if we can use addw or subw.  */
+  if (TARGET_THUMB2
+	  && ((i & 0xfffff000) == 0
+	  || ((-i) & 0xfffff000) == 0))
+	return 1;
+  /* else fall through.  */
+
 case COMPARE:
 case EQ:
 case NE:
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -707,21 +707,24 @@
 ;;  (plus (reg rN) (reg sp)) into (reg rN).  In this case reload will
 ;; put the duplicated register first, and not try the commutative version.
 (define_insn_and_split "*arm_addsi3"
-  [(set (match_operand:SI  0 "s_register_operand" "=r, k,r,r, k,r")
-	(plus:SI (match_operand:SI 1 "s_register_operand" "%rk,k,r,rk,k,rk")
-		 (match_operand:SI 2 "reg_or_int_operand" "rI,rI,k,L, L,?n")))]
+  [(set (match_operand:SI  0 "s_register_operand" "=r, k,r,r, k, r, k,r, k, r")
+	(plus:SI (match_operand:SI 1 "s_register_operand" "%rk,k,r,rk,k, rk,k,rk,k, rk")
+		 (match_operand:SI 2 "reg_or_int_operand" "rI,rI,k,Pj,Pj,L, L,PJ,PJ,?n")))]
   "TARGET_32BIT"
   "@
add%?\\t%0, %1, %2
add%?\\t%0, %1, %2
add%?\\t%0, %2, %1
+   addw%?\\t%0, %1, %2
+   addw%?\\t%0, %1, %2
sub%?\\t%0, %1, #%n2
sub%?\\t%0, %1, #%n2
+   subw%?\\t%0, %1, #%n2
+   subw%?\\t%0, %1, #%n2
#"
   "TARGET_32BIT
&& GET_CODE (operands[2]) == CONST_INT
-   && !(const_ok_for_arm (INTVAL (operands[2]))
-|| const_ok_for_arm (-INTVAL (operands[2])))
+   && !const_ok_for_op (INTVAL (operands[2]), PLUS)
&& (reload_completed || !arm_eliminable_register (operands[1]))"
   [(clobber (const_int 0))]
   "
@@ -730,8 +733,9 @@
 		  operands[1], 0);
   DONE;
   "
-  [(set_attr "length" "4,4,4,4,4,16")
-   (set_attr "predicable" "yes")]
+  [(set_attr "length" "4,4,4,4,4,4,4,4,4,16")
+   (set_attr "predicable" "yes")
+   (set_attr "arch" "*,*,*,t2,t2,*,*,t2,t2,*")]
 )
 
 (define_insn_and_split "*thumb1_addsi3"
@@ -1184,28 +1188,33 @@
 
 ; ??? Check Thumb-2 split length
 (define_insn_and_split "*arm_subsi3_insn"
-  [(set (match_operand:SI   0 "s_register_operand" "=r,r,rk,r,r")
-	(minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,k,?n,r")
-		  (match_operand:SI 2 "reg_or_int_operand" "r,rI,r, r,?n")))]
+  [(set (match_operand:SI   0 "s_register_operand" "=r,r,rk,r, k, r,r")
+	(minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,k, rk,k, ?n,r")
+		  (match_operand:SI 2 "reg_or_int_operand" "r,rI,r, Pj,Pj,r,?n")))]
   "TARGET_32BIT"
   "@
rsb%?\\t%0, %2, %1
sub%?\\t%0, %1, %2
sub%?\\t%0, %1, %2
+   subw%?\\t%0, %1, %2
+   subw%?\\t%0, %1, %2
#
#"
   "&& ((GET_CODE (operands[1]) == CONST_INT
-   	&& !const_ok_for_arm (INTVAL (operands[1])))
+   	&& !(const_ok_for_arm (INTVAL (operands[1]))
+	 || satisfies_constraint_Pj (operands[2])))
|| (GET_CODE (operands[2]) == CONST_INT
-

libgo patch committed: Remove empty directory

2011-04-20 Thread Ian Lance Taylor

I committed a patch to remove the now-empty directory
libgo/go/crypto/block.

Ian

Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)

2011-04-20 Thread Jan Hubicka

> On Wed, 20 Apr 2011, Jan Hubicka wrote:
> 
> > > Hi,
> > > 
> > > On Tue, Apr 19, 2011 at 02:15:18AM +0200, Jan Hubicka wrote:
> > > > Actually what happens here is that CCP devirtualize by propagating the
> > > > constructors and due to Richard's new code to drop OBJ_TYPE_REF we 
> > > > finally get
> > > > a direct call.  This is all good and desirable.
> > > > 
> > > > I think good solution would be to fold further and inline the thunk
> > > > adjustment, just like the type based devirtualization does.  Even
> > > > once I get far enough with my cgraph cleanuping project to make
> > > > cgraph represent thunks nicely, we would win if in these cases ccp
> > > > and other passes simply inlined the this adjustment, like we do with
> > > > type based devirtualization already.
> > > 
> > > > Martin, I guess it is matter of looking up the thunk info by
> > > > associated cgraph node alias and extending fold_stmts of passes that
> > > > now drop the OBJ_TYPE_REF wrappers?
> > > 
> > > Well, if you have a cgraph node then yes.  But if the method is
> > > implemented in a different compilation unit you don't.  And as I
> > > already said today on IRC, I don't think it is possible to tell
> > > whether a function is a thunk by looking at the decl alone (the front
> > > end has a flag for it as Jakub noted, though), let alone what kind of
> > > thunk it is.
> > 
> > Well, you don't care about thunks residing in other unit/partition...
> 
> Sure you do - LTO might bring them into scope if you fold them to 
> a direct call early.

Well, we do have a way to represent the thunk adjustment on edges, and lto-symtab
can just feed the data in while merging an external decl with a thunk.
Honza
> 
> Richard.

Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)

2011-04-20 Thread Martin Jambor

Hi,

On Wed, Apr 20, 2011 at 04:38:25PM +0200, Jan Hubicka wrote:
> > Hi,
> > 
> > On Tue, Apr 19, 2011 at 02:15:18AM +0200, Jan Hubicka wrote:
> > > Actually what happens here is that CCP devirtualizes by propagating the
> > > constructors and due to Richard's new code to drop OBJ_TYPE_REF we
> > > finally get a direct call.  This is all good and desirable.
> > > 
> > > I think good solution would be to fold further and inline the thunk
> > > adjustment, just like the type based devirtualization does.  Even
> > > once I get far enough with my cgraph cleanuping project to make
> > > cgraph represent thunks nicely, we would win if in these cases ccp
> > > and other passes simply inlined the this adjustment, like we do with
> > > type based devirtualization already.
> > 
> > > Martin, I guess it is matter of looking up the thunk info by
> > > associated cgraph node alias and extending fold_stmts of passes that
> > > now drop the OBJ_TYPE_REF wrappers?
> > 
> > Well, if you have a cgraph node then yes.  But if the method is
> > implemented in a different compilation unit you don't.  And as I
> > already said today on IRC, I don't think it is possible to tell
> > whether a function is a thunk by looking at the decl alone (the front
> > end has a flag for it as Jakub noted, though), let alone what kind of
> > thunk it is.
> 
> Well, you don't care about thunks residing in other unit/partition...
> 

Unless you fold in early optimizations and LTO later, deciding to
inline the function but forgetting about the thunk adjustment.

Martin
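
To make the "thunk adjustment" under discussion concrete, here is a minimal sketch in plain C (not GCC internals, and not taken from any patch in this thread). It emulates a this-adjusting thunk by hand: the caller holds a pointer to a secondary base subobject, and the thunk must subtract that subobject's offset before entering the real method. All struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Two hypothetical "base classes" and a "derived class" laid out
   the way a C++ front end might lay out multiple inheritance.  */
struct base_a { int a; };
struct base_b { int b; };
struct derived
{
  struct base_a a_part;
  struct base_b b_part;   /* secondary base, at a non-zero offset */
  int payload;
};

/* The real method: expects a pointer to the full derived object.  */
static int
derived_get_payload (struct derived *self)
{
  return self->payload;
}

/* Hand-written stand-in for a this-adjusting thunk: it receives a
   pointer to the base_b subobject, moves it back to the start of
   the enclosing derived object, then calls the real method.  */
static int
derived_get_payload_thunk (struct base_b *b_this)
{
  struct derived *self
    = (struct derived *) ((char *) b_this - offsetof (struct derived, b_part));
  return derived_get_payload (self);
}
```

If a pass replaced the call to the thunk with a direct call to the real method and dropped the subtraction, the method would read its fields at the wrong offset, which is the failure mode described above.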

[patch middle-end]: Missed optimization for (x & ~y) | (~x & y)

2011-04-20 Thread Kai Tietz

Hello,

Well, the bonus points might go to somebody else ... But this adds a
missing tree-level optimization, implemented in fold-const.

ChangeLog gcc/

2011-04-20  Kai Tietz

* fold-const.c (fold_binary_loc): Add handling for
(X & ~Y) | (~X & Y) and (X && !Y) || (!X && Y) optimization
to (X ^ Y).

ChangeLog gcc/testsuite

2011-04-20  Kai Tietz

* gcc.dg/binop-xor1.c: New test.
* gcc.dg/binop-xor2.c: New test.
* gcc.dg/binop-xor3.c: New test.
* gcc.dg/binop-xor4.c: New test.
* gcc.dg/binop-xor5.c: New test.

Tested for i686-w64-mingw32, x86_64-w64-mingw32, and
x86_64-pc-linux-gnu (multilib). OK to apply?

Regards,
Kai
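
As a quick sanity check of the two identities named in the ChangeLog, the following standalone C sketch (not part of the patch; the function names are made up for illustration) compares both forms against `^` for every pair of 8-bit values. The bitwise form is checked against bitwise XOR, the logical form against XOR of the truth values.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise form: (x & ~y) | (~x & y).  */
static uint8_t
xor_via_and_or (uint8_t x, uint8_t y)
{
  return (uint8_t) ((x & ~y) | (~x & y));
}

/* Logical form: (x && !y) || (!x && y); the result is always 0 or 1.  */
static int
logical_xor_via_and_or (int x, int y)
{
  return (x && !y) || (!x && y);
}

/* Exhaustively compare both forms against ^ for all 8-bit pairs.  */
static void
check_xor_identities (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        assert (xor_via_and_or ((uint8_t) x, (uint8_t) y)
                == (uint8_t) (x ^ y));
        assert (logical_xor_via_and_or ((int) x, (int) y)
                == ((x != 0) ^ (y != 0)));
      }
}
```

Note that the two forms disagree on non-boolean inputs: for x = 2 and y = 4 the bitwise form yields 6 while the logical form yields 0, so the bitwise and truth-value cases need distinct XOR operations.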
Index: gcc/gcc/fold-const.c
===
--- gcc.orig/gcc/fold-const.c   2011-04-20 17:10:39.478091900 +0200
+++ gcc/gcc/fold-const.c   2011-04-20 17:11:22.901039400 +0200
@@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
 
+  /* (X & ~Y) | (~X & Y) is X ^ Y */
+  if (TREE_CODE (arg0) == BIT_AND_EXPR
+ && TREE_CODE (arg1) == BIT_AND_EXPR)
+{
+ tree a0, a1, l0, l1, n0, n1;
+
+ a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+ a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+ l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+ 
+ n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
+ n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
+ 
+ if ((operand_equal_p (n0, a0, 0)
+  && operand_equal_p (n1, a1, 0))
+ || (operand_equal_p (n0, a1, 0)
+ && operand_equal_p (n1, a0, 0)))
+   return fold_build2_loc (loc, BIT_XOR_EXPR, type, l0, n1);
+   }
+
   t1 = distribute_bit_expr (loc, code, type, arg0, arg1);
   if (t1 != NULL_TREE)
return t1;
@@ -12039,6 +12061,28 @@ fold_binary_loc (location_t loc,
  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
return omit_one_operand_loc (loc, type, integer_one_node, arg0);
 
+  /* (X && !Y) || (!X && Y) is X ^ Y */
+  if (TREE_CODE (arg0) == TREE_CODE (arg1)
+ && (TREE_CODE (arg1) == TRUTH_AND_EXPR
+ || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR))
+{
+ tree a0, a1, l0, l1, n0, n1;
+
+ a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+ a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+ l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+ 
+ n0 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l0);
+ n1 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l1);
+ 
+ if ((operand_equal_p (n0, a0, 0)
+  && operand_equal_p (n1, a1, 0))
+ || (operand_equal_p (n0, a1, 0)
+ && operand_equal_p (n1, a0, 0)))
+   return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
+   }
   goto truth_andor;
 
 case TRUTH_XOR_EXPR:
Index: gcc/gcc/testsuite/gcc.dg/binop-xor1.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ gcc/gcc/testsuite/gcc.dg/binop-xor1.c   2011-04-20 17:11:22.905039900 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+  return ((a && !b && c) || (!a && b && c));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+   it in the real test.  */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 5 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor2.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ gcc/gcc/testsuite/gcc.dg/binop-xor2.c   2011-04-20 17:11:22.908540300 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b)
+{
+  return ((a & ~b) | (~a & b));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+   it in the real test.  */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor3.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ gcc/gcc/testsuite/gcc.dg/binop-xor3.c   2011-04-20 17:11:22.911040600 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */

Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)

2011-04-20 Thread Richard Guenther

On Wed, 20 Apr 2011, Jan Hubicka wrote:

> > Hi,
> > 
> > On Tue, Apr 19, 2011 at 02:15:18AM +0200, Jan Hubicka wrote:
> > > Actually what happens here is that CCP devirtualizes by propagating the
> > > constructors and due to Richard's new code to drop OBJ_TYPE_REF we
> > > finally get a direct call.  This is all good and desirable.
> > > 
> > > I think good solution would be to fold further and inline the thunk
> > > adjustment, just like the type based devirtualization does.  Even
> > > once I get far enough with my cgraph cleanuping project to make
> > > cgraph represent thunks nicely, we would win if in these cases ccp
> > > and other passes simply inlined the this adjustment, like we do with
> > > type based devirtualization already.
> > 
> > > Martin, I guess it is matter of looking up the thunk info by
> > > associated cgraph node alias and extending fold_stmts of passes that
> > > now drop the OBJ_TYPE_REF wrappers?
> > 
> > Well, if you have a cgraph node then yes.  But if the method is
> > implemented in a different compilation unit you don't.  And as I
> > already said today on IRC, I don't think it is possible to tell
> > whether a function is a thunk by looking at the decl alone (the front
> > end has a flag for it as Jakub noted, though), let alone what kind of
> > thunk it is.
> 
> Well, you don't care about thunks residing in other unit/partition...

Sure you do - LTO might bring them into scope if you fold them to 
a direct call early.

Richard.

Re: better wpa [1/n]: merge types during read-in

2011-04-20 Thread Michael Matz

Hi,

On Wed, 20 Apr 2011, Richard Guenther wrote:

> >> If t is a type, why fix up its field if it may not be the canonical 
> >> variant?
> >
> > Because type merging to work sometimes requires already canonicalized
> > fields, at least that's what I found in investigating why some types
> > weren't merged that should have been.  Hence I'm first canonicalizing all
> > fields of everything and then see if something merged.
> 
> That sounds like a bug in type-merging.  You don't happen to have
> a small testcase? ;)

cc1 was my testcase :-/

> > Think shared field_decl chains.  I'll have fixed up the chain for one of
> > the type pairs already and can later come to a type referring exactly the
> > same field_decls again.
> 
> But only in case the first one is already equal.  What I wanted to say is
> that we shouldn't have partially shared chains, so
> 
>   if (TYPE_FIELDS (t) != TYPE_FIELDS (oldt))
>     for (...)
>       if (TREE_CODE (f1) == FIELD_DECL)
>         ...
> 
> should be enough, no?

Indeed.

> In fact, why restrict fixing up the cache to
> FIELD_DECLs and not also do it for TYPE_DECLs or FUNCTION_DECLs
> that may reside in this chain?

Non-FIELD_DECLs are removed in free_lang_data, which is all the more
reason to remove the test for FIELD_DECL.

> > Hmm, it's gross but seems to me still required for the diagnostic and 
> > to emit the VIEW_CONVERT_EXPR, at least for invalid input code.  OTOH 
> > if the streamed out code ensures that a field_decl in a component_ref 
> > always is included in its DECL_CONTEXT, then the new merging should 
> > indeed make sure that this also holds after streaming in.  Do we have 
> > testcases specifically trying this code?  grepping for "mismatching" in 
> > testsuite/ doesn't show anything relevant.
> 
> The lto testsuite harness doesn't support dg-error/warning, so there are 
> no testcases.  There are testcases that ICEd (type verification) before 
> introducing these fixups though.

Okay, I'll leave investigating this to a follow-up.


Ciao,
Michael.

Re: [patch, ARM] PR48250, rehaul arm_legitimize_reload_address()

2011-04-20 Thread Richard Earnshaw


On Wed, 2011-04-20 at 23:06 +0800, Chung-Lin Tang wrote:
> On 2011/4/20 09:24 PM, Richard Sandiford wrote:
> > Hi Chung-Lin,
> > 
> > I'm seeing an ICE with this patch, specifically;
> > 
> > Chung-Lin Tang  writes:
> >> +  if (coproc_p)
> >> +  low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
> > 
> > We generate:
> > 
> > > Reload 1: reload_out (V4SI) = (mem/c:V4SI (plus:SI (plus:SI (reg/f:SI 11 fp)
> > >             (const_int -6144 [0xe800]))
> > >         (const_int 1020 [0x3fc])) [43 %sfp+-5024 S16 A64])
> > 
> > but 1020 isn't a legitimate offset for V4SI:
> > 
> >   /* For quad modes, we restrict the constant offset to be slightly less
> >  than what the instruction format permits.  We do this because for
> >  quad mode moves, we will actually decompose them into two separate
> >  double-mode reads or writes.  INDEX must therefore be a valid
> >  (double-mode) offset and so should INDEX+8.  */
> >   if (TARGET_NEON && VALID_NEON_QREG_MODE (mode))
> > return (code == CONST_INT
> > && INTVAL (index) < 1016
> > && INTVAL (index) > -1024
> > && (INTVAL (index) & 3) == 0);
> > 
> > A simple "fix" would be to use 9 instead of 10, but something a little
> > more subtle might be preferred :-)
> > 
> > Richard
> 
> Oh dear, for some reason I mistakenly thought that NEON had a quad-word
> load/store, sorry :P
> 
> Reducing from 10 to 9 may be a possible solution, if restricted to the
> necessary cases. For example:
> 
> -low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
> +{
> +  low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
> +
> +  /* NEON quad-word load/stores are made of two double-word accesses,
> + so the valid index range is reduced by 8. Treat as 9-bit range if
> + we go over it.  */
> +  if (TARGET_NEON && VALID_NEON_QREG_MODE (mode) && low >= 1016)
> +low = SIGN_MAG_LOW_ADDR_BITS (val, 9);
> +}
> 
> To Richard Earnshaw, what do you think of a fix like this? Or should we
> simply return false in this out-of-range case (it should be rare,
> I hope)?
> 

I don't think it matters a great deal.  The above is fine.

Note that some targets don't have LDRD either.  Do we do the right
thing if we're going to fall back to two LDR instructions?

R.

Re: [pph] Namespaces, step 1. Trace formatting. (issue4433054)

2011-04-20 Thread dnovillo



http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c
File gcc/cp/pph-streamer.c (right):

http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c#newcode144
gcc/cp/pph-streamer.c:144: return;
+  if ((type == PPH_TRACE_TREE || type == PPH_TRACE_CHAIN)
+  && !data && flag_pph_tracer <= 3)
+return;

Line up the predicates vertically.

http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.c#newcode172
gcc/cp/pph-streamer.c:172: fprintf (pph_logfile, ", code=%s", tree_code_name[TREE_CODE (t)]);
   case PPH_TRACE_REF:
+  {
+   const_tree t = (const_tree) data;
+   if (t)
+ {
+   print_generic_expr (pph_logfile, CONST_CAST (union tree_node *, t),
+   0);
+   fprintf (pph_logfile, ", code=%s", tree_code_name[TREE_CODE (t)]);


But how are we going to tell if this is a REF instead of a tree?  The
output seems identical to the PPH_TRACE_TREE case.

http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h
File gcc/cp/pph-streamer.h (right):

http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h#newcode149
gcc/cp/pph-streamer.h:149: }
pph_output_tree_lst (pph_stream *stream, tree t, bool ref_p)
+{
+  if (flag_pph_tracer >= 2)
+pph_stream_trace_tree (stream, t, ref_p);
+  lto_output_tree (stream->ob, t, ref_p);
+}

I don't really like all this code duplication.  Wouldn't it be better if
instead of having pph_output_tree_aux and pph_output_tree_lst, we added
another argument to pph_output_tree?  The argument would be an enum and
we could have a default 'DONT_CARE' value.
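
One way to picture the suggestion is the rough sketch below. It is self-contained C with stub types, not the real pph or LTO streamer API: `demo_stream`, `tree_kind_hint`, and `demo_output_tree` are all hypothetical stand-ins, and only the shape (a single entry point taking an enum with a "don't care" default) reflects the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical hint replacing the separate _aux and _lst variants.  */
enum tree_kind_hint
{
  HINT_DONT_CARE,   /* default: caller has nothing to say */
  HINT_AUX,
  HINT_LST
};

/* Stub stream; the real pph_stream carries far more state.  */
struct demo_stream
{
  int trace_level;
  int trees_written;
  enum tree_kind_hint last_hint;
};

/* Single entry point: trace if requested, then "write" the tree.
   The write is simulated by bumping a counter.  */
static void
demo_output_tree (struct demo_stream *stream, const char *tree_name,
                  bool ref_p, enum tree_kind_hint hint)
{
  if (stream->trace_level >= 2)
    fprintf (stderr, "tree=%s ref_p=%d hint=%d\n",
             tree_name, (int) ref_p, (int) hint);
  stream->last_hint = hint;
  stream->trees_written++;
}
```

Callers that previously picked between wrapper functions would instead pass `HINT_AUX`, `HINT_LST`, or `HINT_DONT_CARE`, so the tracing check lives in one place.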

http://codereview.appspot.com/4433054/diff/1/gcc/cp/pph-streamer.h#newcode298
gcc/cp/pph-streamer.h:298: pph_stream_trace_tree (stream, t, false); /* FIXME pph: always false? */
@@ -285,7 +295,7 @@ pph_input_tree (pph_stream *stream)
 {
   tree t = lto_input_tree (stream->ib, stream->data_in);
   if (flag_pph_tracer >= 4)
-pph_stream_trace_tree (stream, t);
+pph_stream_trace_tree (stream, t, false); /* FIXME pph: always false? */

Yes, on input we can't tell if we read a reference or a real tree.  We
could, but not at this level.  That's inside the actual LTO streaming
code.

http://codereview.appspot.com/4433054/

Re: [patch, ARM] PR48250, rehaul arm_legitimize_reload_address()

2011-04-20 Thread Chung-Lin Tang

On 2011/4/20 09:24 PM, Richard Sandiford wrote:
> Hi Chung-Lin,
> 
> I'm seeing an ICE with this patch, specifically;
> 
> Chung-Lin Tang  writes:
>> +  if (coproc_p)
>> +low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
> 
> We generate:
> 
> Reload 1: reload_out (V4SI) = (mem/c:V4SI (plus:SI (plus:SI (reg/f:SI 11 fp)
>             (const_int -6144 [0xe800]))
>         (const_int 1020 [0x3fc])) [43 %sfp+-5024 S16 A64])
> 
> but 1020 isn't a legitimate offset for V4SI:
> 
>   /* For quad modes, we restrict the constant offset to be slightly less
>  than what the instruction format permits.  We do this because for
>  quad mode moves, we will actually decompose them into two separate
>  double-mode reads or writes.  INDEX must therefore be a valid
>  (double-mode) offset and so should INDEX+8.  */
>   if (TARGET_NEON && VALID_NEON_QREG_MODE (mode))
> return (code == CONST_INT
>   && INTVAL (index) < 1016
>   && INTVAL (index) > -1024
>   && (INTVAL (index) & 3) == 0);
> 
> A simple "fix" would be to use 9 instead of 10, but something a little
> more subtle might be preferred :-)
> 
> Richard

Oh dear, for some reason I mistakenly thought that NEON had a quad-word
load/store, sorry :P

Reducing from 10 to 9 may be a possible solution, if restricted to the
necessary cases. For example:

-low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
+{
+  low = SIGN_MAG_LOW_ADDR_BITS (val, 10);
+
+  /* NEON quad-word load/stores are made of two double-word accesses,
+ so the valid index range is reduced by 8. Treat as 9-bit range if
+ we go over it.  */
+  if (TARGET_NEON && VALID_NEON_QREG_MODE (mode) && low >= 1016)
+low = SIGN_MAG_LOW_ADDR_BITS (val, 9);
+}

To Richard Earnshaw, what do you think of a fix like this? Or should we
simply return false in this out-of-range case (it should be rare,
I hope)?

Thanks,
Chung-Lin
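
For readers following the offset arithmetic, here is a small standalone C sketch of why 1020 is rejected. `quad_offset_ok` mirrors the quoted check for quad modes; `double_offset_ok` is an assumption on my part (a word-aligned offset strictly between -1024 and 1024), chosen so that "INDEX and INDEX+8 are both valid double-mode offsets" reproduces the quoted bounds. Neither function is code from arm.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed double-mode rule: word-aligned, strictly inside +/-1024.  */
static bool
double_offset_ok (long index)
{
  return index > -1024 && index < 1024 && (index & 3) == 0;
}

/* Quad-mode rule as quoted above: upper bound tightened to 1016.  */
static bool
quad_offset_ok (long index)
{
  return index < 1016 && index > -1024 && (index & 3) == 0;
}

/* A quad access is split into double accesses at INDEX and INDEX+8,
   so the quad rule should equal "both halves are valid".  */
static void
check_quad_rule (void)
{
  for (long i = -2048; i <= 2048; i++)
    assert (quad_offset_ok (i)
            == (double_offset_ok (i) && double_offset_ok (i + 8)));
}
```

Under these assumptions 1020 is a fine double-mode offset but fails as a quad-mode one, because the second half would sit at 1028.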

[vms/committed]: fix ICE on alpha-vms

2011-04-20 Thread Tristan Gingold

Hi,

This patch fixes a compiler crash for alpha-vms.

Back-ends should not lie to the middle-end by defining macros to plain abort 
since the middle-end is entitled to infer properties from their existence.
The correct thing to do is not to define the macros in the first place.
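
A minimal sketch of the mechanism, using a hypothetical macro name rather than the real target macro: generic code often tests for a hook with `#ifdef`, so a definition that merely aborts still flips that test and changes what the generic code believes the target can do.

```c
#include <assert.h>
#include <stdbool.h>

/* Uncommenting the next line mimics a back end that "defines" a hook
   it cannot actually honour (HYPOTHETICAL_OUTPUT_DIFF_ELT is a
   made-up name for this illustration):

   #define HYPOTHETICAL_OUTPUT_DIFF_ELT(value, rel) abort ()  */

/* Generic code infers a capability purely from the macro's existence;
   it never looks at what the macro expands to.  */
static bool
target_supports_diff_tables (void)
{
#ifdef HYPOTHETICAL_OUTPUT_DIFF_ELT
  return true;
#else
  return false;
#endif
}
```

With the macro left undefined the function reports false, which is the honest answer for a target that cannot emit such tables.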

Committed on trunk.

Tristan.

2011-04-20  Eric Botcazou  

* config/alpha/vms.h (ASM_OUTPUT_ADDR_DIFF_ELT): Do not redefine.

*** gcc/config/alpha/vms.h.0 2011-02-19 16:51:47.0 +0100
--- gcc/config/alpha/vms.h  2011-02-19 16:52:07.0 +0100
*** typedef struct {int num_args; enum avms_
*** 202,208 
 asm (SECTION_OP "\n\t.long " #FUNC"\n");
  
  #undef ASM_OUTPUT_ADDR_DIFF_ELT
- #define ASM_OUTPUT_ADDR_DIFF_ELT(FILE, BODY, VALUE, REL) gcc_unreachable ()
  
  #undef ASM_OUTPUT_ADDR_VEC_ELT
  #define ASM_OUTPUT_ADDR_VEC_ELT(FILE, VALUE) \
--- 202,207

Re: [google]Pass --save-temps to the assembler (issue4436049)

2011-04-20 Thread Diego Novillo

On Tue, Apr 19, 2011 at 20:32, Easwaran Raman  wrote:

> The revised patch has a comment that this should be used with an
> assembler wrapper that can recognize --save-temps.

Thanks.  Will commit after testing finishes.


Diego.

Re: [RFA] MIPS 24K Errata Support

2011-04-20 Thread Richard Sandiford

Catherine Moore  writes:
> +Work around the 24K E48 Lost Data on Stores during Refill errata.

I think this should either be:

  Work around the 24K E48 (@cite{Lost Data on Stores During Refill}) errata.

or:

  Work around the 24K E48 (lost data on stores during refill) errata.

Maybe the second is safer.

> @@ -479,7 +479,9 @@ (define_attr "length" ""
> (eq_attr "move_type" "load,fpload")
> (symbol_ref "mips_load_store_insns (operands[1], insn) * 4")
> (eq_attr "move_type" "store,fpstore")
> -   (symbol_ref "mips_load_store_insns (operands[0], insn) * 4")
> + (cond [(eq (symbol_ref "TARGET_FIX_24K") (const_int 0))
> +(symbol_ref "mips_load_store_insns (operands[0], insn) * 4")]
> +(symbol_ref "mips_load_store_insns (operands[0], insn) * 4 + 4"))

Keep the existing indentation (i.e. move the new block two spaces
to the left).  Sorry for being so picky...

OK with those changes, thanks.

Richard

Re: [PATCH] Fix incorrect devirtualization (PR middle-end/48661)

2011-04-20 Thread Jan Hubicka

> Hi,
> 
> On Tue, Apr 19, 2011 at 02:15:18AM +0200, Jan Hubicka wrote:
> > Actually what happens here is that CCP devirtualizes by propagating the
> > constructors and due to Richard's new code to drop OBJ_TYPE_REF we finally
> > get a direct call.  This is all good and desirable.
> > 
> > I think good solution would be to fold further and inline the thunk
> > adjustment, just like the type based devirtualization does.  Even
> > once I get far enough with my cgraph cleanuping project to make
> > cgraph represent thunks nicely, we would win if in these cases ccp
> > and other passes simply inlined the this adjustment, like we do with
> > type based devirtualization already.
> 
> > Martin, I guess it is matter of looking up the thunk info by
> > associated cgraph node alias and extending fold_stmts of passes that
> > now drop the OBJ_TYPE_REF wrappers?
> 
> Well, if you have a cgraph node then yes.  But if the method is
> implemented in a different compilation unit you don't.  And as I
> already said today on IRC, I don't think it is possible to tell
> whether a function is a thunk by looking at the decl alone (the front
> end has a flag for it as Jakub noted, though), let alone what kind of
> thunk it is.

Well, you don't care about thunks residing in other unit/partition...

Honza
> 
> The more I think about this the more I would also like to make thunks
> as ordinary real functions as possible, with perhaps some kind of
> totally opaque decls/cgraph_nodes for the most obscure types which
> could be generated by assembly.
> 
> Martin

[committed/vms]: do not use vms-dwarf2.o for gnu-ld

2011-04-20 Thread Tristan Gingold

Hi,

When gnu-ld is used, we do not need the extra vms-dwarf2.o file, which is
needed only by the native VMS linker.

Committed on trunk.

Tristan.

2011-04-20  Tristan Gingold  

* config/alpha/vms.h (LINK_SPEC): Do not use vms-dwarf2.o for gnu-ld.


===
--- config/alpha/vms.h  (revision 172769)
+++ config/alpha/vms.h  (working copy)
@@ -329,11 +329,16 @@
 }   \
 } while (0)
 
+#undef LINK_SPEC
+#if HAVE_GNU_LD
+/* GNU-ld built-in linker script already handles the dwarf2 debug sections.  */
+#define LINK_SPEC "%{shared} %{v}"
+#else
 /* Link with vms-dwarf2.o if -g (except -g0). This causes the
VMS link to pull all the dwarf2 debug sections together.  */
-#undef LINK_SPEC
 #define LINK_SPEC "%{g:-g vms-dwarf2.o%s} %{g0} %{g1:-g1 vms-dwarf2.o%s} \
 %{g2:-g2 vms-dwarf2.o%s} %{g3:-g3 vms-dwarf2.o%s} %{shared} %{v} %{map}"
+#endif
 
 #undef STARTFILE_SPEC
 #define STARTFILE_SPEC \
