Re: C++14 digit separators..

2013-10-28 Thread Václav Zeman
On 10/28/2013 03:51 AM, Ed Smith-Rowland wrote:
 Here is an implementation for C++14 digit separators (single quote).
 
 It's still testing on x86_64-linux but I wanted to give folks a chance
 to check it out.
 
 Ed
 
 Index: gcc/testsuite/g++.dg/cpp1y/digit-sep-neg.C
 ===
 --- gcc/testsuite/g++.dg/cpp1y/digit-sep-neg.C(revision 0)
 +++ gcc/testsuite/g++.dg/cpp1y/digit-sep-neg.C(working copy)
 @@ -0,0 +1,17 @@
 +// { dg-options -std=c++1y }
 +
 +#define assert(E) if(!(E))__builtin_abort();
 +
 +int
 +main()
 +{
 +  (1048576 == 1048''576);
 +  (1048576 == 0X'10);
 +  (1048576 == 0x'10);
 +  (1048576 == 0'004'000'000);
 +  (1048576 == 0B1);
 +  (1048576 == 0b0001''''');
 +
 +  (1.'602'176'565e-19 == 1.602176565e-19);
 +  (1.602'176'565e'-19 == 1.602176565e-19);
 +}
These tests seem to be missing the 'assert' there.

-- 
VZ




signature.asc
Description: OpenPGP digital signature


Re: [PATCH, PR 53001] Re: Patch to split out new warning flag for floating point conversion

2013-10-28 Thread Dodji Seketeli
Hello Joshua,

Joshua J Cogliati jrinc...@yahoo.com writes:

 I am not certain that c.opt was modified correctly.

I don't see any problem with the c.opt part.  So unless another
maintainer says otherwise, I'd say this is OK.

 1. warn_float_patch_and_new_testcase.diff
 This adds a new testcase and checks for float-conversion in the
 warning.  This will add somewhat more time for running the testcases
 compared to version 1 while still testing more or less the same code
 paths.  This does however check that the warning occurs when
 -Wconversion is not used.

As said earlier, I'd really prefer this approach because it leaves the
existing tests alone and just adds new one.  I guess we cannot do much
at this point for the speed concern (okay, we could try and make the
test harness run more tests in parallel but that is a separate
discussion) and I think Joseph agrees too.  So please let's stick to
this approach.


 Changelog for warn_float_patch_and_new_testcase.diff and
 warn_float_patch_and_new_testcase2.diff:
 
   Splitting out a -Wfloat-conversion from -Wconversion for
   conversions that lower floating point number precision
   or conversion from floating point numbers to integers
   * c-family/c-common.c Switching unsafe_conversion_p to
   return an enumeration with more detail, and conversion_warning
   to use this information.

Please, strictly follow the same format as the others entries in the
ChangeLog file.  That is, explicitly write the names of the functions
you changed, in parenthesis, followed by a colon.  That would make the
entry look like:

* c-family/c-common.c (unsafe_conversion_p): Make this function
return an enumeration with more detail.
(conversion_warning): Use the new return type of
unsafe_conversion_p to separately warn either about conversions
that lower floating point number precision or about the other
kinds of conversions.


   * c-family/c-common.h Adding conversion_safety enumeration
   and switching return type of unsafe_conversion_p

Likewise.

   * c-family/c.opt Adding new warning float-conversion and
   enabling it -Wconversion
   * doc/invoke.texi Adding documentation about
   -Wfloat-conversion

Likewise (colon missing at the end of the file name).


   * testsuite/c-c++-common/Wfloat-conversion.c Copies relevant
   tests from c-c++-common/Wconversion-real.c,
   gcc.dg/Wconversion-real-integer.c and gcc.dg/pr35635.c into
   new testcase for ones that are warned about by
   -Wfloat-conversion


You need to add the above ChangeLog entry to the ChangeLog file in
gcc/testsuite/ChangeLog.  Thus, the entry would look like (note how the
prefix testsuite/ is removed from the path name):

* c-c++-common/Wfloat-conversion.c: New test file.  Its content
started as a copy of c-c++-common/Wconversion-real.c,
gcc.dg/Wconversion-real-integer.c and gcc.dg/pr35635.c.  It has
been augmented by new tests to exercise the -Wfloat-conversion
option.


 Index: gcc/testsuite/c-c++-common/Wfloat-conversion.c
 ===
 --- gcc/testsuite/c-c++-common/Wfloat-conversion.c(working copy)
 +++ gcc/testsuite/c-c++-common/Wfloat-conversion.c(working copy)
 @@ -1,85 +1,58 @@
  /* Test for diagnostics for Wconversion for floating-point.  */

Hmmh.  I think this diff hunk should say that this is a new file.  Here
what it says is that it's a modification of an existing file.

Thank you for your time on this.

-- 
Dodji


RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2

2013-10-28 Thread Bernd Edlinger
Hi Sandra,


On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote:

 On 10/18/2013 10:38 AM, Richard Biener wrote:
 Sandra Loosemore san...@codesourcery.com wrote:
 On 10/18/2013 04:50 AM, Richard Biener wrote:
 On Sat, Sep 28, 2013 at 4:19 AM, Sandra Loosemore
 san...@codesourcery.com wrote:
 This patch fixes various -fstrict-volatile-bitfields wrong-code
 bugs,
 including instances where bitfield values were being quietly
 truncated as
 well as bugs relating to using the wrong width. The code changes
 are
 identical to those in the previous version of this patch series
 (originally
 posted in June and re-pinged many times since then), but for this
 version I
 have cleaned up the test cases to remove dependencies on header
 files, as
 Bernd requested.

 Ok.

 Just to clarify, is this approval unconditional, independent of part 2
 and other patches or changes still under active discussion?

 Yes.

 Hr. After some further testing, I'm afraid this patch is still
 buggy. :-(

 I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new
 pr56997-1.c testcase, it got stuck in infinite recursion between
 store_split_bit_field/store_fixed_bit_field and/or
 extract_split_bit_field/extract_fixed_bit_field. This didn't show up in
 my previous mainline testing.

 The difference between 4.8 and mainline head is the alignment of the
 incoming str_rtx passed to store_bit_field/extract_bit_field, due to the
 changes in r199898. The alignment is 8 bits on mainline, and 32 on 4.8.
 It seems to me that the bitfield code ought to handle store/extract
 from a more-aligned object and it's probably possible to construct an
 example that fails in this way on mainline as well.

 It looks like there are conflicting assumptions between get_best_mode,
 the places that call it in store_fixed_bit_field and
 extract_fixed_bit_field, and the code that actually does the splitting
 -- which uses a unit based on the MEM_ALIGN of the incoming rtx, rather
 than its mode. In the case where it's failing, get_best_mode is
 deciding it can't do a HImode access without splitting, but then the
 split code is assuming SImode units because of the 32-bit alignment, but
 not actually changing the mode of the rtx to match that.

 On top of that, this is one of the cases that strict_volatile_bitfield_p
 checks for and returns false, but the callers of store_bit_field and
 extract_bit_field in expr.c have already fiddled with the mode of the
 incoming rtx assuming that -fstrict-volatile-bitfields does apply. It
 doesn't get into that infinite recursion if it's compiled with
 -fno-strict-volatile-bitfields instead; in that case the incoming rtx
 has BLKmode, get_best_mode chooses SImode, and it's able to do the
 access without splitting at all.

 Anyway I tried a couple different bandaids that solved the infinite
 recursion problem but caused regressions elsewhere, and now I'm not sure
 of the right place to fix this. Given that there is also still ongoing
 discussion about making this a 3-way switch (etc) I am going to hold off
 on committing this patch for now.

 -Sandra


I have attached an update to your patch, that should
a) fix the recursion problem.
b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model.

Bernd.

patch-bitfields-update.diff
Description: Binary data


Re: [PATCH, ARM] Fix line number data for PIC register setup code

2013-10-28 Thread Tom de Vries
On 27/10/13 17:04, Eric Botcazou wrote:
 The PIC register setup code is emitted after NOTE_INSNS_FUNCTION_BEG,
 because it uses the insert_insn_on_edge mechanism, and the corresponding
 insertion in cfgexpand.c:gimple_expand_cfg takes care to insert the code
 after the parm_birth_insn:
 ...
/* Avoid putting insns before parm_birth_insn.  */
if (e-src == ENTRY_BLOCK_PTR
 single_succ_p (ENTRY_BLOCK_PTR)
 parm_birth_insn)
  {
rtx insns = e-insns.r;
e-insns.r = NULL_RTX;
emit_insn_after_noloc (insns, parm_birth_insn, e-dest);
  }
 ...
 And in the case for this test-case, parm_birth_insn is the
 NOTE_INSNS_FUNCTION_BEG.
 
 So this means that parm_birth_insn can never be null, right?
 

Yes, AFAICT parm_birth_insn is never NULL at this point.

 2013-10-13  Tom de Vries  t...@codesourcery.com

  * cfgexpand.c (gimple_expand_cfg): Don't commit insertions after
  NOTE_INSN_FUNCTION_BEG.

  * gcc.target/arm/require-pic-register-loc.c: New test.
 
 OK if you also remove the test on parm_birth_insn.
 

Updated patch, re-bootstrapped on x86_64 and committed to trunk.

Also applied to 4.7 and 4.8 branches, the same problem is present there.

Thanks,
- Tom

2013-10-28  Tom de Vries  t...@codesourcery.com

* cfgexpand.c (gimple_expand_cfg): Remove test for parm_birth_insn.
Don't commit insertions after NOTE_INSN_FUNCTION_BEG.

* gcc.target/arm/require-pic-register-loc.c: New test.

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index ba4c0e6..9705036 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4789,14 +4789,18 @@ gimple_expand_cfg (void)
 	  if (e-insns.r)
 	{
 	  rebuild_jump_labels_chain (e-insns.r);
-	  /* Avoid putting insns before parm_birth_insn.  */
+	  /* Put insns after parm birth, but before
+		 NOTE_INSNS_FUNCTION_BEG.  */
 	  if (e-src == ENTRY_BLOCK_PTR
-		   single_succ_p (ENTRY_BLOCK_PTR)
-		   parm_birth_insn)
+		   single_succ_p (ENTRY_BLOCK_PTR))
 		{
 		  rtx insns = e-insns.r;
 		  e-insns.r = NULL_RTX;
-		  emit_insn_after_noloc (insns, parm_birth_insn, e-dest);
+		  if (NOTE_P (parm_birth_insn)
+		   NOTE_KIND (parm_birth_insn) == NOTE_INSN_FUNCTION_BEG)
+		emit_insn_before_noloc (insns, parm_birth_insn, e-dest);
+		  else
+		emit_insn_after_noloc (insns, parm_birth_insn, e-dest);
 		}
 	  else
 		commit_one_edge_insertion (e);
 2013-10-27  Tobias Burnus  bur...@net-b.de
 
 	PR other/33426
diff --git a/gcc/testsuite/gcc.target/arm/require-pic-register-loc.c b/gcc/testsuite/gcc.target/arm/require-pic-register-loc.c
new file mode 100644
index 000..bd85e86
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/require-pic-register-loc.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options -g -fPIC } */
+
+void *v;
+void a (void *x) { }
+void b (void) { }
+   /* line 7.  */
+int/* line 8.  */
+main (int argc)/* line 9.  */
+{  /* line 10.  */
+  if (argc == 12345)   /* line 11.  */
+{
+  a (v);
+  return 1;
+}
+  b ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not \.loc 1 7 0 } } */
+/* { dg-final { scan-assembler-not \.loc 1 8 0 } } */
+/* { dg-final { scan-assembler-not \.loc 1 9 0 } } */
+
+/* The loc at the start of the prologue.  */
+/* { dg-final { scan-assembler-times \.loc 1 10 0 1 } } */
+
+/* The loc at the end of the prologue, with the first user line.  */
+/* { dg-final { scan-assembler-times \.loc 1 11 0 1 } } */


Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-28 Thread Kirill Yukhin
Hello Richard,
On 22 Oct 08:16, Richard Henderson wrote:
 On 10/22/2013 07:42 AM, Kirill Yukhin wrote:
  Hello Richard,
  Thanks for remarks, they all seems reasonable.
  
  One question
  
  On 21 Oct 16:01, Richard Henderson wrote:
  +(define_insn avx512f_movesmode_mask
  +  [(set (match_operand:VF_128 0 register_operand =v)
  + (vec_merge:VF_128
  +   (vec_merge:VF_128
  + (match_operand:VF_128 2 register_operand v)
  + (match_operand:VF_128 3 vector_move_operand 0C)
  + (match_operand:avx512fmaskmode 4 register_operand k))
  +   (match_operand:VF_128 1 register_operand v)
  +   (const_int 1)))]
  +  TARGET_AVX512F
  +  vmovssescalarmodesuffix\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}
  +  [(set_attr type ssemov)
  +   (set_attr prefix evex)
  +   (set_attr mode sseinsnmode)])
 
  Nested vec_merge?  That seems... odd to say the least.
  How in the world does this get matched?
  
  This is generic approach for all scalar `masked' instructions.
  
  Reason is that we must save higher bits of vector (outer vec_merge)
  and apply single-bit mask (inner vec_merge).
  
  
  We may do it with unspecs though... But is it really better?
  
  What do you think?
 
 What I think is that while it's an instruction that exists in the ISA,
 does that mean we must model it in the compiler?
 
 How would this pattern be used?

When we have all-1 mask then simplifier may reduce such pattern to simpler form
with single vec_merge.
This will be impossible if we put unspec there.

So, for example for thise code:
__m128d
foo (__m128d x, __m128d y)
{
  return _mm_maskz_add_sd (-1, x, y);
}

With unspec we will have:
foo:
.LFB2328:
movl$-1, %eax   # 10*movqi_internal/2   [length = 5]
kmovw   %eax, %k1   # 24*movqi_internal/8   [length = 4]
vaddsd  %xmm1, %xmm0, %xmm0{%k1}{z} # 11sse2_vmaddv2df3_mask/2 
[length = 6]
ret # 27simple_return_internal  [length = 1]

While for `semantic' version it will be simplified to:
foo:
.LFB2329:
vaddsd  %xmm1, %xmm0, %xmm0 # 11sse2_vmaddv2df3/2   [length 
= 4]
ret # 26simple_return_internal  [length = 1]

So, we have short VEX insn vs. long EVEX one + mask creation insns.
That is why we want to expose semantics of such operations.

Thanks, K



Committed: Add testcase / extra comment to recent arc_ccfsm_post_advance patch

2013-10-28 Thread Joern Rennecke
gcc/testsuite:

2013-10-28   Claudiu Zissulescu  claz...@synopsys.com
 Joern Rennecke  joern.renne...@embecosm.com

* gcc.target/arc/jump-around-jump.c: New test.

gcc:

2013-10-28  Joern Rennecke  joern.renne...@embecosm.com
* config/arc/arc.c (arc_ccfsm_post_advance):

Add comment about TYPE_RETURN.


tmp
Description: Binary data


Re: Rework c99status.html

2013-10-28 Thread Joseph S. Myers
On Mon, 28 Oct 2013, Gerald Pfeifer wrote:

  I think it would be appropriate to redirect all the c99status.html files 
  for particular GCC versions to this file, now it covers all versions.
 
 To make sure I understand:  you are proposing we remove all
 individual c99status.html pages and have the web server redirect
 to the one you just updated for all of those?

Yes.  (I wonder if we should actually rename it to a more generic name, 
such as cstdstatus.html, with a view to adding a section on C11 support, 
but I'm less sure of that.  If I complete the atomics support during stage 
1, and also get _Thread_local done, then the C11 support can be described 
as substantially complete in 4.9 with the same caveats as for C99.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: C++14 digit separators..

2013-10-28 Thread Joseph S. Myers
On Sun, 27 Oct 2013, Ed Smith-Rowland wrote:

 Here is an implementation for C++14 digit separators (single quote).

I tend to think that such features should come with a test that the 
feature is not enabled for language / standard versions for which it 
shouldn't be.  That is, something like

#define m(x) 0
int i = m(1'2)+(3'4);

that is valid in both cases, but whose semantics differ depending on 
whether the feature is enabled (with an appropriate check on the value of 
i, and the test being run both in cases where it should be enabled and 
cases where it shouldn't).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, PR 53001] Re: Patch to split out new warning flag for floating point conversion

2013-10-28 Thread Joseph S. Myers
On Mon, 28 Oct 2013, Dodji Seketeli wrote:

   * c-family/c-common.c (unsafe_conversion_p): Make this function

As previously noted, c-family/ has its own ChangeLog, so the c-family/ 
part does not go in the ChangeLog entry.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: C++14 digit separators..

2013-10-28 Thread Jason Merrill

On 10/28/2013 09:10 AM, Joseph S. Myers wrote:

On Sun, 27 Oct 2013, Ed Smith-Rowland wrote:


Here is an implementation for C++14 digit separators (single quote).


I tend to think that such features should come with a test that the
feature is not enabled for language / standard versions for which it
shouldn't be.  That is, something like

#define m(x) 0
int i = m(1'2)+(3'4);


Good idea.  Other than that, the patch looks good to me.

Jason



Re: [PATCH, PR 53001] Re: Patch to split out new warning flag for floating point conversion

2013-10-28 Thread Dodji Seketeli
Joseph S. Myers jos...@codesourcery.com writes:

 On Mon, 28 Oct 2013, Dodji Seketeli wrote:

  * c-family/c-common.c (unsafe_conversion_p): Make this function

 As previously noted, c-family/ has its own ChangeLog, so the c-family/ 
 part does not go in the ChangeLog entry.

Agreed.  I missed this.

-- 
Dodji


Re: [PATCH i386 4/8] [AVX512] [2/n] Add substed patterns: mask scalar subst.

2013-10-28 Thread Kirill Yukhin
Hello,

This patch introduces mask scalar subst.

Is it ok to commit to main trunk?

Testing pass.

--
Thanks, K

---
 gcc/config/i386/sse.md   | 104 ---
 gcc/config/i386/subst.md |  23 +++
 2 files changed, 95 insertions(+), 32 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index bf0e1ed..1f0d6fa 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1307,7 +1307,7 @@
(set_attr prefix mask_prefix3)
(set_attr mode MODE)])
 
-(define_insn sse_vmplusminus_insnmode3
+(define_insn sse_vmplusminus_insnmode3mask_scalar_name
   [(set (match_operand:VF_128 0 register_operand =x,v)
(vec_merge:VF_128
  (plusminus:VF_128
@@ -1318,10 +1318,10 @@
   TARGET_SSE
   @
plusminus_mnemonicssescalarmodesuffix\t{%2, %0|%0, %iptr2}
-   vplusminus_mnemonicssescalarmodesuffix\t{%2, %1, %0|%0, %1, %iptr2}
+   vplusminus_mnemonicssescalarmodesuffix\t{%2, %1, 
%0mask_scalar_operand3|%0mask_scalar_operand3, %1, %iptr2}
   [(set_attr isa noavx,avx)
(set_attr type sseadd)
-   (set_attr prefix orig,vex)
+   (set_attr prefix mask_scalar_prefix)
(set_attr mode ssescalarmode)])
 
 (define_expand mulmode3mask_name
@@ -1347,7 +1347,7 @@
(set_attr btver2_decode direct,double)
(set_attr mode MODE)])
 
-(define_insn sse_vmmultdiv_mnemonicmode3
+(define_insn sse_vmmultdiv_mnemonicmode3mask_scalar_name
   [(set (match_operand:VF_128 0 register_operand =x,v)
(vec_merge:VF_128
  (multdiv:VF_128
@@ -1358,10 +1358,10 @@
   TARGET_SSE
   @
multdiv_mnemonicssescalarmodesuffix\t{%2, %0|%0, %iptr2}
-   vmultdiv_mnemonicssescalarmodesuffix\t{%2, %1, %0|%0, %1, %iptr2}
+   vmultdiv_mnemonicssescalarmodesuffix\t{%2, %1, 
%0mask_scalar_operand3|%0mask_scalar_operand3, %1, %iptr2}
   [(set_attr isa noavx,avx)
(set_attr type ssemultdiv_mnemonic)
-   (set_attr prefix orig,vex)
+   (set_attr prefix mask_scalar_prefix)
(set_attr btver2_decode direct,double)
(set_attr mode ssescalarmode)])
 
@@ -1446,7 +1446,7 @@
(set_attr prefix evex)
(set_attr mode MODE)])
 
-(define_insn *srcp14mode
+(define_insn mask_scalar_codeforsrcp14modemask_scalar_name
   [(set (match_operand:VF_128 0 register_operand =v)
(vec_merge:VF_128
  (unspec:VF_128
@@ -1456,7 +1456,7 @@
  (match_dup 1)
  (const_int 1)))]
   TARGET_AVX512F
-  vrcp14ssescalarmodesuffix\t{%2, %1, %0|%0, %1, %2}
+  vrcp14ssescalarmodesuffix\t{%2, %1, 
%0mask_scalar_operand3|%0mask_scalar_operand3, %1, %2}
   [(set_attr type sse)
(set_attr prefix evex)
(set_attr mode MODE)])
@@ -1493,7 +1493,7 @@
(set_attr prefix maybe_vex)
(set_attr mode MODE)])
 
-(define_insn sse_vmsqrtmode2
+(define_insn sse_vmsqrtmode2mask_scalar_name
   [(set (match_operand:VF_128 0 register_operand =x,v)
(vec_merge:VF_128
  (sqrt:VF_128
@@ -1503,11 +1503,11 @@
   TARGET_SSE
   @
sqrtssescalarmodesuffix\t{%1, %0|%0, %iptr1}
-   vsqrtssescalarmodesuffix\t{%1, %2, %0|%0, %2, %iptr1}
+   vsqrtssescalarmodesuffix\t{%1, %2, 
%0mask_scalar_operand3|%0mask_scalar_operand3, %2, %iptr1}
   [(set_attr isa noavx,avx)
(set_attr type sse)
(set_attr atom_sse_attr sqrt)
-   (set_attr prefix orig,vex)
+   (set_attr prefix mask_scalar_prefix)
(set_attr btver2_sse_attr sqrt)
(set_attr mode ssescalarmode)])
 
@@ -1542,7 +1542,7 @@
(set_attr prefix evex)
(set_attr mode MODE)])
 
-(define_insn *rsqrt14mode
+(define_insn mask_scalar_codeforrsqrt14modemask_scalar_name
   [(set (match_operand:VF_128 0 register_operand =v)
(vec_merge:VF_128
  (unspec:VF_128
@@ -1552,7 +1552,7 @@
  (match_dup 1)
  (const_int 1)))]
   TARGET_AVX512F
-  vrsqrt14ssescalarmodesuffix\t{%2, %1, %0|%0, %1, %2}
+  vrsqrt14ssescalarmodesuffix\t{%2, %1, 
%0mask_scalar_operand3|%0mask_scalar_operand3, %1, %2}
   [(set_attr type sse)
(set_attr prefix evex)
(set_attr mode MODE)])
@@ -1622,7 +1622,7 @@
(set_attr prefix mask_prefix3)
(set_attr mode MODE)])
 
-(define_insn sse_vmcodemode3
+(define_insn sse_vmcodemode3mask_scalar_name
   [(set (match_operand:VF_128 0 register_operand =x,v)
(vec_merge:VF_128
  (smaxmin:VF_128
@@ -1633,11 +1633,11 @@
   TARGET_SSE
   @
maxmin_floatssescalarmodesuffix\t{%2, %0|%0, %iptr2}
-   vmaxmin_floatssescalarmodesuffix\t{%2, %1, %0|%0, %1, %iptr2}
+   vmaxmin_floatssescalarmodesuffix\t{%2, %1, 
%0mask_scalar_operand3|%0mask_scalar_operand3, %1, %iptr2}
   [(set_attr isa noavx,avx)
(set_attr type sse)
(set_attr btver2_sse_attr maxmin)
-   (set_attr prefix orig,vex)
+   (set_attr prefix mask_scalar_prefix)
(set_attr mode ssescalarmode)])
 
 ;; These versions of the min/max patterns implement exactly the operations
@@ -2748,7 +2748,7 @@
  (match_operand:FMAMODE 3 nonimmediate_operand)))]
   )
 
-(define_insn *fma_fmadd_mode
+(define_insn fma_fmadd_mode
   [(set (match_operand:FMAMODE 0 register_operand =v,v,v,x,x)
  

Re: [PATCH i386 4/8] [AVX512] Add substed patterns: mask_scalar_merge subst.

2013-10-28 Thread Kirill Yukhin
Hello,

This patch introduces mask_scalar_merge  subst.

Is it ok to commit to main trunk?

Testing pass.

--
Thanks, K

---
 gcc/config/i386/sse.md   | 26 +-
 gcc/config/i386/subst.md | 16 
 2 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 1f0d6fa..f3cca59 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -2153,7 +2153,7 @@
   [(V16SF const_0_to_31_operand) (V8DF const_0_to_31_operand)
   (V16SI const_0_to_7_operand) (V8DI const_0_to_7_operand)])
 
-(define_insn avx512f_cmpmode3
+(define_insn avx512f_cmpmode3mask_scalar_merge_name
   [(set (match_operand:avx512fmaskmode 0 register_operand =k)
(unspec:avx512fmaskmode
  [(match_operand:VI48F_512 1 register_operand v)
@@ -2161,13 +2161,13 @@
   (match_operand:SI 3 cmp_imm_predicate n)]
  UNSPEC_PCMP))]
   TARGET_AVX512F
-  vsseintprefixcmpssemodesuffix\t{%3, %2, %1, %0|%0, %1, %2, %3}
+  vsseintprefixcmpssemodesuffix\t{%3, %2, %1, 
%0mask_scalar_merge_operand4|%0mask_scalar_merge_operand4, %1, %2, %3}
   [(set_attr type ssecmp)
(set_attr length_immediate 1)
(set_attr prefix evex)
(set_attr mode sseinsnmode)])
 
-(define_insn avx512f_ucmpmode3
+(define_insn avx512f_ucmpmode3mask_scalar_merge_name
   [(set (match_operand:avx512fmaskmode 0 register_operand =k)
(unspec:avx512fmaskmode
  [(match_operand:VI48_512 1 register_operand v)
@@ -2175,7 +2175,7 @@
   (match_operand:SI 3 const_0_to_7_operand n)]
  UNSPEC_UNSIGNED_PCMP))]
   TARGET_AVX512F
-  vpcmpussemodesuffix\t{%3, %2, %1, %0|%0, %1, %2, %3}
+  vpcmpussemodesuffix\t{%3, %2, %1, 
%0mask_scalar_merge_operand4|%0mask_scalar_merge_operand4, %1, %2, %3}
   [(set_attr type ssecmp)
(set_attr length_immediate 1)
(set_attr prefix evex)
@@ -8712,7 +8712,7 @@
(set_attr prefix vex)
(set_attr mode OI)])
 
-(define_expand avx512f_eqmode3
+(define_expand avx512f_eqmode3mask_scalar_merge_name
   [(set (match_operand:avx512fmaskmode 0 register_operand)
(unspec:avx512fmaskmode
  [(match_operand:VI48_512 1 register_operand)
@@ -8721,14 +8721,14 @@
   TARGET_AVX512F
   ix86_fixup_binary_operands_no_copy (EQ, MODEmode, operands);)
 
-(define_insn avx512f_eqmode3_1
+(define_insn avx512f_eqmode3mask_scalar_merge_name_1
   [(set (match_operand:avx512fmaskmode 0 register_operand =k)
(unspec:avx512fmaskmode
  [(match_operand:VI48_512 1 register_operand %v)
   (match_operand:VI48_512 2 nonimmediate_operand vm)]
  UNSPEC_MASKED_EQ))]
   TARGET_AVX512F  ix86_binary_operator_ok (EQ, MODEmode, operands)
-  vpcmpeqssemodesuffix\t{%2, %1, %0|%0, %1, %2}
+  vpcmpeqssemodesuffix\t{%2, %1, 
%0mask_scalar_merge_operand3|%0mask_scalar_merge_operand3, %1, %2}
   [(set_attr type ssecmp)
(set_attr prefix_extra 1)
(set_attr prefix evex)
@@ -8808,13 +8808,13 @@
(set_attr prefix vex)
(set_attr mode OI)])
 
-(define_insn avx512f_gtmode3
+(define_insn avx512f_gtmode3mask_scalar_merge_name
   [(set (match_operand:avx512fmaskmode 0 register_operand =k)
(unspec:avx512fmaskmode
  [(match_operand:VI48_512 1 register_operand v)
   (match_operand:VI48_512 2 nonimmediate_operand vm)] 
UNSPEC_MASKED_GT))]
   TARGET_AVX512F
-  vpcmpgtssemodesuffix\t{%2, %1, %0|%0, %1, %2}
+  vpcmpgtssemodesuffix\t{%2, %1, 
%0mask_scalar_merge_operand3|%0mask_scalar_merge_operand3, %1, %2}
   [(set_attr type ssecmp)
(set_attr prefix_extra 1)
(set_attr prefix evex)
@@ -9208,25 +9208,25 @@
  ]
  (const_string sseinsnmode)))])
 
-(define_insn avx512f_testmmode3
+(define_insn avx512f_testmmode3mask_scalar_merge_name
   [(set (match_operand:avx512fmaskmode 0 register_operand =k)
(unspec:avx512fmaskmode
 [(match_operand:VI48_512 1 register_operand v)
  (match_operand:VI48_512 2 nonimmediate_operand vm)]
 UNSPEC_TESTM))]
   TARGET_AVX512F
-  vptestmssemodesuffix\t{%2, %1, %0|%0, %1, %2}
+  vptestmssemodesuffix\t{%2, %1, 
%0mask_scalar_merge_operand3|%0mask_scalar_merge_operand3, %1, %2}
   [(set_attr prefix evex)
(set_attr mode  sseinsnmode)])
 
-(define_insn avx512f_testnmmode3
+(define_insn avx512f_testnmmode3mask_scalar_merge_name
   [(set (match_operand:avx512fmaskmode 0 register_operand =k)
(unspec:avx512fmaskmode
 [(match_operand:VI48_512 1 register_operand v)
  (match_operand:VI48_512 2 nonimmediate_operand vm)]
 UNSPEC_TESTNM))]
   TARGET_AVX512CD
-  %vptestnmssemodesuffix\t{%2, %1, %0|%0, %1, %2}
+  %vptestnmssemodesuffix\t{%2, %1, 
%0mask_scalar_merge_operand3|%0mask_scalar_merge_operand3, %1, %2}
   [(set_attr prefix evex)
(set_attr mode  sseinsnmode)])
 
diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
index 532a3a1..b537c5e 100644
--- a/gcc/config/i386/subst.md
+++ b/gcc/config/i386/subst.md
@@ -27,6 +27,9 @@
V16SF V8SF  V4SF
V8DF  V4DF  

Re: [PATCH 1/n] Add conditional compare support

2013-10-28 Thread Richard Henderson
On 10/28/2013 01:32 AM, Zhenqiang Chen wrote:
 Patch is updated according to your comments. Main changes are:
 * Add two hooks: legitimize_cmp_combination and legitimize_ccmp_combination
 * Improve document.

No, these are not the hooks I proposed.

You should *not* have a ccmp_optab, because the middle-end
has absolutely no idea what mode TARGET should be in.

The hook needs to return an rtx expression appropriate for
feeding to cbranch et al.  E.g. for arm,

  (ne (reg:CC_Z CC_REGNUM) (const_int 0))

after having emitted the instruction that sets CC_REGNUM.

We need to push this to the backend because for ia64 the
expression needs to be of te form

  (ne (reg:BI new_pseudo) (const_int 0))


r~


Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-28 Thread Richard Henderson
On 10/28/2013 03:24 AM, Kirill Yukhin wrote:
 Hello Richard,
 On 22 Oct 08:16, Richard Henderson wrote:
 On 10/22/2013 07:42 AM, Kirill Yukhin wrote:
 Hello Richard,
 Thanks for remarks, they all seems reasonable.

 One question

 On 21 Oct 16:01, Richard Henderson wrote:
 +(define_insn avx512f_movesmode_mask
 +  [(set (match_operand:VF_128 0 register_operand =v)
 + (vec_merge:VF_128
 +   (vec_merge:VF_128
 + (match_operand:VF_128 2 register_operand v)
 + (match_operand:VF_128 3 vector_move_operand 0C)
 + (match_operand:avx512fmaskmode 4 register_operand k))
 +   (match_operand:VF_128 1 register_operand v)
 +   (const_int 1)))]
 +  TARGET_AVX512F
 +  vmovssescalarmodesuffix\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}
 +  [(set_attr type ssemov)
 +   (set_attr prefix evex)
 +   (set_attr mode sseinsnmode)])

 Nested vec_merge?  That seems... odd to say the least.
 How in the world does this get matched?

 This is generic approach for all scalar `masked' instructions.

 Reason is that we must save higher bits of vector (outer vec_merge)
 and apply single-bit mask (inner vec_merge).


 We may do it with unspecs though... But is it really better?

 What do you think?

 What I think is that while it's an instruction that exists in the ISA,
 does that mean we must model it in the compiler?

 How would this pattern be used?
 
 When we have all-1 mask then simplifier may reduce such pattern to simpler 
 form
 with single vec_merge.
 This will be impossible if we put unspec there.
 
 So, for example for thise code:
 __m128d
 foo (__m128d x, __m128d y)
 {
   return _mm_maskz_add_sd (-1, x, y);
 }
 
 With unspec we will have:
 foo:
 .LFB2328:
 movl$-1, %eax   # 10*movqi_internal/2   [length = 5]
 kmovw   %eax, %k1   # 24*movqi_internal/8   [length = 4]
 vaddsd  %xmm1, %xmm0, %xmm0{%k1}{z} # 11
 sse2_vmaddv2df3_mask/2 [length = 6]
 ret # 27simple_return_internal  [length = 1]
 
 While for `semantic' version it will be simplified to:
 foo:
 .LFB2329:
 vaddsd  %xmm1, %xmm0, %xmm0 # 11sse2_vmaddv2df3/2   
 [length = 4]
   ret # 26simple_return_internal  [length = 1]
 
 So, we have short VEX insn vs. long EVEX one + mask creation insns.
 That is why we want to expose semantics of such operations.

This is not the question that I asked.

Why is a masked *scalar* operation useful?  The only way I can see
that one would get created is that builtin, which begs the question
of why the builtin exists.


r~


Re: [PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-28 Thread Ilya Enkovich
On 24 Oct 23:21, Jeff Law wrote:
 On 10/24/13 02:24, Ilya Enkovich wrote:
 These two hooks are used by expand pass to pass/receive bounds for
 args. When bounds are passed in a register, expand does not need this
 hook and uses regular move insn. If we are out of bound register or
 platform does not have them at all, this hook is called. If bounded
 arg is passed in memory, regular way to store associated bounds is
 supposed to be used in hook. That is why no slot_no value is required,
 only arg and it's place (slot) are used. If bounded arg is passed in
 register (e.g. it happens on i386 with MPX when more that 4 pointers
 are passed in registers), then some special slot has to be used for
 bounds. slot_no here holds identifier of this special slot. E.g. if we
 call function with 6 pointer on i386, we call this hook passing R8 and
 R9 as slot with const1_rtx and const2_rtx as slot_no.
 So can we find a concise way to describe this and include that in
 the docs for the hooks.  Otherwise I can't see how a developer is
 going to know how to use this stuff.
 
 So how am I (as a GCC developer) suppsoed to know what BNDMK, BNDLDX, BNDCU,
 etc mean?  The names aren't particularly descriptive.  Are these documented
 elsewhere in a follow-up patch?  If not, it seems to me we need to document
 them here.
 
 Actually the next patch introduces them and is a good place for
 documentation. But currently this patch has documentation for user
 visible built-ins only. For now built-ins used for instrumentation are
 described on Wiki only
 (http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler#Builtins_used_for_instrumentation).
 Where should I move it? Does it also go to extend.texi?
 If they're strictly for developers, then there's not a real good
 place for them.  They're not really extensions we would expect the
 user to use, so extend.texi seems inappropriate.
 
 Perhaps a section in tm.texi?
 
 Jeff
 

I fixed documentation for bounds load/store hooks.  Also added documentation 
for list of builtins returned by builtin_chkp_function.

Thanks,
Ilya
--

gcc/

2013-10-28  Ilya Enkovich  ilya.enkov...@intel.com

* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(fn_abi_va_list_bounds_size): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode); New.
(default_builtin_chkp_function): New.
* doc/tm.texi.in (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New.
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
* doc/tm.texi: Regenerated.
* langhooks.h (lang_hooks): Add chkp_supported field.
* langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): New.
(LANG_HOOKS_INITIALIZER); Add LANG_HOOKS_CHKP_SUPPORTED.


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8d220f3..c60ebef 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4334,6 +4334,13 @@ This hook returns the va_list type of the calling 
convention specified by
 The default version of this hook returns @code{va_list_type_node}.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree 
@var{fndecl})
+This hook returns size for @code{va_list} object in function specified
+by @var{fndecl}.  This hook is used by Pointer Bounds Checker to build bounds
+for @code{va_list} object.  Return @code{integer_zero_node} if no bounds
+should be used (e.g. @code{va_list} is a scalar pointer to the stack).
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_CANONICAL_VA_LIST_TYPE (tree @var{type})
 This hook returns the va_list type of the calling convention specified by the
 type of @var{type}. If @var{type} is not a valid va_list type, it returns
@@ -5151,6 +5158,26 @@ defined, then define this hook to return @code{true} if
 Otherwise, you should not define this hook.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx @var{slot}, rtx 
@var{arg}, rtx @var{slot_no})
+This hook is used by expand pass to emit insn to load bounds of
+@var{arg} passed in @var{slot}.  Expand pass uses this hook in case
+bounds of @var{arg} are not passed in register.  If @var{slot} is a
+memory, then bounds are loaded as for regular pointer loaded from
+memory.  If @var{slot} is not a memory then @var{slot_no} is an integer
+constant holding 

[C++ Patch] PR 58888

2013-10-28 Thread Paolo Carlini

Hi,

here we reject the declaration of b with the invalid use of ‘auto’ 
error. The reason is that, at the beginning of grokfield, init != 
NULL_TREE, processing_template_decl is false (in fact, a templatized 
version of the testcase is accepted) and we get to:


if (TREE_CODE (init) == CONSTRUCTOR)
init = digest_init (TREE_TYPE (value), init, tf_warning_or_error);

then, at the beginning of digest_init_r, complete_type_or_maybe_complain 
finds the auto type incomplete. It seems to me that in grokfield, in the 
non-template case too, we should just process the declaration later, in 
the final switch, thus forward to finish_static_data_member_decl (then 
the only difference with the templatized case that we already handle 
correctly, is that we don't call push_template_decl in the middle)


Tested x86_64-linux.

Thanks,
Paolo.


/cp
2013-10-28  Paolo Carlini  paolo.carl...@oracle.com

PR c++/5
* decl2.c (grokfield): Handle auto like NSDMI.

/testsuite
2013-10-28  Paolo Carlini  paolo.carl...@oracle.com

PR c++/5
* g++.dg/cpp0x/auto40.C: New.
Index: cp/decl2.c
===
--- cp/decl2.c  (revision 204133)
+++ cp/decl2.c  (working copy)
@@ -953,6 +953,8 @@ grokfield (const cp_declarator *declarator,
}
   else if (TREE_CODE (value) == FIELD_DECL)
/* C++11 NSDMI, keep going.  */;
+  else if (is_auto (TREE_TYPE (value)))
+   /* C++11 auto, keep going.  */;
   else if (!VAR_P (value))
gcc_unreachable ();
   else if (!processing_template_decl)
Index: testsuite/g++.dg/cpp0x/auto40.C
===
--- testsuite/g++.dg/cpp0x/auto40.C (revision 0)
+++ testsuite/g++.dg/cpp0x/auto40.C (working copy)
@@ -0,0 +1,11 @@
+// PR c++/5
+// { dg-do compile { target c++11 } }
+
+#include initializer_list
+
+struct A
+{
+  static constexpr auto b{1.0};
+};
+
+constexpr decltype(A::b) A::b;


Re: RFC/A: PR 58079: CONST_INT mode assert in combine.c:do_SUBST

2013-10-28 Thread Uros Bizjak
Hello!

 PR 58079 is about the do_SUBST assert:

   /* Sanity check that we're replacing oldval with a CONST_INT
 that is a valid sign-extension for the original mode.  */
   gcc_assert (INTVAL (newval)
  == trunc_int_for_mode (INTVAL (newval), GET_MODE (oldval)));

 triggering while trying to optimise the temporary result:

   (eq (const_int -73 [0xffb7])
   (reg:QI 234 [ MEM[(const u8 *)buf_7(D) + 4B] ])

 combine_simplify_rtx calls simplify_comparison, which first canonicalises
 the order so that the constant is second and then promotes the comparison
 to SImode (the first supported comparison mode on MIPS).  So we end up with:

   (eq (zero_extend:SI (reg:QI 234 [ MEM[(const u8 *)buf_7(D) + 4B] ]))
   (const_int 183 [0xb7]))

 So far so good.  But combine_simplify_rtx then tries to install the
 new operands in-place, with the second part of:

  /* If the code changed, return a whole new comparison.  */
  if (new_code != code)
return gen_rtx_fmt_ee (new_code, mode, op0, op1);

  /* Otherwise, keep this operation, but maybe change its operands.
 This also converts (ne (compare FOO BAR) 0) to (ne FOO BAR).  */
  SUBST (XEXP (x, 0), op0);
  SUBST (XEXP (x, 1), op1);

 And this triggers the assert because we're replacing a QImode register
 with the non-QImode constant 183.

 One fix would be to extend the new_code != code condition, as below.
 This should avoid the problem cases without generating too much
 garbage rtl.  But if the condition's getting this complicated,
 perhaps it'd be better just to avoid SUBST here?  I.e.

  if (new_code != code || op0 != XEXP (x, 0) || op1 != XEXP (x, 1))
return gen_rtx_fmt_ee (new_code, mode, op0, op1);

 Just asking though. :-)

 Tested on mipsisa64-elf.  OK to install?

 Thanks,
 Richard


 gcc/
 PR rtl-optimization/58079
 * combine.c (combine_simplify_rtx): Avoid using SUBST if
 simplify_comparison has widened a comparison with an integer.

 gcc/testsuite/
 * gcc.dg/torture/pr58079.c: New test.

I would like to backport this patch to 4.8 branch. The patch fixes the
failure on alpha-linux-gnu and also (reportedly) fixes the same
failure on mips and pa targets.

I have bootstrapped and regression test the patch on
x86_64-pc-linux-gnu {,-m32} and started bootstrap and regtests on
alphaev68-linux-gnu on 4.8 branch for all default languages + obj-c++
and go. The latter will take some ~10 hours.

OK to backort the patch to the 4.8 branch if both tests pass?

Thanks,
Uros.


Re: RFA: Andes nds32 port v4 patch

2013-10-28 Thread Joseph S. Myers
Since Richard looked at the machine description part of the port, I looked 
at the other pieces.  I'm happy with those pieces with the following minor 
changes:

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00520.html (libgcc): remove 
the .file directive from crtzero.S.  See 
http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00123.html for why such 
directives are best avoided.

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00522.html (documentation): 
use @deftypefn rather than @table @code when documenting built-in 
functions.  In install.texi, say @samp{newlib} and @samp{mculib} instead 
of using '' quotes.

-- 
Joseph S. Myers
jos...@codesourcery.com


[patch] Remove gimple.h from header files.

2013-10-28 Thread Andrew MacLeod
There are currently only 2 .h files which include gimple.h, so removing 
this indirection is not too difficult.


Similar to the tree.h change Diego made, only .c files should include 
gimple.h.


Bootstraps on x86_64-unknown-linux-gnu, regression tests are running.  
Assuming no issues, OK?


Andrew

	* ipa-prop.h: Remove gimple.h from include list.
	* lto-streamer.h: Likewise.
	* cgraphbuild.c:  Add gimple.h to include list.
	* data-streamer-in.c: Likewise.
	* ipa-cp.c: Likewise.
	* tree-streamer.c: Likewise.
	* lto-compress.c: Likewise.
	* ipa-reference.c: Likewise.
	* data-streamer-out.c: Likewise.
	* lto-cgraph.c: Likewise.
	* cgraphclones.c: Likewise.
	* ipa-utils.c: Likewise.
	* data-streamer.c: Likewise.
	* ipa-split.c: Likewise.
	* lto-section-in.c: Likewise.
	* lto/lto-object.c: Likewise.
	* lto/lto-partition.c: Likewise.
	* tree-streamer-out.c: Likewise.
	* ipa-prop.c: Likewise.
	* tree-streamer-in.c: Likewise.
	* symtab.c: Likewise.
	* opts-global.c: Likewise.
	* lto-opts.c: Likewise.
	* lto-section-out.c: Likewise.
	* lto-streamer.c: Likewise.

Index: ipa-prop.h
===
*** ipa-prop.h	(revision 204031)
--- ipa-prop.h	(working copy)
*** along with GCC; see the file COPYING3.  
*** 23,29 
  #include tree-core.h
  #include vec.h
  #include cgraph.h
- #include gimple.h
  #include alloc-pool.h
  
  /* The following definitions and interfaces are used by
--- 23,28 
Index: lto-streamer.h
===
*** lto-streamer.h	(revision 204031)
--- lto-streamer.h	(working copy)
*** along with GCC; see the file COPYING3.  
*** 26,32 
  #include plugin-api.h
  #include hash-table.h
  #include tree-core.h
- #include gimple.h
  #include target.h
  #include cgraph.h
  #include vec.h
--- 26,31 
Index: cgraphbuild.c
===
*** cgraphbuild.c	(revision 204031)
--- cgraphbuild.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 23,28 
--- 23,29 
  #include coretypes.h
  #include tm.h
  #include tree.h
+ #include gimple.h
  #include langhooks.h
  #include pointer-set.h
  #include intl.h
Index: data-streamer-in.c
===
*** data-streamer-in.c	(revision 204031)
--- data-streamer-in.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 25,30 
--- 25,31 
  #include coretypes.h
  #include diagnostic.h
  #include tree.h
+ #include gimple.h
  #include data-streamer.h
  
  /* Read a string from the string table in DATA_IN using input block
Index: ipa-cp.c
===
*** ipa-cp.c	(revision 204031)
--- ipa-cp.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 104,109 
--- 104,110 
  #include system.h
  #include coretypes.h
  #include tree.h
+ #include gimple.h
  #include target.h
  #include ipa-prop.h
  #include bitmap.h
Index: tree-streamer.c
===
*** tree-streamer.c	(revision 204031)
--- tree-streamer.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 24,29 
--- 24,30 
  #include system.h
  #include coretypes.h
  #include tree.h
+ #include gimple.h
  #include streamer-hooks.h
  #include tree-streamer.h
  
Index: lto-compress.c
===
*** lto-compress.c	(revision 204031)
--- lto-compress.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 28,33 
--- 28,34 
  #include zlib.h
  #include coretypes.h
  #include tree.h
+ #include gimple.h
  #include diagnostic-core.h
  #include langhooks.h
  #include lto-streamer.h
Index: ipa-reference.c
===
*** ipa-reference.c	(revision 204031)
--- ipa-reference.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 41,46 
--- 41,47 
  #include coretypes.h
  #include tm.h
  #include tree.h
+ #include gimple.h
  #include tree-inline.h
  #include tree-pass.h
  #include pointer-set.h
Index: data-streamer-out.c
===
*** data-streamer-out.c	(revision 204031)
--- data-streamer-out.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 24,29 
--- 24,30 
  #include system.h
  #include coretypes.h
  #include tree.h
+ #include gimple.h
  #include data-streamer.h
  
  /* Return index used to reference STRING of LEN characters in the string table
Index: lto-cgraph.c
===
*** lto-cgraph.c	(revision 204031)
--- lto-cgraph.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 25,30 
--- 25,31 
  #include coretypes.h
  

Re: [C++ Patch] PR 58888

2013-10-28 Thread Jason Merrill
My question is, why do we need that whole block for massaging VAR_DECL 
initializers?  That all ought to be handled properly by cp_finish_decl. 
 Does removing everything after



  else if (!VAR_P (value))
gcc_unreachable ();


work?

Jason


Re: [patch] Remove gimple.h from header files.

2013-10-28 Thread Jeff Law

On 10/28/13 11:39, Andrew MacLeod wrote:

There are currently only 2 .h files which include gimple.h, so removing
this indirection is not too difficult.

Similar to the tree.h change Diego made, only .c files should include
gimple.h.

Bootstraps on x86_64-unknown-linux-gnu, regression tests are running.
Assuming no issues, OK?

OK.  Thanks,
Jeff


Re: Fix for cris-elf breakage from mudflap removal, take 2

2013-10-28 Thread Jeff Law

On 10/27/13 08:43, Hans-Peter Nilsson wrote:

PRED_NORETURN seems a better match; not that I see anything in
the current source that actually treats them differently other
than generating them, but my grep-fu may be weak and certainly
my crystall-ball-fu is.

After testing (no regressions compared to r204080 before the
breakage), committed.

* config/cris/cris.c (cris_emit_trap_for_misalignment): Replace the
removed PRED_MUDFLAP with PRED_NORETURN.  Correct file-path in comment.
Thanks.  SOrry for mucking up cris.  I know I'd scanned that code at one 
time and somehow missed fixing the PRED_MUDFLAP reference.


jeff



Re: [PATCH] Final removal of mudflap

2013-10-28 Thread Jeff Law

On 10/26/13 15:08, Joseph S. Myers wrote:

As far as I can see, the commit left empty libmudflap directories around
rather than removing them (SVN, unlike git, allows empty directories to be
represented in the repository).

I'll check out an SVN tree and remove the empty directory.

jeff


[patch] Flatten tree-outof-ssa.h

2013-10-28 Thread Andrew MacLeod
This is even more trivial...   Remove the 3 includes from 
tree-outof-ssa.h and pushed the required ones down to the .c files that 
include tree-outof-ssa.h.


Bootstraps on x86_64-unknown-linux-gnu, regressions are running. 
Assuming no issues, OK?


Andrew

	* tree-outof-ssa.h: Remove include files.
	* tree-outof-ssa.c: Add required include files from tree-outof-ssa.h.
	* expr.c: Likewise.
	* tree-ssa-coalesce.c: Likewise.
	* cfgexpand.c: Likewise.
	* tree-ssa-ter.c: Likewise.

Index: tree-outof-ssa.h
===
*** tree-outof-ssa.h	(revision 204135)
--- tree-outof-ssa.h	(working copy)
*** along with GCC; see the file COPYING3.  
*** 21,29 
  #ifndef GCC_TREE_OUTOF_SSA_H
  #define GCC_TREE_OUTOF_SSA_H
  
- #include tree-ssa-live.h
- #include tree-ssa-ter.h
- #include tree-ssa-coalesce.h
  
  /* This structure (of which only a singleton SA exists) is used to
 pass around information between the outof-SSA functions, cfgexpand
--- 21,26 
Index: tree-outof-ssa.c
===
*** tree-outof-ssa.c	(revision 204135)
--- tree-outof-ssa.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 36,41 
--- 36,44 
  #include tree-ssanames.h
  #include dumpfile.h
  #include diagnostic-core.h
+ #include tree-ssa-live.h
+ #include tree-ssa-ter.h
+ #include tree-ssa-coalesce.h
  #include tree-outof-ssa.h
  
  /* FIXME: A lot of code here deals with expanding to RTL.  All that code
Index: expr.c
===
*** expr.c	(revision 204135)
--- expr.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 52,57 
--- 52,58 
  #include timevar.h
  #include df.h
  #include diagnostic.h
+ #include tree-ssa-live.h
  #include tree-outof-ssa.h
  #include target-globals.h
  #include params.h
Index: tree-ssa-coalesce.c
===
*** tree-ssa-coalesce.c	(revision 204135)
--- tree-ssa-coalesce.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 33,39 
  #include ssa-iterators.h
  #include tree-ssanames.h
  #include hash-table.h
! #include tree-outof-ssa.h
  #include diagnostic-core.h
  
  
--- 33,40 
  #include ssa-iterators.h
  #include tree-ssanames.h
  #include hash-table.h
! #include tree-ssa-live.h
! #include tree-ssa-coalesce.h
  #include diagnostic-core.h
  
  
Index: cfgexpand.c
===
*** cfgexpand.c	(revision 204135)
--- cfgexpand.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 49,54 
--- 49,55 
  #include tree-inline.h
  #include value-prof.h
  #include target.h
+ #include tree-ssa-live.h
  #include tree-outof-ssa.h
  #include sbitmap.h
  #include cfgloop.h
Index: tree-ssa-ter.c
===
*** tree-ssa-ter.c	(revision 204135)
--- tree-ssa-ter.c	(working copy)
*** along with GCC; see the file COPYING3.  
*** 32,37 
--- 32,39 
  #include ssa-iterators.h
  #include tree-ssanames.h
  #include dumpfile.h
+ #include tree-ssa-live.h
+ #include tree-ssa-ter.h
  #include tree-outof-ssa.h
  #include flags.h
  


Re: [patch] Flatten tree-outof-ssa.h

2013-10-28 Thread Jeff Law

On 10/28/13 12:24, Andrew MacLeod wrote:

This is even more trivial...   Remove the 3 includes from
tree-outof-ssa.h and pushed the required ones down to the .c files that
include tree-outof-ssa.h.

Bootstraps on x86_64-unknown-linux-gnu, regressions are running.
Assuming no issues, OK?

OK.

jeff


Re: [PATCH] rewrite stack vectors

2013-10-28 Thread Jeff Law

On 10/25/13 13:56, Trevor Saunders wrote:

On Fri, Oct 25, 2013 at 03:30:47PM -0400, Diego Novillo wrote:

On 2013-10-10 14:07 , tsaund...@mozilla.com wrote:

This makes the implementation of stack vectors simpler and easier to use.  This 

   works by 
making the size of the on stack storage a template argument, so the
size is embedded in the type.  This allows you to implicitly convert a
stack_vecT, N to a vecT, va_heap *, and it will just work.  Because there's
no need to support stack vectors in unions we can make them be a more normal
c++ class with a constructor and destructor that are nontrivial.



Thanks.  This looks much simpler, indeed.  The patch is fine to
commit.  Just a couple of observations/questions:


I don't have commit access, so can someone check it in for me?  I
bootstrapped and got no changes in regression tests two weeks ago, but
haven't checked it since if that helps.
Just doing a quick sanity build given things change quite a bit in a 
couple weeks.Once the sanity build completes, I'll check it in.


Thanks,
Jeff



[PATCH, testsuite/rs6000] Adjust two VMX tests for little endian

2013-10-28 Thread Bill Schmidt
Hi,

These two test cases require source changes when compiled on a little
endian target.  Verified on powerpc64{,le}-unknown-linux-gnu.  Ok to
commit?

Thanks,
Bill


2013-10-28  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gcc.dg/vmx/gcc-bug-i.c: Add little endian variant.
* gcc.dg/vmx/eg-5.c: Likewise.


Index: gcc/testsuite/gcc.dg/vmx/gcc-bug-i.c
===
--- gcc/testsuite/gcc.dg/vmx/gcc-bug-i.c(revision 203979)
+++ gcc/testsuite/gcc.dg/vmx/gcc-bug-i.c(working copy)
@@ -13,12 +13,27 @@
 #define DO_INLINE __attribute__ ((always_inline))
 #define DONT_INLINE __attribute__ ((noinline))
 
+#ifdef __LITTLE_ENDIAN__
+static inline DO_INLINE int inline_me(vector signed short data)
+{
+  union {vector signed short v; signed short s[8];} u;
+  signed short x;
+  unsigned char x1, x2;
+
+  u.v = data;
+  x = u.s[7];
+  x1 = (x  8)  0xff;
+  x2 = x  0xff;
+  return ((x2  8) | x1);
+}
+#else
 static inline DO_INLINE int inline_me(vector signed short data) 
 {
   union {vector signed short v; signed short s[8];} u;
   u.v = data;
   return u.s[7];
 }
+#endif
 
 static DONT_INLINE int foo(vector signed short data)
 {
Index: gcc/testsuite/gcc.dg/vmx/eg-5.c
===
--- gcc/testsuite/gcc.dg/vmx/eg-5.c (revision 203979)
+++ gcc/testsuite/gcc.dg/vmx/eg-5.c (working copy)
@@ -7,10 +7,17 @@ matvecmul4 (vector float c0, vector float c1, vect
   /* Set result to a vector of f32 0's */
   vector float result = ((vector float){0.,0.,0.,0.});
 
+#ifdef __LITTLE_ENDIAN__
+  result  = vec_madd (c0, vec_splat (v, 3), result);
+  result  = vec_madd (c1, vec_splat (v, 2), result);
+  result  = vec_madd (c2, vec_splat (v, 1), result);
+  result  = vec_madd (c3, vec_splat (v, 0), result);
+#else
   result  = vec_madd (c0, vec_splat (v, 0), result);
   result  = vec_madd (c1, vec_splat (v, 1), result);
   result  = vec_madd (c2, vec_splat (v, 2), result);
   result  = vec_madd (c3, vec_splat (v, 3), result);
+#endif
 
   return result;
 }




Re: [PATCH] Final removal of mudflap

2013-10-28 Thread pinskia


 On Oct 28, 2013, at 11:24 AM, Jeff Law l...@redhat.com wrote:
 
 On 10/26/13 15:08, Joseph S. Myers wrote:
 As far as I can see, the commit left empty libmudflap directories around
 rather than removing them (SVN, unlike git, allows empty directories to be
 represented in the repository).
 I'll check out an SVN tree and remove the empty directory.

Looks like mike stump has already did this.

Thanks,
Andrew

 
 jeff


[patch] Mostly remove tree-core.h from other .h files.

2013-10-28 Thread Andrew MacLeod
I noticed there were a few .h files which we including tree-core.h 
themselves.   expr.h was one of them, and many of the rtl files used 
that to get at  the basic tree structure for various bits.


I moved the include of tree-core.h to rtl.h instead.  That way all the 
rtl files get it, and when we flatten the rtl.h includes into the c 
files, it'll be reduced to just the ones that need it. Eventually, those 
things that need it from rtl.h won't even need tree-code.h, they'll get 
converted to the new gimple interface instead.


streamer-hooks.h was including tree-core.h as well, but really only 
streamer-hooks.c needed location_t from input.h, so I included that 
instead from there.


Now only tree.h and rtl.h are including tree-core.h, which is more 
appropriate.


bootstrapping and regressions running on x86_64-unknown-linux-gnu.. 
assuming no issues, OK?


Andrew


	* rtl.h: Add tree-core.h to include list.
	* expr.h: Remove tree-core.h from include list.
	* gimple.h: Likewise.
	* ipa-prop.h: Likewise.
	* ipa-utils.h: Likewise.
	* lto-streamer.h: Likewise.
	* streamer-hooks.h: Likewise.
	* streamer-hooks.c: Include input.h.

*** HEADER/rtl.h	2013-10-28 11:55:32.021437788 -0400
--- rtl.h	2013-10-28 14:49:39.749635187 -0400
*** along with GCC; see the file COPYING3.
*** 29,34 
--- 29,35 
  #include alias.h
  #include hashtab.h
  #include flags.h
+ #include tree-core.h
  
  /* Value used by some passes to recognize noop moves as valid
   instructions.  */
*** HEADER/expr.h	2013-10-28 11:55:32.005437796 -0400
--- expr.h	2013-10-28 14:46:25.069932215 -0400
*** along with GCC; see the file COPYING3.
*** 26,34 
  #include rtl.h
  /* For optimize_size */
  #include flags.h
- /* For host_integerp, tree_low_cst, fold_convert, size_binop, ssize_int,
-TREE_CODE, TYPE_SIZE, int_size_in_bytes,*/
- #include tree-core.h
  /* For GET_MODE_BITSIZE, word_mode */
  #include machmode.h
  
--- 26,31 
*** HEADER/gimple.h	2013-10-28 11:55:32.009437794 -0400
--- gimple.h	2013-10-28 15:01:27.285050961 -0400
*** along with GCC; see the file COPYING3.
*** 27,33 
  #include vec.h
  #include ggc.h
  #include basic-block.h
- #include tree-core.h
  #include tree-ssa-alias.h
  #include internal-fn.h
  #include gimple-fold.h
--- 27,32 
*** HEADER/ipa-prop.h	2013-10-28 11:55:32.013437792 -0400
--- ipa-prop.h	2013-10-28 15:03:36.512726491 -0400
*** along with GCC; see the file COPYING3.
*** 20,26 
  #ifndef IPA_PROP_H
  #define IPA_PROP_H
  
- #include tree-core.h
  #include vec.h
  #include cgraph.h
  #include alloc-pool.h
--- 20,25 
*** HEADER/ipa-utils.h	2013-10-28 11:55:32.014437792 -0400
--- ipa-utils.h	2013-10-28 15:04:02.304664416 -0400
*** along with GCC; see the file COPYING3.
*** 20,26 
  
  #ifndef GCC_IPA_UTILS_H
  #define GCC_IPA_UTILS_H
- #include tree-core.h
  #include cgraph.h
  
  struct ipa_dfs_info {
--- 20,25 
*** HEADER/lto-streamer.h	2013-10-28 11:55:32.017437790 -0400
--- lto-streamer.h	2013-10-28 15:04:30.626597222 -0400
*** along with GCC; see the file COPYING3.
*** 25,31 
  
  #include plugin-api.h
  #include hash-table.h
- #include tree-core.h
  #include target.h
  #include cgraph.h
  #include vec.h
--- 25,30 
*** HEADER/streamer-hooks.h	2013-10-28 11:55:32.023437787 -0400
--- streamer-hooks.h	2013-10-28 15:30:31.876866273 -0400
*** along with GCC; see the file COPYING3.
*** 23,30 
  #ifndef GCC_STREAMER_HOOKS_H
  #define GCC_STREAMER_HOOKS_H
  
- #include tree-core.h
- 
  /* Forward declarations to avoid including unnecessary headers.  */
  struct output_block;
  struct lto_input_block;
--- 23,28 
*** HEADER/streamer-hooks.c	2013-10-28 11:55:32.023437787 -0400
--- streamer-hooks.c	2013-10-28 15:30:42.437856247 -0400
*** along with GCC; see the file COPYING3.
*** 23,28 
--- 23,29 
  #include config.h
  #include system.h
  #include coretypes.h
+ #include input.h
  #include streamer-hooks.h
  
  /* Streamer hooks.  */


Re: [PATCH i386 4/8] [AVX512] Add substed patterns: mask_scalar_merge subst.

2013-10-28 Thread Richard Henderson
On 10/28/2013 07:53 AM, Kirill Yukhin wrote:
 This patch introduces mask_scalar_merge  subst.
 
 Is it ok to commit to main trunk?

Ok.


r~


Re: [PATCH] rewrite stack vectors

2013-10-28 Thread Jeff Law

On 10/25/13 13:56, Trevor Saunders wrote:

On Fri, Oct 25, 2013 at 03:30:47PM -0400, Diego Novillo wrote:

On 2013-10-10 14:07 , tsaund...@mozilla.com wrote:

This makes the implementation of stack vectors simpler and easier to use.  This 

   works by 
making the size of the on stack storage a template argument, so the
size is embedded in the type.  This allows you to implicitly convert a
stack_vecT, N to a vecT, va_heap *, and it will just work.  Because there's
no need to support stack vectors in unions we can make them be a more normal
c++ class with a constructor and destructor that are nontrivial.



Thanks.  This looks much simpler, indeed.  The patch is fine to
commit.  Just a couple of observations/questions:


I don't have commit access, so can someone check it in for me?  I
bootstrapped and got no changes in regression tests two weeks ago, but
haven't checked it since if that helps.

Bootstrap went fine (as expected).  Installed on your behalf.  Thanks!

jeff



Re: [patch] Mostly remove tree-core.h from other .h files.

2013-10-28 Thread Jeff Law

On 10/28/13 13:38, Andrew MacLeod wrote:

I noticed there were a few .h files which we including tree-core.h
themselves.   expr.h was one of them, and many of the rtl files used
that to get at  the basic tree structure for various bits.

I moved the include of tree-core.h to rtl.h instead.  That way all the
rtl files get it, and when we flatten the rtl.h includes into the c
files, it'll be reduced to just the ones that need it. Eventually, those
things that need it from rtl.h won't even need tree-code.h, they'll get
converted to the new gimple interface instead.

streamer-hooks.h was including tree-core.h as well, but really only
streamer-hooks.c needed location_t from input.h, so I included that
instead from there.

Now only tree.h and rtl.h are including tree-core.h, which is more
appropriate.

bootstrapping and regressions running on x86_64-unknown-linux-gnu..
assuming no issues, OK?

OK.
jeff


Re: [PATCH] Final removal of mudflap

2013-10-28 Thread Mike Stump
On Oct 28, 2013, at 11:24 AM, Jeff Law l...@redhat.com wrote:
 On 10/26/13 15:08, Joseph S. Myers wrote:
 As far as I can see, the commit left empty libmudflap directories around
 rather than removing them (SVN, unlike git, allows empty directories to be
 represented in the repository).
 I'll check out an SVN tree and remove the empty directory.

No time like the present, I removed it.  I did a bootstrap on linux to try and 
ensure no fallout.

Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS

2013-10-28 Thread Mike Stump
Ping?

On Oct 18, 2013, at 10:03 AM, Mike Stump mikest...@comcast.net wrote:
 On Oct 18, 2013, at 4:24 AM, Richard Biener richard.guent...@gmail.com 
 wrote:
 I agree.  Btw, the implementation defined precision of PDI on SH-5 
 definitely
 looks interesting, but I wonder how you can perform implementation defined
 arithmetic on that PDI mode then ;)
 
 Easy, perform it in the maximal size supported…
 
 What else blocks this patch?
 
 Nothing.  We have heard from everyone, and all the issues found by testing 
 were on msp, and those were exclusively due to port issues that I believe now 
 have been resolved.  The below patch has the change to 22 bits for PSI on SH.
 
 Ok?
 

Index: gcc/config/msp430/msp430-modes.def
===
--- gcc/config/msp430/msp430-modes.def  (revision 202634)
+++ gcc/config/msp430/msp430-modes.def  (working copy)
@@ -1,3 +1,3 @@
 /* 20-bit address */
-PARTIAL_INT_MODE (SI);
+PARTIAL_INT_MODE_NAME (SI, 20, PSI);
 
Index: gcc/config/bfin/bfin-modes.def
===
--- gcc/config/bfin/bfin-modes.def  (revision 202634)
+++ gcc/config/bfin/bfin-modes.def  (working copy)
@@ -19,7 +19,7 @@
http://www.gnu.org/licenses/.  */
 
 /* PDImode for the 40-bit accumulators.  */
-PARTIAL_INT_MODE (DI);
+PARTIAL_INT_MODE_NAME (DI, 40, PDI);
 
 /* Two of those - covering both accumulators for vector multiplications.  */
 VECTOR_MODE (INT, PDI, 2);
Index: gcc/config/m32c/m32c-modes.def
===
--- gcc/config/m32c/m32c-modes.def  (revision 202634)
+++ gcc/config/m32c/m32c-modes.def  (working copy)
@@ -22,7 +22,7 @@
 /*INT_MODE (PI, 3);*/
 
 /* 24-bit pointers, in 32-bit units */
-PARTIAL_INT_MODE (SI);
+PARTIAL_INT_MODE_NAME (SI, 24, PSI);
 
 /* 48-bit MULEX result */
 /* INT_MODE (MI, 6); */
Index: gcc/config/rs6000/rs6000-modes.def
===
--- gcc/config/rs6000/rs6000-modes.def  (revision 202634)
+++ gcc/config/rs6000/rs6000-modes.def  (working copy)
@@ -45,4 +45,4 @@ VECTOR_MODES (FLOAT, 32); /*   V
 /* Replacement for TImode that only is allowed in GPRs.  We also use PTImode
for quad memory atomic operations to force getting an even/odd register
combination.  */
-PARTIAL_INT_MODE (TI);
+PARTIAL_INT_MODE_NAME (TI, 128, PTI);
Index: gcc/config/sh/sh-modes.def
===
--- gcc/config/sh/sh-modes.def  (revision 202634)
+++ gcc/config/sh/sh-modes.def  (working copy)
@@ -18,9 +18,9 @@ along with GCC; see the file COPYING3.
 http://www.gnu.org/licenses/.  */
 
 /* The SH uses a partial integer mode to represent the FPSCR register.  */
-PARTIAL_INT_MODE (SI);
+PARTIAL_INT_MODE_NAME (SI, 22, PSI);
 /* PDI mode is used to represent a function address in a target register.  */
-PARTIAL_INT_MODE (DI);
+PARTIAL_INT_MODE_NAME (DI, 64, PDI);
 
 /* Vector modes.  */
 VECTOR_MODE  (INT, QI, 2);/* V2QI */
Index: gcc/genmodes.c
===
--- gcc/genmodes.c  (revision 202634)
+++ gcc/genmodes.c  (working copy)
@@ -629,10 +629,14 @@ reset_float_format (const char *name, co
   m-format = format;
 }
 
-/* Partial integer modes are specified by relation to a full integer mode.
-   For now, we do not attempt to narrow down their bit sizes.  */
-#define PARTIAL_INT_MODE(M) \
-  make_partial_integer_mode (#M, P #M, -1U, __FILE__, __LINE__)
+/* Partial integer modes are specified by relation to a full integer
+   mode.  */
+#define PARTIAL_INT_MODE(M,PREC)   \
+  make_partial_integer_mode (#M, P #PREC #M, PREC, __FILE__, __LINE__)
+/* Partial integer modes are specified by relation to a full integer
+   mode.  */
+#define PARTIAL_INT_MODE_NAME(M,PREC,NAME) \
+  make_partial_integer_mode (#M, #NAME, PREC, __FILE__, __LINE__)
 static void ATTRIBUTE_UNUSED
 make_partial_integer_mode (const char *base, const char *name,
   unsigned int precision,
@@ -669,7 +673,7 @@ make_vector_mode (enum mode_class bclass
   struct mode_data *v;
   enum mode_class vclass = vector_class (bclass);
   struct mode_data *component = find_mode (base);
-  char namebuf[8];
+  char namebuf[16];
 
   if (vclass == MODE_RANDOM)
 return;
@@ -917,7 +921,7 @@ enum machine_mode\n{);
 end will try to use it for bitfields in structures and the
 like, which we do not want.  Only the target md file should
 generate BImode widgets.  */
-  if (first  first-precision == 1)
+  if (first  first-precision == 1  c == MODE_INT)
first = first-next;
 
   if (first  last)
@@ -1187,7 +1191,7 @@ emit_class_narrowest_mode (void)
 /* Bleah, all this to get the comment right for MIN_MODE_INT.  */
 tagged_printf (MIN_%s, 

Re: Commit: MSP430: Pass -md on to assembler

2013-10-28 Thread Mike Stump
Ping?

On Sep 27, 2013, at 11:21 AM, Mike Stump mikest...@comcast.net wrote:
 On Sep 27, 2013, at 1:48 AM, nick clifton ni...@redhat.com wrote:
 OK by me, although I cannot approve that particular patch.
 
 I know, the intent is for someone that can, to approve it.
 
 But I ran into a very strange problem.  With your PARTIAL_INT_MODE_NAME 
 patch applied GCC started erroneously eliminating NULL function pointer 
 checks!  This was particularly noticeable in libgcc/crtstuff.c where for 
 example:
 
 Index: stor-layout.c
 ===
 --- stor-layout.c (revision 202634)
 +++ stor-layout.c (working copy)
 @@ -2821,7 +2821,7 @@ get_mode_bounds (enum machine_mode mode,
enum machine_mode target_mode,
rtx *mmin, rtx *mmax)
 {
 -  unsigned size = GET_MODE_BITSIZE (mode);
 +  unsigned size = GET_MODE_PRECISION (mode);
   unsigned HOST_WIDE_INT min_val, max_val;
 
   gcc_assert (size = HOST_BITS_PER_WIDE_INT);
 
 fixes this problem.  The problem is that we treat the maximum value of 
 PSImode as -1, and then later we do:
 
  case EQ:
/* x == y is always false for y out of range.  */
  =if (val  mmin || val  mmax)
 B return const0_rtx;
break;
 
 and the answer to the question is 0  -1, is no, so the entire test is 
 eliminated as never able to be true.  After the fix, in your case, we get:
 
 (gdb) p mmin
 $72 = -524288
 (gdb) p mmax
 $73 = 524287
 
 and the test becomes if (0  -524288 || 0  524287), which is not true, then 
 the test isn't eliminated as never true.
 
 Here, we see the test that protects this code uses GET_MODE_PRECISION:
 
 (gdb) macro expand HWI_COMPUTABLE_MODE_P (PSImode)
 expands to: enum mode_class) mode_class[PSImode]) == MODE_INT || ((enum 
 mode_class) mode_class[PSIm\
 ode]) == MODE_PARTIAL_INT)  mode_precision[PSImode] = (8 * 8))
 
 so, clearly, someone was thinking about GET_MODE_PRECISION being in range, 
 not GET_MODE_BITSIZE.
 
 Ok?  [ for the rtl people ]



Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS

2013-10-28 Thread DJ Delorie

Who needs to approve this?  THe target maintainers, or some global maintainer?


Re: Apply attribute returns_nonnull in libiberty

2013-10-28 Thread Marc Glisse

On Fri, 11 Oct 2013, Marc Glisse wrote:


Hello,

here are some uses for returns_nonnull in libiberty.

Bootstrap+testsuite (default languages) on x86_64-unknown-linux-gnu.

2013-10-11  Marc Glisse  marc.gli...@inria.fr

PR tree-optimization/58689
include/
* ansidecl.h (ATTRIBUTE_RETURNS_NONNULL): New macro.
* libiberty.h (basename, lbasename, dos_lbasename, unix_lbasename,
concat_copy): Mark with attributes nonnull(1) and returns_nonnull.
(concat, reconcat, concat_copy2, choose_temp_base, xstrerror,
xmalloc, xrealloc, xcalloc, xstrdup, xstrndup, xmemdup, pex_init):
Mark with attribute returns_nonnull.

libiberty/
* concat.c: Remove note about xmalloc.


Hello,

I cross-posted this message:
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00986.html

to the various libiberty users, and the only replies I got were:

ok for gcc
gdb doesn't mind (Tom Tromey)
makes sense (Mike Frysinger)

is the original patch thus ok?
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00818.html

--
Marc Glisse


Re: Apply attribute returns_nonnull in libiberty

2013-10-28 Thread DJ Delorie

Sure, go for it.


Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-10-28 Thread Jeff Law

On 10/22/13 10:22, Iyer, Balaji V wrote:

Hi Jeff, I have attached 2 patches - 1 for C and 1 for C++ - along
with the changelogs (ChangeLog.cilkplus for C and common changes,
cp-ChangeLog.cilkplus for C++ specific files) with the changes you
have requested.  Answers to your questions are given below also.

It passes all its tests and doesn't affect any other existing tests
(i.e. by affect I mean fail a passing test or pass a failing test).



A note.  cilk-common seems to be a mix of tree and rtl bits.  Those may 
need to be broken into separate files and/or moved around as part of the 
reorganization work that's occurring in GCC right now.  So if Andrew or 
someone else pings you, please respond appropriately.


The C (and shared) parts of this patch are OK.  Jason needs to review 
the C++ front-end changes.


If possible, I recommend installing the C (and shared) bits as well as 
the runtime component if they can build  work w/o the C++ front-end 
changes, then track the C++ front-end changes separately.


Jeff


Re: [PATCH, testsuite/rs6000] Adjust two VMX tests for little endian

2013-10-28 Thread David Edelsohn
On Mon, Oct 28, 2013 at 3:17 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 These two test cases require source changes when compiled on a little
 endian target.  Verified on powerpc64{,le}-unknown-linux-gnu.  Ok to
 commit?

 Thanks,
 Bill


 2013-10-28  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * gcc.dg/vmx/gcc-bug-i.c: Add little endian variant.
 * gcc.dg/vmx/eg-5.c: Likewise.

Okay.

Thanks, David


Re: [PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-28 Thread Jeff Law

On 10/28/13 09:12, Ilya Enkovich wrote:

I fixed documentation for bounds load/store hooks.  Also added documentation 
for list of builtins returned by builtin_chkp_function.

Thanks,
Ilya
--

gcc/

2013-10-28  Ilya Enkovich  ilya.enkov...@intel.com

* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(fn_abi_va_list_bounds_size): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode); New.
(default_builtin_chkp_function): New.
* doc/tm.texi.in (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New.
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
* doc/tm.texi: Regenerated.
* langhooks.h (lang_hooks): Add chkp_supported field.
* langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): New.
(LANG_HOOKS_INITIALIZER); Add LANG_HOOKS_CHKP_SUPPORTED.

This is fine.

Please install after a bootstrap test to ensure everything builds correctly.

Thanks,
Jeff



Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-28 Thread Kirill Yukhin
Hello Richard,
On 28 Oct 08:20, Richard Henderson wrote:
 Why is a masked *scalar* operation useful?

The reason the instructions exist is so that
you can do fully fault correct predicated scalar algorithms.

I example.

In fact, with some hacky tricks, you can fully predicate
normal C code in the SIMD registers. 

One might want to have such region
as fast as possible while staying scalar:
(all vars are integers):
  if ( a[i] )
b[i] += c[i];

Definetely to have max performace we want to have
the region fully predicated. This code cannot be
predicated correctly in IA pre-AVX-512:

  vmovd a[i], %xmm0
  vptestm %zmm0, %zmm0, %k1
  // hack because we didn’t have masking for VPMOVD/Q
  vmovss b[i], %xmm0 {%k1}{z}
  // no scalar int add, hack, 128-bit works fine
  // because mask is sawed off in the right places
  vpaddd c[i], %zmm0, %zmm0 {%k1}{z}
  // vmpmovd/w hack again
  vmovss %xmm0, b[i] {%k1}

So, having such masked scalar insns allows us to have
non-branching scalar code.

II Example.

Perhaps one interesting case of scalar and mask, though not
to do with predication (and really narrow), is as an idiom to
generate a write mask value of 0x1:

  vcmpss k1, xmm0, xmm0, {sae}, 0xf

Currently we do:
  mov   1, %eax
  kmovw %eax, %k1

But we sometimes have to spill a GPR in order to introduce
this sequence. In that case the vcmpss idiom seems like a
better choice  (though you might worry about the additional
false dependency on xmm0).

And finally.
It’s kind of strange not to have complete ISA support.
What if someone would want to have equivalent code for
the vector version and the scalar remainder version?

--
Thanks, K



RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-10-28 Thread Iyer, Balaji V


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Jeff Law
 Sent: Monday, October 28, 2013 4:24 PM
 To: Iyer, Balaji V; r...@redhat.com; Jason Merrill (ja...@redhat.com); Aldy
 Hernandez (al...@redhat.com)
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and 
 C++)
 
 On 10/22/13 10:22, Iyer, Balaji V wrote:
  Hi Jeff, I have attached 2 patches - 1 for C and 1 for C++ - along
  with the changelogs (ChangeLog.cilkplus for C and common changes,
  cp-ChangeLog.cilkplus for C++ specific files) with the changes you
  have requested.  Answers to your questions are given below also.
 
  It passes all its tests and doesn't affect any other existing tests
  (i.e. by affect I mean fail a passing test or pass a failing test).
 
 
 A note.  cilk-common seems to be a mix of tree and rtl bits.  Those may need 
 to
 be broken into separate files and/or moved around as part of the 
 reorganization
 work that's occurring in GCC right now.  So if Andrew or someone else pings
 you, please respond appropriately.
 
 The C (and shared) parts of this patch are OK.  Jason needs to review the C++
 front-end changes.
 
 If possible, I recommend installing the C (and shared) bits as well as the 
 runtime
 component if they can build  work w/o the C++ front-end changes, then track
 the C++ front-end changes separately.

Thanks! I will extract and check in the Cilk_spawn and _Cilk_sync for C work.

-Balaji V. Iyer.

 
 Jeff


Re: free is a killer

2013-10-28 Thread Jeff Law

On 10/26/13 01:15, Marc Glisse wrote:

Hello,

this patch teaches gcc that free kills the memory its argument points
to. The equality test is probably too strict, I guess we can loosen it
later (unless you have suggestions?).

Note that the corresponding code for BUILT_IN_MEMCPY and others seems
suspicious to me, it looks like it is testing for equality between a
pointer and a mem_ref, which is unlikely to happen.

Bootstrap+testsuite on x86_64-unknown-linux-gnu.

2013-10-27  Marc Glisse  marc.gli...@inria.fr

 PR tree-optimization/19831
gcc/
 * tree-ssa-alias.c (stmt_kills_ref_p_1): Handle BUILT_IN_FREE.

gcc/testsuite/
 * gcc.dg/tree-ssa/alias-25.c: New file.

OK for the trunk.

I agree the MEM_REF* and VA_END cases look strange and I think they're 
wrong as well.  Did you happen to try fixing them and running the 
testsuite to see if anything fails?


ISTM your testcase could be tweaked to test conclusively if they're 
doing the right thing -- and regardless of what's found that test 
can/should make its way into the testsuite.


Jeff



Re: [PATCH] make gengtype more robust against user error

2013-10-28 Thread Jeff Law

On 10/25/13 15:37, David Malcolm wrote:

This patch addresses various forms of failure described in
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01974.html

It adds a default: gcc_unreachable(); to the autogenerated switch()
statement in the routines for a base class, converting various kinds of
tag errors from leading to silent lack-of-field traversal into giving
run-time assertion failures (addressing (F) and (G))

It also issues an error within gengtype itself for a missing desc
(failure B), turning this into an error message from gengtype.

I found another potential failure mode:

(H) If you had:

class GTY((desc(%0.kind), tag(0))) foo
{
public:
   int kind;
   tree p;
};

class GTY((tag(1))) bar : public foo
{
public:
   tree q;
};

static GTY(()) foo *dummy_foo;

and there are no explicit pointers to bar in the code, gengtype
treated it as unused, and silently omitted the case for it from
foo's marking routine.

I've updated set_gc_type_used so that it propagates usage down into
subclasses (which recurses), fixing this issue.

To do this efficiently we need to track subclasses in within gengtype,
so the patch also adds that, and uses it to eliminate an O(N^2).

Note that for error (G), if a class within the hierarchy omits a GTY
marker, gengtype doesn't parse it at all, and so doesn't parse the
inheritance information - so we can't issue a warning about this.
However, the lack of a tag will trigger a run-time assertion failure
due to hitting the default:  gcc_unreachable(); in the switch.
The patch also adds a paragraph to the docs, spelling out the need
for evary class in such a hierarchy to have a GTY marker.

I believe this addresses all of the silent-lack-of-field-traversal
issues, converting them to gengtype errors or runtime assertions.
It also adds a handler for (E), turning this from a failure to
compile bogus C to a specific error in gengtype.

I'm bootstrapping/regtesting now.
OK for trunk if that passes?

gcc/
* doc/gty.texi (Inheritance and GTY): Make it clear that
to use autogenerated markers for a class-hierarchy, every class
must have a GTY marker.
* gengtype.h (struct type): Add linked list of subclasses to
the s member of the union.
(add_subclass): New decl.
* gengtype-state.c (read_state_struct_type): Set up subclass
linked list.
* gengtype.c (get_ultimate_base_class): New.
(add_subclass): New.
(new_structure): Set up subclass linked list.
(set_gc_used_type): Propagate usage information to subclasses.
(output_mangled_typename): Use get_ultimate_base_class.
(walk_subclasses): Use the subclass linked list, avoiding an
O(N^2) when writing out all types.
(walk_type): Issue an error if the base class is missing a tag,
rather than generating bogus C code.  Add a gcc_unreachable
default case, in case people omit tags from concrete subclasses,
or get the values wrong.
(write_func_for_structure): Issue an error for subclasses for
which the base doesn't have a desc, since otherwise the
autogenerated routines for the base would silently fail to visit
any subclass fields.
(write_root): Use get_ultimate_base_class, tweaking constness of
tp to match that function's signature.

Thanks for diving into this stuff and getting it fixed.

OK for the trunk,

Jeff



Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-28 Thread Jeff Law

On 10/25/13 11:57, Ilya Enkovich wrote:



There are currently two known issues with LTO. The first one is ICE in
LTO streamer when it reads instrumented code. The second one is
unitialized flag_check_pointers when code is compiled by lto1 (do you
know why it may happen BTW?). It also causes ICE beacause instrumented
code is met when not expected. Of course, I'll fix these problems
anyway, but I was going to allow '-fcheck-pointers -flto' only when
checker testsuite has 100% pass rate with lto.
No idea about why LTO would be failing in this way -- my experience with 
LTO is virtually nil.


I'm OK with the restriction as a temporary measure.  But we really 
dislike this kidn of stuff, so I need a commitment from you to dive into 
these problems and sort out what's really going on.





Ilya





[SH] PR 54236 - add some more addc patterns

2013-10-28 Thread Oleg Endo
Hello,

This adds some more patterns to utilize the SH addc instruction.
Tested on rev 204111 with
make -k check RUNTESTFLAGS=--target_board=sh-sim
\{-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4-single/-ml,-m4/-mb,-m4-single/-mb}

and no new failures.
OK for trunk?

Cheers,
Oleg

gcc/ChangeLog:
PR target/54236
* config/sh/sh.md (*addc): Add more addc related insn and 
splits.
* config/sh/sh.c (addsubcosts): Handle some addc special cases.

testsuite/ChangeLog:
PR target/54236
* gcc.target/sh/pr54236-2: New.
* gcc.target/sh/pr54089-6: Add another rotl special case.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 204098)
+++ gcc/config/sh/sh.md	(working copy)
@@ -1910,6 +1910,126 @@
 			(match_dup 1)))
 	  (clobber (reg:SI T_REG))])])
 
+;; Use shlr-addc to do 'reg + (reg  1)'.
+(define_insn_and_split *addc
+  [(set (match_operand:SI 0 arith_reg_dest)
+	(plus:SI (and:SI (match_operand:SI 1 arith_reg_operand)
+			 (const_int 1))
+		 (match_operand:SI 2 arith_reg_operand)))
+   (clobber (reg:SI T_REG))]
+  TARGET_SH1
+  #
+   can_create_pseudo_p ()
+  [(parallel [(set (match_dup 0) (plus:SI (reg:SI T_REG) (match_dup 2)))
+	  (clobber (reg:SI T_REG))])]
+{
+  emit_insn (gen_shlr (gen_reg_rtx (SImode), operands[1]));
+})
+
+;; Use shlr-addc to do 'reg + reg + (reg  1)'.
+(define_insn_and_split *addc
+  [(set (match_operand:SI 0 arith_reg_dest)
+	(plus:SI (plus:SI (and:SI (match_operand:SI 1 arith_reg_operand)
+  (const_int 1))
+			  (match_operand:SI 2 arith_reg_operand))
+		 (match_operand:SI 3 arith_reg_operand)))
+   (clobber (reg:SI T_REG))]
+  TARGET_SH1
+  #
+   can_create_pseudo_p ()
+  [(parallel [(set (match_dup 0) (plus:SI (plus:SI (match_dup 2) (match_dup 3))
+  (reg:SI T_REG)))
+	  (clobber (reg:SI T_REG))])]
+{
+  emit_insn (gen_shlr (gen_reg_rtx (SImode), operands[1]));
+})
+
+;; Canonicalize 'reg + (reg  1) + reg' into 'reg + reg + (reg  1)'.
+(define_insn_and_split *addc
+  [(set (match_operand:SI 0 arith_reg_dest)
+	(plus:SI (and:SI (match_operand:SI 1 arith_reg_operand)
+			 (const_int 1))
+		 (plus:SI (match_operand:SI 2 arith_reg_operand)
+			  (match_operand:SI 3 arith_reg_operand
+   (clobber (reg:SI T_REG))]
+  TARGET_SH1
+  #
+   can_create_pseudo_p ()
+  [(parallel [(set (match_dup 0)
+		   (plus:SI (plus:SI (and:SI (match_dup 1) (const_int 1))
+ (match_dup 2))
+			(match_dup 3)))
+	  (clobber (reg:SI T_REG))])])
+
+;; Canonicalize '2 * reg + (reg  1)' into 'reg + reg + (reg  1)'.
+(define_insn_and_split *addc
+  [(set (match_operand:SI 0 arith_reg_dest)
+	(plus:SI (and:SI (match_operand:SI 1 arith_reg_operand)
+			 (const_int 1))
+		 (mult:SI (match_operand:SI 2 arith_reg_operand)
+			  (const_int 2
+   (clobber (reg:SI T_REG))]
+  TARGET_SH1
+  #
+   can_create_pseudo_p ()
+  [(parallel [(set (match_dup 0)
+		   (plus:SI (plus:SI (and:SI (match_dup 1) (const_int 1))
+ (match_dup 2))
+			(match_dup 2)))
+	  (clobber (reg:SI T_REG))])])
+
+;; Use shll-addc to do 'reg + ((unsigned int)reg  31)'.
+(define_insn_and_split *addc
+  [(set (match_operand:SI 0 arith_reg_dest)
+	(plus:SI (lshiftrt:SI (match_operand:SI 1 arith_reg_operand)
+			  (const_int 31))
+		 (match_operand:SI 2 arith_reg_operand)))
+   (clobber (reg:SI T_REG))]
+  TARGET_SH1
+  #
+   can_create_pseudo_p ()
+  [(parallel [(set (match_dup 0) (plus:SI (reg:SI T_REG) (match_dup 2)))
+	  (clobber (reg:SI T_REG))])]
+{
+  emit_insn (gen_shll (gen_reg_rtx (SImode), operands[1]));
+})
+
+;; Use shll-addc to do 'reg + reg + ((unsigned int)reg  31)'.
+(define_insn_and_split *addc
+  [(set (match_operand:SI 0 arith_reg_dest)
+	(plus:SI (plus:SI (lshiftrt:SI (match_operand:SI 1 arith_reg_operand)
+   (const_int 31))
+		 	  (match_operand:SI 2 arith_reg_operand))
+		 (match_operand:SI 3 arith_reg_operand)))
+   (clobber (reg:SI T_REG))]
+  TARGET_SH1
+  #
+   can_create_pseudo_p ()
+  [(parallel [(set (match_dup 0) (plus:SI (plus:SI (match_dup 2) (match_dup 3))
+  (reg:SI T_REG)))
+	  (clobber (reg:SI T_REG))])]
+{
+  emit_insn (gen_shll (gen_reg_rtx (SImode), operands[1]));
+})
+
+;; Canonicalize '2 * reg + ((unsigned int)reg  31)'
+;; into 'reg + reg + (reg  1)'.
+(define_insn_and_split *addc
+  [(set (match_operand:SI 0 arith_reg_dest)
+	(plus:SI (mult:SI (match_operand:SI 1 arith_reg_operand)
+			  (const_int 2))
+		 (lshiftrt:SI (match_operand:SI 2 arith_reg_operand)
+			  (const_int 31
+   (clobber (reg:SI T_REG))]
+  TARGET_SH1
+  #
+   can_create_pseudo_p ()
+  [(parallel [(set (match_dup 0)
+		   (plus:SI (plus:SI (lshiftrt:SI (match_dup 2) (const_int 31))
+ (match_dup 1))
+			(match_dup 1)))
+	  (clobber (reg:SI T_REG))])])
+
 (define_expand addsi3
   [(set (match_operand:SI 0 arith_reg_operand )
 	(plus:SI (match_operand:SI 1 arith_operand )
Index: gcc/config/sh/sh.c

Re: [PATCH] fixing typo in expr.c to allow proper recognition of complex addresses in some arches.

2013-10-28 Thread Jeff Law

On 10/25/13 02:39, Eric Botcazou wrote:

Thanks.  Installed on the trunk.


Well, no, that will be problematic for some architectures.  The history of
this piece of code is complicated and it's admittedly lacking a comment, but
the purpose of the block is clear enough:

op0 = expand_expr (base, NULL_RTX, VOIDmode, EXPAND_SUM);
op0 = memory_address_addr_space (address_mode, op0, as);
if (!integer_zerop (TREE_OPERAND (exp, 1)))
  {
rtx off
  = immed_double_int_const (mem_ref_offset (exp), address_mode);
op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
  }
op0 = memory_address_addr_space (mode, op0, as);

The offset computation is done in address_mode and then, only at the end,
converted to mode.
My reading of memory_address_addr_space is that MODE is the mode of the 
memory reference, not the mode of the address.  I fail to see how 
passing in the mode of the address in the first call can be correct.


What am I missing?

jeff


[PATCH] Fix comment typo

2013-10-28 Thread Jeff Law


Just something I noticed while reviewing a patch.  Committed.



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e525748..013fe73 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2013-10-28  Jeff Law  l...@redhat.com
+
+   * lower-subreg.c (resolve_simple_move): Fix comment typo.
+
 2013-10-28  Trevor Saunders  tsaund...@mozilla.com
 
* df-scan.c (df_collection_rec): Adjust.
diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index 8ff5fc3..ebf364f 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -1070,7 +1070,7 @@ resolve_simple_move (rtx set, rtx insn)
 
   emit_insn_before (insns, insn);
 
-  /* If we get here via self-recutsion, then INSN is not yet in the insns
+  /* If we get here via self-recursion, then INSN is not yet in the insns
  chain and delete_insn will fail.  We only want to remove INSN from the
  current sequence.  See PR56738.  */
   if (in_sequence_p ())


Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-28 Thread Richard Henderson
On 10/28/2013 01:58 PM, Kirill Yukhin wrote:
 Hello Richard,
 On 28 Oct 08:20, Richard Henderson wrote:
 Why is a masked *scalar* operation useful?
 
 The reason the instructions exist is so that
 you can do fully fault correct predicated scalar algorithms.

Using VEC_MERGE isn't the proper representation for that.

If that's your real goal, then COND_EXEC is the only way to let
rtl know that faults are suppressed in the false condition.


r~


Re: [Patch, C, C++] Accept GCC ivdep for 'do' and 'while', and for C++11's range-based loops

2013-10-28 Thread Tobias Burnus

Jason Merrill wrote:

On 10/25/2013 04:46 PM, Tobias Burnus wrote:
+do_range_for_auto_deduction (range_decl, range_expr); // 
FIXME:IVDEP


I think in this situation let's set a flag on the RANGE_FOR_STMT so we 
can do ivdep handling at instantiation time.


I am not completely sure whether you had the following in mind, but 
that's what I have now implemented:

  DEFTREECODE (RANGE_FOR_STMT, range_for_stmt, tcc_statement, 4)
has now a 5th operator (RANGE_FOR_IVDEP), which has the value 
boolean_true_node - or NULL_TREE (via begin_range_for_stmt).


Additionally, I moved the ivdep annotation to semantics.c as one 
otherwise might get a cleanup_point_expr around the annotate expression, 
which the current algorithm in tree-cfg.c's replace_loop_annotate 
doesn't handle.


Unfortunately, the GCC ivdep annotation doesn't help with 
pr33426-ivdep-4.cc: It still versions the loop with an alias check. But 
that seems to be a middle-end problem as loop-safelen gets set.


I did an all-language bootstrap + regtesting on x86-64-gnu-linux.
Is the patch now OK?

Tobias
2013-10-29  Tobias Burnus  bur...@net-b.de

gcc/cp/
	PR other/33426
	* cp-tree.def (RANGE_FOR_STMT): Add RANGE_FOR_IVDEP flag.
	* cp-tree.h (RANGE_FOR_IVDEP): Define.
	(cp_convert_range_for, finish_while_stmt_cond, finish_do_stmt,
	finish_for_cond): Take 'bool ivdep' parameter.
	* cp-array-notation.c (create_an_loop): Update call.
	* init.c (build_vec_init): Ditto.
	* pt.c (tsubst_expr): Ditto.
	* parser.c (cp_parser_iteration_statement, cp_parser_for,
	cp_parser_range_for, cp_convert_range_for): Update calls.
	(cp_parser_pragma): Accept GCC ivdep for 'while' and 'do'.
	* semantics.c (finish_while_stmt_cond, finish_do_stmt,
	finish_for_cond): Optionally build ivdep annotation.
	(begin_range_for_stmt): Adapt for RANGE_FOR_STMT changes.

gcc/testsuite/
	PR other/33426
	* g++.dg/vect/pr33426-ivdep-2.cc: New.
	* g++.dg/vect/pr33426-ivdep-3.cc: New.
	* g++.dg/vect/pr33426-ivdep-4.cc: New.

gcc/
	PR other/33426
	* gcc/tree-cfg.c (replace_loop_annotate): Replace warning by
	warning_at.

diff --git a/gcc/cp/cp-array-notation.c b/gcc/cp/cp-array-notation.c
index c700f58..e1fb0ee 100644
--- a/gcc/cp/cp-array-notation.c
+++ b/gcc/cp/cp-array-notation.c
@@ -71,7 +71,7 @@ create_an_loop (tree init, tree cond, tree incr, tree body)
   finish_expr_stmt (init);
   for_stmt = begin_for_stmt (NULL_TREE, NULL_TREE);
   finish_for_init_stmt (for_stmt);
-  finish_for_cond (cond, for_stmt);
+  finish_for_cond (cond, for_stmt, false);
   finish_for_expr (incr, for_stmt);
   finish_expr_stmt (body);
   finish_for_stmt (for_stmt);
diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index bd9bfa8..93355b5 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -299,9 +299,9 @@ DEFTREECODE (IF_STMT, if_stmt, tcc_statement, 4)
 DEFTREECODE (FOR_STMT, for_stmt, tcc_statement, 5)
 
 /* Used to represent a range-based `for' statement. The operands are
-   RANGE_FOR_DECL, RANGE_FOR_EXPR, RANGE_FOR_BODY, and RANGE_FOR_SCOPE,
-   respectively.  Only used in templates.  */
-DEFTREECODE (RANGE_FOR_STMT, range_for_stmt, tcc_statement, 4)
+   RANGE_FOR_DECL, RANGE_FOR_EXPR, RANGE_FOR_BODY, RANGE_FOR_SCOPE,
+   and RANGE_FOR_IVDEP, respectively.  Only used in templates.  */
+DEFTREECODE (RANGE_FOR_STMT, range_for_stmt, tcc_statement, 5)
 
 /* Used to represent a 'while' statement. The operands are WHILE_COND
and WHILE_BODY, respectively.  */
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 507b389..0d3bc70 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4088,6 +4088,7 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter)
 #define RANGE_FOR_EXPR(NODE)	TREE_OPERAND (RANGE_FOR_STMT_CHECK (NODE), 1)
 #define RANGE_FOR_BODY(NODE)	TREE_OPERAND (RANGE_FOR_STMT_CHECK (NODE), 2)
 #define RANGE_FOR_SCOPE(NODE)	TREE_OPERAND (RANGE_FOR_STMT_CHECK (NODE), 3)
+#define RANGE_FOR_IVDEP(NODE)	TREE_OPERAND (RANGE_FOR_STMT_CHECK (NODE), 4)
 
 #define SWITCH_STMT_COND(NODE)	TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 0)
 #define SWITCH_STMT_BODY(NODE)	TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 1)
@@ -4321,7 +4322,7 @@ extern int comparing_specializations;
sizeof can be nested.  */
 
 extern int cp_unevaluated_operand;
-extern tree cp_convert_range_for (tree, tree, tree);
+extern tree cp_convert_range_for (tree, tree, tree, bool);
 extern bool parsing_nsdmi (void);
 
 /* in pt.c  */
@@ -5671,16 +5672,16 @@ extern void begin_else_clause			(tree);
 extern void finish_else_clause			(tree);
 extern void finish_if_stmt			(tree);
 extern tree begin_while_stmt			(void);
-extern void finish_while_stmt_cond		(tree, tree);
+extern void finish_while_stmt_cond		(tree, tree, bool);
 extern void finish_while_stmt			(tree);
 extern tree begin_do_stmt			(void);
 extern void finish_do_body			(tree);
-extern void finish_do_stmt			(tree, tree);
+extern void finish_do_stmt			(tree, tree, bool);
 extern tree finish_return_stmt			(tree);
 extern tree begin_for_scope			(tree *);
 extern 

Re: free is a killer

2013-10-28 Thread Marc Glisse

On Mon, 28 Oct 2013, Jeff Law wrote:


On 10/26/13 01:15, Marc Glisse wrote:

Hello,

this patch teaches gcc that free kills the memory its argument points
to. The equality test is probably too strict, I guess we can loosen it
later (unless you have suggestions?).

Note that the corresponding code for BUILT_IN_MEMCPY and others seems
suspicious to me, it looks like it is testing for equality between a
pointer and a mem_ref, which is unlikely to happen.

Bootstrap+testsuite on x86_64-unknown-linux-gnu.

2013-10-27  Marc Glisse  marc.gli...@inria.fr

 PR tree-optimization/19831
gcc/
 * tree-ssa-alias.c (stmt_kills_ref_p_1): Handle BUILT_IN_FREE.

gcc/testsuite/
 * gcc.dg/tree-ssa/alias-25.c: New file.

OK for the trunk.


Thanks.

I agree the MEM_REF* and VA_END cases look strange and I think they're wrong 
as well.  Did you happen to try fixing them and running the testsuite to see 
if anything fails?


ISTM your testcase could be tweaked to test conclusively if they're doing the 
right thing -- and regardless of what's found that test can/should make its 
way into the testsuite.


I checked and it does the wrong thing (I don't have the testcase handy 
anymore, but it shouldn't be hard to recreate one), I even wrote a patch 
(attached) but it is related to:

http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02238.html
so it can't go in. A more limited fix (still missing many cases) would be 
possible. I didn't test VA_END, but it doesn't look as bad (it is the 
SSA_NAME case that is wrong for memcpy).



While I am posting non-submitted patches, does the following vaguely make 
sense? I couldn't find a testcase that exercised it without some local 
patches, but the idea (IIRC) is that with:

struct A { struct B b; int i; }
we can easily end up with one ao_ref.base that is a MEM_REF[p] and 
another one a MEM_REF[(B*)q] where p and q are of type A*, and that 
prevents us from noticing that p-i and q-b don't alias. Maybe I should 
instead find a way to get MEM_REF[q] as ao_ref.base, but if q actually 
points to a larger structure that starts with A it is likely to fail as 
well...


--- gcc/tree-ssa-alias.c(revision 204095)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -1099,22 +1099,24 @@ indirect_refs_may_alias_p (tree ref1 ATT

   /* If both references are through the same type, they do not alias
  if the accesses do not overlap.  This does extra disambiguation
  for mixed/pointer accesses but requires strict aliasing.  */
   if ((TREE_CODE (base1) != TARGET_MEM_REF
|| (!TMR_INDEX (base1)  !TMR_INDEX2 (base1)))
(TREE_CODE (base2) != TARGET_MEM_REF
  || (!TMR_INDEX (base2)  !TMR_INDEX2 (base2)))
same_type_for_tbaa (TREE_TYPE (base1), TREE_TYPE (ptrtype1)) == 1
same_type_for_tbaa (TREE_TYPE (base2), TREE_TYPE (ptrtype2)) == 1
-   same_type_for_tbaa (TREE_TYPE (ptrtype1),
-TREE_TYPE (ptrtype2)) == 1)
+   (same_type_for_tbaa (TREE_TYPE (ptrtype1),
+ TREE_TYPE (ptrtype2)) == 1
+ || same_type_for_tbaa (TREE_TYPE (TREE_TYPE (ptr1)),
+TREE_TYPE (TREE_TYPE (ptr2))) == 1))
 return ranges_overlap_p (offset1, max_size1, offset2, max_size2);

   /* Do type-based disambiguation.  */
   if (base1_alias_set != base2_alias_set
!alias_sets_conflict_p (base1_alias_set, base2_alias_set))
 return false;

   /* Do access-path based disambiguation.  */
   if (ref1  ref2
(handled_component_p (ref1) || handled_component_p (ref2))



In the category ugly hack that I won't submit, let me also attach this 
patch that sometimes helps notice that malloc doesn't alias other things 
(I don't know if the first hunk ever fires).


--
Marc GlisseIndex: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204089)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -1985,23 +1985,24 @@ stmt_may_clobber_ref_p (gimple stmt, tre
   return stmt_may_clobber_ref_p_1 (stmt, r);
 }
 
 /* If STMT kills the memory reference REF return true, otherwise
return false.  */
 
 static bool
 stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
 {
   /* For a must-alias check we need to be able to constrain
- the access properly.  */
-  ao_ref_base (ref);
-  if (ref-max_size == -1)
+ the access properly.
+ FIXME: except for BUILTIN_FREE.  */
+  if (!ao_ref_base (ref)
+  || ref-max_size == -1)
 return false;
 
   if (gimple_has_lhs (stmt)
TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME
   /* The assignment is not necessarily carried out if it can throw
 and we can catch it in the current function where we could inspect
 the previous value.
 ???  We only need to care about the RHS throwing.  For aggregate
 assignments or similar calls and non-call exceptions the LHS
 might throw as well.  */
@@ -2064,37 +2065,45 @@ 

Re: [PATCH] Do not append *INTERNAL* to the decl name

2013-10-28 Thread Dehao Chen
ping...

Thanks,
Dehao

On Fri, Oct 18, 2013 at 11:06 AM, Dehao Chen de...@google.com wrote:
 On Fri, Oct 18, 2013 at 10:39 AM, Jason Merrill ja...@redhat.com wrote:
 On 10/11/2013 01:59 PM, Dehao Chen wrote:

 It's hard to get a testcase without
 http://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=201856 because
 none of these *INTERNAL* symbols will be emitted in debug info.


 Why does that change cause one of these symbols to be emitted?  As Cary
 says, that was done as an assertion.  If you're invalidating the assertion,
 I need more explanation as to why.

 Sorry, pointed to the incorrect patch, it should be
 http://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=201858

 These symbols will not be emitted in the symbol table in assembly, but
 they will be emitted in the debug_info after
 http://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=201858, which
 emits function's linkage name in debug_info even when it's abstract.
 The *INTERNAL* assertion I removed is an over-assertion that not only
 assert for symbol table, but also for debug info.

 Thanks,
 Dehao


 Jason



[Patch, fortran] PRs 57893 and 58858

2013-10-28 Thread Paul Richard Thomas
Dear All,

This patch addresses issues arising from PR57893.  It is entirely obvious.

Bootstraps and regtests on FC17/x86_64 - OK for trunk?

Cheers

Paul

PS I wash my hands of all attempts to use BT_HOLLERITH types!

2013-10-29  Paul Thomas  pa...@gcc.gnu.org

PR fortran 57893
* trans-types.c (gfc_typenode_for_spec): Add typenode for
BT_HOLLERITH. Note that the length is incorrect but unusable.

PR fortran 58858
* target-memory.c (gfc_element_size): Add element sizes for
BT_VOID and BT_ASSUMED, using gfc_typenode_for_spec.

2013-10-29  Paul Thomas  pa...@gcc.gnu.org

PR fortran 57893
* gfortran.dg/unlimited_polymorphic_13.f90 : Use real variables
to determine sizes of real kinds.

PR fortran 58858
* gfortran.dg/unlimited_polymorphic_14.f90 : New test.
Index: gcc/fortran/trans-types.c
===
*** gcc/fortran/trans-types.c	(revision 204135)
--- gcc/fortran/trans-types.c	(working copy)
*** gfc_typenode_for_spec (gfc_typespec * sp
*** 1099,1104 
--- 1099,1110 
  	basetype = gfc_get_character_type (spec-kind, spec-u.cl);
break;
  
+ case BT_HOLLERITH:
+   /* Since this cannot be used, return a length one character.  */
+   basetype = gfc_get_character_type_len (gfc_default_character_kind,
+ 	 gfc_index_one_node);
+   break;
+ 
  case BT_DERIVED:
  case BT_CLASS:
basetype = gfc_get_derived_type (spec-u.derived);
Index: gcc/fortran/target-memory.c
===
*** gcc/fortran/target-memory.c	(revision 204135)
--- gcc/fortran/target-memory.c	(working copy)
*** gfc_element_size (gfc_expr *e)
*** 109,114 
--- 109,116 
return e-representation.length;
  case BT_DERIVED:
  case BT_CLASS:
+ case BT_VOID:
+ case BT_ASSUMED:
{
  	/* Determine type size without clobbering the typespec for ISO C
  	   binding types.  */
Index: gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90
===
*** gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90	(revision 204135)
--- gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90	(working copy)
*** module m
*** 13,18 
--- 13,22 
integer, parameter :: c2 = real_kinds(2)
integer, parameter :: c3 = real_kinds(size(real_kinds)-1)
integer, parameter :: c4 = real_kinds(size(real_kinds))
+   real(c1) :: r1
+   real(c2) :: r2
+   real(c3) :: r3
+   real(c4) :: r4
  contains
   subroutine s(o, k)
  class(*) :: o
*** contains
*** 21,31 
  
  select case (k)
   case (4)
!   sz = 32*2
   case (8)
!   sz = 64*2
!  case (10,16)
!   sz = 128*2
   case default
 call abort()
  end select
--- 25,37 
  
  select case (k)
   case (4)
!   sz = storage_size(r1)*2
   case (8)
!   sz = storage_size(r2)*2
!  case (10)
!   sz = storage_size(r3)*2
!  case (16)
!   sz = storage_size(r4)*2
   case default
 call abort()
  end select
*** contains
*** 36,43 
  if (storage_size(o) /= sz) call abort()
type is (complex(c2))
  if (storage_size(o) /= sz) call abort()
- end select
- select type (o)
type is (complex(c3))
  if (storage_size(o) /= sz) call abort()
type is (complex(c4))
--- 42,47 
Index: gcc/testsuite/gfortran.dg/unlimited_polymorphic_14.f90
===
*** gcc/testsuite/gfortran.dg/unlimited_polymorphic_14.f90	(revision 0)
--- gcc/testsuite/gfortran.dg/unlimited_polymorphic_14.f90	(working copy)
***
*** 0 
--- 1,26 
+ ! { dg-do run }
+ !
+ ! Uncovered in fixing PR fortran/58793
+ !
+ ! Contributed by Tobias Burnus  bur...@gcc.gnu.org
+ !
+ ! Barfed on the hollerith argument
+ !
+ program test
+   logical l
+   call up(abc, l)
+   if (l) call abort
+   call up(3habc, l) ! { dg-warning Legacy Extension }
+   if (.not. l) call abort
+ contains
+   subroutine up(x, l)
+ class(*) :: x
+ logical l
+ select type(x)
+  type is (character(*))
+   l = .false.
+  class default
+   l = .true.
+ end select
+   end subroutine
+ end program test


[PATCH] libstdc++ testsuite cxxflags

2013-10-28 Thread Cesar Philippidis
This patch addresses two issues with the libstdc++ testsuite:

  * duplicate -g -O2 CXXFLAGS
  * missing -g -O2 for remote targets

The duplicate -g -O2 flags is a result of testsuite_flags.in using
build-time CXXFLAGS and proc libstdc++_init using the environmental
CXXFLAGS, which defaults to its build-time value. This patch prevents
testsuite_flags.in from using build-time CXXFLAGS.

Certain remote targets require a minimum optimization level -O1 in order
to pass several atomics built-in function tests. This patch ensures
cxxflags contains -g -O2 at minimum when no other optimization flags
are specified. The testsuite used to set those flags prior to Benjamin's
patch to remove duplicate cxxflags here
http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01572.html.

Is this OK for trunk? If so, please apply.

Thanks,
Cesar
2013-10-28  Cesar Philippidis  ce...@codesourcery.com

libstdc++-v3/
* scripts/testsuite_flags.in (cxxflags): Don't use build-time
CXXFLAGS and EXTRA_CXX_FLAGS.
* testsuite/lib/libstdc++.exp (libstdc++_init): Ensure, at minimum,
cxxflags contains -g -O2.

diff --git a/libstdc++-v3/scripts/testsuite_flags.in 
b/libstdc++-v3/scripts/testsuite_flags.in
index d7710ca..35b36e7 100755
--- a/libstdc++-v3/scripts/testsuite_flags.in
+++ b/libstdc++-v3/scripts/testsuite_flags.in
@@ -55,7 +55,7 @@ case ${query} in
   ;;
 --cxxflags)
   CXXFLAGS_default=-D_GLIBCXX_ASSERT -fmessage-length=0
-  CXXFLAGS_config=@SECTION_FLAGS@ @CXXFLAGS@ @EXTRA_CXX_FLAGS@
+  CXXFLAGS_config=@SECTION_FLAGS@
   echo ${CXXFLAGS_default} ${CXXFLAGS_config}
   ;;
 --cxxparallelflags)
diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp 
b/libstdc++-v3/testsuite/lib/libstdc++.exp
index 51ff6dd..68dcb15 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -265,6 +265,15 @@ proc libstdc++_init { testfile } {
 }
 append cxxflags  
 append cxxflags [getenv CXXFLAGS]
+
+if {$cxxflags == -D_GLIBCXX_ASSERT -fmessage-length=0 } {
+   append cxxflags -g
+}
+
+if ![regexp \-O $cxxflags] {
+   append cxxflags  -O2
+}
+
 v3track cxxflags 2
 
 # Always use MO files built by this test harness.


[GOOGLE] Don't update the callee count if caller is not resolved node

2013-10-28 Thread Dehao Chen
This patch changes to no update callee count if caller count is not a
resolved node (in LIPO mode) during AutoFDO compilation. This is
because AutoFDO will have the same edge counts for all unresolved
nodes. Original update method will lead to multi-update of the callee.

Bootstrapped and testing on going.

OK for google-4_8 if test is OK?

Thanks,
Dehao

Index: gcc/ipa-inline-transform.c
===
--- gcc/ipa-inline-transform.c (revision 204131)
+++ gcc/ipa-inline-transform.c (working copy)
@@ -169,6 +169,15 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
   else
  {
   struct cgraph_node *n;
+  if (flag_auto_profile  L_IPO_COMP_MODE
+   cgraph_pre_profiling_inlining_done)
+{
+  struct cgraph_node *caller = e-caller;
+  if (caller-global.inlined_to)
+ caller = caller-global.inlined_to;
+  if (cgraph_lipo_get_resolved_node (caller-symbol.decl) != caller)
+ update_original = false;
+}
   n = cgraph_clone_node (e-callee, e-callee-symbol.decl,
  e-count, e-frequency,
  update_original, vNULL, true);


Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS

2013-10-28 Thread Mike Stump
On Oct 28, 2013, at 1:30 PM, DJ Delorie d...@redhat.com wrote:
 Who needs to approve this?  THe target maintainers, or some global maintainer?

So Richard said he'd review/approve it…  He's been busy clearing out a backlog 
or email/patches and reviewing wide-int stuff…  when next he comes up for air, 
I suspect he'll approve it.  I've got all the review for the target bits, 
mainly verifying partial sizes, and trying out builds.

Re: [PATCH] libstdc++ testsuite cxxflags

2013-10-28 Thread Mike Stump
On Oct 28, 2013, at 3:29 PM, Cesar Philippidis ce...@codesourcery.com wrote:
 This patch addresses two issues with the libstdc++ testsuite:
 
  * duplicate -g -O2 CXXFLAGS
  * missing -g -O2 for remote targets

I need to defer this to the libstdc++ people…  there is enough of them, and 
they do such a nice job…

Re: [ARM][PATCH] Fix testsuite testcase neon-vcond-[ltgt,unordered].c

2013-10-28 Thread Kugan
On 25/10/13 19:04, Kyrill Tkachov wrote:
 On 24/10/13 20:03, Kugan wrote:

 Hi Kyrill,

 It happens for armv5te arm-none-linux-gnueabi. --with-mode=arm
 --with-arch=armv5te --with-float=soft
 
 Ah ok, I can reproduce it now. So, while I agree that we add a scan for
 vbit and vbif to these testcases, there seems to be something dodgy
 going on with the register allocation.
 
 With -march=armv5te I'm getting the following snippet of code in the
 ltgt case:
 
 .L12:
 ldr r4, [ip]
 ldr r5, [ip, #4]
 ldr r6, [ip, #8]
 ldr r7, [ip, #12]
 vmovd20, r4, r5  @ v4sf
 vmovd21, r6, r7
 vcgt.f32q8, q10, q9
 vcgt.f32q10, q9, q10
 vorrq8, q8, q10
 vmovd22, r4, r5  @ v4sf
 vmovd23, r6, r7
 vbitq11, q9, q8
 vmovr4, r5, d22  @ v4sf
 vmovr6, r7, d23
 
 The second vcgt.f32 trashes q10, then recreates it in q11 with:
 vmovd22, r4, r5  @ v4sf
 vmovd23, r6, r7
 
 so it can do the vbit. Surely there's something better that can be done?
 
 In contrast, with -march=armv7-a we get:
 
 .L12:
 vld1.32 {q9}, [r4]!
 vcgt.f32q8, q9, q10
 vcgt.f32q11, q10, q9
 vorrq8, q8, q11
 vbslq8, q10, q9
 vst1.32 {q8}, [lr]!
 

This is because  of the unaligned access done for armv7-a. arm.c has the
following comment:

  /* Enable -munaligned-access by default for
 - all ARMv6 architecture-based processors
 - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
 - ARMv8 architecture-base processors.

 Disable -munaligned-access by default for
 - all pre-ARMv6 architecture-based processors
 - ARMv6-M architecture-based processors.  */

Please look at the rtl difference.
- is armv7-a
+ is armv5te

;; vect_var_.18_61 = MEM[(float *)vect_pw2.14_59];

-(insn 71 70 72 (set (reg:V4SF 192)
-(unspec:V4SF [
-(mem:V4SF (reg:SI 163 [ ivtmp.47 ]) [0 MEM[(float
*)vect_pw2.14_59]+0 S16 A32])
-] UNSPEC_MISALIGNED_ACCESS)) neon-vcond-ltgt.c:12 -1
+(insn 71 70 72 (clobber (reg:V4SF 168 [ vect_var_.18 ]))
neon-vcond-ltgt.c:12 -1
+ (nil))
+
+(insn 72 71 73 (set (subreg:SI (reg:V4SF 168 [ vect_var_.18 ]) 0)
+(mem:SI (reg:SI 163 [ ivtmp.47 ]) [0 MEM[(float
*)vect_pw2.14_59]+0 S4 A32])) neon-vcond-ltgt.c:12 -1
+ (nil))
+
+(insn 73 72 74 (set (subreg:SI (reg:V4SF 168 [ vect_var_.18 ]) 4)
+(mem:SI (plus:SI (reg:SI 163 [ ivtmp.47 ])
+(const_int 4 [0x4])) [0 MEM[(float *)vect_pw2.14_59]+4
S4 A32])) neon-vcond-ltgt.c:12 -1
+ (nil))
+
+(insn 74 73 75 (set (subreg:SI (reg:V4SF 168 [ vect_var_.18 ]) 8)
+(mem:SI (plus:SI (reg:SI 163 [ ivtmp.47 ])
+(const_int 8 [0x8])) [0 MEM[(float *)vect_pw2.14_59]+8
S4 A32])) neon-vcond-ltgt.c:12 -1
  (nil))

-(insn 72 71 0 (set (reg:V4SF 168 [ vect_var_.18 ])
-(reg:V4SF 192)) neon-vcond-ltgt.c:12 -1
+(insn 75 74 0 (set (subreg:SI (reg:V4SF 168 [ vect_var_.18 ]) 12)
+(mem:SI (plus:SI (reg:SI 163 [ ivtmp.47 ])
+(const_int 12 [0xc])) [0 MEM[(float
*)vect_pw2.14_59]+12 S4 A32])) neon-vcond-ltgt.c:12 -1
  (nil))




patch to fix a LRA crash on ppc

2013-10-28 Thread Vladimir Makarov
  The following patch is a new version of a fix for LRA crash on 32-bit 
ppc SPEC2006 gamess.


  The previous version broke mode-switching optimization on x86/x86-64.

  The patch was successfully bootstrapped and tested on x86/x86-64.

  Committed as rev. 204140.

2013-10-28  Vladimir Makarov  vmaka...@redhat.com

* lra-spills.c (lra_final_code_change): Remove useless move insns
originated from moves of pseudos.

Index: lra-spills.c
===
--- lra-spills.c(revision 204131)
+++ lra-spills.c(working copy)
@@ -625,7 +625,7 @@ lra_final_code_change (void)
 {
   int i, hard_regno;
   basic_block bb;
-  rtx insn, curr;
+  rtx insn, curr, set;
   int max_regno = max_reg_num ();
 
   for (i = FIRST_PSEUDO_REGISTER; i  max_regno; i++)
@@ -636,6 +636,7 @@ lra_final_code_change (void)
 FOR_BB_INSNS_SAFE (bb, insn, curr)
   if (INSN_P (insn))
{
+ bool change_p;
  rtx pat = PATTERN (insn);
 
  if (GET_CODE (pat) == CLOBBER  LRA_TEMP_CLOBBER_P (pat))
@@ -648,6 +649,12 @@ lra_final_code_change (void)
  continue;
}
 
+ set = single_set (insn);
+ change_p = (set != NULL
+  REG_P (SET_SRC (set))  REG_P (SET_DEST (set))
+  REGNO (SET_SRC (set)) = FIRST_PSEUDO_REGISTER
+  REGNO (SET_DEST (set)) = FIRST_PSEUDO_REGISTER);
+ 
  lra_insn_recog_data_t id = lra_get_insn_recog_data (insn);
  struct lra_static_insn_data *static_id = id-insn_static_data;
  bool insn_change_p = false;
@@ -661,5 +668,20 @@ lra_final_code_change (void)
  }
  if (insn_change_p)
lra_update_operator_dups (id);
+
+ if (change_p  REGNO (SET_SRC (set)) == REGNO (SET_DEST (set)))
+   {
+ /* Remove an useless move insn but only involving
+pseudos as some subsequent optimizations are based on
+that move insns involving originally hard registers
+are preserved.  IRA can generate move insns involving
+pseudos.  It is better remove them earlier to speed
+up compiler a bit.  It is also better to do it here
+as they might not pass final RTL check in LRA,
+(e.g. insn moving a control register into
+itself).  */
+ lra_invalidate_insn_data (insn);
+ delete_insn (insn);
+   }
}
 }


Re: [PATCH v2 3/4] Handle simple inheritance in gengtype.

2013-10-28 Thread David Malcolm
On Tue, 2013-10-15 at 12:58 -0600, Jeff Law wrote:
[...]
 This is OK.

Thanks; committed to trunk as r204146.

(I was holding off until approval of the robustification followup, which
you've now approved, as [1]).

Dave

[1] http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02330.html




Re: [PATCH] Enhance ifcombine to recover non short circuit branches

2013-10-28 Thread Zhenqiang Chen
On 28 October 2013 02:55, Andrew Pinski pins...@gmail.com wrote:
 On Sat, Oct 26, 2013 at 4:49 PM, Andrew Pinski pins...@gmail.com wrote:
 On Sat, Oct 26, 2013 at 2:30 PM, Andrew Pinski pins...@gmail.com wrote:
 On Fri, Oct 18, 2013 at 2:21 AM, Zhenqiang Chen
 zhenqiang.c...@linaro.org wrote:
 On 18 October 2013 00:58, Jeff Law l...@redhat.com wrote:
 On 10/17/13 05:03, Richard Biener wrote:

 Is it OK for trunk?


 I had a much simpler change which did basically the same from 4.7 (I
 can update it if people think this is a better approach).


 I like that more (note you can now use is_gimple_condexpr as predicate
 for force_gimple_operand).

 The obvious question is whether or not Andrew's simpler change picks up as
 many transformations as Zhenqiang's change.  If not are the things missed
 important.

 Zhenqiang, can you do some testing of your change vs Andrew P.'s change?

 Here is a rough compare:

 1) Andrew P.'s change can not handle ssa-ifcombine-ccmp-3.c (included
 in my patch). Root cause is that it does not skip LABEL. The guard
 to do this opt should be the same the bb_has_overhead_p in my patch.

 This should be an easy change, I am working on this right now.


 2) Andrew P.'s change always generate TRUTH_AND_EXPR, which is not
 efficient for ||. e.g. For ssa-ifcombine-ccmp-6.c, it will generate

   _3 = a_2(D)  0;
   _5 = b_4(D)  0;
   _6 = _3 | _5;
   _9 = c_7(D) = 0;
   _10 = ~_6;
   _11 = _9  _10;
   if (_11 == 0)

 With my patch, it will generate

   _3 = a_2(D)  0;
   _5 = b_4(D)  0;
   _6 = _3 | _5;
   _9 = c_7(D)  0;
   _10 = _6 | _9;
   if (_10 != 0)

 As mentioned otherwise, this seems like a missed optimization inside
 forwprop.  When I originally wrote this code there used to be two
 cases one for  and one for |, but this was removed sometime and I
 just made the code evolve with that.

 Actually I can make a small patch (3 lines) to my current patch which
 causes tree-ssa-ifcombine.c to produce the behavior of your patch.

  if (result_inv)
{
  t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t);
  result_inv = false;
}

 I will submit a new patch after some testing of my current patch which
 fixes items 1 and 2.


 Here is my latest patch which adds the testcases from Zhenqiang's
 patch and fixes item 1 and 2.

The patch works fine for me. Bootstrap and no make check regression on
ARM Chromebook.

Thanks!
-Zhenqiang

 OK?  Bootstrapped and tested on x86_64-linux-gnu with no regressions.

 Thanks,
 Andrew Pinski

 ChangeLog:
 * tree-ssa-ifcombine.c: Include rtl.h and tm_p.h.
 (ifcombine_ifandif): Handle cases where
 maybe_fold_and_comparisons fails, combining the branches
 anyways.
 (tree_ssa_ifcombine): Inverse the order of
 the basic block walk, increases the number of combinings.
 * gimple.h (gsi_start_nondebug_after_labels_bb): New function.


 testsuite/ChangeLog:

 * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c: New test case.
 * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c: New test case.
 * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c: New test case.
 * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c: New test case.
 * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c: New test case.
 * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c: New test case.

 * gcc.dg/tree-ssa/phi-opt-9.c: Use a function call to prevent
 conditional move to be used.
 * gcc.dg/tree-ssa/ssa-dom-thread-3.c: Remove check for one or more
 intermediate.





 Thanks,
 Andrew Pinski




 3) The good thing of Andrew P.'s change is that Inverse the order of
 the basic block walk so it can do combine recursively.

 But I think we need some heuristic to control the number of ifs. Move
 too much compares from
 the inner_bb to outer_bb is not good.


 I think this depends on the target.  For MIPS we don't want an upper
 bound as integer comparisons go directly to GPRs.  I wrote this patch
 with MIPS in mind as that was the target I was working on.


 4) Another good thing of Andrew P.'s change is that it reuses some
 existing functions. So it looks much simple.


 Thanks,
 Andrew Pinski



 With that we should be able to kill the fold-const.c transform?

 That would certainly be nice and an excellent follow-up for Zhenqiang.

 That's my final goal to kill the fold-const.c transform. I think we
 may combine the two changes to make a simple and good patch.

 Thanks!
 -Zhenqiang


Limited support for inheritance in gengtype now in trunk (was Re: [PATCH] make gengtype more robust against user error)

2013-10-28 Thread David Malcolm
On Mon, 2013-10-28 at 15:19 -0600, Jeff Law wrote:
 On 10/25/13 15:37, David Malcolm wrote:
  This patch addresses various forms of failure described in
  http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01974.html
 
 [...]
 Thanks for diving into this stuff and getting it fixed.
 
 OK for the trunk,

Thanks; it passed bootstrapregtesting on x86_64-unknown-linux, so I've
committed it to trunk as r204148, completing (I hope) the addition of
support in gengtype for some forms of simple class hierarchies.

I believe this may also complete the prerequisites for my conversion of
the symtable types into a C++ class hierarchy (in that IIRC all of those
patches were approved, but were pending this gengtype feature)... but
I'll have to see to what extent they've bitrotted (and recheck the
bootstrap/regtest, naturally).

Dave



Re: [PATCH] Vectorizing abs(char/short/int) on x86.

2013-10-28 Thread Cong Hou
As there are some issues with abs() type conversions, I removed the
related content from the patch but only kept the SSE2 support for
abs(int).

For the define_expand I added as below, the else body is there to
avoid fall-through transformations to ABS operation in optabs.c.
Otherwise ABS will be converted to other operations even that we have
corresponding instructions from SSSE3.


(define_expand absmode2
  [(set (match_operand:VI124_AVX2_48_AVX512F 0 register_operand)
(abs:VI124_AVX2_48_AVX512F
 (match_operand:VI124_AVX2_48_AVX512F 1 nonimmediate_operand)))]
  TARGET_SSE2
{
  if (!TARGET_SSSE3)
ix86_expand_sse2_abs (operands[0], force_reg (MODEmode, operands[1]));
  else
emit_insn (gen_rtx_SET (VOIDmode, operands[0],
   gen_rtx_ABS (MODEmode, operands[1])));
  DONE;
})


The patch is attached here. Please give me your comments.


thanks,
Cong



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a38316..84c7ab5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2013-10-22  Cong Hou  co...@google.com
+
+ PR target/58762
+ * config/i386/i386-protos.h (ix86_expand_sse2_abs): New function.
+ * config/i386/i386.c (ix86_expand_sse2_abs): New function.
+ * config/i386/sse.md: Add SSE2 support to abs (8/16/32-bit-int).
+
 2013-10-14  David Malcolm  dmalc...@redhat.com

  * dumpfile.h (gcc::dump_manager): New class, to hold state
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 3ab2f3a..ca31224 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -238,6 +238,7 @@ extern void ix86_expand_mul_widen_evenodd (rtx,
rtx, rtx, bool, bool);
 extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, bool, bool);
 extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
 extern void ix86_expand_sse2_mulvxdi3 (rtx, rtx, rtx);
+extern void ix86_expand_sse2_abs (rtx, rtx);

 /* In i386-c.c  */
 extern void ix86_target_macros (void);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 02cbbbd..71905fc 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -41696,6 +41696,53 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2)
gen_rtx_MULT (mode, op1, op2));
 }

+void
+ix86_expand_sse2_abs (rtx op0, rtx op1)
+{
+  enum machine_mode mode = GET_MODE (op0);
+  rtx tmp0, tmp1;
+
+  switch (mode)
+{
+  /* For 32-bit signed integer X, the best way to calculate the absolute
+ value of X is (((signed) X  (W-1)) ^ X) - ((signed) X  (W-1)).  */
+  case V4SImode:
+ tmp0 = expand_simple_binop (mode, ASHIFTRT, op1,
+GEN_INT (GET_MODE_BITSIZE
+ (GET_MODE_INNER (mode)) - 1),
+NULL, 0, OPTAB_DIRECT);
+ if (tmp0)
+  tmp1 = expand_simple_binop (mode, XOR, op1, tmp0,
+  NULL, 0, OPTAB_DIRECT);
+ if (tmp0  tmp1)
+  expand_simple_binop (mode, MINUS, tmp1, tmp0,
+   op0, 0, OPTAB_DIRECT);
+ break;
+
+  /* For 16-bit signed integer X, the best way to calculate the absolute
+ value of X is max (X, -X), as SSE2 provides the PMAXSW insn.  */
+  case V8HImode:
+ tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
+ if (tmp0)
+  expand_simple_binop (mode, SMAX, op1, tmp0, op0, 0,
+   OPTAB_DIRECT);
+ break;
+
+  /* For 8-bit signed integer X, the best way to calculate the absolute
+ value of X is min ((unsigned char) X, (unsigned char) (-X)),
+ as SSE2 provides the PMINUB insn.  */
+  case V16QImode:
+ tmp0 = expand_unop (mode, neg_optab, op1, NULL_RTX, 0);
+ if (tmp0)
+  expand_simple_binop (V16QImode, UMIN, op1, tmp0, op0, 0,
+   OPTAB_DIRECT);
+ break;
+
+  default:
+ break;
+}
+}
+
 /* Expand an insert into a vector register through pinsr insn.
Return true if successful.  */

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c3f6c94..b85ded4 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8721,7 +8721,7 @@
(set (attr prefix_rex) (symbol_ref x86_extended_reg_mentioned_p (insn)))
(set_attr mode DI)])

-(define_insn absmode2
+(define_insn *absmode2
   [(set (match_operand:VI124_AVX2_48_AVX512F 0 register_operand =v)
  (abs:VI124_AVX2_48_AVX512F
   (match_operand:VI124_AVX2_48_AVX512F 1 nonimmediate_operand vm)))]
@@ -8733,6 +8733,20 @@
(set_attr prefix maybe_vex)
(set_attr mode sseinsnmode)])

+(define_expand absmode2
+  [(set (match_operand:VI124_AVX2_48_AVX512F 0 register_operand)
+ (abs:VI124_AVX2_48_AVX512F
+  (match_operand:VI124_AVX2_48_AVX512F 1 nonimmediate_operand)))]
+  TARGET_SSE2
+{
+  if (!TARGET_SSSE3)
+ix86_expand_sse2_abs (operands[0], force_reg (MODEmode, operands[1]));
+  else
+emit_insn (gen_rtx_SET (VOIDmode, operands[0],
+gen_rtx_ABS (MODEmode, operands[1])));
+  DONE;
+})
+
 (define_insn absmode2
   [(set (match_operand:MMXMODEI 0 register_operand =y)
  (abs:MMXMODEI
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 075d071..cf5b942 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2013-10-22  Cong Hou  co...@google.com
+
+ PR 

RE: [PATCH 1/n] Add conditional compare support

2013-10-28 Thread Hans-Peter Nilsson
On Tue, 22 Oct 2013, Zhenqiang Chen wrote:
 ChangeLog:
 2013-10-22  Zhenqiang Chen  zhenqiang.c...@linaro.org

   * config/arm/arm.c (arm_fixed_condition_code_regs, arm_ccmode_to_code,
   arm_select_dominance_ccmp_mode): New functions.
   (arm_select_dominance_cc_mode_1): New function extracted from
   arm_select_dominance_cc_mode.
   (arm_select_dominance_cc_mode): Call arm_select_dominance_cc_mode_1.
   * config/arm/arm.md (ccmp, cbranchcc4, ccmp_and, ccmp_ior,
   ccmp_ior_scc_scc, ccmp_ior_scc_scc_cmp, ccmp_and_scc_scc,
   ccmp_and_scc_scc_cmp): New.
   * config/arm/arm-protos.h (arm_select_dominance_ccmp_mode): New.
   * expr.c (ccmp_candidate_p, used_in_cond_stmt_p, expand_ccmp_expr_2,
   expand_ccmp_expr_3, expand_ccmp_expr_1, expand_ccmp_expr): New.
   (expand_expr_real_1): Handle ccmp.
   * optabs.c: Include gimple.h.
   (expand_ccmp_op): New.
   (get_rtx_code): Handle BIT_AND_EXPR and BIT_IOR_EXPR.
   * optabs.def (ccmp): New.
   * optabs.h (expand_ccmp_op): New.
   * doc/md.texi (ccmp): New index.

One thing I don't see other people mentioning, is that this
patch has just too much code inside #ifdef HAVE_ccmp ... #endif.

I couldn't actually find the part that *requires* that, i.e.
code that does something like gen_ccmp (...) but maybe it's there.

Where needed and where the conditioned code is more than a few
lines, such code is preferably transformed into if (0):d code
using constructs like that at builtins.c:5354:

#ifndef HAVE_atomic_clear
# define HAVE_atomic_clear 0
# define gen_atomic_clear(x,y) (gcc_unreachable (), NULL_RTX)
#endif

Right, this causes dead code, but for maintenance it's *much*
better than when only a fraction of the code being compiled for
other targets.  (Also, the dead code may be eliminated by gcc.)
Unfortunately the number of examples (as above) are few compared
to the pages of #if HAVE_thisorthat'd code. :(

(And IMHO that whole construct should be the default
implementation and shouldn't have to be written manually in the
first place.  But that's material for an invasive patch.)

brgds, H-P


Re: [SH] PR 54236 - add some more addc patterns

2013-10-28 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 This adds some more patterns to utilize the SH addc instruction.
 Tested on rev 204111 with
 make -k check RUNTESTFLAGS=--target_board=sh-sim
 \{-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4-single/-ml,-m4/-mb,-m4-single/-mb}
 
 and no new failures.
 OK for trunk?

Using same *addc for 7 different insn_and_split looks a bit
confusing.  It might be better to add some sufficies like
_r_1, _r_r_1, _r_1_r, _2r_1, _r_31, _r_r_31, _2r_31.
Other than that, the patch is OK.

Regards,
kaz


[PATCH (updated)] Convert symtab, cgraph and varpool nodes into a real class hierarchy

2013-10-28 Thread David Malcolm
On Fri, 2013-09-20 at 17:33 +0200, Jan Hubicka wrote:
  This patch is the handwritten part of the conversion of these types
  to C++; it requires the followup patch, which is autogenerated.
  
  It converts:
struct symtab_node_base
  to:
class symtab_node_base
  
  and converts:
struct cgraph_node
  to:
struct cgraph_node : public symtab_node_base
  and:
struct varpool_node
  to:
class varpool_node : public symtab_node_base
  
  dropping the symtab_node_def union.
 
 Yep, incrementally we should continue with the grand renaming 
 retiring symtab_node_base and getting things symmetric.
  
  * cgraph.h (symtab_node_base): Convert to a class;
  add GTY((desc (%h.type), tag (SYMTAB_SYMBOL))).
  (cgraph_node): Inherit from symtab_node; add GTY option
  tag (SYMTAB_FUNCTION).
  (varpool_node): Inherit from symtab_node; add GTY option
  tag (SYMTAB_VARIABLE).
  (symtab_node_def): Remove.
  (is_a_helper cgraph_node::test (symtab_node_def *)): Convert to...
  (is_a_helper cgraph_node::test (symtab_node_base *)): ...this.
  (is_a_helper varpool_node::test (symtab_node_def *)): Convert to...
  (is_a_helper varpool_node::test (symtab_node_base *)): ...this.
  
  * ipa-ref.h (symtab_node_def): Drop.
  (symtab_node): Change underlying type from symtab_node_def to
  symtab_node_base.
  (const_symtab_node): Likwise.
  
  * is-a.h: Update examples in comment.
  
  * symtab.c (symtab_hash): Change symtab_node_def to symtab_node_base.
  (assembler_name_hash): Likewise.
 
 This patch is OK.  Thanks for working on this!

These symtab changes were dependent on having gengtype support for
inheritance, which is now in trunk, so I'm now revisiting these patches.

The above patch hasn't bitrotted, though the autogenerated one that goes
with it needed regenerating.

A new version of the autogenerated patch can be seen at:
http://dmalcolm.fedorapeople.org/gcc/large-patches/eaba9669644c84592ea32be2dcd19ba92beca381-0003-Autogenerated-fixes-of-symbol.-to.patch
Is that new patch OK?  (it's 450KB so one's eyes tend to glaze over
after a while, but FWIW you approved an earlier version of that in:
  http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00730.html
and the test suite for the script that generated the patch can be seen
at:
https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/test_refactor_symtab.py
 )

Also, I noticed when reviewing the autogenerated marking routines in
gtype-desc.c that the symtab_node_def union had chain_next/chain_prev
markings, and that my patch above effectively dropped them.

The attached patch reinstates them, albeit to the symtab_node_base base
class, and inspection of the generated gtype-desc.c shows that the
generated code does the right thing.

Is this change also OK for trunk?

(I'm still bootstrapping/regtesting the patch series now, so the above
is all conditional on that, so my apologies if this turns out to be
premature).

Dave
From f070f1ecc2baa536d121f2942bd8af750dbc2f02 Mon Sep 17 00:00:00 2001
From: David Malcolm dmalc...@redhat.com
Date: Mon, 28 Oct 2013 22:25:38 -0400
Subject: [PATCH 2/3] Add chain_next and chain_prev options back to
 symtab_node_base

gcc/
	* cgraph.h (symtab_node_base): Add missing chain_next and
	chain_prev GTY options.
---
 gcc/cgraph.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 5a582d8..35f79ec 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -39,7 +39,7 @@ enum symtab_type
 
 /* Base of all entries in the symbol table.
The symtab_node is inherited by cgraph and varpol nodes.  */
-class GTY((desc (%h.type), tag (SYMTAB_SYMBOL))) symtab_node_base
+class GTY((desc (%h.type), tag (SYMTAB_SYMBOL), chain_next (%h.next), chain_prev (%h.previous))) symtab_node_base
 {
 public:
   /* Type of the symbol.  */
-- 
1.7.11.7



Re: [PATCH (updated)] Convert symtab, cgraph and varpool nodes into a real class hierarchy

2013-10-28 Thread David Malcolm
On Mon, 2013-10-28 at 22:57 -0400, David Malcolm wrote:
 On Fri, 2013-09-20 at 17:33 +0200, Jan Hubicka wrote:
   This patch is the handwritten part of the conversion of these types
   to C++; it requires the followup patch, which is autogenerated.
   
   It converts:
 struct symtab_node_base
   to:
 class symtab_node_base
   
   and converts:
 struct cgraph_node
   to:
 struct cgraph_node : public symtab_node_base
   and:
 struct varpool_node
   to:
 class varpool_node : public symtab_node_base
   
   dropping the symtab_node_def union.
  
  Yep, incrementally we should continue with the grand renaming 
  retiring symtab_node_base and getting things symmetric.
   
 * cgraph.h (symtab_node_base): Convert to a class;
 add GTY((desc (%h.type), tag (SYMTAB_SYMBOL))).
 (cgraph_node): Inherit from symtab_node; add GTY option
 tag (SYMTAB_FUNCTION).
 (varpool_node): Inherit from symtab_node; add GTY option
 tag (SYMTAB_VARIABLE).
 (symtab_node_def): Remove.
 (is_a_helper cgraph_node::test (symtab_node_def *)): Convert to...
 (is_a_helper cgraph_node::test (symtab_node_base *)): ...this.
 (is_a_helper varpool_node::test (symtab_node_def *)): Convert to...
 (is_a_helper varpool_node::test (symtab_node_base *)): ...this.
   
 * ipa-ref.h (symtab_node_def): Drop.
 (symtab_node): Change underlying type from symtab_node_def to
 symtab_node_base.
 (const_symtab_node): Likwise.
   
 * is-a.h: Update examples in comment.
   
 * symtab.c (symtab_hash): Change symtab_node_def to symtab_node_base.
 (assembler_name_hash): Likewise.
  
  This patch is OK.  Thanks for working on this!
 
 These symtab changes were dependent on having gengtype support for
 inheritance, which is now in trunk, so I'm now revisiting these patches.
 
 The above patch hasn't bitrotted, though the autogenerated one that goes
 with it needed regenerating.
 
 A new version of the autogenerated patch can be seen at:
 http://dmalcolm.fedorapeople.org/gcc/large-patches/eaba9669644c84592ea32be2dcd19ba92beca381-0003-Autogenerated-fixes-of-symbol.-to.patch
 Is that new patch OK?  (it's 450KB so one's eyes tend to glaze over
 after a while, but FWIW you approved an earlier version of that in:
   http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00730.html
 and the test suite for the script that generated the patch can be seen
 at:
 https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/test_refactor_symtab.py
  )
 
 Also, I noticed when reviewing the autogenerated marking routines in
 gtype-desc.c that the symtab_node_def union had chain_next/chain_prev
 markings, and that my patch above effectively dropped them.
 
 The attached patch reinstates them, albeit to the symtab_node_base base
 class, and inspection of the generated gtype-desc.c shows that the
 generated code does the right thing.
...and here's a revised version one that avoids overlong lines.

commit 739276306f1d8f5d0b1202da21cc29b5fcac102e
Author: David Malcolm dmalc...@redhat.com
Date:   Mon Oct 28 22:25:38 2013 -0400

Add chain_next and chain_prev options back to symtab_node_base

gcc/
	* cgraph.h (symtab_node_base): Add missing chain_next and
	chain_prev GTY options.

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 5a582d8..4cc2049 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -39,7 +39,9 @@ enum symtab_type
 
 /* Base of all entries in the symbol table.
The symtab_node is inherited by cgraph and varpol nodes.  */
-class GTY((desc (%h.type), tag (SYMTAB_SYMBOL))) symtab_node_base
+class GTY((desc (%h.type), tag (SYMTAB_SYMBOL),
+	   chain_next (%h.next), chain_prev (%h.previous)))
+  symtab_node_base
 {
 public:
   /* Type of the symbol.  */


Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2

2013-10-28 Thread Sandra Loosemore

On 10/28/2013 03:20 AM, Bernd Edlinger wrote:


On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote:


I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new
pr56997-1.c testcase, it got stuck in infinite recursion between
store_split_bit_field/store_fixed_bit_field and/or
extract_split_bit_field/extract_fixed_bit_field. This didn't show up in
my previous mainline testing.

[snip]


I have attached an update to your patch, that should
a) fix the recursion problem.
b) restrict the -fstrict-volatile-bitfields to not violate the C++ memory model.



Thanks for picking up the ball on this -- I've been busy with other 
tasks for several days.


I again tried backporting the patch series along with your fix to GCC 
4.8 and tested on arm-none-eabi.  I found that it was still getting 
stuck in infinite recursion unless the test from this patch hunk



@@ -1699,6 +1711,14 @@ extract_bit_field (rtx str_rtx, unsigned
 {
   enum machine_mode mode1;

+  /* Handle extraction of unaligned fields,
+ this can happen in -fstrict-volatile-bitfields.  */
+  if (GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
+   GET_MODE_BITSIZE (GET_MODE (str_rtx))  GET_MODE_BITSIZE (word_mode)
+   bitnum % GET_MODE_BITSIZE (GET_MODE (str_rtx)) + bitsize
+ GET_MODE_BITSIZE (GET_MODE (str_rtx)) )
+str_rtx = adjust_address (str_rtx, word_mode, 0);
+
   /* Handle -fstrict-volatile-bitfields in the cases where it applies.  */
   if (GET_MODE_BITSIZE (GET_MODE (str_rtx))  0)
 mode1 = GET_MODE (str_rtx);


was also inserted into store_bit_field.

I also think that, minimally, this needs to be rewritten as something like

  if (MEM_P (str_rtx)
   STRICT_ALIGNMENT
   GET_MODE_BITSIZE (GET_MODE (str_rtx)) != 0
   GET_MODE_BITSIZE (GET_MODE (str_rtx))  GET_MODE_BITSIZE 
(word_mode)

   (bitnum % MEM_ALIGN (str_rtx) + bitsize
   GET_MODE_BITSIZE (GET_MODE (str_rtx
str_rtx = adjust_address (str_rtx, word_mode, 0);

Otherwise, we'll end up using word_mode instead of the field mode on 
targets that support unaligned accesses.  I tested that (again, 4.8 and 
arm-none-eabi) and results looked good, but of course ARM isn't one of 
the targets that supports unaligned accesses.  :-S


I'm still not convinced this is the right fix, though.  It seems to me 
that callers of store_bit_field and extract_bit_field in expr.c ought 
not to be messing with the mode of the rtx when 
flag_strict_volatile_bitfields  0, because that is losing information 
about the original mode that we are trying to patch up here by forcing 
it to word_mode.  Instead, I think the callers ought to be passing the 
declared field mode into store_bit_field and extract_bit_field, and 
those functions ought to change the mode of the incoming rtx to match 
the field mode only if strict_volatile_bitfield_p assures us that the 
insertion/extraction can be done in that mode.


Alternatively, if strict_volatile_bitfield_p returns false but 
flag_strict_volatile_bitfields  0, then always force to word_mode and 
change the -fstrict-volatile-bitfields documentation to indicate that's 
the fallback if the insertion/extraction cannot be done in the declared 
mode, rather than claiming that it tries to do the same thing as if 
-fstrict-volatile-bitfields were not enabled at all.


Either way, still needs more work, and more thorough testing (more 
targets, and obviously trunk as well as the 4.8 backport) before I'd 
consider this ready to commit.  I might or might not be able to find 
some more time to work on this in the next week


-Sandra



Re: [GOOGLE] Don't update the callee count if caller is not resolved node

2013-10-28 Thread Xinliang David Li
Is it sufficient to check if the final caller is defined in primary module?

Note that in some cases, doing profile update 'speculatively' (without
your fix) can be more precise (assuming the inlining gets materialized
in a different compilation), but in general it might be better to be
conservative in profile update (as in your patch) -- the downside is
you may end up with less aggressive size optimization.

David

On Mon, Oct 28, 2013 at 3:51 PM, Dehao Chen de...@google.com wrote:
 This patch changes to no update callee count if caller count is not a
 resolved node (in LIPO mode) during AutoFDO compilation. This is
 because AutoFDO will have the same edge counts for all unresolved
 nodes. Original update method will lead to multi-update of the callee.

 Bootstrapped and testing on going.

 OK for google-4_8 if test is OK?

 Thanks,
 Dehao

 Index: gcc/ipa-inline-transform.c
 ===
 --- gcc/ipa-inline-transform.c (revision 204131)
 +++ gcc/ipa-inline-transform.c (working copy)
 @@ -169,6 +169,15 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
else
   {
struct cgraph_node *n;
 +  if (flag_auto_profile  L_IPO_COMP_MODE
 +   cgraph_pre_profiling_inlining_done)
 +{
 +  struct cgraph_node *caller = e-caller;
 +  if (caller-global.inlined_to)
 + caller = caller-global.inlined_to;
 +  if (cgraph_lipo_get_resolved_node (caller-symbol.decl) != caller)
 + update_original = false;
 +}
n = cgraph_clone_node (e-callee, e-callee-symbol.decl,
   e-count, e-frequency,
   update_original, vNULL, true);


[PATCH][RFA] Improvements to infer_nonnull_range

2013-10-28 Thread Jeff Law


Based on a suggestion from Marc, I want to use infer_nonnull_range in 
the erroneous path isolation optimization.


In the process of doing that, I found  a few deficiencies in 
infer_nonnull_range that need to be addressed.



First, infer_nonnull_range_p doesn't infer ranges from a GIMPLE_RETURN 
if the current function is marked as returns_nonnull.  That's fixed in 
the relatively obvious way.


Second, infer_nonnull_range_p, when presented with an arglist where the 
non-null attribute applies to all pointer arguments, it won't bother to 
determine if OP is actually one of the arguments :(  It just assumes 
that OP is in the argument list.  Opps.


Third, I want to be able to call infer_nonnull_range with OP being 0B. 
That lets me use infer_nonnull_range to look for explicit null pointer 
dereferences, explicit uses of null in a return statement in functions 
that can't return non-null and explicit uses of null arguments when 
those arguments can't be null.   Sadly, to make that work we need to use 
operand_equal_p rather than simple pointer comparisons to see if OP 
shows up in STMT.


Finally, for detecting explicit null pointers, infer_nonnull_range calls 
count_ptr_derefs.  count_ptr_derefs counts things in two ways.  One with 
a FOR_EACH_SSA_TREE_OPERAND, then again with simple walks of the tree 
structures.   Not surprisingly if we're looking for an explicit 0B, then 
the loop over the operands finds nothing, but walking the tree 
structures does. And the checking assert triggers.  This change removes 
the assert and instead sets *num_uses_p to a sane value.


I don't have testcases for this stuff that are independent of the 
erroneous path isolation optimization.  However, each is triggered by 
tests I'll include in the the erroneous path isolation patch.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for 
the trunk?


jeff
* tree-ssa.c (count_ptr_derefs): Use operand_equal_p rather than
testing pointer equality.  Remove invalid assert.  Clamp minimum
value for *num_uses_p at *num_loads_p + *num_stores_p.

* tree-vrp.c (infer_nonnull_range): Verify OP is in the argument
list and the argument corresponding to OP is a pointer type.
Use operand_equal_p rather than pointer equality when testing
if OP is on the nonnull list.  Handle GIMPLE_RETURN statements too.


diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index 0b743d1..d29665b 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -265,7 +265,8 @@ count_ptr_derefs (tree *tp, int *walk_subtrees, void *data)
   return NULL_TREE;
 }
 
-  if (TREE_CODE (*tp) == MEM_REF  TREE_OPERAND (*tp, 0) == count_p-ptr)
+  if (TREE_CODE (*tp) == MEM_REF
+   operand_equal_p (TREE_OPERAND (*tp, 0), count_p-ptr, 0))
 {
   if (wi_p-is_lhs)
count_p-num_stores++;
@@ -326,7 +327,11 @@ count_uses_and_derefs (tree ptr, gimple stmt, unsigned 
*num_uses_p,
   *num_loads_p = count.num_loads;
 }
 
-  gcc_assert (*num_uses_p = *num_loads_p + *num_stores_p);
+  /* This can happen if PTR refers to something other than an SSA_NAME.
+ In that case we will find no uses as the FOR_EACH_SSA_TREE_OPERAND
+ only iterates over the SSA_NAMEs in a statement.  */
+  if (*num_uses_p  *num_loads_p + *num_stores_p)
+*num_uses_p = *num_loads_p + *num_stores_p;
 }
 
 
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d3a07f3..bbcb34c 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -4509,21 +4509,38 @@ infer_nonnull_range (gimple stmt, tree op)
return false;
 
  /* If nonnull applies to all the arguments, then ARG
-is non-null.  */
+is non-null if it's in the argument list.  */
  if (TREE_VALUE (attrs) == NULL_TREE)
-   return true;
+   {
+ for (unsigned int i = 0; i  gimple_call_num_args (stmt); i++)
+   {
+ if (operand_equal_p (op, gimple_call_arg (stmt, i), 0)
+  POINTER_TYPE_P (TREE_TYPE (gimple_call_arg (stmt, i
+   return true;
+   }
+ return false;
+   }
 
  /* Now see if op appears in the nonnull list.  */
  for (tree t = TREE_VALUE (attrs); t; t = TREE_CHAIN (t))
{
  int idx = TREE_INT_CST_LOW (TREE_VALUE (t)) - 1;
  tree arg = gimple_call_arg (stmt, idx);
- if (op == arg)
+ if (operand_equal_p (op, arg, 0))
return true;
}
}
 }
 
+  /* If this function is marked as returning non-null, then we can
+ infer OP is non-null if it is used in the return statement.  */
+  if (gimple_code (stmt) == GIMPLE_RETURN
+   gimple_return_retval (stmt)
+   operand_equal_p (gimple_return_retval (stmt), op, 0)
+   lookup_attribute (returns_nonnull,
+  TYPE_ATTRIBUTES (TREE_TYPE (current_function_decl
+return true;
+
   return 

Re: free is a killer

2013-10-28 Thread Jeff Law

On 10/28/13 16:05, Marc Glisse wrote:


I checked and it does the wrong thing (I don't have the testcase handy
anymore, but it shouldn't be hard to recreate one), I even wrote a patch
(attached) but it is related to:
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02238.html
so it can't go in. A more limited fix (still missing many cases) would
be possible. I didn't test VA_END, but it doesn't look as bad (it is the
SSA_NAME case that is wrong for memcpy

Actually, it's fairly easy to see with memset.  Something like this:

/* { dg-do compile } */
/* { dg-options -O1 -fdump-tree-optimized } */

void f (long *p) {
  *p = 42;
  p[4] = 42;
  __builtin_memset (p, 0, 100);
}

/* { dg-final { scan-tree-dump-not = 42 optimized } } */
/* { dg-final { cleanup-tree-dump optimized } } */


I'm sure it can be made to work with memcpy as well, it's just harder 
because you have to ensure the source/dest args don't alias.






While I am posting non-submitted patches, does the following vaguely
make sense? I couldn't find a testcase that exercised it without some
local patches, but the idea (IIRC) is that with:
struct A { struct B b; int i; }
we can easily end up with one ao_ref.base that is a MEM_REF[p] and
another one a MEM_REF[(B*)q] where p and q are of type A*, and that
prevents us from noticing that p-i and q-b don't alias. Maybe I should
instead find a way to get MEM_REF[q] as ao_ref.base, but if q actually
points to a larger structure that starts with A it is likely to fail as
well...
I've never looked at the alias oracle stuff, but ISTM that offsets from 
the base ought to be enough to disambiguate?






In the category ugly hack that I won't submit, let me also attach this
patch that sometimes helps notice that malloc doesn't alias other things
(I don't know if the first hunk ever fires).
Well, one thing that can be done here (I think it comes from Steensgaard 
and I'm pretty sure it's discussed in Morgan's text) is you assign a 
storage class and propagate that.   Once you do so, you can eliminate 
certain things.  You use a vrp-like engine with a fairly simple meet 
operator to handle propagation of the storage class.


Once you have storage classes for all the pointers, you can make certain 
deductions about aliasing relationships.  Including what things in the 
malloc storage class can and can not alias.


Jeff




Re: [PATCH (updated)] Convert symtab, cgraph and varpool nodes into a real class hierarchy

2013-10-28 Thread Jeff Law

On 10/28/13 20:57, David Malcolm wrote:

* cgraph.h (symtab_node_base): Convert to a class;
add GTY((desc (%h.type), tag (SYMTAB_SYMBOL))).
(cgraph_node): Inherit from symtab_node; add GTY option
tag (SYMTAB_FUNCTION).
(varpool_node): Inherit from symtab_node; add GTY option
tag (SYMTAB_VARIABLE).
(symtab_node_def): Remove.
(is_a_helper cgraph_node::test (symtab_node_def *)): Convert to...
(is_a_helper cgraph_node::test (symtab_node_base *)): ...this.
(is_a_helper varpool_node::test (symtab_node_def *)): Convert to...
(is_a_helper varpool_node::test (symtab_node_base *)): ...this.

* ipa-ref.h (symtab_node_def): Drop.
(symtab_node): Change underlying type from symtab_node_def to
symtab_node_base.
(const_symtab_node): Likwise.

* is-a.h: Update examples in comment.

* symtab.c (symtab_hash): Change symtab_node_def to symtab_node_base.
(assembler_name_hash): Likewise.


This patch is OK.  Thanks for working on this!


These symtab changes were dependent on having gengtype support for
inheritance, which is now in trunk, so I'm now revisiting these patches.

The above patch hasn't bitrotted, though the autogenerated one that goes
with it needed regenerating.

A new version of the autogenerated patch can be seen at:
http://dmalcolm.fedorapeople.org/gcc/large-patches/eaba9669644c84592ea32be2dcd19ba92beca381-0003-Autogenerated-fixes-of-symbol.-to.patch
Is that new patch OK?  (it's 450KB so one's eyes tend to glaze over
after a while, but FWIW you approved an earlier version of that in:
   http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00730.html
and the test suite for the script that generated the patch can be seen
at:
https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/test_refactor_symtab.py
 )
Given it's already approved and hasn't changed in any substantial way, 
it's OK for the trunk once you get through the bootstrap and regression 
testing.


jeff




Re: [PATCH] Enhance ifcombine to recover non short circuit branches

2013-10-28 Thread Jeff Law

On 10/27/13 12:55, Andrew Pinski wrote:

Here is my latest patch which adds the testcases from Zhenqiang's
patch and fixes item 1 and 2.

OK?  Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
* tree-ssa-ifcombine.c: Include rtl.h and tm_p.h.
(ifcombine_ifandif): Handle cases where
maybe_fold_and_comparisons fails, combining the branches
anyways.
(tree_ssa_ifcombine): Inverse the order of
the basic block walk, increases the number of combinings.
* gimple.h (gsi_start_nondebug_after_labels_bb): New function.


testsuite/ChangeLog:

* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c: New test case.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c: New test case.

* gcc.dg/tree-ssa/phi-opt-9.c: Use a function call to prevent
conditional move to be used.
* gcc.dg/tree-ssa/ssa-dom-thread-3.c: Remove check for one or more
intermediate.
I don't be able to look at this in any depth tonight.  However, I would 
ask that you pass along the dump file for ssa-dom-thread-3.c so that I 
can evaluate the correctness of your change better.


Thanks,
jeff