RFA: speeding up dg-extract-results.sh

2014-02-13 Thread Richard Sandiford
dg-extract-results.sh is used to combine the various .sum.sep and .log.sep
files produced by parallel testing into single .sum and .log files.
It's written in a combination of shell scripts and awk, so stays well
within the minimum system requirements.

However, it seems to be quadratic in the number of test variations,
since the size of the .sums and .logs is linear in it and the script
parses them all once per variation.  This means that when I'm doing the
mipsisa64-sde-elf testing:

http://gcc.gnu.org/ml/gcc-testresults/2014-02/msg00025.html

the script takes just over 5 hours to produce the gcc.log file.
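
A minimal sketch of the single-pass idea, for illustration only: rather than rescanning every file once per variation, each line is visited once and filed under the variation it belongs to. The "Running target <name>" header used to detect a variation and the function name are assumptions made for this example, not code from the patch.

```python
from collections import OrderedDict

def split_by_variation(lines):
    """Group lines under the most recent 'Running target <name>' header.

    Every line is looked at exactly once, so the cost is linear in the
    input size regardless of how many variations there are.
    """
    runs = OrderedDict()
    current = None
    prefix = "Running target "
    for line in lines:
        if line.startswith(prefix):
            current = line[len(prefix):].strip()
            runs.setdefault(current, [])
        elif current is not None:
            runs[current].append(line)
    return runs

sample = [
    "Running target unix",
    "PASS: gcc.dg/a.c",
    "Running target unix/-m32",
    "FAIL: gcc.dg/a.c",
    "Running target unix",
    "PASS: gcc.dg/b.c",
]
runs = split_by_variation(sample)
for name, results in runs.items():
    print(name, results)
```

The quadratic alternative would call a filter over the whole of `sample` once for each variation name.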

This patch tries to reduce that by providing an alternative single-script
version.  I was torn between Python and Tcl, but given how most people
tend to react to Tcl, I thought I'd better go for Python.  I wouldn't
mind rewriting it in Tcl if that seems better though, not least because
expect is already a prerequisite.

Python isn't yet required and I'm pretty sure this script needs 2.6
or later.  I'm also worried that the seek/tell stuff might not work on
Windows.  The patch therefore gets dg-extract-results.sh to check the
environment first and call into the python version if possible,
otherwise it falls back on the current approach.  This also means
that the patch is contained entirely within contrib/.  If this does
indeed not work on Windows then we should either fix the python code
(obviously preferred) or get dg-extract-results.sh to skip it on
Windows for now.
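
For context on the seek/tell concern: one general way to reorder log fragments without keeping them all in memory is to record byte offsets with tell() and revisit them later with seek(). The sketch below shows that technique on an in-memory file. The "Running " header test and the function names are illustrative assumptions, not the patch's actual code.

```python
import io

def index_segments(f):
    """Return (header, start, end) byte offsets for each 'Running ' segment."""
    segments = []
    header = None
    start = f.tell()
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:
            break
        if line.startswith(b"Running "):
            if header is not None:
                segments.append((header, start, pos))
            header, start = line.strip(), pos
    if header is not None:
        segments.append((header, start, f.tell()))
    return segments

def copy_sorted(f, segments, out):
    """Write the segments to OUT in sorted header order."""
    for _header, start, end in sorted(segments):
        f.seek(start)
        out.write(f.read(end - start))

log = io.BytesIO(b"Running b.exp ...\nPASS: 2\nRunning a.exp ...\nPASS: 1\n")
merged = io.BytesIO()
copy_sorted(log, index_segments(log), merged)
print(merged.getvalue().decode())
```

Opening real files in binary mode keeps the offsets byte-exact, which is one way to sidestep text-mode newline translation.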

The new version processes the mipsisa64-sde-elf gcc.log in just over a minute.
It's also noticeably faster for more normal runs, e.g. for my 4-variant
mips64-linux-gnu testing the time taken to process gcc.log goes from 114s
to 11s.  But that's probably in the noise given how long testing takes anyway.

For completeness, although the basic approach was heavily based on the
original script, there are some minor differences in output:

- the 'Host is ' line is copied over.

- not all sorts in the .sh version were protected by LC_ALL=C, so the
  order of .exp files in the .sum could depend on locale.  The new version
  always follows the LC_ALL=C ordering (since that's what Python uses
  unless the script forces it not to).

- when the run for a particular .exp is split over several .log.seps,
  the separate logs are now reassembled in the same order as the .sum
  output, based on the first test in each .log fragment.  I've left this
  under the control of an internal variable for easier comparison though.

- the new version tries to keep the earliest start message and latest
  end message (based on the time in the message).  I thought this would
  give a better idea how long the full run took.

- the .log output now contains the tool version information at the end
  (as both versions do for .sum).

- the .log output only contains one set of 'Using foo.exp as the blah.'
  messages per run.  The .sh version drops most of the others but not all.
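
The "earliest start message, latest end message" rule in the list above could look roughly like this. It is only a sketch: the message wording ("Test Run By ... on <date>", "runtest completed at <date>") and the date layout are assumptions about typical DejaGnu output, and the helper names are hypothetical rather than taken from the patch.

```python
from datetime import datetime

# Assumed date layout, e.g. "Thu Feb 13 07:50:18 2014".
DATE_FORMAT = "%a %b %d %H:%M:%S %Y"

def message_time(message, separator):
    """Parse the date that follows SEPARATOR in MESSAGE."""
    date_text = message.split(separator, 1)[1].strip()
    return datetime.strptime(date_text, DATE_FORMAT)

def earliest_start(messages):
    """Pick the start message with the oldest timestamp."""
    return min(messages, key=lambda m: message_time(m, " on "))

def latest_end(messages):
    """Pick the end message with the newest timestamp."""
    return max(messages, key=lambda m: message_time(m, " at "))

starts = [
    "Test Run By tester on Thu Feb 13 07:52:10 2014",
    "Test Run By tester on Thu Feb 13 07:50:18 2014",
]
ends = [
    "runtest completed at Thu Feb 13 09:14:02 2014",
    "runtest completed at Thu Feb 13 09:40:51 2014",
]
print(earliest_start(starts))
print(latest_end(ends))
```

Comparing parsed datetimes rather than the raw strings matters because the textual form does not sort chronologically.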

I checked that the outputs were otherwise identical for a set of
mips64-linux-gnu, mipsisa64-sde-elf and x86_64-linux-gnu runs.  I also
reran the acats tests with some nobbled testcases in order to test the
failure paths there.

Also bootstrapped and regression-tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


contrib/
* dg-extract-results.py: New file.
* dg-extract-results.sh: Use it if the environment seems suitable.

Index: contrib/dg-extract-results.py
===
--- /dev/null   2014-02-10 23:36:59.384652914 +
+++ contrib/dg-extract-results.py   2014-02-13 07:50:18.877804877 +
@@ -0,0 +1,577 @@
+#!/usr/bin/python
+#
+# Copyright (C) 2014 Free Software Foundation, Inc.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+import sys
+import getopt
+import re
+from datetime import datetime
+
+# True if unrecognised lines should cause a fatal error.  Might want to turn
+# this on by default later.
+strict = False
+
+# True if the order of .log segments should match the .sum file, false if
+# they should keep the original order.
+sort_logs = True
+
+class Named:
+    def __init__ (self, name):
+        self.name = name
+
+    def __cmp__ (self, other):
+        return cmp (self.name, other.name)
+
+class ToolRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
+        # The variations run for this tool, mapped by --target_board name.
+        self.variations = dict()
+
+    # Return the VariationRun for variation NAME.
+    def get_variation (self, name):
+        if name not in self.variations:
+            self.variations[name] = VariationRun (name)
+        return self.variations[name]
+
+class VariationRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)

Re: [testsuite] Don't xfail gcc.dg/binop-xor1.c

2014-02-13 Thread Hans-Peter Nilsson
On Tue, 4 Feb 2014, Rainer Orth wrote:

 AFAICT the gcc.dg/binop-xor1.c test is XPASSing everywhere since about
 20131114:

Bah, missing analysis. Everywhere does not include cris-elf,
powerpc64-unknown-linux-gnu, m68k-unknown-linux-gnu,
s390x-ibm-linux-gnu, powerpc-ibm-aix7.1.0.0.


 XPASS: gcc.dg/binop-xor1.c scan-tree-dump-times optimized ^ 1

 To reduce testsuite noise, I'd like to apply the following patch.
 Tested with the appropriate runtest invocations on i386-pc-solaris2.11
 and x86_64-unknown-linux-gnu.

 Ok for mainline?

   Rainer


 2014-02-04  Rainer Orth  r...@cebitec.uni-bielefeld.de

   * gcc.dg/binop-xor1.c: Don't xfail scan-tree-dump-times.



The XPASS wasn't universal.  I opened PR60173.

brgds, H-P


[PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.

2014-02-13 Thread Kirill Yukhin
Hello,
I’ve noticed that the _mm512_permutexvar_epi[64|32] intrinsics
have the wrong argument order. As per [1], the first argument is the index.
The vpermps/vpermpd intrinsics are fine, but I’ve changed the tests
to call CALC with the same argument order as the intrinsic. There is the
same problem (wrong argument order) with vrcp14s[d|s].
Also, the avx512er-vrcp28ss-2.c test called the wrong intrinsic.

[1]  http://software.intel.com/sites/landingpage/IntrinsicsGuide/
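
To make the consequence of the swapped arguments concrete, here is a scalar model of the 64-bit permute, written as a sketch. It assumes the documented semantics dst[i] = a[idx[i] & 7] with the index vector passed first; the Python function is a stand-in for illustration and is not how the intrinsic is implemented.

```python
def permutexvar_epi64(idx, a):
    """Scalar model of an 8-lane variable permute: dst[i] = a[idx[i] & 7]."""
    assert len(idx) == 8 and len(a) == 8
    return [a[i & 7] for i in idx]

idx = [7, 6, 5, 4, 3, 2, 1, 0]
a = [10, 11, 12, 13, 14, 15, 16, 17]

correct = permutexvar_epi64(idx, a)  # index first: reverses a
swapped = permutexvar_epi64(a, idx)  # data misused as the index

print(correct)
print(swapped)
```

With the arguments exchanged, the data values are masked and used as lane numbers, so the result is a shuffle of the index vector instead of a shuffle of the data.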

gcc/
* config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap
arguments order in builtin.
(_mm512_permutexvar_epi64): Ditto.
(_mm512_mask_permutexvar_epi64): Ditto
(_mm512_maskz_permutexvar_epi32): Ditto
(_mm512_permutexvar_epi32): Ditto
(_mm512_mask_permutexvar_epi32): Ditto
* config/i386/sse.md (srcp14mode): Swap operands.

gcc/testsuite/
* gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
* gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
* gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
* gcc.target/i386/avx512f-vpermps-2.c: Ditto.
* gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
* gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
* gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

Is it OK for trunk, or should we wait until the 4.9 fork?

--
Thanks, K

---
 gcc/config/i386/avx512fintrin.h| 24 +++---
 gcc/config/i386/sse.md |  6 +++---
 .../gcc.target/i386/avx512er-vrcp28ss-2.c  |  2 +-
 gcc/testsuite/gcc.target/i386/avx512f-vpermd-2.c   |  2 +-
 gcc/testsuite/gcc.target/i386/avx512f-vpermpd-2.c  |  4 ++--
 gcc/testsuite/gcc.target/i386/avx512f-vpermps-2.c  |  4 ++--
 .../gcc.target/i386/avx512f-vpermq-var-2.c |  2 +-
 gcc/testsuite/gcc.target/i386/avx512f-vrcp14sd-2.c |  4 ++--
 gcc/testsuite/gcc.target/i386/avx512f-vrcp14ss-2.c |  8 
 9 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index d53a40d..b3a4f3a 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -6148,8 +6148,8 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_maskz_permutexvar_epi64 (__mmask8 __M, __m512i __X, __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __X,
-(__v8di) __Y,
+  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __Y,
+(__v8di) __X,
 (__v8di)
 _mm512_setzero_si512 (),
 __M);
@@ -6159,8 +6159,8 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_permutexvar_epi64 (__m512i __X, __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __X,
-(__v8di) __Y,
+  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __Y,
+(__v8di) __X,
 (__v8di)
 _mm512_setzero_si512 (),
 (__mmask8) -1);
@@ -6171,8 +6171,8 @@ __attribute__ ((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm512_mask_permutexvar_epi64 (__m512i __W, __mmask8 __M, __m512i __X,
   __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __X,
-(__v8di) __Y,
+  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __Y,
+(__v8di) __X,
 (__v8di) __W,
 __M);
 }
@@ -6181,8 +6181,8 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_maskz_permutexvar_epi32 (__mmask16 __M, __m512i __X, __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvarsi512_mask ((__v16si) __X,
-(__v16si) __Y,
+  return (__m512i) __builtin_ia32_permvarsi512_mask ((__v16si) __Y,
+(__v16si) __X,
 (__v16si)
 _mm512_setzero_si512 (),
 __M);
@@ -6192,8 +6192,8 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_permutexvar_epi32 (__m512i __X, __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvarsi512_mask ((__v16si) __X,
-

Re: [testsuite] Don't xfail gcc.dg/binop-xor1.c

2014-02-13 Thread Richard Sandiford
Hans-Peter Nilsson h...@bitrange.com writes:
 On Tue, 4 Feb 2014, Rainer Orth wrote:
 AFAICT the gcc.dg/binop-xor1.c test is XPASSing everywhere since about
 20131114:

 Bah, missing analysis. Everywhere does not include cris-elf,
 powerpc64-unknown-linux-gnu, m68k-unknown-linux-gnu,
 s390x-ibm-linux-gnu, powerpc-ibm-aix7.1.0.0.

Based on this list I'm guessing it's another BRANCH_COST==1 thing,
so that we don't convert && and || into & and |?  There are a few other
similar tests that either XFAIL based on that or force a higher branch cost.
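
As an illustration of the kind of rewrite being discussed (a hypothetical expression, not copied from binop-xor1.c): when the operands have no side effects, a short-circuit && / || chain can be evaluated with & and | instead, which in turn allows it to be folded into a single xor. The check below only confirms that boolean identity exhaustively.

```python
from itertools import product

def short_circuit(a, b, c):
    """The form written with && and ||."""
    return (a and not b and c) or (not a and b and c)

def bitwise(a, b, c):
    """The folded form: one xor and one and."""
    return (a ^ b) & c

# Collect every input combination where the two forms disagree.
mismatches = [
    (a, b, c)
    for a, b, c in product([False, True], repeat=3)
    if bool(short_circuit(a, b, c)) != bool(bitwise(a, b, c))
]
print(mismatches)
```

On targets where the short-circuit form is kept as branches, the xor never appears in the optimized dump, which would explain a scan for "^" failing there.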

Thanks,
Richard



Re: [testsuite] Don't xfail gcc.dg/binop-xor1.c

2014-02-13 Thread Richard Sandiford
Richard Sandiford rsand...@linux.vnet.ibm.com writes:
 Hans-Peter Nilsson h...@bitrange.com writes:
 On Tue, 4 Feb 2014, Rainer Orth wrote:
 AFAICT the gcc.dg/binop-xor1.c test is XPASSing everywhere since about
 20131114:

 Bah, missing analysis. Everywhere does not include cris-elf,
 powerpc64-unknown-linux-gnu, m68k-unknown-linux-gnu,
 s390x-ibm-linux-gnu, powerpc-ibm-aix7.1.0.0.

 Based on this list I'm guessing it's another BRANCH_COST==1

BRANCH_COST==1 || !LOGICAL_OP_NON_SHORT_CIRCUIT



[PATCH] Fix Cilk+ ICEs in the alias oracle

2014-02-13 Thread Richard Biener

Cilk+ builds INDIRECT_REFs when expanding builtins (oops) and thus
those can leak into MEM_EXPRs which will lead to ICEs later.
The following patch properly builds a MEM_REF instead.  Grepping
for INDIRECT_REF I found another suspicious use (just removed,
it cannot have triggered and it looks bogus) and the use of
a langhook instead of proper GIMPLE interfaces (function also
used during expansion).

Bootstrap / testing in progress together with some other stuff.

Ok?

Thanks,
Richard.

2014-02-13  Richard Biener  rguent...@suse.de

* cilk-common.c: Include gimple-expr.h.
(cilk_arrow): Build a MEM_REF, not an INDIRECT_REF.
(get_frame_arg): Use middle-end types_compatible_p.  Do not
strip INDIRECT_REFs.

Index: gcc/cilk-common.c
===
--- gcc/cilk-common.c   (revision 207725)
+++ gcc/cilk-common.c   (working copy)
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.
 #include "recog.h"
 #include "tree-iterator.h"
 #include "gimplify.h"
+#include "gimple-expr.h"
 #include "cilk.h"
 
 /* This structure holds all the important fields of the internal structures,
@@ -66,8 +67,7 @@ cilk_dot (tree frame, int field_number,
 tree
 cilk_arrow (tree frame_ptr, int field_number, bool volatil)
 {
-  return cilk_dot (fold_build1 (INDIRECT_REF, 
-   TREE_TYPE (TREE_TYPE (frame_ptr)), frame_ptr), 
+  return cilk_dot (build_simple_mem_ref (frame_ptr), 
   field_number, volatil);
 }
 
@@ -287,12 +287,11 @@ get_frame_arg (tree call)
 
   argtype = TREE_TYPE (argtype);
   
-  gcc_assert (!lang_hooks.types_compatible_p
- || lang_hooks.types_compatible_p (argtype, cilk_frame_type_decl));
+  gcc_assert (types_compatible_p (argtype, cilk_frame_type_decl));
 
   /* If it is passed in as an address, then just use the value directly 
  since the function is inlined.  */
-  if (TREE_CODE (arg) == INDIRECT_REF || TREE_CODE (arg) == ADDR_EXPR)
+  if (TREE_CODE (arg) == ADDR_EXPR)
 return TREE_OPERAND (arg, 0);
   return arg;
 }


[PATCH][AArch64] vrnd*_f64 patch for stage-1

2014-02-13 Thread Alex Velenko

Hi,
This patch adds vrnd*_f64 aarch64 intrinsics. A testcase for those
intrinsics is added. I ran complete LE and BE regression runs with no
regressions.


Is this patch OK for stage-1?
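
For readers unfamiliar with the vrnd* family, the variants differ only in rounding mode. The sketch below models four of them on plain Python floats. The mapping of vrnd to truncation and vrnda to round-half-away follows the builtins used in the patch; treating vrndm as floor and vrndn as ties-to-even is an assumption based on the f32 variants and the frintn name. It relies on Python 3, where round() rounds ties to even.

```python
import math

def round_half_away(x):
    """Round to nearest, ties away from zero (the behaviour of C's round)."""
    t = math.trunc(x)
    if abs(x - t) >= 0.5:
        t += 1 if x > 0 else -1
    return float(t)

# Scalar stand-ins for the intrinsics; the names are used as labels only.
ROUNDING = {
    "vrnd_f64": lambda x: float(math.trunc(x)),   # toward zero
    "vrnda_f64": round_half_away,                 # nearest, ties away
    "vrndm_f64": lambda x: float(math.floor(x)),  # toward minus infinity
    "vrndn_f64": lambda x: float(round(x)),       # nearest, ties to even
}

inputs = (2.5, -2.5, 3.5)
results = {name: [f(v) for v in inputs] for name, f in ROUNDING.items()}
for name, values in results.items():
    print(name, values)
```

The model ignores the sign of zero and the floating-point exception behaviour that distinguishes variants such as vrndi and vrndx.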

2014-02-13  Alex Velenko  alex.vele...@arm.com

gcc/

* config/aarch64/aarch64-builtins.c (BUILTIN_VDQF_DF): Macro
added.
* config/aarch64/aarch64-simd-builtins.def (frintn): Use added
macro.
* config/aarch64/aarch64-simd.md (frint_pattern): Comment
corrected.
* config/aarch64/aarch64.md (frint_pattern): Likewise.
* config/aarch64/arm_neon.h (vrnd_f64): Added.
(vrnda_f64): Likewise.
(vrndi_f64): Likewise.
(vrndm_f64): Likewise.
(vrndn_f64): Likewise.
(vrndp_f64): Likewise.
(vrndx_f64): Likewise.

gcc/testsuite/

* gcc.target/aarch64/vrnd_f64_1.c: New testcase.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index ebab2ce8347a4425977c5cbd0f285c3ff1d9f2f1..7adc5fb96b6473ecde5c4f76973aff68af0ca7d4 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -307,6 +307,8 @@ aarch64_types_store1_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   VAR7 (T, N, MAP, v8qi, v16qi, v4hi, v8hi, v2si, v4si, v2di)
 #define BUILTIN_VDQF(T, N, MAP) \
   VAR3 (T, N, MAP, v2sf, v4sf, v2df)
+#define BUILTIN_VDQF_DF(T, N, MAP) \
+  VAR4 (T, N, MAP, v2sf, v4sf, v2df, df)
 #define BUILTIN_VDQH(T, N, MAP) \
   VAR2 (T, N, MAP, v4hi, v8hi)
 #define BUILTIN_VDQHS(T, N, MAP) \
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index e5f71b479ccfd1a9cbf84aed0f96b49762053f59..09e230c56683a0225f8760472d7137b7bac98297 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -264,7 +264,7 @@
   BUILTIN_VDQF (UNOP, nearbyint, 2)
   BUILTIN_VDQF (UNOP, rint, 2)
   BUILTIN_VDQF (UNOP, round, 2)
-  BUILTIN_VDQF (UNOP, frintn, 2)
+  BUILTIN_VDQF_DF (UNOP, frintn, 2)
 
   /* Implemented by lfcvt_patternsu_optabVQDF:modevcvt_target2.  */
   VAR1 (UNOP, lbtruncv2sf, 2, v2si)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4dffb59e856aeaafb79007255d3b91a73ef1ef13..0c1d7de5b3f4fb0fa8fa226b81ec690d8112b849 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1427,7 +1427,7 @@
 )
 
 ;; Vector versions of the floating-point frint patterns.
-;; Expands to btrunc, ceil, floor, nearbyint, rint, round.
+;; Expands to btrunc, ceil, floor, nearbyint, rint, round, frintn.
 (define_insn "<frint_pattern><mode>2"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
 	(unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")]
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8fcbdcd24a0ea18cc037bef9cf72070281..577aa9fe08bb445e66734bc404e94e13dc1fa65b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3187,7 +3187,7 @@
 ;; ---
 
 ;; frint floating-point round to integral standard patterns.
-;; Expands to btrunc, ceil, floor, nearbyint, rint, round.
+;; Expands to btrunc, ceil, floor, nearbyint, rint, round, frintn.
 
 (define_insn "<frint_pattern><mode>2"
   [(set (match_operand:GPF 0 "register_operand" "=w")
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 6af99361b8e265f66026dc506cfc23f044d153b4..797e37ad638648312ef34bcd63c463e5873c30c4 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -22481,6 +22481,12 @@ vrnd_f32 (float32x2_t __a)
   return __builtin_aarch64_btruncv2sf (__a);
 }
 
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vrnd_f64 (float64x1_t __a)
+{
+  return vset_lane_f64 (__builtin_trunc (vget_lane_f64 (__a, 0)), __a, 0);
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vrndq_f32 (float32x4_t __a)
 {
@@ -22501,6 +22507,12 @@ vrnda_f32 (float32x2_t __a)
   return __builtin_aarch64_roundv2sf (__a);
 }
 
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vrnda_f64 (float64x1_t __a)
+{
+  return vset_lane_f64 (__builtin_round (vget_lane_f64 (__a, 0)), __a, 0);
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vrndaq_f32 (float32x4_t __a)
 {
@@ -22521,6 +22533,12 @@ vrndi_f32 (float32x2_t __a)
   return __builtin_aarch64_nearbyintv2sf (__a);
 }
 
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vrndi_f64 (float64x1_t __a)
+{
+  return vset_lane_f64 (__builtin_nearbyint (vget_lane_f64 (__a, 0)), __a, 0);
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vrndiq_f32 (float32x4_t __a)
 {
@@ -22541,6 +22559,12 @@ vrndm_f32 (float32x2_t __a)
   return __builtin_aarch64_floorv2sf (__a);
 }
 
+__extension__ static __inline float64x1_t 

Re: [PATCH] Fix compress_float_constants related ICE (PR target/43546)

2014-02-13 Thread Eric Botcazou
 2014-02-12  Jakub Jelinek  ja...@redhat.com
 
   PR target/43546
   * expr.c (compress_float_constant): If x is a hard register,
   extend into a pseudo and then move to x.
 
   * gcc.target/i386/pr43546.c: New test.

OK, thanks.

-- 
Eric Botcazou



Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-13 Thread Richard Biener
On Wed, Feb 12, 2014 at 6:51 PM, Eric Botcazou ebotca...@adacore.com wrote:
 I am not sure how to deal with this, given that we have mismatched
 V_C_Es anyway, I'm inclined not to care and let the expander deal with
 it.  But at the same I understand that it is ugly and will certainly
 cause somebody more headache in the future.  I suppose that not
 scalarizing here might hurt performance and would be frowned upon at
 the very least.  If the fields bigger than the record approach is the
 standard way of doing this, perhaps SRA can detect such cases and
 produce these strange COMPONENT_REFs instead, but is it so?

 You may remember that we went that way before (building a COMPONENT_REF for
 bit-fields instead of fully lowering the access) so doing it again would be a
 step backwards.  Likewise if we refuses to scalarize.  So IMO it's either low-
 level fiddling in SRA or in the expander (my preference too).

Ok, I've looked at the testcase and I suppose the following change is
what triggers the bug:

   bb 11:
   _56 = m.P_ARRAY;
-  my_rec2.r1 = VIEW_CONVERT_EXPRstruct opt31__rec1(*_56[1 ...]{lb:
_3 sz: 1});
-  _58 = my_rec2.r1.f;
+  _51 = VIEW_CONVERT_EXPRopt31__time_t___XDLU_0__11059199(*_56[1
...]{lb: _3 sz: 1});
+  my_rec2$r1$f_43 = _51;
+  _58 = my_rec2$r1$f_43;
   if (_58  11059199)

I observe that SRA modifies an existing but not replaced memory reference
(something I always thought is asking for trouble).  It changes
VIEW_CONVERT_EXPRstruct opt31__rec1(*_56[1 ...]{lb: _3 sz: 1});
to VIEW_CONVERT_EXPRopt31__time_t___XDLU_0__11059199(*_56[1 ...]{lb:
_3 sz: 1});.

Created a replacement for my_rec2 offset: 128, size: 24: my_rec2$r1$f

Access trees for my_rec2 (UID: 2659):
access { base = (2659)'my_rec2', offset = 128, size = 24, expr =
my_rec2.r1.f, type = opt31__time_t___XDLU_0__11059199, grp_read = 1,
grp_write = 1, grp_assignment_read = 1, grp_assignment_write = 1,
grp_scalar_read = 1, grp_scalar_write = 0, grp_total_scalarization =
0, grp_hint = 0, grp_covered = 1, grp_unscalarizable_region = 0,
grp_unscalarized_data = 0, grp_partial_lhs = 0, grp_to_be_replaced =
1, grp_to_be_debug_replaced = 0, grp_maybe_modified = 0,
grp_not_necessarilly_dereferenced = 0

but obviously 'type' doesn't agree with 'size' here.

In other places we disqualify exprs using VIEW_CONVERT_EXPRs but
apparently only for the candidate itself, not for stuff assigned to it
(though I never understood why disqualifying was necessary at all
for VIEW_CONVERT_EXPRs).

We are using the type of a bitfield field for the replacement, which
we IMHO should avoid because the FIELD_DECL's size is 24
but the field type's TYPE_SIZE is 32 (its precision is 24).  That's
all not an issue until you start to VIEW_CONVERT to such a type
(VIEW_CONVERT, being a reference op, just cares for size, not
precision).  Other ops are treated correctly by expansion.

Now - using a non-mode precision integer type as scalar replacement
isn't going to produce great code and, as we can see, has issues
when using VIEW_CONVERT_EXPRs.

SRA should either avoid this transform or fixup by VIEW_CONVERTing
memory reads only to mode-precision integer types and then inserting
a fixup cast.  The direct VIEW_CONVERsion it creates, from

  my_rec2.r1 = VIEW_CONVERT_EXPRstruct opt31__rec1(*_56[1 ...]{lb: _3 sz: 1});
  _58 = my_rec2.r1.f;


to basically

  _58 = VIEW_CONVERT_EXPRopt31__time_t___XDLU_0__11059199(*_56[1
...]{lb: _3 sz: 1});

is simply wrong.
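
A byte-level model may help show why size versus precision matters here. This is only a sketch under stated assumptions: a little-endian layout, a 24-bit field followed by one unrelated byte, and plain integer reads standing in for VIEW_CONVERT_EXPR. It does not reflect GCC's actual expansion code.

```python
# Three bytes belonging to the 24-bit field, then one unrelated byte.
raw = bytes([0x01, 0x02, 0x03, 0xFF])

# A size-based reinterpretation reads all 32 bits of the type.
viewed_32 = int.from_bytes(raw, "little")

# The field itself only covers 24 bits.
field_24 = int.from_bytes(raw[:3], "little")

# A fix-up step after a mode-precision read: truncate to the precision.
fixed_up = viewed_32 & 0xFFFFFF

print(hex(viewed_32), hex(field_24), hex(fixed_up))
```

The 32-bit view picks up the neighbouring byte, so a later comparison against a 24-bit bound can misfire unless the value is truncated first.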

If you fix expansion then consider a nested VIEW_CONVERT_EXPR
that views back to the aggregate type - is that now supposed to
clear the upper 8 bits because of the VIEW_CONVERT_EXPR in the
middle?  Not so.  So fixing VIEW_CONVERT_EXPR sounds conceptually
wrong to me.

Not scalarizing a field to a DECL_BIT_FIELD FIELD_DECL's type looks like
the best fix to me.

Richard.


Re: [PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.

2014-02-13 Thread Uros Bizjak
On Thu, Feb 13, 2014 at 11:44 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

 I've noticed that _mm512_permutexvar_epi[64|32] intrinsics
 have wrong arguments order. As per [1] first argument is index.
 For vmpermps/vpermpd intrinsics are fine, but I've changed tests
 to call CALC with same arg order as intrinsic. There is the same
 problem (wrong argument order) with vrcp14s[d|s].
 Also avx512er-vrcp28ss-2.c test called wrong intrinsic.

 [1]  http://software.intel.com/sites/landingpage/IntrinsicsGuide/

 gcc/
 * config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap
 arguments order in builtin.
 (_mm512_permutexvar_epi64): Ditto.
 (_mm512_mask_permutexvar_epi64): Ditto
 (_mm512_maskz_permutexvar_epi32): Ditto
 (_mm512_permutexvar_epi32): Ditto
 (_mm512_mask_permutexvar_epi32): Ditto
 * config/i386/sse.md (srcp14mode): Swap operands.

 gcc/testsuite/
 * gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
 * gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
 * gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
 * gcc.target/i386/avx512f-vpermps-2.c: Ditto.
 * gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
 * gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
 * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

 diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
 index a04b289..d3b2dc5 100644
 --- a/gcc/config/i386/sse.md
 +++ b/gcc/config/i386/sse.md
 @@ -1456,12 +1456,12 @@
[(set (match_operand:VF_128 0 register_operand =v)
 (vec_merge:VF_128
   (unspec:VF_128
 -   [(match_operand:VF_128 1 nonimmediate_operand vm)]
 +   [(match_operand:VF_128 2 nonimmediate_operand vm)]
 UNSPEC_RCP14)
 - (match_operand:VF_128 2 register_operand v)
 + (match_operand:VF_128 1 register_operand v)
   (const_int 1)))]
TARGET_AVX512F
 -  vrcp14ssescalarmodesuffix\t{%1, %2, %0|%0, %2, %1}
 +  vrcp14ssescalarmodesuffix\t{%2, %1, %0|%0, %1, %2}

Please don't change srcp pattern, it should be defined similar to
vrcpss (aka sse_vmrcpv4sf). You need to switch operand order
elsewhere.

Other than that, the patch is OK.

Uros.


Re: [PATCH] S390: Add test for hotpatching of nested functions

2014-02-13 Thread Andreas Krebbel
 2014-02-13  Dominik Vogt  v...@linux.vnet.ibm.com
 
   * gcc.target/s390/hotpatch-compile-8.c: New test

Ok committed.  Thanks!

-Andreas-



Re: [PATCH] (gcc-4.8) S390: Fix crash with -mhotpatch and gfortran

2014-02-13 Thread Andreas Krebbel
 2014-02-12  Dominik Vogt  v...@linux.vnet.ibm.com
 
   * config/s390/s390.c (s390_asm_output_function_label):
   fix crash caused by bad second argument to warning_at() with -mhotpatch
   and nested functions (e.g. with gfortran)

Applied. Thanks!

-Andreas-



Re: [PATCH] S390: Fix crash with -mhotpatch and gfortran

2014-02-13 Thread Andreas Krebbel
 2014-02-12  Dominik Vogt  v...@linux.vnet.ibm.com
 
   * config/s390/s390.c (s390_asm_output_function_label):
   fix crash caused by bad second argument to warning_at() with -mhotpatch
   and nested functions (e.g. with gfortran)

Applied.  Thanks!

-Andreas-



Re: [PATCH] Fix Cilk+ ICEs in the alias oracle

2014-02-13 Thread Richard Biener
On Thu, 13 Feb 2014, Richard Biener wrote:

 
 Cilk+ builds INDIRECT_REFs when expanding builtins (oops) and thus
 those can leak into MEM_EXPRs which will lead to ICEs later.
 The following patch properly builds a MEM_REF instead.  Grepping
 for INDIRECT_REF I found another suspicious use (just removed,
 it cannot have triggered and it looks bogus) and the use of
 a langhook instead of proper GIMPLE interfaces (function also
 used during expansion).
 
 Bootstrap / testing in progress together with some other stuff.
 
 Ok?

Btw, this exposes that Cilk+ is LTO-ignorant - it doesn't properly
register its global trees (bah, more global trees...).  So
the types_compatible_p call ICEs.  Trying to process them in
lto/lto.c:read_cgraph_and_symbols doesn't seem to work though.

So I'm opting to remove the assert and leave fixing LTO for
somebody who cares about Cilk+.

Simplifies the patch as follows; bootstrapped and tested on
x86_64-unknown-linux-gnu.

Richard.

2014-02-13  Richard Biener  rguent...@suse.de

* cilk-common.c (cilk_arrow): Build a MEM_REF, not an INDIRECT_REF.
(get_frame_arg): Drop the assert with langhook types_compatible_p.
Do not strip INDIRECT_REFs.

Index: gcc/cilk-common.c
===
--- gcc/cilk-common.c   (revision 207725)
+++ gcc/cilk-common.c   (working copy)
@@ -66,8 +66,7 @@ cilk_dot (tree frame, int field_number,
 tree
 cilk_arrow (tree frame_ptr, int field_number, bool volatil)
 {
-  return cilk_dot (fold_build1 (INDIRECT_REF, 
-   TREE_TYPE (TREE_TYPE (frame_ptr)), frame_ptr), 
+  return cilk_dot (build_simple_mem_ref (frame_ptr), 
   field_number, volatil);
 }
 
@@ -287,12 +286,9 @@ get_frame_arg (tree call)
 
   argtype = TREE_TYPE (argtype);
   
-  gcc_assert (!lang_hooks.types_compatible_p
- || lang_hooks.types_compatible_p (argtype, cilk_frame_type_decl));
-
   /* If it is passed in as an address, then just use the value directly 
  since the function is inlined.  */
-  if (TREE_CODE (arg) == INDIRECT_REF || TREE_CODE (arg) == ADDR_EXPR)
+  if (TREE_CODE (arg) == ADDR_EXPR)
 return TREE_OPERAND (arg, 0);
   return arg;
 }


Re: [PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.

2014-02-13 Thread Uros Bizjak
On Thu, Feb 13, 2014 at 1:37 PM, Uros Bizjak ubiz...@gmail.com wrote:

 I've noticed that _mm512_permutexvar_epi[64|32] intrinsics
 have wrong arguments order. As per [1] first argument is index.
 For vmpermps/vpermpd intrinsics are fine, but I've changed tests
 to call CALC with same arg order as intrinsic. There is the same
 problem (wrong argument order) with vrcp14s[d|s].
 Also avx512er-vrcp28ss-2.c test called wrong intrinsic.

 [1]  http://software.intel.com/sites/landingpage/IntrinsicsGuide/

 gcc/
 * config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap
 arguments order in builtin.
 (_mm512_permutexvar_epi64): Ditto.
 (_mm512_mask_permutexvar_epi64): Ditto
 (_mm512_maskz_permutexvar_epi32): Ditto
 (_mm512_permutexvar_epi32): Ditto
 (_mm512_mask_permutexvar_epi32): Ditto
 * config/i386/sse.md (srcp14mode): Swap operands.

 gcc/testsuite/
 * gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
 * gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
 * gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
 * gcc.target/i386/avx512f-vpermps-2.c: Ditto.
 * gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
 * gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
 * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

 diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
 index a04b289..d3b2dc5 100644
 --- a/gcc/config/i386/sse.md
 +++ b/gcc/config/i386/sse.md
 @@ -1456,12 +1456,12 @@
[(set (match_operand:VF_128 0 register_operand =v)
 (vec_merge:VF_128
   (unspec:VF_128
 -   [(match_operand:VF_128 1 nonimmediate_operand vm)]
 +   [(match_operand:VF_128 2 nonimmediate_operand vm)]
 UNSPEC_RCP14)
 - (match_operand:VF_128 2 register_operand v)
 + (match_operand:VF_128 1 register_operand v)
   (const_int 1)))]
TARGET_AVX512F
 -  vrcp14ssescalarmodesuffix\t{%1, %2, %0|%0, %2, %1}
 +  vrcp14ssescalarmodesuffix\t{%2, %1, %0|%0, %1, %2}

 Please don't change srcp pattern, it should be defined similar to
 vrcpss (aka sse_vmrcpv4sf). You need to switch operand order
 elsewhere.

No, you are correct. Operands should be swapped as in your patch.

The patch is OK for mainline.

Thanks,
Uros.


Re: [PATCH 4/6] [GOMP4] OpenACC 1.0+ support in fortran front-end

2014-02-13 Thread Ilmir Usmanov

Hi Thomas!

Thanks a lot for your review!
I agree with all your notes.

On 11.02.2014 20:51, Thomas Schwinge wrote:

For ChangeLog files updates (on gomp-4_0-branch, use the respective
ChangeLog.gomp files, by the way), should just you be listed as the
author, or also your colleagues?
Thank you for the notice. I added Evgeny and Dmitry as authors for this
part (see attached ChangeLog entry).

With these issues addressed, this patch is ready for commit to
gomp-4_0-branch.  Use your own judgement; if you feel confident, just
commit it, or otherwise post it again for a final review -- as you
prefer.
I fixed the patch according to your review and am ready to commit it. OK for
the GOMP4 branch?


--
Ilmir.
From bf14158b1a28c2c5b29c41071fa62c011d9f4f65 Mon Sep 17 00:00:00 2001
From: Ilmir Usmanov i.usma...@samsung.com
Date: Thu, 13 Feb 2014 15:58:28 +0400
Subject: [PATCH] OpenACC GENERIC nodes

---
 gcc/doc/generic.texi|  45 ++
 gcc/gimplify.c  |  62 +
 gcc/omp-low.c   |  96 --
 gcc/tree-core.h |  61 ++---
 gcc/tree-pretty-print.c | 119 
 gcc/tree.c  |  44 +-
 gcc/tree.def|  42 +
 gcc/tree.h  |  61 -
 8 files changed, 507 insertions(+), 23 deletions(-)

diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index a56715b..ce14620 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -2052,6 +2052,15 @@ edge.  Rethrowing the exception is represented using @code{RESX_EXPR}.
 @node OpenMP
 @subsection OpenMP
 @tindex OACC_PARALLEL
+@tindex OACC_KERNELS
+@tindex OACC_DATA
+@tindex OACC_HOST_DATA
+@tindex OACC_DECLARE
+@tindex OACC_UPDATE
+@tindex OACC_ENTER_DATA
+@tindex OACC_EXIT_DATA
+@tindex OACC_WAIT
+@tindex OACC_CACHE
 @tindex OMP_PARALLEL
 @tindex OMP_FOR
 @tindex OMP_SECTIONS
@@ -2073,6 +2082,42 @@ clauses used by the OpenMP API @w{@uref{http://www.openmp.org/}}.
 
 Represents @code{#pragma acc parallel [clause1 @dots{} clauseN]}.
 
+@item OACC_KERNELS
+
+Represents @code{#pragma acc kernels [clause1 @dots{} clauseN]}.
+
+@item OACC_DATA
+
+Represents @code{#pragma acc data [clause1 @dots{} clauseN]}.
+
+@item OACC_HOST_DATA
+
+Represents @code{#pragma acc host_data [clause1 @dots{} clauseN]}.
+
+@item OACC_DECLARE
+
+Represents @code{#pragma acc declare [clause1 @dots{} clauseN]}.
+
+@item OACC_UPDATE
+
+Represents @code{#pragma acc update [clause1 @dots{} clauseN]}.
+
+@item OACC_ENTER_DATA
+
+Represents @code{#pragma acc enter data [clause1 @dots{} clauseN]}.
+
+@item OACC_EXIT_DATA
+
+Represents @code{#pragma acc exit data [clause1 @dots{} clauseN]}.
+
+@item OACC_WAIT
+
+Represents @code{#pragma acc wait [(num @dots{})]}.
+
+@item OACC_CACHE
+
+Represents @code{#pragma acc cache (var @dots{})}.
+
 @item OMP_PARALLEL
 
 Represents @code{#pragma omp parallel [clause1 @dots{} clauseN]}. It
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index d20f07f..06d7790 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -4333,6 +4333,15 @@ is_gimple_stmt (tree t)
 case ASM_EXPR:
 case STATEMENT_LIST:
 case OACC_PARALLEL:
+case OACC_KERNELS:
+case OACC_DATA:
+case OACC_HOST_DATA:
+case OACC_DECLARE:
+case OACC_UPDATE:
+case OACC_ENTER_DATA:
+case OACC_EXIT_DATA:
+case OACC_WAIT:
+case OACC_CACHE:
 case OMP_PARALLEL:
 case OMP_FOR:
 case OMP_SIMD:
@@ -6157,6 +6166,23 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	remove = true;
 	  break;
 
+	case OMP_CLAUSE_HOST:
+	case OMP_CLAUSE_OACC_DEVICE:
+	case OMP_CLAUSE_DEVICE_RESIDENT:
+	case OMP_CLAUSE_USE_DEVICE:
+	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_WAIT:
+	case OMP_NO_CLAUSE_CACHE:
+	case OMP_CLAUSE_INDEPENDENT:
+	case OMP_CLAUSE_ASYNC:
+	case OMP_CLAUSE_WORKER:
+	case OMP_CLAUSE_VECTOR:
+	case OMP_CLAUSE_NUM_GANGS:
+	case OMP_CLAUSE_NUM_WORKERS:
+	case OMP_CLAUSE_VECTOR_LENGTH:
+	  remove = true;
+	  break;
+
 	case OMP_CLAUSE_NOWAIT:
 	case OMP_CLAUSE_ORDERED:
 	case OMP_CLAUSE_UNTIED:
@@ -6498,6 +6524,20 @@ gimplify_adjust_omp_clauses (tree *list_p)
 	case OMP_CLAUSE_DEPEND:
 	  break;
 
+	case OMP_CLAUSE_HOST:
+	case OMP_CLAUSE_OACC_DEVICE:
+	case OMP_CLAUSE_DEVICE_RESIDENT:
+	case OMP_CLAUSE_USE_DEVICE:
+	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_WAIT:
+	case OMP_NO_CLAUSE_CACHE:
+	case OMP_CLAUSE_INDEPENDENT:
+	case OMP_CLAUSE_ASYNC:
+	case OMP_CLAUSE_WORKER:
+	case OMP_CLAUSE_VECTOR:
+	case OMP_CLAUSE_NUM_GANGS:
+	case OMP_CLAUSE_NUM_WORKERS:
+	case OMP_CLAUSE_VECTOR_LENGTH:
 	default:
 	  gcc_unreachable ();
 	}
@@ -7988,6 +8028,19 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	  ret = GS_ALL_DONE;
 	  break;
 
+	case OACC_KERNELS:
+	case OACC_DATA:
+	case OACC_HOST_DATA:
+	case OACC_DECLARE:
+	case OACC_UPDATE:
+	case OACC_ENTER_DATA:
+	case OACC_EXIT_DATA:
+	case OACC_WAIT:
+	case OACC_CACHE:

[PATCH] Update isl/cloog recommended versions

2014-02-13 Thread Richard Biener

This updates the recommended versions to match those I just put
at ftp://gcc.gnu.org/pub/gcc/infrastructure/.  It also mentions
the possibility of doing in-tree builds and fixes PR59878 by
re-wording the cloog install parts.

Committed.

Richard.

2014-02-13  Richard Biener  rguent...@suse.de

PR bootstrap/59878
* doc/install.texi (ISL): Update recommended version to 0.12.2,
mention the possibility of an in-tree build.
(CLooG): Update recommended version to 0.18.1, mention the
possibility of an in-tree build and clarify that the ISL
bundled with CLooG does not work.

Index: gcc/doc/install.texi
===
*** gcc/doc/install.texi(revision 207725)
--- gcc/doc/install.texi(working copy)
*** installed but it is not in your default
*** 383,407 
  @option{--with-mpc} configure option should be used.  See also
  @option{--with-mpc-lib} and @option{--with-mpc-include}.
  
! @item ISL Library version 0.11.1
  
  Necessary to build GCC with the Graphite loop optimizations.
  It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/}
! as @file{isl-0.11.1.tar.bz2}.
  
! The @option{--with-isl} configure option should be used if ISL is not
! installed in your default library search path.
! 
! @item CLooG 0.18.0
  
  Necessary to build GCC with the Graphite loop optimizations.  It can be
  downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} as
! @file{cloog-0.18.0.tar.gz}.  The @option{--with-cloog} configure option should
! be used if CLooG is not installed in your default library search path.
! CLooG needs to be built against ISL 0.11.1.  Use @option{--with-isl=system}
! to direct CLooG to pick up an already installed ISL, otherwise it will use
! ISL 0.11.1 as bundled with CLooG.  CLooG needs to be configured to use GMP
! internally, use @option{--with-bits=gmp} to direct it to do that.
  
  @end table
  
--- 383,412 
  @option{--with-mpc} configure option should be used.  See also
  @option{--with-mpc-lib} and @option{--with-mpc-include}.
  
! @item ISL Library version 0.12.2
  
  Necessary to build GCC with the Graphite loop optimizations.
  It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/}
! as @file{isl-0.12.2.tar.bz2}.  If an ISL source distribution is found
! in a subdirectory of your GCC sources named @file{isl}, it will be
! built together with GCC.  Alternatively, the @option{--with-isl} configure
! option should be used if ISL is not installed in your default library
! search path.
  
! @item CLooG 0.18.1
  
  Necessary to build GCC with the Graphite loop optimizations.  It can be
  downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} as
! @file{cloog-0.18.1.tar.gz}.  If a CLooG source distribution is found
! in a subdirectory of your GCC sources named @file{cloog}, it will be
! built together with GCC.  Alternatively, the @option{--with-cloog} configure
! option should be used if CLooG is not installed in your default library search
! path.
! 
! If you want to install CLooG separately it needs to be built against
! ISL 0.12.2 by using the @option{--with-isl=system} to direct CLooG to pick
! up an already installed ISL.  Using the ISL library as bundled with CLooG
! is not supported.
  
  @end table
  


Re: [PATCH] Update isl/cloog recommended versions

2014-02-13 Thread Richard Biener
On Thu, 13 Feb 2014, Richard Biener wrote:

 
 This updates the recommended versions to match those I just put
 at ftp://gcc.gnu.org/pub/gcc/infrastructure/.  It also mentions
 the possibility of doing in-tree builds and fixes PR59878 by
 re-wording the cloog install parts.

And this updates download_prerequisites.

Committed.

Richard.

2014-02-13  Richard Biener  rguent...@suse.de

* download_prerequisites: Update ISL and CLOOG versions.

Index: contrib/download_prerequisites
===
--- contrib/download_prerequisites  (revision 207757)
+++ contrib/download_prerequisites  (working copy)
@@ -43,8 +43,8 @@ ln -sf $MPC mpc || exit 1
 
 # Necessary to build GCC with the Graphite loop optimizations.
 if [ $GRAPHITE_LOOP_OPT = yes ] ; then
-  ISL=isl-0.11.1
-  CLOOG=cloog-0.18.0
+  ISL=isl-0.12.2
+  CLOOG=cloog-0.18.1
 
   wget ftp://gcc.gnu.org/pub/gcc/infrastructure/$ISL.tar.bz2 || exit 1
   tar xjf $ISL.tar.bz2  || exit 1


Re: [PING][PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope

2014-02-13 Thread Dimitris Papavasiliou

Hello,

Pinging this patch review request.  Can someone involved in the 
Objective-C language frontend have a quick look at the description of 
the proposed features and tell me if it'd be ok to have them in the 
trunk so I can go ahead and create proper patches?


Thanks,
Dimitris

On 02/06/2014 11:25 AM, Dimitris Papavasiliou wrote:

Hello,

This is a patch regarding a couple of Objective-C related dialect
options and warning switches. I already submitted it a while ago
but gave up after pinging a couple of times. I am now informed that I
should have kept pinging until I got someone's attention, so I'm
resending it.

The patch is now against an old revision and, as I stated originally, it's
probably not in a state that can be adopted as is. I'm sending it as is
so that the implemented features can be assessed in terms of their
usefulness; if they're welcome, I'd be happy to make any necessary
changes to bring it up to date, split it into smaller patches, add
test cases and anything else that is deemed necessary.

Here's the relevant text from my initial message:

Two of these switches are related to a feature request I submitted a
while ago, Bug 56044
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56044). I won't reproduce
the entire argument here since it is available in the feature request.
The relevant functionality in the patch comes in the form of two switches:

-Wshadow-ivars, which controls the "local declaration of ‘somevar’ hides
instance variable" warning, which curiously is enabled by default instead
of being controlled at least by -Wshadow. The patch changes it so that
this warning can be enabled and disabled specifically through
-Wshadow-ivars as well as with all other shadowing-related warnings
through -Wshadow.

The reason for the extra switch is that, while searching through the
Internet for a solution to this problem I have found out that other
people are inconvenienced by this particular warning as well so it might
be useful to be able to turn it off while keeping all the other
shadowing-related warnings enabled.

-flocal-ivars, which when true, as it is by default, treats instance
variables as having local scope. If false (-fno-local-ivars), instance
variables must always be referred to as self->ivarname, and references to
ivarname resolve to the local or global scope as usual.

I've also taken the opportunity of adding another switch unrelated to
the above but related to instance variables:

-fivar-visibility, which can be set to one of private, protected (the
default), public or package. This sets the default instance variable
visibility, which normally is implicitly protected. My use case for it is
basically to be able to set it to public and thus effectively disable
this visibility mechanism altogether, which I find no use for and
therefore have to circumvent. I'm not sure if anyone else feels the same
way about this, but I figured it was worth a try.

I'm attaching a preliminary patch against the current revision in case
anyone wants to have a look. The changes are very small and any blatant
mistakes should be immediately obvious. I have to admit to having
virtually no knowledge of the internals of GCC but I have tried to keep
in line with formatting guidelines and general style as well as looking
up the particulars of the way options are handled in the available
documentation to avoid blind copy-pasting. I have also tried to test the
functionality both in my own (relatively large, or at least not too
small) project and with small test programs and everything works as
expected. Finally, I tried running the tests too, but these fail to
complete in both the patched and unpatched versions, possibly due to the
way I've configured GCC.

Dimitris




[PATCH, ARM] Skip pr59858.c test for -mfloat-abi=hard

2014-02-13 Thread Ian Bolton
Hi,

The pr59858.c testcase explicitly sets -msoft-float which is incompatible
with our -mfloat-abi=hard variant.

This test therefore should not be run if you have -mfloat-abi=hard.

Tested with both variations for arm-none-eabi build.

OK for commit?

Cheers,
Ian


2014-02-13  Ian Bolton  ian.bol...@arm.com

testsuite/
* gcc.target/arm/pr59858.c: Skip test if -mfloat-abi=hard.

diff --git a/gcc/testsuite/gcc.target/arm/pr59858.c b/gcc/testsuite/gcc.target/arm/pr59858.c
index 463bd38..1e03203 100644
--- a/gcc/testsuite/gcc.target/arm/pr59858.c
+++ b/gcc/testsuite/gcc.target/arm/pr59858.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-march=armv5te -marm -mthumb-interwork -Wall -Wstrict-prototypes -Wstrict-aliasing -funsigned-char -fno-builtin -fno-asm -msoft-float -std=gnu99 -mlittle-endian -mthumb -fno-stack-protector  -Os -g -feliminate-unused-debug-types -funit-at-a-time -fmerge-all-constants -fstrict-aliasing -fno-tree-loop-optimize -fno-tree-dominator-opts -fno-strength-reduce -fPIC -w" } */
+/* { dg-skip-if "Test is not compatible with hard-float" { *-*-* } { "-mfloat-abi=hard" } { "" } } */
 
 typedef enum {
  REG_ENOSYS = -1,


Re: [PATCH 4/6] [GOMP4] OpenACC 1.0+ support in fortran front-end

2014-02-13 Thread Thomas Schwinge
Hi Ilmir!

On Thu, 13 Feb 2014 17:15:47 +0400, Ilmir Usmanov i.usma...@samsung.com wrote:
 I fixed patch according to your review and ready to commit it. OK for 
 GOMP4 branch?

Yes!  :-) Congratulations, and thanks for promptly addressing the issues
raised during review.  I'm aware this can be a bit of a boring or tedious
process, but in the end, the code quality will be higher (well, that's
the idea about code review), and certainly you'll have learned some
things, too (and I have, too), and so next time this process will likely
be faster.


Only a few minor comments about the ChangeLog formatting:

 13-02-2014  Ilmir Usmanov  i.usma...@samsung.com

YYYY-MM-DD is the format used in ChangeLogs.

   Add OpenACC 1.0 support to GENERIC, except loop directive and subarrays.
 
   Dmitry Bocharnikov dmitr...@samsung.com
   Evgeny Gavrin e.gav...@samsung.com
   Ilmir Usmanov i.usma...@samsung.com

For multiple authors, do it like this:

2014-02-13  Ilmir Usmanov  i.usma...@samsung.com
Dmitry Bocharnikov  dmitr...@samsung.com
Evgeny Gavrin  e.gav...@samsung.com

|  gcc/
|  * gimplify.c (is_gimple_stmt): Stub OpenACC directives and clauses.
|  (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Likewise.
|  (gimplify_expr): Likewise.

(I don't care, but) you can also do it as follows, a bit simpler:

* [file] ([item 1], [item 2], [...]): [text].

|  * tree-core.h 
|  (OMP_CLAUSE_HOST, OMP_CLAUSE_OACC_DEVICE, OMP_CLAUSE_DEVICE_RESIDENT,
|  OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE_GANG, OMP_CLAUSE_WAIT,
|  OMP_NO_CLAUSE_CACHE, OMP_CLAUSE_INDEPENDENT, OMP_CLAUSE_ASYNC,
|  OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR, OMP_CLAUSE_NUM_GANGS,
|  OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH): New clauses.

As the enum omp_clause_code is the thing that you modify, that would be:

* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_HOST, [...].

Or, as other people do:

* tree-core.h (omp_clause_code): Add OMP_CLAUSE_HOST, [...].


Grüße,
 Thomas




Re: RFA: one more version of patch for PR59535

2014-02-13 Thread Richard Earnshaw
On 11/02/14 19:43, Vladimir Makarov wrote:
   This is one more version of the patch to fix the PR59535
 
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535
 
   Here are the results of applying the patch:
 
                         Thumb    Thumb2
 
 reload                  2626334  2400154
 lra (before the patch)  2665749  2414926
 lra (after the patch)   2626334  2397132
 
 
 I already wrote that the change in arm.h is to prevent reloading sp as
 an address by LRA. Reload has no such problem as it uses legitimate
 address hook and LRA mostly relies on base_reg_class.
 
 Richard, I need an approval for this change.
 
 2014-02-11  Vladimir Makarov  vmaka...@redhat.com
 
 PR rtl-optimization/59535
 * lra-constraints.c (process_alt_operands): Encourage alternative
 when unassigned pseudo class is superset of the alternative class.
 (inherit_reload_reg): Don't inherit when optimizing for code size.
 * config/arm/arm.h (MODE_BASE_REG_CLASS): Return CORE_REGS for
 Thumb2 and BASE_REGS for modes not less than 4 for LRA.


 Index: config/arm/arm.h
 ===
 --- config/arm/arm.h  (revision 207562)
 +++ config/arm/arm.h  (working copy)
 @@ -1272,8 +1272,10 @@ enum reg_class
 when addressing quantities in QI or HI mode; if we don't know the
 mode, then we must be conservative.  */
  #define MODE_BASE_REG_CLASS(MODE)\
 -(TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS :  \
 - (((MODE) == SImode) ? BASE_REGS : LO_REGS))
 +(TARGET_ARM || (TARGET_THUMB2 && (!optimize_size || arm_lra_flag))   \
 + ? CORE_REGS : ((MODE) == SImode \
 +|| (arm_lra_flag && GET_MODE_SIZE (MODE) >= 4)   \
 +? BASE_REGS : LO_REGS))
  
  /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
 instead of BASE_REGS.  */
 

Awesome.  Thanks, Vladimir.

I find that while I can't convince myself that the logic in the change
to MODE_BASE_REG_CLASS is wrong, it's very hard to follow.  Furthermore,
when we come to rip out the old reload code it will be quite prone to
getting this wrong.  I think restructuring this along the lines of:

#define MODE_BASE_REG_CLASS(MODE)                                   \
  (arm_lra_flag                                                     \
   ? (TARGET_32BIT ? CORE_REGS                                      \
      : GET_MODE_SIZE (MODE) >= 4 ? BASE_REGS                       \
      : LO_REGS)                                                    \
   : ((TARGET_ARM || (TARGET_THUMB2 && !optimize_size)) ? CORE_REGS \
      : ((MODE) == SImode) ? BASE_REGS                              \
      : LO_REGS))

is both easier to understand and easier to simplify later when reload
goes away.
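
As a rough cross-check of the restructuring, the two forms can be modelled
as plain C functions and compared over a small set of configurations.  This
is only a sketch: the enums, struct config, mode_size and the reading of
TARGET_32BIT as "ARM or Thumb-2" are all assumptions made for illustration,
not GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for GCC's register classes and target state.
   The names echo arm.h, but everything here is a model, not GCC code.  */
enum reg_class { LO_REGS, BASE_REGS, CORE_REGS };
enum target { ARM, THUMB1, THUMB2 };
enum mode { QImode, HImode, SImode, DImode };

struct config
{
  enum target target;
  bool optimize_size;
  bool lra;
};

/* Assumed mode sizes in bytes.  */
static int
mode_size (enum mode m)
{
  static const int sizes[] = { 1, 2, 4, 8 };
  assert (m >= QImode && m <= DImode);
  return sizes[m];
}

/* Model of the form in the patch under review.  */
static enum reg_class
base_class_patch (struct config c, enum mode m)
{
  bool arm = c.target == ARM, thumb2 = c.target == THUMB2;
  return (arm || (thumb2 && (!c.optimize_size || c.lra)))
         ? CORE_REGS
         : (m == SImode || (c.lra && mode_size (m) >= 4)) ? BASE_REGS
         : LO_REGS;
}

/* Model of the restructured form, which splits on the LRA flag first.  */
static enum reg_class
base_class_restructured (struct config c, enum mode m)
{
  bool arm = c.target == ARM, thumb2 = c.target == THUMB2;
  bool is_32bit = arm || thumb2;  /* Assumed meaning of TARGET_32BIT.  */
  if (c.lra)
    return is_32bit ? CORE_REGS : mode_size (m) >= 4 ? BASE_REGS : LO_REGS;
  return (arm || (thumb2 && !c.optimize_size)) ? CORE_REGS
         : m == SImode ? BASE_REGS
         : LO_REGS;
}

/* Count the modelled configurations on which the two forms disagree.  */
static int
count_mismatches (void)
{
  int mismatches = 0;
  for (int t = ARM; t <= THUMB2; t++)
    for (int os = 0; os <= 1; os++)
      for (int lra = 0; lra <= 1; lra++)
        for (int m = QImode; m <= DImode; m++)
          {
            struct config c = { (enum target) t, os != 0, lra != 0 };
            if (base_class_patch (c, (enum mode) m)
                != base_class_restructured (c, (enum mode) m))
              mismatches++;
          }
  return mismatches;
}
```

Under these assumptions the two forms agree on every modelled configuration;
that says nothing about modes or target states the model leaves out.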

I'll run a regression test on this and let you know the results.

R.



Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-13 Thread Eric Botcazou
 We are using the type of a bitfield field for the replacement which
 we IMHO should avoid because the FIELD_DECL's size is 24
 but the field's type TYPE_SIZE is 32 (its precision is 24).  That's
 all not an issue until you start to VIEW_CONVERT to such type
 (VIEW_CONVERT being a reference op just cares for size not
 precision).  Other ops are treated correctly by expansion.
 
 Now - using a non-mode precision integer type as scalar replacement
 isn't going to produce great code and, as we can see, has issues
 when using VIEW_CONVERT_EXPRs.
 
 SRA should either avoid this transform or fixup by VIEW_CONVERTing
 memory reads only to mode-precision integer types and then inserting
 a fixup cast.  The direct VIEW_CONVERsion it creates, from
 
   my_rec2.r1 = VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb: _3 sz: 1});
   _58 = my_rec2.r1.f;
 
 
 to basically
 
   _58 = VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1 ...]{lb: _3 sz: 1});
 
 is simply wrong.

There is nothing obvious I think, i.e. that's debatable.  I agree that a VCE 
from a 32-bit object to a 32-bit integer with 24-bit precision should not 
clear the upper 8 bits (so the REDUCE_BIT_FIELD part of my patch is wrong).
But here we have a VCE from a 24-bit object to a 32-bit integer with 24-bit 
precision which reads *more bits* than the size of the source type; that I 
think is plain wrong and is fixed by the bit-field extraction in the patch.
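
To make the size mismatch concrete, here is a small C model of the two kinds
of read.  It is only an illustration under stated assumptions: read_le and
the two wrappers are hypothetical helpers, and the little-endian byte
assembly is chosen for the example rather than taken from the expander.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble the first NBYTES bytes at P into an integer, least significant
   byte first.  Building the value byte by byte keeps the result
   independent of host endianness.  */
static uint32_t
read_le (const unsigned char *p, size_t nbytes)
{
  assert (nbytes <= 4);
  uint32_t value = 0;
  for (size_t i = 0; i < nbytes; i++)
    value |= (uint32_t) p[i] << (8 * i);
  return value;
}

/* Bit-field style extraction: read only the 3 bytes the object occupies.  */
static uint32_t
read_24bit_object (const unsigned char *p)
{
  return read_le (p, 3);
}

/* Full-width read: also pulls in whatever byte follows the object.  */
static uint32_t
read_as_32bit (const unsigned char *p)
{
  return read_le (p, 4);
}
```

With a buffer holding a 24-bit object followed by an unrelated byte, the
full-width read returns a value that depends on that neighbouring byte,
while the 3-byte extraction does not.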

 If you fix expansion then consider a nested VIEW_CONVERT_EXPR
 that views back to the aggregate type - is that now supposed to
 clear the upper 8 bits because of the VIEW_CONVERT_EXPR in the
 middle?  Not so.  So fixing VIEW_CONVERT_EXPR sounds conceptually
 wrong to me.

I agree that we need not clear, but we need to prevent the expansion from 
reading more bits than what is contained in the source type.  And this is 
sufficient to fix the regression.

 Not scalarizing a field to a DECL_BIT_FIELD FIELD_DECL's type looks like
 the best fix to me.

That seems like a big hammer though.

-- 
Eric Botcazou


[AArch64] Improve vst4_lane intrinsics

2014-02-13 Thread James Greenhalgh

Hi,

This patch rewrites the vst4_lane intrinsics in terms of RTL builtins.
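
For readers unfamiliar with these intrinsics, the following is a scalar
reference model of what a four-vector lane store does.  It is a sketch, not
the builtin: st4_lane_model, LANES and the choice of four 32-bit lanes per
vector are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4  /* Lanes per vector in this model (four 32-bit lanes).  */

/* Reference model of a four-vector lane store: write lane LANE of each of
   the four vectors to four consecutive elements starting at DST.  */
static void
st4_lane_model (int32_t *dst, const int32_t vecs[4][LANES], int lane)
{
  assert (lane >= 0 && lane < LANES);
  for (int v = 0; v < 4; v++)
    dst[v] = vecs[v][lane];
}
```

The two- and three-vector variants follow the same pattern with fewer
source vectors.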

Tested on aarch64-none-elf with no issues.

OK to queue for Stage 1?

Thanks,
James

---
gcc/

2014-02-13  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64-builtins.c
(aarch64_types_storestruct_lane_qualifiers): New.
(TYPES_STORESTRUCT_LANE): Likewise.
* config/aarch64/aarch64-simd-builtins.def (st2_lane): New.
(st3_lane): Likewise.
(st4_lane): Likewise.
* config/aarch64/aarch64-simd.md (vec_store_lanesoi_lane<mode>): New.
(vec_store_lanesci_lane<mode>): Likewise.
(vec_store_lanesxi_lane<mode>): Likewise.
(aarch64_st2_lane<VQ:mode>): Likewise.
(aarch64_st3_lane<VQ:mode>): Likewise.
(aarch64_st4_lane<VQ:mode>): Likewise.
* config/aarch64/aarch64.md (unspec): Add UNSPEC_ST{2,3,4}_LANE.
* config/aarch64/arm_neon.h
(__ST2_LANE_FUNC): Rewrite using builtins, update use points to
use new macro arguments.
(__ST3_LANE_FUNC): Likewise.
(__ST4_LANE_FUNC): Likewise.
* config/aarch64/iterators.md (V_TWO_ELEM): New.
(V_THREE_ELEM): Likewise.
(V_FOUR_ELEM): Likewise.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index ebab2ce..a12a1aa 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -226,6 +226,11 @@ aarch64_types_store1_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_void, qualifier_pointer_map_mode, qualifier_none };
 #define TYPES_STORE1 (aarch64_types_store1_qualifiers)
 #define TYPES_STORESTRUCT (aarch64_types_store1_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_void, qualifier_pointer_map_mode,
+  qualifier_none, qualifier_none };
+#define TYPES_STORESTRUCT_LANE (aarch64_types_storestruct_lane_qualifiers)
 
 #define CF0(N, X) CODE_FOR_aarch64_##N##X
 #define CF1(N, X) CODE_FOR_##N##X##1
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index e5f71b4..7bfdfca 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -107,6 +107,10 @@
   BUILTIN_VQ (STORESTRUCT, st3, 0)
   BUILTIN_VQ (STORESTRUCT, st4, 0)
 
+  BUILTIN_VQ (STORESTRUCT_LANE, st2_lane, 0)
+  BUILTIN_VQ (STORESTRUCT_LANE, st3_lane, 0)
+  BUILTIN_VQ (STORESTRUCT_LANE, st4_lane, 0)
+
   BUILTIN_VQW (BINOP, saddl2, 0)
   BUILTIN_VQW (BINOP, uaddl2, 0)
   BUILTIN_VQW (BINOP, ssubl2, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4dffb59e856aeaafb79007255d3b91a73ef1ef13..f19b7d5123b5a6249026d48f943445f8167b1c45 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3584,6 +3584,17 @@ (define_insn "vec_store_lanesoi<mode>"
   [(set_attr "type" "neon_store2_2reg<q>")]
 )
 
+(define_insn "vec_store_lanesoi_lane<mode>"
+  [(set (match_operand:<V_TWO_ELEM> 0 "aarch64_simd_struct_operand" "=Utv")
+	(unspec:<V_TWO_ELEM> [(match_operand:OI 1 "register_operand" "w")
+		    (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)
+		    (match_operand:SI 2 "immediate_operand" "i")]
+		   UNSPEC_ST2_LANE))]
+  "TARGET_SIMD"
+  "st2\\t{%S1.<Vetype> - %T1.<Vetype>}[%2], %0"
+  [(set_attr "type" "neon_store3_one_lane<q>")]
+)
+
 (define_insn "vec_load_lanesci<mode>"
   [(set (match_operand:CI 0 "register_operand" "=w")
 	(unspec:CI [(match_operand:CI 1 "aarch64_simd_struct_operand" "Utv")
@@ -3604,6 +3615,17 @@ (define_insn "vec_store_lanesci<mode>"
   [(set_attr "type" "neon_store3_3reg<q>")]
 )
 
+(define_insn "vec_store_lanesci_lane<mode>"
+  [(set (match_operand:<V_THREE_ELEM> 0 "aarch64_simd_struct_operand" "=Utv")
+	(unspec:<V_THREE_ELEM> [(match_operand:CI 1 "register_operand" "w")
+		    (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)
+		    (match_operand:SI 2 "immediate_operand" "i")]
+		   UNSPEC_ST3_LANE))]
+  "TARGET_SIMD"
+  "st3\\t{%S1.<Vetype> - %U1.<Vetype>}[%2], %0"
+  [(set_attr "type" "neon_store3_one_lane<q>")]
+)
+
 (define_insn "vec_load_lanesxi<mode>"
   [(set (match_operand:XI 0 "register_operand" "=w")
 	(unspec:XI [(match_operand:XI 1 "aarch64_simd_struct_operand" "Utv")
@@ -3624,6 +3646,17 @@ (define_insn "vec_store_lanesxi<mode>"
   [(set_attr "type" "neon_store4_4reg<q>")]
 )
 
+(define_insn "vec_store_lanesxi_lane<mode>"
+  [(set (match_operand:<V_FOUR_ELEM> 0 "aarch64_simd_struct_operand" "=Utv")
+	(unspec:<V_FOUR_ELEM> [(match_operand:XI 1 "register_operand" "w")
+		    (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)
+		    (match_operand:SI 2 "immediate_operand" "i")]
+		   UNSPEC_ST4_LANE))]
+  "TARGET_SIMD"
+  "st4\\t{%S1.<Vetype> - %V1.<Vetype>}[%2], %0"
+  [(set_attr "type" "neon_store4_one_lane<q>")]
+)
+
 ;; Reload patterns for AdvSIMD register list operands.
 
 (define_expand "mov<mode>"
@@ -4118,6 +4151,57 @@ (define_expand aarch64_stVSTRUCT:nregs
   DONE;
 })
 
+(define_expand 

Fix PR libffi/60073

2014-02-13 Thread Eric Botcazou
This adds proper variadic support to the SPARC port of libffi, thus fixing a 
regression in the testsuite in 64-bit mode, and fixes a small inaccuracy in 
the documentation.

Tested on SPARC/Solaris and SPARC64/Solaris, applied on the mainline.


2014-02-13  Eric Botcazou  ebotca...@adacore.com

PR libffi/60073
* src/sparc/ffitarget.h (FFI_TARGET_SPECIFIC_VARIADIC): Define.
(FFI_EXTRA_CIF_FIELDS): Likewise.
(FFI_NATIVE_RAW_API): Move around.
* src/sparc/ffi.c (ffi_prep_cif_machdep_core): New function from...
(ffi_prep_cif_machdep): ...here.  Call ffi_prep_cif_machdep_core.
(ffi_prep_cif_machdep_var): New function.
(ffi_closure_sparc_inner_v9): Do not pass anonymous FP arguments in
FP registers.
* doc/libffi.texi (Introduction): Fix inaccuracy.
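
The rule behind the ffi_closure_sparc_inner_v9 change can be sketched as a
small standalone model.  This is an illustration only: struct cif_model, the
enums and the helper names are hypothetical stand-ins, not libffi's types or
API, and fp_slot_max is passed in as a plain parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for libffi's type codes and register files.  */
enum arg_kind { ARG_INT, ARG_FLOAT, ARG_DOUBLE };
enum arg_home { IN_GPR, IN_FPR };

/* Only the two fields of the call interface that matter here.  */
struct cif_model
{
  unsigned int nargs;
  unsigned int nfixedargs;
};

/* Non-variadic case: every argument corresponds to a named parameter.  */
static void
prep_fixed (struct cif_model *cif)
{
  cif->nfixedargs = cif->nargs;
}

/* Variadic case: only the first NFIXEDARGS arguments are named.  */
static void
prep_variadic (struct cif_model *cif, unsigned int nfixedargs)
{
  assert (nfixedargs <= cif->nargs);
  cif->nfixedargs = nfixedargs;
}

/* An FP argument is taken from an FP register only if it falls within
   the FP slots and corresponds to a named parameter.  */
static enum arg_home
classify_arg (const struct cif_model *cif, unsigned int i,
              enum arg_kind kind, unsigned int fp_slot_max)
{
  assert (i < cif->nargs);
  const bool named = i < cif->nfixedargs;
  const bool is_fp = kind == ARG_FLOAT || kind == ARG_DOUBLE;
  return (i < fp_slot_max && named && is_fp) ? IN_FPR : IN_GPR;
}
```

In this model a double passed through the ellipsis is classified as IN_GPR,
while the same double passed as a named parameter is classified as IN_FPR.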


-- 
Eric Botcazou
Index: src/sparc/ffitarget.h
===
--- src/sparc/ffitarget.h	(revision 207685)
+++ src/sparc/ffitarget.h	(working copy)
@@ -58,16 +58,17 @@ typedef enum ffi_abi {
 } ffi_abi;
 #endif
 
+#define FFI_TARGET_SPECIFIC_VARIADIC 1
+#define FFI_EXTRA_CIF_FIELDS unsigned int nfixedargs
+
 /*  Definitions for closures - */
 
 #define FFI_CLOSURES 1
-#define FFI_NATIVE_RAW_API 0
-
 #ifdef SPARC64
 #define FFI_TRAMPOLINE_SIZE 24
 #else
 #define FFI_TRAMPOLINE_SIZE 16
 #endif
+#define FFI_NATIVE_RAW_API 0
 
 #endif
-
Index: src/sparc/ffi.c
===
--- src/sparc/ffi.c	(revision 207685)
+++ src/sparc/ffi.c	(working copy)
@@ -249,7 +249,7 @@ int ffi_prep_args_v9(char *stack, extend
 }
 
 /* Perform machine dependent cif processing */
-ffi_status ffi_prep_cif_machdep(ffi_cif *cif)
+static ffi_status ffi_prep_cif_machdep_core(ffi_cif *cif)
 {
   int wordsize;
 
@@ -334,6 +334,19 @@ ffi_status ffi_prep_cif_machdep(ffi_cif
   return FFI_OK;
 }
 
+ffi_status ffi_prep_cif_machdep(ffi_cif *cif)
+{
+  cif->nfixedargs = cif->nargs;
+  return ffi_prep_cif_machdep_core (cif);
+}
+
+ffi_status ffi_prep_cif_machdep_var(ffi_cif *cif, unsigned int nfixedargs,
+				    unsigned int ntotalargs)
+{
+  cif->nfixedargs = nfixedargs;
+  return ffi_prep_cif_machdep_core (cif);
+}
+
 int ffi_v9_layout_struct(ffi_type *arg, int off, char *ret, char *intg, char *flt)
 {
   ffi_type **ptr = &arg->elements[0];
@@ -604,8 +617,7 @@ ffi_closure_sparc_inner_v9(ffi_closure *
 
   /* Copy the caller's structure return address so that the closure
  returns the data directly to the caller.  */
-  if (cif->flags == FFI_TYPE_VOID
-      && cif->rtype->type == FFI_TYPE_STRUCT)
+  if (cif->flags == FFI_TYPE_VOID && cif->rtype->type == FFI_TYPE_STRUCT)
 {
   rvalue = (void *) gpr[0];
   /* Skip the structure return address.  */
@@ -619,6 +631,10 @@ ffi_closure_sparc_inner_v9(ffi_closure *
   /* Grab the addresses of the arguments from the stack frame.  */
   for (i = 0; i < cif->nargs; i++)
 {
+  /* If the function is variadic, FP arguments are passed in FP
+	 registers only if the corresponding parameter is named.  */
+  const int named = (i < cif->nfixedargs);
+
   if (arg_types[i]->type == FFI_TYPE_STRUCT)
 	{
 	  if (arg_types[i]->size > 16)
@@ -633,7 +649,9 @@ ffi_closure_sparc_inner_v9(ffi_closure *
    0,
    (char *) gpr[argn],
    (char *) gpr[argn],
-   (char *) fpr[argn]);
+   named
+   ? (char *) fpr[argn]
+   : (char *) gpr[argn]);
 	  avalue[i] = gpr[argn];
 	  argn += ALIGN(arg_types[i]->size, FFI_SIZEOF_ARG) / FFI_SIZEOF_ARG;
 	}
@@ -649,6 +667,7 @@ ffi_closure_sparc_inner_v9(ffi_closure *
 	argn++;
 #endif
 	  if (i < fp_slot_max
+	      && named
 	      && (arg_types[i]->type == FFI_TYPE_FLOAT
 		  || arg_types[i]->type == FFI_TYPE_DOUBLE
 #if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
@@ -662,7 +681,7 @@ ffi_closure_sparc_inner_v9(ffi_closure *
 }
 
   /* Invoke the closure.  */
-  (closure->fun) (cif, rvalue, avalue, closure->user_data);
+  closure->fun (cif, rvalue, avalue, closure->user_data);
 
   /* Tell ffi_closure_sparc how to perform return type promotions.  */
   return cif->rtype->type;
Index: doc/libffi.texi
===
--- doc/libffi.texi	(revision 207685)
+++ doc/libffi.texi	(working copy)
@@ -63,14 +63,14 @@ section entitled ``GNU General Public Li
 @node Introduction
 @chapter What is libffi?
 
-Compilers for high level languages generate code that follow certain
+Compilers for high-level languages generate code that follow certain
 conventions.  These conventions are necessary, in part, for separate
 compilation to work.  One such convention is the @dfn{calling
 convention}.  The calling convention is a set of assumptions made by
 the compiler about where function arguments will be found on entry to
 a function.  A calling convention also specifies where the return
-value for a function is found.  The calling convention is also

Re: [PATCH] Fix PR 58960

2014-02-13 Thread Vladimir Makarov
On 01/30/2014 12:42 AM, Andrey Belevantsev wrote:
 Hello,

 As detailed in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58960#c6,
 we fail to use the DF liveness info in the register pressure sensitive
 scheduling for the new blocks as we do not properly compute it in this
 case.  The patch fixes this by not using sched-pressure for
 the new regions, as currently these are only ia64 recovery blocks, which
 are supposed to be cold.  If other kinds of new blocks appear, this may
 be reconsidered.  The other options for computing the DF info sketched
 at the above link do not seem feasible at this stage.

 Bootstrapped and tested on ia64, also tested by Andreas Schwab on ia64
 (see PR log).  OK for trunk?



The patch is ok.  Andrey, thanks for working on the PR and sorry for the
delay with the approval.


 2013-01-30  Andrey Belevantsev  a...@ispras.ru

 PR rtl-optimization/58960

 * haifa-sched.c (alloc_global_sched_pressure_data): New, factored
 out from ...
 (sched_init) ... here.
 (free_global_sched_pressure_data): New, factored out from ...
 (sched_finish): ... here.
 * sched-int.h (free_global_sched_pressure_data): Declare.
 * sched-rgn.c (nr_regions_initial): New static global.
 (haifa_find_rgns): Initialize it.
 (schedule_region): Disable sched-pressure for the newly generated
 regions.



[PATCH, i386]: Fix xop_vmfrcz<mode>2 expander

2014-02-13 Thread Uros Bizjak
Hello!

No functional changes.

2014-02-13  Uros Bizjak  ubiz...@gmail.com

* config/i386/sse.md (xop_vmfrcz<mode>2): Generate const0 in
operands[2], not operands[3].

Tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 207762)
+++ config/i386/sse.md  (working copy)
@@ -13618,10 +13618,10 @@
  (unspec:VF_128
   [(match_operand:VF_128 1 "nonimmediate_operand")]
   UNSPEC_FRCZ)
- (match_dup 3)
+ (match_dup 2)
  (const_int 1)))]
   "TARGET_XOP"
-  "operands[3] = CONST0_RTX (<MODE>mode);")
+  "operands[2] = CONST0_RTX (<MODE>mode);")
 
 (define_insn "*xop_vmfrcz<mode>2"
   [(set (match_operand:VF_128 0 "register_operand" "=x")


[jit] Require function names to be valid C identifiers for now

2014-02-13 Thread David Malcolm
(Looks like the comma in the Subject stopped this getting through;
resending with suitably edited Subject)

Committed to branch dmalcolm/jit:

gcc/jit/
* libgccjit.c (IS_ASCII_ALPHA): New macro.
(IS_ASCII_DIGIT): New macro.
(IS_ASCII_ALNUM): New macro.
(gcc_jit_context_new_function): Require that function names be valid
C identifiers for now, to avoid later problems in the assembler.
---
 gcc/jit/libgccjit.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 3c2d962..bca60bd 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -8,6 +8,19 @@
 #include "libgccjit.h"
 #include "internal-api.h"
 
+#define IS_ASCII_ALPHA(CHAR) \
+  (					\
+    ((CHAR) >= 'a' && (CHAR) <='z')	\
+    ||					\
+    ((CHAR) >= 'A' && (CHAR) <= 'Z')	\
+  )
+
+#define IS_ASCII_DIGIT(CHAR) \
+  ((CHAR) >= '0' && (CHAR) <='9')
+
+#define IS_ASCII_ALNUM(CHAR) \
+  (IS_ASCII_ALPHA (CHAR) || IS_ASCII_DIGIT (CHAR))
+
 struct gcc_jit_context : public gcc::jit::recording::context
 {
   gcc_jit_context (gcc_jit_context *parent_ctxt) :
@@ -395,6 +408,27 @@ gcc_jit_context_new_function (gcc_jit_context *ctxt,
   RETURN_NULL_IF_FAIL (ctxt, NULL, "NULL context");
   RETURN_NULL_IF_FAIL (return_type, ctxt, "NULL return_type");
   RETURN_NULL_IF_FAIL (name, ctxt, "NULL name");
+  /* The assembler can only handle certain names, so for now, enforce
+     C's rules for identifiers upon the name.
+     Eventually we'll need some way to interact with e.g. C++ name mangling.  */
+  {
+/* Leading char: */
+char ch = *name;
+RETURN_NULL_IF_FAIL_PRINTF2 (
+   IS_ASCII_ALPHA (ch) || ch == '_',
+   ctxt,
+   "name \"%s\" contains invalid character: '%c'",
+   name, ch);
+/* Subsequent chars: */
+for (const char *ptr = name + 1; (ch = *ptr); ptr++)
+  {
+   RETURN_NULL_IF_FAIL_PRINTF2 (
+ IS_ASCII_ALNUM (ch) || ch == '_',
+ ctxt,
+ "name \"%s\" contains invalid character: '%c'",
+ name, ch);
+  }
+  }
   RETURN_NULL_IF_FAIL ((num_params == 0) || params, ctxt, NULL params);
   for (int i = 0; i < num_params; i++)
 if (!params[i])
-- 
1.7.11.7
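
For illustration only (this is not the committed libgccjit code): the check in the patch above can be sketched as a standalone function. The helper names below are hypothetical, and the sketch assumes plain ASCII input, as the IS_ASCII_* macros do.

```cpp
#include <cassert>

// Hypothetical helpers mirroring the IS_ASCII_* macros from the patch.
static bool is_ascii_alpha(char ch)
{
  return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}

static bool is_ascii_digit(char ch)
{
  return ch >= '0' && ch <= '9';
}

// Returns true if NAME follows C's identifier rules (ASCII only):
// a letter or underscore first, then letters, digits or underscores.
bool is_valid_c_identifier(const char *name)
{
  assert(name != nullptr);

  // Leading char.  An empty string fails here, because '\0' is
  // neither a letter nor an underscore.
  if (!(is_ascii_alpha(*name) || *name == '_'))
    return false;

  // Subsequent chars.
  for (const char *ptr = name + 1; *ptr; ptr++)
    if (!(is_ascii_alpha(*ptr) || is_ascii_digit(*ptr) || *ptr == '_'))
      return false;

  return true;
}
```

Under these assumptions, is_valid_c_identifier ("foo_1") holds, while "1foo", "a-b" and the empty string are rejected.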



Re: [PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.

2014-02-13 Thread Uros Bizjak
On Thu, Feb 13, 2014 at 1:55 PM, Uros Bizjak ubiz...@gmail.com wrote:

 I've noticed that the _mm512_permutexvar_epi[64|32] intrinsics
 have the wrong argument order. As per [1], the first argument is the index.
 The vpermps/vpermpd intrinsics are fine, but I've changed the tests
 to call CALC with the same argument order as the intrinsic. There is the
 same problem (wrong argument order) with vrcp14s[d|s].
 Also, the avx512er-vrcp28ss-2.c test called the wrong intrinsic.

 [1]  http://software.intel.com/sites/landingpage/IntrinsicsGuide/

 gcc/
 * config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap
 arguments order in builtin.
 (_mm512_permutexvar_epi64): Ditto.
 (_mm512_mask_permutexvar_epi64): Ditto
 (_mm512_maskz_permutexvar_epi32): Ditto
 (_mm512_permutexvar_epi32): Ditto
 (_mm512_mask_permutexvar_epi32): Ditto
 * config/i386/sse.md (srcp14<mode>): Swap operands.

 gcc/testsuite/
 * gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
 * gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
 * gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
 * gcc.target/i386/avx512f-vpermps-2.c: Ditto.
 * gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
 * gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
 * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

 diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
 index a04b289..d3b2dc5 100644
 --- a/gcc/config/i386/sse.md
 +++ b/gcc/config/i386/sse.md
 @@ -1456,12 +1456,12 @@
[(set (match_operand:VF_128 0 "register_operand" "=v")
 (vec_merge:VF_128
   (unspec:VF_128
 -   [(match_operand:VF_128 1 "nonimmediate_operand" "vm")]
 +   [(match_operand:VF_128 2 "nonimmediate_operand" "vm")]
 UNSPEC_RCP14)
 - (match_operand:VF_128 2 "register_operand" "v")
 + (match_operand:VF_128 1 "register_operand" "v")
   (const_int 1)))]
"TARGET_AVX512F"
 -  "vrcp14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}"
 +  "vrcp14<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"

 Please don't change srcp pattern, it should be defined similar to
 vrcpss (aka sse_vmrcpv4sf). You need to switch operand order
 elsewhere.

 No, you are correct. Operands should be swapped as in your patch.

Eh, sorry, but after some more thinking I have to reverse this decision again.

The srcp pattern should remain as is, and you should swap operands in
avx512fintrin.h instead:

--cut here--
Index: avx512fintrin.h
===
--- avx512fintrin.h (revision 207762)
+++ avx512fintrin.h (working copy)
@@ -1470,8 +1470,8 @@
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_rcp14_sd (__m128d __A, __m128d __B)
 {
-  return (__m128d) __builtin_ia32_rcp14sd ((__v2df) __A,
-  (__v2df) __B);
+  return (__m128d) __builtin_ia32_rcp14sd ((__v2df) __B,
+  (__v2df) __A);
 }

 extern __inline __m128
@@ -1478,8 +1478,8 @@
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_rcp14_ss (__m128 __A, __m128 __B)
 {
-  return (__m128) __builtin_ia32_rcp14ss ((__v4sf) __A,
- (__v4sf) __B);
+  return (__m128) __builtin_ia32_rcp14ss ((__v4sf) __B,
+ (__v4sf) __A);
 }

 extern __inline __m512d
--cut here--

The vec_merge RSQRT and RCP patterns are unops of type sse. To determine
the memory attribute correctly, sse types look at operand 1 only, which
is why the pattern is defined this way.

There is a similar problem with the vec_merge rcp28 and rsqrt28 patterns.
Operands 1 and 2 are swapped in the mnemonic, since only the last
operand allows memory:

Index: sse.md
===
--- sse.md  (revision 207764)
+++ sse.md  (working copy)
@@ -12825,7 +12825,7 @@
  (match_operand:VF_128 2 "register_operand" "v")
  (const_int 1)))]
   "TARGET_AVX512ER"
-  "vrcp28<ssescalarmodesuffix>\t{<round_saeonly_op3>%2, %1, %0|%0, %1, %2<round_saeonly_op3>}"
+  "vrcp28<ssescalarmodesuffix>\t{<round_saeonly_op3>%1, %2, %0|%0, %2, %1<round_saeonly_op3>}"
   [(set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<MODE>")])
@@ -12849,7 +12849,7 @@
  (match_operand:VF_128 2 "register_operand" "v")
  (const_int 1)))]
   "TARGET_AVX512ER"
-  "vrsqrt28<ssescalarmodesuffix>\t{<round_saeonly_op3>%2, %1, %0|%0, %1, %2<round_saeonly_op3>}"
+  "vrsqrt28<ssescalarmodesuffix>\t{<round_saeonly_op3>%1, %2, %0|%0, %2, %1<round_saeonly_op3>}"
   [(set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<MODE>")])

Intrinsics should swap their operands accordingly.

Uros.


Re: [patch c++]: Fix pr/58835 [4.7/4.8/4.9 Regression] ICE with __PRETTY_FUNCTION__ in broken function

2014-02-13 Thread Kai Tietz
Ping

- Original Message -
 Hi,
 
 the following patch adds missing handling of error_mark_node result of
 fname_decl within finish_fname.
 
 ChangeLog
 
 2014-02-11  Kai Tietz  kti...@redhat.com
 
 PR c++/58835
 * semantics.c (finish_fname): Handle error_mark_node.
 
 Regression tested for x86_64-unknown-linux-gnu, i686-w64-mingw32.  Ok for
 apply?
 
 Regards,
 Kai
 
 Index: semantics.c
 ===
 --- semantics.c (Revision 207686)
 +++ semantics.c (Arbeitskopie)
 @@ -2630,7 +2630,8 @@ finish_fname (tree id)
tree decl;
 
decl = fname_decl (input_location, C_RID_CODE (id), id);
 -  if (processing_template_decl && current_function_decl)
 +  if (processing_template_decl && current_function_decl
 +      && decl != error_mark_node)
  decl = DECL_NAME (decl);
return decl;
  }
 


Re: [PATCH][AArch64] vrnd*_f64 patch for stage-1

2014-02-13 Thread Richard Henderson
On 02/13/2014 03:17 AM, Alex Velenko wrote:
 +/* Sets rmode field of FPCR control register to
 +   FPROUNDING_ZERO.  */

Comment is wrong, or at least misleading.

 +void __inline __attribute__ ((__always_inline__))
 +set_rounding_mode (uint32_t mode)
 +{
 +  uint32_t r;
 +
 +  /* Read current FPCR.  */
 +  asm volatile ("mrs %[r], fpcr" : [r] "=r" (r) : :);
 +
 +  /* Clear rmode.  */
 +  r &= 3 << RMODE_START;

  ~(3 << RMODE_START)

 +  /* Calculate desired FPCR.  */
 +  r |= mode << RMODE_START;
 +
 +  /* Write desired FPCR back.  */
 +  asm volatile ("msr fpcr, %[r]" : : [r] "r" (r) :);
 +}

Fortunately for this testcase, you do always use FPROUNDING_ZERO == 3 when
calling this function, so the bugs are hidden.


r~
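
As a sketch of the correction suggested above, here is the field update on a plain integer, without the mrs/msr instructions. It assumes the two-bit rounding-mode field starts at bit 22 of FPCR; the function name is hypothetical and not part of the patch under review.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for illustration: RMode occupies bits [23:22] of FPCR.
constexpr unsigned RMODE_START = 22;

// Return FPCR with its rounding-mode field replaced by MODE (0..3).
uint32_t with_rounding_mode(uint32_t fpcr, uint32_t mode)
{
  assert(mode <= 3);

  // Clear only the rmode field; every other bit is preserved.
  fpcr &= ~(uint32_t{3} << RMODE_START);

  // Insert the desired mode.
  fpcr |= mode << RMODE_START;
  return fpcr;
}
```

Without the complement, the AND keeps the old rmode bits and clears everything else. OR-ing in mode 3 then sets both field bits regardless, which is why the testcase does not notice.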


Re: [PATCH] Handle more COMDAT profiling issues

2014-02-13 Thread Mike Stump
On Feb 13, 2014, at 8:41 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Feb 12, 2014 at 2:03 PM, Xinliang David Li davi...@google.com wrote:

[ extra lines deleted ]

 Should non comdat function be skipped?
 
 We warn in drop_profile if this is not a COMDAT, as we should only
 have this case and reach the call in that case. (See the check in
 drop_profile and the comments at the top of handle_missing_profile for
 more info)

[ more extra lines deleted ]

Can we edit out the extra lines when they get this large?  Not doing that is 
actually worse than top-posting.

Re: [PATCH] Handle more COMDAT profiling issues

2014-02-13 Thread Xinliang David Li
On Thu, Feb 13, 2014 at 9:48 AM, Mike Stump mikest...@comcast.net wrote:
 On Feb 13, 2014, at 8:41 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, Feb 12, 2014 at 2:03 PM, Xinliang David Li davi...@google.com 
 wrote:

 [ extra lines deleted ]

 Should non comdat function be skipped?

 We warn in drop_profile if this is not a COMDAT, as we should only
 have this case and reach the call in that case. (See the check in
 drop_profile and the comments at the top of handle_missing_profile for
 more info)

 [ more extra lines deleted ]

 Can we edit out the extra lines when they get this large?  Not doing that is 
 actually worse than top-posting.

Right -- gmail users probably won't notice the problem as extra lines
are hidden for you.

David


Re: [PATCH][RFC][libatomic] Override -mcpu option for arm linux ifunc targets

2014-02-13 Thread Kyrill Tkachov

Ping?
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00069.html

On 03/02/14 11:50, Kyrill Tkachov wrote:

Hi all,

There is a slight issue with the libatomic Makefile for arm linux ifunc targets.
It adds an explicit -march=armv7-a option to the command line to enable
building the exclusive instruction variants in libatomic. However, if the
multilib machinery tries to add an -mcpu option that conflicts with the -march
one (such as -mcpu=cortex-a15) gcc will give a warning about incompatible -march
and -mcpu options, causing the -Werror build to fail.

A workaround here is to override the -mcpu option as well as the -march one.
This patch does that by adding an EXTRA_OVERRIDE variable and setting it to
-mcpu=cortex-a9 under the same conditions as when -march=armv7-a is selected, so
that it's added only when -march=armv7-a is added.

Can someone see a better way of achieving this?

If this is acceptable, ok to commit?

Build and test arm-none-linux-gnueabi with --enable-gnu-indirect-function
Bootstrap on x86 with --enable-gnu-indirect-function

Thanks,
Kyrill

2014-02-03  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * Makefile.in: Override -mcpu option when building arm
  linux ifunc targets.





[jit] New API entrypoint: gcc_jit_context_get_builtin_function

2014-02-13 Thread David Malcolm
Committed to branch dmalcolm/jit:

This commit adds the ability for client code to look up GCC builtins by
name, potentially allowing GCC to optimize the resulting function usage
based on what it knows about the behavior of each builtin.

Note that if the optimizer can't eliminate the call, the generated caller
code will still require machine code for the callee, and thus may need
the DSO implementing the builtin to already be linked into the client
process, or you'll get a linker error - so perhaps builtin is a bad
name?

Implementing this required creating function types (to handle
builtin-types.def), which are used internally by the new builtins_manager.
They're not yet exposed to client code.

gcc/jit/
* libgccjit.h (gcc_jit_context_get_builtin_function): New.
* libgccjit.map (gcc_jit_context_get_builtin_function): New.
* libgccjit++.h (gccjit::context::get_builtin_function): New method.

* Make-lang.in (jit_OBJS): Add jit/jit-builtins.o
* jit-builtins.c: New source file, for managing builtin functions
and their types.
* jit-builtins.h: Likewise.

* libgccjit.c (gcc_jit_context_new_function): Pass BUILT_IN_NONE for
the new argument of new_function
(gcc_jit_context_get_builtin_function): New.

* internal-api.h: Add idempotency guards.
(gcc::jit::recording::context::new_function): Add parameter
for builtin functions.
(gcc::jit::recording::context::get_builtin_function): New method.
(gcc::jit::recording::context::m_builtins_manager): New field.
(gcc::jit::recording::type::as_a_function_type): New virtual function.
(gcc::jit::recording::function_type): New subclass of type.
(gcc::jit::recording::function::function): Add parameter for
builtin functions.
(gcc::jit::recording::function::m_builtin_id): New field.
(gcc::jit::recording::function::new_function_type): New method.
(gcc::jit::playback::function::function):  Add parameter for
builtin functions.
* internal-api.c (gcc::jit::recording::context::context):
NULL-initialize new field m_builtins_manager.
(gcc::jit::recording::context::~context): Clean up the builtins
manager, if one has been created.
(gcc::jit::recording::context::new_function): Add parameter
(gcc::jit::recording::context::get_builtin_function): New method.
(gcc::jit::recording::function_type::function_type): Implement
constructor for new subclass.
(gcc::jit::recording::function_type::dereference): Implement
method for new subclass.
(gcc::jit::recording::function_type::replay_into): Likewise.
(gcc::jit::recording::function_type::make_debug_string): Likewise.
(gcc::jit::recording::function::function): Add parameter for
builtin functions.
(gcc::jit::recording::function::replay_into): Likewise for
creation of playback object.
(gcc::jit::recording::function::new_function_type): New method.
(gcc::jit::playback::function::new_function):  Add parameter for
builtin functions, using it to set up the fndecl accordingly.

gcc/testsuite/
* jit.dg/harness.h (CHECK_DOUBLE_VALUE): New macro.
(CHECK): New macro.
* jit.dg/test-functions.c: New testcase, exercising
gcc_jit_context_get_builtin_function.
* jit.dg/test-combination.c: Add test-functions.c to the combined
test.
---
 gcc/jit/ChangeLog.jit   |  48 
 gcc/jit/Make-lang.in|   3 +-
 gcc/jit/internal-api.c  | 157 -
 gcc/jit/internal-api.h  |  55 -
 gcc/jit/jit-builtins.c  | 395 
 gcc/jit/jit-builtins.h  | 114 +
 gcc/jit/libgccjit++.h   |   9 +
 gcc/jit/libgccjit.c |  13 +-
 gcc/jit/libgccjit.h |   4 +
 gcc/jit/libgccjit.map   |   1 +
 gcc/testsuite/ChangeLog.jit |   9 +
 gcc/testsuite/jit.dg/harness.h  |  29 +++
 gcc/testsuite/jit.dg/test-combination.c |   9 +
 gcc/testsuite/jit.dg/test-functions.c   | 175 ++
 14 files changed, 1009 insertions(+), 12 deletions(-)
 create mode 100644 gcc/jit/jit-builtins.c
 create mode 100644 gcc/jit/jit-builtins.h
 create mode 100644 gcc/testsuite/jit.dg/test-functions.c

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index adccd57..603dd96 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,51 @@
+2014-02-13  David Malcolm  dmalc...@redhat.com
+
+   * libgccjit.h (gcc_jit_context_get_builtin_function): New.
+   * libgccjit.map (gcc_jit_context_get_builtin_function): New.
+   * libgccjit++.h (gccjit::context::get_builtin_function): New method.
+
+   * Make-lang.in (jit_OBJS): Add jit/jit-builtins.o
+   * jit-builtins.c: New source 

Re: [PATCH][RFC][libatomic] Override -mcpu option for arm linux ifunc targets

2014-02-13 Thread Richard Henderson
On 02/03/2014 03:50 AM, Kyrill Tkachov wrote:
 +# For ARM, the -march option by itself conflicts with any -mcpu option that
 +# we might end up passing to the build, causing an error.
 +# Therefore we override the -mcpu option as well.
 +# This shouldn't affect tuning much because the affected code is mostly
 +# in inline assembly anyway.
  @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a -DHAVE_KERNEL64
 +@ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@EXTRA_OVERRIDE = -mcpu=cortex-a9

Why would you want to split these across two different variables?  It's easier
to just add the -march and -mcpu to the same IFUNC_OPTIONS variable.

Why the choice of cortex-a9, as opposed to any of the other v7-a
possibilities?  If we're going to force anything, perhaps generic-armv7-a is
more appropriate?


r~


Re: [testsuite] Don't xfail gcc.dg/binop-xor1.c

2014-02-13 Thread Hans-Peter Nilsson
On Thu, 13 Feb 2014, Richard Sandiford wrote:
 Richard Sandiford rsand...@linux.vnet.ibm.com writes:
  Hans-Peter Nilsson h...@bitrange.com writes:
  On Tue, 4 Feb 2014, Rainer Orth wrote:
  AFAICT the gcc.dg/binop-xor1.c test is XPASSing everywhere since about
  20131114:
 
  Bah, missing analysis. Everywhere does not include cris-elf,
  powerpc64-unknown-linux-gnu, m68k-unknown-linux-gnu,
  s390x-ibm-linux-gnu, powerpc-ibm-aix7.1.0.0.
 
  Based on this list I'm guessing it's another BRANCH_COST==1

 BRANCH_COST==1 || !LOGICAL_OP_NON_SHORT_CIRCUIT

Thanks.  Ouch, not again!  Anyone with an idea for an
effective-target identification function?

I'd like to avoid an explicit target list but if that's what it
takes, better collect the target list in a
check_effective_target_branch_cost1 and/or
check_effective_target_logical_op_short_circuit - and yes, the
test function should use the positive sense (it should not use a
negative sense for reasons QED. :)

brgds, H-P


Re: [PATCH] [libgomp] make it possible to use OMP on both sides of a fork

2014-02-13 Thread Richard Henderson
 +/* This is to enable best-effort cleanup after fork.  */
 +static int gomp_we_are_forked = 0;

bool, no explicit initialization, possible removal, see below.

 +static void
 +gomp_free_thread_pool (int threads_running)

bool for threads_running.  It looks like a count otherwise.

 +gomp_after_fork_callback ()

 (void)

 +  pthread_atfork (NULL, NULL, gomp_after_fork_callback);

 not needed.

Any reason not to just run gomp_free_thread_pool from gomp_after_fork_callback
directly?  I see no restrictions on what kind of code is allowed to execute
during that callback.


r~


Re: [PATCH] Documentation for dump and optinfo output

2014-02-13 Thread Sharad Singhai
Committed as r207767.

Indeed, I had an older version of makeinfo. Once I updated to the
latest version 5.2, I saw the warnings. Those are fixed by this patch.

Thanks,
Sharad

 On Tue, Feb 11, 2014 at 11:42 PM, Thomas Schwinge tho...@codesourcery.com
 wrote:

 Hi!

 On Wed, 5 Feb 2014 16:33:19 -0800, Sharad Singhai sing...@google.com
 wrote:
  I am really sorry about the delay.

 No worries; not exactly a severe issue.  ;-)

  I couldn't exactly reproduce the
  warning which you described

 Maybe the version of makeinfo is relevant?

 $ makeinfo --version | head -n 1
 makeinfo (GNU texinfo) 5.1

  but I found a place where two nodes were
  out of order in optinfo.texi. Could you please apply the following
  patch and see if the problem goes away? If it works for you, I will
  commit the doc fixes.
 
  Also I would appreciate the exact command which produces these warnings.

 Your patch does fix the problem; see the following diff of the build log,
 where the warnings are now gone, and which also happens to contain the
 makeinfo command line.

 @@ -4199,12 +4199,6 @@ if [ xinfo = xinfo ]; then \
 makeinfo --split-size=500 --split-size=500
 --no-split -I . -I ../../source/gcc/doc \
 -I ../../source/gcc/doc/include -o doc/gccint.info
 ../../source/gcc/doc/gccint.texi; \
 fi
 -../../source/gcc/doc/optinfo.texi:45: warning: node next `Optimization
 groups' in menu `Dump output verbosity' and in sectioning `Dump files and
 streams' differ
 -../../source/gcc/doc/optinfo.texi:77: warning: node next `Dump files and
 streams' in menu `Dump types' and in sectioning `Dump output verbosity'
 differ
 -../../source/gcc/doc/optinfo.texi:77: warning: node prev `Dump files and
 streams' in menu `Dump output verbosity' and in sectioning `Optimization
 groups' differ
 -../../source/gcc/doc/optinfo.texi:104: warning: node next `Dump output
 verbosity' in menu `Dump files and streams' and in sectioning `Dump types'
 differ
 -../../source/gcc/doc/optinfo.texi:104: warning: node prev `Dump output
 verbosity' in menu `Optimization groups' and in sectioning `Dump files and
 streams' differ
 -../../source/gcc/doc/optinfo.texi:137: warning: node prev `Dump types' in
 menu `Dump files and streams' and in sectioning `Dump output verbosity'
 differ
  if [ xinfo = xinfo ]; then \
 makeinfo --split-size=500 --split-size=500
 --no-split -I ../../source/gcc/doc \
 -I ../../source/gcc/doc/include -o
 doc/gccinstall.info ../../source/gcc/doc/install.texi; \

  * doc/optinfo.texi: Fix order of nodes.

 Thanks, please commit.


 Grüße,
  Thomas




[PATCH, testsuite] Fix profile test failures

2014-02-13 Thread Steve Ellcey
While testing the C++ profiling tests in g++.dg/bprob and using the
qemu simulator we discovered that these tests were passing when we ran
the testsuite with no extra options but that if we specified some options
on the testsuite run then the tests would fail with this message in the
c++.log file:

rsh: Could not resolve hostname multi-sim/-EL: Name or service not known

After some poking around I found that profopt-execute in lib/profopt.exp
was using remote_file and remote_upload with 'target' where I believe it 
should be using 'host'.  No other *.exp file uses 'target' on their
remote_file or remote_upload calls, they either use 'build' or 'host'.

So while it seems weird that 'host' is the proper replacement for 'target'
as the machine where the executable is run, this seems to be the right fix
and it does give me a clean run now with or without extra arguments on
the test run.

OK for checkin?

Steve Ellcey
sell...@mips.com


2014-02-13  Steve Ellcey sell...@mips.com

* lib/profopt.exp (profopt-execute): Use host instead of target
in remote_file and remote_upload calls.


diff --git a/gcc/testsuite/lib/profopt.exp b/gcc/testsuite/lib/profopt.exp
index e0d849e..e045b53 100644
--- a/gcc/testsuite/lib/profopt.exp
+++ b/gcc/testsuite/lib/profopt.exp
@@ -264,7 +264,7 @@ proc profopt-execute { src } {
 
# Remove old profiling and performance data files.
foreach ext $prof_ext {
-   remote_file target delete $tmpdir/$base.$ext
+   remote_file host delete $tmpdir/$base.$ext
}
if [info exists perf_ext] {
profopt-cleanup $testcase $perf_ext
@@ -312,7 +312,7 @@ proc profopt-execute { src } {
# Make sure the profile data was generated, and fail if not.
if { $status == "pass" } {
foreach ext $prof_ext {
-   remote_upload target $tmpdir/$base.$ext
+   remote_upload host $tmpdir/$base.$ext
set files [glob -nocomplain $base.$ext]
if { $files == "" } {
set status fail
@@ -368,7 +368,7 @@ proc profopt-execute { src } {
 
# Remove the profiling data files.
foreach ext $prof_ext {
-   remote_file target delete $tmpdir/$base.$ext
+   remote_file host delete $tmpdir/$base.$ext
}
 
if { $status != "pass" } {



Re: std::regex_replace behaviour (LWG DR 2213)

2014-02-13 Thread Tim Shen
On Thu, Feb 13, 2014 at 1:13 PM, Jonathan Wakely jwakely@gmail.com wrote:
 The LWG have decided that
 http://cplusplus.github.io/LWG/lwg-active.html#2213 is a defect.

 In our std::regex_replace we do not appear to update out in all places
 that we should.

1) Yes, the current implementation is buggy for not updating __out
after calling std::copy;
2) I'd rather say the standard is misleading but well intended (return
the new out iterator) rather than ill intended (return the original
out iterator). It would be more troublesome if match_results::format()
did not return the new out iterator, which its caller regex_replace()
needs. Boost and libc++ also return the new iterator.

So my suggestion is simply to follow the LWG proposal, as well as Boost
and libc++.
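
To illustrate point 1 outside the library, here is a minimal sketch with a hypothetical helper (not code from regex.tcc). std::copy returns the iterator one past the last element written, so with a raw pointer as output the result must be assigned back, otherwise the next write starts at the old position.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: writes A and then B to OUT and returns the
// position one past the last character written.
char *append_two(const std::string &a, const std::string &b, char *out)
{
  assert(out != nullptr);

  // Keep the advanced iterator returned by std::copy ...
  out = std::copy(a.begin(), a.end(), out);
  // ... so that B lands after A instead of overwriting it.
  out = std::copy(b.begin(), b.end(), out);
  return out;
}
```

If the first result were discarded, B would be written over A from the start of the buffer, and the returned position would not reflect A at all.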

Here's the patch tested with -m32 and -m64 respectively.

Thanks!


-- 
Regards,
Tim Shen
commit 3f8621b5f7ced00e21e7038f1e9737eea1bb4251
Author: tim timshe...@gmail.com
Date:   Thu Feb 13 17:23:48 2014 -0500

2014-02-13  Tim Shen  timshe...@gmail.com

* include/bits/regex.tcc (match_results::format,
regex_replace): Update __out after calling std::copy.
* testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc:
Add testcase.
* testsuite/28_regex/match_results/format.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/regex.tcc b/libstdc++-v3/include/bits/regex.tcc
index 73f55df..5fa1f01 100644
--- a/libstdc++-v3/include/bits/regex.tcc
+++ b/libstdc++-v3/include/bits/regex.tcc
@@ -425,7 +425,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
  auto __sub = _Base_type::operator[](__idx);
  if (__sub.matched)
-   std::copy(__sub.first, __sub.second, __out);
+   __out = std::copy(__sub.first, __sub.second, __out);
};
 
   if (__flags & regex_constants::format_sed)
@@ -455,7 +455,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if (__next == __fmt_last)
break;
 
- std::copy(__fmt_first, __next, __out);
+ __out = std::copy(__fmt_first, __next, __out);
 
  auto __eat = [&](char __ch) -> bool
{
@@ -493,7 +493,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*__out++ = '$';
  __fmt_first = __next;
}
- std::copy(__fmt_first, __fmt_last, __out);
+ __out = std::copy(__fmt_first, __fmt_last, __out);
}
   return __out;
 }
@@ -512,7 +512,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   if (__i == __end)
{
  if (!(__flags & regex_constants::format_no_copy))
-   std::copy(__first, __last, __out);
+   __out = std::copy(__first, __last, __out);
}
   else
{
@@ -521,14 +521,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  for (; __i != __end; ++__i)
{
  if (!(__flags & regex_constants::format_no_copy))
-   std::copy(__i->prefix().first, __i->prefix().second, __out);
+   __out = std::copy(__i->prefix().first, __i->prefix().second,
+ __out);
  __out = __i->format(__out, __fmt, __fmt + __len, __flags);
  __last = __i->suffix();
  if (__flags & regex_constants::format_first_only)
break;
}
  if (!(__flags & regex_constants::format_no_copy))
-   std::copy(__last.first, __last.second, __out);
+   __out = std::copy(__last.first, __last.second, __out);
}
   return __out;
 }
diff --git a/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc b/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc
index 28f78a0..38ef970 100644
--- a/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc
+++ b/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc
@@ -41,6 +41,14 @@ test01()
   VERIFY(regex_replace(string("This is a string"), regex("\\b\\w*\\b"), "|$0|",
   regex_constants::format_first_only)
 == "|This| is a string");
+
+  char buff[4096] = {0};
+  regex re("asdf");
+  string s = "asdf";
+  string res = "|asdf|asdf|";
+  regex_replace(buff, s.data(), s.data() + s.size(), re, "|&|\\0|",
+   regex_constants::format_sed);
+  VERIFY(res == buff);
 }
 
 int
diff --git a/libstdc++-v3/testsuite/28_regex/match_results/format.cc b/libstdc++-v3/testsuite/28_regex/match_results/format.cc
index 11e3bdb..097a0d7 100644
--- a/libstdc++-v3/testsuite/28_regex/match_results/format.cc
+++ b/libstdc++-v3/testsuite/28_regex/match_results/format.cc
@@ -43,6 +43,14 @@ test01()
   VERIFY(m.format(|\\3|\\4|\\2|\\1|\\,
  regex_constants::format_sed)
 == this is a string|a|string|is|this|\\);
+
+  regex re("asdf");
+  regex_match("asdf", m, re);
+  string fmt = "|&|\\0|";
+  char buff[4096] = {0};
+  m.format(buff, fmt.data(), fmt.data() + fmt.size(),
+  regex_constants::format_sed);
+  VERIFY(string(buff) == 

[PATCH] Fix a couple of tree-vect-loop.c issues

2014-02-13 Thread Jakub Jelinek
Hi!

While fixing a -O3 -g vectorizer ICE that only reproduced on the GCC-4.4-RH
branch, I noticed a couple of similar issues on the trunk.
The first hunk is just a cleanup; there is no point in setting use_stmt again
to the same thing it was set to before.

The second and third hunks ignore debug stmts; the other places in
tree-vect-loop.c that look for the exit phi already look like this.

The last hunk fixes GOMP_SIMD_LANE handling, and the testcase is
from the FAIL in redhat/gcc-4_4-branch.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-02-13  Jakub Jelinek  ja...@redhat.com

* tree-vect-loop.c (vect_is_slp_reduction): Don't set
use_stmt twice.
(get_initial_def_for_induction, vectorizable_induction): Ignore
debug stmts when looking for exit_phi.
(vectorizable_live_operation): Fix up condition.

* gcc.c-torture/compile/20140213.c: New test.

--- gcc/tree-vect-loop.c.jj 2014-02-05 15:28:10.0 +0100
+++ gcc/tree-vect-loop.c 2014-02-13 15:36:38.117741038 +0100
@@ -1968,10 +1968,8 @@ vect_is_slp_reduction (loop_vec_info loo
   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
 {
  gimple use_stmt = USE_STMT (use_p);
-  if (is_gimple_debug (use_stmt))
-continue;
-
- use_stmt = USE_STMT (use_p);
+ if (is_gimple_debug (use_stmt))
+   continue;
 
   /* Check if we got back to the reduction phi.  */
  if (use_stmt == phi)
@@ -3507,9 +3505,13 @@ get_initial_def_for_induction (gimple iv
   exit_phi = NULL;
   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, loop_arg)
 {
- if (!flow_bb_inside_loop_p (iv_loop, gimple_bb (USE_STMT (use_p))))
+ gimple use_stmt = USE_STMT (use_p);
+ if (is_gimple_debug (use_stmt))
+   continue;
+
+ if (!flow_bb_inside_loop_p (iv_loop, gimple_bb (use_stmt)))
{
- exit_phi = USE_STMT (use_p);
+ exit_phi = use_stmt;
  break;
}
 }
@@ -5413,10 +5415,13 @@ vectorizable_induction (gimple phi, gimp
   loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e);
   FOR_EACH_IMM_USE_FAST (use_p, imm_iter, loop_arg)
{
- if (!flow_bb_inside_loop_p (loop-inner,
- gimple_bb (USE_STMT (use_p
+ gimple use_stmt = USE_STMT (use_p);
+ if (is_gimple_debug (use_stmt))
+   continue;
+
+ if (!flow_bb_inside_loop_p (loop-inner, gimple_bb (use_stmt)))
{
- exit_phi = USE_STMT (use_p);
+ exit_phi = use_stmt;
  break;
}
}
@@ -5514,7 +5519,7 @@ vectorizable_live_operation (gimple stmt
{
  gimple use_stmt = USE_STMT (use_p);
  if (gimple_code (use_stmt) == GIMPLE_PHI
- || gimple_bb (use_stmt) == merge_bb)
+  gimple_bb (use_stmt) == merge_bb)
{
  if (vec_stmt)
{
--- gcc/testsuite/gcc.c-torture/compile/20140213.c.jj   2013-08-25 
18:20:55.717911035 +0200
+++ gcc/testsuite/gcc.c-torture/compile/20140213.c  2014-02-13 
16:23:45.631401820 +0100
@@ -0,0 +1,21 @@
+static unsigned short
+foo (unsigned char *x, int y)
+{
+  unsigned short r = 0;
+  int i;
+  for (i = 0; i  y; i++)
+r += x[i];
+  return r;
+}
+
+int baz (int, unsigned short);
+
+void
+bar (unsigned char *x, unsigned char *y)
+{
+  int i;
+  unsigned short key = foo (x, 0x1);
+  baz (0, 0);
+  for (i = 0; i  0x8; i++)
+y[i] = x[baz (i, key)];
+}

Jakub


Re: [PATCH, testsuite] Fix profile test failures

2014-02-13 Thread Joseph S. Myers
On Thu, 13 Feb 2014, Steve Ellcey wrote:

> While testing the C++ profiling tests in g++.dg/bprob and using the
> qemu simulator we discovered that these tests were passing when we ran
> the testsuite with no extra options but that if we specified some options
> on the testsuite run then the tests would fail with this message in the
> c++.log file:
>
> rsh: Could not resolve hostname multi-sim/-EL: Name or service not known

That means your board file is buggy.  If rsh is not the right way to 
access your target system, you need to implement the board file methods in 
some way other than rsh (possibly some operations should be no-ops, or do 
something directly on the build system, if you have a shared filesystem).

> So while it seems weird that 'host' is the proper replacement for 'target'
> as the machine where the executable is run, this seems to be the right fix

It's certainly not the proper replacement.  If a file is on the target, 
use target for deletion / manipulation; if it's on the host, use host for 
deletion / manipulation; on build, use build; in multiple places, run the 
deletion operation once per system with the file; to copy from target to 
the system (build) running DejaGnu, use remote_upload specifying target; 
to copy from host to build, use remote_upload specifying host; to copy 
from build to host or target, use remote_download specifying host or 
target as appropriate.
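
The copy rule above can be modelled in a few lines of C.  This is only a
sketch of the decision, not DejaGnu code: the enums, struct copy_plan,
plan_copy and copy_proc_name are hypothetical names, and the real procs
(remote_upload, remote_download) are Tcl procs called from .exp files.

```c
#include <assert.h>
#include <stddef.h>

/* The three systems involved in a DejaGnu run.  */
enum test_system { SYS_BUILD, SYS_HOST, SYS_TARGET };

/* Hypothetical tags for the two DejaGnu copy procs named above.  */
enum copy_proc { COPY_NONE, COPY_REMOTE_UPLOAD, COPY_REMOTE_DOWNLOAD };

struct copy_plan
{
  enum copy_proc proc;      /* which proc to call */
  enum test_system remote;  /* the system name passed to it */
};

/* Decide how to copy a file between build and one other system.
   Copies that do not involve build are outside the rule quoted above
   and yield COPY_NONE.  */
static struct copy_plan
plan_copy (enum test_system from, enum test_system to)
{
  struct copy_plan plan = { COPY_NONE, SYS_BUILD };

  if (from == to)
    return plan;
  if (to == SYS_BUILD)
    {
      /* Towards the system running DejaGnu: upload, naming the source.  */
      plan.proc = COPY_REMOTE_UPLOAD;
      plan.remote = from;
    }
  else if (from == SYS_BUILD)
    {
      /* Away from build: download, naming the destination.  */
      plan.proc = COPY_REMOTE_DOWNLOAD;
      plan.remote = to;
    }
  return plan;
}

/* Printable name of a proc tag.  */
static const char *
copy_proc_name (enum copy_proc proc)
{
  switch (proc)
    {
    case COPY_REMOTE_UPLOAD:
      return "remote_upload";
    case COPY_REMOTE_DOWNLOAD:
      return "remote_download";
    default:
      assert (proc == COPY_NONE);
      return "(none)";
    }
}
```

For the profile data files discussed here, which are written on the target,
plan_copy (SYS_TARGET, SYS_BUILD) selects remote_upload with target as the
system name.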

To determine whether anything should be changed in a GCC .exp file, reason 
about which of the three systems (build, host, target) a file is on, or is 
needed on, at each point, rather than looking at what does or does not 
work with a buggy board file.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix a couple of tree-vect-loop.c issues

2014-02-13 Thread Richard Henderson
On 02/13/2014 02:46 PM, Jakub Jelinek wrote:
> 2014-02-13  Jakub Jelinek  ja...@redhat.com
>
>   * tree-vect-loop.c (vect_is_slp_reduction): Don't set
>   use_stmt twice.
>   (get_initial_def_for_induction, vectorizable_induction): Ignore
>   debug stmts when looking for exit_phi.
>   (vectorizable_live_operation): Fix up condition.
>
>   * gcc.c-torture/compile/20140213.c: New test.

Ok.


r~


[PATCH] x86: Use ud2 assembly mnemonic when available.

2014-02-13 Thread Roland McGrath
Non-ancient assemblers support the ud2 mnemonic, so there is no need
to emit the literal opcode as data.
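
For reference, the data form and the mnemonic produce the same two bytes:
ud2 is encoded as 0f 0b, and a 16-bit data value of 0x0b0f is stored low
byte first on x86.  A small C sketch of that byte order (encode_le16 is a
hypothetical helper for illustration, not part of the patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 16-bit value in little-endian byte order, which is how an
   x86 assembler lays out a two-byte data directive.  */
static void
encode_le16 (uint16_t value, uint8_t out[2])
{
  assert (out != NULL);
  out[0] = (uint8_t) (value & 0xff);  /* low byte first */
  out[1] = (uint8_t) (value >> 8);    /* high byte second */
}
```

Encoding 0x0b0f this way gives the byte sequence 0f 0b, the opcode of ud2.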

OK for trunk and 4.8?


Thanks,
Roland


gcc/
2014-02-13  Roland McGrath  mcgra...@google.com

* configure.ac (HAVE_AS_IX86_UD2): New test for 'ud2' mnemonic.
* configure: Regenerated.
* config.in: Regenerated.
* config/i386/i386.md (trap) [HAVE_AS_IX86_UD2]: Use the mnemonic
instead of ASM_SHORT.

--- a/gcc/config.in
+++ b/gcc/config.in
@@ -375,6 +375,12 @@
 #endif


+/* Define if your assembler supports the 'ud2' mnemonic. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_IX86_UD2
+#endif
+
+
 /* Define if your assembler supports the lituse_jsrdirect relocation. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_AS_JSRDIRECT_RELOCS
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -17843,7 +17843,13 @@
 (define_insn "trap"
   [(trap_if (const_int 1) (const_int 6))]
   ""
-  { return ASM_SHORT "0x0b0f"; }
+{
+#ifdef HAVE_AS_IX86_UD2
+  return "ud2";
+#else
+  return ASM_SHORT "0x0b0f";
+#endif
+}
   [(set_attr "length" "2")])

 (define_expand "prefetch"
--- a/gcc/configure
+++ b/gcc/configure
@@ -25109,6 +25109,37 @@ $as_echo "#define HAVE_AS_IX86_REP_LOCK_PREFIX 1" >>confdefs.h

 fi

+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for ud2 mnemonic" >&5
+$as_echo_n "checking assembler for ud2 mnemonic... " >&6; }
+if test "${gcc_cv_as_ix86_ud2+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_ix86_ud2=no
+  if test x$gcc_cv_as != x; then
+    $as_echo 'ud2' > conftest.s
+    if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+    then
+       gcc_cv_as_ix86_ud2=yes
+    else
+      echo "configure: failed program was" >&5
+      cat conftest.s >&5
+    fi
+    rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_ix86_ud2" >&5
+$as_echo "$gcc_cv_as_ix86_ud2" >&6; }
+if test $gcc_cv_as_ix86_ud2 = yes; then
+
+$as_echo "#define HAVE_AS_IX86_UD2 1" >>confdefs.h
+
+fi
+
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for R_386_TLS_GD_PLT reloc" >&5
 $as_echo_n "checking assembler for R_386_TLS_GD_PLT reloc... " >&6; }
 if test "${gcc_cv_as_ix86_tlsgdplt+set}" = set; then :
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3895,6 +3895,12 @@ foo: nop
 [AC_DEFINE(HAVE_AS_IX86_REP_LOCK_PREFIX, 1,
   [Define if the assembler supports 'rep insn, lock insn'.])])

+gcc_GAS_CHECK_FEATURE([ud2 mnemonic],
+   gcc_cv_as_ix86_ud2,,,
+   [ud2],,
+  [AC_DEFINE(HAVE_AS_IX86_UD2, 1,
+   [Define if your assembler supports the 'ud2' mnemonic.])])
+
 gcc_GAS_CHECK_FEATURE([R_386_TLS_GD_PLT reloc],
 gcc_cv_as_ix86_tlsgdplt,,,
        [call    tls_gd@tlsgdplt],


Re: [PATCH, testsuite] Fix profile test failures

2014-02-13 Thread Steve Ellcey
On Thu, 2014-02-13 at 23:09 +, Joseph S. Myers wrote:
> On Thu, 13 Feb 2014, Steve Ellcey wrote:
>
> > While testing the C++ profiling tests in g++.dg/bprob and using the
> > qemu simulator we discovered that these tests were passing when we ran
> > the testsuite with no extra options but that if we specified some options
> > on the testsuite run then the tests would fail with this message in the
> > c++.log file:
> >
> > rsh: Could not resolve hostname multi-sim/-EL: Name or service not known
>
> That means your board file is buggy.  If rsh is not the right way to
> access your target system, you need to implement the board file methods in
> some way other than rsh (possibly some operations should be no-ops, or do
> something directly on the build system, if you have a shared filesystem).

I thought the bug was that it was using 'multi-sim/-EL' instead of just
'multi-sim'.  I.e.  I thought that target was a combination of where the
test was run and what options were used, whereas host was just going to
be where the test was run.  I guess I was wrong about that.

> > So while it seems weird that 'host' is the proper replacement for 'target'
> > as the machine where the executable is run, this seems to be the right fix
>
> It's certainly not the proper replacement.  If a file is on the target,
> use target for deletion / manipulation; if it's on the host, use host for
> deletion / manipulation; on build, use build; in multiple places, run the
> deletion operation once per system with the file; to copy from target to
> the system (build) running DejaGnu, use remote_upload specifying target;
> to copy from host to build, use remote_upload specifying host; to copy
> from build to host or target, use remote_download specifying host or
> target as appropriate.

So let me make sure I understand this:  host is where you run the
testsuite from, build is where the compilation happens (probably the
same as host for most people), and target is where the test program is
executed.

> To determine whether anything should be changed in a GCC .exp file, reason
> about which of the three systems (build, host, target) a file is on, or is
> needed on, at each point, rather than looking at what does or does not
> work with a buggy board file.

I am not convinced that the problem is in the board file because the
only tests I see fail this way are the ones in g++.dg/bprob, whose .exp
file is also the only GCC .exp file that uses remote_upload or remote_file
with 'target'.  I will dig into it some more and also try it with some
different boards.

Steve Ellcey
sell...@mips.com






Re: [PATCH] x86: Use ud2 assembly mnemonic when available.

2014-02-13 Thread Andrew Pinski
On Thu, Feb 13, 2014 at 3:46 PM, Roland McGrath mcgra...@google.com wrote:
> Non-ancient assemblers support the ud2 mnemonic, so there is no need
> to emit the literal opcode as data.
>
> OK for trunk and 4.8?

I changed this to use .word due to openbsd3.1:
http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01347.html .  I no longer
have access to this older openbsd box so I don't object to this
change.  In fact I doubt we support any binutils that are pre 2.0 any
more; so maybe move over unconditionally to ud2.

Thanks,
Andrew Pinski





Re: [PATCH] x86: Use ud2 assembly mnemonic when available.

2014-02-13 Thread Andrew Pinski
On Thu, Feb 13, 2014 at 3:50 PM, Andrew Pinski pins...@gmail.com wrote:
> On Thu, Feb 13, 2014 at 3:46 PM, Roland McGrath mcgra...@google.com wrote:
>> Non-ancient assemblers support the ud2 mnemonic, so there is no need
>> to emit the literal opcode as data.
>>
>> OK for trunk and 4.8?
>
> I changed this to use .word due to openbsd3.1:
> http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01347.html .  I no longer
> have access to this older openbsd box so I don't object to this
> change.  In fact I doubt we support any binutils that are pre 2.0 any
> more; so maybe move over unconditionally to ud2.


Oh looking into this further, it looks like Sun's assembler does not
support it either:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23359

Thanks,
Andrew




Re: [PATCH] x86: Use ud2 assembly mnemonic when available.

2014-02-13 Thread Roland McGrath
Did you read the patch?  It uses an empirical configure check to
discover if the assembler does in fact support ud2.


[PATCH][ARM] add HFmode to arm_preferred_simd_mode

2014-02-13 Thread Kugan
Hi,

Is there any reason why HFmode is not there in arm_preferred_simd_mode?
NEON does support this.

Cross regression tested for arm-none-linux-gnueabi with qemu, with no new
regressions.

The attached patch enables this. Is this OK for stage1?

Thanks,
Kugan

gcc/
+2014-02-14  Kugan Vivekanandarajah  kug...@linaro.org
+
+   * config/arm/arm.c (arm_preferred_simd_mode): Add HFmode to
+preferred modes.


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b49f43e..bd90e85 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28564,6 +28564,10 @@ arm_preferred_simd_mode (enum machine_mode mode)
   if (TARGET_NEON)
 switch (mode)
   {
+  case HFmode:
+   if (arm_fp16_format)
+ return TARGET_NEON_VECTORIZE_DOUBLE ? V4HFmode : V8HFmode;
+   break;
   case SFmode:
return TARGET_NEON_VECTORIZE_DOUBLE ? V2SFmode : V4SFmode;
   case SImode:
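
As a rough illustration of the mode choice in this hunk: the lane count is
the vector register width divided by the element width, so a 16-bit HFmode
element gives 4 lanes in a 64-bit vector (V4HF) and 8 lanes in a 128-bit
vector (V8HF), just as a 32-bit SFmode element gives V2SF and V4SF.  The C
sketch below only models that arithmetic; simd_lanes and preferred_lanes
are hypothetical helpers, not functions in arm.c.

```c
#include <assert.h>

/* Number of ELEM_BITS-wide elements that fit in a VECTOR_BITS-wide
   vector register.  */
static unsigned
simd_lanes (unsigned vector_bits, unsigned elem_bits)
{
  assert (elem_bits != 0 && vector_bits % elem_bits == 0);
  return vector_bits / elem_bits;
}

/* Pick 64-bit or 128-bit vectors, in the spirit of the
   TARGET_NEON_VECTORIZE_DOUBLE test in the hunk above.  */
static unsigned
preferred_lanes (unsigned elem_bits, int vectorize_double)
{
  return simd_lanes (vectorize_double ? 64 : 128, elem_bits);
}
```

Under these assumptions preferred_lanes (16, 1) is 4 and
preferred_lanes (16, 0) is 8, matching V4HFmode and V8HFmode.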


Re: [PATCH][ARM] add HFmode to arm_preferred_simd_mode

2014-02-13 Thread Andrew Pinski
On Thu, Feb 13, 2014 at 4:15 PM, Kugan
kugan.vivekanandara...@linaro.org wrote:
> Hi,
>
> Is there any reason why HFmode is not there in arm_preferred_simd_mode?
> NEON does support this.

Most likely because there is no support for Half-float in the vectorizer.

Thanks,
Andrew Pinski


> Cross regression tested for arm-none-linux-gnueabi with qemu and no new
> regressions.
>
> Attached patch enables this. Is this OK for stage1.
>
> Thanks,
> Kugan
>
> gcc/
> +2014-02-14  Kugan Vivekanandarajah  kug...@linaro.org
> +
> +   * config/arm/arm.c (arm_preferred_simd_mode): Add HFmode to
> +preferred modes.




Re: [PATCH, testsuite] Fix profile test failures

2014-02-13 Thread Joseph S. Myers
On Thu, 13 Feb 2014, Steve Ellcey wrote:

> So let me make sure I understand this:  host is where you run the
> testsuite from, build is where the compilation happens (probably the
> same as host for most people), and target is where the test program is
> executed.

Host is the system on which the compilers being tested run.  Build is the 
system on which runtest runs and executes the .exp files.  They are only 
different in the case of remote-host testing (using DejaGnu on GNU/Linux 
to test a compiler for Windows host, for example) - typically the same 
cases in which a Canadian cross compiler is built.

> > To determine whether anything should be changed in a GCC .exp file, reason
> > about which of the three systems (build, host, target) a file is on, or is
> > needed on, at each point, rather than looking at what does or does not
> > work with a buggy board file.
>
> I am not convinced that the problem is in the board file because the
> only tests I see fail this way are the ones in g++.dg/bprob, whose .exp
> file is also the only GCC .exp file that uses remote_upload or remote_file
> with 'target'.  I will dig into it some more and also try it with some
> different boards.

Branch profiling involves the generated executables creating files with 
profile information when they run, so those files (on the target) need 
manipulating.  Most testsuites do not involve testcases generating any 
files.  But the libstdc++ testsuite uses remote_download to transfer files 
to the target, because various testcases need to open and read input 
files.

-- 
Joseph S. Myers
jos...@codesourcery.com


RE: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8

2014-02-13 Thread Joey Ye
Ping ^3

These fixes are very important to 4.8 ARM embedded users, as they rely on
strict volatile bitfields a lot. Please let them in 4.8.

 -Original Message-
 From: Joey Ye [mailto:joey...@arm.com]
 Sent: Saturday, February 08, 2014 10:42
 To: gcc-patches@gcc.gnu.org
 Subject: RE: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8
 
 Ping ^ 2
 
 OK to 4.8?
 
  -Original Message-
  From: Joey Ye [mailto:joey...@arm.com]
  Sent: Monday, January 20, 2014 10:47
  To: gcc-patches@gcc.gnu.org
  Subject: RE: [PATCH][4.8] Backport strict-volatile-bitfields fixes to
4.8
 
  Ping
 
   -Original Message-
   From: Joey Ye [mailto:joey...@arm.com]
   Sent: Thursday, January 16, 2014 16:28
   To: gcc-patches@gcc.gnu.org
   Subject: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8
  
   4.8 has a number of strict-volatile-bitfields issues that can be fixed
   by following patches.
   trunk@205899, 205898, 205897, 205896, 203003
  
   Tested on x86_64 and arm without regression.
  
   OK to 4.8?
  
   2013-09-28  Sandra Loosemore  san...@codesourcery.com
  
   gcc/
   * expr.h (extract_bit_field): Remove packedp parameter.
   * expmed.c (extract_fixed_bit_field): Remove packedp parameter
   from forward declaration.
   (store_split_bit_field): Remove packedp arg from calls to
   extract_fixed_bit_field.
   (extract_bit_field_1): Remove packedp parameter and packedp
   argument from recursive calls and calls to
extract_fixed_bit_field.
   (extract_bit_field): Remove packedp parameter and
corresponding
   arg to extract_bit_field_1.
   (extract_fixed_bit_field): Remove packedp parameter.  Remove
 code
   to issue warnings.
   (extract_split_bit_field): Remove packedp arg from call to
   extract_fixed_bit_field.
   * expr.c (emit_group_load_1): Adjust calls to
extract_bit_field.
   (copy_blkmode_from_reg): Likewise.
   (copy_blkmode_to_reg): Likewise.
   (read_complex_part): Likewise.
   (store_field): Likewise.
   (expand_expr_real_1): Likewise.
   * calls.c (store_unaligned_arguments_into_pseudos): Adjust
call
   to extract_bit_field.
   * config/tilegx/tilegx.c (tilegx_expand_unaligned_load):
Adjust
   call to extract_bit_field.
   * config/tilepro/tilepro.c (tilepro_expand_unaligned_load):
Adjust
   call to extract_bit_field.
   * doc/invoke.texi (Code Gen Options): Remove mention of
warnings
   and special packedp behavior from -fstrict-volatile-bitfields
   documentation.
  
   2013-12-11  Bernd Edlinger  bernd.edlin...@hotmail.de
  
   * expr.c (expand_assignment): Remove dependency on
   flag_strict_volatile_bitfields. Always set the memory
   access mode.
   (expand_expr_real_1): Likewise.
  
   2013-12-11  Sandra Loosemore  san...@codesourcery.com
  
   PR middle-end/23623
   PR middle-end/48784
   PR middle-end/56341
   PR middle-end/56997
  
   gcc/
   * expmed.c (strict_volatile_bitfield_p): New function.
   (store_bit_field_1): Don't special-case strict volatile
   bitfields here.
   (store_bit_field): Handle strict volatile bitfields here
instead.
   (store_fixed_bit_field): Don't special-case strict volatile
   bitfields here.
   (extract_bit_field_1): Don't special-case strict volatile
   bitfields here.
   (extract_bit_field): Handle strict volatile bitfields here
instead.
   (extract_fixed_bit_field): Don't special-case strict volatile
   bitfields here.  Simplify surrounding code to resemble that in
   store_fixed_bit_field.
   * doc/invoke.texi (Code Gen Options): Update
   -fstrict-volatile-bitfields description.
  
   gcc/testsuite/
   * gcc.dg/pr23623.c: New test.
   * gcc.dg/pr48784-1.c: New test.
   * gcc.dg/pr48784-2.c: New test.
   * gcc.dg/pr56341-1.c: New test.
   * gcc.dg/pr56341-2.c: New test.
   * gcc.dg/pr56997-1.c: New test.
   * gcc.dg/pr56997-2.c: New test.
   * gcc.dg/pr56997-3.c: New test.
  
   2013-12-11  Bernd Edlinger  bernd.edlin...@hotmail.de
Sandra Loosemore  san...@codesourcery.com
  
   PR middle-end/23623
   PR middle-end/48784
   PR middle-end/56341
   PR middle-end/56997
   * expmed.c (strict_volatile_bitfield_p): Add bitregion_start
   and bitregion_end parameters.  Test for compliance with C++
   memory model.
   (store_bit_field): Adjust call to strict_volatile_bitfield_p.
   Add fallback logic for cases where -fstrict-volatile-bitfields
   is supposed to apply, but cannot.
   (extract_bit_field): Likewise. Use narrow_bit_field_mem and
   

Re: FRE may run out of memory

2014-02-13 Thread dxq
Richard Biener-2 wrote
> On Sat, Feb 8, 2014 at 8:29 AM, dxq <ziyan01@> wrote:
>> hi all,
>>
>> We found that gcc would run out of memory on Windows when compiling a
>> *big* function (10 lines).
>>
>> More investigation shows that gcc crashes at the function
>> *compute_avail*, in tree-fre pass.  *compute_avail* collects information
>> from basic blocks, so memory is allocated to record information.
>> However, if there are a huge number of basic blocks, the memory would be
>> exhausted and gcc would crash, especially on Windows PCs, which generally
>> have only 2G or 4G of memory. It's ok on linux, and *compute_avail*
>> allocates *2.4G* memory. I guess some optimization passes in gcc like FRE
>> didn't consider the extreme case.
>
> This was fixed for GCC 4.8, FRE no longer uses compute_avail (but PRE
> still does).
> Basically GCC 4.8 should (at -O1) compile most extreme cases just fine.
>
> Richard.

hi, Richard,

More investigation shows that:
1, loop related passes take more compiling time and memory, especially
   pass_rtl_move_loop_invariants and lim, and at least lim on tree will
   impact the following passes a lot.
2, ira will take more than 20g memory in function *create_loop_tree_nodes*,
   because ira chooses 'mixed' or 'all' region when optimize level.
3, sms pass always creates ddgs for all loops in the compiled function, then
   does sms optimization for all loops, and finally frees the ddgs. If there
   are a huge number of loops, sms may crash when creating ddgs because of
   running out of memory.

Could someone confirm the memory pressure problem in the passes above?

Thanks for your reply!

danxiaoqiang





Re: [PATCH][ARM] add HFmode to arm_preferred_simd_mode

2014-02-13 Thread Kugan


On 14/02/14 11:24, Andrew Pinski wrote:
> On Thu, Feb 13, 2014 at 4:15 PM, Kugan
> kugan.vivekanandara...@linaro.org wrote:
>> Hi,
>>
>> Is there any reason why HFmode is not there in arm_preferred_simd_mode?
>> NEON does support this.
>
> Most likely because there is no support for Half-float in the vectorizer.
 

I can see get_vectype_for_scalar_type_and_size failing while
building the vector type (with build_vector_type) for Half-float. I guess we
should add support there first.

Thanks,
Kugan


Re: [PATCH 4/6] [GOMP4] OpenACC 1.0+ support in fortran front-end

2014-02-13 Thread Ilmir Usmanov

Committed as r207776.

--
Ilmir.


RE: [Patch, microblaze]: Add optimized lshrsi3

2014-02-13 Thread David Holsgrove
Hi Michael,

> -Original Message-
> From: Michael Eager [mailto:ea...@eagerm.com]
> Sent: Sunday, 9 February 2014 2:58 am
> To: David Holsgrove; gcc-patches@gcc.gnu.org
> Cc: Edgar Iglesias; John Williams; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: [Patch, microblaze]: Add optimized lshrsi3
>
> On 11/25/13 23:53, David Holsgrove wrote:
> > Add optimized lshrsi3 instruction, to be used when optimizing for size
> > with immediate values over 5
> >
> > Changelog
> >
> > 2013-11-26  Nagaraju Mekala nagaraju.mek...@xilinx.com
> >
> > * gcc/config/microblaze/microblaze.md: Add size optimized lshrsi3 insn.
>
> David --
>
> Please put the description of the patch in the text of the email,
> rather than hiding it within an attached patch.
>
> The patch describes a very specific situation where this patch
> will have an effect.  Please provide a test case.

Updated version of the patch attached, with a testcase. New Changelog entries are:

Changelog

2013-11-26  David Holsgrove david.holsgr...@xilinx.com

 * gcc/config/microblaze/microblaze.md: Add size optimized lshrsi3 insn

ChangeLog/testsuite

2014-02-12  David Holsgrove david.holsgr...@xilinx.com

 * gcc/testsuite/gcc.target/microblaze/others/lshrsi_Os_1.c: New test.
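
To give an idea of the trade-off, assuming a MicroBlaze configuration
without a barrel shifter where only single-bit shifts are available, a
constant shift can either be unrolled into one shift instruction per bit or
done in a short counted loop whose code size does not grow with the shift
count.  The C sketch below models the loop form only; lshr_by_loop is a
hypothetical name and the code is not taken from the attached patch.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift right by COUNT bits using only single-bit shifts,
   one per loop iteration.  */
static uint32_t
lshr_by_loop (uint32_t value, unsigned count)
{
  assert (count < 32);
  while (count-- > 0)
    value >>= 1;  /* stands in for one single-bit shift instruction */
  return value;
}
```

For small counts the unrolled sequence is shorter than the loop setup, which
would be consistent with the description's threshold of immediate values
over 5 when optimizing for size.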

thanks,
David

 
> --
> Michael Eager  ea...@eagercon.com
> 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077





0003-Patch-microblaze-Add-optimized-lshrsi3.patch
Description: 0003-Patch-microblaze-Add-optimized-lshrsi3.patch


RE: [Patch, microblaze]: Remove SECONDARY_MEMORY_NEEDED

2014-02-13 Thread David Holsgrove
Hi Michael, List,

 -Original Message-
 From: David Holsgrove
 Sent: Wednesday, 22 January 2014 1:43 pm
 To: 'Michael Eager'; gcc-patches@gcc.gnu.org
 Cc: Edgar Iglesias; John Williams; Vidhumouli Hunsigida; Nagaraju Mekala
 Subject: RE: [Patch, microblaze]: Remove SECONDARY_MEMORY_NEEDED
 
 Hi Michael,
 
  -Original Message-
  From: Michael Eager [mailto:ea...@eagerm.com]
  Sent: Friday, 17 January 2014 4:44 am
  To: David Holsgrove; gcc-patches@gcc.gnu.org
  Cc: Edgar Iglesias; John Williams; Vidhumouli Hunsigida; Nagaraju Mekala
  Subject: Re: [Patch, microblaze]: Remove SECONDARY_MEMORY_NEEDED
 
  On 11/25/13 23:51, David Holsgrove wrote:
   Hi Michael,
  
   I've attached patch based on latest gcc master. Please let me know if
   you need anything further.
  
   thanks,
   David
  
   On 15 July 2013 14:44, David Holsgrove david.holsgr...@xilinx.com wrote:
   Hi Michael,
  
   On 18 March 2013 22:49, David Holsgrove david.holsgr...@xilinx.com
  wrote:
   MicroBlaze doesn't have restrictions that would force us to
   reload regs via memory. Don't define SECONDARY_MEMORY_NEEDED.
   Fixes an ICE when compiling OpenSSL for linux.
  
   Changelog
  
   2013-03-18  Edgar E. Iglesias edgar.igles...@xilinx.com
  
 * gcc/config/microblaze/microblaze.h: Remove
  SECONDARY_MEMORY_NEEDED
   definition.
  
   Signed-off-by: Edgar E. Iglesias edgar.igles...@xilinx.com
   Signed-off-by: Peter A. G. Crosthwaite peter.crosthwa...@xilinx.com
  
  
   Patch remains the same, please apply when ready.
  
   thanks,
   David
 
  Hi David --
 
  Is it possible to add a test case which shows the ICE?
 
 
 I'm afraid I don’t still have my test environment for this patch from last 
 March, I'll
 attempt to recreate and distil into a small test case if possible, based on 
 the error
 encountered whilst building openssl.
 
 I'll update again when I have some further detail.
 

I've managed to recreate the original internal compiler error whilst building 
openssl with microblazeel linux toolchain.

I've reduced the error down to the attached testcase.
It is taken directly from openssl (with no dependencies on openssl headers), so 
I'm unsure of the suitability of this test both technically and license wise 
for inclusion in gcc.

Changelog entry would be;

2013-03-18  Edgar E. Iglesias edgar.igles...@xilinx.com

 * gcc/config/microblaze/microblaze.h: Remove SECONDARY_MEMORY_NEEDED
   definition.

ChangeLog/testsuite

2014-02-13  David Holsgrove david.holsgr...@xilinx.com

 * gcc/testsuite/gcc.target/microblaze/others/mem_reload.c: New test.

thanks,
David


 thanks,
 David
 
  Thanks.
 
 
  --
  Michael Eager  ea...@eagercon.com
  1960 Park Blvd., Palo Alto, CA 94306  650-325-8077





0002-Patch-microblaze-Remove-SECONDARY_MEMORY_NEEDED.patch
Description: 0002-Patch-microblaze-Remove-SECONDARY_MEMORY_NEEDED.patch


RE: [Patch, microblaze]: Add TARGET_ASM_OUTPUT_MI_THUNK to support varargs thunk

2014-02-13 Thread David Holsgrove
Hi Michael,

 -Original Message-
 From: Michael Eager [mailto:ea...@eagerm.com]
 Sent: Sunday, 26 January 2014 1:57 am
 To: David Holsgrove
 Cc: gcc-patches@gcc.gnu.org; Edgar Iglesias; John Williams; Vinod Kathail;
 Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui
 Subject: Re: [Patch, microblaze]: Add TARGET_ASM_OUTPUT_MI_THUNK to
 support varargs thunk
 
 On 07/14/13 21:37, David Holsgrove wrote:
  Hi Michael,
 
  -Original Message-
  From: Michael Eager [mailto:ea...@eagerm.com]
  Sent: Saturday, 13 July 2013 9:33 am
  To: David Holsgrove
  Cc: gcc-patches@gcc.gnu.org; Edgar Iglesias; John Williams; Vinod Kathail;
  Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui
  Subject: Re: [Patch, microblaze]: Add TARGET_ASM_OUTPUT_MI_THUNK to
  support varargs thunk
 
  On 03/18/13 05:49, David Holsgrove wrote:
  Changelog
 
  2013-03-18  David Holsgrove david.holsgr...@xilinx.com
 
  * gcc/config/microblaze/microblaze.c: Add
    microblaze_asm_output_mi_thunk and define
    TARGET_ASM_OUTPUT_MI_THUNK and TARGET_ASM_CAN_OUTPUT_MI_THUNK
 
  Sorry it has taken so long to review this patch.
 
[--snip--]
 
  2013-07-15  David Holsgrove david.holsgr...@xilinx.com
 
  * gcc/config/microblaze/microblaze.c: Add
    microblaze_asm_output_mi_thunk and define
    TARGET_ASM_OUTPUT_MI_THUNK and TARGET_ASM_CAN_OUTPUT_MI_THUNK
 
 This patch causes a number of regressions in the G++ test suite.
 For example, abi/covariant{3,4,5}.C, abi/vcall1.C,
 inherit/covariant{1,2,3,4,17,18}.C,
 inherit/thunk{7,10}.C and others.
 
 

Apologies - this patch was originally written in 2012 and submitted to this
list a year ago.
It has not been reviewed or tested for regressions in 12 months, and it has
taken me a bit of time to go back to the original work and rerun the testsuite
as it stands today.

Please find attached an updated patch which has no regressions. I believe the
testcase which checks the functionality of this patch is
'g++.old-deja/g++.jason/thunk3.C'.

Changelog entry remains the same since March 2013.

thanks,
David

 --
 Michael Eager  ea...@eagercon.com
 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077





0001-Patch-microblaze-Add-TARGET_ASM_OUTPUT_MI_THUNK-to-s.patch


[Patch, testsuite]: Add MicroBlaze pattern for dg-function-on-line

2014-02-13 Thread David Holsgrove
Hi,

The attached patch adds a MicroBlaze-specific pattern to dg-function-on-line
for checking the line number and generation of a function, following the
approach already used for MIPS.

ChangeLog/testsuite

2014-02-13  David Holsgrove david.holsgr...@xilinx.com

 * gcc/testsuite/lib/scanasm.exp (dg-function-on-line): Add
   MicroBlaze specific pattern.

thanks,
David




0004-Patch-testsuite-Add-MicroBlaze-pattern-for-dg-functi.patch


[Patch, testsuite]: Allow MicroBlaze .weakext pattern in regex match

2014-02-13 Thread David Holsgrove
Hi All,

I've attached a patch that extends the regex pattern to accept an optional
'ext' after '.weak', so that it also matches the MicroBlaze weak label
'.weakext' in two of the g++ testcases.

The only other rule in these tests was for ! { *-*-darwin* }, so I'm not sure
whether it's appropriate to modify the scan-assembler line in this fashion for
one specific architecture's pattern.
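
To illustrate the kind of change described above, here is a minimal sketch in
Python rather than in the Tcl regex syntax that scan-assembler actually uses.
The pattern and the symbol name _ZTSPP1A are assumptions made for the example,
not copied from the attached patch; the point is only that an optional group
lets one pattern accept both '.weak' and '.weakext'.

```python
import re

# Hypothetical pattern: '.weak', an optional 'ext', whitespace, then the
# symbol with an optional extra leading underscore.  The symbol name is an
# illustrative assumption, not taken from the patch.
WEAK_DIRECTIVE = re.compile(r"\.weak(ext)?[ \t]+_?_ZTSPP1A")


def has_weak_directive(asm_text):
    """Return True if the assembly text marks the symbol as weak."""
    return WEAK_DIRECTIVE.search(asm_text) is not None


print(has_weak_directive("\t.weak\t_ZTSPP1A"))     # True
print(has_weak_directive("\t.weakext\t_ZTSPP1A"))  # True
print(has_weak_directive("\t.globl\t_ZTSPP1A"))    # False
```

Targets that emit plain '.weak' still match, so the extension should not
change the result for them.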

ChangeLog/testsuite

2014-02-14  David Holsgrove david.holsgr...@xilinx.com

 * gcc/testsuite/g++.dg/abi/rtti3.C: Extend scan-assembler
   pattern to take optional ext after .weak.
 * gcc/testsuite/g++.dg/abi/thunk4.C: Likewise.

thanks,
David




0005-Patch-testsuite-Allow-MicroBlaze-.weakext-pattern-in.patch


Re: [PATCH] x86: Use ud2 assembly mnemonic when available.

2014-02-13 Thread Uros Bizjak
Hello!

 Non-ancient assemblers support the ud2 mnemonic, so there is no need
 to emit the literal opcode as data.

 OK for trunk and 4.8?

You forgot to tell us how the patch was tested...

 gcc/
 2014-02-13  Roland McGrath  mcgra...@google.com

 * configure.ac (HAVE_AS_IX86_UD2): New test for 'ud2' mnemonic.
 * configure: Regenerated.
 * config.in: Regenerated.
 * config/i386/i386.md (trap) [HAVE_AS_IX86_UD2]: Use the mnemonic
 instead of ASM_SHORT.
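
For reference, a small sketch of why the literal form is equivalent to the
mnemonic. It assumes the existing pattern emits the 16-bit value 0x0b0f via
ASM_SHORT (an assumption about the current i386.md, not something stated in
this thread) and checks that a little-endian 16-bit data directive with that
value produces the two-byte ud2 encoding 0F 0B. The helper name is
hypothetical.

```python
import struct

# x86 encoding of the ud2 instruction: opcode bytes 0F 0B.
UD2_ENCODING = bytes([0x0F, 0x0B])


def short_directive_bytes(value):
    """Bytes emitted for a 16-bit data directive on a little-endian target."""
    return struct.pack("<H", value)


# Assumed literal: 0x0b0f, stored low byte first.
print(short_directive_bytes(0x0B0F) == UD2_ENCODING)  # True
```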

OK for mainline and release branches.

Thanks,
Uros.


Re: [RS6000] power8 internal compiler errors

2014-02-13 Thread Alan Modra
On Wed, Feb 12, 2014 at 06:47:37PM +0100, Ulrich Weigand wrote:
 Note that find_replacement itself already recurses into both sides
 of a PLUS.

Thanks, I missed seeing that.  I'd analysed the bug and knew what
needed doing from past forays into reload, so went looking for ways to
get at the reloads, i.e. the replacements at that stage of reload.  Lo
and behold, there's a function tailor-made to do just that!  So I
plugged in find_replacement() wherever it seemed necessary.

 So it might be
 easier and cheaper overall to just do a find_replacement within
 the PRE_MODIFY clause ...

That's a good idea, since PRE_MODIFY doesn't occur that often.
Here is the revised patch with your recommendations.  Bootstrapped
and regression tested on powerpc64-linux.
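
As a rough illustration of the point about PLUS, here is a toy model in
Python.  It is not GCC code: expressions are plain tuples, the replacement
table is keyed by value rather than by rtx location, and only PLUS is
recursed into, since that is the case discussed in this thread.  All names
and register numbers are made up for the example.

```python
def find_replacement(expr, replacements):
    """Toy analogue of reload's find_replacement.

    Returns the scheduled replacement for expr if there is one.  For a
    PLUS, both operands are looked up recursively and a new PLUS is built
    if either changed.  Anything else is returned unchanged.
    """
    if expr in replacements:
        return replacements[expr]
    if expr[0] == "plus":
        lhs = find_replacement(expr[1], replacements)
        rhs = find_replacement(expr[2], replacements)
        if lhs is not expr[1] or rhs is not expr[2]:
            return ("plus", lhs, rhs)
    return expr


# Hypothetical pseudo register 200 that reload replaces with hard register 9.
pseudo = ("reg", 200)
hard = ("reg", 9)
replacements = {pseudo: hard}

# (pre_modify base (plus base (const_int 16)))
addr = ("pre_modify", pseudo, ("plus", pseudo, ("const_int", 16)))

# PRE_MODIFY itself is not recursed into, so each operand is looked up
# separately; the single call on the PLUS operand covers both of its sides.
base = find_replacement(addr[1], replacements)
new_addr = find_replacement(addr[2], replacements)

print(base)      # ('reg', 9)
print(new_addr)  # ('plus', ('reg', 9), ('const_int', 16))
```

In this model, one call on each PRE_MODIFY operand is enough, and no further
calls are needed on the operands of the resulting PLUS.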

PR target/58675
PR target/57935
* config/rs6000/rs6000.c (rs6000_secondary_reload_inner): Use
find_replacement on parts of insn rtl that might be reloaded.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 207649)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -16170,7 +16156,7 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
 rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
   rclass = REGNO_REG_CLASS (regno);
-  addr = XEXP (mem, 0);
+  addr = find_replacement (XEXP (mem, 0));
 
   switch (rclass)
 {
@@ -16181,19 +16167,18 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
   if (GET_CODE (addr) == AND)
{
  and_op2 = XEXP (addr, 1);
- addr = XEXP (addr, 0);
+ addr = find_replacement (XEXP (addr, 0));
}
 
   if (GET_CODE (addr) == PRE_MODIFY)
{
- scratch_or_premodify = XEXP (addr, 0);
+ scratch_or_premodify = find_replacement (XEXP (addr, 0));
  if (!REG_P (scratch_or_premodify))
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
- if (GET_CODE (XEXP (addr, 1)) != PLUS)
+ addr = find_replacement (XEXP (addr, 1));
+ if (GET_CODE (addr) != PLUS)
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
- addr = XEXP (addr, 1);
}
 
   if (GET_CODE (addr) == PLUS
@@ -16201,6 +16186,8 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
  || !rs6000_legitimate_offset_address_p (PTImode, addr,
  false, true)))
{
+ /* find_replacement already recurses into both operands of
+PLUS so we don't need to call it here.  */
  addr_op1 = XEXP (addr, 0);
  addr_op2 = XEXP (addr, 1);
  if (!legitimate_indirect_address_p (addr_op1, false))
@@ -16276,7 +16263,7 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
  || !VECTOR_MEM_ALTIVEC_P (mode)))
{
  and_op2 = XEXP (addr, 1);
- addr = XEXP (addr, 0);
+ addr = find_replacement (XEXP (addr, 0));
}
 
   /* If we aren't using a VSX load, save the PRE_MODIFY register and use it
@@ -16288,14 +16275,13 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
  || and_op2 != NULL_RTX
  || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
{
- scratch_or_premodify = XEXP (addr, 0);
+ scratch_or_premodify = find_replacement (XEXP (addr, 0));
  if (!legitimate_indirect_address_p (scratch_or_premodify, false))
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
- if (GET_CODE (XEXP (addr, 1)) != PLUS)
+ addr = find_replacement (XEXP (addr, 1));
+ if (GET_CODE (addr) != PLUS)
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
- addr = XEXP (addr, 1);
}
 
   if (legitimate_indirect_address_p (addr, false)  /* reg */

-- 
Alan Modra
Australia Development Lab, IBM