RFA: speeding up dg-extract-results.sh
dg-extract-results.sh is used to combine the various .sum.sep and .log.sep files produced by parallel testing into single .sum and .log files. It's written in a combination of shell scripts and awk, so stays well within the minimum system requirements.

However, it seems to be quadratic in the number of test variations, since the size of the .sums and .logs are linear in it and the script parses them all once per variation. This means that when I'm doing the mipsisa64-sde-elf testing:

http://gcc.gnu.org/ml/gcc-testresults/2014-02/msg00025.html

the script takes just over 5 hours to produce the gcc.log file.

This patch tries to reduce that by providing an alternative single-script version. I was torn between Python and Tcl, but given how most people tend to react to Tcl, I thought I'd better go for Python. I wouldn't mind rewriting it in Tcl if that seems better though, not least because expect is already a prerequisite.

Python isn't yet required and I'm pretty sure this script needs 2.6 or later. I'm also worried that the seek/tell stuff might not work on Windows. The patch therefore gets dg-extract-results.sh to check the environment first and call into the python version if possible, otherwise it falls back on the current approach. This also means that the patch is contained entirely within contrib/. If this does indeed not work on Windows then we should either fix the python code (obviously preferred) or get dg-extract-results.sh to skip it on Windows for now.

The new version processes the mipsisa64-sde-elf gcc.log in just over a minute. It's also noticeably faster for more normal runs, e.g. for my 4-variant mips64-linux-gnu testing the time taken to process gcc.log goes from 114s to 11s. But that's probably in the noise given how long testing takes anyway.

For completeness, although the basic approach was heavily based on the original script, there are some minor differences in output:

- the 'Host is ' line is copied over.

- not all sorts in the .sh version were protected by LC_ALL=C, so the order of .exp files in the .sum could depend on locale. The new version always follows the LC_ALL=C ordering (since that's what Python uses unless the script forces it not to).

- when the run for a particular .exp is split over several .log.seps, the separate logs are now reassembled in the same order as the .sum output, based on the first test in each .log fragment. I've left this under the control of an internal variable for easier comparison though.

- the new version tries to keep the earliest start message and latest end message (based on the time in the message). I thought this would give a better idea how long the full run took.

- the .log output now contains the tool version information at the end (as both versions do for .sum).

- the .log output only contains one set of 'Using foo.exp as the blah.' messages per run. The .sh version drops most of the others but not all.

I checked that the outputs were otherwise identical for a set of mips64-linux-gnu, mipsisa64-sde-elf and x86_64-linux-gnu runs. I also reran the acats tests with some nobbled testcases in order to test the failure paths there.

Also bootstrapped & regression-tested on x86_64-linux-gnu. OK to install?

Thanks,
Richard


contrib/
	* dg-extract-results.py: New file.
	* dg-extract-results.sh: Use it if the environment seems suitable.

Index: contrib/dg-extract-results.py
===
--- /dev/null 2014-02-10 23:36:59.384652914 +
+++ contrib/dg-extract-results.py 2014-02-13 07:50:18.877804877 +
@@ -0,0 +1,577 @@
+#!/usr/bin/python
+#
+# Copyright (C) 2014 Free Software Foundation, Inc.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+import sys
+import getopt
+import re
+from datetime import datetime
+
+# True if unrecognised lines should cause a fatal error.  Might want to turn
+# this on by default later.
+strict = False
+
+# True if the order of .log segments should match the .sum file, false if
+# they should keep the original order.
+sort_logs = True
+
+class Named:
+    def __init__ (self, name):
+        self.name = name
+
+    def __cmp__ (self, other):
+        return cmp (self.name, other.name)
+
+class ToolRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
+        # The variations run for this tool, mapped by --target_board name.
+        self.variations = dict()
+
+    # Return the VariationRun for variation NAME.
+    def get_variation (self, name):
+        if name not in self.variations:
+            self.variations[name] = VariationRun (name)
+        return self.variations[name]
+
+class VariationRun (Named):
+    def __init__ (self, name):
+        Named.__init__ (self, name)
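
[Editorial sketch, not part of the posted patch.] The message above describes three ideas: read every fragment once instead of once per variation, order names as LC_ALL=C would, and keep the earliest start and latest end time. The following is a minimal illustration of those ideas only. The function names, the "Running target" line handling and the timestamp format are assumptions made for this sketch and are not taken from dg-extract-results.py.

```python
from datetime import datetime

# Hypothetical miniature of a .sum fragment: a "Running target <name>" line
# opens a variation and the non-empty lines after it belong to that variation.
def group_by_variation(lines):
    """Collect result lines per variation in a single pass over LINES."""
    variations = {}
    current = None
    for line in lines:
        if line.startswith("Running target "):
            current = line[len("Running target "):].strip()
            variations.setdefault(current, [])
        elif current is not None and line.strip():
            variations[current].append(line.strip())
    return variations

def merge_fragments(fragments):
    """Merge several fragments, reading each one exactly once."""
    merged = {}
    for fragment in fragments:
        for name, results in group_by_variation(fragment).items():
            merged.setdefault(name, []).extend(results)
    # sorted() on str compares code points regardless of the locale, which
    # matches the LC_ALL=C byte order for ASCII names.
    return [(name, merged[name]) for name in sorted(merged)]

def run_span(start_stamps, end_stamps, fmt="%a %b %d %H:%M:%S %Y"):
    """Return (earliest start, latest end); the stamp format is an assumption."""
    starts = [datetime.strptime(s, fmt) for s in start_stamps]
    ends = [datetime.strptime(s, fmt) for s in end_stamps]
    return min(starts), max(ends)
```

Each fragment is scanned once, so the work grows linearly with the total input size rather than with (input size) x (number of variations).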
Re: [testsuite] Don't xfail gcc.dg/binop-xor1.c
On Tue, 4 Feb 2014, Rainer Orth wrote:

> AFAICT the gcc.dg/binop-xor1.c test is XPASSing everywhere since about
> 20131114:

Bah, missing analysis. Everywhere does not include cris-elf, powerpc64-unknown-linux-gnu, m68k-unknown-linux-gnu, s390x-ibm-linux-gnu, powerpc-ibm-aix7.1.0.0.

> XPASS: gcc.dg/binop-xor1.c scan-tree-dump-times optimized ^ 1
>
> To reduce testsuite noise, I'd like to apply the following patch. Tested
> with the appropriate runtest invocations on i386-pc-solaris2.11 and
> x86_64-unknown-linux-gnu. Ok for mainline?
>
> Rainer
>
> 2014-02-04 Rainer Orth r...@cebitec.uni-bielefeld.de
>
> 	* gcc.dg/binop-xor1.c: Don't xfail scan-tree-dump-times.

The XPASS wasn't universal. I opened PR60173.

brgds, H-P
[PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.
Hello,

I’ve noticed that the _mm512_permutexvar_epi[64|32] intrinsics have the wrong argument order. As per [1], the first argument is the index. For vpermps/vpermpd the intrinsics are fine, but I’ve changed the tests to call CALC with the same argument order as the intrinsic. There is the same problem (wrong argument order) with vrcp14s[d|s]. Also, the avx512er-vrcp28ss-2.c test called the wrong intrinsic.

[1] http://software.intel.com/sites/landingpage/IntrinsicsGuide/

gcc/
	* config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap arguments order in builtin.
	(_mm512_permutexvar_epi64): Ditto.
	(_mm512_mask_permutexvar_epi64): Ditto.
	(_mm512_maskz_permutexvar_epi32): Ditto.
	(_mm512_permutexvar_epi32): Ditto.
	(_mm512_mask_permutexvar_epi32): Ditto.
	* config/i386/sse.md (srcp14<mode>): Swap operands.

gcc/testsuite/
	* gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
	* gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
	* gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
	* gcc.target/i386/avx512f-vpermps-2.c: Ditto.
	* gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
	* gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
	* gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

Is it OK for trunk, or should we wait until the 4.9 fork?

--
Thanks, K

---
 gcc/config/i386/avx512fintrin.h                    | 24 +++---
 gcc/config/i386/sse.md                             |  6 +++---
 .../gcc.target/i386/avx512er-vrcp28ss-2.c          |  2 +-
 gcc/testsuite/gcc.target/i386/avx512f-vpermd-2.c   |  2 +-
 gcc/testsuite/gcc.target/i386/avx512f-vpermpd-2.c  |  4 ++--
 gcc/testsuite/gcc.target/i386/avx512f-vpermps-2.c  |  4 ++--
 .../gcc.target/i386/avx512f-vpermq-var-2.c         |  2 +-
 gcc/testsuite/gcc.target/i386/avx512f-vrcp14sd-2.c |  4 ++--
 gcc/testsuite/gcc.target/i386/avx512f-vrcp14ss-2.c |  8
 9 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index d53a40d..b3a4f3a 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -6148,8 +6148,8 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_maskz_permutexvar_epi64 (__mmask8 __M, __m512i __X, __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __X,
-                                                     (__v8di) __Y,
+  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __Y,
+                                                     (__v8di) __X,
                                                      (__v8di)
                                                      _mm512_setzero_si512 (),
                                                      __M);
@@ -6159,8 +6159,8 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_permutexvar_epi64 (__m512i __X, __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __X,
-                                                     (__v8di) __Y,
+  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __Y,
+                                                     (__v8di) __X,
                                                      (__v8di)
                                                      _mm512_setzero_si512 (),
                                                      (__mmask8) -1);
@@ -6171,8 +6171,8 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_permutexvar_epi64 (__m512i __W, __mmask8 __M, __m512i __X,
                                __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __X,
-                                                     (__v8di) __Y,
+  return (__m512i) __builtin_ia32_permvardi512_mask ((__v8di) __Y,
+                                                     (__v8di) __X,
                                                      (__v8di) __W,
                                                      __M);
 }
@@ -6181,8 +6181,8 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_maskz_permutexvar_epi32 (__mmask16 __M, __m512i __X, __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvarsi512_mask ((__v16si) __X,
-                                                     (__v16si) __Y,
+  return (__m512i) __builtin_ia32_permvarsi512_mask ((__v16si) __Y,
+                                                     (__v16si) __X,
                                                      (__v16si)
                                                      _mm512_setzero_si512 (),
                                                      __M);
@@ -6192,8 +6192,8 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_permutexvar_epi32 (__m512i __X, __m512i __Y)
 {
-  return (__m512i) __builtin_ia32_permvarsi512_mask ((__v16si) __X,
-
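
[Editorial sketch, not part of the posted patch.] To see why the argument order matters for a variable permute, here is a small Python model of the operation, assuming the semantics documented in the Intel Intrinsics Guide: the first argument is the index vector and element i of the result is taken from the data vector at position idx[i], using only the low bits of each index. The function name permutexvar and the list representation are illustrative only.

```python
def permutexvar(idx, a):
    """Model of a variable permute: result[i] = a[idx[i] mod len(a)].

    IDX is the index vector (first argument), A is the data vector.
    The length must be a power of two, as for 8 x 64-bit or 16 x 32-bit lanes.
    """
    n = len(a)
    assert len(idx) == n and n & (n - 1) == 0
    # Only the low log2(n) bits of each index are used.
    return [a[i & (n - 1)] for i in idx]

data = [10, 11, 12, 13, 14, 15, 16, 17]
reverse = [7, 6, 5, 4, 3, 2, 1, 0]

reversed_data = permutexvar(reverse, data)   # index first: reverses the data
swapped_args = permutexvar(data, reverse)    # arguments swapped: data used as indices
```

With the arguments swapped, the data values are interpreted as indices, so the result is generally a different vector; that is the kind of mismatch the patch corrects between the intrinsic and the builtin.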
Re: [testsuite] Don't xfail gcc.dg/binop-xor1.c
Hans-Peter Nilsson h...@bitrange.com writes:
> On Tue, 4 Feb 2014, Rainer Orth wrote:
>> AFAICT the gcc.dg/binop-xor1.c test is XPASSing everywhere since about
>> 20131114:
>
> Bah, missing analysis. Everywhere does not include cris-elf,
> powerpc64-unknown-linux-gnu, m68k-unknown-linux-gnu,
> s390x-ibm-linux-gnu, powerpc-ibm-aix7.1.0.0.

Based on this list I'm guessing it's another BRANCH_COST==1 thing, so that we don't convert && and || into & and |? There are a few other similar tests that either XFAIL based on that or force a higher branch cost.

Thanks,
Richard
Re: [testsuite] Don't xfail gcc.dg/binop-xor1.c
Richard Sandiford rsand...@linux.vnet.ibm.com writes:
> Hans-Peter Nilsson h...@bitrange.com writes:
>> On Tue, 4 Feb 2014, Rainer Orth wrote:
>>> AFAICT the gcc.dg/binop-xor1.c test is XPASSing everywhere since about
>>> 20131114:
>>
>> Bah, missing analysis. Everywhere does not include cris-elf,
>> powerpc64-unknown-linux-gnu, m68k-unknown-linux-gnu,
>> s390x-ibm-linux-gnu, powerpc-ibm-aix7.1.0.0.
>
> Based on this list I'm guessing it's another BRANCH_COST==1

BRANCH_COST==1 || !LOGICAL_OP_NON_SHORT_CIRCUIT
[PATCH] Fix Cilk+ ICEs in the alias oracle
Cilk+ builds INDIRECT_REFs when expanding builtins (oops) and thus those can leak into MEM_EXPRs, which will lead to ICEs later. The following patch properly builds a MEM_REF instead. Grepping for INDIRECT_REF I found another suspicious use (just removed, it cannot have triggered and it looks bogus) and the use of a langhook instead of proper GIMPLE interfaces (function also used during expansion).

Bootstrap / testing in progress together with some other stuff.

Ok?

Thanks,
Richard.

2014-02-13 Richard Biener rguent...@suse.de

	* cilk-common.c: Include gimple-expr.h.
	(cilk_arrow): Build a MEM_REF, not an INDIRECT_REF.
	(get_frame_arg): Use middle-end types_compatible_p. Do not strip INDIRECT_REFs.

Index: gcc/cilk-common.c
===
--- gcc/cilk-common.c (revision 207725)
+++ gcc/cilk-common.c (working copy)
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.
 #include "recog.h"
 #include "tree-iterator.h"
 #include "gimplify.h"
+#include "gimple-expr.h"
 #include "cilk.h"
 
 /* This structure holds all the important fields of the internal structures,
@@ -66,8 +67,7 @@ cilk_dot (tree frame, int field_number,
 tree
 cilk_arrow (tree frame_ptr, int field_number, bool volatil)
 {
-  return cilk_dot (fold_build1 (INDIRECT_REF,
-                                TREE_TYPE (TREE_TYPE (frame_ptr)), frame_ptr),
+  return cilk_dot (build_simple_mem_ref (frame_ptr),
                    field_number, volatil);
 }
 
@@ -287,12 +287,11 @@ get_frame_arg (tree call)
 
   argtype = TREE_TYPE (argtype);
 
-  gcc_assert (!lang_hooks.types_compatible_p
-              || lang_hooks.types_compatible_p (argtype, cilk_frame_type_decl));
+  gcc_assert (types_compatible_p (argtype, cilk_frame_type_decl));
 
   /* If it is passed in as an address, then just use the value directly
      since the function is inlined.  */
-  if (TREE_CODE (arg) == INDIRECT_REF || TREE_CODE (arg) == ADDR_EXPR)
+  if (TREE_CODE (arg) == ADDR_EXPR)
     return TREE_OPERAND (arg, 0);
   return arg;
 }
[PATCH][AArch64] vrnd*_f64 patch for stage-1
Hi,

This patch adds the vrnd*_f64 AArch64 intrinsics, together with a testcase for them.

I ran a complete LE and BE regression run with no regressions.

Is the patch OK for stage-1?

2014-02-13 Alex Velenko alex.vele...@arm.com

gcc/
	* config/aarch64/aarch64-builtins.c (BUILTIN_VDQF_DF): Macro added.
	* config/aarch64/aarch64-simd-builtins.def (frintn): Use added macro.
	* config/aarch64/aarch64-simd.md (<frint_pattern>): Comment corrected.
	* config/aarch64/aarch64.md (<frint_pattern>): Likewise.
	* config/aarch64/arm_neon.h (vrnd_f64): Added.
	(vrnda_f64): Likewise.
	(vrndi_f64): Likewise.
	(vrndm_f64): Likewise.
	(vrndn_f64): Likewise.
	(vrndp_f64): Likewise.
	(vrndx_f64): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/vrnd_f64_1.c: New testcase.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index ebab2ce8347a4425977c5cbd0f285c3ff1d9f2f1..7adc5fb96b6473ecde5c4f76973aff68af0ca7d4 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -307,6 +307,8 @@ aarch64_types_store1_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   VAR7 (T, N, MAP, v8qi, v16qi, v4hi, v8hi, v2si, v4si, v2di)
 #define BUILTIN_VDQF(T, N, MAP) \
   VAR3 (T, N, MAP, v2sf, v4sf, v2df)
+#define BUILTIN_VDQF_DF(T, N, MAP) \
+  VAR4 (T, N, MAP, v2sf, v4sf, v2df, df)
 #define BUILTIN_VDQH(T, N, MAP) \
   VAR2 (T, N, MAP, v4hi, v8hi)
 #define BUILTIN_VDQHS(T, N, MAP) \
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index e5f71b479ccfd1a9cbf84aed0f96b49762053f59..09e230c56683a0225f8760472d7137b7bac98297 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -264,7 +264,7 @@
   BUILTIN_VDQF (UNOP, nearbyint, 2)
   BUILTIN_VDQF (UNOP, rint, 2)
   BUILTIN_VDQF (UNOP, round, 2)
-  BUILTIN_VDQF (UNOP, frintn, 2)
+  BUILTIN_VDQF_DF (UNOP, frintn, 2)
 
   /* Implemented by l<fcvt_pattern><su_optab><VQDF:mode><vcvt_target>2.  */
   VAR1 (UNOP, lbtruncv2sf, 2, v2si)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4dffb59e856aeaafb79007255d3b91a73ef1ef13..0c1d7de5b3f4fb0fa8fa226b81ec690d8112b849 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1427,7 +1427,7 @@
 )
 
 ;; Vector versions of the floating-point frint patterns.
-;; Expands to btrunc, ceil, floor, nearbyint, rint, round.
+;; Expands to btrunc, ceil, floor, nearbyint, rint, round, frintn.
 (define_insn "<frint_pattern><mode>2"
   [(set (match_operand:VDQF 0 "register_operand" "=w")
         (unspec:VDQF [(match_operand:VDQF 1 "register_operand" "w")]
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8fcbdcd24a0ea18cc037bef9cf72070281..577aa9fe08bb445e66734bc404e94e13dc1fa65b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3187,7 +3187,7 @@
 ;; ---
 
 ;; frint floating-point round to integral standard patterns.
-;; Expands to btrunc, ceil, floor, nearbyint, rint, round.
+;; Expands to btrunc, ceil, floor, nearbyint, rint, round, frintn.
 
 (define_insn "<frint_pattern><mode>2"
   [(set (match_operand:GPF 0 "register_operand" "=w")
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 6af99361b8e265f66026dc506cfc23f044d153b4..797e37ad638648312ef34bcd63c463e5873c30c4 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -22481,6 +22481,12 @@ vrnd_f32 (float32x2_t __a)
   return __builtin_aarch64_btruncv2sf (__a);
 }
 
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vrnd_f64 (float64x1_t __a)
+{
+  return vset_lane_f64 (__builtin_trunc (vget_lane_f64 (__a, 0)), __a, 0);
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vrndq_f32 (float32x4_t __a)
 {
@@ -22501,6 +22507,12 @@ vrnda_f32 (float32x2_t __a)
   return __builtin_aarch64_roundv2sf (__a);
 }
 
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vrnda_f64 (float64x1_t __a)
+{
+  return vset_lane_f64 (__builtin_round (vget_lane_f64 (__a, 0)), __a, 0);
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vrndaq_f32 (float32x4_t __a)
 {
@@ -22521,6 +22533,12 @@ vrndi_f32 (float32x2_t __a)
   return __builtin_aarch64_nearbyintv2sf (__a);
 }
 
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vrndi_f64 (float64x1_t __a)
+{
+  return vset_lane_f64 (__builtin_nearbyint (vget_lane_f64 (__a, 0)), __a, 0);
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vrndiq_f32 (float32x4_t __a)
 {
@@ -22541,6 +22559,12 @@ vrndm_f32 (float32x2_t __a)
   return __builtin_aarch64_floorv2sf (__a);
 }
 
+__extension__ static __inline float64x1_t
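
[Editorial sketch, not part of the posted patch.] The intrinsics above differ only in rounding mode. The Python functions below model the scalar behaviour for finite inputs. The mappings vrnd to trunc, vrnda to round and vrndm to floor follow the builtins visible in the diff; treating vrndp as ceil and vrndn as round-half-to-even is an assumption based on the usual FRINTP/FRINTN semantics, since that part of the patch is not shown. vrndi and vrndx depend on the current rounding mode and are not modelled. The sign of zero is ignored.

```python
import math

def vrnd(x):
    """Round toward zero, as __builtin_trunc does."""
    return float(math.trunc(x))

def vrnda(x):
    """Round to nearest, ties away from zero, as __builtin_round does."""
    t = math.trunc(x)
    # x - t is the exact fractional part for finite doubles.
    if abs(x - t) >= 0.5:
        t += int(math.copysign(1, x))
    return float(t)

def vrndm(x):
    """Round toward minus infinity (floor)."""
    return float(math.floor(x))

def vrndp(x):
    """Round toward plus infinity (ceil); mapping assumed, see lead-in."""
    return float(math.ceil(x))

def vrndn(x):
    """Round to nearest, ties to even; Python 3's round() behaves this way."""
    return float(round(x))
```

The tie cases are where the modes diverge: 2.5 goes to 2.0 under ties-to-even but to 3.0 under ties-away, and -2.5 goes to -2.0 when truncating but to -3.0 when flooring.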
Re: [PATCH] Fix compress_float_constants related ICE (PR target/43546)
> 2014-02-12 Jakub Jelinek ja...@redhat.com
>
> 	PR target/43546
> 	* expr.c (compress_float_constant): If x is a hard register,
> 	extend into a pseudo and then move to x.
>
> 	* gcc.target/i386/pr43546.c: New test.

OK, thanks.

--
Eric Botcazou
Re: [patch] Fix wrong code with VCE to bit-field type at -O
On Wed, Feb 12, 2014 at 6:51 PM, Eric Botcazou ebotca...@adacore.com wrote:
>> I am not sure how to deal with this, given that we have mismatched
>> V_C_Es anyway, I'm inclined not to care and let the expander deal with
>> it. But at the same time I understand that it is ugly and will certainly
>> cause somebody more headache in the future. I suppose that not
>> scalarizing here might hurt performance and would be frowned upon at
>> the very least. If the fields bigger than the record approach is the
>> standard way of doing this, perhaps SRA can detect such cases and
>> produce these strange COMPONENT_REFs instead, but is it so?
>
> You may remember that we went that way before (building a COMPONENT_REF
> for bit-fields instead of fully lowering the access) so doing it again
> would be a step backwards. Likewise if we refuse to scalarize. So IMO
> it's either low-level fiddling in SRA or in the expander (my preference
> too).

Ok, I've looked at the testcase and I suppose the following change is what triggers the bug:

   <bb 11>:
   _56 = m.P_ARRAY;
-  my_rec2.r1 = VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb: _3 sz: 1});
-  _58 = my_rec2.r1.f;
+  _51 = VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1 ...]{lb: _3 sz: 1});
+  my_rec2$r1$f_43 = _51;
+  _58 = my_rec2$r1$f_43;
   if (_58 11059199)

I observe that SRA modifies an existing but not replaced memory reference (something I always thought is asking for trouble). It changes VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb: _3 sz: 1}); to VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1 ...]{lb: _3 sz: 1});.

Created a replacement for my_rec2 offset: 128, size: 24: my_rec2$r1$f

Access trees for my_rec2 (UID: 2659):
access { base = (2659)'my_rec2', offset = 128, size = 24, expr = my_rec2.r1.f, type = opt31__time_t___XDLU_0__11059199, grp_read = 1, grp_write = 1, grp_assignment_read = 1, grp_assignment_write = 1, grp_scalar_read = 1, grp_scalar_write = 0, grp_total_scalarization = 0, grp_hint = 0, grp_covered = 1, grp_unscalarizable_region = 0, grp_unscalarized_data = 0, grp_partial_lhs = 0, grp_to_be_replaced = 1, grp_to_be_debug_replaced = 0, grp_maybe_modified = 0, grp_not_necessarilly_dereferenced = 0

but obviously 'type' doesn't agree with 'size' here. In other places we disqualify exprs using VIEW_CONVERT_EXPRs but apparently only for the candidate itself, not for stuff assigned to it (though I never understood why disqualifying was necessary at all for VIEW_CONVERT_EXPRs).

We are using the type of a bitfield field for the replacement, which we IMHO should avoid because the FIELD_DECL's size is 24 but the field type's TYPE_SIZE is 32 (its precision is 24). That's all not an issue until you start to VIEW_CONVERT to such a type (VIEW_CONVERT being a reference op just cares for size, not precision). Other ops are treated correctly by expansion.

Now - using a non-mode precision integer type as scalar replacement isn't going to produce great code and, as we can see, has issues when using VIEW_CONVERT_EXPRs. SRA should either avoid this transform or fix up by VIEW_CONVERTing memory reads only to mode-precision integer types and then inserting a fixup cast.

The direct VIEW_CONVERSION it creates, from

  my_rec2.r1 = VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb: _3 sz: 1});
  _58 = my_rec2.r1.f;

to basically

  _58 = VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1 ...]{lb: _3 sz: 1});

is simply wrong.

If you fix expansion then consider a nested VIEW_CONVERT_EXPR that views back to the aggregate type - is that now supposed to clear the upper 8 bits because of the VIEW_CONVERT_EXPR in the middle? Not so. So fixing VIEW_CONVERT_EXPR sounds conceptually wrong to me.

Not scalarizing a field to a DECL_BIT_FIELD FIELD_DECL's type looks like the best fix to me.

Richard.
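
[Editorial sketch, not part of the thread.] The size versus precision mismatch discussed above can be illustrated with plain byte manipulation. The sketch assumes a little-endian layout in which a 24-bit field is followed directly by one unrelated byte; the function names and the memory layout are hypothetical and only model the idea that a size-based reinterpretation reads 32 bits while a bit-field read keeps 24.

```python
PRECISION = 24   # bits the bit-field value may use
TYPE_SIZE = 32   # bits occupied by the field's type

def view_convert(memory, offset):
    """Reinterpret TYPE_SIZE bits of MEMORY as an integer, without masking."""
    return int.from_bytes(memory[offset:offset + TYPE_SIZE // 8], "little")

def bitfield_read(memory, offset):
    """Read the same location but keep only PRECISION bits."""
    return view_convert(memory, offset) & ((1 << PRECISION) - 1)

# 11059199 == 0xA8BFFF, stored in three bytes, followed by a neighbouring
# byte (0x5A) that belongs to some other field.
memory = bytes([0xFF, 0xBF, 0xA8, 0x5A])

as_field = bitfield_read(memory, 0)
as_view = view_convert(memory, 0)
```

Here as_field is within the type's range while as_view also contains the neighbouring byte in its upper 8 bits, so any later comparison against the bound sees a different value.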
Re: [PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.
On Thu, Feb 13, 2014 at 11:44 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

> I've noticed that the _mm512_permutexvar_epi[64|32] intrinsics have the
> wrong argument order. As per [1], the first argument is the index. For
> vpermps/vpermpd the intrinsics are fine, but I've changed the tests to
> call CALC with the same argument order as the intrinsic. There is the
> same problem (wrong argument order) with vrcp14s[d|s]. Also, the
> avx512er-vrcp28ss-2.c test called the wrong intrinsic.
>
> [1] http://software.intel.com/sites/landingpage/IntrinsicsGuide/
>
> gcc/
> 	* config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap arguments order in builtin.
> 	(_mm512_permutexvar_epi64): Ditto.
> 	(_mm512_mask_permutexvar_epi64): Ditto.
> 	(_mm512_maskz_permutexvar_epi32): Ditto.
> 	(_mm512_permutexvar_epi32): Ditto.
> 	(_mm512_mask_permutexvar_epi32): Ditto.
> 	* config/i386/sse.md (srcp14<mode>): Swap operands.
>
> gcc/testsuite/
> 	* gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
> 	* gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
> 	* gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
> 	* gcc.target/i386/avx512f-vpermps-2.c: Ditto.
> 	* gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
> 	* gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
> 	* gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index a04b289..d3b2dc5 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1456,12 +1456,12 @@
>    [(set (match_operand:VF_128 0 "register_operand" "=v")
>          (vec_merge:VF_128
>            (unspec:VF_128
> -            [(match_operand:VF_128 1 "nonimmediate_operand" "vm")]
> +            [(match_operand:VF_128 2 "nonimmediate_operand" "vm")]
>              UNSPEC_RCP14)
> -          (match_operand:VF_128 2 "register_operand" "v")
> +          (match_operand:VF_128 1 "register_operand" "v")
>            (const_int 1)))]
>    "TARGET_AVX512F"
> -  "vrcp14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}"
> +  "vrcp14<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"

Please don't change the srcp pattern; it should be defined similarly to vrcpss (aka sse_vmrcpv4sf). You need to switch the operand order elsewhere.

Other than that, the patch is OK.

Uros.
Re: [PATCH] S390: Add test for hotpatching of nested functions
> 2014-02-13 Dominik Vogt v...@linux.vnet.ibm.com
>
> 	* gcc.target/s390/hotpatch-compile-8.c: New test

Ok committed. Thanks!

-Andreas-
Re: [PATCH] (gcc-4.8) S390: Fix crash with -mhotpatch and gfortran
> 2014-02-12 Dominik Vogt v...@linux.vnet.ibm.com
>
> 	* config/s390/s390.c (s390_asm_output_function_label): fix crash
> 	caused by bad second argument to warning_at() with -mhotpatch and
> 	nested functions (e.g. with gfortran)

Applied. Thanks!

-Andreas-
Re: [PATCH] S390: Fix crash with -mhotpatch and gfortran
> 2014-02-12 Dominik Vogt v...@linux.vnet.ibm.com
>
> 	* config/s390/s390.c (s390_asm_output_function_label): fix crash
> 	caused by bad second argument to warning_at() with -mhotpatch and
> 	nested functions (e.g. with gfortran)

Applied. Thanks!

-Andreas-
Re: [PATCH] Fix Cilk+ ICEs in the alias oracle
On Thu, 13 Feb 2014, Richard Biener wrote:

> Cilk+ builds INDIRECT_REFs when expanding builtins (oops) and thus
> those can leak into MEM_EXPRs, which will lead to ICEs later. The
> following patch properly builds a MEM_REF instead. Grepping for
> INDIRECT_REF I found another suspicious use (just removed, it cannot
> have triggered and it looks bogus) and the use of a langhook instead
> of proper GIMPLE interfaces (function also used during expansion).
>
> Bootstrap / testing in progress together with some other stuff.
>
> Ok?

Btw, this exposes that Cilk+ is LTO-ignorant - it doesn't properly register its global trees (bah, more global trees...). So the types_compatible_p call ICEs. Trying to process them in lto/lto.c:read_cgraph_and_symbols doesn't seem to work though. So I'm opting to remove the assert and leave fixing LTO for somebody who cares about Cilk+.

This simplifies the patch as follows; bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2014-02-13 Richard Biener rguent...@suse.de

	* cilk-common.c (cilk_arrow): Build a MEM_REF, not an INDIRECT_REF.
	(get_frame_arg): Drop the assert with langhook types_compatible_p. Do not strip INDIRECT_REFs.

Index: gcc/cilk-common.c
===
--- gcc/cilk-common.c (revision 207725)
+++ gcc/cilk-common.c (working copy)
@@ -66,8 +66,7 @@ cilk_dot (tree frame, int field_number,
 tree
 cilk_arrow (tree frame_ptr, int field_number, bool volatil)
 {
-  return cilk_dot (fold_build1 (INDIRECT_REF,
-                                TREE_TYPE (TREE_TYPE (frame_ptr)), frame_ptr),
+  return cilk_dot (build_simple_mem_ref (frame_ptr),
                    field_number, volatil);
 }
 
@@ -287,12 +286,9 @@ get_frame_arg (tree call)
 
   argtype = TREE_TYPE (argtype);
 
-  gcc_assert (!lang_hooks.types_compatible_p
-              || lang_hooks.types_compatible_p (argtype, cilk_frame_type_decl));
-
   /* If it is passed in as an address, then just use the value directly
      since the function is inlined.  */
-  if (TREE_CODE (arg) == INDIRECT_REF || TREE_CODE (arg) == ADDR_EXPR)
+  if (TREE_CODE (arg) == ADDR_EXPR)
     return TREE_OPERAND (arg, 0);
   return arg;
 }
Re: [PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.
On Thu, Feb 13, 2014 at 1:37 PM, Uros Bizjak ubiz...@gmail.com wrote:

>> I've noticed that the _mm512_permutexvar_epi[64|32] intrinsics have the
>> wrong argument order. As per [1], the first argument is the index. For
>> vpermps/vpermpd the intrinsics are fine, but I've changed the tests to
>> call CALC with the same argument order as the intrinsic. There is the
>> same problem (wrong argument order) with vrcp14s[d|s]. Also, the
>> avx512er-vrcp28ss-2.c test called the wrong intrinsic.
>>
>> [1] http://software.intel.com/sites/landingpage/IntrinsicsGuide/
>>
>> gcc/
>> 	* config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap arguments order in builtin.
>> 	(_mm512_permutexvar_epi64): Ditto.
>> 	(_mm512_mask_permutexvar_epi64): Ditto.
>> 	(_mm512_maskz_permutexvar_epi32): Ditto.
>> 	(_mm512_permutexvar_epi32): Ditto.
>> 	(_mm512_mask_permutexvar_epi32): Ditto.
>> 	* config/i386/sse.md (srcp14<mode>): Swap operands.
>>
>> gcc/testsuite/
>> 	* gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
>> 	* gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
>> 	* gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
>> 	* gcc.target/i386/avx512f-vpermps-2.c: Ditto.
>> 	* gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
>> 	* gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
>> 	* gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.
>
>> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
>> index a04b289..d3b2dc5 100644
>> --- a/gcc/config/i386/sse.md
>> +++ b/gcc/config/i386/sse.md
>> @@ -1456,12 +1456,12 @@
>>    [(set (match_operand:VF_128 0 "register_operand" "=v")
>>          (vec_merge:VF_128
>>            (unspec:VF_128
>> -            [(match_operand:VF_128 1 "nonimmediate_operand" "vm")]
>> +            [(match_operand:VF_128 2 "nonimmediate_operand" "vm")]
>>              UNSPEC_RCP14)
>> -          (match_operand:VF_128 2 "register_operand" "v")
>> +          (match_operand:VF_128 1 "register_operand" "v")
>>            (const_int 1)))]
>>    "TARGET_AVX512F"
>> -  "vrcp14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}"
>> +  "vrcp14<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
>
> Please don't change the srcp pattern; it should be defined similarly to
> vrcpss (aka sse_vmrcpv4sf). You need to switch the operand order
> elsewhere.

No, you are correct. Operands should be swapped as in your patch.

The patch is OK for mainline.

Thanks,
Uros.
Re: [PATCH 4/6] [GOMP4] OpenACC 1.0+ support in fortran front-end
Hi Thomas! Thanks a lot for your review! I agree with all your notes. On 11.02.2014 20:51, Thomas Schwinge wrote: For ChangeLog files updates (on gomp-4_0-branch, use the respective ChangeLog.gomp files, by the way), should just you be listed as the author, or also your colleagues? Thank you for the notice, I added Evgeny and Dmitry as authors for this part (see attached ChangeLog entry). With these issues addressed, this patch is ready for commit to gomp-4_0-branch. Use your own judgement; if you feel confident, just commit it, or otherwise post it again for a final review -- as you prefer. I fixed patch according to your review and ready to commit it. OK for GOMP4 branch? -- Ilmir. From bf14158b1a28c2c5b29c41071fa62c011d9f4f65 Mon Sep 17 00:00:00 2001 From: Ilmir Usmanov i.usma...@samsung.com Date: Thu, 13 Feb 2014 15:58:28 +0400 Subject: [PATCH] OpenACC GENERIC nodes --- gcc/doc/generic.texi| 45 ++ gcc/gimplify.c | 62 + gcc/omp-low.c | 96 -- gcc/tree-core.h | 61 ++--- gcc/tree-pretty-print.c | 119 gcc/tree.c | 44 +- gcc/tree.def| 42 + gcc/tree.h | 61 - 8 files changed, 507 insertions(+), 23 deletions(-) diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi index a56715b..ce14620 100644 --- a/gcc/doc/generic.texi +++ b/gcc/doc/generic.texi @@ -2052,6 +2052,15 @@ edge. Rethrowing the exception is represented using @code{RESX_EXPR}. @node OpenMP @subsection OpenMP @tindex OACC_PARALLEL +@tindex OACC_KERNELS +@tindex OACC_DATA +@tindex OACC_HOST_DATA +@tindex OACC_DECLARE +@tindex OACC_UPDATE +@tindex OACC_ENTER_DATA +@tindex OACC_EXIT_DATA +@tindex OACC_WAIT +@tindex OACC_CACHE @tindex OMP_PARALLEL @tindex OMP_FOR @tindex OMP_SECTIONS @@ -2073,6 +2082,42 @@ clauses used by the OpenMP API @w{@uref{http://www.openmp.org/}}. Represents @code{#pragma acc parallel [clause1 @dots{} clauseN]}. +@item OACC_KERNELS + +Represents @code{#pragma acc kernels [clause1 @dots{} clauseN]}. + +@item OACC_DATA + +Represents @code{#pragma acc data [clause1 @dots{} clauseN]}. 
+ +@item OACC_HOST_DATA + +Represents @code{#pragma acc host_data [clause1 @dots{} clauseN]}. + +@item OACC_DECLARE + +Represents @code{#pragma acc declare [clause1 @dots{} clauseN]}. + +@item OACC_UPDATE + +Represents @code{#pragma acc update [clause1 @dots{} clauseN]}. + +@item OACC_ENTER_DATA + +Represents @code{#pragma acc enter data [clause1 @dots{} clauseN]}. + +@item OACC_EXIT_DATA + +Represents @code{#pragma acc exit data [clause1 @dots{} clauseN]}. + +@item OACC_WAIT + +Represents @code{#pragma acc wait [(num @dots{})]}. + +@item OACC_CACHE + +Represents @code{#pragma acc cache (var @dots{})}. + @item OMP_PARALLEL Represents @code{#pragma omp parallel [clause1 @dots{} clauseN]}. It diff --git a/gcc/gimplify.c b/gcc/gimplify.c index d20f07f..06d7790 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -4333,6 +4333,15 @@ is_gimple_stmt (tree t) case ASM_EXPR: case STATEMENT_LIST: case OACC_PARALLEL: +case OACC_KERNELS: +case OACC_DATA: +case OACC_HOST_DATA: +case OACC_DECLARE: +case OACC_UPDATE: +case OACC_ENTER_DATA: +case OACC_EXIT_DATA: +case OACC_WAIT: +case OACC_CACHE: case OMP_PARALLEL: case OMP_FOR: case OMP_SIMD: @@ -6157,6 +6166,23 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, remove = true; break; + case OMP_CLAUSE_HOST: + case OMP_CLAUSE_OACC_DEVICE: + case OMP_CLAUSE_DEVICE_RESIDENT: + case OMP_CLAUSE_USE_DEVICE: + case OMP_CLAUSE_GANG: + case OMP_CLAUSE_WAIT: + case OMP_NO_CLAUSE_CACHE: + case OMP_CLAUSE_INDEPENDENT: + case OMP_CLAUSE_ASYNC: + case OMP_CLAUSE_WORKER: + case OMP_CLAUSE_VECTOR: + case OMP_CLAUSE_NUM_GANGS: + case OMP_CLAUSE_NUM_WORKERS: + case OMP_CLAUSE_VECTOR_LENGTH: + remove = true; + break; + case OMP_CLAUSE_NOWAIT: case OMP_CLAUSE_ORDERED: case OMP_CLAUSE_UNTIED: @@ -6498,6 +6524,20 @@ gimplify_adjust_omp_clauses (tree *list_p) case OMP_CLAUSE_DEPEND: break; + case OMP_CLAUSE_HOST: + case OMP_CLAUSE_OACC_DEVICE: + case OMP_CLAUSE_DEVICE_RESIDENT: + case OMP_CLAUSE_USE_DEVICE: + case OMP_CLAUSE_GANG: + case 
OMP_CLAUSE_WAIT: + case OMP_NO_CLAUSE_CACHE: + case OMP_CLAUSE_INDEPENDENT: + case OMP_CLAUSE_ASYNC: + case OMP_CLAUSE_WORKER: + case OMP_CLAUSE_VECTOR: + case OMP_CLAUSE_NUM_GANGS: + case OMP_CLAUSE_NUM_WORKERS: + case OMP_CLAUSE_VECTOR_LENGTH: default: gcc_unreachable (); } @@ -7988,6 +8028,19 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, ret = GS_ALL_DONE; break; + case OACC_KERNELS: + case OACC_DATA: + case OACC_HOST_DATA: + case OACC_DECLARE: + case OACC_UPDATE: + case OACC_ENTER_DATA: + case OACC_EXIT_DATA: + case OACC_WAIT: + case OACC_CACHE:
[PATCH] Update isl/cloog recommended versions
This updates the recommended versions to match those I just put at ftp://gcc.gnu.org/pub/gcc/infrastructure/. It also mentions the possibility of doing in-tree builds and fixes PR59878 by re-wording the cloog install parts. Committed. Richard. 2014-02-13 Richard Biener rguent...@suse.de PR bootstrap/59878 * doc/install.texi (ISL): Update recommended version to 0.12.2, mention the possibility of an in-tree build. (CLooG): Update recommended version to 0.18.1, mention the possibility of an in-tree build and clarify that the ISL bundled with CLooG does not work. Index: gcc/doc/install.texi === *** gcc/doc/install.texi(revision 207725) --- gcc/doc/install.texi(working copy) *** installed but it is not in your default *** 383,407 @option{--with-mpc} configure option should be used. See also @option{--with-mpc-lib} and @option{--with-mpc-include}. ! @item ISL Library version 0.11.1 Necessary to build GCC with the Graphite loop optimizations. It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} ! as @file{isl-0.11.1.tar.bz2}. ! The @option{--with-isl} configure option should be used if ISL is not ! installed in your default library search path. ! ! @item CLooG 0.18.0 Necessary to build GCC with the Graphite loop optimizations. It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} as ! @file{cloog-0.18.0.tar.gz}. The @option{--with-cloog} configure option should ! be used if CLooG is not installed in your default library search path. ! CLooG needs to be built against ISL 0.11.1. Use @option{--with-isl=system} ! to direct CLooG to pick up an already installed ISL, otherwise it will use ! ISL 0.11.1 as bundled with CLooG. CLooG needs to be configured to use GMP ! internally, use @option{--with-bits=gmp} to direct it to do that. @end table --- 383,412 @option{--with-mpc} configure option should be used. See also @option{--with-mpc-lib} and @option{--with-mpc-include}. ! 
@item ISL Library version 0.12.2 Necessary to build GCC with the Graphite loop optimizations. It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} ! as @file{isl-0.12.2.tar.bz2}. If an ISL source distribution is found ! in a subdirectory of your GCC sources named @file{isl}, it will be ! built together with GCC. Alternatively, the @option{--with-isl} configure ! option should be used if ISL is not installed in your default library ! search path. ! @item CLooG 0.18.1 Necessary to build GCC with the Graphite loop optimizations. It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/} as ! @file{cloog-0.18.1.tar.gz}. If a CLooG source distribution is found ! in a subdirectory of your GCC sources named @file{cloog}, it will be ! built together with GCC. Alternatively, the @option{--with-cloog} configure ! option should be used if CLooG is not installed in your default library search ! path. ! ! If you want to install CLooG separately it needs to be built against ! ISL 0.12.2 by using the @option{--with-isl=system} to direct CLooG to pick ! up an already installed ISL. Using the ISL library as bundled with CLooG ! is not supported. @end table
Re: [PATCH] Update isl/cloog recommended versions
On Thu, 13 Feb 2014, Richard Biener wrote:

This updates the recommended versions to match those I just put at ftp://gcc.gnu.org/pub/gcc/infrastructure/. It also mentions the possibility of doing in-tree builds and fixes PR59878 by re-wording the cloog install parts.

And this updates download_prerequisites.

Committed.

Richard.

2014-02-13 Richard Biener rguent...@suse.de

* download_prerequisites: Update ISL and CLOOG versions.

Index: contrib/download_prerequisites
===
--- contrib/download_prerequisites (revision 207757)
+++ contrib/download_prerequisites (working copy)
@@ -43,8 +43,8 @@ ln -sf $MPC mpc || exit 1

 # Necessary to build GCC with the Graphite loop optimizations.
 if [ $GRAPHITE_LOOP_OPT = yes ] ; then
-  ISL=isl-0.11.1
-  CLOOG=cloog-0.18.0
+  ISL=isl-0.12.2
+  CLOOG=cloog-0.18.1

   wget ftp://gcc.gnu.org/pub/gcc/infrastructure/$ISL.tar.bz2 || exit 1
   tar xjf $ISL.tar.bz2 || exit 1
Re: [PING][PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope
Hello,

Pinging this patch review request. Can someone involved in the Objective-C language frontend have a quick look at the description of the proposed features and tell me if it'd be ok to have them in the trunk so I can go ahead and create proper patches?

Thanks, Dimitris

On 02/06/2014 11:25 AM, Dimitris Papavasiliou wrote:

Hello,

This is a patch regarding a couple of Objective-C related dialect options and warning switches. I have already submitted it a while ago but gave up after pinging a couple of times. I am now informed that I should have kept pinging until I got someone's attention, so I'm resending it.

The patch is now against an old revision and, as I stated originally, it's probably not in a state that can be adopted as is. I'm sending it as is so that the implemented features can be assessed in terms of their usefulness, and if they're welcome I'd be happy to make any necessary changes to bring it up-to-date, split it into smaller patches, add test-cases and anything else that is deemed necessary.

Here's the relevant text from my initial message:

Two of these switches are related to a feature request I submitted a while ago, Bug 56044 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56044). I won't reproduce the entire argument here since it is available in the feature request. The relevant functionality in the patch comes in the form of two switches:

-Wshadow-ivars, which controls the "local declaration of ‘somevar’ hides instance variable" warning, which curiously is enabled by default instead of being controlled at least by -Wshadow. The patch changes it so that this warning can be enabled and disabled specifically through -Wshadow-ivars, as well as together with all other shadowing-related warnings through -Wshadow. The reason for the extra switch is that, while searching through the Internet for a solution to this problem, I have found out that other people are inconvenienced by this particular warning as well, so it might be useful to be able to turn it off while keeping all the other shadowing-related warnings enabled.

-flocal-ivars, which when true, as it is by default, treats instance variables as having local scope. If false (-fno-local-ivars), instance variables must always be referred to as self->ivarname, and references to ivarname resolve to the local or global scope as usual.

I've also taken the opportunity of adding another switch unrelated to the above but related to instance variables:

-fivar-visibility, which can be set to one of private, protected (the default), public or package. This sets the default instance variable visibility, which normally is implicitly protected. My use-case for it is basically to be able to set it to public and thus effectively disable this visibility mechanism altogether, which I find no use for and therefore have to circumvent. I'm not sure if anyone else feels the same way towards this but I figured it was worth a try.

I'm attaching a preliminary patch against the current revision in case anyone wants to have a look. The changes are very small and any blatant mistakes should be immediately obvious. I have to admit to having virtually no knowledge of the internals of GCC, but I have tried to keep in line with formatting guidelines and general style, as well as looking up the particulars of the way options are handled in the available documentation to avoid blind copy-pasting. I have also tried to test the functionality both in my own (relatively large, or at least not too small) project and with small test programs, and everything works as expected. Finally, I tried running the tests too, but these fail to complete both in the patched and unpatched version, possibly due to the way I've configured GCC.

Dimitris
[PATCH, ARM] Skip pr59858.c test for -mfloat-abi=hard
Hi,

The pr59858.c testcase explicitly sets -msoft-float, which is incompatible with our -mfloat-abi=hard variant. This test therefore should not be run if you have -mfloat-abi=hard.

Tested with both variations for arm-none-eabi build.

OK for commit?

Cheers, Ian

2014-02-13 Ian Bolton ian.bol...@arm.com

testsuite/
* gcc.target/arm/pr59858.c: Skip test if -mfloat-abi=hard.

diff --git a/gcc/testsuite/gcc.target/arm/pr59858.c b/gcc/testsuite/gcc.target/arm/pr59858.c
index 463bd38..1e03203 100644
--- a/gcc/testsuite/gcc.target/arm/pr59858.c
+++ b/gcc/testsuite/gcc.target/arm/pr59858.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options -march=armv5te -marm -mthumb-interwork -Wall -Wstrict-prototypes -Wstrict-aliasing -funsigned-char -fno-builtin -fno-asm -msoft-float -std=gnu99 -mlittle-endian -mthumb -fno-stack-protector -Os -g -feliminate-unused-debug-types -funit-at-a-time -fmerge-all-constants -fstrict-aliasing -fno-tree-loop-optimize -fno-tree-dominator-opts -fno-strength-reduce -fPIC -w } */
+/* { dg-skip-if Test is not compatible with hard-float { *-*-* } { -mfloat-abi=hard } { } } */

 typedef enum {
  REG_ENOSYS = -1,
Re: [PATCH 4/6] [GOMP4] OpenACC 1.0+ support in fortran front-end
Hi Ilmir!

On Thu, 13 Feb 2014 17:15:47 +0400, Ilmir Usmanov i.usma...@samsung.com wrote:

I fixed patch according to your review and ready to commit it. OK for GOMP4 branch?

Yes! :-) Congratulations, and thanks for promptly addressing the issues raised during review. I'm aware this can be a bit of a boring or tedious process, but in the end, the code quality will be higher (well, that's the idea about code review), and certainly you'll have learned some things, too (and I have, too), and so next time this process will likely be faster.

Only a few minor comments about the ChangeLog formatting:

13-02-2014 Ilmir Usmanov i.usma...@samsung.com

YYYY-MM-DD is the format used in ChangeLogs.

Add OpenACC 1.0 support to GENERIC, except loop directive and subarrays.
Dmitry Bocharnikov dmitr...@samsung.com
Evgeny Gavrin e.gav...@samsung.com
Ilmir Usmanov i.usma...@samsung.com

For multiple authors, do it like this:

2014-02-13 Ilmir Usmanov i.usma...@samsung.com
           Dmitry Bocharnikov dmitr...@samsung.com
           Evgeny Gavrin e.gav...@samsung.com

| gcc/
| * gimplify.c (is_gimple_stmt): Stub OpenACC directives and clauses.
| (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Likewise.
| (gimplify_expr): Likewise.

(I don't care, but) you can also do it as follows, a bit simpler:

* [file] ([item 1], [item 2], [...]): [text].

| * tree-core.h
| (OMP_CLAUSE_HOST, OMP_CLAUSE_OACC_DEVICE, OMP_CLAUSE_DEVICE_RESIDENT,
| OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE_GANG, OMP_CLAUSE_WAIT,
| OMP_NO_CLAUSE_CACHE, OMP_CLAUSE_INDEPENDENT, OMP_CLAUSE_ASYNC,
| OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR, OMP_CLAUSE_NUM_GANGS,
| OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH): New clauses.

As the enum omp_clause_code is the thing that you modify, that would be:

* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_HOST, [...].

Or, as other people do:

* tree-core.h (omp_clause_code): Add OMP_CLAUSE_HOST, [...].

Grüße, Thomas
Re: RFA: one more version of patch for PR59535
On 11/02/14 19:43, Vladimir Makarov wrote:

This is one more version of the patch to fix the PR59535 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

Here are the results of applying the patch:

                         Thumb     Thumb2
reload                   2626334   2400154
lra (before the patch)   2665749   2414926
lra (after the patch)    2626334   2397132

I already wrote that the change in arm.h is to prevent reloading sp as an address by LRA. Reload has no such problem as it uses legitimate address hook and LRA mostly relies on base_reg_class.

Richard, I need an approval for this change.

2014-02-11 Vladimir Makarov vmaka...@redhat.com

PR rtl-optimization/59535
* lra-constraints.c (process_alt_operands): Encourage alternative when unassigned pseudo class is superset of the alternative class.
(inherit_reload_reg): Don't inherit when optimizing for code size.
* config/arm/arm.h (MODE_BASE_REG_CLASS): Return CORE_REGS for Thumb2 and BASE_REGS for modes not less than 4 for LRA.

Index: config/arm/arm.h
===
--- config/arm/arm.h (revision 207562)
+++ config/arm/arm.h (working copy)
@@ -1272,8 +1272,10 @@ enum reg_class
    when addressing quantities in QI or HI mode; if we don't know the
    mode, then we must be conservative.  */
 #define MODE_BASE_REG_CLASS(MODE) \
-(TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS : \
- (((MODE) == SImode) ? BASE_REGS : LO_REGS))
+(TARGET_ARM || (TARGET_THUMB2 && (!optimize_size || arm_lra_flag)) \
+ ? CORE_REGS : ((MODE) == SImode \
+|| (arm_lra_flag && GET_MODE_SIZE (MODE) >= 4) \
+? BASE_REGS : LO_REGS))

 /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
    instead of BASE_REGS.  */

Awesome. Thanks, Vladimir.

I find that while I can't convince myself that the logic in the change to MODE_BASE_REG_CLASS is wrong, it's very hard to follow. Furthermore, when we come to rip out the old reload code it will be quite prone to getting this wrong. I think restructuring this along the lines of:

#define MODE_BASE_REG_CLASS(MODE) \
  (arm_lra_flag \
   ? (TARGET_32BIT ? CORE_REGS \
      : GET_MODE_SIZE (MODE) >= 4 ? BASE_REGS \
      : LO_REGS) \
   : ((TARGET_ARM || (TARGET_THUMB2 && !optimize_size)) ? CORE_REGS \
      : ((MODE) == SImode) ? BASE_REGS \
      : LO_REGS))

is both easier to understand and easier to simplify later when reload goes away.

I'll run a regression test on this and let you know the results.

R.
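As a rough cross-check of the two forms of MODE_BASE_REG_CLASS discussed above, the conditions can be modelled outside GCC and compared exhaustively. This is only a sketch under stated assumptions: the `&&` and `>=` operators are read back into the flattened macro text, TARGET_32BIT is taken to mean "ARM or Thumb-2", and the mode list, target names and function names are made up for illustration. It says nothing about the regression test results mentioned in the mail.

```python
from itertools import product

# Hypothetical stand-ins for GCC's register classes.
CORE_REGS, BASE_REGS, LO_REGS = "CORE_REGS", "BASE_REGS", "LO_REGS"

# (name, size in bytes) pairs standing in for machine modes; SImode is 4 bytes.
MODES = [("QImode", 1), ("HImode", 2), ("SImode", 4), ("SFmode", 4), ("DImode", 8)]


def posted_form(target, optimize_size, lra, mode):
    """Model of the macro as posted in the patch (operators assumed)."""
    name, size = mode
    if target == "arm" or (target == "thumb2" and (not optimize_size or lra)):
        return CORE_REGS
    if name == "SImode" or (lra and size >= 4):
        return BASE_REGS
    return LO_REGS


def restructured_form(target, optimize_size, lra, mode):
    """Model of the restructured macro suggested in the reply."""
    name, size = mode
    target_32bit = target in ("arm", "thumb2")  # assumed meaning of TARGET_32BIT
    if lra:
        if target_32bit:
            return CORE_REGS
        return BASE_REGS if size >= 4 else LO_REGS
    if target == "arm" or (target == "thumb2" and not optimize_size):
        return CORE_REGS
    return BASE_REGS if name == "SImode" else LO_REGS


def find_mismatches():
    """Return every flag/mode combination on which the two models disagree."""
    combos = product(("arm", "thumb1", "thumb2"), (False, True), (False, True), MODES)
    return [c for c in combos if posted_form(*c) != restructured_form(*c)]


if __name__ == "__main__":
    print(find_mismatches())
```

Under these assumptions the two models agree on every combination, because SImode already satisfies the "size at least 4" test. The comparison is only as good as the model, so it does not replace testing the real macro.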
Re: [patch] Fix wrong code with VCE to bit-field type at -O
We are using the type of a bitfield field for the replacement which we IMHO should avoid because the FIELD_DECL's size is 24 but the field's type TYPE_SIZE is 32 (its precision is 24). That's all not an issue until you start to VIEW_CONVERT to such type (VIEW_CONVERT being a reference op just cares for size not precision). Other ops are treated correctly by expansion.

Now - using a non-mode precision integer type as scalar replacement isn't going to produce great code and, as we can see, has issues when using VIEW_CONVERT_EXPRs. SRA should either avoid this transform or fixup by VIEW_CONVERTing memory reads only to mode-precision integer types and then inserting a fixup cast.

The direct view conversion it creates, from

my_rec2.r1 = VIEW_CONVERT_EXPR<struct opt31__rec1>(*_56[1 ...]{lb: _3 sz: 1});
_58 = my_rec2.r1.f;

to basically

_58 = VIEW_CONVERT_EXPR<opt31__time_t___XDLU_0__11059199>(*_56[1 ...]{lb: _3 sz: 1});

is simply wrong.

There is nothing obvious I think, i.e. that's debatable. I agree that a VCE from a 32-bit object to a 32-bit integer with 24-bit precision should not clear the upper 8 bits (so the REDUCE_BIT_FIELD part of my patch is wrong). But here we have a VCE from a 24-bit object to a 32-bit integer with 24-bit precision which reads *more bits* than the size of the source type; that I think is plain wrong and is fixed by the bit-field extraction in the patch.

If you fix expansion then consider a nested VIEW_CONVERT_EXPR that views back to the aggregate type - is that now supposed to clear the upper 8 bits because of the VIEW_CONVERT_EXPR in the middle? Not so. So fixing VIEW_CONVERT_EXPR sounds conceptually wrong to me.

I agree that we need not clear, but we need to prevent the expansion from reading more bits than what is contained in the source type. And this is sufficient to fix the regression.

Not scalarizing a field to a DECL_BIT_FIELD FIELD_DECL's type looks like the best fix to me.

That seems like a big hammer though.

-- Eric Botcazou
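The size-versus-precision point in this exchange can be sketched outside GCC. The model below is purely illustrative and assumes a little-endian layout; the buffer contents and helper names are invented and are not taken from the patch or the Ada testcase.

```python
# Illustrative model only: a 24-bit (3-byte) field followed by an unrelated
# neighbouring byte, in little-endian memory. All names are hypothetical.
MEMORY = bytes([0x56, 0x34, 0x12, 0xAB])  # field = 0x123456, neighbour = 0xAB


def view_convert_32(buf):
    """Reinterpret 4 bytes as an integer: reads one byte past the 3-byte field."""
    return int.from_bytes(buf[0:4], "little")


def extract_bitfield_24(buf):
    """Read only the 3 bytes that actually belong to the field."""
    return int.from_bytes(buf[0:3], "little")


if __name__ == "__main__":
    print(hex(view_convert_32(MEMORY)), hex(extract_bitfield_24(MEMORY)))
```

In this toy layout the 32-bit view picks up the neighbouring byte, which corresponds to the "reads more bits than the size of the source type" concern. Whether the right remedy in GCC is bit-field extraction at expansion time or avoiding the scalarization in SRA is the open question of the thread, and the sketch does not settle it.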
[AArch64] Improve vst4_lane intrinsics
Hi, This patch rewrites the vst4_lane intrinsics in terms of RTL builtins. Tested on aarch64-none-elf with no issues. OK to queue for Stage 1? Thanks, James --- gcc/ 2014-02-13 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64-builtins.c (aarch64_types_storestruct_lane_qualifiers): New. (TYPES_STORESTRUCT_LANE): Likewise. * config/aarch64/aarch64-simd-builtins.def (st2_lane): New. (st3_lane): Likewise. (st4_lane): Likewise. * config/aarch64/aarch64-simd.md (vec_store_lanesoi_lanemode): New. (vec_store_lanesci_lanemode): Likewise. (vec_store_lanesxi_lanemode): Likewise. (aarch64_st2_laneVQ:mode): Likewise. (aarch64_st3_laneVQ:mode): Likewise. (aarch64_st4_laneVQ:mode): Likewise. * config/aarch64/aarch64.md (unspec): Add UNSPEC_ST{2,3,4}_LANE. * config/aarch64/arm_neon.h (__ST2_LANE_FUNC): Rewrite using builtins, update use points to use new macro arguments. (__ST3_LANE_FUNC): Likewise. (__ST4_LANE_FUNC): Likewise. * config/aarch64/iterators.md (V_TWO_ELEM): New. (V_THREE_ELEM): Likewise. (V_FOUR_ELEM): Likewise. 
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index ebab2ce..a12a1aa 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -226,6 +226,11 @@ aarch64_types_store1_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_void, qualifier_pointer_map_mode, qualifier_none }; #define TYPES_STORE1 (aarch64_types_store1_qualifiers) #define TYPES_STORESTRUCT (aarch64_types_store1_qualifiers) +static enum aarch64_type_qualifiers +aarch64_types_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_void, qualifier_pointer_map_mode, + qualifier_none, qualifier_none }; +#define TYPES_STORESTRUCT_LANE (aarch64_types_storestruct_lane_qualifiers) #define CF0(N, X) CODE_FOR_aarch64_##N##X #define CF1(N, X) CODE_FOR_##N##X##1 diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index e5f71b4..7bfdfca 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -107,6 +107,10 @@ BUILTIN_VQ (STORESTRUCT, st3, 0) BUILTIN_VQ (STORESTRUCT, st4, 0) + BUILTIN_VQ (STORESTRUCT_LANE, st2_lane, 0) + BUILTIN_VQ (STORESTRUCT_LANE, st3_lane, 0) + BUILTIN_VQ (STORESTRUCT_LANE, st4_lane, 0) + BUILTIN_VQW (BINOP, saddl2, 0) BUILTIN_VQW (BINOP, uaddl2, 0) BUILTIN_VQW (BINOP, ssubl2, 0) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 4dffb59e856aeaafb79007255d3b91a73ef1ef13..f19b7d5123b5a6249026d48f943445f8167b1c45 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -3584,6 +3584,17 @@ (define_insn vec_store_lanesoimode [(set_attr type neon_store2_2regq)] ) +(define_insn vec_store_lanesoi_lanemode + [(set (match_operand:V_TWO_ELEM 0 aarch64_simd_struct_operand =Utv) + (unspec:V_TWO_ELEM [(match_operand:OI 1 register_operand w) +(unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY) + (match_operand:SI 2 immediate_operand i)] + 
UNSPEC_ST2_LANE))] + TARGET_SIMD + st2\\t{%S1.Vetype - %T1.Vetype}[%2], %0 + [(set_attr type neon_store3_one_laneq)] +) + (define_insn vec_load_lanescimode [(set (match_operand:CI 0 register_operand =w) (unspec:CI [(match_operand:CI 1 aarch64_simd_struct_operand Utv) @@ -3604,6 +3615,17 @@ (define_insn vec_store_lanescimode [(set_attr type neon_store3_3regq)] ) +(define_insn vec_store_lanesci_lanemode + [(set (match_operand:V_THREE_ELEM 0 aarch64_simd_struct_operand =Utv) + (unspec:V_THREE_ELEM [(match_operand:CI 1 register_operand w) +(unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY) + (match_operand:SI 2 immediate_operand i)] + UNSPEC_ST3_LANE))] + TARGET_SIMD + st3\\t{%S1.Vetype - %U1.Vetype}[%2], %0 + [(set_attr type neon_store3_one_laneq)] +) + (define_insn vec_load_lanesximode [(set (match_operand:XI 0 register_operand =w) (unspec:XI [(match_operand:XI 1 aarch64_simd_struct_operand Utv) @@ -3624,6 +3646,17 @@ (define_insn vec_store_lanesximode [(set_attr type neon_store4_4regq)] ) +(define_insn vec_store_lanesxi_lanemode + [(set (match_operand:V_FOUR_ELEM 0 aarch64_simd_struct_operand =Utv) + (unspec:V_FOUR_ELEM [(match_operand:XI 1 register_operand w) +(unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY) + (match_operand:SI 2 immediate_operand i)] + UNSPEC_ST4_LANE))] + TARGET_SIMD + st4\\t{%S1.Vetype - %V1.Vetype}[%2], %0 + [(set_attr type neon_store4_one_laneq)] +) + ;; Reload patterns for AdvSIMD register list operands. (define_expand movmode @@ -4118,6 +4151,57 @@ (define_expand aarch64_stVSTRUCT:nregs DONE; }) +(define_expand
Fix PR libffi/60073
This adds proper variadic support to the SPARC port of libffi, thus fixing a regression in the testsuite in 64-bit mode, and fixes a small inaccuracy in the documentation. Tested on SPARC/Solaris and SPARC64/Solaris, applied on the mainline. 2014-02-13 Eric Botcazou ebotca...@adacore.com PR libffi/60073 * src/sparc/ffitarget.h (FFI_TARGET_SPECIFIC_VARIADIC): Define. (FFI_EXTRA_CIF_FIELDS): Likewise. (FFI_NATIVE_RAW_API): Move around. * src/sparc/ffi.c (ffi_prep_cif_machdep_core): New function from... (ffi_prep_cif_machdep): ...here. Call ffi_prep_cif_machdep_core. (ffi_prep_cif_machdep_var): New function. (ffi_closure_sparc_inner_v9): Do not pass anonymous FP arguments in FP registers. * doc/libffi.texi (Introduction): Fix inaccuracy. -- Eric BotcazouIndex: src/sparc/ffitarget.h === --- src/sparc/ffitarget.h (revision 207685) +++ src/sparc/ffitarget.h (working copy) @@ -58,16 +58,17 @@ typedef enum ffi_abi { } ffi_abi; #endif +#define FFI_TARGET_SPECIFIC_VARIADIC 1 +#define FFI_EXTRA_CIF_FIELDS unsigned int nfixedargs + /* Definitions for closures - */ #define FFI_CLOSURES 1 -#define FFI_NATIVE_RAW_API 0 - #ifdef SPARC64 #define FFI_TRAMPOLINE_SIZE 24 #else #define FFI_TRAMPOLINE_SIZE 16 #endif +#define FFI_NATIVE_RAW_API 0 #endif - Index: src/sparc/ffi.c === --- src/sparc/ffi.c (revision 207685) +++ src/sparc/ffi.c (working copy) @@ -249,7 +249,7 @@ int ffi_prep_args_v9(char *stack, extend } /* Perform machine dependent cif processing */ -ffi_status ffi_prep_cif_machdep(ffi_cif *cif) +static ffi_status ffi_prep_cif_machdep_core(ffi_cif *cif) { int wordsize; @@ -334,6 +334,19 @@ ffi_status ffi_prep_cif_machdep(ffi_cif return FFI_OK; } +ffi_status ffi_prep_cif_machdep(ffi_cif *cif) +{ + cif-nfixedargs = cif-nargs; + return ffi_prep_cif_machdep_core (cif); +} + +ffi_status ffi_prep_cif_machdep_var(ffi_cif *cif, unsigned int nfixedargs, +unsigned int ntotalargs) +{ + cif-nfixedargs = nfixedargs; + return ffi_prep_cif_machdep_core (cif); +} + int 
ffi_v9_layout_struct(ffi_type *arg, int off, char *ret, char *intg, char *flt) { ffi_type **ptr = arg-elements[0]; @@ -604,8 +617,7 @@ ffi_closure_sparc_inner_v9(ffi_closure * /* Copy the caller's structure return address so that the closure returns the data directly to the caller. */ - if (cif-flags == FFI_TYPE_VOID - cif-rtype-type == FFI_TYPE_STRUCT) + if (cif-flags == FFI_TYPE_VOID cif-rtype-type == FFI_TYPE_STRUCT) { rvalue = (void *) gpr[0]; /* Skip the structure return address. */ @@ -619,6 +631,10 @@ ffi_closure_sparc_inner_v9(ffi_closure * /* Grab the addresses of the arguments from the stack frame. */ for (i = 0; i cif-nargs; i++) { + /* If the function is variadic, FP arguments are passed in FP + registers only if the corresponding parameter is named. */ + const int named = (i cif-nfixedargs); + if (arg_types[i]-type == FFI_TYPE_STRUCT) { if (arg_types[i]-size 16) @@ -633,7 +649,9 @@ ffi_closure_sparc_inner_v9(ffi_closure * 0, (char *) gpr[argn], (char *) gpr[argn], - (char *) fpr[argn]); + named + ? (char *) fpr[argn] + : (char *) gpr[argn]); avalue[i] = gpr[argn]; argn += ALIGN(arg_types[i]-size, FFI_SIZEOF_ARG) / FFI_SIZEOF_ARG; } @@ -649,6 +667,7 @@ ffi_closure_sparc_inner_v9(ffi_closure * argn++; #endif if (i fp_slot_max + named (arg_types[i]-type == FFI_TYPE_FLOAT || arg_types[i]-type == FFI_TYPE_DOUBLE #if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE @@ -662,7 +681,7 @@ ffi_closure_sparc_inner_v9(ffi_closure * } /* Invoke the closure. */ - (closure-fun) (cif, rvalue, avalue, closure-user_data); + closure-fun (cif, rvalue, avalue, closure-user_data); /* Tell ffi_closure_sparc how to perform return type promotions. */ return cif-rtype-type; Index: doc/libffi.texi === --- doc/libffi.texi (revision 207685) +++ doc/libffi.texi (working copy) @@ -63,14 +63,14 @@ section entitled ``GNU General Public Li @node Introduction @chapter What is libffi? 
-Compilers for high level languages generate code that follow certain +Compilers for high-level languages generate code that follow certain conventions. These conventions are necessary, in part, for separate compilation to work. One such convention is the @dfn{calling convention}. The calling convention is a set of assumptions made by the compiler about where function arguments will be found on entry to a function. A calling convention also specifies where the return -value for a function is found. The calling convention is also
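The rule stated in the patch comment ("FP arguments are passed in FP registers only if the corresponding parameter is named") can be illustrated with a small model. This is a simplified sketch, not libffi code: it ignores structures and long double, the type names are plain strings, and the default for `fp_slot_max` is an assumed placeholder rather than the value used in ffi.c.

```python
# Hypothetical model of the closure-side decision in the patch: for each
# argument, is it fetched from an FP register or from an integer register?
FLOAT_TYPES = {"float", "double"}


def argument_registers(arg_types, nfixedargs, fp_slot_max=16):
    """Return 'fpr' or 'gpr' for each argument.

    nfixedargs is the number of named parameters; for a non-variadic
    function it equals len(arg_types). fp_slot_max is an assumed default.
    """
    result = []
    for i, arg_type in enumerate(arg_types):
        named = i < nfixedargs  # anonymous (variadic) arguments are not named
        if i < fp_slot_max and named and arg_type in FLOAT_TYPES:
            result.append("fpr")
        else:
            result.append("gpr")
    return result


if __name__ == "__main__":
    # Shaped like printf-style usage: one named argument, then variadic doubles.
    print(argument_registers(["pointer", "double", "double"], nfixedargs=1))
```

With `nfixedargs` equal to the total argument count the model behaves as the code did before the patch, which mirrors how ffi_prep_cif_machdep sets nfixedargs to nargs for non-variadic calls.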
Re: [PATCH] Fix PR 58960
On 01/30/2014 12:42 AM, Andrey Belevantsev wrote: Hello, As detailed in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58960#c6, we fail to use the DF liveness info in the register pressure sensitive scheduling for the new blocks as we do not properly compute it in this case. The patch fixes this by avoiding to use the sched-pressure for the new regions, as currently these are only ia64 recovery blocks and supposed to be cold. In the case we'd get other cases of the new blocks, this may be reconsidered. The other options of computing the DF info sketched at the above link do not seem plausible for this stage. Bootstrapped and tested on ia64, also tested by Andreas Schwab on ia64 (see PR log). OK for trunk? The patch is ok. Andrey, thanks for working on the PR and sorry for the delay with the approval. 2013-01-30 Andrey Belevantsev a...@ispras.ru PR rtl-optimization/58960 * haifa-sched.c (alloc_global_sched_pressure_data): New, factored out from ... (sched_init) ... here. (free_global_sched_pressure_data): New, factored out from ... (sched_finish): ... here. * sched-int.h (free_global_sched_pressure_data): Declare. * sched-rgn.c (nr_regions_initial): New static global. (haifa_find_rgns): Initialize it. (schedule_region): Disable sched-pressure for the newly generated regions.
[PATCH, i386]: Fix xop_vmfrcz<mode>2 expander
Hello!

No functional changes.

2014-02-13 Uros Bizjak ubiz...@gmail.com

* config/i386/sse.md (xop_vmfrcz<mode>2): Generate const0 in operands[2], not operands[3].

Tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN.

Uros.

Index: config/i386/sse.md
===
--- config/i386/sse.md (revision 207762)
+++ config/i386/sse.md (working copy)
@@ -13618,10 +13618,10 @@
 (unspec:VF_128
 [(match_operand:VF_128 1 "nonimmediate_operand")]
 UNSPEC_FRCZ)
- (match_dup 3)
+ (match_dup 2)
 (const_int 1)))]
 "TARGET_XOP"
- "operands[3] = CONST0_RTX (<MODE>mode);")
+ "operands[2] = CONST0_RTX (<MODE>mode);")

 (define_insn "*xop_vmfrcz<mode>2"
 [(set (match_operand:VF_128 0 "register_operand" "=x")
[jit] Require function names to be valid C identifiers for now
(Looks like the comma in the Subject stopped this getting through; resending with suitably edited Subject)

Committed to branch dmalcolm/jit:

gcc/jit/
* libgccjit.c (IS_ASCII_ALPHA): New macro.
(IS_ASCII_DIGIT): New macro.
(IS_ASCII_ALNUM): New macro.
(gcc_jit_context_new_function): Require that function names be valid C identifiers for now, to avoid later problems in the assembler.
---
 gcc/jit/libgccjit.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 3c2d962..bca60bd 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -8,6 +8,19 @@
 #include "libgccjit.h"
 #include "internal-api.h"

+#define IS_ASCII_ALPHA(CHAR) \
+  ( \
+    ((CHAR) >= 'a' && (CHAR) <='z') \
+    || \
+    ((CHAR) >= 'A' && (CHAR) <= 'Z') \
+  )
+
+#define IS_ASCII_DIGIT(CHAR) \
+  ((CHAR) >= '0' && (CHAR) <='9')
+
+#define IS_ASCII_ALNUM(CHAR) \
+  (IS_ASCII_ALPHA (CHAR) || IS_ASCII_DIGIT (CHAR))
+
 struct gcc_jit_context : public gcc::jit::recording::context
 {
   gcc_jit_context (gcc_jit_context *parent_ctxt) :
@@ -395,6 +408,27 @@ gcc_jit_context_new_function (gcc_jit_context *ctxt,
   RETURN_NULL_IF_FAIL (ctxt, NULL, "NULL context");
   RETURN_NULL_IF_FAIL (return_type, ctxt, "NULL return_type");
   RETURN_NULL_IF_FAIL (name, ctxt, "NULL name");
+  /* The assembler can only handle certain names, so for now, enforce
+     C's rules for identifiers upon the name.
+     Eventually we'll need some way to interact with e.g. C++ name mangling.  */
+  {
+    /* Leading char: */
+    char ch = *name;
+    RETURN_NULL_IF_FAIL_PRINTF2 (
+      IS_ASCII_ALPHA (ch) || ch == '_',
+      ctxt,
+      "name \"%s\" contains invalid character: '%c'",
+      name, ch);
+    /* Subsequent chars: */
+    for (const char *ptr = name + 1; (ch = *ptr); ptr++)
+      {
+        RETURN_NULL_IF_FAIL_PRINTF2 (
+          IS_ASCII_ALNUM (ch) || ch == '_',
+          ctxt,
+          "name \"%s\" contains invalid character: '%c'",
+          name, ch);
+      }
+  }
   RETURN_NULL_IF_FAIL ((num_params == 0) || params, ctxt, "NULL params");
   for (int i = 0; i < num_params; i++)
     if (!params[i])
--
1.7.11.7
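For readers who want to see which names the new check accepts, here is a small model of the same rule. It is a sketch of the rule only, not of libgccjit's error reporting, and the function name is hypothetical. It deliberately uses explicit ASCII sets, since Python's `str.isalpha` would also accept non-ASCII letters that the macros in the patch reject.

```python
import string

# Explicit ASCII sets, mirroring the intent of IS_ASCII_ALPHA / IS_ASCII_DIGIT.
ASCII_ALPHA = set(string.ascii_letters)
ASCII_DIGIT = set(string.digits)


def is_valid_function_name(name):
    """Return True if name follows C's identifier rules (ASCII only)."""
    if not name:
        return False  # no leading character to validate
    if name[0] not in ASCII_ALPHA and name[0] != "_":
        return False  # leading char must be a letter or underscore
    # Subsequent chars may also be digits.
    return all(ch in ASCII_ALPHA or ch in ASCII_DIGIT or ch == "_" for ch in name[1:])


if __name__ == "__main__":
    for candidate in ("my_func", "_start2", "2fast", "operator+", ""):
        print(repr(candidate), is_valid_function_name(candidate))
```

Names such as `operator+` fail this check, which is why the patch comment notes that some way of handling C++ name mangling will eventually be needed.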
Re: [PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.
On Thu, Feb 13, 2014 at 1:55 PM, Uros Bizjak ubiz...@gmail.com wrote:

>>> I've noticed that _mm512_permutexvar_epi[64|32] intrinsics
>>> have wrong arguments order. As per [1] first argument is index.
>>>
>>> For vpermps/vpermpd intrinsics are fine, but I've changed tests
>>> to call CALC with same arg order as intrinsic.
>>>
>>> here is the same problem (wrong argument order) with vrcp14s[d|s].
>>>
>>> Also avx512er-vrcp28ss-2.c test called wrong intrinsic.
>>>
>>> [1] http://software.intel.com/sites/landingpage/IntrinsicsGuide/
>>>
>>> gcc/
>>>         * config/i386/avx512fintrin.h (_mm512_maskz_permutexvar_epi64): Swap
>>>         arguments order in builtin.
>>>         (_mm512_permutexvar_epi64): Ditto.
>>>         (_mm512_mask_permutexvar_epi64): Ditto
>>>         (_mm512_maskz_permutexvar_epi32): Ditto
>>>         (_mm512_permutexvar_epi32): Ditto
>>>         (_mm512_mask_permutexvar_epi32): Ditto
>>>         * config/i386/sse.md (srcp14<mode>): Swap operands.
>>>
>>> gcc/testsuite/
>>>         * gcc.target/i386/avx512er-vrcp28ss-2.c: Call right intrinsic.
>>>         * gcc.target/i386/avx512f-vpermd-2.c: Fix reference calculations.
>>>         * gcc.target/i386/avx512f-vpermpd-2.c: Ditto.
>>>         * gcc.target/i386/avx512f-vpermps-2.c: Ditto.
>>>         * gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.
>>>         * gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
>>>         * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.
>>>
>>> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
>>> index a04b289..d3b2dc5 100644
>>> --- a/gcc/config/i386/sse.md
>>> +++ b/gcc/config/i386/sse.md
>>> @@ -1456,12 +1456,12 @@
>>>    [(set (match_operand:VF_128 0 "register_operand" "=v")
>>>         (vec_merge:VF_128
>>>           (unspec:VF_128
>>> -           [(match_operand:VF_128 1 "nonimmediate_operand" "vm")]
>>> +           [(match_operand:VF_128 2 "nonimmediate_operand" "vm")]
>>>             UNSPEC_RCP14)
>>> -         (match_operand:VF_128 2 "register_operand" "v")
>>> +         (match_operand:VF_128 1 "register_operand" "v")
>>>           (const_int 1)))]
>>>    "TARGET_AVX512F"
>>> -  "vrcp14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}"
>>> +  "vrcp14<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
>>
>> Please don't change srcp pattern, it should be defined similar to
>> vrcpss (aka sse_vmrcpv4sf). You need to switch operand order
>> elsewhere.
>
> No, you are correct. Operands should be swapped as in your patch.

Eh, sorry that after some more thinking, I have to again revert this
decision. The srcp pattern should remain as is, and you should swap
operands in avx512fintrin.h instead:

--cut here--
Index: avx512fintrin.h
===================================================================
--- avx512fintrin.h     (revision 207762)
+++ avx512fintrin.h     (working copy)
@@ -1470,8 +1470,8 @@
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_rcp14_sd (__m128d __A, __m128d __B)
 {
-  return (__m128d) __builtin_ia32_rcp14sd ((__v2df) __A,
-                                           (__v2df) __B);
+  return (__m128d) __builtin_ia32_rcp14sd ((__v2df) __B,
+                                           (__v2df) __A);
 }
 
 extern __inline __m128
@@ -1478,8 +1478,8 @@
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_rcp14_ss (__m128 __A, __m128 __B)
 {
-  return (__m128) __builtin_ia32_rcp14ss ((__v4sf) __A,
-                                          (__v4sf) __B);
+  return (__m128) __builtin_ia32_rcp14ss ((__v4sf) __B,
+                                          (__v4sf) __A);
 }
 
 extern __inline __m512d
--cut here--

vec_merge RSQRT and RCP are unops of type "sse". To correctly
determine the "memory" attribute, sse types look at operand1 only, so
this is the reason that the pattern is defined in this way.

There is a similar problem with the vec_merge rcp28 and rsqrt28
patterns. Operands 1 and 2 are swapped in the mnemonic, since only the
last operands allow memory:

Index: sse.md
===================================================================
--- sse.md      (revision 207764)
+++ sse.md      (working copy)
@@ -12825,7 +12825,7 @@
          (match_operand:VF_128 2 "register_operand" "v")
          (const_int 1)))]
   "TARGET_AVX512ER"
-  "vrcp28<ssescalarmodesuffix>\t{<round_saeonly_op3>%2, %1, %0|%0, %1, %2<round_saeonly_op3>}"
+  "vrcp28<ssescalarmodesuffix>\t{<round_saeonly_op3>%1, %2, %0|%0, %2, %1<round_saeonly_op3>}"
   [(set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<MODE>")])
@@ -12849,7 +12849,7 @@
          (match_operand:VF_128 2 "register_operand" "v")
          (const_int 1)))]
   "TARGET_AVX512ER"
-  "vrsqrt28<ssescalarmodesuffix>\t{<round_saeonly_op3>%2, %1, %0|%0, %1, %2<round_saeonly_op3>}"
+  "vrsqrt28<ssescalarmodesuffix>\t{<round_saeonly_op3>%1, %2, %0|%0, %2, %1<round_saeonly_op3>}"
   [(set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
    (set_attr "mode" "<MODE>")])

Intrinsics should swap their operands accordingly.

Uros.
Re: [patch c++]: Fix pr/58835 [4.7/4.8/4.9 Regression] ICE with __PRETTY_FUNCTION__ in broken function
Ping

----- Original Message -----
> Hi,
>
> the following patch adds the missing handling of an error_mark_node
> result from fname_decl within finish_fname.
>
> ChangeLog
>
> 2014-02-11  Kai Tietz  kti...@redhat.com
>
>         PR c++/58835
>         * semantics.c (finish_fname): Handle error_mark_node.
>
> Regression tested for x86_64-unknown-linux-gnu, i686-w64-mingw32.
> OK to apply?
>
> Regards,
> Kai
>
> Index: semantics.c
> ===================================================================
> --- semantics.c (revision 207686)
> +++ semantics.c (working copy)
> @@ -2630,7 +2630,8 @@ finish_fname (tree id)
>    tree decl;
>
>    decl = fname_decl (input_location, C_RID_CODE (id), id);
> -  if (processing_template_decl && current_function_decl)
> +  if (processing_template_decl && current_function_decl
> +      && decl != error_mark_node)
>      decl = DECL_NAME (decl);
>    return decl;
>  }
Re: [PATCH][AArch64] vrnd*_f64 patch for stage-1
On 02/13/2014 03:17 AM, Alex Velenko wrote:
> +/* Sets rmode field of FPCR control register to
> +   FPROUNDING_ZERO.  */

Comment is wrong, or at least misleading.

> +void __inline __attribute__ ((__always_inline__))
> +set_rounding_mode (uint32_t mode)
> +{
> +  uint32_t r;
> +
> +  /* Read current FPCR.  */
> +  asm volatile ("mrs %[r], fpcr" : [r] "=r" (r) : :);
> +
> +  /* Clear rmode.  */
> +  r &= 3 << RMODE_START;

~(3 << RMODE_START)

> +  /* Calculate desired FPCR.  */
> +  r |= mode << RMODE_START;
> +
> +  /* Write desired FPCR back.  */
> +  asm volatile ("msr fpcr, %[r]" : : [r] "r" (r) :);
> +}

Fortunately for this testcase, you do always use FPROUNDING_ZERO == 3 when
calling this function, so the bugs are hidden.


r~
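
To make the masking point concrete, here is a small sketch that applies both versions of the bit manipulation to a plain integer instead of the real register, so it runs on any host. The function names are hypothetical, and RMODE_START = 22 is an assumption taken from the architectural FPCR layout (RMode in bits 23:22), not from the quoted testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position of the two-bit RMode field in the AArch64 FPCR.  */
#define RMODE_START 22

/* Return FPCR with its RMode field replaced by MODE (0..3) and every
   other bit preserved.  */
uint32_t
with_rounding_mode (uint32_t fpcr, uint32_t mode)
{
  assert (mode <= 3);
  fpcr &= ~(UINT32_C (3) << RMODE_START);  /* Clear only RMode.  */
  fpcr |= mode << RMODE_START;             /* Insert the new mode.  */
  return fpcr;
}

/* The masking as quoted in the review: without the '~' this keeps the
   old RMode bits and clears everything else.  */
uint32_t
with_rounding_mode_buggy (uint32_t fpcr, uint32_t mode)
{
  assert (mode <= 3);
  fpcr &= UINT32_C (3) << RMODE_START;
  fpcr |= mode << RMODE_START;
  return fpcr;
}
```

With mode == 3 both variants leave RMode set to 3, which is why the testcase does not notice; the buggy variant still wipes every other bit, and with any other mode it cannot clear bits that were already set in the field.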
Re: [PATCH] Handle more COMDAT profiling issues
On Feb 13, 2014, at 8:41 AM, Teresa Johnson tejohn...@google.com wrote:
> On Wed, Feb 12, 2014 at 2:03 PM, Xinliang David Li davi...@google.com wrote:

[ extra lines deleted ]

>> Should non comdat function be skipped?
>
> We warn in drop_profile if this is not a COMDAT, as we should only
> have this case and reach the call in that case. (See the check in
> drop_profile and the comments at the top of handle_missing_profile for
> more info)

[ more extra lines deleted ]

Can we edit out the extra lines when they get this large?  Not doing
that is actually worse than top-posting.
Re: [PATCH] Handle more COMDAT profiling issues
On Thu, Feb 13, 2014 at 9:48 AM, Mike Stump mikest...@comcast.net wrote:
> On Feb 13, 2014, at 8:41 AM, Teresa Johnson tejohn...@google.com wrote:
>> On Wed, Feb 12, 2014 at 2:03 PM, Xinliang David Li davi...@google.com wrote:
>
> [ extra lines deleted ]
>
>>> Should non comdat function be skipped?
>>
>> We warn in drop_profile if this is not a COMDAT, as we should only
>> have this case and reach the call in that case. (See the check in
>> drop_profile and the comments at the top of handle_missing_profile for
>> more info)
>
> [ more extra lines deleted ]
>
> Can we edit out the extra lines when they get this large?  Not doing
> that is actually worse than top-posting.

Right -- gmail users probably won't notice the problem as extra lines
are hidden for you.

David
Re: [PATCH][RFC][libatomic] Override -mcpu option for arm linux ifunc targets
Ping?
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00069.html

On 03/02/14 11:50, Kyrill Tkachov wrote:
> Hi all,
>
> There is a slight issue with the libatomic Makefile for arm linux ifunc
> targets. It adds an explicit -march=armv7-a option to the command line to
> enable building the exclusive instruction variants in libatomic. However,
> if the multilib machinery tries to add an -mcpu option that conflicts with
> the -march one (such as -mcpu=cortex-a15), gcc will give a warning about
> incompatible -march and -mcpu options, causing the -Werror build to fail.
>
> A workaround here is to override the -mcpu option as well as the -march
> one. This patch does that by adding an EXTRA_OVERRIDE variable and setting
> it to -mcpu=cortex-a9 under the same conditions as when -march=armv7-a is
> selected, so that it's added only when -march=armv7-a is added.
>
> Can someone see a better way of achieving this?
> If this is acceptable, ok to commit?
>
> Built and tested on arm-none-linux-gnueabi with
> --enable-gnu-indirect-function.
> Bootstrapped on x86 with --enable-gnu-indirect-function.
>
> Thanks,
> Kyrill
>
> 2014-02-03  Kyrylo Tkachov  kyrylo.tkac...@arm.com
>
>         * Makefile.in: Override -mcpu option when building arm linux
>         ifunc targets.
[jit] New API entrypoint: gcc_jit_context_get_builtin_function
Committed to branch dmalcolm/jit:

This commit adds the ability for client code to look up GCC builtins by
name, potentially allowing GCC to optimize the resulting function usage
based on what it knows about the behavior of each builtin.

Note that if the optimizer can't eliminate the call, the generated caller
code will still require machine code for the callee, and thus may need the
DSO implementing the builtin to already be linked into the client process,
or you'll get a linker error - so perhaps "builtin" is a bad name?

Implementing this required creating function types (to handle
builtin-types.def), which are used internally by the new builtins_manager.
They're not yet exposed to client code.

gcc/jit/
        * libgccjit.h (gcc_jit_context_get_builtin_function): New.
        * libgccjit.map (gcc_jit_context_get_builtin_function): New.
        * libgccjit++.h (gccjit::context::get_builtin_function): New method.

        * Make-lang.in (jit_OBJS): Add jit/jit-builtins.o
        * jit-builtins.c: New source file, for managing builtin functions
        and their types.
        * jit-builtins.h: Likewise.
        * libgccjit.c (gcc_jit_context_new_function): Pass BUILT_IN_NONE
        for the new argument of new_function
        (gcc_jit_context_get_builtin_function): New.
        * internal-api.h: Add idempotency guards.
        (gcc::jit::recording::context::new_function): Add parameter for
        builtin functions.
        (gcc::jit::recording::context::get_builtin_function): New method.
        (gcc::jit::recording::context::m_builtins_manager): New field.
        (gcc::jit::recording::type::as_a_function_type): New virtual
        function.
        (gcc::jit::recording::function_type): New subclass of type.
        (gcc::jit::recording::function::function): Add parameter for
        builtin functions.
        (gcc::jit::recording::function::m_builtin_id): New field.
        (gcc::jit::recording::function::new_function_type): New method.
        (gcc::jit::playback::function::function): Add parameter for
        builtin functions.
        * internal-api.c (gcc::jit::recording::context::context):
        NULL-initialize new field m_builtins_manager.
        (gcc::jit::recording::context::~context): Clean up the builtins
        manager, if one has been created.
        (gcc::jit::recording::context::new_function): Add parameter
        (gcc::jit::recording::context::get_builtin_function): New method.
        (gcc::jit::recording::function_type::function_type): Implement
        constructor for new subclass.
        (gcc::jit::recording::function_type::dereference): Implement
        method for new subclass.
        (gcc::jit::recording::function_type::replay_into): Likewise.
        (gcc::jit::recording::function_type::make_debug_string): Likewise.
        (gcc::jit::recording::function::function): Add parameter for
        builtin functions.
        (gcc::jit::recording::function::replay_into): Likewise for
        creation of playback object.
        (gcc::jit::recording::function::new_function_type): New method.
        (gcc::jit::playback::function::new_function): Add parameter for
        builtin functions, using it to set up the fndecl accordingly.

gcc/testsuite/
        * jit.dg/harness.h (CHECK_DOUBLE_VALUE): New macro.
        (CHECK): New macro.
        * jit.dg/test-functions.c: New testcase, exercising
        gcc_jit_context_get_builtin_function.
        * jit.dg/test-combination.c: Add test-functions.c to the combined
        test.
---
 gcc/jit/ChangeLog.jit                   |  48
 gcc/jit/Make-lang.in                    |   3
 gcc/jit/internal-api.c                  | 157
 gcc/jit/internal-api.h                  |  55
 gcc/jit/jit-builtins.c                  | 395
 gcc/jit/jit-builtins.h                  | 114
 gcc/jit/libgccjit++.h                   |   9
 gcc/jit/libgccjit.c                     |  13
 gcc/jit/libgccjit.h                     |   4
 gcc/jit/libgccjit.map                   |   1
 gcc/testsuite/ChangeLog.jit             |   9
 gcc/testsuite/jit.dg/harness.h          |  29
 gcc/testsuite/jit.dg/test-combination.c |   9
 gcc/testsuite/jit.dg/test-functions.c   | 175
 14 files changed, 1009 insertions(+), 12 deletions(-)
 create mode 100644 gcc/jit/jit-builtins.c
 create mode 100644 gcc/jit/jit-builtins.h
 create mode 100644 gcc/testsuite/jit.dg/test-functions.c

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index adccd57..603dd96 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,51 @@
+2014-02-13  David Malcolm  dmalc...@redhat.com
+
+       * libgccjit.h (gcc_jit_context_get_builtin_function): New.
+       * libgccjit.map (gcc_jit_context_get_builtin_function): New.
+       * libgccjit++.h (gccjit::context::get_builtin_function): New method.
+
+       * Make-lang.in (jit_OBJS): Add jit/jit-builtins.o
+       * jit-builtins.c: New source
Re: [PATCH][RFC][libatomic] Override -mcpu option for arm linux ifunc targets
On 02/03/2014 03:50 AM, Kyrill Tkachov wrote:
> +# For ARM, the -march option by itself conflicts with any -mcpu option that
> +# we might end up passing to the build, causing an error.
> +# Therefore we override the -mcpu option as well.
> +# This shouldn't affect tuning much because the affected code is mostly
> +# in inline assembly anyway.
>  @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a -DHAVE_KERNEL64
> +@ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@EXTRA_OVERRIDE = -mcpu=cortex-a9

Why would you want to split these across two different variables?
It's easier to just add the -march and -mcpu to the same IFUNC_OPTIONS
variable.

Why the choice of cortex-a9, as opposed to any of the other v7-a
possibilities?  If we're going to force anything, perhaps generic-armv7-a
is more appropriate?


r~
Re: [testsuite] Don't xfail gcc.dg/binop-xor1.c
On Thu, 13 Feb 2014, Richard Sandiford wrote: Richard Sandiford rsand...@linux.vnet.ibm.com writes: Hans-Peter Nilsson h...@bitrange.com writes: On Tue, 4 Feb 2014, Rainer Orth wrote: AFAICT the gcc.dg/binop-xor1.c test is XPASSing everywhere since about 20131114: Bah, missing analysis. Everywhere does not include cris-elf, powerpc64-unknown-linux-gnu, m68k-unknown-linux-gnu, s390x-ibm-linux-gnu, powerpc-ibm-aix7.1.0.0. Based on this list I'm guessing it's another BRANCH_COST==1 BRANCH_COST==1 || !LOGICAL_OP_NON_SHORT_CIRCUIT Thanks. Ouch, not again! Anyone with an idea for an effective-target identification function? I'd like to avoid an explicit target list but if that's what it takes, better collect the target list in an check_effective_target_branch_cost1 and/or check_effective_target_logical_op_short_circuit - and yes, the test function should use the positive sense (it should not use a negative sense for reasons QED. :) brgds, H-P
Re: [PATCH] [libgomp] make it possible to use OMP on both sides of a fork
> +/* This is to enable best-effort cleanup after fork.  */
> +static int gomp_we_are_forked = 0;

bool, no explicit initialization, possible removal, see below.

> +static void
> +gomp_free_thread_pool (int threads_running)

bool for threads_running.  It looks like a count otherwise.

> +gomp_after_fork_callback ()

(void)

> +  pthread_atfork (NULL, NULL, &gomp_after_fork_callback);

& not needed.

Any reason not to just run gomp_free_thread_pool from
gomp_after_fork_callback directly?  I see no restrictions on what kind of
code is allowed to execute during that callback.


r~
Re: [PATCH] Documentation for dump and optinfo output
Committed as r207767. Indeed, I had an older version of makeinfo. Once
I updated to the latest version 5.2, I saw the warnings. Those are
fixed by this patch.

Thanks,
Sharad

On Tue, Feb 11, 2014 at 11:42 PM, Thomas Schwinge tho...@codesourcery.com wrote:
> Hi!
>
> On Wed, 5 Feb 2014 16:33:19 -0800, Sharad Singhai sing...@google.com wrote:
>> I am really sorry about the delay.
>
> No worries; not exactly a severe issue.  ;-)
>
>> I couldn't exactly reproduce the warning which you described
>
> Maybe the version of makeinfo is relevant?
>
>     $ makeinfo --version | head -n 1
>     makeinfo (GNU texinfo) 5.1
>
>> but I found a place where two nodes were out of order in optinfo.texi.
>> Could you please apply the following patch and see if the problem
>> goes away? If it works for you, I will commit the doc fixes. Also I
>> would appreciate the exact command which produces these warnings.
>
> Your patch does fix the problem; see the following diff of the build log,
> where the warnings are now gone, and which also happens to contain the
> makeinfo command line.
>
>     @@ -4199,12 +4199,6 @@ if [ xinfo = xinfo ]; then \
>         makeinfo --split-size=500 --split-size=500 --no-split -I . -I ../../source/gcc/doc \
>             -I ../../source/gcc/doc/include -o doc/gccint.info ../../source/gcc/doc/gccint.texi; \
>      fi
>     -../../source/gcc/doc/optinfo.texi:45: warning: node next `Optimization groups' in menu `Dump output verbosity' and in sectioning `Dump files and streams' differ
>     -../../source/gcc/doc/optinfo.texi:77: warning: node next `Dump files and streams' in menu `Dump types' and in sectioning `Dump output verbosity' differ
>     -../../source/gcc/doc/optinfo.texi:77: warning: node prev `Dump files and streams' in menu `Dump output verbosity' and in sectioning `Optimization groups' differ
>     -../../source/gcc/doc/optinfo.texi:104: warning: node next `Dump output verbosity' in menu `Dump files and streams' and in sectioning `Dump types' differ
>     -../../source/gcc/doc/optinfo.texi:104: warning: node prev `Dump output verbosity' in menu `Optimization groups' and in sectioning `Dump files and streams' differ
>     -../../source/gcc/doc/optinfo.texi:137: warning: node prev `Dump types' in menu `Dump files and streams' and in sectioning `Dump output verbosity' differ
>      if [ xinfo = xinfo ]; then \
>         makeinfo --split-size=500 --split-size=500 --no-split -I ../../source/gcc/doc \
>             -I ../../source/gcc/doc/include -o doc/gccinstall.info ../../source/gcc/doc/install.texi; \
>
>>         * doc/optinfo.texi: Fix order of nodes.
>
> Thanks, please commit.
>
>
> Regards,
>  Thomas
[PATCH, testsuite] Fix profile test failures
While testing the C++ profiling tests in g++.dg/bprob and using the qemu
simulator we discovered that these tests were passing when we ran the
testsuite with no extra options but that if we specified some options
on the testsuite run then the tests would fail with this message in the
c++.log file:

rsh: Could not resolve hostname multi-sim/-EL: Name or service not known

After some poking around I found that profopt-execute in lib/profopt.exp
was using remote_file and remote_upload with 'target' where I believe it
should be using 'host'.  No other *.exp file uses 'target' on its
remote_file or remote_upload calls; they use either 'build' or 'host'.

So while it seems weird that 'host' is the proper replacement for 'target'
as the machine where the executable is run, this seems to be the right fix
and it does give me a clean run now with or without extra arguments on the
test run.

OK for checkin?

Steve Ellcey
sell...@mips.com


2014-02-13  Steve Ellcey  sell...@mips.com

        * lib/profopt.exp (profopt-execute): Use host instead of target
        in remote_file and remote_upload calls.


diff --git a/gcc/testsuite/lib/profopt.exp b/gcc/testsuite/lib/profopt.exp
index e0d849e..e045b53 100644
--- a/gcc/testsuite/lib/profopt.exp
+++ b/gcc/testsuite/lib/profopt.exp
@@ -264,7 +264,7 @@ proc profopt-execute { src } {
 
        # Remove old profiling and performance data files.
        foreach ext $prof_ext {
-           remote_file target delete $tmpdir/$base.$ext
+           remote_file host delete $tmpdir/$base.$ext
        }
        if [info exists perf_ext] {
            profopt-cleanup $testcase $perf_ext
@@ -312,7 +312,7 @@ proc profopt-execute { src } {
        # Make sure the profile data was generated, and fail if not.
        if { $status == "pass" } {
            foreach ext $prof_ext {
-               remote_upload target $tmpdir/$base.$ext
+               remote_upload host $tmpdir/$base.$ext
                set files [glob -nocomplain $base.$ext]
                if { $files == "" } {
                    set status "fail"
@@ -368,7 +368,7 @@ proc profopt-execute { src } {
 
        # Remove the profiling data files.
        foreach ext $prof_ext {
-           remote_file target delete $tmpdir/$base.$ext
+           remote_file host delete $tmpdir/$base.$ext
        }
 
        if { $status != "pass" } {
Re: std::regex_replace behaviour (LWG DR 2213)
On Thu, Feb 13, 2014 at 1:13 PM, Jonathan Wakely jwakely@gmail.com wrote:
> The LWG have decided that
> http://cplusplus.github.io/LWG/lwg-active.html#2213 is a defect. In
> our std::regex_replace we do not appear to update out in all places
> that we should.

1) Yes, the current implementation is buggy for not updating __out
after calling std::copy;
2) I'd rather say the standard is misleading but well intended (return
the new out iterator) rather than ill intended (return the original
out iterator).

It would be more troublesome if match_results::format() did not return
the new out iterator, which its caller regex_replace() needs. Boost and
libc++ also return the new iterator.

So my suggestion is simply to follow the LWG proposal, as Boost and
libc++ do.

Here's the patch tested with -m32 and -m64 respectively.

Thanks!


-- 
Regards,
Tim Shen

commit 3f8621b5f7ced00e21e7038f1e9737eea1bb4251
Author: tim timshe...@gmail.com
Date:   Thu Feb 13 17:23:48 2014 -0500

    2014-02-13  Tim Shen  timshe...@gmail.com

        * include/bits/regex.tcc (match_results::format, regex_replace):
        Update __out after calling std::copy.
        * testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc:
        Add testcase.
        * testsuite/28_regex/match_results/format.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/regex.tcc b/libstdc++-v3/include/bits/regex.tcc index 73f55df..5fa1f01 100644 --- a/libstdc++-v3/include/bits/regex.tcc +++ b/libstdc++-v3/include/bits/regex.tcc @@ -425,7 +425,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { auto __sub = _Base_type::operator[](__idx); if (__sub.matched) - std::copy(__sub.first, __sub.second, __out); + __out = std::copy(__sub.first, __sub.second, __out); }; if (__flags regex_constants::format_sed) @@ -455,7 +455,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION if (__next == __fmt_last) break; - std::copy(__fmt_first, __next, __out); + __out = std::copy(__fmt_first, __next, __out); auto __eat = [](char __ch) - bool { @@ -493,7 +493,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION *__out++ = '$'; __fmt_first = __next; } - std::copy(__fmt_first, __fmt_last, __out); + __out = std::copy(__fmt_first, __fmt_last, __out); } return __out; } @@ -512,7 +512,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION if (__i == __end) { if (!(__flags regex_constants::format_no_copy)) - std::copy(__first, __last, __out); + __out = std::copy(__first, __last, __out); } else { @@ -521,14 +521,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION for (; __i != __end; ++__i) { if (!(__flags regex_constants::format_no_copy)) - std::copy(__i-prefix().first, __i-prefix().second, __out); + __out = std::copy(__i-prefix().first, __i-prefix().second, + __out); __out = __i-format(__out, __fmt, __fmt + __len, __flags); __last = __i-suffix(); if (__flags regex_constants::format_first_only) break; } if (!(__flags regex_constants::format_no_copy)) - std::copy(__last.first, __last.second, __out); + __out = std::copy(__last.first, __last.second, __out); } return __out; } diff --git a/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc b/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc index 28f78a0..38ef970 100644 --- a/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc +++ 
b/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/basic_replace.cc @@ -41,6 +41,14 @@ test01() VERIFY(regex_replace(string(This is a string), regex(\\b\\w*\\b), |$0|, regex_constants::format_first_only) == |This| is a string); + + char buff[4096] = {0}; + regex re(asdf); + string s = asdf; + string res = |asdf|asdf|; + regex_replace(buff, s.data(), s.data() + s.size(), re, ||\\0|, + regex_constants::format_sed); + VERIFY(res == buff); } int diff --git a/libstdc++-v3/testsuite/28_regex/match_results/format.cc b/libstdc++-v3/testsuite/28_regex/match_results/format.cc index 11e3bdb..097a0d7 100644 --- a/libstdc++-v3/testsuite/28_regex/match_results/format.cc +++ b/libstdc++-v3/testsuite/28_regex/match_results/format.cc @@ -43,6 +43,14 @@ test01() VERIFY(m.format(|\\3|\\4|\\2|\\1|\\, regex_constants::format_sed) == this is a string|a|string|is|this|\\); + + regex re(asdf); + regex_match(asdf, m, re); + string fmt = ||\\0|; + char buff[4096] = {0}; + m.format(buff, fmt.data(), fmt.data() + fmt.size(), + regex_constants::format_sed); + VERIFY(string(buff) ==
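
The pattern being fixed in regex.tcc (a copy routine returns the advanced output position and the caller has to keep it) is not specific to C++ iterators. The following is only a C analogue under that assumption: copy_range and wrap_in_bars are hypothetical helpers that mimic what std::copy returns, and are not code from the patch or from libstdc++.

```c
#include <assert.h>
#include <stddef.h>

/* Copy [first, last) to out and return the position just past the last
   character written, which is what std::copy returns in C++.  */
char *
copy_range (const char *first, const char *last, char *out)
{
  assert (first <= last);
  while (first != last)
    *out++ = *first++;
  return out;
}

/* Write '|' + text + '|' to out, NUL-terminate it, and return the end
   position.  OUT must have room for LEN + 3 characters.  */
char *
wrap_in_bars (const char *text, size_t len, char *out)
{
  *out++ = '|';
  /* Keep the advanced position; dropping this assignment would make the
     closing '|' overwrite the first copied character.  */
  out = copy_range (text, text + len, out);
  *out++ = '|';
  *out = '\0';
  return out;
}
```

Ignoring the return value of copy_range here is the same mistake as calling std::copy without assigning to __out when the output is a raw pointer into a buffer.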
[PATCH] Fix a couple of tree-vect-loop.c issues
Hi!

While fixing a -O3 -g vectorizer ICE that only reproduced on the
GCC-4.4-RH branch, I've noticed a couple of similar issues on the trunk.

The first hunk is just a cleanup; there is no point in setting use_stmt
again to the same thing it was set to before.
The second and third hunks ignore debug stmts, as other places in
tree-vect-loop.c that look for the exit phi already do.
The last hunk fixes GOMP_SIMD_LANE handling, and the testcase is from the
FAIL in redhat/gcc-4_4-branch.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-02-13  Jakub Jelinek  ja...@redhat.com

        * tree-vect-loop.c (vect_is_slp_reduction): Don't set use_stmt twice.
        (get_initial_def_for_induction, vectorizable_induction): Ignore
        debug stmts when looking for exit_phi.
        (vectorizable_live_operation): Fix up condition.

        * gcc.c-torture/compile/20140213.c: New test.

--- gcc/tree-vect-loop.c.jj     2014-02-05 15:28:10.0 +0100
+++ gcc/tree-vect-loop.c        2014-02-13 15:36:38.117741038 +0100
@@ -1968,10 +1968,8 @@ vect_is_slp_reduction (loop_vec_info loo
       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
         {
          gimple use_stmt = USE_STMT (use_p);
-          if (is_gimple_debug (use_stmt))
-            continue;
-
-         use_stmt = USE_STMT (use_p);
+         if (is_gimple_debug (use_stmt))
+           continue;
 
           /* Check if we got back to the reduction phi.  */
          if (use_stmt == phi)
@@ -3507,9 +3505,13 @@ get_initial_def_for_induction (gimple iv
       exit_phi = NULL;
       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, loop_arg)
         {
-         if (!flow_bb_inside_loop_p (iv_loop, gimple_bb (USE_STMT (use_p))))
+         gimple use_stmt = USE_STMT (use_p);
+         if (is_gimple_debug (use_stmt))
+           continue;
+
+         if (!flow_bb_inside_loop_p (iv_loop, gimple_bb (use_stmt)))
            {
-             exit_phi = USE_STMT (use_p);
+             exit_phi = use_stmt;
              break;
            }
         }
@@ -5413,10 +5415,13 @@ vectorizable_induction (gimple phi, gimp
       loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e);
       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, loop_arg)
        {
-         if (!flow_bb_inside_loop_p (loop->inner,
-                                     gimple_bb (USE_STMT (use_p))))
+         gimple use_stmt = USE_STMT (use_p);
+         if (is_gimple_debug (use_stmt))
+           continue;
+
+         if (!flow_bb_inside_loop_p (loop->inner, gimple_bb (use_stmt)))
            {
-             exit_phi = USE_STMT (use_p);
+             exit_phi = use_stmt;
              break;
            }
        }
@@ -5514,7 +5519,7 @@ vectorizable_live_operation (gimple stmt
            {
              gimple use_stmt = USE_STMT (use_p);
              if (gimple_code (use_stmt) == GIMPLE_PHI
-                 || gimple_bb (use_stmt) == merge_bb)
+                 && gimple_bb (use_stmt) == merge_bb)
                {
                  if (vec_stmt)
                    {
--- gcc/testsuite/gcc.c-torture/compile/20140213.c.jj   2013-08-25 18:20:55.717911035 +0200
+++ gcc/testsuite/gcc.c-torture/compile/20140213.c      2014-02-13 16:23:45.631401820 +0100
@@ -0,0 +1,21 @@
+static unsigned short
+foo (unsigned char *x, int y)
+{
+  unsigned short r = 0;
+  int i;
+  for (i = 0; i < y; i++)
+    r += x[i];
+  return r;
+}
+
+int baz (int, unsigned short);
+
+void
+bar (unsigned char *x, unsigned char *y)
+{
+  int i;
+  unsigned short key = foo (x, 0x1);
+  baz (0, 0);
+  for (i = 0; i < 0x8; i++)
+    y[i] = x[baz (i, key)];
+}

        Jakub
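
As a rough model of the second and third hunks, the scan for the exit phi can be pictured as a loop over the uses of a value that skips debug statements before applying the "outside the loop" test, so that compiling with -g cannot change which statement is picked. The struct and function below are hypothetical stand-ins for illustration only; they are not GCC's gimple types or its immediate-use API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement that uses some value.  */
struct stmt
{
  bool is_debug;     /* Debug bind, only present when compiling with -g.  */
  bool inside_loop;  /* Whether the statement's block is in the loop.  */
};

/* Return the first non-debug use located outside the loop, or NULL if
   there is none.  Debug statements are skipped so that their presence
   never influences the result.  */
const struct stmt *
find_exit_use (const struct stmt *uses, size_t n)
{
  assert (uses != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (uses[i].is_debug)
        continue;
      if (!uses[i].inside_loop)
        return &uses[i];
    }
  return NULL;
}
```

Without the is_debug check, a debug statement placed after the loop would be returned as if it were the exit phi, which is the kind of -g-only difference the patch removes.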
Re: [PATCH, testsuite] Fix profile test failures
On Thu, 13 Feb 2014, Steve Ellcey wrote:

> While testing the C++ profiling tests in g++.dg/bprob and using the qemu
> simulator we discovered that these tests were passing when we ran the
> testsuite with no extra options but that if we specified some options
> on the testsuite run then the tests would fail with this message in the
> c++.log file:
>
> rsh: Could not resolve hostname multi-sim/-EL: Name or service not known

That means your board file is buggy.  If rsh is not the right way to
access your target system, you need to implement the board file methods in
some way other than rsh (possibly some operations should be no-ops, or do
something directly on the build system, if you have a shared filesystem).

> So while it seems weird that 'host' is the proper replacement for 'target'
> as the machine where the executable is run, this seems to be the right fix

It's certainly not the proper replacement.  If a file is on the target,
use "target" for deletion / manipulation; if it's on the host, use "host"
for deletion / manipulation; on build, use "build"; in multiple places,
run the deletion operation once per system with the file; to copy from
target to the system (build) running DejaGnu, use remote_upload specifying
"target"; to copy from host to build, use remote_upload specifying "host";
to copy from build to host or target, use remote_download specifying
"host" or "target" as appropriate.

To determine whether anything should be changed in a GCC .exp file, reason
about which of the three systems (build, host, target) a file is on, or is
needed on, at each point, rather than looking at what does or does not
work with a buggy board file.

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: [PATCH] Fix a couple of tree-vect-loop.c issues
On 02/13/2014 02:46 PM, Jakub Jelinek wrote: 2014-02-13 Jakub Jelinek ja...@redhat.com * tree-vect-loop.c (vect_is_slp_reduction): Don't set use_stmt twice. (get_initial_def_for_induction, vectorizable_induction): Ignore debug stmts when looking for exit_phi. (vectorizable_live_operation): Fix up condition. * gcc.c-torture/compile/20140213.c: New test. Ok. r~
[PATCH] x86: Use ud2 assembly mnemonic when available.
Non-ancient assemblers support the "ud2" mnemonic, so there is no need to
emit the literal opcode as data.

OK for trunk and 4.8?


Thanks,
Roland


gcc/
2014-02-13  Roland McGrath  mcgra...@google.com

        * configure.ac (HAVE_AS_IX86_UD2): New test for 'ud2' mnemonic.
        * configure: Regenerated.
        * config.in: Regenerated.
        * config/i386/i386.md (trap) [HAVE_AS_IX86_UD2]: Use the mnemonic
        instead of ASM_SHORT.

--- a/gcc/config.in
+++ b/gcc/config.in
@@ -375,6 +375,12 @@
 #endif
 
 
+/* Define if your assembler supports the 'ud2' mnemonic. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_IX86_UD2
+#endif
+
+
 /* Define if your assembler supports the lituse_jsrdirect relocation. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_AS_JSRDIRECT_RELOCS
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -17843,7 +17843,13 @@
 (define_insn "trap"
   [(trap_if (const_int 1) (const_int 6))]
   ""
-  { return ASM_SHORT "0x0b0f"; }
+{
+#ifdef HAVE_AS_IX86_UD2
+  return "ud2";
+#else
+  return ASM_SHORT "0x0b0f";
+#endif
+}
   [(set_attr "length" "2")])
 
 (define_expand "prefetch"
--- a/gcc/configure
+++ b/gcc/configure
@@ -25109,6 +25109,37 @@ $as_echo "#define HAVE_AS_IX86_REP_LOCK_PREFIX 1" >>confdefs.h
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for ud2 mnemonic" >&5
+$as_echo_n "checking assembler for ud2 mnemonic... " >&6; }
+if test "${gcc_cv_as_ix86_ud2+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_ix86_ud2=no
+  if test x$gcc_cv_as != x; then
+    $as_echo 'ud2' > conftest.s
+    if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+    then
+       gcc_cv_as_ix86_ud2=yes
+    else
+      echo "configure: failed program was" >&5
+      cat conftest.s >&5
+    fi
+    rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_ix86_ud2" >&5
+$as_echo "$gcc_cv_as_ix86_ud2" >&6; }
+if test $gcc_cv_as_ix86_ud2 = yes; then
+
+$as_echo "#define HAVE_AS_IX86_UD2 1" >>confdefs.h
+
+fi
+
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for R_386_TLS_GD_PLT reloc" >&5
 $as_echo_n "checking assembler for R_386_TLS_GD_PLT reloc... " >&6; }
 if test "${gcc_cv_as_ix86_tlsgdplt+set}" = set; then :
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3895,6 +3895,12 @@ foo:     nop
         [AC_DEFINE(HAVE_AS_IX86_REP_LOCK_PREFIX, 1,
           [Define if the assembler supports 'rep <insn>, lock <insn>'.])])
 
+    gcc_GAS_CHECK_FEATURE([ud2 mnemonic],
+       gcc_cv_as_ix86_ud2,,,
+       [ud2],,
+      [AC_DEFINE(HAVE_AS_IX86_UD2, 1,
+       [Define if your assembler supports the 'ud2' mnemonic.])])
+
     gcc_GAS_CHECK_FEATURE([R_386_TLS_GD_PLT reloc],
         gcc_cv_as_ix86_tlsgdplt,,,
        [call    tls_gd@tlsgdplt],
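
For anyone wondering why the fallback emits the 16-bit value 0x0b0f: ud2 is encoded as the two bytes 0F 0B, and x86 stores data little-endian, so the low byte of the value comes first in memory. The sketch below only checks that byte order; store_le16 is a hypothetical helper written for this note, not something from GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store VALUE as two bytes, least significant first, which is how a
   little-endian target such as x86 lays out a 16-bit data directive.  */
void
store_le16 (uint16_t value, unsigned char out[2])
{
  assert (out != NULL);
  out[0] = (unsigned char) (value & 0xff);
  out[1] = (unsigned char) (value >> 8);
}
```

Passing 0x0b0f yields the byte sequence 0x0f, 0x0b, i.e. the same two bytes an assembler produces for the ud2 mnemonic.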
Re: [PATCH, testsuite] Fix profile test failures
On Thu, 2014-02-13 at 23:09 +0000, Joseph S. Myers wrote:
> On Thu, 13 Feb 2014, Steve Ellcey wrote:
>
> > While testing the C++ profiling tests in g++.dg/bprob and using the qemu simulator we discovered that these tests were passing when we ran the testsuite with no extra options but that if we specified some options on the testsuite run then the tests would fail with this message in the c++.log file:
> >
> > rsh: Could not resolve hostname multi-sim/-EL: Name or service not known
>
> That means your board file is buggy. If rsh is not the right way to access your target system, you need to implement the board file methods in some way other than rsh (possibly some operations should be no-ops, or do something directly on the build system, if you have a shared filesystem).

I thought the bug was that it was using 'multi-sim/-EL' instead of just 'multi-sim'. I.e. I thought that target was a combination of where the test was run and what options were used, whereas host was just going to be where the test was run. I guess I was wrong about that.

> > So while it seems weird that 'host' is the proper replacement for 'target' as the machine where the executable is run, this seems to be the right fix
>
> It's certainly not the proper replacement. If a file is on the target, use target for deletion / manipulation; if it's on the host, use host for deletion / manipulation; on build, use build; in multiple places, run the deletion operation once per system with the file; to copy from target to the system (build) running DejaGnu, use remote_upload specifying target; to copy from host to build, use remote_upload specifying host; to copy from build to host or target, use remote_download specifying host or target as appropriate.

So let me make sure I understand this: host is where you run the testsuite from, build is where the compilation happens (probably the same as host for most people), and target is where the test program is executed.

> To determine whether anything should be changed in a GCC .exp file, reason about which of the three systems (build, host, target) a file is on, or is needed on, at each point, rather than looking at what does or does not work with a buggy board file.

I am not convinced that the problem is in the board file because the only tests I see fail this way are the ones in g++.exp/bprob and that is also the only GCC .exp file that uses remote_upload or remote_file with 'target'. I will dig into it some more and also try it with some different boards.

Steve Ellcey
sell...@mips.com
Re: [PATCH] x86: Use ud2 assembly mnemonic when available.
On Thu, Feb 13, 2014 at 3:46 PM, Roland McGrath <mcgra...@google.com> wrote:
> Non-ancient assemblers support the ud2 mnemonic, so there is no need to emit the literal opcode as data.
>
> OK for trunk and 4.8?

I changed this to use .word due to openbsd3.1: http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01347.html . I no longer have access to this older openbsd box so I don't object to this change. In fact I doubt we support any binutils that are pre 2.0 any more; so maybe move over unconditionally to ud2.

Thanks,
Andrew Pinski

> Thanks,
> Roland
>
> [...]
Re: [PATCH] x86: Use ud2 assembly mnemonic when available.
On Thu, Feb 13, 2014 at 3:50 PM, Andrew Pinski <pins...@gmail.com> wrote:
> On Thu, Feb 13, 2014 at 3:46 PM, Roland McGrath <mcgra...@google.com> wrote:
> > Non-ancient assemblers support the ud2 mnemonic, so there is no need to emit the literal opcode as data.
> >
> > OK for trunk and 4.8?
>
> I changed this to use .word due to openbsd3.1: http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01347.html . I no longer have access to this older openbsd box so I don't object to this change. In fact I doubt we support any binutils that are pre 2.0 any more; so maybe move over unconditionally to ud2.

Oh looking into this further, it looks like Sun's assembler does not support it either: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23359

Thanks,
Andrew

> Thanks,
> Andrew Pinski
>
> > Thanks,
> > Roland
> >
> > [...]
Re: [PATCH] x86: Use ud2 assembly mnemonic when available.
Did you read the patch? It uses an empirical configure check to discover if the assembler does in fact support ud2.
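
For readers following along, the reason the mnemonic and the data fallback are interchangeable is that the 16-bit constant 0x0b0f, stored little-endian as x86 does, comes out as the bytes 0f 0b, which is the encoding of ud2. A minimal sketch of that arithmetic follows; the helper name short_directive_bytes is made up for illustration and is not part of GCC or the patch.

```python
import struct

# x86 machine-code encoding of the ud2 instruction.
UD2_OPCODE = bytes([0x0F, 0x0B])


def short_directive_bytes(value):
    """Return the bytes a little-endian target stores for a 16-bit data word.

    Hypothetical helper: it only models what a 16-bit data directive (the
    kind of thing ASM_SHORT stands for) would place in the object file.
    """
    return struct.pack("<H", value)


if __name__ == "__main__":
    emitted = short_directive_bytes(0x0B0F)
    # Show the emitted bytes and whether they equal the ud2 encoding.
    print(emitted.hex(), emitted == UD2_OPCODE)
```

Either spelling should therefore yield the same two bytes in the object file; the configure check only decides which one the assembler is asked to parse.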
[PATCH][ARM] add HFmode to arm_preferred_simd_mode
Hi,

Is there any reason why HFmode is not there in arm_preferred_simd_mode? NEON does support this.

Cross regression tested for arm-none-linux-gnueabi with qemu and no new regressions. Attached patch enables this. Is this OK for stage1?

Thanks,
Kugan

gcc/

+2014-02-14  Kugan Vivekanandarajah  <kug...@linaro.org>
+
+        * config/arm/arm.c (arm_preferred_simd_mode): Add HFmode to
+        preferred modes.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b49f43e..bd90e85 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28564,6 +28564,10 @@ arm_preferred_simd_mode (enum machine_mode mode)
   if (TARGET_NEON)
     switch (mode)
       {
+      case HFmode:
+        if (arm_fp16_format)
+          return TARGET_NEON_VECTORIZE_DOUBLE ? V4HFmode : V8HFmode;
+        break;
       case SFmode:
         return TARGET_NEON_VECTORIZE_DOUBLE ? V2SFmode : V4SFmode;
       case SImode:
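
The mapping the hunk proposes can be modelled outside GCC as a small lookup. This is only a toy sketch: mode names are plain strings, the two flags stand in for TARGET_NEON_VECTORIZE_DOUBLE and arm_fp16_format, and the function name preferred_simd_mode is an illustration rather than the real arm.c code. Cases that are not visible in the posted hunk return None.

```python
def preferred_simd_mode(mode, vectorize_double=False, fp16_format=True):
    """Toy model of the NEON switch shown in the patch.

    Returns the vector mode name chosen by the visible cases, or None when
    control leaves the switch (what happens next is outside the hunk).
    """
    if mode == "HFmode":
        # New case from the patch: only taken when an fp16 format is selected.
        if fp16_format:
            return "V4HFmode" if vectorize_double else "V8HFmode"
        return None
    if mode == "SFmode":
        # Existing case, shown as context in the hunk.
        return "V2SFmode" if vectorize_double else "V4SFmode"
    return None


if __name__ == "__main__":
    print(preferred_simd_mode("HFmode"))
    print(preferred_simd_mode("HFmode", vectorize_double=True))
```

The double-word variant halves the element count, matching the existing SFmode case (V2SF versus V4SF).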
Re: [PATCH][ARM] add HFmode to arm_preferred_simd_mode
On Thu, Feb 13, 2014 at 4:15 PM, Kugan <kugan.vivekanandara...@linaro.org> wrote:
> Hi,
>
> Is there any reason why HFmode is not there in arm_preferred_simd_mode? NEON does support this.

Most likely because there is no support for Half-float in the vectorizer.

Thanks,
Andrew Pinski

> Cross regression tested for arm-none-linux-gnueabi with qemu and no new regressions. Attached patch enables this. Is this OK for stage1.
>
> Thanks,
> Kugan
>
> gcc/
>
> +2014-02-14  Kugan Vivekanandarajah  <kug...@linaro.org>
> +
> +        * config/arm/arm.c (arm_preferred_simd_mode): Add HFmode to
> +        preferred modes.
Re: [PATCH, testsuite] Fix profile test failures
On Thu, 13 Feb 2014, Steve Ellcey wrote:

> So let me make sure I understand this: host is where you run the testsuite from, build is where the compilation happens (probably the same as host for most people), and target is where the test program is executed.

Host is the system on which the compilers being tested run. Build is the system on which runtest runs and executes the .exp files. They are only different in the case of remote-host testing (using DejaGnu on GNU/Linux to test a compiler for Windows host, for example) - typically the same cases in which a Canadian cross compiler is built.

> > To determine whether anything should be changed in a GCC .exp file, reason about which of the three systems (build, host, target) a file is on, or is needed on, at each point, rather than looking at what does or does not work with a buggy board file.
>
> I am not convinced that the problem is in the board file because the only tests I see fail this way are the ones in g++.exp/bprob and that is also the only GCC .exp file that uses remote_upload or remote_file with 'target'. I will dig into it some more and also try it with some different boards.

Branch profiling involves the generated executables creating files with profile information when they run, so those files (on the target) need manipulating. Most testsuites do not involve testcases generating any files. But the libstdc++ testsuite uses remote_download to transfer files to the target, because various testcases need to open and read input files.

-- 
Joseph S. Myers
jos...@codesourcery.com
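
The copy rules stated earlier in this thread (upload towards build, download away from build) can be written down as a tiny decision function. This is just a restatement of those sentences in Python, not DejaGnu code; transfer_call and SYSTEMS are hypothetical names, and the function only returns which procedure to call and which system to pass to it.

```python
SYSTEMS = ("build", "host", "target")


def transfer_call(src, dst):
    """Pick the DejaGnu procedure for copying a file from src to dst.

    Returns (procedure_name, system_argument) following the rules quoted in
    the thread; raises ValueError for copies the thread does not describe.
    """
    if src not in SYSTEMS or dst not in SYSTEMS:
        raise ValueError("unknown system: %s -> %s" % (src, dst))
    if dst == "build" and src != "build":
        # Copying to the system running DejaGnu: upload from where the file is.
        return ("remote_upload", src)
    if src == "build" and dst != "build":
        # Copying away from the system running DejaGnu: download to the destination.
        return ("remote_download", dst)
    raise ValueError("no single call described for %s -> %s" % (src, dst))


if __name__ == "__main__":
    # The bprob case: profile data is written on the target and fetched to build.
    print(transfer_call("target", "build"))
```

For deletion or other manipulation no copy is involved; per the thread, the operation is simply run against whichever system the file is on.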
RE: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8
Ping ^3

These fixes are very important to 4.8 ARM embedded users, as they rely on strict volatile bitfields a lot. Please let them in 4.8.

> -----Original Message-----
> From: Joey Ye [mailto:joey...@arm.com]
> Sent: Saturday, February 08, 2014 10:42
> To: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8
>
> Ping ^ 2
>
> OK to 4.8?
>
> > -----Original Message-----
> > From: Joey Ye [mailto:joey...@arm.com]
> > Sent: Monday, January 20, 2014 10:47
> > To: gcc-patches@gcc.gnu.org
> > Subject: RE: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8
> >
> > Ping
> >
> > > -----Original Message-----
> > > From: Joey Ye [mailto:joey...@arm.com]
> > > Sent: Thursday, January 16, 2014 16:28
> > > To: gcc-patches@gcc.gnu.org
> > > Subject: [PATCH][4.8] Backport strict-volatile-bitfields fixes to 4.8
> > >
> > > 4.8 has a number of strict-volatile-bitfields issues that can be fixed by following patches. trunk@205899, 205898, 205897, 205896, 203003
> > >
> > > Tested on x86_64 and arm without regression. OK to 4.8?
> > >
> > > 2013-09-28  Sandra Loosemore  <san...@codesourcery.com>
> > >
> > > gcc/
> > > * expr.h (extract_bit_field): Remove packedp parameter.
> > > * expmed.c (extract_fixed_bit_field): Remove packedp parameter from forward declaration.
> > > (store_split_bit_field): Remove packedp arg from calls to extract_fixed_bit_field.
> > > (extract_bit_field_1): Remove packedp parameter and packedp argument from recursive calls and calls to extract_fixed_bit_field.
> > > (extract_bit_field): Remove packedp parameter and corresponding arg to extract_bit_field_1.
> > > (extract_fixed_bit_field): Remove packedp parameter. Remove code to issue warnings.
> > > (extract_split_bit_field): Remove packedp arg from call to extract_fixed_bit_field.
> > > * expr.c (emit_group_load_1): Adjust calls to extract_bit_field.
> > > (copy_blkmode_from_reg): Likewise.
> > > (copy_blkmode_to_reg): Likewise.
> > > (read_complex_part): Likewise.
> > > (store_field): Likewise.
> > > (expand_expr_real_1): Likewise.
> > > * calls.c (store_unaligned_arguments_into_pseudos): Adjust call to extract_bit_field.
> > > * config/tilegx/tilegx.c (tilegx_expand_unaligned_load): Adjust call to extract_bit_field.
> > > * config/tilepro/tilepro.c (tilepro_expand_unaligned_load): Adjust call to extract_bit_field.
> > > * doc/invoke.texi (Code Gen Options): Remove mention of warnings and special packedp behavior from -fstrict-volatile-bitfields documentation.
> > >
> > > 2013-12-11  Bernd Edlinger  <bernd.edlin...@hotmail.de>
> > >
> > > * expr.c (expand_assignment): Remove dependency on flag_strict_volatile_bitfields. Always set the memory access mode.
> > > (expand_expr_real_1): Likewise.
> > >
> > > 2013-12-11  Sandra Loosemore  <san...@codesourcery.com>
> > >
> > > PR middle-end/23623
> > > PR middle-end/48784
> > > PR middle-end/56341
> > > PR middle-end/56997
> > >
> > > gcc/
> > > * expmed.c (strict_volatile_bitfield_p): New function.
> > > (store_bit_field_1): Don't special-case strict volatile bitfields here.
> > > (store_bit_field): Handle strict volatile bitfields here instead.
> > > (store_fixed_bit_field): Don't special-case strict volatile bitfields here.
> > > (extract_bit_field_1): Don't special-case strict volatile bitfields here.
> > > (extract_bit_field): Handle strict volatile bitfields here instead.
> > > (extract_fixed_bit_field): Don't special-case strict volatile bitfields here. Simplify surrounding code to resemble that in store_fixed_bit_field.
> > > * doc/invoke.texi (Code Gen Options): Update -fstrict-volatile-bitfields description.
> > >
> > > gcc/testsuite/
> > > * gcc.dg/pr23623.c: New test.
> > > * gcc.dg/pr48784-1.c: New test.
> > > * gcc.dg/pr48784-2.c: New test.
> > > * gcc.dg/pr56341-1.c: New test.
> > > * gcc.dg/pr56341-2.c: New test.
> > > * gcc.dg/pr56997-1.c: New test.
> > > * gcc.dg/pr56997-2.c: New test.
> > > * gcc.dg/pr56997-3.c: New test.
> > >
> > > 2013-12-11  Bernd Edlinger  <bernd.edlin...@hotmail.de>
> > >             Sandra Loosemore  <san...@codesourcery.com>
> > >
> > > PR middle-end/23623
> > > PR middle-end/48784
> > > PR middle-end/56341
> > > PR middle-end/56997
> > >
> > > * expmed.c (strict_volatile_bitfield_p): Add bitregion_start and bitregion_end parameters. Test for compliance with C++ memory model.
> > > (store_bit_field): Adjust call to strict_volatile_bitfield_p. Add fallback logic for cases where -fstrict-volatile-bitfields is supposed to apply, but cannot.
> > > (extract_bit_field): Likewise. Use narrow_bit_field_mem and
Re: FRE may run out of memory
Richard Biener-2 wrote
> On Sat, Feb 8, 2014 at 8:29 AM, dxq <ziyan01@> wrote:
> > hi all,
> >
> > We found that gcc would run out of memory on Windows when compiling a *big* function (10 lines).
> >
> > More investigation shows that gcc crashes at the function *compute_avail*, in tree-fre pass. *compute_avail* collects information from basic blocks, so memory is allocated to record information. However, if there is a huge number of basic blocks, the memory would be exhausted and gcc would crash, especially on a Windows PC, which generally has only 2G or 4G of memory. It's OK on Linux, and *compute_avail* allocates *2.4G* memory.
> >
> > I guess some optimization passes in gcc like FRE didn't consider the extreme case.
>
> This was fixed for GCC 4.8, FRE no longer uses compute_avail (but PRE still does). Basically GCC 4.8 should (at -O1) compile most extreme cases just fine.
>
> Richard.

hi, Richard,

More investigation shows that:

1, Loop-related passes take more compile time and memory, especially pass_rtl_move_loop_invariants and lim; at least lim on tree has a large impact on the following passes.

2, ira takes more than 20g of memory in function *create_loop_tree_nodes*, because ira chooses 'mixed' or 'all' region when optimize level.

3, The sms pass always creates ddgs for all loops in the compiled function, then does sms optimization for all loops, and finally frees the ddgs. If there is a huge number of loops, sms may crash when creating ddgs because it runs out of memory.

Could someone confirm the memory pressure problem in the passes above?

Thanks for your reply!

danxiaoqiang

-- 
View this message in context: http://gcc.1065356.n5.nabble.com/FRE-may-run-out-of-memory-tp1009578p1011035.html
Sent from the gcc - patches mailing list archive at Nabble.com.
Re: [PATCH][ARM] add HFmode to arm_preferred_simd_mode
On 14/02/14 11:24, Andrew Pinski wrote:
> On Thu, Feb 13, 2014 at 4:15 PM, Kugan <kugan.vivekanandara...@linaro.org> wrote:
> > Hi,
> >
> > Is there any reason why HFmode is not there in arm_preferred_simd_mode? NEON does support this.
>
> Most likely because there is no support for Half-float in the vectorizer.

I can see get_vectype_for_scalar_type_and_size failing while building the vector type (with build_vector_type) for Half-float. I guess we should add support there first.

Thanks,
Kugan
Re: [PATCH 4/6] [GOMP4] OpenACC 1.0+ support in fortran front-end
Committed as r207776. -- Ilmir.
RE: [Patch, microblaze]: Add optimized lshrsi3
Hi Michael,

> -----Original Message-----
> From: Michael Eager [mailto:ea...@eagerm.com]
> Sent: Sunday, 9 February 2014 2:58 am
> To: David Holsgrove; gcc-patches@gcc.gnu.org
> Cc: Edgar Iglesias; John Williams; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: [Patch, microblaze]: Add optimized lshrsi3
>
> On 11/25/13 23:53, David Holsgrove wrote:
> > Add optimized lshrsi3 instruction, to be used when optimizing for size with immediate values over 5
> >
> > Changelog
> >
> > 2013-11-26  Nagaraju Mekala  <nagaraju.mek...@xilinx.com>
> >
> > * gcc/config/microblaze/microblaze.md: Add size optimized lshrsi3 insn.
>
> David --
>
> Please put the description of the patch in the text of the email, rather than hiding it within an attached patch.
>
> The patch describes a very specific situation where this patch will have an effect. Please provide a test case.

Updated version of patch attached with testcase. New Changelog entries are;

Changelog

2013-11-26  David Holsgrove  <david.holsgr...@xilinx.com>

 * gcc/config/microblaze/microblaze.md: Add size optimized lshrsi3 insn

ChangeLog/testsuite

2014-02-12  David Holsgrove  <david.holsgr...@xilinx.com>

 * gcc/testsuite/gcc.target/microblaze/others/lshrsi_Os_1.c: New test.

thanks,
David

> --
> Michael Eager  ea...@eagercon.com
> 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

0003-Patch-microblaze-Add-optimized-lshrsi3.patch
Description: 0003-Patch-microblaze-Add-optimized-lshrsi3.patch
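
The attached patch is not visible in this archive, so the following is only a guess at the idea behind it. Assuming the target has no single-instruction multi-bit shift in the configuration concerned, a constant shift by n expands to n one-bit shifts, and code size grows with n; a counted loop of one-bit shifts has a fixed size instead. The threshold of 5 comes from the patch description; everything else here, including the name lshr32_loop, is an assumption made for illustration. The sketch just checks that the loop form computes the same logical right shift.

```python
MASK32 = 0xFFFFFFFF


def lshr32_loop(value, count):
    """Logical right shift of a 32-bit value using only one-bit shifts.

    Models the loop form: a counter is decremented while the value is
    shifted right by one position per iteration.
    """
    value &= MASK32  # treat the input as an unsigned 32-bit quantity
    remaining = count
    while remaining > 0:
        value >>= 1
        remaining -= 1
    return value


if __name__ == "__main__":
    print(hex(lshr32_loop(0x80000000, 6)))
```

The trade-off is the usual one for -Os: the loop executes more instructions at run time but occupies fewer bytes once the shift count is large enough.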
RE: [Patch, microblaze]: Remove SECONDARY_MEMORY_NEEDED
Hi Michael, List,

> -----Original Message-----
> From: David Holsgrove
> Sent: Wednesday, 22 January 2014 1:43 pm
> To: 'Michael Eager'; gcc-patches@gcc.gnu.org
> Cc: Edgar Iglesias; John Williams; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: RE: [Patch, microblaze]: Remove SECONDARY_MEMORY_NEEDED
>
> Hi Michael,
>
> > -----Original Message-----
> > From: Michael Eager [mailto:ea...@eagerm.com]
> > Sent: Friday, 17 January 2014 4:44 am
> > To: David Holsgrove; gcc-patches@gcc.gnu.org
> > Cc: Edgar Iglesias; John Williams; Vidhumouli Hunsigida; Nagaraju Mekala
> > Subject: Re: [Patch, microblaze]: Remove SECONDARY_MEMORY_NEEDED
> >
> > On 11/25/13 23:51, David Holsgrove wrote:
> > > Hi Michael,
> > >
> > > I've attached patch based on latest gcc master. Please let me know if you need anything further.
> > >
> > > thanks,
> > > David
> > >
> > > On 15 July 2013 14:44, David Holsgrove <david.holsgr...@xilinx.com> wrote:
> > > > Hi Michael,
> > > >
> > > > On 18 March 2013 22:49, David Holsgrove <david.holsgr...@xilinx.com> wrote:
> > > > > MicroBlaze doesn't have restrictions that would force us to reload regs via memory. Don't define SECONDARY_MEMORY_NEEDED. Fixes an ICE when compiling OpenSSL for linux.
> > > > >
> > > > > Changelog
> > > > >
> > > > > 2013-03-18  Edgar E. Iglesias  <edgar.igles...@xilinx.com>
> > > > >
> > > > > * gcc/config/microblaze/microblaze.h: Remove SECONDARY_MEMORY_NEEDED definition.
> > > > >
> > > > > Signed-off-by: Edgar E. Iglesias <edgar.igles...@xilinx.com>
> > > > > Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwa...@xilinx.com>
> > > >
> > > > Patch remains the same, please apply when ready.
> > > >
> > > > thanks,
> > > > David
> >
> > Hi David --
> >
> > Is it possible to add a test case which shows the ICE?
>
> I'm afraid I don't still have my test environment for this patch from last March, I'll attempt to recreate and distil into a small test case if possible, based on the error encountered whilst building openssl.
>
> I'll update again when I have some further detail.

I've managed to recreate the original internal compiler error whilst building openssl with microblazeel linux toolchain.

I've reduced the error down to the attached testcase. It is taken directly from openssl (with no dependencies on openssl headers), so I'm unsure of the suitability of this test both technically and license wise for inclusion in gcc.

Changelog entry would be;

2013-03-18  Edgar E. Iglesias  <edgar.igles...@xilinx.com>

 * gcc/config/microblaze/microblaze.h: Remove SECONDARY_MEMORY_NEEDED definition.

ChangeLog/testsuite

2014-02-13  David Holsgrove  <david.holsgr...@xilinx.com>

 * gcc/testsuite/gcc.target/microblaze/others/mem_reload.c: New test.

thanks,
David

> thanks,
> David
>
> > Thanks.
> >
> > --
> > Michael Eager  ea...@eagercon.com
> > 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

0002-Patch-microblaze-Remove-SECONDARY_MEMORY_NEEDED.patch
Description: 0002-Patch-microblaze-Remove-SECONDARY_MEMORY_NEEDED.patch
RE: [Patch, microblaze]: Add TARGET_ASM_OUTPUT_MI_THUNK to support varargs thunk
Hi Michael,

> -----Original Message-----
> From: Michael Eager [mailto:ea...@eagerm.com]
> Sent: Sunday, 26 January 2014 1:57 am
> To: David Holsgrove
> Cc: gcc-patches@gcc.gnu.org; Edgar Iglesias; John Williams; Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui
> Subject: Re: [Patch, microblaze]: Add TARGET_ASM_OUTPUT_MI_THUNK to support varargs thunk
>
> On 07/14/13 21:37, David Holsgrove wrote:
> > Hi Michael,
> >
> > > -----Original Message-----
> > > From: Michael Eager [mailto:ea...@eagerm.com]
> > > Sent: Saturday, 13 July 2013 9:33 am
> > > To: David Holsgrove
> > > Cc: gcc-patches@gcc.gnu.org; Edgar Iglesias; John Williams; Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Mekala; Tom Shui
> > > Subject: Re: [Patch, microblaze]: Add TARGET_ASM_OUTPUT_MI_THUNK to support varargs thunk
> > >
> > > On 03/18/13 05:49, David Holsgrove wrote:
> > > > Changelog
> > > >
> > > > 2013-03-18  David Holsgrove  <david.holsgr...@xilinx.com>
> > > >
> > > > * gcc/config/microblaze/microblaze.c: Add microblaze_asm_output_mi_thunk and define TARGET_ASM_OUTPUT_MI_THUNK and TARGET_ASM_CAN_OUTPUT_MI_THUNK
> > >
> > > Sorry it has taken so long to review this patch.
> >
> > [--snip--]
> >
> > 2013-07-15  David Holsgrove  <david.holsgr...@xilinx.com>
> >
> > * gcc/config/microblaze/microblaze.c: Add microblaze_asm_output_mi_thunk and define TARGET_ASM_OUTPUT_MI_THUNK and TARGET_ASM_CAN_OUTPUT_MI_THUNK
>
> This patch causes a number of regressions in the G++ test suite. For example, abi/covariant{3,4,5}.C, abi/vcall1.C, inherit/covariant{1,2,3,4,17,18}.C, inherit/thunk{7,10}.C and others.

Apologies - this patch was originally written in 2012 and submitted to this list a year ago. It has not been reviewed or tested for regressions in 12 months, and has taken me a bit of time to go back to the original work and rerun the testsuite as it stands today.

Please find attached updated patch which has no regressions. I believe the testcase which checks the functionality of this patch is 'g++.old-deja/g++.jason/thunk3.C'

Changelog entry remains the same since March 2013.

thanks,
David

> --
> Michael Eager  ea...@eagercon.com
> 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

0001-Patch-microblaze-Add-TARGET_ASM_OUTPUT_MI_THUNK-to-s.patch
Description: 0001-Patch-microblaze-Add-TARGET_ASM_OUTPUT_MI_THUNK-to-s.patch
[Patch, testsuite]: Add MicroBlaze pattern for dg-function-on-line
Hi,

Attached patch adds a MicroBlaze specific pattern for checking line number and generation of function in dg-function-on-line, in line with the mips method.

Changelog/testsuite

2014-02-13  David Holsgrove  <david.holsgr...@xilinx.com>

 * gcc/testsuite/lib/scanasm.exp (dg-function-on-line): Add MicroBlaze specific pattern.

thanks,
David

0004-Patch-testsuite-Add-MicroBlaze-pattern-for-dg-functi.patch
Description: 0004-Patch-testsuite-Add-MicroBlaze-pattern-for-dg-functi.patch
[Patch, testsuite]: Allow MicroBlaze .weakext pattern in regex match
Hi All,

I've attached a patch to extend the regex pattern to include optional 'ext' at the end of '.weak' to match the MicroBlaze weak label '.weakext' in two of the g++ testcases.

The only other rule in these tests was for ! { *-*-darwin* }, so I'm not sure if it's appropriate to modify the scan-assembler line in this fashion for a specific architecture's pattern?

ChangeLog/testsuite

2014-02-14  David Holsgrove  <david.holsgr...@xilinx.com>

 * gcc/testsuite/g++.dg/abi/rtti3.C: Extend scan-assembler pattern to take optional ext after .weak.
 * gcc/testsuite/g++.dg/abi/thunk4.C: Likewise.

thanks,
David

0005-Patch-testsuite-Allow-MicroBlaze-.weakext-pattern-in.patch
Description: 0005-Patch-testsuite-Allow-MicroBlaze-.weakext-pattern-in.patch
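
The regex change described above amounts to making the 'ext' suffix optional. A rough Python illustration of that idea follows; the real scan-assembler patterns are Tcl regexes inside rtti3.C and thunk4.C and name specific mangled symbols, so the pattern, the helper weak_symbol and the sample symbol below are stand-ins, not the text of the patch.

```python
import re

# Stand-in pattern: '.weak', an optional 'ext', whitespace, then the symbol.
WEAK_DIRECTIVE = re.compile(r"\.weak(?:ext)?[ \t]+(\S+)")


def weak_symbol(asm_line):
    """Return the symbol named by a .weak or .weakext directive, else None."""
    match = WEAK_DIRECTIVE.search(asm_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Sample assembler lines; the symbol name is only an example.
    for line in ("\t.weak\t_ZTS1A", "\t.weakext\t_ZTS1A", "\t.globl\t_ZTS1A"):
        print(repr(line), "->", weak_symbol(line))
```

Because the suffix is optional, targets that emit plain '.weak' keep matching exactly as before, which is the property the question about architecture-specific patterns hinges on.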
Re: [PATCH] x86: Use ud2 assembly mnemonic when available.
Hello!

> Non-ancient assemblers support the ud2 mnemonic, so there is no need to emit the literal opcode as data.
>
> OK for trunk and 4.8?

You forgot to tell us how the patch was tested...

> gcc/
> 2014-02-13  Roland McGrath  <mcgra...@google.com>
>
>         * configure.ac (HAVE_AS_IX86_UD2): New test for 'ud2' mnemonic.
>         * configure: Regenerated.
>         * config.in: Regenerated.
>         * config/i386/i386.md (trap) [HAVE_AS_IX86_UD2]: Use the mnemonic
>         instead of ASM_SHORT.

OK for mainline and release branches.

Thanks,
Uros.
Re: [RS6000] power8 internal compiler errors
On Wed, Feb 12, 2014 at 06:47:37PM +0100, Ulrich Weigand wrote:
> Note that find_replacement itself already recurses into both sides of a PLUS.

Thanks, I missed seeing that. I'd analysed the bug and knew what needed doing from past forays into reload, so went looking for ways to get at the reloads, ie. replacements at that stage of reload. Lo and behold, there's a function tailor made to do just that! So I plugged in find_replacement() wherever it seemed necessary.

> So it might be easier and cheaper overall to just do a find_replacement within the PRE_MODIFY clause ...

That's a good idea, since PRE_MODIFY doesn't occur that often. Here is the revised patch with your recommendations. Bootstrapped and regression tested powerpc64-linux.

        PR target/58675
        PR target/57935
        * config/rs6000/rs6000.c (rs6000_secondary_reload_inner): Use
        find_replacement on parts of insn rtl that might be reloaded.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 207649)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -16170,7 +16156,7 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
     rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
   rclass = REGNO_REG_CLASS (regno);
-  addr = XEXP (mem, 0);
+  addr = find_replacement (XEXP (mem, 0));
 
   switch (rclass)
     {
@@ -16181,19 +16167,18 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
       if (GET_CODE (addr) == AND)
         {
           and_op2 = XEXP (addr, 1);
-          addr = XEXP (addr, 0);
+          addr = find_replacement (XEXP (addr, 0));
         }
 
       if (GET_CODE (addr) == PRE_MODIFY)
         {
-          scratch_or_premodify = XEXP (addr, 0);
+          scratch_or_premodify = find_replacement (XEXP (addr, 0));
           if (!REG_P (scratch_or_premodify))
             rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
-          if (GET_CODE (XEXP (addr, 1)) != PLUS)
+          addr = find_replacement (XEXP (addr, 1));
+          if (GET_CODE (addr) != PLUS)
             rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
-          addr = XEXP (addr, 1);
         }
 
       if (GET_CODE (addr) == PLUS
@@ -16201,6 +16186,8 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
               || !rs6000_legitimate_offset_address_p (PTImode, addr,
                                                       false, true)))
         {
+          /* find_replacement already recurses into both operands of
+             PLUS so we don't need to call it here.  */
           addr_op1 = XEXP (addr, 0);
           addr_op2 = XEXP (addr, 1);
           if (!legitimate_indirect_address_p (addr_op1, false))
@@ -16276,7 +16263,7 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
               || !VECTOR_MEM_ALTIVEC_P (mode)))
         {
           and_op2 = XEXP (addr, 1);
-          addr = XEXP (addr, 0);
+          addr = find_replacement (XEXP (addr, 0));
         }
 
       /* If we aren't using a VSX load, save the PRE_MODIFY register and use it
@@ -16288,14 +16275,13 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, r
               || and_op2 != NULL_RTX
               || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
         {
-          scratch_or_premodify = XEXP (addr, 0);
+          scratch_or_premodify = find_replacement (XEXP (addr, 0));
           if (!legitimate_indirect_address_p (scratch_or_premodify, false))
             rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
-          if (GET_CODE (XEXP (addr, 1)) != PLUS)
+          addr = find_replacement (XEXP (addr, 1));
+          if (GET_CODE (addr) != PLUS)
             rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
-          addr = XEXP (addr, 1);
         }
 
       if (legitimate_indirect_address_p (addr, false) /* reg */

-- 
Alan Modra
Australia Development Lab, IBM