[PR83663] Revert r255946
Hello,

This patch reverts the changes introduced by r255946, together with the
follow-up changes to them made in r256195, as the former causes a large
number of regressions on aarch64_be* targets. The change should be respun
once the mismatch between AArch64 lane numbering and GCC's lane numbering
has been fixed, as explained in PR83663.

OK for trunk?

VP.

ChangeLog:

gcc/

    PR target/83663 - Revert r255946
    * config/aarch64/aarch64.c (aarch64_expand_vector_init): Modify
    code generation for cases where splatting a value is not useful.
    * simplify-rtx.c (simplify_ternary_operation): Simplify vec_merge
    across a vec_duplicate and a paradoxical subreg forming a vector
    mode to a vec_concat.

gcc/testsuite/

    PR target/83663 - Revert r255946
    * gcc.target/aarch64/vect-slp-dup.c: New.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a189605..03a92b6 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -12129,51 +12129,9 @@ aarch64_expand_vector_init (rtx target, rtx vals)
               maxv = matches[i][1];
             }
 
-      /* Create a duplicate of the most common element, unless all elements
-         are equally useless to us, in which case just immediately set the
-         vector register using the first element.  */
-
-      if (maxv == 1)
-        {
-          /* For vectors of two 64-bit elements, we can do even better.  */
-          if (n_elts == 2
-              && (inner_mode == E_DImode
-                  || inner_mode == E_DFmode))
-
-            {
-              rtx x0 = XVECEXP (vals, 0, 0);
-              rtx x1 = XVECEXP (vals, 0, 1);
-              /* Combine can pick up this case, but handling it directly
-                 here leaves clearer RTL.
-
-                 This is load_pair_lanes, and also gives us a clean-up
-                 for store_pair_lanes.  */
-              if (memory_operand (x0, inner_mode)
-                  && memory_operand (x1, inner_mode)
-                  && !STRICT_ALIGNMENT
-                  && rtx_equal_p (XEXP (x1, 0),
-                                  plus_constant (Pmode,
-                                                 XEXP (x0, 0),
-                                                 GET_MODE_SIZE (inner_mode))))
-                {
-                  rtx t;
-                  if (inner_mode == DFmode)
-                    t = gen_load_pair_lanesdf (target, x0, x1);
-                  else
-                    t = gen_load_pair_lanesdi (target, x0, x1);
-                  emit_insn (t);
-                  return;
-                }
-            }
-          rtx x = copy_to_mode_reg (inner_mode, XVECEXP (vals, 0, 0));
-          aarch64_emit_move (target, lowpart_subreg (mode, x, inner_mode));
-          maxelement = 0;
-        }
-      else
-        {
-          rtx x = copy_to_mode_reg (inner_mode, XVECEXP (vals, 0, maxelement));
-          aarch64_emit_move (target, gen_vec_duplicate (mode, x));
-        }
+      /* Create a duplicate of the most common element.  */
+      rtx x = copy_to_mode_reg (inner_mode, XVECEXP (vals, 0, maxelement));
+      aarch64_emit_move (target, gen_vec_duplicate (mode, x));
 
       /* Insert the rest.  */
       for (int i = 0; i < n_elts; i++)
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 6cb5a6e..b052fbb 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -5888,57 +5888,6 @@ simplify_ternary_operation (enum rtx_code code, machine_mode mode,
             return simplify_gen_binary (VEC_CONCAT, mode, newop0, newop1);
           }
 
-      /* Replace:
-
-          (vec_merge:outer (vec_duplicate:outer x:inner)
-                           (subreg:outer y:inner 0)
-                           (const_int N))
-
-         with (vec_concat:outer x:inner y:inner) if N == 1,
-         or (vec_concat:outer y:inner x:inner) if N == 2.
-         We assume that degenrate cases (N == 0 or N == 3), which
-         represent taking all elements from either input, are handled
-         elsewhere.
-
-         Implicitly, this means we have a paradoxical subreg, but such
-         a check is cheap, so make it anyway.
-
-         Only applies for vectors of two elements.  */
-
-      if ((GET_CODE (op0) == VEC_DUPLICATE
-           || GET_CODE (op1) == VEC_DUPLICATE)
-          && GET_MODE (op0) == GET_MODE (op1)
-          && known_eq (GET_MODE_NUNITS (GET_MODE (op0)), 2)
-          && known_eq (GET_MODE_NUNITS (GET_MODE (op1)), 2)
-          && IN_RANGE (sel, 1, 2))
-        {
-          rtx newop0 = op0, newop1 = op1;
-
-          /* Canonicalize locally such that the VEC_DUPLICATE is always
-             the first operand.  */
-          if (GET_CODE (newop1) == VEC_DUPLICATE)
-            {
-              std::swap (newop0, newop1);
-              /* If we swap the operand order, we also need to swap
-                 the selector mask.  */
-              sel = sel == 1 ? 2 : 1;
-            }
-
-          if (GET_CODE (newop1) == SUBREG
-              && paradoxical_subreg_p (newop1)
-              && subreg_lowpart_p (newop1)
-              && GET_MODE (SUBREG_REG (newop1))
-                  == GET_MODE (XEXP (newop0, 0)))
-            {
-              newop0 = XEXP (newop0, 0);
-              newop1 = SUBREG_REG (newop1);
-              if (sel == 2)
-                std::swap (newop0, newop1);
-              return simplify_gen_binary (VEC_CONCAT, mode,
-                                          newop0, newop1);
-            }
-        }
-
       /* Replace (vec_merge (vec_duplicate x) (vec_duplicate y)
                              (const_int n))
          with (vec_concat x y) or (vec_concat y x) depending on value
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-slp-dup.c b/gcc/testsuite/gcc.target/aarch64/vect-slp-dup.c
deleted file mode 100644
index 0541
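
For readers following the simplify-rtx.c hunk, here is a rough model of the selector arithmetic that the removed comment describes. It is a sketch in Python rather than RTL, and it assumes little-endian lane numbering: bit i of the selector controls lane i, and the lowpart of the paradoxical subreg is lane 0. That is precisely the assumption PR83663 says does not carry over to aarch64_be. All function names here are hypothetical and are not GCC APIs.

```python
# Illustrative model only; none of these names exist in GCC.
UNDEF = object()  # stands in for the undefined upper lane of a paradoxical subreg


def vec_merge(op0, op1, sel):
    """Lane i is taken from op0 if bit i of sel is set, otherwise from op1."""
    return [a if (sel >> i) & 1 else b
            for i, (a, b) in enumerate(zip(op0, op1))]


def vec_duplicate(x, n=2):
    """Splat x into every lane."""
    return [x] * n


def paradoxical_lowpart(y, n=2):
    """y occupies lane 0; the other lanes are undefined (little-endian assumption)."""
    return [y] + [UNDEF] * (n - 1)


def simplify_to_vec_concat(x, y, sel):
    """Two-element rule from the removed comment: N == 1 or N == 2 only."""
    if sel == 1:
        return [x, y]
    if sel == 2:
        return [y, x]
    return None  # degenerate selectors are assumed to be handled elsewhere


def compatible(merged, concat):
    """An undefined lane may be replaced by any value."""
    return all(m is UNDEF or m == c for m, c in zip(merged, concat))
```

Under this model both non-degenerate selectors give a vec_concat that agrees with the vec_merge on every defined lane. With the lane order reversed, the same selector picks the other element, which is one way to picture the mismatch the revert is about.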
[patch][arm] (respin) Improve error checking in parsecpu.awk
Hello,

This patch by Richard Earnshaw was reverted along with the commit that
preceded it, because the preceding commit was causing cross-native builds
to fail and I presumed this patch was related too. I am now respinning it,
as the issue that caused the cross-native failure has been fixed. This
patch is simply rebased and has no other changes.

For reference, here is the ChangeLog of the preceding patch that broke the
cross-native build:

  [arm] auto-generate arm-isa.h from CPU descriptions

  This patch autogenerates arm-isa.h from new entries in arm-cpus.in.
  This has the primary advantage that it makes the description file more
  self-contained, but it also solves the 'array dimensioning' problem
  that Tamar recently encountered.  It adds two new constructs to
  arm-cpus.in: features and fgroups.  Fgroups are simply a way of naming
  a group of feature bits so that they can be referenced together.  We
  follow the convention that feature bits are all lower case, while
  fgroups are (predominantly) upper case.  This is helpful as in some
  contexts they share the same namespace.  Most of the minor changes in
  this patch are related to adopting this new naming convention.

    * config.gcc (arm*-*-*): Don't add arm-isa.h to tm_p_file.
    * config/arm/arm-isa.h: Delete.  Move definitions to ...
    * arm-cpus.in: ... here.  Use new feature and fgroup values.
    * config/arm/arm.c (arm_option_override): Use lower case for feature
    bit names.
    * config/arm/arm.h (TARGET_HARD_FLOAT): Likewise.
    (TARGET_VFP3, TARGET_VFP5, TARGET_FMA): Likewise.
    * config/arm/parsecpu.awk (END): Add new command 'isa'.
    (isa_pfx): Delete.
    (print_isa_bits_for): New function.
    (gen_isa): New function.
    (gen_comm_data): Use print_isa_bits_for.
    (define feature): New keyword.
    (define fgroup): New keyword.
    * config/arm/t-arm (OPTIONS_H_EXTRA): Add arm-isa.h
    (arm-isa.h): Add rule to generate file.
    * common/config/arm/arm-common.c: (arm_canon_arch_option): Use lower
    case for feature bit names.

Tested by building cross/cross-native arm-none-linux-gnueabihf and a
baremetal cross build (arm-none-eabi) on x86_64.

OK for trunk?

Regards
VP.

gcc/ChangeLog:

  [arm] Improve error checking in parsecpu.awk

  This patch adds a bit more error checking to parsecpu.awk to ensure
  that statements are not missing arguments or have excess arguments
  beyond those permitted.  It also slightly improves the handling of
  errors so that we terminate properly if parsing fails and be as
  helpful as we can while in the parsing phase.

2017-09-22  Richard Earnshaw

    * config/arm/parsecpu.awk (fatal): Note that we've encountered an
    error.  Only quit immediately if parsing is complete.
    (BEGIN): Initialize fatal_err and parse_done.
    (begin fpu, end fpu): Check number of arguments.
    (begin arch, end arch): Likewise.
    (begin cpu, end cpu): Likewise.
    (cname, tune for, tune flags, architecture, fpu, option): Likewise.
    (optalias): Likewise.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5b9217c..4885746 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -2154,6 +2154,17 @@
 
 2017-09-06  Richard Earnshaw
 
+    * config/arm/parsecpu.awk (fatal): Note that we've encountered an
+    error.  Only quit immediately if parsing is complete.
+    (BEGIN): Initialize fatal_err and parse_done.
+    (begin fpu, end fpu): Check number of arguments.
+    (begin arch, end arch): Likewise.
+    (begin cpu, end cpu): Likewise.
+    (cname, tune for, tune flags, architecture, fpu, option): Likewise.
+    (optalias): Likewise.
+
+2017-09-06  Richard Earnshaw
+
     * config.gcc (arm*-*-*): Don't add arm-isa.h to tm_p_file.
     * config/arm/arm-isa.h: Delete.  Move definitions to ...
     * arm-cpus.in: ... here.  Use new feature and fgroup values.
diff --git a/gcc/config/arm/parsecpu.awk b/gcc/config/arm/parsecpu.awk
index d07d3fc..0b4fc68 100644
--- a/gcc/config/arm/parsecpu.awk
+++ b/gcc/config/arm/parsecpu.awk
@@ -32,7 +32,8 @@
 
 function fatal (m) {
     print "error ("lineno"): " m > "/dev/stderr"
-    exit 1
+    fatal_err = 1
+    if (parse_done) exit 1
 }
 
 function toplevel () {
@@ -502,14 +503,18 @@ BEGIN {
     arch_name = ""
     fpu_name = ""
     lineno = 0
+    fatal_err = 0
+    parse_done = 0
     if (cmd == "") fatal("Usage parsecpu.awk -v cmd=")
 }
 
+# New line.  Reset parse status and increment line count for error messages
 // {
     lineno++
     parse_ok = 0
 }
 
+# Comments must be on a line on their own.
 /^#/ {
     parse_ok = 1
 }
@@ -552,12 +557,14 @@ BEGIN {
 }
 
 /^begin fpu / {
+    if (NF != 3) fatal("syntax: begin fpu ")
     toplevel()
     fpu_name = $3
     parse_ok = 1
 }
 
 /^end fpu / {
+    if (NF != 3) fatal("syntax: end fpu ")
     if (fpu_name != $3) fatal("mimatched end fpu")
     if (! (fpu_name in fpu_isa)) {
         fatal("fpu definition \"" fpu_name "\" lacks an \"isa\" stat
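
The change to fatal() defers termination: during parsing an error is recorded and parsing continues so that later problems are reported too, and only once parsing is complete does an error stop the script immediately. Below is a minimal Python sketch of that pattern, not a translation of parsecpu.awk. The class, its method names and the exact error text are hypothetical; only the fatal_err / parse_done flags and the three-field check for "begin fpu" come from the patch.

```python
class DeferredErrorParser:
    """Toy model of the fatal_err / parse_done handling in the patch."""

    def __init__(self):
        self.fatal_err = False   # set once any error has been seen
        self.parse_done = False  # set when the whole input has been read
        self.lineno = 0
        self.messages = []

    def fatal(self, msg):
        # Record the error; quit immediately only after parsing has finished.
        self.messages.append(f"error ({self.lineno}): {msg}")
        self.fatal_err = True
        if self.parse_done:
            raise SystemExit(1)

    def parse(self, lines):
        for line in lines:
            self.lineno += 1
            fields = line.split()
            # Mirrors the awk rule /^begin fpu / with the added NF != 3 check.
            # The message wording is illustrative, not the script's own text.
            if line.startswith("begin fpu ") and len(fields) != 3:
                self.fatal("wrong number of arguments to 'begin fpu'")
        self.parse_done = True
        return not self.fatal_err
```

With this shape a caller can report every malformed statement in one run and then refuse to generate output if parse() returned False, which matches the intent stated in the ChangeLog ("Only quit immediately if parsing is complete").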
[patch][arm] (respin) auto-generate arm-isa.h from CPU descriptions
Hello, This patch by Richard Earnshaw was reverted earlier as it was breaking cross-native builds. Respinning now with a minor change that fixes the build issue - adding arm-isa.h to GTM_H. Also remove a redundant dependency (TM_H includes GTM_H). Tested by building cross/cross-native arm-none-linux-gnueabihf and baremetal cross build (arm-none-eabi) on x86_64. OK for trunk? Regards VP. gcc/ChangeLog: [arm] auto-generate arm-isa.h from CPU descriptions This patch autogenerates arm-isa.h from new entries in arm-cpus.in. This has the primary advantage that it makes the description file more self-contained, but it also solves the 'array dimensioning' problem that Tamar recently encountered. It adds two new constructs to arm-cpus.in: features and fgroups. Fgroups are simply a way of naming a group of feature bits so that they can be referenced together. We follow the convention that feature bits are all lower case, while fgroups are (predominantly) upper case. This is helpful as in some contexts they share the same namespace. Most of the minor changes in this patch are related to adopting this new naming convention. 2017-09-22 Richard Earnshaw * config.gcc (arm*-*-*): Don't add arm-isa.h to tm_p_file. * config/arm/arm-isa.h: Delete. Move definitions to ... * arm-cpus.in: ... here. Use new feature and fgroup values. * config/arm/arm.c (arm_option_override): Use lower case for feature bit names. * config/arm/arm.h (TARGET_HARD_FLOAT): Likewise. (TARGET_VFP3, TARGET_VFP5, TARGET_FMA): Likewise. * config/arm/parsecpu.awk (END): Add new command 'isa'. (isa_pfx): Delete. (print_isa_bits_for): New function. (gen_isa): New function. (gen_comm_data): Use print_isa_bits_for. (define feature): New keyword. (define fgroup): New keyword. * config/arm/t-arm (TM_H): Remove. (GTM_H): Add arm-isa.h. (arm-isa.h): Add rule to generate file. * common/config/arm/arm-common.c: (arm_canon_arch_option): Use lower case for feature bit names. 
diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c index 38bd3a7..7cb99ec 100644 --- a/gcc/common/config/arm/arm-common.c +++ b/gcc/common/config/arm/arm-common.c @@ -574,7 +574,7 @@ arm_canon_arch_option (int argc, const char **argv) { /* The easiest and safest way to remove the default fpu capabilities is to look for a '+no..' option that removes - the base FPU bit (isa_bit_VFPv2). If that doesn't exist + the base FPU bit (isa_bit_vfpv2). If that doesn't exist then the best we can do is strip out all the bits that might be part of the most capable FPU we know about, which is "crypto-neon-fp-armv8". */ @@ -586,7 +586,7 @@ arm_canon_arch_option (int argc, const char **argv) ++ext) { if (ext->remove - && check_isa_bits_for (ext->isa_bits, isa_bit_VFPv2)) + && check_isa_bits_for (ext->isa_bits, isa_bit_vfpv2)) { arm_initialize_isa (fpu_isa, ext->isa_bits); bitmap_and_compl (target_isa, target_isa, fpu_isa); @@ -620,7 +620,7 @@ arm_canon_arch_option (int argc, const char **argv) { /* Clearing the VFPv2 bit is sufficient to stop any extention that builds on the FPU from matching. */ - bitmap_clear_bit (target_isa, isa_bit_VFPv2); + bitmap_clear_bit (target_isa, isa_bit_vfpv2); } /* If we don't have a selected architecture by now, something's @@ -692,8 +692,8 @@ arm_canon_arch_option (int argc, const char **argv) capable FPU variant that we do support. This is sufficient for multilib selection. 
*/ - if (bitmap_bit_p (target_isa_unsatisfied, isa_bit_VFPv2) - && bitmap_bit_p (fpu_isa, isa_bit_VFPv2)) + if (bitmap_bit_p (target_isa_unsatisfied, isa_bit_vfpv2) + && bitmap_bit_p (fpu_isa, isa_bit_vfpv2)) { std::list::iterator ipoint = extensions.begin (); diff --git a/gcc/config.gcc b/gcc/config.gcc index 555ed69..00225104 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -593,7 +593,7 @@ x86_64-*-*) tm_file="vxworks-dummy.h ${tm_file}" ;; arm*-*-*) - tm_p_file="arm/arm-flags.h arm/arm-isa.h ${tm_p_file} arm/aarch-common-protos.h" + tm_p_file="arm/arm-flags.h ${tm_p_file} arm/aarch-common-protos.h" tm_file="vxworks-dummy.h ${tm_file}" ;; mips*-*-* | sh*-*-* | sparc*-*-*) diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in index d009a9e..07de4c9 100644 --- a/gcc/config/arm/arm-cpus.in +++ b/gcc/config/arm/arm-cpus.in @@ -40,6 +40,210 @@ # names in the final compiler. The order within each group is preserved and # forms the order for the list within the compiler. +# Most objects in this file support forward references. The major +# exception is feature groups, which may only refer to previously +# defined features or feature groups. This is done to avoid the risk +# of feature groups recursively ref
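
The comment added to arm-cpus.in says that feature groups may only refer to features or feature groups that were defined earlier. A small Python sketch of how such an ordering rule can be enforced while expanding groups into plain feature bits follows. The tuple-based input format and the function name are assumptions made for illustration; they are not the arm-cpus.in syntax or anything in parsecpu.awk.

```python
def expand_fgroups(definitions):
    """Expand feature groups into sets of feature bits, in definition order.

    definitions is a list of tuples: ("feature", name) or
    ("fgroup", name, member, ...).  This layout is hypothetical.
    """
    features = set()
    groups = {}
    for kind, name, *members in definitions:
        if kind == "feature":
            features.add(name)
        elif kind == "fgroup":
            bits = set()
            for member in members:
                if member in features:
                    bits.add(member)
                elif member in groups:
                    bits |= groups[member]
                else:
                    # Forward (or unknown) reference: rejected, so a group
                    # can never end up referring to itself.
                    raise ValueError(
                        f"fgroup {name}: {member} is not yet defined")
            groups[name] = bits
    return groups
```

Because every group is fully expanded at the point of definition, later lookups never need to recurse, which is presumably why the restriction is cheap to impose.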
Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations
On Tue, Sep 05, 2017 at 03:12:47PM +0200, Richard Biener wrote: > On Tue, 5 Sep 2017, Tamar Christina wrote: > > > > > > > > -Original Message- > > > From: Richard Biener [mailto:rguent...@suse.de] > > > Sent: 05 September 2017 13:51 > > > To: Tamar Christina > > > Cc: Andrew Pinski; Andreas Schwab; Jon Beniston; gcc-patches@gcc.gnu.org; > > > nd > > > Subject: RE: [RFC, vectorizer] Allow single element vector types for > > > vector > > > reduction operations > > > > > > On Tue, 5 Sep 2017, Richard Biener wrote: > > > > > > > On Tue, 5 Sep 2017, Tamar Christina wrote: > > > > > > > > > Hi Richard, > > > > > > > > > > That was an really interesting analysis, thanks for the details! > > > > > > > > > > Would you be submitting the patch you proposed at the end as a fix? > > > > > > > > I'm testing it currently. > > > > > > Unfortunately it breaks some required lowering. I'll have to more closely > > > look at this. > > > > Ah, ok. In the meantime, can this patch be reverted? It's currently > > breaking spec for us so we're > > Not able to get any benchmarking numbers. > > Testing the following instead: Any news on this? VP. > > Index: gcc/tree-vect-generic.c > === > --- gcc/tree-vect-generic.c (revision 251642) > +++ gcc/tree-vect-generic.c (working copy) > @@ -1640,7 +1640,7 @@ expand_vector_operations_1 (gimple_stmt_ >|| code == VEC_UNPACK_FLOAT_LO_EXPR) > type = TREE_TYPE (rhs1); > > - /* For widening/narrowing vector operations, the relevant type is of > the > + /* For widening vector operations, the relevant type is of the > arguments, not the widened result. VEC_UNPACK_FLOAT_*_EXPR is > calculated in the same way above. 
*/ >if (code == WIDEN_SUM_EXPR > @@ -1650,9 +1650,6 @@ expand_vector_operations_1 (gimple_stmt_ >|| code == VEC_WIDEN_MULT_ODD_EXPR >|| code == VEC_UNPACK_HI_EXPR >|| code == VEC_UNPACK_LO_EXPR > - || code == VEC_PACK_TRUNC_EXPR > - || code == VEC_PACK_SAT_EXPR > - || code == VEC_PACK_FIX_TRUNC_EXPR >|| code == VEC_WIDEN_LSHIFT_HI_EXPR >|| code == VEC_WIDEN_LSHIFT_LO_EXPR) > type = TREE_TYPE (rhs1); > > > also fix for a bug uncovered by the previous one: > > Index: gcc/gimple-ssa-strength-reduction.c > === > --- gcc/gimple-ssa-strength-reduction.c (revision 251710) > +++ gcc/gimple-ssa-strength-reduction.c (working copy) > @@ -1742,8 +1742,7 @@ find_candidates_dom_walker::before_dom_c > slsr_process_ref (gs); > >else if (is_gimple_assign (gs) > - && SCALAR_INT_MODE_P > - (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs) > + && INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (gs > { > tree rhs1 = NULL_TREE, rhs2 = NULL_TREE; > >
Re: [COMMITTED][arm] Revert r251800 & r251799
Now with the patch :-) VP. On Mon, Sep 11, 2017 at 03:20:12PM +0100, Vidya Praveen wrote: > Hello, > > The following two related patches need to be reverted as it causes > cross-native > builds to fail with the following message: > > g++ -c -DIN_GCC -DGENERATOR_FILE -I. [...] \ > -o build/genpreds.o /path/to/src/gcc/gcc/genpreds.c > In file included from ./options.h:8:0, > from ./tm.h:23, > from /path/to/src/gcc/gcc/genpreds.c:26: > /path/to/src/gcc/gcc/config/arm/arm-opts.h:29:21: fatal error: arm-isa.h: No > such file or directory > #include "arm-isa.h" > ^ > genpreds depends on GTM_H which does not depend on options.h, or any of its > dependencies. Nevertheless, it still tries to include options.h when reading > tm.h, so we miss the rule to build arm-isa.h. It is unclear why it is only an > issue with the cross-native builds. > > For now, in order to keep the builds going, I am reverting these patches. > > > r251800 | rearnsha | 2017-09-06 14:42:54 +0100 (Wed, 06 Sep 2017) | 16 lines > > [arm] Improve error checking in parsecpu.awk > > This patch adds a bit more error checking to parsecpu.awk to ensure > that statements are not missing arguments or have excess arguments > beyond those permitted. It also slightly improves the handling of > errors so that we terminate properly if parsing fails and be as > helpful as we can while in the parsing phase. > > * config/arm/parsecpu.awk (fatal): Note that we've encountered an > error. Only quit immediately if parsing is complete. > (BEGIN): Initialize fatal_err and parse_done. > (begin fpu, end fpu): Check number of arguments. > (begin arch, end arch): Likewise. > (begin cpu, end cpu): Likewise. > (cname, tune for, tune flags, architecture, fpu, option): Likewise. > (optalias): Likewise. > > r251799 | rearnsha | 2017-09-06 14:42:46 +0100 (Wed, 06 Sep 2017) | 31 lines > > [arm] auto-generate arm-isa.h from CPU descriptions > > This patch autogenerates arm-isa.h from new entries in arm-cpus.in. 
> This has the primary advantage that it makes the description file more > self-contained, but it also solves the 'array dimensioning' problem > that Tamar recently encountered. It adds two new constructs to > arm-cpus.in: features and fgroups. Fgroups are simply a way of naming > a group of feature bits so that they can be referenced together. We > follow the convention that feature bits are all lower case, while > fgroups are (predominantly) upper case. This is helpful as in some > contexts they share the same namespace. Most of the minor changes in > this patch are related to adopting this new naming convention. > > * config.gcc (arm*-*-*): Don't add arm-isa.h to tm_p_file. > * config/arm/arm-isa.h: Delete. Move definitions to ... > * arm-cpus.in: ... here. Use new feature and fgroup values. > * config/arm/arm.c (arm_option_override): Use lower case for feature > bit names. > * config/arm/arm.h (TARGET_HARD_FLOAT): Likewise. > (TARGET_VFP3, TARGET_VFP5, TARGET_FMA): Likewise. > * config/arm/parsecpu.awk (END): Add new command 'isa'. > (isa_pfx): Delete. > (print_isa_bits_for): New function. > (gen_isa): New function. > (gen_comm_data): Use print_isa_bits_for. > (define feature): New keyword. > (define fgroup): New keyword. > * config/arm/t-arm (OPTIONS_H_EXTRA): Add arm-isa.h > (arm-isa.h): Add rule to generate file. > * common/config/arm/arm-common.c: (arm_canon_arch_option): Use lower > case for feature bit names. > > Regards, > VP. > > > gcc/ChangeLog: > > 2017-09-11 Vidya Praveen > > Revert r251800 and r251799. diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c index 7cb99ec..38bd3a7 100644 --- a/gcc/common/config/arm/arm-common.c +++ b/gcc/common/config/arm/arm-common.c @@ -574,7 +574,7 @@ arm_canon_arch_option (int argc, const char **argv) { /* The easiest and safest way to remove the default fpu capabilities is to look for a '+no..' option that removes - the base FPU bit (isa_bit_vfpv2). 
If that doesn't exist + the base FPU bit (isa_bit_VFPv2). If that doesn't exist then the best we can do is strip out all the bits that might be part of the most capable FPU we know about, which is "crypto-neon-fp-armv8". */ @@ -586,7 +586,7 @@ arm_canon_arch_option (int argc, const char **argv) ++ext) { if (ext->remove - && che
[COMMITTED][arm] Revert r251800 & r251799
Hello,

The following two related patches need to be reverted, as they cause
cross-native builds to fail with the following message:

g++ -c -DIN_GCC -DGENERATOR_FILE -I. [...] \
  -o build/genpreds.o /path/to/src/gcc/gcc/genpreds.c
In file included from ./options.h:8:0,
                 from ./tm.h:23,
                 from /path/to/src/gcc/gcc/genpreds.c:26:
/path/to/src/gcc/gcc/config/arm/arm-opts.h:29:21: fatal error: arm-isa.h: No such file or directory
 #include "arm-isa.h"
                     ^

genpreds depends on GTM_H, which does not depend on options.h or any of its
dependencies. Nevertheless, it still tries to include options.h when reading
tm.h, so we miss the rule to build arm-isa.h. It is unclear why this is only
an issue with cross-native builds.

For now, in order to keep the builds going, I am reverting these patches.

r251800 | rearnsha | 2017-09-06 14:42:54 +0100 (Wed, 06 Sep 2017) | 16 lines

[arm] Improve error checking in parsecpu.awk

This patch adds a bit more error checking to parsecpu.awk to ensure
that statements are not missing arguments or have excess arguments
beyond those permitted.  It also slightly improves the handling of
errors so that we terminate properly if parsing fails and be as
helpful as we can while in the parsing phase.

    * config/arm/parsecpu.awk (fatal): Note that we've encountered an
    error.  Only quit immediately if parsing is complete.
    (BEGIN): Initialize fatal_err and parse_done.
    (begin fpu, end fpu): Check number of arguments.
    (begin arch, end arch): Likewise.
    (begin cpu, end cpu): Likewise.
    (cname, tune for, tune flags, architecture, fpu, option): Likewise.
    (optalias): Likewise.

r251799 | rearnsha | 2017-09-06 14:42:46 +0100 (Wed, 06 Sep 2017) | 31 lines

[arm] auto-generate arm-isa.h from CPU descriptions

This patch autogenerates arm-isa.h from new entries in arm-cpus.in.
This has the primary advantage that it makes the description file more
self-contained, but it also solves the 'array dimensioning' problem
that Tamar recently encountered.  It adds two new constructs to
arm-cpus.in: features and fgroups.  Fgroups are simply a way of naming
a group of feature bits so that they can be referenced together.  We
follow the convention that feature bits are all lower case, while
fgroups are (predominantly) upper case.  This is helpful as in some
contexts they share the same namespace.  Most of the minor changes in
this patch are related to adopting this new naming convention.

    * config.gcc (arm*-*-*): Don't add arm-isa.h to tm_p_file.
    * config/arm/arm-isa.h: Delete.  Move definitions to ...
    * arm-cpus.in: ... here.  Use new feature and fgroup values.
    * config/arm/arm.c (arm_option_override): Use lower case for feature
    bit names.
    * config/arm/arm.h (TARGET_HARD_FLOAT): Likewise.
    (TARGET_VFP3, TARGET_VFP5, TARGET_FMA): Likewise.
    * config/arm/parsecpu.awk (END): Add new command 'isa'.
    (isa_pfx): Delete.
    (print_isa_bits_for): New function.
    (gen_isa): New function.
    (gen_comm_data): Use print_isa_bits_for.
    (define feature): New keyword.
    (define fgroup): New keyword.
    * config/arm/t-arm (OPTIONS_H_EXTRA): Add arm-isa.h
    (arm-isa.h): Add rule to generate file.
    * common/config/arm/arm-common.c: (arm_canon_arch_option): Use lower
    case for feature bit names.

Regards,
VP.

gcc/ChangeLog:

2017-09-11  Vidya Praveen

    Revert r251800 and r251799.
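
The failure described in this message is a generated header that is reachable through #include but absent from the prerequisites the build knows about. The sketch below shows one way to picture that check in Python. The include chain is taken from the error output (genpreds.c, tm.h, options.h, arm-opts.h, arm-isa.h); the function names, and treating a prerequisite list as a plain set, are simplifying assumptions and do not describe how GCC's makefiles actually work.

```python
def transitive_includes(source, includes):
    """Return every header reachable from source via the includes map."""
    seen, stack = set(), [source]
    while stack:
        current = stack.pop()
        for header in includes.get(current, ()):
            if header not in seen:
                seen.add(header)
                stack.append(header)
    return seen


def missing_generated_deps(source, includes, declared, generated):
    """Generated headers that source pulls in but the build does not list."""
    reachable = transitive_includes(source, includes)
    return sorted((reachable & generated) - declared)


# Include chain as reported in the error message above.
INCLUDES = {
    "genpreds.c": ["tm.h"],
    "tm.h": ["options.h"],
    "options.h": ["arm-opts.h"],
    "arm-opts.h": ["arm-isa.h"],
}
```

Calling missing_generated_deps("genpreds.c", INCLUDES, {"tm.h"}, {"arm-isa.h"}) reports arm-isa.h as an unlisted generated header; adding it to the declared set makes the report empty.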
Re: [PATCH] Be careful about combined chain with length == 0 (PR, tree-optimization/70754).
On Wed, Jan 18, 2017 at 11:10:32AM +0100, Martin Liška wrote: > Hello. > > After basic understanding of loop predictive commoning, the problematic > combined chain is: > > Loads-only chain 0x38b6730 (combined) > max distance 0 > references: > MEM[(real(kind=8) *)vectp_a.29_81] (id 1) > offset 20 > distance 0 > MEM[(real(kind=8) *)vectp_a.38_141] (id 3) > offset 20 > distance 0 > > Loads-only chain 0x38b68b0 (combined) > max distance 0 > references: > MEM[(real(kind=8) *)vectp_a.23_102] (id 0) > offset 0 > distance 0 > MEM[(real(kind=8) *)vectp_a.33_33] (id 2) > offset 0 > distance 0 > > Combination chain 0x38b65b0 > max distance 0, may reuse first > equal to 0x38b6730 + 0x38b68b0 in type vector(2) real(kind=8) > references: > combination ref > in statement predreastmp.48_10 = vect__32.31_78 + vect__28.25_100; > > distance 0 > combination ref > in statement predreastmp.50_17 = vect__42.41_138 + vect__38.36_29; > > distance 0 > > It's important to note that distance is equal to zero (happening within a > same loop iteration). > Aforementioned chains correspond to: > > ... 
> r2: vect__28.25_100 = MEM[(real(kind=8) *)vectp_a.23_102]; > vectp_a.23_99 = vectp_a.23_102 + 16; > vect__28.26_98 = MEM[(real(kind=8) *)vectp_a.23_99]; > vect__82.27_97 = vect__22.22_108; > vect__82.27_96 = vect__22.22_107; > vect__79.28_95 = vect__82.27_97 + vect__84.17_120; > vect__79.28_94 = vect__82.27_96 + vect__84.17_119; > r1: vect__32.31_78 = MEM[(real(kind=8) *)vectp_a.29_81]; > vectp_a.29_77 = vectp_a.29_81 + 16; > vect__32.32_76 = MEM[(real(kind=8) *)vectp_a.29_77]; > vect__38.35_39 = MEM[(real(kind=8) *)vectp_a.33_57]; > r2': vectp_a.33_33 = vectp_a.33_57 + 16; > vect__38.36_29 = MEM[(real(kind=8) *)vectp_a.33_33]; > vect__56.37_23 = vect__38.35_39; > vect__56.37_15 = vect__32.32_76; > vect__42.40_161 = MEM[(real(kind=8) *)vectp_a.38_163]; > vectp_a.38_141 = vectp_a.38_163 + 16; > r1': vect__42.41_138 = MEM[(real(kind=8) *)vectp_a.38_141]; > vect__54.42_135 = vect__42.40_161 + vect__56.37_23; > r1'+r2': predreastmp.50_17 = vect__42.41_138 + vect__38.36_29; > predreastmp.51_18 = vect__56.37_15; > vect__54.42_134 = predreastmp.50_17; > r1+r2: predreastmp.48_10 = vect__32.31_78 + vect__28.25_100; > ... > > Problematic construct is that while having load-only chains r1->r1' and > r2->r2', the combination > is actually r1'+r2'->r1+r2, which cause the troubles. I believe the proper > fix is to reject such > combinations where combined root stmt does not dominate usages. It's probably > corner case as it does > not reuse any values among loop iterations (which is main motivation of the > pass), it's doing PRE > if I'm right. > > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. I could bootstrap on aarch64-none-linux-gnu without any issues, regression tests are fine and the testcase compiles without ICE. Thanks for fixing this. VP. > > Ready to be installed? 
> Martin > > From 41b153cf975374fff48419ec8ac5991ac134735f Mon Sep 17 00:00:00 2001 > From: marxin > Date: Tue, 17 Jan 2017 14:22:40 +0100 > Subject: [PATCH] Be careful about combined chain with length == 0 (PR > tree-optimization/70754). > > gcc/testsuite/ChangeLog: > > 2017-01-17 Martin Liska > > PR tree-optimization/70754 > * gfortran.dg/pr70754.f90: New test. > > gcc/ChangeLog: > > 2017-01-17 Martin Liska > > PR tree-optimization/70754 > * tree-predcom.c (combine_chains): Do not create a combined chain > with length equal to zero when root_stmt does not dominate > stmts of references. > --- > gcc/testsuite/gfortran.dg/pr70754.f90 | 35 > +++ > gcc/tree-predcom.c| 10 ++ > 2 files changed, 45 insertions(+) > create mode 100644 gcc/testsuite/gfortran.dg/pr70754.f90 > > diff --git a/gcc/testsuite/gfortran.dg/pr70754.f90 > b/gcc/testsuite/gfortran.dg/pr70754.f90 > new file mode 100644 > index 000..758901ce2b2 > --- /dev/null > +++ b/gcc/testsuite/gfortran.dg/pr70754.f90 > @@ -0,0 +1,35 @@ > +! { dg-options "-Ofast" } > + > +module m > + implicit none > + private > + save > + > + integer, parameter, public :: & > +ii4 = selected_int_kind(6), & > +rr8 = selected_real_kind(13) > + > + integer (ii4), dimension(40,40,199), public :: xyz > + public :: foo > +contains > + subroutine foo(a) > +real (rr8), dimension(40,40), intent(out) :: a > +real (rr8), dimension(40,40) :: b > +integer (ii4), dimension(40,40) :: c > +integer i, j > + > +do i=1,8 > + b(i,j) = 123 * a(i,j) + a(i,j+1) & > + + a(i,j) + a(i+1,j+1) & > + + a(i+1,j) + a(i-1,j+1) & > + + a(i-1,j) > + c(i,j) = 123 > +end do > + > +where ((xyz(:,:,2) /= 0) .and. (c /= 0)) > + a = b/real(c) > +els
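
The fix proposed in the quoted message is to reject a zero-length combined chain whose root statement does not dominate the statements of the other references. Within a single basic block, dominance reduces to "appears earlier", so the check can be pictured with the toy Python model below. This is only a sketch under that single-block assumption: the function name is hypothetical, and the real pass works on GIMPLE statements and dominance information, not on lists of strings.

```python
def root_dominates_all(stmt_order, root_stmt, ref_stmts):
    """True if root_stmt comes no later than every statement in ref_stmts.

    stmt_order lists the statements of one basic block in program order,
    so position in the list stands in for dominance.
    """
    position = {stmt: index for index, stmt in enumerate(stmt_order)}
    return all(position[root_stmt] <= position[stmt] for stmt in ref_stmts)


# Order of the two combination statements in the quoted dump:
# the r1'+r2' sum is computed before the r1+r2 sum.
BLOCK = ["predreastmp.50_17", "predreastmp.48_10"]
```

With predreastmp.48_10 as the root, root_dominates_all(BLOCK, "predreastmp.48_10", BLOCK) is False, so this combination would be rejected under the rule described above.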
Re: [PATCH, rs6000, testsuite, PR65456] Changes for unaligned vector load/store support on POWER8
On Mon, Jun 15, 2015 at 08:14:31PM +0100, Bill Schmidt wrote: > On Fri, 2015-06-12 at 17:36 +0100, Vidya Praveen wrote: > > On Thu, Apr 30, 2015 at 01:34:18PM +0100, Bill Schmidt wrote: > > > On Thu, 2015-04-30 at 18:26 +0800, Bin.Cheng wrote: > > > > On Mon, Apr 27, 2015 at 9:26 PM, Bill Schmidt > > > > wrote: > > > > > On Mon, 2015-04-27 at 14:23 +0800, Bin.Cheng wrote: > > > > >> On Mon, Mar 30, 2015 at 1:42 AM, Bill Schmidt > > > > >> wrote: > > > > > > > > > >> > > > > >> > Index: gcc/testsuite/gcc.dg/vect/vect-33.c > > > > >> > === > > > > >> > --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 221118) > > > > >> > +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy) > > > > >> > @@ -36,9 +36,10 @@ int main (void) > > > > >> >return main1 (); > > > > >> > } > > > > >> > > > > > >> > +/* vect_hw_misalign && { ! vect64 } */ > > > > >> > > > > > >> > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 > > > > >> > "vect" } } */ > > > > >> > -/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" > > > > >> > "vect" { target { vect_hw_misalign && { {! vect64} || > > > > >> > vect_multiple_sizes } } } } } */ > > > > >> > +/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" > > > > >> > "vect" { target { { { ! powerpc*-*-* } && vect_hw_misalign } && { > > > > >> > { ! vect64 } || vect_multiple_sizes } } } } } */ > > > > >> > /* { dg-final { scan-tree-dump "Alignment of access forced using > > > > >> > peeling" "vect" { target { vector_alignment_reachable && { vect64 > > > > >> > && {! vect_multiple_sizes} } } } } } */ > > > > >> > /* { dg-final { scan-tree-dump-times "Alignment of access forced > > > > >> > using versioning" 1 "vect" { target { { {! > > > > >> > vector_alignment_reachable} || {! vect64} } && {! > > > > >> > vect_hw_misalign} } } } } */ > > > > >> > /* { dg-final { cleanup-tree-dump "vect" } } */ > > > > >> > > > > >> Hi Bill, > > > > >> With this change, the test case is skipped on aarch64 now. 
Since it > > > > >> passed before, Is it expected to act like this on 64bit platforms? > > > > > > > > > > Hi Bin, > > > > > > > > > > No, that's a mistake on my part -- thanks for the report! That first > > > > > added line was not intended to be part of the patch: > > > > > > > > > > +/* vect_hw_misalign && { ! vect64 } */ > > > > > > > > > > Please try removing that line and verify that the patch succeeds again > > > > > for ARM. Assuming so, I'll prepare a patch to fix this. > > > > > > > > > > It looks like this mistake was introduced only in this particular > > > > > test, > > > > > but please let me know if you see any other anomalies. > > > > Hi Bill, > > > > I chased the wrong branch. The test disappeared on fsf-48 branch in > > > > out build, rather than trunk. I guess it's not your patch's fault. > > > > Will follow up and get back to you later. > > > > Sorry for the inconvenience. > > > > > > OK, thanks for letting me know! There was still a bad line in this > > > patch, although it was only introduced in 5.1 and trunk, so I guess that > > > wasn't responsible in this case. Thanks for checking! > > > > > > Hi Bill, > > > > In 4.8 branch, you have changed: > > > > -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 > > "vect" } } */ > > +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 > > "vect" { target { ! vect_hw_misalign } } } } */ > > > > Whereas your comment says: > > > >2015-04-24 Bill Schmidt > > > > Backport from ma
Re: [PATCH, rs6000, testsuite, PR65456] Changes for unaligned vector load/store support on POWER8
On Thu, Apr 30, 2015 at 01:34:18PM +0100, Bill Schmidt wrote: > On Thu, 2015-04-30 at 18:26 +0800, Bin.Cheng wrote: > > On Mon, Apr 27, 2015 at 9:26 PM, Bill Schmidt > > wrote: > > > On Mon, 2015-04-27 at 14:23 +0800, Bin.Cheng wrote: > > >> On Mon, Mar 30, 2015 at 1:42 AM, Bill Schmidt > > >> wrote: > > > > > >> > > >> > Index: gcc/testsuite/gcc.dg/vect/vect-33.c > > >> > === > > >> > --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 221118) > > >> > +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy) > > >> > @@ -36,9 +36,10 @@ int main (void) > > >> >return main1 (); > > >> > } > > >> > > > >> > +/* vect_hw_misalign && { ! vect64 } */ > > >> > > > >> > /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } > > >> > } */ > > >> > -/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" > > >> > "vect" { target { vect_hw_misalign && { {! vect64} || > > >> > vect_multiple_sizes } } } } } */ > > >> > +/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" > > >> > "vect" { target { { { ! powerpc*-*-* } && vect_hw_misalign } && { { ! > > >> > vect64 } || vect_multiple_sizes } } } } } */ > > >> > /* { dg-final { scan-tree-dump "Alignment of access forced using > > >> > peeling" "vect" { target { vector_alignment_reachable && { vect64 && > > >> > {! vect_multiple_sizes} } } } } } */ > > >> > /* { dg-final { scan-tree-dump-times "Alignment of access forced > > >> > using versioning" 1 "vect" { target { { {! vector_alignment_reachable} > > >> > || {! vect64} } && {! vect_hw_misalign} } } } } */ > > >> > /* { dg-final { cleanup-tree-dump "vect" } } */ > > >> > > >> Hi Bill, > > >> With this change, the test case is skipped on aarch64 now. Since it > > >> passed before, Is it expected to act like this on 64bit platforms? > > > > > > Hi Bin, > > > > > > No, that's a mistake on my part -- thanks for the report! That first > > > added line was not intended to be part of the patch: > > > > > > +/* vect_hw_misalign && { ! 
vect64 } */ > > > > > > Please try removing that line and verify that the patch succeeds again > > > for ARM. Assuming so, I'll prepare a patch to fix this. > > > > > > It looks like this mistake was introduced only in this particular test, > > > but please let me know if you see any other anomalies. > > Hi Bill, > > I chased the wrong branch. The test disappeared on fsf-48 branch in > > out build, rather than trunk. I guess it's not your patch's fault. > > Will follow up and get back to you later. > > Sorry for the inconvenience. > > OK, thanks for letting me know! There was still a bad line in this > patch, although it was only introduced in 5.1 and trunk, so I guess that > wasn't responsible in this case. Thanks for checking! Hi Bill, In 4.8 branch, you have changed: -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */ +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { ! vect_hw_misalign } } } } */ Whereas your comment says: 2015-04-24 Bill Schmidt Backport from mainline r222349 2015-04-22 Bill Schmidt PR target/65456 [...] * gcc.dg/vect/vect-33.c: Exclude unaligned access test for POWER8. [...] There wasn't an unaligned access test in the first place. But if you wanted to introduce it and exclude it for POWER8 then it should've been: ... { { ! powerpc*-*-* } && vect_hw_misalign } ... like you have done for the trunk. At the moment, this change has made the test to be skipped for AArch64. It should've been skipped for x86_64-*-* and i*86-*-* as well. I believe it wasn't intended to be skipped so? Regards VP. > > Bill > > > > > Thanks, > > bin > > > > > > Thanks very much! 
> > > > > > Bill > > >> > > >> PASS->NA: gcc.dg/vect/vect-33.c -flto -ffat-lto-objects > > >> scan-tree-dump-times vect "Vectorizing an unaligned access" 0 > > >> PASS->NA: gcc.dg/vect/vect-33.c scan-tree-dump-times vect "Vectorizing > > >> an unaligned access" 0 > > >> > > >> Thanks, > > >> bin > > >> > > > > > > > > > >
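The selector discussion above is easy to get wrong, so here is a minimal C sketch of how the two `target` selectors behave. It is an illustration only: `struct target`, both function names, and the property values given to each target are assumptions made for this example; they are not DejaGnu code or testsuite output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a test target; not part of DejaGnu or GCC. */
struct target {
    const char *triplet;     /* illustrative target name */
    bool is_powerpc;         /* would match powerpc*-*-* */
    bool vect_hw_misalign;   /* assumed result of the effective-target check */
};

/* Selector committed to the 4.8 branch: { target { ! vect_hw_misalign } }. */
static bool selected_on_branch(const struct target *t)
{
    assert(t != NULL);
    return !t->vect_hw_misalign;
}

/* Selector fragment suggested in the thread:
   { { ! powerpc*-*-* } && vect_hw_misalign }. */
static bool selected_as_suggested(const struct target *t)
{
    assert(t != NULL);
    return !t->is_powerpc && t->vect_hw_misalign;
}
```

With a target modelled as `{ "aarch64-none-linux-gnu", false, true }`, `selected_on_branch` returns false, which corresponds to the PASS->NA transition reported in the thread, while `selected_as_suggested` returns true. A POWER8 target modelled with both flags set to true is excluded by the suggested form.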
Re: [PATCH, RFC] New memory usage statistics infrastructure
On 01/06/15 15:21, Vidya Praveen wrote: On 01/06/15 15:08, Martin Liška wrote: On 06/01/2015 02:18 PM, Richard Biener wrote: On Mon, Jun 1, 2015 at 1:38 PM, Martin Liška wrote: On 05/29/2015 06:09 PM, Vidya Praveen wrote: Martin, The following change: @@ -2655,10 +2655,10 @@ s-iov: build/gcov-iov$(build_exeext) $(BASEVER) $(DEVPHASE) GCOV_OBJS = gcov.o gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS) - +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@ + +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) build/hash-table.o ggc-none.o $(LIBS) -o $@ seem to cause canadian cross build failure for arm and aarch64 on x86_64 as build/hash-table.o and ggc-none.o are not built by the same compiler? arm-none-linux-gnueabi-g++ -no-pie -g -O2 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing +-Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wn build/hash-table.o ggc-none.o libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a .. +/libdecnumber/libdecnumber.a -o gcov build/hash-table.o: file not recognized: File format not recognized collect2: error: ld returned 1 exit status make[1]: *** [gcov] Error 1 Should it be: - +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@ + +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) hash-table.o ggc-none.o $(LIBS) -o $@ instead? Hello Vidya. Thanks for pointing out. To be honest, I'm not a build system guru and it's hard for me to verify that the change you suggest is the correct. May I please ask you for sending a patch to mailing? gcov isn't a build but a host tool so the patch looks good to me. Richard. Thanks, Martin VP. On 15/05/15 15:38, Martin Liška wrote: Hello. Following patch attempts to rewrite memory reports for GCC's internal allocations so that it uses a new template type. 
The type shares parts which are currently duplicated, adds support for special 'counters' and introduces new support for hash-{set,map,table}. Transformation of the current code is a bit tricky as we internally used hash-table as main data structure which takes care of location-related allocations. As I want to add support even for hash tables (and all derived types), header files inclusion and forward declaration is utilized. Feel free to comment the patch, as well as missing features one may want to track by location sensitive memory allocation. Attachment contains sample output taken from tramp3d-v4.cpp. Thanks, Martin Ok. I'm going to install following patch. Martin, I realized we require change in one more place. I'm just doing builds to verify this. VP. GCOV_OBJS = gcov.o gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS) +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) \ - build/hash-table.o ggc-none.o $(LIBS) -o $@ + hash-table.o ggc-none.o $(LIBS) -o $@ GCOV_DUMP_OBJS = gcov-dump.o gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS) +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \ - build/hash-table.o build/ggc-none.o\ + hash-table.o ggc-none.o\ $(LIBS) -o $@ GCOV_TOOL_DEP_FILES = $(srcdir)/../libgcc/libgcov-util.c gcov-io.c $(GCOV_IO_H) \ Installing the following patch as it is obvious and same kind of change as the the previous one (which was approved by richi). Verified by building canadian cross of aarch64-none-linux-gnu and cross build of arm-none-eabi. gcc/ChangeLog: 2015-06-01 Vidya Praveen * Makefile.in: Pick up gcov-dump dependencies from gcc/ directory rather than from gcc/build directory. 
diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 952f285..3d14938 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -2671,7 +2671,7 @@ gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS) GCOV_DUMP_OBJS = gcov-dump.o gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS) +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \ - build/hash-table.o build/ggc-none.o\ + hash-table.o ggc-none.o\ $(LIBS) -o $@ GCOV_TOOL_DEP_FILES = $(srcdir)/../libgcc/libgcov-util.c gcov-io.c $(GCOV_IO_H) \
Re: [PATCH, RFC] New memory usage statistics infrastructure
On 01/06/15 15:08, Martin Liška wrote: On 06/01/2015 02:18 PM, Richard Biener wrote: On Mon, Jun 1, 2015 at 1:38 PM, Martin Liška wrote: On 05/29/2015 06:09 PM, Vidya Praveen wrote: Martin, The following change: @@ -2655,10 +2655,10 @@ s-iov: build/gcov-iov$(build_exeext) $(BASEVER) $(DEVPHASE) GCOV_OBJS = gcov.o gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS) - +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@ + +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) build/hash-table.o ggc-none.o $(LIBS) -o $@ seem to cause canadian cross build failure for arm and aarch64 on x86_64 as build/hash-table.o and ggc-none.o are not built by the same compiler? arm-none-linux-gnueabi-g++ -no-pie -g -O2 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing +-Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wn build/hash-table.o ggc-none.o libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a .. +/libdecnumber/libdecnumber.a -o gcov build/hash-table.o: file not recognized: File format not recognized collect2: error: ld returned 1 exit status make[1]: *** [gcov] Error 1 Should it be: - +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@ + +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) hash-table.o ggc-none.o $(LIBS) -o $@ instead? Hello Vidya. Thanks for pointing out. To be honest, I'm not a build system guru and it's hard for me to verify that the change you suggest is the correct. May I please ask you for sending a patch to mailing? gcov isn't a build but a host tool so the patch looks good to me. Richard. Thanks, Martin VP. On 15/05/15 15:38, Martin Liška wrote: Hello. Following patch attempts to rewrite memory reports for GCC's internal allocations so that it uses a new template type. 
The type shares parts which are currently duplicated, adds support for special 'counters' and introduces new support for hash-{set,map,table}. Transformation of the current code is a bit tricky as we internally used hash-table as main data structure which takes care of location-related allocations. As I want to add support even for hash tables (and all derived types), header files inclusion and forward declaration is utilized. Feel free to comment the patch, as well as missing features one may want to track by location sensitive memory allocation. Attachment contains sample output taken from tramp3d-v4.cpp. Thanks, Martin Ok. I'm going to install following patch. Martin, I realized we require change in one more place. I'm just doing builds to verify this. VP. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index b59b5d9..3d14938 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -2667,11 +2667,11 @@ s-iov: build/gcov-iov$(build_exeext) $(BASEVER) $(DEVPHASE) GCOV_OBJS = gcov.o gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS) +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) \ - build/hash-table.o ggc-none.o $(LIBS) -o $@ + hash-table.o ggc-none.o $(LIBS) -o $@ GCOV_DUMP_OBJS = gcov-dump.o gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS) +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \ - build/hash-table.o build/ggc-none.o\ + hash-table.o ggc-none.o\ $(LIBS) -o $@ GCOV_TOOL_DEP_FILES = $(srcdir)/../libgcc/libgcov-util.c gcov-io.c $(GCOV_IO_H) \
Re: [PATCH, RFC] New memory usage statistics infrastructure
Martin,

The following change:

@@ -2655,10 +2655,10 @@ s-iov: build/gcov-iov$(build_exeext) $(BASEVER) $(DEVPHASE)
 GCOV_OBJS = gcov.o
 gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS)
-	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@
+	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) build/hash-table.o ggc-none.o $(LIBS) -o $@

seem to cause canadian cross build failure for arm and aarch64 on x86_64 as build/hash-table.o and ggc-none.o are not built by the same compiler?

arm-none-linux-gnueabi-g++ -no-pie -g -O2 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing
+-Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wn build/hash-table.o ggc-none.o libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a ..
+/libdecnumber/libdecnumber.a -o gcov
build/hash-table.o: file not recognized: File format not recognized
collect2: error: ld returned 1 exit status
make[1]: *** [gcov] Error 1

Should it be:

-	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) $(LIBS) -o $@
+	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) hash-table.o ggc-none.o $(LIBS) -o $@

instead?

VP.

On 15/05/15 15:38, Martin Liška wrote:

Hello.

Following patch attempts to rewrite memory reports for GCC's internal allocations so that it uses a new template type. The type shares parts which are currently duplicated, adds support for special 'counters' and introduces new support for hash-{set,map,table}.

Transformation of the current code is a bit tricky as we internally used hash-table as main data structure which takes care of location-related allocations. As I want to add support even for hash tables (and all derived types), header files inclusion and forward declaration is utilized.

Feel free to comment the patch, as well as missing features one may want to track by location sensitive memory allocation.

Attachment contains sample output taken from tramp3d-v4.cpp.

Thanks,
Martin
Re: [Patch,testsuite] Fix bind_pic_locally
PING! On Wed, Jun 04, 2014 at 02:56:00PM +0100, Vidya Praveen wrote: > Hello, > > This is to follow up the patch I had posted to fix bind_pic_locally some time > ago (sorry, this went in to my back log for a while). > > To summarize, multilib_flags when it contains -fpic or -fPIC, overrides -fpie > or -fPIE that is added by bind_pic_locally. The fix that was finally agreed on > was to store the flags to a variable at bind_pic_locally and append it > to > multilib_flags just before invoking target_compile and remove it > immediately > after that (Refer [1]). > > This patch implements the same. Since this is an issue not only for gcc > but > also for g++ and gfortran tests, I have fixed this in g++.exp and gfortran.exp > along with gcc.exp. > > This was tested and works fine on: > > aarch64-none-elf > aarch64-none-linux-gnu > arm-none-linux-gnueabihf > x86_64-unknown-linux-gnu > > OK for trunk? > > Cheers > VP. > > [1] http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00365.html > > ~~~ > > gcc/testsuite/ChangeLog: > > 2014-06-04 Vidya Praveen > > * lib/target-support.exp (bind_pic_locally): Save the flags to > 'flags_to_postpone' instead of appending to 'flags'. > * lib/gcc.exp (gcc_target_compile): Append board_info's multilib_flags > with flags_to_postpone and revert after target_compile. > * lib/g++.exp (g++_target_compile): Ditto. > * lib/gfortran.exp (gfortran_target_compile): Ditto. 
> diff --git a/gcc/testsuite/lib/g++.exp b/gcc/testsuite/lib/g++.exp > index 751e27b..6658c58 100644 > --- a/gcc/testsuite/lib/g++.exp > +++ b/gcc/testsuite/lib/g++.exp > @@ -288,6 +288,8 @@ proc g++_target_compile { source dest type options } { > global gluefile wrap_flags > global ALWAYS_CXXFLAGS > global GXX_UNDER_TEST > +global flags_to_postpone > +global board_info > > if { [target_info needs_status_wrapper] != "" && [info exists gluefile] > } { > lappend options "libs=${gluefile}" > @@ -313,10 +315,25 @@ proc g++_target_compile { source dest type options } { > exec rm -f $rponame > } > > +# bind_pic_locally adds -fpie/-fPIE flags to flags_to_postpone and it is > +# appended here to multilib_flags as it can be overridden by the latter > +# if it was added earlier. After the target_compile, multilib_flags is > +# restored to its orignal content. > +set tboard [target_info name] > +if {[board_info $tboard exists multilib_flags]} { > +set orig_multilib_flags "[board_info [target_info name] > multilib_flags]" > +append board_info($tboard,multilib_flags) " $flags_to_postpone" > +} > + > set options [dg-additional-files-options $options $source] > > set result [target_compile $source $dest $type $options] > > +if {[board_info $tboard exists multilib_flags]} { > +set board_info($tboard,multilib_flags) $orig_multilib_flags > +set flags_to_postpone "" > +} > + > return $result > } > > diff --git a/gcc/testsuite/lib/gcc.exp b/gcc/testsuite/lib/gcc.exp > index 49394b0..f937064 100644 > --- a/gcc/testsuite/lib/gcc.exp > +++ b/gcc/testsuite/lib/gcc.exp > @@ -126,7 +126,9 @@ proc gcc_target_compile { source dest type options } { > global GCC_UNDER_TEST > global TOOL_OPTIONS > global TEST_ALWAYS_FLAGS > - > +global flags_to_postpone > +global board_info > + > if {[target_info needs_status_wrapper] != "" && \ > [target_info needs_status_wrapper] != "0" && \ > [info exists gluefile] } { > @@ -162,8 +164,26 @@ proc gcc_target_compile { source dest type options } { > set options 
[concat "{additional_flags=$TOOL_OPTIONS}" $options] > } > > +# bind_pic_locally adds -fpie/-fPIE flags to flags_to_postpone and it is > +# appended here to multilib_flags as it can be overridden by the latter > +# if it was added earlier. After the target_compile, multilib_flags is > +# restored to its orignal content. > +set tboard [target_info name] > +if {[board_info $tboard exists multilib_flags]} { > +set orig_multilib_flags "[board_info [target_info name] > multilib_flags]" > +append board_info($tboard,multilib_flags) " $flags_to_postpone" > +} > + > lappend options "timeout=[timeout_value]" > lappend options "compiler=$GCC_UNDER_TEST" > set options [dg-additional-files-options $options $source] > -
Re: [Patch,testsuite] Fix tests that fail due to symbol visibility when -fPIC
PING! On Wed, Jun 04, 2014 at 03:01:38PM +0100, Vidya Praveen wrote: > Hello, > > The following test cases fail when -fPIC is passed as dejagnu multilib > flag > since -fPIC causes the 'availability' of the functions to be overwritable. > I > have fixed this by adding bind_pic_locally to these cases. > > gcc.dg/fail_always_inline.c > gcc.dg/inline-22.c > gcc.dg/inline_4.c > g++.dg/ipa/devirt-25.C > > Tested on: > > aarch64-none-elf > aarch64-none-linux-gnu > arm-none-linux-gnueabihf > x86_64-unknown-linux-gnu > > OK for trunk? > > Cheers > VP. > > ~~~ > > gcc/testsuite/ChangeLog: > > 2014-06-04 Vidya Praveen > > * gcc.dg/inline-22.c: Add bind_pic_locally. > * gcc.dg/inline_4.c: Ditto. > * gcc.dg/fail_always_inline.c: Ditto. > * g++.dg/ipa/devirt-25.C: Ditto. > > diff --git a/gcc/testsuite/g++.dg/ipa/devirt-25.C > b/gcc/testsuite/g++.dg/ipa/devirt-25.C > index 7516479..387d529 100644 > --- a/gcc/testsuite/g++.dg/ipa/devirt-25.C > +++ b/gcc/testsuite/g++.dg/ipa/devirt-25.C > @@ -1,5 +1,6 @@ > /* { dg-do compile } */ > /* { dg-options "-O3 -fdump-ipa-cp" } */ > +/* { dg-add-options bind_pic_locally } */ > > class ert_RefCounter { > protected: > diff --git a/gcc/testsuite/gcc.dg/fail_always_inline.c > b/gcc/testsuite/gcc.dg/fail_always_inline.c > index 4b196ac..86645b8 100644 > --- a/gcc/testsuite/gcc.dg/fail_always_inline.c > +++ b/gcc/testsuite/gcc.dg/fail_always_inline.c > @@ -1,4 +1,5 @@ > /* { dg-do compile } */ > +/* { dg-add-options bind_pic_locally } */ > > extern __attribute__ ((always_inline)) void > bar() { } /* { dg-warning "function might not be inlinable" } */ > diff --git a/gcc/testsuite/gcc.dg/inline-22.c > b/gcc/testsuite/gcc.dg/inline-22.c > index 1785e1c..6795c5f 100644 > --- a/gcc/testsuite/gcc.dg/inline-22.c > +++ b/gcc/testsuite/gcc.dg/inline-22.c > @@ -1,5 +1,6 @@ > /* { dg-do compile } */ > /* { dg-options "-funit-at-a-time -Wno-attributes" } */ > +/* { dg-add-options bind_pic_locally } */ > /* Verify we can inline without a complete prototype 
and with promoted > arguments. See also PR32492. */ > __attribute__((always_inline)) void f1() {} > diff --git a/gcc/testsuite/gcc.dg/inline_4.c b/gcc/testsuite/gcc.dg/inline_4.c > index dd4fadb..ebd57e9 100644 > --- a/gcc/testsuite/gcc.dg/inline_4.c > +++ b/gcc/testsuite/gcc.dg/inline_4.c > @@ -1,5 +1,6 @@ > /* { dg-do compile } */ > /* { dg-options "-O2 -fdump-tree-optimized -fdisable-tree-einline=foo2 > -fdisable-ipa-inline -Wno-attributes" } */ > +/* { dg-add-options bind_pic_locally } */ > int g; > __attribute__((always_inline)) void bar (void) > {
[Patch,testsuite] Fix tests that fail due to symbol visibility when -fPIC
Hello,

The following test cases fail when -fPIC is passed as dejagnu multilib flag since -fPIC causes the 'availability' of the functions to be overwritable. I have fixed this by adding bind_pic_locally to these cases.

gcc.dg/fail_always_inline.c
gcc.dg/inline-22.c
gcc.dg/inline_4.c
g++.dg/ipa/devirt-25.C

Tested on:

aarch64-none-elf
aarch64-none-linux-gnu
arm-none-linux-gnueabihf
x86_64-unknown-linux-gnu

OK for trunk?

Cheers
VP.

~~~

gcc/testsuite/ChangeLog:

2014-06-04  Vidya Praveen

	* gcc.dg/inline-22.c: Add bind_pic_locally.
	* gcc.dg/inline_4.c: Ditto.
	* gcc.dg/fail_always_inline.c: Ditto.
	* g++.dg/ipa/devirt-25.C: Ditto.

diff --git a/gcc/testsuite/g++.dg/ipa/devirt-25.C b/gcc/testsuite/g++.dg/ipa/devirt-25.C
index 7516479..387d529 100644
--- a/gcc/testsuite/g++.dg/ipa/devirt-25.C
+++ b/gcc/testsuite/g++.dg/ipa/devirt-25.C
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -fdump-ipa-cp" } */
+/* { dg-add-options bind_pic_locally } */
 
 class ert_RefCounter {
  protected:
diff --git a/gcc/testsuite/gcc.dg/fail_always_inline.c b/gcc/testsuite/gcc.dg/fail_always_inline.c
index 4b196ac..86645b8 100644
--- a/gcc/testsuite/gcc.dg/fail_always_inline.c
+++ b/gcc/testsuite/gcc.dg/fail_always_inline.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-add-options bind_pic_locally } */
 
 extern __attribute__ ((always_inline)) void
  bar() { } /* { dg-warning "function might not be inlinable" } */
diff --git a/gcc/testsuite/gcc.dg/inline-22.c b/gcc/testsuite/gcc.dg/inline-22.c
index 1785e1c..6795c5f 100644
--- a/gcc/testsuite/gcc.dg/inline-22.c
+++ b/gcc/testsuite/gcc.dg/inline-22.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-funit-at-a-time -Wno-attributes" } */
+/* { dg-add-options bind_pic_locally } */
 /* Verify we can inline without a complete prototype and with promoted
    arguments.  See also PR32492.  */
 __attribute__((always_inline)) void f1() {}
diff --git a/gcc/testsuite/gcc.dg/inline_4.c b/gcc/testsuite/gcc.dg/inline_4.c
index dd4fadb..ebd57e9 100644
--- a/gcc/testsuite/gcc.dg/inline_4.c
+++ b/gcc/testsuite/gcc.dg/inline_4.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-optimized -fdisable-tree-einline=foo2 -fdisable-ipa-inline -Wno-attributes" } */
+/* { dg-add-options bind_pic_locally } */
 int g;
 __attribute__((always_inline)) void bar (void)
 {
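As a rough illustration of why tests like these are sensitive to -fPIC, consider the sketch below. It is not one of the testcases; `bump` and `call_twice` are made-up names, and the comments describe the general behaviour assumed here: under -fPIC an externally visible function with default visibility may be replaced at load time, so the compiler is conservative about inlining it, whereas -fpie/-fPIE (the flags bind_pic_locally adds) let the definition bind locally.

```c
#include <assert.h>

int g;  /* global counter, mirroring the 'int g;' visible in inline_4.c */

/* Externally visible definition.  In -fPIC (shared-library style) code a
   symbol like this can be interposed, so its body is not guaranteed to be
   the one that runs; with -fpie/-fPIE it binds to this definition.  */
void bump(void)
{
    g++;
}

/* Caller in the same translation unit; a candidate for inlining bump(). */
int call_twice(void)
{
    bump();
    bump();
    assert(g >= 2);
    return g;
}
```

Compiling a file like this once with -O2 -fPIC and once with -O2 -fpie and comparing the generated code for `call_twice` is one way to observe the difference, assuming a toolchain that supports both options.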
[Patch,testsuite] Fix bind_pic_locally
Hello, This is to follow up the patch I had posted to fix bind_pic_locally some time ago (sorry, this went in to my back log for a while). To summarize, multilib_flags when it contains -fpic or -fPIC, overrides -fpie or -fPIE that is added by bind_pic_locally. The fix that was finally agreed on was to store the flags to a variable at bind_pic_locally and append it to multilib_flags just before invoking target_compile and remove it immediately after that (Refer [1]). This patch implements the same. Since this is an issue not only for gcc but also for g++ and gfortran tests, I have fixed this in g++.exp and gfortran.exp along with gcc.exp. This was tested and works fine on: aarch64-none-elf aarch64-none-linux-gnu arm-none-linux-gnueabihf x86_64-unknown-linux-gnu OK for trunk? Cheers VP. [1] http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00365.html ~~~ gcc/testsuite/ChangeLog: 2014-06-04 Vidya Praveen * lib/target-support.exp (bind_pic_locally): Save the flags to 'flags_to_postpone' instead of appending to 'flags'. * lib/gcc.exp (gcc_target_compile): Append board_info's multilib_flags with flags_to_postpone and revert after target_compile. * lib/g++.exp (g++_target_compile): Ditto. * lib/gfortran.exp (gfortran_target_compile): Ditto.diff --git a/gcc/testsuite/lib/g++.exp b/gcc/testsuite/lib/g++.exp index 751e27b..6658c58 100644 --- a/gcc/testsuite/lib/g++.exp +++ b/gcc/testsuite/lib/g++.exp @@ -288,6 +288,8 @@ proc g++_target_compile { source dest type options } { global gluefile wrap_flags global ALWAYS_CXXFLAGS global GXX_UNDER_TEST +global flags_to_postpone +global board_info if { [target_info needs_status_wrapper] != "" && [info exists gluefile] } { lappend options "libs=${gluefile}" @@ -313,10 +315,25 @@ proc g++_target_compile { source dest type options } { exec rm -f $rponame } +# bind_pic_locally adds -fpie/-fPIE flags to flags_to_postpone and it is +# appended here to multilib_flags as it can be overridden by the latter +# if it was added earlier. 
After the target_compile, multilib_flags is +# restored to its orignal content. +set tboard [target_info name] +if {[board_info $tboard exists multilib_flags]} { +set orig_multilib_flags "[board_info [target_info name] multilib_flags]" +append board_info($tboard,multilib_flags) " $flags_to_postpone" +} + set options [dg-additional-files-options $options $source] set result [target_compile $source $dest $type $options] +if {[board_info $tboard exists multilib_flags]} { +set board_info($tboard,multilib_flags) $orig_multilib_flags +set flags_to_postpone "" +} + return $result } diff --git a/gcc/testsuite/lib/gcc.exp b/gcc/testsuite/lib/gcc.exp index 49394b0..f937064 100644 --- a/gcc/testsuite/lib/gcc.exp +++ b/gcc/testsuite/lib/gcc.exp @@ -126,7 +126,9 @@ proc gcc_target_compile { source dest type options } { global GCC_UNDER_TEST global TOOL_OPTIONS global TEST_ALWAYS_FLAGS - +global flags_to_postpone +global board_info + if {[target_info needs_status_wrapper] != "" && \ [target_info needs_status_wrapper] != "0" && \ [info exists gluefile] } { @@ -162,8 +164,26 @@ proc gcc_target_compile { source dest type options } { set options [concat "{additional_flags=$TOOL_OPTIONS}" $options] } +# bind_pic_locally adds -fpie/-fPIE flags to flags_to_postpone and it is +# appended here to multilib_flags as it can be overridden by the latter +# if it was added earlier. After the target_compile, multilib_flags is +# restored to its orignal content. 
+set tboard [target_info name] +if {[board_info $tboard exists multilib_flags]} { +set orig_multilib_flags "[board_info [target_info name] multilib_flags]" +append board_info($tboard,multilib_flags) " $flags_to_postpone" +} + lappend options "timeout=[timeout_value]" lappend options "compiler=$GCC_UNDER_TEST" set options [dg-additional-files-options $options $source] -return [target_compile $source $dest $type $options] +set return_val [target_compile $source $dest $type $options] + +if {[board_info $tboard exists multilib_flags]} { +set board_info($tboard,multilib_flags) $orig_multilib_flags +set flags_to_postpone "" +} + +return $return_val } + diff --git a/gcc/testsuite/lib/gfortran.exp b/gcc/testsuite/lib/gfortran.exp index c9b5d64..9d174bb 100644 --- a/gcc/testsuite/lib/gfortran.exp +++ b/gcc/testsuite/lib/gfortran.exp @@ -234,16 +234,35 @@ proc gfortran_target_compile { source dest type options } { global gluefile wrap_flags global ALWAYS_GFORTRANFLAGS global GFORTRAN_UNDER_TEST +
[Patch,AArch64] Support SISD variants of SCVTF,UCVTF
Hello,

This patch adds support for the SISD variants of the SCVTF/UCVTF instructions. It also refactors the existing support for the floating-point variants of SCVTF/UCVTF so that instruction selection is directed by the constraints. Because the floating-point variants support unequal-width conversions (SI to DF and DI to SF), new mode attributes w1 and w2 have been introduced, and fcvt_target/FCVT_TARGET have been extended to support non-vector types. Since this patch changes the existing patterns, the testcase includes tests for both the SISD and floating-point variants of the instructions.

Tested for aarch64-none-elf.

OK for trunk?

Cheers
VP.

gcc/ChangeLog:

2013-01-13  Vidya Praveen

	* aarch64.md (float2): Remove.
	(floatuns2): Remove.
	(2): New pattern for equal width float and floatuns conversions.
	(2): New pattern for inequal width float and floatuns conversions.
	* iterators.md (fcvt_target, FCVT_TARGET): Support SF and DF modes.
	(w1,w2): New mode attributes for inequal width conversions.
gcc/testsuite/ChangeLog: 2013-01-13 Vidya Praveen * gcc.target/aarch64/cvtf_1.c: New.diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index c83622d..1775849 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -3295,20 +3295,24 @@ [(set_attr "type" "f_cvtf2i")] ) -(define_insn "float2" - [(set (match_operand:GPF 0 "register_operand" "=w") -(float:GPF (match_operand:GPI 1 "register_operand" "r")))] - "TARGET_FLOAT" - "scvtf\\t%0, %1" - [(set_attr "type" "f_cvti2f")] +(define_insn "2" + [(set (match_operand:GPF 0 "register_operand" "=w,w") +(FLOATUORS:GPF (match_operand: 1 "register_operand" "w,r")))] + "" + "@ + cvtf\t%0, %1 + cvtf\t%0, %1" + [(set_attr "simd" "yes,no") + (set_attr "fp" "no,yes") + (set_attr "type" "neon_int_to_fp_,f_cvti2f")] ) -(define_insn "floatuns2" +(define_insn "2" [(set (match_operand:GPF 0 "register_operand" "=w") -(unsigned_float:GPF (match_operand:GPI 1 "register_operand" "r")))] +(FLOATUORS:GPF (match_operand: 1 "register_operand" "r")))] "TARGET_FLOAT" - "ucvtf\\t%0, %1" - [(set_attr "type" "f_cvt")] + "cvtf\t%0, %1" + [(set_attr "type" "f_cvti2f")] ) ;; --- diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index c4f95dc..11bdc35 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -293,6 +293,10 @@ ;; 32-bit version and "%x0" in the 64-bit version. 
(define_mode_attr w [(QI "w") (HI "w") (SI "w") (DI "x") (SF "s") (DF "d")]) +;; For inequal width int to float conversion +(define_mode_attr w1 [(SF "w") (DF "x")]) +(define_mode_attr w2 [(SF "x") (DF "w")]) + ;; For constraints used in scalar immediate vector moves (define_mode_attr hq [(HI "h") (QI "q")]) @@ -558,8 +562,12 @@ (define_mode_attr atomic_sfx [(QI "b") (HI "h") (SI "") (DI "")]) -(define_mode_attr fcvt_target [(V2DF "v2di") (V4SF "v4si") (V2SF "v2si")]) -(define_mode_attr FCVT_TARGET [(V2DF "V2DI") (V4SF "V4SI") (V2SF "V2SI")]) +(define_mode_attr fcvt_target [(V2DF "v2di") (V4SF "v4si") (V2SF "v2si") (SF "si") (DF "di")]) +(define_mode_attr FCVT_TARGET [(V2DF "V2DI") (V4SF "V4SI") (V2SF "V2SI") (SF "SI") (DF "DI")]) + +;; for the inequal width integer to fp conversions +(define_mode_attr fcvt_iesize [(SF "di") (DF "si")]) +(define_mode_attr FCVT_IESIZE [(SF "DI") (DF "SI")]) (define_mode_attr VSWAP_WIDTH [(V8QI "V16QI") (V16QI "V8QI") (V4HI "V8HI") (V8HI "V4HI") diff --git a/gcc/testsuite/gcc.target/aarch64/cvtf_1.c b/gcc/testsuite/gcc.target/aarch64/cvtf_1.c new file mode 100644 index 000..80ab9a5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/cvtf_1.c @@ -0,0 +1,95 @@ +/* { dg-do run } */ +/* { dg-options "-save-temps -fno-inline -O1" } */ + +#define FCVTDEF(ftype,itype) \ +void \ +cvt_##itype##_to_##ftype (itype a, ftype b)\ +{\ + ftype c;\ + c = (ftype) a;\ + if ( (c - b) > 0.1) abort();\ +} + +#define force_simd_for_float(v) asm volatile (
Re: [Patch,testsuite] Fix testcases that use bind_pic_locally
On Wed, Jan 08, 2014 at 12:28:56PM +0000, Jakub Jelinek wrote:
> On Wed, Jan 08, 2014 at 11:49:08AM +0000, Vidya Praveen wrote:
> > On Tue, Jan 07, 2014 at 09:35:54PM +0000, Mike Stump wrote:
> > > On Dec 17, 2013, at 6:06 AM, Vidya Praveen wrote:
> > > > bind_pic_locally is broken for targets that doesn't pass -fPIC/-fpic by
> > > > default [1][2].
> > >
> > > Let's give Jakub 2 days to weigh in? If no objections, Ok, though, do
> > > see about adding documentation for it.
> >
> > Sure. I didn't respin the patch with documentation since I wanted to know
> > if the solution is acceptable. If this patch is OK, I'll respin with the
> > documentation for bind_pic_locally_ok.
> >
> > > I kinda would like a simpler interface for these two, but that can be
> > > follow on work, if someone has a bright idea and some time to implement
> > > it.
> > >
> >
> > Could you explain what do you mean by simpler interface here?
>
> The simpler interface, as I said earlier, would be just to make sure
> /* { dg-add-options bind_pic_locally } */
> does the right thing, I really don't believe you've tried hard enough.
>
> It is true dejagnu's default_target_compile has:
> if {[board_info $dest exists multilib_flags]} {
>     append add_flags " [board_info $dest multilib_flags]"
> }
> last (before just adding -o $destfile; is multilib_flags where the
> -fpic/-fPIC comes in, right?), but if say dg-add-options bind_pic_locally
> adds the necessary options not to dg-extra-tools-flags, but to some
> other variable and say gcc_target_compile (and g++_target_compile)
> around the [target_compile ...] invocation e.g. temporarily append
> that other variable (if not empty) to board_info's multilib_flags
> and afterwards remove it, I don't see why it wouldn't work.
> Tcl is quite flexible in this.

Thanks Jakub. I seem to have not properly understood your earlier email. I could do this and works fine. I'll test and post the patch.

VP.
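The approach discussed in this message (append the postponed flags to multilib_flags only around the compile, then restore) can be modelled with a small C sketch. This is not the Tcl that went into the testsuite: `effective_pic_mode` and `compile_once` are hypothetical names, and the "last PIC/PIE option on the command line wins" rule is an assumption taken from how the thread describes the problem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum pic_mode { PIC_NONE, PIC_PIC, PIC_PIE };

/* Assumption for this sketch: among -fpic/-fPIC/-fpie/-fPIE, the option
   that appears last on the command line is the one that takes effect.  */
static enum pic_mode effective_pic_mode(const char *cmdline)
{
    char buf[256];
    enum pic_mode mode = PIC_NONE;

    assert(cmdline != NULL && strlen(cmdline) < sizeof buf);
    strcpy(buf, cmdline);
    for (char *tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (strcmp(tok, "-fpic") == 0 || strcmp(tok, "-fPIC") == 0)
            mode = PIC_PIC;
        else if (strcmp(tok, "-fpie") == 0 || strcmp(tok, "-fPIE") == 0)
            mode = PIC_PIE;
    }
    return mode;
}

/* Models one compile: multilib_flags always go last, as in the quoted
   default_target_compile fragment.  The postponed flags are appended to
   multilib_flags just for this call and removed again afterwards.  */
static enum pic_mode compile_once(char *multilib_flags, size_t size,
                                  const char *options, const char *postponed)
{
    char saved[128];
    char cmdline[256];
    enum pic_mode mode;

    assert(strlen(multilib_flags) < sizeof saved);
    assert(strlen(multilib_flags) + 1 + strlen(postponed) < size);

    strcpy(saved, multilib_flags);       /* remember the original flags */
    strcat(multilib_flags, " ");
    strcat(multilib_flags, postponed);   /* temporary append */

    snprintf(cmdline, sizeof cmdline, "%s %s", options, multilib_flags);
    mode = effective_pic_mode(cmdline);

    strcpy(multilib_flags, saved);       /* restore after the compile */
    return mode;
}
```

In this model, a command line of "-O2 -fpie -fPIC" (the flag added early, multilib_flags last) ends up as PIC_PIC, which is the breakage being discussed. Calling `compile_once` with multilib_flags "-fPIC" and postponed flags "-fpie" yields PIC_PIE and leaves multilib_flags unchanged for the next test.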
Re: [Patch,testsuite] Fix testcases that use bind_pic_locally
On Tue, Jan 07, 2014 at 09:35:54PM +, Mike Stump wrote:
> On Dec 17, 2013, at 6:06 AM, Vidya Praveen wrote:
> > bind_pic_locally is broken for targets that doesn't pass -fPIC/-fpic by
> > default [1][2].
>
> Let's give Jakub 2 days to weigh in? If no objections, Ok, though, do see
> about adding documentation for it.

Sure. I didn't respin the patch with documentation since I wanted to know if the solution is acceptable. If this patch is OK, I'll respin with the documentation for bind_pic_locally_ok.

> I kinda would like a simpler interface for these two, but... that can be
> follow on work, if someone has a bright idea and some time to implement it.

Could you explain what you mean by a simpler interface here?

Cheers
VP.
Re: [Patch,testsuite] Fix testcases that use bind_pic_locally
Ping! On Tue, Dec 17, 2013 at 02:06:13PM +, Vidya Praveen wrote: > Hello, > > bind_pic_locally is broken for targets that doesn't pass -fPIC/-fpic by > default [1][2]. > > One of the suggestions was to have a effective target check called > bind_pic_locally_ok which checks if bind_pic_locally will work and have it > included in all the tests that uses bind_pic_locally in dg-add-options [1]. > > This patch implements the same by checking if -fpic/-fPIC are passed by > default as well in general with the flags passed through various means. It > returns 1 when either the -fpic/-fPIC is passed by default OR when it is > not passed by default as well as not passed through any other means. This > however, will allow if -fpic/-fPIC is passed both by default and by the > other means since we can't really tell such a case and it makes no sense > to do so (because there's no reason for the testcase to pass -fPIC/-fpic > when it tries to override the same using bind_pic_locally and if it is > passed by default, there's no need to pass them through, say, board file's > cflags). > > default other-means returns > pic - 1 > pic pic 1 (invalid) > - pic 0 > - - 1 > > This patch also modifies all the testcases that use bind_pic_locally to > include this bind_pic_locally_ok check. > > Tested for aarch64-none-elf, arm-none-eabi, arm-none-linux-gnueabihf. > > OK? > > Cheers > VP. > > [1] http://gcc.gnu.org/ml/gcc/2013-09/msg00207.html > [2] http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00462.html > > > gcc/testsuite/ChangeLog: > > 2013-12-17 Vidya Praveen > > * lib/target-support.exp: (check_effective_target_bind_pic_locally_ok): > New check. > * g++.dg/ipa/iinline-1.C: Introduce bind_pic_locally_ok. > * g++.dg/ipa/iinline-2.C: Likewise. > * g++.dg/ipa/iinline-3.C: Likewise. > * g++.dg/ipa/inline-1.C: Likewise. > * g++.dg/ipa/inline-2.C: Likewise. > * g++.dg/ipa/inline-3.C: Likewise. > * g++.dg/other/first-global.C: Likewise. > * g++.dg/parse/attr-externally-visible-1.C: Likewise. 
> * g++.dg/torture/pr40323.C: Likewise. > * g++.dg/torture/pr55260-1.C: Likewise. > * g++.dg/torture/pr55260-2.C: Likewise. > * g++.dg/tree-ssa/inline-1.C: Likewise. > * g++.dg/tree-ssa/inline-2.C: Likewise. > * g++.dg/tree-ssa/inline-3.C: Likewise. > * g++.dg/tree-ssa/nothrow-1.C: Likewise. > * gcc.dg/inline-33.c: Likewise. > * gcc.dg/ipa/ipa-1.c: Likewise. > * gcc.dg/ipa/ipa-2.c: Likewise. > * gcc.dg/ipa/ipa-3.c: Likewise. > * gcc.dg/ipa/ipa-4.c: Likewise. > * gcc.dg/ipa/ipa-5.c: Likewise. > * gcc.dg/ipa/ipa-7.c: Likewise. > * gcc.dg/ipa/ipa-8.c: Likewise. > * gcc.dg/ipa/ipacost-2.c: Likewise. > * gcc.dg/ipa/ipcp-1.c: Likewise. > * gcc.dg/ipa/ipcp-2.c: Likewise. > * gcc.dg/ipa/ipcp-4.c: Likewise. > * gcc.dg/ipa/ipcp-agg-1.c: Likewise. > * gcc.dg/ipa/ipcp-agg-2.c: Likewise. > * gcc.dg/ipa/ipcp-agg-3.c: Likewise. > * gcc.dg/ipa/ipcp-agg-4.c: Likewise. > * gcc.dg/ipa/ipcp-agg-5.c: Likewise. > * gcc.dg/ipa/ipcp-agg-6.c: Likewise. > * gcc.dg/ipa/ipcp-agg-7.c: Likewise. > * gcc.dg/ipa/ipcp-agg-8.c: Likewise. > * gcc.dg/ipa/pr56988.c: Likewise. > * gcc.dg/tree-ssa/inline-3.c: Likewise. > * gcc.dg/tree-ssa/inline-4.c: Likewise. > * gcc.dg/tree-ssa/ipa-cp-1.c: Likewise. > * gcc.dg/tree-ssa/local-pure-const.c: Likewise. > * gfortran.dg/whole_file_5.f90: Likewise. > * gfortran.dg/whole_file_6.f90: Likewise. > > diff --git a/gcc/testsuite/g++.dg/ipa/iinline-1.C > b/gcc/testsuite/g++.dg/ipa/iinline-1.C > index 9f99893..b86daf1 100644 > --- a/gcc/testsuite/g++.dg/ipa/iinline-1.C > +++ b/gcc/testsuite/g++.dg/ipa/iinline-1.C > @@ -1,6 +1,7 @@ > /* Verify that simple indirect calls are inlined even without early > inlining.. 
*/ > /* { dg-do compile } */ > +/* { dg-require-effective-target bind_pic_locally_ok } */ > /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining" } */ > /* { dg-add-options bind_pic_locally } */ > > diff --git a/gcc/testsuite/g++.dg/ipa/iinline-2.C > b/gcc/testsuite/g++.dg/ipa/iinline-2.C > index 670a5dd..d4329c1 100644 > --- a/gcc/testsuite/g++.dg/ipa/iinline-2.C > +++ b/gcc/testsuite/g++.dg/ipa/iinline-2.C > @@ -1,6 +1,7 @@ > /* Verify that simple indirect calls are inlined even without early > inlining.. */ > /* { dg-do compile } */ > +/* { dg-require-effective-target bind_pic_locally_ok } */ >
[Patch,testsuite] Fix testcases that use bind_pic_locally
Hello,

bind_pic_locally is broken for targets that don't pass -fPIC/-fpic by default [1][2].

One of the suggestions was to have an effective target check called bind_pic_locally_ok, which checks whether bind_pic_locally will work, and to include it in all the tests that use bind_pic_locally in dg-add-options [1].

This patch implements that check by testing whether -fpic/-fPIC is passed by default, and also whether it is passed in general through the flags supplied by various other means. It returns 1 when either -fpic/-fPIC is passed by default OR when it is passed neither by default nor through any other means. The check, however, also accepts the case where -fpic/-fPIC is passed both by default and by other means, since we can't really detect that case and it makes no sense to set things up that way (there is no reason for a testcase to pass -fPIC/-fpic when it tries to override it using bind_pic_locally, and if it is passed by default there is no need to pass it again through, say, the board file's cflags).

default   other-means   returns
pic       -             1
pic       pic           1 (invalid)
-         pic           0
-         -             1

This patch also modifies all the testcases that use bind_pic_locally to include this bind_pic_locally_ok check.

Tested for aarch64-none-elf, arm-none-eabi, arm-none-linux-gnueabihf.

OK?

Cheers
VP.

[1] http://gcc.gnu.org/ml/gcc/2013-09/msg00207.html
[2] http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00462.html

gcc/testsuite/ChangeLog:

2013-12-17  Vidya Praveen

	* lib/target-support.exp (check_effective_target_bind_pic_locally_ok):
	New check.
	* g++.dg/ipa/iinline-1.C: Introduce bind_pic_locally_ok.
	* g++.dg/ipa/iinline-2.C: Likewise.
	* g++.dg/ipa/iinline-3.C: Likewise.
	* g++.dg/ipa/inline-1.C: Likewise.
	* g++.dg/ipa/inline-2.C: Likewise.
	* g++.dg/ipa/inline-3.C: Likewise.
	* g++.dg/other/first-global.C: Likewise.
	* g++.dg/parse/attr-externally-visible-1.C: Likewise.
	* g++.dg/torture/pr40323.C: Likewise.
	* g++.dg/torture/pr55260-1.C: Likewise.
	* g++.dg/torture/pr55260-2.C: Likewise.
	* g++.dg/tree-ssa/inline-1.C: Likewise.
	* g++.dg/tree-ssa/inline-2.C: Likewise.
	* g++.dg/tree-ssa/inline-3.C: Likewise.
	* g++.dg/tree-ssa/nothrow-1.C: Likewise.
	* gcc.dg/inline-33.c: Likewise.
	* gcc.dg/ipa/ipa-1.c: Likewise.
	* gcc.dg/ipa/ipa-2.c: Likewise.
	* gcc.dg/ipa/ipa-3.c: Likewise.
	* gcc.dg/ipa/ipa-4.c: Likewise.
	* gcc.dg/ipa/ipa-5.c: Likewise.
	* gcc.dg/ipa/ipa-7.c: Likewise.
	* gcc.dg/ipa/ipa-8.c: Likewise.
	* gcc.dg/ipa/ipacost-2.c: Likewise.
	* gcc.dg/ipa/ipcp-1.c: Likewise.
	* gcc.dg/ipa/ipcp-2.c: Likewise.
	* gcc.dg/ipa/ipcp-4.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-1.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-2.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-3.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-4.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-5.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-6.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-7.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-8.c: Likewise.
	* gcc.dg/ipa/pr56988.c: Likewise.
	* gcc.dg/tree-ssa/inline-3.c: Likewise.
	* gcc.dg/tree-ssa/inline-4.c: Likewise.
	* gcc.dg/tree-ssa/ipa-cp-1.c: Likewise.
	* gcc.dg/tree-ssa/local-pure-const.c: Likewise.
	* gfortran.dg/whole_file_5.f90: Likewise.
	* gfortran.dg/whole_file_6.f90: Likewise.

diff --git a/gcc/testsuite/g++.dg/ipa/iinline-1.C b/gcc/testsuite/g++.dg/ipa/iinline-1.C
index 9f99893..b86daf1 100644
--- a/gcc/testsuite/g++.dg/ipa/iinline-1.C
+++ b/gcc/testsuite/g++.dg/ipa/iinline-1.C
@@ -1,6 +1,7 @@
 /* Verify that simple indirect calls are inlined even without early
    inlining..  */
 /* { dg-do compile } */
+/* { dg-require-effective-target bind_pic_locally_ok } */
 /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining" } */
 /* { dg-add-options bind_pic_locally } */
 
diff --git a/gcc/testsuite/g++.dg/ipa/iinline-2.C b/gcc/testsuite/g++.dg/ipa/iinline-2.C
index 670a5dd..d4329c1 100644
--- a/gcc/testsuite/g++.dg/ipa/iinline-2.C
+++ b/gcc/testsuite/g++.dg/ipa/iinline-2.C
@@ -1,6 +1,7 @@
 /* Verify that simple indirect calls are inlined even without early
    inlining..  */
 /* { dg-do compile } */
+/* { dg-require-effective-target bind_pic_locally_ok } */
 /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining" } */
 /* { dg-add-options bind_pic_locally } */
 
diff --git a/gcc/testsuite/g++.dg/ipa/iinline-3.C b/gcc/testsuite/g++.dg/ipa/iinline-3.C
index 3daee9a..4dc604e 100644
--- a/gcc/testsuite/g++.dg/ipa/iinline-3.C
+++ b/gcc/testsuite/g++.dg/ipa/iinline-3.C
@@ -1,6 +1,7 @@
 /* Verify that we do not indirect-inline using member pointer
    parameters which have been modified.  */
 /* { dg-do run } */
+/* { dg-require-effective-target bind_pic_locally_ok } */
 /* { dg-opt
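The decision table described for bind_pic_locally_ok can be sketched as a small C model. This is only an illustration of the stated truth table, not the Tcl procedure added to lib/target-support.exp; the function name model_bind_pic_locally_ok and its two parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the bind_pic_locally_ok decision table.

   pic_by_default:     -fpic/-fPIC is passed by the compiler by default.
   pic_by_other_means: -fpic/-fPIC arrives through some other route,
                       for example multilib flags or board cflags.

   The check fails only when PIC is requested through other means
   without being the default, because the -fPIE/-fpie added by
   bind_pic_locally would then be overridden.  */
bool
model_bind_pic_locally_ok (bool pic_by_default, bool pic_by_other_means)
{
  return pic_by_default || !pic_by_other_means;
}
```

Evaluating the model for the four rows of the table gives 1, 1, 0 and 1 in that order, with the second row being the combination the message calls invalid but accepted.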
Re: Re: [Patch] Fix gcc.dg/20050922-*.c
Mike,

On 25/10/13 00:37, Mike Stump wrote:
> On Oct 24, 2013, at 2:26 AM, Vidya Praveen wrote:
> > On Mon, Oct 21, 2013 at 06:40:28PM +0100, Mike Stump wrote:
> > > On Oct 21, 2013, at 3:28 AM, Vidya Praveen wrote:
> > > > Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c includes stdlib.h.
> > > > This can be a issue especially since they define uint32_t.
> > > >
> > > > OK for 4.7, 4.8?
> >
> > It fails on arm-none-eabi.
>
> Ok, let it bake on trunk and then you can back port it if no one screams.

I think it has baked long enough. Could this be approved for 4.7 and 4.8 now?

VP.
Re: [Patch] Fix gcc.dg/20050922-*.c
On Mon, Oct 21, 2013 at 05:47:44PM +0100, Jeff Law wrote:
> On 10/21/13 04:28, Vidya Praveen wrote:
> > Hello,
> >
> > Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c include stdlib.h. This
> > can be an issue, especially since they define uint32_t. The testcase
> > writing guidelines discourage such inclusion as well.
> >
> > This patch replaces these #includes with manual declarations.
> >
> > Tested for aarch64-none-elf, arm-none-eabi and x86_64-linux-gnu
> >
> > OK for trunk, 4.7, 4.8?
> >
> > VP.
> >
> > ---
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2013-10-21  Vidya Praveen
> >
> > 	* gcc.dg/20050922-1.c: Remove stdlib.h and declare abort().
> > 	* gcc.dg/20050922-2.c: Remove stdlib.h and declare abort() and exit().
>
> OK & installed on trunk.
>
> Release managers would need to make a decision about whether or not to
> include this for the 4.7/4.8 branches.

Thanks Jeff!

VP.
Re: [Patch] Fix gcc.dg/20050922-*.c
On Mon, Oct 21, 2013 at 06:40:28PM +0100, Mike Stump wrote:
> On Oct 21, 2013, at 3:28 AM, Vidya Praveen wrote:
> > Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c include stdlib.h. This
> > can be an issue, especially since they define uint32_t.
>
> > OK for 4.7, 4.8?
>
> For release branches, you'd need to transition from the theoretical to the
> practical. On which systems (software) does it fail? If none, then no, a
> back port isn't necessary. If it fails on a system (or software) that
> real users use, then I'll approve it once you name the system (software) and
> let it bake on trunk for a week and see if anyone objects?

It fails on arm-none-eabi.

VP.
[Patch] Fix gcc.dg/20050922-*.c
Hello,

Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c include stdlib.h. This can be an issue, especially since they define uint32_t themselves. The testcase writing guidelines discourage such inclusion as well.

This patch replaces these #includes with manual declarations.

Tested for aarch64-none-elf, arm-none-eabi and x86_64-linux-gnu.

OK for trunk, 4.7, 4.8?

VP.

---

gcc/testsuite/ChangeLog:

2013-10-21  Vidya Praveen

	* gcc.dg/20050922-1.c: Remove stdlib.h and declare abort().
	* gcc.dg/20050922-2.c: Remove stdlib.h and declare abort() and exit().

diff --git a/gcc/testsuite/gcc.dg/20050922-1.c b/gcc/testsuite/gcc.dg/20050922-1.c
index ed5a3c6..982f820 100644
--- a/gcc/testsuite/gcc.dg/20050922-1.c
+++ b/gcc/testsuite/gcc.dg/20050922-1.c
@@ -4,7 +4,7 @@
 /* { dg-do run } */
 /* { dg-options "-O1 -std=c99" } */
 
-#include <stdlib.h>
+extern void abort (void);
 
 #if __INT_MAX__ == 2147483647
 typedef unsigned int uint32_t;
diff --git a/gcc/testsuite/gcc.dg/20050922-2.c b/gcc/testsuite/gcc.dg/20050922-2.c
index c2974d0..2e8db82 100644
--- a/gcc/testsuite/gcc.dg/20050922-2.c
+++ b/gcc/testsuite/gcc.dg/20050922-2.c
@@ -4,7 +4,8 @@
 /* { dg-do run } */
 /* { dg-options "-O1 -std=c99" } */
 
-#include <stdlib.h>
+extern void abort (void);
+extern void exit (int);
 
 #if __INT_MAX__ == 2147483647
 typedef unsigned int uint32_t;
Re: [Patch] Fix the testcases that use bind_pic_locally
On Tue, Oct 08, 2013 at 10:30:22AM +0100, Jakub Jelinek wrote:
> On Tue, Oct 08, 2013 at 10:14:59AM +0100, Vidya Praveen wrote:
> > There are several tests that use "dg-add-options bind_pic_locally" in
> > order to add -fPIE or -fpie when -fPIC or -fpic are used respectively,
> > with the expectation that -fPIE/-fpie will override -fPIC/-fpic. But this
> > doesn't happen since -fPIE/-fpie will be added before the -fPIC/-fpic
> > (whether -fPIC/-fpic is added as a multilib option or through cflags).
> > This is essentially due to the fact that cflags and multilib flags are
> > added after the options are added through dg-options, dg-add-options,
> > et al. in default_target_compile function.
> >
> > Assuming dg-options or dg-add-options should always win, we can fix this
> > by modifying the order in which they are concatenated at
> > default_target_compile in target.exp. But this is not recommended since
> > it depends on everyone who tests upgrading their dejagnu (refer [1]).
>
> This looks like a big step backwards and I'm afraid it can break targets
> where -fpic/-fPIC is the default.

I agree. I didn't think of this. Since the -fPIC/-fpic comes before the -fPIE/-fpie, this will work here. In other words, bind_pic_locally is not broken in this case. (This is assuming the -fPIC/-fpic default option is passed through DRIVER_SELF_SPECS or similar.)

> If dg-add-options bind_pic_locally must
> add options to the end of command line, then can't you just push the options
> that must go last to some variable other than dg-extra-tool-flags and as we
> override dejagnu's dg-test, put it in our override last (or in whatever
> other method that already added the multilib options)?

Well, multilib options are added at default_target_compile, which is in target.exp. If I store the flags in some variable at add_options_for_bind_pic_locally and add them later, that is still going to happen before default_target_compile is called.

Hope I understood your suggestion right.

Cheers
VP
[Patch] Fix the testcases that use bind_pic_locally
Hello,

There are several tests that use "dg-add-options bind_pic_locally" in order to add -fPIE or -fpie when -fPIC or -fpic are used respectively, with the expectation that -fPIE/-fpie will override -fPIC/-fpic. But this doesn't happen, since -fPIE/-fpie will be added before the -fPIC/-fpic (whether -fPIC/-fpic is added as a multilib option or through cflags). This is essentially due to the fact that cflags and multilib flags are added after the options that come from dg-options, dg-add-options, et al. in the default_target_compile function.

Assuming dg-options or dg-add-options should always win, we can fix this by modifying the order in which they are concatenated at default_target_compile in target.exp. But this is not recommended since it depends on everyone who tests upgrading their dejagnu (refer [1]).

So this patch replaces:

/* { dg-add-options bind_pic_locally } */

with

/* { dg-skip-if "" { *-*-* } { "-fPIC" "-fpic" } { "" } } */

in all the applicable test files.

NOTE: There are many files that use bind_pic_locally but PASS whether or not -fPIE/-fpie is passed. Nevertheless, I've made the replacement in all the files that use bind_pic_locally.

add_options_for_bind_pic_locally should IMO be removed or deprecated since it is misleading. I can post a separate patch for this if everyone agrees to it.

References:
[1] http://gcc.gnu.org/ml/gcc/2013-07/msg00281.html
[2] http://gcc.gnu.org/ml/gcc/2013-09/msg00207.html

This issue is, for obvious reasons, common to all targets.

Tested for aarch64-none-elf.

OK for trunk?

Cheers
VP

---

gcc/testsuite/ChangeLog:

2013-10-08  Vidya Praveen

	* gcc.dg/inline-33.c: Remove bind_pic_locally and skip if -fPIC/-fpic
	is used.
	* gcc.dg/ipa/ipa-3.c: Likewise.
	* gcc.dg/ipa/ipa-5.c: Likewise.
	* gcc.dg/ipa/ipa-7.c: Likewise.
	* gcc.dg/ipa/ipcp-2.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-1.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-2.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-6.c: Likewise.
	* gcc.dg/ipa/ipa-1.c: Likewise.
	* gcc.dg/ipa/ipa-2.c: Likewise.
	* gcc.dg/ipa/ipa-4.c: Likewise.
	* gcc.dg/ipa/ipa-8.c: Likewise.
	* gcc.dg/ipa/ipacost-2.c: Likewise.
	* gcc.dg/ipa/ipcp-1.c: Likewise.
	* gcc.dg/ipa/ipcp-4.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-3.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-4.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-5.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-7.c: Likewise.
	* gcc.dg/ipa/ipcp-agg-8.c: Likewise.
	* gcc.dg/ipa/pr56988.c: Likewise.
	* g++.dg/ipa/iinline-1.C: Likewise.
	* g++.dg/ipa/iinline-2.C: Likewise.
	* g++.dg/ipa/iinline-3.C: Likewise.
	* g++.dg/ipa/inline-1.C: Likewise.
	* g++.dg/ipa/inline-2.C: Likewise.
	* g++.dg/ipa/inline-3.C: Likewise.
	* g++.dg/other/first-global.C: Likewise.
	* g++.dg/parse/attr-externally-visible-1.C: Likewise.
	* g++.dg/torture/pr40323.C: Likewise.
	* g++.dg/torture/pr55260-1.C: Likewise.
	* g++.dg/torture/pr55260-2.C: Likewise.
	* g++.dg/tree-ssa/inline-1.C: Likewise.
	* g++.dg/tree-ssa/inline-2.C: Likewise.
	* g++.dg/tree-ssa/inline-3.C: Likewise.
	* g++.dg/tree-ssa/nothrow-1.C: Likewise.
	* gcc.dg/tree-ssa/inline-3.c: Likewise.
	* gcc.dg/tree-ssa/inline-4.c: Likewise.
	* gcc.dg/tree-ssa/ipa-cp-1.c: Likewise.
	* gcc.dg/tree-ssa/local-pure-const.c: Likewise.
	* gfortran.dg/whole_file_5.f90: Likewise.
	* gfortran.dg/whole_file_6.f90: Likewise.

diff --git a/gcc/testsuite/g++.dg/ipa/iinline-1.C b/gcc/testsuite/g++.dg/ipa/iinline-1.C
index 9f99893..e4daa8c 100644
--- a/gcc/testsuite/g++.dg/ipa/iinline-1.C
+++ b/gcc/testsuite/g++.dg/ipa/iinline-1.C
@@ -2,7 +2,7 @@
    inlining..  */
 /* { dg-do compile } */
 /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining" } */
-/* { dg-add-options bind_pic_locally } */
+/* { dg-skip-if "" { *-*-* } { "-fPIC" "-fpic" } { "" } } */
 
 extern void non_existent (const char *, int);
 
diff --git a/gcc/testsuite/g++.dg/ipa/iinline-2.C b/gcc/testsuite/g++.dg/ipa/iinline-2.C
index 670a5dd..64a4dce 100644
--- a/gcc/testsuite/g++.dg/ipa/iinline-2.C
+++ b/gcc/testsuite/g++.dg/ipa/iinline-2.C
@@ -2,7 +2,7 @@
    inlining..  */
 /* { dg-do compile } */
 /* { dg-options "-O3 -fdump-ipa-inline -fno-early-inlining" } */
-/* { dg-add-options bind_pic_locally } */
+/* { dg-skip-if "" { *-*-* } { "-fPIC" "-fpic" } { "" } } */
 
 extern void non_existent (const char *, int);
 
diff --git a/gcc/testsuite/g++.dg/ipa/iinline-3.C b/gcc/testsuite/g++.dg/ipa/iinline-3.C
index 3daee9a..0d59969 100644
--- a/gcc/testsuite/g++.dg/ipa/iinline-3.C
+++ b/gcc/testsuite/g++.dg/ipa/iinline-3.C
@@ -2,7 +2,7 @@
    parameters which have been modified.  */
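The ordering problem described in this message can be illustrated with a small C model in which the last PIC-related option on the command line decides the outcome. This is a simplified sketch of the "later option wins" behaviour the message relies on, not driver code; the names pic_mode and effective_pic_mode are hypothetical and the option strings in the usage note are only examples.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type for the model.  */
typedef enum { PIC_MODE_NONE, PIC_MODE_PIC, PIC_MODE_PIE } pic_mode;

/* Scan a space-separated option string from left to right and report
   which of -fPIC/-fpic or -fPIE/-fpie appears last.  The assumption,
   taken from the message above, is that the later option overrides
   the earlier one.  */
pic_mode
effective_pic_mode (const char *command_line)
{
  pic_mode mode = PIC_MODE_NONE;
  const char *p = command_line;

  while (*p != '\0')
    {
      p += strspn (p, " ");              /* Skip separators.  */
      size_t len = strcspn (p, " ");     /* Length of this option.  */

      if (len == 5)
        {
          if (!strncmp (p, "-fPIC", 5) || !strncmp (p, "-fpic", 5))
            mode = PIC_MODE_PIC;
          else if (!strncmp (p, "-fPIE", 5) || !strncmp (p, "-fpie", 5))
            mode = PIC_MODE_PIE;
        }
      p += len;
    }
  return mode;
}
```

With the order the testsuite actually produces, for example "-O3 -fPIE -fPIC" where the multilib -fPIC is appended after the option added by bind_pic_locally, the model reports PIC_MODE_PIC, which is why the override has no effect.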
[Patch,AArch64] Support SADDL/SSUBL/UADDL/USUBL
Hello, This patch adds support to generate SADDL/SSUBL/UADDL/USUBL. Part of the support is available already (supported for intrinsics). This patch extends this support to generate these instructions (and lane variations) in all scenarios and adds a testcase. Tested for aarch64-none-elf, aarch64_be-none-elf with no regressions. OK for trunk? Cheers VP ~~~ gcc/ChangeLog: 2013-09-30 Vidya Praveen * aarch64-simd.md (aarch64_l2_internal): Rename to ... (aarch64_l_hi_internal): ... this; Insert '\t' to output template. (aarch64_l_lo_internal): New. (aarch64_saddl2, aarch64_uaddl2): Modify to call gen_aarch64_l_hi_internal() instead. (aarch64_ssubl2, aarch64_usubl2): Ditto. gcc/testsuite/ChangeLog: 2013-09-30 Vidya Praveen * gcc.target/aarch64/vect_saddl_1.c: New. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index f13cd5b..a0259b8 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -2586,7 +2586,7 @@ ;; l. -(define_insn "aarch64_l2_internal" +(define_insn "aarch64_l_hi_internal" [(set (match_operand: 0 "register_operand" "=w") (ADDSUB: (ANY_EXTEND: (vec_select: (match_operand:VQW 1 "register_operand" "w") @@ -2595,11 +2595,26 @@ (match_operand:VQW 2 "register_operand" "w") (match_dup 3)] "TARGET_SIMD" - "l2 %0., %1., %2." + "l2\t%0., %1., %2." [(set_attr "simd_type" "simd_addl") (set_attr "simd_mode" "")] ) +(define_insn "aarch64_l_lo_internal" + [(set (match_operand: 0 "register_operand" "=w") + (ADDSUB: (ANY_EXTEND: (vec_select: + (match_operand:VQW 1 "register_operand" "w") + (match_operand:VQW 3 "vect_par_cnst_lo_half" ""))) + (ANY_EXTEND: (vec_select: + (match_operand:VQW 2 "register_operand" "w") + (match_dup 3)] + "TARGET_SIMD" + "l\t%0., %1., %2." 
+ [(set_attr "simd_type" "simd_addl") + (set_attr "simd_mode" "")] +) + + (define_expand "aarch64_saddl2" [(match_operand: 0 "register_operand" "=w") (match_operand:VQW 1 "register_operand" "w") @@ -2607,8 +2622,8 @@ "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - emit_insn (gen_aarch64_saddl2_internal (operands[0], operands[1], - operands[2], p)); + emit_insn (gen_aarch64_saddl_hi_internal (operands[0], operands[1], + operands[2], p)); DONE; }) @@ -2619,8 +2634,8 @@ "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - emit_insn (gen_aarch64_uaddl2_internal (operands[0], operands[1], - operands[2], p)); + emit_insn (gen_aarch64_uaddl_hi_internal (operands[0], operands[1], + operands[2], p)); DONE; }) @@ -2631,7 +2646,7 @@ "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - emit_insn (gen_aarch64_ssubl2_internal (operands[0], operands[1], + emit_insn (gen_aarch64_ssubl_hi_internal (operands[0], operands[1], operands[2], p)); DONE; }) @@ -2643,7 +2658,7 @@ "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, true); - emit_insn (gen_aarch64_usubl2_internal (operands[0], operands[1], + emit_insn (gen_aarch64_usubl_hi_internal (operands[0], operands[1], operands[2], p)); DONE; }) diff --git a/gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c b/gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c new file mode 100644 index 000..ecbd8a8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c @@ -0,0 +1,315 @@ +/* { dg-do run } */ +/* { dg-options "-O3 -fno-inline -save-temps -fno-vect-cost-model" } */ + +typedef signed char S8_t; +typedef signed short S16_t; +typedef signed int S32_t; +typedef signed long long S64_t; + +typedef signed char *__restrict__ pS8_t; +typedef signed short *__restrict__ pS16_t; +typedef signed int *__restrict__ pS32_t; +typedef signed long long *__restrict__ pS64_t; + +typedef unsigned char U8_t; +typedef unsigned short U16_t; +typedef unsigned int U32_t; +typedef 
unsigned long long U64_t; + +typedef unsigned char *__restrict__ pU8_t; +typedef unsigned short *__restrict__ pU16_t; +typedef unsigned int *__restrict__ pU32_t; +typedef unsigned long long *__restrict__ pU64_t; + +extern void abort (); + +void +test_addl_S64_S32_4 (pS64_t a, pS32_t b, pS32_t c) +{ + int i; + for (i = 0; i < 4; i++) +a[i] = (S64_t
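As a rough reference for what the widening instructions in this patch compute, the following C sketch models SADDL and SADDL2 on 32-bit lanes: each lane is sign-extended to 64 bits before the addition, and the "2" form reads the upper half of the source vectors. The function names and the high_half parameter are hypothetical, lanes are simply modelled as array elements, and this is not code from the patch or its testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Model one result lane of SADDL (high_half == 0) or SADDL2
   (high_half != 0) for 4 x 32-bit inputs and 2 x 64-bit outputs.
   Widening happens before the add, so the sum cannot overflow.  */
int64_t
model_saddl_lane (const int32_t a[4], const int32_t b[4],
                  int lane, int high_half)
{
  int index = lane + (high_half ? 2 : 0);
  return (int64_t) a[index] + (int64_t) b[index];
}

/* Model the whole two-lane result.  */
void
model_saddl (int64_t out[2], const int32_t a[4], const int32_t b[4],
             int high_half)
{
  for (int lane = 0; lane < 2; lane++)
    out[lane] = model_saddl_lane (a, b, lane, high_half);
}
```

A loop such as the one in test_addl_S64_S32_4 would, under this model, correspond to one low-half and one high-half operation covering the four input elements.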
Re: [Patch,AArch64] Support SISD Shifts (SHL/USHR/SSHL/USHL/SSHR)
With the attachment this time :-) Regards VP On Tue, Aug 20, 2013 at 04:01:59PM +0100, Vidya Praveen wrote: > Hello, > > This patch supports SISD shift instructions SHL/USHR/SSHR/SSHL/USHL for > SImode and DImode. This patch also refactors the integer shifts pattern > "3_insn". Pattern for rotate is moved out as ror3_insn. > > Shift patterns (aarch64_{lshr|ashl|ashr}_sisd_or_int_{si|di}3) support > both SIMD registers and general purpose registers with the shift quantity > either as variable or literal. Since there are no SISD instructions for > right shifts, the instruction SSHL and USHL are used with shift operand > negated using NEG in order reverse the direction. This is done by > insisting on splitting (after reload) in to neg and UNSPEC_SISD_USHL or > UNSPEC_SISD_SSHL or UNSPEC_USHL_S2 or UNSPEC_SSHL_S2 pattern. Since there > are no SISD variants of shift instructions available for SImode, the SIMD > variants of corresponsing instructions are used with 2S size by taking > one lane alone in to cosideration and ignoring other. > > This patch also introduces a predicate aarch64_simd_register to help in > splitting patterns. Tests for both newly introduced instructions as well > as for the integer instructions are included. > > Tested and no new regressions. > > OK for trunk? > > Regards > VP > > --- > > gcc/ChangeLog > > 2013-08-20 Vidya Praveen > > * config/aarch64/aarch64.md (unspec): Add UNSPEC_SISD_SSHL, > UNSPEC_SISD_USHL, UNSPEC_USHL_2S, UNSPEC_SSHL_2S, UNSPEC_SISD_NEG. > (3_insn): Remove. > (aarch64_ashl_sisd_or_int_3): New Pattern. > (aarch64_lshr_sisd_or_int_3): Likewise. > (aarch64_ashr_sisd_or_int_3): Likewise. > (define_split for aarch64_lshr_sisd_or_int_di3): Likewise. > (define_split for aarch64_lshr_sisd_or_int_si3): Likewise. > (define_split for aarch64_ashr_sisd_or_int_di3): Likewise. > (define_split for aarch64_ashr_sisd_or_int_si3): Likewise. > (aarch64_sisd_ushl, aarch64_sisd_sshl): Likewise. 
> (aarch64_ushl_2s, aarch64_sshl_2s, aarch64_sisd_neg_qi): Likewise. > (ror3_insn): Likewise. > * config/aarch64/predicates.md (aarch64_simd_register): New. > > gcc/testsuite/ChangeLog > > 2013-08-20 Vidya Praveen > > * gcc.target/aarch64/scalar_shift_1.c: New. > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 5312a79..07349c6 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -88,11 +88,16 @@ UNSPEC_NOP UNSPEC_PRLG_STK UNSPEC_RBIT +UNSPEC_SISD_NEG +UNSPEC_SISD_SSHL +UNSPEC_SISD_USHL +UNSPEC_SSHL_2S UNSPEC_ST2 UNSPEC_ST3 UNSPEC_ST4 UNSPEC_TLS UNSPEC_TLSDESC +UNSPEC_USHL_2S UNSPEC_VSTRUCTDUMMY ]) @@ -3183,13 +3188,182 @@ } ) -(define_insn "*3_insn" +;; Logical left shift using SISD or Integer instruction +(define_insn "*aarch64_ashl_sisd_or_int_3" + [(set (match_operand:GPI 0 "register_operand" "=w,w,r") +(ashift:GPI + (match_operand:GPI 1 "register_operand" "w,w,r") + (match_operand:QI 2 "aarch64_reg_or_shift_imm_" "Us,w,rUs")))] + "" + "@ + shl\t%0, %1, %2 + ushl\t%0, %1, %2 + lsl\t%0, %1, %2" + [(set_attr "simd" "yes,yes,no") + (set_attr "simd_type" "simd_shift_imm,simd_shift,*") + (set_attr "simd_mode" ",,*") + (set_attr "v8type" "*,*,shift") + (set_attr "type" "*,*,shift") + (set_attr "mode" "*,*,")] +) + +;; Logical right shift using SISD or Integer instruction +(define_insn "*aarch64_lshr_sisd_or_int_3" + [(set (match_operand:GPI 0 "register_operand" "=w,w,r") +(lshiftrt:GPI + (match_operand:GPI 1 "register_operand" "w,w,r") + (match_operand:QI 2 "aarch64_reg_or_shift_imm_" "Us,w,rUs")))] + "" + "@ + ushr\t%0, %1, %2 + # + lsr\t%0, %1, %2" + [(set_attr "simd" "yes,yes,no") + (set_attr "simd_type" "simd_shift_imm,simd_shift,*") + (set_attr "simd_mode" ",,*") + (set_attr "v8type" "*,*,shift") + (set_attr "type" "*,*,shift") + (set_attr "mode" "*,*,")] +) + +(define_split + [(set (match_operand:DI 0 "aarch64_simd_register") +(lshiftrt:DI + (match_operand:DI 1 "aarch64_simd_register") + 
(match_operand:QI 2 "aarch64_simd_register")))] + "TARGET_SIMD && relo
[Patch,AArch64] Support SISD Shifts (SHL/USHR/SSHL/USHL/SSHR)
Hello,

This patch adds support for the SISD shift instructions SHL/USHR/SSHR/SSHL/USHL for SImode and DImode. It also refactors the integer shift pattern "3_insn"; the pattern for rotate is moved out as ror3_insn.

The shift patterns (aarch64_{lshr|ashl|ashr}_sisd_or_int_{si|di}3) support both SIMD registers and general purpose registers, with the shift quantity given either as a variable or as a literal. Since there are no SISD instructions for right shifts by a variable amount, SSHL and USHL are used with the shift operand negated using NEG in order to reverse the direction. This is done by forcing a split (after reload) into a NEG and a UNSPEC_SISD_USHL, UNSPEC_SISD_SSHL, UNSPEC_USHL_2S or UNSPEC_SSHL_2S pattern. Since there are no SISD variants of the shift instructions available for SImode, the SIMD variants of the corresponding instructions are used with the 2S size, taking only one lane into consideration and ignoring the other.

This patch also introduces a predicate aarch64_simd_register to help in splitting patterns. Tests for both the newly introduced instructions and the integer instructions are included.

Tested and no new regressions.

OK for trunk?

Regards
VP

---

gcc/ChangeLog

2013-08-20  Vidya Praveen

	* config/aarch64/aarch64.md (unspec): Add UNSPEC_SISD_SSHL,
	UNSPEC_SISD_USHL, UNSPEC_USHL_2S, UNSPEC_SSHL_2S, UNSPEC_SISD_NEG.
	(3_insn): Remove.
	(aarch64_ashl_sisd_or_int_3): New Pattern.
	(aarch64_lshr_sisd_or_int_3): Likewise.
	(aarch64_ashr_sisd_or_int_3): Likewise.
	(define_split for aarch64_lshr_sisd_or_int_di3): Likewise.
	(define_split for aarch64_lshr_sisd_or_int_si3): Likewise.
	(define_split for aarch64_ashr_sisd_or_int_di3): Likewise.
	(define_split for aarch64_ashr_sisd_or_int_si3): Likewise.
	(aarch64_sisd_ushl, aarch64_sisd_sshl): Likewise.
	(aarch64_ushl_2s, aarch64_sshl_2s, aarch64_sisd_neg_qi): Likewise.
	(ror3_insn): Likewise.
	* config/aarch64/predicates.md (aarch64_simd_register): New.

gcc/testsuite/ChangeLog

2013-08-20  Vidya Praveen

	* gcc.target/aarch64/scalar_shift_1.c: New.
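The negate-then-shift-left trick described above can be sketched in C for the 64-bit case. The model below assumes the usual USHL/SSHL behaviour in which the shift count is the signed low byte of the second operand, with a negative count shifting right; the function names are hypothetical and the code is an illustration of the idea rather than anything taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit USHL: a positive count shifts left, a negative
   count shifts right logically, and out-of-range counts give 0.  */
uint64_t
model_ushl (uint64_t value, int8_t count)
{
  if (count >= 64 || count <= -64)
    return 0;
  if (count >= 0)
    return value << count;
  return value >> -count;
}

/* Model of a 64-bit SSHL: like model_ushl, but a negative count
   shifts right arithmetically.  The sign bits are filled in by hand
   so the model does not rely on how >> treats negative values.  */
int64_t
model_sshl (int64_t value, int8_t count)
{
  if (count >= 64)
    return 0;
  if (count >= 0)
    return (int64_t) ((uint64_t) value << count);
  if (count <= -64)
    return value < 0 ? -1 : 0;

  uint64_t shifted = (uint64_t) value >> -count;
  if (value < 0)
    shifted |= ~(UINT64_MAX >> -count);
  return (int64_t) shifted;
}

/* Logical right shift expressed as NEG followed by USHL.
   AMOUNT is assumed to be in the range 0..63.  */
uint64_t
model_lshr_via_ushl (uint64_t value, unsigned amount)
{
  int8_t negated = (int8_t) -(int) amount;   /* The NEG step.  */
  return model_ushl (value, negated);
}

/* Arithmetic right shift expressed as NEG followed by SSHL.
   AMOUNT is assumed to be in the range 0..63.  */
int64_t
model_ashr_via_sshl (int64_t value, unsigned amount)
{
  int8_t negated = (int8_t) -(int) amount;   /* The NEG step.  */
  return model_sshl (value, negated);
}
```

In this model, model_lshr_via_ushl (x, n) matches x >> n for unsigned x, which is the property the post-reload split depends on for shift amounts held in a SIMD register.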
Re: [Patch] Fix selector for vect-iv-5.c
Ping! On Tue, Jul 23, 2013 at 10:21:52AM +0100, Vidya Praveen wrote: > Hello > > gcc.dg/vect/vect-iv-5.c XPASSes for arm-*-* since gcc.dg/vect/*.c tests are > always run with -ffast-math for arm-*-*. This patch makes xfail conditional > for this test by adding effective target keyword !arm_neon_ok. > > OK for trunk? > > Regards > VP > > -- > > gcc/testsuite/ChangeLog: > > 2013-07-22 Vidya Praveen > > * gcc.dg/vect/vect-iv-5.c: Make xfail conditional with !arm_neon_ok. > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-iv-5.c > b/gcc/testsuite/gcc.dg/vect/vect-iv-5.c > index 1766ae6..8861095 100644 > --- a/gcc/testsuite/gcc.dg/vect/vect-iv-5.c > +++ b/gcc/testsuite/gcc.dg/vect/vect-iv-5.c > @@ -36,5 +36,5 @@ int main (void) >return main1 (); > } > > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail > *-*-* } } } */ > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail > {! arm_neon_ok } } } } */ > /* { dg-final { cleanup-tree-dump "vect" } } */
[Patch] Fix selector for vect-iv-5.c
Hello,

gcc.dg/vect/vect-iv-5.c XPASSes for arm-*-* since the gcc.dg/vect/*.c tests are always run with -ffast-math for arm-*-*. This patch makes the xfail for this test conditional by using the effective target keyword arm_neon_ok, so that the test is expected to fail only when !arm_neon_ok.

OK for trunk?

Regards
VP

--

gcc/testsuite/ChangeLog:

2013-07-22  Vidya Praveen

	* gcc.dg/vect/vect-iv-5.c: Make xfail conditional with !arm_neon_ok.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-iv-5.c b/gcc/testsuite/gcc.dg/vect/vect-iv-5.c
index 1766ae6..8861095 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-iv-5.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-iv-5.c
@@ -36,5 +36,5 @@ int main (void)
   return main1 ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail {! arm_neon_ok } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Added myself to MAINTAINERS (Write After Approval)
2013-06-14  Vidya Praveen

        * MAINTAINERS (Write After Approval): Add myself.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 200091)
+++ MAINTAINERS (working copy)
@@ -487,6 +487,7 @@
 Paul Pluzhnikov        ppluzhni...@google.com
 Marek Polacek          pola...@redhat.com
 Antoniu Pop            antoniu@gmail.com
+Vidya Praveen          vidyaprav...@arm.com
 Vladimir Prus          vladi...@codesourcery.com
 Yao Qi                 y...@codesourcery.com
 Jerry Quinn            jlqu...@optonline.net
Re: [AArch64] Support for SMLAL/SMLSL/UMLAL/UMLSL
On 14/06/13 16:01, Richard Earnshaw wrote:
> On 14/06/13 15:33, Marcus Shawcroft wrote:
>> On 14/06/13 14:55, Vidya Praveen wrote:
>>> [...]
>>> to support SMLAL/UMLAL instructions for 64 bit vector modes.
>>> * config/aarch64/aarch64-simd.md (*aarch64_mlsl): New pattern to
>>> support SMLSL/UMLSL instructions for 64 bit vector modes.
>>
>> Convention is that we say what changed in the changelog entry and write
>> the justification in the covering email summary. Therefore, in instances
>> like this where you are defining a new pattern, it is sufficient to write
>> simply:
>>
>> * config/aarch64/aarch64-simd.md (*aarch64_mlal_lo): Define.
>
> I tend to prefer "New pattern." over "Define." on the grounds that it
> tells me that this is a pattern, not a constraint or some other construct.
>
> Also, there's no need to repeat the file name each time, or put the
> leading '*' on the pattern name. You can also list more than one function
> at the same time if it has the same description, and use 'Likewise' when
> this extends to multiple lines. Finally, don't over-indent continuation
> lines. So:
>
> 2013-06-14  Vidya Praveen
>
>         * config/aarch64/aarch64-simd.md (aarch64_mlal_lo): New pattern.
>         (aarch64_mlal_hi, aarch64_mlsl_lo): Likewise.
>         (aarch64_mlsl_hi, aarch64_mlal): Likewise.
>
> etc.

Thanks Marcus/Richard for the recommendations. After changes:

gcc/ChangeLog

2013-06-14  Vidya Praveen

        * config/aarch64/aarch64-simd.md (aarch64_mlal_lo): New pattern.
        (aarch64_mlal_hi, aarch64_mlsl_lo): Likewise.
        (aarch64_mlsl_hi, aarch64_mlal): Likewise.
        (aarch64_mlsl): Likewise.

gcc/testsuite/ChangeLog

2013-06-14  Vidya Praveen

        * gcc.target/aarch64/vect_smlal_1.c: New file.

~VP
[AArch64] Support for SMLAL/SMLSL/UMLAL/UMLSL
Hello,

This patch adds support for the SMLAL/SMLSL/UMLAL/UMLSL instructions and adds
tests for them. Regression tested on aarch64-none-elf with no regressions.

OK?

~VP

---

gcc/ChangeLog

2013-06-14  Vidya Praveen

        * config/aarch64/aarch64-simd.md (*aarch64_mlal_lo): New pattern
        to support SMLAL,UMLAL instructions.
        * config/aarch64/aarch64-simd.md (*aarch64_mlal_hi): New pattern
        to support SMLAL2,UMLAL2 instructions.
        * config/aarch64/aarch64-simd.md (*aarch64_mlsl_lo): New pattern
        to support SMLSL,UMLSL instructions.
        * config/aarch64/aarch64-simd.md (*aarch64_mlsl_hi): New pattern
        to support SMLSL2,UMLSL2 instructions.
        * config/aarch64/aarch64-simd.md (*aarch64_mlal): New pattern to
        support SMLAL/UMLAL instructions for 64 bit vector modes.
        * config/aarch64/aarch64-simd.md (*aarch64_mlsl): New pattern to
        support SMLSL/UMLSL instructions for 64 bit vector modes.

gcc/testsuite/ChangeLog

2013-06-14  Vidya Praveen

        * gcc.target/aarch64/vect_smlal_1.c: New file.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index e5990d4..8589476 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1190,6 +1190,104 @@
 ;; Widening arithmetic.
 
+(define_insn "*aarch64_mlal_lo"
+ [(set (match_operand: 0 "register_operand" "=w")
+       (plus:
+         (mult:
+           (ANY_EXTEND: (vec_select:
+             (match_operand:VQW 2 "register_operand" "w")
+             (match_operand:VQW 3 "vect_par_cnst_lo_half" "")))
+           (ANY_EXTEND: (vec_select:
+             (match_operand:VQW 4 "register_operand" "w")
+             (match_dup 3))))
+         (match_operand: 1 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "mlal\t%0., %2., %4."
+  [(set_attr "simd_type" "simd_mlal")
+   (set_attr "simd_mode" "")]
+)
+
+(define_insn "*aarch64_mlal_hi"
+ [(set (match_operand: 0 "register_operand" "=w")
+       (plus:
+         (mult:
+           (ANY_EXTEND: (vec_select:
+             (match_operand:VQW 2 "register_operand" "w")
+             (match_operand:VQW 3 "vect_par_cnst_hi_half" "")))
+           (ANY_EXTEND: (vec_select:
+             (match_operand:VQW 4 "register_operand" "w")
+             (match_dup 3))))
+         (match_operand: 1 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "mlal2\t%0., %2., %4."
+  [(set_attr "simd_type" "simd_mlal")
+   (set_attr "simd_mode" "")]
+)
+
+(define_insn "*aarch64_mlsl_lo"
+ [(set (match_operand: 0 "register_operand" "=w")
+       (minus:
+         (match_operand: 1 "register_operand" "0")
+         (mult:
+           (ANY_EXTEND: (vec_select:
+             (match_operand:VQW 2 "register_operand" "w")
+             (match_operand:VQW 3 "vect_par_cnst_lo_half" "")))
+           (ANY_EXTEND: (vec_select:
+             (match_operand:VQW 4 "register_operand" "w")
+             (match_dup 3))))))]
+  "TARGET_SIMD"
+  "mlsl\t%0., %2., %4."
+  [(set_attr "simd_type" "simd_mlal")
+   (set_attr "simd_mode" "")]
+)
+
+(define_insn "*aarch64_mlsl_hi"
+ [(set (match_operand: 0 "register_operand" "=w")
+       (minus:
+         (match_operand: 1 "register_operand" "0")
+         (mult:
+           (ANY_EXTEND: (vec_select:
+             (match_operand:VQW 2 "register_operand" "w")
+             (match_operand:VQW 3 "vect_par_cnst_hi_half" "")))
+           (ANY_EXTEND: (vec_select:
+             (match_operand:VQW 4 "register_operand" "w")
+             (match_dup 3))))))]
+  "TARGET_SIMD"
+  "mlsl2\t%0., %2., %4."
+  [(set_attr "simd_type" "simd_mlal")
+   (set_attr "simd_mode" "")]
+)
+
+(define_insn "*aarch64_mlal"
+ [(set (match_operand: 0 "register_operand" "=w")
+       (plus:
+         (mult:
+           (ANY_EXTEND:
+             (match_operand:VDW 1 "register_operand" "w"))
+           (ANY_EXTEND:
+             (match_operand:VDW 2 "register_operand" "w")))
+         (match_operand: 3 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "mlal\t%0., %1., %2."
+  [(set_attr "simd_type" "simd_mlal&
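For readers less familiar with these instructions, the arithmetic the new patterns describe can be sketched in scalar C: each narrow element is extended to the wide type, the two are multiplied, and the product is added to (MLAL) or subtracted from (MLSL) the accumulator. This is an illustration only. The function names and the choice of 16-bit inputs with 32-bit accumulators are assumptions made for the sketch; they are not taken from the patch or from vect_smlal_1.c, and the sketch makes no claim about which loops the vectorizer will actually turn into these instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference for a signed widening
   multiply-accumulate: acc[i] += sext (a[i]) * sext (b[i]).  */
void
widening_mla_s16 (int32_t *acc, const int16_t *a, const int16_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    acc[i] += (int32_t) a[i] * (int32_t) b[i];
}

/* Hypothetical scalar reference for a signed widening
   multiply-subtract: acc[i] -= sext (a[i]) * sext (b[i]).  */
void
widening_mls_s16 (int32_t *acc, const int16_t *a, const int16_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    acc[i] -= (int32_t) a[i] * (int32_t) b[i];
}
```

The _lo and _hi patterns in the patch apply the same arithmetic to the low and high halves of a 128-bit input vector respectively, while the VDW patterns take whole 64-bit vectors.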
Re: [AArch64] Support for CLZ
On 23/05/13 14:40, Marcus Shawcroft wrote:
> On 22 May 2013 12:47, Vidya Praveen wrote:
>> Hello,
>>
>> This patch adds support to AdvSIMD CLZ instruction and adds tests for
>> the same. Regression test done for aarch64-none-elf with no issues.
>>
>> OK?
>>
>> Regards
>> VP
>>
>> ---
>>
>> gcc/ChangeLog
>>
>> 2013-05-22  Vidya Praveen
>>
>>         * config/aarch64/aarch64-simd.md (clzv4si2): Support for CLZ
>>         instruction (AdvSIMD).
>>         * config/aarch64/aarch64-builtins.c
>>         (aarch64_builtin_vectorized_function): Handler for BUILT_IN_CLZ.
>>         * config/aarch64/aarch-simd-builtins.def: Entry for CLZ.
>>         * testsuite/gcc.target/aarch64/vect-clz.c: New file.
>
> I committed this for you, and moved the testsuite ChangeLog entry over to
> gcc/testsuite/ChangeLog.

Thanks Marcus! :-)

Regards
VP
[AArch64] Support for CLZ
Hello,

This patch adds support for the AdvSIMD CLZ instruction and adds tests for
it. Regression tested on aarch64-none-elf with no issues.

OK?

Regards
VP

---

gcc/ChangeLog

2013-05-22  Vidya Praveen

        * config/aarch64/aarch64-simd.md (clzv4si2): Support for CLZ
        instruction (AdvSIMD).
        * config/aarch64/aarch64-builtins.c
        (aarch64_builtin_vectorized_function): Handler for BUILT_IN_CLZ.
        * config/aarch64/aarch64-simd-builtins.def: Entry for CLZ.
        * testsuite/gcc.target/aarch64/vect-clz.c: New file.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 4fdfe24..2a0e5fd 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1245,6 +1245,16 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in)
   return AARCH64_FIND_FRINT_VARIANT (sqrt);
 #undef AARCH64_CHECK_BUILTIN_MODE
 #define AARCH64_CHECK_BUILTIN_MODE(C, N) \
+  (out_mode == SImode && out_n == C \
+   && in_mode == N##Imode && in_n == C)
+case BUILT_IN_CLZ:
+  {
+    if (AARCH64_CHECK_BUILTIN_MODE (4, S))
+      return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_clzv4si];
+    return NULL_TREE;
+  }
+#undef AARCH64_CHECK_BUILTIN_MODE
+#define AARCH64_CHECK_BUILTIN_MODE(C, N) \
   (out_mode == N##Imode && out_n == C \
    && in_mode == N##Fmode && in_n == C)
 case BUILT_IN_LFLOOR:
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index e420173..5134f96 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -49,6 +49,7 @@
   BUILTIN_VDQF (UNOP, sqrt, 2)
   BUILTIN_VD_BHSI (BINOP, addp, 0)
   VAR1 (UNOP, addp, 0, di)
+  VAR1 (UNOP, clz, 2, v4si)
 
   BUILTIN_VD_RE (REINTERP, reinterpretdi, 0)
   BUILTIN_VDC (REINTERP, reinterpretv8qi, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9069a73..82fe1ad 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1611,6 +1611,15 @@
   DONE;
 })
 
+(define_insn "clz2"
+ [(set (match_operand:VDQ_BHSI 0 "register_operand" "=w")
+       (clz:VDQ_BHSI (match_operand:VDQ_BHSI 1 "register_operand" "w")))]
+ "TARGET_SIMD"
+ "clz\\t%0., %1."
+ [(set_attr "simd_type" "simd_cls")
+  (set_attr "simd_mode" "")]
+)
+
 ;; 'across lanes' max and min ops.
 
 (define_insn "reduc__"
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-clz.c b/gcc/testsuite/gcc.target/aarch64/vect-clz.c
new file mode 100644
index 000..8f1fe70
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-clz.c
@@ -0,0 +1,35 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fno-inline" } */
+
+extern void abort ();
+
+void
+count_lz_v4si (unsigned *__restrict a, int *__restrict b)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    b[i] = __builtin_clz (a[i]);
+}
+
+/* { dg-final { scan-assembler "clz\tv\[0-9\]+\.4s" } } */
+
+int
+main ()
+{
+  unsigned int x[4] = { 0x0, 0x, 0x1, 0x };
+  int r[4] = { 32, 16, 15, 0 };
+  int d[4], i;
+
+  count_lz_v4si (x, d);
+
+  for (i = 0; i < 4; i++)
+    {
+      if (d[i] != r[i])
+        abort ();
+    }
+
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
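As a cross-check on the expected values in the test, here is a portable scalar sketch of a 32-bit count-leading-zeros. The name clz32_ref is hypothetical and not part of the patch. It returns 32 for a zero input, which is the result the test expects in r[0] and what the AdvSIMD CLZ instruction produces for a 32-bit element; note that GCC documents the result of __builtin_clz (0) itself as undefined, so the test depends on the target's behaviour for that lane.

```c
#include <stdint.h>

/* Hypothetical reference: number of leading zero bits in X,
   with the zero input defined to give the element width (32).  */
int
clz32_ref (uint32_t x)
{
  int n = 0;

  if (x == 0)
    return 32;

  /* Shift left until the top bit is set, counting the steps.  */
  while ((x & UINT32_C (0x80000000)) == 0)
    {
      x <<= 1;
      n++;
    }

  return n;
}
```

Any input whose highest set bit is bit 15 gives 16, bit 16 gives 15, and bit 31 gives 0, matching the shape of the expected results { 32, 16, 15, 0 }. The input constants of the posted test are truncated in this archive, so no particular values are assumed here.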
[AArch64] Fix the description of simd_fabd
Hello,

The attached patch corrects the description for simd_fabd.

OK?

Regards
VP

gcc/ChangeLog

2013-05-02  Vidya Praveen

        * config/aarch64/aarch64-simd.md (simd_fabd): Correct the description.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 5862d26..65847ce 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -44,7 +44,7 @@
 ; simd_dup     duplicate element.
 ; simd_dupgp   duplicate general purpose register.
 ; simd_ext     bitwise extract from pair.
-; simd_fabd    floating absolute difference and accumulate.
+; simd_fabd    floating point absolute difference.
 ; simd_fadd    floating point add/sub.
 ; simd_fcmp    floating point compare.
 ; simd_fcvti   floating point convert to integer.
[AArch64] Support scalar form of FABD
Hello,

The attached patch adds support for the scalar form of the FABD instruction,
along with compile and execute tests for it. Regression tested on
aarch64-none-elf with no issues.

OK?

Regards
VP

---

gcc/ChangeLog

2013-05-02  Vidya Praveen

        * config/aarch64/aarch64-simd.md (*fabd_scalar3): Support scalar
        form of FABD instruction.

gcc/testsuite/ChangeLog

2013-05-02  Vidya Praveen

        * gcc.target/aarch64/fabd.c: New file.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 5862d26..e5fc032 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -556,6 +556,17 @@
    (set_attr "simd_mode" "")]
 )
 
+(define_insn "*fabd_scalar3"
+  [(set (match_operand:GPF 0 "register_operand" "=w")
+        (abs:GPF (minus:GPF
+          (match_operand:GPF 1 "register_operand" "w")
+          (match_operand:GPF 2 "register_operand" "w"))))]
+  "TARGET_SIMD"
+  "fabd\t%0, %1, %2"
+  [(set_attr "simd_type" "simd_fabd")
+   (set_attr "mode" "")]
+)
+
 (define_insn "and3"
   [(set (match_operand:VDQ 0 "register_operand" "=w")
         (and:VDQ (match_operand:VDQ 1 "register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/aarch64/fabd.c b/gcc/testsuite/gcc.target/aarch64/fabd.c
new file mode 100644
index 000..7206d5e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fabd.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-inline --save-temps" } */
+
+extern double fabs (double);
+extern float fabsf (float);
+extern void abort ();
+extern void exit (int);
+
+void
+fabd_d (double x, double y, double d)
+{
+  if ((fabs (x - y) - d) > 0.1)
+    abort ();
+}
+
+/* { dg-final { scan-assembler "fabd\td\[0-9\]+" } } */
+
+void
+fabd_f (float x, float y, float d)
+{
+  if ((fabsf (x - y) - d) > 0.1)
+    abort ();
+}
+
+/* { dg-final { scan-assembler "fabd\ts\[0-9\]+" } } */
+
+int
+main ()
+{
+  fabd_d (10.0, 5.0, 5.0);
+  fabd_d (5.0, 10.0, 5.0);
+  fabd_f (10.0, 5.0, 5.0);
+  fabd_f (5.0, 10.0, 5.0);
+
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
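The RTL in the new pattern is abs (minus (op1, op2)), that is, the absolute value of a floating-point difference. A minimal scalar sketch of that computation follows. The function names are hypothetical, and the absolute value is written out by hand rather than calling fabs so that the snippet has no libm dependency; as a result it is only a sketch of the arithmetic for ordinary finite inputs and does not try to reproduce the instruction's handling of signed zeros or NaNs.

```c
/* Hypothetical reference: absolute difference of two doubles.  */
double
fabd_ref_d (double x, double y)
{
  double d = x - y;
  return d < 0.0 ? -d : d;
}

/* Hypothetical reference: absolute difference of two floats.  */
float
fabd_ref_f (float x, float y)
{
  float d = x - y;
  return d < 0.0f ? -d : d;
}
```

With the inputs used in the posted test, both argument orders (10.0, 5.0) and (5.0, 10.0) give 5.0, which is the value the test passes as the expected difference.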