RE: [patch] Fix PR middle-end/59138
Hi,

On Fri, 29 Nov 2013 12:28:15, Eric Botcazou wrote:

> I think I see a small flaw in that patch:
>
>   +	  /* Make sure not to write past the end of the struct.  */
>   +	  store_bit_field (dest,
>   +			   adj_bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
>   +			   bytepos * BITS_PER_UNIT, ssize * BITS_PER_UNIT,
>   +			   VOIDmode, tmps[i]);
>
> the parameter BITREGION_END is wrong.  It should be:
>
>   ssize * BITS_PER_UNIT - 1
>
> This won't change anything since you cannot access one bit without
> accessing its unit, but yes, it's more correct.  Feel free to change it
> on both branches.
>
> --
> Eric Botcazou

Ok, committed as obvious, in both branches, after regression-testing.

Thanks
Bernd.

patch-emit-group-store.diff
Description: Binary data
Re: [wide-int] Add a fast path for multiplication by 0
Richard Biener richard.guent...@gmail.com writes:
> Richard Sandiford rdsandif...@googlemail.com wrote:
>> Richard Biener richard.guent...@gmail.com writes:
>>> On Fri, Nov 29, 2013 at 12:14 PM, Richard Sandiford
>>> rdsandif...@googlemail.com wrote:
>>>> In the fold-const.ii testcase, well over half of the mul_internal
>>>> calls were for multiplication by 0 (106038 out of 169355).  This
>>>> patch adds an early-out for that.  Tested on x86_64-linux-gnu.
>>>> OK to install?
>>>
>>> Ok.  Did you check how many of the remaining are multiplies by 1?
>>
>> Turns out to be 9685, which is probably enough to justify a special
>> case.  Tested on x86_64-linux-gnu.  OK to install?
>
> Ok.  I assume we already have a special case for division by 1?

I don't think it's really worth having one there.  If the inputs fit
into single HWIs then the output does too, so we can just use a host
division.

Thanks,
Richard
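For readers unfamiliar with the wide-int code, the shape of such an early-out can be sketched in plain C.  This is a hedged illustration, not GCC's actual wide_int implementation: `big_mul`, `NLIMBS`, the 16-bit limb representation, and the `slow_path_calls` counter are all invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy sketch, NOT GCC's wide-int code: a truncating fixed-width
   multiply with early-outs for 0 and 1, mirroring the fast paths
   discussed in the thread.  Numbers are little-endian arrays of
   NLIMBS 16-bit limbs.  */
#define NLIMBS 4

static int slow_path_calls;	/* how often the full multiply ran */

static int
is_zero (const uint16_t *a)
{
  for (int i = 0; i < NLIMBS; i++)
    if (a[i])
      return 0;
  return 1;
}

static int
is_one (const uint16_t *a)
{
  if (a[0] != 1)
    return 0;
  for (int i = 1; i < NLIMBS; i++)
    if (a[i])
      return 0;
  return 1;
}

static void
big_mul (uint16_t *res, const uint16_t *a, const uint16_t *b)
{
  /* Fast path: anything times 0 is 0; no limb arithmetic needed.  */
  if (is_zero (a) || is_zero (b))
    {
      memset (res, 0, NLIMBS * sizeof *res);
      return;
    }
  /* Fast path: multiplication by 1 is just a copy.  */
  if (is_one (a))
    {
      memmove (res, b, NLIMBS * sizeof *res);
      return;
    }
  if (is_one (b))
    {
      memmove (res, a, NLIMBS * sizeof *res);
      return;
    }
  /* Slow path: schoolbook multiply, truncated to NLIMBS limbs.  */
  slow_path_calls++;
  uint16_t tmp[NLIMBS] = { 0 };
  for (int i = 0; i < NLIMBS; i++)
    {
      uint32_t carry = 0;
      for (int j = 0; i + j < NLIMBS; j++)
	{
	  uint32_t t = (uint32_t) a[i] * b[j] + tmp[i + j] + carry;
	  tmp[i + j] = (uint16_t) t;
	  carry = t >> 16;
	}
    }
  memcpy (res, tmp, sizeof tmp);
}
```

The point of the counter is to make the payoff measurable: with inputs like those in fold-const.ii, most calls would return before any limb loop runs.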
Re: [Patch, fortran] PR58410 - [4.8/4.9 Regression] Bogus uninitialized variable warning for allocatable derived type array function result
Paul Richard Thomas wrote:
> This turned out to be a valid uninitialized variable warning.
> Bootstrapped and regtested on FC17/x86_64 - OK for trunk and 4.8?

OK – thanks for the patch!

Tobias

2013-11-30  Paul Thomas  pa...@gcc.gnu.org

	PR fortran/58410
	* trans-array.c (gfc_alloc_allocatable_for_assignment): Do not use
	the array bounds of an unallocated array but set its size to zero
	instead.
[wide-int]
At the moment we only use host divisions for precisions <= HOST_BITS_PER_WIDE_INT.  This patch extends it to any division in which the inputs fit in HWIs.  The easiest way seemed to be to construct wide_int_refs for the numbers and reuse wi::fits_*hwi_p.  This also simplifies some of the other code.

The only tricky thing is that we need to handle HOST_WIDE_INT_MIN / -1 specially for precisions > HOST_BITS_PER_WIDE_INT, since that isn't an overflow and is the only case where the result needs two HWIs.

The slow path is now only used 7 times for insn-recog.ii, 4 for cp/parser.ii and not at all for fold-const.ii.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard

Index: gcc/wide-int.cc
===================================================================
--- gcc/wide-int.cc	2013-11-30 10:30:07.512582970 +0000
+++ gcc/wide-int.cc	2013-12-01 10:48:47.128907474 +0000
@@ -1663,9 +1663,10 @@ divmod_internal_2 (unsigned HOST_HALF_WI
    the division overflowed.  */
 unsigned int
 wi::divmod_internal (HOST_WIDE_INT *quotient, unsigned int *remainder_len,
-		     HOST_WIDE_INT *remainder, const HOST_WIDE_INT *dividend,
+		     HOST_WIDE_INT *remainder,
+		     const HOST_WIDE_INT *dividend_val,
 		     unsigned int dividend_len, unsigned int dividend_prec,
-		     const HOST_WIDE_INT *divisor, unsigned int divisor_len,
+		     const HOST_WIDE_INT *divisor_val, unsigned int divisor_len,
 		     unsigned int divisor_prec, signop sgn, bool *oflow)
 {
@@ -1680,42 +1681,25 @@ wi::divmod_internal (HOST_WIDE_INT *quot
   unsigned HOST_HALF_WIDE_INT
     b_divisor[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
   unsigned int m, n;
-  HOST_WIDE_INT u0[WIDE_INT_MAX_ELTS];
-  HOST_WIDE_INT u1[WIDE_INT_MAX_ELTS];
   bool dividend_neg = false;
   bool divisor_neg = false;
   bool overflow = false;
+  wide_int neg_dividend, neg_divisor;
 
-  if (divisor[0] == 0 && divisor_len == 1)
+  wide_int_ref dividend = wi::storage_ref (dividend_val, dividend_len,
+					   dividend_prec);
+  wide_int_ref divisor = wi::storage_ref (divisor_val, divisor_len,
+					  divisor_prec);
+  if (divisor == 0)
     overflow = true;
 
-  /* The smallest signed number / -1 causes overflow.  */
+  /* The smallest signed number / -1 causes overflow.  The dividend_len
+     check is for speed rather than correctness.  */
   if (sgn == SIGNED
       && dividend_len == BLOCKS_NEEDED (dividend_prec)
-      && divisor_len == 1)
-    {
-      HOST_WIDE_INT divisor_low = divisor[0];
-      if (divisor_prec < HOST_BITS_PER_WIDE_INT)
-	divisor_low = sext_hwi (divisor_low, divisor_prec);
-      unsigned HOST_WIDE_INT dividend_high = dividend[dividend_len - 1];
-      dividend_high <<= -dividend_prec % HOST_BITS_PER_WIDE_INT;
-      if (divisor_low == -1
-	  && (HOST_WIDE_INT) dividend_high == HOST_WIDE_INT_MIN)
-	{
-	  /* The smallest neg number is 100...00.  The high word was
-	     checked above, now check the rest of the words are zero.  */
-	  unsigned int i;
-	  bool all_zero = true;
-	  for (i = 0; i + 1 < dividend_len; i++)
-	    if (dividend[i] != 0)
-	      {
-		all_zero = false;
-		break;
-	      }
-	  if (all_zero)
-	    overflow = true;
-	}
-    }
+      && divisor == -1
+      && wi::only_sign_bit_p (dividend))
+    overflow = true;
 
   /* If overflow is set, just get out.  There will only be grief by
      continuing.  */
@@ -1737,27 +1721,30 @@ wi::divmod_internal (HOST_WIDE_INT *quot
 	*oflow = false;
     }
 
   /* Do it on the host if you can.  */
-  if (dividend_prec <= HOST_BITS_PER_WIDE_INT
-      && divisor_prec <= HOST_BITS_PER_WIDE_INT)
+  if (sgn == SIGNED
+      && wi::fits_shwi_p (dividend)
+      && wi::fits_shwi_p (divisor))
     {
-      if (sgn == SIGNED)
-	{
-	  HOST_WIDE_INT o0 = sext_hwi (dividend[0], dividend_prec);
-	  HOST_WIDE_INT o1 = sext_hwi (divisor[0], divisor_prec);
+      HOST_WIDE_INT o0 = dividend.to_shwi ();
+      HOST_WIDE_INT o1 = divisor.to_shwi ();
+      if (o0 == HOST_WIDE_INT_MIN && o1 == -1)
+	{
+	  gcc_checking_assert (dividend_prec > HOST_BITS_PER_WIDE_INT);
 	  if (quotient)
-	    quotient[0] = o0 / o1;
+	    {
+	      quotient[0] = HOST_WIDE_INT_MIN;
+	      quotient[1] = 0;
+	    }
 	  if (remainder)
 	    {
-	      remainder[0] = o0 % o1;
+	      remainder[0] = 0;
 	      *remainder_len = 1;
 	    }
+	  return 2;
 	}
       else
 	{
-	  unsigned HOST_WIDE_INT o0 = zext_hwi (dividend[0], dividend_prec);
-	  unsigned HOST_WIDE_INT o1 = zext_hwi (divisor[0], divisor_prec);
-
 	  if (quotient)
 	    quotient[0] = o0 / o1;
 	  if (remainder)
@@ -1765,8 +1752,24 @@
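The HOST_WIDE_INT_MIN / -1 special case deserves a standalone illustration: on the host, INT64_MIN / -1 is undefined behaviour (the true quotient 2^63 does not fit in a signed 64-bit word), but at a wider precision it is a perfectly valid result occupying two words.  The following is a hedged sketch in plain C, not the wide-int API; `sdiv_wide` and its two-word result convention are invented for the example.

```c
#include <stdint.h>

/* Hypothetical sketch (NOT the wide-int API): divide two signed 64-bit
   values, producing a result capable of holding 2^63 as two 64-bit
   halves (little-endian: quot[0] is the low word).  INT64_MIN / -1 --
   the one case whose quotient does not fit in a single 64-bit word --
   is handled up front, so the host division below never invokes
   undefined behaviour.  Returns the number of 64-bit words the
   quotient needs.  */
static int
sdiv_wide (int64_t n, int64_t d, int64_t quot[2], int64_t *rem)
{
  if (n == INT64_MIN && d == -1)
    {
      /* -INT64_MIN == 2^63: low word 0x8000...0, high word 0.  */
      quot[0] = INT64_MIN;
      quot[1] = 0;
      *rem = 0;
      return 2;
    }
  quot[0] = n / d;		   /* safe: not the overflowing case */
  quot[1] = quot[0] < 0 ? -1 : 0;  /* sign-extend into the high word */
  *rem = n % d;
  return 1;
}
```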
Re: [Patch, fortran] PR34547 - [4.8/4.9 regression] NULL(): Fortran 2003 changes, accepts invalid, ICE on invalid
Paul Richard Thomas wrote:
> This one is trivial.  NULL(...) is simply out of context in a transfer
> statement.  Bootstrapped and regtested on FC17/x86_64.  OK for trunk
> and 4.8?

Looks good to me, except that I wonder whether the wording could be
improved: "Invalid context for NULL () intrinsic at %L".  For instance,
something like "NULL() intrinsic not permitted in data-transfer
statement" or "Data transfer statement requires an associated pointer"
or "NULL() is not an associated pointer as required for a data-transfer
statement" or something like that, given that we know that the context
is a data transfer statement.  The standard requires: "If an output item
is a pointer, it shall be associated with a target" (see just added
quote to the PR).

Thus, the patch is fine after shortly pondering about the wording; but I
am also fine with your wording.

Tobias

2013-11-30  Paul Thomas  pa...@gcc.gnu.org

	PR fortran/34547
	* resolve.c (resolve_transfer): EXPR_NULL is always in an invalid
	context in a transfer statement.

2013-11-30  Paul Thomas  pa...@gcc.gnu.org

	PR fortran/34547
	* gfortran.dg/null_5.f90: Include new error.
	* gfortran.dg/null_6.f90: Include new error.
Re: [patch] Fix PR middle-end/59138
> Ok, committed as obvious, in both branches, after regression-testing.

Thanks for spotting and fixing it.

-- 
Eric Botcazou
Re: [Patch, fortran] PR57354 - Wrong run-time assignment of allocatable array of derived type with allocatable component
Paul Richard Thomas wrote:
> This is a partial fix for this problem

Well, it is a full fix for the wrong-code issue, even if a
missed-optimization issue remains ;-)

> I will retain the PR and will have another go at suppressing the
> reallocation in a few weeks time.

Thanks!

> Bootstrapped and regtested on FC17/x86_64 - OK for trunk?

OK and thanks for going through the PRs and fixing them!

Tobias

2013-11-30  Paul Thomas  pa...@gcc.gnu.org

	PR fortran/57354
	* trans-array.c (gfc_conv_resolve_dependencies): For other than
	SS_SECTION, do a dependency check if the lhs is liable to be
	reallocated.

2013-11-30  Paul Thomas  pa...@gcc.gnu.org

	PR fortran/57354
	* gfortran.dg/realloc_on_assign_23.f90: New test.
Fix -g3 on Windows
Any compiler configured to target Windows with tree checking breaks on -g3:

eric@polaris:~/gnat/bugs/MB28-035> ~/build/gcc/i686-pc-mingw32/gcc/cc1 -quiet -g3 t.c
t.c:1:0: internal compiler error: tree check: expected tree that contains 'decl common' structure, have 'identifier_node' in i386_pe_asm_named_section, at config/i386/winnt.c:569

Without tree checking, e.g. on release branches, you sometimes get a segfault.

The reason is that i386_pe_asm_named_section takes DECL_ATTRIBUTES of an IDENTIFIER_NODE.  The latter is valid here, see default_elf_asm_named_section.

This is a (somewhat old) regression, tested on i686-pc-mingw32, applied on all active branches as obvious.

2013-12-01  Eric Botcazou  ebotca...@adacore.com

	* config/i386/winnt.c (i386_pe_asm_named_section): Be prepared for
	an identifier node.

-- 
Eric Botcazou

Index: config/i386/winnt.c
===================================================================
--- config/i386/winnt.c	(revision 205562)
+++ config/i386/winnt.c	(working copy)
@@ -565,8 +565,9 @@ i386_pe_asm_named_section (const char *n
 	 sets 'discard' characteristic, rather than telling linker
 	 to warn of size or content mismatch, so do the same.  */
       bool discard = (flags & SECTION_CODE)
-		     || lookup_attribute ("selectany",
-					  DECL_ATTRIBUTES (decl));
+		     || (TREE_CODE (decl) != IDENTIFIER_NODE
+			 && lookup_attribute ("selectany",
+					      DECL_ATTRIBUTES (decl)));
       fprintf (asm_out_file, "\t.linkonce %s\n",
	       (discard ? "discard" : "same_size"));
 }
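The pattern behind the fix, checking a node's kind before touching kind-specific fields, can be sketched independently of GCC's tree machinery.  This is a hedged, simplified model: `node_kind`, `struct node`, and `section_is_discardable` are invented for the illustration and only mirror the shape of the real code.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of the bug (NOT GCC's real tree types): a section
   callback may receive either a declaration node or a bare identifier
   node, and must check the kind before touching decl-only fields.  */
enum node_kind { IDENTIFIER_NODE_K, DECL_NODE_K };

struct node
{
  enum node_kind kind;
  const char *name;
  const char *attributes;	/* meaningful only for DECL_NODE_K */
};

static int
section_is_discardable (const struct node *decl, int is_code)
{
  /* Mirrors the fix: consult attributes only for real declarations;
     an identifier node has no attributes to look at.  */
  return is_code
	 || (decl->kind != IDENTIFIER_NODE_K
	     && decl->attributes != NULL
	     && strstr (decl->attributes, "selectany") != NULL);
}
```

Without the `kind` check, the identifier case would read a field that is simply not there, which is exactly the class of breakage the tree checker caught at winnt.c:569.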
Re: [wide-int] Avoid some temporaries and use shifts more often
Richard Biener richard.guent...@gmail.com writes:
> Richard Sandiford rdsandif...@googlemail.com wrote:
>> This started out as another attempt to find places where we had
>> things like:
>>
>>   offset_int x = wi::to_offset (...);
>>   x = ...x...;
>>
>> and change them to:
>>
>>   offset_int x = ...wi::to_offset (...)...;
>>
>> with the get_ref_base_and_extent case being the main one.  But it
>> turned out that some of them were also multiplying or dividing by
>> BITS_PER_UNIT, so it ended up also being a patch to convert those to
>> shifts.
>
> Ok and yes please.

Thanks.  If it's OK I'll wait until after the merge to convert the
existing uses to LOG2_BITS_PER_UNIT.

Richard
Re: [wide-int] Avoid some temporaries and use shifts more often
Richard Sandiford rdsandif...@googlemail.com wrote:
> Richard Biener richard.guent...@gmail.com writes:
>> Richard Sandiford rdsandif...@googlemail.com wrote:
>>> This started out as another attempt to find places where we had
>>> things like:
>>>
>>>   offset_int x = wi::to_offset (...);
>>>   x = ...x...;
>>>
>>> and change them to:
>>>
>>>   offset_int x = ...wi::to_offset (...)...;
>>>
>>> with the get_ref_base_and_extent case being the main one.  But it
>>> turned out that some of them were also multiplying or dividing by
>>> BITS_PER_UNIT, so it ended up also being a patch to convert those to
>>> shifts.
>>
>> Ok and yes please.
>
> Thanks.  If it's OK I'll wait until after the merge to convert the
> existing uses to LOG2_BITS_PER_UNIT.
>
> Richard

That's ok with me.

Richard.
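The multiply/divide-to-shift conversion mentioned above is simple enough to show concretely.  This is an illustrative sketch only: the macro names mirror GCC's `BITS_PER_UNIT` / `LOG2_BITS_PER_UNIT`, but the helper functions are invented for the example, and the equivalence assumes `BITS_PER_UNIT` is a power of two (8 here).

```c
/* BITS_PER_UNIT is 8 on the hosts in question; LOG2_BITS_PER_UNIT is
   its base-2 logarithm, so the multiplications and divisions in the
   patch can become shifts, which are cheaper for wide offset_int
   arithmetic than a full multiply or divide.  */
#define BITS_PER_UNIT 8
#define LOG2_BITS_PER_UNIT 3

static unsigned long
bytes_to_bits (unsigned long bytes)
{
  return bytes << LOG2_BITS_PER_UNIT;	/* same as bytes * BITS_PER_UNIT */
}

static unsigned long
bits_to_bytes (unsigned long bits)
{
  return bits >> LOG2_BITS_PER_UNIT;	/* same as bits / BITS_PER_UNIT,
					   rounding toward zero */
}
```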
Re: patch for elimination to SP when it is changed in RTL (PR57293)
On Thu, Nov 28, 2013 at 10:11:26PM +0000, Vladimir Makarov wrote:

Committed as rev. 205498.

2013-11-28  Vladimir Makarov  vmaka...@redhat.com

	PR target/57293
	* ira.h (ira_setup_eliminable_regset): Remove parameter.
	* ira.c (ira_setup_eliminable_regset): Ditto.  Add
	SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed.
	Don't call lra_init_elimination.
	(ira): Call ira_setup_eliminable_regset without arguments.
	* loop-invariant.c (calculate_loop_reg_pressure): Remove argument
	from ira_setup_eliminable_regset call.
	* gcse.c (calculate_bb_reg_pressure): Ditto.
	* haifa-sched.c (sched_init): Ditto.
	* lra.h (lra_init_elimination): Remove the prototype.
	* lra-int.h (lra_insn_recog_data): New member sp_offset.  Move
	used_insn_alternative upper.
	(lra_eliminate_regs_1): Add one more parameter.
	(lra_eliminate): Ditto.
	* lra.c (lra_invalidate_insn_data): Set sp_offset.
	(setup_sp_offset): New.
	(lra_process_new_insns): Call setup_sp_offset.
	(lra): Add argument to lra_eliminate calls.
	* lra-constraints.c (get_equiv_substitution): Rename to get_equiv.
	(get_equiv_with_elimination): New.
	(process_addr_reg): Call get_equiv_with_elimination instead of
	get_equiv_substitution.
	(equiv_address_substitution): Ditto.
	(loc_equivalence_change_p): Ditto.
	(loc_equivalence_callback, lra_constraints): Ditto.
	(curr_insn_transform): Ditto.  Print the sp offset.
	(process_alt_operands): Prevent stack pointer reloads.
	(lra_constraints): Remove one argument from lra_eliminate call.
	Move it up.  Mark used hard regs before it.  Use
	get_equiv_with_elimination instead of get_equiv_substitution.
	* lra-eliminations.c (lra_eliminate_regs_1): Add parameter and
	assert for param values combination.  Use sp offset.  Add argument
	to lra_eliminate_regs_1 calls.
	(lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call.
	(curr_sp_change): New static var.
	(mark_not_eliminable): Add parameter.  Update curr_sp_change.
	Don't prevent elimination to sp if we can calculate its change.
	Pass the argument to mark_not_eliminable calls.
	(eliminate_regs_in_insn): Add a parameter.  Use sp offset.  Add
	argument to lra_eliminate_regs_1 call.
	(update_reg_eliminate): Move calculation of hard regs for spill
	lower.  Switch off lra_in_progress temporarily to generate regs
	involved into elimination.
	(lra_init_elimination): Rename to init_elimination.  Make it
	static.  Set up insn sp offset, check the offsets at the end of
	BBs.
	(process_insn_for_elimination): Add parameter.  Pass its value to
	eliminate_regs_in_insn.
	(lra_eliminate): Add parameter.  Pass its value to
	process_insn_for_elimination.  Add assert for param values
	combination.  Call init_elimination.  Don't update offsets in
	equivalence substitutions.
	* lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1
	for created stack slot.
	(remove_pseudos): Call lra_eliminate_regs_1 before changing memory
	onto stack slot.

2013-11-28  Vladimir Makarov  vmaka...@redhat.com

	PR target/57293
	* gcc.target/i386/pr57293.c: New.

Hi Vlad,

This patch seems to cause some problems for AArch64.  I see an assert triggering when building libgloss:

/work/gcc-clean/build-aarch64-none-elf/obj/gcc1/gcc/xgcc -B/work/gcc-clean/build-aarch64-none-elf/obj/gcc1/gcc/ -B/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/newlib/ -isystem /work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/newlib/targ-include -isystem /work/gcc-clean/src/binutils/newlib/libc/include -B/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/libgloss/aarch64 -L/work/gcc-clean/build-aarch64-none-elf/obj/binutils/aarch64-none-elf/libgloss/libnosys -L/work/gcc-clean/src/binutils/libgloss/aarch64 -L/work/gcc-clean/build-aarch64-none-elf/obj/binutils/./ld -O2 -g -O2 -g -I. -I/work/gcc-clean/src/binutils/libgloss/aarch64/..
-DARM_RDI_MONITOR -o rdimon-_exit.o -c /work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c

/work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c: In function '_exit':
/work/gcc-clean/src/binutils/libgloss/aarch64/_exit.c:41:1: internal compiler error: in update_reg_eliminate, at lra-eliminations.c:1157
 }
 ^
0x84587e update_reg_eliminate
	/work/gcc-clean/src/gcc/gcc/lra-eliminations.c:1157
0x846ca6 lra_eliminate(bool, bool)
	/work/gcc-clean/src/gcc/gcc/lra-eliminations.c:1387
0x84114a lra_constraints(bool)
	/work/gcc-clean/src/gcc/gcc/lra-constraints.c:3887
0x832f3b lra(_IO_FILE*)
	/work/gcc-clean/src/gcc/gcc/lra.c:2331
0x7f3b1e do_reload
	/work/gcc-clean/src/gcc/gcc/ira.c:5451
0x7f3b1e rest_of_handle_reload
doc: Add -fuse-ld to option index
I noticed there wasn't an entry in the option index for -fuse-ld.  If OK, can someone apply?  Thanks.

Regards,

Ryan Mansfield

2013-12-01  Ryan Mansfield  rmansfi...@qnx.com

	* doc/invoke.texi (-fuse-ld): Add index entry.

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 205573)
+++ gcc/doc/invoke.texi	(working copy)
@@ -8691,6 +8691,7 @@
 Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}.
 
 @item -fuse-ld=bfd
+@opindex fuse-ld
 Use the @command{bfd} linker instead of the default linker.
 
 @item -fuse-ld=gold
Re: doc: Add -fuse-ld to option index
On 01.12.2013 18:13, Ryan Mansfield wrote:
> I noticed there wasn't an entry in the option index for -fuse-ld.  If
> OK, can someone apply?  Thanks.

There are many more options which lack an @opindex, cf. the still
unreviewed and pinged patch at
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03082.html

Tobias

> Ryan Mansfield
>
> 2013-12-01  Ryan Mansfield  rmansfi...@qnx.com
>
> 	* doc/invoke.texi (-fuse-ld): Add index entry.
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
On Sun, Dec 1, 2013 at 2:28 AM, Richard Sandiford rdsandif...@googlemail.com wrote:
> I followed Joseph's suggestion and reused longlong.h.  I copied it from
> libgcc rather than glibc since it seemed better for GCC to have a
> single version across both gcc/ and libgcc/.  I can put it in include/
> if that seems better.

Actually copying complex code like this does not seem maintainable.  I
think there needs to be only one copy in the GCC sources.  If that
requires moving it back from libgcc to gcc, or moving it to include, do
that.

Ian
[PATCH, libgfortran]: Fix PR59313, gfortran.dg/erf_3.F90 FAILs
Hello!

Currently, gfortran.dg/erf_3.F90 FAILs on targets with 128-bit (quadruple) long double, since the high-precision erfc_scaled_r16 gets defined only for __float128 quadruple precision.  Attached patch defines this function also for 128-bit long double targets.

[BTW: The patch really implements what FX's ChangeLog says: "* intrinsics/erfc_scaled_inc.c: Do not provide quadruple precision variant."]

2013-12-01  Uros Bizjak  ubiz...@gmail.com

	PR libfortran/59313
	* intrinsics/erfc_scaled.c (erfc_scaled_r16): Also provide for
	quadruple precision long double variant.

Tested on x86_64-pc-linux-gnu and alphaev68-pc-linux-gnu.

OK for mainline?

Uros.

Index: intrinsics/erfc_scaled.c
===================================================================
--- intrinsics/erfc_scaled.c	(revision 205539)
+++ intrinsics/erfc_scaled.c	(working copy)
@@ -45,33 +45,50 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include "erfc_scaled_inc.c"
 #endif
 
-#if defined(HAVE_GFC_REAL_16) && defined(GFC_REAL_16_IS_LONG_DOUBLE)
-#undef KIND
-#define KIND 16
-#include "erfc_scaled_inc.c"
-#endif
+#ifdef HAVE_GFC_REAL_16
+
+/* For quadruple-precision, netlib's implementation is
+   not accurate enough.  We provide another one.  */
 
 #ifdef GFC_REAL_16_IS_FLOAT128
-/* For quadruple-precision (__float128), netlib's implementation is
-   not accurate enough.  We provide another one.  */
+
+# define _THRESH -106.566990228185312813205074546585730Q
+# define _M_2_SQRTPI M_2_SQRTPIq
+# define _INF __builtin_infq()
+# define _ERFC(x) erfcq(x)
+# define _EXP(x) expq(x)
+
+#else
+
+# define _THRESH -106.566990228185312813205074546585730L
+# define _M_2_SQRTPI M_2_SQRTPIl
+# define _INF __builtin_infl()
+# ifdef HAVE_ERFCL
+# define _ERFC(x) erfcl(x)
+# endif
+# ifdef HAVE_EXPL
+# define _EXP(x) expl(x)
+# endif
+
+#endif
+
+#if defined(_ERFC) && defined(_EXP)
+
 extern GFC_REAL_16 erfc_scaled_r16 (GFC_REAL_16);
 export_proto(erfc_scaled_r16);
 
 GFC_REAL_16
 erfc_scaled_r16 (GFC_REAL_16 x)
 {
-  if (x < -106.566990228185312813205074546585730Q)
+  if (x < _THRESH)
     {
-      return __builtin_infq();
+      return _INF;
     }
 
   if (x < 12)
     {
       /* Compute directly as ERFC_SCALED(x) = ERFC(x) * EXP(X**2).
	 This is not perfect, but much better than netlib.  */
-      return erfcq(x) * expq(x * x);
+      return _ERFC(x) * _EXP(x * x);
     }
   else
     {
@@ -97,9 +114,10 @@ erfc_scaled_r16 (GFC_REAL_16 x)
 	  n++;
 	}
 
-      return (1 + sum) / x * (M_2_SQRTPIq / 2);
+      return (1 + sum) / x * (_M_2_SQRTPI / 2);
     }
 }
 
 #endif
+#endif

Index: intrinsics/erfc_scaled_inc.c
===================================================================
--- intrinsics/erfc_scaled_inc.c	(revision 205539)
+++ intrinsics/erfc_scaled_inc.c	(working copy)
@@ -39,7 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 # define EXP(x) exp(x)
 # define TRUNC(x) trunc(x)
 
-#elif (KIND == 10) || (KIND == 16 && defined(GFC_REAL_16_IS_LONG_DOUBLE))
+#elif (KIND == 10)
 
 # ifdef HAVE_EXPL
 # define EXP(x) expl(x)
Re: [patch] introduce aarch64 as a Go architecture
Ian Lance Taylor i...@google.com writes:
> I've gotten a patch from Michael Hudson-Doyle to set GOARCH to arm64 on
> an Aarch64 system (https://codereview.appspot.com/34830045/).

Haha, go us.

> I've gotten a patch from Matthias Klose to set GOARCH to aarch64 on
> such a system (http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03765.html).
> I don't care one way or another myself, but we need to pick one.

I don't care too much myself.  AArch64 is more correct but arm64 is more
obvious.  Also plan9/inferno will use arm64 IIUC.

Cheers,
mwh
Re: [PATCH, libgfortran]: Fix PR59313, gfortran.dg/erf_3.F90 FAILs
> Currently, gfortran.dg/erf_3.F90 FAILs on targets with 128bit
> (quadruple) long double, since high-precision erfc_scaled_r16 gets
> defined only for __float128 quadruple precision.

I can’t approve it, but yes, it makes more sense than what I did
earlier.

FX
Re: [PATCH, libgfortran]: Fix PR59313, gfortran.dg/erf_3.F90 FAILs
FX wrote: Uros Bizjak wrote: Currently, gfortran.dg/erf_3.F90 FAILs on targets with 128bit (quadruple) long double, since high-precision erfc_scaled_r16 gets defined only for __float128 quadruple precision. [...] yes, it makes more sense than what I did earlier. Looks also good to me. Thanks for the patch! Tobias
Re: [PATCH] fix combine.c:reg_nonzero_bits_for_combine where last_set_mode is narrower than mode
On 30/11/13 11:38, Eric Botcazou wrote: 2013-11-29 Paulo Matos pma...@broadcom.com Eric Botcazou ebotca...@adacore.com * combine.c (reg_nonzero_bits_for_combine): Apply mask transformation as applied to nonzero_sign_valid fixing bug when last_set_mode has less precision than mode. Applied, thanks. Excellent. :) Thanks for submitting. -- PMatos
Update my email address in MAINTAINERS
Hi,

I have updated my email address in MAINTAINERS, since j...@suse.cz is no longer active.  Committed as obvious,

Honza

Index: MAINTAINERS
===================================================================
--- MAINTAINERS (revision 205489)
+++ MAINTAINERS (working copy)
@@ -66,7 +66,7 @@
 h8 port			Kazu Hirata		kazu@codesourcery
 hppa port		Jeff Law		l...@redhat.com
 hppa port		John David Anglin	dave.ang...@nrc-cnrc.gc.ca
 i386 port		Richard Henderson	r...@redhat.com
-i386 port		Jan Hubicka		j...@suse.cz
+i386 port		Jan Hubicka		hubi...@ucw.cz
 i386 port		Uros Bizjak		ubiz...@gmail.com
 ia64 port		Jim Wilson		wil...@tuliptree.org
 ia64 port		Steve Ellcey		sell...@mips.com
@@ -112,7 +112,7 @@
 tilegx port		Walter Lee		w...@tilera.com
 tilepro port		Walter Lee		w...@tilera.com
 v850 port		Nick Clifton		ni...@redhat.com
 vax port		Matt Thomas		m...@3am-software.com
-x86-64 port		Jan Hubicka		j...@suse.cz
+x86-64 port		Jan Hubicka		hubi...@ucw.cz
 xstormy16 port		Nick Clifton		ni...@redhat.com
 xtensa port		Sterling Augustine	augustine.sterl...@gmail.com
@@ -188,7 +188,7 @@
 scheduler (+ haifa)	Vladimir Makarov	vma
 modulo-scheduler	Ayal Zaks		z...@il.ibm.com
 reorg			Jeff Law		l...@redhat.com
 caller-save.c		Jeff Law		l...@redhat.com
-callgraph		Jan Hubicka		j...@suse.cz
+callgraph		Jan Hubicka		hubi...@ucw.cz
 debugging code		Jim Wilson		wil...@tuliptree.org
 dwarf debugging code	Jason Merrill		ja...@redhat.com
 dwarf debugging code	Cary Coutant		ccout...@google.com
@@ -222,9 +222,9 @@
 docs co-maintainer	Joseph Myers		joseph@
 docstring relicensing	Diego Novillo		dnovi...@google.com
 docstring relicensing	Gerald Pfeifer		ger...@pfeifer.com
 docstring relicensing	Joseph Myers		jos...@codesourcery.com
-predict.def		Jan Hubicka		j...@suse.cz
+predict.def		Jan Hubicka		hubi...@ucw.cz
 contrib/regression	Geoff Keating		geo...@geoffk.org
-gcov			Jan Hubicka		j...@suse.cz
+gcov			Jan Hubicka		hubi...@ucw.cz
 gcov			Nathan Sidwell		nat...@codesourcery.com
 option handling		Neil Booth		n...@daikokuya.co.uk
 option handling		Joseph Myers		jos...@codesourcery.com
@@ -240,7 +240,7 @@
 code sinking		Daniel Berlin		dberlin@dbe
 tree browser/unparser	Sebastian Pop		sebastian@amd.com
 scev, data dependence	Daniel Berlin		dber...@dberlin.org
 scev, data dependence	Sebastian Pop		sebastian@amd.com
-profile feedback	Jan Hubicka		j...@suse.cz
+profile feedback	Jan Hubicka		hubi...@ucw.cz
 type-safe vectors	Nathan Sidwell		nat...@codesourcery.com
 alias analysis		Daniel Berlin		dber...@dberlin.org
 reload			Ulrich Weigand		uweig...@de.ibm.com
Re: gcc/invoke.texi: Add missing @opindex
On Sun, 24 Nov 2013, Tobias Burnus wrote: 2013-11-24 Tobias Burnus bur...@net-b.de Manuel López-Ibáñez m...@gcc.gnu.org PR middle-end/59257 * doc/invoke.texi: Add missing @opindex. (-fsanitize=): Use @gcctabopt instead of @itemize. OK for the trunk? This looks fine to me, thanks. Gerald
Re: [PATCH] Fix PRs59125 and 54570
On Mon, Nov 18, 2013 at 5:25 AM, Richard Biener rguent...@suse.de wrote:

On Fri, 15 Nov 2013, Richard Biener wrote:

On Fri, 15 Nov 2013, Jakub Jelinek wrote:

On Fri, Nov 15, 2013 at 02:56:51PM +0100, Richard Biener wrote:

Now that there is (finally :() a wrong-code testcase for the PR54570 issue we can no longer ignore it (bah).  So the following tries to paper over the fact that object-size sucks and disables value-numbering of equal addresses the same before that pass had a chance to finally look at the structure of the addresses.  To make this fix suck less I moved the object-size pass before the final FRE pass runs, which is after IPA inlining and the propagation of constants and addresses.  You won't catch any improvements you'd get by memory CSE opportunities that IPA inlining exposes, but you cannot have everything here.

If it doesn't regress anything in the testsuite, I guess that is ok.  (IMHO object-size should run during early optimizations.)

It can't, because inlining and some limited cleanup afterwards is essential for it.  Otherwise you'd regress not just for __builtin_object_size (x, 1), which admittedly is problematic since the introduction of MEM_REFs and various other changes, but also for __builtin_object_size (x, 0), which would be much more important.  As discussed earlier, perhaps instead of checking cfun->after_inlining you could just introduce a new flag whether cfun contains any __builtin_object_size (x, {1,3}) calls, initialized by the gimplifier, propagated by the inliner and finally cleared again by the objsz pass.

But you'd need to pessimistically initialize it, because if you inline into a function with __builtin_object_size you may not previously optimize.  You can of course analyze the cgraph to clear it for functions that cannot end up being inlined into such a function.  So that's effectively the same as ->after_inlining minus losing the optimization that didn't end up with __builtin_object_size after that.  Not sure if it's worth all the trouble.  Arriving at a better design for computing __builtin_object_size would be better ;) (not that I have one)

Of course, if moving objsz earlier seems to work, it could stay where you put it, but the flag could make it clearer why you want to avoid certain optimizations.

Well, all object-size testcases are pretty simplistic right now and don't trigger the IPA inliner for example.  Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

Similar candidate for the 4.8 branch.

Please wait sufficiently long for trunk issues before you backport.

Of course.

So I had to do some more changes, because doing objsz earlier (and separating it from strlenopt) exposes that the GENERIC folders (ugh) for strcat mess up points-to info, as they fold strcat to memcpy (SAVE_EXPR dest + strlen (), ...), which adds a new SSA name for the destination with no alias info associated.  Rather than moving the non-constant builtin foldings to GIMPLE I chose to make sure to forward the constants in objsz and fold the stmts there (as PTA is re-run shortly after it).  This then breaks some of the scan-tree-dumps in the strlenopt testcases, because nobody has previously folded the _chk builtins with -1U size to non-_chk variants, so I have to adjust them (I didn't want to move the strlenopt pass as well).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.  Any objections?

Thanks,
Richard.

2013-11-18  Richard Biener  rguent...@suse.de

	PR tree-optimization/59125
	PR tree-optimization/54570
	* tree-ssa-sccvn.c (copy_reference_ops_from_ref): When inlining
	is not complete do not treat component-references with offset zero
	but different fields as equal.
	* tree-object-size.c: Include tree-phinodes.h and ssa-iterators.h.
	(compute_object_sizes): Apply TLC.  Propagate the constant results
	into all uses and fold their stmts.
	* passes.def (pass_all_optimizations): Move pass_object_sizes
	after the first pass_forwprop and before pass_fre.
	* gcc.dg/builtin-object-size-8.c: Un-xfail.
	* gcc.dg/builtin-object-size-14.c: New testcase.
	* gcc.dg/strlenopt-14gf.c: Adjust.
	* gcc.dg/strlenopt-1f.c: Likewise.
	* gcc.dg/strlenopt-4gf.c: Likewise.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59362

-- 
H.J.
Re: [golang-dev] Re: [gofrontend-dev] Re: [patch] introduce aarch64 as a Go architecture
arm64 it is
Re: *ping* Re: wwwdocs: Broken links due to the preprocess script
Working on it. I hope to have a patch within the next 48 hours. Gerald
Re: [PATCH ARM]Refine scaled address expression on ARM
On Sat, Nov 30, 2013 at 12:34 AM, Richard Earnshaw rearn...@arm.com wrote:

On 29/11/13 11:46, Yufeng Zhang wrote:

On 11/29/13 07:52, Bin.Cheng wrote:

After thinking twice, I somehow think we should not re-associate addresses during expanding, because of the lack of context information.  Take "base + scaled_index + offset" as an example in PR57540: we just don't know whether "base + offset" is loop invariant, from either the backend or the RTL expander.

I'm getting less convinced by re-associating base with offset unconditionally.  One counter example is

  typedef int arr_1[20];
  void foo (arr_1 a1, int i)
  {
    a1[i+10] = 1;
  }

I'm experimenting with a patch to get the immediate offset in the above example to be the last addend in the address computation (as mentioned in http://gcc.gnu.org/ml/gcc/2013-11/msg00581.html), aiming to get the following code-gen:

  add	r1, r0, r1, asl #2
  mov	r3, #1
  str	r3, [r1, #40]

With your patch applied, the effort will be reverted to

  add	r0, r0, #40
  mov	r3, #1
  str	r3, [r0, r1, asl #2]

And another one is:

  typedef int arr_1[20];
  void foo (arr_1 a1, int i)
  {
    a1[i+10] = 1;
    a1[i+11] = 1;
  }

This should compile to:

  add	r1, r0, r1, asl #2
  mov	r3, #1
  str	r3, [r1, #40]
  str	r3, [r1, #44]

And which on Thumb2 should then collapse to:

  add	r1, r0, r1, asl #2
  mov	r3, #1
  strd	r3, r3, [r1, #40]

With your patch I don't see any chance of being able to get to this situation.  (BTW, we currently generate:

  mov	r3, #1
  add	r1, r1, #10
  add	r2, r0, r1, asl #2
  str	r3, [r0, r1, asl #2]
  str	r3, [r2, #4]

which is insane).

I think I see where you're coming from on the original testcase, but I think you're trying to solve the wrong problem.  In your test case the base is an eliminable register, which is likely to be replaced with an offset expression during register allocation.  The problem then, I think, is that the cost of these virtual registers is treated the same as any other pseudo register, when it may really have the cost of a PLUS expression.
Perhaps the cost of using an eliminable register should be raised in rtx_costs() (treating them as equivalent to (PLUS (reg) (CONST_INT (TBD)))), so that loop optimizations will try to hoist suitable sub-expressions out of the loop and replace them with real pseudos.

I now have access to the code.  The gimple before expanding is like:

  <bb 6>:
  # j_26 = PHI <j_9(8), 0(5)>
  # k_29 = PHI <k_8(8), k_24(5)>
  j_9 = j_26 + 1;
  k_8 = parent[k_29];
  if (k_8 >= 0)
    goto <bb 8>;
  else
    goto <bb 7>;

The rtl generated after expanding is like:

  88: NOTE_INSN_BASIC_BLOCK 7
  89: r174:SI=r174:SI+0x1
  90: r191:SI=r173:SI<<0x2
  91: r190:SI=r105:SI+r191:SI
  92: r173:SI=[r190:SI-0x810]

with r105 == virtual_stack_vars_rtx, and it will be instantiated into frame_pointer_rtx in the vregs pass:

  88: NOTE_INSN_BASIC_BLOCK 7
  89: r174:SI=r174:SI+0x1
  90: r191:SI=r173:SI<<0x2
  91: r190:SI=sfp:SI+r191:SI
  92: r173:SI=[r190:SI-0x810]

As you pointed out, sfp is not hoisted as a high-cost invariant.  I am not sure whether loop-invariant will hoist a single pseudo reg even if it's assigned a higher cost.  But before the invariant problem, the choice made by RTL expand is bad because it hides the CSE opportunity: since (sfp + r173<<2 - 0x810) == (sp + r173<<2), (sfp - 0x810) can be folded into (sp), and then we can embed the shift instruction in the scaled addressing mode [sp + r173<<2].

Thanks,
bin

--
Best Regards.
Re: [PING^2] [PATCH] PR59063
I am checking this in as an obvious fix. Tested on Linux/x86.

H.J.

I think bootstrap-asan.mk should also be updated.

H.J.

I checked this in as an obvious fix. Tested on Linux/x86-64.

H.J.

Thanks HJ. I wonder how this managed to pass tests here without your changes; probably some misconfiguration on my side. BTW, I wasn't aware of bootstrap-asan. Should I check it on every sanitizer-related commit?

-Y
Re: [PING^2] [PATCH] PR59063
This is causing all the tests to be run on all targets, even if libsanitizer is not supported, with most of them failing due to link errors.

Thanks for the info and sorry about this. I should probably check non-sanitized platforms as well before committing patches. Does the attached patch make sense to you? It worked for me on x64, and on x64 with libsanitizer manually disabled.

-Y

diff --git a/gcc/testsuite/lib/asan-dg.exp b/gcc/testsuite/lib/asan-dg.exp
index 1e6c7dd..8990677 100644
--- a/gcc/testsuite/lib/asan-dg.exp
+++ b/gcc/testsuite/lib/asan-dg.exp
@@ -39,9 +39,9 @@ proc asan_link_flags { paths } {
     set shlib_ext [get_shlib_extension]

     if { $gccpath != "" } {
-      append flags " -B${gccpath}/libsanitizer/ "
       if { [file exists ${gccpath}/libsanitizer/asan/.libs/libasan.a]
            || [file exists ${gccpath}/libsanitizer/asan/.libs/libasan.${shlib_ext}] } {
+          append flags " -B${gccpath}/libsanitizer/ "
           append flags " -B${gccpath}/libsanitizer/asan/ "
           append flags " -L${gccpath}/libsanitizer/asan/.libs"
           append ld_library_path ":${gccpath}/libsanitizer/asan/.libs"
diff --git a/gcc/testsuite/lib/ubsan-dg.exp b/gcc/testsuite/lib/ubsan-dg.exp
index 2e6b272..aa01988 100644
--- a/gcc/testsuite/lib/ubsan-dg.exp
+++ b/gcc/testsuite/lib/ubsan-dg.exp
@@ -30,9 +30,10 @@ proc ubsan_link_flags { paths } {
     set shlib_ext [get_shlib_extension]

     if { $gccpath != "" } {
-      append flags " -B${gccpath}/libsanitizer/ubsan/ "
       if { [file exists ${gccpath}/libsanitizer/ubsan/.libs/libubsan.a]
            || [file exists ${gccpath}/libsanitizer/ubsan/.libs/libubsan.${shlib_ext}] } {
+          append flags " -B${gccpath}/libsanitizer/ "
+          append flags " -B${gccpath}/libsanitizer/ubsan/ "
           append flags " -L${gccpath}/libsanitizer/ubsan/.libs"
           append ld_library_path ":${gccpath}/libsanitizer/ubsan/.libs"
       }
Re: [PowerPC] libffi fixes and support for PowerPC64 ELFv2
Ping: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02793.html

--
Alan Modra
Australia Development Lab, IBM
[Ping] Re: [RFC] [PATCH V2, AARCH64]: Re: [RFC] [PATCH, AARCH64] Machine descriptions to support stack smashing protection
Hi Richard,

Pinging for further comments.

regards,
Venkat.

On 27 November 2013 14:24, Venkataramanan Kumar venkataramanan.ku...@linaro.org wrote:

Hi Richard,

> I don't think it's good to have long lists of targets on generic tests.
> Can we factor this out into a target-supports option?

I have updated the patch as per your recommendation. Please let me know if it is fine.

2013-11-26  Venkataramanan Kumar  venkataramanan.ku...@linaro.org

	* configure.ac (gcc_cv_libc_provides_tls_ssp): Add test to check
	TLS support in target C library for AArch64.
	* configure: Regenerate.
	* config.in: Regenerate.
	* config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
	(stack_protect_set_<mode>, stack_protect_test_<mode>): Add initial
	machine description for Stack Smashing Protector.
	* config/aarch64/aarch64-linux.h (TARGET_THREAD_SSP_OFFSET): Define.

2013-11-26  Venkataramanan Kumar  venkataramanan.ku...@linaro.org

	* lib/target-supports.exp (check_effective_target_stack_protection):
	New procedure.
	* g++.dg/fstack-protector-strong.C: Add target check for stack
	protection.
	* gcc.dg/fstack-protector-strong.c: Likewise.

regards,
Venkat.

On 26 November 2013 20:23, Richard Earnshaw rearn...@arm.com wrote:

On 26/11/13 14:16, Venkataramanan Kumar wrote:

Index: gcc/testsuite/gcc.dg/fstack-protector-strong.c
===================================================================
--- gcc/testsuite/gcc.dg/fstack-protector-strong.c	(revision 205378)
+++ gcc/testsuite/gcc.dg/fstack-protector-strong.c	(working copy)
@@ -1,6 +1,6 @@
 /* Test that stack protection is done on chosen functions. */
-/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* } } */
+/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* aarch64-*-* } } */
 /* { dg-options "-O2 -fstack-protector-strong" } */

I don't think it's good to have long lists of targets on generic tests. Can we factor this out into a target-supports option?

R.