Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On Wed, Feb 11, 2015 at 08:56:20AM -0700, Jeff Law wrote: On 02/11/15 04:16, Richard Biener wrote: Application of this pattern (and the one I posted for 47477) is a concern for targets that don't do sub-word arithmetic/logicals. But I just did a sniff test of one such target (v850-elf because it was handy) and I couldn't spot a change in the end code using both the 47477 patch and my WIP patch for this BZ. The c-family frontends perform this kind of narrowing already anyway (via the shorten_* stuff which is misplaced there and should be done elsewhere for all frontends - thus in match.pd, thanks for starting that). True, but I wanted to see if there was any impact, but thankfully there isn't. The fact that the C/C++ front-ends are doing most of the shortening now probably explains why the fix for 47477 only affected code generation for the Java front-end. The C/C++ FEs are doing it only if it appears on a single statement though, something fold can see at once. If you rewrite it so that there are multiple vars holding the intermediate values, they of course can't do that and it is a task for match.pd or specialized pass to handle those. Jakub
PING^2: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option
PING. On Wed, Jan 28, 2015 at 8:05 AM, H.J. Lu hjl.to...@gmail.com wrote: PING. On Tue, Jan 13, 2015 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote: On Tue, Jan 13, 2015 at 5:03 AM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Jan 12, 2015 at 11:50:41PM +, Joseph Myers wrote: On Mon, 12 Jan 2015, H.J. Lu wrote: +if test x$enable_default_pie = xyes; then + AC_MSG_CHECKING(if $target supports default PIE) + enable_default_pie=no + case $target in +i?86*-*-linux* | x86_64*-*-linux*) + saved_LDFLAGS=$LDFLAGS + saved_CFLAGS=$CFLAGS + CFLAGS=$CFLAGS -fPIE + LDFLAGS=$LDFLAGS -fPIE -pie + AC_TRY_LINK(,,[enable_default_pie=yes],) + LDFLAGS=$saved_LDFLAGS + CFLAGS=$saved_CFLAGS + ;; +*) + ;; +esac There should not be any such hardcoding of targets here without concrete evidence that the targets for which this sets enable_default_pie=no really cannot support PIE. In particular, there is no reason at all for this to be architecture-specific; all GNU/Linux architectures should support PIE. I believe AC_TRY_LINK here will test for the host, whereas what you want to know is what's supported for the target (but it's not possible to run link tests for the target at this point; the compiler for the target hasn't even been built). So: just presume that if the user passes --enable-default-pie then they know what they are doing, and don't try to override their choice. diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index c9e3bf1..89fc305 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1583,6 +1583,10 @@ not be built. Specify that the run-time libraries for stack smashing protection should not be built. +@item --enable-default-pie +Turn on @option{-fPIE} and @option{-pie} by default if supported. +Currently supported targets are i?86-*-linux* and x86-64-*-linux*. The if supported and target list can then be removed here. Here is the updated patch. To support --enable-default-pie, each target must update STARTFILE_SPEC to support PIE_SPEC and NO_PIE_SPEC. I can provide STARTFILE_SPEC patch if needed. Thanks. H.J. --- gcc/ 2015-01-12 Magnus Granberg zo...@gentoo.org H.J. Lu hongjiu...@intel.com * Makefile.in (COMPILER): Add @NO_PIE_CFLAGS@. (LINKER): Add @NO_PIE_FLAG@. (libgcc.mvars): Set NO_PIE_CFLAGS to -fno-PIE for --enable-default-pie. * common.opt (fPIE): Initialize to -1. (fpie): Likewise. (static): Add RejectNegative Negative(shared). (no-pie): New option. (pie): Replace Negative(shared) with Negative(no-pie). * configure.ac: Add --enable-default-pie. (NO_PIE_CFLAGS): New. Check if -fno-PIE works. AC_SUBST. (NO_PIE_FLAG): New. Check if -no-pie works. AC_SUBST. * defaults.h (DEFAULT_FLAG_PIE): New. Default PIE to -fPIE. * gcc.c (NO_PIE_SPEC): New. (PIE_SPEC): Likewise. (LD_PIE_SPEC): Likewise. (LINK_PIE_SPEC): Handle -no-pie. Use PIE_SPEC and LD_PIE_SPEC. * opts.c (DEFAULT_FLAG_PIE): New. Set to 0 if ENABLE_DEFAULT_PIE is undefined. (finish_options): Update opts-x_flag_pie if it is -1. * config/gnu-user.h (FVTABLE_VERIFY_SPEC): New. (GNU_USER_TARGET_STARTFILE_SPEC): Use FVTABLE_VERIFY_SPEC. Use NO_PIE_SPEC and NO_PIE_SPEC if ENABLE_DEFAULT_PIE is defined. (GNU_USER_TARGET_STARTFILE_SPEC): Use FVTABLE_VERIFY_SPEC. * doc/install.texi: Document --enable-default-pie. * doc/invoke.texi: Document -no-pie. * config.in: Regenerated. * configure: Likewise. gcc/ada/ 2015-01-12 H.J. Lu hongjiu...@intel.com * gcc-interface/Makefile.in (TOOLS_LIBS): Add @NO_PIE_FLAG@. libgcc/ 2015-01-12 H.J. Lu hongjiu...@intel.com * Makefile.in (CRTSTUFF_CFLAGS): Add $(NO_PIE_CFLAGS). This is the updated patch. I fixed the -r regression. LTO tests pass now. -- H.J. -- H.J. -- H.J.
Re: [Patch, fortran] PR64952 - Missing temporary in assignment from elemental function
Dear Dominique, The patch works as advertised! I have two remarks: Of course it does :-) (1) AFAIU there is no need for a temporary for Indeed not. I cannot see how that can be avoided without a much more elaborate patch. Frankly, I do not see that it would be warranted. Much better some extra temporaries in corner cases, than wrong code. PROGRAM Main INTEGER :: i, index(5) = (/ (i, i = 1,5) /) REAL :: tmp(5), array(5) = (/ (i+0.0, i = 1,5) /) tmp = Fred(index,array) array = tmp PRINT *, array CONTAINS ELEMENTAL FUNCTION Fred (n, x) REAL :: Fred INTEGER, INTENT(IN) :: n REAL, INTENT(IN) :: x ! In general, this would be in an external procedure Fred = x+SUM(array(:n-1))+SUM(array(n+1:)) END FUNCTION Fred END PROGRAM Main However I get [Book15] f90/bug% gfc elemental_weird_db_2.f90 -Warray-temporaries elemental_weird_db_2.f90:4:10: tmp = Fred(index,array) 1 Warning: Creating array temporary at (1) [-Warray-temporaries] (2) You wrote: However, this works and has no perceivable effect on Polyhedron timings. . This is hardly a surprise since the Polyhedron tests don't use any elemental procedure. :-) You might have gathered that I didn't check! Within errors, it didn't affect compile times either. Thanks for the feedback Paul Thanks, Dominique Le 10 févr. 2015 à 23:35, Paul Richard Thomas paul.richard.tho...@gmail.com a écrit : Dear Mikael, dear all, Thank you for the previous review. I believe that the attached responds to all of your comments and correctly compiles the three testcases that you provided. Two of these have been included in the original testcase and the third appears separately. Bootstrapped and reg tested on FC21/x86_64 - OK for trunk? Cheers Paul 2015-02-10 Paul Thomas pa...@gcc.gnu.org PR fortran/64952 * gfortran.h : Add 'array_outer_dependency' to symbol_attr. * trans.h : Add 'array_outer_dependency' to gfc_ss_info. * module.c : Add AB_ARRAY_OUTER_DEPENDENCY to ab_attribute. Add same to attr_bits. (mio_symbol_attribute): Handle 'array_outer_dependency' attr in module read and write. * resolve.c (resolve_function): If an elemental function is referenced that is marked as having an external array reference and the current namespace is that of an elemental function, mark the containing function likewise. (resolve_variable): Mark elemental function symbol as 'array_outer_dependency' if it has an array reference from outside its own namespace. * trans-array.c (gfc_conv_resolve_dependencies): If any ss is marked as 'array_outer_dependency' generate a temporary. (gfc_walk_function_expr): If the function is marked as 'array_outer_dependency', likewise mark the head gfc_ss. 2015-02-10 Paul Thomas pa...@gcc.gnu.org PR fortran/64952 * gfortran.dg/elemental_dependency_4.f90: New test * gfortran.dg/elemental_dependency_5.f90: New test -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx
Re: [testsuite] PATCH: Add check_effective_target_pie
H.J. Lu hjl.to...@gmail.com writes: On Wed, Feb 11, 2015 at 6:10 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: The new proc is bogus, unfortunately: there's already an existing check_effective_target_pie that checks if a target can support PIE. The new one just overrides the previous one. On targets supporting PIE (like Darwin), but not defaulting to it, the PIE tests suddenly turn out UNSUPPORTED. You should rename the new one to e.g. check_effective_target_pie_default, update the single user, and document it in sourcebuild.texi. I checked in this as an obvious fix. I think pie_enabled is not a very descriptive name: Index: doc/sourcebuild.texi === --- doc/sourcebuild.texi (revision 220617) +++ doc/sourcebuild.texi (working copy) @@ -1884,6 +1884,9 @@ @item nonpic Target does not generate PIC by default. +@item pie_enabled +Target generates PIE by default. + @item pcc_bitfield_type_matters Target defines @code{PCC_BITFIELD_TYPE_MATTERS}. With -fpie, PIE is also enabled, just not the default without any I was testing # make RUNTESTFLAGS=--target_board='unix{-m32\ -fpie,-fpie}' I don't consider PIE is default. It is just enabled. options. Please either go with the pie_default I sugested or wait for others to weigh in before rushing in another `obvious' fix. Then the description (both sourcebuild.texi and target-supports.texi) is confusing. What are you trying to achieve here, actually? Even on Solaris 11/x86 (which doesn't support PIE), -fpie lets the check_effective_target_pie_enabled (or whatever it's called) proc pass. Shouldn't it also check if the target can support PIE at all? Please clarify your goals before going forward with this. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
RE: [PATCH, FT32] initial support
On Wed, 11 Feb 2015, James Bowman wrote: +@table @gcctabopt + +@item -mspace +@opindex mspace +Enable code-size optimizations. +Some of these optimizations incur a minor performance penalty. We already have -Os, so why is an architecture-specific option for this needed? +A 16-bit signed constant (-32768..32767) Use @minus{} for a minus sign in Texinfo documentation, and @dots{} instead of literal .. or -- Joseph S. Myers jos...@codesourcery.com
Re: [Patch, fortran] PR64952 - Missing temporary in assignment from elemental function
Dear Paul, The patch works as advertised! I have two remarks: (1) AFAIU there is no need for a temporary for PROGRAM Main INTEGER :: i, index(5) = (/ (i, i = 1,5) /) REAL :: tmp(5), array(5) = (/ (i+0.0, i = 1,5) /) tmp = Fred(index,array) array = tmp PRINT *, array CONTAINS ELEMENTAL FUNCTION Fred (n, x) REAL :: Fred INTEGER, INTENT(IN) :: n REAL, INTENT(IN) :: x ! In general, this would be in an external procedure Fred = x+SUM(array(:n-1))+SUM(array(n+1:)) END FUNCTION Fred END PROGRAM Main However I get [Book15] f90/bug% gfc elemental_weird_db_2.f90 -Warray-temporaries elemental_weird_db_2.f90:4:10: tmp = Fred(index,array) 1 Warning: Creating array temporary at (1) [-Warray-temporaries] (2) You wrote: « However, this works and has no perceivable effect on Polyhedron timings. ». This is hardly a surprise since the Polyhedron tests don’t use any elemental procedure. Thanks, Dominique Le 10 févr. 2015 à 23:35, Paul Richard Thomas paul.richard.tho...@gmail.com a écrit : Dear Mikael, dear all, Thank you for the previous review. I believe that the attached responds to all of your comments and correctly compiles the three testcases that you provided. Two of these have been included in the original testcase and the third appears separately. Bootstrapped and reg tested on FC21/x86_64 - OK for trunk? Cheers Paul 2015-02-10 Paul Thomas pa...@gcc.gnu.org PR fortran/64952 * gfortran.h : Add 'array_outer_dependency' to symbol_attr. * trans.h : Add 'array_outer_dependency' to gfc_ss_info. * module.c : Add AB_ARRAY_OUTER_DEPENDENCY to ab_attribute. Add same to attr_bits. (mio_symbol_attribute): Handle 'array_outer_dependency' attr in module read and write. * resolve.c (resolve_function): If an elemental function is referenced that is marked as having an external array reference and the current namespace is that of an elemental function, mark the containing function likewise. (resolve_variable): Mark elemental function symbol as 'array_outer_dependency' if it has an array reference from outside its own namespace. * trans-array.c (gfc_conv_resolve_dependencies): If any ss is marked as 'array_outer_dependency' generate a temporary. (gfc_walk_function_expr): If the function is marked as 'array_outer_dependency', likewise mark the head gfc_ss. 2015-02-10 Paul Thomas pa...@gcc.gnu.org PR fortran/64952 * gfortran.dg/elemental_dependency_4.f90: New test * gfortran.dg/elemental_dependency_5.f90: New test
Re: [testsuite] PATCH: Add check_effective_target_pie
H.J. Lu hjl.to...@gmail.com writes: On Wed, Feb 11, 2015 at 7:19 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: On Wed, Feb 11, 2015 at 6:10 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: The new proc is bogus, unfortunately: there's already an existing check_effective_target_pie that checks if a target can support PIE. The new one just overrides the previous one. On targets supporting PIE (like Darwin), but not defaulting to it, the PIE tests suddenly turn out UNSUPPORTED. You should rename the new one to e.g. check_effective_target_pie_default, update the single user, and document it in sourcebuild.texi. I checked in this as an obvious fix. I think pie_enabled is not a very descriptive name: Index: doc/sourcebuild.texi === --- doc/sourcebuild.texi (revision 220617) +++ doc/sourcebuild.texi (working copy) @@ -1884,6 +1884,9 @@ @item nonpic Target does not generate PIC by default. +@item pie_enabled +Target generates PIE by default. + @item pcc_bitfield_type_matters Target defines @code{PCC_BITFIELD_TYPE_MATTERS}. With -fpie, PIE is also enabled, just not the default without any I was testing # make RUNTESTFLAGS=--target_board='unix{-m32\ -fpie,-fpie}' I don't consider PIE is default. It is just enabled. options. Please either go with the pie_default I sugested or wait for others to weigh in before rushing in another `obvious' fix. Then the description (both sourcebuild.texi and target-supports.texi) is confusing. That is how other options are described. You're not getting my point: * target-supports.exp now has # Return 1 if the current multilib generates PIE by default. * sourcebuild.texi states Target generates PIE by default. Which one is it? What are you trying to achieve here, actually? Even on Solaris 11/x86 (which doesn't support PIE), -fpie lets the check_effective_target_pie_enabled (or whatever it's called) proc pass. Shouldn't it also check if the target can support PIE at all? Assembly outputs may be different, depending on if PIE is enabled nor not. When we scan assembly outputs for test results, we have different expected results when PIE is enabled. That is how pie_enabled is used so far. But why would you need a new effective-target keyword for that? Simply test the existing { target pie }, add -fpie and scan the assembler output. Why would you need pie_enabled or whatever at all? Again: what are you trying to achieve that cannot be done with the current keyword? Right now, in gcc.target/i386/pie.c, you're only testing the PIE case when PIE has been enabled by some means external to testsuite. The test always comes out UNSUPPORTED if not, so it isn't even exercised in a regular bootstrap (except on Darwin which defaults to PIE, I believe). Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [testsuite] PATCH: Add check_effective_target_pie
On Wed, Feb 11, 2015 at 8:15 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: On Wed, Feb 11, 2015 at 7:19 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: On Wed, Feb 11, 2015 at 6:10 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: The new proc is bogus, unfortunately: there's already an existing check_effective_target_pie that checks if a target can support PIE. The new one just overrides the previous one. On targets supporting PIE (like Darwin), but not defaulting to it, the PIE tests suddenly turn out UNSUPPORTED. You should rename the new one to e.g. check_effective_target_pie_default, update the single user, and document it in sourcebuild.texi. I checked in this as an obvious fix. I think pie_enabled is not a very descriptive name: Index: doc/sourcebuild.texi === --- doc/sourcebuild.texi (revision 220617) +++ doc/sourcebuild.texi (working copy) @@ -1884,6 +1884,9 @@ @item nonpic Target does not generate PIC by default. +@item pie_enabled +Target generates PIE by default. + @item pcc_bitfield_type_matters Target defines @code{PCC_BITFIELD_TYPE_MATTERS}. With -fpie, PIE is also enabled, just not the default without any I was testing # make RUNTESTFLAGS=--target_board='unix{-m32\ -fpie,-fpie}' I don't consider PIE is default. It is just enabled. options. Please either go with the pie_default I sugested or wait for others to weigh in before rushing in another `obvious' fix. Then the description (both sourcebuild.texi and target-supports.texi) is confusing. That is how other options are described. You're not getting my point: * target-supports.exp now has # Return 1 if the current multilib generates PIE by default. * sourcebuild.texi states Target generates PIE by default. Which one is it? I copied them from nonpic # Return 1 if the current multilib does not generate PIC by default. @item nonpic Target does not generate PIC by default. What are you trying to achieve here, actually? Even on Solaris 11/x86 (which doesn't support PIE), -fpie lets the check_effective_target_pie_enabled (or whatever it's called) proc pass. Shouldn't it also check if the target can support PIE at all? Assembly outputs may be different, depending on if PIE is enabled nor not. When we scan assembly outputs for test results, we have different expected results when PIE is enabled. That is how pie_enabled is used so far. But why would you need a new effective-target keyword for that? Simply test the existing { target pie }, add -fpie and scan the assembler output. Why would you need pie_enabled or whatever at all? Again: what are you trying to achieve that cannot be done with the current keyword? Right now, in gcc.target/i386/pie.c, you're only testing the PIE case when PIE has been enabled by some means external to testsuite. The test always comes out UNSUPPORTED if not, so it isn't even exercised in a regular bootstrap (except on Darwin which defaults to PIE, I believe). That is intentional. gcc.target/i386/pie.c will be test by 1. RUNTESTFLAGS=--target_board='unix{-m32\ -fpie,-fpie}' . Or 2. PIE is enabled by default: https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00956.html -- H.J.
Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
On 11/02/15 14:21, Kenneth Zadeck wrote: On 02/11/2015 06:20 AM, Jiong Wang wrote: 2014-12-19 15:21 GMT+00:00 Kenneth Zadeck zad...@naturalbridge.com: however, since i am a nice person loop-invariant solves the DF_UD_CHAIN which builds a data structure that connects each use with all of the defs that reach it. I believe that this is the opposite of what you want here. if you really need this, you need to also turn on the DF_DU_CHAIN which builds the opposite structure.Both structures can be space hogs, but they are both turned on in other places in the compiler so it is unlikely to be an issue. Exactly, Thanks, Kenneth. This approach works from my experiment and look much better than previous REG_NOTE approach. while it do have one problem. We need the UD/DU chain built before we do insn re-shuffling. While after re-shuffling, UD chain needs update, otherwise, the later check_dependecies in find_invariant_insn may fail. although we have re-shuffle instruction 1 into 2, the later check still using old UD info, thus decide instruction 2 is not iv. 1: regA - vfp + regB 2: regA - vfp + const my current fix is to insert those re-shuffled insn into a table named vfp_const_iv, then skip those dependencies check for them as they don't have any dependencies. You now are beginning to understand why Mark Wegman and I invented SSA form almost 30 years ago Indeed, thanks. attachment is the new draft patch, pass x86-84 bootstrap (actually it will not affect x86-64, because of addressing mode), while failed on aarch64 bootstrap during stage2/3 binary comparision, there must be some glitches needs to be fixed. Will clean up later after I finished my upcoming long flight and what I am seeking now is whether the general thoughts, code logic framework, in this patch is acceptable to the community? personally, I think this version is much better than previous version. new added code integrated with existed code in a more natural way. no use of unsafe REG_NOTE info which is not updated in this pass. and from AArch64 gcc bootstrap, 239 new loop invariant found by this re-shuffling. so, this small improvement on rtl loop invariant pass do have some value. please review and give comments. Thanks. 2015-02-11 Jiong Wang jiong.w...@arm.com gcc/ * loop-invariant.c (find_defs): Enable DF_DU_CHAIN build. (vfp_const_iv): New hash table. (expensive_addr_check_p): New boolean. (init_inv_motion_data): Initialize new variables. (free_inv_motion_data): Release hash table. (create_new_invariant): Set cheap_address to false for iv in vfp_const_iv table. (find_invariant_insn): Skip dependencies check for iv in vfp_const_iv table. (use_for_single_du): New function. (reshuffle_insn_with_vfp): Likewise. (find_invariants_bb): Call reshuffle_insn_with_vfp. gcc/testsuite/ * gcc.dg/pr62173.c: New testcase. diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index f79b497..b1c4760 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -203,6 +203,8 @@ typedef struct invariant *invariant_p; /* The invariants. */ static vecinvariant_p invariants; +static hash_table pointer_hash rtx_insn *vfp_const_iv; +static bool need_expensive_addr_check_p; /* Check the size of the invariant table and realloc if necessary. */ @@ -695,7 +697,7 @@ find_defs (struct loop *loop) df_remove_problem (df_chain); df_process_deferred_rescans (); - df_chain_add_problem (DF_UD_CHAIN); + df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN); df_set_flags (DF_RD_PRUNE_DEAD_DEFS); df_analyze_loop (loop); check_invariant_table_size (); @@ -742,6 +744,9 @@ create_new_invariant (struct def *def, rtx_insn *insn, bitmap depends_on, See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html . */ inv-cheap_address = address_cost (SET_SRC (set), word_mode, ADDR_SPACE_GENERIC, speed) 3; + + if (need_expensive_addr_check_p vfp_const_iv-find (insn)) + inv-cheap_address = false; } else { @@ -952,7 +957,8 @@ find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed) return; depends_on = BITMAP_ALLOC (NULL); - if (!check_dependencies (insn, depends_on)) + if (!vfp_const_iv-find (insn) + !check_dependencies (insn, depends_on)) { BITMAP_FREE (depends_on); return; @@ -1007,6 +1013,143 @@ find_invariants_insn (rtx_insn *insn, bool always_reached, bool always_executed) record_uses (insn); } +/* Find the single use of def of insn. At the same time, +make sure def of insn is the single def which reach the use. */ + +static rtx_insn * +use_for_single_du (rtx_insn *insn) +{ + struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn); + df_ref def; + + FOR_EACH_INSN_INFO_DEF (def, insn_info) +{ + struct df_link *chain = DF_REF_CHAIN (def); + + if (!chain + || chain-next) + return NULL; + + df_ref use = chain-ref; + chain = DF_REF_CHAIN (use); + + if
Re: [Patch, fortran] PR64952 - Missing temporary in assignment from elemental function
Dear Paul, Le 11 févr. 2015 à 17:57, Paul Richard Thomas paul.richard.tho...@gmail.com a écrit : Dear Dominique, The patch works as advertised! I have two remarks: Of course it does :-) ;-) (1) AFAIU there is no need for a temporary for Indeed not. I cannot see how that can be avoided without a much more elaborate patch. Frankly, I do not see that it would be warranted. Much better some extra temporaries in corner cases, than wrong code. Agreed. (2) You wrote: However, this works and has no perceivable effect on Polyhedron timings. . This is hardly a surprise since the Polyhedron tests don't use any elemental procedure. :-) You might have gathered that I didn't check! Within errors, it didn't affect compile times either. AFAICT, I have only one test in my archives that time elemental functions: pr40581 for which I don’t see any difference with/without your patch. Looking at bugzilla it seems that Fran Martinez Fadrique is a heavy consumer of elemental procs (see e.g. pr52332). May be you can ask him to do some timing of his code with/without your patch. Thanks for the feedback You’re welcome! Dominique
Re: [PATCH] Fix PR64764 Fix scan-tree-dump for scop-19.c
On 11-02-15 10:22, Dominique d'Humières wrote: Le 10 févr. 2015 à 12:42, Tom de Vries tom_devr...@mentor.com a écrit : I think we need to understand first what's going on. Sure, my patch was mainly to silence the failures on my working tree. In both test-cases, on Linux with -fpic the inlining of one funct ion into the other is not done, because we cannot be sure the function call in one function binds to the other function at runtime. I don’t understand why -fpic should change the inlining decision (except register pressure and so on). It's about symbol binding, not inlining heuristics. So, is the inlining happening with -fpic for Darwin? If so, is that correct? The inlining is happening for darwin. I don’t know if that is correct or not. I found an explanation. darwin.c: ... /* Cross-module name binding. Darwin does not support overriding functions at dynamic-link time, except for vtables in kexts. */ bool darwin_binds_local_p (const_tree decl) { return default_binds_local_p_1 (decl, TARGET_KEXTABI DARWIN_VTABLE_P (decl)); } ... i386.c: ... #if TARGET_MACHO #undef TARGET_BINDS_LOCAL_P #define TARGET_BINDS_LOCAL_P darwin_binds_local_p #endif ... This is probably the cause for the difference. I guess that means your patch is ok. Thanks, - Tom
Re: Fix i386 wrong code issues introduced by the local flag fix
Hi, this patch fixes ICE when building firefox reported by Markus: lto1: internal compiler error: Segmentation fault 0x106e1feb crash_signal ../../gcc/gcc/toplev.c:383 0x10589cd4 lto_get_decl_name_mapping(lto_file_decl_data*, char const*) ../../gcc/gcc/lto-section-in.c:369 0x1057bb6b copy_function_or_variable ../../gcc/gcc/lto-streamer-out.c:2194 0x1057df9b lto_output() ../../gcc/gcc/lto-streamer-out.c:2289 0x105edd87 write_lto ../../gcc/gcc/passes.c:2405 0x105f3073 ipa_write_optimization_summaries(lto_symtab_encoder_d*) ../../gcc/gcc/passes.c:2607 0x1018f6ab do_stream_out ../../gcc/gcc/lto/lto.c:2480 0x10190a33 stream_out ../../gcc/gcc/lto/lto.c:2544 0x10190a33 lto_wpa_write_files ../../gcc/gcc/lto/lto.c:2661 0x1019c0a7 do_whole_program_analysis ../../gcc/gcc/lto/lto.c:3331 0x1019c0a7 lto_main() ../../gcc/gcc/lto/lto.c:3451 I tested it on Firefox, will commit after bootstrapregtesting finishes (x86_64) Honza * ipa.c (symbol_table::remove_unreachable_nodes): Avoid releasing bodies of thunks; comment on why. * symtab.c (symtab_node::get_partitioning_class): Aliases of extern symbols are extern. Index: ipa.c === --- ipa.c (revision 220608) +++ ipa.c (working copy) @@ -537,7 +537,13 @@ symbol_table::remove_unreachable_nodes ( /* If node is unreachable, remove its body. */ else if (!reachable.contains (node)) { - if (!body_needed_for_clonning.contains (node-decl)) + /* We keep definitions of thunks and aliases in the boundary so +we can walk to the ultimate alias targets and function symbols +reliably. */ + if (node-alias || node-thunk.thunk_p) + ; + else if (!body_needed_for_clonning.contains (node-decl) + !node-alias !node-thunk.thunk_p) node-release_body (); else if (!node-clone_of) gcc_assert (in_lto_p || DECL_RESULT (node-decl)); Index: symtab.c === --- symtab.c(revision 220608) +++ symtab.c(working copy) @@ -1779,6 +1779,8 @@ symtab_node::get_partitioning_class (voi if (varpool_node *vnode = dyn_cast varpool_node * (this)) { + if (alias definition !ultimate_alias_target ()-definition) + return SYMBOL_EXTERNAL; /* Constant pool references use local symbol names that can not be promoted global. We should never put into a constant pool objects that can not be duplicated across partitions. */ @@ -1790,7 +1792,7 @@ symtab_node::get_partitioning_class (voi Handle them as external; compute_ltrans_boundary take care to make proper things to happen (i.e. to make them appear in the boundary but with body streamed, so clone can me materialized). */ - else if (!dyn_cast cgraph_node * (this)-definition) + else if (!dyn_cast cgraph_node * (this)-function_symbol ()-definition) return SYMBOL_EXTERNAL; /* Linker discardable symbols are duplicated to every use unless they are
Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On Tue, 10 Feb 2015, Jeff Law wrote: I think the way to go is to always convert the inner operands to an unsigned type. In fact, everything except the outer convert should be using an unsigned type of the same size/precision as @0's type. The outer convert should, of course, be the final type. That avoids all the concerns about sign bit propagation from the narrow type into the wider final type. Subject of course to allowing for the cases where the original expression uses sign-extension (signed narrow type), the mask is consistent with this (sign bit of narrow type and all higher bits set, e.g. -128), and the operation is bitwise not arithmetic, if you want to optimize those as well. -- Joseph S. Myers jos...@codesourcery.com
Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
On 02/11/15 03:56, Richard Biener wrote: diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 7f3816c..7a95029 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2015-02-10 Jeff Law l...@redhat.com + + * match.pd (convert (plus/minus (convert @0) (convert @1): New + simplifier to narrow arithmetic. + Please reference PR47477 from here Doh. Fixed. 2015-02-10 Richard Biener rguent...@suse.de PR tree-optimization/64909 diff --git a/gcc/match.pd b/gcc/match.pd index 81c4ee6..abc703e 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3. If not see (logs (pows @0 @1)) (mult @1 (logs @0) +/* If we have a narrowing conversion of an arithmetic operation where + both operands are widening conversions from the same type as the outer + narrowing conversion. Then convert the innermost operands to a suitable + unsigned type (to avoid introducing undefined behaviour), perform the + operation and convert the result to the desired type. + + This narrows the arithmetic operation. */ This is also a part of what shorten_binary_op does, so please refer to that in the comment so we can eventually complete this from there and remove shorten_binary_op. Done. I've created a block comment meant to cover this pattern and any additions we need along the way to moving shortening/narrowing out of the front-ends. +(for op (plus minus) + (simplify +(convert (op (convert@2 @0) (convert @1))) +(if (TREE_TYPE (@0) == TREE_TYPE (@1) + TREE_TYPE (@0) == type Otherwhere in match.pd we use (GIMPLE types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))) || (GENERIC TREE_TYPE (@0) == TREE_TYPE (@1)) for type equality checks. And I think even for GENERIC you want to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == TYPE_MAIN_VARAINT (TREE_TYPE (@1)). Seems reasonable.Makes for some ugly formatting though. It might make sense to factor that test so we don't end up repeating it senselessly. + INTEGRAL_TYPE_P (type) So instead of testing TREE_TYPE (@0) == type we probably want to just assert that TYPE_PRECISON (TREE_TYPE (@0)) == TYPE_PRECISION (type) to allow sign-changes. Say for Yea, makes sense. short i, j; (unsigned short) (i + j) we still want to narrow the widened i and j. Right. + TYPE_PRECISION (TREE_TYPE (@2)) TYPE_PRECISION (TREE_TYPE (@0)) +/* This prevents infinite recursion. */ + unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2)) How can it recurse with the type precision constraint right above? It may no longer be needed. The precision check used to be =, but that regressed some java codes. So it got tightened in the last iteration before posting. Note that technically we don't need to perform the operation in an unsigned type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)). Thus (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)) (convert (op @0 @1))) (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); } (convert (op (convert:utype @0) (convert:utype @1)) You can see that GCC exploits that with -fwrapv. Maybe this simplification should be split out into a separate pattern though, tacking sign-changes for binary ops only. No strong opinion on separate pattern or integrating into existing pattern. jeff
Re: [PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)
On Feb 11, 2015, at 4:24 AM, Marek Polacek pola...@redhat.com wrote: The following patch splits the test into C and C++ test cases, so hopefully fixing the issue. Ok for trunk? 2015-02-11 Marek Polacek pola...@redhat.com * g++.dg/ubsan/shift-1.C: New test. * gcc.dg/ubsan/c-shift-2.c: New test. * c-c++-common/ubsan/shift-5.c: Remove file. diff --git gcc/testsuite/g++.dg/ubsan/shift-1.C gcc/testsuite/g++.dg/ubsan/shift-1.C index e69de29..8a71279 100644 --- gcc/testsuite/g++.dg/ubsan/shift-1.C +++ gcc/testsuite/g++.dg/ubsan/shift-1.C @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-options -fsanitize=shift -w } */ +/* { dg-shouldfail ubsan } */ + +int +foo (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case 1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 11 } */ Never include line numbers, unless there is no other way. Here, I think you can drop it, and merely ensure this is on the right line? An example from gcc.dg: int g2(int a; __attribute__((unused))); /* { dg-error just a forward declaration no parms { xfail *-*-* } } */ I’m hoping that style will work. +case -1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 13 } */ Likewise. +case 1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 15 } */ Likewise. +case -1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 17 } */ Likewise. + return 1; +} + return 0; +} + +int +bar (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case -1 200: +/* { dg-error is not a constant expression { xfail { *-*-* } } 30 } */ Likewise. +case 1 200: +/* { dg-error is not a constant expression { xfail { *-*-* } } 32 } */ Likewise. Ok with the above fixes, if applicable.
[gomp4] Merge trunk r220584 (2015-02-10) into gomp-4_0-branch
Hi! In r220627, I have committed a merge from trunk r220584 (2015-02-10) into gomp-4_0-branch. Grüße, Thomas pgpwCbyTbYKcB.pgp Description: PGP signature
Re: C++ Patch for c++/60894
Hi, On 12/01/2014 09:59 PM, Jason Merrill wrote: On 12/01/2014 07:01 AM, Fabien Chêne wrote: 2014-11-03 21:18 GMT+01:00 Fabien Chêne fabien.ch...@gmail.com: 2014-10-09 15:34 GMT+02:00 Jason Merrill ja...@redhat.com: [...] If the USING_DECL is returned, the code below will be rejected as expected, but the error message will not mention the line where the USING_DECL appears as the previous definition, but at the target declaration of the USING_DECL instead. I think that's what happens if you strip the USING_DECL and return what it points to; if you return the USING_DECL itself that shouldn't happen (though then the caller needs to be able to deal with getting a USING_DECL). [Sorry for the delay] Humm, l_a_c_t returns a TYPE upon success, shall I change it and return a DECL instead ? Ping. Before I made the change, I'd like to be sure this is what you have in mind. I think that makes sense. Today I was having a look to this pending issue and went astray due to the broken thread: I wondered if, basing on Fabien' first try and the comments accompanying tag_scope, it would make sense to use strip_using_decl only when the scope is ts_global (or maybe != ts_current)?!? The below certainly passes testing. Unless of course Fabien has the above implemented and ready (I don't, I miss the correct changes to the lookup_and_check_tag callers) Thanks, Paolo. / Index: decl.c === --- decl.c (revision 220611) +++ decl.c (working copy) @@ -12207,6 +12207,9 @@ lookup_and_check_tag (enum tag_types tag_code, tre DECL_TEMPLATE_TEMPLATE_PARM_P (decl decl = DECL_TEMPLATE_RESULT (decl); + if (scope == ts_global) +decl = strip_using_decl (decl); + if (decl TREE_CODE (decl) == TYPE_DECL) { /* Look for invalid nested type:
Re: [testsuite] PATCH: Add check_effective_target_pie
H.J. Lu hjl.to...@gmail.com writes: The new proc is bogus, unfortunately: there's already an existing check_effective_target_pie that checks if a target can support PIE. The new one just overrides the previous one. On targets supporting PIE (like Darwin), but not defaulting to it, the PIE tests suddenly turn out UNSUPPORTED. You should rename the new one to e.g. check_effective_target_pie_default, update the single user, and document it in sourcebuild.texi. I checked in this as an obvious fix. I think pie_enabled is not a very descriptive name: Index: doc/sourcebuild.texi === --- doc/sourcebuild.texi (revision 220617) +++ doc/sourcebuild.texi (working copy) @@ -1884,6 +1884,9 @@ @item nonpic Target does not generate PIC by default. +@item pie_enabled +Target generates PIE by default. + @item pcc_bitfield_type_matters Target defines @code{PCC_BITFIELD_TYPE_MATTERS}. With -fpie, PIE is also enabled, just not the default without any options. Please either go with the pie_default I sugested or wait for others to weigh in before rushing in another `obvious' fix. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [testsuite] PATCH: Add check_effective_target_pie
On Tue, Feb 10, 2015 at 3:11 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Jeff Law l...@redhat.com writes: Subject: [PATCH 1/5] Add check_effective_target_pie Hi, This patch adds check_effective_target_pie to check if the current multilib generates PIE by default. Thanks. H.J. --- 2015-01-11 H.J. Lu hongjiu...@intel.com * gcc.target/i386/pie.c: New test. * lib/target-supports.exp (check_effective_target_pie): New. OK. The new proc is bogus, unfortunately: there's already an existing check_effective_target_pie that checks if a target can support PIE. The new one just overrides the previous one. On targets supporting PIE (like Darwin), but not defaulting to it, the PIE tests suddenly turn out UNSUPPORTED. You should rename the new one to e.g. check_effective_target_pie_default, update the single user, and document it in sourcebuild.texi. I checked in this as an obvious fix. Thanks. -- H.J. --- Index: ChangeLog === --- ChangeLog (revision 220617) +++ ChangeLog (working copy) @@ -1,3 +1,7 @@ +2015-02-11 H.J. Lu hongjiu...@intel.com + + * doc/sourcebuild.texi (pie_enabled): Document. + 2015-02-11 Martin Liska mli...@suse.cz PR ipa/64813 Index: testsuite/lib/target-supports.exp === --- testsuite/lib/target-supports.exp (revision 220617) +++ testsuite/lib/target-supports.exp (working copy) @@ -1100,8 +1100,8 @@ # Return 1 if the current multilib generates PIE by default. -proc check_effective_target_pie { } { -return [check_no_compiler_messages pie assembly { +proc check_effective_target_pie_enabled { } { +return [check_no_compiler_messages pie_enabled assembly { #ifndef __PIE__ #error unsupported #endif Index: testsuite/gcc.dg/tree-ssa/ssa-store-ccp-3.c === --- testsuite/gcc.dg/tree-ssa/ssa-store-ccp-3.c (revision 220617) +++ testsuite/gcc.dg/tree-ssa/ssa-store-ccp-3.c (working copy) @@ -13,6 +13,6 @@ /* There should be no reference for nonpic targets to conststaticvariable as we should have inlined the 0. */ -/* { dg-final { scan-tree-dump-times conststaticvariable 0 optimized { target { pie || nonpic } } } } */ -/* { dg-final { scan-tree-dump-times conststaticvariable 1 optimized { target { { ! pie } { ! nonpic } } } } } */ +/* { dg-final { scan-tree-dump-times conststaticvariable 0 optimized { target { pie_enabled || nonpic } } } } */ +/* { dg-final { scan-tree-dump-times conststaticvariable 1 optimized { target { { ! pie_enabled } { ! nonpic } } } } } */ /* { dg-final { cleanup-tree-dump optimized } } */ Index: testsuite/ChangeLog === --- testsuite/ChangeLog (revision 220617) +++ testsuite/ChangeLog (working copy) @@ -1,3 +1,11 @@ +2015-02-11 H.J. Lu hongjiu...@intel.com + + * lib/target-supports.exp (check_effective_target_pie): Renamed + to ... + (check_effective_target_pie_enabled): This. + * gcc.dg/tree-ssa/ssa-store-ccp-3.c: Replace pie with pie_enabled. + * gcc.target/i386/pie.c: Likewise. + 2015-02-11 Andrew Pinski apin...@cavium.com PR target/64893 Index: testsuite/gcc.target/i386/pie.c === --- testsuite/gcc.target/i386/pie.c (revision 220617) +++ testsuite/gcc.target/i386/pie.c (working copy) @@ -1,4 +1,4 @@ -/* { dg-do compile { target pie } } */ +/* { dg-do compile { target pie_enabled } } */ /* { dg-options -O2 } */ int foo (void); Index: doc/sourcebuild.texi === --- doc/sourcebuild.texi (revision 220617) +++ doc/sourcebuild.texi (working copy) @@ -1884,6 +1884,9 @@ @item nonpic Target does not generate PIC by default. +@item pie_enabled +Target generates PIE by default. + @item pcc_bitfield_type_matters Target defines @code{PCC_BITFIELD_TYPE_MATTERS}.
Re: [testsuite] PATCH: Add check_effective_target_pie
On Wed, Feb 11, 2015 at 6:10 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: The new proc is bogus, unfortunately: there's already an existing check_effective_target_pie that checks if a target can support PIE. The new one just overrides the previous one. On targets supporting PIE (like Darwin), but not defaulting to it, the PIE tests suddenly turn out UNSUPPORTED. You should rename the new one to e.g. check_effective_target_pie_default, update the single user, and document it in sourcebuild.texi. I checked in this as an obvious fix. I think pie_enabled is not a very descriptive name: Index: doc/sourcebuild.texi === --- doc/sourcebuild.texi (revision 220617) +++ doc/sourcebuild.texi (working copy) @@ -1884,6 +1884,9 @@ @item nonpic Target does not generate PIC by default. +@item pie_enabled +Target generates PIE by default. + @item pcc_bitfield_type_matters Target defines @code{PCC_BITFIELD_TYPE_MATTERS}. With -fpie, PIE is also enabled, just not the default without any I was testing # make RUNTESTFLAGS=--target_board='unix{-m32\ -fpie,-fpie}' I don't consider PIE is default. It is just enabled. options. Please either go with the pie_default I sugested or wait for others to weigh in before rushing in another `obvious' fix. -- H.J.
Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.
On 11/02/15 11:14 +, Matthew Wahab wrote: Attached the fixed patch. Tested by running check-target-libstdc++-v3, with the modified tests, for arm-none-eabi and aarch64-none-linux-gnu. Ok to commit? OK, thanks.
Re: [patch, build] Restore bootstrap in building libcc1 on darwin
Hi Jeff, On 9 Feb 2015, at 14:47, Jeff Law l...@redhat.com wrote: On 02/01/15 09:42, Iain Sandoe wrote: This is a GCC5 bootstrap regression on 32bit Darwin hosts ( I can raise a PR if that is considered necessary). Has this been addressed or is it still pending? I believe it's still awaiting review - I'm travelling this week, so limited access to resources - but will be able to tackle any comments at the weekend. thanks Iain In fact it is not libcc1, but libiberty that is the cause - On 26 Jan 2015, at 14:13, Rainer Orth wrote: FX fxcoud...@gmail.com writes: The default BOOT_CFLAGS are: -O2 -g -mdynamic-no-pic the libiberty pic build appends: -fno-common (and not even -fPIC) [NB -fPIC _won't_ override -mdynamic-no-pic, so that's not a simple way out] This means that the PIC library is being built with non-pic relocs. config/mh-darwin says that -mdynamic-no-pic is there because it “speeds compiles by 3-5%”. I don’t think we care about speed when the bootstrap fails, so can we remove it altogether? Darwin/i686 still doesn't bootstrap without this patch, I believe. Shouldn't it be applied to trunk before GCC 5 ships, rather than leaving that target broken? The PIC variant of libiberty, has never (since it was added, AFAICT) catered for the possibility that non-PIC and PIC options might conflict (and/or that it might not be possible to override non-PIC options simply by appending PIC ones). This has gone un-noticed until the libcc1 plugin started linking with the pic version of libiberty. Darwin uses -mdynamic-no-pic as a default flag during bootstrap for 32bit targets, since that gives a stated ~ 3-5% improvement in performance. It is not possible to override this non-pic option by appending -fPIC (that simply results in a warning that mdynamic-no-pic takes precedence). I'd rather not pretend that there's no problem and simply penalise performance on m32 darwin by removing the option from the bootflags. So here's a patch that allows a target to declare incompatible non-pic options (in the same way that we have PIC options already declared as distinct). The patch is largely mechanical (each of the targets adjusted to use the NONPIC flag for the relevant case). As an aside, is there a portability reason that we don't make this repeated operation into a make function? bootstapped on x86_64-linux, x86_64-darwin12, i686-darwin10, powerpc-darwin9, cross-compiled x86-64-darwin12 X i686-darwin10 , native X i686-darwin10 (x86_64-darwin12 build). all languages including Ada (note that there's a local patch needed to work-around an Ada bootstrap issue - pr64349) OK for trunk? Iain libiberty: * Makefile.in (NONPICFLAG, NEWCFLAGS, NEWCPPFLAGS): New. (COMPILE.c): Adjust to use new flags. (all non-pic targets): Adjust to use NONPIC flag. * configure.ac (NOPICFLAG): New. * configure: Regenerate.
Re: [testsuite] PATCH: Add check_effective_target_pie
On Wed, Feb 11, 2015 at 7:19 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: On Wed, Feb 11, 2015 at 6:10 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: H.J. Lu hjl.to...@gmail.com writes: The new proc is bogus, unfortunately: there's already an existing check_effective_target_pie that checks if a target can support PIE. The new one just overrides the previous one. On targets supporting PIE (like Darwin), but not defaulting to it, the PIE tests suddenly turn out UNSUPPORTED. You should rename the new one to e.g. check_effective_target_pie_default, update the single user, and document it in sourcebuild.texi. I checked in this as an obvious fix. I think pie_enabled is not a very descriptive name: Index: doc/sourcebuild.texi === --- doc/sourcebuild.texi (revision 220617) +++ doc/sourcebuild.texi (working copy) @@ -1884,6 +1884,9 @@ @item nonpic Target does not generate PIC by default. +@item pie_enabled +Target generates PIE by default. + @item pcc_bitfield_type_matters Target defines @code{PCC_BITFIELD_TYPE_MATTERS}. With -fpie, PIE is also enabled, just not the default without any I was testing # make RUNTESTFLAGS=--target_board='unix{-m32\ -fpie,-fpie}' I don't consider PIE is default. It is just enabled. options. Please either go with the pie_default I sugested or wait for others to weigh in before rushing in another `obvious' fix. Then the description (both sourcebuild.texi and target-supports.texi) is confusing. That is how other options are described. What are you trying to achieve here, actually? Even on Solaris 11/x86 (which doesn't support PIE), -fpie lets the check_effective_target_pie_enabled (or whatever it's called) proc pass. Shouldn't it also check if the target can support PIE at all? Assembly outputs may be different, depending on if PIE is enabled nor not. When we scan assembly outputs for test results, we have different expected results when PIE is enabled. That is how pie_enabled is used so far. -- H.J.
Re: nvptx offloading patches [4/n]
Hi! On Wed, 28 Jan 2015 18:05:25 +0100, I wrote: On Sat, 1 Nov 2014 13:11:29 +0100, Bernd Schmidt ber...@codesourcery.com wrote: I'm sending this for reference more than anything else - this is the patch that adds the target support for offloading to the nvptx port. It depends on the other offloading patches Ilya is currently submitting. Committed to trunk in r220209: commit 9c08fbb35aa95420268c2762151f22a6a9b90e85 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Jan 28 17:03:44 2015 + nvptx mkoffload. Committed to trunk in r220620: commit c84c7f1923ab042478d73029475cf7f1aab6a61a Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Feb 11 14:15:38 2015 + nvptx mkoffload: __OPENMP_TARGET__ - __OFFLOAD_TABLE__. gcc/ * config/nvptx/mkoffload.c (process): Refer to __OFFLOAD_TABLE__ instead of __OPENMP_TARGET__. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@220620 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog| 3 +++ gcc/config/nvptx/mkoffload.c | 4 ++-- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index 1479dcb..bc4a050 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,5 +1,8 @@ 2015-02-11 Thomas Schwinge tho...@codesourcery.com + * config/nvptx/mkoffload.c (process): Refer to __OFFLOAD_TABLE__ + instead of __OPENMP_TARGET__. + * config/nvptx/mkoffload.c: Include gomp-constants.h. (process): Use its GOMP_DEVICE_NVIDIA_PTX instead of (wrongly) hard-coding PTX_ID. diff --git gcc/config/nvptx/mkoffload.c gcc/config/nvptx/mkoffload.c index 8f359cf..2287316 100644 --- gcc/config/nvptx/mkoffload.c +++ gcc/config/nvptx/mkoffload.c @@ -791,9 +791,9 @@ process (FILE *in, FILE *out) fprintf (out, extern void GOMP_offload_register (const void *, int, void *);\n); - fprintf (out, extern void *__OPENMP_TARGET__[];\n\n); + fprintf (out, extern void *__OFFLOAD_TABLE__[];\n\n); fprintf (out, static __attribute__((constructor)) void init (void)\n{\n); - fprintf (out, GOMP_offload_register (__OPENMP_TARGET__, %d,\n, + fprintf (out, GOMP_offload_register (__OFFLOAD_TABLE__, %d,\n, GOMP_DEVICE_NVIDIA_PTX); fprintf (out, target_data);\n); fprintf (out, };\n); Grüße, Thomas pgp1FDb66hFdS.pgp Description: PGP signature
Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
On 02/11/2015 06:20 AM, Jiong Wang wrote: 2014-12-19 15:21 GMT+00:00 Kenneth Zadeck zad...@naturalbridge.com: however, since i am a nice person loop-invariant solves the DF_UD_CHAIN which builds a data structure that connects each use with all of the defs that reach it. I believe that this is the opposite of what you want here. if you really need this, you need to also turn on the DF_DU_CHAIN which builds the opposite structure.Both structures can be space hogs, but they are both turned on in other places in the compiler so it is unlikely to be an issue. Exactly, Thanks, Kenneth. This approach works from my experiment and look much better than previous REG_NOTE approach. while it do have one problem. We need the UD/DU chain built before we do insn re-shuffling. While after re-shuffling, UD chain needs update, otherwise, the later check_dependecies in find_invariant_insn may fail. although we have re-shuffle instruction 1 into 2, the later check still using old UD info, thus decide instruction 2 is not iv. 1: regA - vfp + regB 2: regA - vfp + const my current fix is to insert those re-shuffled insn into a table named vfp_const_iv, then skip those dependencies check for them as they don't have any dependencies. You now are beginning to understand why Mark Wegman and I invented SSA form almost 30 years ago There is no good way to keep the information up to data as you are changing the program.To a large extent you are on your own. If what you are suggesting works, then this is likely much faster than rebuilding the chains. LOG_LINKs have nothing to do with single use; they point from the _first_ use to its corresponding def. You might want to look at what fwprop does instead. Pass rtl fwprop uses df information in single-definition way, it doesn't really take into consideration if register is a single use. This often corrupts other optimizations like post-increment and load/store pair. For example: add r2, r1, r0 ldr rx, [r2] add r2, r2, #4 is transformed into below form: add r2, r1, r0 ldr rx, [r1, r0] add r2, r2, #4 As a result, post-increment opportunity is corrupted, also definition of r2 can't be deleted because it's not single use. Thanks, bin thanks for all these suggestion. Have look at the LOG_LINK build function, a simple reverse scan, while needs to allocate big map array for all pseudo regs. Haven't looked at similar code in fwprop, actually, when found the first matched insn pattern, I just want to scan several insns next, then abort quickly if nothing meet requirement. there is no need to build full single-use information. still can anyone confirm that it is safe to re-use REG_DEAD info there without calling df_note_add_problem and df_analysis first? or I am using those info passed down from the previous pass which calculated these info and maybe broken? It's not safe to use REG_DEAD info without re-computing it. not sure that reg_dead is the right answer even if you did re-compute it. I believe you can have two parallel uses (on both sides of an if-then-else) for a single def (above the if then else) and have two REG_DEAD notes. Richard.
Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On 02/11/15 04:16, Richard Biener wrote: Application of this pattern (and the one I posted for 47477) is a concern for targets that don't do sub-word arithmetic/logicals. But I just did a sniff test of one such target (v850-elf because it was handy) and I couldn't spot a change in the end code using both the 47477 patch and my WIP patch for this BZ. The c-family frontends perform this kind of narrowing already anyway (via the shorten_* stuff which is misplaced there and should be done elsewhere for all frontends - thus in match.pd, thanks for starting that). True, but I wanted to see if there was any impact, but thankfully there isn't. The fact that the C/C++ front-ends are doing most of the shortening now probably explains why the fix for 47477 only affected code generation for the Java front-end. Jeff
Re: nvptx offloading patches [4/n]
Hi! On Wed, 28 Jan 2015 20:59:45 +0300, Ilya Verbin iver...@gmail.com wrote: On 28 Jan 18:05, Thomas Schwinge wrote: + fprintf (out, #define PTX_ID 1\n); + fprintf (out, static __attribute__((constructor)) void init (void)\n{\n); + fprintf (out, GOMP_offload_register (__OPENMP_TARGET__, PTX_ID,\n); The file include/gomp-constants.h already contains: #define GOMP_DEVICE_NVIDIA_PTX 5 I guess it would be better to include gomp-constants.h into mkoffload and to use this define. Thanks! You're right indeed -- I mistakenly committed an older version of this file. Committed to trunk in r220619: commit ab0e6fbc36cacaa619dfacec41e17a681a28562a Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Feb 11 14:15:28 2015 + nvptx mkoffload: Don't hard-code GOMP_DEVICE_NVIDIA_PTX. gcc/ * config/nvptx/mkoffload.c: Include gomp-constants.h. (process): Use its GOMP_DEVICE_NVIDIA_PTX instead of (wrongly) hard-coding PTX_ID. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@220619 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog| 6 ++ gcc/config/nvptx/mkoffload.c | 5 +++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index 2fa7ff2..1479dcb 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,3 +1,9 @@ +2015-02-11 Thomas Schwinge tho...@codesourcery.com + + * config/nvptx/mkoffload.c: Include gomp-constants.h. + (process): Use its GOMP_DEVICE_NVIDIA_PTX instead of (wrongly) + hard-coding PTX_ID. + 2015-02-11 H.J. Lu hongjiu...@intel.com * doc/sourcebuild.texi (pie_enabled): Document. diff --git gcc/config/nvptx/mkoffload.c gcc/config/nvptx/mkoffload.c index 38ccdba..8f359cf 100644 --- gcc/config/nvptx/mkoffload.c +++ gcc/config/nvptx/mkoffload.c @@ -35,6 +35,7 @@ #include obstack.h #include diagnostic-core.h #include collect-utils.h +#include gomp-constants.h const char tool_name[] = nvptx mkoffload; @@ -791,9 +792,9 @@ process (FILE *in, FILE *out) fprintf (out, extern void GOMP_offload_register (const void *, int, void *);\n); fprintf (out, extern void *__OPENMP_TARGET__[];\n\n); - fprintf (out, #define PTX_ID 1\n); fprintf (out, static __attribute__((constructor)) void init (void)\n{\n); - fprintf (out, GOMP_offload_register (__OPENMP_TARGET__, PTX_ID,\n); + fprintf (out, GOMP_offload_register (__OPENMP_TARGET__, %d,\n, + GOMP_DEVICE_NVIDIA_PTX); fprintf (out, target_data);\n); fprintf (out, };\n); } Grüße, Thomas pgpaqE_j2EL5W.pgp Description: PGP signature
[PATCH] Fix -fsection-anchors alias handling (PR middle-end/65003)
Hi! This patch fixes a thinko in -fsection-anchors alias handling code Honza added in r211357. Usually for aliases we don't create SYMBOL_REF_HAS_BLOCK_INFO_P, but in the following testcase we do, because the rtl is created through notice_global_symbol already when the symbol wasn't an alias, only later optimizations made it into an alias. The thinko is that DECL_RTL of the alias target would be SYMBOL_REF, I think that can't ever happen, usually the DECL_RTL would be a MEM with SYMBOL_REF operand. I've bootstrapped/regtested this on {x86_64,i686,ppc64,ppc64le,aarch64,armv7hl,s390,s390x}-linux without regressions, so I hope the assertion this way is safe, if not, we'd need to avoid turning a symbol into an alias somewhere much earlier. Ok for trunk? 2015-02-11 Jakub Jelinek ja...@redhat.com PR middle-end/65003 * varasm.c (place_block_symbol): Assert that DECL_RTL of the ultimate alias is MEM with SYMBOL_REF satisfying SYMBOL_REF_HAS_BLOCK_INFO_P as its operand. Don't pass the MEM to place_block_symbol, but instead pass the SYMBOL_REF operand of it. * g++.dg/opt/pr65003.C: New test. --- gcc/varasm.c.jj 2015-02-04 15:24:20.0 +0100 +++ gcc/varasm.c2015-02-10 15:37:55.872609142 +0100 @@ -7180,6 +7180,10 @@ place_block_symbol (rtx symbol) { rtx target = DECL_RTL (snode-ultimate_alias_target ()-decl); + gcc_assert (MEM_P (target) + GET_CODE (XEXP (target, 0)) == SYMBOL_REF + SYMBOL_REF_HAS_BLOCK_INFO_P (XEXP (target, 0))); + target = XEXP (target, 0); place_block_symbol (target); SYMBOL_REF_BLOCK_OFFSET (symbol) = SYMBOL_REF_BLOCK_OFFSET (target); return; --- gcc/testsuite/g++.dg/opt/pr65003.C.jj 2015-02-10 15:50:44.240734029 +0100 +++ gcc/testsuite/g++.dg/opt/pr65003.C 2015-02-10 15:50:39.468814055 +0100 @@ -0,0 +1,31 @@ +// PR middle-end/65003 +// { dg-do compile } +// { dg-options -O2 } +// { dg-additional-options -fpic { target fpic } } + +struct A +{ + void operator= (A ); + A (); +}; +struct B +{ + A b; +}; +struct C +{ + virtual bool foo (int , bool) const; +}; +struct D : virtual C +{ + bool foo (int , bool) const; + B e; +}; +struct F : D +{ + F (int , const int , const A ); + bool foo (int , bool) const; +}; +bool D::foo (int , bool) const {} +F::F (int , const int , const A ) {} +bool F::foo (int , bool) const {} Jakub
Re: nvptx offloading patches [4/n]
On 02/11/2015 03:44 PM, Thomas Schwinge wrote: Note the global cold/hot labels. This confuses mkoffload, and it runs into a busy loop due to what I understand to be a bug in skipping of strange tokens, cited above, which such global labels would fall under. Here is what might be a fix for this (but I didn't analyze the parsing code in detail); OK for trunk? I'd rather fail if anything unexpected is seen. Things like -freorder-blocks-and-partition should be forced off in nvptx_option_override. Bernd
Re: nvptx offloading patches [4/n]
Hi Bernd! On Sat, 1 Nov 2014 13:11:29 +0100, Bernd Schmidt ber...@codesourcery.com wrote: I'm sending this for reference more than anything else - this is the patch that adds the target support for offloading to the nvptx port. (I committed this in r220209.) I actually expect this to change a little in the near future; the nvptx mkoffload duplicates some of the logic that we have in nvptx-as and I'm thinking of making some improvements. But I figure it would be good to show the entire picture, as it is as of now. I'm aware this is in progress, and will replace the code I'm commenting on below. Just to make sure that similar issues don't exist in nvptx-as, too. --- /dev/null +++ git/gcc/config/nvptx/mkoffload.c +static Token * +parse_file (Token *tok) +{ + [...] + else +{ + /* Something strange. Ignore it. */ + if (comment) + append_stmt (fns, comment); + + while (tok-kind !tok-end) + tok++; +} + return tok; +} I'm not sure if silently ignoring strange tokens is a good strategy? If -freorder-blocks-and-partition is active, this results in PTX code such as: // BEGIN PREAMBLE .version3.1 .target sm_30 .address_size 64 // END PREAMBLE $LCOLDB0: $LHOTB0: // BEGIN FUNCTION DECL: vec_mult$_omp_fn$1 .entry vec_mult$_omp_fn$1(.param.u64 %in_ar1); // BEGIN FUNCTION DEF: vec_mult$_omp_fn$1 .entry vec_mult$_omp_fn$1(.param.u64 %in_ar1) { .reg.u64 %ar1; [...] Note the global cold/hot labels. This confuses mkoffload, and it runs into a busy loop due to what I understand to be a bug in skipping of strange tokens, cited above, which such global labels would fall under. Here is what might be a fix for this (but I didn't analyze the parsing code in detail); OK for trunk? --- gcc/config/nvptx/mkoffload.c +++ gcc/config/nvptx/mkoffload.c @@ -755,8 +755,9 @@ parse_file (Token *tok) if (comment) append_stmt (fns, comment); - while (tok-kind !tok-end) + do tok++; + while (tok-kind !tok-end); } return tok; } Grüße, Thomas signature.asc Description: PGP signature
[4.9] 3 backports to 4.9 branch from the trunk
Hi! I've bootstrapped/regtested these 3 backports from trunk to 4.9 branch on x86_64-linux and i686-linux and installed. Jakub 2015-02-11 Jakub Jelinek ja...@redhat.com Backported from mainline 2015-02-04 Jakub Jelinek ja...@redhat.com PR c/64824 PR c/64868 * c-omp.c (c_finish_omp_atomic): Use TRUNC_DIV_EXPR instead of RDIV_EXPR. Use build_binary_op instead of build2_loc. * c-parser.c (c_parser_omp_atomic): Handle RDIV_EXPR. * parser.c (cp_parser_omp_atomic): Handle RDIV_EXPR. * testsuite/libgomp.c/pr64824.c: New test. * testsuite/libgomp.c/pr64868.c: New test. * testsuite/libgomp.c++/pr64824.C: New test. * testsuite/libgomp.c++/pr64868.C: New test. --- gcc/c-family/c-omp.c(revision 220419) +++ gcc/c-family/c-omp.c(revision 220420) @@ -206,6 +206,9 @@ c_finish_omp_atomic (location_t loc, enu return error_mark_node; } + if (opcode == RDIV_EXPR) +opcode = TRUNC_DIV_EXPR; + /* ??? Validate that rhs does not overlap lhs. */ /* Take and save the address of the lhs. From then on we'll reference it @@ -240,7 +243,7 @@ c_finish_omp_atomic (location_t loc, enu to do this, and then take it apart again. */ if (swapped) { - rhs = build2_loc (loc, opcode, TREE_TYPE (lhs), rhs, lhs); + rhs = build_binary_op (loc, opcode, rhs, lhs, 1); opcode = NOP_EXPR; } bool save = in_late_binary_op; --- gcc/c/c-parser.c(revision 220419) +++ gcc/c/c-parser.c(revision 220420) @@ -12611,6 +12611,7 @@ restart: { case MULT_EXPR: case TRUNC_DIV_EXPR: + case RDIV_EXPR: case PLUS_EXPR: case MINUS_EXPR: case LSHIFT_EXPR: --- gcc/cp/parser.c (revision 220419) +++ gcc/cp/parser.c (revision 220420) @@ -29835,6 +29835,7 @@ restart: { case MULT_EXPR: case TRUNC_DIV_EXPR: + case RDIV_EXPR: case PLUS_EXPR: case MINUS_EXPR: case LSHIFT_EXPR: --- libgomp/testsuite/libgomp.c++/pr64868.C (revision 0) +++ libgomp/testsuite/libgomp.c++/pr64868.C (revision 220420) @@ -0,0 +1,5 @@ +// PR c/64868 +// { dg-do run } +// { dg-options -O2 -fopenmp } + +#include ../libgomp.c/pr64868.c --- libgomp/testsuite/libgomp.c++/pr64824.C (revision 0) +++ libgomp/testsuite/libgomp.c++/pr64824.C (revision 220420) @@ -0,0 +1,5 @@ +// PR c/64824 +// { dg-do run } +// { dg-options -O2 -fopenmp } + +#include ../libgomp.c/pr64824.c --- libgomp/testsuite/libgomp.c/pr64868.c (revision 0) +++ libgomp/testsuite/libgomp.c/pr64868.c (revision 220420) @@ -0,0 +1,87 @@ +/* PR c/64868 */ +/* { dg-do run } */ +/* { dg-options -O2 -fopenmp } */ + +float f = 2.0f; +double d = 4.0; +long double ld = 8.0L; + +void +foo () +{ +#pragma omp atomic + f = 1.0f / f; +#pragma omp atomic + f = 1 / f; +#pragma omp atomic + f = f / 2.0f; +#pragma omp atomic + f = f / 2; +#pragma omp atomic + f /= 2.0f; +#pragma omp atomic + f /= 2; +#pragma omp atomic + d = 1.0 / d; +#pragma omp atomic + d = 1 / d; +#pragma omp atomic + d = d / 2.0; +#pragma omp atomic + d = d / 2; +#pragma omp atomic + d /= 2.0; +#pragma omp atomic + d /= 2; +#pragma omp atomic + ld = 1.0L / ld; +#pragma omp atomic + ld = 1 / ld; +#pragma omp atomic + ld = ld / 2.0L; +#pragma omp atomic + ld = ld / 2; +#pragma omp atomic + ld /= 2.0L; +#pragma omp atomic + ld /= 2; + if (f != 0.125f || d != 0.25 || ld != 0.5L) +__builtin_abort (); +} + +#ifdef __cplusplus +template typename T, int N1, int N2 +void +bar () +{ + T v = ::d; +#pragma omp atomic + v *= 16; +#pragma omp atomic + v = 1.0 / v; +#pragma omp atomic + v = N1 / v; +#pragma omp atomic + v = v / 2.0; +#pragma omp atomic + v = v / N2; +#pragma omp atomic + v /= 2.0; +#pragma omp atomic + v /= N2; + if (v != 0.25) +__builtin_abort (); +} +#endif + +int +main () +{ + foo (); +#ifdef __cplusplus + barfloat, 1, 2 (); + bardouble, 1, 2 (); + barlong double, 1, 2 (); +#endif + return 0; +} --- libgomp/testsuite/libgomp.c/pr64824.c (revision 0) +++ libgomp/testsuite/libgomp.c/pr64824.c (revision 220420) @@ -0,0 +1,16 @@ +/* PR c/64824 */ +/* { dg-do run } */ +/* { dg-options -O2 -fopenmp } */ + +int +main () +{ + long long a; + long long b = 1LL; + int c = 3; +#pragma omp atomic capture + a = b = c b; + if (b != 6LL || a != 6LL) +__builtin_abort (); + return 0; +} 2015-02-11 Jakub Jelinek ja...@redhat.com Backported from mainline 2015-02-09 Jakub Jelinek ja...@redhat.com PR target/64979 * tree-stdarg.c (pass_stdarg::execute): Scan phi node args for va_list escapes. * gcc.dg/tree-ssa/stdarg-7.c: New test. * gcc.c-torture/execute/pr64979.c: New test. --- gcc/tree-stdarg.c (revision 220542) +++ gcc/tree-stdarg.c (revision
Re: [PATCH] Fix -fsection-anchors alias handling (PR middle-end/65003)
On Wed, 11 Feb 2015, Jakub Jelinek wrote: Hi! This patch fixes a thinko in -fsection-anchors alias handling code Honza added in r211357. Usually for aliases we don't create SYMBOL_REF_HAS_BLOCK_INFO_P, but in the following testcase we do, because the rtl is created through notice_global_symbol already when the symbol wasn't an alias, only later optimizations made it into an alias. The thinko is that DECL_RTL of the alias target would be SYMBOL_REF, I think that can't ever happen, usually the DECL_RTL would be a MEM with SYMBOL_REF operand. Yeah, indeed. I've bootstrapped/regtested this on {x86_64,i686,ppc64,ppc64le,aarch64,armv7hl,s390,s390x}-linux without regressions, so I hope the assertion this way is safe, if not, we'd need to avoid turning a symbol into an alias somewhere much earlier. Ok for trunk? Ok. Thanks, Richard. 2015-02-11 Jakub Jelinek ja...@redhat.com PR middle-end/65003 * varasm.c (place_block_symbol): Assert that DECL_RTL of the ultimate alias is MEM with SYMBOL_REF satisfying SYMBOL_REF_HAS_BLOCK_INFO_P as its operand. Don't pass the MEM to place_block_symbol, but instead pass the SYMBOL_REF operand of it. * g++.dg/opt/pr65003.C: New test. --- gcc/varasm.c.jj 2015-02-04 15:24:20.0 +0100 +++ gcc/varasm.c 2015-02-10 15:37:55.872609142 +0100 @@ -7180,6 +7180,10 @@ place_block_symbol (rtx symbol) { rtx target = DECL_RTL (snode-ultimate_alias_target ()-decl); + gcc_assert (MEM_P (target) +GET_CODE (XEXP (target, 0)) == SYMBOL_REF +SYMBOL_REF_HAS_BLOCK_INFO_P (XEXP (target, 0))); + target = XEXP (target, 0); place_block_symbol (target); SYMBOL_REF_BLOCK_OFFSET (symbol) = SYMBOL_REF_BLOCK_OFFSET (target); return; --- gcc/testsuite/g++.dg/opt/pr65003.C.jj 2015-02-10 15:50:44.240734029 +0100 +++ gcc/testsuite/g++.dg/opt/pr65003.C2015-02-10 15:50:39.468814055 +0100 @@ -0,0 +1,31 @@ +// PR middle-end/65003 +// { dg-do compile } +// { dg-options -O2 } +// { dg-additional-options -fpic { target fpic } } + +struct A +{ + void operator= (A ); + A (); +}; +struct B +{ + A b; +}; +struct C +{ + virtual bool foo (int , bool) const; +}; +struct D : virtual C +{ + bool foo (int , bool) const; + B e; +}; +struct F : D +{ + F (int , const int , const A ); + bool foo (int , bool) const; +}; +bool D::foo (int , bool) const {} +F::F (int , const int , const A ) {} +bool F::foo (int , bool) const {} Jakub -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: nvptx offloading patches [4/n]
Hi! On Wed, 28 Jan 2015 18:05:25 +0100, I wrote: On Sat, 1 Nov 2014 13:11:29 +0100, Bernd Schmidt ber...@codesourcery.com wrote: I'm sending this for reference more than anything else - this is the patch that adds the target support for offloading to the nvptx port. It depends on the other offloading patches Ilya is currently submitting. Committed to trunk in r220209: commit 9c08fbb35aa95420268c2762151f22a6a9b90e85 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Jan 28 17:03:44 2015 + nvptx mkoffload. Committed to trunk in r220621: commit bfeb7c1c69130bb13265c49126e6159dbd9b8a5d Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Feb 11 14:15:47 2015 + nvptx mkoffload: Initialize GCC diagnostic machinery before using it. gcc/ * config/nvptx/mkoffload.c: Include diagnostic.h instead of diagnostic-core.h. (main): Initialize progname, and call diagnostic_initialize. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@220621 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog| 4 gcc/config/nvptx/mkoffload.c | 5 - 2 files changed, 8 insertions(+), 1 deletion(-) diff --git gcc/ChangeLog gcc/ChangeLog index bc4a050..482b128 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,5 +1,9 @@ 2015-02-11 Thomas Schwinge tho...@codesourcery.com + * config/nvptx/mkoffload.c: Include diagnostic.h instead of + diagnostic-core.h. + (main): Initialize progname, and call diagnostic_initialize. + * config/nvptx/mkoffload.c (process): Refer to __OFFLOAD_TABLE__ instead of __OPENMP_TARGET__. diff --git gcc/config/nvptx/mkoffload.c gcc/config/nvptx/mkoffload.c index 2287316..739aee8 100644 --- gcc/config/nvptx/mkoffload.c +++ gcc/config/nvptx/mkoffload.c @@ -33,7 +33,7 @@ #include intl.h #include libgen.h #include obstack.h -#include diagnostic-core.h +#include diagnostic.h #include collect-utils.h #include gomp-constants.h @@ -828,6 +828,9 @@ main (int argc, char **argv) FILE *out = stdout; const char *outname = 0; + progname = mkoffload; + diagnostic_initialize (global_dc, 0); + char *collect_gcc = getenv (COLLECT_GCC); if (collect_gcc == NULL) fatal_error (input_location, COLLECT_GCC must be set.); Grüße, Thomas pgp87KXUxlCcH.pgp Description: PGP signature
Re: [PATCH] PR rtl-optimization/32219: optimizer causes wrong code in pic/hidden/weak symbol checking
Richard, Your proposed patch fails to bootstrap here on x86_64-apple-darwin14 against the system clang compilers... g++ -c -g -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc-5-20150211/gcc -I../../gcc-5-20150211/gcc/. -I../../gcc-5-20150211/gcc/../include -I../../gcc-5-20150211/gcc/../libcpp/include -I/sw/include -I/sw/include -I../../gcc-5-20150211/gcc/../libdecnumber -I../../gcc-5-20150211/gcc/../libdecnumber/dpd -I../libdecnumber -I../../gcc-5-20150211/gcc/../libbacktrace -I/sw/include -I/sw/include -o varasm.o -MT varasm.o -MMD -MP -MF ./.deps/varasm.TPo ../../gcc-5-20150211/gcc/varasm.c ... ../../gcc-5-20150211/gcc/varasm.c:6899:9: error: use of undeclared identifier 'resolution_to_local_definition_p' return resolution_to_local_definition_p (vnode-resolution); ^ ../../gcc-5-20150211/gcc/varasm.c:6906:9: error: use of undeclared identifier 'resolution_to_local_definition_p' return resolution_to_local_definition_p (node-resolution); ^ Jack On Tue, Feb 10, 2015 at 4:19 PM, Richard Henderson r...@redhat.com wrote: On 02/07/2015 08:45 AM, H.J. Lu wrote: Here is an updated patch. This is an annoying comment to differentiate 3 different versions, FYI. @@ -6826,11 +6817,18 @@ default_binds_local_p_1 (const_tree exp, int shlib) (TREE_STATIC (exp) || DECL_EXTERNAL (exp))) { varpool_node *vnode = varpool_node::get (exp); - if (vnode (resolution_local_p (vnode-resolution) || vnode-in_other_partition)) - resolved_locally = true; - if (vnode -resolution_to_local_definition_p (vnode-resolution)) - resolved_to_local_def = true; + /* If not building shared library, common or initialized symbols + are also resolved locally, regardless they are weak or not if + weak_dominate is true. */ + if (vnode) + { + if ((weak_dominate !shlib vnode-definition) + || vnode-in_other_partition + || resolution_local_p (vnode-resolution)) + resolved_locally = true; + if (resolution_to_local_definition_p (vnode-resolution)) + resolved_to_local_def = true; + } } else if (TREE_CODE (exp) == FUNCTION_DECL TREE_PUBLIC (exp)) { Not the same test for a function_decl? That's certainly a mistake. Indeed, I believe we can now unify these two cases by using the base symtab_node instead of varpool_node and cgraph_node. I'm a bit confused about why we have both resolved_to_local_def vs resolved_locally. At least within this function, which only cares about module locality rather than object file locality. Honza? @@ -6859,9 +6857,15 @@ default_binds_local_p_1 (const_tree exp, int shlib) else if (! TREE_PUBLIC (exp)) local_p = true; /* A variable is local if the user has said explicitly that it will - be. */ + be and it is resolved or defined locally, not compiling for PIC, + not weak or weak_dominate is false. */ else if ((DECL_VISIBILITY_SPECIFIED (exp) || resolved_to_local_def) + (!weak_dominate +|| resolved_locally +|| !flag_pic +|| !DECL_EXTERNAL (exp) +|| !DECL_WEAK (exp)) DECL_VISIBILITY (exp) != VISIBILITY_DEFAULT) local_p = true; I think this test is conflating several issues, and as such it's very confusing. As an existing issue, I'm not sure why specified visibility is any different from unspecified visibility. As far as I'm aware, the specified bit simply means that the decl doesn't inherit inherit visibility from the class, or from the command-line. But once we're this far, the visibility actually applied to the symbol should be all that matters. Unfortunately, this test is 11 years old, from r85167. It came in as part of the implementation of the visibility attribute for the class. I believe the next test should be else if (DECL_WEAK (exp) !resolved_locally) local_p = false; Given the change above, resolved_locally will now be true for (weak defined weak_dominate), and therefore we need not reiterate that test. I believe the next test should be dictated by normal non-lto usage like extern void bar(void) __attribute__((visibility(hidden))); void foo(void) { ... bar(); ...) where the user is expecting that the function call make use of a simpler instruction sequence, probably avoiding an PLT. Therefore else if (DECL_VISIBILITY (exp) != VISIBILITY_DEFAULT) local_p = true; Since we have already excluded undef-weak, we know that any symbol here is either defined or strong, and non-default visibility must resolve it locally. /* Variables defined outside this object
[nvptx] -freorder-blocks-and-partition, -freorder-functions (was: nvptx offloading patches [4/n])
Hi! On Wed, 11 Feb 2015 15:44:26 +0100, I wrote: If -freorder-blocks-and-partition is active, this results in PTX code such as: // BEGIN PREAMBLE .version3.1 .target sm_30 .address_size 64 // END PREAMBLE $LCOLDB0: $LHOTB0: // BEGIN FUNCTION DECL: vec_mult$_omp_fn$1 .entry vec_mult$_omp_fn$1(.param.u64 %in_ar1); // BEGIN FUNCTION DEF: vec_mult$_omp_fn$1 .entry vec_mult$_omp_fn$1(.param.u64 %in_ar1) { .reg.u64 %ar1; [...] Note the global cold/hot labels. Such partitioning might not make a lot of sense for the virtual ISA that PTX is, but disabling it in nvptx.c:nvptx_option_override does not work. (Because that is not invoked in the offloading code path?) I see x86 has a ix86_option_override_internal (but I don't know how that options processing works) -- is something like that needed for nvptx, too, and how to interconnect that with the offloading code path? Sounds a bit like what Jakub suggests in https://gcc.gnu.org/PR64374#c8? Maybe -freorder-functions (of no use?) should also be disabled? Here is a WIP patch for -freorder-blocks-and-partition (missing handling of the offloading code path) -- does something like that make sense? --- gcc/config/nvptx/nvptx.c +++ gcc/config/nvptx/nvptx.c @@ -93,6 +93,18 @@ nvptx_option_override (void) init_machine_status = nvptx_init_machine_status; /* Gives us a predictable order, which we need especially for variables. */ flag_toplevel_reorder = 1; + /* If enabled, global cold/hot labels will be emitted, which our mkoffload + currently doesn't cope with. Also, it's not clear whether such + partitioning actually has any positive effect on the virtual ISA that PTX + is. */ + if (flag_reorder_blocks_and_partition) +{ + inform (input_location, + -freorder-blocks-and-partition not supported on this + architecture); + flag_reorder_blocks_and_partition = 0; + flag_reorder_blocks = 1; +} /* Assumes that it will see only hard registers. */ flag_var_tracking = 0; write_symbols = NO_DEBUG; Grüße, Thomas signature.asc Description: PGP signature
Re: patch to fix rtl documentation for new floating point comparisons
I vote for quiet.For three reasons: 1) it matches the expectation of what programmers expect. I view people who do floating point as falling into two categories: a) people who wish that there were no such thing as nans. These people are happy to write programs with just = = == != and then be unhappily surprised if their code blows up. b) careful floating pt programmers. These people never want a signal, they properly choose their operators to with knowledge of how each of those operations work with respect to nans. These people will never use anything like a signaling 2) it matches iso n1778 which is primarily written to satisfy the needs to (b). 3) Whenever you leave something like this undefined, you are basically saying do not optimize On 02/10/2015 04:46 PM, Joseph Myers wrote: On Mon, 9 Feb 2015, Kenneth Zadeck wrote: I don't think it's useful to have the trapping semantics unspecified for a comparison operation like that. So the question is what's most useful for LTGT and LTGT_EXPR to do (presumably they should do the same thing). Lots of existing code in this area seems confused (for example, HONOR_SNANS should be irrelevant to reversing an equality comparison, as reversing it will change neither results nor exceptions raised, whether or not signaling NaNs are involved). But maybe more code treats LTGT as a signaling operation than otherwise, in accordance with the original intent, and so that's the most appropriate way to document it (with !UNEQ being the representation of the corresponding quiet operation). section 7.12.14.4 in C99 seems pretty much to say that this is a quiet operation. It says islessgreater is quiet. It says nothing about the LTGT RTL operation or the LTGT_EXPR GIMPLE/GENERIC operation. __builtin_islessgreater is implemented using UNEQ_EXPR not LTGT_EXPR. It may make sense to define LTGT as exactly !UNEQ, and so quiet, but the choice of definition is a matter of what's convenient for the implementation (and which choice you make determines which existing code in GCC should be considered incorrect). Where back ends implement ltgt patterns, I don't know if they are consistently quiet or signaling.
Re: [PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)
On Wed, Feb 11, 2015 at 09:44:24AM -0800, Mike Stump wrote: On Feb 11, 2015, at 4:24 AM, Marek Polacek pola...@redhat.com wrote: The following patch splits the test into C and C++ test cases, so hopefully fixing the issue. Ok for trunk? 2015-02-11 Marek Polacek pola...@redhat.com * g++.dg/ubsan/shift-1.C: New test. * gcc.dg/ubsan/c-shift-2.c: New test. * c-c++-common/ubsan/shift-5.c: Remove file. diff --git gcc/testsuite/g++.dg/ubsan/shift-1.C gcc/testsuite/g++.dg/ubsan/shift-1.C index e69de29..8a71279 100644 --- gcc/testsuite/g++.dg/ubsan/shift-1.C +++ gcc/testsuite/g++.dg/ubsan/shift-1.C @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-options -fsanitize=shift -w } */ +/* { dg-shouldfail ubsan } */ + +int +foo (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case 1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 11 } */ Never include line numbers, unless there is no other way. Here, I think you can drop it, and merely ensure this is on the right line? An example from gcc.dg: int g2(int a; __attribute__((unused))); /* { dg-error just a forward declaration no parms { xfail *-*-* } } */ I’m hoping that style will work. Yeah, it works, I don't know why I changed it for the C test case, but not for the C++ one. Ok with the above fixes, if applicable. Thanks. 2015-02-11 Marek Polacek pola...@redhat.com * g++.dg/ubsan/shift-1.C: New test. * gcc.dg/ubsan/c-shift-2.c: New test. * c-c++-common/ubsan/shift-5.c: Remove file. diff --git gcc/testsuite/g++.dg/ubsan/shift-1.C gcc/testsuite/g++.dg/ubsan/shift-1.C index e69de29..05e049e 100644 --- gcc/testsuite/g++.dg/ubsan/shift-1.C +++ gcc/testsuite/g++.dg/ubsan/shift-1.C @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options -fsanitize=shift -w } */ +/* { dg-shouldfail ubsan } */ + +int +foo (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case 1 -1: /* { dg-error is not a constant expression { xfail { *-*-* } } } */ +case -1 -1: /* { dg-error is not a constant expression { xfail { *-*-* } } } */ +case 1 -1: /* { dg-error is not a constant expression { xfail { *-*-* } } } */ +case -1 -1: /* { dg-error is not a constant expression { xfail { *-*-* } } } */ + return 1; +} + return 0; +} + +int +bar (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case -1 200: /* { dg-error is not a constant expression { xfail { *-*-* } } } */ +case 1 200: /* { dg-error is not a constant expression { xfail { *-*-* } } } */ + return 1; +} + return 0; +} diff --git gcc/testsuite/gcc.dg/ubsan/c-shift-2.c gcc/testsuite/gcc.dg/ubsan/c-shift-2.c index e69de29..beb0dbe 100644 --- gcc/testsuite/gcc.dg/ubsan/c-shift-2.c +++ gcc/testsuite/gcc.dg/ubsan/c-shift-2.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options -fsanitize=shift -w } */ +/* { dg-shouldfail ubsan } */ + +int +foo (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case 1 -1: /* { dg-error case label does not reduce to an integer constant } */ +case -1 -1: /* { dg-error case label does not reduce to an integer constant } */ +case 1 -1: /* { dg-error case label does not reduce to an integer constant } */ +case -1 -1: /* { dg-error case label does not reduce to an integer constant } */ + return 1; +} + return 0; +} + +int +bar (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case -1 200: /* { dg-error case label does not reduce to an integer constant } */ +case 1 200: /* { dg-error case label does not reduce to an integer constant } */ + return 1; +} + return 0; +} Marek
Re: [PATCH][AArch64] Testcase fix for __ATOMIC_CONSUME
On Mon, 2015-02-09 at 09:10 -0800, Mike Stump wrote: On Feb 9, 2015, at 7:11 AM, Alex Velenko alex.vele...@arm.com wrote: The following patch makes atomic-op-consume.c XFAIL Is this patch ok? Ok. I’d shorten the comment above the xfail to be exceedingly short: /* PR59448 consume not implemented yet */ The reason is the brain can process this about 8x faster. Also, one can cut and paste the PR part into a web browser directly, or, if you have an electric bugzilla mode for emacs, it will pop right up. */ Given the discussions we had in ISO C++ SG1, it seems the only way to fix memory_order_consume is to deprecate it (or let it rot forever), and add something else to the standard. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4321.pdf IOW, I believe the promotion is here to stay. I'm not aware of any other implementation doing something else. Thus, XFAIL doesn't seem right to me.
[PATCH][PR target/63347][P3 Regression] Fix SCHED_GROUP_P handling in haifa-sched.c
PR 63347 is a case where we can end up scheduling an insn into the middle of a scheduling group. Basically given two consecutive insns A B, if B has SCHED_GROUP_P set, then it must always be immediately after A. The problem we ran into in this PR was that there was a cost to issue B, so it got delayed several cycles. While B was delayed an earlier, unrelated insn Z which had been delayed earlier became ready and issued. The result being A Z B which ultimately led to incorrect code generation. The fix is actually quite simple. In prune_ready_list, we queue insns that have had their dependencies satisfied, but have a cost (say for example, they're waiting for a functional unit to become available). We want to queue based on that cost, with one exception. If the insn to be queued is part of a scheduling group, then we want to queue it with cost 1. That ensures that on each cycle, any SCHED_GROUP_P insn that has its dependencies satisfied is in the ready list. Once we do that, everything is in place to assure that nothing issues until after the SCHED_GROUP_P insn issues. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Also verified the testcase works correctly on m68k-unknown-linux. Installed on the trunk. commit 07e421b3fbc2d29c9636200906c3edf3cbd33499 Author: Jeff Law l...@redhat.com Date: Wed Feb 11 16:19:05 2015 -0700 PR target/63347 * haifa-sched.c (prune_ready_list): If we have a SCHED_GROUP_P insn that needs to be queued, just queue it for a single cycle. PR target/63347 * gcc.target/m68k/pr63347.c: New test. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index d895209..c9ac045 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2015-02-11 Jeff Law l...@redhat.com + + PR target/63347 + * haifa-sched.c (prune_ready_list): If we have a SCHED_GROUP_P insn + that needs to be queued, just queue it for a single cycle. + 2015-02-11 Jan Hubicka hubi...@ucw.cz * ipa.c (symbol_table::remove_unreachable_nodes): Avoid releasing diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 75d2421..64c8c9c1 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -6291,7 +6291,15 @@ prune_ready_list (state_t temp_state, bool first_cycle_insn_p, if (SCHED_GROUP_P (insn) cost min_cost_group) min_cost_group = cost; ready_remove (ready, i); - queue_insn (insn, cost, reason); + /* Normally we'd want to queue INSN for COST cycles. However, +if SCHED_GROUP_P is set, then we must ensure that nothing +else comes between INSN and its predecessor. If there is +some other insn ready to fire on the next cycle, then that +invariant would be broken. + +So when SCHED_GROUP_P is set, just queue this insn for a +single cycle. */ + queue_insn (insn, SCHED_GROUP_P (insn) ? 1 : cost, reason); if (i + 1 n) break; } diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index c62cc23..ee5be51 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2015-02-11 Jeff Law l...@redhat.com + + PR target/63347 + * gcc.target/m68k/pr63347.c: New test. + 2015-02-11 Marek Polacek pola...@redhat.com * g++.dg/ubsan/shift-1.C: New test. diff --git a/gcc/testsuite/gcc.target/m68k/pr63347.c b/gcc/testsuite/gcc.target/m68k/pr63347.c new file mode 100644 index 000..1d23e9a --- /dev/null +++ b/gcc/testsuite/gcc.target/m68k/pr63347.c @@ -0,0 +1,46 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -mcpu=5208 } */ + +#include stdlib.h + +void __attribute__ ((noinline)) +oof() +{ + asm volatile ( ::: memory); +} +int print_info(unsigned int *ip_addr) +{ +int invalid = 0; + +if (ip_addr) { +unsigned int haddr = *ip_addr; +oof(stuff); +if (0x0 == haddr) { +invalid = 1; +} +oof(stuff2); +} else { +invalid = 1; +} + +return invalid; +} + +int main(int argc, char *argv[]) +{ +unsigned int myaddr; +int ret; + +myaddr = 0x0; +ret = print_info(myaddr); +if (!ret) +abort (); + +myaddr = 0x01020304; +ret = print_info(myaddr); +if (ret) +abort (); +exit (0); +} + +
RE: [PATCH, FT32] initial support
+@table @gcctabopt + +@item -mspace +@opindex mspace +Enable code-size optimizations. +Some of these optimizations incur a minor performance penalty. We already have -Os, so why is an architecture-specific option for this needed? Code compiled with -mspace is somewhat slower than code without. So we typically build *all* code with -Os, with everything non-critical also compiled -mspace. Is this a legitimate option or should I just use -Os? -- James Bowman FTDI Open Source Liaison
Re: [RFC testsuite] Fix PR64850, tweak acc_on_device* tests
Thomas Schwinge tho...@codesourcery.com wrote: To resolve the immediate problem: is my approval enough for Kaz to commit the patch, or does that need a more authoritative approval? I'd like to commit my patch as a quick fix in a few days if no one objects. I think I copied this approach from some other test case (but don't remember which). In the current set of tests, we need to verify that the acc_on_device library function is called 0, 1, or 4 times (see below). For example, for gcc.dg/goacc/acc_on_device-1.c we'Ve got: $ build-gcc/gcc/xgcc -Bbuild-gcc/gcc/ source-gcc/gcc/testsuite/gcc.dg/goacc/acc_on_device-1.c -fopenacc -O -std=c89 -Wno-implicit-function-declaration -S -fpic -mcmodel=large -o acc_on_device-1.s $ grep acc_on_device acc_on_device-1.s .file acc_on_device-1.c movabsq $acc_on_device@PLTOFF, %rdx movabsq $acc_on_device@PLTOFF, %rdx movabsq $acc_on_device@PLTOFF, %rdx movabsq $acc_on_device@PLTOFF, %rdx Isn't it even more fragile to scan here for acc_on_device being called four times compared to the -fdump-rtl-expand dump? Or should I split up the four tests into four separate files? (I guess I lack knowledge of the best approach for doing such a thing in the GCC testsuite.) Another example with the asm output of m68k compiler: lea acc_on_device,%a2 jsr (%a2) ... jsr (%a2) ... jsr (%a2) ... jsr (%a2) for -fopenacc -O -fno-openacc acc_on_device-1.c. It seems that getting the number of calls for the specific function isn't easy with the asm output on some targets. Regards, kaz
Re: Add CFLAGS_FOR_TARGET to Ada OS Constant Extraction Process
On 2/11/2015 1:22 AM, Arnaud Charlet wrote: When building Ada for RTEMS, you need to pass in the CFLAGS_FOR_TARGET to see OS specific .h files. This was already done for the RTS but needed to be added to the process which extracts value settings from the .h files. This patch is against 4.9. OK to apply to 4.8, 4.9, and head assuming it applies cleanly? This patch won't apply cleanly for head I think, where I suspect at least part of your patch is OBE. OK. Rebuilding on the head. It would be better to first have a patch for trunk (if still relevant) and then backport it to the other branches, since changes on trunk have already been done in a different way to address at least partly this issue (e.g. use of GNATLIBCFLAGS_FOR_C in OSCONS_CPP, different computation of OSCONS_CC). I am happy to do and from your description it sounds like the it would have built if the code had my previous patch to add CFLAGS_FOR_TARGET to the GNATLIBCFLAGS. GNATLIBCFLAGS_FOR_C = -W -Wall $(GNATLIBCFLAGS) $(CFLAGS_FOR_TARGET) \ -fexceptions -DIN_RTS -DHAVE_GETIPINFO But the build didn't get far enough to know. It fails before that with: gcc -c -g -O2 -gnatpg -gnata -W -Wall -nostdinc -I- -I. -Iada/generated -Iada -I/users/joel/test-gcc/gcc/gcc/ada -I/users/joel/test-gcc/gcc/gcc/ada/gcc-interface /users/joel/test-gcc/gcc/gcc/ada/a-ioexce.ads -o ada/a-ioexce.o a-ioexce.ads:21:19: violation of restriction No_Elaboration_Code at system.ads:53 a-ioexce.ads:22:19: violation of restriction No_Elaboration_Code at system.ads:53 a-ioexce.ads:23:19: violation of restriction No_Elaboration_Code at system.ads:53 a-ioexce.ads:24:19: violation of restriction No_Elaboration_Code at system.ads:53 a-ioexce.ads:25:19: violation of restriction No_Elaboration_Code at system.ads:53 a-ioexce.ads:26:19: violation of restriction No_Elaboration_Code at system.ads:53 a-ioexce.ads:27:19: violation of restriction No_Elaboration_Code at system.ads:53 a-ioexce.ads:28:19: violation of restriction No_Elaboration_Code at system.ads:53 Any ideas? My native compiler is from the head back in August. I will rebuild it overnight and try again in the morning in case that's a factor. Arno -- Joel Sherrill, Ph.D. Director of Research Development joel.sherr...@oarcorp.comOn-Line Applications Research Ask me about RTEMS: a free RTOS Huntsville AL 35805 Support Available(256) 722-9985
Re: [PATCH PR64705]Don't aggressively expand induction variable's base
On Wed, Feb 11, 2015 at 7:24 PM, Richard Biener richard.guent...@gmail.com wrote: On Wed, Feb 11, 2015 at 9:23 AM, Bin.Cheng amker.ch...@gmail.com wrote: On Tue, Feb 10, 2015 at 12:24 AM, Richard Biener richard.guent...@gmail.com wrote: Previously, the computation of _1174 can be replaced by _629 in bb8 in DOM2 pass, while it can't after patching. This is the only possible regression that I can see on TREE level. There is another difference but not regression on TREE. Seems real change happens on RTL pre with different register uses in the input. I failed to go further or extract a test case, it's just very volatile. Well, I can see what is happening and indeed we shouldn't blame the patch for this. With all of this, I guess this patch shouldn't be blamed for this. I also wonder if the PR should be fixed in this way since the patch definitely is a corner case. It might not fix the real issue (whatever that is), but not making IVOPTs (or tree-affines) life harder is very good (I believe I have seen this kind of issue as well). I guess IV's base is expanded because we want to explore more CSE opportunities? I suspect this doesn't work very well because of two reasons: 1) overload the tree-affine facility; 2) weak IV rewriting capacity in GCC (for example, mess up loop variant/invariant part expressions). I will do experiments on this. As for the patch itself, I collected instrumental data from GCC bootstrap and Spec2k6 compilation. Can confirm in most cases (bootstrap 99.9%, spec2k6 99.1%), there is only one ssa name in IV's step. So I do think that the patch is fine. Just seen the known-to-work GCC 3.4 version so it's even a regression Here is the refined patch according to your comments. It passes bootstrap and test on x86_64. Thanks, bin 2015-02-12 Bin Cheng bin.ch...@arm.com PR tree-optimization/64705 * tree-ssa-loop-niter.h (expand_simple_operations): New parameter. * tree-ssa-loop-niter.c (expand_simple_operations): New parameter. * tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New. (find_bivs, find_givs_in_stmt_scev): Pass new argument to expand_simple_operations. gcc/testsuite/ChangeLog 2015-02-12 Bin Cheng bin.ch...@arm.com PR tree-optimization/64705 * gcc.dg/tree-ssa/pr64705.c: New test. Index: gcc/tree-ssa-loop-niter.c === --- gcc/tree-ssa-loop-niter.c (revision 219574) +++ gcc/tree-ssa-loop-niter.c (working copy) @@ -1552,10 +1552,11 @@ simplify_replace_tree (tree expr, tree old, tree n } /* Expand definitions of ssa names in EXPR as long as they are simple - enough, and return the new expression. */ + enough, and return the new expression. If STOP is specified, stop + expanding if EXPR equals to it. */ tree -expand_simple_operations (tree expr) +expand_simple_operations (tree expr, tree stop) { unsigned i, n; tree ret = NULL_TREE, e, ee, e1; @@ -1575,7 +1576,7 @@ tree for (i = 0; i n; i++) { e = TREE_OPERAND (expr, i); - ee = expand_simple_operations (e); + ee = expand_simple_operations (e, stop); if (e == ee) continue; @@ -1594,7 +1595,8 @@ tree return ret; } - if (TREE_CODE (expr) != SSA_NAME) + /* Stop if it's not ssa name or the one we don't want to expand. */ + if (TREE_CODE (expr) != SSA_NAME || expr == stop) return expr; stmt = SSA_NAME_DEF_STMT (expr); @@ -1614,7 +1616,7 @@ tree src-loop_father != dest-loop_father) return expr; - return expand_simple_operations (e); + return expand_simple_operations (e, stop); } if (gimple_code (stmt) != GIMPLE_ASSIGN) return expr; @@ -1634,7 +1636,7 @@ tree return e; if (code == SSA_NAME) - return expand_simple_operations (e); + return expand_simple_operations (e, stop); return expr; } @@ -1643,7 +1645,7 @@ tree { CASE_CONVERT: /* Casts are simple. */ - ee = expand_simple_operations (e); + ee = expand_simple_operations (e, stop); return fold_build1 (code, TREE_TYPE (expr), ee); case PLUS_EXPR: @@ -1658,7 +1660,7 @@ tree if (!is_gimple_min_invariant (e1)) return expr; - ee = expand_simple_operations (e); + ee = expand_simple_operations (e, stop); return fold_build2 (code, TREE_TYPE (expr), ee, e1); default: Index: gcc/tree-ssa-loop-niter.h === --- gcc/tree-ssa-loop-niter.h (revision 219574) +++ gcc/tree-ssa-loop-niter.h (working copy) @@ -20,7 +20,7 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_TREE_SSA_LOOP_NITER_H #define GCC_TREE_SSA_LOOP_NITER_H -extern tree expand_simple_operations (tree); +extern tree expand_simple_operations (tree, tree = NULL); extern bool loop_only_exit_p (const struct loop *, const_edge); extern
[PATCH] Fix rotate discovery in fold-const.c (PR tree-optimization/65014)
Hi! We pass the values after STRIP_NOPS to *ROTATE_EXPR build, which is undesirable, as the testcase shows, the operand could very well end up being a pointer rather than integer. This patch fixes it by passing the original trees instead. Alternatively we could fold_convert the STRIP_NOPS result to the original type, but I'd be afraid that in the common case that would just create more GC garbage. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-02-12 Jakub Jelinek ja...@redhat.com PR tree-optimization/65014 * fold-const.c (fold_binary_loc): When creating {L,R}ROTATE_EXPR, use original second operand of arg0 or arg1 instead of that adjusted by STRIP_NOPS. * gcc.c-torture/compile/pr65014.c: New test. --- gcc/fold-const.c.jj 2015-01-22 19:30:59.0 +0100 +++ gcc/fold-const.c2015-02-11 15:06:52.628078961 +0100 @@ -10261,7 +10261,9 @@ fold_binary_loc (location_t loc, tem = build2_loc (loc, LROTATE_EXPR, TREE_TYPE (TREE_OPERAND (arg0, 0)), TREE_OPERAND (arg0, 0), - code0 == LSHIFT_EXPR ? tree01 : tree11); + code0 == LSHIFT_EXPR + ? TREE_OPERAND (arg0, 1) + : TREE_OPERAND (arg1, 1)); return fold_convert_loc (loc, type, tem); } else if (code11 == MINUS_EXPR) @@ -10283,7 +10285,8 @@ fold_binary_loc (location_t loc, ? LROTATE_EXPR : RROTATE_EXPR), TREE_TYPE (TREE_OPERAND (arg0, 0)), - TREE_OPERAND (arg0, 0), tree01)); + TREE_OPERAND (arg0, 0), + TREE_OPERAND (arg0, 1))); } else if (code01 == MINUS_EXPR) { @@ -10304,7 +10307,7 @@ fold_binary_loc (location_t loc, ? LROTATE_EXPR : RROTATE_EXPR), TREE_TYPE (TREE_OPERAND (arg0, 0)), - TREE_OPERAND (arg0, 0), tree11)); + TREE_OPERAND (arg0, 0), TREE_OPERAND (arg1, 1))); } } } --- gcc/testsuite/gcc.c-torture/compile/pr65014.c.jj2015-02-11 15:21:19.175681614 +0100 +++ gcc/testsuite/gcc.c-torture/compile/pr65014.c 2015-02-11 15:22:18.040703653 +0100 @@ -0,0 +1,10 @@ +/* PR tree-optimization/65014 */ +/* { dg-do compile { target int32plus } } */ + +extern int x; + +unsigned +foo (unsigned int y) +{ + return (y ((__INTPTR_TYPE__) x)) | (y (32 - ((__INTPTR_TYPE__) x))); +} Jakub
Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
On Wed, Feb 11, 2015 at 4:55 AM, Jeff Law l...@redhat.com wrote: This PR was originally minor issue where we regressed on this kind of sequence: typedef struct toto_s *toto_t; toto_t add (toto_t a, toto_t b) { int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b~1L); return (toto_t)(intptr_t) tmp; } There was talk of trying to peephole this in the x86 backend. But later Jakub speculated that if we had good type narrowing this could be done in the tree optimizers... S, here we go. I didn't do anything with logicals are those are already handled elsewhere in match.pd. I didn't try to handle MULT as in the early experiments I did, it was a lose because of the existing mechanisms for widening multiplications. Interestingly enough, this patch seems to help out libjava more than anything else in a GCC build and it really only helps a few routines. There weren't any routines I could see where the code regressed after this patch. This is probably an indicator that these things aren't *that* common, or the existing shortening code better than we thought, or some important shortening case is missing. Cool that we are trying to simplify type conversion using generic match facility. I have thought about type promotion in match.pd too. For example, (unsigned long long)(unsigned long)(int_expr), if we can prove int_expr is always positive (in my case, this is from vrp information), then the first conversion can be saved. This is another way for (and related? I didn't look at the code) the sign/zero extension elimination work using VRP I suppose? Thanks, bin I think we should pull the other tests from 47477 which are not regressions out into their own bug for future work. Or alternately, when this fix is checked in remove the regression marker in 47477. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for the trunk? diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 7f3816c..7a95029 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2015-02-10 Jeff Law l...@redhat.com + + * match.pd (convert (plus/minus (convert @0) (convert @1): New + simplifier to narrow arithmetic. + 2015-02-10 Richard Biener rguent...@suse.de PR tree-optimization/64909 diff --git a/gcc/match.pd b/gcc/match.pd index 81c4ee6..abc703e 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3. If not see (logs (pows @0 @1)) (mult @1 (logs @0) +/* If we have a narrowing conversion of an arithmetic operation where + both operands are widening conversions from the same type as the outer + narrowing conversion. Then convert the innermost operands to a suitable + unsigned type (to avoid introducing undefined behaviour), perform the + operation and convert the result to the desired type. + + This narrows the arithmetic operation. */ +(for op (plus minus) + (simplify +(convert (op (convert@2 @0) (convert @1))) +(if (TREE_TYPE (@0) == TREE_TYPE (@1) + TREE_TYPE (@0) == type + INTEGRAL_TYPE_P (type) + TYPE_PRECISION (TREE_TYPE (@2)) TYPE_PRECISION (TREE_TYPE (@0)) +/* This prevents infinite recursion. */ + unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2)) + (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); } +(convert (op (convert:utype @0) (convert:utype @1))) diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 15d5e2d..76e5254 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2015-02-10 Jeff Law l...@redhat.com + + PR rtl-optimization/47477 + * gcc.dg/tree-ssa/narrow-arith-1.c: New test. + 2015-02-10 Richard Biener rguent...@suse.de PR tree-optimization/64909 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c new file mode 100644 index 000..104cb6f5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c @@ -0,0 +1,22 @@ +/* PR tree-optimization/47477 */ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-optimized -w } */ +/* { dg-require-effective-target ilp32 } */ + +typedef int int64_t __attribute__ ((__mode__ (__DI__))); +typedef int * intptr_t; + +typedef struct toto_s *toto_t; +toto_t add (toto_t a, toto_t b) { + int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b~1L); + return (toto_t)(intptr_t) tmp; +} + +/* For an ILP32 target there'll be 6 casts when we start, but just 4 + if the match.pd pattern is successfully matched. */ +/* { dg-final { scan-tree-dump-times = \\(int\\) 1 optimized } } */ +/* { dg-final { scan-tree-dump-times = \\(unsigned int\\) 2 optimized } } */ +/* { dg-final { scan-tree-dump-times = \\(struct toto_s \\*\\) 1 optimized } } */ +/* { dg-final { cleanup-tree-dump optimized } } */ + +
[PATCH] Fix UBSAN_OBJECT_SIZE lowering (PR sanitizer/65019)
Hi! Similar problem to the recently fixed UBSAN_VPTR lowering, ubsan_expand_objsize_ifn sets *gsi to the first stmt in a new bb after splitting block after UBSAN_OBJECT_SIZE, which is the next stmt that should be processed, so we should always return no_next = true to avoid gsi_next on it before it will be processed. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-02-12 Jakub Jelinek ja...@redhat.com PR sanitizer/65019 * ubsan.c (ubsan_expand_objsize_ifn): Always return true. * g++.dg/ubsan/pr65019.C: New test. --- gcc/ubsan.c.jj 2015-02-10 22:58:55.0 +0100 +++ gcc/ubsan.c 2015-02-11 17:20:36.313063536 +0100 @@ -987,7 +987,7 @@ ubsan_expand_objsize_ifn (gimple_stmt_it /* Get rid of the UBSAN_OBJECT_SIZE call from the IR. */ unlink_stmt_vdef (stmt); gsi_remove (gsi_orig, true); - return gsi_end_p (*gsi); + return true; } /* Cached __ubsan_vptr_type_cache decl. */ --- gcc/testsuite/g++.dg/ubsan/pr65019.C.jj 2015-02-11 17:26:44.832959016 +0100 +++ gcc/testsuite/g++.dg/ubsan/pr65019.C2015-02-11 17:26:23.0 +0100 @@ -0,0 +1,24 @@ +// PR sanitizer/65019 +// { dg-do compile } +// { dg-options -fsanitize=alignment,object-size,vptr -std=c++11 -O2 -fcompare-debug } + +struct A { }; +struct B { }; +struct C final { + C (const A , int); + static B *foo (const A , int = 1); + virtual ~C (); + void *c; +}; + +B * +C::foo (const A x, int y) +{ + C *d = new C (x, y); + if (d-c == nullptr) +delete d; +} + +C::~C () +{ +} Jakub
Re: [PATCH/AARCH64] Fix 64893: ICE with vget_lane_u32 with C++ front-end at -O0
On 8 February 2015 at 02:24, Andrew Pinski pins...@gmail.com wrote: Here is the updated patch with Jakub's comments included and added a testcase for the 0, 0 case. Thanks, Andrew Pinski ChangeLog: PR target/64893 * config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins): Change the first argument type to size_type_node and add another size_type_node. (aarch64_simd_expand_builtin): Handle the new argument to AARCH64_SIMD_BUILTIN_LANE_CHECK and don't ICE but rather print an out when the first two arguments are not nonzero integer constants. * config/aarch64/arm_neon.h (__AARCH64_LANE_CHECK): Pass the sizeof directly to __builtin_aarch64_im_lane_boundsi. testsuite/ChangeLog: * c-c++-common/torture/aarch64-vect-lane-1.c: New testcase. * c-c++-common/torture/aarch64-vect-lane-2.c: New testcase. OK, thanks /Marcus
Re: [PATCH PR64705]Don't aggressively expand induction variable's base
On Tue, Feb 10, 2015 at 12:24 AM, Richard Biener richard.guent...@gmail.com wrote: On February 9, 2015 11:09:49 AM CET, Bin Cheng bin.ch...@arm.com wrote: Did you extract a testcase for it? Note that the IV step itself may be expanded Too much. I looked into the regression and thought it was because of passes after IVOPT. The IVOPT dump is at least not worse than the original version. But different I presume. So how did you figure it regressed? The tree level dump is like, ---Without patch +++ With patch bb 8: _618 = MAX_EXPR n_105, 0; _619 = (integer(kind=8)) _618; _629 = (unsigned long) _618; .. bb 37: _257 = _619 + 1; _255 = (sizetype) _257; _261 = _255 * 8; _256 = (sizetype) _140; - _1174 = (sizetype) _618; + _1174 = (sizetype) _619; _1175 = _256 + _1174; _1176 = _1175 * 8; _1177 = _148 + _1176; ivtmp.3849_258 = (unsigned long) _1177; _1179 = (unsigned int) _104; Previously, the computation of _1174 can be replaced by _629 in bb8 in DOM2 pass, while it can't after patching. This is the only possible regression that I can see on TREE level. There is another difference but not regression on TREE. Seems real change happens on RTL pre with different register uses in the input. I failed to go further or extract a test case, it's just very volatile. With all of this, I guess this patch shouldn't be blamed for this. I also wonder if the PR should be fixed in this way since the patch definitely is a corner case. Thanks, bin Thanks, Richard. Bootstrap and test on x86_64 and AArch64, so is it OK? 2015-02-09 Bin Cheng bin.ch...@arm.com PR tree-optimization/64705 * tree-ssa-loop-niter.h (expand_simple_operations): New parameter. * tree-ssa-loop-niter.c (expand_simple_operations): New parameter. (tree_simplify_using_condition_1, refine_bounds_using_guard) (number_of_iterations_exit): Pass new argument to expand_simple_operations. * tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New. (find_bivs, find_givs_in_stmt_scev): Pass new argument to expand_simple_operations. Call extract_single_var_from_expr. (difference_cannot_overflow_p): Pass new argument to expand_simple_operations. gcc/testsuite/ChangeLog 2015-02-09 Bin Cheng bin.ch...@arm.com PR tree-optimization/64705 * gcc.dg/tree-ssa/pr64705.c: New test.
Re: [Haifa Scheduler] Fix latent bug in macro-fusion/instruction grouping
On Mon, Feb 09, 2015 at 11:16:56PM +, Jeff Law wrote: On 02/06/15 05:24, James Greenhalgh wrote: --- 2015-02-06 James Greenhalgh james.greenha...@arm.com * haifa-sched.c (recompute_todo_spec): After applying a replacement and cancelling a dependency, also clear the SCHED_GROUP_P flag. My worry here would be that we might be clearing a SCHED_GROUP_P that had been set for some reason other than macro-fusion. Yeah, I also had this worry. This patch tackles the problem from the other direction. If we see a SCHED_GROUP_P on an insn, treat it as a hard dependency, and don't try to rewrite it. I think this will always be safe but it might pessimize if the dependency breaker would have resulted in better code generation. I don't think this gives you anything towards fixing your bug, but it clears mine. I've bootstrapped and tested on x86_64-unknown-linux-gnu with no issues and given it a quick check on the problem code from before, where it has the desired impact. Thanks, James --- 2015-02-10 James Greenhalgh james.greenha...@arm.com * haifa-sched.c (recompute_todo_spec): Treat SCHED_GROUP_P as forcing a HARD_DEP between instructions, thereby disallowing rewriting to break dependencies. diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 75d2421..06444de 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -1233,6 +1233,11 @@ recompute_todo_spec (rtx_insn *next, bool for_backtrack) if (!sd_lists_empty_p (next, SD_LIST_HARD_BACK)) return HARD_DEP; + /* If NEXT is intended to sit adjacent to this instruction, we don't + want to try to break any dependencies. Treat it as a HARD_DEP. */ + if (SCHED_GROUP_P (next)) +return HARD_DEP; + /* Now we've got NEXT with speculative deps only. 1. Look at the deps to see what we have to do. 2. Check if we can do 'todo'. */
Re: [PATCH] Fix x86 #pragma GCC target and target attribute handling (PR target/61925)
On Wed, Jan 28, 2015 at 9:05 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! This patch rewrites the target pragma and target attribute handling in the i386 backend, so that outside of functions global_options and target globals reflect the currently active #pragma GCC target (if none active, obviously the default options) and inside of functions (in between set_cfun to that function and set_cfun to another function or NULL) the target state of the function. Without this patch, that state often was wherever last set_cfun left it (consider e.g. function with target attribute inside of #pragma GCC target region), and furthermore #pragma GCC target without any function definition before #pragma GCC pop_options confused it altogether, plus there has been confusion between when should be the default options active and when should be the currently active #pragma GCC target activated. The last hunk in i386.c also attempts to save some compile time cycles, I think it is pointless to add currently active #pragma GCC target as attributes to all the target builtins that are activated in the call, we certainly don't do that e.g. for target attribute. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-28 Jakub Jelinek ja...@redhat.com PR target/61925 * config/i386/i386.c (ix86_reset_to_default_globals): Removed. (ix86_reset_previous_fndecl): Restore it here, unconditionally. (ix86_set_current_function): Rewritten. (ix86_add_new_builtins): Temporarily clear current_target_pragma when creating builtin fndecls. * gcc.target/i386/pr61925-1.c: New test. * gcc.target/i386/pr61925-2.c: New test. * gcc.target/i386/pr61925-3.c: New test. LGTM as far as I can review this, so if there are no objections from Honza, the patch is OK. Thanks, Uros.
Re: [Patch Testsuite] XFAIL gfortran.dg/pr45636.f90 on AArch64
On 10 February 2015 at 16:06, James Greenhalgh james.greenha...@arm.com wrote: Hi, As is already done for mips and hppa, we should XFAIL this test on AArch64 as we don't currently use the store_by_pieces infrastructure. We may in future want to tweak this, but for GCC 5.0 the safe thing to do is just to XFAIL the test. Tested with aarch64-none-elf. OK? Ok, with Mike's comment addressed... /Marcus Thanks, James --- 2015-02-10 James Greenhalgh james.greenha...@arm.com * gfortran.dg/pr45636.f90: XFAIL for aarch64* targets.
Re: [PATCH] Fix PR64764 Fix scan-tree-dump for scop-19.c
Le 10 févr. 2015 à 12:42, Tom de Vries tom_devr...@mentor.com a écrit : I think we need to understand first what's going on. Sure, my patch was mainly to silence the failures on my working tree. In both test-cases, on Linux with -fpic the inlining of one function into the other is not done, because we cannot be sure the function call in one function binds to the other function at runtime. I don’t understand why -fpic should change the inlining decision (except register pressure and so on). So, is the inlining happening with -fpic for Darwin? If so, is that correct? The inlining is happening for darwin. I don’t know if that is correct or not. The test has been introduced at r143947 (Thu Feb 5 00:29:48 2009) and the only failures I have seen are due to r 220532 (checked back to r154691). Dominique Thanks, - Tom
Re: [Patch, fortran] PR64932 [4.9/5 Regression] ICE in gfc_conv_descriptor_data_get for generated finalizer
Dear Paul, dear all, first, sorry for the belate review. Paul Thomas wrote: This is a slight development of the patch posted on the PR itself. class.c(finalize_component) is not able to deal correctly with non-allocatable, derived type array components that have allocatable components. Rather than generating loops in finalize_component, the condition is detected in trans-stmt.c(gfc_trans_deallocate) and gfc_deallocate_alloc_comp is called after obtaining the derived type for the array and checking that it is not finalizable. Happily, this fix does not generate the error: Error: Two or more part references with nonzero rank must not be specified at (1) which occurs if the code is written explicitly. Bootstraps and regtests on FC21/x86_64 OK for trunk and 4.9? OK. I think the patch should be okay - even if the code is not really beautiful. (Fault of our data representation not of the patch itself.) Cheers, Tobias
Re: [PATCH PR64705]Don't aggressively expand induction variable's base
On Wed, Feb 11, 2015 at 9:23 AM, Bin.Cheng amker.ch...@gmail.com wrote: On Tue, Feb 10, 2015 at 12:24 AM, Richard Biener richard.guent...@gmail.com wrote: On February 9, 2015 11:09:49 AM CET, Bin Cheng bin.ch...@arm.com wrote: Did you extract a testcase for it? Note that the IV step itself may be expanded Too much. I looked into the regression and thought it was because of passes after IVOPT. The IVOPT dump is at least not worse than the original version. But different I presume. So how did you figure it regressed? The tree level dump is like, ---Without patch +++ With patch bb 8: _618 = MAX_EXPR n_105, 0; _619 = (integer(kind=8)) _618; _629 = (unsigned long) _618; .. bb 37: _257 = _619 + 1; _255 = (sizetype) _257; _261 = _255 * 8; _256 = (sizetype) _140; - _1174 = (sizetype) _618; + _1174 = (sizetype) _619; _1175 = _256 + _1174; _1176 = _1175 * 8; _1177 = _148 + _1176; ivtmp.3849_258 = (unsigned long) _1177; _1179 = (unsigned int) _104; Previously, the computation of _1174 can be replaced by _629 in bb8 in DOM2 pass, while it can't after patching. This is the only possible regression that I can see on TREE level. There is another difference but not regression on TREE. Seems real change happens on RTL pre with different register uses in the input. I failed to go further or extract a test case, it's just very volatile. Well, I can see what is happening and indeed we shouldn't blame the patch for this. With all of this, I guess this patch shouldn't be blamed for this. I also wonder if the PR should be fixed in this way since the patch definitely is a corner case. It might not fix the real issue (whatever that is), but not making IVOPTs (or tree-affines) life harder is very good (I believe I have seen this kind of issue as well). So I do think that the patch is fine. Just seen the known-to-work GCC 3.4 version so it's even a regression Thanks, Richard. Thanks, bin Thanks, Richard. Bootstrap and test on x86_64 and AArch64, so is it OK? 2015-02-09 Bin Cheng bin.ch...@arm.com PR tree-optimization/64705 * tree-ssa-loop-niter.h (expand_simple_operations): New parameter. * tree-ssa-loop-niter.c (expand_simple_operations): New parameter. (tree_simplify_using_condition_1, refine_bounds_using_guard) (number_of_iterations_exit): Pass new argument to expand_simple_operations. * tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New. (find_bivs, find_givs_in_stmt_scev): Pass new argument to expand_simple_operations. Call extract_single_var_from_expr. (difference_cannot_overflow_p): Pass new argument to expand_simple_operations. gcc/testsuite/ChangeLog 2015-02-09 Bin Cheng bin.ch...@arm.com PR tree-optimization/64705 * gcc.dg/tree-ssa/pr64705.c: New test.
[PATCH][1/n] Fix PR65015
The following drops tempfile names from the DW_AT_producer string resulting from -fltrans-output-list and -fresolution. Applied as obvious. Richard. 2015-02-11 Richard Biener rguent...@suse.de PR lto/65015 * dwarf2out.c (gen_producer_string): Drop -fltrans-output-list and -fresolution. Index: gcc/dwarf2out.c === *** gcc/dwarf2out.c (revision 220578) --- gcc/dwarf2out.c (working copy) *** gen_producer_string (void) *** 19651,19656 --- 19651,19658 case OPT_nostdinc: case OPT_nostdinc__: case OPT_fpreprocessed: + case OPT_fltrans_output_list_: + case OPT_fresolution_: /* Ignore these. */ continue; default:
Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
On Tue, Feb 10, 2015 at 9:55 PM, Jeff Law l...@redhat.com wrote: This PR was originally minor issue where we regressed on this kind of sequence: typedef struct toto_s *toto_t; toto_t add (toto_t a, toto_t b) { int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b~1L); return (toto_t)(intptr_t) tmp; } There was talk of trying to peephole this in the x86 backend. But later Jakub speculated that if we had good type narrowing this could be done in the tree optimizers... S, here we go. I didn't do anything with logicals are those are already handled elsewhere in match.pd. I didn't try to handle MULT as in the early experiments I did, it was a lose because of the existing mechanisms for widening multiplications. Interestingly enough, this patch seems to help out libjava more than anything else in a GCC build and it really only helps a few routines. There weren't any routines I could see where the code regressed after this patch. This is probably an indicator that these things aren't *that* common, or the existing shortening code better than we thought, or some important shortening case is missing. I think we should pull the other tests from 47477 which are not regressions out into their own bug for future work. Or alternately, when this fix is checked in remove the regression marker in 47477. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for the trunk? diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 7f3816c..7a95029 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2015-02-10 Jeff Law l...@redhat.com + + * match.pd (convert (plus/minus (convert @0) (convert @1): New + simplifier to narrow arithmetic. + Please reference PR47477 from here 2015-02-10 Richard Biener rguent...@suse.de PR tree-optimization/64909 diff --git a/gcc/match.pd b/gcc/match.pd index 81c4ee6..abc703e 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3. If not see (logs (pows @0 @1)) (mult @1 (logs @0) +/* If we have a narrowing conversion of an arithmetic operation where + both operands are widening conversions from the same type as the outer + narrowing conversion. Then convert the innermost operands to a suitable + unsigned type (to avoid introducing undefined behaviour), perform the + operation and convert the result to the desired type. + + This narrows the arithmetic operation. */ This is also a part of what shorten_binary_op does, so please refer to that in the comment so we can eventually complete this from there and remove shorten_binary_op. +(for op (plus minus) + (simplify +(convert (op (convert@2 @0) (convert @1))) +(if (TREE_TYPE (@0) == TREE_TYPE (@1) + TREE_TYPE (@0) == type Otherwhere in match.pd we use (GIMPLE types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))) || (GENERIC TREE_TYPE (@0) == TREE_TYPE (@1)) for type equality checks. And I think even for GENERIC you want to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == TYPE_MAIN_VARAINT (TREE_TYPE (@1)). + INTEGRAL_TYPE_P (type) So instead of testing TREE_TYPE (@0) == type we probably want to just assert that TYPE_PRECISON (TREE_TYPE (@0)) == TYPE_PRECISION (type) to allow sign-changes. Say for short i, j; (unsigned short) (i + j) we still want to narrow the widened i and j. + TYPE_PRECISION (TREE_TYPE (@2)) TYPE_PRECISION (TREE_TYPE (@0)) +/* This prevents infinite recursion. */ + unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2)) How can it recurse with the type precision constraint right above? If you think it's necessary then nesting that condition below as + (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); } (if (utype != TREE_TYPE (@2)) avoids evaluating unsigned_type_for twice. Note that technically we don't need to perform the operation in an unsigned type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)). Thus (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)) (convert (op @0 @1))) (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); } (convert (op (convert:utype @0) (convert:utype @1)) You can see that GCC exploits that with -fwrapv. Maybe this simplification should be split out into a separate pattern though, tacking sign-changes for binary ops only. Thanks, Richard. +(convert (op (convert:utype @0) (convert:utype @1))) diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 15d5e2d..76e5254 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2015-02-10 Jeff Law l...@redhat.com + + PR rtl-optimization/47477 + * gcc.dg/tree-ssa/narrow-arith-1.c: New test. + 2015-02-10 Richard Biener rguent...@suse.de PR tree-optimization/64909 diff --git
Re: [wwwdocs] Some stuff for porting to
On Tue, Feb 10, 2015 at 09:56:48PM +, Joseph Myers wrote: __FUNCTION__ and __func__ aren't macros (essentially they're built-in variables). Oops, I must have been thinking of __STDC_VERSION__ when writing that. Fixed, thanks. Index: porting_to.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/porting_to.html,v retrieving revision 1.3 diff -u -r1.3 porting_to.html --- porting_to.html 10 Feb 2015 11:12:20 - 1.3 +++ porting_to.html 11 Feb 2015 11:03:54 - @@ -251,6 +251,99 @@ b style='color:lime'^/b /pre +h4code__STDC_VERSION__/code macro/h4 + +pIn the C11 mode, the code__STDC_VERSION__/code standard macro, +introduced in C95, is now defined to code201112L/code. Typically, +this macro is used as in the following:/p + +precode + #if !defined __STDC_VERSION__ || __STDC_VERSION__ lt; 199901L +/* ... */ + #else + # include lt;stdint.hgt; + #endif +/code/pre + +pYou can check the macro using codegcc -dM -E -std=gnu11 - lt; /dev/null | grep STDC_VER/code./p + +h4Different meaning of the code%a *scanf/code modifier/h4 + +pThe GNU C library supports dynamic allocation via the code%a/code +modifier. But in C99, the code%a/code modifier is a synonym for +code%f/code (float), so the compiler expects an argument of type +codefloat */code. This in combination with the code-Wformat/code +warning option may result in additional warnings:/p + +precode + #include lt;stdio.hgt; + + int + main (void) + { +char *s; +scanf (%as, s); + } +/code/pre + +pre +bq.c:7:10:/b b style='color:magenta'warning:/b format b'%a'/b expects argument of type b'float *'/b, but argument 2 has type b'char **'/b [-Wformat=] + scanf (%as, s); + b style='color:lime'^/b +/pre + +pThe fix is to use the code%m/code modifier instead, specified by +POSIX.1-2008./p + +h3New warnings/h3 + +pSeveral new warnings have been added to the C front end. One of the new +warnings is that GCC now warns about non-standard predefined identifiers with +the code-Wpedantic/code option. For instance:/p + +precode + void + foo (void) + { +const char *s = __FUNCTION__; + } +/code/pre + +pre +bq.c:4:19:/b b style='color:magenta'warning:/b ISO C does not support b'__FUNCTION__'/b predefined identifier [-Wpedantic] + const char *s = __FUNCTION__; + b style='color:lime'^/b +/pre + +pThe fix is either to use the standard predefined identifier code__func__/code +(since C99), or to use the code__extension__/code keyword:p/ + +precode + const char *s = __extension__ __FUNCTION__; +/code/pre + +h2C++ language issues/h2 + +h3Converting codestd::nullptr_t/code to codebool/code/h3 + +pConverting codestd::nullptr_t/code to codebool/code in the C++11 +mode now requires direct-initialization. This has been changed in +a href=http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1423;DR 1423/a. +As a consequence, the following is invalid:/p + +precode + bool b = nullptr; +/code/pre + +pbut the following is valid:/p + +precode + bool b(nullptr); +/code/pre + +It is recommended to use codetrue/code, resp. codefalse/code keywords +in such cases. + h3Links/h3 p Marek
Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.
On 10/02/15 08:51, Matthew Wahab wrote: On 09/02/15 23:17, Hans-Peter Nilsson wrote: On Mon, 9 Feb 2015, Matthew Wahab wrote: Attached a patch to replace the test for macro __NEWLIB__ with a test for macro NEWLINE_IN_CLASS_BLANK, defined by { dg-additional-options -DNEWLINE_IN_CLASS_BLANK { target newlib } } Tested with check-target-libstdc++-v3, with the modified tests, for arm-none-eabi and aarch64-none-linux-gnu. Does this look ok to commit? Not valid for approval, but that's what I meant! Modulo the typo forgetting to s/__NEWLIB__/NEWLINE_IN_CLASS_BLANK/ in the ifdef in wchar_t/isctype.cc of course. Attached the fixed patch. Tested by running check-target-libstdc++-v3, with the modified tests, for arm-none-eabi and aarch64-none-linux-gnu. Ok to commit? Matthew libstdc++-v3/ 2015-02-11 Matthew Wahab matthew.wa...@arm.com * testsuite/28_regex/traits/char/isctype.cc (test01): Replace test for __NEWLIB__ macro with a dejagnu set macro. * testsuite/28_regex/traits/wchar_t/isctype.cc (test01): Likewise. diff --git a/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc b/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc index 7c47045..b6fd6ff 100644 --- a/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc +++ b/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc @@ -1,5 +1,6 @@ // { dg-do run } // { dg-options -std=gnu++11 } +// { dg-additional-options -DNEWLINE_IN_CLASS_BLANK { target newlib } } // // 2010-06-23 Stephen M. Webb stephen.w...@bregmasoft.ca @@ -53,8 +54,8 @@ test01() VERIFY(!t.isctype('_', t.lookup_classname(range(digit; VERIFY( t.isctype(' ', t.lookup_classname(range(blank; VERIFY( t.isctype('\t', t.lookup_classname(range(blank; -#if defined (__NEWLIB__) - /* newlib includes '\n' in class 'blank'. +#if defined (NEWLINE_IN_CLASS_BLANK) + /* On some targets, '\n' is in class 'blank'. See https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html. */ VERIFY( t.isctype('\n', t.lookup_classname(range(blank; #else diff --git a/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc b/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc index 1b3d69a..144d679 100644 --- a/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc +++ b/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc @@ -1,5 +1,6 @@ // { dg-do run } // { dg-options -std=gnu++11 } +// { dg-additional-options -DNEWLINE_IN_CLASS_BLANK { target newlib } } // Copyright (C) 2010-2015 Free Software Foundation, Inc. // @@ -50,8 +51,8 @@ test01() VERIFY(!t.isctype(L'_', t.lookup_classname(range(digit; VERIFY( t.isctype(L' ', t.lookup_classname(range(blank; VERIFY( t.isctype(L'\t', t.lookup_classname(range(blank; -#if defined (__NEWLIB__) - /* newlib includes '\n' in class 'blank'. +#if defined (NEWLINE_IN_CLASS_BLANK) + /* On some targets, '\n' is in class 'blank'. See https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html. */ VERIFY( t.isctype(L'\n', t.lookup_classname(range(blank; #else
Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On Wed, Feb 11, 2015 at 7:43 AM, Jeff Law l...@redhat.com wrote: On 02/03/15 05:23, Joseph Myers wrote: + TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1)) + TYPE_UNSIGNED (TREE_TYPE (@0)) == TYPE_UNSIGNED (TREE_TYPE (@1)) + TYPE_PRECISION (type) TYPE_PRECISION (TREE_TYPE (@0))) + (convert (bit_and (inner_op @0 @1) (convert @3)) I still don't think this is safe. Suppose @0 and @1 are -128 in type int8_t and @3 is 128 in a wider type and the operation is AND. Then the original expression has result 128. But if you convert @3 to int8_t you get -128 and this would result in -128 from the simplified expression. Looking at this again, @3 is being converted to the wrong type, plain and simple. I should never write code while heavily medicated. I suspect that combined with trying to work around the genmatch bug Richi just fixed sent me down a totally wrong path. I think the way to go is to always convert the inner operands to an unsigned type. In fact, everything except the outer convert should be using an unsigned type of the same size/precision as @0's type. The outer convert should, of course, be the final type. That avoids all the concerns about sign bit propagation from the narrow type into the wider final type. Application of this pattern (and the one I posted for 47477) is a concern for targets that don't do sub-word arithmetic/logicals. But I just did a sniff test of one such target (v850-elf because it was handy) and I couldn't spot a change in the end code using both the 47477 patch and my WIP patch for this BZ. The c-family frontends perform this kind of narrowing already anyway (via the shorten_* stuff which is misplaced there and should be done elsewhere for all frontends - thus in match.pd, thanks for starting that). Richard. jeff
[PATCH][2/n] Drop DW_AT_name for LTO produced unit DIEs
Testing in progress. Ok? Thanks, Richard. 2015-02-11 Richard Biener rguent...@suse.de PR lto/65015 * dwarf2out.c (dwarf2out_finish): Do not emit DW_AT_name on the comp unit DIE for LTO produced units. Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 220613) +++ gcc/dwarf2out.c (working copy) @@ -24521,8 +24521,11 @@ dwarf2out_finish (const char *filename) gen_remaining_tmpl_value_param_die_attribute (); /* Add the name for the main input file now. We delayed this from - dwarf2out_init to avoid complications with PCH. */ - add_name_attribute (comp_unit_die (), remap_debug_filename (filename)); + dwarf2out_init to avoid complications with PCH. + Avoid doing this for LTO produced units as it adds random + tempfile names. */ + if (!in_lto_p) +add_name_attribute (comp_unit_die (), remap_debug_filename (filename)); if (!IS_ABSOLUTE_PATH (filename) || targetm.force_at_comp_dir) add_comp_dir_attribute (comp_unit_die ()); else if (get_AT (comp_unit_die (), DW_AT_comp_dir) == NULL)
Re: [PATCH][AArch64] Fix illegal assembly 'eon v1, v2, v3'
Here is a revised patch using '#'. Bootstrap + check-gcc natively on aarch64-none-linux-gnu. My feeling is still against including the testcase because even when it passes it doesn't increase my confidence that the compiler is right very much (i.e. the insn could still be reading which_alternative at split time, but which_alternative just happened to have a useful value in it - as was happening in existing test gcc.target/aarch64/eon_1.c). I've recorded the testcase in PR/64997, however. If the maintainers feel the testcase should be included, then I could prepare that as a separate patch? gcc/ChangeLog: PR target/64997 * config/aarch64/aarch64.md (*xor_one_cmplmode3): Use FP_REGNUM_P as split condition; force split via '#' in output pattern. --Alan James Greenhalgh wrote: On Wed, Jan 28, 2015 at 02:04:04PM +, James Greenhalgh wrote: On Wed, Jan 28, 2015 at 12:32:45PM +, Alan Lawrence wrote: Ok for stage 4? This is a regression from 4.9, so once we iron out some nits, it should be. gcc/ChangeLog: * config/aarch64/aarch64.md (*xor_one_cmplmode3): Use FP_REGNUM_P as split condition. And a testcase, please! diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index bc49fbe68a978b3ca069c6d084f542773df84bcb..d4b3f7b03ba0ab570cec5ce862e8c5f38f417ed1 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -3054,7 +3054,7 @@ (match_operand:GPI 2 register_operand r,w] eon\\t%w0, %w1, %w2 ;; For GPR registers (only). This should be: @ eon\\t%w0, %w1, %w2 # which would have forced a split. Your patch is useful regardless, as I guess we could have ended up needlessly splitting if we got unlucky with whatever had been left in which_alternative. Hi Alan, Do you have any plans to respin this patch? I'd like to see it fixed for GCC 5.0 if possible. Thanks, James diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index bc49fbe..9f08046 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -3053,8 +3053,10 @@ (not:GPI (xor:GPI (match_operand:GPI 1 register_operand r,?w) (match_operand:GPI 2 register_operand r,w] - eon\\t%w0, %w1, %w2 ;; For GPR registers (only). - reload_completed (which_alternative == 1) ;; For SIMD registers. + @ + eon\\t%w0, %w1, %w2 + # + reload_completed FP_REGNUM_P (REGNO (operands[0])) ;; For SIMD registers. [(set (match_operand:GPI 0 register_operand =w) (xor:GPI (match_operand:GPI 1 register_operand w) (match_operand:GPI 2 register_operand w)))
Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
2014-12-19 15:21 GMT+00:00 Kenneth Zadeck zad...@naturalbridge.com: however, since i am a nice person loop-invariant solves the DF_UD_CHAIN which builds a data structure that connects each use with all of the defs that reach it. I believe that this is the opposite of what you want here. if you really need this, you need to also turn on the DF_DU_CHAIN which builds the opposite structure.Both structures can be space hogs, but they are both turned on in other places in the compiler so it is unlikely to be an issue. Exactly, Thanks, Kenneth. This approach works from my experiment and look much better than previous REG_NOTE approach. while it do have one problem. We need the UD/DU chain built before we do insn re-shuffling. While after re-shuffling, UD chain needs update, otherwise, the later check_dependecies in find_invariant_insn may fail. although we have re-shuffle instruction 1 into 2, the later check still using old UD info, thus decide instruction 2 is not iv. 1: regA - vfp + regB 2: regA - vfp + const my current fix is to insert those re-shuffled insn into a table named vfp_const_iv, then skip those dependencies check for them as they don't have any dependencies. LOG_LINKs have nothing to do with single use; they point from the _first_ use to its corresponding def. You might want to look at what fwprop does instead. Pass rtl fwprop uses df information in single-definition way, it doesn't really take into consideration if register is a single use. This often corrupts other optimizations like post-increment and load/store pair. For example: add r2, r1, r0 ldr rx, [r2] add r2, r2, #4 is transformed into below form: add r2, r1, r0 ldr rx, [r1, r0] add r2, r2, #4 As a result, post-increment opportunity is corrupted, also definition of r2 can't be deleted because it's not single use. Thanks, bin thanks for all these suggestion. Have look at the LOG_LINK build function, a simple reverse scan, while needs to allocate big map array for all pseudo regs. Haven't looked at similar code in fwprop, actually, when found the first matched insn pattern, I just want to scan several insns next, then abort quickly if nothing meet requirement. there is no need to build full single-use information. still can anyone confirm that it is safe to re-use REG_DEAD info there without calling df_note_add_problem and df_analysis first? or I am using those info passed down from the previous pass which calculated these info and maybe broken? It's not safe to use REG_DEAD info without re-computing it. not sure that reg_dead is the right answer even if you did re-compute it. I believe you can have two parallel uses (on both sides of an if-then-else) for a single def (above the if then else) and have two REG_DEAD notes. Richard.
Re: [PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)
On Sun, Jan 25, 2015 at 12:07:46PM -0800, Mike Stump wrote: On Dec 1, 2014, at 2:52 AM, Marek Polacek pola...@redhat.com wrote: On Sun, Nov 30, 2014 at 11:00:12PM -0500, Jason Merrill wrote: On 11/27/2014 08:57 AM, Marek Polacek wrote: -/* { dg-error is not a constant expression { target c++ } 12 } */ +/* { dg-error { xfail { *-*-* } } 11 } */ Please keep the expected message. Done in the below. 2014-12-01 Marek Polacek pola...@redhat.com PR sanitizer/63956 * ubsan.c (is_ubsan_builtin_p): Check also built-in class. cp/ * constexpr.c: Include ubsan.h. (cxx_eval_call_expression): Bail out for IFN_UBSAN_{NULL,BOUNDS} internal functions and for ubsan builtins. * error.c: Include internal-fn.h. (dump_expr): Add printing of internal functions. testsuite/ * c-c++-common/ubsan/shift-5.c: Add fails. Do you see: PASS: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 11) XFAIL: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 11) PASS: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 14) XFAIL: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 14) PASS: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 17) XFAIL: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 17) PASS: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 20) XFAIL: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 20) PASS: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 34) XFAIL: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 34) PASS: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 37) XFAIL: c-c++-common/ubsan/shift-5.c -O0 (test for errors, line 37) on x86_64 on linux? If so, this really isn’t cool. You cannot have the same name as both pass and fail. At the heart of regression analysis is the notion that no test that passed before now fails. One can run contrib/compare_tests to see if a patch one is working on has any regressions in it, the beauty of the script is it will tell you in plain language if there are any regressions or not. The standard for gcc is, no regressions. Could you find a way to fix this? Splitting into C and C++ test cases might be one way. Fixing any expected failures might be another. Sorry for not replying sooner. The following patch splits the test into C and C++ test cases, so hopefully fixing the issue. Ok for trunk? 2015-02-11 Marek Polacek pola...@redhat.com * g++.dg/ubsan/shift-1.C: New test. * gcc.dg/ubsan/c-shift-2.c: New test. * c-c++-common/ubsan/shift-5.c: Remove file. diff --git gcc/testsuite/g++.dg/ubsan/shift-1.C gcc/testsuite/g++.dg/ubsan/shift-1.C index e69de29..8a71279 100644 --- gcc/testsuite/g++.dg/ubsan/shift-1.C +++ gcc/testsuite/g++.dg/ubsan/shift-1.C @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-options -fsanitize=shift -w } */ +/* { dg-shouldfail ubsan } */ + +int +foo (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case 1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 11 } */ +case -1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 13 } */ +case 1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 15 } */ +case -1 -1: +/* { dg-error is not a constant expression { xfail { *-*-* } } 17 } */ + return 1; +} + return 0; +} + +int +bar (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case -1 200: +/* { dg-error is not a constant expression { xfail { *-*-* } } 30 } */ +case 1 200: +/* { dg-error is not a constant expression { xfail { *-*-* } } 32 } */ + return 1; +} + return 0; +} diff --git gcc/testsuite/gcc.dg/ubsan/c-shift-2.c gcc/testsuite/gcc.dg/ubsan/c-shift-2.c index e69de29..beb0dbe 100644 --- gcc/testsuite/gcc.dg/ubsan/c-shift-2.c +++ gcc/testsuite/gcc.dg/ubsan/c-shift-2.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options -fsanitize=shift -w } */ +/* { dg-shouldfail ubsan } */ + +int +foo (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case 1 -1: /* { dg-error case label does not reduce to an integer constant } */ +case -1 -1: /* { dg-error case label does not reduce to an integer constant } */ +case 1 -1: /* { dg-error case label does not reduce to an integer constant } */ +case -1 -1: /* { dg-error case label does not reduce to an integer constant } */ + return 1; +} + return 0; +} + +int +bar (int x) +{ + /* None of the following should pass. */ + switch (x) +{ +case -1 200: /* { dg-error case label does not reduce to an integer constant } */ +case 1 200: /* { dg-error case label does not reduce to an integer constant } */ + return 1; +} + return 0; +} Marek
Re: [PATCH][2/n] Drop DW_AT_name for LTO produced unit DIEs
On Wed, 11 Feb 2015, Richard Biener wrote: Testing in progress. Ok? So it was said that retaining something like artificial is better. Thus the following. LTO bootstrapped tested on x86_64-unknown-linux-gnu - ok? Thanks, Richard. 2015-02-11 Richard Biener rguent...@suse.de PR lto/65015 * dwarf2out.c (dwarf2out_finish): Use artificial as DW_AT_name for LTO produced CUs. Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 220613) +++ gcc/dwarf2out.c (working copy) @@ -24521,8 +24521,13 @@ dwarf2out_finish (const char *filename) gen_remaining_tmpl_value_param_die_attribute (); /* Add the name for the main input file now. We delayed this from - dwarf2out_init to avoid complications with PCH. */ - add_name_attribute (comp_unit_die (), remap_debug_filename (filename)); + dwarf2out_init to avoid complications with PCH. + For LTO produced units use a fixed artificial name to avoid + leaking tempfile names into the dwarf. */ + if (!in_lto_p) +add_name_attribute (comp_unit_die (), remap_debug_filename (filename)); + else +add_name_attribute (comp_unit_die (), artificial); if (!IS_ABSOLUTE_PATH (filename) || targetm.force_at_comp_dir) add_comp_dir_attribute (comp_unit_die ()); else if (get_AT (comp_unit_die (), DW_AT_comp_dir) == NULL)
Re: [PATCH] Fix PR ipa/64813
On Wed, Feb 11, 2015 at 1:46 PM, Martin Liška mli...@suse.cz wrote: On 02/10/2015 05:00 PM, Jakub Jelinek wrote: On Tue, Feb 10, 2015 at 04:56:40PM +0100, Martin Liška wrote: On 02/10/2015 01:50 PM, Richard Biener wrote: On Tue, Feb 10, 2015 at 12:07 PM, Martin Liška mli...@suse.cz wrote: Hello. Following patch is fix for PR ipa/64813. The patch was tested on a darwin target by Dominique. Ready for trunk? - if (!(gimple_call_flags (call) ECF_NORETURN)) + if (!alias_is_noreturn) that was technically unnecessary, right? The call flags properly return ECF_NORETURN already? Ok if that is the case. Hi. You are right, !(gimple_call_flags (call) ECF_NORETURN) returns a correct value. Motivation for replacement is not to repeat the same condition, I hope the value of alias_is_noreturn is correct in all uses. And gimple_call_flags (call) ECF_NORETURN doesn't? I mean, if you could initialize alias_is_noreturn to that, it would be nicer. Jakub Hello. There are actually 3 places I need to guard if a function non-return. First place is responsible for result type creation: /* Build call to the function being thunked. */ if (!VOID_TYPE_P (restype) !alias_is_noreturn) Where I can create guard just based on TREE_THIS_VOLATILE of the called function. Once the new gimple call is created I can use: (gimple_call_flags (call) ECF_NORETURN So that's the reason I create one bool value (alias_is_noreturn) and the very beginning of expand_thunk method. Dost it make sense? Yes, and I think your patch is ok. Thanks, Richard. Thanks, Martin
Re: [PATCH] Fix PR ipa/64813
On 02/10/2015 05:00 PM, Jakub Jelinek wrote: On Tue, Feb 10, 2015 at 04:56:40PM +0100, Martin Liška wrote: On 02/10/2015 01:50 PM, Richard Biener wrote: On Tue, Feb 10, 2015 at 12:07 PM, Martin Liška mli...@suse.cz wrote: Hello. Following patch is fix for PR ipa/64813. The patch was tested on a darwin target by Dominique. Ready for trunk? - if (!(gimple_call_flags (call) ECF_NORETURN)) + if (!alias_is_noreturn) that was technically unnecessary, right? The call flags properly return ECF_NORETURN already? Ok if that is the case. Hi. You are right, !(gimple_call_flags (call) ECF_NORETURN) returns a correct value. Motivation for replacement is not to repeat the same condition, I hope the value of alias_is_noreturn is correct in all uses. And gimple_call_flags (call) ECF_NORETURN doesn't? I mean, if you could initialize alias_is_noreturn to that, it would be nicer. Jakub Hello. There are actually 3 places I need to guard if a function non-return. First place is responsible for result type creation: /* Build call to the function being thunked. */ if (!VOID_TYPE_P (restype) !alias_is_noreturn) Where I can create guard just based on TREE_THIS_VOLATILE of the called function. Once the new gimple call is created I can use: (gimple_call_flags (call) ECF_NORETURN So that's the reason I create one bool value (alias_is_noreturn) and the very beginning of expand_thunk method. Dost it make sense? Thanks, Martin
Re: [patch, libgfortran] Bug 57822 - I/O: (g0) wrongly prints E+0000
On Sun, Feb 08, 2015 at 05:10:44PM -0800, Jerry DeLisle wrote: The attached patch fixes this by checking for the case when we are doing g0 editing and the exponent is 0. Regression tested on X86-64. For the larger kinds, we are on a different code path out of necessity, so we need to address this corner case. I will commit in a day or two as simple/obvious, with a Changelog for the testsuite as well. The test fails on most of the architectures. Not all targets support _10 or _16. Jakub
Re: Re: [PATCH 7/8] Model cache auto-prefetcher in scheduler
On 20/01/15 13:26, Maxim Kuvyrkov wrote: On Jan 20, 2015, at 1:24 PM, Richard Earnshaw rearn...@arm.com wrote: ... In general, how should someone tuning the compiler for this parameter select a value that isn't one of (-1, m_i_q_d+1)? From my experiments it seems there are 4 reasonable values for the parameter: (-1) autopref turned off, (0) turned on in rank_for_schedule, (m_i_q_d+1) turned on everywhere. If there is a static constructor generated for tune tables and it is a problem to have it -- I can shrink acceptable values to these 3 and call it a day. You only mention 3 values: what was the fourth? Typo. No fourth. It might be better then to define a set of values that represent each of these cases and only allow the tuning parameters to select one of those. The init code then uses that set to select how to set up the various parameters to meet those goals. So something like ARM_SCHED_AUTOPREF_OFF ARM_SCHED_AUTOPREF_RANK ARM_SCHED_AUTOPREF_FULL A patch is attached. I bootstrapped it on arm-linux-gnueabihf. OK to apply? bootstrap failure on chromebook, reproduced on two chromebook. see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65020 please. -- Maxim Kuvyrkov www.linaro.org
Re: [patch, libgfortran] Bug 57822 - I/O: (g0) wrongly prints E+0000
On Wed, Feb 11, 2015 at 02:04:09PM +0100, Jakub Jelinek wrote: On Sun, Feb 08, 2015 at 05:10:44PM -0800, Jerry DeLisle wrote: The attached patch fixes this by checking for the case when we are doing g0 editing and the exponent is 0. Regression tested on X86-64. For the larger kinds, we are on a different code path out of necessity, so we need to address this corner case. I will commit in a day or two as simple/obvious, with a Changelog for the testsuite as well. The test fails on most of the architectures. Not all targets support _10 or _16. For _16 you'd need to guard the test with fortran_real_16 (perhaps split that part into a separate test), for _10 dunno. Jakub
[committed] Fix C #pragma omp atomic parsing (PR c/64824)
Hi! This patch fixes a thinko in POP macro in c_parser_binary_expression that resulted in not triggering the build2 case if omp_atomic_lhs is non-NULL. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk so far, 4.9 testing in progress, 4.8 will follow at some point. 2015-02-11 Jakub Jelinek ja...@redhat.com PR c/64824 * c-parser.c (c_parser_binary_expression): Fix OpenMP stack[sp].prec check in the POP macro. * testsuite/libgomp.c/atomic-18.c: New test. * testsuite/libgomp.c++/atomic-16.C: New test. --- gcc/c/c-parser.c.jj 2015-02-04 23:30:34.0 +0100 +++ gcc/c/c-parser.c2015-02-11 12:16:31.274379159 +0100 @@ -6233,8 +6233,8 @@ c_parser_binary_expression (c_parser *pa if (__builtin_expect (omp_atomic_lhs != NULL_TREE, 0) sp == 1 \ c_parser_peek_token (parser)-type == CPP_SEMICOLON\ ((1 stack[sp].prec) \ -(1 (PREC_BITOR | PREC_BITXOR | PREC_BITAND | PREC_SHIFT \ -| PREC_ADD | PREC_MULT)))\ +((1 PREC_BITOR) | (1 PREC_BITXOR) | (1 PREC_BITAND)\ + | (1 PREC_SHIFT) | (1 PREC_ADD) | (1 PREC_MULT))) \ stack[sp].op != TRUNC_MOD_EXPR \ stack[0].expr.value != error_mark_node \ stack[1].expr.value != error_mark_node \ --- libgomp/testsuite/libgomp.c/atomic-18.c.jj 2015-02-11 12:23:12.704648053 +0100 +++ libgomp/testsuite/libgomp.c/atomic-18.c 2015-02-11 12:24:11.749658560 +0100 @@ -0,0 +1,61 @@ +/* PR c/64824 */ +/* { dg-do run } */ +/* { dg-options -O2 -fopenmp } */ + +void +f1 (void) +{ + short a; + short b = 1; + int c = 3; +#pragma omp atomic capture + a = b = c b; + if (b != 6 || a != 6) +__builtin_abort (); +} + +void +f2 (void) +{ + short a; + short b = 1; + int c = 3; +#pragma omp atomic capture + a = b = c + b; + if (b != 4 || a != 4) +__builtin_abort (); +} + +void +f3 (void) +{ + short a; + short b = 1; + long long int c = 3; +#pragma omp atomic capture + a = b = c + b; + if (b != 4 || a != 4) +__builtin_abort (); +} + +void +f4 (void) +{ + char a; + char b = 1; + long long int c = 3LL; +#pragma omp atomic capture + a = b = c b; + if (b != 6 || a != 6) +__builtin_abort (); +} + +int +main () +{ + f1 (); + f2 (); + f3 (); + f4 (); + return 0; +} --- libgomp/testsuite/libgomp.c++/atomic-16.C.jj2015-02-11 12:24:03.337799529 +0100 +++ libgomp/testsuite/libgomp.c++/atomic-16.C 2015-02-11 12:23:58.019888648 +0100 @@ -0,0 +1,5 @@ +// PR c/64824 +// { dg-do run } +// { dg-options -O2 -fopenmp } + +#include ../libgomp.c/atomic-18.c Jakub
Re: [PATCH] PR rtl-optimization/32219: optimizer causees wrong code in pic/hidden/weak symbol checking
On 02/10/2015 01:19 PM, Richard Henderson wrote: As an existing issue, I'm not sure why specified visibility is any different from unspecified visibility. As far as I'm aware, the specified bit simply means that the decl doesn't inherit inherit visibility from the class, or from the command-line. But once we're this far, the visibility actually applied to the symbol should be all that matters. The test is there to differentiate explicit visibility from that implied from the command-line. Without it, we assume hidden visibility for external symbols too early, making the command-line option useless. This is visible even in building libgcc. I believe this set of patches does what we want, and cleans things up a bit in the process. r~ From bf5d4f20368d1173a8d586f6bdd258b00b807e33 Mon Sep 17 00:00:00 2001 From: Richard Henderson r...@redhat.com Date: Wed, 11 Feb 2015 17:37:48 -0800 Subject: [PATCH 1/5] Replace local_p with direct returns --- gcc/varasm.c | 66 +++- 1 file changed, 34 insertions(+), 32 deletions(-) diff --git a/gcc/varasm.c b/gcc/varasm.c index 3f62fca..83d9de3 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -6814,7 +6814,6 @@ default_binds_local_p (const_tree exp) bool default_binds_local_p_1 (const_tree exp, int shlib) { - bool local_p; bool resolved_locally = false; bool resolved_to_local_def = false; @@ -6845,54 +6844,57 @@ default_binds_local_p_1 (const_tree exp, int shlib) /* A non-decl is an entry in the constant pool. */ if (!DECL_P (exp)) -local_p = true; +return true; + /* Weakrefs may not bind locally, even though the weakref itself is always static and therefore local. Similarly, the resolver for ifunc functions might resolve to a non-local function. FIXME: We can resolve the weakref case more curefuly by looking at the weakref alias. */ - else if (lookup_attribute (weakref, DECL_ATTRIBUTES (exp)) + if (lookup_attribute (weakref, DECL_ATTRIBUTES (exp)) || (TREE_CODE (exp) == FUNCTION_DECL lookup_attribute (ifunc, DECL_ATTRIBUTES (exp -local_p = false; +return false; + /* Static variables are always local. */ - else if (! TREE_PUBLIC (exp)) -local_p = true; - /* A variable is local if the user has said explicitly that it will - be. */ - else if ((DECL_VISIBILITY_SPECIFIED (exp) - || resolved_to_local_def) - DECL_VISIBILITY (exp) != VISIBILITY_DEFAULT) -local_p = true; + if (! TREE_PUBLIC (exp)) +return true; + + /* A variable is local if the user has said explicitly that it will be. */ + if ((DECL_VISIBILITY_SPECIFIED (exp) || resolved_to_local_def) + DECL_VISIBILITY (exp) != VISIBILITY_DEFAULT) +return true; + /* Variables defined outside this object might not be local. */ - else if (DECL_EXTERNAL (exp) !resolved_locally) -local_p = false; - /* If defined in this object and visibility is not default, must be - local. */ - else if (DECL_VISIBILITY (exp) != VISIBILITY_DEFAULT) -local_p = true; + if (DECL_EXTERNAL (exp) !resolved_locally) +return false; + + /* If defined in this object and visibility is not default, + must be local. */ + if (DECL_VISIBILITY (exp) != VISIBILITY_DEFAULT) +return true; + /* Default visibility weak data can be overridden by a strong symbol in another module and so are not local. */ - else if (DECL_WEAK (exp) - !resolved_locally) -local_p = false; + if (DECL_WEAK (exp) !resolved_locally) +return false; + /* If PIC, then assume that any global name can be overridden by symbols resolved from other modules. */ - else if (shlib) -local_p = false; + if (shlib) +return false; + /* Uninitialized COMMON variable may be unified with symbols resolved from other modules. */ - else if (DECL_COMMON (exp) - !resolved_locally - (DECL_INITIAL (exp) == NULL - || (!in_lto_p DECL_INITIAL (exp) == error_mark_node))) -local_p = false; + if (DECL_COMMON (exp) + !resolved_locally + (DECL_INITIAL (exp) == NULL + || (!in_lto_p DECL_INITIAL (exp) == error_mark_node))) +return false; + /* Otherwise we're left with initialized (or non-common) global data which is of necessity defined locally. */ - else -local_p = true; - - return local_p; + return true; } /* Return true when references to DECL must bind to current definition in -- 2.1.0 From 0b1ac37c688093d720a99b5ae446602a0c1120b3 Mon Sep 17 00:00:00 2001 From: Richard Henderson r...@redhat.com Date: Wed, 11 Feb 2015 19:27:14 -0800 Subject: [PATCH 2/5] Use symtab_node to unify var and function handling Delay that until after static symbols are eliminated. --- gcc/varasm.c | 59 +++ 1 file changed, 19 insertions(+), 40 deletions(-) diff --git a/gcc/varasm.c b/gcc/varasm.c index 83d9de3..dcc70e3 100644 --- a/gcc/varasm.c